I have been trying to install NVIDIA drivers for the past few days without any success. The first thing I did was follow the official guide, and running the command from it gave me this:
01:00.0 VGA compatible controller: NVIDIA Corporation GA102 [GeForce RTX 3090] (rev a1)
Subsystem: ASUSTeK Computer Inc. Device 87b3
Kernel driver in use: nouveau
Following the guide, my card is in the NV170 family (Ampere), so I installed the nvidia package, removed "kms" from the HOOKS array in /etc/mkinitcpio.conf and regenerated the initramfs.
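For readers following along, the HOOKS edit described above can be sketched as a small shell filter. This is only an illustration under assumptions: the function name strip_kms_hook is made up here, it expects the usual single-line HOOKS=(...) form, and it has only been thought through against GNU sed.

```shell
# Hypothetical helper (not part of the guide): reads an mkinitcpio.conf on
# stdin and writes it to stdout with the "kms" hook dropped from HOOKS.
# Only lines starting with "HOOKS=(" are touched.
strip_kms_hook() {
    sed -E '/^HOOKS=\(/ {
        s/(\(| )kms( |\))/\1\2/
        s/\( /(/
        s/ \)/)/
        s/  +/ /g
    }'
}
```

A cautious way to use it would be to write the result to a scratch file first, e.g. `strip_kms_hook < /etc/mkinitcpio.conf > /tmp/mkinitcpio.conf.new`, compare the two, copy it back as root and then regenerate the initramfs with `mkinitcpio -P`.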
Then I installed nvidia-utils so that the open source driver is blacklisted and not loaded. For context, I have both the linux and linux-lts kernels so that I could go back in case something went wrong (which it did). I rebooted the machine and selected linux from the GRUB menu.
I got the initial kernel loading message, and then my monitor showed a "no signal" message. Sometimes the message would disappear and leave a black screen, but it quickly went back to "no signal", and the cycle repeated.
I then rebooted and repeated all the steps for linux-lts, and the same issue appeared.
I then went to the Troubleshooting section to try to fix the issue. At the time I was using Wayland on KDE, so I thought that could be the cause, switched to X11 and repeated all the steps. Again I was greeted with the same "no signal" from my monitor.
I would also like to point out that I was not able to switch TTYs at any point during all of this, so I reverted to the open source driver and set up an SSH server to see if I could ssh into my machine. To my surprise I could, and by looking at the logs
and running nvidia-smi I was able to see that the driver was loaded. At this point I tried the nvidia-open drivers to see if anything would change, and the result was the same. I also tried generating the X11 config with "nvidia-xconfig", moving it from
"/etc/X11/xorg.conf" to "/etc/X11/xorg.conf.d/10-nvidia.conf" and adding the extra line
BusID "PCI:1:0:0"
This also didn't help and the issue was the same. When I looked into the logs, the only thing that stood out from journalctl was
Aug 19 20:07:54 archlinux kernel: ACPI Error: AE_ALREADY_EXISTS, CreateBufferField failure (20230628/dswload2-477)
Aug 19 20:07:54 archlinux kernel: ACPI Error: Aborting method \_SB.PC00.PEG1.PEGP._DSM due to previous error (AE_ALREADY_EXISTS) (20230628/psparse-529)
Aug 19 20:07:54 archlinux kernel: ACPI BIOS Error (bug): Failure creating named object [\_SB.PC00.PEG1.PEGP._DSM.USRG], AE_ALREADY_EXISTS (20230628/dsfield-184)
Aug 19 20:07:54 archlinux kernel: ACPI Error: AE_ALREADY_EXISTS, CreateBufferField failure (20230628/dswload2-477)
Aug 19 20:07:54 archlinux kernel: ACPI Error: Aborting method \_SB.PC00.PEG1.PEGP._DSM due to previous error (AE_ALREADY_EXISTS) (20230628/psparse-529)
Aug 19 20:07:54 archlinux kernel: ACPI BIOS Error (bug): Failure creating named object [\_SB.PC00.PEG1.PEGP._DSM.USRG], AE_ALREADY_EXISTS (20230628/dsfield-184)
Aug 19 20:07:54 archlinux kernel: ACPI Error: AE_ALREADY_EXISTS, CreateBufferField failure (20230628/dswload2-477)
Aug 19 20:07:54 archlinux kernel: ACPI Error: Aborting method \_SB.PC00.PEG1.PEGP._DSM due to previous error (AE_ALREADY_EXISTS) (20230628/psparse-529)
And from xorg logs
[ 28.683] (--) NVIDIA(GPU-0): DFP-5: disconnected
[ 28.683] (--) NVIDIA(GPU-0): DFP-5: Internal TMDS
[ 28.683] (--) NVIDIA(GPU-0): DFP-5: 165.0 MHz maximum pixel clock
[ 28.683] (--) NVIDIA(GPU-0):
[ 29.462] (--) NVIDIA(GPU-0): LG Electronics LG HDR QHD (DFP-5): connected
[ 29.462] (--) NVIDIA(GPU-0): LG Electronics LG HDR QHD (DFP-5): Internal TMDS
[ 29.462] (--) NVIDIA(GPU-0): LG Electronics LG HDR QHD (DFP-5): 600.0 MHz maximum pixel clock
[ 29.462] (--) NVIDIA(GPU-0):
[ 53.782] (--) NVIDIA(GPU-0): DFP-5: disconnected
[ 53.782] (--) NVIDIA(GPU-0): DFP-5: Internal TMDS
[ 53.782] (--) NVIDIA(GPU-0): DFP-5: 165.0 MHz maximum pixel clock
[ 53.782] (--) NVIDIA(GPU-0):
[ 54.525] (--) NVIDIA(GPU-0): LG Electronics LG HDR QHD (DFP-5): connected
[ 54.525] (--) NVIDIA(GPU-0): LG Electronics LG HDR QHD (DFP-5): Internal TMDS
[ 54.525] (--) NVIDIA(GPU-0): LG Electronics LG HDR QHD (DFP-5): 600.0 MHz maximum pixel clock
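The excerpt above shows the same output flipping between "disconnected" and "connected" roughly every 25 seconds. A rough way to quantify that from a saved log is sketched below; the function name count_link_events is hypothetical, and the log location (often /var/log/Xorg.0.log, but it depends on how X is started) is an assumption.

```shell
# Hypothetical helper: counts how often the NVIDIA X driver reported an
# output as connected or disconnected in the Xorg log given as $1.
# Runs in a subshell so its variables do not leak into the caller.
count_link_events() (
    connected=$(grep -c ': connected *$' "$1")
    disconnected=$(grep -c ': disconnected *$' "$1")
    echo "connected=$connected disconnected=$disconnected"
)
```

Run against the lines quoted above, this would report two events of each kind; a healthy link would normally show one "connected" and no later "disconnected".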
The last thing I tried was switching my cable from HDMI to DisplayPort but this also didn't help. Another thing to note is that I am dual booting and also using Windows on the same machine and there are no such issues with the GPU.
Last edited by atomicblimp (2024-08-23 16:36:26)
Offline
Please post the full system journal and xorg.logs.
Offline
Here is the journalctl:
http://0x0.st/XJ1e.txt
And the xorg logs:
http://0x0.st/XJ19.txt
Offline
Those logs are from the -lts kernel. You did not install drivers for the lts kernel. We need logs from the regular kernel.
Offline
Okay, I was also including that just in case something was off there as well.
I have reinstalled the drivers and rebooted the machine, so the logs should be cleaner now.
Linux version 6.10.5-arch1-1, journalctl -b:
http://0x0.st/XJjv.txt
xorg:
http://0x0.st/XJjx.txt
lspci -k | grep -A 2 -E "(VGA|3D)"
01:00.0 VGA compatible controller: NVIDIA Corporation GA102 [GeForce RTX 3090] (rev a1)
Subsystem: ASUSTeK Computer Inc. Device 87b3
Kernel driver in use: nvidia
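The driver check above can also be scripted, which is handy when working over ssh. This is a minimal sketch; the function name kernel_driver_in_use is made up for illustration and it simply filters whatever lspci output it is given.

```shell
# Hypothetical helper: reads "lspci -k" style output on stdin and prints
# only the driver name(s) from the "Kernel driver in use:" lines.
kernel_driver_in_use() {
    sed -n 's/^[[:space:]]*Kernel driver in use: //p'
}
```

For example, `lspci -k | grep -A 2 -E "(VGA|3D)" | kernel_driver_in_use` should print nvidia once the proprietary module is bound, and nouveau when the open source driver is in use.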
And nvidia-smi
Mon Aug 19 22:40:12 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 555.58.02              Driver Version: 555.58.02      CUDA Version: 12.5     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3090        Off |   00000000:01:00.0  On |                  N/A |
|  0%   41C    P8             43W /  350W |     249MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A      1177      G   /usr/lib/Xorg                                  54MiB |
|    0   N/A  N/A      1204      G   /usr/bin/sddm-greeter-qt6                     169MiB |
+-----------------------------------------------------------------------------------------+
I have also tested the GPU with
hashcat -b
as it does not run well without the proprietary drivers. It ran as expected, however I tested this over ssh as I still can't see my screen.
Offline
Enable https://wiki.archlinux.org/title/NVIDIA … de_setting - use the "nvidia_drm.modeset=1" kernel parameter (modprobe.conf won't do!) to enable drm, expose the edid to it and get rid of the simpledrm device
Offline
Okay, I have tried adding that kernel parameter to /etc/default/grub so that it looks like this:
GRUB_CMDLINE_LINUX_DEFAULT="loglevel=3 quiet nvidia_drm.modeset=1"
It didn't seem to fix the issue, however I did notice that when I boot into LTS, I no longer have a GUI,
but I am able to switch to another TTY. Running this under the linux kernel (not lts)
# cat /sys/module/nvidia_drm/parameters/modeset
gives me
Y
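As a cross-check alongside the sysfs value, one can also confirm that the parameter really reached the kernel command line (on a typical GRUB setup, edits to /etc/default/grub only take effect after the config is regenerated, e.g. with `grub-mkconfig -o /boot/grub/grub.cfg`). The sketch below is illustrative only; has_kernel_param is a made-up name and it does plain whole-word matching.

```shell
# Hypothetical helper: succeeds if the command line string in $1 contains
# the exact whitespace-separated parameter given in $2.
# Runs in a subshell so "set -f" (no globbing) stays local.
has_kernel_param() (
    set -f
    for word in $1; do
        [ "$word" = "$2" ] && exit 0
    done
    exit 1
)
```

For instance, `has_kernel_param "$(cat /proc/cmdline)" nvidia_drm.modeset=1 && echo "modeset requested at boot"` would only print the message when the running kernel was booted with that parameter.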
I then tried adding another kernel parameter, "nvidia_drm.fbdev=1", to see if this would change things, but it didn't.
Here are the logs from that:
journal:
http://0x0.st/XJOa.txt
xorg:
http://0x0.st/XJOm.txt
Another thing I have noticed is that running "xrandr" with the open source driver gives me an odd output:
Screen 0: minimum 320 x 200, current 2560 x 1440, maximum 4096 x 4096
None-1 connected primary 2560x1440+0+0 (normal left inverted right x axis y axis) 0mm x 0mm
2560x1440 60.00*+
I am currently using an HDMI cable, but here it says None-1 and it is not a phantom display. My monitor has 2 HDMI ports and 1 DP port,
and I am using the second HDMI port because the first one seems to be damaged (or dusty?) or otherwise not working properly: it gives me "no signal" even during
the bootloader. I didn't test booting directly into Windows on that port, though.
Offline
None-1 is the simpledrm device which isn't available nor used in the posted logs.
[ 10.129] (II) NVIDIA(0): Validated MetaModes:
[ 10.129] (II) NVIDIA(0): "DFP-2:nvidia-auto-select"
[ 10.129] (II) NVIDIA(0): Virtual screen size determined to be 2560 x 1440
[ 10.302] (--) NVIDIA(0): DPI set to (92, 93); computed from "UseEdidDpi" X config
It didn't seem to fix the issue, however I did notice that when I boot into LTS, I no longer have a GUI
You most likely still don't have the nvidia module built/installed for the lts kernel, but there's also absolutely no indication of any problem whatsoever in the logs you posted.
It's the cable or the TV, do you have alternatives to that?
Offline
You were right, it was the monitor. I plugged the cable into a spare one I had and it worked: the system booted up normally, nvidia-smi has output and I can run "hashcat -b" without errors.
The weird thing is that when I then switched back to the main monitor that had no signal, it suddenly worked too, even without nvidia_drm.modeset=1. For clarity, I never had this parameter set while switching monitors, as I wanted to see if it would work without it.
Here are the logs from the second monitor:
journal:
http://0x0.st/XJW5.txt
xorg:
http://0x0.st/XJWh.txt
And also logs after I connected the first monitor back that didn't work before:
journal:
http://0x0.st/XJWC.txt
xorg:
http://0x0.st/XJ4r.txt
Right now I am on the non-lts kernel with the proprietary nvidia driver and it works, but I am afraid to test anything in case I need to save some config somewhere or do any other tests.
Here is the xrandr output from the first monitor, the one that didn't work before and showed None-1.
Screen 0: minimum 8 x 8, current 2560 x 1440, maximum 32767 x 32767
DP-0 disconnected (normal left inverted right x axis y axis)
DP-1 disconnected (normal left inverted right x axis y axis)
HDMI-0 connected primary 2560x1440+0+0 (normal left inverted right x axis y axis) 598mm x 336mm
2560x1440 99.95 + 74.97 59.95*
3840x2160 59.94 50.00 29.97 25.00 23.98
1920x1080 75.00 60.00 59.94 50.00
1680x1050 59.95
1600x900 60.00
1280x1024 75.02 60.02
1280x800 59.81
1280x720 60.00 59.94 50.00
1152x864 60.00
1024x768 75.03 60.00
800x600 75.00 60.32
720x576 50.00
720x480 59.94
640x480 75.00 59.94 59.93
DP-2 disconnected (normal left inverted right x axis y axis)
DP-3 disconnected (normal left inverted right x axis y axis)
HDMI-1 disconnected (normal left inverted right x axis y axis)
DP-4 disconnected (normal left inverted right x axis y axis)
DP-5 disconnected (normal left inverted right x axis y axis)
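In xrandr output the asterisk marks the mode currently in use and the plus sign marks the preferred one, so the listing above shows HDMI-0 running at 59.95 Hz while 99.95 Hz is preferred. A small awk filter can pull out just that information; this is a sketch, and the name active_modes is invented for illustration.

```shell
# Hypothetical helper: reads xrandr output on stdin and prints
# "<output> <resolution> <rate>" for every mode marked active with "*".
active_modes() {
    awk '
        $2 == "connected"    { out = $1; next }
        $2 == "disconnected" { out = "";  next }
        out != "" {
            for (i = 2; i <= NF; i++)
                if (index($i, "*")) {
                    rate = $i
                    gsub(/[*+]/, "", rate)   # strip the current/preferred markers
                    print out, $1, rate
                }
        }
    '
}
```

Used as `xrandr | active_modes`, the listing above would yield "HDMI-0 2560x1440 59.95".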
Offline
Aug 20 11:56:15 archlinux kernel: smpboot: CPU0: 13th Gen Intel(R) Core(TM) i9-13900K (family: 0x6, model: 0xb7, stepping: 0x1)
https://ark.intel.com/content/www/us/en … 0-ghz.html shows your processor should come with an integrated Intel GPU.
You probably have a file in /etc/X11/xorg.conf.d that blocks X from seeing/using the Intel GPU.
Is that intentional?
Disliking systemd intensely, but not satisfied with alternatives so focusing on taming systemd.
clean chroot building not flexible enough ?
Try clean chroot manager by graysky
Offline
No, I didn't have anything there, except for 00-keyboard.conf. I did try also blacklisting the integrated GPU with
/etc/modprobe.d/blacklist.conf
blacklist i915
Another thing I have tried is to replicate the steps that made it work on the first monitor and it no longer works. What I did was:
1. Booted up the PC connected to the first monitor
2. Selected the Linux kernel
3. Black Screen/No Signal
4. Unplugged the cable from the first monitor
5. Plugged the cable to the second monitor
6. Second monitor shows no signal
7. Rebooted using the power button and pressing enter
8. Selected the Linux kernel
9. Nvidia drivers load and I can log in and run nvidia-smi
10. Unplugged the cable from the second monitor
11. Plugged the cable into the first monitor
The first time I did this it worked, but now it does not. I am not sure why and I am really confused.
At least we know it could be an issue with the monitor communicating with X and not the cable.
I just never had such an issue before, so I don't know what to do.
Offline
On some systems it is possible to disable an integrated GPU in the UEFI firmware (often incorrectly called BIOS).
Did you change graphics related settings in firmware?
Please post the full output (run as user) of
$ lspci -knn
The output will help to determine if your system uses hybrid graphics or not.
Offline
The only thing I have changed in the UEFI is the Fan Speed Profile, and also set the Intel Defaults for my CPU as the other defaults used some crazy values.
$ lspci -knn
output:
http://0x0.st/XJt6.txt
Offline
There's no IGP, they also tend to de-activate when you plug a GPU into the PEG slot.
The HDMI handshake is super-fragile. If there's even just mild interference on the cable or the port, that can completely break it - esp. for higher resolutions (larger signals)
my monitor has 2 HDMI and 1DP ports, and I am using the second one (HDMI) as the first one seems to be damaged (or dusty?) or not working properly as it gives me the no signal even during
the bootloader
You maybe want to look into this - the other monitor works reliably? Does it provide a comparable resolution?
Offline
Yes, the other monitor works fine I think. I tried running Factorio as a test and it worked, however Left 4 Dead 2 had 10 FPS and crashed.
I am not sure why, but I don't think it is related to this, as I don't really plan to play games on Arch, and the crash was, as per dmesg, due to the
CPU segfaulting (probably due to power settings, as they were reset to the insane defaults). Both monitors have the same resolution, however the first one, which is problematic,
is running 2560x1440/100Hz according to the monitor overlay and is larger in physical size. The second one is
running at 2560x1440/60Hz according to the overlay and works just as expected (ignoring the CPU issue that is not related to this).
Offline
How's the cable graded?
If the 100Hz modeline is somewhat naive, "HDMI Premium High Speed" might be at its limits and you'd require an "HDMI Ultra High Speed" cable (yes, that BS is the official grading). In #9 you were running the output at 60Hz, which is also likely the reason why it suddenly worked - you kept the previous mode.
If you'
for OUT in /sys/class/drm/card*; do echo "$OUT"; edid-decode "$OUT/edid"; echo "================="; done
You'll need https://aur.archlinux.org/packages/edid-decode-git and nvidia_drm.modeset=1 for this to work.
Offline
I am not sure what grade the cable is, but it does work on Windows without any issues. The cable should be of a higher grade, but I don't remember and there is nothing on the cable to indicate what type it is. I have also tested the other HDMI port and it seems to work fine while I am booted into Windows, but not in Arch or in the GRUB bootloader. Setting the kernel parameter and running the command, I get:
http://0x0.st/XJkZ.txt
Offline
Please avoid fully quoting previous posts.
DTD 1: 2560x1440 99.946436 Hz 16:9 150.919 kHz 410.500000 MHz (697 mm x 392 mm)
isn't too close to the "HDMI Premium High Speed" limits (though fwiw, that shit is typically printed all over the cable, in doubt near the plugs), though the windows driver might end up running 4:2:0, https://wiki.archlinux.org/title/NVIDIA … ubsampling
Assuming you can reliably connect on 60Hz (or does windows simply run on that?) the lower subsampling will most likely work as well and in that case it's very most likely a signal quality issue.
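To put rough numbers on "isn't too close to the limits": with classic TMDS signalling, three data channels each carry 10 bits per pixel clock at 8 bits per component, so the 410.5 MHz mode needs about 12.3 Gbit/s, while the 600 MHz maximum the driver reported for this output corresponds to the 18 Gbit/s that Premium High Speed cables are certified for. The sketch below just does that arithmetic; the function name tmds_gbps is made up, and it assumes full RGB/4:4:4 output (4:2:0 would roughly halve the figure).

```shell
# Hypothetical helper: approximate TMDS link rate in Gbit/s for a pixel
# clock in MHz ($1) and bits per component ($2, default 8).
# 3 channels, 10 transmitted bits per 8 data bits on each channel.
tmds_gbps() {
    LC_ALL=C awk -v clk="$1" -v bpc="${2:-8}" \
        'BEGIN { printf "%.3f\n", clk * 3 * bpc * 10 / 8 / 1000 }'
}
```

`tmds_gbps 410.5` prints 12.315 and `tmds_gbps 600` prints 18.000, which is consistent with the mode fitting on paper and the failure being a signal quality problem rather than a bandwidth one.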
Ceterum censeo: 3rd link below. Mandatory.
Disable it (it's NOT the BIOS setting!) and reboot windows and linux twice for voodoo reasons.
Offline
Okay, good news: it works. I ended up booting into Windows, disabled fast startup, restarted twice, booted into Arch, generated the X11 conf and added the Option "ForceYUV420" "True":
# nvidia-xconfig: X configuration file generated by nvidia-xconfig
# nvidia-xconfig: version 555.58.02
Section "ServerLayout"
    Identifier "Layout0"
    Screen 0 "Screen0" 0 0
    InputDevice "Keyboard0" "CoreKeyboard"
    InputDevice "Mouse0" "CorePointer"
EndSection
Section "Files"
EndSection
Section "InputDevice"
    # generated from default
    Identifier "Mouse0"
    Driver "mouse"
    Option "Protocol" "auto"
    Option "Device" "/dev/psaux"
    Option "Emulate3Buttons" "no"
    Option "ZAxisMapping" "4 5"
EndSection
Section "InputDevice"
    # generated from default
    Identifier "Keyboard0"
    Driver "kbd"
EndSection
Section "Monitor"
    Identifier "Monitor0"
    VendorName "Unknown"
    ModelName "Unknown"
    Option "DPMS"
EndSection
Section "Device"
    Identifier "Device0"
    Driver "nvidia"
    VendorName "NVIDIA Corporation"
EndSection
Section "Screen"
    Identifier "Screen0"
    Device "Device0"
    Monitor "Monitor0"
    DefaultDepth 24
    Option "ForceYUV420" "True"
    SubSection "Display"
        Depth 24
    EndSubSection
EndSection
I rebooted again, still using the HDMI cable, and I still got the "no signal". Then I unplugged the HDMI cable and used the DP one,
rebooted again and now it works. The open question is whether this was due to the cable, the Option "ForceYUV420" "True", or both. Another odd thing I was only able to see now is that my monitor runs at 144Hz, which it didn't do before while I was using Windows.
Screen 0: minimum 8 x 8, current 2560 x 1440, maximum 32767 x 32767
DP-0 disconnected (normal left inverted right x axis y axis)
DP-1 disconnected (normal left inverted right x axis y axis)
HDMI-0 disconnected (normal left inverted right x axis y axis)
DP-2 connected primary 2560x1440+0+0 (normal left inverted right x axis y axis) 697mm x 392mm
2560x1440 144.00*+ 120.00 74.97 59.95
1920x1080 74.91 60.00
1680x1050 59.95
1600x900 60.00
1280x1024 75.02 60.02
1280x800 59.81
1280x720 60.00
1152x864 59.96
1024x768 75.03 60.00
800x600 75.00 60.32
640x480 75.00 59.94
DP-3 disconnected (normal left inverted right x axis y axis)
HDMI-1 disconnected (normal left inverted right x axis y axis)
DP-4 disconnected (normal left inverted right x axis y axis)
DP-5 disconnected (normal left inverted right x axis y axis)
I am now afraid to mess with the config again as this really took way too long and it feels nice to actually see my screen, but I would still like to confirm what the issue was. Also, thanks everyone for being patient and helping me out!
Offline
The big thing is that you completely changed the output.
On top of that you also limited the subsampling (what's likely not required)
Please remove the static server config, /etc/X11/xorg.conf.d/20-nvidia.conf
Section "Device"
    Identifier "Device0"
    Driver "nvidia"
    VendorName "NVIDIA Corporation"
    Option "ForceYUV420" "True"
EndSection
should™ do (and still likely not be required, as the displayport connection - and cable - will likely be able to provide that mode at 4:4:4)
Offline