
#1 2022-12-24 18:55:21

rokser
Member
Registered: 2022-12-24
Posts: 30

Nvidia driver not working on linux kernel

I can't get the nvidia driver working on the linux kernel, even though it works fine on linux-lts.
I installed the nvidia package, but the driver is not loading.
Some outputs:
nvidia-smi

NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

sudo modprobe nvidia -vv

modprobe: INFO: custom logging function 0x55c4ba15faf0 registered
insmod /lib/modules/6.1.1-arch1-1/extramodules/nvidia.ko.xz 
modprobe: INFO: Failed to insert module '/lib/modules/6.1.1-arch1-1/extramodules/nvidia.ko.xz': No such device
modprobe: ERROR: could not insert 'nvidia': No such device
modprobe: INFO: context 0x55c4ba875460 released

lspci -k

00:00.0 Host bridge: Intel Corporation Coffee Lake HOST and DRAM Controller (rev 0c)
        Subsystem: Acer Incorporated [ALI] Device 1301
        Kernel driver in use: skl_uncore
00:02.0 VGA compatible controller: Intel Corporation WhiskeyLake-U GT2 [UHD Graphics 620] (rev 02)
        Subsystem: Acer Incorporated [ALI] Device 1301
        Kernel driver in use: i915
        Kernel modules: i915
02:00.0 3D controller: NVIDIA Corporation GP108M [GeForce MX250] (rev a1)
        Subsystem: Acer Incorporated [ALI] Device 1301
        Kernel modules: nouveau, nvidia_drm, nvidia

pacman -Q | grep nvidia

lib32-nvidia-utils 525.60.11-1
nvidia 525.60.11-5
nvidia-prime 1.0-4
nvidia-settings 525.60.11-2
nvidia-utils 525.60.11-1

uname -r

6.1.1-arch1-1

sudo dmesg | grep -E 'nvidia|NVRM'

[    3.391036] nvidia: loading out-of-tree module taints kernel.
[    3.391052] nvidia: module license 'NVIDIA' taints kernel.
[    3.461850] nvidia: module verification failed: signature and/or required key missing - tainting kernel
[    3.739474] nvidia-nvlink: Nvlink Core is being initialized, major device number 510
[    3.740893] nvidia 0000:02:00.0: Unable to change power state from D3cold to D0, device inaccessible
[    3.741087] NVRM: This is a 64-bit BAR mapped above 4GB by the system
               NVRM: BIOS or the Linux kernel, but the PCI bridge
               NVRM: immediately upstream of this GPU does not define
               NVRM: a matching prefetchable memory window.
[    3.741090] NVRM: This may be due to a known Linux kernel bug.  Please
               NVRM: see the README section on 64-bit BARs for additional
               NVRM: information.
[    3.741091] nvidia: probe of 0000:02:00.0 failed with error -1
[    3.741111] NVRM: The NVIDIA probe routine failed for 1 device(s).
[    3.741112] NVRM: None of the NVIDIA devices were initialized.
[    3.741284] nvidia-nvlink: Unregistered Nvlink Core, major device number 510
[    4.224423] nvidia-nvlink: Nvlink Core is being initialized, major device number 510

Setting ibt=off does nothing. Setting "acpi_osi=Windows 2009" fixes the driver issue but breaks touchpad and power button.
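
When testing parameters like these, it helps to confirm that each one actually reached the running kernel. A minimal sketch, assuming a helper named has_kernel_param (made up for illustration) that does plain whole-word matching against /proc/cmdline; a quoted value containing a space, like the acpi_osi one, is not handled:

```shell
# has_kernel_param PARAM [CMDLINE_FILE]
# Succeeds if PARAM appears as a whole word on the kernel command line.
# CMDLINE_FILE defaults to /proc/cmdline; the override exists only so the
# helper can be tried against a saved copy.
has_kernel_param() {
    param="$1"
    file="${2:-/proc/cmdline}"
    cmdline=$(cat "$file")
    # Unquoted on purpose: split the command line into words.
    for word in $cmdline; do
        [ "$word" = "$param" ] && return 0
    done
    return 1
}

# Example:
#   has_kernel_param ibt=off && echo "ibt=off is active"
```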

Is there a way to resolve this?

Offline

#2 2022-12-24 22:11:53

seth
Member
From: Won't reply 2 private help req
Registered: 2012-09-03
Posts: 76,127

Online

#3 2022-12-25 08:46:00

rokser
Member
Registered: 2022-12-24
Posts: 30

Re: Nvidia driver not working on linux kernel

Tried, no luck

Offline

#4 2022-12-25 09:07:00

seth
Member
From: Won't reply 2 private help req
Registered: 2012-09-03
Posts: 76,127

Re: Nvidia driver not working on linux kernel

No luck

Did you unload the module beforehand?
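Also, the commands there are a bit BS: "sudo echo ... > file" won't work, because the redirection is performed by your own unprivileged shell rather than by sudo.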

lspci -k
sudo modprobe -r nvidia # does this cause any errors?
echo 1 | sudo tee '/sys/bus/pci/devices/0000:02:00.0/remove'
echo 1 | sudo tee /sys/bus/pci/rescan
sudo modprobe -v nvidia # does this indicate any action?
lspci -k
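
The remove/rescan part of that sequence could be wrapped roughly like this. It is only a sketch: SYSFS_ROOT and WRITER are hypothetical knobs, added so the function can be dry-run against a fake directory tree instead of the real /sys.

```shell
# Hypothetical knobs for dry runs: SYSFS_ROOT is normally /sys and
# WRITER is normally "sudo tee".
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"
WRITER="${WRITER:-sudo tee}"

# pci_remove_and_rescan BDF
# Writes 1 to the device's "remove" node, then 1 to the bus "rescan" node.
pci_remove_and_rescan() {
    bdf="$1"
    dev="$SYSFS_ROOT/bus/pci/devices/$bdf"
    if [ ! -e "$dev/remove" ]; then
        echo "no such PCI device: $bdf" >&2
        return 1
    fi
    # tee runs with privileges and opens the file itself; with
    # "sudo echo 1 > file" the unprivileged shell would open it and fail.
    echo 1 | $WRITER "$dev/remove" > /dev/null || return 1
    echo 1 | $WRITER "$SYSFS_ROOT/bus/pci/rescan" > /dev/null || return 1
}

# Example:
#   pci_remove_and_rescan 0000:02:00.0
```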

Online

#5 2022-12-25 09:20:56

rokser
Member
Registered: 2022-12-24
Posts: 30

Re: Nvidia driver not working on linux kernel

before:

lspci -k
00:00.0 Host bridge: Intel Corporation Coffee Lake HOST and DRAM Controller (rev 0c)
        Subsystem: Acer Incorporated [ALI] Device 1301
        Kernel driver in use: skl_uncore
00:02.0 VGA compatible controller: Intel Corporation WhiskeyLake-U GT2 [UHD Graphics 620] (rev 02)
        Subsystem: Acer Incorporated [ALI] Device 1301
        Kernel driver in use: i915
        Kernel modules: i915
02:00.0 3D controller: NVIDIA Corporation GP108M [GeForce MX250] (rev a1)
        Subsystem: Acer Incorporated [ALI] Device 1301
        Kernel modules: nouveau, nvidia_drm, nvidia

after:

lspci -k
00:00.0 Host bridge: Intel Corporation Coffee Lake HOST and DRAM Controller (rev 0c)
        Subsystem: Acer Incorporated [ALI] Device 1301
        Kernel driver in use: skl_uncore
00:02.0 VGA compatible controller: Intel Corporation WhiskeyLake-U GT2 [UHD Graphics 620] (rev 02)
        Subsystem: Acer Incorporated [ALI] Device 1301
        Kernel driver in use: i915
        Kernel modules: i915

So rescanning doesn't bring the gpu back. No errors with sudo modprobe -r nvidia (I think it does nothing in my case, the driver isn't loaded), and sudo modprobe -v nvidia is the same as it was:

insmod /lib/modules/6.1.1-arch1-1/extramodules/nvidia.ko.xz 
modprobe: ERROR: could not insert 'nvidia': No such device

Offline

#6 2022-12-25 09:43:51

seth
Member
From: Won't reply 2 private help req
Registered: 2012-09-03
Posts: 76,127

Re: Nvidia driver not working on linux kernel

The device is actually completely gone after the rescan.
What if you add "pcie_aspm=off" to the kernel parameters?
There're no BIOS updates available for the device?

Online

#7 2022-12-25 10:03:48

rokser
Member
Registered: 2022-12-24
Posts: 30

Re: Nvidia driver not working on linux kernel

seth wrote:

"pcie_aspm=off"

Nothing's changed.

seth wrote:

There're no BIOS updates available for the device?

There is an update, but I don't have Windows installed, so updating may be problematic.

Offline

#8 2022-12-25 10:21:12

rokser
Member
Registered: 2022-12-24
Posts: 30

Re: Nvidia driver not working on linux kernel

And as I said, the driver works fine on linux-lts + nvidia-lts. I don't understand why linux + nvidia (and linux + nvidia-dkms) won't work.

Offline

#9 2022-12-25 15:02:12

seth
Member
From: Won't reply 2 private help req
Registered: 2012-09-03
Posts: 76,127

Re: Nvidia driver not working on linux kernel

I actually missed that.
I guess things started to fail w/ 6.0?
https://bbs.archlinux.org/viewtopic.php … 1#p2062391

However, do you get

[    3.741087] NVRM: This is a 64-bit BAR mapped above 4GB by the system
               NVRM: BIOS or the Linux kernel, but the PCI bridge
               NVRM: immediately upstream of this GPU does not define
               NVRM: a matching prefetchable memory window.

on the lts kernel as well?

Online

#10 2022-12-25 15:27:18

rokser
Member
Registered: 2022-12-24
Posts: 30

Re: Nvidia driver not working on linux kernel

seth wrote:

I guess things started to fail w/ 6.0?

No, it started before 6.0 and hasn't been fixed since.
And no, I don't have this error in my dmesg on the lts kernel.

Offline

#11 2022-12-25 15:32:36

seth
Member
From: Won't reply 2 private help req
Registered: 2012-09-03
Posts: 76,127

Re: Nvidia driver not working on linux kernel

That test and its warning aren't new.
Please post the dmesg from either kernel (no grepping)
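
One possible way to capture the complete logs and make them comparable is to save one file per kernel and strip the timestamps before diffing. A rough sketch; the function name and the file names are placeholders, and a diff across two different kernel versions will still be noisy:

```shell
# strip_dmesg_ts: read dmesg text on stdin and drop the leading
# "[    1.234567] " timestamp from each line. Continuation lines that
# carry no timestamp pass through unchanged.
strip_dmesg_ts() {
    sed 's/^\[ *[0-9][0-9]*\.[0-9][0-9]*] //'
}

# Example, run once per kernel (file names are placeholders):
#   sudo dmesg > "dmesg-$(uname -r).txt"
# Then compare the two captures:
#   strip_dmesg_ts < dmesg-main.txt > main.txt
#   strip_dmesg_ts < dmesg-lts.txt > lts.txt
#   diff main.txt lts.txt
```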

Online

#12 2022-12-25 16:02:01

rokser
Member
Registered: 2022-12-24
Posts: 30

Re: Nvidia driver not working on linux kernel

dmesg from linux:
https://pastebin.com/PRxtrWF0
dmesg from linux-lts:
https://pastebin.com/2xeLbZ8t

Offline

#13 2022-12-25 17:02:34

seth
Member
From: Won't reply 2 private help req
Registered: 2012-09-03
Posts: 76,127

Re: Nvidia driver not working on linux kernel

The BAR config is exactly the same between the kernels, so this is probably a misinterpretation and red herring and the only real problem is

[    3.665434] nvidia 0000:02:00.0: Unable to change power state from D3cold to D0, device inaccessible

Add "rcutree.rcu_idle_gp_delay=1" to the kernel parameters.
If that doesn't help, add i915 and nvidia, nvidia_modeset, nvidia_uvm and nvidia_drm to the initramfs (preferably don't use the kms hook; remove it if it's there).
If that doesn't help, keep i915 but remove the nvidia modules from the initramfs.
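
With mkinitcpio, the module list lives in the MODULES=(...) line of /etc/mkinitcpio.conf, and the images are regenerated with "mkinitcpio -P". A small sketch to check which of the wanted modules are still missing from that line; the function name is made up, and it assumes the list sits on a single line with no trailing comment:

```shell
# missing_initramfs_modules CONF MODULE...
# Prints each MODULE that is not listed in the MODULES=(...) line of CONF.
missing_initramfs_modules() {
    conf="$1"
    shift
    # Take the last uncommented MODULES=(...) line and strip the wrapper.
    listed=$(sed -n 's/^MODULES=(\(.*\))[[:space:]]*$/\1/p' "$conf" | tail -n 1)
    for want in "$@"; do
        case " $listed " in
            *" $want "*) ;;          # already listed
            *) echo "$want" ;;       # not in the initramfs module list
        esac
    done
}

# Example:
#   missing_initramfs_modules /etc/mkinitcpio.conf \
#       i915 nvidia nvidia_modeset nvidia_uvm nvidia_drm
#   sudo mkinitcpio -P   # after editing the file
```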

Also

zgrep PREEMPT /proc/config.gz

linux:

[    0.109706] rcu: Preemptible hierarchical RCU implementation.
[    0.109707] rcu: 	RCU restricting CPUs from NR_CPUS=320 to nr_cpu_ids=8.
[    0.109709] rcu: 	RCU priority boosting: priority 1 delay 500 ms.
[    0.109714] 	Trampoline variant of Tasks RCU enabled.
[    0.109715] 	Rude variant of Tasks RCU enabled.
[    0.109717] 	Tracing variant of Tasks RCU enabled.
[    0.109721] rcu: RCU calculated value of scheduler-enlistment delay is 30 jiffies.
[    0.109727] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=8
[    0.118666] rcu: srcu_init: Setting srcu_struct sizes based on contention.
[    0.131129] rcu: Hierarchical SRCU implementation.
[    0.131129] rcu: 	Max phase no-delay instances is 1000.

linux-lts:

[    0.139534] rcu: Hierarchical RCU implementation.
[    0.139537] rcu: 	RCU restricting CPUs from NR_CPUS=320 to nr_cpu_ids=8.
[    0.139539] 	Rude variant of Tasks RCU enabled.
[    0.139540] 	Tracing variant of Tasks RCU enabled.
[    0.139542] rcu: RCU calculated value of scheduler-enlistment delay is 10 jiffies.
[    0.139543] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=8
[    0.168081] rcu: Hierarchical SRCU implementation.

Online

#14 2022-12-25 17:20:23

rokser
Member
Registered: 2022-12-24
Posts: 30

Re: Nvidia driver not working on linux kernel

seth wrote:

zgrep PREEMPT /proc/config.gz

CONFIG_PREEMPT_BUILD=y
# CONFIG_PREEMPT_NONE is not set
# CONFIG_PREEMPT_VOLUNTARY is not set
CONFIG_PREEMPT=y
CONFIG_PREEMPT_COUNT=y
CONFIG_PREEMPTION=y
CONFIG_PREEMPT_DYNAMIC=y
CONFIG_PREEMPT_RCU=y
CONFIG_HAVE_PREEMPT_DYNAMIC=y
CONFIG_HAVE_PREEMPT_DYNAMIC_CALL=y
CONFIG_PREEMPT_NOTIFIERS=y
CONFIG_DRM_I915_PREEMPT_TIMEOUT=640
# CONFIG_DEBUG_PREEMPT is not set
# CONFIG_PREEMPT_TRACER is not set
# CONFIG_PREEMPTIRQ_DELAY_TEST is not set

I'll try the kernel parameter as soon as I can

Offline

#15 2022-12-25 17:30:16

seth
Member
From: Won't reply 2 private help req
Registered: 2012-09-03
Posts: 76,127

Re: Nvidia driver not working on linux kernel

Is the config from the lts, the main or both kernels (ie. no difference at all)?
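
To answer that, the zgrep output could be saved to a file on each kernel and the two files compared. A sketch under that assumption; config_only_in is a made-up name, and it only looks at enabled options (lines starting with CONFIG_):

```shell
# config_only_in A B
# Prints the enabled options from listing A that do not appear with the
# same value in listing B. Prints nothing if A has no such options.
config_only_in() {
    grep '^CONFIG_' "$1" | grep -F -x -v -f "$2" || true
}

# Example, run on each kernel first (file names are placeholders):
#   zgrep PREEMPT /proc/config.gz > "preempt-$(uname -r).txt"
# Then, with both files at hand:
#   config_only_in preempt-main.txt preempt-lts.txt
#   config_only_in preempt-lts.txt preempt-main.txt
```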

Online

#16 2022-12-25 17:33:53

rokser
Member
Registered: 2022-12-24
Posts: 30

Re: Nvidia driver not working on linux kernel

It's from the main, can't reboot the machine at the moment

Offline

#17 2022-12-25 18:21:57

rokser
Member
Registered: 2022-12-24
Posts: 30

Re: Nvidia driver not working on linux kernel

Ok, I tried the rcutree.rcu_idle_gp_delay=1 parameter, then the initramfs modules, but nothing changed.

Offline

#18 2022-12-25 21:58:40

seth
Member
From: Won't reply 2 private help req
Registered: 2012-09-03
Posts: 76,127

Re: Nvidia driver not working on linux kernel

Do you have an updated journal (and also one for "acpi_osi=Windows 2009")?
I suspect some sort of race condition and the lts kernel loaded the GPU modules ~0.5s before the main kernel.
Even if that's not it, there's hopefully some sort of pattern between the good and bad cases.
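
To compare the timing between boots, the timestamp of the first nvidia line in each saved dmesg can be pulled out with something like the following. This is a sketch; first_match_ts is a made-up name, and "grep -m" is assumed to be available (it is in GNU grep):

```shell
# first_match_ts PATTERN
# Reads dmesg text on stdin and prints the timestamp of the first line
# containing PATTERN (treated as a fixed string), or nothing if absent.
first_match_ts() {
    grep -F -m 1 -- "$1" |
        sed -n 's/^\[ *\([0-9][0-9]*\.[0-9][0-9]*\)].*/\1/p'
}

# Example (file name is a placeholder):
#   first_match_ts 'nvidia:' < dmesg-main.txt
```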

Can you somewhat track down the kernel version when this started (maybe in your pacman log)?
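
pacman records every install and upgrade in /var/log/pacman.log, so the kernel history can be listed with a grep along these lines. A minimal sketch; the function name is made up, and the pattern deliberately matches only the "linux" package itself, not linux-lts or linux-headers:

```shell
# kernel_upgrades [LOGFILE]
# Lists the install/upgrade entries of the "linux" package.
# LOGFILE defaults to /var/log/pacman.log.
kernel_upgrades() {
    grep -E '\[ALPM\] (installed|upgraded) linux \(' "${1:-/var/log/pacman.log}"
}

# Example:
#   kernel_upgrades | tail -n 20
```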

Online

#19 2022-12-26 08:19:55

rokser
Member
Registered: 2022-12-24
Posts: 30

Re: Nvidia driver not working on linux kernel

I'm not sure what you meant by updated journal.
I'll try to trace back the kernel version in which the bug first occurred.
Here's dmesg for linux + acpi_osi=! "acpi_osi=Windows 2009" on which the driver loads successfully: https://pastebin.com/HW0rctVg

Offline

#20 2022-12-26 08:45:06

seth
Member
From: Won't reply 2 private help req
Registered: 2012-09-03
Posts: 76,127

Re: Nvidia driver not working on linux kernel

I meant a journal from after adding the modules to the initramfs (the latest log seems to have i915 there) and "rcutree.rcu_idle_gp_delay=1".
The nvidia module loads even later in the most recent ("good") journal, so that's probably not it.

I'll say that your best bet is probably to update the BIOS, but just to be sure: have you tried this w/o apparmor?

Online

#21 2022-12-26 08:49:43

rokser
Member
Registered: 2022-12-24
Posts: 30

Re: Nvidia driver not working on linux kernel

seth wrote:

have you tried this w/o apparmor?

I haven't. I'll give it a shot

Offline

#22 2022-12-26 08:54:32

rokser
Member
Registered: 2022-12-24
Posts: 30

Re: Nvidia driver not working on linux kernel

Disabling apparmor didn't help

Offline

#23 2022-12-27 14:58:11

rokser
Member
Registered: 2022-12-24
Posts: 30

Re: Nvidia driver not working on linux kernel

seth wrote:

your best bet is probably to update the BIOS

Okay, I installed Windows and updated the UEFI to the latest version, but nothing changed.
Do you have any other ideas? Or do you know someone else I could ask?

Offline

#24 2022-12-27 15:58:57

seth
Member
From: Won't reply 2 private help req
Registered: 2012-09-03
Posts: 76,127

Re: Nvidia driver not working on linux kernel

What happens if you blacklist all nvidia drivers (check "lsmod", do NOT use the "install /bin/true" approach) and explicitly load them after the boot?
https://wiki.archlinux.org/title/Kernel … acklisting
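
A plain "blacklist" entry only stops the module from being auto-loaded through its aliases; an explicit modprobe by name still works, whereas "install <module> /bin/true" would turn that explicit modprobe into a no-op as well. A sketch of generating such a file; the function name and the target file name are assumptions, not something the wiki prescribes:

```shell
# write_nvidia_blacklist FILE
# Writes "blacklist" entries for the nvidia modules to FILE. On a real
# system FILE would be something like
# /etc/modprobe.d/blacklist-nvidia.conf (name chosen for illustration).
write_nvidia_blacklist() {
    for mod in nvidia nvidia_modeset nvidia_uvm nvidia_drm; do
        printf 'blacklist %s\n' "$mod"
    done > "$1"
}

# After rebooting, confirm nothing was auto-loaded, then load by hand:
#   lsmod | grep nvidia
#   sudo modprobe -v nvidia_drm
```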

Online

#25 2022-12-28 09:55:35

rokser
Member
Registered: 2022-12-24
Posts: 30

Re: Nvidia driver not working on linux kernel

seth wrote:

blacklist all nvidia drivers and explicitly load them

Tried that, gives the same

insmod /lib/modules/6.1.1-arch1-1/extramodules/nvidia.ko.xz 
modprobe: ERROR: could not insert 'nvidia_modeset': No such device
insmod /lib/modules/6.1.1-arch1-1/extramodules/nvidia.ko.xz 
modprobe: ERROR: could not insert 'nvidia_drm': No such device

and so on

Offline
