#1 2023-01-01 21:15:30

H2R
Member
Registered: 2021-11-13
Posts: 13

[SOLVED] Kernel Module Fails To Load With Any Nvidia Card

I am having startup issues after installing the proprietary Nvidia driver. I have tested this on 3 different Nvidia cards, and all show the same behavior. The system boots fine with any AMD or Intel card.

I get a long delay before the prompt for the LUKS password; after ~2 minutes the error is "timed out for waiting the udev queue to become empty." After entering the password I get some console output and then the screen hangs. Adding "nomodeset" as a kernel parameter gets me further, but the boot then stalls at "loading a kernel module, no limit" until it eventually gives up and I can get to the login.

The motherboard is an MSI Z790 Ace with an i9-13900K.

Last edited by H2R (2023-01-06 19:27:41)

#2 2023-01-01 21:18:48

H2R

This is my module list:

MODULES=(btrfs nvidia nvidia_uvm nvidia_modeset nvidia_drm)

and I have Nvidia modeset enabled via a kernel parameter.

Last edited by H2R (2023-01-01 21:19:05)

#3 2023-01-01 21:25:27

seth
Member
Registered: 2012-09-03
Posts: 51,132

H2R wrote: "I get some console output and then the screen hangs."

Can you boot the multi-user.target (2nd link below)?

Otherwise reboot from there by frenetically pressing ctrl+alt+del or via https://wiki.archlinux.org/title/Keyboa … el_(SysRq) and post the system journal of that boot:

sudo journalctl -b -1 | curl -F 'file=@-' 0x0.st # for the previous boot, otherwise decrease "-1" to go back in time

You do have "ibt=off" in the kernel parameters?
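
For anyone following along, one quick way to answer that is to look at the command line the running kernel was booted with, in /proc/cmdline. This is only a rough sketch: has_param is a made-up helper name, not a standard tool, and it assumes the parameters are separated by spaces.

```shell
# has_param is a hypothetical helper (not a standard tool).
# Succeeds if PARAM appears as a whole, space-separated word in CMDLINE.
has_param() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Check the command line the running kernel was actually booted with.
if has_param "$(cat /proc/cmdline)" "ibt=off"; then
    echo "ibt=off is set"
else
    echo "ibt=off is missing"
fi
```

Matching on whole words avoids a false hit on a longer parameter that merely ends in the same text.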

#4 2023-01-04 04:12:43

H2R

So I do have "ibt=off" set as a kernel parameter. Looking through journalctl I find this somewhat unhelpful udev error:

systemd-udevd[971]: nvidia: Process '/usr/bin/bash -c '/usr/bin/mknod -Z -m 666 /dev/nvidiactl c $(grep nvidia-frontend /proc/devices | cut -d \  -f 1) 255'' failed with exit code 1.

The full log can be found here: https://0x0.st/oRtO.txt

pacman -Q nvidia-open-dkms
nvidia-open-dkms 525.60.11-6

My impression was that "ibt=off" is now only needed for the closed-source kernel driver. For no obvious reason the delay before the LUKS prompt is gone, but I still can't get to the login screen.

Last edited by H2R (2023-01-04 04:56:34)

#5 2023-01-04 15:15:23

seth

Please don't apply random filters to the journal. I'll assume your hostname is something embarrassing like "pornstation", but nobody actually cares.
Removing timestamps isn't helpful and it fucks up syntax highlighting.

pci 0000:01:00.0: [10de:2704] type 00 class 0x030000

is your GPU, a brand new "Asus RTX 4080 16 GB"?
Support for it only arrived with very recent drivers, though 525.60.11 should™ support it and is successfully loaded for the device:

NVRM: loading NVIDIA UNIX Open Kernel Module for x86_64  525.60.11  Release Build  (archlinux-builder@)  
nvidia-uvm: Loaded the UVM driver, major device number 236.
nvidia-modeset: Loading NVIDIA UNIX Open Kernel Mode Setting Driver for x86_64  525.60.11  Release Build  (archlinux-builder@)  
[drm] [nvidia-drm] [GPU ID 0x00000100] Loading driver
[drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:01:00.0 on minor 0

H2R wrote: "I still can't get to the login screen."

Try to only boot the multi-user.target, 2nd link below

For a better analysis of what actually happened during that boot, post the UNEDITED journal.
It's not supposed to contain any private data and nobody here really cares whether your username is "ifuckmyhamster".

If you do want to obfuscate any data, you need to do this in an obvious and pseudonymizing way (i.e. use a unique token for each replacement) and leave the timestamps alone.
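
As a rough illustration of what such an obvious replacement could look like, here is a minimal sketch. The pseudonymize helper and the HOST_A / USER_A tokens are made up for this example, and it assumes the hostname and username are non-empty, plain alphanumeric strings that don't appear inside other words.

```shell
# pseudonymize is a hypothetical helper; HOST_A and USER_A are arbitrary tokens.
# Reads a journal on stdin and replaces each private value with its own
# unique token, so readers can still tell the values apart.
# Timestamps and everything else pass through untouched.
# Assumes both values are plain alphanumerics (nothing special to sed).
pseudonymize() {
    sed -e "s/$1/HOST_A/g" -e "s/$2/USER_A/g"
}

# Usage, reusing the upload command from earlier in the thread:
#   sudo journalctl -b -1 | pseudonymize "$(uname -n)" "$USER" | curl -F 'file=@-' 0x0.st
```

Because every value gets a fixed, distinct token, the log stays readable and it's obvious to everyone what was replaced.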

#6 2023-01-04 17:40:01

H2R

Sorry about the filter, it was more readable for me at the time. I don't even have a unique hostname at this point, it's all testing on a benchmarking machine. "ifuckmyhamster@pornstation" is actually pretty creative though, I might go with that. I assumed it was an Nvidia/motherboard problem because I got the same behavior with Ampere and Turing cards.

Here is the journal without the console filter: https://0x0.st/oR3u.txt

#7 2023-01-04 21:02:56

seth

Jan 04 11:34:06 pornstation systemd[1]: Reached target Multi-User System.
…
Jan 04 11:35:16 pornstation systemd[1]: Received SIGINT.
…
Jan 04 11:35:16 pornstation systemd[1]: Stopped target Multi-User System.

You reach the multi-user.target, then basically nothing happens and 70 seconds later you press ctrl+c or ctrl+alt+del and the system reboots.

There are no critical errors logged.

Jan 04 11:34:05 pornstation systemd-udevd[1550]: nvidia: Process '/usr/bin/bash -c '/usr/bin/mknod -Z -m 666 /dev/nvidiactl c $(grep nvidia-frontend /proc/devices | cut -d \  -f 1) 255'' failed with exit code 1.
Jan 04 11:34:05 pornstation systemd-udevd[1550]: nvidia: Process '/usr/bin/bash -c 'for i in $(cat /proc/driver/nvidia/gpus/*/information | grep Minor | cut -d \  -f 4); do /usr/bin/mknod -Z -m 666 /dev/nvidia${i} c $(grep nvidia-frontend /proc/devices | cut -d \  -f 1) ${i}; done'' failed with exit code 1.

is standard and meaningless in this case.

But: there's

Jan 04 11:34:06 pornstation kernel: i915 0000:00:02.0: [drm] Cannot find any crtc or sizes
Jan 04 11:34:06 pornstation kernel: i915 0000:00:02.0: [drm] Cannot find any crtc or sizes

There are actually two GPUs in the system. If you have wired the output to the nvidia chip and cannot deactivate the IGP, try to simply blacklist i915:
https://wiki.archlinux.org/title/Kernel … acklisting
Use the "module_blacklist=i915" kernel parameter for a simple and transient approach.

If i915 still shows up in the journal, you'll have to use the /bin/true method.
/etc/modprobe.d/bbc.conf # "blacklist bad chip" … or what did you think?

install i915 /bin/true

If the module then *still* shows up in the journal, it's probably copied into the initramfs (through the "kms" hook) and you need to rebuild that.
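
To check whether i915 still shows up after a reboot, something along these lines might help. It is only a sketch: i915_in_log is a made-up helper, and the pattern assumes the driver's messages look like the "kernel: i915 0000:00:02.0: ..." lines quoted above.

```shell
# i915_in_log is a hypothetical helper.
# Reads a kernel log on stdin and succeeds if the i915 driver itself logged
# something. The pattern is anchored on "kernel: i915 " so that the echoed
# kernel command line (which may contain "module_blacklist=i915") does not match.
i915_in_log() {
    grep -q 'kernel: i915 '
}

# Usage on the current boot (-k limits the journal to kernel messages):
#   journalctl -b -k | i915_in_log && echo "i915 is still being loaded"
```

If that still reports i915 with the /bin/true override in place, the initramfs is the likely culprit; on a default Arch setup with mkinitcpio, rebuilding it would be "sudo mkinitcpio -P", followed by another reboot.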

#8 2023-01-04 21:42:39

H2R

Nice catch. Blacklisting the IGP driver seems to have fixed the problem. Thanks.

#9 2023-01-04 22:10:05

seth

\o/

Please always remember to mark resolved threads by editing your initial post's subject, so others will know that there's no task left, but maybe a solution to find.
Thanks.
