#1 2023-06-01 14:02:22

sourproton
Member
Registered: 2023-04-07
Posts: 50

[SOLVED] Can't get GPU to use nvidia driver

UPDATE: It was a hardware problem. I'm returning the laptop.

---

For the first time I have a laptop with two GPUs: the Intel processor's integrated GPU and a dedicated NVIDIA GPU. I want to use primarily the Intel GPU, but be able to run games on the NVIDIA one with prime-run.

I just performed a new Arch installation following the installation guide. To install NVIDIA's driver, I installed the nvidia package, removed kms from the HOOKS array in /etc/mkinitcpio.conf and regenerated the initramfs with mkinitcpio -P.
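
The kms removal described above can be sketched as a small filter. This is only an illustration, not the wiki's procedure: the function name strip_kms_hook is made up here, and it prints the edited file to standard output instead of changing /etc/mkinitcpio.conf, so the result can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper: print an mkinitcpio.conf with "kms" dropped from
# the active (uncommented) HOOKS=(...) line. The input file is not modified.
strip_kms_hook() {
    awk '
        /^HOOKS=\(/ {
            inner = $0
            sub(/^HOOKS=\(/, "", inner)   # drop the leading HOOKS=(
            sub(/\).*$/, "", inner)       # drop the closing ) and anything after it
            n = split(inner, hooks, /[ \t]+/)
            out = ""
            for (i = 1; i <= n; i++)
                if (hooks[i] != "" && hooks[i] != "kms")
                    out = out (out == "" ? "" : " ") hooks[i]
            print "HOOKS=(" out ")"
            next
        }
        { print }                         # every other line passes through unchanged
    ' "$1"
}
```

Running strip_kms_hook /etc/mkinitcpio.conf shows the proposed file. Once it looks right, the change still has to be written back as root and followed by mkinitcpio -P, as in the post.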

I then rebooted, removed the installation media and checked

lspci -k | grep -A 2 -E "(VGA|3D)"

which showed the NVIDIA GPU was using the nvidia driver.

Then I installed xorg, plasma, plasma-wayland-session, kde-applications and nvidia-prime.

After booting with sddm,

lspci -k | grep -A 2 -E "(VGA|3D)"

returns

00:02.0 VGA compatible controller: Intel Corporation Raptor Lake-P [Iris Xe Graphics] (rev 04)
	Subsystem: Lenovo Raptor Lake-P [Iris Xe Graphics]
	Kernel driver in use: i915
--
01:00.0 3D controller: NVIDIA Corporation AD107M [GeForce RTX 4050 Max-Q / Mobile] (rev a1)
	Subsystem: Lenovo GN21-X2
	Kernel modules: nouveau

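which shows the NVIDIA GPU is not bound to the nvidia driver: there is no "Kernel driver in use" line for it, and nouveau is the only kernel module listed.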
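
In lspci -k output, "Kernel driver in use" names the driver actually bound to a device, while "Kernel modules" only lists modules that could drive it. A minimal sketch that reports the bound driver per GPU, assuming the output format shown above; the function name gpu_driver_in_use is made up for this illustration.

```shell
#!/bin/sh
# Hypothetical helper: read `lspci -k` output on stdin and print one line per
# VGA/3D device: "<slot> <bound driver>", or "<slot> none" if nothing is bound.
gpu_driver_in_use() {
    awk '
        function emit() { if (slot != "") print slot, (driver == "" ? "none" : driver) }
        /^[0-9a-f]/ {                     # device header lines start at column 0
            emit(); slot = ""; driver = ""
            if ($0 ~ /VGA compatible controller|3D controller/) slot = $1
            next
        }
        slot != "" && /Kernel driver in use:/ { driver = $NF }
        END { emit() }
    '
}
```

It would be used as lspci -k | gpu_driver_in_use. Feeding it the unfiltered lspci -k output avoids the grep -A 2 cut-off hiding one of the two lines.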

I can't get it to use the nvidia driver, or use nvidia-settings, nvidia-prime, nvidia-smi. I tried reinstalling nvidia but it didn't change anything.

Any help on getting NVIDIA Optimus to work with PRIME render offload?

Last edited by sourproton (2023-06-03 11:47:34)

#2 2023-06-01 14:27:27

V1del
Forum Moderator
Registered: 2012-10-16
Posts: 21,720

Re: [SOLVED] Can't get GPU to use nvidia driver

The steps you mentioned should be the ticket. Are you accidentally booting an old/outdated kernel? What exactly is the current setup? Post the output of:

uname -a
pacman -Q linux
pacman -Qs nvidia
sudo journalctl -b

#3 2023-06-01 15:09:55

sourproton
Member
Registered: 2023-04-07
Posts: 50

Re: [SOLVED] Can't get GPU to use nvidia driver

Hi, thanks for answering. I'm using version 6.3.4 of the kernel because, for some reason, version 6.3.5 breaks my system. That's another (?) story I need to debug. With 6.3.5 the boot crashes at the last moment, just before, or during the launch of sddm.

> $ uname -a
Linux sabbath 6.3.4-arch2-1 #1 SMP PREEMPT_DYNAMIC Mon, 29 May 2023 13:58:34 +0000 x86_64 GNU/Linux

> $ pacman -Q linux
linux 6.3.4.arch2-1

> $ pacman -Qs nvidia
local/egl-wayland 2:1.1.11-4
    EGLStream-based Wayland external platform
local/lib32-nvidia-utils 530.41.03-1
    NVIDIA drivers utilities (32-bit)
local/libvdpau 1.5-1
    Nvidia VDPAU library
local/libxnvctrl 530.41.03-1
    NVIDIA NV-CONTROL X extension
local/nvidia 530.41.03-15
    NVIDIA drivers for linux
local/nvidia-prime 1.0-4
    NVIDIA Prime Render Offload configuration and utilities
local/nvidia-settings 530.41.03-1
    Tool for configuring the NVIDIA graphics driver
local/nvidia-utils 530.41.03-1
    NVIDIA drivers utilities

sudo journalctl -b

Last edited by sourproton (2023-06-01 15:11:20)

#4 2023-06-01 15:30:09

V1del
Forum Moderator
Registered: 2012-10-16
Posts: 21,720

Re: [SOLVED] Can't get GPU to use nvidia driver

That is the reason for the issue. Any given nvidia module is built against a specific kernel. If you opt to downgrade the kernel, you need to downgrade the nvidia package accordingly (or opt for the linux-headers package and nvidia-dkms, which will build the module locally for you against the kernel headers). So you need to downgrade the nvidia package to 530.41.03-14 as well.
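
One way to see such a mismatch is to check whether any nvidia module file exists under the module tree of the kernel that is actually running. A minimal sketch, assuming modules live under /usr/lib/modules/<release>/ as on Arch; the function name module_built_for is made up for this illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a module file named <name>.ko (optionally
# compressed, e.g. .ko.zst) exists anywhere under the tree of one kernel release.
#   $1 = module name, $2 = kernel release, $3 = modules root (assumed default below)
module_built_for() {
    root=${3:-/usr/lib/modules}
    [ -d "$root/$2" ] || return 1
    [ -n "$(find "$root/$2" -name "$1.ko*" -print 2>/dev/null | head -n 1)" ]
}
```

For example, module_built_for nvidia "$(uname -r)" || echo "no nvidia module for the running kernel". A failure here would be consistent with the nvidia package having been built for a different kernel release than the one booted.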

#5 2023-06-01 16:14:37

sourproton
Member
Registered: 2023-04-07
Posts: 50

Re: [SOLVED] Can't get GPU to use nvidia driver

Thank you for that information. As recommended, I downgraded the nvidia package to version 530.41.03-14 but now only the intel GPU shows under lspci:

> $ uname -a
Linux sabbath 6.3.4-arch2-1 #1 SMP PREEMPT_DYNAMIC Mon, 29 May 2023 13:58:34 +0000 x86_64 GNU/Linux

> $ pacman -Q linux
linux 6.3.4.arch2-1

> $ pacman -Qs nvidia
local/egl-wayland 2:1.1.11-4
    EGLStream-based Wayland external platform
local/lib32-nvidia-utils 530.41.03-1
    NVIDIA drivers utilities (32-bit)
local/libvdpau 1.5-1
    Nvidia VDPAU library
local/libxnvctrl 530.41.03-1
    NVIDIA NV-CONTROL X extension
local/nvidia 530.41.03-14
    NVIDIA drivers for linux
local/nvidia-lts 1:530.41.03-12
    NVIDIA drivers for linux-lts
local/nvidia-prime 1.0-4
    NVIDIA Prime Render Offload configuration and utilities
local/nvidia-settings 530.41.03-1
    Tool for configuring the NVIDIA graphics driver
local/nvidia-utils 530.41.03-1
    NVIDIA drivers utilities

> $ lspci -k | grep -A 2 -E "(VGA|3D)"
00:02.0 VGA compatible controller: Intel Corporation Raptor Lake-P [Iris Xe Graphics] (rev 04)
	Subsystem: Lenovo Raptor Lake-P [Iris Xe Graphics]
	Kernel driver in use: i915

> $ prime-run glxinfo | grep "OpenGL renderer"
X Error of failed request:  BadValue (integer parameter out of range for operation)
  Major opcode of failed request:  150 (GLX)
  Minor opcode of failed request:  24 (X_GLXCreateNewContext)
  Value in failed request:  0x0
  Serial number of failed request:  50
  Current serial number in output stream:  51

> $ nvidia-smi
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

#6 2023-06-01 16:21:17

V1del
Forum Moderator
Registered: 2012-10-16
Posts: 21,720

Re: [SOLVED] Can't get GPU to use nvidia driver

What does the journal show now? This shouldn't be able to have such an effect unless you have a very low-level PCI crash right now.

Last edited by V1del (2023-06-01 16:21:51)

#7 2023-06-01 16:31:19

sourproton
Member
Registered: 2023-04-07
Posts: 50

Re: [SOLVED] Can't get GPU to use nvidia driver

I tried restarting to generate a clean log but now it won't boot. It's the same behaviour as with the newest kernel. So the problem might not be the newest kernel after all, but something else, perhaps related to this nvidia driver story.

I can't even switch to another VT with Ctrl+Alt+F2; the screen is 100% frozen.

My last two lines are

iwlwifi 0000:00:14.3:  WRT:  Invalid buffer destination
Bluetooth:  hci0: Malformed MSFT vendor event:  0x02

Those two lines were also present before, they were the last shown before the sddm screen.

I can arch-chroot from my USB stick, update the kernel and nvidia, and regenerate the initramfs, but I know the boot behaviour will be the same, with the same freeze.

Any clues on how to debug that?

Last edited by sourproton (2023-06-01 16:33:55)

#8 2023-06-01 16:40:57

sourproton
Member
Registered: 2023-04-07
Posts: 50

Re: [SOLVED] Can't get GPU to use nvidia driver

So I chrooted and ran yay to upgrade everything and then mkinitcpio -P.

The system still won't boot, crashing at the same moment with the same last two lines.

I also have the LTS kernel, it is booting but barely working. Sddm screen is normal but the mouse is veeeery laggy, like 1fps, and after login I have a black screen.

---

Edit: I upgraded the nvidia package to the latest version while keeping the kernel at 6.3.4. So now I'm back at the beginning, with the GPU showing up in lspci with nouveau listed.

So now I know it's not working because the kernel and nvidia versions are not compatible, but I don't know how to proceed, since downgrading nvidia or upgrading the kernel breaks the system.

---

Edit 2: I upgraded the kernel to its most recent version and removed the nvidia package with pacman -R nvidia. The system boots normally. So the crash before sddm might be because of the latest nvidia version. The GPU shows under lspci using nouveau.

Removing nvidia-lts doesn't solve the sluggishness from the LTS kernel booting.

Last edited by sourproton (2023-06-01 17:52:08)

#9 2023-06-01 17:52:14

V1del
Forum Moderator
Registered: 2012-10-16
Posts: 21,720

Re: [SOLVED] Can't get GPU to use nvidia driver

The "latest nvidia version" is simply a rebuild against the kernel, version wise nothing changed there.

From the working system, post

sudo journalctl -b-1

assuming the boot prior to your current one was the broken one. Is xf86-video-intel installed? You shouldn't have that one; remove it if it is.

#10 2023-06-01 18:06:05

sourproton
Member
Registered: 2023-04-07
Posts: 50

Re: [SOLVED] Can't get GPU to use nvidia driver

The latest kernel, without nvidia installed, generated this journal

pacman -Qs xf86-video-intel returns nothing, so xf86-video-intel is not installed.

#11 2023-06-01 18:14:31

sourproton
Member
Registered: 2023-04-07
Posts: 50

Re: [SOLVED] Can't get GPU to use nvidia driver

Another thing that might be relevant is that I received this new PC on Tuesday. It was my first time dealing with this switchable graphics thing so I was trying a lot of different stuff, but in the end I did manage to run prime-run minecraft and get it to work with the NVIDIA GPU, confirmed by the F3 screen inside the game.

I ran yay on Tuesday night or Wednesday morning and after that it couldn't boot, presenting the aforementioned boot freeze behavior, which I then "solved" by downgrading the kernel, from which point the GPU no longer used the nvidia driver.

I did plenty of clean installs since then.

If I want to re-test the May 29th or 30th's version of the kernel, how do I know which version of nvidia to install?

Also, system-maintenance-wise, if there is a new kernel or nvidia version available, is it safe to install it without verifying their compatibilities with each other's older version?

Last edited by sourproton (2023-06-01 18:28:18)

#12 2023-06-01 20:10:02

espritlibre
Member
Registered: 2022-12-15
Posts: 128

Re: [SOLVED] Can't get GPU to use nvidia driver

sourproton wrote:

If I want to re-test the May 29th or 30th's version of the kernel, how do I know which version of nvidia to install?

Also, system-maintenance-wise, if there is a new kernel or nvidia version available, is it safe to install it without verifying their compatibilities with each other's older version?

I'd just go with nvidia-dkms; it's way less hassle in case you have to downgrade the kernel. Compiling the kernel module when upgrading or downgrading the kernel takes a couple of seconds. Just uninstall nvidia and nvidia-lts, then install nvidia-dkms and the header packages corresponding to your installed kernels.
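
The switch to nvidia-dkms can be sketched as a dry run that only prints the pacman commands for review. This assumes, as in this thread, that both nvidia and nvidia-lts are currently installed and that each kernel package has a matching "<kernel>-headers" package; the function name dkms_switch_plan is made up for this illustration.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the commands for moving from the
# prebuilt nvidia packages to nvidia-dkms.
#   arguments = installed kernel package names, e.g. linux linux-lts
dkms_switch_plan() {
    headers=""
    for kernel in "$@"; do
        headers="$headers $kernel-headers"   # linux -> linux-headers, and so on
    done
    echo "pacman -R nvidia nvidia-lts"
    echo "pacman -S --needed nvidia-dkms$headers"
}
```

Calling dkms_switch_plan linux linux-lts prints the removal line followed by an install line naming nvidia-dkms, linux-headers and linux-lts-headers. The printed commands would then be run as root after checking that they match what is actually installed.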

EDIT:

ArchWiki

ArchWiki wrote:

5. Remove kms from the HOOKS array in /etc/mkinitcpio.conf and regenerate the initramfs. This will prevent the initramfs from containing the nouveau module making sure the kernel cannot load it during early boot.

Last edited by espritlibre (2023-06-01 20:16:15)

#13 2023-06-01 20:19:07

V1del
Forum Moderator
Registered: 2012-10-16
Posts: 21,720

Re: [SOLVED] Can't get GPU to use nvidia driver

That journal shows the nvidia module is neither loaded nor found. We'd need a failing boot with the correct nvidia module (assuming this is an nvidia issue in the first place; there's a crash on intel in that output). I'd agree with the above. Otherwise, if you want to know, look at the changelog of the nvidia package; it mentions which version is a rebuild against which kernel: https://gitlab.archlinux.org/archlinux/ … mmits/main

As for the follow up question, there's nothing you need to do here, the maintainers generally take care of that, assuming you're updating your system normally.

#14 2023-06-01 20:38:14

sourproton
Member
Registered: 2023-04-07
Posts: 50

Re: [SOLVED] Can't get GPU to use nvidia driver

Thank you both for the information and the suggestion. After this discussion, nvidia-dkms seemed more appealing to me, but I still can't boot with nvidia installed.

I have both linux-headers and linux-lts-headers. I uninstalled nvidia and nvidia-lts, installed nvidia-dkms, removed kms from the hooks and restarted. Both linux and linux-lts crash before sddm and freeze, without the possibility of switching VTs with Ctrl+Alt+F2.

How can I share these journals? Is there a way to access them via chroot and store them in a file, so that I can uninstall nvidia-dkms and boot?

Also, my GPU is pretty recent, it has the Ada Lovelace architecture. Can this be the cause of the issue due to some lack of compatibility?

Last edited by sourproton (2023-06-01 20:49:08)

#15 2023-06-01 21:52:25

V1del
Forum Moderator
Registered: 2012-10-16
Posts: 21,720

Re: [SOLVED] Can't get GPU to use nvidia driver

You're the first person I'm seeing that seems to have a general issue here. After reproducing the crash, you can still post the journal from the chroot with journalctl -b; afaik, in the chroot that should be the broken boot.
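
As an alternative to running journalctl inside the chroot, the journal of the installed system can be read from the live environment by pointing journalctl at its journal directory. A minimal sketch, assuming the installed root is mounted at a path such as /mnt and keeps a persistent journal under var/log/journal; the function names are made up for this illustration. Listing the boots first and then passing an explicit boot ID or offset avoids guessing which entry is the broken boot.

```shell
#!/bin/sh
# Hypothetical helpers for reading the installed system's journal from outside it.
#   $1 = mount point of the installed system (assumption: e.g. /mnt)
list_saved_boots() {
    journalctl --directory="$1/var/log/journal" --list-boots --no-pager
}

#   $1 = mount point, $2 = boot ID or offset taken from the listing, $3 = output file
save_boot_journal() {
    journalctl --directory="$1/var/log/journal" --boot="$2" --no-pager > "$3"
}
```

For example, list_saved_boots /mnt followed by save_boot_journal /mnt <boot-id> /mnt/root/failed-boot.log, where <boot-id> is copied from the listing. The resulting file can be uploaded after booting back into a working system.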

#16 2023-06-01 22:24:55

sourproton
Member
Registered: 2023-04-07
Posts: 50

Re: [SOLVED] Can't get GPU to use nvidia driver

Here is the journal of a failed boot that froze with nvidia-dkms installed.

---

Edit: I also tried the DRM kernel mode setting by installing nvidia-dkms and adding nvidia, nvidia_modeset, nvidia_uvm and nvidia_drm to the initramfs and then running mkinitcpio -P, but without success. I have the same crash.

Here is the journal.

Last edited by sourproton (2023-06-01 22:52:02)

#17 2023-06-02 01:10:53

V1del
Forum Moderator
Registered: 2012-10-16
Posts: 21,720

Re: [SOLVED] Can't get GPU to use nvidia driver

It boots completely normally (there is some weird intel crash, though this seems to be present everywhere and not actually have an effect on the rest, but you might want to look into a UEFI/BIOS update) and then sddm dies. Post your /var/log/Xorg.0.log. Did you generate a config with nvidia-xconfig? Don't do that; remove the config file it generated. On "black screen and crashed", can you switch VTs?

Last edited by V1del (2023-06-02 01:11:19)

#18 2023-06-02 06:16:09

sourproton
Member
Registered: 2023-04-07
Posts: 50

Re: [SOLVED] Can't get GPU to use nvidia driver

I did try nvidia-xconfig but I'm not sure it was in this build... I'm going to redo a clean install with a bigger /boot partition (I ran into problems building the initramfs for two kernels with all those modules) and install nvidia-dkms last, without running nvidia-xconfig.

Here is /var/log/Xorg.0.log:

[   100.649] _XSERVTransSocketUNIXCreateListener: ...SocketCreateListener() failed
[   100.649] _XSERVTransMakeAllCOTSServerListeners: server already running
[   100.649] (EE) 
Fatal server error:
[   100.649] (EE) Cannot establish any listening sockets - Make sure an X server isn't already running(EE) 
[   100.649] (EE) 
Please consult the The X.Org Foundation support 
	 at http://wiki.x.org
 for help. 
[   100.649] (EE) Please also check the log file at "/var/log/Xorg.0.log" for additional information.
[   100.649] (EE) 
[   100.649] (EE) Server terminated with error (1). Closing log file.

#19 2023-06-02 08:13:44

V1del
Forum Moderator
Registered: 2012-10-16
Posts: 21,720

Re: [SOLVED] Can't get GPU to use nvidia driver

Checking again in your latest journal I see

juin 02 00:37:08 sabbath kernel: NVRM: GPU at PCI:0000:01:00: GPU-57396ea2-7a31-5c83-5fe4-d8766a281d47
juin 02 00:37:08 sabbath kernel: NVRM: Xid (PCI:0000:01:00): 62, pid='<unknown>', name=<unknown>, badfbadf(badfbadf) 00000000 00000000
juin 02 00:37:13 sabbath kernel: i915 0000:00:02.0: [drm] Selective fetch area calculation failed in pipe A

which isn't really a good sign. Xid 62 is some internal microcontroller failure, so this could indeed be a driver issue, but I already find the preceding intel_bios crash weird. Is there a UEFI/firmware update available for this system?
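
To pull such Xid reports out of a long journal, the kernel lines can be filtered for the NVRM prefix. A minimal sketch, assuming the message format shown in the lines quoted above; the function name list_xids is made up for this illustration.

```shell
#!/bin/sh
# Hypothetical helper: read kernel log lines on stdin and print
# "<PCI address> <Xid number>" for every NVRM Xid report found.
list_xids() {
    sed -n 's/.*NVRM: Xid (PCI:\([0-9a-fA-F:.]*\)): \([0-9][0-9]*\),.*/\1 \2/p'
}
```

It would be used as journalctl -k -b | list_xids. The numbers can then be looked up in NVIDIA's Xid error documentation.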

#20 2023-06-02 08:44:36

sourproton
Member
Registered: 2023-04-07
Posts: 50

Re: [SOLVED] Can't get GPU to use nvidia driver

I don't know how to check for that; Lenovo only supports Windows methods for updating the BIOS. Either way, I would assume there aren't any updates, since on their website the last .exe to update the BIOS dates from April, and my machine was assembled in May. The BIOS is up to date, version LWCN22WW.

I'll be re-installing from scratch now.

Last edited by sourproton (2023-06-02 09:58:13)

#21 2023-06-02 09:40:20

sourproton
Member
Registered: 2023-04-07
Posts: 50

Re: [SOLVED] Can't get GPU to use nvidia driver

So I reinstalled Arch. First I installed xorg, plasma and plasma-wayland and verified it booted fine. The GPU was using the nouveau driver, as expected.

Then I installed nvidia-dkms, without modifying anything else (the wiki says to remove kms if installing nvidia, not nvidia-dkms).

It still crashes. Here is the journal.

I uninstalled nvidia-dkms and it boots again. Here is the journal.

Last edited by sourproton (2023-06-02 10:01:01)

#22 2023-06-02 10:28:02

V1del
Forum Moderator
Registered: 2012-10-16
Posts: 21,720

Re: [SOLVED] Can't get GPU to use nvidia driver

Yeah, you get that Xid; this is either a driver bug or something else being very wonky with your firmware. One thing you can try is opting for the linux-lts kernel and linux-lts-headers and then installing an older version of the nvidia driver via the ALA (Arch Linux Archive) to test whether you can get something stable.

https://wiki.archlinux.org/title/Arch_Linux_Archive
https://archive.archlinux.org/packages/n/nvidia-dkms/
https://archive.archlinux.org/packages/n/nvidia-utils/

You could test e.g. 525.89.02-2
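
Installing a specific archived build usually means handing pacman -U the package file or its URL. A minimal sketch that builds such a URL from the archive directories linked above, assuming the file naming <name>-<version>-<arch>.pkg.tar.zst; the function name ala_url is made up for this illustration, and the exact file names should be checked against the archive listing before use.

```shell
#!/bin/sh
# Hypothetical helper: print the presumed Arch Linux Archive URL of a package build.
#   $1 = package name, $2 = full version (pkgver-pkgrel), $3 = architecture (default x86_64)
ala_url() {
    first=$(printf '%s' "$1" | cut -c1)   # archive directories are grouped by first letter
    printf 'https://archive.archlinux.org/packages/%s/%s/%s-%s-%s.pkg.tar.zst\n' \
        "$first" "$1" "$1" "$2" "${3:-x86_64}"
}
```

A possible use, run as root: pacman -U "$(ala_url nvidia-dkms 525.89.02-2)" "$(ala_url nvidia-utils 525.89.02-2)". Both packages are given in one transaction so that the driver and its utilities stay at the same version.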

#23 2023-06-02 10:32:59

Lone_Wolf
Member
From: Netherlands, Europe
Registered: 2005-10-04
Posts: 11,919

Re: [SOLVED] Can't get GPU to use nvidia driver

Please stop booting to sddm/graphical target for the time being.

Troubleshooting is much easier in multi-user.target (very close to the old sysv-init runlevel 3); see https://wiki.archlinux.org/title/System … _boot_into

Both logs show a similar crash with the intel videochip.
Is the i915 kernel module included in the initramfs through MODULES= in /etc/mkinitcpio.conf?
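
For a one-off boot into multi-user.target, the kernel parameter systemd.unit=multi-user.target can be appended to the kernel command line in the boot loader. A minimal sketch of that string edit, assuming the command line is available as a single string; the function name with_text_boot is made up for this illustration.

```shell
#!/bin/sh
# Hypothetical helper: print a kernel command line with
# systemd.unit=multi-user.target appended, unless a systemd.unit= is already set.
with_text_boot() {
    case " $1 " in
        *" systemd.unit="*) printf '%s\n' "$1" ;;
        *) printf '%s\n' "$1 systemd.unit=multi-user.target" ;;
    esac
}
```

To make the change persistent instead, systemctl set-default multi-user.target switches the default target, and systemctl set-default graphical.target switches it back once troubleshooting is done.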

#24 2023-06-02 10:37:48

sourproton
Member
Registered: 2023-04-07
Posts: 50

Re: [SOLVED] Can't get GPU to use nvidia driver

Thank you for the suggestions. I'll experiment with linux-lts and older versions of the driver.

I'll also read the page on how to boot into a non-GUI interface, thank you for the links.

My mkinitcpio.conf MODULES array is empty by default:

# vim:set ft=sh
# MODULES
# The following modules are loaded before any boot hooks are
# run.  Advanced users may wish to specify all system modules
# in this array.  For instance:
#     MODULES=(usbhid xhci_hcd)
MODULES=()

# BINARIES
# This setting includes any additional binaries a given user may
# wish into the CPIO image.  This is run last, so it may be used to
# override the actual binaries included by a given hook
# BINARIES are dependency parsed, so you may safely ignore libraries
BINARIES=()

# FILES
# This setting is similar to BINARIES above, however, files are added
# as-is and are not parsed in any way.  This is useful for config files.
FILES=()

# HOOKS
# This is the most important setting in this file.  The HOOKS control the
# modules and scripts added to the image, and what happens at boot time.
# Order is important, and it is recommended that you do not change the
# order in which HOOKS are added.  Run 'mkinitcpio -H <hook name>' for
# help on a given hook.
# 'base' is _required_ unless you know precisely what you are doing.
# 'udev' is _required_ in order to automatically load modules
# 'filesystems' is _required_ unless you specify your fs modules in MODULES
# Examples:
##   This setup specifies all modules in the MODULES setting above.
##   No RAID, lvm2, or encrypted root is needed.
#    HOOKS=(base)
#
##   This setup will autodetect all modules for your system and should
##   work as a sane default
#    HOOKS=(base udev autodetect modconf block filesystems fsck)
#
##   This setup will generate a 'full' image which supports most systems.
##   No autodetection is done.
#    HOOKS=(base udev modconf block filesystems fsck)
#
##   This setup assembles a mdadm array with an encrypted root file system.
##   Note: See 'mkinitcpio -H mdadm_udev' for more information on RAID devices.
#    HOOKS=(base udev modconf keyboard keymap consolefont block mdadm_udev encrypt filesystems fsck)
#
##   This setup loads an lvm2 volume group.
#    HOOKS=(base udev modconf block lvm2 filesystems fsck)
#
##   NOTE: If you have /usr on a separate partition, you MUST include the
#    usr and fsck hooks.
HOOKS=(base udev autodetect modconf kms keyboard keymap consolefont block filesystems fsck)

# COMPRESSION
# Use this to compress the initramfs image. By default, zstd compression
# is used. Use 'cat' to create an uncompressed image.
#COMPRESSION="zstd"
#COMPRESSION="gzip"
#COMPRESSION="bzip2"
#COMPRESSION="lzma"
#COMPRESSION="xz"
#COMPRESSION="lzop"
#COMPRESSION="lz4"

# COMPRESSION_OPTIONS
# Additional options for the compressor
#COMPRESSION_OPTIONS=()

# MODULES_DECOMPRESS
# Decompress kernel modules during initramfs creation.
# Enable to speedup boot process, disable to save RAM
# during early userspace. Switch (yes/no).
#MODULES_DECOMPRESS="yes"
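
Whether a given module is listed in MODULES can also be checked mechanically. A minimal sketch, assuming a single-line, uncommented MODULES=(...) entry as in the file above; the function name initramfs_modules is made up for this illustration.

```shell
#!/bin/sh
# Hypothetical helper: print one module per line from the active MODULES=(...)
# line of an mkinitcpio.conf; prints nothing when the array is empty.
#   $1 = path to the configuration file
initramfs_modules() {
    sed -n 's/^MODULES=(\([^)]*\)).*/\1/p' "$1" |
        tr -s ' \t' '\n\n' |      # one word per line
        sed '/^$/d'               # drop the empty line left by MODULES=()
}
```

For example, initramfs_modules /etc/mkinitcpio.conf | grep -qx i915 succeeds only if i915 is explicitly listed. With the file above it prints nothing, which matches the empty array.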

#25 2023-06-02 12:25:06

sourproton
Member
Registered: 2023-04-07
Posts: 50

Re: [SOLVED] Can't get GPU to use nvidia driver

I installed nvidia-dkms, didn't touch mkinitcpio.conf, changed the target to multi-user and it boots (using latest nvidia and linux)!

The weird thing is that those last journal boot messages show at the login screen. See here.

Here is the journal.

lspci shows

01:00.0 3D controller: NVIDIA Corporation AD107M [GeForce RTX 4050 Max-Q / Mobile] (rev a1)
	Subsystem: Lenovo GN21-X2
	Kernel modules: nouveau, nvidia_drm, nvidia

Any clues on how to proceed to boot with sddm?

Last edited by sourproton (2023-06-02 12:43:15)
