Hello, I have been running a QEMU VM with GPU passthrough completely fine and reliably up until recently.
Since [2025-07-08T01:11:07+1200] [ALPM] upgraded linux-lts (6.12.34-1 -> 6.12.35-1), the QEMU GPU passthrough VM is no longer starting properly.
With the host OS running >=6.12.35-1-lts, the VM guest OS fails to boot to GUI.
The guest OS dmesg reports :-
[ 5.722522] NVRM: GPU 0000:03:00.0: RmInitAdapter failed! (0x62:0xffff:2584)
[ 5.724357] NVRM: GPU 0000:03:00.0: rm_init_adapter failed, device minor number 0
[ 5.724816] [drm:nv_drm_load [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000300] Failed to allocate NvKmsKapiDevice
[ 5.725066] [drm:nv_drm_register_drm_device [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000300] Failed to register device
[ 5.735638] nvidia_uvm: module uses symbols nvUvmInterfaceDisableAccessCntr from proprietary module nvidia, inheriting taint.
However, with the host OS running linux-6.15.6-arch1-1 the VM guest OS boots to GUI without error.
The guest OS dmesg reports :-
[ 5.386502] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 575.64 Tue Jun 10 19:18:37 UTC 2025
[ 5.441084] [drm] [nvidia-drm] [GPU ID 0x00000300] Loading driver
[ 7.037058] [drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:03:00.0 on minor 0
[ 7.080275] fbcon: nvidia-drmdrmfb (fb0) is primary device
[ 7.182869] nvidia 0000:03:00.0: [drm] fb0: nvidia-drmdrmfb frame buffer device
Is there any known reason why the GPU passthrough is failing on the LTS kernel?
TIA.
Last edited by jimhend (2025-07-20 10:28:42)
Have you looked at the host journal?
The GPU seems passed through, it just cannot initialize.
Also, what happens if you add "rcutree.gp_init_delay=1" to the guest's kernel parameters?
I tried adding "rcutree.gp_init_delay=1" to the guest's kernel parameters.
dmesg | egrep -i 'rcutree|NVRM|drm'
egrep: warning: egrep is obsolescent; using grep -E
[ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-6.6-x86_64 root=UUID=c5740c7b-859c-4d89-8cf0-250f1ad14516 rw ipv6.disable=1 rcutree.gp_init_delay=1
[ 0.106048] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-6.6-x86_64 root=UUID=c5740c7b-859c-4d89-8cf0-250f1ad14516 rw ipv6.disable=1 rcutree.gp_init_delay=1
[ 0.699283] ACPI: bus type drm_connector registered
[ 0.704458] simple-framebuffer simple-framebuffer.0: [drm] could not acquire memory region [mem 0x20000000-0x2063ffff flags 0x200]
[ 0.704569] simpledrm_probe+0x437/0x750
[ 4.038006] systemd[1]: Starting Load Kernel Module drm...
[ 5.270643] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 575.64 Tue Jun 10 19:18:37 UTC 2025
[ 5.505815] [drm] [nvidia-drm] [GPU ID 0x00000300] Loading driver
[ 5.905177] NVRM: GPU 0000:03:00.0: RmInitAdapter failed! (0x62:0xffff:2584)
[ 5.907047] NVRM: GPU 0000:03:00.0: rm_init_adapter failed, device minor number 0
[ 5.907483] [drm:nv_drm_load [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000300] Failed to allocate NvKmsKapiDevice
[ 5.907693] [drm:nv_drm_register_drm_device [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000300] Failed to register device
The host journal shows stuff I don't really know what to make of:
Jul 16 21:48:21 mothership kernel: kvm: kvm [2109]: ignored rdmsr: 0xc001020a data 0x0
Jul 16 21:48:21 mothership kernel: kvm: kvm [2109]: ignored rdmsr: 0xc0010208 data 0x0
Jul 16 21:48:21 mothership kernel: kvm: kvm [2109]: ignored rdmsr: 0xc0010206 data 0x0
Jul 16 21:48:21 mothership kernel: kvm: kvm [2109]: ignored rdmsr: 0xc0010204 data 0x0
Jul 16 21:48:21 mothership kernel: kvm: kvm [2109]: ignored rdmsr: 0xc0010202 data 0x0
Jul 16 21:48:21 mothership kernel: kvm: kvm [2109]: ignored rdmsr: 0xc0010200 data 0x0
Jul 16 21:48:21 mothership kernel: kvm: kvm [2109]: ignored rdmsr: 0x3f2 data 0x0
Jul 16 21:48:21 mothership kernel: kvm: kvm [2109]: ignored rdmsr: 0x600 data 0x0
Jul 16 21:48:21 mothership kernel: kvm: kvm [2109]: ignored rdmsr: 0x3f1 data 0x0
Jul 16 21:48:21 mothership kernel: kvm: kvm [2109]: ignored rdmsr: 0x64d data 0x0
Jul 16 21:48:21 mothership kernel: kvm_do_msr_access: 13 callbacks suppressed
Jul 16 21:48:18 mothership kernel: usb 1-1: reset low-speed USB device number 2 using xhci_hcd
Jul 16 21:48:17 mothership kernel: usb 1-10: reset full-speed USB device number 5 using xhci_hcd
Jul 16 21:48:17 mothership kernel: usb 1-13: reset full-speed USB device number 6 using xhci_hcd
Jul 16 21:48:15 mothership kernel: kvm: kvm [2109]: ignored rdmsr: 0x1a7 data 0x0
Jul 16 21:48:15 mothership kernel: kvm: kvm [2109]: ignored wrmsr: 0x1a7 data 0x11
Jul 16 21:48:15 mothership kernel: kvm: kvm [2109]: ignored rdmsr: 0x1a7 data 0x0
Jul 16 21:48:15 mothership kernel: kvm: kvm [2109]: ignored rdmsr: 0x1a6 data 0x0
Jul 16 21:48:15 mothership kernel: kvm: kvm [2109]: ignored wrmsr: 0x1a6 data 0x11
Jul 16 21:48:15 mothership kernel: kvm: kvm [2109]: ignored rdmsr: 0x1a6 data 0x0
Jul 16 21:48:15 mothership kernel: kvm: kvm [2109]: ignored rdmsr: 0x1c9 data 0x0
Jul 16 21:48:15 mothership kernel: kvm: kvm [2109]: ignored wrmsr: 0x1c9 data 0x3
Jul 16 21:48:15 mothership kernel: kvm: kvm [2109]: ignored rdmsr: 0x1c9 data 0x0
Jul 16 21:48:15 mothership kernel: kvm: kvm [2109]: ignored rdmsr: 0x492 data 0x0
Jul 16 21:47:46 mothership systemd[1]: geoclue.service: Deactivated successfully.
Jul 16 21:47:46 mothership geoclue[1102]: Service not used for 60 seconds. Shutting down..
Jul 16 21:47:38 mothership rtkit-daemon[873]: Supervising 8 threads of 5 processes of 1 users.
Jul 16 21:47:38 mothership rtkit-daemon[873]: Supervising 8 threads of 5 processes of 1 users.
Jul 16 21:47:36 mothership systemd[1]: systemd-timedated.service: Deactivated successfully.
Jul 16 21:47:23 mothership systemd[1]: systemd-localed.service: Deactivated successfully.
Jul 16 21:47:22 mothership systemd[1]: systemd-hostnamed.service: Deactivated successfully.
Jul 16 21:47:17 mothership gnome-shell[1381]: Error calling StartServiceByName for org.freedesktop.PackageKit: Timeout was reached
Jul 16 21:47:11 mothership systemd-timesyncd[314]: Initial clock synchronization to Wed 2025-07-16 21:47:11.347838 NZST.
Jul 16 21:47:11 mothership systemd-timesyncd[314]: Contacted time server 14.102.99.217:123 (2.arch.pool.ntp.org).
Jul 16 21:47:07 mothership rtkit-daemon[873]: Supervising 8 threads of 5 processes of 1 users.
Jul 16 21:47:07 mothership rtkit-daemon[873]: Supervising 8 threads of 5 processes of 1 users.
Jul 16 21:47:07 mothership rtkit-daemon[873]: Supervising 8 threads of 5 processes of 1 users.
Jul 16 21:47:07 mothership rtkit-daemon[873]: Supervising 8 threads of 5 processes of 1 users.
Jul 16 21:47:07 mothership rtkit-daemon[873]: Supervising 8 threads of 5 processes of 1 users.
Jul 16 21:47:07 mothership rtkit-daemon[873]: Supervising 8 threads of 5 processes of 1 users.
Jul 16 21:47:06 mothership rtkit-daemon[873]: Supervising 8 threads of 5 processes of 1 users.
Jul 16 21:47:06 mothership rtkit-daemon[873]: Supervising 8 threads of 5 processes of 1 users.
Jul 16 21:47:06 mothership libvirtd[730]: host doesn't support hyperv 'vapic' feature
Jul 16 21:47:06 mothership libvirtd[730]: host doesn't support hyperv 'relaxed' feature
Jul 16 21:47:06 mothership rtkit-daemon[873]: Supervising 8 threads of 5 processes of 1 users.
Jul 16 21:47:06 mothership rtkit-daemon[873]: Successfully made thread 2302 of process 2165 owned by '1000' RT at priority 10.
Jul 16 21:47:06 mothership rtkit-daemon[873]: Supervising 7 threads of 4 processes of 1 users.
Jul 16 21:47:06 mothership rtkit-daemon[873]: Supervising 7 threads of 4 processes of 1 users.
Jul 16 21:47:06 mothership rtkit-daemon[873]: Supervising 7 threads of 4 processes of 1 users.
Jul 16 21:47:06 mothership rtkit-daemon[873]: Supervising 7 threads of 4 processes of 1 users.
Thanks
Don't filter for random stuff.
Please post your complete system journal for the host and the guest:
sudo journalctl -b | curl -F 'file=@-' 0x0.st
Do you rely on explicitly setting "intel_iommu=on iommu=pt"? You should™ not have to.
Other than that, the GPU is passed through, enabled by vfio and otherwise completely ignored by the host.
In the guest
Jul 17 10:48:02 manjaro kernel: pci 0000:02:00.0: can't claim BAR 6 [mem 0xfffc0000-0xffffffff pref]: no compatible bridge window
Jul 17 10:48:02 manjaro kernel: pci 0000:03:00.0: can't claim BAR 6 [mem 0xfff80000-0xffffffff pref]: no compatible bridge window
…
Jul 17 10:48:02 manjaro kernel: pci 0000:03:00.0: BAR 6: no space for [mem size 0x00080000 pref]
Jul 17 10:48:02 manjaro kernel: pci 0000:03:00.0: BAR 6: failed to assign [mem size 0x00080000 pref]
Do you get any of this w/ the 6.15 host?
Also remove the nvidia modules from the
Jul 17 10:48:05 manjaro systemd-modules-load[302]: Inserted module 'nvidia'
Jul 17 10:48:05 manjaro systemd-modules-load[302]: Inserted module 'nvidia_drm'
Jul 17 10:48:05 manjaro systemd-modules-load[302]: Inserted module 'nvidia_uvm'
Jul 17 10:48:05 manjaro systemd-modules-load[302]: Inserted module 'uinput'
explicit systemd-modules-load array and see whether you can operate the GPU "later", i.e. rescan the PCI bus after the boot.
Do you btw. have the same problems w/ a different guest (current arch version or so)?
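One possible reading of the "rescan the PCI bus after the boot" suggestion, as a rough sketch only: drop the device via sysfs and ask the kernel to re-enumerate it, then load the driver by hand. The `remove` and `rescan` files are the standard sysfs PCI interface; the `rescan_gpu` function name and the `SYSFS_PCI` override are made up for this illustration, and whether the remove step is wanted at all is an assumption.

```shell
#!/bin/sh
# Sketch: re-probe a passed-through GPU after boot (run as root in the guest).
# SYSFS_PCI is a hypothetical override for dry runs; the default is the real sysfs tree.
SYSFS_PCI="${SYSFS_PCI:-/sys/bus/pci}"

rescan_gpu() {
    addr="$1"   # e.g. 0000:03:00.0, the guest address seen in the dmesg output
    # Drop the device from the kernel's view if it is currently present.
    if [ -e "$SYSFS_PCI/devices/$addr/remove" ]; then
        echo 1 > "$SYSFS_PCI/devices/$addr/remove"
    fi
    # Ask the kernel to re-enumerate the bus; the device should reappear.
    echo 1 > "$SYSFS_PCI/rescan"
}
```

Something like `rescan_gpu 0000:03:00.0 && modprobe nvidia_drm` would then try the driver well after boot instead of from systemd-modules-load.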
Do you rely on explicitly setting "intel_iommu=on iommu=pt"? You should™ not have to.
I set those boot parameters when I first set up the system for passthrough. It has always worked, so I have never touched them since.
I tried removing both intel_iommu=on and iommu=pt from the boot command line, and now the GPU is no longer isolated from the host: I can see the nouveau driver getting loaded on the host.
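For anyone wanting to confirm the same thing on their own host, here is a minimal sketch that reads the bound driver and IOMMU group straight from sysfs. The `driver` and `iommu_group` symlinks are standard sysfs entries; the `pci_binding` name, the `SYSFS_PCI_DEVICES` override and the address `0000:01:00.0` are placeholders, since the host-side address is not given in this thread.

```shell
#!/bin/sh
# Sketch: show which driver and IOMMU group a PCI device is bound to (host side).
# SYSFS_PCI_DEVICES is a hypothetical override for dry runs; the default is real sysfs.
SYSFS_PCI_DEVICES="${SYSFS_PCI_DEVICES:-/sys/bus/pci/devices}"

pci_binding() {
    addr="$1"   # placeholder address, e.g. 0000:01:00.0
    dev="$SYSFS_PCI_DEVICES/$addr"
    driver=none
    group=none
    # "driver" is a symlink to the bound driver (vfio-pci when isolated).
    if [ -L "$dev/driver" ]; then
        driver=$(basename "$(readlink "$dev/driver")")
    fi
    # "iommu_group" only exists when the IOMMU is actually enabled.
    if [ -L "$dev/iommu_group" ]; then
        group=$(basename "$(readlink "$dev/iommu_group")")
    fi
    echo "$addr driver=$driver iommu_group=$group"
}
```

With the IOMMU parameters in place one would expect `driver=vfio-pci` and a numeric group; without them, output along the lines of `driver=nouveau iommu_group=none` would match what is described above.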
do you get any of this w/ the 6.15 host?
Yes, I see the exact same messages with the working* guest using the 6.15 host.
Jul 18 10:14:15 manjaro kernel: pci 0000:02:00.0: can't claim BAR 6 [mem 0xfffc0000-0xffffffff pref]: no compatible bridge window
Jul 18 10:14:15 manjaro kernel: pci 0000:03:00.0: can't claim BAR 6 [mem 0xfff80000-0xffffffff pref]: no compatible bridge window
...
Jul 18 10:14:15 manjaro kernel: pci 0000:03:00.0: BAR 6: no space for [mem size 0x00080000 pref]
Jul 18 10:14:15 manjaro kernel: pci 0000:03:00.0: BAR 6: failed to assign [mem size 0x00080000 pref]
Also remove the nvidia modules from the explicit systemd-modules-load array and see whether you can operate the GPU "later" ie, rescan the PCI bus after the boot.
I found only nvidia and nvidia_drm listed in /etc/modules-load.d/ and commented them out.
[manjaro ~]# systemctl status systemd-modules-load.service
● systemd-modules-load.service - Load Kernel Modules
Loaded: loaded (/usr/lib/systemd/system/systemd-modules-load.service; static)
Active: active (exited) since Fri 2025-07-18 11:34:19 NZST; 1min 35s ago
Invocation: bc85a87a83b149a1a2496feb53eea2af
Docs: man:systemd-modules-load.service(8)
man:modules-load.d(5)
Process: 303 ExecStart=/usr/lib/systemd/systemd-modules-load (code=exited, status=0/SUCCESS)
Main PID: 303 (code=exited, status=0/SUCCESS)
Mem peak: 89.1M
CPU: 1.753s
Jul 18 11:34:19 manjaro systemd-modules-load[303]: Inserted module 'nvidia_uvm'
Jul 18 11:34:19 manjaro systemd-modules-load[303]: Inserted module 'uinput'
Jul 18 11:34:19 manjaro systemd[1]: Finished Load Kernel Modules.
Removing the nvidia modules did not make any noticeable difference (the 6.15 host still starts the VM fine and the 6.12 LTS host does not).
I am not sure what is calling for the nvidia_uvm module to be loaded.
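A quick way to narrow that down might be to search every modules-load.d directory, not just /etc. The sketch below assumes the three directories documented in modules-load.d(5); the `find_module_request` name is made up here, and it will not catch modules requested by a service, an alias or the kernel command line.

```shell
#!/bin/sh
# Sketch: find which modules-load.d file asks for a given module.
find_module_request() {
    module="$1"; shift
    # Default to the directories documented in modules-load.d(5).
    [ "$#" -gt 0 ] || set -- /etc/modules-load.d /run/modules-load.d /usr/lib/modules-load.d
    for dir in "$@"; do
        [ -d "$dir" ] || continue
        # Match only uncommented lines that consist of the module name.
        grep -H -n -E "^[[:space:]]*${module}[[:space:]]*\$" "$dir"/*.conf 2>/dev/null
    done
    return 0
}
```

Running `find_module_request nvidia_uvm` in the guest should print file, line number and entry for each match, and nothing if the request comes from somewhere else.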
Do you btw. have the same problems w/ a different guest (current arch version or so)?
The only other guest I have on this system is a windows 11 machine, which also will not start up into GUI using the LTS kernel on host.
Thank you seth.
Last edited by jimhend (2025-07-17 23:47:29)
Sorry, CONFIG_INTEL_IOMMU_DEFAULT_ON isn't actually enabled on the main kernel, my bad:
# CONFIG_INTEL_IOMMU_DEFAULT_ON is not set
(Re-activate the iommu parameters.)
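To check this on a running kernel, something like the following sketch should do. It assumes the kernel exposes its configuration at /proc/config.gz (which needs CONFIG_IKCONFIG_PROC); the `iommu_default` name is just an illustration.

```shell
#!/bin/sh
# Sketch: print the CONFIG_INTEL_IOMMU_DEFAULT_ON line of a gzip-compressed kernel config.
iommu_default() {
    config="${1:-/proc/config.gz}"   # assumes the running kernel exposes its config here
    # Matches both "CONFIG_...=y" and "# CONFIG_... is not set".
    gzip -dc "$config" | grep -E '^(# )?CONFIG_INTEL_IOMMU_DEFAULT_ON[ =]'
}
```

If it prints the "is not set" form, the IOMMU stays off unless intel_iommu=on is passed on the kernel command line.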
I am not sure what is calling for the nvidia_uvm module to be loaded.
CUDA/GPGPU stuff, it's typically explicitly loaded via systemd or a service
The only other guest I have on this system is a windows 11 machine, which also will not start up into GUI using the LTS kernel on host.
'key.
Can you downgrade the guest driver pre 575xx (565xx, 550xx - in doubt even 570xx)?
Resp. is this actually reality-relevant or an academic research on the situation?
Resp. is this actually reality-relevant or an academic research on the situation?
I was just using LTS because I thought that was the recommended kernel to use.
But since 6.15 is doing the trick then I will just use that.
Thanks
I thought that was the recommended kernel to use.
The recommended kernel to use is always the one that works
Please always remember to mark resolved threads by editing your initial post's subject - so others will know that there's no task left, but maybe a solution to find.
Thanks.
Wow! I see the problem made its way to greg k-h's desk.
Thanks for posting, nicob.