
#1 2020-06-10 02:17:49

cjuniorfox
Member
Registered: 2020-06-10
Posts: 5

GPU Passthrough. Windows Graphics Driver not works after 5.4.33-1-lts

Hello community.
This is my first topic on the Arch Linux forum. I have been an Arch user for at least a couple of years, and until now I have used libvirt/QEMU (KVM) virtualization with GPU passthrough to play games on a Windows 10 installation.
Since the 5.4.33-1-lts kernel update, however, I have been unable to initialize the NVIDIA graphics driver in the guest. The guest works in VESA mode, but it hangs and reboots as soon as the graphics driver tries to initialize.
I did a spare installation of Arch and, using the "downgrade" tool, tested several kernels. The last fully working kernel for my setup is 5.4.32-1-lts; from 5.4.33-1-lts onward I have had no luck getting my graphics card to run.
I tried many things, such as disabling hugepages, loading different VBIOS images, and installing different builds of Windows 10. None of this has worked since kernel 5.4.33-1-lts.
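As a rough way to tell whether a given host kernel falls in the range described above, the sketch below compares a release string against 5.4.33 with `sort -V`. The helper name `kernel_is_affected` and the `FIRST_BAD` variable are made up for illustration, and treating every release from 5.4.33 upward as affected is an assumption based only on this report, for this hardware.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) when the given kernel release is at
# or above the first version reported as broken in this thread.
FIRST_BAD="5.4.33"

kernel_is_affected() {
    # Strip the package suffix, e.g. "5.4.33-1-lts" -> "5.4.33".
    ver="${1%%-*}"
    # sort -V orders version strings; if FIRST_BAD sorts first (or is equal),
    # the given version is >= FIRST_BAD.
    lowest=$(printf '%s\n%s\n' "$FIRST_BAD" "$ver" | sort -V | head -n 1)
    [ "$lowest" = "$FIRST_BAD" ]
}

# Check the kernel that is currently running.
if kernel_is_affected "$(uname -r)"; then
    echo "running kernel is in the range reported as broken"
else
    echo "running kernel predates the reported breakage"
fi
```

`sort -V` is used instead of a plain string comparison so that a release such as 5.10.x is still ordered after 5.4.33.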

Some information:
Intel i7-3930K (12) @ 3.800GHz
Gigabyte X79S-UP5-WIFI
NVIDIA Corporation GP104 [GeForce GTX 1070] [10de:1b81] (rev a1) (prog-if 00 [VGA controller])
qemu : 5.0.0-7
libvirt: 6.4.0-1

xml:

<domain type='kvm'>
  <name>win10</name>
  <uuid>124cc88b-9a93-4236-943c-42ed4ef3ffe2</uuid>
  <metadata>
    <libosinfo:libosinfo xmlns:libosinfo="http://libosinfo.org/xmlns/libvirt/domain/1.0">
      <libosinfo:os id="http://microsoft.com/win/10"/>
    </libosinfo:libosinfo>
  </metadata>
  <memory unit='KiB'>16777216</memory>
  <currentMemory unit='KiB'>16777216</currentMemory>
  <memoryBacking>
    <hugepages/>
  </memoryBacking>
  <vcpu placement='static'>10</vcpu>
  <os>
    <type arch='x86_64' machine='pc-i440fx-4.0'>hvm</type>
    <loader readonly='yes' type='pflash'>/usr/share/ovmf/x64/OVMF_CODE.fd</loader>
    <nvram>/var/lib/libvirt/qemu/nvram/win10_VARS.fd</nvram>
    <boot dev='hd'/>
    <bootmenu enable='no'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <hyperv>
      <relaxed state='on'/>
      <vapic state='on'/>
      <spinlocks state='on' retries='8191'/>
      <vendor_id state='on' value='10de0af0'/>
    </hyperv>
    <kvm>
      <hidden state='on'/>
    </kvm>
    <vmport state='off'/>
    <ioapic driver='kvm'/>
  </features>
  <cpu mode='host-passthrough' check='partial'>
    <topology sockets='2' dies='1' cores='6' threads='1'/>
  </cpu>
  <clock offset='localtime'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
    <timer name='hypervclock' present='yes'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <pm>
    <suspend-to-mem enabled='no'/>
    <suspend-to-disk enabled='yes'/>
  </pm>
  <devices>
    <emulator>/usr/bin/qemu-system-x86_64</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='writeback'/>
      <source file='/share/Images/win10.qcow2'/>
      <target dev='vda' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
    </disk>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw'/>
      <source file='/share/ISO/virtio-win-0.1.164.iso'/>
      <target dev='sdc' bus='sata'/>
      <readonly/>
      <address type='drive' controller='0' bus='0' target='0' unit='2'/>
    </disk>
    <controller type='pci' index='0' model='pci-root'/>
    <controller type='sata' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </controller>
    <controller type='usb' index='0' model='ich9-ehci1'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x7'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci1'>
      <master startport='0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0' multifunction='on'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci2'>
      <master startport='2'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x1'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci3'>
      <master startport='4'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x2'/>
    </controller>
    <interface type='network'>
      <mac address='52:54:00:a8:d8:15'/>
      <source network='default'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
    <interface type='network'>
      <mac address='52:54:00:3b:de:5f'/>
      <source network='default'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </interface>
    <serial type='pty'>
      <target type='isa-serial' port='0'>
        <model name='isa-serial'/>
      </target>
    </serial>
    <console type='pty'>
      <target type='serial' port='0'/>
    </console>
    <input type='mouse' bus='ps2'/>
    <input type='keyboard' bus='ps2'/>
    <hostdev mode='subsystem' type='usb' managed='yes'>
      <source>
        <vendor id='0x1b3f'/>
        <product id='0x2008'/>
      </source>
      <address type='usb' bus='0' port='4'/>
    </hostdev>
    <hostdev mode='subsystem' type='usb' managed='yes'>
      <source>
        <vendor id='0x045e'/>
        <product id='0x0095'/>
      </source>
      <address type='usb' bus='0' port='1'/>
    </hostdev>
    <hostdev mode='subsystem' type='usb' managed='yes'>
      <source>
        <vendor id='0x04b3'/>
        <product id='0x3025'/>
      </source>
      <address type='usb' bus='0' port='5'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x07' slot='0x00' function='0x0'/>
      </source>
      <rom file='/share/VBIOS/Zotac.GTX1070.8192.180330.patched.rom'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0' multifunction='on'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x07' slot='0x00' function='0x1'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x1'/>
    </hostdev>
    <redirdev bus='usb' type='spicevmc'>
      <address type='usb' bus='0' port='2'/>
    </redirdev>
    <redirdev bus='usb' type='spicevmc'>
      <address type='usb' bus='0' port='3'/>
    </redirdev>
    <memballoon model='virtio'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
    </memballoon>
  </devices>
</domain>

Last edited by cjuniorfox (2020-06-10 02:22:54)


#2 2020-06-12 17:35:58

cjuniorfox
Member
Registered: 2020-06-10
Posts: 5

Re: GPU Passthrough. Windows Graphics Driver not works after 5.4.33-1-lts

bump!


#3 2020-06-13 12:10:04

Lone_Wolf
Member
From: Netherlands, Europe
Registered: 2005-10-04
Posts: 11,911

Re: GPU Passthrough. Windows Graphics Driver not works after 5.4.33-1-lts

Please don't do that, https://wiki.archlinux.org/index.php/Co … ct#Bumping .

There have been reports of passthrough problems wrt intel processors / iGPU in recent kernels, check https://bbs.archlinux.org/viewtopic.php?id=254847

The full journalctl -b output from the host for one of the failed runs may be helpful.
Have you tried running qemu directly from the command line instead of through virt-manager?
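For reading the host journal locally, a small filter can narrow it to lines that mention the passthrough components. This is only a sketch: the function name `passthrough_lines` and the keyword list are assumptions, and the full unfiltered log is still what should be posted.

```shell
#!/bin/sh
# Hypothetical filter: keep only journal lines that mention the components
# involved in passthrough (kvm, vfio, libvirtd, qemu), case-insensitively.
passthrough_lines() {
    grep -E -i 'kvm|vfio|libvirtd|qemu'
}

# Typical use on the host (left as comments so the sketch has no side effects):
#   journalctl -b --no-pager > host-journal.txt      # full log of the current boot
#   journalctl -b --no-pager | passthrough_lines     # narrowed view for reading
```

The filter is deliberately broad; an unrelated line that happens to contain one of the keywords will also pass through.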


Disliking systemd intensely, but not satisfied with alternatives so focusing on taming systemd.


(A works at time B)  && (time C > time B ) ≠  (A works at time C)


#4 2020-06-14 01:48:03

cjuniorfox
Member
Registered: 2020-06-10
Posts: 5

Re: GPU Passthrough. Windows Graphics Driver not works after 5.4.33-1-lts

Sorry about bumping.
And thanks for answering.
journalctl -b >> https://pastebin.com/qKhGV0Nr

I tried the workaround from the topic you mentioned. I compiled a new kernel with the specified commit reverted, and it didn't work. That topic was about an earlier kernel version, so I had to resolve one conflict that appeared in the following file: arch/x86/kvm/ioapic.c. It follows below.

https://pastebin.com/V3QqVeXs


#5 2020-06-14 11:56:36

Lone_Wolf
Member
From: Netherlands, Europe
Registered: 2005-10-04
Posts: 11,911

Re: GPU Passthrough. Windows Graphics Driver not works after 5.4.33-1-lts

Looks like this is a server motherboard with specific hardware like a SAS controller.
Please post lspci -knn .
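When reading `lspci -knn` output, the line of most interest for passthrough is usually "Kernel driver in use" for the GPU. The sketch below extracts it for one PCI address. The function name `driver_in_use` is hypothetical, and the address 07:00.0 is taken from the hostdev source address in the domain XML in the first post.

```shell
#!/bin/sh
# Hypothetical helper: print the "Kernel driver in use" value for one PCI
# address, reading `lspci -knn` text on stdin.
driver_in_use() {
    awk -v addr="$1" '
        # Device header lines start in column 0 with the PCI address.
        /^[0-9a-f]/ { current = ($1 == addr) }
        # Indented detail lines belong to the most recent header.
        current && /Kernel driver in use:/ {
            sub(/^.*Kernel driver in use: */, "")
            print
        }
    '
}

# Typical use on the host:
#   lspci -knn | driver_in_use 07:00.0
```

If nothing is printed, no driver is bound to that device at the moment.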


jun 13 22:24:05 archlinux kernel: EDAC sbridge: CPU SrcID #0, Ha #0, Channel #0 has DIMMs, but ECC is disabled
jun 13 22:24:05 archlinux kernel: EDAC sbridge: Couldn't find mci handler
jun 13 22:24:05 archlinux kernel: EDAC sbridge: Failed to register device with error -19

If you're sure you don't have ECC memory installed, this is harmless .


jun 13 22:24:07 archlinux libvirtd[904]: libvirt version: 6.4.0
jun 13 22:24:07 archlinux libvirtd[904]: hostname: archlinux
jun 13 22:24:07 archlinux libvirtd[904]: Obsolete nvram variable is set while firmware metadata files found. Note that the nvram config file variable is going to be ignored.

jun 13 22:24:55 archlinux libvirtd[1429]: 2020-06-14 01:24:55.729+0000: 1429: info : libvirt version: 6.4.0
jun 13 22:24:55 archlinux libvirtd[1429]: 2020-06-14 01:24:55.729+0000: 1429: info : hostname: archlinux
jun 13 22:24:55 archlinux libvirtd[1429]: 2020-06-14 01:24:55.729+0000: 1429: warning : virSecurityValidateTimestamp:195 : Invalid XATTR timestamp detected on /share/Images/win10.qcow2 secdriver=dac
jun 13 22:24:55 archlinux libvirtd[1429]: 2020-06-14 01:24:55.732+0000: 1429: warning : virSecurityValidateTimestamp:195 : Invalid XATTR timestamp detected on /var/lib/libvirt/qemu/nvram/win10_VARS.fd secdriver=dac

Don't know how important they are, but it looks like you should check the libvirt nvram setup.

jun 13 22:25:04 archlinux kernel: kvm [1428]: vcpu0, guest rIP: 0x100e2806 ignored rdmsr: 0x122
jun 13 22:25:04 archlinux kernel: kvm [1428]: vcpu0, guest rIP: 0x150ae46 ignored rdmsr: 0x122
jun 13 22:25:07 archlinux kernel: kvm [1428]: vcpu0, guest rIP: 0x1a63416 ignored rdmsr: 0x122
jun 13 22:25:08 archlinux kernel: usb 1-1.5: reset low-speed USB device number 3 using ehci-pci
jun 13 22:25:09 archlinux kernel: usb 1-1.6: reset low-speed USB device number 4 using ehci-pci
jun 13 22:25:09 archlinux kernel: kvm [1428]: vcpu0, guest rIP: 0xfffff806691a2a63 ignored rdmsr: 0x122
jun 13 22:25:09 archlinux kernel: kvm [1428]: vcpu0, guest rIP: 0xfffff80668d8987f ignored rdmsr: 0x122
jun 13 22:25:09 archlinux kernel: kvm [1428]: vcpu0, guest rIP: 0xfffff806691a171f ignored rdmsr: 0x122
jun 13 22:25:09 archlinux kernel: kvm [1428]: vcpu0, guest rIP: 0xfffff806691a1461 ignored rdmsr: 0x122
jun 13 22:25:09 archlinux kernel: kvm [1428]: vcpu0, guest rIP: 0xfffff806691a155a ignored rdmsr: 0x122
jun 13 22:25:09 archlinux kernel: kvm [1428]: vcpu0, guest rIP: 0xfffff80668d8987f ignored rdmsr: 0x122
jun 13 22:25:10 archlinux kernel: kvm [1428]: vcpu1, guest rIP: 0xfffff806691a2a63 ignored rdmsr: 0x122
jun 13 22:25:10 archlinux kernel: kvm [1428]: vcpu1, guest rIP: 0xfffff806691a171f ignored rdmsr: 0x122
jun 13 22:25:10 archlinux kernel: kvm [1428]: vcpu1, guest rIP: 0xfffff806691a1461 ignored rdmsr: 0x122
jun 13 22:25:10 archlinux kernel: kvm [1428]: vcpu1, guest rIP: 0xfffff806691a155a ignored rdmsr: 0x122
jun 13 22:25:16 archlinux kernel: kvm_get_msr_common: 55 callbacks suppressed

Searching for that error gives lots of results, https://github.com/intel/gvt-linux/issues/79 looks like it may apply .
Are you using gvt-linux and/or xf86-video-intel ?



#6 2020-06-14 20:59:44

cjuniorfox
Member
Registered: 2020-06-10
Posts: 5

Re: GPU Passthrough. Windows Graphics Driver not works after 5.4.33-1-lts

Lone_Wolf wrote:

Looks like this is a server motherboard with specific hardware like a SAS controller.
Please post lspci -knn .

Here it is:
https://pastebin.com/UFWudkca
My motherboard is SAS capable, but it is running in SATA mode.

Lone_Wolf wrote:
jun 13 22:24:05 archlinux kernel: EDAC sbridge: CPU SrcID #0, Ha #0, Channel #0 has DIMMs, but ECC is disabled
jun 13 22:24:05 archlinux kernel: EDAC sbridge: Couldn't find mci handler
jun 13 22:24:05 archlinux kernel: EDAC sbridge: Failed to register device with error -19

If you're sure you don't have ECC memory installed, this is harmless .

No, I'm not using ECC memory, just regular old DDR3.

Lone_Wolf wrote:

jun 13 22:24:07 archlinux libvirtd[904]: libvirt version: 6.4.0
jun 13 22:24:07 archlinux libvirtd[904]: hostname: archlinux
jun 13 22:24:07 archlinux libvirtd[904]: Obsolete nvram variable is set while firmware metadata files found. Note that the nvram config file variable is going to be ignored.

jun 13 22:24:55 archlinux libvirtd[1429]: 2020-06-14 01:24:55.729+0000: 1429: info : libvirt version: 6.4.0
jun 13 22:24:55 archlinux libvirtd[1429]: 2020-06-14 01:24:55.729+0000: 1429: info : hostname: archlinux
jun 13 22:24:55 archlinux libvirtd[1429]: 2020-06-14 01:24:55.729+0000: 1429: warning : virSecurityValidateTimestamp:195 : Invalid XATTR timestamp detected on /share/Images/win10.qcow2 secdriver=dac
jun 13 22:24:55 archlinux libvirtd[1429]: 2020-06-14 01:24:55.732+0000: 1429: warning : virSecurityValidateTimestamp:195 : Invalid XATTR timestamp detected on /var/lib/libvirt/qemu/nvram/win10_VARS.fd secdriver=dac

Don't know how important they are, but it looks like you should check the libvirt nvram setup.

I did that. I even installed a fresh copy of Windows on a new virtual machine and tried other versions of OVMF, but the issue persists with the same behavior. Kernels up to and including 5.4.32-1-lts work flawlessly.

Lone_Wolf wrote:

jun 13 22:25:04 archlinux kernel: kvm [1428]: vcpu0, guest rIP: 0x100e2806 ignored rdmsr: 0x122
jun 13 22:25:04 archlinux kernel: kvm [1428]: vcpu0, guest rIP: 0x150ae46 ignored rdmsr: 0x122
jun 13 22:25:07 archlinux kernel: kvm [1428]: vcpu0, guest rIP: 0x1a63416 ignored rdmsr: 0x122
jun 13 22:25:08 archlinux kernel: usb 1-1.5: reset low-speed USB device number 3 using ehci-pci
jun 13 22:25:09 archlinux kernel: usb 1-1.6: reset low-speed USB device number 4 using ehci-pci
jun 13 22:25:09 archlinux kernel: kvm [1428]: vcpu0, guest rIP: 0xfffff806691a2a63 ignored rdmsr: 0x122
jun 13 22:25:09 archlinux kernel: kvm [1428]: vcpu0, guest rIP: 0xfffff80668d8987f ignored rdmsr: 0x122
jun 13 22:25:09 archlinux kernel: kvm [1428]: vcpu0, guest rIP: 0xfffff806691a171f ignored rdmsr: 0x122
jun 13 22:25:09 archlinux kernel: kvm [1428]: vcpu0, guest rIP: 0xfffff806691a1461 ignored rdmsr: 0x122
jun 13 22:25:09 archlinux kernel: kvm [1428]: vcpu0, guest rIP: 0xfffff806691a155a ignored rdmsr: 0x122
jun 13 22:25:09 archlinux kernel: kvm [1428]: vcpu0, guest rIP: 0xfffff80668d8987f ignored rdmsr: 0x122
jun 13 22:25:10 archlinux kernel: kvm [1428]: vcpu1, guest rIP: 0xfffff806691a2a63 ignored rdmsr: 0x122
jun 13 22:25:10 archlinux kernel: kvm [1428]: vcpu1, guest rIP: 0xfffff806691a171f ignored rdmsr: 0x122
jun 13 22:25:10 archlinux kernel: kvm [1428]: vcpu1, guest rIP: 0xfffff806691a1461 ignored rdmsr: 0x122
jun 13 22:25:10 archlinux kernel: kvm [1428]: vcpu1, guest rIP: 0xfffff806691a155a ignored rdmsr: 0x122
jun 13 22:25:16 archlinux kernel: kvm_get_msr_common: 55 callbacks suppressed

Searching for that error gives lots of results, https://github.com/intel/gvt-linux/issues/79 looks like it may apply .
Are you using gvt-linux and/or xf86-video-intel ?

No, I don't have onboard video. Lately I've been using only my discrete video card, unloading the NVIDIA drivers and loading the vfio modules when the virtual machine starts, as instructed here: https://github.com/joeknock90/Single-GPU-Passthrough. I have already installed a second video card with another monitor (a spare AMD card I had) in the hope of fixing the issue, but the issue persisted.


#7 2020-06-15 10:25:53

Lone_Wolf
Member
From: Netherlands, Europe
Registered: 2005-10-04
Posts: 11,911

Re: GPU Passthrough. Windows Graphics Driver not works after 5.4.33-1-lts

Hmm, it's not going to be easy to troubleshoot this.

You are using the proprietary nvidia driver, so a bug report to the kernel developers will be rejected.

Bisecting the kernel may be an option.
What is the first kernel where things break: 5.4.33, 5.4.34, or another version?
The changelogs for kernel 5.x are at https://cdn.kernel.org/pub/linux/kernel/v5.x/ , maybe those will help to determine whether bisecting is worth the effort.
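If bisecting turns out to be worth it, the outline could look like the sketch below. It assumes the regression comes from the upstream stable tree rather than from an Arch-specific patch, and that v5.4.32 and v5.4.33 are the good and bad endpoints, which is what has been reported so far in this thread. The `run` wrapper, the `bisect_plan` function and the `DRY_RUN` variable are illustrative; by default the script only prints the commands.

```shell
#!/bin/sh
# Sketch of a bisect between the two releases named in this thread.
# Assumption: the regression is in upstream stable, not in an Arch patch.
GOOD_TAG="v5.4.32"            # last release reported as working
BAD_TAG="v5.4.33"             # first release reported as broken
DRY_RUN="${DRY_RUN:-1}"       # 1 = only print the commands, 0 = run them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

bisect_plan() {
    run git clone https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git linux-stable
    run cd linux-stable
    run git bisect start
    run git bisect bad "$BAD_TAG"
    run git bisect good "$GOOD_TAG"
    # From here: build and boot the commit git checks out, start the VM,
    # then report the result with "git bisect good" or "git bisect bad"
    # and repeat until git names the first bad commit.
}

bisect_plan
```

Each round halves the remaining range of commits, so the number of build-and-boot cycles grows only with the logarithm of the number of commits between the two tags.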


    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x07' slot='0x00' function='0x0'/>
      </source>
      <rom file='/share/VBIOS/Zotac.GTX1070.8192.180330.patched.rom'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0' multifunction='on'/>
    </hostdev>

You made low level changes to the nvidia rom used by the VM as that guide indicates.



#8 2020-06-15 11:18:25

cjuniorfox
Member
Registered: 2020-06-10
Posts: 5

Re: GPU Passthrough. Windows Graphics Driver not works after 5.4.33-1-lts

5.4.32-1-lts is the last working one.
I installed another instance of Arch without any proprietary driver and blacklisted all of the NVIDIA kernel modules.
Here is my /etc/modprobe.d/blacklist.conf:

blacklist nouveau
blacklist nvidia
blacklist nvidia-drm
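One way to confirm that a blacklist like this took effect is to compare it against the list of loaded modules. The sketch below is an illustration only, and the function name `still_loaded` is made up. Note that a `blacklist` line only stops a module from being loaded automatically through its aliases; an explicit modprobe or a dependency can still pull it in.

```shell
#!/bin/sh
# Hypothetical check: report which of the given module names appear in
# lsmod-style text on stdin. modprobe treats "-" and "_" as equivalent and
# lsmod prints underscores, so names are normalized first.
still_loaded() {
    listing=$(cat)
    for name in "$@"; do
        mod=$(printf '%s' "$name" | tr '-' '_')
        # The first column of lsmod output is the module name.
        if printf '%s\n' "$listing" |
            awk -v m="$mod" '$1 == m { found = 1 } END { exit !found }'; then
            echo "$mod"
        fi
    done
}

# Typical use on the host:
#   lsmod | still_loaded nouveau nvidia nvidia-drm
```

No output means none of the listed modules is currently loaded.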

I'll look for information in the changelogs you mentioned. I will also test some vanilla kernels to determine in which of them GPU passthrough stops working.
By the way, thank you so much for the help.
