
#1 2025-12-14 17:39:37

CryogEnix
Member
Registered: 2025-12-14
Posts: 10

Optimus PCI passthrough: internal error: Unknown PCI header type '127'

>>SPECS:
  • BIOS:             UEFI, most recent version

  • CPU:               Intel(R) Core(TM) i7-8750H (6 cores, 8th Gen)

  • iGPU:              Intel Gen 9.5 CoffeeLake-H UHD Graphics 630

  • dGPU:             NVIDIA GeForce GTX 1050 Ti Mobile [MUXed]

  • Kernel:            Linux 6.17.9-arch1-1

  • Compositor:      Hyprland 0.52.1 (Wayland)

Secure boot = off
fastboot = "thorough" (everything is initialized in the boot process) instead of "minimal"
=================================================

>>The problem

Virt-manager outputs several messages (which I will show further below), such as:

"Parameter 'x-pci-device-id' expects uint64"

... I assume that my NVIDIA chip cannot reliably be reset after booting/shutting down the VM, but there seems to be something else going on.

I've spent a week trying to figure out how to pass through the NVIDIA chip.
So far, my theory comes down to three possibilities:

  • -- the proper vBIOS not being patched into the edk2 (OVMF) firmware prevents the NVIDIA chip from working properly in the guest, unlike most graphics cards

  • -- something else in Linux is preventing a proper release of my dGPU

  • -- or maybe I have yet to figure out a particular setting in the VM's XML definition (perhaps setting the dGPU as the primary renderer in the <graphics> element of the XML definition)

I am inclined to believe that the missing vBIOS is my main problem, since the VM does not boot at all even though a GPU is present at the correct bus address; for that same reason, though, the other two points may also apply.
The VM completely stopped booting as soon as I tried specifying

x-pci-sub-vendor-id

and

x-pci-sub-device-id

My system has almost always frozen while booting into the host OS ever since I started loading vfio_pci. When I do manage to reach a fully launched graphical session, the vfio_pci driver sometimes fails to fully bind the chip for the VM, and the system still locks up or hangs. That happens even with the i915 driver (used by my iGPU) loaded after vfio_pci in mkinitcpio.conf. Most of the time, this happens along with an error similar to

Aborting method \_SB.PCI0.SPI2.FPNT._CRS due to previous error

followed by a

TIMEOUT

and

vfio-pci 0000:01:00.0: not ready 65535ms after resume; giving up

============

.bashrc
    ------------------------------------------------
    export ANV_DEBUG="video-decode,video-encode"
    export MESA_VK_DEVICE_SELECT=8086:3e9b!       # the vendor:device ID of my iGPU
    export MESA_LOADER_DRIVER_OVERRIDE="i915"
    export LIBVA_DRIVER_NAME="iHD"

Sometimes even just running:

$ journalctl

causes the system to no longer respond. Not sure if it's just the framebuffer or the whole system crashing.
In both my Linux host and Windows guest OS, the chip is detected as a generic 3D controller, though the vendor NVIDIA still shows up.
When I could still boot into the guest, Windows' Device Manager would always list:

PCI\VEN_10DE&DEV_1C8C&SUBSYS_00000000&REV_A1

as a hardware ID. As I understand it, the string of zeroes in SUBSYS indicates that the dGPU has either been blocked or that something is missing in the VM's firmware. I've come to understand that laptops are much more complex to work with in this kind of setup, especially with hardware like mine that is no longer maintained by its manufacturer... More so if certain restrictions by said manufacturer have not been truly lifted.

    ❯ sudo dmesg | grep -i vfio
    [ 2.553048] VFIO - User Level meta-driver version: 0.3
    [ 2.695107] vfio_pci: add [10de:1c8c[ffffffff:ffffffff]] class 0x000000/00000000
    [ 4.222502] vfio_pci: add [10de:0fb9[ffffffff:ffffffff]] class 0x000000/00000000
    [ 216.060161] vfio-pci 0000:01:00.0: not ready 1023ms after resume; waiting
    [ 217.147997] vfio-pci 0000:01:00.0: not ready 2047ms after resume; waiting
    [ 219.260149] vfio-pci 0000:01:00.0: not ready 4095ms after resume; waiting
    [ 223.804082] vfio-pci 0000:01:00.0: not ready 8191ms after resume; waiting
    [ 232.508027] vfio-pci 0000:01:00.0: not ready 16383ms after resume; waiting
    [ 249.403758] vfio-pci 0000:01:00.0: not ready 32767ms after resume; waiting
    [ 283.707594] vfio-pci 0000:01:00.0: not ready 65535ms after resume; giving up

    ❯ sudo dmesg
    [32088.864999] ACPI Error: Aborting method \_SB.PCI0.PGON due to previous error (AE_AML_LOOP_TIMEOUT) (20250404/psparse-529)
    [32088.865016] ACPI Error: Aborting method \_SB.PCI0.PEG0.PG00._ON due to previous error (AE_AML_LOOP_TIMEOUT) (20250404/psparse-529)
$ lspci -vnnn | grep -iP -n15 "vga|nvidia|nouveau|vfio-pci"
172:01:00.0 3D controller [0302]: NVIDIA Corporation GP107M [GeForce GTX 1050 Ti Mobile] [10de:1c8c] (rev a1)
173-	Subsystem: Dell Device [1028:0870]
174-	Flags: bus master, fast devsel, latency 0, IRQ 11, IOMMU group 2
175-	Memory at ec000000 (32-bit, non-prefetchable) [size=16M]
176-	Memory at c0000000 (64-bit, prefetchable) [size=256M]
177-	Memory at d0000000 (64-bit, prefetchable) [size=32M]
178-	I/O ports at 4000 [disabled] [size=128]
179-	Expansion ROM at ed000000 [disabled] [size=512K]
180-	Capabilities: <access denied>
181:	Kernel driver in use: vfio-pci
182:	Kernel modules: nouveau, nvidia_drm, nvidia
183-
184:01:00.1 Audio device [0403]: NVIDIA Corporation GP107GL High Definition Audio Controller [10de:0fb9] (rev a1) (prog-if 00 [HDA compatible])
185-	Flags: fast devsel, IRQ 10, IOMMU group 2
186-	Memory at ed080000 (32-bit, non-prefetchable) [disabled] [size=16K]
187-	Capabilities: <access denied>
188:	Kernel driver in use: vfio-pci
189-	Kernel modules: snd_hda_intel

I should mention that I have set up Intel GVT-g to use my iGPU as a mediated device in my guest.

<devices>
  ...
  <graphics type='spice'>
    <listen type='none'/>
    <image compression='off'/>
    <gl enable='yes' rendernode='/dev/dri/by-path/pci-0000:00:02.0-render'/>
  </graphics>
  ...
</devices>

    This is for Intel GVT-g, but perhaps I should try setting the dGPU as the primary renderer instead...?
=================================================

>>Some prior steps

When I set these lines (I'm pretty sure they were wrong, so I later re-edited them) in the XML definition:

<domain xmlns:qemu="http://libvirt.org/schemas/domain/qemu/1.0" type="kvm">
     ...
     <devices>
         ... 
       <hostdev mode="subsystem" type="pci" managed="yes">
         <source>
           <address domain="0x0000" bus="0x01" slot="0x00" function="0x0"/>
         </source>
         <rom bar="on"/>
         <address type="pci" domain="0x0000" bus="0x01" slot="0x00" function="0x0" multifunction="on"/>
       </hostdev>
         ...
     </devices>
         ...
     <qemu:override>
       <qemu:device alias="hostdev1">
         <qemu:frontend>
           <qemu:property name="romfile" type="string" value="~/GeForce_GTX_1050Ti_86-07-63-00-63.rom"/>
           <qemu:property name="x-pci-vendor-id" type="string" value="0x10de"/>
           <qemu:property name="x-pci-device-id" type="string" value="0x1c8c"/>
           <qemu:property name="x-pci-sub-vendor-id" type="string" value="0x1028"/>
           <qemu:property name="x-pci-sub-device-id" type="string" value="0x0870"/>
         </qemu:frontend>
       </qemu:device>
     </qemu:override>
   </domain>

Virt-manager either gives the aforementioned "expects uint64" error for x-pci-device-id, or...:

    "Error starting domain: internal error: Unknown PCI header type '127' for device '0000:01:00.0'"
    -----------------------------------------------------------------
    Dec 12 04:06:18 TacoDELL libvirtd[680]: Unable to read from monitor: Connection reset by peer
    Dec 12 04:06:18 TacoDELL libvirtd[680]: internal error: QEMU unexpectedly closed the monitor (vm='win11'): 2025-12-12T08:06:18.625455Z qemu-system-x86_64: -device {"driver":"vfio-pci","host":"0000:01:00.0","id":"hostdev1","bus":"pci.1","multifunction":true,"addr":"0x0","rombar":1,"romfile":"~/GeForce_GTX_1050Ti_86-07-63-00-63.rom","x-pci-vendor-id":"0x10de","x-pci-device-id":"0x1c8c","x-pci-sub-vendor-id":"0x1028","x-pci-sub-device-id":"0x0870"}: Parameter 'x-pci-device-id' expects uint64
    Dec 12 04:06:18 TacoDELL libvirtd[680]: internal error: process exited while connecting to monitor: 2025-12-12T08:06:18.625455Z qemu-system-x86_64: -device {"driver":"vfio-pci","host":"0000:01:00.0","id":"hostdev1","bus":"pci.1","multifunction":true,"addr":"0x0","rombar":1,"romfile":"~/GeForce_GTX_1050Ti_86-07-63-00-63.rom","x-pci-vendor-id":"0x10de","x-pci-device-id":"0x1c8c","x-pci-sub-vendor-id":"0x1028","x-pci-sub-device-id":"0x0870"}: Parameter 'x-pci-device-id' expects uint64

I think I've read somewhere in the Arch wiki that the kernel itself may cause some issues with vfio_pci; it is probably a mix of factors with my particular laptop.
This particular "header type '127'" error has also been reported on AMD platforms prior to a BIOS update, from what I've read.
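
For context on the number itself, here is a minimal sketch (not taken from libvirt's source) of how a raw PCI header-type byte decodes. It assumes the standard config-space layout, where the byte at offset 0x0e holds the header layout in bits 0-6 and the multifunction flag in bit 7, and it assumes libvirt reports the masked layout value. Under those assumptions, '127' is what an all-ones read (0xff) looks like, which would suggest the device was not answering config reads at that moment. The function name decode_header_type is made up for illustration.

```shell
#!/bin/sh
# decode_header_type BYTE  (hypothetical helper, illustration only)
# BYTE is the raw value at config-space offset 0x0e, e.g. 0x00, 0x80 or 0xff.
decode_header_type() {
    raw=$(( $1 ))                 # accepts 0x.. or decimal input
    layout=$(( raw & 0x7f ))      # bits 0-6: header layout
    multi=$(( (raw >> 7) & 1 ))   # bit 7: multifunction flag
    case $layout in
        0) name="endpoint" ;;
        1) name="PCI-to-PCI bridge" ;;
        2) name="CardBus bridge" ;;
        *) name="unknown" ;;
    esac
    if [ "$raw" -eq 255 ]; then
        # Assumption: an all-ones byte means the read was not answered at all.
        name="unknown (all ones: the device did not answer the read)"
    fi
    echo "layout=$layout multifunction=$multi $name"
}

decode_header_type 0x80   # a healthy multifunction endpoint
decode_header_type 0xff   # what a powered-down or missing device returns
```

On the host, the byte could be read with something like `setpci -s 01:00.0 0e.b` (assuming pciutils is installed and the command is run as root) and passed in with a 0x prefix.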
The current vBIOS I am using is one that I dumped directly from my dGPU under a Windows 10 host...or so I thought.
I've tried patching it into edk2 by following this guide (https://lantian.pub/en/article/modify-c … h.lantian/), in the section "Setting up NVIDIA dGPU Passthrough". (NOTE: Since this guide is a bit outdated, I had to update the git submodules [$ git submodule update --init] and apply the nvidia-hack.diff changes to the QemuFwCfgAcpi.c file by hand, since its path has changed.)
    Still, NVIDIA is listed as an unknown device / 3D controller in the guest, just like in Arch, and the NVIDIA installer just closes as soon as it scans my VM, without any error message, and nothing gets installed, even after setting <vendor id="anything"/> and <kvm> <hidden state="on">...
So, I've been trying to find out whether the vBIOS can be extracted from my motherboard's BIOS update so that I can use that one instead, if it's even in there.
The program I used is called "VBiosFinder". It reported that no vBIOS could be found in my system BIOS update, though I'm not sure if that's true or if it failed to extract it because of how my particular manufacturer compresses and/or encrypts the update file. I then used this Dell script (https://github.com/T-vK/DellBiosUnpackerPOC) with it, and the results are still inconclusive, as it may work on certain machines that are not listed in the GitHub README.

The following messages would appear when I tried to dump the GPU vBIOS under Linux, likely due to it not being present (at least that's what I think):

[18480.733183] pci 0000:01:00.0: Invalid PCI ROM header signature: expecting 0xaa55, got 0xffff

or

"Input/Output Error"

with the last of the following commands:

    # echo "0000:01:00.0" > /sys/bus/pci/drivers/vfio-pci/unbind
    $ cd /sys/bus/pci/devices/0000:01:00.0
    # echo 1 > rom
    # cat rom > ~/GTX1050Timanual_vBIOS.dump
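
The "expecting 0xaa55, got 0xffff" message refers to the first two bytes of the ROM image, which are 0x55 0xAA for a valid PCI expansion ROM (0xaa55 when read as one little-endian word). A minimal sketch for checking a dump file offline follows; the helper names rom_signature and rom_looks_valid and the temporary demo files are made up for illustration, and the check says nothing about whether the rest of the image is usable.

```shell
#!/bin/sh
# rom_signature FILE: print the first two bytes of FILE as hex, e.g. "55aa"
rom_signature() {
    head -c 2 "$1" | od -An -tx1 | tr -d ' \n'
}

# rom_looks_valid FILE: succeed if FILE starts with the bytes 0x55 0xAA
rom_looks_valid() {
    [ "$(rom_signature "$1")" = "55aa" ]
}

# Demo on two throwaway files instead of a real dump.
tmp=$(mktemp -d)
printf '\125\252' > "$tmp/good.rom"   # octal escapes for 0x55 0xAA
printf '\377\377' > "$tmp/bad.rom"    # 0xFF 0xFF, like the failed read above
for f in "$tmp/good.rom" "$tmp/bad.rom"; do
    if rom_looks_valid "$f"; then
        echo "$(basename "$f"): signature ok"
    else
        echo "$(basename "$f"): bad signature $(rom_signature "$f")"
    fi
done
rm -rf "$tmp"
```

Running `rom_looks_valid` against a real dump, for example the .rom file referenced in the XML, would show whether that file at least starts like an expansion ROM.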

I understand that these forums may not be the place to bring up what I did based on an article other than the Arch wiki, so my question has more to do with the GPU itself not being properly detected and passed to the VM, even after reading these articles: https://wiki.archlinux.org/title/PCI_pa … _device_ID and https://wiki.archlinux.org/title/NVIDIA_Optimus...
I'm also trying to figure out what is causing the freezing.
Is it perhaps just because it's a dGPU and I'm missing some way of working around Optimus?

❯ journalctl -p 3 -xb
Dec 07 07:00:59 archlinux kernel: x86/cpu: SGX disabled or unsupported by BIOS.
Dec 07 07:00:59 archlinux kernel: ACPI Error: Aborting method \_SB.PCI0.SPI1.FPNT._CRS due to previous error (AE_AML_INVALID_RESOURCE_TYPE) (20250404/psparse-529)
Dec 07 07:00:59 archlinux kernel: ACPI Error: Method execution failed \_SB.PCI0.SPI1.FPNT._CRS due to previous error (AE_AML_INVALID_RESOURCE_TYPE) (20250404/uteval-68)
Dec 07 07:00:59 archlinux kernel: ACPI Error: Aborting method \_SB.PCI0.SPI1.FPNT._CRS due to previous error (AE_AML_INVALID_RESOURCE_TYPE) (20250404/psparse-529)
Dec 07 07:00:59 archlinux kernel: ACPI Error: Method execution failed \_SB.PCI0.SPI1.FPNT._CRS due to previous error (AE_AML_INVALID_RESOURCE_TYPE) (20250404/uteval-68)
Dec 07 07:00:59 archlinux kernel: ACPI Error: Aborting method \_SB.PCI0.SPI2.FPNT._CRS due to previous error (AE_AML_INVALID_RESOURCE_TYPE) (20250404/psparse-529)
Dec 07 07:00:59 archlinux kernel: ACPI Error: Method execution failed \_SB.PCI0.SPI2.FPNT._CRS due to previous error (AE_AML_INVALID_RESOURCE_TYPE) (20250404/uteval-68)
Dec 07 07:00:59 archlinux kernel: ACPI Error: Aborting method \_SB.PCI0.SPI2.FPNT._CRS due to previous error (AE_AML_INVALID_RESOURCE_TYPE) (20250404/psparse-529)
Dec 07 07:00:59 archlinux kernel: ACPI Error: Method execution failed \_SB.PCI0.SPI2.FPNT._CRS due to previous error (AE_AML_INVALID_RESOURCE_TYPE) (20250404/uteval-68)
Dec 07 07:01:01 archlinux kernel: sd 2:0:0:0: [sdc] Asking for cache data failed
Dec 07 07:04:01 TacoDELL libvirtd[671]: Unable to read from monitor: Connection reset by peer
Dec 07 07:04:01 TacoDELL libvirtd[671]: internal error: QEMU unexpectedly closed the monitor (vm='win11'): 2025-12-07T11:04:01.505053Z qemu-system-x86_64: -device {"driver":"vfio-pci","host":"0000:01:00.0",">
Dec 07 07:04:31 TacoDELL kernel: ACPI Error: Aborting method \_SB.PCI0.PGON due to previous error (AE_AML_LOOP_TIMEOUT) (20250404/psparse-529)
Dec 07 07:04:31 TacoDELL kernel: ACPI Error: Aborting method \_SB.PCI0.PEG0.PG00._ON due to previous error (AE_AML_LOOP_TIMEOUT) (20250404/psparse-529)
Dec 07 07:05:41 TacoDELL libvirtd[671]: internal error: Unknown PCI header type '127' for device '0000:01:00.0'
Dec 07 07:05:41 TacoDELL libvirtd[671]: Failed to reset PCI device: internal error: Unknown PCI header type '127' for device '0000:01:00.0'
Dec 07 07:05:41 TacoDELL libvirtd[671]: internal error: Unknown PCI header type '127' for device '0000:01:00.1'
Dec 07 07:05:41 TacoDELL libvirtd[671]: Failed to reset PCI device: internal error: Unknown PCI header type '127' for device '0000:01:00.1'
Dec 07 07:05:41 TacoDELL libvirtd[671]: Timed out during operation: cannot acquire state change lock
Dec 07 07:05:41 TacoDELL libvirtd[671]: Timed out during operation: cannot acquire state change lock
Dec 07 07:27:45 TacoDELL libvirtd[671]: internal error: Unknown PCI header type '127' for device '0000:01:00.0'

Now that I think about it, these ACPI errors have been present ever since I installed Arch on my current PC, so even before ever delving into virtual machines...and now they appear to happen not only on boot, but also whenever I try to use the vfio_pci driver.
I remember seeing "SB.PCI0.**********" as part of the unknown device that is supposed to be the dGPU in Windows' Device Manager... I'm referring to its BIOS device name: "\_SB.PCI0.S10.S00".
Clearly, whatever problems I have are not limited to the VM if the errors logged before the "Asking for cache data failed" line have anything to do with the GPU, but that's just me speculating.

The dGPU still works just fine in my old Windows 10 installation.
At some point, I started to experiment with the NVIDIA driver just to confirm that the card would work under Linux...
My reasoning for not installing the driver at first was that it wouldn't make a difference, since I would have to blacklist it along with Nouveau anyway.
Moreover, I wanted to avoid the potential issues with Optimus laptops under Wayland if possible.


=================================================

>>Testing the NVIDIA driver in Wayland

And so, I updated my system and installed the latest NVIDIA drivers...
The NVIDIA GPU is still detected as a generic 3D controller rather than a VGA compatible controller. I can confirm this is the case with or without vfio_pci & vfio in the mkinitcpio modules array, and whether or not the nvidia driver is in use.

Only my iGPU shows up as a card in here:

/dev/dri/by-path/
    pci-0000:00:02.0-card ⇒ ../card0
    pci-0000:00:02.0-render ⇒ ../renderD128

However, the dGPU does show up after switching from vfio_pci to nvidia and rebooting:

pci-0000:00:02.0-card ⇒ ../card0
pci-0000:00:02.0-render ⇒ ../renderD128
pci-0000:01:00.0-card ⇒ ../card1
pci-0000:01:00.0-render ⇒ ../renderD129

The dGPU appears to be functional as I have steady output on an external monitor plugged into the HDMI port directly connected to said dGPU.
No performance hit so far... No wildly spinning fans... Not even an instance of the system/framebuffer hanging during and after the booting process; just a little more CPU usage for whatever is displayed on the external monitor.
intel_gpu_top still shows my iGPU doing some rendering in my browser.

This all seems to confirm that the dGPU can indeed work to some extent in Linux.

❯ sudo dmesg | grep -ie "iommu" -e "nvidia"
    [    0.000000] Command line: BOOT_IMAGE=/vmlinuz-linux root=UUID=839c22ab-70f1-4b0b-bc01-89065821a255 rw rootflags=subvol=@ initcall_blacklist=sysfb_init loglevel=3 quiet iommu=1 intel_iommu=on iommu=pt i915.enable_gvt=1 i915.enable_fbc=0
    [    0.032808] Kernel command line: BOOT_IMAGE=/vmlinuz-linux root=UUID=839c22ab-70f1-4b0b-bc01-89065821a255 rw rootflags=subvol=@ initcall_blacklist=sysfb_init loglevel=3 quiet iommu=1 intel_iommu=on iommu=pt i915.enable_gvt=1 i915.enable_fbc=0
    [    0.032887] DMAR: IOMMU enabled
    [    0.085647] DMAR-IR: IOAPIC id 2 under DRHD base  0xfed91000 IOMMU 1
    [    0.365826] iommu: Default domain type: Passthrough (set via kernel command line)
    [    0.429241] pci 0000:00:02.0: Adding to iommu group 0
    [    0.429278] pci 0000:00:00.0: Adding to iommu group 1
    [    0.429292] pci 0000:00:01.0: Adding to iommu group 2
    [    0.429301] pci 0000:00:04.0: Adding to iommu group 3
    [    0.429312] pci 0000:00:08.0: Adding to iommu group 4
    [    0.429325] pci 0000:00:12.0: Adding to iommu group 5
    [    0.429343] pci 0000:00:14.0: Adding to iommu group 6
    [    0.429352] pci 0000:00:14.2: Adding to iommu group 6
    [    0.429365] pci 0000:00:14.3: Adding to iommu group 7
    [    0.429381] pci 0000:00:15.0: Adding to iommu group 8
    [    0.429390] pci 0000:00:15.1: Adding to iommu group 8
    [    0.429402] pci 0000:00:16.0: Adding to iommu group 9
    [    0.429411] pci 0000:00:17.0: Adding to iommu group 10
    [    0.429431] pci 0000:00:1b.0: Adding to iommu group 11
    [    0.429446] pci 0000:00:1d.0: Adding to iommu group 12
    [    0.429468] pci 0000:00:1f.0: Adding to iommu group 13
    [    0.429478] pci 0000:00:1f.3: Adding to iommu group 13
    [    0.429489] pci 0000:00:1f.4: Adding to iommu group 13
    [    0.429498] pci 0000:00:1f.5: Adding to iommu group 13
    [    0.429503] pci 0000:01:00.0: Adding to iommu group 2
    [    0.429508] pci 0000:01:00.1: Adding to iommu group 2
    [    0.429524] pci 0000:3b:00.0: Adding to iommu group 14
    [    2.229568] nvidia: loading out-of-tree module taints kernel.
    [    2.229577] nvidia: module license 'NVIDIA' taints kernel.
    [    2.229581] nvidia: module verification failed: signature and/or required key missing - tainting kernel
    [    2.229582] nvidia: module license taints kernel.
    [    2.814992] nvidia-nvlink: Nvlink Core is being initialized, major device number 235
    [    2.820514] nvidia 0000:01:00.0: enabling device (0006 -> 0007)
    [    3.033751] NVRM: loading NVIDIA UNIX x86_64 Kernel Module  580.105.08  Wed Oct 29 23:15:11 UTC 2025
    [    3.070184] nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms  580.105.08  Wed Oct 29 22:15:26 UTC 2025
    [    3.077479] [drm] [nvidia-drm] [GPU ID 0x00000100] Loading driver
    [    3.082810] nvidia_uvm: module uses symbols nvUvmInterfaceDisableAccessCntr from proprietary module nvidia, inheriting taint.
    [    3.687849] [drm] Initialized nvidia-drm 0.0.0 for 0000:01:00.0 on minor 1
    [    3.735621] nvidia 0000:01:00.0: [drm] fb1: nvidia-drmdrmfb frame buffer device
    [    5.685990] input: HDA NVidia HDMI/DP,pcm=3 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card1/input28
    [    5.686041] input: HDA NVidia HDMI/DP,pcm=7 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card1/input29
    [    5.686085] input: HDA NVidia HDMI/DP,pcm=8 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card1/input30
    [    5.686524] input: HDA NVidia HDMI/DP,pcm=9 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card1/input31
    [    8.281601] intel_vgpu_mdev be318a5b-8324-4a18-bc15-f8b618f2363a: Adding to iommu group 15

It's odd to me that the kernel is getting tainted with out-of-tree modules...
I'm not sure what to think of it being tainted by NVIDIA... Doesn't this mean that it may function in unpredictable ways?
It's probably unrelated, but my dGPU is listed as "Current, supported" on https://wiki.archlinux.org/title/NVIDIA.

Also, you may notice here that three devices have been added to group 2 (the reason why I isolated the two NVIDIA ones with a modprobe file):

❯ lsiommu | grep -i -n4 "group 2"
    1-IOMMU Group 0:
    2-  00:02.0 VGA compatible controller [0300]: Intel Corporation CoffeeLake-H GT2 [UHD Graphics 630] [8086:3e9b]
    3-IOMMU Group 1:
    4-  00:00.0 Host bridge [0600]: Intel Corporation 8th/9th Gen Core Processor Host Bridge / DRAM Registers [8086:3ec4] (rev 07)
    5:IOMMU Group 2:
    6-  00:01.0 PCI bridge [0604]: Intel Corporation 6th-10th Gen Core Processor PCIe Controller (x16) [8086:1901] (rev 07)
    7-  01:00.0 3D controller [0302]: NVIDIA Corporation GP107M [GeForce GTX 1050 Ti Mobile] [10de:1c8c] (rev a1)
    8-  01:00.1 Audio device [0403]: NVIDIA Corporation GP107GL High Definition Audio Controller [10de:0fb9] (rev a1)
    9-IOMMU Group 3:
    lspci: -s: Invalid slot number
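
As a small aid for keeping the modprobe line in sync with listings like the one above, here is a sketch that pulls the [vendor:device] pairs out of `lspci -nn` style lines and joins them in the comma-separated form that `options vfio_pci ids=` takes. The function name vfio_ids is made up, and the sketch assumes each input line ends with exactly one bracketed xxxx:xxxx pair, as in the output shown in this post.

```shell
#!/bin/sh
# vfio_ids: read `lspci -nn` style lines on stdin and print the
# vendor:device pairs as "vvvv:dddd,vvvv:dddd,..." (hypothetical helper)
vfio_ids() {
    sed -n 's/.*\[\([0-9a-f]\{4\}:[0-9a-f]\{4\}\)\].*/\1/p' | paste -s -d, -
}

# Demo using the two NVIDIA functions from IOMMU group 2 as listed above.
printf '%s\n' \
  '01:00.0 3D controller [0302]: NVIDIA Corporation GP107M [GeForce GTX 1050 Ti Mobile] [10de:1c8c] (rev a1)' \
  '01:00.1 Audio device [0403]: NVIDIA Corporation GP107GL High Definition Audio Controller [10de:0fb9] (rev a1)' \
  | vfio_ids
```

On a live system, the input could come from `lspci -nn -s 01:00` instead of the pasted lines; the PCI bridge at 00:01.0 is deliberately left out of the demo input.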

    Regarding the hardware ID in the guest's Device Manager, I've installed hwinfo and used it in Arch.
   

❯ hwinfo --gfxcard
❯ lspci -nnks 01:00.

    The results show that the subsystem for the NVIDIA dGPU is the same as the Intel iGPU...

    For NVIDIA, I still don't have anything like Optimus-manager or Envy-control installed -- it's just the following packages:

  • nvidia (580.105.08)

  • nvidia-lts

  • nvidia-utils

  • lib32-nvidia-utils

  • egl-wayland

  • egl-gbm

    So, when I run:

$ nvidia-smi

    It shows that my dGPU is "off"...yet the PCI address is "on", oddly enough.

These are the variables I used while having nvidia modules loaded without vfio_pci and vfio:

hyprland.conf
    -----------------------------------
    env = LIBVA_DRIVER_NAME,nvidia
    env = __GLX_VENDOR_LIBRARY_NAME,nvidia
    env = GBM_BACKEND,nvidia-drm

=================================================

>>Rebooting again with vfio_pci & commenting out all nvidia-related variables

<qemu:device alias="hostdev1">
  <qemu:frontend>
    <qemu:property name="romfile" type="string" value="~/GeForce_GTX_1050Ti_86-07-63-00-63.rom"/>
    <qemu:property name="x-pci-sub-vendor-id" type="unsigned" value="1028"/>
    <qemu:property name="x-pci-sub-device-id" type="unsigned" value="0870"/>
  </qemu:frontend>
</qemu:device>

      Finally, Virt-manager outputs a new error after I made the above changes and edited the ids in /etc/modprobe.d/vfio_pci.conf as follows, to try to add the missing subsystem IDs:

options vfio_pci ids=10de:1c8c:1028:0870,10de:0fb9

      and rebooting with vfio_pci loaded once again:

$ journalctl -p 3 -xb
          Dec 12 05:43:57 TacoDELL libvirtd[680]: Unable to read from monitor: Connection reset by peer
          Dec 12 05:43:57 TacoDELL libvirtd[680]: internal error: QEMU unexpectedly closed the monitor (vm='win11'): 2025-12-12T09:43:57.462013Z qemu-system-x86_64: -device {"driver":"vfio-pci","host":"0000:01:00.0","id":"hostdev1","bus":"pci.1","multifunction":true,"addr":"0x0","rombar":1,"romfile":"~/GeForce_GTX_1050Ti_86-07-63-00-63.rom","x-pci-sub-vendor-id":1028,"x-pci-sub-device-id":870}: vfio 0000:01:00.0: error getting device from group 2: No such device
          Verify all devices in group 2 are bound to vfio-<bus> or pci-stub and not already in use

      "not already in use"
      Even though the device is clearly listed at that address and using vfio_pci.
      And now, Virt-manager no longer boots the VM with this current configuration.
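
To act on the "Verify all devices in group 2 are bound" hint, the driver bound to each member of an IOMMU group can be read from sysfs. The sketch below assumes the usual layout, where each entry under <group>/devices/ exposes a driver symlink when something is bound; the function name group_drivers is made up, and the group path is passed in as an argument so the function can also be pointed at a test directory.

```shell
#!/bin/sh
# group_drivers GROUP_DIR: for each device in an IOMMU group directory,
# print "<pci address> <bound driver>" (hypothetical helper)
group_drivers() {
    for dev in "$1"/devices/*; do
        [ -e "$dev" ] || continue          # skip if the glob matched nothing
        if [ -e "$dev/driver" ]; then
            # The driver entry is a symlink whose last component is the name.
            drv=$(basename "$(readlink "$dev/driver")")
        else
            drv="(none)"
        fi
        echo "$(basename "$dev") $drv"
    done
}

# Group 2 is the group that holds the dGPU on this machine.
group_drivers /sys/kernel/iommu_groups/2
```

If 0000:01:00.0 and 0000:01:00.1 do not both report vfio-pci here, that would be one possible explanation for the "No such device" message; the root port 00:01.0 is generally expected to stay on its own bridge driver.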

--UPDATE--

Dec 13 19:36:08 TacoDELL libvirtd[681]: internal error: Unknown PCI header type '127' for device '0000:01:00.0'
Dec 13 19:41:58 TacoDELL libvirtd[681]: internal error: Unknown PCI header type '127' for device '0000:01:00.0'

      This error reappeared after the following changes were made to the XML. I've also obtained another vBIOS file, this time from the Windows Registry, and I plan to patch it into edk2:

...
<qemu:override>
  </qemu:device>
    <qemu:device alias="hostdev1">
      <qemu:frontend>
        <qemu:property name="x-pci-sub-vendor-id" type="unsigned" value="4136"/>
        <qemu:property name="x-pci-sub-device-id" type="unsigned" value="2160"/>
      </qemu:frontend>
    </qemu:device>

      Virt-manager:

Error starting domain: Timed out during operation: cannot acquire state change lock (held by monitor=remotedispatchdomain create)

      And then the whole system freezes again, and I have to power cycle multiple times before being able to relaunch the user session...

      If anyone has any suggestions, I'll be back whenever I can with the results.

===================================================================================================================================================

>>Configs, parameters, variables, modules...

Everything I modprobed:

blacklist nouveau
blacklist nvidia

options drm debug=0

options net ifnames=0

options vfio_iommu_type1 allow_unsafe_interrupts=1

options vfio_pci ids=10de:1c8c,10de:0fb9              [UPDATE]: options vfio_pci ids=10de:1c8c:1028:0870,10de:0fb9

---

GRUB_CMDLINE_LINUX_DEFAULT="loglevel=3 quiet iommu=1 intel_iommu=on iommu=pt i915.enable_gvt=1 i915.enable_fbc=0"
GRUB_CMDLINE_LINUX="initcall_blacklist=sysfb_init"

---

    Dec 08 21:07:53 archlinux kernel: DMAR: Intel(R) Virtualization Technology for Directed I/O

    Intel VT-d is enabled.

    ❯ LC_ALL=C.UTF-8 lscpu | grep Virtualization
    Virtualization:                          VT-x

    So is VT-x

    ❯ lsmod | grep kvm
    kvm_intel             438272  0
    kvmgt                 475136  0
    mdev                   20480  1 kvmgt
    kvm                  1400832  2 kvmgt,kvm_intel
    irqbypass              16384  2 vfio_pci_core,kvm
    vfio                   77824  5 vfio_pci_core,kvmgt,vfio_iommu_type1,vfio_pci
    i915                 4853760  51 kvmgt

/etc/mkinitcpio.conf
    ---------------------------------------------------
    MODULES=(... vfio_pci vfio vfio_iommu_type1 i915 nvidia nvidia_modeset nvidia_uvm nvidia_drm)
    HOOKS=(base systemd autodetect microcode modconf kms keyboard keymap sd-vconsole block filesystems fsck)

/etc/modules-load.d/nvidia.conf
    ---------------------------------------------------
    nvidia
    nvidia_modeset
    nvidia_uvm
    nvidia_drm

/etc/modules-load.d/i915
    ---------------------------------------------------
    i915

/etc/modules-load.d/kvmgt.conf
    ---------------------------------------------------
    kvmgt

/etc/modules-load.d/mdev.conf
    ---------------------------------------------------
    mdev

/etc/modules-load.d/pci-passthrough
    ---------------------------------------------------
    vfio
    vfio_pci

    # Some modules required for Intel GVT
    exngt
    vfio_mdev

/etc/modules-load.d/vfio-iommu-type1.conf
    ---------------------------------------------------
    vfio_iommu_type1

      My XML file: https://0x0.st/PrzH.txt

Last edited by CryogEnix (2025-12-15 14:53:15)


#2 2025-12-14 21:49:25

CryogEnix
Member
Registered: 2025-12-14
Posts: 10

Re: Optimus PCI passthrough: internal error: Unknown PCI header type '127'

My journal from the last boot: http://0x0.st/PrK6.txt

After that last OVMF patch with a vBIOS I took from the Windows registry, I am now greeted by a black screen on the VM console.
Even when I remove the dGPU's ROM file, it makes no difference. Nothing shows up on my external monitor, and CPU usage stays at a constant 20% without any sign of something happening...
I have to reboot and do a full power cycle every time the VM is shut down for the vfio_pci driver to reinitialize the chip and have another shot at making this work...if the driver doesn't fail to bind the dGPU.
I think I still need the gvt rom file.

I made sure to replace the correct files in these paths:

<loader readonly="yes" type="pflash" format="raw">/home/Asterion/builds/edk2/Build/OvmfX64/DEBUG_GCC5/FV/OVMF_CODE.fd</loader>
<nvram format="raw">/var/lib/libvirt/qemu/nvram/win11_VARS.fd</nvram>

At the very least, I can confirm that, after many reboots, the modifications I made to the x-pci-sub-vendor-id and x-pci-sub-device-id properties in the XML did change the subsystem ID in Windows' Device Manager. I used decimal values rather than hexadecimal ones because libvirt changed its format at some point, and I made these changes before replacing the two files above to apply the new patch. Still, no NVIDIA drivers would install. And again, no output on the monitor plugged into the HDMI port.
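
For reference, the conversion between the hexadecimal IDs reported by lspci and the decimal values used with type="unsigned" can be sketched as below. The helper names hex_to_dec and dec_to_hex are made up for illustration. The second helper shows why the earlier attempt with value="1028" and value="0870" could not match: those digits were logged by QEMU as the decimal numbers 1028 and 870, which are different IDs.

```shell
#!/bin/sh
# hex_to_dec HEX: print HEX (with or without a 0x prefix) in decimal
hex_to_dec() {
    printf '%d\n' "0x${1#0x}"
}

# dec_to_hex DEC: print a decimal number as a 4-digit hex ID
dec_to_hex() {
    printf '0x%04x\n' "$1"
}

hex_to_dec 0x1028   # Dell sub-vendor ID  -> 4136
hex_to_dec 0870     # sub-device ID       -> 2160
dec_to_hex 1028     # what value="1028" actually meant -> 0x0404
dec_to_hex 870      # what value="0870" actually meant -> 0x0366
```

The first two results match the 4136 and 2160 values used in the updated XML.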
As always, this is the 3D video controller's device status in Device Manager:

The drivers for this device are not installed. (Code 28)

There are no compatible drivers for this device.

To find a driver for this device, click Update Driver.

So, no Code 43; I never ran into it. And I'd rather not try to manually install drivers again in case it makes debugging more difficult.

I had to switch to these files to get the VM to boot again:

/usr/share/edk2/x64/OVMF_CODE.4m.fd
/usr/share/edk2/x64/OVMF_VARS.4m.fd



I did use rom-parser to check my vBIOS's compatibility with UEFI, by the way.
I just checked it again with the vBIOS I got from the Windows Registry:

Valid ROM signature found @0h, PCIR offset 170h
	PCIR: type 0 (x86 PC-AT), vendor: 10de, device: 1c8c, class: 030000
	PCIR: revision 3, vendor revision: 1
	Last image

Odd... I was sure that it was UEFI-compatible... I mean, it's the firmware that my machine boots with, and again my dGPU works fine under the Windows host.
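
In the PCI expansion ROM format, each image carries a code type, where type 0 is a legacy x86 image and type 3 is an EFI image. The output above lists a single type 0 image, so that particular file does not appear to contain an EFI image. A minimal sketch for checking rom-parser output follows; the helper names are made up, and the "type 3 (EFI)" wording in the second demo is an assumption about how rom-parser labels such an image rather than output from this machine.

```shell
#!/bin/sh
# list_image_types: read rom-parser output on stdin and print one
# "<type> <label>" line per image found (hypothetical helper)
list_image_types() {
    sed -n 's/.*PCIR: type \([0-9a-f]*\) (\([^)]*\)).*/\1 \2/p'
}

# has_efi_image: succeed if any image in the output has code type 3
has_efi_image() {
    list_image_types | grep -q '^3 '
}

# Demo 1: the output pasted above.
printf '%s\n' \
  'Valid ROM signature found @0h, PCIR offset 170h' \
  '	PCIR: type 0 (x86 PC-AT), vendor: 10de, device: 1c8c, class: 030000' \
  '	PCIR: revision 3, vendor revision: 1' \
  '	Last image' | list_image_types

# Demo 2: an assumed line for an EFI image, for comparison only.
if printf '%s\n' '	PCIR: type 3 (EFI), vendor: 10de, device: 1c8c, class: 030000' | has_efi_image; then
    echo "EFI image present"
fi
```

On Optimus laptops the GPU firmware is often supplied by the system firmware rather than by a separate ROM chip, so a file with no EFI image may simply not be the copy the machine boots with.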

Last edited by CryogEnix (2025-12-15 11:02:52)


#3 2025-12-15 11:01:41

Lone_Wolf
Administrator
From: Netherlands, Europe
Registered: 2005-10-04
Posts: 14,795

Re: Optimus PCI passthrough: internal error: Unknown PCI header type '127'

Please post the full output of the script listed in https://wiki.archlinux.org/title/PCI_pa … _are_valid

Have you tried passing through the card to an Arch Linux guest?


Moderator Note
moving to Kernel & Hardware

Last edited by Lone_Wolf (2025-12-15 11:03:08)




#4 2025-12-15 11:04:34

CryogEnix
Member
Registered: 2025-12-14
Posts: 10

Re: Optimus PCI passthrough: internal error: Unknown PCI header type '127'

No, I have not yet tried to pass it to a Linux guest.

I did already use that script, but as an alias I've made: $ lsiommu
Anyway, here is the full output:

❯ #!/bin/bash
shopt -s nullglob
for g in $(find /sys/kernel/iommu_groups/* -maxdepth 0 -type d | sort -V); do
    echo "IOMMU Group ${g##*/}:"
    for d in $g/devices/*; do
        echo -e "\t$(lspci -nns ${d##*/})"
    done;
done;
IOMMU Group 0:
	00:02.0 VGA compatible controller [0300]: Intel Corporation CoffeeLake-H GT2 [UHD Graphics 630] [8086:3e9b]
IOMMU Group 1:
	00:00.0 Host bridge [0600]: Intel Corporation 8th/9th Gen Core Processor Host Bridge / DRAM Registers [8086:3ec4] (rev 07)
IOMMU Group 2:
	00:01.0 PCI bridge [0604]: Intel Corporation 6th-10th Gen Core Processor PCIe Controller (x16) [8086:1901] (rev 07)
	01:00.0 3D controller [0302]: NVIDIA Corporation GP107M [GeForce GTX 1050 Ti Mobile] [10de:1c8c] (rev a1)
	01:00.1 Audio device [0403]: NVIDIA Corporation GP107GL High Definition Audio Controller [10de:0fb9] (rev a1)
IOMMU Group 3:
	00:04.0 Signal processing controller [1180]: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor Thermal Subsystem [8086:1903] (rev 07)
IOMMU Group 4:
	00:08.0 System peripheral [0880]: Intel Corporation Xeon E3-1200 v5/v6 / E3-1500 v5 / 6th/7th/8th Gen Core Processor Gaussian Mixture Model [8086:1911]
IOMMU Group 5:
	00:12.0 Signal processing controller [1180]: Intel Corporation Cannon Lake PCH Thermal Controller [8086:a379] (rev 10)
IOMMU Group 6:
	00:14.0 USB controller [0c03]: Intel Corporation Cannon Lake PCH USB 3.1 xHCI Host Controller [8086:a36d] (rev 10)
	00:14.2 RAM memory [0500]: Intel Corporation Cannon Lake PCH Shared SRAM [8086:a36f] (rev 10)
IOMMU Group 7:
	00:14.3 Network controller [0280]: Intel Corporation Cannon Lake PCH CNVi WiFi [8086:a370] (rev 10)
IOMMU Group 8:
	00:15.0 Serial bus controller [0c80]: Intel Corporation Cannon Lake PCH Serial IO I2C Controller #0 [8086:a368] (rev 10)
	00:15.1 Serial bus controller [0c80]: Intel Corporation Cannon Lake PCH Serial IO I2C Controller #1 [8086:a369] (rev 10)
IOMMU Group 9:
	00:16.0 Communication controller [0780]: Intel Corporation Cannon Lake PCH HECI Controller [8086:a360] (rev 10)
IOMMU Group 10:
	00:17.0 RAID bus controller [0104]: Intel Corporation 82801 Mobile SATA Controller [RAID mode] [8086:282a] (rev 10)
IOMMU Group 11:
	00:1b.0 PCI bridge [0604]: Intel Corporation Cannon Lake PCH PCI Express Root Port #21 [8086:a32c] (rev f0)
IOMMU Group 12:
	00:1d.0 PCI bridge [0604]: Intel Corporation Cannon Lake PCH PCI Express Root Port #14 [8086:a335] (rev f0)
IOMMU Group 13:
	00:1f.0 ISA bridge [0601]: Intel Corporation HM370 Chipset LPC/eSPI Controller [8086:a30d] (rev 10)
	00:1f.3 Audio device [0403]: Intel Corporation Cannon Lake PCH cAVS [8086:a348] (rev 10)
	00:1f.4 SMBus [0c05]: Intel Corporation Cannon Lake PCH SMBus Controller [8086:a323] (rev 10)
	00:1f.5 Serial bus controller [0c80]: Intel Corporation Cannon Lake PCH SPI Controller [8086:a324] (rev 10)
IOMMU Group 14:
	3b:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL8111/8168/8211/8411 PCI Express Gigabit Ethernet Controller [10ec:8168] (rev 15)
IOMMU Group 15:
lspci: -s: Invalid slot number

Last edited by CryogEnix (2025-12-15 11:25:26)

Offline

#5 2025-12-15 12:17:33

Lone_Wolf
Administrator
From: Netherlands, Europe
Registered: 2005-10-04
Posts: 14,795

Re: Optimus PCI passthrough: internal error: Unknown PCI header type '127'

lspci: -s: Invalid slot number

That's not supposed to happen. What does lspci -k (run as user) show?


Offline

#6 2025-12-15 12:28:35

CryogEnix
Member
Registered: 2025-12-14
Posts: 10

Re: Optimus PCI passthrough: internal error: Unknown PCI header type '127'

❯ lspci -k
00:00.0 Host bridge: Intel Corporation 8th/9th Gen Core Processor Host Bridge / DRAM Registers (rev 07)
	DeviceName: Onboard - Other
	Subsystem: Dell Device 0870
	Kernel driver in use: skl_uncore
00:01.0 PCI bridge: Intel Corporation 6th-10th Gen Core Processor PCIe Controller (x16) (rev 07)
	Subsystem: Dell Device 0870
	Kernel driver in use: pcieport
00:02.0 VGA compatible controller: Intel Corporation CoffeeLake-H GT2 [UHD Graphics 630]
	DeviceName: Onboard - Video
	Subsystem: Dell Device 0870
	Kernel driver in use: i915
	Kernel modules: i915
00:04.0 Signal processing controller: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor Thermal Subsystem (rev 07)
	DeviceName: Onboard - Other
	Subsystem: Dell Device 0870
	Kernel driver in use: proc_thermal
	Kernel modules: processor_thermal_device_pci_legacy
00:08.0 System peripheral: Intel Corporation Xeon E3-1200 v5/v6 / E3-1500 v5 / 6th/7th/8th Gen Core Processor Gaussian Mixture Model
	DeviceName: Onboard - Other
	Subsystem: Dell Device 0870
00:12.0 Signal processing controller: Intel Corporation Cannon Lake PCH Thermal Controller (rev 10)
	DeviceName: Onboard - Other
	Subsystem: Dell Device 0870
	Kernel driver in use: intel_pch_thermal
	Kernel modules: intel_pch_thermal
00:14.0 USB controller: Intel Corporation Cannon Lake PCH USB 3.1 xHCI Host Controller (rev 10)
	DeviceName: Onboard - Other
	Subsystem: Dell Device 0870
	Kernel driver in use: xhci_hcd
00:14.2 RAM memory: Intel Corporation Cannon Lake PCH Shared SRAM (rev 10)
	DeviceName: Onboard - Other
	Subsystem: Dell Device 0870
00:14.3 Network controller: Intel Corporation Cannon Lake PCH CNVi WiFi (rev 10)
	DeviceName: Onboard - Ethernet
	Subsystem: Intel Corporation Device 42a4
	Kernel driver in use: iwlwifi
	Kernel modules: iwlwifi
00:15.0 Serial bus controller: Intel Corporation Cannon Lake PCH Serial IO I2C Controller #0 (rev 10)
	DeviceName: Onboard - Other
	Subsystem: Dell Device 0870
	Kernel driver in use: intel-lpss
	Kernel modules: intel_lpss_pci
00:15.1 Serial bus controller: Intel Corporation Cannon Lake PCH Serial IO I2C Controller #1 (rev 10)
	DeviceName: Onboard - Other
	Subsystem: Dell Device 0870
	Kernel driver in use: intel-lpss
	Kernel modules: intel_lpss_pci
00:16.0 Communication controller: Intel Corporation Cannon Lake PCH HECI Controller (rev 10)
	DeviceName: Onboard - Other
	Subsystem: Dell Device 0870
	Kernel driver in use: mei_me
	Kernel modules: mei_me
00:17.0 RAID bus controller: Intel Corporation 82801 Mobile SATA Controller [RAID mode] (rev 10)
	DeviceName: Onboard - Other
	Subsystem: Dell Device 0870
	Kernel driver in use: ahci
00:1b.0 PCI bridge: Intel Corporation Cannon Lake PCH PCI Express Root Port #21 (rev f0)
	Subsystem: Dell Device 0870
	Kernel driver in use: pcieport
00:1d.0 PCI bridge: Intel Corporation Cannon Lake PCH PCI Express Root Port #14 (rev f0)
	Subsystem: Dell Device 0870
	Kernel driver in use: pcieport
00:1f.0 ISA bridge: Intel Corporation HM370 Chipset LPC/eSPI Controller (rev 10)
	DeviceName: Onboard - Other
	Subsystem: Dell Device 0870
00:1f.3 Audio device: Intel Corporation Cannon Lake PCH cAVS (rev 10)
	DeviceName: Onboard - Sound
	Subsystem: Dell Device 0870
	Kernel driver in use: snd_hda_intel
	Kernel modules: snd_soc_avs, snd_sof_pci_intel_cnl, snd_hda_intel
00:1f.4 SMBus: Intel Corporation Cannon Lake PCH SMBus Controller (rev 10)
	DeviceName: Onboard - Other
	Subsystem: Dell Device 0870
	Kernel driver in use: i801_smbus
	Kernel modules: i2c_i801
00:1f.5 Serial bus controller: Intel Corporation Cannon Lake PCH SPI Controller (rev 10)
	DeviceName: Onboard - Other
	Subsystem: Dell Device 0870
	Kernel driver in use: intel-spi
	Kernel modules: spi_intel_pci
01:00.0 3D controller: NVIDIA Corporation GP107M [GeForce GTX 1050 Ti Mobile] (rev a1)
	Subsystem: Dell Device 0870
	Kernel driver in use: vfio-pci
	Kernel modules: nouveau, nvidia_drm, nvidia
01:00.1 Audio device: NVIDIA Corporation GP107GL High Definition Audio Controller (rev a1)
	Kernel driver in use: vfio-pci
	Kernel modules: snd_hda_intel
3b:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8211/8411 PCI Express Gigabit Ethernet Controller (rev 15)
	Subsystem: Dell Device 0870
	Kernel driver in use: r8169
	Kernel modules: r8169

I think this happens because IOMMU group 15 contains an mdev device, which is identified by a UUID rather than by the PCI address that lspci's [-s] flag expects. It's the Intel GVT-g vGPU that I've set to be created automatically on boot. It also shows up in the dmesg output from my original post:
   

[    8.281601] intel_vgpu_mdev be318a5b-8324-4a18-bc15-f8b618f2363a: Adding to iommu group 15
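
For reference, a minimal sketch of the same IOMMU listing loop that tolerates such entries. It assumes that any group member whose name is not a full PCI address (like the mdev UUID above) should just be printed as-is instead of being handed to lspci. The function names are made up for illustration, and the sysfs root is a parameter only so the loop can be tried against a copy of the tree.

```shell
#!/bin/bash
# Sketch: list IOMMU groups without tripping over non-PCI members (e.g. mdev vGPUs).

# Return 0 if $1 looks like a full PCI address (domain:bus:device.function).
is_pci_address() {
    [[ $1 =~ ^[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]$ ]]
}

# Print one line describing a group member.
describe_device() {
    local name=$1
    if ! is_pci_address "$name"; then
        echo "$name (not a PCI address, e.g. an mdev UUID)"
    elif command -v lspci >/dev/null 2>&1; then
        lspci -nns "$name"
    else
        # lspci is not installed: fall back to the bare address.
        echo "$name"
    fi
}

# Walk the groups in numeric order and describe every member.
list_iommu_groups() {
    local root=${1:-/sys/kernel/iommu_groups} g d
    shopt -s nullglob
    for g in $(printf '%s\n' "$root"/*/ | sort -V); do
        g=${g%/}
        echo "IOMMU Group ${g##*/}:"
        for d in "$g"/devices/*; do
            printf '\t%s\n' "$(describe_device "${d##*/}")"
        done
    done
}

list_iommu_groups "$@"
```

With this variant, group 15 should list the UUID with a note instead of ending in "lspci: -s: Invalid slot number".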

============================================================================================
I just installed nvflash to check the vBIOS version under Linux. This is the result, even after a full shutdown:

❯ sudo nvflash --version
[sudo] password for Asterion:
Sorry, try again.
[sudo] password for Asterion:
NVIDIA Firmware Update Utility (Version 5.867.0)
Copyright (C) 1993-2024, NVIDIA Corporation. All rights reserved.


ERROR: A system restart might be required before running the utility.
 Nvflash CPU side error Code:2Error Message: Falcon In HALT or STOP state, abort uCode command issuing process.

I think Falcon is some kind of security feature for Pascal GPUs...?
I see that it's a microprocessor of some kind...which means that it just probed my actual GPU. Gonna check [--help].

❯ sudo nvflash --version ~/nvidia_gtx1050Ti.rom
[sudo] password for Asterion:
NVIDIA Firmware Update Utility (Version 5.867.0)
Copyright (C) 1993-2024, NVIDIA Corporation. All rights reserved.

...
Subsystem Vendor ID   : 0x10DE
Subsystem ID          : 0x0000
Device Name(s)        : GeForce GTX 1050 Ti
Vendor ID             : 0x10DE
Device ID             : 0x1C8C
UEFI Version          : No Version Found or Out-dated (  )
UEFI Variant ID       : No Variant ID Found ( No Variant ID Found )
UEFI Signer(s)        : Unknown signer
...

======================================================
After a bit of searching on TechPowerUp, I picked a ROM file that supports UEFI and is close enough to my actual vBIOS, then added it to my GPU in the XML definition...
Still no luck.
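
One way to double-check whether a given ROM file carries a UEFI image at all is to walk its PCI option ROM headers. The sketch below does that with od. It assumes the file starts directly with the 0x55 0xAA signature (a dump with a vendor-specific header in front would need trimming first), and it stops at the first image flagged as last, so it may miss anything a vendor appends after that. The function names are made up for illustration.

```shell
#!/bin/bash
# Sketch: report the code type of each image in a PCI option ROM file.
# Code type 0x00 is legacy x86, 0x03 is EFI.

# Read $3 bytes at offset $2 of file $1 as one little-endian unsigned integer.
read_le() {
    local file=$1 offset=$2 count=$3 byte value=0 bits=0
    for byte in $(od -An -v -tu1 -j "$offset" -N "$count" "$file" 2>/dev/null); do
        value=$(( value | (byte << bits) ))
        bits=$(( bits + 8 ))
    done
    echo "$value"
}

# Print every image found; return 0 if at least one of them is an EFI image.
rom_has_efi_image() {
    local file=$1 size base=0 pcir type len last found=1
    size=$(wc -c < "$file")
    while (( base + 0x1a <= size )); do
        # Each image starts with the 0x55 0xAA signature.
        (( $(read_le "$file" "$base" 2) == 0xaa55 )) || break
        # Offset 0x18 holds a pointer to the "PCIR" data structure.
        pcir=$(( base + $(read_le "$file" $((base + 0x18)) 2) ))
        (( $(read_le "$file" "$pcir" 4) == 0x52494350 )) || break
        len=$(read_le "$file" $((pcir + 0x10)) 2)    # image length, 512-byte units
        type=$(read_le "$file" $((pcir + 0x14)) 1)   # code type
        last=$(read_le "$file" $((pcir + 0x15)) 1)   # bit 7 set: last image
        printf 'image at 0x%x: code type 0x%02x\n' "$base" "$type"
        if (( type == 3 )); then found=0; fi
        if (( (last & 0x80) || len == 0 )); then break; fi
        base=$(( base + len * 512 ))
    done
    return "$found"
}

if [[ $# -gt 0 ]]; then
    rom_has_efi_image "$1"
fi
```

Saved as a script (the name is up to you) and run with the ROM path as its only argument, it prints one line per image and exits with status 0 only if an image with code type 0x03 was found.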

Last edited by CryogEnix (2025-12-15 20:34:45)

Offline

#7 2025-12-16 11:16:07

Lone_Wolf
Administrator
From: Netherlands, Europe
Registered: 2005-10-04
Posts: 14,795

Re: Optimus PCI passthrough: internal error: Unknown PCI header type '127'

The reason I asked whether you had tried it in an Arch Linux VM is that nvidia doesn't like consumer cards being used in VMs and blocks that hard in their Windows driver.
Working around that requires special configuration.

Please try a linux VM to verify just the passthrough.
Can you disable the "vGPU that I've set to be automatically created on boot for Intel GVT-g" for testing ?


Offline

#8 2025-12-16 13:05:10

CryogEnix
Member
Registered: 2025-12-14
Posts: 10

Re: Optimus PCI passthrough: internal error: Unknown PCI header type '127'

Sure, I'll try that right away.

Last edited by CryogEnix (2025-12-16 14:01:28)

Offline

#9 2025-12-17 10:11:50

CryogEnix
Member
Registered: 2025-12-14
Posts: 10

Re: Optimus PCI passthrough: internal error: Unknown PCI header type '127'

When I run $ lspci -knn in the Arch VM, the dGPU does appear and, now that the driver is installed, it is using the nvidia driver. Still no output on the external monitor.

There are no issues in the guest's virtual console until I launch a graphical session. Then mouse input becomes severely laggy (it updates every 2-3 seconds) whenever I interact with graphical elements or move the mouse too quickly. That happens even with my USB device redirected to the VM, all devices set to virtio, the disk cache set to writeback, and the CPU set to host-passthrough with pinned cores. The cursor does not appear in the VM, but I can see its effect when hovering over buttons; most of them trigger the lag in full force and some do not work at all.
Typing in a terminal emulator like Kitty also has input lag, which causes key presses to be doubled.
This seems to be caused by the storage device being a qcow2 image that I took from the official installation guide, but I'm still trying to find out why it happens on my machine when almost everyone I see online has no such issues with this format.
For one thing, it never happens in my Windows guest, whose storage is raw / bare metal. I even made sure not to put the qcow2 file on a btrfs filesystem.

Last edited by CryogEnix (2025-12-17 11:08:38)

Offline

#10 2025-12-17 11:45:31

Lone_Wolf
Administrator
From: Netherlands, Europe
Registered: 2005-10-04
Posts: 14,795

Re: Optimus PCI passthrough: internal error: Unknown PCI header type '127'

Please run as root/with root rights

# journalctl -b | curl -F 'file=@-' 0x0.st

on host and arch guest in that configuration .

The command will upload your system journal to a public hosting site and output a link. Post that link.

Last edited by Lone_Wolf (2025-12-17 11:45:45)


Offline

#11 2025-12-17 14:37:01

seth
Member
From: Don't DM me only for attention
Registered: 2012-09-03
Posts: 73,317

Re: Optimus PCI passthrough: internal error: Unknown PCI header type '127'

He posted a journal in https://bbs.archlinux.org/viewtopic.php … 1#p2277841 and that has vfio and nvidia in the initramfs.
I didn't read the thread, but is it your intention to conditionally forward the nvidia GPU at runtime?

If not and you want to unconditionally forward the nvidia GPU, remove the nvidia drivers and either remove the kms hook (you can put i915 into the MODULES array) or blacklist nouveau.
Then regenerate the initramfs.

If yes, see https://wiki.archlinux.org/title/PCI_pa … _device_ID

It is paramount to forward both the video and the audio device, but

options vfio_pci ids=10de:1c8c:1028:0870,10de:0fb9

seems wrong, where's the "1028:0870" coming from?

Offline

#12 2025-12-17 15:28:47

CryogEnix
Member
Registered: 2025-12-14
Posts: 10

Re: Optimus PCI passthrough: internal error: Unknown PCI header type '127'

Lone_Wolf wrote:

Please run as root/with root rights

# journalctl -b | curl -F 'file=@-' 0x0.st

on host and arch guest in that configuration .

(Guest):

Your IP address is blocked from uploading files.

Since registering on these forums, I've only sent 4 files to that pastebin service. I'll see if I can get a different one...
Here we go: https://termbin.com/zye0

Dec 17 11:03:44 archlinux systemd-modules-load[141]: Inserted module 'nvidia_modeset'
Dec 17 11:03:44 archlinux kernel: nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms  580.119.02  Mon Dec  8 07:37:54 UTC 2025
Dec 17 11:03:44 archlinux kernel: [drm] [nvidia-drm] [GPU ID 0x00000600] Loading driver
Dec 17 11:03:44 archlinux kernel: nvidia_uvm: module uses symbols nvUvmInterfaceDisableAccessCntr from proprietary module nvidia, inheriting taint.
Dec 17 11:03:44 archlinux kernel: ACPI Warning: \_SB.PCI0.S14.S00._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20250807/nsarguments-61)
Dec 17 11:03:44 archlinux kernel: NVRM: GPU 0000:06:00.0: Failed to copy vbios to system memory.
Dec 17 11:03:44 archlinux kernel: NVRM: GPU 0000:06:00.0: RmInitAdapter failed! (0x30:0xffff:1116)
Dec 17 11:03:44 archlinux kernel: NVRM: GPU 0000:06:00.0: rm_init_adapter failed, device minor number 0
Dec 17 11:03:44 archlinux kernel: [drm:nv_drm_dev_load [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000600] Failed to allocate NvKmsKapiDevice
Dec 17 11:04:07 Pebshka gnome-shell[674]: Running GNOME Shell (using mutter 49.2) as a Wayland display server
Dec 17 11:04:07 Pebshka rtkit-daemon[604]: Successfully made thread 701 of process 674 owned by '1000' high priority at nice level -15.
Dec 17 11:04:07 Pebshka gnome-shell[674]: Thread 'KMS thread' will be using high priority scheduling
Dec 17 11:04:07 Pebshka gnome-shell[674]: DRM_IOCTL_I915_GEM_APERTURE failed: Invalid argument
Dec 17 11:04:07 Pebshka gnome-shell[674]: Assuming 131072kB available aperture size.
Dec 17 11:04:07 Pebshka gnome-shell[674]: May lead to reduced performance or incorrect rendering.
Dec 17 11:04:07 Pebshka gnome-shell[674]: get chip id failed: -1 [38]
Dec 17 11:04:07 Pebshka gnome-shell[674]: param: 4, val: 0
Dec 17 11:04:07 Pebshka gnome-shell[674]: i915 does not support EXECBUFER2
Dec 17 11:04:07 Pebshka gnome-shell[674]: [intel_init_bufmgr:1028] Error initializing buffer manager.
Dec 17 11:04:07 Pebshka kernel: NVRM: GPU 0000:06:00.0: Failed to copy vbios to system memory.
Dec 17 11:04:07 Pebshka kernel: NVRM: GPU 0000:06:00.0: RmInitAdapter failed! (0x30:0xffff:1116)
Dec 17 11:04:07 Pebshka kernel: NVRM: GPU 0000:06:00.0: rm_init_adapter failed, device minor number 0
Dec 17 11:04:07 Pebshka kernel: NVRM: GPU 0000:06:00.0: Failed to copy vbios to system memory.
Dec 17 11:04:07 Pebshka kernel: NVRM: GPU 0000:06:00.0: RmInitAdapter failed! (0x30:0xffff:1116)
Dec 17 11:04:07 Pebshka kernel: NVRM: GPU 0000:06:00.0: rm_init_adapter failed, device minor number 0
Dec 17 11:04:07 Pebshka kernel: NVRM: GPU 0000:06:00.0: Failed to copy vbios to system memory.
Dec 17 11:04:07 Pebshka kernel: NVRM: GPU 0000:06:00.0: RmInitAdapter failed! (0x30:0xffff:1116)
Dec 17 11:04:07 Pebshka kernel: NVRM: GPU 0000:06:00.0: rm_init_adapter failed, device minor number 0
Dec 17 11:04:07 Pebshka kernel: NVRM: GPU 0000:06:00.0: Failed to copy vbios to system memory.
Dec 17 11:04:07 Pebshka kernel: NVRM: GPU 0000:06:00.0: RmInitAdapter failed! (0x30:0xffff:1116)
Dec 17 11:04:07 Pebshka kernel: NVRM: GPU 0000:06:00.0: rm_init_adapter failed, device minor number 0
Dec 17 11:04:07 Pebshka gnome-shell[674]: DRM_IOCTL_I915_GEM_APERTURE failed: Invalid argument
Dec 17 11:04:07 Pebshka gnome-shell[674]: Assuming 131072kB available aperture size.
Dec 17 11:04:07 Pebshka gnome-shell[674]: May lead to reduced performance or incorrect rendering.
Dec 17 11:04:07 Pebshka gnome-shell[674]: get chip id failed: -1 [38]
Dec 17 11:04:07 Pebshka gnome-shell[674]: param: 4, val: 0
Dec 17 11:04:07 Pebshka gnome-shell[674]: i915 does not support EXECBUFER2
Dec 17 11:04:07 Pebshka gnome-shell[674]: [intel_init_bufmgr:1028] Error initializing buffer manager.
Dec 17 11:04:07 Pebshka gnome-shell[674]: libEGL warning: DRI2: failed to create dri screen
Dec 17 11:04:07 Pebshka gnome-shell[674]: DRM_IOCTL_I915_GEM_APERTURE failed: Invalid argument
Dec 17 11:04:07 Pebshka gnome-shell[674]: Assuming 131072kB available aperture size.
Dec 17 11:04:07 Pebshka gnome-shell[674]: May lead to reduced performance or incorrect rendering.
Dec 17 11:04:07 Pebshka gnome-shell[674]: get chip id failed: -1 [38]
Dec 17 11:04:07 Pebshka gnome-shell[674]: param: 4, val: 0
Dec 17 11:04:07 Pebshka gnome-shell[674]: i915 does not support EXECBUFER2
Dec 17 11:04:07 Pebshka gnome-shell[674]: [intel_init_bufmgr:1028] Error initializing buffer manager.
Dec 17 11:04:07 Pebshka gnome-shell[674]: libEGL warning: DRI2: failed to create dri screen
Dec 17 11:04:07 Pebshka gnome-shell[674]: Added device '/dev/dri/card1' (virtio_gpu) using atomic mode setting.
Dec 17 11:04:07 Pebshka gnome-shell[674]: Failed to initialize accelerated iGPU/dGPU framebuffer sharing: Not hardware accelerated
Dec 17 11:04:07 Pebshka gnome-shell[674]: Created gbm renderer for '/dev/dri/card1'
Dec 17 11:04:07 Pebshka gnome-shell[674]: Boot VGA GPU /dev/dri/card1 selected as primary

I've imported my .bashrc from a USB drive that has some of my host's variables... I'm thinking of switching to using another driver in the Arch guest to see how it goes, even if I don't give it any GPU, but that's outside the topic of this thread.
These lines are interesting to me:

Dec 17 11:04:07 Pebshka gnome-shell[674]: Added device '/dev/dri/card1' (virtio_gpu) using atomic mode setting.
Dec 17 11:04:07 Pebshka gnome-shell[674]: Failed to initialize accelerated iGPU/dGPU framebuffer sharing: Not hardware accelerated
Dec 17 11:04:07 Pebshka gnome-shell[674]: Created gbm renderer for '/dev/dri/card1'
Dec 17 11:04:07 Pebshka gnome-shell[674]: Boot VGA GPU /dev/dri/card1 selected as primary

Could it be that I can't use the dGPU without also passing the integrated one?
In the guest, this is NVIDIA's PCI address: 0000:06:00.0
While I have xorg-xwayland installed in there, I didn't get the xorg-server itself.

seth wrote:

I didn't read the thread, but is it your intention to conditionally forward the nvidia GPU at runtime?

I left the NVIDIA drivers in there in case I hit a dead end in this endeavor. They are now removed for simplicity's sake, except for the kms hook. Everything else you mentioned has already been done.
1028:0870 is the subsystem (<subsystem vendor ID>:<subsystem device ID>), which lspci -k reports as "Subsystem: Dell Device 0870". I added it because it was missing from the chip's "Hardware ID" in the Windows Device Manager, though from what I've seen it doesn't solve the problem of the NVIDIA installer not installing anything.
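
To make the fields of such an entry explicit, here is a small sketch that labels them. It assumes the field order given in the vfio-pci module's parameter description, vendor:device[:subvendor[:subdevice[:class[:class_mask]]]]; the function name is made up for illustration.

```shell
#!/bin/bash
# Sketch: label the colon-separated fields of one vfio-pci "ids=" entry.
explain_vfio_id() {
    local labels=(vendor device subvendor subdevice class class_mask)
    local fields i
    IFS=: read -r -a fields <<< "$1"
    for i in "${!fields[@]}"; do
        # Anything beyond the sixth field is not part of the format.
        printf '%s=%s\n' "${labels[i]:-extra}" "${fields[i]}"
    done
}

explain_vfio_id 10de:1c8c:1028:0870
```

For the entry above this prints vendor=10de, device=1c8c, subvendor=1028 and subdevice=0870, one per line. The last two values can be compared against the subsystem_vendor and subsystem_device files under /sys/bus/pci/devices/0000:01:00.0/ on the host.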
I hesitate to install drivers manually in the Windows guest in case it complicates debugging, as I've read that installation should happen automatically on boot if the NVIDIA chip is passed properly. Eventually I found out that my ROM doesn't even support UEFI (which I know is listed as a prerequisite in the Arch article), so I fetched one from TechPowerUp that does support it and is similar enough, then included it in the XML. Still no luck.

Last edited by CryogEnix (2025-12-18 17:30:39)

Offline

#13 2025-12-20 20:43:36

CryogEnix
Member
Registered: 2025-12-14
Posts: 10

Re: Optimus PCI passthrough: internal error: Unknown PCI header type '127'

I've noticed that I accidentally included my iGPU as a rendernode, so I removed it from the XML and changed variables to use nvidia instead.
https://termbin.com/s29kf/

Offline

#14 2025-12-22 14:45:41

CryogEnix
Member
Registered: 2025-12-14
Posts: 10

Re: Optimus PCI passthrough: internal error: Unknown PCI header type '127'

At this point, I think the dGPU is properly initialized on both of my hosts because the system BIOS contains the UEFI-compatible ROM that I need. I was not able to extract it from the BIOS update file, and the Windows registry didn't have it either. That would explain why no patch to the edk2 worked whenever I tried to pass the NVIDIA chip to a VM, even an Arch guest.
I can't just add a ROM to the GPU in the XML definition in my particular case because my machine is a laptop: its graphics components are configured differently from what the prerequisites assume (https://wiki.archlinux.org/title/PCI_pa … requisites).

Last edited by CryogEnix (2025-12-22 15:22:18)

Offline

#15 2025-12-23 08:45:41

Lone_Wolf
Administrator
From: Netherlands, Europe
Registered: 2005-10-04
Posts: 14,795

Re: Optimus PCI passthrough: internal error: Unknown PCI header type '127'

For clarity :
https://www.techpowerup.com/vgabios/?ar … +Ti&page=1 lists many variants of the GTX 1050 Ti.
Have you checked whether your model is in the list?


Offline

#16 2025-12-23 21:54:52

CryogEnix
Member
Registered: 2025-12-14
Posts: 10

Re: Optimus PCI passthrough: internal error: Unknown PCI header type '127'

Among the ones that share the same card vendor and brand, none combines UEFI compatibility with clock speeds, memory type vendor, power adjustment range and other parameters close to this: https://www.techpowerup.com/gpu-specs/g … bile.c2912.
I did try with an MSI one...right here: https://www.techpowerup.com/vgabios/193 … 096-170601, though I have not yet made a patch with it.

Last edited by CryogEnix (2025-12-23 22:27:31)

Offline
