
#1 2024-08-11 04:33:02

cloverskull
Member
Registered: 2018-09-30
Posts: 277

[SOLVED] Help with GPU passthrough

Hey friends, I have a laptop with dual GPUs: onboard Intel plus a discrete NVidia card.

I use Intel for my DE (Plasma 6, Wayland). I only use the NVidia GPU for gaming and CUDA work.

Here's my rEFInd boot stanza with the kernel command line:

menuentry "Arch ck" {
    icon     /EFI/refind/icons/os_arch.png
    volume   c12a7328-f81f-11d2-ba4b-00a0c93ec93b
    loader   vmlinuz-linux-ck-generic-v3
    initrd   initramfs-linux-ck-generic-v3.img
    options  "rd.luks.name=a93ae094-9596-43a8-991e-18ff7caf8cbd=root root=/dev/mapper/root rootflags=subvol=@ ro zswap.enabled=0 rootfstype=btrfs add_efi_memmap initrd=/intel-ucode.img mitigations=off nowatchdog modprobe.blacklist=iTCO_wdt nvidia_drm.modeset=1 nvidia_drm.fbdev=1 fsck.repair=yes"
}

Here's `mkinitcpio.conf`

MODULES=(btrfs)
BINARIES=(/usr/bin/btrfs)
FILES=()
HOOKS=(systemd keyboard autodetect microcode modconf sd-vconsole block sd-encrypt)

I didn't do anything specifically for IOMMU, but here's the result of `sudo dmesg | grep -i -e DMAR -e IOMMU`

[    0.004604] ACPI: DMAR 0x000000005F72B000 000088 (v02 INTEL  Dell Inc 00000002      01000013)
[    0.004629] ACPI: Reserving DMAR table memory at [mem 0x5f72b000-0x5f72b087]
[    0.101046] DMAR: Host address width 39
[    0.101047] DMAR: DRHD base: 0x000000fed90000 flags: 0x0
[    0.101050] DMAR: dmar0: reg_base_addr fed90000 ver 4:0 cap 1c0000c40660462 ecap 29a00f0505e
[    0.101051] DMAR: DRHD base: 0x000000fed91000 flags: 0x1
[    0.101054] DMAR: dmar1: reg_base_addr fed91000 ver 5:0 cap d2008c40660462 ecap f050da
[    0.101055] DMAR: RMRR base: 0x0000006c000000 end: 0x000000707fffff
[    0.101058] DMAR-IR: IOAPIC id 2 under DRHD base  0xfed91000 IOMMU 1
[    0.101058] DMAR-IR: HPET id 0 under DRHD base 0xfed91000
[    0.101059] DMAR-IR: Queued invalidation will be enabled to support x2apic and Intr-remapping.
[    0.102594] DMAR-IR: Enabled IRQ remapping in x2apic mode
[    0.537919] pci 0000:00:02.0: DMAR: Skip IOMMU disabling for graphics
[    0.936218] iommu: Default domain type: Translated
[    0.936218] iommu: DMA domain TLB invalidation policy: lazy mode
[    0.970902] DMAR: Intel-IOMMU force enabled due to platform opt in
[    0.970908] DMAR: No ATSR found
[    0.970908] DMAR: No SATC found
[    0.970909] DMAR: IOMMU feature fl1gp_support inconsistent
[    0.970910] DMAR: IOMMU feature pgsel_inv inconsistent
[    0.970910] DMAR: IOMMU feature nwfs inconsistent
[    0.970911] DMAR: IOMMU feature dit inconsistent
[    0.970912] DMAR: IOMMU feature sc_support inconsistent
[    0.970912] DMAR: IOMMU feature dev_iotlb_support inconsistent
[    0.970913] DMAR: dmar0: Using Queued invalidation
[    0.970916] DMAR: dmar1: Using Queued invalidation
[    0.971124] pci 0000:00:02.0: Adding to iommu group 0
[    0.971153] pci 0000:00:00.0: Adding to iommu group 1
[    0.971165] pci 0000:00:01.0: Adding to iommu group 2
[    0.971172] pci 0000:00:04.0: Adding to iommu group 3
[    0.971181] pci 0000:00:06.0: Adding to iommu group 4
[    0.971190] pci 0000:00:07.0: Adding to iommu group 5
[    0.971197] pci 0000:00:07.1: Adding to iommu group 6
[    0.971203] pci 0000:00:08.0: Adding to iommu group 7
[    0.971213] pci 0000:00:0a.0: Adding to iommu group 8
[    0.971224] pci 0000:00:0d.0: Adding to iommu group 9
[    0.971231] pci 0000:00:0d.2: Adding to iommu group 9
[    0.971238] pci 0000:00:0e.0: Adding to iommu group 10
[    0.971250] pci 0000:00:12.0: Adding to iommu group 11
[    0.971262] pci 0000:00:12.6: Adding to iommu group 11
[    0.971274] pci 0000:00:14.0: Adding to iommu group 12
[    0.971281] pci 0000:00:14.2: Adding to iommu group 12
[    0.971288] pci 0000:00:14.3: Adding to iommu group 13
[    0.971301] pci 0000:00:15.0: Adding to iommu group 14
[    0.971308] pci 0000:00:15.1: Adding to iommu group 14
[    0.971318] pci 0000:00:16.0: Adding to iommu group 15
[    0.971327] pci 0000:00:1c.0: Adding to iommu group 16
[    0.971345] pci 0000:00:1f.0: Adding to iommu group 17
[    0.971353] pci 0000:00:1f.3: Adding to iommu group 17
[    0.971361] pci 0000:00:1f.4: Adding to iommu group 17
[    0.971368] pci 0000:00:1f.5: Adding to iommu group 17
[    0.971376] pci 0000:01:00.0: Adding to iommu group 18
[    0.971386] pci 0000:a4:00.0: Adding to iommu group 19
[    0.971436] DMAR: Intel(R) Virtualization Technology for Directed I/O
[    1.235013] pci 10000:e0:06.0: Adding to iommu group 10
[    1.235284] pci 10000:e1:00.0: Adding to iommu group 10

IOMMU groups:

IOMMU Group 0:
        0000:00:02.0 VGA compatible controller [0300]: Intel Corporation Raptor Lake-P [Iris Xe Graphics] [8086:a7a0] (rev 04)
IOMMU Group 1:
        0000:00:00.0 Host bridge [0600]: Intel Corporation Raptor Lake-P 6p+8e cores Host Bridge/DRAM Controller [8086:a706]
IOMMU Group 2:
        0000:00:01.0 PCI bridge [0604]: Intel Corporation Raptor Lake PCI Express 5.0 Graphics Port (PEG010) [8086:a70d]
IOMMU Group 3:
        0000:00:04.0 Signal processing controller [1180]: Intel Corporation Raptor Lake Dynamic Platform and Thermal Framework Processor Participant [8086:a71d]
IOMMU Group 4:
        0000:00:06.0 System peripheral [0880]: Intel Corporation RST VMD Managed Controller [8086:09ab]
IOMMU Group 5:
        0000:00:07.0 PCI bridge [0604]: Intel Corporation Raptor Lake-P Thunderbolt 4 PCI Express Root Port #0 [8086:a76e]
IOMMU Group 6:
        0000:00:07.1 PCI bridge [0604]: Intel Corporation Device [8086:a73f]
IOMMU Group 7:
        0000:00:08.0 System peripheral [0880]: Intel Corporation GNA Scoring Accelerator module [8086:a74f]
IOMMU Group 8:
        0000:00:0a.0 Signal processing controller [1180]: Intel Corporation Raptor Lake Crashlog and Telemetry [8086:a77d] (rev 01)
IOMMU Group 9:
        0000:00:0d.0 USB controller [0c03]: Intel Corporation Raptor Lake-P Thunderbolt 4 USB Controller [8086:a71e]
        0000:00:0d.2 USB controller [0c03]: Intel Corporation Raptor Lake-P Thunderbolt 4 NHI #0 [8086:a73e]
IOMMU Group 10:
        0000:00:0e.0 RAID bus controller [0104]: Intel Corporation Volume Management Device NVMe RAID Controller Intel Corporation [8086:a77f]
        10000:e0:06.0 PCI bridge [0604]: Intel Corporation Raptor Lake PCIe 4.0 Graphics Port [8086:a74d]
        10000:e1:00.0 Non-Volatile memory controller [0108]: SK hynix Platinum P41/PC801 NVMe Solid State Drive [1c5c:1959]
IOMMU Group 11:
        0000:00:12.0 Serial controller [0700]: Intel Corporation Alder Lake-P Integrated Sensor Hub [8086:51fc] (rev 01)
        0000:00:12.6 Serial bus controller [0c80]: Intel Corporation Device [8086:51fb] (rev 01)
IOMMU Group 12:
        0000:00:14.0 USB controller [0c03]: Intel Corporation Alder Lake PCH USB 3.2 xHCI Host Controller [8086:51ed] (rev 01)
        0000:00:14.2 RAM memory [0500]: Intel Corporation Alder Lake PCH Shared SRAM [8086:51ef] (rev 01)
IOMMU Group 13:
        0000:00:14.3 Network controller [0280]: Intel Corporation Raptor Lake PCH CNVi WiFi [8086:51f1] (rev 01)
IOMMU Group 14:
        0000:00:15.0 Serial bus controller [0c80]: Intel Corporation Alder Lake PCH Serial IO I2C Controller #0 [8086:51e8] (rev 01)
        0000:00:15.1 Serial bus controller [0c80]: Intel Corporation Alder Lake PCH Serial IO I2C Controller #1 [8086:51e9] (rev 01)
IOMMU Group 15:
        0000:00:16.0 Communication controller [0780]: Intel Corporation Alder Lake PCH HECI Controller [8086:51e0] (rev 01)
IOMMU Group 16:
        0000:00:1c.0 PCI bridge [0604]: Intel Corporation Alder Lake-P PCH PCIe Root Port #4 [8086:51bb] (rev 01)
IOMMU Group 17:
        0000:00:1f.0 ISA bridge [0601]: Intel Corporation Raptor Lake LPC/eSPI Controller [8086:519d] (rev 01)
        0000:00:1f.3 Multimedia audio controller [0401]: Intel Corporation Raptor Lake-P/U/H cAVS [8086:51ca] (rev 01)
        0000:00:1f.4 SMBus [0c05]: Intel Corporation Alder Lake PCH-P SMBus Host Controller [8086:51a3] (rev 01)
        0000:00:1f.5 Serial bus controller [0c80]: Intel Corporation Alder Lake-P PCH SPI Controller [8086:51a4] (rev 01)
IOMMU Group 18:
        0000:01:00.0 3D controller [0302]: NVIDIA Corporation AD107M [GeForce RTX 4060 Max-Q / Mobile] [10de:28a0] (rev a1)
IOMMU Group 19:
        0000:a4:00.0 Unassigned class [ff00]: Realtek Semiconductor Co., Ltd. RTS5260 PCI Express Card Reader [10ec:5260] (rev 01)

So group 18 is what's relevant here. The only device in this group is the NVidia RTX 4060 (10de:28a0)
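
For anyone reproducing this, a group listing like the one above can be generated by walking `/sys/kernel/iommu_groups`. This is a minimal sketch, not necessarily the script used in this thread. The function name `list_iommu_groups` and its optional directory argument are made up for illustration, and it assumes `lspci` from pciutils is installed (it falls back to the bare PCI address otherwise).

```shell
#!/bin/sh
# Print each IOMMU group followed by the PCI devices it contains.
# The optional argument overrides the sysfs directory; it exists only so the
# function can be tried against a scratch directory (an assumption of this sketch).
list_iommu_groups() {
    local root group dev addr desc
    root="${1:-/sys/kernel/iommu_groups}"
    for group in $(ls "$root" 2>/dev/null | sort -n); do
        echo "IOMMU Group $group:"
        for dev in "$root/$group"/devices/*; do
            [ -e "$dev" ] || continue
            addr="${dev##*/}"
            desc=""
            # -D shows the PCI domain, -nn shows names plus [vendor:device] IDs,
            # -s restricts the output to this one address.
            if command -v lspci >/dev/null 2>&1; then
                desc="$(lspci -D -nn -s "$addr" 2>/dev/null)"
            fi
            printf '\t%s\n' "${desc:-$addr}"
        done
    done
}

# Usage: list_iommu_groups
```

A group that contains only the GPU, as group 18 does here, is the easy case for passthrough, since no other device has to be handed to the VM along with it.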

I tried manually setting vfio-pci.ids=10de:28a0 on the kernel command line, but when I used lsmod and lspci to check which driver the GPU was using, it was still nvidia. What am I missing here?
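
As a side note on checking the driver in use: the kernel exposes the bound driver as a `driver` symlink under the device's sysfs directory, so the check can be scripted. The sketch below is only an illustration. `bound_driver` and its optional sysfs-root argument are hypothetical names, and `0000:01:00.0` is the GPU's address from the listing above.

```shell
#!/bin/sh
# Print the name of the kernel driver currently bound to a PCI device,
# or "none" if no driver has claimed it.
# The optional second argument overrides /sys (for trying this on a scratch tree).
bound_driver() {
    local addr sysroot link
    addr="$1"
    sysroot="${2:-/sys}"
    link="$sysroot/bus/pci/devices/$addr/driver"
    if [ -L "$link" ]; then
        # The symlink points at .../bus/pci/drivers/<name>; keep only <name>.
        basename "$(readlink "$link")"
    else
        echo "none"
    fi
}

# Usage: bound_driver 0000:01:00.0
# One-off equivalent with pciutils: lspci -nnk -d 10de:28a0
```

When passthrough is set up as intended, this should report `vfio-pci` rather than `nvidia` for the GPU.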

Note that it is important to me to be able to use the NVidia GPU in Linux on demand, but also to let a VM own the GPU while the VM is running.

Appreciate the help!

Edit - solved.

Here's my functioning mkinitcpio.conf

MODULES=(vfio_pci vfio vfio_iommu_type1 i915 btrfs)
BINARIES=(/usr/bin/btrfs)
FILES=()
HOOKS=(systemd keyboard autodetect microcode modconf kms sd-vconsole block sd-encrypt)

And the relevant part of my working kernel command line:

<snip> intel_iommu=on vfio-pci.ids=10de:28a0

I had a damn typo: I'd typed the letter 'o' in place of the final zero in the device ID (28ao instead of 28a0). Now everything works as expected.
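
A typo like this is easy to miss by eye, and the device simply stays on its normal driver. A quick sanity check of the IDs on the command line can catch it. The sketch below is an assumption-laden illustration: `check_vfio_ids` is a made-up helper name, and it only understands the plain vendor:device form (four hex digits each), not the longer subvendor or class forms that vfio-pci also accepts.

```shell
#!/bin/sh
# Check every vendor:device pair passed to vfio-pci.ids on a kernel command line.
# With no argument, the running kernel's command line is read from /proc/cmdline.
# Returns non-zero if any ID is not of the form xxxx:xxxx in hexadecimal.
check_vfio_ids() {
    local cmdline word id status hex4
    cmdline="${1:-$(cat /proc/cmdline)}"
    status=0
    hex4='[0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F]'
    for word in $cmdline; do
        case "$word" in
            # Dashes and underscores are interchangeable in module names here.
            vfio-pci.ids=*|vfio_pci.ids=*) ;;
            *) continue ;;
        esac
        # Turn the comma-separated value into separate words.
        for id in $(printf '%s\n' "${word#*=}" | tr ',' ' '); do
            case "$id" in
                $hex4:$hex4) echo "ok:  $id" ;;
                *) echo "bad: $id (expected four hex digits, a colon, four hex digits)"
                   status=1 ;;
            esac
        done
    done
    return "$status"
}

# Usage: check_vfio_ids            (checks the running kernel)
#        check_vfio_ids "intel_iommu=on vfio-pci.ids=10de:28ao"
```

Run against the command line from post #2, this flags `10de:28ao` as bad because `o` is not a hexadecimal digit.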

It would be neat to be able to hot-swap the kernel driver between nvidia and vfio-pci. That way I could use the NVidia GPU in Linux with the nvidia driver, then switch to vfio-pci just before launching the VM I want to pass the GPU to. Alas, it looks like I need two separate boot stanzas in rEFInd: one using the nvidia driver, one using vfio.
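
For reference, the kernel does offer a generic sysfs interface for rebinding a PCI device at runtime (`driver_override`, `unbind`, `drivers_probe`). Whether the nvidia driver releases the GPU cleanly on this laptop was not tested in this thread, so treat the following as an untested sketch. `rebind_pci_device` and its optional sysfs-root argument are hypothetical names, it must run as root, and nothing may be using the GPU at the time.

```shell
#!/bin/sh
# Rebind one PCI device to a named driver through sysfs.
#   $1 = PCI address (e.g. 0000:01:00.0), $2 = driver name (e.g. vfio-pci)
#   $3 = sysfs root, defaults to /sys (overridable for dry runs on a scratch tree)
rebind_pci_device() {
    local addr driver sysroot dev
    addr="$1"
    driver="$2"
    sysroot="${3:-/sys}"
    dev="$sysroot/bus/pci/devices/$addr"
    [ -d "$dev" ] || { echo "no such device: $addr" >&2; return 1; }
    # Tell the PCI core which driver is allowed to claim this device next.
    echo "$driver" > "$dev/driver_override"
    # Detach whatever driver currently holds the device, if any.
    if [ -e "$dev/driver/unbind" ]; then
        echo "$addr" > "$dev/driver/unbind"
    fi
    # Ask the PCI core to probe the device again, honouring driver_override.
    echo "$addr" > "$sysroot/bus/pci/drivers_probe"
}

# Usage (as root): rebind_pci_device 0000:01:00.0 vfio-pci
#                  rebind_pci_device 0000:01:00.0 nvidia
```

The target driver's module has to be loaded already for the probe step to bind anything.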

Last edited by cloverskull (2024-08-11 06:27:24)


#2 2024-08-11 04:57:46

cloverskull
Member
Registered: 2018-09-30
Posts: 277


Updated my kernel command line:

    options  "rd.luks.name=a93ae094-9596-43a8-991e-18ff7caf8cbd=root root=/dev/mapper/root rootflags=subvol=@ ro zswap.enabled=0 rootfstype=btrfs add_efi_memmap initrd=/intel-ucode.img mitigations=off nowatchdog modprobe.blacklist=iTCO_wdt fsck.repair=yes intel_iommu=on vfio-pci.ids=10de:28ao"

Removed nvidia_drm.modeset=1 nvidia_drm.fbdev=1 and added intel_iommu=on vfio-pci.ids=10de:28ao

Updated mkinitcpio.conf:

MODULES=(vfio_pci vfio vfio_iommu_type1 i915 btrfs)
BINARIES=(/usr/bin/btrfs)
FILES=()
HOOKS=(systemd keyboard autodetect microcode modconf kms sd-vconsole block sd-encrypt)

The changes were to add the vfio modules, add the i915 module, and put kms into my hooks.

`sudo dmesg | grep -i -e DMAR -e IOMMU`

[    0.000000] Command line: rd.luks.name=a93ae094-9596-43a8-991e-18ff7caf8cbd=root root=/dev/mapper/root rootflags=subvol=@ ro zswap.enabled=0 rootfstype=btrfs add_efi_memmap initrd=/intel-ucode.img mitigations=off nowatchdog modprobe.blacklist=iTCO_wdt fsck.repair=yes intel_iommu=on vfio-pci.ids=10de:28ao initrd=initramfs-linux-ck-generic-v3.img
[    0.004629] ACPI: DMAR 0x000000005F72B000 000088 (v02 INTEL  Dell Inc 00000002      01000013)
[    0.004654] ACPI: Reserving DMAR table memory at [mem 0x5f72b000-0x5f72b087]
[    0.046495] Kernel command line: rd.luks.name=a93ae094-9596-43a8-991e-18ff7caf8cbd=root root=/dev/mapper/root rootflags=subvol=@ ro zswap.enabled=0 rootfstype=btrfs add_efi_memmap initrd=/intel-ucode.img mitigations=off nowatchdog modprobe.blacklist=iTCO_wdt fsck.repair=yes intel_iommu=on vfio-pci.ids=10de:28ao initrd=initramfs-linux-ck-generic-v3.img
[    0.046556] DMAR: IOMMU enabled
[    0.100889] DMAR: Host address width 39
[    0.100890] DMAR: DRHD base: 0x000000fed90000 flags: 0x0
[    0.100894] DMAR: dmar0: reg_base_addr fed90000 ver 4:0 cap 1c0000c40660462 ecap 29a00f0505e
[    0.100895] DMAR: DRHD base: 0x000000fed91000 flags: 0x1
[    0.100898] DMAR: dmar1: reg_base_addr fed91000 ver 5:0 cap d2008c40660462 ecap f050da
[    0.100900] DMAR: RMRR base: 0x0000006c000000 end: 0x000000707fffff
[    0.100902] DMAR-IR: IOAPIC id 2 under DRHD base  0xfed91000 IOMMU 1
[    0.100903] DMAR-IR: HPET id 0 under DRHD base 0xfed91000
[    0.100903] DMAR-IR: Queued invalidation will be enabled to support x2apic and Intr-remapping.
[    0.102425] DMAR-IR: Enabled IRQ remapping in x2apic mode
[    0.502744] pci 0000:00:02.0: DMAR: Skip IOMMU disabling for graphics
[    0.837010] iommu: Default domain type: Translated
[    0.837010] iommu: DMA domain TLB invalidation policy: lazy mode
[    0.869081] DMAR: No ATSR found
[    0.869082] DMAR: No SATC found
[    0.869083] DMAR: IOMMU feature fl1gp_support inconsistent
[    0.869083] DMAR: IOMMU feature pgsel_inv inconsistent
[    0.869084] DMAR: IOMMU feature nwfs inconsistent
[    0.869084] DMAR: IOMMU feature dit inconsistent
[    0.869085] DMAR: IOMMU feature sc_support inconsistent
[    0.869085] DMAR: IOMMU feature dev_iotlb_support inconsistent
[    0.869086] DMAR: dmar0: Using Queued invalidation
[    0.869089] DMAR: dmar1: Using Queued invalidation
[    0.869248] pci 0000:00:02.0: Adding to iommu group 0
[    0.869578] pci 0000:00:00.0: Adding to iommu group 1
[    0.869587] pci 0000:00:01.0: Adding to iommu group 2
[    0.869593] pci 0000:00:04.0: Adding to iommu group 3
[    0.869603] pci 0000:00:06.0: Adding to iommu group 4
[    0.869611] pci 0000:00:07.0: Adding to iommu group 5
[    0.869617] pci 0000:00:07.1: Adding to iommu group 6
[    0.869623] pci 0000:00:08.0: Adding to iommu group 7
[    0.869630] pci 0000:00:0a.0: Adding to iommu group 8
[    0.869640] pci 0000:00:0d.0: Adding to iommu group 9
[    0.869646] pci 0000:00:0d.2: Adding to iommu group 9
[    0.869652] pci 0000:00:0e.0: Adding to iommu group 10
[    0.869663] pci 0000:00:12.0: Adding to iommu group 11
[    0.869670] pci 0000:00:12.6: Adding to iommu group 11
[    0.869680] pci 0000:00:14.0: Adding to iommu group 12
[    0.869686] pci 0000:00:14.2: Adding to iommu group 12
[    0.869693] pci 0000:00:14.3: Adding to iommu group 13
[    0.869706] pci 0000:00:15.0: Adding to iommu group 14
[    0.869714] pci 0000:00:15.1: Adding to iommu group 14
[    0.869722] pci 0000:00:16.0: Adding to iommu group 15
[    0.869730] pci 0000:00:1c.0: Adding to iommu group 16
[    0.869745] pci 0000:00:1f.0: Adding to iommu group 17
[    0.869752] pci 0000:00:1f.3: Adding to iommu group 17
[    0.869759] pci 0000:00:1f.4: Adding to iommu group 17
[    0.869765] pci 0000:00:1f.5: Adding to iommu group 17
[    0.869772] pci 0000:01:00.0: Adding to iommu group 18
[    0.869781] pci 0000:a4:00.0: Adding to iommu group 19
[    0.871601] DMAR: Intel(R) Virtualization Technology for Directed I/O
[    1.127342] pci 10000:e0:06.0: Adding to iommu group 10
[    1.127580] pci 10000:e1:00.0: Adding to iommu group 10

I've also tried

❯ cat /etc/modprobe.d/vfio.conf 
softdep nvidia pre: vfio-pci

But the NVidia GPU is still not using the vfio driver. Is it possible that IOMMU isn't actually active on this computer? Help appreciated :)
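
On the IOMMU question: the log above already shows `DMAR: IOMMU enabled` and the "Adding to iommu group" lines, which suggests the IOMMU itself is working. The same thing can be checked without reading dmesg, because the kernel only populates `/sys/kernel/iommu_groups` when an IOMMU is in use. A minimal sketch, with `iommu_active` and its optional directory argument being illustrative names only:

```shell
#!/bin/sh
# Succeed (exit status 0) if at least one IOMMU group exists.
# The optional argument overrides the sysfs directory, for trying this elsewhere.
iommu_active() {
    local root
    root="${1:-/sys/kernel/iommu_groups}"
    # ls -A prints nothing for an empty or missing directory.
    [ -n "$(ls -A "$root" 2>/dev/null)" ]
}

# Usage:
#   if iommu_active; then echo "IOMMU groups present"; else echo "no IOMMU groups"; fi
```

If this succeeds but the GPU still binds to nvidia, the problem is more likely in how the device ID is passed to vfio-pci than in the IOMMU.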


#3 2024-08-11 14:53:55

Lone_Wolf
Administrator
From: Netherlands, Europe
Registered: 2005-10-04
Posts: 14,962


"...add the i915 module, and put kms into my hooks."

No need to have both present; you can remove the kms hook.



#4 2024-08-11 21:57:52

cloverskull
Member
Registered: 2018-09-30
Posts: 277


Thanks!
