
#1 2023-11-13 18:22:29

archusar
Member
Registered: 2023-11-13
Posts: 3

[SOLVED] VFIO Passthrough - Boot Stuck Loading Drivers

Hello everyone, I have recently installed Arch Linux and am happy with it with the exception of PCI passthrough not working for me. I know that it is user error somewhere, but I am unsure where to look next in the troubleshooting process.

The farthest I have gotten with the following setup on any kernel (linux, linux-zen, linux-lts) is the boot hanging at "running early hook [udev]" while loading the VFIO meta-driver.

Applicable Information:
dmesg outputs

# dmesg | grep -i -e DMAR -e IOMMU
[    0.000000] Command line: BOOT_IMAGE=/_active/rootvol/boot/vmlinuz-linux-lts root=UUID=f46f4719-8c41-41f4-a825-eadcd324db74 rw rootflags=subvol=_active/rootvol loglevel=8 amd_iommu=on iommu=pt vfio-pci.ids=1002:73a5,1002:73a5
[    0.040013] Kernel command line: BOOT_IMAGE=/_active/rootvol/boot/vmlinuz-linux-lts root=UUID=f46f4719-8c41-41f4-a825-eadcd324db74 rw rootflags=subvol=_active/rootvol loglevel=8 amd_iommu=on iommu=pt vfio-pci.ids=1002:73a5,1002:73a5
[    0.477910] iommu: Default domain type: Passthrough (set via kernel command line)
[    0.491724] pci 0000:00:00.2: AMD-Vi: IOMMU performance counters supported
[    0.491741] pci 0000:00:01.0: Adding to iommu group 0
[    0.491747] pci 0000:00:01.2: Adding to iommu group 1
[    0.491753] pci 0000:00:02.0: Adding to iommu group 2
[    0.491760] pci 0000:00:03.0: Adding to iommu group 3
[    0.491764] pci 0000:00:03.1: Adding to iommu group 4
[    0.491770] pci 0000:00:04.0: Adding to iommu group 5
[    0.491776] pci 0000:00:05.0: Adding to iommu group 6
[    0.491782] pci 0000:00:07.0: Adding to iommu group 7
[    0.491788] pci 0000:00:07.1: Adding to iommu group 8
[    0.491794] pci 0000:00:08.0: Adding to iommu group 9
[    0.491799] pci 0000:00:08.1: Adding to iommu group 10
[    0.491806] pci 0000:00:14.0: Adding to iommu group 11
[    0.491810] pci 0000:00:14.3: Adding to iommu group 11
[    0.491824] pci 0000:00:18.0: Adding to iommu group 12
[    0.491828] pci 0000:00:18.1: Adding to iommu group 12
[    0.491832] pci 0000:00:18.2: Adding to iommu group 12
[    0.491837] pci 0000:00:18.3: Adding to iommu group 12
[    0.491841] pci 0000:00:18.4: Adding to iommu group 12
[    0.491845] pci 0000:00:18.5: Adding to iommu group 12
[    0.491849] pci 0000:00:18.6: Adding to iommu group 12
[    0.491853] pci 0000:00:18.7: Adding to iommu group 12
[    0.491862] pci 0000:01:00.0: Adding to iommu group 13
[    0.491867] pci 0000:01:00.1: Adding to iommu group 13
[    0.491872] pci 0000:01:00.2: Adding to iommu group 13
[    0.491875] pci 0000:02:00.0: Adding to iommu group 13
[    0.491877] pci 0000:02:04.0: Adding to iommu group 13
[    0.491880] pci 0000:02:08.0: Adding to iommu group 13
[    0.491882] pci 0000:03:00.0: Adding to iommu group 13
[    0.491885] pci 0000:03:00.1: Adding to iommu group 13
[    0.491888] pci 0000:04:00.0: Adding to iommu group 13
[    0.491891] pci 0000:05:00.0: Adding to iommu group 13
[    0.491897] pci 0000:06:00.0: Adding to iommu group 14
[    0.491902] pci 0000:07:00.0: Adding to iommu group 15
[    0.491910] pci 0000:08:00.0: Adding to iommu group 16
[    0.491918] pci 0000:08:00.1: Adding to iommu group 17
[    0.491923] pci 0000:09:00.0: Adding to iommu group 18
[    0.491929] pci 0000:0a:00.0: Adding to iommu group 19
[    0.491935] pci 0000:0a:00.1: Adding to iommu group 20
[    0.491940] pci 0000:0a:00.3: Adding to iommu group 21
[    0.491946] pci 0000:0a:00.4: Adding to iommu group 22
[    0.492190] pci 0000:00:00.2: AMD-Vi: Found IOMMU cap 0x40
[    0.492409] perf/amd_iommu: Detected AMD IOMMU #0 (2 banks, 4 counters/bank).
[    0.600125] AMD-Vi: AMD IOMMUv2 loaded and initialized

IOMMU group for guest GPU
IOMMU Group 16: 08:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21 [Radeon RX 6950 XT] [1002:73a5] (rev c0)
IOMMU Group 17: 08:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21/23 HDMI/DP Audio Controller [1002:ab28]
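
For readers who want to reproduce a group listing like the one above, here is a minimal sketch that walks the sysfs tree. The function name list_iommu_groups and its optional root argument are made up for illustration; the argument exists only so the function can be tried against a copy or mock of the tree.

```shell
# List every PCI device address per IOMMU group by walking sysfs.
# list_iommu_groups is a hypothetical helper name, not an existing tool.
list_iommu_groups() {
    local root="${1:-/sys/kernel/iommu_groups}"
    local group dev
    # Group directories are named by number; sort -n keeps 2 before 16.
    for group in $(ls "$root" | sort -n); do
        for dev in "$root/$group"/devices/*; do
            # Skip the unexpanded glob when a group has no devices.
            [ -e "$dev" ] || continue
            printf 'IOMMU Group %s: %s\n' "$group" "${dev##*/}"
        done
    done
}
```

Running list_iommu_groups with no argument reads the live tree. Each printed address can then be passed to lspci -nns to get the device name and IDs.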

GRUB EDIT:
GRUB_CMDLINE_LINUX_DEFAULT="loglevel=8 amd_iommu=on iommu=pt vfio-pci.ids=1002:73a5,1002:ab28"

updated using sudo grub-mkconfig -o /boot/grub/grub.cfg
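
Note that the booted command line in the dmesg output above lists 1002:73a5 twice, while the GRUB line lists 1002:73a5 and 1002:ab28. A small check like the following sketch can catch that kind of mismatch after a reboot. The function name missing_vfio_ids is hypothetical, and it only inspects the text it is given.

```shell
# Print each expected vendor:device ID that is NOT present in the
# vfio-pci.ids= parameter of a kernel command line.
# Usage: missing_vfio_ids "<cmdline>" id1 [id2 ...]
# missing_vfio_ids is a hypothetical helper name.
missing_vfio_ids() {
    local cmdline="$1" ids want
    shift
    # The kernel treats "-" and "_" in module parameter names alike,
    # so accept both spellings; keep the last occurrence.
    ids=$(printf '%s\n' "$cmdline" | tr ' ' '\n' \
          | sed -n 's/^vfio[-_]pci\.ids=//p' | tail -n 1)
    for want in "$@"; do
        case ",$ids," in
            *",$want,"*) ;;                 # ID is present
            *) printf '%s\n' "$want" ;;     # ID is missing
        esac
    done
}
```

A possible invocation on the running system is missing_vfio_ids "$(cat /proc/cmdline)" 1002:73a5 1002:ab28, where empty output means both IDs were passed to the kernel.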

/etc/mkinitcpio.conf changes:
MODULES=(vfio_pci vfio vfio_iommu_type1)
HOOKS=(base vfio udev autodetect modconf kms keyboard keymap consolefont block filesystems fsck grub-btrfs-overlayfs)

updated using sudo mkinitcpio -p linux-zen

Additional system info:
OS: Arch Linux x86_64
Host: B550 PG Velocita
Kernel: 6.6.1-zen1-1-zen
Shell: bash 5.2.15
DE: Xfce 4.18
WM: Xfwm4 WM
CPU: AMD Ryzen 9 5900X (24) @ 3.700GHz
GPU: AMD ATI FirePro W2100 (radeon, amdgpu drivers)
GPU: AMD ATI Radeon RX 6950 XT (amdgpu driver)
Memory: 6293MiB / 32015MiB

Things I have tried:
- Following PCI passthrough guide on Arch Wiki
- Installing linux-lts,linux-zen for easier troubleshooting if unable to boot
- Passing through just VGA card and not audio device
- Placing gpu drivers (amdgpu,radeon) before/after vfio modules in mkinitcpio.conf
- Trying wiki edits in linux and linux-zen kernels
- Updating system via pacman -Syu
- Binding device via kernel parameters and via /etc/modprobe.d/vfio.conf
- Following example from https://wiki.archlinux.org/title/PCI_pa … F/Examples section 1.22 (Blacklisting amdgpu driver), and 1.28 (softdep amdgpu pre: vfio-pci)
- Checking forums (https://bbs.archlinux.org/viewtopic.php?id=280512)
- This guide results in no change

I appreciate your time and support with this issue, please let me know if there is any additional information you may need or would like to point me towards.

SOLVED:

What I've done:
* Install drivers for W2100 GPU through https://wiki.archlinux.org/title/ATI, installing all packages through pacman and add XORG configuration
* Blacklist AMDGPU drivers in BIOS
* HDMI cable for W2100 GPU was unplugged, may have been the issue all along but cannot verify

File changes:

/etc/default/grub

GRUB_CMDLINE_LINUX_DEFAULT="module_blacklist=amdgpu loglevel=8 hugepagesz=1GB hugepages=3 vfio-pci.ids=1002:73a5,1002:ab28"
#sudo grub-mkconfig -o /boot/grub/grub.cfg

/etc/mkinitcpio.conf

MODULES=(vfio_pci vfio_iommu_type1 vfio)
HOOKS=(base udev autodetect modconf kms keyboard keymap consolefont block files filesystems fsck grub-btrfs-overlayfs)
#sudo mkinitcpio -p linux-zen


#lspci -knn

03:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Oland GL [FirePro W2100] [1002:6608]
    Subsystem: Hewlett-Packard Company Oland GL [FirePro W2100] [103c:2120]
    Kernel driver in use: radeon
    Kernel modules: radeon, amdgpu
08:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21 [Radeon RX 6950 XT] [1002:73a5] (rev c0)
    Subsystem: Tul Corporation / PowerColor Navi 21 [Radeon RX 6950 XT] [148c:2420]
    Kernel driver in use: vfio-pci
    Kernel modules: amdgpu
08:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21/23 HDMI/DP Audio Controller [1002:ab28]
    Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21/23 HDMI/DP Audio Controller [1002:ab28]
    Kernel driver in use: vfio-pci
    Kernel modules: snd_hda_intel
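
To check the binding without reading the whole listing, the output of lspci -knn can be filtered down to the devices of interest. This is a rough sketch; drivers_in_use is a hypothetical helper that reads the listing on stdin and takes the vendor:device IDs as arguments.

```shell
# Read `lspci -knn` output on stdin and print "<address> <driver>" for every
# device whose header line carries one of the given [vendor:device] IDs.
# drivers_in_use is a hypothetical helper name.
drivers_in_use() {
    awk -v ids="$*" '
        # Print the pending device if it was one of the requested IDs.
        function emit() {
            if (wanted) print addr, (drv == "" ? "(none)" : drv)
        }
        BEGIN { n = split(ids, want, " ") }
        # Device header lines start at column 0 with the PCI address;
        # detail lines (Subsystem, Kernel driver ...) are indented.
        /^[0-9a-f]/ {
            emit()
            addr = $1; drv = ""; wanted = 0
            for (i = 1; i <= n; i++)
                if (index($0, "[" want[i] "]") > 0) wanted = 1
            next
        }
        /Kernel driver in use:/ { drv = $NF }
        END { emit() }
    '
}
```

For the setup in this thread, lspci -knn | drivers_in_use 1002:73a5 1002:ab28 should show vfio-pci for both functions of the RX 6950 XT once the binding works.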


Thanks again for the support and troubleshooting assistance, the forums are very helpful.

Last edited by archusar (2023-11-17 01:39:05)


#2 2023-11-14 12:09:22

Lone_Wolf
Forum Moderator
From: Netherlands, Europe
Registered: 2005-10-04
Posts: 12,224

Re: [SOLVED] VFIO Passthrough - Boot Stuck Loading Drivers

amd_iommu=on iommu=pt

amd_iommu doesn't accept "on" as a value; only intel_iommu supports that.

IOMMU is used for much more than virtualisation, and iommu=pt limits IOMMU functionality severely. It should only be used if the host has issues without it.

Please remove both parameters for now.


You appear to have 2 PCIe video cards; which one is set as the primary / boot-up card in the (BIOS) firmware?

Please post the full outputs of lspci -knn (as normal user) and the journal (run with root rights).

Welcome to archlinux forums.


Disliking systemd intensely, but not satisfied with alternatives so focusing on taming systemd.


(A works at time B)  && (time C > time B ) ≠  (A works at time C)


#3 2023-11-16 01:31:10

archusar
Member
Registered: 2023-11-13
Posts: 3

Re: [SOLVED] VFIO Passthrough - Boot Stuck Loading Drivers

Apologies for the delay,

I have removed the parameters as requested and have been able to boot in with dmesg showing proper vfio drivers loaded, but lspci -knn will always display amdgpu controlling the graphics card instead of vfio. I am not sure if there is a way to set a primary gpu in my uefi.

The requested information is available here in file format due to pastebin limitations.


#4 2023-11-16 09:13:15

seth
Member
Registered: 2012-09-03
Posts: 53,661

Re: [SOLVED] VFIO Passthrough - Boot Stuck Loading Drivers

That's a 20MB file from a reCAPTCHA-gated service, including your complete journal since the dawn of time.
Please post your complete system journal for the boot:

sudo journalctl -b | curl -F 'file=@-' 0x0.st

I'll just guess that the amdgpu module is in the initramfs and vfio is not, or the order is wrong and so amdgpu loads first and gets the device before vfio has a chance.
https://wiki.archlinux.org/title/PCI_pa … #initramfs
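
The ordering guess above can be checked mechanically against the MODULES line from /etc/mkinitcpio.conf. A minimal sketch, with vfio_loads_first as a hypothetical helper name: it only looks at the MODULES array it is given, so GPU drivers pulled in by hooks (for example kms) are outside what it can see.

```shell
# Check the order of a MODULES=(...) line from mkinitcpio.conf.
# Exit status: 0 if vfio_pci comes before the GPU driver (or the GPU
# driver is not listed), 1 if the GPU driver comes first, 2 if vfio_pci
# is not listed at all.
# Usage: vfio_loads_first 'MODULES=(...)' [gpu_module]   (default: amdgpu)
# vfio_loads_first is a hypothetical helper name.
vfio_loads_first() {
    local line="$1" gpu="${2:-amdgpu}" mod
    # Keep only what is between the parentheses.
    line=${line#*\(}
    line=${line%%\)*}
    for mod in $line; do
        case $mod in
            vfio_pci|vfio-pci) return 0 ;;   # vfio_pci seen first
            "$gpu")            return 1 ;;   # GPU driver seen first
        esac
    done
    return 2
}
```

One way to use it: vfio_loads_first "$(grep '^MODULES=' /etc/mkinitcpio.conf)" && echo "order looks fine". After changing the file, the initramfs for the kernel that is actually booted still has to be regenerated.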


#5 2023-11-16 09:47:59

Lone_Wolf
Forum Moderator
From: Netherlands, Europe
Registered: 2005-10-04
Posts: 12,224

Re: [SOLVED] VFIO Passthrough - Boot Stuck Loading Drivers

No need to upload the whole journal since first installation; the most recent boot would have been enough.

Check https://wiki.archlinux.org/title/System … ing_output for ways to filter/limit the journalctl output.

It looks like vfio is started after the RX 6950 XT has been initialised already by amdgpu.
The RX is also detected before the W2100 which suggests the RX is the primary card.
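
A quick way to see that ordering in a long journal is to print where each driver is first mentioned. The sketch below is only a text filter; first_mentions is a hypothetical helper name, and it matches the plain strings "vfio" and "amdgpu" case-insensitively rather than any specific kernel message.

```shell
# Read log text on stdin and print, in order of appearance, the line number
# at which "vfio" and "amdgpu" are first mentioned (case-insensitive).
# first_mentions is a hypothetical helper name.
first_mentions() {
    awk '
        { line = tolower($0) }
        !seen_vfio   && index(line, "vfio")   { seen_vfio = NR;   print "vfio", NR }
        !seen_amdgpu && index(line, "amdgpu") { seen_amdgpu = NR; print "amdgpu", NR }
    '
}
```

For example, journalctl -b -k | first_mentions limits the input to kernel messages of the current boot. If amdgpu is printed first, it was mentioned in the log before vfio.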

I am not sure if there is a way to set a primary gpu in my uefi.

from https://pg.asrock.com/mb/AMD/B550%20PG% … cification

 2 x PCI Express x16 Slots (PCIE1: Gen4x16 mode; PCIE3: Gen3 x4 mode)*

In case the firmware doesn't allow choosing the primary PCIe card, MB manufacturers typically hardcode the highest-spec'ed slot as primary.

Is the RX placed in PCIE1 ?



#6 2023-11-17 00:36:01

archusar
Member
Registered: 2023-11-13
Posts: 3

Re: [SOLVED] VFIO Passthrough - Boot Stuck Loading Drivers

The journal command recommended by seth is posted here.

I've noticed that the times in the command output do not properly match my current timezone, not sure if that may be a part of the issue.

The RX 6950 XT is placed in PCIE1, per the motherboard manual.

Setting modules to MODULES=(vfio_pci vfio_iommu_type1 vfio amdgpu) has caused the same issue with the kernel being stuck loading VFIO meta-driver.

Thanks for your continued support, I truly appreciate it.


#7 2023-11-17 11:15:40

Lone_Wolf
Forum Moderator
From: Netherlands, Europe
Registered: 2005-10-04
Posts: 12,224

Re: [SOLVED] VFIO Passthrough - Boot Stuck Loading Drivers

Nov 17 00:26:06 archiedesktop systemd[1]: Starting User Login Management...
Nov 17 00:26:06 archiedesktop systemd[1]: TPM2 PCR Barrier (User) was skipped because of an unmet condition check (ConditionPathExists=/sys/firmware/efi/efivars/StubPcrKernelImage-4a67b082-0a4c-41cf-b6c7-440b29bb8c4f).
Nov 17 00:26:06 archiedesktop systemd[1]: Starting Insert vfio-pci driver...
Nov 17 00:26:06 archiedesktop systemd[1]: Take snapper snapshot of root on boot was skipped because of an unmet condition check (ConditionPathExists=/etc/snapper/configs/root).
Nov 17 00:26:06 archiedesktop dhcpcd[734]: dhcpcd-10.0.5 starting
Nov 17 00:26:06 archiedesktop dhcpcd[739]: DUID 00:01:00:01:2c:e0:28:22:9c:6b:00:11:74:4f
Nov 17 00:26:06 archiedesktop kernel: VFIO - User Level meta-driver version: 0.3
Nov 17 00:26:06 archiedesktop systemd[1]: Started D-Bus System Message Bus.
Nov 17 00:26:06 archiedesktop systemd[1]: Starting Network Manager...
Nov 17 00:26:06 archiedesktop systemd-logind[735]: New seat seat0.
Nov 17 00:26:06 archiedesktop systemd[1]: vfio-load.service: Deactivated successfully.

vfio is still loaded way too late, and you don't want linux to handle the RX card, so the amdgpu module should not be loaded at all.

Try

MODULES=(vfio_pci vfio_iommu_type1 vfio radeon)

for the timing issue you should probably start a different thread, include the output of $ ls -l /etc/localtime in it.

As seth pointed out OP had figured that out themselves, but only mentioned that in their first post in the thread, not in their last.

Last edited by Lone_Wolf (2023-11-19 11:14:20)



#8 2023-11-17 14:00:35

seth
Member
Registered: 2012-09-03
Posts: 53,661

Re: [SOLVED] VFIO Passthrough - Boot Stuck Loading Drivers

OP, in editing the OP, wrote:

SOLVED:

What I've done:
* Install drivers for W2100 GPU through https://wiki.archlinux.org/title/ATI, installing all packages through pacman and add XORG configuration
* Blacklist AMDGPU drivers in BIOS
* HDMI cable for W2100 GPU was unplugged, may have been the issue all along but cannot verify
