
#1 2018-12-22 13:13:07

fcontesse
Member
Registered: 2018-12-22
Posts: 3

[SOLVED] Gpu passthrough with 2 identical GPUs, can't load vfio-pci

Hello everyone,

I installed Arch recently and am in the process of setting up GPU passthrough.

Kernel version :

4.19.11-arch1-1-ARCH

I followed the steps in the wiki. The relevant IOMMU groups are below:

...
IOMMU Group 14 1d:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X] [1002:67df] (rev e7)
IOMMU Group 14 1d:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 580] [1002:aaf0]
IOMMU Group 15 1e:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X] [1002:67df] (rev e7)
IOMMU Group 15 1e:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 580] [1002:aaf0]
...
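
For readers reproducing this step, here is a minimal sketch of how such a listing can be produced, assuming the kernel exposes the groups under /sys/kernel/iommu_groups. The function name list_iommu_groups and its optional directory argument are illustrative and not taken from the wiki. The listing above also carries the lspci description of each device, which this sketch leaves out.

```shell
#!/bin/sh
# Print one "IOMMU Group <n> <pci-address>" line per device.
# The optional directory argument is an assumption added for illustration,
# so the function can be pointed at a test tree instead of real sysfs.
list_iommu_groups() {
    root="${1:-/sys/kernel/iommu_groups}"
    for dev in "$root"/*/devices/*; do
        [ -e "$dev" ] || continue      # glob matched nothing
        group="${dev%/devices/*}"      # strip "/devices/<address>"
        group="${group##*/}"           # keep only the group number
        printf 'IOMMU Group %s %s\n' "$group" "${dev##*/}"
    done
}
```

Calling list_iommu_groups with no argument reads the real sysfs tree. A GPU and its audio function sharing a group with nothing else, as in groups 14 and 15 above, is the layout passthrough needs.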

Since I have two identical GPUs, I can't select one by vendorid:modelid, so I went to this section and created the script shown there. That section also says that vfio-pci has been built into the kernel since 4.18.16, so a hook is needed. I created one as follows (modifying one I found here):

/etc/initcpio/install/vfio-pci

#!/bin/bash

build() {
    add_module vfio-pci
    add_module vfio_iommu_type1
    add_runscript
}

help() {
    echo "This hook binds the vfio-pci driver to specific devices by pci address (not vendor:device like pci-stub.ids= does)."
}

/etc/initcpio/hooks/vfio-pci

#!/bin/sh

run_hook(){
  for i in /sys/bus/pci/devices/*/boot_vga; do
    if [ "$(cat "$i")" -eq 0 ]; then
      GPU="${i%/boot_vga}"
      AUDIO="$(echo "$GPU" | sed -e "s/0$/1/")"
      echo "vfio-pci" > "$GPU/driver_override"
      if [ -d "$AUDIO" ]; then
        echo "vfio-pci" > "$AUDIO/driver_override"
      fi
    fi
  done

  modprobe -i vfio-pci
}
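
To make the selection logic of the hook above easier to follow, here is a dry-run sketch that only prints the devices that would receive the vfio-pci override and writes nothing. The function name list_override_targets and its directory argument are hypothetical. The sketch assumes that boot_vga reads 1 for the GPU the firmware used at boot and 0 for the other, and that the audio device is function .1 of the same slot.

```shell
#!/bin/sh
# Dry run: print the PCI addresses that would get driver_override=vfio-pci.
# The optional directory argument is an assumption for illustration, so a
# fake device tree can be used instead of /sys/bus/pci/devices.
list_override_targets() {
    root="${1:-/sys/bus/pci/devices}"
    for i in "$root"/*/boot_vga; do
        [ -r "$i" ] || continue        # glob matched nothing
        # Skip the boot GPU (boot_vga = 1); it stays with the host.
        if [ "$(cat "$i")" -eq 0 ]; then
            gpu="${i%/boot_vga}"
            # Function .0 is the GPU; .1 is assumed to be its audio device.
            audio="$(printf '%s\n' "$gpu" | sed -e 's/0$/1/')"
            printf '%s\n' "${gpu##*/}"
            if [ -d "$audio" ]; then
                printf '%s\n' "${audio##*/}"
            fi
        fi
    done
}
```

Running list_override_targets before rebooting shows which of the two identical cards the hook would pick, without touching any driver binding.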

/etc/mkinitcpio.conf

...
MODULES=(dm_mod dm_raid dm_crypt aes_x86_64 raid1 raid5 raid456)
...
HOOKS=(base udev autodetect modconf vfio-pci block filesystems keyboard systemd sd-vconsole fsck mdadm_udev sd-encrypt sd-lvm2 filesystems usr shutdown)
...

syslinux.cfg :

LABEL arch
    MENU LABEL Arch Linux
    LINUX ../vmlinuz-linux
    APPEND rd.luks.name=5887c45f-23f9-47fb-89a9-2c6f0e750514=cryptroot root=/dev/cryptvg/lvroot luks.options=discard amd_iommu=on iommu=pt debug
    INITRD ../initramfs-linux.img

sudo mkinitcpio -p linux

==> Building image from preset: /etc/mkinitcpio.d/linux.preset: 'default'
  -> -k /boot/vmlinuz-linux -c /etc/mkinitcpio.conf -g /boot/initramfs-linux.img
==> Starting build: 4.19.11-arch1-1-ARCH
  -> Running build hook: [base]
  -> Running build hook: [udev]
  -> Running build hook: [autodetect]
  -> Running build hook: [modconf]
  -> Running build hook: [vfio-pci]
  -> Running build hook: [block]
  -> Running build hook: [filesystems]
  -> Running build hook: [keyboard]
  -> Running build hook: [systemd]
  -> Running build hook: [sd-vconsole]
  -> Running build hook: [fsck]
  -> Running build hook: [mdadm_udev]
  -> Running build hook: [sd-encrypt]
  -> Running build hook: [sd-lvm2]
  -> Running build hook: [filesystems]
  -> Running build hook: [usr]
  -> Running build hook: [shutdown]
==> Generating module dependencies
==> Creating gzip-compressed initcpio image: /boot/initramfs-linux.img
==> Image generation successful

Unfortunately, after rebooting, the driver in use is still amdgpu:

...
1d:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X] [1002:67df] (rev e7)
        Subsystem: Sapphire Technology Limited Nitro+ Radeon RX 580 4GB [1da2:e366]
        Kernel driver in use: amdgpu
        Kernel modules: amdgpu
1d:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 580] [1002:aaf0]
        Subsystem: Sapphire Technology Limited Ellesmere [Radeon RX 580] [1da2:aaf0]
        Kernel driver in use: snd_hda_intel
        Kernel modules: snd_hda_intel
1e:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X] [1002:67df] (rev e7)
        Subsystem: Sapphire Technology Limited Nitro+ Radeon RX 580 4GB [1da2:e366]
        Kernel driver in use: amdgpu
        Kernel modules: amdgpu
1e:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 580] [1002:aaf0]
        Subsystem: Sapphire Technology Limited Ellesmere [Radeon RX 580] [1da2:aaf0]
        Kernel driver in use: snd_hda_intel
        Kernel modules: snd_hda_intel
...

Then I read this section, which says that vfio-pci is not built into the kernel. Not knowing which was right, I tried the installation as described here, but got the following error during boot:

journalctl -u systemd-modules-load.service -b
-- Logs begin at Sat 2018-12-15 18:52:00 CET, end at Sat 2018-12-22 14:08:00 CET. --
Dec 22 12:37:50 archlinux systemd[236]: systemd-modules-load.service: Executing: /usr/lib/systemd/systemd-modules-load
Dec 22 12:37:50 archlinux systemd-modules-load[236]: apply: /etc/modules-load.d/MODULES.conf
Dec 22 12:37:50 archlinux systemd-modules-load[236]: Loading module: dm_mod
Dec 22 12:37:50 archlinux systemd-modules-load[236]: Inserted module 'dm_mod'
Dec 22 12:37:50 archlinux systemd-modules-load[236]: Loading module: dm_raid
Dec 22 12:37:50 archlinux systemd-modules-load[236]: Inserted module 'dm_raid'
Dec 22 12:37:50 archlinux systemd-modules-load[236]: Loading module: dm_crypt
Dec 22 12:37:50 archlinux systemd-modules-load[236]: Inserted module 'dm_crypt'
Dec 22 12:37:50 archlinux systemd-modules-load[236]: Loading module: aes_x86_64
Dec 22 12:37:50 archlinux systemd-modules-load[236]: Module 'aes_x86_64' is already loaded
Dec 22 12:37:50 archlinux systemd-modules-load[236]: Loading module: raid1
Dec 22 12:37:50 archlinux systemd-modules-load[236]: Inserted module 'raid1'
Dec 22 12:37:50 archlinux systemd-modules-load[236]: Loading module: raid5
Dec 22 12:37:50 archlinux systemd-modules-load[236]: Module 'raid456' is already loaded
Dec 22 12:37:50 archlinux systemd-modules-load[236]: Loading module: raid456
Dec 22 12:37:50 archlinux systemd-modules-load[236]: Module 'raid456' is already loaded
Dec 22 12:37:50 archlinux systemd-modules-load[236]: Loading module: vfio-pci
Dec 22 12:37:50 archlinux systemd-modules-load[236]: sh: /usr/bin/vfio-pci-override.sh: not found
Dec 22 12:37:50 archlinux systemd-modules-load[236]: Error running install command for vfio_pci
Dec 22 12:37:50 archlinux systemd-modules-load[236]: Failed to insert module 'vfio_pci': Key has expired
Dec 22 12:37:50 archlinux systemd-modules-load[236]: Loading module: vfio_iommu_type1
Dec 22 12:37:50 archlinux systemd-modules-load[236]: Module 'vfio_iommu_type1' is already loaded
Dec 22 12:37:50 archlinux systemd[1]: systemd-modules-load.service: Child 236 belongs to systemd-modules-load.service.
Dec 22 12:37:50 archlinux systemd[1]: systemd-modules-load.service: Main process exited, code=exited, status=1/FAILURE
Dec 22 12:37:50 archlinux systemd[1]: systemd-modules-load.service: Failed with result 'exit-code'.
Dec 22 12:37:50 archlinux systemd[1]: systemd-modules-load.service: Changed start -> failed
Dec 22 12:37:50 archlinux systemd[1]: systemd-modules-load.service: Job systemd-modules-load.service/start finished, result=failed
Dec 22 12:37:50 archlinux systemd[1]: Failed to start Load Kernel Modules.
Dec 22 12:37:50 archlinux systemd[1]: systemd-modules-load.service: Unit entered failed state.
Dec 22 12:38:01 archlinux systemd[1]: systemd-modules-load.service: Changed dead -> failed
Dec 22 12:38:01 archws systemd[589]: systemd-modules-load.service: Executing: /usr/lib/systemd/systemd-modules-load
Dec 22 12:38:01 archws systemd-modules-load[589]: apply: /usr/lib/modules-load.d/bluez.conf
Dec 22 12:38:01 archws systemd-modules-load[589]: Loading module: crypto_user
Dec 22 12:38:01 archws systemd-modules-load[589]: Inserted module 'crypto_user'
Dec 22 12:38:01 archws systemd[1]: systemd-modules-load.service: Child 589 belongs to systemd-modules-load.service.
Dec 22 12:38:01 archws systemd[1]: systemd-modules-load.service: Main process exited, code=exited, status=0/SUCCESS
Dec 22 12:38:01 archws systemd[1]: systemd-modules-load.service: Changed start -> exited
Dec 22 12:38:01 archws systemd[1]: systemd-modules-load.service: Job systemd-modules-load.service/start finished, result=done

And the driver used by the GPU is still amdgpu.

I am not sure which method I should use, or what I'm doing wrong with the method that is supposed to work. If you need more information, don't hesitate to ask.

Thank you in advance for your answers.

Best regards.

Last edited by fcontesse (2018-12-22 14:28:42)


#2 2018-12-22 13:24:39

loqs
Member
Registered: 2014-03-06
Posts: 17,369


Welcome to the Arch Linux forums, fcontesse. vfio-pci is no longer built in.


#3 2018-12-22 13:34:54

fcontesse
Member
Registered: 2018-12-22
Posts: 3


loqs wrote:

Welcome to the Arch Linux forums, fcontesse. vfio-pci is no longer built in.

Hello loqs,

Thank you very much for your answer. I'm going to remove the hook and check what I did wrong with the method described here.

Best regards.


#4 2018-12-22 14:27:59

fcontesse
Member
Registered: 2018-12-22
Posts: 3


Hello,

I found what I did wrong: the script started with #!/bin/bash instead of #!/bin/sh, force of habit.
It's working now:

1d:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X] [1002:67df] (rev e7)
        Subsystem: Sapphire Technology Limited Nitro+ Radeon RX 580 4GB [1da2:e366]
        Kernel driver in use: amdgpu
        Kernel modules: amdgpu
1d:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 580] [1002:aaf0]
        Subsystem: Sapphire Technology Limited Ellesmere [Radeon RX 580] [1da2:aaf0]
        Kernel driver in use: snd_hda_intel
        Kernel modules: snd_hda_intel
1e:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X] [1002:67df] (rev e7)
        Subsystem: Sapphire Technology Limited Nitro+ Radeon RX 580 4GB [1da2:e366]
        Kernel driver in use: vfio-pci
        Kernel modules: amdgpu
1e:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 580] [1002:aaf0]
        Subsystem: Sapphire Technology Limited Ellesmere [Radeon RX 580] [1da2:aaf0]
        Kernel driver in use: vfio-pci
        Kernel modules: snd_hda_intel
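
A likely explanation for the "not found" message in the earlier log, stated here as an assumption rather than something confirmed in the thread: the install command ran in the early boot environment, where /bin/bash was not available, so the script's interpreter could not be started and the shell reported the script itself as not found. Here is a small sketch for checking a script's shebang against the environment it runs in. The helper names shebang_interpreter and check_interpreter are hypothetical.

```shell
#!/bin/sh
# Print the interpreter named on a script's "#!" line (nothing if absent).
shebang_interpreter() {
    # Look at the first line only: strip "#!" and any spaces after it,
    # then keep the first word (the interpreter path).
    sed -n '1{s/^#![[:space:]]*//p;q;}' "$1" | cut -d' ' -f1
}

# Return non-zero if the script names an interpreter that is not
# executable in the current environment.
check_interpreter() {
    interp="$(shebang_interpreter "$1")"
    if [ -n "$interp" ] && [ ! -x "$interp" ]; then
        printf '%s: interpreter %s is missing\n' "$1" "$interp"
        return 1
    fi
}
```

Such a check is only meaningful when run inside the environment that will execute the script, since a path like /bin/bash usually exists on the installed system even when it is missing from the initramfs.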

I'm going to mark the thread as solved.

Thank you again loqs for your quick answer.

Best regards.
