esmth wrote:
sudo su
bin/vfio-bind 0000:01:00:0 0000:01:00.1
./qemu-system-x86_64 -enable-kvm -M q35 -m 1024 -cpu host \
-smp 6,sockets=1,cores=6,threads=1 \
-bios /home/esmth/src/seabios/out/bios.bin -vga std \
-device ioh3420,bus=pcie.0,addr=1c.0,multifunction=on,port=1,chassis=1,id=root.1 \
-device vfio-pci,host=01:00.0,bus=root.1,addr=00.0,multifunction=on,x-vga=on \
-device vfio-pci,host=01:00.1,bus=root.1,addr=00.1 \
-drive file=/home/esmth/image.img,id=disk,format=raw -device ide-hd,bus=ide.0,drive=disk \
-drive file=/home/esmth/tmp/windows.iso,id=isocd -device ide-cd,bus=ide.1,drive=isocd \
-usb -usbdevice host:046d:c52b
if anyone can help, it'd be appreciated, thanks
Could you try it with
-vga none
I actually did run it with -vga none, but I copied the wrong command. I used -vga std to install Windows in the VM.
@esmth: I had a lot of similar troubles recently (similar to yours with sporadic freeze after start, freeze of VM only, etc...) and I found out that stability of VGA passthrough highly depends on which PCIe x16 slot I'm using. If you don't mind and if it is possible I would just experiment a bit with it.
dmesg | grep -e vfio then shows
[302278.633004] vfio_pin_pages: RLIMIT_MEMLOCK (65536) exceeded
[302278.633015] vfio_pin_pages: RLIMIT_MEMLOCK (65536) exceeded
Permissions for /dev/vfio/ are
crw-rw-rw- 1 root root 251, 0 May 18 22:48 1
crw-rw-rw- 1 root root 10, 196 May 18 22:45 vfio
qemu is running its processes as root and the acl in /etc/libvirt/qemu.conf is
cgroup_device_acl = [ "/dev/null", "/dev/full", "/dev/zero", "/dev/random", "/dev/urandom", "/dev/ptmx", "/dev/kvm", "/dev/kqemu", "/dev/rtc","/dev/hpet", "/dev/vfio/vfio", "/dev/vfio/1" ]
In addition, I've increased default memlock limits in /etc/security/limits.conf and confirmed with ulimit.
user@kvmhost-2:~> ulimit -l
1048576
user@kvmhost-2:~> ulimit -Hl
1048576
So I have no idea why dmesg still gives the RLIMIT_MEMLOCK 65k...
It does look like a permissions issue somewhere but I'm all out of ideas...
Last edited by siddharta (2014-05-22 15:04:30)
Finally I figured out how to start and boot my VM. Passthrough works fine, and performance is also good. The only problem, and the reason I originally turned to virtualization, is that I am not able to start two virtual machines with a dedicated graphics card each. I always get:
qemu-system-x86_64: -device vfio-pci,host=02:00.0,bus=root.1,addr=00.0,multifunction=on,x-vga=on: vfio: error opening /dev/vfio/1: Device or resource busy
qemu-system-x86_64: -device vfio-pci,host=02:00.0,bus=root.1,addr=00.0,multifunction=on,x-vga=on: vfio: failed to get group 1
qemu-system-x86_64: -device vfio-pci,host=02:00.0,bus=root.1,addr=00.0,multifunction=on,x-vga=on: Device initialization failed.
qemu-system-x86_64: -device vfio-pci,host=02:00.0,bus=root.1,addr=00.0,multifunction=on,x-vga=on: Device 'vfio-pci' could not be initialized
I figured out that all 3 of my cards are in the same iommu_group and therefore end up in the same vfio group (in my case it's group 1).
Doing some research, I even found the functions in the iommu.c part of the kernel in which iommu_groups are assigned.
But I can't manage to find a working solution to change this group.
Does anybody have an idea how I could solve this issue?
Help would be really appreciated, thanks
Google for "ACS override", apply the patch you find and enable it via kernel options. If one of the cards can be attached to a PCH root port (00:1c.*), the v3.15 kernel may help.
http://vfio.blogspot.com
Looking for a more open forum to discuss vfio related uses? Try https://www.redhat.com/mailman/listinfo/vfio-users
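Before deciding whether an ACS-related change is needed, it can help to see exactly which devices share a group. The following is a minimal sketch, assuming the kernel exposes groups under /sys/kernel/iommu_groups (the location the iommu_group symlinks in sysfs point to). The function name list_iommu_groups and its optional directory argument are illustrative only and not part of any existing tool.

```shell
# List each PCI device together with its IOMMU group number.
# The optional argument is the groups directory; it defaults to the usual
# sysfs location and exists only so the function can be tried on a copy.
list_iommu_groups() {
    local root="${1:-/sys/kernel/iommu_groups}"
    local dev group
    for dev in "$root"/*/devices/*; do
        [ -e "$dev" ] || continue      # nothing matched: no groups are exposed
        group="${dev%/devices/*}"      # strip "/devices/<address>"
        group="${group##*/}"           # keep only the group number
        echo "group $group: ${dev##*/}"
    done
}
```

Calling list_iommu_groups with no argument prints one line per device. Cards that appear under the same group number cannot be handed to different VMs at the same time, which matches the "Device or resource busy" error above.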
siddharta wrote:
dmesg | grep -e vfio then shows
[302278.633004] vfio_pin_pages: RLIMIT_MEMLOCK (65536) exceeded
[302278.633015] vfio_pin_pages: RLIMIT_MEMLOCK (65536) exceeded
Permissions for /dev/vfio/ are
crw-rw-rw- 1 root root 251, 0 May 18 22:48 1
crw-rw-rw- 1 root root 10, 196 May 18 22:45 vfio
qemu is running its processes as root and the acl in /etc/libvirt/qemu.conf is
cgroup_device_acl = [ "/dev/null", "/dev/full", "/dev/zero", "/dev/random", "/dev/urandom", "/dev/ptmx", "/dev/kvm", "/dev/kqemu", "/dev/rtc","/dev/hpet", "/dev/vfio/vfio", "/dev/vfio/1" ]
In addition, I've increased default memlock limits in /etc/security/limits.conf and confirmed with ulimit.
user@kvmhost-2:~> ulimit -l
1048576
user@kvmhost-2:~> ulimit -Hl
1048576
So I have no idea why dmesg still gives the RLIMIT_MEMLOCK 65k...
It does look like a permissions issue somewhere but I'm all out of ideas...
VFIO requires that the user has permission to lock the memory used by the guest. You can change the user libvirt uses to root to avoid the problem (/etc/libvirt/qemu.conf). You can also assign some other device to the VM using libvirt, then it will know to set the locked memory limit for the process appropriately.
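One way to see why raising the limit in limits.conf did not help is to read the limit of the running QEMU process itself, since ulimit only reports the limit of the shell it is typed into. The following is a minimal sketch, assuming a Linux /proc filesystem; memlock_from_limits is a hypothetical helper name. Note that bash's ulimit -l reports kilobytes, while /proc/<pid>/limits reports bytes.

```shell
# Print the soft limit, hard limit and unit of "Max locked memory"
# from a /proc/<pid>/limits file given as the first argument.
memlock_from_limits() {
    awk '/^Max locked memory/ { print $4, $5, $6 }' "$1"
}

# Hypothetical usage on a live host (pgrep -f matches against the command line):
#   for pid in $(pgrep -f qemu-system); do
#       memlock_from_limits "/proc/$pid/limits"
#   done
```

If the QEMU process started by libvirt still shows 65536 bytes here, the limits.conf change never reached it, which would be consistent with the dmesg lines quoted above.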
EDIT: Okay, so it seems I was wrong.
I can successfully run the test code if I set the host to run only on the intel processor. It does not matter which PCIe slot my passthrough card is in.
However, if I set my host to run on a dedicated card, I get no output from the other card running the test code.
Has anyone here successfully run a 2 card setup with 1 for the host and 1 for a passthrough? Or does anyone else have any intuition of what might be going on?
Last edited by Slabity (2014-05-22 23:00:52)
Are you using 2 nvidia cards on your host? If you are, you'll need to patch the nvidia drivers.
Last edited by nbhs (2014-05-22 23:28:55)
Chrissi wrote:
Finally I figured out how to start and boot my VM. Passthrough works fine, and performance is also good. The only problem, and the reason I originally turned to virtualization, is that I am not able to start two virtual machines with a dedicated graphics card each. I always get:
qemu-system-x86_64: -device vfio-pci,host=02:00.0,bus=root.1,addr=00.0,multifunction=on,x-vga=on: vfio: error opening /dev/vfio/1: Device or resource busy
qemu-system-x86_64: -device vfio-pci,host=02:00.0,bus=root.1,addr=00.0,multifunction=on,x-vga=on: vfio: failed to get group 1
qemu-system-x86_64: -device vfio-pci,host=02:00.0,bus=root.1,addr=00.0,multifunction=on,x-vga=on: Device initialization failed.
qemu-system-x86_64: -device vfio-pci,host=02:00.0,bus=root.1,addr=00.0,multifunction=on,x-vga=on: Device 'vfio-pci' could not be initialized
I figured out that all 3 of my cards are in the same iommu_group and therefore end up in the same vfio group (in my case it's group 1).
Doing some research, I even found the functions in the iommu.c part of the kernel in which iommu_groups are assigned.
But I can't manage to find a working solution to change this group.
Does anybody have an idea how I could solve this issue?
Help would be really appreciated, thanks
aw wrote:
Google for "ACS override", apply the patch you find and enable it via kernel options. If one of the cards can be attached to a PCH root port (00:1c.*), the v3.15 kernel may help.
You can get the ACS override patch from my linux-mainline build: download linux-mainline.tar.gz, unpack it, and the patch should be there.
Slabity wrote:EDIT: Okay, so it seems I was wrong.
I can successfully run the test code if I set the host to run only on the intel processor. It does not matter which PCIe slot my passthrough card is in.
However, if I set my host to run on a dedicated card, I get no output from the other card running the test code.
Has anyone here successfully run a 2 card setup with 1 for the host and 1 for a passthrough? Or does anyone else have any intuition of what might be going on?
Are you using 2 nvidia cards on your host?, because if you are you'll need to patch the nvidia drivers
No. I have one AMD and one Nvidia. I'm trying to let my host use the AMD card while I pass the Nvidia one to the guest.
$ lspci | grep VGA
00:02.0 VGA compatible controller: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor Integrated Graphics Controller (rev 06)
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Barts XT [Radeon HD 6870]
02:00.0 VGA compatible controller: NVIDIA Corporation GK110 [GeForce GTX 780] (rev a1)
I can pass through the card if I am running my host on integrated graphics. I can't seem to pass the card if I run the host on the AMD card, which is my current target.
Last edited by Slabity (2014-05-22 23:46:13)
nbhs wrote:Slabity wrote:EDIT: Okay, so it seems I was wrong.
I can successfully run the test code if I set the host to run only on the intel processor. It does not matter which PCIe slot my passthrough card is in.
However, if I set my host to run on a dedicated card, I get no output from the other card running the test code.
Has anyone here successfully run a 2 card setup with 1 for the host and 1 for a passthrough? Or does anyone else have any intuition of what might be going on?
Are you using 2 nvidia cards on your host?, because if you are you'll need to patch the nvidia drivers
No. I have one AMD and one Nvidia. I'm trying to let my host use the AMD card while I pass the Nvidia one to the guest.
$ lspci | grep VGA
00:02.0 VGA compatible controller: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor Integrated Graphics Controller (rev 06)
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Barts XT [Radeon HD 6870]
02:00.0 VGA compatible controller: NVIDIA Corporation GK110 [GeForce GTX 780] (rev a1)
I can pass through the card if I am running my host on integrated graphics. I can't seem to pass the card if I run the host on the AMD card, which is my current target.
Are you using the open source driver or fglrx?
Are you using the open source driver or fglrx?
Open source.
$ lspci -v
...
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Barts XT [Radeon HD 6870] (prog-if 00 [VGA controller])
Subsystem: Gigabyte Technology Co., Ltd Device 21fa
Flags: bus master, fast devsel, latency 0, IRQ 47
Memory at d0000000 (64-bit, prefetchable) [size=256M]
Memory at ef320000 (64-bit, non-prefetchable) [size=128K]
I/O ports at e000 [size=256]
Expansion ROM at ef300000 [disabled] [size=128K]
Capabilities: <access denied>
Kernel driver in use: radeon
Kernel modules: radeon
...
02:00.0 VGA compatible controller: NVIDIA Corporation GK110 [GeForce GTX 780] (rev a1) (prog-if 00 [VGA controller])
Subsystem: eVga.com. Corp. Device 2784
Flags: fast devsel, IRQ 10
Memory at ee000000 (32-bit, non-prefetchable) [disabled] [size=16M]
Memory at e0000000 (64-bit, prefetchable) [disabled] [size=128M]
Memory at e8000000 (64-bit, prefetchable) [disabled] [size=32M]
I/O ports at d000 [disabled] [size=128]
Expansion ROM at ef000000 [disabled] [size=512K]
Capabilities: <access denied>
Kernel driver in use: vfio-pci
Kernel modules: nouveau
...
Last edited by Slabity (2014-05-23 00:38:33)
Hello,
I am also trying to get PCI-Passthrough to work on my system and am having issues generating the test window. I've followed all of the tutorial steps to the best of my knowledge. When attempting to run the test window (sudo):
qemu-system-x86_64 -enable-kvm -M q35 -m 1024 -cpu host \
-smp 6,sockets=1,cores=6,threads=1 \
-bios /usr/share/qemu/bios.bin -vga none \
-device ioh3420,bus=pcie.0,addr=1c.0,multifunction=on,port=1,chassis=1,id=root.1 \
-device vfio-pci,host=01:00.0,bus=root.1,addr=00.0,multifunction=on,x-vga=on \
-device vfio-pci,host=01:00.1,bus=root.1,addr=00.1
I get the following errors:
qemu-system-x86_64: -device vfio-pci,host=01:00.0,bus=root.1,addr=00.0,multifunction=on,x-vga=on: vfio: error opening /dev/vfio/1: No such file or directory
qemu-system-x86_64: -device vfio-pci,host=01:00.0,bus=root.1,addr=00.0,multifunction=on,x-vga=on: vfio: failed to get group 1
qemu-system-x86_64: -device vfio-pci,host=01:00.0,bus=root.1,addr=00.0,multifunction=on,x-vga=on: Device initialization failed.
qemu-system-x86_64: -device vfio-pci,host=01:00.0,bus=root.1,addr=00.0,multifunction=on,x-vga=on: Device 'vfio-pci' could not be initialized
Some notes about my system:
MOBO: ASROCK Z77 (VT-d enabled)
CPU: i7-3770 (non-K)
HOST OS: Ubuntu 14.04 LTS
Primary GPU: On-Board Graphics
Secondary GPU: Sapphire Radeon 4770
and configuration:
~$ grep GRUB_CMDLINE /etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash intel_iommu=on vfio_iommu_type1.allow_unsafe_interrupts=1 pci-stub.ids=1002:xxxx,1002:yyyy"
GRUB_CMDLINE_LINUX=""
~$ cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-3.15.0-rc6 root=/dev/mapper/rocko--vg-root ro quiet splash intel_iommu=on pci-stub.ids=1002:xxxx,1002:yyyy vt.handoff=7
#NOTE: I compiled 3.15.0-rc6 after having the same issue with the stock Ubuntu 14.04-provided kernel. So it doesn't seem related.
#NOTE: I tried booting with and without 'intel_iommu=on' in grub.cfg
#NOTE: I can't seem to get "vfio_iommu_type1.allow_unsafe_interrupts=1" to show up in /proc/cmdline even though it's been added to /etc/default/grub, which is used to generate /boot/grub/grub.cfg in Ubuntu. I've run 'update-grub' and 'update-initramfs -u'. I also tried adding "options vfio_iommu_type1 allow_unsafe_interrupts=1" to "/etc/modprobe.d/vfio_iommu_type1.conf". I'm not sure whether this addition is necessary in my case anyway.
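To pin down which of the two mechanisms (kernel command line or modprobe.d) actually took effect, the booted command line and the loaded module's parameter can be checked separately. The following is a minimal sketch; cmdline_has is a hypothetical helper name, and the sysfs path in the usage comment assumes the module exports its parameter under /sys/module in the usual way.

```shell
# Succeed if the exact parameter (e.g. "intel_iommu=on") is one of the
# space-separated words of the kernel command line string in $1.
cmdline_has() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Hypothetical usage on a live system:
#   cmdline_has "$(cat /proc/cmdline)" intel_iommu=on && echo "IOMMU requested"
#   cat /sys/module/vfio_iommu_type1/parameters/allow_unsafe_interrupts
```

Matching whole words avoids a false positive such as "iommu=on" matching inside "intel_iommu=on". If the parameter is present in /etc/default/grub but absent from /proc/cmdline, the kernel was most likely booted from a menu entry that was not regenerated from that file.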
I'm using QEMU 2.0.0 and included the pci_stub, vfio, vfio_pci, vfio_iommu_type1, kvm, and kvm_intel modules. I also blacklisted 'radeon' and 'fglrx' on the host Ubuntu system and bound the Radeon 4770 to PCI-stub.
~$ qemu -version
QEMU emulator version 2.0.0 (Debian 2.0.0+dfsg-2ubuntu1), Copyright (c) 2003-2008 Fabrice Bellard
~$ sudo apt-get install seabios | grep seabios
seabios is already the newest version.
~$ lsmod | grep -e pci -e vfio -e kvm| tail -n 6
pci_stub 12622 1
vfio_pci 36474 0
vfio_iommu_type1 17632 0
kvm_intel 143255 0
kvm 446871 1 kvm_intel
vfio 25164 2 vfio_iommu_type1,vfio_pci
~$ cat /etc/modprobe.d/blacklist.conf | grep -e radeon -e fglrx
blacklist radeon
blacklist fglrx
~$ lspci | grep Advanced
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] RV740 PRO [Radeon HD 4770]
01:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] RV710/730 HDMI Audio [Radeon HD 4000 series]
~$ lspci -n | grep "01:00"
01:00.0 0300: 1002:xxxx
01:00.1 0403: 1002:yyyy
~$ dmesg | grep pci-stub
[ 0.000000] Command line: BOOT_IMAGE=/vmlinuz-3.15.0-rc6 root=/dev/mapper/rocko--vg-root ro quiet splash intel_iommu=on pci-stub.ids=1002:xxxx,1002:yyyy vt.handoff=7
[ 0.000000] Kernel command line: BOOT_IMAGE=/vmlinuz-3.15.0-rc6 root=/dev/mapper/rocko--vg-root ro quiet splash intel_iommu=on pci-stub.ids=1002:xxxx,1002:yyyy vt.handoff=7
[ 47.239570] pci-stub: add 1002:xxxx sub=FFFFFFFF:FFFFFFFF cls=00000000/00000000
[ 47.239584] pci-stub 0000:01:00.0: claimed by stub
[ 47.239590] pci-stub: add 1002:yyyy sub=FFFFFFFF:FFFFFFFF cls=00000000/00000000
[ 47.239597] pci-stub 0000:01:00.1: claimed by stub
~$ cat /etc/vfio-pci.cfg
DEVICES="0000:01:00.0 0000:01:00.1"
Note that I have created /usr/bin/vfio-bind and /etc/systemd/system/binds-vfio-pci.service with no customizations. I am not trying to bind usb ports yet, just the 2 radeon 'devices' (gpu+gpu audio component). From what I can tell there is supposed to be a vfio 'group' containing these devices at /dev/vfio/1, but there isn't:
~$ ls /dev/vfio/
vfio
I'm basing this on 2 of the errors and the output of readlink, per further investigation:
vfio: failed to get group 1
vfio: error opening /dev/vfio/1
~$ readlink /sys/bus/pci/devices/0000\:01\:00.0/iommu_group
../../../../kernel/iommu_groups/1
I apologize if I've spammed too much information here; I just want to make sure that I've included all relevant troubleshooting information. Let me know if I need to share something else or be clearer. I suspect the key to my issue is that '/dev/vfio/1' is never created. If so, I'm also not sure why it isn't being created. I trawled through these 70+ pages yesterday trying to find a solution but am not sure where I've gone wrong. I would appreciate any feedback.
Thanks in advance,
shelladept
I apologize if I've spammed too much information here; I just want to make sure that I've included all relevant troubleshooting information. Let me know if I need to share something else. Or if i need to be clearer. I suspect the key to my issue is the lack of '/dev/vfio/1' creation. If true, I'm also not sure why it isn't being created. I trawled through these 70+ pages yesterday trying to find a solution but am not sure where I've gone wrong. I would appreciate any feedback.
Thanks for the detailed report, more information is always better. What I don't see is any evidence that 1:00.0 and 1:00.1 are actually bound to vfio-pci, you've kind of skimmed over the vfio-bind scripts you're using. What does 'lspci -ks 1:' report for the driver in use? Your lsmod shows the reference count of vfio-pci is 0, so I don't think you've got any devices bound to it. The vfio group dev file is created when devices are bound to vfio-pci.
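The two things described above, which driver holds the device and whether the group's device node exists, can be read directly from sysfs and /dev. The following is a minimal sketch; vfio_status is a hypothetical helper name, and its optional second and third arguments (alternative sysfs and /dev roots) exist only so it can be tried against a copy of the tree.

```shell
# Report the driver bound to a PCI device and its IOMMU group, and return
# success only if the matching VFIO group node exists.
# Arguments: device address, optional sysfs root, optional /dev root.
vfio_status() {
    local dev="$1" sys="${2:-/sys}" devroot="${3:-/dev}"
    local path="$sys/bus/pci/devices/$dev"
    local driver=none group="" link
    if [ -L "$path/driver" ]; then          # symlink to the bound driver
        link=$(readlink "$path/driver")
        driver="${link##*/}"
    fi
    if [ -L "$path/iommu_group" ]; then     # symlink to the device's group
        link=$(readlink "$path/iommu_group")
        group="${link##*/}"
    fi
    echo "$dev driver=$driver group=${group:-none}"
    [ -n "$group" ] && [ -e "$devroot/vfio/$group" ]
}
```

For example, `vfio_status 0000:01:00.0 || echo "group node missing"` would report driver=pci-stub for the setup described above and then print the fallback message, since no device in the group is bound to vfio-pci yet.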
shelladept wrote:I apologize if I've spammed too much information here; I just want to make sure that I've included all relevant troubleshooting information. Let me know if I need to share something else. Or if i need to be clearer. I suspect the key to my issue is the lack of '/dev/vfio/1' creation. If true, I'm also not sure why it isn't being created. I trawled through these 70+ pages yesterday trying to find a solution but am not sure where I've gone wrong. I would appreciate any feedback.
Thanks for the detailed report, more information is always better. What I don't see is any evidence that 1:00.0 and 1:00.1 are actually bound to vfio-pci, you've kind of skimmed over the vfio-bind scripts you're using. What does 'lspci -ks 1:' report for the driver in use? Your lsmod shows the reference count of vfio-pci is 0, so I don't think you've got any devices bound to it. The vfio group dev file is created when devices are bound to vfio-pci.
Thank you for the quick reply! 'lspci -ks 1:' returns the following:
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] RV740 PRO [Radeon HD 4770]
Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Device 0d00
Kernel driver in use: pci-stub
01:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] RV710/730 HDMI Audio [Radeon HD 4000 series]
Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] RV710/730 HDMI Audio [Radeon HD 4000 series]
Kernel driver in use: pci-stub
If I understand what is going on here, 'pci-stub' has the 2 card 'devices' instead of vfio_pci? But why does it show only 1 in the lsmod output? Is this because PCI 'sees' one object while VFIO distinguishes between the 2 'devices'? Pardon my ignorance, I'm still figuring out exactly how VFIO does its magic. I also apologize for leaving out those particular details you mentioned. I was trying to keep my post as short as possible. To clear up any confusion, the verbatim contents of '/usr/bin/vfio-bind' are:
~$ cat /usr/bin/vfio-bind
#!/bin/bash
modprobe vfio-pci
for dev in "$@"; do
    vendor=$(cat /sys/bus/pci/devices/$dev/vendor)
    device=$(cat /sys/bus/pci/devices/$dev/device)
    if [ -e /sys/bus/pci/devices/$dev/driver ]; then
        echo $dev > /sys/bus/pci/devices/$dev/driver/unbind
    fi
    echo $vendor $device > /sys/bus/pci/drivers/vfio-pci/new_id
done
Now on further review I think the 'modprobe vfio-pci' line may be redundant since I included vfio-pci in my /etc/modules file:
~$ cat /etc/modules | grep -e vfio -e pci
vfio
vfio_pci
pci_stub
vfio_iommu_type1
'vfio-bind' is launched by systemd using '/etc/systemd/system/binds-vfio-pci.service' at boot:
~$ cat /etc/systemd/system/binds-vfio-pci.service
[Unit]
Description=Binds devices to vfio-pci
After=syslog.target
[Service]
EnvironmentFile=-/etc/vfio-pci.cfg
Type=oneshot
RemainAfterExit=yes
ExecStart=-/usr/bin/vfio-bind $DEVICES
[Install]
WantedBy=multi-user.target
and 'binds-vfio-pci.service' obtains the value of $DEVICES from '/etc/vfio-pci.cfg':
~$ cat /etc/vfio-pci.cfg
DEVICES="0000:01:00.0 0000:01:00.1"
At least that is what I believe is happening; I very well could be misinterpreting one of these files and would appreciate any clarification. Let me know if I can provide further details and thank you again!
shelladept
aw wrote:shelladept wrote:I apologize if I've spammed too much information here; I just want to make sure that I've included all relevant troubleshooting information. Let me know if I need to share something else. Or if i need to be clearer. I suspect the key to my issue is the lack of '/dev/vfio/1' creation. If true, I'm also not sure why it isn't being created. I trawled through these 70+ pages yesterday trying to find a solution but am not sure where I've gone wrong. I would appreciate any feedback.
Thanks for the detailed report, more information is always better. What I don't see is any evidence that 1:00.0 and 1:00.1 are actually bound to vfio-pci, you've kind of skimmed over the vfio-bind scripts you're using. What does 'lspci -ks 1:' report for the driver in use? Your lsmod shows the reference count of vfio-pci is 0, so I don't think you've got any devices bound to it. The vfio group dev file is created when devices are bound to vfio-pci.
Thank you for the quick reply! 'lspci -ks 1:' returns the following:
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] RV740 PRO [Radeon HD 4770]
Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Device 0d00
Kernel driver in use: pci-stub
01:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] RV710/730 HDMI Audio [Radeon HD 4000 series]
Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] RV710/730 HDMI Audio [Radeon HD 4000 series]
Kernel driver in use: pci-stub
...
At least that is what I believe is happening; I very well could be misinterpreting one of these files and would appreciate any clarification. Let me know if I can provide further details and thank you again!
Well, the devices are bound to pci-stub, so either something is wrong with the scripts or they aren't being run. So start with running them by hand...
/usr/bin/vfio-bind 0000:01:00.0 0000:01:00.1
Run lspci -k again: are the devices bound to vfio-pci? If so, is the service enabled in systemd? If not, is the script executable?
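The follow-up questions above can be turned into a short checklist. The following is a minimal sketch; diagnose_bind_script is a hypothetical helper name. It relies on the convention that the directory /run/systemd/system exists only when systemd is the running init system, which is worth checking because a unit file under /etc/systemd/system is never processed on a host that boots with a different init system.

```shell
# Explain why a boot-time bind helper might not have run.
# Arguments: path to the script, then an optional directory whose presence
# indicates that systemd is the running init system.
diagnose_bind_script() {
    local script="$1" rundir="${2:-/run/systemd/system}"
    if [ ! -e "$script" ]; then
        echo "missing: $script"
    elif [ ! -x "$script" ]; then
        echo "not executable: $script"
    elif [ ! -d "$rundir" ]; then
        echo "systemd is not the running init system"
    else
        echo "ok: check the unit with systemctl is-enabled next"
    fi
}
```

For example, `diagnose_bind_script /usr/bin/vfio-bind` followed by `systemctl is-enabled binds-vfio-pci.service` covers the script and the service in turn.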
shelladept wrote:aw wrote:Thanks for the detailed report, more information is always better. What I don't see is any evidence that 1:00.0 and 1:00.1 are actually bound to vfio-pci, you've kind of skimmed over the vfio-bind scripts you're using. What does 'lspci -ks 1:' report for the driver in use? Your lsmod shows the reference count of vfio-pci is 0, so I don't think you've got any devices bound to it. The vfio group dev file is created when devices are bound to vfio-pci.
Thank you for the quick reply! 'lspci -ks 1:' returns the following:
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] RV740 PRO [Radeon HD 4770]
Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Device 0d00
Kernel driver in use: pci-stub
01:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] RV710/730 HDMI Audio [Radeon HD 4000 series]
Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] RV710/730 HDMI Audio [Radeon HD 4000 series]
Kernel driver in use: pci-stub
...
At least that is what I believe is happening; I very well could be misinterpreting one of these files and would appreciate any clarification. Let me know if I can provide further details and thank you again!
Well, the device are bound to pci-stub, so either something is wrong with the scripts or they aren't being run. So start with running them by hand...
/usr/bin/vfio-bind 0000:01:00.0 0000:01:00.1
Run lspci -k again, are the device bound to vfio-pci? If so, is the service enabled in systemd? If not, is the script executable?
Thank you again, aw! Running vfio-bind manually works:
~$ lsmod | grep vfio
vfio_iommu_type1 17632 0
vfio_pci 36474 0
vfio 25164 2 vfio_iommu_type1,vfio_pci
so the issue is probably related to my configuration of systemd/binds-vfio-pci.service.
It looks like I am getting the test window to display now, although I get some weird colors on my host screen output afterward. I'm guessing this is related to the Intel issue addressed by linux-mainline.tar.gz per the first page (If memory serves--more research for me!). I may have to compile another kernel but this is plenty of progress for me to get back on track. Thanks again!
shelladept
Yes, i915 patches should clear up the host screen.
Hey guys, sorry for posting again, but I'm at a real loss here on what's wrong with my setup.
My current system has 3 VGA outputs. Intel, AMD, and Nvidia:
$ lspci | grep VGA
00:02.0 VGA compatible controller: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor Integrated Graphics Controller (rev 06)
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Barts XT [Radeon HD 6870]
02:00.0 VGA compatible controller: NVIDIA Corporation GK110 [GeForce GTX 780] (rev a1)
When I run the example script:
qemu-system-x86_64 -enable-kvm -M q35 -m 1024 -cpu host \
-smp 6,sockets=1,cores=6,threads=1 \
-bios /usr/share/qemu/bios.bin -vga none \
-device ioh3420,bus=pcie.0,addr=1c.0,multifunction=on,port=1,chassis=1,id=root.1 \
-device vfio-pci,host=02:00.0,bus=root.1,addr=00.0,multifunction=on,x-vga=on \
-device vfio-pci,host=02:00.1,bus=root.1,addr=00.1
My system works perfectly when my host is using intel graphics. A blank window appears and the seabios screen is being passed to another monitor.
But when I set my host to use the AMD card, the blank window appears, but there is nothing being passed to the other monitor.
I am using the open source radeon drivers.
Sorry again for posting so much, but I'm running out of ideas.
VFIO requires that the user has permission to lock the memory used by the guest. You can change the user libvirt uses to root to avoid the problem (/etc/libvirt/qemu.conf). You can also assign some other device to the VM using libvirt, then it will know to set the locked memory limit for the process appropriately.
Thanks for this. I dug back into qemu.conf and found that, while qemu processes were supposed to run as root
user = "root"
group = "root"
I had missed the clear_emulator_capabilities flag
# If clear_emulator_capabilities is enabled, libvirt will drop all
# privileged capabilities of the QEmu/KVM emulator. This is enabled by
# default.
#
# Warning: Disabling this option means that a compromised guest can
# exploit the privileges and possibly do damage to the host.
#
clear_emulator_capabilities = 0
which was set to its default restrictive setting.
For my understanding, could you please clarify
You can also assign some other device to the VM using libvirt, then it will know to set the locked memory limit for the process appropriately.
Thanks again, I'm thrilled to have this working.
Google for "ACS override", apply the patch you find and enable it via kernel options. If one of the cards can be attached to a PCH root port (00:1c.*), the v3.15 kernel may help.
I just installed the new 3.15-rc6 kernel which in my understanding contains the acs override patch.
Attaching one card to one vm still works fine. ( Starting more than one vm each with its own card assigned still gives the same 'failed to get group 1' error. )
I can't figure out how to attach a card to a specific PCH root port and what I would have to do afterwards to pass those root ports to my vm.
It would be great if you could give me some more advice regarding this topic.
Hey guys, sorry for posting again, but I'm at a real loss here on what's wrong with my setup.
My current system has 3 VGA outputs. Intel, AMD, and Nvidia:
$ lspci | grep VGA
00:02.0 VGA compatible controller: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor Integrated Graphics Controller (rev 06)
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Barts XT [Radeon HD 6870]
02:00.0 VGA compatible controller: NVIDIA Corporation GK110 [GeForce GTX 780] (rev a1)
When I run the example script:
qemu-system-x86_64 -enable-kvm -M q35 -m 1024 -cpu host \
-smp 6,sockets=1,cores=6,threads=1 \
-bios /usr/share/qemu/bios.bin -vga none \
-device ioh3420,bus=pcie.0,addr=1c.0,multifunction=on,port=1,chassis=1,id=root.1 \
-device vfio-pci,host=02:00.0,bus=root.1,addr=00.0,multifunction=on,x-vga=on \
-device vfio-pci,host=02:00.1,bus=root.1,addr=00.1
My system works perfectly when my host is using intel graphics. A blank window appears and the seabios screen is being passed to another monitor.
But when I set my host to use the AMD card, the blank window appears, but there is nothing being passed to the other monitor.
I am using the open source radeon drivers.
Sorry again for posting so much, but I'm running out of ideas.
It sounds like VGA arbitration isn't working when the AMD card is in use by the host. What does 'dmesg | grep vgaarb' show?
aw wrote:VFIO requires that the user has permission to lock the memory used by the guest. You can change the user libvirt uses to root to avoid the problem (/etc/libvirt/qemu.conf). You can also assign some other device to the VM using libvirt, then it will know to set the locked memory limit for the process appropriately.
Thanks for this. I dug back into qemu.conf and found that, while qemu processes were supposed to run as root
user = "root"
group = "root"
I had missed the clear_emulator_capabilities flag
# If clear_emulator_capabilities is enabled, libvirt will drop all
# privileged capabilities of the QEmu/KVM emulator. This is enabled by
# default.
#
# Warning: Disabling this option means that a compromised guest can
# exploit the privileges and possibly do damage to the host.
#
clear_emulator_capabilities = 0
which was set to its default restrictive setting.
For my understanding, could you please clarify
libvirt attempts to run qemu with as few privileges as possible to protect the host from the guest "breaking out" of its VM. Using an assigned device requires certain privileges, which libvirt retains when device assignment is used. When libvirt doesn't know about an assigned device, it can't grant those privileges.
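One way to see which locked-memory limit actually applies is to read it from the running QEMU process instead of from a login shell, since `ulimit -l` only reports the limit of the shell it is run in. A minimal sketch, assuming a Linux host with /proc mounted; `memlock_soft_limit` is a hypothetical helper name and not part of libvirt or QEMU:

```shell
#!/bin/sh
# Print the soft "Max locked memory" limit from a /proc/<pid>/limits-style file.
# The file path is an argument so the function can also be tried on a saved copy.
memlock_soft_limit() {
    # Columns of that row: Max locked memory <soft> <hard> <units>
    awk '/^Max locked memory/ { print $4 }' "$1"
}
```

Called as `memlock_soft_limit /proc/<pid>/limits`, with `<pid>` replaced by the PID of the QEMU process. As far as I know this file reports the value in bytes while `ulimit -l` reports it in KiB, so the two numbers will not match digit for digit even when they describe the same limit.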
Offline
aw wrote:Google for "ACS override", apply the patch you find and enable it via kernel options. If one of the cards can be attached to a PCH root port (00:1c.*), the v3.15 kernel may help.
I just installed the new 3.15-rc6 kernel which in my understanding contains the acs override patch.
Nope, 3.15 includes quirks approved by Intel to advertise the ACS-like isolation capabilities of PCH-based root ports and make sure they're enabled. This means that the hardware is actually providing the isolation required. The ACS override patch is a way for the user to override the hardware isolation requirement and is not going upstream.
Attaching one card to one VM still works fine. (Starting more than one VM, each with its own card assigned, still gives the same 'failed to get group 1' error.)
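The 'failed to get group 1' error would be consistent with both cards sitting in the same IOMMU group, which only one VM can hold at a time (an assumption here, since the poster's group layout is not shown). A minimal sketch for listing which PCI devices share a group, assuming the usual sysfs location `/sys/kernel/iommu_groups`; `list_iommu_groups` is a hypothetical helper name:

```shell
#!/bin/sh
# Print one line per device in the form "group <n>: <pci address>".
# The sysfs root is an optional argument so the function can be pointed
# at a test directory instead of the live system.
list_iommu_groups() {
    root="${1:-/sys/kernel/iommu_groups}"
    for dev in "$root"/*/devices/*; do
        [ -e "$dev" ] || continue        # no groups present: print nothing
        group="${dev%/devices/*}"        # strip "/devices/<address>"
        group="${group##*/}"             # keep only the group number
        printf 'group %s: %s\n' "$group" "${dev##*/}"
    done
}
```

Running `list_iommu_groups` with no argument on the host and looking for the two graphics cards shows whether they land in the same group.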
I can't figure out how to attach a card to a specific PCH root port and what I would have to do afterwards to pass those root ports to my vm.
It would be great if you could give me some more advice regarding this topic.
The host root ports never get passed to the VM. Which root ports are used is a physical property of your motherboard layout. It may require that you move the card to a different slot to connect to the PCH root port, if it's even possible. On my system I have:
$ lspci | grep "Root Port"
00:01.0 PCI bridge: Intel Corporation Xeon E3-1200 v2/3rd Gen Core processor PCI Express Root Port (rev 09)
00:1c.0 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 1 (rev b5)
00:1c.5 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 6 (rev b5)
00:1c.6 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 7 (rev b5)
You can tell by the description (and bus addresses) that 00:01.0 is a processor-based root port while 00:1c.* are PCH-based root ports. My graphics cards are installed as:
$ lspci -tv | grep -e NVIDIA -e AMD
+-01.0-[01]--+-00.0 NVIDIA Corporation GK208 [GeForce GT 635]
| \-00.1 NVIDIA Corporation Device 0e0f
+-1c.0-[02]--+-00.0 Advanced Micro Devices, Inc. [AMD/ATI] Oland [Radeon HD 8570 / R7 240 OEM]
| \-00.1 Advanced Micro Devices, Inc. [AMD/ATI] Cape Verde/Pitcairn HDMI Audio [Radeon HD 7700/7800 Series]
So only one card is connected to the processor-based root port and the other is connected to a PCH-based root port. This way I don't need the ACS override patch, since the v3.15 ACS quirks work for this chipset. Intel currently has no plans to provide information about whether hardware isolation is possible with processor-based root ports, so we shouldn't expect similar quirks for them anytime soon. We can hope that they've learned their lesson with ACS and intend to support it on future consumer-grade processors. BTW, even though the root port description indicates a Xeon processor, this is just a regular i5; apparently the PCI device ID is reused between parts. Actual Xeon processors do have ACS support on the processor-based root ports.
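The distinction between the two kinds of root port can be sketched as a small filter over `lspci` output. It relies only on the address pattern mentioned above (00:1c.* for PCH root ports), which is an assumption that holds for this Intel chipset and may not hold elsewhere; `classify_root_ports` is a hypothetical helper name:

```shell
#!/bin/sh
# Read `lspci` lines on stdin and label each PCI Express root port.
# Heuristic (an assumption taken from the post above): root ports at
# 00:1c.* are PCH-based, any other root port is treated as processor-based.
classify_root_ports() {
    grep 'Root Port' | while read -r addr rest; do
        case "$addr" in
            00:1c.*) printf '%s PCH\n' "$addr" ;;
            *)       printf '%s processor\n' "$addr" ;;
        esac
    done
}
```

Usage would be `lspci | classify_root_ports`; combining the result with `lspci -tv` then shows which kind of root port each graphics card hangs off.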
Offline
It sounds like VGA arbitration isn't working when the AMD card is in use by the host. What does 'dmesg | grep vgaarb' show?
When I run with Intel as the host:
[ 0.333676] vgaarb: device added: PCI:0000:00:02.0,decodes=io+mem,owns=io+mem,locks=none
[ 0.333680] vgaarb: device added: PCI:0000:01:00.0,decodes=io+mem,owns=none,locks=none
[ 0.333682] vgaarb: device added: PCI:0000:02:00.0,decodes=io+mem,owns=none,locks=none
[ 0.333682] vgaarb: loaded
[ 0.333683] vgaarb: bridge control possible 0000:02:00.0
[ 0.333683] vgaarb: bridge control possible 0000:01:00.0
[ 0.333684] vgaarb: no bridge control possible 0000:00:02.0
[ 2.855092] vgaarb: device changed decodes: PCI:0000:01:00.0,olddecodes=io+mem,decodes=none:owns=none
[ 3.069294] vgaarb: device changed decodes: PCI:0000:00:02.0,olddecodes=io+mem,decodes=io:owns=io
When I run with AMD as the host:
[ 0.330061] vgaarb: device added: PCI:0000:01:00.0,decodes=io+mem,owns=io+mem,locks=none
[ 0.330064] vgaarb: device added: PCI:0000:02:00.0,decodes=io+mem,owns=none,locks=none
[ 0.330065] vgaarb: loaded
[ 0.330065] vgaarb: bridge control possible 0000:02:00.0
[ 0.330066] vgaarb: bridge control possible 0000:01:00.0
[ 2.609390] vgaarb: device changed decodes: PCI:0000:01:00.0,olddecodes=io+mem,decodes=none:owns=none
They don't seem too different. Does anything catch your eye?
In case it matters:
$ lspci -tv
+-01.0-[01]--+-00.0 Advanced Micro Devices, Inc. [AMD/ATI] Barts XT [Radeon HD 6870]
| \-00.1 Advanced Micro Devices, Inc. [AMD/ATI] Barts HDMI Audio [Radeon HD 6800 Series]
+-01.1-[02]--+-00.0 NVIDIA Corporation GK110 [GeForce GTX 780]
| \-00.1 NVIDIA Corporation GK110 HDMI Audio
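To make the two vgaarb logs above easier to compare side by side, the "device added" lines can be reduced to just the PCI address and the resources each device owns at boot. A minimal sketch, assuming the log format shown in this post; `vgaarb_owners` is a hypothetical helper name:

```shell
#!/bin/sh
# Read dmesg output on stdin and, for each vgaarb "device added" line,
# print "<pci address> owns=<resources>". All other lines are dropped.
vgaarb_owners() {
    sed -n 's/.*vgaarb: device added: PCI:\([^,]*\),decodes=[^,]*,owns=\([^,]*\),.*/\1 owns=\2/p'
}
```

Usage would be `dmesg | vgaarb_owners`, run once per host configuration. For the logs above this makes it visible that 00:02.0 is only registered with the arbiter in the Intel-host boot, and that in both cases the host's card starts out owning io+mem.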
Last edited by Slabity (2014-05-23 18:19:00)
Offline