
#1 2015-10-03 04:09:52

Cadeyrn
Member
Registered: 2013-04-06
Posts: 170

[SOLVED] linux 4.1.6-1 -> 4.2.2-1 breaks PCI passthrough in QEMU/KVM

Fixed in 4.4.

******************

It took me hours of troubleshooting to figure this out, because I tried downgrading linux AFTER every other possible fix or downgrade. I need to remember: always try downgrading linux first if I just upgraded my system. Anyway,

I was running linux 4.1.6-1. Tonight, I upgraded my system, and that included going up to linux 4.2.2-1. I have 2 VMs managed by libvirt, one Windows 10 and one Arch Linux. Both of them use vfio-pci and PCI passthrough with OVMF (using the TianoCore EDK2 UEFI) to have their own graphics card and USB3 card (the same cards for both VMs, since they never run at the same time). On 4.1.6-1, everything works fine. On 4.2.2-1, both VMs completely freeze a few seconds after the UEFI boot, even when I try to boot the Windows install disc. For Windows, it hangs right before you would normally see the spinny wide dots under the blue logo. For Linux, it hangs at a black screen right after selecting "Arch Linux" in the UEFI. This happens when I pass through any one of the previously mentioned PCI devices on linux 4.2.2-1. If I take all of the PCI devices out of a VM, the problem goes away. If I downgrade linux, the problem goes away. Neither of those is a good permanent solution. They aren't even decent workarounds. They're just me being desperate. I want to be able to upgrade linux again someday.

All the info I have as to what kind of hang it is: Windows, upon rebooting into recovery mode, reports the error code 0xc0000001. Also, when either VM freezes, my keyboard LEDs go off, so I think they freeze in a way that deactivates all USB devices, or maybe it deactivates all PCI devices, or maybe both--we don't know since my USB devices are plugged in via PCIe. That's all I've been able to figure out.

Last edited by Cadeyrn (2016-01-22 05:23:23)


#2 2015-10-04 13:50:21

mostlyharmless
Member
Registered: 2014-01-16
Posts: 72

Re: [SOLVED] linux 4.1.6-1 -> 4.2.2-1 breaks PCI passthrough in QEMU/KVM

Can confirm this problem:  I have my VM starting automatically on boot, which crashes the whole system.  Booted with a CD, chrooted in and downgraded to 4.1.6 and all is back to normal.   What do you suppose would be the best way to report this problem, and to whom?
[edit] I see an IOMMU error with kernel 4.2+ has been reported.  I'm guessing that's the same issue.

Last edited by mostlyharmless (2015-10-04 13:55:45)


#3 2015-10-04 16:41:31

Cadeyrn
Member
Registered: 2013-04-06
Posts: 170

Re: [SOLVED] linux 4.1.6-1 -> 4.2.2-1 breaks PCI passthrough in QEMU/KVM

I hope it is. It sounds like we don't have exactly the same issue, though, because the VMs never once froze my host.


#4 2015-10-05 01:11:46

markzz
Member
From: Michigan, United States
Registered: 2013-06-08
Posts: 89
Website

Re: [SOLVED] linux 4.1.6-1 -> 4.2.2-1 breaks PCI passthrough in QEMU/KVM

I would suggest using the linux-lts (version 4.1.9) kernel until 4.2 is fixed.  This way, you are getting patches, but you are using a version of the kernel that works.

The maintainer of linux-vfio on the AUR also created a package using the lts kernel (linux-vfio-lts) because he was having issues with his package as well.

It seems that 4.2 has a few issues that people are reporting.  I personally cannot use it on any of my computers without issues, so I use the lts kernel, even on the computer I use for PCI passthrough.

Last edited by markzz (2015-10-05 01:12:59)


I don't want to work.  I want to bang on the drum all day.


#5 2015-10-05 09:53:09

Physicist1616
Member
Registered: 2015-02-16
Posts: 32

Re: [SOLVED] linux 4.1.6-1 -> 4.2.2-1 breaks PCI passthrough in QEMU/KVM

Hah, what a time to be experimenting with this for the first time.  I can at least say this issue wasn't the source of my QEMU launch troubles, but still.

I'm looking to run my Arch KVM host with no display adapter (passing both NVidia cards through to guests) and do all host-level interaction solely through ssh/ctrl-alt-2.  Is there some place that talks about this, or is it even possible?

edit:  clarified nvidia passthrough

Last edited by Physicist1616 (2015-10-05 09:55:08)


#6 2015-10-05 15:10:35

Cadeyrn
Member
Registered: 2013-04-06
Posts: 170

Re: [SOLVED] linux 4.1.6-1 -> 4.2.2-1 breaks PCI passthrough in QEMU/KVM

I don't see why that wouldn't be possible. Thanks to the way vfio-pci works, you'll have to immediately reserve both graphics cards when the host boots, meaning your host will never have access to any graphics if you intend to run the VMs, unless you reboot the whole computer with different grub/kernel parameters when you want to run those VMs.
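To see whether a card has actually been reserved by vfio-pci after boot, you can look at the driver symlink that sysfs exposes for each PCI device. Here is a minimal sketch in Python. The function name is made up for illustration, and the PCI addresses in the usage note are placeholders, not addresses from anyone's machine in this thread.

```python
import os
from pathlib import Path


def bound_driver(pci_address, sysfs_root="/sys/bus/pci/devices"):
    """Return the name of the kernel driver bound to a PCI device, or None.

    sysfs exposes the binding as a symlink named "driver" inside the
    device directory; an unbound device has no such link.
    """
    link = Path(sysfs_root) / pci_address / "driver"
    if not link.is_symlink():
        return None
    # The link target ends with the driver name, e.g. ".../drivers/vfio-pci".
    return os.path.basename(os.readlink(link))
```

Calling bound_driver("0000:01:00.0") (a placeholder address) should return "vfio-pci" for a card that was reserved at boot, and something like the vendor driver's name if the host grabbed it instead.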


#7 2015-10-05 17:05:50

Physicist1616
Member
Registered: 2015-02-16
Posts: 32

Re: [SOLVED] linux 4.1.6-1 -> 4.2.2-1 breaks PCI passthrough in QEMU/KVM

Thanks for the note.  Also huge thanks for mentioning the linux-vfio packages being on AUR; I hadn't realized I was missing a crucial package that would only come from AUR.  This may explain my error -22.

Edit:  Ah, I see that this is a patch-inclusion-kernel; still, I think it might be required for my intel + no-host-EFI setup.

Last edited by Physicist1616 (2015-10-05 17:37:38)


#8 2015-10-05 17:55:33

Cadeyrn
Member
Registered: 2013-04-06
Posts: 170

Re: [SOLVED] linux 4.1.6-1 -> 4.2.2-1 breaks PCI passthrough in QEMU/KVM

Not necessarily. I don't use EFI on my host and I don't use the linux-vfio kernel, and there were only 4 steps to passing any PCIe (not PCI, even if it had a PCIe bridge into a PCIe slot) device:
-Turn on IOMMU and make sure my mobo, CPU, and graphics card all support IOMMU, EFI, and passthrough stuff
-Use the TianoCore EDK2 UEFI as the "BIOS" for the VM (the Arch Wiki page for PCI passthrough via OVMF has the download link and install guide for that)
-Assign the desired PCI device to the vfio-pci kernel module
-Add it to the libvirt XML file or QEMU parameters
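
For the first step, a quick way to confirm that the IOMMU is really on, and to see which devices share a group, is to walk /sys/kernel/iommu_groups. This is only a rough sketch in Python; the function name is my own, and it just reports what sysfs shows without changing anything.

```python
from pathlib import Path


def iommu_groups(root="/sys/kernel/iommu_groups"):
    """Map each IOMMU group number to the sorted PCI addresses it contains.

    Returns an empty dict when the directory is missing, which usually
    means the IOMMU is disabled or unsupported.
    """
    groups = {}
    base = Path(root)
    if not base.is_dir():
        return groups
    for group_dir in base.iterdir():
        devices_dir = group_dir / "devices"
        if group_dir.name.isdigit() and devices_dir.is_dir():
            groups[int(group_dir.name)] = sorted(
                entry.name for entry in devices_dir.iterdir()
            )
    return groups
```

If the result is empty, the IOMMU is probably not enabled. If the device you want to pass through shares a group with other devices, they generally have to be handed to vfio-pci together.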

You seem like you haven't heard of vfio.blogspot.com. There's an amazing guide there that has everything (except for the step that I mentioned is in the Arch Wiki).


#9 2015-10-11 08:23:26

hiliev
Member
From: Aachen
Registered: 2015-10-11
Posts: 2
Website

Re: [SOLVED] linux 4.1.6-1 -> 4.2.2-1 breaks PCI passthrough in QEMU/KVM

I just want to share my experience here. I was keeping my kernel at 4.1.6-1 until I saw the updated version of the PCI passthrough wiki article and decided to switch to the LTS kernel. But for some reason the VM doesn't make it past the OVMF splash screen with the official linux-lts kernel (currently 4.1.10-1). One core spins at 100% and it never makes it to the OS boot loader. It works with linux-vfio-lts from AUR (also based on 4.1.10-1) though. To add more context: GTX 970 on an X99-based motherboard.


#10 2015-10-11 17:28:53

Cadeyrn
Member
Registered: 2013-04-06
Posts: 170

Re: [SOLVED] linux 4.1.6-1 -> 4.2.2-1 breaks PCI passthrough in QEMU/KVM

That's interesting. I'm still on linux-lts 4.1.9-2 and it works fine. I'll try 10 and see what happens later. But, I thought the linux-vfio kernel is meant for VMs that don't use UEFI to have a much easier time with OVMF.


#11 2015-10-12 04:46:20

markzz
Member
From: Michigan, United States
Registered: 2013-06-08
Posts: 89
Website

Re: [SOLVED] linux 4.1.6-1 -> 4.2.2-1 breaks PCI passthrough in QEMU/KVM

Cadeyrn wrote:

That's interesting. I'm still on linux-lts 4.1.9-2 and it works fine. I'll try 10 and see what happens later. But, I thought the linux-vfio kernel is meant for VMs that don't use UEFI to have a much easier time with OVMF.

It also fixes the IOMMU problem of more than one device being grouped together.  This is why I use it.


I don't want to work.  I want to bang on the drum all day.


#12 2015-10-13 08:41:03

hiliev
Member
From: Aachen
Registered: 2015-10-11
Posts: 2
Website

Re: [SOLVED] linux 4.1.6-1 -> 4.2.2-1 breaks PCI passthrough in QEMU/KVM

Cadeyrn wrote:

That's interesting. I'm still on linux-lts 4.1.9-2 and it works fine. I'll try 10 and see what happens later. But, I thought the linux-vfio kernel is meant for VMs that don't use UEFI to have a much easier time with OVMF.

The problem that I experience with linux-lts 4.1.10-1 is that the VM hangs/spins right at the TianoCore splash screen. This is different than the 4.2.x problem where the VM boot is extremely slow. I will continue tracking the official LTS kernel and see what happens in the future, but until the problem is resolved, I'm sticking to the vfio version from AUR. Also, I'm getting a ~3 fps boost in The Witcher 3 since switching from 4.1.6-1 to the latest linux-vfio-lts, but that could easily be attributed to the new Nvidia driver as well.


#13 2015-10-13 15:12:41

Cadeyrn
Member
Registered: 2013-04-06
Posts: 170

Re: [SOLVED] linux 4.1.6-1 -> 4.2.2-1 breaks PCI passthrough in QEMU/KVM

I was getting a hang in 4.2, not a slow boot.


#14 2015-10-14 03:01:01

mapintar
Member
Registered: 2010-04-17
Posts: 50

Re: [SOLVED] linux 4.1.6-1 -> 4.2.2-1 breaks PCI passthrough in QEMU/KVM

Try disabling SMM. Example:

 ... -machine pc-q35-2.4,smm=off ...


#15 2015-10-19 11:08:15

toxster
Member
Registered: 2015-04-28
Posts: 5

Re: [SOLVED] linux 4.1.6-1 -> 4.2.2-1 breaks PCI passthrough in QEMU/KVM

Any progress on this matter? Or a ticket one can follow?


#16 2015-10-20 10:02:44

novist
Member
Registered: 2014-03-14
Posts: 47

Re: [SOLVED] linux 4.1.6-1 -> 4.2.2-1 breaks PCI passthrough in QEMU/KVM

For me, QEMU was crashing right on start, and libvirt indicated the VM immediately entering a paused state that can't be resumed. Downgrading also helped.


#17 2015-10-20 17:14:52

Cadeyrn
Member
Registered: 2013-04-06
Posts: 170

Re: [SOLVED] linux 4.1.6-1 -> 4.2.2-1 breaks PCI passthrough in QEMU/KVM

hiliev wrote:
Cadeyrn wrote:

That's interesting. I'm still on linux-lts 4.1.9-2 and it works fine. I'll try 10 and see what happens later. But, I thought the linux-vfio kernel is meant for VMs that don't use UEFI to have a much easier time with OVMF.

The problem that I experience with linux-lts 4.1.10-1 is that the VM hangs/spins right at the TianoCore splash screen. This is different than the 4.2.x problem where the VM boot is extremely slow. I will continue tracking the official LTS kernel and see what happens in the future, but until the problem is resolved, I'm sticking to the vfio version from AUR. Also, I'm getting a ~3 fps boost in The Witcher 3 since switching from 4.1.6-1 to the latest linux-vfio-lts, but that could easily be attributed to the new Nvidia driver as well.

On upgrading to 4.1.10 (still using linux-lts), my VM is hanging at the Windows 10 logo, which thanks to the way UEFI works is probably the same thing as hanging at the TianoCore screen. But, downgrading back to 4.1.9 didn't fix it. I'll have to figure out how to fix it later.

EDIT: Never mind. It broke again because I finally upgraded my linux package from 4.1.6 to 4.2 at the same time as upgrading lts to 4.1.10, and I've been accidentally using it instead of linux-lts this whole time because I didn't realize I have to not just make an init image for linux-lts, but also tell grub to boot that image. 4.1.10 works just fine for me.
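
A cheap guard against this mistake is to check which kernel is actually running before blaming a version. Below is a minimal sketch in Python. It assumes that Arch's linux-lts package reports a release string ending in "-lts" (for example "4.1.10-1-lts"), which is an assumption about the packaging and not something confirmed in this thread; the function names are made up.

```python
import platform


def is_lts_release(release):
    """Guess whether a kernel release string comes from linux-lts.

    Assumption: the linux-lts package appends "-lts" to its release string.
    """
    return release.endswith("-lts")


def describe_running_kernel(release=None):
    """Describe the given release, or the running kernel if none is given."""
    release = release or platform.release()
    flavour = "linux-lts" if is_lts_release(release) else "not linux-lts"
    return f"{release} ({flavour})"
```

Printing describe_running_kernel() right after boot shows whether grub really picked the linux-lts image.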

Last edited by Cadeyrn (2015-10-21 01:03:10)


#18 2015-10-28 08:27:21

xwyz
Member
Registered: 2015-10-28
Posts: 2

Re: [SOLVED] linux 4.1.6-1 -> 4.2.2-1 breaks PCI passthrough in QEMU/KVM

Glad to see I'm not alone. Related: https://forums.gentoo.org/viewtopic-t-1028054.html

Tried 4.3-rc and it doesn't seem to solve it. Luckily 4.1 is LTS...


#19 2015-10-28 13:48:09

schefi
Member
Registered: 2015-10-27
Posts: 10

Re: [SOLVED] linux 4.1.6-1 -> 4.2.2-1 breaks PCI passthrough in QEMU/KVM

I can confirm the same symptoms as in the thread starter's post. I'm on Fedora 22, although I'm pretty sure this issue is simply related to the kernel version. Upgrading from 4.1.7 to 4.2.1 broke my VGA passthrough in exactly the same way. Returning to 4.1.7 makes it work again. Upgrading or downgrading QEMU makes no difference.
I investigated a little and found an entry in the kernel log (running 4.2.1) that was not present on 4.1.7: pmd_set_huge: Cannot satisfy [mem 0xe0000000-0xe0200000] with a huge-page mapping due to MTRR override.
There was a commit that introduced hardened checking of PAT table entries; it demands that the MTRR type be uniform (only one entry covering the affected range) when huge pages are set up in non-write-back modes. I also figured out that my /proc/mtrr table shows the memory regions of my Nvidia 970 card (mem 0xe0000000-0xe0200000 and 0xf0000000-0xf02000000, somewhere around the 3500 and 3800 MB addresses) overlapping with a non-cacheable entry that starts at 2 GB and ends at 4 GB of physical address space.
I tried disabling MTRR entries in two ways to resolve the conflict: first I removed the 2 GB entry; second, I removed the 2 entries corresponding to the VGA card's PCI memory addresses, with echo "disable=n" > /proc/mtrr, where n is the number of the corresponding line. In both cases the kernel log message disappeared, as pmd_set_huge then succeeded. But VGA passthrough is still broken. I only see the Windows logo on guest boot, without the white dots. After some time the guest shuts down and QEMU exits.
Therefore I think it may not be related at all. But I couldn't find any other relevant items in the kernel changelog. Or I wasn't thorough enough.
I'd be glad if someone with more understanding could figure out which commit caused this issue. It would be great to revert this change. Whatever is causing the bug, it's still broken in 4.3-rc7; I confirmed that today.

The only 'solution' for now is to be stuck with 4.1 kernel...
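
For anyone who wants to check their own machine for the MTRR overlap described above, here is a rough sketch in Python that parses the text of /proc/mtrr and lists the entries intersecting a given physical range. The function names are made up, and the two sample lines in the usage note are a hypothetical table, not a dump from my system.

```python
import re

# One /proc/mtrr line looks like:
# reg01: base=0x080000000 ( 2048MB), size= 2048MB, count=1: uncachable
_LINE = re.compile(
    r"reg(\d+): base=(0x[0-9a-fA-F]+) \(\s*\d+MB\), "
    r"size=\s*(\d+)([KMG])B, count=\d+: (.+)"
)
_UNIT = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_mtrr(text):
    """Return (register, start, end, type) tuples; end is exclusive."""
    entries = []
    for line in text.splitlines():
        match = _LINE.match(line.strip())
        if match:
            start = int(match.group(2), 16)
            size = int(match.group(3)) * _UNIT[match.group(4)]
            entries.append(
                (int(match.group(1)), start, start + size, match.group(5).strip())
            )
    return entries


def overlapping(entries, start, end):
    """Entries whose range intersects the half-open range [start, end)."""
    return [entry for entry in entries if entry[1] < end and start < entry[2]]
```

With a hypothetical table containing "reg00: base=0x000000000 (    0MB), size= 2048MB, count=1: write-back" and "reg01: base=0x080000000 ( 2048MB), size= 2048MB, count=1: uncachable", calling overlapping(parse_mtrr(text), 0xe0000000, 0xe0200000) reports only reg01, which is the kind of conflict the pmd_set_huge message complains about.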


#20 2015-10-28 22:06:57

usul1589
Member
Registered: 2015-10-28
Posts: 1

Re: [SOLVED] linux 4.1.6-1 -> 4.2.2-1 breaks PCI passthrough in QEMU/KVM

I've observed the same symptoms on a fresh Debian testing installation, and haven't found any workaround for now. I needed to roll back to a 4.1 kernel.


#21 2015-10-28 22:53:15

Cadeyrn
Member
Registered: 2013-04-06
Posts: 170

Re: [SOLVED] linux 4.1.6-1 -> 4.2.2-1 breaks PCI passthrough in QEMU/KVM

Gentoo and Debian? So it's not just Arch Linux, then. Perhaps it's an issue with the whole mainline kernel?

I'm wondering if it's this bug? https://bugzilla.kernel.org/show_bug.cgi?id=103321

I never left my VM running for as long as a few minutes when I had this bug on 4.2, so for all I know, it could just be running as slowly as it does for the person in that bug report. And it did happen to them right at 4.2 and persist through 4.3. If our bug isn't this one, that means our bug hasn't even been filed yet.

Last edited by Cadeyrn (2015-10-28 22:57:56)


#22 2015-10-28 23:36:15

schefi
Member
Registered: 2015-10-27
Posts: 10

Re: [SOLVED] linux 4.1.6-1 -> 4.2.2-1 breaks PCI passthrough in QEMU/KVM

I think it is not the same bug. I checked the bug ID, and the kernel changelog says the corresponding commit was reverted in 4.2.4. I tried today with both 4.2.5 (Fedora 23) and 4.3-rc7 (Fedora Rawhide). The issue is still there.
Right now 4.1.7 works for me, and it also works with 4.0.4.
Another thing: some of us in this topic experience extreme slowdown, others experience guest VM hang. It may be possible that we are talking about two different bugs. The 4.2.4 kernel changelog states that the revert fixes bug 103321 (mentioned by Cadeyrn). That bug is about slowdown, and the revert clearly doesn't fix my case, which is the VM hanging.
To be very exact: I have extreme slowdown, then after minutes the white dots below the Windows logo show up and start moving a bit (like 2-4 frames), then the VM stops running. Sometimes I don't even get there: I just see the Windows logo for minutes and the VM is stuck. The UEFI shell in the guest VM with the passthrough VGA card works perfectly. Things seem to get messed up when the guest OS tries to interact with the UEFI framebuffer.
It is quite possible that this bug is related to host/guest MTRR and PAT mapping, but there could have been multiple related bugs, and 103321 is the only one fixed so far.


#23 2015-10-29 15:44:11

xwyz
Member
Registered: 2015-10-28
Posts: 2

Re: [SOLVED] linux 4.1.6-1 -> 4.2.2-1 breaks PCI passthrough in QEMU/KVM

schefi wrote:

Another thing: some of us in this topic experience extreme slowdown, others experience guest VM hang. It may be possible that we are talking about two different bugs.

I think both are the same. I (gentoo) have experienced both things: extreme slowdown at boot (1 minute instead of <10 seconds, but everything seems to be ok *after* boot) and no boot at all (several minutes with a black screen). 4.1.12 works fine.

I assume the only solution would be a kernel git bisect...
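
For anyone unfamiliar with the idea, a bisect is just a binary search over the history between a known-good and a known-bad kernel. Here is a simplified sketch in Python. It treats the history as a plain ordered list, which real git history is not, and the names are made up; in practice "is_bad" means building that kernel, booting it, and seeing whether the VM hangs.

```python
def first_bad(commits, is_bad):
    """Return the first commit for which is_bad() is true.

    Assumes commits[0] is good, commits[-1] is bad, and every commit
    after the first bad one is also bad.
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid   # the regression is at mid or earlier
        else:
            good = mid  # the regression is after mid
    return commits[bad]
```

The number of kernels to build and test grows only with the logarithm of the number of commits, so even the thousands of commits between 4.1 and 4.2 come down to roughly a dozen or so builds.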


#24 2015-11-03 23:10:55

letni69
Member
Registered: 2015-09-21
Posts: 17

Re: [SOLVED] linux 4.1.6-1 -> 4.2.2-1 breaks PCI passthrough in QEMU/KVM

Has anyone tried the latest released Linux kernel (4.3) yet?


#25 2015-11-04 20:24:16

schefi
Member
Registered: 2015-10-27
Posts: 10

Re: [SOLVED] linux 4.1.6-1 -> 4.2.2-1 breaks PCI passthrough in QEMU/KVM

Today I compiled and checked with 4.4.0-rc0 from Fedora Rawhide and unmodified 4.3 stable (directly from kernel.org). The issue is still present in both cases.
Some of you reported the guest working after a few minutes of slowness, though I can't seem to boot the guest at all, even after waiting for about 20 minutes. An MCE error is also reported after about 1 minute. But the whole thing works fine with the 4.1.7 kernel.
I can't put my finger on it. PCI passthrough related MCE errors and slowdown bugs are present in kernel bugzilla tickets, and the 4.2.4 (and some other) changelogs claim to have reverted the corresponding changes because of the very same symptoms we experience. If they have been reverted, then how come it still doesn't work in 4.3?

