
#1 2016-02-28 07:08:05

8BitAce
Member
Registered: 2015-09-04
Posts: 9

No Sleep After VM Shutdown

Hello,

I have a strange issue. My machine has two GPUs: an NVIDIA card that drives both of my monitors and an AMD card that is used only for passthrough to a Windows VM.
I boot with both monitors on the NVIDIA card, and when I need Windows I move one monitor's cable over to the AMD card and pass the card through. No problems there.
After shutting down the VM and returning the cable to the NVIDIA card, everything works fine until I later run `systemctl suspend`: the monitors turn off, but the system itself refuses to sleep. Normally the fans shut down about a second after I hit Enter, but now I end up having to REISUB to get out of this "limbo".

My guess is that the kernel is trying to grab the AMD GPU even though radeon is blacklisted. The journalctl output below shows a kernel oops right after a vgaarb (VGA arbitration) message, but beyond that I'm lost as to how the two are related. It is definitely connected to the VM, though: the problem is 100% reproducible, and ONLY when I have booted that VM.
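
One quick way to test the guess about radeon is to check whether the module is loaded at all. The following is only an illustrative sketch, not something from the original post: `module_loaded` and the `MODULES_FILE` override are hypothetical names, and the check assumes the usual /proc/modules layout of one module per line with the module name first. For what it is worth, the "Modules linked in" lines of the oops in the journal do not list radeon, which suggests the blacklist is holding.

```shell
#!/bin/bash
# Hypothetical helper: succeed if the named kernel module is currently loaded.
# MODULES_FILE is an assumption added for illustration so the check can be
# pointed at a saved copy of /proc/modules; it defaults to the live file.
module_loaded() {
    local name="$1"
    local modules="${MODULES_FILE:-/proc/modules}"
    # Each line starts with the module name followed by a space,
    # so anchoring on "name " avoids matching e.g. vfio against vfio_pci.
    grep -q "^${name} " "$modules"
}
```

Usage would look like `module_loaded radeon && echo "radeon is loaded" || echo "radeon is not loaded"`.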

Any ideas? Possibly a way to tell the kernel to ignore that PCI slot entirely?

Output from the journal starting at the point of shutting down the VM:

Feb 27 23:50:10 arch_d smbd[491]: [2016/02/27 23:50:10.417268,  0] ../source3/printing/print_standard.c:69(std_pcap_cache_reload)
Feb 27 23:50:10 arch_d smbd[491]:   Unable to open printcap file /etc/printcap for read!
Feb 27 23:57:25 arch_d kernel: usb 3-4: reset full-speed USB device number 4 using xhci_hcd
Feb 27 23:57:25 arch_d kernel: usb 3-2: reset full-speed USB device number 2 using xhci_hcd
Feb 27 23:57:25 arch_d kernel: usb 3-2: ep 0x85 - rounding interval to 64 microframes, ep desc says 80 microframes
Feb 27 23:57:25 arch_d kernel: usb 3-3: reset low-speed USB device number 3 using xhci_hcd
Feb 27 23:57:26 arch_d kernel: usb 3-3: ep 0x81 - rounding interval to 64 microframes, ep desc says 80 microframes
Feb 27 23:57:26 arch_d dhcpcd[471]: vnet0: carrier lost
Feb 27 23:57:26 arch_d kernel: virbr0: port 2(vnet0) entered disabled state
Feb 27 23:57:26 arch_d kernel: device vnet0 left promiscuous mode
Feb 27 23:57:26 arch_d kernel: virbr0: port 2(vnet0) entered disabled state
Feb 27 23:57:26 arch_d dhcpcd[471]: vnet0: removing interface
Feb 27 23:57:26 arch_d dhcpcd[471]: virbr0: carrier lost
Feb 27 23:57:26 arch_d kernel: logitech-djreceiver 0003:046D:C52B.000B: hiddev0,hidraw0: USB HID v1.11 Device [Logitech USB Receiver] on usb-0000:00:14.0-4/input2
Feb 27 23:57:26 arch_d kernel: input: DELL Dell USB Entry Keyboard as /devices/pci0000:00/0000:00:14.0/usb3/3-3/3-3:1.0/0003:413C:2107.000C/input/input35
Feb 27 23:57:26 arch_d kernel: hid-generic 0003:413C:2107.000C: input,hidraw1: USB HID v1.10 Keyboard [DELL Dell USB Entry Keyboard] on usb-0000:00:14.0-3/input0
Feb 27 23:57:26 arch_d kernel: input: Burr-Brown from TI               USB Audio CODEC  as /devices/pci0000:00/0000:00:14.0/usb3/3-2/3-2:1.3/0003:08BB:2902.000E/input/input36
Feb 27 23:57:26 arch_d kernel: hid-generic 0003:08BB:2902.000E: input,hidraw2: USB HID v1.00 Device [Burr-Brown from TI               USB Audio CODEC ] on usb-0000:00:14.0-2/input3
Feb 27 23:57:26 arch_d virtlogd[21854]: Cannot open log file: '/var/log/libvirt/qemu/Win8-Passthrough.log': Device or resource busy
Feb 27 23:57:26 arch_d virtlogd[21854]: End of file while reading data: Input/output error
Feb 27 23:57:28 arch_d systemd-machined[22280]: Machine qemu-Win8-Passthrough terminated.
Feb 27 23:57:28 arch_d kernel: vgaarb: device changed decodes: PCI:0000:02:00.0,olddecodes=io+mem,decodes=io+mem:owns=io+mem
Feb 27 23:57:28 arch_d kernel: input: Logitech M315/235/317 as /devices/pci0000:00/0000:00:14.0/usb3/3-4/3-4:1.2/0003:046D:C52B.000B/0003:046D:4009.000D/input/input37
Feb 27 23:57:28 arch_d kernel: logitech-hidpp-device 0003:046D:4009.000D: input,hidraw3: USB HID v1.11 Mouse [Logitech M315/235/317] on usb-0000:00:14.0-4:1
Feb 27 23:57:28 arch_d kernel: kernel tried to execute NX-protected page - exploit attempt? (uid: 0)
Feb 27 23:57:28 arch_d kernel: BUG: unable to handle kernel paging request at ffff88041ab167f0
Feb 27 23:57:28 arch_d kernel: IP: [<ffff88041ab167f0>] 0xffff88041ab167f0
Feb 27 23:57:28 arch_d kernel: PGD 1b3f067 PUD 1b42067 PMD 41ab68063 PTE 800000041ab16163
Feb 27 23:57:28 arch_d kernel: Oops: 0011 [#1] PREEMPT SMP 
Feb 27 23:57:28 arch_d kernel: Modules linked in: vfio_pci vfio_iommu_type1 vfio_virqfd vfio xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter nvidia_uvm(PO) sha256_ssse3 sha256_generic hmac drbg ansi_cprng dm_crypt loop dm_mod fuse nvidia_modeset(PO) cfg80211 nls_iso8859_1 nls_cp437 snd_usb_audio vfat fat snd_usbmidi_lib snd_rawmidi snd_seq_device mousedev input_leds nvidia(PO) intel_rapl iosf_mbi eeepc_wmi asus_wmi iTCO_wdt sparse_keymap iTCO_vendor_support led_class x86_pkg_temp_thermal evdev drm rfkill mac_hid intel_powerclamp mxm_wmi snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi
Feb 27 23:57:28 arch_d kernel:  coretemp kvm_intel kvm irqbypass psmouse crct10dif_pclmul crc32_pclmul crc32c_intel pcspkr serio_raw aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd snd_hda_intel snd_hda_codec e1000e fjes snd_hda_core snd_hwdep snd_pcm snd_timer snd i2c_i801 soundcore thermal fan battery tpm_infineon video button shpchp mei_me mei ptp lpc_ich pps_core wmi tpm_tis tpm processor sch_fq_codel ip_tables x_tables ext4 crc16 mbcache jbd2 hid_logitech_hidpp hid_logitech_dj sd_mod hid_generic usbhid hid atkbd libps2 xhci_pci xhci_hcd ahci libahci libata ehci_pci ehci_hcd scsi_mod usbcore usb_common i8042 serio
Feb 27 23:57:28 arch_d kernel: CPU: 0 PID: 21852 Comm: libvirtd Tainted: P           O    4.4.1-1-vfio #1
Feb 27 23:57:28 arch_d kernel: Hardware name: ASUS All Series/MAXIMUS VI HERO, BIOS 1603 08/15/2014
Feb 27 23:57:28 arch_d kernel: task: ffff880411d06040 ti: ffff8801a40a4000 task.ti: ffff8801a40a4000
Feb 27 23:57:28 arch_d kernel: RIP: 0010:[<ffff88041ab167f0>]  [<ffff88041ab167f0>] 0xffff88041ab167f0
Feb 27 23:57:28 arch_d kernel: RSP: 0018:ffff8801a40a7cb0  EFLAGS: 00010286
Feb 27 23:57:28 arch_d kernel: RAX: ffff88041ab167f0 RBX: ffff88041bc54098 RCX: 0000000000000000
Feb 27 23:57:28 arch_d kernel: RDX: 0000000000000000 RSI: ffff88041bc54098 RDI: ffff88041bc54098
Feb 27 23:57:28 arch_d kernel: RBP: ffff8801a40a7cd0 R08: ffffffff81ae6920 R09: ffffffff810cdea6
Feb 27 23:57:28 arch_d kernel: R10: 0000000000000000 R11: ffff88041365ca00 R12: ffff88041bc54148
Feb 27 23:57:28 arch_d kernel: R13: ffff88041ab167f0 R14: 0000000000000000 R15: 000000000000000c
Feb 27 23:57:28 arch_d kernel: FS:  00007f68e2a4e800(0000) GS:ffff88042ec00000(0000) knlGS:0000000000000000
Feb 27 23:57:28 arch_d kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb 27 23:57:28 arch_d kernel: CR2: ffff88041ab167f0 CR3: 00000001b1482000 CR4: 00000000001406f0
Feb 27 23:57:28 arch_d kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Feb 27 23:57:28 arch_d kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Feb 27 23:57:28 arch_d kernel: Stack:
Feb 27 23:57:28 arch_d kernel:  ffffffff813f4f22 0000000000000008 ffff88041bc54098 0000000000000004
Feb 27 23:57:28 arch_d kernel:  ffff8801a40a7d00 ffffffff813f5d22 ffff88041bc54098 0000000000000004
Feb 27 23:57:28 arch_d kernel:  ffff88041bc54148 0000000000000246 ffff8801a40a7d30 ffffffff813f5ddd
Feb 27 23:57:28 arch_d kernel: Call Trace:
Feb 27 23:57:28 arch_d kernel:  [<ffffffff813f4f22>] ? __rpm_callback+0x32/0x70
Feb 27 23:57:28 arch_d kernel:  [<ffffffff813f5d22>] rpm_idle+0x1f2/0x260
Feb 27 23:57:28 arch_d kernel:  [<ffffffff813f5ddd>] __pm_runtime_idle+0x4d/0x70
Feb 27 23:57:28 arch_d kernel:  [<ffffffff8130e6da>] pci_device_remove+0x7a/0xc0
Feb 27 23:57:28 arch_d kernel:  [<ffffffff813eb0a1>] __device_release_driver+0xa1/0x150
Feb 27 23:57:28 arch_d kernel:  [<ffffffff813eb173>] device_release_driver+0x23/0x30
Feb 27 23:57:28 arch_d kernel:  [<ffffffff813e9d6d>] unbind_store+0x10d/0x160
Feb 27 23:57:28 arch_d kernel:  [<ffffffff813e8f25>] drv_attr_store+0x25/0x30
Feb 27 23:57:28 arch_d kernel:  [<ffffffff81259957>] sysfs_kf_write+0x37/0x40
Feb 27 23:57:28 arch_d kernel:  [<ffffffff81258f1a>] kernfs_fop_write+0x11a/0x170
Feb 27 23:57:28 arch_d kernel:  [<ffffffff811de6f7>] __vfs_write+0x37/0x100
Feb 27 23:57:28 arch_d kernel:  [<ffffffff811df007>] vfs_write+0xa7/0x1a0
Feb 27 23:57:28 arch_d kernel:  [<ffffffff811dfce5>] SyS_write+0x55/0xc0
Feb 27 23:57:28 arch_d kernel:  [<ffffffff81591bee>] entry_SYSCALL_64_fastpath+0x12/0x71
Feb 27 23:57:28 arch_d kernel: Code: 88 ff ff d0 6e 65 c8 02 88 ff ff d0 6e 65 c8 02 88 ff ff 00 00 00 00 00 00 00 00 e0 67 b1 1a 04 88 ff ff e0 67 b1 1a 04 88 ff ff <f0> 67 b1 1a 04 88 ff ff f0 67 b1 1a 04 88 ff ff 00 74 69 12 02 
Feb 27 23:57:28 arch_d kernel: RIP  [<ffff88041ab167f0>] 0xffff88041ab167f0
Feb 27 23:57:29 arch_d kernel:  RSP <ffff8801a40a7cb0>
Feb 27 23:57:29 arch_d kernel: CR2: ffff88041ab167f0
Feb 27 23:57:29 arch_d kernel: ---[ end trace 714ebe983c059769 ]---
Feb 27 23:59:46 arch_d sudo[6150]: pam_unix(sudo:auth): auth could not identify password for [eightbit]
Feb 28 00:00:02 arch_d systemd[1]: Starting Verify integrity of password and group files...
Feb 28 00:00:02 arch_d systemd[1]: Starting Rotate log files...
Feb 28 00:00:02 arch_d systemd[1]: Starting Update man-db cache...
Feb 28 00:00:02 arch_d systemd[1]: Started Verify integrity of password and group files.
Feb 28 00:00:03 arch_d systemd[1]: Started Rotate log files.
Feb 28 00:00:09 arch_d systemd[1]: Started Update man-db cache.
Feb 28 00:00:10 arch_d polkitd[750]: Registered Authentication Agent for unix-process:6409:20838450 (system bus name :1.270 [/usr/bin/pkttyagent --notify-fd 5 --fallback], object path /org/freedesktop/PolicyKit1/AuthenticationAgent, locale en_US.UTF-8)
Feb 28 00:00:10 arch_d systemd[1]: Reached target Sleep.
Feb 28 00:00:10 arch_d polkitd[750]: Unregistered Authentication Agent for unix-process:6409:20838450 (system bus name :1.270, object path /org/freedesktop/PolicyKit1/AuthenticationAgent, locale en_US.UTF-8) (disconnected from bus)
Feb 28 00:00:10 arch_d systemd[1]: Starting Suspend...
Feb 28 00:00:10 arch_d systemd-sleep[6414]: Suspending system...
Feb 28 00:00:11 arch_d kernel: PM: Syncing filesystems ... done.
Feb 28 00:00:11 arch_d kernel: PM: Preparing system for sleep (mem)


#2 2016-03-21 13:34:18

DuKeTHeReaL
Member
Registered: 2016-03-21
Posts: 3

Re: No Sleep After VM Shutdown

Have you looked at your dmesg output?
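
A minimal sketch of how one might pick kernel faults out of a long log. The function name `kernel_faults` is hypothetical, and the three patterns are simply the markers that appear in the traces posted in this thread, not an exhaustive list.

```shell
#!/bin/bash
# Hypothetical filter: read kernel log text on stdin and print only the lines
# that announce a fault. The patterns are taken from the traces in this thread.
kernel_faults() {
    grep -E 'Oops:|BUG:|general protection fault'
}
```

It can be fed from either log source, for example `dmesg | kernel_faults` or `journalctl -k -b | kernel_faults`.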

I think I have a similar, or possibly the same, problem.

I am using:
- 4.4.1-1-vfio
- qemu
- no libvirt
- ovmf

My setup is:

- Intel i7 6700k
- ASUS Z170 A
- GeForce GTX 970

The script I use to start the VM is the following:

#!/bin/bash

echo "vfio-pci"

sudo modprobe vfio-pci

echo "kvm intel guest state"

sudo modprobe -r kvm_intel
sudo modprobe kvm_intel emulate_invalid_guest_state=0

echo "vfiobind"

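# Bind a PCI device (full address, e.g. 0000:01:00.0) to vfio-pci.
vfiobind() {
    dev="$1"
    vendor=$(cat "/sys/bus/pci/devices/$dev/vendor")
    device=$(cat "/sys/bus/pci/devices/$dev/device")
    # Release the device from whatever driver currently holds it.
    if [ -e "/sys/bus/pci/devices/$dev/driver" ]; then
        echo "$dev" > "/sys/bus/pci/devices/$dev/driver/unbind"
    fi
    echo "$vendor $device bound"
    # Register the vendor/device ID pair so vfio-pci claims the device.
    echo "$vendor $device" > /sys/bus/pci/drivers/vfio-pci/new_id
}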

echo "Binding USB 3 Hub"
vfiobind 0000:02:00.0

echo "Binding Graphics Card"
vfiobind 0000:01:00.0
vfiobind 0000:01:00.1

echo "Starting VM"

modprobe snd-hda-intel

sudo \
/usr/bin/qemu-system-x86_64 \
-serial none \
-parallel none \
-nodefaults \
-nodefconfig \
-enable-kvm \
-name madvmarch \
-cpu host,kvm=off,check \
-smp sockets=1,cores=2,threads=2 \
-m 8192 \
`# Mouse` \
-usb \
-device usb-host,hostbus=1,hostaddr=2 \
`# OVMF` \
-drive file=/usr/share/edk2.git/ovmf-x64/OVMF_CODE-pure-efi.fd,if=pflash,format=raw,unit=0,readonly=on \
`# DRIVE` \
-drive file=/var/lib/qemu/madvmarch_VARS.fd,if=pflash,format=raw,unit=1 \
-rtc base=localtime \
-boot order=c \
`# Network` \
-netdev tap,id=net0 \
-device rtl8139,netdev=net0,mac=52:54:00:35:e6:2b \
`# Hard disk` \
-drive file=/var/lib/libvirt/images/madvmarch.qcow2,format=qcow2,if=virtio,id=drive0,cache=none,aio=native \
-boot c \
-nographic \
`# Graphics card` \
`#-vga qxl` \
`#-device vfio-pci,host=00:01.0,bus=pci.0` \
-device vfio-pci,host=02:00.0,bus=pci.0,addr=0x7 \
-device vfio-pci,host=01:00.0,id=hostdev1,bus=pci.0,addr=0x5,multifunction=on \
-device vfio-pci,host=01:00.1,id=hostdev2,bus=pci.0,addr=0x6 \

echo "VM terminated"
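
Both traces in this thread fault beneath unbind_store, and vfiobind as posted unbinds a device whenever any driver holds it, including vfio-pci itself when the script is run a second time. The sketch below is a hedged variant that leaves a device alone if vfio-pci already owns it. The name `vfiobind_once` and the `SYSFS_ROOT` override are hypothetical, added only for illustration, and whether skipping the redundant unbind actually avoids the fault is untested: unbinding the original driver still goes through the same kernel path.

```shell
#!/bin/bash
# Hypothetical variant of vfiobind: skip devices already bound to vfio-pci.
# SYSFS_ROOT is an assumption for illustration; it defaults to the real /sys.
vfiobind_once() {
    local dev="$1"
    local root="${SYSFS_ROOT:-/sys}"
    local path="$root/bus/pci/devices/$dev"
    local current=""

    # The driver entry is a symlink whose last component is the driver name.
    if [ -L "$path/driver" ]; then
        current=$(basename "$(readlink "$path/driver")")
    fi

    if [ "$current" = "vfio-pci" ]; then
        echo "$dev already bound to vfio-pci, skipping"
        return 0
    fi

    # Release the device from any other driver before handing it to vfio-pci.
    if [ -n "$current" ]; then
        echo "$dev" > "$path/driver/unbind"
    fi

    # Register the vendor/device ID pair with vfio-pci, as the original does.
    echo "$(cat "$path/vendor") $(cat "$path/device")" > "$root/bus/pci/drivers/vfio-pci/new_id"
}
```

On a real system this needs root, and it would be called exactly like the original, for example `vfiobind_once 0000:01:00.0`.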

From time to time I get the following output in dmesg:

[…]
[    2.100683] asus_wmi: Number of fans: 1
[    2.102383] intel_rapl: Found RAPL domain package
[    2.102397] intel_rapl: Found RAPL domain core
[    2.102419] intel_rapl: Found RAPL domain uncore
[    2.102432] intel_rapl: Found RAPL domain dram
[    2.177580] BTRFS info (device sdb1): disk space caching is enabled
[    2.177587] BTRFS: has skinny extents
[    2.342921] e1000e 0000:00:1f.6 eth0: registered PHC clock
[    2.342930] e1000e 0000:00:1f.6 eth0: (PCI Express:2.5GT/s:Width x1) 9c:5c:8e:71:9f:16
[    2.342935] e1000e 0000:00:1f.6 eth0: Intel(R) PRO/1000 Network Connection
[    2.342979] e1000e 0000:00:1f.6 eth0: MAC: 12, PHY: 12, PBA No: FFFFFF-0FF
[    2.343549] snd_hda_intel 0000:01:00.1: Disabling MSI
[    2.343562] snd_hda_intel 0000:01:00.1: Handle vga_switcheroo audio client
[    2.345370] e1000e 0000:00:1f.6 enp0s31f6: renamed from eth0
[    2.347881] [drm] Memory usable by graphics device = 4096M
[    2.347887] [drm] VT-d active for gfx access
[    2.347893] checking generic (b0000000 300000) vs hw (b0000000 10000000)
[    2.347896] fb: switching to inteldrmfb from EFI VGA
[    2.347960] Console: switching to colour dummy device 80x25
[    2.348197] [drm] Replacing VGA console driver
[    2.355606] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[    2.355612] [drm] Driver supports precise vblank timestamp query.
[    2.358314] vgaarb: device changed decodes: PCI:0000:00:02.0,olddecodes=io+mem,decodes=none:owns=io+mem
[    2.382809] clocksource: Switched to clocksource tsc
[    2.384972] ACPI: Video Device [GFX0] (multi-head: yes  rom: no  post: no)
[    2.385222] input: Video Bus as /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/LNXVIDEO:00/input/input14
[    2.612785] snd_hda_intel 0000:00:1f.3: bound 0000:00:02.0 (ops i915_audio_component_bind_ops [i915])
[    2.612798] [drm] Initialized i915 1.6.0 20151010 for 0000:00:02.0 on minor 0
[    2.613793] fbcon: inteldrmfb (fb0) is primary device
[    2.767326] input: HDA NVidia HDMI/DP,pcm=3 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card1/input15
[    2.767469] input: HDA NVidia HDMI/DP,pcm=7 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card1/input16
[    2.767592] input: HDA NVidia HDMI/DP,pcm=8 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card1/input17
[    2.767726] input: HDA NVidia HDMI/DP,pcm=9 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card1/input18
[    2.795658] snd_hda_codec_realtek hdaudioC0D0: autoconfig for ALC892: line_outs=3 (0x14/0x15/0x16/0x0/0x0) type:line
[    2.795662] snd_hda_codec_realtek hdaudioC0D0:    speaker_outs=0 (0x0/0x0/0x0/0x0/0x0)
[    2.795666] snd_hda_codec_realtek hdaudioC0D0:    hp_outs=1 (0x1b/0x0/0x0/0x0/0x0)
[    2.795668] snd_hda_codec_realtek hdaudioC0D0:    mono: mono_out=0x0
[    2.795670] snd_hda_codec_realtek hdaudioC0D0:    dig-out=0x11/0x1e
[    2.795672] snd_hda_codec_realtek hdaudioC0D0:    inputs:
[    2.795675] snd_hda_codec_realtek hdaudioC0D0:      Front Mic=0x19
[    2.795678] snd_hda_codec_realtek hdaudioC0D0:      Rear Mic=0x18
[    2.795681] snd_hda_codec_realtek hdaudioC0D0:      Line=0x1a
[    2.823792] input: HDA Intel PCH Front Mic as /devices/pci0000:00/0000:00:1f.3/sound/card0/input19
[    2.823863] input: HDA Intel PCH Rear Mic as /devices/pci0000:00/0000:00:1f.3/sound/card0/input20
[    2.823923] input: HDA Intel PCH Line as /devices/pci0000:00/0000:00:1f.3/sound/card0/input21
[    2.823985] input: HDA Intel PCH Line Out Front as /devices/pci0000:00/0000:00:1f.3/sound/card0/input22
[    2.824079] input: HDA Intel PCH Line Out Surround as /devices/pci0000:00/0000:00:1f.3/sound/card0/input23
[    2.824178] input: HDA Intel PCH Line Out CLFE as /devices/pci0000:00/0000:00:1f.3/sound/card0/input24
[    2.824406] input: HDA Intel PCH Front Headphone as /devices/pci0000:00/0000:00:1f.3/sound/card0/input25
[    2.824623] input: HDA Intel PCH HDMI/DP,pcm=3 as /devices/pci0000:00/0000:00:1f.3/sound/card0/input26
[    2.824838] input: HDA Intel PCH HDMI/DP,pcm=7 as /devices/pci0000:00/0000:00:1f.3/sound/card0/input27
[    2.892798] Console: switching to colour frame buffer device 240x67
[    2.904218] i915 0000:00:02.0: fb0: inteldrmfb frame buffer device
[    2.937566] BTRFS error (device sda2): could not find root 8
[    2.937660] BTRFS error (device sda2): could not find root 8
[    3.183286] IPv6: ADDRCONF(NETDEV_UP): enp0s31f6: link is not ready
[    3.295344] e1000e: enp0s31f6 NIC Link is Down
[    3.295478] bridge: automatic filtering via arp/ip/ip6tables has been deprecated. Update your scripts to load br_netfilter if you need this.
[    3.506576] device enp0s31f6 entered promiscuous mode
[    3.506634] IPv6: ADDRCONF(NETDEV_UP): enp0s31f6: link is not ready
[    3.532334] IPv6: ADDRCONF(NETDEV_UP): br0: link is not ready
[    3.547316] usb 3-3: ep 0x83 - rounding interval to 64 microframes, ep desc says 80 microframes
[    3.576187] input: Logitech Logitech G930 Headset as /devices/pci0000:00/0000:00:1b.0/0000:02:00.0/usb3/3-3/3-3:1.3/0003:046D:0A1F.000A/input/input28
[    3.629624] hid-generic 0003:046D:0A1F.000A: input,hiddev0,hidraw7: USB HID v1.01 Device [Logitech Logitech G930 Headset] on usb-0000:02:00.0-3/input3
[    3.734704] nf_conntrack version 0.5.0 (65536 buckets, 262144 max)
[    3.794715] ip6_tables: (C) 2000-2006 Netfilter Core Team
[    3.896007] usbcore: registered new interface driver snd-usb-audio
[    3.940928] Ebtables v2.0 registered
[    4.359632] [drm] RC6 on
[    4.547581] Netfilter messages via NETLINK v0.30.
[    4.566673] ip_set: protocol 6
[    4.856349] IPv6: ADDRCONF(NETDEV_UP): br0: link is not ready
[    4.858396] IPv6: ADDRCONF(NETDEV_UP): enp0s31f6: link is not ready
[    4.872850] IPv6: ADDRCONF(NETDEV_UP): br0: link is not ready
[    6.148024] e1000e: enp0s31f6 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
[    6.148076] IPv6: ADDRCONF(NETDEV_CHANGE): enp0s31f6: link becomes ready
[    6.148103] br0: port 1(enp0s31f6) entered forwarding state
[    6.148111] br0: port 1(enp0s31f6) entered forwarding state
[    6.148166] IPv6: ADDRCONF(NETDEV_CHANGE): br0: link becomes ready
[   19.552743] VFIO - User Level meta-driver version: 0.3
[   19.554650] vgaarb: device changed decodes: PCI:0000:01:00.0,olddecodes=io+mem,decodes=io+mem:owns=none
[   19.566402] vfio_pci: add [10de:13c2[ffff:ffff]] class 0x000000/00000000
[   19.566406] vfio_pci: add [10de:0fbb[ffff:ffff]] class 0x000000/00000000
[   19.682709] xhci_hcd 0000:02:00.0: remove, state 4
[   19.682718] usb usb4: USB disconnect, device number 1
[   19.694011] xhci_hcd 0000:02:00.0: USB bus 4 deregistered
[   19.694017] xhci_hcd 0000:02:00.0: remove, state 1
[   19.694022] usb usb3: USB disconnect, device number 1
[   19.694023] usb 3-1: USB disconnect, device number 2
[   19.830199] usb 3-2: USB disconnect, device number 3
[   19.870253] usb 3-3: USB disconnect, device number 4
[   19.912723] xhci_hcd 0000:02:00.0: USB bus 3 deregistered
[   19.933880] vgaarb: device changed decodes: PCI:0000:01:00.0,olddecodes=io+mem,decodes=io+mem:owns=none
[   19.950046] vgaarb: device changed decodes: PCI:0000:01:00.0,olddecodes=io+mem,decodes=io+mem:owns=none
[   20.166668] tun: Universal TUN/TAP device driver, 1.6
[   20.166671] tun: (C) 1999-2004 Max Krasnyansky <maxk@qualcomm.com>
[   20.679681] device tap0 entered promiscuous mode
[   20.679812] br0: port 2(tap0) entered forwarding state
[   20.679839] br0: port 2(tap0) entered forwarding state
[   21.195993] br0: port 1(enp0s31f6) entered forwarding state
[   22.285815] vfio-pci 0000:01:00.0: enabling device (0000 -> 0003)
[   22.285945] vfio_ecap_init: 0000:01:00.0 hiding ecap 0x1e@0x258
[   22.285960] vfio_ecap_init: 0000:01:00.0 hiding ecap 0x19@0x900
[   29.026789] kvm: zapping shadow pages for mmio generation wraparound
[   29.028433] kvm: zapping shadow pages for mmio generation wraparound
[   35.700561] br0: port 2(tap0) entered forwarding state
[   37.760519] nf_conntrack: automatic helper assignment is deprecated and it will be removed soon. Use the iptables CT target to attach helpers instead.
[  259.138821] br0: port 2(tap0) entered disabled state
[  259.138877] device tap0 left promiscuous mode
[  259.138881] br0: port 2(tap0) entered disabled state
[  301.236279] vgaarb: device changed decodes: PCI:0000:01:00.0,olddecodes=io+mem,decodes=io+mem:owns=none
[  301.251128] vgaarb: device changed decodes: PCI:0000:01:00.0,olddecodes=io+mem,decodes=io+mem:owns=none
[  301.280940] general protection fault: 0000 [#1] PREEMPT SMP 
[  301.281011] Modules linked in: kvm_intel kvm tun vfio_pci vfio_iommu_type1 vfio_virqfd vfio nf_conntrack_netbios_ns nf_conntrack_broadcast xt_tcpudp ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ip_set nfnetlink ebtable_broute ebtable_filter ebtable_nat ebtables ip6table_raw ip6table_security ip6table_mangle ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_filter ip6_tables iptable_raw iptable_security iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_filter snd_usb_audio snd_usbmidi_lib snd_rawmidi snd_seq_device cfg80211 bridge stp llc snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi nls_iso8859_1 nls_cp437 vfat fat intel_rapl iosf_mbi x86_pkg_temp_thermal eeepc_wmi intel_powerclamp
[  301.281810]  asus_wmi sparse_keymap coretemp mxm_wmi i915 irqbypass snd_hda_intel crct10dif_pclmul crc32_pclmul snd_hda_codec aesni_intel snd_hda_core aes_x86_64 lrw gf128mul glue_helper e1000e ablk_helper snd_hwdep cryptd drm_kms_helper snd_pcm psmouse joydev evdev snd_timer input_leds drm serio_raw snd pcspkr led_class mousedev mei_me ptp intel_gtt i2c_i801 mac_hid syscopyarea pps_core sysfillrect soundcore mei sysimgblt fb_sys_fops i2c_algo_bit shpchp thermal fan hci_uart i2c_hid btbcm btqca btintel battery bluetooth wmi video tpm_tis pinctrl_sunrisepoint rfkill pinctrl_intel tpm crc16 acpi_als intel_lpss_acpi intel_lpss kfifo_buf fjes processor industrialio acpi_pad button sch_fq_codel ip_tables x_tables btrfs xor hid_generic hid_logitech_hidpp hid_logitech_dj usbhid hid raid6_pq sr_mod cdrom
[  301.282680]  sd_mod atkbd libps2 crc32c_intel ahci xhci_pci libahci xhci_hcd libata usbcore scsi_mod usb_common i8042 serio pci_stub [last unloaded: kvm]
[  301.282842] CPU: 2 PID: 1437 Comm: madvmarch Not tainted 4.4.1-1-vfio #1
[  301.282904] Hardware name: System manufacturer System Product Name/Z170-A, BIOS 1602 01/07/2016
[  301.282981] task: ffff880461fcd880 ti: ffff8800641b0000 task.ti: ffff8800641b0000
[  301.283047] RIP: 0010:[<ffffffff813f4fcf>]  [<ffffffff813f4fcf>] __rpm_callback+0x2f/0x70
[  301.283129] RSP: 0018:ffff8800641b3cb8  EFLAGS: 00010286
[  301.283177] RAX: d4bdea835113adb6 RBX: ffff88047088b098 RCX: 0000000000000000
[  301.283240] RDX: 0000000000000000 RSI: ffff88047088b098 RDI: ffff88047088b098
[  301.283303] RBP: ffff8800641b3cd0 R08: ffffffff81ae6920 R09: ffffffff810cde96
[  301.283365] R10: 0000000000000000 R11: ffff8804090ff000 R12: ffff88047088b148
[  301.283427] R13: d4bdea835113adb6 R14: 0000000000000000 R15: 000000000000000d
[  301.283490] FS:  00007f64d6d94700(0000) GS:ffff880483c80000(0000) knlGS:0000000000000000
[  301.283561] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  301.283612] CR2: 00000000006ca0e8 CR3: 00000001e72d6000 CR4: 00000000003406e0
[  301.283675] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  301.283737] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  301.283798] Stack:
[  301.283819]  0000000000000008 ffff88047088b098 0000000000000004 ffff8800641b3d00
[  301.283896]  ffffffff813f5dd2 ffff88047088b098 0000000000000004 ffff88047088b148
[  301.283973]  0000000000000246 ffff8800641b3d30 ffffffff813f5e8d ffff88047088b098
[  301.284049] Call Trace:
[  301.284079]  [<ffffffff813f5dd2>] rpm_idle+0x1f2/0x260
[  301.284129]  [<ffffffff813f5e8d>] __pm_runtime_idle+0x4d/0x70
[  301.284184]  [<ffffffff8130e79a>] pci_device_remove+0x7a/0xc0
[  301.284238]  [<ffffffff813eb151>] __device_release_driver+0xa1/0x150
[  301.284296]  [<ffffffff813eb223>] device_release_driver+0x23/0x30
[  301.284355]  [<ffffffff813e9e1d>] unbind_store+0x10d/0x160
[  301.284408]  [<ffffffff813e8fd5>] drv_attr_store+0x25/0x30
[  301.284462]  [<ffffffff81259947>] sysfs_kf_write+0x37/0x40
[  301.284514]  [<ffffffff81258f0a>] kernfs_fop_write+0x11a/0x170
[  301.284572]  [<ffffffff811de6d7>] __vfs_write+0x37/0x100
[  301.284624]  [<ffffffff811defe7>] vfs_write+0xa7/0x1a0
[  301.284672]  [<ffffffff811dfcc5>] SyS_write+0x55/0xc0
[  301.284720]  [<ffffffff811fb523>] ? __close_fd+0xa3/0xd0
[  301.284773]  [<ffffffff81591cae>] entry_SYSCALL_64_fastpath+0x12/0x71
[  301.284830] Code: 00 55 f6 86 99 01 00 00 02 48 89 e5 41 55 41 54 4c 8d a6 b0 00 00 00 53 49 89 fd 48 89 f3 4c 89 e7 74 29 e8 64 c5 19 00 48 89 df <41> ff d5 f6 83 99 01 00 00 02 41 89 c5 4c 89 e7 75 16 e8 2a c7 
[  301.285191] RIP  [<ffffffff813f4fcf>] __rpm_callback+0x2f/0x70
[  301.285250]  RSP <ffff8800641b3cb8>
[  301.307083] ---[ end trace 6469b2ae0c961bcb ]---

Because I blacklisted the nouveau and nvidia drivers, nothing claims the graphics card itself at boot, but its HDMI audio function (01:00.1) is still picked up by snd_hda_intel, the same open-source sound driver that serves the Intel audio devices. I can prevent this by blacklisting the sound driver in /etc/modprobe.d/blacklist_snd_hda_intel.conf:

blacklist snd_hda_intel

Unfortunately this disables the driver not only for the graphics card's audio but also for the integrated Intel GPU's audio. If I modprobe snd_hda_intel after booting and binding the graphics card to vfio-pci, only the onboard sound card comes up, not the HDMI audio of the integrated GPU. With this workaround, however, the crash goes away and I can power off and boot my VM thousands of times without a crash.
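
A possible alternative to blacklisting the sound driver outright, sketched here under stated assumptions and not tested by anyone in this thread: let vfio-pci claim the card's two functions by ID before snd_hda_intel loads, using the `ids=` option of vfio-pci and a modprobe.d `softdep` line. The IDs 10de:13c2 and 10de:0fbb are the ones vfio_pci reports adding in the dmesg output above; the function name `write_vfio_conf` and the file name vfio.conf are hypothetical. If the modules are loaded from the initramfs, it would presumably have to be regenerated as well.

```shell
#!/bin/bash
# Hypothetical helper: write a modprobe.d snippet into the given directory.
# Arguments: target directory, then one or more vendor:device IDs.
write_vfio_conf() {
    local dir="$1"; shift
    local ids
    # Join the remaining arguments with commas, as the ids= option expects.
    ids=$(IFS=,; echo "$*")
    {
        # Have vfio-pci claim these IDs as soon as it loads.
        echo "options vfio-pci ids=$ids"
        # Ask modprobe to load vfio-pci before snd_hda_intel.
        echo "softdep snd_hda_intel pre: vfio-pci"
    } > "$dir/vfio.conf"
}
```

Run as root, `write_vfio_conf /etc/modprobe.d 10de:13c2 10de:0fbb` would create the file; the idea is that snd_hda_intel then still loads for the Intel audio devices while the graphics card's audio function is already held by vfio-pci.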

I should add that in my case the host didn't freeze completely, but I can't power off the host or start a VM again.

Kind regards

