#1 2021-06-01 21:29:36

jackson2k
Member
Registered: 2021-06-01
Posts: 6

Kernel warning when starting a VFIO virtual machine

I recently switched from binding the GPU to the vfio drivers at boot time to using a hotplug script, so I don't have to reboot every time I want to use my virtual machine. Since I started using it, I have noticed that my journal (journalctl) fills up with kernel warnings whenever I start the virtual machine:

Jun 01 17:23:26 mainPC kernel: ------------[ cut here ]------------
Jun 01 17:23:26 mainPC kernel: WARNING: CPU: 8 PID: 32335 at arch/x86/include/asm/kfence.h:44 kfence_protect_page+0x39/0xc0
Jun 01 17:23:26 mainPC kernel: Modules linked in: vfio_pci vfio_virqfd vfio_iommu_type1 vfio vhost_net vhost vhost_iotlb tap tun xt_CHECKSUM xt_MASQUERADE ipt_REJECT nf_reject_ipv4 nft_chain_nat nf_nat bridge xt_tcpudp nft_counter cfg80211 xt_state xt_conntrack rfkill nf_conntrack 8021q garp nf_defrag_ipv6 mrp nf_defrag_ipv4 nft_compat stp nf_tables llc nct6775 libcrc32c nfnetlink hwmon_vid intel_rapl_msr vfat intel_rapl_common amdgpu fat snd_hda_codec_realtek edac_mce_amd snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi kvm_amd snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi mousedev joydev kvm snd_hda_codec gpu_sched i2c_algo_bit drm_ttm_helper ttm snd_hda_core irqbypass crct10dif_pclmul crc32_pclmul drm_kms_helper snd_hwdep ghash_clmulni_intel snd_pcm wmi_bmof cec aesni_intel r8169 sp5100_tco ccp snd_timer syscopyarea realtek crypto_simd sysfillrect mdio_devres cryptd snd sysimgblt rapl pcspkr k10temp fb_sys_fops libphy i2c_piix4 soundcore rng_core wmi gpio_amdpt pinctrl_amd gpio_generic
Jun 01 17:23:26 mainPC kernel:  mac_hid acpi_cpufreq drm fuse crypto_user agpgart bpf_preload ip_tables x_tables usbhid ext4 crc32c_generic crc16 mbcache jbd2 xhci_pci crc32c_intel xhci_pci_renesas [last unloaded: vfio_virqfd]
Jun 01 17:23:26 mainPC kernel: CPU: 8 PID: 32335 Comm: kwin_x11 Tainted: G      D W  OE     5.12.8-arch1-1 #1
Jun 01 17:23:26 mainPC kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./AB350 Gaming K4, BIOS P6.40 08/31/2020
Jun 01 17:23:26 mainPC kernel: RIP: 0010:kfence_protect_page+0x39/0xc0
Jun 01 17:23:26 mainPC kernel: Code: 25 28 00 00 00 48 89 44 24 08 31 c0 48 8d 74 24 04 c7 44 24 04 00 00 00 00 e8 33 32 dc ff 48 85 c0 74 07 83 7c 24 04 01 74 06 <0f> 0b 31 c0 eb 4c 48 8b 38 48 89 c2 84 db 75 59 48 89 f8 0f 1f 40
Jun 01 17:23:26 mainPC kernel: RSP: 0018:ffffb89e80867ad8 EFLAGS: 00010046
Jun 01 17:23:26 mainPC kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffb89e80867adc
Jun 01 17:23:26 mainPC kernel: RDX: ffffb89e80867adc RSI: 0000000000000000 RDI: 0000000000000000
Jun 01 17:23:26 mainPC kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
Jun 01 17:23:26 mainPC kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
Jun 01 17:23:26 mainPC kernel: R13: ffffb89e80867b98 R14: 0000000000000010 R15: 0000000000000000
Jun 01 17:23:26 mainPC kernel: FS:  00007faa10039840(0000) GS:ffff8c81cec00000(0000) knlGS:0000000000000000
Jun 01 17:23:26 mainPC kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 01 17:23:26 mainPC kernel: CR2: 0000000000000010 CR3: 000000021e010000 CR4: 00000000003506e0
Jun 01 17:23:26 mainPC kernel: Call Trace:
Jun 01 17:23:26 mainPC kernel:  kfence_unprotect+0x13/0x30
Jun 01 17:23:26 mainPC kernel:  page_fault_oops+0x9d/0x2d0
Jun 01 17:23:26 mainPC kernel:  ? free_one_page+0x5f/0xd0
Jun 01 17:23:26 mainPC kernel:  exc_page_fault+0x67/0x170
Jun 01 17:23:26 mainPC kernel:  asm_exc_page_fault+0x1e/0x30
Jun 01 17:23:26 mainPC kernel: RIP: 0010:ttm_resource_free+0x1c/0x50 [ttm]
Jun 01 17:23:26 mainPC kernel: Code: 5e c3 0f 0b eb e2 b8 f4 ff ff ff eb ec 90 0f 1f 44 00 00 53 48 63 56 1c 48 89 f3 48 8b 87 48 01 00 00 48 8b bc d0 80 00 00 00 <48> 8b 47 10 48 85 c0 74 0e 48 8b 40 08 48 85 c0 74 05 e8 3d 2c c9
Jun 01 17:23:26 mainPC kernel: RSP: 0018:ffffb89e80867c40 EFLAGS: 00010282
Jun 01 17:23:26 mainPC kernel: RAX: ffff8c812b8e55a8 RBX: ffff8c7ef2c225c8 RCX: 00000000802a001e
Jun 01 17:23:26 mainPC kernel: RDX: 0000000000000000 RSI: ffff8c7ef2c225c8 RDI: 0000000000000000
Jun 01 17:23:26 mainPC kernel: RBP: ffff8c7ef2c225c0 R08: 0000000000000001 R09: 0000000000000001
Jun 01 17:23:26 mainPC kernel: R10: dead000000000100 R11: dead000000000100 R12: ffff8c812b8e55a8
Jun 01 17:23:26 mainPC kernel: R13: ffff8c7ef2c22458 R14: 0000000000008440 R15: ffff8c80af6ea7b0
Jun 01 17:23:26 mainPC kernel:  ttm_bo_release+0x171/0x2f0 [ttm]
Jun 01 17:23:26 mainPC kernel:  ttm_bo_vm_close+0x15/0x30 [ttm]
Jun 01 17:23:26 mainPC kernel:  remove_vma+0x31/0x70
Jun 01 17:23:26 mainPC kernel:  exit_mmap+0xe9/0x1f0
Jun 01 17:23:26 mainPC kernel:  mmput+0x52/0x120
Jun 01 17:23:26 mainPC kernel:  do_exit+0x322/0xa50
Jun 01 17:23:26 mainPC kernel:  do_group_exit+0x33/0xa0
Jun 01 17:23:26 mainPC kernel:  get_signal+0x137/0x910
Jun 01 17:23:26 mainPC kernel:  arch_do_signal_or_restart+0x116/0x750
Jun 01 17:23:26 mainPC kernel:  ? do_send_sig_info+0x6b/0xb0
Jun 01 17:23:26 mainPC kernel:  exit_to_user_mode_prepare+0xd4/0x150
Jun 01 17:23:26 mainPC kernel:  syscall_exit_to_user_mode+0x23/0x50
Jun 01 17:23:26 mainPC kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xae
Jun 01 17:23:26 mainPC kernel: RIP: 0033:0x7faa15927d22
Jun 01 17:23:26 mainPC kernel: Code: Unable to access opcode bytes at RIP 0x7faa15927cf8.
Jun 01 17:23:26 mainPC kernel: RSP: 002b:00007ffd43d7afd0 EFLAGS: 00000246 ORIG_RAX: 000000000000000e
Jun 01 17:23:26 mainPC kernel: RAX: 0000000000000000 RBX: 0000000000000009 RCX: 00007faa15927d22
Jun 01 17:23:26 mainPC kernel: RDX: 0000000000000000 RSI: 00007ffd43d7afd0 RDI: 0000000000000002
Jun 01 17:23:26 mainPC kernel: RBP: 000000000000000f R08: 0000000000000000 R09: 00007ffd43d7afd0
Jun 01 17:23:26 mainPC kernel: R10: 0000000000000008 R11: 0000000000000246 R12: 00007ffd43d7b110
Jun 01 17:23:26 mainPC kernel: R13: 00007ffd43d7b108 R14: 000000000000000b R15: 0000000000000000
Jun 01 17:23:26 mainPC kernel: ---[ end trace 1916a19b8162488f ]---
Jun 01 17:23:26 mainPC kernel: ------------[ cut here ]------------
Jun 01 17:23:26 mainPC kernel: WARNING: CPU: 8 PID: 32335 at mm/kfence/core.c:134 kfence_unprotect+0x18/0x30
Jun 01 17:23:26 mainPC kernel: Modules linked in: vfio_pci vfio_virqfd vfio_iommu_type1 vfio vhost_net vhost vhost_iotlb tap tun xt_CHECKSUM xt_MASQUERADE ipt_REJECT nf_reject_ipv4 nft_chain_nat nf_nat bridge xt_tcpudp nft_counter cfg80211 xt_state xt_conntrack rfkill nf_conntrack 8021q garp nf_defrag_ipv6 mrp nf_defrag_ipv4 nft_compat stp nf_tables llc nct6775 libcrc32c nfnetlink hwmon_vid intel_rapl_msr vfat intel_rapl_common amdgpu fat snd_hda_codec_realtek edac_mce_amd snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi kvm_amd snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi mousedev joydev kvm snd_hda_codec gpu_sched i2c_algo_bit drm_ttm_helper ttm snd_hda_core irqbypass crct10dif_pclmul crc32_pclmul drm_kms_helper snd_hwdep ghash_clmulni_intel snd_pcm wmi_bmof cec aesni_intel r8169 sp5100_tco ccp snd_timer syscopyarea realtek crypto_simd sysfillrect mdio_devres cryptd snd sysimgblt rapl pcspkr k10temp fb_sys_fops libphy i2c_piix4 soundcore rng_core wmi gpio_amdpt pinctrl_amd gpio_generic
Jun 01 17:23:26 mainPC kernel:  mac_hid acpi_cpufreq drm fuse crypto_user agpgart bpf_preload ip_tables x_tables usbhid ext4 crc32c_generic crc16 mbcache jbd2 xhci_pci crc32c_intel xhci_pci_renesas [last unloaded: vfio_virqfd]
Jun 01 17:23:26 mainPC kernel: CPU: 8 PID: 32335 Comm: kwin_x11 Tainted: G      D W  OE     5.12.8-arch1-1 #1
Jun 01 17:23:26 mainPC kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./AB350 Gaming K4, BIOS P6.40 08/31/2020
Jun 01 17:23:26 mainPC kernel: RIP: 0010:kfence_unprotect+0x18/0x30
Jun 01 17:23:26 mainPC kernel: Code: 05 2c b6 93 01 00 c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 81 e7 00 f0 ff ff 31 f6 e8 fd fe ff ff 84 c0 74 01 c3 <0f> 0b c6 05 ff b5 93 01 00 c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f
Jun 01 17:23:26 mainPC kernel: RSP: 0018:ffffb89e80867b00 EFLAGS: 00010046
Jun 01 17:23:26 mainPC kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffb89e80867adc
Jun 01 17:23:26 mainPC kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
Jun 01 17:23:26 mainPC kernel: RBP: 0000000000000010 R08: 0000000000000000 R09: 0000000000000000
Jun 01 17:23:26 mainPC kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
Jun 01 17:23:26 mainPC kernel: R13: ffffb89e80867b98 R14: 0000000000000010 R15: 0000000000000000
Jun 01 17:23:26 mainPC kernel: FS:  00007faa10039840(0000) GS:ffff8c81cec00000(0000) knlGS:0000000000000000
Jun 01 17:23:26 mainPC kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 01 17:23:26 mainPC kernel: CR2: 0000000000000010 CR3: 000000021e010000 CR4: 00000000003506e0
Jun 01 17:23:26 mainPC kernel: Call Trace:
Jun 01 17:23:26 mainPC kernel:  page_fault_oops+0x9d/0x2d0
Jun 01 17:23:26 mainPC kernel:  ? free_one_page+0x5f/0xd0
Jun 01 17:23:26 mainPC kernel:  exc_page_fault+0x67/0x170
Jun 01 17:23:26 mainPC kernel:  asm_exc_page_fault+0x1e/0x30
Jun 01 17:23:26 mainPC kernel: RIP: 0010:ttm_resource_free+0x1c/0x50 [ttm]
Jun 01 17:23:26 mainPC kernel: Code: 5e c3 0f 0b eb e2 b8 f4 ff ff ff eb ec 90 0f 1f 44 00 00 53 48 63 56 1c 48 89 f3 48 8b 87 48 01 00 00 48 8b bc d0 80 00 00 00 <48> 8b 47 10 48 85 c0 74 0e 48 8b 40 08 48 85 c0 74 05 e8 3d 2c c9
Jun 01 17:23:26 mainPC kernel: RSP: 0018:ffffb89e80867c40 EFLAGS: 00010282
Jun 01 17:23:26 mainPC kernel: RAX: ffff8c812b8e55a8 RBX: ffff8c7ef2c225c8 RCX: 00000000802a001e
Jun 01 17:23:26 mainPC kernel: RDX: 0000000000000000 RSI: ffff8c7ef2c225c8 RDI: 0000000000000000
Jun 01 17:23:26 mainPC kernel: RBP: ffff8c7ef2c225c0 R08: 0000000000000001 R09: 0000000000000001
Jun 01 17:23:26 mainPC kernel: R10: dead000000000100 R11: dead000000000100 R12: ffff8c812b8e55a8
Jun 01 17:23:26 mainPC kernel: R13: ffff8c7ef2c22458 R14: 0000000000008440 R15: ffff8c80af6ea7b0
Jun 01 17:23:26 mainPC kernel:  ttm_bo_release+0x171/0x2f0 [ttm]
Jun 01 17:23:26 mainPC kernel:  ttm_bo_vm_close+0x15/0x30 [ttm]
Jun 01 17:23:26 mainPC kernel:  remove_vma+0x31/0x70
Jun 01 17:23:26 mainPC kernel:  exit_mmap+0xe9/0x1f0
Jun 01 17:23:26 mainPC kernel:  mmput+0x52/0x120
Jun 01 17:23:26 mainPC kernel:  do_exit+0x322/0xa50
Jun 01 17:23:26 mainPC kernel:  do_group_exit+0x33/0xa0
Jun 01 17:23:26 mainPC kernel:  get_signal+0x137/0x910
Jun 01 17:23:26 mainPC kernel:  arch_do_signal_or_restart+0x116/0x750
Jun 01 17:23:26 mainPC kernel:  ? do_send_sig_info+0x6b/0xb0
Jun 01 17:23:26 mainPC kernel:  exit_to_user_mode_prepare+0xd4/0x150
Jun 01 17:23:26 mainPC kernel:  syscall_exit_to_user_mode+0x23/0x50
Jun 01 17:23:26 mainPC kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xae
Jun 01 17:23:26 mainPC kernel: RIP: 0033:0x7faa15927d22
Jun 01 17:23:26 mainPC kernel: Code: Unable to access opcode bytes at RIP 0x7faa15927cf8.
Jun 01 17:23:26 mainPC kernel: RSP: 002b:00007ffd43d7afd0 EFLAGS: 00000246 ORIG_RAX: 000000000000000e
Jun 01 17:23:26 mainPC kernel: RAX: 0000000000000000 RBX: 0000000000000009 RCX: 00007faa15927d22
Jun 01 17:23:26 mainPC kernel: RDX: 0000000000000000 RSI: 00007ffd43d7afd0 RDI: 0000000000000002
Jun 01 17:23:26 mainPC kernel: RBP: 000000000000000f R08: 0000000000000000 R09: 00007ffd43d7afd0
Jun 01 17:23:26 mainPC kernel: R10: 0000000000000008 R11: 0000000000000246 R12: 00007ffd43d7b110
Jun 01 17:23:26 mainPC kernel: R13: 00007ffd43d7b108 R14: 000000000000000b R15: 0000000000000000
Jun 01 17:23:26 mainPC kernel: ---[ end trace 1916a19b81624890 ]---
Jun 01 17:23:26 mainPC kernel: BUG: kernel NULL pointer dereference, address: 0000000000000010
Jun 01 17:23:26 mainPC kernel: #PF: supervisor read access in kernel mode
Jun 01 17:23:26 mainPC kernel: #PF: error_code(0x0000) - not-present page
Jun 01 17:23:26 mainPC kernel: PGD 0 P4D 0 
Jun 01 17:23:26 mainPC kernel: Oops: 0000 [#4] PREEMPT SMP NOPTI
Jun 01 17:23:26 mainPC kernel: CPU: 8 PID: 32335 Comm: kwin_x11 Tainted: G      D W  OE     5.12.8-arch1-1 #1
Jun 01 17:23:26 mainPC kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./AB350 Gaming K4, BIOS P6.40 08/31/2020
Jun 01 17:23:26 mainPC kernel: RIP: 0010:ttm_resource_free+0x1c/0x50 [ttm]
Jun 01 17:23:26 mainPC kernel: Code: 5e c3 0f 0b eb e2 b8 f4 ff ff ff eb ec 90 0f 1f 44 00 00 53 48 63 56 1c 48 89 f3 48 8b 87 48 01 00 00 48 8b bc d0 80 00 00 00 <48> 8b 47 10 48 85 c0 74 0e 48 8b 40 08 48 85 c0 74 05 e8 3d 2c c9
Jun 01 17:23:26 mainPC kernel: RSP: 0018:ffffb89e80867c40 EFLAGS: 00010282
Jun 01 17:23:26 mainPC kernel: RAX: ffff8c812b8e55a8 RBX: ffff8c7ef2c225c8 RCX: 00000000802a001e
Jun 01 17:23:26 mainPC kernel: RDX: 0000000000000000 RSI: ffff8c7ef2c225c8 RDI: 0000000000000000
Jun 01 17:23:26 mainPC kernel: RBP: ffff8c7ef2c225c0 R08: 0000000000000001 R09: 0000000000000001
Jun 01 17:23:26 mainPC kernel: R10: dead000000000100 R11: dead000000000100 R12: ffff8c812b8e55a8
Jun 01 17:23:26 mainPC kernel: R13: ffff8c7ef2c22458 R14: 0000000000008440 R15: ffff8c80af6ea7b0
Jun 01 17:23:26 mainPC kernel: FS:  00007faa10039840(0000) GS:ffff8c81cec00000(0000) knlGS:0000000000000000
Jun 01 17:23:26 mainPC kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 01 17:23:26 mainPC kernel: CR2: 0000000000000010 CR3: 000000021e010000 CR4: 00000000003506e0
Jun 01 17:23:26 mainPC kernel: Call Trace:
Jun 01 17:23:26 mainPC kernel:  ttm_bo_release+0x171/0x2f0 [ttm]
Jun 01 17:23:26 mainPC kernel:  ttm_bo_vm_close+0x15/0x30 [ttm]
Jun 01 17:23:26 mainPC kernel:  remove_vma+0x31/0x70
Jun 01 17:23:26 mainPC kernel:  exit_mmap+0xe9/0x1f0
Jun 01 17:23:26 mainPC kernel:  mmput+0x52/0x120
Jun 01 17:23:26 mainPC kernel:  do_exit+0x322/0xa50
Jun 01 17:23:26 mainPC kernel:  do_group_exit+0x33/0xa0
Jun 01 17:23:26 mainPC kernel:  get_signal+0x137/0x910
Jun 01 17:23:26 mainPC kernel:  arch_do_signal_or_restart+0x116/0x750
Jun 01 17:23:26 mainPC kernel:  ? do_send_sig_info+0x6b/0xb0
Jun 01 17:23:26 mainPC kernel:  exit_to_user_mode_prepare+0xd4/0x150
Jun 01 17:23:26 mainPC kernel:  syscall_exit_to_user_mode+0x23/0x50
Jun 01 17:23:26 mainPC kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xae
Jun 01 17:23:26 mainPC kernel: RIP: 0033:0x7faa15927d22
Jun 01 17:23:26 mainPC kernel: Code: Unable to access opcode bytes at RIP 0x7faa15927cf8.
Jun 01 17:23:26 mainPC kernel: RSP: 002b:00007ffd43d7afd0 EFLAGS: 00000246 ORIG_RAX: 000000000000000e
Jun 01 17:23:26 mainPC kernel: RAX: 0000000000000000 RBX: 0000000000000009 RCX: 00007faa15927d22
Jun 01 17:23:26 mainPC kernel: RDX: 0000000000000000 RSI: 00007ffd43d7afd0 RDI: 0000000000000002
Jun 01 17:23:26 mainPC kernel: RBP: 000000000000000f R08: 0000000000000000 R09: 00007ffd43d7afd0
Jun 01 17:23:26 mainPC kernel: R10: 0000000000000008 R11: 0000000000000246 R12: 00007ffd43d7b110
Jun 01 17:23:26 mainPC kernel: R13: 00007ffd43d7b108 R14: 000000000000000b R15: 0000000000000000
Jun 01 17:23:26 mainPC kernel: Modules linked in: vfio_pci vfio_virqfd vfio_iommu_type1 vfio vhost_net vhost vhost_iotlb tap tun xt_CHECKSUM xt_MASQUERADE ipt_REJECT nf_reject_ipv4 nft_chain_nat nf_nat bridge xt_tcpudp nft_counter cfg80211 xt_state xt_conntrack rfkill nf_conntrack 8021q garp nf_defrag_ipv6 mrp nf_defrag_ipv4 nft_compat stp nf_tables llc nct6775 libcrc32c nfnetlink hwmon_vid intel_rapl_msr vfat intel_rapl_common amdgpu fat snd_hda_codec_realtek edac_mce_amd snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi kvm_amd snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi mousedev joydev kvm snd_hda_codec gpu_sched i2c_algo_bit drm_ttm_helper ttm snd_hda_core irqbypass crct10dif_pclmul crc32_pclmul drm_kms_helper snd_hwdep ghash_clmulni_intel snd_pcm wmi_bmof cec aesni_intel r8169 sp5100_tco ccp snd_timer syscopyarea realtek crypto_simd sysfillrect mdio_devres cryptd snd sysimgblt rapl pcspkr k10temp fb_sys_fops libphy i2c_piix4 soundcore rng_core wmi gpio_amdpt pinctrl_amd gpio_generic
Jun 01 17:23:26 mainPC kernel:  mac_hid acpi_cpufreq drm fuse crypto_user agpgart bpf_preload ip_tables x_tables usbhid ext4 crc32c_generic crc16 mbcache jbd2 xhci_pci crc32c_intel xhci_pci_renesas [last unloaded: vfio_virqfd]
Jun 01 17:23:26 mainPC kernel: CR2: 0000000000000010
Jun 01 17:23:26 mainPC kernel: ---[ end trace 1916a19b81624891 ]---
Jun 01 17:23:26 mainPC kernel: RIP: 0010:ttm_resource_free+0x1c/0x50 [ttm]
Jun 01 17:23:26 mainPC kernel: Code: 5e c3 0f 0b eb e2 b8 f4 ff ff ff eb ec 90 0f 1f 44 00 00 53 48 63 56 1c 48 89 f3 48 8b 87 48 01 00 00 48 8b bc d0 80 00 00 00 <48> 8b 47 10 48 85 c0 74 0e 48 8b 40 08 48 85 c0 74 05 e8 3d 2c c9
Jun 01 17:23:26 mainPC kernel: RSP: 0018:ffffb89e81fc3c40 EFLAGS: 00010282
Jun 01 17:23:26 mainPC kernel: RAX: ffff8c7ec92255a8 RBX: ffff8c7ef2c169c8 RCX: 00000000802a0010
Jun 01 17:23:26 mainPC kernel: RDX: 0000000000000000 RSI: ffff8c7ef2c169c8 RDI: 0000000000000000
Jun 01 17:23:26 mainPC kernel: RBP: ffff8c7ef2c169c0 R08: 0000000000000001 R09: 0000000000000001
Jun 01 17:23:26 mainPC kernel: R10: dead000000000100 R11: dead000000000100 R12: ffff8c7ec92255a8
Jun 01 17:23:26 mainPC kernel: R13: ffff8c7ef2c16858 R14: 0000000000008440 R15: ffff8c7ed8a90870
Jun 01 17:23:26 mainPC kernel: FS:  00007faa10039840(0000) GS:ffff8c81cec00000(0000) knlGS:0000000000000000
Jun 01 17:23:26 mainPC kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 01 17:23:26 mainPC kernel: CR2: 0000000000000010 CR3: 000000021e010000 CR4: 00000000003506e0
Jun 01 17:23:26 mainPC kernel: Fixing recursive fault but reboot is needed!
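
The reports above are long and mostly repeat the same registers and module list. As a rough sketch (not something used in this thread), a small filter can keep only the headline lines of each report. The function name summarize_kernel_reports is made up for illustration, and the commented-out invocation assumes the systemd journal is available.

```shell
# Keep only the headline lines of kernel reports read from stdin:
# WARNING, BUG and Oops lines plus the kernel-mode faulting instruction (RIP: 0010:).
summarize_kernel_reports() {
    grep -E 'kernel: (WARNING:|BUG:|Oops:|RIP: 0010:)'
}

# Usage (assumption: systemd journal; -k = kernel messages, -b = current boot):
# journalctl -k -b --no-pager | summarize_kernel_reports
```

Applied to the log above, this would leave the two kfence WARNING lines, the NULL pointer BUG line, the Oops line and the RIP lines that point at ttm_resource_free.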

My startup script looks like this:

#!/bin/bash

# Stop the KDE display manager
/usr/bin/systemctl stop sddm

# Unbind VT consoles
for con in /sys/class/vtconsole/vtcon*
do
 /usr/bin/echo 0 | /usr/bin/tee "$con/bind"
done

/usr/bin/sleep 1

# Unbind the GPU from the host
pci_devs=(pci_0000_03_00_0 pci_0000_03_01_0 pci_0000_0a_00_0 pci_0000_0a_00_1)

for dev in "${pci_devs[@]}"
do
 /usr/bin/virsh nodedev-detach "$dev"
done

# Load the vfio-pci driver
/usr/bin/modprobe vfio_pci

I'm pretty sure it's not the script, but I might be wrong here.

Last edited by jackson2k (2021-06-01 21:29:58)

#2 2021-06-02 14:04:04

Lone_Wolf
Administrator
From: Netherlands, Europe
Registered: 2005-10-04
Posts: 15,086

Re: Kernel warning when starting a VFIO virtual machine

Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./AB350 Gaming K4, BIOS P6.40 08/31/2020

Do you have this motherboard?

How many video cards do you have, and what brand/model and drivers?
(Wild guess: one NVIDIA card running the proprietary NVIDIA driver.)

The logs mention kfence, and there were recent changes to kfence handling in the kernel.
Have you tried linux-lts?


Disliking systemd intensely, but not satisfied with alternatives so focusing on taming systemd.

clean chroot building not flexible enough ?
Try clean chroot manager by graysky

#3 2021-06-02 17:11:05

jackson2k
Member
Registered: 2021-06-01
Posts: 6

Re: Kernel warning when starting a VFIO virtual machine

Lone_Wolf wrote:
Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./AB350 Gaming K4, BIOS P6.40 08/31/2020

Do you have this motherboard?

How many video cards do you have, and what brand/model and drivers?
(Wild guess: one NVIDIA card running the proprietary NVIDIA driver.)

The logs mention kfence, and there were recent changes to kfence handling in the kernel.
Have you tried linux-lts?

motherboard: Yes, that is the exact model.

videocard: 1x Sapphire RX 580 Nitro+ 4GB, and I'm using the xf86-video-amdgpu 19.1.0-2 driver.

linux-lts: I haven't tried it, but I will as soon as possible and report back.

Last edited by jackson2k (2021-06-02 17:13:06)

#4 2021-06-02 19:12:51

loqs
Member
Registered: 2014-03-06
Posts: 18,917

Re: Kernel warning when starting a VFIO virtual machine

Lone_Wolf wrote:

The logs mention kfence, and there were recent changes to kfence handling in the kernel.
Have you tried linux-lts?

kfence was added in 5.12, so switching to linux-lts will avoid any issues with kfence, as linux-lts does not include it.
Do you believe it is a bug in kfence rather than a correct detection?
Edit:
kfence detection is currently disabled by default due to performance issues:
CONFIG_KFENCE_SAMPLE_INTERVAL=0
Have you enabled it explicitly?
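
One way to answer that question on a running system is sketched below. It assumes the kernel exposes its build configuration at /proc/config.gz and, when KFENCE is built in, the runtime parameter at /sys/module/kfence/parameters/sample_interval. The helper name kfence_config_interval is made up for illustration.

```shell
# Print the value of CONFIG_KFENCE_SAMPLE_INTERVAL from a kernel config read on stdin.
kfence_config_interval() {
    sed -n 's/^CONFIG_KFENCE_SAMPLE_INTERVAL=//p'
}

# Build-time default of the running kernel (only if /proc/config.gz is available).
if [ -r /proc/config.gz ]; then
    zcat /proc/config.gz | kfence_config_interval
fi

# Runtime value; 0 means sampling is off. It can be overridden at boot with
# the kfence.sample_interval= kernel parameter.
if [ -r /sys/module/kfence/parameters/sample_interval ]; then
    cat /sys/module/kfence/parameters/sample_interval
fi
```

If both values are 0 and no kfence.sample_interval= option is on the kernel command line, sampling was not enabled explicitly.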

Last edited by loqs (2021-06-02 19:27:02)

#5 2021-06-02 23:14:28

jackson2k
Member
Registered: 2021-06-01
Posts: 6

Re: Kernel warning when starting a VFIO virtual machine

Hmm, that's odd. With the linux-lts kernel it now produces a different warning, though it's a bit shorter:

Jun 03 02:04:56 mainPC kernel: Oops: 0000 [#1] SMP NOPTI
Jun 03 02:04:56 mainPC kernel: CPU: 2 PID: 723 Comm: kwin_x11 Tainted: G        W         5.10.41-1-lts #1
Jun 03 02:04:56 mainPC kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./AB350 Gaming K4, BIOS P6.40 08/31/2020
Jun 03 02:04:56 mainPC kernel: RIP: 0010:ttm_resource_free+0x1c/0x50 [ttm]
Jun 03 02:04:56 mainPC kernel: Code: eb e4 41 bd f4 ff ff ff eb dc 0f 1f 40 00 0f 1f 44 00 00 53 48 63 56 24 48 89 f3 48 8b 87 48 01 00 00 48 8b bc d0 80 00 00 00 <48> 8b 47 10 48 85 c0 74 0e 48 8b 40 08 48 85 c0 74 05 e8 4d 7d 32
Jun 03 02:04:56 mainPC kernel: RSP: 0018:ffff9f8f80d6fc50 EFLAGS: 00010286
Jun 03 02:04:56 mainPC kernel: RAX: ffff8bc950f455b8 RBX: ffff8bc977bca9d0 RCX: 000000008020000e
Jun 03 02:04:56 mainPC kernel: RDX: 0000000000000000 RSI: ffff8bc977bca9d0 RDI: 0000000000000000
Jun 03 02:04:56 mainPC kernel: RBP: ffff8bc950f455b8 R08: 0000000000000001 R09: 0000000000000000
Jun 03 02:04:56 mainPC kernel: R10: 0000000000000001 R11: 00000000ffffff00 R12: ffffffffc08e4f68
Jun 03 02:04:56 mainPC kernel: R13: ffff8bc977bca858 R14: ffff8bc977bca9c8 R15: 0000000000008480
Jun 03 02:04:56 mainPC kernel: FS:  00007f9ec7177840(0000) GS:ffff8bcc4e880000(0000) knlGS:0000000000000000
Jun 03 02:04:56 mainPC kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 03 02:04:56 mainPC kernel: CR2: 0000000000000010 CR3: 000000003ee10000 CR4: 00000000003506e0
Jun 03 02:04:56 mainPC kernel: Call Trace:
Jun 03 02:04:56 mainPC kernel:  ttm_bo_release+0x185/0x320 [ttm]
Jun 03 02:04:56 mainPC kernel:  ttm_bo_vm_close+0x15/0x30 [ttm]
Jun 03 02:04:56 mainPC kernel:  remove_vma+0x29/0x60
Jun 03 02:04:56 mainPC kernel:  exit_mmap+0xea/0x1a0
Jun 03 02:04:56 mainPC kernel:  mmput+0x49/0x110
Jun 03 02:04:56 mainPC kernel:  do_exit+0x30c/0xa20
Jun 03 02:04:56 mainPC kernel:  do_group_exit+0x33/0xa0
Jun 03 02:04:56 mainPC kernel:  get_signal+0x157/0x890
Jun 03 02:04:56 mainPC kernel:  ? __send_signal+0x1cd/0x3b0
Jun 03 02:04:56 mainPC kernel:  arch_do_signal+0x30/0x710
Jun 03 02:04:56 mainPC kernel:  ? do_send_sig_info+0x6b/0xc0
Jun 03 02:04:56 mainPC kernel:  exit_to_user_mode_prepare+0xb4/0x120
Jun 03 02:04:56 mainPC kernel:  syscall_exit_to_user_mode+0x28/0x140
Jun 03 02:04:56 mainPC kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xa9
Jun 03 02:04:56 mainPC kernel: RIP: 0033:0x7f9ecca66d22
Jun 03 02:04:56 mainPC kernel: Code: Unable to access opcode bytes at RIP 0x7f9ecca66cf8.
Jun 03 02:04:56 mainPC kernel: RSP: 002b:00007ffc1b9dbe50 EFLAGS: 00000246 ORIG_RAX: 000000000000000e
Jun 03 02:04:56 mainPC kernel: RAX: 0000000000000000 RBX: 0000000000000009 RCX: 00007f9ecca66d22
Jun 03 02:04:56 mainPC kernel: RDX: 0000000000000000 RSI: 00007ffc1b9dbe50 RDI: 0000000000000002
Jun 03 02:04:56 mainPC kernel: RBP: 000000000000000f R08: 0000000000000000 R09: 00007ffc1b9dbe50
Jun 03 02:04:56 mainPC kernel: R10: 0000000000000008 R11: 0000000000000246 R12: 00007ffc1b9dbf90
Jun 03 02:04:56 mainPC kernel: R13: 00007ffc1b9dbf88 R14: 000000000000000b R15: 0000000000000000
Jun 03 02:04:56 mainPC kernel: Modules linked in: vhost_net vhost vhost_iotlb tap tun vfio_pci vfio_virqfd vfio_iommu_type1 vfio xt_CHECKSUM xt_MASQUERADE ipt_REJECT nf_reject_ipv4 nft_chain_nat nf_nat bridge xt_tcpudp nft_counter cfg80211 xt_state xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_compat nf_tables rfkill 8021q joydev mousedev garp nct6775 mrp libcrc32c stp llc nfnetlink hwmon_vid usbhid snd_hda_codec_realtek vfat fat amdgpu snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi edac_mce_amd snd_hda_intel snd_intel_dspcfg soundwire_intel soundwire_generic_allocation soundwire_cadence kvm_amd snd_hda_codec kvm snd_hda_core snd_hwdep soundwire_bus snd_soc_core gpu_sched i2c_algo_bit ttm irqbypass crct10dif_pclmul crc32_pclmul snd_compress ghash_clmulni_intel ac97_bus aesni_intel snd_pcm_dmaengine wmi_bmof crypto_simd drm_kms_helper snd_pcm cryptd glue_helper ccp rapl r8169 snd_timer cec realtek k10temp snd mdio_devres syscopyarea sysfillrect sp5100_tco libphy pcspkr sysimgblt
Jun 03 02:04:56 mainPC kernel:  fb_sys_fops i2c_piix4 soundcore rng_core wmi gpio_amdpt pinctrl_amd gpio_generic mac_hid acpi_cpufreq drm fuse crypto_user agpgart bpf_preload ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 crc32c_intel xhci_pci xhci_pci_renesas
Jun 03 02:04:56 mainPC kernel: CR2: 0000000000000010
Jun 03 02:04:56 mainPC kernel: ---[ end trace 49ac0a8447ed947e ]---
Jun 03 02:04:56 mainPC kernel: RIP: 0010:ttm_resource_free+0x1c/0x50 [ttm]
Jun 03 02:04:56 mainPC kernel: Code: eb e4 41 bd f4 ff ff ff eb dc 0f 1f 40 00 0f 1f 44 00 00 53 48 63 56 24 48 89 f3 48 8b 87 48 01 00 00 48 8b bc d0 80 00 00 00 <48> 8b 47 10 48 85 c0 74 0e 48 8b 40 08 48 85 c0 74 05 e8 4d 7d 32
Jun 03 02:04:56 mainPC kernel: RSP: 0018:ffff9f8f80d6fc50 EFLAGS: 00010286
Jun 03 02:04:56 mainPC kernel: RAX: ffff8bc950f455b8 RBX: ffff8bc977bca9d0 RCX: 000000008020000e
Jun 03 02:04:56 mainPC kernel: RDX: 0000000000000000 RSI: ffff8bc977bca9d0 RDI: 0000000000000000
Jun 03 02:04:56 mainPC kernel: RBP: ffff8bc950f455b8 R08: 0000000000000001 R09: 0000000000000000
Jun 03 02:04:56 mainPC kernel: R10: 0000000000000001 R11: 00000000ffffff00 R12: ffffffffc08e4f68
Jun 03 02:04:56 mainPC kernel: R13: ffff8bc977bca858 R14: ffff8bc977bca9c8 R15: 0000000000008480
Jun 03 02:04:56 mainPC kernel: FS:  00007f9ec7177840(0000) GS:ffff8bcc4e880000(0000) knlGS:0000000000000000
Jun 03 02:04:56 mainPC kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 03 02:04:56 mainPC kernel: CR2: 0000000000000010 CR3: 000000003ee10000 CR4: 00000000003506e0
Jun 03 02:04:56 mainPC kernel: Fixing recursive fault but reboot is needed!

Edit: Now, after switching back to the rolling-release kernel, kwin_x11 hangs when I reboot or shut down the system if I started a VM during that boot.
Edit 2: Now I have that issue even after switching to linux-lts. No idea what is going on.

Last edited by jackson2k (2021-06-03 14:39:55)

#6 2021-06-02 23:41:23

jackson2k
Member
Registered: 2021-06-01
Posts: 6

Re: Kernel warning when starting a VFIO virtual machine

So I did some trial-and-error testing, and I noticed that if I disable KDE's display manager SDDM, reboot, and start my VM directly from a tty console, it does not produce any warnings. I will try to pinpoint whether the issue is with SDDM or KDE itself.

Edit: It certainly isn't SDDM.

Last edited by jackson2k (2021-06-03 02:29:39)

#7 2021-06-02 23:49:39

loqs
Member
Registered: 2014-03-06
Posts: 18,917

Re: Kernel warning when starting a VFIO virtual machine

Does the GPU you are passing through use the amdgpu module when it is not passed through? If so, are you not explicitly unloading that module?
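
A quick way to see which driver currently owns a device is to look at its sysfs entry. The sketch below is only an illustration: the two device names are taken from the pci_devs list in the first post (which of them is the GPU itself is an assumption), and the helper names nodedev_to_pci_addr and bound_driver are made up.

```shell
# Convert a libvirt node-device name (pci_DDDD_BB_SS_F) into a sysfs PCI
# address (DDDD:BB:SS.F). Prints nothing if the name does not match.
nodedev_to_pci_addr() {
    printf '%s\n' "$1" |
        sed -n -E 's/^pci_([0-9a-fA-F]{4})_([0-9a-fA-F]{2})_([0-9a-fA-F]{2})_([0-7])$/\1:\2:\3.\4/p'
}

# Print the kernel driver bound to a PCI address, or "none".
bound_driver() {
    link="/sys/bus/pci/devices/$1/driver"
    if [ -e "$link" ]; then
        basename "$(readlink -f "$link")"
    else
        echo none
    fi
}

# Assumption: these two entries from the script are the GPU and its audio function.
for dev in pci_0000_0a_00_0 pci_0000_0a_00_1; do
    addr=$(nodedev_to_pci_addr "$dev")
    echo "$addr: $(bound_driver "$addr")"
done
```

Running this before and after the detach step should show the driver changing from amdgpu to vfio-pci if the handover worked.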

#8 2021-06-03 00:29:20

jackson2k
Member
Registered: 2021-06-01
Posts: 6

Re: Kernel warning when starting a VFIO virtual machine

loqs wrote:

Does the GPU you are passing through use the amdgpu module when it is not passed through? If so, are you not explicitly unloading that module?

Yes, the GPU uses the amdgpu module when it is not passed through.

I thought virsh did most of that behind the scenes.

PS. I have now added a line so the script unloads the module before starting the VM, and nothing changes in terms of warnings.
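
The exact line that was added is not shown in this post. As a minimal sketch, assuming the goal is to remove amdgpu with modprobe -r, it can help to check first whether anything still holds the module, since removal fails while the use count is non-zero. The helper name module_use_count is made up; its optional second argument exists only so it can be tried on a sample file.

```shell
# Print the use count (third column) of a module from a /proc/modules-style file.
# Prints nothing if the module is not listed.
module_use_count() {
    awk -v mod="$1" '$1 == mod { print $3 }' "${2:-/proc/modules}"
}

count=$(module_use_count amdgpu 2>/dev/null) || count=""
case "$count" in
    '')        echo "amdgpu is not loaded" ;;
    0)         echo "amdgpu is loaded but idle; 'modprobe -r amdgpu' as root should work" ;;
    *[!0-9]*)  echo "use count unavailable: $count" ;;
    *)         echo "amdgpu is still held by $count user(s)" ;;
esac
```

A non-zero count after stopping the display manager would suggest that some process still has the device open.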

Last edited by jackson2k (2021-06-03 00:30:41)

#9 2021-06-03 00:54:40

loqs
Member
Registered: 2014-03-06
Posts: 18,917

Re: Kernel warning when starting a VFIO virtual machine

My guess is a kernel bug in amdgpu. I would suggest reporting the issue at https://gitlab.freedesktop.org/drm/amd/-/issues.

#10 2021-06-03 01:23:00

jackson2k
Member
Registered: 2021-06-01
Posts: 6

Re: Kernel warning when starting a VFIO virtual machine

loqs wrote:

My guess is a kernel bug in amdgpu. I would suggest reporting the issue at https://gitlab.freedesktop.org/drm/amd/-/issues.

I wonder if the kwin_x11 hangs are caused by amdgpu. Even killing the process doesn't help; it just stays there. This happens whenever I have started my VM at least once. If I don't start it during that boot, all is fine.
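
A process that cannot be killed is often in uninterruptible sleep (state D in ps), and signals are not delivered to it until it leaves that state. Whether that is what happens here is an assumption, but it is easy to check. The helper name is_uninterruptible is made up for illustration.

```shell
# Return success if a ps STAT string denotes uninterruptible sleep (first letter D).
is_uninterruptible() {
    case "$1" in
        D*) return 0 ;;
        *)  return 1 ;;
    esac
}

# Report the state of every kwin_x11 process (prints nothing if none is running).
ps -C kwin_x11 -o pid=,stat= 2>/dev/null | while read -r pid stat; do
    if is_uninterruptible "$stat"; then
        echo "PID $pid is in state $stat: signals, including SIGKILL, have no effect while it sleeps"
    else
        echo "PID $pid is in state $stat"
    fi
done
```

A kwin_x11 process stuck in state D would fit the "Fixing recursive fault but reboot is needed!" line in the logs.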

#11 2021-06-03 10:30:29

Lone_Wolf
Administrator
From: Netherlands, Europe
Registered: 2005-10-04
Posts: 15,086

Re: Kernel warning when starting a VFIO virtual machine

loqs wrote:

kfence was added in 5.12, so switching to linux-lts will avoid any issues with kfence, as linux-lts does not include it.
Do you believe it is a bug in kfence rather than a correct detection?

The kfence_protect error made me wonder; I also was not aware that kfence didn't exist before 5.12.

The later posts make it clear I was wrong.

