#1 2023-12-01 23:04:58

JessicaJill
Member
Registered: 2022-01-04
Posts: 20

[SOLVED] Troubleshooting advice

I'm tinkering with passing my GPU through to a VM, and I'm currently at a point where, if I add anything to my vfio.conf, my PC freezes on boot.

There is a lot of trial and error on my part, and having to fix it every time it freezes is tedious (usually I'll just restore the latest snapshot, but then I lose the log files). Does anyone have advice on an easier way, or is this just part of the learning process?

Last edited by JessicaJill (2023-12-03 14:54:21)

#2 2023-12-02 08:10:26

seth
Member
From: Won't reply 2 private help req
Registered: 2012-09-03
Posts: 76,054

#3 2023-12-02 15:38:07

JessicaJill
Member
Registered: 2022-01-04
Posts: 20

Re: [SOLVED] Troubleshooting advice

My apologies, that wasn't my intention.  My goal was to gather information to troubleshoot and, IF I could not solve the problem (like the link you posted), provide full sources of information to others for assistance. Obviously, as you have said, I cannot come here and say, "my system won't boot", so I need to get access to the logs and gather as much information as possible to help you help me.  The problem is when I restore my system, the logs no longer exist, so I was simply asking: if you cannot boot, what steps do you use to get access to the logs and troubleshoot? I just didn't know if chroot was the only way, as that's onerous with my setup.

To your question, my system specs are ASRock Z590 / Intel i7-10700K / Arch (i3) 6.1.63-1-lts / Radeon RX 5700 XT (single gpu)
Regarding the passthrough, here is what I have accomplished so far:
First, confirm IOMMU is on:

❯ sudo dmesg | grep -e DMAR -e IOMMU
[    0.008138] ACPI: DMAR 0x000000009E6AD000 000050 (v02 INTEL  EDK2     00000002      01000013)
[    0.008167] ACPI: Reserving DMAR table memory at [mem 0x9e6ad000-0x9e6ad04f]
[    0.051920] DMAR: IOMMU enabled
[    0.124909] DMAR: Host address width 39
[    0.124910] DMAR: DRHD base: 0x000000fed91000 flags: 0x1
[    0.124914] DMAR: dmar0: reg_base_addr fed91000 ver 1:0 cap d2008c40660462 ecap f050da
[    0.124917] DMAR-IR: IOAPIC id 2 under DRHD base  0xfed91000 IOMMU 0
[    0.124918] DMAR-IR: HPET id 0 under DRHD base 0xfed91000
[    0.124919] DMAR-IR: Queued invalidation will be enabled to support x2apic and Intr-remapping.
[    0.126343] DMAR-IR: Enabled IRQ remapping in x2apic mode
[    0.347314] DMAR: No RMRR found
[    0.347315] DMAR: No ATSR found
[    0.347315] DMAR: No SATC found
[    0.347316] DMAR: dmar0: Using Queued invalidation
[    0.348101] DMAR: Intel(R) Virtualization Technology for Directed I/O

Check IOMMU Groups:

❯ Holdtank/GPUPassThru/iommu_group_check.sh
Group 0:	[8086:9b43]     00:00.0  Host bridge                              10th Gen Core Processor Host Bridge/DRAM Registers
Group 1:	[8086:1901]     00:01.0  PCI bridge                               6th-10th Gen Core Processor PCIe Controller (x16)
			[1002:1478] [R] 01:00.0  PCI bridge                               Navi 10 XL Upstream Port of PCI Express Switch
			[1002:1479] [R] 02:00.0  PCI bridge                               Navi 10 XL Downstream Port of PCI Express Switch
			[1002:731f] [R] 03:00.0  VGA compatible controller                Navi 10 [Radeon RX 5600 OEM/5600 XT / 5700/5700 XT]
			[1002:ab38]     03:00.1  Audio device                             Navi 10 HDMI Audio
Group 2:	[8086:43ed]     00:14.0  USB controller                           Tiger Lake-H USB 3.2 Gen 2x1 xHCI Host Controller
USB:		[1d6b:0002]		 Bus 001 Device 001                       		  Linux Foundation 2.0 root hub 
USB:		[046d:c53d]		 Bus 001 Device 003                       		  Logitech, Inc. G631 Keyboard 
USB:		[174c:2074]		 Bus 001 Device 004                       		  ASMedia Technology Inc. ASM1074 High-Speed hub 
USB:		[26ce:01a2]		 Bus 001 Device 005                       		  ASRock LED Controller 
USB:		[3938:1032]		 Bus 001 Device 006                       		  MOSART Semi. 2.4G RF Keyboard & Mouse 
USB:		[8087:0029]		 Bus 001 Device 007                       		  Intel Corp. AX200 Bluetooth 
USB:		[1d6b:0003]		 Bus 002 Device 001                       		  Linux Foundation 3.0 root hub 
USB:		[174c:3074]		 Bus 002 Device 002                      		  ASMedia Technology Inc. ASM1074 SuperSpeed hub 
USB:		[090c:1000]		 Bus 002 Device 003                       		  Silicon Motion, Inc. - Taiwan (formerly Feiya Technology Corp.) Flash Drive 
			[8086:43ef]     00:14.2  RAM memory                               Tiger Lake-H Shared SRAM
Group 3:	[8086:43e0]     00:16.0  Communication controller                 Tiger Lake-H Management Engine Interface
Group 4:	[8086:43d2]     00:17.0  SATA controller                          Device 43d2
Group 5:	[8086:43c2] [R] 00:1b.0  PCI bridge                               Device 43c2
Group 6:	[8086:43ba] [R] 00:1c.0  PCI bridge                               Tiger Lake-H PCIe Root Port #3
Group 7:	[8086:4385]     00:1f.0  ISA bridge                               Z590 LPC/eSPI Controller
			[8086:f0c8]     00:1f.3  Audio device                             Device f0c8
			[8086:43a3]     00:1f.4  SMBus                                    Tiger Lake-H SMBus Controller
			[8086:43a4]     00:1f.5  Serial bus controller                    Tiger Lake-H SPI Controller
Group 8:	[10ec:8125] [R] 04:00.0  Ethernet controller                      RTL8125 2.5GbE Controller
Group 9:	[8086:2723] [R] 05:00.0  Network controller                       Wi-Fi 6 AX200
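
The iommu_group_check.sh script itself is not shown in the post. As a rough idea of what such a listing involves, here is a minimal sketch that walks /sys/kernel/iommu_groups, along the lines of the well-known Arch wiki loop. The function name `list_iommu_groups` and its optional root argument are made up for illustration, and the output is plainer than the poster's (no USB devices, no reset markers).

```shell
#!/bin/sh
# Sketch: list PCI devices per IOMMU group by walking sysfs.
# list_iommu_groups and its optional root argument are illustrative names,
# not the poster's actual script.
list_iommu_groups() (
    root="${1:-/sys/kernel/iommu_groups}"
    [ -d "$root" ] || { echo "no IOMMU groups under $root" >&2; exit 1; }
    # Group directories are plain numbers, so sort them numerically.
    for n in $(ls "$root" | sort -n); do
        echo "Group $n:"
        for dev in "$root/$n"/devices/*; do
            [ -e "$dev" ] || continue        # group without devices
            addr="${dev##*/}"                # e.g. 0000:03:00.0
            desc=""
            # Use lspci for a readable description when it is available.
            if command -v lspci >/dev/null 2>&1; then
                desc=$(lspci -nns "$addr" 2>/dev/null) || desc=""
            fi
            printf '\t%s\n' "${desc:-$addr}"
        done
    done
)

# Usage: list_iommu_groups
```

For passthrough, the point of the listing is that everything sharing a group with the GPU (here group 1) has to be handed over together, apart from the PCI bridges.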

Enable the IOMMU on the kernel command line:

GRUB_CMDLINE_LINUX_DEFAULT="quiet loglevel=3 audit=0 nvme_load=yes intel_iommu=on iommu=pt"
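
A change to GRUB_CMDLINE_LINUX_DEFAULT only takes effect once the GRUB configuration has been regenerated (on a default Arch GRUB install that would be `grub-mkconfig -o /boot/grub/grub.cfg`; the path is an assumption about this system) and the machine has been rebooted. A minimal sketch for checking that the running kernel really received a parameter follows; `has_kernel_param` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: check whether a parameter is present on the running kernel's
# command line. has_kernel_param is an illustrative helper name; the optional
# second argument only exists so that another file can be inspected.
has_kernel_param() (
    param="$1"
    file="${2:-/proc/cmdline}"
    set -f                         # no glob expansion while splitting words
    # The command line is one space-separated line; compare whole words.
    for word in $(cat "$file"); do
        [ "$word" = "$param" ] && exit 0
    done
    exit 1
)

# Usage:
#   has_kernel_param intel_iommu=on && echo "intel_iommu=on is active"
#   has_kernel_param iommu=pt       && echo "iommu=pt is active"
```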

Set modules in mkinitcpio:

MODULES=(crc32c-intel amdgpu vfio vfio_pci vfio_iommu_type1)

Run mkinitcpio with all presets
Everything works fine up until this point.  I see no errors in the logs, I can boot, etc.
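
For reference, "all presets" corresponds to `mkinitcpio -P`. The Arch wiki's passthrough guide asks for the vfio modules to be listed before any early-loaded graphics driver in MODULES, whereas the array above lists amdgpu first. Whether that ordering plays any part in the freeze is not established in this thread. A minimal sketch for checking the order follows; `vfio_precedes` and its arguments are made up for illustration.

```shell
#!/bin/sh
# Sketch: succeed only if no vfio* entry in the MODULES=(...) array of a
# mkinitcpio.conf appears after the given graphics driver.
# vfio_precedes is an illustrative helper name.
vfio_precedes() (
    gpu="$1"
    conf="${2:-/etc/mkinitcpio.conf}"
    # Take the last uncommented MODULES=(...) line and strip the wrapper.
    mods=$(sed -n 's/^MODULES=(\(.*\)).*$/\1/p' "$conf" | tail -n 1)
    seen_gpu=0
    for m in $mods; do
        case "$m" in
            "$gpu") seen_gpu=1 ;;
            vfio*)  [ "$seen_gpu" -eq 1 ] && exit 1 ;;
        esac
    done
    exit 0
)

# Usage:
#   vfio_precedes amdgpu || echo "a vfio module is listed after amdgpu"
```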

Set up /etc/modprobe.d/vfio.conf:

# vfio.conf

# Soft dependencies
softdep amdgpu pre: vfio vfio_pci vfio-pci
softdep snd_hda_intel pre: vfio_pci vfio-pci #not sure if this is even needed

Adding GPU 
# VFIO options
options vfio-pci ids=1002:731f,1002:ab38

With this in place, the PC freezes during boot.
I've tried many different configurations with no luck; at this point I'm a bit stuck.

Not sure if this now belongs in a different forum, but I appreciate any advice or suggestions.

Looking around, I did find this post https://bbs.archlinux.org/viewtopic.php?id=280512 where, in #6, they mention enabling the onboard GPU in the BIOS and then adding i915 to the modules in mkinitcpio.conf.  I'll be trying this next to see what happens.

Last edited by JessicaJill (2023-12-02 17:57:15)

#4 2023-12-02 15:57:02

seth
Member
From: Won't reply 2 private help req
Registered: 2012-09-03
Posts: 76,054

Re: [SOLVED] Troubleshooting advice

Please use [code][/code] tags, not "quote" tags. Edit your post in this regard.

The problem is when I restore my system, the logs no longer exist

Don't reboot w/ the power button, https://wiki.archlinux.org/title/Keyboa … el_(SysRq)
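
A minimal sketch of how that advice might be put to use, under a few assumptions. The kernel only honours the SysRq functions enabled in kernel.sysrq, and many distributions ship a restrictive default, so it is worth checking before relying on Alt+SysRq+S (sync), U (remount read-only), B (reboot). `sysrq_allows_safe_reboot` is a hypothetical helper name, and /mnt in the comments is an assumed mount point.

```shell
#!/bin/sh
# Sketch: check whether the kernel.sysrq setting permits the sync (16),
# remount read-only (32) and reboot (128) functions.
# sysrq_allows_safe_reboot is an illustrative helper name.
sysrq_allows_safe_reboot() (
    file="${1:-/proc/sys/kernel/sysrq}"
    value=$(cat "$file") || exit 2
    # 1 enables every function; any other value is a bitmask.
    [ "$value" -eq 1 ] && exit 0
    need=$((16 + 32 + 128))
    [ $((value & need)) -eq "$need" ]
)

# Usage:
#   sysrq_allows_safe_reboot || echo "sync/remount/reboot not fully enabled"
#   sudo sysctl kernel.sysrq=1           # enable all functions until the next reboot
#   journalctl -b -1 -p warning          # after rebooting: previous boot, warnings and worse
#   journalctl -D /mnt/var/log/journal -b -1   # read the journal of a root mounted at /mnt
```

`journalctl -b -1` only helps if the journal is stored persistently (with default settings, that is the case when /var/log/journal exists) and if the failed boot got far enough to write to disk. The `-D` form reads a journal directory from a mounted root, for example from a live USB, without a chroot. If a snapshot restore rolls /var/log back, copying the journal somewhere outside the snapshot first should preserve it.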

single gpu

If there's only one GPU, the moment you pass it through to a VM it's no longer available to the host. You can exclusively use the VM at this point.
This is possible, but not trivial (and typically you'll pass through the GPU at some point during runtime, not during the boot) and I'm not sure whether that's actually what you want/expect?
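
To illustrate what handing the card over at runtime can look like, here is a minimal sketch using the kernel's sysfs PCI interface (driver_override, unbind, drivers_probe). It is not a tested single-GPU recipe. It assumes root, an already loaded vfio-pci module, and that nothing on the host (display server, console) is still using the card, since unbinding a GPU that is in use can hang the machine. `rebind_to_vfio` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: hand one PCI function to vfio-pci at runtime through sysfs.
# rebind_to_vfio is an illustrative helper name. Needs root on a real system,
# and the vfio-pci module must already be loaded.
rebind_to_vfio() (
    addr="$1"                      # full address, e.g. 0000:03:00.0
    sysfs="${2:-/sys}"             # overridable only for experiments
    dev="$sysfs/bus/pci/devices/$addr"
    [ -d "$dev" ] || { echo "no such device: $addr" >&2; exit 1; }
    # Prefer vfio-pci the next time this device is probed.
    echo vfio-pci > "$dev/driver_override" || exit 1
    # Release the device from its current driver, if one is bound.
    if [ -e "$dev/driver/unbind" ]; then
        echo "$addr" > "$dev/driver/unbind" || exit 1
    fi
    # Ask the PCI core to probe the device again.
    echo "$addr" > "$sysfs/bus/pci/drivers_probe"
)

# Usage (both functions of the card from the listing earlier in the thread):
#   rebind_to_vfio 0000:03:00.0
#   rebind_to_vfio 0000:03:00.1
```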

#5 2023-12-02 15:59:55

JessicaJill
Member
Registered: 2022-01-04
Posts: 20

Re: [SOLVED] Troubleshooting advice

Makes sense, but no, that's not what I want to do.  I only want to pass through the GPU when I'm running KVM.

#6 2023-12-02 16:18:44

Head_on_a_Stick
Member
From: The Wirral
Registered: 2014-02-20
Posts: 9,003

Re: [SOLVED] Troubleshooting advice

How about using the iGPU on the Intel processor for the desktop? Then you could block amdgpu so the Radeon card can be passed through.

Last edited by Head_on_a_Stick (2023-12-02 16:19:27)


Jin, Jîyan, Azadî

#7 2023-12-02 16:29:20

seth
Member
From: Won't reply 2 private help req
Registered: 2012-09-03
Posts: 76,054

Re: [SOLVED] Troubleshooting advice

The 10700K should™ have an IGP, not sure whether it's implicitly disabled, though.

lspci -k
sudo journalctl -b | curl -F 'file=@-' 0x0.st

And if you want to follow that route, you still need to attach an output there.

#8 2023-12-02 17:25:50

JessicaJill
Member
Registered: 2022-01-04
Posts: 20

Re: [SOLVED] Troubleshooting advice

Yes, the mobo does have an iGPU, which was turned off in the BIOS (the setting is terribly named) and appears to be on now:

00:02.0 Display controller: Intel Corporation CometLake-S GT2 [UHD Graphics 630] (rev 05)
	Subsystem: ASRock Incorporation CometLake-S GT2 [UHD Graphics 630]
	Kernel driver in use: i915
	Kernel modules: i915
00:14.0 USB controller: Intel Corporation Tiger Lake-H USB 3.2 Gen 2x1 xHCI Host Controller (rev 11)
	Subsystem: ASRock Incorporation Tiger Lake-H USB 3.2 Gen 2x1 xHCI Host Controller
	Kernel driver in use: xhci_hcd
	Kernel modules: xhci_pci

Did you want the output of sudo journalctl -b | curl -F 'file=@-' 0x0.st?

So do I have to block the amdgpu driver now so the card can be passed through?  Meaning I have to choose: i915 for Arch and amdgpu for the VM?

Last edited by JessicaJill (2023-12-02 17:58:21)

#9 2023-12-02 17:37:23

JessicaJill
Member
Registered: 2022-01-04
Posts: 20

Re: [SOLVED] Troubleshooting advice

Assuming I understand correctly, the next step is to change mkinitcpio.conf to use the i915 driver and remove amdgpu:

MODULES=(crc32c-intel i915 vfio vfio_pci vfio_iommu_type1)

then have vfio.conf claim the AMD GPU for vfio-pci:

# vfio.conf

# Soft dependencies
softdep amdgpu pre: vfio vfio_pci vfio-pci

# VFIO options
options vfio-pci ids=1002:731f,1002:ab38
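
After rebuilding the initramfs and rebooting with a configuration like this, it helps to confirm which driver actually ended up bound to each function of the card. A minimal sketch that reads the answer from sysfs follows; `bound_driver` is a hypothetical helper name, and the addresses in the usage comment are taken from the IOMMU listing earlier in the thread.

```shell
#!/bin/sh
# Sketch: print the kernel driver currently bound to a PCI function.
# bound_driver is an illustrative helper name.
bound_driver() (
    addr="$1"                      # full address, e.g. 0000:03:00.0
    sysfs="${2:-/sys}"
    link="$sysfs/bus/pci/devices/$addr/driver"
    # The driver entry is a symlink whose target name is the driver.
    [ -L "$link" ] || { echo "none"; exit 1; }
    basename "$(readlink "$link")"
)

# Usage:
#   bound_driver 0000:03:00.0     # VGA function
#   bound_driver 0000:03:00.1     # HDMI audio function
```

`lspci -nnk -d 1002:731f` reports the same information on its "Kernel driver in use" line. If vfio-pci claimed both functions, both should show vfio-pci rather than amdgpu or snd_hda_intel.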

Last edited by JessicaJill (2023-12-02 17:57:55)

#10 2023-12-02 20:58:11

seth
Member
From: Won't reply 2 private help req
Registered: 2012-09-03
Posts: 76,054

Re: [SOLVED] Troubleshooting advice

Yes.

Dec 02 11:18:38 persephone kernel: i915 0000:00:02.0: [drm] Cannot find any crtc or sizes
Dec 02 11:18:38 persephone kernel: i915 0000:00:02.0: [drm] Cannot find any crtc or sizes
Dec 02 11:18:38 persephone kernel: i915 0000:00:02.0: [drm] Cannot find any crtc or sizes

nb. that you still need to plug a monitor into the i915 chip in order to get visual output on the host.
The amdgpu chip will be "lost" for linux. It'll exclusively be driven by the VM
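
A message like "Cannot find any crtc or sizes" usually goes along with no display being detected on that GPU. A minimal sketch for listing what the kernel sees on each connector follows, reading the status files under /sys/class/drm. `list_connectors` is a hypothetical helper name, and the connector names on a given board will differ from any shown in examples.

```shell
#!/bin/sh
# Sketch: list DRM connectors and whether a display is attached.
# list_connectors is an illustrative helper name.
list_connectors() (
    drm="${1:-/sys/class/drm}"
    for status in "$drm"/card*-*/status; do
        [ -r "$status" ] || continue       # no connectors found
        dir="${status%/status}"
        # Prints e.g. "card0-HDMI-A-1<TAB>connected"
        printf '%s\t%s\n' "${dir##*/}" "$(cat "$status")"
    done
)

# Usage: list_connectors
```

Once a monitor is plugged into a motherboard video output, the matching i915 connector should switch from disconnected to connected.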

#11 2023-12-03 00:14:03

JessicaJill
Member
Registered: 2022-01-04
Posts: 20

Re: [SOLVED] Troubleshooting advice

Sorry, but I'm confused now...
With this setup, though, one monitor will always be dedicated to Linux and the other to the VM, correct?  If I wasn't running a VM, could I still use dual monitors in a single environment?

Is there a way, using a single passthrough, for both monitors to be dedicated to either environment?  Or would dual GPUs be required for this?  (Come to think of it, I actually do have another stashed away in a closet, but it's a completely different brand.)

Last edited by JessicaJill (2023-12-03 00:58:34)

#12 2023-12-03 07:41:06

seth
Member
From: Won't reply 2 private help req
Registered: 2012-09-03
Posts: 76,054

Re: [SOLVED] Troubleshooting advice

The brand of the monitor is irrelevant; if you have a KVM switch or the monitor has multiple (switchable) inputs, you can attach it to both GPUs and switch between host and VM display.
And of course you can use multihead setups in various ways on a single system, https://wiki.archlinux.org/title/Multihead

#13 2023-12-03 14:50:52

JessicaJill
Member
Registered: 2022-01-04
Posts: 20

Re: [SOLVED] Troubleshooting advice

Sorry, I meant a second GPU (a fairly decent one), but you have given me more than enough to get started. Thank you very much!

Last edited by JessicaJill (2023-12-03 14:53:52)
