
#1 2019-10-18 10:29:24

velumyx
Member
Registered: 2019-10-18
Posts: 2

System freeze with integrated graphics

Greetings,

Recently, I decided to set up my PC for VFIO passthrough. I planned to pass my dedicated graphics card to a QEMU/KVM guest and use the integrated Intel graphics to run an X server with i3wm.
To accomplish this, I first tried to boot my PC using the integrated graphics, which worked fine until I started X through SDDM. After roughly 10 seconds, my PC froze and I had to hard-reset.
This happens *constantly* and *reliably*.

I tried everything - from tinkering with the X configuration files up to changing kernel parameters, different kernels (linux-ck, linux-zen, linux-lts) and even compiling the most recent linux-drm-tip-kernel - but nothing has helped me so far.


Things I've attempted to fix those freezes:
- Changing UEFI settings
- Reinstalling Linux completely
- Installing and using different kernels (linux, linux-lts, linux-zen, linux-ck, linux-drm-tip-git)
- Applying various kernel parameters, like:

intel_iommu=off, on, igfx_off
i915.enable_psr=0, 1
i915.enable_dc=0, 1
i915.modeset=1, 0
intel_idle.max_cstate=1, 7

- Removing NVIDIA kernel modules
- Uninstalling NVIDIA drivers
- Installing and using xf86-video-intel
- Building and installing the X server from git
- Tinkering with the xorg.conf.d configuration files


But so far, without success. I couldn't even find any useful debug information in various logs - journalctl, dmesg, Xorg.0.log - so I'm left clueless.
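
When cycling through this many kernel parameters, it can help to confirm which ones the running kernel actually received. The following is only a minimal sketch under stated assumptions: `parse_cmdline` is a hypothetical helper, and the sample string (including `root=/dev/sda2`) is made up for illustration. On a live system the string would be read from /proc/cmdline instead.

```python
import shlex


def parse_cmdline(cmdline: str) -> dict:
    """Split a kernel command line into {name: value}; bare flags map to None."""
    params = {}
    for token in shlex.split(cmdline):
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


# Illustrative sample only (assumption). On a real machine use:
#   sample = open("/proc/cmdline").read()
sample = "root=/dev/sda2 rw intel_iommu=on i915.enable_psr=0 intel_idle.max_cstate=1"
params = parse_cmdline(sample)

# Report the parameters mentioned in the list above.
for key in ("intel_iommu", "i915.enable_psr", "i915.enable_dc",
            "i915.modeset", "intel_idle.max_cstate"):
    print(key, "=", params.get(key, "<not set>"))
```

Parameters that do not appear in the parsed result were never passed to the kernel, which rules out a typo in the bootloader entry as the reason a setting had no effect.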

I've only found out one thing:
The X server *won't* freeze if it is not using hardware-accelerated graphics (2D or 3D acceleration).
I noticed this when I used the modesetting driver with the "sna" or "uxa" AccelMethod: compton failed to start and glxinfo indicated that llvmpipe was in use.
Setting the AccelMethod back to "glamor" or leaving it unset makes the PC freeze again within about 10 seconds. However, I once managed to run glxinfo in i3wm using glamor; it actually reported hardware-accelerated graphics, and compton ran just fine.

When using the Intel driver, the PC ends up freezing no matter what I set the AccelMethod to.


System:

CPU: Intel i7-4790K
RAM: 16GB DDR3 1600MHz
Mainboard: MSI H81M-P33

uname -a

Linux archMachine 5.3.6-zen1-1-zen #1 ZEN SMP PREEMPT Fri Oct 11 18:28:20 UTC 2019 x86_64 GNU/Linux

lspci

00:00.0 Host bridge: Intel Corporation 4th Gen Core Processor DRAM Controller (rev 06)
00:01.0 PCI bridge: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor PCI Express x16 Controller (rev 06)
00:02.0 VGA compatible controller: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor Integrated Graphics Controller (rev 06)
00:14.0 USB controller: Intel Corporation 8 Series/C220 Series Chipset Family USB xHCI (rev 05)
00:16.0 Communication controller: Intel Corporation 8 Series/C220 Series Chipset Family MEI Controller #1 (rev 04)
00:1a.0 USB controller: Intel Corporation 8 Series/C220 Series Chipset Family USB EHCI #2 (rev 05)
00:1b.0 Audio device: Intel Corporation 8 Series/C220 Series Chipset High Definition Audio Controller (rev 05)
00:1c.0 PCI bridge: Intel Corporation 8 Series/C220 Series Chipset Family PCI Express Root Port #1 (rev d5)
00:1c.2 PCI bridge: Intel Corporation 8 Series/C220 Series Chipset Family PCI Express Root Port #3 (rev d5)
00:1d.0 USB controller: Intel Corporation 8 Series/C220 Series Chipset Family USB EHCI #1 (rev 05)
00:1f.0 ISA bridge: Intel Corporation H81 Express LPC Controller (rev 05)
00:1f.2 SATA controller: Intel Corporation 8 Series/C220 Series Chipset Family 6-port SATA Controller 1 [AHCI mode] (rev 05)
00:1f.3 SMBus: Intel Corporation 8 Series/C220 Series Chipset Family SMBus Controller (rev 05)
01:00.0 VGA compatible controller: NVIDIA Corporation TU116 [GeForce GTX 1660 Ti] (rev a1)
01:00.1 Audio device: NVIDIA Corporation Device 1aeb (rev a1)
01:00.2 USB controller: NVIDIA Corporation Device 1aec (rev a1)
01:00.3 Serial bus controller [0c80]: NVIDIA Corporation Device 1aed (rev a1)
03:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 0c)

Some log files:
https://paste.ee/p/RLvnz


I hope that somebody can help me find out what kind of mistake I made because I'm absolutely clueless at this point.


Thank you all in advance!

Last edited by velumyx (2019-10-18 10:39:54)


#2 2019-10-18 10:59:45

Lone_Wolf
Forum Moderator
From: Netherlands, Europe
Registered: 2005-10-04
Posts: 11,922

Re: System freeze with integrated graphics

Basics first:
- You are using early microcode loading
- The latest firmware for your mobo is from 2018 (https://www.msi.com/Motherboard/support/H81M-P33). Are you using that version?

lspci -k is more useful than plain lspci, please post that.

Your kernel command line is very long, but doesn't seem to include vfio parameters.
How are you isolating the NVIDIA card so it is available for passthrough?

Post your /etc/mkinitcpio.conf file.

When changing graphical setups, display managers tend to make troubleshooting harder.
Please boot with systemd.unit=multi-user.target as a kernel parameter and configure startx or xinit.


Disliking systemd intensely, but not satisfied with alternatives so focusing on taming systemd.


(A works at time B)  && (time C > time B ) ≠  (A works at time C)


#3 2019-10-18 11:05:32

V1del
Forum Moderator
Registered: 2012-10-16
Posts: 21,738

Re: System freeze with integrated graphics

You are also doing some weird overrides. You shouldn't override ACPI (and likely not explicitly to Linux) unless you actually know you need to. Why are you configured to use DRI 2 and ignore ABI?

I suggest you go back to a cleaner slate in general. Revert the out-of-tree packages, go back to stable versions, and remove most of the kernel parameters other than the ones explicitly necessary for passthrough.


#4 2019-10-18 11:42:44

velumyx
Member
Registered: 2019-10-18
Posts: 2

Re: System freeze with integrated graphics

Hello everyone and thanks for checking in!

Lone_Wolf wrote:

Basics first:
- You are using early microcode loading

Yeah. I tried booting without the microcode img from Intel, but that didn't change anything about those freezes.

Lone_Wolf wrote:

- The latest firmware for your mobo is from 2018 (https://www.msi.com/Motherboard/support/H81M-P33). Are you using that version?

Just updated the BIOS, didn't change anything for me, however. Still freezing.
Should've updated that earlier on, thanks.

Lone_Wolf wrote:

lspci -k is more useful than plain lspci, please post that.

Here you go:

00:00.0 Host bridge: Intel Corporation 4th Gen Core Processor DRAM Controller (rev 06)
        Subsystem: Micro-Star International Co., Ltd. [MSI] 4th Gen Core Processor DRAM Controller
        Kernel driver in use: hsw_uncore
00:01.0 PCI bridge: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor PCI Express x16 Controller (rev 06)
        Kernel driver in use: pcieport
00:02.0 VGA compatible controller: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor Integrated Graphics Controller (rev 06)
        DeviceName:  Onboard IGD
        Subsystem: Micro-Star International Co., Ltd. [MSI] Xeon E3-1200 v3/4th Gen Core Processor Integrated Graphics Controller
        Kernel driver in use: i915
        Kernel modules: i915
00:14.0 USB controller: Intel Corporation 8 Series/C220 Series Chipset Family USB xHCI (rev 05)
        Subsystem: Micro-Star International Co., Ltd. [MSI] 8 Series/C220 Series Chipset Family USB xHCI
        Kernel driver in use: xhci_hcd
        Kernel modules: xhci_pci
00:16.0 Communication controller: Intel Corporation 8 Series/C220 Series Chipset Family MEI Controller #1 (rev 04)
        Subsystem: Micro-Star International Co., Ltd. [MSI] 8 Series/C220 Series Chipset Family MEI Controller
        Kernel driver in use: mei_me
        Kernel modules: mei_me
00:1a.0 USB controller: Intel Corporation 8 Series/C220 Series Chipset Family USB EHCI #2 (rev 05)
        Subsystem: Micro-Star International Co., Ltd. [MSI] 8 Series/C220 Series Chipset Family USB EHCI
        Kernel driver in use: ehci-pci
        Kernel modules: ehci_pci
00:1b.0 Audio device: Intel Corporation 8 Series/C220 Series Chipset High Definition Audio Controller (rev 05)
        Subsystem: Micro-Star International Co., Ltd. [MSI] 8 Series/C220 Series Chipset High Definition Audio Controller
        Kernel driver in use: snd_hda_intel
        Kernel modules: snd_hda_intel
00:1c.0 PCI bridge: Intel Corporation 8 Series/C220 Series Chipset Family PCI Express Root Port #1 (rev d5)
        Kernel driver in use: pcieport
00:1c.2 PCI bridge: Intel Corporation 8 Series/C220 Series Chipset Family PCI Express Root Port #3 (rev d5)
        Kernel driver in use: pcieport
00:1d.0 USB controller: Intel Corporation 8 Series/C220 Series Chipset Family USB EHCI #1 (rev 05)
        Subsystem: Micro-Star International Co., Ltd. [MSI] 8 Series/C220 Series Chipset Family USB EHCI
        Kernel driver in use: ehci-pci
        Kernel modules: ehci_pci
00:1f.0 ISA bridge: Intel Corporation H81 Express LPC Controller (rev 05)
        Subsystem: Micro-Star International Co., Ltd. [MSI] H81 Express LPC Controller
        Kernel driver in use: lpc_ich
        Kernel modules: lpc_ich
00:1f.2 SATA controller: Intel Corporation 8 Series/C220 Series Chipset Family 6-port SATA Controller 1 [AHCI mode] (rev 05)
        Subsystem: Micro-Star International Co., Ltd. [MSI] 8 Series/C220 Series Chipset Family 6-port SATA Controller 1 [AHCI mode]
        Kernel driver in use: ahci
        Kernel modules: ahci
00:1f.3 SMBus: Intel Corporation 8 Series/C220 Series Chipset Family SMBus Controller (rev 05)
        Subsystem: Micro-Star International Co., Ltd. [MSI] 8 Series/C220 Series Chipset Family SMBus Controller
        Kernel driver in use: i801_smbus
        Kernel modules: i2c_i801
01:00.0 VGA compatible controller: NVIDIA Corporation TU116 [GeForce GTX 1660 Ti] (rev a1)
        Subsystem: ASUSTeK Computer Inc. TU116 [GeForce GTX 1660 Ti]
        Kernel driver in use: nvidia
        Kernel modules: nouveau, nvidia_drm, nvidia
01:00.1 Audio device: NVIDIA Corporation Device 1aeb (rev a1)
        Subsystem: ASUSTeK Computer Inc. Device 86a3
        Kernel driver in use: snd_hda_intel
        Kernel modules: snd_hda_intel
01:00.2 USB controller: NVIDIA Corporation Device 1aec (rev a1)
        Subsystem: ASUSTeK Computer Inc. Device 86a3
        Kernel driver in use: xhci_hcd
        Kernel modules: xhci_pci
01:00.3 Serial bus controller [0c80]: NVIDIA Corporation Device 1aed (rev a1)
        Subsystem: ASUSTeK Computer Inc. Device 86a3
        Kernel driver in use: nvidia-gpu
        Kernel modules: i2c_nvidia_gpu
03:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 0c)
        Subsystem: Micro-Star International Co., Ltd. [MSI] RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
        Kernel driver in use: r8169
        Kernel modules: r8169

Lone_Wolf wrote:

Your kernel command line is very long, but doesn't seem to include vfio parameters.
How are you isolating the NVIDIA card so it is available for passthrough?

I have a configuration file in the modprobe.d directory that's responsible for applying the vfio parameters. Uncommented that to use my GPU on the host - but the passthrough works fine (have been using the QEMU/KVM several times already)

options vfio-pci ids=10de:2182,10de:1aeb,10de:1aec,10de:1aed

This is the vfio.conf from my modprobe.d directory.
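
Whether the vfio-pci options took effect can be read from the "Kernel driver in use" lines of lspci -k. The sketch below is one hedged way to check this automatically: `drivers_in_use` is a hypothetical helper, and the sample text is an excerpt of the lspci -k output posted above, where the card is bound to nvidia rather than vfio-pci. On a live system the text would come from running lspci -k.

```python
import re

# Excerpt of the lspci -k output from this thread (two of the four NVIDIA functions).
SAMPLE = """\
01:00.0 VGA compatible controller: NVIDIA Corporation TU116 [GeForce GTX 1660 Ti] (rev a1)
        Subsystem: ASUSTeK Computer Inc. TU116 [GeForce GTX 1660 Ti]
        Kernel driver in use: nvidia
        Kernel modules: nouveau, nvidia_drm, nvidia
01:00.1 Audio device: NVIDIA Corporation Device 1aeb (rev a1)
        Subsystem: ASUSTeK Computer Inc. Device 86a3
        Kernel driver in use: snd_hda_intel
        Kernel modules: snd_hda_intel
"""

SLOT_RE = re.compile(r"^([0-9a-f]{2}:[0-9a-f]{2}\.[0-9a-f])\s")


def drivers_in_use(lspci_k_output: str) -> dict:
    """Map each PCI slot to its 'Kernel driver in use' (None if no driver is bound)."""
    drivers = {}
    slot = None
    for line in lspci_k_output.splitlines():
        header = SLOT_RE.match(line)
        if header:
            # Unindented line: start of a new device entry.
            slot = header.group(1)
            drivers[slot] = None
        elif slot and "Kernel driver in use:" in line:
            drivers[slot] = line.split(":", 1)[1].strip()
    return drivers


drivers = drivers_in_use(SAMPLE)
# Devices that are NOT currently claimed by vfio-pci.
not_vfio = [slot for slot, drv in drivers.items() if drv != "vfio-pci"]
for slot in not_vfio:
    print(f"{slot}: bound to {drivers[slot]}, not vfio-pci")
```

For passthrough, every function of the card (01:00.0 through 01:00.3 here) would be expected to show vfio-pci, so a non-empty `not_vfio` list means the host drivers claimed the devices first.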

Lone_Wolf wrote:

Post your /etc/mkinitcpio.conf file.

Here you go: https://paste.ee/p/cmHd5
(just realized that I have "modconf" twice in the hooks section. Will fix that)

Lone_Wolf wrote:

When changing graphical setups, display managers tend to make troubleshooting harder.
Please boot with systemd.unit=multi-user.target as a kernel parameter and configure startx or xinit.

I'll do that as soon as I'm home!

-------

V1del wrote:

You are also doing some weird overrides. You shouldn't override ACPI (and likely not explicitly to Linux) unless you actually know you need to.

Yeah, those parameters weren't meant to stay there. I've added them because apparently it helped some people to fix some weird issues on laptops with integrated graphics. I've removed them now.

V1del wrote:

Why are you configured to use DRI 2 and ignore ABI?

Also added that in my xorg.conf.d just to check if it'd fix anything. When using my integrated graphics, the Xorg.0.log would claim that my screen doesn't support DRI2. Thought it may be related to the freezes - but it wasn't.

Ignore ABI was needed to run the xorg server I built from the git repositories. Otherwise, the drivers would refuse to load. However, I'm already back on the stable version from the official repositories.

V1del wrote:

I suggest you go back to a cleaner slate in general. Revert the out-of-tree packages, go back to stable versions, and remove most of the kernel parameters other than the ones explicitly necessary for passthrough.

Okay, I just removed unnecessary kernel parameters and old config entries.


#5 2019-10-18 15:43:15

Mainvoid
Member
Registered: 2016-08-01
Posts: 5

Re: System freeze with integrated graphics

Hello,

The problem you are describing sounds familiar to me; basically the same thing happens on my laptop (1050ti) when using the nouveau driver. You could try blacklisting the nouveau module that is included with the kernel package. Did these problems occur as well before you removed the nvidia-package? The nvidia-package contains a similar blacklist file to prevent nouveau from loading. Exactly the same problem occurs for me when I boot an Ubuntu live-cd without adding "module_blacklist=nouveau" to the kernel parameters; essentially the whole UI freezes and the system responds to nothing. Good luck debugging your issue!

Mainvoid


#6 2019-10-19 17:56:26

Lone_Wolf
Forum Moderator
From: Netherlands, Europe
Registered: 2005-10-04
Posts: 11,922

Re: System freeze with integrated graphics

What Mainvoid describes applies to some systems with NVIDIA cards that don't use vfio.
When vfio works correctly, the NVIDIA devices are under the control of the vfio kernel modules, so there is no need for blacklisting.


velumyx, when you are ready, do the following:

- Verify in the UEFI/BIOS firmware that the Intel integrated GPU will be booted as the primary video card.
(It may already be set up that way, but please verify it.)

- Boot to multi-user.
- Log in as root.
- Post the full dmesg and lspci -k output.



