#1 2024-12-19 22:38:19

carlossss111
Member
Registered: 2024-12-19
Posts: 7

Games crash with 'GPU:0 Failed to query display engine channel state'

Hi, seemingly at random while playing demanding games, the system will crash to tty1 and I have to press the power button to restart.

The message is always:

NVRM: GPU 0000:01:00.0: GPU has fallen off the bus.

followed by

nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c77d:0:0:0x0000000f

I am running a laptop setup with two GPUs. My Intel GPU renders my Cinnamon DE and general applications like Firefox etc. Intensive applications like games are rendered by my Nvidia GPU.
I am using nvidia-dkms 565.77-2 and I experience this problem both on 6.12.4-arch1-1 and on the LTS kernel. I have tried nvidia-open-dkms but hit this issue even more often.



Here is the full journal log (I think the errors begin at 20:30:12)
journalctl -b -2: https://0x0.st/XCWD.txt

Not sure what other logs to provide, so here is lspci, nvidia-smi and mkinitcpio to give more of an idea of my setup.
lspci: https://0x0.st/XC4i.txt
nvidia-smi: https://0x0.st/XC4-.txt
mkinitcpio: http://0x0.st/XC4B.txt

First post, so please let me know if there are any other logs etc. that could be of use!

Many Thanks!

#2 2024-12-19 23:23:02

seth
Member
From: Don't DM me only for attention
Registered: 2012-09-03
Posts: 69,413

Re: Games crash with 'GPU:0 Failed to query display engine channel state'

Dec 19 20:30:12 DAN-LNX kernel: pcieport 0000:00:01.0: AER: Correctable error message received from 0000:00:01.0
Dec 19 20:30:12 DAN-LNX kernel: NVRM: GPU at PCI:0000:01:00: GPU-86fd0c08-6adb-7868-3e70-75190cd7f962
Dec 19 20:30:12 DAN-LNX kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Correctable, type=Transaction Layer, (Receiver ID)
Dec 19 20:30:12 DAN-LNX kernel: NVRM: Xid (PCI:0000:01:00): 79, pid='<unknown>', name=<unknown>, GPU has fallen off the bus.
Dec 19 20:30:12 DAN-LNX kernel: pcieport 0000:00:01.0:   device [8086:a70d] error status/mask=00002000/00000000
Dec 19 20:30:12 DAN-LNX kernel: pcieport 0000:00:01.0:    [13] NonFatalErr           
Dec 19 20:30:12 DAN-LNX kernel: NVRM: GPU 0000:01:00.0: GPU has fallen off the bus.
Dec 19 20:30:12 DAN-LNX kernel: NVRM: Xid (PCI:0000:01:00): 154, pid='<unknown>', name=<unknown>, GPU recovery action changed from 0x0 (None) to 0x2 (Node Reboot Required)
Dec 19 20:30:14 DAN-LNX kernel: NVRM: Error in service of callback 

This is a hybrid system.

Dec 19 20:14:03 DAN-LNX system76-power[864]: [INFO] Rescanning PCI bus
Dec 19 20:14:03 DAN-LNX system76-power[864]: [INFO] 0000:01:00.0: NVIDIA graphics
Dec 19 20:14:03 DAN-LNX system76-power[864]: [INFO] 0000:01:00.0: Function for 0000:01:00.0
Dec 19 20:14:03 DAN-LNX system76-power[864]: [INFO] 0000:01:00.1: Function for 0000:01:00.0
Dec 19 20:14:03 DAN-LNX system76-power[864]: [INFO] 0000:00:02.0: Intel graphics
Dec 19 20:14:03 DAN-LNX system76-power[864]: [INFO] 0000:00:02.0: Function for 0000:00:02.0
Dec 19 20:14:03 DAN-LNX system76-power[864]: [INFO] Enabling graphics power

Disable system76-power and please post your Xorg log, https://wiki.archlinux.org/title/Xorg#General
There're no outputs attached to the nvidia GPU but afaiu you prime-run steam (or the game)?

Dec 19 20:14:03 DAN-LNX kernel: nvidia 0000:01:00.0: [drm] Cannot find any crtc or sizes

"GPU has fallen off the bus" means exactly that: the GPU has (logically) dropped out of the system - if you didn't hear some cLoNcK sound, that probably means it went underpowered.
Everything afterwards is just noise, the GPU doesn't talk to the OS anymore.
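
For anyone triaging a similar crash, here is a rough sketch of how the Xid codes could be pulled out of a saved journal, so the first event (Xid 79 in the excerpt above) stands out from the follow-up noise. The function name `xid_codes` is made up for illustration, and the pattern only assumes the "NVRM: Xid (PCI:...): <code>," layout visible in the log lines quoted in this post.

```shell
# Hypothetical helper: print the Xid code of every NVRM Xid line read from stdin.
# Assumes the "NVRM: Xid (PCI:...): <code>," layout seen in the journal excerpt.
xid_codes() {
    sed -n 's/.*NVRM: Xid ([^)]*): \([0-9][0-9]*\),.*/\1/p'
}

# Possible use against the kernel messages of the boot that crashed:
#   journalctl -k -b -2 | xid_codes
```

The codes come out in log order, so the first one printed is the event that matters and the rest are follow-up errors.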

#3 2024-12-20 01:28:57

Nyctfall
Member
Registered: 2023-04-03
Posts: 82

Re: Games crash with 'GPU:0 Failed to query display engine channel state'

carlossss111 wrote:

First post

Welcome! Hope you like the premier bleeding-edge Linux distro! :)

carlossss111 wrote:

seemingly at random while playing demanding games, the system will crash to tty1 and I have to press the power button to restart.

I used an NVIDIA GPU with Arch before... it's usually the drivers that are the problem. Other times it's configuration.
See: Arch Wiki > NVIDIA > Wayland and X.Org Configuration,
Arch Wiki > Hardware Video Acceleration > Verification,
Arch Wiki > NVIDIA/Tips_and_tricks > Listening to ACPI Events.

carlossss111 wrote:

please let me know if there are any other logs etc. that could be of use!

# text searched for 'system76'
Dec 19 20:14:01 DAN-LNX kernel: input: System76 ACPI Hotkeys as /devices/LNXSYSTM:00/LNXSYBUS:00/17761776:00/input/input13
Dec 19 20:14:01 DAN-LNX kernel: ACPI: battery: new extension: System76 Battery Extension
Dec 19 20:14:01 DAN-LNX systemd[1]: Starting Load/Save Screen Backlight Brightness of leds:system76_acpi::kbd_backlight...
Dec 19 20:14:01 DAN-LNX systemd[1]: Finished Load/Save Screen Backlight Brightness of leds:system76_acpi::kbd_backlight.

Dec 19 20:14:03 DAN-LNX systemd[1]: Starting System76 Power Daemon...
Dec 19 20:14:03 DAN-LNX systemd[1]: Started System76 Firmware Daemon.
Dec 19 20:14:03 DAN-LNX systemd[1]: Started System76 airplane-mode hotkey and LED support.
Dec 19 20:14:03 DAN-LNX system76-power[864]: [INFO] Rescanning PCI bus
Dec 19 20:14:03 DAN-LNX system76-power[864]: [INFO] 0000:01:00.0: NVIDIA graphics
Dec 19 20:14:03 DAN-LNX system76-power[864]: [INFO] 0000:01:00.0: Function for 0000:01:00.0
Dec 19 20:14:03 DAN-LNX system76-power[864]: [INFO] 0000:01:00.1: Function for 0000:01:00.0
Dec 19 20:14:03 DAN-LNX system76-power[864]: [INFO] 0000:00:02.0: Intel graphics
Dec 19 20:14:03 DAN-LNX system76-power[864]: [INFO] 0000:00:02.0: Function for 0000:00:02.0
Dec 19 20:14:03 DAN-LNX system76-power[864]: [INFO] Enabling graphics power
Dec 19 20:14:03 DAN-LNX systemd[1]: Started System76 Power Daemon.
Dec 19 20:14:03 DAN-LNX system76-power[864]: setting powersave with max 5600000
Dec 19 20:14:03 DAN-LNX systemd[1]: Started Run system76-power profile on startup. Daniel R 2024..
Dec 19 20:14:03 DAN-LNX system76-power[864]: [ERROR] fan daemon: platform hwmon not found
Dec 19 20:14:03 DAN-LNX system76-power[864]: [INFO] Handling dbus requests
Dec 19 20:14:03 DAN-LNX system76-power[864]: [INFO] hid_backlight: no devices found
Dec 19 20:14:03 DAN-LNX system76-power[864]: setting powersave with max 2800000
Dec 19 20:14:03 DAN-LNX systemd[1]: system76-power-startup.service: Deactivated successfully.
Dec 19 20:14:03 DAN-LNX system76-daemon[869]: 2024-12-19 20:14:03,671  INFO  **** Process start at monotonic time 5.863720874
Dec 19 20:14:03 DAN-LNX system76-daemon[869]: 2024-12-19 20:14:03,671  INFO  model: 'addw4'
Dec 19 20:14:03 DAN-LNX system76-daemon[869]: 2024-12-19 20:14:03,671  INFO  Brightness hack not needed for 'addw4'
Dec 19 20:14:03 DAN-LNX system76-daemon[869]: 2024-12-19 20:14:03,671  INFO  Airplane mode hack not needed for 'addw4'
Dec 19 20:14:03 DAN-LNX system76-daemon[869]: 2024-12-19 20:14:03,671  INFO  ACPI Interrupt fix not needed for 'addw4'
Dec 19 20:14:03 DAN-LNX system76-daemon[869]: 2024-12-19 20:14:03,671  INFO  ACPI Interrupt fix not needed for 'addw4'
Dec 19 20:14:03 DAN-LNX system76-daemon[869]: 2024-12-19 20:14:03,671  INFO  ESS DAC autoswitch not needed for 'addw4'
Dec 19 20:14:03 DAN-LNX system76-daemon[869]: 2024-12-19 20:14:03,671  INFO  Headphone volume adjustment not needed for 'addw4'
Dec 19 20:14:03 DAN-LNX system76-daemon[869]: 2024-12-19 20:14:03,671  INFO  DPCD PWM fix not needed for 'addw4'
Dec 19 20:14:03 DAN-LNX system76-daemon[869]: 2024-12-19 20:14:03,671  INFO  Limit Power Draw not needed 'addw4'
Dec 19 20:14:08 DAN-LNX system76-power[864]: [INFO] Setting power management to auto
Dec 19 20:30:18 DAN-LNX kernel:  i2c_mux i2c_hid intel_vsec intel_hid idma64 pmt_class intel_scu_pltdrv mei system76_acpi coreboot_table pinctrl_alderlake sparse_keymap mac_hid crypto_user loop dm_mod nfnetlink ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 nvme nvme_core crc32c_intel spi_intel_pci serio_raw spi_intel atkbd nvme_auth libps2 vivaldi_fmap i8042 serio hid_generic usbhid i915 i2c_algo_bit drm_buddy video wmi ttm intel_gtt drm_display_helper cec
Dec 19 20:32:02 DAN-LNX systemd[1]: Stopping System76 Power Daemon...
Dec 19 20:32:03 DAN-LNX system76-power[864]: [INFO] caught signal: SIGTERM
Dec 19 20:32:03 DAN-LNX systemd[1]: Stopping System76 Firmware Daemon...
Dec 19 20:32:03 DAN-LNX systemd[1]: Stopping System76 airplane-mode hotkey and LED support...
Dec 19 20:32:03 DAN-LNX systemd[1]: system76-firmware-daemon.service: Deactivated successfully.
Dec 19 20:32:03 DAN-LNX systemd[1]: Stopped System76 Firmware Daemon.
Dec 19 20:32:03 DAN-LNX systemd[1]: system76.service: Deactivated successfully.
Dec 19 20:32:03 DAN-LNX systemd[1]: Stopped System76 airplane-mode hotkey and LED support.
Dec 19 20:32:03 DAN-LNX system76-power[864]: [INFO] daemon exited from loop
Dec 19 20:32:03 DAN-LNX systemd[1]: com.system76.PowerDaemon.service: Deactivated successfully.
Dec 19 20:32:03 DAN-LNX systemd[1]: Stopped System76 Power Daemon.

Based on the System76 daemons installed, it seems like some AUR packages are installed. `system76-power` seems very suspicious. Please provide the output of:

$ pacman -Qm

For the drivers, provide the output of:

$ pacman -Qs 'va-api|vdpau|vulkan|nvidia|dkms'

What is the output (as root) of:

# cat /sys/module/nvidia_drm/parameters/modeset
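
As a side note, a minimal sketch of the same check wrapped in a function, in case more than one parameter needs looking at. `module_param` is a hypothetical name, and the optional third argument (an alternative sysfs root) exists only so the sketch can be tried without the nvidia modules loaded. It folds dashes to underscores because the directories under /sys/module use underscores.

```shell
# Hypothetical helper: print the current value of a loaded module's parameter.
#   $1 = module name (nvidia-drm and nvidia_drm are both accepted)
#   $2 = parameter name
#   $3 = optional sysfs root, defaults to /sys (only there for testing)
module_param() (
    mod=$(printf '%s' "$1" | sed 's/-/_/g')
    file="${3:-/sys}/module/$mod/parameters/$2"
    if [ -r "$file" ]; then
        cat "$file"
    else
        echo "cannot read $file (module not loaded, or needs root?)" >&2
        exit 1
    fi
)

# Possible use, as root:
#   module_param nvidia-drm modeset
```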

And what are the Steam launch options (under the game's properties) that you use when you play games with Steam (like ProtonDB > Baldur's Gate 3)?

carlossss111 wrote:

I am running a laptop setup with two GPUs. My Intel GPU renders my Cinnamon DE and general applications like Firefox etc. Intensive applications like games are rendered by my Nvidia GPU.
I am using nvidia-dkms 565.77-2 and I experience this problem both on 6.12.4-arch1-1 and on the LTS kernel. I have tried nvidia-open-dkms but hit this issue even more often.

01:00.0 VGA compatible controller: NVIDIA Corporation AD107M [GeForce RTX 4060 Max-Q / Mobile] (rev a1)
Dec 19 20:14:01 DAN-LNX kernel: Linux version 6.12.4-arch1-1 (linux@archlinux) (gcc (GCC) 14.2.1 20240910, GNU ld (GNU Binutils) 2.43.0) #1 SMP PREEMPT_DYNAMIC Mon, 09 Dec 2024 14:31:57 +0000
...
Dec 19 20:14:01 DAN-LNX kernel: DMI: System76 Adder WS/Adder WS, BIOS 2024-03-11_4e3ade8-dirty 03/07/2024
...
Dec 19 20:14:01 DAN-LNX kernel: smpboot: CPU0: Intel(R) Core(TM) i9-14900HX (family: 0x6, model: 0xb7, stepping: 0x1)
...
Dec 19 20:14:01 DAN-LNX kernel: Memory: 16140480K/16615816K available (18432K kernel code, 2654K rwdata, 14236K rodata, 4248K init, 4092K bss, 429012K reserved, 0K cma-reserved)

See: Arch Wiki > NVIDIA Optimus.

As an added note: it looks like the hardware is an Intel Core i9-14900HX and an Nvidia GeForce RTX 4060 Max-Q with 16GB of RAM in a System76 Adder "addw4" laptop.
It may be worth looking into how much power the 14th Gen Intel CPU is drawing, as some Intel 14th Gen models were infamous (see: here) for over-volting themselves without the most up-to-date microcode update in the newest BIOS revisions that recently came out... might want to talk to System76 to make sure the BIOS/Intel microcode versions and warranty are in order.

Last edited by Nyctfall (2024-12-20 01:33:08)

#4 2024-12-20 21:31:38

carlossss111
Member
Registered: 2024-12-19
Posts: 7

Re: Games crash with 'GPU:0 Failed to query display engine channel state'

Hi again, disabling the system76-power service seems to have fixed the system crashes. I can no longer replicate this issue, though it might take another day or two to be sure. I originally installed it to manage battery life and things like CPU turbo, but I can shop around for other ways to do this.


This fix has made another issue more obvious though, and I'm not sure if it's related or not (probably not so let me know if I need to rename the thread).
When I am playing a game with Proton and ALT-TAB a few times (or, in rare cases, completely at random), the game visually freezes but the audio and the gameplay continue. If I hold ALT-TAB then I can see the game displaying, but as soon as I let go it freezes. This happens fullscreen and windowed. I think it is failing to focus.
I have tried fixing this by installing 'gamescope', but instead of the display freezing, the application just crashes.
I guess this particular problem will be related to X11 or to my DE?

As asked for:
xorg config: https://0x0.st/XC0D.txt
new journal with the ALT-TAB issue: https://0x0.st/XC0G.txt
pacman -Qm: https://0x0.st/XC0k.txt
pacman -Qs 'va-api|vdpau|vulkan|nvidia|dkms': https://0x0.st/XC0d.txt
cat /sys/module/nvidia_drm/parameters/modeset: 'Y'

There're no outputs attached to the nvidia GPU but afaiu you prime-run steam (or the game)?

If it is a Proton game, then it seems to run on dGPU automatically. If it is native, I use 'prime-run' as a launch option. (Usually no other launch options, though depends on the game)

I used an NVIDIA GPU with Arch before... it's usually the drivers that are the problem. Other times it's configuration.

My /etc/X11/xorg.conf.d/ directory is empty... A while ago, when I set this system up, adding things there broke it, so I assumed Xorg was doing a good job of configuring itself automatically. Might be a rookie error?

Thanks again for your help and advice!

#5 2024-12-20 21:53:48

seth
Member
From: Don't DM me only for attention
Registered: 2012-09-03
Posts: 69,413

Re: Games crash with 'GPU:0 Failed to query display engine channel state'

Dec 20 20:39:31 DAN-LNX kernel: simple-framebuffer simple-framebuffer.0: swiotlb buffer is full (sz: 2097152 bytes), total 32768 (slots), used 0 (slots)

Add "nvidia-drm.modeset=1" to the https://wiki.archlinux.org/title/Kernel_parameters (the modprobe config option is applied, but the commandline parameter will shut down the simpledrm device)
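
To double-check that the parameter really reached the kernel command line (and is not just set through the modprobe config), something along these lines could be used. This is only a sketch: `norm_param` and `cmdline_has` are made-up names, and the second argument exists so the check can be run against a pasted command line instead of /proc/cmdline. It treats dashes and underscores in the parameter name as equal, which is how the kernel handles parameter names.

```shell
# Fold dashes to underscores in the parameter name only; the value is kept as is.
norm_param() {
    case $1 in
        *=*) printf '%s=%s\n' "$(printf '%s' "${1%%=*}" | sed 's/-/_/g')" "${1#*=}" ;;
        *)   printf '%s\n' "$1" | sed 's/-/_/g' ;;
    esac
}

# Hypothetical helper: succeed if parameter $1 is present on the command line.
#   $2 = command line string, defaults to the running kernel's /proc/cmdline
cmdline_has() (
    want=$(norm_param "$1")
    line=${2-$(cat /proc/cmdline)}
    set -f                         # no globbing while splitting on whitespace
    for word in $line; do
        [ "$(norm_param "$word")" = "$want" ] && exit 0
    done
    exit 1
)

# Possible use:
#   cmdline_has nvidia-drm.modeset=1 && echo "modeset is on the command line"
```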

Gamescope aborts but also

Dec 20 20:44:16 DAN-LNX steam[1714]: [gamescope] [Info]  wlserver: Running compositor on wayland display 'gamescope-0'
Dec 20 20:44:16 DAN-LNX steam[1714]: [gamescope] [Info]  wlserver: [xwayland/server.c:107] Starting Xwayland on :1

seems to try to run on an xwayland server on :1 but you've posted an Xorg log on :0 and most likely are running cinnabun on X11 anyway?

The alt+tab situation is most likely because of the framebuffer updates and I'd be shocked. Literally shocked if that would turn out to be the simplydumb device…

It could also just be an issue w/ the cinnabun compositor, afaict there's an option to unredirect fullscreen windows (ie. to not pipe them through the compositor), which is good for performance but maybe prone to cause this - it would however not affect the windowed game :/

You could however test steam on an openbox session and see whether that causes similar problems (so whether it's X11 or muffin)

#6 2024-12-20 22:46:44

carlossss111
Member
Registered: 2024-12-19
Posts: 7

Re: Games crash with 'GPU:0 Failed to query display engine channel state'

Add "nvidia-drm.modeset=1"

Done now, thank you. The alt-tab problem still persists, so I think you're right that it's probably something about framebuffer updates.

seems to try to run on an xwayland server on :1 but you've posted an Xorg log on :0 and most likely are running cinnabun on X11 anyway?

Yes that's right. The XWayland server is from when I tried the gamescope microcompositor. Maybe this is the log for :1 https://0x0.st/XCG6.txt

You could however test steam on an openbox session and see whether that causes similar problems (so whether it's X11 or muffin)

Now tried that. I get exactly the same ALT-TAB issue, (and gamescope doesn't even want to run either).

#7 2024-12-21 09:15:47

seth
Member
From: Don't DM me only for attention
Registered: 2012-09-03
Posts: 69,413

Re: Games crash with 'GPU:0 Failed to query display engine channel state'

So it's not the cinnabun compositor.
Check the status of https://wiki.archlinux.org/title/PRIME# … ronization and flip that.

Sanity check: "swiotlb buffer is full" no longer shows up in the journal?
What does the xorg log now look like?

Another thing you could try would be "nvidia_drm.fbdev=0", though that typically causes different issues.

The X11 log is from mid august.

#8 2024-12-21 10:47:16

carlossss111
Member
Registered: 2024-12-19
Posts: 7

Re: Games crash with 'GPU:0 Failed to query display engine channel state'

Check the status of https://wiki.archlinux.org/title/PRIME# … ronization and flip that.

I don't seem to have that option. My output for 'xrandr --prop' is https://0x0.st/XCnN.txt.

# xrandr --output "eDP-1" --set "PRIME Synchronization" 0
X Error of failed request:  BadName (named color or font does not exist)
  Major opcode of failed request:  140 (RANDR)
  Minor opcode of failed request:  11 (RRQueryOutputProperty)
  Serial number of failed request:  59
  Current serial number in output stream:  59

There does seem to be another connected output called 'None 2-1'. I tried that with/without synchronization and nothing changed.

As for the buffer message:

$ journalctl -b | grep "swiotlb buffer is full"
Dec 21 10:32:34 DAN-LNX kernel: simple-framebuffer simple-framebuffer.0: swiotlb buffer is full (sz: 8388608 bytes), total 32768 (slots), used 0 (slots)
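
For comparing boots, a small sketch that pulls the requested size out of these messages. The name `swiotlb_sizes` is made up, and the pattern only assumes the "(sz: N bytes)" layout shown above. The two sizes seen in this thread, 2097152 and 8388608 bytes, are 2 MiB and 8 MiB respectively.

```shell
# Hypothetical helper: list the allocation sizes from "swiotlb buffer is full"
# lines read from stdin, in bytes and in whole MiB (truncated).
swiotlb_sizes() {
    sed -n 's/.*swiotlb buffer is full (sz: \([0-9][0-9]*\) bytes).*/\1/p' |
        awk '{ printf "%d bytes (%d MiB)\n", $1, $1 / 1048576 }'
}

# Possible use:
#   journalctl -k -b | swiotlb_sizes
```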

Full journal is here: https://0x0.st/XCnz.txt

Updated xorg log here: https://0x0.st/XCni.txt

Another thing you could try would be "nvidia_drm.fbdev=0", though that typically causes different issues.

Alright, so I added this to the kernel parameters and it hasn't fixed the problem, but it seems to have made it a lot better. Now I can cause the game to crash by aggressively pressing ALT-TAB for about a minute, instead of after only a few presses. I'm not sure yet whether this will be fine for regular gameplay or whether it will crash after a long session; I would need to try for a few days. Not sure this solves the root problem though.

edit:typo

edit2: still crashes too much :/

Last edited by carlossss111 (2024-12-21 10:56:20)

#9 2024-12-21 13:20:56

seth
Member
From: Don't DM me only for attention
Registered: 2012-09-03
Posts: 69,413

Re: Games crash with 'GPU:0 Failed to query display engine channel state'

Dec 21 09:55:13 DAN-LNX kernel: The simpledrm driver will not be probed
Dec 21 09:55:13 DAN-LNX kernel: simple-framebuffer simple-framebuffer.0: [drm] could not acquire memory region [mem 0xd0000000-0xd07e8fff flags 0x80000200]
Dec 21 09:55:13 DAN-LNX kernel: simple-framebuffer simple-framebuffer.0: [drm] Registered 1 planes with drm panic
Dec 21 09:55:13 DAN-LNX kernel: [drm] Initialized simpledrm 1.0.0 for simple-framebuffer.0 on minor 1
Dec 21 09:55:13 DAN-LNX kernel: simple-framebuffer simple-framebuffer.0: [drm] fb1: simpledrmdrmfb frame buffer device
Dec 21 09:55:20 DAN-LNX NetworkManager[931]: <info>  [1734774920.1850] Config: added 'bgscan' value 'simple:30:-65:300'
Dec 21 09:55:23 DAN-LNX kernel: simple-framebuffer simple-framebuffer.0: swiotlb buffer is full (sz: 8388608 bytes), total 32768 (slots), used 0 (slots)

WTF?
I mean, W.T.F?

Do you explicitly load the module?

Do you get better results w/ https://wiki.archlinux.org/title/Intel_ … _Xe_driver
Do you still get the simplydumb device w/ the LTS kernel (and nvidia-drm.modeset=1)?

#10 2024-12-22 10:26:21

carlossss111
Member
Registered: 2024-12-19
Posts: 7

Re: Games crash with 'GPU:0 Failed to query display engine channel state'

Do you explicitly load the module?

Had a read of https://wiki.archlinux.org/title/Kernel_module and I don't think so.

mkinitcpio.conf doesn't contain it, and I don't have a modprobe conf file that installs it?
I did find this in '/etc/modprobe.d/system76-power.conf' but after commenting it out nothing changes (seems like modeset was already enabled here btw):

# Automatically generated by system76-power
blacklist i2c_nvidia_gpu
alias i2c_nvidia_gpu off
options nvidia NVreg_DynamicPowerManagement=0x02
options nvidia-drm modeset=1
# Preserve video memory through suspend
options nvidia NVreg_PreserveVideoMemoryAllocations=1

There are no other modprobe confs.

Here are some related outputs:
lsmod https://0x0.st/8rXV.txt
modprobe -c https://0x0.st/8rXW.txt

Do you get better results w/ https://wiki.archlinux.org/title/Intel_ … _Xe_driver

No unfortunately systemd fails to start lightdm with these kernel parameters enabled.
related journal: https://0x0.st/8rXv.txt

Do you still get the simplydumb device w/ the LTS kernel (and nvidia-drm.modeset=1)?

I don't get the 'swiotlb buffer is full', if that's what you mean?
related journal: https://0x0.st/8rXy.txt

After testing my game a couple of times I've noticed that, since turning off system76-power and changing modesetting and fbdev as suggested, the game crashes after a while on the current kernel, but not on the LTS kernel. Therefore, I think that the 'swiotlb buffer is full' message means the game will crash, so maybe that's progress.
On the LTS kernel I'm now getting the old behaviour instead: it doesn't crash, and the game keeps running, but it only displays while I am not focussing on its window (windowed or fullscreen).
I must've really borked something here, sorry.

#11 2024-12-22 14:37:41

seth
Member
From: Don't DM me only for attention
Registered: 2012-09-03
Posts: 69,413

Re: Games crash with 'GPU:0 Failed to query display engine channel state'

mkinitcpio.conf doesn't contain it, and I don't have a modprobe conf file that installs it?

Wrong infrastructure.
https://wiki.archlinux.org/title/Kernel_module#systemd

I did find this in '/etc/modprobe.d/system76-power.conf' but after commenting it out nothing changes

If the module is in the initramfs, you'll have to regenerate the latter to apply modprobe.conf changes.

(seems like modeset was already enabled here btw):

Yes, but in this location it won't block the simpledrm device. *Only* the kernel parameter does that.

No unfortunately systemd fails to start lightdm with these kernel parameters enabled.

Dec 22 09:56:05 DAN-LNX kernel: xe 0000:00:02.0: Your graphics device a788 is not officially supported
                                by xe driver in this kernel version. To force Xe probe,
                                use xe.force_probe='a788' and i915.force_probe='!a788'
                                module parameters or CONFIG_DRM_XE_FORCE_PROBE='a788' and
                                CONFIG_DRM_I915_FORCE_PROBE='!a788' configuration options.
Dec 22 09:56:05 DAN-LNX kernel: pci 0000:00:02.0: [8086:a788] type 00 class 0x030000 PCIe Root Complex Integrated Endpoint
Dec 22 09:56:05 DAN-LNX kernel: i915 0000:00:02.0: I915 probe blocked for Device ID a788.

Seems you blocked i915 but didn't enforce xe?

I don't get the 'swiotlb buffer is full', if that's what you mean?

You especially don't load the simple-framebuffer module at all.
It might be hanging around in the initramfs

for initrd in /boot/initramfs-linux*; do sudo lsinitcpio "$initrd" | grep -i simple && echo "$initrd"; done

#12 2024-12-22 19:08:18

carlossss111
Member
Registered: 2024-12-19
Posts: 7

Re: Games crash with 'GPU:0 Failed to query display engine channel state'

Ah, here are the explicitly loaded kernel modules:

$ cat /etc/modules-load.d/*
#tls
system76_acpi

If the module is in the initramfs, you'll have to regenerate the latter to apply modprobe.conf changes.

Yep tried this, as stated there was no difference in performance.

Seems you blocked i915 but didn't enforce xe?

Ah sorry I had a typo in the parameters.
Now with specifically 'options root=UUID=0eadff8e-6e38-4343-bf19-d424bad598f4 rw nvidia-drm.modeset=1 nvidia_drm.fbdev=0 xe.force_probe=a788 i915.force_probe=!a788', it does launch but I get worse performance. Feels like it lowered my refresh rate so will remove these xe/i915 options.
Related journal: https://0x0.st/8rAa.txt


You especially don't load the simple-framebuffer module at all.

I have checked for '*simple*' in all the initramfs images in the boot directory and there's nothing.
For example, here is 'lsinitcpio /boot/initramfs-linux.img' : https://0x0.st/8rAQ.txt
What does this mean? Do I need to download the module myself somehow and add it perhaps? Maybe with DKMS? I would've thought it should all come with the nvidia-dkms package.

Thanks for sticking with me

#13 2024-12-22 20:48:48

seth
Member
From: Don't DM me only for attention
Registered: 2012-09-03
Posts: 69,413

Re: Games crash with 'GPU:0 Failed to query display engine channel state'

https://0x0.st/8rAa.txt wrote:

Dec 22 18:35:14 DAN-LNX kernel: simple-framebuffer simple-framebuffer.0: [drm] *ERROR* could not acquire memory range [mem 0xd0000000-0xd07e8fff flags 0x80000200]: -16
Dec 22 18:35:14 DAN-LNX kernel: simple-framebuffer simple-framebuffer.0: probe with driver simple-framebuffer failed with error -16

Do I need to download the module myself somehow and add it perhaps?

No, the plan is to get rid of that.
https://gitlab.archlinux.org/archlinux/ … type=heads

CONFIG_SYSFB_SIMPLEFB=y

Fuck.

Try to add "initcall_blacklist=simplefb_probe" to the https://wiki.archlinux.org/title/Kernel_parameters

Since the xe driver gets rid of that, and assuming you're still prime-running the game, does it still crash w/ the xe driver?

Feels like it lowered my refresh rate so will remove these xe/i915 options.

If we're leaving the feelings aside :P
What do "xrandr -q" and "glxgears" report?
Also please post your Xorg log for the xe driver run, https://wiki.archlinux.org/title/Xorg#General

#14 2025-02-03 20:50:26

carlossss111
Member
Registered: 2024-12-19
Posts: 7

Re: Games crash with 'GPU:0 Failed to query display engine channel state'

Hi again, sorry I have been away without a reply; life has been very busy.
A couple of weeks ago the issues stopped after an update, as long as I used gamescope. I waited to see whether they would return, and eventually they did.
As before, games freeze after a while or after alt-tabbing. Gamescope does not launch either.

initcall_blacklist=simplefb_probe

journal: https://0x0.st/8K1s.txt
xorg log:  https://0x0.st/8K1z.txt
xrandr -q: https://0x0.st/8K1i.txt

I also searched around about the simpleframebuffer log messages and found this page: https://bbs.archlinux.org/viewtopic.php?id=268394.
It might be completely irrelevant, but I tried the 'iommu=soft' param and found that 'journalctl -b | grep "swiotlb buffer is full"' no longer returned anything (edit: including if I try without the xe params https://0x0.st/8K1c.txt). Not sure if that's helpful, but just in case, here are the logs:
journal: https://0x0.st/8K1H.txt
xorg log: https://0x0.st/8K1X.txt
xrandr -q: https://0x0.st/8K1i.txt
I still encounter crashes just the same.

What do "xrandr -q" and "glxgears" report?

xrandr is listed above. I have never had an issue with glxgears, it runs at a steady framerate without crashing. (See below)

other
* I'm pretty sure (though not flatout certain) that these problems only occur with Proton and Wine. E.g. no problem with prime running glxgears.
* Gamescope mostly fixes the issues when it works. But now it's been screwed with an update and now I get: https://0x0.st/8K1P.txt

Last edited by carlossss111 (2025-02-03 20:56:44)

#15 2025-02-04 08:05:23

seth
Member
From: Don't DM me only for attention
Registered: 2012-09-03
Posts: 69,413

Re: Games crash with 'GPU:0 Failed to query display engine channel state'

Feb 03 20:24:15 DAN-LNX kernel: The simpledrm driver will not be probed
Feb 03 20:24:15 DAN-LNX kernel: blacklisting initcall simplefb_probe

The simpledrm device is gone, regardless of iommu.
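
A quick way to repeat this check on a later boot, as a sketch only: the function below classifies a kernel log by the two kinds of messages quoted in this thread (the "simple-framebuffer simple-framebuffer.0:" device lines and the "blacklisting initcall simplefb_probe" line). The name `simplefb_status` and its three output words are made up for illustration.

```shell
# Hypothetical helper: classify a kernel log read from stdin.
#   probed      - the simple-framebuffer device still shows up in the log
#   blacklisted - no device lines, and the initcall blacklist message is present
#   absent      - neither kind of message was found
simplefb_status() (
    log=$(cat)
    if printf '%s\n' "$log" | grep -q 'simple-framebuffer simple-framebuffer\.0:'; then
        echo probed
    elif printf '%s\n' "$log" | grep -q 'blacklisting initcall simplefb_probe'; then
        echo blacklisted
    else
        echo absent
    fi
)

# Possible use:
#   journalctl -k -b | simplefb_status
```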

The latest journal has neither of the errors from the OP.

Steam complains a bit

Feb 03 20:27:11 DAN-LNX steam[2130]: setlocale "en_US.UTF-8": No such file or directory
Feb 03 20:27:11 DAN-LNX steam[2130]: pressure-vessel-locale-gen: Missing locale en_US.UTF-8
Feb 03 20:27:11 DAN-LNX steam[2130]: pressure-vessel-locale-gen: Generating locale en_GB.UTF-8...
Feb 03 20:27:11 DAN-LNX steam[2130]: pressure-vessel-locale-gen: Generated locale en_GB.UTF-8 successfully
Feb 03 20:27:11 DAN-LNX steam[2130]: pressure-vessel-locale-gen: Generating locale en_US.UTF-8...
Feb 03 20:27:11 DAN-LNX steam[2130]: pressure-vessel-locale-gen: Generated locale en_US.UTF-8 successfully
…
Feb 03 20:24:51 DAN-LNX steam[2130]: Proton: Upgrading prefix from GE-Proton9-10 to GE-Proton9-2 (/home/daniel/.local/share/Steam/steamapps/compatdata/0/)
Feb 03 20:24:51 DAN-LNX steam[2130]: Proton: Prefix has an invalid version?! You may want to back up user files and delete this prefix.
Feb 03 20:25:12 DAN-LNX steam[2130]: Proton: Upgrading prefix from GE-Proton9-2 to GE-Proton9-10 (/home/daniel/.local/share/Steam/steamapps/compatdata/0/)
Feb 03 20:25:12 DAN-LNX steam[2130]: Proton: Prefix has an invalid version?! You may want to back up user files and delete this prefix.

And here's your final run of "Celeste"

Feb 03 20:27:06 DAN-LNX steam[2130]: chdir "/drive/A/SteamLibrary/steamapps/common/Celeste"
Feb 03 20:27:06 DAN-LNX steam[2130]: ERROR: ld.so: object '/home/daniel/.local/share/Steam/ubuntu12_32/gameoverlayrenderer.so' from LD_PRELOAD cannot be preloaded (wrong ELF class: ELFCLASS32): ignored.
Feb 03 20:27:06 DAN-LNX steam[2130]: ERROR: ld.so: object '/home/daniel/.local/share/Steam/ubuntu12_32/gameoverlayrenderer.so' from LD_PRELOAD cannot be preloaded (wrong ELF class: ELFCLASS32): ignored.
Feb 03 20:27:06 DAN-LNX steam[2130]: ERROR: ld.so: object '/home/daniel/.local/share/Steam/ubuntu12_32/gameoverlayrenderer.so' from LD_PRELOAD cannot be preloaded (wrong ELF class: ELFCLASS32): ignored.
Feb 03 20:27:06 DAN-LNX steam[2130]: Game Recording - would start recording game 504230, but recording for this game is disabled
Feb 03 20:27:06 DAN-LNX steam[2130]: Adding process 5310 for gameID 504230
Feb 03 20:27:06 DAN-LNX steam[2130]: ERROR: ld.so: object '/home/daniel/.local/share/Steam/ubuntu12_64/gameoverlayrenderer.so' from LD_PRELOAD cannot be preloaded (wrong ELF class: ELFCLASS64): ignored.
Feb 03 20:27:06 DAN-LNX steam[2130]: ERROR: ld.so: object '/home/daniel/.local/share/Steam/ubuntu12_32/gameoverlayrenderer.so' from LD_PRELOAD cannot be preloaded (wrong ELF class: ELFCLASS32): ignored.
Feb 03 20:27:06 DAN-LNX steam[2130]: Adding process 5311 for gameID 504230
Feb 03 20:27:06 DAN-LNX steam[2130]: ERROR: ld.so: object '/home/daniel/.local/share/Steam/ubuntu12_32/gameoverlayrenderer.so' from LD_PRELOAD cannot be preloaded (wrong ELF class: ELFCLASS32): ignored.
Feb 03 20:27:07 DAN-LNX steam[2130]: Adding process 5312 for gameID 504230
Feb 03 20:27:07 DAN-LNX steam[2130]: Adding process 5313 for gameID 504230
Feb 03 20:27:10 DAN-LNX kernel: x86/split lock detection: #AC: CHTTPClientThre/5415 took a split_lock trap at address: 0xdbf21caf
Feb 03 20:27:11 DAN-LNX steam[2130]: wine: using kernel write watches, use_kernel_writewatch 1.
Feb 03 20:27:11 DAN-LNX steam[2130]: Adding process 5417 for gameID 1316910
Feb 03 20:27:11 DAN-LNX steam[2130]: setlocale "en_US.UTF-8": No such file or directory
Feb 03 20:27:11 DAN-LNX steam[2130]: pressure-vessel-locale-gen: Missing locale en_US.UTF-8
Feb 03 20:27:11 DAN-LNX steam[2130]: pressure-vessel-locale-gen: Generating locale en_GB.UTF-8...
Feb 03 20:27:11 DAN-LNX steam[2130]: pressure-vessel-locale-gen: Generated locale en_GB.UTF-8 successfully
Feb 03 20:27:11 DAN-LNX steam[2130]: pressure-vessel-locale-gen: Generating locale en_US.UTF-8...
Feb 03 20:27:11 DAN-LNX steam[2130]: pressure-vessel-locale-gen: Generated locale en_US.UTF-8 successfully
Feb 03 20:27:11 DAN-LNX steam[2130]: pressure-vessel-adverb[5370]: W: Container startup will be faster if missing locales are created at OS level
Feb 03 20:27:11 DAN-LNX steam[2130]: pid 5425 != 5420, skipping destruction (fork without exec?)
Feb 03 20:27:11 DAN-LNX steam[2130]: pid 5430 != 5420, skipping destruction (fork without exec?)
Feb 03 20:27:11 DAN-LNX steam[2130]: Adding process 5420 for gameID 504230
Feb 03 20:27:11 DAN-LNX steam[2130]: Adding process 5451 for gameID 504230
Feb 03 20:27:11 DAN-LNX steam[2130]: Adding process 5452 for gameID 504230
Feb 03 20:27:11 DAN-LNX steam[2130]: Adding process 5453 for gameID 504230
Feb 03 20:27:11 DAN-LNX steam[2130]: Adding process 5454 for gameID 504230
Feb 03 20:27:11 DAN-LNX steam[2130]: pid 3363 != 3362, skipping destruction (fork without exec?)
Feb 03 20:27:11 DAN-LNX steam[2130]: Game Recording - game stopped [gameid=1316910]
Feb 03 20:27:11 DAN-LNX steam[2130]: Removing process 5417 for gameID 1316910
Feb 03 20:27:11 DAN-LNX steam[2130]: Removing process 3487 for gameID 1316910
Feb 03 20:27:11 DAN-LNX steam[2130]: Removing process 3478 for gameID 1316910
Feb 03 20:27:11 DAN-LNX steam[2130]: Removing process 3456 for gameID 1316910
Feb 03 20:27:11 DAN-LNX steam[2130]: Removing process 3406 for gameID 1316910
Feb 03 20:27:11 DAN-LNX steam[2130]: Removing process 3395 for gameID 1316910
Feb 03 20:27:11 DAN-LNX steam[2130]: Removing process 3380 for gameID 1316910
Feb 03 20:27:11 DAN-LNX steam[2130]: Removing process 3370 for gameID 1316910
Feb 03 20:27:11 DAN-LNX steam[2130]: Removing process 3367 for gameID 1316910
Feb 03 20:27:11 DAN-LNX steam[2130]: Removing process 3365 for gameID 1316910
Feb 03 20:27:11 DAN-LNX steam[2130]: Removing process 3362 for gameID 1316910
Feb 03 20:27:11 DAN-LNX steam[2130]: Removing process 3361 for gameID 1316910
Feb 03 20:27:11 DAN-LNX steam[2130]: Removing process 3360 for gameID 1316910
Feb 03 20:27:11 DAN-LNX steam[2130]: Removing process 3359 for gameID 1316910
Feb 03 20:27:11 DAN-LNX steam[2130]: Removing process 3134 for gameID 1316910
Feb 03 20:27:11 DAN-LNX steam[2130]: Removing process 3132 for gameID 1316910
Feb 03 20:27:11 DAN-LNX steam[2130]: Removing process 3131 for gameID 1316910
Feb 03 20:27:12 DAN-LNX steam[2130]: Setting breakpad minidump AppID = 504230
Feb 03 20:27:12 DAN-LNX steam[2130]: Steam_SetMinidumpSteamID:  Caching Steam ID:  76561198060431335 [API loaded no]
Feb 03 20:27:12 DAN-LNX steam[2130]: CELESTE : 1.4.0.0
Feb 03 20:27:12 DAN-LNX steam[2130]: FNA3D Driver: OpenGL
Feb 03 20:27:12 DAN-LNX steam[2130]: OpenGL Renderer: NVIDIA GeForce RTX 4060 Laptop GPU/PCIe/SSE2
Feb 03 20:27:12 DAN-LNX steam[2130]: OpenGL Driver: 4.6.0 NVIDIA 570.86.16
Feb 03 20:27:12 DAN-LNX steam[2130]: OpenGL Vendor: NVIDIA Corporation
Feb 03 20:27:12 DAN-LNX steam[2130]: MojoShader Profile: glsl120
Feb 03 20:27:12 DAN-LNX steam[2130]: BEGIN LOAD
Feb 03 20:27:12 DAN-LNX steam[2130]: (process:3799): GLib-GObject-CRITICAL **: 20:27:12.944: g_object_unref: assertion 'G_IS_OBJECT (object)' failed
Feb 03 20:27:13 DAN-LNX steam[2130]:  - GFX LOAD: 769ms
Feb 03 20:27:13 DAN-LNX steam[2130]:  - MTN LOAD: 469ms
Feb 03 20:27:13 DAN-LNX steam[2130]: FULLSCREEN
Feb 03 20:27:13 DAN-LNX steam[2130]: GAME DISPLAYED (in 1407ms)
Feb 03 20:27:14 DAN-LNX steam[2130]: System.DllNotFoundException: libfmodstudio.so.10
Feb 03 20:27:14 DAN-LNX steam[2130]:   at (wrapper managed-to-native) FMOD.Studio.System.FMOD_Studio_System_Create(intptr&,uint)
Feb 03 20:27:14 DAN-LNX steam[2130]:   at FMOD.Studio.System.create (FMOD.Studio.System& studiosystem) [0x00005] in <4a26f9ded6704c87a2f47e66d2d85163>:0
Feb 03 20:27:14 DAN-LNX steam[2130]:   at Celeste.Audio.Init () [0x00010] in <4a26f9ded6704c87a2f47e66d2d85163>:0
Feb 03 20:27:14 DAN-LNX steam[2130]:   at Celeste.GameLoader.LoadThread () [0x0000c] in <4a26f9ded6704c87a2f47e66d2d85163>:0
Feb 03 20:27:14 DAN-LNX steam[2130]:   at Celeste.RunThread.RunThreadWithLogging (System.Action method) [0x00000] in <4a26f9ded6704c87a2f47e66d2d85163>:0
Feb 03 20:27:14 DAN-LNX steam[2130]: steam-runtime-urlopen: Unable to open URL
Feb 03 20:27:14 DAN-LNX steam[2130]: steam-runtime-urlopen: tried using xdg-desktop-portal, received error: Unable to open URL with xdg-desktop-portal: GDBus.Error:org.freedesktop.DBus.Error.UnknownMethod: No such interface ?org.freedesktop.portal.OpenURI? on object at path /org/freedesktop/portal/desktop
Feb 03 20:27:14 DAN-LNX steam[2130]: 02/03 20:27:14 minidumps folder is set to /tmp/dumps
Feb 03 20:27:14 DAN-LNX steam[2130]: 02/03 20:27:14 Init: Installing breakpad exception handler for appid(gameoverlayui)/version(20250128005041)/tid(5474)
Feb 03 20:27:14 DAN-LNX steam[2130]: 02/03 20:27:14 Init: Installing breakpad exception handler for appid(gameoverlayui)/version(1.0)/tid(5474)
Feb 03 20:27:14 DAN-LNX steam[2130]: Adding process 5472 for gameID 504230
Feb 03 20:27:14 DAN-LNX steam[2130]: Game Recording - game stopped [gameid=504230]
Feb 03 20:27:14 DAN-LNX steam[2130]: Removing process 5472 for gameID 504230
Feb 03 20:27:14 DAN-LNX steam[2130]: Removing process 5454 for gameID 504230
Feb 03 20:27:14 DAN-LNX steam[2130]: Removing process 5453 for gameID 504230
Feb 03 20:27:14 DAN-LNX steam[2130]: Removing process 5452 for gameID 504230
Feb 03 20:27:14 DAN-LNX steam[2130]: Removing process 5451 for gameID 504230
Feb 03 20:27:14 DAN-LNX steam[2130]: Removing process 5420 for gameID 504230
Feb 03 20:27:14 DAN-LNX steam[2130]: Removing process 5313 for gameID 504230
Feb 03 20:27:14 DAN-LNX steam[2130]: Removing process 5312 for gameID 504230
Feb 03 20:27:14 DAN-LNX steam[2130]: Removing process 5311 for gameID 504230
Feb 03 20:27:14 DAN-LNX steam[2130]: Removing process 5310 for gameID 504230
Feb 03 20:27:15 DAN-LNX steam[2130]: (process:5474): GLib-GObject-CRITICAL **: 20:27:15.938: g_object_unref: assertion 'G_IS_OBJECT (object)' failed
Feb 03 20:27:22 DAN-LNX steam[2130]: reaping pid: 3799 -- gameoverlayui
Feb 03 20:27:32 DAN-LNX steam[2130]: reaping pid: 5474 -- gameoverlayui

Seems to crash because of the missing fmod library.

Feb 03 20:27:14 DAN-LNX steam[2130]: System.DllNotFoundException: libfmodstudio.so.10
Feb 03 20:27:14 DAN-LNX steam[2130]:   at (wrapper managed-to-native) FMOD.Studio.System.FMOD_Studio_System_Create(intptr&,uint)
Feb 03 20:27:14 DAN-LNX steam[2130]:   at FMOD.Studio.System.create (FMOD.Studio.System& studiosystem) [0x00005] in <4a26f9ded6704c87a2f47e66d2d85163>:0
Feb 03 20:27:14 DAN-LNX steam[2130]:   at Celeste.Audio.Init () [0x00010] in <4a26f9ded6704c87a2f47e66d2d85163>:0
Feb 03 20:27:14 DAN-LNX steam[2130]:   at Celeste.GameLoader.LoadThread () [0x0000c] in <4a26f9ded6704c87a2f47e66d2d85163>:0
Feb 03 20:27:14 DAN-LNX steam[2130]:   at Celeste.RunThread.RunThreadWithLogging (System.Action method) [0x00000] in <4a26f9ded6704c87a2f47e66d2d85163>:0
Feb 03 20:27:14 DAN-LNX steam[2130]: steam-runtime-urlopen: Unable to open URL
Feb 03 20:27:14 DAN-LNX steam[2130]: steam-runtime-urlopen: tried using xdg-desktop-portal, received error: Unable to open URL with xdg-desktop-portal: 

and it terminates immediately, which looks like a game-specific (unrelated) issue?
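
If it helps to check other boots for the same thing, here is a minimal sketch (not anything Steam or Celeste ships) that pulls the missing library names out of a saved journal. The function name `missing_libraries` and the `sample` text are made up for illustration; the only thing taken from the log above is the `System.DllNotFoundException: <library>` line format.

```python
import re

# Matches the Mono-style "System.DllNotFoundException: <library>" lines
# as they appear in the journal excerpt above.
MISSING_LIB = re.compile(r"System\.DllNotFoundException: (\S+)")


def missing_libraries(log_text):
    """Return distinct library names reported as not found, in order of first appearance."""
    found = []
    for line in log_text.splitlines():
        match = MISSING_LIB.search(line)
        if match and match.group(1) not in found:
            found.append(match.group(1))
    return found


# Hypothetical sample built from three lines of the journal above.
sample = (
    "Feb 03 20:27:13 DAN-LNX steam[2130]: GAME DISPLAYED (in 1407ms)\n"
    "Feb 03 20:27:14 DAN-LNX steam[2130]: System.DllNotFoundException: libfmodstudio.so.10\n"
    "Feb 03 20:27:14 DAN-LNX steam[2130]:   at Celeste.Audio.Init () [0x00010] in <4a26f9ded6704c87a2f47e66d2d85163>:0\n"
)

print(missing_libraries(sample))  # ['libfmodstudio.so.10']
```

The same function could be pointed at the text of a journal saved with journalctl -b to see whether anything besides fmod is reported missing.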

Online
