Hello everyone.
I've been facing a problem with my PC. I'm not sure where the problem is.
I'm using GNOME on Wayland with a 7900X and an RX 7800 XT, on the latest stable kernel. During long gaming sessions (not sure exactly how long, but around 2-3 hours), my GNOME session crashes and drops back to the login screen.
The complete journalctl log can be found here: https://pastejustit.com/jzcn4nprav
But I think the relevant part is here:
Oct 17 03:48:28 amd-arch kernel: amdgpu 0000:03:00.0: amdgpu: ring gfx_0.0.0 timeout, signaled seq=5979673, emitted seq=5979675
Oct 17 03:48:28 amd-arch kernel: amdgpu 0000:03:00.0: amdgpu: Process information: process vrcompositor.re pid 12608 thread RenderThread pid 12632
Oct 17 03:48:28 amd-arch kernel: amdgpu 0000:03:00.0: amdgpu: GPU reset begin!
Oct 17 03:48:30 amd-arch kernel: amdgpu 0000:03:00.0: amdgpu: MES failed to respond to msg=REMOVE_QUEUE
Oct 17 03:48:30 amd-arch kernel: [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
Oct 17 03:48:31 amd-arch kernel: [drm:gfx_v11_0_hw_fini [amdgpu]] *ERROR* failed to halt cp gfx
Oct 17 03:48:31 amd-arch kernel: amdgpu 0000:03:00.0: amdgpu: Dumping IP State
Oct 17 03:48:31 amd-arch kernel: amdgpu 0000:03:00.0: amdgpu: Dumping IP State Completed
Oct 17 03:48:31 amd-arch kernel: amdgpu 0000:03:00.0: amdgpu: MODE1 reset
Oct 17 03:48:31 amd-arch kernel: amdgpu 0000:03:00.0: amdgpu: GPU mode1 reset
Oct 17 03:48:31 amd-arch kernel: amdgpu 0000:03:00.0: amdgpu: GPU smu mode1 reset
Oct 17 03:48:31 amd-arch kernel: amdgpu 0000:03:00.0: amdgpu: GPU reset succeeded, trying to resume
Oct 17 03:48:31 amd-arch kernel: [drm] PCIE GART of 512M enabled (table at 0x00000083FEB00000).
Oct 17 03:48:31 amd-arch kernel: [drm] VRAM is lost due to GPU reset!
Oct 17 03:48:31 amd-arch kernel: amdgpu 0000:03:00.0: amdgpu: PSP is resuming...
Oct 17 03:48:31 amd-arch kernel: amdgpu 0000:03:00.0: amdgpu: reserve 0xa700000 from 0x83e0000000 for PSP TMR
Oct 17 03:48:32 amd-arch kernel: amdgpu 0000:03:00.0: amdgpu: RAP: optional rap ta ucode is not available
Oct 17 03:48:32 amd-arch kernel: amdgpu 0000:03:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
Oct 17 03:48:32 amd-arch kernel: amdgpu 0000:03:00.0: amdgpu: SMU is resuming...
Oct 17 03:48:32 amd-arch kernel: amdgpu 0000:03:00.0: amdgpu: smu driver if version = 0x0000003d, smu fw if version = 0x00000040, smu fw program = 0, smu fw version = 0x00505000 (80.80.0)
Oct 17 03:48:32 amd-arch kernel: amdgpu 0000:03:00.0: amdgpu: SMU driver if version not matched
Oct 17 03:48:32 amd-arch kernel: amdgpu 0000:03:00.0: amdgpu: SMU is resumed successfully!
Oct 17 03:48:32 amd-arch kernel: [drm] DMUB hardware initialized: version=0x07002A00
Oct 17 03:48:32 amd-arch kernel: amdgpu 0000:03:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
Oct 17 03:48:32 amd-arch kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
Oct 17 03:48:32 amd-arch kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
Oct 17 03:48:32 amd-arch kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 6 on hub 0
Oct 17 03:48:32 amd-arch kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 7 on hub 0
Oct 17 03:48:32 amd-arch kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 8 on hub 0
Oct 17 03:48:32 amd-arch kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 9 on hub 0
Oct 17 03:48:32 amd-arch kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 10 on hub 0
Oct 17 03:48:32 amd-arch kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 11 on hub 0
Oct 17 03:48:32 amd-arch kernel: amdgpu 0000:03:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
Oct 17 03:48:32 amd-arch kernel: amdgpu 0000:03:00.0: amdgpu: ring sdma1 uses VM inv eng 13 on hub 0
Oct 17 03:48:32 amd-arch kernel: amdgpu 0000:03:00.0: amdgpu: ring vcn_unified_0 uses VM inv eng 0 on hub 8
Oct 17 03:48:32 amd-arch kernel: amdgpu 0000:03:00.0: amdgpu: ring vcn_unified_1 uses VM inv eng 1 on hub 8
Oct 17 03:48:32 amd-arch kernel: amdgpu 0000:03:00.0: amdgpu: ring jpeg_dec uses VM inv eng 4 on hub 8
Oct 17 03:48:32 amd-arch kernel: amdgpu 0000:03:00.0: amdgpu: ring mes_kiq_3.1.0 uses VM inv eng 14 on hub 0
Oct 17 03:48:32 amd-arch kernel: amdgpu 0000:03:00.0: amdgpu: recover vram bo from shadow start
Oct 17 03:48:32 amd-arch kernel: amdgpu 0000:03:00.0: amdgpu: recover vram bo from shadow done
Oct 17 03:48:32 amd-arch kernel: amdgpu 0000:03:00.0: amdgpu: GPU reset(2) succeeded!
Oct 17 03:48:32 amd-arch steam[2686]: amdgpu: The CS has cancelled because the context is lost. This context is innocent.
Oct 17 03:48:32 amd-arch gnome-shell[1653]: amdgpu: The CS has cancelled because the context is lost. This context is innocent.
Oct 17 03:48:32 amd-arch gnome-shell[1653]: == Stack trace for context 0x5b69346caf00 ==
Oct 17 03:48:32 amd-arch gnome-shell[1653]: #0 5b69347bf7e8 i resource:///org/gnome/shell/ui/init.js:21 (1f80fe70c90 @ 48)
Oct 17 03:48:32 amd-arch systemd-coredump[33479]: Process 1653 (gnome-shell) of user 1000 terminated abnormally with signal 6/ABRT, processing...
Oct 17 03:48:32 amd-arch systemd[1]: Started Process Core Dump (PID 33479/UID 0).
Oct 17 03:48:32 amd-arch steam[2686]: amdgpu: The CS has cancelled because the context is lost. This context is innocent.
Oct 17 03:48:32 amd-arch steam[2686]: radv/amdgpu: The CS has been cancelled because the context is lost. This context is innocent.
Oct 17 03:48:32 amd-arch steam[2686]: 10/17 03:48:32 Failed writing minidump, nothing to upload.
Oct 17 03:48:32 amd-arch pipewire[2171]: pw.node: (Dummy-Driver-29) graph xrun not-triggered (51 suppressed)
Oct 17 03:48:32 amd-arch pipewire[2171]: pw.node: (Dummy-Driver-29) xrun state:0x7ea49fef5008 pending:1/3 s:2468077941312 a:2468077970382 f:2468077970812 waiting:29070 process:430 status:triggered
Oct 17 03:48:32 amd-arch pipewire[2171]: pw.node: (ALVR Audio-92) xrun state:0x7ea49fbc2008 pending:0/2 s:2468080622087 a:2468077957462 f:2468077967042 waiting:18446744073706886991 process:9580 status:triggered
Oct 17 03:48:32 amd-arch assert_20241017034823_4.dmp[33484]: Uploading dump (out-of-process)
/tmp/dumps/assert_20241017034823_4.dmp
Oct 17 03:48:32 amd-arch vrcompositor-linux[12617]: assert_20241017034823_4.dmp[33484]: Uploading dump (out-of-process)
Oct 17 03:48:32 amd-arch vrcompositor-linux[12617]: /tmp/dumps/assert_20241017034823_4.dmp
Oct 17 03:48:32 amd-arch systemd-coredump[33482]: Process 12569 (Main Thread) of user 1000 terminated abnormally with signal 6/ABRT, processing...
Oct 17 03:48:32 amd-arch steam[2686]: crash_20241017034832_2.dmp[33488]: Uploading dump (out-of-process)
Oct 17 03:48:32 amd-arch steam[2686]: /tmp/dumps/crash_20241017034832_2.dmp
Oct 17 03:48:32 amd-arch crash_20241017034832_2.dmp[33488]: Uploading dump (out-of-process)
/tmp/dumps/crash_20241017034832_2.dmp
Oct 17 03:48:32 amd-arch systemd-coredump[33489]: Process 12378 (vrmonitor) of user 1000 terminated abnormally with signal 6/ABRT, processing...
Oct 17 03:48:32 amd-arch crash_20241017034832_5.dmp[33492]: Uploading dump (out-of-process)
/tmp/dumps/crash_20241017034832_5.dmp
Oct 17 03:48:32 amd-arch vrcompositor-linux[12617]: crash_20241017034832_5.dmp[33492]: Uploading dump (out-of-process)
Oct 17 03:48:32 amd-arch vrcompositor-linux[12617]: /tmp/dumps/crash_20241017034832_5.dmp
Oct 17 03:48:32 amd-arch systemd[1]: Started Process Core Dump (PID 33482/UID 0).
Oct 17 03:48:32 amd-arch systemd-coredump[33493]: Process 12608 (vrcompositor.re) of user 1000 terminated abnormally with signal 6/ABRT, processing...
Oct 17 03:48:32 amd-arch systemd[1]: Started Process Core Dump (PID 33489/UID 0).
Oct 17 03:48:32 amd-arch systemd[1]: Started Process Core Dump (PID 33493/UID 0).
If anyone could help, I'd appreciate it.
Thank you.
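For anyone comparing their own logs against the excerpt above, here is a rough Python sketch that pulls the ring timeout and the blamed process out of journal text. The two regular expressions are modelled only on the two kernel lines quoted in this post, so treat them as assumptions: other kernel versions may word these messages differently. The function name find_ring_timeouts is made up for this example.

```python
import re

# Patterns modelled on the two kernel lines quoted above (assumption:
# other kernel versions may format these messages differently).
TIMEOUT_RE = re.compile(
    r"amdgpu: ring (?P<ring>\S+) timeout, "
    r"signaled seq=(?P<signaled>\d+), emitted seq=(?P<emitted>\d+)"
)
PROCESS_RE = re.compile(
    r"amdgpu: Process information: process (?P<process>\S+) pid (?P<pid>\d+)"
)


def find_ring_timeouts(lines):
    """Return one dict per ring timeout, with the blamed process if logged."""
    events = []
    for line in lines:
        timeout = TIMEOUT_RE.search(line)
        if timeout:
            events.append({
                "ring": timeout["ring"],
                "signaled": int(timeout["signaled"]),
                "emitted": int(timeout["emitted"]),
                "process": None,
                "pid": None,
            })
            continue
        proc = PROCESS_RE.search(line)
        # Attach the process line to the most recent timeout that lacks one.
        if proc and events and events[-1]["process"] is None:
            events[-1]["process"] = proc["process"]
            events[-1]["pid"] = int(proc["pid"])
    return events


sample = [
    "Oct 17 03:48:28 amd-arch kernel: amdgpu 0000:03:00.0: amdgpu: ring gfx_0.0.0 timeout, signaled seq=5979673, emitted seq=5979675",
    "Oct 17 03:48:28 amd-arch kernel: amdgpu 0000:03:00.0: amdgpu: Process information: process vrcompositor.re pid 12608 thread RenderThread pid 12632",
]
for event in find_ring_timeouts(sample):
    print(event)
```

To run it over a whole crash, the kernel messages of the previous boot (for example the output of journalctl -k -b -1 saved to a file) could be passed in line by line instead of the two sample lines.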
Same issue. Some people had luck fixing it here: https://bbs.archlinux.org/viewtopic.php … 5#p2197225
But it didn't work for me. I log in on tty3 from GDM, then reset dconf with dconf reset -f /
Then I log in and never log out until the next gnome-shell update, or downgrade gnome-shell and mutter to 46.5.
Supposedly the issue is fixed upstream in the gnome-shell GitLab (https://gitlab.gnome.org/GNOME/gnome-sh … ssues/7912), but the update hasn't hit the Arch repos yet.
Last edited by houssem (2024-10-17 12:55:56)
The issue you mentioned doesn't look like the same problem as mine. I can still log in and everything. My problem is that gnome-shell crashes after long gaming sessions. After the crash, if I log in and start playing again, it crashes again within a couple of minutes. If I let my PC rest for an hour or two without playing any games, I can do another one- to two-hour gaming session. I'm not sure if it's a hardware problem, Mesa, or the kernel. I don't think I had this issue 2-3 weeks ago.
I think you may have the same problem as me. I've been having it for months now and never found out what's causing it.
Wayland, AMD RX6700 XT, always up to date. I can tell you some things:
- If I log in again through GDM after the crash and open the system monitor, there are still some Wine processes running; sometimes even the music from the game keeps playing (wtf).
- This never happened to me when playing demanding games on Discord with my friends for hours and hours, like playing Borderlands for more than 6 hours straight.
- Sometimes this happened to me when I wasn't playing any game, BUT it could be that I had played something and then quit; I can't tell.
As soon as I have another crash I'll try to get the journal and compare it with yours.
Some programs I normally use: Steam, Discord, Telegram Desktop, Firefox, Blanket.
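To check for the leftover Wine processes mentioned in the list above without opening the system monitor, something like this Python sketch might do. The matching rule (name contains "wine" or ends in ".exe") is only a guess at what such processes are called, and the function names are invented for the example. Note that /proc/<pid>/comm truncates names to 15 characters, so a long .exe name can slip through.

```python
import os


def looks_like_wine(name):
    """Heuristic (an assumption): Wine helpers and Windows game binaries."""
    lowered = name.lower()
    return "wine" in lowered or lowered.endswith(".exe")


def list_processes(proc_root="/proc"):
    """Yield (pid, name) for every process readable under proc_root."""
    if not os.path.isdir(proc_root):
        return
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "comm")) as handle:
                yield int(entry), handle.read().strip()
        except OSError:
            # The process exited or is not readable; skip it.
            continue


def leftover_wine_processes(processes):
    """Filter (pid, name) pairs down to the ones that look Wine-related."""
    return [(pid, name) for pid, name in processes if looks_like_wine(name)]


if __name__ == "__main__":
    for pid, name in leftover_wine_processes(list_processes()):
        print(pid, name)
```

Running it right after logging back in would show whether anything from the game survived the gnome-shell crash.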
I have been having similar crashes, infrequent at first and then more frequent, and I isolated the cause to the kernel, starting with some 6.10 or 6.11 release. I don't use GDM or GNOME (I run i3 with SDDM), but I do have an AMD GPU, and the crashes only seemed to occur while gaming, anywhere from a few minutes to a few hours in.
To solve my issue, I switched to the linux-lts kernel (I was using linux-zen before, and I believe the mainline linux kernel is also affected), and the problem no longer occurs for me.
It may be worth trying the linux-lts kernel to see if it helps.
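When testing this, it helps to confirm which kernel you actually booted, since installing linux-lts does not by itself make it the default boot entry. A minimal Python sketch, assuming Arch's usual naming where the release string ends in the flavor (for example "-lts" or "-zen"); the function name kernel_flavor is made up for this example and the suffix list is not exhaustive.

```python
import platform


def kernel_flavor(release):
    """Guess the Arch kernel package from a `uname -r` style string.

    Assumes Arch's naming, where the release string ends in the flavor,
    e.g. "-lts" or "-zen"; anything unrecognised is reported as "other".
    """
    for suffix in ("lts", "zen", "hardened"):
        if release.endswith("-" + suffix):
            return "linux-" + suffix
    if "-arch" in release:
        return "linux"
    return "other"


if __name__ == "__main__":
    release = platform.release()
    print(release, "->", kernel_flavor(release))
```

If it still reports linux or linux-zen after a reboot, the boot loader entry probably needs changing before the LTS test means anything.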
Desktop: Arch Linux | i3-gaps WM | AMD Ryzen 5700X | 32GB RAM | AMD Radeon RX 6700XT | Dual monitors @ 2560x1440
Laptop: Debian Linux | i3WM | Dell Latitude E7270 | Intel Core i5-6300U | 16GB RAM
~ Do or do not, there is no try ~
Here is a link to my post about it.
https://bbs.archlinux.org/viewtopic.php?id=299954