Hey there! New user, here. Hopefully someone here will be able to make sense of what's currently going on with my system.
Occasionally, my system hangs during a game (so far it has happened in WoW and Diablo 4), with purple and green artifacts displayed on the screen (see the attached image). The first thing I suspected was that my newly installed (renewed/refurbished) RX 5600 XT was overheating, as I had seen this happen many years ago on a laptop I owned. However, according to the sensors utility, my system temperatures all seem fine when this happens.
I use a .yml config file to maintain a somewhat aggressive fan curve (using amdgpu-fan) for my GPU. During these crashes, the fans continue running for a few moments before suddenly shutting off entirely. After that, I continue to hear game audio, but I have to restart my computer using the chassis' reset or power button.
This morning, I played Diablo 4 for several hours without a single issue, but this last crash (depicted in the picture below) happened moments after selecting a character and loading the game world. The only pattern I think I have noticed (though this could be entirely coincidental) is that the crashes happen during sessions in which I am watching, or have watched, something from Twitch via streamlink, so that the stream plays in VLC on my desktop. It has not yet occurred during a session in which I have not used streamlink. This could just be a red herring.
My system's specs are as follows:
Intel i7-7700K (4.20 GHz)
32 GB DDR4 RAM (3200 MHz)
Radeon RX 5600 XT
GIGABYTE Z170X-Gaming 3 motherboard
KDE Plasma 6.3.5 with Wayland
6.15.1-arch1-2 64-bit kernel
According to corectrl, my GPU BIOS/VBIOS is 115-D182PI0-100 and it's running the amdgpu driver.
Mesa version 25.1.3.
I used
journalctl | grep "gpu"
and looked around the time of the crash (18:37) and following the system reboot, and I pasted the results here. One time when shutting down my PC following such a crash, I received the following messages in tty:
amdgpu 0000:03:00.0: amdgpu: ring gfx_0.0.0 timeout, but soft recovered
amdgpu 0000:03:00.0: amdgpu: ring gfx_0.0.0 timeout, but soft recovered
From the cursory research I have done on this issue, it seems that AMD released a VBIOS for my card (among others) to boost the max clock speed from the original 1500 MHz, and this ostensibly resulted in similar issues. But according to some, even flashing the VBIOS back to a previous version isn't a guaranteed fix. So far, I have tried some easy mitigations: slightly reducing my GPU's max clock from approximately 1800 MHz to 1700 MHz (I haven't messed with that much, as I have little to no experience with it) and capping the framerate at 60 in my games (my monitor is 100 Hz) to try to reduce load on the GPU. Some folks suggest undervolting the card somewhat, but I haven't done that yet.
Please let me know if there's anything else I can do to help you help me! I'd be happy to supply more details or run some commands to give more information. You might have to give me some hints/pointers on how to do some things, though, as this is my fifth day or so on Arch. (By the way, I installed this GPU right before installing Arch, and I didn't test it out much on my previous distro, Kubuntu, before making the distro switch.)
Thanks a bunch in advance!
Offline
https://justpaste.it/i6crx - can you please use a less-shitty pastebin service that hands out plain text and not pre-wrapped html?
Also please don't grep for random stuff, there're traces of a kernel module crash, but not the crash itself.
tl;dr, if you had such a crash during your previous ("-1") boot:
sudo journalctl -b -1 | curl -F 'file=@-' 0x0.st
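If the full journal is too large to paste somewhere, one alternative to grepping for "gpu" is to cut out a window of consecutive lines around the first ring timeout, so the surrounding context survives. A minimal Python sketch of that idea; the function name and the default window sizes are made up for illustration, and uploading the complete log is still preferable:

```python
import re

# Matches kernel lines such as
# "amdgpu 0000:03:00.0: amdgpu: ring gfx_0.0.0 timeout, but soft recovered"
TIMEOUT_RE = re.compile(r"amdgpu: ring \S+ timeout")


def window_around_first_timeout(lines, before=200, after=200):
    """Return the lines surrounding the first amdgpu ring timeout.

    Hypothetical helper: `before` and `after` are arbitrary defaults.
    Returns an empty list if no timeout line is found.
    """
    for index, line in enumerate(lines):
        if TIMEOUT_RE.search(line):
            start = max(0, index - before)
            # Slicing past the end of the list is safe in Python.
            return lines[start:index + after + 1]
    return []
```

It could be fed from a saved dump, e.g. `window_around_first_timeout(open("journal.txt").readlines())`, where journal.txt is assumed to hold the output of the journalctl command above.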
Offline
My apologies! It seems my network is blocking uploads to 0x0.st (?), so for now I used pastebin to dump about half of the journalctl output (closer to the crash time) here, to avoid the file size limit. Pastebin still has text wrapping for longer lines, but the readability seems to be much better.
Last edited by lostfile (2025-06-11 18:34:58)
Offline
Pastebin actually also gives you the raw file and I've a script to download that automatically, but please nb. that pastebin is discouraged because it likewise isn't globally available.
Jun 10 18:35:38 blackbird steam[2845]: 06/10 18:35:38 Init: Installing breakpad exception handler for appid(gameoverlayui)/version(20250519195100)/tid(27748)
Jun 10 18:35:38 blackbird steam[2845]: 06/10 18:35:38 Init: Installing breakpad exception handler for appid(gameoverlayui)/version(1.0)/tid(27748)
Jun 10 18:37:35 blackbird kernel: amdgpu 0000:03:00.0: amdgpu: Dumping IP State
Jun 10 18:37:35 blackbird kernel: amdgpu 0000:03:00.0: amdgpu: Dumping IP State Completed
Jun 10 18:37:35 blackbird kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.0.1 timeout, signaled seq=173864, emitted seq=173865
Jun 10 18:37:35 blackbird kernel: amdgpu 0000:03:00.0: amdgpu: Process information: process Diablo IV.exe pid 27642 thread vkd3d_queue pid 27720
Jun 10 18:37:35 blackbird kernel: amdgpu 0000:03:00.0: amdgpu: Starting comp_1.0.1 ring reset
Jun 10 18:37:36 blackbird kernel: amdgpu 0000:03:00.0: amdgpu: Ring comp_1.0.1 reset failure
Jun 10 18:37:36 blackbird kernel: amdgpu 0000:03:00.0: amdgpu: GPU reset begin!
Jun 10 18:37:36 blackbird kernel: amdgpu 0000:03:00.0: amdgpu: BACO reset
2 minutes into steam the GPU resets.
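For anyone comparing several such crashes, the timeout line can be pulled apart mechanically. A small Python sketch, assuming the message format matches the line quoted above (other kernel versions may word it differently; `parse_ring_timeout` is a name invented here):

```python
import re

# Pattern modelled on the timeout line quoted above (assumption: other
# kernel versions may phrase the message differently).
RING_TIMEOUT_RE = re.compile(
    r"ring (?P<ring>\S+) timeout, "
    r"signaled seq=(?P<signaled>\d+), emitted seq=(?P<emitted>\d+)"
)


def parse_ring_timeout(line):
    """Return (ring, signaled, emitted), or None if the line does not match."""
    match = RING_TIMEOUT_RE.search(line)
    if match is None:
        return None
    return match["ring"], int(match["signaled"]), int(match["emitted"])


line = ("Jun 10 18:37:35 blackbird kernel: amdgpu 0000:03:00.0: amdgpu: "
        "ring comp_1.0.1 timeout, signaled seq=173864, emitted seq=173865")
ring, signaled, emitted = parse_ring_timeout(line)
# Difference between the last emitted and the last signaled sequence number
print(ring, emitted - signaled)
```

As far as I understand it, the gap between the emitted and signaled sequence numbers is the number of submissions on that ring that never completed, here a single one on the compute ring.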
Jun 10 18:37:37 blackbird amdgpu-fan[27817]: File "/usr/lib/python3.13/site-packages/amdgpu_fan/lib/amdgpu.py", line 37, in read_endpoint
Jun 10 18:37:37 blackbird amdgpu-fan[27817]: return e.read()
Jun 10 18:37:37 blackbird amdgpu-fan[27817]: ~~~~~~^^
Jun 10 18:37:37 blackbird amdgpu-fan[27817]: PermissionError: [Errno 1] Operation not permitted
Jun 10 18:37:37 blackbird systemd[1]: amdgpu-fan.service: Main process exited, code=exited, status=1/FAILURE
Jun 10 18:37:37 blackbird systemd[1]: amdgpu-fan.service: Failed with result 'exit-code'.
Jun 10 18:37:37 blackbird systemd[1]: amdgpu-fan.service: Scheduled restart job, restart counter is at 5.
Jun 10 18:37:37 blackbird systemd[1]: Started amdgpu fan controller.
Jun 10 18:37:37 blackbird amdgpu-fan[27820]: starting amdgpu-fan
Jun 10 18:37:37 blackbird amdgpu-fan[27820]: Traceback (most recent call last):
the amdgpu-fan daemon throws a tantrum because it can no longer talk to the GPU
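The traceback shows an unhandled PermissionError from a plain file read. Purely as an illustration of how such a read could tolerate the GPU going away mid-reset, here is a hedged Python sketch; it is a hypothetical stand-in and not amdgpu-fan's actual code:

```python
def read_endpoint_safely(path):
    """Read a sysfs-style file, returning None if it cannot be read.

    Hypothetical helper, not part of amdgpu-fan. PermissionError (as in
    the traceback above) is a subclass of OSError, so it is caught here
    along with "file not found" and similar failures.
    """
    try:
        with open(path, "r") as endpoint:
            return endpoint.read()
    except OSError:
        return None
```

A caller would then have to decide what to do with None, for instance skipping that polling cycle. Either way, the daemon crashing is a symptom of the reset, not its cause.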
Jun 10 18:37:38 blackbird steam[2845]: radv/amdgpu: The CS has been cancelled because the context is lost. This context is innocent.
Jun 10 18:37:38 blackbird plasmashell[979]: amdgpu: The CS has cancelled because the context is lost. This context is innocent.
Jun 10 18:37:38 blackbird kwin_wayland[737]: kwin_scene_opengl: 0x3: GL_CONTEXT_LOST in context lost
steam, kwin and plasmashell lose their respective GL contexts
Jun 10 18:37:38 blackbird kwin_wayland[737]: kwin_scene_opengl: A graphics reset not attributable to the current GL context occurred.
Jun 10 18:37:41 blackbird systemd-coredump[27830]: Process 979 (plasmashell) of user 1000 dumped core.
kwin complains, plasma crashes.
If you run a basic X11 session (openbox) and steam there:
1. do you face a similar GPU reset?
2. does that also result in the LSD display?
(The visual part could just be the kwin_wayland compositor, and since it draws everything, you cannot see whether an underlying non-GL client, and therefore the framebuffer, is actually fine)
Offline
I'll try using the X11 session and report back with what I find. After playing for quite some time across multiple play sessions, I was able to reproduce the crash, and I've successfully saved the journalctl output to 0x0.st this time. While I'm testing, here's the log of the previous session, in case it yields any additional information. https://0x0.st/8EwW.txt
Last edited by lostfile (2025-06-12 19:32:08)
Offline
I return with some results. I ran openbox in TTY3 and only used Steam in that session. The familiar crash did occur, with all the same symptoms. Here is the journalctl output: https://0x0.st/8ElH.txt.
Furthermore, I was notified that an additional log file was generated by X11; in case that's useful, I have uploaded it here: https://0x0.st/8ElM.txt.
X11 returned some messages as well, following the crash:
Offline
https://bbs.archlinux.org/viewtopic.php … 7#p2246017 - but that's on a ryzen APU, so limiting the c-states will probably not do much.
I'd suggest to throw a different software stack (notably kernel and mesa) at it, maybe a *slightly* older ubuntu live system or so and see whether you get the GPU to reset there as well.
Offline