#1 2025-02-05 16:58:07

tonykuroi
Member
Registered: 2012-07-05
Posts: 10

Computer randomly locks up at graphical level

Hello,

I have a device at home that is experiencing a strange issue where the graphical/rendering layer seems to just freeze at random intervals. Thus far, I haven't been able to work out exactly why, other than that it may be the AMD driver crashing, specifically the DRM layer.

I've uploaded a fairly long gist with the systemd logs for the past ten boots: https://gist.github.com/TonyKuroi/4a0d2 … 221db17516

Some common errors that stand out are:

kernel: amdgpu 0000:03:00.0: [drm] *ERROR* [CRTC:79:crtc-0] flip_done timed out
kwin_wayland[1385]: kwin_wayland_drm: Pageflip timed out! This is a kernel bug

Another time, the log contained this potentially telling snippet (a kernel WARNING with a call trace rather than an actual panic):

kwin_wayland[1091]: kwin_core: XCB error: 3 (BadWindow), sequence: 1913, resource id: 14680130, major code: 129 (SHAPE), minor code: 6 (Input)
kwin_wayland[1091]: kwin_wayland_drm: Pageflip timed out! This is a kernel bug
kernel: amdgpu 0000:03:00.0: [drm] *ERROR* [CRTC:79:crtc-0] flip_done timed out
kwin_wayland[1091]: kwin_wayland_drm: Pageflip timed out! This is a kernel bug
kernel: [drm:amdgpu_dm_atomic_check [amdgpu]] *ERROR* [CRTC:79:crtc-0] hw_done or flip_done timed out
kwin_wayland[1091]: kwin_wayland_drm: Pageflip timed out! This is a kernel bug
kwin_wayland[1091]: kwin_wayland_drm: Pageflip timed out! This is a kernel bug
kernel: amdgpu 0000:03:00.0: [drm] *ERROR* flip_done timed out
kernel: amdgpu 0000:03:00.0: [drm] *ERROR* [CRTC:79:crtc-0] commit wait timed out
kwin_wayland[1091]: kwin_wayland_drm: Pageflip timed out! This is a kernel bug
kwin_wayland[1091]: kwin_wayland_drm: Pageflip timed out! This is a kernel bug
kernel: amdgpu 0000:03:00.0: [drm] *ERROR* flip_done timed out
kernel: amdgpu 0000:03:00.0: [drm] *ERROR* [PLANE:76:plane-6] commit wait timed out
kernel: ------------[ cut here ]------------
kernel: WARNING: CPU: 19 PID: 821 at drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:8585 amdgpu_dm_atomic_commit_tail+0x3b4f/0x3c30 [amdgpu]
kernel: Modules linked in: rfcomm cmac algif_hash algif_skcipher af_alg snd_seq_dummy snd_hrtimer snd_seq ccm bnep vfat fat amd_atl intel_rapl_msr intel_rapl_common snd_hda_codec_realtek snd_hda_codec_generic snd_hda_scodec_component snd_hda_codec_hdmi mt7925e mt7925_common snd_hda_intel btu>
kernel:  crc32c_intel dm_mod amdxcp polyval_clmulni i2c_algo_bit polyval_generic drm_ttm_helper ghash_clmulni_intel ttm sha512_ssse3 drm_exec sha256_ssse3 gpu_sched sha1_ssse3 aesni_intel drm_suballoc_helper gf128mul drm_buddy nvme crypto_simd drm_display_helper cryptd usbhid ccp nvme_core c>
kernel: CPU: 19 UID: 0 PID: 821 Comm: systemd-logind Not tainted 6.12.1-arch1-1 #1 33f4a68ee85c59cb5d6edb747af0349869779b24
kernel: Hardware name: ASUS System Product Name/PRIME X870-P WIFI, BIOS 0231 07/18/2024
kernel: RIP: 0010:amdgpu_dm_atomic_commit_tail+0x3b4f/0x3c30 [amdgpu]
kernel: Code: f1 cd e9 dc fd ff ff 49 8d 87 50 31 04 00 c6 85 38 fe ff ff 00 48 89 85 48 fe ff ff e9 d8 cb ff ff 0f 0b e9 fc f2 ff ff 0f 0b <0f> 0b e9 12 f3 ff ff 0f 0b e9 11 cc ff ff 48 c7 85 28 fe ff ff 00
kernel: RSP: 0018:ffffb07005cdb6b0 EFLAGS: 00010086
kernel: RAX: 0000000000000001 RBX: 0000000000000286 RCX: ffff9455902d4118
kernel: RDX: 0000000000000001 RSI: 0000000000000297 RDI: ffff945594000178
kernel: RBP: ffffb07005cdb900 R08: ffffb07005cdb59c R09: 0000000000000000
kernel: R10: ffffb07005cdb608 R11: ffffb07005cdb60c R12: ffffb07005cdb768
kernel: R13: 0000000000000000 R14: ffff945721388c00 R15: ffff9455902d4000
kernel: FS:  000071c253e51900(0000) GS:ffff945cbe380000(0000) knlGS:0000000000000000
kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kernel: CR2: 00001e44002f8000 CR3: 000000013b73c000 CR4: 0000000000f50ef0
kernel: PKRU: 55555554
kernel: Call Trace:
kernel:  <TASK>
kernel:  ? amdgpu_dm_atomic_commit_tail+0x3b4f/0x3c30 [amdgpu 84e88e0534dc2928d32f8b075d0992f565877334]
kernel:  ? __warn.cold+0x93/0xf6
kernel:  ? amdgpu_dm_atomic_commit_tail+0x3b4f/0x3c30 [amdgpu 84e88e0534dc2928d32f8b075d0992f565877334]
kernel:  ? report_bug+0xff/0x140
kernel:  ? handle_bug+0x58/0x90
kernel:  ? exc_invalid_op+0x17/0x70
kernel:  ? asm_exc_invalid_op+0x1a/0x20
kernel:  ? amdgpu_dm_atomic_commit_tail+0x3b4f/0x3c30 [amdgpu 84e88e0534dc2928d32f8b075d0992f565877334]
kernel:  commit_tail+0x91/0x130
kernel:  drm_atomic_helper_commit+0x11a/0x140
kernel:  drm_atomic_commit+0xa6/0xe0
kernel:  ? __pfx___drm_printfn_info+0x10/0x10
kernel:  drm_client_modeset_commit_atomic+0x203/0x250
kernel:  drm_client_modeset_commit_locked+0x5a/0x160
kernel:  __drm_fb_helper_restore_fbdev_mode_unlocked+0x5e/0xd0
kernel:  drm_fb_helper_set_par+0x30/0x40
kernel:  fb_set_var+0x25c/0x460
kernel:  ? update_load_avg+0x7e/0x7b0
kernel:  ? __dequeue_entity+0x3f5/0x4b0
kernel:  ? sched_clock+0x10/0x30
kernel:  ? sched_clock_cpu+0xf/0x1d0
kernel:  ? psi_group_change+0x13b/0x310
kernel:  fbcon_blank+0x271/0x330
kernel:  do_unblank_screen+0xad/0x150
kernel:  complete_change_console+0x54/0x120
kernel:  vt_ioctl+0xec3/0x12c0
kernel:  ? do_syscall_64+0x8e/0x190
kernel:  tty_ioctl+0xe2/0x8a0
kernel:  ? __seccomp_filter+0x303/0x520
kernel:  __x64_sys_ioctl+0x91/0xd0
kernel:  do_syscall_64+0x82/0x190
kernel:  entry_SYSCALL_64_after_hwframe+0x76/0x7e
kernel: RIP: 0033:0x71c253923ced
kernel: Code: 04 25 28 00 00 00 48 89 45 c8 31 c0 48 8d 45 10 c7 45 b0 10 00 00 00 48 89 45 b8 48 8d 45 d0 48 89 45 c0 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 1a 48 8b 45 c8 64 48 2b 04 25 28 00 00 00
kernel: RSP: 002b:00007ffc804450e0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
kernel: RAX: ffffffffffffffda RBX: 0000000000000024 RCX: 000071c253923ced
kernel: RDX: 0000000000000001 RSI: 0000000000005605 RDI: 0000000000000024
kernel: RBP: 00007ffc80445130 R08: 00007ffc804450c0 R09: 0000590e7021c248
kernel: R10: 00007ffc80445110 R11: 0000000000000246 R12: 0000000000000000
kernel: R13: 00007ffc804451c0 R14: 0000590e701e9ee0 R15: 0000590e701eb010
kernel:  </TASK>
kernel: ---[ end trace 0000000000000000 ]---
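
For anyone who wants to tally how often each of these messages shows up in a boot, here is a minimal Python sketch. It only does substring matching on journal text; the category labels and the function name count_gpu_errors are made up for illustration, and the match strings are copied from the excerpts above.

```python
import re
from collections import Counter

# Match strings copied from the log excerpts in this thread.
# The labels on the left are hypothetical names chosen for this sketch.
PATTERNS = {
    "flip_done": re.compile(r"flip_done timed out"),
    "commit_wait": re.compile(r"commit wait timed out"),
    "dmcub": re.compile(r"DMCUB error"),
    "kwin_pageflip": re.compile(r"Pageflip timed out"),
}


def count_gpu_errors(lines):
    """Count how many lines match each known error pattern."""
    counts = Counter()
    for line in lines:
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[label] += 1
    return counts


# Small demo on two lines quoted from the first post.
sample = [
    "kernel: amdgpu 0000:03:00.0: [drm] *ERROR* [CRTC:79:crtc-0] flip_done timed out",
    "kwin_wayland[1385]: kwin_wayland_drm: Pageflip timed out! This is a kernel bug",
]
for label, n in count_gpu_errors(sample).most_common():
    print(f"{label}: {n}")
```

One way to use it would be to save the output of `journalctl -b -1` (the previous boot) to a file and pass its lines to count_gpu_errors() instead of the sample list.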

The hardware is as follows:

Ryzen 9 9900X
ASUS Prime x870-P WiFi
Sapphire Pulse RX 7900XT
Corsair 2x16 GB DDR5-6000 RAM (CMK32GX5M2B6000C30)
PSU is a 1000W, so it shouldn't be low on power

Is anyone able to glean anything from this? We've (it's my friend's computer) tried not running Discord (as it often happens while Discord is being used), a different DE, and an earlier kernel (but that broke WiFi and Bluetooth), and we keep everything up to date, but it's been happening for a couple of months now. Some days it doesn't crash at all; on others it may crash three times in an hour. Originally we were hoping it was just a driver bug that would be fixed, but there have been kernel and driver upgrades since it started and it's still an issue (he updates it regularly). We're just trying to figure out if we should start suspecting hardware.

Happy to provide any more information upon request.

#2 2025-02-05 17:42:10

V1del
Forum Moderator
Registered: 2012-10-16
Posts: 24,400

Re: Computer randomly locks up at graphical level

There's been a major new kernel release with a bunch of amdgpu fixes, try upgrading to that first and foremost (though seeing the copious amount of flatpak, that currently has known issues on that kernel: https://gitlab.archlinux.org/archlinux/ … issues/110 )

#3 2025-02-05 20:05:21

BrunoPT
Member
From: Portugal
Registered: 2013-11-19
Posts: 28

Re: Computer randomly locks up at graphical level

I started experiencing the same issue on my Framework 13 Ryzen 7 7840U today.

kernel: amdgpu 0000:c1:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
...
kernel: amdgpu 0000:c1:00.0: [drm] *ERROR* [CRTC:79:crtc-0] flip_done timed out
...
kernel: amdgpu 0000:c1:00.0: [drm] *ERROR* [CRTC:79:crtc-0] commit wait timed out

It happened twice, both times while watching a video in mpv.

#4 2025-02-05 21:02:29

BrunoPT
Member
From: Portugal
Registered: 2013-11-19
Posts: 28

Re: Computer randomly locks up at graphical level

The issue went away after I downgraded to Kernel 6.12.10-arch1-1

#5 2025-02-06 11:13:51

spiffyk
Member
Registered: 2025-02-06
Posts: 1

Re: Computer randomly locks up at graphical level

I am also affected, on a laptop with an integrated AMD Radeon 780M. Downgrading to kernel 6.12.10-arch1-1 works around this for me as well.

#6 2025-02-06 18:11:18

Kljunas2
Member
Registered: 2021-01-02
Posts: 7

Re: Computer randomly locks up at graphical level

#7 2025-02-07 17:16:00

Kljunas2
Member
Registered: 2021-01-02
Posts: 7

Re: Computer randomly locks up at graphical level

After downgrading linux-firmware and linux-firmware-whence to 20241210.b00a7f7e-1, I haven't yet run into any issues, even with the 6.13.1-arch1-1 version of linux kernel.

#8 2025-02-12 09:21:29

Kljunas2
Member
Registered: 2021-01-02
Posts: 7

Re: Computer randomly locks up at graphical level

Unfortunately, the firmware downgrade doesn't help. The video driver crashed soon after I posted the previous reply. A newer version of linux (6.13.2) didn't fix the problem either, so I am currently running an older version of linux, which works flawlessly.

#9 2025-02-13 22:07:27

BrunoPT
Member
From: Portugal
Registered: 2013-11-19
Posts: 28

Re: Computer randomly locks up at graphical level

Same for me; even linux-lts is broken now.

#10 2025-02-17 22:40:11

BrunoPT
Member
From: Portugal
Registered: 2013-11-19
Posts: 28

Re: Computer randomly locks up at graphical level

Just updated the computer today, the issue seems to be fixed for me.

Edit: I just got the issue popping up again, the screen freezes completely when playing a video on mpv.

Last edited by BrunoPT (2025-02-18 23:30:11)

#11 2025-02-18 17:12:54

Kljunas2
Member
Registered: 2021-01-02
Posts: 7

Re: Computer randomly locks up at graphical level

I also upgraded the following packages to the latest version:

amd-ucode (20250109.7673dffd-1 -> 20250210.5bc5868b-1)
linux (6.12.10.arch1-1 -> 6.13.2.arch1-1)
linux-firmware-whence (20250109.7673dffd-1 -> 20250210.5bc5868b-1)
linux-firmware (20250109.7673dffd-1 -> 20250210.5bc5868b-1)

But the issue is not completely gone. The video driver still crashes (same DMCUB error) although this time it could partially recover. The screen freezes every few seconds for about a second, otherwise it's almost usable.

#12 2025-02-25 16:06:22

Kljunas2
Member
Registered: 2021-01-02
Posts: 7

Re: Computer randomly locks up at graphical level

I've been using the 6.13.4-arch1-1 kernel without issues for two days now.

#13 2025-02-25 18:40:27

BrunoPT
Member
From: Portugal
Registered: 2013-11-19
Posts: 28

Re: Computer randomly locks up at graphical level

Yes, it's working fine for me as well.
I also added amdgpu.dcdebugmask=0x10 kernel parameter, I'm not sure if it helped.
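
To confirm that the parameter actually made it onto the running kernel's command line, a small check like the following may help. It is only a sketch: get_kernel_param and read_cmdline are hypothetical helpers, and the code assumes the usual space-separated name=value layout of /proc/cmdline. As far as I know, 0x10 in dcdebugmask corresponds to the bit that disables PSR (panel self refresh), but treat that as an assumption and check the kernel's amdgpu documentation.

```python
def get_kernel_param(cmdline, name):
    """Return the value of name=value from a kernel command line, or None.

    If the parameter appears more than once, this sketch keeps the last one.
    """
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if key == name and sep:
            value = val
    return value


def read_cmdline(path="/proc/cmdline"):
    """Read the command line of the running kernel (Linux only)."""
    with open(path) as f:
        return f.read()


# Demo on a made-up command line; on a real system pass read_cmdline() instead.
example = "root=UUID=abc rw amdgpu.dcdebugmask=0x10 quiet"
raw = get_kernel_param(example, "amdgpu.dcdebugmask")
if raw is None:
    print("amdgpu.dcdebugmask is not set")
else:
    # int(..., 0) accepts both hex ("0x10") and decimal ("16") spellings.
    print(f"amdgpu.dcdebugmask = {raw} (decimal {int(raw, 0)})")
```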

#14 2025-04-21 16:53:40

PHLAK
Member
From: Arizona
Registered: 2021-03-29
Posts: 6

Re: Computer randomly locks up at graphical level

I'm also experiencing this issue running kernel 6.14.3-arch1-1 with or without the amdgpu.dcdebugmask=0x10 parameter, even with the latest mesa version (i.e. 1:25.0.4-1).

Apr 21 09:25:17 Ratchet kernel: amdgpu 0000:c1:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
Apr 21 09:25:17 Ratchet kernel: amdgpu 0000:c1:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
Apr 21 09:25:18 Ratchet kernel: amdgpu 0000:c1:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
Apr 21 09:25:28 Ratchet kernel: amdgpu 0000:c1:00.0: [drm] *ERROR* [CRTC:79:crtc-0] flip_done timed out

However, if I revert mesa to 1:24.3.4-1 the issue seems to be better (i.e. less frequent) but still occurs.
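
To see how long the driver takes to go from the first DMCUB error to the flip_done timeout, the timestamps in the log excerpt above can be compared. A rough sketch, assuming the syslog-style prefix ("Apr 21 09:25:17 host ...") shown in that excerpt; the function names are hypothetical, and since that format carries no year, the calculation only makes sense within a single year.

```python
from datetime import datetime


def parse_ts(line):
    """Parse the leading 'Mon DD HH:MM:SS' stamp; the year defaults to 1900."""
    return datetime.strptime(line[:15], "%b %d %H:%M:%S")


def seconds_to_flip_timeout(lines):
    """Seconds between the first DMCUB error and the next flip_done timeout.

    Returns None if either message is missing.
    """
    start = None
    for line in lines:
        if start is None and "DMCUB error" in line:
            start = parse_ts(line)
        elif start is not None and "flip_done timed out" in line:
            return (parse_ts(line) - start).total_seconds()
    return None


# The four lines quoted in the post above.
LOG = [
    "Apr 21 09:25:17 Ratchet kernel: amdgpu 0000:c1:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data",
    "Apr 21 09:25:17 Ratchet kernel: amdgpu 0000:c1:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data",
    "Apr 21 09:25:18 Ratchet kernel: amdgpu 0000:c1:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data",
    "Apr 21 09:25:28 Ratchet kernel: amdgpu 0000:c1:00.0: [drm] *ERROR* [CRTC:79:crtc-0] flip_done timed out",
]
print(seconds_to_flip_timeout(LOG))
```

For the excerpt above this works out to 11 seconds between the first DMCUB error and the flip_done timeout.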

Last edited by PHLAK (2025-04-21 17:00:36)

#15 2025-04-22 20:25:45

PHLAK
Member
From: Arizona
Registered: 2021-03-29
Posts: 6

Re: Computer randomly locks up at graphical level

Actually, I think the issue is related to power profiles. I'm running GNOME (48) and had a udev rule set up to automatically switch my power profile based on whether or not I was on battery or wall power.

I noticed that while plugged in and the power profile was set to "Balanced" no crashing occurred. However, it definitely occurs when on battery and my power profile was set to "Power Saver". I tried manually switching to the "Balanced" profile while on battery power and didn't experience a lockup/crash.

I've now disabled my udev rule so I'm always running the "Balanced" profile by default. I've also removed the amdgpu.dcdebugmask=0x10 kernel parameter and haven't had a lock up yet. Will report back if the system locks up with this configuration.

Update: I have been experiencing lock ups with the above configuration. Going to try re-adding the amdgpu.dcdebugmask=0x10 kernel parameter to see if it helps.

Update 2: Still crashing after re-adding the amdgpu.dcdebugmask=0x10 kernel parameter.

Last edited by PHLAK (2025-04-24 16:46:45)

#16 2025-05-09 15:24:58

PHLAK
Member
From: Arizona
Registered: 2021-03-29
Posts: 6

Re: Computer randomly locks up at graphical level

Has anyone found a reliable workaround for this issue yet?

I keep getting full system freezes unless my laptop is hooked up to my Thunderbolt dock which makes it difficult to do anything when I'm away from home.

Last edited by PHLAK (2025-05-12 20:37:10)

#17 2025-05-12 20:36:15

PHLAK
Member
From: Arizona
Registered: 2021-03-29
Posts: 6

Re: Computer randomly locks up at graphical level

I believe I have finally solved my issues. After scouring my (root) system logs repeatedly I recently was looking at my user systemd logs (i.e. journalctl --user) and noticed a coredump caused by ulauncher at the time of a freeze.

[?] Process 2118 (ulauncher) of user 1000 dumped core.
                                                
Stack trace of thread 2118:
#0  0x00007daeb7f4f824 n/a (libgtk-3.so.0 + 0x34f824)
#1  0x00007daeb7f72351 n/a (libgtk-3.so.0 + 0x372351)
#2  0x00007daeb7f5a8d5 n/a (libgtk-3.so.0 + 0x35a8d5)
#3  0x00007daeba0784bb g_type_create_instance (libgobject-2.0.so.0 + 0x3e4bb)
#4  0x00007daeba05d768 n/a (libgobject-2.0.so.0 + 0x23768)
#5  0x00007daeba05ede7 g_object_new_with_properties (libgobject-2.0.so.0 + 0x24de7)
#6  0x00007daeba05fe42 g_object_new (libgobject-2.0.so.0 + 0x25e42)
#7  0x00007daeb818810a n/a (libgtk-3.so.0 + 0x58810a)
#8  0x00007daeba0784bb g_type_create_instance (libgobject-2.0.so.0 + 0x3e4bb)
#9  0x00007daeba05d768 n/a (libgobject-2.0.so.0 + 0x23768)
#10 0x00007daeba05f1e3 g_object_newv (libgobject-2.0.so.0 + 0x251e3)
#11 0x00007daeb7eefb85 n/a (libgtk-3.so.0 + 0x2efb85)
#12 0x00007daeb7ef288e n/a (libgtk-3.so.0 + 0x2f288e)
#13 0x00007daeb7ef36e8 n/a (libgtk-3.so.0 + 0x2f36e8)
#14 0x00007daeba1251d4 n/a (libglib-2.0.so.0 + 0x631d4)
#15 0x00007daeba128607 g_markup_parse_context_parse (libglib-2.0.so.0 + 0x66607)
#16 0x00007daeb7ef4b33 n/a (libgtk-3.so.0 + 0x2f4b33)
#17 0x00007daeb7ee6c4e gtk_builder_add_from_file (libgtk-3.so.0 + 0x2e6c4e)
#18 0x00007daebb581976 n/a (libffi.so.8 + 0x7976)
#19 0x00007daebb57e13c n/a (libffi.so.8 + 0x413c)
#20 0x00007daebb580f0e ffi_call (libffi.so.8 + 0x6f0e)
#21 0x00007daeba24de59 n/a (_gi.cpython-313-x86_64-linux-gnu.so + 0x33e59)
#22 0x00007daeba24b682 n/a (_gi.cpython-313-x86_64-linux-gnu.so + 0x31682)
#23 0x00007daebaf5f82d PyObject_Vectorcall (libpython3.13.so.1.0 + 0x15f82d)
#24 0x00007daebaf6ecd4 _PyEval_EvalFrameDefault (libpython3.13.so.1.0 + 0x16ecd4)
#25 0x00007daebafcde4a n/a (libpython3.13.so.1.0 + 0x1cde4a)
#26 0x00007daebaf5d377 _PyObject_MakeTpCall (libpython3.13.so.1.0 + 0x15d377)
#27 0x00007daebaf6ecd4 _PyEval_EvalFrameDefault (libpython3.13.so.1.0 + 0x16ecd4)
#28 0x00007daebb041695 PyEval_EvalCode (libpython3.13.so.1.0 + 0x241695)
#29 0x00007daebb07f433 n/a (libpython3.13.so.1.0 + 0x27f433)
#30 0x00007daebb07c81a n/a (libpython3.13.so.1.0 + 0x27c81a)
#31 0x00007daebb079f27 n/a (libpython3.13.so.1.0 + 0x279f27)
#32 0x00007daebb0791e0 n/a (libpython3.13.so.1.0 + 0x2791e0)
#33 0x00007daebb078ff3 n/a (libpython3.13.so.1.0 + 0x278ff3)
#34 0x00007daebb077244 Py_RunMain (libpython3.13.so.1.0 + 0x277244)
#35 0x00007daebb02e95c Py_BytesMain (libpython3.13.so.1.0 + 0x22e95c)
#36 0x00007daebac376b5 n/a (libc.so.6 + 0x276b5)
#37 0x00007daebac37769 __libc_start_main (libc.so.6 + 0x27769)
#38 0x000062eb64f1a045 _start (/usr/bin/python3.13 + 0x1045)
ELF object binary architecture: AMD x86-64

As a troubleshooting step I removed ulauncher and the system stopped freezing. I have since installed ulauncher-git from the AUR and my system has been working without lock ups for several days now.

For anyone continuing to experience this issue I recommend checking your user systemd logs (i.e. journalctl --user) for additional information that may lead you to the offending application(s).
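
A rough way to pull the crashing programs out of that output is sketched below. It assumes the systemd-coredump message format seen in the log above ("Process <pid> (<name>) of user <uid> dumped core."); find_coredumps is a hypothetical helper, not part of any existing tool.

```python
import re

# Matches e.g. "Process 2118 (ulauncher) of user 1000 dumped core."
CORE_RE = re.compile(r"Process (\d+) \((.+?)\) of user (\d+) dumped core")


def find_coredumps(lines):
    """Return (pid, name, uid) for every coredump message in the given lines."""
    hits = []
    for line in lines:
        match = CORE_RE.search(line)
        if match:
            hits.append((int(match.group(1)), match.group(2), int(match.group(3))))
    return hits


# Demo on the line quoted in the post above.
for pid, name, uid in find_coredumps(
    ["[?] Process 2118 (ulauncher) of user 1000 dumped core."]
):
    print(f"{name} (pid {pid}, uid {uid}) dumped core")
```

The lines could come from saving `journalctl --user -b` to a file; comparing the timestamps of any hits with the time of a freeze is then a manual step. `coredumpctl list` should show the same crashes in table form.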

Last edited by PHLAK (2025-05-12 20:36:45)
