Recently I started experiencing Sway freezes. Luckily, I found an operation that always causes a freeze: doing Menu → File → Open in GIMP 2.10 several times. I experience freezes in other situations too. After freezing, I can't switch to another virtual terminal with Ctrl+Alt+F2, and my Sway hotkey which executes `poweroff` doesn't work. The version of the `linux` package is 6.12.4.
I collected `journalctl` records in the following way:
find the time T of the last record
cause Sway to freeze
wait for one minute
turn off and turn on the computer
collect the records after T
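The collection steps above can be sketched with `journalctl` itself. This is only a minimal sketch: `gpu_lines` is a hypothetical helper name, the pattern is a guess at which records matter, and the timestamp in the comment is an example value, not the exact T used here.

```shell
# Hypothetical helper: keep only kernel records that mention the GPU driver.
# It reads journal text on stdin, so it also works on a saved log file.
gpu_lines() {
    grep -E 'kernel: .*(amdgpu|drm)'
}

# After rebooting from a freeze, one might run (not executed here):
#   journalctl -b -1 --no-pager | gpu_lines
# or, with a timestamp T noted before causing the freeze:
#   journalctl --since "2024-12-15 21:49:00" --no-pager | gpu_lines
```

`-b -1` selects the previous boot, which is usually the one that froze if the machine was power-cycled right afterwards.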
Log excerpts:
дек 15 21:49:25 beroal chronyd[2356]: Selected source 193.106.144.13 (2.arch.pool.ntp.org)
дек 15 21:49:36 beroal chronyd[2356]: Source 144.24.146.96 replaced with 15.207.248.194 (2.in.pool.ntp.org)
дек 15 21:49:55 beroal kernel: [drm:amdgpu_dm_atomic_check [amdgpu]] *ERROR* [CRTC:73:crtc-0] hw_done or flip_done timed out
дек 15 21:56:25 beroal Tor[2768]: Performing bandwidth self-test...done.
дек 15 21:57:03 beroal kernel: amdgpu 0000:27:00.0: amdgpu: Dumping IP State
дек 15 21:57:46 beroal kernel: clocksource: timekeeping watchdog on CPU0: wd-tsc-wd excessive read-back delay of 90904500ns vs. limit of 100000ns, wd-wd read-back delay only 2793ns, attempt 3, marking tsc unstable
дек 15 21:57:46 beroal kernel: tsc: Marking TSC unstable due to clocksource watchdog
дек 15 21:57:46 beroal kernel: TSC found unstable after boot, most likely due to broken BIOS. Use 'tsc=unstable'.
дек 15 21:57:46 beroal kernel: sched_clock: Marking unstable (379056140904, 425895944)<-(379485786147, -3738920)
дек 15 21:57:46 beroal kernel: clocksource: Checking clocksource tsc synchronization from CPU 2 to CPUs 0,3.
дек 15 21:57:47 beroal kernel: clocksource: Switched to clocksource hpet
дек 15 21:57:54 beroal kernel: INFO: NMI handler (perf_event_nmi_handler) took too long to run: 85.934 msecs
дек 15 21:57:54 beroal kernel: perf: interrupt took too long (671410 > 2500), lowering kernel.perf_event_max_sample_rate to 300
дек 15 22:01:39 beroal systemd[2544]: Starting Virtual filesystem service - disk device monitor...
дек 15 22:01:39 beroal systemd[1]: Starting Disk Manager...
дек 15 22:01:39 beroal udisksd[2825]: udisks daemon version 2.10.1 starting
дек 15 22:01:39 beroal systemd[1]: Started Disk Manager.
дек 15 22:01:39 beroal udisksd[2825]: Acquired the name org.freedesktop.UDisks2 on the system message bus
дек 15 22:01:40 beroal systemd[2544]: Started Virtual filesystem service - disk device monitor.
дек 15 22:02:00 beroal chronyd[2370]: Source 94.158.46.150 replaced with 91.236.251.24 (2.ua.pool.ntp.org)
дек 15 22:02:43 beroal kernel: amdgpu 0000:27:00.0: amdgpu: Dumping IP State
дек 15 22:05:22 beroal chronyd[2369]: Selected source 194.54.80.29 (2.arch.pool.ntp.org)
дек 15 22:05:25 beroal systemd[2540]: Starting Virtual filesystem service - disk device monitor...
дек 15 22:05:25 beroal systemd[1]: Starting Disk Manager...
дек 15 22:05:25 beroal udisksd[2867]: udisks daemon version 2.10.1 starting
дек 15 22:05:25 beroal systemd[1]: Started Disk Manager.
дек 15 22:05:25 beroal udisksd[2867]: Acquired the name org.freedesktop.UDisks2 on the system message bus
дек 15 22:05:26 beroal systemd[2540]: Started Virtual filesystem service - disk device monitor.
дек 15 22:06:27 beroal kernel: amdgpu 0000:27:00.0: amdgpu: Dumping IP State
дек 15 22:08:46 beroal systemd[2522]: Starting Virtual filesystem service - disk device monitor...
дек 15 22:08:46 beroal systemd[1]: Starting Disk Manager...
дек 15 22:08:46 beroal udisksd[2806]: udisks daemon version 2.10.1 starting
дек 15 22:08:46 beroal systemd[1]: Started Disk Manager.
дек 15 22:08:46 beroal udisksd[2806]: Acquired the name org.freedesktop.UDisks2 on the system message bus
дек 15 22:08:47 beroal systemd[2522]: Started Virtual filesystem service - disk device monitor.
дек 15 22:09:57 beroal kernel: amdgpu 0000:27:00.0: amdgpu: Dumping IP State
дек 15 22:15:25 beroal systemd[2611]: Starting Virtual filesystem service - disk device monitor...
дек 15 22:15:25 beroal systemd[1]: Starting Disk Manager...
дек 15 22:15:25 beroal udisksd[2812]: udisks daemon version 2.10.1 starting
дек 15 22:15:25 beroal systemd[1]: Started Disk Manager.
дек 15 22:15:25 beroal udisksd[2812]: Acquired the name org.freedesktop.UDisks2 on the system message bus
дек 15 22:15:26 beroal systemd[2611]: Started Virtual filesystem service - disk device monitor.
дек 15 22:15:50 beroal kernel: amdgpu 0000:27:00.0: amdgpu: Dumping IP State
дек 15 22:16:08 beroal chronyd[2371]: Selected source 193.106.144.6 (2.arch.pool.ntp.org)
дек 15 22:16:47 beroal kernel: clocksource: timekeeping watchdog on CPU0: wd-tsc-wd excessive read-back delay of 90905478ns vs. limit of 100000ns, wd-wd read-back delay only 2793ns, attempt 3, marking tsc unstable
дек 15 22:16:47 beroal kernel: tsc: Marking TSC unstable due to clocksource watchdog
дек 15 22:16:47 beroal kernel: TSC found unstable after boot, most likely due to broken BIOS. Use 'tsc=unstable'.
дек 15 22:16:47 beroal kernel: sched_clock: Marking unstable (173080222205, 512953137)<-(173597955405, -4791704)
дек 15 22:16:47 beroal kernel: clocksource: Checking clocksource tsc synchronization from CPU 3 to CPUs 0-2.
дек 15 22:16:48 beroal kernel: clocksource: Switched to clocksource hpet
дек 15 22:21:31 beroal systemd[2532]: Starting Virtual filesystem service - disk device monitor...
дек 15 22:21:31 beroal systemd[1]: Starting Disk Manager...
дек 15 22:21:31 beroal udisksd[2841]: udisks daemon version 2.10.1 starting
дек 15 22:21:31 beroal systemd[1]: Started Disk Manager.
дек 15 22:21:31 beroal udisksd[2841]: Acquired the name org.freedesktop.UDisks2 on the system message bus
дек 15 22:21:32 beroal systemd[2532]: Started Virtual filesystem service - disk device monitor.
дек 15 22:22:10 beroal chronyd[2361]: Source 109.110.82.19 replaced with 62.149.2.7 (0.ua.pool.ntp.org)
дек 15 22:22:33 beroal kernel: amdgpu 0000:27:00.0: amdgpu: Dumping IP State
дек 15 23:12:34 beroal systemd[2540]: Starting Virtual filesystem service - disk device monitor...
дек 15 23:12:34 beroal systemd[1]: Starting Disk Manager...
дек 15 23:12:34 beroal udisksd[2826]: udisks daemon version 2.10.1 starting
дек 15 23:12:34 beroal systemd[1]: Started Disk Manager.
дек 15 23:12:34 beroal udisksd[2826]: Acquired the name org.freedesktop.UDisks2 on the system message bus
дек 15 23:12:34 beroal chronyd[2364]: Source 144.24.146.96 replaced with 192.46.211.253 (2.in.pool.ntp.org)
дек 15 23:12:35 beroal systemd[2540]: Started Virtual filesystem service - disk device monitor.
дек 15 23:13:06 beroal kernel: audit_log_start: 184 callbacks suppressed
дек 15 23:13:06 beroal kernel: audit: audit_backlog=65 > audit_backlog_limit=64
дек 15 23:13:06 beroal kernel: audit: audit_lost=7195 audit_rate_limit=0 audit_backlog_limit=64
дек 15 23:13:06 beroal kernel: audit: backlog limit exceeded
дек 15 23:13:06 beroal kernel: audit: audit_backlog=65 > audit_backlog_limit=64
дек 15 23:13:06 beroal kernel: audit: audit_lost=7196 audit_rate_limit=0 audit_backlog_limit=64
дек 15 23:13:06 beroal kernel: audit: backlog limit exceeded
дек 15 23:13:06 beroal kernel: audit: audit_backlog=65 > audit_backlog_limit=64
дек 15 23:13:06 beroal kernel: audit: audit_lost=7197 audit_rate_limit=0 audit_backlog_limit=64
дек 15 23:13:06 beroal kernel: audit: backlog limit exceeded
дек 15 23:13:20 beroal kernel: amdgpu 0000:27:00.0: amdgpu: Dumping IP State
дек 15 23:14:07 beroal kernel: clocksource: timekeeping watchdog on CPU0: hpet retried 2 times before success
дек 15 23:14:19 beroal kernel: clocksource: timekeeping watchdog on CPU0: hpet retried 2 times before success
The following line looks suspicious:
beroal kernel: amdgpu 0000:27:00.0: amdgpu: Dumping IP State
I have an AMD Ryzen 3 2200G (Vega 8, Raven Ridge, Zen/GCN5 according to Wikipedia). I upgraded the kernel from version 6.11.9 to the current version on 2024-12-10, so I downgraded the kernel back to version 6.11.9. The freezes didn't go away. However, `journalctl` started showing something interesting:
дек 15 23:25:32 beroal systemd[2546]: Starting Virtual filesystem service - disk device monitor...
дек 15 23:25:32 beroal systemd[1]: Starting Disk Manager...
дек 15 23:25:32 beroal udisksd[2864]: udisks daemon version 2.10.1 starting
дек 15 23:25:33 beroal systemd[1]: Started Disk Manager.
дек 15 23:25:33 beroal udisksd[2864]: Acquired the name org.freedesktop.UDisks2 on the system message bus
дек 15 23:25:33 beroal systemd[2546]: Started Virtual filesystem service - disk device monitor.
дек 15 23:27:29 beroal kernel: amdgpu 0000:27:00.0: amdgpu: ring comp_1.1.0 timeout, signaled seq=9043, emitted seq=9046
дек 15 23:27:29 beroal kernel: amdgpu 0000:27:00.0: amdgpu: Process information: process Xwayland pid 2622 thread Xwayland:cs0 pid 2624
дек 15 23:27:29 beroal kernel: amdgpu 0000:27:00.0: amdgpu: GPU reset begin!
дек 15 23:27:31 beroal kernel: clocksource: timekeeping watchdog on CPU3: hpet wd-wd read-back delay of 90909249ns
дек 15 23:27:31 beroal kernel: clocksource: wd-tsc-wd read-back delay of 90908202ns, clock-skew test skipped!
дек 15 23:27:48 beroal kernel: [drm:amdgpu_dm_atomic_check [amdgpu]] *ERROR* [CRTC:73:crtc-0] hw_done or flip_done timed out
дек 15 23:27:49 beroal systemd[1]: Received SIGINT.
дек 15 23:27:49 beroal systemd[1]: Activating special unit System Reboot...
дек 15 23:27:49 beroal greetd[3014]: config: Config { file: ConfigFile { terminal: ConfigTerminal { vt: None, switch: false }, general: ConfigGeneral { source_profile: true, runfile: "/run/greetd.run", service: "greetd" }, default_session: ConfigSession { command: "", user: "", service: "" }, initial_session: None }, internal: ConfigInternal { session_worker: 11 } }
дек 15 23:32:26 beroal kernel: amdgpu 0000:27:00.0: amdgpu: ring gfx timeout, signaled seq=2245, emitted seq=2246
дек 15 23:32:26 beroal kernel: amdgpu 0000:27:00.0: amdgpu: Process information: process sway pid 2561 thread sway:cs0 pid 2577
дек 15 23:32:26 beroal kernel: amdgpu 0000:27:00.0: amdgpu: GPU reset begin!
I'm aware of several threads on this forum describing similar problems with AMD GPU. What should I do now?
[Update. There are no freezes with the LTS kernel version 6.6.65.]
Last edited by beroal (2025-01-26 11:33:44)
we are not condemned to write ugly code
Offline
[Update. There are no freezes with the LTS kernel version 6.6.65.]
Good to know. Guess we're gonna do a good ol' bisect.
First, try `linux-next`, if it works there try `linux-git` and report that here.
Then try these:
6.7.9
6.8.9
6.9.10
6.10.10
6.11.9
Then try all the minor versions in between the working and the broken.
You don't necessarily need to do it in this order.
Then we'll start bisecting unless someone else has any other idea.
https://wiki.archlinux.org/title/Downgrading_packages
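For trying the specific versions listed above, a package URL on the Arch Linux Archive can be put together roughly like this. It is a sketch under assumptions: `ala_url` is a hypothetical helper, it only handles x86_64 packages, and the `arch1-1` suffix in the example is an assumed package release that should be checked against the archive listing.

```shell
# Hypothetical helper: build an Arch Linux Archive URL for an x86_64 package.
# $1 = package name, $2 = full version including pkgrel (e.g. 6.11.9.arch1-1).
ala_url() {
    first=$(printf '%s' "$1" | cut -c1)   # the archive groups packages by first letter
    printf 'https://archive.archlinux.org/packages/%s/%s/%s-%s-x86_64.pkg.tar.zst\n' \
        "$first" "$1" "$1" "$2"
}

# Example (not executed here; needs root and network access):
#   sudo pacman -U "$(ala_url linux 6.11.9.arch1-1)"
```

If `linux-headers` is installed, it should probably be downgraded to the same version in the same `pacman -U` call.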
describing similar problems with AMD GPU
Do they have the same error? I haven't found any, so you must be referring to AMD GPUs having problems in general.
Last edited by jl2 (2024-12-16 07:28:21)
Why I run Arch? To "BTW I run Arch" the guy one grade younger.
And to let my siblings and cousins laugh at Arsch Linux...
Offline
https://github.com/swaywm/sway/issues/7139
use gimp git / 2.99 / 3
PS: this will only help with sway/gtk2 issues, which are known to cause freezes. If your GPU is actually crashing, that's still on the kernel / mesa / whatever.
Last edited by frostschutz (2024-12-16 09:03:25)
Offline
[Update. There are no freezes with the LTS kernel version 6.6.65.]
Unfortunately, I have just experienced a freeze on the LTS kernel, and not in GIMP.
дек 16 14:55:19 beroal kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_low timeout, signaled seq=558, emitted seq=561
дек 16 14:55:19 beroal kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process firefox pid 2786 thread firefox:cs0 pid 3179
дек 16 14:55:19 beroal kernel: amdgpu 0000:27:00.0: amdgpu: GPU reset begin!
дек 16 14:55:21 beroal kernel: clocksource: timekeeping watchdog on CPU0: hpet retried 2 times before success
дек 16 14:55:34 beroal systemd-logind[2331]: Power key pressed short.
дек 16 14:55:34 beroal systemd-logind[2331]: Powering off...
дек 16 14:55:34 beroal systemd-logind[2331]: System is powering down.
There are a lot of records after this.
we are not condemned to write ugly code
Offline
describing similar problems with AMD GPU
Do they have the same error? I haven't found any, so you must be referring to AMD GPUs having problems in general.
A lot of
amdgpu: ring $X timeout
and
amdgpu: GPU reset begin!
I didn't find logs ending in the following record though.
amdgpu: Dumping IP State
we are not condemned to write ugly code
Offline
Is this problem at least less likely on the LTS?
Can you find/report the issue on the AMD DRM bugtracker? https://gitlab.freedesktop.org/drm/amd/-/issues
Oh, and I almost forgot, what GPU you got?
Last edited by jl2 (2024-12-16 14:21:00)
Why I run Arch? To "BTW I run Arch" the guy one grade younger.
And to let my siblings and cousins laugh at Arsch Linux...
Offline
Oh, and I almost forgot, what GPU you got?
It's in the first post, AMD Ryzen 3 2200G.
https://github.com/swaywm/sway/issues/7139
use gimp git / 2.99 / 3
PS: this will only help with sway/gtk2 issues, which are known to cause freezes. If your GPU is actually crashing, that's still on the kernel / mesa / whatever.
Thank you for this piece of advice. `gimp-devel` version 3.0.0rc1 doesn't freeze Sway on the stock kernel. I'll mark it solved for now and unmark in case of new freezes. Although it's not a proper solution because no GUI program is allowed to freeze Sway.
we are not condemned to write ugly code
Offline
Although it's not a proper solution because no GUI program is allowed to freeze Sway.
WTF? It's going to be fixed in the next version, that counts as a fix.
Why I run Arch? To "BTW I run Arch" the guy one grade younger.
And to let my siblings and cousins laugh at Arsch Linux...
Offline
Although it's not a proper solution because no GUI program is allowed to freeze Sway.
WTF? It's going to be fixed in the next version, that counts as a fix.
I mean that this is a bug in Sway or the amdgpu driver, not in GIMP. A new version of GIMP isn't a proper solution.
we are not condemned to write ugly code
Offline
I have just experienced a freeze on the stock kernel. GIMP wasn't running. Thunderbird, Firefox, and Midnight Commander were running. I see
дек 17 14:25:37 beroal kernel: amdgpu 0000:27:00.0: amdgpu: Dumping IP State
in the log. If I shut down the computer properly, there is no such record in the log.
Last edited by beroal (2024-12-17 12:54:03)
we are not condemned to write ugly code
Offline
https://github.com/swaywm/sway/issues/8458 looks like your issue
edit:
It's different: sway does not crash, it freezes for a little bit, then reappears without the wallpaper. That's an issue I have too.
Open a new issue and link to that one as related.
Last edited by jl2 (2024-12-17 13:12:26)
Why I run Arch? To "BTW I run Arch" the guy one grade younger.
And to let my siblings and cousins laugh at Arsch Linux...
Offline
https://bbs.archlinux.org/viewtopic.php … 9#p2214959
There might be a mesa/linux/amdgpu clusterfuck.
See whether downgrading to linux 6.11.x (the bug might have been backported in recent LTS kernels) and mesa [Edit: https://bbs.archlinux.org/viewtopic.php … 3#p2214943 24.2.7 ] stabilizes things.
If yes, see whether you can break it with either a kernel or mesa update.
Last edited by seth (2024-12-17 16:13:36)
Offline
I set up SSH. Now I can view the log via SSH.
When I do the GIMP freeze, the record
amdgpu: Dumping IP State
doesn't appear just after a freeze. It appears later. Probably, it doesn't say anything about the problem.
When I try to shut down the computer via SSH after a freeze, the computer shuts down the network but doesn't turn itself off, and the image on the screen doesn't change.
The above pertains only to GIMP freezes.
we are not condemned to write ugly code
Offline
"amdgpu: Dumping IP State" is probably a result of the freeze induced (and then failing) reset.
If it's not mesa nor the kernel, see whether https://bbs.archlinux.org/viewtopic.php … 6#p2212906 can help you out here.
Though that wouldn't fit "Menu → File → Open"?
Is this a hybrid system?
=> https://bbs.archlinux.org/viewtopic.php … 8#p2213338 ?
Offline
Is this a hybrid system?
No, just one videocard.
we are not condemned to write ugly code
Offline
Downgrading the kernel to version 6.11.9 made freezes much less frequent. Nevertheless, I have experienced a freeze when moving the mouse over Firefox, with the following log records:
дек 26 18:45:40 beroal kernel: amdgpu 0000:27:00.0: amdgpu: ring gfx timeout, signaled seq=420866, emitted seq=420869
дек 26 18:45:40 beroal kernel: amdgpu 0000:27:00.0: amdgpu: Process information: process firefox pid 2735 thread firefox:cs0 pid 2895
дек 26 18:45:40 beroal kernel: amdgpu 0000:27:00.0: amdgpu: GPU reset begin!
No
amdgpu: Dumping IP State
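Since the timeouts in this thread name a different process each time (Xwayland, sway, firefox), it may help to tally them across boots. A minimal sketch, assuming the "Process information" record keeps the format seen in the logs above; `hung_processes` is a hypothetical helper name.

```shell
# Hypothetical helper: count which processes amdgpu blamed for ring timeouts.
# Reads journal text on stdin and prints "count name", most frequent first.
hung_processes() {
    sed -n 's/.*Process information: process \([^ ]*\) pid .*/\1/p' \
        | sort | uniq -c | sort -rn
}

# Usage (not executed here):
#   journalctl -k --no-pager | hung_processes
```

If one client dominates the tally, that points at a userspace trigger; an even spread would fit a driver or mesa problem better.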
Last edited by beroal (2024-12-26 17:02:57)
we are not condemned to write ugly code
Offline
My dmesg looked exactly the same with the freezes. Downgrading mesa to version 1:24.2.7 was the only way to fix it for me. I tried a few different kernels, and it didn't make a difference.
Offline
Hello, I just want to report that I have been having the same problem.
I have random system freezes when Google Chrome, Firefox and/or Obsidian is open.
I don't think it's sway because I don't use it; I'm still on i3.
The system freezes but the audio keeps going for 1 minute or so before stopping.
I have an AMD Ryzen 3400G.
Offline
Hello, I just want to report that I have been having the same problem.
I have random system freezes when Google Chrome, Firefox and/or Obsidian is open.
I don't think it's sway because I don't use it; I'm still on i3.
The system freezes but the audio keeps going for 1 minute or so before stopping.
I have an AMD Ryzen 3400G.
Check `journalctl -b -1` before posting here, please.
we are not condemned to write ugly code
Offline
Check `journalctl -b -1` before posting here, please.
I haven't had any system freezes since I downgraded mesa to version 1:24.2.7, as @Commodore-Freak said.
If I get another freeze I'll post everything.
These are the kinds of errors I had been getting since I came back to Linux (2 weeks ago); they followed a system freeze:
Dec 15 19:47:53 kem kernel: [drm:amdgpu_dm_atomic_check [amdgpu]] *ERROR* [CRTC:73:crtc-0] hw_done or flip_done timed out
Dec 15 19:48:05 kem kernel: [drm:amdgpu_dm_atomic_check [amdgpu]] *ERROR* [CRTC:77:crtc-1] hw_done or flip_done timed out
I also have this, but I don't think it's the one causing the freeze:
Dec 26 22:55:26 kem kernel: amdgpu 0000:26:00.0: amdgpu: Secure display: Generic Failure.
Dec 26 22:55:26 kem kernel: amdgpu 0000:26:00.0: amdgpu: SECUREDISPLAY: query securedisplay TA failed. ret 0x0
I also had this every time:
Dec 26 22:54:51 kem kernel: amdgpu 0000:26:00.0: amdgpu: Dumping IP State
But since I downgraded mesa I don't see that message anymore.
And I saw this just once, so I don't think it causes the freezes:
Dec 16 12:52:03 kernel: amdgpu 0000:26:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
Dec 16 12:52:17 kernel: amdgpu 0000:26:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
Last edited by yodmk (2024-12-27 13:52:29)
Offline
Dec 15 19:47:53 kem kernel: [drm:amdgpu_dm_atomic_check [amdgpu]] *ERROR* [CRTC:73:crtc-0] hw_done or flip_done timed out
Dec 15 19:48:05 kem kernel: [drm:amdgpu_dm_atomic_check [amdgpu]] *ERROR* [CRTC:77:crtc-1] hw_done or flip_done timed out
Do you have a multiscreen setup?
Do you get the same when only using one monitor?
Offline
Dec 15 19:47:53 kem kernel: [drm:amdgpu_dm_atomic_check [amdgpu]] *ERROR* [CRTC:73:crtc-0] hw_done or flip_done timed out
Dec 15 19:48:05 kem kernel: [drm:amdgpu_dm_atomic_check [amdgpu]] *ERROR* [CRTC:77:crtc-1] hw_done or flip_done timed out
Do you have a multiscreen setup?
Do you get the same when only using one monitor?
I have 2 monitors. I didn't try it, but since I downgraded mesa I haven't had any of those errors anymore, except the SECUREDISPLAY stuff. I looked that up online and it shouldn't be a problem.
So I'm OK now, no more system freezes / errors.
Offline
You'll eventually have to update mesa again; if the freezes return, test the behavior with only one monitor.
Offline
The wlgreet greeter for greetd stopped working with the old version of mesa after updating on 2025-01-24, so I'm back to testing
linux 6.12.10.arch1-1
linux-headers 6.12.10.arch1-1
mesa 1:24.3.4-1
Strangely, that update updated none of these:
linux
linux-headers
mesa
pam
greetd
greetd-wlgreet
sway
we are not condemned to write ugly code
Offline
And what did it update?
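One way to answer that is to look at the pacman log for the day in question. A minimal sketch, assuming the usual `[timestamp] [ALPM] upgraded name (old -> new)` record layout; `upgraded_on` is a hypothetical helper name.

```shell
# Hypothetical helper: list packages upgraded on a given date (YYYY-MM-DD).
# Reads pacman log text on stdin and prints one package name per line.
upgraded_on() {
    grep -F "[$1" | grep -F '[ALPM] upgraded' | awk '{print $4}'
}

# Usage (not executed here):
#   upgraded_on 2025-01-24 < /var/log/pacman.log
```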
Offline