I don't know if it could help, but I've searched in the journal and this is what I've found:
Freezes started on December 11, and the following kernel, firmware and mesa versions have led to freezes:
Linux kernel: 6.12.8 to 6.12.10, and 6.6.63-1-lts to 6.6.69-1-lts (last lts tested version).
Linux firmware: 20241210.b00a7f7e-1 to 20250109.7673dffd-1.
Mesa: 24.3.1 to 24.3.3
Right now I'm using kernel 6.12.10, firmware 20250109.7673dffd-1 and mesa 25.0.0_devel.200085.94da1edbe49-1 (#131), and I've had no freezes yet.
Offline
@pacoandres - also @kclisp
Your test is helpful from my point of view, as your system can produce the error quite quickly (as also with @LnxFCA).
You are currently testing mesa-test-git 25.0.0 with patch. However, you should bear in mind that in this test-git the error that is occurring here may not even be present.
Therefore, if no error occurs, you should also test mesa-test-git 25.0.0 without this patch.
I am currently doing this for the third day and so far there is no crash, which I usually have during this time. It's still a bit early, but this could lead to the hypothesis that the error I'm experiencing might not be included in mesa 25.0.0 at this time.
Later I will test mesa 24.3.4 after release. All versions of 24.3.x cause a crash for me so far, even with the latest packages.
Offline
I've been testing mesa-test-git 25.0.0 with and without patch, working with eclipse and these are the results:
With the unpatched version it takes 26 minutes to freeze, with this log:
ene 20 15:55:52 monelle kernel: amdgpu 0000:09:00.0: amdgpu: Dumping IP State
ene 20 15:55:53 monelle kwin_wayland[1088]: kwin_wayland_drm: Pageflip timed out! This is a kernel bug
ene 20 15:55:58 monelle kwin_wayland[1088]: kwin_wayland_drm: Pageflip timed out! This is a kernel bug
ene 20 15:55:58 monelle kernel: clocksource: Long readout interval, skipping watchdog check: cs_nsec: 1071525529 wd_nsec: 1071525012
ene 20 15:56:01 monelle kernel: sysrq: Keyboard mode set to system default
Fortunately REISUB works this time.
With the patched version I've been working for more than 4 hours with no freeze. I shut down for a meal and to test the unpatched version, and now I'm using the patched one again.
If the system doesn't freeze this evening I'll post it later.
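A minimal sketch for pulling the same signatures out of a previous boot's journal. The function name `freeze_lines` is made up for this example, and the two patterns are only the strings visible in the log above; other systems may log different messages.

```shell
#!/bin/sh
# Hypothetical helper: keep only the journal lines that match the freeze
# signatures quoted in the log above. Reads journal text on stdin.
freeze_lines() {
    grep -E 'amdgpu: Dumping IP State|Pageflip timed out'
}

# Usage (assumes systemd's journalctl; -b -1 selects the previous boot):
#   journalctl -b -1 --no-pager | freeze_lines
```

An empty result does not prove the freeze had another cause; as reported later in the thread, a lockup can leave nothing of interest in the journal.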
Offline
With the unpatched version it takes 26 minutes to freeze, ...
With the patched version I've been working more than 4 hours with no freeze. Shutdown for eating and testing the unpatched, and now I'm using it again.
If I understand everything correctly, then there is a high probability that Marek Olšák's patch can prevent, or at least reduce, what is supposedly the most frequent (or at least a *not uncommon*) cause of crashes/freezes.
Why mesa-test-git 25.0.0 without the patch *doesn't* crash for me, and didn't for @bernd_b in the past, is another question.
If your result is confirmed, then @kclisp should perhaps also note this on mesa ticket #12310, as it can't be confirmed by his i3 test alone.
From my point of view, this is progress.
Thanks @pacoandres
Offline
Thanks everyone for the continued testing! It's promising that the patch seems to work for other users as well. I'll note this upstream, and hopefully it can be in a new release soon.
As for myself, I haven't seen a freeze with the patched mesa over the last couple of days.
@orbit-oc
It's interesting though that @bernd_b did get a crash with his own build.
@orbit-oc @LnxFCA
For completeness, what are your APU models (lscpu)?
Offline
Here is the relevant output of lscpu on my system:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 43 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 8
On-line CPU(s) list: 0-7
Vendor ID: AuthenticAMD
Model name: AMD Ryzen 5 2500U with Radeon Vega Mobile Gfx
CPU family: 23
Model: 17
Thread(s) per core: 2
Core(s) per socket: 4
Socket(s): 1
Stepping: 0
Frequency boost: enabled
CPU(s) scaling MHz: 81%
CPU max MHz: 2000.0000
CPU min MHz: 1600.0000
BogoMIPS: 3992.22
Full output: https://privatebin.at/?3f8bc089a336c052 … WaFpDMz3CZ
Note: It's been more than half a day without freezes with patched mesa v24.3.3. Without the patch it crashes within 1 to 2 hours.
Last edited by LnxFCA (2025-01-20 19:11:28)
Offline
After four hours of intensive use (Eclipse, multiple browsers with youtube and documentation sites) no freezes.
A summary of what's working:
Kernel 6.12.10
Mesa 25.0.0 with the patch (I think it's this https://gitlab.freedesktop.org/mesa/mes … e_2728386)
APU Ryzen 3400G
KDE Plasma 6.2.5
Offline
Going to try to build the git package with the patch applied, as this is really affecting my wife's computer. I'll report back after the computer has run for a while to see if it crashes.
Last edited by galvez_65 (2025-01-20 21:27:41)
Offline
It's interesting though that @bernd_b did get a crash with his own build.
The matter is not at all clear. At first everything was stable for days and then suddenly not at all, which could indicate an external change. With the information available, this cannot be used for further analysis.
For completeness, what are your APU models (lscpu)?
APU: AMD Ryzen 3 3200G (Picasso | zen+ | Ryzen 3000 series)
Family: 0x17
Model: 0x18
Stepping: 0x01
Integrated GPU: AMD Radeon Vega 8 Graphics (Raven/Raven2)
documented: https://www.techpowerup.com/cpu-specs/ryzen-3-3200g.c2205
For categorisation:
Affected are (all) AMD APUs of the Ryzen 2000 (Raven Ridge) and Ryzen 3000 (Picasso) series. The integrated GPU is called Raven or gfx9.
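As a rough sketch, the family/model numbers from the two lscpu reports in this thread can be used to check whether a machine falls into the same category. The function name `apu_class` is hypothetical, and the model list is an assumption based only on those two reports (family 23 with model 17, and family 0x17 with model 0x18); other affected models may exist.

```shell
#!/bin/sh
# Hypothetical helper: read `lscpu` output on stdin and report whether the
# CPU family/model matches the two APUs posted in this thread.
# Assumption: family 23 (0x17) with model 17 (0x11) or 24 (0x18).
apu_class() {
    awk -F': *' '
        {
            key = $1
            sub(/^[ \t]+/, "", key)   # newer lscpu versions indent fields
        }
        key == "CPU family" { family = $2 + 0 }
        key == "Model"      { model = $2 + 0 }
        END {
            if (family == 23 && model == 17)      print "raven-ridge"
            else if (family == 23 && model == 24) print "picasso"
            else                                   print "other"
        }'
}

# Usage: lscpu | apu_class
```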
I no longer see any reason to run my test with mesa-test-git 25.0.0 without a patch for very long. I will change at the latest when mesa 24.3.4 is released.
But it would be good if Paco (@pacoandres) could test the patch for a few more days.
Like me, @pacoandres has a Picasso. He reacts to mesa-test-git 25.0.0 without a patch, I don't. This may not be the end of the story here, but we are one step further.
Offline
Running mesa-test-git 25.0.0_devel.200085.94da1edbe49-1 for around 24hr.
Cannot get it to crash using GPU loading that has crashed other versions.
ie: running a few 1080p or 4K youtube videos while using gimp, rapidly switch between windows, etc.
Running: `sudo cat /sys/kernel/debug/dri/1/amdgpu_gpu_recover` in xfce session recovered normally.
After logging out of the xfce session to the console (no display manager used), running `sudo cat /sys/kernel/debug/dri/1/amdgpu_gpu_recover` results in a black screen; the keyboard seems unresponsive and I can't switch ttys.
This system is set up for [Alt]+[SysRq] R E I S U B, so I ran it and waited about a minute before a hard reboot.
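Whether the full REISUB sequence is honoured depends on the kernel.sysrq setting. A minimal sketch of a check, assuming the bitmask values from the kernel's sysrq documentation (4 keyboard control, 16 sync, 32 remount read-only, 64 process signalling, 128 reboot); the function name `reisub_allowed` is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given kernel.sysrq value permits every
# key in the R-E-I-S-U-B sequence.
# 1 enables all functions; otherwise the value is a bitmask and we need
# 4 (R) + 64 (E, I) + 16 (S) + 32 (U) + 128 (B) = 244.
reisub_allowed() {
    v=$1
    [ "$v" -eq 1 ] || [ $(( v & 244 )) -eq 244 ]
}

# Usage:
#   reisub_allowed "$(cat /proc/sys/kernel/sysrq)" && echo "REISUB enabled"
```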
Some useful looking data from journal: http://0x0.st/8HW4.txt
Package versions:
$ pacman -Q mesa-test-git llvm-libs linux amd-ucode linux-firmware
mesa-test-git 25.0.0_devel.200085.94da1edbe49-1
llvm-libs 19.1.7-1
linux 6.12.10.arch1-1
amd-ucode 20250109.7673dffd-1
linux-firmware 20250109.7673dffd-1
Possibly useful info?
$ glxinfo -B
name of display: :0.0
display: :0 screen: 0
direct rendering: Yes
Extended renderer info (GLX_MESA_query_renderer):
Vendor: AMD (0x1002)
Device: AMD Radeon Vega 11 Graphics (radeonsi, raven, ACO, DRM 3.59, 6.12.10-arch1-1) (0x15dd)
Version: 25.0.0
Accelerated: yes
Video memory: 1024MB
Unified memory: no
Preferred profile: core (0x1)
Max core profile version: 4.6
Max compat profile version: 4.6
Max GLES1 profile version: 1.1
Max GLES[23] profile version: 3.2
Memory info (GL_ATI_meminfo):
VBO free memory - total: 774 MB, largest block: 774 MB
VBO free aux. memory - total: 7410 MB, largest block: 7410 MB
Texture free memory - total: 774 MB, largest block: 774 MB
Texture free aux. memory - total: 7410 MB, largest block: 7410 MB
Renderbuffer free memory - total: 774 MB, largest block: 774 MB
Renderbuffer free aux. memory - total: 7410 MB, largest block: 7410 MB
Memory info (GL_NVX_gpu_memory_info):
Dedicated video memory: 1024 MB
Total available memory: 8482 MB
Currently available dedicated video memory: 774 MB
OpenGL vendor string: AMD
OpenGL renderer string: AMD Radeon Vega 11 Graphics (radeonsi, raven, ACO, DRM 3.59, 6.12.10-arch1-1)
OpenGL core profile version string: 4.6 (Core Profile) Mesa 25.0.0-devel (git-94da1edbe4)
OpenGL core profile shading language version string: 4.60
OpenGL core profile context flags: (none)
OpenGL core profile profile mask: core profile
OpenGL version string: 4.6 (Compatibility Profile) Mesa 25.0.0-devel (git-94da1edbe4)
OpenGL shading language version string: 4.60
OpenGL context flags: (none)
OpenGL profile mask: compatibility profile
OpenGL ES profile version string: OpenGL ES 3.2 Mesa 25.0.0-devel (git-94da1edbe4)
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 3.20
EDIT
It may be worth mentioning that during testing of various mesa versions, `sudo cat /sys/kernel/debug/dri/1/amdgpu_gpu_recover` has given mixed results, both from the X session and from a tty1 console with X not running, and that setting:
MESA_SHADER_CACHE_DISABLE=true
has changed the test results compared to running without it.
Are the mesa cache directories, `~/.cache/mesa_shader_cache/` and `~/.cache/mesa_shader_cache_db/` not used unless set to do so? Seems mine are not being used with default settings.
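One way to see whether either directory is actually being written to is to compare its size before and after a session. A minimal sketch; the function name `cache_usage` is hypothetical, and the paths in the usage line are simply the two named above.

```shell
#!/bin/sh
# Hypothetical helper: for each directory given, print its size in KiB
# followed by the path, or "missing <path>" if it does not exist.
cache_usage() {
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            printf '%s %s\n' "$(du -sk "$dir" | cut -f1)" "$dir"
        else
            printf 'missing %s\n' "$dir"
        fi
    done
}

# Usage, with the two paths named in the post above:
#   cache_usage ~/.cache/mesa_shader_cache ~/.cache/mesa_shader_cache_db
```

If the numbers do not change across a session that compiles shaders, that would suggest the directory is not in use, though it does not say why.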
Last edited by NuSkool (2025-01-20 19:55:52)
Offline
...does anyone have a patched package they are willing to share? Thanks in advance
see #131 or #197
Offline
@NuSkool
Leave the gpu recovery call aside. In my case, this only led to undesirable results.
See also what Pierre-Eric Pelloux-Prayer wrote about this.
...the GPU recovery mechanism ... is caused by the kernel and/or firmware...
Offline
Yes. System update performed 5 minutes ago.
...mesa 24.3.2, Xfce 4.20, Linux LTS 6.6.67...
Release notes of mesa with changelog can be found here:
https://docs.mesa3d.org/relnotes/24.3.2.html
edit 19:10 utc+1
System crashes like before. Uptime 22 minutes.
Downgrade to mesa 24.2.7
I thought so, but I had to try it. ;-)
As I said, this will take a little longer. Someone will have to write a ticket, otherwise we'll stay with *this* bug...
downgrading to mesa 24.2.7 gives me a black screen with just a cursor
Offline
Because llvm got updated, it's not safe to downgrade mesa in isolation, and downgrading llvm along with it will bring a bunch of issues with associated tools. You either do a proper rebuild against the new llvm or find the issue.
Offline
Looks like mesa-test-git-25.0.0_devel.200085.94da1edbe49-1-x86_64.pkg.tar.zst fixed the problem
AMD Ryzen 3 3200G
APU Vega 8
Offline
After posting earlier today ( https://bbs.archlinux.org/viewtopic.php … 7#p2221537 ), I switched from Lone Wolf's patched mesa to the unpatched one, i.e.:
[2025-01-20T12:16:00-0800] [ALPM] upgraded mesa-test-git (25.0.0_devel.200085.94da1edbe49-1 -> 25.0.0_devel.200085.94da1edbe49-2)
Initially, I tried to produce a crash per my earlier post description, but gave up without success.
Just now, while watching a youtube video, a crash (lockup) occurred.
Unfortunately there is nothing of interest in the journal this time.
For the time being on this system, looks like Lone Wolf's patched version of mesa...
mesa-test-git 25.0.0_devel.200085.94da1edbe49-1
works well over the 24hr test period + trying to induce a crash, compared to the unpatched version.
Here's a few packages/versions currently running on this system. Last system update was yesterday.
$ pacman -Q mesa-test-git llvm-libs linux amd-ucode linux-firmware
mesa-test-git 25.0.0_devel.200085.94da1edbe49-2
llvm-libs 19.1.7-1
linux 6.12.10.arch1-1
amd-ucode 20250109.7673dffd-1
linux-firmware 20250109.7673dffd-1
Switching back to the patched version of mesa and I'll report anything unusual.
Just ask if you'd like any additional info...
Last edited by NuSkool (2025-01-21 01:45:42)
Offline
For anyone without heavy OpenGL / mesa applications looking for a simple stability workaround until these mesa/radeonsi driver bugs are resolved, there is hope with the following environment variable:
LIBGL_ALWAYS_SOFTWARE=1
This makes mesa use the CPU fallback llvmpipe driver instead of the buggy, crash-prone radeonsi driver. I am on AMD Ryzen 3 2200G Vega 8 APU hardware, and was previously downgrading the mesa + llvm-libs packages (pacman --ignore mesa,llvm-libs) to avoid the chronic kworker amdgpu-reset-dev CPU-spike hard system crash.
Now I am stable with a home theater PC (no heavy OpenGL/3D gaming needs), so this llvmpipe fallback software mesa driver should work fine for my workload. I just want to watch videos / YouTube like a noob without any hard system crash. I already had CPU video decoding, I never got hardware GPU-video-decoding working with Vulkan/VAAPI chromium anyhow, and the CPU load is still reasonably low for simple video watching with this llvmpipe mesa driver loaded.
To load the CPU llvmpipe mesa driver instead, you can export the variable in your display manager config files at boot for all X11/Wayland/Desktop environment processes. I got this loaded at boot with my sddm display manager by creating a ~/.xsessionrc file to set the mesa environment variable for all desktop processes:
export LIBGL_ALWAYS_SOFTWARE=1
It's been working well, and I verified the llvmpipe driver is loaded with my desktop, instead of the buggy radeonsi mesa driver, by looking at the eglinfo -B output from the mesa-utils package within my desktop terminal using X11:
$ eglinfo -B
X11 platform:
EGL API version: 1.5
EGL vendor string: Mesa Project
EGL version string: 1.5
EGL client APIs: OpenGL OpenGL_ES
OpenGL core profile vendor: Mesa
OpenGL core profile renderer: llvmpipe (LLVM 19.1.6, 256 bits)
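The same check can be scripted. A minimal sketch, assuming the output format shown above; the function name `uses_llvmpipe` is made up for this example. Note that `eglinfo -B` may print several platforms, and this succeeds if any core profile renderer line names llvmpipe.

```shell
#!/bin/sh
# Hypothetical helper: read `eglinfo -B` output on stdin and succeed only
# if a core profile renderer line names llvmpipe.
uses_llvmpipe() {
    grep -E '^[[:space:]]*OpenGL core profile renderer:' | grep -q 'llvmpipe'
}

# Usage (eglinfo comes from the mesa-utils package mentioned above):
#   eglinfo -B | uses_llvmpipe && echo "software rendering active"
```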
Hope this helps any of you out there who are waiting for the dust to settle on the mesa bugs present since version 1:24.3.0-1, and who do not want to maintain patched AUR git packages every time mesa's many dependent packages are updated with pacman (mesa is a mess of dependencies, like the old Microsoft Windows DLL-hell days). Good luck! In rolling-release ArchLinux we Vega APU users trust, for the mesa radeonsi driver to get fixed soon!
The src/gallium/drivers/radeonsi/si_pipe.c patch by Marek, thanks to the debugging work done by LoneWolf, kclisp, and others looks promising from reading Mesa issue 12310. We appreciate all the posting/reproduction/patch compiling, keep up the great work.
Offline
I withdraw the enquiry but retain the statement:
I get no crash/freeze with mesa-test-git 25.0.0 without patch. So without a crash I don't need a patch and it doesn't help then.
To find out if my crashes/freezes in mesa 24.3.x are caused by a *different* bug, you would have to build a version that includes the *patch*.
If the thesis is correct, the crashes will continue to occur despite the patch. If it is the *same* bug, then there should be no more crashes/freezes.
Last edited by orbit-oc (2025-01-21 19:09:18)
Offline
@maxrd2 : this thread is about the vega gfx cards, not about the Polaris family the RX 580 belongs to.
You probably have another issue, I will split off your post to a new thread.
@requited : thanks for mentioning another workaround. Keep in mind llvmpipe typically increases the power draw of the system by a lot.
Last edited by Lone_Wolf (2025-01-21 10:44:17)
Offline
Just wanted to report back that I applied @Lone_Wolf's patch to mesa-git and so far no crash. I have not tried mesa-test-git as mentioned above. Thank you all for helping to sort this out. Hopefully the fix will be applied to the main branch soon, or mesa-test-git finally fixes the issue, and it makes its way into the supported packages.
Offline
Because llvm got updated, it's not safe to downgrade mesa in isolation, and downgrading llvm along with it will bring a bunch of issues with associated tools. You either do a proper rebuild against the new llvm or find the issue.
Thanks for the confirmation, easier to build against the new llvm than to figure out the regression issues.
Offline
I am not on Arch, but I'm facing the same issue on Fedora (2200G, Gigabyte motherboard) since December.
So far I can trigger it very quickly, in under a minute, with flatpak gimp (current flathub version). Sometimes it triggers immediately, just by opening flatpak gimp and waiting a few seconds. Sometimes it happens when opening multiple files from nautilus in flatpak gimp.
Can anyone confirm it with their flatpak gimp as well?
Offline
I can confirm that the system crashes every time I try to open an image in flatpak GIMP.
(Also not on Arch)
AMD 2400G APU
kernel 6.12.10
mesa 24.3.3
Last edited by userxyz (2025-01-21 20:22:28)
Offline
Another working day without freezes with @Lone_Wolf's patched mesa version (25.0.0_devel.200085.94da1edbe49-1)
Running Android Studio, Eclipse, browsers, okular, emacs and some other applications simultaneously with no GPU problem.
Thank you very much.
Last edited by pacoandres (2025-01-21 19:04:25)
Offline
Hello, I'm suffering from the same problem and am trying to get the patched mesa version working, but it conflicts with a lot of packages. I'm not sure if I should remove mesa and all the packages that depend on it.
Here's the output I get when trying to install the patched mesa:
:: mesa-test-git-25.0.0_devel.200085.94da1edbe49-1 and mesa-1:24.3.3-3 are in conflict. Remove mesa? [y/N] y
:: mesa-test-git-25.0.0_devel.200085.94da1edbe49-1 and vulkan-radeon-1:24.3.3-3 are in conflict. Remove vulkan-radeon? [y/N] y
:: mesa-test-git-25.0.0_devel.200085.94da1edbe49-1 and opencl-rusticl-mesa-1:24.3.3-3 are in conflict. Remove opencl-rusticl-mesa? [y/N] y
error: failed to prepare transaction (could not satisfy dependencies)
:: removing opencl-rusticl-mesa breaks dependency 'opencl-rusticl-mesa' required by lib32-opencl-rusticl-mesa
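A small sketch for listing which installed packages block such a transaction, assuming the "required by" wording shown in the output above. The function name `blocking_packages` and the file name in the usage line are hypothetical. Whether the listed packages should be removed or replaced is a separate decision that this does not make.

```shell
#!/bin/sh
# Hypothetical helper: read saved pacman output on stdin and print each
# package named after "required by", once.
blocking_packages() {
    sed -n 's/.* required by \(.*\)$/\1/p' | sort -u
}

# Usage, with the pacman output saved to a file first:
#   blocking_packages < pacman-output.txt
```

For the output above this prints lib32-opencl-rusticl-mesa, the one package the error names as depending on opencl-rusticl-mesa.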
Offline