#301 2025-01-27 19:21:54

NuSkool
Member
Registered: 2015-03-23
Posts: 195

Re: Issues with Mesa 24.3.x and amdgpu Vega graphics

Thanks for the thorough update on the status @Mechanicus!

I'd be game to test the kernel if those commits are present in the AUR linux-git package (https://aur.archlinux.org/packages/linux-git).
Looks like it'll build current mainline.

Both your links say committed 2 weeks ago, and https://www.kernel.org/ says 'mainline: 6.13 2025-01-19', so it seems AUR linux-git should build with those commits?

I'd give the default PKGBUILD build a shot in a clean chroot, but this will take a long while on my hardware.

Any suggestions before I get started building a kernel?
If/when I get through building the kernel, any suggestions on mesa related packages or parameters to start with?

Last edited by NuSkool (2025-01-27 19:26:00)

#302 2025-01-27 20:46:16

kclisp
Member
Registered: 2025-01-04
Posts: 33

Re: Issues with Mesa 24.3.x and amdgpu Vega graphics

Update: there is now a proposed MR in mesa to fix the bisected bug! https://gitlab.freedesktop.org/mesa/mes … ests/33248

@Lone_Wolf

Maybe we should test it?

#303 2025-01-27 20:54:46

Mechanicus
Member
Registered: 2025-01-13
Posts: 48

Re: Issues with Mesa 24.3.x and amdgpu Vega graphics

NuSkool wrote:

Thanks for the thorough update on the status @Mechanicus!

I'd be game to test the kernel if those commits are present in the AUR linux-git package (https://aur.archlinux.org/packages/linux-git).
Looks like it'll build current mainline.

Both your links say committed 2 weeks ago, and https://www.kernel.org/ says 'mainline: 6.13 2025-01-19', so it seems AUR linux-git should build with those commits?

I'd give the default PKGBUILD build a shot in a clean chroot, but this will take a long while on my hardware.

Any suggestions before I get started building a kernel?
If/when I get through building the kernel, any suggestions on mesa related packages or parameters to start with?

I guess you need https://aur.archlinux.org/packages/linux-mainline, or you can just install linux 6.13 from the Arch Linux testing repo (https://archlinux.org/packages/core-tes … _64/linux/). With 6.13, no extra kernel options are needed to check the expectations.

Last edited by Mechanicus (2025-01-27 20:56:48)

#304 2025-01-27 21:04:00

nek0panchi
Member
Registered: 2020-08-07
Posts: 12

Re: Issues with Mesa 24.3.x and amdgpu Vega graphics

@kclisp
If I didn't misread the commit, it looks slightly different from the patch. Hopefully it actually fixes the bug, because I still had one freeze with the patch.

#305 2025-01-27 21:40:38

Lone_Wolf
Administrator
From: Netherlands, Europe
Registered: 2005-10-04
Posts: 13,225

Re: Issues with Mesa 24.3.x and amdgpu Vega graphics

kclisp wrote:

Update: there is now a proposed MR in mesa to fix the bisected bug! https://gitlab.freedesktop.org/mesa/mes … ests/33248

@Lone_Wolf

Maybe we should test it?

Yup.

New binary uploaded: mesa trunk 84b660b9229 plus mesa MR 33248

direct download link


Disliking systemd intensely, but not satisfied with alternatives so focusing on taming systemd.

clean chroot building not flexible enough ?
Try clean chroot manager by graysky

#306 2025-01-27 22:17:21

kclisp
Member
Registered: 2025-01-04
Posts: 33

Re: Issues with Mesa 24.3.x and amdgpu Vega graphics

@Lone_Wolf

Thanks! The build seems stable with regard to my reproducer.

#307 2025-01-28 12:38:41

Mechanicus
Member
Registered: 2025-01-13
Posts: 48

Re: Issues with Mesa 24.3.x and amdgpu Vega graphics

Linux 6.13 from core-testing: after 2.5 hours with multiple browser windows open (2 with WebGL samples and 2 with YouTube hardware-accelerated playback), the system froze when switching between windows.
Test with amdgpu.enforce_isolation=1: the freeze when switching between multiple browser windows was reproduced, but the error messages are now different:

[  681.153691] amdgpu 0000:07:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
[  698.013230] amdgpu 0000:07:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
[  714.563699] amdgpu 0000:07:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
[  714.600222] amdgpu 0000:07:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
[  721.573168] amdgpu 0000:07:00.0: amdgpu: Dumping IP State
[  731.499225] iwlwifi 0000:04:00.0: Unhandled alg: 0x703
[  747.269661] amdgpu 0000:07:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
[  747.753099] amdgpu 0000:07:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706

Test with amdgpu.cwsr_enable=0: no freezes so far, regardless of system load.
Important: with amdgpu.cwsr_enable=0 the WebGL Aquarium FPS in Google Chrome increased from 23 to 30.
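
For anyone comparing logs between runs, here is a minimal Python sketch that tallies the "failed to write reg ... wait reg ..." lines per register pair. The log text is copied from the excerpt above. The function name and the regular expression are only illustrative, not part of any amdgpu tooling, and the pattern assumes the message format stays exactly as shown here.

```python
import re
from collections import Counter

# Log lines copied from the dmesg excerpt in this post.
DMESG = """\
[  681.153691] amdgpu 0000:07:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
[  698.013230] amdgpu 0000:07:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
[  714.563699] amdgpu 0000:07:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
[  714.600222] amdgpu 0000:07:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
[  721.573168] amdgpu 0000:07:00.0: amdgpu: Dumping IP State
[  731.499225] iwlwifi 0000:04:00.0: Unhandled alg: 0x703
[  747.269661] amdgpu 0000:07:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
[  747.753099] amdgpu 0000:07:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
"""

# Assumed message format: "failed to write reg <hex> wait reg <hex>".
PATTERN = re.compile(r"amdgpu: failed to write reg ([0-9a-f]+) wait reg ([0-9a-f]+)")


def tally_register_timeouts(log_text):
    """Count how often each (write reg, wait reg) pair appears in the log."""
    counts = Counter()
    for line in log_text.splitlines():
        match = PATTERN.search(line)
        if match:
            counts[match.groups()] += 1
    return counts


if __name__ == "__main__":
    for (write_reg, wait_reg), n in tally_register_timeouts(DMESG).most_common():
        print(f"write {write_reg} / wait {wait_reg}: {n}")
```

To use it on a real log, pass in the saved output of dmesg instead of the embedded excerpt.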

Last edited by Mechanicus (2025-01-28 14:32:16)

#308 2025-01-28 13:42:08

lpr1
Member
Registered: 2017-10-08
Posts: 93

Re: Issues with Mesa 24.3.x and amdgpu Vega graphics

Mechanicus wrote:

Linux 6.13 from core-testing: after 2.5 hours with multiple browser windows open (2 with WebGL samples and 2 with YouTube hardware-accelerated playback), the system froze when switching between windows.
Test with amdgpu.enforce_isolation=1: the freeze when switching between multiple browser windows was reproduced, but the error messages are now different:

[  681.153691] amdgpu 0000:07:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
[  698.013230] amdgpu 0000:07:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
[  714.563699] amdgpu 0000:07:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
[  714.600222] amdgpu 0000:07:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
[  721.573168] amdgpu 0000:07:00.0: amdgpu: Dumping IP State
[  731.499225] iwlwifi 0000:04:00.0: Unhandled alg: 0x703
[  747.269661] amdgpu 0000:07:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
[  747.753099] amdgpu 0000:07:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706

Testing with amdgpu.cwsr_enable=0 now.

Worked the whole day with amdgpu.cwsr_enable=0, no freezes so far. This option seems to be the only one that makes any difference on my system. I'll see if it freezes at some point in the future.

#309 2025-01-28 14:13:51

Mechanicus
Member
Registered: 2025-01-13
Posts: 48

Re: Issues with Mesa 24.3.x and amdgpu Vega graphics

lpr1 wrote:

Worked the whole day with amdgpu.cwsr_enable=0, no freezes so far. This option seems to be the only one that makes any difference on my system. I'll see if it freezes at some point in the future.

I got the same result. We have probably found at least one problematic part: https://github.com/torvalds/linux/blob/ … r_gfx9.asm
We need more volunteers to test this flag. Then we can create a patch to disable CWSR for GFX 8 and 9. @kclisp, @Lone_Wolf would you like to take part in the party?

Last edited by Mechanicus (2025-01-28 14:48:42)

#310 2025-01-28 15:18:10

nek0panchi
Member
Registered: 2020-08-07
Posts: 12

Re: Issues with Mesa 24.3.x and amdgpu Vega graphics

Got a freeze on 24.3.4-1 with amdgpu.cwsr_enable=0 in the first 5 minutes of usage (nothing in the logs). Going back to Lone_Wolf's mesa-test-git 25.0.0_devel.200756.84b660b9229-1 with Marek's MR; it's the only one that hasn't crashed on me yet.

#311 2025-01-28 15:41:26

Mechanicus
Member
Registered: 2025-01-13
Posts: 48

Re: Issues with Mesa 24.3.x and amdgpu Vega graphics

There is a more prominent patch from Alex Deucher: https://gitlab.freedesktop.org/drm/amd/ … te_2755333
@Lone_Wolf we should test it, since it is related to GFX9 only and not to Mesa. Could you please prepare the kernel build?

Last edited by Mechanicus (2025-01-28 15:45:16)

#312 2025-01-28 17:56:14

NotAnArchUser
Member
Registered: 2025-01-25
Posts: 6

Re: Issues with Mesa 24.3.x and amdgpu Vega graphics

Yesterday I got another freeze while testing the amdgpu.cwsr_enable=0 parameter. Now I'm testing amdgpu.mes=1. Reminding everyone that I'm on Void Linux.

2025-01-27T05:52:53.04480 kern.err: [ 5201.225840] [drm:atom_op_jump [amdgpu]] *ERROR* atombios stuck in loop for more than 20secs aborting
2025-01-27T05:52:53.04491 kern.err: [ 5201.226311] [drm:amdgpu_atom_execute_table_locked [amdgpu]] *ERROR* atombios stuck executing A4A6 (len 84, WS 0, PS 0) @ 0xA4DC
2025-01-27T05:52:53.04493 kern.err: [ 5201.226722] [drm:amdgpu_atom_execute_table_locked [amdgpu]] *ERROR* atombios stuck executing CF70 (len 525, WS 0, PS 0) @ 0xCFC1

By the way, some time ago I managed to get cat /sys/kernel/debug/dri/0/amdgpu_gpu_recover working by applying amdgpu.ppfeaturemask=0xffff7bcf (here I turned off PP_POWER_CONTAINMENT_MASK, PP_UVD_HANDSHAKE_MASK, PP_CLOCK_STRETCH_MASK and PP_GFXOFF_MASK).
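
A minimal Python sketch of how that mask decodes, in case anyone wants to check their own value. The bit values in the table are an assumption: they are the values as I remember them from the PP_FEATURE_MASK enum in the kernel's amd_shared.h, and only the four flags named above are listed. Verify them against your kernel source before relying on the names. The helper names are just for illustration.

```python
# ASSUMPTION: bit values recalled from enum PP_FEATURE_MASK in the kernel's
# amd_shared.h; only the four flags named in this post are listed here.
PP_FEATURES = {
    "PP_POWER_CONTAINMENT_MASK": 0x10,
    "PP_UVD_HANDSHAKE_MASK": 0x20,
    "PP_CLOCK_STRETCH_MASK": 0x400,
    "PP_GFXOFF_MASK": 0x8000,
}


def cleared_bits(mask, width=32):
    """Return the positions of all zero bits in a mask of the given width."""
    return [i for i in range(width) if not (mask >> i) & 1]


def disabled_features(mask):
    """Return the names (from the table above) whose bit is cleared in mask."""
    return sorted(name for name, bit in PP_FEATURES.items() if not mask & bit)


if __name__ == "__main__":
    mask = 0xFFFF7BCF
    print("cleared bit positions:", cleared_bits(mask))
    print("disabled (per assumed table):", disabled_features(mask))
```

The cleared_bits helper does not depend on the name table, so it can be used on any ppfeaturemask value even if the names above turn out to differ on your kernel.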

#313 2025-01-28 18:07:32

Mechanicus
Member
Registered: 2025-01-13
Posts: 48

Re: Issues with Mesa 24.3.x and amdgpu Vega graphics

NotAnArchUser wrote:

Yesterday I got another freeze while testing the amdgpu.cwsr_enable=0 parameter. Now I'm testing amdgpu.mes=1. Reminding everyone that I'm on Void Linux.

2025-01-27T05:52:53.04480 kern.err: [ 5201.225840] [drm:atom_op_jump [amdgpu]] *ERROR* atombios stuck in loop for more than 20secs aborting
2025-01-27T05:52:53.04491 kern.err: [ 5201.226311] [drm:amdgpu_atom_execute_table_locked [amdgpu]] *ERROR* atombios stuck executing A4A6 (len 84, WS 0, PS 0) @ 0xA4DC
2025-01-27T05:52:53.04493 kern.err: [ 5201.226722] [drm:amdgpu_atom_execute_table_locked [amdgpu]] *ERROR* atombios stuck executing CF70 (len 525, WS 0, PS 0) @ 0xCFC1

By the way, some time ago I managed to get cat /sys/kernel/debug/dri/0/amdgpu_gpu_recover working by applying amdgpu.ppfeaturemask=0xffff7bcf (here I turned off PP_POWER_CONTAINMENT_MASK, PP_UVD_HANDSHAKE_MASK, PP_CLOCK_STRETCH_MASK and PP_GFXOFF_MASK).

Could you compile your own kernel? Here is an updated fix from an AMD developer: https://gitlab.freedesktop.org/drm/amd/ … te_2755499
Regarding amdgpu_gpu_recover: the mask you applied just disables GPU modules, so it is not OK.

Last edited by Mechanicus (2025-01-28 18:57:53)

#314 2025-01-28 18:30:54

Mechanicus
Member
Registered: 2025-01-13
Posts: 48

Re: Issues with Mesa 24.3.x and amdgpu Vega graphics

Linux 6.13 (build based on https://aur.archlinux.org/packages/linux-mainline + Arch Linux patches) with the updated gfxoff patch (https://gitlab.freedesktop.org/drm/amd/ … te_2755499).
Download link: https://drive.google.com/drive/folders/ … KOx34jmcRx

Note: you need to manually select this kernel in the boot menu. It is not a replacement for the default package.

Last edited by Mechanicus (2025-01-28 19:38:30)

#315 2025-01-28 18:38:34

pacoandres
Member
Registered: 2020-03-05
Posts: 20

Re: Issues with Mesa 24.3.x and amdgpu Vega graphics

Until now I've been using 24.3.4 compiled with the patch, with no freezes or other issues.

I'm going to compile it with this patch and see what happens.
https://gitlab.freedesktop.org/mesa/mes … te_2755501

EDIT: I've seen it's not a patch for mesa, but for the kernel driver.

Last edited by pacoandres (2025-01-28 18:46:07)

#316 2025-01-28 18:44:16

Mechanicus
Member
Registered: 2025-01-13
Posts: 48

Re: Issues with Mesa 24.3.x and amdgpu Vega graphics

pacoandres wrote:

Until now I've been using 24.3.4 compiled with the patch, with no freezes or other issues.

I'm going to compile it with this patch and see what happens.
https://gitlab.freedesktop.org/mesa/mes … te_2755501

A link to a kernel with this patch is available in the previous comment.

#317 2025-01-28 18:47:13

pacoandres
Member
Registered: 2020-03-05
Posts: 20

Re: Issues with Mesa 24.3.x and amdgpu Vega graphics

Mechanicus wrote:
pacoandres wrote:

Until now I've been using 24.3.4 compiled with the patch, with no freezes or other issues.

I'm going to compile it with this patch and see what happens.
https://gitlab.freedesktop.org/mesa/mes … te_2755501

A link to a kernel with this patch is available in the previous comment.

Thanks.

#318 2025-01-28 18:58:40

NuSkool
Member
Registered: 2015-03-23
Posts: 195

Re: Issues with Mesa 24.3.x and amdgpu Vega graphics

System froze using kernel linux-git 6.13.r8997.f34b580514c9-1 with official repo mesa and no additional kernel parameters.

I'll install Mechanicus' custom kernel (https://bbs.archlinux.org/viewtopic.php … 8#p2223048) for testing now.

EDIT: Change of plans.... Thanks Mechanicus for the huge time saver!

Last edited by NuSkool (2025-01-28 19:12:38)

#319 2025-01-28 19:01:43

Mechanicus
Member
Registered: 2025-01-13
Posts: 48

Re: Issues with Mesa 24.3.x and amdgpu Vega graphics

NuSkool wrote:

System froze using kernel linux-git 6.13.r8997.f34b580514c9-1 with official repo mesa and no additional kernel parameters.

I'll install Mechanicus' custom kernel (https://bbs.archlinux.org/viewtopic.php … 8#p2223048) for testing now.

EDIT: Change of plans.... Thanks Mechanicus for the huge time saver!

Thanks to all of you who accepted my point of view on the problem and participated in testing! :)
Regarding compilation time: you can drastically improve it by applying optimized parameters to makepkg, like I do here: https://github.com/SeryogaBrigada/Simpl … pdate#L127

Last edited by Mechanicus (2025-01-28 19:20:07)

#320 2025-01-28 20:10:06

NuSkool
Member
Registered: 2015-03-23
Posts: 195

Re: Issues with Mesa 24.3.x and amdgpu Vega graphics

OK, that didn't take long: it froze while I was verifying that I did everything right.
Rebooting for another go...

@Mechanicus, I see a discrepancy*, so I'm double-checking: am I running the correct kernel for testing, and did you upload the correct one?
* Between 'pacman -Q linux-mainline' and 'uname -r'.

I'm used to seeing the output from those two commands match.
E.g. on a different Arch system:

$ pacman -Q linux ; uname -r
linux 6.12.9.arch1-1
6.12.9-arch1-1

Pacman log installing test kernel:

[2025-01-28T11:13:56-0800] [PACMAN] Running 'pacman --color=always -U linux-mainline-6.13-2-x86_64.pkg.tar.zst linux-mainline-headers-6.13-2-x86_64.pkg.tar.zst'
[2025-01-28T11:13:59-0800] [ALPM] transaction started
[2025-01-28T11:14:00-0800] [ALPM] installed linux-mainline (6.13-2)
[2025-01-28T11:14:02-0800] [ALPM] installed linux-mainline-headers (6.13-2)

And some verification:

$ pacman -Q linux-mainline ; uname -r
linux-mainline 6.13-2
6.13.0-arch1-2-mainline-gffd294d346d1-dirty


$ ls -1 /boot
efi
grub
GRUB-BU
amd-ucode.img
initramfs-linux-fallback.img
initramfs-linux-git-fallback.img
initramfs-linux-git.img
initramfs-linux.img
initramfs-linux-mainline-fallback.img
initramfs-linux-mainline.img
vmlinuz-linux
vmlinuz-linux-git
vmlinuz-linux-mainline


$ grep 'mainline' /boot/grub/grub.cfg
	linux	/boot/vmlinuz-linux-mainline root=UUID=60bc1026-da96-43b5-8963-eda5d63b8049  rw  loglevel=3  sysrq_always_enabled=1  amd_pstate=passive fsck.mode=force
	initrd	/boot/initramfs-linux-mainline.img

And a reply to:

you can drastically improve it by applying optimized parameters to makepkg

Yeah, thanks. I have this system set up to use 6 of its 8 threads as jobs for clean chroot builds. I ran out of root disk space, so I had to restart with the clean chroot in my home dir...
Compiling generic kernels with all the drivers is a big time sink on a weak system, and I didn't feel like slimming it down to the essentials.

Last edited by NuSkool (2025-01-28 20:23:00)

#321 2025-01-28 20:50:10

Mechanicus
Member
Registered: 2025-01-13
Posts: 48

Re: Issues with Mesa 24.3.x and amdgpu Vega graphics

NuSkool wrote:

@Mechanicus, I see a discrepancy*, so I'm double-checking: am I running the correct kernel for testing, and did you upload the correct one?
* Between 'pacman -Q linux-mainline' and 'uname -r'.
And some verification:

$ pacman -Q linux-mainline ; uname -r
linux-mainline 6.13-2
6.13.0-arch1-2-mainline-gffd294d346d1-dirty

This is correct. uname -r should return 6.13.0-arch1-2-mainline-gffd294d346d1-dirty

#322 2025-01-28 22:31:11

Mechanicus
Member
Registered: 2025-01-13
Posts: 48

Re: Issues with Mesa 24.3.x and amdgpu Vega graphics

Important

Everyone who uses mesa 24.3.4, please report any change in behavior after applying the amdgpu.ppfeaturemask=0xfff73fff kernel parameter. This option disables the GFXOFF module, so an increase in GPU power consumption is expected.
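
Before reporting results, it may help to confirm that the parameter actually reached the kernel after the reboot. Below is a minimal Python sketch that pulls the amdgpu.* options out of the kernel command line. The function name is just for illustration, and it only covers options passed on the command line (as read from /proc/cmdline), not options set through modprobe configuration files.

```python
def amdgpu_params(cmdline):
    """Return {name: value} for every amdgpu.<name>=<value> token in cmdline."""
    params = {}
    for token in cmdline.split():
        if token.startswith("amdgpu.") and "=" in token:
            name, value = token[len("amdgpu."):].split("=", 1)
            params[name] = value
    return params


if __name__ == "__main__":
    try:
        # /proc/cmdline holds the command line the running kernel was booted with.
        with open("/proc/cmdline") as fh:
            found = amdgpu_params(fh.read())
    except OSError:
        found = {}
    if "ppfeaturemask" in found:
        print("ppfeaturemask on command line:", found["ppfeaturemask"])
    else:
        print("amdgpu.ppfeaturemask not found on the kernel command line")
```

If the option is missing from the output, the boot loader entry was probably not regenerated or a different entry was booted.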

#323 2025-01-29 07:06:35

lpr1
Member
Registered: 2017-10-08
Posts: 93

Re: Issues with Mesa 24.3.x and amdgpu Vega graphics

Like others, I also got a freeze with amdgpu.cwsr_enable=0. It took some time to get there, but it happened. Now I'll test amdgpu.ppfeaturemask=0xfff73fff.

#324 2025-01-29 07:14:07

NuSkool
Member
Registered: 2015-03-23
Posts: 195

Re: Issues with Mesa 24.3.x and amdgpu Vega graphics

Ran the following setup for testing:

linux-mainline 6.13-2   Mechanicus patched kernel
mesa 1:24.3.4-1              official repo mesa

Locked up twice: the first time within minutes, the second after several hours.


Added the following parameter to this setup for further testing:

amdgpu.ppfeaturemask=0xfff73fff  Mechanicus kernel parameter

Last edited by NuSkool (2025-01-29 07:18:22)

#325 2025-01-29 08:06:00

kode54
Member
Registered: 2013-10-21
Posts: 30

Re: Issues with Mesa 24.3.x and amdgpu Vega graphics

I started experiencing these freezes on my system with gfx11 / RDNA 3 graphics (7700 XT) around the 25th, right around when I updated Mesa to 24.3.4, and when I updated a Docker container with GPU access to Ubuntu 24.10.x with presumably a 24.2.8 Mesa package.

I ended up rotating the 7700 XT out of my installed hardware, since I wasn't experiencing the freezes with a 6700 XT, and not currently experiencing them with an RX 480 I put in the machine I removed the 6700 XT from.

I'll consider testing the ppfeaturemask workaround as well, if that looks like it will fix it until the firmware is fixed.
