
#326 Yesterday 08:14:26

seth
Member
Registered: 2012-09-03
Posts: 60,756

Re: Issues with Mesa 24.3.x and amdgpu Vega graphics

Fwiw, there might be interest in https://bbs.archlinux.org/viewtopic.php?id=302858 (script to decode and look up the various module flags)

Offline

#327 Yesterday 08:26:44

NotAnArchUser
Member
Registered: 2025-01-25
Posts: 6

Re: Issues with Mesa 24.3.x and amdgpu Vega graphics

Mechanicus wrote:

Could you compile your own kernel? Here is an updated fix from AMD developer: https://gitlab.freedesktop.org/drm/amd/ … te_2755499

Never compiled a kernel before and never applied patches. But I'll learn if necessary. But NuSkool already reported it's freezing. Should I test it too?

Mechanicus wrote:

Regarding amdgpu_gpu_recover - the mask you've applied just disabled GPU modules, so it is not OK.

I understand that this is not a proper solution, or really a solution at all. I'm just reporting that turning off these exact modules (PP_POWER_CONTAINMENT_MASK, PP_UVD_HANDSHAKE_MASK, PP_CLOCK_STRETCH_MASK and PP_GFXOFF_MASK with amdgpu.ppfeaturemask=0xffff7bcf) made cat /sys/kernel/debug/dri/0/amdgpu_gpu_recover work and in my case drastically decreased freeze frequency. I'm talking about freezes after minutes or a couple of hours of usage before, versus days now, especially if I don't connect my second display. But freezes still persist.
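
For anyone who wants to double-check which of these features a given mask turns off, here is a minimal Python sketch. The bit values are an assumption on my part, written down as I recall them from the kernel header drivers/gpu/drm/amd/include/amd_shared.h, so please verify them against your kernel source. Only the four flags named above are listed, and disabled_features is just an illustrative helper name.

```python
# Bit values as recalled from amd_shared.h (assumption: verify for your kernel).
PP_FEATURE_BITS = {
    "PP_POWER_CONTAINMENT_MASK": 0x10,
    "PP_UVD_HANDSHAKE_MASK": 0x20,
    "PP_CLOCK_STRETCH_MASK": 0x400,
    "PP_GFXOFF_MASK": 0x8000,
}


def disabled_features(mask: int) -> list[str]:
    """Return the names of the listed feature bits that are cleared in mask."""
    return [name for name, bit in PP_FEATURE_BITS.items() if not mask & bit]


if __name__ == "__main__":
    for value in (0xFFFF7BCF, 0xFFF73FFF):
        print(f"{value:#010x}: {disabled_features(value)}")
```

Bits that are not in the table are ignored, so a mask can differ from the default in other places without this sketch noticing.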

Last edited by NotAnArchUser (Yesterday 08:31:52)

Offline

#328 Yesterday 09:00:55

Mechanicus
Member
Registered: 2025-01-13
Posts: 46

Re: Issues with Mesa 24.3.x and amdgpu Vega graphics

NotAnArchUser wrote:
Mechanicus wrote:

Could you compile your own kernel? Here is an updated fix from AMD developer: https://gitlab.freedesktop.org/drm/amd/ … te_2755499

Never compiled a kernel before and never applied patches. But I'll learn if necessary. But NuSkool already reported it's freezing. Should I test it too?

Mechanicus wrote:

Regarding amdgpu_gpu_recover - the mask you've applied just disabled GPU modules, so it is not OK.

I understand that this is not a proper solution, or really a solution at all. I'm just reporting that turning off these exact modules (PP_POWER_CONTAINMENT_MASK, PP_UVD_HANDSHAKE_MASK, PP_CLOCK_STRETCH_MASK and PP_GFXOFF_MASK with amdgpu.ppfeaturemask=0xffff7bcf) made cat /sys/kernel/debug/dri/0/amdgpu_gpu_recover work and in my case drastically decreased freeze frequency. I'm talking about freezes after minutes or a couple of hours of usage before, versus days now, especially if I don't connect my second display. But freezes still persist.

Thank you for the update. It looks like the sheer number of workarounds in the AMD driver is working against the driver itself.

Offline

#329 Yesterday 10:09:27

Lone_Wolf
Administrator
From: Netherlands, Europe
Registered: 2005-10-04
Posts: 13,225

Re: Issues with Mesa 24.3.x and amdgpu Vega graphics

MR 33248 was merged to trunk in a slightly different form that only affects raven & raven2 chipsets.

Mesa 25.0 hasn't been branched off yet, so the 25.0 release candidates and the stable release will have the change.
I have taken down the previous 25.0 builds and uploaded a new one with the change merged.

download link

I suggest the testers of the kernel parameters stick to 24.3.x mesa.


Disliking systemd intensely, but not satisfied with alternatives so focusing on taming systemd.

clean chroot building not flexible enough ?
Try clean chroot manager by graysky

Offline

#330 Yesterday 11:09:17

orbit-oc
Member
Registered: 2024-12-15
Posts: 61

Re: Issues with Mesa 24.3.x and amdgpu Vega graphics

Now that's something!

The decision (mesa) to limit the deactivation of the compute queues to Raven/Raven2 (gfx9) is a good one.
I also like the fact that the routine for the next bugfix release 24.3.5 is now being merged.

The kernel patch and kernel parameter testers now have time to thoroughly test the matter. The current tests seem a bit non-transparent to me and often a bit hasty in their conclusions.

Pierre-Eric Pelloux-Prayer @pepp - mesa Developer

R-b, I think it's ok to merge this patch for now. If Alex's patch turns out to fix the root cause, we'll rework this patch to only disable compute queues on kernels without the workaround.

Offline

#331 Yesterday 17:20:38

pacoandres
Member
Registered: 2020-03-05
Posts: 20

Re: Issues with Mesa 24.3.x and amdgpu Vega graphics

I've been working the whole day while testing the kernel parameter amdgpu.ppfeaturemask=0xfff73fff with no freezes:

  • Kernel 6.12.10 unpatched

  • Mesa 24.3.4 unpatched
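
To confirm that the parameter really is active on the running kernel, a small Python sketch like the following could be used. It only parses a kernel command line string. The function name is hypothetical, and reading /proc/cmdline is left to the caller.

```python
import re
from typing import Optional


def ppfeaturemask_from_cmdline(cmdline: str) -> Optional[int]:
    """Return the amdgpu.ppfeaturemask value found in a kernel command line, or None."""
    match = re.search(r"(?:^|\s)amdgpu\.ppfeaturemask=(\S+)", cmdline)
    # int(..., 0) accepts both hexadecimal (0x...) and decimal notation
    return int(match.group(1), 0) if match else None
```

On a running system it can be fed with the contents of /proc/cmdline, e.g. ppfeaturemask_from_cmdline(open("/proc/cmdline").read()). Note that this does not cover the case where the option is set through a modprobe.d file instead of the bootloader.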

As far as I've tested, the GPU temperature is higher than usual (39 °C vs 27 °C; it's so cold here these days), and so is the fan speed, of course.

Maybe the Mesa patch is better if you are not planning to use GPU compute capabilities. And, as I understand, that is the solution adopted by Mesa developers https://gitlab.freedesktop.org/mesa/mes … ests/33248

Last edited by pacoandres (Yesterday 17:26:55)

Offline

#332 Yesterday 17:47:30

Mechanicus
Member
Registered: 2025-01-13
Posts: 46

Re: Issues with Mesa 24.3.x and amdgpu Vega graphics

pacoandres wrote:

I've been working the whole day while testing the kernel parameter amdgpu.ppfeaturemask=0xfff73fff with no freezes:

  • Kernel 6.12.10 unpatched

  • Mesa 24.3.4 unpatched

As far as I've tested, the GPU temperature is higher than usual (39 °C vs 27 °C; it's so cold here these days), and so is the fan speed, of course.

Maybe the Mesa patch is better if you are not planning to use GPU compute capabilities. And, as I understand, that is the solution adopted by Mesa developers https://gitlab.freedesktop.org/mesa/mes … ests/33248

It is a workaround (one more, yeah). The actual root cause of the problem is still there (since 2018). The mesa fix just makes the freeze less frequent.

Offline

#333 Yesterday 18:20:29

orbit-oc
Member
Registered: 2024-12-15
Posts: 61

Re: Issues with Mesa 24.3.x and amdgpu Vega graphics

Mechanicus wrote:

The actual root cause of the problem is still there (since 2018). The mesa fix just makes the freeze less frequent.

@kclisp has been running the original mesa-fix (1) since it became available. This is probably the longest-running test so far. He has not yet reported a crash.
Afterwards the mesa-fix was tightened further (2), and now it has been restricted to Raven/Raven2 (gfx9) (3).

I have never tested the fix. But I have now been on mesa-test-git 25.0.0_devel.200908.66775c89fce-1 (Lone_Wolf's build with the latest fix) for hours, and I'll stay there until the release of mesa 24.3.5 (or until another freeze).
For affected users, getting back to a stable system is what matters now.

@Mechanicus has done good research. It remains to be seen whether a change to the kernel (Alex Deucher - Kernel/AMD developer) can improve things. For the time being, the stabilisation that is now becoming apparent with the mesa-fix is decisive and buys the time needed for further research and testing.

Offline

#334 Yesterday 18:29:29

Mechanicus
Member
Registered: 2025-01-13
Posts: 46

Re: Issues with Mesa 24.3.x and amdgpu Vega graphics

orbit-oc wrote:

@Mechanicus has done good research. It remains to be seen whether a change to the kernel (Alex Deucher - Kernel/AMD developer) can improve things. For the time being, the stabilisation that is now becoming apparent with the mesa-fix is decisive and buys the time needed for further research and testing.

I'm also investigating the amdgpu issue now. There are a lot of workarounds for the Raven family, and I guess one of them is obsolete and makes things worse, so I'm testing them one by one. When a stable solution is found, I'll announce it here and upload a new build. I'm approaching the research from a different entry point than Alex Deucher.

***Update***

I found one problematic place, and testing on my system looks promising. I would like to know your results.
https://drive.google.com/drive/folders/ … drive_link
Note: NO KERNEL PARAMETERS
sudo cat /sys/kernel/debug/dri/1/amdgpu_gpu_recover - this is not working yet. The current goal is stability, then recovery.
Important information to provide: temps and performance.
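
For collecting temps in a comparable way, here is a minimal Python sketch that reads the amdgpu sensors from hwmon. It assumes the usual sysfs layout, where an hwmon directory has a name file containing "amdgpu" and temp*_input files in millidegrees Celsius. The function name is hypothetical.

```python
from pathlib import Path


def amdgpu_temps_celsius(hwmon_root: str = "/sys/class/hwmon") -> dict[str, float]:
    """Map each amdgpu hwmon temperature file to its reading in degrees Celsius."""
    temps = {}
    for name_file in sorted(Path(hwmon_root).glob("hwmon*/name")):
        if name_file.read_text().strip() != "amdgpu":
            continue  # skip sensors that belong to other drivers
        for sensor in sorted(name_file.parent.glob("temp*_input")):
            # hwmon reports temperatures in millidegrees Celsius
            temps[str(sensor)] = int(sensor.read_text()) / 1000.0
    return temps
```

Calling amdgpu_temps_celsius() once under load and once idle, with and without the test kernel, gives numbers that are easier to compare than readings from different monitoring tools.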

You can monitor the progress here: https://github.com/SeryogaBrigada/linux … .13-amdgpu

Last edited by Mechanicus (Yesterday 19:30:48)

Offline

#335 Yesterday 19:14:03

NuSkool
Member
Registered: 2015-03-23
Posts: 193

Re: Issues with Mesa 24.3.x and amdgpu Vega graphics

Ran the following setup for testing:

linux-mainline 6.13-2  Mechanicus patched kernel
mesa 1:24.3.4-1  official repo mesa
amdgpu.ppfeaturemask=0xfff73fff  Mechanicus kernel parameter


System seems to run well so far.
Ran a 4K video overnight and was still running this morning.

It seems this kernel parameter by itself may be an improvement.

The results of the command  sudo cat /sys/kernel/debug/dri/1/amdgpu_gpu_recover  were a bit odd.

Running it from my DE, it initially recovered for a few seconds, then seemingly ran again and did not recover.
Running it from the console after stopping X11 was successful, with exit code 0.
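
Since the dri index differs between systems (both dri/0 and dri/1 show up in this thread), a small Python sketch like this can list where the file lives. It deliberately only lists paths, because reading the file is what triggers the reset. It needs to run as root to see debugfs, and the function name is hypothetical.

```python
from pathlib import Path


def gpu_recover_paths(debugfs_dri: str = "/sys/kernel/debug/dri") -> list[str]:
    """List existing amdgpu_gpu_recover files without opening them."""
    return sorted(str(p) for p in Path(debugfs_dri).glob("*/amdgpu_gpu_recover"))
```

An empty list means either debugfs is not readable by the current user or no amdgpu device exposes the file.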

I was planning on switching to the official repo linux kernel with the same parameter, but with the latest news....
I'll switch to Mechanicus latest kernel for further testing.

@Mechanicus, should I continue with the amdgpu.ppfeaturemask=0xfff73fff parameter with this kernel?

Online

#336 Yesterday 19:16:07

Mechanicus
Member
Registered: 2025-01-13
Posts: 46

Re: Issues with Mesa 24.3.x and amdgpu Vega graphics

NuSkool wrote:

@Mechanicus, should I continue with the amdgpu.ppfeaturemask=0xfff73fff parameter with this kernel?

No, no extra parameters please.

Last edited by Mechanicus (Yesterday 19:19:49)

Offline

#337 Yesterday 20:02:23

pacmancrashedagain
Member
Registered: 2024-12-14
Posts: 19

Re: Issues with Mesa 24.3.x and amdgpu Vega graphics

I'm not sure if this is still relevant, but I had mesa-git with that first Lone_Wolf patch (https://bbs.archlinux.org/viewtopic.php?pid=2220871#p2220871), which was deemed the most stable one, and I got a lockup after 5 or 6 days when I opened mpv.

Offline

#338 Yesterday 20:08:43

nek0panchi
Member
Registered: 2020-08-07
Posts: 12

Re: Issues with Mesa 24.3.x and amdgpu Vega graphics

@pacmancrashedagain
Yeah, that's exactly what happened to me too; see my post #240. I'm on Lone_Wolf's mesa-test-git 25.0.0_devel.200908.66775c89fce-1 now. Neither this build nor the previous one with the MR has crashed on me yet.

Offline

#339 Yesterday 20:35:59

kode54
Member
Registered: 2013-10-21
Posts: 30

Re: Issues with Mesa 24.3.x and amdgpu Vega graphics

Does this new kernel have any effect on RDNA 3? Because the kernel parameter from the last page is still fixing my problem. Specifically, amdgpu.ppfeaturemask=0xfff73fff

Offline

#340 Yesterday 20:43:44

Mechanicus
Member
Registered: 2025-01-13
Posts: 46

Re: Issues with Mesa 24.3.x and amdgpu Vega graphics

kode54 wrote:

Does this new kernel have any effect on RDNA 3? Because the kernel parameter from the last page is still fixing my problem. Specifically, amdgpu.ppfeaturemask=0xfff73fff

Nope, I haven't changed anything for GFX11 yet. But thank you for the update. This is one more step towards the root cause.

Offline

#341 Yesterday 20:44:42

NuSkool
Member
Registered: 2015-03-23
Posts: 193

Re: Issues with Mesa 24.3.x and amdgpu Vega graphics

Ran the following setup for testing:

linux-test 6.13.arch1-1  Mechanicus kernel:  https://bbs.archlinux.org/viewtopic.php … 6#p2223336 https://github.com/SeryogaBrigada/linux … .13-amdgpu
uname -r: 6.13.0-arch1-1-test
mesa 1:24.3.4-1  official repo mesa
No additional kernel parameters

Ran for a short time before freezing...

I'm switching back to the official repo kernel 'linux' with Mechanicus' kernel parameter: amdgpu.ppfeaturemask=0xfff73fff

Last edited by NuSkool (Yesterday 20:45:58)

Online

#342 Yesterday 20:56:48

orbit-oc
Member
Registered: 2024-12-15
Posts: 61

Re: Issues with Mesa 24.3.x and amdgpu Vega graphics

@NuSkool

I'm switching back to the official repo kernel 'linux' with Mechanicus' kernel parameter: amdgpu.ppfeaturemask=0xfff73fff

Keep an eye on power consumption if you plan to use that kernel parameter permanently. One user has reported that power consumption increases significantly with it.
https://gitlab.freedesktop.org/drm/amd/ … te_2755781

Offline

#343 Yesterday 21:00:27

orbit-oc
Member
Registered: 2024-12-15
Posts: 61

Re: Issues with Mesa 24.3.x and amdgpu Vega graphics

@pacmancrashedagain @nek0panchi

...and i got a lockup after 5 or 6 days when i opened mpv.

I believe @kclisp has a Raven Ridge APU, while you both have a Picasso APU, as do I.
Maybe there are differences in behaviour - who knows.

But as @nek0panchi and I already said: the fix was tightened afterwards. Test the latest build (#329) like we do...

Offline

#344 Yesterday 21:50:17

NuSkool
Member
Registered: 2015-03-23
Posts: 193

Re: Issues with Mesa 24.3.x and amdgpu Vega graphics

@orbit-oc Thanks for the heads up.

My system is an HP mini desktop. I monitor CPU temp, overall CPU MHz and individual thread usage %. 
I've never noticed any thermal throttling on this system even when maxed out, so either well designed cooling or my monitoring isn't telling me.
I guess if battery life isn't a concern, the excessive power usage would potentially result in heat dissipation issues?

Reading the link you provided makes me realize how fast discoveries are being made behind the scenes.
Unfortunately it's moving beyond my ability to monitor, comprehend and keep up....
Seems you guys are doing a good job here in this thread, but again moving fast.

These limitations in mind, I'm here to test and provide feedback.
@everyone If there's a better direction I could take in testing please let me know.
I don't want to waste time providing test results that are already determined.

I've consistently had good results with Lone_Wolf's patched mesa.

My "fallback stable" when not testing, and until a fix is released, will tentatively be his latest:
Lone_Wolf's latest patched mesa: https://bbs.archlinux.org/viewtopic.php … 0#p2223190 https://app.box.com/s/8ednrt82hzac90x9ng5x0f5i9fjkxpos

That said, I'll move on to testing Lone_Wolf's latest patched mesa with:
official repo kernel 'linux'
without additional kernel parameters

Last edited by NuSkool (Yesterday 22:13:39)

Online

#345 Yesterday 21:58:06

pacmancrashedagain
Member
Registered: 2024-12-14
Posts: 19

Re: Issues with Mesa 24.3.x and amdgpu Vega graphics

Thanks @nek0panchi.

@orbit-oc, I have a 3200G, which is Raven Ridge.

But yeah, I'm going to try the latest mesa from Lone_Wolf right now. I just hadn't tried it before because I had gone about 5 days without crashes, so I thought it was all clear.

Last edited by pacmancrashedagain (Yesterday 22:00:57)

Offline

#346 Today 07:21:47

SnowF
Member
Registered: 2025-01-17
Posts: 10

Re: Issues with Mesa 24.3.x and amdgpu Vega graphics

Testing

  • Kernel 6.12.10 Unpatched.

  • Mesa 24.3.4 Unpatched.

Kernel boot parameter:

amdgpu.ppfeaturemask=0xf7fff

PP_GFXOFF_MASK: Dynamic graphics engine power control.

No crashes in 12+ hours.

Source
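
For anyone who would rather derive the value from their own current mask than copy one, here is a minimal Python sketch that clears only the GFXOFF bit. The bit value 0x8000 is an assumption taken from my recollection of the kernel's amd_shared.h, and the helper names are hypothetical. Both masks tested in this thread (0xf7fff and 0xfff73fff) have this bit cleared, but they differ in other bits.

```python
# Assumption: bit value as recalled from the kernel's amd_shared.h; verify for your kernel.
PP_GFXOFF_MASK = 0x8000


def without_gfxoff(current_mask: int) -> int:
    """Clear only the GFXOFF bit and keep every other feature bit unchanged."""
    return current_mask & ~PP_GFXOFF_MASK


def as_kernel_parameter(mask: int) -> str:
    """Format a mask as the boot parameter used in this thread."""
    return f"amdgpu.ppfeaturemask={mask:#x}"
```

The current mask can usually be read from /sys/module/amdgpu/parameters/ppfeaturemask (assuming your kernel exposes it there) and passed to without_gfxoff before formatting.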

Online

#347 Today 09:27:30

orbit-oc
Member
Registered: 2024-12-15
Posts: 61

Re: Issues with Mesa 24.3.x and amdgpu Vega graphics

confusing terms

@pacmancrashedagain

...i have a 3200G which is Raven Ridge

Ryzen 3000 (refresh of Ryzen 2000)
The AMD Ryzen 3 3200G APU uses the Zen+ (Picasso) architecture on Socket AM4.
Its iGPU is a Radeon Vega 8 from the Vega IGP generation (Raven).
As far as I know, @pacmancrashedagain and @nek0panchi, for example, use APUs of this kind with a Vega iGPU.

Ryzen 2000
The preceding APUs use the Zen (Raven Ridge) architecture.
This is a bit confusing because the CPU codename matches the GPU generation name of the Vega 8 (Raven). As far as I know, @kclisp and @bernd_b, for example, use these APUs with a Vega iGPU.

The merged mesa-fix has been limited to Raven/Raven2 GPUs.
I don't have an exact mapping of which APUs are Raven and which are Raven2 (Raven2 seems to belong to the Athlon 3000 APUs).

https://www.techpowerup.com/cpu-specs/r … 200g.c2205
https://www.techpowerup.com/gpu-specs/r … ga-8.c3042
#337, #338, #343, #345

Offline

#348 Today 09:51:32

Nicky726
Member
From: Czech Republic
Registered: 2008-02-15
Posts: 146

Re: Issues with Mesa 24.3.x and amdgpu Vega graphics

Tested the patched mesa provided by Lone_Wolf with an "AMD Ryzen 5 3400G with Radeon Vega Graphics" for about 6 days without encountering any issues (the machine is up all the time; the GUI/KDE is used some of the time, mostly for web browsing). The only amdgpu message in dmesg so far (a couple of occurrences) is:

[137689.994618] amdgpu 0000:0a:00.0: [drm] pstate TEST_DEBUG_DATA: 0x3EFE0000

Everything else is from ArchLinux repositories:

mesa-test-git 25.0.0_devel.200085.94da1edbe49-1
linux 6.12.10.arch1-1
linux-firmware 20250109.7673dffd-1
amd-ucode 20250109.7673dffd-1

Upgrading the system and Lone_Wolf provided patched mesa now to:

mesa-test-git-25.0.0_devel.200908.66775c89fce-1

"Although the masters make the rules
For the wise men and the fools
I got nothing, Ma, to live up to."

Offline

#349 Today 09:57:53

Lone_Wolf
Administrator
From: Netherlands, Europe
Registered: 2005-10-04
Posts: 13,225

Re: Issues with Mesa 24.3.x and amdgpu Vega graphics

Use glxinfo or eglinfo (for Wayland users) to verify which chipset Mesa detects.

Example:

$ glxinfo | grep radeonsi
    Device: AMD Radeon RX 580 Series (radeonsi, polaris10, ACO, DRM 3.59, 6.12.10-arch1-1) (0x67df)
OpenGL renderer string: AMD Radeon RX 580 Series (radeonsi, polaris10, ACO, DRM 3.59, 6.12.10-arch1-1)
$

The name after radeonsi is what you want to look at; my RX 580 has a polaris10 video chipset.

Only cards that show raven / raven2 there are affected by the MR.
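
The same check can be scripted. Here is a minimal Python sketch that pulls the chipset name out of glxinfo or eglinfo output. The function names are hypothetical, and the assumption that affected devices report exactly "raven" or "raven2" in that position is based only on the statement above.

```python
import re
from typing import Optional

# Assumption: these are the exact strings Mesa prints for the affected chipsets.
AFFECTED_CHIPSETS = {"raven", "raven2"}


def radeonsi_chipset(info_output: str) -> Optional[str]:
    """Extract the chipset name that follows 'radeonsi' in glxinfo/eglinfo output."""
    match = re.search(r"\(radeonsi,\s*([^,)]+)", info_output)
    return match.group(1).strip() if match else None


def affected_by_mr(info_output: str) -> bool:
    """True if the detected chipset is one of the names in AFFECTED_CHIPSETS."""
    return radeonsi_chipset(info_output) in AFFECTED_CHIPSETS
```

The text to parse can come from subprocess.run(["glxinfo"], capture_output=True, text=True).stdout, or from eglinfo in the same way.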



Offline

#350 Today 10:16:42

lpr1
Member
Registered: 2017-10-08
Posts: 93

Re: Issues with Mesa 24.3.x and amdgpu Vega graphics

SnowF wrote:

Testing

  • Kernel 6.12.10 Unpatched.

  • Mesa 24.3.4 Unpatched.

Kernel boot parameter:

amdgpu.ppfeaturemask=0xf7fff

PP_GFXOFF_MASK: Dynamic graphics engine power control.

No crashes in 12+ hours.

Source

I can confirm no crashes with amdgpu.ppfeaturemask=0xf7fff under the same conditions (unpatched kernel, mesa from the Arch repo). However, we should give it more testing time.

Offline
