@orbit-oc I think you know more about all these patches, etc. I'm definitely not up to speed on deciphering the id and useful info from them.....
AFAIK.... The kernel I'm on has the following applied by loqs in post #613. [PATCH 1/2] https://lore.kernel.org/stable/20250219 … r@amd.com/
loqs suggested I let our 'Arch' kernel maintainers know what we're testing and what's working at this point, so they can look into applying the patches to our next official kernel update.
We're still on 6.13.3.arch1-1 , so maybe a bump to 6.13.3.arch1-2 until we get on 6.13.4?
Seems we're more or less beta testers of all Linux packages for other distros. The timing may be getting critical on this bug before a huge morass of users are affected.
Mostly all speculation on my part though.... I'd wait for the professionals if looking for accurate facts.
Last edited by NuSkool (2025-02-21 23:09:48)
Scripts I Use : https://github.com/Cody-Learner
$ grep -m1 'model name' /proc/cpuinfo : AMD Ryzen 5 PRO 2400GE w/ Radeon Vega Graphics
$ glxinfo | grep Device : Device: AMD Radeon Vega 11 Graphics (radeonsi, raven, ACO, DRM 3.61, 6.13.9-rc1) (0x15dd)
$ sudo dmesg | awk '/drm/ && /gfx/' : [ 6.427009] [drm] add ip block number 6 <gfx_v9_0>
Offline
#625
The back-ported patch drm-amdgpu-gfx9-manually-control-gfxoff-for-cs-on-rv.patch was dropped two days ago.
A patch drm-amdgpu-bump-version-for-rv-pco-compute-fix.patch seems to be included.
The reason for the withdrawal of Alex Deucher's backport seems to be the reporting below by Valentine Burley...
Offline
https://gitlab.freedesktop.org/drm/amd/ … te_2788066 which links to https://lore.kernel.org/stable/20250219 … r@amd.com/ is the replacement backport mentioned in https://git.kernel.org/pub/scm/linux/ke … b7282cf5ba. So far these manual backports to 6.13 have not been withdrawn that I can see; equally, they have not been accepted into the stable queue.
Edit:
@NuSkool When registering on Arch's GitLab instance, if you are stuck at the QR code for the OATH token, is there an option to switch it to a string that you can use with `oathtool --totp --base32 "YOUR STRING HERE"`?
Last edited by loqs (2025-02-22 17:46:38)
Offline
@loqs Click [can't get code] > run $ oathtool --totp -b 'GETA LONG BUNx CHxOFx LETx TERSx GOOD' > 123456 > paste digits in 'One-time code' box.
Grrr at wasting time on 'accounts.archlinux.org' with 'gopass' and 'zbarimg'...
Last edited by NuSkool (2025-02-23 02:21:51)
Offline
To everyone on this thread: my friend has been getting this exact thing on Windows with an RX 6600 GPU, and there's no fix for it yet. Is it possible to package the patch for mesa on Windows, or is this issue completely different?
Offline
Unless you mean something like WSL, it's a completely different issue, Windows doesn't use Mesa.
Offline
xyznoobb, part of this thread deals with an issue deep in the Linux kernel amd driver.
Some comments in bug reports linked from this thread suggest the cause may be in the amd firmware.
If that is correct, other OSes could have similar issues.
The best advice I can give your friend is to ensure their Windows install & amd drivers are up to date.
In case they use an amd processor, they should also check the firmware for their system.
Good luck.
Disliking systemd intensely, but not satisfied with alternatives so focusing on taming systemd.
clean chroot building not flexible enough ?
Try clean chroot manager by graysky
Offline
Preliminary test results: loqs' patched linux 6.13.3.arch1-1.2 from post #613.
With around 17 hours of runtime it has been reliable, and it runs and performs well with repo mesa.
I'll continue testing this setup.
pacman -Q linux : linux 6.13.3.arch1-1.2
uname -sr : Linux 6.13.3-arch1-1.2
pacman -Q mesa : mesa 1:24.3.4-1
cat /proc/cmdline : ... rw loglevel=3 sysrq_always_enabled=1 amd_pstate=passive fsck.mode=force
glxgears resize test : Passed @ 60fps
vkgears resize test : Passed @ 60fps
Power Usage :
Idle desktop w/ chromium : 121.80F 5.40W
WebGL Aquarium Chromium :
500 fish 60fps 133.65F 9.96W
1000 fish 60fps 135.33F 10.37W
5000 fish 60fps 143.26F 11.82W
10000 fish 60fps 153.07F 13.82W
15000 fish ~54fps 161.44F 14.30W
20000 fish ~43fps 159.34F 13.67W
30000 fish ~30fps 157.87F 12.45W
I have 3 days on loqs' patched linux 6.13.3.arch1-1.2 from post #613 with repo mesa. I'd say my system runs reliably on this kernel.
I see the current repo kernel has been updated to 'linux 6.13.4.arch1-1'.
@loqs Does the updated kernel have the patches applied as in your post #613? I've finally made it through the 'accounts.archlinux.org' registration process. I'll file a report on 'linux' if it's still needed. I'll also switch to testing the latest repo kernel if it has any proposed fixes.
Last edited by NuSkool (2025-02-23 19:17:54)
Offline
The patches have not been applied to 6.13.4.arch1. Here is 6.13.4.arch1 with the same two patches applied:
linux-6.13.4.arch1-1.1-x86_64.pkg.tar.zst/linux-headers-6.13.4.arch1-1.1-x86_64.pkg.tar.zst
Diff of the changes I made to the PKGBUILD: increment pkgrel to 1.1 to tell the releases apart, drop the makedepends for building docs, add the patches to the source array, and do not build or package docs:
diff --git a/PKGBUILD b/PKGBUILD
index db548cc..5e2908a 100644
--- a/PKGBUILD
+++ b/PKGBUILD
@@ -2,7 +2,7 @@
pkgbase=linux
pkgver=6.13.4.arch1
-pkgrel=1
+pkgrel=1.1
pkgdesc='Linux'
url='https://github.com/archlinux/linux'
arch=(x86_64)
@@ -17,13 +17,6 @@ makedepends=(
python
tar
xz
-
- # htmldocs
- graphviz
- imagemagick
- python-sphinx
- python-yaml
- texlive-latexextra
)
options=(
!debug
@@ -34,6 +27,8 @@ _srctag=v${pkgver%.*}-${pkgver##*.}
source=(
https://cdn.kernel.org/pub/linux/kernel/v${pkgver%%.*}.x/${_srcname}.tar.{xz,sign}
$url/releases/download/$_srctag/linux-$_srctag.patch.zst{,.sig}
+ drm-amdgpu-gfx9-manually-control-gfxoff-for-cs-on-rv.patch::https://lore.kernel.org/stable/20250219132559.3940753-1-alexander.deucher@amd.com/raw
+ drm-amdgpu-bump-version-for-rv-pco-compute-fix.patch::https://lore.kernel.org/stable/20250219132559.3940753-2-alexander.deucher@amd.com/raw
config # the main kernel config file
)
validpgpkeys=(
@@ -46,11 +41,15 @@ sha256sums=('b80e0bc8efbc31e9ce5a84d1084dcccfa40e01bea8cc25afd06648b93d61339e'
'SKIP'
'9396ecd603c0129ca8457731db5fef117f75b63aec7a6782d5acbe8e4cd64787'
'SKIP'
+ '4086b15d66f2f8d8ebafff193c0a5116b9eabcda6b4e7bbeefa8773fe4c56f80'
+ '7091cc80581c1d62aa9cfd717d0605a7049c4a801ee5ec20c0cd5439994cfb21'
'9a195bc4d8b492b0f44da392689746f605ca946e4f396bbc25fdbffb383899c1')
b2sums=('2fe8e972e7de458fba6fbb18a08a01f17b49e4a2d31aa1368e50895a2698c6e1aaaf5137d0c0018860de3fe598e4ba425d6126ade7387ba227f690137111a66d'
'SKIP'
'da2f63697300bd07a28ab201aa879974eb50870cfcb6d0593c4ca33434ee0ccaa778be9a165f998f2f3e41e4f9f81d811255e6f056c9d15f8259da60d6680e2b'
'SKIP'
+ '80efc0c7b53fad68c28ef20aae314285ecbad959566abc57281fb59d654c7d326eb361d25e5acd0225a7715310e9300d6df62c7baa348b7116f5f1df53198a14'
+ 'bb1caf4dac5ec2a901d8ad80757adab3731a1e9b50a7b7c5540c58f24deb4601cbba1870b9cd5e406625ed7fcbb8331e94d5cced5a36bb3092280d99d7420318'
'eedd98ed226561af9b279b931d5251974ed98cf21fa0974a855dd0365a16d6702f190dc60eca9bb337534110d94d070e7c7bc5cd61ba774ab38d441b567cce6c')
export KBUILD_BUILD_HOST=archlinux
@@ -87,7 +86,6 @@ build() {
cd $_srcname
make all
make -C tools/bpf/bpftool vmlinux.h feature-clang-bpf-co-re=1
- make htmldocs
}
_package() {
@@ -214,29 +212,9 @@ _package-headers() {
ln -sr "$builddir" "$pkgdir/usr/src/$pkgbase"
}
-_package-docs() {
- pkgdesc="Documentation for the $pkgdesc kernel"
-
- cd $_srcname
- local builddir="$pkgdir/usr/lib/modules/$(<version)/build"
-
- echo "Installing documentation..."
- local src dst
- while read -rd '' src; do
- dst="${src#Documentation/}"
- dst="$builddir/Documentation/${dst#output/}"
- install -Dm644 "$src" "$dst"
- done < <(find Documentation -name '.*' -prune -o ! -type d -print0)
-
- echo "Adding symlink..."
- mkdir -p "$pkgdir/usr/share/doc"
- ln -sr "$builddir/Documentation" "$pkgdir/usr/share/doc/$pkgbase"
-}
-
pkgname=(
"$pkgbase"
"$pkgbase-headers"
- "$pkgbase-docs"
)
for _p in "${pkgname[@]}"; do
eval "package_$_p() {
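For anyone reading the diff above to learn how the PKGBUILD works, here is a minimal sketch of the parameter expansions that turn pkgver into the release tag and tarball name. It is not part of the PKGBUILD itself, and the `_srcname` definition is an assumption, since that line does not appear in the diff.

```shell
#!/bin/sh
# pkgver as set in the PKGBUILD diff above.
pkgver=6.13.4.arch1

# Assumption: _srcname is defined like this upstream (not visible in the diff).
_srcname=linux-${pkgver%.*}

# ${pkgver%.*}   strips the shortest trailing ".xxx"   -> 6.13.4
# ${pkgver##*.}  strips everything up to the last "."  -> arch1
# ${pkgver%%.*}  strips the longest trailing ".xxx"    -> 6
_srctag=v${pkgver%.*}-${pkgver##*.}

printf '%s\n' "srcname: $_srcname"
printf '%s\n' "srctag:  $_srctag"
printf '%s\n' "tarball: https://cdn.kernel.org/pub/linux/kernel/v${pkgver%%.*}.x/${_srcname}.tar.xz"
```

The two added patch entries in the source array use the `name::url` form, so the file fetched from the lore `/raw` URL is saved locally under the name before the `::`.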
Offline
Thanks for the patched packages and patch file @loqs! I can use the patch file to learn more about patching kernels. I'll also file a bug report on our linux package pointing out your patches, once some testing time verifies they work well on this version.
I did test repo 'linux 6.13.4.arch1' and it quickly froze.
Switching to the latest patched kernel and headers from loqs' post #634.
Last edited by NuSkool (2025-02-23 21:07:28)
Offline
I don't know if it works for everybody, but the following workflow leads to a quick freeze on my system if the kernel is faulty.
Open glxgears, resize quickly for 10-20 seconds.
Open vkcube without closing glxgears, resize quickly for 10-20 seconds.
Switch windows quickly (alt+tab) for about 10-20 seconds.
Close glxgears and vkcube.
Open chromium and get a freeze.
And I want to share a thought about this issue and its (not yet ready) solution. Two months have passed since the first reported freeze, and there have been several patches for mesa and the kernel that solve the issue (at least on my system and for others who have reported), but so far none of them are in the repo packages or in the main trunk.
In the meantime I've been reading the mesa and amdgpu threads about the issue, and I've seen some kind of refusal to commit the patches that can solve the problem, without any clear reason.
I've been using Linux-based OSes since 1996, starting with Slackware, later moving to Debian and now on Arch, and I thought that I understood how the open source universe works. But the way things are going around this issue makes me think that I've never understood how it really works.
Offline
I ducked out of this thread to keep the noise down and emailed Mechanicus directly.
I've been running tests 6.13.2.arch1-1 to 6.13.4.arch1-2 and 6.13.4.arch1-4 on a 2200G processor.
These tests have been stable for me:
6.13.2.arch1-6
6.13.2.arch1-9
6.13.2.arch1-10 (best power consumption)
6.13.2.arch1-13 (40% increase glxgears performance, high power consumption)
Offline
I agree that those @Mechanicus builds are stable with repo mesa.
I've also checked the @loqs builds and they are stable with repo mesa too (I'm running this build right now, with more than 3 hours and no freeze).
And patched mesa 25.0.0 with the repo kernel is stable too.
So we have three different approaches that seem to solve the issue.
I don't know the pros and cons of each solution, but surely they have some. That may be the reason why none have been committed to the main trunks yet.
At this time the combination repo mesa + repo kernel leads to a freeze.
Last edited by pacoandres (2025-02-24 11:16:19)
Offline
pacoandres wrote: I don't know the pros and cons of each solution, but sure they have. That may be the reason why none have been committed yet to the main trunks.
Another possibility is the maintainers are not aware of the issue as it has never been reported on Arch's gitlab instance?
Edit:
@Mechanicus what is the status of the five commits from https://github.com/SeryogaBrigada/linux … .13-amdgpu? I could not find them in mainline or amd-staging-drm-next.
Last edited by loqs (2025-02-24 13:43:13)
Offline
They are not accepted by AMD developers. From their point of view, they are useless.
But that is their point of view. I'm not going to give up and stay with degraded performance caused by adding a dozen extra operations to the GPU pipeline. Once I get both good performance and a stable system, I'm going to push my changes to mainline by every possible route.
Offline
The Arch kernel maintainers are reluctant to apply patches that have not been accepted upstream, so with that in mind it might be better to go with the backports of the two that have been accepted, or wait for mesa 25?
Offline
@loqs The "official" workaround for Raven only is merged in Linux 6.15. Backports should be merged to stable branches soon. And with them we'll get both higher power consumption and lower performance.
Last edited by Mechanicus (2025-02-24 14:10:43)
Offline
I filed a bug report on our linux package: https://gitlab.archlinux.org/archlinux/ … issues/114
I'll edit it per request....
Can I get someone to provide an upstream bug report link for linux so I can add it to my bug report? In the meantime I'll dig in and look.
I can also file a similar bug report for our mesa package. Please share your thoughts on this.
Another thought: I've never combined the patched 'linux linux-headers' with patched 'mesa'. Has anyone tested this?
I could test this combo in the near future, but I need more test time on my current setup.
Last edited by NuSkool (2025-02-24 20:17:52)
Offline
NuSkool wrote: Can I get someone to provide an upstream bug report link for linux so I can add it to my bug report?
https://gitlab.freedesktop.org/drm/amd/-/issues/3861
The two fixes are queued for 6.13.5 so will be included as part of the next stable kernel release.
Offline
Testing gromit's prebuilt version of 6.13.5: https://pkgbuild.com/~gromit/linux-bis … kg.tar.zst
Is the appended 'home' going to impact testing though?
$ pacman -Q linux
linux 6.13.5rc1-1
$ uname -rs
Linux 6.13.5-rc1-1home
$ ls -d /usr/lib/modules/*-rc*
/usr/lib/modules/6.13.5-rc1-1home
Last edited by NuSkool (2025-02-24 21:36:45)
Offline
Additional local version string gromit added?
zgrep CONFIG_LOCALVERSION /proc/config.gz
Offline
$ zgrep CONFIG_LOCALVERSION /proc/config.gz
CONFIG_LOCALVERSION=""
CONFIG_LOCALVERSION_AUTO=y
https://gitlab.archlinux.org/archlinux/ … ote_248995
It's also in the extracted kernel package:
$ ls linux-6.13.5rc1-1-x86_64.pkg/usr/lib/modules/
6.13.5-rc1-1home
Last edited by NuSkool (2025-02-24 22:20:57)
Offline
After the release of linux 6.13.5 and linux-lts 6.12.17 I am now back to the current mesa 24.3.4 with the current linux-lts.
Both kernels should contain the following patches:
drm-amdgpu-bump-version-for-rv-pco-compute-fix.patch
drm-amdgpu-gfx9-manually-control-gfxoff-for-cs-on-rv.patch
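A rough sketch for checking whether a given kernel version string is at or past the two releases named above. `kernel_should_have_fix` is a hypothetical helper, it only knows about the 6.12 and 6.13 series, and it only looks at the version number. It says nothing about whether the freezes are actually gone on a given system.

```shell
#!/bin/sh
# kernel_should_have_fix VERSION
# Hypothetical helper. Returns 0 if VERSION is 6.13.5 or later in the 6.13
# series, or 6.12.17 or later in the 6.12 series (the releases named in this
# post). Any other series returns 1, meaning "not covered by this sketch".
kernel_should_have_fix() {
    base=${1%%-*}                 # "6.13.5-arch1-1" -> "6.13.5"
    old_ifs=$IFS
    IFS=.
    set -- $base                  # split on dots into $1 $2 $3
    IFS=$old_ifs
    major=$1
    minor=$2
    patch=${3:-0}
    patch=${patch%%[!0-9]*}       # "5rc1" -> "5", "arch1" -> ""
    case "$major.$minor" in
        6.13) [ "${patch:-0}" -ge 5 ] ;;
        6.12) [ "${patch:-0}" -ge 17 ] ;;
        *) return 1 ;;
    esac
}

if kernel_should_have_fix "$(uname -r)"; then
    echo "running kernel is at or past a release that should carry both patches"
else
    echo "running kernel is older, or from a series this sketch does not cover"
fi
```

Note that a release candidate such as 6.13.5-rc1 is treated the same as 6.13.5 here, because the suffix after the first dash is dropped.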
First I'll test whether the crashes (freezes) are actually gone.
After that, I'm interested in whether mesa reworks its patch again, as was announced in case the kernel itself gained a fix, which should now be the case.
Pierre-Eric Pelloux-Prayer - mesa developer (a month ago):
R-b, I think it's ok to merge this patch for now. If Alex's patch turns out to fix the root cause, we'll rework this patch to only disable compute queues on kernels without the workaround.
The matter is not yet completely settled. Are there any comments?
https://web.git.kernel.org/pub/scm/linu … e2b69a26a9
https://gitlab.freedesktop.org/mesa/mesa/-/issues/12310
https://gitlab.freedesktop.org/mesa/mes … ests/33248
Offline
6.13.5-arch1-1 + mesa 1:24.3.4-1
uptime 2 hours, 24 minutes
when watching YouTube in Firefox
kernel: amdgpu 0000:2a:00.0: amdgpu: ring gfx timeout, signaled seq=2501866, emitted seq=2501868
kernel: amdgpu 0000:2a:00.0: amdgpu: Process information: process firefox pid 18764 thread firefox:cs0 pid 18833
kernel: amdgpu 0000:2a:00.0: amdgpu: Starting gfx ring reset
kernel: amdgpu 0000:2a:00.0: amdgpu: Ring gfx reset failure
kernel: amdgpu 0000:2a:00.0: amdgpu: ring gfx timeout, signaled seq=2501868, emitted seq=2501871
kernel: amdgpu 0000:2a:00.0: amdgpu: Process information: process kwin_wayland pid 938 thread kwin_wayla:cs0 pid 1192
kernel: amdgpu 0000:2a:00.0: amdgpu: Starting gfx ring reset
kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
kwin_wayland[938]: kwin_wayland_drm: Checking test buffer failed!
kernel: amdgpu 0000:2a:00.0: amdgpu: Dumping IP State
kernel: amdgpu 0000:2a:00.0: amdgpu: Dumping IP State Completed
kernel: amdgpu 0000:2a:00.0: amdgpu: ring comp_1.1.0 timeout, signaled seq=19273, emitted seq=19295
kernel: amdgpu 0000:2a:00.0: amdgpu: Process information: process Xwayland pid 1614 thread Xwayland:cs0 pid 1661
kernel: amdgpu 0000:2a:00.0: amdgpu: Starting comp_1.1.0 ring reset
kwin_wayland[938]: kwin_scene_opengl: A graphics reset not attributable to the current GL context occurred.
The system continued to work, but there was a pause of about 10 seconds, during which it froze.
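For others comparing results, a small sketch that counts amdgpu ring timeout lines in a log. `count_ring_timeouts` is a hypothetical helper name, and the pattern is only based on the message format seen in the log above, so it may need adjusting for other kernel versions.

```shell
#!/bin/sh
# count_ring_timeouts: read log lines on stdin and print how many of them are
# amdgpu ring timeouts ("amdgpu: ring <name> timeout"), e.g. for the gfx or
# comp_1.1.0 rings. Lines about ring resets are not counted.
count_ring_timeouts() {
    # grep -c exits non-zero when the count is 0, so keep the status clean.
    grep -c 'amdgpu: ring [^ ]* timeout' || true
}
```

Usage on a systemd system would be something like `journalctl -k -b | count_ring_timeouts` for the current boot, or `journalctl -k -b -1 | count_ring_timeouts` after a forced reboot.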
Last edited by grayich (2025-03-01 01:52:00)
Offline