#26 2023-08-30 19:14:59

LinuxSquare
Member
Registered: 2023-08-17
Posts: 16

Re: [SOLVED] [amdgpu]] *ERROR* ring gfx timeout, but soft recovered

Lone_Wolf wrote:

Please post the output of

$ archlinux-java status

[linuxsquare@archtuxy2 ~]$ archlinux-java status
Available Java environments:
  java-17-openjdk (default)

Lone_Wolf wrote:

In the log I also noticed that prismlauncher uses gamemoded. Can you try disabling that?

It seems that PrismLauncher only starts the service but doesn't actively use it.
Looking through the PrismLauncher codebase, the setting is taken from the checkbox that enables Feral's GameMode (which I haven't ticked):

# https://github.com/PrismLauncher/PrismLauncher/blob/b83fdbd1b752acdf555fb90d397ff61ddb896f2c/launcher/ui/pages/instance/InstanceSettingsPage.cpp#L220

...
if (performance) {
     m_settings->set("EnableFeralGamemode", ui->enableFeralGamemodeCheck->isChecked());
     m_settings->set("EnableMangoHud", ui->enableMangoHud->isChecked());
...

The prismlauncher.cfg in .local/share/PrismLauncher also says that GameMode is not enabled:

...
DownloadsDirWatchRecursive=false
EnableFeralGameMode=false
EnableMangoHud=false
...

So I suppose we can safely assume that GameMode isn't being used without my knowledge. I haven't found an option that stops GameMode from being started on my system at all, though.
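For anyone who wants to script this check, here is a minimal Python sketch. It is not how PrismLauncher reads its settings; it only scans `key=value` lines. The sample text is the excerpt from above, and the helper names `read_settings` and `is_enabled` are made up for illustration. Keys are compared case-insensitively because the spelling in the quoted source (`EnableFeralGamemode`) differs from the one in the config excerpt (`EnableFeralGameMode`).

```python
def read_settings(text):
    """Collect key=value pairs from INI-style text, ignoring section headers."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blanks, comments, [Section] headers and lines without '='.
        if not line or line.startswith(("#", ";", "[")) or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Store keys in lower case so lookups are case-insensitive.
        settings[key.strip().lower()] = value.strip()
    return settings


def is_enabled(settings, key):
    """Treat only the string 'true' (in any case) as enabled."""
    return settings.get(key.lower(), "false").lower() == "true"


# Sample copied from the excerpt above; a real file has many more keys.
sample = """\
DownloadsDirWatchRecursive=false
EnableFeralGameMode=false
EnableMangoHud=false
"""

settings = read_settings(sample)
for key in ("EnableFeralGameMode", "EnableMangoHud"):
    print(key, "->", "enabled" if is_enabled(settings, key) else "disabled")
```

To run it against the real file, read the text from the prismlauncher.cfg path mentioned above and pass it to `read_settings`.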

I suppose that, to remove it completely, I would have to either pass a flag when compiling or edit the code and strip GameMode out.
I couldn't find a flag to disable GameMode in the PKGBUILDs of prismlauncher or prismlauncher-qt5 either.

#27 2023-08-31 08:15:24

Lone_Wolf
Forum Moderator
From: Netherlands, Europe
Registered: 2005-10-04
Posts: 12,187

Searched for "amd faulty UTCL2 client" and found several results that indicate this may be a graphics firmware issue.

I find https://gitlab.freedesktop.org/drm/amd/-/issues/1677 especially interesting, and your card is from that same family.

I suggest you try older versions of linux-firmware using the Arch Linux Archive.


Disliking systemd intensely, but not satisfied with alternatives so focusing on taming systemd.


(A works at time B)  && (time C > time B ) ≠  (A works at time C)

#28 2023-08-31 18:56:37

LinuxSquare
Member
Registered: 2023-08-17
Posts: 16

Lone_Wolf wrote:

I suggest you try older versions of linux-firmware using the Arch Linux Archive

I've tried versions 20230404 and 20220610, and neither worked.
Even reverting to the LTS kernel 5.15.94 didn't help. (I chrooted into my Manjaro partition and noticed that I had been running a 5.15 LTS kernel and the 20230404 firmware there.)
The log is the same (I tried again with Plasma, since we now know it's unlikely to be a Plasma problem):
http://0x0.st/Hpun.txt

I do have a spare SSD lying around somewhere, so I could install Manjaro on it again and see whether the latest version crashes too.

It's really a mystery why I suddenly have such problems after switching from Manjaro to Arch Linux on completely new NVMe SSDs.

Update: Manjaro crashes as well, with the same error message. Kind of good, kind of bad. But now we know it's not just a vanilla Arch problem.

Last edited by LinuxSquare (2023-08-31 19:37:50)

#29 2023-08-31 19:58:07

seth
Member
Registered: 2012-09-03
Posts: 53,470

Did you rebuild the initramfs after downgrading the FW?
The difference between plasma and openbox is that the kwin_wayland compositor falls victim to the GPU hiccup and then throws a tantrum.

Also, judging from Lone_Wolf's fdo bug and https://bugzilla.kernel.org/show_bug.cgi?id=211157, the versions you tried won't cut it: the last good version seems to have been linux-firmware-20210315.3568f96 …

Edit: https://bugzilla.kernel.org/show_bug.cgi?id=211157#c11

Seems to be working fine on anything under linux-firmware version 20210517

I guess that's a language error and means "below" and that supports the fdo bug claim.
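To make the "below 20210517" reading concrete, here is a small Python sketch that compares linux-firmware version strings by their leading date. It assumes the version always starts with an eight-digit YYYYMMDD part, as in the versions named in this thread; the function names are made up for illustration.

```python
def firmware_date(version):
    """Return the leading YYYYMMDD part of a linux-firmware version as an int."""
    # Cut off anything after the first '.' or '-' (commit hash, pkgrel).
    date_part = version.split(".", 1)[0].split("-", 1)[0]
    if len(date_part) != 8 or not date_part.isdigit():
        raise ValueError(f"unexpected version format: {version!r}")
    return int(date_part)


def is_below(version, threshold="20210517"):
    """True if `version` is dated strictly before `threshold`."""
    return firmware_date(version) < firmware_date(threshold)


# Versions mentioned so far in this thread.
for v in ("20230404", "20220610", "20210315.3568f96"):
    print(v, "below 20210517:", is_below(v))
```

Of these three, only 20210315.3568f96 is dated before the threshold.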

Last edited by seth (2023-08-31 20:00:19)

#30 2023-09-01 18:33:33

LinuxSquare
Member
Registered: 2023-08-17
Posts: 16

seth wrote:

Did you rebuild the initramfs after downgrading the FW?

Unfortunately no, but this time when downgrading to 20210511, I did.

[linuxsquare@archtuxy2 ~]$ pacman -Q linux-firmware
linux-firmware 20210511.7685cf4-1

I downgraded to the above version, rebuilt the initramfs, rebooted and tried again, but unfortunately the error happened again.
http://0x0.st/HpLG.txt
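The downgrade steps described above can be sketched roughly as follows. This is only an illustration: it assumes the initramfs is built with mkinitcpio (other generators need a different command), and the package path is a hypothetical local file, not a path taken from this thread. By default the sketch only prints the commands; actually running them needs root.

```python
import shlex
import subprocess


def downgrade_commands(package_file):
    """Commands to install a local package file and rebuild all initramfs presets."""
    return [
        ["pacman", "-U", package_file],      # install the given package file
        ["mkinitcpio", "-P"],                # regenerate images for every preset
        ["pacman", "-Q", "linux-firmware"],  # confirm the installed version
    ]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print("$", shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


# Hypothetical path to a package file downloaded from the Arch Linux Archive.
plan = downgrade_commands("/tmp/linux-firmware-20210511.7685cf4-1-any.pkg.tar.zst")
run_all(plan)  # dry run: prints the commands without executing them
```

A reboot is still needed afterwards so that the older firmware is actually loaded.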

So I guess I'll have to buy a new graphics card then, maybe Nvidia this time, hmm.
(Although I'm not a fan of how the driver is packaged on Linux, since it requires two packages. I had an Nvidia card quite a while ago and ran into problems because nvidia-utils was released first and the nvidia package only got its update about two days later. Upgrading the system wasn't much fun when none of my monitors turned on. But that's clearly a different problem and probably more of a "me" problem.)

I don't know whether it's worth the hassle to troubleshoot even more when a clearly much older firmware version doesn't work for my system configuration, although it did for him:

Seems to be working fine on anything under linux-firmware version 20210517

I still have an older Nvidia card lying around and can test with it if you're interested in the results.
But I currently have neither the motivation nor the time for this, at least this weekend.

So maybe sometime next week.

Last edited by LinuxSquare (2023-09-01 18:37:03)

#31 2023-09-01 19:11:21

seth
Member
Registered: 2012-09-03
Posts: 53,470

Did you try 20210315? (I don't know how reliable the comment about "anything under" is, in particular what that user's previous version was, whereas the March version is explicitly confirmed to have been good.)
https://archive.archlinux.org/packages/ … kg.tar.zst

#32 2023-09-01 20:04:57

LinuxSquare
Member
Registered: 2023-08-17
Posts: 16

Unfortunately, even the March version crashed my system. (Installed it and rebuilt the initramfs, of course.)

I'll just post this short excerpt, since the output is always the same. I hope it's enough:

Sep 01 22:00:50 archtuxy2 kernel: amdgpu 0000:0d:00.0: amdgpu: [gfxhub0] no-retry page fault (src_id:0 ring:24 vmid:6 pasid:32783, for process java pid 1510 thread java:cs0 pid 1684)
Sep 01 22:00:50 archtuxy2 kernel: amdgpu 0000:0d:00.0: amdgpu:   in page starting at address 0x0000800115fb1000 from IH client 0x1b (UTCL2)
Sep 01 22:00:50 archtuxy2 kernel: amdgpu 0000:0d:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00601030
Sep 01 22:00:50 archtuxy2 kernel: amdgpu 0000:0d:00.0: amdgpu:          Faulty UTCL2 client ID: TCP (0x8)
Sep 01 22:00:50 archtuxy2 kernel: amdgpu 0000:0d:00.0: amdgpu:          MORE_FAULTS: 0x0
Sep 01 22:00:50 archtuxy2 kernel: amdgpu 0000:0d:00.0: amdgpu:          WALKER_ERROR: 0x0
Sep 01 22:00:50 archtuxy2 kernel: amdgpu 0000:0d:00.0: amdgpu:          PERMISSION_FAULTS: 0x3
Sep 01 22:00:50 archtuxy2 kernel: amdgpu 0000:0d:00.0: amdgpu:          MAPPING_ERROR: 0x0
Sep 01 22:00:50 archtuxy2 kernel: amdgpu 0000:0d:00.0: amdgpu:          RW: 0x0
Sep 01 22:01:00 archtuxy2 kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, but soft recovered
Sep 01 22:01:00 archtuxy2 kwin_wayland[874]: kwin_scene_opengl: A graphics reset not attributable to the current GL context occurred.
Sep 01 22:01:01 archtuxy2 kwin_wayland[874]: kwin_scene_opengl: Waiting for glGetGraphicsResetStatus to return GL_NO_ERROR timed out!
Sep 01 22:01:01 archtuxy2 kwin_wayland[874]: OpenGL vendor string:                   AMD
Sep 01 22:01:01 archtuxy2 kwin_wayland[874]: OpenGL renderer string:                 AMD Radeon RX Vega (vega10, LLVM 16.0.6, DRM 3.49, 6.1.50-1-lts)
Sep 01 22:01:01 archtuxy2 kwin_wayland[874]: OpenGL version string:                  4.6 (Core Profile) Mesa 23.1.6-arch1.4
Sep 01 22:01:01 archtuxy2 kwin_wayland[874]: OpenGL shading language version string: 4.60
Sep 01 22:01:01 archtuxy2 kwin_wayland[874]: Driver:                                 Unknown
Sep 01 22:01:01 archtuxy2 kwin_wayland[874]: GPU class:                              Unknown
Sep 01 22:01:01 archtuxy2 kwin_wayland[874]: OpenGL version:                         4.6
Sep 01 22:01:01 archtuxy2 kwin_wayland[874]: GLSL version:                           4.60
Sep 01 22:01:01 archtuxy2 kwin_wayland[874]: Mesa version:                           23.1.6
Sep 01 22:01:01 archtuxy2 kwin_wayland[874]: X server version:                       1.23.2
Sep 01 22:01:01 archtuxy2 kwin_wayland[874]: Linux kernel version:                   6.1.50
Sep 01 22:01:01 archtuxy2 kwin_wayland[874]: Requires strict binding:                no
Sep 01 22:01:01 archtuxy2 kwin_wayland[874]: GLSL shaders:                           yes
Sep 01 22:01:01 archtuxy2 kwin_wayland[874]: Texture NPOT support:                   yes
Sep 01 22:01:01 archtuxy2 kwin_wayland[874]: Virtual Machine:                        no
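Since the output is always the same, comparing runs gets easier if the key fields are pulled out of the journal lines. Below is a minimal Python sketch that does this for the amdgpu fault lines quoted above. The regular expressions are written against exactly this excerpt and are an assumption about the format, not a description of what the driver guarantees. The sketch does not try to interpret the register values.

```python
import re

# Patterns fitted to the log excerpt above (assumed format).
FAULT_RE = re.compile(
    r"no-retry page fault \(.*for process (?P<process>\S+) pid (?P<pid>\d+)"
)
CLIENT_RE = re.compile(r"Faulty UTCL2 client ID: (?P<client>\w+) \(0x[0-9a-fA-F]+\)")
FIELD_RE = re.compile(r"amdgpu:\s+(?P<name>[A-Z_0-9]+):\s*(?P<value>0x[0-9a-fA-F]+)")


def summarize(lines):
    """Extract process, pid, faulty client and hex fields from amdgpu fault lines."""
    summary = {}
    for line in lines:
        if m := FAULT_RE.search(line):
            summary["process"] = m["process"]
            summary["pid"] = int(m["pid"])
        elif m := CLIENT_RE.search(line):
            summary["client"] = m["client"]
        elif m := FIELD_RE.search(line):
            summary[m["name"]] = int(m["value"], 16)
    return summary


# A few lines copied from the excerpt above.
log = """\
Sep 01 22:00:50 archtuxy2 kernel: amdgpu 0000:0d:00.0: amdgpu: [gfxhub0] no-retry page fault (src_id:0 ring:24 vmid:6 pasid:32783, for process java pid 1510 thread java:cs0 pid 1684)
Sep 01 22:00:50 archtuxy2 kernel: amdgpu 0000:0d:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00601030
Sep 01 22:00:50 archtuxy2 kernel: amdgpu 0000:0d:00.0: amdgpu:          Faulty UTCL2 client ID: TCP (0x8)
Sep 01 22:00:50 archtuxy2 kernel: amdgpu 0000:0d:00.0: amdgpu:          PERMISSION_FAULTS: 0x3
""".splitlines()

for key, value in summarize(log).items():
    print(f"{key}: {value}")
```

For the sample lines this reports the faulting process as java (pid 1510) and the faulty client as TCP.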

#33 2023-10-16 17:56:28

LinuxSquare
Member
Registered: 2023-08-17
Posts: 16

It seems there is a problem within Mesa when running Minecraft with Sodium, as stated here: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9925 and here: https://github.com/CaffeineMC/sodium-fabric/issues/1792
When I remove Sodium from Minecraft, it works. Since there are open issues on freedesktop's GitLab and on GitHub, I don't think you can do much to solve this problem.

But I do have to thank you for your patience and helpful comments while trying to solve this problem.
I now know that I'm not alone with this issue, which is such a load off my mind.

Thank you. Marking this thread as "solved", since a workaround is known and working.

For future readers with the same issue and specs: head to the GitHub issue for possible solutions.
What worked best for me was to simply remove Sodium and all of its dependents from the mods, as long as you do not depend on it.
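If you would rather set the mod aside than delete it, something like the following Python sketch could do it. It is only an illustration: it matches jar files by file name, so it will not detect mods that merely depend on Sodium, and the folder and file names in the demo are made up. Point `mods_dir` at the mods folder of your own instance.

```python
import shutil
import tempfile
from pathlib import Path


def set_aside_mods(mods_dir, keyword="sodium", backup_name="disabled-mods"):
    """Move mod jars whose file name contains `keyword` into a backup folder."""
    mods_dir = Path(mods_dir)
    backup_dir = mods_dir.parent / backup_name  # sibling of the mods folder
    moved = []
    for jar in sorted(mods_dir.glob("*.jar")):
        if keyword in jar.name.lower():
            backup_dir.mkdir(exist_ok=True)
            shutil.move(str(jar), str(backup_dir / jar.name))
            moved.append(jar.name)
    return moved


# Demo in a temporary folder with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    mods = Path(tmp) / "mods"
    mods.mkdir()
    for name in ("sodium-example.jar", "other-mod.jar"):
        (mods / name).touch()
    print(set_aside_mods(mods))                     # ['sodium-example.jar']
    print(sorted(p.name for p in mods.iterdir()))   # ['other-mod.jar']
```

Moving the files back into the mods folder restores the previous setup once the upstream issue is resolved.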
