
#1 2023-02-05 23:14:54

norihiori
Member
Registered: 2023-01-26
Posts: 3

Thinkpad T14s Gen3 AMD Drivers GPU reset

Hi, on my ThinkPad T14s Gen 3 AMD I sometimes run into a problem: the screen freezes while the sound keeps playing. I can't switch to a TTY. Sometimes the problem resolves itself without a reboot, after a short black screen.

The Arch wiki page for this model describes a problem with GPU resets. But since I get neither the same error message nor the same behavior, I'm not sure it is the same issue.
https://wiki.archlinux.org/title/Lenovo … _Gen_3#GPU
What do you think about it?

Uname

Linux Arrakis 6.1.8-hardened1-1-hardened #1 SMP PREEMPT_DYNAMIC Tue, 24 Jan 2023 17:22:34 +0000 x86_64 GNU/Linux

Journalctl (it recovered by itself this time)

févr. 05 23:52:33 Arrakis kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=864066, emitted seq=864068
févr. 05 23:52:33 Arrakis kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process  pid 0 thread  pid 0
févr. 05 23:52:33 Arrakis kernel: amdgpu 0000:33:00.0: amdgpu: GPU reset begin!
févr. 05 23:52:34 Arrakis kernel: amdgpu 0000:33:00.0: amdgpu: free PSP TMR buffer
févr. 05 23:52:34 Arrakis kernel: amdgpu 0000:33:00.0: amdgpu: MODE2 reset
févr. 05 23:52:34 Arrakis kernel: amdgpu 0000:33:00.0: amdgpu: GPU reset succeeded, trying to resume
févr. 05 23:52:34 Arrakis kernel: [drm] PCIE GART of 1024M enabled (table at 0x000000F43FC00000).
févr. 05 23:52:34 Arrakis kernel: [drm] PSP is resuming...
févr. 05 23:52:34 Arrakis kernel: [drm] reserve 0xa00000 from 0xf43e000000 for PSP TMR
févr. 05 23:52:34 Arrakis kernel: amdgpu 0000:33:00.0: amdgpu: RAS: optional ras ta ucode is not available
févr. 05 23:52:34 Arrakis kernel: amdgpu 0000:33:00.0: amdgpu: RAP: optional rap ta ucode is not available
févr. 05 23:52:34 Arrakis kernel: amdgpu 0000:33:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
févr. 05 23:52:34 Arrakis kernel: amdgpu 0000:33:00.0: amdgpu: SMU is resuming...
févr. 05 23:52:34 Arrakis kernel: amdgpu 0000:33:00.0: amdgpu: SMU is resumed successfully!
févr. 05 23:52:34 Arrakis kernel: [drm] DMUB hardware initialized: version=0x0400002E
févr. 05 23:52:35 Arrakis kernel: [drm] Watermarks table not configured properly by SMU
févr. 05 23:52:35 Arrakis kernel: [drm] kiq ring mec 2 pipe 1 q 0
févr. 05 23:52:35 Arrakis kernel: [drm] VCN decode and encode initialized successfully(under DPG Mode).
févr. 05 23:52:35 Arrakis kernel: [drm] JPEG decode initialized successfully.
févr. 05 23:52:35 Arrakis kernel: amdgpu 0000:33:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
févr. 05 23:52:35 Arrakis kernel: amdgpu 0000:33:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
févr. 05 23:52:35 Arrakis kernel: amdgpu 0000:33:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
févr. 05 23:52:35 Arrakis kernel: amdgpu 0000:33:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0
févr. 05 23:52:35 Arrakis kernel: amdgpu 0000:33:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0
févr. 05 23:52:35 Arrakis kernel: amdgpu 0000:33:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0
févr. 05 23:52:35 Arrakis kernel: amdgpu 0000:33:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0
févr. 05 23:52:35 Arrakis kernel: amdgpu 0000:33:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 on hub 0
févr. 05 23:52:35 Arrakis kernel: amdgpu 0000:33:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 on hub 0
févr. 05 23:52:35 Arrakis kernel: amdgpu 0000:33:00.0: amdgpu: ring kiq_2.1.0 uses VM inv eng 11 on hub 0
févr. 05 23:52:35 Arrakis kernel: amdgpu 0000:33:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
févr. 05 23:52:35 Arrakis kernel: amdgpu 0000:33:00.0: amdgpu: ring vcn_dec_0 uses VM inv eng 0 on hub 1
févr. 05 23:52:35 Arrakis kernel: amdgpu 0000:33:00.0: amdgpu: ring vcn_enc_0.0 uses VM inv eng 1 on hub 1
févr. 05 23:52:35 Arrakis kernel: amdgpu 0000:33:00.0: amdgpu: ring vcn_enc_0.1 uses VM inv eng 4 on hub 1
févr. 05 23:52:35 Arrakis kernel: amdgpu 0000:33:00.0: amdgpu: ring jpeg_dec uses VM inv eng 5 on hub 1
févr. 05 23:52:35 Arrakis kernel: amdgpu 0000:33:00.0: amdgpu: recover vram bo from shadow start
févr. 05 23:52:35 Arrakis kernel: amdgpu 0000:33:00.0: amdgpu: recover vram bo from shadow done
févr. 05 23:52:35 Arrakis kernel: amdgpu 0000:33:00.0: amdgpu: GPU reset(1) succeeded!
févr. 05 23:52:35 Arrakis firefoxdeveloperedition.desktop[20868]: [GFX1-]: GFX: RenderThread detected a device reset in PostUpdate
févr. 05 23:52:35 Arrakis io.element.Element.desktop[3381944]: [3381944:0205/235235.693929:ERROR:shared_context_state.cc(859)] SharedContextState context lost via ARB/EXT_robustness. Reset status = GL_INNOCENT_CONTEXT_RESET_KHR
févr. 05 23:52:35 Arrakis io.element.Element.desktop[3381944]: [3381944:0205/235235.694150:ERROR:gpu_service_impl.cc(988)] Exiting GPU process because some drivers can't recover from errors. GPU process will restart shortly.
févr. 05 23:52:35 Arrakis io.element.Element.desktop[3381906]: [3381906:0205/235235.723057:ERROR:gpu_process_host.cc(991)] GPU process exited unexpectedly: exit_code=8704
févr. 05 23:52:42 Arrakis gnome-shell[17588]: ../mutter/clutter/clutter/clutter-actor.c:9047: Actor '<unnamed>[<ClutterClone>:0x13ef60b2a20]' tried to allocate a size of 0,00 x -19,56
févr. 05 23:52:42 Arrakis gnome-shell[17588]: ../mutter/clutter/clutter/clutter-actor.c:9047: Actor '<unnamed>[<ClutterClone>:0x13ef60b2a20]' tried to allocate a size of 5,00 x -17,56
févr. 05 23:52:42 Arrakis gnome-shell[17588]: ../mutter/clutter/clutter/clutter-actor.c:9047: Actor '<unnamed>[<ClutterClone>:0x13ef60b2a20]' tried to allocate a size of 17,00 x -14,56
févr. 05 23:52:42 Arrakis gnome-shell[17588]: ../mutter/clutter/clutter/clutter-actor.c:9047: Actor '<unnamed>[<ClutterClone>:0x13ef60b2a20]' tried to allocate a size of 37,00 x -9,56

Last edited by norihiori (2023-02-05 23:37:56)


#2 2023-02-06 06:37:19

Head_on_a_Stick
Member
From: London
Registered: 2014-02-20
Posts: 7,732

Re: Thinkpad T14s Gen3 AMD Drivers GPU reset

Have you tried using amd-pstate? I have a P14s Gen 2a and it's been working very well with my hardware for several months now. The running temperatures are significantly lower compared to the ACPI version.
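
To check which CPU frequency driver is actually in use, something like the following should work. It is only a minimal sketch: it assumes the usual cpufreq sysfs file for `cpu0` exists and that the driver name starts with `amd-pstate` when amd-pstate is active (the function names are made up for illustration).

```python
from pathlib import Path
from typing import Optional

# Usual cpufreq sysfs location; cpu0 is taken as a representative core (assumption).
SCALING_DRIVER = Path("/sys/devices/system/cpu/cpu0/cpufreq/scaling_driver")


def read_scaling_driver(path: Path = SCALING_DRIVER) -> Optional[str]:
    """Return the active cpufreq driver name, or None if the file cannot be read."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


def uses_amd_pstate(driver: Optional[str]) -> bool:
    """True if the reported driver name looks like an amd-pstate variant."""
    # Assumption: amd-pstate variants report a name starting with "amd-pstate",
    # while the ACPI-based driver reports "acpi-cpufreq".
    return driver is not None and driver.startswith("amd-pstate")


if __name__ == "__main__":
    driver = read_scaling_driver()
    print(f"scaling_driver: {driver!r}, amd-pstate in use: {uses_amd_pstate(driver)}")
```

If this reports `acpi-cpufreq`, the kernel is still using the ACPI version.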


#3 2023-02-15 11:38:18

norihiori
Member
Registered: 2023-01-26
Posts: 3

Re: Thinkpad T14s Gen3 AMD Drivers GPU reset

Okay, I just enabled it. I'll only know whether it works if the problem doesn't show up for a month :(
Thank you.


#4 2023-03-18 13:14:42

WholesomeDoktor
Member
Registered: 2021-08-25
Posts: 10

Re: Thinkpad T14s Gen3 AMD Drivers GPU reset

Did amd-pstate solve your issue? I hadn't faced this problem until the recent kernel updates. Would love to hear from you.


#5 2023-04-22 22:42:10

dfsbbl
Member
Registered: 2022-12-26
Posts: 5

Re: Thinkpad T14s Gen3 AMD Drivers GPU reset

WholesomeDoktor wrote:

Did amd-pstate solve your issue? I hadn't faced this problem until the recent kernel updates. Would love to hear from you.

Can confirm that the same issue happens on the HP EliteBook 845 G9, which uses the same Ryzen 6000 series APU. I'm currently using the amd-pstate driver for power management, but my screen still occasionally freezes with the same

[drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=xxx, emitted seq=yyy

error, followed by a GPU reset. My guess is that, for now, both the default ACPI driver and the amd-pstate driver are buggy.
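
For comparing logs between machines, the timeout line can be picked apart with a small script. This is just a sketch: the regular expression is modelled on the message format quoted in this thread, and `parse_ring_timeout` is a made-up helper name, not anything from the driver.

```python
import re

# Pattern modelled on the kernel message quoted in this thread.
TIMEOUT_RE = re.compile(
    r"\*ERROR\* ring (?P<ring>\S+) timeout, "
    r"signaled seq=(?P<signaled>\d+), emitted seq=(?P<emitted>\d+)"
)


def parse_ring_timeout(line):
    """Return (ring, signaled_seq, emitted_seq) or None if the line does not match."""
    match = TIMEOUT_RE.search(line)
    if match is None:
        return None
    return match["ring"], int(match["signaled"]), int(match["emitted"])


if __name__ == "__main__":
    sample = (
        "[drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, "
        "signaled seq=864066, emitted seq=864068"
    )
    ring, signaled, emitted = parse_ring_timeout(sample)
    # The gap between emitted and signaled is the number of jobs still outstanding.
    print(ring, emitted - signaled)
```

Feeding it the lines from `journalctl -k` shows which ring timed out each time and how many jobs were still pending.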

Last edited by dfsbbl (2023-04-22 22:42:31)


#6 2023-05-11 12:52:22

norihiori
Member
Registered: 2023-01-26
Posts: 3

Re: Thinkpad T14s Gen3 AMD Drivers GPU reset

I still have the problem, and with `amd_pstate=active` it's worse.
Now, with `amd_pstate=passive`, I get a freeze, then a black screen, and then everything comes back, but after that suspend (when closing the lid) is unstable.
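
To double-check which `amd_pstate=` value the running kernel was actually booted with, the kernel command line can be inspected. A minimal sketch, assuming `/proc/cmdline` is readable and that the last occurrence of the parameter is the one that counts (`amd_pstate_mode` is a hypothetical helper name):

```python
from pathlib import Path


def amd_pstate_mode(cmdline):
    """Return the value of the last amd_pstate= token on the command line, or None."""
    mode = None
    for token in cmdline.split():
        if token.startswith("amd_pstate="):
            # Assumption: if the parameter is given twice, the later one is used.
            mode = token.split("=", 1)[1]
    return mode


if __name__ == "__main__":
    try:
        print(amd_pstate_mode(Path("/proc/cmdline").read_text()))
    except OSError:
        print("could not read /proc/cmdline")
```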

The new journalctl output is:
```
mai 11 14:32:34 Arrakis kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=4519096, emitted seq=4519098
mai 11 14:32:34 Arrakis kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process  pid 0 thread  pid 0
mai 11 14:32:34 Arrakis kernel: amdgpu 0000:33:00.0: amdgpu: GPU reset begin!
mai 11 14:32:35 Arrakis kernel: amdgpu 0000:33:00.0: amdgpu: free PSP TMR buffer
mai 11 14:32:35 Arrakis kernel: amdgpu 0000:33:00.0: amdgpu: MODE2 reset
mai 11 14:32:35 Arrakis kernel: amdgpu 0000:33:00.0: amdgpu: GPU reset succeeded, trying to resume
mai 11 14:32:35 Arrakis kernel: [drm] PCIE GART of 1024M enabled (table at 0x000000F43FC00000).
mai 11 14:32:35 Arrakis kernel: [drm] PSP is resuming...
mai 11 14:32:35 Arrakis kernel: [drm] reserve 0xa00000 from 0xf43e000000 for PSP TMR
mai 11 14:32:36 Arrakis kernel: amdgpu 0000:33:00.0: amdgpu: RAS: optional ras ta ucode is not available
mai 11 14:32:36 Arrakis kernel: amdgpu 0000:33:00.0: amdgpu: RAP: optional rap ta ucode is not available
mai 11 14:32:36 Arrakis kernel: amdgpu 0000:33:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
mai 11 14:32:36 Arrakis kernel: amdgpu 0000:33:00.0: amdgpu: SMU is resuming...
mai 11 14:32:36 Arrakis kernel: amdgpu 0000:33:00.0: amdgpu: SMU is resumed successfully!
mai 11 14:32:36 Arrakis kernel: [drm] DMUB hardware initialized: version=0x0400002E
mai 11 14:32:41 Arrakis kernel: [drm] Watermarks table not configured properly by SMU
mai 11 14:32:41 Arrakis kernel: amdgpu 0000:33:00.0: [drm] *ERROR* Step 2 of creating MST payload for 00000000fd12d253 failed: -5
mai 11 14:32:41 Arrakis kernel: [drm] kiq ring mec 2 pipe 1 q 0
mai 11 14:32:41 Arrakis kernel: [drm] VCN decode and encode initialized successfully(under DPG Mode).
mai 11 14:32:41 Arrakis kernel: [drm] JPEG decode initialized successfully.
mai 11 14:32:41 Arrakis kernel: amdgpu 0000:33:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
mai 11 14:32:41 Arrakis kernel: amdgpu 0000:33:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
mai 11 14:32:41 Arrakis kernel: amdgpu 0000:33:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
mai 11 14:32:41 Arrakis kernel: amdgpu 0000:33:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0
mai 11 14:32:41 Arrakis kernel: amdgpu 0000:33:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0
mai 11 14:32:41 Arrakis kernel: amdgpu 0000:33:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0
mai 11 14:32:41 Arrakis kernel: amdgpu 0000:33:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0
mai 11 14:32:41 Arrakis kernel: amdgpu 0000:33:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 on hub 0
mai 11 14:32:41 Arrakis kernel: amdgpu 0000:33:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 on hub 0
mai 11 14:32:41 Arrakis kernel: amdgpu 0000:33:00.0: amdgpu: ring kiq_2.1.0 uses VM inv eng 11 on hub 0
mai 11 14:32:41 Arrakis kernel: amdgpu 0000:33:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
mai 11 14:32:41 Arrakis kernel: amdgpu 0000:33:00.0: amdgpu: ring vcn_dec_0 uses VM inv eng 0 on hub 1
mai 11 14:32:41 Arrakis kernel: amdgpu 0000:33:00.0: amdgpu: ring vcn_enc_0.0 uses VM inv eng 1 on hub 1
mai 11 14:32:41 Arrakis kernel: amdgpu 0000:33:00.0: amdgpu: ring vcn_enc_0.1 uses VM inv eng 4 on hub 1
mai 11 14:32:41 Arrakis kernel: amdgpu 0000:33:00.0: amdgpu: ring jpeg_dec uses VM inv eng 5 on hub 1
mai 11 14:32:41 Arrakis kernel: amdgpu 0000:33:00.0: amdgpu: recover vram bo from shadow start
mai 11 14:32:41 Arrakis kernel: amdgpu 0000:33:00.0: amdgpu: recover vram bo from shadow done
mai 11 14:32:41 Arrakis kernel: amdgpu 0000:33:00.0: amdgpu: GPU reset(4) succeeded!
```

The last line says `GPU reset succeeded`... Well, I hope so ^^
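
To compare this reset with the earlier one, the reset counter and the time between `GPU reset begin!` and `GPU reset(N) succeeded!` can be extracted from the journal lines. This is a rough sketch: it only looks at the `HH:MM:SS` part of each timestamp (so the localized month name does not matter), and it assumes the number in parentheses is a running reset count. `summarize_resets` is a made-up name.

```python
import re

TIME_RE = re.compile(r"\b(\d{2}):(\d{2}):(\d{2})\b")
RESET_DONE_RE = re.compile(r"GPU reset\((\d+)\) succeeded!")


def seconds_of_day(line):
    """Return the first HH:MM:SS in the line as seconds since midnight, or None."""
    match = TIME_RE.search(line)
    if match is None:
        return None
    hours, minutes, seconds = map(int, match.groups())
    return hours * 3600 + minutes * 60 + seconds


def summarize_resets(lines):
    """Return a list of (counter, duration_in_seconds), one per completed reset."""
    results = []
    begin = None
    for line in lines:
        if "GPU reset begin!" in line:
            begin = seconds_of_day(line)
            continue
        done = RESET_DONE_RE.search(line)
        if done and begin is not None:
            end = seconds_of_day(line)
            if end is not None:
                # Modulo keeps the duration sane if a reset spans midnight.
                results.append((int(done.group(1)), (end - begin) % 86400))
            begin = None
    return results


if __name__ == "__main__":
    log = [
        "mai 11 14:32:34 Arrakis kernel: amdgpu 0000:33:00.0: amdgpu: GPU reset begin!",
        "mai 11 14:32:41 Arrakis kernel: amdgpu 0000:33:00.0: amdgpu: GPU reset(4) succeeded!",
    ]
    print(summarize_resets(log))
```

On the two logs in this thread, the February reset took about 2 seconds and this one about 7 seconds.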
