
#1 2025-12-16 16:07:30

0xFFFFF
Member
Registered: 2025-12-16
Posts: 1

RX 9070XT eGPU reset and system freeze

Hello,

I have a fairly old Asus laptop (i7-6700HQ + Nvidia GTX 960M) and I tried plugging my AMD RX 9070 XT eGPU into its Thunderbolt 3 port. Unfortunately, the amdgpu module fails and the whole system freezes after 10-15 s (I can't even switch to another tty).

[   70.228651] amdgpu 0000:0b:00.0: [drm] Cannot find any crtc or sizes
[   80.625472] amdgpu 0000:0b:00.0: amdgpu: Dumping IP State
[   80.630040] amdgpu 0000:0b:00.0: amdgpu: Dumping IP State Completed
[   80.630105] amdgpu 0000:0b:00.0: amdgpu: [drm] AMDGPU device coredump file has been created
[   80.630109] amdgpu 0000:0b:00.0: amdgpu: [drm] Check your /sys/class/drm/card2/device/devcoredump/data
[   80.630112] amdgpu 0000:0b:00.0: amdgpu: ring sdma1 timeout, signaled seq=6, emitted seq=8
[   80.630115] amdgpu 0000:0b:00.0: amdgpu: Starting sdma1 ring reset
[   80.630154] amdgpu 0000:0b:00.0: amdgpu: reset sdma queue (1:0:0)
[   81.123221] amdgpu 0000:0b:00.0: amdgpu: failed to wait on sdma queue reset done
[   81.123229] amdgpu 0000:0b:00.0: amdgpu: failed to reset legacy queue
[   81.123230] amdgpu 0000:0b:00.0: amdgpu: Ring sdma1 reset failed
[   81.123233] amdgpu 0000:0b:00.0: amdgpu: GPU reset begin!
[   85.063641] amdgpu 0000:0b:00.0: amdgpu: failed to suspend display audio
[   87.502330] amdgpu 0000:0b:00.0: amdgpu: MES(1) failed to respond to msg=REMOVE_QUEUE
[   87.502335] amdgpu 0000:0b:00.0: amdgpu: failed to unmap legacy queue
[   89.802731] amdgpu 0000:0b:00.0: amdgpu: MES(1) failed to respond to msg=REMOVE_QUEUE
[   89.802755] amdgpu 0000:0b:00.0: amdgpu: failed to unmap legacy queue
[   92.098902] amdgpu 0000:0b:00.0: amdgpu: MES(1) failed to respond to msg=REMOVE_QUEUE
[   92.098923] amdgpu 0000:0b:00.0: amdgpu: failed to unmap legacy queue
[   94.369147] amdgpu 0000:0b:00.0: amdgpu: MES(1) failed to respond to msg=REMOVE_QUEUE
[   94.369152] amdgpu 0000:0b:00.0: amdgpu: failed to unmap legacy queue
[   96.632588] amdgpu 0000:0b:00.0: amdgpu: MES(1) failed to respond to msg=REMOVE_QUEUE
[   96.632594] amdgpu 0000:0b:00.0: amdgpu: failed to unmap legacy queue
[   98.948323] amdgpu 0000:0b:00.0: amdgpu: MES(1) failed to respond to msg=REMOVE_QUEUE
[   98.948334] amdgpu 0000:0b:00.0: amdgpu: failed to unmap legacy queue
[   98.982596] amdgpu 0000:0b:00.0: amdgpu: MODE1 reset
[   98.982603] amdgpu 0000:0b:00.0: amdgpu: GPU mode1 reset
[   98.982905] amdgpu 0000:0b:00.0: amdgpu: GPU smu mode1 reset
[  100.005187] amdgpu 0000:0b:00.0: amdgpu: GPU reset succeeded, trying to resume
[  100.005516] amdgpu 0000:0b:00.0: amdgpu: PCIE GART of 512M enabled (table at 0x0000008000000000).
[  100.005771] amdgpu 0000:0b:00.0: amdgpu: VRAM is lost due to GPU reset!
[  100.005773] amdgpu 0000:0b:00.0: amdgpu: PSP is resuming...
[  104.925505] amdgpu 0000:0b:00.0: amdgpu: PSP load kdb failed!
[  104.925508] amdgpu 0000:0b:00.0: amdgpu: PSP resume failed
[  104.925510] amdgpu 0000:0b:00.0: amdgpu: resume of IP block <psp> failed -62
[  104.925513] amdgpu 0000:0b:00.0: amdgpu: GPU reset end with ret = -62
[  104.925515] amdgpu 0000:0b:00.0: amdgpu: GPU Recovery Failed: -62

I tried several kernel parameters:

amdgpu.audio=0 amdgpu.dpm=0 amdgpu.bapm=0 amdgpu.aspm=0 amdgpu.runpm=0

Using amdgpu.dpm=0 prevents the freeze, but it also prevents the GPU from initializing:

[   61.859021] amdgpu: smu firmware loading failed
[   61.859024] amdgpu 0000:0b:00.0: amdgpu: amdgpu_device_ip_init failed
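Before comparing results between parameter sets, it can help to confirm that the parameters actually reached the running kernel. The following is only a minimal sketch: the helper names `parse_cmdline` and `missing_params` are made up for illustration, and it uses a plain whitespace split, so it assumes no parameter value contains quoted spaces.

```python
def parse_cmdline(cmdline: str) -> dict[str, str]:
    """Split a kernel command line into {parameter: value}.

    Bare flags such as "quiet" map to an empty string. Quoted values
    containing spaces are NOT handled (assumption: none are used here).
    """
    params = {}
    for token in cmdline.split():
        key, _, value = token.partition("=")
        params[key] = value
    return params


def missing_params(cmdline: str, wanted: dict[str, str]) -> list[str]:
    """Return the wanted "key=value" pairs that are absent or set differently."""
    active = parse_cmdline(cmdline)
    return [f"{k}={v}" for k, v in wanted.items() if active.get(k) != v]


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the command line the running kernel was booted with."""
    with open(path) as f:
        return f.read()
```

For example, `missing_params(read_cmdline(), {"amdgpu.dpm": "0", "amdgpu.runpm": "0"})` returns an empty list when both parameters are active. Once the module is loaded, the values can usually also be cross-checked under /sys/module/amdgpu/parameters/, assuming the parameter in question is exported there.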

I suspected odd autosuspend and power-management behavior, so I tested

usbcore.autosuspend=-1 amdgpu.runpm=0 amdgpu.aspm=0

but without any success:

https://0x0.st/PrVl.txt

This eGPU setup works fine on another laptop, so I don't understand what the core issue is here.

With the LTS kernel (and no other config changes), I get the same problem: https://0x0.st/PrWT.txt

déc. 16 17:19:23 Archoum kernel: xhci_hcd 0000:0c:00.0: WARN: xHC save state timeout
déc. 16 17:19:23 Archoum kernel: xhci_hcd 0000:0c:00.0: PM: suspend_common(): xhci_pci_suspend returns -110
déc. 16 17:19:23 Archoum kernel: xhci_hcd 0000:0c:00.0: can't suspend (hcd_pci_runtime_suspend returned -110)
déc. 16 17:19:23 Archoum kernel: snd_hda_intel 0000:0b:00.1: azx_get_response timeout, switching to polling mode: last cmd=0x000f0000
déc. 16 17:19:24 Archoum kernel: snd_hda_intel 0000:0b:00.1: No response from codec, disabling MSI: last cmd=0x000f0000
déc. 16 17:19:25 Archoum kernel: snd_hda_intel 0000:0b:00.1: Codec #0 probe error; disabling it...
déc. 16 17:19:25 Archoum kernel: snd_hda_intel 0000:0b:00.1: no codecs initialized
déc. 16 17:19:32 Archoum kernel: amdgpu 0000:0b:00.0: amdgpu: Dumping IP State
déc. 16 17:19:32 Archoum kernel: amdgpu 0000:0b:00.0: amdgpu: Dumping IP State Completed
déc. 16 17:19:32 Archoum kernel: amdgpu 0000:0b:00.0: amdgpu: ring sdma0 timeout, signaled seq=8, emitted seq=10
déc. 16 17:19:32 Archoum kernel: amdgpu 0000:0b:00.0: amdgpu: GPU reset begin!
déc. 16 17:19:36 Archoum kernel: amdgpu 0000:0b:00.0: amdgpu: failed to suspend display audio
déc. 16 17:19:39 Archoum kernel: amdgpu 0000:0b:00.0: amdgpu: MES(1) failed to respond to msg=REMOVE_QUEUE
déc. 16 17:19:39 Archoum kernel: [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
déc. 16 17:19:41 Archoum kernel: amdgpu 0000:0b:00.0: amdgpu: MES(1) failed to respond to msg=REMOVE_QUEUE
déc. 16 17:19:41 Archoum kernel: [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
déc. 16 17:19:43 Archoum kernel: amdgpu 0000:0b:00.0: amdgpu: MES(1) failed to respond to msg=REMOVE_QUEUE
déc. 16 17:19:43 Archoum kernel: [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
déc. 16 17:19:46 Archoum kernel: amdgpu 0000:0b:00.0: amdgpu: MES(1) failed to respond to msg=REMOVE_QUEUE
déc. 16 17:19:46 Archoum kernel: [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
déc. 16 17:19:48 Archoum kernel: amdgpu 0000:0b:00.0: amdgpu: MES(1) failed to respond to msg=REMOVE_QUEUE
déc. 16 17:19:48 Archoum kernel: [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
déc. 16 17:19:50 Archoum kernel: amdgpu 0000:0b:00.0: amdgpu: MES(1) failed to respond to msg=REMOVE_QUEUE
déc. 16 17:19:50 Archoum kernel: [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
déc. 16 17:19:50 Archoum kernel: amdgpu 0000:0b:00.0: amdgpu: MODE1 reset
déc. 16 17:19:50 Archoum kernel: amdgpu 0000:0b:00.0: amdgpu: GPU mode1 reset
déc. 16 17:19:50 Archoum kernel: amdgpu 0000:0b:00.0: amdgpu: GPU smu mode1 reset
déc. 16 17:19:51 Archoum kernel: amdgpu 0000:0b:00.0: amdgpu: GPU reset succeeded, trying to resume
déc. 16 17:19:51 Archoum kernel: amdgpu 0000:0b:00.0: amdgpu: PCIE GART of 512M enabled (table at 0x0000008000000000).
déc. 16 17:19:51 Archoum kernel: [drm] VRAM is lost due to GPU reset!
déc. 16 17:19:51 Archoum kernel: amdgpu 0000:0b:00.0: amdgpu: PSP is resuming...
déc. 16 17:19:55 Archoum kernel: amdgpu 0000:0b:00.0: amdgpu: PSP load kdb failed!
déc. 16 17:19:55 Archoum kernel: amdgpu 0000:0b:00.0: amdgpu: PSP resume failed
déc. 16 17:19:55 Archoum kernel: [drm:amdgpu_device_fw_loading [amdgpu]] *ERROR* resume of IP block <psp> failed -62
déc. 16 17:19:55 Archoum kernel: amdgpu 0000:0b:00.0: amdgpu: GPU reset(1) failed
déc. 16 17:19:55 Archoum kernel: amdgpu 0000:0b:00.0: amdgpu: GPU reset end with ret = -62
déc. 16 17:19:55 Archoum kernel: amdgpu 0000:0b:00.0: amdgpu: GPU Recovery Failed: -62
déc. 16 17:20:06 Archoum kernel: amdgpu 0000:0b:00.0: amdgpu: Dumping IP State
déc. 16 17:20:06 Archoum kernel: amdgpu 0000:0b:00.0: amdgpu: Dumping IP State Completed
déc. 16 17:20:06 Archoum kernel: amdgpu 0000:0b:00.0: amdgpu: ring sdma1 timeout, signaled seq=12, emitted seq=15
déc. 16 17:20:06 Archoum kernel: amdgpu 0000:0b:00.0: amdgpu: GPU reset begin!
déc. 16 17:20:10 Archoum kernel: amdgpu 0000:0b:00.0: amdgpu: failed to suspend display audio
déc. 16 17:20:10 Archoum kernel: amdgpu 0000:0b:00.0: amdgpu: Failed to disallow df cstate
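The negative return codes in the logs above (-62 and -110) are Linux errno values, and both are timeouts. A rough sketch for pulling them out of a saved log, along with the ring timeouts, could look like the following. The function name `summarize` and the regular expressions are my own and only cover the message wording that appears in this post. It summarizes the log and does not diagnose the cause.

```python
import re

# Linux errno values that appear in the logs above
# (include/uapi/asm-generic/errno.h).
ERRNO_NAMES = {
    62: "ETIME (timer expired)",
    110: "ETIMEDOUT (connection timed out)",
}

# Matches e.g. "failed -62", "Failed: -62", "ret = -62", "returns -110".
CODE_RE = re.compile(r"(?:failed|Failed:|returns|returned|ret =)\s*(-\d+)")
# Matches e.g. "ring sdma1 timeout, signaled seq=6, emitted seq=8".
RING_RE = re.compile(r"ring (\w+) timeout, signaled seq=(\d+), emitted seq=(\d+)")


def summarize(lines):
    """Return (ring timeouts, decoded return codes) for an iterable of log lines.

    Ring timeouts are (ring name, emitted - signaled) tuples, i.e. how many
    submitted jobs had not completed when the timeout fired.
    """
    rings, codes = [], set()
    for line in lines:
        m = RING_RE.search(line)
        if m:
            rings.append((m.group(1), int(m.group(3)) - int(m.group(2))))
        for code in CODE_RE.findall(line):
            codes.add(int(code))
    decoded = {c: ERRNO_NAMES.get(-c, "unknown") for c in sorted(codes)}
    return rings, decoded
```

It can be fed the output of `journalctl -k` or `dmesg` saved to a file, for example `summarize(open("kernel.log"))`, where `kernel.log` is a placeholder file name.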

Any ideas on what I can try next?

Thanks

Last edited by 0xFFFFF (2025-12-16 16:39:57)
