
#1 2026-01-20 19:25:17

Aqualung
Member
Registered: 2025-09-07
Posts: 45

AMD eGPU over Thunderbolt: crashing like there's no tomorrow

I have been wrestling with this issue for at least a week, day in and day out: how to make my new RX 9060 XT work as an eGPU over Thunderbolt. I have been using Copilot/Claude 4.5 quite extensively in my debugging efforts. Claude's conclusion is that it is hopeless, at least at this point in time, to attempt to run this AMD GPU over Thunderbolt. I have been getting catastrophic crashes that force me to hard-reboot to get my system up and running again. On the other hand, my ARC B580 works fine under the same circumstances(!), with no crashes, albeit severely underpowered. Someone told me that I should look for GPUs that are capable of working without ReBAR, so here I am, having bought an AMD GPU which, presumably, works better than the ARC in the absence of ReBAR. Some of the saga has been documented here.

Here's my eGPU setup:

LGGram (TB4) -> Caldigit TS5+ -> Razer Core X V2 -> ARC B580 or RX 9060 XT eGPU -> Acer SA242Y HBI monitor (60Hz, 1920x1080)

LGGram (TB4) -> Caldigit TS5+ -> Sabrent SB-TB4K -> 2xSamsung 4K monitors (60Hz, 3840x2160)

OS: Arch Linux

All connection cables are certified TB5.

My GRUB parameters look like this:

loglevel=3 quiet pci=realloc,hpiosize=512K,hpmemsize=1G,big_root_window pcie_gen_cap=0x40000 runpm=0 noretry=0 aspm=0

I have both ReBAR and "Above 4G Decoding" enabled in BIOS by default.

Here are the BIOS parameters that I have messed with--everything else is default:

ACPI Settings -> ACPI Auto Configuration = Enabled
Connectivity Configuration -> Preboot BLE = Enabled
System Agent (SA) Configuration -> Graphics Configuration -> Aperture Size = 1024
System Agent (SA) Configuration -> Graphics Configuration -> Intel Ultrabook Event Support -> IUER Dock Enable = Enabled
Platform Settings -> VTIO -> Enable VTIO Support = Enabled
Platform Settings -> TCSS Platform Setting -> Thunderbolt Configuration -> OS Native Resource Balance = Enabled
Platform Settings -> TCSS Platform Setting -> Thunderbolt Configuration -> Integrated Thunderbolt Configuration -> ITBT Root Port 1 Configuration -> Reserved PMemory = 4096
Platform Settings -> TCSS Platform Setting -> Thunderbolt Configuration -> Integrated Thunderbolt Configuration -> ITBT Root Port 1 Configuration -> PMemory Alignment = 31
Platform Settings -> TCSS Platform Setting -> Thunderbolt Configuration -> Integrated Thunderbolt Configuration -> ITBT Root Port 2 Configuration -> Reserved PMemory = 4096
Platform Settings -> TCSS Platform Setting -> Thunderbolt Configuration -> Integrated Thunderbolt Configuration -> ITBT Root Port 2 Configuration -> PMemory Alignment = 31

What am I doing wrong?

Last edited by Aqualung (2026-01-24 22:14:23)


#2 2026-01-20 20:07:56

Aqualung
Member
Registered: 2025-09-07
Posts: 45

Re: AMD eGPU over Thunderbolt: crashing like there's no tomorrow

As promised, here is lspci -k (will attach journalctl -b to a subsequent reply):

00:00.0 Host bridge: Intel Corporation Raptor Lake-P/U 4p+8e cores Host Bridge/DRAM Controller
	Subsystem: LG Electronics, Inc. Device 0513
	Kernel driver in use: igen6_edac
	Kernel modules: igen6_edac
00:02.0 VGA compatible controller: Intel Corporation Raptor Lake-P [Iris Xe Graphics] (rev 04)
	Subsystem: LG Electronics, Inc. Device 0513
	Kernel driver in use: i915
	Kernel modules: i915, xe
00:04.0 Signal processing controller: Intel Corporation Raptor Lake Dynamic Platform and Thermal Framework Processor Participant
	Subsystem: LG Electronics, Inc. Device 0513
	Kernel driver in use: proc_thermal_pci
	Kernel modules: processor_thermal_device_pci
00:06.0 PCI bridge: Intel Corporation Raptor Lake PCI Express 4.0 Graphics Port
	Subsystem: LG Electronics, Inc. Device 0513
	Kernel driver in use: pcieport
	Kernel modules: shpchp
00:07.0 PCI bridge: Intel Corporation Raptor Lake-P Thunderbolt 4 PCI Express Root Port #2
	Subsystem: LG Electronics, Inc. Device 0513
	Kernel driver in use: pcieport
	Kernel modules: shpchp
00:07.3 PCI bridge: Intel Corporation Raptor Lake-P Thunderbolt 4 PCI Express Root Port #1
	Subsystem: LG Electronics, Inc. Device 0513
	Kernel driver in use: pcieport
	Kernel modules: shpchp
00:08.0 System peripheral: Intel Corporation GNA Scoring Accelerator module
	Subsystem: LG Electronics, Inc. Device 0513
00:0a.0 Signal processing controller: Intel Corporation Raptor Lake Crashlog and Telemetry (rev 01)
	Subsystem: LG Electronics, Inc. Device 0513
	Kernel driver in use: intel_vsec
	Kernel modules: intel_vsec
00:0d.0 USB controller: Intel Corporation Raptor Lake-P Thunderbolt 4 USB Controller
	Subsystem: LG Electronics, Inc. Device 0513
	Kernel driver in use: xhci_hcd
	Kernel modules: xhci_pci
00:0d.3 USB controller: Intel Corporation Raptor Lake-P Thunderbolt 4 NHI #1
	Subsystem: LG Electronics, Inc. Device 0513
	Kernel driver in use: thunderbolt
	Kernel modules: thunderbolt
00:14.0 USB controller: Intel Corporation Alder Lake PCH USB 3.2 xHCI Host Controller (rev 01)
	Subsystem: LG Electronics, Inc. Device 0513
	Kernel driver in use: xhci_hcd
	Kernel modules: xhci_pci
00:14.2 RAM memory: Intel Corporation Alder Lake PCH Shared SRAM (rev 01)
	Subsystem: LG Electronics, Inc. Device 0513
00:14.3 Network controller: Intel Corporation Raptor Lake PCH CNVi WiFi (rev 01)
	Subsystem: Intel Corporation Device 0094
	Kernel driver in use: iwlwifi
	Kernel modules: iwlwifi, wl
00:15.0 Serial bus controller: Intel Corporation Alder Lake PCH Serial IO I2C Controller #0 (rev 01)
	Subsystem: LG Electronics, Inc. Device 0513
	Kernel driver in use: intel-lpss
	Kernel modules: intel_lpss_pci
00:15.1 Serial bus controller: Intel Corporation Alder Lake PCH Serial IO I2C Controller #1 (rev 01)
	Subsystem: LG Electronics, Inc. Device 0513
	Kernel driver in use: intel-lpss
	Kernel modules: intel_lpss_pci
00:16.0 Communication controller: Intel Corporation Alder Lake PCH HECI Controller (rev 01)
	Subsystem: LG Electronics, Inc. Device 0513
	Kernel driver in use: mei_me
	Kernel modules: mei_me
00:1f.0 ISA bridge: Intel Corporation Raptor Lake LPC/eSPI Controller (rev 01)
	Subsystem: LG Electronics, Inc. Device 0513
00:1f.3 Multimedia audio controller: Intel Corporation Raptor Lake-P/U/H cAVS (rev 01)
	Subsystem: LG Electronics, Inc. Device 0513
	Kernel driver in use: sof-audio-pci-intel-tgl
	Kernel modules: snd_soc_avs, snd_sof_pci_intel_tgl, snd_hda_intel
00:1f.4 SMBus: Intel Corporation Alder Lake PCH-P SMBus Host Controller (rev 01)
	Subsystem: LG Electronics, Inc. Device 0513
	Kernel driver in use: i801_smbus
	Kernel modules: i2c_i801
00:1f.5 Serial bus controller: Intel Corporation Alder Lake-P PCH SPI Controller (rev 01)
	Subsystem: LG Electronics, Inc. Device 0513
	Kernel driver in use: intel-spi
	Kernel modules: spi_intel_pci
01:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller PM9B1 (DRAM-less) (rev 02)
	Subsystem: Samsung Electronics Co Ltd NVMe SSD Controller PM9B1 (DRAM-less)
	Kernel driver in use: nvme
	Kernel modules: nvme
2c:00.0 PCI bridge: Intel Corporation JHL9480 Thunderbolt 5 80/120G Bridge [Barlow Ridge Hub 80G 2023] (rev 85)
	Subsystem: Device 2222:1111
	Kernel driver in use: pcieport
	Kernel modules: shpchp
2d:00.0 PCI bridge: Intel Corporation JHL9480 Thunderbolt 5 80/120G Bridge [Barlow Ridge Hub 80G 2023] (rev 85)
	Subsystem: Device 2222:1111
	Kernel driver in use: pcieport
	Kernel modules: shpchp
2d:01.0 PCI bridge: Intel Corporation JHL9480 Thunderbolt 5 80/120G Bridge [Barlow Ridge Hub 80G 2023] (rev 85)
	Subsystem: Device 2222:1111
	Kernel driver in use: pcieport
	Kernel modules: shpchp
2d:02.0 PCI bridge: Intel Corporation JHL9480 Thunderbolt 5 80/120G Bridge [Barlow Ridge Hub 80G 2023] (rev 85)
	Subsystem: Device 2222:1111
	Kernel driver in use: pcieport
	Kernel modules: shpchp
2d:03.0 PCI bridge: Intel Corporation JHL9480 Thunderbolt 5 80/120G Bridge [Barlow Ridge Hub 80G 2023] (rev 85)
	Subsystem: Device 2222:1111
	Kernel driver in use: pcieport
	Kernel modules: shpchp
2d:04.0 PCI bridge: Intel Corporation JHL9480 Thunderbolt 5 80/120G Bridge [Barlow Ridge Hub 80G 2023] (rev 85)
	Subsystem: Device 2222:1111
	Kernel driver in use: pcieport
	Kernel modules: shpchp
2e:00.0 USB controller: ASMedia Technology Inc. ASM2142/ASM3142 USB 3.1 Host Controller
	Subsystem: CalDigit, Inc. Device 3144
	Kernel driver in use: xhci_hcd
	Kernel modules: xhci_pci
2f:00.0 Ethernet controller: Aquantia Corp. AQtion AQC113 NBase-T/IEEE 802.3an Ethernet Controller [Antigua 10G] (rev 03)
	Subsystem: CalDigit, Inc. Device 0173
	Kernel driver in use: atlantic
	Kernel modules: atlantic
30:00.0 PCI bridge: Intel Corporation JHL9480 Thunderbolt 5 80/120G Bridge [Barlow Ridge Hub 80G 2023] (rev 85)
	Subsystem: Device 2222:1111
	Kernel driver in use: pcieport
	Kernel modules: shpchp
31:00.0 PCI bridge: Intel Corporation JHL9480 Thunderbolt 5 80/120G Bridge [Barlow Ridge Hub 80G 2023] (rev 85)
	Subsystem: Device 2222:1111
	Kernel driver in use: pcieport
	Kernel modules: shpchp
31:01.0 PCI bridge: Intel Corporation JHL9480 Thunderbolt 5 80/120G Bridge [Barlow Ridge Hub 80G 2023] (rev 85)
	Subsystem: Device 2222:1111
	Kernel driver in use: pcieport
	Kernel modules: shpchp
31:02.0 PCI bridge: Intel Corporation JHL9480 Thunderbolt 5 80/120G Bridge [Barlow Ridge Hub 80G 2023] (rev 85)
	Subsystem: Device 2222:1111
	Kernel driver in use: pcieport
	Kernel modules: shpchp
31:03.0 PCI bridge: Intel Corporation JHL9480 Thunderbolt 5 80/120G Bridge [Barlow Ridge Hub 80G 2023] (rev 85)
	Subsystem: Device 2222:1111
	Kernel driver in use: pcieport
	Kernel modules: shpchp
32:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Upstream Port of PCI Express Switch (rev 25)
	Subsystem: Tul Corporation / PowerColor Device 1478
	Kernel driver in use: pcieport
	Kernel modules: shpchp
33:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Downstream Port of PCI Express Switch (rev 25)
	Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Downstream Port of PCI Express Switch
	Kernel driver in use: pcieport
	Kernel modules: shpchp
34:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Navi 44 [Radeon RX 9060 XT] (rev c0)
	Subsystem: Tul Corporation / PowerColor Device 2437
	Kernel driver in use: amdgpu
	Kernel modules: amdgpu
34:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Navi 48 HDMI/DP Audio Controller
	Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Navi 48 HDMI/DP Audio Controller
	Kernel driver in use: snd_hda_intel
	Kernel modules: snd_hda_intel
3d:00.0 PCI bridge: Intel Corporation Thunderbolt 4 Bridge [Goshen Ridge 2020] (rev 03)
	Subsystem: Intel Corporation Device 0000
	Kernel driver in use: pcieport
	Kernel modules: shpchp
3e:00.0 PCI bridge: Intel Corporation Thunderbolt 4 Bridge [Goshen Ridge 2020] (rev 03)
	Subsystem: Intel Corporation Device 0000
	Kernel driver in use: pcieport
	Kernel modules: shpchp
3e:01.0 PCI bridge: Intel Corporation Thunderbolt 4 Bridge [Goshen Ridge 2020] (rev 03)
	Subsystem: Intel Corporation Device 0000
	Kernel driver in use: pcieport
	Kernel modules: shpchp
3e:02.0 PCI bridge: Intel Corporation Thunderbolt 4 Bridge [Goshen Ridge 2020] (rev 03)
	Subsystem: Intel Corporation Device 0000
	Kernel driver in use: pcieport
	Kernel modules: shpchp
3e:03.0 PCI bridge: Intel Corporation Thunderbolt 4 Bridge [Goshen Ridge 2020] (rev 03)
	Subsystem: Intel Corporation Device 0000
	Kernel driver in use: pcieport
	Kernel modules: shpchp
3e:04.0 PCI bridge: Intel Corporation Thunderbolt 4 Bridge [Goshen Ridge 2020] (rev 03)
	Subsystem: Intel Corporation Device 0000
	Kernel driver in use: pcieport
	Kernel modules: shpchp


#3 2026-01-20 20:31:12

Aqualung
Member
Registered: 2025-09-07
Posts: 45

Re: AMD eGPU over Thunderbolt: crashing like there's no tomorrow

And here's the journalctl -b: full.log.

Also, here's my mpv.conf, as the crash occurs during mpv playback:

# Copy to ~/.config/mpv to enable Vulkan hardware acceleration in mpv.
hwdec=vulkan # Enables hardware-accelerated video decoding via Vulkan.
vo=gpu-next # Use gpu-next for video output (required by Vulkan).
gpu-api=vulkan # Forces Vulkan for rendering.
gpu-context=waylandvk # For Wayland with Vulkan.
hwdec-codecs=all # Enables hardware decoding for all supported codecs.
# vulkan-device='Intel(R) Arc(tm) B580 Graphics (BMG G21)'
vulkan-device='AMD Radeon RX 9060 XT (RADV GFX1200)'
slang=eng # Sets the subtitle language to English.

Note that the amdgpu driver crashes with hwdec=no as well.

Last edited by Aqualung (2026-01-20 20:48:48)


#4 2026-01-22 15:20:22

seth
Member
From: Don't DM me only for attention
Registered: 2012-09-03
Posts: 72,824

Re: AMD eGPU over Thunderbolt: crashing like there's no tomorrow

runpm=0 noretry=0 aspm=0

This is not in the journal but also nonsense - you need to hint the module for the parameters - "amdgpu.runpm=0" etc.
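
To illustrate the prefix point: on the kernel command line a module option is written as module.parameter=value, so a bare runpm=0 is not handed to amdgpu. Below is a minimal Python sketch that scans a command line for bare tokens that look like amdgpu options and prints the prefixed form. The list of option names is an assumption taken from this thread, and suggest_prefixed is a hypothetical helper, not part of any existing tool.

```python
# Hypothetical helper: flag bare amdgpu options on a kernel command line.
# AMDGPU_OPTS is an assumption based on the parameters named in this thread.
AMDGPU_OPTS = {"pcie_gen_cap", "runpm", "noretry", "aspm"}


def suggest_prefixed(cmdline: str) -> list:
    """Return 'amdgpu.<opt>=<value>' for each bare token naming an amdgpu option."""
    suggestions = []
    for token in cmdline.split():
        name = token.partition("=")[0]
        # A token like "amdgpu.runpm=0" has the name "amdgpu.runpm", so it is skipped.
        if name in AMDGPU_OPTS:
            suggestions.append(f"amdgpu.{token}")
    return suggestions


if __name__ == "__main__":
    cmdline = (
        "loglevel=3 quiet pci=realloc,hpiosize=512K,hpmemsize=1G,big_root_window "
        "pcie_gen_cap=0x40000 runpm=0 noretry=0 aspm=0"
    )
    for suggestion in suggest_prefixed(cmdline):
        print(suggestion)
```

Feeding it the contents of /proc/cmdline from the running system would show which tokens, if any, are missing the prefix.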

works better than the ARC in the absence of ReBAR … I have both ReBAR and "Above 4G Decoding" enabled in BIOS by default.

???
Does the PCI setup look different when you toggle rebar support, notably wrt

Jan 20 14:54:33 DadsGram kernel: pcieport 0000:00:07.3: bridge window [io  size 0x431000]: can't assign; no space
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:00:07.3: bridge window [io  size 0x431000]: failed to assign
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:00:07.3: bridge window [io  size 0x380000]: can't assign; no space
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:00:07.3: bridge window [io  size 0x380000]: failed to assign
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:2c:00.0: bridge window [io  size 0x380000]: can't assign; no space
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:2c:00.0: bridge window [io  size 0x380000]: failed to assign
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:2c:00.0: bridge window [io  size 0x380000]: can't assign; no space
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:2c:00.0: bridge window [io  size 0x380000]: failed to assign
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:2d:00.0: bridge window [io  size 0x1000]: can't assign; no space
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:2d:00.0: bridge window [io  size 0x1000]: failed to assign
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:2d:01.0: bridge window [io  size 0x1000]: can't assign; no space
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:2d:01.0: bridge window [io  size 0x1000]: failed to assign
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:2d:02.0: bridge window [io  size 0x128000]: can't assign; no space
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:2d:02.0: bridge window [io  size 0x128000]: failed to assign
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:2d:03.0: bridge window [io  size 0x12c000]: can't assign; no space
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:2d:03.0: bridge window [io  size 0x12c000]: failed to assign
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:2d:04.0: bridge window [io  size 0x128000]: can't assign; no space
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:2d:04.0: bridge window [io  size 0x128000]: failed to assign
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:2d:00.0: bridge window [io  size 0x1000]: can't assign; no space
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:2d:00.0: bridge window [io  size 0x1000]: failed to assign
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:2d:01.0: bridge window [io  size 0x1000]: can't assign; no space
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:2d:01.0: bridge window [io  size 0x1000]: failed to assign
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:2d:02.0: bridge window [io  size 0x128000]: can't assign; no space
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:2d:02.0: bridge window [io  size 0x128000]: failed to assign
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:2d:03.0: bridge window [io  size 0x12c000]: can't assign; no space
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:2d:03.0: bridge window [io  size 0x12c000]: failed to assign
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:2d:04.0: bridge window [io  size 0x128000]: can't assign; no space
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:2d:04.0: bridge window [io  size 0x128000]: failed to assign
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:30:00.0: bridge window [io  size 0x128000]: can't assign; no space
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:30:00.0: bridge window [io  size 0x128000]: failed to assign
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:30:00.0: bridge window [io  size 0x128000]: can't assign; no space
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:30:00.0: bridge window [io  size 0x128000]: failed to assign
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:31:00.0: bridge window [mem size 0x400200000 64bit pref]: can't assign; no space
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:31:00.0: bridge window [mem size 0x400200000 64bit pref]: failed to assign
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:31:00.0: bridge window [io  size 0x1000]: can't assign; no space
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:31:00.0: bridge window [io  size 0x1000]: failed to assign
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:31:01.0: bridge window [io  size 0x62000]: can't assign; no space
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:31:01.0: bridge window [io  size 0x62000]: failed to assign
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:31:02.0: bridge window [io  size 0x62000]: can't assign; no space
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:31:02.0: bridge window [io  size 0x62000]: failed to assign
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:31:03.0: bridge window [io  size 0x62000]: can't assign; no space
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:31:03.0: bridge window [io  size 0x62000]: failed to assign
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:31:00.0: bridge window [mem size 0x400200000 64bit pref]: can't assign; no space
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:31:00.0: bridge window [mem size 0x400200000 64bit pref]: failed to assign
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:31:00.0: bridge window [io  size 0x1000]: can't assign; no space
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:31:00.0: bridge window [io  size 0x1000]: failed to assign
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:31:01.0: bridge window [io  size 0x62000]: can't assign; no space
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:31:01.0: bridge window [io  size 0x62000]: failed to assign
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:31:02.0: bridge window [io  size 0x62000]: can't assign; no space
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:31:02.0: bridge window [io  size 0x62000]: failed to assign
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:31:03.0: bridge window [io  size 0x62000]: can't assign; no space
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:31:03.0: bridge window [io  size 0x62000]: failed to assign
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:32:00.0: bridge window [mem size 0x400200000 64bit pref]: can't assign; no space
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:32:00.0: bridge window [mem size 0x400200000 64bit pref]: failed to assign
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:32:00.0: bridge window [io  size 0x1000]: can't assign; no space
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:32:00.0: bridge window [io  size 0x1000]: failed to assign
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:32:00.0: bridge window [mem size 0x400200000 64bit pref]: can't assign; no space
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:32:00.0: bridge window [mem size 0x400200000 64bit pref]: failed to assign
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:32:00.0: bridge window [io  size 0x1000]: can't assign; no space
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:32:00.0: bridge window [io  size 0x1000]: failed to assign
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:33:00.0: bridge window [mem size 0x400200000 64bit pref]: can't assign; no space
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:33:00.0: bridge window [mem size 0x400200000 64bit pref]: failed to assign
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:33:00.0: bridge window [io  size 0x1000]: can't assign; no space
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:33:00.0: bridge window [io  size 0x1000]: failed to assign
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:33:00.0: bridge window [mem size 0x400200000 64bit pref]: can't assign; no space
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:33:00.0: bridge window [mem size 0x400200000 64bit pref]: failed to assign
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:33:00.0: bridge window [io  size 0x1000]: can't assign; no space
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:33:00.0: bridge window [io  size 0x1000]: failed to assign
Jan 20 14:54:33 DadsGram kernel: amdgpu 0000:34:00.0: BAR 0 [mem size 0x400000000 64bit pref]: can't assign; no space
Jan 20 14:54:33 DadsGram kernel: amdgpu 0000:34:00.0: BAR 0 [mem size 0x400000000 64bit pref]: failed to assign
Jan 20 14:54:33 DadsGram kernel: amdgpu 0000:34:00.0: BAR 2 [mem size 0x00200000 64bit pref]: can't assign; no space
Jan 20 14:54:33 DadsGram kernel: amdgpu 0000:34:00.0: BAR 2 [mem size 0x00200000 64bit pref]: failed to assign
Jan 20 14:54:33 DadsGram kernel: amdgpu 0000:34:00.0: BAR 0 [mem size 0x400000000 64bit pref]: can't assign; no space
Jan 20 14:54:33 DadsGram kernel: amdgpu 0000:34:00.0: BAR 0 [mem size 0x400000000 64bit pref]: failed to assign
Jan 20 14:54:33 DadsGram kernel: amdgpu 0000:34:00.0: BAR 2 [mem size 0x00200000 64bit pref]: can't assign; no space
Jan 20 14:54:33 DadsGram kernel: amdgpu 0000:34:00.0: BAR 2 [mem size 0x00200000 64bit pref]: failed to assign
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:3d:00.0: bridge window [io  size 0x12c000]: can't assign; no space
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:3d:00.0: bridge window [io  size 0x12c000]: failed to assign
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:3d:00.0: bridge window [io  size 0x12c000]: can't assign; no space
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:3d:00.0: bridge window [io  size 0x12c000]: failed to assign
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:3e:00.0: bridge window [io  size 0x1000]: can't assign; no space
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:3e:00.0: bridge window [io  size 0x1000]: failed to assign
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:3e:01.0: bridge window [io  size 0x63000]: can't assign; no space
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:3e:01.0: bridge window [io  size 0x63000]: failed to assign
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:3e:02.0: bridge window [io  size 0x63000]: can't assign; no space
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:3e:02.0: bridge window [io  size 0x63000]: failed to assign
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:3e:03.0: bridge window [io  size 0x63000]: can't assign; no space
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:3e:03.0: bridge window [io  size 0x63000]: failed to assign
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:3e:04.0: bridge window [io  size 0x1000]: can't assign; no space
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:3e:04.0: bridge window [io  size 0x1000]: failed to assign
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:3e:00.0: bridge window [io  size 0x1000]: can't assign; no space
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:3e:00.0: bridge window [io  size 0x1000]: failed to assign
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:3e:01.0: bridge window [io  size 0x63000]: can't assign; no space
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:3e:01.0: bridge window [io  size 0x63000]: failed to assign
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:3e:02.0: bridge window [io  size 0x63000]: can't assign; no space
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:3e:02.0: bridge window [io  size 0x63000]: failed to assign
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:3e:03.0: bridge window [io  size 0x63000]: can't assign; no space
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:3e:03.0: bridge window [io  size 0x63000]: failed to assign
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:3e:04.0: bridge window [io  size 0x1000]: can't assign; no space
Jan 20 14:54:33 DadsGram kernel: pcieport 0000:3e:04.0: bridge window [io  size 0x1000]: failed to assign
Jan 20 14:54:34 DadsGram gnome-shell[3024]: Added device '/dev/dri/card0' (amdgpu) using atomic mode setting.
Jan 20 14:57:46 DadsGram kernel: amdgpu 0000:34:00.0: amdgpu: Dumping IP State
Jan 20 14:57:46 DadsGram kernel: amdgpu 0000:34:00.0: amdgpu: Dumping IP State Completed
Jan 20 14:57:46 DadsGram kernel: amdgpu 0000:34:00.0: amdgpu: [drm] AMDGPU device coredump file has been created
Jan 20 14:57:46 DadsGram kernel: amdgpu 0000:34:00.0: amdgpu: [drm] Check your /sys/class/drm/card0/device/devcoredump/data
Jan 20 14:57:46 DadsGram kernel: amdgpu 0000:34:00.0: amdgpu: ring vcn_unified_0 timeout, signaled seq=1284, emitted seq=1285
Jan 20 14:57:46 DadsGram kernel: amdgpu 0000:34:00.0: amdgpu:  Process mpv pid 195158 thread av:h264:df1 pid 195185
Jan 20 14:57:46 DadsGram kernel: amdgpu 0000:34:00.0: amdgpu: Starting vcn_unified_0 ring reset
Jan 20 14:57:48 DadsGram kernel: amdgpu 0000:34:00.0: amdgpu: failed to load ucode VCN0_RAM(0x3B) 
Jan 20 14:57:48 DadsGram kernel: amdgpu 0000:34:00.0: amdgpu: psp gfx command LOAD_IP_FW(0x6) failed and response status is (0x0)
Jan 20 14:57:48 DadsGram kernel: amdgpu 0000:34:00.0: amdgpu: vcn_v5_0_0_start_dpg_mode: vcn sram load failed -22
Jan 20 14:57:48 DadsGram kernel: amdgpu 0000:34:00.0: amdgpu: Ring vcn_unified_0 reset failed
Jan 20 14:57:48 DadsGram kernel: amdgpu 0000:34:00.0: amdgpu: GPU reset begin!. Source:  1
Jan 20 14:57:48 DadsGram kernel: amdgpu 0000:34:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
Jan 20 14:57:51 DadsGram kernel: amdgpu 0000:34:00.0: amdgpu: MES(1) failed to respond to msg=REMOVE_QUEUE
Jan 20 14:57:51 DadsGram kernel: amdgpu 0000:34:00.0: amdgpu: failed to unmap legacy queue
Jan 20 14:57:53 DadsGram kernel: amdgpu 0000:34:00.0: amdgpu: MES(1) failed to respond to msg=REMOVE_QUEUE
Jan 20 14:57:53 DadsGram kernel: amdgpu 0000:34:00.0: amdgpu: failed to unmap legacy queue
Jan 20 14:57:55 DadsGram kernel: amdgpu 0000:34:00.0: amdgpu: MES(1) failed to respond to msg=REMOVE_QUEUE
Jan 20 14:57:55 DadsGram kernel: amdgpu 0000:34:00.0: amdgpu: failed to unmap legacy queue
Jan 20 14:57:57 DadsGram kernel: amdgpu 0000:34:00.0: amdgpu: MES(1) failed to respond to msg=REMOVE_QUEUE
Jan 20 14:57:57 DadsGram kernel: amdgpu 0000:34:00.0: amdgpu: failed to unmap legacy queue
Jan 20 14:57:59 DadsGram kernel: amdgpu 0000:34:00.0: amdgpu: MES(1) failed to respond to msg=REMOVE_QUEUE
Jan 20 14:57:59 DadsGram kernel: amdgpu 0000:34:00.0: amdgpu: failed to unmap legacy queue
Jan 20 14:58:01 DadsGram kernel: amdgpu 0000:34:00.0: amdgpu: MES(1) failed to respond to msg=REMOVE_QUEUE
Jan 20 14:58:01 DadsGram kernel: amdgpu 0000:34:00.0: amdgpu: failed to unmap legacy queue
Jan 20 14:58:03 DadsGram kernel: amdgpu 0000:34:00.0: amdgpu: psp gfx command LOAD_IP_FW(0x6) failed and response status is (0x0)
Jan 20 14:58:03 DadsGram kernel: amdgpu 0000:34:00.0: amdgpu: Failed to terminate hdcp ta
Jan 20 14:58:03 DadsGram kernel: amdgpu 0000:34:00.0: amdgpu: suspend of IP block <psp> failed -22
Jan 20 14:58:03 DadsGram kernel: amdgpu 0000:34:00.0: amdgpu: MODE1 reset
Jan 20 14:58:03 DadsGram kernel: amdgpu 0000:34:00.0: amdgpu: GPU mode1 reset
Jan 20 14:58:03 DadsGram kernel: amdgpu 0000:34:00.0: amdgpu: GPU smu mode1 reset
Jan 20 14:58:04 DadsGram kernel: amdgpu 0000:34:00.0: amdgpu: GPU reset succeeded, trying to resume
Jan 20 14:58:04 DadsGram kernel: amdgpu 0000:34:00.0: amdgpu: PCIE GART of 512M enabled (table at 0x0000008000000000).
Jan 20 14:58:04 DadsGram kernel: amdgpu 0000:34:00.0: amdgpu: VRAM is lost due to GPU reset!
Jan 20 14:58:04 DadsGram kernel: amdgpu 0000:34:00.0: amdgpu: PSP is resuming...
Jan 20 14:58:05 DadsGram kernel: amdgpu 0000:34:00.0: amdgpu: psp reg (0x16091) wait timed out, mask: 0, read: 0 exp: 0
Jan 20 14:58:05 DadsGram kernel: amdgpu 0000:34:00.0: amdgpu: PSP load sos failed!
Jan 20 14:58:05 DadsGram kernel: amdgpu 0000:34:00.0: amdgpu: PSP resume failed
Jan 20 14:58:05 DadsGram kernel: amdgpu 0000:34:00.0: amdgpu: resume of IP block <psp> failed -62
Jan 20 14:58:05 DadsGram kernel: amdgpu 0000:34:00.0: amdgpu: GPU reset end with ret = -62
Jan 20 14:58:05 DadsGram kernel: amdgpu 0000:34:00.0: amdgpu: GPU Recovery Failed: -62
Jan 20 14:58:15 DadsGram kernel: amdgpu 0000:34:00.0: amdgpu: Dumping IP State
Jan 20 14:58:15 DadsGram kernel: amdgpu 0000:34:00.0: amdgpu: Dumping IP State Completed
Jan 20 14:58:15 DadsGram kernel: amdgpu 0000:34:00.0: amdgpu: [drm] AMDGPU device coredump file has been created
Jan 20 14:58:15 DadsGram kernel: amdgpu 0000:34:00.0: amdgpu: [drm] Check your /sys/class/drm/card0/device/devcoredump/data
Jan 20 14:58:15 DadsGram kernel: amdgpu 0000:34:00.0: amdgpu: ring vcn_unified_0 timeout, signaled seq=1285, emitted seq=1285
Jan 20 14:58:15 DadsGram kernel: amdgpu 0000:34:00.0: amdgpu: Starting vcn_unified_0 ring reset
Jan 20 14:58:17 DadsGram kernel: amdgpu 0000:34:00.0: amdgpu: failed to load ucode VCN0_RAM(0x3B) 
Jan 20 14:58:17 DadsGram kernel: amdgpu 0000:34:00.0: amdgpu: psp gfx command UNLOAD_TA(0x2) failed and response status is (0x0)
Jan 20 14:58:17 DadsGram kernel: amdgpu 0000:34:00.0: amdgpu: vcn_v5_0_0_start_dpg_mode: vcn sram load failed -22
Jan 20 14:58:17 DadsGram kernel: amdgpu 0000:34:00.0: amdgpu: Ring vcn_unified_0 reset failed
Jan 20 14:58:17 DadsGram kernel: amdgpu 0000:34:00.0: amdgpu: GPU reset begin!. Source:  1
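
To compare a ReBAR-on boot with a ReBAR-off boot, the "failed to assign" lines above could be tallied with a short script instead of being read by eye. This is only a sketch: the regular expression is fitted to the lines in this excerpt and is not a general parser for kernel PCI messages, and the function names are made up for illustration.

```python
import re
import sys

# Pattern fitted to the "failed to assign" lines quoted in this thread (an assumption,
# other kernel versions may word these messages differently).
FAILED = re.compile(
    r"(?P<dev>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"(?P<res>bridge window|BAR \d+) "
    r"\[(?P<kind>io|mem)\s+size (?P<size>0x[0-9a-f]+)[^\]]*\]: failed to assign"
)


def failed_assignments(log_text: str) -> set:
    """Collect unique (device, resource, kind, size_in_bytes) failures."""
    found = set()
    for m in FAILED.finditer(log_text):
        found.add((m["dev"], m["res"], m["kind"], int(m["size"], 16)))
    return found


def human(size: int) -> str:
    """Format a byte count as KiB, MiB or GiB."""
    for unit, width in (("GiB", 1 << 30), ("MiB", 1 << 20), ("KiB", 1 << 10)):
        if size >= width:
            return f"{size / width:.2f} {unit}"
    return f"{size} B"


if __name__ == "__main__":
    # Usage: journalctl -b -k | python failed_bars.py
    for dev, res, kind, size in sorted(failed_assignments(sys.stdin.read())):
        if kind == "mem":  # io failures are left out to keep the output short
            print(dev, res, human(size))
```

For the excerpt above, the BAR 0 request of 0x400000000 bytes works out to 16 GiB and BAR 2 to 2 MiB. Saving the output from each boot to a file and running diff on the two files would show whether toggling ReBAR changes anything.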

Not sure whether

Jan 20 14:56:33 DadsGram org.gnome.Nautilus[195024]: Connecting to org.freedesktop.Tracker3.Miner.Files
Jan 20 14:56:33 DadsGram nautilus[195024]: WARNING: radv is not a conformant Vulkan implementation, testing use only.
Jan 20 14:56:33 DadsGram nautilus[195024]: MESA-INTEL: warning: ../mesa-25.3.3/src/intel/vulkan/anv_formats.c:993: FINISHME: support more multi-planar formats with DRM modifiers
Jan 20 14:56:33 DadsGram nautilus[195024]: MESA-INTEL: warning: ../mesa-25.3.3/src/intel/vulkan/anv_formats.c:959: FINISHME: support YUV colorspace with DRM format modifiers

means that you're trying to use radv w/ the intel IGP


#5 2026-01-22 15:52:27

Aqualung
Member
Registered: 2025-09-07
Posts: 45

Re: AMD eGPU over Thunderbolt: crashing like there's no tomorrow

seth wrote:
runpm=0 noretry=0 aspm=0

This is not in the journal but also nonsense - you need to hint the module for the parameters - "amdgpu.runpm=0" etc.

The four parameters (pcie_gen_cap=0x40000, runpm=0, noretry=0, aspm=0) are actually specified in an amd-pcie-fix.conf file in /etc/modprobe.d, which looks like this:

options amdgpu pcie_gen_cap=0x40000 runpm=0 noretry=0 aspm=0
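
Whether those options actually reach the driver can be checked against sysfs. Here is a rough Python sketch, assuming the loaded module exposes its parameters under /sys/module/amdgpu/parameters; parse_options_line and live_value are hypothetical helper names, not an existing tool.

```python
from pathlib import Path
from typing import Optional


def parse_options_line(line: str) -> tuple:
    """Split 'options <module> k=v ...' into (module, {k: v})."""
    words = line.split()
    if len(words) < 2 or words[0] != "options":
        raise ValueError(f"not an options line: {line!r}")
    params = dict(w.split("=", 1) for w in words[2:])
    return words[1], params


def live_value(module: str, param: str, root: Path = Path("/sys/module")) -> Optional[str]:
    """Read the current value from sysfs, or None if the file is not there."""
    path = root / module / "parameters" / param
    return path.read_text().strip() if path.is_file() else None


if __name__ == "__main__":
    module, wanted = parse_options_line(
        "options amdgpu pcie_gen_cap=0x40000 runpm=0 noretry=0 aspm=0"
    )
    for name, value in wanted.items():
        print(f"{module}.{name}: configured={value} live={live_value(module, name)}")
```

The script only prints both values side by side rather than comparing them, because sysfs may show a number in a different base than the one written in the .conf file (0x40000 is 262144 in decimal).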

works better than the ARC in the absence of ReBAR … I have both ReBAR and "Above 4G Decoding" enabled in BIOS by default.

???

The statement "works better than the ARC in the absence of ReBAR" is just a general statement. Sure enough, I tried running the AMD eGPU in both scenarios (i.e. ReBAR enabled and disabled), with the same result: both crashed.

Does the PCI setup look different when you toggle rebar support, notably wrt ...

I have not yet captured a log in the no ReBAR scenario, but I will, and I will post it in a subsequent comment. Btw, when disabling ReBAR in BIOS, should I also disable "Above 4G Decoding"? (Or is that something that gets disabled automatically when disabling ReBAR?)

Not sure whether

Jan 20 14:56:33 DadsGram org.gnome.Nautilus[195024]: Connecting to org.freedesktop.Tracker3.Miner.Files
Jan 20 14:56:33 DadsGram nautilus[195024]: WARNING: radv is not a conformant Vulkan implementation, testing use only.
Jan 20 14:56:33 DadsGram nautilus[195024]: MESA-INTEL: warning: ../mesa-25.3.3/src/intel/vulkan/anv_formats.c:993: FINISHME: support more multi-planar formats with DRM modifiers
Jan 20 14:56:33 DadsGram nautilus[195024]: MESA-INTEL: warning: ../mesa-25.3.3/src/intel/vulkan/anv_formats.c:959: FINISHME: support YUV colorspace with DRM format modifiers

means that you're trying to use radv w/ the intel IGP

All I can tell you is that all the logs that I posted, and all the code above, pertain to the AMD eGPU use scenario. I have not posted any logs pertaining to the use of the ARC eGPU. Not sure what makes you think that I am trying to use radv with the Intel eGPU. How did you infer that?

Last edited by Aqualung (2026-01-22 17:55:52)

Offline

#6 2026-01-22 20:11:31

seth
Member
From: Don't DM me only for attention
Registered: 2012-09-03
Posts: 72,824

Re: AMD eGPU over Thunderbolt: crashing like there's no tomorrow

Or is that something that gets disabled automatically when disabling ReBAR?

Idk what your BIOS does but those aren't strictly linked.

How did you infer that?

I'd assume that the system runs on the IGP (hence the explicit mpv config and limited crash triggers) and nautilus is posting messages from MESA-INTEL but also complains about radv (which might just be because no vulkan driver is explicitly selected by exporting VK_DRIVER_FILES)
https://wiki.archlinux.org/title/Vulkan
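
For reference, pinning the Vulkan driver for a single program might look like the sketch below. It assumes the ICD manifests live in /usr/share/vulkan/icd.d and follow the radeon_icd.*.json / intel_icd.*.json naming; icd_files is a hypothetical helper.

```shell
# Build a colon-separated list of ICD manifests for one vendor prefix.
# $1 = vendor prefix (e.g. radeon, intel), $2 = manifest directory
# (the default directory is an assumption about the installation).
icd_files() {
    vendor="$1"
    dir="${2:-/usr/share/vulkan/icd.d}"
    list=""
    for f in "$dir/${vendor}"_icd*.json; do
        [ -e "$f" ] || continue          # glob did not match anything
        list="${list:+$list:}$f"
    done
    printf '%s\n' "$list"
}

# Example: run a program with only the RADV manifests visible to the loader.
# VK_DRIVER_FILES="$(icd_files radeon)" vulkaninfo --summary
```

With the variable unset, the loader enumerates every installed driver, which would explain seeing messages from both Mesa drivers in one process.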

Sidebar (well, main topic, actually)

as the crash occurs during mpv playback

"during" or "when"? Ie. can you play the video fine for a moment or does it crash right away?

Offline

#7 2026-01-22 20:40:23

Aqualung
Member
Registered: 2025-09-07
Posts: 45

Re: AMD eGPU over Thunderbolt: crashing like there's no tomorrow

seth wrote:

How did you infer that?

I'd assume that the system runs on the IGP (hence the explicit mpv config and limited crash triggers) and nautilus is posting messages from MESA-INTEL but also complains about radv (which might just be because no vulkan driver is explicitly selected by exporting VK_DRIVER_FILES)
https://wiki.archlinux.org/title/Vulkan

The main system GPU is, indeed, the Iris Xe iGPU, though I choose to do mpv playback using the eGPU. So yes, that is why you see messages pertaining to both GPUs in the log.

as the crash occurs during mpv playback

"during" or "when"? Ie. can you play the video fine for a moment or does it crash right away?

Yes, video playback proceeds smoothly for a few good minutes, and everything looks peachy, with barely any stress of the eGPU ... that is, until the amdgpu driver crashes, and video freezes. Audio playback continues undisturbed until I kill off mpv. If I remember correctly, I have to disconnect the eGPU in order to be able to kill mpv. Even so, the amdgpu driver remains hung such that, in order to do a shutdown I have to actually use the laptop shutdown button, as soft shutdowns get stuck.

Last edited by Aqualung (2026-01-22 20:46:37)

Offline

#8 2026-01-23 15:04:51

seth
Member
From: Don't DM me only for attention
Registered: 2012-09-03
Posts: 72,824

Offline

#9 2026-01-23 16:25:27

Aqualung
Member
Registered: 2025-09-07
Posts: 45

Re: AMD eGPU over Thunderbolt: crashing like there's no tomorrow

Thank you, but I am not sure what exactly I am supposed to focus on on that page. Is this a suggestion to add the pci=hpbussize=0x33 parameter to grub by any chance? Or is there something else?

Offline

#10 2026-01-23 16:29:37

seth
Member
From: Don't DM me only for attention
Registered: 2012-09-03
Posts: 72,824

Re: AMD eGPU over Thunderbolt: crashing like there's no tomorrow

The link goes to the paragraph about "Forcing power", the idea being that the TB link shuts down to save power, not understanding that the eGPU behind it is currently in use.
nb. that the UUID "86CCFD48-205E-4A77-9C48-2021CBEDE341" there is a placeholder and will likely be different on your system.

You could also try to add "thunderbolt.clx=0" to the kernel parameters, https://wiki.archlinux.org/title/Kernel_parameters (see "modinfo thunderbolt")
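
Whether the parameter took effect can be checked after a reboot. A small sketch, assuming the thunderbolt module exposes clx under /sys/module/thunderbolt/parameters; check_clx is a made-up name:

```shell
# Report whether thunderbolt.clx=0 is on the kernel command line and
# what the loaded module says. Both paths are overridable; the sysfs
# default is an assumption.
check_clx() {
    cmdline="${1:-/proc/cmdline}"
    param="${2:-/sys/module/thunderbolt/parameters/clx}"
    if grep -q 'thunderbolt\.clx=0' "$cmdline"; then
        echo "cmdline: thunderbolt.clx=0 present"
    else
        echo "cmdline: thunderbolt.clx=0 missing"
    fi
    [ -r "$param" ] && echo "module: clx=$(cat "$param")"
    return 0
}

# Usage: check_clx
```

Boolean module parameters read back as Y or N, so N would be the expected value here.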

Offline

#11 2026-01-23 19:14:56

Aqualung
Member
Registered: 2025-09-07
Posts: 45

Re: AMD eGPU over Thunderbolt: crashing like there's no tomorrow

Here's what Copilot/Claude suggests I send to ... whoever may be interested in this sort of crash (AMD, Mesa etc.):

Perfect! Here are the key AMD GPU crash messages from this log (Jan 23, 13:55:26 onwards):

Critical Error Sequence to send to manufacturer:

Jan 23 13:55:26 - ring sdma1 timeout, signaled seq=1589, emitted seq=1591
Jan 23 13:55:26 - Starting sdma1 ring reset
Jan 23 13:55:26 - *ERROR* ring sdma1 test failed (-110)
Jan 23 13:55:26 - Ring sdma1 reset failed
Jan 23 13:55:27 - *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
Jan 23 13:55:29-39 - MES(1) failed to respond to msg=REMOVE_QUEUE (repeated 6 times)
Jan 23 13:55:42 - psp gfx command LOAD_IP_FW(0x6) failed and response status is (0x0)
Jan 23 13:55:42 - Failed to terminate hdcp ta
Jan 23 13:55:42 - suspend of IP block <psp> failed -22
Jan 23 13:55:43 - Timeout waiting for VM flush ACK!
Jan 23 13:55:45 - psp gfx command UNKNOWN CMD(0x0) failed and response status is (0x0)
Jan 23 13:55:45 - Failed to load toc
Jan 23 13:55:45 - PSP tmr init failed!
Jan 23 13:55:45 - PSP resume failed
Jan 23 13:55:45 - resume of IP block <psp> failed -22
Jan 23 13:55:45 - GPU Recovery Failed: -22

Summary for manufacturer:

  • SDMA1 (DMA engine) timeouts and reset failures

  • MES (Micro Engine Scheduler) unresponsive

  • PSP (Platform Security Processor) recovery failures

  • GPU recovery fails completely over Thunderbolt 4 connection

  • Occurs regardless of ReBAR on/off, with/without video decode

  • System: Linux 6.18.6-zen1, eGPU via Thunderbolt 4 → CalDigit TS5+ → Razer Core X V2
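
The unabridged kernel lines behind a summary like this can be pulled from the journal directly. A sketch, assuming the crash happened in the previous boot; amdgpu_errors is a hypothetical filter and its pattern is only a guess at what is relevant:

```shell
# Keep only amdgpu/drm lines that mention a timeout, reset, failure or error.
# Reads from stdin so it can be fed from journalctl or a saved log file.
amdgpu_errors() {
    grep -iE '(amdgpu|drm).*(timeout|reset|fail|error)'
}

# Usage (previous boot, kernel messages only):
# journalctl -k -b -1 --no-pager | amdgpu_errors > amdgpu-crash.txt
```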

Offline

#12 2026-01-23 19:42:59

seth
Member
From: Don't DM me only for attention
Registered: 2012-09-03
Posts: 72,824

Re: AMD eGPU over Thunderbolt: crashing like there's no tomorrow

Stop wasting your time with these bullshit generators, the only thing AMD *might* be interested in is the firmware dump:

Jan 20 14:57:46 DadsGram kernel: amdgpu 0000:34:00.0: amdgpu: Dumping IP State
Jan 20 14:57:46 DadsGram kernel: amdgpu 0000:34:00.0: amdgpu: Dumping IP State Completed
Jan 20 14:57:46 DadsGram kernel: amdgpu 0000:34:00.0: amdgpu: [drm] AMDGPU device coredump file has been created
Jan 20 14:57:46 DadsGram kernel: amdgpu 0000:34:00.0: amdgpu: [drm] Check your /sys/class/drm/card0/device/devcoredump/data
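
As far as I know the dump is not kept in sysfs indefinitely, so it is worth copying it out soon after a hang. A sketch using the path printed in the log above; save_coredump and the output file name are placeholders, and reading the file may require root:

```shell
# Copy the amdgpu devcoredump to a regular file.
# $1 = source (defaults to the path from the kernel message),
# $2 = destination (hypothetical default name with a timestamp).
save_coredump() {
    src="${1:-/sys/class/drm/card0/device/devcoredump/data}"
    dst="${2:-amdgpu-coredump-$(date +%Y%m%d-%H%M%S).bin}"
    if [ ! -r "$src" ]; then
        echo "no coredump at $src" >&2
        return 1
    fi
    cat "$src" > "$dst" && echo "saved $dst"
}

# Usage: save_coredump
```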

Did you attempt to prevent the thunderbolt powersaving?

Offline

#13 2026-01-23 20:04:59

Aqualung
Member
Registered: 2025-09-07
Posts: 45

Re: AMD eGPU over Thunderbolt: crashing like there's no tomorrow

seth wrote:

Did you attempt to prevent the thunderbolt powersaving?

I did add the thunderbolt.clx=0 parameter to grub, and it still didn't fix the crash. In order to try out the other suggestion, I would need the uuid of the Thunderbolt port: how do I figure that out?

Last edited by Aqualung (2026-01-23 20:09:00)

Offline

#14 2026-01-23 20:36:42

seth
Member
From: Don't DM me only for attention
Registered: 2012-09-03
Posts: 72,824

Re: AMD eGPU over Thunderbolt: crashing like there's no tomorrow

ls /sys/bus/wmi/devices/*/force_power
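
That glob lists the force_power attribute of whichever WMI device provides it, so the UUID does not have to be known in advance. Writing 1 to the attribute (as root) is what the "Forcing power" paragraph describes; a sketch, with force_tb_power as a hypothetical wrapper:

```shell
# Write 1 to every force_power attribute found under the WMI bus.
# $1 = base directory (defaults to the sysfs path from the command above).
force_tb_power() {
    base="${1:-/sys/bus/wmi/devices}"
    found=0
    for f in "$base"/*/force_power; do
        [ -e "$f" ] || continue          # glob did not match anything
        echo 1 > "$f" && echo "forced power via $f"
        found=1
    done
    [ "$found" -eq 1 ] || echo "no force_power attribute found under $base" >&2
}

# Usage (as root): force_tb_power
```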

Offline

#15 2026-01-23 22:30:16

Aqualung
Member
Registered: 2025-09-07
Posts: 45

Re: AMD eGPU over Thunderbolt: crashing like there's no tomorrow

seth wrote:
ls /sys/bus/wmi/devices/*/force_power

Thank you. I tried preventing the Thunderbolt powersaving via force_power, and it made no difference. (Also note, as I said above, that no such crashes happen with my ARC B580 eGPU, which makes me think that the crash doesn't happen due to Thunderbolt's shutdown to save power.)

Anyway, I have posted here a short film of the crash, for whoever wants to see with their own eyes what it looks like. Just be patient and watch it all. You may want to download it locally.

Finally, huge thanks for all your help and advice!

Last edited by Aqualung (2026-01-23 22:30:58)

Offline

#16 2026-01-24 15:39:17

seth
Member
From: Don't DM me only for attention
Registered: 2012-09-03
Posts: 72,824

Re: AMD eGPU over Thunderbolt: crashing like there's no tomorrow

Well, at least the cause is obvious - it's clearly stunned by Ganz' performance…

nb. that frequency and power draw freeze - those values are unreliable because the GPU no longer responds.
Is the delay somewhat deterministic or can you sometimes play 15s and then 10 mins ?
Can you trigger this by running glxgears or some xscreensaver hack or vkcube or a unigine demo on the GPU?
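
Since the desktop runs on the IGP, such a test has to be pointed at the eGPU explicitly. A sketch for the glxgears case, assuming Mesa's DRI_PRIME accepts the pci-DDDD_BB_DD_F form and that the card is still at 0000:34:00.0 as in the journal; dri_prime_tag is a hypothetical helper:

```shell
# Turn a PCI address such as 0000:34:00.0 into the tag form
# pci-0000_34_00_0 (colons and dots replaced by underscores).
dri_prime_tag() {
    printf 'pci-%s\n' "$(printf '%s' "$1" | tr ':.' '__')"
}

# Example: render on the eGPU and watch the kernel log in a second terminal.
# DRI_PRIME="$(dri_prime_tag 0000:34:00.0)" glxgears
# journalctl -kf
```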

Offline

#17 2026-01-24 17:35:56

Aqualung
Member
Registered: 2025-09-07
Posts: 45

Re: AMD eGPU over Thunderbolt: crashing like there's no tomorrow

seth wrote:

Is the delay somewhat deterministic or can you sometimes play 15s and then 10 mins ?

No, not deterministic. Not as far as I could tell. As a matter of fact, what you see there is actually quite a long delay: usually the crash happens a lot earlier.

Can you trigger this by running glxgears or some xscreensaver hack or vkcube or a unigine demo on the GPU?

I'll look into it. As far as I can tell, no crash happens if the AMD eGPU is kept idle. Otherwise, either displaying an image on a monitor connected to the eGPU or doing video playback on the eGPU triggers a crash.

Offline

#18 2026-01-25 21:59:25

Aqualung
Member
Registered: 2025-09-07
Posts: 45

Re: AMD eGPU over Thunderbolt: crashing like there's no tomorrow

Just for the heck of it, I tried bypassing the TS5+, hence I connected the eGPU directly to the laptop's other TB4 port. I got the same crash. At this point, I am set on returning the current 9060XT GPU (a PowerColor), and I bought another one from ASUS this time. Will report back if that one crashes once I get it.

Offline

#19 2026-02-02 02:28:54

Aqualung
Member
Registered: 2025-09-07
Posts: 45

Re: AMD eGPU over Thunderbolt: crashing like there's no tomorrow

Just reporting back: I got the same type of crash with my new ASUS-branded RX 9060 XT eGPU. As such, it appears to be an issue with the amdgpu driver for the 9060 XT in general, not with one particular card.

Offline

#20 2026-02-02 17:13:28

MarPop
Member
Registered: 2026-02-02
Posts: 4

Re: AMD eGPU over Thunderbolt: crashing like there's no tomorrow

The problem lies in the power state handling of the GPU. It could be that AMD didn't disclose enough information to the Linux ecosystem on how to handle power saving dynamics (they say it's stable on Windows). A nightmare that was resolved yesterday. I have a gb-BRR7H-4800 with a Ryzen 7 4800u, and it used to block, crash, and all the rest described in your comments. Yesterday I disabled the PSS Support BIOS option, and now everything runs smoothly and fast, though I cannot disable boost and/or use cpupower to regulate frequencies or policies (with PSS disabled, ACPI is not exposed to the kernel). Now I have a sway session and a KDE session open, and have tried all the sites where it used to crash.
Sorry, but I just registered here for the sake of letting you know how it has been solved here, to fix yours, at least temporarily, and to let those in charge know.

Offline

#21 2026-02-02 17:33:02

Aqualung
Member
Registered: 2025-09-07
Posts: 45

Re: AMD eGPU over Thunderbolt: crashing like there's no tomorrow

MarPop wrote:

The problem lies in the power state handling of the GPU. It could be that AMD didn't disclose enough information to the Linux ecosystem on how to handle power saving dynamics (they say it's stable on Windows). A nightmare that was resolved yesterday. I have a gb-BRR7H-4800 with a Ryzen 7 4800u, and it used to block, crash, and all the rest described in your comments. Yesterday I disabled the PSS Support BIOS option, and now everything runs smoothly and fast, though I cannot disable boost and/or use cpupower to regulate frequencies or policies (with PSS disabled, ACPI is not exposed to the kernel). Now I have a sway session and a KDE session open, and have tried all the sites where it used to crash.
Sorry, but I just registered here for the sake of letting you know how it has been solved here, to fix yours, at least temporarily, and to let those in charge know.

Thank you. Will look for that in my BIOS. Just FYI, my system is Intel, and I am being told that this "PSS Support" is usually a setting in AMD BIOSes.

At any rate, at this point in time my hope is that kernel 6.19 will straighten out the ReBAR morass, which I am also told is the root of my crashes.

Offline

#22 2026-02-02 23:08:08

seth
Member
From: Don't DM me only for attention
Registered: 2012-09-03
Posts: 72,824

Re: AMD eGPU over Thunderbolt: crashing like there's no tomorrow

https://bbs.archlinux.org/viewtopic.php … 5#p2285745
He's crossposting that as a generic PSA and I doubt MarPop is even using an eGPU.

my ARC B580 works fine, no crashes, very well behaved under the same circumstances(!)--albeit being severely underpowered

I'd test the HW w/ windows (if you can/don't mind) just to make sure the TB actually provides enough power for the GPU or you could also try whether running the AMD GPU at the minimum power profile or performance level works…
https://wiki.archlinux.org/title/AMDGPU … nce_levels
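
Forcing the lowest performance level might look like the sketch below. It assumes card0 is the eGPU (as the coredump path in the earlier log suggests) and needs root; set_perf_level is a hypothetical helper, and the setting does not persist across reboots.

```shell
# Show the current performance level, write the requested one, and
# show the result. $1 = level (default: low), $2 = sysfs file
# (the default path assumes the eGPU is card0).
set_perf_level() {
    level="${1:-low}"
    file="${2:-/sys/class/drm/card0/device/power_dpm_force_performance_level}"
    echo "before: $(cat "$file")"
    echo "$level" > "$file"
    echo "after: $(cat "$file")"
}

# Usage (as root): set_perf_level low
# Revert with:     set_perf_level auto
```

If playback survives at "low" but not at "auto", that would point towards clock or power state transitions rather than the Thunderbolt link itself.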

Offline

#23 2026-02-03 04:43:52

Aqualung
Member
Registered: 2025-09-07
Posts: 45

Re: AMD eGPU over Thunderbolt: crashing like there's no tomorrow

seth wrote:

make sure the TB actually provides enough power for the GPU

As far as I know the B580 has more stringent power requirements than the 9060. At any rate, my enclosure has an 850W PSU.

Offline

#24 2026-02-04 01:51:02

MarPop
Member
Registered: 2026-02-02
Posts: 4

Re: AMD eGPU over Thunderbolt: crashing like there's no tomorrow

seth wrote:

https://bbs.archlinux.org/viewtopic.php … 5#p2285745
He's crossposting that as a generic PSA and I doubt MarPop is even using an eGPU.

I posted there after having realized that I posted here by mistake, and I didn't remove it from here because I thought it could be useful anyway. And I mentioned this fact there. Though I wonder where the need to point out the obvious comes from, seeing that it could have been inferred from what I wrote.
I'll add, here and there, that the freezes manifest anyway after suspend. I am new here, so tell me if my posts are useless.

Offline
