#1 2024-02-05 21:18:52

cryptearth
Member
Registered: 2024-02-03
Posts: 74

[SOLVED] additional input on amdgpu reboot issue

// update
As this is fixed with today's 6.7.5, I'm marking this as solved.

//original

So, as I'm also affected by this issue https://bbs.archlinux.org/viewtopic.php?id=290707 but also have additional hardware to test with, I'd like to give the community some input.
I just upgraded from an RX 570 to an RX 7700 XT (Sapphire, PULSE). As I have 3 monitors but the card only has 2 DisplayPort and 2 HDMI outputs, I use an active DP-splitter - and from my tests DisplayPort seems to be at play here. This is similar to what the user DeX77 describes over at the bug report: https://gitlab.freedesktop.org/drm/amd/-/issues/3062 "using 2 displays via DP (daisy chain)"
I don't understand what https://git.kernel.org/pub/scm/linux/ke … 043ee23357 is about - but as it was probably done for some reason, just reverting it can't be a fix forever. So, to aid further development on whatever this is about, here are my two cents:

Short story: When I don't use the DP-splitter - or no DP at all - there seems to be no issue. Only when DP goes through the splitter does it start to get funky.

A bit longer explanation:

I first encountered issues with a passive DP-splitter some years ago. So I bought two active ones. Unfortunately, as I figured out later, both of the active ones have the same chip inside - and likely a similar or maybe even the same firmware. So I see them as equivalent.
When I only hook up 2 monitors directly via DP everything works fine. As soon as I throw one of the splitters into the mix the issue comes up.
For the sake of testing I switched back to my RX 570 which has 3x DP + 1x HDMI - and it shows the same: When connecting all 3 monitors directly to one port each everything is fine. Only when I hook up the splitter the issue comes up.
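
For anyone who wants to compare their own connection setup, here is a minimal Python sketch that lists what the kernel reports per connector. It assumes the usual sysfs layout, where each connector shows up as a directory like /sys/class/drm/card0-DP-1 containing a "status" file. The function name list_connectors and its root parameter are just made up for illustration.

```python
from pathlib import Path


def list_connectors(root="/sys/class/drm"):
    """Return {connector_name: status} for every DRM connector under root.

    Assumption: connector directories are named like card0-DP-1 and hold a
    'status' file with 'connected', 'disconnected' or 'unknown'.
    """
    result = {}
    for entry in sorted(Path(root).glob("card*-*")):
        status_file = entry / "status"
        if status_file.is_file():
            result[entry.name] = status_file.read_text().strip()
    return result


if __name__ == "__main__":
    # Print one line per connector, e.g. to diff with and without the splitter.
    for name, status in list_connectors().items():
        print(f"{name}: {status}")
```

Running it once with the monitors hooked up directly and once with the splitter in between should make it easy to see which connectors the driver considers connected in each case.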

I also set up Windows alongside (using the Chris Titus "Windows - the Arch way" guide) - and the Windows driver doesn't seem to be affected: it works fine in any setup (even when hooking up both splitters, no matter which GPU). So from this I guess it's clear that it's not a hardware fault - and neither is it specific to the 7000 series, but seems to affect all "recent" AMD GPUs using the amdgpu driver. I also have an even older R9 290 - which by default uses the older radeon driver. Using the radeon driver, the 290 doesn't show any issue. When I disable the radeon driver and force the amdgpu driver, even the old 290 shows the same issue: no screen after reboot - but it works fine when booting into Windows.
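
To double-check which kernel driver a card actually ended up with after such a switch (for this GPU generation that is commonly done with the kernel parameters radeon.cik_support=0 amdgpu.cik_support=1), something like the following rough sketch should do. It assumes that /sys/class/drm/cardN/device/driver is a symlink pointing at the bound driver, which is the usual layout for PCI GPUs. The function name bound_drivers and its root parameter are hypothetical.

```python
import os
from pathlib import Path


def bound_drivers(root="/sys/class/drm"):
    """Return {card_name: driver_name} for each cardN entry under root.

    Assumption: cardN/device/driver is a symlink whose target's last path
    component is the driver name (e.g. 'amdgpu' or 'radeon').
    """
    drivers = {}
    for entry in sorted(Path(root).glob("card[0-9]*")):
        if "-" in entry.name:
            continue  # skip connector entries such as card0-DP-1
        link = entry / "device" / "driver"
        if link.exists():
            drivers[entry.name] = os.path.basename(os.path.realpath(link))
    return drivers


if __name__ == "__main__":
    for card, driver in bound_drivers().items():
        print(f"{card}: {driver}")
```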

So, to whoever is responsible for the change that caused this - as just reverting the change can't be a solution forever, this may help in figuring out what's going on.

I'm sorry if this isn't the right place to post this information, as it seems to not only affect Arch but to be a general upstream issue and hence should be posted somewhere upstream. I just wanted to post it somewhere, as I'm able to test with different connection setups and even with different cards. To me this rules out that it's something specific to the 7000 series; it rather looks like something more general in the amdgpu driver affecting any generation using it. It also somehow seems to be related to DisplayPort - or rather to not using it as a 1-to-1 connection but in something more unusual like daisy-chaining or multi-stream via a splitter - as the issue does not occur when using HDMI or DVI.

btw - a bit off-topic: this somewhat reminds me of a bug in the usb-audio stack: Someone changed something which affected external USB audio interfaces in a way that they only worked either as output or as input but couldn't be used both ways at the same time. The issue was discovered rather quickly, within a week or so, and the fix (also just reverting the change) was done and upstream within another 2 weeks - so from the issue first occurring due to some faulty patch to things working again by reverting it took only a couple of weeks. I'm hoping this issue gets solved in a similar fashion, as the fix is already merged into Linus Torvalds' master tree and we only have to wait for it to get released.
It's a bit unfortunate that the faulty change seems to have already been backported to 6.6.x-lts, as it, too, shows the same issue. But as I use ZFS I can't really downgrade to an even older kernel, as I would then somehow have to get ZFS built for it, which I have no idea how to do.

Last edited by cryptearth (2024-02-17 22:42:26)
