
#1 2024-05-22 03:12:20

glyn
Member
Registered: 2023-06-27
Posts: 6

Crash on resume, possibly due to AMD FirePro V5900

Since replacing my graphics card with an AMD FirePro V5900 in March 2024 I have experienced a crash on resume approximately 20% of the time (over 42 suspend/resume cycles). The symptoms are that, on resume, I see only a black screen and pressing the Num Lock key does not light up the Num Lock LED. Pressing the power button of the system unit successfully shuts down the system.

This system log is of the most recent crash. This earlier system log shows several successful suspend/resume cycles over several days before the crash occurs.

I have been upgrading the system regularly. The latest crash was on the 6.9.1-arch1-1 kernel.

I can't see any obvious cause in the logs. (Please note that the network card is always slow to activate on resume and this seems to cause a fatal error in backup.service. However, this occurs in successful as well as failed resumes, so does not appear to be the cause of the crash.)

Has anyone else experienced this crash? Is there a way to obtain better diagnostics?

Last edited by glyn (2024-05-22 09:20:18)

Offline

#2 2024-05-22 07:36:16

seth
Member
From: Won't reply 2 private help req
Registered: 2012-09-03
Posts: 75,287

Re: Crash on resume, possibly due to AMD FirePro V5900

radeon, not amdgpu - cayman is northern islands so you can't switch.

There're no obvious errors in the journal, you do not seem to have "pressed and held" the power key, though?

May 22 03:22:46 zion systemd-logind[414]: Power key pressed short.
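A minimal sketch of how saved journal output could be scanned for the two things asked about here: which power-key events logind recorded, and when the last journal entry was written. It assumes journalctl's default "short" line format and an English locale for month names. The names `parse_line` and `summarize` are hypothetical, and the year must be supplied because this format does not include it.

```python
import re
from datetime import datetime

# Matches journalctl's default "short" output, e.g.
# "May 22 03:22:46 zion systemd-logind[414]: Power key pressed short."
LINE_RE = re.compile(
    r"^(?P<stamp>[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) (?P<unit>[^\[:]+)(?:\[(?P<pid>\d+)\])?: (?P<msg>.*)$"
)


def parse_line(line, year):
    """Return (timestamp, unit, message), or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    # The short format omits the year, so the caller supplies it (assumption).
    when = datetime.strptime(f"{year} {m['stamp']}", "%Y %b %d %H:%M:%S")
    return when, m["unit"], m["msg"]


def summarize(lines, year):
    """Return the latest timestamp seen and all logind power-key messages."""
    last_seen = None
    power_key_events = []
    for line in lines:
        parsed = parse_line(line, year)
        if parsed is None:
            continue
        when, unit, msg = parsed
        if last_seen is None or when > last_seen:
            last_seen = when
        if unit == "systemd-logind" and "Power key pressed" in msg:
            power_key_events.append((when, msg))
    return last_seen, power_key_events
```

Feeding it the lines of a saved log, for example `summarize(open("crash.log"), 2024)` with a hypothetical file name, gives the time of the final entry and the list of recorded power-key presses.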

nb. that actually hard-rebooting w/ the power key will prevent the journal from being written to disk - the last event was on May 22 03:22:03 (your time)?

If so, the kernel actually still runs and the ethernet eventually came up and got a lease - can you ssh into the system at this point?

It also doesn't look like you're running a GUI session? This is all just on the console?

Offline

#3 2024-05-22 09:27:25

glyn
Member
Registered: 2023-06-27
Posts: 6

Re: Crash on resume, possibly due to AMD FirePro V5900

seth wrote:

radeon, not amdgpu - cayman is northern islands so you can't switch.

I see: I have to use the radeon driver rather than the amdgpu driver since the latter doesn't support my graphics card. Thanks.

seth wrote:

There're no obvious errors in the journal, you do not seem to have "pressed and held" the power key, though?

May 22 03:22:46 zion systemd-logind[414]: Power key pressed short.

My mistake - thanks for pointing it out. I've corrected the OP.

seth wrote:

nb. that actually hard-rebooting w/ the power key will prevent the journal from being written to disk - the last event was on May 22 03:22:03 (your time)?

Yes, I understand that if I had hard-rebooted the system the journal would have been truncated. The last event was at May 22 03:22:47 (my time):

May 22 03:22:47 zion systemd-journald[269]: Journal stopped

seth wrote:

If so, the kernel actually still runs and the ethernet eventually came up and got a lease - can you ssh into the system at this point?

I can't ssh in (it doesn't run an ssh server), but I can successfully ping the system.

seth wrote:

It also doesn't look like you're running a GUI session? This is all just on the console?

I run xmonad (via startx) after the system has booted.

Last edited by glyn (2024-05-22 11:00:25)

Offline

#4 2024-05-22 11:02:57

glyn
Member
Registered: 2023-06-27
Posts: 6

Re: Crash on resume, possibly due to AMD FirePro V5900

Holger Hoffstätte kindly responded elsewhere:

Holger wrote:

This is with the old "radeon" driver, right? If so, this unfortunately just happens sometimes. The driver is in maintenance mode (except for the occasional compile fix) and - according to AMD - has "unfixable" design problems. :(

(I am indeed using the old "radeon" driver. This was what seth was telling me earlier: that I can't switch to the amdgpu driver because it doesn't support my graphics card.)

Offline

#5 2024-05-22 14:05:08

seth
Member
From: Won't reply 2 private help req
Registered: 2012-09-03
Posts: 75,287

Re: Crash on resume, possibly due to AMD FirePro V5900

I can't ssh in (it doesn't run an ssh server), but I can successfully ping the system.

The idea would have been to set up sshd so that you can remotely log into the broken machine for live inspection and mitigation attempts.

I run xmonad (via startx) after the system has booted.

Also a compositor (picom)?
How black is the screen? Is there still a mouse cursor?
Can you switch the VT when this happens (ctrl+alt+f2,f3,…)?
Or zap the X11 server (ctrl+alt+backspace, this will kill your GUI session!)?

Offline

#6 2024-05-22 14:26:30

glyn
Member
Registered: 2023-06-27
Posts: 6

Re: Crash on resume, possibly due to AMD FirePro V5900

seth wrote:

I can't ssh in (it doesn't run an ssh server), but I can successfully ping the system.

The idea would have been to set up sshd so that you can remotely log into the broken machine for live inspection and mitigation attempts.

I might resort to that, but the disk is clean after power off, so I simply reboot. It's just irritating, but rather fast.

seth wrote:

I run xmonad (via startx) after the system has booted.

Also a compositor (picom)?

Not knowingly!

seth wrote:

How black is the screen?

Actually, IIRC, not that black.

seth wrote:

Is there still a mouse cursor?

No.

seth wrote:

Can you switch the VT when this happens (ctrl+alt+f2,f3,…)?

No, I tried that, and cannot.

seth wrote:

Or zap the X11 server (ctrl+alt+backspace, this will kill your GUI session!)?

I suspect I won't be able to, but I'll certainly try it. (The numlock and capslock not working give the impression the whole keyboard is out of action.)

Offline

#7 2024-05-22 15:23:08

seth
Member
From: Won't reply 2 private help req
Registered: 2012-09-03
Posts: 75,287

Re: Crash on resume, possibly due to AMD FirePro V5900

It's just irritating, but rather fast.

The point is to figure out what's going on and what it takes to salvage the situation.
Otherwise this thread is merely a status post - "snafu, that's life".

Offline

#8 2024-05-22 15:27:10

glyn
Member
Registered: 2023-06-27
Posts: 6

Re: Crash on resume, possibly due to AMD FirePro V5900

seth wrote:

It's just irritating, but rather fast.

The point is to figure out what's going on and what it takes to salvage the situation.
Otherwise this thread is merely a status post - "snafu, that's life".

Good point. I've set up an SSH daemon, so I can try logging in when the problem next recurs. If you have any suggestions for things I might try if I can SSH in, I'm all ears.

I must admit to being pessimistic about there being a fix based on Holger's comments, but workarounds would be interesting.
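As one possible starting point for the SSH session mentioned above, here is a rough sketch that gathers read-only diagnostics into a single report. The choice of commands is an assumption about what might be worth capturing, not something prescribed in this thread. The name `collect` and its `run` parameter are hypothetical (the latter exists only so the function can be exercised without the real tools).

```python
import subprocess

# Read-only commands. Which ones are actually useful here is an assumption.
DIAGNOSTIC_COMMANDS = [
    ["journalctl", "-k", "-b", "--no-pager"],         # kernel messages, current boot
    ["journalctl", "-b", "-p", "err", "--no-pager"],  # priority err and worse, current boot
    ["loginctl", "list-sessions", "--no-pager"],      # sessions known to logind
    ["systemctl", "--failed", "--no-pager"],          # units in a failed state
]


def collect(commands=DIAGNOSTIC_COMMANDS, run=subprocess.run):
    """Run each command and return {command line: combined output}."""
    report = {}
    for cmd in commands:
        key = " ".join(cmd)
        try:
            result = run(cmd, capture_output=True, text=True, timeout=30)
            report[key] = result.stdout + result.stderr
        except (OSError, subprocess.TimeoutExpired) as exc:
            # A missing or hung tool should not abort the whole collection.
            report[key] = f"failed to run: {exc}"
    return report
```

Calling `collect()` over SSH while the screen is black and writing the result to a file would preserve the state for later comparison with a successful resume.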

Offline
