#1 2023-05-26 19:14:15

realitaetsverlust
Member
Registered: 2021-09-07
Posts: 52

RX 7900 XTX crashes the system at seemingly random intervals

My RX 7900 XTX is causing severe problems on my system, to the point where I basically can't use my Linux installation at all.

I get seemingly random crashes at any point in time. Sometimes the system runs perfectly fine for an entire day, sometimes it crashes minutes after startup. Right now, for example, my system crashed while I was trying to play a game, forcing a hard reset. After the reboot I opened my browser and looked at some text-based websites (no videos or anything), and it crashed again, literally minutes later. At first I thought the GPU was dying, but out of curiosity I have been running my Windows partition for a week now without a single crash or issue, which sadly leads me to conclude that it's an Arch problem, or at least a problem with the GPU that only occurs on Arch.

As I'm honestly not very hardware-savvy, I don't quite know what to do or how to solve this. My system is up to date, with the latest drivers and everything.

I have collected the journalctl logs and xorg logs here:

First boot + crash:
- journalctl: http://ix.io/4wM7
- Xorg: (not available, as only 2 are stored)

Second boot + crash:
- journalctl: http://ix.io/4wM8
- Xorg: http://ix.io/4wM9

Current working session (at time of writing):
- journalctl: http://ix.io/4wMa
- Xorg: http://ix.io/4wMb

[rv@exodus ~]$ sudo lspci -v -s 2f:00.0
[sudo] password for rv: 
2f:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Navi 31 [Radeon RX 7900 XT/7900 XTX] (rev c8) (prog-if 00 [VGA controller])
	Subsystem: XFX Limited RX-79XMERCB9 [SPEEDSTER MERC 310 RX 7900 XTX]
	Flags: bus master, fast devsel, latency 0, IRQ 113, IOMMU group 27
	Memory at d0000000 (64-bit, prefetchable) [size=256M]
	Memory at e0000000 (64-bit, prefetchable) [size=2M]
	I/O ports at f000 [size=256]
	Memory at fca00000 (32-bit, non-prefetchable) [size=1M]
	Expansion ROM at fcb00000 [disabled] [size=128K]
	Capabilities: [48] Vendor Specific Information: Len=08 <?>
	Capabilities: [50] Power Management version 3
	Capabilities: [64] Express Legacy Endpoint, MSI 00
	Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
	Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
	Capabilities: [150] Advanced Error Reporting
	Capabilities: [200] Physical Resizable BAR
	Capabilities: [240] Power Budgeting <?>
	Capabilities: [270] Secondary PCI Express
	Capabilities: [2a0] Access Control Services
	Capabilities: [2d0] Process Address Space ID (PASID)
	Capabilities: [320] Latency Tolerance Reporting
	Capabilities: [410] Physical Layer 16.0 GT/s <?>
	Capabilities: [450] Lane Margining at the Receiver <?>
	Kernel driver in use: amdgpu
	Kernel modules: amdgpu

If you need any other info, please let me know - as I said, I'm not very hardware-savvy so I'm not sure what you could need here.


#2 2023-05-27 12:04:44

Lone_Wolf
Forum Moderator
From: Netherlands, Europe
Registered: 2005-10-04
Posts: 12,018


The Xorg logs don't show problems, although you should remove xf86-video-ati. It doesn't support your card and adds clutter to the Xorg log.


All 3 journals show

May 26 21:54:37 exodus kernel: [drm:dcn20_wait_for_blank_complete [amdgpu]] *ERROR* DC: failed to blank crtc!

Similar lines appear multiple times, but they don't seem to be the cause of the crash.

Two things stand out:
You appear to have 2 monitors connected, a GSM (Model: 5b09) on DP-1 and a SAM (Model: c4d) on DP-3.
Do the crashes also happen if you use only one of them? (Try with just the SAM first.)

Locale not supported by C library.
                                             Using the fallback 'C' locale.

That suggests you have set your locale incorrectly, and some applications choke badly on that (Steam used to be one of them, not sure if it still is).

See https://wiki.archlinux.org/title/Locale … the_locale and post the output of the 3 commands mentioned there.


Disliking systemd intensely, but not satisfied with alternatives so focusing on taming systemd.


(A works at time B)  && (time C > time B ) ≠  (A works at time C)


#3 2023-05-27 15:18:22

realitaetsverlust
Member
Registered: 2021-09-07
Posts: 52


Lone_Wolf wrote:

The Xorg logs don't show problems, although you should remove xf86-video-ati. It doesn't support your card and adds clutter to the Xorg log.

Alright, I'll clean that up, thanks.

Lone_Wolf wrote:

All 3 journals show

May 26 21:54:37 exodus kernel: [drm:dcn20_wait_for_blank_complete [amdgpu]] *ERROR* DC: failed to blank crtc!

Similar lines appear multiple times, but they don't seem to be the cause of the crash.

Good to hear, a dead GPU was the thing that worried me the most.

Lone_Wolf wrote:

Two things stand out:
You appear to have 2 monitors connected, a GSM (Model: 5b09) on DP-1 and a SAM (Model: c4d) on DP-3.
Do the crashes also happen if you use only one of them? (Try with just the SAM first.)

I'll try to replicate the issue on just one monitor. However, as the crashes are fairly random and sometimes don't happen at all, it might take a while to confirm this.

Lone_Wolf wrote:
Locale not supported by C library.
                                             Using the fallback 'C' locale.

That suggests you have set your locale incorrectly, and some applications choke badly on that (Steam used to be one of them, not sure if it still is).

See https://wiki.archlinux.org/title/Locale … the_locale and post the output of the 3 commands mentioned there.

[rv@exodus ~]$ locale
locale: Cannot set LC_CTYPE to default locale: No such file or directory
locale: Cannot set LC_MESSAGES to default locale: No such file or directory
locale: Cannot set LC_ALL to default locale: No such file or directory
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
[rv@exodus ~]$ localedef --list-archive
en_GB.utf8
[rv@exodus ~]$ localectl list-locales
C.UTF-8
en_GB.UTF-8


#4 2023-05-27 16:22:03

Skidout
Member
Registered: 2023-05-26
Posts: 37


Are you running the LTS kernel? You need at least version 6.2 to have a stable experience on 7900 XTX. Also, try to see if this is happening on another desktop environment or on Wayland. That could also be the issue. If it still crashes, disable your display manager with systemctl and see if it still crashes in a pure TTY. This way the issue could be isolated.


#5 2023-05-28 10:45:26

realitaetsverlust
Member
Registered: 2021-09-07
Posts: 52


Skidout wrote:

Are you running the LTS kernel? You need at least version 6.2 to have a stable experience on 7900 XTX.

No, I'm running 6.3.4.arch1-1

Skidout wrote:

Also, try to see if this is happening on another desktop environment or on Wayland.

It also happened on busybox, so it's not KDE-specific. I haven't tried Wayland yet; I'll do that next.

Skidout wrote:

If it still crashes, disable your display manager with systemctl and see if it still crashes in a pure TTY. This way the issue could be isolated.

I'll do that as well after I've tried Wayland. It might take a while until I get some results, as the crashes are rather random and hard to reproduce.


#6 2023-05-28 11:36:43

Lone_Wolf
Forum Moderator
From: Netherlands, Europe
Registered: 2005-10-04
Posts: 12,018


Your locale is indeed set up wrong. You enabled the en_GB locale, but your system uses en_US, which is not enabled, as its default.


As root, edit /etc/locale.gen and remove the # from the line #en_US.UTF-8 UTF-8.
Save the edit, then run locale-gen (again as root).

That should fix the locale errors.

Sidenote: Many programs expect the en_US locale to be supported, so even if you don't use it, it's a good idea to enable it.
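
For reference, the locale.gen edit described above could be scripted roughly as follows. This is only a sketch: the helper name enable_locale_line is made up for illustration, it assumes GNU sed (as shipped on Arch), and it assumes the entry is present in the file in its stock commented-out form.

```shell
# Hypothetical helper: uncomment one entry in a locale.gen-style file.
# $1: the entry without its leading '#', e.g. 'en_US.UTF-8 UTF-8'
# $2: the file to edit in place
enable_locale_line() {
    # Strip the '#' only from the line that starts with the requested entry;
    # all other lines are left untouched.
    sed -i "s/^#\($1\)/\1/" "$2"
}
```

Run as root, that would be enable_locale_line 'en_US.UTF-8 UTF-8' /etc/locale.gen followed by locale-gen. Afterwards, locale should no longer print the "Cannot set LC_..." errors.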



#7 2023-05-28 12:31:28

MadCat_X
Member
Registered: 2009-10-08
Posts: 189


How exactly do these "random crashes" manifest? There is nothing in any of your log files that would point to a problem.

While you should straighten up your locale settings, I very much doubt that you would experience hard crashes as a result.

Last edited by MadCat_X (2023-05-28 12:31:50)


#8 2023-05-28 15:09:05

realitaetsverlust
Member
Registered: 2021-09-07
Posts: 52


MadCat_X wrote:

How exactly do these "random crashes" manifest? There is nothing in any of your log files that would point to a problem.

While you should straighten up your locale settings, I very much doubt that you would experience hard crashes as a result.

Basically, my system freezes for roughly 4 seconds. If any sound played during the crash, the last second of sound is repeated 4 times. Then, my system hard resets as if I'm booting it anew. There's nothing else happening, I can't even nail it down to something I'm doing. Sometimes it crashes while playing a game, sometimes it crashes while having a simple notepad application open. I can try to get a video of it but it's really hard as it's not reproducible.

I saw several hardware error lines in dmesg - could they be a problem?


#9 2023-05-29 00:48:59

Skidout
Member
Registered: 2023-05-26
Posts: 37


realitaetsverlust wrote:

I saw several hardware error lines in dmesg - could they be a problem?

Yes, those could definitely be an issue. At this point it's hard to say if the 7900 XTX is even the issue; it could be something else entirely. It would be useful if you could post these logs, someone might be able to tell what's going on.


realitaetsverlust wrote:

If any sound played during the crash, the last second of sound is repeated 4 times.

The audio continuing to play is somewhat telling, as it means your audio server is not freezing along with the rest of whatever you're running. If an application freezes and does not disconnect from the audio server, some servers will play the audio a certain number of times over again. It could be that only your graphical environment is crashing, not your whole system including the kernel. One way you might be able to test for that would be to set up an SSH server and see if you can log in from another computer after the freeze.

It would also be useful to know if three monitors work, since a corrupted driver might not be able to display on three monitors at once. A few months ago, when I first got my 7900 XTX, I also experienced very similar crashes due to bad driver support. The other issue I experienced was only being able to use two of my three monitors.


#10 2023-05-29 11:09:06

realitaetsverlust
Member
Registered: 2021-09-07
Posts: 52


Skidout wrote:

Yes, those could definitely be an issue. At this point it's hard to say if the 7900 XTX is even the issue; it could be something else entirely. It would be useful if you could post these logs, someone might be able to tell what's going on.

The messages from dmesg are contained within the journalctl logs - I can't post the dmesg logs themselves as, at least as far as I know, they are not persisted anywhere by default and are not accessible after a reboot.

My first idea was the GPU as that was the latest part I had replaced in my rig, upgraded from an RX 6900 XTX.

Skidout wrote:

The audio continuing to play is somewhat telling, as it means your audio server is not freezing along with the rest of whatever you're running. If an application freezes and does not disconnect from the audio server, some servers will play the audio a certain number of times over again. It could be that only your graphical environment is crashing, not your whole system including the kernel. One way you might be able to test for that would be to set up an SSH server and see if you can log in from another computer after the freeze.

Considering that my system hard reboots after 4 - 5 seconds, I would assume it's not just the DE that is crashing. There is also no way I can try to SSH into the system in that narrow timeframe.

Skidout wrote:

It would also be useful to know if three monitors work, since a corrupted driver might not be able to display on three monitors at once. A few months ago, when I first got my 7900 XTX, I also experienced very similar crashes due to bad driver support. The other issue I experienced was only being able to use two of my three monitors.

Yes, I also had severe issues with my GPU in the beginning, so my PC was actually unusable for me. I don't have a third monitor lying around though - I'll try to get my hands on a small one.

However, with all that said - is it likely that this is a driver issue? Because instead of looking for such a hard-to-debug error, I might just reinstall my entire OS and see if that fixes it.


#11 2023-05-29 22:28:41

Skidout
Member
Registered: 2023-05-26
Posts: 37


realitaetsverlust wrote:

The messages from dmesg are contained within the journalctl logs - I can't post the dmesg logs themselves as, at least as far as I know, they are not persisted anywhere by default and are not accessible after a reboot.

Then you could post the messages from the journalctl logs. If those are too verbose, you can use (sudo watch -n 0.2 "dmesg > /home/$USER/dmesg.log"). This command rewrites dmesg.log in your home directory with the current dmesg output every 0.2 seconds. There's a pretty good chance it could catch something.
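
One caveat with the watch approach is that the file is rewritten from scratch on every run and nothing forces it onto the disk, so a reset at the wrong moment could leave it empty or incomplete. A possible alternative is sketched below: stream the kernel log and flush each line to disk as it arrives. The function name log_synced and the output path are illustrative assumptions; dmesg --follow and sync are the standard util-linux and coreutils tools.

```shell
# Hypothetical helper: append every line read from stdin to the file named
# in $1 and flush it to disk right away, so the last lines written before a
# hard reset have the best chance of surviving.
log_synced() {
    while IFS= read -r line; do
        printf '%s\n' "$line" >> "$1"
        sync
    done
}

# Intended use (left as a comment because it needs root and runs until
# interrupted with Ctrl+C):
#   sudo dmesg --follow | log_synced "$HOME/dmesg.log"
```

Since the journal appears to be persistent here (logs from earlier boots were posted), journalctl -k -b -1 should also show the kernel messages of the previous boot, though the very last lines before a hard reset may be missing from it.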

realitaetsverlust wrote:

My first idea was the GPU as that was the latest part I had replaced in my rig, upgraded from an RX 6900 XTX.

If you replaced the GPU and immediately started having crashes, then sure. In that case it would probably be caused by some configuration issue somewhere, and not by the drivers.

realitaetsverlust wrote:

However, with all that said - is it likely that this is a driver issue? Because instead of looking for such a hard-to-debug error, I might just reinstall my entire OS and see if that fixes it.

If you're up for a full reinstall, that's what I would do in this situation. I think I've reinstalled at least twice since getting my new card, though that was after I already had a relatively stable system; I reinstalled for other reasons. If that doesn't work, we can try some of the other stuff I mentioned, and I have at least one more idea, but it's a last-resort kind of idea. I would recommend not keeping your home directory as it is if you reinstall, since some config files in there could be causing the problem. Back up your home directory and only copy over what you know you need.


#12 2023-05-30 11:12:34

Lone_Wolf
Forum Moderator
From: Netherlands, Europe
Registered: 2005-10-04
Posts: 12,018


I doubt very much reinstalling will help.

I did scrutinize the logs posted by realitaetsverlust and feel the errors in them point to driver errors, not hardware issues.
There are many cases where adding a 2nd monitor causes issues that are not present with only 1 connected monitor. I have never heard of a case where adding a 3rd monitor made the issues go away, so I see no point in trying that.

Setting up a watch on dmesg is a good idea.



#13 2023-10-28 11:56:35

zombywuf
Member
Registered: 2023-10-28
Posts: 1


I joined this forum just to post this rather longshot possible solution (my 7900 XTX issue is different).

For a while I had issues with random crashes after a GPU upgrade, which I eventually discovered were due to the overly large bookshelf speakers I had on my desk next to my case. They had quite a strong magnet in them, which seemed to be enough to cause instability in the presumably quite delicate electronics in the new card. If you've been at the innards of your machine, I assume you've moved things around a bit. Have you left your case next to a speaker or other strong magnet?


#14 2023-11-13 17:51:08

orlfman
Member
Registered: 2007-11-20
Posts: 141


just want to chime in: i've been having a similar issue with my 7900 xt as well, which i bought in july of 2023. from july to the end of august it ran very stable, no issues. in late august i went back to windows 11 out of fear of starfield not running, and i stayed on windows 11 until the end of september. on my second day back on arch i had a strange system freeze: i tried playing cyberpunk 2077 and the game froze completely. i was able to alt-tab out and kill its process, but after 2 minutes the entire computer froze. what's odd is that nothing was caught in journalctl, no reports of a gfx ring crash or anything. then i tried playing wc3 reforged. it played fine for two hours, then the game randomly froze at the menu. i force-exited it, and again after another 1 or 2 minutes my entire system froze.

ended up going back to windows 11. since then, i've gone back and forth trying to give linux with arch another shot, and every time i still get these complete system freezes. one new thing i noticed: if i keep the game up instead of trying to exit it, the system will freeze after 1 or 2 minutes. if the game freezes and i try to switch to another tty, it will freeze instantly. this only ever happens when playing a game.

the only game this never happened with was WoW classic. every other game i've tried has caused these freezes. to note, windows 11 has been stable, and these freezes on linux never happened on the plain desktop or during hardware-accelerated video playback. it's always been when playing a game.

since then, because i really do want to come back to arch, i've replaced the following:
swapped my 13900k with another brand new 13900k
switched from an msi z690 edge wifi ddr5 to a new msi z790 edge wifi ddr5
swapped my 64gb ddr5 6000mhz with a completely, different, new 64gb ddr5 6000mhz kit
i even bought a brand new 7900 xt to replace my original 7900 xt

and my powersupply is a corsair rmx (2022 version, bought this year) 1000 watt.

Last edited by orlfman (2023-11-13 17:53:49)


#15 2023-11-13 22:13:36

seth
Member
Registered: 2012-09-03
Posts: 52,291


Have you tried the LTS or LTS515 (AUR) kernels?
Though #4 suggests you'd need 6.2, OP was on 6.3 and you probably 6.4 at the time?

OP also has an AMD Ryzen 9 5900X and there's oc. https://wiki.archlinux.org/title/Ryzen#Troubleshooting

I'm sure you've disabled windows fast-start?


#16 2023-11-14 17:24:39

orlfman
Member
Registered: 2007-11-20
Posts: 141


seth wrote:

Have you tried the LTS or LTS515 (AUR) kernels?
Though #4 suggests you'd need 6.2, OP was on 6.3 and you probably 6.4 at the time?

OP also has an AMD Ryzen 9 5900X and there's oc. https://wiki.archlinux.org/title/Ryzen#Troubleshooting

I'm sure you've disabled windows fast-start?

when i got the card i was on 6.4 with whatever mesa version was out at the time and things worked well. this was july - end of august. i've tried 6.5 and 6.6. even ran 6.6 when it was rc too. i tried mesa stable in the repos and mesa-git. i also don't have fast boot enabled. secure boot disabled, and even disabled the intel ftpm.

i'm really at a loss with what to do. i read elsewhere it might be rebar related? should i try disabling that?


#17 2023-11-14 20:56:19

seth
Member
Registered: 2012-09-03
Posts: 52,291


"Fast Boot" != "Windows fast start" - 3rd link below. Mandatory.
Disable it and reboot Windows and Linux twice for voodoo reasons.

disabled the intel ftpm

So we can infer that you don't have a ryzen chip?
Can you trigger the freeze with a synthetic benchmark (https://wiki.archlinux.org/title/Benchmarking#Graphics)?
Did you monitor the system temperatures at the time of the freeze (GPU, CPU, board)?
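
If no monitoring tool is set up yet, the temperatures the kernel exposes through hwmon can be logged with a few lines of shell. This is a rough sketch, not a replacement for lm_sensors: the function name print_temps is made up here, and which chips show up (amdgpu, coretemp, a board sensor) depends on the drivers that are loaded.

```shell
# Hypothetical helper: print "<chip> <sensor> <degrees C>" for every
# temperature input found under a hwmon root directory.
# $1: hwmon root, defaults to /sys/class/hwmon
print_temps() {
    root="${1:-/sys/class/hwmon}"
    for f in "$root"/hwmon*/temp*_input; do
        [ -r "$f" ] || continue                      # no match or unreadable
        chip=$(cat "$(dirname "$f")/name" 2>/dev/null)
        milli=$(cat "$f" 2>/dev/null) || continue    # some inputs fail to read
        # hwmon reports temperatures in millidegrees Celsius
        printf '%s %s %s\n' "${chip:-unknown}" "$(basename "$f" _input)" "$((milli / 1000))"
    done
}
```

Calling print_temps once per second in a loop, appending the output to a file and running sync after each round, would leave a record of the last readings before a freeze.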

seth wrote:

Have you tried the LTS or LTS515 (AUR) kernels?
