I get completely random/sudden crashes on my system. I have a fresh-ish install I did last week, and it is mostly functional besides this one issue.
Trying to run some diagnostics after a crash gives the following information:
[user@pc ~]$ journalctl -b -p 1
Apr 14 14:49:19 archlinux kernel: [Hardware Error]: System Fatal error.
Apr 14 14:49:19 archlinux kernel: [Hardware Error]: CPU:11 (17:71:0) MC5_STATUS[-|UE|MiscV|AddrV|PCC|TCC|SyndV|-|-|-]: 0xbea0000000000108
Apr 14 14:49:19 archlinux kernel: [Hardware Error]: Error Addr: 0x0001ffffc07f507c
Apr 14 14:49:19 archlinux kernel: [Hardware Error]: IPID: 0x000500b000000000, Syndrome: 0x000000004d000000
Apr 14 14:49:19 archlinux kernel: [Hardware Error]: Execution Unit Ext. Error Code: 0
Apr 14 14:49:19 archlinux kernel: [Hardware Error]: cache level: RESV, tx: GEN, mem-tx: GEN
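For anyone reading along, the bracketed flags in the MC5_STATUS line can be reproduced from the raw value. This is only a minimal sketch: the bit positions are taken from my reading of the Linux kernel header arch/x86/include/asm/mce.h, and the names `STATUS_BITS`, `decode_status` and `extended_error_code` are made up for this illustration.

```python
# Bit positions of the MCA status register, as I understand them from the
# Linux kernel header arch/x86/include/asm/mce.h (treat as an assumption).
STATUS_BITS = {
    "Val": 63,       # register holds a valid error
    "Over": 62,      # an earlier error was overwritten
    "UE": 61,        # uncorrected error
    "En": 60,        # error reporting was enabled
    "MiscV": 59,     # MISC register valid
    "AddrV": 58,     # address register valid
    "PCC": 57,       # processor context corrupt
    "TCC": 55,       # task context corrupt (AMD-specific)
    "SyndV": 53,     # syndrome register valid (AMD-specific)
    "Deferred": 44,  # deferred error (AMD-specific)
    "Poison": 43,    # poisoned data consumed (AMD-specific)
}


def decode_status(status: int) -> list[str]:
    """Return the names of all flags set in a 64-bit MCA status value."""
    return [name for name, bit in STATUS_BITS.items() if (status >> bit) & 1]


def extended_error_code(status: int) -> int:
    """Extract bits 21:16, which I assume hold the extended error code."""
    return (status >> 16) & 0x3F


status = 0xBEA0000000000108  # MC5_STATUS value from the log above
print(decode_status(status))
print(extended_error_code(status))
```

The set flags should line up with the UE, MiscV, AddrV, PCC, TCC and SyndV entries the kernel printed. UE together with PCC means the error was uncorrected and the processor context is considered corrupt, which fits the "System Fatal error" line.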
For context, I'm running: an ASUS X570-P motherboard, Ryzen 3700X CPU, Radeon 7800 XT GPU, Linux 6.18.22-1-lts kernel, GNOME DE on Wayland, and 32 GB of RAM. I've tried adding the "processor.max_cstate=5" argument to the GRUB cmdline and regenerating the config with grub-mkconfig, since I've read it can help; it didn't. I've also tried turning off DOCP, overclocking, Secure Boot, and power management options in the BIOS, and updated to the newest BIOS available right now.
But the issue persists, and CPU load seems to have no major correlation with the freezes: browsing the web has basically the same chance of a freeze as playing a somewhat intensive game, and sometimes just moving the cursor around on the desktop with no other windows open can crash it. I've tried pretty much every fix I could find by googling this issue.
Offline
https://wiki.archlinux.org/title/Ryzen#Troubleshooting
Adjust the PBO curve optimizer
Online
PBO curve optimizer is a feature added in Zen 3 CPUs; mine's Zen 2 and does not support it. I will try giving it a somewhat higher voltage though, to see if it's of any help.
Offline
Yeah, more voltage hasn't stabilised it at all.
Offline
Please post your complete system journal for the boot:
sudo journalctl -b | curl -s -H "Accept: application/json, */*" --upload-file - 'https://paste.c-net.org/'
But "Hardware Error" means exactly that.
Online
Sorry, that took a while. Here's what I got from the complete journal: https://paste.c-net.org/PrestigeAuthor
"Hardware Error means exactly that." Yeah, but couldn't it be a firmware or kernel side too? At least when I was on Windows I didn't have any issues with hardware.
Offline
couldn't it be a firmware or kernel side too?
Firmware in the sense of "UEFI", yes - typically the problem is that the PBO defaults are way too aggressive.
Apr 15 15:24:44 archlinux kernel: DMI: System manufacturer System Product Name/PRIME X570-P, BIOS 5044 01/04/2026
BIOS is very recent, though.
The kernel is software; it can *trigger* the problem (something will - if the CPU is just sitting on a shelf it won't crash… unless you suck at carpentry) but it isn't the cause of MCE errors.
when i was on windows i didn't have any issues with hardware
This is typically an aggravating problem, i.e. it gets worse over time, and also
Windows seems to run the CPUs at higher voltage and lower peak frequencies, compared to the stock linux kernel
which essentially translates to "because windows keeps the cpu busier/hotter, the possible PBO fluctuation is narrower" - you could typically still trigger this w/ corecycler which intentionally creates load patterns prone to trigger this.
processor.max_cstate=5 rcu_nocbs=0-15 loglevel=3 quiet idle=nomwait processor.max_cstate=5 rcu_nocbs=0-15 amdgpu.aspm=0
pcie_aspm=off amdgpu.runpm=0 amdgpu.dpm=0 amdgpu.bapm=0 processor.max_cstate=1
You can also try to constrain the power demands of the GPU, https://wiki.archlinux.org/title/AMDGPU … nce_levels
"processor.max_cstate=1" will prevent the CPU from idling and increase the power draw, but also heat it up so it might not try to boost as excessively
Sidebar, disable dhcpcd - it collides w/ NM's dhcp implementation.
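On the kernel parameters: the first parameter line quoted above repeats a couple of entries, so after editing the GRUB config it is worth checking what the kernel was actually booted with. A minimal sketch of such a check follows; the helper names `parse_cmdline` and `repeated` are invented for this example, and the naive whitespace split does not handle quoted values that contain spaces.

```python
def parse_cmdline(cmdline: str) -> dict[str, list[str]]:
    """Map each parameter name to every value it was given, in order."""
    params: dict[str, list[str]] = {}
    for token in cmdline.split():
        # "name=value" splits at the first "="; bare flags get an empty value.
        name, _, value = token.partition("=")
        params.setdefault(name, []).append(value)
    return params


def repeated(params: dict[str, list[str]]) -> dict[str, list[str]]:
    """Keep only the parameters that appear more than once."""
    return {name: values for name, values in params.items() if len(values) > 1}


# The first parameter line quoted above.
cmdline = (
    "processor.max_cstate=5 rcu_nocbs=0-15 loglevel=3 quiet idle=nomwait "
    "processor.max_cstate=5 rcu_nocbs=0-15 amdgpu.aspm=0"
)
print(repeated(parse_cmdline(cmdline)))
```

On the running system the same functions can be fed `open("/proc/cmdline").read()` instead of the pasted string, which also shows whether a changed value such as processor.max_cstate=1 really made it into the boot.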
Online
Adding those amdgpu lines to my grub would cause it not to boot, weirdly. I changed cstate 5 to 1, and I ran the command linked in that article to set the GPU power level to low, though. I should look into corecycler, I guess. And thanks for reminding me to disable dhcpcd.
Offline
Adding those amdgpu lines to my grub would cause it not to boot weirdly
Remove amdgpu.dpm=0
Online
Okay, I added the other ones without the amdgpu.dpm=0 parameter. After 15 mins on the desktop it crashed without me doing anything; I just let it sit on the desktop and it crashed on its own.
Offline
gnome certainly did something
This isn't I/O driven and keeping constant load on the CPU might actually help (heating it up, preventing the boost of individual cores)
Did you try to enforce one of the lower amdgpu performance levels or manual profiles?
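For reference, forcing a performance level comes down to writing one word into the `power_dpm_force_performance_level` file in the GPU's sysfs device directory. A rough sketch, assuming the usual set of level names accepted by the amdgpu driver; the function names are made up here, and the device directory (often /sys/class/drm/card0/device or card1) has to be checked on the actual system.

```python
from pathlib import Path

# Level names the amdgpu driver accepts, to the best of my knowledge.
VALID_LEVELS = {
    "auto", "low", "high", "manual",
    "profile_standard", "profile_min_sclk", "profile_min_mclk", "profile_peak",
}

LEVEL_FILE = "power_dpm_force_performance_level"


def read_level(device_dir: str) -> str:
    """Return the currently forced performance level of the GPU."""
    return (Path(device_dir) / LEVEL_FILE).read_text().strip()


def set_level(device_dir: str, level: str) -> None:
    """Write a new performance level; needs root on a real sysfs path."""
    if level not in VALID_LEVELS:
        raise ValueError(f"unknown performance level: {level!r}")
    (Path(device_dir) / LEVEL_FILE).write_text(level + "\n")
```

Something like `set_level("/sys/class/drm/card0/device", "low")` run as root would be the equivalent of the echo command from the wiki article; the setting does not survive a reboot.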
Online
I tried the lower one.
"constant load on the CPU might actually help " funnily enough, based on purely anecdotal and limited data, for me playing video games seemed more stable than browsing the internet or doing random tasks on the desktop.
Offline
You can try to tune the https://wiki.archlinux.org/title/CPU_fr … amd_pstate scheduler or the behavior w/ acpi_cpufreq and there's https://archlinux.org/packages/extra/x86_64/corectrl/
All mitigation boils down to keeping the core temperatures slightly higher to flatten the performance curve.
And just to prevent a complete d'oohh - you do have https://wiki.archlinux.org/title/Microcode (amd-ucode, family: 0x17, model: 0x71, stepping: 0x0 is covered by microcode_amd_fam17h.bin)?
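One way to double-check the family/model/stepping and see which microcode revision is running is to read /proc/cpuinfo. A small sketch under assumptions: `cpu_identity` is a name invented for this example, and the sample text is a hand-written placeholder (the microcode value in particular is not a real revision), so only the field names reflect what the kernel prints.

```python
def cpu_identity(cpuinfo_text: str) -> dict[str, str]:
    """Return family, model, stepping and microcode of the first CPU listed."""
    wanted = {"cpu family", "model", "stepping", "microcode"}
    found: dict[str, str] = {}
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        # Exact key match, so "model name" is not mistaken for "model".
        if sep and key in wanted and key not in found:
            found[key] = value.strip()
    return found


# Placeholder sample; on a real system use open("/proc/cpuinfo").read().
SAMPLE = """\
processor\t: 0
cpu family\t: 23
model\t\t: 113
model name\t: placeholder CPU name
stepping\t: 0
microcode\t: 0x0
"""

info = cpu_identity(SAMPLE)
print(
    f"family {int(info['cpu family']):#x}, model {int(info['model']):#x}, "
    f"stepping {int(info['stepping']):#x}, microcode {info['microcode']}"
)
```

/proc/cpuinfo reports family, model and stepping in decimal, so 23 and 113 correspond to the 0x17 and 0x71 mentioned above. Comparing the microcode field with and without amd-ucode in the boot setup would show whether the update is being applied.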
Online
>You can try to tune the https://wiki.archlinux.org/title/CPU_fr … amd_pstate scheduler
There are a few kernel parameters in that article; I'm not sure which is the "proper one", but I'm trying active for now.
>And just to prevent a complete d'oohh - you do have https://wiki.archlinux.org/title/Microcode (amd-ucode, family: 0x17, model: 0x71, stepping: 0x0 is covered by microcode_amd_fam17h.bin)?
To my knowledge, I think so? I have installed the microcode and then ran grub-mkconfig. However, I must admit the microcode article has been a headache to read through, so I wouldn't say the chance of me messing something up is exactly 0 either.
Offline