#1 2021-11-14 12:42:30

Yannik_Sc
Member
Registered: 2017-03-04
Posts: 3

PC resetting when loading GPU & CPU at the same time

Hello people,

Since last weekend (06.11.2021) I have been getting more or less random crashes whenever the GPU and CPU are under load at the same time. The screen goes blank and the PC reboots.

Unfortunately, no logs of any kind are generated. Both dmesg and journalctl simply stop producing output (whether I watch them over SSH or pipe them to a file on disk).

What I tried

It does not seem to be related to thermals, as all my hardware stays relatively cool:
~70 °C for the GPU (from nvidia-smi)
~60 °C for the CPU (from lm_sensors)
~40 °C for the PSU (from Corsair Link)

The PSU's power output is ~500-550 W (read out through liquidctl / Corsair Link), so well within its rating.
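Since the journal stops without a final message, one option is to log sensor readings to disk yourself and force every sample out with fsync, so the last values before the reset survive. Below is a minimal Python sketch under some assumptions: the log path `gpu-samples.log`, the one-second interval and the helper names are made up for illustration, and it only covers the GPU fields that nvidia-smi reports (the PSU readout via liquidctl is not included).

```python
import os
import subprocess
import time

# Real nvidia-smi options; the chosen fields are an assumption about what
# is useful here. Some GPUs report "[N/A]" for power.draw, which this
# sketch does not handle.
NVIDIA_SMI_CMD = [
    "nvidia-smi",
    "--query-gpu=temperature.gpu,power.draw",
    "--format=csv,noheader,nounits",
]


def parse_gpu_sample(line):
    """Turn one CSV line such as '70, 245.31' into (temp_c, power_w)."""
    temp, power = (field.strip() for field in line.split(","))
    return float(temp), float(power)


def append_durably(path, text):
    """Append text and fsync it so it has a chance to survive a hard reset."""
    with open(path, "a", encoding="utf-8") as log:
        log.write(text)
        log.flush()
        os.fsync(log.fileno())


def log_forever(path="gpu-samples.log", interval_s=1.0):
    """Poll nvidia-smi and record each sample until the machine resets."""
    while True:
        out = subprocess.run(
            NVIDIA_SMI_CMD, capture_output=True, text=True, check=True
        ).stdout
        temp_c, power_w = parse_gpu_sample(out.splitlines()[0])
        append_durably(path, f"{time.time():.1f} {temp_c:.0f}C {power_w:.1f}W\n")
        time.sleep(interval_s)
```

Calling `log_forever()` in a second terminal before starting the stress test should leave the last readings in the file after the reboot. Normal-looking final samples would not rule out the PSU, though: a sudden cut-off can happen much faster than a one-second poll can catch.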

Funnily enough, when I tried to reproduce it on Windows the problem did not appear, although power usage and temperatures stayed the same, so I suspect a software error.

I have also tried downgrading the kernel (although there was no kernel update during this time), without success.

I also switched to a different DE (from GNOME to i3), with no success either.

To stress-test the components and trigger the crash, I used phoronix-test-suite with
pts/tachyon for the CPU
pts/unigine-heaven for the GPU
but games trigger it just as well.

My Hardware

CPU: Ryzen 9 5900X
GPU: NVIDIA GeForce GTX 1080 Ti
PSU: Corsair RM850i
RAM: G.Skill Trident Z @ 3600 MHz

My Software

Kernel: 5.15.2-arch1-1 (tried with 5.14.16 and zen versions)
GPU Driver: nvidia-dkms 495.44-7 (tried downgrade to 470.74-1 - not working)
DE: GNOME on Xorg

I'm not expecting a solution, but I definitely need help debugging this.
I'd be happy to hear any ideas!

#2 2021-11-14 13:27:50

Wild Penguin
Member
Registered: 2015-03-19
Posts: 399

Re: PC resetting when loading GPU & CPU at the same time

Hi,

Are you sure there is nothing relevant in the logs? Try:

journalctl -b-1

Run this on the next boot after a crash; it prints the whole log of the previous boot. Also have a look at the journal's boot entries with --list-boots and try different values for -b, in case you are not already familiar with them.
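To make the -b options a bit more concrete, here is a small Python sketch that builds journalctl command lines for a given boot. The function names and defaults are made up for illustration; the options themselves (`--boot=`, `-p`, `-k`, `-n`, `--no-pager`) are regular journalctl options.

```python
import subprocess


def journal_cmd(boot_offset=-1, priority=None, kernel_only=False):
    """Build a journalctl command line for one boot.

    boot_offset follows journalctl's convention: 0 is the current boot,
    -1 the one before it, and so on (--boot=-1 is the long form of -b-1).
    """
    cmd = ["journalctl", "--no-pager", f"--boot={boot_offset}"]
    if priority is not None:
        cmd += ["-p", priority]  # e.g. "err" or "warning"
    if kernel_only:
        cmd.append("-k")  # kernel messages only, similar to dmesg
    return cmd


def tail_of_boot(boot_offset=-1, lines=50):
    """Return the last lines journald recorded for the given boot."""
    cmd = journal_cmd(boot_offset) + ["-n", str(lines)]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
```

For example, `print(tail_of_boot(-1))` should show the final entries written before the reset, and `journal_cmd(-1, priority="err", kernel_only=True)` narrows the previous boot down to kernel errors. With a hard power loss the last few messages may never reach the disk, so an empty tail does not prove that nothing was logged.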

As a first long shot, also post the output of lsmod.

I also noticed weird instabilities around the same time, though not random resets. Curiously, I had an out-of-tree module called corefreqk loaded (installed around the time the instabilities surfaced), and since removing it all instabilities have been gone (for around 6 days so far). I'm not sure corefreqk is the culprit, though. Re-inserting it and testing is on my to-do list, but I haven't had time yet; I've needed to do actual stuff on my computer ;-). It could still be a hairline fracture or something similar on my motherboard (those can come and go and cause the symptoms I've had)!
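Besides reading lsmod, a quick way to see whether any out-of-tree or proprietary module is loaded is the kernel's taint value in /proc/sys/kernel/tainted. The Python sketch below decodes a subset of the taint bits; the helper names are made up, and the table only lists the flags that seem relevant to this thread, not all of them.

```python
# Subset of the kernel taint bits: bit number -> (letter, meaning).
TAINT_FLAGS = {
    0: ("P", "proprietary module was loaded"),
    4: ("M", "processor reported a machine check exception"),
    7: ("D", "kernel oops or BUG occurred"),
    9: ("W", "kernel issued a warning"),
    12: ("O", "out-of-tree module was loaded"),
    13: ("E", "unsigned module was loaded"),
}


def decode_taint(value):
    """Return (letter, meaning) for every known taint bit set in value."""
    return [
        TAINT_FLAGS[bit] for bit in sorted(TAINT_FLAGS) if value & (1 << bit)
    ]


def read_taint(path="/proc/sys/kernel/tainted"):
    """Read the taint value of the currently running kernel."""
    with open(path, encoding="ascii") as handle:
        return int(handle.read().strip())
```

Running `decode_taint(read_taint())` on a system with an out-of-tree or proprietary driver (nvidia-dkms, for instance) would be expected to list at least the O flag. The value only describes the running kernel and is reset on reboot, so it says which kinds of modules are loaded now, not what caused an earlier crash.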

EDIT: Not knowing better, and just hearing what you've described, my first suspect would be a faulty PSU. But it could be caused by many, many things!

EDIT: My system is: CPU: Ryzen 9 5950X / GPU: RX Vega 64 / MB: MSI Tomahawk B550.

EDIT: Module is NOT cpufreqk but corefreqk!

Last edited by Wild Penguin (2021-11-14 21:48:28)

#3 2021-11-14 15:10:40

Yannik_Sc
Member
Registered: 2017-03-04
Posts: 3

Re: PC resetting when loading GPU & CPU at the same time

Thanks for your reply.

My first suspicion was also a faulty PSU, but that wouldn't explain why it doesn't happen on Windows as well. However, I will try to borrow one from work next week and check it out.

For now, I have collected the output of lsmod and journalctl -b-1.
Nothing in them looks out of the ordinary to me.

I also remembered that I'm actually running Garuda and not plain Arch, so I guess I should rather use the Garuda forum.

Edit: Never mind; it's surely the PSU. It also happens on Windows, it just takes longer.

Last edited by Yannik_Sc (2021-11-14 15:29:37)
