#1 2024-01-04 18:13:50

italoghost
Member
Registered: 2024-01-04
Posts: 8

Random freezes that crashes my computer

Hi!

Every now and then my computer freezes and crashes, demanding a hard reboot.

Every time this happens, my GPU starts to blink a red light on its RGB, and if something with audio is playing, the audio starts to crackle while the computer is freezing.

This only happens when I am doing things with low CPU and GPU usage, like updating my system via the terminal, listening to music or browsing the web. While playing games this has never happened.

This has been happening for some years now (with different operating systems) and I really do not know what causes it. Maybe it is faulty RAM?

Could you help me track down the root cause of the problem?

The logs from the last boot, before the crash, are here:

https://0x0.st/H6Jo.txt

Last edited by italoghost (2024-01-05 00:38:52)

Offline

#2 2024-01-05 18:57:20

V1del
Forum Moderator
Registered: 2012-10-16
Posts: 21,788

Re: Random freezes that crashes my computer

There's little suggesting a system crash in that journal, which would fit if it's indeed dying due to RAM/disk issues.

First things first: is this a CachyOS or an Arch install? Can you reproduce this on the normal Arch kernel?

Try running a memtest overnight: https://wiki.archlinux.org/title/Stress … MemTest86+

Online

#3 2024-01-05 20:49:13

seth
Member
Registered: 2012-09-03
Posts: 51,684

Re: Random freezes that crashes my computer

This only happens when I am doing things with low CPU and GPU usage

https://wiki.archlinux.org/title/Ryzen# … k_freezing
Try to limit the C-states.

demanding a hard reboot

https://wiki.archlinux.org/title/Keyboa … el_(SysRq)
Otherwise the journal is worthless.

Offline

#4 2024-01-06 19:24:29

italoghost
Member
Registered: 2024-01-04
Posts: 8

Re: Random freezes that crashes my computer

V1del wrote:

There's little suggesting a system crash in that journal, which would fit if it's indeed dying due to RAM/disk issues.

First things first: is this a CachyOS or an Arch install? Can you reproduce this on the normal Arch kernel?

Try running a memtest overnight: https://wiki.archlinux.org/title/Stress … MemTest86+

So, it is an Arch Linux install! I am just using the CachyOS kernel. It has been happening throughout the years with different kernels and distros (even on Windows).

I have done the memory test and it had no errors, so it is not the RAM.

Last edited by italoghost (2024-01-06 19:25:43)

Offline

#5 2024-01-06 19:34:15

italoghost
Member
Registered: 2024-01-04
Posts: 8

Re: Random freezes that crashes my computer

seth wrote:

This only happens when I am doing things with low CPU and GPU usage

https://wiki.archlinux.org/title/Ryzen# … k_freezing
Try to limit the C-states.

demanding a hard reboot

https://wiki.archlinux.org/title/Keyboa … el_(SysRq)
Otherwise the journal is worthless.

Hi! My problem is exactly the one reported on Bugzilla. I have tried the first proposed solution, by adding the output of the command

echo rcu_nocbs=0-$(($(nproc)-1))

as a kernel parameter. I will wait some time to see if the problem is resolved. If it is not, I will try the other solutions.
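To illustrate what that command produces, here is a minimal sketch. The helper name `rcu_nocbs_param` is made up for this example and takes the CPU count as an argument so the result can be checked without running it on the affected machine.

```shell
#!/bin/sh
# Hypothetical helper: build the rcu_nocbs kernel parameter for a CPU count.
# Usage: rcu_nocbs_param <cpu_count>
rcu_nocbs_param() {
    count="$1"
    # CPUs are numbered from 0, so the last index is count - 1.
    echo "rcu_nocbs=0-$((count - 1))"
}

# On the real machine, pass the CPU count reported by nproc:
# rcu_nocbs_param "$(nproc)"
```

With 16 CPUs this prints `rcu_nocbs=0-15`. The string then has to be appended to the kernel command line in the boot loader configuration; which file that is depends on the boot loader, which the thread does not state.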

I will enable the SysRq key in the kernel in case this happens again!

Should I keep this thread open?

Last edited by italoghost (2024-01-06 19:37:28)

Offline

#6 2024-01-07 17:21:28

italoghost
Member
Registered: 2024-01-04
Posts: 8

Re: Random freezes that crashes my computer

The first solution didn't work. I added the

processor.max_cstate=5

kernel parameter and enabled the SysRq key to safely reboot. I will keep watching to see if it happens again.
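After a reboot it can be worth confirming that the parameter actually reached the kernel. The following is only a sketch: `cmdline_has_param` is a hypothetical helper, and it reads `/proc/cmdline` when no command line is passed explicitly.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the kernel command line contains the
# exact parameter. Usage: cmdline_has_param <param> [cmdline]
cmdline_has_param() {
    param="$1"
    # Fall back to the running kernel's command line if none is given.
    cmdline="${2:-$(cat /proc/cmdline)}"
    for word in $cmdline; do
        [ "$word" = "$param" ] && return 0
    done
    return 1
}

# On the running system:
# cmdline_has_param "processor.max_cstate=5" && echo "parameter is active"
```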

Offline

#7 2024-01-07 18:02:34

seth
Member
Registered: 2012-09-03
Posts: 51,684

Re: Random freezes that crashes my computer

Be more aggressive and try "processor.max_cstate=1" to establish a functional baseline.
If that doesn't work, it's not the C-States.

Offline

#8 2024-01-09 11:39:47

italoghost
Member
Registered: 2024-01-04
Posts: 8

Re: Random freezes that crashes my computer

It has happened again, and the SysRq key (Alt + PrintScreen + b) didn't work when this happened. I will try setting `processor.max_cstate=1`.
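One possible reason for the SysRq combination doing nothing is that the `kernel.sysrq` setting does not permit the reboot function; a hard hang that no longer services the keyboard would look the same, so this is only one thing to rule out. The thread does not say which value was configured. A minimal sketch of the check, with a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: succeed if a kernel.sysrq value permits the reboot
# function (Alt+SysRq+b). A value of 1 enables every function; any other
# value is a bitmask in which bit 128 covers reboot/poweroff.
sysrq_allows_reboot() {
    value="$1"
    [ "$value" -eq 1 ] && return 0
    [ $((value & 128)) -ne 0 ]
}

# Current setting on a running system:
# sysrq_allows_reboot "$(cat /proc/sys/kernel/sysrq)" && echo "reboot allowed"
```

For example, a value of 16 only allows the sync function, so Alt + PrintScreen + b would be ignored even on a system that is still responsive.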

Offline

#9 2024-01-24 20:00:55

italoghost
Member
Registered: 2024-01-04
Posts: 8

Re: Random freezes that crashes my computer

I have tested all the options mentioned on the ArchWiki and the error still occurred. Even disabling C-states via the Global C-State Control option in the BIOS didn't do the trick, as my PC crashed today after some days without it happening.

I will assume that it is a problem with my hardware.

Anyway, I would like to thank you guys for the support!

Last edited by italoghost (2024-01-24 20:01:16)

Offline

#10 2024-01-24 20:44:39

seth
Member
Registered: 2012-09-03
Posts: 51,684

Re: Random freezes that crashes my computer

I have done the memory test and it had no errors, so it is not the RAM.

NB: useful memtest86+ runs are measured in days; one cycle hardly tells you anything.

If you can, downclock the RAM or choose the most conservative timings and see whether the system stabilizes.

Offline

#11 2024-04-07 23:23:14

italoghost
Member
Registered: 2024-01-04
Posts: 8

Re: Random freezes that crashes my computer

Hi! Sorry for necroposting my own issue, but I was finally able to reboot with the SysRq key. It seems to be an error with my GPU:

abr 07 19:34:12 pc-qi kernel: [UFW BLOCK] IN=enp34s0 OUT= MAC=01:00:5e:00:00:01:c0:3d:d9:b5:08:c0:08:00 SRC=192.168.15.1 DST=224.0.0.1 LEN=32 TOS=0x00 PREC=0x00 TTL=1 ID=0 DF PROTO=2 
abr 07 19:34:14 pc-qi kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=947677, emitted seq=947678
abr 07 19:34:14 pc-qi kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process gnome-shell pid 12793 thread gnome-shel:cs0 pid 12829
abr 07 19:34:14 pc-qi kernel: amdgpu 0000:26:00.0: amdgpu: GPU reset begin!
abr 07 19:34:14 pc-qi kernel: amdgpu 0000:26:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_0.2.1.0 test failed (-110)
abr 07 19:34:14 pc-qi kernel: [drm:gfx_v8_0_hw_fini [amdgpu]] *ERROR* KCQ disable failed
abr 07 19:34:15 pc-qi kernel: amdgpu: cp is busy, skip halt cp
abr 07 19:34:15 pc-qi kernel: amdgpu: rlc is busy, skip halt rlc
abr 07 19:34:15 pc-qi kernel: amdgpu 0000:26:00.0: amdgpu: BACO reset
abr 07 19:34:15 pc-qi kernel: amdgpu 0000:26:00.0: amdgpu: GPU reset succeeded, trying to resume
abr 07 19:34:15 pc-qi kernel: [drm] PCIE GART of 256M enabled (table at 0x000000F400380000).
abr 07 19:34:15 pc-qi kernel: [drm] VRAM is lost due to GPU reset!
abr 07 19:34:16 pc-qi kernel: [UFW ALLOW] IN= OUT=enp34s0 SRC=fe80:0000:0000:0000:7c5a:de9e:f089:a36a DST=ff12:0000:0000:0000:0000:0000:0000:8384 LEN=319 TC=0 HOPLIMIT=1 FLOWLBL=1035683 PROTO=UDP SPT=48298 DPT=21027 LEN=279 
abr 07 19:34:16 pc-qi kernel: [UFW ALLOW] IN= OUT=enp34s0 SRC=192.168.15.93 DST=192.168.15.255 LEN=299 TOS=0x00 PREC=0x00 TTL=64 ID=60133 DF PROTO=UDP SPT=43989 DPT=21027 LEN=279 
abr 07 19:34:25 pc-qi kernel: amdgpu 0000:26:00.0: amdgpu: 
                              last message was failed ret is 0
abr 07 19:34:25 pc-qi kernel: amdgpu: SMU Firmware start failed!
abr 07 19:34:25 pc-qi kernel: amdgpu: Failed to load SMU ucode.
abr 07 19:34:25 pc-qi kernel: amdgpu: fw load failed

I couldn't find anything conclusive about this error:

[drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout

Do you have any idea?

Here is the full log: https://0x0.st/XiWN.txt
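For anyone reading a journal like this, the GPU messages are easier to follow once the firewall lines are filtered out. This is a rough sketch; `filter_gpu_errors` is a made-up helper that reads log text on stdin, and the pattern is just a guess at what is relevant here.

```shell
#!/bin/sh
# Hypothetical helper: keep only log lines that mention amdgpu or come
# from the drm subsystem. Reads log text on stdin.
filter_gpu_errors() {
    grep -E 'amdgpu|\[drm'
}

# Kernel messages from the previous boot (the one that froze):
# journalctl -k -b -1 | filter_gpu_errors
```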

Offline

#12 2024-04-08 07:12:29

seth
Member
Registered: 2012-09-03
Posts: 51,684

Re: Random freezes that crashes my computer

abr 07 19:34:14 pc-qi kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process gnome-shell pid 12793 thread gnome-shel:cs0 pid 12829

gnome-shell does something™ that triggers a GPU reset and that fails to load the firmware.

And it happened right after you installed and ran lutris.

abr 07 19:31:22 pc-qi sudo[45644]:    ghost : TTY=pts/0 ; PWD=/home/ghost ; USER=root ; COMMAND=/usr/bin/pacman --upgrade --noconfirm -- /home/ghost/.cache/paru/clone/lutris-git/lutris-git-0.5.16.r532.g3866cf21b-1-any.pkg.tar.zst
abr 07 19:31:27 pc-qi gnome-shell[12793]: Object .Gjs_ui_dateMenu_EventsSection (0x59618bdfc020), has been already disposed — impossible to set any property on it. This might be caused by the object having been destroyed from C code using something such as destroy(), dispose(), or remove() vfuncs.
abr 07 19:31:27 pc-qi gnome-shell[12793]: Object .Gjs_ui_dateMenu_EventsSection (0x59618bdfc020), has been already disposed — impossible to set any property on it. This might be caused by the object having been destroyed from C code using something such as destroy(), dispose(), or remove() vfuncs.
abr 07 19:31:27 pc-qi gnome-shell[12793]: Object .Gjs_ui_dateMenu_WorldClocksSection (0x59618be042b0), has been already disposed — impossible to set any property on it. This might be caused by the object having been destroyed from C code using something such as destroy(), dispose(), or remove() vfuncs.
abr 07 19:32:22 pc-qi systemd[1293]: Started Application launched by gnome-shell.
abr 07 19:32:24 pc-qi net.lutris.Lutris.desktop[51777]: 2024-04-07 19:32:24,340: Starting Lutris 0.5.17
abr 07 19:32:24 pc-qi net.lutris.Lutris.desktop[51777]: 2024-04-07 19:32:24,483: AMD Radeon RX 570 Series (1002:67df 1458:2310 amdgpu) Driver 24.0.4
abr 07 19:32:25 pc-qi gnome-shell[12793]: Ignoring length property that isn't a number at line 4, col 20
abr 07 19:32:28 pc-qi gnome-shell[12793]: meta_window_set_stack_position_no_sync: assertion 'window->stack_position >= 0' failed
abr 07 19:32:28 pc-qi gnome-shell[12793]: meta_window_set_stack_position_no_sync: assertion 'window->stack_position >= 0' failed
abr 07 19:32:28 pc-qi gnome-shell[12793]: meta_window_set_stack_position_no_sync: assertion 'window->stack_position >= 0' failed
abr 07 19:32:32 pc-qi gnome-shell[12793]: meta_window_set_stack_position_no_sync: assertion 'window->stack_position >= 0' failed
abr 07 19:32:32 pc-qi gnome-shell[12793]: meta_window_set_stack_position_no_sync: assertion 'window->stack_position >= 0' failed
abr 07 19:32:32 pc-qi gnome-shell[12793]: meta_window_set_stack_position_no_sync: assertion 'window->stack_position >= 0' failed
abr 07 19:32:41 pc-qi gnome-shell[12793]: meta_window_set_stack_position_no_sync: assertion 'window->stack_position >= 0' failed
abr 07 19:32:41 pc-qi gnome-shell[12793]: meta_window_set_stack_position_no_sync: assertion 'window->stack_position >= 0' failed
abr 07 19:32:41 pc-qi gnome-shell[12793]: meta_window_set_stack_position_no_sync: assertion 'window->stack_position >= 0' failed
abr 07 19:32:44 pc-qi gnome-shell[12793]: meta_window_set_stack_position_no_sync: assertion 'window->stack_position >= 0' failed
abr 07 19:32:44 pc-qi gnome-shell[12793]: meta_window_set_stack_position_no_sync: assertion 'window->stack_position >= 0' failed
abr 07 19:32:44 pc-qi gnome-shell[12793]: meta_window_set_stack_position_no_sync: assertion 'window->stack_position >= 0' failed
abr 07 19:33:00 pc-qi gnome-shell[12793]: meta_window_set_stack_position_no_sync: assertion 'window->stack_position >= 0' failed
abr 07 19:33:00 pc-qi gnome-shell[12793]: meta_window_set_stack_position_no_sync: assertion 'window->stack_position >= 0' failed
abr 07 19:33:00 pc-qi gnome-shell[12793]: meta_window_set_stack_position_no_sync: assertion 'window->stack_position >= 0' failed
abr 07 19:33:02 pc-qi gnome-shell[12793]: meta_window_set_stack_position_no_sync: assertion 'window->stack_position >= 0' failed
abr 07 19:33:02 pc-qi gnome-shell[12793]: meta_window_set_stack_position_no_sync: assertion 'window->stack_position >= 0' failed
abr 07 19:33:02 pc-qi gnome-shell[12793]: meta_window_set_stack_position_no_sync: assertion 'window->stack_position >= 0' failed
abr 07 19:33:09 pc-qi net.lutris.Lutris.desktop[51777]: 2024-04-07 19:33:09,900: Creating new configuration with runner wine
abr 07 19:33:10 pc-qi net.lutris.Lutris.desktop[51777]: 2024-04-07 19:33:10,625: Accessing game config while runner wasn't given one.
abr 07 19:33:10 pc-qi net.lutris.Lutris.desktop[51777]: 2024-04-07 19:33:10,625: Accessing game config while runner wasn't given one.
abr 07 19:33:10 pc-qi net.lutris.Lutris.desktop[51777]: 2024-04-07 19:33:10,625: The game doesn't have an executable
abr 07 19:33:10 pc-qi net.lutris.Lutris.desktop[51777]: 2024-04-07 19:33:10,629: Accessing game config while runner wasn't given one.
abr 07 19:33:10 pc-qi net.lutris.Lutris.desktop[51777]: 2024-04-07 19:33:10,629: Accessing game config while runner wasn't given one.
abr 07 19:33:10 pc-qi net.lutris.Lutris.desktop[51777]: 2024-04-07 19:33:10,630: The game doesn't have an executable
abr 07 19:33:26 pc-qi gnome-shell[12793]: meta_window_set_stack_position_no_sync: assertion 'window->stack_position >= 0' failed
abr 07 19:33:26 pc-qi gnome-shell[12793]: meta_window_set_stack_position_no_sync: assertion 'window->stack_position >= 0' failed
abr 07 19:33:26 pc-qi gnome-shell[12793]: meta_window_set_stack_position_no_sync: assertion 'window->stack_position >= 0' failed
abr 07 19:34:14 pc-qi kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=947677, emitted seq=947678
abr 07 19:34:14 pc-qi kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process gnome-shell pid 12793 thread gnome-shel:cs0 pid 12829
abr 07 19:34:14 pc-qi kernel: amdgpu 0000:26:00.0: amdgpu: GPU reset begin!
abr 07 19:34:14 pc-qi kernel: amdgpu 0000:26:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_0.2.1.0 test failed (-110)
abr 07 19:34:14 pc-qi kernel: [drm:gfx_v8_0_hw_fini [amdgpu]] *ERROR* KCQ disable failed
abr 07 19:34:15 pc-qi kernel: amdgpu: cp is busy, skip halt cp
abr 07 19:34:15 pc-qi kernel: amdgpu: rlc is busy, skip halt rlc
abr 07 19:34:15 pc-qi kernel: amdgpu 0000:26:00.0: amdgpu: BACO reset
abr 07 19:34:15 pc-qi kernel: amdgpu 0000:26:00.0: amdgpu: GPU reset succeeded, trying to resume
abr 07 19:34:25 pc-qi kernel: amdgpu 0000:26:00.0: amdgpu: 
abr 07 19:34:25 pc-qi kernel: amdgpu: SMU Firmware start failed!
abr 07 19:34:25 pc-qi kernel: amdgpu: Failed to load SMU ucode.
abr 07 19:34:25 pc-qi kernel: amdgpu: fw load failed

The game has very likely increased the power demand; either the GPU is systematically underpowered or the failure happens during the power-state transition.
Does locking in the dpm performance level stabilize the system? See https://wiki.archlinux.org/title/AMDGPU … cy_problem
(if you lock it into "high" and the system starts to crash all the time, the GPU is probably underpowered)
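A minimal sketch of locking the level through sysfs follows. The helper name `set_dpm_level` is hypothetical, and the default path assumes the RX 570 is `card0`, which may not hold on every system.

```shell
#!/bin/sh
# Hypothetical helper: write a DPM performance level to the amdgpu sysfs
# file. The default path assumes the GPU is card0 (an assumption).
# Usage: set_dpm_level <auto|low|high> [sysfs_file]
set_dpm_level() {
    level="$1"
    file="${2:-/sys/class/drm/card0/device/power_dpm_force_performance_level}"
    case "$level" in
        auto|low|high) echo "$level" > "$file" ;;
        *) echo "unsupported level: $level" >&2; return 1 ;;
    esac
}

# As root, lock the GPU to its highest level:
# set_dpm_level high
```

Written this way the setting does not survive a reboot, so the system returns to the default behaviour after the next restart.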

Offline

#13 2024-04-08 13:09:52

italoghost
Member
Registered: 2024-01-04
Posts: 8

Re: Random freezes that crashes my computer

That is interesting, as I didn't launch any game within Lutris. I was just looking for the "umu-launcher" option. Just browsing the program already triggered a GPU reset.

I will lock the GPU to "High" to see if it crashes, but I think that it is very unlikely, as it just crashes when I am doing things that are not heavy on the GPU or CPU (it has never crashed playing games, for example).

Offline
