Pages: 1
About 1-2 weeks ago my desktop computer started to randomly restart and sometimes (more rarely) freeze. I do upgrade from time to time. At first it happened about once every 2-3 days, but now it has gotten worse.
the last thing in journalctl before restart is something like:
Nov 05 18:29:31 arc NetworkManager[581]: <info> [1667669371.0430] device (wlo1): set-hw-addr: set MAC address to 4E:27:A8:8B:25:71 (scanning)
Nov 05 18:29:31 arc NetworkManager[581]: <info> [1667669371.0452] device (wlo1): supplicant interface state: disconnected -> inactive
Nov 05 18:29:31 arc NetworkManager[581]: <info> [1667669371.0453] device (p2p-dev-wlo1): supplicant management interface state: disconnected -> inactive
does that mean anything? if nothing could be seen in journalctl -- could it mean that something is wrong with the hardware or power?
the full dmesg log for the boot which ended with such restart: http://0x0.st/oE8L.txt
Last edited by shlyapa (2022-11-05 19:41:49)
Offline
Please post the complete journal. NetworkManager setting a network interface to 'inactive' isn't going to shut down your system.
Offline
Please post the complete journal.
http://0x0.st/oETA.txt -- I only removed the lines from my periodic mbsync check from it; that check was the last thing in the log:
Nov 05 18:30:51.738840 arc mbsync[244176]: Loading near side box...
Nov 05 18:30:51.739006 arc mbsync[244176]: near side: 129 messages, 0 recent
Nov 05 18:30:51.739006 arc mbsync[244176]: Synchronizing...
Nov 05 18:30:51.739091 arc mbsync[244176]: Opening far side box Notes...
Nov 05 18:30:51.739091 arc mbsync[244176]: Opening near side box Notes...
Nov 05 18:30:51.779191 arc mbsync[244176]: Loading far side box...
Nov 05 18:30:51.779191 arc mbsync[244176]: far side: 0 messages, 0 recent
Nov 05 18:30:51.779191 arc mbsync[244176]: Loading near side box...
Nov 05 18:30:51.779407 arc mbsync[244176]: near side: 0 messages, 0 recent
Nov 05 18:30:51.779407 arc mbsync[244176]: Synchronizing...
Nov 05 18:30:51.780891 arc mbsync[244176]: Opening far side box [Gmail]/All Mail...
Nov 05 18:30:51.780891 arc mbsync[244176]: Opening near side box [Gmail]/All Mail.>
Nov 05 18:30:51.830473 arc mbsync[244176]: Loading far side box...
Nov 05 18:30:51.830473 arc mbsync[244176]: Loading near side box...
Nov 05 18:30:51.833494 arc mbsync[244176]: near side: 3925 messages, 0 recent
Nov 05 18:30:52.264646 arc mbsync[244176]: far side: 3887 messages, 0 recent
Nov 05 18:30:52.264998 arc mbsync[244176]: Synchronizing...
Nov 05 18:30:52.266450 arc mbsync[244176]: Opening far side box [Gmail]/Sent Mail.>
Nov 05 18:30:52.266450 arc mbsync[244176]: Opening near side box [Gmail]/Sent Mail>
Nov 05 18:30:52.311357 arc mbsync[244176]: Loading far side box...
Nov 05 18:30:52.311357 arc mbsync[244176]: Loading near side box...
Nov 05 18:30:52.312350 arc mbsync[244176]: near side: 1843 messages, 0 recent
Nov 05 18:30:52.619494 arc mbsync[244176]: far side: 1718 messages, 0 recent
Nov 05 18:30:52.619494 arc mbsync[244176]: Synchronizing...
Nov 05 18:30:52.620026 arc mbsync[244176]: Opening far side box [Gmail]/Spam...
Nov 05 18:30:52.620026 arc mbsync[244176]: Opening near side box [Gmail]/Spam...
Nov 05 18:30:52.660704 arc mbsync[244176]: Loading far side box...
Nov 05 18:30:52.660704 arc mbsync[244176]: Loading near side box...
Nov 05 18:30:52.660853 arc mbsync[244176]: near side: 67 messages, 0 recent
Offline
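A minimal sketch of the filtering described in the post above: drop the lines of one noisy process from journal text and keep only the tail. The function name `filter_tail` is hypothetical, and the `journalctl` invocation in the comment assumes a systemd machine where the crashed session is the previous boot.

```shell
#!/bin/sh
# Drop all lines logged by one process (e.g. mbsync) and keep the last N lines.
# Reads journal text on stdin.
#   $1 = process name as it appears before "[PID]"
#   $2 = number of lines to keep (default 20)
filter_tail() {
    grep -v " $1\[" | tail -n "${2:-20}"
}

# Assumed usage: -b -1 selects the previous boot, --no-pager avoids
# the pager truncating long lines with ">".
# journalctl -b -1 --no-pager | filter_tail mbsync 20
```

Piping through `--no-pager` also keeps the long lines intact, which the pager would otherwise cut off.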
Is this somewhat linked to S3 cycles (sleep/resume/5m/crash)?
Online
Is this somewhat linked to S3 cycles (sleep/resume/5m/crash)?
not really, it could work for 2-3 days without a restart/freeze, but then yesterday it happened 4-5 times in one day.
Offline
I'm on Ryzen 5 3600 and this MB https://www.msi.com/Motherboard/B550M-P … pport#bios -- does it make sense to update BIOS to the latest?
Last edited by shlyapa (2022-11-08 09:05:43)
Offline
"usually", but also see https://wiki.archlinux.org/title/Ryzen#Troubleshooting
Online
thanks. I'll try new BIOS and then a kernel with CONFIG_RCU_NOCB_CPU.
Last edited by shlyapa (2022-11-08 09:05:21)
Offline
zgrep CONFIG_RCU_NOCB_CPU /proc/config.gz
And I'd start w/ the broadsword mentioned in the wiki:
Other less ideal solutions include disabling c-states in the BIOS or adding processor.max_cstate=1 to your kernel command line arguments.
Online
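To see whether a parameter such as processor.max_cstate=1 actually reached the running kernel, the command line can be inspected. This is only a sketch: the helper name `has_kernel_param` is made up for illustration, and it assumes the parameters are separated by plain whitespace and contain no shell glob characters.

```shell
#!/bin/sh
# Return 0 if the kernel command line contains the given parameter,
# either bare ("quiet") or with a value ("processor.max_cstate=1").
#   $1 = parameter name
#   $2 = command line text (optional, defaults to /proc/cmdline)
has_kernel_param() {
    name=$1
    line=${2:-$(cat /proc/cmdline)}
    # Unquoted on purpose: split the command line into words.
    for word in $line; do
        case $word in
            "$name"|"$name"=*) return 0 ;;
        esac
    done
    return 1
}

# Assumed usage after rebooting with the new parameter:
# has_kernel_param processor.max_cstate && echo "max_cstate is set"
```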
zgrep CONFIG_RCU_NOCB_CPU /proc/config.gz
thanks! it's already there!
CONFIG_RCU_NOCB_CPU=y
# CONFIG_RCU_NOCB_CPU_DEFAULT_ALL is not set
# CONFIG_RCU_NOCB_CPU_CB_BOOST is not set
And I'd start w/ the broadsword mentioned in the wiki:
the wiki wrote: Other less ideal solutions include disabling c-states in the BIOS or adding processor.max_cstate=1 to your kernel command line arguments.
I actually tried disabling c-states in BIOS before but it didn't help.
Offline
Do you have
NMI watchdog: BUG: soft lockup - CPU#12 stuck for 23s! [DOM Worker:1364] like messages in your dmesg?
Did you see that
To solve this problem you need to supply higher voltage to your CPU so that it is stable when running at peak frequencies. The easiest way to achieve this is to use the AMD curve optimiser which is accessible via your motherboard's bios.
Online
Do you have
NMI watchdog: BUG: soft lockup - CPU#12 stuck for 23s! [DOM Worker:1364]
like messages in your dmesg?
Did you see that
the wiki wrote: To solve this problem you need to supply higher voltage to your CPU so that it is stable when running at peak frequencies. The easiest way to achieve this is to use the AMD curve optimiser which is accessible via your motherboard's bios.
no, nothing like that, I couldn't see anything suspicious in dmesg (this is the one from boot to random restart http://0x0.st/oE8L.txt)
Last edited by shlyapa (2022-11-08 09:50:36)
Offline
Did you see that
the wiki wrote: To solve this problem you need to supply higher voltage to your CPU so that it is stable when running at peak frequencies. The easiest way to achieve this is to use the AMD curve optimiser which is accessible via your motherboard's bios.
Online
Did you see that
the wiki wrote: To solve this problem you need to supply higher voltage to your CPU so that it is stable when running at peak frequencies. The easiest way to achieve this is to use the AMD curve optimiser which is accessible via your motherboard's bios.
I saw that but I thought it was only applicable to the case with mce events in dmesg, which I don't have. should I try it still?
Offline
Yes. "randomly restart" reeks of power issues.
Online
Although I've updated to the latest BIOS, there's no curve optimizer (I checked on YouTube, where people had it); maybe there's none for the Ryzen 5 3600. One can only raise the voltage by an explicit positive offset in volts, and I have no idea what to put there.
For now I tried to disable CPB and PBO and:
To fix this issue, go into your BIOS settings for your motherboard and search for an option labeled something like this: "Power idle control". Change its value to "Typical current idle".
shrug, will have a look. thanks!!
Offline
Now I've found one activity which forces a restart: just running pg_restore with a 1 GB dump in a docker container, and after 3-5 minutes the machine restarts.
How do I diagnose what causes it? There's still nothing in journalctl.
(I've run it with psensor and it seems that right before the restart the CPU sensor gets to 37 degrees and stays there without dropping, whereas in the minutes before it kept reaching that value and dropping again.)
Last edited by shlyapa (2023-01-06 08:22:39)
Offline
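Since nothing reaches the journal before the reset, one option is to log the sensor to a file and flush after every reading, so the last value survives a hard restart. The following is a rough sketch, not a tested diagnostic: the function names are hypothetical, the driver name k10temp is an assumption for a Ryzen CPU, and the log path is only a default.

```shell
#!/bin/sh
# hwmon files report temperatures in millidegrees Celsius;
# convert to whole degrees (integer division).
millideg_to_c() {
    echo $(( $1 / 1000 ))
}

# Print the hwmon directory whose "name" file matches the driver in $1.
find_hwmon() {
    for d in /sys/class/hwmon/hwmon*; do
        [ -r "$d/name" ] || continue
        if [ "$(cat "$d/name")" = "$1" ]; then
            echo "$d"
            return 0
        fi
    done
    return 1
}

# Append one timestamped reading per second and sync it to disk each time.
#   $1 = driver name (assumed default: k10temp)
#   $2 = log file (assumed default: $HOME/temp.log)
log_temp() {
    dir=$(find_hwmon "${1:-k10temp}") || { echo "sensor not found" >&2; return 1; }
    while :; do
        raw=$(cat "$dir/temp1_input")
        printf '%s %s\n' "$(date +%T)" "$(millideg_to_c "$raw")" >> "${2:-$HOME/temp.log}"
        sync
        sleep 1
    done
}
```

Start `log_temp` in one terminal, run the pg_restore in another, and read the end of the log after the machine comes back up.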
Do you have a swap partition/file?
Online
Do you have a swap partition/file?
I do, I don't know if 16G is enough though, is it?
swapon --show
NAME TYPE SIZE USED PRIO
/dev/sda3 partition 16G 0B -2
Offline
You'd then notice a massive slowdown first if an OOM was the cause of the reboot
I guess pg_restore will lead to a lot of disk I/O?
You could try stress, https://wiki.archlinux.org/title/Stress_testing or just "dd if=/dev/zero of=/home/shlyapa/lotta.zeros count=5M" (be extra careful what you put there! 5M will be 2.5GB on a default bs of 512 and if you have a stray blank in the of, you're gonna shred your installation. dd is not robust against user errors!)
Online
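The size arithmetic and the warning about dd from the post above can be sketched like this. Both helper names are hypothetical, and the guard is just one possible precaution (assuming a dd that accepts bs=1M, as GNU dd does), not a substitute for checking the of= argument by hand.

```shell
#!/bin/sh
# Bytes written by dd for a given count and block size.
#   $1 = count, $2 = block size in bytes (dd's default is 512)
dd_bytes() {
    echo $(( $1 * ${2:-512} ))
}

# Write $2 MiB of zeros to the file $1, refusing obviously dangerous targets.
write_zeros() {
    out=$1
    mib=$2
    case $out in
        /dev/*) echo "refusing to write to device node: $out" >&2; return 1 ;;
    esac
    if [ -e "$out" ]; then
        echo "refusing to overwrite existing file: $out" >&2
        return 1
    fi
    # The path is quoted so a stray blank cannot split it into two arguments.
    dd if=/dev/zero of="$out" bs=1M count="$mib"
}

# count=5M means 5 * 1024 * 1024 blocks of 512 bytes:
# dd_bytes $((5 * 1024 * 1024)) 512    # 2684354560 bytes = 2560 MiB = 2.5 GiB
```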
You'd then notice a massive slowdown first if an OOM was the cause of the reboot
I see, thanks!
I guess pg_restore will lead to a lot of disk I/O?
You could try stress, https://wiki.archlinux.org/title/Stress_testing or just "dd if=/dev/zero of=/home/shlyapa/lotta.zeros count=5M" (be extra careful what you put there! 5M will be 2.5GB on a default bs of 512 and if you have a stray blank in the of, you're gonna shred your installation. dd is not robust against user errors!)
I tried both: the stress package https://wiki.archlinux.org/title/Stress_testing#stress ran for some time with no restart, and dd with 40M went OK too.
After that I also ran pg_restore again and it went fine without a restart, for the first time after at least 5 failed tries :-O
Last edited by shlyapa (2023-01-06 10:43:25)
Offline