Hello everyone
For quite a while now I've been experiencing full system freezes at random. The only common factor I can find is that it (almost always) happens while I'm in an active Teams meeting in Vivaldi. The whole system just locks up: no more audio, no reaction to keypresses at all, no reaction to CTRL+ALT+DEL; only a poweroff via the power button does anything. I can't switch ttys, I just have the last frame on-screen and that's it.
This doesn't happen predictably. Sometimes I get 3 of these in a day, sometimes two in a row within minutes of each other, sometimes it's days between crashes.
I don't think it's load, because I don't experience any lockups when stress testing the system or playing a game or whatever.
journalctl -b -1 right after the crash doesn't show anything:
Aug 04 11:13:20 lynxcore sudo[50920]: telegraf : PWD=/ ; USER=root ; COMMAND=/usr/bin/smartctl --scan
Aug 04 11:13:20 lynxcore sudo[50920]: pam_unix(sudo:session): session opened for user root(uid=0) by (uid=959)
Aug 04 11:13:20 lynxcore sudo[50920]: pam_unix(sudo:session): session closed for user root
Aug 04 11:13:20 lynxcore sudo[50924]: telegraf : PWD=/ ; USER=root ; COMMAND=/usr/bin/smartctl --scan --device=nvme
Aug 04 11:13:20 lynxcore sudo[50924]: pam_unix(sudo:session): session opened for user root(uid=0) by (uid=959)
Aug 04 11:13:20 lynxcore sudo[50924]: pam_unix(sudo:session): session closed for user root
Aug 04 11:13:20 lynxcore sudo[50927]: telegraf : PWD=/ ; USER=root ; COMMAND=/usr/bin/smartctl --info --health --attributes --tolerance=verype>
Aug 04 11:13:20 lynxcore sudo[50929]: telegraf : PWD=/ ; USER=root ; COMMAND=/usr/bin/smartctl --info --health --attributes --tolerance=verype>
Aug 04 11:13:20 lynxcore sudo[50928]: telegraf : PWD=/ ; USER=root ; COMMAND=/usr/bin/smartctl --info --health --attributes --tolerance=verype>
Aug 04 11:13:20 lynxcore sudo[50930]: telegraf : PWD=/ ; USER=root ; COMMAND=/usr/bin/smartctl --info --health --attributes --tolerance=verype>
Aug 04 11:13:20 lynxcore sudo[50927]: pam_unix(sudo:session): session opened for user root(uid=0) by (uid=959)
Aug 04 11:13:20 lynxcore sudo[50929]: pam_unix(sudo:session): session opened for user root(uid=0) by (uid=959)
Aug 04 11:13:20 lynxcore sudo[50928]: pam_unix(sudo:session): session opened for user root(uid=0) by (uid=959)
Aug 04 11:13:20 lynxcore sudo[50930]: pam_unix(sudo:session): session opened for user root(uid=0) by (uid=959)
Aug 04 11:13:20 lynxcore sudo[50928]: pam_unix(sudo:session): session closed for user root
Aug 04 11:13:20 lynxcore sudo[50927]: pam_unix(sudo:session): session closed for user root
Aug 04 11:13:20 lynxcore sudo[50930]: pam_unix(sudo:session): session closed for user root
Aug 04 11:13:20 lynxcore sudo[50929]: pam_unix(sudo:session): session closed for user root
Aug 04 11:13:30 lynxcore sudo[50940]: telegraf : PWD=/ ; USER=root ; COMMAND=/usr/bin/smartctl --scan
Aug 04 11:13:30 lynxcore sudo[50940]: pam_unix(sudo:session): session opened for user root(uid=0) by (uid=959)
Aug 04 11:13:30 lynxcore sudo[50940]: pam_unix(sudo:session): session closed for user root
Aug 04 11:13:30 lynxcore sudo[50945]: telegraf : PWD=/ ; USER=root ; COMMAND=/usr/bin/smartctl --scan --device=nvme
Aug 04 11:13:30 lynxcore sudo[50945]: pam_unix(sudo:session): session opened for user root(uid=0) by (uid=959)
Aug 04 11:13:30 lynxcore sudo[50945]: pam_unix(sudo:session): session closed for user root
Aug 04 11:13:30 lynxcore sudo[50951]: telegraf : PWD=/ ; USER=root ; COMMAND=/usr/bin/smartctl --info --health --attributes --tolerance=verype>
Aug 04 11:13:30 lynxcore sudo[50950]: telegraf : PWD=/ ; USER=root ; COMMAND=/usr/bin/smartctl --info --health --attributes --tolerance=verype>
Aug 04 11:13:30 lynxcore sudo[50949]: telegraf : PWD=/ ; USER=root ; COMMAND=/usr/bin/smartctl --info --health --attributes --tolerance=verype>
Aug 04 11:13:30 lynxcore sudo[50948]: telegraf : PWD=/ ; USER=root ; COMMAND=/usr/bin/smartctl --info --health --attributes --tolerance=verype>
Aug 04 11:13:30 lynxcore sudo[50951]: pam_unix(sudo:session): session opened for user root(uid=0) by (uid=959)
Aug 04 11:13:30 lynxcore sudo[50950]: pam_unix(sudo:session): session opened for user root(uid=0) by (uid=959)
Aug 04 11:13:30 lynxcore sudo[50949]: pam_unix(sudo:session): session opened for user root(uid=0) by (uid=959)
Aug 04 11:13:30 lynxcore sudo[50948]: pam_unix(sudo:session): session opened for user root(uid=0) by (uid=959)
Aug 04 11:13:30 lynxcore sudo[50948]: pam_unix(sudo:session): session closed for user root
Aug 04 11:13:30 lynxcore sudo[50951]: pam_unix(sudo:session): session closed for user root
Aug 04 11:13:30 lynxcore sudo[50949]: pam_unix(sudo:session): session closed for user root
Aug 04 11:13:30 lynxcore sudo[50950]: pam_unix(sudo:session): session closed for user root
Full journal from journalctl -b -1 after the last crash: https://0x0.st/8hsn.txt
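Since the tail of the journal is dominated by the telegraf SMART polling, filtering those entries out makes it easier to see whether anything else was logged shortly before the freeze. A minimal sketch, assuming the noise is limited to `sudo[PID]:` lines like the ones in the excerpt; the function name `drop_sudo_noise` is made up for illustration:

```shell
# Hypothetical helper: drop sudo session chatter from a journal dump on stdin.
drop_sudo_noise() {
    # Match the " sudo[PID]: " identifier field as printed by journalctl.
    # "|| true" keeps the exit status at 0 even when every line is filtered out.
    grep -v -E ' sudo\[[0-9]+\]: ' || true
}

# Usage (previous boot, i.e. the one that ended in the freeze):
#   journalctl -b -1 --no-pager | drop_sudo_noise | tail -n 50
```

This only hides lines from the view; it does not change what is stored in the journal.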
I'm logging temps for both my GPU and the rest of my system via telegraf to my homelab, and I don't see a rise in temps over the telegraf interval (10 seconds), so if it's a thermal thing, it happens FAST. See this image of the minute during which the crash occurred (metrics collected by telegraf and sent to a remote server): https://img.lynxcore.org/250804_114222.png
Specs:
AMD Ryzen 7800X3D
NVidia RTX 3080FE
64GB Corsair Vengeance DDR5@6200MT/s
Most likely useless NeoFetch output:
OS: Arch Linux x86_64
Kernel: 6.15.9-arch1-1
Uptime: 24 mins
Packages: 1929 (pacman), 19 (flatpak)
Shell: zsh 5.9
Resolution: 1200x1920, 3440x1440, 2560x1440
WM: awesome
Theme: Adwaita-dark [GTK2/3]
Icons: Obsidian-Mint [GTK2/3]
Terminal: urxvt
Terminal Font: Source Code Pro
CPU: AMD Ryzen 7 7800X3D (16) @ 5.053GHz
GPU: NVIDIA GeForce RTX 3080
GPU: AMD ATI 0d:00.0 Raphael
Memory: 12218MiB / 63427MiB
amdgpu blacklisted because I don't use the on-board GPU.
Things I didn't yet try:
- REISUB/SysRq
- SSHing into the machine
Because I'm in active meetings when this occurs I usually don't have time to try those two. Will try when it happens again in one of the more useless meetings.
Definitely open programs when this occurs:
- 1 Instance of Vivaldi running the teams meeting
Usually, but not always open programs:
- 1 additional Vivaldi instance
- 1 instance of google chrome
- 1 or 2 instances of rxvt-unicode
- sublime-text
- sometimes I'm connected to the company VPN via OpenVPN, but not always
Any pointers welcome.
Last edited by Whoracle (2025-08-04 10:13:58)
Online
Do you use Vivaldi from the Arch repo, from the AUR, or from elsewhere? Did you try running the Teams meeting in other browsers (Chromium, Firefox) from the Arch repo?
Things I didn't yet try:
- REISUB/SysRq
- SSHing into the machine
Definitely try!
Also see: https://wiki.archlinux.org/title/Ryzen
Offline
Vivaldi from extra. I had the same issue in Chromium and Chrome a few months back before I switched my work profile over to Vivaldi. Never tried Firefox, don't have that installed. Will go over the Ryzen article, but I doubt I'll find the solution in there since the issue is so... specialized? But I'll see.
Online
Happened again just now. SSH did not work, and neither did REISUB. Looks like my wireless KB didn't even get through to the dongle anymore.
journalctl -b -1: http://0x0.st/8hWo.txt
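One thing worth ruling out before concluding the kernel was completely dead is whether the SysRq functions were enabled at all. /proc/sys/kernel/sysrq holds either 0 (off), 1 (everything on) or a bitmask, and on a default systemd setup it is typically 16, which permits only the sync key. A rough sketch of the check, using the bit values from the kernel's SysRq documentation; the function name `reisub_allowed` is made up for illustration:

```shell
# Hypothetical helper: does a given kernel.sysrq value permit the full REISUB sequence?
# Bits (kernel SysRq docs): 4 keyboard control (R), 16 sync (S),
# 32 remount read-only (U), 64 process signalling (E, I), 128 reboot (B).
reisub_allowed() {
    needed=$((4 + 16 + 32 + 64 + 128))   # 244
    if [ "$1" -eq 1 ]; then
        return 0                         # 1 means all functions are enabled
    fi
    [ $(( $1 & needed )) -eq "$needed" ]
}

# Usage:
#   if reisub_allowed "$(cat /proc/sys/kernel/sysrq)"; then
#       echo "REISUB permitted"
#   else
#       echo "REISUB restricted"
#   fi
```

If the check reports "restricted", `sysctl kernel.sysrq=1` (as root) enables all functions until the next reboot, so a failed REISUB afterwards says more about the state of the kernel.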
Online
…
Aug 06 16:13:10 lynxcore sudo[124790]: telegraf : PWD=/ ; USER=root ; COMMAND=/usr/bin/smartctl --scan --device=nvme
Aug 06 16:13:20 lynxcore sudo[124807]: telegraf : PWD=/ ; USER=root ; COMMAND=/usr/bin/smartctl --scan
Aug 06 16:13:20 lynxcore sudo[124811]: telegraf : PWD=/ ; USER=root ; COMMAND=/usr/bin/smartctl --scan --device=nvme
Aug 06 16:13:30 lynxcore sudo[124846]: telegraf : PWD=/ ; USER=root ; COMMAND=/usr/bin/smartctl --scan
Aug 06 16:13:30 lynxcore sudo[124850]: telegraf : PWD=/ ; USER=root ; COMMAND=/usr/bin/smartctl --scan --device=nvme
Aug 06 16:13:40 lynxcore sudo[124870]: telegraf : PWD=/ ; USER=root ; COMMAND=/usr/bin/smartctl --scan
Aug 06 16:13:40 lynxcore sudo[124874]: telegraf : PWD=/ ; USER=root ; COMMAND=/usr/bin/smartctl --scan --device=nvme
wtf is that?
https://aur.archlinux.org/packages/telegraf-bin
https://wiki.archlinux.org/title/S.M.A. … self-tests
Disable that, just to reduce noise and interference.
Next to https://wiki.archlinux.org/title/Ryzen#Troubleshooting you might be facing https://wiki.archlinux.org/title/Solid_ … leshooting
If you can somewhat trigger this, make sure to keep a visible terminal emulator running dmesg -w on screen.
Maybe it allows you to catch some last messages from the kernel before things go south.
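Because the last frame on screen may not show the final messages, it can also help to write the same stream to disk and flush after every line. A minimal sketch, assuming the kernel still gets to emit something and userspace still runs briefly before the lockup; `log_synced` and the log path are made-up names:

```shell
# Hypothetical helper: append each stdin line to a file and flush it to disk right away.
log_synced() {
    while IFS= read -r line; do
        printf '%s\n' "$line" >> "$1"
        sync    # flush caches so the line has a chance of surviving a hard power-off
    done
}

# Usage (may need root, depending on kernel.dmesg_restrict):
#   dmesg -w | log_synced "$HOME/dmesg-freeze.log"
```

Calling sync per line is heavy-handed, but kernel messages are normally infrequent enough that this should not matter.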
Offline
wtf is that?
SMART plugin for telegraf. Disabled now.
Next to https://wiki.archlinux.org/title/Ryzen#Troubleshooting you might be facing https://wiki.archlinux.org/title/Solid_ … leshooting
Possible, but I'd expect to see some of the mentioned events in some log or other, no? (Also, I have none of these vendors' drives - mine are all Crucial. Doesn't mean much, granted).
If you can somewhat trigger this, make sure to keep a visible terminal emulator running dmesg -w on screen.
Maybe it allows you to catch some last messages from the kernel before things go south.
100% not reproducible at will, but I've got enough monitors to keep a dmesg open at all times, so I'll try. I think I remember having one open a while back, when I was getting 3 crashes in a row, and it not displaying anything, but worth making sure and documenting the outcome here.
Gonna plug in an additional, wired keyboard for REISUB, too.
Online
I'd expect to see some of the mentioned events in some log or other, no?
You'd only get the MCE errors on spontaneous reboots, everything else will be lost w/ the power button.
SMART plugin for telegraf. Disabled now.
Just to be clear, the comment was rather on the "interesting" implementation here - I wasn't faulting you for using it (and actually knew where this is kinda coming from)
I was just too tired and baffled for a more measured comment than "wtf"
Offline
No worries. I'm not happy with it either, but it's the only way I have found to get the disk fan speed and temps into influx.
Now we wait another few days for the next crash to happen.
Online