[SOLVED] Proton game crash takes almost the whole system with it

blinkingbit · 2024-05-30 18:25:39

I have been playing Tekken 8 and it crashes from time to time during gameplay. The problem is that it takes down almost every program on the system when it happens: Bluetooth daemon, window manager... I don't really know what is still alive, as the only thing I can see is some sort of log/stacktrace of the crash (that is not present in the journal of that session).

Launch options:

PROTON_LOG=1 LANG=es_ES.UTF-8 LC_ALL=es_ES.UTF-8 HOST_LC_ALL=es_ES.UTF-8 %command%

journalctl -b -1

proton log

The error displayed on the screen after the crash is:

(process:19744): GLib-GObject-CRITICAL **: 13:15:13.851: g_object_unref: assertion 'G_IS_OBJECT (object)'

It also says that the crashdump is being saved and uploaded but I couldn't find it anywhere. It looked like a really long stacktrace so I guess the most important thing is the glib assertion error.

Last edited by blinkingbit (2024-06-23 08:07:43)

cryptearth · 2024-05-30 19:34:44

1) Have you done as told in the wiki: "enable AT LEAST en_US.UTF-8"?
2) What happens when you remove the lang options (or better yet: all of the options, so the game runs without any parameters)?
3) What Proton version do you use?

seth · 2024-05-30 21:19:39

sort of log/stacktrace of the crash (that is not present in the journal of that session).

And that would be really useful.
Don't reboot w/ the power button, use https://wiki.archlinux.org/title/Keyboa … el_(SysRq) and, if in doubt, take a photo of your monitor.

The gobject assertion out of context is meaningless.

blinkingbit · 2024-06-03 18:07:07

I have en_US.UTF-8 enabled (funnily enough es_ES.UTF-8 wasn't so I added it to /etc/locale.gen).
If the lang options are removed, only region and text are affected and the defaults (en_US) are used, so English text and US region for matchmaking.
Proton version is the default one for the game, experimental-9.0-20240522.

I have experienced the same crash with another game that has no launch parameters, so it's unrelated to that.

I set up SysRq (although my dvorak layout is reset after the crash, so instead of "REISUB" I do the equivalent of writing "P.COGX"; any tips with this would be appreciated) but there is still nothing in the journal after recovering. I could get to the login prompt after Alt+SysRq+E.
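
For reference, "P.COGX" is what a Dvorak keymap produces on the physical keys labelled R, E, I, S, U, B on a QWERTY board. A small sketch of that mapping, assuming the standard US Dvorak letter rows (the function name is made up for illustration):

```shell
#!/bin/sh
# Translate QWERTY key labels into the characters a US Dvorak keymap
# produces on the same physical keys (letter rows only, lowercase).
qwerty_to_dvorak() {
    printf '%s' "$1" | tr 'A-Z' 'a-z' | tr \
        "qwertyuiopasdfghjkl;zxcvbnm,./" \
        "',.pyfgcrlaoeuidhtns;qjkxbmwvz"
}

qwerty_to_dvorak REISUB; echo   # prints p.cogx
```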

There were no logs in the journal, so here are some photos of the monitor.

Last edited by blinkingbit (2024-06-03 18:07:31)

seth · 2024-06-03 21:36:34

sysrq+e only SIGTERMs all processes, indicating that the "system" is fine
The screenshot shows the https://wiki.archlinux.org/title/Getty#Staircase_effect and a firefox crash ("exiting due to channel error") and ultimately a shutdown - that's not an actual "crash" of anything.

sysrq+r will reset the keyboard to ascii, but that's not mandatory.
What does your /etc/vconsole.conf look like, what kind of GUI session do you start and how? (the posted journal shows you logging in on TTY1, no DM)
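
Which SysRq functions actually respond depends on the kernel.sysrq bitmask. The following is only a sketch for decoding that value; the bit values are taken from the kernel's sysrq documentation and the helper name is hypothetical:

```shell
#!/bin/sh
# Report whether a kernel.sysrq value permits a given function group.
# Bit values per the kernel's sysrq documentation:
#   4 = keyboard control (r), 16 = sync (s), 32 = remount read-only (u),
#   64 = signalling processes (e, i), 128 = reboot/poweroff (b).
# A value of 1 enables everything, 0 disables SysRq entirely.
sysrq_allows() {
    mask=$1 bit=$2
    [ "$mask" -eq 1 ] && return 0
    [ $(( mask & bit )) -ne 0 ]
}

# Example: a mask of 16 permits sync but not SIGTERM-to-all (sysrq+e).
sysrq_allows 16 16 && echo "sync allowed"
sysrq_allows 16 64 || echo "signalling processes not allowed"
```

On a running system the current value can be read with "cat /proc/sys/kernel/sysrq" and passed in as the first argument.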

blinkingbit · 2024-06-04 15:49:12

vconsole.conf just one line:

KEYMAP=dvorak

I log in directly on the tty, start xorg from there and then launch i3-wm

.bash_profile:

[[ -f ~/.bashrc ]] && . ~/.bashrc

if [[ "$(tty)" == '/dev/tty1' ]] && [ -z "$SSH_TTY" ]; then
       exec startx
fi

.xinitrc:

#!/bin/sh
font tty-hack

pulseaudio --start
exec i3

The thing I don't quite understand is that, as the wiki says

The exec command ensures that the user is logged out when the X server exits, crashes or is killed by an attacker.

so X server should be alive, as I can't return to login without sending SIGTERM. But I find it weird that journalctl can't log anything of what happened and also other daemons (ssh and bluetooth for example) seem to die after the getty staircase thing appears.
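
The wiki sentence relies on how exec works: the shell is replaced by the command, so nothing after that line runs and no shell is left behind when the command ends. A minimal illustration, with echo standing in for startx:

```shell
#!/bin/sh
# 'exec' replaces the shell process with the command, so the line after it
# never runs. 'echo' stands in for startx here.
with_exec=$(sh -c 'exec echo "session ended"; echo "back at the shell"')
without_exec=$(sh -c 'echo "session ended"; echo "back at the shell"')

printf 'with exec:\n%s\n\n' "$with_exec"
printf 'without exec:\n%s\n' "$without_exec"
```

With "exec startx" in .bash_profile the login shell is already gone by the time X exits, so getty should present a fresh login prompt rather than a shell.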

I'm not really worried about proton dying but the whole window manager collapsing with it is what I don't really understand. I'm going to try to switch to other tty next time, maybe I can check xorg log or journalctl if I don't send SIGTERM. I'm not familiar using barebones xorg without any window manager so maybe I'm missing an obvious way to recover important info.

** Edited **

I tried to change to tty-2 but even though login seems to work (I can enter user and password) login fails. But I got xorg log with some extra info that may be relevant:

[ 45375.397] (WW) AMDGPU(0): flip queue failed in amdgpu_scanout_flip: Permission denied, TearFree inactive
[ 45375.399] (EE) AMDGPU(0): failed to set mode: Permission denied

Also the modeline spam is worrying too

Last edited by blinkingbit (2024-06-06 15:24:37)
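
To pick lines like the two above out of a long log, a filter along these lines may help. It is a sketch only; the function name is hypothetical and the log path in the usage note is an assumption for a rootless server started via startx:

```shell
#!/bin/sh
# Print only timestamped warning (WW) and error (EE) lines from an Xorg log
# read on stdin. The leading timestamp match skips the "Markers:" legend.
xorg_problems() {
    grep -E '^\[ *[0-9]+\.[0-9]+\] \((WW|EE)\)'
}
```

Possible use: xorg_problems < ~/.local/share/xorg/Xorg.0.log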

seth · 2024-06-04 20:16:40

There're some bursts of output polling, but

[ 45571.284] (II) Server terminated successfully (0). Closing log file.

the server didn't crash.

What if you "exec xterm" and start "i3" from there (don't! exec it) so in case i3 crashes, the session remains active (by the xterm)?

Another thing would certainly be to try w/o xf86-video-amdgpu

blinkingbit · 2024-06-06 15:42:33

Using exec xterm and then starting i3 from it like a normal command (without exec) ends up in the same situation, back on the getty staircase but with a different message:

/home/blinkingbit/.xinitrc: line 2: font: command not found
xterm: cannot load font "-misc-fixed-medium-r-semicondensed--13-120-75-75-c-60-iso10646-1"

I have found this topic, which mentions that xorg-mkfontscale is needed because xorg uses it for fonts, according to the fonts configuration page in the wiki.

I don't really understand it: is Fontconfig part of xorg or an optional tool for configuring it? I didn't spend much time on my fonts configuration and there are indeed some warnings in my xorg log, but they seem to be handled gracefully:

[   126.484] (WW) The directory "/usr/share/fonts/misc" does not exist.
[   126.484] 	Entry deleted from font path.
[   126.484] (WW) `fonts.dir' not found (or not valid) in "/usr/share/fonts/TTF".
[   126.484] 	Entry deleted from font path.
[   126.484] 	(Run 'mkfontdir' on "/usr/share/fonts/TTF").
[   126.484] (WW) `fonts.dir' not found (or not valid) in "/usr/share/fonts/OTF".
[   126.484] 	Entry deleted from font path.
[   126.484] 	(Run 'mkfontdir' on "/usr/share/fonts/OTF").
[   126.484] (WW) The directory "/usr/share/fonts/Type1" does not exist.
[   126.484] 	Entry deleted from font path.
[   126.484] (WW) The directory "/usr/share/fonts/100dpi" does not exist.
[   126.484] 	Entry deleted from font path.
[   126.484] (WW) The directory "/usr/share/fonts/75dpi" does not exist.
[   126.484] 	Entry deleted from font path.
[   126.484] (==) FontPath set to:
	
[   126.484] (==) ModulePath set to "/usr/lib/xorg/modules"

seth · 2024-06-06 21:02:21

Your xinitrc is broken - not only do you try to run some "font" process that doesn't exist, but also you'll end up w/ a degraded session (last link below - 2nd blue note about what to include at least)

You'll need some ttf/otf font installed to display text (pcf is no longer supported by pango, but would work in an xterm), if in doubt install https://archlinux.org/packages/extra/any/ttf-dejavu/

However that doesn't explain why the X11 server seems to crash when running i3, please post the resulting X11 log and your i3 config.

blinkingbit · 2024-06-07 07:42:39

xorg log

i3 config

I've tried removing xf86-video-amdgpu but it happened again. This time without any game running, just the steam client open.

V1del · 2024-06-07 09:12:42

What happens if you remove that attempt at resizing a steam_app ?

blinkingbit · 2024-06-07 12:33:47

V1del wrote:

What happens if you remove that attempt at resizing a steam_app ?

Steam just opens in whichever workspace I opened it and is not a floating window so it occupies all the space it can and this also applies to every window spawned by it (friend list or chats)

Last edited by blinkingbit (2024-06-07 12:34:24)

seth · 2024-06-07 14:32:44

The point is, does i3 still terminate w/o that rule when playing steam games?
Any chance you accidentally hit the i3 exit or restart shortcuts in the heat of the game?

Though what I don't understand is that the X11 server terminated cleanly, ie. the session process ie. xterm would have to terminate.

[  7239.666] (II) event8: opening input device '/dev/input/event8' failed (Permission denied).
[  7239.668] (II) event7: opening input device '/dev/input/event7' failed (Permission denied).
[  7239.669] (II) event2: opening input device '/dev/input/event2' failed (Permission denied).
[  7239.669] (II) event21: opening input device '/dev/input/event21' failed (Permission denied).
[  7239.670] (II) event23: opening input device '/dev/input/event23' failed (Permission denied).
[  7239.671] (II) event6: opening input device '/dev/input/event6' failed (Permission denied).
[  7239.692] (II) event5: opening input device '/dev/input/event5' failed (Permission denied).
[  7239.709] (II) event22: opening input device '/dev/input/event22' failed (Permission denied).
[  7394.580] (II) event21: opening input device '/dev/input/event21' failed (Permission denied).
[  7394.588] (II) event6: opening input device '/dev/input/event6' failed (Permission denied).
[  7394.589] (II) event8: opening input device '/dev/input/event8' failed (Permission denied).
[  7394.606] (II) event7: opening input device '/dev/input/event7' failed (Permission denied).
[  7394.608] (II) event23: opening input device '/dev/input/event23' failed (Permission denied).
[  7394.609] (II) event2: opening input device '/dev/input/event2' failed (Permission denied).
[  7394.633] (II) event5: opening input device '/dev/input/event5' failed (Permission denied).
[  7394.650] (II) event22: opening input device '/dev/input/event22' failed (Permission denied).

Is there any chance that nothing actually "crashes" and you just accidentally switch the VT and ctrl+alt+F2 kicks you back into the GUI?

blinkingbit · 2024-06-08 17:19:54

I tried with i3 rules disabled and it still happens.

No, there is no way I can hit shift+cmd+e while playing, and I've experienced one "crash" without touching the keyboard at all, with no game launched, just the steam client. In my setup I have to click a button to confirm that I really want to quit i3.

And no, I can't accidentally switch the VT because I have my F keys in a separate keyboard layer and a bit of hand contortionism is needed to use that command.

I have installed ttf-dejavu just in case but it still happens.

seth · 2024-06-08 19:13:55

seth wrote:

what I don't understand is that the X11 server terminated cleanly, ie. the session process ie. xterm would have to terminate

When this happens
a) do you end up w/ the console (and the staircase effect, but as mentioned - the latter is not the problem here) or the xterm?
b) what does the xorg log look like immediately afterwards?

We need to figure out *what* actually crashes here, for i3 or xterm I'd expect a coredump to be left and for X11 you'd end with an unclean server termination - possibly with a backtrace in the Xorg log.
Right now there's no indication for either.

blinkingbit · 2024-06-08 22:41:40

I end up in console, no windows on the screen so I guess i3 is dead then, no xterm, no input available without using SysRq for sending SIGTERM.
The xorg log ends somewhat gracefully, so there is no more info after the server terminates. Right after the issue happens I can't log in without rebooting, and after the reboot it's a completely normal log.

"$ coredumpctl list" only shows me coredumps of a program I use for text recognition called tesseract with SIGFPE but the hours of these dumps are never near this problem.

ulimit is set to unlimited so i3 should be able to create one. I've enabled verbose logging on i3 for the next time.
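
Besides ulimit, whether coredumpctl can list a dump also depends on kernel.core_pattern handing the dump to systemd-coredump (which, by default, stores it under /var/lib/systemd/coredump). A rough sketch for classifying that setting; the function name and the output labels are made up for illustration:

```shell
#!/bin/sh
# Classify a kernel.core_pattern value: a leading '|' means the kernel pipes
# the dump to a helper program instead of writing a core file itself.
coredump_handler() {
    case "$1" in
        '|'*systemd-coredump*) echo "systemd-coredump" ;;
        '|'*)                  echo "other pipe helper" ;;
        *)                     echo "plain file" ;;
    esac
}

# On a running Linux system:
# coredump_handler "$(cat /proc/sys/kernel/core_pattern)"
```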

seth · 2024-06-09 07:46:34

Have you meanwhile fixed your xinitrc?

also other daemons (ssh and bluetooth for example) seem to die after the getty staircase thing appears

implies that you also cannot ssh into the system after the crash (but you can before)?

If X11 terminates gracefully in response to the sysrq and also gets synced to disc there's almost no way you don't have any journal of the crashing boot afterwards - please post that (-b -1, but you can also look at older journals by increasing the number, you don't have to wait for this to happen again. "sysrq" will show up in a journal where you used it)

What most likely happens is a crash in the graphics stack, you see some frozen framebuffer and actually can switch the VT, just not get a visual reflection of it - but in that case you'd also be able to ssh into the system (unless, because of the BT situation, the radio dies as well and you don't have a wired network connection?)
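
One way to look for the sysrq traces mentioned above across several earlier boots, as a sketch only (the helper name is hypothetical and the loop over three boots is an arbitrary choice):

```shell
#!/bin/sh
# Keep only lines that mention SysRq from journal text read on stdin.
sysrq_lines() {
    grep -i 'sysrq'
}

# Possible use, checking the kernel messages of the previous three boots:
# for n in 1 2 3; do journalctl -k -b "-$n" | sysrq_lines; done
```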

blinkingbit · 2024-06-09 10:06:57

Yeah, I have removed that font command.

You are right, I can ssh into the system before the crash but after it happens I can't. I have a wired connection, so BT being down should not be a problem.

Here is a recent i3 log, xorg log and sudo journalctl -b -1. I can't find where sysrq was used; is it supposed to be those logind messages? I want to stress that xorg does not terminate by itself: if I do not use sysrq to end the current processes, it stays open as if the exec X hasn't returned yet.

This time I opened the game in an i3 workspace and left it open there while using another workspace, so it definitely can't be any weird thing I do while playing.

The GLib assert appears in the i3 log, and this time when I was back in getty there was no staircase (I redirected i3 output to a logfile).

seth · 2024-06-09 13:50:05

Yeah, I have removed that font command.

That's not the only part of the xinitrc that's broken and actually the least problem.

seth wrote:

you'll end up w/ a degraded session (last link below - 2nd blue note about what to include at least)

The journal has no indication of being restarted w/ the sysrq, but /var is the mountpoint for 8d65df66-ee3b-43a8-ba43-804281e3d9dc (sdb1) which is the 1TB Seagate BarraCuda on ata8 which

Jun 09 11:18:51 NotYourUsualMachine kernel: ata8: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Jun 09 11:18:51 NotYourUsualMachine kernel: ata8.00: configured for UDMA/133
Jun 09 11:20:58 NotYourUsualMachine kernel: ata8: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Jun 09 11:20:58 NotYourUsualMachine kernel: ata8.00: configured for UDMA/133
Jun 09 11:21:14 NotYourUsualMachine kernel: ata8: link is slow to respond, please be patient (ready=0)
Jun 09 11:21:19 NotYourUsualMachine kernel: ata8: softreset failed (device not ready)
Jun 09 11:21:19 NotYourUsualMachine kernel: ata8: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Jun 09 11:21:19 NotYourUsualMachine kernel: ata8.00: configured for UDMA/133
Jun 09 11:21:29 NotYourUsualMachine kernel: ata8: limiting SATA link speed to 3.0 Gbps
Jun 09 11:21:30 NotYourUsualMachine kernel: ata8: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
Jun 09 11:21:30 NotYourUsualMachine kernel: ata8.00: configured for UDMA/133
Jun 09 11:21:31 NotYourUsualMachine kernel: ata8: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
Jun 09 11:21:31 NotYourUsualMachine kernel: ata8.00: configured for UDMA/133
Jun 09 11:21:46 NotYourUsualMachine bluetoothd[751]: Adv Monitor app :1.158 disconnected from D-Busata8
Jun 09 11:22:01 NotYourUsualMachine kernel: ata8: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
Jun 09 11:22:01 NotYourUsualMachine kernel: ata8.00: configured for UDMA/133
Jun 09 11:23:20 NotYourUsualMachine kernel: ata8: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
Jun 09 11:23:20 NotYourUsualMachine kernel: ata8.00: configured for UDMA/133
Jun 09 11:28:16 NotYourUsualMachine kernel: ata8: limiting SATA link speed to 1.5 Gbps
Jun 09 11:28:17 NotYourUsualMachine kernel: ata8: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Jun 09 11:28:17 NotYourUsualMachine kernel: ata8.00: configured for UDMA/133
Jun 09 11:30:30 NotYourUsualMachine kernel: ata8: SATA link down (SStatus 0 SControl 310)
Jun 09 11:30:30 NotYourUsualMachine kernel: ata8: SATA link down (SStatus 0 SControl 310)
Jun 09 11:30:31 NotYourUsualMachine kernel: ata8: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Jun 09 11:30:31 NotYourUsualMachine kernel: ata8.00: configured for UDMA/133
Jun 09 11:30:41 NotYourUsualMachine kernel: ata8: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Jun 09 11:30:41 NotYourUsualMachine kernel: ata8.00: configured for UDMA/133
Jun 09 11:30:41 NotYourUsualMachine kernel: ata8.00: limiting speed to UDMA/100:PIO4
Jun 09 11:30:42 NotYourUsualMachine kernel: ata8: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Jun 09 11:30:42 NotYourUsualMachine kernel: ata8.00: configured for UDMA/100
Jun 09 11:31:43 NotYourUsualMachine kernel: ata8: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Jun 09 11:31:43 NotYourUsualMachine kernel: ata8.00: configured for UDMA/100
Jun 09 11:31:44 NotYourUsualMachine kernel: ata8: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Jun 09 11:31:44 NotYourUsualMachine kernel: ata8.00: configured for UDMA/100
Jun 09 11:35:39 NotYourUsualMachine kernel: ata8: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Jun 09 11:35:39 NotYourUsualMachine kernel: ata8.00: configured for UDMA/100
Jun 09 11:35:39 NotYourUsualMachine kernel: ata8.00: limiting speed to UDMA/33:PIO4
Jun 09 11:35:40 NotYourUsualMachine kernel: ata8: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Jun 09 11:35:40 NotYourUsualMachine kernel: ata8.00: configured for UDMA/33
Jun 09 11:35:46 NotYourUsualMachine kernel: ata8: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Jun 09 11:35:46 NotYourUsualMachine kernel: ata8.00: configured for UDMA/33

seems to flicker, whereas the xorg log is stored in your $HOME…

mount
lsblk -f
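
The pattern in the kernel messages quoted above (repeated link resets with the speed stepping down from 6.0 to 1.5 Gbps) can be counted per port with something like the following. This is a sketch under the assumption that the kernel lines look like the ones quoted; the function name is made up:

```shell
#!/bin/sh
# Summarise link trouble for one ATA port from kernel log text on stdin.
# Counts speed downgrades, link-down events and failed soft resets.
ata_link_summary() {
    port=$1
    awk -v port="$port" '
        index($0, " " port ": ") == 0 { next }   # skip other ports and devices
        /limiting SATA link speed/     { slow++ }
        /SATA link down/               { down++ }
        /softreset failed/             { reset++ }
        END {
            printf "%s: %d speed downgrades, %d link-down events, %d failed resets\n",
                   port, slow, down, reset
        }'
}
```

Possible use: journalctl -k -b -1 | ata_link_summary ata8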

blinkingbit · 2024-06-09 16:11:48

seth wrote:

That's not the only part of the xinitrc that's broken and actually the least problem.

What's broken about it? Starting pulseaudio there or starting i3? There is nothing more in that file. Do you mean some xorg directory like /etc/X11/xorg.conf.d or /etc/X11/xinitrc.d ?

mount output:

proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
sys on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
dev on /dev type devtmpfs (rw,nosuid,relatime,size=7739380k,nr_inodes=1934845,mode=755,inode64)
run on /run type tmpfs (rw,nosuid,nodev,relatime,mode=755,inode64)
efivarfs on /sys/firmware/efi/efivars type efivarfs (rw,nosuid,nodev,noexec,relatime)
/dev/nvme0n1p3 on / type ext4 (rw,relatime)
securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev,inode64)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000)
cgroup2 on /sys/fs/cgroup type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate,memory_recursiveprot)
pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime)
bpf on /sys/fs/bpf type bpf (rw,nosuid,nodev,noexec,relatime,mode=700)
systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=37,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=2770)
debugfs on /sys/kernel/debug type debugfs (rw,nosuid,nodev,noexec,relatime)
mqueue on /dev/mqueue type mqueue (rw,nosuid,nodev,noexec,relatime)
hugetlbfs on /dev/hugepages type hugetlbfs (rw,nosuid,nodev,relatime,pagesize=2M)
tracefs on /sys/kernel/tracing type tracefs (rw,nosuid,nodev,noexec,relatime)
fusectl on /sys/fs/fuse/connections type fusectl (rw,nosuid,nodev,noexec,relatime)
configfs on /sys/kernel/config type configfs (rw,nosuid,nodev,noexec,relatime)
tmpfs on /tmp type tmpfs (rw,nosuid,nodev,size=7747420k,nr_inodes=1048576,inode64)
systemd-1 on /home/blinkingbit/bigfiles type autofs (rw,relatime,fd=52,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=5335)
systemd-1 on /home/blinkingbit/gestion/dumps type autofs (rw,relatime,fd=55,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=5340)
systemd-1 on /home/blinkingbit/media type autofs (rw,relatime,fd=56,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=5345)
/dev/nvme0n1p1 on /boot type vfat (rw,relatime,fmask=0022,dmask=0022,codepage=437,iocharset=ascii,shortname=mixed,utf8,errors=remount-ro)
/dev/sda1 on /home/blinkingbit/gestion/docs type ext4 (rw,relatime)
/dev/sdb2 on /home/blinkingbit/utils type ext4 (rw,relatime)
/dev/sdb1 on /var type ext4 (rw,relatime)
/dev/sdb3 on /home/blinkingbit/proyects type ext4 (rw,relatime)
binfmt_misc on /proc/sys/fs/binfmt_misc type binfmt_misc (rw,nosuid,nodev,noexec,relatime)
tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,size=1549480k,nr_inodes=387370,mode=700,uid=1000,gid=1000,inode64)
/dev/sdb4 on /home/blinkingbit/bigfiles type ext4 (rw,relatime,x-systemd.automount)
portal on /run/user/1000/doc type fuse.portal (rw,nosuid,nodev,relatime,user_id=1000,group_id=1000)
/dev/sda3 on /home/blinkingbit/media type ext4 (rw,relatime,x-systemd.automount)

lsblk -f output:

NAME        FSTYPE FSVER LABEL UUID                                 FSAVAIL FSUSE% MOUNTPOINTS
sda                                                                                
├─sda1      ext4   1.0         795b51ac-57cb-42a6-ba25-3a7fe5c0d955  479.6G     2% /home/blinkingbit/gestion/docs
├─sda2      ext4   1.0         f6d4ac18-ea81-4dbd-a64c-cd61147faf7c                
└─sda3      ext4   1.0         7b53599e-82dd-4ef4-a05c-f30f5f64aa5f  210.8G    70% /home/blinkingbit/media
sdb                                                                                
├─sdb1      ext4   1.0         8d65df66-ee3b-43a8-ba43-804281e3d9dc     26G    42% /var
├─sdb2      ext4   1.0         d59299ed-5a22-4d28-a2a4-8e5615dcd06a   80.6G    12% /home/blinkingbit/utils
├─sdb3      ext4   1.0         55ead6c4-20e8-4ab2-aab4-5e6e15e89e06  379.3G    18% /home/blinkingbit/proyects
└─sdb4      ext4   1.0         324e0441-c3b2-4ffb-8c8a-fb0d9d1ea24a  135.2G    46% /home/blinkingbit/bigfiles
nvme0n1                                                                            
├─nvme0n1p1 vfat   FAT32       A6ED-14A0                             278.1M    44% /boot
├─nvme0n1p2 swap   1           02b4d658-90ec-4d1e-8204-766f6a09fefe                [SWAP]
└─nvme0n1p3 ext4   1.0         af35f02a-203e-422b-b98e-a59dc33e5cd9  138.6G    63% /

The game is installed on sda3 and steam on nvme0n1.
I'm not used to doing health checks on my disks, so I'm going to follow some steps with smartmontools and report the results. sdb is the oldest disk, so maybe its life is coming to an end.

** edited **
I ran both smartctl -H and the short self-test on every disk. Everything seems okay, no errors. Should I do the long tests?

Last edited by blinkingbit (2024-06-09 16:42:44)

seth · 2024-06-09 17:04:48

You're supposed to source /etc/X11/xinit/xinitrc.d/* in your xinitrc, notably 50-systemd-user.sh will import the session variables for logind integration and device access permissions.

The drive flicker might be the bus rather than the drive, but also the error free tests in and of themselves (esp. the short one) don't mean all that much.
Post the entire "smartctl -a" output.

For testing purposes you might want to avoid the /var mount and use the /var directory on the nvme (where the xorg log gets stored) - the main task here is to get a journal covering the crash and w/o /var you're of course also not getting any coredumps.

blinkingbit · 2024-06-09 18:48:09

I have added the following:

if [ -d /etc/X11/xinit/xinitrc.d ] ; then
 for f in /etc/X11/xinit/xinitrc.d/?*.sh ; do
  [ -x "$f" ] && . "$f"
 done
 unset f
fi

smartctl -a output:

=== START OF INFORMATION SECTION ===
Model Family:     Seagate BarraCuda 3.5 (CMR)
Device Model:     ST1000DM010-2EP102
Serial Number:    W9ALLKAQ
LU WWN Device Id: 5 000c50 0d4349d04
Firmware Version: CC43
User Capacity:    1,000,204,886,016 bytes [1.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database 7.3/5528
ATA Version is:   ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Sun Jun  9 20:35:34 2024 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00)	Offline data collection activity
					was never started.
					Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0)	The previous self-test routine completed
					without error or no self-test has ever 
					been run.
Total time to complete Offline 
data collection: 		(    0) seconds.
Offline data collection
capabilities: 			 (0x73) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					No Offline surface scan supported.
					Self-test supported.
					Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   1) minutes.
Extended self-test routine
recommended polling time: 	 ( 109) minutes.
Conveyance self-test routine
recommended polling time: 	 (   2) minutes.
SCT capabilities: 	       (0x1085)	SCT Status supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   082   063   006    Pre-fail  Always       -       188829444
  3 Spin_Up_Time            0x0003   097   096   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   094   094   020    Old_age   Always       -       7094
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   073   060   045    Pre-fail  Always       -       22227275
  9 Power_On_Hours          0x0032   085   085   000    Old_age   Always       -       13377
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   099   099   020    Old_age   Always       -       1270
183 Runtime_Bad_Block       0x0032   074   074   000    Old_age   Always       -       26
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0 0 0
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   058   050   040    Old_age   Always       -       42 (Min/Max 21/43)
193 Load_Cycle_Count        0x0032   097   097   000    Old_age   Always       -       7140
194 Temperature_Celsius     0x0022   042   012   000    Old_age   Always       -       42 (0 12 0 0 0)
195 Hardware_ECC_Recovered  0x001a   006   001   000    Old_age   Always       -       188829444
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       9709h+27m+07.734s
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       6436815120
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       4993559987

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%     13375         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

The above only provides legacy SMART information - try 'smartctl -x' for more

Here is my plan for avoiding the /var mount, just to check before I do something wrong:
1 - umount /var (sdb1)
2 - mount sdb1 on /mnt
3 - copy the contents to /var
4 - comment out the sdb1 entry in fstab during this testing period

Also, do I need to chroot for this? I know /var is important but I don't really know how critical it is for the stability of the system.

seth · 2024-06-09 21:35:55

I have added the following:

Before or after "exec i3" ?

183 Runtime_Bad_Block       0x0032   074   074   000    Old_age   Always       -       26

Not good. You might want to eventually run https://wiki.archlinux.org/title/Badblocks on the drive.
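
To spot rows like this one in a long attribute table, a filter of roughly this shape can be used. It is a sketch; the list of attribute IDs is a personal selection of counters that commonly matter, not something smartctl defines, and the function name is hypothetical:

```shell
#!/bin/sh
# From "smartctl -A" style attribute rows on stdin, print the selected
# attributes whose raw value (10th column) is non-zero.
smart_suspects() {
    awk '$1 ~ /^(5|183|187|197|198|199)$/ && $10 + 0 > 0 { print $1, $2, $10 }'
}
```

Possible use: sudo smartctl -A /dev/sdb | smart_suspects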

Also, do I need to chroot for this?

No, but you want to do this offline (ie. from some live distro w/o booting the system), you'll otherwise likely not be able to umount /var because some process has some files on it opened.
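
The fstab step of that plan could look roughly like this when done from a live system. It is a sketch that assumes the entry is written in the "UUID=..." form (it may just as well be /dev/sdb1 or a label in the real file); the function name and the /mnt mount point are hypothetical:

```shell
#!/bin/sh
# Comment out the fstab line for one filesystem, identified by UUID.
# Assumes the entry starts with "UUID=<uuid>"; a backup of the original
# file is kept as <file>.bak.
disable_fstab_entry() {
    file=$1 uuid=$2
    sed -i.bak "/^[[:space:]]*UUID=${uuid}[[:space:]]/ s/^/#/" "$file"
}
```

Possible use, with the installed root mounted at /mnt: disable_fstab_entry /mnt/etc/fstab 8d65df66-ee3b-43a8-ba43-804281e3d9dc (the UUID of sdb1 from the lsblk output). The contents of the old /var would be copied beforehand with an ownership-preserving copy such as cp -a.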

blinkingbit · 2024-06-11 14:03:20

Yeah I added that as the first thing in .xinitrc

I have run badblocks -b 4096 -svn -c 32768 /dev/sdb but there were no bad blocks found:

Checking for bad blocks in non-destructive read-write mode
From block 0 to 244190645
Checking for bad blocks (non-destructive read-write test)
Testing with random pattern:
done
Pass completed, 0 bad blocks found. (0/0/0 errors)
badblocks -b 4096 -svn -c 32768 /dev/sdb 40.74s user 31.77 system 0% cpu 6:54:09.75 total

I'm going to leave that partition out just in case, but it seems like the drive is fine. Maybe some forced shutdown messed with the SMART attributes of that drive.
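
As a quick cross-check that the run covered the whole drive: badblocks numbers blocks from 0, so the span is (last block + 1) times the block size, which here matches the "User Capacity" of 1,000,204,886,016 bytes in the smartctl output. The helper name below is made up:

```shell
#!/bin/sh
# Bytes covered by a badblocks run: blocks are numbered from 0, so the count
# is last_block + 1, multiplied by the block size given with -b.
badblocks_span_bytes() {
    last_block=$1 block_size=$2
    echo $(( (last_block + 1) * block_size ))
}

badblocks_span_bytes 244190645 4096   # prints 1000204886016
```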

blinkingbit · 2024-06-21 21:06:38

After moving /var to /dev/nvme0n1p3 the issue hasn't happened again. I guess it is related to the bus speed or the SATA cable being too slow or unstable.

My idea behind mounting /var on that HDD was to keep all log-related reads and writes off my SSD, to save the limited write cycles flash memory has... but I guess it was a bad idea.
If there is nothing more to explore here, I will mark this as resolved in a couple of days.
