#1 2020-05-24 15:57:49

themusicalduck
Member
Registered: 2011-07-04
Posts: 123

Many full system freezes each day for the last few weeks.

I've happily been using Arch for years now and never felt like it was too unstable.

Something has changed recently on my system, and I've been getting a lot of full system freezes. They started happening occasionally a few weeks ago, but in the last few days they seem to happen multiple times a day.

I've tried linux-zen, linux-lts, linux and linux-git, and I get crashes on all of them. I saw a suggestion to try linux-amd-staging-drm-next-git from the AUR, but it fails to compile and I wasn't able to figure out why.

I was previously using mesa-git but downgraded to mesa.

The freezes are most common when I'm using SteamVR, but they also happen during normal desktop use. They very often happen when I'm trying to log in after having been away from the PC for a while.

It happens whether I'm using Wayland or X. I'm using GNOME and GDM. I was using the xrdesktop patched version of gnome-shell, but I switched back to normal gnome-shell and I still got a freeze.

Sometimes I'm able to get to a TTY and reboot, but most of the time I can't. I can't ssh in either. Sometimes the screen goes black, sometimes it goes black and then comes back but covered in graphical artefacts (but still frozen), sometimes it just freezes and ignores all input.

If I run sudo journalctl | grep -i "hardware error" I get this output:

May 17 01:38:23 rikka kernel: mce: [Hardware Error]: Machine check events logged
May 17 01:38:23 rikka kernel: mce: [Hardware Error]: CPU 10: Machine Check: 0 Bank 5: bea0000000000108
May 17 01:38:23 rikka kernel: mce: [Hardware Error]: TSC 0 ADDR 1ffff95f129cc MISC d012000100000000 SYND 4d000000 IPID 500b000000000 
May 17 01:38:23 rikka kernel: mce: [Hardware Error]: PROCESSOR 2:870f10 TIME 1589675897 SOCKET 0 APIC 5 microcode 8701013
May 18 06:10:18 rikka kernel: mce: [Hardware Error]: Machine check events logged
May 18 06:10:18 rikka kernel: mce: [Hardware Error]: CPU 5: Machine Check: 0 Bank 5: bea0000000000108
May 18 06:10:18 rikka kernel: mce: [Hardware Error]: TSC 0 ADDR 7f8d6c3c201c MISC d012000100000000 SYND 4d000000 IPID 500b000000000 
May 18 06:10:18 rikka kernel: mce: [Hardware Error]: PROCESSOR 2:870f10 TIME 1589778613 SOCKET 0 APIC a microcode 8701013
May 18 23:29:04 rikka kernel: mce: [Hardware Error]: Machine check events logged
May 18 23:29:04 rikka kernel: mce: [Hardware Error]: CPU 11: Machine Check: 0 Bank 5: bea0000000000108
May 18 23:29:04 rikka kernel: mce: [Hardware Error]: TSC 0 ADDR 1ffffc0fcd3f4 MISC d012000100000000 SYND 4d000000 IPID 500b000000000 
May 18 23:29:04 rikka kernel: mce: [Hardware Error]: PROCESSOR 2:870f10 TIME 1589840939 SOCKET 0 APIC 7 microcode 8701013
May 18 23:29:04 rikka kernel: mce: [Hardware Error]: Machine check events logged
May 18 23:29:04 rikka kernel: mce: [Hardware Error]: CPU 14: Machine Check: 0 Bank 5: bea0000000000108
May 18 23:29:04 rikka kernel: mce: [Hardware Error]: TSC 0 ADDR 1ffff922789f6 MISC d012000100000000 SYND 4d000000 IPID 500b000000000 
May 18 23:29:04 rikka kernel: mce: [Hardware Error]: PROCESSOR 2:870f10 TIME 1589840939 SOCKET 0 APIC d microcode 8701013

This suggests a CPU issue might be the cause sometimes, but since the last entry is from the 18th, I don't think it's the most common cause.
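For anyone who wants to read the status value in those lines (bea0000000000108), here is a minimal Python sketch. It assumes the standard x86 MCi_STATUS layout for the top flag bits (63 down to 57) and treats the low 16 bits as the error code. The helper name decode_mce_status is made up for illustration, and the sketch ignores the vendor-specific bits in between.

```python
# Architectural MCi_STATUS flag bits (bit position -> short name).
# Assumption: only the vendor-neutral bits 63..57 are decoded here.
MCI_STATUS_FLAGS = {
    63: "VAL",    # register holds a valid error
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # error was not corrected by hardware
    60: "EN",     # reporting of this error was enabled
    59: "MISCV",  # the MISC register holds valid data
    58: "ADDRV",  # the ADDR register holds valid data
    57: "PCC",    # processor context may be corrupt
}


def decode_mce_status(status_hex):
    """Return (list of set flag names, low 16-bit error code)."""
    status = int(status_hex, 16)
    flags = [
        name
        for bit, name in sorted(MCI_STATUS_FLAGS.items(), reverse=True)
        if (status >> bit) & 1
    ]
    return flags, status & 0xFFFF


flags, code = decode_mce_status("bea0000000000108")
print(flags, hex(code))
```

With UC and PCC both set, the hardware is reporting an uncorrected error that left the processor context unreliable. That would fit a hard freeze, though on its own it does not say what caused it.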

This is a grep for "error" over a period today when my system froze. I'm not sure if there is anything useful in it:

May 24 08:30:47 rikka kernel: RAS: Correctable Errors collector initialized.
May 24 08:30:49 rikka ntpd[1177]: kernel reports TIME_ERROR: 0x41: Clock Unsynchronized
May 24 08:30:49 rikka ntpd[1177]: kernel reports TIME_ERROR: 0x41: Clock Unsynchronized
May 24 08:30:49 rikka watchdog[1188]: error retry time-out = 60 seconds
May 24 08:30:49 rikka sh[1189]: /bin/sh: line 0: echo: write error: Invalid argument
May 24 08:30:50 rikka gnome-session[1545]: gnome-session-binary[1545]: WARNING: Falling back to non-systemd startup procedure due to error: GDBus.Error:org.freedesktop.DBus.Error.Spawn.ChildExited: Process org.freedesktop.systemd1 exited with status 1
May 24 08:30:50 rikka gnome-session-binary[1545]: WARNING: Falling back to non-systemd startup procedure due to error: GDBus.Error:org.freedesktop.DBus.Error.Spawn.ChildExited: Process org.freedesktop.systemd1 exited with status 1
May 24 08:30:51 rikka gnome-shell[1565]: Error looking up permission: GDBus.Error:org.freedesktop.portal.Error.NotFound: No entry for geolocation
May 24 08:30:51 rikka gsd-sharing[1812]: Failed to StopUnit service: GDBus.Error:org.freedesktop.DBus.Error.Spawn.ChildExited: Process org.freedesktop.systemd1 exited with status 1
May 24 08:30:51 rikka gsd-sharing[1812]: Failed to StopUnit service: GDBus.Error:org.freedesktop.DBus.Error.Spawn.ChildExited: Process org.freedesktop.systemd1 exited with status 1
May 24 08:30:51 rikka gsd-sharing[1812]: Failed to StopUnit service: GDBus.Error:org.freedesktop.DBus.Error.Spawn.ChildExited: Process org.freedesktop.systemd1 exited with status 1
May 24 08:30:52 rikka org.gnome.Shell.desktop[2018]: > Internal error:   Could not resolve keysym Invalid
May 24 08:30:52 rikka org.gnome.Shell.desktop[2018]: Errors from xkbcomp are not fatal to the X server
May 24 08:30:54 rikka gnome-session-c[2183]: Error creating FIFO: File exists
May 24 08:31:05 rikka gnome-shell[2200]: JS ERROR: Could not load extension gnome-shell-extension-xrdesktop: Error: Missing metadata.json
May 24 08:31:06 rikka gnome-shell[2200]: JS ERROR: Could not load extension nope: Error: Missing metadata.json
May 24 08:31:06 rikka gnome-shell[2814]: > Internal error:   Could not resolve keysym Invalid
May 24 08:31:06 rikka gnome-shell[2814]: Errors from xkbcomp are not fatal to the X server
May 24 08:31:06 rikka gnome-software[2560]: not GsPlugin error g-io-error-quark:15: failed to process components/component/id[text()='archlinux.www.Arch Linux-(null)']/../pkgname/..|components/component[@type='webapp']/id[text()='archlinux.www.Arch Linux-(null)']/..|component/id[text()='archlinux.www.Arch Linux-(null)']/..: cannot parse text or number `null`
May 24 08:31:06 rikka gnome-software[2560]: not GsPlugin error g-io-error-quark:15: failed to process components/component/id[text()='archlinux.www.Arch Linux-(null)']/../pkgname/..|components/component[@type='webapp']/id[text()='archlinux.www.Arch Linux-(null)']/..|component/id[text()='archlinux.www.Arch Linux-(null)']/..: cannot parse text or number `null`
May 24 08:31:06 rikka gnome-software[2560]: not GsPlugin error g-io-error-quark:15: failed to process components/component/id[text()='archlinux.www.Arch Linux-(null)']/../pkgname/..|components/component[@type='webapp']/id[text()='archlinux.www.Arch Linux-(null)']/..|component/id[text()='archlinux.www.Arch Linux-(null)']/..: cannot parse text or number `null`
May 24 08:31:06 rikka gnome-software[2560]: not handling error failed for action refine: failed to process components/component/id[text()='archlinux.www.Arch Linux-(null)']/../pkgname/..|components/component[@type='webapp']/id[text()='archlinux.www.Arch Linux-(null)']/..|component/id[text()='archlinux.www.Arch Linux-(null)']/..: cannot parse text or number `null`
May 24 08:31:06 rikka gnome-software[2560]: Error loading the metadata file for 'system/flatpak/gnome/localization/org.gnome.Platform.Locale/3.22': No such file or directory
May 24 08:31:07 rikka discord.desktop[2971]: [2971:0524/083107.218783:ERROR:buffer_manager.cc(488)] [.DisplayCompositor]GL ERROR :GL_INVALID_OPERATION : glBufferData: <- error from previous GL command
May 24 08:31:07 rikka discord.desktop[2971]: [2971:0524/083107.684568:ERROR:buffer_manager.cc(488)] [.DisplayCompositor]GL ERROR :GL_INVALID_OPERATION : glBufferData: <- error from previous GL command
May 24 08:31:09 rikka discord.desktop[2580]: Error downloading with electron net: HTTP Error: Status Code 403
May 24 08:31:09 rikka discord.desktop[2580]: [Modules] Failed fetching module discord_krisp@0: Error: HTTP Error: Status Code 403
May 24 08:31:13 rikka zeitgeist-datah[2587]: zeitgeist-datahub.vala:207: Error during inserting events: GDBus.Error:org.gnome.zeitgeist.EngineError.InvalidArgument: Incomplete event: interpretation, manifestation and actor are required
May 24 08:31:14 rikka zeitgeist-datah[2587]: zeitgeist-datahub.vala:207: Error during inserting events: GDBus.Error:org.gnome.zeitgeist.EngineError.InvalidArgument: Incomplete event: interpretation, manifestation and actor are required
May 24 08:31:21 rikka gnome-shell[4779]: [Child 4779, MediaDecoderStateMachine #1] WARNING: Decoder=7f03b813a000 Decode error: NS_ERROR_DOM_MEDIA_FATAL_ERR (0x806e0005) - RefPtr<MediaSourceTrackDemuxer::SamplesPromise> mozilla::MediaSourceTrackDemuxer::DoGetSamples(int32_t): manager is detached.: file /build/firefox/src/firefox-76.0.1/dom/media/MediaDecoderStateMachine.cpp, line 3367
May 24 08:31:21 rikka gnome-shell[4779]: [Child 4779, MediaDecoderStateMachine #1] WARNING: Decoder=7f03b813a000 Decode error: NS_ERROR_DOM_MEDIA_FATAL_ERR (0x806e0005) - RefPtr<MediaSourceTrackDemuxer::SamplesPromise> mozilla::MediaSourceTrackDemuxer::DoGetSamples(int32_t): manager is detached.: file /build/firefox/src/firefox-76.0.1/dom/media/MediaDecoderStateMachine.cpp, line 3367
May 24 08:31:21 rikka gnome-shell[4779]: [Child 4779, MediaDecoderStateMachine #1] WARNING: Decoder=7f03b813a000 Decode error: NS_ERROR_DOM_MEDIA_FATAL_ERR (0x806e0005) - RefPtr<MediaSourceTrackDemuxer::SamplesPromise> mozilla::MediaSourceTrackDemuxer::DoGetSamples(int32_t): manager is detached.: file /build/firefox/src/firefox-76.0.1/dom/media/MediaDecoderStateMachine.cpp, line 3367
May 24 08:34:25 rikka zeitgeist-datah[2587]: zeitgeist-datahub.vala:207: Error during inserting events: GDBus.Error:org.gnome.zeitgeist.EngineError.InvalidArgument: Incomplete event: interpretation, manifestation and actor are required

My boot entry and kernel options:

title   Arch Linux
linux   /vmlinuz-linux
initrd  /amd-ucode.img
initrd  /initramfs-linux.img
options root=PARTUUID="9e9ae1df-6a48-4391-99c2-940a242d4a7d" rootfstype=ext4 add_efi_memmap amd_iommu=on iommu=pt vfio-pci.ids=10de:13c2,10de:0fbb

amd-ucode is installed.

System specs:

AMD 3700X
RX5700XT
TUF GAMING X570-PLUS motherboard
Corsair Vengeance LPX RAM

Arch is fully up to date.

It doesn't seem to be overheating.

Does anyone have any ideas on how I can investigate this? I've already experienced one instance of data corruption because of the hard restarts, and I'm worried about what may happen if I have to keep doing them.

Thank you.

Last edited by themusicalduck (2020-05-24 16:02:30)

#2 2020-05-24 16:42:54

Morn
Member
Registered: 2012-09-02
Posts: 886

Re: Many full system freezes each day for the last few weeks.

The MCE errors look like the good old Ryzen CPU Linux freeze bug to me. (I thought this was no longer an issue with current Ryzen CPUs?) I suppose you could try "processor.max_cstate=5" and see if that fixes the MCE errors and crashes...
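The parameter would go at the end of the options line in the boot entry. After a reboot, /proc/cmdline shows what the kernel actually received, so it is easy to confirm that the setting took effect. Below is a small Python sketch of that check. The function name kernel_param_value and the sample string are illustrative assumptions (the string is an abridged copy of the options from the first post with the new parameter appended); on the real machine you would pass in the contents of /proc/cmdline instead.

```python
def kernel_param_value(cmdline, name):
    """Look up `name` on a kernel command line.

    Returns the value after '=', an empty string for a bare flag,
    or None when the parameter is not present at all.
    """
    for token in cmdline.split():
        key, _, value = token.partition("=")
        if key == name:
            return value
    return None


# Illustrative sample only; on the real system use:
#   cmdline = open("/proc/cmdline").read()
SAMPLE_CMDLINE = (
    "rootfstype=ext4 add_efi_memmap amd_iommu=on iommu=pt "
    "processor.max_cstate=5"
)

print(kernel_param_value(SAMPLE_CMDLINE, "processor.max_cstate"))
```

If the function returns None after rebooting, the boot entry that was edited is probably not the one being booted.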

#3 2020-05-24 16:50:04

themusicalduck
Member
Registered: 2011-07-04
Posts: 123

Re: Many full system freezes each day for the last few weeks.

Morn wrote:

The MCE errors look like the good old Ryzen CPU Linux freeze bug to me. (I thought this was no longer an issue with current Ryzen CPUs?) I suppose you could try "processor.max_cstate=5" and see if that fixes the MCE errors and crashes...

Thanks, I'll give that a try.

I also just found the linux-amd package on AUR and managed to get it to compile. I'm going to see if that helps too.

Last edited by themusicalduck (2020-05-24 16:50:30)

#4 2020-05-24 17:41:29

Morn
Member
Registered: 2012-09-02
Posts: 886

Re: Many full system freezes each day for the last few weeks.

My Ryzen 7 1700 has not had any more freezes with the kernel boot parameter; no AUR package has been necessary.

#5 2020-05-25 00:32:00

Ropid
Member
Registered: 2015-03-09
Posts: 1,069

Re: Many full system freezes each day for the last few weeks.

Do you have the latest BIOS for your board? I see a BIOS from April this year for your motherboard on the manufacturer's website, and it has "improve system stability" in its description.

I've heard about people having instability with stock settings and needing a slight tweak like +0.05V offset for the core voltage.

The other thing I heard is that you can't rely on the RAM's "XMP" speed profile being stable. People often need to overclock the memory manually instead of using the XMP profile.

#6 2020-05-25 09:26:58

themusicalduck
Member
Registered: 2011-07-04
Posts: 123

Re: Many full system freezes each day for the last few weeks.

Thanks Ropid. My BIOS actually wasn't up to date but is now. Thanks for pointing that out.

I should add that my system runs stable on Windows so I have a feeling it isn't a hardware problem.

My RAM speed was set to auto, which actually was putting it at a lower frequency than it should be. I've set it to the stock frequency.

Since making all of these changes, I've just had a freeze while logging in, so it doesn't seem like I've found the problem yet.

This time I was able to ssh in and got a dmesg output:

[ 1440.897086] audit: type=1103 audit(1590396601.697:112): pid=5862 uid=0 auid=4294967295 ses=4294967295 msg='op=PAM:setcred grantors=pam_unix,pam_env acct="root" exe="/usr/bin/crond" hostname=? addr=? terminal=cron res=success'
[ 1440.897121] audit: type=1006 audit(1590396601.697:113): pid=5862 uid=0 old-auid=4294967295 auid=0 tty=(none) old-ses=4294967295 ses=6 res=1
[ 1440.900185] audit: type=1105 audit(1590396601.700:114): pid=5862 uid=0 auid=0 ses=6 msg='op=PAM:session_open grantors=pam_loginuid,pam_limits,pam_unix acct="root" exe="/usr/bin/crond" hostname=? addr=? terminal=cron res=success'
[ 1440.900189] audit: type=1110 audit(1590396601.700:115): pid=5862 uid=0 auid=0 ses=6 msg='op=PAM:setcred grantors=pam_unix,pam_env acct="root" exe="/usr/bin/crond" hostname=? addr=? terminal=cron res=success'
[ 1443.937699] audit: type=1104 audit(1590396604.737:116): pid=5862 uid=0 auid=0 ses=6 msg='op=PAM:setcred grantors=pam_unix,pam_env acct="root" exe="/usr/bin/crond" hostname=? addr=? terminal=cron res=success'
[ 1443.937713] audit: type=1106 audit(1590396604.737:117): pid=5862 uid=0 auid=0 ses=6 msg='op=PAM:session_close grantors=pam_loginuid,pam_limits,pam_unix acct="root" exe="/usr/bin/crond" hostname=? addr=? terminal=cron res=success'
[ 2040.967857] audit: type=1101 audit(1590397201.747:118): pid=5943 uid=0 auid=4294967295 ses=4294967295 msg='op=PAM:accounting grantors=pam_access,pam_unix,pam_time acct="root" exe="/usr/bin/crond" hostname=? addr=? terminal=cron res=success'
[ 2040.967873] audit: type=1103 audit(1590397201.747:119): pid=5943 uid=0 auid=4294967295 ses=4294967295 msg='op=PAM:setcred grantors=pam_unix,pam_env acct="root" exe="/usr/bin/crond" hostname=? addr=? terminal=cron res=success'
[ 2040.967908] audit: type=1006 audit(1590397201.747:120): pid=5943 uid=0 old-auid=4294967295 auid=0 tty=(none) old-ses=4294967295 ses=7 res=1
[ 2040.971122] audit: type=1105 audit(1590397201.750:121): pid=5943 uid=0 auid=0 ses=7 msg='op=PAM:session_open grantors=pam_loginuid,pam_limits,pam_unix acct="root" exe="/usr/bin/crond" hostname=? addr=? terminal=cron res=success'
[ 2040.971212] audit: type=1110 audit(1590397201.750:122): pid=5943 uid=0 auid=0 ses=7 msg='op=PAM:setcred grantors=pam_unix,pam_env acct="root" exe="/usr/bin/crond" hostname=? addr=? terminal=cron res=success'
[ 2044.012082] audit: type=1104 audit(1590397204.792:123): pid=5943 uid=0 auid=0 ses=7 msg='op=PAM:setcred grantors=pam_unix,pam_env acct="root" exe="/usr/bin/crond" hostname=? addr=? terminal=cron res=success'
[ 2044.012086] audit: type=1106 audit(1590397204.792:124): pid=5943 uid=0 auid=0 ses=7 msg='op=PAM:session_close grantors=pam_loginuid,pam_limits,pam_unix acct="root" exe="/usr/bin/crond" hostname=? addr=? terminal=cron res=success'
[ 2106.376485] BUG: kernel NULL pointer dereference, address: 0000000000000314
[ 2106.376489] #PF: supervisor read access in kernel mode
[ 2106.376490] #PF: error_code(0x0000) - not-present page
[ 2106.376491] PGD 0 P4D 0 
[ 2106.376494] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 2106.376496] CPU: 8 PID: 5784 Comm: kworker/8:0 Not tainted 5.6.14-zen1-1-zen #1
[ 2106.376498] Hardware name: System manufacturer System Product Name/TUF GAMING X570-PLUS, BIOS 1407 04/01/2020
[ 2106.376598] Workqueue: events dm_irq_work_func [amdgpu]
[ 2106.376688] RIP: 0010:dc_link_handle_hpd_rx_irq+0x594/0xe40 [amdgpu]
[ 2106.376690] Code: 89 74 24 18 be 01 00 00 00 48 89 7c 24 20 48 8d 7c 24 18 e8 be f8 ff ff 49 63 c5 48 69 c0 a8 04 00 00 48 8b 84 03 c0 01 00 00 <83> b8 14 03 00 00 40 0f 84 a9 07 00 00 49 8b 96 20 01 00 00 48 8b
[ 2106.376692] RSP: 0018:ffffa40982fd7d70 EFLAGS: 00010282
[ 2106.376693] RAX: 0000000000000000 RBX: ffff954f8d020000 RCX: 0000000000000004
[ 2106.376694] RDX: ffffffffc09c60ec RSI: ffffffffc09c6110 RDI: 0000000000000000
[ 2106.376696] RBP: ffffa40982fd7df8 R08: ffffffffc09c610b R09: 0000000000000001
[ 2106.376697] R10: ffffa40982fd7b10 R11: ffffffffc09c6190 R12: 0000000000000000
[ 2106.376698] R13: 0000000000000000 R14: ffff955092e0a400 R15: ffffa40983e1fdb8
[ 2106.376700] FS:  0000000000000000(0000) GS:ffff95509ec00000(0000) knlGS:0000000000000000
[ 2106.376701] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2106.376702] CR2: 0000000000000314 CR3: 000000078b1da000 CR4: 0000000000350ee0
[ 2106.376704] Call Trace:
[ 2106.376710]  ? __switch_to_asm+0x34/0x70
[ 2106.376711]  ? __switch_to_asm+0x40/0x70
[ 2106.376715]  ? syscall_return_via_sysret+0xf/0x7f
[ 2106.376717]  ? __switch_to_asm+0x34/0x70
[ 2106.376808]  handle_hpd_rx_irq+0x7f/0x330 [amdgpu]
[ 2106.376895]  dm_irq_work_func+0x49/0x60 [amdgpu]
[ 2106.376899]  process_one_work+0x1da/0x3d0
[ 2106.376902]  worker_thread+0x4d/0x470
[ 2106.376905]  ? process_one_work+0x3d0/0x3d0
[ 2106.376907]  kthread+0x153/0x170
[ 2106.376909]  ? __kthread_init_worker+0x50/0x50
[ 2106.376911]  ret_from_fork+0x22/0x40
[ 2106.376915] Modules linked in: tun rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace sunrpc fscache fuse rfcomm cmac algif_hash algif_skcipher af_alg bnep btusb btrtl btbcm btintel bluetooth ecdh_generic ecc cdc_acm uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 ebtable_filter videobuf2_common ebtables videodev snd_usb_audio snd_usbmidi_lib snd_rawmidi snd_seq_device mc ip6table_filter ip6_tables iptable_filter joydev mousedev input_leds hid_generic nls_iso8859_1 nls_cp437 vfat fat eeepc_wmi edac_mce_amd asus_wmi battery sparse_keymap rfkill wmi_bmof ccp snd_hda_codec_realtek rng_core snd_hda_codec_generic ledtrig_audio kvm snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hda_core snd_hwdep crct10dif_pclmul crc32_pclmul ghash_clmulni_intel r8169 snd_pcm nouveau aesni_intel sp5100_tco snd_timer realtek snd crypto_simd libphy mxm_wmi soundcore i2c_piix4 cryptd k10temp glue_helper pcspkr wmi pinctrl_amd evdev mac_hid acpi_cpufreq usbhid hid msr sg
[ 2106.376950]  crypto_user ip_tables x_tables ext4 crc16 mbcache jbd2 xhci_pci xhci_hcd amdgpu gpu_sched i2c_algo_bit ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec rc_core drm agpgart vfio_pci irqbypass vfio_virqfd vfio_iommu_type1 vfio overlay btrfs blake2b_generic libcrc32c crc32c_generic crc32c_intel xor raid6_pq zram
[ 2106.376965] CR2: 0000000000000314
[ 2106.376967] ---[ end trace f0c13d8f24a229c7 ]---
[ 2106.377051] RIP: 0010:dc_link_handle_hpd_rx_irq+0x594/0xe40 [amdgpu]
[ 2106.377053] Code: 89 74 24 18 be 01 00 00 00 48 89 7c 24 20 48 8d 7c 24 18 e8 be f8 ff ff 49 63 c5 48 69 c0 a8 04 00 00 48 8b 84 03 c0 01 00 00 <83> b8 14 03 00 00 40 0f 84 a9 07 00 00 49 8b 96 20 01 00 00 48 8b
[ 2106.377054] RSP: 0018:ffffa40982fd7d70 EFLAGS: 00010282
[ 2106.377055] RAX: 0000000000000000 RBX: ffff954f8d020000 RCX: 0000000000000004
[ 2106.377056] RDX: ffffffffc09c60ec RSI: ffffffffc09c6110 RDI: 0000000000000000
[ 2106.377058] RBP: ffffa40982fd7df8 R08: ffffffffc09c610b R09: 0000000000000001
[ 2106.377059] R10: ffffa40982fd7b10 R11: ffffffffc09c6190 R12: 0000000000000000
[ 2106.377060] R13: 0000000000000000 R14: ffff955092e0a400 R15: ffffa40983e1fdb8
[ 2106.377061] FS:  0000000000000000(0000) GS:ffff95509ec00000(0000) knlGS:0000000000000000
[ 2106.377063] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2106.377064] CR2: 0000000000000314 CR3: 000000078b1da000 CR4: 0000000000350ee0
[ 2108.708433] audit: type=1100 audit(1590397269.495:125): pid=5966 uid=0 auid=1000 ses=3 msg='op=PAM:authentication grantors=pam_tally2,pam_shells,pam_unix,pam_permit,pam_gnome_keyring acct="theo" exe="/usr/lib/gdm-session-worker" hostname=rikka addr=? terminal=/dev/tty1 res=success'
[ 2108.713553] audit: type=1101 audit(1590397269.501:126): pid=5966 uid=0 auid=1000 ses=3 msg='op=PAM:accounting grantors=pam_tally2,pam_access,pam_unix,pam_permit,pam_time acct="theo" exe="/usr/lib/gdm-session-worker" hostname=rikka addr=? terminal=/dev/tty1 res=success'
[ 2108.715099] audit: type=1110 audit(1590397269.502:127): pid=5966 uid=0 auid=1000 ses=3 msg='op=PAM:setcred grantors=pam_tally2,pam_shells,pam_unix,pam_permit,pam_gnome_keyring acct="theo" exe="/usr/lib/gdm-session-worker" hostname=rikka addr=? terminal=/dev/tty1 res=success'
[ 2155.565261] audit: type=1101 audit(1590397316.356:128): pid=5990 uid=0 auid=4294967295 ses=4294967295 msg='op=PAM:accounting grantors=pam_tally2,pam_access,pam_unix,pam_permit,pam_time acct="theo" exe="/usr/bin/sshd" hostname=192.168.0.14 addr=192.168.0.14 terminal=ssh res=success'
[ 2155.565992] audit: type=1103 audit(1590397316.356:129): pid=5990 uid=0 auid=4294967295 ses=4294967295 msg='op=PAM:setcred grantors=pam_tally2,pam_shells,pam_unix,pam_permit,pam_env acct="theo" exe="/usr/bin/sshd" hostname=192.168.0.14 addr=192.168.0.14 terminal=ssh res=success'
[ 2155.565995] audit: type=1006 audit(1590397316.357:130): pid=5990 uid=0 old-auid=4294967295 auid=1000 tty=(none) old-ses=4294967295 ses=8 res=1
[ 2155.575466] audit: type=1105 audit(1590397316.366:131): pid=5990 uid=0 auid=1000 ses=8 msg='op=PAM:session_open grantors=pam_loginuid,pam_keyinit,pam_limits,pam_unix,pam_permit,pam_mail,pam_systemd,pam_env acct="theo" exe="/usr/bin/sshd" hostname=192.168.0.14 addr=192.168.0.14 terminal=ssh res=success'
[ 2155.576649] audit: type=1103 audit(1590397316.367:132): pid=5992 uid=0 auid=1000 ses=8 msg='op=PAM:setcred grantors=pam_tally2,pam_shells,pam_unix,pam_permit,pam_env acct="theo" exe="/usr/bin/sshd" hostname=192.168.0.14 addr=192.168.0.14 terminal=ssh res=success'
[ 2162.097087] audit: type=1334 audit(1590397322.888:133): prog-id=19 op=LOAD
[ 2162.097160] audit: type=1334 audit(1590397322.888:134): prog-id=20 op=LOAD
[ 2162.306331] audit: type=1130 audit(1590397323.097:135): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-localed comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[ 2192.367134] audit: type=1131 audit(1590397353.160:136): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-localed comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'

When googling the stack trace part of the output, I came across this comment:

https://bugzilla.redhat.com/show_bug.cgi?id=1810830#c3

which seems to be exactly what I'm experiencing. Shame there's no more information on that page about it.

I'm going to try disabling all my extensions. I also recently reconnected my second GPU, which I was using for vfio. That could very well be the problem if there is some kind of conflict (even with the vfio-pci.ids option, which is meant to keep the host from using the card until vfio is started). I'll disconnect it and see if there is any change.
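One way to check for that kind of conflict is to see which kernel driver is actually bound to the card. The module list in the trace above includes nouveau as well as vfio_pci, so either could in principle have claimed it. The following is a rough Python sketch that reads the standard sysfs PCI tree; the function names are made up for illustration, and the IDs are the ones from the vfio-pci.ids option in the first post.

```python
from pathlib import Path


def format_pci_id(vendor_text, device_text):
    # sysfs stores these as hex strings with a 0x prefix, e.g. "0x10de".
    return f"{int(vendor_text, 16):04x}:{int(device_text, 16):04x}"


def bound_drivers(wanted_ids, sysfs_root="/sys/bus/pci/devices"):
    """Map PCI address -> bound driver name (or None) for matching IDs."""
    wanted = {pci_id.lower() for pci_id in wanted_ids}
    found = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        return found  # not a Linux system, or sysfs is not mounted
    for dev in sorted(root.iterdir()):
        try:
            pci_id = format_pci_id(
                (dev / "vendor").read_text(),
                (dev / "device").read_text(),
            )
        except (OSError, ValueError):
            continue  # skip entries that cannot be read or parsed
        if pci_id in wanted:
            link = dev / "driver"  # symlink to the bound driver, if any
            found[dev.name] = link.resolve().name if link.exists() else None
    return found


print(bound_drivers(["10de:13c2", "10de:0fbb"]))
```

If the result names anything other than vfio-pci for those devices, the host is using the card despite the vfio-pci.ids option, which would be worth ruling out before blaming amdgpu alone.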

Last edited by themusicalduck (2020-05-25 09:27:27)

#7 2020-05-25 12:40:38

Ropid
Member
Registered: 2015-03-09
Posts: 1,069

Re: Many full system freezes each day for the last few weeks.

Since the trace mentions "amdgpu", I would try testing the upcoming 5.7 kernel. There's an AUR package "linux-mainline" for that. Someone shares a repo here where it's already built, so you don't have to build it yourself (which would mean around 2GB of source code to download and 15 minutes of compiling with 8 cores):

https://arch.miffe.org/x86_64/

You add the repo like this at the end of /etc/pacman.conf:

# 'linux-mainline'
[miffe]
Server = https://arch.miffe.org/$arch/

The amdgpu bug tracker is here; I can see some people there battling with the driver and a 5700XT:

https://gitlab.freedesktop.org/drm/amd/ … &state=all

Last edited by Ropid (2020-05-25 12:42:06)

#8 2020-05-25 14:29:28

themusicalduck
Member
Registered: 2011-07-04
Posts: 123

Re: Many full system freezes each day for the last few weeks.

After disconnecting the second GPU I haven't had any more freezes while logging in. This is a great improvement already. I am still investigating why it happens in the first place, though, as I would like to keep using the card.

I'll try out the mainline kernel and see if it prevents the freezes while using SteamVR (although I do like the performance gain I seem to get from using linux-zen, but at least once it's updated to the next linux release I can go back to using it).
