
#1 2016-07-17 14:31:46

L1ghtmareI
Member
Registered: 2014-08-29
Posts: 96

[nvidia/vulkan bug?] Plasma 5.7 hangs into crash?

Regardless of workload or application (it has happened in Chromium and in Dota), my system sometimes hangs: first the graphics stop refreshing, then the cursor freezes. Music keeps playing, though, so it might just be a graphical bug? The system doesn't respond to Ctrl+Alt+Del, Alt+F4, Alt+F2, or Alt+Tab. I haven't tried switching terminals yet.
After rebooting with the case button, everything seems to work fine until the next crash. This has happened 3 times already and is extremely frustrating, since I have no way to tell that the system is about to hang, or to prevent it.
How can I find the relevant logs, either during the crash from another terminal or after a reboot? Every journal I know of only stores logs from the current session. What could be the reason for the hangs?
It apparently started after the upgrade to Plasma 5.7.1, although I can't be sure of that because the hangs are relatively infrequent.

I'm using proprietary NVIDIA drivers.

Thanks in advance.

Last edited by L1ghtmareI (2016-07-18 10:19:28)

Offline

#2 2016-07-17 14:37:57

L1ghtmareI
Member
Registered: 2014-08-29
Posts: 96

Re: [nvidia/vulkan bug?] Plasma 5.7 hangs into crash?

https://forum.kde.org/viewtopic.php?f=2 … 94c831ed3b

This might be related; however, it refers to freezes on Intel graphics and is fixed by Alt+Tab?

Last edited by L1ghtmareI (2016-07-17 14:40:37)

Offline

#3 2016-07-17 14:40:01

headkase
Member
Registered: 2011-12-06
Posts: 1,976

Re: [nvidia/vulkan bug?] Plasma 5.7 hangs into crash?

First thing, enable the SysRq key combinations.  Then you have a clean way to reboot your computer that should not result in data loss.  While holding the right Alt AND SysRq keys (SysRq is sometimes unlabeled, but in that case it's your "Print Screen" key), slowly press in sequence: R, E, I, S, U, B.  With the "B" your system will reboot cleanly.  Give a few seconds between each key press.  Another SysRq key is "K", which will just kill your X server and, if you have a login manager, should kick you back to that.

Once you do the above then you should be able to have the error occur, cleanly reboot, and then going forward try to track down the actual problem without risking data loss/corruption.
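
For reference, whether each of those keys is honoured depends on the value of kernel.sysrq: 1 enables everything, and larger values are a bitmask. As far as I know, many systemd-based installs default to 16 (sync only), in which case only the "S" step would do anything. Below is a rough sketch of that check, using the bit values from the kernel's sysrq documentation. The function name sysrq_allows and the file name 99-sysrq.conf are made up for illustration.

```shell
#!/bin/sh
# Sketch: does a given kernel.sysrq value permit a given Magic SysRq key?
# sysrq_allows is a hypothetical helper, not part of any package.
sysrq_allows() {
    mask=$1 key=$2
    # A value of 1 means "all functions enabled".
    [ "$mask" -eq 1 ] && return 0
    case $key in
        R|K) bit=4 ;;    # keyboard control: unraw (R), SAK (K)
        E|I) bit=64 ;;   # signalling processes: SIGTERM (E), SIGKILL (I)
        S)   bit=16 ;;   # sync
        U)   bit=32 ;;   # remount read-only
        B)   bit=128 ;;  # reboot
        *)   return 2 ;; # key not covered by this sketch
    esac
    [ $(( mask & bit )) -ne 0 ]
}

# The current value can be read without root:
#   cat /proc/sys/kernel/sysrq
# To enable everything persistently, a file such as
# /etc/sysctl.d/99-sysrq.conf (name is my choice) can contain:
#   kernel.sysrq = 1
```

Calling sysrq_allows "$(cat /proc/sys/kernel/sysrq)" B before the next hang tells you whether the final reboot step will work.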

Offline

#4 2016-07-17 14:46:49

L1ghtmareI
Member
Registered: 2014-08-29
Posts: 96

Re: [nvidia/vulkan bug?] Plasma 5.7 hangs into crash?

Thanks, I've enabled SysRq.

What files can I use to track down the issue?

Offline

#5 2016-07-17 14:48:16

headkase
Member
Registered: 2011-12-06
Posts: 1,976

Re: [nvidia/vulkan bug?] Plasma 5.7 hangs into crash?

You're welcome.  You're good for pretty well most system hangs now.  As to where to go from here, sorry - never used KDE or Plasma, Xfce4 all the way for myself.

Offline

#6 2016-07-17 14:50:48

L1ghtmareI
Member
Registered: 2014-08-29
Posts: 96

Re: [nvidia/vulkan bug?] Plasma 5.7 hangs into crash?

Well, I've never used any KDE-specific tools for that, only dmesg and journalctl, which are common Linux tools.

Offline

#7 2016-07-17 14:53:05

headkase
Member
Registered: 2011-12-06
Posts: 1,976

Re: [nvidia/vulkan bug?] Plasma 5.7 hangs into crash?

I'm using xf86-video-ati, Compiz 0.9, and Xfce.  My system does not freeze.  You're using Nvidia and Plasma - we're just different environments.  We do share the common base of tools, but when it comes to locking up we have different software.  Good luck, I hope you get to the root of your issue sooner rather than later.

Offline

#8 2016-07-17 15:21:13

L1ghtmareI
Member
Registered: 2014-08-29
Posts: 96

Re: [nvidia/vulkan bug?] Plasma 5.7 hangs into crash?

It happened again. Switching terminals doesn't work. I think it might be connected to system load after all.

This is what appears in journalctl -b -1 at the moment of the crash:

Jul 17 18:14:07 AlexDesktop kernel: BUG: scheduling while atomic: swapper/0/0/0x00000103
Jul 17 18:14:07 AlexDesktop kernel: Modules linked in: sha256_ssse3 sha256_generic hmac drbg ansi_cprng ctr ccm cmac ecb fuse rfcomm bnep mousedev btusb btrtl btbcm btintel bluetooth snd_usb_audio snd_usbmidi_lib joydev input_leds btrfs 
Jul 17 18:14:07 AlexDesktop kernel:  snd_hwdep snd_soc_rt5640 snd_soc_rl6231 emu10k1_gp snd_soc_ssm4567 mii gameport snd_soc_core lpc_ich shpchp snd_compress snd_pcm_dmaengine ac97_bus snd_pcm parport_pc snd_timer i8042 battery parport f
Jul 17 18:14:07 AlexDesktop kernel: CPU: 0 PID: 0 Comm: swapper/0 Tainted: P        W  O    4.6.4-1-ARCH #1
Jul 17 18:14:07 AlexDesktop kernel: Hardware name: Gigabyte Technology Co., Ltd. H97-HD3/H97-HD3, BIOS F2 04/28/2014
Jul 17 18:14:07 AlexDesktop kernel:  0000000000000286 ece2e61d8c7d5472 ffff88041dc03bc8 ffffffff812e54c2
Jul 17 18:14:07 AlexDesktop kernel:  ffff88041dc15940 ffffffffa0d1c2a0 ffff88041dc03bd8 ffffffff810a112b
Jul 17 18:14:07 AlexDesktop kernel:  ffff88041dc03c28 ffffffff815c3549 00ffffff00000001 0000000000000ebb
Jul 17 18:14:07 AlexDesktop kernel: Call Trace:
Jul 17 18:14:07 AlexDesktop kernel:  <IRQ>  [<ffffffff812e54c2>] dump_stack+0x63/0x81
Jul 17 18:14:07 AlexDesktop kernel:  [<ffffffff810a112b>] __schedule_bug+0x4b/0x60
Jul 17 18:14:07 AlexDesktop kernel:  [<ffffffff815c3549>] __schedule+0x899/0xad0
Jul 17 18:14:07 AlexDesktop kernel:  [<ffffffff815c37bc>] schedule+0x3c/0x90
Jul 17 18:14:07 AlexDesktop kernel:  [<ffffffff815c6293>] schedule_timeout+0x1d3/0x260
Jul 17 18:14:07 AlexDesktop kernel:  [<ffffffff814983ac>] ? pci_conf1_read+0xbc/0x100
Jul 17 18:14:07 AlexDesktop kernel:  [<ffffffffa0342f42>] ? os_acquire_spinlock+0x12/0x20 [nvidia]
Jul 17 18:14:07 AlexDesktop kernel:  [<ffffffffa0342f42>] ? os_acquire_spinlock+0x12/0x20 [nvidia]
Jul 17 18:14:07 AlexDesktop kernel:  [<ffffffff815c5236>] __down+0x76/0xc0
Jul 17 18:14:07 AlexDesktop kernel:  [<ffffffff810c4611>] down+0x41/0x50
Jul 17 18:14:07 AlexDesktop kernel:  [<ffffffffa033af51>] nv_get_adapter_state+0x31/0xc0 [nvidia]
Jul 17 18:14:07 AlexDesktop kernel:  [<ffffffffa08a432c>] _nv016801rm+0xdc/0x120 [nvidia]
Jul 17 18:14:07 AlexDesktop kernel:  [<ffffffffa053e1b3>] ? _nv008008rm+0x63/0x2a0 [nvidia]
Jul 17 18:14:07 AlexDesktop kernel:  [<ffffffffa05dcaa7>] ? _nv009852rm+0xa7/0xb0 [nvidia]
Jul 17 18:14:07 AlexDesktop kernel:  [<ffffffffa069a792>] ? _nv014185rm+0x572/0x600 [nvidia]
Jul 17 18:14:07 AlexDesktop kernel:  [<ffffffffa08a76a2>] ? _nv000841rm+0x142/0x170 [nvidia]
Jul 17 18:14:07 AlexDesktop kernel:  [<ffffffffa08ac413>] ? rm_isr_bh+0x23/0x70 [nvidia]
Jul 17 18:14:07 AlexDesktop kernel:  [<ffffffffa0337d3d>] ? nvidia_isr_bh+0x3d/0x70 [nvidia]
Jul 17 18:14:07 AlexDesktop kernel:  [<ffffffff810802f0>] ? tasklet_action+0xb0/0xd0
Jul 17 18:14:07 AlexDesktop kernel:  [<ffffffff815c9d56>] ? __do_softirq+0xe6/0x2ec
Jul 17 18:14:07 AlexDesktop kernel:  [<ffffffff8107fd83>] ? irq_exit+0xa3/0xb0
Jul 17 18:14:07 AlexDesktop kernel:  [<ffffffff815c9a84>] ? do_IRQ+0x54/0xd0
Jul 17 18:14:07 AlexDesktop kernel:  [<ffffffff815c7b82>] ? common_interrupt+0x82/0x82
Jul 17 18:14:07 AlexDesktop kernel:  <EOI>  [<ffffffff81475516>] ? cpuidle_enter_state+0x126/0x2d0
Jul 17 18:14:07 AlexDesktop kernel:  [<ffffffff814754f1>] ? cpuidle_enter_state+0x101/0x2d0
Jul 17 18:14:07 AlexDesktop kernel:  [<ffffffff814756f7>] ? cpuidle_enter+0x17/0x20
Jul 17 18:14:07 AlexDesktop kernel:  [<ffffffff810bd9fa>] ? call_cpuidle+0x2a/0x50
Jul 17 18:14:07 AlexDesktop kernel:  [<ffffffff810bde18>] ? cpu_startup_entry+0x2d8/0x390
Jul 17 18:14:07 AlexDesktop kernel:  [<ffffffff815ba3f4>] ? rest_init+0x84/0x90
Jul 17 18:14:07 AlexDesktop kernel:  [<ffffffff8190cff0>] ? start_kernel+0x43e/0x45f
Jul 17 18:14:07 AlexDesktop kernel:  [<ffffffff8190c120>] ? early_idt_handler_array+0x120/0x120
Jul 17 18:14:07 AlexDesktop kernel:  [<ffffffff8190c346>] ? x86_64_start_reservations+0x2a/0x2c
Jul 17 18:14:07 AlexDesktop kernel:  [<ffffffff8190c494>] ? x86_64_start_kernel+0x14c/0x16f
Jul 17 18:15:12 AlexDesktop kernel: sysrq: SysRq : Keyboard mode set to system default
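
A note on reading the trace above: "scheduling while atomic" means something tried to sleep in a context that is not allowed to sleep. Here the frames run from nvidia_isr_bh (a tasklet, so softirq context) through nv_get_adapter_state into down and schedule_timeout. To see at a glance which loadable modules show up in a trace, a rough sketch like the following can help. The helper name trace_modules is made up, and it only counts the "[module]" tags at the end of frames, so it is a hint rather than a diagnosis.

```shell
#!/bin/sh
# Sketch: count which kernel modules are tagged on call-trace lines.
# Reads journal or dmesg text on stdin. trace_modules is a hypothetical name.
trace_modules() {
    # Frames from loadable modules end in "[module]"; frames from the core
    # kernel carry no tag and are therefore not counted.
    grep -o '\[[A-Za-z0-9_]*\]$' | sort | uniq -c | awk '{print $1, $2}'
}
```

Usage, assuming the journal still holds the previous boot: journalctl -k -b -1 --no-pager | trace_modules. The -k switch limits the output to kernel messages, and -b -1 selects the boot before the current one.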

edit: Might it be that the mobo is dying?
I've actually tinkered with kernel modules and BIOS settings. Here's my mkinitcpio config; ask if you need the relevant BIOS settings:

MODULES="xfs" #filesystems 
MODULES+=" sd_mod ahci xhci_pci ehci_pci sdhci_acpi" #storage
MODULES+=" usbhid ohci_pci hid_generic" #keyboard 
MODULES+=" nvidia_drm nvidia_uvm crc32c_intel" #misc (nvidia KMS, hardware crc32 implementation)

BINARIES="fsck fsck.xfs xfs_repair"

FILES=""

HOOKS="base autodetect modconf"

Last edited by L1ghtmareI (2016-07-17 15:29:44)

Offline

#9 2016-07-17 15:35:46

L1ghtmareI
Member
Registered: 2014-08-29
Posts: 96

Re: [nvidia/vulkan bug?] Plasma 5.7 hangs into crash?

On second thought, this is probably caused by the new version of the proprietary NVIDIA driver that was released on the 15th. Here are the patch notes:

Fixed a regression that could cause console corruption when resuming from suspend.
Improved buffer write performance of the nvidia-drm DRM KMS driver by using write-combined DRM Dumb Buffers where available.
Fixed a bug that caused X to crash when applying changes to the RandR CscMatrix property while VT-switched away from X.
Fixed a bug that caused a crash when exiting nvidia-settings on displays with 8 or 15 bit color depths.
Added support for VDPAU Feature Set H to the NVIDIA VDPAU driver. GPUs with VDPAU Feature Set H are capable of hardware-accelerated decoding of 8192x8192 (8k) H.265/HEVC video streams.
Fixed a bug that caused the X server to sometimes skip displaying Vulkan frames when the Composite extension is enabled.
Added support for querying clock values on Pascal GPUs.
Removed the Base Mosaic configuration option from nvidia-settings on systems where the feature is not actually supported.
Fixed a bug that caused nvidia-smi to report an inaccurate version number.
Fixed a bug that could lead to a system crash if there was a peer-to-peer mapping still active during CUDA context teardown.
Fixed a bug that prevented nvidia-bug-report.sh from finding relevant messages in kernel log files.

Should I report this bug to Nvidia/maintainers?

Last edited by L1ghtmareI (2016-07-17 15:37:25)

Offline

#10 2016-07-17 15:51:22

L1ghtmareI
Member
Registered: 2014-08-29
Posts: 96

Re: [nvidia/vulkan bug?] Plasma 5.7 hangs into crash?

del

Last edited by L1ghtmareI (2016-07-17 16:20:09)

Offline

#11 2016-07-17 16:08:35

headkase
Member
Registered: 2011-12-06
Posts: 1,976

Re: [nvidia/vulkan bug?] Plasma 5.7 hangs into crash?

REISUB ends in a reboot with "B."  If you just want to kill the X server then use "K" and none of the other REISUB steps.

Offline

#12 2016-07-17 16:19:59

L1ghtmareI
Member
Registered: 2014-08-29
Posts: 96

Re: [nvidia/vulkan bug?] Plasma 5.7 hangs into crash?

My bad, I misunderstood the first note.

Offline

#13 2016-07-18 10:19:08

L1ghtmareI
Member
Registered: 2014-08-29
Posts: 96

Re: [nvidia/vulkan bug?] Plasma 5.7 hangs into crash?

This is all but confirmed to be an NVIDIA Vulkan issue, since the last patch introduced a Vulkan fix. Watching Dota replays at 2x speed with Vulkan is a guaranteed crash; with OpenGL it works normally.

Offline
