
#1 2018-07-24 19:39:26

Maniaxx
Member
Registered: 2014-05-14
Posts: 761

What debug option do we have after a system freeze?

From time to time my system freezes. Everything stops immediately and the keyboard LEDs start blinking, sometimes three of them, sometimes just two. I'm not sure if that's a last-resort message. No tty switching is possible, but the system reboots by itself after some time.

After the reboot I checked 'journalctl -b -1 -r' but found nothing helpful there. What can I do to debug this? Can I enable '/proc/last_kmsg' or something like it on a vanilla kernel? I want to narrow down the device/driver that was in use right before the crash.
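One way to cut the previous boot's journal down to kernel messages only is sketched below. It assumes systemd's journalctl and a persistent journal; the helper name `previous_boot_kernel_log_cmd` and the default priority are illustrative choices, not something from this thread. `-k`, `-b`, `-p`, `-r` and `--no-pager` are standard journalctl options.

```python
import shlex


def previous_boot_kernel_log_cmd(boot_offset=-1, priority="warning"):
    """Build a journalctl command line for kernel messages of an earlier boot."""
    return [
        "journalctl",
        "-k",                    # kernel messages only
        "-b", str(boot_offset),  # -1 = the boot before the current one
        "-p", priority,          # this priority and anything more severe
        "-r",                    # newest entries first
        "--no-pager",
    ]


# Print the command so it can be pasted into a shell.
print(shlex.join(previous_boot_kernel_log_cmd()))
```

The list can also be handed to `subprocess.run` directly. If the freeze is hard enough that nothing reaches the disk, this will still come up empty.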

Intel P67, 2500k
GTX770 (nouveau)

Last edited by Maniaxx (2018-07-24 19:41:55)


sys2064


#2 2018-07-24 20:06:28

loqs
Member
Registered: 2014-03-06
Posts: 18,948

Re: What debug option do we have after a system freeze?

Anything in /sys/fs/pstore? The default kernel config will not reboot on oops or panic.
If the reboot is caused by a hardware watchdog, there may be no indication of the cause if the hard lockup was serious enough.
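A minimal sketch for checking whether the kernel left anything in pstore after a crash. The function name `list_pstore_records` is made up for illustration; the only thing taken from the post above is the /sys/fs/pstore path. Reading that directory usually needs root.

```python
from pathlib import Path


def list_pstore_records(pstore_dir="/sys/fs/pstore"):
    """Return (name, size_in_bytes) for each record found in the pstore directory."""
    root = Path(pstore_dir)
    if not root.is_dir():
        return []  # pstore not mounted or not supported on this machine
    return sorted((p.name, p.stat().st_size) for p in root.iterdir() if p.is_file())


if __name__ == "__main__":
    try:
        records = list_pstore_records()
    except PermissionError:
        print("permission denied, try again as root")
    else:
        if not records:
            print("no pstore records found")
        for name, size in records:
            print(f"{name}\t{size} bytes")
```

An empty result only means nothing was stored. It does not rule out a panic, since a pstore backend has to be available and the kernel has to get far enough to write to it.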


#3 2018-07-24 20:09:07

ewaller
Administrator
From: Pasadena, CA
Registered: 2009-07-13
Posts: 20,652

Re: What debug option do we have after a system freeze?

That would be a kernel panic
https://en.wikipedia.org/wiki/Kernel_panic

Start here: https://wiki.archlinux.org/index.php/Ge … nel_panics
Unfortunately, this often turns into a game of whack-a-mole as you try to isolate things.

You might start by ensuring that any microcode updates for your processor are installed and enabled. https://wiki.archlinux.org/index.php/Microcode
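A rough way to see which microcode revision the CPU is actually running, assuming an x86 system where /proc/cpuinfo carries a "microcode" field. The helper name `microcode_revisions` is hypothetical. This only shows the active revision; it does not tell you whether a newer one exists.

```python
def microcode_revisions(cpuinfo_text):
    """Collect the distinct 'microcode' values reported in /proc/cpuinfo text."""
    revisions = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "microcode":
            revisions.add(value.strip())
    return revisions


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as fh:
            print(sorted(microcode_revisions(fh.read())))
    except OSError as exc:
        print(f"could not read /proc/cpuinfo: {exc}")
```

All cores should normally report the same revision, so more than one value in the set would be worth a closer look.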


Nothing is too wonderful to be true, if it be consistent with the laws of nature -- Michael Faraday
The shortest way to ruin a country is to give power to demagogues.— Dionysius of Halicarnassus
---
How to Ask Questions the Smart Way


#4 2018-07-24 21:45:32

Maniaxx
Member
Registered: 2014-05-14
Posts: 761

Re: What debug option do we have after a system freeze?

loqs wrote:

Anything in /sys/fs/pstore ?

Unfortunately, no. That's the new place (or the first place for desktop systems) for 'last_kmsg', if I understand it correctly. Good to know.

Forwarding to the console is probably useless, since I cannot switch to anything when it crashes. A serial console would be an option, but the system freezes maybe once or twice a month, if at all. I'm not going to run a laptop all the time just for that.

ewaller wrote:

You might start by ensuring that any microcode updates for your processor are installed and enabled. https://wiki.archlinux.org/index.php/Microcode

Yes, properly injected in grub.cfg already.

Personally, I suspect nouveau. Crashes seem to happen mostly when working with video surfaces (seeking a lot in videos) and such. The last crash happened while I was working with VapourSynth/ffmpeg, although it crashed at the exact moment I started (pressed the button for) the ffmpeg encoding.

Last edited by Maniaxx (2018-07-24 21:47:59)


sys2064


#5 2018-07-25 07:02:58

seth
Member
From: Won't reply 2 private help req
Registered: 2012-09-03
Posts: 76,177

Re: What debug option do we have after a system freeze?

Nouveau would have to bleed into the core (otherwise the module can crash "fine" and you lose your graphics stack, but the system would remain responsive and you would not get the fancy LED ambience).

Tried the nvidia blob?
Tried the LTS kernel?


#6 2018-07-27 19:22:54

Maniaxx
Member
Registered: 2014-05-14
Posts: 761

Re: What debug option do we have after a system freeze?

I don't like the nvidia blob. No native tty resolution or enforced power-save mode.

I'm currently investigating the high I/O throughput of 'vapoursynth-editor'. When you move the window (qt5) it hammers the hdd to 100% for no obvious reason, as can be seen with iotop (Edit: it's ~/.config/vsedit.config.lock). Something similar happens with 'meld'. There it is dconf (~/.config/dconf), which can be moved to tmpfs to work around it.

I really hoped for an isolated dmesg buffer that could be investigated after a hard crash. The Android Linux kernel seems to be way ahead there.

Last edited by Maniaxx (2018-07-27 20:29:45)


sys2064


#7 2018-07-27 19:30:22

seth
Member
From: Won't reply 2 private help req
Registered: 2012-09-03
Posts: 76,177

Re: What debug option do we have after a system freeze?

I didn't ask you to hump the blob, but if the problem doesn't exist w/ it, that would support the nouveau assumption.

See https://wiki.archlinux.org/index.php/Kdump for kernel debugging.
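Kdump needs memory reserved for the capture kernel through the `crashkernel=` kernel parameter, so a quick check of the running command line tells you whether that part is in place. A minimal sketch; the function name `crashkernel_setting` is made up for illustration.

```python
def crashkernel_setting(cmdline):
    """Return the value of crashkernel= from a kernel command line, or None."""
    for token in cmdline.split():
        if token.startswith("crashkernel="):
            return token.split("=", 1)[1]
    return None


if __name__ == "__main__":
    try:
        with open("/proc/cmdline") as fh:
            value = crashkernel_setting(fh.read())
    except OSError as exc:
        print(f"could not read /proc/cmdline: {exc}")
    else:
        print(f"crashkernel={value}" if value else "no crashkernel= reservation on this boot")
```

This only covers the boot parameter. The kernel also has to be built with kexec/crash dump support and the capture kernel has to be loaded, which the wiki page walks through.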


#8 2018-07-27 20:38:15

Maniaxx
Member
Registered: 2014-05-14
Posts: 761

Re: What debug option do we have after a system freeze?

These are all guesses on my side anyway, though nouveau 'modesetting' does indeed seem not to be in the best shape atm (for example https://bugs.freedesktop.org/show_bug.cgi?id=106908, plus minor gfx glitches).

Kdump sounds much better, but for now I think I will just keep watching it until the next major kernel update.

Last edited by Maniaxx (2018-07-27 20:39:30)


sys2064


#9 2018-07-27 20:56:44

loqs
Member
Registered: 2014-03-06
Posts: 18,948

Re: What debug option do we have after a system freeze?

Try 4.18-rc7 when it is released this weekend?


#10 2018-07-29 20:47:57

R00KIE
Forum Fellow
From: Between a computer and a chair
Registered: 2008-09-14
Posts: 4,734

Re: What debug option do we have after a system freeze?

I would start by making sure that no out-of-tree modules are loaded; that eliminates a few variables. Also try to correlate the crashes with something you do. Having a way to reproduce the problem will speed up narrowing it down and testing fixes.
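The kernel records out-of-tree and proprietary modules in its taint mask, so decoding /proc/sys/kernel/tainted is a quick way to check. A sketch covering only a subset of the documented taint bits; the names `TAINT_FLAGS` and `decode_taint` are illustrative.

```python
# Subset of the taint bits described in the kernel's "Tainted kernels" guide.
TAINT_FLAGS = {
    0: "P: proprietary module loaded",
    1: "F: module force-loaded",
    7: "D: kernel died recently (oops or BUG)",
    9: "W: kernel issued a warning",
    12: "O: out-of-tree module loaded",
    13: "E: unsigned module loaded",
}


def decode_taint(value):
    """Return descriptions for the known taint bits set in the given mask."""
    return [text for bit, text in sorted(TAINT_FLAGS.items()) if value & (1 << bit)]


if __name__ == "__main__":
    try:
        with open("/proc/sys/kernel/tainted") as fh:
            mask = int(fh.read().strip())
    except (OSError, ValueError) as exc:
        print(f"could not read the taint mask: {exc}")
    else:
        print(decode_taint(mask) or "no known taint bits set")
```

A mask of 0 means the kernel is not tainted at all. Bits outside the table above are ignored by this sketch, so a non-zero mask can still print nothing from the list.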


R00KIE
Tm90aGluZyB0byBzZWUgaGVyZSwgbW92ZSBhbG9uZy4K


#11 2018-07-29 20:53:45

kokoko3k
Member
Registered: 2008-11-14
Posts: 2,464

Re: What debug option do we have after a system freeze?


Help me to improve ssh-rdp !
Retroarch User? Try my koko-aio shader !


#12 2018-08-14 23:43:50

Maniaxx
Member
Registered: 2014-05-14
Posts: 761

Re: What debug option do we have after a system freeze?

It crashed again, or at least gfx and keyboard did. I had sshd set up and the system was still partly running. As I suspected, it's nouveau. The system then hung at shutdown. Not sure if it is related to this bug. Kernel 4.18 is near.

[31087.907946] INFO: task kworker/u8:2:8062 blocked for more than 120 seconds.
[31087.907952]       Not tainted 4.17.14-arch1-1-ARCH #1
[31087.907954] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[31087.907957] kworker/u8:2    D    0  8062      2 0x80000000
[31087.908018] Workqueue: events_unbound nv50_disp_atomic_commit_work [nouveau]
[31087.908020] Call Trace:
[31087.908030]  ? __schedule+0x282/0x890
[31087.908035]  schedule+0x32/0x90
[31087.908038]  schedule_timeout+0x311/0x4a0
[31087.908047]  ? ttm_bo_validate+0x37/0x130 [ttm]
[31087.908092]  ? nouveau_fence_is_signaled+0x39/0x40 [nouveau]
[31087.908097]  dma_fence_default_wait+0x1e8/0x270
[31087.908100]  ? dma_fence_default_wait+0x270/0x270
[31087.908104]  dma_fence_wait_timeout+0x39/0x110
[31087.908115]  drm_atomic_helper_wait_for_fences+0x38/0xc0 [drm_kms_helper]
[31087.908160]  nv50_disp_atomic_commit_tail+0x5b/0x1ef0 [nouveau]
[31087.908164]  ? _raw_spin_unlock_irq+0x1d/0x30
[31087.908168]  process_one_work+0x1d1/0x3b0
[31087.908172]  worker_thread+0x2b/0x3d0
[31087.908174]  ? process_one_work+0x3b0/0x3b0
[31087.908177]  kthread+0x112/0x130
[31087.908181]  ? kthread_flush_work_fn+0x10/0x10
[31087.908184]  ret_from_fork+0x35/0x40
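To see at a glance which modules show up in a trace like the one above, the frames can be counted per module. This is a sketch that assumes the usual `symbol+0xOFF/0xLEN [module]` frame layout; the names `FRAME` and `modules_in_trace` are made up for illustration. Frames prefixed with `?` are only guesses by the unwinder, so they are skipped unless asked for.

```python
import re
from collections import Counter

FRAME = re.compile(
    r"(?P<guess>\?\s+)?"                            # '?' marks an unreliable frame
    r"(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"  # symbol+offset/length
    r"(?:\s+\[(?P<module>\w+)\])?\s*$"              # module name, absent for built-in code
)


def modules_in_trace(trace_text, include_guesses=False):
    """Count call-trace frames per kernel module."""
    counts = Counter()
    for line in trace_text.splitlines():
        match = FRAME.search(line)
        if not match or not match["module"]:
            continue  # not a frame, or a frame in built-in kernel code
        if match["guess"] and not include_guesses:
            continue
        counts[match["module"]] += 1
    return counts
```

Feed it the text from `dmesg` or `journalctl -k`. The count only shows which modules sit on the blocked path, which is a hint about where to look rather than proof of where the bug is.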

sys2064


#13 2018-08-15 00:06:40

loqs
Member
Registered: 2014-03-06
Posts: 18,948

Re: What debug option do we have after a system freeze?

4.18 is in testing; 4.18.1 should be in testing later this week.

