
#1 2014-06-25 15:39:55

jackwild
Member
Registered: 2014-01-15
Posts: 30

[SOLVED?]Soft(?) lockup, how to debug?

I've been having this issue since 3.14. Seemingly at random I get a total freeze: the caps lock and num lock LEDs don't respond and I can't switch VTs, but I can still reboot with the magic SysRq combinations. The logs show nothing, probably because they hadn't been flushed to disk yet (the default flush interval is 5 minutes). I've been trying to narrow down the cause, but I'm at a bit of a loss.

I've just downgraded to 3.13.7-1, but I don't know yet whether it will help. I'm not even sure it's a kernel issue.

I've run memtest86+ without errors and disabled precache, preload and zswap, since those are things I had enabled recently, but the problem persists.

I'm going to lower the log flush interval to 5 seconds. I haven't tried to SSH in from another box yet, but I will the next time it happens.

What else can I do to find the cause of this?

I haven't included any extra information as I don't have anything which seems relevant. Ask and ye shall receive though.


Edit: I think it's solved. See https://bbs.archlinux.org/viewtopic.php?pid=1431047

Last edited by jackwild (2014-06-28 18:33:17)


#2 2014-06-25 15:49:13

ewaller
Administrator
From: Pasadena, CA
Registered: 2009-07-13
Posts: 19,804

Re: [SOLVED?]Soft(?) lockup, how to debug?

Other things to try: can you ping the box over the network while it is hung? Can you ssh into it while it is hung?



#3 2014-06-26 19:05:29

jackwild
Member
Registered: 2014-01-15
Posts: 30

Re: [SOLVED?]Soft(?) lockup, how to debug?

I can't find a way to reproduce this reliably, so I'm left waiting for it to happen. I can stress test the CPU for over two hours without any issue, so I think overheating is basically ruled out. It just happened again on 3.15.1-1-ARCH.

I can ping the box over the network, but ssh hangs before authentication. Output of ssh -vvv:

OpenSSH_6.6.1, OpenSSL 1.0.1h 5 Jun 2014
debug1: Reading configuration data /etc/ssh/ssh_config
debug2: ssh_connect: needpriv 0
debug1: Connecting to 192.168.1.2 [192.168.1.2] port 22.
debug1: Connection established.
debug1: identity file /home/arch/.ssh/id_rsa type -1
debug1: identity file /home/arch/.ssh/id_rsa-cert type -1
debug1: identity file /home/arch/.ssh/id_dsa type -1
debug1: identity file /home/arch/.ssh/id_dsa-cert type -1
debug1: identity file /home/arch/.ssh/id_ecdsa type -1
debug1: identity file /home/arch/.ssh/id_ecdsa-cert type -1
debug1: identity file /home/arch/.ssh/id_ed25519 type -1
debug1: identity file /home/arch/.ssh/id_ed25519-cert type -1
debug1: Enabling compatibility mode for protocol 2.0
debug1: Local version string SSH-2.0-OpenSSH_6.6.1

Again, the last log entry is from about 10 minutes before the freeze, so it isn't a flushing issue. I'm really stuck here. I'll have to try different kernel versions and hope for the best. Any ideas are welcome.
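
For what it's worth, the ssh -vvv output stops right after the client's own version string, so the TCP connection was established but the server's "SSH-2.0-..." banner never arrived. That usually suggests the kernel is still answering on the network while sshd itself never gets to run. Here is a rough Python sketch that makes the same distinction from another box without a full ssh client. The function name probe_ssh, the result labels and the 5 second timeout are all made up for illustration, and it assumes the server sends its banner first, as OpenSSH does.

```python
import socket

def probe_ssh(host, port=22, timeout=5.0):
    """Report how far a plain TCP connection to an SSH port gets.

    Returns (status, banner), where status is one of:
      "no-connect": the TCP connection could not be established
      "no-banner":  connected, but no SSH banner arrived within the timeout
      "banner":     the server sent its identification string
    The labels are hypothetical, chosen only for this sketch.
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return "no-connect", None

    with sock:
        sock.settimeout(timeout)
        try:
            # Assumes the server speaks first, as OpenSSH does.
            data = sock.recv(256)
        except socket.timeout:
            return "no-banner", None
        except OSError:
            return "no-connect", None

    if data.startswith(b"SSH-"):
        first_line = data.split(b"\n", 1)[0].strip()
        return "banner", first_line.decode("ascii", "replace")
    return "no-banner", None


if __name__ == "__main__":
    # 192.168.1.2 is the address from the ssh output above.
    print(probe_ssh("192.168.1.2"))
```

If this reports "no-banner" during a freeze while ping still works, that would match what ssh -vvv shows: the connection is accepted at the TCP level but nothing in userspace answers it.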
