Can anyone else confirm booting with the 'maxcpus=1' flag stops the panics for you too?
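For anyone testing this, it may help to first confirm that the flag actually took effect by looking at the command line the running kernel was booted with. A minimal Python sketch, assuming the parameters are readable from /proc/cmdline; the function name is made up for illustration:

```python
def maxcpus_from_cmdline(cmdline):
    """Return the integer value of maxcpus= on a kernel command line, or None.

    `cmdline` is the text of /proc/cmdline (hypothetical helper, not part
    of any package discussed in this thread).
    """
    value = None
    for token in cmdline.split():
        if token.startswith("maxcpus="):
            try:
                value = int(token.split("=", 1)[1])
            except ValueError:
                # Malformed value: treat it as if the flag were absent.
                value = None
    return value
```

Calling it on the contents of /proc/cmdline (for example `maxcpus_from_cmdline(open("/proc/cmdline").read())`) should give 1 when the flag was applied and None when it was not passed at all.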
Last edited by graysky (2015-05-09 19:50:21)
CPU-optimized Linux-ck packages @ Repo-ck • AUR packages • Zsh and other configs
Offline
That could explain why disabling NUMA might solve some kernel panics, since NUMA lets multiple CPUs (not cores) access their local memory controller with priority. The NUMA description in the kernel is a bit confusing, but there's an explanation in an SO question. Interesting finding indeed!
Offline
Rolled back to 4.0.2-1 and booting with the maxcpus=1 flag; so far no panic. Only three boots in, but will continue to test.
edit) A few boots on, I did get a panic at shutdown. Forced a power off and have rebooted a couple of times since without issue.
edit 2) Several boots later, another panic at shutdown. Again, it has been OK since.
Last edited by paneless (2015-05-10 10:27:17)
Offline
@inglor: Actually the kernel panics stop when NUMA is enabled.
However, for the record, I re-read the Kdump documentation and found that the maxcpus=1 parameter is needed only by the crash dump kernel, so I was able to collect several dumps for 4.0.2-1-ck. I analyzed them with crash and they all look like the following:
crash 7.1.0
Copyright (C) 2002-2014 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011 NEC Corporation
Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.
GNU gdb (GDB) 7.6
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"...
KERNEL: vmlinux
DUMPFILE: vmcore-2.dump
CPUS: 4
DATE: Thu Jan 1 01:00:00 1970
UPTIME: 00:06:23
LOAD AVERAGE: 0.00, 0.00, 0.00
TASKS: 187
NODENAME: 530U3C
RELEASE: 4.0.2-1-ck
VERSION: #1 SMP PREEMPT Sun May 10 09:16:20 CEST 2015
MACHINE: x86_64 (1696 Mhz)
MEMORY: 5.8 GB
PANIC: "general protection fault: 0000 [#1] PREEMPT SMP "
PID: 16
COMMAND: "ksoftirqd/2"
TASK: ffff880198268000 [THREAD_INFO: ffff880198264000]
CPU: 2
STATE: TASK_RUNNING (PANIC)
crash> bt
PID: 16 TASK: ffff880198268000 CPU: 2 COMMAND: "ksoftirqd/2"
#0 [ffff880198267a40] machine_kexec at ffffffff81055e3b
#1 [ffff880198267ab0] crash_kexec at ffffffff810e2cc2
#2 [ffff880198267b80] oops_end at ffffffff810197a8
#3 [ffff880198267bb0] die at ffffffff81019c6b
#4 [ffff880198267be0] do_general_protection at ffffffff8101615a
#5 [ffff880198267c10] general_protection at ffffffff815536c8
[exception RIP: skb_dequeue+75]
RIP: ffffffff8143e2cb RSP: ffff880198267cc8 RFLAGS: 00010097
RAX: 0000000000000292 RBX: ffff88008ccc2090 RCX: 8745392e629fa6bf
RDX: cf21a9d2430f9b9e RSI: 000000000000000a RDI: ffff88018ccc20a4
RBP: ffff880198267ce8 R8: 0000000000000292 R9: ffffffff811b0600
R10: ffff88019f296bc0 R11: ffffea0006551c00 R12: ffff88018ccc2090
R13: ffff88018ccc20a4 R14: ffff8800cdd39600 R15: ffff88019f294340
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0000
#6 [ffff880198267cf0] skb_queue_purge at ffffffff8143e328
#7 [ffff880198267d10] netlink_sock_destruct at ffffffff814809f2
#8 [ffff880198267d50] __sk_free at ffffffff81439d4d
#9 [ffff880198267d70] sk_free at ffffffff81439e79
#10 [ffff880198267d80] deferred_put_nlk_sk at ffffffff8147e580
#11 [ffff880198267d90] rcu_process_callbacks at ffffffff810bbecc
#12 [ffff880198267e00] __do_softirq at ffffffff81077861
#13 [ffff880198267e70] run_ksoftirqd at ffffffff81077a79
#14 [ffff880198267e80] smpboot_thread_fn at ffffffff81094e4c
#15 [ffff880198267ec0] kthread at ffffffff81091228
#16 [ffff880198267f50] ret_from_fork at ffffffff815516d8
crash>
So it seems that on my system something goes wrong in the ksoftirqd thread that runs on core 2 (ksoftirqd/2). That would be why maxcpus=1 stops the panics for me: it forces the kernel to use only core 0.
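Since several dumps were collected, comparing them by hand gets tedious. Here is a rough Python sketch that pulls the CPU, the command, the faulting symbol and the call chain out of a saved `bt` listing like the one above. The regular expressions are only assumptions fitted to the output shown in this post, and the function name is hypothetical:

```python
import re

# Patterns fitted to the crash "bt" output pasted above (an assumption,
# other crash versions or architectures may format these lines differently).
HEADER_RE = re.compile(
    r'PID:\s*(\d+)\s+TASK:\s*\S+\s+CPU:\s*(\d+)\s+COMMAND:\s*"([^"]+)"')
RIP_RE = re.compile(r"\[exception RIP:\s*([\w.]+)\+(\d+)\]")
FRAME_RE = re.compile(r"^\s*#\d+\s+\[[0-9a-f]+\]\s+(\S+)\s+at\s+[0-9a-f]+")


def summarize_bt(text):
    """Summarize one backtrace: pid, cpu, command, fault, caller, frames."""
    summary = {"pid": None, "cpu": None, "command": None,
               "fault": None, "caller": None, "frames": []}
    for line in text.splitlines():
        header = HEADER_RE.search(line)
        if header:
            summary["pid"] = int(header.group(1))
            summary["cpu"] = int(header.group(2))
            summary["command"] = header.group(3)
            continue
        rip = RIP_RE.search(line)
        if rip:
            # (symbol, offset) of the instruction that faulted.
            summary["fault"] = (rip.group(1), int(rip.group(2)))
            continue
        frame = FRAME_RE.match(line)
        if frame:
            func = frame.group(1)
            summary["frames"].append(func)
            # The first frame listed after the exception is its caller.
            if summary["fault"] and summary["caller"] is None:
                summary["caller"] = func
    return summary
```

Running this over each saved listing and comparing the "cpu" and "fault" fields would show quickly whether every dump really dies in skb_dequeue on the same core.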
Offline
Hey graysky, after the last update (3.19.7 -> 3.19.8) the nvidia module won't load:
kernel: nvidia: disagrees about version of symbol module_layout
Offline
Thanks for letting me know. I can only test broadcom + vbox + nvidia-304xx, which all worked. I bumped them all and am rebuilding now. Check back in 15 min or so (i.e. pacman -Syyu).
Offline
Ya, had that happen to me not too long ago. I forgot to download the new kernel header and was rebuilding nvidia on the old header. I felt pretty silly about it.
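One cheap way to catch a module built against the wrong headers is to compare the release at the start of its vermagic string with the running kernel release. This is only a sketch with made-up function names; it assumes you paste in the output of `modinfo -F vermagic nvidia` and `uname -r` yourself, and it will not catch every cause of a module_layout mismatch:

```python
def vermagic_release(vermagic):
    """Return the kernel release a module was built for.

    The vermagic string starts with the release, followed by build flags.
    """
    parts = vermagic.split()
    return parts[0] if parts else ""


def module_matches_kernel(vermagic, running_release):
    """True if the module's build release equals the running kernel release."""
    return vermagic_release(vermagic) == running_release.strip()
```

If this returns False for the nvidia module, the module was most likely built before the matching headers package was installed and needs a rebuild.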
Offline
New packages online now.
Offline
For those of you running 4.0.2-2-ck: have you experienced a panic at all? I released it on May 8th, so if you've been panic-free since running it, I'd like to know. CK recommends just keeping NUMA enabled while he digs into the code when he has time. I would like to push 4.0.2-2-ck into the repo with 3.19.8-1-ck in the repo archive if the 5 or so of you have not experienced any issues. Please report back.
Offline
No panic at all, although I haven't powered off many times (which was when I experienced my panics). It's been live for just shy of 3 days 8 hours without a sniff of an issue.
@archun: Intel® Core™ i5-4210M • [GPU] Intel® HD Graphics 4600 • [Kernel] linux-ck-haswell
Handmade.Network • GitLab
The Life and Times of Miblo del Carpio
Offline
Been running 4.0.2-2 for 24 hrs with BFQ enabled without issue. Between 10 and 20 reboots.
Offline
Been running 4.0.2-2-ck with no issues over the last 2 days. Some reboots but not many. Survived a suspend also! BFQ enabled, nouveau drivers.
Offline
Only one panic since the 8th, after about 5-6 reboots a day.
Edit: I might be able to blame it on a BIOS setting I was looking at.
Last edited by Buddlespit (2015-05-11 21:32:32)
Offline
4.0.2-2 has been just fine for me as well, BFQ enabled and nvidia drivers.
Offline
4.0.2-2 runs well on my two machines.
Offline
No problem at all with 4.0.2-2 until now (BFQ enabled).
Are you saying that when you enabled BFQ you experienced a problem?
Offline
No, forgive my bad English. I'm just saying I have BFQ enabled and 4.0.2-2 works well. No kernel panic occurred with 4.0.2-2 since I installed it.
Last edited by mauritiusdadd (2015-05-12 08:06:55)
Offline
Anyone experienced the panic with AMD processors?
Offline
Me. But again, it was only once and it might have been a BIOS setting I was trying. I haven't tried to reproduce it (because I don't remember what I tried).
Offline
OK. CK emailed me a new test patch that I built into 4.0.2-3-ck. For those of you experiencing the panics, please try it. What's different from 4.0.2-2-ck?
1) NUMA is disabled,
2) The new patch is active.
Option 1 (roll your own):
http://repo-ck.com/PKG_source/next/testing/linux-ck-4.0.2-3.src.tar.gz
Option 2 (my builds):
pacman -U http://repo-ck.com/PKG_source/next/testing/linux-ck-4.0.2-3-x86_64.pkg.tar.xz
pacman -U http://repo-ck.com/PKG_source/next/testing/linux-ck-headers-4.0.2-3-x86_64.pkg.tar.xz
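To double-check that whatever you end up booting really has NUMA disabled (item 1 above), you can look up CONFIG_NUMA in the kernel config. A small Python sketch follows; it assumes the kernel exposes its config at /proc/config.gz, which is only there when the kernel was built with that feature, and the helper names are hypothetical:

```python
import gzip


def config_option(config_text, name):
    """Return the value of a kernel config option, or None if it is not set.

    `config_text` is the plain-text kernel config, `name` is for example
    "CONFIG_NUMA". Returns "y", "m" or another value string when set.
    """
    for line in config_text.splitlines():
        line = line.strip()
        if line == "# %s is not set" % name:
            return None
        if line.startswith(name + "="):
            return line.split("=", 1)[1]
    return None


def read_running_config(path="/proc/config.gz"):
    """Read the running kernel's config (assumes the file exists)."""
    with gzip.open(path, "rt") as f:
        return f.read()
```

On the test kernel, `config_option(read_running_config(), "CONFIG_NUMA")` should come back as None, while on 4.0.2-2-ck it would be expected to return "y".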
Last edited by graysky (2015-05-12 18:54:02)
Offline
[..]For those of you experiencing the panics, please try it.[..]
Do you want only those who get kernel panics with 4.0.2-2-ck to try it? Because we already established that there are not many of them (and you already released it to the AUR).
Offline
I don't think anyone has experienced a panic under 4.0.2-2-ck that can be directly attributed to that kernel...
Offline
Just to add another data point, 4.0.2-3 is broken for me too.
Last edited by Wibjarm (2015-05-12 20:45:51)
Offline