
Can anyone else confirm booting with the 'maxcpus=1' flag stops the panics for you too?
Last edited by graysky (2015-05-09 19:50:21)
Offline

That could explain why disabling NUMA maybe solves some kernel panics, since NUMA lets multiple CPUs (not cores) access their local memory controller with priority. The NUMA description in the kernel is a bit confusing, but there's an explanation in an SO question. Interesting finding indeed!
Offline
Rolled back to 4.0.2-1 and booting with the maxcpus=1 flag: no panic so far. Only three boots in, but will continue to test.
edit) A few boots on, I did receive a panic at shutdown. Forced a power off and have rebooted a couple of times since without issue.
edit 2) Several boots later, another panic at shutdown. Again, it has been OK since.
Last edited by paneless (2015-05-10 10:27:17)
Offline
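
For anyone repeating this test, it can help to confirm that the flag actually reached the running kernel before counting a boot as a data point. Below is a minimal sketch, assuming the boot parameters are readable from /proc/cmdline; the function names are made up for illustration and are not part of any tool mentioned in this thread.

```python
from pathlib import Path
from typing import Optional


def parse_maxcpus(cmdline: str) -> Optional[int]:
    """Return the value of maxcpus= from a kernel command line, or None if absent."""
    value = None
    for token in cmdline.split():
        if token.startswith("maxcpus="):
            try:
                # If the parameter appears more than once, this sketch keeps
                # the last occurrence (an arbitrary choice made here).
                value = int(token.split("=", 1)[1])
            except ValueError:
                value = None  # malformed value; treat as not set
    return value


def running_kernel_maxcpus() -> Optional[int]:
    """Read the command line of the running kernel and extract maxcpus=."""
    return parse_maxcpus(Path("/proc/cmdline").read_text())
```

Calling running_kernel_maxcpus() after boot should return 1 if the flag was picked up, and None if the bootloader entry was not actually changed.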

@inglor: Actually the kernel panics stop when NUMA is enabled.
However, for the record, I re-read the Kdump documentation and found that the maxcpus=1 parameter is needed only by the crash dump kernel, so I was able to collect several dumps for 4.0.2-1-ck. I analyzed them with crash and they all look like the following:
crash 7.1.0
Copyright (C) 2002-2014  Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010  IBM Corporation
Copyright (C) 1999-2006  Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012  Fujitsu Limited
Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011  NEC Corporation
Copyright (C) 1999, 2002, 2007  Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions.  Enter "help copying" to see the conditions.
This program has absolutely no warranty.  Enter "help warranty" for details.
 
GNU gdb (GDB) 7.6
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"...
      KERNEL: vmlinux                           
    DUMPFILE: vmcore-2.dump
        CPUS: 4
        DATE: Thu Jan  1 01:00:00 1970
      UPTIME: 00:06:23
LOAD AVERAGE: 0.00, 0.00, 0.00
       TASKS: 187
    NODENAME: 530U3C
     RELEASE: 4.0.2-1-ck
     VERSION: #1 SMP PREEMPT Sun May 10 09:16:20 CEST 2015
     MACHINE: x86_64  (1696 Mhz)
      MEMORY: 5.8 GB
       PANIC: "general protection fault: 0000 [#1] PREEMPT SMP "
         PID: 16
     COMMAND: "ksoftirqd/2"
        TASK: ffff880198268000  [THREAD_INFO: ffff880198264000]
         CPU: 2
       STATE: TASK_RUNNING (PANIC)
crash> bt
PID: 16     TASK: ffff880198268000  CPU: 2   COMMAND: "ksoftirqd/2"
 #0 [ffff880198267a40] machine_kexec at ffffffff81055e3b
 #1 [ffff880198267ab0] crash_kexec at ffffffff810e2cc2
 #2 [ffff880198267b80] oops_end at ffffffff810197a8
 #3 [ffff880198267bb0] die at ffffffff81019c6b
 #4 [ffff880198267be0] do_general_protection at ffffffff8101615a
 #5 [ffff880198267c10] general_protection at ffffffff815536c8
    [exception RIP: skb_dequeue+75]
    RIP: ffffffff8143e2cb  RSP: ffff880198267cc8  RFLAGS: 00010097
    RAX: 0000000000000292  RBX: ffff88008ccc2090  RCX: 8745392e629fa6bf
    RDX: cf21a9d2430f9b9e  RSI: 000000000000000a  RDI: ffff88018ccc20a4
    RBP: ffff880198267ce8   R8: 0000000000000292   R9: ffffffff811b0600
    R10: ffff88019f296bc0  R11: ffffea0006551c00  R12: ffff88018ccc2090
    R13: ffff88018ccc20a4  R14: ffff8800cdd39600  R15: ffff88019f294340
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0000
 #6 [ffff880198267cf0] skb_queue_purge at ffffffff8143e328
 #7 [ffff880198267d10] netlink_sock_destruct at ffffffff814809f2
 #8 [ffff880198267d50] __sk_free at ffffffff81439d4d
 #9 [ffff880198267d70] sk_free at ffffffff81439e79
#10 [ffff880198267d80] deferred_put_nlk_sk at ffffffff8147e580
#11 [ffff880198267d90] rcu_process_callbacks at ffffffff810bbecc
#12 [ffff880198267e00] __do_softirq at ffffffff81077861
#13 [ffff880198267e70] run_ksoftirqd at ffffffff81077a79
#14 [ffff880198267e80] smpboot_thread_fn at ffffffff81094e4c
#15 [ffff880198267ec0] kthread at ffffffff81091228
#16 [ffff880198267f50] ret_from_fork at ffffffff815516d8
crash>
So it seems that on my system something goes wrong in the ksoftirqd thread that runs on core 2, and that is why maxcpus=1 stops the panics for me: it forces the kernel to use only core 0.
Offline
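
When comparing several dumps like the one above, it may be easier to reduce each backtrace to the faulting function plus the chain of callers below the exception line. The following is only a sketch that assumes the `bt` text layout shown in this post; the regular expressions and the function name are illustrative and have not been checked against other crash versions.

```python
import re
from typing import List, Optional, Tuple

# A frame line looks like: " #6 [ffff880198267cf0] skb_queue_purge at ffffffff8143e328"
FRAME_RE = re.compile(r"^\s*#(\d+)\s+\[[0-9a-f]+\]\s+(\S+)\s+at\s+[0-9a-f]+")
# The exception line looks like: "    [exception RIP: skb_dequeue+75]"
EXCEPTION_RE = re.compile(r"\[exception RIP:\s*([^\]\s]+)\]")

# Excerpt copied from the backtrace posted above.
SAMPLE_BT = """\
 #4 [ffff880198267be0] do_general_protection at ffffffff8101615a
 #5 [ffff880198267c10] general_protection at ffffffff815536c8
    [exception RIP: skb_dequeue+75]
    RIP: ffffffff8143e2cb  RSP: ffff880198267cc8  RFLAGS: 00010097
 #6 [ffff880198267cf0] skb_queue_purge at ffffffff8143e328
 #7 [ffff880198267d10] netlink_sock_destruct at ffffffff814809f2
"""


def summarize_backtrace(bt_text: str) -> Tuple[Optional[str], List[str]]:
    """Return (faulting symbol+offset, callers listed after the exception line)."""
    faulting = None
    callers: List[str] = []
    seen_exception = False
    for line in bt_text.splitlines():
        exc = EXCEPTION_RE.search(line)
        if exc:
            faulting = exc.group(1)
            seen_exception = True
            continue
        frame = FRAME_RE.match(line)
        # Frames printed before the exception line belong to the panic
        # handling path itself, so only frames after it are collected.
        if frame and seen_exception:
            callers.append(frame.group(2))
    return faulting, callers
```

Running summarize_backtrace over each dump makes it quick to see whether they all fault in skb_dequeue by way of netlink_sock_destruct, as this one does, or whether only the CPU number is the common factor.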

Hey graysky, after the last update (3.19.7 -> 3.19.8) the nvidia module won't load:
kernel: nvidia: disagrees about version of symbol module_layout
Offline

Thanks for letting me know. I can only test broadcom + vbox + nvidia-304xx, which all worked. I bumped all of them and am rebuilding now. Check back in 15 min or so (i.e. pacman -Syyu).
Offline

Hey graysky, after the last update (3.19.7 -> 3.19.8) the nvidia module won't load:
kernel: nvidia: disagrees about version of symbol module_layout
Ya, had that happen to me not too long ago. I forgot to download the new kernel header and was rebuilding nvidia on the old header. I felt pretty silly about it.
Offline
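
A quick way to catch the stale-headers mistake described above is to compare the kernel release a module was built for with the release that is currently running. The sketch below assumes `modinfo -F vermagic` is available and that the first field of the vermagic string is the kernel release; the helper names are invented for this example. Note that the module_layout message itself comes from the symbol version check, which this does not inspect, so a match here is only a first sanity check.

```python
import platform
import subprocess
from typing import Optional


def release_from_vermagic(vermagic: str) -> Optional[str]:
    """Return the first whitespace-separated field of a vermagic string."""
    fields = vermagic.split()
    return fields[0] if fields else None


def built_for_running_kernel(vermagic: str, running_release: str) -> bool:
    """True if the module's vermagic names the given kernel release."""
    return release_from_vermagic(vermagic) == running_release


def module_vermagic(module: str) -> str:
    """Ask modinfo for the vermagic field of an installed module."""
    result = subprocess.run(
        ["modinfo", "-F", "vermagic", module],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def check_module(module: str = "nvidia") -> bool:
    """Compare an installed module against the running kernel (uname -r)."""
    return built_for_running_kernel(module_vermagic(module), platform.release())
```

If check_module("nvidia") returns False after a kernel update, the module was most likely built against the previous headers and needs a rebuild.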

New packages online now.
Offline

For those of you running 4.0.2-2-ck ... have you experienced a panic at all? I released it on May 8th, so if you've been panic-free since running it, I'd like to know. CK recommends just keeping NUMA enabled while he digs into the code when he has time. I would like to push 4.0.2-2-ck into the repo, with 3.19.8-1-ck in the repo archive, if 5 or so of you have not experienced any issues. Please report back.
Offline

No panic at all, although I haven't powered off many times (which was when I experienced my panics). It's been live for just shy of 3 days 8 hours without a sniff of an issue.
@archun: Intel® Core™ i5-4210M  • [GPU] Intel® HD Graphics 4600 • [Kernel] linux-ck-haswell
Offline
Been running 4.0.2-2 for 24hrs with bfq enabled without issue. Between 10-20 reboots.
Offline

Been running 4.0.2-2-ck with no issues over the last 2 days. Some reboots but not many. Survived a suspend also! BFQ enabled, nouveau drivers.
Offline

Only one panic since the 8th, after about 5-6 reboots a day.
Edit: I might be able to blame it on a bios setting I was looking at.
Last edited by Buddlespit (2015-05-11 21:32:32)
Offline
4.0.2-2 has been just fine for me as well, BFQ enabled and nvidia drivers.
Offline
4.0.2-2 goes well with my two machines.
Offline

No problem at all with 4.0.2-2 until now (BFQ enabled).
Are you saying that when you enabled BFQ you experienced a problem?
Offline

mauritiusdadd wrote: No problem at all with 4.0.2-2 until now (BFQ enabled).
Are you saying that when you enabled BFQ you experienced a problem?
No, forgive my bad English. I'm just saying I have BFQ enabled and 4.0.2-2 works well. No kernel panic occurred with 4.0.2-2 since I installed it.
Last edited by mauritiusdadd (2015-05-12 08:06:55)
Offline
Anyone experienced the panic with AMD processors?
Offline

Anyone experienced the panic with AMD processors?
Me. But again, it was once and it might have been a bios setting I was trying. I haven't tried to duplicate it (because I don't remember what it was I tried).
Offline

OK. CK emailed me a new test patch that I built into 4.0.2-3-ck. For those of you experiencing the panics, please try it. What's different from 4.0.2-2-ck?
1) NUMA is disabled,
2) The new patch is active.
Option 1 (roll your own):
http://repo-ck.com/PKG_source/next/testing/linux-ck-4.0.2-3.src.tar.gz
Option 2 (my builds):
pacman -U http://repo-ck.com/PKG_source/next/testing/linux-ck-4.0.2-3-x86_64.pkg.tar.xz
pacman -U http://repo-ck.com/PKG_source/next/testing/linux-ck-headers-4.0.2-3-x86_64.pkg.tar.xz
Last edited by graysky (2015-05-12 18:54:02)
Offline

[..]For those of you experiencing the panics, please try it.[..]
Do you want only those who have kernel panics with 4.0.2-2-ck to try it? Because we already established that there are not many of them (and you already released it to the AUR).
Offline

I don't think anyone has experienced a panic under 4.0.2-2-ck that can be directly attributed to that kernel...
Offline
Just to add another data point, 4.0.2-3 is broken for me too.
Last edited by Wibjarm (2015-05-12 20:45:51)
Offline