
#1 2013-02-16 15:38:43

Demon
Member
From: Republic of Srpska, BA
Registered: 2008-03-02
Posts: 246

[solved] Constant kernel panics

Since yesterday I have been getting constant kernel panics, both during boot and during normal use such as web browsing. I managed to take a picture of the screen during one of the panics:

http://postimage.org/image/os13aypix/

It is somewhat different from other panics I've experienced. journalctl doesn't show anything interesting. I've tested the RAM using Memtest86+ and it was OK.

Any advice is highly appreciated.


EDIT: It was a faulty motherboard. I'm using a new one now and everything is OK.

Last edited by Demon (2013-03-08 12:17:00)

Offline

#2 2013-02-19 18:44:17

ilkyest
Member
From: Brazil
Registered: 2010-02-13
Posts: 269

Re: [solved] Constant kernel panics

Probably a microcode issue. Try adding "clocksource=hpet" to the kernel command line in GRUB and see if it boots.

If it doesn't, enable the "microcode update" option in the BIOS, add clocksource=hpet and try again. Once booted, install the microcode for your processor: run "pacman -Ss microcode", choose the matching package and install it.
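
To check after booting whether the clocksource parameter actually took effect, something like the sketch below could be used. It assumes the usual sysfs location /sys/devices/system/clocksource/clocksource0; the function name show_clocksource and its optional directory argument are made up for illustration.

```shell
#!/bin/sh
# Print the clocksource the running kernel uses and the ones it offers.
# $1 is a hypothetical override of the sysfs directory (handy for testing).
show_clocksource() {
    dir="${1:-/sys/devices/system/clocksource/clocksource0}"
    if [ ! -r "$dir/current_clocksource" ]; then
        echo "no clocksource information under $dir" >&2
        return 1
    fi
    printf 'current: %s\n' "$(cat "$dir/current_clocksource")"
    if [ -r "$dir/available_clocksource" ]; then
        printf 'available: %s\n' "$(cat "$dir/available_clocksource")"
    fi
}
```

Calling show_clocksource without arguments reads the live system. If "current" is not hpet after a reboot, the parameter probably did not reach the kernel.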

Offline

#3 2013-02-19 20:47:39

Demon
Member
From: Republic of Srpska, BA
Registered: 2008-03-02
Posts: 246

Re: [solved] Constant kernel panics

I can boot, after a few unsuccessful tries. I've read somewhere that this could be caused by dust in the case. I'm going to try this first: cleaning the case and CPU cooler and resetting the BIOS. Will report back.

Offline

#4 2013-02-19 21:28:52

mich41
Member
Registered: 2012-06-22
Posts: 796

Re: [solved] Constant kernel panics

I doubt that microcode or use of HPET would cause "ECC error during data access from L2". This sounds like a broken CPU or overheating.

Did you perform any kernel updates recently?

Don't overclock (in case you do).
Clean the dust from the CPU cooler.
Try disabling SpeedStep/Cool'n'Quiet - somebody recently reported that CPUFreq appears to overclock his CPU to 5 GHz (???) and crashes the system.
Check if you can boot a live CD, preferably one known to have worked on this computer in the past.
Check the CPU temperature.

As a last resort, downclocking the CPU may increase the probability that it will boot.

Last edited by mich41 (2013-02-19 21:32:00)

Offline

#5 2013-03-02 10:30:31

Demon
Member
From: Republic of Srpska, BA
Registered: 2008-03-02
Posts: 246

Re: [solved] Constant kernel panics

I've cleaned the case and reset the BIOS, and everything is the same. I can boot with Cool'n'Quiet off, but only if underclocked. I really don't know what else to try, except a BIOS upgrade (I already have the latest version).

I've also installed amd-ucode, but I honestly don't know what to do with it. :)

Offline

#6 2013-03-02 12:03:10

mich41
Member
Registered: 2012-06-22
Posts: 796

Re: [solved] Constant kernel panics

If underclocking helps then it's almost certainly a hardware problem. Is it always an L2 ECC error on CPU1?

Install lm_sensors. Run sensors-detect and load the driver for the SuperIO chip it finds. Then run sensors.

If it reports temperatures above 60°C, it's possible that you have insufficient cooling and the CPU goes crazy due to high temperature.
Otherwise, it's core1 L2 cache damage or some extremely weird bug.
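
A rough way to check the 60°C threshold without reading the sensors output by eye. This is only a sketch: it assumes the loaded driver exposes its readings as temp*_input files under /sys/class/hwmon (the kernel reports them in millidegrees Celsius), and the function name hot_sensors and its arguments are hypothetical.

```shell
#!/bin/sh
# List hwmon temperature readings above a limit (60 degrees C by default).
# $1 = limit in degrees C, $2 = hwmon root (both optional, for illustration).
hot_sensors() {
    limit_c="${1:-60}"
    root="${2:-/sys/class/hwmon}"
    for f in "$root"/hwmon*/temp*_input; do
        [ -r "$f" ] || continue              # glob matched nothing, or unreadable
        milli=$(cat "$f" 2>/dev/null) || continue
        case "$milli" in
            ''|*[!0-9-]*) continue ;;        # skip anything that is not a number
        esac
        temp_c=$((milli / 1000))             # millidegrees -> whole degrees
        if [ "$temp_c" -gt "$limit_c" ]; then
            printf '%s: %s C\n' "$f" "$temp_c"
        fi
    done
}
```

Running hot_sensors with no arguments prints nothing when every sensor is at or below 60°C. Which file belongs to the CPU depends on the board, so the sensors output is still the reference for labels.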

Last edited by mich41 (2013-03-02 12:05:10)

Offline

#7 2013-03-02 12:07:11

Demon
Member
From: Republic of Srpska, BA
Registered: 2008-03-02
Posts: 246

Re: [solved] Constant kernel panics

No, I'm monitoring the temperature constantly, and it is not overheating. I'm going to try the microcode update with kernel 3.8. If this doesn't help, I guess I'll have to buy a new CPU. :(

Offline

#8 2013-03-02 12:15:01

mich41
Member
Registered: 2012-06-22
Posts: 796

Re: [solved] Constant kernel panics

You can try some old kernels just to be sure - old live cds, linux-lts (it's v3.0), etc.

Another option is disabling the faulting core. This should help with K10 which has separate L2 caches for each core. Not sure about K8, though.

Last edited by mich41 (2013-03-02 12:18:06)

Offline

#9 2013-03-02 12:16:44

Demon
Member
From: Republic of Srpska, BA
Registered: 2008-03-02
Posts: 246

Re: [solved] Constant kernel panics

I've already tried old live CDs, to no avail. This also happens with Windows XP (BSOD). How can I disable the faulting core?

Offline

#10 2013-03-02 12:21:00

mich41
Member
Registered: 2012-06-22
Posts: 796

Re: [solved] Constant kernel panics

Some BIOSes have an option to hide cores or "unlock" factory-hidden ones.

Another way:

echo 0 >/sys/devices/system/cpu/cpu1/online

Run this before the system crashes, e.g. in /etc/rc.local or whatever the systemd equivalent is, if you use that.

Yet another: add nosmp to kernel command line if it's a dual-core CPU.
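
A slightly more defensive version of the echo command above, as a sketch. The function name cpu_offline and the optional second argument (a replacement for /sys/devices/system/cpu, useful for dry runs) are assumptions for illustration; on a real system it has to run as root.

```shell
#!/bin/sh
# Take one CPU core offline through sysfs, with basic checks.
# $1 = core number, $2 = sysfs CPU directory (hypothetical override).
cpu_offline() {
    core="$1"
    base="${2:-/sys/devices/system/cpu}"
    ctl="$base/cpu$core/online"
    if [ ! -e "$ctl" ]; then
        # The boot CPU (usually cpu0) often has no "online" file at all.
        echo "cpu$core cannot be switched: $ctl is missing" >&2
        return 1
    fi
    echo 0 > "$ctl" || return 1
    # Read the state back so the caller can see what the kernel reports.
    printf 'cpu%s online=%s\n' "$core" "$(cat "$ctl")"
}
```

For example, cpu_offline 1 disables the second core. It only helps if it gets to run before the faulty core triggers a panic.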

Last edited by mich41 (2013-03-02 12:23:29)

Offline

#11 2013-03-02 12:22:17

Demon
Member
From: Republic of Srpska, BA
Registered: 2008-03-02
Posts: 246

Re: [solved] Constant kernel panics

OK, thank you very much for your help.

Offline

#12 2013-03-02 15:43:19

Demon
Member
From: Republic of Srpska, BA
Registered: 2008-03-02
Posts: 246

Re: [solved] Constant kernel panics

echo 0 >/sys/devices/system/cpu/cpu1/online

This doesn't help, as the failure occurs before the command gets to run. Also, dmesg reports this:

microcode: AMD CPU family 0xf not supported

so the last hope is gone. For now it works if I disable Cool'n'Quiet (which I can live with) and if I downclock the CPU, which is not acceptable.
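
The family number in that message can be cross-checked against /proc/cpuinfo. A minimal sketch, assuming the x86 layout where the "cpu family" field is a decimal number; the function name cpu_family_hex and its optional file argument are made up for illustration.

```shell
#!/bin/sh
# Print the CPU family of the first processor entry in hexadecimal.
# $1 = cpuinfo file (hypothetical override; defaults to /proc/cpuinfo).
cpu_family_hex() {
    file="${1:-/proc/cpuinfo}"
    # Take the value after the colon on the first "cpu family" line.
    fam=$(awk -F: '/^cpu family/ { gsub(/[ \t]/, "", $2); print $2; exit }' "$file")
    [ -n "$fam" ] || return 1
    printf '0x%x\n' "$fam"
}
```

A value of 15 in /proc/cpuinfo comes out as 0xf, which matches the family named in the dmesg line.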


Any other ideas?

Edit: I've already updated BIOS, still the same.

Last edited by Demon (2013-03-02 15:44:55)

Offline

#13 2013-03-02 16:00:45

Demon
Member
From: Republic of Srpska, BA
Registered: 2008-03-02
Posts: 246

Re: [solved] Constant kernel panics

I also see this in kernel panic log:

Tag Snoop Error

Offline

#14 2013-03-02 20:16:14

mich41
Member
Registered: 2012-06-22
Posts: 796

Re: [solved] Constant kernel panics

Family 0xf is K8 so it must be dual core. Since core1 is failing, simply add maxcpus=1 to boot parameters and Linux will run on core0 exclusively.

Tag snooping points to cache again.
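
To make maxcpus=1 permanent with GRUB 2, the parameter usually goes into GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub. A sketch under assumptions: it expects that variable on a single double-quoted line, the function name add_kernel_param is hypothetical, and parameters containing "|", "&" or backslashes are not handled.

```shell
#!/bin/sh
# Append a kernel parameter to GRUB_CMDLINE_LINUX_DEFAULT unless already there.
# $1 = parameter, $2 = GRUB defaults file (defaults to /etc/default/grub).
add_kernel_param() {
    param="$1"
    file="${2:-/etc/default/grub}"
    line=$(grep '^GRUB_CMDLINE_LINUX_DEFAULT="' "$file" | head -n 1)
    [ -n "$line" ] || return 1
    val=${line#*\"}        # drop everything up to the opening quote
    val=${val%%\"*}        # drop the closing quote and anything after it
    case " $val " in
        *" $param "*) return 0 ;;    # already present, nothing to do
    esac
    if [ -n "$val" ]; then new="$val $param"; else new="$param"; fi
    # Rewrite the line into a temporary copy, then replace the original.
    sed "s|^GRUB_CMDLINE_LINUX_DEFAULT=\".*|GRUB_CMDLINE_LINUX_DEFAULT=\"$new\"|" \
        "$file" > "$file.new" && mv "$file.new" "$file"
}
```

After add_kernel_param maxcpus=1 (as root), the configuration still has to be regenerated, on Arch typically with grub-mkconfig -o /boot/grub/grub.cfg. For a one-off test it is simpler to edit the kernel line from the GRUB menu at boot.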

Offline

#15 2013-03-02 20:42:32

Demon
Member
From: Republic of Srpska, BA
Registered: 2008-03-02
Posts: 246

Re: [solved] Constant kernel panics

No use. The same errors happen, just this time on CPU0. The only way I can boot and use my computer is to disable AMD Cool'n'Quiet in the BIOS and downclock to 1500-1600 MHz.

Offline

#16 2013-03-08 12:17:24

Demon
Member
From: Republic of Srpska, BA
Registered: 2008-03-02
Posts: 246

Re: [solved] Constant kernel panics

It was a faulty motherboard, after all.

Offline
