#1 2013-03-31 22:49:33

scott_fakename
Member
Registered: 2012-08-15
Posts: 58

Kernel oops, then panic, then catastrophic crash when using ethernet

I'm running Arch on my laptop, using netcfg to manage networks. I can connect over wireless with no problems, to any kind of wireless network. That is what I've been doing most of the time, since until recently my house only had wireless.

A few months ago I was at a friend's house, and we hooked my computer up to their wired internet. Within a minute or two, while I was still in tty1, the screen flickered briefly and dumped out a bunch of messages (three or so, I didn't get a chance to look clearly) that said "uuuhhhh... Dazed and trying to continue." Then it panicked and dumped the panic message. About a second after that, the screen went frenetic and staticky, and it started blurring and stretching the display (hopefully someone else knows what I'm talking about, because I can't really describe it any better). I have never had THAT part happen on a kernel panic before.

I just ignored it because wireless works fine, so I figured maybe something was odd about their network. That was a few months ago.

Recently I installed wired connections in my own house and plugged the laptop in... Same thing. Every time. It might take thirty seconds or it might take twenty minutes, but eventually the computer WILL go down.

I thought it might have something to do with hardware interrupts, because when I logged dmesg and tried to make it happen again, it kept mentioning IRQs and PS1 (my mouse, I think). But I don't know. The log doesn't mention a panic or an oops... Nothing. It just ends.

Here is a link to the file I dumped it into: http://pastebin.com/kXp3P3GG

Anything else I should check?

--Scott

Edit: There's a good chance this belongs in hardware.

Last edited by scott_fakename (2013-04-01 04:56:12)

Offline

#2 2013-04-01 07:13:29

t1nk3r3r
Member
From: The Pacific Northwest
Registered: 2011-03-22
Posts: 79

Re: Kernel oops, then panic, then catastrophic crash when using ethernet

scott_fakename wrote:

Edit: There's a good chance this belongs in hardware.

I'm inclined to agree. It could be a dusty or overheating chipset, the wrong module being loaded, or a faulty ethernet chip.
Then again, I did once see a system with random hangs that turned out to have faulty RAM.


--------------------------The only wasted day is one in which you learn nothing.--------------------------

Offline

#3 2013-04-01 17:25:27

combuster
Member
From: Serbia
Registered: 2008-09-30
Posts: 711
Website

Re: Kernel oops, then panic, then catastrophic crash when using ethernet

I would tell you to upgrade your BIOS, but I can't seem to find an update for this particular model on the Toshiba website...

Offline

#4 2013-04-01 17:59:40

ewaller
Administrator
From: Pasadena, CA
Registered: 2009-07-13
Posts: 12,390

Re: Kernel oops, then panic, then catastrophic crash when using ethernet

That "Dazed and confused" bit usually means that some piece of hardware has fired off an unsolicited interrupt; that is, the kernel never enabled interrupts from whatever device raised it.  Since the kernel does not know what generated the interrupt, it does not know how to deal with it or, worse, how to service it to turn it off.  If that interrupt is shared with something else, for example a NIC, then bad things can happen.  When the interrupt occurs, the kernel goes out and talks to all the devices sharing the interrupt, trying to find the culprit. When all of the known devices deny any knowledge of why there is an interrupt, the kernel effectively loses the ability to cope with interrupts from those devices, because they are all sharing a channel that cannot be cleared.
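To make the shared-line failure described above concrete, here is a toy Python model. It is only a sketch of the idea, not kernel code: the class name, the handler functions, and the threshold of 3 are all made up for illustration, and the real kernel does its own accounting.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class SharedIrqLine:
    """Toy model of one interrupt line shared by several device handlers."""
    handlers: List[Callable[[], bool]] = field(default_factory=list)
    unhandled: int = 0
    disabled: bool = False
    max_unhandled: int = 3  # illustrative threshold, not the kernel's value

    def fire(self) -> bool:
        """Deliver one interrupt; return True if some handler claimed it."""
        if self.disabled:
            return False
        # Ask every device on the line. Each handler returns True only if
        # the interrupt came from its own hardware.
        claimed = [handler() for handler in self.handlers]
        if any(claimed):
            return True
        # Nobody admitted to raising it, so the source cannot be cleared.
        self.unhandled += 1
        if self.unhandled >= self.max_unhandled:
            # The line is shut off, and every device sharing it goes deaf.
            self.disabled = True
        return False


def nic_handler() -> bool:
    return False  # hypothetical NIC driver: "not mine"


def audio_handler() -> bool:
    return False  # hypothetical audio driver: "not mine"


line = SharedIrqLine(handlers=[nic_handler, audio_handler])
for _ in range(3):
    line.fire()
print(line.disabled)  # True
```

Once the line is disabled, the well-behaved devices on it stop getting their interrupts serviced too, which is why one misbehaving piece of hardware can take out an unrelated device.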

Debugging this is as much an art as it is science.  It might be that a PnP BIOS is setting up hardware (maybe a PXE boot environment on the NIC card) that is leaving the hardware in an unexpected state prior to starting the boot loader.

Try different PnP settings in the BIOS.  Tell the BIOS you are using an OS that handles its own PnP; if you don't, it will try to set it up for you (left over from the DOS days).
You might try experimenting with some of the kernel command line parameters, especially the ACPI ones having to do with the APIC and IRQ leveling.
You might try turning off hardware in the BIOS that you are not using.
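When experimenting with boot parameters it is easy to lose track of what a given boot actually used. The small Python helper below is one way to check. The set of parameters is my own selection of interrupt-related options (some from this thread, the rest as I recall them from the kernel's kernel-parameters.txt), and the sample command line, including root=/dev/sda1, is hypothetical.

```python
from pathlib import Path
from typing import List

# Interrupt/ACPI-related boot parameters worth trying one at a time.
# Which of these, if any, helps on a given machine is not known in advance.
IRQ_RELATED = {
    "noapic", "nolapic", "irqpoll",
    "acpi=off", "acpi=noirq",
    "pci=noacpi", "pci=nomsi",
    "nmi_watchdog=0",
}


def irq_related_params(cmdline: str) -> List[str]:
    """Return the IRQ-related parameters present, in command-line order."""
    return [token for token in cmdline.split() if token in IRQ_RELATED]


def current_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the command line the running kernel was booted with."""
    return Path(path).read_text().strip()


# Hypothetical command line, used here so the example runs anywhere.
sample = "root=/dev/sda1 ro acpi=noirq nmi_watchdog=0 quiet"
print(irq_related_params(sample))
```

On the machine itself, irq_related_params(current_cmdline()) reports what the running boot used, which is handy to note next to each test result.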

Be patient.


Nothing is too wonderful to be true, if it be consistent with the laws of nature -- Michael Faraday
Like you, I have no idea what you are doing, but I am pretty sure it is wrong...Jasonwryan
----
How to Ask Questions the Smart Way

Offline

#5 2013-04-01 21:24:28

combuster
Member
From: Serbia
Registered: 2008-09-30
Posts: 711
Website

Re: Kernel oops, then panic, then catastrophic crash when using ethernet

ewaller wrote:

Debugging this is as much an art as it is science.  It might be that a PnP BIOS is setting up hardware (maybe a PXE boot environment on the NIC card) that is leaving the hardware in an unexpected state prior to starting the boot loader.

This! NICs are prone to this kind of behaviour. You could also try disabling any NIC-related power management in the BIOS (WoL, for example). Windows also tends to leave ethernet cards in a "dirty" state; if that is the case, try power cycling, booting straight into Linux, and then reproducing the panic.

Offline

#6 2013-04-01 22:05:58

scott_fakename
Member
Registered: 2012-08-15
Posts: 58

Re: Kernel oops, then panic, then catastrophic crash when using ethernet

I was looking in the BIOS setup for things to change, but the only thing I could find related to the NIC was "wakeup on LAN", and that was only in the context of resuming from suspend.

I've found that the BIOS on this computer is very screwy in a lot of ways. A lot (!!!!) of things are locked down and can't be changed (or even seen). For example, when I tried to install qemu on this computer, I found that the CPU virtualization support KVM needs was "disabled in bios" (and indeed, the kernel tells me so every time I boot). I couldn't find anywhere to re-enable it, and when I called Toshiba they basically said there was nothing they could (would?) do about it.

All that to say, while I would *prefer* to fix this in the BIOS, that is unfortunately not likely to be feasible in this case. So I'm now stuck trying various different kernel options. Which is fine. I will post when I know more.

Thank you!
--Scott

Offline

#7 2013-04-02 07:13:59

scott_fakename
Member
Registered: 2012-08-15
Posts: 58

Re: Kernel oops, then panic, then catastrophic crash when using ethernet

So I tried upgrading my BIOS with a CD I got from the Toshiba website. I also tried numerous different kernel options... Turning off ACPI in various ways (pci=noacpi, acpi=noirq...) prevents it from crashing but disables network access.

I did get a clearer look at the oops. It said that it was due to "unknown reason 3d." Does that mean anything? And it said "do you have a strange power management setup?" or something very like that. I do have laptop-mode-tools installed, but I don't see any mention of a problem like this anywhere in the laptop-mode-tools documentation. Is that a possibility?
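On the "unknown reason 3d" part: as far as I understand it, on x86 that number is the NMI status byte, and the kernel only recognises two causes in it. The Python sketch below decodes it under that assumption. The bit values (0x80 for a system error, 0x40 for an I/O check) are my recollection of the kernel's constants and should be treated as an assumption, as should the function name.

```python
from typing import List

# Assumed meaning of the two cause bits in the x86 NMI status byte.
NMI_REASON_SERR = 0x80   # bit 7: system error (for example PCI SERR or parity)
NMI_REASON_IOCHK = 0x40  # bit 6: I/O channel check


def decode_nmi_reason(reason: int) -> List[str]:
    """Return the recognised NMI causes encoded in the status byte."""
    causes = []
    if reason & NMI_REASON_SERR:
        causes.append("SERR")
    if reason & NMI_REASON_IOCHK:
        causes.append("IOCHK")
    return causes


print(decode_nmi_reason(0x3D))  # []
```

Neither cause bit is set in 0x3d, so under this reading the value itself says only that an NMI arrived with no recognised source, which is why the message calls it an unknown reason.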

Does it make sense to install something like irqbalance, or isapnp? Does that have any chance of changing anything?

Also, I don't know if it's normal but when I boot the usual way, and then cat /proc/interrupts, I get:

           CPU0       CPU1       
  0:         54          0   IO-APIC-edge      timer
  1:         76       3990   IO-APIC-edge      i8042
  7:          1          0   IO-APIC-edge    
  8:          0          1   IO-APIC-edge      rtc0
  9:          0          0   IO-APIC-fasteoi   acpi
 12:       1767      55866   IO-APIC-edge      i8042
 16:        325       8747   IO-APIC-fasteoi   snd_hda_intel, rtlwifi
 17:          2        105   IO-APIC-fasteoi   ehci_hcd:usb1, ehci_hcd:usb2, ehci_hcd:usb3
 18:          0          0   IO-APIC-fasteoi   ohci_hcd:usb4, ohci_hcd:usb5, ohci_hcd:usb6
 19:        189      18898   IO-APIC-fasteoi   ahci
 40:         67       1526   PCI-MSI-edge      radeon
NMI:          9         11   Non-maskable interrupts
LOC:     103574     108430   Local timer interrupts
SPU:          0          0   Spurious interrupts
PMI:          9         11   Performance monitoring interrupts
IWI:          0          0   IRQ work interrupts
RTR:          0          0   APIC ICR read retries
RES:     132336     108518   Rescheduling interrupts
CAL:         40         26   Function call interrupts
TLB:       5690       2069   TLB shootdowns
TRM:          0          0   Thermal event interrupts
THR:          0          0   Threshold APIC interrupts
MCE:          0          0   Machine check exceptions
MCP:          5          5   Machine check polls
ERR:          1
MIS:          0

Then if I plug in the ethernet (still booted the normal way), it adds a line directly below radeon (the video card) that says 41 blah blah PCI-MSI-edge eth0, where the blahs are numbers. These are the only two devices like that, but I don't know if that's normal.
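For comparing /proc/interrupts between boots, a short Python script can pull out which numbered lines are shared and which devices use MSI. This is a minimal sketch: the function names are my own, and the sample is a few lines copied from the listing above (the eth0 line is left out because its counts were not posted).

```python
from typing import Dict, List

SAMPLE = """
           CPU0       CPU1
  7:          1          0   IO-APIC-edge
 16:        325       8747   IO-APIC-fasteoi   snd_hda_intel, rtlwifi
 17:          2        105   IO-APIC-fasteoi   ehci_hcd:usb1, ehci_hcd:usb2, ehci_hcd:usb3
 19:        189      18898   IO-APIC-fasteoi   ahci
 40:         67       1526   PCI-MSI-edge      radeon
NMI:          9         11   Non-maskable interrupts
"""


def parse_interrupts(text: str) -> Dict[str, dict]:
    """Parse the numbered IRQ rows of /proc/interrupts output."""
    lines = text.strip().splitlines()
    ncpu = len(lines[0].split())  # header row: one column per CPU
    rows = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        fields = rest.split()
        # Skip summary rows such as NMI/LOC and anything malformed.
        if not label.strip().isdigit() or len(fields) < ncpu + 1:
            continue
        devices = " ".join(fields[ncpu + 1:]).split(",")
        rows[label.strip()] = {
            "counts": [int(c) for c in fields[:ncpu]],
            "chip": fields[ncpu],
            "devices": [d.strip() for d in devices if d.strip()],
        }
    return rows


def shared_lines(rows: Dict[str, dict]) -> Dict[str, List[str]]:
    """IRQ lines with more than one device attached."""
    return {irq: r["devices"] for irq, r in rows.items() if len(r["devices"]) > 1}


def msi_lines(rows: Dict[str, dict]) -> List[str]:
    """IRQ numbers delivered as PCI message-signalled interrupts."""
    return [irq for irq, r in rows.items() if r["chip"].startswith("PCI-MSI")]


rows = parse_interrupts(SAMPLE)
print(shared_lines(rows))
print(msi_lines(rows))
```

In this sample, lines 16 and 17 are shared and line 40 (radeon) is MSI. An MSI device does not share a legacy IRQ line, so seeing eth0 appear as PCI-MSI-edge next to radeon is not unusual in itself.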

I'm grateful for the help.
Thanks,
--Scott

Offline

#8 2013-04-02 14:50:33

combuster
Member
From: Serbia
Registered: 2008-09-30
Posts: 711
Website

Re: Kernel oops, then panic, then catastrophic crash when using ethernet

Does the nmi_watchdog=0 kernel parameter help? Or disabling ACPI, maybe?

If all of this fails you can always file a bug report on the kernel bugzilla, but do this only if you can capture the entire panic output; otherwise it will mean nothing to the maintainers of the code in question (I don't know whether this should be filed under drivers/net/atheros or acpi or something else...)
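One possible way to keep as much kernel output as possible before the machine dies is to write every log line to disk and force it out immediately. The Python sketch below does that for any stream of lines. The function name and the file name in the usage note are made up, and it cannot capture messages the kernel prints after userspace has stopped running, so a photo of the screen may still be needed for the panic itself.

```python
import os
from typing import Iterable


def persist_lines(lines: Iterable[str], path: str) -> int:
    """Append each line to path and fsync after every write.

    Returns the number of lines written. Syncing per line is slow, but it
    means the last line before a crash has the best chance of reaching disk.
    """
    count = 0
    with open(path, "a", encoding="utf-8") as out:
        for line in lines:
            out.write(line if line.endswith("\n") else line + "\n")
            out.flush()             # push Python's buffer to the OS
            os.fsync(out.fileno())  # ask the OS to push it to the disk
            count += 1
    return count
```

A script that ends with persist_lines(sys.stdin, "panic.log") (plus import sys) can be fed from dmesg --follow through a pipe on a second tty before plugging in the cable.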

Offline
