#1 2010-03-16 22:31:02

lasu1
Member
Registered: 2010-02-10
Posts: 83

[Solved] irq 18: nobody cared (kernel panic after suspend, sometimes)

Hello,

I've been getting kernel panics following some pm-suspends. This does not always happen, and I am unable to pinpoint exactly when it happens, although it seems to occur within 30 minutes of waking from pm-suspend (if it doesn't happen within that 30-minute window, it doesn't seem to happen at all).

I use Arch x86_64 with Xfce as my desktop. I do not use a display manager. I have the NVIDIA 190.53 driver installed, which I use to play EVE Online sometimes ;-)

Things I've tried:
1.) Moving the swap partition and enlarging it to twice my RAM.
2.) Suspending from xterm (that is, not clicking on the "suspend" button within XFCE).

Please let me know if more info could be helpful, and I appreciate any assistance!

Here is what looks to be the error from kernel.log:

Mar 16 18:12:59 nicholas-arch kernel: input: Logitech USB Trackball as /devices/pci0000:00/0000:00:1a.2/usb5/5-2/5-2:1.0/input/input6
Mar 16 18:12:59 nicholas-arch kernel: generic-usb 0003:046D:C408.0001: input,hidraw0: USB HID v1.10 Mouse [Logitech USB Trackball] on usb-0000:00:1a.2-2/input0
Mar 16 18:12:59 nicholas-arch kernel: usbcore: registered new interface driver usbhid
Mar 16 18:12:59 nicholas-arch kernel: usbhid: v2.6:USB HID core driver
Mar 16 18:12:59 nicholas-arch kernel: irq 18: nobody cared (try booting with the "irqpoll" option)
Mar 16 18:12:59 nicholas-arch kernel: Pid: 0, comm: swapper Tainted: P           2.6.32-ARCH #1
Mar 16 18:12:59 nicholas-arch kernel: Call Trace:
Mar 16 18:12:59 nicholas-arch kernel: <IRQ>  [<ffffffff810ab7fe>] ? __report_bad_irq+0x1e/0x90
Mar 16 18:12:59 nicholas-arch kernel: [<ffffffff810ab9fb>] ? note_interrupt+0x18b/0x1d0
Mar 16 18:12:59 nicholas-arch kernel: [<ffffffff81019915>] ? read_tsc+0x5/0x20
Mar 16 18:12:59 nicholas-arch kernel: [<ffffffff810ac30d>] ? handle_fasteoi_irq+0xcd/0xf0
Mar 16 18:12:59 nicholas-arch kernel: [<ffffffff810153e5>] ? handle_irq+0x15/0x20
Mar 16 18:12:59 nicholas-arch kernel: [<ffffffff81014902>] ? do_IRQ+0x62/0xe0
Mar 16 18:12:59 nicholas-arch kernel: [<ffffffff81012a53>] ? ret_from_intr+0x0/0x11
Mar 16 18:12:59 nicholas-arch kernel: <EOI>  [<ffffffff8101a8a3>] ? mwait_idle+0x63/0xe0
Mar 16 18:12:59 nicholas-arch kernel: [<ffffffff81011202>] ? cpu_idle+0xb2/0x110
Mar 16 18:12:59 nicholas-arch kernel: handlers:
Mar 16 18:12:59 nicholas-arch kernel: [<ffffffffa0b667e0>] (usb_hcd_irq+0x0/0x80 [usbcore])
Mar 16 18:12:59 nicholas-arch kernel: [<ffffffffa0b667e0>] (usb_hcd_irq+0x0/0x80 [usbcore])
Mar 16 18:12:59 nicholas-arch kernel: [<ffffffffa0b667e0>] (usb_hcd_irq+0x0/0x80 [usbcore])
Mar 16 18:12:59 nicholas-arch kernel: Disabling IRQ #18
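
For anyone triaging a similar log, here is a minimal sketch of pulling the affected IRQ number and its registered handlers out of kernel log lines like the ones above. The function name `find_unhandled_irqs` and the regular expressions are my own assumptions, written against the exact line format in this excerpt; other kernel versions may print the report differently.

```python
import re

# Matches the kernel's "irq N: nobody cared" report line.
NOBODY_CARED = re.compile(r"irq (\d+): nobody cared")
# Matches handler entries such as
# "[<ffffffffa0b667e0>] (usb_hcd_irq+0x0/0x80 [usbcore])".
HANDLER = re.compile(
    r"\[<[0-9a-f]+>\] \((\w+)\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(\w+)\])?\)"
)


def find_unhandled_irqs(lines):
    """Return a list of (irq_number, [(handler, module), ...]) tuples."""
    reports = []
    current = None        # report currently being filled in
    in_handlers = False   # True once the "handlers:" line has been seen
    for line in lines:
        m = NOBODY_CARED.search(line)
        if m:
            current = (int(m.group(1)), [])
            reports.append(current)
            in_handlers = False
            continue
        if current is None:
            continue
        if line.rstrip().endswith("handlers:"):
            in_handlers = True
            continue
        if in_handlers:
            h = HANDLER.search(line)
            if h:
                current[1].append((h.group(1), h.group(2)))
            else:
                # First non-handler line (e.g. "Disabling IRQ #18") ends the report.
                current = None
                in_handlers = False
    return reports
```

Feeding it the lines of the log file (for example `find_unhandled_irqs(open(path))`, where `path` is wherever your syslog daemon writes kernel messages) should list each "nobody cared" report together with the handlers that were registered on that IRQ at the time.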

Here is my lspci:

[nicholas@nicholas-arch ~]$ lspci
00:00.0 Host bridge: Intel Corporation 82G33/G31/P35/P31 Express DRAM Controller (rev 02)
00:01.0 PCI bridge: Intel Corporation 82G33/G31/P35/P31 Express PCI Express Root Port (rev 02)
00:1a.0 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #4 (rev 02)
00:1a.1 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #5 (rev 02)
00:1a.2 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #6 (rev 02)
00:1a.7 USB Controller: Intel Corporation 82801I (ICH9 Family) USB2 EHCI Controller #2 (rev 02)
00:1b.0 Audio device: Intel Corporation 82801I (ICH9 Family) HD Audio Controller (rev 02)
00:1c.0 PCI bridge: Intel Corporation 82801I (ICH9 Family) PCI Express Port 1 (rev 02)
00:1c.3 PCI bridge: Intel Corporation 82801I (ICH9 Family) PCI Express Port 4 (rev 02)
00:1c.4 PCI bridge: Intel Corporation 82801I (ICH9 Family) PCI Express Port 5 (rev 02)
00:1d.0 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #1 (rev 02)
00:1d.1 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #2 (rev 02)
00:1d.2 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #3 (rev 02)
00:1d.7 USB Controller: Intel Corporation 82801I (ICH9 Family) USB2 EHCI Controller #1 (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 92)
00:1f.0 ISA bridge: Intel Corporation 82801IB (ICH9) LPC Interface Controller (rev 02)
00:1f.2 IDE interface: Intel Corporation 82801IB (ICH9) 2 port SATA IDE Controller (rev 02)
00:1f.3 SMBus: Intel Corporation 82801I (ICH9 Family) SMBus Controller (rev 02)
00:1f.5 IDE interface: Intel Corporation 82801I (ICH9 Family) 2 port SATA IDE Controller (rev 02)
01:00.0 VGA compatible controller: nVidia Corporation GT200b [GeForce GTX 275] (rev a1)
03:00.0 IDE interface: JMicron Technology Corp. JMB368 IDE controller
04:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 01)
05:02.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8185 IEEE 802.11a/b/g Wireless LAN Controller (rev 20)
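
Plain lspci output does not show IRQ assignments, so it cannot tell you which of these controllers share IRQ 18. That information is in /proc/interrupts. Below is a rough sketch of extracting the row for one IRQ from that file's text; the helper name `handlers_on_irq` is hypothetical, and the parsing assumes the usual layout of a CPU header row followed by one row per interrupt.

```python
def handlers_on_irq(interrupts_text, irq):
    """Return the descriptive tail of the /proc/interrupts row for `irq`, or None."""
    lines = interrupts_text.splitlines()
    if not lines:
        return None
    # The header row names one column per online CPU: "CPU0 CPU1 ...".
    ncpus = len(lines[0].split())
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep or label.strip() != str(irq):
            continue
        fields = rest.split()
        # Skip the per-CPU counters; what remains is the interrupt
        # controller type and the names of the drivers sharing the line.
        return " ".join(fields[ncpus:])
    return None
```

Calling `handlers_on_irq(open("/proc/interrupts").read(), 18)` on the affected machine would show which drivers are attached to that line. `lspci -v` is another way to see the IRQ each PCI device was given.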

Kernel info:

[nicholas@nicholas-arch ~]$ uname -a
Linux nicholas-arch 2.6.32-ARCH #1 SMP PREEMPT Tue Feb 23 19:43:46 CET 2010 x86_64 Intel(R) Core(TM)2 Quad CPU Q6700 @ 2.66GHz GenuineIntel GNU/Linux
[nicholas@nicholas-arch ~]$

Last edited by lasu1 (2010-04-11 15:19:43)

#2 2010-03-17 01:03:10

tavianator
Member
From: Waterloo, ON, Canada
Registered: 2007-08-21
Posts: 858

To figure out what's breaking, I'd first suggest suspending without X running at all.  If that works, it's probably nVidia's fault.  Otherwise it's likely a kernel bug.

#3 2010-03-17 11:51:57

lasu1
Member
Registered: 2010-02-10
Posts: 83

Ok, this is quite odd.  If I exit X and suspend from the command line, I can't resume at all.  If I suspend from X, I can successfully resume 70-80% of the time; the other 20-30% of the time I get the freeze described above after resuming. How weird is that?

My logic (obviously wrong) would hold that the suspends would be more successful from the CLI than from X.

So, this problem seems to have been around for a few years, though not so much in Arch (it seems to have been common in Fedora 10). This bug report over at Fedora: https://bugzilla.redhat.com/show_bug.cgi?id=474624 suggests booting with the pci=msi flag.

What is the pci=msi flag, and how do I boot with it? Does anyone know if this will negatively impact my system's performance in other ways?

Well, before going any further, I should note that this is more of an "annoyance" than a true problem. I do all of my work on my Arch machine, and, truth be told, resume from suspend is only seconds faster than a cold boot (I'm at the CLI within 10-13 seconds of power on, and X starts immediately when I issue the command). I'd like to fix this more because I don't like the idea of something "not being 100% working" than because it is actually getting in the way of my productivity (because it really isn't, hehe) :) I also really get a kick out of seeing how long I can get uptime to be.

*edited for URL syntax
*Edit #2: I should add that this morning I pacman -Syu'd and upgraded the kernel...

Last edited by lasu1 (2010-03-17 12:13:56)

#4 2010-03-17 15:50:54

tavianator
Member
From: Waterloo, ON, Canada
Registered: 2007-08-21
Posts: 858

Well that would be my logic too, but then again the nVidia proprietary drivers don't like to play nice with the VT consoles in my experience.  What about suspend from a console without ever having started X, and without the nVidia module loaded?

MSI is just an optional part of the PCI standard; I can't see any reason it would hurt your system's performance.  To try it, just add pci=msi to the end of the kernel line in GRUB (press 'e' to edit the commands at bootup).  However, I seriously doubt you're seeing the same problem as people with Fedora 10 are seeing -- just because you have the same symptom doesn't mean the underlying problem is the same.
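
After rebooting with an edited kernel line, it can be worth confirming that the option actually reached the kernel. The running kernel's command line is exposed in /proc/cmdline; the sketch below checks it for a given parameter. The helper names `has_kernel_param` and `read_cmdline` are made up for illustration. Note that this only shows the option was passed at boot, not that the kernel recognized or acted on it.

```python
def has_kernel_param(cmdline, param):
    """Return True if `param` appears as a whole token in a kernel command line."""
    return param in cmdline.split()


def read_cmdline(path="/proc/cmdline"):
    """Read the command line the running kernel was booted with."""
    with open(path) as f:
        return f.read().strip()
```

On the booted system, `has_kernel_param(read_cmdline(), "pci=msi")` should be true if the GRUB edit took effect for that boot. Edits made with 'e' at the GRUB prompt apply to that boot only.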

#5 2010-03-18 00:09:46

lasu1
Member
Registered: 2010-02-10
Posts: 83

Ok, thanks for your interest, btw.

So, I tried to pm-suspend from the CLI -- before starting X (right after a cold boot). Same problem...very odd...as far as I can tell from digging through the logs, the system DOES resume successfully without X (or so it says in the log).  In this case, though, the "irq 18: nobody cared" message appears DURING the resume.  Odd...(EDIT to add: just to clarify the issue: the system does seem to resume; the problem is that the monitor is not getting a signal from the computer...again, very weird.)

Ok, so I had another freeze. I'm thinking that you might be right about NVIDIA...it crashed while I was in a game, using 3D. The crash didn't happen immediately; it happened about 10 minutes after starting the program.

So, I've booted with the pci=msi option. Everything seems fine, so far. Let's see how it goes.

Good point, re: Fedora.

EDIT: Alright, I'm going to confirm that the freeze seems to occur only after using some sort of 3D application. The finger points to NVIDIA....

Last edited by lasu1 (2010-03-18 16:23:53)

#6 2010-04-11 15:21:27

lasu1
Member
Registered: 2010-02-10
Posts: 83

This problem disappeared after updating to kernel 2.6.33 and the proprietary NVIDIA driver 195.36.15.
