#1 2009-12-28 18:05:38

castagnaru
Member
Registered: 2008-12-29
Posts: 4

Random kernel panics

Something is driving me crazy.

A week ago I decided to migrate from i686 to Arch64.
After installing everything I need, I started using my desktop as usual. The problem is that I began getting frequent random kernel panics (Caps Lock and Scroll Lock LEDs blinking), and no panic log is written that would let me investigate.
So I thought it could be something related to the x86_64 drivers. After two days with no results and more kernel panics, exasperated, I decided to go back to the "safe" i686. Here's the BIG problem: the random kernel panics still occur!
I really don't know how to diagnose the problem. Note that I used i686 Arch for more than a year and a half without a single panic.
Same hardware!

Here is some information that might help:

lspci

00:00.0 Host bridge: Intel Corporation 82G33/G31/P35/P31 Express DRAM Controller (rev 02)
00:01.0 PCI bridge: Intel Corporation 82G33/G31/P35/P31 Express PCI Express Root Port (rev 02)
00:1a.0 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #4 (rev 02)
00:1a.1 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #5 (rev 02)
00:1a.2 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #6 (rev 02)
00:1a.7 USB Controller: Intel Corporation 82801I (ICH9 Family) USB2 EHCI Controller #2 (rev 02)
00:1b.0 Audio device: Intel Corporation 82801I (ICH9 Family) HD Audio Controller (rev 02)
00:1c.0 PCI bridge: Intel Corporation 82801I (ICH9 Family) PCI Express Port 1 (rev 02)
00:1c.4 PCI bridge: Intel Corporation 82801I (ICH9 Family) PCI Express Port 5 (rev 02)
00:1c.5 PCI bridge: Intel Corporation 82801I (ICH9 Family) PCI Express Port 6 (rev 02)
00:1d.0 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #1 (rev 02)
00:1d.1 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #2 (rev 02)
00:1d.2 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #3 (rev 02)
00:1d.7 USB Controller: Intel Corporation 82801I (ICH9 Family) USB2 EHCI Controller #1 (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 92)
00:1f.0 ISA bridge: Intel Corporation 82801IB (ICH9) LPC Interface Controller (rev 02)
00:1f.2 IDE interface: Intel Corporation 82801IB (ICH9) 2 port SATA IDE Controller (rev 02)
00:1f.3 SMBus: Intel Corporation 82801I (ICH9 Family) SMBus Controller (rev 02)
00:1f.5 IDE interface: Intel Corporation 82801I (ICH9 Family) 2 port SATA IDE Controller (rev 02)
01:00.0 VGA compatible controller: nVidia Corporation G86 [GeForce 8400 GS] (rev a1)
03:00.0 IDE interface: JMicron Technology Corp. JMB368 IDE controller
04:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 01)
05:01.0 Ethernet controller: Marvell Technology Group Ltd. 88w8335 [Libertas] 802.11b/g Wireless (rev 03)

Here is an excerpt from kernel.log, written at boot (I can't paste the whole log because of the forum's 65k post limit). I didn't get this on the previous, working installation.

Dec 28 18:19:43 archbox kernel: ------------[ cut here ]------------                                                                                             
Dec 28 18:19:43 archbox kernel: WARNING: at kernel/irq/manage.c:272 __enable_irq+0x62/0xa0()                                                                     
Dec 28 18:19:43 archbox kernel: Hardware name: P35-S3G                                                                                                           
Dec 28 18:19:43 archbox kernel: Unbalanced enable for IRQ 16                                                                                                     
Dec 28 18:19:43 archbox kernel: Modules linked in: snd_hda_codec_realtek snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device fan snd_hda_intel battery snd_pcm_oss snd_mixer_oss snd_hda_codec snd_hwdep snd_pcm snd_timer ac nvidia(P) intel_agp i2c_i801 snd uhci_hcd ehci_hcd r8169 iTCO_wdt soundcore jmicron(+) i2c_core usbcore iTCO_vendor_support mii snd_page_alloc vboxdrv agpgart ide_core ppdev psmouse parport_pc serio_raw evdev coretemp button processor floppy thermal sg pcspkr lp parport it87 hwmon_vid rtc_cmos rtc_core rtc_lib ext3 jbd mbcache sd_mod ata_piix libata scsi_mod                                              
Dec 28 18:19:43 archbox kernel: Pid: 718, comm: modprobe Tainted: P           2.6.31-ARCH #1                                                                     
Dec 28 18:19:43 archbox kernel: Call Trace:                                                                                                                      
Dec 28 18:19:43 archbox kernel: [<c10464da>] ? warn_slowpath_common+0x7a/0xc0                                                                                    
Dec 28 18:19:43 archbox kernel: [<c10954a2>] ? __enable_irq+0x62/0xa0                                                                                            
Dec 28 18:19:43 archbox kernel: [<c1046597>] ? warn_slowpath_fmt+0x37/0x60                                                                                       
Dec 28 18:19:43 archbox kernel: [<c10954a2>] ? __enable_irq+0x62/0xa0                                                                                            
Dec 28 18:19:43 archbox kernel: [<c10959e4>] ? enable_irq+0x44/0xa0                                                                                              
Dec 28 18:19:43 archbox kernel: [<f84fe389>] ? ide_probe_port+0x159/0x690 [ide_core]                                                                             
Dec 28 18:19:43 archbox kernel: [<f84feb91>] ? ide_host_register+0x231/0x650 [ide_core]                                                                          
Dec 28 18:19:43 archbox kernel: [<f8504f58>] ? ide_pci_init_two+0x548/0x6b0 [ide_core]                                                                           
Dec 28 18:19:43 archbox kernel: [<c1108eff>] ? find_inode+0x4f/0xa0                                                                                              
Dec 28 18:19:43 archbox kernel: [<c11505f0>] ? sysfs_ilookup_test+0x0/0x30                                                                                       
Dec 28 18:19:43 archbox kernel: [<c11509a1>] ? sysfs_find_dirent+0x31/0x50                                                                                       
Dec 28 18:19:43 archbox kernel: [<c110892b>] ? iput+0x2b/0x70                                                                                                    
Dec 28 18:19:43 archbox kernel: [<c11510e6>] ? sysfs_addrm_finish+0x46/0x230                                                                                     
Dec 28 18:19:43 archbox kernel: [<c1150de5>] ? sysfs_addrm_start+0x65/0xe0                                                                                       
Dec 28 18:19:43 archbox kernel: [<f85050dd>] ? ide_pci_init_one+0x1d/0x40 [ide_core]                                                                             
Dec 28 18:19:43 archbox kernel: [<c11b138a>] ? local_pci_probe+0x1a/0x40                                                                                         
Dec 28 18:19:43 archbox kernel: [<c11b2641>] ? pci_device_probe+0x81/0xb0                                                                                        
Dec 28 18:19:43 archbox kernel: [<c1234869>] ? driver_probe_device+0x89/0x170                                                                                    
Dec 28 18:19:43 archbox kernel: [<c12349e1>] ? __driver_attach+0x91/0xa0                                                                                         
Dec 28 18:19:43 archbox kernel: [<c1234950>] ? __driver_attach+0x0/0xa0                                                                                          
Dec 28 18:19:43 archbox kernel: [<c1233ee2>] ? bus_for_each_dev+0x62/0xa0                                                                                        
Dec 28 18:19:43 archbox kernel: [<c1234692>] ? driver_attach+0x22/0x40                                                                                           
Dec 28 18:19:43 archbox kernel: [<c1234950>] ? __driver_attach+0x0/0xa0                                                                                          
Dec 28 18:19:43 archbox kernel: [<c123366e>] ? bus_add_driver+0xce/0x2b0
Dec 28 18:19:43 archbox kernel: [<c11b2520>] ? pci_device_remove+0x0/0x60
Dec 28 18:19:43 archbox kernel: [<c1234d5f>] ? driver_register+0x6f/0x130
Dec 28 18:19:43 archbox kernel: [<c1068706>] ? notifier_call_chain+0x46/0x80
Dec 28 18:19:43 archbox kernel: [<f8783000>] ? jmicron_ide_init+0x0/0x36 [jmicron]
Dec 28 18:19:43 archbox kernel: [<c11b2ad9>] ? __pci_register_driver+0x49/0xd0
Dec 28 18:19:43 archbox kernel: [<c100115b>] ? do_one_initcall+0x3b/0x1b0
Dec 28 18:19:43 archbox kernel: [<c107e675>] ? sys_init_module+0xe5/0x230
Dec 28 18:19:43 archbox kernel: [<c1003cb3>] ? sysenter_do_call+0x12/0x28
Dec 28 18:19:43 archbox kernel: ---[ end trace 0c7ff589307cc1fc ]---

This log is from boot time, but the random panics mostly happen several minutes later.
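
Since the whole log doesn't fit in a post, one option is to pull out only the headline lines and their line numbers, then look up the full traces as needed. A minimal sketch: the helper name kernel_warnings is made up here, and /var/log/kernel.log is an assumed location for the kernel.log mentioned above.

```shell
#!/bin/sh
# kernel_warnings [FILE...]
# Print only the warning headlines from a kernel log (or from stdin),
# prefixed with line numbers so the full trace can be found afterwards.
kernel_warnings() {
    grep -n -E 'WARNING: at |Unbalanced enable for IRQ|blocked for more than [0-9]+ seconds' "$@"
}

# Assumed log location; adjust to wherever syslog writes kernel messages.
log=/var/log/kernel.log
if [ -r "$log" ]; then
    kernel_warnings "$log" || echo "no matching lines in $log"
fi
```

The patterns are taken from the messages quoted in this thread, so other kinds of warnings would need their own pattern.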

If you need any other information, just ask.
Thanks in advance.

#2 2009-12-29 00:03:25

kjon
Member
From: Temuco, Chile
Registered: 2008-04-16
Posts: 398

Re: Random kernel panics

Yup, I see.
Try booting with irqpoll and see if that solves the problem. It isn't the best solution but, judging by the output, it looks like an IRQ issue. I don't know whether disabling SMP would help.
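
After editing the kernel line in GRUB, it is worth confirming that the option really reached the running kernel. A rough sketch, assuming the option is spelled irqpoll (the name used in the kernel's parameter documentation); has_boot_param is only an illustrative helper name.

```shell
#!/bin/sh
# has_boot_param PARAM [CMDLINE]
# Succeeds if PARAM appears as a whole word on the kernel command line.
# CMDLINE defaults to the running kernel's /proc/cmdline.
has_boot_param() {
    param=$1
    cmdline=${2:-$(cat /proc/cmdline 2>/dev/null || true)}
    for word in $cmdline; do
        [ "$word" = "$param" ] && return 0
    done
    return 1
}

if has_boot_param irqpoll; then
    echo "irqpoll is active"
else
    echo "irqpoll is not on the command line"
fi
```

The whole-word match is deliberate: a line containing only acpi=irqpoll would not count as having irqpoll set.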


They say that if you play a Win cd backward you hear satanic messages. That's nothing! 'cause if you play it forwards, it installs windows.

#3 2009-12-29 13:04:58

castagnaru
Member
Registered: 2008-12-29
Posts: 4

Re: Random kernel panics

It didn't work.
I added that option to the kernel line in GRUB, but I still get the same IRQ warning. I've also had another panic in the meantime.

I read up on that message, and someone on the Debian forum says it is harmless, but it is the only "strange" one I get. No log of the panic is ever produced.

Another anomalous message I get is this:

Dec 28 22:51:25 archbox kernel: INFO: task hald-addon-stor:1412 blocked for more than 120 seconds.                                                               
Dec 28 22:51:25 archbox kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.                                                        
Dec 28 22:51:25 archbox kernel: hald-addon-st D a0000004     0  1412   1375 0x00000000                                                                           
Dec 28 22:51:25 archbox kernel: f6832d60 00000086 00000000 a0000004 00000000 00000000 0000003e 00000000                                                          
Dec 28 22:51:25 archbox kernel: c13115c0 f672a000 00000000 c1489140 c1489140 f6832f08 c1484744 00000000                                                          
Dec 28 22:51:25 archbox kernel: 00000000 f6832f08 c1489140 c1489140 00043efe 00000001 c743134e f6832d60                                                          
Dec 28 22:51:25 archbox kernel: Call Trace:                                                                                                                      
Dec 28 22:51:25 archbox kernel: [<c1306545>] ? schedule_timeout+0x195/0x210                                                                                      
Dec 28 22:51:25 archbox kernel: [<c1053ac3>] ? lock_timer_base+0x33/0x70                                                                                         
Dec 28 22:51:25 archbox kernel: [<c1054132>] ? del_timer+0x52/0x70                                                                                               
Dec 28 22:51:25 archbox kernel: [<c1305583>] ? wait_for_common+0xa3/0x140                                                                                        
Dec 28 22:51:25 archbox kernel: [<c103d060>] ? default_wake_function+0x0/0x30                                                                                    
Dec 28 22:51:25 archbox kernel: [<c11897c0>] ? blk_execute_rq+0x90/0x100                                                                                         
Dec 28 22:51:25 archbox kernel: [<c1189620>] ? blk_end_sync_rq+0x0/0x50                                                                                          
Dec 28 22:51:25 archbox kernel: [<c1189a9a>] ? blk_recount_segments+0x2a/0x60                                                                                    
Dec 28 22:51:25 archbox kernel: [<c1189345>] ? blk_rq_map_kern+0xe5/0x140                                                                                        
Dec 28 22:51:25 archbox kernel: [<f9d87615>] ? ide_cd_queue_pc+0x105/0x1f0 [ide_cd_mod]                                                                          
Dec 28 22:51:25 archbox kernel: [<c1181afb>] ? __freed_request+0xdb/0x140                                                                                        
Dec 28 22:51:25 archbox kernel: [<c1181b93>] ? freed_request+0x33/0x80                                                                                           
Dec 28 22:51:25 archbox kernel: [<f9d87996>] ? cdrom_read_tocentry+0xb6/0xe0 [ide_cd_mod]                                                                        
Dec 28 22:51:25 archbox kernel: [<f9d87b64>] ? ide_cd_read_toc+0xf4/0x4b0 [ide_cd_mod]                                                                           
Dec 28 22:51:25 archbox kernel: [<f9d88600>] ? idecd_revalidate_disk+0x20/0x40 [ide_cd_mod]                                                                      
Dec 28 22:51:25 archbox kernel: [<f9d89081>] ? ide_cdrom_check_media_change_real+0x41/0x60 [ide_cd_mod]                                                          
Dec 28 22:51:25 archbox kernel: [<f9d6d0da>] ? media_changed+0x7a/0xc0 [cdrom]                                                                                   
Dec 28 22:51:25 archbox kernel: [<c1120e40>] ? check_disk_change+0x60/0x70                                                                                       
Dec 28 22:51:25 archbox kernel: [<f9d70440>] ? cdrom_open+0x230/0xae0 [cdrom]                                                                                    
Dec 28 22:51:25 archbox kernel: [<f8548c15>] ? do_ide_request+0x385/0x5c0 [ide_core]                                                                             
Dec 28 22:51:25 archbox kernel: [<c1306545>] ? schedule_timeout+0x195/0x210                                                                                      
Dec 28 22:51:25 archbox kernel: [<c1053ac3>] ? lock_timer_base+0x33/0x70                                                                                         
Dec 28 22:51:25 archbox kernel: [<c1054132>] ? del_timer+0x52/0x70                                                                                               
Dec 28 22:51:25 archbox kernel: [<c13055dc>] ? wait_for_common+0xfc/0x140                                                                                        
Dec 28 22:51:25 archbox kernel: [<c103d060>] ? default_wake_function+0x0/0x30                                                                                    
Dec 28 22:51:25 archbox kernel: [<c11897c0>] ? blk_execute_rq+0x90/0x100                                                                                         
Dec 28 22:51:25 archbox kernel: [<c1189620>] ? blk_end_sync_rq+0x0/0x50                                                                                          
Dec 28 22:51:25 archbox kernel: [<c11828bd>] ? get_request+0x35d/0x3b0                                                                                           
Dec 28 22:51:25 archbox kernel: [<c1181afb>] ? __freed_request+0xdb/0x140                                                                                        
Dec 28 22:51:25 archbox kernel: [<c1306e44>] ? __mutex_lock_slowpath+0x1f4/0x2e0                                                                                 
Dec 28 22:51:25 archbox kernel: [<c11980ed>] ? kobject_get+0x1d/0x40                                                                                             
Dec 28 22:51:25 archbox kernel: [<c1306e44>] ? __mutex_lock_slowpath+0x1f4/0x2e0                                                                                 
Dec 28 22:51:25 archbox kernel: [<c11980ed>] ? kobject_get+0x1d/0x40                                                                                             
Dec 28 22:51:25 archbox kernel: [<f9d866d1>] ? idecd_open+0xa1/0xc0 [ide_cd_mod]                                                                                 
Dec 28 22:51:25 archbox kernel: [<c112206e>] ? __blkdev_get+0x7e/0x320                                                                                           
Dec 28 22:51:25 archbox kernel: [<c10fef9d>] ? __link_path_walk+0x6bd/0xd50                                                                                      
Dec 28 22:51:25 archbox kernel: [<c11223a8>] ? blkdev_open+0x68/0xd0                                                                                             
Dec 28 22:51:25 archbox kernel: [<c10f037d>] ? __dentry_open+0xfd/0x2c0                                                                                          
Dec 28 22:51:25 archbox kernel: [<c1122340>] ? blkdev_open+0x0/0xd0                                                                                              
Dec 28 22:51:25 archbox kernel: [<c10f066d>] ? nameidata_to_filp+0x6d/0x80                                                                                       
Dec 28 22:51:25 archbox kernel: [<c1100e41>] ? do_filp_open+0x5b1/0x970                                                                                          
Dec 28 22:51:25 archbox kernel: [<c1306e44>] ? __mutex_lock_slowpath+0x1f4/0x2e0                                                                                 
Dec 28 22:51:25 archbox kernel: [<c110bb7d>] ? alloc_fd+0xcd/0x120                                                                                               
Dec 28 22:51:25 archbox kernel: [<c10f0079>] ? do_sys_open+0x69/0x150                                                                                            
Dec 28 22:51:25 archbox kernel: [<c106e699>] ? do_gettimeofday+0x19/0x50                                                                                         
Dec 28 22:51:25 archbox kernel: [<c10f0208>] ? sys_open+0x38/0x60                                                                                                
Dec 28 22:51:25 archbox kernel: [<c1003cb3>] ? sysenter_do_call+0x12/0x28

Sometimes it happens every two minutes. It doesn't crash the system, but there is no trace of it in the logs from the previous installation.
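
The D after the task name in that report is the task state, uninterruptible sleep, and the trace shows it waiting on a request to the IDE CD driver. To see whether other tasks are stuck the same way, the process list can be filtered by state. A small sketch: d_state_tasks is a made-up helper name, and the sysctl path is the one quoted in the log message itself.

```shell
#!/bin/sh
# Show the current hung-task timeout if the kernel exposes it.
timeout_file=/proc/sys/kernel/hung_task_timeout_secs
if [ -r "$timeout_file" ]; then
    echo "hung_task_timeout_secs: $(cat "$timeout_file")"
fi

# d_state_tasks: read "PID STAT COMMAND" lines on stdin and print the
# PID and first word of the command for every task whose state starts with D.
d_state_tasks() {
    awk '$2 ~ /^D/ { print $1, $3 }'
}

if command -v ps >/dev/null 2>&1; then
    ps -eo pid,stat,comm | d_state_tasks
fi
```

A task that shows up here briefly is normal; one that stays listed across several runs is likely the one the kernel is complaining about.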

Sadly, I don't understand why I never had this kind of problem before, with the same distro, the same hardware and the same kernel! Maybe a hardware component has failed?

EDIT: it just happened with mkinitcpio too:

INFO: task mkinitcpio:3406 blocked for more than 120 seconds.                                                                                                    
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.                                                                                        
mkinitcpio    D cda4802c     0  3406   3405 0x00000004                                                                                                           
 cda4c200 00000086 2871814a cda4802c cda4802c 7e268f9b 0000001f 00000a00                                                                                         
 00000002 f6689c00 000001a5 c1489140 c1489140 cda4c3a8 c1484744 12220066                                                                                         
 000001a5 cda4c3a8 c1489140 c1489140 f6664e54 00000001 dd3e773b f9d5ede0                                                                                         
Call Trace:                                                                                                                                                      
 [<c1306d62>] ? __mutex_lock_slowpath+0x112/0x2e0                                                                                                                
 [<c1306f48>] ? mutex_lock+0x18/0x40                                                                                                                             
 [<c1122027>] ? __blkdev_get+0x37/0x320                                                                                                                          
 [<c10fef9d>] ? __link_path_walk+0x6bd/0xd50                                                                                                                     
 [<c11223a8>] ? blkdev_open+0x68/0xd0                                                                                                                            
 [<c10f037d>] ? __dentry_open+0xfd/0x2c0                                                                                                                         
 [<c1122340>] ? blkdev_open+0x0/0xd0                                                                                                                             
 [<c10f066d>] ? nameidata_to_filp+0x6d/0x80
 [<c1100e41>] ? do_filp_open+0x5b1/0x970
 [<c10d56d1>] ? __do_fault+0x351/0x430
 [<c110bb7d>] ? alloc_fd+0xcd/0x120
 [<c10f0079>] ? do_sys_open+0x69/0x150
 [<c10f0208>] ? sys_open+0x38/0x60
 [<c1003cb3>] ? sysenter_do_call+0x12/0x28

Last edited by castagnaru (2009-12-29 13:28:44)

#4 2009-12-29 15:41:41

lilsirecho
Veteran
Registered: 2003-10-24
Posts: 5,000

Re: Random kernel panics

Perhaps your RAM...


Prediction...This year will be a very odd year!
Hard work does not kill people but why risk it: Charlie Mccarthy
A man is not complete until he is married..then..he is finished.
When ALL is lost, what can be found? Even bytes get lonely for a little bit!     X-ray confirms Iam spineless!

#5 2009-12-30 14:52:29

castagnaru
Member
Registered: 2008-12-29
Posts: 4

Re: Random kernel panics

I already ran a memory check, and everything seems to work fine.

But I may have solved it.

The cause shows up just one line before the call trace for the IRQ warning:

...
Dec 28 18:19:43 archbox kernel: input: HDA Digital PCBeep as /devices/pci0000:00/0000:00:1b.0/input/input6                                                       
Dec 28 18:19:43 archbox kernel: hda: DVD DC DW1670, ATAPI CD/DVD-ROM drive                                                                                       
Dec 28 18:19:43 archbox kernel: ------------[ cut here ]------------                                                                                             
Dec 28 18:19:43 archbox kernel: WARNING: at kernel/irq/manage.c:272 __enable_irq+0x62/0xa0()                                                                     
Dec 28 18:19:43 archbox kernel: Hardware name: P35-S3G                                                                                                           
Dec 28 18:19:43 archbox kernel: Unbalanced enable for IRQ 16
...
Dec 28 18:19:43 archbox kernel: [<c107e675>] ? sys_init_module+0xe5/0x230
Dec 28 18:19:43 archbox kernel: [<c1003cb3>] ? sysenter_do_call+0x12/0x28
Dec 28 18:19:43 archbox kernel: ---[ end trace 0c7ff589307cc1fc ]---
Dec 28 18:19:43 archbox kernel: hda: host max PIO5 wanted PIO255(auto-tune) selected PIO4
Dec 28 18:19:43 archbox kernel: hda: UDMA/66 mode selected
Dec 28 18:19:43 archbox kernel: Probing IDE interface ide1...
Dec 28 18:19:43 archbox kernel: ide0 at 0xc000-0xc007,0xc102 on irq 16
Dec 28 18:19:43 archbox kernel: ide1 at 0xc200-0xc207,0xc302 on irq 16
Dec 28 18:19:43 archbox kernel: ide-cd driver 5.00
Dec 28 18:19:43 archbox kernel: ide-cd: hda: ATAPI 48X DVD-ROM DVD-R/RAM CD-R/RW drive, 2048kB Cache
Dec 28 18:19:43 archbox kernel: Uniform CD-ROM driver Revision: 3.20
...

It refers to the DVD burner on the IDE bus. I forgot to tell you that I had removed the pata and scsi hooks from the initcpio. I never mentioned it because I was SURE it wasn't the problem. Now this changes everything.
Putting the two hooks back seems to solve the problem: at least, I no longer get that creepy log and, yep, no more kernel panics.
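
For anyone checking their own setup, the hook list can be read straight out of the mkinitcpio config. A minimal sketch, assuming the config lives at /etc/mkinitcpio.conf and has a single uncommented HOOKS= line in either the quoted-string or the parenthesised form; has_hook is a hypothetical helper name.

```shell
#!/bin/sh
# has_hook HOOK [CONF]
# Succeeds if HOOK is listed in the HOOKS line of a mkinitcpio config.
# CONF defaults to /etc/mkinitcpio.conf (assumed location).
has_hook() {
    hook=$1
    conf=${2:-/etc/mkinitcpio.conf}
    # Take the last uncommented HOOKS= line and strip the quotes or parentheses.
    hooks=$(sed -n 's/^HOOKS=["(]\(.*\)[")].*$/\1/p' "$conf" 2>/dev/null | tail -n 1)
    for h in $hooks; do
        [ "$h" = "$hook" ] && return 0
    done
    return 1
}

for hook in pata scsi sata; do
    if has_hook "$hook"; then
        echo "$hook: present"
    else
        echo "$hook: missing"
    fi
done
```

After changing HOOKS the image has to be regenerated as root. On a 2.6.31-ARCH system like the one in the logs that would presumably be `mkinitcpio -p kernel26`, but the preset name is an assumption, so check /etc/mkinitcpio.d/ for the actual one.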

But this experience has taught me that maybe I don't understand something about initcpio. I read the whole wiki page (http://wiki.archlinux.org/index.php/Mkinitcpio) and then removed those two hooks, because my root partition (and, in general, every system partition) is on a SATA2 drive. I also have this IDE DVD burner but, I guess, it isn't needed for the boot process. I'm a bit confused.
Could someone explain where I went wrong? (Then I'll mark this topic as solved.)
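
One way to see which driver ended up owning the JMicron controller is `lspci -k`, which prints a "Kernel driver in use" line for each device. A small sketch, assuming a pciutils recent enough to support -k; the slot 03:00.0 comes from the lspci listing in the first post, and driver_in_use is just a helper name made up here. The trace in the first post shows the legacy jmicron module from ide_core being probed. Whether the libata driver pata_jmicron takes over when the pata hook is present is an assumption worth checking this way, not something the logs in this thread prove.

```shell
#!/bin/sh
# driver_in_use: read `lspci -k` output on stdin and print the name
# from each "Kernel driver in use:" line.
driver_in_use() {
    sed -n 's/^[[:space:]]*Kernel driver in use: //p'
}

# Slot of the JMB368 IDE controller as listed in the first post.
slot=03:00.0
if command -v lspci >/dev/null 2>&1; then
    lspci -k -s "$slot" | driver_in_use
fi
```

Comparing the output with and without the two hooks in the image would show whether the hooks change which driver binds to the controller.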

Last edited by castagnaru (2009-12-30 14:54:12)
