#1 2009-06-30 19:35:38

antis
Member
From: sweden
Registered: 2007-05-18
Posts: 108

Help with a Kernel Oops that I don't understand

I'm in the process of installing Arch on a new setup based on a Zotac-IONITX-b motherboard.

I have nearly finished the initial setup of Arch, but I am running into something I can't make sense of: a kernel Oops.

This is the output from dmesg as it happens:

BUG: unable to handle kernel paging request at 000d8a0b
IP: [<c016bfbc>] m_show+0x9c/0x1a0
*pde = 00000000 
Oops: 0000 [#1] PREEMPT SMP 
last sysfs file: /sys/module/mbcache/initstate
Modules linked in: ext3 jbd ipt_REJECT xt_tcpudp nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack iptable_filter ip_tables x_tables arc4 ecb snd_hda_codec_nvhdmi rt73usb crc_itu_t rt2x00usb rt2x00lib snd_hda_codec_realtek led_class snd_seq_dummy input_polldev lirc_mceusb2 lirc_dev snd_seq_oss mac80211 snd_seq_midi_event snd_hda_intel usbhid snd_seq snd_seq_device snd_hda_codec hid nvidia(P) cfg80211 snd_pcm_oss snd_mixer_oss snd_hwdep agpgart snd_pcm snd_timer ohci_hcd snd soundcore i2c_nforce2 shpchp ehci_hcd psmouse snd_page_alloc pci_hotplug pcspkr usbcore sg serio_raw i2c_core forcedeth wmi evdev thermal processor fan button battery ac rtc_cmos rtc_core rtc_lib ext2 mbcache sd_mod pata_acpi ata_generic ahci libata scsi_mod

Pid: 1947, comm: lsmod Tainted: P           (2.6.30-ARCH #1) To Be Filled By O.E.M.
EIP: 0060:[<c016bfbc>] EFLAGS: 00010282 CPU: 0
EIP is at m_show+0x9c/0x1a0
EAX: 00000000 EBX: f6a615a0 ECX: 00000e76 EDX: 00000000
ESI: f98de6e0 EDI: 000d8a0b EBP: f98de810 ESP: f69edeb4
 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Process lsmod (pid: 1947, ti=f69ec000 task=f6556800 task.ti=f69ec000)
Stack:
 f6a615a0 c0457a4c f9b76bec 00011824 f98de6e4 f69edf30 c03d28a8 88e01f5c
 c03e0768 f6a615a0 f98de6e4 f69edf30 c01ee703 c012d160 00000000 00000174
 00000000 b807c014 f6a615c8 000003ec 00000014 f65b2780 f69edf90 0000001f
Call Trace:
 [<c03d28a8>] ? mutex_lock+0x18/0x40
 [<c01ee703>] ? seq_read+0x263/0x470
 [<c012d160>] ? __wake_up+0x50/0x80
 [<c01ee4a0>] ? seq_read+0x0/0x470
 [<c0219b59>] ? proc_reg_read+0x79/0xc0
 [<c01d1923>] ? vfs_read+0xc3/0x1a0
 [<c0219ae0>] ? proc_reg_read+0x0/0xc0
 [<c01d1b08>] ? sys_read+0x58/0xb0
 [<c0103c73>] ? sysenter_do_call+0x12/0x28
Code: 10 31 c0 81 c5 2c 01 00 00 39 ef 74 2d 66 90 8b 47 08 89 1c 24 c7 44 24 04 4c 7a 45 c0 83 c0 0c 89 44 24 08 e8 d6 22 08 00 8b 3f <8b> 07 0f 18 00 90 39 ef 75 da b8 01 00 00 00 8b 96 d4 00 00 00 
EIP: [<c016bfbc>] m_show+0x9c/0x1a0 SS:ESP 0068:f69edeb4
CR2: 00000000000d8a0b
---[ end trace a2e72733227d6c47 ]---
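
For anyone reading along, the headline fields of a dump like the one above can be pulled out with a few regular expressions. This is only a minimal sketch: the function name `summarize_oops` and the choice of fields are my own assumptions, and the patterns are written against the 32-bit x86 layout shown in this post, not against every oops format.

```python
import re

def summarize_oops(text):
    """Extract a few headline fields from an x86 oops dump (hypothetical helper)."""
    patterns = {
        # Address the kernel failed to read (also reported as CR2).
        "fault_address": r"unable to handle kernel paging request at ([0-9a-fA-F]+)",
        # Function and offset where the fault happened.
        "ip_symbol": r"EIP is at (\S+)",
        # Userspace process that was running at the time.
        "process": r"Pid: \d+, comm: (\S+)",
        # Taint flags; "P" means a proprietary module is loaded.
        "taint": r"Tainted: (\S+)",
        # Kernel release string printed in parentheses after the taint flags.
        "kernel": r"Tainted: \S+\s+\(([^ ]+)",
    }
    summary = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, text)
        summary[name] = match.group(1) if match else None
    return summary

# A shortened copy of the lines from the dump above.
sample = """BUG: unable to handle kernel paging request at 000d8a0b
EIP is at m_show+0x9c/0x1a0
Pid: 1947, comm: lsmod Tainted: P           (2.6.30-ARCH #1) To Be Filled By O.E.M.
"""

print(summarize_oops(sample))
```

As far as I know, `m_show` is the routine in kernel/module.c that formats each line of /proc/modules, which would be consistent with lsmod (which reads that file) triggering the Oops.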

My kernel skills are next to zero, so I don't even know where to begin looking for things to fix. The only thing I know is that if I set the runlevel to 5 at boot, I get the Oops straight away. It appears that xorg (with the nvidia module) starts loading but only gets as far as writing about half of a decent log file. And when I then try to run lsmod, the terminal just hangs.

Runlevel 3 boots fine without the Oops, but as soon as I try to do something module-related (like lsmod again) the Oops is right there.

Any ideas what to do about this, or am I down the reinstall route again?

#2 2009-07-01 01:18:20

boris
Member
Registered: 2006-02-25
Posts: 20

Re: Help with a Kernel Oops that I don't understand

Interesting. You have a lot of modules loaded in the kernel. Is it possible to alter your rc.conf to minimize the number of modules being loaded, so that only the ones you really need are loaded? Does modprobe cause the oops as well? If so, after altering the MODULES array in rc.conf, reboot the machine and see whether the oops still occurs. If it doesn't, start slowly loading the other modules back in to see which one is causing it.
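
The "load them back in slowly" idea can be sped up by halving the candidate list each round instead of adding one module at a time. The sketch below is only an illustration under strong assumptions: exactly one module is at fault, any subset can be loaded on its own (real modules have dependencies, so this will not hold in general), and `triggers_oops` is a hypothetical callback standing in for "reboot with only these modules and check dmesg by hand".

```python
def find_culprit(modules, triggers_oops):
    """Bisect a module list, assuming exactly one module triggers the oops.

    triggers_oops(subset) is a hypothetical callback that returns True when
    booting with only `subset` loaded reproduces the oops.
    """
    candidates = list(modules)
    while len(candidates) > 1:
        middle = len(candidates) // 2
        first_half = candidates[:middle]
        if triggers_oops(first_half):
            # The culprit is somewhere in the first half.
            candidates = first_half
        else:
            # Otherwise it must be in the remaining modules.
            candidates = candidates[middle:]
    return candidates[0] if candidates else None

# Demo with a fake check; "forcedeth" is an arbitrary stand-in, not a diagnosis.
loaded = ["snd_hda_intel", "rt73usb", "nvidia", "forcedeth", "lirc_mceusb2"]
culprit = find_culprit(loaded, lambda subset: "forcedeth" in subset)
print(culprit)
```

With around 70 modules this needs roughly seven reboots instead of dozens, provided the single-culprit assumption holds.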

#3 2009-07-01 06:11:36

antis
Member
From: sweden
Registered: 2007-05-18
Posts: 108

Re: Help with a Kernel Oops that I don't understand

My MODULES array in rc.conf is actually completely empty, so at the moment everything you see under "Modules linked in" is loaded automatically.
I guess I can list them all, put a "!" in front of each, and start from there. My main problem then is knowing which modules are absolutely necessary for the PC to boot at all. :)
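
Typing "!" in front of some 70 module names by hand is error-prone, so here is a small sketch that builds the rc.conf line from the "Modules linked in" text. The helper name `blacklist_line` is made up, and the `keep` set in the demo is only a guess at what a SATA disk with an ext3 root needs; it is not a verified list for this board.

```python
def blacklist_line(modules_linked_in, keep):
    """Build an rc.conf MODULES line that blacklists everything not in `keep`.

    modules_linked_in: the text after "Modules linked in:" from the oops.
    keep: module names to leave alone (an assumption the user must supply).
    """
    # Drop taint annotations such as the "(P)" in "nvidia(P)".
    names = [entry.split("(")[0] for entry in modules_linked_in.split()]
    # Kept modules are simply omitted so they continue to autoload.
    blacklisted = ["!" + name for name in names if name not in keep]
    return "MODULES=(" + " ".join(blacklisted) + ")"

# Shortened module list for the demo; paste the full line in real use.
linked = "ext3 jbd rt73usb nvidia(P) snd_hda_intel ahci libata scsi_mod"
keep = {"ext3", "jbd", "ahci", "libata", "scsi_mod"}  # guessed essentials
line = blacklist_line(linked, keep)
print(line)
```

Removing the "!" from a few entries at a time and rebooting then brings the modules back in stages.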

I haven't tried modprobing anything yet, so I don't know if that also causes problems.

But thanks for the suggestion, boris. I'll try this later today when I get home from work.

#4 2009-07-05 18:29:54

antis
Member
From: sweden
Registered: 2007-05-18
Posts: 108

Re: Help with a Kernel Oops that I don't understand

I'm still having problems with this. :(
I don't really know where to begin when it comes to preventing modules from being loaded. Which ones are safe to remove, etc.?

This is what I get at boot when trying to go straight into X.
Can anyone make something of the dump? Any ideas about what might be wrong are much appreciated.

BUG: unable to handle kernel paging request at 000d8a0b
IP: [<c016bfbc>] m_show+0x9c/0x1a0
*pde = 00000000 
Oops: 0000 [#1] PREEMPT SMP 
last sysfs file: /sys/devices/pci0000:00/0000:00:10.0/0000:03:00.0/resource
Modules linked in: ext3 jbd ipt_REJECT xt_tcpudp nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack iptable_filter ip_tables x_tables arc4 ecb snd_hda_codec_nvhdmi rt73usb crc_itu_t rt2x00usb rt2x00lib led_class lirc_mceusb2 snd_seq_dummy snd_hda_codec_realtek input_polldev lirc_dev snd_seq_oss joydev mac80211 snd_seq_midi_event snd_seq usbhid snd_hda_intel snd_seq_device hid snd_hda_codec cfg80211 nvidia(P) snd_pcm_oss snd_mixer_oss snd_hwdep snd_pcm snd_timer agpgart ohci_hcd snd psmouse ehci_hcd soundcore shpchp serio_raw i2c_nforce2 pcspkr usbcore snd_page_alloc pci_hotplug sg forcedeth i2c_core wmi evdev thermal processor fan button battery ac rtc_cmos rtc_core rtc_lib ext2 mbcache sd_mod pata_acpi ata_generic ahci libata scsi_mod

Pid: 2023, comm: X Tainted: P           (2.6.30-ARCH #1) To Be Filled By O.E.M.
EIP: 0060:[<c016bfbc>] EFLAGS: 00210282 CPU: 0
EIP is at m_show+0x9c/0x1a0
EAX: 00000000 EBX: f73a5540 ECX: 00000e31 EDX: 00000000
ESI: f98ac6e0 EDI: 000d8a0b EBP: f98ac810 ESP: ec821eb4
 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Process X (pid: 2023, ti=ec820000 task=ec96bc00 task.ti=ec820000)
Stack:
 f73a5540 c0457a4c f9bfcc2c 00011824 f98ac6e4 ec821f30 c03d28a8 9f0f8b2b
 c03e0768 f73a5540 f98ac6e4 ec821f30 c01ee703 00000000 9f0f8b2b 000001b9
 00000000 b7f54014 f73a5568 000003ec 00000014 ec990900 ec821f90 00000021
Call Trace:
 [<c03d28a8>] ? mutex_lock+0x18/0x40
 [<c01ee703>] ? seq_read+0x263/0x470
 [<c01ee4a0>] ? seq_read+0x0/0x470
 [<c0219b59>] ? proc_reg_read+0x79/0xc0
 [<c01d1923>] ? vfs_read+0xc3/0x1a0
 [<c0219ae0>] ? proc_reg_read+0x0/0xc0
 [<c01d1b08>] ? sys_read+0x58/0xb0
 [<c0103c73>] ? sysenter_do_call+0x12/0x28
Code: 10 31 c0 81 c5 2c 01 00 00 39 ef 74 2d 66 90 8b 47 08 89 1c 24 c7 44 24 04 4c 7a 45 c0 83 c0 0c 89 44 24 08 e8 d6 22 08 00 8b 3f <8b> 07 0f 18 00 90 39 ef 75 da b8 01 00 00 00 8b 96 d4 00 00 00 
EIP: [<c016bfbc>] m_show+0x9c/0x1a0 SS:ESP 0068:ec821eb4
CR2: 00000000000d8a0b
---[ end trace c0390704e02a5e60 ]---

Edit:
I hope this is solved now. I couldn't figure out which module was acting up, so I reinstalled and slowly started installing and activating my devices. As soon as I reached the point of installing X, and specifically the nvidia drivers, it all went bonkers again.

I uninstalled the nvidia package and built the beta drivers from AUR, but had the same result. I then turned to nvidia directly and installed their driver (the .run file), and after that everything seems to run smoothly.

So there is probably something in the pacman package that doesn't work with my Zotac IONITX board.

Edit again:
The error is back and I have filed a bug report with nvidia. I'm keeping my fingers crossed that they can solve this for me.

Last edited by antis (2009-07-28 14:55:34)
