Hi,
For a couple of weeks now, my system has been failing to boot. After the "Loading modules" message, instead of the usual change of resolution (KMS?), I get a series of messages ending with something like this, and all the LEDs blink:
[ 5.xxxxx] [<ffffffff813e8ec2>] ? system_call_fastpath+0x16/0x1b
I then shut down my laptop (Dell Studio 15) with the power button and boot again, and it boots without any problem.
I did not find these messages in the logs, so I cannot paste them. For the last several days the problem has occurred on almost every boot.
Where should I look to get more information on this issue (there is nothing in the logs)? I don't know whether it is kernel related or something else: init scripts, udev?
Last edited by boulde (2011-08-24 21:51:06)
Offline
That looks like a hard kernel lock, you might want to search the kernel bugtracker for similar reports: https://bugzilla.kernel.org
ᶘ ᵒᴥᵒᶅ
Offline
After 2 days without the issue, yesterday it took 2 attempts before I could boot, and today there was no problem ...
I looked at the kernel bugzilla, but I don't know how to diagnose the problem or how to find a similar report.
All I can say is that it happens between the "Loading Modules" and "Waiting udev events to be processed" messages.
Offline
It will be hard to diagnose without the whole backtrace. Take a picture or something.
Offline
you're right, I will take a photo next time.
Offline
here is the picture: http://pix.toile-libre.org/?img=1311968785.jpg
Offline
Sadly that isn't the entire backtrace, the top got cut off. Can you add vga=791 or something to the kernel command line to increase the size of the console?
Offline
with vga=791: http://pix.toile-libre.org/?img=1312028043.jpg
But it still isn't the entire backtrace, I guess?
Offline
Wow, that is the longest backtrace I've ever seen. If you have another computer, you could use netconsole to get the whole trace.
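A rough sketch of what a netconsole setup could look like. The interface name eth0, the receiver address 10.0.0.2 and the two helper function names are placeholders made up for illustration, not something taken from this machine. The helpers only print the configuration; they do not load anything.

```shell
# Build the netconsole= parameter. The kernel's format is
#   [src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-macaddr]
# Fields left empty fall back to kernel defaults (target port 6666, broadcast MAC).
netconsole_param() {
    dev=$1 target_ip=$2 target_port=${3:-}
    printf '@/%s,%s@%s/\n' "$dev" "$target_port" "$target_ip"
}

# Print an "options" line for a file under /etc/modprobe.d/, so the
# parameter is applied whenever the netconsole module gets loaded.
netconsole_options_line() {
    printf 'options netconsole netconsole=%s\n' "$(netconsole_param "$@")"
}

# Placeholder values: local interface eth0, receiving machine 10.0.0.2.
netconsole_options_line eth0 10.0.0.2

# On the receiving machine, listen for UDP on port 6666, e.g. with netcat:
#   nc -u -l -p 6666      (some netcat variants want: nc -u -l 6666)
```

Whether this comes up early enough to catch a crash during module loading depends on when the network driver and netconsole themselves get loaded, so treat it as a starting point only.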
Offline
After 2 days without the issue, yesterday it took 2 attempts before I could boot, and today there was no problem ...
I looked at the kernel bugzilla, but I don't know how to diagnose the problem or how to find a similar report.
All I can say is that it happens between the "Loading Modules" and "Waiting udev events to be processed" messages.
This might mean that the loading of a module is the culprit. To figure out which module it is, I suggest doing something like this:
Boot successfully, and note down all your loaded modules ("lsmod") (tedious, I know, but I couldn't think of a better way).
Boot with a shell as your init (in this way you don't run any of the initscripts): add "init=/bin/bash" to your kernel command line.
Once booted, start udev: "/sbin/udevd --daemon"
I assume that it has not crashed yet. If starting udev causes a crash, try again without starting udev. We will NOT trigger udev to load modules automatically, but rather load them manually.
Go through your list of modules and do "modprobe <modulename>" for each of them. Hopefully one of them will cause a crash, and then at least you know who the culprit is, and you can pass this on to lkml, or look through the kernel bugzilla for related problems.
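The steps above could be sketched roughly like this. The file /root/modules.txt, the function names and the MODPROBE override variable are all made up for illustration. The list has to be saved during a good boot, since the root filesystem may well be read-only in the init=/bin/bash shell.

```shell
# Keep only the module names from lsmod output (the first line is a header).
module_names() {
    awk 'NR > 1 { print $1 }'
}

# Load modules one at a time, announcing each one first, so that the last
# name on screen before a crash points at the suspect.
# MODPROBE is a hypothetical override, handy for a dry run (MODPROBE=echo).
load_each() {
    while read -r mod; do
        echo "loading $mod"
        "${MODPROBE:-modprobe}" "$mod" || echo "modprobe failed for $mod"
    done
}

# On a good boot:                 lsmod | module_names > /root/modules.txt
# In the init=/bin/bash shell:    load_each < /root/modules.txt
```

Running it once with MODPROBE=echo first shows what would be loaded without touching anything.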
Offline
Thanks all for your answers. I will try to explore this before going on holiday!
For netconsole, I'm not sure I understand how to use it. Should I put «modprobe netconsole netconsole="@/,@10.0.0.2/;@/eth1,6892@10.0.0.3/"» in modprobe.conf?
Offline
OK, I booted with init=/bin/bash and played with modprobe.
modprobe ite_cir gave me segfaults several times and once a backtrace (and that time modprobe -r did not return control).
So I will blacklist ite_cir and see what happens (it seems to be related to infrared remote control, which I don't use).
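For reference, a small sketch of the blacklist entry and a check that the module really stays out after the next boot. The helper names and the file name blacklist-ite_cir.conf are only examples. Note that lsmod prints module names with underscores, and that a blacklist entry only stops automatic loading: an explicit modprobe, or another module depending on it, can still pull it in.

```shell
# Print the modprobe.d line that blacklists a module.
blacklist_line() {
    printf 'blacklist %s\n' "$1"
}

# Succeed if the given module appears in lsmod output read from stdin.
is_loaded() {
    awk -v m="$1" 'NR > 1 && $1 == m { found = 1 } END { exit !found }'
}

# As root (example file name):
#   blacklist_line ite_cir > /etc/modprobe.d/blacklist-ite_cir.conf
# After rebooting:
#   lsmod | is_loaded ite_cir && echo "ite_cir is still loaded"
```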
Offline
@boulde: I should have asked earlier: are you using the kernel from [testing] or [core]? Could you try to reproduce with the other one?
I think it might be worth reporting this to b.k.o.
Offline
I use the kernel from [core] and have not tried 3.0 (as I am going on holiday in 2 days, without internet, I prefer not to update to [testing] now).
Offline
Hi,
After blacklisting ite_cir, I still have the bug: http://pix.toile-libre.org/?img=1312633790.jpg
This time with the message "kernel panic - not syncing: Fatal exception in interrupt".
Offline
That looks like something to send upstream. However, 2.6.39 is EOL now, so I guess it would be more interesting if you were able to reproduce with 3.0.1 (testing)...
Offline
Yes, I am waiting for the upgrade to 3.0 before reporting it.
Offline
Hi,
After some holidays :-) and a few days of testing the 3.0 kernel, I have not seen the bug!
Thanks all for your help.
Offline
boulde, please mark the thread as [SOLVED] by editing the title of your first post.
Edit: Oh, I see you marked it [fixed]. Ok so.
Last edited by bernarcher (2011-08-24 22:09:55)
To know or not to know ...
... the questions remain forever.
Offline