
#1 2005-01-05 14:54:45

jp_fielding
Member
Registered: 2004-08-28
Posts: 85

smp issue with 2.6.10?

i'm using a dell 5150 laptop (p4 HT), and the system seems to be locking up randomly (or at least not in any way i can reproduce yet).

it's only been happening since the 2.6.10 upgrade, which also required updating nvidia (package) and ndiswrapper (compiled from sourceforge).  at first everything appears fine, then after some amount of use both 'processors' get pegged.

if i stop syslog-ng, that unfreezes one processor, but the other could fry an egg.

the runaway process is listed in top as 'events/1'.

does anybody know of issues related to smp and the anticipatory scheduler?  that's just a guess as to what is occurring; i'm going to try a different scheduler and see what happens.

i have googled those terms and didn't find much related to my experience. 

thanks for any suggestions/help in advance.

jp

:-| have a day


#2 2005-01-05 16:39:16

phrakture
Arch Overlord
From: behind you
Registered: 2003-10-29
Posts: 7,879

Re: smp issue with 2.6.10?

the newsletter has an article about switching the IO scheduler... if you want to try that...

I have a similar setup... p4/HT laptop running nvidia and ndiswrapper (bcmwl5a driver)...

I get some odd hangs and lockups, but haven't had time to diagnose... so I just reboot on a hardlock and continue doing what I was doing.... I thought mine was related to firefox until I read this post....

I'll do some fidgeting tonight and get back to you...


#3 2005-01-05 16:45:34

jp_fielding
Member
Registered: 2004-08-28
Posts: 85

Re: smp issue with 2.6.10?

well, i can't say that it's not related to firefox; all i know is that the issue only started with the 2.6.10 update. i almost always have a browser open, so that certainly could be the problem.

here's a complete left-field guess... it's something with network io and the anticipatory scheduler on smp machines: the problem happens, syslog gets swamped and the system comes down.  it hasn't occurred since my post ( :-) ), but i have added a parameter to the kernel boot line to try cfq on the next boot.


#4 2005-01-05 17:07:36

jp_fielding
Member
Registered: 2004-08-28
Posts: 85

Re: smp issue with 2.6.10?

figures, as i hit submit on my last post, my system came down, obviously while i was using firefox.  after the reboot i had (i thought) set the elevator=cfq parameter to try a different scheduler (tell me that's wrong and i'm just an idiot, please). all that i opened was vnc, and boom!

# (0) Arch Linux
title  Arch Linux  [/boot/vmlinuz26]
root   (hd0,4)
kernel (hd0,2)/vmlinuz26 root=/dev/discs/disc0/part5 ro vga=0x346 devfs=nomount elevator=cfq
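
One way to settle the "did i really set it?" doubt is to compare what the kernel was booted with against what the block layer reports. The sketch below is a minimal Python illustration, not something from the thread: the function names are hypothetical, the sample scheduler listing is an assumed format (active entry in square brackets), and on a live system the inputs would come from /proc/cmdline and, if the kernel exposes it, a file such as /sys/block/hda/queue/scheduler (hda is an assumed device name).

```python
import re
from typing import Optional


def elevator_from_cmdline(cmdline: str) -> Optional[str]:
    """Return the value of the last elevator= parameter, or None if absent."""
    value = None
    for token in cmdline.split():
        if token.startswith("elevator="):
            value = token.split("=", 1)[1]
    return value


def active_scheduler(listing: str) -> Optional[str]:
    """Return the bracketed entry from a scheduler listing, or None."""
    match = re.search(r"\[([^\]]+)\]", listing)
    return match.group(1) if match else None


# Boot parameters copied from the GRUB stanza above.
cmdline = "root=/dev/discs/disc0/part5 ro vga=0x346 devfs=nomount elevator=cfq"

# Assumed sample of what the scheduler file might contain.
listing = "noop anticipatory deadline [cfq]"

print("requested:", elevator_from_cmdline(cmdline))
print("active:   ", active_scheduler(listing))
```

If the two values disagree, the parameter probably never reached the kernel (for example, a different menu entry was booted).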

here's a cute tidbit: in its death throes, here's (a snippet of) what it puts out. the full choke is about 437525 lines long....

i'm only seeing nvidia in there, which would explain why vnc brings it down so fast and why the scheduler didn't seem to matter.

i'll have to check nvidia's site and see if they say anything about this.

Jan  5 11:52:47 aragorn [<c011ba5e>] __wake_up+0x3e/0x60
Jan  5 11:52:47 aragorn [<c04de3f2>] schedule_timeout+0xb2/0xc0
Jan  5 11:52:47 aragorn [<f9a5930b>] nv_kern_poll+0x3a/0xb0 [nvidia]
Jan  5 11:52:47 aragorn [<c0430416>] sock_poll+0x26/0x30
Jan  5 11:52:47 aragorn [<c0173ff8>] do_select+0x298/0x2e0
Jan  5 11:52:47 aragorn [<c0173b90>] __pollwait+0x0/0xd0
Jan  5 11:52:47 aragorn [<c017430f>] sys_select+0x28f/0x540
Jan  5 11:52:47 aragorn [<c015fdba>] vfs_read+0x9a/0x160
Jan  5 11:52:47 aragorn [<c01031d9>] sysenter_past_esp+0x52/0x75
Jan  5 11:52:47 aragorn [<c04ddd2a>] schedule+0xaaa/0xc60
Jan  5 11:52:47 aragorn [<c036057c>] tty_ldisc_try+0x3c/0x50
Jan  5 11:52:47 aragorn [<c011ba5e>] __wake_up+0x3e/0x60
Jan  5 11:52:47 aragorn [<c04de3f2>] schedule_timeout+0xb2/0xc0
Jan  5 11:52:47 aragorn [<f9a5930b>] nv_kern_poll+0x3a/0xb0 [nvidia]
Jan  5 11:52:47 aragorn [<c0430416>] sock_poll+0x26/0x30
Jan  5 11:52:47 aragorn [<c0173ff8>] do_select+0x298/0x2e0
Jan  5 11:52:47 aragorn [<c0173b90>] __pollwait+0x0/0xd0
Jan  5 11:52:47 aragorn [<c017430f>] sys_select+0x28f/0x540
Jan  5 11:52:47 aragorn [<c015fdba>] vfs_read+0x9a/0x160
Jan  5 11:52:47 aragorn [<c01031d9>] sysenter_past_esp+0x52/0x75
Jan  5 11:52:47 aragorn [<c04ddd2a>] schedule+0xaaa/0xc60
Jan  5 11:52:47 aragorn [<c036057c>] tty_ldisc_try+0x3c/0x50
Jan  5 11:52:47 aragorn [<c011ba5e>] __wake_up+0x3e/0x60
Jan  5 11:52:47 aragorn [<c04de3f2>] schedule_timeout+0xb2/0xc0
Jan  5 11:52:47 aragorn [<f9a5930b>] nv_kern_poll+0x3a/0xb0 [nvidia]
Jan  5 11:52:47 aragorn [<c0430416>] sock_poll+0x26/0x30
Jan  5 11:52:47 aragorn [<c0173ff8>] do_select+0x298/0x2e0
Jan  5 11:52:47 aragorn [<c0173b90>] __pollwav_kern_poll+0x3a/0xb0 [nvidia]
Jan  5 11:52:47 aragorn [<c0430416>] sock_poll+0x26/0x30
Jan  5 11:52:47 aragorn [<c0173ff8>] do_select+0x298/0x2e0
Jan  5 11:52:47 aragorn [<c0173b90>] __pollwait+0x0/0xd0
Jan  5 11:52:47 aragorn [<c017430f>] sys_select+0x28f/0x540
Jan  5 11:52:47 aragorn [<c015fdba>] vfs_read+0x9a/0x160
Jan  5 11:52:47 aragorn [<c01031d9>] sysenter_past_esp+0x52/0x75
Jan  5 11:52:47 aragorn [<c04ddd2a>] schedule+0xaaa/0xc60
Jan  5 11:52:47 aragorn [<c036057c>] tty_ldisc_try+0x3c/0x50
Jan  5 11:52:47 aragorn [<c011ba5e>] __wake_up+0x3e/0x60
Jan  5 11:52:47 aragorn [<c04de3f2>] schedule_timeout+0xb2/0xc0
Jan  5 11:52:47 aragorn [<f9a5930b>] nv_kern_poll+0x3a/0xb0 [nvidia]
Jan  5 11:52:47 aragorn [<c0430416>] sock_poll+0x26/0x30
Jan  5 11:52:47 aragorn [<c0173ff8>] do_select+0x298/0x2e0
Jan  5 11:52:47 aragorn [<c0173b90>] __pollwait+0x0/0xd0
Jan  5 11:52:47 aragorn [<c017430f>] sys_select+0x28f/0x540
Jan  5 11:52:47 aragorn [<c015fdba>] vfs_read+0x9a/0x160
Jan  5 11:52:47 aragorn [<c01031d9>] sysenter_past_esp+0x52/0x75
Jan  5 11:52:47 aragorn [<c04ddd2a>] schedule+0xaaa/0xc60
Jan  5 11:52:47 aragorn [<c036057c>] tty_ldisc_try+0x3c/0x50
Jan  5 11:52:47 aragorn [<c011ba5e>] __wake_up+0x3e/0x60
Jan  5 11:52:47 aragorn [<c04de3f2>] schedule_timeout+0xb2/0xc0
Jan  5 11:52:47 aragorn [<f9a5930b>] nv_kern_poll+0x3a/0xb0 [nvidia]
Jan  5 11:52:47 aragorn [<c0430416>] sock_poll+0x26/0x30
Jan  5 11:52:47 aragorn [<c0173ff8>] do_select+0x298/0x2e0
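
With a trace this repetitive it can help to count frames instead of reading them. The following is a rough Python sketch under stated assumptions: the regular expression is written only against the line format visible in this snippet, and the function name and the "(core kernel)" label are made up for illustration.

```python
import re
from collections import Counter

# Matches the "[<address>] symbol+0xoffset/0xsize [module]" tail of a line.
# The trailing "[module]" part is optional; frames without it are treated
# as belonging to the core kernel.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<sym>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<mod>[\w-]+)\])?"
)


def summarize_trace(lines):
    """Count how often each symbol and each module appears in the trace."""
    symbols = Counter()
    modules = Counter()
    for line in lines:
        match = FRAME_RE.search(line)
        if match is None:
            continue  # not a stack frame line
        symbols[match.group("sym")] += 1
        modules[match.group("mod") or "(core kernel)"] += 1
    return symbols, modules


# Four lines taken from the snippet above.
sample = [
    "Jan  5 11:52:47 aragorn [<c011ba5e>] __wake_up+0x3e/0x60",
    "Jan  5 11:52:47 aragorn [<c04de3f2>] schedule_timeout+0xb2/0xc0",
    "Jan  5 11:52:47 aragorn [<f9a5930b>] nv_kern_poll+0x3a/0xb0 [nvidia]",
    "Jan  5 11:52:47 aragorn [<c0430416>] sock_poll+0x26/0x30",
]

symbols, modules = summarize_trace(sample)
print(modules.most_common())
print(symbols.most_common(3))
```

A module that shows up often in a trace is only a hint about where to look, not proof that it is at fault.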


#5 2005-01-05 17:25:15

phrakture
Arch Overlord
From: behind you
Registered: 2003-10-29
Posts: 7,879

Re: smp issue with 2.6.10?

hmmm, I've heard of some problems with cfq... haven't tried any other schedulers...

nvidia would explain the issues... especially if firefox is involved, considering it's rendering a lot of disparate information (text, colors, images, moving thingies....) I'm going to check nvidia for issues similar to mine though

(my original issue was that firefox just hung and I had to manually kill it as root to make it drop... now in 2.6.10 it locks the whole machine)


#6 2005-01-05 17:51:03

jp_fielding
Member
Registered: 2004-08-28
Posts: 85

Re: smp issue with 2.6.10?

don't know yet if this is the same problem, or if the patches mentioned here have been applied yet. i see a bunch of patches from minion.de listed in the nvidia package, so i'll have to look closer

http://www.nvnews.net/vbulletin/showthr … d1&t=42964


#7 2005-01-05 20:01:17

jp_fielding
Member
Registered: 2004-08-28
Posts: 85

Re: smp issue with 2.6.10?

after running without crash for a good 20 minutes straight, i believe you can side-step this by setting "elevator=noop" as i did.


#8 2005-01-06 14:30:06

jp_fielding
Member
Registered: 2004-08-28
Posts: 85

Re: smp issue with 2.6.10?

maybe it's not the nvidia...

i just died running with the elevator=noop param, and this time firefox was not open; i was checking something into cvs.  so now it's down to the wireless driver or tcp or io.

going to try a wired connection for a bit and see how that goes.


#9 2005-01-06 14:35:59

iphitus
Forum Fellow
From: Melbourne, Australia
Registered: 2004-10-09
Posts: 4,927

Re: smp issue with 2.6.10?

The logs you posted above mention nvidia repeatedly. Try running without it: don't load the module, and use either nv or vesa for X.


#10 2005-01-06 18:45:57

jp_fielding
Member
Registered: 2004-08-28
Posts: 85

Re: smp issue with 2.6.10?

turned off my wireless card (dell 1300, via ndiswrapper)

using elevator=as.

no problems so far.

interestingly, using the anticipatory scheduler on this p4HT laptop makes everything slower; compile time is about double for my projects.

i'll try it for a few more days and see.


#11 2005-01-06 20:53:13

phrakture
Arch Overlord
From: behind you
Registered: 2003-10-29
Posts: 7,879

Re: smp issue with 2.6.10?

actually, I seem to recall reading that ndiswrapper was not safe for SMP kernels.... nice call...

Yesterday I was SSHed into my box and it hung... I came home and saw this (which made me laugh):

login: Kernel Panic!

which begs the question.... what's the password?


#12 2005-01-06 22:57:16

jp_fielding
Member
Registered: 2004-08-28
Posts: 85

Re: smp issue with 2.6.10?

it used to not be safe prior to like 0.6 or something.  for the longest time i had to use the linuxant driver.   maybe it's just a regression since 0.12 and 1.0rc1.

i just looked though, and they released 1.0 rc2 today, so i'm going to give that another try, i'll let you know tomorrow.

oh, and uh.....

phrakture wrote:

which begs the question.... what's the password?

rotflmao


#13 2005-01-06 23:02:10

phrakture
Arch Overlord
From: behind you
Registered: 2003-10-29
Posts: 7,879

Re: smp issue with 2.6.10?

i should try 1.0 as well...

The goofy thing is that my machine only seems to hang (the more I think about it) when doing some decent network traffic... web browsing with flash and stuff, pacman -Syu'ing, etc....
I think you may have hit the nail on the head, at least for my problem, that it may be ndiswrapper... I'll diagnose tonight


#14 2005-01-07 13:37:29

jp_fielding
Member
Registered: 2004-08-28
Posts: 85

Re: smp issue with 2.6.10?

looks like 1.0rc2 of the ndiswrapper works okay for now.  i've also reset my scheduler to 'as' and my compile times seem to have come back down to normal.

if anything changes i'll let you know


#15 2005-01-07 15:54:46

phrakture
Arch Overlord
From: behind you
Registered: 2003-10-29
Posts: 7,879

Re: smp issue with 2.6.10?

jp_fielding wrote:

looks like 1.0rc2 of the ndiswrapper works okay for now.  i've also reset my scheduler to 'as' and my compile times seem to have come back down to normal.

if anything changes i'll let you know

let's hope I can ssh long enough to rebuild before it panics... otherwise I'll have to wait till tonight

EDIT: rebuilt, rebooted.... gonna ssh back in when it comes back up and attempt a pacman -Syu (which failed last time)


#16 2005-01-07 16:51:59

phrakture
Arch Overlord
From: behind you
Registered: 2003-10-29
Posts: 7,879

Re: smp issue with 2.6.10?

yes, that was it - I'll post an informational post in the laptop forums.....


#17 2005-01-07 16:58:20

jp_fielding
Member
Registered: 2004-08-28
Posts: 85

Re: smp issue with 2.6.10?

cool, sounds good.  everything still looks good here so that sounds like it to me too.


#18 2005-01-09 13:48:09

jp_fielding
Member
Registered: 2004-08-28
Posts: 85

Re: smp issue with 2.6.10?

are you also using ndiswrapper?  the only time i experienced any troubles was when i upgraded the wrapper module for the new kernel.  when i finally upgraded ndiswrapper to 1.0rc2, the problems in my system vanished again. 

this is not to say that firefox doesn't have issues, but i don't experience them.  firefox did help make my problems come alive, though, as it certainly was using the ndiswrapper module.

don't know what to say about your bad experiences in general with firefox.   i can say you appear to be the exception, because i know lots of people who have used firefox since its first release, and the only problem i ever had was that one release (.9 or something) had bad https support.  maybe something else on your system just doesn't like it.


#19 2005-01-10 16:37:35

phrakture
Arch Overlord
From: behind you
Registered: 2003-10-29
Posts: 7,879

Re: smp issue with 2.6.10?

i have occasional problems with firefox....

and I can't recompile it in debug mode... it seems to only happen when I have an external program associated with a link... specifically .torrent files...
sigh

at least my network is stable now...

but firefox itself still hangs (doesn't take down the system)
