
#26 2009-09-26 14:08:19

R00KIE
Forum Fellow
From: Between a computer and a chair
Registered: 2008-09-14
Posts: 4,734

Re: Hardware (?) trouble

That seems like a problem with your graphics card, though it's hard to know for sure. I have never done stress testing in Linux before, so I don't know of any programs that stress each component separately. Try to find something that stresses only your CPU/RAM and check whether your PC crashes, then find something that stresses only your graphics card and check whether that crashes it.

You may also want to check whether anything looks different during POST, provided you are not using the fancy POST splash images that manufacturers now insist on enabling by default.
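On the "stress only your RAM" idea: as a rough illustration of what such a test does, here is a minimal Python sketch that fills a buffer with fixed byte patterns and counts bytes that read back wrong. The function name, buffer size and patterns are made up for this example. It runs in userspace and cannot choose which physical memory it touches, so it is no substitute for memtest86+ or Prime95.

```python
def pattern_test(size_mb=64, patterns=(0x00, 0xFF, 0xAA, 0x55)):
    """Fill a buffer with each byte pattern and count bytes that read back wrong.

    Hypothetical helper for illustration only; size_mb and patterns are
    arbitrary choices, not values taken from any real memory tester.
    """
    size = size_mb * 1024 * 1024
    mismatches = 0
    for pattern in patterns:
        buf = bytearray([pattern]) * size        # write pass: fill the buffer
        mismatches += size - buf.count(pattern)  # read pass: count wrong bytes
    return mismatches


if __name__ == "__main__":
    bad = pattern_test()
    print("mismatched bytes:", bad)
```

A non-zero result would be a strong hint that something is wrong, but a zero result proves very little, since the kernel decides where the buffer lives.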


R00KIE
Tm90aGluZyB0byBzZWUgaGVyZSwgbW92ZSBhbG9uZy4K

Offline

#27 2009-09-26 14:22:58

eyescream
Member
Registered: 2009-08-24
Posts: 49

Re: Hardware (?) trouble

Thanks for your reply again!

I did not notice anything different during POST.

It being a GPU problem would surprise me. I was playing a game just a few hours ago for at least 30 minutes which I am sure requires more GPU power than watching a video.

Though, I guess if I can't manage to install Windows to check my drives with the manufacturer's diagnostics, I'll just install Arch again and then try to find apps to stress every part of my system and see what happens.

EDIT: Stupidity on my part, I forgot the obvious and did not disable AHCI before trying to install Windows. I guess once I have that installed I might as well do the stress tests there.

Last edited by eyescream (2009-09-26 14:31:55)

Offline

#28 2009-09-26 19:03:43

eyescream
Member
Registered: 2009-08-24
Posts: 49

Re: Hardware (?) trouble

I've run extended tests on both of my hard drives and both passed.

I don't think it's a coincidence though, so there must be something going on.
Usually a freeze like that shouldn't happen in the first place.

I will run some more (stress) tests tomorrow and see what I can find out.


Some interesting things I've noticed:

I can set my RAM voltage to 2.0 V as well as 2.1 V and it will still boot into Windows, unlike Linux (tested with Ubuntu and Arch).
When I select "BIOS - All" in the memory sizing options in memtest86+ my system freezes. I tested that with v2.11 and v1.7x. I did however read that this option can cause trouble with some mainboards so I am not sure whether that has any significance.

Offline

#29 2009-09-26 19:59:54

R00KIE
Forum Fellow
From: Between a computer and a chair
Registered: 2008-09-14
Posts: 4,734

Re: Hardware (?) trouble

I was just wondering: you mentioned that Windows crashed with a BSOD. Do you still remember anything of what it said? It may hold a clue to what's wrong.


R00KIE
Tm90aGluZyB0byBzZWUgaGVyZSwgbW92ZSBhbG9uZy4K

Offline

#30 2009-09-26 20:34:43

Mindaugas
Member
From: Ireland
Registered: 2004-03-25
Posts: 95

Re: Hardware (?) trouble

Power supply unit. Maybe?

Offline

#31 2009-09-27 09:07:31

eyescream
Member
Registered: 2009-08-24
Posts: 49

Re: Hardware (?) trouble

I haven't noticed anything that would indicate a PSU problem and I have no way of testing it either for the moment. I guess once I am done trying stuff in Windows I will format and set up my system again and run that for a week (if it lasts that long, that is) to see whether it happens again.
Though, if it doesn't happen again there would still be no explanation for the seemingly random freeze other than a bug in vlc/the avi/the video drivers and that would just be too much of a coincidence.

The BSOD is just about the only error I didn't write down, though I can easily reproduce it later by enabling AHCI before starting the Windows (XP) installation.

Offline

#32 2009-09-27 17:38:57

eyescream
Member
Registered: 2009-08-24
Posts: 49

Re: Hardware (?) trouble

I have run stress tests (Prime95 and FurMark) to check whether anything shows any errors or acts up.
Nothing. I ran Prime95 for hours (I even tried my RAM at both 1066 MHz with 2.1 V and 800 MHz with 1.9 V just to be sure) but no errors were found.
Going by that it seems my mainboard, CPU, RAM and PSU are fine. (I rechecked my RAM in memtest too, passed)
My GPU handled FurMark well too, no errors and it went smoothly.

Everything seems to indicate a stable system.
Perhaps there is some bug in the video drivers that was triggered by a rare event which led to the hard-reboot which then caused the damage to my filesystem. But that seems to be too much of a coincidence..

I am currently rechecking my hard drives for bad sectors but it is unlikely that anything will be found.

I guess I will just have to format and set it all up again and see whether it happens again or not. If it doesn't, no explanation other than "bad luck" then. If it does, then oh well..

Offline

#33 2009-10-08 15:48:14

eyescream
Member
Registered: 2009-08-24
Posts: 49

Re: Hardware (?) trouble

So, after my last post I formatted again.
Everything worked normally until today. The freezes never happened again. Not during boot and not while watching something.
Today however, I booted and I received yet another error:

dsc00052.jpg
(below that is one more line saying "<<EOE>>")

I hard-rebooted and it booted as it's supposed to. Then, I rebooted again just to test and it booted normally, too.

I haven't made any changes to the system (apart from a full system upgrade yesterday or the day before) and I didn't notice anything different.

I will see what happens in the next few days and how it behaves.

Offline

#34 2009-10-18 12:14:42

eyescream
Member
Registered: 2009-08-24
Posts: 49

Re: Hardware (?) trouble

Okay, I am pretty sure something is broken.
Yesterday while browsing the web my pc froze (not just X). I hit reset and it wouldn't boot anymore.

POST looked perfectly normal, then I selected Arch Linux, it displayed the first few lines of the booting procedure before the screen went black and it rebooted.
The same thing happened when trying to boot the fallback image.

I let it continue to try to boot and after about 10 attempts it did boot, displaying some errors during boot (and the subsequent reboots).
Udev showed a problem loading something (I couldn't quite catch what), then on another boot it failed to load my network.

Sounds to me like my mainboard has some weird issues after all.

Any ideas as to what I could do?

Offline

#35 2009-10-18 12:26:50

stefanwilkens
Member
From: Enschede, the Netherlands
Registered: 2008-12-10
Posts: 624

Re: Hardware (?) trouble

If you've assured yourself that the system is not overheating, let's start taking out hardware.

Strip it to the bare minimum and boot the system a couple of times. Bare minimum being one stick of RAM, a hard drive and a video card.

How old is your PSU and how hot does it typically get? Over time, PSUs lose output capacity, which may drop below system requirements. Heat is a strong factor in this process.


Arch i686 on Phenom X4 | GTX760

Offline

#36 2009-10-18 12:59:14

hokasch
Member
Registered: 2007-09-23
Posts: 1,461

Re: Hardware (?) trouble

By chance, can you remember affronting a nerdy voodoo-priest recently?
But seriously, the thing with cold boot/first boot a day = error, reboots = fine is quite odd. That should mean something, but... no idea, sorry.
I guess the only thing left would be to put all your components on a different board, to confirm if that is the responsible part. The ram timing/voltage issues may also point in that direction. Sadly, there would not be much to rescue if the motherboard is indeed toast. 

(my system just crashed when I hit submit. creepy?)

Last edited by hokasch (2009-10-18 13:00:10)

Offline

#37 2009-10-18 13:20:08

eyescream
Member
Registered: 2009-08-24
Posts: 49

Re: Hardware (?) trouble

stefanwilkens wrote:

If you've assured yourself that the system is not overheating, let's start taking out hardware.

Strip it to the bare minimum and boot the system a couple of times. Bare minimum being one stick of RAM, a hard drive and a video card.

How old is your PSU and how hot does it typically get? Over time, PSUs lose output capacity, which may drop below system requirements. Heat is a strong factor in this process.

I have checked CPU, RAM, GPU, HDD and mainboard temps with different utilities in Linux and Windows and compared them to manufacturer specs and to the temps other people report on forums etc. Everything checks out and is mostly at average temperature or even below.
My PSU, like all the other parts, is 4 months old. It's a new PC, which makes it all the more annoying.

As for the PSU temp, how does one go about checking that other than measuring it manually? For now I can only say that if I put my hands on it (tried all sides) it's not very warm, just a bit.
The air coming out of it is not very warm, either.

I guess I will strip it later and see what happens, though I have the feeling it is going to be hard to test seeing as I've rebooted a couple of times now with no errors whatsoever.
This is driving me crazy. I can't even properly test it, the errors just vanish and reappear at random days later..

hokasch wrote:

By chance, can you remember affronting a nerdy voodoo-priest recently?
But seriously, the thing with cold boot/first boot a day = error, reboots = fine is quite odd. That should mean something, but... no idea, sorry.
I guess the only thing left would be to put all your components on a different board, to confirm if that is the responsible part. The ram timing/voltage issues may also point in that direction. Sadly, there would not be much to rescue if the motherboard is indeed toast.

(my system just crashed when I hit submit. creepy?)

Fortunately, I did not do that :D
And I agree that it is very odd. I've also checked my RAM a million times but it never shows errors.
In Windows setting the RAM back to its original settings seemed to work, though it could've just been luck.
What's also odd is the fact that I had the same RAM settings for weeks before the freezes started. I didn't change a thing.
It simply started one day.

And oh my god, that IS creepy. Maybe I am cursed, who knows! :P

EDIT: This thread has some similarities to my issue http://bbs.archlinux.org/viewtopic.php?id=78854
He can run torture tests for hours at a time without crashes like me. Though, he and the others there frequently have crashes while using their PC.
I've only had that twice, once 2 weeks ago and once yesterday, so I am unsure whether it is related. Probably not considering my RAM issues.

Last edited by eyescream (2009-10-18 13:33:15)

Offline

#38 2009-10-18 13:56:07

hokasch
Member
Registered: 2007-09-23
Posts: 1,461

Re: Hardware (?) trouble

Just did a quick google search for "cold boot fails reboot", since this seems to be the most reproducible of the errors you got...
Basically, the only recommendations in the first results were "do a long POST" and "flash the BIOS". I guess you tried the first already; the second one sounds a bit dangerous with this setup.
Just one more tip for running tests: UltimateBootCD

Offline

#39 2009-10-18 14:15:57

eyescream
Member
Registered: 2009-08-24
Posts: 49

Re: Hardware (?) trouble

Thanks for the tip! I totally forgot about that! I think I'll grab that and try running some things from there.
And you are right, I tried the long POST and considered flashing my BIOS, but I've found too many hits on Google with people not having flashed the same mainboard with similar hardware, plus I'd prefer to avoid that. Too risky.

However, I've just rebooted a few times (~10 times today so far) without any errors. Stripping won't do me any good like this :/
At some point this is going to drive me insane.

Offline

#40 2009-10-18 15:33:59

eyescream
Member
Registered: 2009-08-24
Posts: 49

Re: Hardware (?) trouble

EDIT: Nevermind. Was not relevant to the actual problem.

I am going to burn the UBCD and run some tests I guess.

EDIT2: It froze again. I am documenting the behavior and trying stuff at the moment. Going to post that in a bit..

Last edited by eyescream (2009-10-18 16:40:07)

Offline

#41 2009-10-18 18:59:56

eyescream
Member
Registered: 2009-08-24
Posts: 49

Re: Hardware (?) trouble

Okay, like I said earlier, my PC froze again and I had to reset.
I took the following pictures while trying to reboot. I rebooted countless times and the pictures are from various boots, the first one is the one right after I had to reset due to the freeze:

http://img32.imageshack.us/img32/9501/dsc00060aj.jpg
I couldn't type anything here. After a while it displayed the kernel panic screen.

http://img148.imageshack.us/img148/227/dsc00064s.jpg
Same as above.

http://img40.imageshack.us/img40/3720/dsc00065bq.jpg
I found an option to check the lan cable in BIOS after last night's freeze and enabled it because I figured it wouldn't hurt. Then this showed up when I tried to boot, every time starting from reboot #3 after today's freeze.

http://img40.imageshack.us/img40/4828/dsc00066yz.jpg
Extended POST. No idea whether something is off, seems fine to me.

http://img40.imageshack.us/img40/7666/dsc00067xa.jpg
Errors during boot. Fsck would keep running, keyboard wouldn't work and at some point the kernel panic screen would be displayed.

http://img237.imageshack.us/img237/1307/dsc00072h.jpg
Another boot, keyboard input worked, stuck on starting network. Kernel Panic screen after a while.
No idea what happened to fsck there.

http://img42.imageshack.us/img42/7372/dsc00073p.jpg
Fsck passed as opposed to before. Stuck on starting network again.
Because of the lan cable error during POST I pulled the cable which made no difference, though.

http://img40.imageshack.us/img40/4318/dsc00075um.jpg
Here it finished booting and put me into a virtual terminal where I could log in.
It had done that the boot prior to this one too, yet it told me my username didn't exist (/home/eyescream was gone) so I rebooted and ended up in the virtual terminal again where I changed to my normal user and entered startx. The picture is the result.

http://img508.imageshack.us/img508/3790/dsc00076wf.jpg
After a few reboots I was able to log in _and_ have files in /var (sometimes it would be empty). Nano would work but using the arrow keys/page up/down wouldn't work. Used cat to try to find some helpful lines in some of the logs.

http://img267.imageshack.us/img267/1007/dsc00077i.jpg
Same as the one above.

http://img19.imageshack.us/img19/4884/dsc00079qu.jpg
This was one of the last boots. All partitions passed fsck again but the keyboard didn't work.


Now after all of the above my system will randomly boot and drop me into a terminal from which I can barely do anything anyway. X won't start and nothing is working anymore.

I have no idea whether any of these pictures contain a clue or can be helpful at all.
A friend of mine noted that in the first picture it says something about klibc, meaning that it crashes before the root partition is mounted or something. Maybe that is helpful.

Also, I had a terminal running top open at the time of the freeze. It didn't show anything unusual, just the usual right before the freeze. Apart from that it is interesting to note that it froze twice within a day. That has never happened before. Though, that is probably because before yesterday a freeze like that (a freeze after my system had been running for a while) had only happened once which resulted in my system being unusable right away, as opposed to taking 2 freezes like this time.

Any thoughts?

EDIT: I had a hunch and turned my PC off for a while. When I booted again it was exactly as I had thought, it booted normally (except for the fact that it still displays the lan cable error during POST). KDM appeared, I logged in, everything is working. Now I am even more confused.

My initial freeze problem happened only on cold-boots but never on reboots.
When my PC freezes like today it won't boot properly on reboots and a cold boot is required.
(Though, yesterday it worked after a reboot. But I needed a few reboots until all errors would disappear during boot, like not being able to load the network.)
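
Since the cold boot vs. reboot pattern is the only thing that looks halfway consistent, it may help to write down every boot and tally the results instead of going by memory. A minimal Python sketch of such a tally, assuming each boot is noted by hand as a (type, outcome) pair; the labels "cold", "warm", "ok" and "fail" are just names chosen for this example:

```python
from collections import Counter


def tally(boots):
    """Summarise boot attempts as {boot_type: (failed, total)}.

    boots is an iterable of (boot_type, outcome) pairs, where boot_type is
    "cold" or "warm" and outcome is "ok" or "fail" (hypothetical labels).
    """
    counts = Counter(boots)
    summary = {}
    for boot_type in ("cold", "warm"):
        ok = counts[(boot_type, "ok")]
        failed = counts[(boot_type, "fail")]
        summary[boot_type] = (failed, ok + failed)
    return summary


if __name__ == "__main__":
    # Placeholder entries only, not real observations from this machine.
    notes = [("cold", "fail"), ("warm", "ok"), ("warm", "ok")]
    for boot_type, (failed, total) in tally(notes).items():
        print(f"{boot_type}: {failed} of {total} boots failed")
```

With enough entries it should become clearer whether the failures really follow the boot type or are just random.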

Last edited by eyescream (2009-10-18 19:33:28)

Offline

#42 2009-10-18 20:25:41

eyescream
Member
Registered: 2009-08-24
Posts: 49

Re: Hardware (?) trouble

I've looked through my logs again and found some errors and other things that don't seem right, plus there might be some indication as to what causes the errors:

errors.log: http://pastebin.com/f5ed29f85

kdm.log: http://pastebin.com/m22e65187

Xorg.0.log: http://pastebin.com/m33bf53fc

xorg.conf: http://pastebin.com/d427b08d7

kernel.log: http://pastebin.com/m407117d0

There are also some minor errors that I've googled but couldn't really tell whether they can be neglected or should be addressed (and if so, how they can be addressed).

I don't think the Xorg error(s) are the cause, seeing as I already had the PC problems when I was still using Ubuntu. (Though it would be awesome if someone could tell me whether my xorg.conf is fine the way it is or whether something needs to be changed! I don't really have a good understanding when it comes to that.)
Also bear in mind that any log entry past 17:00 (or possibly even a bit before that) occurred during the time when I was having trouble/was trying things after the freeze.
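
To separate the entries from before 17:00 from the noise produced while trying things afterwards, something like the following Python sketch could be used. It assumes the logs use a syslog-style prefix such as "Oct 18 16:41:02 hostname ..." and that only a single day is being looked at; the keyword list and the function name are arbitrary choices for this example, not anything taken from the logs above.

```python
import re
from datetime import time

# Assumed syslog-style prefix: "Oct 18 16:41:02 hostname ..."
STAMP = re.compile(r"^\w{3}\s+\d{1,2}\s+(\d{2}):(\d{2}):(\d{2})\s")
# Arbitrary keywords that often mark trouble; adjust as needed.
KEYWORDS = re.compile(r"error|fail|panic|oops", re.IGNORECASE)


def suspicious_lines(lines, before=time(17, 0)):
    """Return lines stamped earlier than `before` that contain a keyword.

    Lines without a recognisable timestamp are skipped. The date part of
    the stamp is ignored, so feed in one day's worth of log at a time.
    """
    hits = []
    for line in lines:
        match = STAMP.match(line)
        if not match:
            continue
        hour, minute, second = (int(part) for part in match.groups())
        if time(hour, minute, second) < before and KEYWORDS.search(line):
            hits.append(line.rstrip("\n"))
    return hits
```

Calling `suspicious_lines(open("kernel.log"))` on a local copy of a log would then list only the entries logged before the cutoff.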

Offline

#43 2009-10-19 17:05:54

eyescream
Member
Registered: 2009-08-24
Posts: 49

Re: Hardware (?) trouble

I just tested my RAM stick by stick to see whether it might be a faulty controller or slot or whatever.
No matter what I tried, it would always work. Even a voltage of 2.1 V at 1066 MHz worked with each setup, even when I put the sticks back in their original slots.
This makes absolutely no sense to me. Yesterday (or the day before) after the freeze I tried the RAM settings again and it froze as it used to.
Now suddenly it doesn't freeze anymore, not on a reboot and not on a cold-boot either.

It appears to be a mainboard issue. My RAM checks out in memtest, my HDDs check out, my GPU and CPU survived all stress tests, and I don't see how my PSU could be responsible for such behavior (correct me if I'm mistaken), which leaves only the mainboard. The issues seem completely random and then just disappear. Now even the only consistent thing, the freezes, has disappeared.
I am at a total loss here. So random.

Offline
