
#1 2011-02-21 23:26:30

nycthbris
Member
Registered: 2007-10-23
Posts: 10

Intermittent system freeze: no evidence in logs

Lengthy description here, but I've tried just about everything to the best of my knowledge. Any insights are appreciated!

Three weeks ago I upgraded my system with an additional 1 TB HD and 4 GB more RAM. Around the same time, the HD holding my Arch install started to fail mechanically. Luckily I was able to move my data over to the new drive and set up my system there. After this new install the system would unpredictably freeze with a blank screen of a random color (usually whatever color was covering the majority of the screen). Naturally I thought it was a video card issue, so I picked up a new card and reinstalled, but the freezing persisted (although now the screen would just freeze where I was, with no blanket color). I ran memtest to see if my new RAM was the issue, but it reported no errors.

I noticed that at the time of a freeze I would hear the drive in use spin down, so I checked the SMART status of the new drive. It turned out to be failing too, so I picked up another drive and reinstalled again (no SMART warnings on this one).

Until this point I was unable to reproduce or anticipate the freeze; it would occur unpredictably. For this latest installation I set up KDE 4.6 and discovered I could reproduce it by simply logging in via KDM, opening Dolphin, and hovering my mouse over the drives listed in the left panel (yes, I know that sounds ridiculous, but it happened every time, even if I waited a while after logging in before opening Dolphin). I also realized that the freeze only takes place while X is running and does not occur while I'm in a virtual console (tty1-6). The problem has not shown up while running Win 7 on the same hardware, although I spend significantly less time doing so.

I have since switched over to GNOME and can run the system for usable lengths of time (hours to a few days) before the freeze happens, usually when starting or closing applications. I can't ssh into the frozen system. I've tailed dmesg right up to a freeze and nothing peculiar comes up. There aren't any errors in Xorg.log either. I've tried removing the extra PCI cards I've got (tv-tuner, wireless cards, etc.) but the problem persists. I have not had the problem while running a LiveCD. I've even tried switching which SATA ports/cables and power connections I'm using on my motherboard.
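
When a hard freeze leaves nothing in dmesg or Xorg.log, one option is a small heartbeat file that is forced to disk on every write, so that after the reboot the last line shows roughly when the machine stopped. The following is only a sketch under that assumption; the function names, the "alive" note, and the interval are illustrative and not taken from any existing tool.

```python
import os
import time


def write_heartbeat(path, note=""):
    """Append one timestamped line and force it to disk before returning."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    line = (stamp + " " + note).rstrip() + "\n"
    with open(path, "a") as log:
        log.write(line)
        log.flush()
        # fsync asks the kernel to push the data out of the page cache,
        # so a freeze right after this call should not lose the line.
        os.fsync(log.fileno())
    return line


def last_heartbeat(path):
    """Return the final line of the heartbeat file, or None if unavailable."""
    try:
        with open(path) as log:
            lines = log.read().splitlines()
    except OSError:
        return None
    return lines[-1] if lines else None


def run(path, interval_s=5.0, count=None):
    """Write a heartbeat every interval_s seconds; count=None loops forever."""
    written = 0
    while count is None or written < count:
        write_heartbeat(path, "alive")
        written += 1
        time.sleep(interval_s)
    return written
```

Calling `run()` with a log path of your choice before starting X, then `last_heartbeat()` on the same path after the forced reboot, gives a timestamp to compare against the last entries in the other logs.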

Specs:
Intel Core i7 CPU  860  @ 2.80GHz
Gigabyte P55M-UD2 motherboard
6 GB DDR3 1600 (PC3 12800) RAM
750W Corsair PSU (replaced less than a year ago)

lspci:

00:00.0 Host bridge: Intel Corporation Core Processor DMI (rev 11)
00:03.0 PCI bridge: Intel Corporation Core Processor PCI Express Root Port 1 (rev 11)
00:08.0 System peripheral: Intel Corporation Core Processor System Management Registers (rev 11)
00:08.1 System peripheral: Intel Corporation Core Processor Semaphore and Scratchpad Registers (rev 11)
00:08.2 System peripheral: Intel Corporation Core Processor System Control and Status Registers (rev 11)
00:08.3 System peripheral: Intel Corporation Core Processor Miscellaneous Registers (rev 11)
00:10.0 System peripheral: Intel Corporation Core Processor QPI Link (rev 11)
00:10.1 System peripheral: Intel Corporation Core Processor QPI Routing and Protocol Registers (rev 11)
00:1a.0 USB Controller: Intel Corporation 5 Series/3400 Series Chipset USB Universal Host Controller (rev 05)
00:1a.1 USB Controller: Intel Corporation 5 Series/3400 Series Chipset USB Universal Host Controller (rev 05)
00:1a.2 USB Controller: Intel Corporation 5 Series/3400 Series Chipset USB Universal Host Controller (rev 05)
00:1a.7 USB Controller: Intel Corporation 5 Series/3400 Series Chipset USB2 Enhanced Host Controller (rev 05)
00:1b.0 Audio device: Intel Corporation 5 Series/3400 Series Chipset High Definition Audio (rev 05)
00:1c.0 PCI bridge: Intel Corporation 5 Series/3400 Series Chipset PCI Express Root Port 1 (rev 05)
00:1c.4 PCI bridge: Intel Corporation 5 Series/3400 Series Chipset PCI Express Root Port 5 (rev 05)
00:1c.5 PCI bridge: Intel Corporation 5 Series/3400 Series Chipset PCI Express Root Port 6 (rev 05)
00:1d.0 USB Controller: Intel Corporation 5 Series/3400 Series Chipset USB Universal Host Controller (rev 05)
00:1d.1 USB Controller: Intel Corporation 5 Series/3400 Series Chipset USB Universal Host Controller (rev 05)
00:1d.2 USB Controller: Intel Corporation 5 Series/3400 Series Chipset USB Universal Host Controller (rev 05)
00:1d.3 USB Controller: Intel Corporation 5 Series/3400 Series Chipset USB Universal Host Controller (rev 05)
00:1d.7 USB Controller: Intel Corporation 5 Series/3400 Series Chipset USB2 Enhanced Host Controller (rev 05)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev a5)
00:1f.0 ISA bridge: Intel Corporation 5 Series Chipset LPC Interface Controller (rev 05)
00:1f.2 SATA controller: Intel Corporation 5 Series/3400 Series Chipset 6 port SATA AHCI Controller (rev 05)
00:1f.3 SMBus: Intel Corporation 5 Series/3400 Series Chipset SMBus Controller (rev 05)
01:00.0 VGA compatible controller: nVidia Corporation G96 [GeForce 9400 GT] (rev a1)
03:00.0 SATA controller: JMicron Technology Corp. JMB362/JMB363 Serial ATA Controller (rev 02)
03:00.1 IDE interface: JMicron Technology Corp. JMB362/JMB363 Serial ATA Controller (rev 02)
04:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 03)
05:07.0 FireWire (IEEE 1394): Texas Instruments TSB43AB23 IEEE-1394a-2000 Controller (PHY/Link)

I'm out of ideas save for replacing my mobo. Any idea what I could be missing?

Offline

#2 2011-02-21 23:31:36

davidoff
Member
Registered: 2008-06-10
Posts: 23

Re: Intermittent system freeze: no evidence in logs

I get a similar problem and I wonder about memory.  What does the memory section show if you install and run lshw?

Offline

#3 2011-02-21 23:40:27

nycthbris
Member
Registered: 2007-10-23
Posts: 10

Re: Intermittent system freeze: no evidence in logs

Memory section output of lshw:

     *-memory
          description: System Memory
          physical id: 19
          slot: System board or motherboard
          size: 6GiB
        *-bank:0
             description: DIMM 1600 MHz (0.6 ns)
             physical id: 0
             slot: A0
             size: 2GiB
             width: 2244 bits
             clock: 1600MHz (0.6ns)
        *-bank:1
             description: DIMM 1600 MHz (0.6 ns)
             physical id: 1
             slot: A1
             size: 2GiB
             width: 2244 bits
             clock: 1600MHz (0.6ns)
        *-bank:2
             description: DIMM 1600 MHz (0.6 ns)
             physical id: 2
             slot: A2
             size: 2GiB
             width: 2244 bits
             clock: 1600MHz (0.6ns)
        *-bank:3
             description: DIMM [empty]
             physical id: 3
             slot: A3

Offline

#4 2011-02-22 03:00:56

davidoff
Member
Registered: 2008-06-10
Posts: 23

Re: Intermittent system freeze: no evidence in logs

I have the same crash problem (usually over ssh), and lshw also reports a very large width in bits for me.  I think it should be either 32 or 64 bits if memory is working correctly with the system.  I have no idea how to fix it, though.
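
To compare machines quickly, the per-bank width can be pulled out of saved lshw text output. This is a minimal sketch that assumes the layout shown in the memory section above; the set of "plausible" widths is an assumption (72 is included for modules with 8 extra ECC bits), and an odd value may only reflect what the firmware tables report rather than a real fault.

```python
import re

# Assumption: typical DIMM widths; 72 = 64 data bits + 8 ECC bits.
PLAUSIBLE_WIDTHS = {32, 64, 72}


def bank_widths(lshw_text):
    """Return {bank_name: width_in_bits} for each bank that reports a width."""
    widths = {}
    bank = None
    for raw in lshw_text.splitlines():
        line = raw.strip()
        header = re.match(r"\*-(bank:\d+)$", line)
        if header:
            bank = header.group(1)
            continue
        if line.startswith("*-"):
            # Any other node (e.g. *-memory) ends the current bank.
            bank = None
            continue
        width = re.match(r"width:\s*(\d+)\s*bits$", line)
        if width and bank is not None:
            widths[bank] = int(width.group(1))
    return widths


def suspicious_banks(lshw_text):
    """Return only the banks whose reported width is not in PLAUSIBLE_WIDTHS."""
    return {
        name: bits
        for name, bits in bank_widths(lshw_text).items()
        if bits not in PLAUSIBLE_WIDTHS
    }
```

Fed the memory section posted above, `suspicious_banks()` would list the three populated banks with a width of 2244 and skip the empty one, since it reports no width at all.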

Offline

#5 2011-02-23 21:38:05

Tempel
Member
Registered: 2010-01-19
Posts: 10

Re: Intermittent system freeze: no evidence in logs

I've been getting the same problem (and it sounds like they might have it in another thread too).  I can't replicate it - it often happens when I'm away from the computer - and no logs show anything interesting; just ordinary MARKs that abruptly end.  For me, lshw is giving a width of 64 bits, so perhaps a large width is not the problem.  The other thread speculated on nvidia graphics drivers being the problem; I have those drivers, but it looks like you have Intel graphics, so maybe graphics drivers aren't the problem either.  All I can be certain of is that it started after upgrading to kernel 2.6.37.

Offline

#6 2011-02-24 15:21:53

nycthbris
Member
Registered: 2007-10-23
Posts: 10

Re: Intermittent system freeze: no evidence in logs

I'm not sure what's up with my memory width; it's the same (2244 bits) when running from a live CD too. I've updated my BIOS but the discrepancy remains.

I actually have an nVidia card (GeForce 9400 GT) and have always used their proprietary drivers. I don't think my video card is the problem, but I could give the other drivers a shot.

Offline

#7 2011-02-26 10:28:53

Devotedfollower
Member
Registered: 2010-09-26
Posts: 48

Re: Intermittent system freeze: no evidence in logs

Try their beta drivers... I was having similar freeze issues. Nothing was in the logs, and this fixed the issue for me. HTH

DF

Offline

#8 2011-03-26 18:59:05

Tempel
Member
Registered: 2010-01-19
Posts: 10

Re: Intermittent system freeze: no evidence in logs

I've become convinced the problem is with kernel 2.6.37.  First I tried switching to nouveau drivers; then I tried using no video drivers and just working from console.  In each case, I still experienced random lockups.  In these cases, I saw a long error trace on lockup, but still no evidence in logs, so I think it's the same thing.

This morning, I upgraded to kernel 2.6.37.5-1 and nvidia 270.30-3, and still got a lockup.  I'm now trying kernel26-lts along with nvidia-lts, and haven't seen a lockup yet (I don't know why I didn't think of this earlier).  So I have to conclude that the kernel itself is to blame.

Offline

#9 2011-03-27 03:26:41

mrunion
Member
From: Jonesborough, TN
Registered: 2007-01-26
Posts: 1,938

Re: Intermittent system freeze: no evidence in logs

Do you use your wireless interface? My money's on that. I have had trouble with mine since 2.6.37. Try running on wired for a while and see if it goes away.


Matt

"It is very difficult to educate the educated."

Offline

#10 2011-03-27 04:05:11

Tempel
Member
Registered: 2010-01-19
Posts: 10

Re: Intermittent system freeze: no evidence in logs

I do use wireless; wired, unfortunately, isn't an option.  But I'll keep that in mind, thanks.  Maybe I can avoid crashes just by unloading the wireless driver (rt2500usb).

EDIT: I managed to get 48 hours of uptime by only intermittently connecting to wireless.  Then I got greedy and left it connected for too long and it froze up.  So this must be my problem; now, how to fix it...
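
To check whether freezes really line up with the wireless link being up, the link state can be recorded alongside a timestamp and synced to disk each time. This is only a sketch: it reads the `operstate` file that Linux exposes under `/sys/class/net/<iface>/`, and the interface name (for example `wlan0`) and the log path are placeholders to replace with your own.

```python
import os
import time


def read_operstate(iface, sysfs_root="/sys/class/net"):
    """Return the kernel's link state for iface ('up', 'down', ...) or 'absent'."""
    try:
        with open(os.path.join(sysfs_root, iface, "operstate")) as f:
            return f.read().strip()
    except OSError:
        # The interface directory is missing, e.g. the driver is unloaded.
        return "absent"


def log_link_state(iface, log_path, sysfs_root="/sys/class/net"):
    """Append 'timestamp iface state' to log_path and sync it to disk."""
    state = read_operstate(iface, sysfs_root)
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    with open(log_path, "a") as log:
        log.write("%s %s %s\n" % (stamp, iface, state))
        log.flush()
        os.fsync(log.fileno())  # keep the line even if the box freezes next
    return state
```

Calling `log_link_state()` once a minute from a loop or a cron job leaves a record whose last line before a freeze shows whether the interface was up at that moment.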

Last edited by Tempel (2011-03-31 15:32:37)

Offline
