#1 2010-11-21 16:39:29

GalacticArachnid
Member
Registered: 2009-01-02
Posts: 155
Website

Serious lack of stability

Okay, so I have a custom-built desktop (hardware specs below) running Arch 64 for a friend of mine (will try to get him on the forums).

I'm not sure it can get through an entire day without freezing. There seem to be some serious stability issues with this thing and I can't figure out where I have gone wrong with the build! The graphics freeze, at random intervals, with either a single-colour screen (random colour) or a 'no signal'. It's actually worse than Win95! ;D

Any ideas on what's causing the problem or on how to debug this thing? It kills sshd every time it freezes, so I can't log in remotely.

Cheers

Spec:

CPU: AMD Phenom II X6 1055T
Mobo: ASUS M4N72-E nForce 750a
GPU: Nvidia GeForce 8800 GT 512 MB

Last edited by GalacticArachnid (2010-11-21 16:44:59)

Offline

#2 2010-11-21 17:01:36

tomk
Forum Fellow
From: Ireland
Registered: 2004-07-21
Posts: 9,839

Re: Serious lack of stability

Tail the logs via ssh and check the most recent entries when it freezes.
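
A minimal sketch of that, assuming the syslog file is /var/log/messages.log and using a placeholder host name (both are assumptions, adjust to your setup). The output is saved on the machine you are watching from, so the last lines should survive even if sshd dies along with the box:

```shell
# Keep only kernel lines that mention the GPU drivers (nouveau, nvidia/NVRM, drm).
# --line-buffered makes grep pass each line on immediately, which matters
# when it is fed by a long-running `tail -f`.
gpu_lines() {
    grep --line-buffered -E 'kernel: .*(nouveau|nvidia|NVRM|drm)'
}

# Usage (hypothetical user/host and log path):
#   ssh user@desktop 'tail -f /var/log/messages.log' | gpu_lines | tee freeze.log
```

Dropping the gpu_lines stage shows everything, which may be the safer choice until you know the GPU is the culprit.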

Offline

#3 2010-11-24 01:20:01

GalacticArachnid
Member
Registered: 2009-01-02
Posts: 155
Website

Re: Serious lack of stability

Another crash, here's the juicy part of the log:

Nov 24 00:55:01 localhost -- MARK --
Nov 24 01:11:36 localhost kernel: [drm] nouveau 0000:02:00.0: PFIFO_DMA_PUSHER - Ch 2
Nov 24 01:11:36 localhost kernel: [drm] nouveau 0000:02:00.0: PGRAPH_DATA_ERROR - Ch 2/5 Class 0x8297 Mthd 0x1360 Data 0x00000000:0x000cb55c
Nov 24 01:11:36 localhost kernel: [drm] nouveau 0000:02:00.0: PGRAPH_DATA_ERROR - INVALID_ENUM

top shows X at 100% CPU usage.

Last edited by GalacticArachnid (2010-11-24 01:22:07)

Offline

#4 2010-11-24 01:23:27

graysky
Wiki Maintainer
From: :wq
Registered: 2008-12-01
Posts: 10,595
Website

Re: Serious lack of stability

Remove nouveau and try the nvidia package...


CPU-optimized Linux-ck packages @ Repo-ck • AUR packages • Zsh and other configs

Offline

#5 2010-11-24 01:24:44

GalacticArachnid
Member
Registered: 2009-01-02
Posts: 155
Website

Re: Serious lack of stability

Already have, but the system then crashed as soon as X was started with nvidia.

Last edited by GalacticArachnid (2010-11-24 01:25:03)

Offline

#6 2010-11-24 01:35:51

graysky
Wiki Maintainer
From: :wq
Registered: 2008-12-01
Posts: 10,595
Website

Re: Serious lack of stability

Sounds like a hardware issue... does the 8800 have its own power connector hooked up, by chance?  Is the card firmly seated in the slot?

Last edited by graysky (2010-11-24 01:37:03)

Offline

#7 2010-11-24 01:38:07

GalacticArachnid
Member
Registered: 2009-01-02
Posts: 155
Website

Re: Serious lack of stability

Yeah, the connector is seated fine and the card is snugly in the PCI slot xD

Offline

#8 2010-11-24 01:56:09

GalacticArachnid
Member
Registered: 2009-01-02
Posts: 155
Website

Re: Serious lack of stability

Crash number two, about 5 minutes after starting the PC and logging in:

Nov 24 01:47:30 localhost kernel: NET: Registered protocol family 10
Nov 24 01:47:30 localhost kernel: lo: Disabled Privacy Extensions
Nov 24 01:47:31 localhost kernel: [drm] nouveau 0000:02:00.0: Allocating FIFO number 2
Nov 24 01:47:31 localhost kernel: [drm] nouveau 0000:02:00.0: nouveau_channel_alloc: initialised FIFO 2
Nov 24 01:47:31 localhost kernel: [drm] nouveau 0000:02:00.0: Allocating FIFO number 3
Nov 24 01:47:31 localhost kernel: [drm] nouveau 0000:02:00.0: nouveau_channel_alloc: initialised FIFO 3
Nov 24 01:47:41 localhost kernel: eth0: no IPv6 routers present
Nov 24 01:50:25 localhost kernel: [drm] nouveau 0000:02:00.0: PFIFO_DMA_PUSHER - Ch 2

Last edited by GalacticArachnid (2010-11-24 01:58:41)

Offline

#9 2010-11-24 02:18:57

AngryKoala
Member
Registered: 2009-01-22
Posts: 197

Re: Serious lack of stability

Hmm, I had an 8800 GT 512 MB for a long time and it ran perfectly.  When you installed the nvidia driver you ran nvidia-xconfig, right?

Offline

#10 2010-11-24 02:19:26

graysky
Wiki Maintainer
From: :wq
Registered: 2008-12-01
Posts: 10,595
Website

Re: Serious lack of stability

What if you boot into runlevel 3?  Stable?  Again, could be a hardware issue.

Offline

#11 2010-11-24 02:25:52

GalacticArachnid
Member
Registered: 2009-01-02
Posts: 155
Website

Re: Serious lack of stability

AngryKoala: Yeah, configured it fine. Followed the wiki and tried a couple of methods, including manual configuration and just running nvidia-xconfig.

graysky: all crashes occur at rl3; both log extracts up there were with X running.

Last edited by GalacticArachnid (2010-11-24 02:26:29)

Offline

#12 2010-11-24 03:17:15

Fruity
Member
Registered: 2009-12-16
Posts: 198

Re: Serious lack of stability

Just a thought: do you have the right voltage going to the memory modules? Check the sticker on the side of one of them, then check the BIOS.

Offline

#13 2010-11-24 03:21:11

jowilkin
Member
Registered: 2009-05-07
Posts: 243

Re: Serious lack of stability

GalacticArachnid wrote:

AngryKoala: Yeah, configured it fine. Followed the wiki and tried a couple of methods, including manual configuration and just running nvidia-xconfig.

graysky: all crashes occur at rl3; both log extracts up there were with X running.

With X running you are at runlevel 5, not 3.  He is suggesting you try runlevel 3, which has full functionality but no X server.  If the system is stable there, the problem is likely with the graphics hardware/driver, which definitely seems to be the case from the logs.

When you try to run with the nvidia driver, what turns up in the logs when you try to start X?  I wouldn't trust the nouveau driver at all personally, and the nvidia driver is generally quite good.
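
To double-check which runlevel the box is actually in, sysvinit's runlevel command prints the previous and current level (for example "N 5"). A small sketch that interprets that output; the helper name is made up for illustration, and the 3 = text / 5 = graphical mapping assumes the stock Arch inittab:

```shell
# Reads the output of `runlevel` on stdin ("<previous> <current>")
# and describes the current runlevel.
describe_runlevel() {
    read -r _prev cur
    case "$cur" in
        3) echo "runlevel 3: multi-user, no X from init" ;;
        5) echo "runlevel 5: graphical login (X)" ;;
        *) echo "runlevel $cur" ;;
    esac
}

# Usage: runlevel | describe_runlevel
```

Note that X started by hand with startx from runlevel 3 would still be reported as runlevel 3, so this only tells you what init was asked to do.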

Offline

#14 2010-11-24 03:28:10

TigTex
Member
From: Portugal
Registered: 2008-06-19
Posts: 301

Re: Serious lack of stability

This is the third topic today about an NVIDIA 8000-series card and X problems...
I'll ask the others with problems the same thing.

Try to find out what your GPU model is. It should be a G84 or a G92.
If it's a G84 or G86, it's most likely faulty and you need a new graphics card...
Edit: is that a Gigabyte 8800 GT? If it is, I had a friend with a similar problem (and that's a G92). Upgrade your video card BIOS to the latest version; it solves the problem.

If it's not the GPU, I would run memtest86 and check your system temperatures.

Last edited by TigTex (2010-11-24 03:34:22)


.::. TigTex @ Portugal .::.

Offline

#15 2010-11-24 06:35:12

lagagnon
Member
From: an Island in the Pacific...
Registered: 2009-12-10
Posts: 1,087
Website

Re: Serious lack of stability

Try running your system from a live Linux CD, or boot a live Linux from USB. If you still get freezing then you have a hardware fault and it's probably nothing to do with the video module. If you don't get the freezing then start looking at the software systems themselves.


Philosophy is looking for a black cat in a dark room. Metaphysics is looking for a black cat in a dark room that isn't there. Religion is looking for a black cat in a dark room that isn't there and shouting "I found it!". Science is looking for a black cat in a dark room with a flashlight.

Offline

#16 2010-11-28 16:56:12

GalacticArachnid
Member
Registered: 2009-01-02
Posts: 155
Website

Re: Serious lack of stability

My apologies, it is in rl5.

rl3 is stable, with no crashes so far, and so is the live CD. It only happens when running graphics off the local install.

P.S. memtest didn't reveal anything; it was run for ~30 min (don't know if that's enough).

Last edited by GalacticArachnid (2010-11-28 16:59:10)

Offline

#17 2010-11-28 17:16:11

gtklocker
Member
Registered: 2009-09-01
Posts: 462

Re: Serious lack of stability

Go to your /etc/mkinitcpio.conf, make nouveau shut up.

Add !nouveau to MODULES, then run:

pacman -S kernel26 nvidia; nvidia-xconfig
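
After rebooting, it may be worth confirming which driver module actually got loaded. A small sketch; the helper name is made up for illustration, and it only relies on lsmod printing the module name in the first column:

```shell
# Reads `lsmod` output on stdin and prints nouveau and/or nvidia
# if they appear as loaded modules (first column), nothing otherwise.
loaded_gpu_modules() {
    awk '$1 == "nouveau" || $1 == "nvidia" { print $1 }'
}

# Usage: lsmod | loaded_gpu_modules
```

If this still prints nouveau after the change, the blacklist has not taken effect yet.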

Offline

#18 2010-11-28 17:16:42

Pati_boi
Member
Registered: 2010-11-23
Posts: 2

Re: Serious lack of stability

Hi, it's my machine GalacticArachnid is posting about.
Many months ago, I think around Easter, I had the nvidia drivers working fine and all was well, but following a near-complete system rebuild the problems with nvidia and Linux began. We have tried several distros (Ubuntu, Debian, Arch) to see if that made a difference, and lots of the solutions on the wiki to get nvidia working, to no avail. For now we have left that problem aside, as months of trying to fix it have sent us (me) a little insane, so we have stuck with the nouveau driver. That is to say, the card is in full working order and has previously worked fine with Linux.

Offline

#19 2010-11-28 17:32:21

GalacticArachnid
Member
Registered: 2009-01-02
Posts: 155
Website

Re: Serious lack of stability

Oooh, one more crash, juicy log this time (not that I can make that much sense of it):

Nov 28 17:08:31 localhost -- MARK --
Nov 28 17:23:15 localhost kernel: operapluginwrap[3680]: segfault at 7dd22d8c ip 000000007dd22d8c sp 000000007dd22d8c error 14
Nov 28 17:24:12 localhost kernel: [drm] nouveau 0000:02:00.0: Allocating FIFO number 4
Nov 28 17:24:12 localhost kernel: [drm] nouveau 0000:02:00.0: nouveau_channel_alloc: initialised FIFO 4
Nov 28 17:25:53 localhost kernel: [drm] nouveau 0000:02:00.0: PGRAPH_TRAP - Ch 2/5 Class 0x8297 Mthd 0x0f04 Data 0x00000000:0x00000000
Nov 28 17:25:53 localhost kernel: [drm] nouveau 0000:02:00.0: PGRAPH_TRAP_CCACHE_FAULT - VM: Trapped read at 0040722000 status 00000560 00000000 channel 2
Nov 28 17:25:53 localhost kernel: [drm] nouveau 0000:02:00.0: PGRAPH_TRAP_CCACHE_FAULT - 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Nov 28 17:25:56 localhost kernel: [drm] nouveau 0000:02:00.0: fail wait other chan
Nov 28 17:25:56 localhost kernel: [drm] nouveau 0000:02:00.0: validate vram_list
Nov 28 17:25:56 localhost kernel: [drm] nouveau 0000:02:00.0: validate: -16
Nov 28 17:25:59 localhost kernel: [drm] nouveau 0000:02:00.0: fail wait other chan
Nov 28 17:25:59 localhost kernel: [drm] nouveau 0000:02:00.0: validate vram_list
Nov 28 17:25:59 localhost kernel: [drm] nouveau 0000:02:00.0: validate: -16

Also, btw:

VGA compatible controller: nVidia Corporation G92 [GeForce 8800 GT] (rev a2)
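
For a quick overview of which nouveau engine errors show up and how often, the log can be summarised with standard tools. A rough sketch; the function name is hypothetical, the log path in the usage line is an assumption, and the pattern only covers the PGRAPH_* and PFIFO_* names seen in this thread:

```shell
# Reads syslog lines on stdin and prints "<count> <error name>",
# most frequent first.
nouveau_error_counts() {
    grep -oE 'P(GRAPH|FIFO)_[A-Z_]+' \
        | sort \
        | uniq -c \
        | sort -rn \
        | awk '{ print $1, $2 }'
}

# Usage (assumed log path):
#   nouveau_error_counts < /var/log/messages.log
```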

Last edited by GalacticArachnid (2010-11-28 17:35:08)

Offline

#20 2010-11-28 18:45:57

GalacticArachnid
Member
Registered: 2009-01-02
Posts: 155
Website

Re: Serious lack of stability

Right, so we tried using nvidia. Consistent crashing every time (apparently resizing a Flash video from fullscreen back to normal really aggravates it). Here are the two log extracts.

Crash 1:

Nov 28 18:24:04 localhost load-modules.sh: 'wmi:05901221-D566-11D1-B2F0-00A0C9062910' is not a valid module or alias name
Nov 28 18:24:04 localhost load-modules.sh: 'acpi:device:' is not a valid module or alias name
Nov 28 18:24:04 localhost load-modules.sh: 'acpi:device:' is not a valid module or alias name
Nov 28 18:24:04 localhost load-modules.sh: 'acpi:device:' is not a valid module or alias name
Nov 28 18:24:04 localhost load-modules.sh: 'platform:regulatory' is not a valid module or alias name
Nov 28 18:24:04 localhost load-modules.sh: 'platform:regulatory' is not a valid module or alias name
Nov 28 18:24:05 localhost load-modules.sh: Not loading module 'nvidia' for alias 'pci:v000010DEd00000611sv00001682sd00002333bc03sc00i00' because it is blacklisted
Nov 28 18:24:20 localhost init: Entering runlevel: 3
Nov 28 18:24:25 localhost dhcpcd[3463]: eth0: leased 192.168.1.64 for 86400 seconds
Nov 28 18:24:25 localhost dhcpcd[3463]: forked to background, child pid 3496
Nov 28 18:28:36 localhost kernel: NET: Registered protocol family 10
Nov 28 18:28:36 localhost kernel: lo: Disabled Privacy Extensions
Nov 28 18:28:37 localhost kernel: nvidia: module license 'NVIDIA' taints kernel.
Nov 28 18:28:37 localhost kernel: Disabling lock debugging due to kernel taint
Nov 28 18:28:37 localhost kernel: nvidia 0000:02:00.0: PCI INT A -> Link[LN0A] -> GSI 19 (level, low) -> IRQ 19
Nov 28 18:28:37 localhost kernel: nvidia 0000:02:00.0: setting latency timer to 64
Nov 28 18:28:37 localhost kernel: vgaarb: device changed decodes: PCI:0000:02:00.0,olddecodes=io+mem,decodes=none:owns=io+mem
Nov 28 18:28:37 localhost kernel: NVRM: loading NVIDIA UNIX x86_64 Kernel Module  260.19.21  Thu Nov  4 21:16:27 PDT 2010
Nov 28 18:28:46 localhost kernel: eth0: no IPv6 routers present

crash 2:

Nov 28 18:34:00 localhost load-modules.sh: 'wmi:A1799AF2-9429-4529-927E-DFE13736EEBA' is not a valid module or alias name
Nov 28 18:34:00 localhost load-modules.sh: 'acpi:device:' is not a valid module or alias name
Nov 28 18:34:00 localhost load-modules.sh: 'acpi:device:' is not a valid module or alias name
Nov 28 18:34:00 localhost load-modules.sh: 'acpi:device:' is not a valid module or alias name
Nov 28 18:34:01 localhost load-modules.sh: Not loading module 'nvidia' for alias 'pci:v000010DEd00000611sv00001682sd00002333bc03sc00i00' because it is blacklisted
Nov 28 18:34:01 localhost load-modules.sh: 'platform:regulatory' is not a valid module or alias name
Nov 28 18:34:01 localhost load-modules.sh: 'platform:regulatory' is not a valid module or alias name
Nov 28 18:34:16 localhost init: Entering runlevel: 3
Nov 28 18:34:22 localhost dhcpcd[3462]: eth0: leased 192.168.1.64 for 86400 seconds
Nov 28 18:34:22 localhost dhcpcd[3462]: forked to background, child pid 3495
Nov 28 18:37:55 localhost kernel: NET: Registered protocol family 10
Nov 28 18:37:55 localhost kernel: lo: Disabled Privacy Extensions
Nov 28 18:37:56 localhost kernel: nvidia: module license 'NVIDIA' taints kernel.
Nov 28 18:37:56 localhost kernel: Disabling lock debugging due to kernel taint
Nov 28 18:37:56 localhost kernel: nvidia 0000:02:00.0: PCI INT A -> Link[LN0A] -> GSI 19 (level, low) -> IRQ 19
Nov 28 18:37:56 localhost kernel: nvidia 0000:02:00.0: setting latency timer to 64
Nov 28 18:37:56 localhost kernel: vgaarb: device changed decodes: PCI:0000:02:00.0,olddecodes=io+mem,decodes=none:owns=io+mem
Nov 28 18:37:56 localhost kernel: NVRM: loading NVIDIA UNIX x86_64 Kernel Module  260.19.21  Thu Nov  4 21:16:27 PDT 2010
Nov 28 18:38:05 localhost kernel: eth0: no IPv6 routers present

nouveau is blacklisted and nvidia is configured via nvidia-xconfig. Had to force it to use 96x96 dpi though >|

Last edited by GalacticArachnid (2010-11-28 18:47:54)

Offline

#21 2010-11-28 18:47:28

alfadude
Member
Registered: 2007-05-17
Posts: 4

Re: Serious lack of stability

From that log you can see that Nouveau is still loading.
The solution is already posted.

Go to your /etc/mkinitcpio.conf, make nouveau shut up.
add !nouveau to MODULES,

Edit: You already did while I was typing.

Last edited by alfadude (2010-11-28 18:49:11)

Offline

#22 2010-12-07 19:08:30

Pati_boi
Member
Registered: 2010-11-23
Posts: 2

Re: Serious lack of stability

So, any more ideas, guys?
Also, did I mention that with nvidia installed the crash is very different from the crashes with nouveau? Nvidia crashes the system almost always within about 30 seconds of X starting, and the crash shuts down the whole system instantly. With nouveau, only the screen loses signal, and the crashes are far more unpredictable, sometimes not occurring for days.

Offline
