I do not game or over-clock. System has excellent airflow... I don't even use very much 3D.
In the past 2 months I have had two graphics cards fail (XFX GeForce 8600 GT and BFG 9800 GT).
At first my monitors would randomly go to sleep while I was using the console (NOT running the X server). I
tried everything to wake them up, but nothing worked except a hard reboot. When I then started the X server,
the fan ran at 100% for a couple of minutes before quieting down.
Kernel log:
Jun 18 16:43:48 ArchMain kernel: NVRM: os_map_kernel_space: won't map address 0x0 UC!
Jun 18 16:43:48 ArchMain kernel: NVRM: RmInitAdapter failed! (0x26:0xffffffff:1076)
Jun 18 16:43:48 ArchMain kernel: NVRM: rm_init_adapter(0) failed
After about a month, the screen started shaking and red and green dots started flashing at random. The card was dead.
Same story for the second card.
It took me a little while to figure out what was happening... My NVIDIA cards were overheating when X was not
running. They would heat up to 90-100C and then power off the monitors. When I started X, the fan would run at 100%
until the card cooled down.
From what I see, the problem is caused by a couple things:
1. The GPU fan does not seem to activate when the X server is not running.
2. The GPU seems to run at a max power state when the X server is not running.
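I do not start X (no kdm, no gdm, no xdm...) on boot. I use agetty as my login manager and have added the following to my .bashrc:
# Start X automatically, but only on tty1 and only when DISPLAY is unset.
if [ -z "$DISPLAY" ] && [ "$(tty)" = /dev/tty1 ]; then
    # exec replaces the login shell with startx.
    exec /usr/bin/startx &>/dev/null
    # Reached only if exec itself fails.
    clear
    exit
fi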
I believe the combination of the problem stated above and the fact that I do not use a graphical login manager (leaving
my computer on for many hours with no X server running) caused the cards to overheat and quickly fail.
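For anyone who wants to watch for this while sitting at the console, here is a rough sketch of a temperature check. It is only an illustration under assumptions: it reads the temperature with `nvidia-smi --query-gpu=temperature.gpu`, which recent NVIDIA drivers provide but the 256.35 driver mentioned here may not, and the 80C limit and the function names are made up for the example.

```shell
#!/usr/bin/env bash
# Sketch only: warn when the GPU temperature crosses a threshold.
# ASSUMPTION: nvidia-smi supports --query-gpu (true for recent drivers,
# possibly not for 256.35). Function names here are hypothetical.

# Hypothetical limit in degrees C, well below the 90-100C reported above.
GPU_TEMP_LIMIT=${GPU_TEMP_LIMIT:-80}

# Print the temperature of the first GPU as a bare integer (empty on failure).
read_gpu_temp() {
    nvidia-smi --query-gpu=temperature.gpu --format=csv,noheader,nounits 2>/dev/null | head -n 1
}

# Return 0 if temp >= limit, 1 if below, 2 if temp is not a number.
is_too_hot() {
    local temp=$1 limit=${2:-$GPU_TEMP_LIMIT}
    [[ $temp =~ ^[0-9]+$ ]] || return 2
    (( 10#$temp >= 10#$limit ))
}

# One check; meant to be called from a loop or a cron job.
# Returns 0 if fine, 1 if too hot, 2 if the temperature could not be read.
check_gpu_once() {
    local temp rc=0
    temp=$(read_gpu_temp)
    is_too_hot "$temp" || rc=$?
    case $rc in
        0) echo "WARNING: GPU at ${temp}C (limit ${GPU_TEMP_LIMIT}C)" >&2; return 1 ;;
        1) return 0 ;;
        *) echo "Could not read GPU temperature" >&2; return 2 ;;
    esac
}
```

Something like `while sleep 60; do check_gpu_once; done` on a spare tty would at least give a warning before the monitors power off. It does not fix the fan or power state problem itself.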
Has anybody else experienced this issue?
Kernel 2.6.34 (x86_64), NVIDIA 256.35
Last edited by cactus.ed (2010-07-07 07:43:40)
cactus.ed wrote:
From what I see, the problem is caused by a couple things:
1. The GPU fan does not seem to activate when the X server is not running.
2. The GPU seems to run at a max power state when the X server is not running.
Loading the nvidia driver even when not using X may fix this.
Maybe nvclock could help too (http://www.linuxhardware.org/nvclock/).
...
Low-level Overclocking for all Nvidia cards except for the riva128/riva128zx
Additional Coolbits overclocking for GeforceFX/6/7/8 (desktop) cards
Hardware monitoring (including temperature reading, fanspeed adjustments)
...
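To confirm whether the driver is actually loaded while sitting at the console, a small check like the one below may help. It is a sketch that assumes the kernel module is named `nvidia`; the `module_loaded` helper is hypothetical and simply reads /proc/modules.

```shell
#!/usr/bin/env bash
# Sketch: check whether a kernel module is loaded by reading /proc/modules.
# ASSUMPTION: the NVIDIA kernel module is named "nvidia".

# Usage: module_loaded NAME [MODULES_FILE]
# Returns 0 if NAME appears as the first field of any line, 1 otherwise.
module_loaded() {
    local name=$1 modules_file=${2:-/proc/modules}
    # /proc/modules lists one module per line; the first field is its name.
    awk -v m="$name" '$1 == m { found = 1 } END { exit !found }' "$modules_file"
}
```

For example, `module_loaded nvidia || echo "nvidia module not loaded"`. If it is not loaded, `modprobe nvidia` as root should load it.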
cactus.ed wrote:
From what I see, the problem is caused by a couple things:
1. The GPU fan does not seem to activate when the X server is not running.
2. The GPU seems to run at a max power state when the X server is not running.
Loading nvidia driver even when not using X may fix this.
Maybe nvclock could help too (http://www.linuxhardware.org/nvclock/).
Nvclock Features wrote:
...
Low-level Overclocking for all Nvidia cards except for the riva128/riva128zx
Additional Coolbits overclocking for GeforceFX/6/7/8 (desktop) cards
Hardware monitoring (including temperature reading, fanspeed adjustments)
...
The NVIDIA module is loaded at boot... I tried nvclock, but it does not seem to work for fan speed on my card(s), even with the force option.
Last edited by cactus.ed (2010-07-07 17:50:51)