I'm trying to figure out if my video card is dying, or if it is something else. The available data is:
VLC shows corrupted output like this when playing any video. A reboot usually fixes it: http://imgur.com/Ssll3OE
The same happens when starting Steam: the window is completely black, with no errors when launching from the CLI: http://imgur.com/nLyNjV2
This random error that was in my logs:
localhost kernel: [ 9464.881584] chromium: segfault at 7fdf296d7768 ip 00007fd4a218a657 sp 00007fff2055c760 error 4 in libnvidia-glcore.so.313.30[7fd4a0d80000+1a1b000]
...
Apr 10 17:05:28 localhost kernel: [48185.912695] CrPPAPIMain: segfault at 8 ip 00007f239e3693c9 sp 00007fff5b07ef20 error 4 in chromium[7f239bc28000+5708000]
Apr 10 17:40:12 localhost kernel: [50269.941280] traps: CrPPAPIMain general protection ip:7f23946c96a7 sp:7fff5b07edb0 error:0 in libc-2.17.so[7f2394693000+1a3000]
Apr 10 17:57:36 localhost kernel: [51313.900677] traps: CrPPAPIMain general protection ip:7f23946c96a7 sp:7f2386779910 error:0 in libc-2.17.so[7f2394693000+1a3000]
Apr 10 18:18:18 localhost kernel: [52555.956561] CrPPAPIMain: segfault at 8 ip 00007f239e3693c9 sp 00007fff5b07b180 error 4 in chromium[7f239bc28000+5708000]
Apr 10 18:24:39 localhost kernel: [52937.476736] CrPPAPIMain: segfault at 8 ip 00007f239e3693c9 sp 00007fff5b07ef20 error 4 in chromium[7f239bc28000+5708000]
Apr 10 18:38:21 localhost kernel: [53759.096515] traps: CrPPAPIMain general protection ip:7f23946c96a7 sp:7fff5b07ecb0 error:0 in libc-2.17.so[7f2394693000+1a3000]
I think this was after I was playing TF2 and it crashed. Then I tried to watch some Flash videos, but they would get stuck.
And when I was playing TF2 for Linux, it would crash and I would get this:
localhost kernel: [47177.061986]
[ 9871.518115] hrtimer: interrupt took 7822 ns
[10085.639206] NVRM: GPU at 0000:01:00: GPU-e6f5f204-accd-9cc5-e940-c2a16b849f87
[10085.639210] NVRM: Xid (0000:01:00): 56, CMDre 00000001 00000094 bfef0c11 00000007 00000000
[10354.300516] NVRM: Xid (0000:01:00): 56, CMDre 00000001 000000c0 bfef0f00 00000004 00000084
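To see whether the same Xid code keeps coming back across crashes, the NVRM lines can be pulled out of the kernel log and counted. This is only a rough sketch: the names `find_xid_events` and `count_xid_codes` are made up for illustration, and the regular expression assumes the line layout shown in the log above. What a given Xid code means still has to be looked up in NVIDIA's documentation.

```python
import re
from collections import Counter

# Assumed layout, taken from the log lines above:
#   [10085.639210] NVRM: Xid (0000:01:00): 56, CMDre 00000001 ...
XID_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s+NVRM: Xid \(([0-9a-fA-F:.]+)\): (\d+),")


def find_xid_events(log_text):
    """Return (uptime_seconds, pci_address, xid_code) for every Xid entry."""
    return [(float(t), addr, int(code)) for t, addr, code in XID_RE.findall(log_text)]


def count_xid_codes(log_text):
    """Count how often each Xid code appears in the log text."""
    return Counter(code for _, _, code in find_xid_events(log_text))


# Sample input copied from the kernel log quoted above.
SAMPLE = """\
[ 9871.518115] hrtimer: interrupt took 7822 ns
[10085.639206] NVRM: GPU at 0000:01:00: GPU-e6f5f204-accd-9cc5-e940-c2a16b849f87
[10085.639210] NVRM: Xid (0000:01:00): 56, CMDre 00000001 00000094 bfef0c11 00000007 00000000
[10354.300516] NVRM: Xid (0000:01:00): 56, CMDre 00000001 000000c0 bfef0f00 00000004 00000084
"""

if __name__ == "__main__":
    for uptime, addr, code in find_xid_events(SAMPLE):
        print(f"{uptime:>12.3f}s  GPU {addr}  Xid {code}")
    print(count_xid_codes(SAMPLE))
```

In practice the text would come from the kernel log itself (for example the output of `dmesg` saved to a file) rather than from a pasted sample.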
I don't think this is related, but I'm including it in case it points to a CPU or RAM issue:
[184765.787187] traps: gpg trap divide error ip:7f50623bcccb sp:7fffd1ab6f00 error:0 in libgcrypt.so.11.8.0[7f506236a000+7a000]
[184844.924595] traps: gpg trap divide error ip:7fde44ecbccb sp:7fff9b912240 error:0 in libgcrypt.so.11.8.0[7fde44e79000+7a000]
[184884.879213] traps: gpg trap divide error ip:7f229dd54ccb sp:7fff0cd817c0 error:0 in libgcrypt.so.11.8.0[7f229dd02000+7a000]
I think this was when pacman was getting a signature
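One way to get an overview of scattered crashes like these is to group the kernel fault lines by process and by the module they faulted in. If everything lands in the NVIDIA library, that suggests the driver or GPU; if unrelated libraries such as libc and libgcrypt show up too, RAM or CPU becomes more plausible. The sketch below is only an illustration: `tally_faults` is a made-up name, and the two patterns cover just the line formats that appear in this post.

```python
import re
from collections import Counter

# "<proc>: segfault at <addr> ip <ip> sp <sp> error <n> in <module>[...]"
SEGFAULT_RE = re.compile(
    r"(\S+): segfault at \S+ ip \S+ sp \S+ error \d+ in (\S+?)\["
)
# "traps: <proc> <kind> ip:<ip> sp:<sp> error:<n> in <module>[...]"
TRAP_RE = re.compile(
    r"traps: (\S+) (general protection|trap divide error) "
    r"ip:\S+ sp:\S+ error:\d+ in (\S+?)\["
)


def tally_faults(log_text):
    """Count faults per (process, fault kind, faulting module)."""
    tally = Counter()
    for proc, module in SEGFAULT_RE.findall(log_text):
        tally[(proc, "segfault", module)] += 1
    for proc, kind, module in TRAP_RE.findall(log_text):
        tally[(proc, kind, module)] += 1
    return tally


# A few lines copied from the logs quoted in this post.
SAMPLE = """\
localhost kernel: [ 9464.881584] chromium: segfault at 7fdf296d7768 ip 00007fd4a218a657 sp 00007fff2055c760 error 4 in libnvidia-glcore.so.313.30[7fd4a0d80000+1a1b000]
Apr 10 17:05:28 localhost kernel: [48185.912695] CrPPAPIMain: segfault at 8 ip 00007f239e3693c9 sp 00007fff5b07ef20 error 4 in chromium[7f239bc28000+5708000]
Apr 10 17:40:12 localhost kernel: [50269.941280] traps: CrPPAPIMain general protection ip:7f23946c96a7 sp:7fff5b07edb0 error:0 in libc-2.17.so[7f2394693000+1a3000]
[184765.787187] traps: gpg trap divide error ip:7f50623bcccb sp:7fffd1ab6f00 error:0 in libgcrypt.so.11.8.0[7f506236a000+7a000]
"""

if __name__ == "__main__":
    for (proc, kind, module), n in tally_faults(SAMPLE).most_common():
        print(f"{n:3d}  {proc:<12} {kind:<20} {module}")
```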
My temps seem fine:
CPU Temperature: +34.0°C
MB Temperature: +31.0°C
$ nvidia-settings -q gpucoretemp -t
40
Is it a hardware issue? Could it be a driver issue? Maybe it's not the video card? I don't know...
Last edited by boast (2013-04-13 22:17:36)
Asus M4A785TD-V ;; Phenom II X4 @ 3.9GHz ;; Ripjaws 12GB DDR3-1600 ;; 128GB Samsung 830 ;; MSI GTX460 v2 w/ blob ;; Arch Linux + KDE 4.x
Well, it's really hard to tell if it's a physical GPU problem without thoroughly testing it.
First, did this happen shortly after an update? Could it be a software problem?
Maybe even try swapping from nvidia to nouveau just to see if the proprietary driver is the problem.
Obviously, make a backup so you can swap back later with ease.
I'm going to just guess and say you don't have any spare GPUs or even an integrated GPU to use?
If you do, WONDERFUL! Just pull out the possible glitchy one (replacing with the spare) and go about your business to see if the problem persists.
The next best thing to do is run some GPU benchmark tools to see just what happens under load.
I did a quick search for "Linux GPU benchmark" and found https://unigine.com/products/benchmarks/.
It's a commercial application with a (full featured but up-selling) free version available for Linux.
I have zero experience with the tools but they seem serious about them (there are 4).
Best of all, the company seems to have a solid business model, so I wouldn't mind trying it.
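While a benchmark is running, it may be worth recording the GPU temperature at intervals, since idle readings say little about what happens under load. The sketch below wraps the same `nvidia-settings -q gpucoretemp -t` query quoted in the first post. The names `read_gpu_temp` and `log_temps` are made up for illustration, and it assumes the terse output is a single number on its first non-empty line, as in the reading of 40 shown earlier.

```python
import subprocess
import time


def read_gpu_temp(run=subprocess.run):
    """Return the GPU core temperature in degrees C via nvidia-settings."""
    result = run(
        ["nvidia-settings", "-q", "gpucoretemp", "-t"],
        capture_output=True,
        text=True,
        check=True,
    )
    # Assumption: terse (-t) output puts the value on the first non-empty line.
    lines = [ln.strip() for ln in result.stdout.splitlines() if ln.strip()]
    return int(lines[0])


def log_temps(samples, interval_s=5.0, read=read_gpu_temp, sleep=time.sleep):
    """Take `samples` readings, waiting `interval_s` seconds between them."""
    readings = []
    for i in range(samples):
        readings.append(read())
        if i + 1 < samples:
            sleep(interval_s)
    return readings
```

For example, `log_temps(60, interval_s=5)` started in a second terminal would collect about five minutes of readings while the benchmark runs. A sharp climb just before a crash would point toward cooling rather than the driver.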
If the problem persists through all this, maybe try (if you have ~3GB of memory or more) an Xorg test from a live CD.
Grab the latest image and set up Xorg in the live environment to see if something in your configs/install could be causing the problem.
You said you thought it might be memory. Try MemTest86 (offered on the Arch live CD).
Let us know how all this pans out and we'll see what we can do to help with what the results show.
Last edited by pilotkeller (2013-04-16 14:10:38)
I would also test from a live CD to see if it occurs there, to try and isolate a software issue.
Life is pleasant. Death is peaceful. It's the transition that's troublesome. Isaac Asimov - / - My Pastebin