#1 2017-07-21 07:30:35

kelnoky
Member
Registered: 2007-11-20
Posts: 134

[Solved] Graphics issues with weird lockup, graphics card dying?

Over the course of the past month or so I have had weird graphics issues. I would sit at my PC working away and suddenly the screen would go black, then grey and stay grey. The music that was playing continues but no input is accepted - for example my keyboard shortcuts that pause/resume music don't work or my opening dmenu and entering my shutdown command yields no result. So I hard rebooted my PC via the power button and this didn't occur again until like a week later, same thing. The same symptoms only with the screen turning into a grid of pink lines also happened once.

Now today this happened three times in a row (shortly after resuming from suspend, then shortly after the reboot, then again shortly after the next reboot) instead of a few weeks apart, so now I am worried. I checked the systemd journal; these are the last two logs:

https://ptpb.pw/EMMO

https://ptpb.pw/pzhf

Both of these logs (as well as one of the older incidents where I also checked the log) have a line like this:

Jul 21 08:34:10 william kernel: NVRM: Xid (PCI:0000:04:00): 79, GPU has fallen off the bus. 

This is followed by some kernel stuff I can't make heads or tails of. Is my graphics card dying? If so, shouldn't my pc still accept my keyboard shortcuts after the graphics card shit the bed? Is it my PSU? I am kinda short on money right now, I would hate to buy a new graphics card to find out that wasn't it.
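
For anyone chasing the same symptom, here is a minimal sketch for pulling the NVIDIA Xid reports out of a kernel log. `xid_lines` and `xid_codes` are made-up helper names, not part of any tool; they only filter text on stdin, so on a live system they could be fed from `journalctl -k -b -1` (kernel messages of the previous boot) or from a saved log file.

```shell
# Hypothetical helper: keep only NVIDIA Xid reports from kernel log text on stdin.
xid_lines() {
    grep -E 'NVRM: Xid' || true   # "|| true": no match is not an error here
}

# Hypothetical helper: print just the numeric Xid code of each report.
xid_codes() {
    sed -n 's/.*NVRM: Xid ([^)]*): \([0-9][0-9]*\),.*/\1/p'
}

# Demo with the line quoted above; on a real system pipe in
# `journalctl -k -b -1` instead of printf.
line='Jul 21 08:34:10 william kernel: NVRM: Xid (PCI:0000:04:00): 79, GPU has fallen off the bus.'
printf '%s\n' "$line" | xid_lines
printf '%s\n' "$line" | xid_codes   # prints 79
```

Running the same filter over several boots (`-b -2`, `-b -3`, ...) shows whether every lockup carries the same Xid code or whether different errors are mixed in.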

Last edited by kelnoky (2017-08-27 10:11:11)

Offline

#2 2017-07-21 07:50:31

WorMzy
Forum Moderator
From: Scotland
Registered: 2010-06-16
Posts: 11,846
Website

In my experience, GPUs falling off the bus is not a sign of hardware failure, but could be a sign that the hardware is not seated properly. Have you tried reseating the GPU? Giving it a clean at the same time may help, as well as a blast of compressed air into the PCI slot on the motherboard.

Mod note: Moving to kernel/hardware.


Sakura:-
Mobo: MSI MAG X570S TORPEDO MAX // Processor: AMD Ryzen 9 5950X @4.9GHz // GFX: AMD Radeon RX 5700 XT // RAM: 32GB (4x 8GB) Corsair DDR4 (@ 3000MHz) // Storage: 1x 3TB HDD, 6x 1TB SSD, 2x 120GB SSD, 1x 275GB M2 SSD

Making lemonade from lemons since 2015.

Offline

#3 2017-07-21 11:57:52

LaurentvdB
Member
Registered: 2017-04-24
Posts: 32

If I read Intel's site correctly, your processor has integrated graphics (the "HD Graphics 4600"). Does your board have a video port for it (e.g. VGA/DVI)? If WorMzy's suggestion does not help and it turns out to be a hardware fault, you could always fall back to the integrated graphics, provided it can connect to the monitor.

Last edited by LaurentvdB (2017-07-21 11:58:08)

Offline

#4 2017-07-21 14:42:55

seth
Member
Registered: 2012-09-03
Posts: 51,056

1. Underpowered (what GPU in particular, model/vendor)
2. Badly seated/"wrong" PCI slot
3. Temperature issue
4. https://wiki.archlinux.org/index.php/GR … ramebuffer - because of

Jul 21 08:33:06 william kernel: NVRM: Your system is not currently configured to drive a VGA console
Jul 21 08:33:06 william kernel: NVRM: on the primary VGA device. The NVIDIA Linux graphics driver
Jul 21 08:33:06 william kernel: NVRM: requires the use of a text-mode VGA console. Use of other console
Jul 21 08:33:06 william kernel: NVRM: drivers including, but not limited to, vesafb, may result in
Jul 21 08:33:06 william kernel: NVRM: corruption and stability problems, and is not supported.

Offline

#5 2017-07-21 15:40:51

kelnoky
Member
Registered: 2007-11-20
Posts: 134

I will definitely reseat the card next time it does this, thanks for the tip - it's been fine for the rest of today at least.

It's not the temperature, unless there is something really weird going on with the card's cooler because it is a lot warmer in here than it was this morning and it's fine atm.

It's a ZOTAC GeForce GTX 760 AMP! Edition card, the PSU is definitely powerful enough - unless it's the PSU that is somehow failing.

I've also changed the framebuffer, thanks.

Offline

#6 2017-07-21 16:16:45

seth
Member
Registered: 2012-09-03
Posts: 51,056

The thing has *two* dedicated 6pin power adapters and draws 185W on full load. I can see where this can fail ;-)
Make sure not to feed both connectors from the same PSU line (e.g. via a Y-splitter) and try to underclock the device (check the power settings in nvidia-settings anyway).
Also try different environments (assuming a GL composited desktop: try openbox behavior and avoid GL clients, including various browsers, resp. turn off HW acceleration)
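
The arithmetic behind that concern can be sketched roughly as follows. The connector limits used here (75 W from the PCIe slot, 75 W per 6-pin plug, 150 W per 8-pin plug) are the usual PCI Express figures; the 185 W draw is the number quoted above, and `gpu_power_budget` is a made-up helper name. It says nothing about whether a particular PSU rail can really deliver that power, which is the open question in this thread.

```shell
# Hypothetical helper: maximum power (in W) a card is allowed to draw,
# given its number of 6-pin and 8-pin auxiliary connectors.
# Assumed limits: 75 W slot, 75 W per 6-pin, 150 W per 8-pin.
gpu_power_budget() {
    six_pin=${1:-0}
    eight_pin=${2:-0}
    echo $(( 75 + 75 * six_pin + 150 * eight_pin ))
}

budget=$(gpu_power_budget 2 0)   # two 6-pin connectors, as on this card
echo "$budget"                   # prints 225
echo $(( budget - 185 ))         # headroom over the quoted 185 W: prints 40
```

With only about 40 W of headroom on paper, a sagging rail or a shared cable for both plugs leaves little margin, which is why separate PSU lines are suggested.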

Offline

#7 2017-07-21 16:39:26

ewaller
Administrator
From: Pasadena, CA
Registered: 2009-07-13
Posts: 19,774

seth wrote:

The thing has *two* dedicated 6pin power adapters and draws 185W on full load. I can see where this can fail ;-)

O_o

Does it use tubes? https://s-media-cache-ak0.pinimg.com/23 … puters.jpg


Nothing is too wonderful to be true, if it be consistent with the laws of nature -- Michael Faraday
Sometimes it is the people no one can imagine anything of who do the things no one can imagine. -- Alan Turing
---
How to Ask Questions the Smart Way

Offline

#8 2017-07-30 11:12:00

kelnoky
Member
Registered: 2007-11-20
Posts: 134

Ok, so after the last time this happened I thoroughly cleaned the inside of my PC, including the graphics card and the PCIe slot, and in the process I of course reseated the card. There were a few days without incident, but just now I had one where the screen turned grey and again no input had any effect. This time, however, the journal doesn't list the GPU falling off the bus, and I still can't make anything of this error message:

https://ptpb.pw/LMUf

What the hell is wrong with my PC? :D

Offline

#9 2017-07-30 11:44:40

ugjka
Member
From: Latvia
Registered: 2014-04-01
Posts: 1,806
Website

Maybe try doing what the error message suggests? (try booting with the "irqpoll" option)

Jul 30 12:57:51 william kernel: irq 16: nobody cared (try booting with the "irqpoll" option)
Jul 30 12:57:51 william kernel: CPU: 0 PID: 0 Comm: swapper/0 Tainted: P           O    4.12.3-1-ARCH #1
Jul 30 12:57:51 william kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./B85M Pro4, BIOS P2.10 07/03/2014
Jul 30 12:57:51 william kernel: Call Trace:
Jul 30 12:57:51 william kernel:  <IRQ>
Jul 30 12:57:51 william kernel:  dump_stack+0x63/0x8d
Jul 30 12:57:51 william kernel:  __report_bad_irq+0x35/0xc0
Jul 30 12:57:51 william kernel:  note_interrupt+0x24b/0x2a0
Jul 30 12:57:51 william kernel:  handle_irq_event_percpu+0x54/0x80
Jul 30 12:57:51 william kernel:  handle_irq_event+0x39/0x60
Jul 30 12:57:51 william kernel:  handle_fasteoi_irq+0x8b/0x150
Jul 30 12:57:51 william kernel:  handle_irq+0x1a/0x30
Jul 30 12:57:51 william kernel:  do_IRQ+0x46/0xd0
Jul 30 12:57:51 william kernel:  common_interrupt+0x89/0x89
Jul 30 12:57:51 william kernel: RIP: 0010:cpuidle_enter_state+0x12b/0x300
Jul 30 12:57:51 william kernel: RSP: 0018:ffffffffaca03dc0 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff8e
Jul 30 12:57:51 william kernel: RAX: ffff8a609dc18c40 RBX: 000021605da49dca RCX: 000000000000001f
Jul 30 12:57:51 william kernel: RDX: 000021605da49dca RSI: ffff8a609dc16458 RDI: 0000000000000000
Jul 30 12:57:51 william kernel: RBP: ffffffffaca03e00 R08: cccccccccccccccd R09: 0000000000000008
Jul 30 12:57:51 william kernel: R10: 0000000000000000 R11: 0000000000000001 R12: ffff8a609dc21200
Jul 30 12:57:51 william kernel: R13: 0000000000000000 R14: 0000000000000001 R15: ffffffffacaa9158
Jul 30 12:57:51 william kernel:  </IRQ>
Jul 30 12:57:51 william kernel:  cpuidle_enter+0x17/0x20
Jul 30 12:57:51 william kernel:  call_cpuidle+0x23/0x40
Jul 30 12:57:51 william kernel:  do_idle+0x18a/0x1e0
Jul 30 12:57:51 william kernel:  cpu_startup_entry+0x71/0x80
Jul 30 12:57:51 william kernel:  rest_init+0x84/0x90
Jul 30 12:57:51 william kernel:  start_kernel+0x43b/0x45c
Jul 30 12:57:51 william kernel:  ? early_idt_handler_array+0x120/0x120
Jul 30 12:57:51 william kernel:  x86_64_start_reservations+0x29/0x2b
Jul 30 12:57:51 william kernel:  x86_64_start_kernel+0x143/0x166
Jul 30 12:57:51 william kernel:  secondary_startup_64+0x9f/0x9f
Jul 30 12:57:51 william kernel: handlers:
Jul 30 12:57:51 william kernel: [<ffffffffc00ac580>] usb_hcd_irq [usbcore]
Jul 30 12:57:51 william kernel: Disabling IRQ #16
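
A rough sketch for getting from such a trace to the devices involved: the trace names only `usb_hcd_irq` as a registered handler, so it may be worth checking what else shares that interrupt line. Both helper names below are made up for illustration; `/proc/interrupts` is the standard kernel file listing the devices attached to each IRQ.

```shell
# Hypothetical helper: print the number of every IRQ the kernel reported
# as "nobody cared" in log text on stdin.
unhandled_irqs() {
    sed -n 's/.*kernel: irq \([0-9][0-9]*\): nobody cared.*/\1/p' | sort -un
}

# Hypothetical helper: show the /proc/interrupts row for one IRQ number.
# The optional second argument (another file) exists only for testing.
irq_users() {
    grep -E "^ *$1:" "${2:-/proc/interrupts}"
}

# Demo with the first line of the trace above; prints 16.
printf '%s\n' \
    'Jul 30 12:57:51 william kernel: irq 16: nobody cared (try booting with the "irqpoll" option)' \
    | unhandled_irqs

# On the affected machine one would then run:
#   irq_users 16
```

If the graphics driver turns up on the same row as the USB controller, an interrupt raised by the card and never acknowledged would fit the "nobody cared" report, though that is a guess and not something the log proves.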

https://ugjka.net
paru > yay | webcord > discord
pacman -S spotify-launcher
mount /dev/disk/by-...

Offline

#10 2017-07-30 11:47:21

Morn
Member
Registered: 2012-09-02
Posts: 886

Maybe this is an issue with the motherboard, not the graphics card? The symptoms remind me of the eMac G4 and the lockups it gets due to failing caps on the motherboard close to the graphics card.

So maybe visually inspecting the mobo and looking for any bulging caps might be a good idea just in case.

Offline

#11 2017-07-30 14:35:24

ewaller
Administrator
From: Pasadena, CA
Registered: 2009-07-13
Posts: 19,774

Two things you might try.

First, you have an i5 processor, but I see no evidence that you have installed or configured the Intel microcode updates.  That is important. https://wiki.archlinux.org/index.php/Microcode
Second, I realize that this system is a power beast, but it seems that your problems often happen after waking from sleep; you might look into the correlation between sleep and the occurrence of the issue.  Maybe try not using sleep mode

You had indicated that temperature seems not to be the issue -- it had failed when it was cooler and worked when hotter.  For reference, what are the ambient temperatures in the room?  I assume it is summer in your part of the world.
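
To check whether a microcode update is actually in effect after installing the package, something along these lines may help. It is only a sketch: `microcode_revisions` is a made-up helper that reads the `microcode` field the kernel exposes in `/proc/cpuinfo` on x86.

```shell
# Hypothetical helper: list the distinct microcode revisions found in
# /proc/cpuinfo-style text. The optional argument (another file) exists
# only for testing; by default the real /proc/cpuinfo is read.
microcode_revisions() {
    awk -F': ' '/^microcode/ { print $2 }' "${1:-/proc/cpuinfo}" | sort -u
}

# On the machine in question one would run:
#   microcode_revisions
# and compare the value before and after the update.
```

The kernel log normally mentions microcode during boot as well, so `journalctl -k | grep -i microcode` is another place to look.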

Last edited by ewaller (2017-07-30 14:36:20)


Offline

#12 2017-07-30 17:14:07

kelnoky
Member
Registered: 2007-11-20
Posts: 134

Next time it happens, I will check for bad capacitors - even though I am not so sure I would recognize them.

I installed the Intel microcode, didn't know about that, thanks! And the last ~4 times it happened, it was not at all close to waking from suspend. The first few times it happened, I thought the same, but then it just never occurred within some reasonable time after waking up.

And yeah, it's summer, it's pretty much always 25° in here. Not too warm, but also not cold. Also yeah, it never happened while I was playing something, even graphic intensive stuff like PUBG (on windows, but that wouldn't matter if the gfx card is busted) or CS:GO or other games on Linux. It was always while doing very mundane things like browsing or not even doing anything at all.

Offline

#13 2017-07-30 18:05:06

ewaller
Administrator
From: Pasadena, CA
Registered: 2009-07-13
Posts: 19,774

kelnoky wrote:

Next time it happens, I will check for bad capacitors - even though I am not so sure I would recognize them.

https://en.wikipedia.org/wiki/Capacitor_plague
No need to wait, if they have failed, they have failed.  It is not like they recover or change state causing the system to fail.  My guess is this is not the root cause.

kelnoky wrote:

I installed the Intel microcodes, didn't know about that, thanks! And the last ~4 times it happened was not at all close to waking from suspend. The first few times it happened, I thought the same, but then it just never occurred within some reasonable time after waking up.

I would be interested in seeing a new journal dump of the most recent boot to check it.

kelnoky wrote:

And yeah, it's summer, it's pretty much always 25° in here. Not too warm, but also not cold. Also yeah, it never happened while I was playing something, even graphic intensive stuff like PUBG (on windows, but that wouldn't matter if the gfx card is busted) or CS:GO or other games on Linux. It was always while doing very mundane things like browsing or not even doing anything at all.

Okay, so I am thinking it is not a stress thing.  Let's see how the microcode updates pan out.
That graphics card is still a power beast though :)


Offline

#14 2017-08-02 11:25:37

kelnoky
Member
Registered: 2007-11-20
Posts: 134

Ok, so it happened three times again today. After the first time I checked the log and the "GPU has fallen off the bus" line was there again. I didn't check after the second time; I wanted to check now after the third time and I just get this:

william% journalctl -b -1
Data from the specified boot (-1) is not available: No data available
william [journalctl -b -1] ~                                                             17-08-02  1:23pm
william% journalctl -b -2
Data from the specified boot (-2) is not available: No data available
william [journalctl -b -2] ~                                                             17-08-02  1:23pm
william% journalctl -b -3
Data from the specified boot (-3) is not available: No data available

I guess the hard resets were not kind to the logs?
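
Two things seem worth ruling out when older boots are missing: that the journal is not stored persistently at all, and that the journal files were damaged by the hard resets. A minimal sketch, with made-up helper names; `journalctl --list-boots` and `journalctl --verify` are existing journalctl options, and with journald's default `Storage=auto` logs survive a reboot only if `/var/log/journal` exists.

```shell
# Hypothetical helper: succeed if the persistent journal directory exists.
# The optional argument (another directory) exists only for testing.
journal_is_persistent() {
    [ -d "${1:-/var/log/journal}" ]
}

# Hypothetical wrapper, defined but not run here: list the boots journald
# knows about, then check the journal files for internal consistency.
journal_health() {
    journalctl --list-boots && journalctl --verify
}

if journal_is_persistent; then
    echo "persistent journal directory present"
else
    echo "no /var/log/journal: logs live in /run and are lost on reboot"
fi
```

If the directory is there but `journal_health` reports failures, the hard power-offs are the likely culprit, and log lines written shortly before a crash may simply never have reached the disk.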

But I checked the motherboard and it seems fine. It's also just a year old - not that that means much, but it does still look new-ish, no capacitor defects that I can see.

So we are back to the graphics card being the cause I guess? Maybe I should just buy a 1050ti and see if this problem vanishes. If it doesn't maybe I can still return it.

Offline

#15 2017-08-02 11:46:52

seth
Member
Registered: 2012-09-03
Posts: 51,056

Did you try irqpoll'ing?
Though I assume this is the GPU, and specifically its power supply.
If you can: try to replace it with sth. less demanding.
Then check the power supply wiring as suggested in comment #6.
Iirc, the vendor suggests a 600W PSU, btw. - do you match that requirement?
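
To confirm whether a boot really had `irqpoll` active, the kernel command line can be checked. This is just a sketch; `has_kernel_param` is a made-up helper name and `/proc/cmdline` is the standard file holding the parameters of the running kernel.

```shell
# Hypothetical helper: succeed if the given parameter appears as a whole
# word on the kernel command line. The optional second argument (another
# file) exists only for testing.
has_kernel_param() {
    grep -qw -- "$1" "${2:-/proc/cmdline}" 2>/dev/null
}

if has_kernel_param irqpoll; then
    echo "irqpoll is active for this boot"
else
    echo "irqpoll is not active for this boot"
fi
```

Assuming GRUB is the bootloader, the parameter would be added to `GRUB_CMDLINE_LINUX_DEFAULT` in `/etc/default/grub`, followed by `grub-mkconfig -o /boot/grub/grub.cfg` and a reboot.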

Offline

#16 2017-08-27 10:10:32

kelnoky
Member
Registered: 2007-11-20
Posts: 134

Ok, so I put in a new graphics card and haven't had the issue since.

Last edited by kelnoky (2017-08-27 10:10:38)

Offline
