#1 2021-06-16 21:23:00

pntruongan
Member
Registered: 2011-01-31
Posts: 63

[SOLVED]NVIDIA 1060 3GiB, GPU has fallen off the bus.

I have an old NVIDIA GeForce GTX 1060 3GiB graphics card. A few months back, I started seeing this problem: the screen turns off and goes completely blank, as if the computer had shut down, except that the computer is still running and I can even ssh into it.
But the screen and keyboard/mouse remain unresponsive.

After a hard reset, I ran

journalctl -xb -1

to trace the problem, and these lines appeared:

Jun 17 04:03:41 archlinux kernel: NVRM: GPU at PCI:0000:08:00: GPU-c0b31bf4-5aaf-2c80-0b87-b4218586ab36
Jun 17 04:03:41 archlinux kernel: NVRM: GPU Board Serial Number: 
Jun 17 04:03:41 archlinux kernel: NVRM: Xid (PCI:0000:08:00): 79, pid=1187, GPU has fallen off the bus.
Jun 17 04:03:41 archlinux kernel: NVRM: GPU 0000:08:00.0: GPU has fallen off the bus.
Jun 17 04:03:41 archlinux kernel: NVRM: GPU 0000:08:00.0: GPU is on Board .
Jun 17 04:03:41 archlinux kernel: NVRM: A GPU crash dump has been created. If possible, please run
                                  NVRM: nvidia-bug-report.sh as root to collect this data before
                                  NVRM: the NVIDIA kernel module is unloaded.

I have been seeing this problem for a while now. I did some googling, and most results point to dying hardware. Still, I would like to know whether there is anything else to try. Buying a replacement graphics card during this global shortage is impossible for me. :(
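
To see how often this happens, the Xid lines can be pulled out of the kernel log. The following is only a rough sketch: xid_events is a made-up helper name, and the pattern assumes the NVRM line format shown in the log above.

```shell
# xid_events is a hypothetical helper: it reads kernel log lines on stdin
# and prints "<PCI address> <Xid code>" for every NVRM Xid report it finds.
xid_events() {
    sed -n -E 's/.*NVRM: Xid \((PCI:[0-9a-fA-F:.]+)\): ([0-9]+),.*/\1 \2/p'
}

# Typical use against the previous boot's kernel messages:
#   journalctl -k -b -1 | xid_events | sort | uniq -c
```

Xid 79 is the code that appears next to "GPU has fallen off the bus" in the log above, so counting those lines per boot gives a rough picture of how frequent the drops are.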

Last edited by pntruongan (2021-06-22 10:48:50)

#2 2021-06-17 21:29:52

Buddlespit
Member
From: Chesapeake, Va.
Registered: 2014-02-07
Posts: 501

NVRM: A GPU crash dump has been created. If possible, please run
NVRM: nvidia-bug-report.sh as root to collect this data before
NVRM: the NVIDIA kernel module is unloaded.

Did you do this?

#3 2021-06-18 00:37:51

pntruongan
Member
Registered: 2011-01-31
Posts: 63

Buddlespit wrote:

NVRM: A GPU crash dump has been created. If possible, please run
NVRM: nvidia-bug-report.sh as root to collect this data before
NVRM: the NVIDIA kernel module is unloaded.

Did you do this?

Oh yes. The data nvidia-bug-report.sh collected is quite large, so I put it on the web here:
https://truongan.name.vn/wp-content/upl … report.txt

This was not collected immediately after the crash, though, but a few boots later. Is it crucial that I must run nvidia-bug-report right after the crash?

Last edited by pntruongan (2021-06-18 00:39:28)

#4 2021-06-18 07:14:22

seth
Member
Registered: 2012-09-03
Posts: 51,017

Is it crucial that I must run nvidia-bug-report right after the crash?

Yes.

Could just be https://bbs.archlinux.org/viewtopic.php?id=265563 - does it correlate w/ the update?

#5 2021-06-18 08:23:06

pntruongan
Member
Registered: 2011-01-31
Posts: 63

Seems unlikely. My problem has been going on for a few months, across various NVIDIA driver and kernel updates. I just kept ignoring it at first because it was so random. Just this week, the frequency of the GPU falling off the bus has risen to 2-3 times a day, which is unbearable for me.

#6 2021-06-18 11:21:30

seth
Member
Registered: 2012-09-03
Posts: 51,017

I see.
Try to pass "rcutree.rcu_idle_gp_delay=1 pcie_aspm=off" to the kernel.
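
How the parameters get passed depends on the bootloader. As a hedged sketch, assuming GRUB (other bootloaders use different files), the comments below show where they would go, and check_params, a made-up helper, confirms after a reboot that the kernel actually received them.

```shell
# Assumption: GRUB is the bootloader. In /etc/default/grub, append the two
# parameters to the existing line, keeping whatever is already there:
#   GRUB_CMDLINE_LINUX_DEFAULT="... rcutree.rcu_idle_gp_delay=1 pcie_aspm=off"
# then regenerate the config as root and reboot:
#   grub-mkconfig -o /boot/grub/grub.cfg

# check_params is a hypothetical helper. It reports, for each of the two
# parameters, whether it appears in the given file (default: /proc/cmdline,
# which holds the command line of the running kernel).
check_params() {
    cmdline_file="${1:-/proc/cmdline}"
    for p in rcutree.rcu_idle_gp_delay=1 pcie_aspm=off; do
        # -F: match the parameter literally; -w: whole token only
        if grep -qwF -- "$p" "$cmdline_file"; then
            echo "present: $p"
        else
            echo "missing: $p"
        fi
    done
}
```

Running check_params with no argument after the reboot should print "present" for both if the change took effect.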

#7 2021-06-18 12:02:28

kokoko3k
Member
Registered: 2008-11-14
Posts: 2,393

It could be a simple hardware problem. In the past it happened to me with an old 9800.
The PCI-E card had literally fallen off the bus; it was not properly seated in the slot. Check yours.


#8 2021-06-18 23:25:27

pntruongan
Member
Registered: 2011-01-31
Posts: 63

kokoko3k wrote:

It could be a simple hardware problem. In the past it happened to me with an old 9800.
The PCI-E card had literally fallen off the bus; it was not properly seated in the slot. Check yours.

I have reseated the card several times before: took it out, cleaned the slot and the card with isopropyl alcohol, waited for the alcohol to dry, and put it back in. I did this several times in the past few months, and still the problem keeps showing up.

#9 2021-06-19 00:27:32

pntruongan
Member
Registered: 2011-01-31
Posts: 63

seth wrote:

I see.
Try to pass "rcutree.rcu_idle_gp_delay=1 pcie_aspm=off" to the kernel.

I will try this out on Sunday. Since I need this machine to work from home (COVID lockdown going on here), I have switched to a spare, much older GTX 750. The GTX 750 has worked fine for the last two days (which makes my hope for the 1060 even slimmer).
It's a pain to watch the GTX 750 try to cope with a 4K monitor, but I really need the machine for Teams meetings.

#10 2021-06-22 10:49:45

pntruongan
Member
Registered: 2011-01-31
Posts: 63

I surrender to the god of bad luck. It must be a hardware issue then. The GTX 750 has run stable for 2 days in a row, with no crashes.
