
#1 2018-02-20 14:01:09

jnbrains
Member
Registered: 2015-03-26
Posts: 25

NVidia errors hardware or software issue?

Hello everyone,

  I'm having issues playing video (mainly with mplayer): the whole X session (xfce4) freezes,
and I have to connect over the network (ssh) and kill the process to regain access.

Here are the relevant dmesg entries:

[ 1048.791790] pcieport 0000:00:03.0: AER: Corrected error received: id=0018
[ 1048.791800] pcieport 0000:00:03.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=0018(Transmitter ID)
[ 1048.791808] pcieport 0000:00:03.0:   device [8086:d138] error status/mask=00001000/00002000
[ 1048.791813] pcieport 0000:00:03.0:    [12] Replay Timer Timeout  
...
[ 3862.156253] NVRM: GPU at PCI:0000:01:00: GPU-cb5f1341-98fb-ff33-aa74-7093bb7f4c78
[ 3862.156272] NVRM: Xid (PCI:0000:01:00): 13, Graphics Exception: ChID 0003, Class 00008597, Offset 00000f10, Data 44340000
[ 3904.843772] NVRM: Xid (PCI:0000:01:00): 13, Graphics Exception: ChID 0003, Class 00008597, Offset 00000f10, Data 44340000
[ 3941.466346] NVRM: Xid (PCI:0000:01:00): 13, Graphics Exception: ChID 0003, Class 00008597, Offset 00000f10, Data 44340000
[ 3980.519440] NVRM: Xid (PCI:0000:01:00): 13, Graphics Exception: ChID 0003, Class 00008597, Offset 00000f10, Data 44340000
[ 4015.747347] NVRM: Xid (PCI:0000:01:00): 13, Graphics Exception: ChID 0003, Class 00008597, Offset 00000f10, Data 44340000
[ 4056.457319] NVRM: Xid (PCI:0000:01:00): 13, Graphics Exception: ChID 0003, Class 00008597, Offset 00000f10, Data 44340000
[ 4109.475704] NVRM: Xid (PCI:0000:01:00): 13, Graphics Exception: ChID 0003, Class 00008597, Offset 00000f10, Data 44340000
[ 4157.783689] NVRM: Xid (PCI:0000:01:00): 13, Graphics Exception: ChID 0003, Class 00008597, Offset 00000104, Data 00000001
[ 4197.101705] NVRM: Xid (PCI:0000:01:00): 13, Graphics Exception: ChID 0003, Class 00008597, Offset 00001b0c, Data 0000f000
...
[ 4277.729516] NVRM: Xid (PCI:0000:01:00): 6, PE0001 
[ 4366.558071] NVRM: Xid (PCI:0000:01:00): 13, Graphics Exception: ChID 0003, Class 00008597, Offset 00000f10, Data 44340000
[ 4366.561196] NVRM: Xid (PCI:0000:01:00): 6, PE0001 

Video Card: GeForce GTS 360M
Kernel: 4.15.3-2-ARCH
Drivers: nvidia-340xx 340.106-11
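
For anyone triaging a similar log, here is a minimal sketch (not part of the original post) that tallies the NVRM Xid codes in dmesg text. The name `count_xids` and the `SAMPLE` excerpt are illustrative assumptions, and the regex only covers the line format shown above.

```python
import re
from collections import Counter

# Matches lines such as:
#   NVRM: Xid (PCI:0000:01:00): 13, Graphics Exception: ...
XID_RE = re.compile(r"NVRM: Xid \((PCI:[0-9a-fA-F:.]+)\): (\d+),")


def count_xids(dmesg_text):
    """Return a Counter mapping (pci_address, xid_code) to occurrences."""
    counts = Counter()
    for line in dmesg_text.splitlines():
        match = XID_RE.search(line)
        if match:
            counts[(match.group(1), int(match.group(2)))] += 1
    return counts


# Hypothetical excerpt in the same format as the log above.
SAMPLE = """\
[ 3862.156272] NVRM: Xid (PCI:0000:01:00): 13, Graphics Exception: ChID 0003, Class 00008597, Offset 00000f10, Data 44340000
[ 4277.729516] NVRM: Xid (PCI:0000:01:00): 6, PE0001
[ 4366.558071] NVRM: Xid (PCI:0000:01:00): 13, Graphics Exception: ChID 0003, Class 00008597, Offset 00000f10, Data 44340000
"""

if __name__ == "__main__":
    for (addr, code), n in sorted(count_xids(SAMPLE).items()):
        print(f"{addr} Xid {code}: {n}")
```

Feeding it the full `dmesg` output gives a quick view of which Xid codes dominate; the meaning of each code still has to be looked up in NVIDIA's Xid documentation.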

Offline

#2 2018-02-20 14:57:18

seth
Member
Registered: 2012-09-03
Posts: 51,319

Re: NVidia errors hardware or software issue?

http://docs.nvidia.com/deploy/xid-errors/index.html

Check it on linux-lts + nvidia-340xx-lts; if it still happens, try a completely different stack (a random live system with an nvidia driver of a different version).
Could be a(n uncorrected) bus error, but "0000:00:03.0" doesn't seem to be your GPU. So that message would not be significant by itself.
You also should rule out bad RAM - https://www.archlinux.org/packages/extr … emtest86+/
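
A rough sketch of the address comparison, assuming the log format quoted in this thread. It extracts the device that logged the "PCIe Bus Error" and the GPU address reported by NVRM, and decodes the AER `id=` value as a PCIe requester ID (bus in the upper 8 bits, device in the next 5, function in the low 3). All function names and `SAMPLE` are hypothetical.

```python
import re

# Device that logged the error, e.g. "0000:00:03.0"
BUS_ERR_RE = re.compile(
    r"([0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): PCIe Bus Error"
)
# NVRM prints domain:bus:device without a function number, e.g. "0000:01:00"
GPU_RE = re.compile(r"NVRM: GPU at PCI:([0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2})")


def decode_source_id(source_id):
    """Split a 16-bit requester ID into (bus, device, function)."""
    return (source_id >> 8) & 0xFF, (source_id >> 3) & 0x1F, source_id & 0x7


def format_bdf(source_id):
    """Render a requester ID in the bus:device.function form lspci uses."""
    bus, dev, fn = decode_source_id(source_id)
    return f"{bus:02x}:{dev:02x}.{fn:x}"


def error_devices(dmesg_text):
    return set(BUS_ERR_RE.findall(dmesg_text))


def gpu_addresses(dmesg_text):
    return set(GPU_RE.findall(dmesg_text))


def error_on_gpu(dmesg_text):
    """True if any bus error was logged against a GPU address itself."""
    gpus = gpu_addresses(dmesg_text)
    # Drop the ".function" suffix so both sides use domain:bus:device.
    return any(dev.rsplit(".", 1)[0] in gpus for dev in error_devices(dmesg_text))


# Hypothetical excerpt in the same format as the log in post #1.
SAMPLE = """\
[ 1048.791800] pcieport 0000:00:03.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=0018(Transmitter ID)
[ 3862.156253] NVRM: GPU at PCI:0000:01:00: GPU-cb5f1341-98fb-ff33-aa74-7093bb7f4c78
"""

if __name__ == "__main__":
    print("id=0018 decodes to", format_bdf(0x0018))
    print("bus error logged against GPU:", error_on_gpu(SAMPLE))
```

A differing address only shows that the error was not logged against the GPU function itself; it does not say whether that port sits upstream of the GPU. `lspci -t` prints the bus tree if the topology matters.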

Offline

#3 2018-02-20 16:00:14

jnbrains
Member
Registered: 2015-03-26
Posts: 25

Re: NVidia errors hardware or software issue?

Thanks for the advice, seth.

memtest86+ and cuda_memtest (ocl_memtest) showed no errors.
I'll give the LTS kernel a try as soon as time allows.

Last edited by jnbrains (2018-02-20 16:49:49)

Offline

#4 2018-02-20 16:44:32

seth
Member
Registered: 2012-09-03
Posts: 51,319

Re: NVidia errors hardware or software issue?

FYI, though I don't think it's your RAM, you need to run memtest86+ for quite some time (hours) to be relatively sure about this.

Offline

#5 2018-02-20 16:51:35

jnbrains
Member
Registered: 2015-03-26
Posts: 25

Re: NVidia errors hardware or software issue?

seth, the memory tests were done properly, prior to posting this question on the forums.

Offline

#6 2018-02-22 00:34:21

jnbrains
Member
Registered: 2015-03-26
Posts: 25

Re: NVidia errors hardware or software issue?

It seems that the LTS kernel and driver didn't resolve the issue:

[50415.985888] NVRM: GPU at PCI:0000:01:00: GPU-cb5f1341-98fb-ff33-aa74-7093bb7f4c78
[50415.985940] NVRM: Xid (PCI:0000:01:00): 13, Graphics Exception: ChID 0005, Class 00008597, Offset 00001458, Data 0000b603
...
[50510.103907] NVRM: Xid (PCI:0000:01:00): 6, PE0001 
[50510.265836] NVRM: Xid (PCI:0000:01:00): 3, C 00000001 SC 00000006 M 00000800 Data 0261044a

Will have to try and see how it works under Windows, or start downgrading the nvidia driver.
Unfortunately, I've never managed to get the nouveau driver working on that laptop.

Offline

#7 2018-02-25 10:31:40

jnbrains
Member
Registered: 2015-03-26
Posts: 25

Re: NVidia errors hardware or software issue?

Further research into the issue suggested that the GPU might be overheating, which is known to happen with similar models.
I have opened the laptop, cleaned out the mess, re-applied thermal paste, and left it running for about two days straight.
No glitches since and no more messages in the logs.

Offline
