
#1 2022-02-25 10:10:45

spikeyamk
Member
Registered: 2021-06-17
Posts: 3

AMD Ryzen 5 2600 Instability issues

Hello there,
ever since I let my PC compile Firefox from source, I have been experiencing random shutdowns, slowdowns, and trouble booting and shutting down the system. The CPU itself seems to work fine: I stress-tested all CPU threads without problems. The problems occur when stressing I/O: random network hangs or slow disk speeds. I benchmarked my Samsung 970 EVO NVMe SSD and it reached only 800 MB/s sequential, when it should manage at least a couple of thousand MB/s. I also get the errors shown below in dmesg.

I tried swapping the motherboard and got the same results. I tried booting with pci=nomsi, which does nothing except hide these error messages. I'm sure the SSD, RAM, GPU, motherboard, etc. are working. I suspect the CPU sustained heat damage after compiling code for hours. Heavy I/O over USB works fine; the problems occur when stressing I/O over the SATA or NVMe links.

Is there anything I can do? Can I perhaps use other PCIe lanes to connect it to the CPU? Or do I just have to buy a new CPU?


dmesg output:

[  125.475432] pcieport 0000:00:01.1: AER: Multiple Corrected error received: 0000:00:00.0
[  125.475444] pcieport 0000:00:01.1: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)
[  125.475446] pcieport 0000:00:01.1:   device [1022:1453] error status/mask=000000c0/00006000
[  125.475450] pcieport 0000:00:01.1:    [ 6] BadTLP                
[  125.475452] pcieport 0000:00:01.1:    [ 7] BadDLLP               
[  125.486433] pcieport 0000:00:01.1: AER: Multiple Corrected error received: 0000:00:00.0
[  125.486440] pcieport 0000:00:01.1: AER: can't find device of ID0000
[  125.497457] pcieport 0000:00:01.1: AER: Multiple Corrected error received: 0000:00:00.0
[  125.497469] pcieport 0000:00:01.1: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)
[  125.497472] pcieport 0000:00:01.1:   device [1022:1453] error status/mask=000000c0/00006000
[  125.497476] pcieport 0000:00:01.1:    [ 6] BadTLP                
[  125.497479] pcieport 0000:00:01.1:    [ 7] BadDLLP               
[  125.519515] pcieport 0000:00:01.1: AER: Multiple Corrected error received: 0000:00:00.0
[  125.519529] pcieport 0000:00:01.1: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)
[  125.519533] pcieport 0000:00:01.1:   device [1022:1453] error status/mask=000000c0/00006000
[  125.519537] pcieport 0000:00:01.1:    [ 6] BadTLP                
[  125.519540] pcieport 0000:00:01.1:    [ 7] BadDLLP               
[  125.519545] nvme 0000:01:00.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
[  125.519548] nvme 0000:01:00.0:   device [144d:a808] error status/mask=00001100/0000e000
[  125.519552] nvme 0000:01:00.0:    [ 8] Rollover              
[  125.519554] nvme 0000:01:00.0:    [12] Timeout               
[  125.530569] pcieport 0000:00:01.1: AER: Multiple Corrected error received: 0000:00:00.0
[  125.530581] pcieport 0000:00:01.1: AER: can't find device of ID0000
[  125.651759] pcieport 0000:00:01.1: AER: Multiple Corrected error received: 0000:00:00.0
[  125.651776] pcieport 0000:00:01.1: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)
[  125.651780] pcieport 0000:00:01.1:   device [1022:1453] error status/mask=000000c0/00006000
[  125.651785] pcieport 0000:00:01.1:    [ 6] BadTLP                
[  125.651789] pcieport 0000:00:01.1:    [ 7] BadDLLP               
[  125.662871] pcieport 0000:00:01.1: AER: Corrected error received: 0000:00:00.0
[  125.662881] pcieport 0000:00:01.1: AER: can't find device of ID0000
[  125.673797] pcieport 0000:00:01.1: AER: Corrected error received: 0000:00:00.0
[  125.673807] pcieport 0000:00:01.1: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)
[  125.673810] pcieport 0000:00:01.1:   device [1022:1453] error status/mask=00000040/00006000
[  125.673815] pcieport 0000:00:01.1:    [ 6] BadTLP                
[  126.698738] pcieport 0000:00:01.1: AER: Multiple Corrected error received: 0000:00:00.0
[  126.698754] pcieport 0000:00:01.1: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
[  126.698758] pcieport 0000:00:01.1:   device [1022:1453] error status/mask=00001080/00006000
[  126.698763] pcieport 0000:00:01.1:    [ 7] BadDLLP               
[  126.698767] pcieport 0000:00:01.1:    [12] Timeout               
[  126.709859] pcieport 0000:00:01.1: AER: Multiple Corrected error received: 0000:00:00.0
[  126.709868] pcieport 0000:00:01.1: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
[  126.709869] pcieport 0000:00:01.1:   device [1022:1453] error status/mask=000010c0/00006000
[  126.709871] pcieport 0000:00:01.1:    [ 6] BadTLP                
[  126.709873] pcieport 0000:00:01.1:    [ 7] BadDLLP               
[  126.709875] pcieport 0000:00:01.1:    [12] Timeout               
[  126.709878] nvme 0000:01:00.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
[  126.709880] nvme 0000:01:00.0:   device [144d:a808] error status/mask=00001100/0000e000
[  126.709881] nvme 0000:01:00.0:    [ 8] Rollover              
[  126.709883] nvme 0000:01:00.0:    [12] Timeout               
[  126.720853] pcieport 0000:00:01.1: AER: Multiple Corrected error received: 0000:00:00.0
[  126.720863] pcieport 0000:00:01.1: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)
[  126.720865] pcieport 0000:00:01.1:   device [1022:1453] error status/mask=000000c0/00006000
[  126.720867] pcieport 0000:00:01.1:    [ 6] BadTLP                
[  126.720869] pcieport 0000:00:01.1:    [ 7] BadDLLP               
[  126.731782] pcieport 0000:00:01.1: AER: Corrected error received: 0000:00:00.0
[  126.731790] pcieport 0000:00:01.1: AER: can't find device of ID0000
[  127.492231] pcieport 0000:00:01.1: AER: Multiple Corrected error received: 0000:00:00.0
[  127.492247] pcieport 0000:00:01.1: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
[  127.492250] pcieport 0000:00:01.1:   device [1022:1453] error status/mask=00001080/00006000
[  127.492255] pcieport 0000:00:01.1:    [ 7] BadDLLP               
[  127.492259] pcieport 0000:00:01.1:    [12] Timeout               
[  127.503239] pcieport 0000:00:01.1: AER: Multiple Corrected error received: 0000:00:00.0
[  127.503247] pcieport 0000:00:01.1: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)
[  127.503249] pcieport 0000:00:01.1:   device [1022:1453] error status/mask=000000c0/00006000
[  127.503251] pcieport 0000:00:01.1:    [ 6] BadTLP                
[  127.503253] pcieport 0000:00:01.1:    [ 7] BadDLLP               
[  127.503256] nvme 0000:01:00.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
[  127.503258] nvme 0000:01:00.0:   device [144d:a808] error status/mask=00001000/0000e000
[  127.503260] nvme 0000:01:00.0:    [12] Timeout               
[  127.514274] pcieport 0000:00:01.1: AER: Multiple Corrected error received: 0000:00:00.0
[  127.514280] pcieport 0000:00:01.1: AER: can't find device of ID0000
[  130.732339] pcieport 0000:00:01.1: AER: Multiple Corrected error received: 0000:00:00.0
[  130.732355] pcieport 0000:00:01.1: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)
[  130.732358] pcieport 0000:00:01.1:   device [1022:1453] error status/mask=000000c0/00006000
[  130.732364] pcieport 0000:00:01.1:    [ 6] BadTLP                
[  130.732367] pcieport 0000:00:01.1:    [ 7] BadDLLP               
[  130.743346] pcieport 0000:00:01.1: AER: Corrected error received: 0000:00:00.0
[  130.743355] pcieport 0000:00:01.1: AER: can't find device of ID0000
[  130.754371] pcieport 0000:00:01.1: AER: Corrected error received: 0000:00:00.0
[  130.754376] pcieport 0000:00:01.1: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)
[  130.754378] pcieport 0000:00:01.1:   device [1022:1453] error status/mask=00000040/00006000
[  130.754380] pcieport 0000:00:01.1:    [ 6] BadTLP                
[  131.393587] pcieport 0000:00:01.1: AER: Multiple Corrected error received: 0000:00:00.0
[  131.393604] pcieport 0000:00:01.1: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
[  131.393608] pcieport 0000:00:01.1:   device [1022:1453] error status/mask=000010c0/00006000
[  131.393614] pcieport 0000:00:01.1:    [ 6] BadTLP                
[  131.393618] pcieport 0000:00:01.1:    [ 7] BadDLLP               
[  131.393621] pcieport 0000:00:01.1:    [12] Timeout               
[  131.393626] nvme 0000:01:00.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
[  131.393629] nvme 0000:01:00.0:   device [144d:a808] error status/mask=00001100/0000e000
[  131.393634] nvme 0000:01:00.0:    [ 8] Rollover              
[  131.393637] nvme 0000:01:00.0:    [12] Timeout               
[  133.906336] pcieport 0000:00:01.1: AER: Multiple Corrected error received: 0000:00:00.0
[  133.906353] pcieport 0000:00:01.1: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
[  133.906357] pcieport 0000:00:01.1:   device [1022:1453] error status/mask=00001080/00006000
[  133.906362] pcieport 0000:00:01.1:    [ 7] BadDLLP               
[  133.906367] pcieport 0000:00:01.1:    [12] Timeout               
[  133.917345] pcieport 0000:00:01.1: AER: Multiple Corrected error received: 0000:00:00.0
[  133.917359] pcieport 0000:00:01.1: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)
[  133.917363] pcieport 0000:00:01.1:   device [1022:1453] error status/mask=00000040/00006000
[  133.917367] pcieport 0000:00:01.1:    [ 6] BadTLP      
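
For readers who want to interpret the "error status/mask" fields in the log above, here is a minimal Python sketch that turns the two hex values into error names. The names for bits 6, 7, 8 and 12 match what the log itself prints; the names for the remaining bits are filled in from the PCIe Corrected Error Status register layout as I understand it, so treat those as an assumption. The function name `decode_corrected` is made up for this example.

```python
# Bit positions in the PCIe AER Corrected Error Status register.
# Bits 6, 7, 8 and 12 are confirmed by the dmesg output in this thread;
# the other names are an assumption based on the register layout.
CORRECTABLE_BITS = {
    0: "RxErr",
    6: "BadTLP",
    7: "BadDLLP",
    8: "Rollover",
    12: "Timeout",
    13: "NonFatalErr",
    14: "CorrIntErr",
    15: "HeaderOF",
}


def decode_corrected(status_mask: str) -> tuple[list[str], list[str]]:
    """Split a 'status/mask' hex pair into (reported, masked) error names."""
    status_hex, mask_hex = status_mask.split("/")
    status, mask = int(status_hex, 16), int(mask_hex, 16)

    def names(value: int) -> list[str]:
        # Keep every known bit that is set in the register value.
        return [name for bit, name in CORRECTABLE_BITS.items() if value >> bit & 1]

    return names(status), names(mask)


if __name__ == "__main__":
    # Values copied from the dmesg output above.
    for field in ("000000c0/00006000", "00001100/0000e000", "000010c0/00006000"):
        reported, masked = decode_corrected(field)
        print(field, "reported:", reported, "masked:", masked)
```

All the reported bits here are Data Link Layer errors that the link corrected by itself, which is consistent with the "severity=Corrected" wording in the log.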

lspci | grep 01 output:

00:01.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge
00:01.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe GPP Bridge
00:01.3 PCI bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe GPP Bridge
01:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981/PM983
02:00.0 USB controller: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset USB 3.1 XHCI Controller (rev 01)
02:00.1 SATA controller: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset SATA Controller (rev 01)
02:00.2 PCI bridge: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Bridge (rev 01)
03:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port (rev 01)
03:01.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port (rev 01)
03:04.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port (rev 01) 
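
One way to follow up on the 800 MB/s benchmark is to check whether the NVMe link at 0000:01:00.0 negotiated a lower speed or width than the device supports. The sketch below reads the `current_link_speed`, `max_link_speed`, `current_link_width` and `max_link_width` attributes that Linux exposes in sysfs for PCIe devices. The function names and the assumption that the speed string starts with the GT/s figure (for example "8.0 GT/s PCIe") are mine, and the bandwidth number is a rough upper bound that ignores protocol overhead.

```python
from pathlib import Path

FIELDS = ("current_link_speed", "max_link_speed",
          "current_link_width", "max_link_width")


def gts(speed: str) -> float:
    """Extract the GT/s figure, assuming a string such as '8.0 GT/s PCIe'."""
    return float(speed.split()[0])


def approx_bandwidth_mb_s(speed: str, width: str) -> float:
    """Rough raw link bandwidth in MB/s (line encoding only, no protocol overhead)."""
    rate = gts(speed)
    # 2.5 and 5 GT/s links use 8b/10b encoding; 8 GT/s and faster use 128b/130b.
    efficiency = 8 / 10 if rate < 8 else 128 / 130
    return rate * efficiency / 8 * 1000 * int(width)


def link_report(device_dir: str) -> dict:
    """Read negotiated and maximum link speed/width from a device's sysfs directory."""
    base = Path(device_dir)
    report = {name: (base / name).read_text().strip() for name in FIELDS}
    report["downgraded"] = (
        gts(report["current_link_speed"]) < gts(report["max_link_speed"])
        or int(report["current_link_width"]) < int(report["max_link_width"])
    )
    return report


if __name__ == "__main__":
    # Device address taken from the lspci output above.
    device = "/sys/bus/pci/devices/0000:01:00.0"
    if Path(device, "current_link_speed").exists():
        info = link_report(device)
        print(info)
        print("approx. raw link bandwidth: %.0f MB/s"
              % approx_bandwidth_mb_s(info["current_link_speed"],
                                      info["current_link_width"]))
    else:
        print("no PCIe link attributes found at", device)
```

For scale, an 8 GT/s x4 link works out to roughly 3938 MB/s with this formula and a 2.5 GT/s x4 link to about 1000 MB/s. A downgraded link would be one possible explanation for the slow benchmark, not a diagnosis.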


#2 2022-02-25 19:43:26

Lone_Wolf
Administrator
From: Netherlands, Europe
Registered: 2005-10-04
Posts: 15,065

Re: AMD Ryzen 5 2600 Instability issues

The AER / PCIe errors are probably unrelated to the heavy CPU load from compiling, but they can result in performance issues.


https://unix.stackexchange.com/question … 090#369090


I had similar errors on my Threadripper system and solved them by adding pcie_aspm=off to my boot command line (as recommended at the bottom of that thread).

Some people reported that updating the motherboard firmware (UEFI/BIOS) or adding pci=nommconf solved their issues.
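
After editing the boot loader configuration, it is easy to lose track of which of these parameters the running kernel actually received. As a small sketch, the Python below parses `/proc/cmdline` and lists the workarounds mentioned in this thread (pcie_aspm=off, pci=nommconf, pci=nomsi). The function name `active_workarounds` is hypothetical. It also assumes that several `pci=` options may be combined with commas, as in `pci=nomsi,nommconf`.

```python
from pathlib import Path

# pci= sub-options mentioned in this thread.
PCI_OPTIONS = ("nommconf", "nomsi")


def active_workarounds(cmdline: str) -> list[str]:
    """Return the thread's PCIe workarounds that appear on a kernel command line."""
    found = []
    for token in cmdline.split():
        key, _, value = token.partition("=")
        if key == "pcie_aspm" and value == "off":
            found.append("pcie_aspm=off")
        elif key == "pci":
            # pci= accepts a comma-separated list of options.
            found.extend(f"pci={opt}" for opt in value.split(",")
                         if opt in PCI_OPTIONS)
    return found


if __name__ == "__main__":
    cmdline_file = Path("/proc/cmdline")
    if cmdline_file.exists():
        active = active_workarounds(cmdline_file.read_text())
        print("active workarounds:", active or "none")
    else:
        print("/proc/cmdline not available on this system")
```

An empty result after a reboot would suggest that the boot loader entry was not regenerated or that a different entry was booted.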


