#1 2024-08-25 06:46:13

Cory Parsnipson
Member
Registered: 2024-08-25
Posts: 20

[SOLVED] Intel SATA SSD Giving Spurious I/O Errors (I think...)

Hello, I recently installed Arch on an AMD-based Lenovo ThinkCentre and I'm seeing weird I/O errors with the SATA SSD I'm using. The SSD is about a decade old now, but I think it is still good (see smartctl output below).

Occasionally when I try to access the drive, it gives me various I/O errors, and the system hangs for about 20-40 seconds waiting for the SSD to respond. The failed command varies: it can be FLUSH CACHE, READ DMA, WRITE DMA, READ FPDMA QUEUED, etc., and sometimes the kernel also complains about "invalid CHS sector 0". Mounting takes a couple of minutes (too long), and unmounting takes a really long time, after which the shell never gets control back.

I ran a short and a long self test and both came back with no errors.

I have a second, newer SSD (Samsung 870 EVO) that works just fine if I swap them out. I also tried the Intel SSD in my personal laptop running Fedora Linux: mounting and reformatting it there produced absolutely no errors in dmesg. That's why I think these errors must be spurious, and I'm hoping I can resolve them by patching the firmware or kernel drivers or something. (I'm still searching for how to do that, but haven't found anything yet... I am going to try disabling NCQ shortly to see if that helps.)

Anyone have insight into this? Thanks!

And here's lots of information below, as well as results of smartctl:

System info:

~ > uname -a
Linux <hostname redacted> 6.10.6-arch1-1 #1 SMP PREEMPT_DYNAMIC Mon, 19 Aug 2024 17:02:39 +0000 x86_64 GNU/Linux
~ > lscpu
Architecture:             x86_64
  CPU op-mode(s):         32-bit, 64-bit
  Address sizes:          43 bits physical, 48 bits virtual
  Byte Order:             Little Endian
CPU(s):                   8
  On-line CPU(s) list:    0-7
Vendor ID:                AuthenticAMD
  Model name:             AMD Ryzen 5 PRO 2400GE w/ Radeon Vega Graphics
    CPU family:           23
    Model:                17
    Thread(s) per core:   2
    Core(s) per socket:   4
    Socket(s):            1
    Stepping:             0
    Frequency boost:      enabled
    CPU(s) scaling MHz:   52%
    CPU max MHz:          3200.0000
    CPU min MHz:          1600.0000
    BogoMIPS:             6390.36
    Flags:                fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc
                          cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a
                          misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb hw_pstate ssbd ibpb vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt
                          sha_ni xsaveopt xsavec xgetbv1 clzero irperf xsaveerptr arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif
                          overflow_recov succor smca sev sev_es
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.10.6-arch1-1] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Intel 53x and Pro 1500/2500 Series SSDs
Device Model:     INTEL SSDSC2BW120H6
Serial Number:    CVTR52320672120AGN
LU WWN Device Id: 5 5cd2e4 14c886546
Firmware Version: RG20
User Capacity:    120,034,123,776 bytes [120 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
TRIM Command:     Available, deterministic
Device is:        In smartctl database 7.3/5528
ATA Version is:   ACS-3 (minor revision not indicated)
SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Sat Aug 24 23:42:38 2024 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x02)	Offline data collection activity
					was completed without error.
					Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0)	The previous self-test routine completed
					without error or no self-test has ever 
					been run.
Total time to complete Offline 
data collection: 		( 2930) seconds.
Offline data collection
capabilities: 			 (0x7f) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Abort Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 (  58) minutes.
Conveyance self-test routine
recommended polling time: 	 (   4) minutes.
SCT capabilities: 	       (0x0025)	SCT Status supported.
					SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0032   100   100   000    Old_age   Always       -       0
  9 Power_On_Hours_and_Msec 0x0032   100   100   000    Old_age   Always       -       8067h+00m+00.000s
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       2708
170 Available_Reservd_Space 0x0033   083   100   010    Pre-fail  Always       -       0
171 Program_Fail_Count      0x0032   100   100   000    Old_age   Always       -       0
172 Erase_Fail_Count        0x0032   100   100   000    Old_age   Always       -       0
174 Unexpect_Power_Loss_Ct  0x0032   100   100   000    Old_age   Always       -       44
183 SATA_Downshift_Count    0x0032   100   100   000    Old_age   Always       -       122
184 End-to-End_Error        0x0033   100   100   090    Pre-fail  Always       -       0
187 Uncorrectable_Error_Cnt 0x0032   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0032   030   100   000    Old_age   Always       -       30 (Min/Max 9/50)
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       44
199 UDMA_CRC_Error_Count    0x0032   100   100   000    Old_age   Always       -       2
225 Host_Writes_32MiB       0x0032   100   100   000    Old_age   Always       -       208091
226 Workld_Media_Wear_Indic 0x0032   100   100   000    Old_age   Always       -       65535
227 Workld_Host_Reads_Perc  0x0032   100   100   000    Old_age   Always       -       34
228 Workload_Minutes        0x0032   100   100   000    Old_age   Always       -       65535
232 Available_Reservd_Space 0x0033   083   100   010    Pre-fail  Always       -       0
233 Media_Wearout_Indicator 0x0032   075   100   000    Old_age   Always       -       0
241 Host_Writes_32MiB       0x0032   100   100   000    Old_age   Always       -       208091
242 Host_Reads_32MiB        0x0032   100   100   000    Old_age   Always       -       109148
249 NAND_Writes_1GiB        0x0032   100   100   000    Old_age   Always       -       47661

SMART Error Log not supported

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%      8066         -
# 2  Short offline       Aborted by host               30%      8059         -
# 3  Short offline       Completed without error       00%      8059         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

The above only provides legacy SMART information - try 'smartctl -x' for more

Last edited by Cory Parsnipson (2024-08-27 04:38:56)

#2 2024-08-25 06:50:28

Cory Parsnipson
Member
Registered: 2024-08-25
Posts: 20

Oh, I forgot the dmesg logs:

Here I mount the drive and unmount it a few seconds later. The mount works immediately and without error in this instance, while the unmount takes about a minute and produces a couple of errors such as failed READ DMA and FLUSH CACHE commands. The ATA port also keeps "hard resetting link" and prints "device reported invalid CHS sector 0":

[Aug24 23:46] EXT4-fs (sda1): mounted filesystem 70d74cb5-ed11-4a23-ba69-a1959fda572a r/w with ordered data mode. Quota mode: none.
[Aug24 23:47] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0xc0000 action 0x6 frozen
[  +0.000013] ata1: SError: { CommWake 10B8B }
[  +0.000011] ata1.00: failed command: READ DMA
[  +0.000004] ata1.00: cmd c8/00:08:08:28:00/00:00:00:00:00/e0 tag 16 dma 4096 in
                       res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[  +0.000018] ata1.00: status: { DRDY }
[  +0.000015] ata1: hard resetting link
[  +0.466918] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[  +0.040570] ata1.00: configured for UDMA/133
[  +0.000373] ata1.00: device reported invalid CHS sector 0
[  +0.000023] sd 0:0:0:0: [sda] tag#16 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=30s
[  +0.000008] sd 0:0:0:0: [sda] tag#16 Sense Key : Illegal Request [current] 
[  +0.000006] sd 0:0:0:0: [sda] tag#16 Add. Sense: Unaligned write command
[  +0.000007] sd 0:0:0:0: [sda] tag#16 CDB: Read(10) 28 00 00 00 28 08 00 00 08 00
[  +0.000004] I/O error, dev sda, sector 10248 op 0x0:(READ) flags 0x83700 phys_seg 1 prio class 0
[  +0.000019] ata1: EH complete
[  +0.007765] EXT4-fs (sda1): unmounting filesystem 70d74cb5-ed11-4a23-ba69-a1959fda572a.
[Aug24 23:48] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0xc0000 action 0x6 frozen
[  +0.000014] ata1: SError: { CommWake 10B8B }
[  +0.000009] ata1.00: failed command: FLUSH CACHE
[  +0.000005] ata1.00: cmd e7/00:00:00:00:00/00:00:00:00:00/a0 tag 17
                       res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[  +0.000016] ata1.00: status: { DRDY }
[  +0.000015] ata1: hard resetting link
[  +0.473600] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[  +0.040470] ata1.00: configured for UDMA/133
[  +0.000014] ata1.00: retrying FLUSH 0xe7 Emask 0x4
[  +0.005284] ata1.00: device reported invalid CHS sector 0
[  +0.000020] ata1: EH complete
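
For readers following the log above: the hex fields after "cmd" are the ATA taskfile registers as libata prints them. The sketch below assumes the usual command/feature:nsect:lbal:lbam:lbah/... ordering and looks only at the low 24 LBA bits. The function name decode_lba is made up for illustration and is not part of any tool.

```shell
#!/bin/sh
# Decode the low 24 LBA bits from a libata "cmd" taskfile string.
# Assumed layout: command/feature:nsect:lbal:lbam:lbah/hob.../device
# decode_lba is a hypothetical helper name, not part of any tool.
decode_lba() {
    # The second slash-separated group holds feature:nsect:lbal:lbam:lbah.
    group=$(printf '%s\n' "$1" | cut -d/ -f2)
    lbal=$(printf '%s\n' "$group" | cut -d: -f3)
    lbam=$(printf '%s\n' "$group" | cut -d: -f4)
    lbah=$(printf '%s\n' "$group" | cut -d: -f5)
    # Combine the hex bytes as lbah:lbam:lbal, most significant first.
    printf '%d\n' "0x${lbah}${lbam}${lbal}"
}

# The failed READ DMA from the log above; prints 10248.
decode_lba 'c8/00:08:08:28:00/00:00:00:00:00/e0'
```

The result, 10248, agrees with the Read(10) CDB bytes (00 00 28 08) and with the "I/O error, dev sda, sector 10248" line, so the failed READ DMA and the block-layer error refer to the same request.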

#3 2024-08-25 07:21:32

Head_on_a_Stick
Member
From: The Wirral
Registered: 2014-02-20
Posts: 9,003

If the errors show up on one machine but not another then it must be the cable and/or connector on the bad machine.


Jin, Jîyan, Azadî

#4 2024-08-25 07:37:38

Cory Parsnipson
Member
Registered: 2024-08-25
Posts: 20

I bought the computer that Arch is installed on refurbished, so that's definitely a possibility.

I did the following test though, which leads me to believe it's not the SSD, but also maybe not the SATA connector:

1 Intel SSD (giving errors)
1 Samsung 870 EVO SSD

If I put the Samsung SSD into the Arch machine, there are no ATA I/O errors in Arch.
If I put the Intel SSD in (not hotswapping), I get I/O errors on umount.

Putting the Intel SSD into my personal laptop running Fedora shows no I/O errors. I am able to use gparted to reformat and fsck the drive with no issues...

==================

Also, a small update: I tried disabling NCQ by writing 1 to the queue_depth of the block device, but that doesn't seem to help (or at least it didn't solve all the problems).
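
For anyone trying the same NCQ experiment, a minimal sketch of the sysfs approach follows. It assumes the drive is sda and that the function runs as root. SYSFS_ROOT is a made-up override (not a kernel or libata feature) that exists only so the function can be tried against a scratch directory.

```shell
#!/bin/sh
# Limit a disk's queue depth; a depth of 1 leaves no room for queued
# (NCQ) commands. SYSFS_ROOT is a hypothetical override for dry runs.
set_queue_depth() {
    dev=$1
    depth=$2
    path="${SYSFS_ROOT:-/sys}/block/$dev/device/queue_depth"
    if [ ! -w "$path" ]; then
        echo "cannot write $path (not root, or no such device?)" >&2
        return 1
    fi
    echo "$depth" > "$path"
    # Read the value back so the caller can see what is now in effect.
    cat "$path"
}

# Example (as root): set_queue_depth sda 1
```

A value written this way does not survive a reboot. As far as I know, the boot-time equivalent is the libata.force=noncq kernel parameter, which might be worth testing separately if the sysfs change appears to have no effect.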

I also tried the Intel/Solidigm firmware update tool from the SSD wiki page (https://wiki.archlinux.org/title/Solid_ … e#Firmware), to no avail... The tool seems unable to see this drive, even though fdisk lists it.

Last edited by Cory Parsnipson (2024-08-25 07:38:33)

#5 2024-08-25 08:09:59

Cory Parsnipson
Member
Registered: 2024-08-25
Posts: 20

Oh and I forgot to mention this thread, which seems to be very similar: https://bbs.archlinux.org/viewtopic.php?id=210100

This user had similar error messages about FLUSH CACHE being the failed command and 20 seconds of latency correlating to data access.

Someone in that thread mentioned something that sounded really helpful:

headkase wrote:

Arch packages upstream.  If VirtualBox doesn't provide a sane "machine" for the standard kernel to run on then it is a VirtualBox problem and not an Arch problem.  If you install Arch on the bare metal and get the same problem then it is likely an upstream kernel problem.  Assuming you had the VirtualBox modules properly installed in your virtual machine of course.

I am on bare metal, not a VM, so it could be an "upstream kernel problem" like they say? I have no idea where to go from here, but would be grateful if anyone had clues.

Thanks!

#6 2024-08-25 10:41:26

xerxes_
Member
Registered: 2018-04-29
Posts: 1,056

Can you post output of commands:

lspci -k
smartctl -x /dev/sda

Also look here:
https://serverfault.com/questions/95214 … nning-slow
https://bbs.archlinux.org/viewtopic.php?id=200492

#7 2024-08-25 21:22:27

Cory Parsnipson
Member
Registered: 2024-08-25
Posts: 20

Thanks for the response! I took a look at the threads you linked. It looks like the easiest thing to do is to change the cable or junk the drive, and I'm slowly coming around to the idea that this might be the only thing left for me to do. I'll hang in there for a little bit longer, though, just in case. Thanks again for the help, both of you.

I think the interesting part of the lspci is here:

05:00.0 SATA controller: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] (rev 61)
	Subsystem: Lenovo Device 3130
	Kernel driver in use: ahci

But here's the full output of lspci -k:

00:00.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Root Complex
	Subsystem: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Root Complex
00:00.2 IOMMU: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 IOMMU
	Subsystem: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 IOMMU
00:01.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge
00:01.2 PCI bridge: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 PCIe GPP Bridge [6:0]
	Subsystem: Lenovo Device 3130
	Kernel driver in use: pcieport
00:01.3 PCI bridge: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 PCIe GPP Bridge [6:0]
	Subsystem: Lenovo Device 3130
	Kernel driver in use: pcieport
00:01.4 PCI bridge: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 PCIe GPP Bridge [6:0]
	Subsystem: Lenovo Device 3130
	Kernel driver in use: pcieport
00:08.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge
00:08.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Internal PCIe GPP Bridge 0 to Bus A
	Subsystem: Advanced Micro Devices, Inc. [AMD] Device 0000
	Kernel driver in use: pcieport
00:08.2 PCI bridge: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Internal PCIe GPP Bridge 0 to Bus B
	Subsystem: Advanced Micro Devices, Inc. [AMD] Device 0000
	Kernel driver in use: pcieport
00:14.0 SMBus: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller (rev 61)
	Subsystem: Lenovo Device 3130
	Kernel driver in use: piix4_smbus
	Kernel modules: i2c_piix4, sp5100_tco
00:14.3 ISA bridge: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge (rev 51)
	Subsystem: Lenovo Device 3130
00:18.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device 24: Function 0
00:18.1 Host bridge: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device 24: Function 1
00:18.2 Host bridge: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device 24: Function 2
00:18.3 Host bridge: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device 24: Function 3
	Kernel driver in use: k10temp
	Kernel modules: k10temp
00:18.4 Host bridge: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device 24: Function 4
00:18.5 Host bridge: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device 24: Function 5
00:18.6 Host bridge: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device 24: Function 6
00:18.7 Host bridge: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device 24: Function 7
01:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8211/8411 PCI Express Gigabit Ethernet Controller (rev 0e)
	Subsystem: Lenovo Device 3130
	Kernel driver in use: r8169
	Kernel modules: r8169
01:00.1 Serial controller: Realtek Semiconductor Co., Ltd. RTL8111xP UART #1 (rev 0e)
	Subsystem: Lenovo Device 3130
	Kernel driver in use: serial
01:00.2 Serial controller: Realtek Semiconductor Co., Ltd. RTL8111xP UART #2 (rev 0e)
	DeviceName: Broadcom 5762
	Subsystem: Lenovo Device 3130
	Kernel driver in use: serial
01:00.3 IPMI Interface: Realtek Semiconductor Co., Ltd. RTL8111xP IPMI interface (rev 0e)
	Subsystem: Lenovo Device 3130
	Kernel modules: ipmi_si
01:00.4 USB controller: Realtek Semiconductor Co., Ltd. RTL811x EHCI host controller (rev 0e)
	Subsystem: Lenovo Device 3130
	Kernel driver in use: ehci-pci
02:00.0 Network controller: Qualcomm Atheros QCA6174 802.11ac Wireless Network Adapter (rev 32)
	Subsystem: Lenovo Device 0827
	Kernel driver in use: ath10k_pci
	Kernel modules: ath10k_pci
03:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller SM961/PM961/SM963
	Subsystem: Samsung Electronics Co Ltd SM963 2.5" NVMe PCIe SSD
	Kernel driver in use: nvme
	Kernel modules: nvme
04:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Raven Ridge [Radeon Vega Series / Radeon Vega Mobile Series] (rev d6)
	Subsystem: Lenovo Device 3130
	Kernel driver in use: amdgpu
	Kernel modules: amdgpu
04:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Raven/Raven2/Fenghuang HDMI/DP Audio Controller
	Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Raven/Raven2/Fenghuang HDMI/DP Audio Controller
	Kernel driver in use: snd_hda_intel
	Kernel modules: snd_hda_intel
04:00.2 Encryption controller: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 10h-1fh) Platform Security Processor
	Subsystem: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 10h-1fh) Platform Security Processor
	Kernel driver in use: ccp
	Kernel modules: ccp
04:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] Raven USB 3.1
	Subsystem: Lenovo Device 3130
	Kernel driver in use: xhci_hcd
	Kernel modules: xhci_pci
04:00.4 USB controller: Advanced Micro Devices, Inc. [AMD] Raven USB 3.1
	Subsystem: Lenovo Device 3130
	Kernel driver in use: xhci_hcd
	Kernel modules: xhci_pci
04:00.5 Multimedia controller: Advanced Micro Devices, Inc. [AMD] ACP/ACP3X/ACP6x Audio Coprocessor
	Subsystem: Advanced Micro Devices, Inc. [AMD] ACP/ACP3X/ACP6x Audio Coprocessor
	Kernel driver in use: snd_pci_acp3x
	Kernel modules: snd_pci_acp3x, snd_rn_pci_acp3x, snd_pci_acp5x, snd_pci_acp6x, snd_acp_pci, snd_rpl_pci_acp6x, snd_pci_ps, snd_sof_amd_renoir, snd_sof_amd_rembrandt, snd_sof_amd_vangogh, snd_sof_amd_acp63
04:00.6 Audio device: Advanced Micro Devices, Inc. [AMD] Family 17h/19h HD Audio Controller
	DeviceName: Realtek ALC898
	Subsystem: Lenovo Device 3130
	Kernel driver in use: snd_hda_intel
	Kernel modules: snd_hda_intel
05:00.0 SATA controller: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] (rev 61)
	Subsystem: Lenovo Device 3130
	Kernel driver in use: ahci

And here's smartctl -x /dev/sda:

smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.10.6-arch1-1] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Intel 53x and Pro 1500/2500 Series SSDs
Device Model:     INTEL SSDSC2BW120H6
Serial Number:    CVTR52320672120AGN
LU WWN Device Id: 5 5cd2e4 14c886546
Firmware Version: RG20
User Capacity:    120,034,123,776 bytes [120 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
TRIM Command:     Available, deterministic
Device is:        In smartctl database 7.3/5528
ATA Version is:   ACS-3 (minor revision not indicated)
SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Sun Aug 25 14:01:40 2024 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is:   Unavailable
APM level is:     254 (maximum performance)
Rd look-ahead is: Enabled
Write cache is:   Enabled
DSN feature is:   Unavailable
ATA Security is:  Disabled, frozen [SEC2]
Wt Cache Reorder: Unavailable

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00)	Offline data collection activity
					was never started.
					Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0)	The previous self-test routine completed
					without error or no self-test has ever 
					been run.
Total time to complete Offline 
data collection: 		( 2930) seconds.
Offline data collection
capabilities: 			 (0x7f) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Abort Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 (  58) minutes.
Conveyance self-test routine
recommended polling time: 	 (   4) minutes.
SCT capabilities: 	       (0x0025)	SCT Status supported.
					SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  5 Reallocated_Sector_Ct   -O--CK   100   100   000    -    0
  9 Power_On_Hours_and_Msec -O--CK   100   100   000    -    8069h+00m+00.000s
 12 Power_Cycle_Count       -O--CK   100   100   000    -    2712
170 Available_Reservd_Space PO--CK   083   100   010    -    0
171 Program_Fail_Count      -O--CK   100   100   000    -    0
172 Erase_Fail_Count        -O--CK   100   100   000    -    0
174 Unexpect_Power_Loss_Ct  -O--CK   100   100   000    -    46
183 SATA_Downshift_Count    -O--CK   100   100   000    -    157
184 End-to-End_Error        PO--CK   100   100   090    -    0
187 Uncorrectable_Error_Cnt -O--CK   100   100   000    -    0
190 Airflow_Temperature_Cel -O--CK   029   100   000    -    29 (Min/Max 9/50)
192 Power-Off_Retract_Count -O--CK   100   100   000    -    46
199 UDMA_CRC_Error_Count    -O--CK   100   100   000    -    2
225 Host_Writes_32MiB       -O--CK   100   100   000    -    208091
226 Workld_Media_Wear_Indic -O--CK   100   100   000    -    65535
227 Workld_Host_Reads_Perc  -O--CK   100   100   000    -    34
228 Workload_Minutes        -O--CK   100   100   000    -    65535
232 Available_Reservd_Space PO--CK   083   100   010    -    0
233 Media_Wearout_Indicator -O--CK   075   100   000    -    0
241 Host_Writes_32MiB       -O--CK   100   100   000    -    208091
242 Host_Reads_32MiB        -O--CK   100   100   000    -    109150
249 NAND_Writes_1GiB        -O--CK   100   100   000    -    47662
                            ||||||_ K auto-keep
                            |||||__ C event count
                            ||||___ R error rate
                            |||____ S speed/performance
                            ||_____ O updated online
                            |______ P prefailure warning

General Purpose Log Directory Version 1
SMART           Log Directory Version 1 [multi-sector log support]
Address    Access  R/W   Size  Description
0x00       GPL,SL  R/O      1  Log Directory
0x04       GPL,SL  R/O      1  Device Statistics log
0x06           SL  R/O      1  SMART self-test log
0x07       GPL     R/O      1  Extended self-test log
0x09           SL  R/W      1  Selective self-test log
0x10       GPL     R/O      1  NCQ Command Error log
0x11       GPL     R/O      1  SATA Phy Event Counters log
0x30       GPL,SL  R/O     16  IDENTIFY DEVICE data log
0x80-0x9f  GPL,SL  R/W     16  Host vendor specific log
0xb7       GPL,SL  VS      16  Device vendor specific log
0xe0       GPL,SL  R/W      1  SCT Command/Status
0xe1       GPL,SL  R/W      1  SCT Data Transfer

SMART Extended Comprehensive Error Log (GP Log 0x03) not supported

SMART Error Log not supported

SMART Extended Self-test Log Version: 1 (1 sectors)
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%      8066         -
# 2  Short offline       Aborted by host               30%      8059         -
# 3  Short offline       Completed without error       00%      8059         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

SCT Status Version:                  3
SCT Version (vendor specific):       0 (0x0000)
Device State:                        Active (0)
Current Temperature:                    29 Celsius
Power Cycle Min/Max Temperature:      9/50 Celsius
Lifetime    Min/Max Temperature:      9/50 Celsius
Under/Over Temperature Limit Count:   0/0

SCT Temperature History Version:     2
Temperature Sampling Period:         1 minute
Temperature Logging Interval:        10 minutes
Min/Max recommended Temperature:      0/70 Celsius
Min/Max Temperature Limit:            0/75 Celsius
Temperature History Size (Index):    478 (321)

Index    Estimated Time   Temperature Celsius
 322    2024-08-22 06:30    37  ******************
 323    2024-08-22 06:40    37  ******************
 324    2024-08-22 06:50    36  *****************
 325    2024-08-22 07:00     ?  -
 326    2024-08-22 07:10    29  **********
 327    2024-08-22 07:20    33  **************
 328    2024-08-22 07:30    35  ****************
 329    2024-08-22 07:40    35  ****************
 330    2024-08-22 07:50    35  ****************
 331    2024-08-22 08:00    33  **************
 332    2024-08-22 08:10    36  *****************
 333    2024-08-22 08:20     ?  -
 334    2024-08-22 08:30     ?  -
 335    2024-08-22 08:40    26  *******
 336    2024-08-22 08:50    31  ************
 337    2024-08-22 09:00    34  ***************
 338    2024-08-22 09:10    37  ******************
 339    2024-08-22 09:20    37  ******************
 340    2024-08-22 09:30    38  *******************
 341    2024-08-22 09:40     ?  -
 342    2024-08-22 09:50    23  ****
 343    2024-08-22 10:00    32  *************
 344    2024-08-22 10:10    35  ****************
 345    2024-08-22 10:20     ?  -
 346    2024-08-22 10:30    29  **********
 347    2024-08-22 10:40    33  **************
 348    2024-08-22 10:50    34  ***************
 349    2024-08-22 11:00    34  ***************
 350    2024-08-22 11:10    34  ***************
 351    2024-08-22 11:20     ?  -
 352    2024-08-22 11:30    26  *******
 353    2024-08-22 11:40    32  *************
 354    2024-08-22 11:50    32  *************
 355    2024-08-22 12:00    34  ***************
 356    2024-08-22 12:10    34  ***************
 357    2024-08-22 12:20    34  ***************
 358    2024-08-22 12:30     ?  -
 359    2024-08-22 12:40    26  *******
 360    2024-08-22 12:50    32  *************
 361    2024-08-22 13:00    35  ****************
 362    2024-08-22 13:10    36  *****************
 363    2024-08-22 13:20    36  *****************
 364    2024-08-22 13:30     ?  -
 ...    ..( 18 skipped).    ..  -
 383    2024-08-22 16:40     ?  -
 384    2024-08-22 16:50    36  *****************
 385    2024-08-22 17:00    41  **********************
 386    2024-08-22 17:10    44  *************************
 387    2024-08-22 17:20    45  **************************
 388    2024-08-22 17:30     ?  -
 ...    ..( 11 skipped).    ..  -
 400    2024-08-22 19:30     ?  -
 401    2024-08-22 19:40    34  ***************
 402    2024-08-22 19:50    37  ******************
 403    2024-08-22 20:00    39  ********************
 404    2024-08-22 20:10    40  *********************
 405    2024-08-22 20:20    40  *********************
 406    2024-08-22 20:30    43  ************************
 407    2024-08-22 20:40    44  *************************
 408    2024-08-22 20:50    45  **************************
 409    2024-08-22 21:00     ?  -
 ...    ..(227 skipped).    ..  -
 159    2024-08-24 11:00     ?  -
 160    2024-08-24 11:10    25  ******
 161    2024-08-24 11:20     ?  -
 ...    ..( 78 skipped).    ..  -
 240    2024-08-25 00:30     ?  -
 241    2024-08-25 00:40    29  **********
 242    2024-08-25 00:50     ?  -
 ...    ..( 57 skipped).    ..  -
 300    2024-08-25 10:30     ?  -
 301    2024-08-25 10:40    27  ********
 302    2024-08-25 10:50    29  **********
 303    2024-08-25 11:00    31  ************
 304    2024-08-25 11:10     ?  -
 305    2024-08-25 11:20     ?  -
 306    2024-08-25 11:30     ?  -
 307    2024-08-25 11:40    28  *********
 308    2024-08-25 11:50     ?  -
 ...    ..(  6 skipped).    ..  -
 315    2024-08-25 13:00     ?  -
 316    2024-08-25 13:10    35  ****************
 317    2024-08-25 13:20    39  ********************
 318    2024-08-25 13:30     ?  -
 ...    ..(  2 skipped).    ..  -
 321    2024-08-25 14:00     ?  -

SCT Error Recovery Control command not supported

Device Statistics (GP Log 0x04)
Page  Offset Size        Value Flags Description
0x01  =====  =               =  ===  == General Statistics (rev 2) ==
0x01  0x008  4            2712  ---  Lifetime Power-On Resets
0x01  0x010  4            8069  ---  Power-on Hours
0x01  0x018  6     13637490934  ---  Logical Sectors Written
0x01  0x020  6       246485129  ---  Number of Write Commands
0x01  0x028  6      7153255757  ---  Logical Sectors Read
0x01  0x030  6       159070545  ---  Number of Read Commands
0x04  =====  =               =  ===  == General Errors Statistics (rev 1) ==
0x04  0x008  4               0  ---  Number of Reported Uncorrectable Errors
0x04  0x010  4               0  ---  Resets Between Cmd Acceptance and Completion
0x05  =====  =               =  ===  == Temperature Statistics (rev 1) ==
0x05  0x008  1              29  ---  Current Temperature
0x05  0x010  1              31  ---  Average Short Term Temperature
0x05  0x018  1               -  ---  Average Long Term Temperature
0x05  0x020  1              47  ---  Highest Temperature
0x05  0x028  1              17  ---  Lowest Temperature
0x05  0x030  1              36  ---  Highest Average Short Term Temperature
0x05  0x038  1              27  ---  Lowest Average Short Term Temperature
0x05  0x040  1               -  ---  Highest Average Long Term Temperature
0x05  0x048  1               -  ---  Lowest Average Long Term Temperature
0x05  0x050  4               0  ---  Time in Over-Temperature
0x05  0x058  1              70  ---  Specified Maximum Operating Temperature
0x05  0x060  4               0  ---  Time in Under-Temperature
0x05  0x068  1               0  ---  Specified Minimum Operating Temperature
0x06  =====  =               =  ===  == Transport Statistics (rev 1) ==
0x06  0x008  4           15462  ---  Number of Hardware Resets
0x06  0x010  4             772  ---  Number of ASR Events
0x06  0x018  4               2  ---  Number of Interface CRC Errors
0x07  =====  =               =  ===  == Solid State Device Statistics (rev 1) ==
0x07  0x008  1              42  ---  Percentage Used Endurance Indicator
                                |||_ C monitored condition met
                                ||__ D supports DSN
                                |___ N normalized value

Pending Defects log (GP Log 0x0c) not supported

SATA Phy Event Counters (GP Log 0x11)
ID      Size     Value  Description
0x0001  2            0  Command failed due to ICRC error
0x0003  2            0  R_ERR response for device-to-host data FIS
0x0004  2            0  R_ERR response for host-to-device data FIS
0x0006  2            0  R_ERR response for device-to-host non-data FIS
0x0007  2            0  R_ERR response for host-to-device non-data FIS
0x0008  2            0  Device-to-host non-data FIS retries
0x0009  2            2  Transition from drive PhyRdy to drive PhyNRdy
0x000a  2            2  Device-to-host register FISes sent due to a COMRESET
0x000f  2            0  R_ERR response for host-to-device data FIS, CRC
0x0010  2            0  R_ERR response for host-to-device data FIS, non-CRC
0x0012  2            0  R_ERR response for host-to-device non-data FIS, CRC
0x0013  2            0  R_ERR response for host-to-device non-data FIS, non-CRC
0x0002  2            0  R_ERR response for data FIS
0x0005  2            0  R_ERR response for non-data FIS
0x000b  2            0  CRC errors within host-to-device FIS
0x000d  2            0  Non-CRC errors within host-to-device FIS

Offline

#8 2024-08-26 19:01:46

xerxes_
Member
Registered: 2018-04-29
Posts: 1,056

Re: [SOLVED] Intel Sata SSD Giving Suprious I/O Errors (I think...)

Besides checking the cables (data and power), you can also try the 'ahci.mobile_lpm_policy=1' boot parameter to see whether it changes anything for your SSD. This turns off power saving.

Offline

#9 2024-08-27 04:38:39

Cory Parsnipson
Member
Registered: 2024-08-25
Posts: 20

Re: [SOLVED] Intel Sata SSD Giving Suprious I/O Errors (I think...)

xerxes_ wrote:

Besides checking the cables (data and power), you can also try the 'ahci.mobile_lpm_policy=1' boot parameter to see whether it changes anything for your SSD. This turns off power saving.

I think that worked!! WOO HOO

Thank you!

The slowness is completely gone. No I/O errors reported in dmesg whatsoever!

I tried adding both noncq and ahci.mobile_lpm_policy=1 to my boot parameters, but the noncq doesn't seem to make a difference. Thanks!!
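
For anyone landing here later, this is a rough sketch of how one might confirm that the parameter actually reached the running kernel and see which link power management policy each SATA host ended up with. The helper names has_param and show_lpm_policies are made up for illustration, and I'm assuming the kernel exposes link_power_management_policy under /sys/class/scsi_host (it should for AHCI hosts, but the loop simply prints nothing if it doesn't).

```shell
#!/bin/sh
# Hypothetical helpers, not part of any package.

# has_param PARAM [FILE]: succeed if PARAM appears as a whole word in FILE
# (default: /proc/cmdline, the running kernel's command line).
has_param() {
    file=${2:-/proc/cmdline}
    [ -r "$file" ] || return 1
    tr ' ' '\n' < "$file" | grep -qxF -- "$1"
}

# show_lpm_policies [DIR]: print the link power management policy of every
# SCSI/SATA host found under DIR (default: /sys/class/scsi_host).
show_lpm_policies() {
    for f in "${1:-/sys/class/scsi_host}"/host*/link_power_management_policy; do
        if [ -r "$f" ]; then
            printf '%s: %s\n' "$f" "$(cat "$f")"
        fi
    done
    return 0
}

if has_param ahci.mobile_lpm_policy=1; then
    echo "ahci.mobile_lpm_policy=1 is on the kernel command line"
else
    echo "ahci.mobile_lpm_policy=1 not found on the kernel command line"
fi
show_lpm_policies
```

With the parameter in effect I would expect the hosts to report max_performance, but check on your own machine rather than taking my word for it.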

Last edited by Cory Parsnipson (2024-08-27 04:39:55)

Offline
