#1 2024-05-23 11:33:23

kokoko3k
Member
Registered: 2008-11-14
Posts: 2,411

Sporadic Random(?) errors on Bus? ssd?

Hi there,

I'm getting errors on my SATA bus or SSD:

koko@Gozer# sudo smartctl -a /dev/sdd
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.7.0-arch3-1] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Crucial/Micron Client SSDs
Device Model:     CT500MX500SSD1
Serial Number:    21082D1A481B
LU WWN Device Id: 5 00a075 12d1a481b
Firmware Version: M3CR032
User Capacity:    500.107.862.016 bytes [500 GB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    Solid State Device
Form Factor:      2.5 inches
TRIM Command:     Available
Device is:        In smartctl database 7.3/5528
ATA Version is:   ACS-3 T13/2161-D revision 5
SATA Version is:  SATA 3.3, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Thu May 23 13:24:49 2024 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x80) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever 
                                        been run.
Total time to complete Offline 
data collection:                (    0) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine 
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        (  30) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x0031) SCT Status supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   100   100   000    Pre-fail  Always       -       0
  5 Reallocate_NAND_Blk_Cnt 0x0032   100   100   010    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       8036
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       37
171 Program_Fail_Count      0x0032   100   100   000    Old_age   Always       -       0
172 Erase_Fail_Count        0x0032   100   100   000    Old_age   Always       -       0
173 Ave_Block-Erase_Count   0x0032   073   073   000    Old_age   Always       -       411
174 Unexpect_Power_Loss_Ct  0x0032   100   100   000    Old_age   Always       -       5
180 Unused_Reserve_NAND_Blk 0x0033   000   000   000    Pre-fail  Always       -       41
183 SATA_Interfac_Downshift 0x0032   100   100   000    Old_age   Always       -       2
184 Error_Correction_Count  0x0032   100   100   000    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
194 Temperature_Celsius     0x0022   060   040   000    Old_age   Always       -       40 (Min/Max 0/60)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_ECC_Cnt 0x0032   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   100   100   000    Old_age   Always       -       11
202 Percent_Lifetime_Remain 0x0030   073   073   001    Old_age   Offline      -       27
206 Write_Error_Rate        0x000e   100   100   000    Old_age   Always       -       0
210 Success_RAIN_Recov_Cnt  0x0032   100   100   000    Old_age   Always       -       0
246 Total_LBAs_Written      0x0032   100   100   000    Old_age   Always       -       23654003066
247 Host_Program_Page_Count 0x0032   100   100   000    Old_age   Always       -       412275119
248 FTL_Program_Page_Count  0x0032   100   100   000    Old_age   Always       -       5978128706

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Completed [00% left] (0-65535)
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

The above only provides legacy SMART information - try 'smartctl -x' for more

The smartctl output looks fine except for attribute 199, UDMA_CRC_Error_Count (raw value 11).
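
To watch whether that counter keeps climbing, a minimal sketch like the one below could pull just the raw value out of the attribute table. The function name smart_raw_value is made up for illustration, and it assumes the 10-column attribute layout printed above.

```shell
#!/bin/sh
# Print the RAW_VALUE column for one SMART attribute ID.
# Reads `smartctl -A`-style attribute table text on stdin.
# (smart_raw_value is a hypothetical helper name, not part of smartmontools.)
smart_raw_value() {
    # $1 = attribute ID, e.g. 199 for UDMA_CRC_Error_Count
    # Column 1 is the ID, column 10 is the start of RAW_VALUE.
    awk -v id="$1" '$1 == id && NF >= 10 { print $10; exit }'
}

# Usage (needs root; /dev/sdd is the device from this post):
#   sudo smartctl -A /dev/sdd | smart_raw_value 199
```

Running it before and after a cable swap and comparing the two numbers should show whether new CRC errors are still being counted.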

Logs from the last month (month names are in Italian, so "mag" is May):

apr 23 16:40:05 Gozer kernel: ata6.00: exception Emask 0x10 SAct 0x1000000 SErr 0x280100 action 0x6 frozen
apr 23 16:40:05 Gozer kernel: ata6.00: irq_stat 0x08000000, interface fatal error
apr 23 16:40:05 Gozer kernel: ata6: SError: { UnrecovData 10B8B BadCRC }
apr 23 16:40:05 Gozer kernel: ata6.00: failed command: READ FPDMA QUEUED
apr 23 16:40:05 Gozer kernel: ata6.00: cmd 60/c8:c0:18:30:60/00:00:1d:00:00/40 tag 24 ncq dma 102400 in
                                       res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
apr 23 16:40:05 Gozer kernel: ata6.00: status: { DRDY }
apr 23 16:40:05 Gozer kernel: ata6: hard resetting link
apr 23 16:40:05 Gozer kernel: ata6: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
apr 23 16:40:05 Gozer kernel: ACPI BIOS Error (bug): Could not resolve symbol [\_SB.PCI0.SAT0.SPT5._GTF.DSSP], AE_NOT_FOUND (20230628/psargs-330)
apr 23 16:40:05 Gozer kernel: ACPI Error: Aborting method \_SB.PCI0.SAT0.SPT5._GTF due to previous error (AE_NOT_FOUND) (20230628/psparse-529)
apr 23 16:40:05 Gozer kernel: ata6.00: supports DRM functions and may not be fully accessible
apr 23 16:40:05 Gozer kernel: ACPI BIOS Error (bug): Could not resolve symbol [\_SB.PCI0.SAT0.SPT5._GTF.DSSP], AE_NOT_FOUND (20230628/psargs-330)
apr 23 16:40:05 Gozer kernel: ACPI Error: Aborting method \_SB.PCI0.SAT0.SPT5._GTF due to previous error (AE_NOT_FOUND) (20230628/psparse-529)
apr 23 16:40:05 Gozer kernel: ata6.00: supports DRM functions and may not be fully accessible
apr 23 16:40:05 Gozer kernel: ata6.00: configured for UDMA/133
apr 23 16:40:05 Gozer kernel: sd 5:0:0:0: [sdd] tag#24 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
apr 23 16:40:05 Gozer kernel: sd 5:0:0:0: [sdd] tag#24 Sense Key : Illegal Request [current] 
apr 23 16:40:05 Gozer kernel: sd 5:0:0:0: [sdd] tag#24 Add. Sense: Unaligned write command
apr 23 16:40:05 Gozer kernel: sd 5:0:0:0: [sdd] tag#24 CDB: Read(10) 28 00 1d 60 30 18 00 00 c8 00
apr 23 16:40:05 Gozer kernel: I/O error, dev sdd, sector 492843032 op 0x0:(READ) flags 0x80700 phys_seg 25 prio class 2
apr 23 16:40:05 Gozer kernel: ata6: EH complete
--
                                       res 40/00:ff:00:00:00/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
apr 29 18:24:56 Gozer kernel: ata6.00: status: { DRDY }
apr 29 18:24:56 Gozer kernel: ata6: hard resetting link
apr 29 18:24:56 Gozer kernel: ata6: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
apr 29 18:24:56 Gozer kernel: ACPI BIOS Error (bug): Could not resolve symbol [\_SB.PCI0.SAT0.SPT5._GTF.DSSP], AE_NOT_FOUND (20230628/psargs-330)
apr 29 18:24:56 Gozer kernel: ACPI Error: Aborting method \_SB.PCI0.SAT0.SPT5._GTF due to previous error (AE_NOT_FOUND) (20230628/psparse-529)
apr 29 18:24:56 Gozer kernel: ata6.00: supports DRM functions and may not be fully accessible
apr 29 18:24:56 Gozer kernel: ACPI BIOS Error (bug): Could not resolve symbol [\_SB.PCI0.SAT0.SPT5._GTF.DSSP], AE_NOT_FOUND (20230628/psargs-330)
apr 29 18:24:56 Gozer kernel: ACPI Error: Aborting method \_SB.PCI0.SAT0.SPT5._GTF due to previous error (AE_NOT_FOUND) (20230628/psparse-529)
apr 29 18:24:56 Gozer kernel: ata6.00: supports DRM functions and may not be fully accessible
apr 29 18:24:56 Gozer kernel: ata6.00: configured for UDMA/133
apr 29 18:24:56 Gozer kernel: sd 5:0:0:0: [sdd] tag#6 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
apr 29 18:24:56 Gozer kernel: sd 5:0:0:0: [sdd] tag#6 Sense Key : Illegal Request [current] 
apr 29 18:24:56 Gozer kernel: sd 5:0:0:0: [sdd] tag#6 Add. Sense: Unaligned write command
apr 29 18:24:56 Gozer kernel: sd 5:0:0:0: [sdd] tag#6 CDB: Read(10) 28 00 1b 6a e5 28 00 00 08 00
apr 29 18:24:56 Gozer kernel: I/O error, dev sdd, sector 459990312 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 2
apr 29 18:24:56 Gozer kernel: sd 5:0:0:0: [sdd] tag#7 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
apr 29 18:24:56 Gozer kernel: sd 5:0:0:0: [sdd] tag#7 Sense Key : Illegal Request [current] 
apr 29 18:24:56 Gozer kernel: sd 5:0:0:0: [sdd] tag#7 Add. Sense: Unaligned write command
apr 29 18:24:56 Gozer kernel: sd 5:0:0:0: [sdd] tag#7 CDB: Read(10) 28 00 22 6d 59 40 00 00 08 00
apr 29 18:24:56 Gozer kernel: I/O error, dev sdd, sector 577591616 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 2
apr 29 18:24:56 Gozer kernel: ata6: EH complete
--
mag 15 10:36:04 Gozer kernel: ACPI Error: Aborting method \_SB.PCI0.SAT0.SPT5._GTF due to previous error (AE_NOT_FOUND) (20230628/psparse-529)
mag 15 10:36:04 Gozer kernel: ata6.00: supports DRM functions and may not be fully accessible
mag 15 10:36:04 Gozer kernel: ACPI BIOS Error (bug): Could not resolve symbol [\_SB.PCI0.SAT0.SPT5._GTF.DSSP], AE_NOT_FOUND (20230628/psargs-330)
mag 15 10:36:04 Gozer kernel: ACPI Error: Aborting method \_SB.PCI0.SAT0.SPT5._GTF due to previous error (AE_NOT_FOUND) (20230628/psparse-529)
mag 15 10:36:04 Gozer kernel: ata6.00: supports DRM functions and may not be fully accessible
mag 15 10:36:04 Gozer kernel: ata6.00: configured for UDMA/133
mag 15 10:36:04 Gozer kernel: sd 5:0:0:0: [sdd] tag#8 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
mag 15 10:36:04 Gozer kernel: sd 5:0:0:0: [sdd] tag#8 Sense Key : Illegal Request [current] 
mag 15 10:36:04 Gozer kernel: sd 5:0:0:0: [sdd] tag#8 Add. Sense: Unaligned write command
mag 15 10:36:04 Gozer kernel: sd 5:0:0:0: [sdd] tag#8 CDB: Read(10) 28 00 32 bc 0c 38 00 00 08 00
mag 15 10:36:04 Gozer kernel: I/O error, dev sdd, sector 851184696 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 2
mag 15 10:36:04 Gozer kernel: sd 5:0:0:0: [sdd] tag#9 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
mag 15 10:36:04 Gozer kernel: sd 5:0:0:0: [sdd] tag#9 Sense Key : Illegal Request [current] 
mag 15 10:36:04 Gozer kernel: sd 5:0:0:0: [sdd] tag#9 Add. Sense: Unaligned write command
mag 15 10:36:04 Gozer kernel: sd 5:0:0:0: [sdd] tag#9 CDB: Read(10) 28 00 1a 2d f1 a8 00 00 08 00
mag 15 10:36:04 Gozer kernel: I/O error, dev sdd, sector 439218600 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 2
mag 15 10:36:04 Gozer kernel: sd 5:0:0:0: [sdd] tag#10 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
mag 15 10:36:04 Gozer kernel: sd 5:0:0:0: [sdd] tag#10 Sense Key : Illegal Request [current] 
mag 15 10:36:04 Gozer kernel: sd 5:0:0:0: [sdd] tag#10 Add. Sense: Unaligned write command
mag 15 10:36:04 Gozer kernel: sd 5:0:0:0: [sdd] tag#10 CDB: Read(10) 28 00 27 9b 6e d8 00 00 08 00
mag 15 10:36:04 Gozer kernel: I/O error, dev sdd, sector 664497880 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 2
mag 15 10:36:04 Gozer kernel: ata6: EH complete
--
mag 17 00:02:04 Gozer python[1522229]: backintime (root/1): INFO: Lock
mag 17 00:02:04 Gozer python[1522229]: backintime (root/1): INFO: Take a new snapshot. Profile: 1 Main profile
mag 17 00:02:04 Gozer python[1522229]: backintime (root/1): INFO: Call rsync to take the snapshot
mag 17 00:02:05 Gozer python[1522232]: QStandardPaths: XDG_RUNTIME_DIR not set, defaulting to '/tmp/runtime-root'
mag 17 00:02:05 Gozer krunner[578479]: PLUGIN_090_INDEXER.reindex_timer_Timer.297: 13        minutes to go
mag 17 00:02:17 Gozer kernel: ata6.00: exception Emask 0x10 SAct 0x400 SErr 0x280100 action 0x6 frozen
mag 17 00:02:17 Gozer kernel: ata6.00: irq_stat 0x08000000, interface fatal error
mag 17 00:02:17 Gozer kernel: ata6: SError: { UnrecovData 10B8B BadCRC }
mag 17 00:02:17 Gozer kernel: ata6.00: failed command: READ FPDMA QUEUED
mag 17 00:02:17 Gozer kernel: ata6.00: cmd 60/08:50:20:11:a1/00:00:2c:00:00/40 tag 10 ncq dma 4096 in
                                       res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
mag 17 00:02:17 Gozer kernel: ata6.00: status: { DRDY }
mag 17 00:02:17 Gozer kernel: ata6: hard resetting link
mag 17 00:02:17 Gozer kernel: ata6: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
mag 17 00:02:17 Gozer kernel: ACPI BIOS Error (bug): Could not resolve symbol [\_SB.PCI0.SAT0.SPT5._GTF.DSSP], AE_NOT_FOUND (20230628/psargs-330)
mag 17 00:02:17 Gozer kernel: ACPI Error: Aborting method \_SB.PCI0.SAT0.SPT5._GTF due to previous error (AE_NOT_FOUND) (20230628/psparse-529)
mag 17 00:02:17 Gozer kernel: ata6.00: supports DRM functions and may not be fully accessible
mag 17 00:02:17 Gozer kernel: ACPI BIOS Error (bug): Could not resolve symbol [\_SB.PCI0.SAT0.SPT5._GTF.DSSP], AE_NOT_FOUND (20230628/psargs-330)
mag 17 00:02:17 Gozer kernel: ACPI Error: Aborting method \_SB.PCI0.SAT0.SPT5._GTF due to previous error (AE_NOT_FOUND) (20230628/psparse-529)
mag 17 00:02:17 Gozer kernel: ata6.00: supports DRM functions and may not be fully accessible
mag 17 00:02:17 Gozer kernel: ata6.00: configured for UDMA/133
mag 17 00:02:17 Gozer kernel: ata6: EH complete
--
mag 21 00:02:50 Gozer kernel: ata6.00: exception Emask 0x10 SAct 0x20000 SErr 0x280100 action 0x6 frozen
mag 21 00:02:50 Gozer kernel: ata6.00: irq_stat 0x08000000, interface fatal error
mag 21 00:02:50 Gozer kernel: ata6: SError: { UnrecovData 10B8B BadCRC }
mag 21 00:02:50 Gozer kernel: ata6.00: failed command: READ FPDMA QUEUED
mag 21 00:02:50 Gozer kernel: ata6.00: cmd 60/20:88:28:a5:a5/00:00:19:00:00/40 tag 17 ncq dma 16384 in
                                       res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
mag 21 00:02:50 Gozer kernel: ata6.00: status: { DRDY }
mag 21 00:02:50 Gozer kernel: ata6: hard resetting link
mag 21 00:02:50 Gozer kernel: ata6: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
mag 21 00:02:50 Gozer kernel: ACPI BIOS Error (bug): Could not resolve symbol [\_SB.PCI0.SAT0.SPT5._GTF.DSSP], AE_NOT_FOUND (20230628/psargs-330)
mag 21 00:02:50 Gozer kernel: ACPI Error: Aborting method \_SB.PCI0.SAT0.SPT5._GTF due to previous error (AE_NOT_FOUND) (20230628/psparse-529)
mag 21 00:02:50 Gozer kernel: ata6.00: supports DRM functions and may not be fully accessible
mag 21 00:02:50 Gozer kernel: ACPI BIOS Error (bug): Could not resolve symbol [\_SB.PCI0.SAT0.SPT5._GTF.DSSP], AE_NOT_FOUND (20230628/psargs-330)
mag 21 00:02:50 Gozer kernel: ACPI Error: Aborting method \_SB.PCI0.SAT0.SPT5._GTF due to previous error (AE_NOT_FOUND) (20230628/psparse-529)
mag 21 00:02:50 Gozer kernel: ata6.00: supports DRM functions and may not be fully accessible
mag 21 00:02:50 Gozer kernel: ata6.00: configured for UDMA/133
mag 21 00:02:50 Gozer kernel: sd 5:0:0:0: [sdd] tag#17 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
mag 21 00:02:50 Gozer kernel: sd 5:0:0:0: [sdd] tag#17 Sense Key : Illegal Request [current] 
mag 21 00:02:50 Gozer kernel: sd 5:0:0:0: [sdd] tag#17 Add. Sense: Unaligned write command
mag 21 00:02:50 Gozer kernel: sd 5:0:0:0: [sdd] tag#17 CDB: Read(10) 28 00 19 a5 a5 28 00 00 20 00
mag 21 00:02:50 Gozer kernel: I/O error, dev sdd, sector 430286120 op 0x0:(READ) flags 0x80700 phys_seg 4 prio class 2
mag 21 00:02:50 Gozer kernel: ata6: EH complete
--
mag 21 00:06:09 Gozer kernel: ata6.00: exception Emask 0x10 SAct 0x1000 SErr 0x280100 action 0x6 frozen
mag 21 00:06:09 Gozer kernel: ata6.00: irq_stat 0x08000000, interface fatal error
mag 21 00:06:09 Gozer kernel: ata6: SError: { UnrecovData 10B8B BadCRC }
mag 21 00:06:09 Gozer kernel: ata6.00: failed command: READ FPDMA QUEUED
mag 21 00:06:09 Gozer kernel: ata6.00: cmd 60/08:60:60:16:61/00:00:36:00:00/40 tag 12 ncq dma 4096 in
                                       res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
mag 21 00:06:09 Gozer kernel: ata6.00: status: { DRDY }
mag 21 00:06:09 Gozer kernel: ata6: hard resetting link
mag 21 00:06:10 Gozer kernel: ata6: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
mag 21 00:06:10 Gozer kernel: ACPI BIOS Error (bug): Could not resolve symbol [\_SB.PCI0.SAT0.SPT5._GTF.DSSP], AE_NOT_FOUND (20230628/psargs-330)
mag 21 00:06:10 Gozer kernel: ACPI Error: Aborting method \_SB.PCI0.SAT0.SPT5._GTF due to previous error (AE_NOT_FOUND) (20230628/psparse-529)
mag 21 00:06:10 Gozer kernel: ata6.00: supports DRM functions and may not be fully accessible
mag 21 00:06:10 Gozer kernel: ACPI BIOS Error (bug): Could not resolve symbol [\_SB.PCI0.SAT0.SPT5._GTF.DSSP], AE_NOT_FOUND (20230628/psargs-330)
mag 21 00:06:10 Gozer kernel: ACPI Error: Aborting method \_SB.PCI0.SAT0.SPT5._GTF due to previous error (AE_NOT_FOUND) (20230628/psparse-529)
mag 21 00:06:10 Gozer kernel: ata6.00: supports DRM functions and may not be fully accessible
mag 21 00:06:10 Gozer kernel: ata6.00: configured for UDMA/133
mag 21 00:06:10 Gozer kernel: ata6: EH complete
--
mag 21 01:11:00 Gozer kernel: ata6.00: exception Emask 0x10 SAct 0x40000 SErr 0x280100 action 0x6 frozen
mag 21 01:11:00 Gozer kernel: ata6.00: irq_stat 0x08000000, interface fatal error
mag 21 01:11:00 Gozer kernel: ata6: SError: { UnrecovData 10B8B BadCRC }
mag 21 01:11:00 Gozer kernel: ata6.00: failed command: READ FPDMA QUEUED
mag 21 01:11:00 Gozer kernel: ata6.00: cmd 60/a8:90:60:19:20/00:00:02:00:00/40 tag 18 ncq dma 86016 in
                                       res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
mag 21 01:11:00 Gozer kernel: ata6.00: status: { DRDY }
mag 21 01:11:00 Gozer kernel: ata6: hard resetting link
mag 21 01:11:00 Gozer kernel: ata6: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
mag 21 01:11:00 Gozer kernel: ACPI BIOS Error (bug): Could not resolve symbol [\_SB.PCI0.SAT0.SPT5._GTF.DSSP], AE_NOT_FOUND (20230628/psargs-330)
mag 21 01:11:00 Gozer kernel: ACPI Error: Aborting method \_SB.PCI0.SAT0.SPT5._GTF due to previous error (AE_NOT_FOUND) (20230628/psparse-529)
mag 21 01:11:00 Gozer kernel: ata6.00: supports DRM functions and may not be fully accessible
mag 21 01:11:00 Gozer kernel: ACPI BIOS Error (bug): Could not resolve symbol [\_SB.PCI0.SAT0.SPT5._GTF.DSSP], AE_NOT_FOUND (20230628/psargs-330)
mag 21 01:11:00 Gozer kernel: ACPI Error: Aborting method \_SB.PCI0.SAT0.SPT5._GTF due to previous error (AE_NOT_FOUND) (20230628/psparse-529)
mag 21 01:11:00 Gozer kernel: ata6.00: supports DRM functions and may not be fully accessible
mag 21 01:11:00 Gozer kernel: ata6.00: configured for UDMA/133
mag 21 01:11:00 Gozer kernel: sd 5:0:0:0: [sdd] tag#18 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
mag 21 01:11:00 Gozer kernel: sd 5:0:0:0: [sdd] tag#18 Sense Key : Illegal Request [current] 
mag 21 01:11:00 Gozer kernel: sd 5:0:0:0: [sdd] tag#18 Add. Sense: Unaligned write command
mag 21 01:11:00 Gozer kernel: sd 5:0:0:0: [sdd] tag#18 CDB: Read(10) 28 00 02 20 19 60 00 00 a8 00
mag 21 01:11:00 Gozer kernel: I/O error, dev sdd, sector 35658080 op 0x0:(READ) flags 0x80700 phys_seg 21 prio class 3
mag 21 01:11:00 Gozer kernel: ata6: EH complete
koko@Gozer# lspci
00:00.0 Host bridge: Intel Corporation 4th Gen Core Processor DRAM Controller (rev 06)
00:01.0 PCI bridge: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor PCI Express x16 Controller (rev 06)
00:02.0 VGA compatible controller: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor Integrated Graphics Controller (rev 06)
00:03.0 Audio device: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor HD Audio Controller (rev 06)
00:16.0 Communication controller: Intel Corporation 9 Series Chipset Family ME Interface #1
00:1a.0 USB controller: Intel Corporation 9 Series Chipset Family USB EHCI Controller #2
00:1b.0 Audio device: Intel Corporation 9 Series Chipset Family HD Audio Controller
00:1c.0 PCI bridge: Intel Corporation 9 Series Chipset Family PCI Express Root Port 1 (rev d0)
00:1c.2 PCI bridge: Intel Corporation 9 Series Chipset Family PCI Express Root Port 3 (rev d0)
00:1c.3 PCI bridge: Intel Corporation 82801 PCI Bridge (rev d0)
00:1d.0 USB controller: Intel Corporation 9 Series Chipset Family USB EHCI Controller #1
00:1f.0 ISA bridge: Intel Corporation Z97 Chipset LPC Controller
00:1f.2 SATA controller: Intel Corporation 9 Series Chipset Family SATA Controller [AHCI Mode]
00:1f.3 SMBus: Intel Corporation 9 Series Chipset Family SMBus Controller
01:00.0 VGA compatible controller: NVIDIA Corporation GP106 [GeForce GTX 1060 3GB] (rev a1)
01:00.1 Audio device: NVIDIA Corporation GP106 High Definition Audio Controller (rev a1)
03:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 11)
04:00.0 PCI bridge: ASMedia Technology Inc. ASM1083/1085 PCIe to PCI Bridge (rev 04)
05:01.0 Ethernet controller: D-Link System Inc DGE-528T Gigabit Ethernet Adapter (rev 10)
koko@Gozer# lsblk
NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
sda      8:0    0 465,8G  0 disk 
├─sda1   8:1    0 457,4G  0 part /mnt/disco2
└─sda2   8:2    0   8,4G  0 part [SWAP]
sdb      8:16   0 465,8G  0 disk 
├─sdb1   8:17   0 457,8G  0 part /mnt/rotativo3
└─sdb2   8:18   0     8G  0 part [SWAP]
sdc      8:32   0 447,1G  0 disk 
└─sdc1   8:33   0 447,1G  0 part /mnt/ssd_sandisk_vecchia
sdd      8:48   0 465,8G  0 disk 
├─sdd1   8:49   0     1G  0 part /boot
├─sdd2   8:50   0     4G  0 part 
├─sdd3   8:51   0    80G  0 part /
└─sdd4   8:52   0 380,8G  0 part /home
sde      8:64   1     0B  0 disk 
sdf      8:80   1     0B  0 disk 
sdg      8:96   1     0B  0 disk 
sdh      8:112  1     0B  0 disk 

Any advice?

Thank you!

Last edited by kokoko3k (2024-05-23 11:36:02)


Help me to improve ssh-rdp !
Retroarch User? Try my koko-aio shader !


#2 2024-05-23 14:05:39

V1del
Forum Moderator
Registered: 2012-10-16
Posts: 22,093

Re: Sporadic Random(?) errors on Bus? ssd?

Look into/try disabling SATA power management: https://wiki.archlinux.org/title/Power_ … Management
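
As a rough sketch of the runtime route (not persistent across reboots), the policy can be written to each host's sysfs file. The function name set_lpm_policy and its optional second argument are assumptions for illustration; the second argument only exists so the loop can be tried against a scratch directory first.

```shell
#!/bin/sh
# Write one link power management policy to every SATA host.
# (set_lpm_policy is a hypothetical helper name.)
# $1 = policy string, e.g. max_performance
# $2 = base directory; defaults to the usual sysfs location
set_lpm_policy() {
    policy=$1
    base=${2:-/sys/class/scsi_host}
    for f in "$base"/host*/link_power_management_policy; do
        [ -e "$f" ] || continue          # skip hosts without this file
        printf '%s\n' "$policy" > "$f"
    done
}

# Usage (as root):
#   set_lpm_policy max_performance
```

The result can be checked by reading the same files back.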


#3 2024-05-23 19:37:47

kokoko3k
Member
Registered: 2008-11-14
Posts: 2,411

Re: Sporadic Random(?) errors on Bus? ssd?

Thanks, I'll try that and report back as soon as I can (probably in a few weeks).



#4 2024-05-28 22:46:28

loqs
Member
Registered: 2014-03-06
Posts: 17,714

Re: Sporadic Random(?) errors on Bus? ssd?

If disabling LPM works, please consider trying this patch:

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index c449d60d9bb9..4b45ce6ed1c4 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -4183,6 +4183,9 @@ static const struct ata_blacklist_entry ata_device_blacklist [] = {
 	/* Crucial BX100 SSD 500GB has broken LPM support */
 	{ "CT500BX100SSD1",		NULL,	ATA_HORKAGE_NOLPM },
 
+	/* Crucial MX500 SSD 500GB has broken LPM support */
+	{ "CT500MX500SSD1",		NULL,	ATA_HORKAGE_NOLPM },
+
 	/* 512GB MX100 with MU01 firmware has both queued TRIM and LPM issues */
 	{ "Crucial_CT512MX100*",	"MU01",	ATA_HORKAGE_NO_NCQ_TRIM |
 						ATA_HORKAGE_ZERO_AFTER_TRIM |


#5 2024-05-29 04:16:51

kokoko3k
Member
Registered: 2008-11-14
Posts: 2,411

Re: Sporadic Random(?) errors on Bus? ssd?

Thanks. So far so good, but the error rate was already very low before, so I'm waiting with my fingers crossed.



#6 2024-05-29 08:21:53

kokoko3k
Member
Registered: 2008-11-14
Posts: 2,411

Re: Sporadic Random(?) errors on Bus? ssd?

Dang.
Another error, even though LPM is set to max_performance on every host:

grep . /sys/class/scsi_host/host*/link_power_management_policy
/sys/class/scsi_host/host0/link_power_management_policy:max_performance
/sys/class/scsi_host/host1/link_power_management_policy:max_performance
/sys/class/scsi_host/host2/link_power_management_policy:max_performance
/sys/class/scsi_host/host3/link_power_management_policy:max_performance
/sys/class/scsi_host/host4/link_power_management_policy:max_performance
/sys/class/scsi_host/host5/link_power_management_policy:max_performance

Could it be the cable? What else?



#7 2024-05-29 20:01:37

seth
Member
Registered: 2012-09-03
Posts: 53,182

Re: Sporadic Random(?) errors on Bus? ssd?

The errors shown are all CRC errors, and the affected sectors are all over the place.
If it's not ALPM, it might be ASPM (try pcie_aspm=off), but yes, I'd look at the cable. Or the connectors.
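
To check both observations against a longer log, here is a minimal sketch that lists the failing sectors and counts the BadCRC events. The function names failed_sectors and count_badcrc are made up for illustration, and the patterns assume the kernel message wording quoted in the first post.

```shell
#!/bin/sh
# Both helpers read kernel log text on stdin.
# (failed_sectors and count_badcrc are hypothetical helper names.)

# Print the sector of every "I/O error" line for one device, sorted.
# $1 = device name as the kernel prints it, e.g. sdd
failed_sectors() {
    sed -n "s/.*I\/O error, dev $1, sector \([0-9][0-9]*\) .*/\1/p" | sort -n
}

# Count SError lines that report a CRC error on the link.
count_badcrc() {
    grep -c 'SError: {.*BadCRC'
}

# Usage:
#   journalctl -k | failed_sectors sdd
#   journalctl -k | count_badcrc
```

Sectors scattered across the whole disk, combined with a BadCRC on each event, would be consistent with a link problem rather than a bad region of flash.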

