
#1 2018-02-05 01:21:13

Al.Piotrowicz
Member
Registered: 2017-08-07
Posts: 116

4.14.15-1 arch, single ssd disk emask frozen ata error

After fighting for some time with WD-EARS Green disk errors, those seemed to go away after simply changing the BIOS ATA mode from native ATA to AHCI. Recently the ATA error was triggered again while the system was idle. I suspected the WD Greens again, but surprisingly this time it was related to the GOODRAM SSD. I should note that this drive is attached only for a Windows installation; it was idle and not mounted when the error occurred.

kernel: ata4: exception Emask 0x10 SAct 0x0 SErr 0x90202 action 0xe frozen
kernel: ata4: irq_stat 0x00400000, PHY RDY changed
kernel: ata4: SError: { RecovComm Persist PHYRdyChg 10B8B }
kernel: ata4: hard resetting link
kernel: ata4: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
kernel: ata4.00: configured for UDMA/133
kernel: ata4: EH complete
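
For reference, the raw SErr and SStatus values in these lines can be decoded by hand. Below is a minimal Python sketch, assuming the SError bit names that libata's error handler prints (as I understand them from drivers/ata/libata-eh.c; check against your kernel source) and the standard SStatus field layout (DET in bits 3:0, SPD in bits 7:4, IPM in bits 11:8). The function names are made up for illustration.

```python
# SError bit positions and the short names libata prints for them.
# Assumption: taken from the kernel's libata error handler; verify for your version.
SERR_BITS = {
    0: "RecovData",
    1: "RecovComm",
    8: "UnrecovData",
    9: "Persist",
    10: "Proto",
    11: "HostInt",
    16: "PHYRdyChg",
    17: "PHYInt",
    18: "CommWake",
    19: "10B8B",
    20: "Dispar",
    21: "BadCRC",
    22: "Handshk",
    23: "LinkSeq",
    24: "TrStaTrns",
    25: "UnrecFIS",
    26: "DevExch",
}


def decode_serr(value):
    """Return the names of all SError bits set in value, lowest bit first."""
    return [name for bit, name in sorted(SERR_BITS.items()) if value & (1 << bit)]


def decode_sstatus(value):
    """Split an SStatus value into its DET, SPD and IPM fields."""
    return {
        "det": value & 0xF,         # 3 = device present, PHY communication established
        "spd": (value >> 4) & 0xF,  # 3 = Gen3 link speed (6.0 Gbps)
        "ipm": (value >> 8) & 0xF,  # 1 = interface in active state
    }


if __name__ == "__main__":
    # The kernel prints these registers in hex without a 0x prefix for SStatus.
    print(decode_serr(0x90202))
    print(decode_sstatus(0x133))
```

Decoding 0x90202 this way gives the same four flags the kernel already prints in the SError line, so the sketch only confirms the log; it does not by itself say what caused the PHY state change.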

These are the ONLY lines logged in the journal. My question is: what the hell could these be related to? I don't see any recent kernel libata commit that could theoretically cause this.

smartctl output:

smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.14.15-1-ARCH] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Phison Driven SSDs
Device Model:     GOODRAM
Serial Number:    8EFA076804E705324860
Firmware Version: SAFM22.3
User Capacity:    240,057,409,536 bytes [240 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Form Factor:      2.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2 (minor revision not indicated)
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Mon Feb  5 02:10:41 2018 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00)	Offline data collection activity
					was never started.
					Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0)	The previous self-test routine completed
					without error or no self-test has ever 
					been run.
Total time to complete Offline 
data collection: 		(   30) seconds.
Offline data collection
capabilities: 			 (0x79) SMART execute Offline immediate.
					No Auto Offline data collection support.
					Suspend Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   1) minutes.
Extended self-test routine
recommended polling time: 	 (   2) minutes.
Conveyance self-test routine
recommended polling time: 	 (   3) minutes.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   050    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0012   100   100   000    Old_age   Always       -       1801
 12 Power_Cycle_Count       0x0012   100   100   000    Old_age   Always       -       219
168 SATA_Phy_Error_Count    0x0012   100   100   000    Old_age   Always       -       0
170 Bad_Blk_Ct_Erl/Lat      0x0003   100   100   010    Pre-fail  Always       -       0/734
173 MaxAvgErase_Ct          0x0012   100   100   000    Old_age   Always       -       4 (Average 1)
192 Unsafe_Shutdown_Count   0x0012   100   100   000    Old_age   Always       -       12
194 Temperature_Celsius     0x0023   098   098   057    Pre-fail  Always       -       30 (Min/Max 30/30)
218 CRC_Error_Count         0x000b   100   100   050    Pre-fail  Always       -       0
231 SSD_Life_Left           0x0013   100   100   000    Pre-fail  Always       -       100
241 Lifetime_Writes_GiB     0x0012   100   100   000    Old_age   Always       -       230
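
To compare the drive's own link counters between occurrences, the attribute table can be parsed with a few lines of Python. This is only a sketch: it assumes the ten-column layout shown above, treats attributes 168 (SATA_Phy_Error_Count) and 218 (CRC_Error_Count) as the link-related ones for this Phison-based drive, and uses made-up helper names. The sample text is copied from the table above.

```python
# A few rows copied from the smartctl attribute table above.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  9 Power_On_Hours          0x0012   100   100   000    Old_age   Always       -       1801
168 SATA_Phy_Error_Count    0x0012   100   100   000    Old_age   Always       -       0
173 MaxAvgErase_Ct          0x0012   100   100   000    Old_age   Always       -       4 (Average 1)
218 CRC_Error_Count         0x000b   100   100   050    Pre-fail  Always       -       0
"""

# Assumption: these two IDs are the link-level counters on this drive.
LINK_ATTRIBUTE_IDS = {168, 218}


def parse_attributes(text):
    """Map attribute ID -> (name, raw value string) for each table row."""
    rows = {}
    for line in text.splitlines():
        # Split into at most 10 fields so raw values with spaces stay intact.
        fields = line.split(None, 9)
        if len(fields) == 10 and fields[0].isdigit():
            rows[int(fields[0])] = (fields[1], fields[9].strip())
    return rows


def link_counters(text):
    """Return {attribute name: raw value} for the link-related attributes."""
    rows = parse_attributes(text)
    return {rows[i][0]: rows[i][1] for i in sorted(LINK_ATTRIBUTE_IDS) if i in rows}


if __name__ == "__main__":
    for name, raw in link_counters(SAMPLE).items():
        print(f"{name}: {raw}")
```

Saving the output of `smartctl -A` after each occurrence and passing it through `link_counters` would show whether either counter ever moves. In the output above both are 0, even though the kernel logged a PHY state change.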

Thank you for any reply. Regards.

Last edited by Al.Piotrowicz (2018-02-06 00:52:12)


#2 2018-02-16 10:41:46

Al.Piotrowicz
Member
Registered: 2017-08-07
Posts: 116

Re: 4.14.15-1 arch, single ssd disk emask frozen ata error

This still happens with the latest mainline generic kernel 4.15.3-1 under the same circumstances. The SSD attached to the affected ATA port was unmounted when the exception occurred. The error in the journal is EXACTLY the same.

Last edited by Al.Piotrowicz (2018-02-16 10:44:26)
