#1 2015-12-09 13:14:21

e_Irbis
Member
Registered: 2015-12-09
Posts: 3

ssd quickly ending life

I have an OCZ SSD with a Btrfs filesystem. /var has been moved to the HDD. In less than a year, the Media_Wearout_Indicator value has fallen to 78%.
What could be the problem?
The SMART log and the output of cat /etc/fstab are attached.

smartctl 6.4 2015-06-04 r4109 [x86_64-linux-4.2.5-1-ARCH] (local build)
Copyright (C) 2002-15, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model:     OCZ-ARC100
Serial Number:    A22L0061519002219
LU WWN Device Id: 5 e83a97 100065354
Firmware Version: 1.01
User Capacity:    120 034 123 776 bytes [120 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   ATA8-ACS (minor revision not indicated)
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Wed Dec  9 15:53:23 2015 MSK
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever 
                                        been run.
Total time to complete Offline 
data collection:                (    0) seconds.
Offline data collection
capabilities:                    (0x1d) SMART execute Offline immediate.
                                        No Auto Offline data collection support.
                                        Abort Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        No Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x00) Error logging NOT supported.
                                        General Purpose Logging supported.
Short self-test routine 
recommended polling time:        (   0) minutes.
Extended self-test routine
recommended polling time:        (   0) minutes.

SMART Attributes Data Structure revision number: 18
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0000   000   000   000    Old_age   Offline      -       0
  9 Power_On_Hours          0x0000   100   100   000    Old_age   Offline      -       1126
 12 Power_Cycle_Count       0x0000   100   100   000    Old_age   Offline      -       416
171 Unknown_Attribute       0x0000   100   100   000    Old_age   Offline      -       39646288
174 Unknown_Attribute       0x0000   100   100   000    Old_age   Offline      -       22
195 Hardware_ECC_Recovered  0x0000   100   100   000    Old_age   Offline      -       0
196 Reallocated_Event_Count 0x0000   100   100   000    Old_age   Offline      -       0
197 Current_Pending_Sector  0x0000   100   100   000    Old_age   Offline      -       0
208 Unknown_SSD_Attribute   0x0000   100   100   000    Old_age   Offline      -       685
210 Unknown_Attribute       0x0000   100   100   000    Old_age   Offline      -       0
224 Unknown_SSD_Attribute   0x0000   100   100   000    Old_age   Offline      -       1
233 Media_Wearout_Indicator 0x0000   078   078   000    Old_age   Offline      -       78
241 Total_LBAs_Written      0x0000   100   100   000    Old_age   Offline      -       971
242 Total_LBAs_Read         0x0000   100   100   000    Old_age   Offline      -       465
249 Unknown_Attribute       0x0000   100   100   000    Old_age   Offline      -       56808951

SMART Error Log Version: 1
No Errors Logged

Warning! SMART Self-Test Log Structure error: invalid SMART checksum.
SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

# /etc/fstab
# /dev/sda1 LABEL=archroot
UUID=44454709-0478-4530-af2f-374c10f1bacb       /               btrfs           rw,noatime,compress=lzo,ssd,discard,space_cache,subvol=@/curr   0 0

# /dev/sda1 LABEL=archroot
UUID=44454709-0478-4530-af2f-374c10f1bacb       /home           btrfs           rw,noatime,compress=lzo,ssd,discard,space_cache,subvol=@home/curr       0 0

# /dev/sda1 LABEL=archroot
UUID=44454709-0478-4530-af2f-374c10f1bacb       /home/vadim/virtual     btrfs           rw,noatime,compress=lzo,ssd,discard,space_cache,subvol=@virtual/curr    0 0

# /dev/sdb4 LABEL=hdd-data
UUID=7b8e0dff-391b-42cb-8505-a44b4eca35be       /var            btrfs           rw,relatime,compress=lzo,space_cache,subvol=@var/curr   0 0

# /dev/sdb4 LABEL=hdd-data
UUID=7b8e0dff-391b-42cb-8505-a44b4eca35be       /home/vadim/docks_hdd   btrfs           rw,relatime,compress=lzo,space_cache,subvol=@docs/curr  0 0

#2 2015-12-09 14:24:19

graysky
Wiki Maintainer
From: :wq
Registered: 2008-12-01
Posts: 10,731

Re: ssd quickly ending life

What is that indicator based on and is it even relevant?

#3 2015-12-09 14:40:55

R00KIE
Forum Fellow
From: Between a computer and a chair
Registered: 2008-09-14
Posts: 4,734

Re: ssd quickly ending life

You should try to find out what all those Unknown_Attribute entries mean; maybe there is something useful there.

I'd say that you should look at the spare sectors/blocks, if the SSD provides that info, instead of the Media_Wearout_Indicator. If it doesn't provide that info, I suppose the Reallocated_Sector_Ct attribute might be a good indicator of when the SSD starts to go bad. All this assumes that the reported SMART attributes are even right or mean what they seem to mean, e.g. Total_LBAs_Written: I highly doubt that you have written only 971 sectors to disk in one year.
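
To put a number on that doubt, here is a rough sketch of what a raw value of 971 would mean under two readings. The 512-byte sector size comes from the smartctl output above; the GiB reading is only a hypothetical alternative unit, not something the unlabelled output confirms.

```python
SECTOR_BYTES = 512            # logical sector size reported by smartctl
raw_total_lbas_written = 971  # raw value of attribute 241

# Reading 1: the raw value counts 512-byte sectors (what the name suggests).
bytes_if_sectors = raw_total_lbas_written * SECTOR_BYTES
kib_if_sectors = bytes_if_sectors / 1024

# Reading 2 (hypothetical): the raw value counts GiB instead of sectors.
bytes_if_gib = raw_total_lbas_written * 2**30

print(f"as sectors: {bytes_if_sectors} bytes ({kib_if_sectors:.1f} KiB)")
print(f"as GiB:     {bytes_if_gib} bytes")
```

Less than half a mebibyte in a year of use is not plausible for a root filesystem, so the attribute name assigned by smartctl is probably wrong for this drive.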


R00KIE
Tm90aGluZyB0byBzZWUgaGVyZSwgbW92ZSBhbG9uZy4K

#4 2015-12-09 18:47:53

ua4000
Member
Registered: 2015-10-14
Posts: 559

Re: ssd quickly ending life

e_Irbis wrote:

Device is:        Not in smartctl database [for details use: -P showall]

This makes the posted raw values uninterpretable. You should update the database as suggested and re-run the test.

best regards, ua4000

#5 2015-12-10 12:10:06

e_Irbis
Member
Registered: 2015-12-09
Posts: 3

Re: ssd quickly ending life

Thanks for the advice. I updated the SMART database and ran a self-test.
The smartctl output is attached.
Do I understand correctly that, according to SMART, 980 GB of data has been written to the disk?
The rated endurance of this disk is 21.9 TB. 0.98 / 21.9 = about 4.5%.
Why, then, is Remaining_Lifetime_Perc = 77%?
And is it normal for Host_Writes_GiB to be higher than Host_Reads_GiB?
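
The arithmetic in the question can be sketched as follows. This is only an illustration: it assumes the raw value of attribute 241 really is in GiB, as the attribute name suggests, and it takes the 21.9 TB endurance figure from the post rather than from a datasheet.

```python
GIB = 2**30   # bytes in one gibibyte
TB = 10**12   # bytes in one (decimal) terabyte

host_writes_gib = 980       # raw value of attribute 241, Host_Writes_GiB
rated_endurance_tb = 21.9   # endurance figure quoted in the post
reported_remaining = 77     # raw value of attribute 233

def endurance_used_percent(writes_gib, endurance_tb):
    """Share of the rated endurance consumed by host writes, in percent."""
    return 100.0 * (writes_gib * GIB) / (endurance_tb * TB)

used_by_host_writes = endurance_used_percent(host_writes_gib, rated_endurance_tb)
used_by_indicator = 100 - reported_remaining

print(f"host writes vs. rated endurance: {used_by_host_writes:.1f}% used")
print(f"lifetime indicator:              {used_by_indicator}% used")
```

Converting GiB to decimal TB raises the figure slightly compared with the 0.98 / 21.9 shortcut, but either way host writes account for only about a fifth of the wear the drive reports.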

smartctl 6.4 2015-06-04 r4109 [x86_64-linux-4.2.5-1-ARCH] (local build)
Copyright (C) 2002-15, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Indilinx Barefoot 3 based SSDs
Device Model:     OCZ-ARC100
Serial Number:    A22L0061519002219
LU WWN Device Id: 5 e83a97 100065354
Firmware Version: 1.01
User Capacity:    120 034 123 776 bytes [120 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS (minor revision not indicated)
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Thu Dec 10 14:37:25 2015 MSK
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      ( 249) Self-test routine in progress...
                                        90% of test remaining.
Total time to complete Offline 
data collection:                (    0) seconds.
Offline data collection
capabilities:                    (0x1d) SMART execute Offline immediate.
                                        No Auto Offline data collection support.
                                        Abort Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        No Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x00) Error logging NOT supported.
                                        General Purpose Logging supported.
Short self-test routine 
recommended polling time:        (   0) minutes.
Extended self-test routine
recommended polling time:        (   0) minutes.

SMART Attributes Data Structure revision number: 18
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Runtime_Bad_Block       0x0000   000   000   000    Old_age   Offline      -       0
  9 Power_On_Hours          0x0000   100   100   000    Old_age   Offline      -       1141
 12 Power_Cycle_Count       0x0000   100   100   000    Old_age   Offline      -       418
171 Avail_OP_Block_Count    0x0000   100   100   000    Old_age   Offline      -       39646288
174 Pwr_Cycle_Ct_Unplanned  0x0000   100   100   000    Old_age   Offline      -       22
195 Total_Prog_Failures     0x0000   100   100   000    Old_age   Offline      -       0
196 Total_Erase_Failures    0x0000   100   100   000    Old_age   Offline      -       0
197 Total_Unc_Read_Failures 0x0000   100   100   000    Old_age   Offline      -       0
208 Average_Erase_Count     0x0000   100   100   000    Old_age   Offline      -       695
210 SATA_CRC_Error_Count    0x0000   100   100   000    Old_age   Offline      -       0
224 In_Warranty             0x0000   100   100   000    Old_age   Offline      -       1
233 Remaining_Lifetime_Perc 0x0000   077   077   000    Old_age   Offline      -       77
241 Host_Writes_GiB         0x0000   100   100   000    Old_age   Offline      -       980
242 Host_Reads_GiB          0x0000   100   100   000    Old_age   Offline      -       469
249 Total_NAND_Prog_Ct_GiB  0x0000   100   100   000    Old_age   Offline      -       57484655

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Aborted by host               90%       117         -

Selective Self-tests/Logging not supported
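
One way to cross-check the lifetime figure is to compare Average_Erase_Count with a rated number of program/erase cycles. The sketch below parses a few rows copied from the table above and does that comparison. The 3000-cycle rating is purely an assumption for illustration; it is not taken from this thread or from a datasheet, and parse_attributes is a hypothetical helper, not part of smartmontools.

```python
ATTRIBUTE_ROWS = """\
208 Average_Erase_Count     0x0000   100   100   000    Old_age   Offline      -       695
233 Remaining_Lifetime_Perc 0x0000   077   077   000    Old_age   Offline      -       77
241 Host_Writes_GiB         0x0000   100   100   000    Old_age   Offline      -       980
"""

ASSUMED_RATED_PE_CYCLES = 3000  # assumption, not confirmed for this drive

def parse_attributes(text):
    """Return {attribute name: raw value} for rows of a smartctl -A table."""
    raw = {}
    for line in text.splitlines():
        fields = line.split()
        # A data row has 10 columns and starts with the numeric attribute ID.
        if len(fields) >= 10 and fields[0].isdigit():
            raw[fields[1]] = int(fields[9])
    return raw

attrs = parse_attributes(ATTRIBUTE_ROWS)
wear_percent = 100.0 * attrs["Average_Erase_Count"] / ASSUMED_RATED_PE_CYCLES

print(f"estimated wear from erase count: {wear_percent:.1f}%")
print(f"estimated remaining:             {100 - wear_percent:.1f}%")
print(f"reported remaining:              {attrs['Remaining_Lifetime_Perc']}%")
```

With this assumed rating the estimate lands close to the reported 77%, which would be consistent with the indicator tracking erase cycles rather than host writes. That is a guess about the firmware, not something the output proves.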

#6 2015-12-10 20:21:26

Darkimmortal
Member
Registered: 2012-01-04
Posts: 30

Re: ssd quickly ending life

The minimum individual write is going to be around 256 KB to 4 MB, so if you were doing a large number of tiny writes, such a large discrepancy between 'host writes' (which implies the amount of input data rather than the amount written to NAND) and lifetime can be expected.
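
A rough sketch of that effect, using figures from the smartctl output earlier in the thread. The 4 KiB host write size is an assumption chosen for illustration, and the second estimate assumes that every block on the drive has been erased Average_Erase_Count times and that raw NAND capacity equals user capacity (over-provisioning ignored).

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB

def worst_case_amplification(host_write_bytes, min_nand_write_bytes):
    """NAND bytes written per host byte if each host write fills one unit."""
    return min_nand_write_bytes / host_write_bytes

ASSUMED_HOST_WRITE = 4 * KIB  # assumption: typical small filesystem write
low = worst_case_amplification(ASSUMED_HOST_WRITE, 256 * KIB)
high = worst_case_amplification(ASSUMED_HOST_WRITE, 4 * MIB)

# Cross-check against the posted SMART values.
user_capacity_bytes = 120_034_123_776  # "User Capacity" line
average_erase_count = 695              # attribute 208
host_writes_bytes = 980 * GIB          # attribute 241

nand_bytes_estimate = average_erase_count * user_capacity_bytes
implied_amplification = nand_bytes_estimate / host_writes_bytes

print(f"worst case for 4 KiB writes: {low:.0f}x to {high:.0f}x")
print(f"implied by SMART values:     {implied_amplification:.1f}x")
```

Under these assumptions the amplification implied by the erase count falls inside the worst-case range for small writes, so many tiny writes are a plausible explanation for the wear.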

Last edited by Darkimmortal (2015-12-10 20:21:41)
