I've been using my trusty Dell Precision 5510 for ages (over 5 years now). The machine's still running well, but one thing that keeps me a bit worried is the NVMe drive. I understand more recent SSDs are built for extended use, but I'm not sure whether a 5-year-old one counts as "modern" these days. The laptop is my workhorse (it was a maxed-out configuration in 2015) and runs pretty much all day, every day. I ran smartctl, and the results are given below. I'm new to smartctl and am trying to understand the results. I see the drive "passed", and that the vendor reckons I've only used 4% of the drive's lifespan. Is my interpretation correct? Does anything look concerning? I have force-killed the power from time to time due to lock-ups when resuming from sleep, etc.
Reading up on smartctl, I see that the expected run time for a -t long is 20-30 minutes. However, it's returning instantly for me. Does that mean the information is not accurate?
>smartctl -t long -a /dev/nvme0n1
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.9.10-arch1-1] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Number: PM951 NVMe SAMSUNG 1024GB
Serial Number: S2FZNXAGA09940
Firmware Version: BXV77D0Q
PCI Vendor/Subsystem ID: 0x144d
IEEE OUI Identifier: 0x002538
Controller ID: 1
Number of Namespaces: 1
Namespace 1 Size/Capacity: 1,024,209,543,168 [1.02 TB]
Namespace 1 Utilization: 881,679,765,504 [881 GB]
Namespace 1 Formatted LBA Size: 512
Namespace 1 IEEE EUI-64: 002538 455a1b26d4
Local Time is: Sat Dec 5 13:01:43 2020 GMT
Firmware Updates (0x06): 3 Slots
Optional Admin Commands (0x0017): Security Format Frmw_DL Self_Test
Optional NVM Commands (0x001f): Comp Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat
Maximum Data Transfer Size: 32 Pages
Supported Power States
St Op Max Active Idle RL RT WL WT Ent_Lat Ex_Lat
0 + 6.00W - - 0 0 0 0 5 5
1 + 4.20W - - 1 1 1 1 30 30
2 + 3.10W - - 2 2 2 2 100 100
3 - 0.0700W - - 3 3 3 3 500 5000
4 - 0.0050W - - 4 4 4 4 2000 22000
Supported LBA Sizes (NSID 0x1)
Id Fmt Data Metadt Rel_Perf
0 + 512 0 0
=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
SMART/Health Information (NVMe Log 0x02)
Critical Warning: 0x00
Temperature: 31 Celsius
Available Spare: 100%
Available Spare Threshold: 50%
Percentage Used: 4%
Data Units Read: 33,049,228 [16.9 TB]
Data Units Written: 59,137,420 [30.2 TB]
Host Read Commands: 593,366,012
Host Write Commands: 1,210,173,947
Controller Busy Time: 12,192
Power Cycles: 6,451
Power On Hours: 19,872
Unsafe Shutdowns: 729
Media and Data Integrity Errors: 0
Error Information Log Entries: 8,351
Error Information (NVMe Log 0x01, max 64 entries)
Num ErrCount SQId CmdId Status PELoc LBA NSID VS
0 8351 0 0x000a 0x4004 0x000 0 1 -
1 8350 0 0x0007 0x4004 0x000 0 - -
2 8349 0 0x0002 0x4004 0x000 0 0 -
3 8348 0 0x0006 0x4004 0x000 0 0 -
4 8347 0 0x000c 0x4004 0x000 0 0 -
5 8346 0 0x000e 0x4004 0x000 0 0 -
6 8345 0 0x0016 0x4004 0x000 0 0 -
7 8344 0 0x0017 0x4004 0x000 0 0 -
8 8343 0 0x001a 0x4004 0x000 0 0 -
9 8342 0 0x001b 0x4004 0x000 0 0 -
10 8341 0 0x000c 0x4004 0x000 0 0 -
11 8340 0 0x0014 0x4004 0x000 0 0 -
12 8339 0 0x0007 0x4004 0x000 0 0 -
13 8338 0 0x000e 0x4004 0x000 0 0 -
14 8337 0 0x0016 0x4004 0x000 0 0 -
15 8336 0 0x0006 0x4004 0x000 0 0 -
... (48 entries not shown)
~20K power-on hours is getting up there. I have a spinner from 2011 that has 29K power-on hours (it's not used for anything critical anymore).
You might want to start thinking about a replacement strategy; you've got 30 TB of data written to that drive.
You might also want to look into nvme-cli.
Eenie meenie, chili beanie, the spirits are about to speak -- Bullwinkle J. Moose
It's a big club...and you ain't in it -- George Carlin
Registered Linux user #149839
perl -e 'print$i=pack(c5,(41*2),sqrt(7056),(unpack(c,H)-2),oct(115),10); '
Thanks. So, what things should I be looking at with nvme-cli?
I see that the Samsung drive is rated at an MTBF of 1.5 million hours and 300 TB written. Of course, I'm not expecting it to last that long. But where can I find information on what a safe range is before replacing it? I don't want to replace it unnecessarily if it's at 4-10% of its actual life, especially since I might get a new machine next year or the year after, depending on how Intel responds to the Apple M1. The drive crashing wouldn't be much more than a little inconvenience, as I do have things regularly backed up offline. But I'm curious to know what warning signs I can look out for.
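For a rough sense of where the drive sits against that 300 TB figure, the numbers from the smartctl output can be plugged into a few lines of Python. This is only a back-of-the-envelope sketch: it takes the 300 TB rating quoted above at face value, uses the NVMe convention that one "data unit" is 1000 blocks of 512 bytes, and assumes the future write rate matches the historic average. The function names are made up for illustration.

```python
# Figures copied from the smartctl output earlier in the thread.
DATA_UNITS_WRITTEN = 59_137_420
POWER_ON_HOURS = 19_872
RATED_TBW = 300  # TB written, as quoted in the post (not verified here)

# NVMe reports reads/writes in "data units" of 1000 x 512-byte blocks.
BYTES_PER_DATA_UNIT = 1000 * 512


def written_tb(data_units: int) -> float:
    """Convert NVMe data units to decimal terabytes."""
    return data_units * BYTES_PER_DATA_UNIT / 1e12


def fraction_of_rating(data_units: int, rated_tbw: float) -> float:
    """Share of the rated write endurance consumed so far."""
    return written_tb(data_units) / rated_tbw


def hours_until_rating(data_units: int, power_on_hours: int, rated_tbw: float) -> float:
    """Power-on hours left before hitting the rating.

    Assumption: writes continue at the same average rate as so far.
    """
    tb_per_hour = written_tb(data_units) / power_on_hours
    return (rated_tbw - written_tb(data_units)) / tb_per_hour


if __name__ == "__main__":
    print(f"written: {written_tb(DATA_UNITS_WRITTEN):.1f} TB")
    print(f"of rating: {fraction_of_rating(DATA_UNITS_WRITTEN, RATED_TBW):.1%}")
    print(f"hours left: {hours_until_rating(DATA_UNITS_WRITTEN, POWER_ON_HOURS, RATED_TBW):,.0f}")
```

Note that this measures something different from the drive's own "Percentage Used" value, which is the vendor's estimate, so the two figures need not agree.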
20K hours isn't shit...
# smartctl -a /dev/sdb | grep Power_On_H
9 Power_On_Hours 0x0032 001 001 000 Old_age Always - 86451
Not even an enterprise level drive:
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.80-2-lts] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family: Western Digital Caviar Green (AF)
Device Model: WDC WD20EARS-00MVWB0
...
I would think SSDs would have way more life in them regarding power-on hours... my understanding is that I/O is how they wear.
Last edited by graysky (2020-12-05 16:39:58)
To answer the initial question: a -t long will always return instantly (even on spinners where a long test takes 6 hours). The command only starts the test, which then runs in the background. You can keep using everything normally and check the result with a smartctl -a once the expected time has elapsed.
That said, I also don't see any reason for concern in these outputs.
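As for warning signs to keep an eye on, a small script can pull the health fields out of saved smartctl -a text and flag the obvious ones. The sketch below is illustrative only: the field names are taken from the output posted above, the sample text is a trimmed copy of it, and the checks (non-zero critical warning, spare below its threshold, any media errors, percentage used at or past 100) are my assumptions about what is worth flagging, not an official list.

```python
SAMPLE = """\
Critical Warning: 0x00
Temperature: 31 Celsius
Available Spare: 100%
Available Spare Threshold: 50%
Percentage Used: 4%
Unsafe Shutdowns: 729
Media and Data Integrity Errors: 0
"""


def parse_health(text: str) -> dict:
    """Collect 'Name: value' lines whose value is a plain number."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        # Drop the percent sign and thousands separators before parsing.
        value = value.strip().rstrip("%").replace(",", "")
        try:
            fields[key.strip()] = int(value, 0)  # base 0 also accepts 0x..
        except ValueError:
            pass  # non-numeric values such as "31 Celsius" are skipped
    return fields


def find_warnings(fields: dict) -> list:
    """Return human-readable warnings; an empty list means nothing flagged."""
    problems = []
    if fields.get("Critical Warning", 0) != 0:
        problems.append("critical warning bits are set")
    spare = fields.get("Available Spare")
    threshold = fields.get("Available Spare Threshold")
    if spare is not None and threshold is not None and spare < threshold:
        problems.append("available spare is below its threshold")
    if fields.get("Media and Data Integrity Errors", 0) > 0:
        problems.append("media/data integrity errors reported")
    if fields.get("Percentage Used", 0) >= 100:
        problems.append("vendor endurance estimate is used up")
    return problems


if __name__ == "__main__":
    print(find_warnings(parse_health(SAMPLE)) or "nothing flagged")
```

Run against the values posted in this thread, none of these checks trip, which matches the reading above.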