#1 2017-10-09 00:27:36

cfr
Member
From: Cymru
Registered: 2011-11-27
Posts: 7,132

How to prevent 'unsafe shutdowns' of NVMe drive?

I'm getting a very high rate of 'unsafe shutdowns' on a new NVMe disk. https://bugs.freebsd.org/bugzilla/show_ … ?id=211852 suggests that the system is not waiting long enough for the controller to finish, although I'm not sure if I'm seeing the same problem. I don't have the same disk, but it is also an Intel. Two solutions are proposed there, as I understand it. One is to increase the time the system waits before shutting down without the OK from the controller. The other involves, I think, signalling the shutdown to the controller through a different interface or pathway. Either would, I think, need a patched kernel. (So I really hope this isn't the problem!) I guess https://bz-attachments.freebsd.org/atta … ?id=185303 is a kernel patch, for example?
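On the first of those two options: on Linux the wait appears to be tunable without patching, because the NVMe driver has a `shutdown_timeout` module parameter (seconds to wait for the controller to shut down). Below is a minimal Python sketch for reading the current value. The sysfs path is an assumption (the parameter sits under `nvme_core` on recent kernels and may be elsewhere or absent on others), and `read_shutdown_timeout` is just an illustrative helper name.

```python
from pathlib import Path
from typing import Optional

# Assumed location of the parameter; adjust if your kernel exposes it elsewhere.
DEFAULT_PARAM = Path("/sys/module/nvme_core/parameters/shutdown_timeout")


def read_shutdown_timeout(param: Path = DEFAULT_PARAM) -> Optional[int]:
    """Return the shutdown timeout in seconds, or None if it cannot be read."""
    try:
        return int(param.read_text().strip())
    except (OSError, ValueError):
        # Missing file, unreadable file, or unexpected contents.
        return None


if __name__ == "__main__":
    timeout = read_shutdown_timeout()
    if timeout is None:
        print("shutdown_timeout is not exposed at the assumed path")
    else:
        print(f"NVMe shutdown timeout: {timeout} s")
```

If the parameter is present, it can presumably be raised with a kernel command-line option such as `nvme_core.shutdown_timeout=10`. Whether that changes the unsafe-shutdown count on this particular drive is untested.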

I'm especially worried that running Arch may be damaging the hardware as it is clearly something which is bothering the disk itself. (I don't mean Arch specifically - I know this is obviously upstream code. I just mean 'the software I'm running' which is, in my case, Arch.) Is there anything I can do to protect against this?

smartctl 6.5 2016-05-07 r4318 [x86_64-linux-4.13.4-1-ARCH] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number:                       INTEL SSDPEKKF512G7L
Serial Number:                      XXXX
Firmware Version:                   121P
PCI Vendor/Subsystem ID:            0x8086
IEEE OUI Identifier:                0x5cd2e4
Controller ID:                      1
Number of Namespaces:               1
Namespace 1 Size/Capacity:          512,110,190,592 [512 GB]
Namespace 1 Formatted LBA Size:     512
Local Time is:                      Sun Oct  8 22:48:20 2017 BST
Firmware Updates (0x12):            1 Slot, no Reset required
Optional Admin Commands (0x0007):   Security Format Frmw_DL
Optional NVM Commands (0x001e):     Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat
Maximum Data Transfer Size:         32 Pages
Warning  Comp. Temp. Threshold:     70 Celsius
Critical Comp. Temp. Threshold:     80 Celsius

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     9.00W       -        -    0  0  0  0        5       5
 1 +     4.60W       -        -    1  1  1  1       30      30
 2 +     3.80W       -        -    2  2  2  2       30      30
 3 -   0.0700W       -        -    3  3  3  3    10000     300
 4 -   0.0050W       -        -    4  4  4  4     2000   10000

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         0

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02, NSID 0x1)
Critical Warning:                   0x00
Temperature:                        22 Celsius
Available Spare:                    97%
Available Spare Threshold:          10%
Percentage Used:                    0%
Data Units Read:                    2,159,656 [1.10 TB]
Data Units Written:                 1,360,228 [696 GB]
Host Read Commands:                 131,137,230
Host Write Commands:                9,065,871
Controller Busy Time:               255
Power Cycles:                       85
Power On Hours:                     124
Unsafe Shutdowns:                   55
Media and Data Integrity Errors:    0
Error Information Log Entries:      0
Warning  Comp. Temperature Time:    0
Critical Comp. Temperature Time:    0

Error Information (NVMe Log 0x01, max 64 entries)
No Errors Logged

I'm also alarmed by the count of 55 'unsafe shutdowns' when I've only just got the machine. Can I avoid such shutdowns and, if so, how?
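To put the count in proportion, the two relevant counters can be pulled out of the `smartctl` text and compared. This is only a sketch: `parse_counter` and `unsafe_shutdown_ratio` are hypothetical helper names, and the sample text is copied from the report above. Re-running it after a known-clean shutdown would show whether ordinary shutdowns are the ones being counted as unsafe.

```python
import re


def parse_counter(report: str, label: str) -> int:
    """Return the integer from a 'Label:   1,234' line of smartctl output."""
    match = re.search(rf"^{re.escape(label)}:\s+([\d,]+)", report, re.MULTILINE)
    if match is None:
        raise ValueError(f"{label!r} not found in report")
    return int(match.group(1).replace(",", ""))


def unsafe_shutdown_ratio(report: str) -> float:
    """Fraction of power cycles that the drive recorded as unsafe shutdowns."""
    cycles = parse_counter(report, "Power Cycles")
    unsafe = parse_counter(report, "Unsafe Shutdowns")
    return unsafe / cycles if cycles else 0.0


# Lines copied from the SMART/Health section above.
SAMPLE = """\
Power Cycles:                       85
Power On Hours:                     124
Unsafe Shutdowns:                   55
"""

if __name__ == "__main__":
    ratio = unsafe_shutdown_ratio(SAMPLE)
    print(f"{ratio:.0%} of power cycles were recorded as unsafe")
```

With the figures above, 55 of 85 power cycles (roughly 65%) were recorded as unsafe.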

Last edited by cfr (2017-10-09 01:28:57)


CLI Paste | How To Ask Questions

Arch Linux | x86_64 | GPT | EFI boot | refind | stub loader | systemd | LVM2 on LUKS
Lenovo x270 | Intel(R) Core(TM) i5-7200U CPU @ 2.50GHz | Intel Wireless 8265/8275 | US keyboard w/ Euro | 512G NVMe INTEL SSDPEKKF512G7L


#2 2017-10-10 01:08:26

cfr
Member
From: Cymru
Registered: 2011-11-27
Posts: 7,132

Re: How to prevent 'unsafe shutdowns' of NVMe drive?

I have checked that the drive has the latest available firmware from Intel. I also looked to see if there is a BIOS update for the laptop. There is. However, the notes say it provides a security fix, so I suspect it is to do with TPM/secure boot or similar. Since I'm not using those features, I'm loath to risk a BIOS flash until there's an update with something worth having.

Summarising the results of my googling attempts, the internet at large suggests this problem (1) is common (2) is uncommon (3) corrupts data, but does not do hardware damage (4) shortens the life of the disk (5) occurs for both (some) Linux and OS X systems (6) may mean different things for different vendors' disks (7) (always) means the OS does not signal imminent power loss to the drive (8) is a reliable or unreliable guide to whatever it is a guide to (9) possibly or (10) possibly not.

I admit that I'm lost without the Arch wiki.



#3 2017-10-10 11:42:50

Slithery
Administrator
From: Norfolk, UK
Registered: 2013-12-01
Posts: 5,776

Re: How to prevent 'unsafe shutdowns' of NVMe drive?

Personally I'd try flashing the laptop firmware, updates often contain fixes that aren't listed in the changelog.


No, it didn't "fix" anything. It just shifted the brokeness one space to the right. - jasonwryan
Closing -- for deletion; Banning -- for muppetry. - jasonwryan

aur - dotfiles


#4 2017-10-11 01:01:17

cfr
Member
From: Cymru
Registered: 2011-11-27
Posts: 7,132

Re: How to prevent 'unsafe shutdowns' of NVMe drive?

slithery wrote:

Personally I'd try flashing the laptop firmware, updates often contain fixes that aren't listed in the changelog.

Hmm..... I may try that, then, though googling suggests this is more likely an OS thing than a BIOS thing. I can't do it straight off as I'll have to find a CD/DVD drive somewhere. And probably get somebody to help me burn a CD in Windows.

Does anybody know if these unsafe shutdowns are harmful to the hardware? I realise they risk data corruption, possibly. (I suspect they don't, actually, but the disk doesn't know this.)

