High SMART Data Units Read/Written BTRFS

spacekobold · 2024-06-26 00:51:22

I have my main system's 2TB NVMe drive formatted as 1GiB EFI, 64GiB swap, and the rest BTRFS. Today I received a notification that my drive was failing (again) according to SMART. This is the second time this has happened with my current install.

When I look at the SMART report, the data units read/written seem suspiciously high: it reports over 36 TB read and 19 TB written. I'm pretty sure I haven't used my drive anywhere near that much, as I'm not churning through large files all the time. The one thing I suspect might be causing this is my BTRFS snapshots, which I currently take hourly. From what I understand these should mostly be metadata if nothing substantial has changed, so I don't understand how the drive could have reached over 18x/9x its total capacity in reads/writes at this point. Does anyone have experience with this? Is this normal for BTRFS?

I've been running this install on this drive for just over 1 year, after having to recover my system from a failing drive the last time this happened. Both are WD_BLACK SN770 2TB drives; I RMA'ed the 1st one to get this new one, though this one has lasted much longer than the 2-3 months the 1st one did. SMART results are listed below:

$ sudo smartctl -a /dev/nvme1n1
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.9.5-arch1-1] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number:                       WD_BLACK SN770 2TB
Serial Number:                      22500Z802630
Firmware Version:                   731100WD
PCI Vendor/Subsystem ID:            0x15b7
IEEE OUI Identifier:                0x001b44
Total NVM Capacity:                 2,000,398,934,016 [2.00 TB]
Unallocated NVM Capacity:           0
Controller ID:                      0
NVMe Version:                       1.4
Number of Namespaces:               1
Namespace 1 Size/Capacity:          2,000,398,934,016 [2.00 TB]
Namespace 1 Formatted LBA Size:     512
Namespace 1 IEEE EUI-64:            001b44 8b4eaef0ff
Local Time is:                      Tue Jun 25 20:18:24 2024 EDT
Firmware Updates (0x14):            2 Slots, no Reset required
Optional Admin Commands (0x0017):   Security Format Frmw_DL Self_Test
Optional NVM Commands (0x00df):     Comp Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat Timestmp Verify
Log Page Attributes (0x7e):         Cmd_Eff_Lg Ext_Get_Lg Telmtry_Lg Pers_Ev_Lg Log0_FISE_MI Telmtry_Ar_4
Maximum Data Transfer Size:         256 Pages
Warning  Comp. Temp. Threshold:     84 Celsius
Critical Comp. Temp. Threshold:     88 Celsius
Namespace 1 Features (0x02):        NA_Fields

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     5.40W    5.40W       -    0  0  0  0        0       0
 1 +     3.50W    3.00W       -    0  0  0  0        0       0
 2 +     2.40W    2.00W       -    0  0  0  0        0       0
 3 -   0.0150W       -        -    3  3  3  3     1500    2500
 4 -   0.0050W       -        -    4  4  4  4    10000    6000
 5 -   0.0033W       -        -    5  5  5  5   176000   25000

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         2
 1 -    4096       0         1

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: FAILED!
- NVM subsystem reliability has been degraded

SMART/Health Information (NVMe Log 0x02)
Critical Warning:                   0x04
Temperature:                        48 Celsius
Available Spare:                    100%
Available Spare Threshold:          10%
Percentage Used:                    0%
Data Units Read:                    71,653,833 [36.6 TB]
Data Units Written:                 37,541,797 [19.2 TB]
Host Read Commands:                 2,855,435,298
Host Write Commands:                756,614,873
Controller Busy Time:               4,140
Power Cycles:                       866
Power On Hours:                     3,085
Unsafe Shutdowns:                   56
Media and Data Integrity Errors:    337
Error Information Log Entries:      337
Warning  Comp. Temperature Time:    0
Critical Comp. Temperature Time:    0
Temperature Sensor 1:               69 Celsius
Temperature Sensor 2:               46 Celsius

Error Information (NVMe Log 0x01, 16 of 256 entries)
No Errors Logged

Read Self-test Log failed: Invalid Field in Command (0x4002)
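
For reference, the bracketed totals and the warning byte in the output above can be checked by hand. The NVMe specification defines one data unit as 1,000 blocks of 512 bytes (512,000 bytes), independent of the formatted LBA size, and the Critical Warning field is a bitmask. A minimal sketch using only the figures from this output; the helper names are made up for illustration:

```python
# One NVMe "data unit" is 1000 * 512 bytes (SMART/Health log definition).
BYTES_PER_DATA_UNIT = 1000 * 512

# Meaning of the low bits of the Critical Warning byte.
CRITICAL_WARNING_BITS = {
    0: "available spare below threshold",
    1: "temperature outside threshold",
    2: "NVM subsystem reliability degraded",
    3: "media placed in read-only mode",
    4: "volatile memory backup device failed",
}

def data_units_to_tb(units):
    """Convert SMART data units to decimal terabytes (10**12 bytes)."""
    return units * BYTES_PER_DATA_UNIT / 1e12

def decode_critical_warning(value):
    """Return the descriptions of all bits set in the Critical Warning byte."""
    return [text for bit, text in CRITICAL_WARNING_BITS.items() if value & (1 << bit)]

capacity_tb = 2_000_398_934_016 / 1e12    # Total NVM Capacity
read_tb = data_units_to_tb(71_653_833)    # Data Units Read
written_tb = data_units_to_tb(37_541_797) # Data Units Written

print(f"read:    {read_tb:.2f} TB ({read_tb / capacity_tb:.1f}x capacity)")
print(f"written: {written_tb:.2f} TB ({written_tb / capacity_tb:.1f}x capacity)")
print(decode_critical_warning(0x04))
```

This works out to roughly 36.7 TB read and 19.2 TB written, about 18x and 9.6x the drive's capacity. The warning value 0x04 decodes to the reliability-degraded bit only, which matches the FAILED message, while Percentage Used is still 0%, so the failure flag does not appear to come from wear as the drive itself estimates it.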

spiffeeroo · 2024-06-26 03:09:46

Read and write amplification is higher with Btrfs than with something like ext4. Do you have autodefrag or compression enabled in the mount options? Compression can increase read/write amplification. Compressed data is handled in chunks of up to 128 KiB, so for small 1 KiB / 4 KiB files or accesses the amplification can be more severe.
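
One way to see which of these options are active is to look at the options column of /proc/mounts. A rough sketch, assuming a Linux system; the function names and the sample lines (including the partition names) are hypothetical and only there to make the example self-contained:

```python
def btrfs_mount_options(mounts_text):
    """Map each Btrfs mount point to its list of mount options.

    mounts_text has the /proc/mounts layout: one mount per line, with
    whitespace-separated fields (device, mount point, fs type, options, ...).
    """
    result = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[2] == "btrfs":
            result[fields[1]] = fields[3].split(",")
    return result

def amplifying_options(options):
    """Pick out autodefrag and any compress / compress-force setting."""
    return [o for o in options if o == "autodefrag" or o.startswith("compress")]

# Hypothetical sample; on a real system use open("/proc/mounts").read() instead.
sample = (
    "/dev/nvme1n1p3 / btrfs rw,relatime,ssd,space_cache=v2,subvol=/@ 0 0\n"
    "/dev/nvme1n1p1 /boot vfat rw,relatime 0 0\n"
)

for mount_point, opts in btrfs_mount_options(sample).items():
    print(mount_point, amplifying_options(opts) or "no autodefrag/compression")
```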

A couple of Btrfs devs posted in this issue, discussing the read/write amplification and how Btrfs is deployed at Meta/FB.
https://pagure.io/fedora-btrfs/project/issue/36

spacekobold · 2024-06-26 05:52:25

No, only relatime and space_cache=v2, which I believe were just the defaults. That issue didn't seem very hopeful about the read/write amplification on BTRFS... at least it doesn't seem to be something I'm doing wrong?

Anyways, 19 TB written in a year may be a lot, but if the drive has a rated life of 1000 TB that should be fine: in theory it would take about 50 years to wear out. I'm pretty sure these Western Digital drives are defective somehow, considering this has happened twice with basically new drives. Now I'm just trying to figure out how to clone my system over to a new drive, which is proving difficult because the drive fails intermittently when using ddrescue.
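
The back-of-the-envelope estimate can be written out as follows. The 1000 TB endurance is the hypothetical figure from this post, not a value taken from a datasheet, and the one-year service period is approximate:

```python
BYTES_PER_DATA_UNIT = 1000 * 512          # NVMe data unit size in bytes

written_tb = 37_541_797 * BYTES_PER_DATA_UNIT / 1e12  # Data Units Written
power_on_hours = 3_085                    # Power On Hours from SMART
assumed_endurance_tb = 1000               # hypothetical rating, see above
years_in_service = 1.0                    # "just over 1 year", rounded

# Years until the assumed rating is reached at the current write rate.
years_to_rating = assumed_endurance_tb / (written_tb / years_in_service)

# Average host writes per hour the drive was actually powered on.
gb_per_power_on_hour = written_tb * 1000 / power_on_hours

print(f"about {years_to_rating:.0f} years to reach the assumed rating")
print(f"about {gb_per_power_on_hour:.1f} GB written per power-on hour")
```

At this rate the assumed rating would be reached after roughly 52 years, with an average of about 6.2 GB written per power-on hour.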

agapito · 2024-06-26 14:49:24

How much RAM do you have? If you are low on RAM, the system will be constantly reading from and writing to the swap partition on disk.
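
Swap usage can be checked from /proc/meminfo (SwapTotal, SwapFree) and /proc/vmstat (pswpin and pswpout, the pages swapped in and out since boot). A small sketch, assuming a Linux system; the sample values below are invented, so substitute the contents of the real files:

```python
def parse_counters(text):
    """Parse 'name value' or 'Name: value kB' lines into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        parts = line.replace(":", " ").split()
        if len(parts) >= 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters

def swap_used_kib(meminfo):
    """Swap currently in use, in KiB, from parsed /proc/meminfo fields."""
    return meminfo["SwapTotal"] - meminfo["SwapFree"]

# Invented sample data in the layout of /proc/meminfo and /proc/vmstat.
meminfo_sample = "MemTotal: 32768000 kB\nSwapTotal: 67108860 kB\nSwapFree: 67108860 kB\n"
vmstat_sample = "pswpin 0\npswpout 0\n"

meminfo = parse_counters(meminfo_sample)
vmstat = parse_counters(vmstat_sample)

print("swap in use (KiB):", swap_used_kib(meminfo))
print("pages swapped in/out since boot:", vmstat["pswpin"], vmstat["pswpout"])
```

If both page counters stay near zero over days of normal use, swap traffic is unlikely to explain the totals.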

What is the purpose of taking a snapshot every hour? I do them manually every two weeks and I only have 2 that rotate. They are done after a couple of weeks without updating the system and only when I am sure that everything is working perfectly.

spacekobold · 2024-06-27 03:53:11

I have 32 GB of memory so I'm not hitting my swap space that hard. I've stopped hibernating my system too because it was extremely slow, so I don't think my swap space really ever gets used.

I'm just using Snapper for snapshots and it defaults to 1hr snapshots. I have mostly used the pre/post update snapshots I've configured to rewind badness, but the 1hr period is mainly for accidental file deletions, just in case. I probably need to rethink my priorities on data backup considering I've just lost all of my data regardless.

mesaprotector · 2024-06-29 08:34:14

I doubt your snapshots have anything to do with it (I definitely recommend backing up properly to an external drive, though). If you've had two drives fail early, I'd look at the temperature and other environmental factors first. Are you leaving your PC on all the time? If you are, is the temperature often staying above 50 °C? You could even look into humidity in the room and nearby magnetic fields.
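
As a starting point for the temperature question, the readings in the smartctl output posted earlier can be compared against the drive's own thresholds. A minimal sketch that parses the plain-text lines quoted from that output; the function name is illustrative, and treating the hottest sensor like the composite temperature is a simplification:

```python
import re

def celsius_fields(smartctl_text):
    """Collect every 'Label:   N Celsius' line into a {label: N} dict."""
    fields = {}
    pattern = r"^(.+?):\s+(\d+) Celsius[ \t]*$"
    for m in re.finditer(pattern, smartctl_text, re.MULTILINE):
        label = " ".join(m.group(1).split())  # collapse repeated spaces
        fields[label] = int(m.group(2))
    return fields

# Lines copied from the smartctl output in the first post.
report = """\
Warning  Comp. Temp. Threshold:     84 Celsius
Critical Comp. Temp. Threshold:     88 Celsius
Temperature:                        48 Celsius
Temperature Sensor 1:               69 Celsius
Temperature Sensor 2:               46 Celsius
"""

temps = celsius_fields(report)
hottest = max(v for k, v in temps.items() if k.startswith("Temperature"))
headroom = temps["Warning Comp. Temp. Threshold"] - hottest

print(f"hottest reading: {hottest} C, {headroom} C below the warning threshold")
```

For that snapshot the hottest reading is Sensor 1 at 69 C, 15 C below the warning threshold, and both Comp. Temperature Time counters are 0. A single reading says little about sustained temperatures, so logging this over time would be more informative.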

Otherwise it's certainly possible you just got two bad drives. SSDs don't often fail early (HDDs are more prone to this, because mechanical defects can cause problems immediately), which makes me consider the previously mentioned causes, but since you got two of the same drive, who knows.

Arch Linux
