
#1 2022-10-15 05:11:43

Brocellous
Member
Registered: 2017-11-27
Posts: 155

[Solved] Discard has no effect on btrfs on luks

Today I noticed that my SSD's NVMe capacity reporting shows 100% utilization of the namespace. I'm not sure whether it has always been like this or only filled up recently, but the drive is less than 6 months old and hasn't really seen heavy usage:

$ sudo smartctl -ir nvmeioctl /dev/nvme1n1
[...]
=== START OF INFORMATION SECTION ===
 [NVMe call: opcode=0x06, size=0x1000, nsid=0x00000001, cdw10=0x00000000]
 [Duration: 0.000170s]
 [NVMe call succeeded: result=0x00000000]
Model Number:                       Sabrent Rocket 4.0 2TB
Serial Number:                      469F07181E7000004979
Firmware Version:                   RKT401.3
PCI Vendor ID:                      0x1987
PCI Vendor Subsystem ID:            0x1987
IEEE OUI Identifier:                0x6479a7
Total NVM Capacity:                 2,000,398,934,016 [2.00 TB]
Unallocated NVM Capacity:           0
Controller ID:                      1
NVMe Version:                       1.3
Number of Namespaces:               1
Namespace 1 Size:                   2,000,398,934,016 [2.00 TB]
Namespace 1 Capacity:               2,000,398,934,016 [2.00 TB]
Namespace 1 Utilization:            2,000,398,934,016 [2.00 TB]
Namespace 1 Formatted LBA Size:     512
Namespace 1 IEEE EUI-64:            6479a7 5390c0100d
Local Time is:                      Fri Oct 14 21:42:13 2022 MST

I do have fstrim timer enabled, and it has been running:

$ journalctl -u fstrim --since=-7d
Oct 10 01:24:43 rdtw systemd[1]: Starting Discard unused blocks on filesystems from /etc/fstab...
Oct 10 01:25:23 rdtw fstrim[71405]: /: 1.6 TiB (1721727229952 bytes) trimmed on /dev/mapper/root
Oct 10 01:25:23 rdtw fstrim[71405]: /boot: 456.2 MiB (478388224 bytes) trimmed on /dev/nvme1n1p1
Oct 10 01:25:23 rdtw systemd[1]: fstrim.service: Deactivated successfully.
Oct 10 01:25:23 rdtw systemd[1]: Finished Discard unused blocks on filesystems from /etc/fstab.
Oct 10 01:25:23 rdtw systemd[1]: fstrim.service: Consumed 2.747s CPU time.

There is plenty of free space to trim and I ran it once more for good measure:

$ df -h /     
Filesystem        Size  Used Avail Use% Mounted on
/dev/mapper/root  1.9T  265G  1.6T  15% /
$ sudo fstrim -v /
/: 1.6 TiB (1716177018880 bytes) trimmed

It reports 1.6T trimmed, but this has no effect on the utilization reported by my SSD. I understand that discard is advisory and that the drive may ignore it, but it doesn't seem right to me that the utilization would always be at maximum.
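For comparing drives, the utilization can be expressed as a percentage of the namespace capacity. The following is a minimal sketch, assuming the `smartctl -i` field labels shown above ("Namespace 1 Capacity" and "Namespace 1 Utilization"); the function names `smart_bytes` and `ns_util_percent` are made up for illustration and are not part of smartmontools.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of smartmontools): parse `smartctl -i` text.

# Print the byte count of one field, e.g. "513,911,869,440" -> 513911869440.
# $1: field label; smartctl text is read from stdin.
smart_bytes() {
    sed -n "s/^$1:[[:space:]]*\([0-9,]*\).*/\1/p" | tr -d ','
}

# Print namespace utilization as an integer percentage of capacity.
# smartctl text is read from stdin; returns 1 if either field is missing.
ns_util_percent() {
    local report used cap
    report=$(cat)
    used=$(printf '%s\n' "$report" | smart_bytes "Namespace 1 Utilization")
    cap=$(printf '%s\n' "$report" | smart_bytes "Namespace 1 Capacity")
    if [ -z "$used" ] || [ -z "$cap" ] || [ "$cap" -eq 0 ]; then
        return 1
    fi
    echo $(( used * 100 / cap ))
}
```

Usage would be along the lines of `sudo smartctl -i /dev/nvme1n1 | ns_util_percent`. Fed the values from the output above, where utilization equals capacity, it prints 100.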

My root filesystem is btrfs on luks. My LUKS volume uses the "discard" cryptsetup option and btrfs is mounted without discard, though I have tried remounting it with discard=async.

$ sudo cryptsetup status root
/dev/mapper/root is active and is in use.
  type:    LUKS2
  cipher:  aes-xts-plain64
  keysize: 512 bits
  key location: keyring
  device:  /dev/nvme1n1p2
  sector size:  512
  offset:  32768 sectors
  size:    3905970176 sectors
  mode:    read/write
  flags:   discards no_read_workqueue no_write_workqueue
$ cat /etc/fstab
UUID=076B-C157    /boot       vfat   rw,noatime 0 2

/dev/mapper/root  /           btrfs  rw,noatime,space_cache=v2,subvol=/subroot
/dev/mapper/root  /home       btrfs  rw,noatime,space_cache=v2,subvol=/home
/dev/mapper/root  /var/cache  btrfs  rw,noatime,space_cache=v2,subvol=/cache
/dev/mapper/root  /var/log    btrfs  rw,noatime,space_cache=v2,subvol=/log
/dev/mapper/root  /var/tmp    btrfs  rw,noatime,space_cache=v2,subvol=/tmp

/dev/mapper/root  /swap  btrfs  rw,noatime,space_cache=v2,subvol=/swap
/swap/swapfile    none   swap   defaults

So I don't understand why fstrim has no effect. It does work as expected on my laptop, which has a rootfs of ext4 with no luks — the utilization is roughly the same as the filesystem usage and running fstrim reduces the utilization slightly.

Is trim working for you guys, or have you encountered nvme drives that just report 100% utilization for some reason?

Last edited by Brocellous (2022-10-15 19:12:17)


#2 2022-10-15 18:02:17

cfr
Member
From: Cymru
Registered: 2011-11-27
Posts: 7,167

Re: [Solved] Discard has no effect on btrfs on luks

Isn't this expected if you formatted the entire drive? All the space on my drive is assigned either to the ESP or the LUKS and I see the same. I don't think this has anything to do with trimming.

In contrast I only show utilisation of 3% of the drive's spare capacity. What does smartctl -a show you?

Last edited by cfr (2022-10-15 18:03:39)


CLI Paste | How To Ask Questions

Arch Linux | x86_64 | GPT | EFI boot | refind | stub loader | systemd | LVM2 on LUKS
Lenovo x270 | Intel(R) Core(TM) i5-7200U CPU @ 2.50GHz | Intel Wireless 8265/8275 | US keyboard w/ Euro | 512G NVMe INTEL SSDPEKKF512G7L


#3 2022-10-15 18:11:31

Brocellous
Member
Registered: 2017-11-27
Posts: 155

Re: [Solved] Discard has no effect on btrfs on luks

It is not expected. You may not have opened the LUKS volume with discard support; it is not the default. In my case the cryptsetup output shows that the discard flag is applied, and lsblk shows that the encrypted volume supports discards:

$ lsblk -o +DISC-GRAN,DISC-MAX /dev/mapper/root
NAME MAJ:MIN RM  SIZE RO TYPE  MOUNTPOINTS DISC-GRAN DISC-MAX
root 254:0    0  1.8T  0 crypt /var/tmp         512B       2T
                               /var/log              
                               /var/cache            
                               /swap                 
                               /home                 
                               /       

See the description of the discard option in crypttab(5), that is how I apply the option:

$ sudo cat /etc/crypttab
root UUID=3dde781e-5122-42ef-9505-79789f86e860 - fido2-device=auto,discard,no-read-workqueue,no-write-workqueue

You can also apply it to an opened luks volume with:

$ sudo cryptsetup refresh --allow-discards $devname

If you are missing the discard option, would you mind trying that and then running fstrim -v?
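To check quickly whether an opened mapping already has the option, the flags line of `cryptsetup status` can be inspected. This is only a sketch, assuming the output format shown in the first post (a `flags:` line listing `discards`); `luks_allows_discards` is a hypothetical helper name, not a cryptsetup command.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if `cryptsetup status` output (on stdin)
# lists "discards" on its "flags:" line, fail otherwise.
luks_allows_discards() {
    awk '$1 == "flags:" {
             for (i = 2; i <= NF; i++)
                 if ($i == "discards") found = 1
         }
         END { exit !found }'
}
```

It would be used as `sudo cryptsetup status root | luks_allows_discards && echo "discards enabled"`, where `root` is the mapping name from this thread.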

Last edited by Brocellous (2022-10-15 18:17:20)


#4 2022-10-15 18:16:10

Scimmia
Fellow
Registered: 2012-09-01
Posts: 12,903

Re: [Solved] Discard has no effect on btrfs on luks

You still haven't explained why you think there's a problem. Everything looks fine to me.

NUSE being the same as NSZE is normal when not using thin provisioning.

Last edited by Scimmia (2022-10-15 18:29:50)


#5 2022-10-15 18:30:38

tucuxi
Member
From: Switzerland
Registered: 2020-03-08
Posts: 291

Re: [Solved] Discard has no effect on btrfs on luks

I am not sure either where you suspect an issue. Keep in mind that df doesn't give accurate information for btrfs filesystems.


#6 2022-10-15 18:34:59

Brocellous
Member
Registered: 2017-11-27
Posts: 155

Re: [Solved] Discard has no effect on btrfs on luks

Here is the output from my laptop:

[rxps ~]$ sudo smartctl -ir nvmeioctl /dev/nvme0n1
[...]
=== START OF INFORMATION SECTION ===
 [NVMe call: opcode=0x06, size=0x1000, nsid=0x00000001, cdw10=0x00000000]
 [Duration: 0.000665s]
 [NVMe call succeeded: result=0x00000000]
Model Number:                       Samsung SSD 970 EVO 1TB
Serial Number:                      S5H9NS0N840571Z
Firmware Version:                   2B2QEXE7
PCI Vendor ID:                      0x144d
PCI Vendor Subsystem ID:            0x144d
IEEE OUI Identifier:                0x002538
Total NVM Capacity:                 1,000,204,886,016 [1.00 TB]
Unallocated NVM Capacity:           0
Controller ID:                      4
NVMe Version:                       1.3
Number of Namespaces:               1
Namespace 1 Size:                   1,000,204,886,016 [1.00 TB]
Namespace 1 Capacity:               1,000,204,886,016 [1.00 TB]
Namespace 1 Utilization:            513,911,869,440 [513 GB]
Namespace 1 Formatted LBA Size:     512
Namespace 1 IEEE EUI-64:            002538 580143b8f2
Local Time is:                      Sat Oct 15 11:19:40 2022 MST

and the utilization responds properly to fstrim. What do you suppose is the cause for the difference? It's a problem because it is a factor that causes write-amplification: https://en.wikipedia.org/wiki/Write_amp … user_space.

tucuxi wrote:

Keep in mind that df doesn't give accurate information for btrfs filesystems.

I'm aware, but the difference here is not relevant so I use the more compact command output:

$ sudo btrfs filesystem df -h / | column -ts:
Data, single            total=262.01GiB, used=259.82GiB
System, DUP             total=8.00MiB, used=48.00KiB
Metadata, DUP           total=5.00GiB, used=2.84GiB
GlobalReserve, single   total=412.22MiB, used=0.00B


#7 2022-10-15 18:43:51

Brocellous
Member
Registered: 2017-11-27
Posts: 155

Re: [Solved] Discard has no effect on btrfs on luks

Scimmia wrote:

NUSE being the same as NSZE is normal when not using thin provisioning.

No, you're thinking of NCAP. NUSE is specifically for the host to track the blocks in use: https://nvmexpress.org/resources/nvm-ex … amespaces/


#8 2022-10-15 18:48:21

Scimmia
Fellow
Registered: 2012-09-01
Posts: 12,903

Re: [Solved] Discard has no effect on btrfs on luks


#9 2022-10-15 19:11:41

Brocellous
Member
Registered: 2017-11-27
Posts: 155

Re: [Solved] Discard has no effect on btrfs on luks

The microsoft source is the only one that matches what I have observed. It says:

learn.microsoft.com wrote:

A controller may report a NUSE value equal to an NCAP value at all times if the product is not targeted for thin provisioning environments.

So it is possible that my drive does not support this optional feature. It's worth noting that both this drive and the one in my laptop report an nsfeat value of 0, indicating no support for thin provisioning, even though my laptop drive does report NUSE as expected:

$ sudo nvme id-ns -o json /dev/nvme1n1 | jq .nsfeat             
0
[rxps ~]$ sudo nvme id-ns -o json /dev/nvme0n1 | jq .nsfeat
0
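The nsfeat value is a bit field, and bit 0 is the one that indicates thin provisioning support, so a value of 0 means the bit is clear. A minimal sketch of testing that bit, with `ns_supports_thinp` as a made-up helper name:

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if bit 0 of NSFEAT (thin provisioning) is set.
# $1: the nsfeat value, e.g. as printed by `nvme id-ns -o json <dev> | jq .nsfeat`
ns_supports_thinp() {
    [ $(( $1 & 1 )) -eq 1 ]
}
```

For example, `ns_supports_thinp "$(sudo nvme id-ns -o json /dev/nvme1n1 | jq .nsfeat)"` fails for both drives here, since both report 0.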

I will accept this as solved unless I learn otherwise.

EDIT: Emphasis mine.

Last edited by Brocellous (2022-10-15 19:47:34)


#10 2022-10-15 19:19:50

cfr
Member
From: Cymru
Registered: 2011-11-27
Posts: 7,167

Re: [Solved] Discard has no effect on btrfs on luks

I use kernel parameters rather than crypttab, but discard is definitely enabled. According to crypttab, it isn't relevant for root. At any rate, the kernel parameters seem to work fine.

cryptdevice=...:<name>:allow-discards rd.luks.options=discard

I don't think fstrim is relevant to namespace utilisation?

Last edited by cfr (2022-10-15 19:35:14)



#11 2022-10-15 19:21:33

Brocellous
Member
Registered: 2017-11-27
Posts: 155

Re: [Solved] Discard has no effect on btrfs on luks

cfr wrote:

I enable it in a different way because crypttab isn't useful for root.

It is if copied into the initramfs. The sd-encrypt mkinitcpio hook will do this automatically if /etc/crypttab.initramfs is present.

$ file /etc/crypttab.initramfs        
/etc/crypttab.initramfs: symbolic link to /etc/crypttab

cfr wrote:

I don't think fstrim is expected to affect namespace utilisation.

I can assure you that it does, at least on my laptop. I have observed it.

Last edited by Brocellous (2022-10-15 19:31:46)


#12 2022-10-15 19:44:08

cfr
Member
From: Cymru
Registered: 2011-11-27
Posts: 7,167

Re: [Solved] Discard has no effect on btrfs on luks

Scimmia wrote:

And yet all other sources disagree with that.

https://metebalci.com/blog/a-quick-tour … ress-nvme/

Although the explanation there says nuse will differ only if thin-provisioning is used and that it is equal to nsize in the example, the actual output shown does have nuse lower than ncap=nsize.



#13 2022-10-15 19:59:48

frostschutz
Member
Registered: 2013-11-15
Posts: 1,554

Re: [Solved] Discard has no effect on btrfs on luks

On a fully encrypted SSD, you can roughly check how much is trimmed (zero). This assumes the SSD actually returns trimmed areas as zero (deterministic read zero after trim). In some cases it can be necessary to drop caches first.

# pv -cN input < /dev/nvme0n1 | lzop --fast | pv -cN output > /dev/null
    input:  238GiB 0:05:13 [ 779MiB/s] [===========================>] 100%
   output:  147GiB 0:05:13 [ 482MiB/s] [           <=>                               ]

Basically this reads the whole SSD and compresses the data. pv will print progress for input (uncompressed raw data) and output (compressed data).

If input and output stay the same, that suggests it's not trimmed, since encrypted/random data can't be compressed at all. If anything, the compressed data will be slightly larger due to overhead. This method works less well with unencrypted data, as it can be compressed to some extent naturally.

If the SSD has empty/unencrypted areas, then the amount of output should be considerably less than input.

If fstrim claimed to have trimmed 1.6 terabytes of data, then about that much should be missing from the output, too.
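The same idea can be tried on a scaled-down sample without touching the drive. This sketch assumes gzip -1 as a stand-in for lzop --fast, and a regular file in which zeros play the role of trimmed areas and random bytes the role of encrypted data; `compress_probe` is a hypothetical helper, not part of the method above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "<input_bytes> <compressed_bytes>" for a file.
# gzip -1 is an assumed substitute for lzop --fast.
# Note: the source is read twice, once to count it and once to compress it.
compress_probe() {
    local src=$1 in_bytes out_bytes
    in_bytes=$(wc -c < "$src" | tr -d ' ')
    out_bytes=$(gzip -1 -c < "$src" | wc -c | tr -d ' ')
    echo "$in_bytes $out_bytes"
}
```

With a sample built as `{ head -c 1048576 /dev/urandom; head -c 3145728 /dev/zero; } > sample.bin`, running `compress_probe sample.bin` should report an output only slightly above the 1 MiB random portion, because the 3 MiB of zeros compress to almost nothing.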

Edit: for a nicer output progress bar, add --size=$(blockdev --getsize64 /dev/nvme0n1)

Edit: my Micron SSD also does not reflect TRIM/discard in smartctl output

Namespace 1 Size:                   256,060,514,304 [256 GB]
Namespace 1 Capacity:               256,060,514,304 [256 GB]
Namespace 1 Utilization:            256,060,514,304 [256 GB]

so in this case you just have to use other means of verification

Last edited by frostschutz (2022-10-15 21:13:41)

