#1 2021-06-11 23:51:59

williewillus
Member
Registered: 2011-03-27
Posts: 28

Extremely poor performance under IO load (Full-Disk Encryption)

Hey all.

I have an installation where both the root disk and a data disk are encrypted with dm-crypt/LUKS. The encryption is at the block-device level.
Whenever the system is under heavy IO load on either disk (particularly during writes), the entire system grinds to a halt. This is especially apparent during system upgrades.

It seems like the writes occur extremely quickly into memory buffers/caches, but then the system struggles to encrypt and write it all out.
I've tried tweaking the vm.dirty_{background_}bytes sysctls, but to no avail.
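
A rough way to check that hypothesis is to watch the Dirty and Writeback counters in /proc/meminfo while the load is running. This is only a minimal sketch; the helper name dirty_mib is made up for illustration.

```shell
# Print the Dirty and Writeback counters (converted to MiB) from
# /proc/meminfo-style input on stdin. The exact match on the first field
# skips similarly named entries such as WritebackTmp.
dirty_mib() {
    awk '$1 == "Dirty:" || $1 == "Writeback:" { printf "%s %.1f MiB\n", $1, $2 / 1024 }'
}

# Usage (poll once per second during the heavy write):
#   while sleep 1; do dirty_mib < /proc/meminfo; done
```

If Dirty climbs to the configured limit and stays there while the desktop stalls, that would suggest writers are being throttled on writeback rather than the CPU being the bottleneck.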

Has anyone run into this and found a way to get graceful degradation under IO load? Ideally there's backpressure throughout and things never get to the point of overload, at the price of reduced throughput.
Thanks in advance.


EDIT: In case it helps, system info:
CPU: Intel(R) Core(TM) i5-7600K CPU @ 3.80GHz
RAM: 32G
Filesystem: ext4 on both disks

Last edited by williewillus (2021-06-11 23:54:41)

#2 2021-06-12 00:11:25

schard
Forum Moderator
From: Hannover
Registered: 2016-05-06
Posts: 2,640

Re: Extremely poor performance under IO load (Full-Disk Encryption)

I have experienced similar behaviour over the past few months on my work laptop.
AMD Ryzen 5 2500U
16 GB RAM
FS: ext4

$ cat /etc/fstab 
# /dev/mapper/root LABEL=root
UUID=835654de-1137-4c83-be10-4cf7d6879a32	/         	ext4      	rw,relatime,discard	0 1

# /dev/nvme0n1p1 LABEL=EFI
UUID=608C-EC65      	/boot     	vfat      	rw,relatime,fmask=0022,dmask=0022,codepage=437,iocharset=iso8859-1,shortname=mixed,utf8,errors=remount-ro	0 2
$ doas cryptsetup luksDump /dev/nvme0n1p2 
LUKS header information
Version:       	2
Epoch:         	3
Metadata area: 	16384 [bytes]
Keyslots area: 	16744448 [bytes]
UUID:          	28e49560-1d78-414d-9b2e-382646e87aaf
Label:         	(no label)
Subsystem:     	(no subsystem)
Flags:       	(no flags)

Data segments:
  0: crypt
	offset: 16777216 [bytes]
	length: (whole device)
	cipher: aes-xts-plain64
	sector: 512 [bytes]

Keyslots:
  0: luks2
	Key:        512 bits
	Priority:   normal
	Cipher:     aes-xts-plain64
	Cipher key: 512 bits
	PBKDF:      argon2i
	Time cost:  4
	Memory:     767745
	Threads:    4
	Salt:       e9 d5 7b f2 5f 60 0c 0d 57 c6 24 80 b4 e9 ee 2b 
	            2c 0c 91 16 34 6c e5 c7 2b 2c f1 39 e9 c2 44 3b 
	AF stripes: 4000
	AF hash:    sha256
	Area offset:32768 [bytes]
	Area length:258048 [bytes]
	Digest ID:  0
Tokens:
Digests:
  0: pbkdf2
	Hash:       sha256
	Iterations: 139884
	Salt:       f9 38 3b 0b dc c7 88 0d 52 ff a2 dc 08 cc 5a ff 
	            96 dc 9a df 9b c1 a6 c0 63 ef f0 de 05 a7 89 ea 
	Digest:     4c 53 50 c2 33 06 44 55 ef 5e a1 a6 71 97 4c 21 
	            c4 3b 9b 03 ed 48 71 ff 82 72 d0 27 00 19 ed e4 
$ doas cryptsetup benchmark
# Tests are approximate using memory only (no storage IO).
PBKDF2-sha1      1989707 iterations per second for 256-bit key
PBKDF2-sha256    3483641 iterations per second for 256-bit key
PBKDF2-sha512    1263344 iterations per second for 256-bit key
PBKDF2-ripemd160  737395 iterations per second for 256-bit key
PBKDF2-whirlpool  538836 iterations per second for 256-bit key
argon2i       4 iterations, 1048576 memory, 4 parallel threads (CPUs) for 256-bit key (requested 2000 ms time)
argon2id      4 iterations, 1048576 memory, 4 parallel threads (CPUs) for 256-bit key (requested 2000 ms time)
#     Algorithm |       Key |      Encryption |      Decryption
        aes-cbc        128b       940.4 MiB/s      2899.4 MiB/s
    serpent-cbc        128b               N/A               N/A
    twofish-cbc        128b               N/A               N/A
        aes-cbc        256b       726.9 MiB/s      2197.2 MiB/s
    serpent-cbc        256b               N/A               N/A
    twofish-cbc        256b               N/A               N/A
        aes-xts        256b      2330.6 MiB/s      2319.8 MiB/s
    serpent-xts        256b               N/A               N/A
    twofish-xts        256b               N/A               N/A
        aes-xts        512b      1462.2 MiB/s      1478.3 MiB/s
    serpent-xts        512b               N/A               N/A
    twofish-xts        512b               N/A               N/A

Inofficial first vice president of the Rust Evangelism Strike Force

#3 2021-06-12 00:11:27

loqs
Member
Registered: 2014-03-06
Posts: 18,876

Re: Extremely poor performance under IO load (Full-Disk Encryption)

Is the encryption method using hardware acceleration such as AES-NI?  Is the encrypted volume opened with the option no-write-workqueue?  See man 5 crypttab.  The full list of options that the volume is using would also be useful.
What options is the filesystem mounted with?
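
For reference, these questions can mostly be answered from the shell. The following is a sketch, assuming an x86 CPU (where AES-NI shows up as the aes flag in /proc/cpuinfo). The helpers has_aes_flag and crypt_flags are illustrative names, and <name> stands for whatever the mapping is called.

```shell
# Succeeds if the first "flags" line of /proc/cpuinfo-style input on stdin
# contains the whole word "aes" (so "vaes" alone does not count).
has_aes_flag() {
    grep -m1 '^flags' | grep -qw aes
}

# Print the value of the "flags:" line from `cryptsetup status` output on stdin.
crypt_flags() {
    awk '$1 == "flags:" { $1 = ""; sub(/^ +/, ""); print }'
}

# Usage:
#   has_aes_flag < /proc/cpuinfo && echo "AES-NI available"
#   cryptsetup status <name> | crypt_flags
#   findmnt -no OPTIONS /        # effective mount options of the root filesystem
```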

#4 2022-01-08 01:09:11

williewillus
Member
Registered: 2011-03-27
Posts: 28

Re: Extremely poor performance under IO load (Full-Disk Encryption)

Hi, sorry for the late update.

More details on my setup:
My /boot is unencrypted, and the encrypted root volume is passed on the kernel command line:

/boot/grub/grub.cfg

linux   /vmlinuz-linux root=UUID=4937802f-ff0e-432c-9c07-30b41d82f847 rw cryptdevice=UUID=c9932acc-b23e-461e-b437-eeb3fb4d1b1b:cryptroot:no-write-workqueue root=/dev/mapper/cryptroot loglevel=3 quiet

/etc/fstab

UUID=4937802f-ff0e-432c-9c07-30b41d82f847       /               ext4            rw,relatime     0 1

My sysctls are set to

vm.dirty_background_bytes = 402653200
vm.dirty_bytes = 402653200

but I still get extreme performance dips, to the point that if I'm performing pacman upgrades, the desktop is pretty much unusable interactively.
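
To put those sysctl values in perspective, here is a small arithmetic sketch. The helper names are made up, and the 100 MiB/s write rate is purely a hypothetical figure, not a measurement from this system.

```shell
# Convert a byte count to whole MiB (integer division).
to_mib() {
    echo $(( $1 / 1048576 ))
}

# Rough number of whole seconds needed to flush <bytes> of dirty data
# at a sustained write rate of <mib_per_s> MiB/s.
flush_seconds() {
    echo $(( $1 / 1048576 / $2 ))
}

# Usage:
#   to_mib 402653200              # the limit above, in MiB
#   flush_seconds 402653200 100   # seconds to drain it at a hypothetical 100 MiB/s
```

The limit works out to about 384 MiB, so a full buffer would take a few seconds to drain at that assumed rate. Note also that vm.dirty_background_bytes (where background writeback starts) and vm.dirty_bytes (where writing processes are made to do writeback themselves) are set to the same value here, which leaves no headroom between the two; a lower background value might be worth trying.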

Here is cryptsetup status:

/dev/mapper/cryptroot is active and is in use.
  type:    LUKS2
  cipher:  aes-xts-plain64
  keysize: 512 bits
  key location: keyring
  device:  /dev/sda2
  sector size:  512
  offset:  32768 sectors
  size:    459942570 sectors
  mode:    read/write
  flags:   no_write_workqueue

Here is cryptsetup benchmark, in case it helps. My processor can do aes-xts at over 2 GiB/s so I still think this is some sort of vmm/io scheduling issue:

# Tests are approximate using memory only (no storage IO).
PBKDF2-sha1      1842840 iterations per second for 256-bit key
PBKDF2-sha256    2332760 iterations per second for 256-bit key
PBKDF2-sha512    1643536 iterations per second for 256-bit key
PBKDF2-ripemd160  967321 iterations per second for 256-bit key
PBKDF2-whirlpool  727167 iterations per second for 256-bit key
argon2i       7 iterations, 1048576 memory, 4 parallel threads (CPUs) for 256-bit key (requested 2000 ms time)
argon2id      8 iterations, 1048576 memory, 4 parallel threads (CPUs) for 256-bit key (requested 2000 ms time)
#     Algorithm |       Key |      Encryption |      Decryption
        aes-cbc        128b      1235.2 MiB/s      3627.1 MiB/s
    serpent-cbc        128b       105.4 MiB/s       800.9 MiB/s
    twofish-cbc        128b       241.8 MiB/s       440.4 MiB/s
        aes-cbc        256b       952.2 MiB/s      2968.3 MiB/s
    serpent-cbc        256b       107.8 MiB/s       802.0 MiB/s
    twofish-cbc        256b       246.8 MiB/s       440.3 MiB/s
        aes-xts        256b      3624.8 MiB/s      3626.5 MiB/s
    serpent-xts        256b       696.1 MiB/s       707.6 MiB/s
    twofish-xts        256b       409.9 MiB/s       413.9 MiB/s
        aes-xts        512b      2950.7 MiB/s      2939.3 MiB/s
    serpent-xts        512b       706.9 MiB/s       705.6 MiB/s
    twofish-xts        512b       411.8 MiB/s       413.4 MiB/s
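
If IO scheduling is the suspect, it may also help to record which scheduler the underlying disk is using. A small sketch, assuming the disk is sda as in the cryptsetup status output above; active_scheduler is an illustrative helper name.

```shell
# Print the active IO scheduler from the content of
# /sys/block/<dev>/queue/scheduler read on stdin.
# The kernel marks the active scheduler with square brackets.
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Usage:
#   active_scheduler < /sys/block/sda/queue/scheduler
```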

#5 2022-01-08 08:26:34

seth
Member
From: Won't reply 2 private help req
Registered: 2012-09-03
Posts: 75,404

Re: Extremely poor performance under IO load (Full-Disk Encryption)

I'll just throw https://www.freedesktop.org/software/sy … crypt-cpus against the wall - no idea whether it sticks.

#6 2023-11-22 21:38:34

nicklan
Member
Registered: 2021-03-02
Posts: 3

Re: Extremely poor performance under IO load (Full-Disk Encryption)

Bumping this as I have basically the same issue. cryptsetup benchmark indicates that I should be able to do about 3GiB/s, but I'm getting about 120 Mb/s when I write :(

This is ext4 with data=ordered.

Things I've tried that haven't helped at all:
- setting barrier=0
- setting noatime
- setting commit=100

None of these seem to affect performance at all. My drive is this one, which seems like it should be able to go MUCH faster...

If anyone has any further ideas of things to try, that would be much appreciated. Thanks!

Info:

$ sudo cryptsetup luksDump /dev/nvme0n1p3
LUKS header information
Version:       	2
Epoch:         	3
Metadata area: 	16384 [bytes]
Keyslots area: 	16744448 [bytes]
UUID:          	757a4a56-ecd4-472a-b42f-a575198aff4d
Label:         	(no label)
Subsystem:     	(no subsystem)
Flags:       	(no flags)

Data segments:
  0: crypt
	offset: 16777216 [bytes]
	length: (whole device)
	cipher: aes-xts-plain64
	sector: 512 [bytes]

Keyslots:
  0: luks2
	Key:        512 bits
	Priority:   normal
	Cipher:     aes-xts-plain64
	Cipher key: 512 bits
	PBKDF:      argon2i
	Time cost:  7
	Memory:     1048576
	Threads:    4
	Salt:       04 bd 2c 81 d3 c4 2b ad c3 44 6f bd 5a af ad 9b
	            b7 a9 4a b8 fe a2 ad 29 29 67 26 c6 2e 67 7d 63
	AF stripes: 4000
	AF hash:    sha256
	Area offset:32768 [bytes]
	Area length:258048 [bytes]
	Digest ID:  0
Tokens:
Digests:
  0: pbkdf2
	Hash:       sha256
	Iterations: 153121
	Salt:       32 62 9c 11 ca 35 18 d9 35 fd 26 16 fa 84 84 52
	            8e f2 66 14 4d 30 b3 48 61 64 bc 0d eb cf 7d 41
	Digest:     03 b2 a8 e0 7b 1f 6c 45 ca 07 7c dd 57 d6 b2 1f
	            40 ad 7f dd f4 a6 33 56 e7 3f 49 3c 57 04 85 b4
$ sudo cryptsetup benchmark
# Tests are approximate using memory only (no storage IO).
PBKDF2-sha1      1565038 iterations per second for 256-bit key
PBKDF2-sha256    2198272 iterations per second for 256-bit key
PBKDF2-sha512    1638400 iterations per second for 256-bit key
PBKDF2-ripemd160  903944 iterations per second for 256-bit key
PBKDF2-whirlpool  720175 iterations per second for 256-bit key
argon2i      10 iterations, 1048576 memory, 4 parallel threads (CPUs) for 256-bit key (requested 2000 ms time)
argon2id     10 iterations, 1048576 memory, 4 parallel threads (CPUs) for 256-bit key (requested 2000 ms time)
#     Algorithm |       Key |      Encryption |      Decryption
        aes-cbc        128b      1237.4 MiB/s      3256.3 MiB/s
    serpent-cbc        128b       110.5 MiB/s       790.9 MiB/s
    twofish-cbc        128b       248.2 MiB/s       415.7 MiB/s
        aes-cbc        256b       956.9 MiB/s      2705.7 MiB/s
    serpent-cbc        256b       109.5 MiB/s       798.0 MiB/s
    twofish-cbc        256b       252.6 MiB/s       418.8 MiB/s
        aes-xts        256b      3128.8 MiB/s      3072.2 MiB/s
    serpent-xts        256b       713.6 MiB/s       714.2 MiB/s
    twofish-xts        256b       396.9 MiB/s       395.5 MiB/s
        aes-xts        512b      2748.7 MiB/s      2687.9 MiB/s
    serpent-xts        512b       716.3 MiB/s       714.3 MiB/s
    twofish-xts        512b       399.4 MiB/s       397.4 MiB/s

#7 2023-11-25 13:41:01

seth
Member
From: Won't reply 2 private help req
Registered: 2012-09-03
Posts: 75,404

Re: Extremely poor performance under IO load (Full-Disk Encryption)

The OP has apparent IO scheduling issues; there was no mention of generally slower-than-expected writes.

which seems like it should be able to go MUCH faster.

Perhaps, "should" … but does it?

your link wrote:

"Terrible consistency"

Avg. 4K Random Write Speed 132MB/s

Did you test the unencrypted IO performance of the drive? Ideally at a similar fill rate.

Depending on what your performance test actually was, incl. sideload during the setup (other processes writing to that drive), your results seem to meet that number rather well.
Since you're playing w/ FS parameters (only), is the (encrypted) write performance fine when you bypass the FS?
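
A minimal sketch of how such a comparison could be run with dd. The helper name write_test and the example paths are assumptions for illustration. It only measures sequential writes, so for 4K random writes a tool like fio would be the better fit.

```shell
# Sequentially write <mib> MiB of zeros to <target> and fsync before dd
# reports its rate. Extra arguments are passed on to dd, e.g. oflag=direct
# to bypass the page cache.
# WARNING: if <target> is a block device, its contents are destroyed.
write_test() {
    target=$1
    mib=$2
    shift 2
    dd if=/dev/zero of="$target" bs=1M count="$mib" conv=fsync "$@"
}

# Usage (dd prints the throughput on stderr):
#   write_test /path/on/encrypted/fs/testfile 1024 oflag=direct   # FS + dm-crypt
#   write_test /dev/mapper/<scratch-mapping> 1024 oflag=direct    # dm-crypt only, DESTRUCTIVE
#   write_test /dev/<scratch-partition> 1024 oflag=direct         # raw drive, DESTRUCTIVE
```

Comparing the three numbers should show whether the filesystem, dm-crypt, or the drive itself accounts for the gap. Only use the device variants on a scratch partition that holds no data.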
