Hello there, I'd like to know what I am missing or doing wrong with my encrypted swap setup.
I've followed the whole procedure: I open both root and swap on boot via a custom hook, and I use resume as well (I've also tried without resume, with the same results).
Everything is fine until I reach the RAM limit ... where, instead of helping out, swap access freezes the OS.
I am testing with this C program https://unix.stackexchange.com/a/1368 , modified as follows (to speed it up):
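#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define CHUNK (1024 * 1024) /* allocate 1 MiB at a time */

int main(int argc, char** argv) {
    int max = -1; /* -1 means no limit: run until malloc fails */
    int mb = 0;
    char* buffer;

    if (argc > 1)
        max = atoi(argv[1]);

    /* The memory is intentionally never freed. */
    while (mb != max && (buffer = malloc(CHUNK)) != NULL) {
        memset(buffer, 0, CHUNK); /* touch every page so it is really backed */
        mb++;
        printf("Allocated %d MB\n", mb);
        usleep(1000); /* 1 ms pause between allocations */
    }
    return 0;
}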
My `/etc/fstab` looks like the following one:
# UUID=18f24c4a-901a-443a-9f3e-cbc909da3fb9
/dev/mapper/root / ext4 rw,relatime 0 1
# UUID=7940-9249
UUID=7940-9249 /boot vfat rw,relatime,fmask=0022,dmask=0022,codepage=437,iocharset=iso8859-1,shortname=mixed,utf8,errors=remount-ro 0 2
# UUID=38ed4406-6261-4f6c-a247-09163387bae9
/dev/mapper/swap none swap defaults 0 0
The `swapon --show` command shows this:
NAME TYPE SIZE USED PRIO
/dev/nvme0n1p2 partition 7.5G 0B -2
and if I type `lsblk` this is the outcome:
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
nvme0n1 259:0 0 477G 0 disk
├─nvme0n1p1 259:1 0 127M 0 part /boot
├─nvme0n1p2 259:2 0 7.5G 0 part [SWAP]
└─nvme0n1p3 259:3 0 469.3G 0 part /
Every disk-related tool shows the configuration is right, and yet as soon as the swap is reached, the OS goes bananas.
Please note, this is not on one machine: it happens on 3 different machines with 3 different kinds of hardware but the same setup. So it's easy to reproduce, but I have no idea what is keeping it from working as expected.
Any help appreciated, so thanks in advance to whoever could give me at least a clue, or a hint.
Best Regards
Last edited by WebReflection (2020-01-14 15:32:29)
Offline
Isn't it normal?
Swap is not RAM. Swap is incredibly slow. Even on SSD.
In your C program the "sleep(1/2);" bothers me a little. Isn't that just sleep(0)? Your program might not be doing what you intend.
If you keep allocating more and more and more RAM, the system will grind to a halt until those 8G of swap are filled (which is why smaller swap is sometimes better - large swap takes ages to fill) and then the OOM killer kicks in at some point (might be too late, or hit the wrong process).
I haven't used swap in a long time but I remember systems being locked down for hours before they recovered from a swap accident. Juggling pages between RAM and swap is like an endless variant of this game: https://en.wikipedia.org/wiki/Musical_chairs
Swap is only great as long as no program stress tests it, otherwise the system locks up pretty much indefinitely. If you actually seriously need more RAM, the only solution is to buy more RAM.
To my knowledge the system's OOM killer still has no time component... it will kill when out of both RAM and swap, but won't act if the system is just going into endless grind mode.
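Since the OOM killer will not step in during that grind phase, it can help to watch swap usage from a second terminal while the test runs. A minimal sketch, assuming a Linux `/proc/meminfo` with `SwapTotal:` and `SwapFree:` lines; the function name `swap_usage_kb` is made up for illustration:

```shell
# Print swap usage as "used_kB total_kB" from a /proc/meminfo-style file.
# The file argument is optional and defaults to the live /proc/meminfo.
swap_usage_kb() {
    awk '
        $1 == "SwapTotal:" { total = $2 }
        $1 == "SwapFree:"  { free = $2 }
        END { print total - free, total }
    ' "${1:-/proc/meminfo}"
}
```

Calling `swap_usage_kb` in a loop with a short `sleep` would show how quickly swap fills up before the system stops responding.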
Last edited by frostschutz (2020-01-14 10:04:55)
Online
Isn't it normal?
The same system with a non encrypted setup never froze, so I assume it's not?
Swap is not RAM. Swap is incredibly slow. Even on SSD.
I'm OK with slow, but I'm not OK with frozen, as it wasn't happening with the same machines before without cryptsetup in the mix.
In your C program the "sleep(1/2);" bothers me a little. Isn't that just sleep(0)? Your program might not be doing what you intend.
I am not into C much these days, and I guess a `usleep` (if it's even a thing) would be better. However, the software does exactly what it's meant to do, and I can monitor the RAM piling up.
That being said, the problem manifested for real while building WPEWebKit, which reached over 16GB of RAM at some point, and froze the OS, so that I couldn't finish building it.
As that was working just fine without cryptsetup in the mix, I'm puzzled as to why everything freezes now.
If you keep allocating more and more and more RAM, the system will grind to a halt until those 8G of swap are filled and then the OOM killer kicks in at some point (might be too late, or hit the wrong process).
The freezing part happens pretty much as soon as the SWAP starts being used. If I manage to kill the `memeater` program close enough to that point, memory frees up and the swap keeps holding something for I don't know how long.
However, since the SWAP is already slow, like you mentioned, I wonder if what I am seeing is the effect of an encrypted swap ... so maybe I could try to drop the swap, create a 2GB or something file and swap on that one, and see how it goes.
As I am pretty confident that won't likely freeze the OS, is there anything else you believe could go wrong in my current configuration?
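The swap-file experiment mentioned above could be prepared roughly like this. It is only a sketch: the helper name `prepare_swapfile` and the path `/swapfile` are placeholders, and GNU `dd` is assumed (as in the rest of this thread).

```shell
# Hypothetical helper: preallocate a swap file of SIZE_MB mebibytes at PATH.
# Usage: prepare_swapfile PATH SIZE_MB
prepare_swapfile() {
    path=$1
    size_mb=$2
    # Create the file with a restrictive umask so it is never world-readable.
    # dd writes real zero blocks; a swap file must not contain holes.
    ( umask 077 && dd if=/dev/zero of="$path" bs=1M count="$size_mb" status=none ) || return 1
    chmod 600 "$path"
}

# As root, the remaining steps would then be:
#   prepare_swapfile /swapfile 2048
#   mkswap /swapfile
#   swapon /swapfile
```

The existing swap would have to be disabled first with `swapoff` so that only the file is used during the test.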
Thanks.
Edit: there was a `usleep` after all, so we can now drop the script from the equation. I only used it to reach the swap, not to find out my OS was freezing; that happened for real while using the whole amount of RAM building WebKit.
Last edited by WebReflection (2020-01-14 10:11:42)
Offline
Incrementally slowing down the `memeater` program seems to produce different results:
* it doesn't instantly freeze, the SWAP starts being used without major issues
* after a while it freezes anyway, but if I manage to kill the memeater program everything goes back to normal
At this point, I believe the result I'm seeing is just the encrypted swap being super slow.
Last question: is `resume` possibly a cause of the OS freezing when the swap is used, or does resume only come into play when there is something to store and resume?
Once I've got this question answered, I guess I could close this as solved.
Thanks.
P.S. meanwhile, I've ordered 32GB of RAM ...
Last edited by WebReflection (2020-01-14 11:45:07)
Offline
resume only does anything when actually suspending or resuming
at most there could be side effects: resume is opened in early boot (initramfs), and if your system supports AES instructions, the relevant aesni module should already be loaded at that time or it might end up running in slow software AES mode
but that would be an error in the initramfs then; if it works with encryption it naturally has to load all the relevant crypto stuff for that
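A quick first check is whether the CPU advertises AES instructions at all and whether the `aesni_intel` module is loaded. A rough sketch, assuming an x86 Linux system; the function names are made up, and the second check assumes the driver is built as a module rather than into the kernel:

```shell
# Both helpers take an optional file argument so they can be tried on saved copies.

# Succeeds if the first "flags" line of a cpuinfo-style file lists the "aes" flag.
cpu_has_aes() {
    grep -m1 '^flags' "${1:-/proc/cpuinfo}" | grep -qw 'aes'
}

# Succeeds if aesni_intel shows up in a /proc/modules-style file.
# Assumption: the driver is a loadable module; a built-in driver is not listed there.
aesni_module_loaded() {
    grep -q '^aesni_intel ' "${1:-/proc/modules}"
}
```

Running `cpu_has_aes && aesni_module_loaded && echo "AES-NI available"` only shows that hardware AES is available; it does not prove that a given dm-crypt mapping actually uses it.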
Online
the relevant aesni module should already be loaded at that time or it might end up running in slow software aes mode
how can I check that's not the case?
to provide more context, the hooks order is:
HOOKS=(base udev autodetect keyboard keymap modconf block my-login-prompt encrypt resume filesystems fsck)
The hook provides `cryptsetup` as binary (even if it was there anyway), and my-login-prompt does both root and swap cryptsetup at once on boot.
The reason I am using my own login prompt is that otherwise I'd need to enter the password twice per boot, which is annoying, but at the same time, I want to be sure the system is using hardware to encrypt/decrypt, not software.
The root partition though, seems to work just fine, even on old Pentium based laptops ... it's almost as fast as when not encrypted.
Thanks.
Last edited by WebReflection (2020-01-14 12:16:18)
Offline
Not sure if there is a way to query it directly... info only gives the cipher (not the implementation used), dmesg is silent about created crypto mappings too.
there is /proc/crypto but it only gives a generic refcount (how many users there are) not what is using it exactly
so a roundabout way (if you are able to swapoff and remove the crypto mapping) is to compare refcount before/after
# cp /proc/crypto a
# cryptsetup close foobar
# cp /proc/crypto b
# diff -u a b
--- a 2020-01-14 15:59:40.145933161 +0100
+++ b 2020-01-14 15:59:46.195751797 +0100
@@ -11,7 +11,7 @@
driver : cryptd(__xts-aes-aesni)
module : cryptd
priority : 451
-refcnt : 5
+refcnt : 4
selftest : passed
internal : yes
type : skcipher
@@ -128,7 +128,7 @@
driver : xts-aes-aesni
module : aesni_intel
priority : 401
-refcnt : 5
+refcnt : 4
selftest : passed
internal : no
type : skcipher
@@ -192,7 +192,7 @@
driver : __xts-aes-aesni
module : aesni_intel
priority : 401
-refcnt : 5
+refcnt : 4
selftest : passed
internal : yes
type : skcipher
So dm-crypt 'foobar' was using aesni_intel, as refcounts went down by one after closing it. If it was software aes the refcount would have gone down for aes-generic instead.
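The before/after comparison can also be boiled down to just the drivers whose refcount changed, instead of reading the whole diff. A minimal sketch, assuming two saved copies of `/proc/crypto` in the usual `key : value` layout; the function name `crypto_refcount_changes` is made up:

```shell
# Compare two /proc/crypto snapshots and print every driver whose refcnt
# differs, as "driver before -> after".
# Usage: crypto_refcount_changes BEFORE_FILE AFTER_FILE
crypto_refcount_changes() {
    awk '
        $1 == "driver" { drv = $3 }
        $1 == "refcnt" {
            if (FNR == NR)
                before[drv] = $3          # first file: remember the count
            else if ((drv in before) && before[drv] != $3)
                print drv, before[drv], "->", $3
        }
    ' "$1" "$2"
}
```

With the snapshots `a` and `b` from the commands above, `crypto_refcount_changes a b` would list only the aesni entries if the mapping was using hardware AES.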
Well, this is a rare case. It shouldn't be a problem with archlinux initcpio normally.
For read benchmark you can use something like
dd status=progress bs=1M of=/dev/null iflag=direct if=/dev/mapper/cryptothingy
and look at CPU usage (htop) at the same time
Last edited by frostschutz (2020-01-14 15:08:01)
Online
My sequence:
# cp /proc/crypto a
# sudo swapoff /dev/mapper/swap
# sudo cryptsetup close swap
# cp /proc/crypto b
# diff -u a b
--- a 2020-01-14 16:14:02.267979632 +0100
+++ b 2020-01-14 16:14:49.328592872 +0100
@@ -72,7 +72,7 @@
driver : cryptd(__xts-aes-aesni)
module : cryptd
priority : 451
-refcnt : 3
+refcnt : 2
selftest : passed
internal : yes
type : skcipher
@@ -200,7 +200,7 @@
driver : xts-aes-aesni
module : aesni_intel
priority : 401
-refcnt : 3
+refcnt : 2
selftest : passed
internal : no
type : skcipher
@@ -264,7 +264,7 @@
driver : __xts-aes-aesni
module : aesni_intel
priority : 401
-refcnt : 3
+refcnt : 2
selftest : passed
internal : yes
type : skcipher
As I have no idea what I am looking at, do you see any indication that the encryption is done in software instead of hardware?
Edit: reading your comment again, I understand everything is fine there.
Thanks
Last edited by WebReflection (2020-01-14 15:39:46)
Offline
P.S. the dd bench showed this, on an ASUS E203 with an eMMC of 64GB ... it's the slowest machine I have, yet it doesn't look too bad ... or is it?
1458569216 bytes (1.5 GB, 1.4 GiB) copied, 9 s, 162 MB/s
doing the same operation after dropping the swap and cryptsetup, on the same partition/disk, is slightly faster, but not twice as fast:
1573912576 bytes (1.6 GB, 1.5 GiB) copied, 8 s, 197 MB/s
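To put the two `dd` figures side by side, the relative slowdown can be worked out with a one-liner. This is just arithmetic on the numbers reported above; the function name `slowdown_pct` is made up for illustration:

```shell
# slowdown_pct PLAIN ENCRYPTED
# Prints how much slower (in percent) the encrypted read was than the plain one.
slowdown_pct() {
    awk -v plain="$1" -v enc="$2" 'BEGIN { printf "%.1f\n", (plain - enc) / plain * 100 }'
}

slowdown_pct 197 162   # the two MB/s figures from the dd runs; prints 17.8
```

So on this machine the encrypted read is roughly 18% slower than the raw partition.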
I guess we could call the case "solved", unless there's really something wrong with what I've posted.
Offline
P.S. 2 ... after checking via `htop` I could see that one of the 4 threads was slightly busier than without cryptsetup, but not by much ... well, I think I just confirmed to myself it was a good idea to order new RAM for my Desktop PC.
Thanks a lot for hints and tests!
Offline