
#1 2019-05-26 00:38:19

jkhsjdhjs
Member
Registered: 2017-09-05
Posts: 39

[SOLVED] NVMe SSD unresponsive when much data is written

PREVIOUS TOPIC: LUKS/LVM on NVMe SSD unresponsive when much data is written

Hello,

I'm running Arch Linux on an LVM with 3 partitions (root, home and swap). The volume group is located on a LUKS partition using the cipher aes-xts-plain64, which in turn resides on a 250GB Samsung 960 EVO.
When copying large files on my home partition, my system becomes really slow and often completely unresponsive. I don't have much space left on my root partition so I couldn't test it there, but I expect it to behave in the same way.

I have discard/trim enabled in /etc/lvm/lvm.conf (issue_discards=1) and also enabled it in /etc/fstab for every partition by adding discard as mount option.
I'm using systemd for decryption on boot with the following line in /etc/crypttab.initramfs, also with discard:

luks-lvm        UUID=9a5efb63-9054-45b1-b8bf-380a533e8f29       -       luks,discard,tries=0
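To verify that discards actually make it through every layer of the stack (device -> LUKS -> LVM), lsblk can report the discard capabilities per device. This is just a quick sanity check; the output naturally depends on your hardware:

```shell
# Non-zero DISC-GRAN / DISC-MAX on a row means that layer passes discards
# down; a zero on the dm-crypt or LVM row would mean discards stop there.
lsblk --discard
```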

That's why I don't think the unresponsiveness is caused by write amplification.

While writing this post I noticed several entries in the NVMe Log of the SSD with the LUKS partition.
I also have another NVMe SSD in my computer, which didn't have any errors in its log. This SSD doesn't host a LUKS partition, just plain ext4, and doesn't have any issues.
So this may be a possible cause for the unresponsiveness.
Another possible cause could be thermal throttling, but the temperature stays rather low when writing.

The following gist contains information about my system. It also contains the NVMe error log: https://gist.github.com/jkhsjdhjs/6ba51 … 4682a5487f

I also recorded a video where I write data from /dev/zero to both NVMe SSDs while constantly reading the temperature, CPU load and write speed: https://screens.totally.rip/vokoscreen- … -39-36.mp4
The video also has a counter in one tmux pane so you can see when the system becomes completely unresponsive (not even the cursor would move), it happens when the counter is at 450.
While the SSDs are writing data I constantly listed directory contents of random directories in dolphin (on the corresponding SSD of course).
When writing to the encrypted disk, directory listing in dolphin took ages (more than 10 seconds) or didn't finish at all.
When writing to the other SSD, directories were listed with no noticeable delay, as usual.
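For anyone who wants to reproduce this without watching the video, here's a sketch of a sequential-write test with dd (scaled down here; conv=fdatasync makes the reported speed include the flush to disk rather than just the page cache):

```shell
# Write zeros to a temp file and flush before dd reports the rate.
# Increase count (e.g. to 16384 for 16 GiB) to outrun the SSD's cache.
testfile=$(mktemp)
dd if=/dev/zero of="$testfile" bs=1M count=8 conv=fdatasync
rm -f "$testfile"
```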

The questions I have for you are the following:
How can I interpret the NVMe logs? Is the SSD defective or are these CPU <-> SSD communication issues?
Or is this expected behavior, LUKS with aes-xts-plain64 just doesn't perform better on my CPU?

Thanks in advance for your answers!

Last edited by jkhsjdhjs (2019-07-24 19:41:47)


#2 2019-05-26 08:01:25

frostschutz
Member
Registered: 2013-11-15
Posts: 1,418

Re: [SOLVED] NVMe SSD unresponsive when much data is written

do you have aggressive power saving settings?

for example https://wiki.archlinux.org/index.php/Po … Management

find /sys -name "*power*policy*" -exec head {} +

anything in dmesg when these slowdowns happen?

you can also verify that aesni is in use by checking dmesg for messages like

[  122.198071] device-mapper: crypt: xts(aes) using implementation "xts-aes-aesni"

if it uses a non-aesni implementation then your cpu is encrypting in software. this can happen if modules are not available by the time the crypt container is opened, and loading the module later does not necessarily make it switch over to the correct implementation
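a quick one-liner to pull the implementation name out of the log; shown here against a sample line so the pattern is clear, on a real system you'd pipe dmesg into the same grep:

```shell
# Extract the xts-aes implementation name from a dm-crypt log line.
# (sample line hardcoded for illustration; normally: dmesg | grep 'crypt:')
sample='[  122.198071] device-mapper: crypt: xts(aes) using implementation "xts-aes-aesni"'
echo "$sample" | grep -o 'xts-aes-[a-z0-9]*'
```

anything other than xts-aes-aesni here means the cpu is doing the crypto in software.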

I have discard/trim enabled in /etc/lvm/lvm.conf (issue_discards=1)

That is a common misunderstanding. issue_discards=1 does not enable/disable trim. it makes LVM actively discard your data when you lvremove, lvresize, pvmove etc.

LVM keeps backups of LVM metadata changes in /etc/lvm/{archive,backup} so in theory you could undo a botched LV resize, however with issue_discards=1 those backups are useless as data is already gone.

I recommend issue_discards=0 and if you want to TRIM free lvm space you can always create a temporary -l100%FREE LV and blkdiscard that.

Otherwise temporarily set it to 0 when you do potential screw-up operations.
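the temporary-LV trick looks roughly like this; printed as a dry run here since the VG name ("myvg") is made up and the real commands need root:

```shell
# Print the commands that would TRIM all free space in a volume group.
# Replace myvg with your VG name and run the printed output as root.
VG=myvg
cat <<EOF
lvcreate -l100%FREE -n trimtmp $VG
blkdiscard /dev/$VG/trimtmp
lvremove -y $VG/trimtmp
EOF
```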

Last edited by frostschutz (2019-05-26 08:02:04)


#3 2019-05-26 13:17:50

jkhsjdhjs
Member
Registered: 2017-09-05
Posts: 39

Re: [SOLVED] NVMe SSD unresponsive when much data is written

Power saving settings:

$ find /sys -name "*power*policy*" -exec head {} +
==> /sys/devices/pci0000:00/0000:00:17.0/ata1/host0/scsi_host/host0/link_power_management_policy <==
max_performance

==> /sys/devices/pci0000:00/0000:00:17.0/ata6/host5/scsi_host/host5/link_power_management_policy <==
max_performance

==> /sys/devices/pci0000:00/0000:00:17.0/ata4/host3/scsi_host/host3/link_power_management_policy <==
max_performance

==> /sys/devices/pci0000:00/0000:00:17.0/ata2/host1/scsi_host/host1/link_power_management_policy <==
max_performance

==> /sys/devices/pci0000:00/0000:00:17.0/ata5/host4/scsi_host/host4/link_power_management_policy <==
max_performance

==> /sys/devices/pci0000:00/0000:00:17.0/ata3/host2/scsi_host/host2/link_power_management_policy <==
max_performance
frostschutz wrote:

anything in dmesg when these slowdowns happen?

Nope, nothing, and nothing in the journal either (checked with journalctl -xf).
There was another thing I noticed when just testing it now:
The SSD has two temperature sensors. After writing about 60 GiB the first one reported 54°C while the second one reported 80°C.
Also no new entries have been added to the NVMe Log while writing, but slowdowns did happen.

aesni is in use:

[   39.165512] device-mapper: crypt: xts(aes) using implementation "xts-aes-aesni"
frostschutz wrote:

issue_discards=1 does not enable/disable trim. it makes LVM actively discard your data when you lvremove, lvresize, pvmove etc.

So when I remove a file with rm it will still trim the data?


#4 2019-05-26 13:28:33

frostschutz
Member
Registered: 2013-11-15
Posts: 1,418

Re: [SOLVED] NVMe SSD unresponsive when much data is written

jkhsjdhjs wrote:
frostschutz wrote:

issue_discards=1 does not enable/disable trim. it makes LVM actively discard your data when you lvremove, lvresize, pvmove etc.

So when I remove a file with rm it will still trim the data?

yes, LVM's issue_discards doesn't change anything about what happens on the filesystem level

unfortunately I have no idea what is going on with your SSD there, good luck finding the cause


#5 2019-05-26 13:33:15

jkhsjdhjs
Member
Registered: 2017-09-05
Posts: 39

Re: [SOLVED] NVMe SSD unresponsive when much data is written

Thanks for the clarification and your help so far! I'll leave issue_discards disabled then.


#6 2019-05-27 11:56:40

jkhsjdhjs
Member
Registered: 2017-09-05
Posts: 39

Re: [SOLVED] NVMe SSD unresponsive when much data is written

I searched a bit and found a question on askubuntu about system freezes and unresponsiveness when copying large files to a USB flash drive. Not exactly my problem, but I thought I'd give it a try.
I first disabled swap completely as suggested in this answer, and when that didn't help, ran the following two commands as described here:

echo $((16*1024*1024)) > /proc/sys/vm/dirty_background_bytes
echo $((48*1024*1024)) > /proc/sys/vm/dirty_bytes
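For reference, those two writes set the background/foreground writeback thresholds to 16 MiB and 48 MiB. To keep them across reboots they can go into a sysctl drop-in (the filename below is just a convention, and these values are what worked for me, not general tuning advice):

```shell
# The two thresholds in bytes (16 MiB background, 48 MiB hard limit):
echo $((16*1024*1024)) $((48*1024*1024))
# To persist, put the equivalent into e.g. /etc/sysctl.d/99-writeback.conf:
#   vm.dirty_background_bytes = 16777216
#   vm.dirty_bytes = 50331648
```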

Modifying dirty_background_bytes and dirty_bytes helped a lot, my system feels a lot more responsive when doing:

cat /dev/zero > testfile

For example, Dolphin lists directories nearly instantly and my system doesn't freeze completely anymore.

Still, programs start much slower than usual, so it's not yet on par with the other, non-encrypted NVMe SSD.


#7 2019-05-27 21:33:01

linux-mate
Banned
Registered: 2018-06-28
Posts: 78

Re: [SOLVED] NVMe SSD unresponsive when much data is written

"Message" is a required field in this form.

Last edited by linux-mate (2020-01-05 21:57:56)


#8 2019-05-27 21:38:42

jkhsjdhjs
Member
Registered: 2017-09-05
Posts: 39

Re: [SOLVED] NVMe SSD unresponsive when much data is written

Yes, I know about that bug. I already updated, as seen in the system information I posted. Thanks for the information anyway!

EDIT: Yes, luckily I haven't lost any data!

Last edited by jkhsjdhjs (2019-05-27 21:39:40)


#9 2019-06-15 00:27:04

jkhsjdhjs
Member
Registered: 2017-09-05
Posts: 39

Re: [SOLVED] NVMe SSD unresponsive when much data is written

I did some additional tests to further narrow down the root cause of this issue:
- I erased the NVMe SSD and created a normal ext4 partition without encryption. The issue still persisted, write speed dropped to ~300 MB/s after a few seconds.
- I switched the NVMe SSD with the other NVMe SSD on the mainboard (one SSD is in an M.2 slot directly on the board, the other one in a PCIe x4 to M.2 adapter card). Same results, 970 EVO fast, 960 EVO slow.

Because of these results I concluded that it's an issue with the 960 EVO, not with the mainboard chipset or anything else.
Thus I thought it may be an issue with the SSD firmware, so I downloaded a firmware update from this page and updated the firmware from 2B7QCXE7 to 3B7QCXE7.
However, this still didn't improve performance.

Also, there are 7 new entries in the NVMe Error Log (they were already present before the firmware upgrade):

Error Information (NVMe Log 0x01, max 64 entries)
Num   ErrCount  SQId   CmdId  Status  PELoc          LBA  NSID    VS
  0        923     0  0x0018  0x4004  0x02c            0     0     -
  1        922     0  0x0017  0x4004  0x02c            0     0     -
  2        921     0  0x0018  0x4004  0x02c            0     0     -
  3        920     0  0x0017  0x4004  0x02c            0     0     -
  4        919     0  0x004e  0x4212  0x028            0     -     -
  5        918     0  0x0018  0x4004  0x02c            0     0     -
  6        917     0  0x0017  0x4004  0x02c            0     0     -

I'm now going to contact Samsung support and see if they can help me (or replace the SSD, as it's still under warranty).

EDIT:
Samsung replied and asked me to do benchmark the SSD on windows using Samsung Magician. To my surprise the performance was good, the SSD achieved speeds of up to 3272 MB/s / 1,554 MB/s read / write according to Magician. Because I didn't believe these results at first I also did a benchmark with Crystal Disk Mark, which resulted in 3275.1 MB/s / 681.1 MB/s sequential read/write with 2 runs and 16GiB. The write performance is significantly lower than the result Samsung Magician showed, but still twice as fast as with linux. With 1x 32GiB the performance was 1972 MB/s / 684 MB/s sequential read/write.

Thus the poor performance of the 960 EVO seems to be a compatibility issue with Linux, one that has apparently been fixed in Samsung's next generation, the 970 EVO.

I also stumbled across the following thread, in which the same problem I'm experiencing is described: https://bbs.archlinux.org/viewtopic.php?id=221916

I just got a Samsung 960 Evo and the write performance quickly drops from 2 GB/sec to around 380 MB/sec and stays there. Read performance is fine and sustains at about 2 GB/sec.

However, apparently no solution has been found.


I will share these results with Samsung. I hope that either the nvme kernel driver is fixed or Samsung releases a firmware update for the 960 EVO series, but I highly doubt that will happen.
Also, I will remove LUKS/LVM from the topic title, as it is not related to the issue.

EDIT: Samsung support couldn't help me; I ended up selling the SSD, as arguing with their support became too time-consuming.

EDIT2: The SSD has a 13GB cache. The values in the specification are the read and write speeds of this cache, not of the chips that actually store the data permanently. Those "storage chips" are only capable of writing at ~300 MB/s, which explains all the benchmark results I got. See https://topnewreview.com/samsung-960-evo-250gb/
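That cache size is also consistent with the timing observed earlier: assuming the ~2 GB/s burst speed from the thread quoted above and the 13GB cache figure from the review, the cache is exhausted after only a few seconds, after which writes fall back to the ~300 MB/s native rate:

```shell
# Seconds until a 13 GB SLC-style cache fills at a 2 GB/s burst write rate.
awk 'BEGIN { printf "%.1f s\n", 13 / 2 }'
```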

Last edited by jkhsjdhjs (2019-09-18 17:35:52)

