Pages: 1
Topic closed
My machine has two Seagate FireCuda 520 (ZP2000GM30002) NVMe SSDs and eight Western Digital Red SA500 SATA SSDs (WDS400T1R0A-68A4W0). The board is an ASRock X570 Creator (BIOS version 3.4) with an AMD Ryzen 3950X. The distro is an up-to-date Arch Linux.
All devices (NVMe as well as SATA) have LUKS2 containers on them, with discard enabled. Within / across the LUKS2 containers:
- the two NVMe SSDs contain a Btrfs filesystem (RAID1 metadata, RAID1 data);
- the eight SATA SSDs contain a Btrfs filesystem (RAID1C3 metadata, RAID6 data).
The I/O scheduler is set to none for the NVMe SSDs, mq-deadline for the SATA SSDs and none for all the “derived” LUKS2 /dev/mapper/… devices. These are the defaults; I haven’t customized any SSD-specific settings.
Both filesystems are mounted without -o discard, as generally recommended, and the fstrim timer runs weekly. There are no read/write performance problems. For example, a btrfs scrub on the NVMe RAID1 filesystem exceeds 5 GB/s. Notably, there is no big data churn with lots of deletions. Most weeks go by with less than 1 GB worth of deletions.
However, there is a huge difference in fstrim durations:
- It always takes 8+ hours on the two NVMe SSDs (which defies my understanding).
- It only takes seconds on the eight SATA SSDs (which is what I would expect).
During the long hours of fstrim on the NVMe SSDs, iostat -x usually looks like this:
Device r/s rkB/s rrqm/s %rrqm r_await rareq-sz w/s wkB/s wrqm/s %wrqm w_await wareq-sz d/s dkB/s drqm/s %drqm d_await dareq-sz f/s f_await aqu-sz %util
nvme0n1 0,00 0,00 0,00 0,00 0,00 0,00 1,00 16,00 0,00 0,00 0,00 16,00 5,00 592,00 0,00 0,00 101,20 118,40 0,00 0,00 0,51 52,30
nvme1n1 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 5,00 592,00 0,00 0,00 100,80 118,40 0,00 0,00 0,50 52,30
The discard throughput (the d… columns) is extremely slow: about 5 requests/s and 592 kB/s per device, with a d_await of roughly 100 ms. Utilization is always around 50% (whatever that value means).
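For a sense of scale, the dkB/s column can be turned into a rough time estimate. This is only a sketch: it assumes the snapshot above is representative of the whole run and that iostat's kB means 1024 bytes. `discard_hours` is a hypothetical helper name, not part of sysstat.

```shell
# Hours needed to discard <GiB> at <rate in kB/s> (iostat's dkB/s column).
# Assumption: 1 kB = 1024 bytes. A decimal comma (as in the output above)
# is accepted and converted to a dot before the arithmetic.
discard_hours() (
    rate=$(printf '%s' "$2" | tr ',' '.')
    LC_ALL=C awk -v gib="$1" -v rate="$rate" \
        'BEGIN { printf "%.1f\n", gib * 1048576 / rate / 3600 }'
)
```

With the numbers above, `discard_hours 16 592,00` prints 7.9, so at this rate roughly 16 GiB worth of discards would already fill an 8-hour run.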
What could be the problem with the NVMe SSDs?
- Flawed devices? Maybe; this “solved” (in fact given-up) thread suggests there might be something wrong with the Seagate FireCuda.
- Bad or failing devices? Probably not. S.M.A.R.T. looks almost the same on both, no errors are reported, and both are relatively new (the problem has been around since I bought them).
- Firmware? There are no firmware updates for this NVMe model on Seagate’s website.
- LUKS2? No. Another filesystem on LUKS2 works fine on the SATA SSDs.
- Filesystem type? No. A much more complex Btrfs setup works fine on the SATA SSDs.
The obvious question is: Why do I care? Well, because there are occasional (and pretty unpleasant) 1-2-second I/O lags during the entire 8+hour trimming ordeal every week. (Which just doesn’t feel right on a recent 32-CPU machine with 128 GB of RAM and PCIe 4 SSDs.)
So, how can one make a Seagate FireCuda 520 trim fast(er)? Any ideas? Should I try to report it somewhere (kernel, Seagate, whoknowswhat)?
Last edited by andrej.podzimek (2021-03-01 03:52:39)
Offline
This thread might be related. But people complain there about trims taking minutes (wouldn’t that be great?) rather than hours. So it could well be a different issue.
Offline
You can try to debug what fstrim actually does (with blktrace, looking for discard requests). There have been bugs in the past where trim requests got split up the wrong way, resulting in many tiny requests despite a large contiguous region.
fstrim also has a --minimum option. If you have many tiny files, it can ignore trimmable areas that are too small and take a very long time to process with little benefit...
You can also check which discard limits the kernel sees, e.g. `head /sys/block/*/queue/*discard*`
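The sysfs check can be wrapped so that each value is printed next to the device it belongs to. A minimal sketch: `show_discard_limits` is a made-up helper name, and the optional argument (an alternative sysfs root) exists only so the function can be tried against a copy of the tree.

```shell
# Print every discard-related queue attribute per block device,
# one "device/queue/attribute: value" line each.
# $1 is an optional sysfs root (defaults to /sys).
show_discard_limits() (
    root=${1:-/sys}
    for f in "$root"/block/*/queue/*discard*; do
        [ -r "$f" ] || continue            # skip unmatched globs
        printf '%s: %s\n' "${f#"$root"/block/}" "$(cat "$f")"
    done
)
```

Comparing the lines for a slow NVMe device with those for a SATA device that trims quickly would show whether the kernel advertises different granularity or maximum request sizes for the two.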
Offline
For me, it's the "trim taking minutes" problem from that thread you found; luckily it's not taking hours. It needs four minutes to complete on the NVMe drive and six seconds on a SATA drive. Both filesystems are btrfs and use the same mount options. Both are simple single-drive btrfs setups, using the 'single' profile for data and 'dup' for metadata. The drives are both Samsung. I am blaming btrfs for the problem.
I use the new "discard=async" mount option and have the fstrim timer disabled to side-step the issue.
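Whether the option is really in effect can be read from the mount table. The following is a sketch under assumptions: `has_async_discard` is a hypothetical helper, and its second argument (an alternative mounts file) is only there to allow a dry run.

```shell
# Succeed if <mountpoint> is mounted with discard=async.
# $2 is an optional mounts file (defaults to /proc/mounts).
has_async_discard() (
    mounts=${2:-/proc/mounts}
    awk -v m="$1" '
        $2 == m { opts = $4 }     # field 2: mount point, field 4: options
        END {
            if (opts ~ /(^|,)discard=async(,|$)/) exit 0
            exit 1
        }' "$mounts"
)
```

For example, `has_async_discard / && echo "async discard active"`. The timer itself can be switched off with `systemctl disable --now fstrim.timer`.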
I got my numbers like this:
$ sudo -v
$ time sudo fstrim -v /
/: 508.8 GiB (546319405056 bytes) trimmed
real 3m59.041s
...
$ sudo -v
$ time sudo fstrim -v /
/: 15.7 GiB (16893767680 bytes) trimmed
real 3m51.399s
...
There was an hour between the two runs of fstrim. It doesn't seem to matter how much area has to be trimmed. This is why I'm blaming btrfs. The second run being just as slow suggests the time is spent inside whatever btrfs does internally to find the ranges to queue up for trimming.
There are no other problems with the drive and btrfs. For example, the "btrfs scrub" test you mention seems to run at the full speed of PCIe 3.0.
I just tried creating a small ext4 filesystem on the drive and then ran fstrim on that empty, new filesystem. It completes fast:
$ time sudo fstrim -v .
.: 91.1 GiB (97850060800 bytes) trimmed
real 0m1.092s
...
I then got curious and created a btrfs filesystem in that same spot, like this:
mkfs.btrfs /dev/disk/by-partlabel/nvmetest -m dup -R free-space-tree -f
The free-space-tree and dup-metadata options are there to make it the same as my old btrfs filesystem.
Running fstrim on this empty filesystem is very fast, see here:
$ time sudo fstrim -v .
.: 91.2 GiB (97872281600 bytes) trimmed
real 0m0.104s
...
I guess this means the problem should have something to do with fragmentation in the old btrfs filesystem? But that can't explain why things are fine on my SATA drive's filesystem, see here:
$ time sudo fstrim -v /data
/data: 336.7 GiB (361535684608 bytes) trimmed
real 0m6.028s
...
$ time sudo fstrim -v /data
/data: 2.6 GiB (2780217344 bytes) trimmed
real 0m4.468s
...
Last edited by Ropid (2021-03-01 13:47:34)
Offline
[...]
fstrim also has a --minimum option, if you have many tiny files, it can ignore too small trimmable areas that take very long time to process with little benefit...
[...]
I tried this "--minimum" parameter and it works great! The time for fstrim drops to just one second when using 1 MB as the minimum size. The space difference between what's discarded compared to a normal, full fstrim run is also fine. The lost, non-trimmed space is about 1% of the drive's size.
Here's a run after a reboot with a 1 MB minimum size:
$ sudo -v; time sudo fstrim --minimum 1048576 -v /
/: 497.6 GiB (534296932352 bytes) trimmed
real 0m1.235s
user 0m0.005s
sys 0m0.062s
Here's a second run:
$ sudo -v; time sudo fstrim --minimum 1048576 -v /
/: 4.6 GiB (4901683200 bytes) trimmed
real 0m0.947s
user 0m0.006s
sys 0m0.068s
If I repeat this, that "4.6 G trimmed" message stays the same.
EDIT:
I tested minimum size numbers between 1M and 1K.
Here's a table with the results:
--minimum       | time       | trimmed
----------------+------------+----------
1048576 (  1M)  |   0.9 sec  |  4.6 GiB
 524288 (512K)  |   2.8 sec  |  5.8 GiB
 262144 (256K)  |   8.0 sec  |  7.6 GiB
 131072 (128K)  |  19.4 sec  |  9.6 GiB
  65536 ( 64K)  |  49.4 sec  | 12.2 GiB
  32768 ( 32K)  |  93.5 sec  | 14.1 GiB
  16384 ( 16K)  | 153.0 sec  | 15.2 GiB
   8192 (  8K)  | 189.1 sec  | 15.5 GiB
   4096 (  4K)  | 233.2 sec  | 15.7 GiB
   2048 (  2K)  | 229.6 sec  | 15.7 GiB
   1024 (  1K)  | 229.0 sec  | 15.7 GiB
Here's the command line I ran and the raw results:
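The exact command is not shown here. A sweep of this kind could be scripted roughly as follows; this is a sketch under assumptions, not the command that produced the table. `sweep_minimum` and the `FSTRIM` override are hypothetical names, and the timing relies on GNU date's `%N`.

```shell
# Run fstrim with --minimum halving from 1 MiB down to 1 KiB and
# print the wall-clock time of each run. Needs root for real use.
# FSTRIM can be pointed at a stub (e.g. "true") for a dry run.
FSTRIM=${FSTRIM:-fstrim}

sweep_minimum() (
    mnt=$1
    min=1048576
    while [ "$min" -ge 1024 ]; do
        start=$(date +%s.%N)
        "$FSTRIM" --minimum "$min" -v "$mnt"
        end=$(date +%s.%N)
        LC_ALL=C awk -v m="$min" -v s="$start" -v e="$end" \
            'BEGIN { printf "minimum=%d elapsed=%.1fs\n", m, e - s }'
        min=$((min / 2))
    done
)
```

Called as `sweep_minimum /`, it makes eleven runs, matching the eleven rows of the table. Since the first run after a reboot trims far more than later ones, one plain fstrim beforehand keeps the rows comparable.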
Last edited by Ropid (2021-03-01 23:06:56)
Offline
Found this thread via duckduckgo. I have a very similar setup and the same hardware (btrfs, luks2 with discard enabled, FireCuda 520). Fstrim also takes hours (maybe 8 or so?)
Really curious what's going wrong here.
Anyone found a reason or a solution for this yet?
Offline
The only thing I’ve noticed is that it (unsurprisingly) depends on the number of freed blocks. It takes maybe 3 hours after a week of normal operation and easily more than 12 hours when a major cleanup was done during that week (e.g. 800 GB freed).
The Seagate website doesn’t show any firmware updates for this SSD at the moment, so I have no clue what else could be wrong.
Offline
I'm guessing it's something about btrfs, it's not the drive. I think I remember running "blkdiscard" on the drive once and it completed fast.
A work-around is to use a "--minimum 1M" argument for fstrim. You won't miss a lot of space by doing that, it's just a handful GB or so difference.
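To make the weekly timer use the same work-around, the service's command line can be overridden with a systemd drop-in. A hedged sketch: `write_fstrim_dropin` is a hypothetical helper, and the ExecStart line is an assumption. The packaged unit may use different flags, so compare with `systemctl cat fstrim.service` first and carry over whatever it uses.

```shell
# Write a drop-in that adds --minimum 1M to the fstrim.service command.
# $1 is the drop-in directory (defaults to the system-wide location).
write_fstrim_dropin() (
    dir=${1:-/etc/systemd/system/fstrim.service.d}
    mkdir -p "$dir" || exit 1
    cat > "$dir/minimum.conf" <<'EOF'
[Service]
# The empty ExecStart= clears the packaged command before replacing it.
ExecStart=
# Assumed command line; adjust to match the packaged unit plus --minimum.
ExecStart=/usr/bin/fstrim --fstab --verbose --minimum 1M
EOF
)
```

After running it as root, `systemctl daemon-reload` makes systemd pick up the change.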
Offline
I do not think it is btrfs. My laptop with some Samsung NVMe on btrfs/luks only takes 10 seconds tops. My Firecuda 520 with just btrfs, no luks, is taking forever. Same size drive.
I also have two 2TB Firecuda 520s in an mdraid raid0 configuration formatted with ext4; fstrim is taking an abnormal amount of time there as well.
Offline
Firecuda 520 with just btrfs, no luks, is taking forever.
Seems a known issue with this drive. Try updating the firmware: https://www.spinics.net/lists/dm-devel/msg46320.html
--
saint_abroad
Offline
Lyndeno wrote:Firecuda 520 with just btrfs, no luks, is taking forever.
Seems a known issue with this drive. Try updating the firmware: https://www.spinics.net/lists/dm-devel/msg46320.html
Thanks, I didn't see that when browsing that thread. My drives are on STNSC014 (the old version). Unfortunately when I input the serial numbers for the three of them onto the Seagate website it says there is no new firmware available. Perhaps there is somewhere else to look?
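The running firmware revision can be read without vendor tools, since the kernel exposes it in sysfs for each NVMe controller. A minimal sketch: `list_nvme_firmware` is a made-up helper name, and the optional argument (an alternative sysfs root) is only there for dry runs.

```shell
# List model and firmware revision of every NVMe controller.
# $1 is an optional sysfs root (defaults to /sys).
list_nvme_firmware() (
    root=${1:-/sys}
    for c in "$root"/class/nvme/nvme*; do
        [ -r "$c/firmware_rev" ] || continue
        # read trims the space padding the kernel adds to these fields
        read -r model < "$c/model"
        read -r fw < "$c/firmware_rev"
        printf '%s: %s (firmware %s)\n' "${c##*/}" "$model" "$fw"
    done
)
```

A drive still on the old version would show up with firmware STNSC014 here.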
Offline
Lyndeno wrote:Firecuda 520 with just btrfs, no luks, is taking forever.
Seems a known issue with this drive. Try updating the firmware: https://www.spinics.net/lists/dm-devel/msg46320.html
Unfortunately, there is (still) no firmware update available. Although the post says Seagate has provided updated firmware version STNSC016, the problem is that provided [to one particular user] != published. It looks like Seagate has never published the new version on its download site. Searching by my devices’ serial numbers has never found anything.
A new(er) firmware version quite provably exists though, because there are reports (on various hardware benchmarking sites) of a few machines that have it installed. I’ve contacted Seagate support and asked for the firmware; let’s see how that goes…
Meanwhile, the trim time has grown from ~3 hours to ~3 days and occasional freezes have grown from ~5 seconds to ~5 minutes (causing failures/timeouts in a ton of software). Guess which storage brand I’m going to avoid in the upcoming decade!
Offline
This has now reached a point where a weekly fstrim takes more than a week. The SSDs are basically unusable. As already discussed above, this is not a Btrfs / Linux problem, because all other SSDs work and trim just fine.
A new(er) firmware version quite provably exists though, because there are reports (on various hardware benchmarking sites) of a few machines that have it installed. I’ve contacted Seagate support and asked for the firmware; let’s see how that goes…
So I had that^ conversation with Seagate support. After a shockingly frustrating experience with them, I only have one recommendation to give:
First they asked me to run SeaTools and/or SeaChest on a Windows machine.
Both SeaTools and SeaChest have Linux versions and run just fine under ArchLinux. What sense would it make to use Windows versions instead?
No, I do not have a Windows machine. No, I will not have a Windows machine. No, I will never connect SSDs with my data to a Windows machine.
Next they claimed that they never (?!) provide firmware updates for consumer products, because firmware updates are only for enterprise customers — severe problems being the only exception.
If this is not a severe problem, then what is??? (Also, I paid a somewhat “enterprise-like” price for my “consumer” SSDs.)
Obviously they did provide the STNSC016 firmware image file to at least one person, so why not make it generally available to others?
Last but not least, adding insult to injury, Seagate told me to look for third-party options to address my issue. And reiterated their BS about an unsupported operating system.
This is not my issue; this is Seagate’s mess.
I asked back which third party can provide me with the Seagate firmware file that I need — and got no further response.
In summary, these have been my last Seagate SSDs. Seagate will never get my money again. Obviously I need to swap these SSDs for something that works; that’s the only “solution”.
Offline
The firmware has been re-uploaded to the cryptsetup issue page but ymmv (better than binning the drive though): https://gitlab.com/cryptsetup/cryptsetu … _645659810
Usual disclaimers about backups and downloading random firmwares from the internet.
Good luck.
--
saint_abroad
Offline
andrej.podzimek wrote:The firmware has been re-uploaded to the cryptsetup issue page but ymmv (better than binning the drive though): https://gitlab.com/cryptsetup/cryptsetu … _645659810
Usual disclaimers about backups and downloading random firmwares from the internet.
Good luck.
Wow! Thank you so much! I’m surprised that this never popped up in my search results while I was looking for the version number (because obviously it has been posted for quite a while).
That is probably the third-party option that Seagate was mentioning. They didn’t want to tell me directly “yo, go download what appears to be our firmware from this GitLab thread”, which is understandable, but then why wouldn’t they publish the firmware on their official download site?
Of course I’ll give this a try. It’s a (Btrfs) RAID1 and there are 2 extra layers of backups behind it, so I’m not all that concerned about bricking one SSD or both.
Anyway, Seagate should definitely know better.
Offline
I don't use Arch or Btrfs but was suffering from awful performance on a single 2TB Seagate FireCuda 520, on two different machines.
```
$ fio --name=520-14 --size=10G -group_reporting --time_based --runtime=300 --bs=4k --numjobs=64 --rw=randwrite --directory=/mnt/fio
520-14: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
...
fio-3.27
Starting 64 processes
Jobs: 64 (f=64): [w(64)][100.0%][w=31.4MiB/s][w=8043 IOPS][eta 00m:00s]
520-14: (groupid=0, jobs=64): err= 0: pid=6082: Fri Oct 22 20:11:43 2021
write: IOPS=7375, BW=28.8MiB/s (30.2MB/s)(8643MiB/300013msec); 0 zone resets
clat (usec): min=2, max=419651, avg=8674.67, stdev=15394.96
lat (usec): min=2, max=419651, avg=8674.90, stdev=15394.95
...
```
After applying STNSC016 from https://gitlab.com/cryptsetup/cryptsetu … _645659810 (SHA256 (FireCuda_520_E16_STNSC016.bin) = bb8a7e36a6257e510ce1e3d740d505a5081a8812d07e0d5f3169fada425983b0) and running an fstrim, performance became immediately and immensely better.
```
$ fio --name=520-16 --size=10G -group_reporting --time_based --runtime=300 --bs=4k --numjobs=64 --rw=randwrite --directory=/mnt/fio
520-16: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
...
fio-3.27
Starting 64 processes
Jobs: 64 (f=64): [w(64)][100.0%][w=40.4MiB/s][w=10.3k IOPS][eta 00m:00s]
520-16: (groupid=0, jobs=64): err= 0: pid=24008: Fri Oct 22 20:53:24 2021
write: IOPS=15.5k, BW=60.4MiB/s (63.4MB/s)(17.7GiB/300006msec); 0 zone resets
clat (usec): min=2, max=266042, avg=4132.98, stdev=5499.88
lat (usec): min=2, max=266042, avg=4133.43, stdev=5500.09
...
```
Double the write IOPS, double the bandwidth, and much more importantly half the average latency and one third the standard deviation.
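These ratios can be checked against the two fio summaries above. A small sketch; `ratio` is just an illustrative helper name.

```shell
# Print a/b with two decimals, independent of the locale's decimal mark.
ratio() (
    LC_ALL=C awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
)

# after / before, taken from the STNSC016 and STNSC014 runs
ratio 15500 7375          # write IOPS
ratio 60.4 28.8           # bandwidth in MiB/s
ratio 4132.98 8674.67     # average completion latency in usec
ratio 5499.88 15394.96    # latency standard deviation in usec
```

This prints 2.10, 2.10, 0.48 and 0.36, which agrees with "double", "half" and "one third".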
Over the following ~24 hours, performance improved by another ~50% (higher bandwidth, lower latency).
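Since the image comes from an unofficial upload, comparing it against the SHA256 quoted above before flashing seems prudent. A minimal sketch, with `check_sha256` as a hypothetical helper name:

```shell
# Succeed only if <file> has the SHA256 digest <expected>.
check_sha256() (
    actual=$(sha256sum "$1" | awk '{ print $1 }')
    [ -n "$actual" ] && [ "$actual" = "$2" ]
)
```

For example: `check_sha256 FireCuda_520_E16_STNSC016.bin bb8a7e36a6257e510ce1e3d740d505a5081a8812d07e0d5f3169fada425983b0 && echo OK`.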
Offline
Seagate seems to have finally officially released this firmware version, as inputting the serial number on their website now gives me results. I have updated the firmware as per the link provided by gaima and have noticed greatly improved fstrim performance: fstrim now takes only a few seconds. I haven't noticed a huge improvement in bandwidth, but I am using btrfs compression (zstd:3, not forced), so perhaps that is the reason. I still get around 2 GB/s read/write, so I'm not complaining.
Offline
Fixed the issue by manually updating its firmware, see https://github.com/htop-dev/htop/issues … 1697979762 for details
Unfortunately, it isn't available for fwupd
Offline
This thread is two years old and the OP has not been back since April, so I am going to consider this thread abandoned and close it now.
Offline