Poor BTRFS Read Performance

TheDukeofErl · 2021-12-04 17:37:26

I have some read performance issues that seem to be mostly exclusive to BTRFS. I've done some digging and am not sure there's much that can be done to improve the performance here. I recently got a Samsung 980 Pro and, being lazy, just used Clonezilla to migrate my current Arch install from an SN750 over to that disk. Some light benchmarking with KDiskMark showed only a small performance increase: I was getting about 1.5 GB/s, compared to the advertised 7 GB/s. I don't expect 7 GB/s to ever really be seen except on strange, bursty reads, but the gap was large enough that I decided to do some digging. My initial thought was that the issue was with LUKS2, whose default block (sector) size is 512 bytes. Although the Samsung SSDs report a block size of 512 (and do not allow an NVMe format to change it to 4096), this seems not to be the case: the disk reads and writes 4096-byte blocks under the hood.
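For anyone wanting to check what the kernel reports for a drive, the sysfs queue attributes can be read directly. This is a minimal sketch, assuming NVMe namespaces named like nvme0n1; the function name report_block_sizes and its optional root argument are made up for illustration. A drive that reports 512 for both values, as described above, gives no hint here, so the LUKS sector size would still have to be chosen by hand (cryptsetup luksFormat accepts --sector-size for LUKS2).

```shell
#!/bin/sh
# Print the logical and physical block sizes the kernel reports for each
# NVMe namespace. The optional argument (a sysfs root) is an illustrative
# addition so the function can be pointed at a test directory; it defaults
# to the real /sys/block.
report_block_sizes() {
    root=${1:-/sys/block}
    for dev in "$root"/nvme*n*; do
        # Skip the unexpanded glob when no NVMe device is present.
        [ -r "$dev/queue/logical_block_size" ] || continue
        printf '%s logical=%s physical=%s\n' \
            "${dev##*/}" \
            "$(cat "$dev/queue/logical_block_size")" \
            "$(cat "$dev/queue/physical_block_size")"
    done
}

report_block_sizes
```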

Moving forward, I decided to go back to my old disk and format the 980 Pro to do some testing. I created a single partition on the 980 Pro using GParted. For the performance testing, I used a couple of different tools. When using KDiskMark, I ran SEQ1M Q8T1 with the default 1 GiB test size. For each run I recreated the filesystem as btrfs or ext4 using the default mkfs.<fs> commands, only changing the blocksize as listed and adding the label "testing" to the partition. For mounting, I simply allowed gnome-disks to use the default mount options unless listed otherwise. It's also worth noting that I have the 980 Pro in the M.2 slot that is directly wired to the CPU.
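The filesystem setup described above could look roughly like the following. This is only a sketch: the device path /dev/nvme1n1p1 and the helper name mkfs_cmd are placeholders, and it assumes the "blocksize" was set with -b for ext4 and -s (sector size) for btrfs. The helper prints the commands instead of running them, since mkfs destroys existing data (mkfs.btrfs additionally needs -f to overwrite an existing filesystem).

```shell
#!/bin/sh
# Print (do not run) the mkfs command for a filesystem, block size, and device.
# The label "testing" matches the one mentioned in the post.
mkfs_cmd() {
    fs=$1 bs=$2 dev=$3
    case $fs in
        ext4)  echo "mkfs.ext4 -b $bs -L testing $dev" ;;
        btrfs) echo "mkfs.btrfs -s $bs -L testing $dev" ;;
        *)     echo "unsupported filesystem: $fs" >&2; return 1 ;;
    esac
}

# /dev/nvme1n1p1 is a hypothetical device path.
mkfs_cmd ext4 4096 /dev/nvme1n1p1
mkfs_cmd btrfs 4096 /dev/nvme1n1p1
```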

I have received a few suggestions, and after the LUKS portion of the table I began running fstrim or blkdiscard between tests to rule out performance issues caused by untrimmed blocks. I also added some tests using fio itself. I'll include the fio script at the bottom of this post.
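The trim and scheduler settings that show up in the "Other" column can be checked or applied along these lines. This is a sketch with placeholder names: nvme0n1, nvme0n1p1, and /mnt/testing are not taken from the post, and active_scheduler is a made-up helper that extracts the bracketed entry from the sysfs scheduler file.

```shell
#!/bin/sh
# Print the active I/O scheduler; the kernel marks it with [brackets]
# in /sys/block/<dev>/queue/scheduler, e.g. "[none] mq-deadline kyber bfq".
active_scheduler() {
    sed -n 's/.*\[\(.*\)\].*/\1/p' "$1"
}

# Example usage (device and mount point are placeholders):
#   active_scheduler /sys/block/nvme0n1/queue/scheduler
#   echo none | sudo tee /sys/block/nvme0n1/queue/scheduler   # select "none"
#   sudo fstrim -v /mnt/testing         # trim free space of a mounted filesystem
#   sudo blkdiscard /dev/nvme0n1p1      # discards the whole partition: DESTROYS DATA
```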

Crypto  Blocksize (cr)  Filesystem  Blocksize (fs)  Kernel     Tool             Read (MB/s)  Write (MB/s)  Options (cr)                                    Options (fs)                                   Other
-       -               ext4        4096            linux-zen  KDiskMark        6545.66      4480.99       -                                               -                                              -
-       -               ext4        1024            linux-zen  KDiskMark        5937.62      4295.66       -                                               -                                              -
-       -               btrfs       4096            linux-zen  KDiskMark        1782.14      4459.77       -                                               -                                              -
-       -               btrfs       4096            linux-zen  KDiskMark        1710.47      4047.47       -                                               rw,noatime,ssd,space_cache=v2                  -
-       -               btrfs       4096            linux-zen  KDiskMark        1876.87      4375.13       -                                               rw,noatime,ssd,space_cache=v2,nodatacow        -
-       -               btrfs       4096            linux-zen  KDiskMark        1850.86      4208.55       -                                               rw,noatime,ssd,space_cache=v2,nodatasum        -
luks2   512             ext4        4096            linux-zen  KDiskMark        3050.50      2638.06       -                                               -                                              -
luks2   512             ext4        1024            linux-zen  KDiskMark        3114.68      2664.32       -                                               -                                              -
luks2   512             btrfs       4096            linux-zen  KDiskMark        1465.74      2347.44       -                                               -                                              -
luks2   4096            ext4        4096            linux-zen  KDiskMark        6172.50      4176.87       -                                               -                                              -
luks2   4096            btrfs       4096            linux-zen  KDiskMark        1221.27      2830.69       -                                               -                                              -
luks2   4096            ext4        4096            linux-zen  KDiskMark        2961.06      2138.33       perf-no_read_workqueue,perf-no_write_workqueue  -                                              -
luks2   4096            btrfs       4096            linux-zen  KDiskMark        1178.82      3196.28       -                                               -                                              scheduler: none
-       -               btrfs       4096            linux-zen  KDiskMark        1720.20      4255.63       -                                               -                                              nvme format, blkdiscard, scheduler: none
-       -               btrfs       4096            linux-zen  KDiskMark        1727.14      4345.20       -                                               rw,noatime,ssd,space_cache=v2,compress=zstd:1  scheduler: none, fstrim
-       -               btrfs       4096            linux-zen  fio read_uring   1978.00      -             -                                               -                                              fstrim
-       -               btrfs       4096            linux-zen  fio read_libaio  1756.00      -             -                                               -                                              -
-       -               btrfs       4096            linux-zen  fio read_uring   2010.00      -             -                                               rw,noatime,ssd,space_cache=v2,nodatacow        deleted prior fio test file, fstrim
-       -               btrfs       4096            linux-zen  fio read_libaio  1921.00      -             -                                               rw,noatime,ssd,space_cache=v2,nodatacow        -
-       -               ext4        4096            linux      KDiskMark        6394.04      4308.89       -                                               -                                              fstrim
-       -               btrfs       4096            linux      KDiskMark        1637.08      4184.36       -                                               -                                              -
-       -               btrfs       4096            linux      KDiskMark        1651.09      4169.45       -                                               rw,noatime,ssd,space_cache=v2                  fstrim
-       -               btrfs       4096            linux      KDiskMark        1811.89      4513.29       -                                               rw,noatime,ssd,space_cache=v2,nodatacow        fstrim

Based on the test results, my assumption about the 512 vs. 4096 blocksize for LUKS seems to have been correct: with ext4, there is a huge boost in performance when using the 4096 blocksize. That said, the BTRFS results remain quite disappointing. At this point I am leaning towards switching to LVM+ext4, but I currently enjoy btrfs features like snapshotting, CoW, and bitrot protection. I'm pretty much grasping at straws here. I half expected the problem to go away after turning off CoW or checksumming, but strangely that hasn't made a difference. The write performance is also befuddling to me, since writes come out well ahead of reads. I use ZFS raidz on other systems, so I'm used to seeing high write performance alongside the read performance of a single disk, but there is an explanation there that makes sense to me.
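To put numbers on those comparisons, the read figures from the table can be divided directly. The helper name ratio is just for illustration; the inputs are the KDiskMark read results for linux-zen listed above.

```shell
#!/bin/sh
# ratio A B: print A / B rounded to two decimals.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

ratio 6172.50 3050.50   # ext4 read: LUKS2 4096 vs. LUKS2 512
ratio 1782.14 6545.66   # unencrypted read: btrfs vs. ext4
ratio 1221.27 6172.50   # LUKS2 4096 read: btrfs vs. ext4
```

In other words, the 4096 blocksize roughly doubles ext4 read throughput under LUKS2, while btrfs reads land at roughly a fifth to a quarter of the ext4 figure in the same configuration.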

I also did some brief testing with dd, though the accuracy of dd compared to fio is somewhat disputed. For what it's worth, looking only at read performance, I got 2.4 GB/s with CoW disabled (both with and without sync) and, with CoW enabled, 2.3 GB/s without sync and 2.2 GB/s with sync. The command I used was

dd if=fio_test_file of=/tmp/random_test bs=1M count=1024

or

dd if=fio_test_file of=/tmp/random_test bs=1M count=1024 oflag=dsync

ext4 consistently clocked in at 2.0 GB/s, regardless of the sync flag.
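One possible reason the dd numbers differ from fio is that the commands above also time the write to /tmp, and the reads may be served from the page cache. A variant that isolates the read side might look like this sketch; read_test is a made-up helper name, and iflag=direct assumes GNU dd and a filesystem that supports O_DIRECT.

```shell
#!/bin/sh
# read_test FILE [extra dd operands...]: read FILE sequentially in 1 MiB
# blocks and discard the data, so only the read side is timed.
# dd prints its throughput summary on stderr.
read_test() {
    file=$1
    shift
    dd if="$file" of=/dev/null bs=1M "$@"
}

# Example usage, bypassing the page cache with O_DIRECT:
#   read_test fio_test_file count=1024 iflag=direct
```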

I'm using AMD hardware. This may be relevant in this case but I do not have enough data points to be entirely sure. After asking on Reddit and talking to a couple friends, it seems that for the three people using AMD hardware there are slowdowns on reads. The one person who was using Intel hardware reported that they got the 3 GB/s reads and writes expected of their PCIe 3.0 NVMe SSD. I'm hoping that this isn't a strange issue regarding AMD vs Intel PCIe implementation or something horrific like that, but if anyone has the bandwidth to test, I would greatly appreciate it if they could.

Does anyone have any ideas for further troubleshooting or any ideas for solutions? Thanks in advance.

Kernel: 5.15.5 (various)
SSD: Samsung 980 Pro
CPU: AMD Ryzen 7 3800X
Host: X570 AORUS ELITE WIFI -CF

fio benchmarking script

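[read_uring]
; location of the test file
directory=/foo/bar/testing/
filename=fio_test_file

; bypass the page cache (buffered=0 is the same setting as direct=1)
direct=1
buffered=0
size=1g

; wait 3 s, warm up for 3 s, then measure for a fixed 5 s
startdelay=3
ramp_time=3
runtime=5
time_based

ioengine=io_uring
force_async=4

; sequential reads, 1 MiB blocks, queue depth 8
rw=read
bs=1m
iodepth=8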

For testing with libaio, I removed the force_async line and changed the ioengine to libaio.
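The libaio variant can be derived from the io_uring job mechanically. This is a sketch under assumptions: the file names read_uring.fio and read_libaio.fio and the helper name make_libaio_job are placeholders, and the job is renamed to read_libaio to match the label used in the table.

```shell
#!/bin/sh
# make_libaio_job SRC DST: copy the io_uring job file, dropping the
# force_async line and switching the I/O engine to libaio.
make_libaio_job() {
    sed -e '/^force_async=/d' \
        -e 's/^ioengine=.*/ioengine=libaio/' \
        -e 's/^\[read_uring\]$/[read_libaio]/' \
        "$1" > "$2"
}

# Example usage (file names are placeholders):
#   make_libaio_job read_uring.fio read_libaio.fio
#   fio read_libaio.fio
```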

