#1 2019-11-06 16:00:15

Wild Penguin
Member
Registered: 2015-03-19
Posts: 319

bcache and very poor (nonexistent) read caching performance

Hi,

I seem to be having problems getting bcache to work properly: read caching performance seems very bad.

The problem is that no matter what I've tried, read performance is as if the cache were not there. Things I've tried include:

  • aligning with -w 4k and --bucket 2M (partly guesswork, as it is difficult to say what the erase block size of the SSD is),

  • with and without discard,

  • decreasing sequential_cutoff to 0,

  • setting the congestion thresholds to 0 (in /sys/fs/bcache/...; see the commands sketched after this list),

  • changing the cache mode (writeback -> writethrough, and IIRC also writearound; it should work regardless), and

  • most recently, deleting all partitions on the SSD and making the whole 500GB device the caching device.
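
(For the record, those sysfs knobs were set roughly like this - bcache0 and the <cset-uuid> placeholder need to be adjusted to the actual setup:)

# echo 0 > /sys/block/bcache0/bcache/sequential_cutoff
# echo 0 > /sys/fs/bcache/<cset-uuid>/congested_read_threshold_us
# echo 0 > /sys/fs/bcache/<cset-uuid>/congested_write_threshold_us
# echo writearound > /sys/block/bcache0/bcache/cache_mode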

Deleting the partitions was done in the hope that this is an alignment issue. This time I didn't touch the default bucket and block sizes (there should be a decent performance increase even with non-optimal alignment - it mainly matters for reducing flash wear, unless I'm mistaken). However, performance is still atrociously bad: for reads, the SSD makes no difference whatsoever compared to it being absent (I don't care much about writes). Summary of the setup:

  • kernel-5.3.8 (and various older ones, but recent-ish kernels)

  • caching device: Samsung 960EVO, whole device as the single cache device (nvme0n1)

  • The NVMe SSD has been tried both on the motherboard (Asus Maximus VII Gene) and on a separate adapter card in a PCIe 3.0 slot (no difference which one is used)

  • backing device: many 5400RPM HDDs and partitions (the main one is shown below as /dev/sdc2). Formatted as ext4.

  • The current set (backing and cache) was created with default options, except that discard was enabled (make-bcache --discard -C ; make-bcache -B - the full commands are sketched below).
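
For reference, creation went roughly like the following (device names as elsewhere in this post; the register/attach steps are only needed if udev doesn't handle them automatically, and the UUID is the cset.uuid shown below):

$ sudo make-bcache --discard -C /dev/nvme0n1
$ sudo make-bcache -B /dev/sdc2
# echo /dev/sdc2 > /sys/fs/bcache/register
# echo d6420bd9-a45f-4688-a9c0-217c88072449 > /sys/block/bcache0/bcache/attach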

bcache-super-show listings:

$ sudo bcache-super-show /dev/sdc2
sb.magic                ok
sb.first_sector         8 [match]
sb.csum                 FC6434BAB97C1B37 [match]
sb.version              1 [backing device]

dev.label               (empty)
dev.uuid                063814f0-a14b-4db5-9cd5-b98ef658993f
dev.sectors_per_block   1
dev.sectors_per_bucket  1024
dev.data.first_sector   16
dev.data.cache_mode     1 [writeback]
dev.data.cache_state    2 [dirty]

cset.uuid               d6420bd9-a45f-4688-a9c0-217c88072449

and:

$ sudo bcache-super-show /dev/nvme0n1 
sb.magic                ok
sb.first_sector         8 [match]
sb.csum                 70AE2DCA768AC61A [match]
sb.version              3 [cache device]

dev.label               (empty)
dev.uuid                7e099d6e-3426-49d8-bf55-1e79eacd59a4
dev.sectors_per_block   1
dev.sectors_per_bucket  1024
dev.cache.first_sector  1024
dev.cache.cache_sectors 976772096
dev.cache.total_sectors 976773120
dev.cache.ordered       yes
dev.cache.discard       yes
dev.cache.pos           0
dev.cache.replacement   0 [lru]

cset.uuid               d6420bd9-a45f-4688-a9c0-217c88072449

Curiously, there still seems to be a decent number of cache hits. Maybe the cache device is reading too slowly? Or some bug prevents the cached data from being used (and the HDD is read instead)? I can see from an LED (on the NVMe adapter card) that the SSD is trying to read (or write) data, but performance is just bad.

In stats_total:

$ grep -H . *
bypassed:288.8M
cache_bypass_hits:1359
cache_bypass_misses:32500
cache_hit_ratio:87
cache_hits:3186
cache_miss_collisions:1
cache_misses:436
cache_readaheads:0
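
(For what it's worth, cache_hit_ratio is presumably cache_hits / (cache_hits + cache_misses) = 3186 / (3186 + 436) ≈ 88%, which matches the 87 shown. The striking part is the bypass traffic: 32500 cache_bypass_misses against only 436 real cache_misses, i.e. the vast majority of reads never consult the cache at all.)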

Rough speed test (to see the interfaces are working as expected):

$ sudo hdparm -Tt --direct /dev/nvme0n1

/dev/nvme0n1:
 Timing O_DIRECT cached reads:   4988 MB in  2.00 seconds = 2495.19 MB/sec
 HDIO_DRIVE_CMD(identify) failed: Inappropriate ioctl for device
 Timing O_DIRECT disk reads: 5052 MB in  3.00 seconds = 1683.84 MB/sec

And for comparison:

$ sudo hdparm -Tt --direct /dev/sdc

/dev/sdc:
 Timing O_DIRECT cached reads:   988 MB in  2.00 seconds = 493.55 MB/sec
 Timing O_DIRECT disk reads: 552 MB in  3.00 seconds = 183.97 MB/sec

Any ideas are welcome!

Cheers!

p.s. Some more (not so relevant) background: This is a desktop computer (mainly a toy, but also for occasional serious stuff). I used to have a SATA SSD (Samsung 840 EVO, IIRC). It was replaced by an NVMe SSD, a Samsung 960 EVO (500GB). I know roughly what the performance should be with SSD read caching working: with the previous SATA SSD, boot from power off (after the Linux kernel had been loaded) took ~15 seconds until the desktop environment had settled, versus ~60+ seconds with a bare mechanical 5400RPM HDD (with KDE Plasma and some applications autostarting, including Firefox and tvheadend in the background). I still have some real-life benchmarks left from the days the caching used to work (loading StarCraft II, starting LibreOffice, starting Blender etc. with and without the data in bcache - I can provide these numbers in case someone is interested; depending on the application/test, loading times were cut to between 1/4 and 1/6 of the original). The only change in this setup is that the cache moved from SATA to NVMe, after which I've had these problems. IIRC I never got bcache read caching to work properly with the NVMe SSD, although it should be better than the previous SATA SSD!

EDIT: Moved stuff from my setup -> "things I've tried". Also noted I'm using ext4.

Last edited by Wild Penguin (2020-03-07 20:03:00)

#2 2019-11-12 05:53:29

digitus
Member
Registered: 2016-06-02
Posts: 2

Re: bcache and very poor (nonexistent) read caching performance

Hi,

I have had similar problems and am still diagnosing, but have you checked the latency of reads/writes to your caching device?

I am seeing quite high latencies to my NVMe (50-200ms), and the congestion controls in bcache seem to be kicking in. I am experimenting with turning congestion control off, and this seems to be having positive results, for me at least.

Also, as a side note: I keep discard off on both bcache and the filesystem. I then run fstrim (as a systemd timer, sketched below) with '-m 1M', in a (maybe vain) attempt to keep free-space fragmentation and write amplification on the NVMe down.
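
The units are nothing fancy - roughly like this (unit and file names are my own):

# /etc/systemd/system/fstrim-1m.service
[Unit]
Description=Trim free space in 1M+ extents

[Service]
Type=oneshot
ExecStart=/usr/bin/fstrim -m 1M -av

# /etc/systemd/system/fstrim-1m.timer
[Unit]
Description=Run fstrim-1m weekly

[Timer]
OnCalendar=weekly
Persistent=true

[Install]
WantedBy=timers.target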

#3 2019-11-12 16:37:19

Wild Penguin
Member
Registered: 2015-03-19
Posts: 319

Re: bcache and very poor (nonexistent) read caching performance

Hi digitus, thanks for your reply!

I'm not sure how to test latency. Actually, I'm not aware of any good disk benchmarks for Linux. I know of fio and hdparm -T/-t, but I'm not sure how to use fio (maybe there is a 'fio for Dummies' guide somewhere? I'm trying to study it right now). The hdparm -t test is too simple to be useful for most situations (I guess it is useful only as a rough sequential read speed test, to check that the link/HDD is working at all as it should).
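
(I'm still reading fio's manual, but for a random-read test something like the following looks about right - the file path and sizes are just placeholders, and I may well be missing important options:)

$ fio --name=randread --filename=/path/on/bcache/testfile --size=1G \
      --rw=randread --bs=4k --direct=1 --ioengine=libaio --iodepth=32 \
      --runtime=30 --time_based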

Also, as stated in my OP, I have turned off congestion control (set congested_write_threshold_us and congested_read_threshold_us to 0), and I have also tried with discard off (for a prolonged time - a few months, actually). However, I just realized I'm not 100% sure I tested without "discard" in fstab for the ext4 volumes (only for the bcache cache device). AFAIK it should not make a big difference, since during normal (and repeated) boots there should be little deleting taking place; besides, on a volume not backed by an SSD, the discard option should not do anything.

None of these had any (discernible) effect.

What I did after my post: I've (hopefully temporarily) moved (well, copied) root to the SSD (onto a thinly provisioned volume on top of bcache) to actually get some benefit from having it in the computer in the first place. Now boot-up (to a functional desktop) takes <10 seconds. I'm hoping I can figure out why hot-data caching isn't working and then move root back to a mechanical HDD - the alternatives being to make an actual ext4 partition on the SSD (not one sitting on top of bcache), or to use some other SSD caching method, such as LVM caching.

FWIW, I've also posted on the bcache mailing list.

Please let me know in case you find something else!

(p.s. I've been under the assumption that with modern SSDs and kernels, enabling discard should be fine, and actually preferable: it reduces the need for overprovisioning, the kernel and SSDs don't take a performance hit from enabling it, and subsequently all free space is available for writes without an erase cycle first, which should actually increase performance. But that's getting a bit off-topic, and disabling discard is something I've already tried. I'll see if disabling discard for the ext4 fs makes any difference.)

Last edited by Wild Penguin (2019-11-12 16:44:11)

#4 2019-11-12 17:38:42

Wild Penguin
Member
Registered: 2015-03-19
Posts: 319

Re: bcache and very poor (nonexistent) read caching performance

Ok,

I moved root back to the mechanical HDD and disabled discard in the ext4 options; not surprisingly, there was no effect.

I also did some very crude tests with dd. With sequential_cutoff at 0, I created a 1GB file (as I'm not sure how to test non-sequential reads; the procedure is sketched right after the list below). I noticed this file will never be written to the cache unless the cache mode is set to writethrough or writeback at creation time. More specifically:

  • cache_mode = none -> create file -> all reads (tested after flushing the kernel page cache with echo 3 > /proc/sys/vm/drop_caches) run at ~100-150MB/s - even if cache_mode is changed after file creation!

  • cache_mode = writethrough or writeback -> create file -> all reads (cache flushed as above) run at ~600-700MB/s - again regardless of cache_mode changes after creation.
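
The procedure was roughly the following (the path is just an example):

$ dd if=/dev/zero of=/mnt/data/testfile bs=1M count=1024    # create the 1GB file
# echo 3 > /proc/sys/vm/drop_caches                         # flush the page cache
$ dd if=/mnt/data/testfile of=/dev/null bs=1M               # time the read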

So it seems that no data is put into the cache if it is only read (or sequential_cutoff is not working for reads); data only gets into the cache when it is written.

This just does not seem right, but it does explain the (lack of) performance I'm observing. I definitely care much more about hot data than about write caching (the latter being mostly useless for desktop usage, as data is read far more often than it is written)!

#5 2019-12-14 15:35:22

walmis
Member
From: Klaipeda, Lithuania
Registered: 2009-02-22
Posts: 13

Re: bcache and very poor (nonexistent) read caching performance

I've noticed the same problem. After rebooting with bcache enabled a couple of times, it feels like no cache is enabled at all. I noticed this when updating to Ubuntu 19.04 with kernel 5.0. Manually downgrading the kernel to 4.18 seems to fix bcache read caching for me. Kernel 5.3 is also a no-go, sadly.

#6 2019-12-15 18:55:22

Wild Penguin
Member
Registered: 2015-03-19
Posts: 319

Re: bcache and very poor (nonexistent) read caching performance

Walmis: do I understand correctly that you are using Ubuntu? Can you reproduce this on Arch Linux, too?

You may be interested in filing a bug in Ubuntu's bug tracker, or in looking up the thread on the bcache mailing list and perhaps participating there.

I still have no solution, nor any instructions or ideas on how to dig into this problem further. But if there are other affected users, maybe there will be more interest in looking into it. (As it has been silent on both this forum and the mailing list thread, I had deduced that either I'm the only one with this issue and it works for everyone else, or people are just installing directly onto their SSDs these days and few use bcache...)

FWIW, the last time I'm certain I had a working bcache setup was indeed back in the 4.x kernel series. I cannot downgrade to such an old kernel, however, since I depend on AMDGPU improvements made since then...

Last edited by Wild Penguin (2019-12-15 18:55:53)

#7 2020-01-21 22:04:46

przent
Member
Registered: 2020-01-21
Posts: 2

Re: bcache and very poor (nonexistent) read caching performance

Hi,

I have the same problem on Debian 11 (Testing) running a 5.4 kernel. Writes are cached, but most reads are not. I have a 1.4TB NVMe cache in front of a 14TB HDD RAID1 setup; the caching NVMes are also RAID1. I have a similar setup (smaller drives) on Debian 10 with a 4.19 kernel and no problems at all.

Did you find a solution or cause of the problem?

#8 2020-01-22 14:41:10

Wild Penguin
Member
Registered: 2015-03-19
Posts: 319

Re: bcache and very poor (nonexistent) read caching performance

Hi przent,

No solutions yet.

Based on these comments, I'm starting to think read caching is completely broken at the moment: there has yet to be a single report of a working read-caching setup on a recent-ish kernel, while several people report it broken. Perhaps (indeed) there are few desktop users using this, and other users are mainly interested in write caching (which is useful and important for many workloads - just not for typical desktop workloads).

The only thing even remotely "special" in my setup is that I've got a mergerfs folder in use (well, there's also snapraid, but that runs completely in userspace and doesn't touch any of the filesystems unless I tell it to). I'd be surprised if mergerfs made a difference, and even if it did, my rootfs and most of my user's home directory are not part of that mergerfs folder. (EDIT: This paragraph was messed up a bit, fixed.)

Last edited by Wild Penguin (2020-01-22 20:01:34)

#9 2020-01-23 08:45:52

przent
Member
Registered: 2020-01-21
Posts: 2

Re: bcache and very poor (nonexistent) read caching performance

Hi,

so I made some huge progress regarding the read caching issue.

First I found a workaround that pointed me in the right direction: completely disabling readahead on the *backing* device (not on /dev/bcache*).

echo 0 > /sys/block/dm-3/queue/read_ahead_kb
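
If the workaround helps you too, it should be possible to persist it with a udev rule along these lines - just a sketch, and since dm numbering is not stable across boots you should match on whatever your backing device really is:

# /etc/udev/rules.d/99-bcache-readahead.rules (file name is my own)
ACTION=="add|change", KERNEL=="dm-3", ATTR{queue/read_ahead_kb}="0"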

I used a tool called bwm-ng to monitor which drives get hit by writes and reads, and saw immediate writes to the NVMes.

To test the results I first had to drop RAM caches: echo 3 > /proc/sys/vm/drop_caches

Most reads (99%) were cached when sequential_cutoff was 0, so it finally worked as expected.

The problem was that reading from the backing device was now slow as hell (well, because I had disabled readahead), so the workaround was not perfect. But I now knew that the problem with bcache read caching is that it always skips the readahead part of a bio request. That made it easy to google:

https://marc.info/?l=linux-bcache&m=157812108913520&w=2

The change was introduced in kernel 4.15 and has been there ever since.

I compiled a Debian test kernel (5.4) with the patch from the mailing list, and the performance gains on a desktop are HUGE. It almost feels like I have a 14TB NVMe. I tested mostly by loading the X-Plane flight simulator, where the loading time is reduced from several minutes to under a minute, and by opening some huge enterprise Java applications in IntelliJ IDEA, which also used to take ages and are near-instant now.

As I said, it kind of works for me on Debian 10 with kernel 4.19, so maybe Debian patched it - or maybe the fact that on those machines I use bcache to host virtual machines makes the difference. Perhaps readahead issued through virtualization is not recognized as real readahead, so the caching is not skipped.

So let's hope the patch makes it into the mainline kernel very, very soon.

BR

#10 2020-01-23 17:31:45

Wild Penguin
Member
Registered: 2015-03-19
Posts: 319

Re: bcache and very poor (nonexistent) read caching performance

Hi,

Thanks przent for the tip!

I've noticed the discussion about the patch, as I'm lurking on the mailing list. I've had my suspicions, but as it is a bit technical, and as I'm just a somewhat-power-end-user rather than a coder/kernel developer, I'm not quite sure what is going on there and what is relevant, so I've not commented. This does tempt me to compile my own kernel and try out that patch.

#11 2020-02-16 22:12:39

RobertNT
Member
Registered: 2020-02-16
Posts: 1

Re: bcache and very poor (nonexistent) read caching performance

https://lkml.org/lkml/2020/2/10/338

This patch adds options for readahead data cache policies via sysfs
file /sys/block/bcache<N>/readahead_cache_policy, the options are,
- "all": cache all readahead data I/Os.
- "meta-only": only cache meta data, and bypass other regular I/Os.

Appears in:
5.4.19+
5.5.4+
4.19.103+
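
If I'm reading the patch right, the policy should then be switchable at runtime, roughly like this (bcache0 standing in for your device; depending on the kernel the attribute may sit under the bcache/ subdirectory):

$ cat /sys/block/bcache0/bcache/readahead_cache_policy
# echo all > /sys/block/bcache0/bcache/readahead_cache_policy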

I haven't updated yet, but it looks like an excellent outcome.

EDIT: Updated - remarkably better load times on my main pain point (Guild Wars 2 under WINE/Proton), going from 40+ seconds to 13.
Back to what it was like a year or so ago.
Thanks so much for hunting down the cause and the kernel patch reference.

Last edited by RobertNT (2020-02-16 22:45:00)

#12 2020-02-19 15:57:57

Wild Penguin
Member
Registered: 2015-03-19
Posts: 319

Re: bcache and very poor (nonexistent) read caching performance

Hi,

After a quick test, with 5.4.19 I'm getting the performance I would be expecting.

Marking as [SOLVED]!

(will comment later on the mailing list)

#13 2020-03-07 20:05:29

Wild Penguin
Member
Registered: 2015-03-19
Posts: 319

Re: bcache and very poor (nonexistent) read caching performance

Been busy and hence no replies here;

Looks like I spoke too soon. After a while, the performance is gone and all reads go to the mechanical HDD again.

I believe my first results were better because I had just cloned my root from the SSD back to the mechanical (bcache-backed) device; hence that data had been freshly written, and was therefore also in the cache.

Now, after running for a while, new data has been written, but more important data (which is read often but not written to) still does not get into the cache!

Any input / opinions are still welcome (setting the readahead_cache_policy seems to have little effect here).
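
(For anyone comparing: I've been watching the recent hit rates with something like the command below, bcache0 being my device - the five-minute stats make it easier to see what current I/O is doing than stats_total:)

$ grep -H . /sys/block/bcache0/bcache/stats_five_minute/*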

#14 2020-06-04 08:35:59

djf
Member
Registered: 2020-06-04
Posts: 1

Re: bcache and very poor (nonexistent) read caching performance

I found setting the congestion times to zero helped:

In the real world, SSDs don't always keep up with disks - particularly with
   slower SSDs, many disks being cached by one SSD, or mostly sequential IO. So
   you want to avoid being bottlenecked by the SSD and having it slow everything
   down.

   To avoid that bcache tracks latency to the cache device, and gradually
   throttles traffic if the latency exceeds a threshold (it does this by
   cranking down the sequential bypass).

   You can disable this if you need to by setting the thresholds to 0::

    # echo 0 > /sys/fs/bcache/<cache set>/congested_read_threshold_us
    # echo 0 > /sys/fs/bcache/<cache set>/congested_write_threshold_us

   The default is 2000 us (2 milliseconds) for reads, and 20000 for writes.

from:

https://www.kernel.org/doc/Documentation/bcache.txt

#15 2020-06-04 09:11:51

Wild Penguin
Member
Registered: 2015-03-19
Posts: 319

Re: bcache and very poor (nonexistent) read caching performance

djf wrote:

I found setting the congestion times to zero helped:

Hi djf,

I'm happy it works for you, but if you read the previous posts carefully, you'll notice I have already tried that.

After waiting (passively, and actively looking) for a solution for over half a year, I'm considering moving to some other solution (ditching bcache and installing root on the SSD, or trying some other caching method, such as dm-cache). The only reason I haven't done so yet is that I haven't had time to study the alternatives.

Last edited by Wild Penguin (2020-06-04 10:36:50)
