
#1 2020-12-17 04:55:21

cdwijs
Member
Registered: 2010-04-24
Posts: 289

preventing HDD's to spinup by using (small) SSD's as cache, how?

I want to build a NAS that is as quiet, efficient and cool as possible.

In this scenario, I have the following parts:
- 4 SATA ports on the motherboard.
- 2 hard drives: one 1TB, the other 900GB.
- 2 SSDs: one 220GB, the other 120GB.

Requirements:
- All data must remain accessible, even when one of the hard drives or SSDs fails.
- The HDDs must be kept idle (not spinning) as long as possible.
- No data corruption when a hard drive or SSD returns incorrect data.

- The NAS does not have to be the fastest NAS in the world.
- It's OK to wait a few seconds for the HDDs to spin up when the requested data is not on the SSDs.

How I imagine this to work:
I would like to use the 2 hard drives as bulk storage and the 2 SSDs as cache. Let's pretend both SSDs are empty and the hard drives each hold a copy of the data. When new data is written to the NAS, it is written to both SSDs. If one of the SSDs is (almost) full, the hard drives spin up and all of the data on the SSDs is transferred to the hard drives. Half of the data (the least accessed written data) is then removed from the SSDs, so more data from the client can be stored on them.

When the client requests data that is stored on the SSDs, the HDDs are kept idle. When the client requests data that is not on one of the SSDs, one of the hard drives is woken up (only one, not both). The requested data is then transferred from the hard drive to the client. Next, the data adjacent to what the client requested (for instance the complete file the client is working on, all the files in that directory, or the next few episodes of the series the client is watching) is transferred to the read cache on the biggest of the SSDs.

Possible Implementation:
On the 2 SSDs there's a raid 1 (mirrored) btrfs filesystem of 120GB. This is the write cache.
On the biggest SSD there's also a 120GB btrfs filesystem. This is the read cache.
On the 2 HDDs there's a raid 1 (mirrored) btrfs filesystem of 900GB. The remaining 100GB of the biggest drive is not used.
2 instances of bcache [1] are used to handle the read and the write caching.

[1] https://wiki.archlinux.org/index.php/Bcache
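
The space budget of this layout can be checked with a little arithmetic. The sketch below uses only the nominal sizes from the parts list and assumes 1TB = 1000GB; the variable names are made up for illustration.

```shell
#!/bin/sh
# Nominal sizes in GB, taken from the parts list (assumes 1TB = 1000GB).
ssd_big=220
ssd_small=120
hdd_big=1000
hdd_small=900

# A raid 1 mirror can be at most as large as its smaller member.
min() {
    if [ "$1" -lt "$2" ]; then echo "$1"; else echo "$2"; fi
}

write_cache=$(min "$ssd_big" "$ssd_small")   # mirrored on both SSDs
bulk=$(min "$hdd_big" "$hdd_small")          # mirrored on both HDDs
ssd_left=$((ssd_big - write_cache))          # what remains for a read cache
hdd_left=$((hdd_big - bulk))                 # unused on the biggest HDD

echo "mirrored write cache:      ${write_cache}GB"
echo "left on the biggest SSD:   ${ssd_left}GB"
echo "mirrored bulk storage:     ${bulk}GB"
echo "unused on the biggest HDD: ${hdd_left}GB"
```

With these nominal sizes, about 100GB remains on the biggest SSD after the mirrored write cache, so a read cache on that SSD would have to be somewhat smaller than 120GB.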

The questions:
Is this the best setup?
Is btrfs the best filesystem to use in this scenario?
Is bcache able to not only cache the recently read data, but also to cache the adjacent data?
Is there something useful that can be done with the 100GB space of the biggest hard drive?


#2 2020-12-25 07:02:24

cdwijs
Member
Registered: 2010-04-24
Posts: 289

Re: preventing HDD's to spinup by using (small) SSD's as cache, how?

I've used bcache and btrfs to build the system.

This stack both protects against data loss from failing drives and provides read and write caching.

+-------------------------------------------------+
|                btrfs raid 1 /Storage            |
+-------------------------+-----------------------+
|       /dev/bcache0      |      /dev/bcache1     |
+------------+------------+-----------+-----------+
| Cache      | Data       | Cache     | Data      |
| /dev/sda2  | /dev/sdb1  | /dev/sdc2 | /dev/sdd1 |
+------------+------------+-----------+-----------+
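
A stack like the one in the diagram could be created roughly as follows. This is a sketch, not a record of the exact commands that were used: the device names come from the diagram, the `run` and `build_stack` helpers are made up for illustration, and the script only prints the commands unless `RUN=1` is set.

```shell
#!/bin/sh
# Dry run by default: commands are only printed. Set RUN=1 to execute them.
# WARNING: executing these destroys existing data on the listed partitions.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

build_stack() {
    # One bcache device per HDD, each with its own SSD cache partition.
    # Passing -B and -C together attaches the cache to the backing device.
    run make-bcache -B /dev/sdb1 -C /dev/sda2
    run make-bcache -B /dev/sdd1 -C /dev/sdc2
    # Mirror both data and metadata across the two bcache devices.
    run mkfs.btrfs -d raid1 -m raid1 /dev/bcache0 /dev/bcache1
    run mount /dev/bcache0 /Storage
}

build_stack
```

Putting btrfs raid 1 on top of two independent bcache devices means each mirror half has its own cache, so losing one SSD affects only one half of the mirror.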

One remaining problem is that bcache does not know whether a hard drive is idle or active. There is no trigger to write back the dirty write cache when the hard drive has already spun up to service a read request.
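
One possible workaround, untested here, is a small polling script that asks the drive for its power state with `hdparm -C` and only lets bcache write back while the disk is already spinning. The sketch below assumes the usual `drive state is:` line in the `hdparm -C` output and uses the `writeback_running` sysfs attribute; the function names are made up for illustration.

```shell
#!/bin/sh
# Extract the power state from `hdparm -C` output read on stdin.
# Assumed input line: " drive state is:  standby" (or "active/idle").
drive_state() {
    sed -n 's/^ *drive state is: *//p'
}

# Map a power state to a value for bcache's writeback_running attribute:
# 1 (write back dirty data) while the disk spins, 0 otherwise.
writeback_for_state() {
    case "$1" in
        active/idle|idle) echo 1 ;;
        *)                echo 0 ;;
    esac
}

# Hypothetical helper: poll one HDD and toggle writeback on its bcache device.
# Usage (as root, with hdparm installed): sync_writeback /dev/sdb bcache0
sync_writeback() {
    state=$(hdparm -C "$1" | drive_state)
    writeback_for_state "$state" > "/sys/block/$2/bcache/writeback_running"
}
```

Calling `sync_writeback` from a timer every minute or so would approximate the missing trigger. Note that the kernel documentation describes `writeback_running` as meant for benchmarking, and that with writeback stopped the dirty data simply accumulates on the SSD, so this is an experiment rather than a recommendation.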

Another problem is that the btrfs filesystem wakes both hard drives when data is requested that's not in the read cache. I would like to keep the older, noisier and slower of the two hard drives sleeping while the newer, faster and quieter drive services the read requests. The old drive should only be used when the new drive encounters an uncorrectable read error, when the data is scrubbed, or when its SSD write cache is almost full and therefore needs to be dumped onto the HDD.

These are the parameters I set on the bcache devices (as root):

# Cache every request, including large sequential I/O.
echo 0 > /sys/block/bcache0/bcache/sequential_cutoff
echo 0 > /sys/block/bcache1/bcache/sequential_cutoff
# Buffer writes on the SSD and write them back to the HDD later.
echo writeback > /sys/block/bcache0/bcache/cache_mode
echo writeback > /sys/block/bcache1/bcache/cache_mode
# Wait 10000 seconds after the first dirty data arrives before writing it back.
echo 10000 > /sys/block/bcache0/bcache/writeback_delay
echo 10000 > /sys/block/bcache1/bcache/writeback_delay
# Never bypass the cache because the SSD looks congested.
echo 0 > /sys/fs/bcache/1bbd493f-186f-4252-aa49-93c72fed9766/congested_read_threshold_us
echo 0 > /sys/fs/bcache/1bbd493f-186f-4252-aa49-93c72fed9766/congested_write_threshold_us
echo 0 > /sys/fs/bcache/5e474b6f-b818-47a2-95a8-9ee970aed9ba/congested_read_threshold_us
echo 0 > /sys/fs/bcache/5e474b6f-b818-47a2-95a8-9ee970aed9ba/congested_write_threshold_us
# Round cache-miss reads up to 200M.
echo 200M > /sys/block/bcache0/bcache/readahead
echo 200M > /sys/block/bcache1/bcache/readahead
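
To check that the values were accepted, the attributes can be read back from sysfs. The helper below is a sketch: `show_bcache_settings` is a made-up name, and the sysfs root is a parameter only so the function can be tried against a scratch directory.

```shell
#!/bin/sh
# Print selected bcache backing-device attributes.
# $1: sysfs root (normally /sys). Remaining arguments: bcache device names.
show_bcache_settings() {
    root=${1:-/sys}
    [ "$#" -gt 0 ] && shift
    for dev in "$@"; do
        for attr in sequential_cutoff cache_mode writeback_delay readahead; do
            f="$root/block/$dev/bcache/$attr"
            if [ -r "$f" ]; then
                echo "$dev $attr: $(cat "$f")"
            else
                # Attribute missing, e.g. no such device or a different kernel.
                echo "$dev $attr: (not available)"
            fi
        done
    done
}

show_bcache_settings /sys bcache0 bcache1
```

For `cache_mode`, the kernel lists all available modes and marks the active one with square brackets. Values written through sysfs are not all persistent, so it is worth re-running a check like this after a reboot.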
