#1 2019-07-24 06:10:42


Rsync Incremental on local HDD/differential over USB flash opinions.

I'd like some feedback from experienced rsync users, or from anyone generally knowledgeable about this kind of scenario, on my setup and planned approach.
I'll try to keep it short, but my current setup is complex in its own way, so I'll illustrate it here. I've also included a real-world benchmark comparing three backup methods (incremental, differential, and compressed differential), built from my own existing backups.
Here is the only pertinent information about data storage, deduplication, and optimizing storage/transfer I/O for each piece of hardware. I'll get to rsync shortly. I have a decent laptop with plenty of RAM (16 GB), so CPU/RAM usage isn't a concern, though it comes into play near the end of this post.
On my laptop:
   256 GB SanDisk SSD (mini PCI)
   2 TB Western Digital laptop HDD
   64 GB USB 3.0 SanDisk flash drive (with decent r/w speeds), purchased for the setup I have in mind.
My internal partitioning and mounting layout looks like this.

root@Pylon13 /mnt/snapshot/backups # lsblk -o NAME,PATH,FSTYPE,MOUNTPOINT,FSUSE%,FSAVAIL,FSSIZE --fs
NAME                 PATH                       FSTYPE      MOUNTPOINT         FSUSE% FSAVAIL FSSIZE
sda                  /dev/sda                                                                       
└─sda1               /dev/sda1                  crypto_LUKS                                         
  └─storage          /dev/mapper/storage        LVM2_member                                         
    ├─volume-backup  /dev/mapper/volume-backup  ext4        /mnt/snapshot         57%   29.5G  78.3G
    └─volume-storage /dev/mapper/volume-storage ext4        /home/nephthys/HDD    46%  933.3G   1.7T
sdb                  /dev/sdb                                                                       
├─sdb1               /dev/sdb1                  vfat        /boot/efi              0%  548.7M 548.9M
├─sdb2               /dev/sdb2                  ext4        /boot                 12%    1.6G   1.9G
├─sdb3               /dev/sdb3                                                                      
└─sdb4               /dev/sdb4                  crypto_LUKS                                         
  └─cryptlvm         /dev/mapper/cryptlvm       LVM2_member                                         
    ├─ssd-root       /dev/mapper/ssd-root       ext4        /                     40%   21.5G  39.1G
    ├─ssd-home       /dev/mapper/ssd-home       ext4        /home                 32%   40.5G  64.9G
    └─ssd-swap       /dev/mapper/ssd-swap       swap        [SWAP]                                  

I'm currently booting into two LVM setups on top of two LUKS2-encrypted drives (SSD + HDD), using a USB keyfile during boot to unlock /dev/sdb4 (which then unlocks /dev/sda1 via crypttab). The /dev/sdb2 boot partition is soon to be added...
Now hopefully I haven't lost your attention. Here's where rsync comes into play:

I'm currently keeping daily incremental snapshots with a script using these options:

# --del is there in case I rerun the script manually.
# The two trailing paths are placeholders for the source and the new dated snapshot.
rsync -aAX --del \
      --link-dest=path/to/previous/dated/snapshot/ \
      --exclude-from=list-of-cache-and-other-dirs-that-arent-important \
      --log-file=path-to-log-file.log \
      path/to/source/ path/to/new/dated/snapshot/
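
For anyone who wants to see how the previous snapshot gets picked for --link-dest, here is a minimal sketch, not my actual script. The function names (latest_snapshot, snapshot) and the YYYY-MM-DD directory layout are assumptions for illustration, and the exclude/log options are left out to keep it short. ROOT should be an absolute path, since rsync resolves a relative --link-dest against the destination directory.

```shell
# latest_snapshot ROOT [SKIP]
# Print the newest YYYY-MM-DD directory name under ROOT, ignoring SKIP (e.g. today).
latest_snapshot() (
    newest=""
    for d in "$1"/[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]/; do
        [ -d "$d" ] || continue              # the glob matched nothing
        name=$(basename "$d")
        [ "$name" = "${2:-}" ] && continue   # skip the excluded date
        newest=$name                         # globs expand sorted, so the last one is newest
    done
    if [ -n "$newest" ]; then printf '%s\n' "$newest"; fi
)

# snapshot SRC ROOT
# Create or refresh ROOT/<today>/, hard-linking unchanged files to the previous day.
snapshot() (
    today=$(date +%F)
    prev=$(latest_snapshot "$2" "$today")
    if [ -n "$prev" ]; then
        rsync -aAX --del --link-dest="$2/$prev/" "$1/" "$2/$today/"
    else
        rsync -aAX --del "$1/" "$2/$today/"  # first run: plain full copy
    fi
)
```

Skipping today's date when looking for the previous snapshot means a manual rerun still links against yesterday rather than against itself.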

From my existing snapshots I decided to experiment with differential-style backups. Using my oldest incremental as the level-1 base, I created a differential against it for each newer increment, recreating the differential snapshots I would have had if I had run them at the same points in time. For those unfamiliar with rsync, or who haven't really pored over its concise (often confusing) encyclopedic man page: rsync can also do differential backups with the --only-write-batch= argument. This creates a single batch file plus a shell script of matching name (with .sh appended) that holds the rsync command for replaying it during a restore. The batch file is nothing more than all the deltas (diffs) of the files which have changed between source and destination. They can be restored using the "--read-batch=" argument.
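
A rough sketch of that batch workflow, under a few assumptions: the function names (make_diff, apply_diff) are made up for illustration, plain -a is used instead of -aAX to keep it portable, and --delete is passed with the intent that files removed since the level-1 also get removed on restore. The key constraint from the man page is that the tree you replay the batch onto must be identical to the tree the batch was created against.

```shell
# make_diff LEVEL1_DIR NEWER_DIR BATCH_FILE
# Record what would turn LEVEL1_DIR into NEWER_DIR, without modifying LEVEL1_DIR.
make_diff() {
    rsync -a --delete --only-write-batch="$3" "$2/" "$1/"
}

# apply_diff BATCH_FILE TARGET_DIR
# TARGET_DIR must be an exact copy of the level-1 tree the batch was made against.
apply_diff() {
    rsync -a --delete --read-batch="$1" "$2/"
}

# Example (paths are placeholders):
#   make_diff  /mnt/snapshot/backups/2019-07-12 /mnt/snapshot/backups/2019-07-13 2019-07-13-diff
#   cp -a /mnt/snapshot/backups/2019-07-12 /tmp/restore
#   apply_diff 2019-07-13-diff /tmp/restore
```

Note that the source for make_diff is the newer tree and the destination is the level-1, because the batch describes the changes rsync would have made to the destination.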

"du -sh" summary

Incremental (--link-dest)    Differential vs. 2019-07-12    Compressed (tar.xz)
15G    2019-07-12/           14G    2019-07-12.tar          3.6G   2019-07-12.tar.xz
1.5G   2019-07-13/           569M   2019-07-13-diff         211M   2019-07-13-diff.tar.xz
807M   2019-07-14/           1.1G   2019-07-14-diff         294M   2019-07-14-diff.tar.xz
654M   2019-07-15/           1.5G   2019-07-15-diff         487M   2019-07-15-diff.tar.xz
470M   2019-07-16/           1.6G   2019-07-16-diff         499M   2019-07-16-diff.tar.xz
273M   2019-07-17/           1.7G   2019-07-17-diff         537M   2019-07-17-diff.tar.xz
2.1G   2019-07-18/           2.8G   2019-07-18-diff         761M   2019-07-18-diff.tar.xz
1.4G   2019-07-19/           1.6G   2019-07-19-diff         488M   2019-07-19-diff.tar.xz
22.2G  total                 25.9G  total                   6.9G   total

As one can see, using tar and xz (with the -T option so xz uses all my CPU cores during the tar | xz compression stage) to compress the level-1 snapshot and all the diffs created by rsync resulted in huge storage savings: 6.9G in total versus 22.2G for the incremental snapshots. Compressing the level-1 snapshot took 34m 44s. The diffs took a few minutes each at most, some under a minute.
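
The compression stage can be sketched roughly like this. These are not the exact commands I ran; pack and unpack are hypothetical helper names, and -T0 (let xz pick one thread per core) is an assumption about which -T value to use. Also note that GNU tar does not store ACLs or extended attributes unless asked (--acls, --xattrs), which matters if the snapshot was taken with rsync -aAX.

```shell
# pack PATH
# Write PATH.tar.xz next to PATH (a directory or a batch file), compressing on all cores.
pack() (
    parent=$(dirname "$1")
    name=$(basename "$1")
    # xz defaults, only multithreaded. The pipeline's exit status is that of xz.
    tar -C "$parent" -cf - "$name" | xz -T0 > "$parent/$name.tar.xz"
)

# unpack ARCHIVE.tar.xz DEST_DIR
# Extract the archive into DEST_DIR, creating it if needed.
unpack() {
    mkdir -p "$2" && xz -dc "$1" | tar -C "$2" -xf -
}

# Example (paths are placeholders):
#   pack /mnt/snapshot/backups/2019-07-12        # -> 2019-07-12.tar.xz
#   pack /mnt/snapshot/backups/2019-07-13-diff   # -> 2019-07-13-diff.tar.xz
```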
Uncompressed, the differential set was the largest in total disk usage, because each batch contains all the differences between its date and the level-1 snapshot. The incremental snapshots on my local HDD are perfectly fine, given how simple they are to restore. However, I also plan on keeping and periodically updating these snapshots on my new 64 GB USB stick, which has less space and likely far fewer lifetime write cycles than my SSD. To avoid the heavy I/O rsync would generate creating inodes for thousands of directories and files on the SanDisk USB stick, writing the tar.xz level-1 snapshot plus diffs seems to me to be the best approach.

The USB stick will also be a dedicated bootable Arch installation for recovery operations, also encrypted with LUKS (with the keyfile for unlocking the laptop embedded), with its own 550M EFI partition, a 1.5G ext4 /boot, and the remaining volume formatted ext4 for the system, the snapshot directories, and other important/critical data. The end goal is to have a udev rule/systemd unit unlock and mount the encrypted container/partition on the USB stick once a certain number of days have passed (systemd.timer). Then the compressed level-1 snapshot (tar.xz, created on the HDD by my script) and the compressed diff.tar.xz files are transferred (with rsync or cp), replacing or removing any old diffs and level-1. Obviously the Arch USB install will also be optimized to minimize writes while booted.

So I'm looking to hear what other users may have done in a similar/related setup, and I'm open to suggestions. This current plan is what I came up with to achieve redundancy and maintain data integrity/security on both HDD and USB while minimizing heavy r/w I/O to the USB stick.
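
The transfer step that the udev rule/systemd unit would eventually call could look something like the sketch below. Everything here is hypothetical: the function names, the stamp file holding the time of the last sync, and the choice of cp over rsync. The unlocking and mounting of the LUKS container is left out. The idea is to write to the stick only when an archive is new or has changed.

```shell
# sync_due STAMP_FILE DAYS
# Succeed if no sync has been recorded yet, or the last one is at least DAYS old.
sync_due() (
    [ -f "$1" ] || exit 0
    last=$(cat "$1")
    now=$(date +%s)
    [ $(( now - last )) -ge $(( $2 * 86400 )) ]
)

# push_archives HDD_DIR USB_DIR STAMP_FILE
# Mirror the *.tar.xz files from HDD_DIR to USB_DIR, then record the sync time.
push_archives() (
    # Remove archives on the stick that no longer exist on the HDD.
    for f in "$2"/*.tar.xz; do
        [ -e "$f" ] || continue
        [ -e "$1/$(basename "$f")" ] || rm -f -- "$f" || exit 1
    done
    # Copy only archives that are missing or differ, to avoid needless writes.
    for f in "$1"/*.tar.xz; do
        [ -e "$f" ] || continue
        dest="$2/$(basename "$f")"
        cmp -s "$f" "$dest" 2>/dev/null || cp -p -- "$f" "$dest" || exit 1
    done
    date +%s > "$3"
)

# Example (paths are placeholders):
#   sync_due /var/lib/usb-sync.stamp 7 && push_archives /mnt/snapshot/backups /mnt/usb/backups /var/lib/usb-sync.stamp
```

Comparing with cmp costs reads on the stick but no writes, which fits the goal of sparing its write cycles.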
The USB stick will be a part of my primary setup, providing the keyfile for unlocking the LUKS volumes on the laptop while securely containing all the keys on its own encrypted volume, minus the one used to boot itself (which is entered manually and not stored anywhere other than my own grey matter). All other spare keysticks are safely locked away in case this one gets compromised. I'm open to opinions/suggestions regarding the encryption setup; however, the primary reason for this post is these goals:
         1: Minimize I/O to the USB stick, thus extending its life.
         2: Achieve offline redundancy with portable USB media.
         3: Keep local online deduplication unchanged (incremental as the primary snapshot method). Compressed differentials are only created temporarily on disk, on a schedule, before the USB mount/unlock rule triggers their transfer, which minimizes the wait and transfer time.
         4: Offline deduplication for the low-storage medium (USB stick), where backups can be restored from its own installed/live environment, once booted and unlocked, in the event my first backup drive and system fail.
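
To make goal 4 concrete, a restore from the stick would go roughly as follows. This is a sketch under assumptions, not a tested recovery procedure: restore is a hypothetical function name, the level-1 archive is assumed to hold one top-level directory named after the archive, and each diff archive is assumed to hold a batch file named after the archive minus ".tar.xz".

```shell
# restore LEVEL1.tar.xz DIFF.tar.xz TARGET_DIR
# Rebuild the state captured by DIFF inside TARGET_DIR/<level-1 name>/.
restore() (
    work=$(mktemp -d) || exit 1
    top=$(basename "$1" .tar.xz)       # e.g. 2019-07-12
    batch=$(basename "$2" .tar.xz)     # e.g. 2019-07-19-diff
    mkdir -p "$3" || exit 1
    # 1. Unpack the full level-1 tree.
    xz -dc "$1" | tar -C "$3" -xf - || exit 1
    # 2. Unpack the batch file into a scratch directory.
    xz -dc "$2" | tar -C "$work" -xf - || exit 1
    # 3. Replay the batch on top of the level-1 tree.
    rsync -a --delete --read-batch="$work/$batch" "$3/$top/" || exit 1
    rm -rf "$work"
)

# Example (paths are placeholders):
#   restore /mnt/usb/backups/2019-07-12.tar.xz /mnt/usb/backups/2019-07-19-diff.tar.xz /mnt/restore
```

Since every differential is made against the level-1, only one diff is ever applied per restore, and the resulting directory keeps the level-1's name even though it now holds the later state.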

Note: I also posted this because it may provide some useful data on backup methods. If anyone is curious about the exact commands I used with tar and xz, I'll post them too. The compression ratio was achieved with xz's defaults, only sped up with -T.

Last edited by Thme (2019-07-29 06:11:03)

"Hidden are the ways for those who pass by, for light is perished and darkness comes into being." Nephthys:
Ancient Egyptian Coffin Texts

