So I want some feedback from any experienced users of rsync, or from those generally knowledgeable in this area, on my scenario and planned approach.
I'll try to keep it short; however, my current setup is complex in its own way, so I'll try to illustrate it here. There will also be a real-world benchmark comparing differential and incremental backup methods, demonstrating three methods from my own existing backups.
So here's the only pertinent information as it relates to data storage, deduplication, and optimizing storage/transfer I/O for each piece of hardware; I'll get to rsync shortly. In my case I have a decent laptop with plenty of RAM (16 GB), so CPU/RAM usage isn't a concern, though it will come into play near the end of this post.
On my laptop:
256 GB SanDisk SSD (mini PCIe)
2 TB Western Digital laptop HDD
64 GB USB 3.0 SanDisk flash drive (with decent r/w speeds), purchased for the setup I have in mind.
My internal partitioning and mounting layout looks like this.
root@Pylon13 /mnt/snapshot/backups # lsblk -o NAME,PATH,FSTYPE,MOUNTPOINT,FSUSE%,FSAVAIL,FSSIZE --fs
NAME                  PATH                        FSTYPE       MOUNTPOINT          FSUSE%  FSAVAIL  FSSIZE
sda                   /dev/sda
└─sda1                /dev/sda1                   crypto_LUKS
  └─storage           /dev/mapper/storage         LVM2_member
    ├─volume-backup   /dev/mapper/volume-backup   ext4         /mnt/snapshot          57%    29.5G   78.3G
    └─volume-storage  /dev/mapper/volume-storage  ext4         /home/nephthys/HDD     46%   933.3G    1.7T
sdb                   /dev/sdb
├─sdb1                /dev/sdb1                   vfat         /boot/efi               0%   548.7M  548.9M
├─sdb2                /dev/sdb2                   ext4         /boot                  12%     1.6G    1.9G
├─sdb3                /dev/sdb3
└─sdb4                /dev/sdb4                   crypto_LUKS
  └─cryptlvm          /dev/mapper/cryptlvm        LVM2_member
    ├─ssd-root        /dev/mapper/ssd-root        ext4         /                      40%    21.5G   39.1G
    ├─ssd-home        /dev/mapper/ssd-home        ext4         /home                  32%    40.5G   64.9G
    └─ssd-swap        /dev/mapper/ssd-swap        swap         [SWAP]
I'm currently booting into two LVM volume groups atop two LUKS2-encrypted drives (SSD + HDD), using a USB keyfile during boot to unlock /dev/sdb4 (which then unlocks /dev/sda1 via crypttab). The /dev/sdb2 boot partition is soon to be added...
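For anyone curious how that chain works, the crypttab side is just a keyfile entry along these lines (the name and key path here are illustrative, not my actual file):

# /etc/crypttab (illustrative entry)
# <name>    <device>     <keyfile>                <options>
storage     /dev/sda1    /etc/keys/storage.key    luks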
Now hopefully I haven't lost your attention. Here's where rsync comes into play:
I'm currently keeping daily incremental snapshots with a script using these options
rsync -aAX --link-dest=path/to/previous/dated/snapshot/ \
    --del \
    --exclude-from=list-of-cache-and-other-dirs-that-aren't-important \
    --log-file=path-to-log-file.log \
    /path/to/source/ path/to/new/dated/snapshot/
(The --del is only there in case the script gets rerun manually by me; the final source/destination arguments are placeholders like the rest of the paths.)
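To show how those options hang together day to day, here's a minimal sketch of the sort of wrapper my script boils down to (all paths below are placeholders, not my actual script):

#!/bin/bash
# Minimal sketch: create today's dated snapshot, hard-linking unchanged files
# against the most recent previous snapshot via --link-dest.
set -eu

BACKUP_ROOT=/mnt/snapshot/backups        # placeholder location
TODAY=$(date +%F)
# Most recent existing dated snapshot, skipping today's in case of a rerun:
PREV=$(ls -1d "$BACKUP_ROOT"/20??-??-?? 2>/dev/null | grep -v "$TODAY" | tail -n 1)

LINKDEST=()
if [ -n "$PREV" ]; then
    LINKDEST=(--link-dest="$PREV"/)
fi

rsync -aAX --del "${LINKDEST[@]}" \
    --exclude-from=/etc/backup-excludes.list \
    --log-file="$BACKUP_ROOT/$TODAY.log" \
    /path/to/source/ "$BACKUP_ROOT/$TODAY"/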
From my existing snapshots I decided to experiment with differential-style backups, using my oldest incremental as the base to create differentials against for each newer increment, recreating the equivalent differential snapshots I would have had if I'd run them at those same points back in time. For those unfamiliar with rsync, or who haven't really pored over its concise (often confusing) encyclopedic man page: rsync also does differential backups via the --only-write-batch= argument. This creates a single data file along with a script of matching name (used by rsync during a restore). That file is nothing more than all the deltas (diffs) of the files which have changed between source and destination. They can be restored using the --read-batch= argument.
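To make that concrete, here's roughly what creating and restoring one of those batches looks like with my naming scheme (treat the paths as illustrative placeholders rather than my exact commands):

# Write a batch holding only the deltas between a newer snapshot (source) and
# the level-1 (destination); --only-write-batch leaves the destination untouched:
rsync -aAX --only-write-batch=2019-07-13-diff \
    path/to/2019-07-13/ path/to/2019-07-12/

# rsync also writes 2019-07-13-diff.sh next to the batch file. To restore,
# apply the batch to a copy of the level-1 (or simply run the generated .sh):
rsync -aAX --read-batch=2019-07-13-diff path/to/restore-target/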
# "du -sh summary"
15G 2019-07-12/ 14G 2019-07-12.tar 3.6G 2019-07-12.tar.xz
1.5G 2019-07-13/ 569M 2019-07-13-diff 211M 2019-07-13-diff.tar.xz
807M 2019-07-14/ 1.1G 2019-07-14-diff 294M 2019-07-14-diff.tar.xz
654M 2019-07-15/ 1.5G 2019-07-15-diff 487M 2019-07-15-diff.tar.xz
470M 2019-07-16/ 1.6G 2019-07-16-diff 499M 2019-07-16-diff.tar.xz
273M 2019-07-17/ 1.7G 2019-07-17-diff 537M 2019-07-17-diff.tar.xz
2.1G 2019-07-18/ 2.8G 2019-07-18-diff 761M 2019-07-18-diff.tar.xz
1.4G 2019-07-19/ 1.6G 2019-07-19-diff 488M 2019-07-19-diff.tar.xz
Totals
22.2G 25.9G 6.9G
As one can see, using tar and xz (with xz's -T option to use all my CPU cores during the tar | xz compression stage) to compress the level-1 snapshot and all the diffs created by rsync resulted in huge storage savings. Compressing the level-1 snapshot took 34m 44s; the diffs took a few minutes each at most, some under a minute.
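For reference, the compression stage is essentially a pipeline of this shape (not my exact commands; see the note at the end):

# xz defaults, with -T0 letting it use all cores:
tar -cf - 2019-07-12/ | xz -T0 > 2019-07-12.tar.xz
tar -cf - 2019-07-13-diff 2019-07-13-diff.sh | xz -T0 > 2019-07-13-diff.tar.xz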
The differential set was the largest in terms of total disk usage, since each batch contains the complete set of differences between its date and the level-1; compressed, though, it takes by far the least space. The incremental snapshots on my local HDD are perfectly fine for their simplicity when it comes to restoring them. However, I also plan on keeping, and periodically updating, these snapshots on my new 64 GB USB stick, which has less space and likely a far lower lifetime write-cycle count than my SSD. To minimize the heavy I/O rsync would create populating and creating inodes for thousands of directories/files on the SanDisk USB thumb drive, creating the tar.xz snapshot + diffs seems to me to be the best approach.

The USB stick will also be a dedicated bootable Arch installation for recovery operations, encrypted as well with LUKS (with the keyfile for unlocking the laptop embedded), with its own 550 MB EFI partition, a 1.5 GB ext4 /boot, and the remaining volume also formatted ext4 for the system, the snapshot directories, and other important/critical data. The end goal is to have a udev rule/systemd unit trigger unlocking/mounting of the encrypted container/partition on the USB stick after a certain number of days have passed (systemd.timer). Then the tar.xz-compressed level-1 snapshot (created on the HDD by my script) and the compressed diff.tar.xz files are transferred (with rsync or cp), replacing/removing any old diffs and level-1; there's a rough sketch of what I have in mind for that timer/service below the goals. Obviously the Arch USB install will also be optimized to minimize writes while booted.

So I'm looking to hear what other users may have done in a similar/related setup, and I'm open to suggestions. This plan is what I came up with to achieve redundancy and maintain data integrity/security on both HDD and USB while minimizing heavy r/w I/O to the USB stick. It will be part of my primary setup, providing the keyfile for unlocking the LUKS volumes on the laptop while securely containing all the keys on its own encrypted volume, minus the one used to boot itself (which is manually entered and not stored anywhere other than my own grey matter). All other spare keysticks are safely locked away in the event this one gets compromised. I'm open to opinions/suggestions regarding the encryption setup, but the primary reason for this post is these goals:
1: Minimize I/O to the USB stick, thus extending its life cycle.
2: Achieve offline redundancy with portable USB media.
3: Keep local online deduplication methods unchanged (incrementals as the primary snapshot method). Compressed differentials are only created temporarily on disk, on a schedule, before the USB mount/unlock rule triggers their transfer; pre-creating the differential backups minimizes the wait and transfer time.
4: Offline deduplication for the low storage requirements of the USB stick, where backups can be restored from its own installed/live environment if booted and unlocked, in the event my first backup drive and system fail.
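And to illustrate goal 3, here's a rough sketch of the timer/service pair I have in mind (unit names, device reference, key path, and mount points are placeholders; the "stick is plugged in" check is simplified to a path condition rather than a full udev rule):

# /etc/systemd/system/usb-backup-sync.timer
[Unit]
Description=Periodic transfer of compressed snapshots to the backup USB stick

[Timer]
OnCalendar=weekly
Persistent=true

[Install]
WantedBy=timers.target

# /etc/systemd/system/usb-backup-sync.service
[Unit]
Description=Unlock backup USB, copy level-1 and diff archives, lock it again
ConditionPathExists=/dev/disk/by-uuid/<uuid-of-usb-luks-partition>

[Service]
Type=oneshot
ExecStart=/usr/local/bin/usb-backup-sync.sh

# /usr/local/bin/usb-backup-sync.sh
#!/bin/bash
set -eu
cryptsetup open /dev/disk/by-uuid/<uuid-of-usb-luks-partition> usbbackup \
    --key-file /etc/keys/usb.key
mount /dev/mapper/usbbackup /mnt/usb-backup
# Only the *.tar.xz archives go over; --delete drops superseded ones on the stick:
rsync -rt --delete --include='*.tar.xz' --exclude='*' \
    /mnt/snapshot/backups/ /mnt/usb-backup/backups/
umount /mnt/usb-backup
cryptsetup close usbbackup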
Note: I also posted this because it may provide some useful data regarding backup methods. If anyone is curious about the exact commands I used with tar and xz, I'll post them too; the compression ratio was achieved with the xz utility's defaults, only sped up with -T.
"Hidden are the ways for those who pass by, for light is perished and darkness comes into being." Nephthys:
Ancient Egyptian Coffin Texts