
#1 2018-01-18 00:58:58

6ng4n
Member
Registered: 2012-02-07
Posts: 84

Backup Solution Recommendations

I have an external HDD and keep my backups on it. Until now, however, I have copied files over manually. My files include a large number of photos, a decent amount of music in both lossy and lossless formats, some important documents, scanned versions of very old books, and my schoolwork archive, so I have thousands of files to back up, and doing it by hand has become harder over time. I've finally decided to use a proper backup solution.

Going through the wiki (https://wiki.archlinux.org/index.php/Sy … p_programs) I discovered git annex and loved its logic. It looked promising, but it is written in Haskell and is very slow at adding files: I tested it by copying some files into a test directory and indexing them, and it took 43 seconds to index roughly 2,000 files, while my archive contains ~80,000. I'm also unhappy with the number of dependencies git annex pulls in. So I would like some recommendations and to hear about your experiences. I do have some constraints to consider, though:

- No archived storage formats: I want to access my files directly, as normal files on a disk drive.
- No Linux-only filesystems: my drive is NTFS, and I must be able to access the files from Windows.
- No automatic syncing: this is important. I want absolute control over when and which files get synced. git annex provides this; I can look up changes via git annex info and push/sync them manually.
- Multi- (or at least bi-)directional syncing: I want to sync some files from my drive while syncing others to it.
- Do-no-harm conflict resolution: when there are conflicting files, the backup solution must refrain from overwriting anything. git annex creates renamed copies in this situation.
- Solutions written in C or C++ are a plus for performance reasons; software written in those languages also tends to use fewer libraries/packages.
- The idea of using git as a backend engine is somewhat appealing to me, but it's not a must-have.


Keeping a history of files is unnecessary for me, as long as the latest version is preserved and can be synced over the old ones.


#2 2018-01-18 01:03:25

Trilby
Inspector Parrot
Registered: 2011-11-29
Posts: 29,523
Website

Re: Backup Solution Recommendations

6ng4n wrote:

- Do-no-harm conflict resolution: when there are conflicting files, the backup solution must refrain from overwriting anything. git annex creates renamed copies in this situation...

Keeping a history of files is unnecessary for me, as long as the latest version is preserved and can be synced over the old ones.

These two criteria are contradictory.

But in any case, my recommendation would still be rsync: just read the man page to pick the parameters for the result you want.


"UNIX is simple and coherent..." - Dennis Ritchie, "GNU's Not UNIX" -  Richard Stallman


#3 2018-01-18 09:34:51

seth
Member
Registered: 2012-09-03
Posts: 51,064

Re: Backup Solution Recommendations

cp -al "$LAST_BAK" "$NEW_BAK" # hardlink copy of last snapshot
rsync -aAX --delete "$BACKUP_SRC" "$NEW_BAK" # update new snapshot

NTFS supports hardlinks, but I don't know whether ntfs-3g does.
"date" helps with the creation of snapshot directory names.

git handles binary files poorly; if you want to use version control, use svn instead.


#4 2018-01-18 18:32:39

6ng4n
Member
Registered: 2012-02-07
Posts: 84

Re: Backup Solution Recommendations

Trilby wrote:
6ng4n wrote:

- Do-no-harm conflict resolution: when there are conflicting files, the backup solution must refrain from overwriting anything. git annex creates renamed copies in this situation...

Keeping a history of files is unnecessary for me, as long as the latest version is preserved and can be synced over the old ones.

These two criteria are contradictory.

But in any case, my recommendation would still be rsync: just read the man page to pick the parameters for the result you want.


I wasn't clear about what I meant by history. I don't want to keep old (full) versions, but I'd like to be able to see whether anything has changed externally. It's like keeping a history of file hashes, but not of the files themselves: the backup software can detect that something has changed without keeping the exact versions, and I can then choose to keep a file or overwrite it when syncing. Since most of the files are binary, keeping their history is unnecessary.


#5 2018-01-18 21:00:28

Trilby
Inspector Parrot
Registered: 2011-11-29
Posts: 29,523
Website

Re: Backup Solution Recommendations

Ah, that makes sense. There'd really be no need to keep a history of hashes, though. Just check the hash of files with the same name on each sync: if they match, it doesn't matter which you keep, as there is no need to overwrite anything. If they don't match, you'd be prompted for the choice.
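A toy sketch of that check in shell (`find_conflicts` is a made-up name, and it compares file contents directly with `cmp` rather than computing hashes):

```shell
# find_conflicts SRC DST: print files that exist on both sides but differ,
# i.e. the ones that would need a manual keep-or-overwrite decision.
find_conflicts() {
    src=$1; dst=$2
    ( cd "$src" && find . -type f ) | while read -r f; do
        if [ -f "$dst/$f" ] && ! cmp -s "$src/$f" "$dst/$f"; then
            printf '%s\n' "$f"      # contents differ: needs a decision
        fi
    done
}
```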

Rsync would still work for this, as it has a "dry run" mode that lists which files *would* be copied in an incremental backup.
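For example (wrapped in a made-up helper function; -a is archive mode, -n is --dry-run, -i itemizes what would change, --delete mirrors deletions):

```shell
# preview_sync SRC DST: list what rsync *would* transfer, copying nothing.
preview_sync() {
    rsync -ani --delete "$1/" "$2/"
}
```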


"UNIX is simple and coherent..." - Dennis Ritchie, "GNU's Not UNIX" -  Richard Stallman


#6 2018-01-19 03:46:26

NoSuck
Member
Registered: 2015-03-04
Posts: 157
Website

Re: Backup Solution Recommendations

I use rsync because my destination drives are ZFS pools, and ZFS provides snapshots.  If I were backing up to NTFS, I would consider using borg instead.



Powered by FluxBB