
#1 2014-08-13 15:04:02

leonidb
Member
Registered: 2014-08-13
Posts: 1

lbcp - a Python script for backup in Amazon's S3

Hello Arch world,

First of all, I'd like to mention that although I'm not new to Linux, this is the first piece of code I have actually published. Most (if not all) of my Python experience is in numeric data analysis rather than actual programs, so please excuse me if I'm doing something wrong.

After looking for a backup program to match my requirements, I couldn't find any, so I wrote this.
My requirements were:
1) Locally encrypted, compressed, deduplicated (many existing options for these)
2) Open source (I used SpiderOak before, but it fails here, plus I experienced a few bugs in it)
3) Uploads entire files (not by parts) and doesn't require re-uploading everything once in a while (the otherwise excellent Duplicity fails here)

So this Python script compresses the files (zlib), encrypts them (pycrypto) and deduplicates at the file level (no two identical files are uploaded, but if two files have identical pieces in them, they are still uploaded whole, unlike in Duplicity). The S3 handling uses Boto, and the logging/listing of files is done with the help of Numpy.
The log file is in a human-readable format, and is saved to ~/.lbcp/lbcp_<device>.log by default.
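The file-level deduplication and compression described above can be sketched roughly as follows. This is only an illustration, not the actual lbcp code: the function name `plan_uploads`, the use of SHA-256 as the content fingerprint, and the in-memory dictionaries are all assumptions. The encryption step (pycrypto in the real script) and the S3 upload (Boto) are left out so the sketch needs only the standard library.

```python
import hashlib
import zlib


def plan_uploads(files):
    """Return (store, index) for a mapping of path -> raw bytes.

    store maps a content digest to the compressed bytes (one entry per
    unique content); index maps each path to the digest of its content.
    Hypothetical helper, not part of lbcp.
    """
    store = {}
    index = {}
    for path, data in files.items():
        # Assumption: SHA-256 of the whole file identifies its content.
        digest = hashlib.sha256(data).hexdigest()
        index[path] = digest
        if digest not in store:
            # Only the first file with this content is compressed and kept;
            # later identical files just point at the same digest.
            store[digest] = zlib.compress(data)
    return store, index


files = {
    "a.txt": b"hello" * 100,
    "b.txt": b"hello" * 100,  # identical to a.txt, stored only once
    "c.txt": b"world",
}
store, index = plan_uploads(files)
```

Two files that merely share some pieces get different digests, so both are stored whole, which matches the behaviour described above.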

Multiple "devices" can be backed up. By "devices" I mean both physical devices and different locations on the same device. For example, I have a home backup server with my laptop and my camera backed up to different locations on it, and I want to back up both of them to S3 from this server, so the different locations are in effect different devices.
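To illustrate how each "device" could get its own log file following the ~/.lbcp/lbcp_<device>.log pattern, here is a small sketch. The function name `log_path_for` and its `base_dir` parameter are hypothetical, and the real script may build the path differently.

```python
import os


def log_path_for(device, base_dir="~/.lbcp"):
    """Build the log file path for one device (hypothetical helper)."""
    # expanduser turns a leading "~" into the current user's home directory.
    return os.path.join(os.path.expanduser(base_dir), "lbcp_{}.log".format(device))


# Two locations on the same server treated as two separate devices.
laptop_log = log_path_for("laptop")
camera_log = log_path_for("camera")
```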

As this script is a quick & dirty solution to my immediate backup needs, it has some rough edges, but I intend to work on it over time. Features I'd like to add include the ability to search the backed-up files, browse them, and, as a lower priority, delete options.

Obviously, this script should be used IN ADDITION to a local backup (an external HD, or a server that you have physical access to) because (1) downloading data from S3 costs money and (2) it is always good to keep a physical copy of your data.

If anyone has backup needs similar to mine, and would like to test it, here it is on GitHub: https://github.com/blochl/lbcp

Also, any suggestions are welcome! ;)

Leonid.
