QUILLEN By Greg Nelson
WARNING: Until Quillen reaches v1.0, new versions may not be compatible with snapshots made with previous versions. When upgrading, first run the uninstall command and then run the install command (this deletes and re-creates Quillen's S3 buckets and SimpleDB domains). This will delete all previous snapshots!
Quillen backs up your important documents or other data to your Amazon S3 and SimpleDB account. Information about Amazon S3 and SimpleDB can be found at http://www.amazon.com/s3 and http://aws.amazon.com/simpledb/. You must obtain an AWS account from Amazon and enable S3 and SimpleDB prior to using Quillen.
Quillen is designed to be a simple command line tool that can be crontabbed and forgotten about.
Quillen talks directly to Amazon S3 and SimpleDB without any intermediate servers. You pay for your usage directly to Amazon.
When you back up a file or directory, Quillen places those files into a conceptual "snapshot". It is a full snapshot of those files at that point in time. A snapshot can later be restored to your local disk.
Quillen splits files into chunks of data of variable size. The chunks are determined by their contents, rather than fixed offsets into a file. This enables Quillen to keep data transfer and storage to a minimum since only chunks that have not been seen before will be uploaded. Chunks have an average size of 128K, although they can range from 2K - 5MB. This method of chunking also means that a full backup of a set of files that have already been placed in a snapshot does not result in re-uploading those files. And if the files have changed since they were last backed up, only those chunks that have changed need to be uploaded. The result is a new full snapshot. The full set of files can be restored from that snapshot. This is de-duplication between snapshots.
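The idea of content-determined chunk boundaries can be sketched as follows. This is an illustration only and not Quillen's actual algorithm: the rolling "gear" hash, the mask, and the size limits are all assumptions chosen to keep the example small (the real chunk sizes are the ones stated above), and the name split_chunks is hypothetical.

```python
import random

# Hypothetical parameters for illustration only.
MIN_SIZE = 64      # never cut a chunk shorter than this
MAX_SIZE = 4096    # always cut once a chunk reaches this length
MASK = 0xFF        # cut where the low 8 bits of the hash are zero

# Fixed table of pseudo-random 32-bit values, one per possible byte value.
_rng = random.Random(42)
GEAR = [_rng.getrandbits(32) for _ in range(256)]


def split_chunks(data):
    """Split data at positions chosen by a rolling hash of the content."""
    chunks = []
    start = 0
    h = 0
    for i, byte in enumerate(data):
        # The left shift pushes old bytes out of the 32-bit value, so the
        # hash depends on at most the 32 most recent bytes, not on the offset.
        h = ((h << 1) + GEAR[byte]) & 0xFFFFFFFF
        length = i - start + 1
        if (length >= MIN_SIZE and (h & MASK) == 0) or length >= MAX_SIZE:
            chunks.append(data[start:i + 1])
            start = i + 1
            h = 0
    if start < len(data):
        chunks.append(data[start:])   # trailing partial chunk
    return chunks
```

Because a cut depends only on the bytes just before it, inserting or removing data early in a file tends to disturb only nearby boundaries; later boundaries usually fall in the same places relative to the content.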
This method of chunking also enables de-duplication within snapshots. If two files are both backed up at the same time, and the two files share chunks of data, each of those chunks is only uploaded and stored once. Quillen keeps track of the fact that a chunk is referenced by multiple files.
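A minimal sketch of how chunk-level de-duplication can work, assuming chunks are identified by a hash of their contents. The README does not say how Quillen identifies chunks; SHA-256, the in-memory ChunkStore class, and the backup/restore helpers are all hypothetical stand-ins for the remote storage.

```python
import hashlib


class ChunkStore:
    """In-memory stand-in for remote storage, keyed by chunk content hash."""

    def __init__(self):
        self.blobs = {}     # digest -> chunk bytes, stored once
        self.uploads = 0    # number of chunks actually "uploaded"

    def put(self, chunk):
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in self.blobs:    # unseen content: upload it
            self.blobs[digest] = chunk
            self.uploads += 1
        return digest


def backup(files, store):
    """Return a snapshot: file name -> ordered list of chunk digests."""
    return {name: [store.put(c) for c in chunks]
            for name, chunks in files.items()}


def restore(snapshot, store):
    """Rebuild each file by concatenating its chunks in order."""
    return {name: b"".join(store.blobs[d] for d in digests)
            for name, digests in snapshot.items()}


store = ChunkStore()
files = {"bar": [b"shared", b"only-bar"], "baz": [b"shared", b"only-baz"]}
snapshot = backup(files, store)
print(store.uploads)   # the shared chunk is counted once
```

Each file keeps its own list of digests, so a chunk referenced by several files (or several snapshots) is stored once and can still be found when any of them is restored.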
Another advantage of this method is that Quillen can process and transfer multiple chunks in parallel, taking full advantage of the network connection. This results in faster backups than if the chunks were uploaded serially.
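The benefit of parallel transfer can be illustrated with a thread pool. This is a sketch under stated assumptions, not Quillen's code: upload_chunk is a hypothetical placeholder that only sleeps to mimic network latency, and the worker count of 8 is arbitrary.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def upload_chunk(chunk):
    """Hypothetical upload: sleeps briefly to mimic network latency."""
    time.sleep(0.01)
    return len(chunk)


def upload_all(chunks, workers=8):
    """Upload chunks concurrently and return the total bytes sent."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(upload_chunk, chunks))
```

With latency-bound uploads, several chunks in flight at once keep the connection busy while any single request is waiting on a response.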
Finally, because Quillen processes files in chunks, an error during transfer (process is killed, network connection goes away, power outage, etc.) doesn't mean that Quillen has to start all over again from the beginning. It will pick up where it left off and avoid re-uploading chunks it has already uploaded.
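Resuming after an interruption can be sketched as skipping every chunk that is already recorded as uploaded. How and where Quillen keeps that record is not described here; the set already_uploaded, the send callback, and the use of SHA-256 digests are assumptions made for illustration.

```python
import hashlib


def backup_chunks(chunks, already_uploaded, send):
    """Send only chunks not yet recorded; return how many were sent."""
    sent = 0
    for chunk in chunks:
        digest = hashlib.sha256(chunk).hexdigest()
        if digest in already_uploaded:
            continue                    # uploaded during an earlier run
        send(digest, chunk)             # may raise if the transfer fails
        already_uploaded.add(digest)    # record only after success
        sent += 1
    return sent
```

If send raises partway through, the chunks recorded so far stay in already_uploaded, so running the same backup again transfers only the remainder.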
EXAMPLE
A directory of three files is backed up to snapshot1: foo, bar, and baz. Each is 1GB in size. What's more, bar and baz share 500MB of content. Quillen splits each file into chunks and the result is 1GB + 1GB + 500MB of data transferred and stored (2.5GB rather than 3GB).
Now say file foo is edited and a single byte is inserted at the beginning of the file. With a fixed-offset chunking scheme, every chunk in the file would change! But in Quillen, only the first chunk changes. If that chunk was originally 128,000 bytes, it is now 128,001 bytes. The directory is backed up to snapshot2, and only that first chunk is transferred. The result is 1GB + 1GB + 500MB + 128K of data stored. At this point, you can restore snapshot1 and get the original set of files. You can restore snapshot2 and get the whole set of files with foo's edit. You can delete snapshot1 because it is out of date, and snapshot2 will still represent the whole set of files.
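The effect of the one-byte edit can be reproduced at a small scale. The sketch below is not Quillen's chunker: it uses a toy rule (end a chunk after every newline) as a stand-in for content-determined boundaries and 16-byte blocks for the fixed-offset scheme. All function names and the sample data are hypothetical.

```python
def fixed_chunks(data, size=16):
    """Cut at fixed offsets: every `size` bytes."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def content_chunks(data, delimiter=b"\n"):
    """Toy content-defined cut: end a chunk after each delimiter byte."""
    parts = data.split(delimiter)
    chunks = [p + delimiter for p in parts[:-1]]
    if parts[-1]:
        chunks.append(parts[-1])    # data did not end with the delimiter
    return chunks


def changed(old_chunks, new_chunks):
    """Count chunks in the new version that did not exist in the old one."""
    seen = set(old_chunks)
    return sum(1 for c in new_chunks if c not in seen)


original = b"".join(b"record %04d\n" % i for i in range(1000))
edited = b"X" + original    # one byte inserted at the very beginning

fixed_changed = changed(fixed_chunks(original), fixed_chunks(edited))
content_changed = changed(content_chunks(original), content_chunks(edited))
print(fixed_changed, "of", len(fixed_chunks(edited)), "fixed-offset chunks changed")
print(content_changed, "of", len(content_chunks(edited)), "content-defined chunks changed")
```

With fixed offsets the inserted byte shifts every later block, so nothing from the earlier backup can be reused; with content-determined boundaries only the chunk containing the edit is new.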
Note that the amount of data transferred and stored would actually be less if this data were text, since Quillen uses gzip compression.
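A quick way to see why text benefits is to compress a text-like buffer and a random one of the same size with Python's standard gzip module. The sample data is made up, and whether Quillen compresses each chunk separately is an assumption; the README only states that gzip is used.

```python
import gzip
import os

# 132,000 bytes of highly repetitive text versus random bytes of equal length.
text_chunk = b"the quick brown fox jumps over the lazy dog\n" * 3000
random_chunk = os.urandom(len(text_chunk))

text_size = len(gzip.compress(text_chunk))
random_size = len(gzip.compress(random_chunk))

print("text:  ", len(text_chunk), "->", text_size, "bytes")
print("random:", len(random_chunk), "->", random_size, "bytes")
```

Real documents fall somewhere between these two extremes; already-compressed formats such as JPEG or ZIP archives behave much like the random case.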
INSTALLATION
- Make sure you have Java 1.6 or higher installed on your machine.
- Expand quillen-0.4.zip, which includes quillen-0.4.jar and all its dependencies.
- Edit the file called quillen.properties and fill it in with your AWS credentials.
- Tell Quillen to create the necessary S3 buckets and SimpleDB domains (it will create 2 buckets and 3 domains):
java -jar quillen-0.4.jar -command install 2> quillen.log
BASIC USAGE
- Back up the directory /home/greg/backmeup:
java -jar quillen-0.4.jar -command backup -base /home/greg -path backmeup 2> quillen.log
- Back up the directory /home/greg/backmeup using the UNIX find utility:
find /home/greg/backmeup | java -jar quillen-0.4.jar -command backup 2> quillen.log
- List snapshots:
java -jar quillen-0.4.jar -command list 2> quillen.log
- List contents of snapshot1:
java -jar quillen-0.4.jar -command list -snapshot snapshot1 2> quillen.log
- Restore snapshot1:
java -jar quillen-0.4.jar -command restore -snapshot snapshot1 -base /home/greg/restored 2> quillen.log
- Delete snapshot1:
java -jar quillen-0.4.jar -command delete -snapshot snapshot1 2> quillen.log
ADVANCED USAGE
- Back up the directory /home/greg/docs to snapshot docs-0001 (instead of letting Quillen auto-generate a snapshot name):
java -jar quillen-0.4.jar -command backup -base /home/greg -path docs -snapshot docs-0001 2> quillen.log
- List contents of multiple snapshots:
java -jar quillen-0.4.jar -command list -snapshot docs-0001 docs-0002 docs-0003 2> quillen.log
- Delete multiple snapshots:
java -jar quillen-0.4.jar -command delete -snapshot docs-0001 docs-0002 docs-0003 2> quillen.log
- Delete all snapshots starting with "docs":
java -jar quillen-0.4.jar -command delete -prefix docs 2> quillen.log
- Delete all snapshots that were backed up before Feb/13/2009 23:31:30:
java -jar quillen-0.4.jar -command delete -date 20090213233130 2> quillen.log
- Delete all snapshots starting with "docs" that were backed up before Feb/13/2009 23:31:30:
java -jar quillen-0.4.jar -command delete -prefix docs -date 20090213233130 2> quillen.log
- Restore all files in snapshot docs-0001 with file names that start with docs/dir1/foo:
java -jar quillen-0.4.jar -command restore -snapshot docs-0001 -prefix docs/dir1/foo 2> quillen.log
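For scheduled pruning, the -date argument can be generated rather than typed by hand. The sketch below assumes the value is year, month, day, hour, minute, second with no separators, as the example 20090213233130 for Feb/13/2009 23:31:30 suggests; which time zone Quillen expects is not stated, so that is left to the caller. The helper names are hypothetical, and the function only builds the argument list from the flags shown above without running anything.

```python
from datetime import datetime, timedelta


def date_stamp(moment):
    """Format a datetime like the example value 20090213233130."""
    return moment.strftime("%Y%m%d%H%M%S")


def delete_older_than(days, prefix, now):
    """Build the delete command for snapshots older than `days` days."""
    cutoff = now - timedelta(days=days)
    return ["java", "-jar", "quillen-0.4.jar", "-command", "delete",
            "-prefix", prefix, "-date", date_stamp(cutoff)]


print(" ".join(delete_older_than(30, "docs", datetime(2009, 3, 15, 23, 31, 30))))
```

The returned list can be handed to subprocess.run from a cron-driven script, with stderr redirected to quillen.log as in the commands above.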
UNINSTALLATION
Simply tell Quillen to delete the S3 buckets and SimpleDB domains it created. This may take some time, especially if you have a lot of data backed up. Please note that this will delete all snapshots!
java -jar quillen-0.4.jar -command uninstall 2> quillen.log