There are quite a number of S3 file systems around. The following table attempts to give an overview. Obviously, it is biased in favor of S3QL because it mainly lists the reasons why the author chose to write a new file system instead of using one of the existing ones.
Please don't hesitate to submit corrections or additions; I hope that this table will become less biased over time.
Some more S3 file systems are also listed under Related Projects.
There are basically three different types of S3 file systems:
- Block Based file systems expose S3 as a single block device, which can then be formatted with an ordinary file system. These file systems are conceptually simple, but good performance is difficult to achieve because they work at a very low level.
- 1:1 file systems save each file in a single S3 object. This has the advantage that the files can be accessed with other S3 tools as well. The disadvantage is that only very basic functionality can be implemented.
- Native file systems provide the complete set of Unix features. They operate at a high level and can be tailored exactly to the requirements. However, this also makes them complex, and it is very difficult to retrieve the stored data with any other S3 tool.
|  | S3QL | PersistentFS | S3FS | S3FSLite | SubCloud | S3Backer | ElasticDrive |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Type | Native | Native | 1:1 | 1:1 | 1:1 | Block Based | Block Based |
| File Size Limit | unlimited | ? | 5 GB | 5 GB | 5 GB | unlimited | unlimited |
| File System Size | dynamic | ? | dynamic | dynamic | dynamic | fixed | fixed |
| License | Open Source | Commercial | Open Source | Open Source | Commercial | Open Source | Commercial |
| Compression | Yes | No | No | No | Yes | Yes | ? |
| Encryption | Yes | ? | No | No | Yes | Via dm-crypt | Via dm-crypt |
| Snapshots / Copy-On-Write | Yes | No | No | No | No | Via LVM | Via LVM |
| Data De-Duplication | Yes | No | No | No | No | No | No |
| Unix Attributes | Yes | ? | ? | ? | ? | Yes | Yes |
| Hardlink Support | Yes | ? | No | No | No | Yes | Yes |
| Symlink Support | Yes | ? | No | No | No | Yes | Yes |
| Directory Rename Support | Yes | Yes | No | Partial | Partial | Yes | Yes |
| Block size | configurable | ? | file | file | file | configurable | configurable |
| Multiple Mounts | No | Yes | No | No | Yes | No | No |
| Eventual consistency handling | Yes | ? | Yes | ? | ? | ? | ? |
Notes
- Directory Rename Support: partial means that renaming a directory requires copying all contained files and directories, so renaming a directory may take a very long time.
- Block size: file means that files can only be transferred in one piece, i.e. changing a few bytes in a 1 GB file means that the whole file has to be uploaded again.
- Eventual consistency handling refers to the fact that after an object has been uploaded to Amazon S3, downloading the object may still return the old data that was supposedly overwritten. Robust file systems need to be able to handle this properly.
- Multiple Mounts indicates whether the same file system can be mounted on several computers at the same time, as is possible with e.g. NFS or CIFS.
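To make the "Block size" note more concrete, here is a minimal sketch of the arithmetic behind it. It is not taken from any of the file systems listed above: the function name `bytes_to_upload` and the 10 MiB block size are assumptions chosen purely for illustration, and the model ignores compression, de-duplication and caching.

```python
def bytes_to_upload(file_size, offset, length, block_size=None):
    """Estimate how many bytes must be re-uploaded after modifying
    `length` bytes at `offset` in a file of `file_size` bytes.

    block_size=None models "block size: file" (one object per file);
    an integer models a file system that splits files into blocks.
    Hypothetical helper for illustration only.
    """
    if length <= 0:
        return 0
    if offset < 0 or offset + length > file_size:
        raise ValueError("write must lie inside the file")
    if block_size is None:
        # The whole file is a single object, so all of it is sent again.
        return file_size
    first = offset // block_size
    last = (offset + length - 1) // block_size
    total = 0
    for index in range(first, last + 1):
        start = index * block_size
        end = min(start + block_size, file_size)  # last block may be short
        total += end - start
    return total


MiB = 1024 ** 2
GiB = 1024 ** 3

# Change 16 bytes in the middle of a 1 GiB file.
whole_file = bytes_to_upload(GiB, 500 * MiB, 16)
blocked = bytes_to_upload(GiB, 500 * MiB, 16, block_size=10 * MiB)
print(whole_file // MiB, "MiB vs", blocked // MiB, "MiB")
```

Under these assumptions, the one-object-per-file layout re-sends the full 1024 MiB, while the block-based layout re-sends only the single 10 MiB block that contains the change.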
Hi, it would be a good idea to compare with droplet fuse too; it's on GitHub at https://github.com/pozdnychev/dropletfs-fuse. It follows the 1:1 model with write caching.
Giorgio
Hi, another project to compare against would be http://opendedup.org/