Updated Nov 21, 2008 by rrizun
Labels: Featured
FuseOverAmazon  

FUSE-based file system backed by Amazon S3

For a commercially supported version with additional features see http://www.subcloud.com

What's New

  • r177 fixed stale curl handle issue; fixed 100% cpu issue
  • r166 case insensitive mime type lookup
  • r152 (May 8, 2008)
    • uses x-amz-copy-source; "correct" content-type lookup via /etc/mime.types; symlinks!

Overview

s3fs is a fuse filesystem that allows you to mount an Amazon S3 bucket as a local filesystem. It stores files natively and transparently in S3 (i.e., you can use other programs to access the same files). Maximum file size=5G.

s3fs is stable and is being used in a number of production environments, e.g., rsync backup to s3.

To use it:

  1. get an amazon s3 account!
  2. download the source, compile it (I've used fc5/ppc, f7/i386, f9/x86, f9/x64 and Mac OS X 10.4) and slap the binary in, say, /usr/bin/s3fs
    1. you'll need at least fuse-2.6
    2. for fedora probably need to do: yum install fuse-devel
    3. for ubuntu probably need to do something like: sudo apt-get install libfuse-dev
  3. do this:
/usr/bin/s3fs mybucket -o accessKeyId=aaa -o secretAccessKey=bbb /mnt

That's it! the contents of your amazon bucket "mybucket" should now be accessible read/write in /mnt

If you don't like specifying your secretAccessKey on the command line then you can create a file "/etc/passwd-s3fs" with a line containing an accessKeyId:secretAccessKey pair. Then the command line becomes simply:

/usr/bin/s3fs mybucket /mnt

You can have more than one set of credentials (i.e., credentials for more than one amazon s3 account) in /etc/passwd-s3fs in which case you'll have to specify -o accessKeyId=aaa on the command line.
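
The credential lookup described above can be sketched roughly as follows. This is only an illustration in Python, not the actual s3fs C++ code: the function names parse_passwd_s3fs and pick_credentials are hypothetical, and ignoring blank lines is an assumption.

```python
def parse_passwd_s3fs(text):
    """Parse passwd-s3fs content: one accessKeyId:secretAccessKey pair per line."""
    creds = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # assumption: blank lines are ignored
        access_key_id, sep, secret = line.partition(":")
        if not sep:
            # Do not echo the line itself: it may contain a secret.
            raise ValueError("expected accessKeyId:secretAccessKey")
        creds[access_key_id] = secret
    return creds


def pick_credentials(creds, access_key_id=None):
    """Return the (accessKeyId, secretAccessKey) pair to use for a mount."""
    if not creds:
        raise ValueError("no credentials found")
    if access_key_id is not None:
        # Corresponds to passing -o accessKeyId=aaa on the command line.
        return access_key_id, creds[access_key_id]
    if len(creds) == 1:
        return next(iter(creds.items()))
    raise ValueError("more than one set of credentials; specify accessKeyId")
```

With a single line in the file no accessKeyId is needed; with several lines the caller has to name one, which mirrors the -o accessKeyId=aaa requirement.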

s3fs supports mode (e.g., chmod), mtime (e.g., touch) and uid/gid (chown). s3fs stores the values in x-amz-meta custom meta headers and uses x-amz-copy-source to change them efficiently, rather than doing "brute-force" re-uploads of s3 objects whenever mode and/or mtime changes.

s3fs has a caching mechanism: you can enable local file caching to minimize downloads, e.g.:

/usr/bin/s3fs mybucket /mnt -ouse_cache=/tmp

Hosting a cvsroot on s3 works! Although you probably don't really want to do it in practice. E.g., cvs -d /s3/cvsroot init. Incredibly, mysqld also works, although I doubt you really wanna do that in practice! =)

Before svn 43, using rsync with an s3 volume as the destination didn't quite work because of timestamp issues: s3fs did not support changing timestamps on files. It would copy files, but the timestamps would just be the current time (rsync would complain about not being able to set timestamps but would continue).

s3fs works with rsync! (as of svn 43) Due to the way FUSE works and s3fs' "brute-force" support of mode (chmod) and mtime (touch), upon first sync, files are downloaded/uploaded more than once (because rsync does (a) chmod (b) touch and (c) rename); however, subsequent rsyncs are pretty much as fast as can be. If that's too much downloading/uploading for ya then try using the "use_cache" option to enable the local file cache... it will definitely minimize the number of downloads. As of r152, s3fs uses x-amz-copy-source for efficient update of mode, mtime and uid/gid.

s3fs will retry s3 transactions on certain error conditions. The default retry count is 2, i.e., s3fs will make 2 retries per s3 transaction (for a total of 3 attempts: 1st attempt + 2 retries) before giving up. You can set the retry count by using the "retries" option, e.g., "-oretries=2".
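
The retry arithmetic above (1 attempt + N retries) can be illustrated with a small Python sketch. This is not the s3fs implementation: with_retries is a hypothetical helper, and treating IOError as the retryable condition is an assumption standing in for the "certain error conditions" (such as a 500 server error) that s3fs actually retries on.

```python
def with_retries(transaction, retries=2):
    """Call transaction() up to 1 + retries times and return its first result."""
    if retries < 0:
        raise ValueError("retries must be >= 0")
    last_error = None
    for _attempt in range(1 + retries):  # 1st attempt + `retries` retries
        try:
            return transaction()
        except IOError as err:  # assumption: retryable failures raise IOError
            last_error = err
    raise last_error  # every attempt failed: give up
```

With the default of retries=2, a transaction that keeps failing is attempted three times before the error is reported to the caller.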

Options

  • prefix (default="") (coming soon!)
    • a prefix to prepend to all s3 object names
  • retries (default="2")
    • number of times to retry a failed s3 transaction
  • use_cache (default="" which means disabled)
    • local folder to use for local file cache
  • connect_timeout (default="2" seconds)
    • time to wait for connection before giving up
  • readwrite_timeout (default="10" seconds)
    • time to wait between read/write activity before giving up

Details

If enabled via "use_cache" option, s3fs automatically maintains a local cache of files in the folder specified by use_cache. Whenever s3fs needs to read or write a file on s3 it first downloads the entire file locally to the folder specified by use_cache and operates on it. When fuse release() is called, s3fs will re-upload the file to s3 if it has been changed. s3fs uses md5 checksums to minimize downloads from s3.

The folder specified by use_cache is just a local cache. It can be deleted at any time. s3fs re-builds it on demand.

New for svn 43: Local file cache is disabled and I might not bring it back. I originally added local file cache thinking it would help for rsync (and createrepo). It turns out rsync works reasonably well without it. For createrepo, just rsync back and forth!

s3fs supports chmod (mode) and touch (mtime) by virtue of "x-amz-meta-mode" and "x-amz-meta-mtime" custom meta headers. Originally these were supported in a brute-force manner: changing any x-amz-meta header requires re-uploading the s3 object, so when changing mode or mtime, s3fs would download the s3 object, change the meta header(s) and re-upload the s3 object. Ditto for file rename. As of r149, s3fs uses x-amz-copy-source, which means that s3fs no longer needs to operate in a brute-force manner; it is much faster now (one minor performance-related corner case is left to solve: /usr/bin/touch).
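
As a rough illustration of the copy-based approach, the Python sketch below builds the headers for a PUT that copies an object onto itself while replacing its metadata. It is not taken from s3fs: copy_headers is a hypothetical name, storing mode and mtime as decimal strings is an assumption, and request signing is left out entirely. The x-amz-metadata-directive header is part of the S3 copy operation and tells S3 to use the metadata supplied with the request instead of copying the old metadata.

```python
from urllib.parse import quote


def copy_headers(bucket, key, mode, mtime):
    """Headers for a PUT that copies an object onto itself with new metadata."""
    return {
        # Source and destination are the same object; no data is re-uploaded.
        "x-amz-copy-source": "/%s/%s" % (bucket, quote(key)),
        # Replace the stored metadata instead of copying it from the source.
        "x-amz-metadata-directive": "REPLACE",
        # Assumption: values are stored as decimal strings.
        "x-amz-meta-mode": str(mode),
        "x-amz-meta-mtime": str(int(mtime)),
    }
```

For example, copy_headers("mybucket", "dir/file.txt", 0o644, 1203292800) describes a chmod to 644 with no download or upload of the object body.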

Local file caching works by calculating and comparing md5 checksums (ETag HTTP header).
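
A minimal Python sketch of that comparison is shown here. It assumes the ETag returned by S3 is the hex md5 of the object body, which holds for objects uploaded with a single plain PUT; file_md5 and cache_is_fresh are hypothetical names, not functions from s3fs.

```python
import hashlib
import os


def file_md5(path, chunk_size=1 << 20):
    """Hex md5 of a file, read in chunks to avoid large memory buffers."""
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
    return md5.hexdigest()


def cache_is_fresh(cached_path, etag):
    """True if the cached copy matches the ETag, so no download is needed."""
    if not os.path.isfile(cached_path):
        return False
    # ETag values are usually wrapped in double quotes in the HTTP header.
    return file_md5(cached_path) == etag.strip('"').lower()
```

When cache_is_fresh returns False the file would be downloaded again into the cache folder; when it returns True the local copy can be used as-is.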

Before r152, all s3 objects written by s3fs had a Content-Type of either "application/octet-stream" or "application/x-directory". As of r152, s3fs leverages /etc/mime.types to "guess" the "correct" content-type based on file name extension. This means that you can copy a website to s3 and serve it up directly from s3 with correct content-types!
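
The extension-based lookup can be sketched in Python as below. This is an illustration only, not the s3fs code: load_mime_types and guess_content_type are hypothetical names, and the sketch assumes the usual mime.types layout of a content type followed by its extensions, with "#" starting a comment. Extensions are lower-cased on both sides, in the spirit of the case-insensitive lookup added in r166.

```python
import os


def load_mime_types(text):
    """Map lower-cased extensions to content types from mime.types-style text."""
    table = {}
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments, split on spaces
        if len(fields) < 2:
            continue  # blank line, comment, or a type with no extensions
        content_type, extensions = fields[0], fields[1:]
        for ext in extensions:
            table.setdefault(ext.lower(), content_type)
    return table


def guess_content_type(name, table):
    """Content type for a file name, falling back to application/octet-stream."""
    ext = os.path.splitext(name)[1].lstrip(".").lower()
    return table.get(ext, "application/octet-stream")
```

In practice the text would come from reading /etc/mime.types, e.g. load_mime_types(open("/etc/mime.types").read()).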

Release Notes

  • r166
    • case-insensitive lookup of content-type from /etc/mime.types
  • r152
    • added support for symlinks... ln -s works!
  • r151
    • use /etc/mime.types
  • r150
    • added support for uid/gid... chown works!
  • r149
    • support x-amz-copy-source... rsync much faster now!
  • r145
    • log svn version at startup via syslog /var/log/messages
  • r141
    • added "url" runtime parameter
  • r136, r138
    • connect_timeout and readwrite_timeout
  • r130
    • set uid/gid to whatever getuid()/getgid() returns
    • log some stuff to syslog (i.e., /var/log/messages)
    • fixed  issue 14  (local file cache bug; fixed cp, rsync, etc...)
  • r117
    • limit max-keys=20 (workaround for libcurl's 100% cpu issue?!?)
  • r116
    • added crypto locking
  • r114
    • curl_global_init
  • r107
    • use CURLOPT_NOSIGNAL
  • r106
    • rewind on retry
  • r105
    • only send x-amz-acl and x-amz-meta headers
  • r101, r102, r103
    • fixed curl_multi_timeout bug (found on mac)
  • r99
    • added "default_acl" option
  • r92
    • parallel-ized readdir(): getting a directory listing is now a lot faster
  • r88
    • removed 10s read timeout that should not have been introduced
  • r72 2008-02-18
    • use_cache now takes path to local file cache folder, e.g., /usr/bin/s3fs mybucket /s3 -ouse_cache=/tmp
  • r66 2008-02-18
    • local file cache is back! however, it is disabled by default... use "use_cache" option, e.g., /usr/bin/s3fs mybucket /s3 -ouse_cache=1
  • r57 2008-02-18
    • a few bug fixes:
      • touch x-amz-meta-mtime in flush()
      • use INFILE_LARGE (libcurl) (found on fc5/ppc)
    • tidyup
  • r43 2008-02-17
    • mode (i.e., chmod), mtime and deep rename! rsync now works!
    • temporarily disabled local file cache (might not bring it back!)
  • r28 2007-12-15
    • retry on 500 server error
  • r27 2007-12-15
    • file-based (instead of memory-based)
      • this means that s3fs will no longer allocate large memory buffers when writing files to s3

Faq

Limitations

ToDo

  • support brute-force rename (fixed in svn 43)
  • get symlinks working? (added in r152)
    • this would bog down performance: would have to do deep getattr() for every single object (already doing this in svn 43... it's not too bad!)
  • make install target
  • get "-h" help working
  • handle utime so that rsync works! (fixed in svn 43!)
    • probably a bad idea after all...
    • actually don't think it can be done: can't specify arbitrary create-time for PUT
  • chmod support... acl
  • permissions: using -o allow_other, even though files are owned by root 0755, another user can make changes
    • use default_permissions option?!?
  • better error logging for troubleshooting, e.g., syslog...
    • need to parse response on, say, 403 and 404 errors, etc... and log 'em!
  • use temporary file for flush() and then stream it to amazon

See Also

Here is a list of other Amazon S3 filesystems:


Comment by rsaccon, Oct 13, 2007

anybody troubleshooted this to get it compiling on Mac OSX ?

I have MacFuse 0.4 and get the following errors:

Package fuse was not found in the pkg-config search path.
Perhaps you should add the directory containing `fuse.pc'
to the PKG_CONFIG_PATH environment variable
No package 'fuse' found
g++ -Wall -lcurl -I/opt/local/include/libxml2 -I/opt/local/include -L/opt/local/lib -lxml2 -lz -lpthread -L/opt/local/lib -liconv -lm -ggdb s3fs.cpp -o s3fs
In file included from /usr/local/include/fuse/fuse.h:23,
                 from /usr/local/include/fuse.h:9,
                 from s3fs.cpp:23:
/usr/local/include/fuse/fuse_common.h:30:2: error: #error Please add -D_FILE_OFFSET_BITS=64 to your compile flags!
s3fs.cpp: In function 'int s3fs_getattr(const char, stat)':
s3fs.cpp:279: error: expected type-specifier before 'off_t'
s3fs.cpp:279: error: expected `>' before 'off_t'
s3fs.cpp:279: error: expected `(' before 'off_t'
s3fs.cpp:279: error: 'off_t' was not declared in this scope
s3fs.cpp:279: error: expected `)' before ';' token
s3fs.cpp: In function 'int s3fs_read(const char, char, size_t, off_t, fuse_file_info)':
s3fs.cpp:495: warning: format '%u' expects type 'unsigned int', but argument 3 has type 'size_t'
s3fs.cpp:495: warning: format '%u' expects type 'unsigned int', but argument 4 has type 'size_t'
make: *** [all] Error 1

Comment by rsaccon, Oct 13, 2007

I started a discussion and got a step further, but it's still not working: http://groups.google.com/group/macfuse-devel/browse_thread/thread/4de259075741370a

Comment by clearskysnet, Oct 27, 2007
Comment by contact.alexkuo, Nov 26, 2007

Note to Ubuntu Newbies - Compiling in Ubuntu

If you're getting errors related to missing 'libxml' or 'curl', you need the following C libraries and tools in order to compile: curl, fuse, build-essential, libxml2, and openssl.

To get these, type at the prompt:

sudo apt-get install build-essential libcurl4-openssl-dev libxml2-dev libfuse-dev

Then, in the directory where you downloaded the repository (trunk/s3fs/), type the following in the shell to compile the s3fs.cpp file:

make
Comment by tleasure, Dec 06, 2007

First, thanks for this. Is there any progress or ETA on s3fs not using memory when writing files to S3? It says it could easily be fixed, I just wanted to check. Thanks!

Comment by rrizun, Dec 08, 2007

yup, I've definitely made progress on the "not using memory when writing files" scheme... the scheme essentially caches files locally based on their md5 checksum... I'll try to get something checked in by the end-of-the-week...!

Comment by rviswanadha, Dec 13, 2007

Hi, I compiled and ran the code. Two problems:

1. ls on the mounted directory causes s3fs to crash:

s3fs?# ll /mnt/test-images/
ls: reading directory /mnt/test-images/: Transport endpoint is not connected
total 0
?--------- ? ? ? ? ? itemUpload$folder$

2. What is the format of /etc/password-s3fs? I tried accessKey:secretKeyID, but it did not work.

Comment by rrizun, Dec 15, 2007

Hi rviswanadha- Another user reported a similar issue... however I have not been able to see this behavior myself; which linux/version are you running?

also, the format for passwd-s3fs is simply one line per set of credentials, separated by a colon (essentially just like you wrote), e.g., accessKeyId:secretAccessKey

Comment by rrizun, Dec 15, 2007

rviswanadha- as well, be sure the file is called "passwd-s3fs" (not "password-s3fs")

Comment by rrizun, Dec 15, 2007

rviswanadha- ah! there was a mistake in the wiki page... fixed! the file is "/etc/passwd-s3fs"

Comment by rviswanadha, Dec 18, 2007

Hi RRizun, thanks for responding. I am using the default Redhat LAMP AMI from Amazon. First I installed all the required libraries and make tools

  1. yum install fuse fuse-devel python-devel
  2. yum install libxml2-devel
  3. yum -y install curl-devel
  4. yum install gcc
  5. rpm --import /etc/pki/rpm-gpg/RPM-GPG-KEY
  6. yum install libtool
  7. wget http://mirror.cogentco.com/pub/linux/fedora/linux/core/updates/4/i386/libstdc++-devel-4.0.2-8.fc4.i386.rpm
  8. wget http://mirror.cogentco.com/pub/linux/fedora/linux/core/updates/4/i386/gcc-c++-4.0.2-8.fc4.i386.rpm
  9. rpm -i libstdc++-devel-4.0.2-8.fc4.i386.rpm
  10. rpm -i gcc-c++-4.0.2-8.fc4.i386.rpm

After that I compiled s3fuse code and mounted the drive.

Comment by jimbosander, Dec 18, 2007

Great Cookbook rviswanadha! RRizun, is there any way of mounting an existing path? say if I have a "prefix" under a bucket, <bucket>:/my/path/here shouldn't "s3fs <bucket>:/my/path/here <mntpt>" work?

Comment by rrizun, Dec 19, 2007

Hi jimbosander- currently, no, but that appears to be an easy feature to add... i'll add an issue to track it

Comment by jorang, Dec 22, 2007

Anyone tried to use s3fs to make EC2 (MySQL and Apache) use files directly from S3? I read somewhere that the connection between EC2 and S3 would create problems in such a solution. According to your experience, what could be reasonable to accomplish with EC2, S3 and s3fs when having a high-load website? Direct use, minute2minute backup, hourly backup, ...?

Comment by rrizun, Dec 24, 2007

Hi Jorang- I'm assuming your primary interest is recovering any data in case of disaster, e.g., across instance shutdowns

in my opinion for small mysql datasets it is reasonable to add a simple cron job to do a mysql dump to s3 (hourly, daily, whatever is acceptable to you); I've done this in the past myself (daily, because 'important' data changed rarely); using mysql innodb (instead of myisam) should minimize any database contention when the backup script is run

if you have a large dataset and/or your dataset changes often then maybe amazon simpledb might be a better route

I've recently checked in a version of s3fs that adds local file caching, so, for example, you could configure apache to serve files directly from a mounted s3fs volume; with the local file caching, with the exception of the initial warming up of the cache, you might be able to get decent local-file performance (though I have not tried this setup myself yet so I have no real world experience) (you definitely would not want to run mysql in this fashion!)

hope that helps!

Comment by miradu, Jan 03, 2008

Not sure if I'm allowed to distribute this, but here's a binary of the above source compiled on Mac OS X 10.4.10 (link below; expires March 1st, 2008), hosted on s3, of course ;) I followed the instructions in the above linked thread, which are also more cleanly duplicated here: http://www.rsaccon.com/2007/10/mount-amazon-s3-on-your-mac.html . The one thing I needed to also do to make it all work was to add /opt/local/bin to my PATH.

https://miradu.s3.amazonaws.com/s3fs?AWSAccessKeyId=0GWZZ6FN6895K2ETZ602&Expires=1204351195&Signature=LR8atfPW0XBjPMXIlXPLNZtAdIg%3D

Cheers,

-Michael Ducker miradu@miradu.com

Comment by rrizun, Jan 08, 2008

Thanks, Michael!

Comment by donnyspi, Jan 22, 2008

When I run s3fs as root and a dir gets mounted, only root can access the data in the s3 bucket. When I used to run fuse manually, i'd add the -o allow_others option. How can I get apache to read from the mounted dir?

Comment by rrizun, Jan 22, 2008

"allow_other" works fine (I noticed you typed "allow_others" in your comment, i.e., pluralized... perhaps a typo?!?) so, either:

/usr/bin/s3fs mybucket /mnt -o allow_other

or in /etc/fstab:

s3fs#mybucket /mnt fuse allow_other,accessKeyId=aaa,secretAccessKey=bbb 0 0

Comment by donnyspi, Jan 22, 2008

That's it! Thanks!

Comment by kiizaa, Jan 26, 2008

Hello,

I installed the dependencies that contact.alexkuo recommended

sudo apt-get install build-essential libcurl4-openssl-dev libxml2-dev libfuse-dev 

But I am getting these errors while issuing make on Ubuntu 6.10

many thanks for any help

$make
g++ -Wall -D_FILE_OFFSET_BITS=64 -I/usr/include/fuse  -lfuse -lpthread   -lcurl -I/usr/include/libxml2 -L/usr/lib -lxml2 -lssl -ggdb s3fs.cpp -o s3fs
s3fs.cpp: In function ‘int main(int, char**)’:
s3fs.cpp:1072: error: invalid conversion from ‘void* (*)(fuse_conn_info*)’ to ‘void* (*)()’
s3fs.cpp:1075: error: ‘struct fuse_operations’ has no member named ‘utimens’
s3fs.cpp: At global scope:
s3fs.cpp:272: warning: ‘size_t readCallback(void*, size_t, size_t, void*)’ defined but not used
make: *** [all] Error 1
$svn diff
Index: s3fs.cpp
===================================================================
--- s3fs.cpp    (revision 42)
+++ s3fs.cpp    (working copy)
@@ -1074,5 +1074,5 @@
     s3fs_oper.access = s3fs_access;
     s3fs_oper.utimens = s3fs_utimens;
 
-    return fuse_main(custom_args.argc, custom_args.argv, &s3fs_oper, NULL);
+    return fuse_main(custom_args.argc, custom_args.argv, &s3fs_oper);
 }
$svn info
Path: .
URL: http://s3fs.googlecode.com/svn/trunk/s3fs
Repository Root: http://s3fs.googlecode.com/svn
Repository UUID: df820570-a93a-0410-bd06-b72b767a4274
Revision: 42
Node Kind: directory
Schedule: normal
Last Changed Author: rrizun
Last Changed Rev: 40
Last Changed Date: 2008-01-21 22:43:06 -0200 (Mon, 21 Jan 2008)

$gcc -v
Using built-in specs.
Target: i486-linux-gnu
Configured with: ../src/configure -v --enable-languages=c,c++,fortran,objc,obj-c++,treelang --prefix=/usr --enable-shared --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --enable-nls --program-suffix=-4.1 --enable-__cxa_atexit --enable-clocale=gnu --enable-libstdcxx-debug --enable-mpfr --enable-checking=release i486-linux-gnu
Thread model: posix
gcc version 4.1.2 20060928 (prerelease) (Ubuntu 4.1.1-13ubuntu5)
Comment by rrizun, Jan 29, 2008

I've successfully compiled on Ubuntu 7.04 but not on 6.10... perhaps fuse-lib bundled w/6.10 is too old?!?

Comment by donnyspi, Feb 05, 2008

Here's what I get on Ubuntu 6.06:

g++ -Wall -D_FILE_OFFSET_BITS=64 -I/usr/include/fuse  -lfuse -lpthread   -lcurl -I/usr/include/libxml2 -L/usr/lib -lxml2 -lz -lm -lssl -ggdb s3fs.cpp -o s3fs
s3fs.cpp:1077:74: error: macro "fuse_main" passed 4 arguments, but takes just 3
s3fs.cpp: In function ‘int s3fs_statfs(const char*, statvfs*)’:
s3fs.cpp:764: error: invalid use of undefined type ‘struct statvfs’
s3fs.cpp:762: error: forward declaration of ‘struct statvfs’
s3fs.cpp:765: error: invalid use of undefined type ‘struct statvfs’
s3fs.cpp:762: error: forward declaration of ‘struct statvfs’
s3fs.cpp:766: error: invalid use of undefined type ‘struct statvfs’
s3fs.cpp:762: error: forward declaration of ‘struct statvfs’
s3fs.cpp:767: error: invalid use of undefined type ‘struct statvfs’
s3fs.cpp:762: error: forward declaration of ‘struct statvfs’
s3fs.cpp: In function ‘int my_fuse_opt_proc(void*, const char*, int, fuse_args*)’:
s3fs.cpp:991: error: ‘FUSE_OPT_KEY_NONOPT’ was not declared in this scope
s3fs.cpp:997: error: ‘FUSE_OPT_KEY_OPT’ was not declared in this scope
s3fs.cpp: In function ‘int main(int, char**)’:
s3fs.cpp:1016: error: variable ‘fuse_args custom_args’ has initializer but incomplete type
s3fs.cpp:1016: error: ‘FUSE_ARGS_INIT’ was not declared in this scope
s3fs.cpp:1017: error: ‘fuse_opt_parse’ was not declared in this scope
s3fs.cpp:1068: error: invalid conversion from ‘int (*)(const char*, statvfs*)’ to ‘int (*)(const char*, statfs*)’
s3fs.cpp:1072: error: invalid conversion from ‘void* (*)(fuse_conn_info*)’ to ‘void* (*)()’
s3fs.cpp:1074: error: ‘struct fuse_operations’ has no member named ‘access’
s3fs.cpp:1075: error: ‘struct fuse_operations’ has no member named ‘utimens’
s3fs.cpp:1077: error: ‘fuse_main’ was not declared in this scope
s3fs.cpp: At global scope:
s3fs.cpp:272: warning: ‘size_t readCallback(void*, size_t, size_t, void*)’ defined but not used
make: *** [all] Error 1
Comment by a.skwar, Feb 18, 2008

Hi. I compiled and installed s3fs Rev 55 on a Gentoo Linux system. I'm able to mount some of my buckets. But when I try to do a "ls" of my mounted filesystem, nothing's returned:

--($:~/tmp/s3)-- mount | grep images
s3fs on /home/askwar/tmp/s3/images.alexander.skwar.name type fuse.s3fs (rw,nosuid,nodev,user=askwar)

--($:~/tmp/s3)-- ls -la /home/askwar/tmp/s3/images.alexander.skwar.name
insgesamt 0

Strange. Am I doing something wrong?

Comment by tleasure, Feb 18, 2008

Is there currently a way to make a file publicly readable? i.e. when doing a cp or mv command? Thanks!

Comment by rrizun, Feb 18, 2008

a.skwar- not sure what's going on here... are these pre-existing s3 objects in the images bucket that are maybe somehow not "compatible" with s3fs? can you create new files/directories via s3fs in that bucket? i.e., mkdir/touch, etc...?!?

Comment by rrizun, Feb 18, 2008

tleasure- there is no provision for making a file "public-read" with s3fs; you can use a tool such as jets3t for that ... I was thinking of making "chmod" fiddle with the s3 permissions... chmod's "mode" user/group/other would map nicely, however, I think it would actually be a bad idea ... it would be an unexpected surprise to most people, I would think, not realizing that they're making their files world-readable! even a simple rsync of some 777 files would cause those 777 files to be world-readable by anyone on the Internet!

Comment by tleasure, Feb 18, 2008

I've noticed that if you have pre-existing files that have a slash in the S3 filename (like a virtual directory), s3fs will not pick up on it until you create the directory locally. So if you have bucket:images/somefile.txt, you want to create a "images" directory locally in your s3 mount. Afterward, the files show up. Hope this helps.

Comment by tleasure, Feb 18, 2008

rrizun - thanks for your response. I agree that mapping chmod to S3's ACL could yield an unexpected surprise. But I do think this functionality would be huge. Do you think there are any alternatives other than chmod to set the ACL initially rather than doing another request after the file transfer?

Comment by rrizun, Feb 18, 2008

tleasure- I guess I could add a option to s3fs: "acl_default" one of: private, public-read, public-read-write or authenticated-read?!? (defaults to "private" of course)

Comment by tleasure, Feb 18, 2008

That would be awesome!

Comment by cblaise.public, Feb 18, 2008

Please consider resurrecting/improving the cache.

In particular, I'd like to see an option to control where it is written instead of .s3fs.

I've done some timing with my EC2 machine with s3fs and the local mnt partition. It's a lot faster to use a cache than always accessing S3.

Comment by rrizun, Feb 18, 2008

ok! local file cache is back just as it was before, however, it is disabled by default... use the "use_cache" option to enable, e.g., /usr/bin/s3fs mybucket /mnt -ouse_cache=1

Comment by rrizun, Feb 18, 2008

configurable cache folder (svn 72): use_cache now takes the path for the local file cache, e.g., -ouse_cache=/tmp

Comment by cblaise.public, Feb 19, 2008

Excellent! Thank you!

Comment by cblaise.public, Feb 19, 2008

I see you're currently using curl "easy" functions. Have you looked into whether the "multi" and/or "share" functions would improve performance at all?

Not that it's necessarily "bad" given the S3 architecture, but I'm just curious.

Comment by rrizun, Feb 19, 2008

cblaise- use of libcurl's multi/share api by itself probably would not improve network performance all that much, however, use of FUSE's low-level asynchronous api in combination with libcurl's multi/share api would definitely improve the "responsiveness/robustness" of s3fs, e.g., hitting CTRL-C during an I/O operation should respond immediately due to async api vs sync api...

Comment by rrizun, Feb 19, 2008

FYI I've added configurable retries... -oretries=2... default is 2... so, s3fs makes 3 attempts per s3 transaction: 1 attempt + 2 retries

Comment by cblaise.public, Feb 20, 2008

With today's code using a 103M file, most of the time (but not always) I'm seeing the following error when copying to the drive:

cp: closing `s3/sitefilter-db.tgz': Input/output error

I've tried on two machines, one an EC2 machine (Fedora 8) and another a local machine (Fedora 5).

EC2 machine has shown it a few times, but not as many times as the non-EC2 machine, where it always happens.

Does not happen on smaller files (s3fs.cpp).

Comment by cblaise.public, Feb 20, 2008

Addendum: when this error occurs the file does not copy over. It is 0 bytes.

Comment by cblaise.public, Feb 20, 2008

Seems to work properly with yesterday's (72) code.

My first clue should have been the timing. From my EC2 machine I'd expect to see under 30s (usually 11s) to copy the 103M file. On my remote machine, it should have been much longer than the ~30s time was reporting. :)

Comment by rrizun, Feb 20, 2008

I think the problem is caused by the addition of this line:

curl_easy_setopt(curl, CURLOPT_TIMEOUT, seconds);

I've removed the line and did a checkin... go ahead and do a svn update and retry?!?

sry/Thanks!

Comment by cblaise.public, Feb 20, 2008

That seems to be doing it. I'll try re-copying a few more times to be certain and will post if they fail.

Comment by st...@hanlon.co.uk, Feb 23, 2008

I'm trying this with the current (91) release code and with r88. Reads work fine, but writes are returning input/output error. A little testing and digging in the code shows that the my_curl_easy_perform is returning -EIO. (My test is just "touch /mnt/x" to create a new file)

Further, it seems that the EIO is being raised because I'm getting a 411 error back from AWS, which according to this indicates that the content-length header should be passed.

Since this is a list of errors from 2006, I'm guessing that I'm doing something wrong rather than the code or aws. Has anyone else had this problem? I assume that I'm authenticating ok to be able to read my own (private) buckets.

Comment by rrizun, Feb 23, 2008

Hmmmm.. I just did "touch /mnt/x" using r92 and it worked fine both with and without use_cache... could it be a European vs. North America s3 issue? can you use the FUSE -f switch to run s3fs in the foreground and capture its debug output?

Comment by rrizun, Feb 23, 2008

or better yet capture packets w/something like this:

tcpdump -s 1500 -A host s3.amazonaws.com

Comment by st...@hanlon.co.uk, Feb 23, 2008

Ah... it was my version of cURL. I was trying it on an old server which looks like it had v7.12 installed (I could be wrong with the version). Updated to 7.18 and it's working great. Going to try running a gallery2 album via this, if it works then I'll be very impressed. Great bit of code!

Comment by st...@hanlon.co.uk, Feb 23, 2008

rrizun, an update - if you're interested. The filesystem appears to work fine. Got gallery2 storing full and resized images on s3 using your s3fs code. A little hacking required: changed the gallery code to redirect to the s3 file and a minor change to s3fs to set the acls to public_read on file create / mkdir.

I've been looking for an easy way to do this for a long time. Thanks.

Comment by gringomaluco, Feb 25, 2008

Hi there:

I've installed on OS X and created a bucket (with some third party software) but have not had much luck getting things working...seems to send my Finder mental and only a force reload will get things back to normal after which time s3fs has died.

Two questions: one, has this been tested on OS X at all? And second do you have a recommended way of managing buckets for use with this FS?

Thanks

Jamie

Comment by rrizun, Feb 25, 2008

Hi there- I got this up and running yesterday on a MacBook running OSX 10.4 and it worked quite well- I was able to weed out a few minor bugs in doing so. Having said that, try SVN version "107" which contains those bug fixes.

Are there files already in the bucket you're creating? Try creating and mounting an empty bucket and see if it still "sends Finder mental"! =)

I use jets3t (https://jets3t.dev.java.net) to manage buckets.

Comment by rrizun, Feb 25, 2008

hanlon.co.uk: glad to hear it's working for ya; I recommend updating to svn 107 to pull in a few bug fixes

Comment by gringomaluco, Feb 25, 2008

rrizun: I have the latest SVN and its not working mate. It seems to be deadlocking or something: I've got gdb on it waiting for it to crash out and Its not even letting me break in. I'm well up for getting this working so email me - jkp@kirkconsulting.co.uk if you want a debugger.

Jamie

Comment by C.Aaker, Feb 26, 2008

Worked great on my fedora 7 install.

Comment by lionel.guichard, Feb 28, 2008

I got the svn version, but it does not compile.

Even after applying the patch :

===================================================================
--- s3fs.cpp    (revision 42)
+++ s3fs.cpp    (working copy)
@@ -1074,5 +1074,5 @@
     s3fs_oper.access = s3fs_access;
     s3fs_oper.utimens = s3fs_utimens;
 
-    return fuse_main(custom_args.argc, custom_args.argv, &s3fs_oper, NULL);
+    return fuse_main(custom_args.argc, custom_args.argv, &s3fs_oper);
 }

I still get errors when compiling:

s3fs.cpp: In function 'int main(int, char**)':
s3fs.cpp:1413: error: invalid conversion from 'void* (*)(fuse_conn_info*)' to 'void* (*)()'
s3fs.cpp:1416: error: 'struct fuse_operations' has no member named 'utimens'
s3fs.cpp: At global scope:
s3fs.cpp:392: warning: 'size_t readCallback(void*, size_t, size_t, void*)' defined but not used
make: *** [all] Erreur 1

I use gcc 4.1.2 and Debian Etch or Sarge

I have tested with revisions 42, 55 and the latest.

Any idea?

Comment by rrizun, Feb 28, 2008

from my research, debian etch bundles fuse 2.5... you'll need at least fuse 2.6 (preferably fuse 2.7) to compile s3fs... http://www.debianhelp.org/node/12310

Comment by jonas.nicklas, Mar 12, 2008

I think setting up a mailing list would be awesome, if there isn't one yet. That way these comments would not get abused as much by people like me ;)

I tried using s3fs with rdiff and it doesn't work, python stack trace: http://pastie.caboo.se/164903, I know nothing about this stuff though.

Also, I saw that https support is planned, and that would be killer!

Comment by rrizun, Mar 12, 2008

Hi there-

the issue you're seeing w/rdiff is related to http://code.google.com/p/s3fs/issues/detail?id=14

as a temporary workaround until I fix it, try disabling local file cache

as well, there is a mailing list: s3fs-devel; see http://code.google.com/p/s3fs/

https coming soon!

Comment by twigbranch, Mar 22, 2008

Please help! I simply can't get the /etc/passwd-s3fs to work. It contains: A12C1F:AC3143411F

That's keyId:Secret (those are obviously fake). Is this the right format? I'm getting 403 errors. Everything works if I enter the keys on the command line.

Comment by at...@verizon.net, Mar 22, 2008

when you say "All s3 objects written by s3fs have a Content-Type of either "application/octet-stream" or "application/x-directory". " will this change in the future? content disposition for images for example when uploaded using s3fs are set to force download and wont display in a browser.

Comment by rrizun, Mar 24, 2008

it might change in the future, specifically, I might drop application/x-directory because that information is already encoded in "mode"...

the intent is for s3fs to not care about the content-type... it would be an interesting feature though to somehow be able to configure content-type...

workaround? configuring apache httpd to serve up content from a mounted s3fs volume would work (i.e., have apache httpd do all of the content-type detection/smarts)

Comment by johnnymo, Mar 25, 2008

Wow, very slick, working like a charm!

I actually AM trying to use rsync to copy a gallery onto S3. Do the rsync issues with copying back and forth still exist? Enabling the use_cache option causes the following types of errors:

rsync: mkstemp "/v/blahblha/albums/.P6180002.JPG.Tl8lJj?" failed: Bad file descriptor (9)
rsync: failed to set times on "/v/blahblah/albums/Anne M/SuzhouEmbroidery?": Bad file descriptor (9)

The cache directory ends up being only two directories deep, with the 'blahblah' directory showing up in cache as a 0-length file.

Other than this, WOW!!!

Comment by rrizun, Mar 25, 2008

Yup, unfortunately the local file caching issues still exist in the codebase... the symptoms are exactly as you describe... I'm sure it's a simple fix but I just haven't been able to scrape up enough time to get around to fixing it!!!

Workaround: for now, disable local file cache! rsync should still work pretty good w/o it...

Once I fix it then I'll post a new src tarball...

Glad it works for ya!

Comment by rrizun, Apr 01, 2008

FYI local file caching bug is fixed in r130!

Comment by Eli.Bryan, Apr 08, 2008

I'm very new to all this linux stuff.. and I'm trying to get this to work.

I have one box that's fedora 9 and s3fs works just fine on there... but I have another that's a dedicated virtual where I keep running into

"fuse: failed to open /dev/fuse: Permission denied"

This box is running CentOS r5 and I'm running fuse 2.7.3-1.el5

/dev/fuse has permissions crw-rw---- 1 root fuse 10, 229 Apr 8 21:25 /dev/fuse

Apparently this error usually happens with fuse when the group isn't set right.. but I still get this even when I try to mount with root... and even when I set /dev/fuse to crwxrwxrwx...

I'm sure I'm just doing something dumb but I've spent wayyy too much time on this so I thought I'd ask around!

So...Any thoughts?

This project is brilliant, by the way =b

Comment by pedahzur, Apr 08, 2008

The good news: it works, and works GREAT on Ubuntu 7.10. The bad news: there is one serious design glitch that kills its usability for my needs.

I'm using Bacula to write to an S3 bucket mounted by s3fs. I'm using local caching, and 100MB volumes in Bacula so not too much is uploaded when a volume changes.

Problem: Bacula writes that 100MB volume, and then closes the file. s3fs then starts uploading that file to S3, but it blocks until the file is uploaded, and Bacula can't create a second volume (file on disk) until the first one completes uploading. That means I can't spool to the local cache at all if I want my backup job to run in a reasonable amount of time (we're talking multi-gigabytes on a 512 kbps uplink). Is there any way around this? I'm pretty sure you're going to tell me it involves threading s3fs, and I know that is a headache and a half, but it would be cool if it could be done. Thanks for the great project!

Comment by rrizun, Apr 09, 2008

Hi Eli- Not sure what's going on... is this CentOS managed by an ISP? Perhaps FUSE is at their mercy?!? Also, selinux issue?!? Dunno, just guessing...

Comment by rrizun, Apr 09, 2008

Hi pedahzur-

Ya, threading is not an issue... s3fs could do the s3 upload in a separate thread and return immediately to the caller (essentially a write-behind cache); however, there would then be no way for s3fs to directly convey an error back to the caller, i.e., it would not be able to return a bad error/status code/return value. Furthermore there are concurrency issues to consider; the easiest solution would be to serialize the write-behind thread.

So, ya, it could be done (quite easily, actually), but not without making clear to the end user the ramifications and trade-offs, thru documentation I guess!
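To make the trade-off concrete, here is a rough sketch of a serialized write-behind queue. It is not s3fs code: the class name WriteBehindQueue and its methods are invented for illustration, and it assumes C++11 threads. Jobs run one at a time on a single background thread, enqueue() returns immediately, and an upload error can only be reported later, when the caller asks for it via flush().

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical write-behind queue: one worker thread runs the queued
// upload jobs strictly in order, so uploads are serialized.
class WriteBehindQueue {
public:
  // A job returns 0 on success or a nonzero error code (e.g. an errno value).
  typedef std::function<int()> Job;

  WriteBehindQueue()
      : firstError_(0), busy_(false), stop_(false),
        worker_(&WriteBehindQueue::run, this) {}

  ~WriteBehindQueue() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      stop_ = true;
    }
    wake_.notify_all();
    worker_.join();  // the worker drains any remaining jobs before exiting
  }

  // Returns immediately; the caller never sees this job's own result.
  void enqueue(Job job) {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      jobs_.push_back(std::move(job));
    }
    wake_.notify_all();
  }

  // Blocks until all queued jobs have run, then returns (and clears)
  // the first error seen since the previous flush.
  int flush() {
    std::unique_lock<std::mutex> lock(mutex_);
    idle_.wait(lock, [this] { return jobs_.empty() && !busy_; });
    int err = firstError_;
    firstError_ = 0;
    return err;
  }

private:
  void run() {
    std::unique_lock<std::mutex> lock(mutex_);
    for (;;) {
      wake_.wait(lock, [this] { return stop_ || !jobs_.empty(); });
      if (jobs_.empty()) return;  // stop requested and nothing left to do
      Job job = std::move(jobs_.front());
      jobs_.pop_front();
      busy_ = true;
      lock.unlock();
      int rc = job();  // the slow upload runs outside the lock
      lock.lock();
      busy_ = false;
      if (rc != 0 && firstError_ == 0) firstError_ = rc;
      if (jobs_.empty()) idle_.notify_all();
    }
  }

  std::mutex mutex_;
  std::condition_variable wake_;  // signals the worker: new job or stop
  std::condition_variable idle_;  // signals flush(): queue has drained
  std::deque<Job> jobs_;
  int firstError_;
  bool busy_;
  bool stop_;
  std::thread worker_;  // declared last so it starts after the other members
};
```

In a filesystem, flush() would have to be tied to something the application actually calls (fsync or unmount, say); a program that only checks the return value of close() would never learn that its upload failed, which is exactly the ramification described above.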

Feel free to add a new "Issue" to track this feature/enhancement!

Comment by stephenskory, Apr 25, 2008

Hi,

As a test I changed "application/octet-stream" in the source to "text/plain" and now when I go directly to the file on S3 (like an image file) the browser displays the file directly, instead of downloading it like when "application/octet-stream" is set. (I have also changed the default_acl to "public-read").

Pardon my ignorance, but what am I breaking in enabling this useful feature? Thanks!

Comment by rrizun, Apr 25, 2008

Hi- those changes are fine! nothing should break; s3fs does not rely on content-type

there is a s3fs option called "default_acl" to set the default acl at runtime; perhaps I should also add a "default_contenttype" option?

Comment by sebastian.serrano, Apr 26, 2008

If I patch s3fs to save the correct content-type, what could break?

Comment by rrizun, Apr 26, 2008

Hi- you can patch s3fs to save whatever content-type you wish; should not break anything as s3fs does not rely on content-type

Comment by adilmohammed, May 04, 2008

Hi there, is it possible to store a relational database like MySQL directly into S3 using S3FS and run and update it on S3 via S3FS? Has anybody done this before? or its too risky? Thanks

Comment by rrizun, May 04, 2008

Hi- it is possible, however I would assume performance would be terrible (haven't tried myself), based on the fact that s3fs operates in a "bruteforce" manner, re-uploading entire files on changes. ElasticDrive and/or PersistentFS would probably be better candidates since they are really block devices.

Comment by rrizun, May 04, 2008

FYI just for fun I tried running mysqld with datadir pointing to an s3 bucket and it worked! only issue is the "service mysqld start" timed out because mysqld created a 10MB ibdata and uploaded it; apparently the mysqld init script did not want to wait that long! =) I was able to do queries and it wasn't nearly as bad as I thought it would be; best bet would be to disable innodb and use myisam... =)

Comment by adilmohammed, May 05, 2008

Thanks mate, will try it out, but I guess if the size of the database becomes larger then performance will be an issue.

Comment by jstewart101, May 14, 2008

Just FYI I had all sorts of trouble using this with a European bucket. Tried for ages to get it working with no joy before it dawned on me to try a US bucket instead. Worked first time. Switched back to the European bucket just to be sure, and doing everything the same, it wouldn't work.

Thanks for a brilliant FUSE project though - with rsync this is gonna be a very cheap and easy way to keep my photos synced offsite.

Comment by rrizun, May 14, 2008

Indeed- I haven't looked at what it takes to support EU buckets yet... I've added a note at the top of this wikipage...

Comment by adilmohammed, May 15, 2008

Hi there, I am having lots of speed issues with S3 using S3fs. I am using Rsync to do a backup and it has taken 1 day to do a full 1GB backup!! any clues? Thanks

Comment by rrizun, May 15, 2008

Hi there- be sure you're using r152 (x-amz-copy-source)... as well, what is your upload bandwidth? how fast? and which platform? linux? mac?

Comment by rrizun, May 15, 2008

also, might want to "tail -f /var/log/messages" to see if you're getting excessive retries?

Comment by adilmohammed, May 16, 2008

We are using it off an Amazon EC2 instance running Debian so the bandwidth should not be an issue since its all internal. I have a feeling that it could be the switch time between each file copy, what do you think? if it is, any way of reducing this? Thanks so much for your help.

Comment by abaddonsun, May 16, 2008

hi

heres some info on the above with comments

minideb:~/testing# pwd
/root/testing
minideb:~/testing# du -csh 1/
78M 1/
78M total
minideb:~/testing# find 1/|wc
76 151 1128

#theres 76 files, photos, no directories, each averaging 1mb

minideb:~/testing# time cp -Rp 1/ /mnt1/testing/1
real 0m58.384s
user 0m0.021s
sys 0m0.169s
minideb:~/testing# du -cs 1
79184 1
79184 total
minideb:~/testing# expr 79184 / 59
1342
minideb:~/testing# 1.3M/s

minideb:~/testing# time tar cf data.tar 1/
real 0m0.611s
user 0m0.010s
sys 0m0.268s
minideb:~/testing# time cp data.tar /mnt1/testing/
real 0m9.129s
user 0m0.020s
sys 0m0.234s
minideb:~/testing# ls -l data.tar
-rw-r--r-- 1 root root 80680960 2008-05-16 12:18 data.tar
minideb:~/testing# expr 80680960 / 10
8068096

root 18058 0.2 0.8 72328 14320 ? Ssl May15 2:47 /root/ec22/s3fs-r152/s3fs entrip-s3fs-1 /mnt1 -ouse_cache=/tmp

any idea why a single large file gets uploaded at 8M/s and smaller ones get on average 1M/s? could there be some processing, switching, i dont know time in between upload requests that kills it on many smaller files?

Comment by abaddonsun, May 16, 2008

heres the above with a bit more formatting, i didnt realize wiki will break my new lines http://linux.pastebin.ca/1019983

Comment by rrizun, May 16, 2008

Hi- I think this is a case of many small files vs one large file; in this case it ends up significant; on my machine (cable modem at home) using ethereal/wireshark, I can see about a one second "penalty" for setup before the file transfer actually takes place; that is, using "cp -p" preserve mode, ownership, timestamps, will cause s3fs to send a flurry of HTTP PUT requests to preserve mode, ownership, timestamps; all this takes time and adds up to the observed overhead that you're seeing; 76 files times approx 1 second overhead per file accounts for the discrepancy you're seeing: 58sec/76files=0.763 seconds overhead per file (I'm seeing just over 1 second per file, that's ec2 vs cable modem at home)

s3fs uses http keep-alive so it is probably as fast as can be; the only way to speed up copying of many small files would be to parallelize the copy operation; amazon s3 is very conducive toward parallelization (s3fs is fully multi-threaded and will use multiple simultaneous http connections)

so, all seems normal! hope that helps!

Comment by rrizun, May 16, 2008

er, I guess the above calc is more like: (58sec-9sec)/76files=0.644 second overhead per file (I'm subtracting the 9 seconds it takes for the raw upload of the entire 80Mbyte file... you get the gist! =)
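The corrected calculation can be written out as a tiny function. This is only a worked version of the arithmetic above; the function name is made up, and it assumes the single-file copy time is a fair stand-in for the raw transfer time of the same data.

```cpp
#include <cassert>

// Estimated fixed cost per file: time for the many-file copy, minus the
// raw transfer time of the same data as one file, divided by file count.
double perFileOverheadSeconds(double manyFilesSeconds,
                              double singleFileSeconds,
                              int fileCount) {
  assert(fileCount > 0);
  return (manyFilesSeconds - singleFileSeconds) / fileCount;
}
```

With the rounded figures from this thread, perFileOverheadSeconds(58, 9, 76) is 49/76, i.e. roughly 0.64 seconds per file.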

Comment by adilmohammed, May 16, 2008

Thanks very much, looks like we will have to tar.

Comment by abaddonsun, May 17, 2008

hi rrizun thanks! for s3fs as well, its better than elasticdrive if it matters anything, ed is slower than s3fs and just pretends to be faster (by returning instantly), under the hood is 2x or so slower, s3fs is great i wish i had more time to study the s3 api as well thanks ;-)

Comment by stephenskory, May 19, 2008

find() is case sensitive, and my mime.types has only lower case entries. So, the common .JPG extension will not find anything. This below seems to work...

/**
 * @param s e.g., "index.html"
 * @return e.g., "text/html"
 */
string
lookupMimeType(string s) {
  string result("application/octet-stream");
  string::size_type pos = s.find_last_of('.');
  if (pos != string::npos) {
    s = s.substr(1+pos, string::npos);
  }
  string low_s;
  unsigned int i;
  char* buf = new char[s.length()];
  s.copy(buf, s.length());
  for(i = 0; i<s.length(); i++)
    buf[i] = tolower(buf[i]);
  string r(buf, s.length());
  delete[] buf;  // buf came from new[], so it must be released with delete[]
  mimes_t::const_iterator iter = mimeTypes.find(r);
  if (iter != mimeTypes.end())
    result = (*iter).second;
  return result;
}
Comment by stephenskory, May 19, 2008

oops, the line "string low_s;" is left over.. I should have deleted it.
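For comparison, the same idea can be written without the manual buffer by lower-casing the extension in place with std::transform. This is only a sketch, not what went into s3fs: the function name lookupMimeTypeIn is made up, and the table is passed in as a parameter here rather than read from the global mimeTypes.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <map>
#include <string>

typedef std::map<std::string, std::string> mimes_t;

// Case-insensitive lookup: "PHOTO.JPG" and "photo.jpg" map to the same entry.
// As in the version above, a name without a dot is used whole as the key.
std::string lookupMimeTypeIn(const mimes_t& mimeTypes, std::string s) {
  std::string result("application/octet-stream");
  std::string::size_type pos = s.find_last_of('.');
  if (pos != std::string::npos) {
    s = s.substr(pos + 1);
  }
  // tolower() is only defined for unsigned char values, hence the cast.
  std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
    return static_cast<char>(std::tolower(c));
  });
  mimes_t::const_iterator iter = mimeTypes.find(s);
  if (iter != mimeTypes.end()) {
    result = iter->second;
  }
  return result;
}
```

Given a table that maps "jpg" to "image/jpeg", both P6180002.JPG and p6180002.jpg resolve to image/jpeg, and an unknown extension falls back to application/octet-stream.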

Comment by rrizun, May 19, 2008

good catch! fixed in r166

Thanks!

Comment by adamchernow, May 26, 2008

I am unable to compile. I get this:

buffy:~/s3fs# make
g++ -ggdb -Wall -D_FILE_OFFSET_BITS=64 -I/usr/include/fuse -lfuse -lpthread -lcurl -lgssapi_krb5 -lkrb5 -lk5crypto -lcom_err -lkrb5support -lresolv -lidn -ldl -lssl -lcrypto -lz -I/usr/include/libxml2 -L/usr/lib -lxml2 -lcrypto s3fs.cpp -o s3fs
s3fs.cpp:1641:74: error: macro "fuse_main" passed 4 arguments, but takes just 3
s3fs.cpp: In function 'int main(int, char**)':
s3fs.cpp:1636: error: invalid conversion from 'void* (*)(fuse_conn_info*)' to 'void* (*)()'
s3fs.cpp:1639: error: 'struct fuse_operations' has no member named 'utimens'
s3fs.cpp:1641: error: 'fuse_main' was not declared in this scope
s3fs.cpp: At global scope:
s3fs.cpp:438: warning: 'size_t readCallback(void*, size_t, size_t, void*)' defined but not used
make: *** [all] Error 1

Comment by rrizun, May 26, 2008

debian etch?

>>> from my research, debian etch bundles fuse 2.5... you'll need at least fuse 2.6 (preferably fuse 2.7) to compile s3fs... http://www.debianhelp.org/node/12310

Comment by adamchernow, May 26, 2008

Thanks!! I installed from backports and things worked. I had just done an upgrade by downloading fuse and compiling it.

-Adam

Comment by zelvlad, May 28, 2008

My OS is Linux 2.6.24.4-64.fc8 #1 SMP Sat Mar 29 09:15:49 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux.

s3fs version is r166

I'm trying to copy a big directory tree with approx. 70K files and total size of 5.5GB using command

cp -r $DIR_LOC/* $DIR_S3/.

Some files fail with " ###response=400". This indicates a problem on the client side. Later I tried to copy the individual files that had failed, with the command

cp $DIR_LOC/file_1 $DIR_S3/.

and they finished successfully.

What kind of problem could it be? How can I see what request is being sent? Any help would be appreciated.

Thanks.

-Vlad

Comment by rrizun, May 28, 2008

Hi- what is the nature of the failed files? i.e., are they large? 1GB or greater? as well, is your system clock accurate? system clock needs to be within 15 minutes of amazon s3's clock; s3fs could use some improvement in error reporting... for now you could use ethereal/wireshark to capture 400 bad requests and inspect the xml error doc returned from s3;

I've also seen some references to amazon web services returning 400 bad request for no apparent reason, and retrying the request works the 2nd time, however, not sure retrying 400 bad requests is the right thing to do...

Comment by zelvlad, May 29, 2008

These files are not big ones. Sizes usually less than 10MB

I was able to get output from unsuccessful attempt to copy a SWF file (size: 2,605,315 bytes) Packets info were caught with the following command:

tcpdump -i eth0 -s 1500 -A -v host s3.amazonaws.com > info.txt

I found the xml message inside info.txt file:

..... HTTP/1.1 400 Bad Request
x-amz-request-id: E94B4261A9DD7418
x-amz-id-2: [removed_by_zelvlad]
Content-Type: application/xml
Transfer-Encoding: chunked
Date: Thu, 29 May 2008 19:04:58 GMT
Connection: close
Server: AmazonS3

...

<?xml version="1.0" encoding="UTF-8"?>
<Error><Code>RequestTimeout</Code>
<Message>Your socket connection to the server was not read from or written to within the timeout period. 
Idle connections will be closed.</Message><RequestId>E94B4261A9DD7418</RequestId>
<HostId>[removed_by_zelvlad]</HostId></Error>

Could it help to resolve the issue. Any suggestions?

Thank you very much for your help -Vlad

Comment by rrizun, May 29, 2008

any chance the source files are being modified during the overall copy operation to s3?

Comment by zelvlad, May 29, 2008

Absolutely not. However, the error came after another user started copying some files to S3 space. But these were different files in different directories. I repeated the attempt to copy in a minute later and it failed again with the same message.

Thank you, -Vlad

Comment by rrizun, May 30, 2008

Hi Vlad- thanks for the feedback; I'm afraid this is a tough one! I did see a problem similar to this a while ago and added issue 25 to document it; lemme think about it...

does unmounting and then remounting "fix" the problem?

Comment by kofichar...@yahoo.com, May 30, 2008
>>> Comment by jorang, Dec 22, 2007

>>> Anyone tried to use s3fs to make EC2 (MySQL and Apache) use files directly from S3? I read somewhere that the connection between EC2 and S3 would create problems in such a solution. According to your experience what could be reasonable to accomplish with EC2, S3 and s3fs when having a high-load website? Direct use, minute2minute backup, hourly backup, ...?

The easiest way to ensure all your ec2 dbs are safe from ec2 crashes is to replicate your ec2 database on a regular machine

I currently replicate my ec2 dbs on my centos machine at home. The updates to the slave db are practically instantaneous so if ec2 ever crashed I'd probably lose no more than a minute of data

Also you can make backups of your db by backing up the slave. No need to spend precious cpu power on database updates. Just run it on a utility computer sitting in your basement

Comment by rrizun, May 30, 2008

Vlad- what version of libcurl are you using? as well, do you see a "Expect: 100-Continue" in the PUT request that fails? (see issue 25)

Comment by zelvlad, May 30, 2008

Command "curl-config --version" returns libcurl 7.17.1. It looks like "100-continue" is present

About mounting/unmounting. As I mentioned I'm copying big structure with cp -r. I never managed to copy the whole directory without at least a dozen errors like this. I mounted/unmounted many times but some errors are always present.

E....H@.@....i.zH..B...P ....(..P.......PUT /A/B/C/introfinal2.swf HTTP/1.1
Host: s3.amazonaws.com
Accept: */*
Date: Thu, 29 May 2008 18:58:25 GMT
Content-Type:application/octet-stream
x-amz-acl:private
x-amz-meta-mode:33188
x-amz-meta-mtime:1212087505
Authorization: AWS [remove]
Content-Length: 2605315
Expect: 100-continue

14:58:25.108906 IP (tos 0x0, ttl 49, id 58818, offset 0, flags [DF], proto TCP (6), length 65) s3.amazonaws.com.http > [host].49798: P, cksum 0xd64d (correct), 4541:4566(25) ack 3101 win 1028
E..A..@.1...H..B.i.z.P...(.. ..AP....M..HTTP/1.1 100 Continue

...

14:58:52.173891 IP (tos 0x0, ttl 49, id 59119, offset 0, flags [DF], proto TCP (6), length 670) s3.amazonaws.com.http > [host].49798: P, cksum 0x06f6 (correct), 4566:5196(630) ack 857989 win 27854
E.....@.1...H..B.i.z.P...(.. &..P.l.....HTTP/1.1 400 Bad Request
x-amz-request-id: 14378A78D8316F17
x-amz-id-2: [removed]
Content-Type: application/xml
Transfer-Encoding: chunked
Date: Thu, 29 May 2008 18:58:51 GMT
Connection: close
Server: AmazonS3

15c
<?xml version="1.0" encoding="UTF-8"?>
<Error><Code>RequestTimeout</Code><Message>Your socket connection to the server was not read from or written to within 
the timeout period. Idle connections will be closed.</Message>
<RequestId>14378A78D8316F17</RequestId><HostId>[removed]</HostId></Error>

Thank you, Vlad

Comment by rrizun, May 30, 2008

Vlad- I wonder if you're running into this problem: http://forum.jungledisk.com/viewtopic.php?t=8203

coincidentally, s3fs' readwrite_timeout defaults to 10 seconds too

here's something to try: in s3fs set readwrite_timeout=5 (i.e., half of what amazon's idle connection timeout is)

Comment by kofichar...@yahoo.com, May 31, 2008

Hi rrizun, This installed perfectly on my EC2 system but it fails on my centos computer. The error I get is:

Package libcurl was not found in the pkg-config search path.
Perhaps you should add the directory containing `libcurl.pc'
to the PKG_CONFIG_PATH environment variable
No package 'libcurl' found
g++ -D_FILE_OFFSET_BITS=64 -ggdb -Wall -D_FILE_OFFSET_BITS=64 -I/usr/local/include/fuse  -pthread -L/usr/local/lib -lfuse -lrt -ldl    -I/usr/include/libxml2 -L/usr/lib -lxml2 -lz -lpthread -lm -lcrypto s3fs.cpp -o s3fs
s3fs.cpp: In function `int s3fs_readdir(const char*, void*, int (*)(void*, const char*, const stat*, off_t), off_t, fuse_file_info*)':
s3fs.cpp:1363: error: `curl_multi_timeout' was not declared in this scope
s3fs.cpp:1363: warning: unused variable 'curl_multi_timeout'
s3fs.cpp: At global scope:
s3fs.cpp:439: warning: 'size_t readCallback(void*, size_t, size_t, void*)' defined but not used
make: *** [all] Error 1

Now I know I definitely have curl installed on my system as seen below:

[root@localhost s3fs]# curl-config --version
libcurl 7.12.1

I also know the file libcurl.pc does not exist on my system (locate can not find it) Is there a way I can compile it without it looking for libcurl.pc? Thanks

Comment by rrizun, May 31, 2008

Hi- looks like the problem is actually specifically curl_multi_timeout... for that you'll want to update to at least libcurl 7.15.4 (7.12.1 is too old)
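The minimum-version checks that come up in this thread (libcurl 7.15.4 here, fuse 2.6 earlier) are dotted-version comparisons, which are easy to get wrong if the strings are compared as text. A small sketch of a numeric comparison follows; the helper names parseVersion and versionAtLeast are made up and are not part of s3fs, libcurl or fuse.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <sstream>
#include <string>
#include <vector>

// Split "7.15.4" into {7, 15, 4}. Non-numeric fields count as 0.
std::vector<int> parseVersion(const std::string& text) {
  std::vector<int> parts;
  std::istringstream in(text);
  std::string field;
  while (std::getline(in, field, '.')) {
    parts.push_back(std::atoi(field.c_str()));
  }
  return parts;
}

// True if version `have` is the same as or newer than version `need`.
bool versionAtLeast(const std::string& have, const std::string& need) {
  std::vector<int> a = parseVersion(have);
  std::vector<int> b = parseVersion(need);
  // Pad the shorter one with zeros so "2.6" compares like "2.6.0".
  std::size_t n = std::max(a.size(), b.size());
  a.resize(n, 0);
  b.resize(n, 0);
  return !(a < b);  // std::vector compares element by element
}
```

Here versionAtLeast("7.12.1", "7.15.4") is false, which matches the curl_multi_timeout error above, while versionAtLeast("7.18.1", "7.15.4") is true. A plain string comparison would get cases such as "7.9" versus "7.15" wrong.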

Comment by kofichar...@yahoo.com, May 31, 2008

Will do.. Thanks

Comment by kofichar...@yahoo.com, May 31, 2008

Hi again, I installed libcurl 7.18.1

[root@localhost ~]# curl-config --version
libcurl 7.18.1

I'm still having issues with curl. Now the error I'm getting is:

Unknown keyword 'URL' in '/usr/local/lib/pkgconfig/libcurl.pc'
g++ -ggdb -Wall -D_FILE_OFFSET_BITS=64 -I/usr/local/include/fuse  -pthread -L/usr/local/lib -lfuse -lrt -ldl    -I/usr/include/libxml2 -L/usr/lib -lxml2 -lz -lpthread -lm -lcrypto s3fs.cpp -o s3fs
s3fs.cpp:439: warning: 'size_t readCallback(void*, size_t, size_t, void*)' defined but not used
/tmp/ccnfhNOY.o(.text+0x2b6): In function `alloc_curl_handle':
/usr/src/s3fs/s3fs.cpp:164: undefined reference to `curl_easy_init'
/tmp/ccnfhNOY.o(.text+0x2eb):/usr/src/s3fs/s3fs.cpp:169: undefined reference to `curl_easy_reset'
/tmp/ccnfhNOY.o(.text+0x305):/usr/src/s3fs/s3fs.cpp:171: undefined reference to `curl_easy_setopt'
/tmp/ccnfhNOY.o(.text+0x31b):/usr/src/s3fs/s3fs.cpp:177: undefined reference to `curl_easy_setopt'
/tmp/ccnfhNOY.o(.text+0x32d):/usr/src/s3fs/s3fs.cpp:179: undefined reference to `curl_easy_setopt'
/tmp/ccnfhNOY.o(.text+0x345):/usr/src/s3fs/s3fs.cpp:180: undefined reference to `curl_easy_setopt'
/tmp/ccnfhNOY.o(.text+0x436): In function `my_curl_easy_perform':
/usr/src/s3fs/s3fs.cpp:294: undefined reference to `curl_easy_perform'
/tmp/ccnfhNOY.o(.text+0x485):/usr/src/s3fs/s3fs.cpp:301: undefined reference to `curl_easy_getinfo'
/tmp/ccnfhNOY.o(.text+0x4df):/usr/src/s3fs/s3fs.cpp:309: undefined reference to `curl_easy_strerror'
/tmp/ccnfhNOY.o(.text+0x1342): In function `get_headers(char const*, std::map<std::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >&)':
/usr/src/s3fs/s3fs.cpp:512: undefined reference to `curl_easy_setopt'
/tmp/ccnfhNOY.o(.text+0x135e):/usr/src/s3fs/s3fs.cpp:513: undefined reference to `curl_easy_setopt'
/tmp/ccnfhNOY.o(.text+0x137a):/usr/src/s3fs/s3fs.cpp:514: undefined reference to `curl_easy_setopt'
/tmp/ccnfhNOY.o(.text+0x1396):/usr/src/s3fs/s3fs.cpp:515: undefined reference to `curl_easy_setopt'
/tmp/ccnfhNOY.o(.text+0x13b2):/usr/src/s3fs/s3fs.cpp:516: undefined reference to `curl_easy_setopt'
/tmp/ccnfhNOY.o(.text+0x13e2):/usr/src/s3fs/s3fs.cpp:519: more undefined references to `curl_easy_setopt' follow
/tmp/ccnfhNOY.o(.text+0x5e76): In function `s3fs_getattr':
/usr/src/s3fs/s3fs.cpp:813: undefined reference to `curl_easy_getinfo'
/tmp/ccnfhNOY.o(.text+0x5f96):/usr/src/s3fs/s3fs.cpp:819: undefined reference to `curl_easy_getinfo'
/tmp/ccnfhNOY.o(.text+0x601f):/usr/src/s3fs/s3fs.cpp:825: undefined reference to `curl_easy_getinfo'
/tmp/ccnfhNOY.o(.text+0x68dc): In function `s3fs_mknod':
/usr/src/s3fs/s3fs.cpp:900: undefined reference to `curl_easy_setopt'
/tmp/ccnfhNOY.o(.text+0x68f8):/usr/src/s3fs/s3fs.cpp:901: undefined reference to `curl_easy_setopt'
/tmp/ccnfhNOY.o(.text+0x6914):/usr/src/s3fs/s3fs.cpp:902: undefined reference to `curl_easy_setopt'
/tmp/ccnfhNOY.o(.text+0x6930):/usr/src/s3fs/s3fs.cpp:903: undefined reference to `curl_easy_setopt'
/tmp/ccnfhNOY.o(.text+0x694c):/usr/src/s3fs/s3fs.cpp:904: undefined reference to `curl_easy_setopt'
/tmp/ccnfhNOY.o(.text+0x71df):/usr/src/s3fs/s3fs.cpp:918: more undefined references to `curl_easy_setopt' follow
/tmp/ccnfhNOY.o(.text+0xbd88): In function `s3fs_readdir':
/usr/src/s3fs/s3fs.cpp:1330: undefined reference to `curl_slist_append'
/tmp/ccnfhNOY.o(.text+0xbf18):/usr/src/s3fs/s3fs.cpp:1331: undefined reference to `curl_slist_append'
/tmp/ccnfhNOY.o(.text+0xc169):/usr/src/s3fs/s3fs.cpp:1332: undefined reference to `curl_easy_setopt'
/tmp/ccnfhNOY.o(.text+0xc185):/usr/src/s3fs/s3fs.cpp:1335: undefined reference to `curl_easy_setopt'
/tmp/ccnfhNOY.o(.text+0xc1a0):/usr/src/s3fs/s3fs.cpp:1336: undefined reference to `curl_easy_setopt'
/tmp/ccnfhNOY.o(.text+0xc407):/usr/src/s3fs/s3fs.cpp:1350: undefined reference to `curl_multi_perform'
/tmp/ccnfhNOY.o(.text+0xc49a):/usr/src/s3fs/s3fs.cpp:1363: undefined reference to `curl_multi_timeout'
/tmp/ccnfhNOY.o(.text+0xc5ed):/usr/src/s3fs/s3fs.cpp:1371: undefined reference to `curl_multi_fdset'
/tmp/ccnfhNOY.o(.text+0xc7b7):/usr/src/s3fs/s3fs.cpp:1377: undefined reference to `curl_multi_perform'
/tmp/ccnfhNOY.o(.text+0xc7fb):/usr/src/s3fs/s3fs.cpp:1384: undefined reference to `curl_multi_info_read'
/tmp/ccnfhNOY.o(.text+0xc833):/usr/src/s3fs/s3fs.cpp:1388: undefined reference to `curl_easy_strerror'
/tmp/ccnfhNOY.o(.text+0xc9bc):/usr/src/s3fs/s3fs.cpp:1399: undefined reference to `curl_easy_getinfo'
/tmp/ccnfhNOY.o(.text+0xcb1f):/usr/src/s3fs/s3fs.cpp:1407: undefined reference to `curl_easy_getinfo'
/tmp/ccnfhNOY.o(.text+0xcb4c):/usr/src/s3fs/s3fs.cpp:1412: undefined reference to `curl_easy_getinfo'
/tmp/ccnfhNOY.o(.text+0xd141): In function `s3fs_init':
/usr/src/s3fs/s3fs.cpp:1470: undefined reference to `curl_global_init'
/tmp/ccnfhNOY.o(.text+0xd4ba): In function `s3fs_destroy':
/usr/src/s3fs/s3fs.cpp:1504: undefined reference to `curl_global_cleanup'
/tmp/ccnfhNOY.o(.gnu.linkonce.t._ZN15auto_curl_slistD1Ev+0xf): In function `auto_curl_slist::~auto_curl_slist()':
/usr/src/s3fs/s3fs.cpp:271: undefined reference to `curl_slist_free_all'
/tmp/ccnfhNOY.o(.gnu.linkonce.t._ZN15auto_curl_slist6appendERKSs+0x22): In function `auto_curl_slist::append(std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)':
/usr/src/s3fs/s3fs.cpp:277: undefined reference to `curl_slist_append'
/tmp/ccnfhNOY.o(.gnu.linkonce.t._ZN15auto_curl_multiD1Ev+0x69): In function `auto_curl_multi::~auto_curl_multi()':
/usr/src/s3fs/s3fs.cpp:254: undefined reference to `curl_multi_cleanup'
/tmp/ccnfhNOY.o(.gnu.linkonce.t._ZN15auto_curl_multi8add_curlEPv+0x28): In function `auto_curl_multi::add_curl(void*)':
/usr/src/s3fs/s3fs.cpp:261: undefined reference to `curl_multi_add_handle'
/tmp/ccnfhNOY.o(.gnu.linkonce.t._ZN15auto_curl_multiC1Ev+0xb): In function `auto_curl_multi::auto_curl_multi()':
/usr/src/s3fs/s3fs.cpp:251: undefined reference to `curl_multi_init'
/tmp/ccnfhNOY.o(.gnu.linkonce.t._ZN32curl_multi_remove_handle_functorclEPv+0x12): In function `curl_multi_remove_handle_functor::operator()(void*)':
/usr/src/s3fs/s3fs.cpp:242: undefined reference to `curl_multi_remove_handle'
/tmp/ccnfhNOY.o(.gnu.linkonce.t._ZN13cleanup_stuffclESt4pairIPv7stuff_tE+0x4c): In function `cleanup_stuff::operator()(std::pair<void*, stuff_t>)':
/usr/src/s3fs/s3fs.cpp:1208: undefined reference to `curl_slist_free_all'
collect2: ld returned 1 exit status
make: *** [all] Error 1

what am I doing wrong. Thanks in advance

Comment by rrizun, May 31, 2008

looks like a link problem... compiler sees header file(s) but linker does not see library file

Comment by kofichar...@yahoo.com, May 31, 2008

WOW... I fixed it. In case anyone else encounters this problem then this is the fix.

The key is in the error below

Unknown keyword 'URL' in '/usr/local/lib/pkgconfig/libcurl.pc'

I looked in the libcurl.pc file that was installed and saw the following:

prefix=/usr/local
exec_prefix=${prefix}
libdir=${exec_prefix}/lib
includedir=${prefix}/include

Name: libcurl
URL: http://curl.haxx.se/
Description: Library to transfer files with ftp, http, etc.

I removed the line "URL: http://curl.haxx.se" and it worked. I don't know the syntax of .pc files but I guess URL must be a reserved keyword or something.

I tested it and it worked beautifully except one new issue which I hope is not a bug.

I have the following setup. I have my ec2 system mounted and pointing to a certain bucket and I have my centos home computer mounted and pointing to the same bucket.

ec2/mymount -> s3/mybucket homecomputer/mymount -> s3/mybucket

When I place a file into ec2/mymount it shows up in s3/mybucket AND I can see it in homecomputer/mymount. BEAUTIFUL!

When I place a file into homecomputer/mymount it shows up in s3/mybucket AND I can see it in ec2/mymount. NICE!

However, when I place a file into s3/mybucket through s3fox it shows up as an EMPTY file in ec2/mymount and homecomputer/mymount. EWW!

Is this a bug or is my setup faulty?

Comment by rrizun, May 31, 2008

glad you got it working!

as for the s3fox thing, see issue 27

the gist of it is that there isn't really a "standard" way to represent an actual folders object in s3; each s3 client makes up their own scheme for representing folder objects

Comment by rrizun, May 31, 2008

might wanna try jets3t cockpit instead?

Comment by james.s.white, May 31, 2008

Whenever I try to cp a file > 500 MB I get a "No space left on device" error and only the first 500 MB is copied.

Comment by james.s.white, May 31, 2008

Sorry, forgot data that might be helpful...

$ dd if=/dev/zero of=/opt/s3/500MB bs=1024 count=512000
dd: writing `/opt/s3/500MB': No space left on device

$ dd if=/dev/zero of=/opt/s3/500MB bs=1024 count=506880
dd: closing output file `/opt/s3/500MB': Input/output error

$ dd if=/dev/zero of=/opt/s3/500MB bs=1024 count=501760
501760+0 records in
501760+0 records out
513802240 bytes (514 MB) copied, 121.188 seconds, 4.2 MB/s

# uname -a
Linux localhost 2.6.23.1-linode36 #1 Sun Nov 4 12:03:06 EST 2007 i686 GNU/Linux

# dpkg -l | grep fuse
ii fuse-utils 2.7.1-2~bpo40+1 Filesystem in USErspace (utilities)
ii libfuse-dev 2.7.1-2~bpo40+1 Filesystem in USErspace (development files)
ii libfuse2 2.7.1-2~bpo40+1 Filesystem in USErspace library

s3fs-r166

Comment by rrizun, Jun 01, 2008

how much free space is on /tmp (or wherever posix tmpfile() would create its temporary files)? s3fs creates temporary local files first before uploading to s3 so is there a chance you're running out of local hard disk space?
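One way to check this from code rather than with df is POSIX statvfs(). The sketch below is illustrative only: the helper names availableBytes and hasRoomFor are made up, s3fs does not do this check itself as far as this thread shows, and which directory the temporary files land in depends on the system.

```cpp
#include <cassert>
#include <sys/statvfs.h>

// Bytes available to unprivileged users on the filesystem that holds
// `path`, or -1 if the filesystem cannot be queried.
long long availableBytes(const char* path) {
  struct statvfs info;
  if (statvfs(path, &info) != 0) return -1;
  return static_cast<long long>(info.f_bavail) *
         static_cast<long long>(info.f_frsize);
}

// True if a temporary copy of `fileSize` bytes would fit under `tempDir`.
bool hasRoomFor(const char* tempDir, long long fileSize) {
  long long avail = availableBytes(tempDir);
  return avail >= 0 && fileSize <= avail;
}
```

For example, hasRoomFor("/tmp", 524288000LL) asks whether a 512000 KiB file, like the first dd test above, would fit in /tmp.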

Comment by james.s.white, Jun 01, 2008

Yep: 526 MB in / (it's a linode, so I don't get to partition the root disk). Is there an argument to change where it creates those temp files?

Comment by paul.min...@gmail.com, Jun 01, 2008

hey. i'm having problems with mv, as opposed to cp. i have s3fs mounted at /var/n54data. when i do:

dd if=/dev/zero of=sux count=1024 bs=1024; cp sux /var/n54data

the resulting file (/var/n54data/sux) is 1M in size. however when i do:

dd if=/dev/zero of=sux count=1024 bs=1024; mv sux /var/n54data

the resulting file is 0 bytes in size.

p.z. s3fs is awesome.

Comment by rrizun, Jun 02, 2008

s3fs uses posix http://linux.die.net/man/3/tmpfile to create its temporary local files... so, currently, no, there is no s3fs argument to control where temporary files are created

Comment by rrizun, Jun 02, 2008

hmmm... cp vs mv... the "mv" command itself should detect a cross-filesystem move (EXDEV) and should fall back to a brute-force "cp" itself anyway, all without any knowledge whatsoever to/from s3fs (s3fs just sees it as a cp/write). any chance of running out of local disk space for temporary local files? (s3fs creates a temporary local file before uploading to s3)
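For readers unfamiliar with the EXDEV fallback, this is roughly what a "mv"-like tool does. It is a simplified sketch, not the coreutils implementation: the function names copyFile and moveFile are made up, and metadata such as mode and timestamps is not preserved.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdio>
#include <fstream>
#include <string>

// Copy the contents of `from` into `to` (metadata is not copied).
bool copyFile(const std::string& from, const std::string& to) {
  std::ifstream in(from.c_str(), std::ios::binary);
  std::ofstream out(to.c_str(), std::ios::binary | std::ios::trunc);
  if (!in || !out) return false;
  char buf[65536];
  // read() reports failure on the final short read, so also check gcount().
  while (in.read(buf, sizeof buf) || in.gcount() > 0) {
    out.write(buf, in.gcount());
  }
  out.close();
  return !in.bad() && !out.fail();
}

// Try a cheap rename first; if source and destination are on different
// filesystems (EXDEV), fall back to copy followed by unlink of the source.
bool moveFile(const std::string& from, const std::string& to) {
  if (std::rename(from.c_str(), to.c_str()) == 0) return true;
  if (errno != EXDEV) return false;
  if (!copyFile(from, to)) {
    std::remove(to.c_str());  // do not leave a partial destination behind
    return false;
  }
  return std::remove(from.c_str()) == 0;  // the trailing unlink
}
```

From the point of view of the destination filesystem, the fallback path is just an ordinary create and write, which is why s3fs should not be able to tell it apart from a cp.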

Comment by paul.min...@gmail.com, Jun 02, 2008

re: cp vs. mv ; yeah disk space was not a problem. also in the strace, it did try a rename and then fall back to cp. also it's using read/write, not mmap, so no fuse problem there. actually the straces look extremely similar; however the mv one has a SYS_320 line in it (?), and of course also contains a trailing unlink :(

are you able to reproduce the problem?

Comment by rrizun, Jun 04, 2008

re: cp vs mv

I just tried this and was not able to reproduce the problem. in both cases I ended up with a 1MB "sux" file on s3... the trailing unlink seen via strace sounds normal; any chance that this is an s3 "eventual consistency" phenomenon?

Comment by rrizun, Jun 04, 2008

one other thought: is s3fs local cache enabled?

Comment by kofichar...@yahoo.com, Jun 04, 2008

Hello rrizun,

I'm logged in as the user "root".

I mount a directory as another user, e.g "apache" as seen in the code below:

su apache -c 's3fs mybucket /mymount'

I do this so my http server, running as user "apache", can write to the /mymount directory.

The problem is now the user "root" does not have access to the /mymount directory as seen below:

#cd /mymount
-bash: cd: /mymount: Permission denied

My question is: how come the superuser "root" does not have access to a mount owned by "apache"?

Comment by rrizun, Jun 04, 2008

use fuse "allow_other" switch... search for "allow_other" on this page

Comment by marcin_...@yahoo.com, Jun 05, 2008

Is the s3fs-devel google group locked? There haven't been any posts since May 16 ... I also posted something since then and the post has not yet appeared.

Comment by rrizun, Jun 05, 2008

shouldn't be locked (didn't know it could be!) just posted a test message...

Comment by foster.l...@yahoo.com, Jun 08, 2008

I am new to s3fs. When I read that you mount your bucket with

/usr/bin/s3fs mybucket -o accessKeyId=aaa -o secretAccessKey=bbb /mnt
(or providing a password file) I wonder whether the secret access key is transmitted securely to amazon. Is openssl enabled by default, and are there any optional switches that I should know about, to make sure I never disable ssl encryption?

And what about file transfers? I'd assume they use http by default. Is there an option to change to https?

Comment by rrizun, Jun 08, 2008

Hi- s3fs never transmits the secretAccessKey over the internet; s3fs (or any other s3 client for that matter) only transmits signatures created locally from the secretAccessKey; That is actually by amazon S3 design.
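
To make the "signatures created locally" point concrete, here is a minimal sketch in Python of the S3 REST signing scheme of that era (HMAC-SHA1 over a string-to-sign, then Base64). It is not the s3fs source; the function names are illustrative, and requests with x-amz-* headers, which need extra canonicalization, are left out.

```python
import base64
import hashlib
import hmac


def sign(secret_access_key, string_to_sign):
    """Base64(HMAC-SHA1(secret, string_to_sign)); only this goes on the wire."""
    digest = hmac.new(secret_access_key.encode("utf-8"),
                      string_to_sign.encode("utf-8"),
                      hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


def authorization_header(access_key_id, secret_access_key, verb, resource,
                         date, content_md5="", content_type=""):
    """Build the Authorization header value for a simple request."""
    # Simplification: no x-amz-* headers are included in the string-to-sign.
    string_to_sign = "\n".join([verb, content_md5, content_type, date])
    string_to_sign += "\n" + resource
    signature = sign(secret_access_key, string_to_sign)
    return "AWS {}:{}".format(access_key_id, signature)
```

The header carries the accessKeyId and the signature, never the secretAccessKey itself. The request body and headers still travel in clear text over plain http, which is what the "url" option addresses.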

For the actual file transfer, "http" is used by default but can be overridden by using the s3fs "url" command line option and specifying, say, "https://s3.amazonaws.com"

hope that helps!

Comment by mich...@lasmanis.com, Jun 13, 2008

Has anyone gotten https access to work? http access works fine.

I keep getting the following error: ###problem with the SSL CA cert (path? access rights?)

i'm running s3fs r166, debian etch w/ libcurl 7.15.5, libfuse 2.7.1, openssl 0.9.8

the curl command line is able to access https://s3.amazonaws.com without error so i know my CA bundle is correct. it seems like the s3fs code may not be initializing the path to the CA bundle.

Running from the command line: curl --trace foo https://s3.amazonaws.com yields:

== Info: About to connect() to s3.amazonaws.com port 443
== Info: Trying 207.171.185.193...
== Info: connected
== Info: Connected to s3.amazonaws.com (207.171.185.193) port 443
== Info: successfully set certificate verify locations:
== Info: CAfile: /etc/ssl/certs/ca-certificates.crt

Thanks Michael

Comment by rrizun, Jun 16, 2008

Hi- I got s3fs+https to work on Fedora by merely setting url=https://... dunno about etch, perhaps fedora's default setup differs from etch such that "it just works"?!?

Comment by paul.min...@gmail.com, Jun 16, 2008

hey rrizun, sorry for the delay.

yes, local_cache is enabled when the problem is exhibited.

Comment by rrizun, Jun 16, 2008

Hi- and the problem is not exhibited when local_cache is disabled? feel free to add a new Issue to track this problem and I'll take a peek at it when I have a chance... Thanks!

Comment by vanwaardhuizen, Jun 16, 2008

I have https working on both lenny and hardy with libcurl4-openssl-dev 7.18.0 and ca-certificate.

I had to modify s3fs to use curl_easy_setopt(curl, CURLOPT_CAPATH, capath) with capath set to /etc/ssl/certs. See: http://curl.haxx.se/docs/sslcerts.html

Hope this helps. -Shane

Comment by jbrendel, Jun 18, 2008

Hello! This is a great project, and it seems to work very well. However, of all the things, I actually need the deep-rename capability for directories, which currently is not supported. What would it take to enable this feature?

Should s3fs_rename() become recursive, somehow? It appears to me as if this would require a 'rename' operation on every single object under the entire directory tree, right? That sounds expensive, though...

Or could it just check whether the target is a directory and then instead execute a 'cp -r' followed by a delete of the original directory? I think this would be an even more expensive operation, though, right?

Comment by rrizun, Jun 19, 2008

Hi- yes, s3fs_rename() would need to recursively rename each sub-object and would typically be a long running operation (but should be able to be seamlessly resumed if interrupted); should be able to do deep directory rename now with some /usr/bin/find trickery, though I haven't looked into it myself; deep directory rename is one of the higher priorities, just need some spare time to implement it!
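
The shape of such a recursive rename can be sketched as below. This is a toy model in Python, with a plain dict standing in for the bucket; `rename_tree` is a hypothetical name and is not how s3fs_rename() is written. Each object is copied to its new key and then deleted, so a rerun after an interruption simply picks up whatever is still under the old name.

```python
def rename_tree(store, old_dir, new_dir):
    """Rename every object at or under old_dir; return how many were moved."""
    moved = 0
    prefix = old_dir + "/"
    # sorted() takes a snapshot, so the dict can be modified while looping.
    for key in sorted(store):
        if key == old_dir or key.startswith(prefix):
            new_key = new_dir + key[len(old_dir):]
            store[new_key] = store[key]  # stands in for a server-side copy
            del store[key]               # stands in for a DELETE
            moved += 1
    return moved
```

The cost is two requests per object under the directory, which is why this would typically be a long-running operation.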

Comment by adamchernow, Jun 19, 2008

So.. Everything had been working just fine and now, for some reason, s3fs won't mount my s3 file system. I don't get any errors when I try to mount manually, just an empty directory when I do an ls on the mount point.

-Adam

Comment by rrizun, Jun 19, 2008

try "tail -f /var/log/messages" to see if it reveals anything; also, can you, e.g., mkdir and does the directory show up? as well, you can use another s3 client such as jets3t to inspect the s3 data/objects...

Comment by archie.cobbs, Jun 20, 2008

Please add http://s3backer.googlecode.com/ to the "See Also" list. Thanks.

Comment by dylan.clendenin, Jun 20, 2008

symlinks are now added. The wiki still says otherwise. That was a big requirement for me. Do others a favor and update it please.

Comment by rrizun, Jun 20, 2008

done and done; Thanks!

Comment by adamchernow, Jun 23, 2008

buffy:~# screen -r
buffy:~# tail -f /var/log/messages
Jun 23 14:02:54 buffy -- MARK --
Jun 23 14:22:54 buffy -- MARK --
Jun 23 14:42:55 buffy -- MARK --
Jun 23 15:02:55 buffy -- MARK --
Jun 23 15:22:55 buffy -- MARK --
Jun 23 15:42:55 buffy -- MARK --
Jun 23 16:02:55 buffy -- MARK --
Jun 23 16:22:55 buffy -- MARK --
Jun 23 16:42:56 buffy -- MARK --
Jun 23 17:02:56 buffy -- MARK --
Jun 23 17:08:36 buffy s3fs: init $Rev: 166 $
Jun 23 17:08:36 buffy s3fs: init $Rev: 166 $

That help at all?

-Adam

Comment by dylan.clendenin, Jun 24, 2008

Just as a suggestion, if someone were to put together a more canonical set of instructions for installation on the various linux flavors it would do a lot of good for helping adoption.

Also, can anyone speak from experience about mounting an s3fs as an NFS share? Performance degradation?

Comment by rrizun, Jun 24, 2008

Hi Dylan- indeed, more docs would be good; as for NFS, I have not tried myself, nor have I heard of anyone else doing so; I'm sure there would be some level of performance degradation due to at least the file system layering... feel free to give it a try and post your results! =)

Comment by rrizun, Jun 24, 2008

adamchernow- doesn't say much other than there was no write activity; can you write to the mount point? what does jets3t say?

Comment by coolbutuseless, Jul 02, 2008

s3fs seems to be using a unique way of creating directories on S3. Is this ever going to change so as to be more compatible with other s3 tools?

S3Fox and JungleDisk (when in compatibility mode) both use a similar directory scheme and can read each other's buckets. However neither of them can correctly parse the s3fs directory scheme.

I understand that there is no standard way of doing directories on s3, but I guess there must be a de facto standard arising as the jungledisk site states: "most S3 applications now use a simpler scheme that uses object names that resemble standard URLs, made possible by a feature called delimiters added to Amazon S3 last year."

Comment by rrizun, Jul 02, 2008

Hi- Indeed I am interested in making s3fs compatible w/other s3 clients, as with everything its just a matter of finding time to do so; the "delimiters" feature (if it is what I'm thinking) is already being used by s3fs, but there are additional considerations wrt a complete "universal" folder representation convention; hope that helps!

Comment by coolbutuseless, Jul 02, 2008

I did some tests with some of the tools I'm considering. Seems that no tool is really very compatible with any other in regards to directory schemes. I made a comparison chart: http://www.coolbutuseless.com/post/40760186/s3-is-useful-as-a-backup-system-if-backups-are

I also tried the other 2 s3 fuse solutions and they aren't compatible with anything either.

Comment by ad...@terraninnerspace.com, Jul 03, 2008

Does the caching option cache the directory structure and files? Some of the filesystem needed for this application is very complex. Searching this data is based on filenames, and it seems as if the complex directory structure is holding things up, is that possible ?

Thanks.

Comment by rrizun, Jul 03, 2008

Hi- as coded, the cache option caches file contents only; s3fs always goes online to s3 to determine the directory structure

>>> it seems as if the complex directory structure is holding things up, is that possible ?

it is possible; try using tcpdump/ethereal/wireshark to inspect the packets to get an idea if there's any "thrashing" going on

Comment by ad...@terraninnerspace.com, Jul 05, 2008

>>> it is possible; try using tcpdump/ethereal/wireshark to inspect the packets to get an idea if there's any "thrashing" going on

Seems overly complex, I doubt I could interpret the results. I'm moving to the S3 storage platform, and the directory structure is a matrix. e.g.) a/b/0/0/0 , a/b/0/0/1 , to a/b/9/9/9. Quite a few folders and a great deal are populated.

When accessing my structure under s3fs, I do see delays the deeper I go. The delays are more noticeable when doing a "find", as opposed to a simple file read under PHP.

How does one "remap" under s3fs ? Would rebooting be the only option ?

Comment by rrizun, Jul 05, 2008

Hi- looks like the "delay" that you're seeing is directly related to the depth of the directory structure; looks like fuse does a "stat" on each directory level before each file access

e.g., if I have a dir/file like "/a/b/c/d/e/f/g/abc.txt" then fuse (or something else higher up) will end up doing a stat on "/a" and then "/a/b" and then "/a/b/c" etc... before finally ending up at "/a/b/c/d/e/f/g/abc.txt"; these are all separate and independent S3 HTTP HEAD transactions as far as s3fs is concerned
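
To see how the cost grows with depth, the chain of lookups can be listed explicitly. This is a small illustrative sketch in Python (the helper name `lookup_chain` is made up); it only enumerates the paths that would each be stat'ed, assuming one lookup per path component as described above.

```python
def lookup_chain(path):
    """Return every path that is stat'ed on the way down to `path`."""
    parts = [p for p in path.split("/") if p]
    return ["/" + "/".join(parts[:i]) for i in range(1, len(parts) + 1)]


if __name__ == "__main__":
    for p in lookup_chain("/a/b/c/d/e/f/g/abc.txt"):
        print("HEAD", p)
```

For the example path this is eight separate round trips before the file itself is read, so latency scales with directory depth.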

don't think s3fs has much control over that since it is really just a "slave" to the fuse subsystem; s3fs could do some caching/second guessing, I guess, but then that might start to get a little bit tricky...

>>> The delays are more noticeable when doing a "find"

if you're interested in speeding that up... in the source code, in "s3fs_readdir()" search for "max-keys" and set that value to, say, 100 (instead of 20)... recompile and remount and see if that makes a difference; it might speed up "find" by 2x or so, ymmv

Comment by guruyaya, Jul 06, 2008

Anyone experienced high load burst on a server, using s3fs?

Comment by rrizun, Jul 06, 2008

Hi- libcurl seems to have a "bug" wrt high numbers of "multi" handles... symptom is high cpu load... to "fix" it, in the source code, in s3fs_readdir(), search for "max-keys" and set the value to something less than 20, say, 5... recompile and remount and see if that makes a difference! (though that'll slow down readdir()... i.e., less parallel reads)

the other possibility is if local_cache is enabled? i.e., calculating local md5 checksums, etc...

judging by the "burst" observation, I'm guessing its the "max-keys" issue

Comment by sm8...@yahoo.com, Jul 11, 2008

Hi rrizun, I am having a heck of a time right now trying to figure out why I am getting IOEs. I have been building a test instance with FC9 using s3fs r166 and have the latest curl (7.18.2).. checking my logs, s3fs is returning ###response=403.. I have double checked my access keys and they are fine.. any idea of what might be causing this... also... the program is great and very much appreciated.. would like to make a donation to you if we can..

Comment by rrizun, Jul 11, 2008

Hi- is the local time set correctly on your fc9 box? if that isn't it then try using tcpdump (e.g., "tcpdump -s 1500 -A host s3.amazonaws.com") to capture the xml document that s3 returns for the 403 response
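
The reason the clock matters is that S3 rejects signed requests whose timestamp is too far from its own time (the documented limit is 15 minutes), and that shows up as a 403. A minimal sketch in Python for checking the skew, assuming you have a Date header from any S3 response to compare against; the function names are illustrative only.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

MAX_SKEW_SECONDS = 15 * 60  # S3's allowed difference, per its documentation


def clock_skew_seconds(server_date_header, local_now=None):
    """Local time minus server time, in seconds (positive = local is ahead)."""
    server_time = parsedate_to_datetime(server_date_header)
    local_now = local_now or datetime.now(timezone.utc)
    return (local_now - server_time).total_seconds()


def likely_too_skewed(server_date_header, local_now=None):
    """True if the skew alone would be enough to get requests rejected."""
    skew = clock_skew_seconds(server_date_header, local_now)
    return abs(skew) > MAX_SKEW_SECONDS
```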

Comment by sm8...@yahoo.com, Jul 11, 2008

Thanks rrizun... I checked the time when I started but it had been reset... All is working.. will put it through its paces now and see how far I can push this..

Comment by dylan.clendenin, Jul 11, 2008

rrizun... would like to report good things about the s3fs/NFS experiment but so far:

exportfs: Warning /mnt/s-three-test does not support NFS export.

Does some knucklehead/hacker or combination of both anticipate what it would take to get s3fs to "support NFS export"? I've yet to find anyone sharing about this, maybe it's too knuckleheaded I don't know.

btw... (note). anyone looking for some sick (in california that means 'good') Debian and Ubuntu AMIs should check out http://alestic.com. I only found these this morning and I am sad I didn't start using them earlier. that is a personal recommendation not some kind of commercial or anything, they definitely are in good shape to work with s3fs "out of the box". sometimes the kernel module thing can be weird on those xen-based AMIs, this has been my learning experience.

Comment by rrizun, Jul 11, 2008

haven't tried exporting via nfs myself... see http://fuse.cvs.sourceforge.net/*checkout*/fuse/fuse/README.NFS

Comment by dylan.clendenin, Jul 12, 2008

genius. it's crazy enough it just might work.

Comment by nicksdjohnson, Jul 13, 2008

It'd be really great if it supported setxattr() and getxattr() to set metadata on files. Specifically, if there was a key for the canned ACL, you could easily write a tool to retrieve and set canned ACLs on files (eg "setacl public myfile") by calling setxattr().

It'd also be really nice if you could specify a transform - such as compression or encryption - for files to be stored with.

Comment by nicksdjohnson, Jul 13, 2008

One other note: It'd be handy if one of the attributes you could retrieve via getxattr was the md5 sum for the file.

Comment by SMcNam, Jul 14, 2008

Hi rrizun,

I just tried to use s3fs with mpd (music player daemon) by pointing mpd at the s3fs mount. When mpd updates its database, it looks at the file modified times for everything in the directory (my bucket of lots of mp3 files). If the file is either non-existent in its database, or its modified time is newer than what mpd stored as its last modified time, it searches for ID3 tags in the file, extracts them and creates a database record.

When I did an mpd database update on my s3fs bucket, it basically started downloading my entire music collection, incurring quite a lot of OUT bandwidth. I am pretty certain that the ID3 tag algorithm only needs to read a part of the file, since on a local disk it can generate an entire database of 5000+ MP3s in about 2 minutes, which would be an extraordinary feat if the disk were reading all 25GB of data into memory.

But it seems that when the file is opened for reading, s3fs downloads the entire file. This is expensive and time-consuming. Of course, if mpd's file modified time algorithm does its job, this should be a one-time cost for each file.. it will skip over doing a read on files that have the same modified time it's expecting. And I don't modify my mp3 files very often ;)

Do you know of any way to permit the ID3 tag algorithm to do some seeking around in the file, find the start of the ID3 tags, and extract the needed few kilobytes, rather than downloading each 5+ MB mp3 file? It seems like this would more or less demand a large amount of round trips between the local system and s3, but if it could read small "pages" of the file (say, 64KB at a time) as the data is requested, this would make the ID3 tagging much more bandwidth-efficient.

Thanks,

Sean

Comment by rrizun, Jul 15, 2008

SMcNam- as coded, s3fs follows a brute-force "all-or-nothing" strategy; indeed it will download the entire file even if ultimately only one byte needs to be read; the same phenomenon that you're seeing can also be seen, e.g., by browsing an s3 mounted folder with GNOME Nautilus (or MacOSX Finder); they'll both wanna read the first few bytes of each file in order to determine their file types; I believe there is already an issue tracking this feature enhancement

Comment by rrizun, Jul 15, 2008

arachnid- the extended attributes sounds like a great idea- feel free to add a new issue to track this feature!

Comment by SMcNam, Jul 15, 2008

rrizun, good point... It seems the two best FUSE-based s3 access methods each have a substantial weakness leading to undue bandwidth expense; let me explain.

s3fs is very good at uploading files to s3, whether small or large... it doesn't incur extraneous PUT/LIST/GETs because it seems to do a single PUT for an arbitrarily large file.

Of course, having a 1:1 mapping between actual files and s3 files has its drawbacks. Just as you said, to do any I/O on the file, it has to be retrieved in its entirety. So s3fs is very write-efficient, and very read-inefficient.

s3backer, on the other hand, reverses the problems. s3backer treats a ton of S3 files as a single virtual file in FUSE, which corresponds to a virtual hard disk drive. You then format a real filesystem onto it, such as ext2. ext2 divides its data into blocks for the purpose of efficiency when reading; for the same reason, s3backer treats every few kilobytes of its virtual file as a separate S3 file. So if I want to read from a single byte inside a 20-meg file, s3backer asks the filesystem's allocation table where that byte is, and maps it to a file on S3, which is between 4k and 64k in size. So it's still getting one entire S3 file to provide your data, but that S3 file happens to be 64k instead of 20 megs.

Here comes s3backer's pitfall: when you start to deal with actual files that are several megabytes large, they are stored as tens, hundreds or thousands of different S3 files. Whenever you would like to read or write this entire file sequentially, in the background you are accruing tens, hundreds or thousands of GET or PUT requests into S3.

This has two major disadvantages: first, in the case of thousands of requests, even a multithreaded s3backer can still only upload your file at a maximum of 100 KB/s or so: the overhead of initiating a new GET/PUT request every 4k to 64k (depending on your configured block size) severely limits the amount of data that can be transferred at a time. There is a large amount of waiting done at the network layer when you're constantly initiating new TCP sockets. Second, those pennies really start adding up as your thousands of GET/PUT requests pile in.

Is it theoretically possible to design an s3-backed, Linux VFS mountable filesystem (whether FUSE or not) which is optimal for both sequential and random access? Can s3fs be improved to provide smarter random access without grabbing entire files? Or can s3backer be improved to provide smarter sequential access without creating thousands of individual TCP sockets (not to mention incurring many PUTs on your bill) for each block?

I am very interested in the direction ahead; perhaps we need guidance from Amazon to fully understand their intent with S3 and to help us come up with an optimal solution.

Thanks,

Sean

Comment by SMcNam, Jul 15, 2008

Just wanted to clarify the early parts of my last post: the issue is not reads vs. writes, but really sequential vs. random access. Random access (seek to a particular place, start reading) usually doesn't involve examining the entire file, so in that case we only want a small amount of the file to go over the network, regardless of whether it's an upload or a download. Sequential access is most efficient over S3 if it can be wrapped in a single GET or PUT request.

It seems like we can't have the best of both worlds unless S3 API allows us to read parts of an S3 file without downloading it.... if so, that might be the way to proceed: Go with s3fs, because it provides the 1:1 mapping, but try and be "smart" about VFS/FUSE requests for I/O, by only downloading pieces of the file that are requested.

Hmm: You might have to go with an adaptive algorithm that starts with small pieces when I/O begins, and "catches on" if sequential access continues in a predictable way, expanding the piece size exponentially to reduce the amount of PUTs or GETs as the I/O continues.
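
That adaptive idea could look something like the sketch below. It is written in Python purely as an illustration (nothing like this exists in s3fs as of this thread); the class name, the 64 KB starting size and the 8 MB cap are all assumptions. S3 does accept an HTTP Range header on GET, which is what makes partial reads possible at all.

```python
class AdaptiveReader:
    """Pick the byte range to fetch for each read, growing on sequential access."""

    def __init__(self, min_chunk=64 * 1024, max_chunk=8 * 1024 * 1024):
        self.min_chunk = min_chunk
        self.max_chunk = max_chunk
        self.chunk = min_chunk
        self.next_offset = None  # where a sequential reader would continue

    def plan(self, offset):
        """Return (first_byte, last_byte) to request for a read at `offset`."""
        if offset == self.next_offset:
            # Sequential: double the piece size, up to the cap.
            self.chunk = min(self.chunk * 2, self.max_chunk)
        else:
            # First read or a seek: fall back to small pieces.
            self.chunk = self.min_chunk
        self.next_offset = offset + self.chunk
        return offset, offset + self.chunk - 1


def range_header(first_byte, last_byte):
    """Format the HTTP Range header value for a partial GET."""
    return "bytes={}-{}".format(first_byte, last_byte)
```

An ID3 scan would then cost one small GET per file, while a long sequential read quickly converges on a few large requests.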

Comment by dylan.clendenin, Jul 21, 2008

First: a moment of silence for yesterday's S3 incident.

Second: I found a really cool use for s3fs I'd like to share. Maybe others are doing this too but I'll share anyways. It is so simple.

Basically the challenge is migrating our initial data over to S3. So today I mounted the same S3 bucket four times:

/mnt/s3_1 /mnt/s3_2 /mnt/s3_3 /mnt/s3_4

then wrote a little multi-threaded uploader that mapped those s3fs directories to local directories and ran an rsync -ra SRC DEST on them simultaneously. Too cool, I thought. Really cut the time of transfer down.
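
The uploader itself isn't shown in the comment, so the following is only a guess at the partitioning step, sketched in Python: top-level source directories are dealt out round-robin across the mount points, and each mount then gets its own rsync. `assign_to_mounts` is a hypothetical helper.

```python
def assign_to_mounts(subdirs, mounts):
    """Deal subdirectories out round-robin; return {mount: [subdirs]}."""
    plan = {m: [] for m in mounts}
    for i, subdir in enumerate(sorted(subdirs)):
        plan[mounts[i % len(mounts)]].append(subdir)
    return plan


if __name__ == "__main__":
    mounts = ["/mnt/s3_1", "/mnt/s3_2", "/mnt/s3_3", "/mnt/s3_4"]
    for mount, dirs in assign_to_mounts(["a", "b", "c", "d", "e"], mounts).items():
        print(mount, dirs)
```

Round-robin ignores directory sizes, so a real uploader would probably want to balance by bytes rather than by count.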

This will only work for certain well defined situations where directory structure is such that the structure can be divided sensibly. And as you know, rrizun put in the work to support rsync, so that is great too.

Comment by archie.cobbs, Jul 28, 2008

Regarding the performance of s3backer's "block" access.

s3backer 1.0.x serviced each request to write a block within the FUSE thread that requested it. This ended up giving very slow write performance, because as it turns out the kernel only issues one write request at a time, and each block write would have to wait for the one before it to complete.

Version 1.1.x supports asynchronous parallel writes out of the block cache, which allows lots of blocks to be written simultaneously. Now you can saturate the network if you so choose, which is as it should be.

An interesting question is should you do the same thing, i.e., parallelization, on the read side? This would involve predictive caching ("read-ahead") of data. I.e., if you get a request to read block zero of a file, assume the next few blocks are going to be needed soon as well and read them all in parallel.

s3fs has these same issues in theory, but at the file level instead of the block level. E.g., one could imagine an application which reads or writes a bunch of files at once sequentially. Each operation would have to wait for the one before it. Of course s3fs could solve that problem the same way, i.e., using an asynchronous writer thread pool.
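
As a rough illustration of the writer-pool idea at the file level, here is a minimal Python sketch. It is not s3fs or s3backer code; `upload_one` is a placeholder for whatever actually performs the PUT, and the worker count of 4 is arbitrary.

```python
from concurrent.futures import ThreadPoolExecutor


def upload_all(paths, upload_one, workers=4):
    """Run upload_one(path) for every path, several at a time.

    Results come back in the same order as `paths`; an exception raised
    by any upload is re-raised here.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(upload_one, paths))
```

With one worker this degenerates to the sequential behaviour described above, where each operation waits for the one before it.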

So there is a trade-off between file vs. block access and it probably depends on your particular situation which "granularity" is best.

Comment by bmilleare, Aug 04, 2008

Is anyone else having issues with the local cache? No matter what I try files are not being cached and constantly get re-downloaded from S3. Running on an Ubuntu EC2 AMI (feisty) with latest s3fuse (r166). Files are being read successfully (and served by apache), but not cached. Command:

/usr/local/bin/s3fs my_bucket /mnt/files -default_acl=public-read -ouse_cache=/mnt/cache -o allow_other

Comment by bmilleare, Aug 04, 2008

Figured it out. Don't use underscores in your bucket names!

Comment by rrizun, Aug 04, 2008

Hi- I don't think underscores in bucket names has anything to do with it; underscores in bucket names are legit; I think it was just a coincidence that local cache started to work when you removed underscores; I have heard other reports of local cache behaving like this so there does seem to be an issue w/local cache, although it does not appear to be a service-affecting issue

just a thought: could it be a permissions issue wrt the local_cache folder?

Comment by bmilleare, Aug 05, 2008

It looks like you're right as the problem is back. The weird thing is, some instances (1 out of 10 at the most) will cache properly yet most will fail to cache at all.

I am creating /mnt/cache on boot (in rc.local) and chmod'ing immediately afterwards to 777, so I don't think its a permissions problem on the directory - s3fs is running as root anyway.

Without local caching being consistent s3fs is pretty much unusable for me right now :(

Comment by bmilleare, Aug 05, 2008

More weirdness, here's what I did:

  1. Launched fresh instance on EC2.
  2. Hit Apache x.zip (result: no cache)
  3. umount'ed fuse and restarted s3fs
  4. Hit Apache x.zip (result: file cached in /mnt/cache/bucket-name/x.zip)
  5. Hit Apache y.zip (result: no cache)
  6. umount'ed fuse and restarted s3fs
  7. Hit Apache z.zip (result: no cache)

Tried the same process 3 times with exactly the same results. Why would it cache the first file and then refuse to cache anything else?

Nothing in my logs to suggest any errors either.

Comment by bmilleare, Aug 05, 2008

Also, it will consistently cache the initially cached file multiple times without fail, ie:

  1. New instance up.
  2. Hit Apache x.zip (result: file cached)
  3. rm -rf /path/to/cache
  4. Hit Apache y.zip (result: no cache)
  5. Hit Apache x.zip (result: file cached)
  6. Hit Apache z.zip (result: no cache)

Comment by rrizun, Aug 07, 2008

Hi- just to clarify what's happening here... when you say "cached" and "no cache"... do you mean cache hit vs cache miss? that is, w/cache enabled, when s3fs downloads a file, it places it in the cache folder... however, that's not a "hit" or anything, only when the second and subsequent requests come in does s3fs look at the cache folder to see if there is a cache hit.

so, having said that, sounds like what you're seeing is (a) the local cache folder is being populated but (b) there are never any cache hits because, despite the fact that the file is in the local cache folder, s3fs still seems to download it (again) from s3
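
The populate-then-hit distinction can be sketched like this. It is a simplified Python model, not the s3fs implementation: the cache_dir/bucket/key layout follows the /mnt/cache/bucket-name/x.zip path reported above, and validating a hit by comparing an md5 of the local copy against the remote one is an assumption based on the md5 checksums mentioned earlier in the thread.

```python
import hashlib
import os


def cache_path(cache_dir, bucket, key):
    """Where a cached copy of bucket/key would live."""
    return os.path.join(cache_dir, bucket, key)


def populate(cache_dir, bucket, key, data):
    """First download: store the content in the cache (not a hit yet)."""
    path = cache_path(cache_dir, bucket, key)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as f:
        f.write(data)


def is_cache_hit(cache_dir, bucket, key, remote_md5):
    """Later requests: hit only if a local copy exists and still matches."""
    path = cache_path(cache_dir, bucket, key)
    if not os.path.isfile(path):
        return False
    with open(path, "rb") as f:
        local_md5 = hashlib.md5(f.read()).hexdigest()
    return local_md5 == remote_md5
```

In these terms, scenario (a) is populate() running, and scenario (b) is is_cache_hit() returning False even though the file is there.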

does that sound about right?

Comment by bmilleare, Aug 08, 2008

By "no cache" I mean the file is not actually being saved to the cache folder.

Comment by rrizun, Aug 10, 2008

Hi bmilleare- I've done a checkin of r177 that fixes a subtle stale curl handle/timeout issue; there could conceivably be some sort of interaction between that and local cache; so, if you're still interested in resolving this, feel free to do a svn checkout of r177 and rebuild and retest and report your findings?!? I'm currently stumped on this one, so, even if this fix doesn't solve the local cache issue, it will still be additional info/help narrow things down... Thanks!

Comment by ggreenaway, Aug 11, 2008

Building on CentOS 5.1 x86_64 results in gssapi_krb5 errors.

s3fs.cpp: At global scope:
s3fs.cpp:439: warning: ‘size_t readCallback(void*, size_t, size_t, void*)’ defined but not used
/usr/bin/ld: skipping incompatible /usr/lib/libkrb5.so when searching for -lkrb5
/usr/bin/ld: skipping incompatible /usr/lib/libkrb5.a when searching for -lkrb5
/usr/bin/ld: cannot find -lkrb5
collect2: ld returned 1 exit status
make: *** [all] Error 1

Comment by rrizun, Aug 12, 2008

dunno... are you running "configure" on libcurl? if so try --disable-krb4 --without-krb4 (just guessing)

Comment by hrabbach, Aug 13, 2008

any predictions when this fs will work with EU buckets? We have really slow connectivity to the US buckets from our ISP and EU buckets should be much faster, so it would be great if we could use those...

Comment by hrabbach, Aug 13, 2008

oh and by the way, nice work... we use this together with Bacula to do server backups that need to be readable in multiple places - before we had to ship physical tapes around the world.

Comment by rrizun, Aug 14, 2008

yes, EU bucket support is on the TODO list! no ETA, just hafta find time to have a look!

Comment by hrabbach, Aug 14, 2008

ok great, if there's any way I can help, let me know... I know you're doing this in your spare time, very much appreciated :)

Comment by e.vanoos...@grons.nl, Aug 28, 2008

On Ubuntu do: sudo apt-get install fuse-utils

On my 8.04 desktop installation it is installed by default.

Comment by e.vanoos...@grons.nl, Aug 28, 2008

How do I stop it?

Comment by rrizun, Aug 28, 2008

umount (or fusermount -u)

Comment by plaurent, Aug 28, 2008

On OS X 10.5.4, using Finder to copy (drag and drop) gives an Error -36. ("The Finder cannot complete the operations because some data in 'YourFile.txt' could not be read or written. (Error code -36).") The resulting file has file size zero.

However copying from the command line (using cp) works perfectly.

Sounds like this might be a MacFuse issue, but I thought I'd mention it here just in case.

Thanks for the excellent work.

Comment by rrizun, Aug 28, 2008
Comment by rrizun, Aug 30, 2008

FYI for a pre-compiled commercially supported enhanced variant of s3fs, see http://www.subcloud.com

Comment by koficharlie, Sep 08, 2008

Hey rrizun, Does this work for Microsoft Windows?

Comment by rrizun, Sep 08, 2008

Hi koficharlie- its Linux and MacOSX for now

Comment by cartroo, Sep 15, 2008

Hi,

I've been using various S3 tools prior to s3fs, most recently "s3cmd" by Michal Ludvig (http://s3tools.logix.cz/s3cmd). It seems like s3fs would be a lot more convenient for me, but I'm having the issue that s3cmd and s3fs don't seem to be able to see each other's files.

I mount the s3fs volume, and there are no suspicious errors in /var/log/messages - I can mkdir a directory, unmount, remount and that directory is still there, along with its files. However, I can't see the files that s3cmd created in s3fs, and neither does s3cmd see the directory and/or files that s3fs created. I can't see any extra buckets being created in s3cmd either, so I don't think it's that the bucket name is wrong, although the name does include a hyphen - I don't know if this causes any problems somehow.

Any ideas what the problem might be? Or should I not be surprised that these two packages don't seem to read each other's entries?

Comment by rrizun, Sep 15, 2008

Hi cartroo-

I've heard of s3cmd but have never used it until now; looks like s3cmd does not really have any concept of folders

in general, the various s3 client programs each have their own scheme for files and folders

s3cmd should be able to see files/folders created with s3fs, in raw form

s3fs should be able to see files created with s3cmd as long as the s3cmd commands issued are "compatible" with the way s3fs wants to view things; you probably don't really wanna do that though

(I'm typing this while listening to 3 other people, so, hope that makes sense!)

Comment by jim.cheetham, Sep 20, 2008

df shows that an s3fs filesystem is mounted; but it does not tell me which bucket it is.

How can I tell which bucket s3fs has mounted?

Comment by rrizun, Sep 20, 2008

>>> How can I tell which bucket s3fs has mounted?

You can't if you're using df, unless, e.g., some sort of naming convention is followed.

You can, however, use something like "ps ax | grep s3fs".
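
If you want that as a script rather than eyeballing ps output, the command line can be picked apart as below. This is a rough Python sketch that assumes the "s3fs bucket [-o option]... mountpoint" usage shown at the top of this page; `bucket_and_mountpoint` is a made-up helper, and feeding it lines from ps is left to the reader.

```python
import shlex


def bucket_and_mountpoint(cmdline):
    """Return (bucket, mountpoint) from an s3fs command line, or None."""
    args = shlex.split(cmdline)[1:]  # drop the program name
    positional = []
    skip_next = False
    for arg in args:
        if skip_next:
            skip_next = False        # this was the value of a "-o"
        elif arg == "-o":
            skip_next = True
        elif not arg.startswith("-"):
            positional.append(arg)   # "-ofoo=bar" style options are ignored
    if len(positional) < 2:
        return None
    return positional[0], positional[-1]


if __name__ == "__main__":
    print(bucket_and_mountpoint("/usr/bin/s3fs mybucket /mnt"))
```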

Comment by ad...@terraninnerspace.com, Oct 01, 2008

I'm running an EC2 instance and using s3fs actively with 6 different buckets. The problem I'm seeing is that s3fs is consuming a great deal of memory, and does not seem to release it. I'm using the newest source, without caching.

Is this truly a memory leak, or some other component not working right ie) fuse or curl.

Here's the info from top:

2129 root 21 0 24164 1804 1292 S 0.0  0.1 0:00.03 s3fs
2135 root 20 0  343m  74m 2204 S 0.0  4.4 2:10.74 s3fs
2143 root 22 0 41784 3136 1848 S 0.0  0.2 0:02.57 s3fs
2151 root 24 0 24164 1804 1292 S 0.0  0.1 0:00.01 s3fs
2161 root 25 0 32568 2412 1804 S 0.0  0.1 0:00.01 s3fs
2168 root 20 0 60556 4196 1848 S 0.0  0.2 0:00.13 s3fs
2176 root 24 0  951m 472m 2184 S 0.0 27.8 0:23.31 s3fs

Any assistance would be appreciated.

Comment by rrizun, Oct 01, 2008

Hi- what is the nature of the s3fs memory consumption? just trying to characterize the problem: does it slowly ramp up over time or does it consume that much memory right away? as well, are the files large in size? are there directories with lotsa files? Thanks

Comment by ad...@terraninnerspace.com, Oct 01, 2008

A quick summary would be yes to all! Files range 4-5 MB for flv, lots of php scripts, and we use a three-tier folder layout, so there are 3 sub-directories in most cases.

Thanks for your quick response !

Comment by rrizun, Oct 02, 2008

Hi- about the only s3fs resource I can think of that's unbounded is curl handles; s3fs maintains a pool of persistent curl handles; if s3fs needs a curl handle and the pool is empty then it allocates a new curl handle and then returns it to the pool; under normal use I can see the pool having, say 50 curl handles, however, under heavy concurrent use it could be 200-300+; not sure off hand how much memory a curl handle consumes; might want to monitor the s3fs process using "top" and then use s3fs in a highly concurrent manner and see if there is a continuous memory consumption ramp up

Comment by martin.melin, Oct 16, 2008

Hi, this project seems like a perfect fit for me. However, I'd like to join in with the others in asking for EU bucket support. Do you suppose you could give a general idea of when you would be able to find the time? AFAIK EU buckets work exactly the same way as US, except for changing the site-id or something similar (I haven't had a look at the S3 API for over a year).

Comment by voegtlin, Oct 16, 2008

Hi,

Apparently rsync does not correctly set the content-type of files. When I copy jpeg files using 'cp' the content-type is set to 'image/jpeg'. When I copy the same files with rsync the content-type is 'application/octet-stream'. Any help would be appreciated.

Comment by rrizun, Oct 16, 2008

Hi Martin- I did have a peek at what it would take for EU bucket support; the original US bucket naming scheme that s3fs uses is different and not compatible with the new EU bucket naming scheme (e.g., mixed case, underscores, etc...), see http://docs.amazonwebservices.com/AmazonS3/2006-03-01/index.html?BucketRestrictions.html

if s3fs were to use the new EU bucket naming scheme then that would break s3fs for existing buckets that were created using the original US bucket naming scheme; is the subcloud binary an option?

Comment by rrizun, Oct 16, 2008

Hi voegtlin- indeed, sounds like a bug; s3fs sets content-type based on file name extension; rsync does its initial upload with a temporary file whose name ends with random characters; since those random characters do not match any known file extension from /etc/mime.types, s3fs sets the content-type to octet-stream; having said that, s3fs "rename" would need to be enhanced to re-lookup the content-type based on the new file extension during rename
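To see why rsync's temporary names end up as octet-stream, here is a rough shell sketch of an extension-based lookup against a mime.types file. It is not the s3fs code; lookup_mime_type is a hypothetical helper, and the case-insensitive match and the octet-stream fallback are modeled on the behavior described on this page.

```shell
# Look up a Content-Type from a file name's extension.
# $1: file name, $2: mime.types file (defaults to /etc/mime.types)
lookup_mime_type() {
    name=$1
    types=${2:-/etc/mime.types}
    ext=${name##*.}
    # A name without any dot has no extension to look up
    [ "$ext" = "$name" ] && ext=
    awk -v ext="$ext" '
        BEGIN { ext = tolower(ext) }
        /^#/ || NF < 2 { next }          # skip comments and types without extensions
        {
            for (i = 2; i <= NF; i++)
                if (tolower($i) == ext) { print $1; found = 1; exit }
        }
        END { if (!found) print "application/octet-stream" }
    ' "$types"
}
```

With a typical mime.types, a name like photo.jpg resolves to image/jpeg, while an rsync temporary name such as .photo.jpg.FQOmch has the "extension" FQOmch, matches nothing, and falls back to application/octet-stream.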

Comment by voegtlin, Oct 17, 2008

thank you for the explanation. I have another question. I have noticed that 'stat' returns a size of zero if a file has been recently created. I create a file (about 20K in size) and do a 'stat' right after it is created, to check that its size is correct. The returned size is zero. However, if I add a 'sleep 0.1' command between file creation and the 'stat' command, then the size is correctly returned. is this a bug ? is there a way to avoid this ?
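Until the cause is known, the 'sleep 0.1' workaround can be made a little less fragile by polling instead of sleeping once. This is just a sketch: wait_nonempty is a made-up helper, the default of 20 tries is arbitrary, and the fractional sleep assumes a sleep command that accepts it (as the one used above evidently does).

```shell
# Wait until a file exists and has a non-zero size.
# $1: file, $2: maximum number of 0.1s polls (default 20)
wait_nonempty() {
    file=$1
    tries=${2:-20}
    while [ "$tries" -gt 0 ]; do
        # -s is true when the file exists and its size is greater than zero
        if [ -s "$file" ]; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 0.1
    done
    return 1
}
```

A thumbnail script could then call wait_nonempty on the new file right after 'convert' and only stat it once the call succeeds.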

Comment by rrizun, Oct 17, 2008

Hi mite.net- I guess an EU version is an option

Comment by rrizun, Oct 17, 2008

Hi voegtlin- do you see this behavior 100% of the time? or is it intermittent? as well, are you doing the stat in parallel before the 20k file has finished uploading? or is everything being done serially? any feel for if it is "eventual-consistency" related?

Comment by voegtlin, Oct 19, 2008

no, I do it sequentially, and it happens 100% of the time. the file is a thumbnail of an image, that is created with 'convert'; perhaps it has to do with convert ?

Comment by rrizun, Oct 19, 2008

dunno if this has anything to do with it http://developer.amazonwebservices.com/connect/thread.jspa?threadID=25535&tstart=0

can you post a script that reproduces the issue?

Comment by vgivanovic, Oct 22, 2008

Works fine (once I got the right program!) I haven't tested all the combinations, but at least /etc/passwd-s3fs is used correctly.

Sorry for the wasted bandwidth. I deleted my postings (to preserve my honor and to prevent more confusion), so you might want to delete yours as well.

Comment by todd.huss.work, Oct 30, 2008

I'm running Fuse 2.7.4 with the latest S3FS r177 on Amazon EC2 with Linux kernel 2.6.18. About once a week I'll see a server totally lock up and then after a reboot I go into /var/log/messages where I see the following "soft lockup" resulting from fuse (and s3fs is the only fuse filesystem we run). Any ideas on what could be causing this?

Oct 26 09:35:05 ip-10-250-38-50 kernel: [12985631.357807] BUG: soft lockup detected on CPU#1!
Oct 26 09:35:05 ip-10-250-38-50 kernel: [12985631.357820]  [<c014b27e>] softlockup_tick+0xae/0xe0
Oct 26 09:35:05 ip-10-250-38-50 kernel: [12985631.357830]  [<c01088ea>] timer_interrupt+0x43a/0x6d0
Oct 26 09:35:05 ip-10-250-38-50 kernel: [12985631.357836]  [<c014b373>] handle_IRQ_event+0x33/0xb0
Oct 26 09:35:05 ip-10-250-38-50 kernel: [12985631.357838]  [<c014b485>] __do_IRQ+0x95/0x100
Oct 26 09:35:05 ip-10-250-38-50 kernel: [12985631.357841]  [<c0106fb1>] do_IRQ+0x31/0x70
Oct 26 09:35:05 ip-10-250-38-50 kernel: [12985631.357955]  [<c01922f1>] mntput_no_expire+0x21/0x90
Oct 26 09:35:05 ip-10-250-38-50 kernel: [12985631.357958]  [<c02a58e0>] evtchn_do_upcall+0xd0/0x130
Oct 26 09:35:05 ip-10-250-38-50 kernel: [12985631.357963]  [<c0105449>] hypervisor_callback+0x3d/0x45
Oct 26 09:35:05 ip-10-250-38-50 kernel: [12985631.357966]  [<c017ecf1>] generic_fillattr+0x81/0xc0
Oct 26 09:35:05 ip-10-250-38-50 kernel: [12985631.357969]  [<ee0c16b2>] fuse_getattr+0x72/0xd0 [fuse]
Oct 26 09:35:05 ip-10-250-38-50 kernel: [12985631.357976]  [<ee0c1640>] fuse_getattr+0x0/0xd0 [fuse]
Oct 26 09:35:05 ip-10-250-38-50 kernel: [12985631.357980]  [<c017ed7c>] vfs_getattr+0x4c/0xd0
Oct 26 09:35:05 ip-10-250-38-50 kernel: [12985631.357983]  [<c017eea4>] vfs_lstat_fd+0x34/0x50
Oct 26 09:35:05 ip-10-250-38-50 kernel: [12985631.357985]  [<ee0be3db>] queue_request+0x6b/0x90 [fuse]
Oct 26 09:35:05 ip-10-250-38-50 kernel: [12985631.357990]  [<ee0be713>] request_send_nowait+0x33/0x70 [fuse]
Oct 26 09:35:05 ip-10-250-38-50 kernel: [12985631.357995]  [<c017f75f>] sys_lstat64+0xf/0x30
Oct 26 09:35:05 ip-10-250-38-50 kernel: [12985631.357997]  [<c0173f2c>] sys_close+0x5c/0xa0
Oct 26 09:35:05 ip-10-250-38-50 kernel: [12985631.358000]  [<c010525b>] syscall_call+0x7/0xb

Comment by rrizun, Oct 30, 2008

I've seen references to that before; don't know what it's all about http://www.google.com/search?hl=en&q=fuse+BUG%3A+soft+lockup+detected+on+CPU%231!&btnG=Search

Comment by gbroiles, Nov 19, 2008

works for me on Ubuntu 8.10 server after loading the required packages. I note that AWS returns a "403 Access Denied" error if a nonexistent bucket name is specified - this caused me to spend a long time entering the authentication information in slightly different ways until I used the jets3t browser to note that the bucket I think of as "default" when using Jungle Disk has a much longer name from AWS' (and hence s3fs') perspective. with the bucket name fixed, everything's great.

Comment by redbe...@openlabs.pl, Nov 21, 2008

Regarding the rsync performance: We have found that it is up to 4-5 times faster to mount a bucket on an EC2 instance and rsync remotely to the instance (e.g. tunneled over ssh) than to mount s3fs locally and rsync to the mounted directory. And less expensive too. I might write a simple howto about this method.

Michal Frackowiak http://michalfrackowiak.com

Comment by adilmohammed, Dec 13, 2008

Can I mount the same S3 bucket using S3fs on multiple servers? if so, any known problems?

Thanks

Comment by adilmohammed, Dec 13, 2008

Also, can you please tell me what dependencies I need to install s3fs?

Thanks

Comment by rrizun, Dec 14, 2008

what os are you building/installing on?

Comment by adilmohammed, Dec 14, 2008

I am on Debian, thanks!

Comment by rrizun, Dec 14, 2008

search this page for "apt-get"

Comment by adilmohammed, Dec 14, 2008

Thanks mate!

Another question, Can I mount the same S3 bucket using S3fs on multiple servers? if so, any known problems?

Thanks!

Comment by rrizun, Dec 14, 2008

yes!

Comment by adilmohammed, Dec 14, 2008

And no known problems or issues or things I should watch out for if I attach the same bucket to multiple instances?

Thanks so much for your help

Comment by rrizun, Dec 14, 2008

no known problems other than the usual concurrency issues, e.g., no different than NFS mounting the same folder from more than one machine

Comment by adilmohammed, Dec 14, 2008

Thanks!

Comment by sappajodu, Dec 16, 2008

hi, I mounted a bucket and created some folders and files in it. I mounted the same bucket on another instance. I do not see the files created on the first instance in the second one. When I unmount and mount it again then I see the files.

Second issue is that the changes done using s3fs are not visible from S3 firefox organizer. Am I missing something here?

Thanks

Comment by rrizun, Dec 17, 2008

don't know why files are not appearing from the perspective of the second instance... same accessKeyId? same bucket?

as far as s3fox goes, I believe s3fox uses a different scheme for representing files/folders that is incompatible with s3fs; I recommend using jets3t instead because it makes no assumptions about the contents of an s3 bucket

Comment by joeauty, Jan 07, 2009

I'm able to mount my bucket and see my file listing in it, but I'm getting input/output errors. I'm assuming that this is not a problem with my credentials or bucket since I'm getting a listing, and I don't have a clock skew. I'm seeing the following:

ls -l /mnt ls: cannot access /mnt/dir1: Input/output error ls: cannot access /mnt/list: Input/output error ls: cannot access /mnt/dir2: Input/output error total 0 ?????????? ? ? ? ? ? dir1 ?????????? ? ? ? ? ? list ?????????? ? ? ? ? ? dir2

I've reproduced this on two separate Ubuntu 8.10 machines. I'm not seeing anything in /var/log/messages. Any suggestions for some things I can try to get this working?

Comment by joeauty, Jan 07, 2009

Sorry, let me reformat that output:

# ls -l /mnt
ls: cannot access /mnt/dir1: Input/output error
ls: cannot access /mnt/list: Input/output error
ls: cannot access /mnt/dir2: Input/output error
total 0
?????????? ? ? ? ?                ? dir1
?????????? ? ? ? ?                ? list
?????????? ? ? ? ?                ? dir2

Comment by rrizun, Jan 09, 2009

probably trying to read a bucket w/objects that were created by another s3 tool? if so then s3fs is looking for additional meta data and not finding it

Comment by joeauty, Jan 13, 2009

These files were uploaded by s3sync.rb, the Ruby rsync clone... Any way to fix this?

Comment by rrizun, Jan 13, 2009

no real way to fix it because it is not really a problem per se; s3fs and s3sync.rb do not understand each other's formats; solution would be to re-upload files w/s3fs (alternatively, figure out which meta data needs to be set to make s3fs happy then use another s3 tool to set meta data on those files uploaded w/s3sync.rb)

Comment by dragen, Jan 20, 2009

So is the official answer, at least for now, that s3fs only works with objects uploaded by s3fs, and not by other tools like s3fox? Does anyone know an easy way to convert the objects?

Comment by silverfish.design, Jan 23, 2009

Could somebody give a rough idea of costs please? Do you pay for data transferred, or by transaction? Does the storage cost depend on the amount of data in your bucket, or just the bucket size?

Can you join buckets for larger storage sizes?

Comment by rrizun, Jan 23, 2009

Comment by mathews.kyle, Jan 24, 2009

I'm seeing the same error as joeauty. I mount a new empty bucket and the process finishes with no error. However, when I try to go to the mounted directory, it tells me permission is denied.

ls -l returns

ls -l
ls: cannot access s3: Permission denied
total 0
d????????? ? ? ? ?                ? s3

Any clue what's going on here? S3FS is perfect for my needs and I'd really like to get it running.

Thanks

Comment by mathews.kyle, Jan 24, 2009

oh, forgot to mention -- I'm running Ubuntu 8.10 and the latest version (177) of S3FS. /var/log/messages isn't helping much -- all I get is:

Jan 24 11:13:00 kyle-desktop s3fs: init $Rev: 177 $
Jan 24 11:13:04 kyle-desktop s3fs: destroy
Jan 24 11:13:09 kyle-desktop s3fs: init $Rev: 177 $

Comment by mikeage, Jan 31, 2009

Does S3FS and/or the underlying S3 storage provide any form of integrity checking? I.e., is there any way a file can be corrupted during upload that would not be detected?

Comment by rrizun, Feb 02, 2009

s3fs does not check Content-MD5

Comment by stock593, Feb 11, 2009

I'm trying to figure out why rsync'ed files to S3 wouldn't be viewable in Firefox/Google Chrome. For example, if I try to view a .htm or .jpg file, it prompts me to open or save the files. Oddly enough they view fine in IE. Any ideas? Maybe s3fs isn't recognizing my /etc/mime.types?

Comment by rrizun, Feb 12, 2009

that is possible; in ff, what does "Tools -> Page Info" say about the Content-Type?

Comment by stock593, Feb 12, 2009

It says "text/html" if I open a .htm file.

Comment by rrizun, Feb 14, 2009

ah, I know what the problem is: s3fs does not consult mime.types on renames... when rsync copies, it copies to a tmp file and then renames it at the end, thus the loss of s3fs content-type... to fix it, just add these lines to s3fs_rename

meta["Content-Type"] = lookupMimeType(to);
meta["x-amz-metadata-directive"] = "REPLACE";

(Note- NOT tested!)

Comment by stock593, Feb 14, 2009

Thanks, that worked perfectly :)

Comment by barrybrown70, Feb 19, 2009

Is the size of the local file cache constrained? Or will it just cache files until the disk fills up? What happens when the disk is full?

Comment by rrizun, Feb 20, 2009

no constraints... as a workaround, use a cron job to periodically prune/purge the local file cache
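A cron-driven prune could look roughly like the sketch below. The function name, the 7-day default and the cache path in the usage note are all assumptions; it also relies on find extensions (-mindepth, -empty, -delete) found in GNU and BSD find. It makes no attempt to check whether s3fs still has a cached file open, so pick an age comfortably longer than any upload you expect.

```shell
# Delete cached files not modified for more than N days, then remove
# any directories that were left empty.
# $1: cache directory, $2: age in days (default 7)
prune_s3fs_cache() {
    cache_dir=$1
    days=${2:-7}
    [ -d "$cache_dir" ] || return 1
    find "$cache_dir" -type f -mtime +"$days" -delete
    find "$cache_dir" -mindepth 1 -type d -empty -delete
}
```

Saved as a script, this could be run nightly from cron, e.g. with an entry like "0 3 * * * /usr/local/bin/prune-s3fs-cache /tmp/s3fs-cache 7" (both paths are placeholders).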

Comment by greenventuresinc, Feb 25, 2009

I'm mounting an S3 volume to multiple servers. When a user uploads an image to one of the servers, I want it stored on the S3 bucket so all servers can see the same images no matter which server uploaded it or which server is serving their page.

The bucket mounts to the instances without trouble, but I'm having trouble getting the permissions correct so that the apache user can write to the bucket. The bucket is symbolically linked from my web directory to /mnt/mybucket. I've also used the -o allow_other option when mounting the bucket. The bucket is set as world readable and I've tried everything from owner writable to world-writable. If I look at the directory permissions, it is showing rwxr-xr-x.

Is there a way to do this???

Comment by malone.fxhome, Mar 05, 2009

A workaround for the rsync/mime-type issue mentioned on "Feb 14, 2009" is to use the "--inplace" option with rsync. This forces it to write to the correct filename to start with, rather than writing to a temp file and renaming it. Which means the right content-type is set.

Comment by mieses, Mar 09, 2009

The '.' and '..' entities are not listed with 'ls -a'. Is this the expected result?

Comment by rrizun, Mar 09, 2009

>>> The '.' and '..' entities are not listed with 'ls -a'. Is this the expected result?

yes

Comment by pa...@inutilfutil.com, Mar 10, 2009

Hello! I've been using it for a few days on my EC2 server (an Ubuntu x86 Server instance). Yesterday, one of my s3fs mounts stopped working without reason, and would print "Transport endpoint is not connected" as a result of any attempt to access it. Re-mounting it fixed the problem (temporarily?).

Running ls -l in the parent folder displayed the "defective" mount point in red, with plenty of question marks:

ls: cannot access videos.example.com: Transport endpoint is not connected
total 0
d????????? ? ?    ?    ?                ? videos.example.com
drwxr-xr-x 1 root root 0 1969-12-31 19:00 www.example.com

In my /var/log/messages, I have a s3fs segfault (This one-line entry is the only interesting thing...)

Mar 10 13:48:49 domU-12-31-39-00-7D-13 kernel: [1132444.746495] s3fs[30280]: segfault at 0 ip b7bbc323 sp b6fe1fec error 4 in libc-2.8.90.so[b7b45000+158000]

This is everything I could find - Although it probably doesn't help much... :( Has anybody else had this issue? Any chances to get it fixed? (May I help? How?)

Anyway, Thanks for this great software ;)

Comment by mieses, Mar 11, 2009

I noticed that ls -l can return d????????? when a directory name contains a trailing /, which can be present if some other s3 clients have interacted with the bucket. I don't know if this applies in your case.

Comment by neilrobst, Mar 11, 2009

Hi - I would just like to add my voice to the requests for an Amazon S3 EU-compatible version of s3fs, please? :-)

Comment by paulo.raca, Mar 16, 2009

Mieses, I didn't have any directory inside the crashed bucket (Just a long and messy list of files), so I think the "/" problem probably doesn't affect me.

And no other client was active when it crashed.

Comment by paulo.raca, Mar 17, 2009

Crashed again :( Is anybody else running s3fs in a server? I'm getting these random crashes every 1-2 days. :(

Comment by st...@bov.nu, Mar 18, 2009

I am backing up a lot of data with a script and rsync. All was well until I modified my script with nano; because the cache had filled my OS drive (a small 10 GB one), my edit wasn't saved and I lost my script. I have disabled the cache for now. Is there any way to manage the cache? Is it really required if mainly writing data?

Comment by pnsweetma, Mar 19, 2009

I downloaded the "featured" code bundle s3fs-r177-source.tar.gz and compiled it on my Ubuntu 8.10 box. It never managed to establish a proper connection (I didn't get to the bottom of why) but my S3 Account Usage says I've made 1.3 million requests and will be charged accordingly! I imagine the code must have been looping, but that could prove to be a costly loop for me...

Comment by rrizun, Mar 20, 2009

st...@bov.nu- there is no local cache management; you can use a cron job to periodically purge the local cache; in your case it sounds like you probably don't even need local cache-

Comment by rrizun, Mar 20, 2009

paulo.raca- I'm not aware of any crash conditions in the s3fs code itself; wondering if it's one of the libraries? can you get a coredump and invoke gdb on it for a traceback?

Comment by A1kmmm, Mar 20, 2009

A cautionary tale (and some debugging tips) about using this with SSL: If you are going to use -ourl=https://s3.amazonaws.com, make sure you do not have a trailing slash on the end of the URL (I did this at first, and it took quite a lot of effort to work out why). If it works without SSL but not with it, a good thing to check is that you don't have a trailing slash on the end of your URL.
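One way to guard against this in a mount script is to strip trailing slashes before handing the URL to s3fs. A minimal sketch; strip_trailing_slash is a hypothetical helper, not something s3fs provides:

```shell
# Remove any trailing slashes from a URL before passing it to -ourl=
strip_trailing_slash() {
    url=$1
    # Keep chopping one "/" off the end until none is left
    while [ "${url%/}" != "$url" ]; do
        url=${url%/}
    done
    printf '%s\n' "$url"
}
```

For example, a wrapper could mount with -ourl="$(strip_trailing_slash "$S3_URL")", where S3_URL is whatever variable the script keeps the endpoint in.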

In case you are getting a different problem, here are some debugging tips I worked out while debugging this:

It writes information to syslog - check /var/log/syslog (or wherever your system logs to) for lines that look like this:

Mar 21 16:47:40 amlap s3fs: ###timeout
Mar 21 16:47:40 amlap s3fs: ###retrying...
Mar 21 16:47:42 amlap s3fs: ###response=403

If you get an error response (like 403), AWS sends back a response in an XML language explaining the error. The problem is you can't see the error. If the problem occurs for HTTP too, the easiest way is to use wireshark or tcpdump or another packet capture program to spy on what s3fs is doing and read the error message. If, like I was, you are encountering a problem exclusively for SSL, however, read on.

My problem meant that readdir (triggered by ls), as well as practically every other call, were not working. For ease of debugging, I chose to debug what happened when readdir is called.

s3fs sets the curl option CURLOPT_FAILONERROR, which makes it hard to get the output. So I went into the function starting with:

  s3fs_readdir(const char *path, void *buf, fuse_fill_dir_t filler, off_t offset, struct fuse_file_info *fi) {

and changed the following line:

  curl_easy_setopt(curl, CURLOPT_FAILONERROR, true);

to:

  curl_easy_setopt(curl, CURLOPT_FAILONERROR, false);

Change this line and recompile by running make.

This is only a temporary change for debugging - it will cause problems if used in production, so don't forget to change the line back and recompile once you solve the problem.

s3fs forks when it runs, so the best way to debug it is to start it normally from the command line, and then attach to it with gdb:

  ps ax | grep s3fs

From here, get the PID of s3fs, and run

  gdb ./s3fs 1743

where ./s3fs is the path to your binary, and 1743 is the pid. Type:

  break writeCallback
  break calc_signature
  cont

Now trigger the problematic request with the ls command from another shell, and change back to the gdb shell.

A breakpoint will hit in calc_signature as follows:

Breakpoint 3, calc_signature (method=
        {static npos = 18446744073709551615, _M_dataplus = {<std::allocator<char>> = {<__gnu_cxx::new_allocator<char>> = {<No data fields>}, <No data fields>}, _M_p = 0x42037dc0 "h\006G\001"}}, content_type=
        {static npos = 18446744073709551615, _M_dataplus = {<std::allocator<char>> = {<__gnu_cxx::new_allocator<char>> = {<No data fields>}, <No data fields>}, _M_p = 0x42037db0 "8\020���\177"}}, date=
        {static npos = 18446744073709551615, _M_dataplus = {<std::allocator<char>> = {<__gnu_cxx::new_allocator<char>> = {<No data fields>}, <No data fields>}, _M_p = 0x42037da0 "�\005G\001"}}, headers=0x1479620, resource=
        {static npos = 18446744073709551615, _M_dataplus = {<std::allocator<char>> = {<__gnu_cxx::new_allocator<char>> = {<No data fields>}, <No data fields>}, _M_p = 0x42037d90 "H\003G\001"}}) at s3fs.cpp:393
393	calc_signature(string method, string content_type, string date, curl_slist* headers, string resource) {
(gdb) next
394		string Signature;
(gdb) 
395		string StringToSign;
(gdb) 
396		StringToSign += method + "\n";
(gdb) 
397		StringToSign += "\n"; // md5
(gdb) 
398		StringToSign += content_type + "\n";
(gdb) 
399		StringToSign += date + "\n";
(gdb) 
400		int count = 0;
(gdb) 
401		if (headers != 0) {
(gdb) 
404				if (strncmp(headers->data, "x-amz", 5) == 0) {
(gdb) 
402			do {
(gdb) 
411		StringToSign += resource;
(gdb) 
413		const void* key = AWSSecretAccessKey.data();
(gdb) 
414		int key_len = AWSSecretAccessKey.size();
(gdb) print StringToSign.c_str()
$1 = 0x1470378 "GET\n\n\nSat, 21 Mar 2009 03:50:45 GMT\n/wwjcode"
(gdb) cont

Next, it will likely break in writeCallback, which you can debug like this...

Breakpoint 2, writeCallback (data=0x1484c8b, blockSize=1, numBlocks=707, userPtr=0x42037ec0) at s3fs.cpp:451
451	  string* userString = static_cast<string*>(userPtr);
(gdb) next
452	  (*userString).append(reinterpret_cast<const char*>(data), blockSize*numBlocks);
(gdb) 
453	  return blockSize*numBlocks;
(gdb) 
454	}
(gdb) printf "%s", (*userString).c_str()
<?xml version="1.0" encoding="UTF-8"?>
<Error><Code>SignatureDoesNotMatch</Code><Message>The request signature we calculated does not match the signature you provided. Check your key and signing method.</Message><StringToSignBytes>47 45 54 0a 0a 0a 53 61 74 2c 20 32 31 20 4d 61 72 20 32 30 30 39 20 30 33 3a 35 31 3a 31 33 20 47 4d 54 0a 2f 2f 77 77 6a 63 6f 64 65</StringToSignBytes><RequestId>89EED394FEBE2205</RequestId><HostId>T+7tNvxde2qrUSUkIxY9vCv0uOeqDsa6yZgHdfR3Txuqs+Ql7YudvnanvDEOUDgt</HostId><SignatureProvided>nHjQiDuIpSUCWpCe2z6g8XBU2n4=</SignatureProvided><StringToSign>GET


Sat, 21 Mar 2009 03:51:13 GMT
//wwjcode</StringToSign><AWSAccessKeyId>1VBXKK4V5WVRSTTY0JG2</AWSAccessKeyId></Error>(gdb) 

In this case, you will see that the string we signed has /wwjcode in it, and the expected string has //wwjcode, because of the extra slash on the end of the URL. It is likely you will have some different type of error, but hopefully this is enough to get you started debugging the problem.
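If you want to compare strings without gdb, the string to sign can be rebuilt by hand from the layout visible in the trace above (method, md5, content-type, date, then the resource, with no x-amz headers). The helpers below are an illustrative sketch with made-up names; hex_bytes prints the bytes in the same form as the <StringToSignBytes> element so a doubled slash (2f 2f) is easy to spot.

```shell
# Build the string to sign for a request that carries no x-amz headers.
# $1 method, $2 content-md5, $3 content-type, $4 date, $5 resource
build_string_to_sign() {
    printf '%s\n%s\n%s\n%s\n%s' "$1" "$2" "$3" "$4" "$5"
}

# Print a string as space-separated hex bytes
hex_bytes() {
    printf '%s' "$1" | od -An -tx1 | tr -s ' \n' ' ' | sed 's/^ //; s/ $//'
}
```

Calling hex_bytes on the output of build_string_to_sign GET '' '' "$date" /wwjcode and diffing the result against the bytes in the error response shows exactly where the two sides disagree. The signature itself is the base64-encoded HMAC-SHA1 of this string, keyed with the secret access key.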

Don't forget to change the

  curl_easy_setopt(curl, CURLOPT_FAILONERROR, false);

back to true and re-run make or you will get hung requests!

Comment by mumrah, May 23, 2009

In order for me to get /etc/fstab working, I needed to install fuse-utils.

Comment by markkerzner, May 26, 2009

It works great, but when I delete a file from the mount, it is not deleted from the S3. Why?

Comment by boomslang22, Jul 13, 2009

I am using rsnapshot with s3fs as the target for offsite backups. My rsnapshot logs show the following error:

09/Jul/2009:23:50:45? /usr/bin/rsnapshot daily: ERROR: /bin/cp -al /mnt/s3/daily.0 /mnt/s3/daily.1 failed (result 256, exit status 1). Perhaps your cp does not support -al options?

Specifically, the -l is what's making it fail. The -l flag tells cp to link rather than copy, and a manually-invoked cp -l shows the following:

# cp -l /mnt/s3/test /mnt/s3/test2
cp: cannot create link `rsnap2': Operation not permitted

Hard links on s3fs are not supported. Symbolic linking, however, seems to be supported, with cp -s succeeding.

Is hard link support something that can be implemented?

Comment by rrizun, Jul 13, 2009

Hi- hard links imply reference counting, so, unless amazon s3 supports references in the future, it is unlikely that s3fs will support hard links
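Before pointing a tool like rsnapshot at a mount, it can be handy to probe whether the directory accepts hard links at all. A small sketch, with a made-up function name; it creates and removes a scratch file in the target directory, so it needs write access there:

```shell
# Return 0 if hard links can be created in the given directory,
# 1 if they cannot, 2 if the scratch file could not be created.
supports_hardlinks() {
    dir=$1
    src=$(mktemp "$dir/.hltest.XXXXXX") || return 2
    if ln "$src" "$src.link" 2>/dev/null; then
        rm -f "$src" "$src.link"
        return 0
    fi
    rm -f "$src"
    return 1
}
```

On an s3fs mount this should report failure, matching the "Operation not permitted" error above, while a local disk normally passes.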

Comment by jeffrey.t.chang, Jul 20, 2009

Hello,

I am having some trouble with the current s3fs on MacFUSE 2.0.3 running on OS X 10.5.7. I can successfully mount a bucket and see its contents, but then can not copy new files into it. If I try to copy a file into S3, I will get an "Invalid argument" error and an empty file.

$ cp ~/Desktop/api-design.pdf .
cp: ./api-design.pdf: Invalid argument
cp: /Users/jchang/Desktop/api-design.pdf: could not copy extended attributes to\
 ./api-design.pdf: Attribute not found
$ ls -lh
total 48
[...]
-rw-r--r--  1 jchang  staff    -1B Jul 20 22:04 api-design.pdf

However, if I repeat the operation, the "Invalid argument" error goes away (although I still get an "Attribute not found" error), and the file now exists in S3.

$ cp ~/Desktop/api-design.pdf .
cp: /Users/jchang/Desktop/api-design.pdf: could not copy extended attributes to\
 ./api-design.pdf: Attribute not found
$ ls -lh
total 456
[...]
-rw-r--r--  1 jchang  staff   203K Jul 20 23:46 api-design.pdf

Furthermore, drag-and-drop from the Finder generates an error and an empty file:

The Finder cannot complete the operation because some data in
<filename> could not be read or written. (Error code -36)

Any ideas on why this might be happening? I have tried s3fs on the same Amazon bucket on a similarly configured computer at work, and I do not see this error. How to diagnose? I'm wondering if it might be network related?

Jeff

Comment by rrizun, Jul 21, 2009

when you say it works on a similar configured computer at work, is that macfuse too? or is it linux

might want to look at the 'issues'; there are some macfuse related problems and solutions

Comment by singh.madhusudan, Jul 27, 2009

What steps will reproduce the problem?

  1. Compile s3fs.cpp with a modification:
     #define FUSE_USE_VERSION 26
     #define off_t off_t
  2. Mount s3fs bucket on amazon, using the command:
     ./s3fs mybucketxxx ~/Desktop/CCCS3 -olocal,ping_diskarb,volname=CCCS3
  3. hdiutil create MacHD -size 50G -type SPARSE -fs HFS+ -layout GPTSPUD -stretch 50G -volname MacintoshHD
     followed by an attempt to mount this:
     hdiutil attach MacHD.sparseimage

What is the expected output? What do you see instead?

Expected output - mounted sparseimage

Seen:

$ hdiutil attach MacHD.sparseimage
hdiutil: attach failed - Illegal seek

What version of the product are you using? On what operating system?

Please provide any additional information below.

Version 26 Operating system : Mac OSX Leopard 10.5.7

Comment by jeffrey.t.chang, Aug 02, 2009

Yes, at work the computer is also running OS X with MacFUSE 2.0.3.

I have tried a third computer at home (also running OS X 10.5.7 with MacFUSE 2.0.3), and the third computer does not have the problem. So the problem seems to be isolated to that one computer and not the network. It is a mystery to me why different computers running the same software would respond differently.

Based on the error messages, I feel that maybe the issue has something to do with the way OS X deals with extended attributes (an issue that Linux doesn't have(?)). I'll dig deeper into this and post if I find anything useful.

My problem seems to be the same as issue 49 "Can't upload files". Unfortunately, there's no solution for that one either.

Comment by ad...@terraninnerspace.com, Aug 05, 2009

Have been using s3fs for some time now, and have begun to have problems with folders with many files. In one case there are 10000 files in the bucket, and any file scan (dir/ls) results in an s3fs lockup, and in one case, a server reboot. The command line used is:

s3fs -o allow_other fmc_data -o retries=15 -o connect_timeout=8 -o readwrite_timeout=40 /data

No caching is used, and I have modified the readdir routine, increasing max-keys to 150, which gave a small performance increase. Am tempted to raise this even more, since there are so many files. Am I heading in the right direction?
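As a rough guide to what raising max-keys buys, the number of bucket listing requests needed is the file count divided by max-keys, rounded up. The helper below is purely illustrative (the name is made up) and counts only the listing requests, not any per-file requests s3fs may issue afterwards.

```shell
# Number of list requests needed to page through a bucket.
# $1: total number of keys, $2: max-keys per request
list_requests_needed() {
    total=$1
    max_keys=$2
    # Integer ceiling of total / max_keys
    echo $(( (total + max_keys - 1) / max_keys ))
}
```

For 10000 files, max-keys=150 needs 67 list requests, while max-keys=1000 (the most S3 returns per request) needs 10.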

Comment by deevilcat, Aug 12, 2009

for debian lenny, you'll need the following packages

sudo aptitude install build-essential libfuse-dev libfuse2 libcurl4-dev libssl0.9.8 libssl-dev libxml2-dev libxml2

Comment by ke...@incircuit.com, Aug 19, 2009

Things work beautifully for me except for any operation that involves permissions changes. For instance, a standard "cp filename dest" works fine, but "cp -p filename dest" does not. rsync, which is what I'm ultimately trying to use, does a chmod automatically, with the same results. A straight chmod errors out as well. I believe all of these errors are related, so fixing one thing will most likely resolve everything.

# chmod 777 testfile.zip
chmod: changing permissions of `testfile.zip': Input/output error

# /usr/bin/rsync -ru /u01/backup/exports/ /s3/backup/exports
rsync: rename "/s3/backup/exports/.testfile.zip.FQOmch" -> "testfile.zip": Input/output error (5)
rsync error: some files could not be transferred (code 23) at main.c(892) sender=2.6.8?

The rsync is running as root and files are owned by a lesser privileged user, if it matters.

CentOS 5.3 s3fs r177 curl 7.15.5-2.1.el5_3.5

Any ideas?

Comment by davebcn, Sep 23, 2009

Hi,

for those running into problems dealing with European buckets, I made a small fork that works with the European bucket URL scheme.

It can be found at http://github.com/tractis/s3fs-fork

Hope it's useful!

Dave

Comment by jzimmek, Sep 26, 2009

great to see european bucket support - hope this will be merged back into main development.

Comment by carlo.beccaria, Oct 12, 2009

anybody know why using rsync over s3fs is very (very) slow? thanks

Comment by rrizun, Oct 12, 2009

try rsync --inplace

Comment by Gabriel....@gmail.com, Oct 18, 2009

why does the filesize show up incorrectly on a zero byte file? the same for directories, the size that shows up on a directory is ridiculously huge. this seems to break some applications that aren't happy with opening a file for writing, running stat on the file, and finding out the file size is wrong.

for example ...

# touch foo
# ls -l foo
-rw-r--r-- 1 root root 18446744073709551615 Oct 18 01:51 foo
# mkdir bar
# ls -ld bar
drwxr-xr-x 1 root root 18446744073709551615 Oct 18 01:53 bar

Comment by Gabriel....@gmail.com, Oct 19, 2009

never mind, I figured out the problem. it turns out that starting with curl 7.19.4, it returns -1 if the Content-Length is not known. See my post here for further details and a quick and dirty patch.

http://code.google.com/p/s3fs/issues/detail?id=50#c5

rrizun, can you please commit my patch to trunk (and cleanup the patch if necessary) for those of us running a newer version of curl?
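
To illustrate why -1 shows up as that enormous number: 18446744073709551615 is what -1 becomes when it is reinterpreted as an unsigned 64-bit integer. The snippet below is only a sketch of the idea, not the actual patch from the issue; the variable names and the clamp-to-zero guard are assumptions made for illustration.

```shell
# -1 is what newer curl reports when Content-Length is unknown (per the comment above).
reported=-1

# Reinterpreting -1 as an unsigned integer; on a typical 64-bit system with bash
# this shows the same bogus size that ls displayed.
printf '%u\n' "$reported"

# Hypothetical guard: clamp a negative length to 0 before using it as a file size.
if [ "$reported" -lt 0 ]; then
    size=0
else
    size=$reported
fi
echo "$size"
```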

Comment by heininger, Oct 21, 2009

Anybody successfully using tractis/s3fs-fork? I get Input/output error after mounting and doing an ls ...

TIA

Comment by dyamins, Oct 25, 2009

When I try to install I get the following error:

[root@localhost s3fs]# make
Package 'libcurl' has no Version: field
g++ -ggdb -Wall -D_FILE_OFFSET_BITS=64 -I/usr/include/fuse -pthread -L/lib64 -lfuse -lrt -ldl -I/usr/include/libxml2 -lxml2 -lz -lm -lcrypto s3fs.cpp -o s3fs
s3fs.cpp:365:25: error: openssl/bio.h: No such file or directory
s3fs.cpp:366:28: error: openssl/buffer.h: No such file or directory
s3fs.cpp:367:25: error: openssl/evp.h: No such file or directory
s3fs.cpp:368:26: error: openssl/hmac.h: No such file or directory
s3fs.cpp:500:25: error: openssl/md5.h: No such file or directory

I have installed openssl. Does s3fs not know where my installation is?
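
A possible cause, assuming a stock Fedora/CentOS-style system: the openssl runtime package is installed, but the header files, which ship in a separate development package, are not. The check below is a sketch that only looks in /usr/include, which is an assumption about where the headers would normally live; the package names in the comments are the usual ones for those distributions.

```shell
# Check whether the OpenSSL development headers are present in the default include path.
if [ -f /usr/include/openssl/hmac.h ]; then
    headers=present
else
    headers=missing
fi
echo "openssl headers: $headers"

# If they are missing, installing the development package (as root) is the likely fix:
#   yum install openssl-devel      # Fedora / CentOS
#   apt-get install libssl-dev     # Debian / Ubuntu
```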

Comment by mathieu.clabaut, Nov 02, 2009

@heininger, yes it worked but only when using sudo....

Comment by ourdoings, Nov 03, 2009

I want to "publicly" serve files with obscure URLs that include a directory component, e.g. ab/cd/efgh.jpg but I don't want ab or ab/cd to be listable. It looks like I'm OK, in that when I try to access a directory via an s3 URL, what I get is an empty file. Am I right that the directories are inaccessible except to s3fs? So if I don't give public read access to the bucket, then nobody can list anything?

Comment by HighInBC, Nov 09, 2009
ourdoings:

I think the ability to list a bucket's contents is a property of the bucket's ACL, not the objects' ACLs. If the bucket was created without specifying an ACL then I don't think anyone can get a listing of files, even if the files themselves are readable.

Comment by kaotakka, Yesterday (37 hours ago)

Caching doesn't seem to be working for me. I compiled and installed r177 without a problem on FC8 (on EC2). Everything other than caching seems to work fine. Here's a little shell dump:

# mkdir /s3
# chmod 777 /s3
# mkdir /s3tmp
# chmod 777 /s3tmp
# /usr/bin/s3fs testbucket2543 /s3 -ouse_cache=/s3tmp
# ls /s3
foo.txt
# ls /s3tmp
# cp /s3/foo.txt /tmp/
# cat /tmp/foo.txt
hai!
# ls -la /s3tmp
total 8
drwxrwxrwx  2 root root 4096 Nov 23 02:49 .
drwxr-xr-x 27 root root 4096 Nov 23 02:49 ..

Any ideas?

