An exception in DTPHandler.close() makes the server go into an endless loop. #115

Closed
giampaolo opened this issue May 28, 2014 · 13 comments
Labels: bug, Component-Library, imported (imported from old googlecode site and very likely outdated)

Comments

@giampaolo
Owner

From arkadius...@gmail.com on May 09, 2009 18:51:08

Hi, I'm writing an FTP server giving access to a kind of virtual file
system (hierarchical objects in memory). In order to do that I've
subclassed ftpserver.AbstractedFS and created my own File type. Because of
the way this is implemented, the file type's close() method can raise an
exception when the file was opened for writing.

This seems to confuse pyftpdlib, because in that case it goes into an
endless loop, repeatedly spitting this out to the console:

127.0.0.1:49531 ==> 226 Transfer complete.
[user]@127.0.0.1:49531 "some_file_name" received.

My close() method is called only once; it raises an exception (IOError) and
then the loop never calls it again.

This can be easily reproduced by modifying the ftpserver's code. Putting a
raise statement right after the server closes the file seems to be enough.

This seems to happen in the DTPHandler.close() method, more specifically in the

    self.file_obj.close()

line. Putting this example line

    raise IOError(13, 'Permission denied.')

right after it, makes the server go into a loop when trying to send a file
to it. Note that this line is reached only once, NOT every iteration.
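
For what it's worth, here is a minimal, untested reproduction sketch of the scenario described above, written against the 0.5.x ftpserver module. The FailingFile/FailingFS names are made up, and the abstracted_fs hook and AbstractedFS.open() signature are assumed to match that release:

    from pyftpdlib import ftpserver

    class FailingFile(object):
        """Wraps a real file object; close() closes it and then raises."""

        def __init__(self, fileobj):
            self._file = fileobj

        def __getattr__(self, name):
            # delegate write()/read()/etc. to the wrapped file object
            return getattr(self._file, name)

        def close(self):
            self._file.close()
            # simulate the virtual file system failing at close time
            raise IOError(13, 'Permission denied.')

    class FailingFS(ftpserver.AbstractedFS):
        def open(self, filename, mode):
            f = ftpserver.AbstractedFS.open(self, filename, mode)
            if 'w' in mode or 'a' in mode:
                return FailingFile(f)
            return f

    authorizer = ftpserver.DummyAuthorizer()
    authorizer.add_user('user', '12345', '.', perm='elradfmw')

    handler = ftpserver.FTPHandler
    handler.authorizer = authorizer
    handler.abstracted_fs = FailingFS  # assumed hook for a custom filesystem

    server = ftpserver.FTPServer(('127.0.0.1', 2121), handler)
    server.serve_forever()

Uploading any file through such a server should then trigger the repeated "226 Transfer complete." output.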

I'm not sure whether the RFC allows errors to be reported to the client after
all the data has been received by the server, but if it does, pyftpdlib
should catch these IOErrors and pass them along to the client.

Tested using:
* pyftpdlib 0.5.1
* Python 2.6.2
* Windows Vista Ultimate 32bit SP1

Original issue: http://code.google.com/p/pyftpdlib/issues/detail?id=115

@giampaolo
Owner Author

From billiej...@gmail.com on May 09, 2009 10:37:27

I think the best solution is to have your file object's close() method
implementation silently ignore IOError exceptions (or log such events in some way).

Honestly I've never seen an exception raised by file().close() and I wouldn't know
how to handle this kind of event in DTPHandler.close() since there are other
"cleanup" operations involved in there that must be executed.

In r603 I've modified the code so that at least you shouldn't run into the
endless loop problem anymore, but, again, I would modify *your* code so that the
exception never gets raised at all.
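
As a sketch of that suggestion (class and attribute names are made up), the wrapper's close() could simply trap and log the error instead of letting it propagate into DTPHandler.close():

    import traceback

    class TolerantFile(object):
        """Wraps the real file object; close() logs IOError instead of raising."""

        def __init__(self, fileobj):
            self._file = fileobj

        def __getattr__(self, name):
            # everything except close() is delegated to the real file object
            return getattr(self._file, name)

        def close(self):
            try:
                self._file.close()
            except IOError:
                # log the event somewhere instead of propagating it
                traceback.print_exc()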

@giampaolo
Owner Author

From billiej...@gmail.com on May 09, 2009 10:38:08

PS - let me know if at least the endless loop problem is solved now.

@giampaolo
Owner Author

From arkadius...@gmail.com on May 09, 2009 11:02:50

Checked. The loop problem is solved. Thanks for fixing.

Regarding my code, yes, it catches the exception now and just calls
traceback.print_exc() in that case. That was a fix for the loop issue, but I'll
just leave it like that.

The problem is that the client "thinks" that everything went well and, even worse,
some clients don't do a LIST after an upload but instead just append the file to their
last listing (Vista's Windows Explorer does this, for example), so it appears to the
user that everything is OK when in fact it isn't :). Only refreshing the directory
listing reveals that the file isn't really there.

I was able to anticipate the most common error causes in the file's __init__(), so
it's better now; I can live with the rest.

BTW, I don't know if you remember me, I'm the developer of the googlepagesftpd tool
which uses your library in a similar way. It's on Google Code too and you're a
project member :). Unfortunately it's a dead project now as Google shut the "Pages"
service down. https://code.google.com/p/googlepagesftpd/

@giampaolo
Owner Author

From billiej...@gmail.com on May 09, 2009 11:37:19

What is the reason why that IOError exception is raised in the first place?
If the client has already received the 226 response (which happens in the
DTPHandler.handle_close() method), that means that all the data (sent or
received) has already been transmitted.

What happens next is that handle_close() calls close() which is the place where all
the cleanup operations like closing file objects and canceling scheduled timeouts
take place.

If closing the file object raises an exception, it's because something went wrong
with it (I don't know what exactly; as I said, I've never seen that happen with a
"real" filesystem file), but that doesn't mean the transfer itself went bad, and
maybe that's why your client lists the file: the server had already sent a
226 response.

Now, in order to avoid this situation, I'd focus on understanding why your "dummy"
file object behaves like that in the first place.
If the IOError exception is legitimate when the file has been opened for writing,
then I'd say it's OK to silently ignore the error without even logging what happened;
otherwise I would investigate why that happens and log the exception somewhere
until you find a better solution.

If you think my knowledge of pyftpdlib might come in handy, paste your code so
that I can actually see what you're trying to do, and I'll try to help you out if I can.

> BTW, I don't know if you remember me, I'm the developer of the googlepagesftpd
> tool which uses your library in a similar way. It's on Google Code too and
> you're a project member :). Unfortunately it's a dead project now as Google
> shut the "Pages" service down. https://code.google.com/p/googlepagesftpd/

Yes, I remember you. Sorry to hear your tool doesn't work anymore. It was a useful one!

@giampaolo
Owner Author

From arkadius...@gmail.com on May 09, 2009 12:05:48

Of course you're totally right. If the open() and all write()s haven't complained
then it means the file was created and the data written to it, mostly.

The reason why my file object may fail in close() is that close() is where I
really write all the data to the storage (without going into detail about what the
storage exactly is). And the reason for that is that I need to know the size of the
data before I can write anything.

I would do it differently, but it's a constraint of the storage. A file is created by
sending its size followed by the contents.

If the server knew beforehand how big the received file will be, then maybe
I could work around this somehow.

Or maybe if I could detect the last write() call, I could store the data there so a
possible exception would be caught by the server.

<off topic>
Interfacing pyftpdlib with a virtual file system for a second time made me think of
other possible uses of this. I did a quick search and found this:
https://code.google.com/p/pyfilesystem/
Haven't tried it yet, but it gives you a set of standard file operations for various
storage backends, memory and ZIP files being the most interesting ones I think.
Gluing it together with pyftpdlib should be relatively simple. Maybe you can give it a thought.
</off topic>

@giampaolo
Owner Author

From billiej...@gmail.com on May 09, 2009 12:09:34

Summary: An exception in DTPHandler.close() makes the server go into an endless loop.
Status: Finished
Labels: Version-0.5.1 Milestone-0.5.2

@giampaolo
Owner Author

From billiej...@gmail.com on May 09, 2009 12:52:33

Then why don't you override write() instead of close(), keep track of the amount of
written bytes, and raise IOError when necessary?

EnvironmentError (of which IOError is a subclass) is already handled in
DTPHandler.handle_error(): it's treated as a "semi-expected" error condition, and a
detailed 426 response describing what actually happened is automatically
returned to the client.
By raising IOError with, say, error code EFBIG, your client will also receive a
detailed "426 File too big" error response.
Pseudo code:

    import errno

    class YourFileObject(object):

        write_limit = 1024 * 1024  # 1 MB

        def __init__(self, file_instance):
            # the real underlying file object we delegate writes to
            self.file_instance = file_instance
            self.written_bytes = 0

        def write(self, chunk):
            self.written_bytes += len(chunk)
            if self.written_bytes > self.write_limit:
                # note: this will be the string returned to the client
                raise IOError(errno.EFBIG, "File too large")
            else:
                self.file_instance.write(chunk)


Doing the actual write operation in the close() method is definitely not a good idea IMO.
If the file is too big you might run into memory issues, and at the same time you
might temporarily block the server, which will be busy doing the write operation
*all in one shot* (remember: pyftpdlib is asynchronous).

I'll take a look at that pyfilesystem library, thanks.

@giampaolo
Owner Author

From arkadius...@gmail.com on May 09, 2009 13:35:23

OK, let me tell you more about where the files are actually stored. It is a GSM/GPS
modem connected to the PC over RS232. The modem has a built-in flash memory with a
simple file system. I can control it using AT commands; there are commands to list,
read, write, or delete files.

Partly for fun, but also to ease the management of these files, I decided to write an
FTP server using the modem as a backend.

To create a file, one has to execute an AT command specifying the filename *and the
file size*. The command is then immediately followed by the raw contents of the whole
file.

And that's where the problem comes from. Before I can issue this AT command, I need
to know exactly how big the file is. And that is before I can write any part of the
file data itself.

That's what forces me to buffer everything when write() is called and send the AT
command followed by the data in close().
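
To illustrate what I mean (the modem interface used here, create_file() and send_raw(), is purely hypothetical), the buffering file object looks roughly like this:

    from io import BytesIO

    class ModemFile(object):
        """Buffers everything in memory and writes it to the modem on close()."""

        def __init__(self, modem, filename):
            self._modem = modem
            self._filename = filename
            self._buffer = BytesIO()
            self.closed = False

        def write(self, chunk):
            # just accumulate; nothing is sent to the modem yet
            self._buffer.write(chunk)

        def close(self):
            if self.closed:
                return
            self.closed = True
            data = self._buffer.getvalue()
            # the AT command needs the final size up front, which is why the
            # real storage write can only happen here (and can fail here)
            self._modem.create_file(self._filename, len(data))
            self._modem.send_raw(data)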

Blocking is not really an issue because in 99.9% of the cases only one client will be
logged in at a time.

Buffering in memory is also not an issue because the files are small (there's 2MB of
flash memory on the modem).

One solution I'm thinking about now is to use the fact that I can query the modem for
free space on the flash. However, I wouldn't want to do this, as I don't know exactly
how the file system works, i.e. how much of the free space will be used for file system
overhead (file attributes, etc.) and how much for the actual data.

One question: how (and when) does the server know when to stop reading data from the
client?

@giampaolo
Owner Author

From billiej...@gmail.com on May 09, 2009 14:21:25

Ok, I can see the problem now.
This makes me think of the ALLO command (FTP RFC-959), which was designed for exactly
this kind of situation: at the time it was written (1985), many filesystems needed
to know the exact size of the file you were going to store. Since such filesystems
have nowadays almost completely disappeared, I'm not aware of any server supporting
the ALLO command natively.

pyftpdlib implements ALLO as a no-op, just responding with "202 No storage
allocation necessary." and doing nothing else.
If you have an FTP client which supports the ALLO command, you can modify the original
FTPHandler.ftp_ALLO(arg) implementation, store the arg value somewhere (I guess the
value is expressed in bytes) and pass that value to your filesystem class.

I'm not completely sure (I should take a look at RFC-959 first), but ALLO should
almost certainly be sent before STOR, like this:

<- ALLO 1024000
-> 220 ok
<- STOR my_file_of_1024000_bytes.ext
-> 150 ok

As far as I can tell the implementation would be quite easy, as long as you decide to
follow this road.
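
A rough, untested sketch of such an override, against the 0.5.x FTPHandler API (the ftp_ALLO(line) signature and the self.respond()/self.fs attributes are taken from that API; stashing the size on the filesystem object as expected_size is just an illustrative convention):

    from pyftpdlib import ftpserver

    class AlloFTPHandler(ftpserver.FTPHandler):

        def ftp_ALLO(self, line):
            # RFC-959 syntax is "ALLO <SP> <decimal-integer> ..."; remember the
            # announced size so the filesystem can use it when STOR arrives.
            try:
                size = int(line.split()[0])
            except (IndexError, ValueError):
                self.respond("501 Invalid number of bytes.")
                return
            self.fs.expected_size = size
            self.respond("200 Will allocate %d bytes." % size)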


> One question, how (and when?) does the server know 
> when to stop reading data from the client?

The server assumes that the incoming transfer is finished when the client
disconnects the data channel socket.
This state is detected by the DTPHandler when a call to socket.recv() returns an
empty string.
Everything happens in the handle_read() method of the DTPHandler class:


    def handle_read(self):
        """Called when there is data waiting to be read."""
        try:
            chunk = self.recv(self.ac_in_buffer_size)
        except socket.error:
            self.handle_error()
        else:
            self.tot_bytes_received += len(chunk)
            if not chunk:
                self.transfer_finished = True
                return

            self.file_obj.write(self.data_wrapper(chunk))

@giampaolo
Owner Author

From arkadius...@gmail.com on May 09, 2009 14:54:04

Indeed, the ALLO command seems to suit my needs exactly. From what I've found, it
should be sent right before STOR. However, none of the 3 clients I've tried seems to
send this command. Maybe there's some response to the STOR command indicating to
the client that it should use ALLO first, and they would retry in that case, but I
doubt it.

I guess I'll just have to leave it like that. It's not that important after all.

Anyway, thanks a lot for your help and suggestions. Sorry for bugging you the whole
evening :). It's been nice to talk to you again.

@giampaolo
Owner Author

From billiej...@gmail.com on August 29, 2009 10:34:03

Status: FixedInSVN

@giampaolo
Owner Author

From billiej...@gmail.com on September 13, 2009 13:56:15

Status: Fixed

@giampaolo
Owner Author

From billiej...@gmail.com on September 13, 2009 14:01:52

This is now fixed and included as part of version 0.5.2.
