slow .read on file blocks mainloop for too long #197

Closed
giampaolo opened this issue May 28, 2014 · 5 comments
Labels
bug Component-Library imported imported from old googlecode site and very likely outdated

Comments

@giampaolo
Owner

From irae.hue...@gmail.com on December 22, 2011 14:22:41

What steps will reproduce the problem?  
1. Create a custom AbstractedFS whose `open` method returns a file-like 
object whose `read` method takes a long time to return (for example a 
wrapped `httplib.HTTPResponse` object).

The problem should also be present with the default configuration when the 
files being served are located on a very slow file system (for example files 
mapped from another computer to the local file system).

What is the expected output?  


What do you see instead?  
File objects returned by `AbstractedFS.open` should be checked in some way 
to see whether a non-blocking `.read` is possible before `.read` is called.
Instead, `.read` is called even when it will block the entire FTP server.

Please use labels and text to provide additional information.  
I am serving HTTP content from another server over FTP by streaming it.
By doing so pyftpdlib becomes sluggish and unusable to the client.

This is not only a problem with `read`; `listdir`, `stat`, etc. may also block for too long.

Original issue: http://code.google.com/p/pyftpdlib/issues/detail?id=197

@giampaolo
Owner Author

From g.rodola on December 22, 2011 06:34:26

Yes, this is a well known issue.
Unfortunately there's no easy/generic fix for a number of reasons.
Internally we use asyncore which does not provide anything to do that.
Also, httplib.HTTPResponse is not supposed to be used in async environments.

I wouldn't even know what to recommend exactly as it's a problem which is hard 
to resolve and there's no easy or standard way to deal with it.
I already bumped into it, and I solved it by using a mix of multiple 
threads/processes and pyftpdlib.ftpserver.CallLater, but it's pretty hackish 
(in fact I don't think it's worth showing the code).

Maybe the quickest solution would consist in spawning a thread/process for 
every connected client and use a separate socket map ( 
http://hg.python.org/cpython/file/b36cb4602e21/Lib/asyncore.py#l66 ).
That way you would use multiple event loops in multiple threads/processes and 
any dispatcher subclass can then be free to block as long as it wants.
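A minimal sketch of the one-event-loop-per-client idea, not pyftpdlib code. It assumes a `selectors` selector per thread as a stand-in for a separate asyncore socket map; `serve_client`, `slow_backend`, `fast_backend` and `demo_per_client_loops` are hypothetical names used only for illustration.

```python
import selectors
import socket
import threading
import time


def serve_client(conn, backend):
    """Run a private event loop for one client in its own thread."""
    # Private registry for this thread only (stand-in for a separate socket map).
    sel = selectors.DefaultSelector()
    sel.register(conn, selectors.EVENT_READ)
    try:
        while True:
            if not sel.select(timeout=1.0):
                continue
            request = conn.recv(1024)
            if not request:
                break  # client closed the connection
            # backend() may block; only this client's loop is stalled.
            conn.sendall(backend(request))
    finally:
        sel.close()
        conn.close()


def slow_backend(request):
    time.sleep(0.5)  # stands in for a slow read on a remote resource
    return b"slow:" + request


def fast_backend(request):
    return b"fast:" + request


def demo_per_client_loops():
    clients = []
    for name, backend in (("slow", slow_backend), ("fast", fast_backend)):
        client_side, server_side = socket.socketpair()
        threading.Thread(
            target=serve_client, args=(server_side, backend), daemon=True
        ).start()
        clients.append((name, client_side))

    start = time.monotonic()
    for _, sock in clients:
        sock.sendall(b"ping")

    replies = {}
    # Read the fast client first: it is not held up by the slow one.
    for name, sock in reversed(clients):
        data = sock.recv(1024)
        replies[name] = (data, time.monotonic() - start)
        sock.close()
    return replies


if __name__ == "__main__":
    for name, (data, elapsed) in demo_per_client_loops().items():
        print(name, data, round(elapsed, 2))
```

The trade-off is one thread (or process) per connection, so the blocking call no longer stalls other clients, but the server gives up the single-threaded simplicity of a shared loop.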

It's something which must be developed from scratch though, tested, etc...
Maybe I can provide a proof of concept once I find some time.

@giampaolo
Owner Author

From irae.hue...@gmail.com on December 22, 2011 09:52:46

pyftpdlib is a really amazing and well-written FTP server with excellent 
customization possibilities. But this is a serious issue: any file system 
access should be considered blocking.

Can you show me your hackish workaround anyway? :-)

@giampaolo
Owner Author

From g.rodola on December 22, 2011 11:14:16

Well, there are actually two different problems here:

#1 - file read() / write(), which take place in the data channel
#2 - all other fs-related calls (listdir(), rename(), cwd(), mkdir(), etc...), 
which take place in the control channel

Files (#1) can somehow be integrated in the event loop without using multiple 
threads/processes but only if they provide a readable/writable() method; the 
idea is to call read()/write() only when the file is actually ready to be read 
or written. 
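A rough sketch of the readable() idea for case #1, under stated assumptions: `ReadyAwareFile` and `demo_ready_check` are hypothetical names, not part of pyftpdlib, and a socket pair stands in for the slow source. Readiness polling of this kind is meaningful for sockets and pipes; regular disk files generally always report as ready, so it would not help with a slow mounted file system.

```python
import selectors
import socket
import time


class ReadyAwareFile:
    """Hypothetical file-like wrapper exposing readable() next to read()."""

    def __init__(self, sock):
        self._sock = sock
        self._sel = selectors.DefaultSelector()
        self._sel.register(sock, selectors.EVENT_READ)

    def readable(self):
        # Zero timeout: ask the OS whether data is waiting, without blocking.
        return bool(self._sel.select(timeout=0))

    def read(self, size):
        return self._sock.recv(size)

    def close(self):
        self._sel.close()
        self._sock.close()


def demo_ready_check():
    producer, consumer = socket.socketpair()
    source = ReadyAwareFile(consumer)

    before = source.readable()  # nothing has been written yet
    producer.sendall(b"chunk")

    # The event loop would keep serving other clients while waiting here.
    deadline = time.monotonic() + 1.0
    while not source.readable() and time.monotonic() < deadline:
        time.sleep(0.01)

    after = source.readable()
    data = source.read(5) if after else b""  # read() only once data is ready

    source.close()
    producer.close()
    return before, after, data


if __name__ == "__main__":
    print(demo_ready_check())
```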

Other fs calls (#2) cannot be integrated as described above as they are 
blocking by nature (think about os.listdir()), therefore the only way to deal 
with them is to make the call in a separate process or thread.
There's a FAQ for this: 
https://code.google.com/p/pyftpdlib/wiki/FAQ#How_can_I_run_long-running_tasks_without_blocking_the_server?
This is the general idea.
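For case #2, a minimal sketch of handing a blocking call to a worker thread while the loop keeps running. This is not the FAQ's code; `slow_listdir` and `demo_offloaded_listdir` are hypothetical names, and the `time.sleep` calls only simulate a slow file system and an event loop iteration.

```python
import os
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor


def slow_listdir(path):
    time.sleep(0.2)  # stands in for a slow (e.g. network-mounted) file system
    return sorted(os.listdir(path))


def demo_offloaded_listdir():
    ticks = 0
    with tempfile.TemporaryDirectory() as root, ThreadPoolExecutor(max_workers=4) as pool:
        for name in ("a.txt", "b.txt"):
            open(os.path.join(root, name), "w").close()

        # Hand the blocking call to a worker thread instead of running it inline.
        future = pool.submit(slow_listdir, root)

        # Simulated event loop: it keeps iterating while the listing is pending.
        while not future.done():
            ticks += 1  # other clients would be serviced here
            time.sleep(0.01)

        return future.result(), ticks


if __name__ == "__main__":
    listing, ticks = demo_offloaded_listdir()
    print(listing, "loop iterations while waiting:", ticks)
```

In a real server the result would have to be handed back to the loop thread before touching any connection state, which is where the thread-safety concerns come in.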
The two problems are very different and finding a general and clean solution is 
far from easy.
There are frameworks out there, such as Twisted, which provide some facilities 
to deal with threads/processes within the async loop, but they do not guarantee 
thread safety, which IMO suggests how hard this subject is: 
http://twistedmatrix.com/documents/current/core/howto/threading.html

As for your specific problem, httplib.HTTPResponse is simply not designed to 
work with async apps/libs and cannot be integrated with asyncore.
You would have the exact same problems in other environments (twisted, tornado, 
etc...), where you would use *their* non-blocking HTTP clients (asyncore does 
not have one).
I still think the quickest solution is to use different threads/processes per 
event-loop.
I'll try to write down some code, but I cannot tell when exactly.


> can you show me your hackish workaround anyway

For what it's worth, it's in attachment.

Attachment: hackish.py

@giampaolo
Owner Author

From g.rodola on December 30, 2011 09:03:48

Update - you might want to take a look at this asyncore HTTP client, kindly 
provided by Josiah Carlson: https://gist.github.com/1519999

Looking back at this, I think I'm going to close this issue after all, as I 
think it's not something which can or even should be dealt with by pyftpdlib. 
I mean, it's an async lib, and as such it should be used in a certain way. 
Integrating it with blocking libs such as httplib is not the way it is meant to be used. 

If you want to discuss further, please feel free to post on the mailing list.

Status: Rejected

@giampaolo
Owner Author

From g.rodola on August 02, 2012 12:56:55

Merging this one into issue 212. I have some interesting news about it.

Status: Duplicate
Mergedinto: 212
