Issue 56: Temporary deadlock on daemon socket connection.
Status:  Accepted
Owner:  Graham.Dumpleton
Type-Defect
Priority-Medium


Reported by Graham.Dumpleton, Feb 18, 2008
When using mod_wsgi daemon mode, a deadlock can occur if all of the following hold: a POST 
request is received whose content is larger than the UNIX socket buffer size for the host being 
used, the application doesn't consume the request content before sending a response, and the 
response headers plus response content are themselves larger than the UNIX socket buffer size.

This is because the Apache child process side is still trying to write the request content and is 
therefore not in a position to read the response headers and response content and unblock the 
daemon side trying to write the response.
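
To make the blocking behaviour above concrete, here is a minimal sketch in Python. It is not mod_wsgi code; it only shows that a writer on a UNIX socket stalls once the kernel buffer is full and the peer is not reading. The helper name bytes_until_full is made up for this illustration, and the number reported will vary by platform.

```python
import socket


def bytes_until_full(chunk_size=4096):
    """Write into one end of a UNIX socket pair that nobody reads from.

    Returns how many bytes the kernel accepted before a write would block.
    (Hypothetical helper, for illustration only.)
    """
    writer, reader = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    writer.setblocking(False)  # raise BlockingIOError instead of hanging
    sent = 0
    try:
        while True:
            sent += writer.send(b"x" * chunk_size)
    except BlockingIOError:
        # Buffer is full. A blocking writer would now wait here until the
        # peer reads, which is the state each side of the deadlock is in.
        pass
    finally:
        writer.close()
        reader.close()
    return sent


if __name__ == "__main__":
    print("bytes accepted before the writer would block:", bytes_until_full())
```

When both ends are in this state at once, with each waiting for the other to read, neither can make progress until something times out.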

For the typical well behaved application this doesn't present a problem. The sort of use case 
where it would occur is one that streams the request content back as the response content, 
possibly with modifications. The only situation where a problem has actually been seen in 
practice is where SPAM bots are performing POST requests with large amounts of content against 
arbitrary URLs. Thus, it is the SPAM bots that are inadvertently triggering the problem, not 
normal application usage.

The actual deadlock is not permanent and will end when the timeout specified by the Apache 
Timeout directive has elapsed, typically 300 seconds.

Note that the severity of the problem is dictated by the size of the UNIX socket buffers. On some 
UNIX systems this is as low as 8KB (MacOS X). On Linux systems it is a lot higher. In mod_wsgi 
2.0c5 an option will exist for WSGIDaemonProcess to allow the send and recv buffer sizes for the 
UNIX socket to be increased, so where a system has a low default value it can be raised to lessen 
the risk of the problem occurring.
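
As a rough illustration of what raising the send and recv buffer sizes means at the socket level, the following Python sketch requests larger buffers on a UNIX socket and reports what the kernel actually granted. It uses the standard SO_SNDBUF and SO_RCVBUF socket options; the helper name resize_buffers and the 65536 figure are assumptions for the example, not what mod_wsgi does internally.

```python
import socket


def resize_buffers(sock, size):
    """Request larger send/receive buffers and return what the kernel granted.

    (Hypothetical helper. The kernel may clamp or round the requested value,
    so the result is read back rather than assumed.)
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, size)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, size)
    return (
        sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF),
        sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF),
    )


if __name__ == "__main__":
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        default = a.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
        print("default send buffer:", default)
        print("after requesting 65536:", resize_buffers(a, 65536))
    finally:
        a.close()
        b.close()
```

Larger buffers only move the threshold at which the deadlock can occur; they do not remove it.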

Totally eliminating this issue means reengineering the protocol used between the Apache child 
process and the daemon process as a packet based mechanism, combined with flow control so that 
the daemon can indicate when it is willing to accept more request content. How this may be done 
has been discussed extensively on the mod_wsgi mailing list.

BTW, this deadlock issue also exists with mod_cgi, mod_cgid and mod_scgi, as these all use a 
similar system based around a UNIX socket or interprocess pipe. Conceptually a deadlock 
situation may be triggered with mod_proxy as well, but in that case INET sockets are used and 
the buffer sizes on these are much larger. Still need to determine what the critical sizes are 
for mod_proxy to deadlock, if it can suffer this problem at all. It is possible that for 
mod_proxy some other magic happens in Apache to prevent it, but haven't been able to determine 
what that is as yet, if it does exist.
Comment 1 by Graham.Dumpleton, Feb 18, 2008
(No comment was entered for this change.)
Labels: Milestone-Release3.0
Comment 2 by br...@briansmith.org, Feb 20, 2008
The request doesn't have to be a POST; any method (including GET) will cause problems
if there is a (large) request body.

Also, this problem isn't limited to applications that are not "well behaved." In fact,
the more well-behaved the application is (checking request headers for validity), the
more likely it is that it will run into the problem. In fact, any request that would
benefit from the "100-continue" optimization would run into this problem if the
request body is big enough. My application runs into this problem already, and it
will only get worse as I add more features that require me to validate requests based
on the headers.

- Brian
Comment 3 by Graham.Dumpleton, Apr 10, 2008
Now not likely to be addressed in version 3.0.
Labels: -Milestone-Release3.0
Comment 4 by bobince, Nov 05, 2008
Note that this is not specific to mod_wsgi. WSGI applications running under IIS+CGI
face the same issue. (See 'buffer bug' at
http://www.doxdesk.com/updates/2006.html#u20060416-cgi .)

WSGI apps should always make sure to read the input stream (via cgi.FieldStorage() or
whatever other method their framework supplies), even if they are only intended to be
called through GET methods.
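
A minimal sketch of that advice in plain WSGI, without cgi.FieldStorage(): read and discard whatever body the client declared before sending the response. The names drain_input and application are made up for this example. It assumes the body length is given by CONTENT_LENGTH and does not handle chunked request bodies.

```python
import io


def drain_input(environ, chunk_size=8192):
    """Read and discard the declared request body; return bytes discarded.

    (Hypothetical helper. Relies on CONTENT_LENGTH, so chunked bodies
    are not covered.)
    """
    try:
        remaining = int(environ.get("CONTENT_LENGTH") or 0)
    except ValueError:
        remaining = 0
    stream = environ["wsgi.input"]
    discarded = 0
    while remaining > 0:
        chunk = stream.read(min(chunk_size, remaining))
        if not chunk:
            break  # client sent less than it declared
        discarded += len(chunk)
        remaining -= len(chunk)
    return discarded


def application(environ, start_response):
    # Example app that has no use for the body but still consumes it first.
    drain_input(environ)
    body = b"ok\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


if __name__ == "__main__":
    fake_environ = {
        "CONTENT_LENGTH": "20000",
        "wsgi.input": io.BytesIO(b"a" * 20000),
    }
    print("discarded", drain_input(fake_environ), "bytes")
```

Reading in bounded chunks keeps memory use flat, though a real application would probably also want an upper limit on how much it is prepared to discard.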

Comment 5 by Graham.Dumpleton, Nov 05, 2008
Interesting link about IIS+CGI. Thanks.

As to WSGI applications ensuring they read all input, they nearly always don't in the case where they weren't 
expecting to do anything with the input. Can't see that that is going to change.
Comment 6 by ionel.mc, Nov 26, 2008
Isn't this actually a denial of service issue, since you'll have deadlocked workers
for as long as your Timeout is?
Comment 7 by Graham.Dumpleton, Nov 26, 2008
It can be if the WSGI application doesn't consume request content and generates large responses at the same time. Most UNIX systems have 
large UNIX socket buffer sizes though, and so it would have to be a reasonably big response. If WSGI applications are correctly rejecting 
POST requests against URLs which aren't expecting them and returning error response pages as they should, then it wouldn't generally be 
an issue.

This same issue crops up in mod_cgi, mod_cgid and mod_scgi. From analysis of the code, technically it looks like it may also occur for some 
mod_proxy and mod_fastcgi configurations, but since INET sockets are used and buffers are usually somewhat larger, it would take 
even larger amounts of data in each direction.

Recent discussion at:

  http://groups.google.com/group/python-web-sig/browse_frm/thread/fdd318a722383792

talked a bit about this.

Hosted by Google Code