HTTP responses with no body cause other responses to be consumed #77

Closed
kbandla opened this issue Jun 4, 2015 · 4 comments

kbandla commented Jun 4, 2015

From andrewf...@gmail.com on September 27, 2010 18:40:27

Put more clearly, when an HTTP response, say, 304 Not Modified, has no body but still has a content-type header, all data after that in the stream is consumed.

What steps will reproduce the problem?

1. Unpack the attached zip file.
2. Run dpkt_bug.py, which attempts to construct dpkt.http.Response objects from the data in the included file stream.txt.

This program prints the number of responses parsed. There are two responses in the file, but only one is detected, with the other response as its body. You can see this if you print the responses instead of just the length of the list.

This test was run on Windows Vista with dpkt 1.7.

Attachment: dpkt_http_bug.zip

Original issue: http://code.google.com/p/dpkt/issues/detail?id=50

kbandla commented Jun 4, 2015

From ls...@google.com on September 28, 2010 07:32:20

Please note that 204 No Content and others have no body either.

http://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.3

"For response messages, whether or not a message-body is included with a message is dependent on both the request method and the response status code (section 6.1.1). All responses to the HEAD request method MUST NOT include a message-body, even though the presence of entity-header fields might lead one to believe they do. All 1xx (informational), 204 (no content), and 304 (not modified) responses MUST NOT include a message-body. All other responses do include a message-body, although it MAY be of zero length."
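
The quoted rule can be written down as a small predicate. This is only a sketch of the RFC text, not code from dpkt; the function name `has_message_body` and its `request_method` parameter are made up for illustration.

```python
def has_message_body(status, request_method="GET"):
    """Return True if a response may carry a message-body (RFC 2616, section 4.3).

    Hypothetical helper: the name and signature are assumptions, not dpkt API.
    """
    # Responses to HEAD never include a body, whatever the headers say.
    if request_method.upper() == "HEAD":
        return False
    # All 1xx (informational) responses have no body.
    if 100 <= status < 200:
        return False
    # 204 (no content) and 304 (not modified) have no body either.
    return status not in (204, 304)
```

Note that headers such as Content-Type play no part in the decision; only the status code and the request method do.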

kbandla commented Jun 4, 2015

From andrewf...@gmail.com on September 28, 2010 10:45:55

Ok. The current issue is caused by dpkt reading the rest of the data into the body whenever the response has a content-type header, which is wrong according to the RFC. A simple fix is to remove the two lines that do this.

Of course, this doesn't solve the larger issue. Making the response intelligently decide whether to parse a body is tricky with dpkt's architecture, because the parsing of the body and headers is done in the http.Message superclass.
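
One possible way around the superclass problem is a hook method that the base class calls before reading a body and that the response subclass overrides. The sketch below is self-contained and does not reproduce dpkt's actual classes; `Message`, `Response`, `expects_body` and the `head_response` flag are hypothetical stand-ins, and bodies are delimited by Content-Length only.

```python
class Message:
    """Hypothetical base class: parses headers, then asks a hook about the body."""

    def __init__(self, data: bytes):
        head, _, rest = data.partition(b"\r\n\r\n")
        lines = head.split(b"\r\n")
        self.start_line = lines[0].decode("ascii")
        self.headers = {}
        for line in lines[1:]:
            name, _, value = line.partition(b":")
            key = name.strip().lower().decode("ascii")
            self.headers[key] = value.strip().decode("ascii")
        # Only read a body when the subclass says one can exist.
        if self.expects_body():
            length = int(self.headers.get("content-length", 0))
            self.body = rest[:length]
        else:
            self.body = b""  # consistently empty, never None

    def expects_body(self) -> bool:
        # Default for generic messages; subclasses may override.
        return True


class Response(Message):
    def __init__(self, data: bytes, head_response: bool = False):
        # Set the flag first so the hook can see it during parsing.
        self.head_response = head_response
        super().__init__(data)

    def expects_body(self) -> bool:
        status = int(self.start_line.split()[1])
        if self.head_response or 100 <= status < 200:
            return False
        return status not in (204, 304)
```

With this shape the header and body parsing stays in the base class, and only the decision about whether a body exists moves into the subclass.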

kbandla commented Jun 4, 2015

From matthaeu...@gmail.com on January 10, 2014 09:11:33

Attached is a patch to fix the issue:

  1. Ignore content-type, as it does not say anything about the existence of a body.
  2. Do not attempt to read body if response status code is 1xx, 204 or 304.
  3. Here comes the ugly part: ignore the body if the parameter head_response=True has been passed to the Response object. HEAD responses are identical to GET responses (including Content-Length), except for the missing body. dpkt cannot determine automatically whether this is a HEAD response.

Also attached modified dpkt_http_bug2.zip:

  • dpkt_bug.py: call dpkt.http.Response(data, head_response=True)
  • stream.txt: file ends with '\r\n\r\n' (instead of '\r\n')

Attachment: http_fix_50.patch dpkt_http_bug2.zip
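
For readers without the attachments, the three rules in the patch description can be illustrated with a standalone splitter that walks a byte stream and returns one entry per response. This is not the patch itself and does not use dpkt; `split_responses` is a hypothetical function, the sample stream is invented, and bodies are delimited by Content-Length only (no chunked encoding).

```python
def split_responses(stream: bytes, head_response: bool = False):
    """Split a byte stream into (status, headers, body) tuples.

    Simplified sketch: Content-Type is ignored, and a body is read
    only when the status code and head_response flag allow one.
    """
    responses = []
    while stream:
        head, sep, rest = stream.partition(b"\r\n\r\n")
        if not sep:
            break  # incomplete header block, stop parsing
        lines = head.split(b"\r\n")
        status = int(lines[0].split()[1])
        headers = {}
        for line in lines[1:]:
            name, _, value = line.partition(b":")
            headers[name.strip().lower()] = value.strip()
        # Rules 2 and 3: no body for 1xx, 204, 304, or HEAD responses.
        no_body = head_response or 100 <= status < 200 or status in (204, 304)
        length = 0 if no_body else int(headers.get(b"content-length", 0))
        responses.append((status, headers, rest[:length]))
        # Continue with whatever follows this response.
        stream = rest[length:]
    return responses


# Invented sample: a bodiless 304 with Content-Type, then a 200 with a body.
stream = (
    b"HTTP/1.1 304 Not Modified\r\nContent-Type: text/html\r\n\r\n"
    b"HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\nhello"
)
print(len(split_responses(stream)))  # prints 2
```

Because the 304 is given an empty body, the second response is left in the stream and parsed on the next pass instead of being swallowed.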

kbandla commented Jun 4, 2015

From matthaeu...@gmail.com on January 12, 2014 04:32:40

Slight improvement: set body consistently to '' (empty string), avoid None.

Attachment: http_fix_50b.patch
