Issue 5298: remote_api does not do gzip compression for transmitted data
Status:  Fixed
Owner:  ----
Closed:  Sep 2012


Reported by angad.si...@oxylabs.com, Jul 7, 2011
I have been using remote_api to build a Python-based ETL tool that bulk-imports datastore entities into a remote MySQL server. Unfortunately, the responses to the HTTP POST calls that remote_api makes to App Engine to fetch the data are not gzipped, and there is no way to request that.

I patched the App Engine SDK code and set both "Accept-Encoding" and "User-Agent" to "gzip" at line 356 of appengine_rpc.py. As suggested at https://code.google.com/appengine/kb/general.html#compression, that should have done the trick:

        req.add_header('Accept-Encoding', 'gzip')
        req.add_header('User-Agent', 'gzip')
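
For anyone who wants to try the same headers outside the SDK, here is a minimal sketch of the idea. It uses Python 3's urllib.request as a stand-in for urllib2, and the helper name build_gzip_request and the example.appspot.com URL are hypothetical, not part of appengine_rpc.py. It only builds the request and sends nothing:

```python
import urllib.request


def build_gzip_request(url, payload):
    """Build a POST request that advertises gzip support (hypothetical helper)."""
    # Passing data makes urllib treat the request as a POST.
    req = urllib.request.Request(url, data=payload)
    req.add_header('Content-Type', 'text/plain')
    # Same two headers as in the patch described in this report.
    req.add_header('Accept-Encoding', 'gzip')
    req.add_header('User-Agent', 'gzip')
    return req


# Placeholder URL and payload; nothing is sent over the network here.
req = build_gzip_request('https://example.appspot.com/_ah/remote_api', b'payload')
print(req.get_method(), req.header_items())
```

Note that urllib normalizes header names when storing them, so they print as 'Accept-encoding' and 'User-agent'.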

I also added an HTTPHandler with debuglevel=2 to the urllib2 OpenerDirector object at line 453 of appengine_rpc.py so that I could see the request/response headers:

        opener.add_handler(urllib2.HTTPHandler(debuglevel=2))

Here's the output:
send: 'POST /_ah/remote_api HTTP/1.1\r\nX-Appcfg-Api-Version: 1\r\nContent-Length: 198\r\nAccept-Encoding: gzip\r\nHost: *******.appspot.com\r\nUser-Agent: gzip\r\nConnection: close\r\nCookie:******\r\nContent-Type: text/plain\r\n\r\n'
send: '\x12\x0cdatastore_v3\x1a\x08RunQuery"\xab\x01\n\x0boxanalytics\x1a\ndbUserInfo#0\x03r2\x1a\x0eimport_counter \x00*\x1e\x1a\x1c2011-07-05 06:25:33.491890_0$#0\x01r5\x1a\x0eimport_counter \x00*!\x1a\x1f2011-07-05 06:25:33.491890_4996$KR\x0eimport_counterX\x01L\x80\x01\xe8\x07\xb8\x01\xe8\x07\xc8\x01\x01'
reply: 'HTTP/1.1 200 OK\r\n'
header: Cache-Control: no-cache
header: Content-Type: application/octet-stream
header: Expires: Fri, 01 Jan 1990 00:00:00 GMT
header: X-AppEngine-Estimated-CPM-US-Dollars: $0.355296
header: X-AppEngine-Resource-Usage: ms=665 cpu_ms=12219 api_cpu_ms=11015
header: Date: Thu, 07 Jul 2011 11:19:14 GMT
header: Server: Google Frontend
header: Content-Length: 647200
header: Connection: close
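
Note that the reply carries no Content-Encoding header, so the body came back uncompressed. A client that asks for gzip would also have to handle both cases on the way back. The following is a rough sketch of that check, assuming the headers are available as a plain dict; decode_body is a hypothetical helper, not something in the SDK:

```python
import gzip


def decode_body(headers, body):
    """Return the payload, gunzipping it only if the server marked it as gzip."""
    encoding = headers.get('Content-Encoding', '').lower()
    if encoding == 'gzip':
        return gzip.decompress(body)
    # No Content-Encoding header, as in the trace above: pass the bytes through.
    return body


# Headers like the ones in the trace: the body is returned untouched.
plain = decode_body({'Content-Type': 'application/octet-stream'}, b'raw bytes')

# What a compressed reply would look like if the server honored the request.
packed = decode_body({'Content-Encoding': 'gzip'}, gzip.compress(b'raw bytes'))
```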

The project I'm working on needs to fetch entities remotely from the datastore at a very high rate and cannot afford to do it without compression.

As a stark comparison, fetching 1000 entities through a regular App Engine handler (outputting the entities as XML using Model.to_xml()) takes 1.46 seconds and downloads only 38 KB (compressed!) of data, whereas the remote API takes 6.50205 seconds and downloads a whopping 1.6 MB (uncompressed!) of data, with the same query in both cases.
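
To get a feel for how much gzip can save on repetitive serialized data, a quick measurement along these lines may help. This is only an illustration: the sample payload is made up and is not the actual entity data from the numbers above, and gzip_savings is a hypothetical helper.

```python
import gzip


def gzip_savings(payload):
    """Return (raw size, gzipped size, ratio) for a bytes payload."""
    raw = len(payload)
    packed = len(gzip.compress(payload))
    return raw, packed, raw / packed


# Synthetic, highly repetitive XML-like data; real entities will compress less well.
sample = b'<entity kind="dbUserInfo"><property name="import_counter">1</property></entity>' * 1000
raw, packed, ratio = gzip_savings(sample)
print(raw, packed, round(ratio, 1))
```

Real query results are less uniform than this sample, so the ratio it prints is an upper-end illustration, not a prediction for remote_api traffic.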

Having gzip really makes a huge difference. The remote_api was meant to let developers access the datastore remotely. I'm surprised that it doesn't support gzip, especially since the bulkloader is built on it!

It would be great to see gzip support added to remote_api calls!
Aug 15, 2011
#1 guido@google.com
Nick, how hard would it be to add this to remote_api?
Status: Acknowledged
Owner: nickjohn...@google.com
Labels: -Type-Defect Type-Feature
Aug 18, 2011
Project Member #2 jfmontes...@google.com
(No comment was entered for this change.)
Labels: log-5171816
Aug 26, 2011
Project Member #3 pro...@google.com
Bulk edit: mark escalated issue as Accepted.
Status: Accepted
Sep 13, 2011
#4 jon...@google.com
(No comment was entered for this change.)
Labels: Component-RemoteApi
Sep 23, 2012
Project Member #6 tmat...@google.com
Should be fixed in 1.7.2.
Status: Fixed
Owner: ---