Add support for accessing the body of requests (useful for HTTP POST requests) in the onResourceRequested callback.
Which version of PhantomJS are you using? 1.2
Comment #1
Posted on Jul 4, 2011 by Quick Rabbit
I'm not sure how we would encode the content. Typed array comes to mind, but that's not widely supported yet.
Comment #2
Posted on Jul 16, 2011 by Quick Rabbit
(No comment was entered for this change.)
Comment #3
Posted on Jul 16, 2011 by Grumpy Elephant
It seems the body would be part of the "text" property in the content object of the HAR response object: http://www.softwareishard.com/blog/har-12-spec/#content
Comment #4
Posted on Jul 21, 2011 by Grumpy Elephant
After doing some more research, wouldn't it be feasible to add the content of the response to a text property in the data emitted when listening to NetworkAccessManager::handleFinished, by performing a readAll() on the reply argument and then converting that byte array to a QString?
Currently, the signal comes from: http://doc.qt.nokia.com/4.6/qnetworkaccessmanager.html#finished
The reply object is: http://doc.qt.nokia.com/4.6/qnetworkreply.html#finished
On it we could call readAll: http://doc.qt.nokia.com/4.6/qiodevice.html#readAll
Just wanted to check if I am on the right path here.
Comment #5
Posted on Jul 22, 2011 by Quick Rabbit
I'm not sure readAll is wise there. Wouldn't doing that leave WebKit with no data at all?
Comment #6
Posted on Jul 23, 2011 by Grumpy Elephant
Are you saying that if you call readAll on the response object, it won't leave that data available for WebKit to read?
Comment #7
Posted on Jul 25, 2011 by Quick Rabbit
Yes, I am quite sure calling readAll will eat all the received data and leave nothing.
Comment #8
Posted on Aug 4, 2011 by Grumpy Elephant
Off the top of your head, would there then be a way to read a copy of the response data? If you can point me in the right direction, I can start experimenting with it.
Comment #9
Posted on Aug 8, 2011 by Grumpy Elephant
I did a little more research and found out that it might be possible to write a proxy for the QNAM and the NetworkReply. This way you could capture the data received and still have it accessible after the read has finished. An example I found pointed me to: http://gitorious.org/qtwebkit/performance/blobs/master/host-tools/mirror/main.cpp
Am I on the right track here?
Thanks!
Comment #10
Posted on Aug 8, 2011 by Quick Rabbit
Yes, proxying both the network access manager and network reply is the right approach.
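The two points made so far (a plain readAll consumes the data, and a proxy can keep a copy while passing it through) can be sketched in a few lines. This is only an illustration in plain JavaScript, not the Qt API: `FakeReply` and `ReplyProxy` are hypothetical stand-ins, and the real work would happen in C++ subclasses of the network access manager and network reply.

```javascript
// Stand-in for a network reply whose buffered data can be read only once.
// NOT the Qt API; it only mimics the "readAll drains the buffer" behaviour.
function FakeReply(chunks) {
    this.buffer = chunks.join('');
}

FakeReply.prototype.readAll = function () {
    var data = this.buffer;
    this.buffer = '';           // a second reader now gets nothing
    return data;
};

// Hypothetical proxy: forwards readAll to the wrapped reply and keeps a copy.
function ReplyProxy(reply) {
    this.reply = reply;
    this.captured = '';
}

ReplyProxy.prototype.readAll = function () {
    var data = this.reply.readAll();
    this.captured += data;      // copy kept for a later resource callback
    return data;                // the original consumer still receives everything
};
```

In this sketch the consumer calls `readAll` on the proxy instead of the reply, so it still gets the full body, while `captured` holds the same bytes for reporting afterwards.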
Comment #11
Posted on Aug 8, 2011 by Grumpy Elephant
I'm on my way to figuring this out but have run into a snag. Keep in mind I'm not that familiar with C++ but after adding in the reply proxy, I get a compile error of:
Undefined symbols:
  "vtable for NetworkReplyProxy", referenced from:
      NetworkReplyProxy::NetworkReplyProxy(QObject*, QNetworkReply*) in networkaccessmanager.o
Here is my latest gist of the NetworkAccessManager: https://gist.github.com/1131280
Any help would be greatly appreciated!
Comment #12
Posted on Aug 14, 2011 by Grumpy Elephant
Pull request: https://github.com/ariya/phantomjs/pull/124
We'll definitely need to go through a code review on this one :)
Comment #13
Posted on Aug 21, 2011 by Quick Rabbit
See https://github.com/ariya/phantomjs/commit/578aa6c2.
We need an option to enable and disable (by default) this data gathering (since it introduces some performance penalty).
Comment #14
Posted on Sep 15, 2011 by Quick Rabbit
Hmm, some ownership issue and/or race condition in the proxy might provoke a crash.
Comment #15
Posted on Sep 15, 2011 by Grumpy Elephant
Ariya, I'm sorry, but I'm not following. Can you elaborate a little more?
Also, when I get some time, I can add in a patch to disable the use of the proxy.
Comment #16
Posted on Sep 15, 2011 by Quick Rabbit
I fixed the potential crash in https://github.com/ariya/phantomjs/commit/37404ba1.
Comment #17
Posted on Sep 15, 2011 by Grumpy Elephant
Awesome!
Comment #18
Posted on Sep 15, 2011 by Quick Rabbit
Closing this one as it is implemented already.
For the API/settings improvement, head to issue 236.
Comment #19
Posted on Sep 19, 2011 by Quick Rabbit
Reopened. Likely needs to be rescheduled soon.
It causes a regression, see issue 238.
Comment #20
Posted on Sep 19, 2011 by Quick Rabbit
(No comment was entered for this change.)
Comment #21
Posted on Sep 19, 2011 by Quick Rabbit
I can't find a quick way to resolve this. Probably I will revert it for the time being.
Comment #22
Posted on Sep 21, 2011 by Quick Rabbit
Unfortunately I have to revert this, see https://github.com/ariya/phantomjs/commit/eb255817
We need to have a better working solution for 1.4.
Comment #23
Posted on Oct 4, 2011 by Grumpy Elephant
Ariya, do you have any other information on what needs to be added to the proxy so we can get this back in? I'd like to be able to help and get this resolved.
Comment #24
Posted on Oct 4, 2011 by Quick Rabbit
I have not investigated further. Basically the fix should not regress the netlog and netsniff examples (easy to test). See also issue 238.
Comment #25
Posted on Oct 4, 2011 by Grumpy Elephant
Sounds good. I will try to look into this and see if I can find a solution. I will also look into adding the flag to disable the proxy for performance reasons.
Comment #26
Posted on Dec 19, 2011 by Quick Rabbit
Not enough time to resolve for 1.4. Postponed to 1.5.
Comment #27
Posted on Mar 9, 2012 by Quick Rabbit
No activity, reschedule.
Comment #28
Posted on Mar 13, 2012 by Quick Rabbit
Issue 422 has been merged into this issue.
Comment #29
Posted on Apr 14, 2012 by Quick Rabbit
Issue 501 has been merged into this issue.
Comment #30
Posted on May 2, 2012 by Grumpy Rabbit
I think the embedded phantomjs server can act as a proxy which intercepts and saves the content of the downloaded files and the POST data being sent. The only problem is its inability to parse queries like "GET http://www.google.com/ HTTP/1.1". The server responds with "Error 400: Bad Request, Cannot parse HTTP request: [GET]". It seems that fixing the server would be an easier and faster solution for people wanting to access the bodies of requests and responses.
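For context, a request line such as "GET http://www.google.com/ HTTP/1.1" uses the absolute form that HTTP/1.1 clients send to proxies, where the target is a full URL rather than just a path. The sketch below shows one way such a line could be split; `parseRequestLine` is a hypothetical helper written for illustration, not code from the embedded server, and it says nothing about why that server rejects the request.

```javascript
// Hypothetical helper: split an HTTP/1.1 request line into its parts.
// Handles both origin form ("/path") and absolute form ("http://host/path").
function parseRequestLine(line) {
    var parts = line.split(' ');
    if (parts.length !== 3) {
        return null;            // malformed request line
    }
    var result = {
        method: parts[0],
        target: parts[1],
        version: parts[2],
        host: null,             // only known from the line itself in absolute form
        path: parts[1]
    };
    // Absolute form, as sent to proxies: scheme://host[:port][/path]
    var match = /^[a-z][a-z0-9+.-]*:\/\/([^\/]+)(\/.*)?$/i.exec(parts[1]);
    if (match) {
        result.host = match[1];
        result.path = match[2] || '/';
    }
    return result;
}
```

With the request line quoted above, the helper would report `www.google.com` as the host and `/` as the path; for an ordinary origin-form line the host stays `null` and would have to come from the Host header.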
Comment #31
Posted on May 2, 2012 by Grumpy Elephant
I do have a commit that adds the ability to read the responses @ https://github.com/jgornick/phantomjs/commits/reply-proxy-issue-158. However there are 2 other issues I know of that need to be fixed in order to get this accepted:
- Issue #238 - REGRESSION: netsniff.js does not produce HAR dump
- Issue #236 - Settings to control network capture
I'm pretty slammed right now so if someone else is interested in helping out, I'd appreciate it :)
Comment #32
Posted on May 8, 2012 by Massive Dog
I made a simple solution for that: https://github.com/ariya/phantomjs/pull/246 In my fork, page.onResourceRequested accepts a hash with postData if it's a POST request.
Why do you need a proxy for that? I think it's quite simple.
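A handler along the lines described for that fork might look like the sketch below. The `postData` field name is taken from the comment above and should be treated as an assumption about the fork, not as a documented PhantomJS API; `page` here is a plain stand-in object and the two calls at the end simulate notifications so the sketch runs outside PhantomJS.

```javascript
// Stand-in for a PhantomJS page object so the sketch runs on its own.
var page = {};
var postBodies = [];

// `postData` is the field described for the fork; it is an assumption here.
page.onResourceRequested = function (request) {
    if (request.method === 'POST' && typeof request.postData === 'string') {
        postBodies.push({ url: request.url, body: request.postData });
    }
};

// Simulated notifications, shaped like the request objects passed to the callback.
page.onResourceRequested({ id: 1, method: 'GET', url: 'http://example.com/' });
page.onResourceRequested({
    id: 2,
    method: 'POST',
    url: 'http://example.com/form',
    postData: 'a=1&b=2'
});
```

Note that this only covers request bodies; reading response bodies is the part of this issue that led to the proxy discussion.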
Comment #33
Posted on May 12, 2012 by Grumpy Rabbit
jgornik, there is one more issue with the NetworkProxy approach: synchronous XMLHttpRequest doesn't work (asynchronous still does).
testcase: https://gist.github.com/2664974
Comment #34
Posted on May 14, 2012 by Happy Horse
I would want the resource response text to be not just readable but writable as well (so I can modify a particular resource's response body for instrumentation purposes). See http://code.google.com/p/phantomjs/issues/detail?id=539 for more info.
Comment #35
Posted on Feb 5, 2013 by Happy Horse
For what it's worth, I have a workaround that I've been using. I followed phantomjs's code into the Qt networking core and found where the cache is being saved and how the names are hashed. I replicated that using phantomjs and then pulled the content from Qt's cache file. I only tested this with images since that's all I need it for. There will definitely be cases where text documents don't work with it, but here is my code: https://gist.github.com/bshamric/4717583
Comment #36
Posted on Feb 7, 2013 by Happy Horse
Interesting idea, thanks for sharing. Unfortunately this approach also breaks down when the responding server indicates (via HTTP response headers) that the requested resources cannot be cached.
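A script relying on the cache workaround could at least guess in advance which responses are unlikely to have a cache file. The sketch below is a deliberately conservative approximation: `mayBeCached` is a hypothetical helper, it assumes headers arrive as an array of `{ name, value }` pairs, and it ignores most of the real HTTP caching rules (Expires, Vary, status codes and so on).

```javascript
// Hypothetical helper: returns false when the response headers suggest
// the body will not be available from the cache.
// `headers` is assumed to be an array of { name: ..., value: ... } pairs.
function mayBeCached(headers) {
    for (var i = 0; i < headers.length; i++) {
        var name = headers[i].name.toLowerCase();
        var value = headers[i].value.toLowerCase();
        // Treat both directives as "do not rely on the cache".
        if (name === 'cache-control' && /\b(no-store|no-cache)\b/.test(value)) {
            return false;
        }
        // Older servers signal the same intent through Pragma.
        if (name === 'pragma' && value.indexOf('no-cache') !== -1) {
            return false;
        }
    }
    return true;
}
```

A `true` result does not guarantee that a cache file exists; it only means that none of the checked headers rules it out.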
Comment #37
Posted on Mar 16, 2013 by Happy Horse
Closing. This issue has been moved to GitHub: https://github.com/ariya/phantomjs/issues/10158
Status: Migrated
Labels:
Type-Defect
Priority-Medium
Milestone-FutureRelease