How Do I?
Phase-Implementation
Updated May 28, 2009 by jmalo...@gmail.com

This page collects workarounds and hacks that other developers use, either to make up for features that are not yet available as an API in WebKit Gtk or to work around bugs and defects.

Note that the author names here are their Google Code login names.

  • Get the HTML content of a document
Author: jhuangjiahua, Ref: http://code.google.com/p/pywebkitgtk/issues/detail?id=4&can=1
class WebView(webkit.WebView):
    def get_html(self):
        self.execute_script('oldtitle=document.title;document.title=document.documentElement.innerHTML;')
        html = self.get_main_frame().get_title()
        self.execute_script('document.title=oldtitle;')
        return html
  • Handling custom URLs
Author: mbull.personal, Ref: http://code.google.com/p/pywebkitgtk/issues/detail?id=2
# It's a bit of a hack, but you can do this by adding a handler for the
# navigation-requested event on the WebView.
import urllib

def _navigation_requested_cb(view, frame, networkRequest):
    # get the uri from the request object
    uri = networkRequest.get_uri()
    print "request to go to %s" % uri
    # load the page somehow...
    page = urllib.urlopen(uri)
    # load into the associated view, passing in the uri
    view.load_string(page.read(), "text/html", "iso-8859-15", uri)
    # return 1 to stop any other handlers running,
    # e.g. the default uri handler
    return 1

# It would be pretty easy to pick out the desired protocols and return False for the
# rest (leaving the default behaviour).

DISCLAIMER: I haven't tested these personally, so don't blame me if they don't work =).
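
The custom-URL recipe above mentions picking out the desired protocols and returning False for the rest. A minimal sketch of that check, using only the standard library, might look like this. The scheme name "myapp" and the helper name is_custom_uri are made up for illustration; they are not part of pywebkitgtk.

```python
try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse  # Python 2, as used elsewhere on this page

# Hypothetical schemes the application wants to serve itself (assumption).
CUSTOM_SCHEMES = frozenset(["myapp"])


def is_custom_uri(uri, schemes=CUSTOM_SCHEMES):
    """Return True if the URI's scheme is one we want to handle ourselves."""
    return urlparse(uri).scheme.lower() in schemes
```

Inside the navigation callback, a handler could return False straight away when is_custom_uri(uri) is false, so that the default behaviour applies to ordinary http and about: URIs.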

Comment by jonathan...@gmail.com, Nov 5, 2009

My page load flickers the background image. Is there a way I can sync it so that the redraw happens in one pass? The page is basically a couple of paragraphs of text with a background image, but the flicker happens. It's very odd.

Comment by lonhutt, Apr 5, 2010

Is there any way to modify the request header?

Comment by gg...@live.com.ar, Jun 16, 2010

I need the signal "enable-default-context-menu". Will it be implemented in the next release?

Comment by sassanh, Jun 24, 2010

Is there any way to use a proxy with it?

Comment by Lucian.B...@gmail.com, Jun 25, 2010

@gg...@live.com.ar

You need to use settings for that (http://webkitgtk.org/reference/webkitgtk-webkitwebview.html#WebKitWebView--settings). A webkit.WebSettings object will have that as a property.

Comment by mackst...@gmail.com, Jul 23, 2010

navigation-requested does not work for assets like images, JS, CSS, videos, etc.

Use resource-request-starting. Like this:

web_view.connect('resource-request-starting', resource_cb)

def resource_cb(view, frame, resource, request, response):
    print request.get_uri()
    request.set_uri('...whatever...')

Note that you can't feed back arbitrary data. You can only fiddle with the URL at this point. But that's okay, because through the URL you can do one of two things:

1. Encode your data and set a data: URI.
2. Write your data to a temp file and set a file: URI that points to the temp file.
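
The two options above can be sketched with the standard library alone. This is only an illustration: the helper names are made up, base64 is one of several valid data: URI encodings, and the file: variant assumes POSIX-style absolute paths.

```python
import base64
import os
import tempfile

try:
    from urllib.parse import quote  # Python 3
except ImportError:
    from urllib import quote  # Python 2


def to_data_uri(data, mime_type="text/plain"):
    """Option 1: encode raw bytes as a base64 data: URI."""
    encoded = base64.b64encode(data).decode("ascii")
    return "data:%s;base64,%s" % (mime_type, encoded)


def to_temp_file_uri(data, suffix=""):
    """Option 2: write raw bytes to a temp file and return a file: URI.

    The caller is responsible for deleting the file afterwards.
    Assumes a POSIX-style absolute path from mkstemp.
    """
    fd, path = tempfile.mkstemp(suffix=suffix)
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
    return "file://" + quote(path)
```

Either result is a plain string, so it can be handed to request.set_uri() in the resource-request-starting callback shown above.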

Comment by liweijia...@gmail.com, Sep 1, 2010

I get the content of a web page with the get_html() function, and it looks like this: ... <script>document.location='http://www.someelse.com'</script> ...

Now, can I get the content of the redirect target, www.someelse.com?

Comment by loneow...@gmail.com, Sep 23, 2010

Where can I find a list of all the available signals on a webview widget?

Comment by jhuangjiahua@gmail.com, Oct 1, 2010

@sassanh

I use this code to set a proxy:

import ctypes
import gtk, webkit

# load the shared libraries (paths may differ on your system)
libgobject = ctypes.CDLL('/usr/lib/libgobject-2.0.so.0')
libsoup = ctypes.CDLL('/usr/lib/libsoup-2.4.so.1')
libwebkit = ctypes.CDLL('/usr/lib/libwebkit-1.0.so')

proxy_uri = libsoup.soup_uri_new('http://127.0.0.1:8000')

# set the proxy on WebKit's default soup session
session = libwebkit.webkit_get_default_session()
libgobject.g_object_set(session, "proxy-uri", proxy_uri, None)

w = gtk.Window()
s = gtk.ScrolledWindow()
v = webkit.WebView()
s.add(v)
w.add(s)
w.show_all()

v.open('http://www.youtube.com')

Comment by charlie....@gmail.com, Oct 1, 2010

I can't figure out how to use the navigation_requested_cb hack :

import webkit
import gtk

window = gtk.Window()
view = webkit.WebView()

window.connect("navigation-requested",
		_navigation_requested_cb,
		view,
		view.get_main_frame(),
		?????)

How/where can I get the NetworkRequest??

Comment by Roger.Si...@gmail.com, Nov 3, 2010

How do you install pywebkitgtk on centos/redhat?

Comment by mbull.pe...@gmail.com, Nov 11, 2010

charlie,

In later versions (1.1.x, I seem to remember) you need to guard against loops by using a class attribute to hold the currently requested uri and checking for it. Also, just connect to the WebView's navigation-requested signal; the callback args will be (view, frame, net_request), where net_request is your NetworkRequest object. See below.

import urllib
import webkit

class BrowserPage(webkit.WebView):
    def __init__(self):
        webkit.WebView.__init__(self)
        self.connect("navigation-requested",self._nav_request_cb)
        self.l_uri=None
    def _nav_request_cb(self,view,frame,net_req):
        uri=net_req.get_uri()
        if uri==self.l_uri:
            print "same uri"
            return 2
        print "getting %s" % uri
        self.l_uri=uri
        page=urllib.urlopen(uri)
        print "load to frame"
        frame.load_string(page.read(),"text/html","iso-8859-15",page.geturl())
        print "[done]"
        return 1

note: now uses frame.load_string instead of view.load_string to preserve frames..

Comment by mbull.pe...@gmail.com, Nov 11, 2010

or better still (as navigation-requested is deprecated...)

import urllib
import webkit

class BrowserPage(webkit.WebView):
    def __init__(self):
        webkit.WebView.__init__(self)
        self.connect("navigation-policy-decision-requested",self._nav_request_policy_decision_cb)
        self.l_uri=None
    def _nav_request_policy_decision_cb(self,view,frame,net_req,nav_act,pol_dec):
        uri=net_req.get_uri()
        if uri==self.l_uri:
            pol_dec.use()
            return True
        if uri.startswith('about:'):
            return False
        self.l_uri=uri
        page=urllib.urlopen(uri)
        frame.load_string(page.read(),"text/html","iso-8859-15",page.geturl())
        pol_dec.ignore()
        return True
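
The branching in the callback above can be looked at on its own, without WebKit. The sketch below is a plain-Python model of the same three-way decision, so the loop guard can be tried in isolation. The class name NavigationGuard and the outcome strings are made up for illustration.

```python
# Outcomes, named after what the callback above does in each branch.
USE = "use"          # pol_dec.use() and return True
DEFAULT = "default"  # return False so WebKit handles the request itself
FETCH = "fetch"      # fetch the page ourselves, then pol_dec.ignore()


class NavigationGuard(object):
    """Remembers the last URI we fetched, to avoid fetching it in a loop."""

    def __init__(self):
        self.last_uri = None

    def decide(self, uri):
        if uri == self.last_uri:
            # The request caused by our own load_string call: let it through.
            return USE
        if uri.startswith("about:"):
            return DEFAULT
        self.last_uri = uri
        return FETCH
```

Calling decide() twice with the same URI returns "fetch" and then "use", which is the sequence the l_uri attribute is there to produce.
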
Comment by lotsof...@gmail.com, Jan 18, 2011

Is there a way to store/remember cookies? For example, if I create a PyWebkitGTK app and use it to log into a website, I would like to remain logged in to that website the next time I start the app.

Comment by boatkrap, Mar 25, 2011

I would like to use webkit to get a web page's HTML. The Python code works when I try it in the console, but not when I run it from a file. This is my browser.py:

#!/usr/bin/python

import sys
import webkit

class WebView(webkit.WebView):
    def get_html(self):
        self.execute_script('oldtitle=document.title;document.title=document.documentElement.innerHTML;')
        html = self.get_main_frame().get_title()
        self.execute_script('document.title=oldtitle;')
        return html
    
if __name__ == '__main__':
    if len(sys.argv) < 2:
        print "plz use # browser.py <url>"
        sys.exit()
    
    print "url: ", sys.argv[1]
    browser = WebView()
    browser.open(sys.argv[1])
    print browser.get_html()

I don't need to show the browser because it is a console application. How can I do this? Thanks.

Comment by ser...@gmail.com, Apr 15, 2011

RE : Is there a way to store/remember cookies? For example, if I create a PyWebkitGTK app and use it to log into a website, I would like to remain logged in to that website the next time I start the app.

You can get the stored cookies using the example above: read document.cookie instead of document.documentElement.innerHTML.

-self.execute_script('oldtitle=document.title;document.title=document.documentElement.innerHTML;')
+self.execute_script('oldtitle=document.title;document.title=document.cookie;')
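
The string that comes back from document.cookie is a single line of name=value pairs separated by semicolons. A small sketch for splitting it into a dict follows; the function name is made up for illustration. Note that cookies marked HttpOnly are not visible to document.cookie, so this approach will not capture them.

```python
def parse_cookie_string(cookie_string):
    """Split a document.cookie style string into a {name: value} dict."""
    cookies = {}
    for pair in cookie_string.split(";"):
        pair = pair.strip()
        if not pair:
            continue
        # Split on the first "=" only, since values may contain "=" too.
        name, _, value = pair.partition("=")
        cookies[name] = value
    return cookies
```

For example, the title string could be passed straight in: parse_cookie_string(self.get_main_frame().get_title()).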

Comment by ser...@gmail.com, Apr 21, 2011

To boatkrap:

Q: I don't need to show browser because it is console application. How can I do?

A: Use a virtual frame buffer:

Xvfb :1 -fbdir /var/tmp/ &
export DISPLAY=:1
python webkit

It works; I tried it myself.

Comment by dwil...@builderadius.com, Sep 27, 2011

Thanks for the heads up on how to store cookies ...

But if I have a cookie stored on my machine, how do I make sure webkit knows to load those cookies when I point it to a web page?

Comment by GAndreOl...@gmail.com, Oct 11, 2011

I tried this other method to get the HTML of the document:

import jswebkit

  def get_html(self):
    frame = self.viewer.get_main_frame()
    ctx = jswebkit.JSContext(frame.get_global_context())
    text = ctx.EvaluateScript("document.body.innerHTML")
    return text

It works very well, and it's better than putting the whole HTML into the title. Also, with this method you can access only certain parts of the HTML code, and have some DOM interaction via javascript. You have to install the "python-jswebkit" package.

Comment by jcapa...@gmail.com, Dec 21, 2011

Using something like this from mbull:

class BrowserPage(webkit.WebView):
    def __init__(self):
        webkit.WebView.__init__(self)
        self.connect("navigation-policy-decision-requested",self._nav_request_policy_decision_cb)
        self.l_uri=None
    def _nav_request_policy_decision_cb(self,view,frame,net_req,nav_act,pol_dec):
        uri=net_req.get_uri()
        if uri==self.l_uri:
            pol_dec.use()
            return True
        if uri.startswith('about:'):
            return False
        self.l_uri=uri
        page=urllib.urlopen(uri)
        frame.load_string(page.read(),"text/html","iso-8859-15",page.geturl())
        pol_dec.ignore()
        return True

The problem I'm running into is that this always does a GET. urlopen will do a POST if it's presented with data, but I'm not sure how to:

a) find out if we've been asked to do a post
b) get the user data from inside the navigation-policy-decision-requested callback
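
On the urllib side only, the GET/POST switch mentioned above can be seen without touching the network: a request object built with a data payload reports POST, and one without reports GET. The sketch below uses the standard library's Request class; the helper name and form fields are made up, and it does not answer how to obtain the form data from the WebKit callback.

```python
try:
    from urllib.request import Request  # Python 3
    from urllib.parse import urlencode
except ImportError:
    from urllib2 import Request  # Python 2
    from urllib import urlencode


def build_request(uri, form_fields=None):
    """Build a request that is a POST when form fields are given, else a GET."""
    data = None
    if form_fields:
        # Form fields are sent url-encoded in the request body.
        data = urlencode(form_fields).encode("ascii")
    return Request(uri, data)
```

The resulting object can be passed to urlopen in place of the bare URI.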

Comment by mhluo...@gmail.com, Jan 3, 2012

I'm having the same issue. I dug into the library and bindings, and it looks like the info should be available via NetworkRequest, but the binding hasn't got those methods fleshed out yet.

Comment by mhluo...@gmail.com, Jan 3, 2012

dwil, et al., I think I've come up with a better cookie solution using ctypes. The details can be found on StackOverflow ("Python webkit WebView remember cookies?" and "How to clear cookies in webkit") and can be tailored to your use case.

