Google Base Data API

Accessing the Google Base data API using Python

The example Python application generates a sample XML file for a specific item type. It proposes the attributes that are most commonly used on Google Base for this item type and gives a few examples to illustrate how they're used.

This application shows how to access the public Google Base feeds, in particular the attributes feed.

Running the application

Before you can run this application, you will need to get a key for "Installed applications".

Usage

python generate_template.py [options] item_type

Options

  • -h, --help
    show this help message and exit
  • --url=BASE_URL
    the url of the google base server to use
  • --file=FILE
    write output to FILE instead of standard output
  • --key=KEY
    developer key (required)
  • --attribute_count=MAX_RESULTS
    maximum number of attributes to include
  • --value_count=MAX_VALUES
    maximum number of examples to include
  • --include_example
    include an example in the template

Examples

python generate_template.py --key 1234 books 
python generate_template.py --key 1234 housing 
python generate_template.py --key 1234 "events and activities"     
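As a rough sketch of how a command line like the ones above could be parsed, the following uses Python 3's argparse. It is not the parser from generate_template.py: the function name build_parser is hypothetical, the integer type for the two count options is an assumption, and only the option names and help texts are taken from the list above.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the options listed in this document."""
    parser = argparse.ArgumentParser(prog="generate_template.py")
    # The single positional argument, e.g. "books" or "events and activities".
    parser.add_argument("item_type", help="item type to generate a template for")
    parser.add_argument("--url", metavar="BASE_URL",
                        help="the url of the google base server to use")
    parser.add_argument("--file", metavar="FILE",
                        help="write output to FILE instead of standard output")
    parser.add_argument("--key", metavar="KEY", required=True,
                        help="developer key (required)")
    # Assumption: both counts are integers.
    parser.add_argument("--attribute_count", metavar="MAX_RESULTS", type=int,
                        help="maximum number of attributes to include")
    parser.add_argument("--value_count", metavar="MAX_VALUES", type=int,
                        help="maximum number of examples to include")
    parser.add_argument("--include_example", action="store_true",
                        help="include an example in the template")
    return parser


args = build_parser().parse_args(["--key", "1234", "events and activities"])
print(args.key, args.item_type)
```

Both the "--key 1234" form used in the examples and the "--key=KEY" form shown in the option list are accepted by this parser.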

Understanding the code

generate_template.py is the main entry point of the application; however, it contains little that is specific to the Google Base data API.

The interesting code is in googlebase.py.

Everything starts with the GoogleBaseService object. It connects to www.google.com/base, passes along your developer key, and returns the feed as a DOM tree.

To create a new instance, pass it the API key you got from http://code.google.com/apis/base/signup.html:

import googlebase
service = googlebase.GoogleBaseService("1234")

Once you have the service object, you can connect to any of the public feeds described in the Google Base data API documentation.

Here is how you could query the snippets feed for books about programming:

domtree = service.run_query("snippets", bq="[item type: book] programming", max_results=20)

Any named parameter you pass to run_query() will be forwarded to the feed. Refer to the feed documentation for a list of the parameters that each feed accepts.

If you look at how run_query() is implemented, you'll see that it simply builds a feed URL, reads it and passes the result to the minidom parser.
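To illustrate the URL-building step, the sketch below turns keyword arguments into a feed URL. It is not the code from googlebase.py: the function name build_feed_url and the BASE_URL constant are hypothetical, the underscore-to-hyphen conversion is inferred from the max_results / max-results pair that appears in this document, and it uses Python 3's urllib.parse rather than the Python 2 modules the sample relies on.

```python
from urllib.parse import urlencode

# Assumption: feeds live under this prefix, as in the feed URL shown in this document.
BASE_URL = "http://www.google.com/base/feeds/"


def build_feed_url(feed, **params):
    """Return the URL of `feed` with `params` encoded as query parameters."""
    # Python keyword names cannot contain hyphens, so max_results becomes max-results.
    pairs = sorted((name.replace("_", "-"), value) for name, value in params.items())
    query = urlencode(pairs)
    return BASE_URL + feed + ("?" + query if query else "")


print(build_feed_url("snippets", bq="[item type: book] programming", max_results=20))
```

Sorting the parameters only makes the output deterministic; the order of query parameters should not matter to the feed.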

Once you have the DOM tree, you can access any element as described in the Python documentation for xml.dom.minidom.

For example, here is how you would display the title and author of all of the books returned by the previous query:

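def textContent(domelement):
  # domelement is the node list returned by getElementsByTagNameNS();
  # it is empty when the entry has no such element.
  if not domelement: return "?"
  # Serialize the children of the first matching element as UTF-8 XML.
  return ''.join([c.toxml('utf-8') for c in domelement[0].childNodes])

# Print the title and author of every Atom entry in the feed.
for book in domtree.getElementsByTagNameNS(googlebase.ATOM_NAMESPACE_URI, "entry"):
  print '"%s" by "%s"' % (textContent(book.getElementsByTagNameNS(googlebase.ATOM_NAMESPACE_URI, "title")),
                          textContent(book.getElementsByTagNameNS(googlebase.G_NAMESPACE_URI, "author")))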
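The same traversal can be tried offline. The sketch below is written for Python 3 and runs against a small hand-written feed, not a real Google Base response. The two namespace URIs are assumed to be the values behind googlebase.ATOM_NAMESPACE_URI and googlebase.G_NAMESPACE_URI, the helper names are hypothetical, and the text is read from the text nodes directly instead of through toxml().

```python
import xml.dom.minidom

# Assumed values of googlebase.ATOM_NAMESPACE_URI and googlebase.G_NAMESPACE_URI.
ATOM_NS = "http://www.w3.org/2005/Atom"
G_NS = "http://base.google.com/ns/1.0"

# Hand-written sample data; the second entry deliberately has no author.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:g="http://base.google.com/ns/1.0">
  <entry>
    <title>Example Book One</title>
    <g:author>A. Writer</g:author>
  </entry>
  <entry>
    <title>Example Book Two</title>
  </entry>
</feed>"""


def text_content(elements):
    """Return the text of the first element in the list, or "?" if the list is empty."""
    if not elements:
        return "?"
    return "".join(node.data for node in elements[0].childNodes
                   if node.nodeType == node.TEXT_NODE)


def titles_and_authors(domtree):
    """Collect a (title, author) pair for every Atom entry in the tree."""
    results = []
    for entry in domtree.getElementsByTagNameNS(ATOM_NS, "entry"):
        title = text_content(entry.getElementsByTagNameNS(ATOM_NS, "title"))
        author = text_content(entry.getElementsByTagNameNS(G_NS, "author"))
        results.append((title, author))
    return results


domtree = xml.dom.minidom.parseString(SAMPLE_FEED)
for title, author in titles_and_authors(domtree):
    print('"%s" by "%s"' % (title, author))
```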

This is similar to what the listMostCommonItemTypeAttributes() method does when it parses the attributes feed.

To wrap up, here's what googlebase.py does in the previous examples:

import urllib2
import xml.dom.minidom


# service = googlebase.GoogleBaseService("1234")
opener = urllib2.build_opener()
opener.addheaders = [("X-Google-Key", "key=1234")]

# domtree = service.run_query("snippets", bq="[item type: book] programming", max_results=20)
handle = opener.open("http://www.google.com/base/feeds/snippets?max-results=20&bq=%5Bitem+type%3A+book%5D+programming")
domtree = xml.dom.minidom.parse(handle)
handle.close()
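For readers on Python 3, where urllib2 no longer exists, the same steps could be sketched with urllib.request. This is an adaptation under assumptions, not part of the sample: the function names build_request and fetch_feed are hypothetical, and fetch_feed is left uncalled so that the example runs without network access.

```python
import urllib.request
import xml.dom.minidom
from urllib.parse import urlencode

DEVELOPER_KEY = "1234"  # placeholder key, as in the examples in this document


def build_request(feed_url, key):
    """Attach the developer key header that the Python 2 code sets via addheaders."""
    return urllib.request.Request(feed_url, headers={"X-Google-Key": "key=" + key})


def fetch_feed(request):
    """Open the request and parse the response into a DOM tree (needs network access)."""
    with urllib.request.urlopen(request) as handle:
        return xml.dom.minidom.parse(handle)


# Same query as above: 20 snippets for books about programming.
query = urlencode([("max-results", 20), ("bq", "[item type: book] programming")])
request = build_request("http://www.google.com/base/feeds/snippets?" + query, DEVELOPER_KEY)
print(request.full_url)
```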