My favorites | English | Sign in

Google Spreadsheets APIs and Tools (Labs)

Developer's Guide: Python

The Google Spreadsheets Data API allows client applications to view and update Spreadsheets content in the form of Google Data API feeds. Your client application can request a list of a user's spreadsheets, edit or delete content in an existing Spreadsheets worksheet, and query the content in an existing Spreadsheets worksheet.

In addition to providing some background on the capabilities of the Spreadsheets Data API, this document provides examples for interacting with the API using the Python client library. For help setting up the client library, see the Getting Started Guide.

If you're interested in understanding more about the underlying protocol used by the Python Client Library to interact with the Spreadsheets Data API, please see the protocol tab.

Contents

Audience

This document is intended for programmers who want to write client applications using the Google Data API Python client library that can interact with Google Spreadsheets.

Getting started

For help setting up the client library, see the Getting Started Guide. To use the Python client library, you'll need the Python 2.0+ and the Element Tree, httplib, and urllib modules. See Dependency Modules for further details. After downloading the client library, you'll find the sample explained in this guide in the samples/spreadsheets subdirectory of the distribution.

A full working copy of this sample is available in the Google Data API Python Client Library project in the project hosting section of code.google.com. The sample is located at /trunk/samples/spreadsheets/spreadsheetExample.py in the SVN repository accessible from the Source tab.

Run the example as follows:

python spreadsheetsExample.py --user=<username> --pw=<password>

The username and password values are the same credentials that you use to login to Google Spreadsheets.

The sample performs a number of operations on a worksheet of your choice to demonstrate the use of the Spreadsheets Data API. Please see the Run method of the code for more details as to which operations are performed.

To include the examples in this document into your own code, you'll need the following import statements:

try: 
  from xml.etree import ElementTree
except ImportError:  
  from elementtree import ElementTree
import gdata.spreadsheet.service
import gdata.service
import atom.service
import gdata.spreadsheet
import atom

The SpreadsheetsService class represents a client connection (with authentication) to a Spreadsheets service. The general procedure for sending a query to a service using the client library consists of the following steps:

  1. Obtain or construct the appropriate URL.
  2. If you're sending data to a service (for example, if you're inserting a new entry), then transform the raw data into objects using the client library classes. (This step doesn't apply if you're just requesting a feed, as in this example.)
  3. Create a new SpreadsheetsService instance, setting your application's name (in the form companyName-applicationName-versionID).
  4. Set the appropriate credentials.
  5. Call a method to send the request and receive any results.

Create a spreadsheet

It is possible to create a new spreadsheet by uploading a spreadsheet file via the Google Documents List Data API. The Spreadsheets Data API does not currently provide a way to delete a spreadsheet. For testing purposes, you may create a spreadsdheet manually or upload one.

Authenticating to the Spreadsheets service

The Python client library can be used to work with either public or private feeds. Public feeds are read-only, but do not require any authentication. Private feeds require that you authenticate to the spreadsheets servers. This can be done via ClientLogin username/password authentication or AuthSub proxy authentication.

Please see the authentication documentation for more information on AuthSub and ClientLogin.

Single-user "installed" client authentication

To use ClientLogin (also called "Authentication for Installed Applications"), invoke the ProgrammaticLogin method of SpreadsheetsService inherited from Service, specifying the ID and password of the user on whose behalf your client is sending the query. For example:

gd_client = gdata.spreadsheet.service.SpreadsheetsService()
gd_client.email = 'jo@gmail.com'
gd_client.password = 'mypassword'
gd_client.source = 'exampleCo-exampleApp-1'
gd_client.ProgrammaticLogin()

For more information about authentication systems, see the Google Account Authentication documentation.

Multiple-user web application client authentication

AuthSub proxy authentication is used by web applications which need to authenticate their users to Google accounts. The operator does not need access to the username and password for the Spreadsheets user - only special AuthSub tokens are required.

When the user first visits your application, they have not yet been authenticated. In this case, you need to print some text and and a link directing the user to Google to authenticate your request for access to their spreadsheets. The Python Data API client library provides a function to generate this URL. The code below sets up a link to the AuthSubRequest page.

def GetAuthSubUrl():
  next = 'http://www.coolspreadsheetsite.com/welcome.pyc'
  scope = 'http://spreadsheets.google.com/feeds/'
  secure = False
  session = True
  gd_client = gdata.spreadsheet.service.SpreadsheetsService()
  return gd_client.GenerateAuthSubURL(next, scope, secure, session);

authSubUrl = GetAuthSubUrl();
print '<a href="%s">Login to your Google account</a>' % authSubUrl

Notice the parameters sent to the GenerateAuthSubURL method:

  • next, the URL of the page that Google should redirect the user to after authentication.
  • scope, indicating that the application will only access Blogger feeds.
  • secure, indicating that the token returned will not be a secure token.
  • session, indicating this token can be exchanged for a multi-use (session) token.

The URL looks something like this:

https://www.google.com/accounts/AuthSubRequest?scope=http%3A%2F%2Fspreadsheets.google.com%2Ffeeds%2F&session=1&secure=0&next=http%3A%2F%2Fwww.coolspreadsheetsite.com%2Fwelcome.pyc

The user can then follow the link to Google's site and authenticate to their Google account.

After the user authenticates, they will be redirected back to the next URL. The URL will have a single-use token value appended to it as a query parameter. The URL looks something like this:

http://www.coolspreadsheetsite.com/welcome.pyc?token=14a87fe98219731acd516

Upgrading to a session token

For security, this token is single-use only, so now you need to exchange this single-use token for a session token. This process is described in the AuthSub documentation. The following code snippet shows how to upgrade the token.

gd_client = gdata.spreadsheet.service.SpreadsheetsService()
gd_client.auth_token = authsub_token
gd_client.UpgradeToSessionToken()

In this snippet, the authsub_token variable contains the value from the token query parameter in the URL. There are several ways to retrieve this value, for example:

import cgi
parameters = cgi.FieldStorage()
authsub_token = parameters['token' ]

This token value represents a single-use AuthSub token. Since session = True was specified above, this token can be exchanged for an AuthSub session token using the UpgradeToSessionToken method, which calls the AuthSubSessionToken service.

Get a list of spreadsheets

You can get a feed containing a list of the currently authenticated user's spreadsheets by sending an authenticated GET request to the following URL:

http://spreadsheets.google.com/feeds/spreadsheets/private/full

The result is a "meta-feed," a feed that lists all of that user's spreadsheets; each entry in the feed represents a spreadsheet associated with the user. This feed is accessible only using an authentication token.

You can prompt the user to select a spreadsheet from the list by:

def PromptForSpreadsheet(gd_client):
  # Get the list of spreadsheets
  feed = gd_client.GetSpreadsheetsFeed()
  PrintFeed(feed)
  input = raw_input('\nSelection: ')
  return feed.entry[string.atoi(input)].id.text.rsplit('/', 1)[1]

The resulting SpreadsheetsSpreadsheetsFeed object feed represents a response from the server. Among other things, this feed contains a list of SpreadsheetsSpreadsheet objects (feed.entry), each of which represents a single spreadsheet. SpreadsheetsSpreadsheet encapsulates the information shown in the protocol document.

The PrintFeed definition is as follows:

def PrintFeed(feed):
  for i, entry in enumerate(feed.entry):
    if isinstance(feed, gdata.spreadsheet.SpreadsheetsCellsFeed):
      print '%s %s\n' % (entry.title.text, entry.content.text)
    elif isinstance(feed, gdata.spreadsheet.SpreadsheetsListFeed):
      print '%s %s %s' % (i, entry.title.text, entry.content.text)
      # Print this row's value for each column (the custom dictionary is
      # built from the gsx: elements in the entry.) See the description of
      # gsx elements in the protocol guide.
      print 'Contents:'
      for key in entry.custom:
        print '  %s: %s' % (key, entry.custom[key].text)
      print '\n',
    else:
      print '%s %s\n' % (i, entry.title.text)

The value returned from PromptForSpreadsheet is the key of the spreadsheet at the selected index. The key is the permanent unique ID of a specific worksheet within the spreadsheet. The key is the last value in the path of the id (corresponding to the <id> element in the spreadsheet entry) attribute of the SpreadsheetsSpreadsheet object.

Get a list of worksheets

A given spreadsheet may contain multiple worksheets. For each spreadsheet, there's a worksheets metafeed listing all the worksheets in that spreadsheet.

Given a key from the <id> of SpreadsheetsSpreadsheet object you've already retrieved, you can print a list of all worksheets and prompt for a selection as follows:

def PromptForWorksheet(gd_client, key):
  # Get the list of worksheets
  feed = gd_client.GetWorksheetsFeed(key)
  PrintFeed(feed)
  input = raw_input('\nSelection: ')
  return feed.entry[string.atoi(input)].id.text.rsplit('/', 1)[1]

The resulting SpreadsheetsWorksheetsFeed object feed represents a response from the server. Among other things, this feed contains a list of SpreadsheetsWorksheet objects (feed.entry), each of which represents a single spreadsheet. SpreadsheetsWorksheet encapsulates the information shown in the protocol document.

The rertieval of the "meta-feed" is equivalent to sending an authenticated GET request to the URL:

http://spreadsheets.google.com/feeds/worksheets/worksheetId/private/full

The value returned from PromptForWorksheet is the worksheetId of the worksheet at the selected index. The worksheetId is the permanent unique ID of a specific worksheet within the spreadsheet. The worksheetId is the last value in the path of the id (corresponding to the <id> element in the worksheet entry) attribute of the SpreadsheetsWorksheet object.

Interacting with list-based feeds

A given worksheet generally contains multiple rows, each containing multiple cells. You can request data from the worksheet either as a list-based feed, in which each entry represents a row, or as a cell-based feed, in which each entry represents a single cell. For information on cell-based feeds, see Interacting with cell-based feeds.

The following sections describe how to get a list-based feed, add a row to a worksheet, and send queries with various query parameters.

The list feed makes some assumptions about how the data is laid out in the spreadsheet.

In particular, the list feed treats the first row of the worksheet as a header row; Spreadsheets dynamically creates XML elements named after the contents of header-row cells. Users who want to provide Data API feeds should not put any data other than column headers in the first row of a worksheet.

The list feed contains all rows after the first row up to the first blank row. The first blank row terminates the data set. If expected data isn't appearing in a feed, check the worksheet manually to see whether there's an unexpected blank row in the middle of the data. In particular, if the second row of the spreadsheet is blank, then the list feed will contain no data.

A row in a list feed is as many columns wide as the worksheet itself.

Get a list-based feed

To retrieve a worksheet's list feed, you might do the following. Here, assume that the gd_client is an authenticated spreadsheet.SpreadsheetsService object, key is a valid unique ID for a spreadhseet that the user has access to, and wksht_id is a valid unique ID for a worksheet in the spreadsheet corresponding to the provided key.

def ListGetAction(gd_client, key, wksht_id):
  # Get the list feed
  feed = gd_client.GetListFeed(key, wksht_id)
  PrintFeed(feed)
  return feed

The resulting SpreadsheetsListFeed object feed represents a response from the server. Among other things, this feed contains a list of SpreadsheetsList objects (feed.entry), each of which represents a single row in a worksheet. The SpreadsheetsWorksheet encapsulates the information shown in the protocol document.

This code is equivalent to sending an authenticated GET request to the URL:

http://spreadsheets.google.com/feeds/list/key/worksheetId/private/full

Reverse-sort rows

By default, rows in the feed appear in the same order as the corresponding rows in the GUI; that is, they're in order by row number. To get rows in reverse order, set the reverse properties of the service.ListQuery object to true:

q = service.ListQuery()
q.feed = 'http://spreadsheets.google.com/feeds/list/key/worksheetId/private/full'
q.reverse = 'true'
    

This is equivalent to sending an authenticated GET request to the URL:

http://spreadsheets.google.com/feeds/list/key/worksheetId/private/full?reverse=true

Note that if you want to order (or reverse sort) by a particular column, rather than by position in the worksheet, you can set the orderby value of the service.ListQuery object to column:<the header of that column>. This is analogous to adding an orderby query parameter in the outbound GET request.

Send a structured query

You can set a service.ListQuery's sq value to produce a feed with entries that meet the specified criteria. For example, suppose you have a worksheet containing personnel data, in which each row represents information about a single person. You wish to retrieve all rows in which the person's name is "John" and the person's age is over 25. To do so, you would set sq as follows:

q = service.ListQuery()
q.sq = 'name=John and age>25'
q.feed = 'http://spreadsheets.google.com/feeds/list/key/worksheetId/private/full'
    

This is equivalent to sending an authenticated GET request to the URL:

http://spreadsheets.google.com/feeds/list/key/worksheetId/private/full?sq=name%3DJohn%20and%20age%3E25

For more information about the structured query syntax, see the List-feed query parameters reference section.

Add a row

To insert a new row in a list feed, assuming that the gd_client is an authenticated spreadsheets.SpreadsheetService object, key s a valid unique ID for a spreadsheet that the user has access to, and wksht_id is a valid unique ID for a worksheet in the spreadsheet corresponding to the provided key, do the following:

def ListInsertAction(gd_client, key, wksht_id, row_data):
  entry = gd_client.InsertRow(StringToDictionary(row_data), 
      key, wksht_id)
  if isinstance(entry, gdata.spreadsheet.SpreadsheetsList):
    print 'Inserted!'

Notice that the InsertRow definition takes a dictionary where the key corresponds to the column name and the value corresponds to the value of that column in the new row.

To parse the input string into a map the following StringToDictionary definition is used.

def StringToDictionary(row_data):
  result = {}
  for param in row_data.split():
    name, value = param.split('=')
    result[name] = value
  return result

Spreadsheets inserts the new row immediately after the last row that appears in the list-based feed, which is to say immediately before the first entirely blank row. This code is equivalent to sending an authenticated POST request to the URL:

http://spreadsheets.google.com/feeds/list/key/worksheetId/private/full

with the corresponding XML document in the POST body.

Edit a row

To update an existing row in a list feed, assuming that gd_client is an authenticated spreadsheets.SpreadsheetService object, key is a valid unique ID for a spreadsheet that the user has access to, and wksht_id is a valid unique ID for a worksheet in the corresponding to the provided key, do the following:

def ListUpdateAction(gd_client, key, wksht_id, index, row_data):
  feed = gd_client.GetListFeed(key, wksht_id)
  entry = gd_client.UpdateRow(
      feed.entry[string.atoi(index)],
      StringToDictionary(row_data))
  if isinstance(entry, gdata.spreadsheet.SpreadsheetsList):
    print 'Updated!'

Notice that the UpdateRow definition takes a spreadsheet.SpreadsheetList object which corresponds to an existing list entry, and a dicitionary where the key corresponds to the column name and the value corresponds to the value of that column in the new row.

This is equivalent to sending an authenticated PUT request to the entry's edit link with the updated XML as PUT data.

Remove a row

To delete a row, simply invoke spreadsheet.DeleteList on the service.SpreadsheetService with the existing entry to be deleted:

def ListDeleteAction(gd_client, key, wksht_id, index):
  feed = gd_client.GetListFeed(key, wksht_id)
  gd_client.DeleteRow(feed.entry[string.atoi(index)])
  print 'Deleted!'

To delete a row, perform a DELETE request on the edit URL after obtaining it as described above.

Note: If you delete a row, that row is completely removed from the worksheet.

Interacting with cell-based feeds

In a cell-based feed, each entry represents a single cell.

Note that it isn't recommended to interact with both a cell-based feed and a list-based feed for the same worksheet at the same time.

Get a cell-based feed

To retrieve a worksheet's cell-based feed, you might do the following. Again, assume that service.SpreadsheetService is an authenticated spreadsheet.service.SpreadsheetService object, key is a valid unique ID for a spreadsheet that the user has access to, and wksht_id is a valid unique ID for a worksheet in the spreadsheet corresponding to the provided key.

def CellsGetAction(gd_client, key, wksht_id):
  # Get the feed of cells
  feed = gd_client.GetCellsFeed(key, wksht_id)
  PrintFeed(feed)

Send a cell range query

Suppose you wanted to retrieve the cells in the first column of a worksheet. You can request a cell feed containing only this column as follows:

query = gdata.spreadsheet.service.CellQuery()
query['min-col'] = '1'
query['max-col'] = '1'
query['min-row'] = '2'
feed = gd_client.GetCellsFeed(key, wksht_id, query=query)

This requests all the data in column 1, starting with row 2.

Setting the 'min-col', 'max-col', 'min-row', and 'max-row' values of a spreadsheet.service.CellQuery object is equivalent to providing values for the parameters min-col, max-col, min-row, and max-row (respectively) in the HTTP GET request.

Change contents of a cell

To modify the contents of an existing SpreadsheetsCell that you have retrieved from a feed, first change its

def CellsUpdateAction(gd_client, key, wksht_id, row, col, inputValue):
  entry = gd_client.UpdateCell(row=row, col=col, inputValue=inputValue, 
      key=key, wksht_id=wksht_id)
  if isinstance(entry, gdata.spreadsheet.SpreadsheetsCell):
    print 'Updated!'

This is equivalent to sending an authenticated PUT request with the updated cell's XML representation to the URL:

PUT http://spreadsheets.google.com/feeds/cells/key/worksheetId/private/full/cell/version

The new data is placed in the specified cell in the worksheet. If the specified cell contains data already, it will be overwritten. Instead of using the cells specific version string, the Spreadsheets Data API will also accept the string latest to automatically overwrite the last version of the cell content.

Note: Use UpdateCell (ie., HTTP PUT) to change the data in a cell, even if it is empty; the use of POST on empty cells is no longer recommended. To clear a cell, send an empty string as the inputValue instead of using a DELETE.

Back to top