The Google Spreadsheets Data API allows client applications to view and update Spreadsheets content in the form of Google Data API feeds. Your client application can request a list of a user's spreadsheets, edit or delete content in an existing Spreadsheets worksheet, and query the content in an existing Spreadsheets worksheet.
In addition to providing some background on the capabilities of the Spreadsheets Data API, this document provides examples for interacting with the API using the Python client library. For help setting up the client library, see the Getting Started Guide.
If you're interested in understanding more about the underlying protocol used by the Python Client Library to interact with the Spreadsheets Data API, please see the protocol tab.
This document is intended for programmers who want to write client applications using the Google Data API Python client library that can interact with Google Spreadsheets.
For help setting up the client library, see the Getting Started Guide. To use the Python client library, you'll need the Python 2.0+ and the
Element Tree, httplib, and urllib modules. See Dependency
Modules for further details. After downloading the client library, you'll find the sample explained in this guide in the samples/spreadsheets subdirectory of the distribution.
A full working copy of this sample is available in the Google Data API Python Client Library project in the project hosting section of code.google.com. The sample is located at /trunk/samples/spreadsheets/spreadsheetExample.py in the SVN repository accessible from the Source tab.
Run the example as follows:
python spreadsheetsExample.py --user=<username> --pw=<password>
The username and password values are the same credentials that you use to login to Google Spreadsheets.
The sample performs a number of operations on a worksheet of your choice to demonstrate the use of the Spreadsheets Data API. Please see the Run method of the code for more details as to which operations are performed.
To include the examples in this document into your own code, you'll need
the following import statements:
try: from xml.etree import ElementTree except ImportError: from elementtree import ElementTree import gdata.spreadsheet.service import gdata.service import atom.service import gdata.spreadsheet import atom
The SpreadsheetsService class represents a client connection (with authentication) to a Spreadsheets service. The general procedure for sending a query to a service using the client library consists of the following steps:
SpreadsheetsService instance, setting your application's name (in the form companyName-applicationName-versionID).It is possible to create a new spreadsheet by uploading a spreadsheet file via the Google Documents List Data API. The Spreadsheets Data API does not currently provide a way to delete a spreadsheet. For testing purposes, you may create a spreadsdheet manually or upload one.
The Python client library can be used to work with either public or private feeds. Public feeds are read-only, but do not require any authentication. Private feeds require that you authenticate to the spreadsheets servers. This can be done via ClientLogin username/password authentication or AuthSub proxy authentication.
Please see the authentication documentation for more information on AuthSub and ClientLogin.
To use ClientLogin (also called "Authentication for Installed
Applications"), invoke the ProgrammaticLogin
method of SpreadsheetsService inherited from
Service, specifying the ID and password of the user on whose behalf your client is sending the query. For example:
gd_client = gdata.spreadsheet.service.SpreadsheetsService() gd_client.email = 'jo@gmail.com' gd_client.password = 'mypassword' gd_client.source = 'exampleCo-exampleApp-1' gd_client.ProgrammaticLogin()
For more information about authentication systems, see the Google Account Authentication documentation.
AuthSub proxy authentication is used by web applications which need to authenticate their users to Google accounts. The operator does not need access to the username and password for the Spreadsheets user - only special AuthSub tokens are required.
When the user first visits your application, they have not yet been authenticated. In this case, you need to print some text and and a link directing the user to Google to authenticate your request for access to their spreadsheets. The Python Data API client library provides a function to generate this URL. The code below sets up a link to the AuthSubRequest page.
def GetAuthSubUrl(): next = 'http://www.coolspreadsheetsite.com/welcome.pyc' scope = 'http://spreadsheets.google.com/feeds/' secure = False session = True gd_client = gdata.spreadsheet.service.SpreadsheetsService() return gd_client.GenerateAuthSubURL(next, scope, secure, session); authSubUrl = GetAuthSubUrl(); print '<a href="%s">Login to your Google account</a>' % authSubUrl
Notice the parameters sent to the GenerateAuthSubURL method:
The URL looks something like this:
https://www.google.com/accounts/AuthSubRequest?scope=http%3A%2F%2Fspreadsheets.google.com%2Ffeeds%2F&session=1&secure=0&next=http%3A%2F%2Fwww.coolspreadsheetsite.com%2Fwelcome.pyc
The user can then follow the link to Google's site and authenticate to their Google account.
After the user authenticates, they will be redirected back to the next URL. The URL will have a single-use token value appended to it as a query parameter. The URL looks something like this:
http://www.coolspreadsheetsite.com/welcome.pyc?token=14a87fe98219731acd516
For security, this token is single-use only, so now you need to exchange this single-use token for a session token. This process is described in the AuthSub documentation. The following code snippet shows how to upgrade the token.
gd_client = gdata.spreadsheet.service.SpreadsheetsService() gd_client.auth_token = authsub_token gd_client.UpgradeToSessionToken()
In this snippet, the authsub_token variable contains the value from the token query parameter in the URL. There are several ways to retrieve this value, for example:
import cgi parameters = cgi.FieldStorage() authsub_token = parameters['token' ]
This token value represents a single-use AuthSub token. Since session = True was specified above, this token can be exchanged for an AuthSub session token using the UpgradeToSessionToken method, which calls the AuthSubSessionToken service.
You can get a feed containing a list of the currently authenticated user's spreadsheets by sending an authenticated GET request to the following URL:
http://spreadsheets.google.com/feeds/spreadsheets/private/full
The result is a "meta-feed," a feed that lists all of that user's spreadsheets; each entry in the feed represents a spreadsheet associated with the user. This feed is accessible only using an authentication token.
You can prompt the user to select a spreadsheet from the list by:
def PromptForSpreadsheet(gd_client):
# Get the list of spreadsheets
feed = gd_client.GetSpreadsheetsFeed()
PrintFeed(feed)
input = raw_input('\nSelection: ')
return feed.entry[string.atoi(input)].id.text.rsplit('/', 1)[1]
The resulting SpreadsheetsSpreadsheetsFeed
object feed represents a response from the server. Among other things, this feed
contains a list of SpreadsheetsSpreadsheet
objects (feed.entry), each of which represents a single
spreadsheet. SpreadsheetsSpreadsheet encapsulates the information shown in the protocol document.
The PrintFeed definition is as follows:
def PrintFeed(feed):
for i, entry in enumerate(feed.entry):
if isinstance(feed, gdata.spreadsheet.SpreadsheetsCellsFeed):
print '%s %s\n' % (entry.title.text, entry.content.text)
elif isinstance(feed, gdata.spreadsheet.SpreadsheetsListFeed):
print '%s %s %s' % (i, entry.title.text, entry.content.text)
# Print this row's value for each column (the custom dictionary is
# built from the gsx: elements in the entry.) See the description of
# gsx elements in the protocol guide.
print 'Contents:'
for key in entry.custom:
print ' %s: %s' % (key, entry.custom[key].text)
print '\n',
else:
print '%s %s\n' % (i, entry.title.text)
The value returned from PromptForSpreadsheet is the
key of the spreadsheet at the selected index. The key is the
permanent unique ID of a specific worksheet within the spreadsheet. The
key is the last value in the path of the id
(corresponding to the <id> element in the spreadsheet entry)
attribute of the SpreadsheetsSpreadsheet object.
A given spreadsheet may contain multiple worksheets. For each spreadsheet, there's a worksheets metafeed listing all the worksheets in that spreadsheet.
Given a key from the <id> of
SpreadsheetsSpreadsheet object you've already retrieved, you
can print a list of all worksheets and prompt for a selection as follows:
def PromptForWorksheet(gd_client, key):
# Get the list of worksheets
feed = gd_client.GetWorksheetsFeed(key)
PrintFeed(feed)
input = raw_input('\nSelection: ')
return feed.entry[string.atoi(input)].id.text.rsplit('/', 1)[1]
The resulting SpreadsheetsWorksheetsFeed
object feed represents a response from the server. Among other things, this feed
contains a list of SpreadsheetsWorksheet
objects (feed.entry), each of which represents a single
spreadsheet. SpreadsheetsWorksheet encapsulates the information
shown in the protocol document.
The rertieval of the "meta-feed" is equivalent to sending an authenticated GET request to the URL:
http://spreadsheets.google.com/feeds/worksheets/worksheetId/private/full
The value returned from PromptForWorksheet is the
worksheetId of the worksheet at the selected index. The
worksheetId is the
permanent unique ID of a specific worksheet within the spreadsheet. The
worksheetId is the last value in the path of the id
(corresponding to the <id> element in the worksheet entry)
attribute of the SpreadsheetsWorksheet object.
A given worksheet generally contains multiple rows, each containing multiple cells. You can request data from the worksheet either as a list-based feed, in which each entry represents a row, or as a cell-based feed, in which each entry represents a single cell. For information on cell-based feeds, see Interacting with cell-based feeds.
The following sections describe how to get a list-based feed, add a row to a worksheet, and send queries with various query parameters.
The list feed makes some assumptions about how the data is laid out in the spreadsheet.
In particular, the list feed treats the first row of the worksheet as a header row; Spreadsheets dynamically creates XML elements named after the contents of header-row cells. Users who want to provide Data API feeds should not put any data other than column headers in the first row of a worksheet.
The list feed contains all rows after the first row up to the first blank row. The first blank row terminates the data set. If expected data isn't appearing in a feed, check the worksheet manually to see whether there's an unexpected blank row in the middle of the data. In particular, if the second row of the spreadsheet is blank, then the list feed will contain no data.
A row in a list feed is as many columns wide as the worksheet itself.
To retrieve a worksheet's list feed, you might do the following. Here,
assume that the gd_client is an authenticated
spreadsheet.SpreadsheetsService object, key is a
valid unique ID for a spreadhseet that the user has access to, and
wksht_id is a valid unique ID for a worksheet in the
spreadsheet corresponding to the provided key.
def ListGetAction(gd_client, key, wksht_id): # Get the list feed feed = gd_client.GetListFeed(key, wksht_id) PrintFeed(feed) return feed
The resulting SpreadsheetsListFeed object
feed represents a response from the server. Among other
things, this feed contains a list of
SpreadsheetsList objects
(feed.entry), each of which represents a single row in a
worksheet. The SpreadsheetsWorksheet encapsulates
the information shown in the protocol document.
This code is equivalent to sending an authenticated GET request to the URL:
http://spreadsheets.google.com/feeds/list/key/worksheetId/private/full
By default, rows in the feed appear in the same order as the
corresponding rows in the GUI; that is, they're in order by row number. To
get rows in reverse order, set the reverse properties of the
service.ListQuery object to true:
q = service.ListQuery()
q.feed = 'http://spreadsheets.google.com/feeds/list/key/worksheetId/private/full'
q.reverse = 'true'
This is equivalent to sending an authenticated GET request to the URL:
http://spreadsheets.google.com/feeds/list/key/worksheetId/private/full?reverse=true
Note that if you want to order (or reverse sort) by a particular column,
rather than by position in the worksheet, you can set the orderby
value of the service.ListQuery object to
column:<the header of that column>. This is analogous to adding an orderby
query parameter in the outbound GET request.
You can set a service.ListQuery's sq
value to produce a feed with entries that meet the specified criteria. For
example, suppose you have a worksheet containing personnel data, in which
each row represents information about a single person. You wish to retrieve
all rows in which the person's name is "John" and the person's age is over
25. To do so, you would set sq as follows:
q = service.ListQuery()
q.sq = 'name=John and age>25'
q.feed = 'http://spreadsheets.google.com/feeds/list/key/worksheetId/private/full'
This is equivalent to sending an authenticated GET request to the URL:
http://spreadsheets.google.com/feeds/list/key/worksheetId/private/full?sq=name%3DJohn%20and%20age%3E25
For more information about the structured query syntax, see the List-feed query parameters reference section.
To insert a new row in a list feed, assuming that the
gd_client is an authenticated
spreadsheets.SpreadsheetService object, key s a
valid unique ID for a spreadsheet that the user has access to, and
wksht_id is a valid unique ID for a worksheet in the
spreadsheet corresponding to the provided key, do the
following:
def ListInsertAction(gd_client, key, wksht_id, row_data):
entry = gd_client.InsertRow(StringToDictionary(row_data),
key, wksht_id)
if isinstance(entry, gdata.spreadsheet.SpreadsheetsList):
print 'Inserted!'
Notice that the InsertRow definition takes a dictionary
where the key corresponds to the column name and the value
corresponds to the value of that column in the new row.
To parse the input string into a map the following
StringToDictionary definition is used.
def StringToDictionary(row_data):
result = {}
for param in row_data.split():
name, value = param.split('=')
result[name] = value
return result
Spreadsheets inserts the new row immediately after the last row that appears in the list-based feed, which is to say immediately before the first entirely blank row. This code is equivalent to sending an authenticated POST request to the URL:
http://spreadsheets.google.com/feeds/list/key/worksheetId/private/full
with the corresponding XML document in the POST body.
To update an existing row in a list feed, assuming that
gd_client is an authenticated
spreadsheets.SpreadsheetService object, key is a
valid unique ID for a spreadsheet that the user has access to, and
wksht_id is a valid unique ID for a worksheet in the
corresponding to the provided key, do the following:
def ListUpdateAction(gd_client, key, wksht_id, index, row_data):
feed = gd_client.GetListFeed(key, wksht_id)
entry = gd_client.UpdateRow(
feed.entry[string.atoi(index)],
StringToDictionary(row_data))
if isinstance(entry, gdata.spreadsheet.SpreadsheetsList):
print 'Updated!'
Notice that the UpdateRow definition takes a
spreadsheet.SpreadsheetList object which corresponds to an
existing list entry, and a dicitionary where the key corresponds to
the column name and the value corresponds to the value of that column
in the new row.
This is equivalent to sending an authenticated PUT request
to the entry's edit link with the updated XML as PUT data.
To delete a row, simply invoke spreadsheet.DeleteList
on the service.SpreadsheetService with the existing entry to be
deleted:
def ListDeleteAction(gd_client, key, wksht_id, index): feed = gd_client.GetListFeed(key, wksht_id) gd_client.DeleteRow(feed.entry[string.atoi(index)]) print 'Deleted!'
To delete a row, perform a DELETE request on the edit URL after obtaining it as described above.
Note: If you delete a row, that row is completely removed from the worksheet.
In a cell-based feed, each entry represents a single cell.
Note that it isn't recommended to interact with both a cell-based feed and a list-based feed for the same worksheet at the same time.
To retrieve a worksheet's cell-based feed, you might do the following.
Again, assume that service.SpreadsheetService is an
authenticated spreadsheet.service.SpreadsheetService object,
key is a valid unique ID for a spreadsheet that the user has
access to, and wksht_id is a valid unique ID for a worksheet in
the spreadsheet corresponding to the provided key.
def CellsGetAction(gd_client, key, wksht_id): # Get the feed of cells feed = gd_client.GetCellsFeed(key, wksht_id) PrintFeed(feed)
Suppose you wanted to retrieve the cells in the first column of a worksheet. You can request a cell feed containing only this column as follows:
query = gdata.spreadsheet.service.CellQuery() query['min-col'] = '1' query['max-col'] = '1' query['min-row'] = '2' feed = gd_client.GetCellsFeed(key, wksht_id, query=query)
This requests all the data in column 1, starting with row 2.
Setting the 'min-col', 'max-col', 'min-row', and
'max-row' values of a
spreadsheet.service.CellQuery object is equivalent to providing
values for the parameters min-col, max-col, min-row, and
max-row (respectively) in the HTTP GET request.
To modify the contents of an existing
SpreadsheetsCell that you have retrieved from a
feed, first change its
def CellsUpdateAction(gd_client, key, wksht_id, row, col, inputValue):
entry = gd_client.UpdateCell(row=row, col=col, inputValue=inputValue,
key=key, wksht_id=wksht_id)
if isinstance(entry, gdata.spreadsheet.SpreadsheetsCell):
print 'Updated!'
This is equivalent to sending an authenticated PUT request
with the updated cell's XML representation to the URL:
PUT http://spreadsheets.google.com/feeds/cells/key/worksheetId/private/full/cell/version
The new data is placed in the specified cell in the worksheet. If the specified cell contains data already, it will be overwritten. Instead of using the cells specific version string, the Spreadsheets Data API will also accept the string latest to automatically overwrite the last version of the cell content.
Note: Use UpdateCell (ie.,
HTTP PUT) to change the data in a cell, even if it is empty; the use of
POST on empty cells is no longer recommended. To clear a cell,
send an empty string as the inputValue instead of using a DELETE.