Data Source Python Library

Google has open-sourced a Python library that creates DataTable objects for consumption by visualizations. This library can be used to create a DataTable in Python, and output it in any of three formats:

  • JSON string -- If you are hosting the page that hosts the visualization that uses your data, you can generate a JSON string to pass into a DataTable constructor to populate it.
  • JSON response -- If you do not host the page that hosts the visualization, and just want to act as a data source for external visualizations, you can create a complete JSON response string that can be returned in response to a data request.
  • JavaScript string -- You can output the data table as a string that consists of several lines of JavaScript code that will create and populate a google.visualization.DataTable object with the data from your Python table. You can then run this JavaScript in an engine to generate and populate the google.visualization.DataTable object. This is typically used for debugging only.

This document assumes that you understand basic Python programming, and have read the introductory visualization documentation for creating a visualization and using a visualization.

How to Use the Library

Using the library involves four basic steps, each described in detail below:

1. Create a gviz_api.DataTable object

Import the gviz_api.py library and instantiate the gviz_api.DataTable class. The class takes two parameters: a table schema, which describes the format of the data in the table, and optional data to populate the table with. You can add data later or completely overwrite the existing data, but you cannot remove individual rows or clear the table schema.

2. Describe your table schema

The table schema is specified by the table_description parameter passed into the constructor. You cannot change it later. The schema describes all the columns in the table: the data type of each column, the ID, and an optional label.

Each column is described by a tuple: (ID [,data_type [,label [,custom_properties]]]).

  • ID - A string ID used to identify the column. Can include spaces. The ID for each column must be unique.
  • data_type - [optional] A string descriptor of the Python data type of the data in that column. You can find a list of supported data types in the SingleValueToJS() method. Examples include "string" and "boolean". If not specified, the default is "string."
  • label - [optional] A user-friendly name for the column, which might be displayed as part of the visualization. If not specified, the ID value is used.
  • custom_properties - [optional] A {String:String} dictionary of custom column properties.
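
As a rough illustration of how the optional fields default, the following sketch expands a descriptor tuple into its four parts. The normalize_column helper is hypothetical and is not part of gviz_api; it only mirrors the defaults described above, and its handling of a bare string as an ID is an assumption made for the example.

```python
def normalize_column(descriptor):
    """Expand (ID [, data_type [, label [, custom_properties]]]) into a dict.

    Hypothetical helper for illustration only; not part of gviz_api.
    """
    if isinstance(descriptor, str):
        # Assumption: a bare string is an ID with every default applied.
        descriptor = (descriptor,)
    if not 1 <= len(descriptor) <= 4:
        raise ValueError("expected 1 to 4 fields, got %d" % len(descriptor))
    column_id = descriptor[0]
    return {
        "id": column_id,
        # The data type defaults to "string" when omitted.
        "type": descriptor[1] if len(descriptor) > 1 else "string",
        # The label falls back to the ID when omitted.
        "label": descriptor[2] if len(descriptor) > 2 else column_id,
        "custom_properties": descriptor[3] if len(descriptor) > 3 else {},
    }


print(normalize_column(("salary", "number")))
print(normalize_column(("full name",)))
```

The second call shows that an ID may contain spaces and that both the type and the label are filled in from the defaults.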

The table schema is a collection of column descriptor tuples. Every list member, dictionary key or dictionary value must be either another collection or a descriptor tuple. You can use any combination of dictionaries or lists, but every key, value, or member must eventually evaluate to a descriptor tuple. Here are some examples.

  • List of columns: [('a', 'number'), ('b', 'string')]
  • Dictionary of lists: {('a', 'number'): [('b', 'number'), ('c', 'string')]}
  • Dictionary of dictionaries: {('a', 'number'): {'b': 'number', 'c': 'string'}}
  • And so on, with any level of nesting.

3. Populate your data

To add data to the table, build a structure of data elements in the exact same structure as the table schema. So, for example, if your schema is a list, the data must be a list:

  • schema: [("color", "string"), ("shape", "string")]
  • data: [["blue", "square"], ["red", "circle"]]

If the schema is a dictionary, the data must be a dictionary:

  • schema: {("rowname", "string"): [("color", "string"), ("shape", "string")] }
  • data: {"row1": ["blue", "square"], "row2": ["red", "circle"]}

One table row is a section of corresponding data and schema. For example, here's how a schema of a list of two columns is applied to two rows of data.

Schema:[(color),(shape)]
            /     \       
Data: [["blue", "square"], ["red", "circle"]]

Table: 
      Color    Shape
      blue     square
      red      circle

Note that in the dictionary example above, the dictionary keys ("row1" and "row2") supply the data for the "rowname" column. You can find more complex examples in the AppendData() method documentation in the code. Such nesting is allowed so that you can use whatever Python data structure suits your needs.
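
To make the schema-to-row correspondence concrete, here is a minimal sketch in plain Python that derives table rows from the two schema shapes shown above. The functions rows_from_list_schema and rows_from_dict_schema are hypothetical names invented for this illustration; they are not gviz_api functions and do not cover deeper nesting. Sorting the dictionary keys is a choice made here only to get a predictable row order.

```python
def rows_from_list_schema(schema, data):
    """List schema: each inner list of data is one row."""
    rows = []
    for row in data:
        if len(row) != len(schema):
            raise ValueError("row %r does not match the schema width" % (row,))
        rows.append(tuple(row))
    return rows


def rows_from_dict_schema(schema, data):
    """Dict schema {key_column: [value_columns]}: each key starts a row."""
    # This sketch handles exactly one key column mapped to a list of columns.
    ((key_column, value_columns),) = schema.items()
    rows = []
    for key in sorted(data):
        values = data[key]
        if len(values) != len(value_columns):
            raise ValueError("values for %r do not match the schema" % (key,))
        # The dictionary key becomes the first cell of the row.
        rows.append((key,) + tuple(values))
    return rows


list_schema = [("color", "string"), ("shape", "string")]
list_data = [["blue", "square"], ["red", "circle"]]
print(rows_from_list_schema(list_schema, list_data))

dict_schema = {("rowname", "string"): [("color", "string"), ("shape", "string")]}
dict_data = {"row1": ["blue", "square"], "row2": ["red", "circle"]}
print(rows_from_dict_schema(dict_schema, dict_data))
```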

4. Output your data

The most common output format is JSON, so you will probably use the ToJSonResponse() method to create the data to return. If, however, you are parsing the input request and supporting different output formats, you can call any of the other output methods to return other formats, including comma-separated values, tab-separated values, and JavaScript. JavaScript is typically used only for debugging. See Implementing a Data Source to learn how to process a request to obtain the preferred response format.
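
If you do parse the request yourself, the preferred format normally arrives in the request's tqx parameter as a semicolon-separated list of key:value pairs (for example, reqId:7;out:csv). The sketch below assumes that format and uses two hypothetical helpers, parse_tqx and requested_format, which are not part of gviz_api. Treat it as a starting point and check it against the data source protocol documentation.

```python
def parse_tqx(tqx):
    """Split a tqx string such as 'reqId:7;out:csv' into a dict of strings."""
    params = {}
    for pair in tqx.split(";"):
        if not pair:
            continue  # Skip empty segments, e.g. from a trailing semicolon.
        key, _, value = pair.partition(":")
        params[key.strip()] = value.strip()
    return params


def requested_format(tqx, default="json"):
    """Return the requested output format, falling back to JSON."""
    return parse_tqx(tqx).get("out", default)


print(parse_tqx("reqId:7;out:csv"))
print(requested_format("reqId:7"))
```

The returned format name can then be used to decide which of the DataTable output methods to call.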

Example Usage

Here are some examples demonstrating how to use the various output formats.

ToJSon and ToJSCode Example

#!/usr/bin/python

import gviz_api

page_template = """
<html>
  <script src="https://www.gstatic.com/charts/loader.js"></script>
  <script>
    google.charts.load('current', {packages:['table']});

    google.charts.setOnLoadCallback(drawTable);
    function drawTable() {
      %(jscode)s
      var jscode_table = new google.visualization.Table(document.getElementById('table_div_jscode'));
      jscode_table.draw(jscode_data, {showRowNumber: true});

      var json_table = new google.visualization.Table(document.getElementById('table_div_json'));
      var json_data = new google.visualization.DataTable(%(json)s, 0.6);
      json_table.draw(json_data, {showRowNumber: true});
    }
  </script>
  <body>
    <H1>Table created using ToJSCode</H1>
    <div id="table_div_jscode"></div>
    <H1>Table created using ToJSon</H1>
    <div id="table_div_json"></div>
  </body>
</html>
"""

def main():
  # Creating the data
  description = {"name": ("string", "Name"),
                 "salary": ("number", "Salary"),
                 "full_time": ("boolean", "Full Time Employee")}
  data = [{"name": "Mike", "salary": (10000, "$10,000"), "full_time": True},
          {"name": "Jim", "salary": (800, "$800"), "full_time": False},
          {"name": "Alice", "salary": (12500, "$12,500"), "full_time": True},
          {"name": "Bob", "salary": (7000, "$7,000"), "full_time": True}]

  # Loading it into gviz_api.DataTable
  data_table = gviz_api.DataTable(description)
  data_table.LoadData(data)

  # Create a JavaScript code string.
  jscode = data_table.ToJSCode("jscode_data",
                               columns_order=("name", "salary", "full_time"),
                               order_by="salary")
  # Create a JSON string.
  json = data_table.ToJSon(columns_order=("name", "salary", "full_time"),
                           order_by="salary")

  # Put the JS code and JSON string into the template.
  print("Content-type: text/html")
  print("")
  print(page_template % vars())


if __name__ == '__main__':
  main()
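
In the example above, each salary is given as a (value, formatted value) tuple. The following standard-library sketch shows roughly how such cells map onto the cols/rows literal that the google.visualization.DataTable constructor accepts, with "v" holding the raw value and "f" the formatted string. The helpers to_cell and to_table_literal are hypothetical and only approximate the JSON that ToJSon() generates; they are not the library's implementation.

```python
import json

# (id, type, label) triples, written out explicitly for this sketch.
COLUMNS = [("name", "string", "Name"), ("salary", "number", "Salary")]


def to_cell(value):
    """Turn a plain value or a (value, formatted) tuple into a cell dict."""
    if isinstance(value, tuple):
        raw, formatted = value
        return {"v": raw, "f": formatted}
    return {"v": value}


def to_table_literal(columns, rows):
    """Build a dict shaped like a DataTable literal: cols plus rows of cells."""
    return {
        "cols": [{"id": cid, "label": label, "type": ctype}
                 for cid, ctype, label in columns],
        "rows": [{"c": [to_cell(row[cid]) for cid, _, _ in columns]}
                 for row in rows],
    }


rows = [{"name": "Jim", "salary": (800, "$800")}]
literal = to_table_literal(COLUMNS, rows)
print(json.dumps(literal, indent=2))
```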

ToJSonResponse Example

ToJSonResponse() builds the complete response string that a remote client receives when it sends a data request.

#!/usr/bin/python

import gviz_api

description = {"name": ("string", "Name"),
               "salary": ("number", "Salary"),
               "full_time": ("boolean", "Full Time Employee")}
data = [{"name": "Mike", "salary": (10000, "$10,000"), "full_time": True},
        {"name": "Jim", "salary": (800, "$800"), "full_time": False},
        {"name": "Alice", "salary": (12500, "$12,500"), "full_time": True},
        {"name": "Bob", "salary": (7000, "$7,000"), "full_time": True}]

data_table = gviz_api.DataTable(description)
data_table.LoadData(data)
print("Content-type: text/plain")
print("")
print(data_table.ToJSonResponse(columns_order=("name", "salary", "full_time"),
                                order_by="salary"))
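
For orientation, the response string is essentially the table JSON wrapped in a JavaScript callback together with a few protocol fields. The sketch below is an assumption about that shape (a version, the request ID, a status, and the table, passed to google.visualization.Query.setResponse); wrap_response is a hypothetical helper, not a gviz_api function, so compare it with the actual ToJSonResponse() output before relying on any field.

```python
import json


def wrap_response(table, req_id=0,
                  handler="google.visualization.Query.setResponse"):
    """Wrap a table literal in a callback call (assumed response shape)."""
    payload = {
        "version": "0.6",
        "reqId": str(req_id),  # Echoes the ID sent by the client.
        "status": "ok",
        "table": table,
    }
    return "%s(%s);" % (handler, json.dumps(payload))


table = {"cols": [{"id": "name", "label": "Name", "type": "string"}],
         "rows": [{"c": [{"v": "Mike"}]}]}
response = wrap_response(table, req_id=3)
print(response)
```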