DSPL Developer Guide

DSPL stands for Dataset Publishing Language. It is a representation format for both the metadata (information about the dataset, such as its name and provider, as well as the concepts it contains and displays) and actual data of datasets. Datasets described in this format can be imported into the Google Public Data Explorer, a tool that allows for rich, visual exploration of the data.

Note: To upload data to Google Public Data using the Public Data upload tool, you must have a Google Account.

This document is intended for data owners who want their content to be available in the Public Data Explorer. It goes beyond the Tutorial by diving deeper into the details of the DSPL schema and supported features. Only a basic familiarity with XML is assumed, although knowledge of relational databases is also useful.

Although not a requirement, we suggest reading through the Tutorial, which is shorter and easier to digest, before looking at this document.

Overview

A DSPL dataset is a .zip file that contains an XML file and a set of CSV files. The CSV files are simple tables containing the data of the dataset, while the XML file describes the metadata of the dataset. The latter includes informational metadata like descriptions of measures, as well as structural metadata like references between tables. This metadata lets non-expert users explore and visualize your data.

Process

In general, the process of creating a DSPL dataset is as follows (some steps may take place in parallel):

  1. Create your DSPL XML file.
  2. Identify any external data sources to use in your dataset.
  3. Define your concepts, slices, and (optionally) topics. Iteratively update the content of your DSPL file.
  4. Export your source data to .csv files.
  5. Create a DSPL dataset.
  6. Submit the dataset to Google.

XML Structure

Overview

The DSPL XML file defines the metadata of the dataset, including structural relationships between concepts, slices, topics, and tables. Although it is possible to create this file by hand, data processing tools and scripts can greatly streamline the process. See the sample DSPL file for a complete example.

The file includes a number of sections, which are summarized in the table below. Each section is then described in greater detail.

Section | Summary
Header and Imports | The parent for all of the other elements of the dataset. Includes the target namespace (i.e., identifier) for the dataset, along with the namespaces of any imported datasets.
Dataset Information | The name, description, and URL of the dataset.
Provider Information | The name, description, and URL of the dataset provider.
Concepts | Definitions of "things" that appear in the dataset (e.g., countries, unemployment rate, gender). Each concept has a unique identifier, which can be referenced by slices and tables.
Slices | Combinations of concepts for which there is statistical data in the dataset. Each slice contains dimensions and metrics. Slices reference concepts and also tables, which contain the actual data. Each slice has a unique identifier.
Tables | Define the data for concepts and slices. Concept tables hold enumerations and slice tables hold statistical data. Tables are defined in the XML file, and point to .csv files containing the actual data.
Topics | Categories for organizing dataset concepts. While not required, these can be very helpful for users navigating your data.
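
As a rough orientation, the skeleton below sketches how these sections might be laid out in a single file. The element order shown is an assumption based on the note in the Topics section that <topics> should appear right before <concepts>; the target namespace is the placeholder used elsewhere in this guide, and "..." marks elided content.

<?xml version="1.0" encoding="UTF-8"?>
<dspl targetNamespace="http://www.example.com/mystats"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns="http://schemas.google.com/dspl/2010">
  <!-- Imports of external datasets, if any, go here -->
  <info>...</info>
  <provider>...</provider>
  <!-- Assumed order: topics, then concepts, slices, and tables -->
  <topics>...</topics>
  <concepts>...</concepts>
  <slices>...</slices>
  <tables>...</tables>
</dspl>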

Header and Imports

Declaring the Public Data namespace

A DSPL dataset begins with a top-level <dspl> element. This element encloses all dataset information and declares any namespaces that will be used throughout the file. Here's an example:

<?xml version="1.0" encoding="UTF-8"?>
<dspl targetNamespace="http://www.example.com/mystats"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns="http://schemas.google.com/dspl/2010" >
    ...
</dspl>

A namespace is a unique identifier that can be associated with an XML schema (a set of XML elements and attributes). The targetNamespace provides a URI that identifies your dataset. This URI is not required to point to an actual resource, but it's a good idea to have the URI resolve to a document describing your content or dataset.

You are not required to provide a targetNamespace. If you don't, then one will be generated automatically for you at import time.

The targetNamespace attribute is followed by a series of xmlns attributes specifying the other XML namespaces that will be used in the file. Every DSPL file must include the Google Public Data schema, whose URI is "http://schemas.google.com/dspl/2010", and use it as the default namespace. It should also include the standard W3C XML Schema instance namespace, identified by "http://www.w3.org/2001/XMLSchema-instance". As described in the next section, other namespaces can be added to include information from other datasets.

Importing other dataset namespaces

Datasets can reuse definitions and data from other datasets. Google, for instance, provides a number of basic datasets that define concepts commonly appearing in user data. For example, most datasets need a concept to represent years. Instead of defining a new concept, you can use the year concept from the "http://www.google.com/publicdata/dataset/google/time" dataset. See the Canonical Concepts page for more information.

To use an external dataset, add the <import> element to the DSPL file just after the namespace declaration, and indicate the namespace you are importing, like this:

<import namespace="http://www.google.com/publicdata/dataset/google/time"/>

Then, add the imported namespace (in this case, time="http://www.google.com/publicdata/dataset/google/time") to the namespace declaration at the top of your file, like this:

<?xml version="1.0" encoding="UTF-8"?>
<dspl targetNamespace="http://www.stats-bureau.com/mystats"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns="http://schemas.google.com/dspl/2010"
    xmlns:time="http://www.google.com/publicdata/dataset/google/time" >
<import namespace="http://www.google.com/publicdata/dataset/google/time"/>

Your DSPL file can now reference elements from the Google Public Data time dataset. Repeat this process for every dataset you want to reference.
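
For instance, a dataset that uses both the time dataset and a geography dataset might declare two imports, as in the following sketch. The geo namespace URI shown here is an assumption for illustration (this guide uses the geo prefix but does not spell out its URI); check the Canonical Concepts page for the exact value.

<?xml version="1.0" encoding="UTF-8"?>
<dspl targetNamespace="http://www.stats-bureau.com/mystats"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns="http://schemas.google.com/dspl/2010"
    xmlns:time="http://www.google.com/publicdata/dataset/google/time"
    xmlns:geo="http://www.google.com/publicdata/dataset/google/geo">
  <import namespace="http://www.google.com/publicdata/dataset/google/time"/>
  <!-- Assumed URI for the canonical geo dataset -->
  <import namespace="http://www.google.com/publicdata/dataset/google/geo"/>
  ...
</dspl>

Each imported dataset needs both an xmlns:prefix declaration and a matching <import> element.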

Referencing content in external datasets

Once you've imported another dataset, you need to be able to refer to concepts, slices, and data from that dataset. To do this, use references of the form prefix:other_id, where prefix is the prefix declared for the namespace of the external dataset.

Here's an example of a reference to the year concept from the time dataset (described above):

<slices>
  <slice id="country_slice">
    <dimension concept="country"/>
    <dimension concept="time:year"/>
    <metric concept="population"/>
    <table ref="country_slice_table"/>
  </slice>
  ...
</slices>

Dataset Information

The <info> element includes descriptive information about the dataset. An example and details on the relevant XML elements are listed below.

Example

<info>
  <name>
    <value>Unemployment Rates</value>
  </name>
  <description>
    <value>Worldwide unemployment rates by region</value>
  </description>
  <url>
    <value>http://www.example.com/mystats/info.html</value>
  </url>
</info>

Elements

Element | Required? | Description
<info> | Yes | Encloses all descriptive information about the dataset. Includes the child elements <name>, <description>, and <url>.
<name> | Yes | Child of <info>. Includes the child element <value>, which identifies the name of the dataset.
<description> | Optional | Child of <info>. Includes the child element <value>, which includes a text description of the dataset.
<url> | Yes | Child of <info>. Includes the child element <value>, which contains a link to a URL with more information about the dataset.

Provider Information

The <provider> element lists information about the dataset provider. An example and details on the relevant XML elements are listed below.

Example

<provider>
  <name>
    <value>Bureau of Statistics</value>
  </name>
  <url>
    <value>http://www.example.com</value>
  </url>
</provider>

Elements

Element | Required? | Description
<provider> | Yes | Encloses all descriptive information about the dataset provider. Includes the child elements <name> and <url>.
<name> | Optional | Child of <provider>. Includes the child element <value>, which identifies the name of the dataset provider.
<url> | Optional | Child of <provider>. Includes the child element <value>, which contains a link to a URL with more information about the dataset provider.

Concepts

Description

Each dataset contains one or more concepts. A concept is a definition of a type of data that appears in a dataset. A dataset with demographic population data, for example, could have the concepts country, state, population, and year. The data values corresponding to a given concept are called instances of that concept. Concepts are usually described in the dataset, but some concepts (such as time or year) may be described in external datasets.

Each concept can have one or more properties. A property is a characteristic of a concept instance that is stable over time. For example, the country concept could have the properties name, population, and capital.

Concepts can also have one or more attributes. Attributes provide information at the level of the concept, not its individual instances. For example, if we had a dataset with an unemployment rate concept, we could use an attribute to designate that this concept is a percentage. Another example of a common use of attributes is to provide unit information.

Example

Here's an example of a country concept with the unique id country, and the property name. The concept id can be used to reference the concept from slices and tables.

<concept id="country" extends="geo:location">
  <info>
    <name><value>Country</value></name>
    <description>
      <value>My list of countries.</value>
    </description>
  </info>
  <type ref="string"/>
  <property id="name">
    <info>
      <name><value>Name</value></name>
      <description>
        <value>The official name of the country</value>
      </description>
    </info>
    <type ref="string" />
  </property>
  <property concept="geo:continent" isParent="true"/>
  <property id="capital" concept="geo:city" />
  <table ref="countries_table" />
</concept>

Here's how this example works.

  • This code describes the concept country, which has the id country and the properties name, continent, and capital.
  • The concept extends geo:location, the canonical concept for locations. By extending geo:location, country inherits all the properties and attributes defined by the extended concept: properties name, description, url, latitude and longitude. It's okay for country to redefine some of these attributes and properties, as long as the definition is consistent with the one provided by the extended concept.
  • The concept <info> element describes the key information about the concept. This is displayed on the dataset's landing page in the Public Data Explorer.
  • The concept <type> element refers to the type of content. In this case it's string, but this could vary. The concept Population would have the type integer; the concept Eurovision winner could have the type boolean.
  • A <property> element describes each property of the concept, including its unique ID (id), info and type. Properties may also reference concepts, to indicate that their values are valid instances of those concepts.
  • The concept references a data table that points to the CSV file containing the actual data. The data table is referenced like this: <table ref="countries_table"/>.

    If your concept references a table, the associated data file must list all instances of the concept. You cannot, for example, create a table that lists only a few of the countries included in the dataset. (If there is a subset of countries you care about, you can create a separate concept to describe them. For example, mycountries.)
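
To show what the concept's table reference points at, here is a minimal sketch of a matching table definition. The file name countries.csv is hypothetical, and only the country and name columns are declared here; the other properties are omitted for brevity.

<tables>
  <table id="countries_table">
    <!-- Column ids match the concept id and the property id -->
    <column id="country" type="string"/>
    <column id="name" type="string"/>
    <data>
      <!-- Hypothetical file name -->
      <file format="csv" encoding="utf-8">countries.csv</file>
    </data>
  </table>
  ...
</tables>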

Elements

Element | Required? | Description
<concepts> | Yes | Top-level element. Encloses all <concept> elements.
<concept> | Yes | Identifies the concept. The value of the required attribute id must be unique to the concept within the dataset. If the concept references a concept data table, the value of id must match the column heading describing the concept in the data table. An extends attribute may be used to denote that this concept extends another concept. The value of extends must match the id of a concept defined in the same dataset, or be of the form prefix:concept_id, where concept_id is the id of a concept defined in the imported external dataset associated with prefix.
<info> | Optional | Encloses descriptive information about the concept.
<name> | Yes | Child of <info>. The name of the concept. The child element <value> contains the text (for example, Country).
<description> | Optional | Child of <info>. Includes the child element <value>, which includes a text description of the concept.
<url> | Optional | Child of <info>. Includes the child element <value>, which includes a URL for the concept.
<pluralName> | Optional | Child of <info>. The plural name for the concept. The child element <value> contains the text (for example, Countries).
<totalName> | Optional | Child of <info>. The name for the combination of all instances of the concept. The child element <value> contains the text (for a country concept, for example, this might be World).
<type> | Optional | Identifies the type of content described by the concept. The required attribute ref has the following allowed values: string, float, integer, date, boolean. The type may be omitted if the concept extends another concept, in which case it is inherited from the extended concept.
<property> | Optional | A property of the concept, such as capital. The value of the required attribute id must be unique to the concept. An optional concept attribute may be used to indicate that values of this property are instances of a given concept. If concept is specified, then id may be omitted; its value is implicitly defined as the id of the referenced concept (e.g., <property concept="geo:country"/> is equivalent to <property id="country" concept="geo:country"/>). A property may contain a Boolean isParent attribute, to indicate that the relationship between an instance of the concept and the value of this property is hierarchical. A property may contain a Boolean isMapping attribute, to indicate that there is a 1-1 mapping between the instances of the concept and the values of the property. A property may specify a nested info and type, which are defined just as they are for a concept. type is required if the property does not specify a concept attribute, and must match the type of the referenced concept if it does.
<attribute> | Optional | An attribute of the concept. Attributes represent additional information about the concept (e.g., that an unemployment rate is a percentage). The value of the required attribute id must be unique to the concept. An optional concept attribute may be used to indicate that values of this attribute are instances of a given concept. If concept is specified, then id may be omitted; its value is implicitly defined as the id of the referenced concept (e.g., <attribute concept="unit:unit"/> is equivalent to <attribute id="unit" concept="unit:unit"/>). An attribute may specify a nested info and type, which are defined just as they are for a concept. type is required if the attribute does not specify a concept attribute, and must match the type of the referenced concept if it does.
<table> | Optional | Identifies the data table containing data for the concept. The value of the required ref attribute must match the table ID specified in the related <table> element.
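
To make the <attribute> description more concrete, here is a minimal sketch of an unemployment rate concept flagged as a percentage. The ids unemployment_rate and is_percentage, and the use of a nested <value> element to hold the attribute's value, are assumptions for illustration rather than definitions taken from this guide.

<concept id="unemployment_rate">
  <info>
    <name><value>Unemployment rate</value></name>
  </info>
  <type ref="float"/>
  <!-- Hypothetical attribute: marks the concept as a percentage -->
  <attribute id="is_percentage">
    <type ref="boolean"/>
    <value>true</value>
  </attribute>
</concept>

Because the attribute does not reference another concept, it declares its own type, as the table above requires.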

Slices

Description

A slice is a combination of concepts for which data exist. A slice contains two kinds of concept references: dimensions and metrics. A dimension is a concept that is used to segment or filter your data. A metric, on the other hand, describes the observed value or values associated with each data point.

Generally, dimensions are categorical whereas metrics are non-categorical, time-varying, numeric values. Some prototypical examples of each are as follows:

  • Dimensions: Country, state, county, region, year, month, sex, age category, industry segment
  • Metrics: Population, GDP, unemployment rate, literacy, revenue, cost, price

Example

<slices>
  <slice id="country_slice">
    <dimension concept="country"/>
    <dimension concept="time:year"/>
    <metric concept="population"/>
    <table ref="country_slice_table"/>
  </slice>
  ...
</slices>

Here's how this example works.

  • This slice represents population by country.
  • It has the metric population and the dimensions country and year. Each of these references a concept defined elsewhere. The concepts country and population are defined in the same dataset as the slice, and are referenced like this: concept="country"
  • The concept year is defined in the imported time dataset, which is identified by the prefix before the concept name, like this: concept="time:year". (See above for information on importing datasets.)
  • The slice references a data table that points to the CSV file containing the actual data. The data table is referenced like this: <table ref="country_slice_table"/>.

Note: In general, your dataset will be more flexible if you keep metrics to a minimum, and instead create meaningful dimensions. For example, instead of creating the metrics Female Unemployment and Male Unemployment, create the single metric Unemployment, and add the dimension Gender that has the instances Female and Male.
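
As a sketch of the recommendation in this note, the slice below uses a single unemployment metric segmented by a gender dimension. The ids gender, unemployment, unemployment_by_gender_slice, and unemployment_by_gender_table are hypothetical and would need matching concept and table definitions elsewhere in the file.

<slice id="unemployment_by_gender_slice">
  <dimension concept="country"/>
  <!-- Hypothetical concept whose instances are Female and Male -->
  <dimension concept="gender"/>
  <dimension concept="time:year"/>
  <!-- One metric instead of separate female and male metrics -->
  <metric concept="unemployment"/>
  <table ref="unemployment_by_gender_table"/>
</slice>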

Elements

Element | Required? | Description
<slices> | Yes | Top-level element. Encloses all <slice> elements.
<slice> | Optional | Identifies the slice. The value of the required attribute id must be unique to the slice.
<dimension> | Optional | Defines a dimension of the slice, by referencing a concept. The value of the required attribute concept must exactly match the unique id of the concept, and use a valid prefix if the concept belongs to an external imported dataset.
<metric> | Optional | Defines a metric of the slice, by referencing a concept. The value of the required attribute concept must exactly match the unique id of the concept, and use a valid prefix if the concept belongs to an external imported dataset.
<table> | Yes | Identifies the data table containing data for the slice. The value of the required ref attribute must match the table ID specified in the related <table> element.
<mapDimension> | Optional | Child of <table>. Contains the attributes concept and toColumn. The value of concept is a dimension in the slice, and the value of toColumn is the table column that holds that dimension's values.
<mapMetric> | Optional | Child of <table>. Contains the attributes concept and toColumn. The value of concept is a metric in the slice, and the value of toColumn is the table column that holds that metric's values.
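
The following sketch shows how <mapDimension> and <mapMetric> might be used when the column headings in the data file differ from the concept ids. The column names country_code, yr, and pop_total are hypothetical.

<slice id="country_slice">
  <dimension concept="country"/>
  <dimension concept="time:year"/>
  <metric concept="population"/>
  <table ref="country_slice_table">
    <!-- Hypothetical column names that differ from the concept ids -->
    <mapDimension concept="country" toColumn="country_code"/>
    <mapDimension concept="time:year" toColumn="yr"/>
    <mapMetric concept="population" toColumn="pop_total"/>
  </table>
</slice>

With these mappings, the referenced table would declare columns with the ids country_code, yr, and pop_total.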

Tables

Description

The tables section of the DSPL file identifies the data tables included in the dataset. These tables can be referenced by concepts or by slices. Each <table> element specifies the columns of the tables and their types, and points to a CSV file containing the table data.

Example

<tables>
  <table id="country_slice_table">
    <column id="country" type="string"/>
    <column id="year" type="date" format="yyyy"/>
    <column id="population" type="integer"/>
    <data>
      <file format="csv" encoding="utf-8">country_slice.csv</file>
    </data>
  </table>
  ...
</tables>

Here's how this sample works.

  • This sample describes the table country_slice_table. The table has the columns country, year, and population.
  • Each column in the table has a unique id, defined by the id attribute. This id must exactly match the appropriate column heading in the associated data file.
  • The value of the optional type attribute defines the data type for each column.
  • The <data> element describes the actual .csv file (country_slice.csv) containing the data for the table. The file format is always csv.

Elements

Element | Required? | Description
<tables> | Yes | Top-level element. Encloses all <table> elements.
<table> | Yes | Identifies the table. The value of the required attribute id must be unique to the table.
<column> | Optional | Child of <table>. Information about a column included in the table. Includes the following attributes: id (required), the id of the column; type (optional), the data type of the information in the specified column, with allowed values string, float, integer, date, or boolean; format (optional), the pattern of the values in a date column, as in format="yyyy" in the example above.
<data> | Optional | Child of <table>. Encloses a <file> element naming the data file referenced by the table. If the file name is in the form of a URL (e.g., http://...), then the file will be fetched via the appropriate protocol (HTTP, HTTPS, or FTP); otherwise, a file with this name must be bundled with the dataset. The value of the required attribute format is always csv. Although the encoding attribute is optional, your .csv files must be UTF-8 encoded.
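
As an illustration of the URL form described above, the sketch below points the same table at a remotely hosted file. The URL is a hypothetical placeholder on the example.com domain used elsewhere in this guide.

<table id="country_slice_table">
  <column id="country" type="string"/>
  <column id="year" type="date" format="yyyy"/>
  <column id="population" type="integer"/>
  <data>
    <!-- Hypothetical remote location; fetched over HTTP instead of being bundled -->
    <file format="csv" encoding="utf-8">http://www.example.com/mystats/country_slice.csv</file>
  </data>
</table>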

Topics

Description

Topics classify concepts hierarchically, allowing users to navigate through your dataset more easily.

The <topics> element should appear right before the <concepts> element in your DSPL file. (The order of elements is important, and you may not be able to upload your dataset if your elements appear in the wrong order.) To use topics, reference them from the concept definition.

Example

Here's an example topic definition:

<topics>
  <topic id="population_indicators">
    <info>
      <name>
        <value>Population indicators</value>
      </name>
    </info>
  </topic>
  ...
</topics>
  

...and here's an example reference to this topic from a concept:

<concept id="population">
  <info>
    <name>
      <value>Population</value>
    </name>
    <description>
      <value>Size of the resident population.</value>
    </description>
  </info>
  <topic ref="population_indicators"/>
  <type ref="integer"/>
</concept>

Topics can be nested, and a concept can reference more than one topic.
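
A minimal sketch of both ideas follows. It assumes that a topic is nested by placing a <topic> element inside its parent <topic>; the ids demographics and census_data are hypothetical.

<topics>
  <topic id="demographics">
    <info>
      <name><value>Demographics</value></name>
    </info>
    <!-- Assumed nesting syntax: child topic inside its parent -->
    <topic id="population_indicators">
      <info>
        <name><value>Population indicators</value></name>
      </info>
    </topic>
  </topic>
  <topic id="census_data">
    <info>
      <name><value>Census data</value></name>
    </info>
  </topic>
</topics>

<concept id="population">
  <info>
    <name><value>Population</value></name>
  </info>
  <!-- A concept referencing more than one topic -->
  <topic ref="population_indicators"/>
  <topic ref="census_data"/>
  <type ref="integer"/>
</concept>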

Elements

Element | Required? | Description
<topics> | Yes | Top-level element. Encloses all <topic> elements.
<topic> | Yes | Identifies the topic. The value of the required attribute id must be unique to the dataset.
<info> | Optional | Child of <topic>. Encloses information about a topic.
<name> | Optional | Child of <info>. Its child element <value> specifies the name of the topic.

DSPL Data Files

In addition to the XML metadata file, a DSPL dataset can also include one or more data files in CSV format. Each data file supports a table in the dataset, and is referenced from that table's <data>...</data> section. Conceptually, these files and their associated tables are used to represent either concept definitions or slice data. Each of these data file types is described in more detail below.

Note that, regardless of the purpose, all data files must be comma-delimited (CSV) UTF-8 text files. The files must contain only plain text; no HTML. You can create the data files manually, but in practice you will usually need to clean up the data either in the tool that holds the original data (e.g., a spreadsheet) or in the exported file itself.

Files can be bundled with the dataset or, if the name is in the form of a URL, fetched via HTTP, HTTPS, or FTP from a remote source.

Concept Data Files

Concept data files contain relevant information for each concept. The concept definition uses the <table> element to refer to this file.

Example

Here's an example of a data file for the country concept defined above:

country, name
AD, Andorra
AF, Afghanistan
AI, Anguilla
AL, Albania
AO, Angola
AQ, Antarctica
AS, American Samoa

Here's how this example works:

  • Unless mappings are specified, the first line of the data file (column headings) must exactly match the concept id and the appropriate property ids of the concept with which the data are associated. However, the order of the columns doesn't have to be the same in the data file and the concept table. In this case, the first column is associated with the concept country, and the second column is associated with the property name.
  • The property columns are optional; if a property does not have a column in the table, then its value is assumed to be undefined for each row. The table above, for instance, omits columns for the latitude and longitude properties, so the countries will not be mappable.
  • Each value for the concept's id field (in this case, country) must be unique and non-empty (an empty field is one with zero or only whitespace characters).
  • Values for properties that reference other concepts must either be empty or be a valid value of the referenced concept.
  • Enclosing values in double quotes is optional except when they contain commas, double quotes, or newline characters.
  • Escape a literal double quote that appears in a value by preceding it with another double quote.

Slice Data Files

Slice data files contain relevant data for each slice. The slice definition uses the <table ref="..."> element to refer to the <table> definition, which in turn identifies this file.

Example

Here's an example of a .csv file containing the data for the country_slice slice described above:

country, year, population
AF, 1960, 9616353
AF, 1961, 9799379
AF, 1962, 9989846
AF, 1963, 10188299

Here's how the example works:

  • The metric field is population. The fields country and year are dimension fields.
  • Each value of a dimension field must be non-empty. This includes time dimensions. Values for metric fields can be empty. An empty value is represented by no characters.
  • Each column heading that references a concept (for example, the first field of the example above references the concept country) must exactly match the concept's unique id in the concept definition.
  • Each combination of dimension values (e.g., AF, 2000) may occur only once.
  • Rows in the same time series (i.e., rows that have the same combination of all dimension values except time) must be grouped together, though they need not be otherwise sorted.

Advanced Features

Multi-Language Datasets

Translated XML Values

You can use the xml:lang attribute with every <value> element in your DSPL file. This attribute specifies the language of the element's content, using the standard W3C language tags. Note that the use of this feature is optional; if no xml:lang attribute is included, the content is assumed to be in English.

The following example shows snippets of a dataset that's in English, Bulgarian, Catalan, and Simplified Chinese:

<dspl ...>
  <info>
    <name>
      <value xml:lang="en">World Bank, World Development Indicators</value>
      <value xml:lang="bg">Световна банка, Индикатори за световно развитие</value>
      <value xml:lang="ca">Banc Mundial, Indicadors del desenvolupament mundial</value>
      <value xml:lang="zh-CN">世界银行，世界发展指标</value>
    </name>
    ...
  </info>

  <concepts>
    <concept id="country">
      <info>
        <name>
          <value xml:lang="en">Country</value>
          <value xml:lang="bg">Страна</value>
          <value xml:lang="ca">País</value>
          <value xml:lang="zh-CN">国家/地区</value>
        </name>
        ...
      </info>
      ...
    </concept>
    ...
  </concepts>

  ...
</dspl>

Translated Properties

In some cases, you may want to provide translations that go beyond concept-level metadata, applying in addition (or instead) to individual concept instances. This is particularly useful when the values of a concept property (e.g., name) vary by language.

To provide such values in multiple languages, create one column in the corresponding definition table for each property/language combination. Then, link these columns to their associated properties and languages by adding a set of <mapProperty xml:lang="..." ref="..." toColumn="..."> elements to the table reference tag for the concept.

Here's an example that defines a country concept with names in English, Spanish, and French:

<concepts>
  ...
  <concept id="country" extends="geo:location">
    ...
    <property id="name">
      <info>
        <name>
          <value>Name</value>
        </name>
        <description>
          <value>The official name of the country</value>
        </description>
      </info>
      <type ref="string" />
    </property>
    ...
    <table ref="countries_table">
      <mapProperty xml:lang="en" ref="name" toColumn="name_en"/>
      <mapProperty xml:lang="es" ref="name" toColumn="name_es"/>
      <mapProperty xml:lang="fr" ref="name" toColumn="name_fr"/>
    </table>
  </concept>
  ...
</concepts>

...

<tables>
  ...
  <table id="countries_table">
    <column id="country" type="string"/>
    <column id="name_en" type="string"/>
    <column id="name_es" type="string"/>
    <column id="name_fr" type="string"/>
    ...
  </table>
</tables>

The CSV file for the countries_table would then have the following form:

country,name_en,name_es,name_fr,...
...
US,United States of America,Estados Unidos de América,États-Unis d'Amérique,...
...

Mappable Concepts

Many concepts (for instance: county, state, and city) have instances corresponding to geographic locations. DSPL supports geocoding these instances so that they'll be visualizable in the Google Public Data animated map chart.

If your concept is equivalent to World countries, US states, or US counties, then you can just link to the corresponding Google canonical concept; no explicit geocoding is required. See the Canonical Concepts Guide for more details.

If not, then you need to make your concept mappable. The first step is to make it extend from geo:location:

<concept id="..." extends="geo:location">
  ...
</concept>

Then, you must explicitly add latitude and longitude as properties:

<concept id="..." extends="geo:location">
  ...
  <property id="latitude"/>
  <property id="longitude"/>
</concept>
  

The values for these are then specified as columns in the corresponding concept definition data table.
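
For example, the table backing a mappable city concept might be declared as in the following sketch. The ids cities_table and city, the file name cities.csv, and the choice of float for the coordinate columns are assumptions for illustration.

<tables>
  <table id="cities_table">
    <column id="city" type="string"/>
    <column id="name" type="string"/>
    <!-- Coordinate columns matching the latitude and longitude property ids -->
    <column id="latitude" type="float"/>
    <column id="longitude" type="float"/>
    <data>
      <!-- Hypothetical file name -->
      <file format="csv" encoding="utf-8">cities.csv</file>
    </data>
  </table>
</tables>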

Concept Relationships

Concepts are often related to other concepts in a structured way. For instance, a continent instance may include multiple country instances, which, in turn, may contain multiple state or province instances. Encoding these relationships in the dataset metadata allows for richer visualization features than would be otherwise possible, e.g., showing a collapsible tree of locations to choose from.

In the sections below, we describe the concept relationships supported in the DSPL schema.

Hierarchies

Concept hierarchies are represented in DSPL through the use of an isParent="true" attribute in a <property> tag of the child concept, which contains identifiers of instances from the parent concept.

As an example, Google's US County concept has the following form:

<concept id="us_county" extends="geo:location">
  <info>
    <name>
      <value xml:lang="en">County</value>
    </name>
    ...
  </info>
  ...
  <property id="state" concept="us_state" isParent="true"/>
  ...
  <table ref="reference_us_counties"/>
</concept>
  

The supporting data table has a state column with the two-letter state code for each county. This type of metadata allows the Public Data Explorer to show states and counties as a hierarchy, a feature that makes exploration much easier for users.

Note that a concept can have many children but no more than one parent.
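
As a sketch of the supporting table, the definition below includes a state column holding the parent state code for each county. The column list and the file name us_counties.csv are hypothetical; they are not taken from Google's actual dataset.

<table id="reference_us_counties">
  <column id="us_county" type="string"/>
  <column id="name" type="string"/>
  <!-- Matches the id of the isParent property; holds the two-letter state code -->
  <column id="state" type="string"/>
  <data>
    <!-- Hypothetical file name -->
    <file format="csv" encoding="utf-8">us_counties.csv</file>
  </data>
</table>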

Mappings

Concept mappings (i.e., concepts that represent, fundamentally, the same thing) are represented through an isMapping="true" attribute in a <property> tag of the mapped concept.

Specifying that one concept maps to another allows the former to inherit all of the properties and attributes of the latter. Among other applications, this is useful for "linking" your own geographic concepts with those defined in Google's canonical geo dataset:

<concept id="my_country" extends="geo:location">
  <info>
    <name>
      <value xml:lang="en">Country</value>
    </name>
    ...
  </info>
  ...
  <property id="google_country_code" concept="geo:country" isMapping="true"/>
  <table ref="countries_concept"/>
</concept>
  

Extensions

Concept extensions are designated through an extends attribute in the corresponding concept definition. Extensions are useful for indicating that a particular concept is a subclass of another, broader concept. The extending concept inherits all of the attributes and properties of the concept it extends, and can also add new ones.

As an example, Google's currency concept extends unit:

<concept id="unit">
  ...
</concept>

<concept id="currency" extends="unit">
  <info>
    <name>
      <value xml:lang="en">Currency unit</value>
    </name>
    ...
  </info>
  ...
  <table ref="currency_table"/>
</concept>
  

See the discussion of concept extensions in the tutorial for more explanation and examples.

Submitting Your Dataset

To submit your dataset to the Google Public Data Explorer, follow these instructions:

  1. Create a directory.
  2. Save the dataset's DSPL XML file in the directory you created. Make sure to use the .xml extension.
  3. Save any local .csv files in the same directory. Data files that are referenced via URLs can be omitted.
  4. Zip the directory.
  5. Upload your dataset to the Google Public Data Explorer.

Once your dataset is uploaded and validated, you can test it while signed in to your Google Account. It will not be published until you've checked it and told us it's ready.