This document provides detailed reference on the data feed for the Google Analytics Data Export API. For client-library specific information, see the Developer Guide.
Reports in the Analytics user interface are generally organized into these categories:
Each report, regardless of the section to which it belongs, consists of two primary fields: metrics and dimensions. Analytics reports use a combination of metrics and dimensions to describe key types of visitor activity to your website, such as which search engine visitors used to reach your site in the Search Engines report, or which pages on your site received the most traffic in the Top Content report. Similarly, the Data Export API groups both dimensions and metrics into several categories of report data. By choosing your own combinations of dimensions and metrics, you can create a customized report tailored to your specifications.
Keep in mind that not all categories of data can be combined in a single request. When you request a combination of dimensions and metrics that is not allowed, you will receive an error response instead of an actual feed. This causes no harm, so feel free to experiment with the combinations of metrics and dimensions that seem most useful. For a detailed list of the metrics and dimensions you can query, see the Dimensions & Metrics Reference.
To understand how Analytics data is applied to the profile you are requesting data for, see the background document on Accounts and Profiles.
This section describes all the elements and parameters that make up a data feed request. In general, you provide the table ID corresponding to the profile you want to retrieve data from, choose the combination of dimensions and metrics, and provide a date range along with other parameters in a query string.
https://www.google.com/analytics/feeds/data
ids=ga:12345
<ga:tableId> element for each entry
in the account feed. This value is composed of the ga: namespace and the
profile ID of the web property.
dimensions=ga:source,ga:medium
The dimensions parameter defines the primary data keys for
your Analytics report, such as ga:browser or ga:city.
Use dimensions to segment your web property metrics. For example, while you
can ask for the total number of pageviews to your site, it might be more
interesting to ask for the number of pageviews segmented by browser. In this
case, you'll see the number of pageviews from Firefox, Internet Explorer,
Chrome, and so forth. When the value of the dimension cannot be determined, Analytics uses
the special string (not set). There are a number of situations
where the dimension value will not be set. For example, suppose you want
to query your reports for country, city,
and pageviews,
and suppose the following is true for your profile data:
The results for this request would return data as illustrated by the following example table.
| Country | City | Pageviews |
|---|---|---|
| (not set) | (not set) | 23 |
| Country A | (not set) | 13 |
| Country A | City A1 | 10 |
| Country A | City A2 | 5 |
| Country B | (not set) | 10 |
| Country B | City B1 | 5 |
| Country B | City B2 | 13 |
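As a small sketch of how an application might treat these rows, the snippet below copies the example table into Python and shows two common choices: keeping the (not set) rows when totalling by country, or dropping rows whose city is unknown. The function names are hypothetical and are not part of the API.

```python
from collections import defaultdict

NOT_SET = "(not set)"

# Rows copied from the example table above: (country, city, pageviews).
rows = [
    (NOT_SET, NOT_SET, 23),
    ("Country A", NOT_SET, 13),
    ("Country A", "City A1", 10),
    ("Country A", "City A2", 5),
    ("Country B", NOT_SET, 10),
    ("Country B", "City B1", 5),
    ("Country B", "City B2", 13),
]


def pageviews_by_country(rows):
    """Sum pageviews per country, including rows whose city is (not set)."""
    totals = defaultdict(int)
    for country, _city, pageviews in rows:
        totals[country] += pageviews
    return dict(totals)


def rows_with_known_city(rows):
    """Keep only the rows where the city dimension could be determined."""
    return [row for row in rows if row[1] != NOT_SET]


totals = pageviews_by_country(rows)
known = rows_with_known_city(rows)
print(totals)
print(len(known), "rows have a known city")
```

Dropping the (not set) rows silently discards pageviews, so totals computed from the filtered rows will be lower than the per-country totals.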
When using dimensions in a feed request, be aware of the following constraints:
For more information and the list of all dimensions, see the Dimensions section in the Dimensions and Metrics Reference.
metrics=ga:visits,ga:bounces
ga:pageviews requested with ga:country returns
the total pageviews per country rather than the total pageviews for the entire
profile. When requesting metrics, keep in mind:
ga:visitors metric, which
can only be used in combination with a subset of metrics. See the Query
Validation Chart for details.
sort=-ga:visits
Indicates the sorting order and direction for the returned data. For example,
the following parameter would first sort by ga:browser and then
by ga:pageviews in ascending order.
sort=ga:browser,ga:pageviews
If you do not indicate a sorting order in your query, the data is sorted by dimension from left to right in the order listed. For example, if the query looks like this:
dimensions=ga:browser,ga:country
Sorting occurs first by ga:browser, then by ga:country.
However, if the query uses a different order:
dimensions=ga:country,ga:browser
Sorting occurs first by ga:country, then by ga:browser.
When using the sort parameter, keep in mind the following:
dimensions or metrics parameter.
If your request sorts on a field that is not indicated in either the dimensions
or metrics parameter, you will receive a request error. The sort direction can be changed from ascending to descending by using a
minus sign (-) prefix on the requested field. For example:
| Example | Description |
|---|---|
| sort=ga:browser,ga:pageviews | Sorts ascending by ga:browser and then ascending by ga:pageviews. |
| sort=-ga:pageviews,ga:browser | Sorts first in descending order by ga:pageviews and then in ascending order by ga:browser. |
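The sort rules above can be sketched as a small helper that builds the value of the sort parameter and rejects fields that were not requested. The function name build_sort and its arguments are hypothetical; the check only mirrors the request error described above and does not replace server-side validation.

```python
def build_sort(fields, dimensions, metrics):
    """Build a sort parameter value from (field, descending) pairs.

    Raises ValueError when a field is not among the requested dimensions
    or metrics, since such a request would be rejected by the feed.
    """
    requested = set(dimensions) | set(metrics)
    parts = []
    for name, descending in fields:
        if name not in requested:
            raise ValueError(f"cannot sort on {name}: not in dimensions or metrics")
        # A minus sign prefix switches the field to descending order.
        parts.append(("-" if descending else "") + name)
    return ",".join(parts)


sort_value = build_sort(
    [("ga:pageviews", True), ("ga:browser", False)],
    dimensions=["ga:browser"],
    metrics=["ga:pageviews"],
)
print("sort=" + sort_value)
```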
filters=ga:medium%3D%3Dreferral
The filters query string parameter restricts the data returned from your request to the Analytics servers. When you use the filters parameter, you supply a dimension or metric you want to filter, followed by the filter expression. For example, the following feed query requests ga:pageviews and ga:browser from profile 12134, where the ga:browser dimension starts with the string Firefox:
https://www.google.com/analytics/feeds/data?ids=ga:12134&dimensions=ga:browser&metrics=ga:pageviews&filters=ga:browser%3D~%5EFirefox&start-date=2007-01-01&end-date=2007-12-31
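A query string like the one above can be assembled with Python's standard library rather than by hand. This is a minimal sketch: build_feed_url is a hypothetical helper, and the choice to leave the colon, comma, and tilde unencoded is only there to match the style of the URLs shown in this document.

```python
from urllib.parse import urlencode

FEED_URL = "https://www.google.com/analytics/feeds/data"


def build_feed_url(params):
    """Return a data feed URL for the given query parameters.

    ':' , ',' and '~' are kept literal; other reserved characters such
    as '=' and '^' inside parameter values are percent-encoded.
    """
    return FEED_URL + "?" + urlencode(params, safe=":,~")


url = build_feed_url({
    "ids": "ga:12134",
    "dimensions": "ga:browser",
    "metrics": "ga:pageviews",
    "filters": "ga:browser=~^Firefox",  # browser starts with Firefox
    "start-date": "2007-01-01",
    "end-date": "2007-12-31",
})
print(url)
```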
Filtered queries restrict the rows that do (or do not) get included in the result. Each row in the result is tested against the filter: if the filter matches, the row is retained and if it doesn't match, the row is dropped.
A single filter uses the form:
ga:name operator expression
In this syntax:
ga:pageviews will filter on the pageviews metric.

There are six filter operators for dimensions and six operators for metrics. Some of the operators must be URL encoded in order to be included in URL query strings. Where necessary, the URL-encoded string for each operator is indicated in the tables.
Tip: Use the Data Feed Query Explorer to design filters that need URL encoding, since the explorer will automatically URL encode necessary strings and spaces for you.
| Operator | Description | URL Encoded Form | Examples |
|---|---|---|---|
| == | Equals | %3D%3D | Return results where the time on the page is exactly ten seconds: filters=ga:timeOnPage%3D%3D10 |
| != | Does not equal | !%3D | Return results where the time on the page is not ten seconds: filters=ga:timeOnPage!%3D10 |
| > | Greater than | %3E | Return results where the time on the page is strictly greater than ten seconds: filters=ga:timeOnPage%3E10 |
| < | Less than | %3C | Return results where the time on the page is strictly less than ten seconds: filters=ga:timeOnPage%3C10 |
| >= | Greater than or equal to | %3E%3D | Return results where the time on the page is ten seconds or more: filters=ga:timeOnPage%3E%3D10 |
| <= | Less than or equal to | %3C%3D | Return results where the time on the page is ten seconds or less: filters=ga:timeOnPage%3C%3D10 |
| Operator | Description | URL Encoded Form | Example |
|---|---|---|---|
| == | Exact match | %3D%3D | Aggregate metrics where the city is Irvine: filters=ga:city%3D%3DIrvine |
| != | Does not match | !%3D | Aggregate metrics where the city is not Irvine: filters=ga:city!%3DIrvine |
| =~ | Contains a match for the regular expression | %3D~ | Aggregate metrics where the city starts with New: filters=ga:city%3D~%5ENew.* (%5E is the URL-encoded form of the ^ character that anchors a pattern to the beginning of the string.) |
| !~ | Does not match regular expression | none | Aggregate metrics where the city does not start with New: filters=ga:city!~%5ENew.* |
| =@ | Contains substring | %3D@ | Aggregate metrics where the city contains York: filters=ga:city%3D@York |
| !@ | Does not contain substring | none | Aggregate metrics where the city does not contain York: filters=ga:city!@York |
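The encoded forms in the operator tables can be produced programmatically. The sketch below assumes a hypothetical helper, encode_filter, that percent-encodes a single filter while leaving the characters the tables show unencoded (the colon, !, ~, @, and *) as they are.

```python
from urllib.parse import quote

# Operators listed in the tables above.
OPERATORS = {"==", "!=", ">", "<", ">=", "<=", "=~", "!~", "=@", "!@"}


def encode_filter(name, operator, expression):
    """Return one URL-encoded filter, e.g. ga:city%3D~%5ENew.*"""
    if operator not in OPERATORS:
        raise ValueError(f"unknown filter operator: {operator}")
    raw = f"{name}{operator}{expression}"
    # quote() encodes '=', '<', '>', '^' and spaces (as %20).
    return quote(raw, safe=":!~@*")


print("filters=" + encode_filter("ga:timeOnPage", ">=", "10"))
print("filters=" + encode_filter("ga:city", "=~", "^New.*"))
print("filters=" + encode_filter("ga:city", "!@", "York"))
```

The helper encodes one filter at a time and does not handle commas or semicolons inside the expression.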
As stated earlier, filter expressions use a regular expression syntax similar to Perl regular expression syntax. For more information on common regular expression matches supported by Google Analytics, see the Help Center article on the topic. When forming regular expressions for filtering dimensions or metrics, review the following rules.
\;\,\\& must be percent-encoded in the usual way.

AND and OR

Filters can be combined with AND boolean logic as well as with OR boolean logic.
| Operator | Rule | Example |
|---|---|---|
| OR | Uses the comma (,) as an operator. | Country is either United States or Canada: filters=ga:country%3D%3DUnited%20States,ga:country%3D%3DCanada |
| OR | Takes precedence over AND (e.g. x and y OR a and b). | Compare Firefox users on Windows versus Macintosh operating systems: ga:operatingSystem==Windows,ga:operatingSystem==Macintosh;ga:browser==Firefox |
| OR | May NOT be used with both dimensions and metrics. | Valid combinations: ga:visitorType==New%20Visitor,ga:language!~%5Een.* and ga:pageviews>5,ga:visits>2. Invalid combination: ga:visitorType==New%20Visitor,ga:visits>5 |
| AND | Uses the semi-colon (;) as an operator. | Country is United States and language is not English: ga:country==United%20States;ga:language!~%5Een.* |
| AND | Is preceded by the OR operator (e.g. x or y AND a or b). | Windows and Macintosh users from either Firefox or Chrome: ga:browser==Firefox,ga:browser==Chrome;ga:operatingSystem==Windows,ga:operatingSystem==Macintosh |
| AND | Can be used in combination with either metrics or dimensions. | Valid combinations: ga:visitorType==New%20Visitor;ga:language!~%5Een.* and ga:visitorType==New%20Visitor;ga:visits>5 |
This AND filter selects data from the United States from the browser Firefox.
filters=ga:country%3D%3DUnited%20States;ga:browser%3D@Firefox
This OR filter selects data from either the United States or Canada.
filters=ga:country%3D%3DUnited%20States,ga:country%3D%3DCanada
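Because OR is evaluated before AND, a combined filter can be modelled as a list of OR groups that are ANDed together. The following is a minimal sketch under that reading; combine_filters is a hypothetical helper, and it does not check whether an OR group mixes dimensions and metrics.

```python
from urllib.parse import quote


def combine_filters(and_groups):
    """Join filters into one encoded expression.

    Filters inside a group are joined with ',' (OR); the groups are then
    joined with ';' (AND). Commas or semicolons inside a filter value
    are not handled by this sketch.
    """
    joined = ";".join(",".join(group) for group in and_groups)
    return quote(joined, safe=":!~@*,;")


# (Firefox OR Chrome) AND (Windows OR Macintosh)
expr = combine_filters([
    ["ga:browser==Firefox", "ga:browser==Chrome"],
    ["ga:operatingSystem==Windows", "ga:operatingSystem==Macintosh"],
])
print("filters=" + expr)

# United States AND browser contains Firefox
print("filters=" + combine_filters([["ga:country==United States"], ["ga:browser=@Firefox"]]))
```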
These next two URLs each request pageviews and country from profile 12134. The first URL limits results to countries starting with L and ending with S. The second URL limits results to browsers starting with Fire and cities starting with L. (In these examples, %5E is the URL encoding for ^.)
https://www.google.com/analytics/feeds/data?ids=ga:12134&dimensions=ga:country&metrics=ga:pageviews&filters=ga:country%3D~%5EL.*S$&start-date=2007-01-01&end-date=2007-12-31
https://www.google.com/analytics/feeds/data?ids=ga:12134&dimensions=ga:country&metrics=ga:pageviews&filters=ga:city%3D~%5EL;ga:browser%3D~%5EFire&start-date=2007-01-01&end-date=2007-12-31
start-date=2009-04-20
YYYY-MM-DD.
end-date=2009-05-20
YYYY-MM-DD.
start-index=10
1. (Feed indexes are 1-based. That
is, the first entry is entry 1, not entry 0.) Use this parameter as a pagination
mechanism along with the max-results parameter for situations when totalResults exceeds
10,000 and you want to retrieve entries indexed at 10,001 and beyond.
max-results=100
start-index to retrieve
a subset of elements, or use it alone to restrict the number of returned
elements, starting with the first. If you do not use the max-results parameter
in your query, your feed returns the default maximum of 1000 entries.
ga:country,
so when segmenting only by country, you can't get more than 300 entries,
even if you set max-results to
a higher value.

The data feed returns data that is entirely dependent on the fields you specify in your request using the dimensions and metrics parameters.
For a list of the available dimensions and metrics that you can query in the
data feed, see the Dimensions & Metrics Reference. This section describes the general structure of the data feed as returned in XML, with a description for the key elements of interest for the data feed.
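As a rough illustration of reading a feed of this kind, the snippet below parses a small hand-written XML sample with Python's standard library. The sample document, the namespace URIs, and the attribute names on dxp:dimension and dxp:metric are assumptions made for this sketch and should be checked against a real response.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URIs; verify them against an actual feed.
NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "dxp": "http://schemas.google.com/analytics/2009",
}

# Hand-written sample, not a real API response.
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:dxp="http://schemas.google.com/analytics/2009">
  <entry>
    <title>ga:browser=Firefox</title>
    <dxp:dimension name="ga:browser" value="Firefox"/>
    <dxp:metric name="ga:pageviews" type="integer" value="24"/>
  </entry>
</feed>"""


def parse_entries(xml_text):
    """Return one dict per entry, mapping field names to values."""
    root = ET.fromstring(xml_text)
    rows = []
    for entry in root.findall("atom:entry", NS):
        row = {}
        for dim in entry.findall("dxp:dimension", NS):
            row[dim.get("name")] = dim.get("value")
        for met in entry.findall("dxp:metric", NS):
            value = met.get("value")
            # Convert integer metrics; leave other types as strings.
            row[met.get("name")] = int(value) if met.get("type") == "integer" else value
        rows.append(row)
    return rows


entries = parse_entries(SAMPLE)
print(entries)
```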
- title: the string Google Analytics Data for Profile, followed by the ID of the selected profile
- id: the feed URL
- totalResults: the total number of results for the query, regardless of the number of results in the response
- startIndex: the starting index of the entries, which is 1 by default or otherwise specified by the start-index query parameter
- itemsPerPage: the number of items in the current request, which is a maximum of 10,000
- dxp:startDate: the first date for the query as indicated in the start-date query parameter
- dxp:endDate: the ending date for the query as indicated in the end-date query parameter, inclusive of the date provided
- dxp:aggregates: contains aggregate data for all metrics requested in the feed
  - dxp:metric: lists the name of the metric, its type (integer or string), the confidence interval for the value and the total value for the request
- dxp:dataSource: summary information about the Analytics source of the data
  - dxp:tableId: The unique, namespaced profile ID of the source, such as ga:1174
  - dxp:tableName: The name of the profile as it appears in the Analytics administrative UI
  - dxp:property name=ga:profileId: The profile ID of the source, such as 1174
  - dxp:property name=ga:webPropertyId: The web property ID of the source, such as UA-30481-1
  - dxp:property name=ga:accountName: The name of the account as it appears in the Analytics interface.
- entry: Each entry in the response contains the following elements
  - title: the list of dimensions in the query and the matching result for that entry
  - dxp:dimension: one element for each dimension in the query, which includes the name and value of the dimension
  - dxp:metric: one element for each metric in the query
    - name: the name of the metric
    - type: either integer or string
    - value: the aggregate value for the query for that metric (e.g. 24 for 24 pageviews)
    - ci: the confidence interval, or range of values likely to include the correct value. See "Confidence Interval" for a general description of confidence intervals and how they are used with Google Analytics. The data feed response returns two general categories of ci values:
      - 0.0 - 1.0: If the confidence interval is zero, there is no estimate involved. Any other value indicates the percentage to use to determine the possible range of values for the metric.
      - INF: This value corresponds to the asterisk (*) that you might see in the Analytics web interface. It occurs in heavily sampled reports and indicates that the supplied number is purely an estimate.

If you expect your query to return large result sets, the guidelines below will help you optimize your API query, avoid errors, and minimize quota overruns. Keep in mind that we establish a baseline level of optimization for any given API request by allowing a maximum number of dimensions (7) and metrics (10). While some queries that specify large numbers of metrics and dimensions can take longer to process than others, limiting the number of requested metrics does not generally improve query performance. Instead, you can use the following techniques for the best performance results.
Paging through results can be a useful way to break large result sets into manageable chunks. The data feed tells you how many matching rows exist, along with giving you the requested subset of rows. If there is a high ratio of total matching rows to number of rows actually returned, then the individual queries might be taking longer than necessary. If you need only a limited number of rows, such as for display purposes, setting an explicit limit is fine. However, if the purpose of your application is to process a large set of results in its entirety, then it is most efficient to request the maximum allowed rows.
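A paging loop along these lines can be sketched by computing the start-index and max-results values for each request once totalResults is known. The helper name page_requests is hypothetical; the default of 10,000 rows per request follows the maximum mentioned in this document.

```python
def page_requests(total_results, max_results=10000):
    """Yield (start_index, max_results) pairs that cover every row.

    Feed indexes are 1-based, so the first page starts at 1.
    """
    start = 1
    while start <= total_results:
        remaining = total_results - start + 1
        yield start, min(max_results, remaining)
        start += max_results


pages = list(page_requests(25000))
for start_index, count in pages:
    print(f"start-index={start_index}&max-results={count}")
```

For 25,000 matching rows this produces three requests, starting at indexes 1, 10001, and 20001.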
Instead of paging through the date-keyed results of one long date range, consider forming separate queries for one week—or even one day—at a time. For a very large data set, it may still be necessary to page through results, such as when a request for one day still contains more than the maximum number of result rows per query. In any case, if the number of matching rows for your query is higher than the max results rows, breaking apart the date range may improve the total time to retrieve the answer. This is true whether the queries are being sent in a single thread or in parallel.
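Splitting a long date range into shorter queries can be sketched as follows. The helper split_date_range is hypothetical; it yields start-date and end-date values in the YYYY-MM-DD format, with both ends inclusive so that consecutive chunks do not overlap.

```python
from datetime import date, timedelta


def split_date_range(start, end, days=7):
    """Yield (start-date, end-date) string pairs of at most `days` days each."""
    current = start
    while current <= end:
        # The end date is inclusive, so a 7-day chunk spans current + 6 days.
        chunk_end = min(current + timedelta(days=days - 1), end)
        yield current.isoformat(), chunk_end.isoformat()
        current = chunk_end + timedelta(days=1)


chunks = list(split_date_range(date(2009, 4, 20), date(2009, 5, 20)))
for start_date, end_date in chunks:
    print(f"start-date={start_date}&end-date={end_date}")
```

Passing days=1 produces one query per day, which may suit very large data sets.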
Consider whether additional filters might reduce the data while still providing the information you need. Can a dimension filter, such as a regular expression match on a page path, return the subset of the data you care about? Can value thresholds (such as ignoring matches with fewer than 5 visits) filter out less interesting results? This approach can be used as a complement to any of the other suggestions mentioned earlier. With this technique, the actual time to get each result set is likely to be about the same, but fewer result pages would be retrieved, thus reducing the overall interaction time and minimizing impact on your quota allowance.