
Google Analytics

Functional Overview

Google Analytics works by the inclusion of a block of JavaScript code on pages in your website. When visitors to your website view a page, this JavaScript code references a JavaScript file which then executes the tracking operation for Analytics. The tracking operation retrieves data about the page request through various means and sends this information to the Analytics server via a list of parameters attached to a single-pixel image request.

Because your website configuration and reporting needs might differ from a standard setup, it's a good idea to understand the general tracking process to ensure that your reports deliver data as you expect. In this way, you can decide how to configure Analytics tracking to best suit your own website.

The rest of this document covers:

  1. How Does Google Analytics Collect Data?
    1. How the Tracking Code Works
    2. How GIF Requests Are Classified
  2. How Does Google Analytics Calculate Data?
    1. Metrics and Dimensions
    2. How Metrics are Calculated
    3. Attribution Models

How Does Google Analytics Collect Data?

The data that Google Analytics uses to provide all the information in your reports comes from these sources:

  • The HTTP request of the visitor
  • Browser/system information
  • First-party cookies

The HTTP request for any web page contains details about the browser and the computer making the request, such as the hostname, the browser type, referrer, and language. In addition, the DOM of most browsers provides access to more detailed browser and system information, such as Java and Flash support and screen resolution. Analytics uses this information in constructing reports like the Map Overlay, Browser, and Referring Sites reports. Analytics also sets and reads first-party cookies on your visitors' browsers in order to obtain visitor session and any ad campaign information from the page request. When all this information is collected, it is sent to the Analytics servers in the form of a long list of parameters attached to a single-pixel GIF image request.

The data contained in the GIF request is the data sent to the Google Analytics servers, which then gets processed and ends up in your reports. Here is an example of only a portion of a GIF request:

http://www.google-analytics.com/__utm.gif?utmwv=4&utmn=769876874&utmhn=example.com&utmcs=ISO-8859-1&utmsr=1280x1024&utmsc=32-bit&utmul=en-us&utmje=1&utmfl=9.0%20%20r115&utmcn=1&utmdt=GATC012%20setting%20variables&utmhid=2059107202&utmr=0&utmp=/auto/GATC012.html?utm_source=www.gatc012.org&utm_campaign=campaign+gatc012&utm_term=keywords+gatc012&utm_content=content+gatc012&utm_medium=medium+gatc012&utmac=UA-30138-1&utmcc=__utma%3D97315849.1774621898.1207701397.1207701397.1207701397.1%3B...  

For more information on the data contained in a GIF request, see the section "GIF Request Parameters" in the Troubleshooting Guide.
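As a rough illustration of how the parameter list in a GIF request can be read, the following Python sketch splits a shortened copy of the example request above into name/value pairs. The helper name `parse_gif_request` is hypothetical, and the sketch only decodes the query string; it does not reflect how the Analytics servers actually process requests.

```python
from urllib.parse import urlsplit, parse_qsl

# Shortened copy of the example request above (leading parameters only).
gif_request = (
    "http://www.google-analytics.com/__utm.gif"
    "?utmwv=4&utmn=769876874&utmhn=example.com&utmcs=ISO-8859-1"
    "&utmsr=1280x1024&utmsc=32-bit&utmul=en-us&utmje=1&utmfl=9.0%20%20r115"
)


def parse_gif_request(url):
    """Return the query parameters of a GIF request as a dict of decoded strings."""
    query = urlsplit(url).query
    return dict(parse_qsl(query, keep_blank_values=True))


params = parse_gif_request(gif_request)
for name, value in params.items():
    print(f"{name:6} = {value}")
```

Percent-encoded values such as `9.0%20%20r115` are decoded during parsing, so the value of utmfl comes back with literal spaces.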


How the Tracking Code Works

In general, the Google Analytics Tracking Code (GATC) retrieves web page data as follows:

  1. A browser requests a web page that contains the tracking code.
  2. The GATC creates and initializes a tracking object associated with the web property ID in the code.
  3. Any customized tracking methods are executed.
  4. The tracking code is initialized and manages the following information:
    • Domain settings for the page.
    • Cookie information (including retrieving campaign tracking if it exists).
    • Browser characteristics and page/referral information from the HTTP request.
  5. The tracking code requests a single-pixel image file from the Analytics server, __utm.gif, and appends to the image request a long list of parameters containing the tracking information collected from cookies and the HTTP request.
  6. This GIF request string is collected from the logs, and the parameters are used to populate the databases which provide the reports for the Analytics report user.
[Figure: GATC Request Process]

The example above also illustrates how the tracking code for the web page can be customized. In this scenario, the customization enables the tracking of visitor interaction between two related sites (site linking). This is accomplished by using the function _setAllowLinker(). A default installation of the tracking code would not include such a function, but this and many more are available for you to use in order to customize the tracking code should your setup need it. For more information on the customization functionality available in the Google Analytics Tracking Code, see the API reference.
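To make step 5 of the process above more concrete, here is a minimal Python sketch that assembles a GIF request URL from collected page data. It is not the tracking code itself (which is JavaScript). The function `build_gif_request` and the keys of `page_info` are hypothetical, and the pairing of each parameter name with a piece of page data is inferred from the example request shown earlier.

```python
from urllib.parse import urlencode, quote, urlsplit, parse_qsl

GIF_ENDPOINT = "http://www.google-analytics.com/__utm.gif"


def build_gif_request(page_info, account_id):
    """Append collected page data to the __utm.gif request as query parameters.

    The parameter-to-field pairing is an assumption based on the example
    request in this document, not a complete or authoritative list.
    """
    params = {
        "utmwv": "4",                             # copied from the example request
        "utmhn": page_info["hostname"],
        "utmsr": page_info["screen_resolution"],
        "utmul": page_info["language"],
        "utmdt": page_info["title"],
        "utmp": page_info["path"],
        "utmac": account_id,
    }
    # quote_via=quote encodes spaces as %20, as in the example request.
    return GIF_ENDPOINT + "?" + urlencode(params, quote_via=quote)


url = build_gif_request(
    {
        "hostname": "example.com",
        "screen_resolution": "1280x1024",
        "language": "en-us",
        "title": "GATC012 setting variables",
        "path": "/auto/GATC012.html",
    },
    account_id="UA-30138-1",
)
print(url)

# Round trip: decoding the query string recovers the original values.
decoded = dict(parse_qsl(urlsplit(url).query))
```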


How GIF Requests Are Classified

A GIF request is sent to the Analytics servers in the following cases and classified according to the table below. In each of these cases, the GIF request is identified by type in the utmt parameter. In addition, the type of the request also determines which data is sent to the Analytics servers. For example, transaction and item data is only sent to the Analytics servers when a purchase is made. Visitor, page, and system information is only sent when an event is recorded or when a page loads, and the user-defined value is only sent when the _setVar method is called.

Request Type   Description                                                                  Class
Page           A web page on your server is requested.                                      Interaction
Event          An event is triggered through Event Tracking that you set up on your site.   Interaction
Transaction    A purchase transaction occurred on your site.                                Interaction
Item           Each item in a transaction is recorded with a GIF request.                   Interaction
Var            A custom user segment is set and triggered by a visitor.                     Non-interaction

Requests classified as interaction requests affect the bounce rate calculations for your page or site. Bounce rate is commonly described as the percentage of single-page visits to your site, but a bounce is strictly defined as a user session that contains a single interaction request. For this reason, the bounce rate for a page is also affected by ecommerce transactions and event tracking requests. These features co-exist with page tracking and, when they are triggered, they result in additional interaction requests to the Analytics servers.
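The strict definition of a bounce can be sketched in a few lines of Python. The request type names and classes below follow the table above. The session data and function names are hypothetical, and this is only an illustration of the definition, not of how Analytics computes the metric internally.

```python
# Request types and their classes, as listed in the table above.
REQUEST_CLASS = {
    "page": "interaction",
    "event": "interaction",
    "transaction": "interaction",
    "item": "interaction",
    "var": "non-interaction",
}


def is_bounce(session_requests):
    """A session is a bounce if it contains exactly one interaction request."""
    interactions = [r for r in session_requests if REQUEST_CLASS[r] == "interaction"]
    return len(interactions) == 1


def bounce_rate(sessions):
    """Fraction of sessions that are bounces (0.0 for an empty list)."""
    if not sessions:
        return 0.0
    return sum(1 for s in sessions if is_bounce(s)) / len(sessions)


# Hypothetical sessions, each a list of request types in order.
sessions = [
    ["page"],           # bounce: one pageview only
    ["page", "var"],    # still a bounce: "var" is a non-interaction request
    ["page", "event"],  # not a bounce: the event is a second interaction
    ["page", "page"],   # not a bounce: two pageviews
]
rate = bounce_rate(sessions)
print(f"Bounce rate: {rate:.0%}")
```

Note how the third session counts as a non-bounce even though only one page was viewed, because the event request is a second interaction.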


How Does Google Analytics Calculate Data?

Once the data collected by Analytics is processed for the reports, it appears in two primary formats: metrics and dimensions. This section describes how metrics and dimensions are calculated for your reports, as well as how different calculation models are used for different categories of metrics.

Metrics and Dimensions

Metrics and dimensions are the building blocks of every report in Google Analytics. In fact, when you use the Custom Reporting feature in Google Analytics, you select individual metrics and dimensions to design your own report. It's easiest to understand metrics and dimensions in the visual context of a report, so consider the metrics and dimensions used in the Visitors Overview report shown below.

 

[Figure: Visitor Overview Report]

A metric is a numeric summary of user behavior on your website. For example, pageviews is a metric that summarizes the total pageviews for a particular page. Bounce rate summarizes the percentage of single-page visits to your site. Visits summarizes the number of sessions on your site. When metrics are viewed without a dimension, they provide site-wide or aggregate values. You can see these site-wide metrics in the overview reports for each major reporting category. Here, the Visitors Overview report shows the metrics that apply to your entire site such as:

  • visits
  • absolute unique visitors
  • pageviews
  • bounce rate

A dimension is a data key or field, typically in the form of a string. Dimensions by themselves are not generally meaningful, but when paired with metrics, they can divide or segment the metric from the perspective of that dimension. In general, dimensions represent the larger categories of data that are used to view the metric in a meaningful context. For example, when looking at the Visitor Overview report, you can see that the Technical Profile section lists Visits as the metric, but it uses the Browser dimension to break down visits to your website by browser. You can see here that another dimension, Connection Speed, is also paired with the Visits metric to display visits by connection speed. So, while it is useful to see that there are over 10,000,000 visits to the site, it's also important to analyze this metric from the perspective of users' browsers and connection speeds.
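The relationship between a metric and a dimension can be illustrated with a small Python sketch. The visit records and the `visits_by` helper are hypothetical. The point is only that the same Visits metric can be reported as a site-wide total or broken down by a dimension such as Browser or Connection Speed.

```python
from collections import Counter

# Hypothetical visit records; each dict holds the dimension values for one visit.
visits = [
    {"browser": "Firefox", "connection_speed": "Cable"},
    {"browser": "Internet Explorer", "connection_speed": "DSL"},
    {"browser": "Firefox", "connection_speed": "DSL"},
    {"browser": "Safari", "connection_speed": "Cable"},
    {"browser": "Firefox", "connection_speed": "Cable"},
]

# Metric without a dimension: a site-wide total.
total_visits = len(visits)


def visits_by(dimension):
    """Break the Visits metric down by the values of one dimension."""
    return Counter(v[dimension] for v in visits)


by_browser = visits_by("browser")
by_speed = visits_by("connection_speed")

print("Total visits:", total_visits)
print("Visits by browser:", dict(by_browser))
print("Visits by connection speed:", dict(by_speed))
```

Each breakdown sums back to the site-wide total, since every visit has exactly one value for each dimension in this example.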


How Metrics are Calculated

In Analytics, visitor metrics are calculated in two basic ways:

  • As overview totals
    where the metric is displayed as a summary statistic for your entire site, such as bounce rate or total pageviews.
  • In association with one or more reporting dimensions
    where the metric value is qualified by selected dimension(s).

The following diagram illustrates these two types of calculations with a simple example. On the left side, visitor data is calculated as an overview metric, while the same data is calculated via the New Visitor dimension on the right side.

[Figure: Visitor New vs Returning]

In the Visitor Overview example, time on site is computed from the time difference between the start of each visit and its exit, and the resulting session lengths are averaged across the 3 visits. This is a relatively simple calculation based on time stamp data gathered at the request level.

In the New vs Returning example, averages are not computed for all visits, but rather via the Visitor Type dimension. By pairing the Time On Site metric with a dimension, you can analyze this metric via returning vs new visitors, where the calculations are modified by the requested dimension. The use of the dimension offers an insight into visitor behavior not provided in the overview report: it's clear that new visitors are spending more time on your site than returning visitors.
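The two calculations described above can be sketched as follows. The visit data, field names, and helper functions are hypothetical (they are not taken from the diagram), and session length is assumed here to be the difference between the last and first request time stamps of a visit.

```python
from collections import defaultdict

# Hypothetical visits; request_times are seconds since the start of the visit.
visits = [
    {"visitor_type": "New", "request_times": [0, 60, 200]},
    {"visitor_type": "New", "request_times": [0, 30, 100]},
    {"visitor_type": "Returning", "request_times": [0, 45, 60]},
]


def session_length(visit):
    """Time between the first and last request of a visit, in seconds."""
    times = visit["request_times"]
    return max(times) - min(times)


def average(values):
    values = list(values)
    return sum(values) / len(values) if values else 0.0


# Overview calculation: one average across all visits.
overall = average(session_length(v) for v in visits)


def average_by(visits, dimension):
    """Same metric, but averaged separately for each value of a dimension."""
    groups = defaultdict(list)
    for v in visits:
        groups[v[dimension]].append(session_length(v))
    return {value: average(lengths) for value, lengths in groups.items()}


by_type = average_by(visits, "visitor_type")
print("Overall avg time on site (s):", overall)
print("By visitor type (s):", by_type)
```

With this sample data the overall average is 120 seconds, while the breakdown shows 150 seconds for new visitors and 60 seconds for returning visitors, a difference the overview figure alone would hide.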

While this report shows time on site varying for different user types, you cannot use this report to determine why this is the case. Only you know the design and purpose of your site, so you would also need to use other reports like Site Overlay and Keywords in order to build a consistent picture to support your interpretation of user behavior.

Metrics calculation is also affected by stacking more than one dimension with a given metric. In both the preformatted and custom reports, you can use multiple dimensions together. For example, suppose you use both the Visitor Type dimension and the Language dimension to analyze time on site for your website. In this case, the calculation for new versus returning visitors is the same, but when you drill down to view new visitors using the Language dimension, the calculation is further modified by the additional dimension. So, for example, your visitor breakdown might look like this, where the top site times are listed in order:

Visitor Type   Language                Avg Time On Site
All Types      All Languages           3:25
Returning      All Languages           5:03
               Finnish                 29:49
               Vietnamese              20:44
               Indonesian              16:55
New            All Languages           2:09
               Malay                   17:38
               English (GB)            16:56
               Chinese (traditional)   16:20

These numbers are based on an actual Analytics report. In this case, you can determine whether new or returning visitors stayed the longest, and by using an additional dimension, which languages in each of these categories resulted in the longest time on site.
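Stacking a second dimension amounts to grouping by a pair of values rather than a single one. The sketch below illustrates this with hypothetical per-visit rows, with durations chosen so that a few of the resulting averages match entries in the table above. The raw rows themselves are invented for illustration, and the helper names are assumptions.

```python
from collections import defaultdict

# Hypothetical rows: (visitor_type, language, time_on_site_seconds).
rows = [
    ("Returning", "Finnish", 1800),
    ("Returning", "Finnish", 1778),
    ("Returning", "Vietnamese", 1244),
    ("New", "Malay", 1058),
    ("New", "English (GB)", 1016),
    ("New", "English (GB)", 1016),
]


def average_by(rows, key_fn):
    """Average time on site for each key produced by key_fn."""
    totals = defaultdict(lambda: [0, 0])  # key -> [sum of seconds, count]
    for row in rows:
        bucket = totals[key_fn(row)]
        bucket[0] += row[2]
        bucket[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


def fmt(seconds):
    """Format seconds as m:ss."""
    minutes, secs = divmod(round(seconds), 60)
    return f"{minutes}:{secs:02d}"


# One dimension, then two dimensions stacked.
by_type = average_by(rows, lambda r: r[0])
by_type_language = average_by(rows, lambda r: (r[0], r[1]))


def top_languages(visitor_type):
    """Languages for one visitor type, longest average time on site first."""
    pairs = [
        (lang, avg)
        for (vtype, lang), avg in by_type_language.items()
        if vtype == visitor_type
    ]
    return sorted(pairs, key=lambda p: p[1], reverse=True)


for vtype in ("Returning", "New"):
    print(vtype, fmt(by_type[vtype]))
    for lang, avg in top_languages(vtype):
        print("   ", lang, fmt(avg))
```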


Attribution Models

Because Google Analytics attempts to answer a variety of web analytics questions about user behavior, it uses different calculation types or attribution models to arrive at the data that you see in the reports. Think about each Analytics report as a response to a particular kind of user analysis question. Often, these questions fall into distinct categories:

  • Content: How many times was a particular page viewed?
  • Goals: Which page URLs contributed to the highest goal conversion rate?
  • Ecommerce: How much value did a given page contribute to a transaction?
  • Internal Search: Which internal search terms contributed to a transaction?

For each of these major categories and the reports that they contain, Google Analytics uses a distinct attribution model. Because each attribution model is designed to calculate a known set of metrics, you might notice that some metrics—such as Pageviews—appear only in certain reports and not in others. This is due to the attribution model that is used for that report.

The Google Analytics reports use three attribution models:

  • Per Request
  • Page Value
  • Site Search

Per Request Attribution

This attribution gives aggregate values for a single metric or for a metric/dimension pairing. This is the most common and simplest type of Analytics attribution, since values are determined from individual visitor GIF requests. Thus, for any given request, it is possible to look up a particular dimension and/or metric.

Most dimension values are available at the request level and remain persistent either via the HTTP/GET request itself, or in the GIF request, for every page or event request made to your site. Some common dimensions available at the request level are:

  • page URI—available with every request to your site, this indicates the path of the page being accessed
  • campaign—if a user comes in via a campaign, that campaign remains persistently available with every subsequent request, until the campaign itself changes
  • user agent—every request from a user contains the browser information for that user, sent in via the HTTP/GET request from the browser and stored in the log files directly.
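The campaign behavior described in the list above, where a campaign stays attached to every subsequent request until it changes, can be sketched as a simple carry-forward over a visitor's requests. The request records and the `attach_campaign` helper are hypothetical. In practice the campaign is kept in cookies, which this sketch does not model.

```python
def attach_campaign(requests):
    """Carry the most recent campaign forward onto each subsequent request."""
    current = None
    resolved = []
    for request in requests:
        if request.get("campaign") is not None:
            current = request["campaign"]  # the campaign itself changed
        resolved.append({"page": request["page"], "campaign": current})
    return resolved


# Hypothetical requests from one visitor, in order.
requests = [
    {"page": "/landing", "campaign": "spring_sale"},
    {"page": "/products"},
    {"page": "/cart"},
    {"page": "/offers", "campaign": "newsletter"},
    {"page": "/checkout"},
]

resolved = attach_campaign(requests)
for r in resolved:
    print(r["page"], "->", r["campaign"])
```

Once every request carries its campaign, a per-request lookup of the campaign dimension is a matter of reading the field on that request.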

Page Value Attribution

The purpose of this attribution type is to answer the question: "How useful was my page in relation to a goal or revenue value?" This attribution model is largely used in the Top Content and Content Detail reports when determining the average $Index value for a set of pages, or the $Index value of a particular page on your site. The following illustration shows a series of user pageviews in relationship to goals and purchases, such as what might occur on your site.

Legend: P1 through P4 represent pages. The shopping bag indicates a receipt page, and the flag image indicates a goal.

[Figure: Goal Value Per Page]

This attribution model is referred to as a "forward looking" attribution model, because it applies value to a page by looking forward to the goals and/or purchases that take place after the page was visited. The following table shows the value attributed to each page in this sequence.

Page   Revenue/Goal Value
P1     $55 + Goal 1
P2     $55 + Goal 1
P3     $35 + Goal 1
P4     $0

This attribution model is not used in Goals or Ecommerce reports, since those reports do not display page URIs or titles in relation to ecommerce activities.
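The forward-looking calculation can be sketched in Python as a single pass over the visit in reverse order. The event sequence below is an assumption: the illustration is not reproduced here, so the sequence was chosen only to be consistent with the table above (purchases of $20 and $35, with Goal 1 reached after P3). The function name and event encoding are hypothetical, and the sketch assumes each page is viewed once.

```python
# Assumed sequence of events for one visit, consistent with the table above.
sequence = [
    ("page", "P1"),
    ("page", "P2"),
    ("purchase", 20),
    ("page", "P3"),
    ("goal", "Goal 1"),
    ("purchase", 35),
    ("page", "P4"),
]


def page_values(sequence):
    """Give each page the revenue and goals that occur after it in the visit."""
    revenue_ahead = 0
    goals_ahead = []
    values = {}
    # Walk backwards so that everything already seen lies "ahead" of the page.
    for kind, payload in reversed(sequence):
        if kind == "purchase":
            revenue_ahead += payload
        elif kind == "goal":
            goals_ahead.insert(0, payload)
        elif kind == "page":
            values[payload] = (revenue_ahead, list(goals_ahead))
    return values


values = page_values(sequence)
for page in ("P1", "P2", "P3", "P4"):
    revenue, goals = values[page]
    print(page, f"${revenue}", goals)
```

P4 receives no value because nothing of value happens after it, while P1 and P2 receive credit for both purchases.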

Site Search Attribution

This attribution model determines the contribution of search terms to goal and revenue value. Here, the Site Search Terms report displays goal conversion rates and goal values per search term.

[Figure: Site Search Report]

This attribution model operates differently from Page Value attribution: goal value is attributed only to the nearest search term preceding the conversion, not to any search term that comes after it. The following diagram illustrates a sequence of internal site searches along with page views and purchases.

Legend: P1 through P4 represent pages. The shopping bag indicates a receipt page, and the search icon indicates a search for the terms "Shoes" and "Flowers." The flag image indicates a goal.

[Figure: Site Search Attribution]

Using this model, the search terms attributed to Goal 1 and the transactions are:

  • Shoes—$20
  • Flowers—$25

In this model, transactions or goals are attributed to the search term immediately preceding the goal or transaction.
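A minimal Python sketch of this rule follows. The event sequence is an assumption chosen to reproduce the amounts listed above ($20 for "Shoes" and $25 for "Flowers"). In particular, the position of Goal 1 in the sequence is a guess, since the diagram is not reproduced here. The function name and event encoding are hypothetical.

```python
from collections import defaultdict

# Assumed sequence of events for one visit; the placement of Goal 1 is a guess.
sequence = [
    ("search", "Shoes"),
    ("page", "P1"),
    ("page", "P2"),
    ("purchase", 20),
    ("search", "Flowers"),
    ("page", "P3"),
    ("goal", "Goal 1"),
    ("purchase", 25),
    ("page", "P4"),
]


def search_term_values(sequence):
    """Credit each purchase or goal to the search term immediately preceding it."""
    revenue = defaultdict(int)
    goals = defaultdict(list)
    current_term = None
    for kind, payload in sequence:
        if kind == "search":
            current_term = payload
        elif current_term is None:
            continue  # nothing to credit before the first search
        elif kind == "purchase":
            revenue[current_term] += payload
        elif kind == "goal":
            goals[current_term].append(payload)
    return dict(revenue), dict(goals)


revenue, goals = search_term_values(sequence)
print("Revenue by search term:", revenue)
print("Goals by search term:", goals)
```

Unlike the Page Value sketch, earlier search terms receive no credit for conversions that follow a later search: "Shoes" is credited only with the $20 purchase.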
