This document describes how to query for data items and obtain metadata.
You can query on:
When you search for Google Base data, your query goes against a data item feed. There are two data item feeds: the public snippets feed and the customer-specific items feed.
http://www.google.com/base/feeds/snippets: The snippets feed contains all Google Base
data and is available to anyone to query without authentication.
This feed is read-only. The snippets feed provides access to all content in Google
Base, but may return items with a shortened description and without
private attributes.
http://www.google.com/base/feeds/items: The items feed contains a customer-specific subset
of Google Base data. Only the customer can access this feed to insert,
update, delete, and query their own data. The items feed requires authentication.
Queries are constructed the same way against both types of feeds.
Note: Queries executed on the snippets and items feeds in the Google Base data API may return anywhere from 0 to all matching results. There is no guarantee that a query will return all of the matching results in Google Base.
Google Base queries use a combination of a full-text search query,
plus one or more specific attribute queries restricting, for instance,
the price of an item, or the text of an attribute such as title, to narrow
down the search. You query a Google data API feed by issuing an HTTP
GET request. The query URI consists of the resource's URI (called FeedURI in
Atom) followed by query parameters. Google Base queries use REST-style
URLs with one or more parameters, all of which must be URL-encoded. To narrow a search in this way, you can combine the q and bq query parameters, described in more detail below.
The response is a feed that contains a list of items that match the query
string. Each item has a unique URL (provided in the <id> tag).
When you want to update or delete a particular item, you can retrieve
just that item by using its URL.
This chapter contains several example queries. As a reminder, if you send a query as part of an HTTP request, any non-alphanumeric characters must be URL-encoded.
To make the examples in this document easier to read, we have omitted the URL encoding in the text.
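As a rough illustration of the encoding step, the following Python sketch builds an encoded query URL with the standard library. The helper name build_query_url is hypothetical and not part of the Google Base API; only the feed URL and the q, bq, and max-results parameters come from this document.

```python
from urllib.parse import urlencode

SNIPPETS_FEED = "http://www.google.com/base/feeds/snippets"


def build_query_url(feed_url, **params):
    """Return feed_url followed by URL-encoded query parameters.

    Hypothetical helper: keyword names use underscores in place of the
    hyphens in parameter names such as max-results.
    """
    pairs = [(name.replace("_", "-"), str(value))
             for name, value in params.items() if value is not None]
    # urlencode escapes every non-alphanumeric character that needs it
    # and turns spaces into "+".
    return feed_url + "?" + urlencode(pairs) if pairs else feed_url


print(build_query_url(SNIPPETS_FEED, q="digital camera", bq="[brand:canon]"))
```

The printed URL carries the brackets and colon of the bq value in percent-encoded form, which is the form a client would actually send.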
The example queries in this chapter are hyperlinked so that you can execute them in a browser. If you are using Mozilla Firefox, the title and content of each result will display in the browser window. To see the entire XML result, right-click in the browser window and select View Page Source. If you are using Internet Explorer, you will first have to save the result as an XML file, which you can then open in the browser.
Google Base supports both full-text and structured queries.
Full-text queries are executed using the q parameter in query URIs,
as follows:
q=text query
For instance, the following query would yield data items that contain both the terms "mp3" and "player", in an arbitrary, non-consecutive order.
mp3 player
If we want to restrict our attention only to items that contain the term "mp3" immediately followed by "player", we would have to use a phrase query instead:
"mp3 player"
If we are interested in items that contain either the phrase "mp3 player" or the term "ipod", we combine the query above with the term "ipod" using the OR operator |.
"mp3 player" | ipod
If we do not want the term "accessory" to appear in items that match this query, we can express this using the - operator:
("mp3 player" | ipod) -accessory
Since | has a lower precedence than the implicit AND operation, we have
to enclose the previous query in parentheses when composing it with -accessory.
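The composition rules above can be sketched as three small string helpers. The function names are hypothetical; the sketch only assembles the query text shown in this section and always parenthesizes OR groups so that the precedence issue cannot arise.

```python
def phrase(text):
    """Quote text so that it is matched as a consecutive phrase."""
    return '"' + text + '"'


def any_of(*queries):
    """Join queries with the OR operator |, wrapped in parentheses."""
    return "(" + " | ".join(queries) + ")"


def without(query, term):
    """Exclude items that contain term by appending the - operator."""
    return query + " -" + term


query = without(any_of(phrase("mp3 player"), "ipod"), "accessory")
print(query)
```

The resulting string still has to be URL-encoded before it is sent as the value of the q parameter.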
Full-text queries are automatically spell-checked. If the spell checker identifies a potential spelling error, the query response will return a link to a suggested corrected query. For example, executing the query:
q=flwers
returns the following spell correction link:
<link rel='http://schemas.google.com/g/2006#spellcorrection' type='application/atom+xml' href='http://code.google.com/base/feeds/base/feeds/snippets?q=flower&start-index=1&max-results=25&oi=spell&spell=1'> </link>
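A client could pick up the suggestion by looking for a link with this rel value in the response. The sketch below uses Python's xml.etree.ElementTree and assumes that the link element is in the Atom namespace; the sample feed and its href are shortened stand-ins for illustration, not real server output.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"
SPELL_REL = "http://schemas.google.com/g/2006#spellcorrection"

# Illustrative stand-in for a query response.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <link rel="http://schemas.google.com/g/2006#spellcorrection"
        type="application/atom+xml"
        href="http://www.google.com/base/feeds/snippets?q=flower&amp;start-index=1"/>
</feed>"""


def spell_correction_href(feed_xml):
    """Return the href of the spell correction link, or None if absent."""
    root = ET.fromstring(feed_xml)
    for link in root.iter(ATOM + "link"):
        if link.get("rel") == SPELL_REL:
            return link.get("href")
    return None


print(spell_correction_href(SAMPLE_FEED))
```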
Refer to the GData documentation for more information about executing full-text queries.
Structured queries are provided using the bq parameter in query URIs,
as follows:
bq=text query [attrib_name(attrib_type):value] [attrib_name(attrib_type):value] ...
This syntax uses square brackets to indicate attribute name-value pairings. That is, when constructing queries, square brackets do not mean that something is optional.
The attribute type, however, is optional. If you do not explicitly specify the type of an attribute, it gets derived from the attribute operator and value pattern.
The following example shows some elements from a recipe item.
<g:label>asian</g:label>
<g:item_type type='text'>Recipes</g:item_type>
<g:cooking_time type='number'>30</g:cooking_time>
<g:main_ingredient type='text'>chicken</g:main_ingredient>
<g:serving_count type='number'>5</g:serving_count>
To query on cooking_time for this element, you would use:
[cooking_time(number):30]
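As a minimal sketch of the bracket syntax, the hypothetical helper below assembles one attribute restriction and leaves the type out when none is given, mirroring the rule that the type is optional.

```python
def attribute_restriction(name, value, attr_type=None):
    """Build one [name(type):value] restriction for the bq parameter.

    Hypothetical helper; the type is omitted when attr_type is None.
    """
    typed_name = f"{name}({attr_type})" if attr_type else name
    return f"[{typed_name}:{value}]"


print(attribute_restriction("cooking_time", 30, "number"))
print(attribute_restriction("main_ingredient", "chicken"))
```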
You can combine the q and bq parameters in a single query, using bq to express any structured components, and q to express the rest. This lets you maximize your benefit from spell correction.
For example, the following query is valid:
snippets?q=digital+camera&bq=[brand:canon]
When you use q and bq to execute a query on the items feed, you only have access to published and searchable items. If you want to see all of your items, including inactive and draft items, you can execute an HTTP GET against the feed URL without specifying a q or bq parameter. None of the other query parameters are applicable in this case.
NOTE: The /base/feeds/items?bq=... query will only work after
your items have become searchable. Items uploaded to Google Base might
take up to 24 hours to become searchable. Even after the items become
searchable, this query is not guaranteed to return ALL items that match
the query.
You can combine a text search with a search on one or more attributes to constrain your search to a smaller set of matching data items.
Imagine we are interested in finding items related to digital music
players. The previous query is very likely to return items that are not
product offers, but, for instance, product reviews. In order to constrain
the search to products only, we have to express that the type of our
items should be products. Each data item in Google Base has an item
type text attribute, so we just have to state that the value of
this attribute should match the term products. Such an attribute
query is expressed using the syntax [<attribute name>: <attribute
value>]. Thus, we can refine the previous query:
[item type: products] (ipod | "mp3 player")
Let's imagine we want to spend at most $150 on a digital music player
or equipment. In Google Base, the numeric attribute price refers
to the price of a product. For numeric attributes, we can use the operators <, <=, >, >=,
and == to specify constraints on their values, so a constraint
on the attribute price looks like this: [price <= 150.0].
Numbers can be associated with units. The given query does not specify
a unit, so all prices of at most 150.0 match, no matter what currency is used.
We can add the unit USD to make sure that we only get offers
in U.S. dollars. We append this price restriction to the previous query
and get:
[item type:products] (ipod | "mp3 player") [price <= 150.0 USD]
If we are not interested in products below $50, we could match
the value of the price attribute against the number range 50.0..150.0
with the query: [price: 50.0..150.0 USD]
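The two forms of price restriction can be sketched as follows. The helper names are hypothetical, and the final step simply URL-encodes the combined query as the bq parameter of the snippets feed.

```python
from urllib.parse import urlencode


def price_at_most(limit, unit=None):
    """Build a [price <= limit unit] restriction; the unit is optional."""
    value = f"{limit:.1f}" + (f" {unit}" if unit else "")
    return f"[price <= {value}]"


def price_between(low, high, unit=None):
    """Build a [price: low..high unit] range restriction."""
    value = f"{low:.1f}..{high:.1f}" + (f" {unit}" if unit else "")
    return f"[price: {value}]"


bq = '[item type:products] (ipod | "mp3 player") ' + price_between(50, 150, "USD")
url = "http://www.google.com/base/feeds/snippets?" + urlencode({"bq": bq})
print(bq)
print(url)
```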
Google Base supports the attribute types described in the Feeds Reference.
You can restrict on attribute type in your query by specifying the type in parentheses after the attribute name, as in:
[item type(text):products]
Numeric types can be parameterized with a unit, which is appended to the type, as in the following price query:
[price(float USD) <= 150.0 USD]
In this case, the second USD is optional and can be omitted.
When an attribute value starts with a number, a number type is inferred. You can explicitly specify that the number should be interpreted as text by surrounding it with quotes. For example, [foo: "120.30" USD] specifies that 120.30 should be interpreted as text, not as a number.
To express that a certain attribute is present in a data item without
constraining its value, omit the operator and value
pattern in an attribute query. For instance: [listing type] matches
items that define an attribute "listing type" of an arbitrary
type and value. The query [bedrooms(int)] matches only those
items that define an attribute bedrooms of type int with
an arbitrary value.
The following query matches all data items that have a location attribute
of type location whose value refers to an address that is
at most 3 miles away from the address "10 Market St, San Francisco, CA, USA":
[location: @"10 Market St, San Francisco, CA, USA" + 3mi]
@"..." is a location literal that gets geocoded
by the API server into a latitude/longitude. The radius specified by
suffix + 3mi defines the area in which locations are expected
to lie.
If you specify a point as identified by a lat/long combination or an address, together with a distance, the query will return all items with a location within the area specified by the center point and the radius. The following query matches all data items that have a location attribute
of type location whose value refers to an address that is
at most 3 miles away from the city center of "San Francisco, CA, USA":
[location: @"San Francisco, CA, USA" + 3mi]
If the latitude/longitude was specified when an item was added to Google Base, you can specify it directly as in this query:
[location: @+37.795-122.395 + 3mi]
The location syntax is defined by ISO 6709. The example above refers to a three-mile radius around the point with latitude +37.795 and longitude -122.395 degrees. Note that there is no separator between the latitude and the longitude.
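A point-and-radius restriction can be assembled from numeric coordinates as in the sketch below. The helper name is hypothetical, and the fixed number of decimal places is an assumption made for illustration; the sketch only shows the signed, separator-free layout described above.

```python
def location_within(lat, lon, radius_miles):
    """Build a [location: @<lat><lon> + Nmi] restriction.

    Latitude and longitude are written with an explicit sign and no
    separator between them. Three decimal places is an illustrative
    choice, not a documented requirement.
    """
    point = f"{lat:+07.3f}{lon:+08.3f}"
    return f"[location: @{point} + {radius_miles}mi]"


print(location_within(37.795, -122.395, 3))
```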
You can also match locations in a box, by specifying the latitude/longitude of the opposing corners. More precisely, given two locations, Google Base will construct the smallest rectangular area that contains both, and return all items with locations inside this area.
[location: @+34-086..@+37-092]
If you omit the radius and specify an address at least up to the city level, then the query will return all items with a location within the city. The following query matches all data items that have a location attribute
of type location whose value refers to an address that is within "San Francisco, CA, USA":
[location: @"San Francisco, CA, USA"]
Likewise, you can specify an address up to the state level to find locations within the state, up to the ZIP/postal code level to find locations within a ZIP/postal code, or up to the country level to find locations within a country.
The following query shows how to constrain the value of a daterange
attribute publish date. This query matches all data items that have a publish date attribute
of type daterange whose start and end dates are both on May
3, 2007.
[publish date: 2007-05-03Z]
The following query shows how to constrain the value of a daterange
attribute event date range. This query matches all data items that have an event date range attribute
of type daterange whose start and end dates both fall between May 20
and May 25, 2007.
[event date range:2007-05-20..2007-05-25]
Dates have the form YYYY-MM-DD, where YYYY is
the year, MM the month, and DD the day. A date can be followed by a Z to express that all dates/times
are based on the UTC timezone, as defined by ISO 8601.
You can specify
prefixes of a date to denote larger dateranges, such as "2005-02" to refer
to the whole month of February 2005. The following attribute
query matches publish date attributes with a value in the
year 2007:
[publish date: 2007Z]
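The date forms above can be generated from Python date objects, as in the following sketch. The helper names are hypothetical. The examples in this section vary in whether a space follows the colon; this sketch omits it.

```python
from datetime import date


def daterange_restriction(name, start, end=None, utc=False):
    """Build a restriction on a single day or on a start..end range."""
    value = start.isoformat()
    if end is not None:
        value += ".." + end.isoformat()
    if utc:
        value += "Z"  # a trailing Z marks the dates as UTC
    return f"[{name}:{value}]"


def month_prefix(day):
    """Return the YYYY-MM prefix that denotes the whole month of day."""
    return f"{day.year:04d}-{day.month:02d}"


print(daterange_restriction("event date range", date(2007, 5, 20), date(2007, 5, 25)))
print(daterange_restriction("publish date", date(2007, 5, 3), utc=True))
print(month_prefix(date(2005, 2, 14)))
```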
It is also possible to use the operators <, <=, and so on
in combination with daterange attributes. Queries of
the form [<daterange attrib> << <daterange>] are
allowed; they match if the right-hand side daterange value
is fully contained in the daterange value of the
given attribute.
One of the supported attribute types is url. Google Base
does not index the protocol of a URL, so a prefix like http:// has
to be omitted when querying for an attribute of type url.
Here are two examples:
[link: cooking.com]
[link: "cooking com"]
Either of these will find all items which have a link attribute
(standard Atom element) to pages on "cooking.com". You can also include
additional path components like this:
[link: www.cooking.com/recipes]
By default, query results are returned in order of relevancy, with the most relevant results returned first.
You can change the ranking criteria by querying the items or snippets feed and assigning a value to the orderby parameter. You can rank by relevancy, modification time, a specified attribute, or define advanced ranking criteria using the Expression Language. Refer to the Feeds Reference for more information.
You can reverse the rank order by querying the items or snippets feed and setting the sortorder parameter to ascending.
The following query searches for sailboats and orders results by price in US dollars. The lowest-priced items are displayed first.
snippets?q=sailboats&orderby=price(float USD)&sortorder=ascending
The following query searches for software engineering jobs and orders by salary.
snippets?q=software+engineer&bq=[item+type:jobs]&orderby=salary(int)
The following query searches for vehicle listings, and orders results by those that are geographically closest to the town of Cupertino, California. Note that this example, like all queries, must be properly URL-encoded.
snippets?q=sale&bq=[item type:vehicles][location(location)] &orderby=[x=location(location):neg(min(dist(x,@'Cupertino,CA')))]
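To show what the encoded form of such a query looks like, the sketch below passes the three parameter values through Python's urlencode and then decodes the result again to confirm that nothing was lost. The parameter values are taken from the example above.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

params = {
    "q": "sale",
    "bq": "[item type:vehicles][location(location)]",
    "orderby": "[x=location(location):neg(min(dist(x,@'Cupertino,CA')))]",
}

# urlencode escapes brackets, parentheses, quotes, commas, = and @.
url = "http://www.google.com/base/feeds/snippets?" + urlencode(params)
print(url)

# Decoding the query string gives back the original, readable values.
decoded = {name: values[0] for name, values in parse_qs(urlsplit(url).query).items()}
print(decoded["orderby"])
```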
By crowding query results, you can limit repetitive results by placing a cap on similar items. To do this, you use the crowding parameter.
In its most basic form, crowding imposes a filter on attribute values. You choose an attribute for which you want to limit results; for example, the location attribute or the title attribute. You also choose the maximum number of items that you want to see for each attribute value. For example, you might only want to see a maximum of two items with the same value for the location attribute. This would result in two items with a location of New York, two with a location of Boston, and so on for every unique value of the location attribute.
In addition to attributes, crowding also works on the url, customerid, and content.
The crowding parameter is called crowdby. crowdby is supported on the items feed and on the snippets feed.
crowdby=crowding:maxvalue,crowding:maxvalue
crowding is the crowding criterion. You can use any of the following criteria:
attribute: filters results based on the specified attribute.
url: filters results based on the URL. There is a cap on the number of items with the same URL.
customerid: filters results based on the customer ID. Multiple items that have the same associated customer ID are excluded.
content: filters results based on the content of the title and description attributes in the item. Duplicate items, as determined by the content of these attributes, are excluded.
[crowding expression]: uses an expression for crowding.
maxvalue is an integer number greater than or equal to 1. If you do not specify a value for maxvalue, 1 is used by default.
Crowding expressions must evaluate to type text or one of the numerical types, including int, float or number.
Upon query execution, only the specified number of results are returned for each unique value. The rest of the items that match the query are dropped.
Note: When you use crowding, the total number of matching items is incorrect. It reflects the number of results that would have been available had crowding not been imposed.
You can combine more than one crowding expression. For example, you might want to see a list of restaurants by location. You could use crowding to ensure a diverse mix of cuisine (Italian food, hamburgers, Thai food) located in several different cities. Multiple crowding expressions have an "and" relationship. If an item would exceed any of the crowding restrictions, it is eliminated from the query results. Each query may contain a maximum of two crowding expressions.
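The rules for the crowdby value (an optional integer maxvalue of at least 1, and at most two crowding expressions per query) can be checked on the client before a request is sent. The following is a minimal sketch with a hypothetical helper name; it only builds the parameter value and does not contact the server.

```python
def build_crowdby(*criteria):
    """Build a crowdby value from "name" or ("name", maxvalue) entries."""
    if not 1 <= len(criteria) <= 2:
        raise ValueError("a query takes one or two crowding expressions")
    parts = []
    for item in criteria:
        name, maxvalue = item if isinstance(item, tuple) else (item, None)
        if maxvalue is None:
            parts.append(name)  # no maxvalue given, so the default of 1 applies
        elif isinstance(maxvalue, int) and maxvalue >= 1:
            parts.append(f"{name}:{maxvalue}")
        else:
            raise ValueError("maxvalue must be an integer >= 1")
    return ",".join(parts)


print(build_crowdby(("cuisine(text)", 2)))
print(build_crowdby("url", "customerid"))
```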
The following example shows crowding by attribute. It shows at most two items for each cuisine type. You might use this to locate restaurants.
crowdby=cuisine(text):2
The following example also shows crowding by attribute. It searches for cars and crowds by color, showing cars of each color a maximum of two times.
snippets/-/vehicles?crowdby=color(text):2
The following example shows crowding by customerid. It only shows two items per customer.
snippets/-/housing?bq=[location:@"100 Wall Street St, New York, NY, USA" + 3mi]&crowdby=customerid:2
The following example shows crowding by url. It searches for recipes, and shows only 2 recipes originating from each URL.
http://www.google.com/base/feeds/snippets/-/recipes?crowdby=url:2
The following example combines two crowding expressions: url and customerid. Since there is no value specified for the maximum number of results, 1 is used by default.
http://www.google.com/base/feeds/snippets/-/recipes?crowdby=url,customerid
The following example uses the join operator to concatenate the cuisine and main ingredient fields of a recipe. It then crowds based on this joined value.
http://www.google.com/base/feeds/snippets/-/recipes?crowdby=[x=cuisine(text),y=main+ingredient(text):join(x)+","+join(y)]:2
The following URL-encoded example searches for items with digital and camera that have a price attribute of type float and unit USD. For each result:
If the price is lower than 200, it internally assigns the string group1 to the result.
If the price is between 200 and 500, it assigns the string group2 to the result.
If the price is above 500, it assigns the string group3 to the result.
After it sorts all of the results, it selects the first result from each group, since the :1 at the end of the query specifies that we only want one of each.
Since there are three groups, and we want one result from each group, this query returns three results.
snippets?bq=digital+camera[price(float USD)]&crowdby=[x=price(float USD):
if (max(x) < 500.0) then (if (max(x) < 200.0) then "group1" else "group2") else "group3"]:1
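The effect of this expression can be imitated on the client to see why three results come back. The sketch below is a simplified simulation, not the server's implementation: it assumes each item has a single price and that results arrive already sorted.

```python
def price_group(price):
    """Mirror the nested if/then/else of the crowding expression."""
    if price < 500.0:
        return "group1" if price < 200.0 else "group2"
    return "group3"


def crowd(items, key, max_per_value=1):
    """Keep at most max_per_value items for each distinct key(item)."""
    kept, seen = [], {}
    for item in items:
        value = key(item)
        if seen.get(value, 0) < max_per_value:
            kept.append(item)
            seen[value] = seen.get(value, 0) + 1
    return kept


sorted_prices = [149.0, 199.0, 250.0, 480.0, 799.0]
print(crowd(sorted_prices, price_group))
```

With the sample prices, one item survives from each of the three groups and the rest are dropped.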
The following example shows crowding by location. It queries for housing in a 100 mile radius around Mountain View, California. It defines two groups (group1 and group2) and allows two items per group (:2) in the final result. One group encompasses housing listings less than 50 miles from Mountain View, and the other group encompasses items more than 50 miles away (but still closer than 100 miles due to the base query).
snippets/-/housing?bq=[location(location) : @"1600 Amphitheatre Parkway, Mountain View, California, USA" +100mi] &crowdby=[x=location(location): if (min(dist(x,@"1600 Amphitheatre Parkway, Mountain View, California, USA")) < 50) then "group1" else "group2"]:2
Refer to the Expression Language page for more complex examples of crowdby expressions.
When you query for a reference, you specify the numeric identifier of a reference URL, prefixed by the # sign, as follows:
snippets?bq=[AttributeName (reference):#ItemId]
As a reminder, make sure that you URL-encode # as %23 if necessary.
For example:
snippets?bq=[business(reference):#3316724142945315489]
snippets?bq=[brand(reference):#15670381360807634994]
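The encoding of the # sign can be checked with Python's urllib.parse.quote, as in this short sketch. The helper name reference_query is hypothetical; the item ID is the one from the example above.

```python
from urllib.parse import quote


def reference_query(attribute, item_id):
    """Build an [attribute(reference):#id] restriction."""
    return f"[{attribute}(reference):#{item_id}]"


bq = reference_query("business", 3316724142945315489)
# safe="" makes quote escape every reserved character, including "#".
encoded = quote(bq, safe="")
print(encoded)
```

An unencoded # would be treated by most HTTP clients as the start of a URL fragment, so the rest of the query would never reach the server.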
Query responses always reference an item using its snippets feed URL:
http://www.google.com/base/feeds/snippets/17891817243016304554
You can use the optional query adjustment engine to standardize query results so that they show the canonical form of attributes and attribute values.
Case, underscores, and spaces are always ignored. With query adjustments enabled, Google Base further analyzes its data and identifies instances in which the same type of data is represented differently. For example, we know that gender:f and gender:female probably mean the same thing. In these cases, we consider both values to match the query gender:female.
You may find any of the following in your query results:
Query adjustments are always enabled, and adjusted values are included transparently by default. You can show which query adjustments were made and control the output using the content and adjust query parameters on the items and snippets feeds.
Note: as these values are automatically adjusted to reflect the data in Google Base, they are not guaranteed to be consistent over time.
You should always express queries using the canonical form of attribute values, item types, and attribute names, as listed on the Recommended Attributes page. When query adjustments are enabled, the entire returned item (as opposed to just the attribute on which the query matched) is normalized to match the canonical forms of the item components.
For example, the query snippets?bq=[gender:female] returns an item with the following adjusted value. This adjustment attribute shows that this item was returned because female is considered to be equivalent to F.
<g:gender type='text'> <gm:adjusted_value>female</gm:adjusted_value> F </g:gender>
The simplest case of adjusted query results is for attribute values.
If the tag gm:adjusted_value appears in an attribute tag in search results, it indicates an adjusted attribute value. It contains the adjusted value for the current attribute.
For example, the query snippets?bq=[job industry:healthcare] also returns items with the attribute <g:job_industry>Health Care</g:job_industry>:
<g:job_industry type='text'> <gm:adjusted_value>healthcare</gm:adjusted_value> Health Care </g:job_industry>
In this case, the gm:adjusted_value element means that the values Health Care and healthcare are considered equivalent for query purposes.
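When reading such an attribute programmatically, note that the original value is not the first text node of the element: it follows the gm:adjusted_value child. The sketch below separates the two with xml.etree.ElementTree. The gm namespace URI is the one used in the metadata examples in this document; the g namespace URI is an assumption made for this illustration.

```python
import xml.etree.ElementTree as ET

G = "http://base.google.com/ns/1.0"            # assumed URI for the g: prefix
GM = "http://base.google.com/ns-metadata/1.0"  # URI for the gm: prefix

SAMPLE = f"""<g:job_industry xmlns:g="{G}" xmlns:gm="{GM}" type="text">
  <gm:adjusted_value>healthcare</gm:adjusted_value>
  Health Care
</g:job_industry>"""


def split_adjusted(element):
    """Return (original value, adjusted value or None) for an attribute."""
    adjusted = element.find(f"{{{GM}}}adjusted_value")
    if adjusted is None:
        return (element.text or "").strip(), None
    # The original value may sit before or after the child element.
    original = ((element.text or "") + (adjusted.tail or "")).strip()
    return original, adjusted.text


original, adjusted = split_adjusted(ET.fromstring(SAMPLE))
print(original, "->", adjusted)
```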
Although Google Base provides a list of suggested item types, we do not require users to adhere to those item types. We can, however, improve query results if we can guess that a custom-defined item type maps to a standard item type. For example, we can assume that Business Location has the same meaning as business locations.
For example, the query snippets?bq=[item type: business locations] might retrieve an item with the attribute <g:item_type>Business Location</g:item_type>:
<g:item_type type='text'> <gm:adjusted_value>business locations</gm:adjusted_value> Business Location </g:item_type>
As with item types, Google Base provides a list of suggested attribute names. We can make educated guesses that map the attribute names in a Base item with our canonical attribute names, in order to improve search results.
The query snippets?bq=[cook time] might retrieve an item with the attribute name cooking time, as follows:
<g:cook_time type='text'> <gm:adjusted_name>cooking time</gm:adjusted_name> 3 hr 5 min </g:cook_time>
Here are some complex sample queries. All of these queries are for public
data and go against the /feeds/snippets feed.
Note that these examples are not correctly URL-encoded, in order to make them easier to read.
This query searches on multiple attributes to find a specific job.
http://www.google.com/base/feeds/snippets?bq= [employer: Google] [job type:full-time]
[job function:Marketing]
This query searches for all digital cameras that have 4+ megapixels, cost less than $200, and are new.
http://www.google.com/base/feeds/snippets?bq=digital camera [megapixels >= 4] [price < 200.0] [condition:new]
This query searches for all red convertible Volkswagens manufactured in the past 3 years.
http://www.google.com/base/feeds/snippets?bq=convertible [brand:Volkswagen] [year >=2003]
[color:red] [vehicle type:car]
This query searches for items of type "recipes" whose "cuisine" attribute contains the word "chinese" and whose "course" attribute contains the word "main". It only returns items whose "main ingredient" is chicken.
http://www.google.com/base/feeds/snippets?bq=[item type:recipes] [cuisine:chinese] [course:main] [main ingredient:chicken]
This query searches for all "for sale" items of type "housing" whose location is up to 15 miles away from San Jose, California and whose price is below $700k.
http://www.google.com/base/feeds/snippets?bq= [item type:housing] [listing type:for sale]
[location:@"San Jose, CA, USA"+15mi] [price < 700000 USD]
You can query across an entire item type, such as asking for all jobs items. Notice that this uses a different syntax.
http://www.google.com/base/feeds/snippets/-/jobs
You can also indicate how many results to provide. This example requests only 5 items.
http://www.google.com/base/feeds/snippets?bq=digital+camera&max-results=5
You can use the same query syntax for items in the customer-specific
feed (/feeds/items). This feed returns a list of only your own data items that match the query. Consequently, you may get different results than you would for queries run on the snippets feed.
Here's a sample query. Note the same syntax but a different URL:
http://www.google.com/base/feeds/items?bq=digital+camera
The query /base/feeds/items, run with no query parameters, will return a maximum of 10,000 items that you uploaded.
These could have been uploaded via any mechanism: the Google Base
UI, the feeds, or the API. The returned items have not necessarily
been published.
In addition:
/base/feeds/items/<id> will return a specific published item that you uploaded.
/base/feeds/items?bq=... will return a subset of your published items that match the query.
/base/feeds/items?dry-run=true will execute a test operation on the specified feed. A test
insertion, update, deletion or batch operation only reports success or
error. It does not insert, update, delete or modify anything on the
server.
/base/feeds/customer_id/items can be used by an aggregator to submit items on behalf of another customer. Refer to Getting Started for more information.
For Google Base queries to be really useful, it is best if everyone uses a common set of item types and attributes. You can create your own, but your data is more likely to be accessible if you use the standard sets. The metadata feeds (item types and attributes) provide lists of common item types and attributes.
There are two metadata feeds: itemtypes and attributes. The itemtypes feed contains a complete description of Google Base structures and allows you to query for different types of metadata. The attributes feed provides statistics about how an item type has been used and lists what values have been used frequently for its attributes. Here are some sample queries:
The itemtypes feed contains a complete description of Google Base structures. See the Item Types Feed document for more details. This feed allows you to query for different types of attribute data:
These lists are useful when you are creating new data items and need to know how to structure the data you add to Google Base.
This URL provides a list of all attributes for the US locale:
http://www.google.com/base/feeds/itemtypes/en_US
Here is a sample of a query for the housing item type:
Request: curl "http://www.google.com/base/feeds/itemtypes/en_US/housing" | xmllint --format -
Response:
<entry xmlns="http://www.w3.org/2005/Atom" xmlns:gm="http://base.google.com/ns-metadata/1.0">
<id>http://www.google.com/base/feeds/itemtypes/en_US/housing</id>
<updated>2006-11-30T01:48:44.271Z</updated>
<category scheme="http://base.google.com/categories/itemtypes" term="housing"/>
<title type="text">housing</title>
<content type="text">housing</content>
<link rel="related" type="application/atom+xml" href="http://www.google.com/base/feeds/snippets/-/housing" title="Items of type 'housing'"/>
<link rel="self" type="application/atom+xml" href="http://www.google.com/base/feeds/itemtypes/en_US/housing"/>
<gm:item_type>housing</gm:item_type>
<gm:attributes>
<gm:attribute name="listing type" type="text"/>
<gm:attribute name="property type" type="text"/>
<gm:attribute name="bedrooms" type="number"/>
<gm:attribute name="square footage" type="numberUnit"/>
<gm:attribute name="agent" type="text"/>
<gm:attribute name="year" type="number"/>
<gm:attribute name="bathrooms" type="number"/>
<gm:attribute name="school district" type="text"/>
<gm:attribute name="hoa dues" type="number"/>
<gm:attribute name="location" type="location"/>
<gm:attribute name="price units" type="text"/>
<gm:attribute name="expiration date" type="dateTime"/>
<gm:attribute name="price" type="floatUnit"/>
</gm:attributes>
</entry>
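A client can turn such an entry into a simple name-to-type mapping. The following sketch parses an abbreviated copy of the response above with xml.etree.ElementTree; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"atom": "http://www.w3.org/2005/Atom",
      "gm": "http://base.google.com/ns-metadata/1.0"}

# Abbreviated copy of the housing entry shown above.
SAMPLE_ENTRY = """<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:gm="http://base.google.com/ns-metadata/1.0">
  <gm:item_type>housing</gm:item_type>
  <gm:attributes>
    <gm:attribute name="bedrooms" type="number"/>
    <gm:attribute name="location" type="location"/>
    <gm:attribute name="price" type="floatUnit"/>
  </gm:attributes>
</entry>"""


def item_type_attributes(entry_xml):
    """Map each attribute name in an itemtypes entry to its type."""
    entry = ET.fromstring(entry_xml)
    return {attr.get("name"): attr.get("type")
            for attr in entry.findall("gm:attributes/gm:attribute", NS)}


print(item_type_attributes(SAMPLE_ENTRY))
```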
The attributes feed allows you to get attribute statistics for Google Base data. See the Attributes Feed document for more details.
http://www.google.com/base/feeds/attributes
Here is part of the response for the job industry values for the jobs item type:
Request: curl "http://www.google.com/base/feeds/attributes/job+industry%28text%29?q=jobs" | xmllint --format -
Truncated response:
<entry xmlns="http://www.w3.org/2005/Atom" xmlns:gm="http://base.google.com/ns-metadata/1.0">
<id>http://www.google.com/base/feeds/attributes/job+industry%28text%29Njobs</id>
<updated>2006-11-30T01:50:48.317Z</updated>
<title type="text">job industry(text)</title>
<content type="text">Attribute "job industry" of type text.</content>
<link rel="self" type="application/atom+xml" href="http://www.google.com/base/feeds/attributes/job+industry%28text%29Njobs"/>
<gm:attribute name="job industry" type="text" count="192271">
<gm:value count="23140">healthcare</gm:value>
<gm:value count="23108">it internet</gm:value>
<gm:value count="20303">government</gm:value>
<gm:value count="7611">construction</gm:value>
<gm:value count="7225">accounting</gm:value>
<gm:value count="6112">sales</gm:value>
<gm:value count="3564">other</gm:value>
<gm:value count="2913">financial services</gm:value>
<gm:value count="2899">clerical &amp; administrative</gm:value>
<gm:value count="2690">legal</gm:value>
</gm:attribute>
</entry>
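The value statistics can be read into (value, count) pairs, for example to suggest common values when creating new items. The sketch below parses an abbreviated copy of the response above; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"gm": "http://base.google.com/ns-metadata/1.0"}

# Abbreviated copy of the job industry entry shown above.
SAMPLE_ENTRY = """<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:gm="http://base.google.com/ns-metadata/1.0">
  <gm:attribute name="job industry" type="text" count="192271">
    <gm:value count="23140">healthcare</gm:value>
    <gm:value count="23108">it internet</gm:value>
    <gm:value count="2899">clerical &amp; administrative</gm:value>
  </gm:attribute>
</entry>"""


def value_counts(entry_xml):
    """Return the attribute's total count and its (value, count) pairs."""
    attribute = ET.fromstring(entry_xml).find("gm:attribute", NS)
    pairs = [(value.text, int(value.get("count")))
             for value in attribute.findall("gm:value", NS)]
    return int(attribute.get("count")), pairs


total, pairs = value_counts(SAMPLE_ENTRY)
for value, count in pairs:
    print(f"{value}: {count} of {total}")
```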
Google Base automatically keeps track of the number of impressions, clicks, and page views for each item. This information is stored, respectively, in the gm:impressions, gm:clicks, and gm:page_views elements. You can display this information by querying the items feed and setting the content parameter to attributes,meta.
Statistics are only available for your own items. Thus, you can only get them by querying the items feed. Refer to the Feeds Reference for more information.