The Getting Started page showed you how you can use the control panel to create a custom search engine and gave you an idea of what you need to specify to create it. This page introduces you to the basic concepts behind Custom Search and gets you ready to create a more powerful search engine with customized search results.
If the control panel does not give you the level of customization that you need, consider using the Custom Search XML or TSV format, which gives you more control, flexibility, and access to more powerful features.
If you want to use the Custom Search API, start with the specification files generated by the control panel. Do not create them from scratch.
Extensible Markup Language or XML is a general-purpose markup language. It is text with tags that you can read. For example, the Custom Search XML format includes the following tags: <Context> </Context> and <LookAndFeel> </LookAndFeel>.
As with any XML file, your Custom Search specifications must follow XML syntax (<element attribute="value">content</element>) and be well-formed. XML has the following rules:
- XML documents usually begin with an XML declaration (<?xml version="1.0"?>), but the Custom Search API doesn't require it. You can preface your Custom Search XML with a declaration, but you don't have to.
- Every element must have an opening tag (<tag>) and a closing tag (</tag>).
- Elements must be properly nested. For example, this is incorrect: <sandwich><filling> peanut butter</sandwich></filling>. Instead, it should be like: <sandwich><filling> peanut butter</filling></sandwich>.
- Attribute values must be enclosed in quotation marks (<element attribute="value">).
- Attributes belong in the opening tag (<element attribute="value">), not the closing tag (</element>).

You can write notes for yourself using comment tags (<!-- your comment here -->), and Custom Search will not parse that text as XML code. Apart from writing reminders or descriptions, you can use comments to temporarily put some XML code out of commission (perhaps because you want to experiment with certain effects or troubleshoot issues). However, these comments are not preserved in the files that you download from the control panel. If you want to keep the comments, keep a copy of your commented XML files even after you upload them to the control panel.
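Before uploading a file, you may want to confirm that it is well-formed. The following is a minimal sketch using Python's standard `xml.etree.ElementTree` parser; the helper name `is_well_formed` is hypothetical, and the check covers only XML syntax, not whether the elements are valid Custom Search elements.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as well-formed XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Improperly nested: </sandwich> closes before </filling>.
bad = "<sandwich><filling> peanut butter</sandwich></filling>"
# Properly nested: the inner element closes first.
good = "<sandwich><filling> peanut butter</filling></sandwich>"

print(is_well_formed(bad))   # False
print(is_well_formed(good))  # True
```

A parser like this rejects mismatched tags and unquoted attribute values, which are the most common slips when editing XML by hand.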
You can use a simple text editor to create and edit XML files. Just save the text file with the file extension .xml (for example, cse_badminton.xml).
The Custom Search XML format is not hard to follow, but if you feel uncomfortable using it, you can use the Custom Search TSV (tab-separated values) format. As the name implies, a TSV file is a plain-text file that includes lines of fields (strings of characters) that are separated from each other by single tab stops. You can use a simple text editor or a spreadsheet editor to create and edit TSV files. Just save the text file with the file extension .tsv (for example, cse_bicycles.tsv).
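If you generate TSV files programmatically, a sketch like the following shows the basic idea: each line holds fields separated by single tab characters. The column names and rows here are hypothetical placeholders for illustration, not the actual Custom Search TSV schema.

```python
import csv
import io

# Hypothetical columns and rows, for illustration only.
rows = [
    ["URL", "Label"],
    ["www.example.com/bikes/*", "_cse_example"],
    ["www.example.org/cycling/*", "_cse_example"],
]


def to_tsv(table):
    """Serialize a list of rows as tab-separated text, one row per line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerows(table)
    return buffer.getvalue()


def from_tsv(text):
    """Parse tab-separated text back into a list of rows."""
    return list(csv.reader(io.StringIO(text), delimiter="\t"))


tsv_text = to_tsv(rows)
print(tsv_text)
```

To produce a file, you could write `tsv_text` to a path ending in .tsv (for example, cse_bicycles.tsv).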
A custom search engine has two main components: the context file and the annotations file.
You can create an annotations file for each context file or you can have one communal annotations file shared by all your search engines. In either case, when you download the annotations file from the control panel, you will get a single annotations file that combines all the annotations from different search engines.
To learn more about context files, see Defining Your Search Engine Specifications. To learn more about annotations, see Selecting Sites to Search. For more information about selecting the most appropriate file format for your search engine, see Choosing the Right Format for Your Search Engine.
The context file is the specification for the search engine. It defines the infrastructure of your search engine, such as how the results page should look, what the labels are, how pages should be ranked, and so on. The annotations file, on the other hand, governs the coverage of your search engine. It defines what sites should be searched.
The context file does not make any reference to which annotations file to use, and the annotations file makes no reference to the context file either. So how does Custom Search associate the context with the annotations? It does so through labels. The context file includes labels that identify the search engine, and you tag each annotation in the annotations file with search engine labels. If you change the name of the label in the context file, you have to change all the annotations that have been tagged with that label.
Although you can upload multiple annotations files, when you download them through the control panel, Custom Search merges all your annotations files into a single annotations file. Having a single annotations file for multiple search engines (with their own separate context files) simplifies your work and eliminates replication. It enables you to list sites only once, yet have the flexibility to customize the same site for various search engines. For example, one search engine could restrict its search to some sites, another could eliminate those sites, and yet another could promote those sites.
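To illustrate why a single annotations file lets you list each site only once, here is a rough sketch that combines several annotations documents and collects all labels for the same site under one entry. This is not how Custom Search itself performs the merge. The `<Annotations>` wrapper element, the label names, and the sites are assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def merge_annotations(*documents: str) -> ET.Element:
    """Combine annotations documents, keeping one <Annotation> per site."""
    merged = ET.Element("Annotations")
    by_site = {}
    for doc in documents:
        for ann in ET.fromstring(doc.strip()).iter("Annotation"):
            site = ann.get("about")
            target = by_site.get(site)
            if target is None:
                # First time this site is seen: copy its attributes as-is.
                target = ET.SubElement(merged, "Annotation", dict(ann.attrib))
                by_site[site] = target
            existing = {label.get("name") for label in target.findall("Label")}
            for label in ann.findall("Label"):
                name = label.get("name")
                if name not in existing:
                    ET.SubElement(target, "Label", name=name)
                    existing.add(name)
    return merged


# Hypothetical annotations for two search engines that share one site.
engine_a = """
<Annotations>
  <Annotation about="www.example.com/*">
    <Label name="_cse_engine_a"/>
  </Annotation>
</Annotations>
"""
engine_b = """
<Annotations>
  <Annotation about="www.example.com/*">
    <Label name="_cse_exclude_engine_b"/>
  </Annotation>
  <Annotation about="www.example.org/*">
    <Label name="_cse_engine_b"/>
  </Annotation>
</Annotations>
"""

merged = merge_annotations(engine_a, engine_b)
print(ET.tostring(merged, encoding="unicode"))
```

In the merged result, www.example.com/* appears once but carries both labels, so one engine can include the site while the other excludes it.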
In the context file, the search engine label that is generated by the control panel looks something like:
<BackgroundLabels>
<Label name="_cse_hwbuiarvsbo" mode="FILTER"/>
<Label name="_cse_exclude_hwbuiarvsbo" mode="ELIMINATE"/>
</BackgroundLabels>
When this label is associated with a site in the annotations file, it looks something like:
<Annotation about="code.google.com/*" score="1">
<Label name="_cse_hwbuiarvsbo"/>
</Annotation>
Labels and how they should be used in annotations are discussed in greater detail in the subsequent pages.
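The label-based association described above can be sketched in a few lines of Python. The sketch reads the label names and modes from a context snippet, then finds the annotations tagged with any of those labels. The `<Annotations>` wrapper, the second annotation, and the function names are assumptions added for illustration.

```python
import xml.etree.ElementTree as ET

CONTEXT_XML = """
<BackgroundLabels>
  <Label name="_cse_hwbuiarvsbo" mode="FILTER"/>
  <Label name="_cse_exclude_hwbuiarvsbo" mode="ELIMINATE"/>
</BackgroundLabels>
"""

# The second annotation is hypothetical and belongs to a different engine.
ANNOTATIONS_XML = """
<Annotations>
  <Annotation about="code.google.com/*" score="1">
    <Label name="_cse_hwbuiarvsbo"/>
  </Annotation>
  <Annotation about="www.example.com/*" score="1">
    <Label name="_cse_some_other_engine"/>
  </Annotation>
</Annotations>
"""


def engine_labels(context_xml: str) -> dict:
    """Map each label name in the context snippet to its mode."""
    root = ET.fromstring(context_xml.strip())
    return {label.get("name"): label.get("mode") for label in root.iter("Label")}


def matching_annotations(annotations_xml: str, labels: dict) -> list:
    """Return (site, label, mode) for annotations tagged with a known label."""
    root = ET.fromstring(annotations_xml.strip())
    matches = []
    for ann in root.iter("Annotation"):
        for label in ann.findall("Label"):
            name = label.get("name")
            if name in labels:
                matches.append((ann.get("about"), name, labels[name]))
    return matches


labels = engine_labels(CONTEXT_XML)
matches = matching_annotations(ANNOTATIONS_XML, labels)
print(matches)
```

Because the match is made purely on the label name, renaming a label in the context file without updating the annotations would leave `matches` empty.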
Creating advanced custom search engines involves the following steps:
To work on an XML file, download the XML specification from the control panel. Don't start a file from scratch. Do the following:
Warning: Do not confuse context files with annotations files.
You can choose to download the files to your hard drive or view them in another browser window or tab.
Save the file with the file extension .xml (for example, cx_global.xml).
If you do not make a copy and the version that you edited does not work properly, you will need to debug your file or recreate your search engine all over again. Not fun.
Note: Do not skip this step. It takes only a little effort, yet it could save you from some tears.
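If you prefer to script the backup, a minimal sketch might look like the following. The `.bak` suffix and the `back_up` helper are arbitrary choices for this example, and the file content is a placeholder rather than a real context file.

```python
import shutil
import tempfile
from pathlib import Path


def back_up(path: Path) -> Path:
    """Copy path to a sibling file with '.bak' appended; return the copy's path."""
    backup = path.with_name(path.name + ".bak")
    shutil.copy2(path, backup)
    return backup


# Demonstrate on a throwaway file so that nothing real is touched.
with tempfile.TemporaryDirectory() as tmp:
    original = Path(tmp) / "cx_global.xml"
    original.write_text("<placeholder/>", encoding="utf-8")
    backup = back_up(original)
    backup_name = backup.name
    backup_matches = (
        backup.read_text(encoding="utf-8") == original.read_text(encoding="utf-8")
    )

print(backup_name, backup_matches)
```

Run the helper on your downloaded file before you start editing, and restore from the copy if the edited version stops working.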
Before you start creating your custom search engine, determine which format best suits your needs. You don't want to select a format that is more powerful and complex than what you need, nor do you want to use one that you will quickly outgrow.
Use the following table to pick the appropriate format.
| If you want to create... | Use this format... | Because... | But be aware of the limitations, which are... | For more information, see... |
|---|---|---|---|---|
| One or few search engines with a small number of sites | Control panel | You can quickly create your custom search engine by filling out text boxes instead of creating files with a text editor and uploading the files. | The control panel is mostly useful for familiarizing yourself with Custom Search and creating search engines with few sites. If you want to really customize your search engine or add a great number of sites, you might find it quite limiting. | Getting Started |
| Complex search engines that use lots of sites, use feeds, or are programmatically created | Context file and annotations files | The Custom Search files let you have a greater level of control over your search engines, and make the tasks of defining and managing sites a lot easier. Even though you plan to create your search engine using context and annotations files, it's still a good idea to familiarize yourself with the control panel. Tour the different tabs and play with different settings to get a better feel for how Custom Search works. The Preview tab lets you instantly view the results of your experimentation. | The more you customize your search engine, the more complex it becomes. You have to learn the Custom Search elements and attributes, which are not hard to pick up, but they do require you to invest some time. You will have to read the rest of the developer guide, which is not the most exciting reading material, unfortunately. | Defining Your Search Engine Specifications and Selecting Sites to Search |
| Search engines that show no ads and let you have the greatest level of control | Site Search (business edition) | In addition to the features available to the control panel and the Custom Search files, you have an even greater level of control over the ranking and look-and-feel of the search results page. The pages do not display ads, and you can configure them to not display Google branding. It also comes with technical support. | It is not free. This edition starts at $100. To take full advantage of the niftiest features, you have to learn the WebSearch Protocol. | Google Site Search (Introduction) and Google WebSearch Protocol Reference |
Now that you've picked up the basic concepts behind the core components in a custom search engine, you can start creating a context file that specifies your search engine.