Google Search Appliance software version 6.0
Posted June 2009
The Google Search Appliance enables you to provide universal search to your users. You can get the most from your Google Search Appliance by using some or all of its many features to fine-tune and enhance universal search. Become familiar with the search appliance's features by reading this document, then apply the features that best suit your search solution.

As with other software system deployments, planning is the first and most important phase. When you deploy a universal search solution, the following planning activities can help make your deployment a success:
The following sections briefly describe each of these activities.
Before you deploy a universal search solution, identify key business processes and the content sources that they currently use. Through this activity, you can determine which content sources you want to include or exclude from the search appliance index.
You might ask the following questions about business processes:
An example of a business process that you might support with universal search is customer support. Typically, a customer support organization's success is measured by the speed with which a technical issue is resolved. To resolve an issue, customer support might search several content sources, including Frequently Asked Questions (FAQs), product documentation, and a trouble ticket database. Providing universal search of these content sources to customer support would significantly reduce call resolution time and provide immediate cost benefits to an organization. Another possibility would be to provide universal search of the same content sources to customers, which might reduce the number of technical issues that reach customer support.
After you identify key business processes, also identify:
Through this activity, you can determine methods that you need to use to index and control access to content. You can also determine if you want to serve results from real-time business applications.
Typically, content sources span a variety of repository types, including web portals and applications, file systems, content management systems, and databases. Each content source might also have security controls in place.
Some key content sources may be outside the scope of standard business processes. To ensure that you have considered all content that should be indexed, identify other key content sources that can contribute to the overall information retrieval needs.
The Google Search Appliance can index content in a variety of public and secure content sources. For each type of content source, identify the method you need to use for indexing it. The following table lists types of content sources and the method that the search appliance uses to index each one. To read more about each indexing method, refer to the section listed in the Reference column.
| Type of Content | Indexing Method | Reference |
|---|---|---|
| Public content in web sites and file shares | Crawling | Crawling Public Content |
| Secure content in web sites and file shares | Secure crawling | Crawling and Serving Controlled-Access Content |
| Content in content management systems | Traversal | Indexing Content in Non-Web Repositories |
| Hard-to-find content | Feeds | Indexing Hard-to-Find Content |
| Content in databases | Database crawling | Indexing Database Content |
| Content in Google Apps | Crawling | Indexing Google Apps Content |
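As a planning aid, the table above can be captured as a simple lookup so that each content source in your inventory is paired with its indexing method. The following is a minimal sketch for illustration only: the `INDEXING_METHODS` mapping, its category keys, the `plan_indexing` helper, and the example source names are all hypothetical and are not part of the search appliance.

```python
# Hypothetical planning helper: maps the content types from the table above
# to the indexing method that the table lists for each one.
INDEXING_METHODS = {
    "public_web_or_file_share": "Crawling",
    "secure_web_or_file_share": "Secure crawling",
    "content_management_system": "Traversal",
    "hard_to_find": "Feeds",
    "database": "Database crawling",
    "google_apps": "Crawling",
}


def plan_indexing(sources):
    """Return {source name: indexing method} for an inventory of sources.

    `sources` maps a source name to one of the keys in INDEXING_METHODS.
    Unknown content types raise ValueError so gaps surface during planning.
    """
    plan = {}
    for name, content_type in sources.items():
        if content_type not in INDEXING_METHODS:
            raise ValueError(
                f"No indexing method known for {content_type!r} ({name})"
            )
        plan[name] = INDEXING_METHODS[content_type]
    return plan


# Example inventory (made-up source names).
inventory = {
    "Support FAQ site": "public_web_or_file_share",
    "Trouble ticket database": "database",
    "Product documentation repository": "content_management_system",
}
plan = plan_indexing(inventory)
```

Raising an error for an unrecognized content type, rather than skipping it, makes it harder to overlook a content source while planning.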
Your organization may require you to restrict access to certain enterprise content. For example, there may be personnel records that only certain users are authorized to view. In this instance, it is important to restrict access to these records to appropriate users.
The Google Search Appliance supports both indexing and serving of controlled-access content. For indexing, you can require the search appliance to provide credentials before crawling particular locations. For serving, you can require authentication and authorization before allowing a user to see controlled-access content.
The Google Search Appliance works with the access-control methods already in place in your organization. This ensures that users see search results only for documents that they are permitted to access.
The following table lists the access-control methods that the search appliance supports and whether the methods are supported for crawl, serve, or both.
| Method | Crawl | Serve |
|---|---|---|
| HTTP Basic | X | X |
| NTLM HTTP | X | X |
| LDAP (Lightweight Directory Access Protocol) | X | |
| Forms Authentication | X | X |
| x.509 Certificates | X | X |
| Integrated Windows Authentication/Kerberos | X | |
| SAML Service Provider Interfaces (SPIs) | X | |
For more information about this topic, refer to Crawling and Serving Controlled-Access Content.
The Google Search Appliance is capable of returning results based on searching the text of the content (that is, an unstructured query). However, there might be situations where it would be more efficient to have the search appliance return structured data in real time.
For example, a salesperson might search for product information by entering a Stock Keeping Unit (SKU) number or product ID. In addition to retrieving product literature through a full-text, unstructured search for the SKU or ID, the search appliance can also issue a real-time database query to an inventory management database to retrieve and display the current inventory count for that product. This approach is much more efficient than crawling and indexing the inventory management system, because the inventory counts could change more frequently than the index is refreshed.
This real-time query feature is called a "OneBox module." For more information about this topic, refer to Providing Real-Time Connectivity to Business Applications.
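The idea behind a real-time lookup can be sketched as follows. This is not the OneBox module interface; it is a minimal illustration that uses an in-memory SQLite database as a stand-in for an inventory management system. The SKU format, the table layout, and the function names (`make_inventory_db`, `realtime_inventory_lookup`) are assumptions made for the example.

```python
import re
import sqlite3

# Assumption: SKUs look like "SKU-" followed by digits. Real formats differ.
SKU_PATTERN = re.compile(r"^SKU-\d+$")


def make_inventory_db():
    """Create an in-memory stand-in for an inventory management database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE inventory (sku TEXT PRIMARY KEY, on_hand INTEGER)"
    )
    conn.executemany(
        "INSERT INTO inventory VALUES (?, ?)",
        [("SKU-1001", 42), ("SKU-1002", 0)],
    )
    return conn


def realtime_inventory_lookup(conn, query):
    """Return the current count for a SKU query, or None if not applicable.

    The lookup runs at query time, so the answer reflects the database as it
    is now rather than as it was when content was last indexed.
    """
    sku = query.strip().upper()
    if not SKU_PATTERN.match(sku):
        return None  # Not a SKU: only ordinary full-text results apply.
    row = conn.execute(
        "SELECT on_hand FROM inventory WHERE sku = ?", (sku,)
    ).fetchone()
    return None if row is None else row[0]


conn = make_inventory_db()
before = realtime_inventory_lookup(conn, "sku-1001")
# Simulate two units being sold between searches.
conn.execute("UPDATE inventory SET on_hand = on_hand - 2 WHERE sku = 'SKU-1001'")
after = realtime_inventory_lookup(conn, "sku-1001")
```

Because the count is read when the user searches, the second lookup reflects the change immediately, which an index refreshed on a schedule could not guarantee.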
Along with identifying key business processes and content sources, identify the users who perform related tasks and use the content sources. Consider the following questions about users in your organization:
Through this activity, you can determine the search experience that you want to offer your users, as well as the appearance of the user interface.
The Google Search Appliance provides features that enable you to fine-tune the end-user search experience. As you determine your users' search requirements, you can use these requirements to help determine the search experience.
An important question to ask is: Does your organization contain multiple user groups, each with its own search requirements? To meet the requirements of each user group, you might consider providing a different search experience for each user group.
You can implement multiple search experiences by using a search appliance feature called "front ends." With each front end, users search the same corpus (a set of data or documents stored in a repository that is searchable by users). However, each front end:
Another consideration is the scope of the search: is it most common for users to search all of the content sources, or are there situations where users would want to search only a subset of content sources? Because the Google Search Appliance can support indexing up to 30 million documents, it's possible to use one application to serve the search needs of all users. But it's also common for different user groups to search different subsets of the index.
For example, customer support might search a trouble ticket database, a bugs database, product documentation, FAQs, and product data sheets. Sales might search marketing literature, a competitive database, and the same product data sheets that customer support searches. To meet the different search scopes of the different user groups, you could segment the index using collections and provide different front ends for each group. By searching a collection, users get relevant search results more quickly than by searching the entire index.
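The relationship between collections and front ends in this example can be modeled in a few lines. The sketch below assumes, purely for illustration, that a collection is defined by a list of URL prefixes and that each front end is tied to one collection. The collection names, front end names, URLs, and the `search` function are hypothetical and do not represent the search appliance's configuration or API.

```python
# Hypothetical model: a collection is a named list of URL prefixes, and each
# front end is tied to one collection. All names and URLs are made up.
COLLECTIONS = {
    "support": [
        "http://tickets.example.com/",
        "http://bugs.example.com/",
        "http://docs.example.com/",
        "http://www.example.com/datasheets/",
    ],
    "sales": [
        "http://marketing.example.com/",
        "http://competitive.example.com/",
        "http://www.example.com/datasheets/",
    ],
}

FRONT_ENDS = {"support_frontend": "support", "sales_frontend": "sales"}


def in_collection(url, collection):
    """True if the URL starts with any prefix that defines the collection."""
    return any(url.startswith(prefix) for prefix in COLLECTIONS[collection])


def search(front_end, term, index):
    """Return URLs containing `term`, limited to the front end's collection."""
    collection = FRONT_ENDS[front_end]
    return [
        url
        for url, text in index.items()
        if in_collection(url, collection) and term.lower() in text.lower()
    ]


# A tiny stand-in for the full index: URL -> document text.
index = {
    "http://tickets.example.com/4711": "Widget fails to start",
    "http://marketing.example.com/widget-brochure": "Widget brochure",
    "http://www.example.com/datasheets/widget": "Widget data sheet",
}
support_hits = search("support_frontend", "widget", index)
sales_hits = search("sales_frontend", "widget", index)
```

Both groups see the shared data sheet, but the trouble ticket appears only for customer support and the brochure only for sales, even though all three documents live in the same index.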
For more information about other features that you can use, refer to Using Features to Enhance the Search Experience.
The Google Search Appliance supports customization and deployment of one or more user interfaces, which are basically search and results pages. You can determine the user interface by identifying user groups in your organization and the types of search options they might require.
Because the Google Search Appliance provides highly relevant results, a simple search box is normally sufficient for a large number of queries. However, there may be situations that call for advanced search options. For example, a customer support technician might want to search for information based upon product name, version, language, or other attributes. To meet the requirements of different user groups, you might need to provide different front ends, each with its own search options.
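The difference between a simple search box and an advanced search form can be illustrated with a small filter. This is a sketch under stated assumptions, not the search appliance's query syntax: the `advanced_search` function, the document dictionaries, and the attribute names (`product`, `version`, `language`) are hypothetical.

```python
def advanced_search(documents, text, **required):
    """Return titles of documents that contain `text` and match every attribute.

    `documents` is a list of dicts with a "title", a "body", and arbitrary
    metadata keys. `required` holds the attribute values chosen in the form;
    attributes left blank (None or "") are ignored.
    """
    wanted = {key: value for key, value in required.items() if value}
    hits = []
    for doc in documents:
        if text.lower() not in doc["body"].lower():
            continue
        if all(doc.get(key) == value for key, value in wanted.items()):
            hits.append(doc["title"])
    return hits


# Made-up documents with metadata attributes.
docs = [
    {"title": "Install guide 2.0 (en)", "body": "How to install the widget",
     "product": "Widget", "version": "2.0", "language": "en"},
    {"title": "Install guide 1.0 (en)", "body": "How to install the widget",
     "product": "Widget", "version": "1.0", "language": "en"},
    {"title": "Installationsanleitung 2.0 (de)", "body": "Widget install notes",
     "product": "Widget", "version": "2.0", "language": "de"},
]
simple = advanced_search(docs, "install")
narrowed = advanced_search(docs, "install", version="2.0", language="en")
```

With no attributes supplied, the function behaves like a simple search box; each attribute a technician fills in narrows the result list further.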
You can use the information that you gather about user groups and the types of search options they require as input when you change the user interface. For more information about this topic, refer to Customizing the User Interface.