Google Search Appliance software version 6.0
Posted June 2009
This document contains the information that you need to set up two instances of the Google Search Appliance for load balancing or failover.
On software version 6.0, another approach is to use index replication, a beta feature. For more information on index replication, see Configuring Distributed Crawling and Index Replication.
You can configure two search appliances to provide load balancing or failover.
Load balancing distributes network traffic of different types to the appropriate applications. Load balancing can be used to distribute network traffic of a particular type to two or more instances of an application, dividing the work load between the instances. A load balancer is a software or hardware application that distributes network traffic. When you configure two Google Search Appliance systems for load balancing, search queries are distributed between the two systems.
Failover configurations typically involve two instances of an application or a particular type of hardware. The first instance, sometimes called the primary instance, responds to search queries. If the first instance fails, the second instance, sometimes called the secondary or standby instance, starts responding to search queries.
This document provides the information you need to create load balancing or failover configurations using two Google Search Appliance systems.
Google does not recommend specific load balancers to use with the search appliances. The configurations described in this document are expected to work with any equipment that complies with networking RFCs.
In load balancing configurations, Google recommends that you set up the load balancer to use sticky sessions, in which all search queries from a particular user are directed to the same search appliance. Sticky sessions ensure that each client receives consistent results. Some load balancers use sticky sessions as the default setting. See the load balancer's documentation for complete information.
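As a rough illustration of what sticky sessions do, the following Python sketch maps each client to one search appliance by hashing the client's IP address. This is only one possible way to implement stickiness; real load balancers may use cookies or connection tables instead. The function name and the appliance addresses (taken from the reserved documentation range) are assumptions made for this example and do not come from any load balancer or from the search appliance.

```python
import hashlib

# Hypothetical appliance addresses (documentation-range IPs, not from the source).
APPLIANCES = ["192.0.2.11", "192.0.2.12"]


def pick_appliance(client_ip, appliances=APPLIANCES):
    """Return the appliance that should serve every query from client_ip.

    Hashing the client address gives a stable choice, so the same user
    keeps reaching the same search appliance and sees consistent results.
    """
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(appliances)
    return appliances[index]


if __name__ == "__main__":
    # Repeated queries from one client always map to the same appliance.
    print(pick_appliance("10.0.0.5"))
    print(pick_appliance("10.0.0.5"))
```

With a simple hash like this, the mapping changes if the list of appliances changes, which is acceptable for a two-appliance sketch but is one reason production load balancers often use other techniques.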
You can set up the search appliances in the following configurations:
You can use these configurations regardless of content location.
The following sections discuss the three configurations in detail and include diagrams of each configuration. The diagrams use graphics of the Google Search Appliance model GB-1001, but any of the configurations can be used with other Google Search Appliance models.
Each configuration includes the following:
The computer in the illustrations represents an end point of the required network path to the content.
This configuration provides load balancing by distributing incoming search queries to the search appliances. The search appliances each respond to about half of incoming search queries.
The search appliances are set up with a physical connection between the two search appliances and the load balancer, or with the search appliances located on the same network or subnet as the load balancer.
The following diagram illustrates the components in this configuration. The legend that follows the diagram describes the components and lists the IP address of each component.

If you have a GB-5005 or GB-8008, configure each search appliance with three IP addresses, as required. For example, you might use the following IP addresses for the first search appliance:
You might use the following IP addresses for the second search appliance:
This configuration requires a load balancer and switch.
This configuration is easy to set up.
If one search appliance goes offline, the load balancer directs all traffic to the remaining search appliance.
During the crawl process, crawl traffic is directed through the load balancer, which creates the potential for network saturation. Therefore, it's important that you schedule the crawl on the two search appliances to take place at different times of day.
The load balancer is a single point of failure in this configuration. If the load balancer fails, you must have physical access to the load balancer and search appliances to restore search capabilities, because the load balancer must be fixed or the IP addresses of the search appliances must be changed.
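The behavior described for this configuration, in which each search appliance answers about half of the queries and the remaining appliance takes all traffic when the other goes offline, can be sketched as a simple round-robin selector. This is a simplified model written for illustration, not the logic of any particular load balancer; it ignores sticky sessions, and the class name, method names, and appliance labels are hypothetical.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Toy model of a load balancer in front of two search appliances."""

    def __init__(self, appliances):
        self.appliances = list(appliances)
        self.healthy = set(self.appliances)
        self._ring = cycle(self.appliances)

    def mark_down(self, appliance):
        # Called when monitoring decides the appliance has stopped responding.
        self.healthy.discard(appliance)

    def mark_up(self, appliance):
        if appliance in self.appliances:
            self.healthy.add(appliance)

    def next_appliance(self):
        """Return the appliance that should receive the next search query."""
        if not self.healthy:
            raise RuntimeError("no search appliance is available")
        # Walk the ring at most once, skipping appliances marked as down.
        for _ in range(len(self.appliances)):
            candidate = next(self._ring)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no search appliance is available")


if __name__ == "__main__":
    balancer = RoundRobinBalancer(["appliance-1", "appliance-2"])
    print([balancer.next_appliance() for _ in range(4)])
    balancer.mark_down("appliance-1")
    print([balancer.next_appliance() for _ in range(3)])
```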
This configuration provides load balancing by distributing incoming search queries to the search appliances. The search appliances each respond to about half of incoming search queries.
The search appliances are logically downstream from the load balancer, but potentially on different networks or subnets from the load balancer.
The following diagram illustrates the components of a virtual load balancing configuration. The legend that follows the diagram describes the components and lists the IP address of each component.

If you have a GB-5005 or GB-8008, configure each search appliance with three IP addresses, as required. For example, you might use the following IP addresses for the first search appliance:
You might use the following IP addresses for the second search appliance:
This configuration requires a load balancer and switch. The load balancer must be able to proxy traffic to external virtual IP addresses.
In this configuration, there is no single point of failure.
Crawl traffic is not directed through the switch, so there is less risk of network saturation than in the first configuration.
If one search appliance goes offline, the load balancer directs all traffic to the remaining search appliance.
This configuration requires a load balancer that supports load balancing or traffic proxying.
Access control lists (ACLs) for the router are more complex than in other configurations, because you must create rules for additional IP addresses.
Because of how search queries are handled, there is twice as much network traffic between the load balancer and the switch as in other configurations.
This configuration uses two search appliances configured as primary and secondary appliances. All queries are directed to the primary search appliance. If the primary appliance becomes unavailable, search fails over to the secondary search appliance.
The following diagram illustrates the components of a configuration supporting DNS switchover. The legend that follows the diagram describes the components and lists the IP address of each component.

If you have a GB-5005 or GB-8008, configure each search appliance with three IP addresses, as required. For example, you might use the following IP addresses for the first search appliance:
You might use the following IP addresses for the second search appliance:
This configuration requires a DNS server and switch.
This configuration is easy to set up and has no special hardware requirements. There is no risk of network saturation during crawls. The two search appliances can be located anywhere, physically or logically.
This is not a true load balancing configuration. The configuration provides only failover.
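The failover decision in a DNS switchover setup can be sketched as follows. This document does not describe how the DNS record is actually switched, so the sketch only models the decision about which appliance the search host name should point to. The class name, the failure threshold, the fail-back behavior, and the IP addresses (from the reserved documentation range) are all assumptions made for illustration.

```python
class DnsSwitchover:
    """Decide which appliance address the search host name should resolve to."""

    def __init__(self, primary, secondary, max_failures=3):
        self.primary = primary
        self.secondary = secondary
        # Assumed policy: switch only after several consecutive failed checks,
        # so that one dropped request does not trigger a failover.
        self.max_failures = max_failures
        self.failures = 0

    def record_check(self, primary_ok):
        """Record one status check of the primary and return the target address."""
        if primary_ok:
            self.failures = 0
        else:
            self.failures += 1
        return self.current_target()

    def current_target(self):
        if self.failures >= self.max_failures:
            return self.secondary
        return self.primary


if __name__ == "__main__":
    # Hypothetical addresses for the primary and secondary appliances.
    switch = DnsSwitchover("192.0.2.21", "192.0.2.22", max_failures=2)
    for ok in (True, False, False, True):
        print(ok, switch.record_check(ok))
```

In this sketch the target returns to the primary as soon as a check succeeds. Whether to fail back automatically is a policy choice that depends on your environment.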
In all of these configurations, the load balancer needs to monitor the status of the search appliances. If a search appliance stops running, the load balancer stops sending requests to the search appliance.
To monitor the status of the search appliance, configure the load balancer to send periodic search requests to the search appliance. After each search request, close the connection to the search appliance. See the documentation for your load balancer for information on how to send HTTP requests from the load balancer to the search appliance.
Do not configure the load balancer to monitor the status of a search appliance by sending TCP packets to port 80 of the search appliance, making a connection, and sending a reset. Using TCP packets this way can cause a search appliance to become unresponsive.
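The kind of check recommended above, a complete HTTP search request followed by closing the connection, can be sketched in Python using only the standard library. This is an illustration of the approach, not a configuration for any specific load balancer. The function name, the `/search` path, the `q` parameter, the test query, and the timeout are assumptions; use the request format that your search appliance and load balancer documentation specify.

```python
import http.client
import urllib.error
import urllib.parse
import urllib.request


def appliance_is_healthy(host, port=80, query="test", timeout=5.0):
    """Send one search request over HTTP and report whether it succeeded.

    A full HTTP request is used rather than a bare TCP connect-and-reset,
    and the connection is closed after the response has been read.
    """
    params = urllib.parse.urlencode({"q": query})
    url = f"http://{host}:{port}/search?{params}"  # assumed search URL format
    request = urllib.request.Request(url, headers={"Connection": "close"})
    try:
        # The "with" block closes the connection once the response is read.
        with urllib.request.urlopen(request, timeout=timeout) as response:
            response.read()
            return response.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Refused connections, timeouts, and HTTP error statuses all count as down.
        return False


if __name__ == "__main__":
    # Hypothetical appliance address from the documentation range.
    print(appliance_is_healthy("192.0.2.11", timeout=2.0))
```

A monitor would call a function like this at a fixed interval for each search appliance and stop routing queries to any appliance for which it returns False.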