Google Search Appliance

Configuring Dynamic Scalability

Google Search Appliance software version 6.0
Posted June, 2009
Revised August, 2009: Removed incorrect information about using forms-based authentication with user impersonation.

This guide contains the information you need to configure dynamic scalability. Dynamic scalability is a Google Search Appliance feature in which a group of search appliances is configured so that a body of documents spread out over several search appliances can be searched by a single search query.

This document is for you if you are a search appliance administrator, network administrator, or another person who configures search appliances or networks. You need to be familiar with configuring crawl, serve, front ends, and security on the Google Search Appliance.

Contents

  1. Introduction to Dynamic Scalability
    1. How Search and Serve Work
    2. Determining Search Appliance Roles
    3. Using Collections to Direct User Searches
    4. How Crawling and Indexing Work
    5. How OneBox Modules Work
  2. About Security
  3. About Authorization
    1. Using Authorization on the Primary Search Appliance
    2. Using Delegated Authorization
  4. About Remote Collections
  5. About Front Ends
  6. About Crawl Patterns
  7. About Database Crawling
  8. About Timeout Intervals and Scoring Bias
  9. Dynamic Scalability Checklist
  10. Setting up Dynamic Scalability Configurations
    1. Adding or Deleting Nodes
  11. Troubleshooting
    1. Using the Federation Network Stats and Federation Diagnostic Pages to Find Problems
    2. Users See 404 Errors After Clicking Results
    3. Results from Secondary Search Appliances are Not Available on Primary Search Appliance
    4. Unexpected Authorization Behavior

Introduction to Dynamic Scalability

Dynamic scalability is a Google Search Appliance feature in which a group of search appliances is configured so that a body of documents spread out over several search appliances can be searched by a single search query. The search appliances in the configuration each crawl a different set, or corpus, of documents. Each search appliance is set up with its own collections, front ends, and other administrator-configurable features.

Configure dynamic scalability when you need to provide search and index services for a larger corpus of documents than a single Google Search Appliance can accommodate. For example, if you need to index 40 million documents, you might use four instances of the Google Search Appliance GB-7007, with each search appliance licensed for 10 million documents. Any model of the Google Search Appliance running software version 6.0 or later can be configured to participate in a dynamic scalability configuration. The configuration may include different search appliance models, provided they are all running the same software version.
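The sizing arithmetic in the example above can be sketched as follows. The function name is hypothetical, and the sketch assumes every appliance carries the same license limit; real capacity planning depends on your actual licenses and models.

```python
def appliances_needed(total_documents: int, licensed_docs_per_appliance: int) -> int:
    """Return the smallest number of appliances whose combined licenses cover the corpus."""
    if total_documents < 0 or licensed_docs_per_appliance <= 0:
        raise ValueError("document counts must be positive")
    # Integer ceiling division avoids floating-point rounding on large counts.
    return -(-total_documents // licensed_docs_per_appliance)


# 40 million documents, 10 million documents per license
print(appliances_needed(40_000_000, 10_000_000))  # 4
```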

Use dynamic scalability with two or more search appliances. If you have an existing dynamic scalability configuration, you can add more search appliances to increase the number of searchable documents or to locate search appliances in different geographic regions. For example, you might have search appliances in Tokyo and Beijing that use dynamic scalability and that index different sets of documents. If you install a search appliance in the Sydney office to index a different body of documents that you want available to Tokyo and Beijing users, you can add the Sydney search appliance to the dynamic scalability configuration.

One search appliance in the configuration is designated the primary search appliance or primary node. The other search appliances are designated the secondary search appliances or secondary nodes. Dynamic scalability configurations are typically set up so that end users' search queries are directed to the primary search appliance. The primary search appliance searches its own index and issues a query to the indexes on the secondary search appliances. The secondary nodes return their results to the primary search appliance. The primary search appliance aggregates the search results from itself and the secondary search appliances, then serves the results to the user. The user does not need to repeat the search on each search appliance in the configuration.

You cannot combine dynamic scalability with the distributed crawling or index replication features. On the Google Search Appliance Admin Console and in the Admin Console help system, dynamic scalability is called Federation. Distributed crawling and index replication are called Multibox on the Admin Console and in the help system.

How Search and Serve Work

In a dynamic scalability configuration, the search and serve processes work seamlessly from the end users' standpoint. Users submit queries and receive results on the same familiar Google Search Appliance pages. You control which documents are searched by configuring collections and remote collections within the dynamic scalability configuration. For more information, see Using Collections to Direct User Searches.

The following graphic shows three search appliances in a dynamic scalability configuration:

  • Search Appliance A indexes sales and marketing documents. It is the primary search appliance in the configuration.
  • Search Appliance B indexes technical support documents. It is a secondary search appliance.
  • Search Appliance C indexes accounting documents. It is a secondary search appliance.

Dynamic Scalability Example with Three Search Appliances

Here's what happens when a user wants to search for technical support, sales, and accounting information about a particular customer, Buzzword Advertising.

  1. The user browses to the search page for Search Appliance A, the primary search appliance in the configuration, and types Buzzword Advertising in the search box.
  2. Search Appliance A searches its local index, which contains sales and marketing documents.
  3. Search Appliance A issues a query to Search Appliance B and Search Appliance C.
  4. Search Appliance B searches its local index, which contains support information.
  5. Search Appliance B sends the results back to Search Appliance A.
  6. Search Appliance C searches its local index, which contains accounting information.
  7. Search Appliance C sends the results back to Search Appliance A.
  8. Search Appliance A merges its own results with the results from Search Appliance B and Search Appliance C and ranks the merged search results.
  9. Search Appliance A returns all results for Buzzword Advertising to the user, including sales contracts, billing and payment information, and records of support contacts with the company.
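The steps above can be sketched in a few lines of Python. This is an illustration of the query flow only, not the search appliance's implementation: the `Node` class, the in-memory index, the URLs, and the numeric scores are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Result:
    url: str
    score: float
    node: str


class Node:
    """Hypothetical stand-in for one search appliance and its local index."""

    def __init__(self, name, index):
        self.name = name
        self.index = index  # maps a query string to a list of (url, score) pairs

    def search_local(self, query):
        return [Result(url, score, self.name) for url, score in self.index.get(query, [])]


def federated_search(primary, secondaries, query):
    # Step 2: the primary node searches its own index.
    results = primary.search_local(query)
    # Steps 3-7: the primary node queries each secondary node and collects its results.
    for node in secondaries:
        results.extend(node.search_local(query))
    # Step 8: merge everything and rank the combined list (here, simply by score).
    return sorted(results, key=lambda r: r.score, reverse=True)


q = "Buzzword Advertising"
node_a = Node("A", {q: [("http://sales.example.com/contract", 0.9)]})
node_b = Node("B", {q: [("http://support.example.com/ticket", 0.7)]})
node_c = Node("C", {q: [("http://accounting.example.com/invoice", 0.8)]})

for result in federated_search(node_a, [node_b, node_c], q):
    print(result.node, result.url)
```

The user sees one ranked list that mixes results from all three nodes, which is step 9.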

Determining Search Appliance Roles

Each search appliance in a dynamic scalability configuration is also able to act independently of the configuration. For example, a user who wants to see only support documents related to Buzzword Advertising might connect directly to the search page for Search Appliance B and run the search query there.

A particular search appliance is able to act as both a primary and secondary node in relation to another search appliance. The following example illustrates a pair of dynamic scalability configurations consisting of two search appliances. In dynamic scalability configuration A, Search Appliance A is the primary node and Search Appliance B is the secondary node. In dynamic scalability configuration B, Search Appliance A is the secondary node and Search Appliance B is the primary node.

Two search appliances in two dynamic scalability configurations, A and B. Each search appliance is both a primary and a secondary in relation to the other search appliance.

Using Collections to Direct User Searches

A query to the primary search appliance in a dynamic scalability configuration returns results from all search appliances in the configuration. By default, all collections on all search appliances are searched when a query is directed to the primary search appliance. You can restrict which collections are searched in two ways:

  • Use the site parameter of the query to define which collections are searched. For more information on the site parameter, see the Search Protocol Reference.
  • Create remote collections on the primary search appliance, which are virtual collections encompassing the specific collections to be searched on each secondary search appliance. For more information on remote collections and how they work, see About Remote Collections.

If a user needs to search documents in a collection that is not included in a remote collection, the user must use the search page for that collection's search appliance instead of the search page on the primary search appliance.
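As a sketch of the first option, the following Python builds a search URL that restricts a query with the site parameter. The host name and collection names are hypothetical. The q, site, client, and output parameters come from the search protocol, but confirm their exact behavior, and the use of | to combine collections, in the Search Protocol Reference for your software version.

```python
from urllib.parse import urlencode


def build_search_url(host, query, collections, frontend="default_frontend"):
    """Build a search URL limited to the named collections (all names are illustrative)."""
    params = {
        "q": query,
        "site": "|".join(collections),  # assumed OR-combination of collections
        "client": frontend,
        "output": "xml_no_dtd",
    }
    return f"http://{host}/search?{urlencode(params)}"


print(build_search_url("gsa-primary.example.com", "Buzzword Advertising", ["MasterCollection"]))
```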

How Crawling and Indexing Work

Crawling and indexing in a dynamic scalability configuration are similar to crawling and indexing in single search appliance deployments. Each individual search appliance is configured with its own crawl patterns and each search appliance typically crawls a discrete body of documents. For more information about crawling and indexing in a single search appliance, see Administering Crawl.

Depending on how security is set up in a dynamic scalability configuration, you might have to duplicate the crawler access settings from each secondary search appliance on the primary search appliance to ensure that the primary search appliance can correctly authorize and serve results from the secondary search appliances. For more information, see Security in a Dynamic Scalability Configuration and Configuring Crawl Patterns in a Dynamic Scalability Configuration.

How OneBox Modules Work

In a dynamic scalability configuration, OneBox module configuration is available only on the primary search appliance. In other words, results served from the primary search appliance include results from OneBox modules configured on the primary search appliance, not OneBox modules configured on the secondary nodes. Because spelling checkers are enabled as OneBox modules, spelling check is available only for documents indexed on the primary search appliance. A new feature, user-added results, also uses OneBox modules.

About Security

The Google Search Appliance uses secret tokens and private IP addresses to enforce security within a dynamic scalability configuration.

The search appliances in a dynamic scalability configuration authenticate each other using shared secret tokens that you provide during configuration. The shared secret tokens must consist only of printable ASCII characters.
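Before you enter a token in the Admin Console, you can check it against the printable-ASCII requirement. The following is a minimal sketch; it assumes that "printable ASCII" means the range from space (0x20) to tilde (0x7E), and the function name is hypothetical. Whether the Admin Console accepts spaces in a token is not stated here, so avoid them if in doubt.

```python
def is_printable_ascii_token(token: str) -> bool:
    """Return True if the token is non-empty and uses only characters 0x20 through 0x7E."""
    return len(token) > 0 and all(0x20 <= ord(ch) <= 0x7E for ch in token)


print(is_printable_ascii_token("s3cret-Token_2009"))  # True
print(is_printable_ascii_token("tok\u00e9n"))         # False: contains a non-ASCII character
```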

There are no restrictions on the public IP addresses assigned to the search appliances in the configuration beyond a requirement that a search appliance is able to reach another search appliance's public IP address on port 10999.
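One way to confirm the port requirement from a machine on the same network path is a plain TCP connection test. The sketch below only checks that something accepts connections on port 10999; it does not verify that the listener is a search appliance or that the secret tokens match. The function name and timeout default are assumptions.

```python
import socket

FEDERATION_PORT = 10999  # port named in the requirement above


def can_reach(host, port=FEDERATION_PORT, timeout_seconds=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout_seconds):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For example, `can_reach("gsa-secondary.example.com")` (a hypothetical host name) returns False if a firewall between the nodes blocks the port.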

Certain communications among the search appliances in a dynamic scalability configuration are conducted over a secure private network, including search requests, search credentials transmitted as sessions, and search results that include snippets, whether the results are authorized or not authorized. When you set up a dynamic scalability configuration, you provide special private network IP addresses that the search appliances use for these secure communications. On the Admin Console interface, the private network IP addresses are called federation network IP addresses.

The following guidelines apply to the private network IP addresses:

  • You can assign or change the private IP addresses at any time.
  • The dynamic scalability configuration private IP addresses must conform to the private address space as defined in RFC 1918 and must not overlap with any other private address space in use on your network.
  • Google recommends that you group dynamic scalability configuration network IP addresses as closely as is practical on your network. For example, it's better to put the nodes in the same /28 subnet than in the same /27 subnet.
  • You cannot put the nodes in a dynamic scalability configuration on the same /16 subnet.
  • The range of allowable private IP addresses is 10.x.0.1 to 10.x.255.254, where x can have any value.
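The address range in the last guideline can be checked with Python's standard ipaddress module. This sketch tests only that range (10.x.0.1 to 10.x.255.254, which lies inside the RFC 1918 block 10.0.0.0/8). It does not check for overlap with other private address space on your network or apply the subnet guidelines, and the function name is hypothetical.

```python
import ipaddress

TEN_NET = ipaddress.ip_network("10.0.0.0/8")


def in_allowed_private_range(address: str) -> bool:
    """Return True if address falls in 10.x.0.1 through 10.x.255.254 for some x."""
    ip = ipaddress.ip_address(address)
    if ip not in TEN_NET:
        return False
    # Within each 10.x.0.0/16 block, exclude the first (10.x.0.0) and last (10.x.255.255) address.
    block = ipaddress.ip_network(f"{address}/16", strict=False)
    return ip != block.network_address and ip != block.broadcast_address


for candidate in ("10.1.0.1", "10.1.255.254", "10.1.0.0", "192.168.1.5"):
    print(candidate, in_allowed_private_range(candidate))
```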

The following requirements also apply to security in a dynamic scalability configuration:

  • All security configurations on the Crawler Access pages on the secondary search appliances must be added to the Crawler Access page on the primary search appliance.
  • The primary and secondary search appliances must use the same security policies.

About Authorization

Authorization is the process by which the search appliance determines whether a particular authenticated user is permitted to view a particular document. You can set up a dynamic scalability configuration to handle user authorization during secure searches in one of two ways:

  • The primary search appliance performs all authorization.
  • The secondary search appliances perform authorization first. If a user cannot be authorized to see a particular document by the secondary search appliances, the primary search appliance attempts to perform the authorization. This process is called delegated authorization. Delegated authorization is enabled by checking a checkbox on the Admin Console > Federation > Host Configuration page.

If you use a Google Enterprise Connector for indexing and searching files in a content management system, you can configure authorization in one of three ways:

  • Configure the connector on the primary search appliance and use authorization on the primary appliance.
  • Configure the connector on a secondary search appliance and use delegated authorization.
  • Configure the connector on a secondary search appliance and the primary search appliance, then add a Do Not Crawl pattern on the primary appliance so that all connector crawling takes place on the secondary search appliance. Use authorization on the primary search appliance.

Using Authorization on the Primary Search Appliance

Use authorization on the primary search appliance when you want all authorization to be performed on the primary search appliance.

The following table tells you how to configure the primary and secondary search appliances when authorization is performed only on the primary search appliance.

Type of User Authentication | How the User is Authenticated and Results are Authorized | What to do on the Primary Search Appliance | What to do on the Secondary Search Appliances
LDAP, HTTP Basic, NTLM HTTP, or Kerberos for public serve | User logs in to network domain. Results are public and authorization is not required. | Configure the Crawler Access page on the Admin Console with all crawl patterns from the primary and all secondary search appliances. The primary search appliance does not crawl these pages and no authorization is required. | Configure the Crawler Access page on the Admin Console only with crawl patterns for the current secondary search appliance.
LDAP, HTTP Basic, NTLM HTTP, or Kerberos for secure serve | User logs in to network domain. Credentials for authorization are collected at login time and results are authorized using head requests from the primary search appliance. | Configure the Crawl and Index > Crawler Access page on the Admin Console with all crawl patterns from the primary and secondary search appliances. The primary search appliance does not crawl these pages, but uses the crawl credentials for authorization. If there are SMB URLs, add those URLs to the Follow and Crawl Patterns field on the Crawl and Index > Crawl URLs page. | Configure the Crawl and Index > Crawler Access page on the Admin Console only with crawl patterns for the current secondary search appliance.
Cookie site or forms-based authentication for public serve | Serve is public. No result authorization at serve time required. | Copy the configuration from the Crawler Access page on the secondary search appliances to the primary search appliance. | Configure the Crawl and Index > Crawler Access page on the Admin Console only with crawl patterns for the current secondary search appliance.
Forms-based authentication with external login for secure serve | User provides credentials on a form configured on the primary search appliance. The primary search appliance uses a cookie for authorization using the head requestor for each search result returned by a secondary search appliance. | Configure forms authentication for serve. | Configure form-based authentication for crawl.
Forms-based authentication with user impersonation for secure serve | User provides credentials on a form configured on the primary search appliance. The primary search appliance uses a cookie for authorization using the head requestor for each search result returned by a secondary search appliance. | Configure forms authentication for serve. | Configure form-based authentication for crawl.
SAML authentication with external authorization SPI | User provides credentials on a form configured on the primary search appliance. The primary search appliance uses a cookie for authorization using the head requestor for each search result returned by a secondary search appliance. | Configure forms authentication for serve as on a single-search appliance configuration. | Configure form-based authentication for crawl as on a single-search appliance configuration.
Forms-based authentication with external authorization SPI | User provides credentials on a form configured on the primary search appliance. The primary search appliance uses a cookie for authorization using the head requestor for each search result returned by a secondary search appliance. | Configure forms authentication for serve as on a single-search appliance configuration. | Configure form-based authentication for crawl as on a single-search appliance configuration.
Policy ACLs with an LDAP identity provider | User logs in to network domain. Credentials for authorization are collected at login time and results are authorized according to rules set in policy ACLs. | Copy LDAP information and policy ACLs from the secondary search appliance. | Configure LDAP and policy ACLs.

Using Delegated Authorization

Use delegated authorization when you want authorization to be performed first on the secondary nodes, with authorization on the primary node only when a secondary node is unable to authorize a user to view a document.
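The order of checks in delegated authorization can be sketched as follows. The two check functions stand in for whatever mechanism each node actually uses (head requests, policy ACLs, and so on); their names, signatures, and the sample URLs are assumptions made for illustration.

```python
def delegated_authorize(user, url, secondary_check, primary_check):
    """Ask the secondary node first; fall back to the primary node only if that fails."""
    if secondary_check(user, url):
        return True
    # The secondary node could not authorize the user, so the primary node attempts it.
    return bool(primary_check(user, url))


# Hypothetical checks: the secondary node knows only about support documents.
def secondary_check(user, url):
    return url.startswith("http://support.example.com/") and user == "alice"


def primary_check(user, url):
    return user in ("alice", "bob")


print(delegated_authorize("alice", "http://support.example.com/ticket", secondary_check, primary_check))
print(delegated_authorize("bob", "http://support.example.com/ticket", secondary_check, primary_check))
print(delegated_authorize("carol", "http://support.example.com/ticket", secondary_check, primary_check))
```

In this sketch alice is authorized by the secondary node, bob is authorized only by the fallback on the primary node, and carol is not authorized by either.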

Delegated authorization is enabled on the search appliance Admin Console when you set up a dynamic scalability configuration. Check the Use delegated authorization checkbox on the Federation > Host Configuration page on the primary search appliance and on all secondary search appliances.

The following use cases are not supported with delegated authorization:

  • Kerberos authentication
  • Forms authentication with IP binding, in which the authentication cookie is restricted to a single IP address

The following table tells you how to configure the primary and secondary search appliances if your dynamic scalability configuration uses delegated authorization.

Type of User Authentication | How the User is Authenticated and Results are Authorized | What to do on the Primary Search Appliance | What to do on the Secondary Search Appliances
HTTP Basic and NTLM HTTP for public serve | User logs in to network domain. | Copy the Crawl and Index > Crawler Access settings from all secondary search appliances to the primary search appliance. Ensure that the Make Public box on the crawler access page is checked. |
LDAP, HTTP Basic, or NTLM HTTP for secure serve | User logs in to network domain. Credentials for authorization are collected at login time and results are authorized using head requests. | Ensure that LDAP naming is the same on the primary and all secondary search appliances. Copy the Crawl and Index > Crawler Access settings from all secondary search appliances to the primary search appliance. Ensure that the Make Public box on the crawler access page is checked. | For LDAP, ensure that all secondary search appliances use the same LDAP server and ensure that the LDAP naming is the same on the primary and all secondary search appliances.
Cookie site or forms-based authentication for public serve | Serve is public. No result authorization at serve time required. | N/A | N/A
Forms-based authentication with cookie forwarding for secure serve | User provides credentials on a form configured on the primary search appliance. This process generates a cookie. The primary search appliance shares the cookie with the secondary search appliances, which use the cookie for authorization using the head requestor. | Ensure that the primary search appliance shares the domain name with the source. Ensure that the secondary search appliances have access to the cookie generated on the primary search appliance. | Configure with role account and form authentication for crawling, but ensure that secondary search appliances can use the same cookie generated on the primary search appliance for head requests.
Forms-based authentication with external login for secure serve | User provides credentials on a form configured on the primary search appliance. A cookie is generated by the external login URL. The cookie is passed to the primary search appliance, which shares the cookie with the secondary search appliances. The secondary search appliances use the cookie to authorize results. | Share the primary search appliance domain name with the external login server URL. Ensure that the secondary search appliances have access to the cookie generated on the primary search appliance. | Configure with role account and form authentication for crawling, but ensure that secondary search appliances can use the same cookie generated on the primary search appliance for head requests.
Forms-based authentication with user impersonation for secure serve | User provides credentials on a form configured on the primary search appliance. The external login URL generates a cookie, which is passed to the primary search appliance. The primary search appliance forwards the cookie to the secondary search appliances. The secondary search appliances use the cookie to authorize results. | Copy the Serving > Forms Authentication settings from all secondary search appliances to the primary search appliance, including the Make Public flag. | Configure form authentication for crawling.
SAML authentication with external authorization SPI | SAML assertion is passed to the secondary search appliances, where the assertion is used to authorize documents. | Copy the Crawl and Index > Crawler Access settings from all secondary search appliances to the primary search appliance, including the Make Public flag. Configure the SPI. | Configure the SPI.
Forms-based authentication with external authorization SPI | SAML assertion is passed to the secondary search appliances, where the assertion is used to authorize documents. | Copy the Crawl and Index > Crawler Access settings from all secondary search appliances to the primary search appliance, including the Make Public flag. Copy the Serving > Forms Authentication settings from all secondary search appliances to the primary search appliance, including the Make Public flag. Configure the SPI. | Configure the SPI.

About Remote Collections

A remote collection is a collection configured on the primary search appliance of a dynamic scalability configuration that includes one or more collections defined on one or more of the secondary search appliances. Remote collections do not include any collections from the primary search appliance, because all collections on the primary search appliance are searched by default. You create remote collections to ensure the following:

  • User search queries are distributed to the correct search appliances.
  • The correct collections on those appliances are searched.

There are no limits to the number of remote collections you can create on the primary search appliance. A particular collection on a secondary search appliance can be a member of more than one remote collection.

For example, in a dynamic scalability configuration of three search appliances, the administrator might configure a remote collection called MasterCollection on Search Appliance A as described in the following table.

Search Appliance Name | Collections Included in MasterCollection | Collections Not Included in MasterCollection
Search Appliance A (primary) | N/A | All collections on Search Appliance A
Search Appliance B (secondary) | ProductOneSupportColl, ProductTwoSupportColl, ProductThreeSupportColl | WhoDoesWhatCollection
Search Appliance C (secondary) | CustomerDataColl, CustomerPeopleDataColl | BonusInfoCollection

When a user issues a search query on Search Appliance A, the search appliance queries all collections on itself and the collections included in the collection called MasterCollection, but does not search the collections on the secondary appliances that are not included.

Users who need results from the WhoDoesWhatCollection on Search Appliance B or BonusInfoCollection on Search Appliance C need to issue queries directly on those search appliances, because Search Appliance A does not have access to those collections through MasterCollection.
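The MasterCollection example can be modeled as a simple mapping from secondary search appliance to member collections. The data structures and function name below are illustrative only; remote collections are actually configured in the Admin Console, not in code.

```python
# Collections that exist on each secondary search appliance in the example.
ALL_COLLECTIONS = {
    "Search Appliance B": [
        "ProductOneSupportColl", "ProductTwoSupportColl",
        "ProductThreeSupportColl", "WhoDoesWhatCollection",
    ],
    "Search Appliance C": [
        "CustomerDataColl", "CustomerPeopleDataColl", "BonusInfoCollection",
    ],
}

# Members of the remote collection MasterCollection on Search Appliance A.
MASTER_COLLECTION = {
    "Search Appliance B": [
        "ProductOneSupportColl", "ProductTwoSupportColl", "ProductThreeSupportColl",
    ],
    "Search Appliance C": ["CustomerDataColl", "CustomerPeopleDataColl"],
}


def not_searched_from_primary(all_collections, remote_collection):
    """List, per secondary node, the collections a query on the primary node does not reach."""
    return {
        node: [c for c in collections if c not in remote_collection.get(node, [])]
        for node, collections in all_collections.items()
    }


print(not_searched_from_primary(ALL_COLLECTIONS, MASTER_COLLECTION))
```

The result lists WhoDoesWhatCollection for Search Appliance B and BonusInfoCollection for Search Appliance C, the two collections that users must query directly.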

Observe the following cautions in creating remote collections:

  • Do not configure remote collections on a secondary search appliance.
  • Do not give a remote collection the name default_collection, which is reserved for the default collection on each search appliance.

About Front Ends

A front end is the search appliance framework used to manage the appearance and underlying functions of search and results pages, including which collections are searched. After you create the remote collections in the dynamic scalability configuration, modify the front ends on the primary search appliance to associate the correct remote collections with each front end. You can do this in two ways:

  • Add an element to the search page that enables users to select a collection. For example, you might use radio buttons or a drop-down list.
  • Use query parameters to bind a collection to a front end, then mask the query parameters using a proxy server.

For more information on front ends and associating collections with front ends, see Creating the Search Experience: Introduction.

In addition, dynamic scalability configurations can use remote front ends, which are front ends on secondary search appliances. You enable remote front ends by checking the Use host frontend filters instead of Primary frontend filters checkbox on the Federation > Host Configuration page under Federation Settings. You choose a front end on each secondary search appliance that is used to apply the following front-end settings to results from that node:

  • Remove URLs
  • Scoring bias, which is called result biasing elsewhere on the search appliance
  • File type filters
  • Domain filters
  • Metatag filters

About Crawl Patterns

Dynamic scalability configurations function more efficiently when the set of URLs crawled on one node has few or no links to URLs crawled on other nodes. Google recommends that you set up the crawl patterns on each node so that there is minimal interlinking among the nodes.

Depending on how results are authorized in your dynamic scalability configuration, you might need to copy crawl patterns or crawler access information from the secondary search appliances to the primary search appliance. For more information, see the tables in Configuring Authorization for Dynamic Scalability.

If a secondary search appliance uses SMB crawl patterns, you must add those patterns to the Crawl and Index > Crawl URLs > Follow and Crawl Only URLs field on the primary search appliance.

About Database Crawling

To use database crawling in a dynamic scalability configuration, you might need to perform some additional configuration.

  • If you configure the primary search appliance to crawl the database, no additional configuration is required.
  • If you configure a secondary search appliance to crawl the database, search results from the database are correctly returned to the primary search appliance. However, the primary search appliance cannot retrieve the database when the user clicks a result from the database. Use these instructions to set up the primary search appliance so that it can retrieve the database.

To set up the primary search appliance:

  1. Log in to the Admin Console of the secondary search appliance.
  2. Navigate to Crawl and Index > Databases.
  3. Note down the configuration information.
  4. Log in to the Admin Console of the primary search appliance.
  5. Navigate to Crawl and Index > Databases.
  6. Set up a database crawl configuration that is identical to the configuration on the secondary search appliance.
  7. Configure a dummy SELECT statement for the crawl query that does not return documents. This prevents the primary search appliance from crawling the database. The serve query on the primary search appliance must be identical to the serve query on the secondary search appliance.
  8. Save the configuration.
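Step 7 relies on a crawl query that is valid SQL but matches no rows. The sketch below demonstrates the idea with an in-memory SQLite database, which is only a stand-in: the table, columns, and queries are hypothetical, and the exact syntax for your database may differ. A condition that is always false, such as WHERE 1 = 0, is one common way to write such a statement.

```python
import sqlite3

# Stand-in database with one hypothetical table and one row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO documents (title) VALUES ('Buzzword Advertising contract')")

# Crawl query as it might look on the secondary search appliance.
real_crawl_query = "SELECT id, title FROM documents"
# Dummy crawl query for the primary search appliance: same shape, but never returns rows.
dummy_crawl_query = "SELECT id, title FROM documents WHERE 1 = 0"

real_rows = conn.execute(real_crawl_query).fetchall()
dummy_rows = conn.execute(dummy_crawl_query).fetchall()
print(len(real_rows), len(dummy_rows))  # 1 0
```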

For more information on crawling databases with the Google Search Appliance, see Database Crawling and Serving.

About Timeout Intervals and Scoring Bias

The timeout interval and scoring bias parameters are set on the Federation > Host Configuration page for each search appliance in the configuration.

The timeout interval determines how long the primary search appliance waits before timing out a request to a particular secondary node. Set the timeout interval to a lower value for co-located search appliances and to higher values for search appliances that are physically distant from the primary search appliance. Google recommends a 2 second timeout value for co-located search appliances.
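The effect of a per-node timeout can be sketched with Python's concurrent.futures module. This is an illustration under stated assumptions, not the search appliance's implementation: the function and variable names are hypothetical, and the sketch assumes that results from a node that times out are simply left out of the merged list.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


def query_secondaries(secondaries, query, timeouts):
    """Query every secondary node in parallel, giving each node its own timeout in seconds.

    secondaries maps a node name to a callable that takes the query and returns a list.
    Returns (results_by_node, names_of_nodes_that_timed_out).
    """
    results, timed_out = {}, []
    pool = ThreadPoolExecutor(max_workers=max(1, len(secondaries)))
    try:
        started = time.monotonic()
        futures = {name: pool.submit(fn, query) for name, fn in secondaries.items()}
        for name, future in futures.items():
            # Time left for this node, measured from when the requests were issued.
            remaining = timeouts[name] - (time.monotonic() - started)
            try:
                results[name] = future.result(timeout=max(0.0, remaining))
            except FutureTimeout:
                timed_out.append(name)
    finally:
        pool.shutdown(wait=False)  # do not block on nodes that are still running
    return results, timed_out
```

With this structure, a co-located node might be given the recommended 2 second timeout while a distant node is given a longer one, so a slow remote node does not delay results from nearby nodes beyond its own limit.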

The scoring bias parameter sets result biasing for the current node. Scoring bias changes the weight assigned to results from a particular node in a dynamic scalability configuration when the final results ranking is calculated. Less influence is a negative bias for results from the current node. No influence is a neutral bias. More influence is a positive bias for results from the current node.
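The idea of weighting a node's results before the final ranking can be sketched as follows. The multipliers are invented for illustration; the weights that the search appliance actually applies for less, no, and more influence are not given here, and the function name is hypothetical.

```python
# Hypothetical multipliers for the three bias settings.
BIAS_MULTIPLIERS = {"less": 0.5, "none": 1.0, "more": 1.5}


def apply_scoring_bias(results, node_bias):
    """Scale each (url, score, node) result by its node's bias, then rank by the new score."""
    biased = [
        (url, score * BIAS_MULTIPLIERS[node_bias.get(node, "none")], node)
        for url, score, node in results
    ]
    return sorted(biased, key=lambda r: r[1], reverse=True)


results = [
    ("http://sales.example.com/contract", 4.0, "A"),
    ("http://support.example.com/ticket", 3.0, "B"),
]
# With a positive bias on node B, its result moves ahead of node A's result.
print(apply_scoring_bias(results, {"A": "none", "B": "more"}))
```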

Dynamic Scalability Checklist

This section provides a checklist of information you need to collect and decisions you need to make before you set up a dynamic scalability configuration.

Task Description Your Values
Determine which Google Search Appliance will participate in the dynamic scalability configuration. Any Google Search Appliance model running software version 6.0 or later can participate.  
Determine the appliance IDs of the participating search appliances. The appliances IDs can be found on the Admin Console under Administration > License.  
Determine the host names or public IP addresses of the search appliances in the dynamic scalability configuration. The host names or IP addresses are used during the initial configuration of the dynamic scalability configuration.  
Determine the network IP addresses for the search appliances. The network IP addresses, called federation IP addresses on the Admin Console, are used for communication among the search appliances in the dynamic scalability configuration. The network IP addresses must conform to the private address space as defined in RFC 1918 and must not overlap with any other private address space in use on your network.  
Determine which search appliance is the primary search appliance in the dynamic scalability configuration. You configure remote collections only on the primary search appliance and searches are typically entered on the primary search appliances.  
Determine which collections on each secondary search appliance will be assigned to remote collections on the primary search appliance. These collections will be served from the primary search appliance. These choices determine which collections are searchable within the dynamic scalability configuration using remote collections.  
Determine the secret token that the search appliances will use to recognize each other within the dynamic scalability configuration. The nodes in a dynamic scalability configuration use the secret tokens to authenticate to each other. The secret token must include only printable ASCII characters. Each search appliance in a dynamic scalability configuration has its own associated secret token, which you specify on the Federation > Host Configuration page.  
Determine the level of scoring bias for each node in the dynamic scalability configuration. Scoring bias changes the weight assigned to results from a particular node in a dynamic scalability configuration when the final results ranking is calculated. Less influence is a negative bias for results from the current node. No influence is a neutral bias. More influence is a positive bias for results from the current node.  
Determine the timeout interval to enter on each node. The timeout interval determines how long the primary search appliance waits before timing out a request to a particular secondary node. Set the timeout interval to a lower value for co-located search appliances and to a higher value for search appliances that are physically distant from the primary search appliance. Google recommends a 2-second timeout value for co-located search appliances.  
Determine the type of authorization to use in the configuration. Results can be authorized on the primary search appliance or on the secondary search appliances. For more information, see Security in a Dynamic Scalability Configuration and Configuring Authorization in a Dynamic Scalability Configuration.  
Confirm that the security configuration is identical on all of the search appliances in the dynamic scalability configuration. Do not use different authentication and authorization models on different search appliances in a dynamic scalability configuration. For more information, see Security in a Dynamic Scalability Configuration and Configuring Authorization in a Dynamic Scalability Configuration.  
Determine which crawl patterns and crawler access information need to be copied from the secondary search appliances to the primary search appliance. For more information, see the tables Security in a Dynamic Scalability Configuration and Configuring Authorization in a Dynamic Scalability Configuration.  
Determine which front ends to use and how to ensure that the correct collections are bound to the front ends. The front end determines which collections are searched. For more information, see Configuring Front Ends for Dynamic Scalability and see Creating the Search Experience: Introduction.  
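
Two of the checklist items state constraints that can be checked before you enter values on the Admin Console: federation IP addresses must fall in the RFC 1918 private address space without overlapping other private networks in use, and secret tokens must contain only printable ASCII characters. The following Python sketch shows one way to pre-check such values. It is not part of the search appliance software. The function names are hypothetical, and treating the range 0x20 through 0x7E (including the space character) as "printable ASCII" is an assumption, because this guide does not define the exact character set.

```python
import ipaddress
from typing import Iterable

# The three private IPv4 ranges defined in RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(address: str) -> bool:
    """Return True if the address lies in an RFC 1918 private range."""
    ip = ipaddress.ip_address(address)
    return any(ip in net for net in RFC1918_NETWORKS)


def overlaps_existing(federation_net: str, networks_in_use: Iterable[str]) -> bool:
    """Return True if the planned federation network overlaps a network already in use."""
    candidate = ipaddress.ip_network(federation_net)
    return any(candidate.overlaps(ipaddress.ip_network(n)) for n in networks_in_use)


def is_printable_ascii(token: str) -> bool:
    """Return True if the token is non-empty and uses only characters 0x20-0x7E (assumed range)."""
    return len(token) > 0 and all(0x20 <= ord(ch) <= 0x7E for ch in token)
```

For example, `is_rfc1918("172.32.0.1")` is false because the 172.16.0.0/12 range ends at 172.31.255.255, so that address could not be used as a federation IP address.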

Setting up Dynamic Scalability Configurations

This section provides high-level instructions for setting up dynamic scalability configurations. Use the online help system for detailed information about completing each page on the Admin Console.

To set up dynamic scalability configurations:

  1. Read this document.
  2. Complete the checklist in Before You Set up a Dynamic Scalability Configuration.
  3. Log in to the Admin Console on the primary node.
  4. Complete the Federation > Host Configuration page on the primary node.
  5. Log in to the Admin Console on each of the secondary nodes.
  6. Complete the Federation > Host Configuration page on each secondary node.
  7. On the primary node, navigate to Federation > Nodes Configuration and add each of the secondary nodes, including the secret token, appliance ID, host name, and federation IP address for each secondary node. When you make changes on this page, the dynamic scalability service restarts.
  8. On each of the secondary nodes, navigate to Federation > Nodes Configuration and add the primary node, including the secret token, appliance ID, host name, and federation IP address of the primary node. When you make changes on this page, the dynamic scalability service restarts. Do not add any of the secondary nodes on another secondary node.
  9. On the primary node only, complete the Federation > Remote Collections page.
  10. On the primary node, make any required updates to the crawl patterns and crawler access information.
  11. On the primary node, make changes to the front ends to ensure that queries are correctly distributed to all nodes in the dynamic scalability configuration.
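
Steps 7 and 8 describe a star-shaped arrangement: the primary node lists every secondary node, and each secondary node lists only the primary node. As a planning aid, the following Python sketch models that arrangement and reports deviations from it. The `Node` class, its fields, and the node names are hypothetical and exist only for illustration. The sketch does not read or write any search appliance configuration.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Node:
    name: str
    is_primary: bool
    # Names of the nodes entered on this node's Nodes Configuration page.
    peers: Set[str] = field(default_factory=set)


def topology_problems(nodes: List[Node]) -> List[str]:
    """Return a list of deviations from the primary/secondary layout in steps 7 and 8."""
    primaries = [n for n in nodes if n.is_primary]
    if len(primaries) != 1:
        return [f"expected exactly one primary node, found {len(primaries)}"]
    primary = primaries[0]
    secondaries = {n.name for n in nodes if not n.is_primary}

    problems = []
    # Step 7: the primary must list every secondary node.
    for name in sorted(secondaries - primary.peers):
        problems.append(f"primary does not list secondary {name}")
    # Step 8: each secondary lists the primary and no other secondary.
    for node in nodes:
        if node.is_primary:
            continue
        if primary.name not in node.peers:
            problems.append(f"{node.name} does not list the primary")
        for extra in sorted(node.peers & secondaries):
            problems.append(f"{node.name} lists secondary {extra}")
    return problems
```

An empty result means the planned layout matches the steps above. Any message returned points to a Nodes Configuration page that needs to be corrected.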

Adding or Deleting Nodes

If you add or remove search appliances in a dynamic scalability configuration, ensure that you update the following:

  • Federation > Nodes Configuration page on the primary node
  • Federation > Nodes Configuration page on the search appliance you are adding or removing
  • Crawl patterns and crawler access
  • Remote collections
  • Front ends

Troubleshooting

This section provides information for solving problems you might encounter in configuring or using dynamic scalability configurations.

Using the Federation Network Stats and Federation Diagnostic Pages to Find Problems

On the Admin Console, the Federation Network Stats and Federation Diagnostic pages provide statistical and diagnostic information you can use to diagnose problems with a dynamic scalability configuration. For more information, see the online help for the pages.

Users See 404 Errors After Clicking Results

Several different configuration problems can cause 404 errors when users click search results.

Check the URL patterns in the Follow and Crawl Only URLs settings on the primary and secondary search appliances. Ensure that all Follow and Crawl Only URLs on the secondary appliances also appear on the primary search appliance.
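
If you keep copies of the Follow and Crawl Only URLs patterns from each search appliance as lists of strings, a simple set comparison shows which secondary patterns are absent from the primary search appliance. The following Python sketch assumes you have obtained the patterns yourself, for example by copying them from the Admin Console. The function name and the sample host names are hypothetical. The comparison is an exact string match and does not evaluate what each pattern matches.

```python
from typing import Dict, Iterable, List


def missing_on_primary(
    primary_patterns: Iterable[str],
    secondary_patterns_by_node: Dict[str, Iterable[str]],
) -> Dict[str, List[str]]:
    """Map each secondary node to the patterns it has that the primary lacks."""
    # Ignore blank lines and surrounding whitespace when comparing.
    primary = {p.strip() for p in primary_patterns if p.strip()}
    missing = {}
    for node, patterns in secondary_patterns_by_node.items():
        absent = sorted({p.strip() for p in patterns if p.strip()} - primary)
        if absent:
            missing[node] = absent
    return missing
```

Any pattern reported for a secondary node is a candidate to add to the Follow and Crawl Only URLs setting on the primary search appliance.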

If you are using a database crawl, a user might see a 404 error after clicking a search result. When this happens, it means that the primary search appliance is not set up with the database configuration information from the secondary search appliances. To correct the error, copy the database configuration information from the secondary search appliances to the primary search appliance.

Results from Secondary Search Appliances are Not Available on Primary Search Appliance

If you find that results from the secondary search appliances are not available on the primary search appliance, check the names of the remote collections. If different collections designated as part of a remote collection have the same name, the site parameter is expanded at query time in such a way that the results are not available on the primary search appliance. If this is the case, you can obtain results from the secondary search appliances on http://0:9999/search, but not through the configured front ends.
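
To look for this naming problem before you complete the Remote Collections page, you can list the planned assignments and search for repeated names. The following Python sketch reads the condition above as "two member collections of the same remote collection share a name", which is an interpretation of this guide rather than a documented rule. The data layout, function name, and sample names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def duplicate_collection_names(
    remote_collections: Dict[str, List[Tuple[str, str]]],
) -> Dict[str, Dict[str, List[str]]]:
    """Find member collections that share a name within one remote collection.

    remote_collections maps a remote collection name to (node, collection) pairs.
    The result maps remote collection -> clashing collection name -> nodes.
    """
    duplicates = {}
    for remote_name, members in remote_collections.items():
        nodes_by_collection = defaultdict(list)
        for node, collection in members:
            nodes_by_collection[collection].append(node)
        clashes = {
            name: sorted(nodes)
            for name, nodes in nodes_by_collection.items()
            if len(nodes) > 1
        }
        if clashes:
            duplicates[remote_name] = clashes
    return duplicates
```

Any name reported here appears more than once within a single remote collection and is a candidate for renaming on one of the search appliances.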

Also ensure that nodes are added as secondary nodes only on the primary search appliance. Do not add secondary search appliances to other secondary search appliances.

In addition, ensure that remote collections are configured only on the primary search appliance.

Unexpected Authorization Behavior

If you configure delegated authorization incorrectly, you might encounter unexpected authorization behavior. If you are using delegated authorization, ensure that it is enabled on the primary search appliance and on all secondary search appliances in the dynamic scalability configuration.

  • If delegated authorization is enabled only on the primary search appliance and not on the secondary search appliances, the secondary search appliances do not perform authorization. Only the primary search appliance performs authorization.
  • If delegated authorization is disabled on the primary search appliance but enabled on the secondary search appliances, the secondary search appliances perform authorization, but the primary search appliance ignores that authorization and performs its own.
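
The two cases above can be expressed as a small consistency check. The following Python sketch takes the delegated authorization setting you have recorded for each node and reports any mismatch with the primary search appliance, using the behavior described in the bullets. The function name and node names are hypothetical, and the settings must be collected by hand because the sketch does not query the search appliances.

```python
from typing import Dict, List


def delegated_authz_issues(
    primary_enabled: bool, secondary_enabled: Dict[str, bool]
) -> List[str]:
    """Report secondaries whose delegated authorization setting differs from the primary."""
    issues = []
    for node, enabled in sorted(secondary_enabled.items()):
        if primary_enabled and not enabled:
            # Enabled on the primary only: this secondary does not authorize results.
            issues.append(f"{node}: disabled here, so this node does not perform authorization")
        elif enabled and not primary_enabled:
            # Enabled on the secondary only: the primary ignores this node's decision.
            issues.append(f"{node}: enabled here, but the primary ignores it and performs its own")
    return issues
```

An empty list means the setting is the same on every node, which is the state this section recommends.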
