Google Search Appliance

Troubleshooting Search Appliance Problems

Google Search Appliance version 6.0

Posted June 2009

This document provides the information you need to solve problems you might encounter while installing or operating the Google Search Appliance.

Contents

  1. About This Document
  2. Where Do I Get More Help?
    1. If You Contact Technical Support
    2. Providing Access to the Search Appliance
  3. Determining the Type of Problem
  4. Software Problems
    1. Determining Whether the Operating System Is Operating Normally
    2. Correcting an Operating System Error by Rebooting the Search Appliance
    3. Correcting Problems Installing a License
  5. Installation and Update Problems
    1. Correcting Problems Connecting to the Network Installation Wizard
    2. Correcting Problems Connecting to NTP Servers
    3. Correcting DNS and SMTP Configuration Problems
    4. Correcting Errors in Test URLs
    5. Correcting Timeouts During Update File Upload
  6. Stylesheet Problems
    1. Ensuring That Changes Made on the Output Format Page Are Reflected on the Results Page
  7. Crawl Problems
    1. Determining Whether MTU Settings Are Causing a Crawl Problem
    2. Correcting Invalid URL Errors on the Network Diagnostics Page
    3. Correcting a Slow Crawl Process
    4. Ensuring That a Particular URL Is Crawled
    5. Ensuring That All URLs Are Crawled
  8. Serving and Results Page Problems
    1. Determining Why a Particular Document Is Not in the Search Results
    2. Correcting Serving Failures Caused by Excessive Collections
  9. Connectivity and Network Problems
  10. Hardware Problems
    1. Determining Whether the Search Appliance Has a Hardware Problem
    2. Determining Why the Search Appliance Does Not Power On
    3. Determining Why the Search Appliance Does Not Start Correctly
    4. Using the LEDs to Help Diagnose Hardware Problems
    5. Using the LEDs to Help Diagnose Disk Problems on the Google Search Appliance

About This Document

This document provides diagnostic information and solutions for problems that you might encounter while installing or operating a Google Search Appliance.

This document is intended for search appliance administrators, network administrators, and other people who operate or install a search appliance. The document assumes that you are familiar with the Google Search Appliance and with networking concepts.

Where Do I Get More Help?

The online help system for the Google Search Appliance contains detailed information on how to configure the search appliance from the Admin Console. You can search the Admin Console online help systems at the Search Appliance Documentation page.

The way that you receive technical support and access to the Enterprise Technical Support site depends on how and where you purchased your search appliance. For more information, see About Technical Support in Installing a Google Search Appliance.

If you receive technical support from Google Enterprise Technical Support, you can search the help systems and other support information on the Enterprise Technical Support web site (http://support.google.com/enterprise). If you forget your password for the Enterprise Technical Support web site, go to the Password Assistance page or click the Forgot Your Password? link on the Enterprise Technical Support login page. Enter your login name and Google Enterprise Technical Support will mail your password to the primary contact you designated when you created your Enterprise Technical Support account. If your primary contact cannot receive email or if you do not know your login name, contact enterprise-operations@google.com.

If you receive technical support from a provider other than Google Enterprise Technical Support, contact that provider for more information.

If You Contact Technical Support

If you need to contact your technical support provider, have the search appliance ID number available. You can find the appliance ID in the following places:

  • On the label on the back of the search appliance
  • On the Admin Console on the Administration > License page

Providing Access to the Search Appliance

Your technical support provider might need remote access to the search appliance to determine the cause of any problems the search appliance is experiencing. The provider can connect using SSH, a modem, or VPN/desktop streaming (at an additional cost).

Different access methods have different requirements. The requirements for remote access are discussed in Remote Access for Technical Support.

If you do not have access to the Enterprise Technical Support site, talk to your technical support provider about the requirements.

Determining the Type of Problem

When a running search appliance experiences problems, the symptoms might include any of the following:

  • Search appliance seems unresponsive
  • Intermittent crawl or serve problems
  • Light-emitting diode (LED) on the search appliance case indicates an issue
  • Errors displayed on the Admin Console
  • Errors in the logs
  • Inability to connect to the network wizard on port 1111, Version Manager on port 9941 or 9942, or Admin Console on port 8000 or 8443
  • Inability to ping the search appliance

The search appliance can experience problems during its initial startup or a subsequent startup, during the configuration process, or while it is running. Decide which of the following sections to read based on what was taking place when you encountered the problem.

Software Problems

This section discusses general problems you might encounter with the search appliance's operating system and licensing.

Determining Whether the Operating System Is Operating Normally

If the search appliance cannot serve or if the search appliance freezes, the symptoms might indicate an operating system freeze or crash. If you observe these symptoms, complete the following checklist and contact your technical support provider.

Question Your Answer
When did the search appliance stop serving?  
Can you ping the search appliance from another computer?  
Can you connect to the Admin Console on port 8000 or 8443, or the Version Manager on port 9941 or 9942? If you see an error, what is the error?  
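
The connection question in the checklist can be answered from another computer with a short script. The following is a minimal sketch in Python, not a Google-provided tool: the function names and the host name in the usage comment are placeholders, and only the port numbers come from this document. It reports whether a TCP connection to each port can be opened; it does not check what the service returns.

```python
import socket

# Ports named in this document. The labels are for display only.
APPLIANCE_PORTS = {
    1111: "Network Installation Wizard",
    8000: "Admin Console",
    8443: "Admin Console",
    9941: "Version Manager",
    9942: "Version Manager",
}


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


def probe_appliance(host, timeout=3.0):
    """Return a dict that maps each known port to True (open) or False."""
    return {port: port_is_open(host, port, timeout) for port in APPLIANCE_PORTS}


# Example use (the host name is a placeholder):
#   for port, is_open in probe_appliance("appliance.mycompany.com").items():
#       state = "open" if is_open else "unreachable"
#       print(port, APPLIANCE_PORTS[port], state)
```

Record the output together with the time of the test so that you can give both to your technical support provider.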

Correcting an Operating System Error by Rebooting the Search Appliance

If the Google Search Appliance has certain errors in the operating system, the search appliance sends an email message to the user whose email address was designated to receive notifications. The subject line of the email says:

NOTIFICATION: Machine machine_name needs to be rebooted in order to be used.

The message indicates that the operating system error has put the search appliance in an indeterminate state. To correct the error, restart the search appliance safely. For instructions on a safe restart, see Shutting Down the Search Appliance in Installing the Google Search Appliance.

Correcting Problems Installing a License

If you have problems installing a license to increase the number of documents your search appliance can crawl and index, obtain the search appliance identification number (also known as the serial number) and contact your technical support provider. The search appliance identification number is on the Admin Console on the Administration > License page and on the label on the search appliance's case.

Correcting the Full and Half Duplex Settings on the Network Interface Card

The search appliance displays the network interface settings it is currently using on the Admin Console (port 8000) under Administration > Network Settings. If you see a setting that differs from the setting you specified on the Network Installation Wizard or on your switch, it might be related to negotiation problems between the switch and appliance.

In most cases the problem stops after you set the search appliance and the switch to auto-negotiate mode. If this does not work, please open a ticket with Google Enterprise Support.

Installation and Update Problems

This section deals with issues that might arise while installing the search appliance, configuring the network settings, or updating the software on the search appliance.

Correcting Problems Connecting to the Network Installation Wizard

During the installation process, you use a browser to connect to the Network Installation Wizard and configure the search appliance. If you have a problem connecting to the wizard, use the information in the following table to determine how to fix the problem. For complete information on installing the search appliance, see Installing the Google Search Appliance.

Condition to Check Corrective Action
Determine whether the URL in the web browser is http://192.168.255.1:1111/. If the URL is not correct, change it to http://192.168.255.1:1111/.
Determine which cable connects the laptop to the search appliance. The orange crossover cable is the shorter of the two cables supplied with the search appliance. The crossover cable is the correct cable for making the connection. If you are using the yellow cable, which is considerably longer, switch to using the shorter, orange cable between the laptop and the search appliance.
Determine whether the orange crossover cable is connected to the correct network port on the search appliance. The crossover cable must be plugged in to the port with the orange dot. This port is farthest left on the back of the search appliance.
Determine the color of the link light next to the port. The light should be green. Another color might indicate a problem with the crossover cable or with the search appliance. Exchange the crossover cable for a crossover cable you know to be in working order.
Determine whether the laptop is configured to use DHCP. If the laptop is not configured to use DHCP, consult the laptop documentation and enable DHCP. Without DHCP enabled on the laptop, the search appliance cannot assign an IP address to the laptop. Make sure that you write down the current network settings so that you can restore the settings after you configure the search appliance.
Determine whether the laptop has been assigned an IP address. On Windows, start a command window and type ipconfig, then press Enter. The command should return the value 192.168.255.254. If that value is not returned, type ipconfig /renew and press Enter. If the value 192.168.255.254 is not returned, set the following static IP address information manually: IP address: 192.168.255.254 Subnet mask: 255.255.255.0
Determine whether you can ping the IP address 192.168.255.1. On Windows, open a command window and at the prompt type the command ping 192.168.255.1 and press Enter. On Macintosh or Linux computers, start a terminal application and enter the command ping 192.168.255.1. If you see a reply from that IP address, but you cannot connect to the Network Installation Wizard, check the firewall settings and proxy configuration on the laptop. If the issue is not resolved after adjusting the firewall and proxy settings, there might be a problem with the search appliance.

Restart the search appliance. If the search appliance does not start correctly, contact your technical support provider. In software versions before 5.2, contact your technical support provider if the search appliance does not play the melody indicating that the startup process succeeded.

If the ping request times out, the crossover cable might be bad. Try using a different crossover cable. Alternatively, your laptop might not be able to autonegotiate the correct speed and duplex mode with the search appliance. Try connecting the laptop to the appliance using a 10/100T network switch or hub, using regular network cables to connect the laptop and search appliance to the network device. Do not connect any other devices to the switch or hub, because the search appliance will act as a DHCP server for any devices attached to the network.

If you are still unable to ping the search appliance from the laptop, reboot the laptop and the search appliance. If this does not work, contact your technical support provider.

Correcting Problems Connecting to NTP Servers

During the search appliance configuration process, you are prompted to designate an NTP (Network Time Protocol) server.

NTP servers provide a time reference for the search appliance and keep the search appliance's time setting accurate over time. Using an NTP server ensures the accuracy of crawl schedules, times recorded in logs, and the default crawling behavior. The search appliance might reject search results from web servers if the time stamps are not synchronized. It is crucial to designate at least one NTP server when you configure a search appliance.

Several problems can prevent the search appliance from contacting a public NTP server:

  • The NTP server might not be running.
  • The NTP server might not be available from your network.
  • The network on which the search appliance is installed might be experiencing a disruption.

To determine why the search appliance cannot contact the NTP server:

  1. On a computer that is on the same network as the search appliance, open a command prompt.
  2. Issue a ping command to the NTP server:

    ping pool.ntp.org

  3. If you cannot ping the NTP server, ask your network administrator to check whether the network gives the search appliance access to the NTP server.
  4. If you can ping the NTP server, click Update/Diagnose Settings on the installation wizard.

    A temporary network disruption might have prevented the search appliance from contacting the NTP server.
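
A successful ping shows only that the NTP server's host is reachable. NTP itself uses UDP port 123, which a firewall can block independently of ping. The following Python sketch, run from a computer on the same network as the search appliance, sends one SNTP client request and decodes the server's transmit time. It is an illustration under stated assumptions, not part of the search appliance: the function names are hypothetical, and the script does not replace the Update/Diagnose Settings check.

```python
import socket
import struct
from datetime import datetime, timezone

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_EPOCH_OFFSET = 2208988800


def build_sntp_request():
    """Return a 48-byte SNTP request.

    The first byte 0x1B encodes leap indicator 0, version 3, mode 3 (client).
    """
    return b"\x1b" + 47 * b"\x00"


def parse_sntp_transmit_time(packet):
    """Decode the transmit timestamp (bytes 40-47) of an NTP reply as UTC."""
    if len(packet) < 48:
        raise ValueError("NTP reply is shorter than 48 bytes")
    seconds, fraction = struct.unpack("!II", packet[40:48])
    unix_seconds = seconds - NTP_EPOCH_OFFSET + fraction / 2**32
    return datetime.fromtimestamp(unix_seconds, tz=timezone.utc)


def query_ntp(server, timeout=5.0):
    """Send one request to server on UDP port 123 and return its time.

    Raises an OSError subclass (for example, a timeout) if no reply arrives.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_sntp_request(), (server, 123))
        reply, _ = sock.recvfrom(512)
    return parse_sntp_transmit_time(reply)


# Example use:
#   print(query_ntp("pool.ntp.org"))
```

If the ping succeeds but query_ntp times out, ask your network administrator whether UDP traffic to port 123 is allowed from the search appliance's network.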

Correcting DNS and SMTP Configuration Problems

If the search appliance displays errors when you set the DNS or SMTP server values, or when you navigate to the network diagnostics page, there are two possible causes:

  • The router drops TCP packets that are sent to port 53 of the DNS server.
  • The SMTP server requires both a carriage return and a line feed to terminate commands.

To determine which of these problems the search appliance is experiencing:

  1. Type the following command from the command line of a computer that is on the same network as the search appliance:

    telnet dns_server_ip_address 53

    If the command hangs for more than two seconds without any response from the DNS server and without opening a TCP connection to the DNS server, the router might be dropping packets. Some Cisco routers are commonly configured to drop packets sent to this port.

  2. Start the netcat program and issue the following command to the SMTP server:

    echo -e "HELO appliance_ip_address\nQUIT\n" | nc smtp_server_hostname 25

    If the SMTP server does not return a 221 closing connection response, the SMTP server requires both a carriage return and a line feed.

Use one of the following techniques to finish configuring the search appliance:

  • For the duration of the configuration process, disconnect the yellow network cable from the primary network interface card on the search appliance.
  • Use the IP address 127.0.0.1 for the locations of the DNS and SMTP servers.
  • Use an SMTP server that does not require both a carriage return and a line feed, which is characteristic of some SMTP servers running on Microsoft Windows hosts.
  • Change the configuration of the router so that it does not drop packets sent to port 53 on the DNS server.
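
If telnet or netcat is not installed on the test computer, the two verification checks in this section can be approximated in Python. This is a minimal sketch under stated assumptions: the function names are hypothetical, the two-second timeout mirrors the threshold given above, and the SMTP probe sends the same HELO and QUIT commands as the netcat example with a line ending you choose.

```python
import socket


def tcp_connects(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def build_smtp_probe(appliance_ip, line_ending):
    """Build the HELO/QUIT probe, terminated with the given line ending."""
    return f"HELO {appliance_ip}{line_ending}QUIT{line_ending}".encode("ascii")


def smtp_probe_reply(host, appliance_ip, line_ending, port=25, timeout=5.0):
    """Send the probe and return everything the server sends back as text."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_smtp_probe(appliance_ip, line_ending))
        try:
            while True:
                data = sock.recv(4096)
                if not data:
                    break  # server closed the connection
                chunks.append(data)
        except socket.timeout:
            pass  # server stopped responding; keep what was received
    return b"".join(chunks).decode("ascii", errors="replace")


def saw_221(reply):
    """Return True if any reply line starts with the 221 closing code."""
    return any(line.startswith("221") for line in reply.splitlines())


# Example use (addresses are placeholders):
#   tcp_connects("10.0.0.53", 53)   # False suggests TCP port 53 is blocked
#   lf = smtp_probe_reply("smtp.mycompany.com", "10.0.0.5", "\n")
#   crlf = smtp_probe_reply("smtp.mycompany.com", "10.0.0.5", "\r\n")
#   if saw_221(crlf) and not saw_221(lf):
#       print("The SMTP server appears to require CR+LF line endings")
```

Comparing the line-feed-only reply with the carriage-return-plus-line-feed reply makes the second cause easier to confirm than a single netcat run.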

Correcting Errors in Test URLs

On the last page of the network configuration wizard, you can test URLs to see if they are valid. If you enter a test URL and see the following error message, the web site is a secure site:

received 401, should be 200

Test a public URL or ignore the message.

Correcting Timeouts During Update File Upload

If you update the search appliance using the Version Manager's File Upload feature, the file upload request might time out.

The update file is sometimes very large. The computer from which you are attempting to upload the file might not be on the same subnet or in the same data center as the search appliance, and the length of the upload process can exceed the browser timeout interval.

To install a very large update file:

  1. Download the software update to a web server that is local to the search appliance.
  2. Open a browser and connect to Version Manager on the search appliance.
  3. In the URL field in Version Manager, type the URL for the location on the local web server of the update file.
  4. Click Install.
  5. Continue with the update process.

Stylesheet Problems

This section discusses problems you might encounter with the search appliance's stylesheet features.

Ensuring That Changes Made on the Output Format Page Are Reflected on the Results Page

When the search appliance executes a search, it generates the results page by using an XSLT stylesheet to format the results. For performance reasons, the search appliance front end caches the stylesheet. Any changes you make on the Serving > Front Ends > Format page are not reflected in the Test Center or search interface until the cached stylesheet is refreshed.

Use the proxyreload query parameter to force a particular front end to refresh the cached stylesheet.

To use the proxyreload parameter:

  1. Load the search interface for the front end in a browser.

    If you are changing the default front end, entering the search appliance's hostname or IP address in the location bar is normally sufficient.

  2. When the search interface is displayed, add the proxyreload parameter to the end of the URL displayed in your browser's location bar:

    &proxyreload=1

    For example, this might be the search interface URL:

    http://appliance.mycompany.com/search?site=default_collection&client=default_frontend&output=xml_no_dtd&proxystylesheet=default_frontend&proxycustom=%3CHOME/%3E

    To reload the stylesheet for the front end default_frontend, change the URL to:

    http://appliance.mycompany.com/search?site=default_collection&client=default_frontend&output=xml_no_dtd&proxystylesheet=default_frontend&proxycustom=%3CHOME/%3E&proxyreload=1

  3. Press the Enter key.

    The front end reloads the stylesheet and displays the result in the browser.

For more details about query parameters, see the Search Parameters section of the Search Protocol Reference.
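
If you refresh stylesheets often, you can build the reload URL with a script instead of editing it by hand. The following Python sketch appends proxyreload=1 to a search URL unless the parameter is already present. The function name is hypothetical; the URL in the usage comment is the example used in this section.

```python
from urllib.parse import parse_qs, urlsplit, urlunsplit


def with_proxyreload(url):
    """Return url with proxyreload=1 appended to its query string.

    The existing query string is kept byte for byte, so escaped values
    such as proxycustom=%3CHOME/%3E are not re-encoded.
    """
    parts = urlsplit(url)
    if "proxyreload" in parse_qs(parts.query, keep_blank_values=True):
        return url  # already present; leave the URL unchanged
    if parts.query:
        query = parts.query + "&proxyreload=1"
    else:
        query = "proxyreload=1"
    return urlunsplit(parts._replace(query=query))


# Example use:
#   print(with_proxyreload(
#       "http://appliance.mycompany.com/search?site=default_collection"
#       "&client=default_frontend&output=xml_no_dtd"
#       "&proxystylesheet=default_frontend&proxycustom=%3CHOME/%3E"))
```

Requesting the resulting URL once, in a browser or with any HTTP client, is enough to trigger the reload for that front end.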

Crawl Problems

Crawl is the process a search appliance uses to discover content for indexing and serving. The process can be slow or can result in error messages, or a document might not be crawled. You might become aware of a crawl problem when a document does not appear in search results and you expect the search query to return the document.

Use the information in this section to help you diagnose and fix crawl problems.

Determining Whether MTU Settings Are Causing a Crawl Problem

Some crawl problems are related to MTU (maximum transmission unit) negotiation between a web server and the search appliance. To determine whether MTU negotiation is causing crawl problems, determine whether your web server requires a particular MTU and compare the web server's MTU to the MTU setting of the search appliance. The following chart lists the MTU of each search appliance model under software version 5.0.G12 and later.

Search Appliance Model MTUs
Google Search Appliance GB-7007 1500
Google Search Appliance GB-9009 1500

If the web server requirements do not match the MTUs provided by the search appliance, consult the documentation for the web server or consider replacing the web server.

Correcting Invalid URL Errors on the Network Diagnostics Page

The search appliance requires valid URLs for retrieving web pages. DNS servers cannot resolve invalid URLs and the search appliance cannot ping or communicate with a host without a valid URL. You can use the Network Diagnostics section of the Admin Console (Administration > Network Settings) to determine whether a particular URL is valid.

A valid URL must contain the following elements:

  • The protocol

    For example, the URL www.mywebsite.com/ is not a valid URL because it does not include the protocol. The URL http://www.mywebsite.com/ is a valid URL.

  • A fully-qualified domain name (FQDN)

    For example, http://www/ is not a valid URL because it does not include the fully-qualified domain name. The URL http://www.mywebsite.com/ is a valid URL.

  • Path information

    For example, http://www.mywebsite.com is not a valid URL because it lacks the trailing forward slash, so the path information is not complete. The URL http://www.mywebsite.com/ is a valid URL.

If you enter a URL in the Network Diagnostics field and see the error message Not a valid URL, correct the URL format and try again.

You might also see the error if the DNS server used by the appliance is not configured to resolve fully qualified domain names. If the URLs you enter in Network Diagnostics are all valid URLs, ensure that the DNS server is properly configured to resolve FQDNs.
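
To screen a list of URLs before you enter them in Network Diagnostics, you can apply the three rules above in a script. The following Python sketch is an approximation written for this document, not the search appliance's own validation logic; the function name is hypothetical, and the FQDN test (the host name contains at least one dot) is a deliberately simple heuristic.

```python
from urllib.parse import urlsplit


def url_problems(url):
    """Return a list of reasons why url fails the three rules, or []."""
    parts = urlsplit(url)
    if not parts.scheme or not parts.netloc:
        # Without a protocol, the host and path cannot be told apart.
        return ["missing protocol (for example, http://)"]
    problems = []
    host = parts.hostname or ""
    if "." not in host.strip("."):
        problems.append("host is not a fully-qualified domain name")
    if not parts.path:
        problems.append("missing path (at minimum a trailing /)")
    return problems


# Example use with the URLs from this section:
#   for url in ["www.mywebsite.com/", "http://www/",
#               "http://www.mywebsite.com", "http://www.mywebsite.com/"]:
#       print(url, url_problems(url) or "looks valid")
```

A URL that passes this check can still be rejected if the DNS server cannot resolve the host name, as described above.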

Correcting a Slow Crawl Process

Two major problems can cause a slow crawl process on a search appliance:

  • Network issues

    See Connectivity and Network Problems for additional information on diagnosing and fixing network problems.

  • Too many simultaneous connections

    When too many simultaneous connections are opened from the search appliance to the content server, the content server is not able to handle all of the requests coming from the search appliance. By default the appliance uses 4 concurrent connections, at all times, for each content server. This setting might be too high for some content servers, particularly file servers, which handle smb:// URLs. When there are too many connections, the Crawl Diagnostics page indicates that some URLs are returning different retrieval errors, including HTTP 500 Error, Temporary DNS failure, Connection failed, Connection timeout, Connection closed, Connection refused and, very often, Connection reset. The Host Load setting controls the maximum number of simultaneous connections the crawl software can open to each web server. Decrease the Host Load setting on the Crawl Parameters page of the Admin Console and re-crawl the URLs that are returning errors to see if they no longer result in retrieval errors.

If decreasing the Web Server Host Load setting does not improve the speed of the crawl, use the information in the following table to determine why and fix the problem.

Question or Action Action or Value
On the Admin Console, go to Status and Reports > Crawl Status while the search appliance is running. In the Crawl Status table, examine the value of Current Crawling Rate. Determine whether the crawling rate is lower than expected.
Try to download some URLs that are queued for crawl. To do this, navigate on the Admin Console to Status and Reports > Crawl Queue, examine the URLs, and copy and paste some URLs to your browser. Record how long each URL takes to download. If the URLs download slowly in your browser, there might be a problem with the web server, or the network between your computer and the web server might be slow.

If there are HTTP errors in the browser from the URLs, it might indicate that the server is having problems with the crawler, which retries URLs. The retries can slow down the web server and responses from the web server.
From a computer located on the same network segment as the search appliance, try to download a large static file from the web server while the crawl is running. Record how long the download takes. Proceed to the next step.
From a computer located on the same network segment as the search appliance, try to download the same large static file from the web server while the crawl is stopped. Record how long the download takes. If the file downloads faster when the crawl is stopped, network connectivity might be a problem.
Use the ping and traceroute commands to determine whether there are packet losses, speed problems, or other network connectivity problems between your computer and the web server. If ICMP is disabled on the network, use tcptraceroute.  
Note down the times when the crawl process is slow. If the search appliance crawls several web servers, run the crawl through a proxy. Examine the web server access logs and proxy logs for times when the crawl process is slow.
Determine whether the content includes Office or other non-HTML content files that are being crawled when the crawl process is slow. After the search appliance crawls non-HTML files, the files must be converted to HTML. This is a CPU-intensive process that can slow down the crawl process. The conversion process sometimes times out on complex Microsoft Office documents, which can lead to slower crawling. If the search appliance is crawling such documents, click the cached link for a complex Office document when the crawl is complete. If there is no content cached, the file failed the conversion process.
Determine whether the search appliance is crawling encrypted PDFs. The tool that converts PDFs to HTML can cause slow crawl if you are indexing many encrypted PDFs.
Determine whether the content includes a large number of non-document files in locations that are crawled. If a large number of data files match the URL patterns in the Always Force Recrawl field under Crawl and Index > Freshness Tuning, the crawl process might become steadily slower each time the crawl runs. To solve the problem, modify the URL patterns in the Always Force Recrawl field so that they do not match data files or exclude the data files from the crawl.
Determine whether the crawl slows down toward the end of the process. This can happen if most files can be fetched quickly, but a few are very slow to be retrieved, or if one web server is slower than the others.
Ask Google Technical Support to determine whether the search appliance has disk problems. Some disk problems can slow down the crawl. These disk errors are not recorded in the System Event Log. Google Technical Support can connect to the search appliance to determine whether the disks are causing slow crawl.
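
Several rows in the table above ask you to record how long a download takes. The following Python sketch times a single download so that you can compare the result while the crawl is running with the result while the crawl is stopped. The function name and the URL in the usage comment are placeholders. Run the script from a computer on the same network segment as the search appliance.

```python
import time
import urllib.request


def timed_download(url, timeout=60.0, chunk_size=65536):
    """Download url, discard the data, and return (bytes_read, seconds)."""
    start = time.monotonic()
    total = 0
    with urllib.request.urlopen(url, timeout=timeout) as response:
        while True:
            chunk = response.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total, time.monotonic() - start


# Example use (the URL is a placeholder for a large static file):
#   size, seconds = timed_download("http://www.mywebsite.com/files/large.zip")
#   print(f"{size} bytes in {seconds:.1f} s")
```

Repeat the measurement a few times in each state, because a single download can be skewed by caching or by a brief spike in network load.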

Ensuring That a Particular URL Is Crawled

If a particular URL is not being crawled, use the checklist in the following table to determine the solution to the problem.

Question Actions Your Answer
Are you updating the software on the search appliance? If the software is being updated, the results might be from the old version of the index. The index might not include the document yet.

To verify whether you are updating the software, open the URL http://your_search_appliance:9941/ in a web browser.
 
Did the search appliance successfully crawl the document? Enter the URL of the document on the Status and Reports > Crawl Diagnostics page, select the correct collection for the document, and click Show URLs. If an error is displayed, resolve the error, then recrawl the URL from the Crawl and Index > Freshness Tuning page.  
Is the document in an indexable format? For a list of file types the search appliance can crawl and index, see Indexable File Formats.  
Is the document an indexable size? For more information on document size, see What File Sizes Can Be Indexed? in Planning for Search Appliance Installation.  
Is the document's location included in the crawl URLs? Review the crawl URLs on the Crawl and Index > Crawl URLs page on the Admin Console. Ensure that the document URL matches one of the patterns in the Follow and Crawl Only URLs field. It must not match any of the patterns in the  Do Not Crawl URLs with the Following Patterns field. Use the pattern-testing utility in the Admin Console to test whether the patterns set allow a URL to be crawled.  
Is the crawl blocked by a robots.txt file or ROBOTS meta tag? Examine the web site and content server, and examine the HTML of the page that is not in the search results. If there is any error other than a 404 - page not found error upon retrieving robots.txt, the crawler will not crawl the server. Use the telnet command shown above to check for errors retrieving robots.txt. If all the links to a URL have a nofollow robots meta tag, the crawler does not fetch the URL.  
Is the location inaccessible because the search appliance does not have the credentials required to crawl the location? Review the security model for your content and the Crawl and Index > Crawler Access page on the Admin Console.  
Is the web page protected by NTLM or Basic authentication or by Forms Authorization? The web page might not be available for public content searches. Search for the content using a Public and secure content search. Is the web page returned as a secure result? When you are prompted for credentials, enter the same credentials that are configured under Crawler Access or Forms Authentication.  
Is the document URL a Lotus Domino URL? The search appliance rewrites Lotus Domino OpenDocument URLs, URLs with the suffix #, and multiple versions of the same URL. Some Lotus Domino URLs are rejected because of the parameters they contain. For complete information, see Lotus Domino Enterprise Server in Monitoring and Troubleshooting Crawls.  
Is the crawl running? Check the Status and Reports > Crawl Status page on the Admin Console. If the crawl is not running, restart the crawl.  
Is the crawl completed successfully? Review the system event log (Status and Reports > Event Log) to verify that the search appliance successfully completed the crawl. If the crawl was not completed successfully, the URL might not have been crawled.  
Are the start URLs being crawled? Record the start URLs on the Crawl and Index > Crawl URLs page and check the Status and Reports > Crawl Diagnostics page to determine whether the URLs are being crawled. If the start URLs are not being crawled, check the network connection between the search appliance and the web server. Use the URLs to Test field on the Administration > Network Settings page to determine whether the search appliance is able to crawl the URLs.  
Did the search appliance crawl more than the maximum number of URLs permitted by the license? Check the Administration > License page for the licensed number of pages and the Status and Reports > Crawl Status page for the number of pages (URLs) crawled by the search appliance. If the search appliance is crawling more documents than the number for which it is licensed, the search appliance silently stops crawling new documents. For additional information on how the search appliance handles license limits, read Administering Crawl.  
Can the crawler follow the link to the document? The crawler can follow normal HTML links and HTML links embedded in Flash content, MS Word documents, and PDF files. The crawler cannot follow links embedded in JavaScript code and the crawler cannot submit HTML forms. Find a URL that links to the missing page and verify that it has been crawled on the Crawl Diagnostics page. Examine the page source of the linking page in a web browser and ensure that the link is an HTML link. If the link is relative, ensure that the path is correct. It is best to look at the source of the cached page to verify that you are looking at the content that has been indexed by the appliance. You cannot use the link special query term to see if any pages link to the missing page, because this only returns results for links to pages that are indexed. If there is no valid link in a crawled document, you can either add the URL as a starting URL or add a regular HTML link from a jump page.  
Are too many URLs listed in the Crawl Frequently field? On the Admin Console, navigate to Crawl and Index > Freshness Tuning > Crawl Frequently. URLs listed in the field are crawled at least once a day. If too many URLs are listed, this causes a crawl backlog.  
Is the Host Load setting too low? The Host Load setting controls the speed at which the search appliance crawls the hosts. If the number is too low, the crawl process slows down. Read the help system for Google Search Appliance for more information about the Host Load setting.  
Are there any other settings in the Admin Console that may stop the crawler? On the Host Load Scheduling page, ensure that you have not specified a Maximum Number of URLs to Crawl.  
Are there any network problems? You can attempt to retrieve the URL by going to the Network and System Settings page in the Admin Console. Enter the URL in the URLs To Test field of Network Diagnostics. You need an account on the search appliance with Administrator privileges to perform this test. This form indicates whether you have any DNS or network connectivity problems.  

Ensuring That All URLs Are Crawled

If it appears that the search appliance is not crawling any URLs, use the information in the following table to determine why.

Question Actions
Is the crawl running? Check the Status and Reports > Crawl Status page on the Admin Console. If the crawl is not running, restart it.
Did the crawl finish successfully? Review the system event log (Status and Reports > Event Log) to verify that the search appliance successfully completed the crawl. If the crawl was not completed successfully, the URL might not have been crawled.
Do the follow and crawl URLs match all of the URLs on the sites you want the search appliance to crawl? For example, if http://mysite.com/index.jhtml is both the start URL and the follow and crawl URL, only the start URL matches the follow and crawl pattern. Therefore, only http://mysite.com/index.jhtml is crawled.

If http://mysite.com/index.jhtml is the start URL and http://mysite.com/ is the follow and crawl URL, all pages on mysite.com are crawled.

To check whether the URLs you want crawled match the follow and crawl patterns, use the pattern tester on the search appliance. On the Crawl and Index > Crawl URLs page, click Test These Patterns, which is above the Follow and Crawl field.
Is part or all of a site blocked by a robots.txt file? Check the web servers to see if there are robots.txt files. A robots.txt file tells a web crawler which URLs on the site cannot be crawled.
Are robots meta tags blocking any documents on a site? Documents on the site might contain robots meta tags that prevent web crawlers from crawling the documents.
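
The robots.txt check in the table above can also be done offline against a copy of the file. The following is a minimal sketch in Python, not a tool supplied with the search appliance. The function name blocked_urls is hypothetical, and the user agent string gsa-crawler is an assumption; substitute the user agent that your crawler actually sends.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt, urls, user_agent="gsa-crawler"):
    """Return the URLs that the given robots.txt text disallows.

    robots_txt is the text of a robots.txt file, urls is a list of
    absolute URLs on the same site, and user_agent is the crawler name
    to test (an assumed value by default).
    """
    parser = RobotFileParser()
    # Parse the rules from text instead of fetching them over the network.
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


if __name__ == "__main__":
    # Hypothetical rules and URLs, for illustration only.
    sample_rules = "User-agent: *\nDisallow: /private/\n"
    sample_urls = [
        "http://mysite.com/index.jhtml",
        "http://mysite.com/private/report.html",
    ]
    for url in blocked_urls(sample_rules, sample_urls):
        print("blocked by robots.txt:", url)
```

This sketch covers robots.txt rules only. Robots meta tags live inside each document and must be checked in the page source.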

Serving and Results Page Problems

Serving is the process by which the Google Search Appliance presents search results to the user. A serving error can indicate a problem with the crawl on your search appliance. For example, a test query might not return all documents you expect to find on the results page. For more information, see Crawl Problems.

Determining Why a Particular Document Is Not in the Search Results

A document might not be returned on the results page for many different reasons:

  • The document's location might not be accessible to the crawler.
  • The crawl configuration might be incorrect.
  • The document might not be relevant to the search query.
  • The document might not be in a format that the search appliance can crawl and index.
  • The document is secured by access control, but you are searching only for public content.
  • The document is not available for the credentials you are using to search public and secure content.
  • The search appliance is unable to determine whether the document is available for the credentials you are using to search public and secure content.

Most of the time, a crawl problem is responsible for the absence of a document from the search results. When you find that a document is not in the search results and you believe that the search should have returned the document, identify the URL of the document or web page and complete the checklist in Ensuring That a Particular URL Is Crawled and the checklist that follows.

Question Information Your Answer
Was the document's URL filtered from the search results? The URL might be filtered if the snippet generated for the query was identical to the snippet for another page considered to be more relevant. To test this, repeat the search and add the term &filter=0 to the URL. The term suppresses the result filtering.

The URL might be filtered if the page is the exact duplicate of another page that was crawled. Duplicate filtering cannot be disabled.

 
Are you installing a software update on the search appliance? If the software is being updated, you might be looking at results from the old version of the index.

To verify whether you are updating the software, open the URL http://your_search_appliance:9941/ in a web browser.
 
Is the URL relevant for the search query? The URL might not be a relevant result for the query. Use the info term to determine whether the search term retrieves the URL. Enter the following URL in the search box:

info:http://www.mycompany.com/path/file.html

If the query retrieves the URL, examine the search request that was sent to the search appliance earlier. Verify that the terms in the q parameter are included on the web page or in a link to the web page. Check the other parameters that are submitted to the search appliance as part of the search request. Ensure that the query does not include the restrict parameter, which would limit the results to a particular subcollection. Ensure that there are no parameters in the query that would remove the URL from the results.
 
Are you searching within a collection or subcollection? Use the Status and Reports > Crawl Diagnostics page to view information for a specific collection. If you have a default_collection that contains all URLs, change the site parameter in the query to read site=default_collection. Verify that the patterns for defining the collection match the URL of the web page.  
Is the document tagged with default_collection? Is the document tagged with any collection? Try removing the site parameter and its value from the search request. By default, the value site=default_collection is applied to all documents in the index, but the value for the site parameter can be changed. If the patterns for that collection have been modified, searching without the site parameter ensures that all documents in the index are potential search results, even if they are not tagged with default_collection.  
Is the document being removed from the search results because of a Remove URLs pattern set in the front end? Ensure that the Remove URLs field for the front end is blank.
You can also test the document with a new front end that does not have any custom XSLT, to make sure the front end is not removing the document from the results.
 
Are you searching for public content only? Documents crawled using NTLM, HTTP Basic, or Forms Authentication rules are tagged as secure content unless the appropriate Make Public checkbox is checked. Try searching on both public and secure content.  
Are you using credentials that are authorized to see the document? If the document is not in the results when you search for public and secure content, try using the same credentials the crawler is using for the crawl and index process. These are the credentials that you entered on the Crawler Access page or the Forms Authentication page.  
Are you receiving an HTTP 500 Internal Server Error when you search for public and secure content? This error indicates an internal error when the search appliance tries to authorize results. If you are using the SAML bridge for results authorization, make sure the SAML server is returning valid SAML 2.0 XML.  
Is the appliance able to authorize results? If searching for public and secure content takes a long time (up to 30 seconds) and no results are returned, but a second try a few minutes later does return results, the search appliance is probably not getting timely responses during the authorization process. Make sure the authorization and content servers respond quickly to requests from the appliance. File servers, where smb:// URLs are located, are sometimes prone to slowness.  
Is the page marked as secure content in the Admin Console? You must have the access parameter in your query set to a to retrieve controlled-access content. You must be authorized to view the document with the credentials that you supplied to the appliance. The appliance must not encounter timeout problems when sending an authorization query to your content server or the Authz SPI server.  
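
Several checks in the table above involve changing the search request: adding filter=0, setting or removing the site parameter, and setting access to a. The following Python sketch assembles such test requests. It is an illustration under stated assumptions, not part of the search appliance: the function name diagnostic_search_url is hypothetical, and the /search path is an assumption. Your front end may require additional parameters that are not shown here.

```python
from urllib.parse import urlencode


def diagnostic_search_url(appliance_host, query, site=None,
                          unfiltered=True, secure=False):
    """Build a test search URL from the parameters named in the checklist.

    appliance_host is a placeholder such as "your_search_appliance".
    The "/search" path is an assumption; adjust it for your deployment.
    """
    params = [("q", query)]
    if site:
        # Pass site=None to search without the site parameter.
        params.append(("site", site))
    if unfiltered:
        # filter=0 suppresses result filtering.
        params.append(("filter", "0"))
    if secure:
        # access=a requests controlled-access content as well.
        params.append(("access", "a"))
    return "http://%s/search?%s" % (appliance_host, urlencode(params))


if __name__ == "__main__":
    print(diagnostic_search_url(
        "your_search_appliance",
        "info:http://www.mycompany.com/path/file.html",
        site="default_collection",
    ))
```

Building each variant from one function makes it easier to change a single parameter at a time and compare the results.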

Correcting Serving Failures Caused by Excessive Collections

Serving fails if there are too many collections on a search appliance. If you have 1,500 collections or more, delete sufficient collections to bring the number below 1,500 and see if serving works.

Determining Why the Appliance Is Not Serving Results

If the search appliance is running and the Admin Console is functioning, but the search appliance is not serving results, check to see whether the license for the search appliance is valid. On the Admin Console, navigate to Administration > License. The third row of the table displays the expiration date of the license installed on the search appliance.

Connectivity and Network Problems

Difficulty connecting to the search appliance might have a number of causes. If you are experiencing connectivity problems, the problems can indicate incorrect speed or duplex settings on the switch used by the search appliance or problems on your network.

Network and connectivity problems can result in crawl problems. If the actions outlined in this section do not resolve the search appliance's problems, see Crawl Problems for more steps you can take to diagnose and fix the problems.

Symptoms Possible Cause Actions to Diagnose To Fix the Problem
Intermittent or continuous connectivity problems, inability to crawl, or inability to serve results Another device on the network with the same IP address as the search appliance Physically disconnect the search appliance from the network and try to ping the IP address of the search appliance. If you get a response, another device has the IP address of the search appliance. Disconnect the other device or assign the other device or the search appliance an unused IP address.
Poor network performance and intermittent connectivity problems; packet loss when attempting to ping the search appliance; collisions. Appliance is using half duplex setting even though you configured it with full duplex. Speed or duplex setting on network switch does not match the settings on the search appliance Use diagnostic software to display statistics for the port on the switch that is connected to the search appliance. A large number of collisions indicates a possible speed or duplex mismatch. Set the switch port to autonegotiate.

If the switch is a Cisco Catalyst and you use a Google Search Appliance model GB-1001, connect the search appliance to a hub that is connected to the switch.

You can use the ping tool to help diagnose network and connectivity problems when ICMP traffic is permitted between the host where the tests are performed and the search appliance. If ICMP traffic is blocked, use the hping tool to run tests over TCP or UDP. The following sections discuss some error messages you might see when you attempt to ping or hping the search appliance from a host on your network.

Ping Error Possible Cause Actions to Diagnose To Fix the Problem
Destination Host Unreachable A local or remote route does not exist for the destination host. Check the local routing table on the test host to see whether static routes are used to reach the appliance. If the appliance is reached using the default route, confirm that the default route is set. If the settings are correct on the host, determine whether the appliance is reachable from the router that is configured as the default gateway for the appliance. If the appliance is reachable from the router, check the routes set on the router. Correct the routing table or the routes set on the router.
Destination Host Unreachable The search appliance is down or is not operating on the network. Check the switch port where the appliance is connected to the network and ensure that it has not been shut down. Examine the switch and confirm that the physical link is up on the port where the search appliance is connected. If the link is down, ensure that the search appliance is powered up and the network cable is properly connected to the appliance and the network. Ensure that all connections are correct and the switch and switch port are operating correctly. Ensure that the search appliance is running.
Request Timed Out Network congestion. Run the ping command multiple times and confirm that timeouts are mixed with good responses from the search appliance. Ask your network administrator to investigate and correct network congestion.
Request Timed Out Another device on the network has the same IP address as the search appliance. Run the ping command multiple times and confirm that timeouts are mixed with good responses from the search appliance. Physically disconnect the search appliance and ping the search appliance IP address. Alternatively, use the arping tool from a host on the same subnet as the search appliance. If two MAC addresses are returned from the search appliance IP address, another device on the network has the same IP address as the search appliance. Consult the MAC address tables of the network devices on the subnet to identify the device. Remove the device with the duplicate IP address from the network or change its IP address.
Request Times Out The search appliance is not connected to the network, is shut down, or is not configured correctly. Run the ping command and see only timeouts. Check the switch port where the appliance is connected to the network and ensure that it has not been shut down. Examine the switch and confirm that the physical link is up on the port where the search appliance is connected. If the link is down, ensure that the search appliance is powered up and the network cable is properly connected to the appliance and the network. Ensure that all connections are correct and the switch and switch port are operating correctly. Ensure that the search appliance is running.
Request Times Out The ping request is going to an invalid IP address, or the search appliance is not on the same network as the current host and an intermediary device is not configured correctly. Run the ping command and see only timeouts. Check the IP addresses of the devices. Ask your network administrator to check the configuration of network devices between the test host and the search appliance.
Request Times Out Network congestion. Run the ping command and see only timeouts. Use the -w flag with the ping command to increase the timeout period and see if you start to get responses from the search appliance. Ask your network administrator to investigate and correct network congestion.
Unknown Host The IP address or host name does not exist on the network or the destination host name cannot be resolved. Verify the name and availability of DNS servers. Add the IP address or host name to the DNS servers.
Unknown Host Packet loss on the path between the test host and the search appliance. Use the MTR tool to verify whether packet loss is on the search appliance or occurring on the path between the test host and the search appliance. Ask your network administrator to investigate the causes.
Unknown Host Packet loss on the search appliance. Use the MTR tool to verify whether packet loss is on the search appliance or occurring on the path between the test host and the search appliance. If the packet loss is on the appliance, verify that the speed and duplex settings on the search appliance correspond with the settings on the switch port. If the settings correspond, switch the search appliance network cable, test with a different port on the switch, or test with a different switch. If the settings do not correspond, correct the settings.
Unknown Host Packet loss on the search appliance indicating that another device on the network has the same IP address as the search appliance. Use the MTR tool to verify whether packet loss is on the search appliance or occurring on the path between the test host and the search appliance. Physically disconnect the search appliance and ping the search appliance IP address. Alternatively, use the arping tool from a host on the same subnet as the search appliance. If two MAC addresses are returned from the search appliance IP address, another device on the network has the same IP address as the search appliance. Consult the MAC address tables of the network devices on the subnet to identify the device. Remove the device with the duplicate IP address from the network or change its IP address.
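
The table above distinguishes between timeouts that are mixed with good responses and runs that produce only timeouts. The following Python sketch shows one way to apply that distinction to a series of ping results. The function name, the boolean-list input, and the category labels are hypothetical; the possible causes are taken from the table rows above.

```python
# Possible causes from the table above, keyed by the observed pattern.
POSSIBLE_CAUSES = {
    "no timeouts": [],
    "timeouts mixed with replies": [
        "network congestion",
        "another device has the same IP address as the search appliance",
    ],
    "only timeouts": [
        "search appliance not connected, shut down, or not configured correctly",
        "invalid IP address or misconfigured intermediary device",
        "network congestion",
    ],
}


def classify_ping_run(replies):
    """Classify a series of ping results.

    replies is a list of booleans, one per ping request:
    True means a reply was received, False means the request timed out.
    """
    if not replies:
        raise ValueError("need at least one ping result")
    timeouts = replies.count(False)
    if timeouts == 0:
        return "no timeouts"
    if timeouts == len(replies):
        return "only timeouts"
    return "timeouts mixed with replies"


if __name__ == "__main__":
    pattern = classify_ping_run([True, False, True, True, False])
    print(pattern)
    for cause in POSSIBLE_CAUSES[pattern]:
        print("  possible cause:", cause)
```

The classification only narrows down which rows of the table to consult. The diagnostic actions in those rows are still needed to identify the actual cause.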

Hardware Problems

This section discusses possible solutions to hardware problems the search appliance might experience.

Determining Whether the Search Appliance Has a Hardware Problem

If you think your search appliance is experiencing a hardware problem, determine the answers to the following questions and then contact Enterprise Technical Support using the Contact Us page on the Technical Support web site.

Question Your Answer
What is the search appliance identification number (appliance ID, also known as the serial number)? You can find the appliance ID on the Admin Console on the Administration > License page or on the label on the back of the search appliance.  
Are the LEDs on the search appliance lit? See Using the LEDs to Help Diagnose Hardware Problems for more information on the location and colors of the LEDs.  
If the lights are lit, what colors are the disk and link lights on Google Search Appliance? See Using the LEDs to Help Diagnose Hardware Problems for more information on the LEDs.  
Can you ping the search appliance from another computer that is on the same subnet?  
If you cannot ping the search appliance, what happens when you try to ping it?  
If you cannot ping the search appliance, attach a keyboard and monitor to the search appliance. Press the space bar to start video output, then take photos or record the text of any output you see on the screen. Contact Support and provide the screen output details. If the screen shows only the ent1 login: prompt and no other text, it is safe to issue a soft reboot by pressing Ctrl+Alt+Del. If you issue a soft reboot, record any error messages during or after the restart, then contact Support with the details. Otherwise, do not reboot or power-cycle the appliance unless a Support engineer asks you to do so. For information on the type of keyboard and monitor your search appliance model requires, see the section Required Hardware and Software in Planning for Search Appliance Installation.  
Can you connect to the Admin Console with a browser using http on port 8000? If you see an error, what is the error?  
Can you connect to the Version Manager with a browser using https on port 9941?  
What error do you see when you point a web browser to port 80 on the search appliance?  
If the search appliance is not serving, when did it stop serving results?  
Is the failure to serve results permanent or intermittent?  
If the failure to serve is intermittent, how frequent are the failures and how long do they last? Can you identify any patterns in the failures?  
Can searches be performed from port 80 or port 7800?  
If searches cannot be performed on port 80 or port 7800, what happens when you try to access those ports?  
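
Several questions in the checklist above ask whether specific TCP ports respond. The following Python sketch tests whether a TCP connection can be opened to each port named in the checklist. It is a rough aid under stated assumptions, not a Google-supplied tool: the function names and the default three-second timeout are hypothetical. It confirms only that a port accepts connections, not that the service behind it is working correctly.

```python
import socket

# Ports named in the checklist above and what they are used for.
APPLIANCE_PORTS = {
    80: "search",
    7800: "search",
    8000: "Admin Console",
    9941: "Version Manager",
}


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


def check_appliance_ports(host, ports=APPLIANCE_PORTS, timeout=3.0):
    """Return a dict mapping each port number to True (open) or False."""
    return {port: port_is_open(host, port, timeout) for port in ports}
```

For example, calling check_appliance_ports("your_search_appliance") from a host on the same subnet returns one True or False value per port, which you can record in the Your Answer column before contacting support.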

Determining Why the Search Appliance Does Not Power On

If it appears that the search appliance is not powering on, complete the following checklist before contacting your technical support provider.

Question Your Answer
Are the LEDs on the search appliance lit? See Using the LEDs to Help Diagnose Hardware Problems for more information on the location and colors of the LEDs.  
Can you hear fan noises?  
If you are using a Google Search Appliance model S5, did you use the power button on the front panel of the search appliance to turn it on?  
Did you try using a different power cable and electrical socket before you powered up the search appliance?  
Did you try draining the current out of the appliance, then restarting? To drain the current, unplug the appliance power cords and hold the power button for 30 seconds or more.  

Determining Why the Search Appliance Does Not Start Correctly

If the search appliance does not start correctly, you can record any error messages the search appliance displays during startup.

To display error messages during startup:

  1. Attach a keyboard and monitor to the appliance.

    For information on the keyboard your search appliance model needs, see What Hardware and Software Do I Need? in Planning for Search Appliance Installation.

  2. Restart the appliance by pressing the power button.
  3. Record any errors you see on the monitor.
  4. Complete the checklists in Determining Whether the Search Appliance Has a Hardware Problem and Determining Why the Search Appliance Does Not Power On.
  5. Contact your technical support provider with the information you collected.

The following table lists errors that might occur on startup on some search appliance models.

Search Appliance Model Error
Google Search Appliance S5 series Pre- or post-startup memory check fails. When a monitor is attached to the Google Search Appliance, the following error message is displayed:

MEMORY ERROR!!!! PLEASE CHECK THE SDRAM CONNECTION.
Google Search Appliance S5 series The Google Search Appliance is not responding and rebooting does not fix the problem. When a monitor is attached and the search appliance is rebooted, you see the following prompt:

If you try to configure the search appliance, the RAID controller displays the following error message on the monitor:

Foreign configuration(s) found on adapter.
Press any key to continue, or 'C' to load the configuration utility


The error indicates a known RAID controller problem and might also report that some disks are missing. You will not be able to ping the search appliance or connect to the search appliance on any port.
  LILO initialization normally prints one character to the screen during different stages of the initialization process. When the process is completed normally and a monitor is attached, the following prompt is displayed:

LILO

If there is an error, the process displays an incomplete prompt:

LI
GB-1001, GB-7007, GB-9009 If the LILO process is unable to complete loading the operating system kernel, the following prompt is displayed on the monitor:

LILO
Loading 2431min212EFD9....................
GB-1001, GB-7007, GB-9009 The monitor displays a prompt requesting the root password.
GB-1001, GB-7007, GB-9009 The boot process does not finish, and the following error message is displayed on the attached monitor:

hda: read_intr: status=0x59 {DriveReady SeekComplete DataRequest Error }
hda: read_intr: error=0x40 { Uncorrectable Error }, LBAsect=1048671, sector=1048608
end_request : I/O error, dev 03:01 (hda), sector 1048608
EXT2 - fs error (device ide(3,1)): ext2_read_inode: unable to read inode block - inode = 64385, block = 131076
INIT : version 2.78 booting
INIT : No inittab file found
INIT: can't open(/etc/ioctl.save, O_WRONLY): Input/output error

GB-1001, GB-7007, GB-9009 The boot process does not finish, and the following error message is displayed on the attached monitor:

hda: read_intr: status=0x59 {DriveReady SeekComplete DataRequest Error}
hda: read_intr: error=0x40 {Uncorrectable}, LBAsect=78, sector=15
end_request: I/O error, dev 03:01 (hda), sector 15
EXT2-fs: unable to read group descriptors
Kernel Panic: VFS: Unable to mount root fs on 03:01
GB-1001, GB-7007, GB-9009 The boot process does not finish. You might be able to ping the search appliance, but no services are running and the following error message is displayed:

Starting enterprise_network: reading enterprise_config [FAILED]

Using the LEDs to Help Diagnose Hardware Problems

The Google Search Appliance has colored LEDs on its case. The LEDs are lit differently depending on the search appliance's condition. In addition, the Google Search Appliance has colored LEDs for each disk drive, which are discussed in Using the LEDs to Help Diagnose Disk Problems on the Google Search Appliance.

Google Search Appliance LEDs

The following table provides information about the LEDs found on the Google Search Appliance (Model GB-1001, S5 series).

LED Location LED Function LED Color and Meaning
Front panel Power indicator When the LED is green, Google Search Appliance has power.
When the LED is not lit, Google Search Appliance does not have power.
Power supply AC power input status When the LED is lit, AC electricity is coming into the search appliance and power supply.
When the LED is not lit, AC electricity is not coming into the power supply. Try a different power cord and check the power inlet.
Power supply DC power output status When the LED is lit, DC power is applied to the server.
When the LED is not lit, no DC power is output.
Power supply Power supply status When the LED is lit, there is a fault in the power supply unit.
When not lit, the power supply is healthy.
Back of case System identification When the LED is blue or blinking blue, the system is healthy.
When the LED is amber or blinking amber, the system has detected a failure and might operate in degraded mode.

To make the LED blink, push the system identification button on the front of the search appliance. The blinking modes make it easy to identify the search appliance when it is in a rack that contains multiple devices.

If the LED is amber, the search appliance might have a memory error; the fan might not be working; the appliance might be overheating or might have a CPU error; there might be a power supply failure. Contact your technical support provider if the light is amber.
Back of case (on network interface card) Activity/Link When the LED is lit, the Google Search Appliance is connected to the network.
When the LED is blinking, there is activity on the line and data is flowing between the Google Search Appliance and the network.
When the LED is not lit, there is no link to the network.
Back of case (on network interface card) Link speed for GB-1001 only, not multinode models When the LED is amber, speed is negotiated to 10 or 100 Mbps.
When the LED is green, speed is negotiated to 1000 Mbps.

Using the LEDs to Help Diagnose Disk Problems on the Google Search Appliance

The Google Search Appliance has two LEDs per disk drive. The LEDs are located under the front panel on the case. If you are looking at the LEDs from above the case, the top LED is the status indicator. The bottom LED is the activity indicator.

If you think the Google Search Appliance is experiencing disk problems, review the information in the following table and contact your technical support provider.

LED Color and Behavior Meaning
Activity Blinking Google Search Appliance is reading or writing to the drive.
Activity Not blinking The drive is reserved as a hot spare drive and is in idle mode. If more than one drive is idle, contact your technical support provider.
Status Off The drive is off line and can be replaced or the drive needs to be configured to be on line.
Status Steady green The drive is on line and ready to be used.
Status Blinking amber four times per second The drive has failed.
Status Blinking green four times per second The drive is in the process of being rebuilt.
Status Blinks green for three seconds, blinks amber for three seconds, turns off for six seconds The process of rebuilding was interrupted or stopped.