Google Search Appliance software version 6.0
Posted June 2009
Revised September 2009: Additions and corrections to IWA/Kerberos information.
This chapter describes how a search appliance discovers content on your servers. It provides an overview of authentication and authorization methods used during crawl and index, and the methods available during serve. It also provides basic instructions for configuring a search appliance to crawl, index, and serve controlled-access content.
Authentication is the process of verifying the identity of a user, a system, or a service. Authorization is the process that determines whether an authenticated user, system, or service has permission to perform a task. The term "controlled-access content" represents any information that should not be displayed unless the user who requests the content is authenticated and has authorization to view the information.
To make controlled-access content discoverable through search, the search appliance mediates two kinds of access:
All controlled-access content that is available to the search appliance is indexed. The search appliance then determines whether to display the controlled-access content in response to each search request.
When a user issues a search request for controlled content, the search appliance impersonates the user. The search appliance verifies the user's identity and determines whether the user has authorization to view controlled-access content. This check is performed before the search appliance displays any content in search results.
A Google Search Appliance provides additional methods for enabling authentication and authorization that do not require user impersonation. These are discussed in the section "The SAML Authentication and Authorization Service Provider Interface (SPI)".
The search appliance indexes all content that can be crawled and indexed. This includes both controlled-access content and content that is available to anyone. Once you set up the search appliance with access credentials, it will maintain a copy of all crawled content in the index. The index allows the search appliance to determine relevance and display secure results when a user performs a search. Users only see the secure results that they are authorized to view.
A search appliance discovers and indexes controlled-access content in the same way that it indexes all other content: by performing a crawl through the content sources that are available to the web crawler, file system crawler, relational database crawler, and the XML content feed interface.
When you define content sources, you must perform additional steps in the Admin Console to give the search appliance access to controlled-access content:
You can specify a different set of access credentials for each URL pattern in the Admin Console. The means by which you provide these credentials is different for each kind of authentication, but the general process remains the same.

Figure 1: The search appliance uses URL patterns and credentials to crawl and index content.
When you set up the search appliance to access controlled-access content with HTTP Basic or NTLM HTTP, consider the following points. You can find more information on these topics in the Admin Console Help Center.
DOMAIN\username." Users must provide the domain name each time that they log in.
The search appliance supports cookie-based access. For sites that require the use of a cookie for authentication during crawl and index, you can define your content with a forms authentication rule.
Once controlled-access content is present in the index, the search appliance labels it as "secure" or "public":
It's important to understand that when controlled-access content is labeled as "public" in the index, it is shown in all users' search results. Because public search results are served from the index without checking for authorization, users can discover all public content that the search appliance has access to, regardless of whether they have authorization to view that content.
Finally, even though authorized users can see secure content in their search results, they may need to log in again to view the content on the server. To prevent this second request for credentials, the search appliance can pass a user's credentials to the content server through Forms Authentication with cookie forwarding, or by using the SAML Authorization SPI.
If your users have to log in multiple times to access content on different servers, consider implementing a single sign-on (SSO) system for authentication and authorization. The SSO server unifies the authentication process by first authenticating the user and then authorizing the user on the web servers to which that user has access. Single sign-on servers are available from a variety of vendors, such as Computer Associates SiteMinder and Oracle Identity Management. SSO integration is only available for the search appliance.
When crawling and indexing controlled-access content over HTTP or HTTPS, the search appliance assigns public or secure status based on the type of crawl, and the Make Public checkbox in the Admin Console. If the Make Public checkbox is selected on the Crawl and Index > Forms Authentication page, content is labeled as public. When the checkbox is cleared, content is labeled as secure.
The search appliance assigns status from these pages:
For content provided by feeds, the authmethod attribute for the record specifies whether content is treated as public or secure:
To treat content as public, set the authmethod value to none. This is the default for content provided by feeds.
To treat content as secure, set the authmethod value to ntlm, httpbasic, or httpsso.
The front end configuration for a search results page controls how much information users see for each item in the search results. When you make controlled-access content available for public search, open the Page Layout Helper or the XSLT Stylesheet Editor for each front end and review the stylesheet configuration to ensure that you are not revealing more information than the user needs.
In the Page Layout Helper, these parameters under Search Results control which information is displayed:
<S> element is displayed in the search results. Clear the Snippet check box to remove snippets from the search results.
<C> element's page size SZ value is displayed in the search results. Clear the Page Size check box to remove information about the document's size from the search results.
<CACHE_LAST_MODIFIED> element is included in the XML results. Clear the Modified Date check box to remove information about the document's freshness from the search results.
<C> element is included in the XML results. Clear the Cache Link check box to remove the link to the cached document from the search results.
In the XSLT Stylesheet Editor, these XSL variables control which information is displayed:
show_res_snippet specifies whether to display a snippet for each result. Set <xsl:variable name="show_res_snippet">0</xsl:variable> to remove snippets from the search results.
show_meta_tags specifies whether to display metadata for each result. Set <xsl:variable name="show_meta_tags">0</xsl:variable> to remove the document's metadata from the search results.
show_res_size specifies whether to display the page size for each result. Set <xsl:variable name="show_res_size">0</xsl:variable> to remove information about the document's size from the search results.
show_res_date specifies whether to display the last-modified date for each result. Set <xsl:variable name="show_res_date">0</xsl:variable> to remove information about the document's freshness from the search results.
show_res_cache specifies whether to display the cache link for each result. Set <xsl:variable name="show_res_cache">0</xsl:variable> to remove the link to the cached document from the search results.
choose_bottom_navigation specifies which navigation option to use at the bottom of the results page. Set <xsl:variable name="choose_bottom_navigation">simple</xsl:variable> to exclude both the "Gooooogle" navigation and the numbered references to search results pages.
When a user performs a search request, the search appliance performs these checks before serving secure content:
If a secure content item fails the second check, the search appliance removes it from the list of results.
A search appliance uses these methods to establish the user's identity:
After the search appliance establishes a user's identity, the search appliance attempts to determine whether a user has access to the secure content that matches their search.
The search appliance performs an authorization check in this order:
If you specify a policy ACL rule, the search appliance checks the URL patterns in the rules against the URLs that are returned in the search results. If the users and groups in the rule are permitted to view the results, the results are displayed. If the users or groups are not permitted, the URLs are not displayed. Steps 2 through 4 occur if a URL pattern does not match a policy ACL rule or SAML is not configured. Steps 2 through 4 do not occur if a URL pattern matches a policy ACL rule, because the rule alone determines whether the user is permitted to view the search results or is denied and does not see them.
If the search appliance is configured to use the SAML Authentication and Authorization SPI, the search appliance sends a SAML authorization request to the Policy Decision Point, using the identity obtained for the user during the serve authentication.
Otherwise,
For secure content that was crawled using HTTP Basic or NTLM HTTP authentication, the search appliance performs a HEAD request for the document, using the credentials obtained for the user during serve authentication.
For secure content that was crawled using Forms Authentication, the search appliance performs a GET request for 0 bytes of the document, using the credentials obtained for the user during serve authentication.
If the authorization check is successful, the secure content that matches the search query is included in the user's search results.
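The order of checks described above can be sketched as a small decision function. This is an illustrative sketch only, not the appliance's implementation: the function name, the `PERMIT`/`DENY`/`NO_MATCH` values, the `crawl_auth` labels, and the callback parameters are all hypothetical stand-ins for the policy ACL lookup, the SAML authorization request, and the HEAD and GET requests.

```python
from typing import Callable, Optional

# Hypothetical decision values; the appliance's internal representation is not documented here.
PERMIT, DENY, NO_MATCH = "permit", "deny", "no_match"


def authorize(
    url: str,
    crawl_auth: str,
    policy_acl_check: Callable[[str], str],
    saml_check: Optional[Callable[[str], bool]],
    head_check: Callable[[str], bool],
    get_check: Callable[[str], bool],
) -> bool:
    """Return True if the secure result at `url` may be shown to the user."""
    # Step 1: a matching policy ACL rule decides the outcome on its own.
    decision = policy_acl_check(url)
    if decision != NO_MATCH:
        return decision == PERMIT
    # Step 2: if the SAML SPI is configured, ask the Policy Decision Point.
    if saml_check is not None:
        return saml_check(url)
    # Otherwise, fall back on a request to the content server.
    if crawl_auth in ("httpbasic", "ntlm"):
        return head_check(url)  # HEAD request with the user's credentials
    if crawl_auth == "forms":
        return get_check(url)  # GET request for 0 bytes of the document
    return False
```

In this sketch a policy ACL deny short-circuits the later checks, which mirrors the statement that the remaining steps do not occur once a rule matches.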
A policy ACL (Access Control List) provides information to the search appliance about which users or groups have access to a specific URL. By specifying policy ACLs on a search appliance, you can enhance performance and reduce load. Policy ACLs speed up the process of authorization and reduce the load on the authorization servers that occurs from performing HEAD requests to a remote authorization server.
Policy ACLs typically store the results that would have occurred if the search appliance had initiated a HEAD request to verify authorization. However, policy ACLs can also be used to override the decision that a HEAD request would have returned. For example, suppose a policy ACL rule permits a group to see all documents at a URL, but the source repository (that is, the target of the HEAD request) has a more fine-grained rule that allows only some members of the group to view the documents. With the policy ACL rule in place, everyone in the group sees the search results, but only those who have access rights at the repository can open the links.
Policy ACLs require that you use an authentication method to establish the identity of the user or group that you specify in the Policy ACL rules.
For more information on policy ACLs, see the previous sections Serve for Controlled-Access Content and How a Search Appliance Determines a User's Identity and Authorization During Serve. See also the Google Search Appliance Policy ACL API Developer's Guide.
A policy ACL rule has two parts:
A URL pattern that you want to protect with restricted access.
Lists the users or groups that have access to the restricted URL.
For example, suppose the eng (engineering) group is the only group that you permit to view all documents in the example.com/engsite page. To grant the engineering group access to the engsite page, specify a policy ACL rule:
example.com/engsite group:eng
When a search appliance executes a search, it attempts to match URLs that the search appliance retrieves from the index against policy ACLs. If a URL pattern matches the policy ACL rule, the search appliance applies the rule.
You can specify a URL pattern to which you want to limit access. When a user performs a search query, the user can view URLs that match this pattern in the search results if the user is listed as an allowed user or is a member of an allowed group.
If more than one URL pattern matches the policy ACL, the search appliance chooses the best match in this order of precedence:
If there is an exact-match URL pattern, it is the best match. An exact-match URL pattern begins with a caret (^) and ends with a dollar sign ($). The following example shows an exact-match URL pattern:
^http://www.example.com/mypage.html$
The coarse-grained rules consist of:
If there is one or more matching prefix-patterns, the pattern with the longest prefix is the best match. A prefix-pattern specifies a (possibly partial) domain and a prefix of the path portion of the URL. The general format of a prefix pattern is:
<domain>/<prefix>
Examples of prefix patterns:
sales.example.com/products/
sales.example.com/products/mypage.html
sales.example.com/
If the only matching URL patterns are general patterns, the best match is undefined. The search appliance chooses one pattern for the URL pattern. A general URL pattern is any pattern other than an exact-match pattern or a prefix pattern.
Examples of general URL patterns are:
| Example | Description |
|---|---|
| *.doc | A suffix pattern, matches any file ending with the .doc value. |
| contains:product | The product string can appear either in the host name, such as myproduct.com, or at the end of a URL and doesn't have to be a full word. |
| regexp:sid=[0-9A-Z]+/ | The URL has to contain a URL parameter with sid= followed by a value that contains either a digit or capital letter. The plus means one or more characters. |
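The precedence rules above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the appliance's matching engine: the helper names are hypothetical, an exact-match pattern is compared as a literal URL, a prefix pattern is assumed to match when the host name ends with the (possibly partial) domain and the path starts with the prefix, and, because the best match among general patterns is undefined, the sketch simply returns the first one.

```python
import re
from typing import List, Optional
from urllib.parse import urlsplit


def classify(pattern: str) -> str:
    """Classify a URL pattern as 'exact', 'general', or 'prefix'."""
    if pattern.startswith("^") and pattern.endswith("$"):
        return "exact"
    if pattern.startswith(("contains:", "regexp:", "*")):
        return "general"
    return "prefix"


def matches(pattern: str, url: str) -> bool:
    """Return True if the pattern matches the URL (assumed semantics)."""
    if classify(pattern) == "exact":
        return url == pattern[1:-1]
    if pattern.startswith("contains:"):
        return pattern[len("contains:"):] in url
    if pattern.startswith("regexp:"):
        return re.search(pattern[len("regexp:"):], url) is not None
    if pattern.startswith("*"):
        return url.endswith(pattern[1:])
    # Prefix pattern: <domain>/<prefix>, where the domain may be partial.
    domain, _, prefix = pattern.partition("/")
    parts = urlsplit(url)
    return (parts.hostname or "").endswith(domain) and parts.path.startswith("/" + prefix)


def best_match(patterns: List[str], url: str) -> Optional[str]:
    """Pick the best matching pattern: exact, then longest prefix, then any general."""
    hits = [p for p in patterns if matches(p, url)]
    exact = [p for p in hits if classify(p) == "exact"]
    if exact:
        return exact[0]
    prefixes = [p for p in hits if classify(p) == "prefix"]
    if prefixes:
        # The pattern with the longest path prefix wins.
        return max(prefixes, key=lambda p: len(p.partition("/")[2]))
    return hits[0] if hits else None
```

For example, given both sales.example.com/ and sales.example.com/products/, a URL under /products/ resolves to the longer prefix pattern unless an exact-match pattern for that URL also exists.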
A policy ACL rule lists each user's or group's login ID. The user who enters a search can view the URL result if either of the following conditions is true:
Otherwise, the user is denied permission to view the URL. The URL does not appear in the search results.
To determine which group a user belongs to, the search appliance uses one of the following mechanisms:
If the search appliance is configured to use LDAP, then the search appliance gets group memberships from the LDAP server. To configure LDAP for a search appliance, use the Administration > LDAP Setup page.
Using a groups database, you can import a list of groups and the membership list for each group by using the Google Data API.
If a groups database is present, the search appliance uses it to determine a user's group membership. However, you can use both mechanisms together. In this case, the search appliance gets all group memberships from both sources.
To add a policy ACL:
To navigate to the previous page, click the Back to Policy ACL list link.
Note: The order in which you specify users or groups is not significant. When you click Save, the search appliance sorts the login names into alphabetical order in each field.
Caution: Ensure that you do not separate login names with commas. The search appliance assumes that the comma is part of the login name.
To add a policy ACL:
To delete a policy ACL:
You can import a text file that contains policy ACL rules. The file you import overwrites all existing policy ACL rules.
Note: Before importing a configuration file, if you have defined policy ACL rules, click Export Search Results to back up your rules. The exported file is in the same format as a configuration file that you can import.
The format of each rule in the file is:
url_pattern allowed_user_or_group
Each line of the file must list only one URL pattern rule, and one or more users, denoted by the user: prefix, or groups, denoted by the group: prefix, as shown in the following example:
example.com/docsite user:jane user:sue user:wilson group:chicagodoc group:texasdoc
mycompany.com/engsite group:eng
mycompany.com/salessite group:sales user:yvette
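To make the rule format concrete, the following sketch parses rule lines of this form and evaluates whether a user is permitted by a rule. It is an illustration only: the function names and the `Rule` type are hypothetical, and the parser splits on whitespace, which is consistent with the caution that commas are treated as part of a login name.

```python
from typing import Dict, Iterable, Set, Tuple

Rule = Tuple[Set[str], Set[str]]  # (allowed users, allowed groups)


def parse_policy_acl(lines: Iterable[str]) -> Dict[str, Rule]:
    """Parse 'url_pattern user:... group:...' lines into a pattern -> rule map."""
    rules: Dict[str, Rule] = {}
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        pattern, principals = fields[0], fields[1:]
        users = {p[len("user:"):] for p in principals if p.startswith("user:")}
        groups = {p[len("group:"):] for p in principals if p.startswith("group:")}
        rules[pattern] = (users, groups)
    return rules


def is_permitted(rule: Rule, user: str, user_groups: Set[str]) -> bool:
    """A user is permitted if listed directly or a member of an allowed group."""
    users, groups = rule
    return user in users or bool(groups & user_groups)
```

With the example file above, `is_permitted(rules["mycompany.com/salessite"], "yvette", set())` is true because yvette is listed as a user, while a user outside the sales group who is not listed is denied.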
To import a file that contains policy ACLs:
You can perform the following types of searches from the Policy pattern field on the Serving > Policy ACLs page:
Display rules by their type: view all rules that match the filter you choose, or only those that contain text that you specify in the Policy pattern field. Click Search to list the rules, which are displayed in alphabetical order by rule name. The rule filters are as follows:
Provide a URL to display all the rules that match it. This search shows which patterns match a given URL and which rule applies to it. Enter the URL in the Policy pattern field, choose Find Rules for URL, and click Search. The rules are displayed in best-match order. The first rule displayed is the best match, and it is the only rule that the search appliance applies. Best-match order is useful when two or more rules match a URL and you want to find out which rule applies.
Search results appear under Matching URL Patterns.
After you search policy ACLs, you can export the search results as an XML file. To export search results, click Export Search Results. The exported file is in the same format as an import configuration file.
The default file name is policy_acl.xml.
You can also add policy ACLs by using the following mechanisms:
HTTP Basic and NTLM HTTP request the user's credentials for controlled-access content, but do not perform any validation on the credentials entered by the user before saving a session cookie. If you are not using Kerberos authentication, Directory Service Integration with an LDAP server permits a search appliance to validate a user's credentials as they are entered. If a user enters incorrect credentials, the search appliance prompts the user to try again.
Note: You can configure a search appliance to perform secure serve without LDAP directory service integration. In this case, only the authorization check is performed. If the user's credentials are incorrect, the search appliance cannot obtain authorization and secure content is not served.
This section provides a general overview of how to enable the search appliance to authenticate credentials against an LDAP server. For more detailed instructions, click Help Center > Administration > LDAP Setup in the Admin Console.
Note: The search appliance does not support using LDAP and Kerberos authentication at the same time; you must choose one method for all servers on your domain.
To specify LDAP settings for the search appliance:
uid - (user ID)
ou - (organizational unit)
dc - (company name)
Important: If the LDAP Authentication Test settings do not successfully authenticate a user, click Cancel, revisit and change the information you entered, and test again.
When a user performs a query for secure content, the search appliance responds with the same protocol. Because the responses for serve over HTTP Basic and NTLM HTTP include authorization headers, a malicious user could intercept the message and extract the header. To protect the user's credentials against such an attack, you can force the use of HTTPS during serve, even when the search request is sent over HTTP.
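To illustrate why an intercepted header is a risk, the sketch below shows that an HTTP Basic authorization header is only Base64-encoded, not encrypted, so anyone who captures it over plain HTTP can recover the user name and password. The function names and the sample credentials are hypothetical; NTLM uses a different challenge-response exchange and is not covered by this example.

```python
import base64
from typing import Tuple


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return "Basic " + token


def decode_basic_auth_header(header: str) -> Tuple[str, str]:
    """Recover the credentials from a captured Basic Authorization header."""
    scheme, _, token = header.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic authorization header")
    username, _, password = base64.b64decode(token).decode("utf-8").partition(":")
    return username, password
```

Because decoding requires no secret, serving over HTTPS is what protects these credentials in transit.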
To specify whether the search appliance serves all content over HTTPS:
Kerberos is a network authentication protocol that enables client and server applications to perform mutual authentication for the duration of a user's login session. The search appliance can use Kerberos authentication by issuing a HEAD request to confirm a user's right to view controlled-access documents. The search appliance only performs this check during secure serve for content on HTTP servers; Kerberos is not supported for crawling content.
To ensure that a search appliance uses Kerberos during serving, content sources must be enabled for Kerberos. If Kerberos is not configured properly, the content sources fall back to NTLM. For more information on ensuring that Kerberos is configured correctly on Windows content sources, see this wiki page (the information is provided as a reference, and is not officially supported by Google).
The Kerberos implementation supports:
The Kerberos implementation does not support:
When the search appliance is configured to use IWA / Kerberos authentication, the search appliance checks the user's session ticket against a KDC before displaying secure search results to a user. For Windows servers, the domain controller acts as the KDC for Kerberos authentication.
To configure the search appliance to use IWA / Kerberos authentication during serve:
After you complete these steps, recrawl the affected content sources. The search appliance is then able to check a user's authentication status without requiring an additional login.
A verified identity from Kerberos authentication can be used for authorization. The following authorization mechanism can use the verified identity from Kerberos authentication:
If your content sources support these authorization mechanisms, then the content sources are not required to support Kerberos, and delegation is not required.
If you are using IWA (Integrated Windows Authentication) / Kerberos Authentication, read the advisory on the Google Enterprise Technical Support web site and update your search appliance to version 6.0.0.G32-P2.
The process for creating a user for your Key Distribution Center depends on the type of domain controller that you are using. This guide provides instructions for installing the search appliance on a Windows domain.
To configure Windows:
Use DES encryption types for this account.
Password Never expires.
ktpass -princ HTTP/FQDN_of_the_searchappliance@DOMAIN_NAME -mapuser DOMAIN_NAME\searchappliance_username -pass searchappliance_password -out filename.keytab -crypto DES-CBC-MD5 +DesOnly
where FQDN=fully qualified domain name.
The search appliance username, password, and domain must be consistent with the user account that you created in step 2. With the exception of the mapuser switch, domain names must be fully qualified. Setting the encryption type to DES-CBC-MD5 ensures compatibility with most systems. Ensure that when you issue the ktpass command, HTTP is in upper-case letters and the string FQDN_of_the_search_appliance is in lower-case letters, as shown in the examples in this section. The FQDN_of_the_search_appliance must be the DNS A-name for the search appliance, not the CNAME.
For example, suppose the domain is FOODOMAIN, the user account is gsa_account, the user password is 123pass, and the FQDN of the search appliance is gsa.foodomain.com.
Then you would enter the following command:
ktpass -princ HTTP/gsa.foodomain.com@FOODOMAIN.com -mapuser FOODOMAIN\gsa_account -pass 123pass -out myfilename.keytab -crypto DES-CBC-MD5 +DesOnly
The keytab file is the Kerberos key table that you will install on the search appliance.
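When you generate keytabs for several appliances, it can help to assemble the ktpass command line from its parts so that the casing rules are applied consistently. The helper below is a hypothetical sketch (the function and parameter names are not part of any Google or Microsoft tool); it only builds the command string shown above, with HTTP in upper case and the FQDN forced to lower case, and does not run ktpass.

```python
def build_ktpass_command(fqdn: str, realm: str, domain: str, username: str,
                         password: str, keytab: str) -> str:
    """Assemble the ktpass command line in the form used in this section."""
    # HTTP must be upper case and the appliance FQDN lower case.
    principal = f"HTTP/{fqdn.lower()}@{realm}"
    return (
        f"ktpass -princ {principal} "
        f"-mapuser {domain}\\{username} "
        f"-pass {password} -out {keytab} "
        "-crypto DES-CBC-MD5 +DesOnly"
    )
```

Called with the values from the example (gsa.foodomain.com, FOODOMAIN.com, FOODOMAIN, gsa_account, 123pass, myfilename.keytab), the function reproduces the command shown above.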
HTTP/FQDN_of_the_search_appliance.
To configure Kerberos authentication in the Admin Console:
Users who query the search appliance must have their web browsers configured to use Kerberos authentication.
No special configuration is required for Safari. Instructions for Internet Explorer and Firefox/Mozilla are provided below.
To configure Internet Explorer:
To configure Firefox/Mozilla:
about:config".
network.negotiate-auth.trusted-uris. Modify this parameter to include the search appliance's URL as a trusted URI.
network.negotiate-auth.delegation-uris. Modify this parameter to include the search appliance's URL as a delegation URI.
network.auth.use-sspi and set its value to false.
Note: For more on Mozilla and integrated authentication, see http://www.mozilla.org/projects/netlib/integrated-auth.html
For more information about the Google Search Appliance and Kerberos, see the following documents:
During serve, secure content from sites that were crawled through a Forms Authentication rule can be handled in one of two ways: by redirecting the user to an external login server, or by mediating the user's session cookie. The correct authentication method depends on your security policy:
Take note that even though Crawl and Index > Forms Authentication supports multiple rules, only one rule can be configured under Serving > Forms Authentication.
Forms authentication with a sample protected URL causes the search appliance to rewrite the links in the login page. Users authenticate by entering their credentials into a login form for the search appliance. The search appliance performs a proxied login on the single sign-on (SSO) server and obtains a session cookie for the user. The search appliance then exchanges cookies back-and-forth between the user and the SSO system, and tests whether the cookies are valid by retrieving a sample protected URL. The user can continue to search without re-authenticating as long as the session cookies remain valid against the sample URL. When the sample URL retrieval fails, the search appliance again presents the user with a copy of the SSO system login form. Upon submission, the search appliance examines the changes in cookies, and continues proxying cookies between the user and the SSO system.
This method does not require the search appliance and the external login server to be on the same cookie domain, and is unaffected by IP restrictions on the server's cookie. You cannot use this method if the login form contains JavaScript or frames.
To configure a search appliance to perform forms authentication with a sample protected URL:
Forms Authentication through an external login server allows you to redirect users to a login page for authentication. Users authenticate by entering their credentials in the login page directly: the search appliance does not proxy the form.
You can use an external login server if your cookie domain includes both the search appliance and the web servers hosting your protected content. You cannot use an external login server if your cookies are IP-restricted. Your login form can use frames and JavaScript. Users who have already authenticated do not need to log in a second time to get search results. You can use multiple cookies.
You need to implement a redirect URL that meets the following two requirements:
One way to implement such a redirect URL is to copy and paste the following JavaScript snippet into a static html page and change the value of gsahost to the host name of your Google Search Appliance.
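<script type="text/javascript">
var gsahost = "gsa.domainname.com"; // host name of your Google Search Appliance
// Read the returnPath query parameter and, if present, redirect to it on the search appliance.
var match = window.location.search.match(/returnPath=([^&]*)/);
if (match) { window.location = "https://" + gsahost + unescape(match[1]); }
</script>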
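If you prefer to compute the redirect target on the server rather than in the browser, the same logic can be sketched in Python. This is an illustration under assumptions: the function name is hypothetical, the host name is the placeholder from the JavaScript snippet, and `parse_qs` decodes percent-encoding and also turns `+` into a space, which differs slightly from the JavaScript `unescape` call.

```python
from urllib.parse import parse_qs, urlsplit

GSA_HOST = "gsa.domainname.com"  # placeholder host name; replace with your appliance


def redirect_target(request_url: str, gsa_host: str = GSA_HOST) -> str:
    """Build the URL on the search appliance that the user should be sent back to."""
    query = parse_qs(urlsplit(request_url).query)
    values = query.get("returnPath")
    if not values:
        raise ValueError("returnPath parameter is missing")
    # parse_qs has already percent-decoded the value.
    return "https://" + gsa_host + values[0]
```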
To configure a search appliance to perform forms authentication through an external login server:
The Authentication and Authorization Service Provider Interface (SPI) provides an alternate means of determining whether a user is authorized to view secure controlled-access content during serve. The SPI enables a search appliance to communicate with an existing access control infrastructure using standard SAML messages. The Authorization SPI is also required to support X.509 certificate authentication during serve.
This section provides a general overview of how to configure a search appliance to use the Authentication and Authorization SPI when serving controlled-access content. More information on these configuration parameters is available by clicking Help Center > Serving > Access Control in the Admin Console.
Before using the Authentication and Authorization SPI, you must configure the appliance to crawl and index some secure controlled-access content. The SPI is only used when a user queries for secure results.
You can crawl secure content through HTTP Basic, NTLM HTTP, or with Forms Authentication:
When configuring the search appliance to verify authorization with the Authorization SPI, you do not have to use the Authentication SPI. You can perform the authentication step with any of these methods.
| Authentication Method | What happens when an unauthenticated user requests secure content? |
|---|---|
| Authentication SPI | The search appliance redirects the user to the Identity Provider's login service. The login service requests the user's credentials and returns the user's identity to the search appliance. The search appliance then sends a SAML Authorization Request to the Identity Provider's artifact service, using the identity provided by the login service. If the Identity Provider authenticates the user's credentials, the search appliance stores a session cookie on the user's computer that identifies them as an authenticated user. Supports: HTTP BASIC, NTLM HTTP, SMB/CIFS (public only) |
| IWA (Integrated Windows Authentication) / Kerberos Authentication | The search appliance requests a Kerberos session key from the user and attempts to authenticate the session key against the KDC. If the key is valid, the search appliance stores a session cookie on the user's computer that identifies them as an authenticated user. See IWA (Integrated Windows Authentication) / Kerberos Authentication in this guide for more information on Integrated Windows Authentication and Kerberos authentication. Supports: HTTP BASIC, NTLM HTTP, SMB/CIFS (public), SMB/CIFS (secure) Configuration for this authentication method: Crawl and Index > Crawler Access |
| LDAP | The search appliance requests the user's credentials and attempts to authenticate their username and password against an LDAP Server. If the LDAP Server authenticates the user's credentials, the search appliance stores a session cookie on the user's computer that identifies them as an authenticated user. See Integrating the Search Appliance with an LDAP Server for more on this method of authentication. Supports: HTTP BASIC, NTLM HTTP, SMB/CIFS (public) Configuration for this authentication method: Administration > LDAP Setup. |
| x.509 certificates | The search appliance requests a digital certificate from the user. If the user's certificate is trusted by the root CA certificate for the search appliance, the search appliance stores a session cookie on the user's computer that identifies them as an authenticated user. See User Authentication by X.509 Certificate for more on this method of authentication. Supports: HTTP BASIC, NTLM HTTP, SMB/CIFS (public) |
Once a user's identity has been authenticated, the Authorization SPI checks to see whether the user is authorized to view each of the secure documents that match their search. Using the authenticated cookie set during authentication, the search appliance passes the user's session cookie to the Policy Decision Point's Authorization Service URL inside a SAML Authorization request.
When you use the Authentication SPI, the user's session cookie contains the user's identity in the SAML Authentication format. However, for other authentication methods, the user's identity is stored in the authentication method's format. For example, if x.509 certificates are used, then the identity in the Authorization SPI request is the "common name" field from the certificate, which is an X.500 format. This is an unusual format for this field in a SAML authorization request. If you do not use the Authentication SPI for authentication, your Policy Decision Point must be prepared to accept the user's identity in the format defined by your authentication method.
Once the SAML Authorization request is sent, what happens next depends on the type of content:
To configure the search appliance to use the Authentication SPI:
https://server.domain.com/cgi-bin/authn_login.cgi?Referer=http://<search appliance name>:<serving port>. The search appliance redirects unauthenticated search users to this login URL.
https://server.domain.com/SAML/services/AuthNConnectorVerify. The search appliance determines authentication by issuing an <AuthnRequest> element in messages sent to the Artifact Service URL.
Before enabling the Authorization SPI, you must define a method for authenticating the user during serve. You can enable user authentication with LDAP, x.509 certificates, or through the Authentication SPI.
To configure the search appliance to use the Authorization SPI:
https://server.domain.com:8443/SAML/services/AuthZConnector. The search appliance determines Authorization by issuing an <AuthorizationDecisionQuery> element in messages sent to the Artifact Service URL.
The search appliance uses digital certificates when communicating with web browsers and servers over HTTPS. The search appliance also supports the use of digital certificates to perform X.509 certificate authentication to verify a user's identity before serving secure results.
This section provides a general overview of how to install a digital certificate for use by the search appliance. For more detailed instructions, including an explanation of how to request a digital certificate from a certification authority and decrypt an encrypted private key, click Help Center > Administration > SSL Settings in the Admin Console.
Note: The SSL Settings page can only install non-encrypted RSA keys in .pem (privacy enhanced mail) format. If the private key is encrypted or in PKCS#12 format, refer to the instructions in the Help Center.
To configure the search appliance to enable crawl and serve over HTTPS:
SSL certificate installed. The appliance console needs to be restarted, please log in again.
The search appliance can check a user's SSL certificate to verify that it was issued by a trusted certificate authority before serving secure results. This section provides a general overview of how to configure a search appliance to require X.509 Certificate Authentication from users who submit search queries. For more detailed instructions on how to configure the search appliance to perform X.509 Certificate Authentication, click Help Center > Administration > Certificate Authorities in the Admin Console.
Note: This functionality requires the Authorization SPI. The search appliance must also have a digital certificate that permits crawl and serve over HTTPS.
To configure the search appliance to require X.509 Certificate Authentication for search requests from users:
When you assign credentials that allow a search appliance to crawl and index controlled-access content, it's important to consider whether the content source includes content that you don't want anyone to see. The best way to ensure that private content is never shown in search results is to exclude all private content sources from the index. Examples of controlled-access content that should be excluded from crawl and indexing include:
To exclude private content from the index, use one or both of these methods:
Despite your best efforts to set exclusion patterns and define secure access policies that prevent the indexing of private content, you may discover unanticipated content that you must remove from the index. Removing content from the search appliance index takes anywhere from 30 minutes to a few hours, depending on the size and complexity of your index. To stop serving content immediately, create an exclusion rule to remove the content from the front end while you correct the index.
To stop serving undesired content immediately:
To permanently remove undesired content from the index: