Google Search Appliance software version 4.6
Posted October 2006
Authentication (AuthN) is used to identify users and authorization (AuthZ) is used to allow users access to documents according to their credentials. This document describes how to use the Google Search Authorization Service Provider Interface (SPI) to develop a component for answering authorization requests, in order to securely show search results.
This document describes features that are available in version 4.6 and later of the Google Search Appliance. These features are not available in the Google Mini.
This document describes both the Authentication SPI and the Authorization SPI, to present the whole picture of delivering secure search results using the SPI.
Many enterprise customers have documents on their intranets that are access-controlled. Some users have access to a given document and some do not. The Google Search Appliance can crawl these documents, and, to protect confidentiality at serving time, include in the search results only the documents that the searcher has access to.
The Google Search Appliance can crawl and serve documents protected by HTTP Basic Authentication and NTLM, as well as documents protected by HTML Forms-based Authentication. The Google Search Appliance integrates with form-based single sign-on systems, such as eTrust™ SiteMinder from Computer Associates, Cams™ from Cafesoft, and Oracle Identity Management. For HTTP Basic Authentication, NTLM, and Forms-based Authentication, document authorization is checked by masquerading as the user. The user enters their credentials, or sends the appliance their single sign-on cookie, and for each access-controlled document in a result set, the appliance attempts to access the URL.
Rather than masquerading as the user, the Google Search Appliance can also authenticate the user directly using X.509 client certificates.
In addition, you can set up authentication using LDAP (Lightweight Directory Access Protocol) or the Google Search Authentication SPI, which allows a web service that you provide, acting as an Identity Provider and a Policy Decision Point, to communicate between your access control system and the Google Search Appliance.
Whichever authentication method is used (client certificates, LDAP, or the Authentication SPI), the Google Search Appliance still needs a way to know which URLs a search user is authorized to view, so that it can decide which URLs to include in the search results. The Authorization SPI exists to satisfy this requirement.
To write an Identity Provider and Policy Decision Point web service, you should be familiar with these technologies.
XML: Extensible Markup Language. [specification]
SAML: Security Assertion Markup Language, an XML-based standard whose primary use case is inter-domain single sign-on. [specification]
SOAP: The Simple Object Access Protocol is an XML-based protocol for exchanging information over the Internet. [specification]
The sample messages in this document conform to the XML schemas on the SAML web site.
Tip: One way to implement an Identity Provider and Policy Decision Point is to access a SOAP server using Apache Axis.
When implemented, the Authentication SPI allows search users to authenticate to the Google Search Appliance. It is designed to allow customers to integrate the appliance into an existing access control infrastructure. Instead of authenticating search users itself, the appliance redirects the user to an Identity Provider (IP), a customer-implemented server, where the actual authentication takes place. The IP then redirects the user back to the appliance, while passing information that includes the identity of the search user. The protocol that governs this communication between the appliance, the browser, and the customer's IP is based on SAML 2.0, an XML-based standard.

Figure 1: The Google Search Appliance communicates through an Identity Provider to authenticate users' access to intranet web pages.
Note: If you use the Authentication SPI, you must use the Authorization SPI as well. However, if you decide to authenticate your users with X.509 certificates or LDAP, you do not need to implement the Authentication SPI and can go on to the Authorization section.
The Authentication SPI exposed by the Google Search Appliance is based on the SAML 2.0 standard; specifically, on the "Web Browser SSO Profile." The Web Browser SSO profile makes use of the "Authentication Request Protocol," a request-response protocol. The appliance sends a SAML <AuthnRequest> message to the customer's Identity Provider, and the Identity Provider responds with a SAML <Response> message that contains an <Assertion>, which in turn contains an <AuthnStatement>. These messages are transferred between the appliance and the customer's Identity Provider, via the browser, using the "HTTP Redirect" and the "HTTP Artifact" bindings.
The authentication sequence between the user/appliance/Identity Provider goes like this:
The Identity Provider runs a SOAP message service, which accepts queries from the appliance and returns an <ArtifactResponse> element as a child of the SOAP <Body> element. This connection is made over a mutually authenticated HTTPS connection, ensuring origin integrity, data integrity, and confidentiality. To dereference the artifact and actually obtain the SAML response message, the appliance uses the SAML SOAP binding to send an <ArtifactResolve> message to the Identity Provider. This message contains the artifact. The response message from the Identity Provider is an <ArtifactResponse> message, which contains a <Response> element, which contains an <AuthnStatement>, which contains the identity of the search user.
An artifact should not be reusable. Once an artifact is dereferenced, the Identity Provider should reject attempts to dereference the same artifact again.
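The single-use rule above can be sketched as follows. This is a minimal illustration only: the class name `ArtifactStore` and its methods are hypothetical, the store lives in memory, and the opaque token it generates does not follow the artifact layout defined by SAML.

```python
import secrets
from typing import Dict, Optional


class ArtifactStore:
    """Hypothetical in-memory store mapping artifacts to SAML responses."""

    def __init__(self) -> None:
        self._pending: Dict[str, str] = {}

    def issue(self, response_xml: str) -> str:
        # Assumption: a random opaque token stands in for a real SAML artifact.
        artifact = secrets.token_urlsafe(32)
        self._pending[artifact] = response_xml
        return artifact

    def resolve(self, artifact: str) -> Optional[str]:
        # pop() removes the entry, so a second attempt to dereference
        # the same artifact finds nothing and returns None.
        return self._pending.pop(artifact, None)


store = ArtifactStore()
artifact = store.issue("<Response/>")
first = store.resolve(artifact)   # the stored response
second = store.resolve(artifact)  # None: the artifact was already used
```

A production Identity Provider would probably also expire artifacts that are never dereferenced; that is left out here.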
After a search user logs in using the Authentication SPI, the appliance will maintain a session with the search user so that the user doesn't have to reauthenticate to the Identity Provider on every search.
This session is maintained with a session cookie. This cookie is securely sent over HTTPS and is set for the appliance's hostname only. The cookie value will contain the user identity and the time that the session cookie will expire.
When a search user performs a query (having no session cookie set), the appliance responds with a redirect that looks something like this:
HTTP/1.1 302 Object Moved
Date: 23 Feb 2005 19:00:49 GMT
Location: https://ac.corp.company.com/SAML_login?SAMLRequest=BASE64URLENCODEDELEMENT&RelayState=https://search.corp.company.com/search?q=query
Content-Type: text/html; charset=iso-8859-1
The element BASE64URLENCODEDELEMENT is similar to:
<AuthnRequest ID="foobar" Version="2.0" IssueInstant="2005-10-08T11:32:19Z"/>
which is first DEFLATE-compressed, then Base 64 encoded, then URL encoded.
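The three encoding steps, and their reversal on the Identity Provider side, might look like the sketch below. It uses only the Python standard library; the function names are hypothetical, and it assumes the compression step means a raw DEFLATE stream without a zlib header, as is usual for the SAML HTTP Redirect binding.

```python
import base64
import urllib.parse
import zlib


def encode_saml_request(xml_text: str) -> str:
    """DEFLATE-compress, Base64-encode, then URL-encode a SAML message."""
    # wbits=-15 selects a raw DEFLATE stream (no zlib header or checksum).
    compressor = zlib.compressobj(level=9, wbits=-15)
    deflated = compressor.compress(xml_text.encode("utf-8")) + compressor.flush()
    b64 = base64.b64encode(deflated).decode("ascii")
    # safe="" makes sure "+", "/" and "=" are percent-encoded too.
    return urllib.parse.quote(b64, safe="")


def decode_saml_request(param_value: str) -> str:
    """Reverse the steps: URL-decode, Base64-decode, then inflate."""
    b64 = urllib.parse.unquote(param_value)
    deflated = base64.b64decode(b64)
    return zlib.decompress(deflated, wbits=-15).decode("utf-8")


request = '<AuthnRequest ID="foobar" Version="2.0" IssueInstant="2005-10-08T11:32:19Z"/>'
encoded = encode_saml_request(request)
decoded = decode_saml_request(encoded)
```

Note that a web framework usually URL-decodes query parameters already, in which case the Identity Provider only needs the Base64 and inflate steps.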
When the Identity Provider redirects the browser to the appliance, here's what it might look like:
HTTP/1.1 302 Object Moved
Date: 9 Feb 2005 18:22:03 PST
Location: https://search.corp.company.com/SamlArtifactConsumer?SAMLart=RANDOMLOOKINGSTRING&RelayState=https://search.corp.company.com/search?q=query
Content-Type: text/html; charset=iso-8859-1
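As a rough illustration, the SAMLart and RelayState parameters can be pulled out of a Location URL like the one above with the Python standard library. The variable names are hypothetical, and the URL is copied from the example, including its unescaped RelayState value.

```python
from urllib.parse import parse_qs, urlsplit

location = (
    "https://search.corp.company.com/SamlArtifactConsumer"
    "?SAMLart=RANDOMLOOKINGSTRING"
    "&RelayState=https://search.corp.company.com/search?q=query"
)

# urlsplit() isolates the query string; parse_qs() splits it on "&"
# and then on the first "=" of each pair, returning lists of values.
params = parse_qs(urlsplit(location).query)
artifact = params["SAMLart"][0]
relay_state = params["RelayState"][0]
```

An Identity Provider that builds such a redirect would normally percent-encode the RelayState value so that any "&" inside it is not mistaken for a parameter separator.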
The appliance gets the artifact as the SAMLart parameter's value, and sends it to the Identity Provider via SOAP over a mutually authenticated HTTPS connection:
POST /SAMLResponder/resolve HTTP/1.1
Host: ac.corp.company.com
Content-Type: text/xml
Content-Length: nnn
SOAPAction: http://www.oasis-open.org/committees/security
<SOAP-ENV:Envelope
xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
<SOAP-ENV:Body>
<samlp:ArtifactResolve
xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol"
xmlns="urn:oasis:names:tc:SAML:2.0:assertion"
ID="randomlooking" Version="2.0"
IssueInstant="2005-02-09T18:42:32Z">
<Issuer>search.corp.company.com</Issuer>
<samlp:Artifact>RANDOMLOOKINGSTRING</samlp:Artifact>
</samlp:ArtifactResolve>
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>
The Identity Provider returns:
HTTP/1.1 200 OK
Date: 9 February 2005 18:22:04 PST
Content-Type: text/xml
Content-Length: nnn
<SOAP-ENV:Envelope
xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
<SOAP-ENV:Body>
<samlp:ArtifactResponse
xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol"
xmlns="urn:oasis:names:tc:SAML:2.0:assertion"
ID="alsorandomlooking" Version="2.0"
InResponseTo="randomlooking"
IssueInstant="2005-02-09T18:43:32Z">
<Issuer>ac.corp.company.com</Issuer>
<samlp:Status>
<samlp:StatusCode Value="urn:oasis:names:tc:SAML:2.0:status:Success"/>
</samlp:Status>
<samlp:Response
ID="blahblah"
Version="2.0"
IssueInstant="2004-10-08T14:38:05Z">
<samlp:Status>
<samlp:StatusCode
Value="urn:oasis:names:tc:SAML:2.0:status:Success"/>
</samlp:Status>
<Assertion
Version="2.0"
ID="blahblah2"
IssueInstant="2004-10-08T14:38:05Z">
<Issuer>ac.corp.company.com</Issuer>
<Subject>
<NameID>CN=Joe Bob</NameID>
</Subject>
<AuthnStatement
AuthnInstant="2004-10-08T11:32:19Z">
<AuthnContext>
<AuthnContextClassRef>
urn:oasis:names:tc:SAML:2.0:ac:classes:PasswordProtectedTransport
</AuthnContextClassRef>
</AuthnContext>
</AuthnStatement>
</Assertion>
</samlp:Response>
</samlp:ArtifactResponse>
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>
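To show how the identity of the search user can be read from a response of this shape, here is a minimal parsing sketch using Python's standard xml.etree.ElementTree module. The function name `extract_user` and the trimmed `SAMPLE` message are illustrative assumptions; the sketch does not verify signatures or validate against the SAML schemas.

```python
import xml.etree.ElementTree as ET
from typing import Optional

NS = {
    "soap": "http://schemas.xmlsoap.org/soap/envelope/",
    "samlp": "urn:oasis:names:tc:SAML:2.0:protocol",
    "saml": "urn:oasis:names:tc:SAML:2.0:assertion",
}
SUCCESS = "urn:oasis:names:tc:SAML:2.0:status:Success"

# Trimmed copy of the example response above.
SAMPLE = """<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
 <SOAP-ENV:Body>
  <samlp:ArtifactResponse xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol"
    xmlns="urn:oasis:names:tc:SAML:2.0:assertion"
    ID="alsorandomlooking" Version="2.0" InResponseTo="randomlooking"
    IssueInstant="2005-02-09T18:43:32Z">
   <Issuer>ac.corp.company.com</Issuer>
   <samlp:Status>
    <samlp:StatusCode Value="urn:oasis:names:tc:SAML:2.0:status:Success"/>
   </samlp:Status>
   <samlp:Response ID="blahblah" Version="2.0" IssueInstant="2004-10-08T14:38:05Z">
    <samlp:Status>
     <samlp:StatusCode Value="urn:oasis:names:tc:SAML:2.0:status:Success"/>
    </samlp:Status>
    <Assertion Version="2.0" ID="blahblah2" IssueInstant="2004-10-08T14:38:05Z">
     <Issuer>ac.corp.company.com</Issuer>
     <Subject><NameID>CN=Joe Bob</NameID></Subject>
    </Assertion>
   </samlp:Response>
  </samlp:ArtifactResponse>
 </SOAP-ENV:Body>
</SOAP-ENV:Envelope>"""


def extract_user(soap_xml: str) -> Optional[str]:
    """Return the NameID text, or None if the response is not a success."""
    root = ET.fromstring(soap_xml)
    artifact_response = root.find("soap:Body/samlp:ArtifactResponse", NS)
    if artifact_response is None:
        return None
    # Checks only the status of the ArtifactResponse itself (a direct child).
    status = artifact_response.find("samlp:Status/samlp:StatusCode", NS)
    if status is None or status.get("Value") != SUCCESS:
        return None
    name_id = artifact_response.find(
        "samlp:Response/saml:Assertion/saml:Subject/saml:NameID", NS
    )
    if name_id is None or not name_id.text:
        return None
    return name_id.text.strip()


user = extract_user(SAMPLE)
```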
The Google Search Authorization SPI lets a customer-provided web service, which this document refers to as the Policy Decision Point (PDP), answer authorization requests from the Google Search Appliance. The PDP provides a layer between the Google Search Appliance and the customer's access-control system. The PDP is implemented, tested, and maintained by the customer.
When a user performs a search over access-controlled documents, the user must first authenticate to the Google Search Appliance. This allows the Google Search Appliance to reference the user's identity when making authorization checks, and to include the search user's identity in search logs.
There is an option to turn off cache links and snippets for access-controlled documents. This allows the administrator to assess the risk of storing access-controlled documents on the Google Search Appliance.
As with AuthN, the protocol used between the Google Search Appliance, the browser, and the PDP is taken from SAML 2.0, an XML-based standard, whose primary use case is inter-domain single sign-on. For example, suppose a user is logged in at organization A, and wants to access content at organization B. Instead of forcing the user to log in again, SAML provides a way for the SSO system at A to vouch for the user by communicating with the SSO system at B. In our scenario, the PDP will act as organization A, while the Google Search Appliance will act as organization B.
When the Google Search Appliance needs to check whether a search user has access to a URL, it creates a message containing the user identity and the URL, and sends it to an authorization server. This authorization server is the Policy Decision Point (PDP), a service provided by the customer. In response to authorization check requests, the Policy Decision Point responds with a message that says "Permit," "Deny," or "Indeterminate" (these terms are defined by the SAML standard).
For each URL in a search results list, the Google Search Appliance will issue an <AuthzDecisionQuery> element, containing the identity of the user and the URL in question, to the Policy Decision Point. The PDP will send back a <Response> message, which will contain an <AuthzDecisionStatement> that says whether the user is authorized for the URL. These messages will be exchanged using the SAML SOAP binding over HTTPS.
The format of these messages is defined by SAML, and they are sent over SOAP over HTTPS. How the SAML messages are embedded in SOAP is also defined by SAML, as the "SAML SOAP binding." For complete details, please refer to the SAML standard.
When the Google Search Appliance makes an authorization check, it caches the result. The time that this information is valid is configurable in the Admin Console.
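The appliance's cache is internal and not described here, but the general idea of a time-limited decision cache can be sketched as below. The class `DecisionCache`, its parameters, and the injectable clock are all hypothetical and only illustrate the technique; a PDP author might use something similar in front of a slow access-control system.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class DecisionCache:
    """Hypothetical cache of (user, URL) decisions with a fixed lifetime."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[Tuple[str, str], Tuple[str, float]] = {}

    def put(self, user: str, url: str, decision: str) -> None:
        # Store the decision together with the time at which it expires.
        self._entries[(user, url)] = (decision, self._clock() + self._ttl)

    def get(self, user: str, url: str) -> Optional[str]:
        entry = self._entries.get((user, url))
        if entry is None:
            return None
        decision, expires_at = entry
        if self._clock() >= expires_at:
            # Expired: drop the entry so the caller re-checks authorization.
            del self._entries[(user, url)]
            return None
        return decision
```

A short lifetime limits how long a revoked permission can still appear in results; a long one reduces load on the PDP.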
Here are the relevant portions of the SAML schema for the request:
<complexType name="RequestAbstractType" abstract="true">
<sequence>
<element ref="saml:Issuer" minOccurs="0"/>
<element ref="ds:Signature" minOccurs="0"/>
<element ref="samlp:Extensions" minOccurs="0"/>
</sequence>
<attribute name="ID" type="ID" use="required"/>
<attribute name="Version" type="string" use="required"/>
<attribute name="IssueInstant" type="dateTime" use="required"/>
<attribute name="Consent" type="anyURI" use="optional"/>
</complexType>
<element name="Extensions" type="samlp:ExtensionsType"/>
<complexType name="ExtensionsType">
<sequence>
<any namespace="##other" processContents="lax" maxOccurs="unbounded"/>
</sequence>
</complexType>
<element name="SubjectQuery" type="samlp:SubjectQueryAbstractType"/>
<complexType name="SubjectQueryAbstractType" abstract="true">
<complexContent>
<extension base="samlp:RequestAbstractType">
<sequence>
<element ref="saml:Subject"/>
</sequence>
</extension>
</complexContent>
</complexType>
<complexType name="BaseIDAbstractType" abstract="true" mixed="true">
<complexContent>
<extension base="anyType">
<attribute name="NameQualifier" type="string" use="optional"/>
<attribute name="SPNameQualifier" type="string" use="optional"/>
</extension>
</complexContent>
</complexType>
<element name="NameID" type="saml:NameIDType"/>
<complexType name="NameIDType" mixed="false">
<simpleContent>
<restriction base="saml:BaseIDAbstractType">
<simpleType>
<restriction base="string"/>
</simpleType>
<attribute name="Format" type="anyURI" use="optional"/>
<attribute name="SPProvidedID" type="string" use="optional"/>
</restriction>
</simpleContent>
</complexType>
<element name="Subject" type="saml:SubjectType"/>
<complexType name="SubjectType">
<choice>
<sequence>
<choice>
<element ref="saml:BaseID"/>
<element ref="saml:NameID"/>
<element ref="saml:EncryptedID"/>
</choice>
<element ref="saml:SubjectConfirmation" minOccurs="0" maxOccurs="unbounded"/>
</sequence>
<element ref="saml:SubjectConfirmation" maxOccurs="unbounded"/>
</choice>
</complexType>
<element name="AuthzDecisionQuery" type="samlp:AuthzDecisionQueryType"/>
<complexType name="AuthzDecisionQueryType">
<complexContent>
<extension base="samlp:SubjectQueryAbstractType">
<sequence>
<element ref="saml:Action" maxOccurs="unbounded"/>
<element ref="saml:Evidence" minOccurs="0"/>
</sequence>
<attribute name="Resource" type="anyURI" use="required"/>
</extension>
</complexContent>
</complexType>
The <Subject> element will contain the identity of the search user. For the <Subject> element, the <NameID> element will be used. The format of this identity will be whatever is passed to the Google Search Appliance from the Authentication portion of the Access Control framework. The Resource attribute will be the URL for which we are checking authorization.
For the <Action> element, the Namespace attribute will have the value "urn:oasis:names:tc:SAML:1.0:action:ghpp". The text of the <Action> element is GET.
The following elements will not be sent to the Policy Decision Point by the Google Search Appliance.
Here are some relevant portions of the SAML schema for the response:
<element name="Response" type="samlp:ResponseType"/>
<complexType name="ResponseType">
<complexContent>
<extension base="samlp:StatusResponseType">
<choice minOccurs="0" maxOccurs="unbounded">
<element ref="saml:Assertion"/>
<element ref="saml:EncryptedAssertion"/>
</choice>
</extension>
</complexContent>
</complexType>
<complexType name="StatusResponseType">
<sequence>
<element ref="saml:Issuer" minOccurs="0"/>
<element ref="ds:Signature" minOccurs="0"/>
<element ref="samlp:Extensions" minOccurs="0"/>
<element ref="samlp:Status"/>
</sequence>
<attribute name="ID" type="ID" use="required"/>
<attribute name="InResponseTo" type="NCName" use="optional"/>
<attribute name="Version" type="string" use="required"/>
<attribute name="IssueInstant" type="dateTime" use="required"/>
<attribute name="Recipient" type="anyURI" use="optional"/>
</complexType>
<element name="Status" type="samlp:StatusType"/>
<complexType name="StatusType">
<sequence>
<element ref="samlp:StatusCode"/>
<element ref="samlp:StatusMessage" minOccurs="0"/>
<element ref="samlp:StatusDetail" minOccurs="0"/>
</sequence>
</complexType>
<element name="StatusCode" type="samlp:StatusCodeType"/>
<complexType name="StatusCodeType">
<sequence>
<element ref="samlp:StatusCode" minOccurs="0"/>
</sequence>
<attribute name="Value" type="anyURI" use="required"/>
</complexType>
<element name="Assertion" type="saml:AssertionType"/>
<complexType name="AssertionType">
<sequence>
<element ref="saml:Issuer"/>
<element ref="ds:Signature" minOccurs="0"/>
<element ref="saml:Subject" minOccurs="0"/>
<element ref="saml:Conditions" minOccurs="0"/>
<element ref="saml:Advice" minOccurs="0"/>
<choice minOccurs="0" maxOccurs="unbounded">
<element ref="saml:Statement"/>
<element ref="saml:AuthnStatement"/>
<element ref="saml:AuthzDecisionStatement"/>
<element ref="saml:AttributeStatement"/>
</choice>
</sequence>
<attribute name="Version" type="string" use="required"/>
<attribute name="ID" type="ID" use="required"/>
<attribute name="IssueInstant" type="dateTime" use="required"/>
</complexType>
<complexType name="StatementAbstractType" abstract="true"/>
<element name="Issuer" type="saml:NameIDType"/>
<element name="AuthzDecisionStatement" type="saml:AuthzDecisionStatementType"/>
<complexType name="AuthzDecisionStatementType">
<complexContent>
<extension base="saml:StatementAbstractType">
<sequence>
<element ref="saml:Action" maxOccurs="unbounded"/>
<element ref="saml:Evidence" minOccurs="0"/>
</sequence>
<attribute name="Resource" type="anyURI" use="required"/>
<attribute name="Decision" type="saml:DecisionType" use="required"/>
</extension>
</complexContent>
</complexType>
<simpleType name="DecisionType">
<restriction base="string">
<enumeration value="Permit"/>
<enumeration value="Deny"/>
<enumeration value="Indeterminate"/>
</restriction>
</simpleType>
<element name="Action" type="saml:ActionType"/>
<complexType name="ActionType">
<simpleContent>
<extension base="string">
<attribute name="Namespace" type="anyURI" use="required"/>
</extension>
</simpleContent>
</complexType>
The namespace set in the <Action> element attribute will be "urn:oasis:names:tc:SAML:1.0:action:ghpp". If the string in an <Action> element is "GET", the Google Search Appliance will display the URL in the search results, along with snippets and the cache link.
Since the URL found in the cache link (the cache URL pointed to by the cache link, not the URL that points to the original document) is not secret, we must again check "GET" authorization for a document when the user tries to access the corresponding cache link URL.
If the value of the Decision attribute in the <AuthzDecisionStatement> is "Indeterminate", rather than "Permit" or "Deny", the Google Search Appliance will then try to check authorization using Basic Authentication, NTLM, or Forms Authentication, if they are configured. If they aren't configured, an answer of "Indeterminate" will be treated as if authorization were denied.
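The handling of the three decision values described above can be summarized in a short sketch. The function `is_authorized` and its `fallback` parameter are hypothetical; `fallback` stands in for a check by Basic Authentication, NTLM, or Forms Authentication when one is configured. Treating an unrecognized value as a denial is an assumption of this sketch, not something the text specifies.

```python
from typing import Callable, Optional


def is_authorized(decision: str,
                  fallback: Optional[Callable[[], bool]] = None) -> bool:
    """Map a SAML Decision value to a show/hide choice for one URL."""
    if decision == "Permit":
        return True
    if decision == "Deny":
        return False
    if decision == "Indeterminate" and fallback is not None:
        # Another authorization mechanism is configured: ask it instead.
        return fallback()
    # "Indeterminate" with no fallback is treated as denied.
    # Assumption: any unrecognized value is denied as well.
    return False
```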
Here is an example of a message the Google Search Appliance might send to the Policy Decision Point:
POST /authz HTTP/1.1
Host: ac.abc.com
Content-Type: text/xml
SOAPAction: http://www.oasis-open.org/committees/security
Content-length: nnn
<?xml version="1.0" encoding="UTF-8"?>
<soapenv:Envelope
xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<soapenv:Body>
<samlp:AuthzDecisionQuery
ID="kmigpcackfenaibdninipcnmkmajfplommhfapbk"
IssueInstant="2004-10-20T17:52:29Z"
Version="2.0"
Resource="http://www.abc.com/secret.html"
xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion"
xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol">
<saml:Subject>
<saml:NameID>Joe Bob</saml:NameID>
</saml:Subject>
<saml:Action
Namespace="urn:oasis:names:tc:SAML:1.0:action:ghpp">
GET
</saml:Action>
</samlp:AuthzDecisionQuery>
</soapenv:Body>
</soapenv:Envelope>
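On the Policy Decision Point side, a request like this has to be parsed before a decision can be made. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the names `AuthzQuery` and `parse_authz_query` are hypothetical, and `SAMPLE` is a trimmed copy of the message above. It strips whitespace from the <Action> text because the example carries the value on its own line.

```python
import xml.etree.ElementTree as ET
from typing import NamedTuple

NS = {
    "soap": "http://schemas.xmlsoap.org/soap/envelope/",
    "samlp": "urn:oasis:names:tc:SAML:2.0:protocol",
    "saml": "urn:oasis:names:tc:SAML:2.0:assertion",
}

SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
 <soapenv:Body>
  <samlp:AuthzDecisionQuery
    ID="kmigpcackfenaibdninipcnmkmajfplommhfapbk"
    IssueInstant="2004-10-20T17:52:29Z" Version="2.0"
    Resource="http://www.abc.com/secret.html"
    xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion"
    xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol">
   <saml:Subject><saml:NameID>Joe Bob</saml:NameID></saml:Subject>
   <saml:Action Namespace="urn:oasis:names:tc:SAML:1.0:action:ghpp">
    GET
   </saml:Action>
  </samlp:AuthzDecisionQuery>
 </soapenv:Body>
</soapenv:Envelope>"""


class AuthzQuery(NamedTuple):
    query_id: str
    user: str
    resource: str
    action: str


def parse_authz_query(soap_xml: bytes) -> AuthzQuery:
    """Extract the query ID, user, URL, and action from a SOAP request."""
    root = ET.fromstring(soap_xml)
    query = root.find("soap:Body/samlp:AuthzDecisionQuery", NS)
    if query is None:
        raise ValueError("no AuthzDecisionQuery in SOAP body")
    name_id = query.find("saml:Subject/saml:NameID", NS)
    action = query.find("saml:Action", NS)
    if name_id is None or action is None:
        raise ValueError("query is missing NameID or Action")
    return AuthzQuery(
        query_id=query.get("ID", ""),
        user=(name_id.text or "").strip(),
        resource=query.get("Resource", ""),
        action=(action.text or "").strip(),
    )


parsed = parse_authz_query(SAMPLE)
```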
Here's an example of a possible response from the Policy Decision Point:
HTTP/1.1 200 OK
Content-Type: text/xml
Content-Length: nnn
<soapenv:Envelope
xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
<soapenv:Body>
<samlp:Response
xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol"
xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion"
ID="blahblah"
Version="2.0"
IssueInstant="2004-10-08T14:38:05Z">
<samlp:Status>
<samlp:StatusCode
Value="urn:oasis:names:tc:SAML:2.0:status:Success"/>
</samlp:Status>
<saml:Assertion
Version="2.0"
ID="blahblah2"
IssueInstant="2004-10-08T14:38:05Z">
<saml:Issuer>abc.com</saml:Issuer>
<saml:Subject>
<saml:NameID>Joe Bob</saml:NameID>
</saml:Subject>
<saml:AuthzDecisionStatement
Resource="http://www.abc.com/secret.html"
Decision="Permit">
<saml:Action
Namespace="urn:oasis:names:tc:SAML:1.0:action:ghpp">
GET
</saml:Action>
</saml:AuthzDecisionStatement>
</saml:Assertion>
</samlp:Response>
</soapenv:Body>
</soapenv:Envelope>
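A response of this shape could be assembled as in the sketch below, which uses only the Python standard library. The function `build_authz_response`, its parameters, and the generated ID format are assumptions for illustration; the sketch does not sign the assertion, and the InResponseTo attribute it sets is optional in the schema shown earlier.

```python
import uuid
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from typing import Optional

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
SAMLP_NS = "urn:oasis:names:tc:SAML:2.0:protocol"
SAML_NS = "urn:oasis:names:tc:SAML:2.0:assertion"
ACTION_NS = "urn:oasis:names:tc:SAML:1.0:action:ghpp"
SUCCESS = "urn:oasis:names:tc:SAML:2.0:status:Success"

# Use the same prefixes as the example message when serializing.
ET.register_namespace("soapenv", SOAP_NS)
ET.register_namespace("samlp", SAMLP_NS)
ET.register_namespace("saml", SAML_NS)


def qname(namespace: str, tag: str) -> str:
    """Build the '{namespace}tag' form that ElementTree expects."""
    return "{%s}%s" % (namespace, tag)


def new_id() -> str:
    # The leading underscore keeps the value a valid xs:ID.
    return "_" + uuid.uuid4().hex


def build_authz_response(user: str, resource: str, decision: str,
                         issuer: str,
                         in_response_to: Optional[str] = None) -> bytes:
    if decision not in ("Permit", "Deny", "Indeterminate"):
        raise ValueError("unknown decision: %r" % decision)
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    envelope = ET.Element(qname(SOAP_NS, "Envelope"))
    body = ET.SubElement(envelope, qname(SOAP_NS, "Body"))
    response = ET.SubElement(body, qname(SAMLP_NS, "Response"), {
        "ID": new_id(), "Version": "2.0", "IssueInstant": now})
    if in_response_to:
        response.set("InResponseTo", in_response_to)
    status = ET.SubElement(response, qname(SAMLP_NS, "Status"))
    ET.SubElement(status, qname(SAMLP_NS, "StatusCode"), {"Value": SUCCESS})
    assertion = ET.SubElement(response, qname(SAML_NS, "Assertion"), {
        "Version": "2.0", "ID": new_id(), "IssueInstant": now})
    ET.SubElement(assertion, qname(SAML_NS, "Issuer")).text = issuer
    subject = ET.SubElement(assertion, qname(SAML_NS, "Subject"))
    ET.SubElement(subject, qname(SAML_NS, "NameID")).text = user
    statement = ET.SubElement(
        assertion, qname(SAML_NS, "AuthzDecisionStatement"),
        {"Resource": resource, "Decision": decision})
    action = ET.SubElement(statement, qname(SAML_NS, "Action"),
                           {"Namespace": ACTION_NS})
    action.text = "GET"
    return ET.tostring(envelope, encoding="utf-8")


reply = build_authz_response(
    "Joe Bob", "http://www.abc.com/secret.html", "Permit", "abc.com")
```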