Authentication/Authorization for Enterprise SPI Guide

Google Search Appliance software version 4.6
Posted October 2006

Authentication (AuthN) is used to identify users and authorization (AuthZ) is used to allow users access to documents according to their credentials. This document describes how to use the Google Search Authorization Service Provider Interface (SPI) to develop a component for answering authorization requests, in order to securely show search results.

This document describes features that are available in version 4.6 and later of the Google Search Appliance. These features are not available in the Google Mini.

Contents

  1. Introduction
  2. Before You Begin
  3. Authentication
    1. Purpose of the Authentication SPI
    2. How the AuthN SPI Works
  4. Authorization
    1. Purpose of the Authorization SPI
    2. How the AuthZ SPI Works

Introduction

This document describes both the Authentication SPI and the Authorization SPI, to present the whole picture of delivering secure search results using the SPI.

Many enterprise customers have documents on their intranets that are access-controlled. Some users have access to a given document and some do not. The Google Search Appliance can crawl these documents, and, to protect confidentiality at serving time, include in the search results only the documents that the searcher has access to.

The Google Search Appliance can crawl and serve documents protected by HTTP Basic Authentication and NTLM, as well as documents protected by HTML Forms-based Authentication. The appliance integrates with form-based single sign-on systems, such as eTrust™ SiteMinder from Computer Associates, Cams™ from Cafesoft, and Oracle Identity Management. For HTTP Basic Authentication, NTLM, and Forms-based Authentication, the appliance checks document authorization by masquerading as the user: the user enters their credentials, or sends the appliance their single sign-on cookie, and the appliance attempts to access the URL of each access-controlled document in the result set.

Instead of masquerading as the user, the Google Search Appliance can also authenticate the user directly using X.509 client certificates.

In addition, you can set up authentication using LDAP (Lightweight Directory Access Protocol) or the Google Search Authentication SPI. With the SPI, a web service that you provide acts as an Identity Provider and a Policy Decision Point between your access control system and the Google Search Appliance.

Whether you use client certificates, LDAP, or the Authentication SPI to authenticate users, the Google Search Appliance still needs to know which URLs a search user is authorized to view, so that it can decide which URLs to include in the search results. The Authorization SPI exists to satisfy this requirement.

Before You Begin

To write an Identity Provider and Policy Decision Point web service, you should be familiar with these technologies.

The sample messages in this document conform to the XML schemas on the SAML web site.

Tip: One way to implement an Identity Provider and Policy Decision Point is to access a SOAP server using Apache Axis.

Authentication

Purpose of the Authentication SPI

When implemented, the Authentication SPI allows search users to authenticate to the Google Search Appliance. It is designed to allow customers to integrate the appliance into an existing access control infrastructure. Instead of authenticating search users itself, the appliance redirects the user to an Identity Provider (IP), a customer-implemented server, where the actual authentication takes place. The IP then redirects the user back to the appliance, while passing information that includes the identity of the search user. The protocol that governs this communication between the appliance, the browser, and the customer's IP is based on SAML 2.0, an XML-based standard.

Access Connector

Figure 1: The Google Search Appliance communicates through an Identity Provider to authenticate users' access to intranet web pages.

Note: If you use the Authentication SPI, you must use the Authorization SPI as well. However, if you authenticate your users with X.509 certificates or LDAP, you do not need to implement the Authentication SPI and can go on to the Authorization section.

How the AuthN SPI Works

The Authentication SPI exposed by the Google Search Appliance is based on the SAML 2.0 standard; specifically, on the "Web Browser SSO Profile." The Web Browser SSO profile makes use of the "Authentication Request Protocol," a request-response protocol. The appliance sends a SAML <AuthnRequest> message to the customer's Identity Provider, and the Identity Provider responds with a SAML <Response> message that contains an <Assertion>, which in turn contains an <AuthnStatement>. These messages are transferred between the appliance and the customer's Identity Provider, via the browser, using the "HTTP Redirect" and the "HTTP Artifact" bindings.

The authentication sequence between the user/appliance/Identity Provider goes like this:

The Identity Provider runs a SOAP message service that accepts queries from the appliance and returns an <ArtifactResponse> element as a child of the SOAP <Body> element. The connection is made over mutually authenticated HTTPS, which ensures origin integrity, data integrity, and confidentiality. To dereference the artifact and obtain the SAML response message, the appliance uses the SAML SOAP binding to send an <ArtifactResolve> message, which contains the artifact, to the Identity Provider. The Identity Provider replies with an <ArtifactResponse> message, which contains a <Response> element. The <Response> carries an <Assertion> whose <Subject> holds the identity of the search user and whose <AuthnStatement> records the authentication event.

An artifact should not be reusable. Once an artifact is dereferenced, the Identity Provider should reject attempts to dereference the same artifact again.
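
The one-time-use rule above can be illustrated with a small in-memory store on the Identity Provider side. This is only a sketch under stated assumptions: the class name `ArtifactStore`, the 60-second lifetime, and the use of a random token as the artifact are all hypothetical choices made for illustration. The SAML standard defines its own artifact format, which this sketch does not reproduce.

```python
import secrets
import time


class ArtifactStore:
    """Hypothetical in-memory store that lets each artifact be resolved once."""

    def __init__(self, lifetime_seconds=60.0, clock=time.monotonic):
        self._lifetime = lifetime_seconds  # assumed lifetime, not from the guide
        self._clock = clock
        self._pending = {}  # artifact -> (expiry time, SAML response XML)

    def issue(self, response_xml):
        """Store a response and return a fresh, hard-to-guess artifact."""
        artifact = secrets.token_urlsafe(32)
        self._pending[artifact] = (self._clock() + self._lifetime, response_xml)
        return artifact

    def resolve(self, artifact):
        """Return the stored response once; return None on reuse or expiry."""
        # pop() removes the entry, so a second attempt finds nothing.
        entry = self._pending.pop(artifact, None)
        if entry is None:
            return None
        expires_at, response_xml = entry
        if self._clock() > expires_at:
            return None
        return response_xml
```

Removing the entry as part of the lookup means a replayed artifact is rejected even if the first resolution failed later in processing, which is the conservative choice.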

Session Cookie

After a search user logs in using the Authentication SPI, the appliance will maintain a session with the search user so that the user doesn't have to reauthenticate to the Identity Provider on every search.

This session is maintained with a session cookie. This cookie is securely sent over HTTPS and is set for the appliance's hostname only. The cookie value will contain the user identity and the time that the session cookie will expire.

HTTP Redirect Binding

When a search user performs a query (having no session cookie set), the appliance responds with a redirect that looks something like this:

HTTP/1.1 302 Object Moved
Date: Wed, 23 Feb 2005 19:00:49 GMT
Location: https://ac.corp.company.com/SAML_login?SAMLRequest=BASE64URLENCODEDELEMENT&RelayState=https://search.corp.company.com/search?q=query
Content-Type: text/html; charset=iso-8859-1

The element BASE64URLENCODEDELEMENT is similar to:

<AuthnRequest ID="foobar" Version="2.0" IssueInstant="2005-10-08T11:32:19Z"/>

which is first DEFLATE-compressed, then Base 64 encoded, then URL encoded.
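
The three encoding steps described above can be sketched in Python with the standard library. The function names are hypothetical, and the sketch assumes raw DEFLATE (no zlib header or checksum), which is what the SAML HTTP Redirect binding specifies. A decoder is included because an Identity Provider has to reverse the steps.

```python
import base64
import urllib.parse
import zlib


def encode_saml_request(xml_text):
    """DEFLATE-compress, Base64-encode, then URL-encode a SAML message."""
    # wbits=-15 selects a raw DEFLATE stream without the zlib wrapper.
    compressor = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED, -15)
    deflated = compressor.compress(xml_text.encode("utf-8")) + compressor.flush()
    b64 = base64.b64encode(deflated).decode("ascii")
    # safe="" forces "+", "/" and "=" to be percent-encoded as well.
    return urllib.parse.quote(b64, safe="")


def decode_saml_request(encoded):
    """Reverse the steps: URL-decode, Base64-decode, then inflate."""
    b64 = urllib.parse.unquote(encoded)
    deflated = base64.b64decode(b64)
    return zlib.decompress(deflated, -15).decode("utf-8")
```

For example, `decode_saml_request(encode_saml_request(xml))` returns the original `xml` string. Web frameworks usually URL-decode query parameters automatically, in which case only the Base64 and inflate steps remain.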

HTTP Artifact Binding

When the Identity Provider redirects the browser to the appliance, here's what it might look like:

HTTP/1.1 302 Object Moved
Date: Thu, 10 Feb 2005 02:22:03 GMT
Location: https://search.corp.company.com/SamlArtifactConsumer?SAMLart=RANDOMLOOKINGSTRING&RelayState=https://search.corp.company.com/search?q=query
Content-Type: text/html; charset=iso-8859-1
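
As a rough sketch, an Identity Provider might assemble the Location value of this redirect as follows. The function name is hypothetical. Unlike the abbreviated sample above, the sketch percent-encodes the parameter values, so that the "?" and "=" inside the RelayState URL are not mistaken for part of the outer query string.

```python
import urllib.parse


def build_artifact_redirect(consumer_url, artifact, relay_state):
    """Return the redirect URL that carries the artifact back to the appliance."""
    # urlencode() percent-encodes each value, including the nested RelayState URL.
    query = urllib.parse.urlencode({"SAMLart": artifact, "RelayState": relay_state})
    return consumer_url + "?" + query


location = build_artifact_redirect(
    "https://search.corp.company.com/SamlArtifactConsumer",
    "RANDOMLOOKINGSTRING",
    "https://search.corp.company.com/search?q=query",
)
```

The RelayState value is whatever the appliance sent in its own redirect; the Identity Provider is expected to pass it back unchanged.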

The appliance gets the artifact as the SAMLart parameter's value, and sends it to the Identity Provider via SOAP over a mutually authenticated HTTPS connection:

POST /SAMLResponder/resolve HTTP/1.1
Host: ac.corp.company.com
Content-Type: text/xml
Content-Length: nnn
SOAPAction: http://www.oasis-open.org/committees/security

<SOAP-ENV:Envelope
xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
  <SOAP-ENV:Body>
    <samlp:ArtifactResolve
      xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol"
      xmlns="urn:oasis:names:tc:SAML:2.0:assertion"
      ID="randomlooking" Version="2.0"
      IssueInstant="2005-02-09T18:42:32Z">
      <Issuer>search.corp.company.com</Issuer>
      <samlp:Artifact>RANDOMLOOKINGSTRING</samlp:Artifact>
    </samlp:ArtifactResolve>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>

The Identity Provider returns:

HTTP/1.1 200 OK
Date: Thu, 10 Feb 2005 02:22:04 GMT
Content-Type: text/xml
Content-Length: nnn

<SOAP-ENV:Envelope
  xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
  <SOAP-ENV:Body>
    <samlp:ArtifactResponse
      xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol"
      xmlns="urn:oasis:names:tc:SAML:2.0:assertion"
      ID="alsorandomlooking" Version="2.0"
      InResponseTo="randomlooking"
      IssueInstant="2005-02-09T18:43:32Z">
      <Issuer>ac.corp.company.com</Issuer>
      <samlp:Status>
        <samlp:StatusCode Value="urn:oasis:names:tc:SAML:2.0:status:Success"/>
      </samlp:Status>
      <samlp:Response
        ID="blahblah"
        Version="2.0"
        IssueInstant="2004-10-08T14:38:05Z">
        <samlp:Status>
          <samlp:StatusCode
            Value="urn:oasis:names:tc:SAML:2.0:status:Success"/>
        </samlp:Status>
        <Assertion
          Version="2.0"
          ID="blahblah2"
          IssueInstant="2004-10-08T14:38:05Z">
          <Issuer>ac.corp.company.com</Issuer>
          <Subject>
            <NameID>CN=Joe Bob</NameID>
          </Subject>
          <AuthnStatement
            AuthnInstant="2004-10-08T11:32:19Z">
            <AuthnContext>
              <AuthnContextClassRef>
                urn:oasis:names:tc:SAML:2.0:ac:classes:PasswordProtectedTransport
              </AuthnContextClassRef>
            </AuthnContext>
          </AuthnStatement>
        </Assertion>
      </samlp:Response>
    </samlp:ArtifactResponse>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
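
To show how the identity of the search user can be read from a message shaped like the one above, here is a minimal Python sketch using the standard library XML parser. The function name is hypothetical, and the sketch makes simplifying assumptions: it trusts the mutually authenticated HTTPS channel, does not verify XML signatures, and does not check timestamps or the InResponseTo value.

```python
import xml.etree.ElementTree as ET

NS = {
    "soap": "http://schemas.xmlsoap.org/soap/envelope/",
    "samlp": "urn:oasis:names:tc:SAML:2.0:protocol",
    "saml": "urn:oasis:names:tc:SAML:2.0:assertion",
}
SUCCESS = "urn:oasis:names:tc:SAML:2.0:status:Success"


def extract_user_identity(soap_xml):
    """Return the NameID text from a successful ArtifactResponse, else None."""
    root = ET.fromstring(soap_xml)
    artifact_response = root.find("soap:Body/samlp:ArtifactResponse", NS)
    if artifact_response is None:
        return None
    response = artifact_response.find("samlp:Response", NS)
    if response is None:
        return None
    # Both the outer ArtifactResponse and the inner Response must report Success.
    for element in (artifact_response, response):
        code = element.find("samlp:Status/samlp:StatusCode", NS)
        if code is None or code.get("Value") != SUCCESS:
            return None
    # Only accept a subject from an assertion that carries an AuthnStatement.
    for assertion in response.findall("saml:Assertion", NS):
        if assertion.find("saml:AuthnStatement", NS) is None:
            continue
        name_id = assertion.find("saml:Subject/saml:NameID", NS)
        if name_id is not None and name_id.text:
            return name_id.text.strip()
    return None
```

Applied to the sample response above, the function would return the string CN=Joe Bob.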

Authorization

Purpose of the Authorization SPI

The Google Search Authorization SPI lets a customer's web service, which this document refers to as the Policy Decision Point (PDP), answer authorization requests from the appliance on behalf of the customer's access control system. The PDP provides a layer between the Google Search Appliance and the customer's access-control system. The PDP is implemented, tested, and maintained by the customer.

When a user performs a search over access-controlled documents, the user must first authenticate to the Google Search Appliance. This allows the Google Search Appliance to reference the user's identity when making authorization checks, and to include the search user's identity in search logs.

There is an option to turn off cache links and snippets for access-controlled documents. This allows the administrator to assess the risk of storing access-controlled documents on the Google Search Appliance.

As with AuthN, the protocol used between the Google Search Appliance, the browser, and the PDP is taken from SAML 2.0, an XML-based standard, whose primary use case is inter-domain single sign-on. For example, suppose a user is logged in at organization A, and wants to access content at organization B. Instead of forcing the user to log in again, SAML provides a way for the SSO system at A to vouch for the user by communicating with the SSO system at B. In our scenario, the PDP will act as organization A, while the Google Search Appliance will act as organization B.

How the AuthZ SPI Works

When the Google Search Appliance needs to check whether a search user has access to a URL, it creates a message containing the user identity and the URL, and sends it to an authorization server. This authorization server is the Policy Decision Point (PDP), a service provided by the customer. In response to an authorization check request, the Policy Decision Point returns a message that says "Permit", "Deny", or "Indeterminate" (these terms are defined by the SAML standard).

For each URL in a search results list, the Google Search Appliance will issue an <AuthzDecisionQuery> element, containing the identity of the user and the URL in question, to the Policy Decision Point. The PDP will send back a <Response> message, which will contain an <AuthzDecisionStatement> that says whether the user is authorized for the URL. These messages will be exchanged using the SAML SOAP binding over HTTPS.

The format of these messages is defined by SAML, and they are sent using SOAP over HTTPS. How the SAML messages are embedded in SOAP is also defined by SAML, as the "SAML SOAP binding". For complete details, refer to the SAML standard.

When the Google Search Appliance makes an authorization check, it caches the result. The time that this information is valid is configurable in the Admin Console.
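
The caching behavior described above can be pictured with a small time-limited cache. This is an illustration of the idea only, not a description of the appliance's internals: the class name, the key layout, and the injectable clock are all assumptions.

```python
import time


class AuthzCache:
    """Hypothetical cache of authorization decisions with a fixed validity time."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds  # plays the role of the configurable validity time
        self._clock = clock
        self._entries = {}  # (user, url) -> (expiry time, decision)

    def get(self, user, url):
        """Return the cached decision, or None if absent or expired."""
        entry = self._entries.get((user, url))
        if entry is None:
            return None
        expires_at, decision = entry
        if self._clock() >= expires_at:
            del self._entries[(user, url)]  # drop stale entries on access
            return None
        return decision

    def put(self, user, url, decision):
        """Remember a decision until the validity time elapses."""
        self._entries[(user, url)] = (self._clock() + self._ttl, decision)
```

The trade-off is the usual one for cached access decisions: a longer validity time means fewer requests to the Policy Decision Point, but a revoked permission can keep taking effect until the cached entry expires.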

Here are the relevant portions of the SAML schema for the request:

<complexType name="RequestAbstractType" abstract="true">
  <sequence>
    <element ref="saml:Issuer" minOccurs="0"/>

    <element ref="ds:Signature" minOccurs="0"/>
    <element ref="samlp:Extensions" minOccurs="0"/>
  </sequence>
  <attribute name="ID" type="ID" use="required"/>
  <attribute name="Version" type="string" use="required"/>

  <attribute name="IssueInstant" type="dateTime" use="required"/>
  <attribute name="Consent" type="anyURI" use="optional"/>
</complexType>
<element name="Extensions" type="samlp:ExtensionsType"/>
<complexType name="ExtensionsType">
  <sequence>
    <any namespace="##other" processContents="lax" maxOccurs="unbounded"/>
  </sequence>
</complexType>


<element name="SubjectQuery" type="samlp:SubjectQueryAbstractType"/>
<complexType name="SubjectQueryAbstractType" abstract="true">
  <complexContent>

    <extension base="samlp:RequestAbstractType">
      <sequence>
        <element ref="saml:Subject"/>
      </sequence>
    </extension>
  </complexContent>

</complexType>

<complexType name="BaseIDAbstractType" abstract="true" mixed="true">
    <complexContent>
        <extension base="anyType">
            <attribute name="NameQualifier" type="string" use="optional"/>
            <attribute name="SPNameQualifier" type="string" use="optional"/>
        </extension>
    </complexContent>
</complexType>

<element name="NameID" type="saml:NameIDType"/>
<complexType name="NameIDType" mixed="false">
    <simpleContent>
        <restriction base="saml:BaseIDAbstractType">
            <simpleType>
                <restriction base="string"/>
            </simpleType>
            <attribute name="Format" type="anyURI" use="optional"/>
            <attribute name="SPProvidedID" type="string" use="optional"/>
        </restriction>
    </simpleContent>
</complexType>

<element name="Subject" type="saml:SubjectType"/>
<complexType name="SubjectType">
    <choice>
        <sequence>
            <choice>
                <element ref="saml:BaseID"/>
                <element ref="saml:NameID"/>
                <element ref="saml:EncryptedID"/>
            </choice>
            <element ref="saml:SubjectConfirmation" minOccurs="0" maxOccurs="unbounded"/>
        </sequence>
        <element ref="saml:SubjectConfirmation" maxOccurs="unbounded"/>
    </choice>
</complexType>

<element name="AuthzDecisionQuery" type="samlp:AuthzDecisionQueryType"/>
<complexType name="AuthzDecisionQueryType">
  <complexContent>
    <extension base="samlp:SubjectQueryAbstractType">
      <sequence>

        <element ref="saml:Action" maxOccurs="unbounded"/>
        <element ref="saml:Evidence" minOccurs="0"/>
      </sequence>
      <attribute name="Resource" type="anyURI" use="required"/>
    </extension>
  </complexContent>

</complexType>

The <Subject> element will contain the identity of the search user. For the <Subject> element, the <NameID> element will be used. The format of this identity will be whatever is passed to the Google Search Appliance from the Authentication portion of the Access Control framework. The Resource attribute will be the URL for which we are checking authorization.

For the <Action> element, the Namespace attribute will have the value "urn:oasis:names:tc:SAML:1.0:action:ghpp", and the text of the <Action> element will be GET.

The following elements will not be sent to the Policy Decision Point by the Google Search Appliance.

Here are some relevant portions of the SAML schema for the response:


<element name="Response" type="samlp:ResponseType"/>
<complexType name="ResponseType">
  <complexContent>
    <extension base="samlp:StatusResponseType">
      <choice minOccurs="0" maxOccurs="unbounded">
        <element ref="saml:Assertion"/>
        <element ref="saml:EncryptedAssertion"/>
      </choice>
    </extension>
  </complexContent>
</complexType>

<complexType name="StatusResponseType">
   <sequence>
       <element ref="saml:Issuer" minOccurs="0"/>
       <element ref="ds:Signature" minOccurs="0"/>
       <element ref="samlp:Extensions" minOccurs="0"/>
       <element ref="samlp:Status"/>
   </sequence>
   <attribute name="ID" type="ID" use="required"/>
   <attribute name="InResponseTo" type="NCName" use="optional"/>
   <attribute name="Version" type="string" use="required"/>
   <attribute name="IssueInstant" type="dateTime" use="required"/>
   <attribute name="Recipient" type="anyURI" use="optional"/>
</complexType>

<element name="Status" type="samlp:StatusType"/>
<complexType name="StatusType">
    <sequence>
        <element ref="samlp:StatusCode"/>
        <element ref="samlp:StatusMessage" minOccurs="0"/>
        <element ref="samlp:StatusDetail" minOccurs="0"/>
    </sequence>
</complexType>

<element name="StatusCode" type="samlp:StatusCodeType"/>
<complexType name="StatusCodeType">
    <sequence>
        <element ref="samlp:StatusCode" minOccurs="0"/>
    </sequence>
    <attribute name="Value" type="anyURI" use="required"/>
</complexType>

<element name="Assertion" type="saml:AssertionType"/>
<complexType name="AssertionType">
    <sequence>
        <element ref="saml:Issuer"/>
        <element ref="ds:Signature" minOccurs="0"/>
        <element ref="saml:Subject" minOccurs="0"/>
        <element ref="saml:Conditions" minOccurs="0"/>
        <element ref="saml:Advice" minOccurs="0"/>
        <choice minOccurs="0" maxOccurs="unbounded">
            <element ref="saml:Statement"/>
            <element ref="saml:AuthnStatement"/>
            <element ref="saml:AuthzDecisionStatement"/>
            <element ref="saml:AttributeStatement"/>
        </choice>
    </sequence>
    <attribute name="Version" type="string" use="required"/>
    <attribute name="ID" type="ID" use="required"/>
    <attribute name="IssueInstant" type="dateTime" use="required"/>
</complexType>

<complexType name="StatementAbstractType" abstract="true"/>

<element name="Issuer" type="saml:NameIDType"/>

<element name="AuthzDecisionStatement" type="saml:AuthzDecisionStatementType"/>
<complexType name="AuthzDecisionStatementType">
  <complexContent>
    <extension base="saml:StatementAbstractType">
      <sequence>
        <element ref="saml:Action" maxOccurs="unbounded"/>

        <element ref="saml:Evidence" minOccurs="0"/>
      </sequence>
      <attribute name="Resource" type="anyURI" use="required"/>
      <attribute name="Decision" type="saml:DecisionType" use="required"/>
    </extension>
  </complexContent>
</complexType>

<simpleType name="DecisionType">
  <restriction base="string">
    <enumeration value="Permit"/>
    <enumeration value="Deny"/>
    <enumeration value="Indeterminate"/>
  </restriction>
</simpleType>

<element name="Action" type="saml:ActionType"/>
<complexType name="ActionType">
  <simpleContent>
    <extension base="string">
      <attribute name="Namespace" type="anyURI" use="required"/>
    </extension>
  </simpleContent>
</complexType>

The namespace set in the <Action> element attribute will be "urn:oasis:names:tc:SAML:1.0:action:ghpp". If the string in an <Action> element is "GET", the Google Search Appliance will display the URL in the search results, along with snippets and the cache link.

Since the URL found in the cache link (the cache URL pointed to by the cache link, not the URL that points to the original document) is not secret, we must again check "GET" authorization for a document when the user tries to access the corresponding cache link URL.

If the value for the Decision attribute in the <AuthzDecisionStatement> is "Indeterminate", rather than "Permit" or "Deny", the Google Search Appliance will then try to check authorization using Basic Authentication, NTLM, or Forms Authentication, if they are configured. If they aren't configured, an answer of "Indeterminate" will be treated as if authorization were denied.
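
The handling of the three decision values can be summarized in a few lines of Python. The function and parameter names are hypothetical; `fallback_check` stands in for whichever of Basic Authentication, NTLM, or Forms Authentication is configured, and is None when none is.

```python
def is_authorized(decision, fallback_check=None):
    """Map a SAML decision value to a yes/no answer for one URL."""
    if decision == "Permit":
        return True
    if decision == "Deny":
        return False
    if decision == "Indeterminate" and fallback_check is not None:
        # Defer to the other configured authorization mechanism.
        return bool(fallback_check())
    # Indeterminate with no fallback, or any unexpected value, counts as denied.
    return False
```

Treating unexpected values as denied is a choice made in this sketch; the guide itself only specifies the behavior for the three values defined by SAML.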

Here is an example of a message the Google Search Appliance might send to the Policy Decision Point:

POST /authz HTTP/1.1
Host: ac.abc.com
Content-Type: text/xml
SOAPAction: http://www.oasis-open.org/committees/security
Content-Length: nnn

<?xml version="1.0" encoding="UTF-8"?>
<soapenv:Envelope
  xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
  xmlns:xsd="http://www.w3.org/2001/XMLSchema"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <soapenv:Body>

    <samlp:AuthzDecisionQuery
      ID="kmigpcackfenaibdninipcnmkmajfplommhfapbk"
      IssueInstant="2004-10-20T17:52:29Z"
      Version="2.0"
      Resource="http://www.abc.com/secret.html"
      xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion"
      xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol">
      <saml:Subject>
        <saml:NameID>Joe Bob</saml:NameID>
      </saml:Subject>
      <saml:Action
        Namespace="urn:oasis:names:tc:SAML:1.0:action:ghpp">
        GET
      </saml:Action>
    </samlp:AuthzDecisionQuery>

  </soapenv:Body>
</soapenv:Envelope>
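
On the Policy Decision Point side, a request like the one above has to be unpacked before a decision can be made. The following Python sketch shows one way to do that with the standard library; the names `AuthzQuery` and `parse_authz_query` are hypothetical, and the sketch performs no signature or timestamp checks.

```python
import xml.etree.ElementTree as ET
from typing import NamedTuple

NS = {
    "soap": "http://schemas.xmlsoap.org/soap/envelope/",
    "samlp": "urn:oasis:names:tc:SAML:2.0:protocol",
    "saml": "urn:oasis:names:tc:SAML:2.0:assertion",
}


class AuthzQuery(NamedTuple):
    query_id: str
    user: str
    resource: str
    actions: list  # list of (namespace, action text) pairs


def parse_authz_query(soap_xml):
    """Extract the user, URL, and requested actions from a SOAP request."""
    root = ET.fromstring(soap_xml)
    query = root.find("soap:Body/samlp:AuthzDecisionQuery", NS)
    if query is None:
        raise ValueError("no AuthzDecisionQuery in SOAP body")
    name_id = query.find("saml:Subject/saml:NameID", NS)
    if name_id is None or not name_id.text:
        raise ValueError("query has no NameID")
    # The action text may be surrounded by whitespace, as in the sample above.
    actions = [
        (action.get("Namespace"), (action.text or "").strip())
        for action in query.findall("saml:Action", NS)
    ]
    return AuthzQuery(query.get("ID"), name_id.text.strip(),
                      query.get("Resource"), actions)
```

The returned `query_id` is kept so that the responder can echo it back, for example in the optional InResponseTo attribute of the response.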

Here's an example of a possible response from the Policy Decision Point:

HTTP/1.1 200 OK
Content-Type: text/xml
Content-Length: nnn

<soapenv:Envelope
  xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
  <soapenv:Body>
    <samlp:Response
      xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol"
      xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion"
      ID="blahblah"
      Version="2.0"
      IssueInstant="2004-10-08T14:38:05Z">
      <samlp:Status>
        <samlp:StatusCode
          Value="urn:oasis:names:tc:SAML:2.0:status:Success"/>
      </samlp:Status>
      <saml:Assertion
        Version="2.0"
        ID="blahblah2"
        IssueInstant="2004-10-08T14:38:05Z">
        <saml:Issuer>abc.com</saml:Issuer>
        <saml:Subject>
          <saml:NameID>Joe Bob</saml:NameID>
        </saml:Subject>
        <saml:AuthzDecisionStatement
          Resource="http://www.abc.com/secret.html"
          Decision="Permit">
          <saml:Action
            Namespace="urn:oasis:names:tc:SAML:1.0:action:ghpp">
            GET
          </saml:Action>
        </saml:AuthzDecisionStatement>
      </saml:Assertion>
    </samlp:Response>
  </soapenv:Body>
</soapenv:Envelope>
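
A response with the same shape as the example above could be generated along these lines. This is a sketch, not a reference implementation: the function name is hypothetical, the identifiers are random values chosen for illustration, and setting InResponseTo is an assumption based on the optional attribute in the schema rather than a documented requirement of the appliance.

```python
import uuid
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

SOAP = "http://schemas.xmlsoap.org/soap/envelope/"
SAMLP = "urn:oasis:names:tc:SAML:2.0:protocol"
SAML = "urn:oasis:names:tc:SAML:2.0:assertion"
GHPP = "urn:oasis:names:tc:SAML:1.0:action:ghpp"

# Use the same prefixes as the sample messages when serializing.
for prefix, uri in (("soapenv", SOAP), ("samlp", SAMLP), ("saml", SAML)):
    ET.register_namespace(prefix, uri)


def build_authz_response(issuer, user, resource, decision, in_response_to=None):
    """Build a SOAP-wrapped SAML Response carrying one AuthzDecisionStatement."""
    if decision not in ("Permit", "Deny", "Indeterminate"):
        raise ValueError("unknown decision: " + decision)
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    envelope = ET.Element(f"{{{SOAP}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP}}}Body")
    # IDs start with "_" because the schema's ID type may not begin with a digit.
    attrs = {"ID": "_" + uuid.uuid4().hex, "Version": "2.0", "IssueInstant": now}
    if in_response_to:
        attrs["InResponseTo"] = in_response_to
    response = ET.SubElement(body, f"{{{SAMLP}}}Response", attrs)
    status = ET.SubElement(response, f"{{{SAMLP}}}Status")
    ET.SubElement(status, f"{{{SAMLP}}}StatusCode",
                  {"Value": "urn:oasis:names:tc:SAML:2.0:status:Success"})
    assertion = ET.SubElement(response, f"{{{SAML}}}Assertion", {
        "Version": "2.0", "ID": "_" + uuid.uuid4().hex, "IssueInstant": now})
    ET.SubElement(assertion, f"{{{SAML}}}Issuer").text = issuer
    subject = ET.SubElement(assertion, f"{{{SAML}}}Subject")
    ET.SubElement(subject, f"{{{SAML}}}NameID").text = user
    statement = ET.SubElement(assertion, f"{{{SAML}}}AuthzDecisionStatement",
                              {"Resource": resource, "Decision": decision})
    ET.SubElement(statement, f"{{{SAML}}}Action", {"Namespace": GHPP}).text = "GET"
    return ET.tostring(envelope, encoding="unicode")
```

A call such as `build_authz_response("abc.com", "Joe Bob", "http://www.abc.com/secret.html", "Permit")` yields an envelope equivalent to the sample, apart from the generated IDs and timestamps.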
