Google Search Appliance

Indexable File Formats

Google Search Appliance software version 6.0
Posted June 2009
Revised August 2009: Added restriction on indexing encrypted Excel spreadsheets.
Revised September 2009: Clarified indexing of files in XML format.
Revised October 2009: Removed support for .pst files.

This document lists the file formats that the Google Search Appliance can crawl, index, and search.

File Formats

The following tables list word processing, spreadsheet, database, presentation, and other formats that the Google Search Appliance can crawl, index, and search. Please note the following:

  • The Google Search Appliance cannot crawl, index, or search any file formats that are not listed.
  • Text embedded in graphics is not indexed.

    The Google Search Appliance cannot index text contained in graphic file formats, such as JPEG, GIF, or TIFF. When a file in a graphic format is submitted for indexing, text embedded in the graphic is not indexed. However, the file name is indexed. If any metadata is associated with the graphic in HTML meta tags, that metadata is indexed.

  • Encrypted, viewable PDF documents are converted to HTML for indexing, but the cached HTML is not displayed.
  • Encrypted Excel spreadsheets (xls format) cannot be indexed or searched. If the search appliance attempts to crawl and index an encrypted Excel spreadsheet, you see the following error on the Crawl Diagnostics page:

    Crawled with empty body: Conversion error

    To make Excel spreadsheets indexable, disable encryption on the Excel Tools > Options > Security tab and resave any affected spreadsheets.

  • PDF files created by scanning with optical character recognition (OCR) software are supported.
  • Metadata can be fed to the Google Search Appliance from a database and then indexed.
  • Files in XML format are crawled and indexed as plain text. Links are not extracted or followed, and XML tags are converted to their escaped HTML counterparts.
  • The contents of compressed file formats, such as ZIP or tar files, cannot be indexed.
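
As a rough illustration of the XML behavior described in the list above, the following Python sketch shows what "converted to escaped HTML counterparts" means for a small sample. It assumes ordinary HTML entity escaping through the standard library; the function name and sample document are hypothetical, and the search appliance's actual converter may differ in detail.

```python
from html import escape


def xml_as_escaped_text(xml_source: str) -> str:
    """Escape XML markup so it reads as plain text in HTML.

    Illustrative assumption only: '&', '<' and '>' become HTML
    entities. This is not the appliance's own conversion code.
    """
    return escape(xml_source, quote=False)


# Hypothetical sample document. The href is kept as literal text,
# which is consistent with links not being extracted or followed.
sample = '<doc><link href="http://example.com/a">A & B</link></doc>'
print(xml_as_escaped_text(sample))
```

After this kind of escaping the tags are ordinary text, so a search for an element name such as "link" could match the document, while the URL inside the attribute would not be treated as a hyperlink.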

The following table lists supported word processing formats.

Format Extension Versions Supported
Adobe FrameMaker mif Version 6.0
ASCII Text (7 & 8 bit) txt All versions
ANSI Text (7 & 8 bit) ans All versions
DEC WPS Plus dx Versions through 4.0
DEC WPS Plus wpl Versions through 4.1
DisplayWrite 2 & 3 txt All versions
DisplayWrite 4 & 5 doc Versions through Release 2.0
Enable wpf Versions 3.0, 4.0 and 4.5
First Choice pfc Versions through 3.0
Framework   Version 3.0
Hangul   Version 97
HTML html, htm Versions through 3.0 (some limitations)
IBM FFT fft All versions
IBM Revisable Form Text rft All versions
IBM Writing Assistant iwa Version 1.01
JustSystems Ichitaro jaw, jbw Versions 5.0, 6.0, 8.0, 9.0 and 10.0
JustWrite jw Versions through 3.0
Legacy leg Versions through 1.1
Lotus AMI/AMI Professional sam Versions through 3.1
Lotus Manuscript doc Versions through 2.0
Lotus WordPro (Windows only) lwp Versions 96 through Millennium 9.6, text only
Lotus WordPro (Text only on UNIX) lwp Versions 96 through Millennium Edition 9.6, text only
MacWrite II mcw, mw, mwii Version 1.1
MASS11 m11 Versions through 8.0
Microsoft Rich Text Format rtf All versions
Microsoft Word for DOS doc Versions through 6.0
Microsoft Word for Macintosh doc Versions 3.0 – 4.0, 98, 2001
Microsoft Word for Windows doc Versions through 2003; version 2007 with extensions docx and docm
Microsoft WordPad rtf, doc All versions
Microsoft Works for DOS wks, wps Versions through 2.0
Microsoft Works for Macintosh wks, wps Versions through 2.0
Microsoft Works for Windows wks, wpf Versions through 4.0
Microsoft Write wri Versions through 3.0
MultiMate dox Versions through 4.0
Navy DIF dif All versions
Nota Bene nb Version 3.0
Novell Perfect Works   Version 2.0
Novell WordPerfect for DOS   Versions through 6.1
Novell WordPerfect for Mac   Versions 1.02 through 3.0
Novell/Corel WordPerfect for Windows   Versions through 11.0
Office Writer ow4 Versions 4.0 to 6.0
OpenOffice Writer odt Through version 2.0
PC-File Letter ltr Versions through 5.0
PC-File+ Letter ltr Versions through 3.0
PFS:Write pfb Versions A, B, and C
Professional Write for DOS pw Versions through 2.1
Professional Write Plus pw, pwp Version 1.0
Q&A for DOS qa, qw, dtf Version 2.0
Q&A Write for Windows dtf Version 3.0
Samna Word sam, sm Versions through Samna Word IV+
SmartWare II smt Version 1.02
Sprint spr Version 1.0
Text Mail (MIME)   No specific version
Total Word tw Version 1.2
Unicode Text txt All versions
Volkswriter 3 & 4 vw Versions through 1.0
Wang PC iwp Versions through 2.6
WML   All versions
WordMARC wmc Versions through Composer Plus
WordStar 2000 for DOS ws1, ws2, ws3 Versions through 3.0
WordStar for DOS ws Versions through 7.0
WordStar for Windows ws, wst, wsd Version 1.0
XyWrite xy3, xyp, xyw Versions through III Plus

The following table lists supported spreadsheet formats.

Format Extension Versions Supported
Enable 300, wpf, ssf, dbf Versions 3.0, 4.0 and 4.5
First Choice ss, fol Versions through 3.0
Framework fw3 Version 3.0
Lotus 1-2-3 (DOS & Windows) wku, wk1, wk2, wk3, wk4, wk5, wki, wks Versions through 5.0
Lotus 1-2-3 for SmartSuite wku, wk1, wk2, wk3, wk4, wk5, wki, wks Versions 97-Millennium 9.6
Lotus 1-2-3 Charts (DOS & Windows) wku, wk1, wk2, wk3, wk4, wk5, wki, wks Versions through 5.0
Lotus 1-2-3 (OS/2) wku, wk1, wk2 Versions through 2.0
Lotus 1-2-3 Charts (OS/2) wku, wk1, wk2 Versions through 2.0
Lotus Symphony wr1 Versions 1.0, 1.1 and 2.0
Microsoft Excel for Macintosh xls Versions 3.0 through 4.0, 98, 2001
Microsoft Excel for Windows xls, xlw Versions 2.2 through 2003; 2007 with extensions xlsx and xlsm
Microsoft Excel Charts xlc Versions 2.x through 7.0
Microsoft Multiplan col, cod, mod Version 4.0
Microsoft Works for Windows wps, wks Versions through 4.0
Microsoft Works (DOS) wps, wks, wdb, wcm Versions through 2.0
Microsoft Works (Macintosh) wps, wks, wdb, wcm Versions through 2.0
Mosaic Twin wku Version 2.5
Novell Perfect Works   Version 2.0
OpenOffice Calc odc Through version 2.0
PFS:Professional Plan tid Version 1.0
QuattroPro for DOS wkq, wq1 Versions through 5.0
QuattroPro for Windows wb1, wb2, wk3 Versions through 11.0
SmartWare II def, smt Version 1.02
StarOffice Calc for Windows and UNIX text only StarOffice versions 5.2, 6.x, and 7.x, and OpenOffice version 1.1 (text only)
SuperCalc 5 cal Version 4.0
VP Planner 3D np Version 1.0

The following table lists supported database formats.

Format Extension Versions Supported
DBASE dbf Versions through 5.0
DataEase dba, dbm, dql Version 4.x
dBXL dbf Version 1.3
Enable 300, wpf, ssf, dbf Versions 3.0, 4.0 and 4.5
First Choice pfc Versions through 3.0
FoxBase fmt, dbt, fox, inx, dbf Version 2.1
Framework fwk, fw, fw2, fw3 Version 3.0
Microsoft Works (DOS) wdb, wks Versions through 2.0
Microsoft Works (Macintosh) wdb, wks Versions through 2.0
Microsoft Works for Windows wdb, wks, dbf Versions through 4.0
Paradox (DOS) fsl, db, px Versions through 4.0
Paradox (Windows) fsl, db, px Versions through 1.0
Personal R:BASE rbf Version 1.0
R:BASE 5000 rbf, dbf Versions through 3.1
R:BASE System V rbf Version 1.0
Reflex r2d Version 2.0
Q & A qa, qw, dtf Versions through 2.0
SmartWare II db, def, smt Version 1.02

The following table lists supported graphics formats. Note that text that is part of a graphic is not indexed. Only file names and metadata are indexed.

Format Extension Versions Supported
Adobe FrameMaker Graphics fmv Vector/raster through 5.0
Adobe Illustrator File Format ai Versions through 7.0, 9.0
Adobe Photoshop File Format psd Version 4.0
Adobe Portable Document Format pdf Versions 2.1 and 3.0 through 7.0, including Japanese PDF
Ami Draw Format sdw Ami Draw
AutoCAD Interchange and Native Drawing Format dxf, dwg
AutoCAD Drawing Format dwg Versions 2.5-2.6, 9.0-14.0, 2000i and 2002
AutoShade Rendering Format rnd Version 2
Binary Group 3 Fax  All Versions
Bitmap Format bmp, rle, ico, cur, OS/2 dib & warp Windows
CALS Raster Format gp4 Type I and Type II
Corel Clipart Format cmx Versions 5 through 6
Corel Draw cdr Versions 6.0 to 11.0
Corel Draw cdr with tiff header Versions 2.0 through 11.0
Computer Graphics Metafile cgm ANSI, CALS NIST versions 3.0
Encapsulated PostScript eps tiff header only
GEM Paint img No specific version
Graphics Environment Manager gem Bitmap and vector
Graphics Interface Format gif No specific version
Hewlett Packard Graphics Language hpgl Version 2
IBM Graphics Data Format gdf Version 1.0
IBM Picture Interchange Format pif Version 1.0
JFIF (jpeg not in tiff format) jfif All Versions
JPEG jpeg All versions
Kodak Photo CD pcd Version 1.0
Lotus PIC pic All versions
Lotus Snapshot   All versions
Macintosh PICT1 and PICT2 pict Bitmap only
MacPaint pntg No specific version
Micrografx Draw drw Versions through 4.0
Micrografx Designer drw Versions through 3.1
Micrografx Designer dsf Windows 95, Version 6.0
Novell PerfectWorks draw Version 2.0
OS/2 PM Metafile met Version 3.0
Paint Shop Pro 6 psp Version 5.0 - 6.0 Windows only
PC Paintbrush pcx, dcx All versions
Portable Bitmap pbm All versions
Portable Graymap pgm No specific version
Portable Network Graphics png Version 1.0
Portable Pixmap ppm No specific version
PostScript ps Level 2
Progressive JPEG jpeg No specific version
Sun Raster srs No specific version
TIFF tiff Versions through 6
TIFF CCITT Group 3 & 4 tiff Versions through 6
Truevision TGA targa Version 2
Visio (preview)  Version 4
Visio  Versions 5, 2000, and 2002
WBMP  No specific version
Windows Enhanced Metafile emf No specific version
Windows Metafile wmf No specific version
WordPerfect Graphics wpg, wpg2 Versions through 2.0
X-Windows Bitmap xbm x10 compatible
X-Windows Dump xdm x10 compatible
X-Windows Pixmap xpm x10 compatible

The following table lists supported presentation formats.

Format Extension Versions Supported
Corel/Novell Presentations shw Versions through 11.0
Harvard Graphics for DOS hgs, cht, ch3, prs Versions 2.x & 3.x
Harvard Graphics for Windows hgs, cht, ch3, prs Windows versions
Freelance for Windows flw, shw, drw, pre Versions through Millennium 9.6
Freelance for OS/2 flw, shw, drw, pre Versions through 2.0
Microsoft PowerPoint for Windows ppt Versions 3.0 through 2003; 2007 with extensions pptx and pptm
Microsoft PowerPoint for Macintosh ppt Versions 4.0, 98, through 2001
OpenOffice Impress odp Through version 2.0
StarOffice Impress for Windows and UNIX text only StarOffice versions 5.2, 6.x, and 7.x and OpenOffice version 1.1 (text only)

The following table lists other supported formats.

Format Extension Versions Supported
Microsoft Outlook Message msg Text only
vCard   Version 2.1
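
When preparing content for crawling, it can help to pre-screen file names against the tables above. The following Python sketch is one hedged way to do that. It assumes that matching on the file extension is good enough for a first pass (the appliance itself may identify formats differently), and it uses only a small illustrative subset of the listed extensions. The names LISTED_EXTENSIONS, ARCHIVE_EXTENSIONS, and classify are hypothetical.

```python
from pathlib import PurePath

# Illustrative subset of extensions taken from the tables above.
# A real pre-screen would need the complete lists.
LISTED_EXTENSIONS = {
    "doc", "docx", "docm", "rtf", "txt", "html", "htm", "odt",
    "xls", "xlsx", "xlsm", "ppt", "pptx", "pptm", "odp",
    "pdf", "msg",
}

# Compressed formats named in this document as not indexable.
ARCHIVE_EXTENSIONS = {"zip", "tar"}


def classify(filename: str) -> str:
    """Return 'listed', 'archive', or 'unlisted' for a file name.

    Assumption: the extension alone identifies the format.
    """
    ext = PurePath(filename).suffix.lower().lstrip(".")
    if ext in ARCHIVE_EXTENSIONS:
        return "archive"    # contents cannot be indexed
    if ext in LISTED_EXTENSIONS:
        return "listed"     # appears in the tables above
    return "unlisted"       # not in this subset


for name in ["Report.DOCX", "backup.zip", "notes.xyz"]:
    print(name, classify(name))
```

A "listed" result only means the extension appears in a table. Version limits and restrictions such as encrypted Excel spreadsheets still apply.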