To date: given a directory, the tool reads all text from the PDF, CHM, HTML, and plain-text files in that directory and all of its subdirectories. It indexes the text with Lucene and generates multiple types of queries that use the directory structure in an effort to improve precision.
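As a rough illustration of the file-gathering step, here is a minimal sketch using only the Java standard library. It is not the project's actual code: the class name `DocumentCollector`, the extension list (including `htm` and `txt`), and the `directoryTerms` helper are all assumptions made for this example. Text extraction and the Lucene indexing itself are left out. The helper shows just one way directory names could be turned into extra query terms.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper class; names are illustrative, not taken from the project.
class DocumentCollector {
    // Assumed extension list; the project's exact list is not documented.
    static final Set<String> EXTENSIONS = Set.of("pdf", "chm", "html", "htm", "txt");

    // True if the file name ends in one of the assumed extensions (case-insensitive).
    static boolean isIndexable(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return false;
        }
        String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return EXTENSIONS.contains(ext);
    }

    // Walks root and all subdirectories, returning matching regular files in sorted order.
    static List<Path> collect(Path root) {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths.filter(Files::isRegularFile)
                        .filter(p -> isIndexable(p.getFileName().toString()))
                        .sorted()
                        .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Lower-cased directory names between root and the file, usable as extra query terms.
    static List<String> directoryTerms(Path root, Path file) {
        Path parent = root.relativize(file).getParent();
        List<String> terms = new ArrayList<>();
        if (parent != null) {
            for (Path part : parent) {
                terms.add(part.toString().toLowerCase(Locale.ROOT));
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        Path root = Path.of(args.length > 0 ? args[0] : ".");
        for (Path file : collect(root)) {
            System.out.println(file + " -> " + directoryTerms(root, file));
        }
    }
}
```

Running it with a directory argument prints each matching file followed by the directory names that lead to it. Each file's extracted text could then be indexed alongside those names.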
It was an interesting project, but it is no longer maintained as of 3/17/09. If you need any information, please feel free to contact me at matthew.madson@gmail.com.