text-mining


Project to extract text and metadata from various file formats

A Java-based library that extracts text from Microsoft Word for Windows binary documents, including Word 1.0/2.0/4.0/6.0/95/97/2000/XP/2003. It also extracts text from fast-saved files.
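
Because the supported versions span several on-disk layouts, it can help to check what kind of file you have before passing it to an extractor. The following is a minimal sketch, not part of this library: the class name WordFormatSniffer and its methods are hypothetical. It only tests for the 8-byte OLE2 compound file signature used by Word 97 through 2003 binary documents. Older formats may not use this container, so a negative result does not mean the file is not a Word document.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of the text-mining library.
final class WordFormatSniffer {

    // Signature found at the start of every OLE2 compound file.
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // Returns true if the given bytes begin with the OLE2 signature.
    static boolean looksLikeOle2(byte[] header) {
        if (header == null || header.length < OLE2_MAGIC.length) {
            return false;
        }
        byte[] prefix = Arrays.copyOf(header, OLE2_MAGIC.length);
        return Arrays.equals(prefix, OLE2_MAGIC);
    }

    // Reads the first 8 bytes of a file and checks them against the signature.
    static boolean isOle2File(Path file) throws IOException {
        byte[] header = new byte[OLE2_MAGIC.length];
        int total = 0;
        try (InputStream in = Files.newInputStream(file)) {
            // read() may return fewer bytes than requested, so loop until full or EOF.
            while (total < header.length) {
                int n = in.read(header, total, header.length - total);
                if (n < 0) {
                    break;
                }
                total += n;
            }
        }
        return total == header.length && looksLikeOle2(header);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java WordFormatSniffer <file>");
            return;
        }
        Path file = Paths.get(args[0]);
        System.out.println(file + " is OLE2 container: " + isOle2File(file));
    }
}
```

A signature match only shows that the file is an OLE2 container. Other Office formats, such as Excel and PowerPoint binary files, share the same signature, so this is a pre-check and not a substitute for the extractor's own parsing.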

Project Information

Labels: Word, Office, Textextraction