GettingStartedWithBleedingEdge
Getting Started with Subversion Version of OCRopus
Prerequisites

We recommend building on Ubuntu 7.10 or 8.04, since that is what we build and test on. Other recent Linux distributions will probably work. People have also built OCRopus on OS X, with Visual Studio, and on other platforms. We appreciate feedback on how to improve portability, build files, etc., and will try to make such improvements available to others as much as possible. For official support of another platform, we would need volunteers who can perform nightly builds and submit patches as soon as anything breaks.

Tesseract (optional)

Tesseract is a fast and fairly accurate text recognition engine from HP and Google. It is optional, but we recommend that you include it for now if you want to perform OCR. If you mostly need OCRopus for other document analysis tasks, you need not include it.

To install it, check out the current Subversion version of Tesseract from the Tesseract repository:

    svn checkout http://tesseract-ocr.googlecode.com/svn/trunk/ tesseract-ocr-read-only

(If you have trouble with that version, download the latest tarball instead.)

Follow its instructions for building it, and then install it in /usr/local (the default location you get with configure; make; make install).

OpenFST (optional)

OpenFST is used for building statistical language models. It is optional, and you may not need it.

Download the latest OpenFST distribution from http://openfst.org/ and follow its instructions for building the distribution. Note that OpenFST does not use a standard directory structure; you have to cd two levels down. After everything has built, install the files in the right place:

    mkdir -p /usr/local/include/fst/lib
    cp -v fst/lib/*.h /usr/local/include/fst/lib
    cp fst/lib/*.a /usr/local/lib

OCRopus

Check out the current Subversion version of OCRopus from the OCRopus repository:

    svn checkout http://ocropus.googlecode.com/svn/trunk/ ocropus

To build OCRopus, run ./configure, then jam, then jam install.

Unit Tests

After you're done, you should run the unit tests:
Command Line

The OCRopus command line program is called ocroscript, and it is installed in /usr/local/bin. It takes either scripts or subcommands on the command line. Subcommands are scripts that are installed in /usr/local/lib/ocropus/... or somewhere along the path defined by the OCROSCRIPTS environment variable. The ocroscripts/scripts directory contains the available top-level commands.

Here are some examples:

    ocroscript rec-tess file.png
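
When a subcommand does not behave as expected, it can help to see which script file a name would resolve to along OCROSCRIPTS. The following is only a rough sketch of such a check, not how ocroscript itself performs the lookup. It assumes that OCROSCRIPTS is a colon-separated list of directories (like PATH) and that a subcommand script is stored either under its bare name or with a .lua suffix. Both are assumptions, and the helper name find_ocroscript is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the first file along OCROSCRIPTS that matches
# a subcommand name. This is NOT part of OCRopus.
# Assumptions: OCROSCRIPTS is colon-separated (like PATH), and scripts are
# named either "<name>" or "<name>.lua".
find_ocroscript() {
    name=$1
    old_ifs=$IFS
    IFS=:                       # split OCROSCRIPTS on colons
    for dir in $OCROSCRIPTS; do
        [ -n "$dir" ] || continue   # skip empty path entries
        for candidate in "$dir/$name" "$dir/$name.lua"; do
            if [ -f "$candidate" ]; then
                IFS=$old_ifs
                printf '%s\n' "$candidate"
                return 0
            fi
        done
    done
    IFS=$old_ifs
    return 1                    # nothing found along OCROSCRIPTS
}
```

For example, after setting OCROSCRIPTS to one or more directories of your own, running find_ocroscript rec-tess prints the first matching file, or returns a non-zero status if none of those directories contains one. The sketch deliberately searches only the directories named in OCROSCRIPTS and does not look at the default install location.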
