ocrfeeder

A complete document layout analysis and OCR system for GNU/Linux

Please note

OCRFeeder's official web page is: http://live.gnome.org/OCRFeeder

All news will be posted there from now on; the project page on Google Code is deprecated.

OCRFeeder

OCRFeeder is a document layout analysis and optical character recognition system.

Given a set of images, it automatically outlines their contents, distinguishes graphics from text, and performs OCR on the text. It can export to multiple formats, the main one being ODT.
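The outline-then-classify flow described above can be illustrated with a toy sketch. This is not OCRFeeder's code or API: the function names, the blank-row splitting and the height-based heuristic are all assumptions made up for illustration, and the OCR step itself is left out. A page is modelled as a list of rows, where 1 marks an inked pixel and 0 marks background.

```python
# Hypothetical sketch only: none of these names come from OCRFeeder.

def split_into_blocks(image):
    """Outline the content: group consecutive inked rows into (top, bottom) blocks."""
    blocks = []
    start = None
    for y, row in enumerate(image):
        has_ink = any(row)
        if has_ink and start is None:
            start = y                      # a new block begins on this row
        elif not has_ink and start is not None:
            blocks.append((start, y - 1))  # a blank row closes the open block
            start = None
    if start is not None:                  # block that runs to the bottom edge
        blocks.append((start, len(image) - 1))
    return blocks


def classify_block(block, max_text_height=3):
    """Assumed heuristic: short blocks are text lines, taller ones are graphics."""
    top, bottom = block
    height = bottom - top + 1
    return "text" if height <= max_text_height else "graphics"


def analyze(image):
    """Return each block with its label; only 'text' blocks would go on to OCR."""
    return [(block, classify_block(block)) for block in split_into_blocks(image)]


page = [
    [0, 0, 0, 0],
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [1, 1, 1, 1],
    [1, 1, 1, 1],
    [1, 1, 1, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
result = analyze(page)
```

In this toy page the two-row block is labelled as text and the five-row block as graphics. A real layout analysis has to handle columns, skew and noise, which this sketch ignores.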

It features a complete GTK graphical user interface that allows users to correct any unrecognized characters, define or correct bounding boxes, set paragraph styles, clean the input images, import PDFs, save and load the project, export everything to multiple formats, etc. OCRFeeder was developed as Joaquim Rocha's Master's thesis project in Computer Science.

Check the program in action here:

http://www.vimeo.com/3760126

NEWS

2009/11/06: OCRFeeder v0.4 released
2009/10/16: OCRFeeder v0.3 released
2009/10/05: OCRFeeder development moved to Gitorious
2009/05/10: First tarball version released
2009/03/18: OCRFeeder released to the general public (check out the SVN repository)

After a long wait, the initial commit has finally been made to the public SVN repository.

Project Information

License: GNU GPL v3
42 stars
svn-based source control

Labels:
ocr layoutanalysis imageprocessing python gtk odt document recognition opticalcharacterrecognition linux gnome