ReleaseNotes
Release Notes.
Introduction
This page keeps the most up-to-date release notes.

Tesseract release notes June 30 2009 - V2.04

Tesseract release notes April 22 2008 - V2.03
2.02 was unrunnable, due to a last-minute "simple" change. 2.03 fixes the problem and also adds an include check for leptonica to make it more usable.

Tesseract release notes April 21 2008 - V2.02

Tesseract release notes Aug 30, 2007 - V2.01.
(See also release notes for 2.00 below for usage information)
No major functionality change. Just a bunch of bug fixes.
No new data files for the original 6 languages. Use the files from v2.00. There are new data files for German Fraktur (deu-f) and Brazilian Portuguese (por).

STOP PRESS
There is a minor bug in unicharset_extractor. Since this only affects training, the main tarball is fine unless you need to run training, in which case, overwrite your unicharset_extractor.cpp and unicharset_extractor.exe with the ones in tesseract-2.01.patch1.tar.gz.

Tesseract release notes Jul 18, 2007 - V2.00.
(See also release notes for 1.04 below for additional usage information)
First release of the International version. This version recognizes the following languages:
tesseract inputimage outputbase -l langcode
To train on a new language, see TrainingTesseract. More languages will be appearing over time.
List of changes in this release:
xx.00 Version Warning
Tesseract 2.00 has undergone more compatibility testing than any previous version. There have even been fixes to make the accuracy more consistent across platforms. Having said that, there have been many changes to the code, and portability may have been broken, so 64 bit and Mac platforms may not work or even build as well as before.

Tesseract release notes May 15, 2007 - V1.04.

Windows users only
Added a dll interface for windows. Thanks to Glen at Jetsoft for contributing this. To use the dll, include tessdll.h, import tessdll.lib and put tessdll.dll somewhere where the system can find it. There is also a small dlltest program to test the dll. Run with:
dlltest phototest.tif phototest.txt
It will output the text from phototest.tif with bounding box information.

New for Windows
The distribution now includes tesseract.exe and tessdll.dll which might work out of the box! There are no guarantees as you need VC++6 versions of mfc and crt (at least) for it to work. (Batteries not included, and certainly no installshield.)

Important note for anyone building with make: i.e. anyone except devstudio users
This release includes new standardization for the data directory. To enable Tesseract to find its data files, you must either:
./configure
make
make install
to move the data files to the standard place, or:
export TESSDATA_PREFIX="directory in which your tessdata resides/"
(or equivalent) in your .profile or whatever or setenv to set the environment variable. Note that the directory must end in a /
HAVING tesseract and tessdata IN THE SAME DIRECTORY DOES NOT WORK ANY MORE.

All users
Fixed a bunch of name collisions - mostly with stl. Made some preliminary changes for unicode compatibility. Includes a new data file (unicharset) and renaming of the other data files to eng. to support different languages. There are also several other minor bug fixes and portability improvements for 64 bit, the latest visual studio compiler etc.
Thanks to all who have contributed these fixes. NOTE: This is likely to be the last English-only release! Apologies in advance to non-windows users for bloating the distribution with windows executables. This will probably get fixed in the next release with the multi-language capability, since that will also bloat the distribution.
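
As a rough illustration of the data-directory rule in the V1.04 notes and the command line in the V2.00 notes, the Python sketch below checks TESSDATA_PREFIX for the required trailing / and assembles the arguments for tesseract inputimage outputbase -l langcode. The function names, the status strings and the "phototest" output base are assumptions made for this example; they are not part of Tesseract.

```python
import os


def check_tessdata_prefix(env):
    """Classify TESSDATA_PREFIX in the given environment mapping.

    Returns one of three illustrative status strings (hypothetical, not
    produced by Tesseract itself):
      "unset"                  - variable absent; only fine after `make install`
      "missing trailing slash" - variable set, but does not end in a /
      "ok"                     - variable set and ends in a /
    """
    prefix = env.get("TESSDATA_PREFIX")
    if not prefix:
        return "unset"
    if not prefix.endswith("/"):
        return "missing trailing slash"
    return "ok"


def build_command(input_image, output_base, lang_code=None):
    """Build the argument list for: tesseract inputimage outputbase -l langcode."""
    command = ["tesseract", input_image, output_base]
    if lang_code is not None:
        # The -l option selects the language data files, e.g. eng.
        command += ["-l", lang_code]
    return command


if __name__ == "__main__":
    print("TESSDATA_PREFIX status:", check_tessdata_prefix(os.environ))
    print(" ".join(build_command("phototest.tif", "phototest", "eng")))
```

The resulting list could be handed to subprocess.run once the status check passes; the sketch stops short of that so it can be tried without Tesseract installed.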
Shouldn't the visual c++ express version be usable with just the vcredist_x86.exe redistributable rather than requiring users to install vc++ express and the platform sdk?
OK well I had some comments about how to improve the DLL api, but it looks like that just can't be done, since Tesseract code supports only one engine intrinsically!
I'm working now on a DLL API callable from VB or .Net.
The .exe in Windows doesn't find a file: Unable to load unicharset file D:/OCR/tesseract-2.03/tessdata/eng.unicharset
Thank you
Why is the Chinese language not available?