| Projects on Google Code | Results 1 - 5 of 5 |
Since Sep 8, 2008 / Last update: Dec. 16, 2009
= Introduction =
NHocr is a command-line OCR (Optical Character Recognition) program for the Japanese language, etc. It is designed to recognize machine-printed Japanese characters and some ASCII characters and symbols in an image.
NHocr is probably ...
This project aims to develop high-quality data files for Polish language support in [http://code.google.com/p/tesseract-ocr/ Tesseract OCR].
Included are the sources for sample documents, utilities to process and prepare dictionary data for compilation into DAWG format, etc.
If you simply want...
= About =
OCRopus(tm) is a state-of-the-art document analysis and OCR system, featuring pluggable layout analysis, pluggable character recognition, statistical natural language modeling, and multi-lingual capabilities.
The OCRopus engine is based on two research projects: a high-performance ha...
ocr, imageprocessing, layoutanalysis, document, opticalcharacterrecognition, characterrecognition, neuralnetwork, handwritingrecognition, handwriting, machinelearning, CPlusPlus
The aim is to recognize one character of the seven-segment display character set at a time. More information about seven-segment display character representations is available at http://en.wikipedia.org/wiki/Seven-segment_display_character_representations.
The developed algorithm ca...
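As a rough illustration of recognizing one seven-segment character at a time, the sketch below maps a set of lit segments to a digit. It is not the project's algorithm, which the truncated description does not spell out. It assumes that an earlier image-processing step has already decided which of the conventionally labeled segments a to g are lit; the names `SEGMENTS_TO_DIGIT` and `decode_segments` are hypothetical.

```python
# Minimal sketch (assumption: segment detection has already been done elsewhere).
# Segments use the conventional labels: a = top, b = upper right, c = lower right,
# d = bottom, e = lower left, f = upper left, g = middle.
SEGMENTS_TO_DIGIT = {
    frozenset("abcdef"): "0",
    frozenset("bc"): "1",
    frozenset("abdeg"): "2",
    frozenset("abcdg"): "3",
    frozenset("bcfg"): "4",
    frozenset("acdfg"): "5",
    frozenset("acdefg"): "6",
    frozenset("abc"): "7",
    frozenset("abcdefg"): "8",
    frozenset("abcdfg"): "9",
}


def decode_segments(lit):
    """Return the digit for an iterable of lit segment labels, or None if unknown."""
    return SEGMENTS_TO_DIGIT.get(frozenset(lit))
```

For example, `decode_segments("bc")` yields `"1"`, while a pattern that is not in the table, such as a lone middle segment, yields `None`. Only the ten digits are covered here; the full character set described on the Wikipedia page would need a larger table.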
You can download the HIT-MW database freely here.