Issue 21: Characters touching table lines are not recognized
What steps will reproduce the problem?
1. Run ./ocropus layout prefix test.png

It should generate the sub-images for all the text in the image. Instead, it generates sub-images only for the text that does not touch the table lines.

I am using the code at revision 78 from SVN, running on Linux (Fedora Core). Please find attached the test image file (test.png) I used.
May 26, 2007
Do not expect a resolution of this issue. I was running into the same problem with "streaky" faxes (dirt on the CCD glass gives vertical streaks for the length of the page), and *you* will need to deal with the lines (for the next 5-10 years, anyway :)

While my hacks did not result in complete satisfaction, they did prevent tess from burning 100% CPU for 10 minutes on each such image before giving up. However, in the end, you are *still* guessing what to put in when you take the lines out.

Disclosure: I know very little about OCR/image processing theory.
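As a rough illustration of what "taking the lines out" can look like, here is a minimal sketch in plain Python. It is not OCRopus or Tesseract code: the function name `remove_vertical_streaks`, the `min_run` threshold, and the list-of-rows image format are all assumptions made for the example.

```python
def remove_vertical_streaks(image, min_run):
    """Return a copy of a binary image (list of rows, 1 = ink, 0 = paper)
    with every vertical run of ink at least `min_run` pixels long cleared.

    Hypothetical helper for illustration only; `min_run` would have to be
    tuned so that ordinary character strokes are shorter than it.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    cleaned = [row[:] for row in image]  # leave the input untouched

    for x in range(width):
        y = 0
        while y < height:
            if image[y][x] == 1:
                # Measure the vertical run of ink starting at this pixel.
                start = y
                while y < height and image[y][x] == 1:
                    y += 1
                # Long runs are treated as streaks or table lines and erased.
                if y - start >= min_run:
                    for yy in range(start, y):
                        cleaned[yy][x] = 0
            else:
                y += 1
    return cleaned
```

Note that any pixel where a character stroke crosses the streak is erased along with it, so the character is left with a gap. That is the guessing problem: something has to decide what belongs in the hole.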
Aug 07, 2007
Well, this problem is fixable in principle, but it will require a significant amount of work. I hope we'll be able to address it around 1.0 as part of improved line recognizers.
Labels: Milestone-Release1.0
Jan 12, 2009
(No comment was entered for this change.)
Status: Accepted
Labels: -Type-Defect Type-Enhancement
Jan 12, 2009
(No comment was entered for this change.)
Owner: faisalshafait
Jun 14, 2009
(No comment was entered for this change.)
Labels: SampleImage
Aug 29, 2009
This is the best program that I have found; however, its batch processing is poor. I have 20,000 pages to process, and it would take a lifetime to clean up the files manually.

ClearImage Image Processing Engines
http://www.inliteresearch.com/homepage/products/ci_image_processing.html

I have archived a Hollywood Sound Effects library catalog and would like to post the archives on the web for the sound community; however, the vertical streaks make the OCR somewhat useless. Hopefully OCRopus can incorporate some ideas from ClearImage.

Rob Nokes
www.Sounddogs.com
Aug 30, 2009
OCRopus already has many of the cleanup mechanisms in ClearImage. However, they are not automatically applied yet. The beta release will have more information about how to use these.