issue 2
(PyXML code broken) commented on by jrgarrison
- One possibility is to use lxml instead for parsing.
Attached is a modified version of hocr-combine which uses lxml.
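The attached script is not reproduced in this feed, so the following is only a minimal sketch of what parsing hOCR with lxml might look like. The helper name extract_pages and the sample markup are hypothetical; only lxml.html.fromstring and xpath are real lxml calls.

```python
from lxml import html


def extract_pages(hocr_text):
    """Return the elements marked class='ocr_page' in an hOCR string.

    Hypothetical helper for illustration; not taken from hocr-combine.
    """
    doc = html.fromstring(hocr_text)
    # hOCR marks each page with class="ocr_page"
    return doc.xpath("//*[@class='ocr_page']")


# Made-up sample input with two pages
sample = """<html><body>
<div class='ocr_page' title='bbox 0 0 100 100'><span class='ocr_line'>hello</span></div>
<div class='ocr_page' title='bbox 0 0 100 100'><span class='ocr_line'>world</span></div>
</body></html>"""

pages = extract_pages(sample)
print(len(pages))
```

lxml's HTML parser tolerates malformed markup, which is one reason it could stand in for the PyXML-based parser.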
r22
(Updated hocr-eval script to use Beautiful Soup HTML parser) committed by faisalshafait
- Updated hocr-eval script to use Beautiful Soup HTML parser
Aug 15, 2008
r21
(Updated hocr-eval script to be very strict with layout error...) committed by faisalshafait
- Updated hocr-eval script to be very strict with layout errors
Jul 25, 2008
issue 2
(PyXML code broken) reported by tmbdev
- Debian and Ubuntu have removed PyXML support, which breaks the HTML parser
on many of these tools.
For now, adding "/usr/share/pyshared/oldxml/" to the Python path works as a
workaround, but we need to find a permanent solution.
Jul 25, 2008
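The workaround mentioned in the report can be sketched as follows. The directory is the one quoted in the issue; the helper name add_oldxml_path is hypothetical, and whether the directory exists depends on the distribution and installed packages.

```python
import sys

# Directory quoted in the issue report; it may not exist on every system.
OLDXML_DIR = "/usr/share/pyshared/oldxml/"


def add_oldxml_path(path=OLDXML_DIR):
    """Append the legacy PyXML directory to sys.path if it is not there yet.

    Hypothetical helper for illustration. Returns True once the path is
    present in sys.path.
    """
    if path not in sys.path:
        sys.path.append(path)
    return path in sys.path


add_oldxml_path()
```

Setting the PYTHONPATH environment variable to the same directory before running the tools should have the same effect without touching the scripts.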
issue 1
(hOCR tools name confusable with existing project.) Status changed by tmbdev
- Thanks, but there's really nothing we can do about it at this point. The name "hOCR"
is the natural name for an OCR microformat.
Status: WontFix