Source path: svn/trunk/src/com/flaptor/util/NgramJLanguageIdentifier.java
Change log
Added cngram-trunk.jar, a language identifier better than Nutch's. Modified LangUtils to use cngram. Added DocumentParserTest, which checks that return values are null when a document is not parseable. Added NgramJLanguageIdentifier, which uses cngram to perform identification.