This is a naive Bayesian text classifier that, given a bit of text, can tell you the posterior probability (returned as a log likelihood) that it comes from each of the standard scientific article sections (Introduction, Methods, Results, Discussion). What do we mean by a bit of text? Anything you want, really: from a few words to a whole article, you decide on the boundaries.

You can use it to:
This project contains:
Test the classifier
You can test the classifier through your browser here.

Download the classifier
Downloads are all listed here, or you can see the featured downloads on the right-hand side of this page.

SOAP/WSDL Web Service
The WSDL document for the web service is here, and you can test it directly through your browser here. You can also download local clients in Java, Perl and Ruby for accessing the web service.

Using the classifier in Java
First, you must include the ArticleSectionClassifer.jar file in your classpath. Second, it is best to increase the memory available to the Java VM using the -Xmx argument; I usually suggest -Xmx256m. Then you can use the classifier in a few lines of code.

    String textToClassify = "classify this text";
    ArticleSectionClassifier classifier = new ArticleSectionClassifier();
    // Returns the classification as a String
    String classifiedAs = classifier.classifyText(textToClassify);
    System.out.println(classifiedAs);

If you want more detailed output, try:

    String textToClassify = "classify this text";
    // Wrap the text in a ClassificationInput to get a ClassificationResult back
    ClassificationInput input = new ClassificationInput(textToClassify);
    ArticleSectionClassifier classifier = new ArticleSectionClassifier();
    ClassificationResult result = classifier.classifyText(input);
    System.out.println(result);

Source code
Details on getting the source code for this project can be found by clicking on the 'Source' tab above or by clicking here. You can also download an archive of the source from here. The source is uploaded as part of a NetBeans project, which can be opened directly in NetBeans or imported into Eclipse or most other IDEs.

Problems
If something isn't working, please post an issue. I will be notified by email and I'll sort it out as soon as I can. It's very likely that something won't work, but it's also likely that I've encountered it before and will know how to fix it, so if you have a problem, let me know.
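
Working with log likelihood scores
The description at the top notes that the classifier reports its posterior as a log likelihood. Raw log scores can be hard to compare by eye, so here is a minimal sketch of turning one score per section into probabilities that sum to 1. It assumes the scores are natural logs and that you have already collected them into a map. LogScoreNormalizer and normalize are hypothetical names, not part of ArticleSectionClassifier, and the scores in main are made up, since how individual scores are read out of a ClassificationResult is not shown on this page.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of the classifier jar.
final class LogScoreNormalizer {

    // Converts per-section log scores into probabilities that sum to 1.
    // Subtracting the maximum score first (the log-sum-exp trick) keeps
    // Math.exp from underflowing to zero on very negative scores.
    static Map<String, Double> normalize(Map<String, Double> logScores) {
        if (logScores.isEmpty()) {
            throw new IllegalArgumentException("no scores to normalize");
        }
        double max = Double.NEGATIVE_INFINITY;
        for (double score : logScores.values()) {
            max = Math.max(max, score);
        }
        if (Double.isInfinite(max)) {
            throw new IllegalArgumentException("no finite maximum score");
        }
        double sum = 0.0;
        for (double score : logScores.values()) {
            sum += Math.exp(score - max);
        }
        Map<String, Double> probabilities = new LinkedHashMap<>();
        for (Map.Entry<String, Double> entry : logScores.entrySet()) {
            probabilities.put(entry.getKey(), Math.exp(entry.getValue() - max) / sum);
        }
        return probabilities;
    }

    public static void main(String[] args) {
        // Made-up scores, assumed to be natural logs, one per section.
        Map<String, Double> logScores = new LinkedHashMap<>();
        logScores.put("Introduction", -1052.3);
        logScores.put("Methods", -1047.9);
        logScores.put("Results", -1049.1);
        logScores.put("Discussion", -1055.0);

        normalize(logScores).forEach(
            (section, p) -> System.out.printf("%-12s %.4f%n", section, p));
    }
}
```

If the scores are already normalized log posteriors, this leaves the probabilities unchanged; if they are unnormalized, it rescales them so the four sections can be compared directly.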