Resources
Links to resources for automatic topic indexing
Additional resources for automatic keyphrase extraction and term assignment
Apart from the three collections listed in MultiplyIndexedData, Maui was also tested on several term assignment collections, in which each document has a single set of manually assigned topics.

FAO-780 data set for term assignment
This data set contains 780 documents with terms from Agrovoc assigned by professionals. When using this data set, please cite Medelyan (2009) or Medelyan and Witten (2008); see Publications.

Other data sets on the web
The NUS Keyphrase Corpus can be used for training and testing keyphrase extraction and tagging. To use it with Maui, the data first needs to be converted into the required format.
SemEval-2010 Keyphrase Extraction will soon publish their data for the participants of the shared task. This data set will be similar to the NUS Keyphrase Corpus.
See also: MultiplyIndexedData

Vocabularies
For term assignment, a number of vocabularies in SKOS format are available on the web.
Please note: Maui user Amrita recommends using HIVE to convert the original MeSH vocabulary into SKOS.

Competing systems and demos
The topic indexing blog provides a list of tools for automatic keyphrase extraction, terminology extraction, tagging and other tasks.
It looks like the LoC LCSH page has been moved; http://id.loc.gov/authorities/subjects.html is the current page for the resource.
The W3C’s list of SKOS thesauri links to a page that 'is no longer in use, see the new SKOS wiki'.