Resources  
Links to resources for automatic topic indexing
Updated May 16, 2011 by medel...@gmail.com

Additional resources for automatic keyphrase extraction and term assignment

Apart from the three collections listed on the MultiplyIndexedData page, Maui was also tested on several term assignment collections, in which each document has a single set of manually assigned topics.

FAO-780 data set for term assignment

This data set contains 780 documents with terms from Agrovoc assigned by professionals. When using this data set, please cite Medelyan (2009) or Medelyan and Witten (2008) (see Publications).

Other data sets on the web

The NUS Keyphrase Corpus can be used for training and testing keyphrase extraction and tagging. To use it with Maui, the data first needs to be converted into the required format.
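
The conversion step could look roughly like the sketch below. It rests on assumptions not stated on this page: that the target layout is a flat directory holding one NAME.txt file per document plus a NAME.key file listing its topics one per line, and that the source corpus stores each document's keyphrases next to its text file under the same base name with a different extension. The function name `convert_corpus` and the default `.kwd` extension are hypothetical; adjust both to the actual layout of the corpus you downloaded.

```python
from pathlib import Path


def convert_corpus(src_dir, out_dir, keyphrase_ext=".kwd"):
    """Copy text/keyphrase pairs into a flat directory as NAME.txt / NAME.key.

    Assumption: every document is a *.txt file somewhere under src_dir, with
    its keyphrases in a sibling file that shares the base name and ends in
    keyphrase_ext (hypothetical default), one keyphrase per line.
    """
    src, out = Path(src_dir), Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    converted = []
    # sorted() materializes the file list before anything is written.
    for text_path in sorted(src.rglob("*.txt")):
        key_path = text_path.with_suffix(keyphrase_ext)
        if not key_path.exists():
            continue  # skip documents without manually assigned keyphrases
        # Keep one keyphrase per line; drop blank lines and duplicates.
        phrases = []
        for line in key_path.read_text(encoding="utf-8").splitlines():
            phrase = line.strip()
            if phrase and phrase not in phrases:
                phrases.append(phrase)
        name = text_path.stem
        text = text_path.read_text(encoding="utf-8")
        (out / f"{name}.txt").write_text(text, encoding="utf-8")
        (out / f"{name}.key").write_text("\n".join(phrases) + "\n", encoding="utf-8")
        converted.append(name)
    return converted
```

Keep the output directory outside the source directory, so that a second run does not pick up the converted files as input.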

SemEval-2010 Keyphrase Extraction will soon publish their data for the participants of the shared task. This data set will be similar to the NUS Keyphrase corpus.

See also: MultiplyIndexedData

Vocabularies

For term assignment, a number of vocabularies in SKOS format are available on the web:

Please note: Maui user Amrita recommends using HIVE to convert the original MeSH vocabulary into SKOS.

Competing systems and demos

The topic indexing blog provides a list of tools for automatic keyphrase extraction, terminology extraction, tagging and other tasks.

Comment by gim...@gmail.com, Nov 26, 2011

It looks like the LoC LCSH page has been moved; http://id.loc.gov/authorities/subjects.html is a current page for the resource.

Comment by e...@land.aau.dk, Mar 7, 2012

The W3C’s list of SKOS thesauri links to a page, which 'is no longer in use, see the new SKOS wiki'

