Maui automatically identifies main topics in text documents. Depending on the task, topics are tags, keywords, keyphrases, vocabulary terms, descriptors, index terms or titles of Wikipedia articles.
Beyond identifying topics, Maui can also be used for terminology extraction and semi-automatic topic indexing.
You can try out the live Maui demo by pasting in a piece of text of your choice or by uploading a document in Word or PDF format.
If you would like to use Maui commercially and need help optimising its performance, you can contact me via my consultancy Entopix.
Maui has been successfully tested on documents in computer science, agriculture, medicine, physics, biology and bioinformatics, as well as on blog posts and news articles.
It supplies stemmers and stopwords for English, French and Spanish, but can be extended to work in many other languages, including languages that require special encoding.
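To give a rough idea of what stemmers and stopwords contribute to topic extraction, here is a minimal Python sketch. It is not Maui's code or API: the stopword list, the `toy_stem` function and the `candidate_phrases` function are all hypothetical names made up for illustration, and the rule that a candidate phrase may not start or end with a stopword is an assumption of this sketch. Supporting another language would, in this simplified picture, amount to swapping in a different stopword list and stemmer.

```python
import re

# Hypothetical, tiny stopword list for illustration only; not Maui's list.
STOPWORDS = {"a", "an", "and", "of", "the", "in", "on", "for", "to", "is", "are"}


def toy_stem(word: str) -> str:
    """Naive suffix stripper standing in for a real stemmer (an assumption)."""
    for suffix in ("ing", "es", "s"):
        # Only strip if at least three characters of the word remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def candidate_phrases(text: str, max_len: int = 3) -> dict[str, int]:
    """Count stemmed n-grams that neither start nor end with a stopword."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts: dict[str, int] = {}
    for start in range(len(tokens)):
        for length in range(1, max_len + 1):
            gram = tokens[start : start + length]
            if len(gram) < length:
                break  # ran past the end of the text
            if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                continue
            # Stemming merges surface variants such as "documents"/"document".
            key = " ".join(toy_stem(t) for t in gram)
            counts[key] = counts.get(key, 0) + 1
    return counts


if __name__ == "__main__":
    sample = "Indexing of documents and document indexing"
    for phrase, freq in sorted(candidate_phrases(sample).items()):
        print(freq, phrase)
```

In this toy version, "documents" and "document" are counted as the same candidate, and phrases such as "of documents" are never generated because they begin with a stopword.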
Examples are provided in Maui's wiki pages.
Maui was developed by Alyona Medelyan as part of her PhD project, under the supervision of Ian H. Witten and Eibe Frank in the Department of Computer Science at the University of Waikato, New Zealand. The PhD was sponsored by a research grant from Google.
Maui builds on the keyphrase extraction algorithm Kea but provides additional functionality: it can assign topics to documents based on terms from Wikipedia, using Wikipedia Miner. Maui also has many new features that help identify topics more accurately.