Overview

This project provides simple implementations of several algorithms for chunking/segmenting sequences of symbols. The Voting Experts (VE) algorithm is included, as well as algorithms derived from VE such as Bootstrap Voting Experts (BVE) and Voting Experts - Minimum Description Length (VE-MDL). Voting Experts greedily searches for sequences that match an information-theoretic signature: low entropy internally and high entropy at the boundaries. For an up-to-date summary of many VE results and an analysis of VE's chunk signature, see (1).
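The chunk signature can be illustrated with a small Python sketch. It is not taken from this project's Java code. It assumes that internal entropy is measured as the surprisal -log2 p(w) of an n-gram w among n-grams of the same length, and that boundary entropy is the entropy of the symbol that follows w. The function name `chunk_signature` and the toy corpus are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter, defaultdict
from itertools import permutations


def chunk_signature(text, n):
    """Return (internal, boundary) dicts keyed by each n-gram of length n.

    internal[g]: surprisal -log2 p(g) among n-grams of length n (assumed measure).
    boundary[g]: entropy in bits of the symbol that follows g in text.
    """
    counts = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    total = sum(counts.values())
    internal = {g: math.log2(total / c) for g, c in counts.items()}

    # Collect, for each n-gram, the distribution of the next symbol.
    followers = defaultdict(Counter)
    for i in range(len(text) - n):
        followers[text[i:i + n]][text[i + n]] += 1

    boundary = {}
    for g, nxt in followers.items():
        m = sum(nxt.values())
        boundary[g] = sum(c / m * math.log2(m / c) for c in nxt.values())
    return internal, boundary


# Toy corpus: every ordering of four words, written without spaces.
vocab = ["the", "dog", "cat", "ran"]
text = "".join(w for order in permutations(vocab) for w in order)

internal, boundary = chunk_signature(text, 3)
for gram in ("dog", "ogc"):
    print(gram, round(internal[gram], 2), round(boundary[gram], 2))
```

In this toy corpus the true word "dog" is more frequent (lower internal entropy) and less predictable at its right edge (higher boundary entropy) than "ogc", a trigram that straddles a word boundary.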

Voting Experts was originally designed by Paul Cohen and Niall Adams in 2001. The current Java implementation was developed by Daniel Hewlett with contributions from Nik Sharp, and is based in part on earlier development by Wesley Kerr. Funding for Voting Experts research has been provided by the National Science Foundation (NSF) and the Defense Advanced Research Projects Agency (DARPA).

Algorithms Implemented

Voting Experts

  • Voting Experts (VE) - The original algorithm, see (7, 9).
  • Bootstrap Voting Experts (BVE) - VE plus the Knowledge Expert, see (4).
  • VE-MDL - VE with automatic parameter setting through MDL, see (2).
  • BVE-MDL - BVE with automatic parameter setting through MDL.

Other Algorithms

  • Model-Based Dynamic Programming 1 (MBDP-1) - Incremental segmentation algorithm of Brent (1999).
  • Phoneme to Morpheme (PtM) - Segmentation algorithm based on changes in boundary entropy (Tanaka-Ishii and Jin, 2006).
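
As a rough illustration of segmenting on changes in boundary entropy, the Python sketch below cuts the input wherever the entropy of the next symbol rises sharply from one position to the next. It is a simplified stand-in, not a reimplementation of PtM or of this project's Java code. The fixed context length `n`, the threshold `min_rise`, and the function names are assumptions made for the example.

```python
import math
from collections import Counter, defaultdict
from itertools import permutations


def successor_entropy(text, n):
    """Entropy in bits of the next symbol after each n-gram of length n."""
    followers = defaultdict(Counter)
    for i in range(len(text) - n):
        followers[text[i:i + n]][text[i + n]] += 1
    result = {}
    for gram, nxt in followers.items():
        m = sum(nxt.values())
        result[gram] = sum(c / m * math.log2(m / c) for c in nxt.values())
    return result


def segment_on_entropy_rise(text, n=2, min_rise=0.5):
    """Cut before position j when the entropy after text[j-n:j] exceeds
    the entropy after the previous context by more than min_rise bits."""
    h = successor_entropy(text, n)
    cuts = []
    prev = None
    for j in range(n, len(text)):
        cur = h.get(text[j - n:j], 0.0)
        if prev is not None and cur - prev > min_rise:
            cuts.append(j)
        prev = cur

    # Turn cut positions into segments.
    pieces, start = [], 0
    for c in cuts:
        pieces.append(text[start:c])
        start = c
    pieces.append(text[start:])
    return pieces


# Toy corpus: every ordering of four words, written without spaces.
vocab = ["the", "dog", "cat", "ran"]
words = [w for order in permutations(vocab) for w in order]
text = "".join(words)

pieces = segment_on_entropy_rise(text)
print(pieces[:8])
```

On this artificial corpus the entropy is zero inside words and jumps at each word end, so the cuts fall on the word boundaries. Real text is far noisier, and the threshold and context length would need tuning.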

Related Publications

  1. Daniel Hewlett and Paul Cohen. Fully Unsupervised Word Segmentation with BVE and MDL. Proceedings of The 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies (ACL-2011). 2011.
  2. Daniel Hewlett and Paul Cohen. Word Segmentation as General Chunking. Proceedings of the Fifteenth Conference on Computational Natural Language Learning (CoNLL-2011). 2011.
  3. Daniel Hewlett and Paul Cohen. Artificial General Segmentation. Proceedings of The Third Conference on Artificial General Intelligence (AGI-10). 2010.
  4. Daniel Hewlett and Paul Cohen. Bootstrap Voting Experts. Proceedings of the Twenty-first International Joint Conference on Artificial Intelligence (IJCAI). 2009.
  5. Matthew Miller, Peter Wong, and Alexander Stoytchev. Unsupervised Segmentation of Audio Speech Using the Voting Experts Algorithm. Proceedings of the Second Conference on Artificial General Intelligence (AGI). 2009.
  6. Matthew Miller and Alexander Stoytchev. Hierarchical Voting Experts: An Unsupervised Algorithm for Hierarchical Sequence Segmentation. Proceedings of the 7th IEEE International Conference on Development and Learning (ICDL). 2008. (Best Paper Award, ICDL 2008)
  7. Paul R. Cohen, Niall Adams, Brent Heeringa. Voting Experts: An Unsupervised Algorithm for Segmenting Sequences. To appear in Journal of Intelligent Data Analysis. 2007.
  8. Jimming Cheng and Michael Mitzenmacher. Markov Experts. Proceedings of the Data Compression Conference (DCC). 2005.
  9. Paul R. Cohen and Niall Adams. An Algorithm for Segmenting Categorical Time Series into Meaningful Episodes. Proceedings of the Fourth Symposium on Intelligent Data Analysis, Lecture Notes in Computer Science. 2001.