The MIT Language Modeling (MITLM) toolkit is a set of tools designed for the efficient estimation of statistical n-gram language models involving iterative parameter estimation. It achieves much of its efficiency through the use of a compact vector representation of n-grams. Details of the data structure and associated algorithms can be found in the following paper.
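To give a rough feel for what a compact vector representation of n-grams can look like, here is a minimal Python sketch. It is not MITLM's code or API. It assumes one plausible layout: for each n-gram order, parallel arrays hold the word id of the last word and the index of the history (the n-gram with its last word removed) in the arrays of the next lower order. All names (`new_model`, `count_ngrams`, `ngram_at`) are hypothetical and chosen only for this illustration.

```python
def new_model(max_order):
    """Create empty per-order parallel arrays (slot 0 is an implicit root)."""
    return {
        "vocab": {},                                      # word -> integer id
        "words": [[] for _ in range(max_order + 1)],      # last-word id per n-gram
        "hists": [[] for _ in range(max_order + 1)],      # index of history in order-1 arrays
        "counts": [[] for _ in range(max_order + 1)],     # occurrence count per n-gram
        "index": [{} for _ in range(max_order + 1)],      # (hist, word) -> position
    }


def _find_or_add(model, order, hist, word_id):
    """Return the position of (hist, word_id) at this order, appending if new."""
    key = (hist, word_id)
    pos = model["index"][order].get(key)
    if pos is None:
        pos = len(model["words"][order])
        model["index"][order][key] = pos
        model["words"][order].append(word_id)
        model["hists"][order].append(hist)
        model["counts"][order].append(0)
    return pos


def count_ngrams(sentences, max_order):
    """Count all n-grams up to max_order in tokenized sentences."""
    model = new_model(max_order)
    vocab = model["vocab"]
    for tokens in sentences:
        ids = [vocab.setdefault(t, len(vocab)) for t in tokens]
        for end in range(len(ids)):
            for order in range(1, max_order + 1):
                start = end - order + 1
                if start < 0:
                    break
                # Walk from the implicit root (index 0) through each prefix.
                pos = 0
                for depth, word_id in enumerate(ids[start:end + 1], start=1):
                    pos = _find_or_add(model, depth, pos, word_id)
                model["counts"][order][pos] += 1
    return model


def ngram_at(model, order, pos):
    """Recover the words of an n-gram by following history indices downward."""
    ids = []
    while order > 0:
        ids.append(model["words"][order][pos])
        pos = model["hists"][order][pos]
        order -= 1
    id_to_word = {i: w for w, i in model["vocab"].items()}
    return [id_to_word[i] for i in reversed(ids)]


model = count_ngrams([["a", "b", "a", "b"]], max_order=2)
```

In this layout each n-gram costs a few integers regardless of its order, and per-n-gram statistics such as counts sit in flat arrays that can be updated in bulk, which is the kind of property that makes iterative parameter estimation cheap. The dictionary used for lookup here is only a convenience for the sketch.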
Currently, MITLM supports the following features:
MITLM is available for download under the MIT License. It has been built and tested on 32-bit and 64-bit Intel CPUs running Debian Linux 4.0. It currently requires the following:
Acknowledgments
The design and implementation of this toolkit benefited significantly from the SRI Language Modeling Toolkit by Andreas Stolcke. The project is supported in part by the T-Party Project, a joint research program between MIT CSAIL and Quanta Computer Inc.
©2009 Bo-June (Paul) Hsu, Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology.