
Want to help?

We need web developers, browser developers, and translation expertise (for now I'd like to stick with French and Kreyol). Email the project admin or the CrisisCommons MT group if you have a relevant skill and would like to contribute!

Project and Effort Description

In the aftermath of the tragic 2010 earthquake in Haiti, it is clear that the world has a long history of challenges in providing aid, development, and support to areas hit by natural disasters (the 2004 Indian Ocean tsunami, for example).

The CrisisCommons group has been organized to provide information systems and software solutions to support responders.

Issue: Language experts who can be mobilized quickly to assist in recovery operations and efforts are hard to reach.

Solution: Provide parallel corpora, statistical language models and an automated machine translation system to assist in the translation and rapid deployment of text and printed media.

The development of a Machine Translation (MT) system would be handy for these situations.

ccmts is an effort to provide a bidirectional high-density-language to low-density-language MT system using data sets (texts) from around the 'net.

The goal of ccmts is to offer as much of the following as possible:

  • access to data sets, n-grams, and language models (LMs)
  • web interfaces
  • browser plugins
  • web services (SOAP, JSON, etc.)
  • mobile device application support (iPhone, Android, Symbian, etc.)
  • documentation and drop-in-software that can quickly be mobilized to support these efforts.
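To give a feel for what "n-grams" and a language model mean here, below is a minimal sketch in Python. It is not the ccmts or moses-nlp code: the function names (`ngram_counts`, `bigram_probability`), the whitespace tokenizer, the `<s>`/`</s>` boundary markers, and the second sample sentence are all assumptions made up for illustration. A real LM would also add smoothing for unseen word pairs.

```python
from collections import Counter


def ngram_counts(sentences, n):
    """Count word n-grams over lowercased, whitespace-tokenized sentences."""
    counts = Counter()
    for sentence in sentences:
        # Hypothetical boundary markers so sentence starts and ends are counted.
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def bigram_probability(bigrams, unigrams, first, second):
    """Unsmoothed maximum-likelihood estimate of P(second | first)."""
    seen = unigrams[(first,)]
    return bigrams[(first, second)] / seen if seen else 0.0


# Toy corpus: the first sentence appears on this page, the second is made up.
corpus = ["I need medical help .", "I need water ."]
unigrams = ngram_counts(corpus, 1)
bigrams = ngram_counts(corpus, 2)
```

With this toy corpus, `bigram_probability(bigrams, unigrams, "need", "medical")` is the number of times "need medical" occurs divided by the number of times "need" occurs. The same counting applies to either side of a parallel corpus.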

Special thanks to:

Time Line, Release Schedule

Update! Site is online and hosted! Source code and parallel corpora to follow!

http://crisisterp.dyndns.org

The goal is to publish content updates (code, corpora, analysis) in the evenings, which leaves flexibility, planning time, and room for the day job.

  • 02/23/2010: Alpha web site completed. The moses-nlp translation engine has been successfully wrapped in Python for integration with the website. We are testing the product and will have it, the source code, and our parallel corpus available for release this weekend (barring some annoying UTF-8 encoding issues).
  • 02/10/2010: The project manager has secured web hosting for the project. We are still developing the Python bindings for moses-nlp (our translation engine), and the Toronto webdev team is continuing to work on the website for the engine.
  • 01/28/2010: A command line version of the system is operational!
Input: “I need medical help .” The moses-nlp Kreyol translation: “I bezwen ede medikal .” We are continuing to build a better corpus and coordinating with others to build a web interface.
  • 01/25/2010: Email the translation/linguistics team to check the alignment of our corpora, and release a cleaned-up version of the corpora for public consumption. Missed deadline: the tools are being finicky.
  • 01/21/2010: post to the site a downloadable zip, and commit to the site svn, text file corpora compiled from: