This is the example code for a series on Clojure and NLP that I'm doing on my blog: http://writingcoding.blogspot.com/search/label/clojure-series

Quoting from my blog:

Here's the plan:
- Its target audience will be people like myself: self-taught programmers whose primary education has been in the humanities.
- This won't be a tutorial on how to program: The audience should have a little programming experience, preferably in Perl, Python, or Ruby.
- But you won't need to have any experience in Lisp, functional programming, or concurrent programming. I'll touch on those as we go along.
- The problems I'm going to tackle will be oriented to processing text documents and analyzing the language in them. The techniques I'll cover will be helpful to those interested in stylistics and other literary studies, or to those interested in corpus linguistics.
- Many of the things I'll describe, such as tokenization, will be very basic and far from the cutting edge. Since this is also an introduction to Clojure, I'll cover the basics of the language as well.
- But there will also be some odd gaps. For example, I'll explain regular expressions just enough to implement tokenization. Then I'll point you to one of the myriad-thousand excellent online tutorials if you want to learn more about them.
- Since one of the main points of Clojure is concurrency and parallel processing, I'll cover that also.
- The examples will build on each other, and in the end, we'll have a system for doing parallel processing of text documents. We'll build a variety of tasks to do some standard analyses, and we'll design it so creating more tasks and inserting them into the processing stream will be relatively easy.

This code project will act as both a place to keep the sample code that I create for that series and the basis for an NLP library.

--Eric
