A collection of code examples based on the open source Apache Hadoop project.
This project explores applications of MapReduce programming techniques for handling large data sets within a distributed computing framework, and provides more examples for other developers learning to use Hadoop. The code currently uses Hadoop version 0.15.3, which you will need to download: http://hadoop.apache.org/
Components:
- "jyte" - approximation of jyte.com "cred scores" using a PageRank algorithm
- "canopy" - Java implementation of canopy clustering, from McCallum, Nigam, Ungar
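
To give a feel for what the "canopy" component computes, here is a minimal single-machine sketch of canopy clustering as it is usually described: pick a point as a canopy center, add every point within a loose threshold T1 to that canopy, and drop every point within a tight threshold T2 from further consideration as a center. This is not the code from this project and does not use Hadoop; the class name CanopySketch, the method names, the Euclidean distance, and the sample points are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

// Hypothetical illustration only; not the project's "canopy" implementation.
class CanopySketch {

    // Cheap distance measure. Euclidean distance is an assumption here;
    // any inexpensive metric could take its place.
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Groups points into (possibly overlapping) canopies.
    // t1 is the loose threshold, t2 the tight threshold, with t1 >= t2.
    static List<List<double[]>> canopies(List<double[]> points, double t1, double t2) {
        if (t1 < t2) {
            throw new IllegalArgumentException("t1 must be >= t2");
        }
        List<double[]> remaining = new LinkedList<double[]>(points);
        List<List<double[]>> result = new ArrayList<List<double[]>>();

        while (!remaining.isEmpty()) {
            // Take the next remaining point as the center of a new canopy.
            double[] center = remaining.remove(0);
            List<double[]> canopy = new ArrayList<double[]>();
            canopy.add(center);

            Iterator<double[]> it = remaining.iterator();
            while (it.hasNext()) {
                double[] p = it.next();
                double d = distance(center, p);
                if (d < t1) {
                    canopy.add(p);   // loosely close: member of this canopy
                }
                if (d < t2) {
                    it.remove();     // tightly close: can no longer become a center
                }
            }
            result.add(canopy);
        }
        return result;
    }

    public static void main(String[] args) {
        List<double[]> points = new ArrayList<double[]>();
        points.add(new double[] {0.0, 0.0});
        points.add(new double[] {1.0, 0.0});
        points.add(new double[] {4.0, 0.0});
        points.add(new double[] {10.0, 0.0});

        List<List<double[]>> result = canopies(points, 5.0, 2.0);
        for (int i = 0; i < result.size(); i++) {
            System.out.println("canopy " + i + " has " + result.get(i).size() + " point(s)");
        }
    }
}
```

Because points between T2 and T1 stay in the candidate list, a point can belong to more than one canopy. In a MapReduce setting one plausible arrangement is for each mapper to run this loop over its own split and a reducer to repeat it over the emitted centers, but that arrangement is an assumption and not a description of how this project is structured.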
