= HAMAKE=
== Description ==
The 'hamake' utility lets you automate incremental processing of
datasets stored on HDFS using Hadoop tasks written in Java or
{{{PigLatin}}} scripts. Datasets can be either individual files or
directories containing groups of files. New files may be add...
==Motivation/Purpose==
Hadoop Map/Reduce is gaining popularity in the industry and has been adopted by some of the most influential and tech-savvy companies.
One of the biggest problems I faced while learning the technology was figuring out intuitive examples tha...
Version 0.1 is in progress and may be tagged by December.
In later versions, Von Steuben Studio will offer a graphical interface for creating and submitting jobs. Support for Amazon Elastic MapReduce will also be added.
-----------------------------------------------------------------------------...
[http://hadoop.apache.org/core/ Hadoop] is an Apache project for distributed parallel computing using the [http://en.wikipedia.org/wiki/MapReduce MapReduce] computational framework.
While Hadoop is written in Java, hadoop-sharp enables developers to use any CLR (.NET/Mono) supported programming l...
The goal of the Hadoop UI project is to provide an intuitive, powerful and accessible client for the [http://hadoop.apache.org/ Hadoop] MapReduce framework.
The following features have been implemented:
* HDFS Explorer: file manager for distributed file system (HDFS)
* Job Manager: Monitor a...
Large-scale, powerful and batteries included!
HadoopLDA can train an LDA model on a large corpus in parallel on a Hadoop cluster. It uses a distributed Gibbs sampling technique with built-in vocabulary selection. HadoopLDA is easy to use: a single command can turn a huge number of documents into a compact...
[http://hadoop.apache.org/core Hadoop] is an [http://apache.org Apache] project and is released under the Apache Software License. However, several open source codecs are released under [http://www.gnu.org/copyleft/gpl.html GPL]. *This project* is a set of plugins for Hadoop that provide access to t...
Cascading is an active project. Please visit our main project page for more info and join our mailing list on the right.
http://www.cascading.org/
Cascading is maintained and supported by [http://www.concurrentinc.com/ Concurrent, Inc.]
Note: We now do all development in Git and keep the cu...
hacdb aims to be a HAdoop-based Column DataBase. We want to make it one of the best warehousing platforms.
It will have four layers:
|| SQL parser ||
|| planner ||
|| M/R ||
|| HDFS ||
We'll build it from bottom up.
= Contents =
# [HadoopSetup hadoop cluster setu...
MRToolkit provides a framework for building simple Map/Reduce jobs in just a few lines of code. You provide only the map and reduce logic; the framework does the rest. Or use one of the provided map or reduce tools and write even less.
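
To illustrate the division of labor described above, here is a minimal sketch in Python. It is not MRToolkit's API (MRToolkit jobs are written in Ruby). The `run_job` helper and the `word_mapper` and `count_reducer` functions are hypothetical names invented for this example. The helper stands in for "the framework", and the two small functions are the only job-specific logic.

```python
from collections import defaultdict


def run_job(records, mapper, reducer):
    """Hypothetical stand-in for the framework: map, group by key, reduce."""
    groups = defaultdict(list)
    for record in records:
        # The mapper emits zero or more (key, value) pairs per input record.
        for key, value in mapper(record):
            groups[key].append(value)
    # The reducer collapses all values collected for one key into one result.
    return {key: reducer(key, values) for key, values in sorted(groups.items())}


def word_mapper(line):
    # Job-specific map logic: emit (word, 1) for every word in the line.
    for word in line.split():
        yield word.lower(), 1


def count_reducer(key, values):
    # Job-specific reduce logic: sum the counts for one word.
    return sum(values)


result = run_job(
    ["Hadoop map reduce", "map reduce in Ruby"],
    word_mapper,
    count_reducer,
)
print(result)
```

On a real cluster the grouping step is a distributed shuffle and sort rather than an in-memory dictionary, but the mapper and reducer keep the same shape.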
Map and reduce jobs are written in Ruby. MRToolkit was in...