Overview  
Installation of SSS MapReduce
Updated Feb 27, 2013 by tatsuhik...@gmail.com

Overview

SSS MapReduce is a distributed computing framework for parallel data processing. Its basic concept is drawn from MapReduce, the distributed computing model advocated by Google, and from its open source implementation, Apache Hadoop. The distinguishing feature of SSS MapReduce is that it uses a distributed key-value store instead of a distributed filesystem (GFS in Google's implementation, HDFS in Apache Hadoop).

Features

This section explains the features of SSS MapReduce.

Reads local data

SSS MapReduce processes the data stored locally on each node, which gives high data-transfer performance and avoids network congestion.

Hashing of key and grouping data by key-value store

In the MapReduce computing model, the input data of the Reduce phase must be grouped by key. SSS MapReduce performs this grouping with key hashing and the key-value store. When a key-value pair is written to the key-value store, SSS MapReduce chooses the node that stores the pair from the hash value of its key, and the key-value store then groups the stored data by key automatically. SSS MapReduce executes Reduce directly on these groups.
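
The following is a minimal sketch, not actual SSS MapReduce code, of how a key hash can determine the node that stores a key-value pair; the hash function (CRC32 here) and the node-assignment rule are assumptions made only for illustration.

  import java.nio.charset.StandardCharsets;
  import java.util.zip.CRC32;

  // Illustration only: SSS MapReduce's real hash function and node
  // assignment are internal details and may differ from this sketch.
  public class KeyPartitioner {
      private final int numNodes;

      public KeyPartitioner(int numNodes) {
          this.numNodes = numNodes;
      }

      /** Returns the index of the node that stores the given key. */
      public int nodeFor(String key) {
          CRC32 crc = new CRC32();
          crc.update(key.getBytes(StandardCharsets.UTF_8));
          // Same key -> same hash -> same node, so all values for a key
          // end up grouped on one node before Reduce runs.
          return (int) (crc.getValue() % numNodes);
      }

      public static void main(String[] args) {
          KeyPartitioner p = new KeyPartitioner(4);
          System.out.println("\"apple\"  -> node " + p.nodeFor("apple"));
          System.out.println("\"banana\" -> node " + p.nodeFor("banana"));
          System.out.println("\"apple\"  -> node " + p.nodeFor("apple")); // same node again
      }
  }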

Flexible work flow

As mentioned above, SSS MapReduce distributes the data exchanged between Map and Reduce according to the hash value of each key and stores it in the key-value store. However, SSS MapReduce keeps not only the intermediate data between Map and Reduce but all data (the input of Map, the output of Reduce, and so on) in this way. Both Map and Reduce can therefore be executed on any data in the store. As a result, Map and Reduce do not have to come in fixed pairs, and you can build flexible data flows by combining any number of Mappers and Reducers freely.
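
As an illustration of this idea, here is a conceptual sketch (not the SSS MapReduce API) that models the key-value store as an in-memory map of named data sets; because every data set lives in the store, Map and Reduce steps can be chained in any order.

  import java.util.ArrayList;
  import java.util.HashMap;
  import java.util.List;
  import java.util.Map;
  import java.util.function.BiFunction;

  // Conceptual sketch only: a tiny in-memory stand-in for the distributed
  // key-value store, showing why keeping every data set (inputs, intermediate
  // data, outputs) in the store lets Map and Reduce be combined freely.
  public class FlexibleFlowSketch {
      // data set name -> (key -> list of values)
      static Map<String, Map<String, List<String>>> store = new HashMap<>();

      // A per-value Map step: read one data set, write another.
      static void map(String in, String out,
                      BiFunction<String, String, Map.Entry<String, String>> mapper) {
          Map<String, List<String>> output = store.computeIfAbsent(out, d -> new HashMap<>());
          store.get(in).forEach((k, vs) -> vs.forEach(v -> {
              Map.Entry<String, String> e = mapper.apply(k, v);
              output.computeIfAbsent(e.getKey(), x -> new ArrayList<>()).add(e.getValue());
          }));
      }

      // A per-group Reduce step: the store has already grouped values by key.
      static void reduce(String in, String out,
                         BiFunction<String, List<String>, String> reducer) {
          Map<String, List<String>> output = store.computeIfAbsent(out, d -> new HashMap<>());
          store.get(in).forEach((k, vs) ->
              output.computeIfAbsent(k, x -> new ArrayList<>()).add(reducer.apply(k, vs)));
      }

      public static void main(String[] args) {
          store.put("input", Map.of("x", List.of("1", "2"), "y", List.of("3")));
          // Map -> Reduce -> Map: every step reads from and writes to the store,
          // so Mappers and Reducers need not come in fixed pairs.
          map("input", "doubled", (k, v) -> Map.entry(k, Integer.toString(Integer.parseInt(v) * 2)));
          reduce("doubled", "sums", (k, vs) ->
              Integer.toString(vs.stream().mapToInt(Integer::parseInt).sum()));
          map("sums", "labelled", (k, v) -> Map.entry(k, k + "=" + v));
          System.out.println(store.get("labelled")); // e.g. {x=[x=6], y=[y=6]}
      }
  }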

Composition of SSS MapReduce

SSS MapReduce is composed of the following three components.

  • Client
  • Worker server
  • Storage server

A worker server and a storage server run on each worker node. The client manages the execution of Map and Reduce processing and directs each worker server to run Map/Reduce tasks.

The worker server is a program written in Java. SSS MapReduce uses TokyoTyrant, with our patches applied, as the storage server.

Usage

SSS MapReduce provides a Java API for the following:

  • Specifying the conditions and the processing contents of jobs.
  • Invoking jobs.
  • Managing the execution of jobs.

You can perform the operations above by calling the API from your own Java program. See ProgrammingGuide for concrete usage.
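
The sketch below shows only the general shape such a driver program might take. It is hypothetical: the class and method names (Job, input, output, mapper, reducer, submit, waitForCompletion) are placeholders invented for illustration, backed by a logging stub so the example runs on its own; the real API is documented in ProgrammingGuide.

  // Hypothetical sketch only: these names are NOT the real SSS MapReduce API.
  public class JobDriverSketch {

      /** Placeholder standing in for the real job-definition API. */
      static class Job {
          private final String name;
          Job(String name) { this.name = name; }
          Job input(String dataSet)  { return log("input",   dataSet); }
          Job output(String dataSet) { return log("output",  dataSet); }
          Job mapper(String cls)     { return log("mapper",  cls); }
          Job reducer(String cls)    { return log("reducer", cls); }
          void submit()              { log("submit", ""); }
          void waitForCompletion()   { log("wait",   ""); }
          private Job log(String step, String arg) {
              System.out.println(name + ": " + step + " " + arg);
              return this;
          }
      }

      public static void main(String[] args) {
          Job job = new Job("word-count");
          // 1. Specify the conditions and the processing contents of the job.
          job.input("text-lines")              // data set name in the key-value store
             .output("word-counts")
             .mapper("example.WordCountMapper")
             .reducer("example.WordCountReducer");
          // 2. Invoke the job.
          job.submit();
          // 3. Manage its execution (here simply wait for it to finish).
          job.waitForCompletion();
      }
  }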
