Overview

SSS Mapreduce is a distributed computing framework for parallel data processing. Its basic concept follows MapReduce, the distributed computing model advocated by Google, and its open source implementation, Apache Hadoop. The distinguishing feature of SSS Mapreduce is that it uses a distributed key-value store instead of a distributed file system (GFS in Google's implementation, HDFS in Apache Hadoop).

Features

This section explains the features of SSS Mapreduce.

Reads local data

SSS Mapreduce processes the data that is local to each node, which gives high-performance data transfer and avoids network collisions.

Hashing of keys and grouping of data by the key-value store

In the MapReduce computing model, the input data of the Reduce phase must be grouped by key. SSS Mapreduce uses key hashing and the key-value store for this grouping. When writing a key-value pair to the key-value store, SSS Mapreduce uses the hash value of the key to decide which node the pair is written to. The data written to the key-value store is then grouped automatically by the store itself. SSS Mapreduce executes Reduce on these groups as they are.

Flexible workflow

As mentioned above, SSS Mapreduce distributes the data passed between Map and Reduce according to the hash value of the key and stores it in the key-value store. SSS Mapreduce keeps not only this intermediate data but all data (the input of Map, the output of Reduce, and so on) in the same way. Therefore SSS Mapreduce can run either Map or Reduce on any data. As a result, Map and Reduce need not be paired as one set, and you can build a flexible data flow by freely combining any number of Mappers and Reducers.

Composition of SSS Mapreduce

SSS Mapreduce is composed of the following three items: the client, the worker servers, and the storage servers.
The worker servers and the storage servers run on each worker node. The client manages the execution of Map and Reduce processing and directs each worker server to execute Map/Reduce. The worker server is a program written in Java. SSS Mapreduce uses TokyoTyrant, with our patches applied, as the storage server.

The usage

SSS Mapreduce provides a Java API for the following.
You can perform the processing described above by calling the API from your Java program. See ProgrammingGuide for concrete usage.
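
The key hashing and grouping described under Features can be sketched in Java as follows. This is only an illustration and not the SSS Mapreduce API: the class HashPartitionSketch, the helper nodeFor, the NodeStore stand-in for a storage server, and the "hash modulo node count" rule are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: none of these names come from the SSS Mapreduce API.
public class HashPartitionSketch {

    // Picks the node for a key from its hash value.
    // Assumption: the rule is simply "hash modulo number of nodes".
    public static int nodeFor(String key, int nodeCount) {
        return Math.floorMod(key.hashCode(), nodeCount);
    }

    // Hypothetical stand-in for one node's key-value store.
    // Values written under the same key end up in the same group.
    public static final class NodeStore {
        private final Map<String, List<String>> groups = new TreeMap<>();

        public void put(String key, String value) {
            groups.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
        }

        public Map<String, List<String>> groups() {
            return groups;
        }
    }

    // Writes each pair to the node chosen by the key hash
    // and returns the per-node stores.
    public static List<NodeStore> write(List<String[]> pairs, int nodeCount) {
        List<NodeStore> nodes = new ArrayList<>();
        for (int i = 0; i < nodeCount; i++) {
            nodes.add(new NodeStore());
        }
        for (String[] pair : pairs) {
            nodes.get(nodeFor(pair[0], nodeCount)).put(pair[0], pair[1]);
        }
        return nodes;
    }

    public static void main(String[] args) {
        List<String[]> pairs = List.of(
            new String[] {"apple", "1"},
            new String[] {"banana", "1"},
            new String[] {"apple", "1"});
        List<NodeStore> nodes = write(pairs, 3);
        // Print the groups held by each node; a Reduce step would consume these as they are.
        for (int i = 0; i < nodes.size(); i++) {
            System.out.println("node " + i + ": " + nodes.get(i).groups());
        }
    }
}
```

In this sketch, every pair with the same key is sent to the same node and stored under that key, so each node's groups could be handed to a Reduce step without a separate grouping pass.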