DataServer  
SSS Mapreduce DataServer
Updated Feb 27, 2013

The use of DataServer

SSS Mapreduce uses Tokyo Tyrant as its storage server. However, the communication layer of Tokyo Tyrant is not well optimized for the bulk transfers that SSS Mapreduce relies on heavily. We therefore provide "DataServer", a storage server with its own communication protocol.

Prerequisites

Platform

We have confirmed that DataServer runs on the following platforms.

- CentOS release 5.5 x86_64
- Ubuntu 12.10 x86_64

Build tools

A C++11 compiler is required to compile DataServer. We have confirmed that it compiles with gcc 4.7.0 and 4.7.2.

DataServer is built with GNU Make and CMake. We have confirmed that it builds with the following versions.

- GNU Make
  - 3.8.1
- CMake
  - 2.8.8 to 2.8.9

Library

DataServer requires the following libraries.

- Boost 1.49 or later
- POCO
  - We checked with 1.3.6p1-4 and 1.4.3p1.
- TokyoCabinet
  - 1.4.47 with our patches applied.

Usage

DataServer is built with the "make" command.

$ make

Write machine-specific settings, such as the directories of the libraries, in "env.mk". Both a "Release" build and a "Debug" build are available; set the CMAKE_BUILD_TYPE variable in "env.mk" to select one.

CMAKE_BUILD_TYPE=Release # Release version
CMAKE_BUILD_TYPE=Debug   # Debug version

If you do not set it, the "Debug" version is built.
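
As a minimal sketch of the build-type selection described above, the helper below writes an "env.mk" containing only the CMAKE_BUILD_TYPE line. The function name write_env_mk is hypothetical and not part of DataServer, and the sketch overwrites the target file, so it is only suitable when "env.mk" holds no other machine-specific settings.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with DataServer): write an env.mk
# that contains only the CMAKE_BUILD_TYPE setting.
# WARNING: this overwrites the target file, including any library
# directory settings already stored there.
write_env_mk() {
    build_type=${1:-Debug}   # mirrors the documented default
    out=${2:-env.mk}
    case "$build_type" in
        Release|Debug) ;;    # the two build types named in this page
        *) echo "unknown build type: $build_type" >&2; return 1 ;;
    esac
    printf 'CMAKE_BUILD_TYPE=%s\n' "$build_type" > "$out"
}
```

A typical use would be `write_env_mk Release && make`, run from the top directory of DataServer.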


dataserver.sh and dataserver_local.sh start and stop DataServer. Both scripts require the environment variable MAPREDUCE_HOME to be set to a suitable value, and the environment variable DATASERVER_HOME to be set to the top directory of DataServer.

dataserver.sh reads ${MAPREDUCE_HOME}/conf/peers and starts DataServer on the remote nodes. dataserver_local.sh starts DataServer on the local host.
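
To see which nodes dataserver.sh would act on, it can help to print the hosts named in the peers file first. The sketch below is not taken from dataserver.sh. It assumes that the file lists one host name per line and that blank lines and "#" comments can be ignored; the real format may differ. The name list_peers is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the hosts listed in a peers file.
# ASSUMPTION: one host name per line; '#' starts a comment; blank
# lines are ignored. This format is a guess, not documented here.
list_peers() {
    peers_file=$1
    if [ ! -r "$peers_file" ]; then
        echo "cannot read peers file: $peers_file" >&2
        return 1
    fi
    # Strip comments, trim surrounding whitespace, drop empty lines.
    sed -e 's/#.*$//' \
        -e 's/^[[:space:]]*//' \
        -e 's/[[:space:]]*$//' \
        -e '/^$/d' "$peers_file"
}
```

For example: `list_peers "${MAPREDUCE_HOME}/conf/peers"`.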

To start DataServer on the remote nodes:

$ export DATASERVER_HOME=$PWD
$ ./dataserver.sh start

To start DataServer on the local host:

$ export DATASERVER_HOME=$PWD
$ ./dataserver_local.sh start
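
Because both scripts depend on MAPREDUCE_HOME and DATASERVER_HOME, a small preflight check can catch a missing variable before anything is started. This is a sketch only: the function check_dataserver_env is hypothetical, and the requirement that both variables name existing directories is an assumption rather than a documented rule.

```shell
#!/bin/sh
# Hypothetical preflight check: verify the two environment variables
# that dataserver.sh and dataserver_local.sh require.
# ASSUMPTION: both variables should point at existing directories.
check_dataserver_env() {
    status=0
    if [ -z "${MAPREDUCE_HOME:-}" ]; then
        echo "MAPREDUCE_HOME is not set" >&2
        status=1
    elif [ ! -d "$MAPREDUCE_HOME" ]; then
        echo "MAPREDUCE_HOME is not a directory: $MAPREDUCE_HOME" >&2
        status=1
    fi
    if [ -z "${DATASERVER_HOME:-}" ]; then
        echo "DATASERVER_HOME is not set" >&2
        status=1
    elif [ ! -d "$DATASERVER_HOME" ]; then
        echo "DATASERVER_HOME is not a directory: $DATASERVER_HOME" >&2
        status=1
    fi
    return "$status"
}
```

It could be combined with a start command as `check_dataserver_env && ./dataserver_local.sh start`.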

DataServer listens on port 21201 and creates its database files under ${MAPREDUCE_DATA}/db.

When these scripts are invoked with "stop", they stop DataServer.

To stop DataServer on the remote nodes:

$ ./dataserver.sh stop

To stop DataServer on the local host:

$ ./dataserver_local.sh stop

Settings for SSS Mapreduce

To use DataServer, add the following setting to conf/mapreduce.server.properties.

mapreduce.server.io.stream.protocol = true
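
If the setting has to be applied on several machines, a small idempotent helper may be more convenient than editing the file by hand. The sketch below is an illustration under assumptions: the function name enable_stream_protocol is hypothetical, the properties file is assumed to end with a newline, and a file that already sets the key to some other value is left untouched so that it can be reviewed manually.

```shell
#!/bin/sh
# Hypothetical helper: make sure the stream protocol setting is present
# in a properties file, without adding it twice.
# ASSUMPTION: the file, if it exists, ends with a newline.
enable_stream_protocol() {
    props=$1
    key_re='^[[:space:]]*mapreduce\.server\.io\.stream\.protocol[[:space:]]*='
    if [ -f "$props" ] && grep -Eq "$key_re" "$props"; then
        # The key is already there: accept "true", refuse anything else.
        if grep -Eq "${key_re}[[:space:]]*true[[:space:]]*\$" "$props"; then
            return 0
        fi
        echo "stream protocol key has another value in $props; edit it by hand" >&2
        return 1
    fi
    printf 'mapreduce.server.io.stream.protocol = true\n' >> "$props"
}
```

For example: `enable_stream_protocol "${MAPREDUCE_HOME}/conf/mapreduce.server.properties"`, assuming that conf/ is relative to MAPREDUCE_HOME.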

When you start the SSS Mapreduce servers with cluster.sh, pass "sssserver" so that only the worker servers are started. By default, cluster.sh starts both the worker servers and the TokyoTyrant servers.

$ ${MAPREDUCE_DATA}/bin/cluster.sh start sssserver