SSS Mapreduce Programming Guide - Mapper/Reducer class
Mapper/Reducer class

This chapter covers the parts of defining a Mapper/Reducer that the WordCount example does not explain.

multi outputs

In SSS Mapreduce, a Mapper/Reducer can have multiple outputs. To use them, append one Output-typed argument per output to the end of the argument list of the map/reduce method. An example of a Mapper with two outputs is shown below.

public class MyMapper extends Mapper {
    public void map(Context context,
                    PackableInt key, PackableString value,
                    Output<PackableString, PackableInt> output1,
                    Output<PackableInt, PackableDouble> output2) throws Exception {
        // The contents of processing
    }
}

The outputs do not need to have the same type. When the Job is created, a TupleGroup must be specified for each output with the addOutput method of Job.Builder.

GroupID input = ...;
GroupID output1 = ...;
GroupID output2 = ...;
engine.getJobBuilder("MyMapper", MyMapper.class)
      .addInput(input)
      .addOutput(output1)
      .addOutput(output2) // two outputs
      .build();

WARNING: A class used as a combiner cannot use multiple outputs.

The method called before the processing

The Mapper/Reducer class has a 'configure' method. SSS Mapreduce calls it before the Mapper/Reducer reads any tuples. To run processing before the Mapper/Reducer executes, override the 'configure' method. The 'configure' method of the Mapper/Reducer base class does nothing, so the overriding method does not need to call the 'configure' method of the super class. Only the signature is shown below.

public class MyMapper extends Mapper {
    @Override
    public void configure(Context context) {
        // The contents of processing
    }

    public void map(Context context,
                    PackableInt key, PackableString value,
                    Output<PackableString, PackableInt> output1,
                    Output<PackableInt, PackableDouble> output2) throws Exception {
        // The contents of processing
    }
}

The method called after the processing

As the counterpart of the "configure" method, you can also define a method that is called after all tuples have been processed. When a method named "cleanup" is defined in a Mapper/Reducer, SSS Mapreduce calls it after the processing finishes. The "cleanup" method must take the following arguments: a Context, followed by the Output arguments.

As the required arguments show, the "cleanup" method can output tuples just like the map/reduce method. Therefore the key and value types of each Output must be the same as in the map/reduce method, and when the map/reduce method has multiple outputs, the "cleanup" method must have the same number of outputs. Only the signature is shown below.

public class MyMapper extends Mapper {
    public void map(Context context,
                    PackableInt key, PackableString value,
                    Output<PackableString, PackableInt> output1,
                    Output<PackableInt, PackableDouble> output2) throws Exception {
        // The contents of processing
    }

    public void cleanup(Context context,
                        Output<PackableString, PackableInt> output1,
                        Output<PackableInt, PackableDouble> output2) throws Exception {
        // The contents of processing
    }
}
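
The call order described in this chapter (configure once, map for every tuple, cleanup once, all writing to the same outputs) can be illustrated with a small stand-alone sketch. It does not use the SSS Mapreduce classes: SketchOutput, ListOutput, LengthMapper and LifecycleSketch are hypothetical stand-ins, the write method is an assumed name rather than the real Output API, plain Java types replace PackableInt/PackableString/PackableDouble, and the Context argument is omitted.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an output; the real SSS Output API may differ.
interface SketchOutput<K, V> {
    void write(K key, V value);
}

// Keeps written tuples in memory as "key=value" strings for inspection.
class ListOutput<K, V> implements SketchOutput<K, V> {
    final List<String> tuples = new ArrayList<>();

    @Override
    public void write(K key, V value) {
        tuples.add(key + "=" + value);
    }
}

// Same shape as the MyMapper examples: configure, map, cleanup.
class LengthMapper {
    private int seen; // number of tuples passed to map

    // Runs once before any tuple is read.
    void configure() {
        seen = 0;
    }

    // Runs once per tuple and writes to both outputs, which have different types.
    void map(Integer key, String value,
             SketchOutput<String, Integer> output1,
             SketchOutput<Integer, Double> output2) {
        output1.write(value, value.length());
        output2.write(key, (double) value.length());
        seen++;
    }

    // Runs once after all tuples; takes the same outputs, in the same order, as map.
    void cleanup(SketchOutput<String, Integer> output1,
                 SketchOutput<Integer, Double> output2) {
        output1.write("tuples", seen);
    }
}

class LifecycleSketch {
    // Plays the role of the framework: calls the hooks in order and
    // returns the contents of output1 and output2.
    static List<List<String>> run(String... values) {
        ListOutput<String, Integer> output1 = new ListOutput<>();
        ListOutput<Integer, Double> output2 = new ListOutput<>();
        LengthMapper mapper = new LengthMapper();

        mapper.configure();
        for (int i = 0; i < values.length; i++) {
            mapper.map(i, values[i], output1, output2);
        }
        mapper.cleanup(output1, output2);

        List<List<String>> result = new ArrayList<>();
        result.add(output1.tuples);
        result.add(output2.tuples);
        return result;
    }

    public static void main(String[] args) {
        System.out.println(run("a", "bcd"));
    }
}
```

In this sketch the tuple written in cleanup lands in output1 after the tuples written by map, which is the reason cleanup has to declare the same outputs as map/reduce.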