SSS Mapreduce Programming Guide - MergeReducer
SSS Mapreduce can merge tuples that share the same key across two TupleGroups. This feature is called "MergeReducer". To use MergeReducer, create a Reducer whose reduce method takes two inputs:

public class MergeReducer extends Reducer {
    public void reduce(Context context,
                       PackableString key1, Iterable<PackableInt> values1,
                       PackableString key2, Iterable<PackableInt> values2,
                       Output<PackableString, PackableInt> output) {
        // Processing that merges values1 and values2 goes here
    }
}

The two key arguments must have the same type. SSS Mapreduce merges the tuples that have the same key in the two TupleGroups and calls the reduce method, so you can write processing that merges the values from both TupleGroups. In each call, the two keys hold the same value. One of them may look unnecessary, but SSS Mapreduce needs both because it obtains the input and output type information of the Reducer from the method arguments.

Next, when you create the Job with JobEngine, specify the two inputs by calling the addInput method twice:

JobEngine engine = new JobEngine(client);
try {
    GroupID input1 = ...;
    GroupID input2 = ...;
    GroupID output = GroupID.createRandom(engine);
    engine.getJobBuilder("MergeReducer", MergeReducer.class)
        .addInput(input1)
        .addInput(input2)
        .addOutput(output).build();

By default, SSS Mapreduce does not call the reduce method for keys that exist in only one of the two TupleGroups. To have the reduce method called for those keys as well, set "marge_reducer.handle_one_side_only" to true in the Configuration:

Job.Builder jb = engine.getJobBuilder("MergeReducer", MergeReducer.class)
    .addInput(input1)
    .addInput(input2)
    .addOutput(output);
jb.getConfiguration().setBoolean("marge_reducer.handle_one_side_only", true);
jb.build();
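
The difference between the default behavior and "marge_reducer.handle_one_side_only" can be illustrated with plain Java collections. The following is a minimal sketch, not SSS Mapreduce code: MergeSketch, TwoInputReducer, merge and sum are hypothetical names, and TupleGroups are modeled as maps from a key to its list of values. Passing an empty list for the missing side is an assumption made here, since this guide does not say what the reduce method receives for the absent TupleGroup.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Plain-Java illustration only; none of these names belong to SSS Mapreduce.
final class MergeSketch {

    // Hypothetical stand-in for a reduce method with two inputs.
    interface TwoInputReducer {
        void reduce(String key1, List<Integer> values1,
                    String key2, List<Integer> values2,
                    Map<String, Integer> output);
    }

    // Calls the reducer per key. handleOneSideOnly mirrors the meaning of
    // "marge_reducer.handle_one_side_only" as described in the text.
    static Map<String, Integer> merge(Map<String, List<Integer>> group1,
                                      Map<String, List<Integer>> group2,
                                      boolean handleOneSideOnly,
                                      TwoInputReducer reducer) {
        Map<String, Integer> output = new TreeMap<>();
        TreeSet<String> keys = new TreeSet<>(group1.keySet());
        keys.addAll(group2.keySet());
        for (String key : keys) {
            boolean inBoth = group1.containsKey(key) && group2.containsKey(key);
            if (!inBoth && !handleOneSideOnly) {
                continue; // default: keys present on one side only are skipped
            }
            // Assumption: the missing side is represented by an empty list.
            List<Integer> v1 = group1.getOrDefault(key, Collections.emptyList());
            List<Integer> v2 = group2.getOrDefault(key, Collections.emptyList());
            // The same key is passed twice, as in the two-input reduce method.
            reducer.reduce(key, v1, key, v2, output);
        }
        return output;
    }

    // Example reducer: sums the values from both sides under the shared key.
    static void sum(String key1, List<Integer> values1,
                    String key2, List<Integer> values2,
                    Map<String, Integer> output) {
        int total = 0;
        for (int v : values1) total += v;
        for (int v : values2) total += v;
        output.put(key1, total);
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> g1 = new TreeMap<>();
        g1.put("a", List.of(1, 2));
        g1.put("b", List.of(3));
        Map<String, List<Integer>> g2 = new TreeMap<>();
        g2.put("b", List.of(10));
        g2.put("c", List.of(20));

        System.out.println(merge(g1, g2, false, MergeSketch::sum)); // {b=13}
        System.out.println(merge(g1, g2, true, MergeSketch::sum));  // {a=3, b=13, c=20}
    }
}
```

With the flag off, only the key "b", which exists in both groups, reaches the reducer. With the flag on, "a" and "c" are processed too, so a reduce method used with this setting should be written to cope with one side carrying no values.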