|
SSS Mapreduce Programming Guide - SideData
SideData

SSS Mapreduce lets a Mapper or Reducer running at a remote site read a TupleGroup other than its input TupleGroup. You obtain a SideData object with Context#openSideData. The definitions of Context#openSideData are shown below.

    <OK extends Packable, OV extends Packable> SideData<OK, OV> openSideData(GroupID gid,
            Class<OK> keyClass,
            Class<OV> valClass)

    <OK extends Packable, OV extends Packable> SideData<OK, OV> openSideData(GroupID gid,
            Class<OK> keyClass,
            Class<OV> valClass,
            String sideDataType)

The arguments have the following meanings.
gid: identifier of the TupleGroup to read.
keyClass: class object of the key type of the TupleGroup to read.
valClass: class object of the value type of the TupleGroup to read.
sideDataType: type of SideData. When "inadvance" is specified, SideData reads all tuples first and keeps them in memory. When "ondemand" is specified, SideData reads only the keys first and reads each value on demand.

SideData is defined as follows.

    public interface SideData<K extends Packable, V extends Packable> extends Closeable {
        boolean containsKey(K key);
        List<V> get(K key) throws Exception;
        Iterable<K> keys() throws SssException;
        Iterable<Tuple<K, V>> tuples() throws SssException;
    }
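The lookup pattern that this interface supports can be sketched as follows. This is a minimal, self-contained illustration rather than code from SSS Mapreduce: `SideDataStandIn`, `InMemorySideData`, and `SideDataLookupSketch` are hypothetical names, the stand-in interface drops the `Packable` bound and the `keys`/`tuples` methods so that it compiles without the SSS library, and in a real job the object would come from Context#openSideData instead of being filled by hand.

```java
import java.io.Closeable;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Stand-in for the SSS SideData interface, reduced to the two lookup
// methods and without the Packable bound, so the sketch compiles alone.
interface SideDataStandIn<K, V> extends Closeable {
    boolean containsKey(K key);
    List<V> get(K key) throws Exception;
}

// Hypothetical in-memory implementation; in a real job the instance
// would come from Context#openSideData instead.
class InMemorySideData<K, V> implements SideDataStandIn<K, V> {
    private final Map<K, List<V>> tuples = new LinkedHashMap<>();

    // Adds one tuple; several tuples may share the same key.
    void put(K key, V value) {
        tuples.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
    }

    @Override
    public boolean containsKey(K key) {
        return tuples.containsKey(key);
    }

    @Override
    public List<V> get(K key) {
        // Assumption of this stand-in only: an absent key yields an empty list.
        List<V> values = tuples.get(key);
        return values == null ? new ArrayList<>() : new ArrayList<>(values);
    }

    @Override
    public void close() {
        // Nothing to release in the in-memory stand-in.
    }
}

class SideDataLookupSketch {
    // Joins one input key against the side data: returns every value
    // stored under that key, or an empty list when the key is absent.
    static <K, V> List<V> lookup(SideDataStandIn<K, V> sideData, K key) throws Exception {
        if (!sideData.containsKey(key)) {
            return new ArrayList<>();
        }
        return sideData.get(key);
    }

    public static void main(String[] args) throws Exception {
        // SideData extends Closeable, so try-with-resources closes it.
        try (InMemorySideData<String, Integer> sideData = new InMemorySideData<>()) {
            sideData.put("a", 1);
            sideData.put("a", 2); // same key twice: both values are kept
            sideData.put("b", 3);
            System.out.println(lookup(sideData, "a"));
            System.out.println(lookup(sideData, "z"));
        }
    }
}
```

Checking `containsKey` before calling `get` keeps the sketch independent of how the real `get` behaves for a missing key, which the guide does not state.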
containsKey: returns whether the TupleGroup contains the specified key.
get: returns the values stored under the specified key. In SSS Mapreduce several tuples can have the same key, so the return value of this method is a List.
keys: returns the set of all keys.
tuples: returns the set of all tuples.

Scaner

SideData reads tuples in a single thread, so reading is slow. SideData also supports only one kind of data structure. SSS Mapreduce therefore provides another function for reading a TupleGroup, which reads the tuples in parallel. The following method provides this function.

    <OK extends Packable, OV extends Packable> void scanSideData(GroupID gid,
            java.lang.Class<OK> keyClass,
            java.lang.Class<OV> valClass,
            TupleGroupScaner<OK, OV> scaner)
        throws SssException

The arguments have the following meanings.
gid: identifier of the TupleGroup to read.
keyClass: class object of the key type of the TupleGroup to read.
valClass: class object of the value type of the TupleGroup to read.
scaner: listener that is called whenever a tuple is read.

TupleGroupScaner is defined as follows.

    public interface TupleGroupScaner<K extends Packable, V extends Packable> {
        void set(K key, V value) throws SssException;
    }

When Context#scanSideData is executed, the set method of the scaner is called for every tuple that is read. The set method may be called from two or more threads, so synchronization is required.
| |
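One way to meet the synchronization requirement on `set` is to collect results in a thread-safe structure. The sketch below is an illustration under stated assumptions, not SSS Mapreduce code: `TupleGroupScanerStandIn`, `SummingScaner`, and `ScanSideDataSketch` are hypothetical names, the stand-in interface drops the `Packable` bound and uses `Exception` in place of `SssException` so that it compiles without the SSS library, and a thread pool imitates the parallel calls that Context#scanSideData would make.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Stand-in for TupleGroupScaner without the Packable bound and with a
// plain Exception in place of SssException, so the sketch compiles alone.
interface TupleGroupScanerStandIn<K, V> {
    void set(K key, V value) throws Exception;
}

// Sums the values seen for each key. ConcurrentHashMap#merge is atomic
// per key, so concurrent set calls do not lose updates.
class SummingScaner implements TupleGroupScanerStandIn<String, Integer> {
    private final Map<String, Integer> sums = new ConcurrentHashMap<>();

    @Override
    public void set(String key, Integer value) {
        sums.merge(key, value, Integer::sum);
    }

    Map<String, Integer> sums() {
        return sums;
    }
}

class ScanSideDataSketch {
    // Hypothetical driver that imitates a parallel scan: every worker
    // thread feeds tuples to the same shared scaner.
    static Map<String, Integer> scanInParallel(int workers, int tuplesPerWorker) throws Exception {
        SummingScaner scaner = new SummingScaner();
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (int w = 0; w < workers; w++) {
                pending.add(pool.submit(() -> {
                    for (int i = 0; i < tuplesPerWorker; i++) {
                        scaner.set(i % 2 == 0 ? "even" : "odd", 1);
                    }
                }));
            }
            for (Future<?> f : pending) {
                f.get(); // wait for the worker and surface any failure
            }
        } finally {
            pool.shutdown();
        }
        return scaner.sums();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(scanInParallel(4, 1000));
    }
}
```

A `synchronized` block around a plain `HashMap` would also satisfy the requirement; the concurrent map is used here only because it keeps the `set` method short.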