SSS Mapreduce Programming Guide - Definition of original data type
Definition of original data type

This chapter explains how to define data types that can be used as keys and values. SSS Mapreduce provides two ways to define such a data type.

The common part

First, a data type that is used as a key or value must implement the Packable interface. This interface has the following two methods.
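
As a rough illustration of this contract, the sketch below declares a stand-in Packable and a hypothetical key class, UserDayKey. The signatures `long getRoughSize()` and `int slot()` are assumptions made for the example, and the real interface may use different return types. Encoding is deliberately left out; the sketch only shows the two methods together with equals and hashCode.

```java
import java.util.Objects;

// Stand-in for the SSS Mapreduce interface so the sketch compiles on its own.
// The return types are assumptions, not the library's confirmed signatures.
interface Packable {
    long getRoughSize(); // approximate in-memory size of this instance, in bytes
    int slot();          // integer used to distribute data within a storage server
}

// Hypothetical key type: a (userId, day) pair.
final class UserDayKey implements Packable {
    private final int userId;
    private final int day;

    UserDayKey(int userId, int day) {
        this.userId = userId;
        this.day = day;
    }

    @Override
    public long getRoughSize() {
        // Assumed object overhead plus two int fields; a rough estimate is enough.
        return 16 + 2 * Integer.BYTES;
    }

    @Override
    public int slot() {
        // Simply reuses hashCode, which the guide describes as acceptable.
        return hashCode();
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof UserDayKey)) return false;
        UserDayKey other = (UserDayKey) o;
        return userId == other.userId && day == other.day;
    }

    @Override
    public int hashCode() {
        return Objects.hash(userId, day);
    }
}
```

Two instances built from the same field values compare equal and report the same hashCode and slot, which is what a key type needs.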

The getRoughSize method must return the approximate in-memory size of an instance of the class. SSS Mapreduce calls getRoughSize to obtain this size when it puts an instance into its buffers.

The slot method returns the "slot value" of the instance. The slot value is an integer used to distribute data within a storage server.

WARNING: SSS Mapreduce uses the Partitioner to distribute tuples across storage servers, and uses the slot value to distribute tuples within a storage server. If the slot value is the same as the value returned by the Partitioner, tuples are not distributed evenly on each storage server, and worker servers cannot read them in parallel.

It is fine for slot to simply return the value of the hashCode method. Because slot is only used when the class serves as a key, it may return any value if the class is used only as a value. When you use the class as a key, the class must also implement Object#equals and Object#hashCode correctly.

The method to use MessagePack

SSS Mapreduce encodes data to a byte string when writing it to storage servers. MessagePack is available in SSS Mapreduce as one encoding method. If a class implements Packable and can be encoded by MessagePack, it can be used as a key and as a value. A class can be encoded by MessagePack when the following conditions are fulfilled.

The method to define encoder/decoder

As another method, you can define your own encoder/decoder. First, implement the SelfPackable interface, which extends Packable, in your class. SelfPackable has the following methods.

Implement decoding from a byte string in the loadBytes method, and encoding to a byte string in the toBytes method.

In some cases, however, encoding/decoding relies on instances of another class. In that case, loadBytes and toBytes would have to create those instances on every call. Therefore, when a class implements SelfPackable and also has the following class methods, SSS Mapreduce uses them to create an encoder/decoder for each thread and encodes/decodes with that encoder/decoder.

    public static Encoder createEncoder();
    public static Decoder createDecoder();

It is fine to define only one of createEncoder and createDecoder. In that case, loadBytes or toBytes is used for the direction that is not defined. The definitions of Encoder and Decoder are shown below. Implement createEncoder/createDecoder so that they return an Encoder/Decoder that encodes/decodes correctly.

    public interface Encoder {
        byte[] encode(Packable p) throws IOException;
    }

    public interface Decoder {
        Packable decode(byte[] b) throws IOException;
    }
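
The following is a minimal sketch of a class that follows this second method. It is written under several assumptions: Packable and SelfPackable are redeclared as stand-ins so the code compiles on its own, the signatures `void loadBytes(byte[])` and `byte[] toBytes()` are guesses based on the description above, and TextKey is a hypothetical key type. The reusable CharsetEncoder and CharsetDecoder stand for the kind of helper instance that is worth creating once per thread instead of once per call.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

// Stand-ins for the SSS Mapreduce interfaces. The method signatures of
// Packable and SelfPackable are assumptions made for this sketch.
interface Packable {
    long getRoughSize();
    int slot();
}

interface SelfPackable extends Packable {
    void loadBytes(byte[] b) throws IOException; // decode from a byte string
    byte[] toBytes() throws IOException;         // encode to a byte string
}

interface Encoder {
    byte[] encode(Packable p) throws IOException;
}

interface Decoder {
    Packable decode(byte[] b) throws IOException;
}

// Hypothetical key type wrapping a string, stored as UTF-8 bytes.
final class TextKey implements SelfPackable {
    private String text = "";

    TextKey() {}
    TextKey(String text) { this.text = text; }

    String get() { return text; }

    @Override public long getRoughSize() { return 16 + 2L * text.length(); }
    @Override public int slot() { return hashCode(); }

    // Per-call paths: simple, but allocate helper objects on every call.
    @Override public void loadBytes(byte[] b) { text = new String(b, StandardCharsets.UTF_8); }
    @Override public byte[] toBytes() { return text.getBytes(StandardCharsets.UTF_8); }

    // Factory for an encoder that holds one reusable CharsetEncoder.
    // A CharsetEncoder is not thread-safe, hence one Encoder per thread.
    public static Encoder createEncoder() {
        final CharsetEncoder enc = StandardCharsets.UTF_8.newEncoder();
        return p -> {
            ByteBuffer buf = enc.encode(CharBuffer.wrap(((TextKey) p).text));
            byte[] out = new byte[buf.remaining()];
            buf.get(out);
            return out;
        };
    }

    // Factory for a decoder that holds one reusable CharsetDecoder.
    public static Decoder createDecoder() {
        final CharsetDecoder dec = StandardCharsets.UTF_8.newDecoder();
        return b -> new TextKey(dec.decode(ByteBuffer.wrap(b)).toString());
    }

    @Override public boolean equals(Object o) {
        return o instanceof TextKey && text.equals(((TextKey) o).text);
    }

    @Override public int hashCode() { return text.hashCode(); }
}
```

In this sketch the factory-created encoder produces the same bytes as toBytes, so either path can read what the other wrote. A real implementation should keep the two paths consistent in the same way if only one of the factory methods is defined.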