Introduction

Different forms of caching are used at each stage of processing in the map/reduce architecture.

Machine-level Caching

At the machine level, DisCo provides map and reduce tasks with a connection to a memcached cluster. Memcached functions as a fast, distributed hash table: a map task that trains a neural network can optionally store the updated weights for each network layer as it processes training instances. These updated weights are keyed by node name, so future invocations of the map task can opt to use the most recent layer weights. In a more complex algorithm, a node may choose to use the layer weights generated by a different node in the cluster, potentially selected by some arbitrary fitness function or by a graph traversal pattern. Serializing layer weights from a NumPy array into memcached, and loading them back, is accomplished in a form similar to:

memc = params['memc']
# Store new weights
memc.set('node01/weights', weights.dumps())
# Load previous weights
params['weights'] = numpy.loads(memc.get('node01/weights'))

GPU-level Caching

To execute kernels most efficiently on data held in GPU memory, shared memory inside the GPU is allocated for intermediate results (for example, the sub-blocks used to compute a matrix multiplication). Here the caching strategy must be determined beforehand; it is built into each kernel that uses shared memory. In the matrix multiplication kernel, the sub-blocks are allocated in shared memory for a block of threads, each of which handles one element of the output matrix.

// Declaration of the shared memory array As used to
// store the sub-matrix of A
__shared__ float As[BLOCK_SIZE][BLOCK_SIZE];

// Declaration of the shared memory array Bs used to
// store the sub-matrix of B
__shared__ float Bs[BLOCK_SIZE][BLOCK_SIZE];

Job-level Caching

The results of each map/reduce job are stored in a temporary file served over HTTP by the DisCo master node. This constitutes a form of caching, but it cannot be controlled by the user and does not affect the performance of data mining using DisCo.
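As an illustration of the machine-level scheme described under Machine-level Caching, the following is a minimal sketch of how a map task might publish its weights and then adopt the weights of the fittest node. It is not DisCo's implementation. The InMemoryCache class is a hypothetical stand-in for a memcached client that exposes only set and get, and the helper functions, the '<node>/fitness' key, and the fitness values are all assumptions made for this example.

```python
import pickle


class InMemoryCache:
    """Hypothetical stand-in for a memcached client (set/get only)."""

    def __init__(self):
        self._store = {}

    def set(self, key, value):
        self._store[key] = value
        return True

    def get(self, key):
        # Return None on a miss, as memcached clients commonly do.
        return self._store.get(key)


def store_weights(memc, node, weights, fitness):
    """Publish a node's layer weights and an (assumed) fitness score."""
    memc.set(node + '/weights', pickle.dumps(weights))
    memc.set(node + '/fitness', pickle.dumps(fitness))


def load_weights(memc, node):
    """Return the cached weights for node, or None if nothing is cached."""
    blob = memc.get(node + '/weights')
    return None if blob is None else pickle.loads(blob)


def fittest_node(memc, nodes):
    """Pick the node with the highest cached fitness, or None if no scores."""
    scored = []
    for node in nodes:
        blob = memc.get(node + '/fitness')
        if blob is not None:
            scored.append((pickle.loads(blob), node))
    return max(scored)[1] if scored else None


memc = InMemoryCache()
store_weights(memc, 'node01', [[0.1, 0.2], [0.3, 0.4]], fitness=0.71)
store_weights(memc, 'node02', [[0.5, 0.6], [0.7, 0.8]], fitness=0.93)

# node03 has published nothing yet, so it is skipped during selection.
donor = fittest_node(memc, ['node01', 'node02', 'node03'])
weights = load_weights(memc, donor)
```

Plain nested lists are used here to keep the sketch free of dependencies; NumPy arrays can be pickled in the same way. In a real job, the stand-in cache would be replaced by the memcached connection that DisCo passes in through params['memc'].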