mrs-mapreduce - issue #12

Reduce the amount of data sent between processes


Posted on Oct 24, 2012 by Happy Camel

Processes communicate through multiprocessing, so every object sent must be pickled, written to a pipe, and read back on the other side. Sending an empty dataset should not send a large number of empty buckets; doing so is inefficient. Other unnecessary data may also be sent over pipes and could be streamlined.
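
One possible approach, sketched below, is to pickle only the non-empty buckets together with the total bucket count, and rebuild the empty ones on the receiving side. This is a sketch rather than Mrs code: `pack_buckets` and `unpack_buckets` are hypothetical names, and it assumes a dataset can be treated as a list of buckets where each bucket is a list of key-value pairs.

```python
import pickle


def pack_buckets(buckets):
    """Pickle a list of buckets, omitting the empty ones.

    Hypothetical helper: assumes each bucket is a list of (key, value) pairs.
    """
    # Keep only the buckets that hold data, keyed by their position.
    sparse = {i: b for i, b in enumerate(buckets) if b}
    # The total count lets the receiver recreate the empty buckets.
    return pickle.dumps((len(buckets), sparse), protocol=pickle.HIGHEST_PROTOCOL)


def unpack_buckets(payload):
    """Rebuild the full list of buckets from a packed payload."""
    total, sparse = pickle.loads(payload)
    # Each missing index gets its own fresh empty list.
    return [sparse.get(i, []) for i in range(total)]


if __name__ == "__main__":
    empty_dataset = [[] for _ in range(1000)]
    dense_size = len(pickle.dumps(empty_dataset, protocol=pickle.HIGHEST_PROTOCOL))
    sparse_size = len(pack_buckets(empty_dataset))
    print("dense bytes:", dense_size, "sparse bytes:", sparse_size)
```

With this layout the payload for an empty dataset stays small regardless of how many buckets it has. The trade-off is an extra dictionary lookup per bucket when unpacking.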

Status: Accepted

Labels:
Type-Defect Priority-Medium