Code license: Apache License 2.0
Labels: compression, RLE, Java, BitSet, bitmaps
Featured downloads:
JavaEWAH_20090203.zip
Project owners:
  lemire

This is a word-aligned compressed variant of the Java BitSet class. It uses a 64-bit RLE-like (run-length encoding) compression scheme. It is a Java port of some of the C++ code from the lemurbitmapindex project (http://code.google.com/p/lemurbitmapindex/).
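
To give a rough feel for what "word-aligned RLE-like" means, here is a minimal sketch using only the standard library. It is not the EWAH format and not this project's API: the class name WordAlignedRleSketch and its Segment type are hypothetical, invented for illustration. It only shows the general idea of collapsing runs of identical all-zero or all-one 64-bit words while leaving mixed words verbatim. The actual word layout is described in the paper cited below.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical illustration only: NOT the EWAH format and NOT this project's API.
public class WordAlignedRleSketch {

    /** Either a run of `count` identical clean words (all 0s or all 1s) or one literal word. */
    static final class Segment {
        final boolean clean; // true for a run of all-zero or all-one words
        final long word;     // the repeated clean word, or the literal word itself
        final int count;     // run length in 64-bit words (always 1 for a literal)

        Segment(boolean clean, long word, int count) {
            this.clean = clean;
            this.word = word;
            this.count = count;
        }
    }

    /** Collapses runs of identical clean words; mixed ("dirty") words are kept verbatim. */
    static List<Segment> compress(long[] words) {
        List<Segment> out = new ArrayList<>();
        int i = 0;
        while (i < words.length) {
            long w = words[i];
            if (w == 0L || w == -1L) {
                int j = i;
                while (j < words.length && words[j] == w) {
                    j++;
                }
                out.add(new Segment(true, w, j - i));
                i = j;
            } else {
                out.add(new Segment(false, w, 1));
                i++;
            }
        }
        return out;
    }

    /** Expands the segments back into the original array of 64-bit words. */
    static long[] decompress(List<Segment> segments) {
        int total = 0;
        for (Segment s : segments) {
            total += s.count;
        }
        long[] words = new long[total];
        int pos = 0;
        for (Segment s : segments) {
            for (int k = 0; k < s.count; k++) {
                words[pos++] = s.word;
            }
        }
        return words;
    }

    public static void main(String[] args) {
        BitSet bits = new BitSet();
        bits.set(3);
        bits.set(100000);
        long[] words = bits.toLongArray(); // uncompressed 64-bit words
        List<Segment> segments = compress(words);
        System.out.println(words.length + " words -> " + segments.size() + " segments");
    }
}
```

Because everything stays aligned on 64-bit word boundaries, a long run of empty words can be handled in one step rather than word by word, which is where the CPU savings of word-aligned schemes come from.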

The goal of word-aligned compression is not to achieve the best possible compression ratio, but to improve query processing time. We therefore try to save CPU cycles, possibly at the expense of storage. Even so, the EWAH scheme we implemented is always more efficient storage-wise than an uncompressed bitmap (as implemented in the Java BitSet class by Sun).

Unlike some alternatives (including the similar compressedbitset project, http://code.google.com/p/compressedbitset/), it does not rely on a patented scheme (to D. Lemire's knowledge).

For better performance, use a 64-bit JVM on a 64-bit CPU.

For more details, see the following paper:

Daniel Lemire, Owen Kaser, Kamel Aouiche, "Sorting improves word-aligned bitmap indexes", http://arxiv.org/abs/0901.3751