php-bloom-filter


a simple bloom filter implementation in PHP

INTRODUCTION

This is a simple class which implements bloom filter in PHP. It is probably not useful for any particular purpose out-of-the-box because it does not deal with storage issues. Feel free to hack it to suit your needs or drop a ticket if you need a specific feature.

A Bloom filter is a simple space-efficient randomized data structure for representing a set in order to support membership queries. Bloom filters have a great deal of potential for distributed protocols where systems need to share information about what data they have available e.g. for web cache sharing.

For a nice visual demo of how bloom filters work see http://la.ma.la/misc/js/bloomfilter/.

For more information on the theory behind bloom filters read this paper: http://www.box.net/shared/zvmgyyixe3

DOWNLOAD

There is not download. For now get the source from SVN.

REQUIREMENTS

You need pecl/bitset http://pecl.php.net/package/Bitset.

SIMPLE USE

``` //initialize the bloom filter $bf = new BloomFilter(3000000);

$f = fopen('/usr/share/dict/words','r'); while(!feof($f)) $bf->add(trim(fgets($f)));

//write the bit field to file for later use file_put_contents('bloom',$bf->field);

//initialize the bloom filter from file $bf = BloomFilter::init(file_get_contents('bloom'));

//test some membership queries $test_words = array('this','library','does','bloom','filters','in','php');

foreach ($test_words as $word) if ($bf->has($word)) echo $word."\n"; ```

Project Information

Labels:
php bloomfilter algorithm