bloombroom 1.0.0 → 1.2.0

@@ -0,0 +1,12 @@
+ *.gem
+ *.bundle
+ *.o
+ .rvmrc
+ Makefile
+ Gemfile.lock
+ pkg/*
+ .ruby-version
+ tmp/
+ vendor/
+ .vagrant
+ Vagrantfile
@@ -0,0 +1,6 @@
+ language: ruby
+ rvm:
+ - 1.9.3
+ - jruby-19mode
+ - jruby-head
+
@@ -1 +1,11 @@
- tbd
+ # 1.0.0, 05-09-2012
+ - initial release
+
+ # 1.1.x
+ - bad gems, yanked from Rubygems
+
+ # 1.2.0, 02-27-2013
+ - refactored to use ffi-compiler
+ - now only using FFI FNV hashing, removed all other implementations
+ - test on JRuby 1.7.3 and MRI Ruby 2.0.0
+ - FFI performance improvements
data/Gemfile ADDED
@@ -0,0 +1,4 @@
+ source "https://rubygems.org"
+
+ # Specify your gem's dependencies in bloombroom.gemspec
+ gemspec
data/README.md CHANGED
@@ -1,10 +1,12 @@
- # Bloombroom v1.0.0
+ # Bloombroom v1.2.0
+
+ [![build status](https://secure.travis-ci.org/colinsurprenant/bloombroom.png)](http://travis-ci.org/colinsurprenant/bloombroom)

  - Standard **Bloomfilter** class for bounded key space
  - **ContinuousBloomfilter** class for unbounded keys (**stream**)
- - Bitfield class
- - BitBucketField class (multi bits)
- - native, C & FFI extensions FNV hash classes
+ - Bitfield class (single bit field)
+ - BitBucketField class (multiple bit fields)
+ - Fast FNV hashing using a C implementation with FFI bindings

  The Bloom filter is a space-efficient probabilistic data structure that is used to test whether an element is a member of a set. False positives are possible, but false negatives are not. See [wikipedia](http://en.wikipedia.org/wiki/Bloom_filter).
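A minimal usage sketch, assembled from the APIs exercised by the benchmark scripts added later in this diff (`Bloombroom::BloomHelper.find_m_k`, `BloomFilter#add`/`#include?`, `ContinuousBloomFilter#inc_time_slot`). It is illustrative only, not the gem's official usage documentation:

``` ruby
require "bloombroom"

# size the filter for an expected capacity and a target false positive rate
m, k = Bloombroom::BloomHelper.find_m_k(10_000, 0.001)

bf = Bloombroom::BloomFilter.new(m, k)
bf.add("some key")
bf.include?("some key")   # => true
bf.include?("other key")  # => false (modulo the ~0.1% false positive rate)

# streaming variant; the benchmarks pass 0 as the third constructor argument
# and advance the internal clock explicitly with inc_time_slot
cbf = Bloombroom::ContinuousBloomFilter.new(m, k, 0)
cbf.add("some key")
cbf.inc_time_slot
cbf.include?("some key")
```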
 
@@ -21,16 +23,15 @@ The internal timer resolution is set to half of the required TTL (resolution div
  ring buffer using the current timer tick modulo 15. The timer ticks will be time slot=1, 2, ... 15, 1, 2 and so on. The total
  time of our internal clock will thus be 15 * (TTL / 2). We keep track of TTL by writing the current time slot
  in the key k buckets when inserted in the filter. For a key lookup, if the interval between the current time slot and any of the k bucket values
- is greater than 2 (resolution divisor) we know this key is expired. See [continuous_bloom_filter.rb](https://github.com/colinsurprenant/bloombroom/blob/master/lib/bloombroom/filter/continuous_bloom_filter.rb)
+ is greater than 2 (resolution divisor) we know this key is expired. See my [Continuous Bloom filter](http://colinsurprenant.com/blog/2012/05/12/continuous-bloom-filter/) blog post about this.

  This means that an element is guaranteed to not be expired before the given TTL but in the worst case could survive until 3 * (TTL / 2).
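A rough sketch of the expiry rule described above, assuming the 15-slot ring and resolution divisor of 2 from the paragraph. The names and structure are hypothetical; the real logic lives in continuous_bloom_filter.rb:

``` ruby
# hypothetical illustration of the described expiry rule, not the gem's code
RESOLUTION_DIVISOR = 2   # internal timer resolution is TTL / 2
RING_SLOTS = 15          # time slots 1..15 written into the key's k buckets

# ticks elapsed between the slot stored in a bucket and the current slot,
# accounting for wrap-around on the 15-slot ring
def elapsed_ticks(current_slot, stored_slot)
  current_slot >= stored_slot ? current_slot - stored_slot : RING_SLOTS - stored_slot + current_slot
end

# a key is treated as expired once a bucket was last written more than
# 2 ticks (i.e. more than one full TTL) before the current slot
def expired?(current_slot, stored_slot)
  elapsed_ticks(current_slot, stored_slot) > RESOLUTION_DIVISOR
end
```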
 
  ### Hashing
  Bloom filters require the use of multiple (k) hash functions for each inserted element. We simulate multiple hash functions using just two, which are the upper and lower 32 bits of our FFI FNV1a 64 bit hash function. Double hashing with one hash function. Very very fast. See [bloom_helper.rb](https://github.com/colinsurprenant/bloombroom/blob/master/lib/bloombroom/filter/bloom_helper.rb) and the [references](#references) section for more info on this technique.

-
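The technique is classic double hashing (see the Kirsch and Mitzenmacher paper in the references): one 64 bit hash is split into two 32 bit halves which are combined to produce the k bucket positions. A minimal pure-Ruby sketch, with an inline FNV1a standing in for the gem's FFI hash; the exact combination used in bloom_helper.rb may differ:

``` ruby
# illustrative sketch of double hashing, not the gem's implementation
FNV64_PRIME  = 0x100000001b3
FNV64_OFFSET = 0xcbf29ce484222325

# pure-Ruby FNV1a 64 stand-in for the gem's FFI hash function
def fnv1a_64(key)
  key.each_byte.inject(FNV64_OFFSET) do |h, byte|
    ((h ^ byte) * FNV64_PRIME) & 0xFFFFFFFFFFFFFFFF
  end
end

# derive k bucket positions from a single 64 bit hash:
# h1 = upper 32 bits, h2 = lower 32 bits, position_i = (h1 + i * h2) mod m
def k_positions(key, m, k)
  h = fnv1a_64(key)
  h1, h2 = h >> 32, h & 0xFFFFFFFF
  (0...k).map { |i| (h1 + i * h2) % m }
end

k_positions("some key", 862_656, 10) # => 10 bucket indexes
```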
  ## Installation
- tested in both MRI Ruby 1.9.2, 1.9.3 and JRuby 1.6.7 in 1.9 mode.
+ tested on MRI Ruby 1.9.2, 1.9.3, 2.0 and JRuby 1.7.3.

  ``` sh
  $ gem install bloombroom
@@ -113,45 +114,67 @@ ruby benchmark/continuous_bloom_filter_memory.rb auto 100000000 0.001
  - **1.0%** error rate for **100M** keys: **914mb**
  - **0.1%** error rate for **100M** keys: **1371mb**

+ ## Simulation
+ This is an input stream simulation into the ContinuousBloomfilter. First, a series of 32 x 20k random unique insertion keys & 20k random unique test keys not part of the insertion set are generated. At each iteration, 20k insertion keys are added, 20k test keys are checked for inclusion and the internal timer tick is incremented. Since the life of our keys is 3 timer ticks, we chose a filter capacity of 3 x 20k elements. Specific m and k parameters will be computed for an error rate of 0.1% and 3 x 20k capacity (the sizing math is sketched after the output below).

- ## Benchmarks
- All benchmarks have been run on a MacbookPro with a 2.66GHz i7 with 8GB RAM on OSX 10.6.8 with MRI Ruby 1.9.3p194
-
- ### Hashing
- The Hashing benchmark compares the performance of SHA1, MD5, two native Ruby FNV (A & B) implementations, a C implementation as a C extension and FFI extension for 32 and 64 bits hashes.
+ We see that as we add more keys, the test keys' false positive rate is stable at the required error rate. In the second section, the same sequence is applied to a standard Bloomfilter to show that, obviously, the error rate will increase as more elements are added past the required capacity.

  ``` sh
- ruby benchmark/fnv.rb
- ```
-
- ```
- benchmarking for 1000000 iterations
- user system total real
- MD5: 1.900000 0.010000 1.910000 ( 1.912995)
- SHA-1: 2.110000 0.000000 2.110000 ( 2.109739)
- native FNV A 32: 32.470000 0.110000 32.580000 ( 32.596759)
- native FNV A 64: 38.330000 0.570000 38.900000 ( 38.923384)
- native FNV B 32: 4.870000 0.020000 4.890000 ( 4.882862)
- native FNV B 64: 37.700000 0.110000 37.810000 ( 37.842873)
- ffi FNV 32: 0.760000 0.010000 0.770000 ( 0.754941)
- ffi FNV 64: 0.890000 0.000000 0.890000 ( 0.901954)
- c-ext FNV 32: 0.310000 0.000000 0.310000 ( 0.307131)
- c-ext FNV 64: 0.480000 0.000000 0.480000 ( 0.485310)
-
- MD5: 522740 ops/s
- SHA-1: 473992 ops/s
- native FNV A 32: 30678 ops/s
- native FNV A 64: 25691 ops/s
- native FNV B 32: 204798 ops/s
- native FNV B 64: 26425 ops/s
- ffi FNV 32: 1324607 ops/s
- ffi FNV 64: 1108704 ops/s
- c-ext FNV 32: 3255939 ops/s
- c-ext FNV 64: 2060538 ops/s
+ ruby benchmark/continuous_bloom_filter_stats.rb
  ```

+ ```
+ generating lots of random keys
+
+ Continuous BloomFilter with capacity=60000, error=0.001(0.1%) -> m=862656, k=10
+ added 20000 keys, tested 20000 keys, FPs=0/20000 (0.000)%
+ added 20000 keys, tested 20000 keys, FPs=1/20000 (0.005)%
+ added 20000 keys, tested 20000 keys, FPs=17/20000 (0.085)%
+ added 20000 keys, tested 20000 keys, FPs=20/20000 (0.100)%
+ added 20000 keys, tested 20000 keys, FPs=23/20000 (0.115)%
+ added 20000 keys, tested 20000 keys, FPs=22/20000 (0.110)%
+ added 20000 keys, tested 20000 keys, FPs=22/20000 (0.110)%
+ added 20000 keys, tested 20000 keys, FPs=17/20000 (0.085)%
+ added 20000 keys, tested 20000 keys, FPs=18/20000 (0.090)%
+ added 20000 keys, tested 20000 keys, FPs=21/20000 (0.105)%
+ added 20000 keys, tested 20000 keys, FPs=11/20000 (0.055)%
+ added 20000 keys, tested 20000 keys, FPs=17/20000 (0.085)%
+ added 20000 keys, tested 20000 keys, FPs=18/20000 (0.090)%
+ added 20000 keys, tested 20000 keys, FPs=19/20000 (0.095)%
+ added 20000 keys, tested 20000 keys, FPs=21/20000 (0.105)%
+ added 20000 keys, tested 20000 keys, FPs=20/20000 (0.100)%
+ added 20000 keys, tested 20000 keys, FPs=24/20000 (0.120)%
+ added 20000 keys, tested 20000 keys, FPs=21/20000 (0.105)%
+ added 20000 keys, tested 20000 keys, FPs=22/20000 (0.110)%
+ added 20000 keys, tested 20000 keys, FPs=24/20000 (0.120)%
+ added 20000 keys, tested 20000 keys, FPs=15/20000 (0.075)%
+ added 20000 keys, tested 20000 keys, FPs=16/20000 (0.080)%
+ added 20000 keys, tested 20000 keys, FPs=16/20000 (0.080)%
+ added 20000 keys, tested 20000 keys, FPs=17/20000 (0.085)%
+ added 20000 keys, tested 20000 keys, FPs=22/20000 (0.110)%
+ added 20000 keys, tested 20000 keys, FPs=21/20000 (0.105)%
+ added 20000 keys, tested 20000 keys, FPs=24/20000 (0.120)%
+ added 20000 keys, tested 20000 keys, FPs=16/20000 (0.080)%
+ added 20000 keys, tested 20000 keys, FPs=17/20000 (0.085)%
+ added 20000 keys, tested 20000 keys, FPs=24/20000 (0.120)%
+ added 20000 keys, tested 20000 keys, FPs=17/20000 (0.085)%
+ added 20000 keys, tested 20000 keys, FPs=19/20000 (0.095)%
+ Continuous BloomFilter 640000 adds + 640000 tests in 16.95s, 75537 ops/s
+
+ BloomFilter with capacity=60000, error=0.001(0.1%) -> m=862656, k=10
+ added 20000 keys, tested 20000 keys, FPs=0/20000 (0.000)%
+ added 20000 keys, tested 20000 keys, FPs=1/20000 (0.005)%
+ added 20000 keys, tested 20000 keys, FPs=17/20000 (0.085)%
+ added 20000 keys, tested 20000 keys, FPs=131/20000 (0.655)%
+ added 20000 keys, tested 20000 keys, FPs=453/20000 (2.265)%
+ added 20000 keys, tested 20000 keys, FPs=1162/20000 (5.810)%
+ BloomFilter 120000 adds + 120000 tests in 1.64s, 146008 ops/s
+ ```
+
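For reference, the m=862656, k=10 sizing shown in the output above is consistent with the standard Bloom filter formulas. A quick sketch follows; BloomHelper.find_m_k is presumed to use these or equivalent formulas, and its exact rounding is not shown in this diff:

``` ruby
# standard Bloom filter sizing: m = -n * ln(p) / (ln 2)^2, k = (m / n) * ln 2
n = 60_000  # capacity: 3 time slots x 20k keys
p = 0.001   # target error rate (0.1%)

m = (-n * Math.log(p) / Math.log(2)**2).ceil
k = (Math.log(2) * m / n).round

puts "m=#{m}, k=#{k}"  # => m=862656, k=10
```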
+ ## Benchmarks
+ All benchmarks have been run on a MacBook Pro with a 2.66GHz i7 and 8GB RAM on OSX 10.6.8 with MRI Ruby 1.9.3p194.
+
  ### Bloomfilter
- The Bloomfilter class is using the FFI FNV hashing by default, for speed and compatibility.

  ``` sh
  ruby benchmark/bloom_filter.rb
@@ -176,7 +199,6 @@ BloomFilter m=2875518, k=13 include? 119154 ops/s
  ```

  ### ContinuousBloomfilter
- The ContinuousBloomfilter class is using the FFI FNV hashing by default, for speed and compatibility.

  ``` sh
  ruby benchmark/continuous_bloom_filter.rb
@@ -211,6 +233,10 @@ ContinuousBloomFilter m=2875518, k=13 add+include 56606 ops/s
  ```

  ## JRuby
+ This has only been tested in Ruby **1.9** mode. JRuby 1.9 mode has to be enabled to run the tests and benchmarks.
+
+ Note that this is no longer necessary with JRuby 1.7, which runs in 1.9 mode by default.
+
  - to run specs use

  ``` sh
@@ -222,7 +248,7 @@ jruby --1.9 -S rake spec
  jruby --1.9 benchmark/some_benchmark.rb
  ```

- <a id="reference" />
+ <a id="references" />
  ## References ##
  - [Bloom filter on wikipedia](http://en.wikipedia.org/wiki/Bloom_filter)
  - [Scalable Datasets: Bloom Filters in Ruby](http://www.igvita.com/2008/12/27/scalable-datasets-bloom-filters-in-ruby/)
@@ -232,11 +258,15 @@ jruby --1.9 benchmark/some_benchmark.rb
  - [Producing n hash functions by hashing only once](http://willwhim.wordpress.com/2011/09/03/producing-n-hash-functions-by-hashing-only-once/)
  - [Less Hashing, Same Performance: Building a Better Bloom Filter](http://citeseer.ist.psu.edu/viewdoc/download?doi=10.1.1.152.579&rep=rep1&type=pdf)

+ ## Credits
+ - [Ilya Grigorik](http://www.igvita.com/) for his inspiration with the [Time-based Bloom filters](http://www.igvita.com/2010/01/06/flow-analysis-time-based-bloom-filters/).
+ - Authors of the [Stable Bloom filters research paper](http://webdocs.cs.ualberta.ca/~drafiei/papers/DupDet06Sigmod.pdf) which also provided inspiration.
+ - [Robey Pointer](https://github.com/robey) for his [Ruby FNV C extension implementation](https://github.com/robey/rbfnv).
+ - [Peter Cooper](http://www.petercooper.co.uk/) for inspiration with [his BitField class](http://dzone.com/snippets/bitfield-fastish-pure-ruby-bit).
+
  ## Author
- Colin Surprenant, [@colinsurprenant][twitter], [http://github.com/colinsurprenant][github], colin.surprenant@needium.com, colin.surprenant@gmail.com
+ Colin Surprenant, [@colinsurprenant](http://twitter.com/colinsurprenant), [http://github.com/colinsurprenant](http://github.com/colinsurprenant), colin.surprenant@gmail.com

  ## License
  Bloombroom is distributed under the Apache License, Version 2.0.

- [twitter]: http://twitter.com/colinsurprenant
- [github]: http://github.com/colinsurprenant
@@ -0,0 +1,23 @@
+ require 'bundler/setup'
+ require 'rake'
+ require 'rake/clean'
+ require 'bundler/gem_tasks'
+ require 'rspec/core/rake_task'
+ require 'ffi'
+ require 'ffi-compiler/compile_task'
+
+ task :default => [:clean, :compile_ffi, :spec]
+
+ desc "clean, make and run specs"
+ task :spec do
+   RSpec::Core::RakeTask.new
+ end
+
+ desc "FFI compiler"
+ namespace "ffi-compiler" do
+   FFI::Compiler::CompileTask.new('ffi/bloombroom/hash/ffi_fnv')
+ end
+ task :compile_ffi => ["ffi-compiler:default"]
+
+ CLEAN.include('ffi/**/*{.o,.log,.so,.bundle}')
+ CLEAN.include('lib/**/*{.o,.log,.so,.bundle}')
@@ -0,0 +1,35 @@
+ require 'bundler/setup'
+ require "benchmark"
+ require "digest/sha1"
+ require "bloombroom"
+
+ KEYS_COUNT = 150000
+ ERRORS = [0.01, 0.001, 0.0001]
+ TEST_M_K = ERRORS.map{|error| Bloombroom::BloomHelper.find_m_k(KEYS_COUNT, error)}
+
+ keys = KEYS_COUNT.times.map{|i| Digest::SHA1.hexdigest("#{i}#{rand(1000000)}")}
+
+ if !!(defined?(RUBY_ENGINE) && RUBY_ENGINE == 'jruby')
+   puts("warming JVM...")
+   bf = Bloombroom::BloomFilter.new(KEYS_COUNT, 7)
+   keys.each{|key| bf.add(key)}
+ end
+
+ puts("benchmarking for #{keys.size} keys with #{ERRORS.map{|e| "#{e * 100}%"}.join(", ")} error rates")
+
+ reports = []
+ Benchmark.bm(40) do |x|
+   TEST_M_K.each do |m, k|
+     bf = Bloombroom::BloomFilter.new(m, k)
+     adds = x.report("BloomFilter m=#{"%07.0f" % m}, k=#{"%02.0f" % k} add") {keys.each{|key| bf.add(key)}}
+     includes = x.report("BloomFilter m=#{"%07.0f" % m}, k=#{"%02.0f" % k} include?") {keys.each{|key| bf.include?(key)}}
+     reports << {:m => m, :k => k, :adds => adds, :includes => includes}
+   end
+ end
+
+ puts("\n")
+
+ reports.each do |report|
+   puts("BloomFilter m=#{"%07.0f" % report[:m]}, k=#{"%02.0f" % report[:k]} add #{"%10.0f" % (keys.size / report[:adds].real)} ops/s")
+   puts("BloomFilter m=#{"%07.0f" % report[:m]}, k=#{"%02.0f" % report[:k]} include? #{"%10.0f" % (keys.size / report[:includes].real)} ops/s")
+ end
@@ -0,0 +1,28 @@
+ require 'bundler/setup'
+ require 'bloombroom'
+ require 'benchmark/memory'
+
+ DEFAULT_M = 10000000
+ DEFAULT_K = 1
+ DEFAULT_CAPACITY = 1000000
+ DEFAULT_ERROR = 0.01
+
+ m,k = if ARGV[0] == "auto"
+   ARGV.shift
+   capacity = (ARGV.shift || DEFAULT_CAPACITY).to_i
+   error = (ARGV.shift || DEFAULT_ERROR).to_f
+   Bloombroom::BloomHelper.find_m_k(capacity, error)
+ else
+   m = (ARGV.shift || DEFAULT_M).to_i
+   k = (ARGV.shift || DEFAULT_K).to_i
+   [m ,k]
+ end
+
+ puts("bloomfilter m=#{m}, k=#{k}, size=#{m} bits / #{"%.1f" % ((m / 8) / 1024.0)}k")
+
+ before = Bloombroom::Process.rss
+ bf = Bloombroom::BloomFilter.new(m, k)
+ after = Bloombroom::Process.rss
+
+ puts("process size before=#{before}k, after=#{after}k")
+ puts("process size growth=#{(after - before)}k" )
@@ -0,0 +1,60 @@
+ require 'bundler/setup'
+ require "benchmark"
+ require "digest/sha1"
+ require "bloombroom"
+
+ KEYS_COUNT = 150000
+ ERRORS = [0.01, 0.001, 0.0001]
+ TEST_M_K = ERRORS.map{|error| Bloombroom::BloomHelper.find_m_k(KEYS_COUNT, error)}
+
+ keys = KEYS_COUNT.times.map{|i| Digest::SHA1.hexdigest("#{i}#{rand(1000000)}")}
+ slots = 10.times.map{|i| (KEYS_COUNT / 3).times.map{|i| Digest::SHA1.hexdigest("#{i}#{rand(1000000)}")}}
+
+ if !!(defined?(RUBY_ENGINE) && RUBY_ENGINE == 'jruby')
+   puts("warming JVM...")
+   bf = Bloombroom::ContinuousBloomFilter.new(*Bloombroom::BloomHelper.find_m_k(KEYS_COUNT, 0.001), 0)
+   keys.each{|key| bf.add(key)}
+ end
+
+ puts("benchmarking WITHOUT expiration for #{keys.size} keys with #{ERRORS.map{|e| "#{e * 100}%"}.join(", ")} error rates")
+
+ reports = []
+ Benchmark.bm(53) do |x|
+   TEST_M_K.each do |m, k|
+     bf = Bloombroom::ContinuousBloomFilter.new(m, k, 0)
+     adds = x.report("ContinuousBloomFilter m=#{"%07.0f" % m}, k=#{"%02.0f" % k} add") {keys.each{|key| bf.add(key)}}
+     includes = x.report("ContinuousBloomFilter m=#{"%07.0f" % m}, k=#{"%02.0f" % k} include?") {keys.each{|key| bf.include?(key)}}
+     reports << {:m => m, :k => k, :adds => adds, :includes => includes}
+   end
+ end
+
+ puts("\n")
+
+ reports.each do |report|
+   puts("ContinuousBloomFilter m=#{"%07.0f" % report[:m]}, k=#{"%02.0f" % report[:k]} add #{"%10.0f" % (keys.size / report[:adds].real)} ops/s")
+   puts("ContinuousBloomFilter m=#{"%07.0f" % report[:m]}, k=#{"%02.0f" % report[:k]} include? #{"%10.0f" % (keys.size / report[:includes].real)} ops/s")
+ end
+
+ puts("\nbenchmarking WITH expiration for #{slots.map(&:size).reduce(&:+)} keys with #{ERRORS.map{|e| "#{e * 100}%"}.join(", ")} error rates")
+
+ reports = []
+ Benchmark.bm(53) do |x|
+   TEST_M_K.each do |m, k|
+     bf = Bloombroom::ContinuousBloomFilter.new(m, k, 0)
+     addincludes = x.report("ContinuousBloomFilter m=#{"%07.0f" % m}, k=#{"%02.0f" % k} add+include") do
+       slots.each do |slot|
+         slot.each{|key| bf.add(key)}
+         slot.each{|key| bf.include?(key)}
+         bf.inc_time_slot
+       end
+     end
+
+     reports << {:m => m, :k => k, :addincludes => addincludes}
+   end
+ end
+
+ puts("\n")
+
+ reports.each do |report|
+   puts("ContinuousBloomFilter m=#{"%07.0f" % report[:m]}, k=#{"%02.0f" % report[:k]} add+include #{"%10.0f" % (slots.map(&:size).reduce(&:+) * 2 / report[:addincludes].real)} ops/s")
+ end
@@ -0,0 +1,28 @@
+ require 'bundler/setup'
+ require 'benchmark/memory'
+ require "bloombroom"
+
+ DEFAULT_M = 10000000
+ DEFAULT_K = 1
+ DEFAULT_CAPACITY = 1000000
+ DEFAULT_ERROR = 0.01
+
+ m,k = if ARGV[0] == "auto"
+   ARGV.shift
+   capacity = (ARGV.shift || DEFAULT_CAPACITY).to_i
+   error = (ARGV.shift || DEFAULT_ERROR).to_f
+   Bloombroom::BloomHelper.find_m_k(capacity, error)
+ else
+   m = (ARGV.shift || DEFAULT_M).to_i
+   k = (ARGV.shift || DEFAULT_K).to_i
+   [m ,k]
+ end
+
+ puts("continuous bloomfilter m=#{m}, k=#{k}, size=#{m * Bloombroom::ContinuousBloomFilter::BITS_PER_BUCKET} bits / #{"%.1f" % (((m * Bloombroom::ContinuousBloomFilter::BITS_PER_BUCKET) / 8) / 1024.0)}k")
+
+ before = Bloombroom::Process.rss
+ bf = Bloombroom::ContinuousBloomFilter.new(m, k, 0)
+ after = Bloombroom::Process.rss
+
+ puts("process size before=#{before}k, after=#{after}k")
+ puts("process size growth=#{(after - before)}k" )
@@ -0,0 +1,63 @@
+ require 'bundler/setup'
+ require "benchmark"
+ require "digest/sha1"
+ require "bloombroom"
+
+ module Bloombroom
+
+   KEYS_PER_SLOT = 20000
+   SLOTS_PER_FILTER = 3
+   KEY_VALUE_RANGE = 100000000
+
+   puts("\ngenerating lots of random keys")
+   slots = 32.times.map do
+     add = {}
+     KEYS_PER_SLOT.times.each{|i| add["#{i}#{Digest::SHA1.hexdigest(rand(KEY_VALUE_RANGE).to_s)}"] = true}
+
+     free = []
+     while free.size < add.size
+       key = "#{Digest::SHA1.hexdigest(rand(KEY_VALUE_RANGE).to_s)}"
+       free << key unless add.has_key?(key)
+     end
+
+     [add.keys, free]
+   end
+
+   # puts(slots.map{|slot| slot.first.size}.inspect)
+   # puts(slots.map{|slot| slot.last.size}.inspect)
+
+   capacity = KEYS_PER_SLOT * SLOTS_PER_FILTER
+   error = 0.001 # 0.001 == 0.1%
+
+   m, k = BloomHelper.find_m_k(capacity, error)
+   puts("\nContinuous BloomFilter with capacity=#{capacity}, error=#{error}(#{error * 100}%) -> m=#{m}, k=#{k}")
+   bf = ContinuousBloomFilter.new(m, k, 0)
+
+   t = Benchmark.realtime do
+     slots.each do |slot|
+       slot.first.each{|key| bf.add(key)}
+       false_positives = 0
+       slot.last.each{|key| false_positives += 1 if bf.include?(key)}
+       puts("added #{slot.first.size} keys, tested #{slot.last.size} keys, FPs=#{false_positives}/#{slot.last.size} (#{"%.3f" % ((false_positives * 100) / Float(slot.last.size))})%")
+       bf.inc_time_slot
+     end
+   end
+   n = slots.size * KEYS_PER_SLOT
+   puts("Continuous BloomFilter #{n} adds + #{n} tests in #{"%.2f" % t}s, #{"%2.0f" % ((2 * n) / t)} ops/s")
+
+
+   puts("\nBloomFilter with capacity=#{capacity}, error=#{error}(#{error * 100}%) -> m=#{m}, k=#{k}")
+   bf = BloomFilter.new(m, k)
+
+   t = Benchmark.realtime do
+     slots[0, SLOTS_PER_FILTER + 3].each do |slot|
+       slot.first.each{|key| bf.add(key); n += 1}
+       false_positives = 0
+       slot.last.each{|key| false_positives += 1 if bf.include?(key)}
+       puts("added #{slot.first.size} keys, tested #{slot.last.size} keys, FPs=#{false_positives}/#{slot.last.size} (#{"%.3f" % ((false_positives * 100) / Float(slot.last.size))})%")
+     end
+   end
+   n = (SLOTS_PER_FILTER + 3) * KEYS_PER_SLOT
+   puts("BloomFilter #{n} adds + #{n} tests in #{"%.2f" % t}s, #{"%2.0f" % ((2 * n) / t)} ops/s")
+
+ end