bloom_fit 0.3.1 → 1.1.0

This diff shows the content changes between two publicly released versions of the package, as they appear in the public registry. It is provided for informational purposes only.

Files changed (13)

checksums.yaml +4 -4
data/README.md +220 -47
data/ext/cbloomfilter/cbloomfilter.c +71 -14
data/ext/cbloomfilter/salts.h +50 -0
data/lib/bloom_fit/version.rb +1 -1
data/lib/bloom_fit.rb +36 -33
data/lib/cbloomfilter.bundle +0 -0
data/test/bloom_fit_test.rb +70 -12
data/test/c_bloom_filter_test.rb +233 -0
data/test/test_helper.rb +1 -0
metadata +3 -3
data/ext/cbloomfilter/crc32.h +0 -76
data/lib/bloom_fit/configuration_mismatch.rb +0 -4

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: cd631cdb483e0a84fa05d56eb962fda0f7c7d7a0b002ea708024ce82505a9054
-  data.tar.gz: ee781997465d6f5b590828082e4fadd5b00768298bbdec7845b9f07c3d046549
+  metadata.gz: ed19ba044e45497c9026b8227e77c48cd62aea3043f698c6aca4955eb734f17e
+  data.tar.gz: e712cf58a3b6b11e38733da4437c95fbd94a7e9b07eeb4e72945138a140d730f
 SHA512:
-  metadata.gz: 7862f2d0189bae865c6fc5e7c7ad24f5c7ab0420415a455a1a0b130835d639c536cb8925b08219eab7dd7a10db1e9299b2868019d3e2259db4dce96de01e50a2
-  data.tar.gz: 41cb7f2fcb8cf80f5345785ce0110e242a29fbe6177284b13b701973ec7b0e7010d788585e406f77712f7ee284ff308633fe060e492b0e153a4a5598658fd465
+  metadata.gz: 69b2b91fdf8e3995931507a53b13c6923e225faef01f6cf39c3524e9ad2e63673411452719fc93509aaa830b42f4fa45198cf39f0b3f70cd33f60846116f5430
+  data.tar.gz: e33e427c4bd6ca79d818887dbca0d80348a868fdea85930197e92e087c63adb8a3d339b74d420a894ed55c731e05fedfeb9875a57e5f32698ee342c2836a1ebc
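
The digests above can be checked against a downloaded archive. The following is a minimal sketch, assuming the archive member (for example `data.tar.gz`) has already been extracted to disk; `sha256_matches?` is a hypothetical helper, not part of RubyGems or bloom_fit, and the demonstration runs on an empty temporary file rather than on the real gem.

```ruby
require "digest"
require "tempfile"

# Hypothetical helper: true when the SHA256 digest of the file at +path+
# equals +expected_hex+ (compared case-insensitively).
def sha256_matches?(path, expected_hex)
  Digest::SHA256.file(path).hexdigest == expected_hex.downcase
end

# SHA256 digest of zero bytes of input, used here only for the demonstration.
EMPTY_SHA256 = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"

Tempfile.create("data.tar.gz") do |file|
  file.close
  puts sha256_matches?(file.path, EMPTY_SHA256) # => true
end
```

For a real check, pass the path of the extracted member and the matching `SHA256` value from `checksums.yaml`.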

data/README.md CHANGED

@@ -1,77 +1,250 @@
-# BloomFit makes Bloom Filter tuning easy
+# BloomFit
-[![Gem Version](http://img.shields.io/gem/v/bloom_fit.svg)](https://rubygems.org/gems/bloom_fit)
+[![Gem Version](https://img.shields.io/gem/v/bloom_fit.svg)](https://rubygems.org/gems/bloom_fit)
 [![CI](https://github.com/rmm5t/bloom_fit/actions/workflows/ci.yml/badge.svg)](https://github.com/rmm5t/bloom_fit/actions/workflows/ci.yml)
 [![Gem Downloads](https://img.shields.io/gem/dt/bloom_fit.svg)](https://rubygems.org/gems/bloom_fit)
-BloomFit provides a MRI/C-based non-counting bloom filter for use in your Ruby projects. It is heavily based on [bloomfilter-rb]'s native implementation, but differs in the following ways:
+BloomFit is an in-memory, non-counting Bloom filter for Ruby backed by a small C extension.
+It gives you a compact, Set-like API for probabilistic membership checks:
+- false positives are possible
+- false negatives are not, as long as a value was added to the same filter
+- individual values cannot be deleted safely because the filter is non-counting
+BloomFit is heavily inspired by [bloomfilter-rb]'s native implementation and the original C implementation by Tatsuya Mori. This version uses a DJB2 hash with salts from the CRC table and wraps the native filter in a Ruby-friendly API. The most common way to use it is to pass an expected `capacity` and optional `false_positive_rate`, then let BloomFit calculate `size` and `hashes` for you.
+Compared with bloomfilter-rb, BloomFit:
 - uses DJB2 over CRC32 yielding better hash distribution
 - improves performance for very large datasets
 - avoids the need to supply a seed
-- automatically calculates the bit size (m) and the number of hashes (k) when given a capacity and false-positive-rate
+- automatically calculates the filter size (`m`) and hash count (`k`) from capacity and false-positive rate
-A [Bloom filter](http://en.wikipedia.org/wiki/Bloom_filter) is a space-efficient probabilistic data structure that is used to test whether an element is a member of a set. False positives are possible, but false negatives are not. Instead of using k different hash functions, this implementation a DJB2 hash with k seeds from the CRC table.
+## Features
-Performance of the Bloom filter depends on the following:
+- native `CBloomFilter` implementation for MRI Ruby
+- automatic sizing from `capacity` and `false_positive_rate`
+- small Ruby API with familiar methods like `add`, `include?`, `merge`, `|`, and `&`
+- supports strings, symbols, integers, booleans, and other values that can be converted with `to_s`
+- manual `size` / `hashes` overrides when you want control
+- save and reload filters with Ruby `Marshal`
+- inspect filter state with `stats`, `to_hex`, `to_binary`, and `bitmap`
-- size of the bit array
-- number of hash functions
+## Requirements
-## Resources
+- Ruby `>= 3.2.0`
-- Background: [Bloom filter](http://en.wikipedia.org/wiki/Bloom_filter)
-- Determining parameters: [Scalable Datasets: Bloom Filters in Ruby](http://www.igvita.com/2008/12/27/scalable-datasets-bloom-filters-in-ruby/)
-- Applications & reasons behind bloom filter: [Flow analysis: Time based bloom filter](http://www.igvita.com/2010/01/06/flow-analysis-time-based-bloom-filters/)
+## Installation
-## Examples
+```bash
+gem install bloom_fit
+```
-MRI/C implementation which creates an in-memory filter which can be saved and reloaded from disk.
+```ruby
+require "bloom_fit"
+```
-(COMING SOON) If you'd like to specify an expected item count and a false-positive rate that you can tolerate. Visit the [Bloom Filter Calculator](https://hur.st/bloomfilter/) to learn more.
+## Quick Start
 ```ruby
 require "bloom_fit"
-bf = BloomFit.new(capacity: 250, false_positive_rate: 0.001)
-bf.add("cat")
-bf.include?("cat")     # => true
-bf.include?("dog")     # => false
-# Hash syntax with a bloom filter!
-bf["bird"] = "bar"
-bf["bird"]             # => true
-bf["mouse"]            # => false
-puts bf.stats
-# Number of filter bits (m): 3600
-# Number of set bits (n): 20
-# Number of filter hashes (k) : 10
-# Predicted false positive rate = 0.00%
+filter = BloomFit.new(capacity: 250, false_positive_rate: 0.001)
+filter.add("cat")
+filter << :dog
+filter.include?("cat") # => true
+filter.key?("dog")     # => true
+filter["bird"]         # => false
+filter["owl"] = true
+filter["ant"] = false
+filter["owl"]          # => true
+filter["ant"]          # => false
+filter.empty?          # => false
+filter.size            # => 3595
+filter.hashes          # => 10
+filter.clear
+filter.empty?          # => true
 ```
-If you'd like more control over the traditional inputs like bit size and the number of hashes:
+`#include?`, `#key?`, and `#[]` are aliases. `#add` and `#<<` are also aliases.
+## Automatic Sizing
+BloomFit now calculates `size` and `hashes` for you when you initialize it with an expected capacity:
 ```ruby
-require "bloom_fit"
+filter = BloomFit.new(capacity: 10_000, false_positive_rate: 0.01)
+filter.size   # => 95851
+filter.hashes # => 7
+```
+The defaults are a good starting point for many small filters:
+```ruby
+filter = BloomFit.new
+filter.size   # => 1438
+filter.hashes # => 10
+```
+That is equivalent to:
+```ruby
+filter = BloomFit.new(capacity: 100, false_positive_rate: 0.001)
+```
+Internally BloomFit uses the standard Bloom filter formulas:
+```text
+m = -(n * ln(p)) / (ln(2)^2)
+k = (m / n) * ln(2)
+```
+- `n`: expected number of inserted values
+- `p`: target false-positive rate
+- `m`: number of filter buckets (`size`)
+- `k`: number of hash functions (`hashes`)
+For example, if you expect about `10_000` inserts and can tolerate a `1%` false-positive rate, BloomFit will calculate `size: 95_851` and `hashes: 7` for you.
+If you prefer a calculator, see [Bloom Filter Calculator](https://hur.st/bloomfilter/).
+## Manual Sizing
+If you already know the exact filter width and hash count you want, you can still pass them directly:
+```ruby
+filter = BloomFit.new(size: 95_851, hashes: 7)
+```
+This bypasses automatic sizing.
+## Common Operations
-bf = BloomFit.new(size: 100, hashes: 2)
-bf.add("cat")
-bf.include?("cat")     # => true
-bf.include?("dog")     # => false
-# Hash syntax with a bloom filter!
-bf["bird"] = "bar"
-bf["bird"]             # => true
-bf["mouse"]            # => false
-puts bf.stats
-# Number of filter bits (m): 100
-# Number of set bits (n): 4
-# Number of filter hashes (k) : 2
-# Predicted false positive rate = 10.87%
+### Add and check membership
+```ruby
+filter = BloomFit.new(capacity: 100)
+filter << "cat"
+filter << "dog"
+filter.include?("cat")  # => true
+filter.include?("bird") # => false
+```
+### Use hash-like syntax for truthy values
+```ruby
+filter = BloomFit.new(capacity: 64)
+filter[:cat] = true
+filter[:dog] = false
+filter[:cat] # => true
+filter[:dog] # => false
+filter.merge({ bird: true, ant: nil })
+filter.include?(:bird) # => true
+filter.include?(:ant)  # => false
+```
+When merging a hash, only keys with truthy values are added.
+### Merge, union, and intersection
+```ruby
+pets = BloomFit.new(capacity: 50)
+pets << "cat" << "dog"
+more_pets = BloomFit.new(capacity: 50)
+more_pets << "dog" << "bird"
+combined = pets | more_pets
+overlap = pets & more_pets
+combined.include?("bird") # => true
+overlap.include?("dog")   # => true
+overlap.include?("cat")   # => false
+```
+`#merge` also accepts arrays, sets, and other enumerables:
+```ruby
+filter = BloomFit.new(capacity: 100)
+filter.merge(%w[cat dog bird])
+```
+Filters can only be combined when they have the same `size` and `hashes`. Otherwise BloomFit raises `ArgumentError`.
+When you create filters with automatic sizing, use the same `capacity` and `false_positive_rate` for filters you plan to merge, union, or intersect.
+### Save and load filters
+```ruby
+filter = BloomFit.new(capacity: 100)
+filter << "cat" << "dog"
+filter.save("pets.bloom")
+reloaded = BloomFit.load("pets.bloom")
+reloaded.include?("cat") # => true
+reloaded.include?("dog") # => true
+```
+Persistence uses Ruby `Marshal`. Only load files you trust.
+### Inspect the bitmap
+```ruby
+filter = BloomFit.new(size: 16, hashes: 4)
+filter << "cool"
+filter.to_hex    # => "1441"
+filter.to_binary # => "0001010001000001"
+filter.bitmap    # => raw bytes from the native filter
 ```
+`#bitmap` returns the native byte representation, which may include padding bytes beyond the configured filter width. `#to_binary` trims the result to exactly `size` bits.
+## API Overview
+| Method | Notes |
+| --- | --- |
+| `BloomFit.new` or `BloomFit.new(capacity:, false_positive_rate:)` | Creates a filter and calculates `size` and `hashes` automatically. Defaults to `capacity: 100`, `false_positive_rate: 0.001`. |
+| `BloomFit.new(size:, hashes:)` | Creates a filter with explicit sizing when you want fixed parameters. |
+| `add`, `<<` | Adds a value and returns the filter. |
+| `add?` | Adds only when the value does not already appear present. |
+| `include?`, `key?`, `[]` | Probabilistic membership check. |
+| `[]=` | Adds a key only when the assigned value is truthy. |
+| `merge` | Merges another filter or an enumerable into the receiver. |
+| `\|`, `union` | Returns a new filter containing the union. |
+| `&`, `intersection` | Returns a new filter containing the intersection. |
+| `clear` | Resets all bits to `0`. |
+| `empty?` | Exact check for whether any bits are set. |
+| `size`, `m` | Returns the configured filter width. |
+| `hashes`, `k` | Returns the number of hash functions. |
+| `set_bits`, `n` | Returns the number of bits currently set. |
+| `stats` | Returns a human-readable summary including predicted false-positive rate. |
+| `to_hex`, `to_binary`, `bitmap` | Returns the filter bitmap in different representations. |
+| `save`, `BloomFit.load` | Serializes and restores a filter with Ruby `Marshal`. |
+## Resources
+- Background: [Bloom filter](https://en.wikipedia.org/wiki/Bloom_filter)
+- Determining parameters: [Scalable Datasets: Bloom Filters in Ruby](http://www.igvita.com/2008/12/27/scalable-datasets-bloom-filters-in-ruby/)
+- Applications and motivation: [Flow analysis: Time based bloom filter](http://www.igvita.com/2010/01/06/flow-analysis-time-based-bloom-filters/)
+- Calculator: [Bloom Filter Calculator](https://hur.st/bloomfilter/)
 ## Credits
 - Tatsuya Mori <valdzone@gmail.com> (Original C implementation)
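
The sizing formulas quoted in the new README can be reproduced in a few lines of Ruby. This is a sketch under stated assumptions, not BloomFit's own code: `bloom_parameters` is a hypothetical helper, it rounds both results up, and it uses floating-point division for `m / n`. With those assumptions it matches the sizes and hash counts the README reports.

```ruby
# Hypothetical helper mirroring the README formulas:
#   m = -(n * ln(p)) / (ln(2)^2)
#   k = (m / n) * ln(2)
# Both values are rounded up (an assumption of this sketch).
def bloom_parameters(capacity, false_positive_rate)
  ln2 = Math.log(2.0)
  m = (-capacity * Math.log(false_positive_rate) / (ln2**2)).ceil
  k = (m.to_f / capacity * ln2).ceil
  [m, k]
end

p bloom_parameters(100, 0.001)   # => [1438, 10]
p bloom_parameters(10_000, 0.01) # => [95851, 7]
```

The first call corresponds to the README's default filter and the second to its `capacity: 10_000, false_positive_rate: 0.01` example.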

data/ext/cbloomfilter/cbloomfilter.c CHANGED

@@ -4,15 +4,15 @@
  */
 #include "ruby.h"
-#include "crc32.h"
+#include <limits.h>
+#include "salts.h"
 #if !defined(RSTRING_LEN)
 # define RSTRING_LEN(x) (RSTRING(x)->len)
 # define RSTRING_PTR(x) (RSTRING(x)->ptr)
 #endif
-/* Reuse the standard CRC table for consistent salts */
-static unsigned int *salts = crc_table;
+static const int salts_length = sizeof(salts) / sizeof(salts[0]);
 static VALUE cBloomFilter;
@@ -26,7 +26,7 @@ struct BloomFilter {
 unsigned long djb2(const char *str, int len) {
     unsigned long hash = 5381;
     for (int i = 0; i < len; i++) {
-        hash = ((hash << 5) + hash) + str[i];
+        hash = ((hash << 5) + hash) + (unsigned char) str[i];
     }
     return hash;
 }
@@ -92,14 +92,41 @@ static int bucket_check(struct BloomFilter *bf, int index) {
     return (bf->ptr[byte_offset] >> bit_offset) & 1;
 }
+static void bf_ensure_compatible(struct BloomFilter *bf, struct BloomFilter *other) {
+    if (bf->m != other->m || bf->k != other->k || bf->bytes != other->bytes) {
+        rb_raise(rb_eArgError, "bloom filters must have matching size and hash count");
+    }
+}
+static void bf_clear_padding_bits(struct BloomFilter *bf) {
+    int full_bytes = bf->m / 8;
+    int remaining_bits = bf->m % 8;
+    int i;
+    if (remaining_bits > 0) {
+        unsigned char mask = (unsigned char) ((1U << remaining_bits) - 1U);
+        bf->ptr[full_bytes] &= mask;
+        full_bytes += 1;
+    }
+    for (i = full_bytes; i < bf->bytes; i++) {
+        bf->ptr[i] = 0;
+    }
+}
 static VALUE bf_initialize(int argc, VALUE *argv, VALUE self) {
     struct BloomFilter *bf;
     VALUE arg1, arg2;
+    long m_value, k_value;
     int m, k;
     bf = bf_ptr(self);
-    /* default = Fugou approach :-) */
+    if (argc > 2) {
+        rb_error_arity(argc, 0, 2);
+    }
+    /* defaults */
     arg1 = INT2FIX(1000);
     arg2 = INT2FIX(4);
@@ -111,13 +138,23 @@ static VALUE bf_initialize(int argc, VALUE *argv, VALUE self) {
       break;
     }
-    m = FIX2INT(arg1);
-    k = FIX2INT(arg2);
+    m_value = NUM2LONG(arg1);
+    k_value = NUM2LONG(arg2);
+    if (m_value > INT_MAX - 15)
+        rb_raise(rb_eRangeError, "bit length is too large");
+    if (k_value > INT_MAX)
+        rb_raise(rb_eRangeError, "hash length is too large");
+    m = (int) m_value;
+    k = (int) k_value;
     if (m < 1)
-        rb_raise(rb_eArgError, "array size");
+        rb_raise(rb_eArgError, "bit length must be >= 1");
     if (k < 1)
-        rb_raise(rb_eArgError, "hash length");
+        rb_raise(rb_eArgError, "hash length must be >= 1");
+    if (k > salts_length)
+        rb_raise(rb_eArgError, "hash length must be <= %d", salts_length);
     bf->m = m;
     bf->k = k;
@@ -131,7 +168,6 @@ static VALUE bf_initialize(int argc, VALUE *argv, VALUE self) {
     /* initialize the bits with zeros */
     memset(bf->ptr, 0, bf->bytes);
-    rb_iv_set(self, "@hash_value", rb_hash_new());
     return self;
 }
@@ -154,12 +190,18 @@ static VALUE bf_k(VALUE self) {
 static VALUE bf_set_bits(VALUE self){
     struct BloomFilter *bf = bf_ptr(self);
-    int i,j,count = 0;
+    int i, count = 0;
     for (i = 0; i < bf->bytes; i++) {
-        for (j = 0; j < 8; j++) {
-            count += (bf->ptr[i] >> j) & 1;
+        unsigned char byte = bf->ptr[i];
+        /* Brian Kernighan’s bit-count loop */
+        while (byte != 0) {
+            byte &= (unsigned char) (byte - 1);
+            count++;
         }
     }
     return INT2FIX(count);
 }
@@ -193,6 +235,9 @@ static VALUE bf_merge(VALUE self, VALUE other) {
     struct BloomFilter *bf = bf_ptr(self);
     struct BloomFilter *target = bf_ptr(other);
     int i;
+    bf_ensure_compatible(bf, target);
     for (i = 0; i < bf->bytes; i++) {
         bf->ptr[i] |= target->ptr[i];
     }
@@ -206,6 +251,8 @@ static VALUE bf_and(VALUE self, VALUE other) {
     VALUE klass, obj, args[5];
     int i;
+    bf_ensure_compatible(bf, bf_other);
     args[0] = INT2FIX(bf->m);
     args[1] = INT2FIX(bf->k);
     klass = rb_funcall(self,rb_intern("class"),0);
@@ -225,6 +272,8 @@ static VALUE bf_or(VALUE self, VALUE other) {
     VALUE klass, obj, args[5];
     int i;
+    bf_ensure_compatible(bf, bf_other);
     args[0] = INT2FIX(bf->m);
     args[1] = INT2FIX(bf->k);
     klass = rb_funcall(self,rb_intern("class"),0);
@@ -278,9 +327,17 @@ static VALUE bf_bitmap(VALUE self) {
 static VALUE bf_load(VALUE self, VALUE bitmap) {
     struct BloomFilter *bf = bf_ptr(self);
-    unsigned char* ptr = (unsigned char *) RSTRING_PTR(bitmap);
+    VALUE bitmap_string = StringValue(bitmap);
+    unsigned char* ptr;
+    if (RSTRING_LEN(bitmap_string) != bf->bytes) {
+        rb_raise(rb_eArgError, "bitmap length must be %d bytes", bf->bytes);
+    }
+    ptr = (unsigned char *) RSTRING_PTR(bitmap_string);
     memcpy(bf->ptr, ptr, bf->bytes);
+    bf_clear_padding_bits(bf);
     return Qnil;
 }
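
Two of the changes in this file are easier to follow outside of C. The Ruby sketch below ports the `djb2` loop and the Kernighan bit-count loop. It assumes a 64-bit `unsigned long`; the `signed_bytes:` flag is a hypothetical switch that imitates the old behavior on platforms where `char` is signed. None of these names belong to the extension's API.

```ruby
# Assumes a 64-bit unsigned long; the C type's width is platform-dependent.
MASK64 = (1 << 64) - 1

# Ruby port of the djb2 loop. With signed_bytes: true, bytes above 127 are
# treated as negative, as a signed char would be before the added cast.
def djb2(str, signed_bytes: false)
  str.each_byte.reduce(5381) do |hash, byte|
    byte -= 256 if signed_bytes && byte > 127
    ((hash << 5) + hash + byte) & MASK64
  end
end

# Kernighan's loop: each iteration clears the lowest set bit, so the loop
# runs once per set bit instead of once per bit position.
def popcount(byte)
  count = 0
  until byte.zero?
    byte &= byte - 1
    count += 1
  end
  count
end

non_ascii = "\xC3\xA9".b # the UTF-8 bytes of "é"
puts djb2("cat") == djb2("cat", signed_bytes: true)         # => true
puts djb2(non_ascii) == djb2(non_ascii, signed_bytes: true) # => false
puts popcount(0b1011)                                       # => 3
```

ASCII input hashes the same either way. Only bytes above 127 are affected, which suggests the cast makes hashes of non-ASCII keys independent of the platform's `char` signedness.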

data/ext/cbloomfilter/salts.h ADDED

@@ -0,0 +1,50 @@
+/*
+ *   Borrowed from the CRC table
+ *   https://www.mrob.com/pub/comp/crc-all.html
+ *
+ */
+static unsigned int salts[] = {
+    0x00000000UL, 0x77073096UL, 0xee0e612cUL, 0x990951baUL, 0x076dc419UL, 0x706af48fUL,
+    0xe963a535UL, 0x9e6495a3UL, 0x0edb8832UL, 0x79dcb8a4UL, 0xe0d5e91eUL, 0x97d2d988UL,
+    0x09b64c2bUL, 0x7eb17cbdUL, 0xe7b82d07UL, 0x90bf1d91UL, 0x1db71064UL, 0x6ab020f2UL,
+    0xf3b97148UL, 0x84be41deUL, 0x1adad47dUL, 0x6ddde4ebUL, 0xf4d4b551UL, 0x83d385c7UL,
+    0x136c9856UL, 0x646ba8c0UL, 0xfd62f97aUL, 0x8a65c9ecUL, 0x14015c4fUL, 0x63066cd9UL,
+    0xfa0f3d63UL, 0x8d080df5UL, 0x3b6e20c8UL, 0x4c69105eUL, 0xd56041e4UL, 0xa2677172UL,
+    0x3c03e4d1UL, 0x4b04d447UL, 0xd20d85fdUL, 0xa50ab56bUL, 0x35b5a8faUL, 0x42b2986cUL,
+    0xdbbbc9d6UL, 0xacbcf940UL, 0x32d86ce3UL, 0x45df5c75UL, 0xdcd60dcfUL, 0xabd13d59UL,
+    0x26d930acUL, 0x51de003aUL, 0xc8d75180UL, 0xbfd06116UL, 0x21b4f4b5UL, 0x56b3c423UL,
+    0xcfba9599UL, 0xb8bda50fUL, 0x2802b89eUL, 0x5f058808UL, 0xc60cd9b2UL, 0xb10be924UL,
+    0x2f6f7c87UL, 0x58684c11UL, 0xc1611dabUL, 0xb6662d3dUL, 0x76dc4190UL, 0x01db7106UL,
+    0x98d220bcUL, 0xefd5102aUL, 0x71b18589UL, 0x06b6b51fUL, 0x9fbfe4a5UL, 0xe8b8d433UL,
+    0x7807c9a2UL, 0x0f00f934UL, 0x9609a88eUL, 0xe10e9818UL, 0x7f6a0dbbUL, 0x086d3d2dUL,
+    0x91646c97UL, 0xe6635c01UL, 0x6b6b51f4UL, 0x1c6c6162UL, 0x856530d8UL, 0xf262004eUL,
+    0x6c0695edUL, 0x1b01a57bUL, 0x8208f4c1UL, 0xf50fc457UL, 0x65b0d9c6UL, 0x12b7e950UL,
+    0x8bbeb8eaUL, 0xfcb9887cUL, 0x62dd1ddfUL, 0x15da2d49UL, 0x8cd37cf3UL, 0xfbd44c65UL,
+    0x4db26158UL, 0x3ab551ceUL, 0xa3bc0074UL, 0xd4bb30e2UL, 0x4adfa541UL, 0x3dd895d7UL,
+    0xa4d1c46dUL, 0xd3d6f4fbUL, 0x4369e96aUL, 0x346ed9fcUL, 0xad678846UL, 0xda60b8d0UL,
+    0x44042d73UL, 0x33031de5UL, 0xaa0a4c5fUL, 0xdd0d7cc9UL, 0x5005713cUL, 0x270241aaUL,
+    0xbe0b1010UL, 0xc90c2086UL, 0x5768b525UL, 0x206f85b3UL, 0xb966d409UL, 0xce61e49fUL,
+    0x5edef90eUL, 0x29d9c998UL, 0xb0d09822UL, 0xc7d7a8b4UL, 0x59b33d17UL, 0x2eb40d81UL,
+    0xb7bd5c3bUL, 0xc0ba6cadUL, 0xedb88320UL, 0x9abfb3b6UL, 0x03b6e20cUL, 0x74b1d29aUL,
+    0xead54739UL, 0x9dd277afUL, 0x04db2615UL, 0x73dc1683UL, 0xe3630b12UL, 0x94643b84UL,
+    0x0d6d6a3eUL, 0x7a6a5aa8UL, 0xe40ecf0bUL, 0x9309ff9dUL, 0x0a00ae27UL, 0x7d079eb1UL,
+    0xf00f9344UL, 0x8708a3d2UL, 0x1e01f268UL, 0x6906c2feUL, 0xf762575dUL, 0x806567cbUL,
+    0x196c3671UL, 0x6e6b06e7UL, 0xfed41b76UL, 0x89d32be0UL, 0x10da7a5aUL, 0x67dd4accUL,
+    0xf9b9df6fUL, 0x8ebeeff9UL, 0x17b7be43UL, 0x60b08ed5UL, 0xd6d6a3e8UL, 0xa1d1937eUL,
+    0x38d8c2c4UL, 0x4fdff252UL, 0xd1bb67f1UL, 0xa6bc5767UL, 0x3fb506ddUL, 0x48b2364bUL,
+    0xd80d2bdaUL, 0xaf0a1b4cUL, 0x36034af6UL, 0x41047a60UL, 0xdf60efc3UL, 0xa867df55UL,
+    0x316e8eefUL, 0x4669be79UL, 0xcb61b38cUL, 0xbc66831aUL, 0x256fd2a0UL, 0x5268e236UL,
+    0xcc0c7795UL, 0xbb0b4703UL, 0x220216b9UL, 0x5505262fUL, 0xc5ba3bbeUL, 0xb2bd0b28UL,
+    0x2bb45a92UL, 0x5cb36a04UL, 0xc2d7ffa7UL, 0xb5d0cf31UL, 0x2cd99e8bUL, 0x5bdeae1dUL,
+    0x9b64c2b0UL, 0xec63f226UL, 0x756aa39cUL, 0x026d930aUL, 0x9c0906a9UL, 0xeb0e363fUL,
+    0x72076785UL, 0x05005713UL, 0x95bf4a82UL, 0xe2b87a14UL, 0x7bb12baeUL, 0x0cb61b38UL,
+    0x92d28e9bUL, 0xe5d5be0dUL, 0x7cdcefb7UL, 0x0bdbdf21UL, 0x86d3d2d4UL, 0xf1d4e242UL,
+    0x68ddb3f8UL, 0x1fda836eUL, 0x81be16cdUL, 0xf6b9265bUL, 0x6fb077e1UL, 0x18b74777UL,
+    0x88085ae6UL, 0xff0f6a70UL, 0x66063bcaUL, 0x11010b5cUL, 0x8f659effUL, 0xf862ae69UL,
+    0x616bffd3UL, 0x166ccf45UL, 0xa00ae278UL, 0xd70dd2eeUL, 0x4e048354UL, 0x3903b3c2UL,
+    0xa7672661UL, 0xd06016f7UL, 0x4969474dUL, 0x3e6e77dbUL, 0xaed16a4aUL, 0xd9d65adcUL,
+    0x40df0b66UL, 0x37d83bf0UL, 0xa9bcae53UL, 0xdebb9ec5UL, 0x47b2cf7fUL, 0x30b5ffe9UL,
+    0xbdbdf21cUL, 0xcabac28aUL, 0x53b39330UL, 0x24b4a3a6UL, 0xbad03605UL, 0xcdd70693UL,
+    0x54de5729UL, 0x23d967bfUL, 0xb3667a2eUL, 0xc4614ab8UL, 0x5d681b02UL, 0x2a6f2b94UL,
+    0xb40bbe37UL, 0xc30c8ea1UL, 0x5a05df1bUL, 0x2d02ef8dUL
+};

data/lib/bloom_fit/version.rb CHANGED

@@ -1,3 +1,3 @@
 class BloomFit
-  VERSION = "0.3.1".freeze
+  VERSION = "1.1.0".freeze
 end

data/lib/bloom_fit.rb CHANGED

@@ -1,7 +1,6 @@
 require "forwardable"
 require "cbloomfilter"
-require "bloom_fit/configuration_mismatch"
 require "bloom_fit/version"
 # BloomFit is an in-memory Bloom filter with a small, Set-like API.
@@ -16,7 +15,7 @@ require "bloom_fit/version"
 # serialized with +save+ and reloaded with +BloomFit.load+.
 #
 # Filters can only be combined when they were created with the same +size+ and
-# +hashes+ values; otherwise +BloomFit::ConfigurationMismatch+ is raised.
+# +hashes+ values; otherwise the native extension raises +ArgumentError+.
 #
 #   filter = BloomFit.new(size: 10_000, hashes: 6)
 #   filter.add("cat")
@@ -28,6 +27,8 @@ require "bloom_fit/version"
 class BloomFit
   extend Forwardable
+  LN2 = Math.log(2.0).freeze
   # The wrapped native +CBloomFilter+ instance.
   #
   # This is mostly useful for low-level integrations and internal filter
@@ -40,9 +41,19 @@ class BloomFit
   # but the best values depend on how many keys you expect to insert and how
   # many false positives you can tolerate.
   #
+  # @param capacity [Integer] expected number of elements to store in the set
+  # @param false_positive_rate [Float] acceptable false-positive rate, strictly between 0 and 1
   # @param size [Integer] number of buckets in a bloom filter
   # @param hashes [Integer] number of hash functions
-  def initialize(size: 1_000, hashes: 4)
+  def initialize(capacity: 100, false_positive_rate: 0.001, size: nil, hashes: 4)
+    if size.nil? || hashes.nil?
+      raise ArgumentError, "capacity must be > 0" unless capacity.positive?
+      raise ArgumentError, "false_positive_rate must be between 0 and 1" if false_positive_rate <= 0.0 || false_positive_rate >= 1.0
+      size = (-capacity.to_f * Math.log(false_positive_rate) / (LN2**2)).ceil
+      hashes = (size / capacity * LN2).ceil
+    end
     @bf = CBloomFilter.new(size, hashes)
   end
@@ -68,15 +79,11 @@ class BloomFit
   #
   # Positive results are probabilistic and may be false positives.
-  # :method: clear
-  #
-  # Clears the filter by resetting all bits to +0+.
   # :method: set_bits
   #
   # Returns the number of bits currently set to +1+.
-  def_delegators :@bf, :m, :k, :bitmap, :include?, :clear, :set_bits
+  def_delegators :@bf, :m, :k, :bitmap, :include?, :set_bits
   # Returns the configured filter width.
   alias size m
@@ -103,6 +110,12 @@ class BloomFit
   end
   alias << add
+  # Clears the filter by resetting all bits to +0+ and returns +self+.
+  def clear
+    @bf.clear
+    self
+  end
   # Adds +key+ to the filter when +value+ is truthy.
   #
   # This makes BloomFit behave like a write-only membership hash: truthy values
@@ -150,7 +163,6 @@ class BloomFit
   # This method mutates the receiver and mimics Set#merge.
   def merge(other)
     if other.is_a?(BloomFit)
-      raise BloomFit::ConfigurationMismatch unless same_parameters?(other)
       @bf.merge(other.bf)
     elsif other.respond_to?(:each_key)
       other.each { |k, v| add(k) if v }
@@ -159,17 +171,18 @@ class BloomFit
     else
       raise ArgumentError, "value must be enumerable or another BloomFit filter"
     end
+    self
   end
   # Returns a new filter containing the bitwise intersection of two filters.
   #
-  # Both filters must have the same +size+ and +hashes+ values or
-  # +BloomFit::ConfigurationMismatch+ is raised.
+  # Both filters must have the same +size+ and +hashes+ values or the native
+  # extension raises +ArgumentError+.
   #
   # Like all Bloom filter operations, membership checks on the result remain
   # probabilistic and may still produce false positives.
   def &(other)
-    raise BloomFit::ConfigurationMismatch unless same_parameters?(other)
     self.class.new(size:, hashes:).tap do |result|
       result.instance_variable_set(:@bf, @bf.&(other.bf))
     end
@@ -178,12 +191,11 @@ class BloomFit
   # Returns a new filter containing the bitwise union of two filters.
   #
-  # Both filters must have the same +size+ and +hashes+ values or
-  # +BloomFit::ConfigurationMismatch+ is raised.
+  # Both filters must have the same +size+ and +hashes+ values or the native
+  # extension raises +ArgumentError+.
   #
   # The receiver and +other+ are left unchanged.
   def |(other)
-    raise BloomFit::ConfigurationMismatch unless same_parameters?(other)
     self.class.new(size:, hashes:).tap do |result|
       result.instance_variable_set(:@bf, @bf.|(other.bf))
     end
@@ -196,14 +208,14 @@ class BloomFit
   # bits (+n+), the hash count (+k+), and the predicted false-positive rate
   # based on the current fill level.
   def stats
-    fpr = ((1.0 - Math.exp(-(k * n).to_f / m))**k) * 100
+    fpr = ((n.to_f / m)**k) * 100
-    (+"").tap do |s|
-      s << format("Number of filter buckets (m):  %d\n",     m)
-      s << format("Number of set bits (n):        %d\n",     n)
-      s << format("Number of filter hashes (k):   %d\n",     k)
-      s << format("Predicted false positive rate: %.2f%%\n", fpr)
-    end
+    format <<~STATS, m, n, k, fpr
+      Number of filter buckets (m):  %d
+      Number of set bits (n):        %d
+      Number of filter hashes (k):   %d
+      Predicted false positive rate: %.2f%%
+    STATS
   end
   # Rebuilds the filter from the serialized data returned by +marshal_dump+.
@@ -226,20 +238,11 @@ class BloomFit
   # The file is read using Ruby's +Marshal+ format, so it should only be used
   # with trusted input.
   def self.load(filename)
-    Marshal.load(File.open(filename, "r")) # rubocop:disable Security/MarshalLoad
+    Marshal.load(File.binread(filename)) # rubocop:disable Security/MarshalLoad
   end
   # Writes the filter to +filename+ using Ruby's +Marshal+ format.
   def save(filename)
-    File.open(filename, "w") do |f|
-      f << Marshal.dump(self)
-    end
-  end
-  protected
-  # Returns +true+ when +other+ has the same +size+ and +hashes+ values.
-  def same_parameters?(other)
-    bf.m == other.bf.m && bf.k == other.bf.k
+    File.binwrite(filename, Marshal.dump(self))
   end
 end
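
The `stats` hunk replaces one false-positive estimate with another. The sketch below puts the two expressions side by side as standalone functions. The helper names are hypothetical, and `n` is the number of set bits, as in the diff. One plausible reading of the change is that the fill ratio `n / m` already estimates the chance that a probed bit is set, so raising it to the power `k` estimates the chance that all `k` probes hit.

```ruby
# Hypothetical helpers. m = filter width, n = number of set bits, k = hash count.
# Both return a percentage, like the value formatted by #stats.

# Expression removed by the diff.
def old_predicted_fpr(m, n, k)
  ((1.0 - Math.exp(-(k * n).to_f / m))**k) * 100
end

# Expression added by the diff: fill ratio raised to the number of hashes.
def new_predicted_fpr(m, n, k)
  ((n.to_f / m)**k) * 100
end

puts format("%.2f%%", new_predicted_fpr(10, 3, 3)) # => 2.70%
puts format("%.2f%%", old_predicted_fpr(10, 3, 3))
```

The `2.70%` figure is the same one the updated test suite expects for a 10-bit filter with 3 set bits and 3 hashes.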

data/lib/cbloomfilter.bundle CHANGED

Binary file

data/test/bloom_fit_test.rb CHANGED

@@ -3,6 +3,28 @@ require "test_helper"
 class BloomFitTest < Minitest::Spec
   subject { BloomFit.new(size: 100, hashes: 4) }
+  describe ".new" do
+    it "accepts size and hashes override" do
+      bf = BloomFit.new(size: 10, hashes: 1)
+      assert_equal 10, bf.size
+      assert_equal 1, bf.hashes
+    end
+    it "has default capacity and false-positive rate" do
+      bf = BloomFit.new
+      # https://hur.st/bloomfilter/?n=100&p=0.001&m=&k=
+      assert_equal 1438, bf.size
+      assert_equal 10, bf.hashes
+    end
+    it "calculates size and hashes given a capacity and false positive rate" do
+      bf = BloomFit.new(capacity: 10_000, false_positive_rate: 0.0001)
+      # https://hur.st/bloomfilter/?n=10000&p=0.0001&m=&k=
+      assert_equal 191_702, bf.size
+      assert_equal 14, bf.hashes
+    end
+  end
   describe "#empty?" do
     it "returns true when nothing set" do
       assert_equal true, subject.empty? # rubocop:disable Minitest/AssertTruthy
@@ -102,11 +124,11 @@ class BloomFitTest < Minitest::Spec
   end
   describe "#clear" do
-    it "zeroes the bits" do
+    it "zeroes the bits and returns self" do
       subject.add("test")
       assert_includes subject, "test"
       assert_includes subject.to_binary, "1"
-      subject.clear
+      assert_equal subject, subject.clear
       refute_includes subject, "test"
       refute_includes subject.to_binary, "1"
     end
@@ -180,14 +202,14 @@ class BloomFitTest < Minitest::Spec
   end
   describe "#merge" do
-    it "merges another BloomFit filter" do
+    it "merges another BloomFit filter and returns self" do
       bf1 = BloomFit.new(size: 100, hashes: 2)
       bf2 = BloomFit.new(size: 100, hashes: 2)
       bf1 << "mouse"
       bf2 << "cat" << "dog"
       refute_includes bf1, "cat"
       refute_includes bf1, "dog"
-      bf1.merge(bf2)
+      assert_equal bf1, bf1.merge(bf2)
       assert_includes bf1, "mouse"
       assert_includes bf1, "cat"
       assert_includes bf1, "dog"
@@ -196,9 +218,9 @@ class BloomFitTest < Minitest::Spec
       assert_includes bf2, "dog"
     end
-    it "merges an array" do
+    it "merges an array and returns self" do
       subject << "mouse"
-      subject.merge %i[cat dog]
+      assert_equal subject, subject.merge(%i[cat dog])
       assert_includes subject, "mouse"
       assert_includes subject, "cat"
       assert_includes subject, "dog"
@@ -225,7 +247,7 @@ class BloomFitTest < Minitest::Spec
     it "raises when merge is between incompatible filters" do
       bf1 = BloomFit.new(size: 10)
       bf2 = BloomFit.new(size: 20)
-      assert_raises(BloomFit::ConfigurationMismatch) { bf1.merge(bf2) }
+      assert_raises(ArgumentError) { bf1.merge(bf2) }
     end
   end
@@ -263,11 +285,11 @@ class BloomFitTest < Minitest::Spec
     it "raises when intersection is between incompatible filters" do
       bf1 = BloomFit.new(size: 10)
       bf2 = BloomFit.new(size: 20)
-      assert_raises(BloomFit::ConfigurationMismatch) { bf1 & bf2 }
+      assert_raises(ArgumentError) { bf1 & bf2 }
       bf1 = BloomFit.new(size: 10, hashes: 2)
       bf2 = BloomFit.new(size: 10, hashes: 4)
-      assert_raises(BloomFit::ConfigurationMismatch) { bf1 & bf2 }
+      assert_raises(ArgumentError) { bf1 & bf2 }
     end
   end
@@ -303,7 +325,7 @@ class BloomFitTest < Minitest::Spec
     it "raises when union is between incompatible filters" do
       bf1 = BloomFit.new(size: 10)
       bf2 = BloomFit.new(size: 20)
-      assert_raises(BloomFit::ConfigurationMismatch) { bf1 | bf2 }
+      assert_raises(ArgumentError) { bf1 | bf2 }
     end
   end
@@ -318,16 +340,51 @@ class BloomFitTest < Minitest::Spec
       STATS
       assert_equal expected, bf.stats
     end
+    it "estimates false positives from the current fill level" do
+      bf = BloomFit.new(size: 10, hashes: 3)
+      bf.bf.load("\x07\x00\x00".b)
+      expected = <<~STATS
+        Number of filter buckets (m):  10
+        Number of set bits (n):        3
+        Number of filter hashes (k):   3
+        Predicted false positive rate: 2.70%
+      STATS
+      assert_equal expected, bf.stats
+    end
   end
   describe "serialization" do
-    after { File.unlink("bf.out") }
+    after { FileUtils.rm_f("bf.out") }
     it "marshalls" do
       bf = BloomFit.new
       assert bf.save("bf.out")
     end
+    it "uses binary file io" do
+      dumped = Marshal.dump(subject)
+      writer = Minitest::Mock.new
+      writer.expect(:call, dumped.bytesize, ["bf.out", dumped])
+      reader = Minitest::Mock.new
+      reader.expect(:call, dumped, ["bf.out"])
+      File.stub(:binwrite, writer) do
+        assert_equal dumped.bytesize, subject.save("bf.out")
+      end
+      File.stub(:binread, reader) do
+        bf2 = BloomFit.load("bf.out")
+        assert_equal subject.size, bf2.size
+        assert_equal subject.hashes, bf2.hashes
+      end
+      writer.verify
+      reader.verify
+    end
     it "loads from marshalled" do
       subject.add("foo")
       subject.add("bar")
@@ -338,7 +395,8 @@ class BloomFitTest < Minitest::Spec
       assert_includes bf2, "bar"
       refute_includes bf2, "baz"
-      assert subject.send(:same_parameters?, bf2)
+      assert_equal subject.size, bf2.size
+      assert_equal subject.hashes, bf2.hashes
     end
   end
 end
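
The `Predicted false positive rate: 2.70%` expectation in the new stats test matches the usual fill-level estimate, (set bits / m)^k. The loaded bitmap `"\x07\x00\x00"` has three set bits, so (3/10)^3 = 0.027. This is a minimal sketch that assumes this formula. The helper name is hypothetical and is not part of bloom_fit's API, and the gem's C extension may compute the value differently.

```ruby
# Hypothetical helper (not part of the gem): estimates the false positive
# rate from the current fill level as (set_bits / m) ** k.
def predicted_fp_rate(m:, set_bits:, k:)
  (set_bits.to_f / m)**k
end

# Values taken from the test above: size 10, 3 set bits, 3 hashes.
rate = predicted_fp_rate(m: 10, set_bits: 3, k: 3)
puts format("%.2f%%", rate * 100) # prints 2.70%
```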

data/test/c_bloom_filter_test.rb ADDED

@@ -0,0 +1,233 @@
+require "test_helper"
+class CBloomFilterTest < Minitest::Spec
+  subject { CBloomFilter.new }
+  describe ".new" do
+    it "rejects more than two arguments" do
+      error = assert_raises(ArgumentError) { CBloomFilter.new(1, 2, 3) }
+      assert_equal "wrong number of arguments (given 3, expected 0..2)", error.message
+    end
+  end
+  describe "#m" do
+    it "defaults" do
+      assert_equal 1000, subject.m
+    end
+    it "is set by the 1st arg of the contructor" do
+      bf = CBloomFilter.new(10_000)
+      assert_equal 10_000, bf.m
+    end
+    it "rejects values less than 1" do
+      error = assert_raises(ArgumentError) { CBloomFilter.new(-1) }
+      assert_equal "bit length must be >= 1", error.message
+    end
+    it "rejects values that overflow internal byte sizing" do
+      error = assert_raises(RangeError) { CBloomFilter.new((1 << 31) - 7) }
+      assert_equal "bit length is too large", error.message
+    end
+  end
+  describe "#k" do
+    it "defaults" do
+      assert_equal 4, subject.k
+    end
+    it "is set by the 2nd arg of the contructor" do
+      bf = CBloomFilter.new(10_000, 9)
+      assert_equal 9, bf.k
+    end
+    it "rejects values less than 1" do
+      error = assert_raises(ArgumentError) { CBloomFilter.new(1000, 0) }
+      assert_equal "hash length must be >= 1", error.message
+    end
+    it "rejects values larger than the salt table" do
+      error = assert_raises(ArgumentError) { CBloomFilter.new(10_000, 257) }
+      assert_equal "hash length must be <= 256", error.message
+    end
+  end
+  describe "#set_bits" do
+    it "initializes to zero" do
+      assert_equal 0, subject.set_bits
+    end
+    it "counts the bits when active" do
+      subject.add("foo")
+      assert_equal 4, subject.set_bits
+    end
+  end
+  describe "#add" do
+    it "adds keys to the filter set" do
+      subject.add("foo")
+      subject.add("bar")
+      assert_includes subject, "foo"
+      assert_includes subject, "bar"
+      refute_includes subject, "baz"
+    end
+    it "treats binary bytes as unsigned when hashing" do
+      bf = CBloomFilter.new(20, 4)
+      bf.add("\xFF".b)
+      assert_equal "\x00\x05\x05\x00".b, bf.bitmap
+    end
+  end
+  describe "#include?" do
+    it "returns true when a key is in the set" do
+      subject.add("foo")
+      assert_equal true, subject.include?("foo") # rubocop:disable Minitest/AssertTruthy
+    end
+    it "returns false when a key is not in the set" do
+      subject.add("foo")
+      assert_equal false, subject.include?("bar") # rubocop:disable Minitest/RefuteFalse
+    end
+  end
+  describe "#clear" do
+    it "clears a set" do
+      subject.add("foo")
+      subject.add("bar")
+      subject.add("baz")
+      assert subject.set_bits.positive?
+      subject.clear
+      assert subject.set_bits.zero?
+    end
+  end
+  describe "#merge" do
+    it "adds keys from another set" do
+      subject.add("foo")
+      bf = CBloomFilter.new
+      bf.add("bar")
+      bf.add("baz")
+      subject.merge(bf)
+      assert_includes subject, "foo"
+      assert_includes subject, "bar"
+      assert_includes subject, "baz"
+    end
+    it "rejects incompatible filters" do
+      error = assert_raises(ArgumentError) { subject.merge(CBloomFilter.new(2000, 4)) }
+      assert_equal "bloom filters must have matching size and hash count", error.message
+    end
+  end
+  describe "#&" do
+    it "intersects keys from another set" do
+      subject.add("foo")
+      subject.add("bar")
+      bf = CBloomFilter.new
+      bf.add("bar")
+      bf.add("baz")
+      bf2 = subject & bf
+      refute_includes bf2, "foo"
+      assert_includes bf2, "bar"
+      refute_includes bf2, "baz"
+      bf3 = bf & subject
+      refute_includes bf3, "foo"
+      assert_includes bf3, "bar"
+      refute_includes bf3, "baz"
+    end
+    it "rejects incompatible filters" do
+      error = assert_raises(ArgumentError) { subject & CBloomFilter.new(1000, 2) }
+      assert_equal "bloom filters must have matching size and hash count", error.message
+    end
+  end
+  describe "#|" do
+    it "unions keys from another set" do
+      subject.add("foo")
+      subject.add("bar")
+      bf = CBloomFilter.new
+      bf.add("bar")
+      bf.add("baz")
+      bf2 = subject | bf
+      assert_includes bf2, "foo"
+      assert_includes bf2, "bar"
+      assert_includes bf2, "baz"
+      bf3 = bf | subject
+      assert_includes bf3, "foo"
+      assert_includes bf3, "bar"
+      assert_includes bf3, "baz"
+    end
+    it "rejects incompatible filters" do
+      error = assert_raises(ArgumentError) { subject | CBloomFilter.new(2000, 4) }
+      assert_equal "bloom filters must have matching size and hash count", error.message
+    end
+  end
+  describe "#bitmap" do
+    it "returns a binary bitmap of all zeros when empty (including a terminating byte)" do
+      bf = CBloomFilter.new(16)
+      assert_equal "\x00\x00\x00".b, bf.bitmap
+    end
+    it "returns a binary bitmap representing the set" do
+      bf = CBloomFilter.new(16, 4)
+      bf.add("something")
+      assert_equal "(\x82\x00".b, bf.bitmap
+    end
+    it "returns a binary bitmap representing the set even if not a multiple of 8 bits (includes padding)" do
+      bf = CBloomFilter.new(20, 4)
+      bf.add("wow")
+      assert_equal "\x04\x14\x00\x00".b, bf.bitmap
+    end
+  end
+  describe "#load" do
+    it "overwrites the bitmap" do
+      bf = CBloomFilter.new(1000, 4)
+      bf.add("foo")
+      bf.add("bar")
+      subject.load(bf.bitmap)
+      assert_includes subject, "foo"
+      assert_includes subject, "bar"
+    end
+    it "rejects a short bitmap" do
+      error = assert_raises(ArgumentError) { subject.load("\x00".b) }
+      assert_equal "bitmap length must be 126 bytes", error.message
+    end
+    it "rejects a long bitmap" do
+      error = assert_raises(ArgumentError) { subject.load("\x00".b * 127) }
+      assert_equal "bitmap length must be 126 bytes", error.message
+    end
+    it "coerces bitmap-like objects to strings before loading" do
+      bitmap_data = subject.bitmap
+      bitmap = Object.new
+      bitmap.define_singleton_method(:to_str) { bitmap_data }
+      subject.load(bitmap)
+      assert_equal 0, subject.set_bits
+    end
+    it "clears loaded padding bits beyond the configured size" do
+      bf = CBloomFilter.new(20, 4)
+      bf.load("\x00\x00\xF0\xFF".b)
+      assert_equal 0, bf.set_bits
+      assert_equal "\x00\x00\x00\x00".b, bf.bitmap
+    end
+  end
+end
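
The bitmap lengths these tests expect (3 bytes for 16 bits, 4 bytes for 20 bits, 126 bytes for the default 1000 bits) all fit `(m + 15) / 8` with integer division, which is one byte more than `ceil(m / 8)`. The padding test also suggests that bit `i` lives in byte `i / 8` at position `i % 8`. Both points are inferred from the assertions above, not taken from `cbloomfilter.c`, and the helper names are hypothetical. A rough sketch under those assumptions:

```ruby
# Hypothetical helpers; the sizing rule and bit order are assumptions
# inferred from the test expectations, not read from the C extension.

# Byte length of the bitmap for a filter of m bits.
def bitmap_bytes(m)
  (m + 15) / 8
end

# Zeroes every bit at index >= m, assuming least-significant-bit-first
# ordering within each byte.
def mask_padding(bitmap, m)
  bytes = bitmap.bytes
  bytes.each_index do |i|
    first_bit = i * 8
    if first_bit >= m
      bytes[i] = 0                               # byte is entirely padding
    elsif first_bit + 8 > m
      bytes[i] &= (1 << (m - first_bit)) - 1     # keep only the valid low bits
    end
  end
  bytes.pack("C*")
end

p bitmap_bytes(1000)                             # prints 126
p mask_padding("\x00\x00\xF0\xFF".b, 20)         # prints "\x00\x00\x00\x00"
```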

data/test/test_helper.rb CHANGED

@@ -1,4 +1,5 @@
 require "minitest/autorun"
+require "minitest/mock"
 require "minitest/reporters"
 Minitest::Reporters.use! # override with MINITEST_REPORTER env var

metadata CHANGED

@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: bloom_fit
 version: !ruby/object:Gem::Version
-  version: 0.3.1
+  version: 1.1.0
 platform: ruby
 authors:
 - Ryan McGeary
@@ -24,13 +24,13 @@ extra_rdoc_files: []
 files:
 - README.md
 - ext/cbloomfilter/cbloomfilter.c
-- ext/cbloomfilter/crc32.h
 - ext/cbloomfilter/extconf.rb
+- ext/cbloomfilter/salts.h
 - lib/bloom_fit.rb
-- lib/bloom_fit/configuration_mismatch.rb
 - lib/bloom_fit/version.rb
 - lib/cbloomfilter.bundle
 - test/bloom_fit_test.rb
+- test/c_bloom_filter_test.rb
 - test/test_helper.rb
 homepage: https://github.com/rmm5t/bloom_fit
 licenses: []

data/ext/cbloomfilter/crc32.h DELETED

@@ -1,76 +0,0 @@
-/* simple CRC32 code */
-/*
- * Copyright 2005 Aris Adamantiadis
- *
- * This file is part of the SSH Library
- *
- * The SSH Library is free software; you can redistribute it and/or modify
- * it under the terms of the GNU Lesser General Public License as published by
- * the Free Software Foundation; either version 2.1 of the License, or (at your
- * option) any later version.
- *
- *
- * The SSH Library is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
- * or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU Lesser General Public
- * License for more details.
- *
- * You should have received a copy of the GNU Lesser General Public License
- * along with the SSH Library; see the file COPYING.  If not, write to
- * the Free Software Foundation, Inc., 59 Temple Place - Suite 330, Boston,
- * MA 02111-1307, USA. */
-static unsigned int crc_table[] = {
-    0x00000000UL, 0x77073096UL, 0xee0e612cUL, 0x990951baUL, 0x076dc419UL,
-    0x706af48fUL, 0xe963a535UL, 0x9e6495a3UL, 0x0edb8832UL, 0x79dcb8a4UL,
-    0xe0d5e91eUL, 0x97d2d988UL, 0x09b64c2bUL, 0x7eb17cbdUL, 0xe7b82d07UL,
-    0x90bf1d91UL, 0x1db71064UL, 0x6ab020f2UL, 0xf3b97148UL, 0x84be41deUL,
-    0x1adad47dUL, 0x6ddde4ebUL, 0xf4d4b551UL, 0x83d385c7UL, 0x136c9856UL,
-    0x646ba8c0UL, 0xfd62f97aUL, 0x8a65c9ecUL, 0x14015c4fUL, 0x63066cd9UL,
-    0xfa0f3d63UL, 0x8d080df5UL, 0x3b6e20c8UL, 0x4c69105eUL, 0xd56041e4UL,
-    0xa2677172UL, 0x3c03e4d1UL, 0x4b04d447UL, 0xd20d85fdUL, 0xa50ab56bUL,
-    0x35b5a8faUL, 0x42b2986cUL, 0xdbbbc9d6UL, 0xacbcf940UL, 0x32d86ce3UL,
-    0x45df5c75UL, 0xdcd60dcfUL, 0xabd13d59UL, 0x26d930acUL, 0x51de003aUL,
-    0xc8d75180UL, 0xbfd06116UL, 0x21b4f4b5UL, 0x56b3c423UL, 0xcfba9599UL,
-    0xb8bda50fUL, 0x2802b89eUL, 0x5f058808UL, 0xc60cd9b2UL, 0xb10be924UL,
-    0x2f6f7c87UL, 0x58684c11UL, 0xc1611dabUL, 0xb6662d3dUL, 0x76dc4190UL,
-    0x01db7106UL, 0x98d220bcUL, 0xefd5102aUL, 0x71b18589UL, 0x06b6b51fUL,
-    0x9fbfe4a5UL, 0xe8b8d433UL, 0x7807c9a2UL, 0x0f00f934UL, 0x9609a88eUL,
-    0xe10e9818UL, 0x7f6a0dbbUL, 0x086d3d2dUL, 0x91646c97UL, 0xe6635c01UL,
-    0x6b6b51f4UL, 0x1c6c6162UL, 0x856530d8UL, 0xf262004eUL, 0x6c0695edUL,
-    0x1b01a57bUL, 0x8208f4c1UL, 0xf50fc457UL, 0x65b0d9c6UL, 0x12b7e950UL,
-    0x8bbeb8eaUL, 0xfcb9887cUL, 0x62dd1ddfUL, 0x15da2d49UL, 0x8cd37cf3UL,
-    0xfbd44c65UL, 0x4db26158UL, 0x3ab551ceUL, 0xa3bc0074UL, 0xd4bb30e2UL,
-    0x4adfa541UL, 0x3dd895d7UL, 0xa4d1c46dUL, 0xd3d6f4fbUL, 0x4369e96aUL,
-    0x346ed9fcUL, 0xad678846UL, 0xda60b8d0UL, 0x44042d73UL, 0x33031de5UL,
-    0xaa0a4c5fUL, 0xdd0d7cc9UL, 0x5005713cUL, 0x270241aaUL, 0xbe0b1010UL,
-    0xc90c2086UL, 0x5768b525UL, 0x206f85b3UL, 0xb966d409UL, 0xce61e49fUL,
-    0x5edef90eUL, 0x29d9c998UL, 0xb0d09822UL, 0xc7d7a8b4UL, 0x59b33d17UL,
-    0x2eb40d81UL, 0xb7bd5c3bUL, 0xc0ba6cadUL, 0xedb88320UL, 0x9abfb3b6UL,
-    0x03b6e20cUL, 0x74b1d29aUL, 0xead54739UL, 0x9dd277afUL, 0x04db2615UL,
-    0x73dc1683UL, 0xe3630b12UL, 0x94643b84UL, 0x0d6d6a3eUL, 0x7a6a5aa8UL,
-    0xe40ecf0bUL, 0x9309ff9dUL, 0x0a00ae27UL, 0x7d079eb1UL, 0xf00f9344UL,
-    0x8708a3d2UL, 0x1e01f268UL, 0x6906c2feUL, 0xf762575dUL, 0x806567cbUL,
-    0x196c3671UL, 0x6e6b06e7UL, 0xfed41b76UL, 0x89d32be0UL, 0x10da7a5aUL,
-    0x67dd4accUL, 0xf9b9df6fUL, 0x8ebeeff9UL, 0x17b7be43UL, 0x60b08ed5UL,
-    0xd6d6a3e8UL, 0xa1d1937eUL, 0x38d8c2c4UL, 0x4fdff252UL, 0xd1bb67f1UL,
-    0xa6bc5767UL, 0x3fb506ddUL, 0x48b2364bUL, 0xd80d2bdaUL, 0xaf0a1b4cUL,
-    0x36034af6UL, 0x41047a60UL, 0xdf60efc3UL, 0xa867df55UL, 0x316e8eefUL,
-    0x4669be79UL, 0xcb61b38cUL, 0xbc66831aUL, 0x256fd2a0UL, 0x5268e236UL,
-    0xcc0c7795UL, 0xbb0b4703UL, 0x220216b9UL, 0x5505262fUL, 0xc5ba3bbeUL,
-    0xb2bd0b28UL, 0x2bb45a92UL, 0x5cb36a04UL, 0xc2d7ffa7UL, 0xb5d0cf31UL,
-    0x2cd99e8bUL, 0x5bdeae1dUL, 0x9b64c2b0UL, 0xec63f226UL, 0x756aa39cUL,
-    0x026d930aUL, 0x9c0906a9UL, 0xeb0e363fUL, 0x72076785UL, 0x05005713UL,
-    0x95bf4a82UL, 0xe2b87a14UL, 0x7bb12baeUL, 0x0cb61b38UL, 0x92d28e9bUL,
-    0xe5d5be0dUL, 0x7cdcefb7UL, 0x0bdbdf21UL, 0x86d3d2d4UL, 0xf1d4e242UL,
-    0x68ddb3f8UL, 0x1fda836eUL, 0x81be16cdUL, 0xf6b9265bUL, 0x6fb077e1UL,
-    0x18b74777UL, 0x88085ae6UL, 0xff0f6a70UL, 0x66063bcaUL, 0x11010b5cUL,
-    0x8f659effUL, 0xf862ae69UL, 0x616bffd3UL, 0x166ccf45UL, 0xa00ae278UL,
-    0xd70dd2eeUL, 0x4e048354UL, 0x3903b3c2UL, 0xa7672661UL, 0xd06016f7UL,
-    0x4969474dUL, 0x3e6e77dbUL, 0xaed16a4aUL, 0xd9d65adcUL, 0x40df0b66UL,
-    0x37d83bf0UL, 0xa9bcae53UL, 0xdebb9ec5UL, 0x47b2cf7fUL, 0x30b5ffe9UL,
-    0xbdbdf21cUL, 0xcabac28aUL, 0x53b39330UL, 0x24b4a3a6UL, 0xbad03605UL,
-    0xcdd70693UL, 0x54de5729UL, 0x23d967bfUL, 0xb3667a2eUL, 0xc4614ab8UL,
-    0x5d681b02UL, 0x2a6f2b94UL, 0xb40bbe37UL, 0xc30c8ea1UL, 0x5a05df1bUL,
-    0x2d02ef8dUL
-};

data/lib/bloom_fit/configuration_mismatch.rb DELETED

@@ -1,4 +0,0 @@
-class BloomFit
-  class ConfigurationMismatch < ArgumentError
-  end
-end
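
The deleted `BloomFit::ConfigurationMismatch` was a subclass of `ArgumentError`, and the updated tests expect a plain `ArgumentError`. A `rescue ArgumentError` clause should therefore catch the incompatible-filter error under either version. The sketch below illustrates this with a stand-in class so that it runs without the gem. `LegacyMismatch` and `describe_failure` are hypothetical names, not part of bloom_fit.

```ruby
# Stand-in mirroring the 0.3.1 hierarchy (ConfigurationMismatch < ArgumentError).
class LegacyMismatch < ArgumentError; end

# Runs the block and reports any ArgumentError, including its subclasses.
def describe_failure
  yield
  "ok"
rescue ArgumentError => e
  "#{e.class}: #{e.message}"
end

# Old-style error (subclass) and new-style error (plain ArgumentError)
# are both caught by the same rescue clause.
puts describe_failure { raise LegacyMismatch, "old-style mismatch" }
puts describe_failure { raise ArgumentError, "bloom filters must have matching size and hash count" }
```

Code that rescues the removed constant by name would raise `NameError` on 1.1.0, so switching such clauses to `ArgumentError` is the likely migration path.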