bloom_fit 0.3.1 → 1.1.0
- checksums.yaml +4 -4
- data/README.md +220 -47
- data/ext/cbloomfilter/cbloomfilter.c +71 -14
- data/ext/cbloomfilter/salts.h +50 -0
- data/lib/bloom_fit/version.rb +1 -1
- data/lib/bloom_fit.rb +36 -33
- data/lib/cbloomfilter.bundle +0 -0
- data/test/bloom_fit_test.rb +70 -12
- data/test/c_bloom_filter_test.rb +233 -0
- data/test/test_helper.rb +1 -0
- metadata +3 -3
- data/ext/cbloomfilter/crc32.h +0 -76
- data/lib/bloom_fit/configuration_mismatch.rb +0 -4
checksums.yaml
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
---
|
|
2
2
|
SHA256:
|
|
3
|
-
metadata.gz:
|
|
4
|
-
data.tar.gz:
|
|
3
|
+
metadata.gz: ed19ba044e45497c9026b8227e77c48cd62aea3043f698c6aca4955eb734f17e
|
|
4
|
+
data.tar.gz: e712cf58a3b6b11e38733da4437c95fbd94a7e9b07eeb4e72945138a140d730f
|
|
5
5
|
SHA512:
|
|
6
|
-
metadata.gz:
|
|
7
|
-
data.tar.gz:
|
|
6
|
+
metadata.gz: 69b2b91fdf8e3995931507a53b13c6923e225faef01f6cf39c3524e9ad2e63673411452719fc93509aaa830b42f4fa45198cf39f0b3f70cd33f60846116f5430
|
|
7
|
+
data.tar.gz: e33e427c4bd6ca79d818887dbca0d80348a868fdea85930197e92e087c63adb8a3d339b74d420a894ed55c731e05fedfeb9875a57e5f32698ee342c2836a1ebc
|
data/README.md
CHANGED
|
@@ -1,77 +1,250 @@
|
|
|
1
|
-
# BloomFit
|
|
1
|
+
# BloomFit
|
|
2
2
|
|
|
3
|
-
[](https://rubygems.org/gems/bloom_fit)
|
|
4
4
|
[](https://github.com/rmm5t/bloom_fit/actions/workflows/ci.yml)
|
|
5
5
|
[](https://rubygems.org/gems/bloom_fit)
|
|
6
6
|
|
|
7
|
-
BloomFit
|
|
7
|
+
BloomFit is an in-memory, non-counting Bloom filter for Ruby backed by a small C extension.
|
|
8
|
+
|
|
9
|
+
It gives you a compact, Set-like API for probabilistic membership checks:
|
|
10
|
+
|
|
11
|
+
- false positives are possible
|
|
12
|
+
- false negatives are not, as long as a value was added to the same filter
|
|
13
|
+
- individual values cannot be deleted safely because the filter is non-counting
|
|
14
|
+
|
|
15
|
+
BloomFit is heavily inspired by [bloomfilter-rb]'s native implementation and the original C implementation by Tatsuya Mori. This version uses a DJB2 hash with salts from the CRC table and wraps the native filter in a Ruby-friendly API. The most common way to use it is to pass an expected `capacity` and optional `false_positive_rate`, then let BloomFit calculate `size` and `hashes` for you.
|
|
16
|
+
|
|
17
|
+
Compared with bloomfilter-rb, BloomFit:
|
|
8
18
|
|
|
9
19
|
- uses DJB2 over CRC32 yielding better hash distribution
|
|
10
20
|
- improves performance for very large datasets
|
|
11
21
|
- avoids the need to supply a seed
|
|
12
|
-
- automatically calculates the
|
|
22
|
+
- automatically calculates the filter size (`m`) and hash count (`k`) from capacity and false-positive rate
|
|
13
23
|
|
|
14
|
-
|
|
24
|
+
## Features
|
|
15
25
|
|
|
16
|
-
|
|
26
|
+
- native `CBloomFilter` implementation for MRI Ruby
|
|
27
|
+
- automatic sizing from `capacity` and `false_positive_rate`
|
|
28
|
+
- small Ruby API with familiar methods like `add`, `include?`, `merge`, `|`, and `&`
|
|
29
|
+
- supports strings, symbols, integers, booleans, and other values that can be converted with `to_s`
|
|
30
|
+
- manual `size` / `hashes` overrides when you want control
|
|
31
|
+
- save and reload filters with Ruby `Marshal`
|
|
32
|
+
- inspect filter state with `stats`, `to_hex`, `to_binary`, and `bitmap`
|
|
17
33
|
|
|
18
|
-
|
|
19
|
-
- number of hash functions
|
|
34
|
+
## Requirements
|
|
20
35
|
|
|
21
|
-
|
|
36
|
+
- Ruby `>= 3.2.0`
|
|
22
37
|
|
|
23
|
-
|
|
24
|
-
- Determining parameters: [Scalable Datasets: Bloom Filters in Ruby](http://www.igvita.com/2008/12/27/scalable-datasets-bloom-filters-in-ruby/)
|
|
25
|
-
- Applications & reasons behind bloom filter: [Flow analysis: Time based bloom filter](http://www.igvita.com/2010/01/06/flow-analysis-time-based-bloom-filters/)
|
|
38
|
+
## Installation
|
|
26
39
|
|
|
27
|
-
|
|
40
|
+
```bash
|
|
41
|
+
gem install bloom_fit
|
|
42
|
+
```
|
|
28
43
|
|
|
29
|
-
|
|
44
|
+
```ruby
|
|
45
|
+
require "bloom_fit"
|
|
46
|
+
```
|
|
30
47
|
|
|
31
|
-
|
|
48
|
+
## Quick Start
|
|
32
49
|
|
|
33
50
|
```ruby
|
|
34
51
|
require "bloom_fit"
|
|
35
52
|
|
|
36
|
-
|
|
37
|
-
|
|
38
|
-
|
|
39
|
-
|
|
40
|
-
|
|
41
|
-
#
|
|
42
|
-
|
|
43
|
-
|
|
44
|
-
|
|
45
|
-
|
|
46
|
-
|
|
47
|
-
|
|
48
|
-
#
|
|
49
|
-
#
|
|
50
|
-
|
|
53
|
+
filter = BloomFit.new(capacity: 250, false_positive_rate: 0.001)
|
|
54
|
+
|
|
55
|
+
filter.add("cat")
|
|
56
|
+
filter << :dog
|
|
57
|
+
|
|
58
|
+
filter.include?("cat") # => true
|
|
59
|
+
filter.key?("dog") # => true
|
|
60
|
+
filter["bird"] # => false
|
|
61
|
+
|
|
62
|
+
filter["owl"] = true
|
|
63
|
+
filter["ant"] = false
|
|
64
|
+
|
|
65
|
+
filter["owl"] # => true
|
|
66
|
+
filter["ant"] # => false
|
|
67
|
+
|
|
68
|
+
filter.empty? # => false
|
|
69
|
+
|
|
70
|
+
filter.size # => 3595
|
|
71
|
+
filter.hashes # => 10
|
|
72
|
+
|
|
73
|
+
filter.clear
|
|
74
|
+
filter.empty? # => true
|
|
51
75
|
```
|
|
52
76
|
|
|
53
|
-
|
|
77
|
+
`#include?`, `#key?`, and `#[]` are aliases. `#add` and `#<<` are also aliases.
|
|
78
|
+
|
|
79
|
+
## Automatic Sizing
|
|
80
|
+
|
|
81
|
+
BloomFit now calculates `size` and `hashes` for you when you initialize it with an expected capacity:
|
|
54
82
|
|
|
55
83
|
```ruby
|
|
56
|
-
|
|
84
|
+
filter = BloomFit.new(capacity: 10_000, false_positive_rate: 0.01)
|
|
85
|
+
|
|
86
|
+
filter.size # => 95851
|
|
87
|
+
filter.hashes # => 7
|
|
88
|
+
```
|
|
89
|
+
|
|
90
|
+
The defaults are a good starting point for many small filters:
|
|
91
|
+
|
|
92
|
+
```ruby
|
|
93
|
+
filter = BloomFit.new
|
|
94
|
+
|
|
95
|
+
filter.size # => 1438
|
|
96
|
+
filter.hashes # => 10
|
|
97
|
+
```
|
|
98
|
+
|
|
99
|
+
That is equivalent to:
|
|
100
|
+
|
|
101
|
+
```ruby
|
|
102
|
+
filter = BloomFit.new(capacity: 100, false_positive_rate: 0.001)
|
|
103
|
+
```
|
|
104
|
+
|
|
105
|
+
Internally BloomFit uses the standard Bloom filter formulas:
|
|
106
|
+
|
|
107
|
+
```text
|
|
108
|
+
m = -(n * ln(p)) / (ln(2)^2)
|
|
109
|
+
k = (m / n) * ln(2)
|
|
110
|
+
```
|
|
111
|
+
|
|
112
|
+
- `n`: expected number of inserted values
|
|
113
|
+
- `p`: target false-positive rate
|
|
114
|
+
- `m`: number of filter buckets (`size`)
|
|
115
|
+
- `k`: number of hash functions (`hashes`)
|
|
116
|
+
|
|
117
|
+
For example, if you expect about `10_000` inserts and can tolerate a `1%` false-positive rate, BloomFit will calculate `size: 95_851` and `hashes: 7` for you.
|
|
118
|
+
|
|
119
|
+
If you prefer a calculator, see [Bloom Filter Calculator](https://hur.st/bloomfilter/).
|
|
120
|
+
|
|
121
|
+
## Manual Sizing
|
|
122
|
+
|
|
123
|
+
If you already know the exact filter width and hash count you want, you can still pass them directly:
|
|
124
|
+
|
|
125
|
+
```ruby
|
|
126
|
+
filter = BloomFit.new(size: 95_851, hashes: 7)
|
|
127
|
+
```
|
|
128
|
+
|
|
129
|
+
This bypasses automatic sizing.
|
|
130
|
+
|
|
131
|
+
## Common Operations
|
|
57
132
|
|
|
58
|
-
|
|
59
|
-
|
|
60
|
-
|
|
61
|
-
|
|
62
|
-
|
|
63
|
-
|
|
64
|
-
|
|
65
|
-
|
|
66
|
-
|
|
67
|
-
|
|
68
|
-
|
|
69
|
-
|
|
70
|
-
|
|
71
|
-
|
|
72
|
-
|
|
133
|
+
### Add and check membership
|
|
134
|
+
|
|
135
|
+
```ruby
|
|
136
|
+
filter = BloomFit.new(capacity: 100)
|
|
137
|
+
|
|
138
|
+
filter << "cat"
|
|
139
|
+
filter << "dog"
|
|
140
|
+
|
|
141
|
+
filter.include?("cat") # => true
|
|
142
|
+
filter.include?("bird") # => false
|
|
143
|
+
```
|
|
144
|
+
|
|
145
|
+
### Use hash-like syntax for truthy values
|
|
146
|
+
|
|
147
|
+
```ruby
|
|
148
|
+
filter = BloomFit.new(capacity: 64)
|
|
149
|
+
|
|
150
|
+
filter[:cat] = true
|
|
151
|
+
filter[:dog] = false
|
|
152
|
+
|
|
153
|
+
filter[:cat] # => true
|
|
154
|
+
filter[:dog] # => false
|
|
155
|
+
|
|
156
|
+
filter.merge({ bird: true, ant: nil })
|
|
157
|
+
|
|
158
|
+
filter.include?(:bird) # => true
|
|
159
|
+
filter.include?(:ant) # => false
|
|
160
|
+
```
|
|
161
|
+
|
|
162
|
+
When merging a hash, only keys with truthy values are added.
|
|
163
|
+
|
|
164
|
+
### Merge, union, and intersection
|
|
165
|
+
|
|
166
|
+
```ruby
|
|
167
|
+
pets = BloomFit.new(capacity: 50)
|
|
168
|
+
pets << "cat" << "dog"
|
|
169
|
+
|
|
170
|
+
more_pets = BloomFit.new(capacity: 50)
|
|
171
|
+
more_pets << "dog" << "bird"
|
|
172
|
+
|
|
173
|
+
combined = pets | more_pets
|
|
174
|
+
overlap = pets & more_pets
|
|
175
|
+
|
|
176
|
+
combined.include?("bird") # => true
|
|
177
|
+
overlap.include?("dog") # => true
|
|
178
|
+
overlap.include?("cat") # => false
|
|
179
|
+
```
|
|
180
|
+
|
|
181
|
+
`#merge` also accepts arrays, sets, and other enumerables:
|
|
182
|
+
|
|
183
|
+
```ruby
|
|
184
|
+
filter = BloomFit.new(capacity: 100)
|
|
185
|
+
filter.merge(%w[cat dog bird])
|
|
186
|
+
```
|
|
187
|
+
|
|
188
|
+
Filters can only be combined when they have the same `size` and `hashes`. Otherwise BloomFit raises `ArgumentError`.
|
|
189
|
+
|
|
190
|
+
When you create filters with automatic sizing, use the same `capacity` and `false_positive_rate` for filters you plan to merge, union, or intersect.
|
|
191
|
+
|
|
192
|
+
### Save and load filters
|
|
193
|
+
|
|
194
|
+
```ruby
|
|
195
|
+
filter = BloomFit.new(capacity: 100)
|
|
196
|
+
filter << "cat" << "dog"
|
|
197
|
+
filter.save("pets.bloom")
|
|
198
|
+
|
|
199
|
+
reloaded = BloomFit.load("pets.bloom")
|
|
200
|
+
reloaded.include?("cat") # => true
|
|
201
|
+
reloaded.include?("dog") # => true
|
|
202
|
+
```
|
|
203
|
+
|
|
204
|
+
Persistence uses Ruby `Marshal`. Only load files you trust.
|
|
205
|
+
|
|
206
|
+
### Inspect the bitmap
|
|
207
|
+
|
|
208
|
+
```ruby
|
|
209
|
+
filter = BloomFit.new(size: 16, hashes: 4)
|
|
210
|
+
filter << "cool"
|
|
211
|
+
|
|
212
|
+
filter.to_hex # => "1441"
|
|
213
|
+
filter.to_binary # => "0001010001000001"
|
|
214
|
+
filter.bitmap # => raw bytes from the native filter
|
|
73
215
|
```
|
|
74
216
|
|
|
217
|
+
`#bitmap` returns the native byte representation, which may include padding bytes beyond the configured filter width. `#to_binary` trims the result to exactly `size` bits.
|
|
218
|
+
|
|
219
|
+
## API Overview
|
|
220
|
+
|
|
221
|
+
| Method | Notes |
|
|
222
|
+
| --- | --- |
|
|
223
|
+
| `BloomFit.new` or `BloomFit.new(capacity:, false_positive_rate:)` | Creates a filter and calculates `size` and `hashes` automatically. Defaults to `capacity: 100`, `false_positive_rate: 0.001`. |
|
|
224
|
+
| `BloomFit.new(size:, hashes:)` | Creates a filter with explicit sizing when you want fixed parameters. |
|
|
225
|
+
| `add`, `<<` | Adds a value and returns the filter. |
|
|
226
|
+
| `add?` | Adds only when the value does not already appear present. |
|
|
227
|
+
| `include?`, `key?`, `[]` | Probabilistic membership check. |
|
|
228
|
+
| `[]=` | Adds a key only when the assigned value is truthy. |
|
|
229
|
+
| `merge` | Merges another filter or an enumerable into the receiver. |
|
|
230
|
+
| `\|`, `union` | Returns a new filter containing the union. |
|
|
231
|
+
| `&`, `intersection` | Returns a new filter containing the intersection. |
|
|
232
|
+
| `clear` | Resets all bits to `0`. |
|
|
233
|
+
| `empty?` | Exact check for whether any bits are set. |
|
|
234
|
+
| `size`, `m` | Returns the configured filter width. |
|
|
235
|
+
| `hashes`, `k` | Returns the number of hash functions. |
|
|
236
|
+
| `set_bits`, `n` | Returns the number of bits currently set. |
|
|
237
|
+
| `stats` | Returns a human-readable summary including predicted false-positive rate. |
|
|
238
|
+
| `to_hex`, `to_binary`, `bitmap` | Returns the filter bitmap in different representations. |
|
|
239
|
+
| `save`, `BloomFit.load` | Serializes and restores a filter with Ruby `Marshal`. |
|
|
240
|
+
|
|
241
|
+
## Resources
|
|
242
|
+
|
|
243
|
+
- Background: [Bloom filter](https://en.wikipedia.org/wiki/Bloom_filter)
|
|
244
|
+
- Determining parameters: [Scalable Datasets: Bloom Filters in Ruby](http://www.igvita.com/2008/12/27/scalable-datasets-bloom-filters-in-ruby/)
|
|
245
|
+
- Applications and motivation: [Flow analysis: Time based bloom filter](http://www.igvita.com/2010/01/06/flow-analysis-time-based-bloom-filters/)
|
|
246
|
+
- Calculator: [Bloom Filter Calculator](https://hur.st/bloomfilter/)
|
|
247
|
+
|
|
75
248
|
## Credits
|
|
76
249
|
|
|
77
250
|
- Tatsuya Mori <valdzone@gmail.com> (Original C implementation)
|
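The sizing formulas quoted in the README can be checked with a few lines of plain Ruby. The sketch below is not BloomFit's code: `bloom_parameters` is a hypothetical helper that applies the two formulas with floating-point arithmetic and rounds both results up. That rounding is an assumption, chosen because it reproduces the `size` and `hashes` values the README quotes.

```ruby
# Hypothetical helper (not part of BloomFit). Applies the standard formulas
#   m = -(n * ln(p)) / ln(2)^2    and    k = (m / n) * ln(2)
# and rounds both results up to whole numbers.
def bloom_parameters(capacity, false_positive_rate)
  ln2 = Math.log(2.0)
  size = (-capacity * Math.log(false_positive_rate) / (ln2**2)).ceil
  hashes = (size.to_f / capacity * ln2).ceil
  [size, hashes]
end

bloom_parameters(100, 0.001)    # => [1438, 10]
bloom_parameters(10_000, 0.01)  # => [95851, 7]
bloom_parameters(250, 0.001)    # => [3595, 10]
```

The first call matches the documented defaults, the second matches the "Automatic Sizing" example, and the third matches the Quick Start filter.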
data/ext/cbloomfilter/cbloomfilter.c
CHANGED
|
@@ -4,15 +4,15 @@
|
|
|
4
4
|
*/
|
|
5
5
|
|
|
6
6
|
#include "ruby.h"
|
|
7
|
-
#include
|
|
7
|
+
#include <limits.h>
|
|
8
|
+
#include "salts.h"
|
|
8
9
|
|
|
9
10
|
#if !defined(RSTRING_LEN)
|
|
10
11
|
# define RSTRING_LEN(x) (RSTRING(x)->len)
|
|
11
12
|
# define RSTRING_PTR(x) (RSTRING(x)->ptr)
|
|
12
13
|
#endif
|
|
13
14
|
|
|
14
|
-
|
|
15
|
-
static unsigned int *salts = crc_table;
|
|
15
|
+
static const int salts_length = sizeof(salts) / sizeof(salts[0]);
|
|
16
16
|
|
|
17
17
|
static VALUE cBloomFilter;
|
|
18
18
|
|
|
@@ -26,7 +26,7 @@ struct BloomFilter {
|
|
|
26
26
|
unsigned long djb2(const char *str, int len) {
|
|
27
27
|
unsigned long hash = 5381;
|
|
28
28
|
for (int i = 0; i < len; i++) {
|
|
29
|
-
hash = ((hash << 5) + hash) + str[i];
|
|
29
|
+
hash = ((hash << 5) + hash) + (unsigned char) str[i];
|
|
30
30
|
}
|
|
31
31
|
return hash;
|
|
32
32
|
}
|
|
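The `(unsigned char)` cast added to `djb2` matters for bytes at or above `0x80`. Whether plain `char` is signed is implementation-defined in C, so without the cast the same non-ASCII key could hash differently from one platform to another. A rough Ruby model of the two behaviours follows; it assumes a 64-bit `unsigned long`, and the `MASK` constant and `signed:` flag are illustrative, not part of the gem.

```ruby
# Assumption: unsigned long is 64 bits wide, so arithmetic wraps modulo 2**64.
MASK = 0xFFFF_FFFF_FFFF_FFFF

# Illustrative model of djb2. With signed: true, bytes >= 0x80 are treated
# as negative values, as they would be on a platform where char is signed.
def djb2(str, signed: false)
  str.each_byte.reduce(5381) do |hash, byte|
    byte -= 256 if signed && byte >= 128
    ((hash << 5) + hash + byte) & MASK
  end
end

djb2("a")                                        # => 177670
djb2("cat") == djb2("cat", signed: true)         # => true (ASCII only)
djb2("caf\u00E9") == djb2("caf\u00E9", signed: true) # => false
```

Pure ASCII keys are unaffected either way, which is why the difference only shows up for multibyte or binary input.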
@@ -92,14 +92,41 @@ static int bucket_check(struct BloomFilter *bf, int index) {
|
|
|
92
92
|
return (bf->ptr[byte_offset] >> bit_offset) & 1;
|
|
93
93
|
}
|
|
94
94
|
|
|
95
|
+
static void bf_ensure_compatible(struct BloomFilter *bf, struct BloomFilter *other) {
|
|
96
|
+
if (bf->m != other->m || bf->k != other->k || bf->bytes != other->bytes) {
|
|
97
|
+
rb_raise(rb_eArgError, "bloom filters must have matching size and hash count");
|
|
98
|
+
}
|
|
99
|
+
}
|
|
100
|
+
|
|
101
|
+
static void bf_clear_padding_bits(struct BloomFilter *bf) {
|
|
102
|
+
int full_bytes = bf->m / 8;
|
|
103
|
+
int remaining_bits = bf->m % 8;
|
|
104
|
+
int i;
|
|
105
|
+
|
|
106
|
+
if (remaining_bits > 0) {
|
|
107
|
+
unsigned char mask = (unsigned char) ((1U << remaining_bits) - 1U);
|
|
108
|
+
bf->ptr[full_bytes] &= mask;
|
|
109
|
+
full_bytes += 1;
|
|
110
|
+
}
|
|
111
|
+
|
|
112
|
+
for (i = full_bytes; i < bf->bytes; i++) {
|
|
113
|
+
bf->ptr[i] = 0;
|
|
114
|
+
}
|
|
115
|
+
}
|
|
116
|
+
|
|
95
117
|
static VALUE bf_initialize(int argc, VALUE *argv, VALUE self) {
|
|
96
118
|
struct BloomFilter *bf;
|
|
97
119
|
VALUE arg1, arg2;
|
|
120
|
+
long m_value, k_value;
|
|
98
121
|
int m, k;
|
|
99
122
|
|
|
100
123
|
bf = bf_ptr(self);
|
|
101
124
|
|
|
102
|
-
|
|
125
|
+
if (argc > 2) {
|
|
126
|
+
rb_error_arity(argc, 0, 2);
|
|
127
|
+
}
|
|
128
|
+
|
|
129
|
+
/* defaults */
|
|
103
130
|
arg1 = INT2FIX(1000);
|
|
104
131
|
arg2 = INT2FIX(4);
|
|
105
132
|
|
|
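`bf_clear_padding_bits` keeps only the low `m % 8` bits of the last partially used byte and zeroes every byte after it, so stray bits beyond the configured width cannot leak into the filter. The same logic can be sketched in Ruby on an array of byte values; the method name and the array representation are illustrative only.

```ruby
# Illustrative Ruby port of the masking logic; bytes is an Array of 0..255.
# Returns a cleaned copy instead of mutating in place.
def clear_padding_bits(bytes, m)
  full_bytes = m / 8
  remaining_bits = m % 8
  cleaned = bytes.dup

  if remaining_bits > 0
    # Keep only the low remaining_bits bits of the partially used byte.
    mask = (1 << remaining_bits) - 1
    cleaned[full_bytes] &= mask
    full_bytes += 1
  end

  # Everything past the last used byte is padding and gets zeroed.
  (full_bytes...cleaned.length).each { |i| cleaned[i] = 0 }
  cleaned
end

clear_padding_bits([0xFF, 0xFF, 0xFF, 0xFF], 12) # => [255, 15, 0, 0]
```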
@@ -111,13 +138,23 @@ static VALUE bf_initialize(int argc, VALUE *argv, VALUE self) {
|
|
|
111
138
|
break;
|
|
112
139
|
}
|
|
113
140
|
|
|
114
|
-
|
|
115
|
-
|
|
141
|
+
m_value = NUM2LONG(arg1);
|
|
142
|
+
k_value = NUM2LONG(arg2);
|
|
143
|
+
|
|
144
|
+
if (m_value > INT_MAX - 15)
|
|
145
|
+
rb_raise(rb_eRangeError, "bit length is too large");
|
|
146
|
+
if (k_value > INT_MAX)
|
|
147
|
+
rb_raise(rb_eRangeError, "hash length is too large");
|
|
148
|
+
|
|
149
|
+
m = (int) m_value;
|
|
150
|
+
k = (int) k_value;
|
|
116
151
|
|
|
117
152
|
if (m < 1)
|
|
118
|
-
rb_raise(rb_eArgError, "
|
|
153
|
+
rb_raise(rb_eArgError, "bit length must be >= 1");
|
|
119
154
|
if (k < 1)
|
|
120
|
-
rb_raise(rb_eArgError, "hash length");
|
|
155
|
+
rb_raise(rb_eArgError, "hash length must be >= 1");
|
|
156
|
+
if (k > salts_length)
|
|
157
|
+
rb_raise(rb_eArgError, "hash length must be <= %d", salts_length);
|
|
121
158
|
|
|
122
159
|
bf->m = m;
|
|
123
160
|
bf->k = k;
|
|
@@ -131,7 +168,6 @@ static VALUE bf_initialize(int argc, VALUE *argv, VALUE self) {
|
|
|
131
168
|
|
|
132
169
|
/* initialize the bits with zeros */
|
|
133
170
|
memset(bf->ptr, 0, bf->bytes);
|
|
134
|
-
rb_iv_set(self, "@hash_value", rb_hash_new());
|
|
135
171
|
|
|
136
172
|
return self;
|
|
137
173
|
}
|
|
@@ -154,12 +190,18 @@ static VALUE bf_k(VALUE self) {
|
|
|
154
190
|
|
|
155
191
|
static VALUE bf_set_bits(VALUE self){
|
|
156
192
|
struct BloomFilter *bf = bf_ptr(self);
|
|
157
|
-
int i,
|
|
193
|
+
int i, count = 0;
|
|
194
|
+
|
|
158
195
|
for (i = 0; i < bf->bytes; i++) {
|
|
159
|
-
|
|
160
|
-
|
|
196
|
+
unsigned char byte = bf->ptr[i];
|
|
197
|
+
|
|
198
|
+
/* Brian Kernighan’s bit-count loop */
|
|
199
|
+
while (byte != 0) {
|
|
200
|
+
byte &= (unsigned char) (byte - 1);
|
|
201
|
+
count++;
|
|
161
202
|
}
|
|
162
203
|
}
|
|
204
|
+
|
|
163
205
|
return INT2FIX(count);
|
|
164
206
|
}
|
|
165
207
|
|
|
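The rewritten `bf_set_bits` counts bits with Brian Kernighan's loop: `byte &= byte - 1` clears the lowest set bit, so the loop body runs once per set bit rather than once per bit position. A minimal Ruby sketch of the same idea, with `count_set_bits` as a made-up name:

```ruby
# Counts the set bits across an Array of byte values (0..255).
def count_set_bits(bytes)
  bytes.sum do |byte|
    count = 0
    while byte != 0
      byte &= byte - 1 # clears the lowest set bit
      count += 1
    end
    count
  end
end

count_set_bits([0b0001_0100, 0b0100_0001]) # => 4
```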
@@ -193,6 +235,9 @@ static VALUE bf_merge(VALUE self, VALUE other) {
|
|
|
193
235
|
struct BloomFilter *bf = bf_ptr(self);
|
|
194
236
|
struct BloomFilter *target = bf_ptr(other);
|
|
195
237
|
int i;
|
|
238
|
+
|
|
239
|
+
bf_ensure_compatible(bf, target);
|
|
240
|
+
|
|
196
241
|
for (i = 0; i < bf->bytes; i++) {
|
|
197
242
|
bf->ptr[i] |= target->ptr[i];
|
|
198
243
|
}
|
|
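With `bf_ensure_compatible` in place, the native merge refuses to OR two bitmaps unless width, hash count, and byte length all match. Without such a check, the loop over `bf->bytes` could read past the end of a shorter `target` buffer, and bits set under a different `m` or `k` would be meaningless in the receiver anyway. The sketch below models the guarded merge in Ruby; `ToyFilter`, `ensure_compatible!`, and `merge!` are hypothetical stand-ins, not the gem's API.

```ruby
# Toy stand-in for the native struct: m = bit width, k = hash count.
ToyFilter = Struct.new(:m, :k, :bytes)

def ensure_compatible!(a, b)
  return if a.m == b.m && a.k == b.k && a.bytes.length == b.bytes.length

  raise ArgumentError, "bloom filters must have matching size and hash count"
end

# Bytewise OR of other into target, guarded by the compatibility check.
def merge!(target, other)
  ensure_compatible!(target, other)
  other.bytes.each_with_index { |byte, i| target.bytes[i] |= byte }
  target
end

a = ToyFilter.new(16, 4, [0b0001, 0b0000])
b = ToyFilter.new(16, 4, [0b0100, 0b0001])
merge!(a, b).bytes # => [5, 1]
```

Calling `merge!` with filters whose `m` or `k` differ raises `ArgumentError` before any byte is touched.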
@@ -206,6 +251,8 @@ static VALUE bf_and(VALUE self, VALUE other) {
|
|
|
206
251
|
VALUE klass, obj, args[5];
|
|
207
252
|
int i;
|
|
208
253
|
|
|
254
|
+
bf_ensure_compatible(bf, bf_other);
|
|
255
|
+
|
|
209
256
|
args[0] = INT2FIX(bf->m);
|
|
210
257
|
args[1] = INT2FIX(bf->k);
|
|
211
258
|
klass = rb_funcall(self,rb_intern("class"),0);
|
|
@@ -225,6 +272,8 @@ static VALUE bf_or(VALUE self, VALUE other) {
|
|
|
225
272
|
VALUE klass, obj, args[5];
|
|
226
273
|
int i;
|
|
227
274
|
|
|
275
|
+
bf_ensure_compatible(bf, bf_other);
|
|
276
|
+
|
|
228
277
|
args[0] = INT2FIX(bf->m);
|
|
229
278
|
args[1] = INT2FIX(bf->k);
|
|
230
279
|
klass = rb_funcall(self,rb_intern("class"),0);
|
|
@@ -278,9 +327,17 @@ static VALUE bf_bitmap(VALUE self) {
|
|
|
278
327
|
|
|
279
328
|
static VALUE bf_load(VALUE self, VALUE bitmap) {
|
|
280
329
|
struct BloomFilter *bf = bf_ptr(self);
|
|
281
|
-
|
|
330
|
+
VALUE bitmap_string = StringValue(bitmap);
|
|
331
|
+
unsigned char* ptr;
|
|
332
|
+
|
|
333
|
+
if (RSTRING_LEN(bitmap_string) != bf->bytes) {
|
|
334
|
+
rb_raise(rb_eArgError, "bitmap length must be %d bytes", bf->bytes);
|
|
335
|
+
}
|
|
336
|
+
|
|
337
|
+
ptr = (unsigned char *) RSTRING_PTR(bitmap_string);
|
|
282
338
|
|
|
283
339
|
memcpy(bf->ptr, ptr, bf->bytes);
|
|
340
|
+
bf_clear_padding_bits(bf);
|
|
284
341
|
|
|
285
342
|
return Qnil;
|
|
286
343
|
}
|
data/ext/cbloomfilter/salts.h
ADDED
|
@@ -0,0 +1,50 @@
|
|
|
1
|
+
/*
|
|
2
|
+
* Borrowed from the CRC table
|
|
3
|
+
* https://www.mrob.com/pub/comp/crc-all.html
|
|
4
|
+
*
|
|
5
|
+
*/
|
|
6
|
+
static unsigned int salts[] = {
|
|
7
|
+
0x00000000UL, 0x77073096UL, 0xee0e612cUL, 0x990951baUL, 0x076dc419UL, 0x706af48fUL,
|
|
8
|
+
0xe963a535UL, 0x9e6495a3UL, 0x0edb8832UL, 0x79dcb8a4UL, 0xe0d5e91eUL, 0x97d2d988UL,
|
|
9
|
+
0x09b64c2bUL, 0x7eb17cbdUL, 0xe7b82d07UL, 0x90bf1d91UL, 0x1db71064UL, 0x6ab020f2UL,
|
|
10
|
+
0xf3b97148UL, 0x84be41deUL, 0x1adad47dUL, 0x6ddde4ebUL, 0xf4d4b551UL, 0x83d385c7UL,
|
|
11
|
+
0x136c9856UL, 0x646ba8c0UL, 0xfd62f97aUL, 0x8a65c9ecUL, 0x14015c4fUL, 0x63066cd9UL,
|
|
12
|
+
0xfa0f3d63UL, 0x8d080df5UL, 0x3b6e20c8UL, 0x4c69105eUL, 0xd56041e4UL, 0xa2677172UL,
|
|
13
|
+
0x3c03e4d1UL, 0x4b04d447UL, 0xd20d85fdUL, 0xa50ab56bUL, 0x35b5a8faUL, 0x42b2986cUL,
|
|
14
|
+
0xdbbbc9d6UL, 0xacbcf940UL, 0x32d86ce3UL, 0x45df5c75UL, 0xdcd60dcfUL, 0xabd13d59UL,
|
|
15
|
+
0x26d930acUL, 0x51de003aUL, 0xc8d75180UL, 0xbfd06116UL, 0x21b4f4b5UL, 0x56b3c423UL,
|
|
16
|
+
0xcfba9599UL, 0xb8bda50fUL, 0x2802b89eUL, 0x5f058808UL, 0xc60cd9b2UL, 0xb10be924UL,
|
|
17
|
+
0x2f6f7c87UL, 0x58684c11UL, 0xc1611dabUL, 0xb6662d3dUL, 0x76dc4190UL, 0x01db7106UL,
|
|
18
|
+
0x98d220bcUL, 0xefd5102aUL, 0x71b18589UL, 0x06b6b51fUL, 0x9fbfe4a5UL, 0xe8b8d433UL,
|
|
19
|
+
0x7807c9a2UL, 0x0f00f934UL, 0x9609a88eUL, 0xe10e9818UL, 0x7f6a0dbbUL, 0x086d3d2dUL,
|
|
20
|
+
0x91646c97UL, 0xe6635c01UL, 0x6b6b51f4UL, 0x1c6c6162UL, 0x856530d8UL, 0xf262004eUL,
|
|
21
|
+
0x6c0695edUL, 0x1b01a57bUL, 0x8208f4c1UL, 0xf50fc457UL, 0x65b0d9c6UL, 0x12b7e950UL,
|
|
22
|
+
0x8bbeb8eaUL, 0xfcb9887cUL, 0x62dd1ddfUL, 0x15da2d49UL, 0x8cd37cf3UL, 0xfbd44c65UL,
|
|
23
|
+
0x4db26158UL, 0x3ab551ceUL, 0xa3bc0074UL, 0xd4bb30e2UL, 0x4adfa541UL, 0x3dd895d7UL,
|
|
24
|
+
0xa4d1c46dUL, 0xd3d6f4fbUL, 0x4369e96aUL, 0x346ed9fcUL, 0xad678846UL, 0xda60b8d0UL,
|
|
25
|
+
0x44042d73UL, 0x33031de5UL, 0xaa0a4c5fUL, 0xdd0d7cc9UL, 0x5005713cUL, 0x270241aaUL,
|
|
26
|
+
0xbe0b1010UL, 0xc90c2086UL, 0x5768b525UL, 0x206f85b3UL, 0xb966d409UL, 0xce61e49fUL,
|
|
27
|
+
0x5edef90eUL, 0x29d9c998UL, 0xb0d09822UL, 0xc7d7a8b4UL, 0x59b33d17UL, 0x2eb40d81UL,
|
|
28
|
+
0xb7bd5c3bUL, 0xc0ba6cadUL, 0xedb88320UL, 0x9abfb3b6UL, 0x03b6e20cUL, 0x74b1d29aUL,
|
|
29
|
+
0xead54739UL, 0x9dd277afUL, 0x04db2615UL, 0x73dc1683UL, 0xe3630b12UL, 0x94643b84UL,
|
|
30
|
+
0x0d6d6a3eUL, 0x7a6a5aa8UL, 0xe40ecf0bUL, 0x9309ff9dUL, 0x0a00ae27UL, 0x7d079eb1UL,
|
|
31
|
+
0xf00f9344UL, 0x8708a3d2UL, 0x1e01f268UL, 0x6906c2feUL, 0xf762575dUL, 0x806567cbUL,
|
|
32
|
+
0x196c3671UL, 0x6e6b06e7UL, 0xfed41b76UL, 0x89d32be0UL, 0x10da7a5aUL, 0x67dd4accUL,
|
|
33
|
+
0xf9b9df6fUL, 0x8ebeeff9UL, 0x17b7be43UL, 0x60b08ed5UL, 0xd6d6a3e8UL, 0xa1d1937eUL,
|
|
34
|
+
0x38d8c2c4UL, 0x4fdff252UL, 0xd1bb67f1UL, 0xa6bc5767UL, 0x3fb506ddUL, 0x48b2364bUL,
|
|
35
|
+
0xd80d2bdaUL, 0xaf0a1b4cUL, 0x36034af6UL, 0x41047a60UL, 0xdf60efc3UL, 0xa867df55UL,
|
|
36
|
+
0x316e8eefUL, 0x4669be79UL, 0xcb61b38cUL, 0xbc66831aUL, 0x256fd2a0UL, 0x5268e236UL,
|
|
37
|
+
0xcc0c7795UL, 0xbb0b4703UL, 0x220216b9UL, 0x5505262fUL, 0xc5ba3bbeUL, 0xb2bd0b28UL,
|
|
38
|
+
0x2bb45a92UL, 0x5cb36a04UL, 0xc2d7ffa7UL, 0xb5d0cf31UL, 0x2cd99e8bUL, 0x5bdeae1dUL,
|
|
39
|
+
0x9b64c2b0UL, 0xec63f226UL, 0x756aa39cUL, 0x026d930aUL, 0x9c0906a9UL, 0xeb0e363fUL,
|
|
40
|
+
0x72076785UL, 0x05005713UL, 0x95bf4a82UL, 0xe2b87a14UL, 0x7bb12baeUL, 0x0cb61b38UL,
|
|
41
|
+
0x92d28e9bUL, 0xe5d5be0dUL, 0x7cdcefb7UL, 0x0bdbdf21UL, 0x86d3d2d4UL, 0xf1d4e242UL,
|
|
42
|
+
0x68ddb3f8UL, 0x1fda836eUL, 0x81be16cdUL, 0xf6b9265bUL, 0x6fb077e1UL, 0x18b74777UL,
|
|
43
|
+
0x88085ae6UL, 0xff0f6a70UL, 0x66063bcaUL, 0x11010b5cUL, 0x8f659effUL, 0xf862ae69UL,
|
|
44
|
+
0x616bffd3UL, 0x166ccf45UL, 0xa00ae278UL, 0xd70dd2eeUL, 0x4e048354UL, 0x3903b3c2UL,
|
|
45
|
+
0xa7672661UL, 0xd06016f7UL, 0x4969474dUL, 0x3e6e77dbUL, 0xaed16a4aUL, 0xd9d65adcUL,
|
|
46
|
+
0x40df0b66UL, 0x37d83bf0UL, 0xa9bcae53UL, 0xdebb9ec5UL, 0x47b2cf7fUL, 0x30b5ffe9UL,
|
|
47
|
+
0xbdbdf21cUL, 0xcabac28aUL, 0x53b39330UL, 0x24b4a3a6UL, 0xbad03605UL, 0xcdd70693UL,
|
|
48
|
+
0x54de5729UL, 0x23d967bfUL, 0xb3667a2eUL, 0xc4614ab8UL, 0x5d681b02UL, 0x2a6f2b94UL,
|
|
49
|
+
0xb40bbe37UL, 0xc30c8ea1UL, 0x5a05df1bUL, 0x2d02ef8dUL
|
|
50
|
+
};
|
data/lib/bloom_fit/version.rb
CHANGED
data/lib/bloom_fit.rb
CHANGED
|
@@ -1,7 +1,6 @@
|
|
|
1
1
|
require "forwardable"
|
|
2
2
|
|
|
3
3
|
require "cbloomfilter"
|
|
4
|
-
require "bloom_fit/configuration_mismatch"
|
|
5
4
|
require "bloom_fit/version"
|
|
6
5
|
|
|
7
6
|
# BloomFit is an in-memory Bloom filter with a small, Set-like API.
|
|
@@ -16,7 +15,7 @@ require "bloom_fit/version"
|
|
|
16
15
|
# serialized with +save+ and reloaded with +BloomFit.load+.
|
|
17
16
|
#
|
|
18
17
|
# Filters can only be combined when they were created with the same +size+ and
|
|
19
|
-
# +hashes+ values; otherwise
|
|
18
|
+
# +hashes+ values; otherwise the native extension raises +ArgumentError+.
|
|
20
19
|
#
|
|
21
20
|
# filter = BloomFit.new(size: 10_000, hashes: 6)
|
|
22
21
|
# filter.add("cat")
|
|
@@ -28,6 +27,8 @@ require "bloom_fit/version"
|
|
|
28
27
|
class BloomFit
|
|
29
28
|
extend Forwardable
|
|
30
29
|
|
|
30
|
+
LN2 = Math.log(2.0).freeze
|
|
31
|
+
|
|
31
32
|
# The wrapped native +CBloomFilter+ instance.
|
|
32
33
|
#
|
|
33
34
|
# This is mostly useful for low-level integrations and internal filter
|
|
@@ -40,9 +41,19 @@ class BloomFit
|
|
|
40
41
|
# but the best values depend on how many keys you expect to insert and how
|
|
41
42
|
# many false positives you can tolerate.
|
|
42
43
|
#
|
|
44
|
+
# @param capacity [Integer] expected number of elements to store in the set
|
|
45
|
+
# @param false_positive_rate [Float] acceptable false-positive rate, strictly between 0 and 1
|
|
43
46
|
# @param size [Integer] number of buckets in a bloom filter
|
|
44
47
|
# @param hashes [Integer] number of hash functions
|
|
45
|
-
def initialize(size:
|
|
48
|
+
def initialize(capacity: 100, false_positive_rate: 0.001, size: nil, hashes: 4)
|
|
49
|
+
if size.nil? || hashes.nil?
|
|
50
|
+
raise ArgumentError, "capacity must be > 0" unless capacity.positive?
|
|
51
|
+
raise ArgumentError, "false_positive_rate must be between 0 and 1" if false_positive_rate <= 0.0 || false_positive_rate >= 1.0
|
|
52
|
+
|
|
53
|
+
size = (-capacity.to_f * Math.log(false_positive_rate) / (LN2**2)).ceil
|
|
54
|
+
hashes = (size / capacity * LN2).ceil
|
|
55
|
+
end
|
|
56
|
+
|
|
46
57
|
@bf = CBloomFilter.new(size, hashes)
|
|
47
58
|
end
|
|
48
59
|
|
|
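One detail in the `hashes` calculation is worth checking. `size` is an Integer after `.ceil`, so when `capacity` is also an Integer, `size / capacity` is integer division in Ruby and the quotient is truncated before it is multiplied by `LN2`. For the values quoted in the README the result is the same as with floating-point division, but the two can differ. The sketch below compares both variants; the method names are made up for illustration, and 730 is the `size` obtained for `capacity: 100, false_positive_rate: 0.03`.

```ruby
LN2 = Math.log(2.0)

# Mirrors the expression in the diff: Integer / Integer truncates.
def hashes_as_written(size, capacity)
  (size / capacity * LN2).ceil
end

# Same formula with the quotient kept as a Float.
def hashes_with_float_division(size, capacity)
  (size.to_f / capacity * LN2).ceil
end

hashes_as_written(1438, 100)          # => 10
hashes_with_float_division(1438, 100) # => 10
hashes_as_written(730, 100)           # => 5
hashes_with_float_division(730, 100)  # => 6
```

Passing `capacity` as a Float (for example `100.0`) would make the first expression behave like the second, since Integer divided by Float yields a Float.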
@@ -68,15 +79,11 @@ class BloomFit
|
|
|
68
79
|
#
|
|
69
80
|
# Positive results are probabilistic and may be false positives.
|
|
70
81
|
|
|
71
|
-
# :method: clear
|
|
72
|
-
#
|
|
73
|
-
# Clears the filter by resetting all bits to +0+.
|
|
74
|
-
|
|
75
82
|
# :method: set_bits
|
|
76
83
|
#
|
|
77
84
|
# Returns the number of bits currently set to +1+.
|
|
78
85
|
|
|
79
|
-
def_delegators :@bf, :m, :k, :bitmap, :include?, :
|
|
86
|
+
def_delegators :@bf, :m, :k, :bitmap, :include?, :set_bits
|
|
80
87
|
|
|
81
88
|
# Returns the configured filter width.
|
|
82
89
|
alias size m
|
|
@@ -103,6 +110,12 @@ class BloomFit
|
|
|
103
110
|
end
|
|
104
111
|
alias << add
|
|
105
112
|
|
|
113
|
+
# Clears the filter by resetting all bits to +0+ and returns +self+.
|
|
114
|
+
def clear
|
|
115
|
+
@bf.clear
|
|
116
|
+
self
|
|
117
|
+
end
|
|
118
|
+
|
|
106
119
|
# Adds +key+ to the filter when +value+ is truthy.
|
|
107
120
|
#
|
|
108
121
|
# This makes BloomFit behave like a write-only membership hash: truthy values
|
|
@@ -150,7 +163,6 @@ class BloomFit
|
|
|
150
163
|
# This method mutates the receiver and mimics Set#merge.
|
|
151
164
|
def merge(other)
|
|
152
165
|
if other.is_a?(BloomFit)
|
|
153
|
-
raise BloomFit::ConfigurationMismatch unless same_parameters?(other)
|
|
154
166
|
@bf.merge(other.bf)
|
|
155
167
|
elsif other.respond_to?(:each_key)
|
|
156
168
|
other.each { |k, v| add(k) if v }
|
|
@@ -159,17 +171,18 @@ class BloomFit
|
|
|
159
171
|
else
|
|
160
172
|
raise ArgumentError, "value must be enumerable or another BloomFit filter"
|
|
161
173
|
end
|
|
174
|
+
|
|
175
|
+
self
|
|
162
176
|
end
|
|
163
177
|
|
|
164
178
|
# Returns a new filter containing the bitwise intersection of two filters.
|
|
165
179
|
#
|
|
166
|
-
# Both filters must have the same +size+ and +hashes+ values or
|
|
167
|
-
#
|
|
180
|
+
# Both filters must have the same +size+ and +hashes+ values or the native
|
|
181
|
+
# extension raises +ArgumentError+.
|
|
168
182
|
#
|
|
169
183
|
# Like all Bloom filter operations, membership checks on the result remain
|
|
170
184
|
# probabilistic and may still produce false positives.
|
|
171
185
|
def &(other)
|
|
172
|
-
raise BloomFit::ConfigurationMismatch unless same_parameters?(other)
|
|
173
186
|
self.class.new(size:, hashes:).tap do |result|
|
|
174
187
|
result.instance_variable_set(:@bf, @bf.&(other.bf))
|
|
175
188
|
end
|
|
@@ -178,12 +191,11 @@ class BloomFit
|
|
|
178
191
|
|
|
179
192
|
# Returns a new filter containing the bitwise union of two filters.
|
|
180
193
|
#
|
|
181
|
-
# Both filters must have the same +size+ and +hashes+ values or
|
|
182
|
-
#
|
|
194
|
+
# Both filters must have the same +size+ and +hashes+ values or the native
|
|
195
|
+
# extension raises +ArgumentError+.
|
|
183
196
|
#
|
|
184
197
|
# The receiver and +other+ are left unchanged.
|
|
185
198
|
def |(other)
|
|
186
|
-
raise BloomFit::ConfigurationMismatch unless same_parameters?(other)
|
|
187
199
|
self.class.new(size:, hashes:).tap do |result|
|
|
188
200
|
result.instance_variable_set(:@bf, @bf.|(other.bf))
|
|
189
201
|
end
|
|
@@ -196,14 +208,14 @@ class BloomFit
|
|
|
196
208
|
# bits (+n+), the hash count (+k+), and the predicted false-positive rate
|
|
197
209
|
# based on the current fill level.
|
|
198
210
|
def stats
|
|
199
|
-
fpr = ((
|
|
211
|
+
fpr = ((n.to_f / m)**k) * 100
|
|
200
212
|
|
|
201
|
-
|
|
202
|
-
|
|
203
|
-
|
|
204
|
-
|
|
205
|
-
|
|
206
|
-
|
|
213
|
+
format <<~STATS, m, n, k, fpr
|
|
214
|
+
Number of filter buckets (m): %d
|
|
215
|
+
Number of set bits (n): %d
|
|
216
|
+
Number of filter hashes (k): %d
|
|
217
|
+
Predicted false positive rate: %.2f%%
|
|
218
|
+
STATS
|
|
207
219
|
end
|
|
208
220
|
|
|
209
221
|
# Rebuilds the filter from the serialized data returned by +marshal_dump+.
|
|
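`stats` estimates the false-positive rate from the current fill level. A lookup for a key that was never added is a false positive only when all `k` probed bits happen to be set, which, treating the probes as independent, has probability `(n / m)**k` where `n` is the number of set bits. A standalone version of that calculation, applied to the README's 16-bit example with four bits set (`predicted_fpr_percent` is a hypothetical name):

```ruby
# Predicted false-positive rate in percent, from the fill level.
#   set_bits: number of bits currently set (n)
#   m:        filter width in bits
#   k:        number of hash functions
def predicted_fpr_percent(set_bits, m, k)
  ((set_bits.to_f / m)**k) * 100
end

fpr = predicted_fpr_percent(4, 16, 4) # => 0.390625
format("%.2f%%", fpr)                 # => "0.39%"
```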
@@ -226,20 +238,11 @@ class BloomFit
|
|
|
226
238
|
# The file is read using Ruby's +Marshal+ format, so it should only be used
|
|
227
239
|
# with trusted input.
|
|
228
240
|
def self.load(filename)
|
|
229
|
-
Marshal.load(File.
|
|
241
|
+
Marshal.load(File.binread(filename)) # rubocop:disable Security/MarshalLoad
|
|
230
242
|
end
|
|
231
243
|
|
|
232
244
|
# Writes the filter to +filename+ using Ruby's +Marshal+ format.
|
|
233
245
|
def save(filename)
|
|
234
|
-
File.
|
|
235
|
-
f << Marshal.dump(self)
|
|
236
|
-
end
|
|
237
|
-
end
|
|
238
|
-
|
|
239
|
-
protected
|
|
240
|
-
|
|
241
|
-
# Returns +true+ when +other+ has the same +size+ and +hashes+ values.
|
|
242
|
-
def same_parameters?(other)
|
|
243
|
-
bf.m == other.bf.m && bf.k == other.bf.k
|
|
246
|
+
File.binwrite(filename, Marshal.dump(self))
|
|
244
247
|
end
|
|
245
248
|
end
|
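`save` and `BloomFit.load` now go through `File.binwrite` and `File.binread`. Marshal output is binary data, so handling it in binary mode avoids any newline or encoding translation on the way to and from disk. A self-contained sketch of the same round trip, using a plain Struct in place of a filter (`Snapshot`, its fields, and the arbitrary bitmap bytes are assumptions for illustration):

```ruby
require "tmpdir"

# Stand-in for a filter: just enough state to show the round trip.
Snapshot = Struct.new(:size, :hashes, :bitmap)

original = Snapshot.new(16, 4, "\x14\x41".b)

restored = Dir.mktmpdir do |dir|
  path = File.join(dir, "pets.bloom")
  File.binwrite(path, Marshal.dump(original))
  # Marshal.load can instantiate arbitrary objects: only load trusted files.
  Marshal.load(File.binread(path))
end

restored == original # => true
```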
data/lib/cbloomfilter.bundle
CHANGED
|
Binary file
|
data/test/bloom_fit_test.rb
CHANGED
|
@@ -3,6 +3,28 @@ require "test_helper"
|
|
|
3
3
|
class BloomFitTest < Minitest::Spec
|
|
4
4
|
subject { BloomFit.new(size: 100, hashes: 4) }
|
|
5
5
|
|
|
6
|
+
describe ".new" do
|
|
7
|
+
it "accepts size and hashes override" do
|
|
8
|
+
bf = BloomFit.new(size: 10, hashes: 1)
|
|
9
|
+
assert_equal 10, bf.size
|
|
10
|
+
assert_equal 1, bf.hashes
|
|
11
|
+
end
|
|
12
|
+
|
|
13
|
+
it "has default capacity and false-positive rate" do
|
|
14
|
+
bf = BloomFit.new
|
|
15
|
+
# https://hur.st/bloomfilter/?n=100&p=0.001&m=&k=
|
|
16
|
+
assert_equal 1438, bf.size
|
|
17
|
+
assert_equal 10, bf.hashes
|
|
18
|
+
end
|
|
19
|
+
|
|
20
|
+
it "calculates size and hashes given a capacity and false positive rate" do
|
|
21
|
+
bf = BloomFit.new(capacity: 10_000, false_positive_rate: 0.0001)
|
|
22
|
+
# https://hur.st/bloomfilter/?n=10000&p=0.0001&m=&k=
|
|
23
|
+
assert_equal 191_702, bf.size
|
|
24
|
+
assert_equal 14, bf.hashes
|
|
25
|
+
end
|
|
26
|
+
end
|
|
27
|
+
|
|
6
28
|
describe "#empty?" do
|
|
7
29
|
it "returns true when nothing set" do
|
|
8
30
|
assert_equal true, subject.empty? # rubocop:disable Minitest/AssertTruthy
|
|
@@ -102,11 +124,11 @@ class BloomFitTest < Minitest::Spec
|
|
|
102
124
|
end
|
|
103
125
|
|
|
104
126
|
describe "#clear" do
|
|
105
|
-
it "zeroes the bits" do
|
|
127
|
+
it "zeroes the bits and returns self" do
|
|
106
128
|
subject.add("test")
|
|
107
129
|
assert_includes subject, "test"
|
|
108
130
|
assert_includes subject.to_binary, "1"
|
|
109
|
-
subject.clear
|
|
131
|
+
assert_equal subject, subject.clear
|
|
110
132
|
refute_includes subject, "test"
|
|
111
133
|
refute_includes subject.to_binary, "1"
|
|
112
134
|
end
|
|
@@ -180,14 +202,14 @@ class BloomFitTest < Minitest::Spec
|
|
|
180
202
|
end
|
|
181
203
|
|
|
182
204
|
describe "#merge" do
|
|
183
|
-
it "merges another BloomFit filter" do
|
|
205
|
+
it "merges another BloomFit filter and returns self" do
|
|
184
206
|
bf1 = BloomFit.new(size: 100, hashes: 2)
|
|
185
207
|
bf2 = BloomFit.new(size: 100, hashes: 2)
|
|
186
208
|
bf1 << "mouse"
|
|
187
209
|
bf2 << "cat" << "dog"
|
|
188
210
|
refute_includes bf1, "cat"
|
|
189
211
|
refute_includes bf1, "dog"
|
|
190
|
-
bf1.merge(bf2)
|
|
212
|
+
assert_equal bf1, bf1.merge(bf2)
|
|
191
213
|
assert_includes bf1, "mouse"
|
|
192
214
|
assert_includes bf1, "cat"
|
|
193
215
|
assert_includes bf1, "dog"
|
|
@@ -196,9 +218,9 @@ class BloomFitTest < Minitest::Spec
|
|
|
196
218
|
assert_includes bf2, "dog"
|
|
197
219
|
end
|
|
198
220
|
|
|
199
|
-
it "merges an array" do
|
|
221
|
+
it "merges an array and returns self" do
|
|
200
222
|
subject << "mouse"
|
|
201
|
-
subject.merge
|
|
223
|
+
assert_equal subject, subject.merge(%i[cat dog])
|
|
202
224
|
assert_includes subject, "mouse"
|
|
203
225
|
assert_includes subject, "cat"
|
|
204
226
|
assert_includes subject, "dog"
|
|
@@ -225,7 +247,7 @@ class BloomFitTest < Minitest::Spec
|
|
|
225
247
|
it "raises when merge is between incompatible filters" do
|
|
226
248
|
bf1 = BloomFit.new(size: 10)
|
|
227
249
|
bf2 = BloomFit.new(size: 20)
|
|
228
|
-
assert_raises(
|
|
250
|
+
assert_raises(ArgumentError) { bf1.merge(bf2) }
|
|
229
251
|
end
|
|
230
252
|
end
|
|
231
253
|
|
|
@@ -263,11 +285,11 @@ class BloomFitTest < Minitest::Spec
|
|
|
263
285
|
it "raises when intersection is between incompatible filters" do
|
|
264
286
|
bf1 = BloomFit.new(size: 10)
|
|
265
287
|
bf2 = BloomFit.new(size: 20)
|
|
266
|
-
assert_raises(
|
|
288
|
+
assert_raises(ArgumentError) { bf1 & bf2 }
|
|
267
289
|
|
|
268
290
|
bf1 = BloomFit.new(size: 10, hashes: 2)
|
|
269
291
|
bf2 = BloomFit.new(size: 10, hashes: 4)
|
|
270
|
-
assert_raises(
|
|
292
|
+
assert_raises(ArgumentError) { bf1 & bf2 }
|
|
271
293
|
end
|
|
272
294
|
end
|
|
273
295
|
|
|
@@ -303,7 +325,7 @@ class BloomFitTest < Minitest::Spec
|
|
|
303
325
|
it "raises when union is between incompatible filters" do
|
|
304
326
|
bf1 = BloomFit.new(size: 10)
|
|
305
327
|
bf2 = BloomFit.new(size: 20)
|
|
306
|
-
assert_raises(
|
|
328
|
+
assert_raises(ArgumentError) { bf1 | bf2 }
|
|
307
329
|
end
|
|
308
330
|
end
|
|
309
331
|
|
|
@@ -318,16 +340,51 @@ class BloomFitTest < Minitest::Spec
|
|
|
318
340
|
STATS
|
|
319
341
|
assert_equal expected, bf.stats
|
|
320
342
|
end
|
|
343
|
+
|
|
344
|
+
it "estimates false positives from the current fill level" do
|
|
345
|
+
bf = BloomFit.new(size: 10, hashes: 3)
|
|
346
|
+
bf.bf.load("\x07\x00\x00".b)
|
|
347
|
+
|
|
348
|
+
expected = <<~STATS
|
|
349
|
+
Number of filter buckets (m): 10
|
|
350
|
+
Number of set bits (n): 3
|
|
351
|
+
Number of filter hashes (k): 3
|
|
352
|
+
Predicted false positive rate: 2.70%
|
|
353
|
+
STATS
|
|
354
|
+
assert_equal expected, bf.stats
|
|
355
|
+
end
|
|
321
356
|
end
|
|
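The new stats test expects a predicted false positive rate of 2.70% for a filter with 10 bits, 3 hashes, and 3 set bits. A minimal sketch, assuming the estimate is the fill ratio raised to the number of hashes (an inference from the expected output, not confirmed against the C extension), looks like this. The method name is hypothetical.

```ruby
# Hypothetical helper: estimates the false positive rate from the current
# fill level. A lookup is a false positive when all k probed bits happen to
# be set, which has probability (set_bits / m)^k if bits are independent.
def predicted_false_positive_rate(set_bits:, size:, hashes:)
  (set_bits.to_f / size)**hashes
end

rate = predicted_false_positive_rate(set_bits: 3, size: 10, hashes: 3)
format("%.2f%%", rate * 100) # => "2.70%"
```

Using the observed fill level rather than an item count means the estimate still works after a raw bitmap has been loaded, which is how the test sets up its filter.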
322
357
|
|
|
323
358
|
describe "serialization" do
|
|
324
|
-
after {
|
|
359
|
+
after { FileUtils.rm_f("bf.out") }
|
|
325
360
|
|
|
326
361
|
it "marshalls" do
|
|
327
362
|
bf = BloomFit.new
|
|
328
363
|
assert bf.save("bf.out")
|
|
329
364
|
end
|
|
330
365
|
|
|
366
|
+
it "uses binary file io" do
|
|
367
|
+
dumped = Marshal.dump(subject)
|
|
368
|
+
writer = Minitest::Mock.new
|
|
369
|
+
writer.expect(:call, dumped.bytesize, ["bf.out", dumped])
|
|
370
|
+
|
|
371
|
+
reader = Minitest::Mock.new
|
|
372
|
+
reader.expect(:call, dumped, ["bf.out"])
|
|
373
|
+
|
|
374
|
+
File.stub(:binwrite, writer) do
|
|
375
|
+
assert_equal dumped.bytesize, subject.save("bf.out")
|
|
376
|
+
end
|
|
377
|
+
|
|
378
|
+
File.stub(:binread, reader) do
|
|
379
|
+
bf2 = BloomFit.load("bf.out")
|
|
380
|
+
assert_equal subject.size, bf2.size
|
|
381
|
+
assert_equal subject.hashes, bf2.hashes
|
|
382
|
+
end
|
|
383
|
+
|
|
384
|
+
writer.verify
|
|
385
|
+
reader.verify
|
|
386
|
+
end
|
|
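The "uses binary file io" test stubs `File.binwrite` and `File.binread` and expects `save` to return the number of bytes written. The sketch below shows that round trip with plain Ruby. `FilterSnapshot`, `save_snapshot`, `load_snapshot`, and `round_trip` are hypothetical names used for illustration and are not BloomFit's implementation.

```ruby
require "tmpdir"

# Hypothetical stand-in for a serializable filter; not BloomFit itself.
FilterSnapshot = Struct.new(:size, :hashes, :bitmap)

# Writes the marshalled object in binary mode and returns the byte count,
# which is what File.binwrite itself returns.
def save_snapshot(snapshot, path)
  File.binwrite(path, Marshal.dump(snapshot))
end

# Reads the file in binary mode so no newline or encoding conversion can
# alter the marshalled bytes. Marshal.load should only be used on trusted
# files, because it can instantiate arbitrary objects.
def load_snapshot(path)
  Marshal.load(File.binread(path))
end

# Saves and reloads a snapshot in a temporary directory.
def round_trip(snapshot)
  Dir.mktmpdir do |dir|
    path = File.join(dir, "bf.out")
    bytes_written = save_snapshot(snapshot, path)
    [bytes_written, load_snapshot(path)]
  end
end

original = FilterSnapshot.new(10, 3, "\x07\x00\x00".b)
bytes_written, restored = round_trip(original)
restored == original # => true
```

Binary mode matters mainly on platforms that translate line endings in text mode, where a bitmap byte that looks like a newline could otherwise be rewritten.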
387
|
+
|
|
331
388
|
it "loads from marshalled" do
|
|
332
389
|
subject.add("foo")
|
|
333
390
|
subject.add("bar")
|
|
@@ -338,7 +395,8 @@ class BloomFitTest < Minitest::Spec
|
|
|
338
395
|
assert_includes bf2, "bar"
|
|
339
396
|
refute_includes bf2, "baz"
|
|
340
397
|
|
|
341
|
-
|
|
398
|
+
assert_equal subject.size, bf2.size
|
|
399
|
+
assert_equal subject.hashes, bf2.hashes
|
|
342
400
|
end
|
|
343
401
|
end
|
|
344
402
|
end
|
data/test/c_bloom_filter_test.rb
ADDED
|
@@ -0,0 +1,233 @@
|
|
|
1
|
+
require "test_helper"
|
|
2
|
+
|
|
3
|
+
class CBloomFilterTest < Minitest::Spec
|
|
4
|
+
subject { CBloomFilter.new }
|
|
5
|
+
|
|
6
|
+
describe ".new" do
|
|
7
|
+
it "rejects more than two arguments" do
|
|
8
|
+
error = assert_raises(ArgumentError) { CBloomFilter.new(1, 2, 3) }
|
|
9
|
+
assert_equal "wrong number of arguments (given 3, expected 0..2)", error.message
|
|
10
|
+
end
|
|
11
|
+
end
|
|
12
|
+
|
|
13
|
+
describe "#m" do
|
|
14
|
+
it "defaults" do
|
|
15
|
+
assert_equal 1000, subject.m
|
|
16
|
+
end
|
|
17
|
+
|
|
18
|
+
it "is set by the 1st arg of the constructor" do
|
|
19
|
+
bf = CBloomFilter.new(10_000)
|
|
20
|
+
assert_equal 10_000, bf.m
|
|
21
|
+
end
|
|
22
|
+
|
|
23
|
+
it "rejects values less than 1" do
|
|
24
|
+
error = assert_raises(ArgumentError) { CBloomFilter.new(-1) }
|
|
25
|
+
assert_equal "bit length must be >= 1", error.message
|
|
26
|
+
end
|
|
27
|
+
|
|
28
|
+
it "rejects values that overflow internal byte sizing" do
|
|
29
|
+
error = assert_raises(RangeError) { CBloomFilter.new((1 << 31) - 7) }
|
|
30
|
+
assert_equal "bit length is too large", error.message
|
|
31
|
+
end
|
|
32
|
+
end
|
|
33
|
+
|
|
34
|
+
describe "#k" do
|
|
35
|
+
it "defaults" do
|
|
36
|
+
assert_equal 4, subject.k
|
|
37
|
+
end
|
|
38
|
+
|
|
39
|
+
it "is set by the 2nd arg of the constructor" do
|
|
40
|
+
bf = CBloomFilter.new(10_000, 9)
|
|
41
|
+
assert_equal 9, bf.k
|
|
42
|
+
end
|
|
43
|
+
|
|
44
|
+
it "rejects values less than 1" do
|
|
45
|
+
error = assert_raises(ArgumentError) { CBloomFilter.new(1000, 0) }
|
|
46
|
+
assert_equal "hash length must be >= 1", error.message
|
|
47
|
+
end
|
|
48
|
+
|
|
49
|
+
it "rejects values larger than the salt table" do
|
|
50
|
+
error = assert_raises(ArgumentError) { CBloomFilter.new(10_000, 257) }
|
|
51
|
+
assert_equal "hash length must be <= 256", error.message
|
|
52
|
+
end
|
|
53
|
+
end
|
|
54
|
+
|
|
55
|
+
describe "#set_bits" do
|
|
56
|
+
it "initializes to zero" do
|
|
57
|
+
assert_equal 0, subject.set_bits
|
|
58
|
+
end
|
|
59
|
+
|
|
60
|
+
it "counts the bits when active" do
|
|
61
|
+
subject.add("foo")
|
|
62
|
+
assert_equal 4, subject.set_bits
|
|
63
|
+
end
|
|
64
|
+
end
|
|
65
|
+
|
|
66
|
+
describe "#add" do
|
|
67
|
+
it "adds keys to the filter set" do
|
|
68
|
+
subject.add("foo")
|
|
69
|
+
subject.add("bar")
|
|
70
|
+
assert_includes subject, "foo"
|
|
71
|
+
assert_includes subject, "bar"
|
|
72
|
+
refute_includes subject, "baz"
|
|
73
|
+
end
|
|
74
|
+
|
|
75
|
+
it "treats binary bytes as unsigned when hashing" do
|
|
76
|
+
bf = CBloomFilter.new(20, 4)
|
|
77
|
+
bf.add("\xFF".b)
|
|
78
|
+
assert_equal "\x00\x05\x05\x00".b, bf.bitmap
|
|
79
|
+
end
|
|
80
|
+
end
|
|
81
|
+
|
|
82
|
+
describe "#include?" do
|
|
83
|
+
it "returns true when a key is in the set" do
|
|
84
|
+
subject.add("foo")
|
|
85
|
+
assert_equal true, subject.include?("foo") # rubocop:disable Minitest/AssertTruthy
|
|
86
|
+
end
|
|
87
|
+
|
|
88
|
+
it "returns false when a key is not in the set" do
|
|
89
|
+
subject.add("foo")
|
|
90
|
+
assert_equal false, subject.include?("bar") # rubocop:disable Minitest/RefuteFalse
|
|
91
|
+
end
|
|
92
|
+
end
|
|
93
|
+
|
|
94
|
+
describe "#clear" do
|
|
95
|
+
it "clears a set" do
|
|
96
|
+
subject.add("foo")
|
|
97
|
+
subject.add("bar")
|
|
98
|
+
subject.add("baz")
|
|
99
|
+
assert subject.set_bits.positive?
|
|
100
|
+
subject.clear
|
|
101
|
+
assert subject.set_bits.zero?
|
|
102
|
+
end
|
|
103
|
+
end
|
|
104
|
+
|
|
105
|
+
describe "#merge" do
|
|
106
|
+
it "adds keys from another set" do
|
|
107
|
+
subject.add("foo")
|
|
108
|
+
|
|
109
|
+
bf = CBloomFilter.new
|
|
110
|
+
bf.add("bar")
|
|
111
|
+
bf.add("baz")
|
|
112
|
+
|
|
113
|
+
subject.merge(bf)
|
|
114
|
+
assert_includes subject, "foo"
|
|
115
|
+
assert_includes subject, "bar"
|
|
116
|
+
assert_includes subject, "baz"
|
|
117
|
+
end
|
|
118
|
+
|
|
119
|
+
it "rejects incompatible filters" do
|
|
120
|
+
error = assert_raises(ArgumentError) { subject.merge(CBloomFilter.new(2000, 4)) }
|
|
121
|
+
assert_equal "bloom filters must have matching size and hash count", error.message
|
|
122
|
+
end
|
|
123
|
+
end
|
|
124
|
+
|
|
125
|
+
describe "#&" do
|
|
126
|
+
it "intersects keys from another set" do
|
|
127
|
+
subject.add("foo")
|
|
128
|
+
subject.add("bar")
|
|
129
|
+
|
|
130
|
+
bf = CBloomFilter.new
|
|
131
|
+
bf.add("bar")
|
|
132
|
+
bf.add("baz")
|
|
133
|
+
|
|
134
|
+
bf2 = subject & bf
|
|
135
|
+
refute_includes bf2, "foo"
|
|
136
|
+
assert_includes bf2, "bar"
|
|
137
|
+
refute_includes bf2, "baz"
|
|
138
|
+
|
|
139
|
+
bf3 = bf & subject
|
|
140
|
+
refute_includes bf3, "foo"
|
|
141
|
+
assert_includes bf3, "bar"
|
|
142
|
+
refute_includes bf3, "baz"
|
|
143
|
+
end
|
|
144
|
+
|
|
145
|
+
it "rejects incompatible filters" do
|
|
146
|
+
error = assert_raises(ArgumentError) { subject & CBloomFilter.new(1000, 2) }
|
|
147
|
+
assert_equal "bloom filters must have matching size and hash count", error.message
|
|
148
|
+
end
|
|
149
|
+
end
|
|
150
|
+
|
|
151
|
+
describe "#|" do
|
|
152
|
+
it "unions keys from another set" do
|
|
153
|
+
subject.add("foo")
|
|
154
|
+
subject.add("bar")
|
|
155
|
+
|
|
156
|
+
bf = CBloomFilter.new
|
|
157
|
+
bf.add("bar")
|
|
158
|
+
bf.add("baz")
|
|
159
|
+
|
|
160
|
+
bf2 = subject | bf
|
|
161
|
+
assert_includes bf2, "foo"
|
|
162
|
+
assert_includes bf2, "bar"
|
|
163
|
+
assert_includes bf2, "baz"
|
|
164
|
+
|
|
165
|
+
bf3 = bf | subject
|
|
166
|
+
assert_includes bf3, "foo"
|
|
167
|
+
assert_includes bf3, "bar"
|
|
168
|
+
assert_includes bf3, "baz"
|
|
169
|
+
end
|
|
170
|
+
|
|
171
|
+
it "rejects incompatible filters" do
|
|
172
|
+
error = assert_raises(ArgumentError) { subject | CBloomFilter.new(2000, 4) }
|
|
173
|
+
assert_equal "bloom filters must have matching size and hash count", error.message
|
|
174
|
+
end
|
|
175
|
+
end
|
|
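The `#merge`, `#&`, and `#|` tests above all expect an `ArgumentError` when the two filters differ in size or hash count. The following is a rough sketch of that guard combined with a byte-wise union and intersection. `SketchFilter` and the helper methods are hypothetical; only the error message is taken from the tests.

```ruby
# Hypothetical value object standing in for a filter: m bits, k hashes,
# and a binary bitmap string.
SketchFilter = Struct.new(:m, :k, :bitmap)

# Bitmaps are only comparable position by position when both filters use
# the same size and the same number of hashes.
def check_compatible!(left, right)
  return if left.m == right.m && left.k == right.k

  raise ArgumentError, "bloom filters must have matching size and hash count"
end

# Combines two bitmaps byte by byte with the given operator (:| or :&).
def combine(left, right, operator)
  check_compatible!(left, right)
  bytes = left.bitmap.bytes.zip(right.bitmap.bytes).map { |a, b| a.public_send(operator, b) }
  SketchFilter.new(left.m, left.k, bytes.pack("C*"))
end

def union(left, right)
  combine(left, right, :|)
end

def intersection(left, right)
  combine(left, right, :&)
end

a = SketchFilter.new(16, 4, "\x03\x00\x00".b)
b = SketchFilter.new(16, 4, "\x06\x00\x00".b)
union(a, b).bitmap        # => "\x07\x00\x00"
intersection(a, b).bitmap # => "\x02\x00\x00"
```

Both operations are symmetric in this sketch, which mirrors the tests checking `subject & bf` as well as `bf & subject`.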
176
|
+
|
|
177
|
+
describe "#bitmap" do
|
|
178
|
+
it "returns a binary bitmap of all zeros when empty (including a terminating byte)" do
|
|
179
|
+
bf = CBloomFilter.new(16)
|
|
180
|
+
assert_equal "\x00\x00\x00".b, bf.bitmap
|
|
181
|
+
end
|
|
182
|
+
|
|
183
|
+
it "returns a binary bitmap representing the set" do
|
|
184
|
+
bf = CBloomFilter.new(16, 4)
|
|
185
|
+
bf.add("something")
|
|
186
|
+
assert_equal "(\x82\x00".b, bf.bitmap
|
|
187
|
+
end
|
|
188
|
+
|
|
189
|
+
it "returns a binary bitmap representing the set even if not a multiple of 8 bits (includes padding)" do
|
|
190
|
+
bf = CBloomFilter.new(20, 4)
|
|
191
|
+
bf.add("wow")
|
|
192
|
+
assert_equal "\x04\x14\x00\x00".b, bf.bitmap
|
|
193
|
+
end
|
|
194
|
+
end
|
|
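The `#bitmap` tests above, together with the "bitmap length must be 126 bytes" message in the `#load` tests, imply a bitmap length of the bit count rounded up to whole bytes plus one terminating byte. A small sketch of that arithmetic, with a hypothetical method name, reproduces the lengths seen in the tests.

```ruby
# Hypothetical helper: number of bytes in the bitmap for a filter of
# `bits` bits, assuming whole-byte rounding plus one terminating byte.
def bitmap_bytesize(bits)
  raise ArgumentError, "bit length must be >= 1" if bits < 1

  ((bits + 7) / 8) + 1
end

bitmap_bytesize(16)   # => 3
bitmap_bytesize(20)   # => 4
bitmap_bytesize(1000) # => 126
```

In C, the `bits + 7` step can overflow a signed 32-bit integer for very large sizes, which is presumably why the test for `(1 << 31) - 7` expects a `RangeError`.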
195
|
+
|
|
196
|
+
describe "#load" do
|
|
197
|
+
it "overwrites the bitmap" do
|
|
198
|
+
bf = CBloomFilter.new(1000, 4)
|
|
199
|
+
bf.add("foo")
|
|
200
|
+
bf.add("bar")
|
|
201
|
+
subject.load(bf.bitmap)
|
|
202
|
+
assert_includes subject, "foo"
|
|
203
|
+
assert_includes subject, "bar"
|
|
204
|
+
end
|
|
205
|
+
|
|
206
|
+
it "rejects a short bitmap" do
|
|
207
|
+
error = assert_raises(ArgumentError) { subject.load("\x00".b) }
|
|
208
|
+
assert_equal "bitmap length must be 126 bytes", error.message
|
|
209
|
+
end
|
|
210
|
+
|
|
211
|
+
it "rejects a long bitmap" do
|
|
212
|
+
error = assert_raises(ArgumentError) { subject.load("\x00".b * 127) }
|
|
213
|
+
assert_equal "bitmap length must be 126 bytes", error.message
|
|
214
|
+
end
|
|
215
|
+
|
|
216
|
+
it "coerces bitmap-like objects to strings before loading" do
|
|
217
|
+
bitmap_data = subject.bitmap
|
|
218
|
+
bitmap = Object.new
|
|
219
|
+
bitmap.define_singleton_method(:to_str) { bitmap_data }
|
|
220
|
+
subject.load(bitmap)
|
|
221
|
+
assert_equal 0, subject.set_bits
|
|
222
|
+
end
|
|
223
|
+
|
|
224
|
+
it "clears loaded padding bits beyond the configured size" do
|
|
225
|
+
bf = CBloomFilter.new(20, 4)
|
|
226
|
+
|
|
227
|
+
bf.load("\x00\x00\xF0\xFF".b)
|
|
228
|
+
|
|
229
|
+
assert_equal 0, bf.set_bits
|
|
230
|
+
assert_equal "\x00\x00\x00\x00".b, bf.bitmap
|
|
231
|
+
end
|
|
232
|
+
end
|
|
233
|
+
end
|
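The last `#load` test expects bits beyond the configured size to be cleared after loading. The sketch below shows one way to do that masking. It assumes bits are stored least significant bit first within each byte, which is inferred from the test (`"\x00\x00\xF0\xFF"` with 20 bits yields zero set bits) and is not confirmed against the C source. `clear_padding` is a hypothetical name.

```ruby
# Hypothetical helper: zeroes every bit at index >= bits in a binary
# bitmap string, assuming LSB-first bit order within each byte.
def clear_padding(bitmap, bits)
  bytes = bitmap.bytes

  bytes.each_index do |i|
    first_bit = i * 8
    if first_bit >= bits
      # Entire byte lies past the configured size (padding or terminator).
      bytes[i] = 0
    elsif first_bit + 8 > bits
      # Byte straddles the boundary: keep only the low (bits - first_bit) bits.
      bytes[i] &= (1 << (bits - first_bit)) - 1
    end
  end

  bytes.pack("C*")
end

clear_padding("\x00\x00\xF0\xFF".b, 20) # => "\x00\x00\x00\x00"
clear_padding("\xFF\xFF\xFF\xFF".b, 20) # => "\xFF\xFF\x0F\x00"
```

Clearing the padding on load keeps the set-bit count, and any estimate derived from it, from being inflated by bits that no hash can ever address.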
data/test/test_helper.rb
CHANGED
metadata
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
|
2
2
|
name: bloom_fit
|
|
3
3
|
version: !ruby/object:Gem::Version
|
|
4
|
-
version:
|
|
4
|
+
version: 1.1.0
|
|
5
5
|
platform: ruby
|
|
6
6
|
authors:
|
|
7
7
|
- Ryan McGeary
|
|
@@ -24,13 +24,13 @@ extra_rdoc_files: []
|
|
|
24
24
|
files:
|
|
25
25
|
- README.md
|
|
26
26
|
- ext/cbloomfilter/cbloomfilter.c
|
|
27
|
-
- ext/cbloomfilter/crc32.h
|
|
28
27
|
- ext/cbloomfilter/extconf.rb
|
|
28
|
+
- ext/cbloomfilter/salts.h
|
|
29
29
|
- lib/bloom_fit.rb
|
|
30
|
-
- lib/bloom_fit/configuration_mismatch.rb
|
|
31
30
|
- lib/bloom_fit/version.rb
|
|
32
31
|
- lib/cbloomfilter.bundle
|
|
33
32
|
- test/bloom_fit_test.rb
|
|
33
|
+
- test/c_bloom_filter_test.rb
|
|
34
34
|
- test/test_helper.rb
|
|
35
35
|
homepage: https://github.com/rmm5t/bloom_fit
|
|
36
36
|
licenses: []
|
data/ext/cbloomfilter/crc32.h
DELETED
|
@@ -1,76 +0,0 @@
|
|
|
1
|
-
/* simple CRC32 code */
|
|
2
|
-
/*
|
|
3
|
-
* Copyright 2005 Aris Adamantiadis
|
|
4
|
-
*
|
|
5
|
-
* This file is part of the SSH Library
|
|
6
|
-
*
|
|
7
|
-
* The SSH Library is free software; you can redistribute it and/or modify
|
|
8
|
-
* it under the terms of the GNU Lesser General Public License as published by
|
|
9
|
-
* the Free Software Foundation; either version 2.1 of the License, or (at your
|
|
10
|
-
* option) any later version.
|
|
11
|
-
*
|
|
12
|
-
*
|
|
13
|
-
* The SSH Library is distributed in the hope that it will be useful, but
|
|
14
|
-
* WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
|
|
15
|
-
* or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public
|
|
16
|
-
* License for more details.
|
|
17
|
-
*
|
|
18
|
-
* You should have received a copy of the GNU Lesser General Public License
|
|
19
|
-
* along with the SSH Library; see the file COPYING. If not, write to
|
|
20
|
-
* the Free Software Foundation, Inc., 59 Temple Place - Suite 330, Boston,
|
|
21
|
-
* MA 02111-1307, USA. */
|
|
22
|
-
|
|
23
|
-
static unsigned int crc_table[] = {
|
|
24
|
-
0x00000000UL, 0x77073096UL, 0xee0e612cUL, 0x990951baUL, 0x076dc419UL,
|
|
25
|
-
0x706af48fUL, 0xe963a535UL, 0x9e6495a3UL, 0x0edb8832UL, 0x79dcb8a4UL,
|
|
26
|
-
0xe0d5e91eUL, 0x97d2d988UL, 0x09b64c2bUL, 0x7eb17cbdUL, 0xe7b82d07UL,
|
|
27
|
-
0x90bf1d91UL, 0x1db71064UL, 0x6ab020f2UL, 0xf3b97148UL, 0x84be41deUL,
|
|
28
|
-
0x1adad47dUL, 0x6ddde4ebUL, 0xf4d4b551UL, 0x83d385c7UL, 0x136c9856UL,
|
|
29
|
-
0x646ba8c0UL, 0xfd62f97aUL, 0x8a65c9ecUL, 0x14015c4fUL, 0x63066cd9UL,
|
|
30
|
-
0xfa0f3d63UL, 0x8d080df5UL, 0x3b6e20c8UL, 0x4c69105eUL, 0xd56041e4UL,
|
|
31
|
-
0xa2677172UL, 0x3c03e4d1UL, 0x4b04d447UL, 0xd20d85fdUL, 0xa50ab56bUL,
|
|
32
|
-
0x35b5a8faUL, 0x42b2986cUL, 0xdbbbc9d6UL, 0xacbcf940UL, 0x32d86ce3UL,
|
|
33
|
-
0x45df5c75UL, 0xdcd60dcfUL, 0xabd13d59UL, 0x26d930acUL, 0x51de003aUL,
|
|
34
|
-
0xc8d75180UL, 0xbfd06116UL, 0x21b4f4b5UL, 0x56b3c423UL, 0xcfba9599UL,
|
|
35
|
-
0xb8bda50fUL, 0x2802b89eUL, 0x5f058808UL, 0xc60cd9b2UL, 0xb10be924UL,
|
|
36
|
-
0x2f6f7c87UL, 0x58684c11UL, 0xc1611dabUL, 0xb6662d3dUL, 0x76dc4190UL,
|
|
37
|
-
0x01db7106UL, 0x98d220bcUL, 0xefd5102aUL, 0x71b18589UL, 0x06b6b51fUL,
|
|
38
|
-
0x9fbfe4a5UL, 0xe8b8d433UL, 0x7807c9a2UL, 0x0f00f934UL, 0x9609a88eUL,
|
|
39
|
-
0xe10e9818UL, 0x7f6a0dbbUL, 0x086d3d2dUL, 0x91646c97UL, 0xe6635c01UL,
|
|
40
|
-
0x6b6b51f4UL, 0x1c6c6162UL, 0x856530d8UL, 0xf262004eUL, 0x6c0695edUL,
|
|
41
|
-
0x1b01a57bUL, 0x8208f4c1UL, 0xf50fc457UL, 0x65b0d9c6UL, 0x12b7e950UL,
|
|
42
|
-
0x8bbeb8eaUL, 0xfcb9887cUL, 0x62dd1ddfUL, 0x15da2d49UL, 0x8cd37cf3UL,
|
|
43
|
-
0xfbd44c65UL, 0x4db26158UL, 0x3ab551ceUL, 0xa3bc0074UL, 0xd4bb30e2UL,
|
|
44
|
-
0x4adfa541UL, 0x3dd895d7UL, 0xa4d1c46dUL, 0xd3d6f4fbUL, 0x4369e96aUL,
|
|
45
|
-
0x346ed9fcUL, 0xad678846UL, 0xda60b8d0UL, 0x44042d73UL, 0x33031de5UL,
|
|
46
|
-
0xaa0a4c5fUL, 0xdd0d7cc9UL, 0x5005713cUL, 0x270241aaUL, 0xbe0b1010UL,
|
|
47
|
-
0xc90c2086UL, 0x5768b525UL, 0x206f85b3UL, 0xb966d409UL, 0xce61e49fUL,
|
|
48
|
-
0x5edef90eUL, 0x29d9c998UL, 0xb0d09822UL, 0xc7d7a8b4UL, 0x59b33d17UL,
|
|
49
|
-
0x2eb40d81UL, 0xb7bd5c3bUL, 0xc0ba6cadUL, 0xedb88320UL, 0x9abfb3b6UL,
|
|
50
|
-
0x03b6e20cUL, 0x74b1d29aUL, 0xead54739UL, 0x9dd277afUL, 0x04db2615UL,
|
|
51
|
-
0x73dc1683UL, 0xe3630b12UL, 0x94643b84UL, 0x0d6d6a3eUL, 0x7a6a5aa8UL,
|
|
52
|
-
0xe40ecf0bUL, 0x9309ff9dUL, 0x0a00ae27UL, 0x7d079eb1UL, 0xf00f9344UL,
|
|
53
|
-
0x8708a3d2UL, 0x1e01f268UL, 0x6906c2feUL, 0xf762575dUL, 0x806567cbUL,
|
|
54
|
-
0x196c3671UL, 0x6e6b06e7UL, 0xfed41b76UL, 0x89d32be0UL, 0x10da7a5aUL,
|
|
55
|
-
0x67dd4accUL, 0xf9b9df6fUL, 0x8ebeeff9UL, 0x17b7be43UL, 0x60b08ed5UL,
|
|
56
|
-
0xd6d6a3e8UL, 0xa1d1937eUL, 0x38d8c2c4UL, 0x4fdff252UL, 0xd1bb67f1UL,
|
|
57
|
-
0xa6bc5767UL, 0x3fb506ddUL, 0x48b2364bUL, 0xd80d2bdaUL, 0xaf0a1b4cUL,
|
|
58
|
-
0x36034af6UL, 0x41047a60UL, 0xdf60efc3UL, 0xa867df55UL, 0x316e8eefUL,
|
|
59
|
-
0x4669be79UL, 0xcb61b38cUL, 0xbc66831aUL, 0x256fd2a0UL, 0x5268e236UL,
|
|
60
|
-
0xcc0c7795UL, 0xbb0b4703UL, 0x220216b9UL, 0x5505262fUL, 0xc5ba3bbeUL,
|
|
61
|
-
0xb2bd0b28UL, 0x2bb45a92UL, 0x5cb36a04UL, 0xc2d7ffa7UL, 0xb5d0cf31UL,
|
|
62
|
-
0x2cd99e8bUL, 0x5bdeae1dUL, 0x9b64c2b0UL, 0xec63f226UL, 0x756aa39cUL,
|
|
63
|
-
0x026d930aUL, 0x9c0906a9UL, 0xeb0e363fUL, 0x72076785UL, 0x05005713UL,
|
|
64
|
-
0x95bf4a82UL, 0xe2b87a14UL, 0x7bb12baeUL, 0x0cb61b38UL, 0x92d28e9bUL,
|
|
65
|
-
0xe5d5be0dUL, 0x7cdcefb7UL, 0x0bdbdf21UL, 0x86d3d2d4UL, 0xf1d4e242UL,
|
|
66
|
-
0x68ddb3f8UL, 0x1fda836eUL, 0x81be16cdUL, 0xf6b9265bUL, 0x6fb077e1UL,
|
|
67
|
-
0x18b74777UL, 0x88085ae6UL, 0xff0f6a70UL, 0x66063bcaUL, 0x11010b5cUL,
|
|
68
|
-
0x8f659effUL, 0xf862ae69UL, 0x616bffd3UL, 0x166ccf45UL, 0xa00ae278UL,
|
|
69
|
-
0xd70dd2eeUL, 0x4e048354UL, 0x3903b3c2UL, 0xa7672661UL, 0xd06016f7UL,
|
|
70
|
-
0x4969474dUL, 0x3e6e77dbUL, 0xaed16a4aUL, 0xd9d65adcUL, 0x40df0b66UL,
|
|
71
|
-
0x37d83bf0UL, 0xa9bcae53UL, 0xdebb9ec5UL, 0x47b2cf7fUL, 0x30b5ffe9UL,
|
|
72
|
-
0xbdbdf21cUL, 0xcabac28aUL, 0x53b39330UL, 0x24b4a3a6UL, 0xbad03605UL,
|
|
73
|
-
0xcdd70693UL, 0x54de5729UL, 0x23d967bfUL, 0xb3667a2eUL, 0xc4614ab8UL,
|
|
74
|
-
0x5d681b02UL, 0x2a6f2b94UL, 0xb40bbe37UL, 0xc30c8ea1UL, 0x5a05df1bUL,
|
|
75
|
-
0x2d02ef8dUL
|
|
76
|
-
};
|