bloomer 0.0.5 → 1.0.0

@@ -0,0 +1,7 @@
+ ---
+ SHA256:
+ metadata.gz: 683b2bd1b28f30606dd8f96ab85ea18a98e1a198d38b10f823563ba6893d1454
+ data.tar.gz: 8f26b8793a08d01b401d727fc0371990f948d2e69d01cdcb1d2e1dc83e742830
+ SHA512:
+ metadata.gz: 53af08a56920e8d6e146a44032151facc1e507c3bb9b21e0a9a203c60339039b2e0e9241a79cdafcb1d6f90ef326e8202252163af811e70373c30608ba3fa2f2
+ data.tar.gz: ce375fc18e7531cb2b6455df625034cf0881db9294a41945a9ef1789f2aa649ae7d8037a2205bc32e34113be765bc7f3e71c7b3be06a1cd7e47cb1553552c7ac
@@ -1,7 +1,7 @@
  language: ruby
  script: bundle exec rake
  rvm:
- - 1.8.7
- - 1.9.3
- - jruby-18mode # JRuby in 1.8 mode
- - jruby-19mode # JRuby in 1.9 mode
+ - 2.5.1
+ - 2.4.4
+ - jruby-9.1.17.0
+ - jruby-9.2.0.0
data/README.md CHANGED
@@ -1,32 +1,36 @@
  # Bloomer: Bloom filters with elastic
  
- [![Build Status](https://secure.travis-ci.org/mceachen/bloomer.png)](http://travis-ci.org/mceachen/bloomer)
+ [![Gem Version](https://badge.fury.io/rb/bloomer.svg)](https://badge.fury.io/rb/bloomer)
+ [![Build Status](https://secure.travis-ci.org/mceachen/bloomer.svg)](http://travis-ci.org/mceachen/bloomer)
  
- [Bloom filters](http://en.wikipedia.org/wiki/Bloom_filter) are great for quickly checking to see if
- a given string has been seen before--in constant time, and using a fixed amount of RAM, as long
- as you know the expected number of elements up front. If you add more than ```capacity``` elements to the filter,
- accuracy for ```include?``` will drop below ```false_positive_probability```.
+ [Bloom filters](http://en.wikipedia.org/wiki/Bloom_filter) are great for quickly
+ checking to see if a given string has been seen before--in constant time, and
+ using a fixed amount of RAM, as long as you know the expected number of elements
+ up front. If you add more than `capacity` elements to the filter, accuracy for
+ `include?` will drop below `false_positive_probability`.
  
- [Scalable Bloom Filters](http://gsd.di.uminho.pt/members/cbm/ps/dbloom.pdf) maintain a maximal ```false_positive_probability```
- by using additional RAM as needed.
+ [Scalable Bloom Filters](http://gsd.di.uminho.pt/members/cbm/ps/dbloom.pdf)
+ maintain a maximal `false_positive_probability` by using additional RAM as
+ needed.
  
- ```Bloomer``` is a Bloom Filter. ```Bloomer::Scalable``` is a Scalable Bloom Filter.
+ `Bloomer` is a Bloom Filter. `Bloomer::Scalable` is a Scalable Bloom Filter.
  
- Keep in mind that **false positives with Bloom filters are expected**, with a specified probability rate.
- False negatives, however, are not. In other words,
+ Keep in mind that **false positives with Bloom filters are expected**, with a
+ specified probability rate. False negatives, however, are not. In other words,
  
- * if ```include?``` returns *false*, that string has *certainly not* been ```add```ed
- * if ```include?``` returns *true*, it *might* mean that string was ```add```ed (depending on the
- ```false_positive_probability``` parameter provided to the constructor).
+ - if `include?` returns _false_, that string has _certainly not_ been `add`ed
+ - if `include?` returns _true_, it _might_ mean that string was `add`ed
+ (depending on the `false_positive_probability` parameter provided to the
+ constructor).
  
  This implementation is unique in that Bloomer
  
- * supports scalable bloom filters (SBF)
- * uses triple hash chains (see [the paper](http://www.ccs.neu.edu/home/pete/pub/bloom-filters-verification.pdf))
- * can marshal state quickly
- * has rigorous tests
- * is pure ruby
- * does not require EM or Redis or something else unrelated to simply implementing a bloom filter
+ - supports scalable bloom filters (SBF)
+ - uses triple hash chains (see [the paper](http://www.ccs.neu.edu/home/pete/pub/bloom-filters-verification.pdf))
+ - can marshal state quickly
+ - has rigorous tests
+ - is pure ruby
+ - does not require EM or Redis or something else unrelated to simply implementing a bloom filter
  
  ## Usage
  
@@ -52,30 +56,57 @@ bf.include? "badda"
  #=> false
  ```
  
- Serialization is through [Marshal](http://ruby-doc.org/core-1.8.7/Marshal.html):
+ Serialization can be done using
+ [MessagePack](https://github.com/msgpack/msgpack-ruby):
+
+ Notice, you'll need to require `bloomer/msgpackable` to enable serialization.
  
  ```ruby
+ require 'bloomer/msgpackable'
  b = Bloomer.new(10)
  b.add("a")
- s = Marshal.dump(b)
- new_b = Marshal.load(s)
+ s = b.to_msgpack
+ new_b = Bloomer.from_msgpack(s)
  new_b.include? "a"
  #=> true
  ```
  
+ The original class will be preserved regardless of calling
+ `Bloomer.from_msgpack(s)` or `Bloomer::Scalable.from_msgpack(s)`:
+
+ ```ruby
+ require 'bloomer/msgpackable'
+ b = Bloomer::Scalable.new
+ b.add("a")
+ s = b.to_msgpack
+ new_b = Bloomer.from_msgpack(s)
+ new_b.class == Bloomer::Scalable
+ #=> true
+ ```
+
  ## Changelog
  
+ ### 1.0.0
+
+ - Using msgpack for more secure deserialization. Marshal.load still works but is
+ not recommended
+
  ### 0.0.5
- * Switched from rspec to minitest
+
+ - Switched from rspec to minitest
  
  ### 0.0.4
- * Fixed gem packaging
+
+ - Fixed gem packaging
  
  ### 0.0.3
- * Added support for scalable bloom filters (SBF)
+
+ - Added support for scalable bloom filters (SBF)
  
  ### 0.0.2
- * Switch to triple-hash chaining (simpler, faster, and better false-positive rate)
+
+ - Switch to triple-hash chaining (simpler, faster, and better false-positive rate)
  
  ### 0.0.1
- * Bloom, there it is.
+
+ - Bloom, there it is.
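
The README text in this diff describes the Bloom filter contract: `include?` may return false positives at a configured rate, but never false negatives. The sketch below illustrates that contract in pure Ruby. `TinyBloom` is a hypothetical class, not the gem's implementation: it uses the textbook sizing formulas and double hashing over a SHA-256 digest, whereas the README says Bloomer itself uses triple hash chains.

```ruby
require "digest"

# Illustrative only: TinyBloom is a hypothetical class, not part of bloomer.
class TinyBloom
  attr_reader :m, :k

  def initialize(capacity, false_positive_probability = 0.001)
    # Textbook sizing: m = -n * ln(p) / (ln 2)^2 bits, k = (m / n) * ln 2 hashes.
    @m = (-capacity * Math.log(false_positive_probability) / (Math.log(2)**2)).ceil
    @k = [(@m.to_f / capacity * Math.log(2)).round, 1].max
    @bits = Array.new(@m, false)
  end

  # Sets the k bits for the string. Returns true if at least one bit was
  # previously unset, i.e. the string had certainly not been added before.
  def add(string)
    idx = indexes(string)
    fresh = idx.any? { |i| !@bits[i] }
    idx.each { |i| @bits[i] = true }
    fresh
  end

  # True if all k bits are set: the string was "possibly added".
  # False means the string was certainly never added.
  def include?(string)
    indexes(string).all? { |i| @bits[i] }
  end

  private

  # Double hashing (an assumption of this sketch): derive k bit positions
  # from two 64-bit words taken from a single SHA-256 digest.
  def indexes(string)
    h1, h2 = Digest::SHA256.digest(string).unpack("Q>Q>")
    Array.new(@k) { |i| (h1 + i * h2) % @m }
  end
end

bf = TinyBloom.new(10, 0.001)
bf.add("a")
bf.include?("a") # always true once "a" has been added
```

Because `add` only ever sets bits, a string that was added can never later test as absent; that is where the "no false negatives" guarantee comes from.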
@@ -20,6 +20,7 @@ Gem::Specification.new do |s|
    s.require_paths = ["lib"]
  
    s.add_dependency "bitarray"
+   s.add_dependency "msgpack"
    s.add_development_dependency "rake"
    s.add_development_dependency "yard"
    s.add_development_dependency "minitest"
@@ -0,0 +1,65 @@
+ require "msgpack"
+
+ module Msgpackable
+   def self.included(base)
+     base.extend(ClassMethods)
+   end
+
+   def to_msgpack
+     self.class.msgpack_factory.dump self
+   end
+
+   module ClassMethods
+     def from_msgpack(data)
+       msgpack_factory.load(data)
+     end
+
+     def msgpack_factory
+       @msgpack_factory ||= ::MessagePack::Factory.new.tap do |factory|
+         factory.register_type(0x01, ::Bloomer)
+         factory.register_type(0x02, ::Bloomer::Scalable)
+         factory.freeze
+       end
+     end
+   end
+ end
+
+ # Patch Bloomer and Scalable to make them msgpackable
+ class Bloomer
+   include Msgpackable
+
+   def to_msgpack_ext
+     self.class.msgpack_factory.dump([@capacity, @count, @k, @ba.size, @ba.field])
+   end
+
+   def from_msgpack_ext(capacity, count, k, ba_size, ba_field)
+     @capacity, @count, @k = capacity, count, k
+     @ba = BitArray.new(ba_size, ba_field)
+   end
+
+   def self.from_msgpack_ext(data)
+     values = msgpack_factory.load(data)
+     ::Bloomer.new(values[1]).tap do |b|
+       b.from_msgpack_ext(*values)
+     end
+   end
+
+   class Scalable
+     include Msgpackable
+
+     def to_msgpack_ext
+       self.class.msgpack_factory.dump([@false_positive_probability, @bloomers])
+     end
+
+     def from_msgpack_ext(false_positive_probability, bloomers)
+       @false_positive_probability, @bloomers = false_positive_probability, bloomers
+     end
+
+     def self.from_msgpack_ext(data)
+       false_positive_probability, bloomers = msgpack_factory.load(data)
+       ::Bloomer::Scalable.new.tap do |b|
+         b.from_msgpack_ext(false_positive_probability, bloomers)
+       end
+     end
+   end
+ end
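
The hunk above serializes each filter as a plain array of state under a registered type code, and the factory only rebuilds classes that were registered. A minimal sketch of that pattern follows, under stated assumptions: it uses JSON from the standard library instead of MessagePack so it runs without extra gems, and `PlainSerializable`, `SimpleFilter`, and `GrowingFilter` are hypothetical names that do not exist in bloomer.

```ruby
require "json"

# Hypothetical sketch of tag-based serialization; not bloomer's API.
module PlainSerializable
  REGISTRY = {} # type tag => class; only registered classes can be rebuilt

  def self.register(tag, klass)
    REGISTRY[tag] = klass
  end

  # Rebuilds an object from its tag and plain state. Unknown tags raise
  # KeyError instead of instantiating an arbitrary class.
  def self.load(json)
    tag, state = JSON.parse(json)
    REGISTRY.fetch(tag).from_state(state)
  end

  # Serializes as [tag, state], where state holds only plain data.
  def dump
    JSON.generate([self.class.tag, to_state])
  end
end

class SimpleFilter
  include PlainSerializable
  attr_reader :capacity, :items

  def self.tag
    1
  end

  def self.from_state(state)
    capacity, items = state
    new(capacity).tap { |f| items.each { |item| f.add(item) } }
  end

  def initialize(capacity)
    @capacity = capacity
    @items = []
  end

  def add(item)
    @items << item unless @items.include?(item)
  end

  def to_state
    [@capacity, @items]
  end
end

class GrowingFilter < SimpleFilter
  def self.tag
    2
  end
end

PlainSerializable.register(SimpleFilter.tag, SimpleFilter)
PlainSerializable.register(GrowingFilter.tag, GrowingFilter)

g = GrowingFilter.new(10)
g.add("a")
copy = PlainSerializable.load(g.dump)
copy.class # the tag, not the caller, decides which class is rebuilt
```

The contrast with `Marshal.load` is that the payload can only name a tag from a fixed registry, so untrusted input cannot ask for an arbitrary class to be instantiated. This matches the changelog's stated motivation of more secure deserialization.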
@@ -1,3 +1,3 @@
  class Bloomer
-   VERSION = "0.0.5"
+   VERSION = "1.0.0"
  end
@@ -1,6 +1,7 @@
1
  require "test_helper"
  
  C = ('a'..'z').to_a
+
  def rand_word(length = 8)
    C.shuffle.first(length).join # not random enough to cause hits.
  end
@@ -43,6 +44,20 @@ def test_marshal_state(b)
    inputs.each { |ea| new_b.must_include(ea) }
  end
  
+ def test_msgpackable(b)
+   require "bloomer/msgpackable"
+   inputs = b.capacity.times.collect { rand_word }
+   inputs.each { |ea| b.add(ea) }
+   packed = b.to_msgpack
+   new_b = b.class.from_msgpack(packed)
+   new_b.count.must_equal b.count
+   new_b.capacity.must_equal b.capacity
+   inputs.each { |ea| new_b.must_include(ea) }
+   dump = Marshal.dump(b)
+   packed.size.must_be :<, dump.size
+   b.class.must_equal new_b.class
+ end
+
  def test_simple(b)
    b.add("a").must_equal true
    b.add("a").must_equal false
@@ -68,6 +83,11 @@ describe Bloomer do
      test_marshal_state(b)
    end
  
+   it "serializes and deserializes correctly" do
+     b = Bloomer.new(10, 0.001)
+     test_msgpackable(b)
+   end
+
    it "results in similar-to-expected false positives" do
      max_false_prob = 0.001
      size = 50_000
@@ -88,6 +108,12 @@ describe Bloomer::Scalable do
      test_marshal_state(b)
    end
  
+   it "serializes and deserializes correctly" do
+     b = Bloomer::Scalable.new(10, 0.001)
+     100.times.each { b.add(rand_word) }
+     test_msgpackable(b)
+   end
+
    it "results in similar-to-expected false positives" do
      max_false_prob = 0.001
      size = 10_000
metadata CHANGED
@@ -1,94 +1,97 @@
  --- !ruby/object:Gem::Specification
  name: bloomer
  version: !ruby/object:Gem::Version
- version: 0.0.5
- prerelease:
+ version: 1.0.0
  platform: ruby
  authors:
  - Matthew McEachen
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2012-04-25 00:00:00.000000000 Z
+ date: 2018-09-13 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: bitarray
  requirement: !ruby/object:Gem::Requirement
- none: false
  requirements:
- - - ! '>='
+ - - ">="
  - !ruby/object:Gem::Version
  version: '0'
  type: :runtime
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
- none: false
  requirements:
- - - ! '>='
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: '0'
+ - !ruby/object:Gem::Dependency
+ name: msgpack
+ requirement: !ruby/object:Gem::Requirement
+ requirements:
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: '0'
+ type: :runtime
+ prerelease: false
+ version_requirements: !ruby/object:Gem::Requirement
+ requirements:
+ - - ">="
  - !ruby/object:Gem::Version
  version: '0'
  - !ruby/object:Gem::Dependency
  name: rake
  requirement: !ruby/object:Gem::Requirement
- none: false
  requirements:
- - - ! '>='
+ - - ">="
  - !ruby/object:Gem::Version
  version: '0'
  type: :development
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
- none: false
  requirements:
- - - ! '>='
+ - - ">="
  - !ruby/object:Gem::Version
  version: '0'
  - !ruby/object:Gem::Dependency
  name: yard
  requirement: !ruby/object:Gem::Requirement
- none: false
  requirements:
- - - ! '>='
+ - - ">="
  - !ruby/object:Gem::Version
  version: '0'
  type: :development
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
- none: false
  requirements:
- - - ! '>='
+ - - ">="
  - !ruby/object:Gem::Version
  version: '0'
  - !ruby/object:Gem::Dependency
  name: minitest
  requirement: !ruby/object:Gem::Requirement
- none: false
  requirements:
- - - ! '>='
+ - - ">="
  - !ruby/object:Gem::Version
  version: '0'
  type: :development
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
- none: false
  requirements:
- - - ! '>='
+ - - ">="
  - !ruby/object:Gem::Version
  version: '0'
  - !ruby/object:Gem::Dependency
  name: minitest-reporters
  requirement: !ruby/object:Gem::Requirement
- none: false
  requirements:
- - - ! '>='
+ - - ">="
  - !ruby/object:Gem::Version
  version: '0'
  type: :development
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
- none: false
  requirements:
- - - ! '>='
+ - - ">="
  - !ruby/object:Gem::Version
  version: '0'
  description: Bloom filters and Scalable Bloom filters (SBF) in pure ruby
@@ -98,48 +101,39 @@ executables: []
  extensions: []
  extra_rdoc_files: []
  files:
- - .gitignore
- - .travis.yml
+ - ".gitignore"
+ - ".travis.yml"
  - Gemfile
  - MIT-LICENSE
  - README.md
  - Rakefile
  - bloomer.gemspec
  - lib/bloomer.rb
+ - lib/bloomer/msgpackable.rb
  - lib/bloomer/version.rb
  - test/bloomer_test.rb
  - test/test_helper.rb
  homepage: https://github.com/mceachen/bloomer
  licenses: []
+ metadata: {}
  post_install_message:
  rdoc_options: []
  require_paths:
  - lib
  required_ruby_version: !ruby/object:Gem::Requirement
- none: false
  requirements:
- - - ! '>='
+ - - ">="
  - !ruby/object:Gem::Version
  version: '0'
- segments:
- - 0
- hash: 2624379326334183946
  required_rubygems_version: !ruby/object:Gem::Requirement
- none: false
  requirements:
- - - ! '>='
+ - - ">="
  - !ruby/object:Gem::Version
  version: '0'
- segments:
- - 0
- hash: 2624379326334183946
  requirements: []
  rubyforge_project: bloomer
- rubygems_version: 1.8.21
+ rubygems_version: 2.7.7
  signing_key:
- specification_version: 3
+ specification_version: 4
  summary: Bloom filters and Scalable Bloom filters (SBF) in pure ruby
- test_files:
- - test/bloomer_test.rb
- - test/test_helper.rb
- has_rdoc:
+ test_files: []