bloomer 0.0.5 → 1.0.0

Sign up to get free protection for your applications and to get access to all the features.
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA256:
3
+ metadata.gz: 683b2bd1b28f30606dd8f96ab85ea18a98e1a198d38b10f823563ba6893d1454
4
+ data.tar.gz: 8f26b8793a08d01b401d727fc0371990f948d2e69d01cdcb1d2e1dc83e742830
5
+ SHA512:
6
+ metadata.gz: 53af08a56920e8d6e146a44032151facc1e507c3bb9b21e0a9a203c60339039b2e0e9241a79cdafcb1d6f90ef326e8202252163af811e70373c30608ba3fa2f2
7
+ data.tar.gz: ce375fc18e7531cb2b6455df625034cf0881db9294a41945a9ef1789f2aa649ae7d8037a2205bc32e34113be765bc7f3e71c7b3be06a1cd7e47cb1553552c7ac
@@ -1,7 +1,7 @@
1
1
  language: ruby
2
2
  script: bundle exec rake
3
3
  rvm:
4
- - 1.8.7
5
- - 1.9.3
6
- - jruby-18mode # JRuby in 1.8 mode
7
- - jruby-19mode # JRuby in 1.9 mode
4
+ - 2.5.1
5
+ - 2.4.4
6
+ - jruby-9.1.17.0
7
+ - jruby-9.2.0.0
data/README.md CHANGED
@@ -1,32 +1,36 @@
1
1
  # Bloomer: Bloom filters with elastic
2
2
 
3
- [![Build Status](https://secure.travis-ci.org/mceachen/bloomer.png)](http://travis-ci.org/mceachen/bloomer)
3
+ [![Gem Version](https://badge.fury.io/rb/bloomer.svg)](https://badge.fury.io/rb/bloomer)
4
+ [![Build Status](https://secure.travis-ci.org/mceachen/bloomer.svg)](http://travis-ci.org/mceachen/bloomer)
4
5
 
5
- [Bloom filters](http://en.wikipedia.org/wiki/Bloom_filter) are great for quickly checking to see if
6
- a given string has been seen before--in constant time, and using a fixed amount of RAM, as long
7
- as you know the expected number of elements up front. If you add more than ```capacity``` elements to the filter,
8
- accuracy for ```include?``` will drop below ```false_positive_probability```.
6
+ [Bloom filters](http://en.wikipedia.org/wiki/Bloom_filter) are great for quickly
7
+ checking to see if a given string has been seen before--in constant time, and
8
+ using a fixed amount of RAM, as long as you know the expected number of elements
9
+ up front. If you add more than `capacity` elements to the filter, accuracy for
10
+ `include?` will drop below `false_positive_probability`.
9
11
 
10
- [Scalable Bloom Filters](http://gsd.di.uminho.pt/members/cbm/ps/dbloom.pdf) maintain a maximal ```false_positive_probability```
11
- by using additional RAM as needed.
12
+ [Scalable Bloom Filters](http://gsd.di.uminho.pt/members/cbm/ps/dbloom.pdf)
13
+ maintain a maximal `false_positive_probability` by using additional RAM as
14
+ needed.
12
15
 
13
- ```Bloomer``` is a Bloom Filter. ```Bloomer::Scalable``` is a Scalable Bloom Filter.
16
+ `Bloomer` is a Bloom Filter. `Bloomer::Scalable` is a Scalable Bloom Filter.
14
17
 
15
- Keep in mind that **false positives with Bloom filters are expected**, with a specified probability rate.
16
- False negatives, however, are not. In other words,
18
+ Keep in mind that **false positives with Bloom filters are expected**, with a
19
+ specified probability rate. False negatives, however, are not. In other words,
17
20
 
18
- * if ```include?``` returns *false*, that string has *certainly not* been ```add```ed
19
- * if ```include?``` returns *true*, it *might* mean that string was ```add```ed (depending on the
20
- ```false_positive_probability``` parameter provided to the constructor).
21
+ - if `include?` returns _false_, that string has _certainly not_ been `add`ed
22
+ - if `include?` returns _true_, it _might_ mean that string was `add`ed
23
+ (depending on the `false_positive_probability` parameter provided to the
24
+ constructor).
21
25
 
22
26
  This implementation is unique in that Bloomer
23
27
 
24
- * supports scalable bloom filters (SBF)
25
- * uses triple hash chains (see [the paper](http://www.ccs.neu.edu/home/pete/pub/bloom-filters-verification.pdf))
26
- * can marshal state quickly
27
- * has rigorous tests
28
- * is pure ruby
29
- * does not require EM or Redis or something else unrelated to simply implementing a bloom filter
28
+ - supports scalable bloom filters (SBF)
29
+ - uses triple hash chains (see [the paper](http://www.ccs.neu.edu/home/pete/pub/bloom-filters-verification.pdf))
30
+ - can marshal state quickly
31
+ - has rigorous tests
32
+ - is pure ruby
33
+ - does not require EM or Redis or something else unrelated to simply implementing a bloom filter
30
34
 
31
35
  ## Usage
32
36
 
@@ -52,30 +56,57 @@ bf.include? "badda"
52
56
  #=> false
53
57
  ```
54
58
 
55
- Serialization is through [Marshal](http://ruby-doc.org/core-1.8.7/Marshal.html):
59
+ Serialization can be done using
60
+ [MessagePack](https://github.com/msgpack/msgpack-ruby):
61
+
62
+ Notice, you'll need to require `bloomer/msgpackable` to enable serialization.
56
63
 
57
64
  ```ruby
65
+ require 'bloomer/msgpackable'
58
66
  b = Bloomer.new(10)
59
67
  b.add("a")
60
- s = Marshal.dump(b)
61
- new_b = Marshal.load(s)
68
+ s = b.to_msgpack
69
+ new_b = Bloomer.from_msgpack(s)
62
70
  new_b.include? "a"
63
71
  #=> true
64
72
  ```
65
73
 
74
+ The original class will be preserved regardless of calling
75
+ `Bloomer.from_msgpack(s)` or `Bloomer::Scalable.from_msgpack(s)`:
76
+
77
+ ```ruby
78
+ require 'bloomer/msgpackable'
79
+ b = Bloomer::Scalable.new
80
+ b.add("a")
81
+ s = b.to_msgpack
82
+ new_b = Bloomer.from_msgpack(s)
83
+ new_b.class == Bloomer::Scalable
84
+ #=> true
85
+ ```
86
+
66
87
  ## Changelog
67
88
 
89
+ ### 1.0.0
90
+
91
+ - Using msgpack for more secure deserialization. Marshal.load still works but is
92
+ not recommended
93
+
68
94
  ### 0.0.5
69
- * Switched from rspec to minitest
95
+
96
+ - Switched from rspec to minitest
70
97
 
71
98
  ### 0.0.4
72
- * Fixed gem packaging
99
+
100
+ - Fixed gem packaging
73
101
 
74
102
  ### 0.0.3
75
- * Added support for scalable bloom filters (SBF)
103
+
104
+ - Added support for scalable bloom filters (SBF)
76
105
 
77
106
  ### 0.0.2
78
- * Switch to triple-hash chaining (simpler, faster, and better false-positive rate)
107
+
108
+ - Switch to triple-hash chaining (simpler, faster, and better false-positive rate)
79
109
 
80
110
  ### 0.0.1
81
- * Bloom, there it is.
111
+
112
+ - Bloom, there it is.
@@ -20,6 +20,7 @@ Gem::Specification.new do |s|
20
20
  s.require_paths = ["lib"]
21
21
 
22
22
  s.add_dependency "bitarray"
23
+ s.add_dependency "msgpack"
23
24
  s.add_development_dependency "rake"
24
25
  s.add_development_dependency "yard"
25
26
  s.add_development_dependency "minitest"
@@ -0,0 +1,65 @@
1
+ require "msgpack"
2
+
3
+ module Msgpackable
4
+ def self.included(base)
5
+ base.extend(ClassMethods)
6
+ end
7
+
8
+ def to_msgpack
9
+ self.class.msgpack_factory.dump self
10
+ end
11
+
12
+ module ClassMethods
13
+ def from_msgpack(data)
14
+ msgpack_factory.load(data)
15
+ end
16
+
17
+ def msgpack_factory
18
+ @msgpack_factory ||= ::MessagePack::Factory.new.tap do |factory|
19
+ factory.register_type(0x01, ::Bloomer)
20
+ factory.register_type(0x02, ::Bloomer::Scalable)
21
+ factory.freeze
22
+ end
23
+ end
24
+ end
25
+ end
26
+
27
+ # Patch Bloomer and Scalable to make them msgpackable
28
+ class Bloomer
29
+ include Msgpackable
30
+
31
+ def to_msgpack_ext
32
+ self.class.msgpack_factory.dump([@capacity, @count, @k, @ba.size, @ba.field])
33
+ end
34
+
35
+ def from_msgpack_ext(capacity, count, k, ba_size, ba_field)
36
+ @capacity, @count, @k = capacity, count, k
37
+ @ba = BitArray.new(ba_size, ba_field)
38
+ end
39
+
40
+ def self.from_msgpack_ext(data)
41
+ values = msgpack_factory.load(data)
42
+ ::Bloomer.new(values[1]).tap do |b|
43
+ b.from_msgpack_ext(*values)
44
+ end
45
+ end
46
+
47
+ class Scalable
48
+ include Msgpackable
49
+
50
+ def to_msgpack_ext
51
+ self.class.msgpack_factory.dump([@false_positive_probability, @bloomers])
52
+ end
53
+
54
+ def from_msgpack_ext(false_positive_probability, bloomers)
55
+ @false_positive_probability, @bloomers = false_positive_probability, bloomers
56
+ end
57
+
58
+ def self.from_msgpack_ext(data)
59
+ false_positive_probability, bloomers = msgpack_factory.load(data)
60
+ ::Bloomer::Scalable.new.tap do |b|
61
+ b.from_msgpack_ext(false_positive_probability, bloomers)
62
+ end
63
+ end
64
+ end
65
+ end
@@ -1,3 +1,3 @@
1
1
  class Bloomer
2
- VERSION = "0.0.5"
2
+ VERSION = "1.0.0"
3
3
  end
@@ -1,6 +1,7 @@
1
1
  require "test_helper"
2
2
 
3
3
  C = ('a'..'z').to_a
4
+
4
5
  def rand_word(length = 8)
5
6
  C.shuffle.first(length).join # not random enough to cause hits.
6
7
  end
@@ -43,6 +44,20 @@ def test_marshal_state(b)
43
44
  inputs.each { |ea| new_b.must_include(ea) }
44
45
  end
45
46
 
47
+ def test_msgpackable(b)
48
+ require "bloomer/msgpackable"
49
+ inputs = b.capacity.times.collect { rand_word }
50
+ inputs.each { |ea| b.add(ea) }
51
+ packed = b.to_msgpack
52
+ new_b = b.class.from_msgpack(packed)
53
+ new_b.count.must_equal b.count
54
+ new_b.capacity.must_equal b.capacity
55
+ inputs.each { |ea| new_b.must_include(ea) }
56
+ dump = Marshal.dump(b)
57
+ packed.size.must_be :<, dump.size
58
+ b.class.must_equal new_b.class
59
+ end
60
+
46
61
  def test_simple(b)
47
62
  b.add("a").must_equal true
48
63
  b.add("a").must_equal false
@@ -68,6 +83,11 @@ describe Bloomer do
68
83
  test_marshal_state(b)
69
84
  end
70
85
 
86
+ it "serializes and deserializes correctly" do
87
+ b = Bloomer.new(10, 0.001)
88
+ test_msgpackable(b)
89
+ end
90
+
71
91
  it "results in similar-to-expected false positives" do
72
92
  max_false_prob = 0.001
73
93
  size = 50_000
@@ -88,6 +108,12 @@ describe Bloomer::Scalable do
88
108
  test_marshal_state(b)
89
109
  end
90
110
 
111
+ it "serializes and deserializes correctly" do
112
+ b = Bloomer::Scalable.new(10, 0.001)
113
+ 100.times.each { b.add(rand_word) }
114
+ test_msgpackable(b)
115
+ end
116
+
91
117
  it "results in similar-to-expected false positives" do
92
118
  max_false_prob = 0.001
93
119
  size = 10_000
metadata CHANGED
@@ -1,94 +1,97 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: bloomer
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.0.5
5
- prerelease:
4
+ version: 1.0.0
6
5
  platform: ruby
7
6
  authors:
8
7
  - Matthew McEachen
9
8
  autorequire:
10
9
  bindir: bin
11
10
  cert_chain: []
12
- date: 2012-04-25 00:00:00.000000000 Z
11
+ date: 2018-09-13 00:00:00.000000000 Z
13
12
  dependencies:
14
13
  - !ruby/object:Gem::Dependency
15
14
  name: bitarray
16
15
  requirement: !ruby/object:Gem::Requirement
17
- none: false
18
16
  requirements:
19
- - - ! '>='
17
+ - - ">="
20
18
  - !ruby/object:Gem::Version
21
19
  version: '0'
22
20
  type: :runtime
23
21
  prerelease: false
24
22
  version_requirements: !ruby/object:Gem::Requirement
25
- none: false
26
23
  requirements:
27
- - - ! '>='
24
+ - - ">="
25
+ - !ruby/object:Gem::Version
26
+ version: '0'
27
+ - !ruby/object:Gem::Dependency
28
+ name: msgpack
29
+ requirement: !ruby/object:Gem::Requirement
30
+ requirements:
31
+ - - ">="
32
+ - !ruby/object:Gem::Version
33
+ version: '0'
34
+ type: :runtime
35
+ prerelease: false
36
+ version_requirements: !ruby/object:Gem::Requirement
37
+ requirements:
38
+ - - ">="
28
39
  - !ruby/object:Gem::Version
29
40
  version: '0'
30
41
  - !ruby/object:Gem::Dependency
31
42
  name: rake
32
43
  requirement: !ruby/object:Gem::Requirement
33
- none: false
34
44
  requirements:
35
- - - ! '>='
45
+ - - ">="
36
46
  - !ruby/object:Gem::Version
37
47
  version: '0'
38
48
  type: :development
39
49
  prerelease: false
40
50
  version_requirements: !ruby/object:Gem::Requirement
41
- none: false
42
51
  requirements:
43
- - - ! '>='
52
+ - - ">="
44
53
  - !ruby/object:Gem::Version
45
54
  version: '0'
46
55
  - !ruby/object:Gem::Dependency
47
56
  name: yard
48
57
  requirement: !ruby/object:Gem::Requirement
49
- none: false
50
58
  requirements:
51
- - - ! '>='
59
+ - - ">="
52
60
  - !ruby/object:Gem::Version
53
61
  version: '0'
54
62
  type: :development
55
63
  prerelease: false
56
64
  version_requirements: !ruby/object:Gem::Requirement
57
- none: false
58
65
  requirements:
59
- - - ! '>='
66
+ - - ">="
60
67
  - !ruby/object:Gem::Version
61
68
  version: '0'
62
69
  - !ruby/object:Gem::Dependency
63
70
  name: minitest
64
71
  requirement: !ruby/object:Gem::Requirement
65
- none: false
66
72
  requirements:
67
- - - ! '>='
73
+ - - ">="
68
74
  - !ruby/object:Gem::Version
69
75
  version: '0'
70
76
  type: :development
71
77
  prerelease: false
72
78
  version_requirements: !ruby/object:Gem::Requirement
73
- none: false
74
79
  requirements:
75
- - - ! '>='
80
+ - - ">="
76
81
  - !ruby/object:Gem::Version
77
82
  version: '0'
78
83
  - !ruby/object:Gem::Dependency
79
84
  name: minitest-reporters
80
85
  requirement: !ruby/object:Gem::Requirement
81
- none: false
82
86
  requirements:
83
- - - ! '>='
87
+ - - ">="
84
88
  - !ruby/object:Gem::Version
85
89
  version: '0'
86
90
  type: :development
87
91
  prerelease: false
88
92
  version_requirements: !ruby/object:Gem::Requirement
89
- none: false
90
93
  requirements:
91
- - - ! '>='
94
+ - - ">="
92
95
  - !ruby/object:Gem::Version
93
96
  version: '0'
94
97
  description: Bloom filters and Scalable Bloom filters (SBF) in pure ruby
@@ -98,48 +101,39 @@ executables: []
98
101
  extensions: []
99
102
  extra_rdoc_files: []
100
103
  files:
101
- - .gitignore
102
- - .travis.yml
104
+ - ".gitignore"
105
+ - ".travis.yml"
103
106
  - Gemfile
104
107
  - MIT-LICENSE
105
108
  - README.md
106
109
  - Rakefile
107
110
  - bloomer.gemspec
108
111
  - lib/bloomer.rb
112
+ - lib/bloomer/msgpackable.rb
109
113
  - lib/bloomer/version.rb
110
114
  - test/bloomer_test.rb
111
115
  - test/test_helper.rb
112
116
  homepage: https://github.com/mceachen/bloomer
113
117
  licenses: []
118
+ metadata: {}
114
119
  post_install_message:
115
120
  rdoc_options: []
116
121
  require_paths:
117
122
  - lib
118
123
  required_ruby_version: !ruby/object:Gem::Requirement
119
- none: false
120
124
  requirements:
121
- - - ! '>='
125
+ - - ">="
122
126
  - !ruby/object:Gem::Version
123
127
  version: '0'
124
- segments:
125
- - 0
126
- hash: 2624379326334183946
127
128
  required_rubygems_version: !ruby/object:Gem::Requirement
128
- none: false
129
129
  requirements:
130
- - - ! '>='
130
+ - - ">="
131
131
  - !ruby/object:Gem::Version
132
132
  version: '0'
133
- segments:
134
- - 0
135
- hash: 2624379326334183946
136
133
  requirements: []
137
134
  rubyforge_project: bloomer
138
- rubygems_version: 1.8.21
135
+ rubygems_version: 2.7.7
139
136
  signing_key:
140
- specification_version: 3
137
+ specification_version: 4
141
138
  summary: Bloom filters and Scalable Bloom filters (SBF) in pure ruby
142
- test_files:
143
- - test/bloomer_test.rb
144
- - test/test_helper.rb
145
- has_rdoc:
139
+ test_files: []