redis-bitops 0.2

checksums.yaml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA1:
3
+ metadata.gz: a567e3b9af870d0a1875760e5c02ea1562b1204b
4
+ data.tar.gz: 0d27e8d824da9cc252aa3058c5c0fcb90c1af965
5
+ SHA512:
6
+ metadata.gz: 9a127af0eda591bd37ad31378275c52ea1cd8d4280fe836520e67d4ae423947836316583edffb4f2a0a153ba67fe4cc7c75c741b98bea862761ba2b4f92f731c
7
+ data.tar.gz: 56f170a119fff911cc250597887c4e49b6d5c9a784817315d02b80fbd70db129ca39ceaeb2a3d999bf5bba51066d3df705184a9ec1971a12084d215565ad89c1
data/MIT-LICENSE ADDED
@@ -0,0 +1,20 @@
1
+ Copyright 2014 Martin Bilski
2
+
3
+ Permission is hereby granted, free of charge, to any person obtaining
4
+ a copy of this software and associated documentation files (the
5
+ "Software"), to deal in the Software without restriction, including
6
+ without limitation the rights to use, copy, modify, merge, publish,
7
+ distribute, sublicense, and/or sell copies of the Software, and to
8
+ permit persons to whom the Software is furnished to do so, subject to
9
+ the following conditions:
10
+
11
+ The above copyright notice and this permission notice shall be
12
+ included in all copies or substantial portions of the Software.
13
+
14
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
15
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
16
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
17
+ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
18
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
19
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
20
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,193 @@
1
+ # Introduction
2
+
3
+ This gem makes it easier to do bit-wise operations on large Redis bitsets, usually called bitmaps, with a natural expression syntax. It also supports huge **sparse bitmaps** by storing data in multiple keys, called chunks, per bitmap.
4
+
5
+ The typical use is real-time web analytics where each bit in a bitmap/bitset corresponds to a user ([introductory article here](http://blog.getspool.com/2011/11/29/fast-easy-realtime-metrics-using-redis-bitmaps/)). This library isn't an analytics package, though; it is lower level than that, and you can use it for anything.
6
+
7
+ **This library is under development and its interface might change.**
8
+
9
+
10
+ # Quick start/pitch
11
+
12
+ Why use the library?
13
+
14
+ require 'redis/bitops'
15
+ redis = Redis.new
16
+
17
+ b1 = redis.sparse_bitmap("b1")
18
+ b1[128_000_000] = true
19
+ b2 = redis.sparse_bitmap("b2")
20
+ b2[128_000_000] = true
21
+ result = b1 & b2
22
+
23
+ Memory usage is about 20 KB, because the sparse bitmap implementation stores data in chunks and only allocates the chunks that contain set bits.
24
+
25
+ Let's go crazy with super-complicated expressions:
26
+
27
+ ...
28
+ result = (b1 & ~b2) | b3 | (b4 & b5 & b6 & ~b7)
29
+
30
+ Imagine writing this expression using Redis#bitop!
31
+
32
+
33
+ # Installation
34
+
35
+ To install the gem:
36
+
37
+ gem install redis-bitops
38
+
39
+ To use it in your code:
40
+
41
+ require 'redis/bitops'
42
+
43
+ # Usage
44
+
45
+ Reference: [here](http://rdoc.info/github/bilus/redis-bitops/master/frames)
46
+
47
+ ## Basic example
48
+
49
+ An example is often better than theory so here's one. Let's create a few bitmaps and set their individual bits; we'll use those bitmaps in the examples below:
50
+
51
+ redis = Redis.new
52
+
53
+ a = redis.bitmap("a")
54
+ b = redis.bitmap("b")
55
+ result = redis.bitmap("result")
56
+
57
+ b[0] = true; b[2] = true; b[7] = true # 10100001
58
+ a[0] = true; a[1] = true; a[7] = true # 11000001
59
+
60
+ So, now here's a very simple expression:
61
+
62
+ c = a & b
63
+
64
+ You may be surprised but the above statement does not query Redis at all! The expression is lazy-evaluated when you access the result:
65
+
66
+ puts c.bitcount # => 2
67
+ puts c[0] # => true
68
+ puts c[1] # => false
69
+ puts c[2] # => false
70
+ puts c[7] # => true
71
+
72
+ So, in the above example, the call to `c.bitcount` happens to be the first moment when Redis is queried. The result is stored under a temporary unique key.
73
+
74
+ puts c.root_key # => "redis:bitops:8eef38u9o09334"
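The lazy evaluation described above can be illustrated without a Redis connection. The following is only a minimal sketch: `FakeBitmap` and `LazyAnd` are hypothetical stand-ins that keep set bit positions in a Ruby array, and they are not the gem's classes. The sketch shows that building `a & b` does no work, and that the result is computed once, on first access.

```ruby
# Hypothetical in-memory stand-in for a bitmap; stores the positions of set bits.
class FakeBitmap
  attr_reader :bits

  def initialize(*set_positions)
    @bits = set_positions
  end

  # Builds an expression object instead of computing anything.
  def &(other)
    LazyAnd.new(self, other)
  end
end

# Hypothetical lazy AND expression; evaluated on first access and then cached.
class LazyAnd
  attr_reader :evaluations

  def initialize(lhs, rhs)
    @lhs = lhs
    @rhs = rhs
    @evaluations = 0
  end

  def bitcount
    result.size
  end

  def [](pos)
    result.include?(pos)
  end

  private

  # Computes the intersection once and remembers it.
  def result
    @result ||= begin
      @evaluations += 1
      @lhs.bits & @rhs.bits
    end
  end
end

a = FakeBitmap.new(0, 1, 7)
b = FakeBitmap.new(0, 2, 7)
c = a & b

puts c.evaluations # nothing has been computed yet
puts c.bitcount    # forces the evaluation
puts c[7]
puts c.evaluations # still a single evaluation
```

In the gem itself the cached result lives in Redis under the temporary key shown above, which is why it has to be deleted explicitly.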
75
+
76
+ Let's delete the temporary result:
77
+
78
+ c.delete!
79
+
80
+ If you want to store the result directly under a specific key:
81
+
82
+ result << c
83
+
84
+ Or, more adventurously, we can use the following more complex one-liner:
85
+
86
+ result << (~c & (a | b))
87
+
88
+ **Note:** Expressions are optimized by reducing the number of Redis commands and by using as few temporary keys as possible to hold intermediate values. See below for details.
89
+
90
+
91
+ ## Sparse bitmaps
92
+
93
+ ### Usage
94
+
95
+ You don't have to do anything special, simply use `Redis#sparse_bitmap` instead of `Redis#bitmap`:
96
+
97
+ a = redis.sparse_bitmap("a")
98
+ b = redis.sparse_bitmap("b")
99
+ result = redis.sparse_bitmap("result")
100
+
101
+ b[0] = true; b[2] = true; b[7] = true # 10100001
102
+ a[0] = true; a[1] = true; a[7] = true # 11000001
103
+
104
+ c = a & b
105
+
106
+ result << c
107
+
108
+ or just:
109
+
110
+ result << (a & b)
111
+
112
+ You can specify the chunk size (in bytes).
113
+
114
+ Use the size consistently. Note that it cannot be re-adjusted for data already saved to Redis:
115
+
116
+ x = redis.sparse_bitmap("x", 1024 * 1024) # 1 MB per chunk.
117
+ x[0] = true
118
+ x[1000] = true
119
+
120
+ **Important:** Do not mix sparse bitmaps with regular ones and never mix sparse bitmaps with different chunk sizes in the same expressions.
121
+
122
+ ### Rationale
123
+
124
+ If you want to store a lot of huge but sparse bitsets, with not many bits set, regular Redis bitmaps don't work very well because they waste a lot of space. In analytics, it's a reasonable requirement to be able to store data about several million users. A bitmap for 10 million users weighs over 1MB! Imagine storing hourly statistics and using up memory at a rate of 720MB per month.
125
+
126
+ For, say, 100 million users it becomes outright prohibitive!
127
+
128
+ But even with a fairly popular website, I dare say, you don't often have one million users per hour :) This means that the majority of those bits are never set and a lot of space is wasted.
129
+
130
+ Enter sparse bitmaps. They divide each bitmap into chunks thus minimizing memory use (chunks' size can be configured, see Configuration below).
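To make the chunking idea concrete, here is a rough sketch of the arithmetic involved. The helpers `chunk_location` and `bytes_to_reach` are hypothetical names introduced for illustration, and the gem's `SparseBitmap` may compute keys and offsets differently. The default of 32 KB per chunk is taken from `Configuration#reset!`.

```ruby
# Hypothetical helper: maps an absolute bit position to a chunk number and a
# bit offset inside that chunk.
def chunk_location(pos, bytes_per_chunk = 32 * 1024)
  bits_per_chunk = bytes_per_chunk * 8
  pos.divmod(bits_per_chunk)
end

# Hypothetical helper: number of bytes a single Redis string must span so
# that the bit at 'pos' exists.
def bytes_to_reach(pos)
  pos / 8 + 1
end

pos = 128_000_000
chunk, offset = chunk_location(pos)

puts "chunk #{chunk}, bit offset #{offset}"            # chunk 488, bit offset 73728
puts "regular bitmap: #{bytes_to_reach(pos)} bytes"    # about 16 MB in one string
puts "sparse chunk:   #{bytes_to_reach(offset)} bytes" # at most one chunk
```

Only the chunk that actually contains a set bit needs to exist, so the cost of a single far-away bit is bounded by the chunk size rather than by the bit's position.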
131
+
132
+ Creating and using sparse bitmaps is identical to using regular bitmaps:
133
+
134
+ huge = redis.sparse_bitmap("huge_bitmap")
135
+ huge[128_000_000] = true
136
+
137
+ The only difference in the above example is that it will allocate two 32kb chunks as opposed to the roughly 16MB that would be allocated if we used a regular bitmap (Redis#bitmap), because bit 128,000,000 sits 16 million bytes into the string. In addition, setting the bit is nearly instantaneous.
138
+
139
+ Compare:
140
+
141
+ puts Benchmark.measure {
142
+ sparse = redis.sparse_bitmap("huge_sparse_bitmap")
143
+ sparse[500_000_000] = true
144
+ }
145
+
146
+ which on my machine generates:
147
+
148
+ 0.000000 0.000000 0.000000 ( 0.000366)
149
+
150
+ It uses just 23kb of memory, as opposed to the 120MB (megabytes!) needed to store the same bit using a regular Redis bitmap:
151
+
152
+ regular = redis.bitmap("huge_regular_bitmap")
153
+ regular[500_000_000] = true
154
+
155
+ ## Configuration
156
+
157
+ Here's how to configure the gem:
158
+
159
+ Redis::Bitops.configure do |config|
160
+ config.default_bytes_per_chunk = 8192 # Eight kilobytes.
161
+ config.transaction_level = :bitmap # allowed values: :bitmap or :none.
162
+ end
163
+
164
+ # Implementation & efficiency
165
+
166
+ ## Optimization phase
167
+
168
+ Prior to evaluation, the expression is optimized by combining operators into single BITOP commands and reusing temporary keys (required to store intermediate results) as much as possible.
169
+
170
+ This silly example:
171
+
172
+ result << (a & b & c | a | b)
173
+
174
+ translates into simply:
175
+
176
+ BITOP AND result a b c
177
+ BITOP OR result result a b
178
+
179
+ and doesn't create any temporary keys at all!
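The operator-combining step can be sketched in isolation. This is a simplified illustration, not the gem's code: `Node`, `flatten_ops` and `render` are hypothetical names, and operands are plain symbols rather than bitmaps. It relies on AND, OR and XOR being associative, so nested nodes with the same operator can be merged into one node with more operands.

```ruby
# Hypothetical expression node: an operator and its operands.
Node = Struct.new(:op, :args)

# Merges nested nodes that use the same operator as their parent.
# Always returns an array so that a parent can splice the result in.
def flatten_ops(node, parent_op = nil)
  return [node] unless node.is_a?(Node)

  merged = node.args.flat_map { |arg| flatten_ops(arg, node.op) }
  node.op == parent_op ? merged : [Node.new(node.op, merged)]
end

# Renders a tree in prefix notation for inspection.
def render(node)
  return node.to_s unless node.is_a?(Node)

  "(#{node.op} #{node.args.map { |arg| render(arg) }.join(' ')})"
end

# a & b & c | a | b, as Ruby's operator precedence would nest it.
tree = Node.new(:or, [
  Node.new(:or, [
    Node.new(:and, [Node.new(:and, [:a, :b]), :c]),
    :a
  ]),
  :b
])

optimized = flatten_ops(tree).first

puts render(tree)      # (or (or (and (and a b) c) a) b)
puts render(optimized) # (or (and a b c) a b)
```

The flattened tree has one AND node and one OR node, which matches the two BITOP commands listed above.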
180
+
181
+ ## Materialization phase
182
+
183
+ At this point, the calculations are carried out and the result is saved under the destination key. Note that, for sparse bitmaps, multiple keys may be created.
184
+
185
+
186
+ ## Transaction levels
187
+
188
+ TBD
189
+
190
+
191
+ ## Contributing/feedback
192
+
193
+ Please send in your suggestions to [gyamtso@gmail.com](mailto:gyamtso@gmail.com). Pull requests, issues, comments are more than welcome.
@@ -0,0 +1,26 @@
1
+ require 'redis'
2
+ require 'redis/bitops/queries/materialization_helpers'
3
+ require 'redis/bitops/queries/tree_building_helpers'
4
+ require 'redis/bitops/queries/lazy_evaluation'
5
+ require 'redis/bitops/queries/binary_operator'
6
+ require 'redis/bitops/queries/unary_operator'
7
+ require 'redis/bitops/bitmap'
8
+ require 'redis/bitops/sparse_bitmap'
9
+
10
+ require 'redis/bitops/configuration'
11
+
12
+
13
+ class Redis
14
+
15
+ # Creates a new bitmap.
16
+ #
17
+ def bitmap(key)
18
+ Bitops::Bitmap.new(key, self)
19
+ end
20
+
21
+ # Creates a new sparse bitmap storing data in n chunks to conserve memory.
22
+ #
23
+ def sparse_bitmap(key, bytes_per_chunk = nil)
24
+ Bitops::SparseBitmap.new(key, self, bytes_per_chunk)
25
+ end
26
+ end
@@ -0,0 +1,107 @@
1
+ class Redis
2
+ module Bitops
3
+
4
+ # A regular Redis bitmap stored under a single key.
5
+ #
6
+ # Note: When adding new public methods, revise the LazyEvaluation module.
7
+ #
8
+ class Bitmap
9
+
10
+ include Queries
11
+ include TreeBuildingHelpers # See for a list of supported operators.
12
+
13
+ # Creates a new regular Redis bitmap stored in 'redis' under 'root_key'.
14
+ #
15
+ def initialize(root_key, redis)
16
+ @redis = redis
17
+ @root_key = root_key
18
+ end
19
+
20
+ # Saves the result of the query in the bitmap.
21
+ #
22
+ def << (query)
23
+ query.evaluate(self)
24
+ end
25
+
26
+ # Reads bit at position 'pos' returning a boolean.
27
+ #
28
+ def [] (pos)
29
+ i2b(@redis.getbit(key(pos), offset(pos)))
30
+ end
31
+
32
+ # Sets bit at position 'pos' to 1 or 0 based on the boolean 'b'.
33
+ #
34
+ def []= (pos, b)
35
+ @redis.setbit(key(pos), offset(pos), b2i(b))
36
+ end
37
+
38
+ # Returns the number of set bits.
39
+ #
40
+ def bitcount
41
+ @redis.bitcount(@root_key)
42
+ end
43
+
44
+ # Deletes the bitmap and all its keys.
45
+ #
46
+ def delete!
47
+ @redis.del(@root_key)
48
+ end
49
+
50
+ # Redis BITOP operator 'op' (one of :and, :or, :xor or :not) on operands
51
+ # (bitmaps). The result is stored in 'result'.
52
+ #
53
+ def bitop(op, *operands, result)
54
+ @redis.bitop(op, result.root_key, self.root_key, *operands.map(&:root_key))
55
+ result
56
+ end
57
+
58
+ # The key the bitmap is stored under.
59
+ #
60
+ def root_key
61
+ @root_key
62
+ end
63
+
64
+ # Returns lambda creating Bitmap objects using @redis as the connection.
65
+ #
66
+ def bitmap_factory
67
+ lambda { |key| @redis.bitmap(key) }
68
+ end
69
+
70
+ # Copy this bitmap to 'dest' bitmap.
71
+ #
72
+ def copy_to(dest)
73
+ copy(root_key, dest.root_key)
74
+ end
75
+
76
+ protected
77
+
78
+ def key(pos)
79
+ @root_key
80
+ end
81
+
82
+ def offset(pos)
83
+ pos
84
+ end
85
+
86
+ def b2i(b)
87
+ b ? 1 : 0
88
+ end
89
+
90
+ def i2b(i)
91
+ i.to_i != 0 ? true : false
92
+ end
93
+
94
+ COPY_SCRIPT =
95
+ <<-EOS
96
+ redis.call("DEL", KEYS[2])
97
+ if redis.call("EXISTS", KEYS[1]) == 1 then
98
+ local val = redis.call("DUMP", KEYS[1])
99
+ redis.call("RESTORE", KEYS[2], 0, val)
100
+ end
101
+ EOS
102
+ def copy(source_key, dest_key)
103
+ @redis.eval(COPY_SCRIPT, [source_key, dest_key])
104
+ end
105
+ end
106
+ end
107
+ end
@@ -0,0 +1,38 @@
1
+ class Redis
2
+ module Bitops
3
+
4
+ # Configurable settings.
5
+ #
6
+ class Configuration
7
+
8
+ # Number of bytes per one sparse bitmap chunk.
9
+ #
10
+ attr_accessor :default_bytes_per_chunk
11
+
12
+ # Granularity of MULTI transactions. Currently supported values are :bitmap and nil.
13
+ #
14
+ attr_accessor :transaction_level
15
+
16
+ def initialize
17
+ reset!
18
+ end
19
+
20
+ def reset!
21
+ @default_bytes_per_chunk = 32 * 1024
22
+ @transaction_level = :bitmap
23
+ end
24
+ end
25
+
26
+ extend self
27
+ attr_accessor :configuration
28
+
29
+ # Call this method to modify defaults in your initializers.
30
+ #
31
+ def configure
32
+ self.configuration ||= Configuration.new
33
+ yield(configuration)
34
+ end
35
+ end
36
+
37
+ Bitops.configure {}
38
+ end
@@ -0,0 +1,71 @@
1
+ require 'securerandom'
2
+
3
+ class Redis
4
+ module Bitops
5
+ module Queries
6
+
7
+ # Binary bitwise operator.
8
+ #
9
+ class BinaryOperator
10
+ include MaterializationHelpers
11
+ include TreeBuildingHelpers
12
+ include LazyEvaluation
13
+
14
+ # Creates a bitwise operator 'op' with left-hand operand, 'lhs', and right-hand operand, 'rhs'.
15
+ #
16
+ def initialize(op, lhs, rhs)
17
+ @args = [lhs, rhs]
18
+ @op = op
19
+ end
20
+
21
+ # Runs the expression tree against the redis database, saving the results
22
+ # in bitmap 'dest'.
23
+ #
24
+ def materialize(dest)
25
+ # Resolve lhs and rhs operand, using 'dest' to store intermediate result so
26
+ # a maximum of one temporary Bitmap has to be created.
27
+ # Then apply the bitwise operator storing the final result in 'dest'.
28
+
29
+ intermediate = dest
30
+
31
+ lhs, *other_args = @args
32
+ temp_intermediates = []
33
+
34
+ # Side-effects: if a temp intermediate bitmap is created, it's added to 'temp_intermediates'
35
+ # to be deleted in the "ensure" block. Marked with "<- SE".
36
+
37
+ lhs_operand, intermediate = resolve_operand(lhs, intermediate, temp_intermediates) # <- SE
38
+ other_operands, *_ = other_args.inject([[], intermediate]) do |(operands, intermediate), arg|
39
+ operand, intermediate = resolve_operand(arg, intermediate, temp_intermediates) # <- SE
40
+ [operands << operand, intermediate]
41
+ end
42
+
43
+ lhs_operand.bitop(@op, *other_operands, dest)
44
+ ensure
45
+ temp_intermediates.each(&:delete!)
46
+ end
47
+
48
+ # Recursively optimizes the expression tree by combining operands for neighboring identical
49
+ # operators, so for instance a & b & c ultimately becomes BITOP :and dest a b c as opposed
50
+ # to running two separate BITOP commands.
51
+ #
52
+ def optimize!(parent_op = nil)
53
+ @args.map! { |arg| arg.respond_to?(:optimize!) ? arg.optimize!(@op) : arg }.flatten!
54
+ if parent_op == @op
55
+ @args
56
+ else
57
+ self
58
+ end
59
+ end
60
+
61
+ # Finds the first bitmap factory in the expression tree.
62
+ # Required by LazyEvaluation and MaterializationHelpers.
63
+ #
64
+ def bitmap_factory
65
+ arg = @args.find { |arg| arg.bitmap_factory } or raise "Internal error. Cannot find a bitmap factory."
66
+ arg.bitmap_factory
67
+ end
68
+ end
69
+ end
70
+ end
71
+ end