redis-bitops 0.2

checksums.yaml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA1:
3
+ metadata.gz: a567e3b9af870d0a1875760e5c02ea1562b1204b
4
+ data.tar.gz: 0d27e8d824da9cc252aa3058c5c0fcb90c1af965
5
+ SHA512:
6
+ metadata.gz: 9a127af0eda591bd37ad31378275c52ea1cd8d4280fe836520e67d4ae423947836316583edffb4f2a0a153ba67fe4cc7c75c741b98bea862761ba2b4f92f731c
7
+ data.tar.gz: 56f170a119fff911cc250597887c4e49b6d5c9a784817315d02b80fbd70db129ca39ceaeb2a3d999bf5bba51066d3df705184a9ec1971a12084d215565ad89c1
data/MIT-LICENSE ADDED
@@ -0,0 +1,20 @@
1
+ Copyright 2014 Martin Bilski
2
+
3
+ Permission is hereby granted, free of charge, to any person obtaining
4
+ a copy of this software and associated documentation files (the
5
+ "Software"), to deal in the Software without restriction, including
6
+ without limitation the rights to use, copy, modify, merge, publish,
7
+ distribute, sublicense, and/or sell copies of the Software, and to
8
+ permit persons to whom the Software is furnished to do so, subject to
9
+ the following conditions:
10
+
11
+ The above copyright notice and this permission notice shall be
12
+ included in all copies or substantial portions of the Software.
13
+
14
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
15
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
16
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
17
+ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
18
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
19
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
20
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,193 @@
1
+ # Introduction
2
+
3
+ This gem makes it easier to do bit-wise operations on large Redis bitsets, usually called bitmaps, with a natural expression syntax. It also supports huge **sparse bitmaps** by storing data in multiple keys, called chunks, per bitmap.
4
+
5
+ The typical use is real-time web analytics where each bit in a bitmap/bitset corresponds to a user ([introductory article here](http://blog.getspool.com/2011/11/29/fast-easy-realtime-metrics-using-redis-bitmaps/)). This library isn't an analytics package, though. It's lower-level than that, and you can use it for anything.
6
+
7
+ **This library is under development and its interface might change.**
8
+
9
+
10
+ # Quick start/pitch
11
+
12
+ Why use the library?
13
+
14
+ require 'redis/bitops'
15
+ redis = Redis.new
16
+
17
+ b1 = redis.sparse_bitmap("b1")
18
+ b1[128_000_000] = true
19
+ b2 = redis.sparse_bitmap("b2")
20
+ b2[128_000_000] = true
21
+ result = b1 & b2
22
+
23
+ Memory usage: about 20kb, because the sparse bitmap implementation stores the data in chunks.
24
+
25
+ Let's go crazy with super-complicated expressions:
26
+
27
+ ...
28
+ result = (b1 & ~b2) | b3 | (b4 & b5 & b6 & ~b7)
29
+
30
+ Imagine writing this expression using Redis#bitop!
31
+
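To make the comparison concrete, here is a hand-written sketch of the raw `BITOP` sequence that an expression like `(b1 & ~b2) | b3 | (b4 & b5 & b6 & ~b7)` corresponds to. `RecordingClient` is a hypothetical stand-in that only mimics the `bitop(operation, destkey, *keys)` call shape of redis-rb so the snippet runs without a server. The temporary key names `tmp1` and `tmp2` are made up, and the commands the gem actually emits may differ.

```ruby
# Hypothetical stand-in for a Redis client: records BITOP calls instead of
# sending them, mirroring the bitop(operation, destkey, *keys) call shape.
class RecordingClient
  attr_reader :commands

  def initialize
    @commands = []
  end

  def bitop(operation, destkey, *keys)
    @commands << ["BITOP", operation.to_s.upcase, destkey, *keys].join(" ")
  end
end

client = RecordingClient.new

# (b1 & ~b2) | b3 | (b4 & b5 & b6 & ~b7), spelled out by hand.
# "tmp1" and "tmp2" are illustrative temporary keys.
client.bitop(:not, "tmp1", "b2")                      # tmp1 = ~b2
client.bitop(:and, "tmp1", "b1", "tmp1")              # tmp1 = b1 & ~b2
client.bitop(:not, "tmp2", "b7")                      # tmp2 = ~b7
client.bitop(:and, "tmp2", "b4", "b5", "b6", "tmp2")  # tmp2 = b4 & b5 & b6 & ~b7
client.bitop(:or, "result", "tmp1", "b3", "tmp2")     # combine everything

puts client.commands
```

With a real connection, the temporary keys would also have to be named uniquely and deleted afterwards, which is the bookkeeping the expression syntax is meant to hide.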
32
+
33
+ # Installation
34
+
35
+ To install the gem:
36
+
37
+ gem install redis-bitops
38
+
39
+ To use it in your code:
40
+
41
+ require 'redis/bitops'
42
+
43
+ # Usage
44
+
45
+ Reference: [here](http://rdoc.info/github/bilus/redis-bitops/master/frames)
46
+
47
+ ## Basic example
48
+
49
+ An example is often better than theory so here's one. Let's create a few bitmaps and set their individual bits; we'll use those bitmaps in the examples below:
50
+
51
+ redis = Redis.new
52
+
53
+ a = redis.bitmap("a")
54
+ b = redis.bitmap("b")
55
+ result = redis.bitmap("result")
56
+
57
+ b[0] = true; b[2] = true; b[7] = true # 10100001
58
+ a[0] = true; a[1] = true; a[7] = true # 11000001
59
+
60
+ So, now here's a very simple expression:
61
+
62
+ c = a & b
63
+
64
+ You may be surprised but the above statement does not query Redis at all! The expression is lazy-evaluated when you access the result:
65
+
66
+ puts c.bitcount # => 2
67
+ puts c[0] # => true
68
+ puts c[1] # => false
69
+ puts c[2] # => false
70
+ puts c[7] # => true
71
+
72
+ So, in the above example, the call to `c.bitcount` happens to be the first moment when Redis is queried. The result is stored under a temporary unique key.
73
+
74
+ puts c.root_key # => "redis:bitops:8eef38u9o09334"
75
+
76
+ Let's delete the temporary result:
77
+
78
+ c.delete!
79
+
80
+ If you want to store the result directly under a specific key:
81
+
82
+ result << c
83
+
84
+ Or, more adventurously, we can use the following more complex one-liner:
85
+
86
+ result << (~c & (a | b))
87
+
88
+ **Note:** Expressions are optimized to use as few Redis commands, and as few temporary keys for intermediate values, as possible. See below for details.
89
+
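The lazy evaluation described in this section can be illustrated without Redis. The following is a minimal sketch, not the gem's implementation: `LazyNode` is a hypothetical class, bitmaps are modelled as plain Ruby integers, and bit 0 is the least significant bit here (Redis numbers bits from the other end).

```ruby
# Toy model of lazy evaluation: bitmaps are plain Integers used as bit masks,
# and the operators build a tree instead of computing anything.
class LazyNode
  attr_reader :evaluations

  def initialize(op, *args)
    @op = op
    @args = args
    @evaluations = 0
  end

  def &(other)
    LazyNode.new(:and, self, other)
  end

  def |(other)
    LazyNode.new(:or, self, other)
  end

  # Forces evaluation; this is the only place where real work is done.
  def value
    @evaluations += 1
    case @op
    when :leaf then @args.first
    when :and  then @args.map(&:value).reduce(:&)
    when :or   then @args.map(&:value).reduce(:|)
    end
  end

  # Counts the set bits of the evaluated result.
  def bitcount
    value.to_s(2).count("1")
  end
end

a = LazyNode.new(:leaf, 0b1000_0011) # bits 0, 1 and 7 set
b = LazyNode.new(:leaf, 0b1000_0101) # bits 0, 2 and 7 set

c = a & b            # builds a tree node, computes nothing
puts c.evaluations   # 0, nothing has been evaluated yet
puts c.bitcount      # 2, evaluation happens on first access
```

Unlike this toy, which recomputes the value on every access, the gem stores the evaluated result under a temporary key as described above.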
90
+
91
+ ## Sparse bitmaps
92
+
93
+ ### Usage
94
+
95
+ You don't have to do anything special, simply use `Redis#sparse_bitmap` instead of `Redis#bitmap`:
96
+
97
+ a = redis.sparse_bitmap("a")
98
+ b = redis.sparse_bitmap("b")
99
+ result = redis.sparse_bitmap("result")
100
+
101
+ b[0] = true; b[2] = true; b[7] = true # 10100001
102
+ a[0] = true; a[1] = true; a[7] = true # 11000001
103
+
104
+ c = a & b
105
+
106
+ result << c
107
+
108
+ or just:
109
+
110
+ result << (a & b)
111
+
112
+ You can specify the chunk size (in bytes).
113
+
114
+ Use the size consistently. Note that it cannot be re-adjusted for data already saved to Redis:
115
+
116
+ x = redis.sparse_bitmap("x", 1024 * 1024) # 1 MB per chunk.
117
+ x[0] = true
118
+ x[1000] = true
119
+
120
+ **Important:** Do not mix sparse bitmaps with regular ones and never mix sparse bitmaps with different chunk sizes in the same expressions.
121
+
122
+ ### Rationale
123
+
124
+ If you want to store a lot of huge but sparse bitsets, with not many bits set, regular Redis bitmaps don't work very well: they waste a lot of space. In analytics, it's reasonable to need to store data about several million users. A bitmap for 10 million users weighs over 1MB! Imagine storing hourly statistics and using up memory at a rate of 720MB per month.
125
+
126
+ For, say, 100 million users it becomes outright prohibitive!
127
+
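The arithmetic behind these figures can be checked with a short sketch. It assumes one bit per user, one regular bitmap per hour and a 30-day month. The helper names are illustrative only. Computed exactly, a 10-million-bit bitmap is 1.25 MB, so the monthly total comes out at about 900 MB, somewhat above the rounded figure that counts each bitmap as 1MB.

```ruby
# Back-of-the-envelope memory use of regular (non-sparse) bitmaps,
# assuming one bit per user and one bitmap per hour.
BITS_PER_BYTE   = 8
HOURS_PER_MONTH = 24 * 30 # 720 hourly bitmaps in a 30-day month

# Bytes needed to hold one bit per user, rounded up to whole bytes.
def bitmap_bytes(users)
  (users + BITS_PER_BYTE - 1) / BITS_PER_BYTE
end

def megabytes(bytes)
  bytes / 1_000_000.0
end

[10_000_000, 100_000_000].each do |users|
  per_bitmap = bitmap_bytes(users)
  per_month  = per_bitmap * HOURS_PER_MONTH
  puts format("%d users: %.2f MB per bitmap, %.0f MB per month",
              users, megabytes(per_bitmap), megabytes(per_month))
end
```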
128
+ But even with a fairly popular website, I dare say, you don't often have one million users per hour :) This means that the majority of those bits are never set and a lot of space is wasted.
129
+
130
+ Enter sparse bitmaps. They divide each bitmap into chunks thus minimizing memory use (chunks' size can be configured, see Configuration below).
131
+
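One way to picture the chunking is as a mapping from a global bit position to a chunk plus an offset inside that chunk. The sketch below is an assumption about how such a mapping could look, not the gem's code: the `locate` helper and the `<root_key>:chunk:<n>` key format are made up for illustration, and only the 32 KB default chunk size comes from the gem's configuration.

```ruby
# Sketch of mapping a global bit position to a chunk and an offset inside it.
# The "<root_key>:chunk:<n>" key format is a made-up illustration.
BYTES_PER_CHUNK = 32 * 1024 # same value as the gem's default chunk size

def locate(root_key, pos, bytes_per_chunk = BYTES_PER_CHUNK)
  bits_per_chunk = bytes_per_chunk * 8
  chunk_index, offset = pos.divmod(bits_per_chunk)
  ["#{root_key}:chunk:#{chunk_index}", offset]
end

key, offset = locate("huge_bitmap", 128_000_000)
puts key    # huge_bitmap:chunk:488
puts offset # 73728

# The same position lands in a different chunk and offset when the chunk
# size changes, so data written with one size cannot be read with another.
p locate("huge_bitmap", 128_000_000, 1024 * 1024)
```

Under this model only the chunks that contain set bits need to exist, and two bitmaps line up chunk by chunk only if they share a chunk size.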
132
+ Creating and using sparse bitmaps is identical to using regular bitmaps:
133
+
134
+ huge = redis.sparse_bitmap("huge_bitmap")
135
+ huge[128_000_000] = true
136
+
137
+ The only difference in the above example is that it will allocate two 32kb chunks as opposed to 1MB that would be allocated if we used a regular bitmap (Redis#bitmap). In addition, setting the bit is nearly instantaneous.
138
+
139
+ Compare:
140
+
141
+ puts Benchmark.measure {
142
+ sparse = redis.sparse_bitmap("huge_sparse_bitmap")
143
+ sparse[500_000_000] = true
144
+ }
145
+
146
+ which on my machine generates:
147
+
148
+ 0.000000 0.000000 0.000000 ( 0.000366)
149
+
150
+ The sparse bitmap uses just 23kb of memory, as opposed to the 120MB (megabytes!) needed to store the same bit in a regular Redis bitmap:
151
+
152
+ regular = redis.bitmap("huge_regular_bitmap")
153
+ regular[500_000_000] = true
154
+
155
+ ## Configuration
156
+
157
+ Here's how to configure the gem:
158
+
159
+ Redis::Bitops.configure do |config|
160
+ config.default_bytes_per_chunk = 8192 # Eight kilobytes.
161
+ config.transaction_level = :bitmap # allowed values: :bitmap or :none.
162
+ end
163
+
164
+ # Implementation & efficiency
165
+
166
+ ## Optimization phase
167
+
168
+ Prior to evaluation, the expression is optimized by combining operators into single BITOP commands and reusing temporary keys (required to store intermediate results) as much as possible.
169
+
170
+ This silly example:
171
+
172
+ result << (a & b & c | a | b)
173
+
174
+ translates into simply:
175
+
176
+ BITOP AND result a b c
177
+ BITOP OR result result a b
178
+
179
+ and doesn't create any temporary keys at all!
180
+
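The operator-combining step can be sketched on a toy expression tree. This is a simplified illustration under stated assumptions, not the gem's optimizer: trees are plain arrays of the form `[operator, operand, ...]`, leaves are symbols, and `flatten_ops` is a hypothetical helper that merges a child into its parent only when both use the same associative operator.

```ruby
# Toy expression tree: [operator, operand, operand, ...] with symbols as leaves.
# Nested nodes that use the same associative operator as their parent are
# merged into it, so a single BITOP can take all their operands at once.
ASSOCIATIVE_OPS = %i[and or xor].freeze

def flatten_ops(node)
  return node unless node.is_a?(Array)

  op, *args = node
  merged = args.flat_map do |arg|
    child = flatten_ops(arg)
    mergeable = child.is_a?(Array) && child.first == op && ASSOCIATIVE_OPS.include?(op)
    mergeable ? child.drop(1) : [child]
  end
  [op, *merged]
end

# a & b & c | a | b parses as ((((a & b) & c) | a) | b)
tree = [:or, [:or, [:and, [:and, :a, :b], :c], :a], :b]
p flatten_ops(tree) # [:or, [:and, :a, :b, :c], :a, :b]
```

The flattened tree has one `and` node and one `or` node, matching the two `BITOP` commands shown above. Reusing the destination key for the intermediate result is a separate step that this sketch does not cover.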
181
+ ## Materialization phase
182
+
183
+ At this point, the calculations are carried out and the result is saved under the destination key. Note that, for sparse bitmaps, multiple keys may be created.
184
+
185
+
186
+ ## Transaction levels
187
+
188
+ TBD
189
+
190
+
191
+ ## Contributing/feedback
192
+
193
+ Please send in your suggestions to [gyamtso@gmail.com](mailto:gyamtso@gmail.com). Pull requests, issues, comments are more than welcome.
@@ -0,0 +1,26 @@
1
+ require 'redis'
2
+ require 'redis/bitops/queries/materialization_helpers'
3
+ require 'redis/bitops/queries/tree_building_helpers'
4
+ require 'redis/bitops/queries/lazy_evaluation'
5
+ require 'redis/bitops/queries/binary_operator'
6
+ require 'redis/bitops/queries/unary_operator'
7
+ require 'redis/bitops/bitmap'
8
+ require 'redis/bitops/sparse_bitmap'
9
+
10
+ require 'redis/bitops/configuration'
11
+
12
+
13
+ class Redis
14
+
15
+ # Creates a new bitmap.
16
+ #
17
+ def bitmap(key)
18
+ Bitops::Bitmap.new(key, self)
19
+ end
20
+
21
+ # Creates a new sparse bitmap storing data in n chunks to conserve memory.
22
+ #
23
+ def sparse_bitmap(key, bytes_per_chunk = nil)
24
+ Bitops::SparseBitmap.new(key, self, bytes_per_chunk)
25
+ end
26
+ end
@@ -0,0 +1,107 @@
1
+ class Redis
2
+ module Bitops
3
+
4
+ # A regular Redis bitmap storing all its data under a single key.
5
+ #
6
+ # Note: When adding new public methods, revise the LazyEvaluation module.
7
+ #
8
+ class Bitmap
9
+
10
+ include Queries
11
+ include TreeBuildingHelpers # See for a list of supported operators.
12
+
13
+ # Creates a new regular Redis bitmap stored in 'redis' under 'root_key'.
14
+ #
15
+ def initialize(root_key, redis)
16
+ @redis = redis
17
+ @root_key = root_key
18
+ end
19
+
20
+ # Saves the result of the query in the bitmap.
21
+ #
22
+ def << (query)
23
+ query.evaluate(self)
24
+ end
25
+
26
+ # Reads bit at position 'pos' returning a boolean.
27
+ #
28
+ def [] (pos)
29
+ i2b(@redis.getbit(key(pos), offset(pos)))
30
+ end
31
+
32
+ # Sets bit at position 'pos' to 1 or 0 based on the boolean 'b'.
33
+ #
34
+ def []= (pos, b)
35
+ @redis.setbit(key(pos), offset(pos), b2i(b))
36
+ end
37
+
38
+ # Returns the number of set bits.
39
+ #
40
+ def bitcount
41
+ @redis.bitcount(@root_key)
42
+ end
43
+
44
+ # Deletes the bitmap and all its keys.
45
+ #
46
+ def delete!
47
+ @redis.del(@root_key)
48
+ end
49
+
50
+ # Redis BITOP operator 'op' (one of :and, :or, :xor or :not) on operands
51
+ # (bitmaps). The result is stored in 'result'.
52
+ #
53
+ def bitop(op, *operands, result)
54
+ @redis.bitop(op, result.root_key, self.root_key, *operands.map(&:root_key))
55
+ result
56
+ end
57
+
58
+ # The key the bitmap is stored under.
59
+ #
60
+ def root_key
61
+ @root_key
62
+ end
63
+
64
+ # Returns lambda creating Bitmap objects using @redis as the connection.
65
+ #
66
+ def bitmap_factory
67
+ lambda { |key| @redis.bitmap(key) }
68
+ end
69
+
70
+ # Copy this bitmap to 'dest' bitmap.
71
+ #
72
+ def copy_to(dest)
73
+ copy(root_key, dest.root_key)
74
+ end
75
+
76
+ protected
77
+
78
+ def key(pos)
79
+ @root_key
80
+ end
81
+
82
+ def offset(pos)
83
+ pos
84
+ end
85
+
86
+ def b2i(b)
87
+ b ? 1 : 0
88
+ end
89
+
90
+ def i2b(i)
91
+ i.to_i != 0
92
+ end
93
+
94
+ COPY_SCRIPT =
95
+ <<-EOS
96
+ redis.call("DEL", KEYS[2])
97
+ if redis.call("EXISTS", KEYS[1]) == 1 then
98
+ local val = redis.call("DUMP", KEYS[1])
99
+ redis.call("RESTORE", KEYS[2], 0, val)
100
+ end
101
+ EOS
102
+ def copy(source_key, dest_key)
103
+ @redis.eval(COPY_SCRIPT, [source_key, dest_key])
104
+ end
105
+ end
106
+ end
107
+ end
@@ -0,0 +1,38 @@
1
+ class Redis
2
+ module Bitops
3
+
4
+ # Configurable settings.
5
+ #
6
+ class Configuration
7
+
8
+ # Number of bytes per one sparse bitmap chunk.
9
+ #
10
+ attr_accessor :default_bytes_per_chunk
11
+
12
+ # Granularity of MULTI transactions. Currently supported values are :bitmap and nil.
13
+ #
14
+ attr_accessor :transaction_level
15
+
16
+ def initialize
17
+ reset!
18
+ end
19
+
20
+ def reset!
21
+ @default_bytes_per_chunk = 32 * 1024
22
+ @transaction_level = :bitmap
23
+ end
24
+ end
25
+
26
+ extend self
27
+ attr_accessor :configuration
28
+
29
+ # Call this method to modify defaults in your initializers.
30
+ #
31
+ def configure
32
+ self.configuration ||= Configuration.new
33
+ yield(configuration)
34
+ end
35
+ end
36
+
37
+ Bitops.configure {}
38
+ end
@@ -0,0 +1,71 @@
1
+ require 'securerandom'
2
+
3
+ class Redis
4
+ module Bitops
5
+ module Queries
6
+
7
+ # Binary bitwise operator.
8
+ #
9
+ class BinaryOperator
10
+ include MaterializationHelpers
11
+ include TreeBuildingHelpers
12
+ include LazyEvaluation
13
+
14
+ # Creates a bitwise operator 'op' with left-hand operand, 'lhs', and right-hand operand, 'rhs'.
15
+ #
16
+ def initialize(op, lhs, rhs)
17
+ @args = [lhs, rhs]
18
+ @op = op
19
+ end
20
+
21
+ # Runs the expression tree against the redis database, saving the results
22
+ # in bitmap 'dest'.
23
+ #
24
+ def materialize(dest)
25
+ # Resolve lhs and rhs operand, using 'dest' to store intermediate result so
26
+ # a maximum of one temporary Bitmap has to be created.
27
+ # Then apply the bitwise operator storing the final result in 'dest'.
28
+
29
+ intermediate = dest
30
+
31
+ lhs, *other_args = @args
32
+ temp_intermediates = []
33
+
34
+ # Side-effects: if a temp intermediate bitmap is created, it's added to 'temp_intermediates'
35
+ # to be deleted in the "ensure" block. Marked with "<- SE".
36
+
37
+ lhs_operand, intermediate = resolve_operand(lhs, intermediate, temp_intermediates) # <- SE
38
+ other_operands, *_ = other_args.inject([[], intermediate]) do |(operands, intermediate), arg|
39
+ operand, intermediate = resolve_operand(arg, intermediate, temp_intermediates) # <- SE
40
+ [operands << operand, intermediate]
41
+ end
42
+
43
+ lhs_operand.bitop(@op, *other_operands, dest)
44
+ ensure
45
+ temp_intermediates.each(&:delete!)
46
+ end
47
+
48
+ # Recursively optimizes the expression tree by combining operands for neighboring identical
49
+ # operators, so for instance a & b & c ultimately becomes BITOP :and dest a b c as opposed
50
+ # to running two separate BITOP commands.
51
+ #
52
+ def optimize!(parent_op = nil)
53
+ @args.map! { |arg| arg.respond_to?(:optimize!) ? arg.optimize!(@op) : arg }.flatten!
54
+ if parent_op == @op
55
+ @args
56
+ else
57
+ self
58
+ end
59
+ end
60
+
61
+ # Finds the first bitmap factory in the expression tree.
62
+ # Required by LazyEvaluation and MaterializationHelpers.
63
+ #
64
+ def bitmap_factory
65
+ arg = @args.find { |arg| arg.bitmap_factory } or raise "Internal error. Cannot find a bitmap factory."
66
+ arg.bitmap_factory
67
+ end
68
+ end
69
+ end
70
+ end
71
+ end