redstream 0.0.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,7 @@
+ ---
+ SHA256:
+ metadata.gz: 0e1ddc2700836c469d1ca61069e3416c21e657e05725b92e75969aa8110768e3
+ data.tar.gz: c8565f3754b3fd4f66823d6d7035e814f73e27abdb15936ecd906f5f07dd8643
+ SHA512:
+ metadata.gz: edd496df8d06b98b9318b9796f400e2c0870edfc84c3aa7f9c7946dbe6cf91c5a8c0ab32425d627bc20c585389eab92ed1b290e57e0df856e8995547d8a9b7c6
+ data.tar.gz: 4893d2197f427479e4df0821ca29a23ee98a604fa73680f955da8d2c71cbdb192d006c476dc3bd6c03c719da327c3d9b6f207842082a64133f0fe2383771aef5
@@ -0,0 +1,14 @@
+ /.bundle/
+ /.yardoc
+ /Gemfile.lock
+ /_yardoc/
+ /coverage/
+ /doc/
+ /pkg/
+ /spec/reports/
+ /tmp/
+ *.bundle
+ *.so
+ *.o
+ *.a
+ mkmf.log
@@ -0,0 +1,10 @@
+ sudo: false
+ language: ruby
+ rvm:
+   - ruby-head
+ before_install:
+   - docker-compose up -d
+   - sleep 10
+ install:
+   - travis_retry bundle install
+ script: rspec
data/Gemfile ADDED
@@ -0,0 +1,5 @@
+
+ source "https://rubygems.org"
+
+ gemspec
+
@@ -0,0 +1,22 @@
+ Copyright (c) 2014 Benjamin Vetter
+
+ MIT License
+
+ Permission is hereby granted, free of charge, to any person obtaining
+ a copy of this software and associated documentation files (the
+ "Software"), to deal in the Software without restriction, including
+ without limitation the rights to use, copy, modify, merge, publish,
+ distribute, sublicense, and/or sell copies of the Software, and to
+ permit persons to whom the Software is furnished to do so, subject to
+ the following conditions:
+
+ The above copyright notice and this permission notice shall be
+ included in all copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
@@ -0,0 +1,253 @@
+
+ # Redstream
+
+ **Using redis streams to keep your primary database in sync with secondary
+ datastores (e.g. elasticsearch).**
+
+ [![Build Status](https://secure.travis-ci.org/mrkamel/redstream.png?branch=master)](http://travis-ci.org/mrkamel/redstream)
+
+ ## Installation
+
+ First, install redis. Then, add this line to your application's Gemfile:
+
+ ```ruby
+ gem 'redstream'
+ ```
+
+ And then execute:
+
+     $ bundle
+
+ Or install it yourself as:
+
+     $ gem install redstream
+
+ ## Reference Docs
+
+ The reference docs can be found at
+ [https://www.rubydoc.info/github/mrkamel/redstream/master](https://www.rubydoc.info/github/mrkamel/redstream/master).
+
+ ## Usage
+
+ Include `Redstream::Model` in your model and add a call to
+ `redstream_callbacks`.
+
+ ```ruby
+ class MyModel < ActiveRecord::Base
+   include Redstream::Model
+
+   # ...
+
+   redstream_callbacks
+
+   # ...
+ end
+ ```
+
+ `redstream_callbacks` adds `after_save`, `after_touch`, `after_destroy` and,
+ most importantly, `after_commit` callbacks which write messages, containing the
+ record id, to a redis stream. A background worker can then fetch those messages
+ and update secondary datastores.
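The exact wire format of these messages is an implementation detail, but since the consumer examples below read `message.payload["id"]`, a minimal sketch of the payload a consumer sees might look like this (the raw JSON shown is hypothetical):

```ruby
require "json"

# Hypothetical raw payload of one stream message; the consumer examples
# in this README read the record id via message.payload["id"].
raw_payload = JSON.generate("id" => 42)

payload = JSON.parse(raw_payload)
payload["id"] # => 42
```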
+
+ In a background process, you need to run a `Redstream::Consumer`, `Redstream::Delayer`
+ and a `Redstream::Trimmer`:
+
+ ```ruby
+ Redstream::Consumer.new(stream_name: Product.redstream_name, name: "consumer").run do |messages|
+   # Update secondary datastore
+ end
+
+ # ...
+
+ Redstream::Delayer.new(stream_name: Product.redstream_name, delay: 5.minutes).run
+
+ # ...
+
+ trimmer = Redstream::Trimmer.new(
+   stream_name: Product.redstream_name,
+   consumer_names: ["indexer", "cacher"],
+   interval: 30
+ )
+
+ trimmer.run
+ ```
+
+ As all of them are blocking, you should run them in individual threads. Since
+ none of them needs a graceful shutdown, this can be as simple as:
+
+ ```ruby
+ Thread.new do
+   Redstream::Consumer.new("...").run do |messages|
+     # ...
+   end
+ end
+ ```
+
+ More concretely, `after_save`, `after_touch` and `after_destroy` only write
+ "delay" messages to an additional redis stream. Delay messages are like any
+ other messages, but they get processed by a `Redstream::Delayer`, which waits
+ for a (configurable) delay before processing them. As the `Delayer` is
+ necessary to fix inconsistencies, the delay must be at least as long as your
+ maximum database transaction time. By contrast, `after_commit` writes messages
+ to a redis stream from which the messages can be fetched immediately to keep
+ the secondary datastores updated in near-realtime. The reasoning behind all
+ this is simple: if you use only one way to update secondary datastores, namely
+ `after_save` or `after_commit`, any errors occurring between `after_save` and
+ `after_commit` result in inconsistencies between your primary and secondary
+ datastore. By using these kinds of "delay" messages triggered by `after_save`
+ and fetched after e.g. 5 minutes, errors occurring between `after_save` and
+ `after_commit` can be fixed when the delay messages get processed.
+
+ Messages are fetched in batches, such that e.g. elasticsearch can be updated
+ using its bulk API. For instance, depending on which elasticsearch ruby client
+ you are using, the reindexing code will look similar to:
+
+ ```ruby
+ Thread.new do
+   Redstream::Consumer.new(stream_name: Product.redstream_name, name: "indexer").run do |messages|
+     ids = messages.map { |message| message.payload["id"] }
+
+     ProductIndex.import Product.where(id: ids)
+   end
+ end
+
+ Thread.new do
+   Redstream::Delayer.new(stream_name: Product.redstream_name, delay: 5.minutes).run
+ end
+
+ Thread.new do
+   Redstream::Trimmer.new(stream_name: Product.redstream_name, consumer_names: ["indexer"], interval: 30).run
+ end
+ ```
+
+ You should run a consumer per `(stream_name, name)` tuple on multiple hosts for
+ high availability. The consumers use a redis based locking mechanism to ensure
+ that only one consumer per tuple is actively consuming messages while the
+ others are hot-standbys, i.e. they take over in case the currently active
+ instance dies. The same applies to delayers and trimmers.
+
+ Please note: if you have multiple kinds of consumers for a single model/topic,
+ you must use distinct names. Assume you have an indexer, which updates a
+ search index for a model, and a cacher, which updates a cache store for it:
+
+ ```ruby
+ Redstream::Consumer.new(stream_name: Product.redstream_name, name: "indexer").run do |messages|
+   # ...
+ end
+
+ Redstream::Consumer.new(stream_name: Product.redstream_name, name: "cacher").run do |messages|
+   # ...
+ end
+ ```
+
+ ## Consumer, Delayer, Trimmer, Producer
+
+ A `Consumer` fetches messages that have been added to a redis stream via
+ `after_commit` or by a `Delayer`, i.e. messages that are available for
+ immediate retrieval/reindexing/syncing.
+
+ ```ruby
+ Redstream::Consumer.new(stream_name: Product.redstream_name, name: "indexer").run do |messages|
+   ids = messages.map { |message| message.payload["id"] }
+
+   ProductIndex.import Product.where(id: ids)
+ end
+ ```
+
+ A `Delayer` fetches messages that have been added to a second redis stream via
+ `after_save`, `after_touch` and `after_destroy`, to be retrieved after a
+ certain configurable amount of time (usually 5 minutes) to fix
+ inconsistencies. The amount of time must be at least as long as your maximum
+ database transaction time.
+
+ ```ruby
+ Redstream::Delayer.new(stream_name: Product.redstream_name, delay: 5.minutes).run
+ ```
+
+ A `Trimmer` is responsible for eventually removing messages from redis
+ streams. Without a `Trimmer`, messages will fill up your redis server until it
+ crashes with out-of-memory errors. To be able to trim a stream, you must pass
+ an array containing all consumer names reading from the respective stream. The
+ `Trimmer` then continuously checks how far each consumer has already processed
+ the stream and trims the stream up to the committed minimum. If there is
+ nothing to trim, the `Trimmer` sleeps for the specified `interval`.
+
+ ```ruby
+ Redstream::Trimmer.new(stream_name: Product.redstream_name, consumer_names: ["indexer"], interval: 30).run
+ ```
+
+ A `Producer` adds messages to the redis streams. You can pass a custom
+ `Producer` instance via `redstream_callbacks`:
+
+ ```ruby
+ class Product < ActiveRecord::Base
+   include Redstream::Model
+
+   # ...
+
+   redstream_callbacks producer: Redstream::Producer.new("...")
+
+   # ...
+ end
+ ```
+
+ As you might recognize, `Redstream::Model` can only send messages to redis
+ streams for model lifecycle callbacks. This is not the case for
+ `#update_all`:
+
+ ```ruby
+ Product.where(on_stock: true).update_all(featured: true)
+ ```
+
+ To capture those updates as well, you need to change:
+
+ ```ruby
+ Product.where(on_stock: true).update_all(featured: true)
+ ```
+
+ to
+
+ ```ruby
+ RedstreamProducer = Redstream::Producer.new
+
+ Product.where(on_stock: true).find_in_batches do |products|
+   RedstreamProducer.bulk products do
+     Product.where(id: products.map(&:id)).update_all(featured: true)
+   end
+ end
+ ```
+
+ The `Producer` will write a message for every matched record to the delay
+ stream before `update_all` is called, and another message for every record to
+ the main stream after `update_all` is called - just like within the model
+ lifecycle callbacks.
+
+ The `#bulk` method must ensure that the same set of records is used for the
+ delay messages and the instant messages. Thus, it is better to pass an array
+ of records directly to `Redstream::Producer#bulk`, as shown above. If you pass
+ an `ActiveRecord::Relation`, the `#bulk` method will convert it to an array,
+ i.e. load the whole result set into memory.
+
+ ## Namespacing
+
+ In case you are using a shared redis server, where multiple applications
+ read/write using Redstream, key conflicts could occur. To avoid that, use
+ namespacing:
+
+ ```ruby
+ Redstream.namespace = 'my_app'
+ ```
+
+ such that every application will have its own namespaced Redstream keys.
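To illustrate the effect of the namespace, here is a standalone sketch that mirrors the key naming logic from `lib/redstream.rb` (`base_key_name` and `stream_key_name`); the stream name "products" is just an example:

```ruby
# Mirrors Redstream.base_key_name and Redstream.stream_key_name from
# lib/redstream.rb: the namespace, if set, is prepended to every key.
def base_key_name(namespace)
  [namespace, "redstream"].compact.join(":")
end

def stream_key_name(namespace, stream_name)
  "#{base_key_name(namespace)}:stream:#{stream_name}"
end

puts stream_key_name("my_app", "products") # => "my_app:redstream:stream:products"
puts stream_key_name(nil, "products")      # => "redstream:stream:products"
```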
+
+ ## Contributing
+
+ Bug reports and pull requests are welcome on GitHub at
+ https://github.com/mrkamel/redstream.
+
+ ## License
+
+ The gem is available as open source under the terms of the [MIT License](http://opensource.org/licenses/MIT).
+
@@ -0,0 +1,9 @@
+ require "bundler/gem_tasks"
+ require "rake/testtask"
+
+ Rake::TestTask.new(:test) do |t|
+   t.libs << "lib"
+   t.pattern = "test/**/*_test.rb"
+   t.verbose = true
+ end
+
@@ -0,0 +1,6 @@
+ version: '2'
+ services:
+   redis:
+     image: redis:5.0
+     ports:
+       - 127.0.0.1:6379:6379
@@ -0,0 +1,134 @@
+
+ require "active_support/inflector"
+ require "connection_pool"
+ require "redis"
+ require "json"
+ require "thread"
+ require "set"
+
+ require "redstream/version"
+ require "redstream/lock"
+ require "redstream/message"
+ require "redstream/consumer"
+ require "redstream/producer"
+ require "redstream/delayer"
+ require "redstream/model"
+ require "redstream/trimmer"
+
+ module Redstream
+   # Redstream uses the connection_pool gem to pool redis connections. In case
+   # you have a distributed redis setup (sentinel/cluster) or the default pool
+   # size doesn't match your requirements, you must specify the connection
+   # pool. A connection pool is necessary, because redstream is using blocking
+   # commands. Please note, redis connections are cheap, so it is better to
+   # specify a pool size large enough to avoid running into bottlenecks.
+   #
+   # @example
+   #   Redstream.connection_pool = ConnectionPool.new(size: 50) do
+   #     Redis.new("...")
+   #   end
+
+   def self.connection_pool=(connection_pool)
+     @connection_pool = connection_pool
+   end
+
+   # Returns the connection pool instance, creating a default pool in case
+   # none has been set yet.
+   #
+   # @return [ConnectionPool] The connection pool
+
+   def self.connection_pool
+     @connection_pool ||= ConnectionPool.new { Redis.new }
+   end
+
+   # You can specify a namespace to use for redis keys. This is useful in case
+   # you are using a shared redis.
+   #
+   # @example
+   #   Redstream.namespace = 'my_app'
+
+   def self.namespace=(namespace)
+     @namespace = namespace
+   end
+
+   # Returns the previously set namespace for redis keys to be used by
+   # Redstream.
+
+   def self.namespace
+     @namespace
+   end
+
+   # Returns the max id of the specified stream, i.e. the id of the
+   # last/newest message added. Returns nil for empty streams.
+   #
+   # @param stream_name [String] The stream name
+   # @return [String, nil] The id of the stream's newest message, or nil
+
+   def self.max_stream_id(stream_name)
+     connection_pool.with do |redis|
+       message = redis.xrevrange(stream_key_name(stream_name), "+", "-", count: 1).first
+
+       return unless message
+
+       message[0]
+     end
+   end
+
+   # Returns the max committed id, i.e. the consumer's offset, for the
+   # specified consumer name.
+   #
+   # @param stream_name [String] The stream name
+   # @param consumer_name [String] The consumer name
+   #
+   # @return [String, nil] The max committed offset, or nil
+
+   def self.max_consumer_id(stream_name:, consumer_name:)
+     connection_pool.with do |redis|
+       redis.get offset_key_name(stream_name: stream_name, consumer_name: consumer_name)
+     end
+   end
+
+   # @api private
+   #
+   # Generates the low level redis stream key name.
+   #
+   # @param stream_name A high level stream name
+   # @return [String] A low level redis stream key name
+
+   def self.stream_key_name(stream_name)
+     "#{base_key_name}:stream:#{stream_name}"
+   end
+
+   # @api private
+   #
+   # Generates the redis key name used for storing a consumer's current
+   # offset, i.e. the maximum id successfully processed.
+   #
+   # @param stream_name A high level stream name
+   # @param consumer_name A high level consumer name
+   # @return [String] A redis key name for storing a consumer's current offset
+
+   def self.offset_key_name(stream_name:, consumer_name:)
+     "#{base_key_name}:offset:#{stream_name}:#{consumer_name}"
+   end
+
+   # @api private
+   #
+   # Generates the redis key name used for locking.
+   #
+   # @param name A high level name for the lock
+   # @return [String] A redis key name used for locking
+
+   def self.lock_key_name(name)
+     "#{base_key_name}:lock:#{name}"
+   end
+
+   # @api private
+   #
+   # Returns the full namespace prefix for redis keys.
+
+   def self.base_key_name
+     [namespace, "redstream"].compact.join(":")
+   end
+ end
+