msg-batcher 0.0.1
- checksums.yaml +7 -0
- data/README.md +73 -0
- data/lib/msg-batcher.rb +150 -0
- metadata +46 -0
checksums.yaml
ADDED
@@ -0,0 +1,7 @@
---
SHA256:
  metadata.gz: ac80b44e3f48248b2355ee73c5ed6346a492f9f5749ba670fbc18310171535cc
  data.tar.gz: 42537898648e2e953ebb3df19fec938db4cc8da487f5014569a4477a71c666ff
SHA512:
  metadata.gz: 8a478b76db6a65bb7a428e01d714a76486f7d1932328b0157da799f83fb05767cab488108ae7dd16b531110e99ea3b6d5dcae15e02bd667e6e9b52e27caea955
  data.tar.gz: fade326da9e58081a4a8d89d8f05bd368995556c97a12a805f38802974cbe481a3c54a597c1cf23764bd476463269f5b9bd41c62ea6b0d5c58829763704494bc
data/README.md
ADDED
@@ -0,0 +1,73 @@
A Ruby library that facilitates thread-safe batch processing of messages. In certain situations, processing multiple messages in a batch is more efficient than handling them one by one.

Consider a scenario where code receives events at random intervals and must notify an external HTTP service about these events. The straightforward approach is to issue an HTTP request with the details of each event as it is received. However, if events occur frequently, this method can lead to significant time spent on network latency. A more efficient approach is to aggregate events and send them in a single batched HTTP request.

This library is designed to handle exactly that. Events are pushed into the class instance, and a callback with the batched data is triggered under one of two conditions:

* The number of messages in the batch reaches the specified maximum.
* The batch is not yet complete, but the maximum idle time has elapsed.

The latter condition is crucial for scenarios like the following: suppose you've set up a *batcher* to fire after receiving 10 messages, but only 9 messages are received, and no new messages are forthcoming. In this case, the callback with 9 messages will fire after the maximum idle time has passed.
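The saving from batching can be sketched with hypothetical numbers (1000 events, a 50 ms network round-trip, batches of 100 messages, all assumed purely for illustration):

```ruby
# Back-of-envelope comparison of per-event vs. batched HTTP requests.
# All figures are hypothetical; only the arithmetic matters here.
events     = 1000
rtt_ms     = 50    # assumed network round-trip per request
batch_size = 100

per_event_ms = events * rtt_ms                 # one request per event
batched_ms   = (events / batch_size) * rtt_ms  # one request per full batch

puts per_event_ms  # => 50000
puts batched_ms    # => 500
```

Under these assumptions, batching cuts the time spent on network latency by a factor of `batch_size`.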
## Usage
The following code creates a *batcher* with a maximum capacity of 10 messages per batch and a maximum idle time of 3 seconds. The callback block simply prints the timestamp, batch size, and content.
```ruby
require 'msg-batcher'

batcher = MsgBatcher.new 10, 3000 do |batch|
  now = Time.now
  timestamp = "#{now.min}:#{now.sec}"
  puts "[#{timestamp}] size: #{batch.size} content: #{batch.inspect}"
end

29.times do |i|
  batcher.push i
end

sleep 5
batcher.kill
```
The output is:
```
[10:12] size: 10 content: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
[10:12] size: 10 content: [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
[10:15] size: 9 content: [20, 21, 22, 23, 24, 25, 26, 27, 28]
```
As you can see, the first two batches, each of size 10, were created immediately, while the last, incomplete batch took 3 seconds to be created.

Finally, it's a good idea to call the `#kill` method when you no longer need the *batcher*. This method terminates the timer thread that was created during the *batcher*'s initialization.
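The constructor in `lib/msg-batcher.rb` also accepts an optional third argument, an `on_error` callable that receives any exception raised by the batch callback; by default the exception is simply re-raised. A minimal standalone sketch of that rescue pattern (not using the gem itself):

```ruby
# Sketch of the error-hook pattern used inside MsgBatcher#release:
# exceptions raised by the batch callback are handed to on_error.
captured = nil
on_error = ->(ex) { captured = ex.message }
block    = ->(batch) { raise "boom: #{batch.size}" }

begin
  block.call [1, 2, 3]   # the callback raises
rescue
  on_error.call $!       # the handler receives the exception
end

puts captured  # => "boom: 3"
```

Passing your own handler keeps a failing callback from killing the thread that happened to trigger the release.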
## Thread-safety
The `#push` method is thread-safe.
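A simplified sketch of why that matters (this is an illustration of mutex-guarded batching, not the gem's actual implementation, which additionally coordinates with a timer thread): many threads appending to one shared array must be synchronized for the size check to fire exactly once per full batch.

```ruby
# 10 threads each push 100 entries; the batch flushes at 50 entries.
# The Mutex makes the append and the size check atomic together.
m       = Mutex.new
storage = []
batches = []

threads = 10.times.map do
  Thread.new do
    100.times do
      m.synchronize do
        storage << 1
        if storage.size == 50   # max_length reached
          batches << storage
          storage = []          # start a fresh batch
        end
      end
    end
  end
end
threads.each(&:join)

puts batches.size   # => 20 (1000 pushes / 50 per batch)
puts storage.size   # => 0  (nothing left over)
```

Without the mutex, two threads could both observe `size == 49`, push, and miss the flush condition entirely.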
## Installation
### Bundler
Add this line to your application's Gemfile:
```
gem 'msg-batcher'
```
And then execute:
```
$ bundle install
```
### Standalone
Or install it yourself as:
```
gem install msg-batcher
```
data/lib/msg-batcher.rb
ADDED
@@ -0,0 +1,150 @@
# frozen_string_literal: true

# Important:
# If the release method is called by the timer thread, all other pushes will
# wait until the release callback is finished by the timer thread. If release
# is issued by pushers, the release callback is run by the last pushing thread,
# but other pushing threads do not wait for it to finish (which is good).

class MsgBatcher
  DEBUG = false

  class Error < StandardError; end

  def initialize(max_length, max_time_msecs, on_error = nil, &block)
    @max_length = max_length
    @max_time_msecs = max_time_msecs
    @on_error = on_error || lambda { |ex| raise ex }
    @block = block

    @closed = false

    @storage = []
    @m = Mutex.new
    @m2 = Mutex.new # used alongside @m; needed because of the timer thread

    @timer_start_cv = ConditionVariable.new
    @timer_started_cv = ConditionVariable.new
    @timer_full_cycle_cv = ConditionVariable.new
    @timer_release_cv = ConditionVariable.new

    # It is important that push invocations start only after this method has
    # fully completed. So initialize the instance first and only then start
    # pushing.
    start_timer
  end

  def kill
    @closed = true
    @timer_thread.exit
  end

  # Thread-safe.
  # @raise [Error] when invoked after the batcher has been closed
  def push(entry)
    raise Error, 'Batcher is closed - cannot push' if @closed

    @m.lock
    @m2.lock

    # Start the timer.
    # The timer thread must be in position TT1.
    if @storage.empty?
      @timer_start_cv.signal
      @timer_started_cv.wait @m2 # wait for the timer thread to reach TT2
    end

    @storage.push entry
    if @storage.size == @max_length
      # unlocks @m and @m2 inside the release method
      release
    else
      @m2.unlock
      @m.unlock
    end
  end

  private

  def dputs(str)
    puts str if DEBUG
  end

  def release(from_push = true)
    # called while holding @m and @m2
    temp = @storage
    @storage = []

    @already_released = true
    dputs 'kill timer'
    @timer_release_cv.signal # inform the timer that a release is happening

    # Now the interesting part happens.
    # We release the @m2 lock and wait on @timer_full_cycle_cv.
    # No thread but the timer thread can acquire @m2, since the other pushing
    # threads are blocked on @m.
    # So once the timer thread acquires @m2, it starts a new loop cycle and
    # signals @timer_full_cycle_cv. This release method then stops waiting and
    # tries to lock @m2 again, but it cannot do so until the timer thread
    # releases it, which happens at the line `@timer_start_cv.wait @m2`.
    # At that point we can be sure the timer is back at the beginning, ready
    # to wait for the @timer_start_cv signal.

    if from_push
      dputs '-------- before wait m2'
      @timer_full_cycle_cv.wait @m2
      @m2.unlock
      @m.unlock
      dputs '---- unlocked all'
    end

    begin
      @block.call temp
    rescue
      @on_error.call $!
    end
  end

  def start_timer
    @m2.lock

    @timer_thread = Thread.new do
      @m2.lock
      while true
        @already_released = false
        # inform release (and initialize) that the timer is at the beginning
        @timer_full_cycle_cv.signal

        # Position: TT1
        # Wait for an invocation from push.
        # Each release invocation finishes once the timer thread is below,
        # waiting on @timer_start_cv.
        dputs 'TT1'
        @timer_start_cv.wait @m2
        dputs 'TT1 after wait'
        @timer_started_cv.signal

        dputs 'TT2'
        # Then wait either for the idle time to elapse or for the signal that
        # the data has already been released.
        @timer_release_cv.wait @m2, @max_time_msecs / 1000.0

        dputs "timer end #{@m2.owned?}"

        # @m2 is locked here!

        unless @already_released
          dputs "timer's release"
          release(false)
        end
      end
    end
    # wait for the timer to be in its waiting state
    @timer_full_cycle_cv.wait @m2
    @m2.unlock
  end
end
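The timer thread above relies on `ConditionVariable`'s timed wait (`@timer_release_cv.wait @m2, @max_time_msecs / 1000.0`): the wait returns after the timeout even if it is never signalled. A standalone sketch of that primitive, with a 0.2 s timeout assumed purely for illustration:

```ruby
# ConditionVariable#wait(mutex, timeout) releases the mutex, sleeps until
# signalled or until the timeout elapses, then reacquires the mutex.
m  = Mutex.new
cv = ConditionVariable.new

t0 = Time.now
m.synchronize { cv.wait(m, 0.2) }  # nobody signals, so this wakes on timeout
elapsed = Time.now - t0

puts format('waited about %.1f s', elapsed)
```

This is what lets the batcher fire an incomplete batch once the maximum idle time has passed without any push arriving.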
metadata
ADDED
@@ -0,0 +1,46 @@
--- !ruby/object:Gem::Specification
name: msg-batcher
version: !ruby/object:Gem::Version
  version: 0.0.1
platform: ruby
authors:
- ertygiq
autorequire:
bindir: bin
cert_chain: []
date: 2023-11-10 00:00:00.000000000 Z
dependencies: []
description: A Ruby library that facilitates thread-safe batch processing of messages.
  In certain situations, processing multiple messages in batch is more efficient than
  handling them one by one.
email:
executables: []
extensions: []
extra_rdoc_files: []
files:
- README.md
- lib/msg-batcher.rb
homepage: https://github.com/ertygiq/msg-batcher
licenses:
- MIT
metadata: {}
post_install_message:
rdoc_options: []
require_paths:
- lib
required_ruby_version: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
    - !ruby/object:Gem::Version
      version: '0'
required_rubygems_version: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
    - !ruby/object:Gem::Version
      version: '0'
requirements: []
rubygems_version: 3.4.10
signing_key:
specification_version: 4
summary: A Ruby Library for Thread-Safe Batch Processing of Events
test_files: []