msg-batcher 0.0.1

Files changed (4)
  1. checksums.yaml +7 -0
  2. data/README.md +73 -0
  3. data/lib/msg-batcher.rb +150 -0
  4. metadata +46 -0
checksums.yaml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA256:
3
+   metadata.gz: ac80b44e3f48248b2355ee73c5ed6346a492f9f5749ba670fbc18310171535cc
4
+   data.tar.gz: 42537898648e2e953ebb3df19fec938db4cc8da487f5014569a4477a71c666ff
5
+ SHA512:
6
+   metadata.gz: 8a478b76db6a65bb7a428e01d714a76486f7d1932328b0157da799f83fb05767cab488108ae7dd16b531110e99ea3b6d5dcae15e02bd667e6e9b52e27caea955
7
+   data.tar.gz: fade326da9e58081a4a8d89d8f05bd368995556c97a12a805f38802974cbe481a3c54a597c1cf23764bd476463269f5b9bd41c62ea6b0d5c58829763704494bc
data/README.md ADDED
@@ -0,0 +1,73 @@
1
+ A Ruby library that facilitates thread-safe batch processing of
2
+ messages. In certain situations, processing multiple messages in batch
3
+ is more efficient than handling them one by one.
4
+
5
+ Consider a scenario where code receives events at random intervals and
6
+ must notify an external HTTP service about these events. The straightforward
7
+ approach is to issue an HTTP request with the details of each event as it is received. However, if events occur frequently, this method can lead to significant time spent on network latency. A more efficient approach is to aggregate events and send them in a single batched HTTP request.
8
+
9
+ This library is designed to handle exactly that. Events are pushed into
10
+ the class instance, and a callback with batched data is triggered under
11
+ one of two conditions:
12
+
13
+ * The number of messages in the batch reaches the specified maximum.
14
+ * The batch is not yet complete, but the maximum idle time has elapsed.
15
+
16
+ The latter condition is crucial for scenarios like the following:
17
+ suppose you've set up a *batcher* to fire after receiving 10 messages,
18
+ but only 9 messages are received, and no new messages are forthcoming.
19
+ In this case, the callback with 9 messages will fire after the
20
+ maximum idle time has passed.
21
+
22
+ ## Usage
23
+ The following code creates a *batcher* with a maximum capacity of 10
24
+ messages per batch and a maximum idle time of 3 seconds. The callback
25
+ block simply prints the timestamp, batch size, and content.
26
+ ```
27
+ require 'msg-batcher'
28
+
29
+ batcher = MsgBatcher.new 10, 3000 do |batch|
30
+   now = Time.now
31
+   timestamp = "#{now.min}:#{now.sec}"
32
+   puts "[#{timestamp}] size: #{batch.size} content: #{batch.inspect}"
33
+ end
34
+
35
+ 29.times do |i|
36
+   batcher.push i
37
+ end
38
+
39
+ sleep 5
40
+ batcher.kill
41
+ ```
42
+ The output is:
43
+ ```
44
+ [10:12] size: 10 content: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
45
+ [10:12] size: 10 content: [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
46
+ [10:15] size: 9 content: [20, 21, 22, 23, 24, 25, 26, 27, 28]
47
+ ```
48
+ As you can see, the first two batches, each of size 10, were released immediately, while the last, incomplete batch
49
+ was released only after the 3-second maximum idle time had elapsed.
50
+
51
+ Finally, it's a good idea to call the `#kill` method when you no longer need the *batcher*.
52
+ This method terminates the timer thread that was created during the
53
+ *batcher*'s initialization.
54
+
55
+ ## Thread-safety
56
+ The `#push` method is thread-safe.
57
+
58
+ ## Installation
59
+ ### Bundler
60
+ Add this line to your application's Gemfile:
61
+ ```
62
+ gem 'msg-batcher'
63
+ ```
64
+ And then execute:
65
+ ```
66
+ $ bundle install
67
+ ```
68
+ ### Standalone
69
+ Or install it yourself as:
70
+ ```
71
+ gem install msg-batcher
72
+ ```
73
+
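Reviewer's note on the README above: the two release conditions it describes (the batch is full, or the maximum idle time has elapsed) can be tried out without installing the gem. The following is a minimal, self-contained sketch under stated assumptions, not the gem's implementation. `TinyBatcher`, its `close` method, and the seconds-based timeout argument are hypothetical names introduced here, and the locking scheme (one mutex, one condition variable) is deliberately simpler than the two-mutex design in lib/msg-batcher.rb.

```ruby
# Hypothetical stand-in for illustration only; NOT the msg-batcher implementation.
class TinyBatcher
  def initialize(max_length, max_idle_secs, &block)
    @max_length = max_length
    @max_idle_secs = max_idle_secs
    @block = block
    @storage = []
    @closed = false
    @mutex = Mutex.new
    @cv = ConditionVariable.new
    @timer = Thread.new { timer_loop }
  end

  # Thread-safe: the shared array is only touched while holding @mutex.
  def push(entry)
    full_batch = @mutex.synchronize do
      raise "batcher is closed" if @closed
      @storage << entry
      @cv.signal # wake the timer so it notices new or replaced storage
      take_batch if @storage.size >= @max_length
    end
    # The callback runs outside the lock so other pushers are not blocked.
    @block.call(full_batch) if full_batch
  end

  # Stops the timer thread and delivers any pending partial batch.
  def close
    rest = @mutex.synchronize do
      @closed = true
      @cv.signal
      take_batch unless @storage.empty?
    end
    @timer.join
    @block.call(rest) if rest
  end

  private

  def now
    Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end

  # Must be called while holding @mutex.
  def take_batch
    batch = @storage
    @storage = []
    batch
  end

  def timer_loop
    loop do
      expired = @mutex.synchronize do
        @cv.wait(@mutex) while @storage.empty? && !@closed
        return if @closed
        current = @storage # the batch this countdown belongs to
        deadline = now + @max_idle_secs
        # Loop guards against spurious wakeups; stop early if a pusher
        # already released this batch (storage was replaced).
        while !@closed && @storage.equal?(current) && (remaining = deadline - now) > 0
          @cv.wait(@mutex, remaining)
        end
        return if @closed
        take_batch if @storage.equal?(current)
      end
      @block.call(expired) if expired
    end
  end
end

results = Queue.new
batcher = TinyBatcher.new(10, 0.2) { |batch| results << batch }
29.times { |i| batcher.push(i) }
sleep 0.6 # long enough for the idle timeout to flush the partial batch
batcher.close

collected = []
collected << results.pop until results.empty?
collected.each { |b| puts "size: #{b.size} content: #{b.inspect}" }
```

Unlike `#kill` as it appears in this diff, which marks the batcher closed and stops the timer thread, the hypothetical `close` above also delivers a pending partial batch. That is a design choice of the sketch, not a documented feature of the gem.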
data/lib/msg-batcher.rb ADDED
@@ -0,0 +1,150 @@
1
+ # frozen_string_literal: true
2
+
3
+ # Important!:
4
+ # If the release method is called by the timer thread, all other pushes wait until the release callback
5
+ # has been finished by the timer thread. If release is issued by a pusher, the release callback is run by
6
+ # the last pushing thread, but other push threads do not wait for it to finish (which is good).
7
+
8
+
9
+ class MsgBatcher
10
+   DEBUG = false
11
+
12
+   class Error < StandardError; end
13
+
14
+   def initialize(max_length, max_time_msecs, on_error=nil, &block)
15
+     @max_length = max_length
16
+     @max_time_msecs = max_time_msecs
17
+     @on_error = on_error
18
+     @on_error ||= lambda { |ex| raise ex }
19
+     @block = block
20
+
21
+     @closed = false
22
+
23
+     @storage = []
24
+     @m = Mutex.new
25
+     @m2 = Mutex.new # used alongside @m; needed to coordinate with the timer thread
26
+
27
+     @timer_start_cv = ConditionVariable.new
28
+     @timer_started_cv = ConditionVariable.new
29
+     @timer_full_cycle_cv = ConditionVariable.new
30
+     @timer_release_cv = ConditionVariable.new
31
+
32
+
33
+     # It is important that push invocations start only after this method has fully completed.
34
+     # So initialize an instance of this class first and only then start pushing.
35
+     start_timer
36
+   end
37
+
38
+   def kill
39
+     @closed = true
40
+     @timer_thread.exit
41
+   end
42
+
43
+   # Thread-safe
44
+   # @raise [Error] when invoked after the batcher has been closed
45
+   def push(entry)
46
+     raise Error, "Batcher is closed - cannot push" if @closed
47
+
48
+     @m.lock
49
+     @m2.lock
50
+
51
+     # Start timer
52
+     # Timer thread must be in TT1 position
53
+     if @storage.empty?
54
+       @timer_start_cv.signal
55
+       @timer_started_cv.wait @m2 # waiting for timer thread to be in position TT2
56
+     end
57
+
58
+     @storage.push entry
59
+     # curr_size = @storage.inject(0) { |sum, e| s}
60
+     if @storage.size == @max_length
61
+       # release unlocks @m2 and @m itself
62
+       release
63
+     else
64
+       @m2.unlock
65
+       @m.unlock
66
+     end
67
+   end
68
+
69
+   private
70
+
71
+   def dputs(str)
72
+     puts str if DEBUG
73
+   end
74
+
75
+   def release(from_push=true)
76
+     # called with @m2 held (and @m as well when called from push)
77
+     temp = @storage
78
+     @storage = []
79
+
80
+     @already_released = true
81
+     dputs "kill timer"
82
+     @timer_release_cv.signal # informing timer that release is happening
83
+
84
+     # Now the interesting part happens.
85
+     # We release the @m2 lock and wait for @timer_full_cycle_cv.
86
+     # No thread other than the timer thread can acquire @m2, because the other pushing threads
87
+     # are blocked on @m.
88
+     # Once the timer thread acquires @m2, it starts a new loop cycle and signals @timer_full_cycle_cv.
89
+     # This release method then stops waiting and tries to lock @m2 again, but it cannot until the timer
90
+     # thread releases it, which happens on the line `@timer_start_cv.wait @m2`.
91
+     # So now we can be sure that the timer is at the beginning, ready to wait for the @timer_start_cv signal.
92
+
93
+     if from_push
94
+       dputs "--------before wait m2"
95
+       @timer_full_cycle_cv.wait @m2
96
+       @m2.unlock
97
+       dputs "-----22222"
98
+       @m.unlock
99
+       dputs "---- unlock all"
100
+     end
101
+
102
+     begin
103
+       @block.call temp
104
+     rescue => e
105
+       @on_error.call e
106
+     end
107
+
108
+   end
109
+
110
+   def start_timer
111
+     @m2.lock
112
+
113
+     @timer_thread = Thread.new do
114
+       @m2.lock
115
+       while true
116
+         @already_released = false
117
+         # informing release that timer is at the beginning
118
+         @timer_full_cycle_cv.signal
119
+         dputs "sdlkfjsd"
120
+
121
+
122
+
123
+         # Position: TT1
124
+         # Wait for invocation from push
125
+         # Each release invocation finishes when the timer thread is below (waiting for @timer_start_cv)
126
+
127
+         dputs "TT1"
128
+         @timer_start_cv.wait @m2
129
+         dputs "TT1 after wait"
130
+         @timer_started_cv.signal
131
+
132
+         dputs "TT2"
133
+         # then wait for either the time to elapse or a signal that the data has been released
134
+         @timer_release_cv.wait @m2, @max_time_msecs / 1000.0
135
+
136
+         dputs "timer end #{@m2.owned?}"
137
+
138
+         # @m2 is locked here!
139
+
140
+         unless @already_released
141
+           dputs "timer's release"
142
+           release(false)
143
+         end
144
+       end
145
+     end
146
+     # wait for timer to be in waiting state
147
+     @timer_full_cycle_cv.wait @m2
148
+     @m2.unlock
149
+   end
150
+ end
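Reviewer's note on the timer loop above: the timer thread calls `@timer_release_cv.wait @m2, @max_time_msecs / 1000.0` and then consults the `@already_released` flag rather than the return value of `wait` to decide whether the timeout elapsed or a pusher already released the batch. The sketch below isolates that flag-plus-timed-wait pattern. It is an illustration under stated assumptions, not code from the gem: `wait_with_flag` and the `state` hash are hypothetical names, and the loop around `wait` is an addition that guards against spurious wakeups.

```ruby
# Hypothetical helper for illustration; not part of msg-batcher.
# Returns :signaled if state[:released] became true before the timeout,
# and :timed_out otherwise.
def wait_with_flag(mutex, cv, timeout_secs, state)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout_secs
  mutex.synchronize do
    until state[:released]
      remaining = deadline - Process.clock_gettime(Process::CLOCK_MONOTONIC)
      return :timed_out if remaining <= 0
      # Releases the mutex while sleeping and reacquires it before returning.
      cv.wait(mutex, remaining)
    end
    :signaled
  end
end

mutex = Mutex.new
cv = ConditionVariable.new

# Case 1: nobody signals, so the wait ends because the time ran out.
state = { released: false }
timed_out_result = wait_with_flag(mutex, cv, 0.1, state)

# Case 2: another thread sets the flag and signals well before the timeout.
state = { released: false }
signaller = Thread.new do
  sleep 0.05
  mutex.synchronize do
    state[:released] = true
    cv.signal
  end
end
signaled_result = wait_with_flag(mutex, cv, 1.0, state)
signaller.join

puts timed_out_result
puts signaled_result
```

Setting the flag and signalling under the same mutex that the waiter holds is what makes the check reliable: the waiter can never observe the signal without also observing the flag.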
metadata ADDED
@@ -0,0 +1,46 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: msg-batcher
3
+ version: !ruby/object:Gem::Version
4
+   version: 0.0.1
5
+ platform: ruby
6
+ authors:
7
+ - ertygiq
8
+ autorequire:
9
+ bindir: bin
10
+ cert_chain: []
11
+ date: 2023-11-10 00:00:00.000000000 Z
12
+ dependencies: []
13
+ description: A Ruby library that facilitates thread-safe batch processing of messages.
14
+   In certain situations, processing multiple messages in batch is more efficient than
15
+   handling them one by one.
16
+ email:
17
+ executables: []
18
+ extensions: []
19
+ extra_rdoc_files: []
20
+ files:
21
+ - README.md
22
+ - lib/msg-batcher.rb
23
+ homepage: https://github.com/ertygiq/msg-batcher
24
+ licenses:
25
+ - MIT
26
+ metadata: {}
27
+ post_install_message:
28
+ rdoc_options: []
29
+ require_paths:
30
+ - lib
31
+ required_ruby_version: !ruby/object:Gem::Requirement
32
+   requirements:
33
+   - - ">="
34
+     - !ruby/object:Gem::Version
35
+       version: '0'
36
+ required_rubygems_version: !ruby/object:Gem::Requirement
37
+   requirements:
38
+   - - ">="
39
+     - !ruby/object:Gem::Version
40
+       version: '0'
41
+ requirements: []
42
+ rubygems_version: 3.4.10
43
+ signing_key:
44
+ specification_version: 4
45
+ summary: A Ruby Library for Thread-Safe Batch Processing of Events
46
+ test_files: []