racecar 3.0.0.alpha.3 → 3.0.0.beta.2
- checksums.yaml +4 -4
- data/CHANGELOG.md +11 -0
- data/Gemfile.lock +3 -2
- data/README.md +36 -0
- data/lib/racecar/async_partition_processor.rb +4 -8
- data/lib/racecar/partition_processor.rb +29 -10
- data/lib/racecar/rebalance_listener.rb +0 -1
- data/lib/racecar/runner.rb +4 -1
- data/lib/racecar/version.rb +1 -1
- data/racecar.gemspec +1 -0
- metadata +15 -1
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-metadata.gz:
-data.tar.gz:
+metadata.gz: b0b740ff066f0f1aee2cd6bffac700badbbf684d2d9575c298aefe6a92a50d75
+data.tar.gz: 022b4e5aea481beb637407cebe248c7890713a9ad64563edd316efca3f8ddfd6
 SHA512:
-metadata.gz:
-data.tar.gz:
+metadata.gz: fb0f983b58a63ec97ae7f7a47b42cb47ff3a03907055b2c855305918b30964983c15882f806bff173acbbff224d22763331c8503366b0b7783af3ad9e7535dac
+data.tar.gz: 4f95b4582e091c8543aeaf683f915a55880dc622d49f67a56c8dbac160531babcff640919804e555d971022035dfb865931c8d603433c2b5748cb9c11c5b7642
data/CHANGELOG.md
CHANGED
@@ -2,6 +2,17 @@

 ## Unreleased

+## 3.0.0
+
+* Introduce multithreaded processing: when enabled, Racecar processes each assigned partition on its own dedicated thread. Disabled by default and gated behind the `multithreaded_processing_enabled` config.
+* Refactor of the Racecar architecture: introduction of `PartitionProcessor` and `AsyncPartitionProcessor` handling the processing of messages.
+* [Racecar::Config] Add `multithreaded_processing_enabled` (default `false`) to enable multithreaded processing. Can be set via `RACECAR_MULTITHREADED_PROCESSING_ENABLED=1`.
+* [Racecar::Config] Add `multithreaded_processing_max_queue_size` (default `1000`) to cap the number of messages queued per partition before backpressure is applied.
+* [Racecar::Config] Add `multithreaded_processing_resume_threshold` (default `0.5`) controlling the queue fill ratio at which a paused partition is resumed.
+* [Racecar::Config] Add `multithreaded_processing_shutdown_timeout` (default `300`) for how long the main thread waits on each processing thread during graceful shutdown.
+* Apply backpressure when multithreaded processing is enabled: a partition is paused once its queue reaches `multithreaded_processing_max_queue_size` and resumed once it drains below `multithreaded_processing_resume_threshold` of that size.
+* Gracefully drain queued messages and exit per-partition threads on rebalance and shutdown.
+
 ## 2.12.0

 * Add tests against Ruby 3.4
|
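To make the backpressure rule in the changelog entries above concrete, here is a minimal sketch of the pause and resume bookkeeping. It is not Racecar's implementation: `BackpressureQueue` and its methods are hypothetical names, and only the two defaults (`1000` and `0.5`) are taken from the changelog.

```ruby
# Hypothetical illustration of pause/resume backpressure; not Racecar's code.
class BackpressureQueue
  def initialize(max_size: 1000, resume_threshold: 0.5)
    @max_size = max_size
    @resume_threshold = resume_threshold
    @items = []
    @paused = false
    @lock = Mutex.new
  end

  # Called by the polling thread. Returns true when the caller should
  # pause fetching for this partition.
  def push(item)
    @lock.synchronize do
      @items << item
      @paused = true if @items.size >= @max_size
      @paused
    end
  end

  # Called by the worker thread. Clears the paused flag once the queue
  # has drained below resume_threshold * max_size.
  def pop
    @lock.synchronize do
      item = @items.shift
      @paused = false if @paused && @items.size < @max_size * @resume_threshold
      item
    end
  end

  def paused?
    @lock.synchronize { @paused }
  end
end
```

Resuming at a fraction of the maximum, rather than as soon as one slot frees up, keeps the partition from flapping between paused and resumed on every message.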
data/Gemfile.lock
CHANGED
data/README.md
CHANGED
@@ -134,6 +134,42 @@ end

 This is useful to do any one-off work that you wouldn't want to do for each and every message.

+#### Multithreaded message processing (experimental)
+
+Warning - limited battle testing in production environments; use at your own risk!
+
+By default a Racecar consumer processes all of its assigned partitions on a single thread. When `multithreaded_processing_enabled` is set, the consumer instead spins up one dedicated thread per assigned partition, so partitions are processed concurrently within a single process. This is an alternative to [parallel workers](#running-consumers-in-parallel-experimental) that avoids forking extra processes (and the associated memory overhead), at the cost of running your consumer code on multiple threads.
+
+Each partition thread gets its own instance of your consumer class, so **your consumer code does not need to be thread-safe** - a thread never shares its instance (or its instance state) with another partition. The main thread keeps polling Kafka and hands each partition's messages off to the relevant thread via a bounded queue; if a thread falls behind and its queue fills up, the partition is paused until the queue drains, applying backpressure rather than growing memory without bound.
+
+**Warning:** the number of threads scales with the number of assigned partitions, which can be large. Since each thread runs its own consumer instance, every resource that instance acquires (database connections, file handles, network sockets, HTTP clients, etc.) is multiplied by the number of partition threads. Make sure your consumer releases any resource it grabs and that any connection pools or other shared limits are sized to accommodate the resulting concurrency.
+
+Enable it via config (or the `RACECAR_MULTITHREADED_PROCESSING_ENABLED=1` environment variable):
+
+```ruby
+Racecar.configure do |config|
+config.multithreaded_processing_enabled = true
+end
+
+class ResizeImagesConsumer < Racecar::Consumer
+subscribes_to "images"
+
+def process(message)
+# This runs on a thread dedicated to message.partition.
+# @state below is private to this partition's thread.
+@state ||= {}
+Image.resize(message.value)
+end
+end
+```
+
+The behaviour can be tuned with the following options:
+
+- `multithreaded_processing_enabled` – Enable per-partition threads. Default is `false`.
+- `multithreaded_processing_max_queue_size` – Maximum number of queued message batches per partition before the partition is paused to apply backpressure. Default is `1000`.
+- `multithreaded_processing_resume_threshold` – A paused partition is resumed once its queue drains below this fraction of `multithreaded_processing_max_queue_size`. Default is `0.5` (50%).
+- `multithreaded_processing_shutdown_timeout` – How many seconds to wait for each partition thread to finish during graceful shutdown. Default is `300`.
+
 #### Setting the starting position

 When a consumer is started for the first time, it needs to decide where in each partition to start. By default, it will start at the _beginning_, meaning that all past messages will be processed. If you want to instead start at the _end_ of each partition, change your `subscribes_to` like this:
|
data/lib/racecar/async_partition_processor.rb
CHANGED
@@ -7,12 +7,14 @@ module Racecar
 class AsyncPartitionProcessor
 attr_reader :thread

-THREAD_KEY_IDENTIFIER = 'racecar_topic_partition_identifier'.freeze
-
 def self.thread_key(topic, partition)
 "#{topic}/#{partition}"
 end

+def thread_key
+self.class.thread_key(@topic, @partition)
+end
+
 def initialize(topic:, partition:, logger:, config:, consumer:, consumer_class:, instrumenter:, rdkafka_consumer:)
 @topic = topic
 @partition = partition
@@ -70,7 +72,6 @@ module Racecar
 rdkafka_consumer: @rdkafka_consumer,
 )
 @queue = Queue.new
-@thread = nil

 use_process_batch = consumer_class.method_defined?(:process_batch)

@@ -90,7 +91,6 @@ module Racecar
 def spawn_thread(&block)
 @thread = Thread.new do
 Thread.current.name = "Racecar thread for #{thread_key}"
-Thread.current[AsyncPartitionProcessor::THREAD_KEY_IDENTIFIER] = thread_key
 main_processing_loop(block)
 end
 end
@@ -121,10 +121,6 @@ module Racecar
 end
 end

-def thread_key
-self.class.thread_key(@topic, @partition)
-end
-
 def main_processing_loop(block)
 loop do
 msgs = @queue.pop
|
data/lib/racecar/partition_processor.rb
CHANGED
@@ -48,9 +48,8 @@ module Racecar

 with_error_handling(message, payload) do |pause|
 @instrumenter.instrument("process_message", payload) do
-
-
-end
+reconfigure_if_producer_closed!
+exit_if_rebalancing!
 consumer_class_instance.process(Racecar::Message.new(message, retries_count: pause.pauses_count))
 consumer_class_instance.deliver!
 consumer.store_offset(message, @rdkafka_consumer) unless rebalancing
@@ -76,9 +75,8 @@ module Racecar
 racecar_messages = messages.map do |message|
 Racecar::Message.new(message, retries_count: pause.pauses_count)
 end
-
-
-end
+reconfigure_if_producer_closed!
+exit_if_rebalancing!
 consumer_class_instance.process_batch(racecar_messages)
 consumer_class_instance.deliver!
 consumer.store_offset(messages.last, @rdkafka_consumer) unless rebalancing
@@ -146,14 +144,20 @@ module Racecar
 elsif !shutting_down
 handle_processing_error(e, payload, pause: pause)
 pause.pause!
-
-
-
+
+break if config.pause_timeout == 0
+
+@sleep_mutex.synchronize do
+next if rebalancing || shutting_down
+if config.pause_timeout == -1
+# Pause indefinitely. backoff_interval is Float::INFINITY here,
+@sleep_cv.wait(@sleep_mutex)
+else
 @sleep_cv.wait(@sleep_mutex, pause.backoff_interval)
 end
 end
 Thread.exit if rebalancing
-break if shutting_down
+break if shutting_down
 else
 handle_processing_error(e, payload, pause: pause)
 break
@@ -205,6 +209,21 @@ module Racecar
 reconfigure_consumer_class_instance!
 end

+def reconfigure_if_producer_closed!
+return unless @config.multithreaded_processing_enabled
+return unless consumer_class_instance.instance_variable_get(:@producer)&.closed?
+
+reconfigure_consumer_class_instance!
+end
+
+def exit_if_rebalancing!
+return unless @config.multithreaded_processing_enabled
+if rebalancing
+logger.info "Exiting processing thread for #{topic}/#{partition} due to rebalancing"
+Thread.exit
+end
+end
+
 def reconfigure_consumer_class_instance!
 consumer_class_instance.configure(
 producer: consumer.producer,
|
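The pause handling in the partition_processor.rb hunk above waits on `@sleep_cv` with either no timeout (when `pause_timeout` is `-1`) or `pause.backoff_interval`. The standalone sketch below shows that interruptible-sleep pattern with Ruby's `Mutex` and `ConditionVariable`. The `InterruptibleSleep` class and its `wake!` method are hypothetical; the diff does not show which code signals the condition variable, so that part is an assumption.

```ruby
# Hypothetical illustration of an interruptible sleep; not Racecar's code.
class InterruptibleSleep
  def initialize
    @mutex = Mutex.new
    @cv = ConditionVariable.new
    @woken = false
  end

  # Sleeps for up to `seconds`; nil means wait until wake! is called.
  # Returns true if woken early, false if the timeout elapsed.
  def sleep_for(seconds = nil)
    @mutex.synchronize do
      deadline = seconds && monotonic_now + seconds
      until @woken
        remaining = deadline && deadline - monotonic_now
        break if remaining && remaining <= 0
        # ConditionVariable#wait accepts nil as "no timeout".
        @cv.wait(@mutex, remaining)
      end
      @woken
    end
  end

  # Called from another thread (for example on shutdown) to end the sleep.
  def wake!
    @mutex.synchronize do
      @woken = true
      @cv.broadcast
    end
  end

  private

  def monotonic_now
    Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end
end
```

Checking a flag inside the loop, rather than relying on the wait alone, guards against spurious wakeups and against `wake!` being called before the sleeper starts waiting.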
data/lib/racecar/runner.rb
CHANGED
@@ -1,6 +1,7 @@
 # frozen_string_literal: true

 require "rdkafka"
+require "timeout"
 require "racecar/pause"
 require "racecar/message"
 require "racecar/message_delivery_error"
@@ -168,7 +169,9 @@ module Racecar
 processors_snapshot.each do |processor|
 if processor.respond_to?(:thread)
 begin
-processor.thread.join(config.multithreaded_processing_shutdown_timeout)
+raise Timeout::Error unless processor.thread.join(config.multithreaded_processing_shutdown_timeout)
+rescue Timeout::Error
+logger.error "Processor thread for #{processor.thread_key} did not finish within #{config.multithreaded_processing_shutdown_timeout} seconds. It may be stuck in a long-running process or blocked on I/O."
 rescue => e
 logger.error "Error while waiting for processor thread to finish: #{e}"
 end
|
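The runner.rb change above depends on `Thread#join(limit)` returning `nil` when the limit elapses instead of raising, which is why the old code never reported a stuck thread. A minimal sketch of the same check follows. The `wait_for_thread` helper and its symbol return values are hypothetical and exist only for illustration.

```ruby
require "timeout"

# Hypothetical helper; the name and return values are illustrative only.
def wait_for_thread(thread, timeout_seconds)
  # Thread#join(limit) returns the thread if it finished in time, nil otherwise.
  raise Timeout::Error unless thread.join(timeout_seconds)
  :finished
rescue Timeout::Error
  # The thread is still running; the caller can log this and move on.
  :timed_out
rescue StandardError
  # join re-raises an exception that terminated the joined thread.
  :failed
end
```

`Timeout::Error` is a `StandardError` subclass, so its `rescue` clause has to come before the generic one, as it does in the diff.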
data/lib/racecar/version.rb
CHANGED
data/racecar.gemspec
CHANGED
@@ -24,6 +24,7 @@ Gem::Specification.new do |spec|

 spec.add_runtime_dependency "king_konf", "~> 1.0.0"
 spec.add_runtime_dependency "rdkafka", ">= 0.15.0"
+spec.add_runtime_dependency "concurrent-ruby", "~> 1.3"

 spec.add_development_dependency "bundler", [">= 1.13", "< 3"]
 spec.add_development_dependency "pry-byebug"
metadata
CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: racecar
 version: !ruby/object:Gem::Version
-version: 3.0.0.
+version: 3.0.0.beta.2
 platform: ruby
 authors:
 - Daniel Schierbeck
@@ -38,6 +38,20 @@ dependencies:
 - - ">="
 - !ruby/object:Gem::Version
 version: 0.15.0
+- !ruby/object:Gem::Dependency
+name: concurrent-ruby
+requirement: !ruby/object:Gem::Requirement
+requirements:
+- - "~>"
+- !ruby/object:Gem::Version
+version: '1.3'
+type: :runtime
+prerelease: false
+version_requirements: !ruby/object:Gem::Requirement
+requirements:
+- - "~>"
+- !ruby/object:Gem::Version
+version: '1.3'
 - !ruby/object:Gem::Dependency
 name: bundler
 requirement: !ruby/object:Gem::Requirement