racecar 3.0.0.alpha.3 → 3.0.0.beta.1

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: 043070ffc30189dd4f500a691316db7a2a55422078b7bf5f31b81ff45aa69cab
-   data.tar.gz: 73a2d774f56f7f14d18c5c43691f185ab64fd7e2f3cd6a29dfcad8b9111f00b5
+   metadata.gz: da583e5664d52a322de9d5d1e94c21b1b95a5ddeecfa31c5c52878867c4568e5
+   data.tar.gz: f35e00ca35831da9cfc0c5dbb6eebbe2404d8963ad62f2fb785a20e14aa5df08
  SHA512:
-   metadata.gz: d0ea838d6e6381660a88b6e1d71c95d0fbe741609dc216b18830d8a74eee78b0d56a6e05713b4f12ac28f60146b566f4f125d53b4c4bd4c52c3c13459941fac4
-   data.tar.gz: 213281a3b10ad91e3d7bcbf3a8974fbf15d8e1b5905518d3a4f13505ede101d254a433f86b1e7f875ccdad3c7d4d59fbc570b0d9db63f146615f4f454a65bd85
+   metadata.gz: 0cbdae52223f76c0d7dbe6fb000522071789ee22deee87603cdab5947eca5842eafc8cc4053662d2152c8f18c841fb53a5872bc7e48c5b0596b82cf2ead0af0b
+   data.tar.gz: 2ee11ff529c0043ec2b9bbc680f2da257775e9e6273236d9066df400d17c97ab02a7347f5a2c4687d56b00dc928178a9f1ec5c5c179c639fe6a5ae2ff2b505f5
data/CHANGELOG.md CHANGED
@@ -2,6 +2,17 @@

  ## Unreleased

+ ## 3.0.0
+
+ * Introduce multithreaded processing: when enabled, Racecar processes each assigned partition on its own dedicated thread. Disabled by default and gated behind the `multithreaded_processing_enabled` config.
+ * Refactor of the Racecar architecture: introduction of `PartitionProcessor` and `AsyncPartitionProcessor` handling the processing of messages.
+ * [Racecar::Config] Add `multithreaded_processing_enabled` (default `false`) to enable multithreaded processing. Can be set via `RACECAR_MULTITHREADED_PROCESSING_ENABLED=1`.
+ * [Racecar::Config] Add `multithreaded_processing_max_queue_size` (default `1000`) to cap the number of messages queued per partition before backpressure is applied.
+ * [Racecar::Config] Add `multithreaded_processing_resume_threshold` (default `0.5`) controlling the queue fill ratio at which a paused partition is resumed.
+ * [Racecar::Config] Add `multithreaded_processing_shutdown_timeout` (default `300`) for how long the main thread waits on each processing thread during graceful shutdown.
+ * Apply backpressure when multithreaded processing is enabled: a partition is paused once its queue reaches `multithreaded_processing_max_queue_size` and resumed once it drains below `multithreaded_processing_resume_threshold` of that size.
+ * Gracefully drain queued messages and exit per-partition threads on rebalance and shutdown.
+
  ## 2.12.0

  * Add tests against Ruby 3.4
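
The pause/resume rule described in the changelog entries above can be illustrated with a small sketch. This is not Racecar's implementation: `BackpressureQueue` and its `:pause` / `:resume` return values are hypothetical names invented for the example, and the sketch is single-threaded, so it omits the synchronization a real per-partition queue would need.

```ruby
# Hypothetical illustration of the pause/resume rule; not Racecar's code.
class BackpressureQueue
  attr_reader :paused

  def initialize(max_size:, resume_threshold:)
    @max_size = max_size
    # Resume only once the queue holds fewer items than this.
    @resume_at = max_size * resume_threshold
    @items = []
    @paused = false
  end

  # Enqueue an item. Returns :pause at the moment the queue becomes full,
  # otherwise nil.
  def push(item)
    @items << item
    return nil if @paused || @items.size < @max_size

    @paused = true
    :pause
  end

  # Dequeue an item. Returns [item, :resume] once a paused queue has
  # drained below the resume threshold, otherwise [item, nil].
  def pop
    item = @items.shift
    return [item, nil] unless @paused && @items.size < @resume_at

    @paused = false
    [item, :resume]
  end
end

queue = BackpressureQueue.new(max_size: 4, resume_threshold: 0.5)
push_signals = (1..4).map { |i| queue.push(i) } # the fourth push fills the queue
pop_signals = 3.times.map { queue.pop.last }    # the third pop leaves 1 item, below 4 * 0.5
```

With a maximum of 4 and a threshold of 0.5, the fourth push signals a pause and the queue signals a resume only after it holds fewer than 2 items, which gives the hysteresis that keeps a partition from flapping between paused and resumed.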
data/Gemfile.lock CHANGED
@@ -1,7 +1,8 @@
  PATH
    remote: .
    specs:
-     racecar (3.0.0.alpha.3)
+     racecar (3.0.0.beta.1)
+       concurrent-ruby (~> 1.3)
        king_konf (~> 1.0.0)
        rdkafka (>= 0.15.0)

data/README.md CHANGED
@@ -134,6 +134,42 @@ end

  This is useful to do any one-off work that you wouldn't want to do for each and every message.

+ #### Multithreaded message processing (experimental)
+
+ Warning - limited battle testing in production environments; use at your own risk!
+
+ By default a Racecar consumer processes all of its assigned partitions on a single thread. When `multithreaded_processing_enabled` is set, the consumer instead spins up one dedicated thread per assigned partition, so partitions are processed concurrently within a single process. This is an alternative to [parallel workers](#running-consumers-in-parallel-experimental) that avoids forking extra processes (and the associated memory overhead), at the cost of running your consumer code on multiple threads.
+
+ Each partition thread gets its own instance of your consumer class, so **your consumer code does not need to be thread-safe** - a thread never shares its instance (or its instance state) with another partition. The main thread keeps polling Kafka and hands each partition's messages off to the relevant thread via a bounded queue; if a thread falls behind and its queue fills up, the partition is paused until the queue drains, applying backpressure rather than growing memory without bound.
+
+ **Warning:** the number of threads scales with the number of assigned partitions, which can be large. Since each thread runs its own consumer instance, every resource that instance acquires (database connections, file handles, network sockets, HTTP clients, etc.) is multiplied by the number of partition threads. Make sure your consumer releases any resource it grabs and that any connection pools or other shared limits are sized to accommodate the resulting concurrency.
+
+ Enable it via config (or the `RACECAR_MULTITHREADED_PROCESSING_ENABLED=1` environment variable):
+
+ ```ruby
+ Racecar.configure do |config|
+   config.multithreaded_processing_enabled = true
+ end
+
+ class ResizeImagesConsumer < Racecar::Consumer
+   subscribes_to "images"
+
+   def process(message)
+     # This runs on a thread dedicated to message.partition.
+     # @state below is private to this partition's thread.
+     @state ||= {}
+     Image.resize(message.value)
+   end
+ end
+ ```
+
+ The behaviour can be tuned with the following options:
+
+ - `multithreaded_processing_enabled` – Enable per-partition threads. Default is `false`.
+ - `multithreaded_processing_max_queue_size` – Maximum number of queued message batches per partition before the partition is paused to apply backpressure. Default is `1000`.
+ - `multithreaded_processing_resume_threshold` – A paused partition is resumed once its queue drains below this fraction of `multithreaded_processing_max_queue_size`. Default is `0.5` (50%).
+ - `multithreaded_processing_shutdown_timeout` – How many seconds to wait for each partition thread to finish during graceful shutdown. Default is `300`.
+
  #### Setting the starting position

  When a consumer is started for the first time, it needs to decide where in each partition to start. By default, it will start at the _beginning_, meaning that all past messages will be processed. If you want to instead start at the _end_ of each partition, change your `subscribes_to` like this:
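
As a configuration sketch for the tuning options listed in the README addition above: the option names come from the diff, but the values are arbitrary illustrations, and the snippet assumes the three tuning options accept plain assignment in the same way the README shows for `multithreaded_processing_enabled`. It requires the `racecar` gem at this version.

```ruby
require "racecar"

Racecar.configure do |config|
  config.multithreaded_processing_enabled = true

  # Illustrative values only; the defaults are 1000, 0.5 and 300.
  config.multithreaded_processing_max_queue_size = 200
  config.multithreaded_processing_resume_threshold = 0.25
  config.multithreaded_processing_shutdown_timeout = 60
end
```

Under these example values a partition would be paused once 200 items are queued and resumed once its queue drains below 50 (200 × 0.25), and shutdown would wait up to 60 seconds per partition thread.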
@@ -7,8 +7,6 @@ module Racecar
    class AsyncPartitionProcessor
      attr_reader :thread

-     THREAD_KEY_IDENTIFIER = 'racecar_topic_partition_identifier'.freeze
-
      def self.thread_key(topic, partition)
        "#{topic}/#{partition}"
      end
@@ -70,7 +68,6 @@ module Racecar
          rdkafka_consumer: @rdkafka_consumer,
        )
        @queue = Queue.new
-       @thread = nil

        use_process_batch = consumer_class.method_defined?(:process_batch)

@@ -90,7 +87,6 @@ module Racecar
      def spawn_thread(&block)
        @thread = Thread.new do
          Thread.current.name = "Racecar thread for #{thread_key}"
-         Thread.current[AsyncPartitionProcessor::THREAD_KEY_IDENTIFIER] = thread_key
          main_processing_loop(block)
        end
      end
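
For readers unfamiliar with the pattern in the hunks above (a named thread per topic/partition fed by a `Queue`), here is a minimal standalone sketch. It is not the `AsyncPartitionProcessor` code: `PartitionWorker`, the `STOP` sentinel, and the `shutdown` method are hypothetical, and the sketch uses an unbounded queue with no backpressure. It shows only the thread naming, the drain loop, and a join bounded by a timeout.

```ruby
# Hypothetical sketch of a per-partition worker thread; not Racecar's API.
class PartitionWorker
  STOP = Object.new.freeze # sentinel that tells the thread to exit

  attr_reader :thread, :processed

  def initialize(topic, partition)
    @key = "#{topic}/#{partition}"
    @queue = Queue.new
    @processed = []
  end

  def start
    @thread = Thread.new do
      Thread.current.name = "worker for #{@key}"
      # Keep draining until the sentinel arrives, so messages queued
      # before shutdown are still processed.
      until (message = @queue.pop).equal?(STOP)
        @processed << message
      end
    end
    self
  end

  def enqueue(message)
    @queue << message
  end

  # Ask the thread to finish. Returns true if it exited within the
  # timeout, false if the join timed out.
  def shutdown(timeout:)
    @queue << STOP
    !@thread.join(timeout).nil?
  end
end

workers = [0, 1].map { |partition| PartitionWorker.new("images", partition).start }
workers[0].enqueue("a")
workers[0].enqueue("b")
workers[1].enqueue("c")
finished = workers.map { |worker| worker.shutdown(timeout: 5) }
```

Each worker owns its own `@processed` state, mirroring the README's point that a partition thread never shares instance state with another partition.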
@@ -48,9 +48,7 @@ module Racecar

        with_error_handling(message, payload) do |pause|
          @instrumenter.instrument("process_message", payload) do
-           if @config.multithreaded_processing_enabled && consumer_class_instance.instance_variable_get(:@producer)&.closed?
-             reconfigure_consumer_class_instance!
-           end
+           reconfigure_if_producer_closed!
            consumer_class_instance.process(Racecar::Message.new(message, retries_count: pause.pauses_count))
            consumer_class_instance.deliver!
            consumer.store_offset(message, @rdkafka_consumer) unless rebalancing
@@ -76,9 +74,7 @@ module Racecar
            racecar_messages = messages.map do |message|
              Racecar::Message.new(message, retries_count: pause.pauses_count)
            end
-           if @config.multithreaded_processing_enabled && consumer_class_instance.instance_variable_get(:@producer)&.closed?
-             reconfigure_consumer_class_instance!
-           end
+           reconfigure_if_producer_closed!
            consumer_class_instance.process_batch(racecar_messages)
            consumer_class_instance.deliver!
            consumer.store_offset(messages.last, @rdkafka_consumer) unless rebalancing
@@ -205,6 +201,13 @@ module Racecar
        reconfigure_consumer_class_instance!
      end

+     def reconfigure_if_producer_closed!
+       return unless @config.multithreaded_processing_enabled
+       return unless consumer_class_instance.instance_variable_get(:@producer)&.closed?
+
+       reconfigure_consumer_class_instance!
+     end
+
      def reconfigure_consumer_class_instance!
        consumer_class_instance.configure(
          producer: consumer.producer,
@@ -2,7 +2,6 @@ module Racecar
    class RebalanceListener
      def initialize(config, instrumenter, partition_processors)
        @consumer_class = config.consumer_class
-       @config = config
        @instrumenter = instrumenter
        @partition_processors = partition_processors
        @rdkafka_consumer = nil
@@ -1,5 +1,5 @@
  # frozen_string_literal: true

  module Racecar
-   VERSION = "3.0.0.alpha.3"
+   VERSION = "3.0.0.beta.1"
  end
data/racecar.gemspec CHANGED
@@ -24,6 +24,7 @@ Gem::Specification.new do |spec|

    spec.add_runtime_dependency "king_konf", "~> 1.0.0"
    spec.add_runtime_dependency "rdkafka", ">= 0.15.0"
+   spec.add_runtime_dependency "concurrent-ruby", "~> 1.3"

    spec.add_development_dependency "bundler", [">= 1.13", "< 3"]
    spec.add_development_dependency "pry-byebug"
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: racecar
  version: !ruby/object:Gem::Version
-   version: 3.0.0.alpha.3
+   version: 3.0.0.beta.1
  platform: ruby
  authors:
  - Daniel Schierbeck
@@ -38,6 +38,20 @@ dependencies:
      - - ">="
        - !ruby/object:Gem::Version
          version: 0.15.0
+ - !ruby/object:Gem::Dependency
+   name: concurrent-ruby
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '1.3'
+   type: :runtime
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '1.3'
  - !ruby/object:Gem::Dependency
    name: bundler
    requirement: !ruby/object:Gem::Requirement