karafka 2.5.6 → 2.5.7

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 11edff86c8615652130786431e179d242dcc33130e5aabbbf5e0f5ed6d4138fe
- data.tar.gz: 69244021709283a153da19907a424c230a3597832beec6acbb39dbc02e738256
+ metadata.gz: b395542efd1d57ac4f9fd89091866a39c23e2c48d446bb20fb3628e7952f762e
+ data.tar.gz: 14aeb17e690a257bb0d8d7e57b45e797da19adf22f2e5873bd88e18a9c0a6aea
  SHA512:
- metadata.gz: 843548637c77ace03cde5c1f4150226c244b1b3343c6acd17d16c19cdc798c6d770611522eed7ebef76830f3e551c75afa176f2937c1fd63045e66ccb6276701
- data.tar.gz: a0696c28b998c8e13d4b0e0d3f4252245f27e6f93e0d75d2153a93cc1dcc53b45fbe8416ba0a9e6025753ec5d107088d578f03d0a4cada3bb5ed2eb4e5341518
+ metadata.gz: e9003ff1411da366cc3f70a77e7b949bdda7fed9d717eb6a2c151e3844af1fb0cdfca9f4c220be8d4a692b750e99c202d2afa05e57ffbac7c274579d065ee54c
+ data.tar.gz: 4807ec1df8ed8f8169e6f38b9ce5b47e541485e4f4d551822bd2ea9d578219ee565ca71b27d66a9aeef55917ffa2d16fddd629371f48661afd94271eff9c228d
data/CHANGELOG.md CHANGED
@@ -1,5 +1,18 @@
  # Karafka Framework Changelog
 
+ ## 2.5.7 (2026-03-16)
+ - [Enhancement] Report detailed blocking information (active listeners, alive workers, and in-processing jobs) during forceful shutdown instead of only aggregate counts.
+ - [Enhancement] Improve `ForcefulShutdownError` description to clearly explain when and why it is raised.
+ - [Enhancement] Cache `messages.last` in `BatchMetadata` builder to avoid duplicate array traversal.
+ - [Enhancement] Optimize `VirtualOffsetManager#mark` to use a single array scan instead of separate `include?` and `index` calls (Pro).
+ - [Enhancement] Optimize `VirtualOffsetManager#materialize_real_offset` to use `keys.sort` instead of `to_a.sort_by` with tuple destructuring (Pro).
+ - [Enhancement] Optimize `IntervalRunner#call` to use a single `monotonic_now` call instead of two per invocation.
+ - [Enhancement] Support WaterDrop `:fd` mode in Swarm.
+ - [Maintenance] Use both `:fd` and `:thread` producer backends in CI.
+ - [Maintenance] Include spec file hash in integration test topic names for easier traceability in Kafka logs (#3056).
+ - [Fix] Remove duplicate topic creation in multi-broker health integration specs (#3056).
+ - [Fix] Preserve producer-specific kafka settings (e.g., `enable.idempotence`) when recreating the producer in swarm forks.
+
  ## 2.5.6 (2026-02-28)
  - **[Feature]** Add `karafka topics health` command to check Kafka topics for replication and durability issues, detecting no redundancy (RF=1), zero fault tolerance (RF≤min.insync), and low durability (min.insync=1) configurations with color-coded severity grouping and actionable recommendations (Pro).
  - [Enhancement] Optimize license loading process by reading license files directly from the gem directory instead of requiring the entire gem, reducing initialization overhead and adding support for user-defined License modules.
@@ -51,7 +51,9 @@ module Karafka
  end
  end
 
- # Raised when we've waited enough for shutting down a non-responsive process
+ # Raised when the graceful shutdown timeout has been exceeded and Karafka must forcefully
+ # terminate remaining listeners and workers. This typically happens when consumer processing
+ # or shutdown jobs take longer than the configured `shutdown_timeout`.
  ForcefulShutdownError = Class.new(BaseError)
 
  # Raised when the jobs queue receives a job that should not be received as it would cause
@@ -26,9 +26,11 @@ module Karafka
 
  # Runs the requested code if it was not executed previously recently
  def call
- return if monotonic_now - @last_called_at < @interval
+ now = monotonic_now
 
- @last_called_at = monotonic_now
+ return if now - @last_called_at < @interval
+
+ @last_called_at = now
 
  @block.call
  end
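The hunk above replaces two `monotonic_now` reads per invocation with one. A self-contained sketch of the same throttling pattern, assuming a hypothetical standalone `IntervalRunner` and using Ruby's `Process.clock_gettime` in place of Karafka's `monotonic_now` helper:

```ruby
# Minimal throttle: run a block at most once per interval (milliseconds).
# Sketch only; Karafka's real IntervalRunner is part of its internals.
class IntervalRunner
  def initialize(interval_ms, &block)
    @interval = interval_ms
    @last_called_at = -Float::INFINITY
    @block = block
  end

  def call
    # Read the clock once and reuse the value for both the comparison
    # and the timestamp update, avoiding a second clock read per call.
    now = Process.clock_gettime(Process::CLOCK_MONOTONIC, :millisecond)

    return if now - @last_called_at < @interval

    @last_called_at = now
    @block.call
  end
end
```

Calling it in a tight loop then runs the block only once per elapsed interval.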
@@ -385,21 +385,34 @@ module Karafka
  fatal "Runner crashed due to an error: #{details}"
  fatal backtrace
  when "app.stopping.error"
- # Counts number of workers and listeners that were still active when forcing the
- # shutdown. Please note, that unless all listeners are closed, workers will not finalize
- # their operations as well.
- # We need to check if listeners and workers are assigned as during super early stages of
- # boot they are not.
- listeners = Server.listeners ? Server.listeners.count(&:active?) : 0
- workers = Server.workers ? Server.workers.count(&:alive?) : 0
+ active_listeners = event.payload[:active_listeners]
+ alive_workers = event.payload[:alive_workers]
+ in_processing = event.payload[:in_processing]
 
  message = <<~MSG.tr("\n", " ").strip!
  Forceful Karafka server stop with:
- #{workers} active workers and
- #{listeners} active listeners
+ #{alive_workers.size} active workers and
+ #{active_listeners.size} active listeners
  MSG
 
  error message
+
+ active_listeners.each do |listener|
+ error "Listener #{listener.id} for #{listener.subscription_group.name} still active"
+ end
+
+ in_processing.each do |group_id, jobs|
+ next if jobs.empty?
+
+ jobs.each do |job|
+ job_class = job.class.name.split("::").last
+ topic_name = job.executor.topic.name
+ partition = job.executor.partition
+
+ error "In processing: #{job_class} job for #{topic_name}/#{partition} " \
+ "(group: #{group_id})"
+ end
+ end
  when "app.forceful_stopping.error"
  error "Forceful shutdown error occurred: #{details}"
  error backtrace
@@ -117,7 +117,12 @@ module Karafka
  when "runner.call.error"
  fatal "Runner crashed due to an error: #{error}"
  when "app.stopping.error"
- error "Forceful Karafka server stop"
+ active_listeners = event.payload[:active_listeners]
+ alive_workers = event.payload[:alive_workers]
+
+ error "Forceful Karafka server stop with: " \
+ "#{alive_workers.size} active workers and " \
+ "#{active_listeners.size} active listeners"
  when "app.forceful_stopping.error"
  error "Forceful shutdown error occurred: #{error}"
  when "librdkafka.error"
@@ -17,16 +17,18 @@ module Karafka
  # @note We do not set `processed_at` as this needs to be assigned when the batch is
  # picked up for processing.
  def call(messages, topic, partition, scheduled_at)
+ last_message = messages.last
+
  Karafka::Messages::BatchMetadata.new(
  size: messages.size,
  first_offset: messages.first&.offset || -1001,
- last_offset: messages.last&.offset || -1001,
+ last_offset: last_message&.offset || -1001,
  deserializers: topic.deserializers,
  partition: partition,
  topic: topic.name,
  # We go with the assumption that the creation of the whole batch is the last message
  # creation time
- created_at: local_created_at(messages.last),
+ created_at: local_created_at(last_message),
  # When this batch was built and scheduled for execution
  scheduled_at: scheduled_at,
  # This needs to be set to a correct value prior to processing starting
@@ -91,17 +91,27 @@ module Karafka
  @offsets_metadata[offset] = offset_metadata
  @current_offset_metadata = offset_metadata
 
- group = @groups.find { |reg_group| reg_group.include?(offset) }
+ group = nil
+ position = nil
+
+ @groups.each do |reg_group|
+ pos = reg_group.index(offset)
+
+ if pos
+ group = reg_group
+ position = pos
+ break
+ end
+ end
 
  # This case can happen when someone uses MoM and wants to mark message from a previous
  # batch as consumed. We can add it, since the real offset refresh will point to it
  unless group
  group = [offset]
+ position = 0
  @groups << group
  end
 
- position = group.index(offset)
-
  # Mark all previous messages from the same group also as virtually consumed
  group[0..position].each do |markable_offset|
  # Set previous messages metadata offset as the offset of higher one for overwrites
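The hunk above replaces a `find { include? }` pass plus a second `index` pass over the matched group with one scan that yields both the group and the position. A sketch of that single-scan lookup, assuming groups are plain arrays of offsets (the `locate` helper below is hypothetical, not part of Karafka's API):

```ruby
# Find both the group containing `offset` and its position within that
# group in one pass. The two-pass version scans each group once via
# `include?` and then re-scans the matched group via `index`.
def locate(groups, offset)
  groups.each do |group|
    position = group.index(offset)
    # `index` returns nil when absent, so a truthy result means "found".
    return [group, position] if position
  end

  nil
end
```

For example, `locate([[0, 1, 2], [5, 6]], 6)` yields the second group and position 1.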
@@ -135,7 +145,7 @@ module Karafka
 
  # @return [Array<Integer>] Offsets of messages already marked as consumed virtually
  def marked
- @marked.select { |_, status| status }.map(&:first).sort
+ @marked.select { |_, status| status }.map { |offset, _| offset }.sort
  end
 
  # Is there a real offset we can mark as consumed
@@ -171,11 +181,11 @@ module Karafka
  private
 
  # Recomputes the biggest possible real offset we can have.
- # It picks the the biggest offset that has uninterrupted stream of virtually marked as
+ # It picks the biggest offset that has uninterrupted stream of virtually marked as
  # consumed because this will be the collective offset.
  def materialize_real_offset
- @marked.to_a.sort_by(&:first).each do |offset, marked|
- break unless marked
+ @marked.keys.sort.each do |offset|
+ break unless @marked[offset]
 
  @real_offset = offset
  end
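Sorting the keys directly avoids materializing the `[offset, flag]` tuples that `to_a.sort_by(&:first)` builds. A minimal sketch of the materialization logic, assuming `marked` is a plain Hash of offset to boolean (a hypothetical free function, not the Pro class itself):

```ruby
# Walk offsets in ascending order and advance the materialized offset
# while the stream of virtually marked offsets is uninterrupted.
def materialize_real_offset(marked)
  real_offset = nil

  marked.keys.sort.each do |offset|
    # Stop at the first gap: anything beyond it is not yet committable.
    break unless marked[offset]

    real_offset = offset
  end

  real_offset
end
```

With `{ 3 => true, 1 => true, 2 => true, 5 => true, 4 => false }` the walk stops at offset 4, so 3 materializes as the collective offset.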
@@ -180,6 +180,16 @@ module Karafka
  end
  end
 
+ # Returns a snapshot of all jobs currently in processing per group.
+ # Useful for diagnostics during forceful shutdown to understand what is blocking.
+ #
+ # @return [Hash{String => Array<Jobs::Base>}] hash mapping group ids to arrays of jobs
+ def in_processing
+ @mutex.synchronize do
+ @in_processing.transform_values(&:dup).freeze
+ end
+ end
+
  private
 
  # @param group_id [String] id of the group in which jobs we're interested.
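The new `in_processing` reader copies each jobs array while holding the mutex, so callers can iterate the snapshot without racing concurrent mutation. A self-contained sketch of that snapshot-under-lock pattern with a hypothetical `JobsRegistry` standing in for the jobs queue:

```ruby
# Thread-safe snapshot of a mutable Hash of Arrays: duplicate each array
# under the lock so handed-out views are isolated from later mutation.
class JobsRegistry
  def initialize
    @mutex = Mutex.new
    @in_processing = Hash.new { |hash, key| hash[key] = [] }
  end

  def add(group_id, job)
    @mutex.synchronize { @in_processing[group_id] << job }
  end

  # Frozen, shallow-copied view; mutating the registry afterwards does
  # not affect snapshots already returned.
  def in_processing
    @mutex.synchronize { @in_processing.transform_values(&:dup).freeze }
  end
end
```

Note the copy is shallow: the job objects themselves are shared, which is fine for read-only diagnostics like the shutdown report.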
@@ -126,10 +126,19 @@ module Karafka
 
  raise Errors::ForcefulShutdownError
  rescue Errors::ForcefulShutdownError => e
+ active_listeners = listeners.select(&:active?)
+ alive_workers = workers.select(&:alive?)
+
+ # Collect details about subscription groups that still have jobs in processing
+ in_processing = jobs_queue ? jobs_queue.in_processing : {}
+
  Karafka.monitor.instrument(
  "error.occurred",
  caller: self,
  error: e,
+ active_listeners: active_listeners,
+ alive_workers: alive_workers,
+ in_processing: in_processing,
  type: "app.stopping.error"
  )
 
@@ -27,18 +27,6 @@ module Karafka
  # @return [Integer] pid of the node
  attr_reader :pid
 
- # When re-creating a producer in the fork, those are not attributes we want to inherit
- # from the parent process because they are updated in the fork. If user wants to take those
- # from the parent process, he should redefine them by overwriting the whole producer.
- SKIPPABLE_NEW_PRODUCER_ATTRIBUTES = %i[
- id
- kafka
- logger
- oauth
- ].freeze
-
- private_constant :SKIPPABLE_NEW_PRODUCER_ATTRIBUTES
-
  # @param id [Integer] number of the fork. Used for uniqueness setup for group client ids and
  # other stuff where we need to know a unique reference of the fork in regards to the rest
  # of them.
@@ -70,24 +58,7 @@ module Karafka
  config.producer.close
 
  old_producer = config.producer
- old_producer_config = old_producer.config
-
- # Supervisor producer is closed, hence we need a new one here
- config.producer = WaterDrop::Producer.new do |p_config|
- p_config.kafka = Setup::AttributesMap.producer(kafka.dup)
- p_config.logger = config.logger
-
- old_producer_config.to_h.each do |key, value|
- next if SKIPPABLE_NEW_PRODUCER_ATTRIBUTES.include?(key)
-
- p_config.public_send("#{key}=", value)
- end
-
- # Namespaced attributes need to be migrated directly on their config node
- old_producer_config.oauth.to_h.each do |key, value|
- p_config.oauth.public_send("#{key}=", value)
- end
- end
+ config.producer = ProducerReplacer.new.call(old_producer, kafka, config.logger)
 
  @pid = ::Process.pid
  @reader.close
@@ -0,0 +1,110 @@
+ # frozen_string_literal: true
+
+ module Karafka
+ module Swarm
+ # Builds a new WaterDrop producer that inherits configuration from an old one
+ #
+ # When a swarm node forks, the parent's producer must be replaced with a new one.
+ # This class encapsulates the logic for building that new producer, inheriting all relevant
+ # settings from the old one while generating fresh connection-level configuration.
+ class ProducerReplacer
+ # Attributes that should not be directly copied from the old producer config because they
+ # are either regenerated fresh (kafka, logger, id) or handled via their own namespaced
+ # migration (oauth, polling, polling.fd).
+ SKIPPABLE_ATTRIBUTES = %i[
+ id
+ kafka
+ logger
+ oauth
+ polling
+ fd
+ ].freeze
+
+ private_constant :SKIPPABLE_ATTRIBUTES
+
+ # Builds a new WaterDrop producer inheriting configuration from the old one
+ #
+ # @param old_producer [WaterDrop::Producer] the old producer to inherit settings from
+ # @param kafka [Hash] app-level kafka configuration
+ # @param logger [Object] logger instance for the new producer
+ # @return [WaterDrop::Producer] new producer with inherited configuration
+ def call(old_producer, kafka, logger)
+ old_producer_config = old_producer.config
+
+ WaterDrop::Producer.new do |p_config|
+ p_config.logger = logger
+
+ migrate_kafka(p_config, old_producer_config, kafka)
+ migrate_root(p_config, old_producer_config)
+ migrate_oauth(p_config, old_producer_config)
+ migrate_polling(p_config, old_producer_config)
+ migrate_polling_fd(p_config, old_producer_config)
+ end
+ end
+
+ private
+
+ # Migrates root-level producer attributes from the old producer, skipping those that are
+ # regenerated fresh or handled by their own namespaced migration
+ #
+ # @param p_config [WaterDrop::Config] new producer config being built
+ # @param old_producer_config [WaterDrop::Config] old producer config to inherit from
+ def migrate_root(p_config, old_producer_config)
+ old_producer_config.to_h.each do |key, value|
+ next if SKIPPABLE_ATTRIBUTES.include?(key)
+
+ p_config.public_send("#{key}=", value)
+ end
+ end
+
+ # Builds fresh kafka config from app-level settings and preserves any producer-specific
+ # kafka settings from the old producer (e.g., enable.idempotence) that aren't in the
+ # base app kafka config
+ #
+ # @param p_config [WaterDrop::Config] new producer config being built
+ # @param old_producer_config [WaterDrop::Config] old producer config to inherit from
+ # @param kafka [Hash] app-level kafka configuration
+ def migrate_kafka(p_config, old_producer_config, kafka)
+ p_config.kafka = Setup::AttributesMap.producer(kafka.dup)
+
+ old_producer_config.kafka.each do |key, value|
+ next if p_config.kafka.key?(key)
+
+ p_config.kafka[key] = value
+ end
+ end
+
+ # Migrates oauth configuration from the old producer
+ #
+ # @param p_config [WaterDrop::Config] new producer config being built
+ # @param old_producer_config [WaterDrop::Config] old producer config to inherit from
+ def migrate_oauth(p_config, old_producer_config)
+ old_producer_config.oauth.to_h.each do |key, value|
+ p_config.oauth.public_send("#{key}=", value)
+ end
+ end
+
+ # Migrates polling configuration from the old producer
+ #
+ # @param p_config [WaterDrop::Config] new producer config being built
+ # @param old_producer_config [WaterDrop::Config] old producer config to inherit from
+ def migrate_polling(p_config, old_producer_config)
+ old_producer_config.polling.to_h.each do |key, value|
+ next if SKIPPABLE_ATTRIBUTES.include?(key)
+
+ p_config.polling.public_send("#{key}=", value)
+ end
+ end
+
+ # Migrates polling fd configuration from the old producer
+ #
+ # @param p_config [WaterDrop::Config] new producer config being built
+ # @param old_producer_config [WaterDrop::Config] old producer config to inherit from
+ def migrate_polling_fd(p_config, old_producer_config)
+ old_producer_config.polling.fd.to_h.each do |key, value|
+ p_config.polling.fd.public_send("#{key}=", value)
+ end
+ end
+ end
+ end
+ end
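The skip-list migration in `ProducerReplacer` can be illustrated with plain Hashes standing in for WaterDrop config objects. The `migrate_root`/`migrate_kafka` free functions below are hypothetical simplifications; the real class operates on `WaterDrop::Config` nodes via `public_send`:

```ruby
# Keys that are regenerated fresh or migrated by their own namespaced pass.
SKIPPABLE = %i[id kafka logger oauth].freeze

# Copy root-level settings from the old config, skipping the skip-list.
def migrate_root(new_config, old_config)
  old_config.each do |key, value|
    next if SKIPPABLE.include?(key)

    new_config[key] = value
  end

  new_config
end

# Preserve producer-specific kafka settings (e.g. `enable.idempotence`)
# that are absent from the freshly built base kafka config.
def migrate_kafka(new_kafka, old_kafka)
  old_kafka.each do |key, value|
    new_kafka[key] = value unless new_kafka.key?(key)
  end

  new_kafka
end
```

The key design point matches the [Fix] changelog entry: keys already present in the fresh base config (like `bootstrap.servers`) win, while producer-specific extras from the parent survive the fork.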
@@ -152,6 +152,9 @@ module Karafka
  caller: self,
  error: e,
  manager: manager,
+ active_listeners: [],
+ alive_workers: [],
+ in_processing: {},
  type: "app.stopping.error"
  )
 
@@ -3,5 +3,5 @@
  # Main module namespace
  module Karafka
  # Current Karafka version
- VERSION = "2.5.6"
+ VERSION = "2.5.7"
  end
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: karafka
  version: !ruby/object:Gem::Version
- version: 2.5.6
+ version: 2.5.7
  platform: ruby
  authors:
  - Maciej Mensfeld
@@ -542,6 +542,7 @@ files:
  - lib/karafka/swarm/liveness_listener.rb
  - lib/karafka/swarm/manager.rb
  - lib/karafka/swarm/node.rb
+ - lib/karafka/swarm/producer_replacer.rb
  - lib/karafka/swarm/supervisor.rb
  - lib/karafka/templates/application_consumer.rb.erb
  - lib/karafka/templates/example_consumer.rb.erb