karafka 2.1.6 → 2.1.8

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: a6994a6d579728a877f84c87086d093aae8a1f830b891fcb4904883085432fe4
- data.tar.gz: 13b21009a471194a72971ca81ddc718e044bb96587db0e8f186974f554e9ec62
+ metadata.gz: 042f365fb134a24ae360d678590ce798014751c8a23fb267001c920a42aa5324
+ data.tar.gz: ba2950de557a5f6c775577ce392d60ee839184f50b7d9225969c684625c9ecd0
  SHA512:
- metadata.gz: e4711880bde1d2cd1cb34959f740459979b74ff4d28a671a232f88adbe7473cf67e366fc2b492fac761c572f3a6dfc147a59d46fc08e1c5e18df8ac5f108afdd
- data.tar.gz: c094600c2bd421ce309c0125d60ea82ed0106d5ce4566b3bb8c1aab13c553e7bd2f6651b98029e42ac831b132563b2c502dd1c76defbf8307cd9bd2393b258f7
+ metadata.gz: 30b5fcd92c348c50482cb84542380ad28b317d47e01efad2d2049cd3ba7872c5f66a7a4fde93d8bfa262d7fcb3745dedbdb89a4fd5e6cdf2288a43606ea2361d
+ data.tar.gz: d9c8ba95b2b71f46a3d35e2f5be634473a7aa9f45d9b5924e1019dcb53c7cea6aca8c594f9f4d3bf85a14176c3cb98a7167b4eb6e1f6f2192059d8040440e4a9
checksums.yaml.gz.sig CHANGED
Binary file
data/CHANGELOG.md CHANGED
@@ -1,5 +1,20 @@
  # Karafka framework changelog
 
+ ## 2.1.8 (2023-07-29)
+ - [Improvement] Introduce `Karafka::BaseConsumer#used?` method to indicate, that at least one invocation of `#consume` took or will take place. This can be used as a replacement to the non-direct `messages.count` check for shutdown and revocation to ensure, that the consumption took place or is taking place (in case of running LRJ).
+ - [Improvement] Make `messages#to_a` return copy of the underlying array to prevent scenarios, where the mutation impacts offset management.
+ - [Improvement] Mitigate a librdkafka `cooperative-sticky` rebalance crash issue.
+ - [Improvement] Provide ability to overwrite `consumer_persistence` per subscribed topic. This is mostly useful for plugins and extensions developers.
+ - [Fix] Fix a case where the performance tracker would crash in case of mutation of messages to an empty state.
+
+ ## 2.1.7 (2023-07-22)
+ - [Improvement] Always query for watermarks in the Iterator to improve the initial response time.
+ - [Improvement] Add `max_wait_time` option to the Iterator.
+ - [Fix] Fix a case where `Admin#read_topic` would wait for poll interval on non-existing messages instead of early exit.
+ - [Fix] Fix a case where Iterator with per partition offsets with negative lookups would go below the number of available messages.
+ - [Fix] Remove unused constant from Admin module.
+ - [Fix] Add missing `connection.client.rebalance_callback.error` to the `LoggerListener` instrumentation hook.
+
  ## 2.1.6 (2023-06-29)
  - [Improvement] Provide time support for iterator
  - [Improvement] Provide time support for admin `#read_topic`
@@ -63,7 +78,7 @@
  2. Replace `Karafka::Pro::BaseConsumer` references to `Karafka::BaseConsumer`.
  3. Replace `Karafka::Instrumentation::Vendors::Datadog:Listener` with `Karafka::Instrumentation::Vendors::Datadog::MetricsListener`.
 
- ## 2.0.41 (2023-14-19)
+ ## 2.0.41 (2023-04-19)
  - **[Feature]** Provide `Karafka::Pro::Iterator` for anonymous topic/partitions iterations and messages lookups (#1389 and #1427).
  - [Improvement] Optimize topic lookup for `read_topic` admin method usage.
  - [Improvement] Report via `LoggerListener` information about the partition on which a given job has started and finished.
data/Gemfile.lock CHANGED
@@ -1,7 +1,7 @@
  PATH
  remote: .
  specs:
- karafka (2.1.6)
+ karafka (2.1.8)
  karafka-core (>= 2.1.1, < 2.2.0)
  thor (>= 0.20)
  waterdrop (>= 2.6.2, < 3.0.0)
@@ -10,10 +10,10 @@ PATH
  GEM
  remote: https://rubygems.org/
  specs:
- activejob (7.0.5)
- activesupport (= 7.0.5)
+ activejob (7.0.6)
+ activesupport (= 7.0.6)
  globalid (>= 0.3.6)
- activesupport (7.0.5)
+ activesupport (7.0.6)
  concurrent-ruby (~> 1.0, >= 1.0.2)
  i18n (>= 1.6, < 2)
  minitest (>= 5.1)
@@ -33,21 +33,21 @@ GEM
  karafka-core (2.1.1)
  concurrent-ruby (>= 1.1)
  karafka-rdkafka (>= 0.13.1, < 0.14.0)
- karafka-rdkafka (0.13.1)
+ karafka-rdkafka (0.13.3)
  ffi (~> 1.15)
  mini_portile2 (~> 2.6)
  rake (> 12)
- karafka-web (0.6.1)
+ karafka-web (0.6.3)
  erubi (~> 1.4)
  karafka (>= 2.1.4, < 3.0.0)
  karafka-core (>= 2.0.13, < 3.0.0)
  roda (~> 3.68, >= 3.68)
  tilt (~> 2.0)
- mini_portile2 (2.8.2)
+ mini_portile2 (2.8.4)
  minitest (5.18.1)
  rack (3.0.8)
  rake (13.0.6)
- roda (3.69.0)
+ roda (3.70.0)
  rack
  rspec (3.12.0)
  rspec-core (~> 3.12.0)
@@ -58,10 +58,10 @@ GEM
  rspec-expectations (3.12.3)
  diff-lcs (>= 1.2.0, < 2.0)
  rspec-support (~> 3.12.0)
- rspec-mocks (3.12.5)
+ rspec-mocks (3.12.6)
  diff-lcs (>= 1.2.0, < 2.0)
  rspec-support (~> 3.12.0)
- rspec-support (3.12.0)
+ rspec-support (3.12.1)
  simplecov (0.22.0)
  docile (~> 1.1)
  simplecov-html (~> 0.11)
@@ -72,8 +72,8 @@ GEM
  tilt (2.2.0)
  tzinfo (2.0.6)
  concurrent-ruby (~> 1.0)
- waterdrop (2.6.2)
- karafka-core (>= 2.1.0, < 3.0.0)
+ waterdrop (2.6.5)
+ karafka-core (>= 2.1.1, < 3.0.0)
  zeitwerk (~> 2.3)
  zeitwerk (2.6.8)
 
data/lib/karafka/admin.rb CHANGED
@@ -9,11 +9,6 @@ module Karafka
  # @note It always uses the primary defined cluster and does not support multi-cluster work.
  # If you need this, just replace the cluster info for the time you use this
  module Admin
- # A fake admin topic representation that we use for messages fetched using this API
- # We cannot use the topics directly because we may want to request data from topics that we
- # do not have in the routing
- Topic = Struct.new(:name, :deserializer)
-
  # We wait only for this amount of time before raising error as we intercept this error and
  # retry after checking that the operation was finished or failed using external factor.
  MAX_WAIT_TIMEOUT = 1
@@ -37,7 +32,7 @@ module Karafka
  'enable.auto.commit': false
  }.freeze
 
- private_constant :Topic, :CONFIG_DEFAULTS, :MAX_WAIT_TIMEOUT, :TPL_REQUEST_TIMEOUT,
+ private_constant :CONFIG_DEFAULTS, :MAX_WAIT_TIMEOUT, :TPL_REQUEST_TIMEOUT,
  :MAX_ATTEMPTS
 
  class << self
@@ -71,7 +66,7 @@ module Karafka
  requested_range = (start_offset..start_offset + (count - 1))
  # Establish theoretical available range. Note, that this does not handle cases related to
  # log retention or compaction
- available_range = (low_offset..high_offset)
+ available_range = (low_offset..(high_offset - 1))
  # Select only offset that we can select. This will remove all the potential offsets that
  # are below the low watermark offset
  possible_range = requested_range.select { |offset| available_range.include?(offset) }
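The `available_range` fix above accounts for the high watermark pointing one past the last readable message. A plain-Ruby sketch of the corrected range math (method name hypothetical, not from the gem):

```ruby
# Given Kafka watermark offsets, the high watermark is the offset of the
# *next* message to be written, so the last readable offset is high - 1
def readable_offsets(start_offset, count, low_offset, high_offset)
  requested = (start_offset..start_offset + (count - 1))
  # Before the fix the range ended at high_offset, one past the real data
  available = (low_offset..(high_offset - 1))
  requested.select { |offset| available.include?(offset) }
end
```

For example, requesting 5 messages from offset 3 with watermarks (2, 6) yields only offsets 3 through 5, since offset 6 does not exist yet.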
@@ -25,6 +25,7 @@ module Karafka
  # Creates new consumer and assigns it an id
  def initialize
  @id = SecureRandom.hex(6)
+ @used = false
  end
 
  # Can be used to run preparation code prior to the job being enqueued
@@ -34,6 +35,7 @@ module Karafka
  # not as a part of the public api. This should not perform any extensive operations as it is
  # blocking and running in the listener thread.
  def on_before_enqueue
+ @used = true
  handle_before_enqueue
  rescue StandardError => e
  Karafka.monitor.instrument(
@@ -160,6 +162,14 @@ module Karafka
  # some teardown procedures (closing file handler, etc).
  def shutdown; end
 
+ # @return [Boolean] was this consumer in active use. Active use means running `#consume` at
+ # least once. Consumer may have to run `#revoked` or `#shutdown` despite not running
+ # `#consume` previously in delayed job cases and other cases that potentially involve running
+ # the `Jobs::Idle` for house-keeping
+ def used?
+ @used
+ end
+
  # Pauses processing on a given offset for the current topic partition
  #
  # After given partition is resumed, it will continue processing from the given offset
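The `@used` flag introduced above is a simple latch set on enqueue. Isolated as a standalone sketch (class name hypothetical, mirroring only the diffed lines, not the full `BaseConsumer`):

```ruby
class SketchConsumer
  def initialize
    @used = false
  end

  # Mirrors on_before_enqueue: records that #consume took or will take place
  def on_before_enqueue
    @used = true
  end

  # True once at least one consumption was scheduled; #revoked/#shutdown
  # may still run on a consumer that was never used (e.g. idle house-keeping)
  def used?
    @used
  end
end
```

This replaces indirect `messages.count` checks in shutdown/revocation logic with an explicit signal.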
@@ -23,11 +23,17 @@ module Karafka
  # Max time for a TPL request. We increase it to compensate for remote clusters latency
  TPL_REQUEST_TIMEOUT = 2_000
 
+ # 1 minute of max wait for the first rebalance before a forceful attempt
+ # This applies only to a case when a short-lived Karafka instance with a client would be
+ # closed before first rebalance. Mitigates a librdkafka bug.
+ COOPERATIVE_STICKY_MAX_WAIT = 60_000
+
  # We want to make sure we never close several clients in the same moment to prevent
  # potential race conditions and other issues
  SHUTDOWN_MUTEX = Mutex.new
 
- private_constant :MAX_POLL_RETRIES, :SHUTDOWN_MUTEX, :TPL_REQUEST_TIMEOUT
+ private_constant :MAX_POLL_RETRIES, :SHUTDOWN_MUTEX, :TPL_REQUEST_TIMEOUT,
+ :COOPERATIVE_STICKY_MAX_WAIT
 
  # Creates a new consumer instance.
  #
@@ -226,6 +232,22 @@ module Karafka
  # as until all the consumers are stopped, the server will keep running serving only
  # part of the messages
  def stop
+ # This ensures, that we do not stop the underlying client until it passes the first
+ # rebalance for cooperative-sticky. Otherwise librdkafka may crash
+ #
+ # We set a timeout just in case the rebalance would never happen or would last for an
+ # extensive time period.
+ #
+ # @see https://github.com/confluentinc/librdkafka/issues/4312
+ if @subscription_group.kafka[:'partition.assignment.strategy'] == 'cooperative-sticky'
+ (COOPERATIVE_STICKY_MAX_WAIT / 100).times do
+ # If we're past the first rebalance, no need to wait
+ break if @rebalance_manager.active?
+
+ sleep(0.1)
+ end
+ end
+
  close
  end
 
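The guard added to `#stop` is a bounded poll-until loop: check a condition every 100ms, give up after a budget. The same shape in isolation (helper name hypothetical):

```ruby
# Waits until the block returns truthy or the time budget is exhausted,
# checking every 100ms; the same pattern as the rebalance guard in #stop
def wait_for(max_wait_ms = 60_000)
  (max_wait_ms / 100).times do
    break if yield

    sleep(0.1)
  end
end
```

With a 60_000ms budget this matches `COOPERATIVE_STICKY_MAX_WAIT`: at most 600 checks before the forceful close proceeds anyway.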
@@ -30,6 +30,7 @@ module Karafka
  @assigned_partitions = {}
  @revoked_partitions = {}
  @changed = false
+ @active = false
  end
 
  # Resets the rebalance manager state
@@ -46,11 +47,20 @@ module Karafka
  @changed
  end
 
+ # @return [Boolean] true if there was at least one rebalance
+ # @note This method is needed to make sure that when using cooperative-sticky, we do not
+ # close until first rebalance. Otherwise librdkafka may crash.
+ # @see https://github.com/confluentinc/librdkafka/issues/4312
+ def active?
+ @active
+ end
+
  # Callback that kicks in inside of rdkafka, when new partitions are assigned.
  #
  # @private
  # @param partitions [Rdkafka::Consumer::TopicPartitionList]
  def on_partitions_assigned(partitions)
+ @active = true
  @assigned_partitions = partitions.to_h.transform_values { |part| part.map(&:partition) }
  @changed = true
  end
@@ -60,6 +70,7 @@ module Karafka
  # @private
  # @param partitions [Rdkafka::Consumer::TopicPartitionList]
  def on_partitions_revoked(partitions)
+ @active = true
  @revoked_partitions = partitions.to_h.transform_values { |part| part.map(&:partition) }
  @changed = true
  end
@@ -277,6 +277,9 @@ module Karafka
  when 'connection.client.poll.error'
  error "Data polling error occurred: #{error}"
  error details
+ when 'connection.client.rebalance_callback.error'
+ error "Rebalance callback error occurred: #{error}"
+ error details
  else
  # This should never happen. Please contact the maintainers
  raise Errors::UnsupportedCaseError, event
@@ -60,10 +60,12 @@ module Karafka
  @messages_array.size
  end
 
- # @return [Array<Karafka::Messages::Message>] pure array with messages
+ # @return [Array<Karafka::Messages::Message>] copy of the pure array with messages
  def to_a
- @messages_array
+ @messages_array.dup
  end
+
+ alias count size
  end
  end
  end
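Returning `@messages_array.dup` hands callers a shallow copy: mutating the returned array can no longer shrink the batch that offset management relies on. A minimal illustration (class name hypothetical):

```ruby
class SketchBatch
  def initialize(messages)
    @messages_array = messages
  end

  def size
    @messages_array.size
  end
  alias count size

  # Copy, so callers cannot clear or mutate the internal array
  def to_a
    @messages_array.dup
  end
end
```

Note this is a shallow copy: the message objects themselves are shared, only the array container is duplicated.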
@@ -93,12 +93,27 @@ module Karafka
  next unless partitions.is_a?(Hash)
 
  partitions.each do |partition, offset|
+ # Care only about numerical offsets
+ #
+ # For time based we already resolve them via librdkafka lookup API
+ next unless offset.is_a?(Integer)
+
+ low_offset, high_offset = @consumer.query_watermark_offsets(name, partition)
+
  # Care only about negative offsets (last n messages)
- next unless offset.is_a?(Integer) && offset.negative?
+ #
+ # We reject the above results but we **NEED** to run the `#query_watermark_offsets`
+ # for each topic partition nonetheless. Without this, librdkafka fetches a lot more
+ # metadata about each topic and each partition and this takes much more time than
+ # just getting watermarks. If we do not run watermark, at least an extra second
+ # is added at the beginning of iterator flow
+ #
+ # This may not be significant when this runs in the background but in case of
+ # using iterator in thins like Puma, it heavily impacts the end user experience
+ next unless offset.negative?
 
- _, high_watermark_offset = @consumer.query_watermark_offsets(name, partition)
  # We add because this offset is negative
- @mapped_topics[name][partition] = high_watermark_offset + offset
+ @mapped_topics[name][partition] = [high_offset + offset, low_offset].max
  end
  end
  end
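The clamp added above resolves a negative "last n messages" offset against the watermarks and floors it at the low watermark, so a partition holding fewer than n messages starts at its first available offset instead of an invalid one. In isolation (method name hypothetical):

```ruby
# offset is negative: -5 means "the last five messages"
# Clamped to low_offset so sparse partitions do not underflow
def resolve_negative_offset(offset, low_offset, high_offset)
  [high_offset + offset, low_offset].max
end
```

So `-5` on a partition with watermarks (0, 100) starts at 95, while `-50` on watermarks (10, 20) is clamped to 10.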
@@ -39,6 +39,7 @@ module Karafka
  # overwritten, you may want to include `auto.offset.reset` to match your case.
  # @param yield_nil [Boolean] should we yield also `nil` values when poll returns nothing.
  # Useful in particular for long-living iterators.
+ # @param max_wait_time [Integer] max wait in ms when iterator did not receive any messages
  #
  # @note It is worth keeping in mind, that this API also needs to operate within
  # `max.poll.interval.ms` limitations on each iteration
@@ -48,7 +49,8 @@ module Karafka
  def initialize(
  topics,
  settings: { 'auto.offset.reset': 'beginning' },
- yield_nil: false
+ yield_nil: false,
+ max_wait_time: 200
  )
  @topics_with_partitions = Expander.new.call(topics)
 
@@ -62,6 +64,7 @@ module Karafka
 
  @settings = settings
  @yield_nil = yield_nil
+ @max_wait_time = max_wait_time
  end
 
  # Iterates over requested topic partitions and yields the results with the iterator itself
@@ -80,7 +83,7 @@ module Karafka
  # Stream data until we reach the end of all the partitions or until the end user
  # indicates that they are done
  until done?
- message = poll
+ message = poll
 
  # Skip nils if not explicitly required
  next if message.nil? && !@yield_nil
@@ -131,10 +134,9 @@ module Karafka
 
  private
 
- # @param timeout [Integer] timeout in ms
  # @return [Rdkafka::Consumer::Message, nil] message or nil if nothing to do
- def poll(timeout)
- @current_consumer.poll(timeout)
+ def poll
+ @current_consumer.poll(@max_wait_time)
  rescue Rdkafka::RdkafkaError => e
  # End of partition
  if e.code == :partition_eof
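The refactor above moves the poll timeout from a hard-coded call-site argument to a keyword set once at construction. The shape of the pattern with the consumer stubbed out (class name hypothetical; the real `Karafka::Pro::Iterator` needs a live cluster):

```ruby
class SketchIterator
  def initialize(consumer, max_wait_time: 200)
    @current_consumer = consumer
    @max_wait_time = max_wait_time
  end

  private

  # Poll now reads the configurable wait instead of a hard-coded 200ms
  def poll
    @current_consumer.poll(@max_wait_time)
  end
end
```

Callers tuning for slow remote clusters can now raise the wait without touching the iteration loop.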
@@ -48,7 +48,7 @@ module Karafka
  # We reload the consumers with each batch instead of relying on some external signals
  # when needed for consistency. That way devs may have it on or off and not in this
  # middle state, where re-creation of a consumer instance would occur only sometimes
- @consumer = nil unless ::Karafka::App.config.consumer_persistence
+ @consumer = nil unless topic.consumer_persistence
 
  # First we build messages batch...
  consumer.messages = Messages::Builders::Messages.call(
@@ -17,6 +17,7 @@ module Karafka
  max_messages
  max_wait_time
  initial_offset
+ consumer_persistence
  ].freeze
 
  private_constant :INHERITABLE_ATTRIBUTES
@@ -50,7 +51,7 @@ module Karafka
 
  # @return [Class] consumer class that we should use
  def consumer
- if Karafka::App.config.consumer_persistence
+ if consumer_persistence
  # When persistence of consumers is on, no need to reload them
  @consumer
  else
@@ -25,6 +25,7 @@ module Karafka
  broker.version.fallback
  builtin.features
  check.crcs
+ client.dns.lookup
  client.id
  client.rack
  closesocket_cb
@@ -161,6 +162,7 @@ module Karafka
  broker.address.ttl
  broker.version.fallback
  builtin.features
+ client.dns.lookup
  client.id
  client.rack
  closesocket_cb
@@ -3,5 +3,5 @@
  # Main module namespace
  module Karafka
  # Current Karafka version
- VERSION = '2.1.6'
+ VERSION = '2.1.8'
  end
data.tar.gz.sig CHANGED
Binary file
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: karafka
  version: !ruby/object:Gem::Version
- version: 2.1.6
+ version: 2.1.8
  platform: ruby
  authors:
  - Maciej Mensfeld
@@ -35,7 +35,7 @@ cert_chain:
  Qf04B9ceLUaC4fPVEz10FyobjaFoY4i32xRto3XnrzeAgfEe4swLq8bQsR3w/EF3
  MGU0FeSV2Yj7Xc2x/7BzLK8xQn5l7Yy75iPF+KP3vVmDHnNl
  -----END CERTIFICATE-----
- date: 2023-06-29 00:00:00.000000000 Z
+ date: 2023-07-29 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: karafka-core
metadata.gz.sig CHANGED
Binary file