karafka 2.1.6 → 2.1.8

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: a6994a6d579728a877f84c87086d093aae8a1f830b891fcb4904883085432fe4
- data.tar.gz: 13b21009a471194a72971ca81ddc718e044bb96587db0e8f186974f554e9ec62
+ metadata.gz: 042f365fb134a24ae360d678590ce798014751c8a23fb267001c920a42aa5324
+ data.tar.gz: ba2950de557a5f6c775577ce392d60ee839184f50b7d9225969c684625c9ecd0
  SHA512:
- metadata.gz: e4711880bde1d2cd1cb34959f740459979b74ff4d28a671a232f88adbe7473cf67e366fc2b492fac761c572f3a6dfc147a59d46fc08e1c5e18df8ac5f108afdd
- data.tar.gz: c094600c2bd421ce309c0125d60ea82ed0106d5ce4566b3bb8c1aab13c553e7bd2f6651b98029e42ac831b132563b2c502dd1c76defbf8307cd9bd2393b258f7
+ metadata.gz: 30b5fcd92c348c50482cb84542380ad28b317d47e01efad2d2049cd3ba7872c5f66a7a4fde93d8bfa262d7fcb3745dedbdb89a4fd5e6cdf2288a43606ea2361d
+ data.tar.gz: d9c8ba95b2b71f46a3d35e2f5be634473a7aa9f45d9b5924e1019dcb53c7cea6aca8c594f9f4d3bf85a14176c3cb98a7167b4eb6e1f6f2192059d8040440e4a9
checksums.yaml.gz.sig CHANGED
Binary file
data/CHANGELOG.md CHANGED
@@ -1,5 +1,20 @@
  # Karafka framework changelog

+ ## 2.1.8 (2023-07-29)
+ - [Improvement] Introduce `Karafka::BaseConsumer#used?` to indicate that at least one invocation of `#consume` took place or will take place. This can be used as a replacement for the indirect `messages.count` check during shutdown and revocation to ensure that consumption took place or is taking place (in case of a running LRJ).
+ - [Improvement] Make `messages#to_a` return a copy of the underlying array to prevent scenarios where mutation impacts offset management.
+ - [Improvement] Mitigate a librdkafka `cooperative-sticky` rebalance crash issue.
+ - [Improvement] Provide the ability to overwrite `consumer_persistence` per subscribed topic. This is mostly useful for plugin and extension developers.
+ - [Fix] Fix a case where the performance tracker would crash when messages were mutated to an empty state.
+
+ ## 2.1.7 (2023-07-22)
+ - [Improvement] Always query for watermarks in the Iterator to improve the initial response time.
+ - [Improvement] Add a `max_wait_time` option to the Iterator.
+ - [Fix] Fix a case where `Admin#read_topic` would wait for the poll interval on non-existing messages instead of exiting early.
+ - [Fix] Fix a case where an Iterator with per-partition negative offset lookups would go below the number of available messages.
+ - [Fix] Remove an unused constant from the Admin module.
+ - [Fix] Add the missing `connection.client.rebalance_callback.error` to the `LoggerListener` instrumentation hook.
+
  ## 2.1.6 (2023-06-29)
  - [Improvement] Provide time support for iterator
  - [Improvement] Provide time support for admin `#read_topic`
@@ -63,7 +78,7 @@
  2. Replace `Karafka::Pro::BaseConsumer` references to `Karafka::BaseConsumer`.
  3. Replace `Karafka::Instrumentation::Vendors::Datadog:Listener` with `Karafka::Instrumentation::Vendors::Datadog::MetricsListener`.

- ## 2.0.41 (2023-14-19)
+ ## 2.0.41 (2023-04-19)
  - **[Feature]** Provide `Karafka::Pro::Iterator` for anonymous topic/partitions iterations and messages lookups (#1389 and #1427).
  - [Improvement] Optimize topic lookup for `read_topic` admin method usage.
  - [Improvement] Report via `LoggerListener` information about the partition on which a given job has started and finished.
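As context for the `#used?` entry above, a minimal consumer sketch showing how the flag can replace the indirect `messages.count` check; the `EventsConsumer` class and its teardown logic are hypothetical, not part of the gem.

```ruby
# Hypothetical consumer illustrating the 2.1.8 `#used?` helper
class EventsConsumer < Karafka::BaseConsumer
  def consume
    messages.each { |message| puts message.payload }
  end

  def shutdown
    # Previously one might have checked `messages.count.positive?` to guess
    # whether consumption ever ran; `used?` states it directly, also for LRJ
    return unless used?

    # ... teardown that only makes sense after actual consumption ...
  end
end
```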
data/Gemfile.lock CHANGED
@@ -1,7 +1,7 @@
  PATH
  remote: .
  specs:
- karafka (2.1.6)
+ karafka (2.1.8)
  karafka-core (>= 2.1.1, < 2.2.0)
  thor (>= 0.20)
  waterdrop (>= 2.6.2, < 3.0.0)
@@ -10,10 +10,10 @@ PATH
  GEM
  remote: https://rubygems.org/
  specs:
- activejob (7.0.5)
- activesupport (= 7.0.5)
+ activejob (7.0.6)
+ activesupport (= 7.0.6)
  globalid (>= 0.3.6)
- activesupport (7.0.5)
+ activesupport (7.0.6)
  concurrent-ruby (~> 1.0, >= 1.0.2)
  i18n (>= 1.6, < 2)
  minitest (>= 5.1)
@@ -33,21 +33,21 @@ GEM
  karafka-core (2.1.1)
  concurrent-ruby (>= 1.1)
  karafka-rdkafka (>= 0.13.1, < 0.14.0)
- karafka-rdkafka (0.13.1)
+ karafka-rdkafka (0.13.3)
  ffi (~> 1.15)
  mini_portile2 (~> 2.6)
  rake (> 12)
- karafka-web (0.6.1)
+ karafka-web (0.6.3)
  erubi (~> 1.4)
  karafka (>= 2.1.4, < 3.0.0)
  karafka-core (>= 2.0.13, < 3.0.0)
  roda (~> 3.68, >= 3.68)
  tilt (~> 2.0)
- mini_portile2 (2.8.2)
+ mini_portile2 (2.8.4)
  minitest (5.18.1)
  rack (3.0.8)
  rake (13.0.6)
- roda (3.69.0)
+ roda (3.70.0)
  rack
  rspec (3.12.0)
  rspec-core (~> 3.12.0)
@@ -58,10 +58,10 @@ GEM
  rspec-expectations (3.12.3)
  diff-lcs (>= 1.2.0, < 2.0)
  rspec-support (~> 3.12.0)
- rspec-mocks (3.12.5)
+ rspec-mocks (3.12.6)
  diff-lcs (>= 1.2.0, < 2.0)
  rspec-support (~> 3.12.0)
- rspec-support (3.12.0)
+ rspec-support (3.12.1)
  simplecov (0.22.0)
  docile (~> 1.1)
  simplecov-html (~> 0.11)
@@ -72,8 +72,8 @@ GEM
  tilt (2.2.0)
  tzinfo (2.0.6)
  concurrent-ruby (~> 1.0)
- waterdrop (2.6.2)
- karafka-core (>= 2.1.0, < 3.0.0)
+ waterdrop (2.6.5)
+ karafka-core (>= 2.1.1, < 3.0.0)
  zeitwerk (~> 2.3)
  zeitwerk (2.6.8)

data/lib/karafka/admin.rb CHANGED
@@ -9,11 +9,6 @@ module Karafka
  # @note It always uses the primary defined cluster and does not support multi-cluster work.
  # If you need this, just replace the cluster info for the time you use this
  module Admin
- # A fake admin topic representation that we use for messages fetched using this API
- # We cannot use the topics directly because we may want to request data from topics that we
- # do not have in the routing
- Topic = Struct.new(:name, :deserializer)
-
  # We wait only for this amount of time before raising error as we intercept this error and
  # retry after checking that the operation was finished or failed using external factor.
  MAX_WAIT_TIMEOUT = 1
@@ -37,7 +32,7 @@ module Karafka
  'enable.auto.commit': false
  }.freeze

- private_constant :Topic, :CONFIG_DEFAULTS, :MAX_WAIT_TIMEOUT, :TPL_REQUEST_TIMEOUT,
+ private_constant :CONFIG_DEFAULTS, :MAX_WAIT_TIMEOUT, :TPL_REQUEST_TIMEOUT,
  :MAX_ATTEMPTS

  class << self
@@ -71,7 +66,7 @@ module Karafka
  requested_range = (start_offset..start_offset + (count - 1))
  # Establish theoretical available range. Note, that this does not handle cases related to
  # log retention or compaction
- available_range = (low_offset..high_offset)
+ available_range = (low_offset..(high_offset - 1))
  # Select only offset that we can select. This will remove all the potential offsets that
  # are below the low watermark offset
  possible_range = requested_range.select { |offset| available_range.include?(offset) }
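A short usage sketch of the API this hunk adjusts; the topic name is illustrative. The change caps the theoretical range at `high_offset - 1`, since the high watermark points past the last existing message.

```ruby
# Read (up to) the last 10 messages from partition 0 of a hypothetical topic.
messages = Karafka::Admin.read_topic('example_events', 0, 10)
messages.each { |message| puts "#{message.offset}: #{message.raw_payload}" }
```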
@@ -25,6 +25,7 @@ module Karafka
  # Creates new consumer and assigns it an id
  def initialize
  @id = SecureRandom.hex(6)
+ @used = false
  end

  # Can be used to run preparation code prior to the job being enqueued
@@ -34,6 +35,7 @@ module Karafka
  # not as a part of the public api. This should not perform any extensive operations as it is
  # blocking and running in the listener thread.
  def on_before_enqueue
+ @used = true
  handle_before_enqueue
  rescue StandardError => e
  Karafka.monitor.instrument(
@@ -160,6 +162,14 @@ module Karafka
  # some teardown procedures (closing file handler, etc).
  def shutdown; end

+ # @return [Boolean] was this consumer in active use. Active use means running `#consume` at
+ #   least once. Consumer may have to run `#revoked` or `#shutdown` despite not running
+ #   `#consume` previously in delayed job cases and other cases that potentially involve running
+ #   the `Jobs::Idle` for house-keeping
+ def used?
+   @used
+ end
+
  # Pauses processing on a given offset for the current topic partition
  #
  # After given partition is resumed, it will continue processing from the given offset
@@ -23,11 +23,17 @@ module Karafka
  # Max time for a TPL request. We increase it to compensate for remote clusters latency
  TPL_REQUEST_TIMEOUT = 2_000

+ # 1 minute of max wait for the first rebalance before a forceful attempt
+ # This applies only to a case when a short-lived Karafka instance with a client would be
+ # closed before first rebalance. Mitigates a librdkafka bug.
+ COOPERATIVE_STICKY_MAX_WAIT = 60_000
+
  # We want to make sure we never close several clients in the same moment to prevent
  # potential race conditions and other issues
  SHUTDOWN_MUTEX = Mutex.new

- private_constant :MAX_POLL_RETRIES, :SHUTDOWN_MUTEX, :TPL_REQUEST_TIMEOUT
+ private_constant :MAX_POLL_RETRIES, :SHUTDOWN_MUTEX, :TPL_REQUEST_TIMEOUT,
+   :COOPERATIVE_STICKY_MAX_WAIT

  # Creates a new consumer instance.
  #
@@ -226,6 +232,22 @@ module Karafka
  # as until all the consumers are stopped, the server will keep running serving only
  # part of the messages
  def stop
+ # This ensures, that we do not stop the underlying client until it passes the first
+ # rebalance for cooperative-sticky. Otherwise librdkafka may crash
+ #
+ # We set a timeout just in case the rebalance would never happen or would last for an
+ # extensive time period.
+ #
+ # @see https://github.com/confluentinc/librdkafka/issues/4312
+ if @subscription_group.kafka[:'partition.assignment.strategy'] == 'cooperative-sticky'
+   (COOPERATIVE_STICKY_MAX_WAIT / 100).times do
+     # If we're past the first rebalance, no need to wait
+     break if @rebalance_manager.active?
+
+     sleep(0.1)
+   end
+ end
+
  close
  end

@@ -30,6 +30,7 @@ module Karafka
  @assigned_partitions = {}
  @revoked_partitions = {}
  @changed = false
+ @active = false
  end

  # Resets the rebalance manager state
@@ -46,11 +47,20 @@ module Karafka
  @changed
  end

+ # @return [Boolean] true if there was at least one rebalance
+ # @note This method is needed to make sure that when using cooperative-sticky, we do not
+ #   close until first rebalance. Otherwise librdkafka may crash.
+ # @see https://github.com/confluentinc/librdkafka/issues/4312
+ def active?
+   @active
+ end
+
  # Callback that kicks in inside of rdkafka, when new partitions are assigned.
  #
  # @private
  # @param partitions [Rdkafka::Consumer::TopicPartitionList]
  def on_partitions_assigned(partitions)
+ @active = true
  @assigned_partitions = partitions.to_h.transform_values { |part| part.map(&:partition) }
  @changed = true
  end
@@ -60,6 +70,7 @@
  # @private
  # @param partitions [Rdkafka::Consumer::TopicPartitionList]
  def on_partitions_revoked(partitions)
+ @active = true
  @revoked_partitions = partitions.to_h.transform_values { |part| part.map(&:partition) }
  @changed = true
  end
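The `active?` flag above, together with the `#stop` guard in the client, only matters when the `cooperative-sticky` assignor is in use. A minimal setup sketch of enabling that assignor; the broker address and client id are placeholders.

```ruby
# karafka.rb (excerpt) - enabling the assignor targeted by the shutdown guard
class KarafkaApp < Karafka::App
  setup do |config|
    config.client_id = 'example_app'
    config.kafka = {
      'bootstrap.servers': '127.0.0.1:9092',
      # librdkafka property; triggers the pre-close rebalance wait shown above
      'partition.assignment.strategy': 'cooperative-sticky'
    }
  end
end
```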
@@ -277,6 +277,9 @@ module Karafka
  when 'connection.client.poll.error'
  error "Data polling error occurred: #{error}"
  error details
+ when 'connection.client.rebalance_callback.error'
+   error "Rebalance callback error occurred: #{error}"
+   error details
  else
  # This should never happen. Please contact the maintainers
  raise Errors::UnsupportedCaseError, event
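The new `when` branch covers rebalance callback failures in the built-in `LoggerListener`. The same error type can also be observed through the instrumentation API; a hedged sketch, with the reporting call being a simple placeholder.

```ruby
# Custom subscription to the error.occurred notification
Karafka.monitor.subscribe('error.occurred') do |event|
  next unless event[:type] == 'connection.client.rebalance_callback.error'

  # event[:error] carries the exception raised inside the rebalance callback
  puts "Rebalance callback failed: #{event[:error].message}"
end
```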
@@ -60,10 +60,12 @@ module Karafka
  @messages_array.size
  end

- # @return [Array<Karafka::Messages::Message>] pure array with messages
+ # @return [Array<Karafka::Messages::Message>] copy of the pure array with messages
  def to_a
- @messages_array
+ @messages_array.dup
  end
+
+ alias count size
  end
  end
  end
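A small sketch of the behavioral change in `#to_a`: mutating the returned array no longer touches the batch the framework uses for offset management. The consumer class and filtering condition are illustrative only.

```ruby
class FilteringConsumer < Karafka::BaseConsumer
  def consume
    # `to_a` now returns a copy, so rejecting elements here does not alter
    # the underlying batch that offset management relies on
    selected = messages.to_a.reject { |message| message.payload.nil? }

    selected.each { |message| puts message.offset }
  end
end
```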
@@ -93,12 +93,27 @@ module Karafka
  next unless partitions.is_a?(Hash)

  partitions.each do |partition, offset|
+ # Care only about numerical offsets
+ #
+ # For time based we already resolve them via librdkafka lookup API
+ next unless offset.is_a?(Integer)
+
+ low_offset, high_offset = @consumer.query_watermark_offsets(name, partition)
+
  # Care only about negative offsets (last n messages)
- next unless offset.is_a?(Integer) && offset.negative?
+ #
+ # We reject the above results but we **NEED** to run the `#query_watermark_offsets`
+ # for each topic partition nonetheless. Without this, librdkafka fetches a lot more
+ # metadata about each topic and each partition and this takes much more time than
+ # just getting watermarks. If we do not run watermark, at least an extra second
+ # is added at the beginning of iterator flow
+ #
+ # This may not be significant when this runs in the background but in case of
+ # using iterator in things like Puma, it heavily impacts the end user experience
+ next unless offset.negative?

- _, high_watermark_offset = @consumer.query_watermark_offsets(name, partition)
  # We add because this offset is negative
- @mapped_topics[name][partition] = high_watermark_offset + offset
+ @mapped_topics[name][partition] = [high_offset + offset, low_offset].max
  end
  end
  end
@@ -39,6 +39,7 @@ module Karafka
  # overwritten, you may want to include `auto.offset.reset` to match your case.
  # @param yield_nil [Boolean] should we yield also `nil` values when poll returns nothing.
  #   Useful in particular for long-living iterators.
+ # @param max_wait_time [Integer] max wait in ms when iterator did not receive any messages
  #
  # @note It is worth keeping in mind, that this API also needs to operate within
  #   `max.poll.interval.ms` limitations on each iteration
@@ -48,7 +49,8 @@
  def initialize(
  topics,
  settings: { 'auto.offset.reset': 'beginning' },
- yield_nil: false
+ yield_nil: false,
+ max_wait_time: 200
  )
  @topics_with_partitions = Expander.new.call(topics)

@@ -62,6 +64,7 @@

  @settings = settings
  @yield_nil = yield_nil
+ @max_wait_time = max_wait_time
  end

  # Iterates over requested topic partitions and yields the results with the iterator itself
@@ -80,7 +83,7 @@
  # Stream data until we reach the end of all the partitions or until the end user
  # indicates that they are done
  until done?
- message = poll(200)
+ message = poll

  # Skip nils if not explicitly required
  next if message.nil? && !@yield_nil
@@ -131,10 +134,9 @@

  private

- # @param timeout [Integer] timeout in ms
  # @return [Rdkafka::Consumer::Message, nil] message or nil if nothing to do
- def poll(timeout)
-   @current_consumer.poll(timeout)
+ def poll
+   @current_consumer.poll(@max_wait_time)
  rescue Rdkafka::RdkafkaError => e
  # End of partition
  if e.code == :partition_eof
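Putting the two Iterator changes together (the per-partition watermark clamp and the new `max_wait_time:` option), a hedged Pro usage sketch; the topic name is made up.

```ruby
# Requires Karafka Pro. Reads roughly the last 5 messages of partition 0;
# thanks to the watermark clamp, a short partition will not underflow.
iterator = Karafka::Pro::Iterator.new(
  { 'example_events' => { 0 => -5 } },
  max_wait_time: 500 # ms to wait per poll when no messages arrive
)

iterator.each do |message|
  puts "#{message.partition}/#{message.offset}: #{message.raw_payload}"
end
```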
@@ -48,7 +48,7 @@ module Karafka
  # We reload the consumers with each batch instead of relying on some external signals
  # when needed for consistency. That way devs may have it on or off and not in this
  # middle state, where re-creation of a consumer instance would occur only sometimes
- @consumer = nil unless ::Karafka::App.config.consumer_persistence
+ @consumer = nil unless topic.consumer_persistence

  # First we build messages batch...
  consumer.messages = Messages::Builders::Messages.call(
@@ -17,6 +17,7 @@ module Karafka
  max_messages
  max_wait_time
  initial_offset
+ consumer_persistence
  ].freeze

  private_constant :INHERITABLE_ATTRIBUTES
@@ -50,7 +51,7 @@ module Karafka

  # @return [Class] consumer class that we should use
  def consumer
- if Karafka::App.config.consumer_persistence
+ if consumer_persistence
  # When persistence of consumers is on, no need to reload them
  @consumer
  else
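With `consumer_persistence` now on the inheritable attributes list, a topic may overwrite it locally. A sketch under the assumption that the routing DSL exposes it the same way as the other inheritable attributes such as `max_messages` and `max_wait_time`; the app, topic, and consumer names are placeholders.

```ruby
class KarafkaApp < Karafka::App
  routes.draw do
    topic :example_events do
      consumer ExampleConsumer
      # Assumed per-topic override; when not set, it falls back to
      # config.consumer_persistence, per the INHERITABLE_ATTRIBUTES list above
      consumer_persistence false
    end
  end
end
```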
@@ -25,6 +25,7 @@ module Karafka
  broker.version.fallback
  builtin.features
  check.crcs
+ client.dns.lookup
  client.id
  client.rack
  closesocket_cb
@@ -161,6 +162,7 @@ module Karafka
  broker.address.ttl
  broker.version.fallback
  builtin.features
+ client.dns.lookup
  client.id
  client.rack
  closesocket_cb
@@ -3,5 +3,5 @@
  # Main module namespace
  module Karafka
  # Current Karafka version
- VERSION = '2.1.6'
+ VERSION = '2.1.8'
  end
data.tar.gz.sig CHANGED
Binary file
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: karafka
  version: !ruby/object:Gem::Version
- version: 2.1.6
+ version: 2.1.8
  platform: ruby
  authors:
  - Maciej Mensfeld
@@ -35,7 +35,7 @@ cert_chain:
  Qf04B9ceLUaC4fPVEz10FyobjaFoY4i32xRto3XnrzeAgfEe4swLq8bQsR3w/EF3
  MGU0FeSV2Yj7Xc2x/7BzLK8xQn5l7Yy75iPF+KP3vVmDHnNl
  -----END CERTIFICATE-----
- date: 2023-06-29 00:00:00.000000000 Z
+ date: 2023-07-29 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: karafka-core
metadata.gz.sig CHANGED
Binary file