karafka 2.1.5 → 2.1.7

Files changed (38)
  1. checksums.yaml +4 -4
  2. checksums.yaml.gz.sig +2 -2
  3. data/CHANGELOG.md +26 -1
  4. data/Gemfile.lock +15 -15
  5. data/karafka.gemspec +2 -2
  6. data/lib/karafka/admin.rb +35 -9
  7. data/lib/karafka/base_consumer.rb +10 -2
  8. data/lib/karafka/connection/client.rb +103 -86
  9. data/lib/karafka/errors.rb +4 -1
  10. data/lib/karafka/instrumentation/logger_listener.rb +3 -0
  11. data/lib/karafka/messages/seek.rb +3 -0
  12. data/lib/karafka/pro/iterator/expander.rb +95 -0
  13. data/lib/karafka/pro/iterator/tpl_builder.rb +160 -0
  14. data/lib/karafka/pro/iterator.rb +9 -92
  15. data/lib/karafka/pro/processing/filters_applier.rb +1 -0
  16. data/lib/karafka/pro/processing/strategies/aj/dlq_ftr_lrj_mom.rb +3 -1
  17. data/lib/karafka/pro/processing/strategies/aj/dlq_ftr_lrj_mom_vp.rb +3 -1
  18. data/lib/karafka/pro/processing/strategies/aj/dlq_lrj_mom.rb +3 -1
  19. data/lib/karafka/pro/processing/strategies/aj/dlq_lrj_mom_vp.rb +3 -1
  20. data/lib/karafka/pro/processing/strategies/aj/ftr_lrj_mom_vp.rb +3 -1
  21. data/lib/karafka/pro/processing/strategies/aj/lrj_mom_vp.rb +4 -1
  22. data/lib/karafka/pro/processing/strategies/dlq/ftr_lrj.rb +2 -2
  23. data/lib/karafka/pro/processing/strategies/dlq/ftr_lrj_mom.rb +2 -2
  24. data/lib/karafka/pro/processing/strategies/dlq/lrj.rb +2 -1
  25. data/lib/karafka/pro/processing/strategies/dlq/lrj_mom.rb +3 -1
  26. data/lib/karafka/pro/processing/strategies/ftr/default.rb +8 -1
  27. data/lib/karafka/pro/processing/strategies/lrj/default.rb +1 -1
  28. data/lib/karafka/pro/processing/strategies/lrj/ftr.rb +2 -2
  29. data/lib/karafka/pro/processing/strategies/lrj/ftr_mom.rb +2 -2
  30. data/lib/karafka/pro/processing/strategies/lrj/mom.rb +3 -1
  31. data/lib/karafka/pro/processing/virtual_offset_manager.rb +1 -1
  32. data/lib/karafka/processing/coordinator.rb +14 -0
  33. data/lib/karafka/railtie.rb +2 -2
  34. data/lib/karafka/setup/attributes_map.rb +2 -0
  35. data/lib/karafka/version.rb +1 -1
  36. data.tar.gz.sig +0 -0
  37. metadata +8 -6
  38. metadata.gz.sig +6 -1
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 82a8b28b55f1db0808be3d1e48616f6b2389466332c9116e263e37cab992fc65
-  data.tar.gz: 2f29bb9bb1c3f949d206c5c8453b35ad163219babb48687e2270e13914e78aba
+  metadata.gz: f405521c7a6706cc95e764a4740e7570935f7595d34481bbe33fb617e5537978
+  data.tar.gz: cd6671c441c07e31050bbddab290ba4d31e4a580a646cfd965edf58c19ff150c
 SHA512:
-  metadata.gz: 93a66f4aeb49cea810bfd90cf424b3334d1dae992035e0bd9613bbd3c42f642f94fd0efd979d57df5083a46f66f522a7d3952c9e24340b8a4dc4c23aff165a0f
-  data.tar.gz: 4ee03b442b3029aecf0ffd636ddccb054e51f2a448c3dd642993464bfc32aa45595f26835db8a9b5b01940ab5b532e0bc22a9a3cdbcc9899320b55010473c749
+  metadata.gz: 7b5e343a0d2c6e1f885c6eac6509de2f411b54e1a30ce12fac6fa18bb813d82ef666444345b92d8348ac4955cdabfc47ad3658312482f6c500ca169814f10517
+  data.tar.gz: 1b0c319f85dde3bc20b21a842da220d513351b436b3e4de08d56e69a02c36c7c2cd4187c879596ffc73f5dffc2cc3f032c6a8cdbd958ce34138866d27aa00b2b
checksums.yaml.gz.sig CHANGED
@@ -1,2 +1,2 @@
 (binary signature contents changed; not human-readable)
data/CHANGELOG.md CHANGED
@@ -1,5 +1,30 @@
 # Karafka framework changelog
 
+## 2.1.7 (2023-07-22)
+- [Improvement] Always query for watermarks in the Iterator to improve the initial response time.
+- [Improvement] Add `max_wait_time` option to the Iterator.
+- [Fix] Fix a case where `Admin#read_topic` would wait for the poll interval on non-existing messages instead of exiting early.
+- [Fix] Fix a case where an Iterator with negative per-partition offset lookups would go below the number of available messages.
+- [Fix] Remove an unused constant from the Admin module.
+- [Fix] Add the missing `connection.client.rebalance_callback.error` case to the `LoggerListener` instrumentation hook.
+
+## 2.1.6 (2023-06-29)
+- [Improvement] Provide time support for the Iterator.
+- [Improvement] Provide time support for admin `#read_topic`.
+- [Improvement] Provide time support for consumer `#seek`.
+- [Improvement] Remove no longer needed locks for client operations.
+- [Improvement] Raise `Karafka::Errors::TopicNotFoundError` when trying to iterate over a non-existing topic.
+- [Improvement] Ensure that Kafka multi-command operations run together under a mutex.
+- [Change] Require `waterdrop` `>= 2.6.2`.
+- [Change] Require `karafka-core` `>= 2.1.1`.
+- [Refactor] Clean up the iterator code.
+- [Fix] Improve performance in the dev environment for a Rails app (juike).
+- [Fix] Rename `InvalidRealOffsetUsage` to `InvalidRealOffsetUsageError` to align with the naming of other errors.
+- [Fix] Fix an unstable spec.
+- [Fix] Fix a case where an automatic `#seek` would overwrite a user's manual seek when running LRJ.
+- [Fix] Make sure that direct user `#seek` and `#pause` operations take precedence over system actions.
+- [Fix] Make sure that `#pause` and `#resume` on one underlying connection do not race-condition.
+
 ## 2.1.5 (2023-06-19)
 - [Improvement] Drastically improve `#revoked?` response quality by checking the real time assignment lost state on librdkafka.
 - [Improvement] Improve eviction of saturated jobs that would run on already revoked assignments.
@@ -46,7 +71,7 @@
 2. Replace `Karafka::Pro::BaseConsumer` references to `Karafka::BaseConsumer`.
 3. Replace `Karafka::Instrumentation::Vendors::Datadog:Listener` with `Karafka::Instrumentation::Vendors::Datadog::MetricsListener`.
 
-## 2.0.41 (2023-14-19)
+## 2.0.41 (2023-04-19)
 - **[Feature]** Provide `Karafka::Pro::Iterator` for anonymous topic/partitions iterations and messages lookups (#1389 and #1427).
 - [Improvement] Optimize topic lookup for `read_topic` admin method usage.
 - [Improvement] Report via `LoggerListener` information about the partition on which a given job has started and finished.
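
All the 2.1.6 time-support entries above follow one pattern: wherever an integer offset was accepted, a `Time` may now be passed and is resolved to the first matching offset. A minimal sketch of what this enables in the Pro Iterator, assuming a Pro license and a hypothetical `events` topic:

```ruby
# Hedged sketch: iterate over roughly the last hour of a hypothetical topic.
require 'karafka'

iterator = Karafka::Pro::Iterator.new(
  # A per-partition Time lookup resolves to the first offset at or after it
  { 'events' => { 0 => Time.now - 60 * 60 } }
)

iterator.each do |message|
  puts "#{message.partition}:#{message.offset} #{message.raw_payload}"
end
```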
data/Gemfile.lock CHANGED
@@ -1,19 +1,19 @@
 PATH
   remote: .
   specs:
-    karafka (2.1.5)
-      karafka-core (>= 2.1.0, < 2.2.0)
+    karafka (2.1.7)
+      karafka-core (>= 2.1.1, < 2.2.0)
       thor (>= 0.20)
-      waterdrop (>= 2.6.1, < 3.0.0)
+      waterdrop (>= 2.6.2, < 3.0.0)
       zeitwerk (~> 2.3)
 
 GEM
   remote: https://rubygems.org/
   specs:
-    activejob (7.0.5)
-      activesupport (= 7.0.5)
+    activejob (7.0.6)
+      activesupport (= 7.0.6)
       globalid (>= 0.3.6)
-    activesupport (7.0.5)
+    activesupport (7.0.6)
       concurrent-ruby (~> 1.0, >= 1.0.2)
       i18n (>= 1.6, < 2)
       minitest (>= 5.1)
@@ -30,14 +30,14 @@ GEM
       activesupport (>= 5.0)
     i18n (1.14.1)
       concurrent-ruby (~> 1.0)
-    karafka-core (2.1.0)
+    karafka-core (2.1.1)
       concurrent-ruby (>= 1.1)
-      karafka-rdkafka (>= 0.13.0, < 0.14.0)
-    karafka-rdkafka (0.13.0)
+      karafka-rdkafka (>= 0.13.1, < 0.14.0)
+    karafka-rdkafka (0.13.3)
       ffi (~> 1.15)
       mini_portile2 (~> 2.6)
       rake (> 12)
-    karafka-web (0.6.0)
+    karafka-web (0.6.1)
       erubi (~> 1.4)
       karafka (>= 2.1.4, < 3.0.0)
       karafka-core (>= 2.0.13, < 3.0.0)
@@ -47,7 +47,7 @@ GEM
     minitest (5.18.1)
     rack (3.0.8)
     rake (13.0.6)
-    roda (3.69.0)
+    roda (3.70.0)
       rack
     rspec (3.12.0)
       rspec-core (~> 3.12.0)
@@ -58,10 +58,10 @@ GEM
     rspec-expectations (3.12.3)
       diff-lcs (>= 1.2.0, < 2.0)
       rspec-support (~> 3.12.0)
-    rspec-mocks (3.12.5)
+    rspec-mocks (3.12.6)
       diff-lcs (>= 1.2.0, < 2.0)
       rspec-support (~> 3.12.0)
-    rspec-support (3.12.0)
+    rspec-support (3.12.1)
     simplecov (0.22.0)
       docile (~> 1.1)
       simplecov-html (~> 0.11)
@@ -72,8 +72,8 @@ GEM
     tilt (2.2.0)
     tzinfo (2.0.6)
       concurrent-ruby (~> 1.0)
-    waterdrop (2.6.1)
-      karafka-core (>= 2.1.0, < 3.0.0)
+    waterdrop (2.6.4)
+      karafka-core (>= 2.1.1, < 3.0.0)
       zeitwerk (~> 2.3)
     zeitwerk (2.6.8)
 
data/karafka.gemspec CHANGED
@@ -21,9 +21,9 @@ Gem::Specification.new do |spec|
     without having to focus on things that are not your business domain.
   DESC
 
-  spec.add_dependency 'karafka-core', '>= 2.1.0', '< 2.2.0'
+  spec.add_dependency 'karafka-core', '>= 2.1.1', '< 2.2.0'
   spec.add_dependency 'thor', '>= 0.20'
-  spec.add_dependency 'waterdrop', '>= 2.6.1', '< 3.0.0'
+  spec.add_dependency 'waterdrop', '>= 2.6.2', '< 3.0.0'
   spec.add_dependency 'zeitwerk', '~> 2.3'
 
   if $PROGRAM_NAME.end_with?('gem')
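
For applications, the tightened lower bounds only matter if `waterdrop` or `karafka-core` are pinned directly. A sketch of compatible Gemfile constraints (ordinarily Bundler resolves these transitively through `karafka`):

```ruby
gem 'karafka', '~> 2.1.7'
# Explicit pins are optional; if present they must not conflict with the
# '>= 2.6.2' / '>= 2.1.1' lower bounds introduced above
gem 'waterdrop', '>= 2.6.2', '< 3.0.0'
gem 'karafka-core', '>= 2.1.1', '< 2.2.0'
```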
data/lib/karafka/admin.rb CHANGED
@@ -9,15 +9,13 @@ module Karafka
   # @note It always uses the primary defined cluster and does not support multi-cluster work.
   #   If you need this, just replace the cluster info for the time you use this
   module Admin
-    # A fake admin topic representation that we use for messages fetched using this API
-    # We cannot use the topics directly because we may want to request data from topics that we
-    # do not have in the routing
-    Topic = Struct.new(:name, :deserializer)
-
     # We wait only for this amount of time before raising error as we intercept this error and
     # retry after checking that the operation was finished or failed using external factor.
     MAX_WAIT_TIMEOUT = 1
 
+    # Max time for a TPL request. We increase it to compensate for remote clusters latency
+    TPL_REQUEST_TIMEOUT = 2_000
+
     # How many times should we retry. 1 x 60 => 60 seconds wait in total
     MAX_ATTEMPTS = 60
 
@@ -34,7 +32,8 @@ module Karafka
       'enable.auto.commit': false
     }.freeze
 
-    private_constant :Topic, :CONFIG_DEFAULTS, :MAX_WAIT_TIMEOUT, :MAX_ATTEMPTS
+    private_constant :CONFIG_DEFAULTS, :MAX_WAIT_TIMEOUT, :TPL_REQUEST_TIMEOUT,
+                     :MAX_ATTEMPTS
 
     class << self
       # Allows us to read messages from the topic
@@ -42,8 +41,9 @@
       # @param name [String, Symbol] topic name
       # @param partition [Integer] partition
       # @param count [Integer] how many messages we want to get at most
-      # @param start_offset [Integer] offset from which we should start. If -1 is provided
-      #   (default) we will start from the latest offset
+      # @param start_offset [Integer, Time] offset from which we should start. If -1 is provided
+      #   (default) we will start from the latest offset. If time is provided, the appropriate
+      #   offset will be resolved.
       # @param settings [Hash] kafka extra settings (optional)
      #
      # @return [Array<Karafka::Messages::Message>] array with messages
@@ -53,6 +53,9 @@
         low_offset, high_offset = nil
 
         with_consumer(settings) do |consumer|
+          # Convert the time offset (if needed)
+          start_offset = resolve_offset(consumer, name.to_s, partition, start_offset)
+
           low_offset, high_offset = consumer.query_watermark_offsets(name, partition)
 
           # Select offset dynamically if -1 or less
@@ -63,7 +66,7 @@
           requested_range = (start_offset..start_offset + (count - 1))
           # Establish the theoretically available range. Note that this does not handle cases
           # related to log retention or compaction
-          available_range = (low_offset..high_offset)
+          available_range = (low_offset..(high_offset - 1))
           # Select only offsets that we can read. This will remove all the potential offsets
           # that are below the low watermark offset
           possible_range = requested_range.select { |offset| available_range.include?(offset) }
@@ -243,6 +246,29 @@
 
         ::Rdkafka::Config.new(config_hash)
       end
+
+      # Resolves the offset if the offset is in a time format. Otherwise returns the offset
+      # without resolving.
+      # @param consumer [::Rdkafka::Consumer]
+      # @param name [String, Symbol] expected topic name
+      # @param partition [Integer]
+      # @param offset [Integer, Time]
+      # @return [Integer] expected offset
+      def resolve_offset(consumer, name, partition, offset)
+        if offset.is_a?(Time)
+          tpl = ::Rdkafka::Consumer::TopicPartitionList.new
+          tpl.add_topic_and_partitions_with_offsets(
+            name, partition => offset
+          )
+
+          real_offsets = consumer.offsets_for_times(tpl, TPL_REQUEST_TIMEOUT)
+          detected_offset = real_offsets.to_h.dig(name, partition)
+
+          detected_offset&.offset || raise(Errors::InvalidTimeBasedOffsetError)
+        else
+          offset
+        end
+      end
     end
   end
 end
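
With `resolve_offset` in place, `Admin#read_topic` accepts a `Time` as its `start_offset` and converts it before the watermark math runs; per the code above, an unresolvable time raises `Errors::InvalidTimeBasedOffsetError`. A hedged usage sketch (topic name hypothetical):

```ruby
# Up to 10 messages from partition 0 of 'events', starting at the first
# offset produced at or after 15 minutes ago
messages = Karafka::Admin.read_topic('events', 0, 10, Time.now - 15 * 60)

messages.each { |message| puts message.raw_payload }
```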
data/lib/karafka/base_consumer.rb CHANGED
@@ -70,6 +70,7 @@ module Karafka
     #
     # @return [Boolean] true if there was no exception, otherwise false.
     #
+    # @private
     # @note We keep the seek offset tracking, and use it to compensate for async offset flushing
     #   that may not yet kick in when an error occurs. That way we always pause on the last
     #   processed message.
@@ -203,8 +204,15 @@
 
     # Seeks in the context of the current topic and partition
     #
-    # @param offset [Integer] offset where we want to seek
-    def seek(offset)
+    # @param offset [Integer, Time] offset to which we want to seek, or the time of the offset
+    #   to which we want to seek.
+    # @param manual_seek [Boolean] Flag to differentiate between a user seek and a
+    #   system/strategy based seek. User seek operations should take precedence over system
+    #   actions, hence we need to know who invoked it.
+    # @note Please note that if you are seeking to a time offset, getting the offset is blocking.
+    def seek(offset, manual_seek = true)
+      coordinator.manual_seek if manual_seek
+
       client.seek(
         Karafka::Messages::Seek.new(
           topic.name,
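
Inside a consumer, the reworked `#seek` therefore takes either an Integer offset or a `Time`, and marks itself on the coordinator as a manual seek so it wins over strategy-driven seeks such as the automatic LRJ one. A hedged sketch (topic and failure handling hypothetical):

```ruby
class EventsConsumer < Karafka::BaseConsumer
  def consume
    messages.each { |message| persist(message) }
  rescue StandardError
    # Rewind this topic partition five minutes back; resolving the Time to a
    # real offset blocks. manual_seek defaults to true, so this user seek
    # takes precedence over any automatic strategy seek.
    seek(Time.now - 5 * 60)
  end

  private

  # Hypothetical domain logic
  def persist(message); end
end
```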
data/lib/karafka/connection/client.rb CHANGED
@@ -20,11 +20,14 @@ module Karafka
     # How many times should we retry polling in case of a failure
     MAX_POLL_RETRIES = 20
 
+    # Max time for a TPL request. We increase it to compensate for remote clusters latency
+    TPL_REQUEST_TIMEOUT = 2_000
+
     # We want to make sure we never close several clients in the same moment to prevent
     # potential race conditions and other issues
     SHUTDOWN_MUTEX = Mutex.new
 
-    private_constant :MAX_POLL_RETRIES, :SHUTDOWN_MUTEX
+    private_constant :MAX_POLL_RETRIES, :SHUTDOWN_MUTEX, :TPL_REQUEST_TIMEOUT
 
     # Creates a new consumer instance.
     #
@@ -35,12 +38,16 @@ module Karafka
       @id = SecureRandom.hex(6)
       # Name is set when we build consumer
       @name = ''
-      @mutex = Mutex.new
       @closed = false
       @subscription_group = subscription_group
       @buffer = RawMessagesBuffer.new
       @rebalance_manager = RebalanceManager.new
       @kafka = build_consumer
+      # There are a few operations that can happen in parallel from the listener threads as
+      # well as from the workers. They are not fully thread-safe because they may be composed
+      # of a few calls to Kafka or a few internal state changes. That is why we mutex them.
+      # It mostly revolves around pausing and resuming.
+      @mutex = Mutex.new
       # We need to keep track of what we have paused for resuming
       # In case we lose a partition, we still need to resume it, otherwise it won't be fetched
       # again if we get reassigned to it later on. We need to keep them as after revocation we
@@ -101,16 +108,12 @@
     #
     # @param message [Karafka::Messages::Message]
     def store_offset(message)
-      @mutex.synchronize do
-        internal_store_offset(message)
-      end
+      internal_store_offset(message)
     end
 
     # @return [Boolean] true if our current assignment has been lost involuntarily.
     def assignment_lost?
-      @mutex.synchronize do
-        @kafka.assignment_lost?
-      end
+      @kafka.assignment_lost?
     end
 
     # Commits the offset on a current consumer in a non-blocking or blocking way.
@@ -127,11 +130,7 @@
     #   it does **not** resolve to `lost_assignment?`. It returns only the commit state operation
     #   result.
     def commit_offsets(async: true)
-      @mutex.lock
-
       internal_commit_offsets(async: async)
-    ensure
-      @mutex.unlock
     end
 
     # Commits offset in a synchronous way.
@@ -144,13 +143,11 @@
     # Seek to a particular message. The next poll on the topic/partition will return the
     # message at the given offset.
     #
-    # @param message [Messages::Message, Messages::Seek] message to which we want to seek
+    # @param message [Messages::Message, Messages::Seek] message to which we want to seek.
+    #   It can have a time based offset.
+    # @note Please note that if you are seeking to a time offset, getting the offset is blocking.
     def seek(message)
-      @mutex.lock
-
-      @kafka.seek(message)
-    ensure
-      @mutex.unlock
+      @mutex.synchronize { internal_seek(message) }
     end
 
     # Pauses given partition and moves back to last successful offset processed.
@@ -161,37 +158,34 @@
     #   be reprocessed after getting back to processing)
     # @note This will pause indefinitely and requires manual `#resume`
     def pause(topic, partition, offset)
-      @mutex.lock
-
-      # Do not pause if the client got closed, would not change anything
-      return if @closed
-
-      pause_msg = Messages::Seek.new(topic, partition, offset)
+      @mutex.synchronize do
+        # Do not pause if the client got closed, would not change anything
+        return if @closed
 
-      internal_commit_offsets(async: true)
+        pause_msg = Messages::Seek.new(topic, partition, offset)
 
-      # Here we do not use our cached tpls because we should not try to pause something we do
-      # not own anymore.
-      tpl = topic_partition_list(topic, partition)
+        internal_commit_offsets(async: true)
 
-      return unless tpl
+        # Here we do not use our cached tpls because we should not try to pause something we do
+        # not own anymore.
+        tpl = topic_partition_list(topic, partition)
 
-      Karafka.monitor.instrument(
-        'client.pause',
-        caller: self,
-        subscription_group: @subscription_group,
-        topic: topic,
-        partition: partition,
-        offset: offset
-      )
+        return unless tpl
 
-      @paused_tpls[topic][partition] = tpl
+        Karafka.monitor.instrument(
+          'client.pause',
+          caller: self,
+          subscription_group: @subscription_group,
+          topic: topic,
+          partition: partition,
+          offset: offset
+        )
 
-      @kafka.pause(tpl)
+        @paused_tpls[topic][partition] = tpl
 
-      @kafka.seek(pause_msg)
-    ensure
-      @mutex.unlock
+        @kafka.pause(tpl)
+        internal_seek(pause_msg)
+      end
     end
 
     # Resumes processing of a given topic partition after it was paused.
@@ -199,33 +193,31 @@
     # @param topic [String] topic name
     # @param partition [Integer] partition
     def resume(topic, partition)
-      @mutex.lock
-
-      return if @closed
+      @mutex.synchronize do
+        return if @closed
 
-      # We now commit offsets on rebalances, thus we can do it async just to make sure
-      internal_commit_offsets(async: true)
+        # We now commit offsets on rebalances, thus we can do it async just to make sure
+        internal_commit_offsets(async: true)
 
-      # If we were not able, let's try to reuse the one we have (if we have)
-      tpl = topic_partition_list(topic, partition) || @paused_tpls[topic][partition]
+        # If we were not able, let's try to reuse the one we have (if we have)
+        tpl = topic_partition_list(topic, partition) || @paused_tpls[topic][partition]
 
-      return unless tpl
+        return unless tpl
 
-      # If we did not have it, it means we never paused this partition, thus no resume should
-      # happen in the first place
-      return unless @paused_tpls[topic].delete(partition)
+        # If we did not have it, it means we never paused this partition, thus no resume should
+        # happen in the first place
+        return unless @paused_tpls[topic].delete(partition)
 
-      Karafka.monitor.instrument(
-        'client.resume',
-        caller: self,
-        subscription_group: @subscription_group,
-        topic: topic,
-        partition: partition
-      )
+        Karafka.monitor.instrument(
+          'client.resume',
+          caller: self,
+          subscription_group: @subscription_group,
+          topic: topic,
+          partition: partition
+        )
 
-      @kafka.resume(tpl)
-    ensure
-      @mutex.unlock
+        @kafka.resume(tpl)
+      end
     end
 
     # Gracefully stops topic consumption.
@@ -262,11 +254,9 @@
     def reset
       close
 
-      @mutex.synchronize do
-        @closed = false
-        @paused_tpls.clear
-        @kafka = build_consumer
-      end
+      @closed = false
+      @paused_tpls.clear
+      @kafka = build_consumer
     end
 
     # Runs a single poll ignoring all the potential errors
@@ -323,28 +313,55 @@
       raise e
     end
 
+    # Non-mutexed seek that should be used only internally. Outside, we expose `#seek` that is
+    # wrapped with a mutex.
+    #
+    # @param message [Messages::Message, Messages::Seek] message to which we want to seek.
+    #   It can have a time based offset.
+    def internal_seek(message)
+      # If the seek message offset is in a time format, we need to find the closest "real"
+      # offset matching it before we seek
+      if message.offset.is_a?(Time)
+        tpl = ::Rdkafka::Consumer::TopicPartitionList.new
+        tpl.add_topic_and_partitions_with_offsets(
+          message.topic,
+          message.partition => message.offset
+        )
+
+        # Now we can overwrite the seek message offset with our resolved offset and we can
+        # then seek to the appropriate message
+        # We set the timeout to 2_000 to make sure that remote clusters handle this well
+        real_offsets = @kafka.offsets_for_times(tpl, TPL_REQUEST_TIMEOUT)
+        detected_partition = real_offsets.to_h.dig(message.topic, message.partition)
+
+        # There always needs to be an offset. In case we seek into the future, where there
+        # are no offsets yet, we get -1, which indicates the most recent offset.
+        # We should always detect an offset, whether it is 0, -1 or a corresponding one
+        message.offset = detected_partition&.offset || raise(Errors::InvalidTimeBasedOffsetError)
+      end
+
+      @kafka.seek(message)
+    end
+
     # Commits the stored offsets in a sync way and closes the consumer.
     def close
       # Allow only one client to be closed at the same time
       SHUTDOWN_MUTEX.synchronize do
-        # Make sure that no other operations are happening on this client when we close it
-        @mutex.synchronize do
-          # Once client is closed, we should not close it again
-          # This could only happen in case of a race-condition when forceful shutdown happens
-          # and triggers this from a different thread
-          return if @closed
-
-          @closed = true
-
-          # Remove callbacks runners that were registered
-          ::Karafka::Core::Instrumentation.statistics_callbacks.delete(@subscription_group.id)
-          ::Karafka::Core::Instrumentation.error_callbacks.delete(@subscription_group.id)
-
-          @kafka.close
-          @buffer.clear
-          # @note We do not clear rebalance manager here as we may still have revocation info
-          #   here that we want to consider valid prior to running another reconnection
-        end
+        # Once client is closed, we should not close it again
+        # This could only happen in case of a race-condition when forceful shutdown happens
+        # and triggers this from a different thread
+        return if @closed
+
+        @closed = true
+
+        # Remove callbacks runners that were registered
+        ::Karafka::Core::Instrumentation.statistics_callbacks.delete(@subscription_group.id)
+        ::Karafka::Core::Instrumentation.error_callbacks.delete(@subscription_group.id)
+
+        @kafka.close
+        @buffer.clear
+        # @note We do not clear rebalance manager here as we may still have revocation info
+        #   here that we want to consider valid prior to running another reconnection
      end
    end
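
The split between the mutexed public `#seek` and the lock-free `internal_seek` is deliberate: Ruby's `Mutex` is not re-entrant, so `#pause`, which already holds `@mutex`, must not call the public `#seek` again. A standalone illustration of that constraint (not Karafka code):

```ruby
mutex = Mutex.new

# Public-style API: takes the lock itself
seek = -> { mutex.synchronize { puts 'seeking' } }

# Internal-style API: assumes the caller already holds the lock
internal_seek = -> { puts 'seeking (lock held by caller)' }

mutex.synchronize do
  internal_seek.call # fine: no second lock acquisition
  # seek.call here would raise ThreadError ("deadlock; recursive locking"),
  # because Ruby's Mutex cannot be re-locked by the thread that owns it
end
```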
data/lib/karafka/errors.rb CHANGED
@@ -48,6 +48,9 @@ module Karafka
     StrategyNotFoundError = Class.new(BaseError)
 
     # This should never happen. Please open an issue if it does.
-    InvalidRealOffsetUsage = Class.new(BaseError)
+    InvalidRealOffsetUsageError = Class.new(BaseError)
+
+    # This should never happen. Please open an issue if it does.
+    InvalidTimeBasedOffsetError = Class.new(BaseError)
   end
 end
data/lib/karafka/instrumentation/logger_listener.rb CHANGED
@@ -277,6 +277,9 @@ module Karafka
       when 'connection.client.poll.error'
         error "Data polling error occurred: #{error}"
         error details
+      when 'connection.client.rebalance_callback.error'
+        error "Rebalance callback error occurred: #{error}"
+        error details
       else
         # This should never happen. Please contact the maintainers
         raise Errors::UnsupportedCaseError, event
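
The new `when` branch only affects the built-in logger. Karafka publishes these errors through the `error.occurred` instrumentation event with a `type` field, so an application listener can react to the same condition; a hedged sketch (`ErrorTracker` is hypothetical):

```ruby
Karafka.monitor.subscribe('error.occurred') do |event|
  next unless event[:type] == 'connection.client.rebalance_callback.error'

  ErrorTracker.notify(event[:error]) # hypothetical error-reporting hook
end
```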
data/lib/karafka/messages/seek.rb CHANGED
@@ -4,6 +4,9 @@ module Karafka
   module Messages
     # "Fake" message that we use as an abstraction layer when seeking back.
     # This allows us to encapsulate a seek with a simple abstraction
+    #
+    # @note `#offset` can be either the offset value or the time of the offset
+    #   (first equal or greater)
     Seek = Struct.new(:topic, :partition, :offset)
   end
 end
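
Since `Seek` is the payload exchanged between the consumer's `#seek`, the client's `#pause`, and `internal_seek`, both value kinds now travel through it; the `Time` form stays in the struct until `internal_seek` resolves it. Illustrative only (topic name hypothetical):

```ruby
by_offset = Karafka::Messages::Seek.new('events', 0, 42)
by_time   = Karafka::Messages::Seek.new('events', 0, Time.now - 60 * 60)
```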