karafka 2.2.9 → 2.2.10
- checksums.yaml +4 -4
- checksums.yaml.gz.sig +0 -0
- data/.github/ISSUE_TEMPLATE/bug_report.md +10 -9
- data/.github/workflows/ci.yml +3 -1
- data/CHANGELOG.md +19 -9
- data/Gemfile.lock +9 -9
- data/lib/karafka/base_consumer.rb +12 -7
- data/lib/karafka/connection/client.rb +37 -5
- data/lib/karafka/instrumentation/logger_listener.rb +5 -1
- data/lib/karafka/pro/processing/strategies/aj/lrj_mom_vp.rb +1 -1
- data/lib/karafka/pro/processing/strategies/lrj/default.rb +4 -6
- data/lib/karafka/pro/processing/strategies/lrj/mom.rb +1 -6
- data/lib/karafka/pro/routing/features/patterns/consumer_group.rb +4 -0
- data/lib/karafka/version.rb +1 -1
- data/renovate.json +4 -1
- data.tar.gz.sig +0 -0
- metadata +2 -2
- metadata.gz.sig +0 -0
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 460029e2dbc352c56f107e18bbff28f5914b5b56a0e42be9670e13f0f3592fc9
+  data.tar.gz: 68c25bacd1995a8e9c97a915e1e1b2b4435437c97a5e31e2824499c9ad531057
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: adc877c3be3cb8885b0244168014387f0353bed89575b6931f5e1ca062d01ba6a0e5a0566da4dd282f5117c7daf27ff2116f27025ec4213972c1ca52e92980de
+  data.tar.gz: a62ff96ef96ea85d98288bad3c8f6cf0bfafc1ee6de1cf928fe803ad80435b0fa137aac9cf5eff4b232e854b3c4df76c3c16efaa4a117d6b86e442975cfc9a30

checksums.yaml.gz.sig
CHANGED
Binary file
data/.github/ISSUE_TEMPLATE/bug_report.md
CHANGED

@@ -37,14 +37,15 @@ Here's an example:

 ```
 $ [bundle exec] karafka info
-Karafka version:
-Ruby version: 2.
-
-
-
-
-
-Boot file: /app/karafka
+Karafka version: 2.2.10 + Pro
+Ruby version: ruby 3.2.2 (2023-03-30 revision e51014f9c0) [x86_64-linux]
+Rdkafka version: 0.13.8
+Consumer groups count: 2
+Subscription groups count: 2
+Workers count: 2
+Application client id: example_app
+Boot file: /app/karafka.rb
 Environment: development
-
+License: Commercial
+License entity: karafka-ci
 ```
data/.github/workflows/ci.yml
CHANGED
data/CHANGELOG.md
CHANGED
@@ -1,5 +1,15 @@
 # Karafka framework changelog

+## 2.2.10 (2023-11-02)
+- [Improvement] Allow for running `#pause` without specifying the offset (provide an offset or `:consecutive`). This allows for pausing on the consecutive message (last received + 1), so after resume we will get last received + 1, effectively not using `#seek` and not purging the `librdkafka` buffer, saving on networking. Please be mindful that this uses the notion of the last message passed from **librdkafka**, and not the last one available in the consumer (`messages.last`). While for regular cases they will be the same, when using things like DLQ, LRJs, VPs or the Filtering API, they may not be.
+- [Improvement] **Drastically** improve the network efficiency of operating with LRJ by using the `:consecutive` offset as the default strategy for running LRJs, without moving the offset in place and purging the data.
+- [Improvement] Do not "seek in place". When pausing and/or seeking to the same location as the current position, do nothing, so we neither purge buffers nor move to the place where we already are.
+- [Fix] Pattern regexps should not be part of declaratives even when configured.
+
+### Upgrade Notes
+
+In the latest Karafka release, there are no breaking changes. However, please note the updates to `#pause` and `#seek`. If you spot any issues, please report them immediately. Your feedback is crucial.
+
 ## 2.2.9 (2023-10-24)
 - [Improvement] Allow using negative offset references in `Karafka::Admin#read_topic`.
 - [Change] Make sure that WaterDrop `2.6.10` or higher is used with this release to support transactions fully and the Web-UI.
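For context, a minimal sketch of how the new `:consecutive` pause can be used from a consumer. This is an illustration, not code from the gem; the consumer class and its processing are hypothetical:

```ruby
class EventsConsumer < Karafka::BaseConsumer
  def consume
    messages.each { |message| puts message.payload }

    # Pause this topic partition for 2 seconds on the consecutive offset
    # (last received + 1). Because no explicit offset is given, Karafka
    # skips `#seek`, so librdkafka keeps its prefetch buffer intact.
    pause(:consecutive, 2_000)
  end
end
```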
@@ -11,7 +21,7 @@
 - [Refactor] Reorganize how rebalance events are propagated from `librdkafka` to Karafka. Replace `connection.client.rebalance_callback` with `rebalance.partitions_assigned` and `rebalance.partitions_revoked`. Introduce two extra events: `rebalance.partitions_assign` and `rebalance.partitions_revoke` to handle pre-rebalance future work.
 - [Refactor] Remove `thor` as a CLI layer and rely on Ruby `OptParser`

-### Upgrade
+### Upgrade Notes

 1. Unless you were using `connection.client.rebalance_callback` which was considered private, nothing.
 2. None of the CLI commands should change but `thor` has been removed so please report if you find any bugs.
@@ -53,7 +63,7 @@
 - [Enhancement] Make sure that consumer group used by `Karafka::Admin` obeys the `ConsumerMapper` setup.
 - [Fix] Fix a case where subscription group would not accept a symbol name.

-### Upgrade
+### Upgrade Notes

 As always, please make sure you have upgraded to the most recent version of `2.1` before upgrading to `2.2`.

@@ -184,7 +194,7 @@ If you want to maintain the `2.1` behavior, that is `karafka_admin` admin group,
 - [Change] Removed `Karafka::Pro::BaseConsumer` in favor of `Karafka::BaseConsumer`. (#1345)
 - [Fix] Fix for `max_messages` and `max_wait_time` not having reference in errors.yml (#1443)

-### Upgrade
+### Upgrade Notes

 1. Upgrade to Karafka `2.0.41` prior to upgrading to `2.1.0`.
 2. Replace `Karafka::Pro::BaseConsumer` references to `Karafka::BaseConsumer`.
@@ -256,7 +266,7 @@ If you want to maintain the `2.1` behavior, that is `karafka_admin` admin group,
 - [Improvement] Attach an `embedded` tag to Karafka processes started using the embedded API.
 - [Change] Renamed `Datadog::Listener` to `Datadog::MetricsListener` for consistency. (#1124)

-### Upgrade
+### Upgrade Notes

 1. Replace `Datadog::Listener` references to `Datadog::MetricsListener`.

@@ -270,7 +280,7 @@ If you want to maintain the `2.1` behavior, that is `karafka_admin` admin group,
 - [Improvement] Introduce a `strict_topics_namespacing` config option to enable/disable the strict topics naming validations. This can be useful when working with pre-existing topics which we cannot or do not want to rename.
 - [Fix] Karafka monitor is prematurely cached (#1314)

-### Upgrade
+### Upgrade Notes

 Since `#tags` were introduced on consumers, the `#tags` method is now part of the consumers API.

@@ -332,7 +342,7 @@ end
 - [Improvement] Include `original_consumer_group` in the DLQ dispatched messages in Pro.
 - [Improvement] Use Karafka `client_id` as kafka `client.id` value by default

-### Upgrade
+### Upgrade Notes

 If you want to continue to use `karafka` as default for kafka `client.id`, assign it manually:

@@ -385,7 +395,7 @@ class KarafkaApp < Karafka::App
 - [Fix] Do not trigger code reloading when `consumer_persistence` is enabled.
 - [Fix] Shutdown producer after all the consumer components are down and the status is stopped. This will ensure, that any instrumentation related Kafka messaging can still operate.

-### Upgrade
+### Upgrade Notes

 If you want to disable `librdkafka` statistics because you do not use them at all, update the `kafka` `statistics.interval.ms` setting and set it to `0`:

@@ -447,7 +457,7 @@ end
 - [Fix] Few typos around DLQ and Pro DLQ Dispatch original metadata naming.
 - [Fix] Narrow the components lookup to the appropriate scope (#1114)

-### Upgrade
+### Upgrade Notes

 1. Replace `original-*` references from DLQ dispatched metadata with `original_*`

@@ -490,7 +500,7 @@ end
 - [Specs] Split specs into regular and pro to simplify how resources are loaded
 - [Specs] Add specs to ensure, that all the Pro components have a proper per-file license (#1099)

-### Upgrade
+### Upgrade Notes

 1. Remove the `manual_offset_management` setting from the main config if you use it:

data/Gemfile.lock
CHANGED
@@ -1,7 +1,7 @@
 PATH
   remote: .
   specs:
-    karafka (2.2.9)
+    karafka (2.2.10)
       karafka-core (>= 2.2.2, < 2.3.0)
       waterdrop (>= 2.6.10, < 3.0.0)
       zeitwerk (~> 2.3)
@@ -39,24 +39,24 @@ GEM
       activesupport (>= 6.1)
     i18n (1.14.1)
       concurrent-ruby (~> 1.0)
-    karafka-core (2.2.
+    karafka-core (2.2.5)
       concurrent-ruby (>= 1.1)
-      karafka-rdkafka (>= 0.13.
-    karafka-rdkafka (0.13.
+      karafka-rdkafka (>= 0.13.8, < 0.15.0)
+    karafka-rdkafka (0.13.8)
       ffi (~> 1.15)
       mini_portile2 (~> 2.6)
       rake (> 12)
-    karafka-web (0.7.
+    karafka-web (0.7.10)
       erubi (~> 1.4)
-      karafka (>= 2.2.
-      karafka-core (>= 2.2.
+      karafka (>= 2.2.9, < 3.0.0)
+      karafka-core (>= 2.2.4, < 3.0.0)
       roda (~> 3.68, >= 3.69)
       tilt (~> 2.0)
     mini_portile2 (2.8.5)
     minitest (5.20.0)
     mutex_m (0.1.2)
     rack (3.0.8)
-    rake (13.0
+    rake (13.1.0)
     roda (3.73.0)
       rack
     rspec (3.12.0)
@@ -82,7 +82,7 @@ GEM
     tilt (2.3.0)
     tzinfo (2.0.6)
       concurrent-ruby (~> 1.0)
-    waterdrop (2.6.
+    waterdrop (2.6.11)
       karafka-core (>= 2.2.3, < 3.0.0)
       zeitwerk (~> 2.3)
     zeitwerk (2.6.12)
data/lib/karafka/base_consumer.rb
CHANGED

@@ -171,23 +171,28 @@ module Karafka
       @used
     end

-    # Pauses processing on a given offset for the current topic partition
+    # Pauses processing on a given offset or consecutive offset for the current topic partition
     #
     # After given partition is resumed, it will continue processing from the given offset
-    # @param offset [Integer] offset from which we want to restart the processing
+    # @param offset [Integer, Symbol] offset from which we want to restart the processing or
+    #   `:consecutive` if we want to pause and continue without changing the consecutive offset
+    #   (cursor position)
     # @param timeout [Integer, nil] how long in milliseconds do we want to pause or nil to use the
     #   default exponential pausing strategy defined for retries
     # @param manual_pause [Boolean] Flag to differentiate between user pause and system/strategy
     #   based pause. While they both pause in exactly the same way, the strategy application
     #   may need to differentiate between them.
+    #
+    # @note It is **critical** to understand how pause with the `:consecutive` offset operates.
+    #   While it provides the benefit of not purging the librdkafka buffer, when filters, retries
+    #   or other advanced options are in use, the consecutive offset may not be the one you want
+    #   to pause on. Test it well to ensure that this behaviour is what you expect.
     def pause(offset, timeout = nil, manual_pause = true)
       timeout ? coordinator.pause_tracker.pause(timeout) : coordinator.pause_tracker.pause

-      client.pause(
-        topic.name,
-        partition,
-        offset
-      )
+      offset = nil if offset == :consecutive
+
+      client.pause(topic.name, partition, offset)

       # Indicate, that user took a manual action of pausing
       coordinator.manual_pause if manual_pause
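To illustrate what the `offset = nil if offset == :consecutive` normalization changes in practice, here is a hedged sketch (consumer class and timings invented) contrasting the two pause styles:

```ruby
class OrdersConsumer < Karafka::BaseConsumer
  def consume
    messages.each { |message| puts message.payload }

    # Explicit offset: the client will seek, purging the librdkafka
    # prefetch buffer for this partition before pausing:
    #
    #   pause(messages.last.offset + 1, 5_000)
    #
    # Consecutive: no seek and no purge; after resume, polling continues
    # from last received + 1 as tracked by librdkafka, which may differ
    # from `messages.last.offset + 1` under DLQ, LRJ, VP or Filtering:
    pause(:consecutive, 5_000)
  end
end
```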
data/lib/karafka/connection/client.rb
CHANGED

@@ -160,16 +160,17 @@ module Karafka
      #
      # @param topic [String] topic name
      # @param partition [Integer] partition
-     # @param offset [Integer] offset of the message on which we want to pause (this message
-     #   be reprocessed after getting back to processing)
+     # @param offset [Integer, nil] offset of the message on which we want to pause (this message
+     #   will be reprocessed after getting back to processing) or nil if we want to pause and
+     #   resume from the consecutive offset (+1 from the last message passed to us by librdkafka)
      # @note This will pause indefinitely and requires manual `#resume`
-     def pause(topic, partition, offset)
+     # @note When `#internal_seek` is not involved (when offset is `nil`) we will not purge the
+     #   librdkafka buffers and will continue from the last cursor offset
+     def pause(topic, partition, offset = nil)
        @mutex.synchronize do
          # Do not pause if the client got closed, would not change anything
          return if @closed

-         pause_msg = Messages::Seek.new(topic, partition, offset)
-
          internal_commit_offsets(async: true)

          # Here we do not use our cached tpls because we should not try to pause something we do
@@ -190,6 +191,14 @@ module Karafka
          @paused_tpls[topic][partition] = tpl

          @kafka.pause(tpl)
+
+         # If offset is not provided, will pause where it finished.
+         # This makes librdkafka not purge buffers and can provide significant network savings
+         # when we just want to pause before further processing without changing the offsets
+         return unless offset
+
+         pause_msg = Messages::Seek.new(topic, partition, offset)
+
          internal_seek(pause_msg)
        end
      end
@@ -354,6 +363,9 @@ module Karafka
      #
      # @param message [Messages::Message, Messages::Seek] message to which we want to seek to.
      #   It can have the time based offset.
+     #
+     # @note Will not invoke seeking if the desired seek would lead us to the current position.
+     #   This prevents us from flushing the librdkafka buffer when it is not needed.
      def internal_seek(message)
        # If the seek message offset is in a time format, we need to find the closest "real"
        # offset matching before we seek
@@ -378,6 +390,14 @@ module Karafka
          message.offset = detected_partition&.offset || raise(Errors::InvalidTimeBasedOffsetError)
        end

+       # Never seek if we would get the same location as we would get without seeking
+       # This prevents us from the expensive buffer purges that can lead to increased network
+       # traffic and can cost a lot of money
+       #
+       # This code adds around 0.01 ms per seek but saves users from many unexpected behaviours
+       # in seeking and pausing
+       return if message.offset == topic_partition_position(message.topic, message.partition)
+
        @kafka.seek(message)
      end

@@ -432,6 +452,18 @@ module Karafka
        Rdkafka::Consumer::TopicPartitionList.new({ topic => [rdkafka_partition] })
      end

+     # @param topic [String]
+     # @param partition [Integer]
+     # @return [Integer] current position within topic partition or `-1` if it could not be
+     #   established. It may be `-1` in case we lost the assignment or we did not yet fetch data
+     #   for this topic partition
+     def topic_partition_position(topic, partition)
+       rd_partition = ::Rdkafka::Consumer::Partition.new(partition, nil, 0)
+       tpl = ::Rdkafka::Consumer::TopicPartitionList.new(topic => [rd_partition])
+
+       @kafka.position(tpl).to_h.fetch(topic).first.offset || -1
+     end
+
      # Performs a single poll operation and handles retries and error
      #
      # @param timeout [Integer] timeout for a single poll
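A hedged sketch of what the new seek guard means for user code (offsets and the consumer class are invented): seeking to where the consumer already is becomes a no-op instead of a buffer purge:

```ruby
class MaintenanceConsumer < Karafka::BaseConsumer
  def consume
    # Suppose the last received offset is 100, so the current partition
    # position (the next offset to fetch) is 101.

    # Seek "in place": `#internal_seek` detects that the target equals the
    # current position reported by `#topic_partition_position` and returns
    # early, leaving the librdkafka prefetch buffer intact.
    seek(101)

    # A real seek: the position differs, so rdkafka seeks and the buffered
    # messages for this partition are dropped.
    seek(50)
  end
end
```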
data/lib/karafka/instrumentation/logger_listener.rb
CHANGED

@@ -71,6 +71,8 @@ module Karafka
     # Prints info about a consumer pause occurrence. Irrelevant if user or system initiated.
     #
     # @param event [Karafka::Core::Monitoring::Event] event details including payload
+    # @note There may be no offset provided in case the user wants to pause on the consecutive
+    #   offset position. This can be beneficial when not wanting to purge the buffers.
     def on_client_pause(event)
       topic = event[:topic]
       partition = event[:partition]
@@ -78,7 +80,9 @@ module Karafka
       client = event[:caller]

       info <<~MSG.tr("\n", ' ').strip!
-        [#{client.id}] Pausing on topic #{topic}/#{partition} on offset #{offset}
+        [#{client.id}]
+        Pausing on topic #{topic}/#{partition}
+        on #{offset ? "offset #{offset}" : 'the consecutive offset'}
       MSG
     end

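As a standalone illustration (client id, topic, and partition are made up), the heredoc above collapses into a single log line for both the explicit and the consecutive case:

```ruby
client_id = 'example_app-1'
topic     = 'events'
partition = 0

[42, nil].each do |offset|
  # Same construction as the listener: join heredoc lines with spaces
  puts <<~MSG.tr("\n", ' ').strip!
    [#{client_id}]
    Pausing on topic #{topic}/#{partition}
    on #{offset ? "offset #{offset}" : 'the consecutive offset'}
  MSG
end
# => [example_app-1] Pausing on topic events/0 on offset 42
# => [example_app-1] Pausing on topic events/0 on the consecutive offset
```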
data/lib/karafka/pro/processing/strategies/lrj/default.rb
CHANGED

@@ -35,12 +35,10 @@ module Karafka
           # This ensures that when running LRJ with VP, things operate as expected run only
           # once for all the virtual partitions collectively
           coordinator.on_enqueued do
-            # Pause
-            #
-            #
-
-            # have any edge cases here.
-            pause(coordinator.seek_offset, MAX_PAUSE_TIME, false)
+            # Pause and continue with another batch in case of a regular resume.
+            # In case of an error, the `#retry_after_pause` will move the offset to the first
+            # message out of this batch.
+            pause(:consecutive, MAX_PAUSE_TIME, false)
           end
         end

data/lib/karafka/pro/processing/strategies/lrj/mom.rb
CHANGED

@@ -35,12 +35,7 @@ module Karafka
           # This ensures that when running LRJ with VP, things operate as expected run only
           # once for all the virtual partitions collectively
           coordinator.on_enqueued do
-
-            # loose any messages.
-            #
-            # For VP it applies the same way and since VP cannot be used with MOM we should not
-            # have any edge cases here.
-            pause(coordinator.seek_offset, Strategies::Lrj::Default::MAX_PAUSE_TIME, false)
+            pause(:consecutive, Strategies::Lrj::Default::MAX_PAUSE_TIME, false)
           end
         end

data/lib/karafka/pro/routing/features/patterns/consumer_group.rb
CHANGED

@@ -50,6 +50,10 @@ module Karafka
           virtual_topic = public_send(:topic=, pattern.name, &block)
           # Indicate the nature of this topic (matcher)
           virtual_topic.patterns(active: true, type: :matcher, pattern: pattern)
+          # Pattern subscriptions should never be part of declarative topics definitions.
+          # Since they are subscribed to via regular expressions, we do not know the target
+          # topic names, so we cannot manage them via declaratives.
+          virtual_topic.config(active: false)
           pattern.topic = virtual_topic
           @patterns << pattern
         end
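For context, a hedged routing sketch (regexp, topic, and consumer classes invented) showing where this forced `config(active: false)` applies; regular topics keep participating in declarative topics management, while pattern-based virtual topics are now always excluded:

```ruby
class KarafkaApp < Karafka::App
  routes.draw do
    topic :events do
      consumer EventsConsumer
      # Managed via the `karafka topics` declarative commands
      config(partitions: 6)
    end

    # Matched by regexp, so the target topic names are unknown up front;
    # the virtual topic behind this pattern is excluded from declaratives
    pattern(/\Aorders\.(us|eu)\z/) do
      consumer OrdersConsumer
    end
  end
end
```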
data/lib/karafka/version.rb
CHANGED
data/renovate.json
CHANGED
data.tar.gz.sig
CHANGED
Binary file
metadata
CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: karafka
 version: !ruby/object:Gem::Version
-  version: 2.2.9
+  version: 2.2.10
 platform: ruby
 authors:
 - Maciej Mensfeld
@@ -35,7 +35,7 @@ cert_chain:
   AnG1dJU+yL2BK7vaVytLTstJME5mepSZ46qqIJXMuWob/YPDmVaBF39TDSG9e34s
   msG3BiCqgOgHAnL23+CN3Rt8MsuRfEtoTKpJVcCfoEoNHOkc
   -----END CERTIFICATE-----
-date: 2023-10-24 00:00:00.000000000 Z
+date: 2023-11-02 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: karafka-core
metadata.gz.sig
CHANGED
Binary file