karafka 2.2.9 → 2.2.10
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +4 -4
- checksums.yaml.gz.sig +0 -0
- data/.github/ISSUE_TEMPLATE/bug_report.md +10 -9
- data/.github/workflows/ci.yml +3 -1
- data/CHANGELOG.md +19 -9
- data/Gemfile.lock +9 -9
- data/lib/karafka/base_consumer.rb +12 -7
- data/lib/karafka/connection/client.rb +37 -5
- data/lib/karafka/instrumentation/logger_listener.rb +5 -1
- data/lib/karafka/pro/processing/strategies/aj/lrj_mom_vp.rb +1 -1
- data/lib/karafka/pro/processing/strategies/lrj/default.rb +4 -6
- data/lib/karafka/pro/processing/strategies/lrj/mom.rb +1 -6
- data/lib/karafka/pro/routing/features/patterns/consumer_group.rb +4 -0
- data/lib/karafka/version.rb +1 -1
- data/renovate.json +4 -1
- data.tar.gz.sig +0 -0
- metadata +2 -2
- metadata.gz.sig +0 -0
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 460029e2dbc352c56f107e18bbff28f5914b5b56a0e42be9670e13f0f3592fc9
+  data.tar.gz: 68c25bacd1995a8e9c97a915e1e1b2b4435437c97a5e31e2824499c9ad531057
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: adc877c3be3cb8885b0244168014387f0353bed89575b6931f5e1ca062d01ba6a0e5a0566da4dd282f5117c7daf27ff2116f27025ec4213972c1ca52e92980de
+  data.tar.gz: a62ff96ef96ea85d98288bad3c8f6cf0bfafc1ee6de1cf928fe803ad80435b0fa137aac9cf5eff4b232e854b3c4df76c3c16efaa4a117d6b86e442975cfc9a30
checksums.yaml.gz.sig
CHANGED
Binary file
data/.github/ISSUE_TEMPLATE/bug_report.md
CHANGED
@@ -37,14 +37,15 @@ Here's an example:

 ```
 $ [bundle exec] karafka info
-Karafka version:
-Ruby version: 2.
-
-
-
-
-
-Boot file: /app/karafka
+Karafka version: 2.2.10 + Pro
+Ruby version: ruby 3.2.2 (2023-03-30 revision e51014f9c0) [x86_64-linux]
+Rdkafka version: 0.13.8
+Consumer groups count: 2
+Subscription groups count: 2
+Workers count: 2
+Application client id: example_app
+Boot file: /app/karafka.rb
 Environment: development
-
+License: Commercial
+License entity: karafka-ci
 ```
data/.github/workflows/ci.yml
CHANGED
data/CHANGELOG.md
CHANGED
@@ -1,5 +1,15 @@
 # Karafka framework changelog

+## 2.2.10 (2023-11-02)
+- [Improvement] Allow for running `#pause` without specifying the offset (provide offset or `:consecutive`). This allows for pausing on the consecutive message (last received + 1), so after resume we will get last received + 1, effectively not using `#seek` and not purging the `librdkafka` buffer, preserving network bandwidth. Please be mindful that this uses the notion of the last message passed from **librdkafka**, and not the last one available in the consumer (`messages.last`). While for regular cases they will be the same, when using things like DLQ, LRJs, VPs or the Filtering API, those may not be the same.
+- [Improvement] **Drastically** improve network efficiency of operating with LRJ by using the `:consecutive` offset as the default strategy for running LRJs, without moving the offset in place and purging the data.
+- [Improvement] Do not "seek in place". When pausing and/or seeking to the same location as the current position, do nothing, so as not to purge buffers and not to move to the same place where we are.
+- [Fix] Pattern regexps should not be part of declaratives even when configured.
+
+### Upgrade Notes
+
+In the latest Karafka release, there are no breaking changes. However, please note the updates to `#pause` and `#seek`. If you spot any issues, please report them immediately. Your feedback is crucial.
+
 ## 2.2.9 (2023-10-24)
 - [Improvement] Allow using negative offset references in `Karafka::Admin#read_topic`.
 - [Change] Make sure that WaterDrop `2.6.10` or higher is used with this release to support transactions fully and the Web-UI.
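The `:consecutive` caveat in the entry above is easier to see with a toy model. The sketch below is plain Ruby with illustrative names (not Karafka's API); it shows why the librdkafka-tracked consecutive position and `messages.last.offset + 1` can diverge once something like the Filtering API drops messages:

```ruby
# Offsets librdkafka delivered to the framework during one poll cycle
delivered = [10, 11, 12, 13]

# A filtering step removes some messages before the consumer sees them
visible = delivered.select(&:even?) # [10, 12]

# pause(:consecutive) resumes from the last *delivered* message + 1
consecutive_resume = delivered.last + 1 # 14

# a naive pause(messages.last.offset + 1) would resume earlier
naive_resume = visible.last + 1 # 13

puts consecutive_resume
puts naive_resume
```

For regular consumption both values match; the gap only opens when the framework filters, retries or virtualizes what librdkafka handed over.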
@@ -11,7 +21,7 @@
 - [Refactor] Reorganize how rebalance events are propagated from `librdkafka` to Karafka. Replace `connection.client.rebalance_callback` with `rebalance.partitions_assigned` and `rebalance.partitions_revoked`. Introduce two extra events: `rebalance.partitions_assign` and `rebalance.partitions_revoke` to handle pre-rebalance future work.
 - [Refactor] Remove `thor` as a CLI layer and rely on Ruby `OptParser`

-### Upgrade
+### Upgrade Notes

 1. Unless you were using `connection.client.rebalance_callback` which was considered private, nothing.
 2. None of the CLI commands should change but `thor` has been removed so please report if you find any bugs.
@@ -53,7 +63,7 @@
 - [Enhancement] Make sure that consumer group used by `Karafka::Admin` obeys the `ConsumerMapper` setup.
 - [Fix] Fix a case where subscription group would not accept a symbol name.

-### Upgrade
+### Upgrade Notes

 As always, please make sure you have upgraded to the most recent version of `2.1` before upgrading to `2.2`.

@@ -184,7 +194,7 @@ If you want to maintain the `2.1` behavior, that is `karafka_admin` admin group,
 - [Change] Removed `Karafka::Pro::BaseConsumer` in favor of `Karafka::BaseConsumer`. (#1345)
 - [Fix] Fix for `max_messages` and `max_wait_time` not having reference in errors.yml (#1443)

-### Upgrade
+### Upgrade Notes

 1. Upgrade to Karafka `2.0.41` prior to upgrading to `2.1.0`.
 2. Replace `Karafka::Pro::BaseConsumer` references to `Karafka::BaseConsumer`.
@@ -256,7 +266,7 @@ If you want to maintain the `2.1` behavior, that is `karafka_admin` admin group,
 - [Improvement] Attach an `embedded` tag to Karafka processes started using the embedded API.
 - [Change] Renamed `Datadog::Listener` to `Datadog::MetricsListener` for consistency. (#1124)

-### Upgrade
+### Upgrade Notes

 1. Replace `Datadog::Listener` references to `Datadog::MetricsListener`.

@@ -270,7 +280,7 @@ If you want to maintain the `2.1` behavior, that is `karafka_admin` admin group,
 - [Improvement] Introduce a `strict_topics_namespacing` config option to enable/disable the strict topics naming validations. This can be useful when working with pre-existing topics which we cannot or do not want to rename.
 - [Fix] Karafka monitor is prematurely cached (#1314)

-### Upgrade
+### Upgrade Notes

 Since `#tags` were introduced on consumers, the `#tags` method is now part of the consumers API.

@@ -332,7 +342,7 @@ end
 - [Improvement] Include `original_consumer_group` in the DLQ dispatched messages in Pro.
 - [Improvement] Use Karafka `client_id` as kafka `client.id` value by default

-### Upgrade
+### Upgrade Notes

 If you want to continue to use `karafka` as default for kafka `client.id`, assign it manually:

@@ -385,7 +395,7 @@ class KarafkaApp < Karafka::App
 - [Fix] Do not trigger code reloading when `consumer_persistence` is enabled.
 - [Fix] Shutdown producer after all the consumer components are down and the status is stopped. This will ensure, that any instrumentation related Kafka messaging can still operate.

-### Upgrade
+### Upgrade Notes

 If you want to disable `librdkafka` statistics because you do not use them at all, update the `kafka` `statistics.interval.ms` setting and set it to `0`:

@@ -447,7 +457,7 @@ end
 - [Fix] Few typos around DLQ and Pro DLQ Dispatch original metadata naming.
 - [Fix] Narrow the components lookup to the appropriate scope (#1114)

-### Upgrade
+### Upgrade Notes

 1. Replace `original-*` references from DLQ dispatched metadata with `original_*`

@@ -490,7 +500,7 @@ end
 - [Specs] Split specs into regular and pro to simplify how resources are loaded
 - [Specs] Add specs to ensure, that all the Pro components have a proper per-file license (#1099)

-### Upgrade
+### Upgrade Notes

 1. Remove the `manual_offset_management` setting from the main config if you use it:

data/Gemfile.lock
CHANGED
@@ -1,7 +1,7 @@
 PATH
   remote: .
   specs:
-    karafka (2.2.
+    karafka (2.2.10)
       karafka-core (>= 2.2.2, < 2.3.0)
       waterdrop (>= 2.6.10, < 3.0.0)
       zeitwerk (~> 2.3)
@@ -39,24 +39,24 @@ GEM
       activesupport (>= 6.1)
     i18n (1.14.1)
       concurrent-ruby (~> 1.0)
-    karafka-core (2.2.
+    karafka-core (2.2.5)
       concurrent-ruby (>= 1.1)
-      karafka-rdkafka (>= 0.13.
-    karafka-rdkafka (0.13.
+      karafka-rdkafka (>= 0.13.8, < 0.15.0)
+    karafka-rdkafka (0.13.8)
       ffi (~> 1.15)
       mini_portile2 (~> 2.6)
       rake (> 12)
-    karafka-web (0.7.
+    karafka-web (0.7.10)
       erubi (~> 1.4)
-      karafka (>= 2.2.
-      karafka-core (>= 2.2.
+      karafka (>= 2.2.9, < 3.0.0)
+      karafka-core (>= 2.2.4, < 3.0.0)
       roda (~> 3.68, >= 3.69)
       tilt (~> 2.0)
     mini_portile2 (2.8.5)
     minitest (5.20.0)
     mutex_m (0.1.2)
     rack (3.0.8)
-    rake (13.0
+    rake (13.1.0)
     roda (3.73.0)
       rack
     rspec (3.12.0)
@@ -82,7 +82,7 @@ GEM
     tilt (2.3.0)
     tzinfo (2.0.6)
       concurrent-ruby (~> 1.0)
-    waterdrop (2.6.
+    waterdrop (2.6.11)
       karafka-core (>= 2.2.3, < 3.0.0)
       zeitwerk (~> 2.3)
     zeitwerk (2.6.12)
data/lib/karafka/base_consumer.rb
CHANGED
@@ -171,23 +171,28 @@ module Karafka
       @used
     end

-    # Pauses processing on a given offset for the current topic partition
+    # Pauses processing on a given offset or consecutive offset for the current topic partition
     #
     # After given partition is resumed, it will continue processing from the given offset
-    # @param offset [Integer] offset from which we want to restart the processing
+    # @param offset [Integer, Symbol] offset from which we want to restart the processing or
+    #   `:consecutive` if we want to pause and continue without changing the consecutive offset
+    #   (cursor position)
     # @param timeout [Integer, nil] how long in milliseconds do we want to pause or nil to use the
     #   default exponential pausing strategy defined for retries
     # @param manual_pause [Boolean] Flag to differentiate between user pause and system/strategy
     #   based pause. While they both pause in exactly the same way, the strategy application
     #   may need to differentiate between them.
+    #
+    # @note It is **critical** to understand how pause with the `:consecutive` offset operates.
+    #   While it provides the benefit of not purging the librdkafka buffer, in case of usage of
+    #   filters, retries or other advanced options the consecutive offset may not be the one you
+    #   want to pause on. Test it well to ensure that this behaviour is expected by you.
     def pause(offset, timeout = nil, manual_pause = true)
       timeout ? coordinator.pause_tracker.pause(timeout) : coordinator.pause_tracker.pause

-      client.pause(
-        topic.name,
-        partition,
-        offset
-      )
+      offset = nil if offset == :consecutive
+
+      client.pause(topic.name, partition, offset)

       # Indicate, that user took a manual action of pausing
       coordinator.manual_pause if manual_pause
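The `offset = nil if offset == :consecutive` line is the whole translation layer between the consumer-facing API and the connection client. A minimal stand-in (a hypothetical helper, not part of Karafka) behaves like this:

```ruby
# Maps the consumer-facing pause argument onto what the connection client
# expects: nil means "pause where librdkafka finished, do not seek"
def resolve_pause_offset(offset)
  offset == :consecutive ? nil : offset
end

resolve_pause_offset(:consecutive) # => nil (consecutive pause, buffers kept)
resolve_pause_offset(512)          # => 512 (explicit offset, a seek will follow)
```

Passing `nil` onward is what lets the client skip `#seek` entirely, which is where the buffer-preservation benefit comes from.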
data/lib/karafka/connection/client.rb
CHANGED
@@ -160,16 +160,17 @@ module Karafka
     #
     # @param topic [String] topic name
     # @param partition [Integer] partition
-    # @param offset [Integer] offset of the message on which we want to pause (this message
-    #   be reprocessed after getting back to processing)
+    # @param offset [Integer, nil] offset of the message on which we want to pause (this message
+    #   will be reprocessed after getting back to processing) or nil if we want to pause and
+    #   resume from the consecutive offset (+1 from the last message passed to us by librdkafka)
     # @note This will pause indefinitely and requires manual `#resume`
-    def pause(topic, partition, offset)
+    # @note When `#internal_seek` is not involved (when offset is `nil`) we will not purge the
+    #   librdkafka buffers and continue from the last cursor offset
+    def pause(topic, partition, offset = nil)
       @mutex.synchronize do
         # Do not pause if the client got closed, would not change anything
         return if @closed

-        pause_msg = Messages::Seek.new(topic, partition, offset)
-
         internal_commit_offsets(async: true)

         # Here we do not use our cached tpls because we should not try to pause something we do
@@ -190,6 +191,14 @@ module Karafka
         @paused_tpls[topic][partition] = tpl

         @kafka.pause(tpl)
+
+        # If offset is not provided, will pause where it finished.
+        # This makes librdkafka not purge buffers and can provide significant network savings
+        # when we just want to pause before further processing without changing the offsets
+        return unless offset
+
+        pause_msg = Messages::Seek.new(topic, partition, offset)
+
         internal_seek(pause_msg)
       end
     end
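The control flow added in this hunk is: always pause, then seek only when an explicit offset was given. A self-contained model of that guard (illustrative names only; the real method also commits offsets and synchronizes on a mutex):

```ruby
# Models the pause/seek split: pausing always happens, seeking (which purges
# the librdkafka prefetch buffers) is skipped when no offset is provided
class PauseModel
  attr_reader :paused, :sought_offset

  def initialize
    @paused = false
    @sought_offset = nil
  end

  def pause(offset = nil)
    @paused = true
    # No offset: keep the prefetch buffers and the current cursor position
    return unless offset

    @sought_offset = offset
  end
end

consecutive = PauseModel.new
consecutive.pause # paused, but no seek recorded

explicit = PauseModel.new
explicit.pause(42) # paused and sought to offset 42
```

The `return unless offset` placement after the pause call mirrors the diff: the pause itself is unconditional, only the expensive seek is optional.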
@@ -354,6 +363,9 @@ module Karafka
     #
     # @param message [Messages::Message, Messages::Seek] message to which we want to seek to.
     #   It can have the time based offset.
+    #
+    # @note Will not invoke seeking if the desired seek would lead us to the current position.
+    #   This prevents us from flushing the librdkafka buffer when it is not needed.
     def internal_seek(message)
       # If the seek message offset is in a time format, we need to find the closest "real"
       # offset matching before we seek
@@ -378,6 +390,14 @@ module Karafka
         message.offset = detected_partition&.offset || raise(Errors::InvalidTimeBasedOffsetError)
       end

+      # Never seek if we would get the same location as we would get without seeking
+      # This prevents us from the expensive buffer purges that can lead to increased network
+      # traffic and can cost a lot of money
+      #
+      # This code adds around 0.01 ms per seek but saves from many user unexpected behaviours in
+      # seeking and pausing
+      return if message.offset == topic_partition_position(message.topic, message.partition)
+
       @kafka.seek(message)
     end

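The early `return` above is a seek-in-place guard: if the requested offset equals the current position, seeking would only flush prefetched data for nothing. A tiny simulation of the idea (plain Ruby, not the rdkafka API):

```ruby
# Skips a seek that would land exactly where the consumer already is,
# counting buffer purges so the saving is visible
class SeekGuard
  attr_reader :position, :purges

  def initialize(position)
    @position = position
    @purges = 0
  end

  def seek(offset)
    # Seeking in place would purge prefetched data without moving anywhere
    return if offset == @position

    @purges += 1
    @position = offset
  end
end

guard = SeekGuard.new(100)
guard.seek(100) # in place: skipped, no purge
guard.seek(200) # real move: one purge
```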
@@ -432,6 +452,18 @@ module Karafka
       Rdkafka::Consumer::TopicPartitionList.new({ topic => [rdkafka_partition] })
     end

+    # @param topic [String]
+    # @param partition [Integer]
+    # @return [Integer] current position within topic partition or `-1` if it could not be
+    #   established. It may be `-1` in case we lost the assignment or we did not yet fetch data
+    #   for this topic partition
+    def topic_partition_position(topic, partition)
+      rd_partition = ::Rdkafka::Consumer::Partition.new(partition, nil, 0)
+      tpl = ::Rdkafka::Consumer::TopicPartitionList.new(topic => [rd_partition])
+
+      @kafka.position(tpl).to_h.fetch(topic).first.offset || -1
+    end
+
     # Performs a single poll operation and handles retries and error
     #
     # @param timeout [Integer] timeout for a single poll
data/lib/karafka/instrumentation/logger_listener.rb
CHANGED
@@ -71,6 +71,8 @@ module Karafka
     # Prints info about a consumer pause occurrence. Irrelevant if user or system initiated.
     #
     # @param event [Karafka::Core::Monitoring::Event] event details including payload
+    # @note There may be no offset provided in case user wants to pause on the consecutive offset
+    #   position. This can be beneficial when not wanting to purge the buffers.
     def on_client_pause(event)
       topic = event[:topic]
       partition = event[:partition]
@@ -78,7 +80,9 @@ module Karafka
       client = event[:caller]

       info <<~MSG.tr("\n", ' ').strip!
-        [#{client.id}]
+        [#{client.id}]
+        Pausing on topic #{topic}/#{partition}
+        on #{offset ? "offset #{offset}" : 'the consecutive offset'}
       MSG
     end

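The heredoc pipeline used here (`<<~MSG.tr("\n", ' ').strip!`) collapses the multi-line template into a single log line. A standalone reproduction with hardcoded sample values in place of the event payload:

```ruby
# Sample values standing in for the event payload
offset = nil # a consecutive pause carries no offset
topic = 'events'
partition = 0

# Squiggly heredoc strips indentation, tr joins the lines with spaces,
# strip! drops the trailing space left by the final newline
message = <<~MSG.tr("\n", ' ').strip!
  [client-id-1]
  Pausing on topic #{topic}/#{partition}
  on #{offset ? "offset #{offset}" : 'the consecutive offset'}
MSG

puts message
# [client-id-1] Pausing on topic events/0 on the consecutive offset
```

Note that `strip!` returns `nil` when nothing changes; it is safe here only because `tr` always leaves a trailing space to strip.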
data/lib/karafka/pro/processing/strategies/lrj/default.rb
CHANGED
@@ -35,12 +35,10 @@ module Karafka
         # This ensures that when running LRJ with VP, things operate as expected run only
         # once for all the virtual partitions collectively
         coordinator.on_enqueued do
-          # Pause
-          #
-          #
-
-          # have any edge cases here.
-          pause(coordinator.seek_offset, MAX_PAUSE_TIME, false)
+          # Pause and continue with another batch in case of a regular resume.
+          # In case of an error, the `#retry_after_pause` will move the offset to the first
+          # message out of this batch.
+          pause(:consecutive, MAX_PAUSE_TIME, false)
         end
       end

data/lib/karafka/pro/processing/strategies/lrj/mom.rb
CHANGED
@@ -35,12 +35,7 @@ module Karafka
         # This ensures that when running LRJ with VP, things operate as expected run only
         # once for all the virtual partitions collectively
         coordinator.on_enqueued do
-
-          # loose any messages.
-          #
-          # For VP it applies the same way and since VP cannot be used with MOM we should not
-          # have any edge cases here.
-          pause(coordinator.seek_offset, Strategies::Lrj::Default::MAX_PAUSE_TIME, false)
+          pause(:consecutive, Strategies::Lrj::Default::MAX_PAUSE_TIME, false)
         end
       end

data/lib/karafka/pro/routing/features/patterns/consumer_group.rb
CHANGED
@@ -50,6 +50,10 @@ module Karafka
         virtual_topic = public_send(:topic=, pattern.name, &block)
         # Indicate the nature of this topic (matcher)
         virtual_topic.patterns(active: true, type: :matcher, pattern: pattern)
+        # Pattern subscriptions should never be part of declarative topics definitions
+        # Since they are subscribed by regular expressions, we do not know the target
+        # topics names so we cannot manage them via declaratives
+        virtual_topic.config(active: false)
         pattern.topic = virtual_topic
         @patterns << pattern
       end
data/lib/karafka/version.rb
CHANGED
data/renovate.json
CHANGED
data.tar.gz.sig
CHANGED
Binary file
metadata
CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: karafka
 version: !ruby/object:Gem::Version
-  version: 2.2.
+  version: 2.2.10
 platform: ruby
 authors:
 - Maciej Mensfeld
@@ -35,7 +35,7 @@ cert_chain:
   AnG1dJU+yL2BK7vaVytLTstJME5mepSZ46qqIJXMuWob/YPDmVaBF39TDSG9e34s
   msG3BiCqgOgHAnL23+CN3Rt8MsuRfEtoTKpJVcCfoEoNHOkc
   -----END CERTIFICATE-----
-date: 2023-
+date: 2023-11-02 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: karafka-core
metadata.gz.sig
CHANGED
Binary file