karafka 2.5.0.rc1 → 2.5.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +4 -4
- data/CHANGELOG.md +9 -2
- data/Gemfile +1 -1
- data/Gemfile.lock +12 -12
- data/bin/integrations +1 -0
- data/config/locales/errors.yml +2 -1
- data/docker-compose.yml +1 -1
- data/karafka.gemspec +1 -1
- data/lib/karafka/active_job/job_extensions.rb +4 -1
- data/lib/karafka/admin.rb +27 -15
- data/lib/karafka/contracts/base.rb +3 -2
- data/lib/karafka/contracts/config.rb +2 -1
- data/lib/karafka/instrumentation/logger_listener.rb +11 -11
- data/lib/karafka/instrumentation/vendors/kubernetes/base_listener.rb +17 -2
- data/lib/karafka/instrumentation/vendors/kubernetes/liveness_listener.rb +29 -6
- data/lib/karafka/instrumentation/vendors/kubernetes/swarm_liveness_listener.rb +9 -0
- data/lib/karafka/pro/encryption.rb +4 -1
- data/lib/karafka/pro/recurring_tasks.rb +8 -2
- data/lib/karafka/pro/routing/features/swarm/contracts/routing.rb +3 -2
- data/lib/karafka/pro/routing/features/swarm.rb +4 -1
- data/lib/karafka/pro/scheduled_messages/proxy.rb +15 -3
- data/lib/karafka/pro/scheduled_messages.rb +4 -1
- data/lib/karafka/routing/builder.rb +12 -3
- data/lib/karafka/routing/features/base/expander.rb +8 -2
- data/lib/karafka/server.rb +4 -1
- data/lib/karafka/setup/config.rb +17 -5
- data/lib/karafka/swarm/supervisor.rb +5 -2
- data/lib/karafka/version.rb +1 -1
- metadata +3 -3
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 03a08ef42e32f92069ef95b4380b744f7188dd2248f296abc752e5cee9d12c7f
+  data.tar.gz: e23896dcf66e16cddf193ee1b412fb0560dd82e50dcf01b31f3bb93d451afcc3
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 52356bcb5a97f121a6e383bccc7530ef3e5252442ff0a13122b3cfebc8579a00a396576e4965f74473f950b28c820d898d259cffd90b26ff80c40701527cf97f
+  data.tar.gz: 8fe1550960de8de921e21a45b170a0b1282fdd6d9c5f81f2285cfc5dc958c8d54dbfcb953a997d87891f2ab29f010d8b418987a55825f7d25bb124c717876fab
data/CHANGELOG.md
CHANGED
@@ -1,11 +1,14 @@
 # Karafka Framework Changelog
 
-## 2.5.0 (
+## 2.5.0 (2025-06-15)
 - **[Breaking]** Change how consistency of DLQ dispatches works in Pro (`partition_key` vs. direct partition id mapping).
 - **[Breaking]** Remove the headers `source_key` from the Pro DLQ dispatched messages as the original key is now fully preserved.
 - **[Breaking]** Use DLQ and Piping prefix `source_` instead of `original_` to align with naming convention of Kafka Streams and Apache Flink for future usage.
 - **[Breaking]** Rename scheduled jobs topics names in their config (Pro).
+- **[Breaking]** Change K8s listener response from `204` to `200` and include JSON body with reasons.
+- **[Breaking]** Replace admin config `max_attempts` with `max_retries_duration` and `retry_backoff`.
 - **[Feature]** Parallel Segments for concurrent processing of the same partition with more than partition count of processes (Pro).
+- [Enhancement] Normalize topic + partition logs format.
 - [Enhancement] Support KIP-82 (header values of arrays).
 - [Enhancement] Enhance errors tracker with `#counts` that contains per-error class specific counters for granular flow handling.
 - [Enhancement] Provide explicit `Karafka::Admin.copy_consumer_group` API.
@@ -41,7 +44,9 @@
 - [Enhancement] Enrich scheduled messages state reporter with debug data.
 - [Enhancement] Introduce a new state called `stopped` to the scheduled messages.
 - [Enhancement] Do not overwrite the `key` in the Pro DLQ dispatched messages for routing reasons.
-- [Enhancement] Introduce `errors_tracker.trace_id` for distributed error details correlation with the Web UI.
+- [Enhancement] Introduce `errors_tracker.trace_id` for distributed error details correlation with the Web UI.
+- [Enhancement] Improve contracts validations reporting.
+- [Enhancement] Optimize topic creation and repartitioning admin operations for topics with hundreds of partitions.
 - [Refactor] Introduce a `bin/verify_kafka_warnings` script to clean Kafka from temporary test-suite topics.
 - [Refactor] Introduce a `bin/verify_topics_naming` script to ensure proper test topics naming convention.
 - [Refactor] Make sure all temporary topics have a `it-` prefix in their name.
@@ -66,6 +71,8 @@
 - [Fix] Scheduled Messages re-seek moves to `latest` on inheritance of initial offset when `0` offset is compacted.
 - [Fix] Seek to `:latest` without `topic_partition_position` (-1) will not seek at all.
 - [Fix] Extremely high turn over of scheduled messages can cause them not to reach EOF/Loaded state.
+- [Fix] Fix incorrectly passed `max_wait_time` to rdkafka (ms instead of seconds) causing too long wait.
+- [Fix] Remove aggressive requerying of the Kafka cluster on topic creation/removal/altering.
 - [Change] Move to trusted-publishers and remove signing since no longer needed.
 
 ## 2.4.18 (2025-04-09)
data/Gemfile
CHANGED
data/Gemfile.lock
CHANGED
@@ -1,9 +1,9 @@
 PATH
   remote: .
   specs:
-    karafka (2.5.0
+    karafka (2.5.0)
       base64 (~> 0.2)
-      karafka-core (>= 2.5.
+      karafka-core (>= 2.5.2, < 2.6.0)
       karafka-rdkafka (>= 0.19.5)
       waterdrop (>= 2.8.3, < 3.0.0)
       zeitwerk (~> 2.3)
@@ -27,9 +27,9 @@ GEM
       securerandom (>= 0.3)
       tzinfo (~> 2.0, >= 2.0.5)
       uri (>= 0.13.1)
-    base64 (0.
+    base64 (0.3.0)
     benchmark (0.4.1)
-    bigdecimal (3.
+    bigdecimal (3.2.2)
     byebug (12.0.0)
     concurrent-ruby (1.3.5)
     connection_pool (2.5.3)
@@ -39,7 +39,7 @@ GEM
     erubi (1.13.1)
     et-orbi (1.2.11)
       tzinfo
-    factory_bot (6.5.
+    factory_bot (6.5.4)
       activesupport (>= 6.1.0)
     ffi (1.17.2)
     ffi (1.17.2-aarch64-linux-gnu)
@@ -59,7 +59,7 @@ GEM
       activesupport (>= 6.1)
     i18n (1.14.7)
       concurrent-ruby (~> 1.0)
-    karafka-core (2.5.
+    karafka-core (2.5.2)
       karafka-rdkafka (>= 0.19.2, < 0.21.0)
       logger (>= 1.6.0)
     karafka-rdkafka (0.19.5)
@@ -69,9 +69,9 @@ GEM
     karafka-testing (2.5.1)
       karafka (>= 2.5.0.beta1, < 2.6.0)
       waterdrop (>= 2.8.0)
-    karafka-web (0.11.0
+    karafka-web (0.11.0)
       erubi (~> 1.4)
-      karafka (>= 2.5.0.
+      karafka (>= 2.5.0.rc2, < 2.6.0)
       karafka-core (>= 2.5.0, < 2.6.0)
       roda (~> 3.68, >= 3.69)
       tilt (~> 2.0)
@@ -80,9 +80,9 @@ GEM
     minitest (5.25.5)
     ostruct (0.6.1)
     raabro (1.4.0)
-    rack (3.1.
+    rack (3.1.16)
     rake (13.3.0)
-    roda (3.
+    roda (3.93.0)
       rack
     rspec (3.13.1)
       rspec-core (~> 3.13.0)
@@ -113,7 +113,7 @@ GEM
       karafka-core (>= 2.4.9, < 3.0.0)
       karafka-rdkafka (>= 0.19.2)
       zeitwerk (~> 2.3)
-    zeitwerk (2.
+    zeitwerk (2.7.3)
 
 PLATFORMS
   aarch64-linux-gnu
@@ -135,7 +135,7 @@ DEPENDENCIES
   fugit
   karafka!
   karafka-testing (>= 2.5.0)
-  karafka-web (>= 0.11.0.
+  karafka-web (>= 0.11.0.rc2)
   ostruct
   rspec
   simplecov
data/bin/integrations
CHANGED
@@ -45,6 +45,7 @@ class Scenario
     'shutdown/on_hanging_on_shutdown_job_and_a_shutdown_spec.rb' => [2].freeze,
     'shutdown/on_hanging_listener_and_shutdown_spec.rb' => [2].freeze,
     'swarm/forceful_shutdown_of_hanging_spec.rb' => [2].freeze,
+    'swarm/with_blocking_at_exit_spec.rb' => [2].freeze,
     'instrumentation/post_errors_instrumentation_error_spec.rb' => [1].freeze,
     'cli/declaratives/delete/existing_with_exit_code_spec.rb' => [2].freeze,
     'cli/declaratives/create/new_with_exit_code_spec.rb' => [2].freeze,
data/config/locales/errors.yml
CHANGED
@@ -84,7 +84,8 @@ en:
   admin.kafka_format: needs to be a hash
   admin.group_id_format: 'needs to be a string with a Kafka accepted format'
   admin.max_wait_time_format: 'needs to be an integer bigger than 0'
-  admin.
+  admin.retry_backoff_format: 'needs to be an integer bigger than 100'
+  admin.max_retries_duration_format: 'needs to be an integer bigger than 1000'
 
   swarm.nodes_format: 'needs to be an integer bigger than 0'
   swarm.node_format: needs to be false or node instance
data/docker-compose.yml
CHANGED
data/karafka.gemspec
CHANGED
@@ -22,7 +22,7 @@ Gem::Specification.new do |spec|
   DESC
 
   spec.add_dependency 'base64', '~> 0.2'
-  spec.add_dependency 'karafka-core', '>= 2.5.
+  spec.add_dependency 'karafka-core', '>= 2.5.2', '< 2.6.0'
   spec.add_dependency 'karafka-rdkafka', '>= 0.19.5'
  spec.add_dependency 'waterdrop', '>= 2.8.3', '< 3.0.0'
  spec.add_dependency 'zeitwerk', '~> 2.3'
data/lib/karafka/active_job/job_extensions.rb
CHANGED
@@ -21,7 +21,10 @@ module Karafka
 
       # Make sure, that karafka options that someone wants to use are valid before assigning
       # them
-      App.config.internal.active_job.job_options_contract.validate!(
+      App.config.internal.active_job.job_options_contract.validate!(
+        new_options,
+        scope: %w[active_job]
+      )
 
       # We need to modify this hash because otherwise we would modify parent hash.
       self._karafka_options = _karafka_options.dup
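Taken together with the scoped contract call above, invalid `karafka_options` now surface under an `active_job` scope. A minimal, hypothetical sketch of the call site that triggers this validation (the job class and option value are made up; `karafka_options` and `dispatch_method` come from Karafka's ActiveJob integration):

    # Hypothetical job definition; `karafka_options` runs the job options
    # contract at class-definition time, now with scope: %w[active_job].
    class WelcomeEmailJob < ActiveJob::Base
      queue_as :default

      # An unsupported key or value here raises
      # Karafka::Errors::InvalidConfigurationError, reported under the active_job scope.
      karafka_options(dispatch_method: :produce_async)

      def perform(user_id)
        # mailing logic would go here
      end
    end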
data/lib/karafka/admin.rb
CHANGED
@@ -10,10 +10,13 @@ module Karafka
   # Cluster on which operations are performed can be changed via `admin.kafka` config, however
   # there is no multi-cluster runtime support.
   module Admin
+    extend Core::Helpers::Time
+
     extend Helpers::ConfigImporter.new(
       max_wait_time: %i[admin max_wait_time],
       poll_timeout: %i[admin poll_timeout],
-
+      max_retries_duration: %i[admin max_retries_duration],
+      retry_backoff: %i[admin retry_backoff],
       group_id: %i[admin group_id],
       app_kafka: %i[kafka],
       admin_kafka: %i[admin kafka]
@@ -122,7 +125,7 @@ module Karafka
        handler = admin.create_topic(name, partitions, replication_factor, topic_config)

        with_re_wait(
-          -> { handler.wait(max_wait_timeout:
+          -> { handler.wait(max_wait_timeout: max_wait_time_seconds) },
          -> { topics_names.include?(name) }
        )
      end
@@ -136,7 +139,7 @@ module Karafka
        handler = admin.delete_topic(name)

        with_re_wait(
-          -> { handler.wait(max_wait_timeout:
+          -> { handler.wait(max_wait_timeout: max_wait_time_seconds) },
          -> { !topics_names.include?(name) }
        )
      end
@@ -151,7 +154,7 @@ module Karafka
        handler = admin.create_partitions(name, partitions)

        with_re_wait(
-          -> { handler.wait(max_wait_timeout:
+          -> { handler.wait(max_wait_timeout: max_wait_time_seconds) },
          -> { topic_info(name).fetch(:partition_count) >= partitions }
        )
      end
@@ -362,7 +365,7 @@ module Karafka
      def delete_consumer_group(consumer_group_id)
        with_admin do |admin|
          handler = admin.delete_group(consumer_group_id)
-          handler.wait(max_wait_timeout:
+          handler.wait(max_wait_timeout: max_wait_time_seconds)
        end
      end

@@ -564,6 +567,12 @@ module Karafka

      private

+      # @return [Integer] number of seconds to wait. `rdkafka` requires this value
+      #   (`max_wait_time`) to be provided in seconds while we define it in ms hence the conversion
+      def max_wait_time_seconds
+        max_wait_time / 1_000.0
+      end
+
      # Adds a new callback for given rdkafka instance for oauth token refresh (if needed)
      #
      # @param id [String, Symbol] unique (for the lifetime of instance) id that we use for
@@ -602,20 +611,23 @@ module Karafka
      # @param handler [Proc] the wait handler operation
      # @param breaker [Proc] extra condition upon timeout that indicates things were finished ok
      def with_re_wait(handler, breaker)
-
-
+        start_time = monotonic_now
+        # Convert milliseconds to seconds for sleep
+        sleep_time = retry_backoff / 1000.0

-
+        loop do
+          handler.call

-
-          # not visible and we need to wait
-          raise(Errors::ResultNotVisibleError) unless breaker.call
-        rescue Rdkafka::AbstractHandle::WaitTimeoutError, Errors::ResultNotVisibleError
-          return if breaker.call
+          sleep(sleep_time)

-
+          return if breaker.call
+        rescue Rdkafka::AbstractHandle::WaitTimeoutError
+          return if breaker.call

-
+          next if monotonic_now - start_time < max_retries_duration
+
+          raise(Errors::ResultNotVisibleError)
+        end
      end

      # @param type [Symbol] type of config we want
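Two behavioural points follow from the hunks above: `max_wait_time` (configured in milliseconds) is now divided by 1000 before being handed to rdkafka's `handler.wait`, and result visibility is re-checked every `retry_backoff` ms until `max_retries_duration` ms have elapsed. A hedged usage sketch (topic name and counts are illustrative):

    # Waits up to admin.max_wait_time (ms, converted to seconds for rdkafka)
    # for the create handler, then re-checks topic visibility every
    # admin.retry_backoff ms. If the topic is still not visible after
    # admin.max_retries_duration ms, Karafka::Errors::ResultNotVisibleError is raised.
    Karafka::Admin.create_topic('events', 6, 3)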
data/lib/karafka/contracts/base.rb
CHANGED
@@ -5,12 +5,13 @@ module Karafka
     # Base contract for all Karafka contracts
     class Base < ::Karafka::Core::Contractable::Contract
       # @param data [Hash] data for validation
+      # @param scope [Array<String>] nested scope if in use
       # @return [Boolean] true if all good
       # @raise [Errors::InvalidConfigurationError] invalid configuration error
       # @note We use contracts only in the config validation context, so no need to add support
       #   for multiple error classes. It will be added when it will be needed.
-      def validate!(data)
-        super(data, Errors::InvalidConfigurationError)
+      def validate!(data, scope: [])
+        super(data, Errors::InvalidConfigurationError, scope: scope)
       end
     end
   end
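The optional `scope:` keyword is simply forwarded to karafka-core, so any existing contract can be validated under a nested path. A minimal sketch mirroring how the framework itself now calls it (see the `setup/config.rb` hunk later in this diff):

    contract = Karafka::Contracts::Config.new

    # Previous behaviour, still valid thanks to the scope: [] default
    contract.validate!(Karafka::App.config.to_h)

    # New behaviour: failures are reported under the given scope
    contract.validate!(Karafka::App.config.to_h, scope: %w[config])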
data/lib/karafka/contracts/config.rb
CHANGED
@@ -53,7 +53,8 @@ module Karafka
       required(:kafka) { |val| val.is_a?(Hash) }
       required(:group_id) { |val| val.is_a?(String) && Contracts::TOPIC_REGEXP.match?(val) }
       required(:max_wait_time) { |val| val.is_a?(Integer) && val.positive? }
-      required(:
+      required(:retry_backoff) { |val| val.is_a?(Integer) && val >= 100 }
+      required(:max_retries_duration) { |val| val.is_a?(Integer) && val >= 1_000 }
     end
 
     # We validate internals just to be sure, that they are present and working
data/lib/karafka/instrumentation/logger_listener.rb
CHANGED
@@ -76,7 +76,7 @@ module Karafka
      consumer = job.executor.topic.consumer
      topic = job.executor.topic.name
      partition = job.executor.partition
-      info "[#{job.id}] #{job_type} job for #{consumer} on #{topic}
+      info "[#{job.id}] #{job_type} job for #{consumer} on #{topic}-#{partition} started"
    end

    # Prints info about the fact that a given job has finished
@@ -91,7 +91,7 @@ module Karafka
      partition = job.executor.partition
      info <<~MSG.tr("\n", ' ').strip!
        [#{job.id}] #{job_type} job for #{consumer}
-        on #{topic}
+        on #{topic}-#{partition} finished in #{time} ms
      MSG
    end

@@ -108,7 +108,7 @@ module Karafka

      info <<~MSG.tr("\n", ' ').strip!
        [#{client.id}]
-        Pausing on topic #{topic}
+        Pausing on topic #{topic}-#{partition}
        on #{offset ? "offset #{offset}" : 'the consecutive offset'}
      MSG
    end
@@ -122,7 +122,7 @@ module Karafka
      client = event[:caller]

      info <<~MSG.tr("\n", ' ').strip!
-        [#{client.id}] Resuming on topic #{topic}
+        [#{client.id}] Resuming on topic #{topic}-#{partition}
      MSG
    end

@@ -138,7 +138,7 @@ module Karafka

      info <<~MSG.tr("\n", ' ').strip!
        [#{consumer.id}] Retrying of #{consumer.class} after #{timeout} ms
-        on topic #{topic}
+        on topic #{topic}-#{partition} from offset #{offset}
      MSG
    end

@@ -153,7 +153,7 @@ module Karafka

      info <<~MSG.tr("\n", ' ').strip!
        [#{consumer.id}] Seeking from #{consumer.class}
-        on topic #{topic}
+        on topic #{topic}-#{partition} to offset #{seek_offset}
      MSG
    end

@@ -233,7 +233,7 @@ module Karafka
        info "#{group_prefix}: No partitions revoked"
      else
        revoked_partitions.each do |topic, partitions|
-          info "#{group_prefix}:
+          info "#{group_prefix}: #{topic}-[#{partitions.join(',')}] revoked"
        end
      end
    end
@@ -251,7 +251,7 @@ module Karafka
        info "#{group_prefix}: No partitions assigned"
      else
        assigned_partitions.each do |topic, partitions|
-          info "#{group_prefix}:
+          info "#{group_prefix}: #{topic}-[#{partitions.join(',')}] assigned"
        end
      end
    end
@@ -269,7 +269,7 @@ module Karafka

      info <<~MSG.tr("\n", ' ').strip!
        [#{consumer.id}] Dispatched message #{offset}
-        from #{topic}
+        from #{topic}-#{partition}
        to DLQ topic: #{dlq_topic}
      MSG
    end
@@ -288,7 +288,7 @@ module Karafka
      info <<~MSG.tr("\n", ' ').strip!
        [#{consumer.id}] Throttled and will resume
        from message #{offset}
-        on #{topic}
+        on #{topic}-#{partition}
      MSG
    end

@@ -303,7 +303,7 @@ module Karafka

      info <<~MSG.tr("\n", ' ').strip!
        [#{consumer.id}] Post-filtering seeking to message #{offset}
-        on #{topic}
+        on #{topic}-#{partition}
      MSG
    end

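All of the hunks above apply the same normalization so that topic and partition render as a single `topic-partition` token in the logs. Illustrative output only (ids, consumer class, and topic names are made up):

    [abc-123] Consume job for OrdersConsumer on orders_states-0 started
    [abc-123] Consume job for OrdersConsumer on orders_states-0 finished in 12.4 ms
    [client-1] Pausing on topic orders_states-0 on offset 42
    example_app_group: orders_states-[0,1,2] assigned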
data/lib/karafka/instrumentation/vendors/kubernetes/base_listener.rb
CHANGED
@@ -8,11 +8,12 @@ module Karafka
    # Namespace for instrumentation related with Kubernetes
    module Kubernetes
      # Base Kubernetes Listener providing basic HTTP server capabilities to respond with health
+      #   statuses
      class BaseListener
        include ::Karafka::Core::Helpers::Time

        # All good with Karafka
-        OK_CODE = '
+        OK_CODE = '200 OK'

        # Some timeouts, fail
        FAIL_CODE = '500 Internal Server Error'
@@ -38,11 +39,15 @@ module Karafka

        # Responds to a HTTP request with the process liveness status
        def respond
+          body = JSON.generate(status_body)
+
          client = @server.accept
          client.gets
          client.print "HTTP/1.1 #{healthy? ? OK_CODE : FAIL_CODE}\r\n"
-          client.print "Content-Type:
+          client.print "Content-Type: application/json\r\n"
+          client.print "Content-Length: #{body.bytesize}\r\n"
          client.print "\r\n"
+          client.print body
          client.close

          true
@@ -50,6 +55,16 @@ module Karafka
          !@server.closed?
        end

+        # @return [Hash] hash that will be the response body
+        def status_body
+          {
+            status: healthy? ? 'healthy' : 'unhealthy',
+            timestamp: Time.now.to_i,
+            port: @port,
+            process_id: ::Process.pid
+          }
+        end
+
        # Starts background thread with micro-http monitoring
        def start
          @server = TCPServer.new(*[@hostname, @port].compact)
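Since the listener now returns `200` with a JSON body instead of a bare `204`, a probe or an operator can inspect why a process is unhealthy. A hedged sketch of reading that endpoint (host and port are whatever the listener was started with; the fields mirror `status_body` above plus the `errors` section merged in by the subclasses):

    require 'json'
    require 'net/http'

    # Assumes a liveness listener is already subscribed and serving on this port
    response = Net::HTTP.get_response(URI('http://127.0.0.1:3000/'))
    payload = JSON.parse(response.body)
    # e.g. { "status" => "healthy", "timestamp" => 1750000000,
    #        "port" => 3000, "process_id" => 12345 }

    exit(1) unless response.code == '200' && payload['status'] == 'healthy'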
data/lib/karafka/instrumentation/vendors/kubernetes/liveness_listener.rb
CHANGED
@@ -53,7 +53,7 @@ module Karafka
          consuming_ttl: 5 * 60 * 1_000,
          polling_ttl: 5 * 60 * 1_000
        )
-          # If this is set to
+          # If this is set to a symbol, it indicates unrecoverable error like fencing
          # While fencing can be partial (for one of the SGs), we still should consider this
          # as an undesired state for the whole process because it halts processing in a
          # non-recoverable manner forever
@@ -116,7 +116,7 @@ module Karafka
          # We mark as unrecoverable only on certain errors that will not be fixed by retrying
          return unless UNRECOVERABLE_RDKAFKA_ERRORS.include?(error.code)

-          @unrecoverable =
+          @unrecoverable = error.code
        end

        # Deregister the polling tracker for given listener
@@ -142,17 +142,29 @@ module Karafka
        # Did we exceed any of the ttls
        # @return [String] 204 string if ok, 500 otherwise
        def healthy?
-          time = monotonic_now
-
          return false if @unrecoverable
-          return false if
-          return false if
+          return false if polling_ttl_exceeded?
+          return false if consuming_ttl_exceeded?

          true
        end

        private

+        # @return [Boolean] true if the consumer exceeded the polling ttl
+        def polling_ttl_exceeded?
+          time = monotonic_now
+
+          @pollings.values.any? { |tick| (time - tick) > @polling_ttl }
+        end
+
+        # @return [Boolean] true if the consumer exceeded the consuming ttl
+        def consuming_ttl_exceeded?
+          time = monotonic_now
+
+          @consumptions.values.any? { |tick| (time - tick) > @consuming_ttl }
+        end
+
        # Wraps the logic with a mutex
        # @param block [Proc] code we want to run in mutex
        def synchronize(&block)
@@ -191,6 +203,17 @@ module Karafka
            @consumptions.delete(thread_id)
          end
        end
+
+        # @return [Hash] response body status
+        def status_body
+          super.merge!(
+            errors: {
+              polling_ttl_exceeded: polling_ttl_exceeded?,
+              consumption_ttl_exceeded: consuming_ttl_exceeded?,
+              unrecoverable: @unrecoverable
+            }
+          )
+        end
      end
    end
  end
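For completeness, a sketch of how such a listener is typically wired up in `karafka.rb`; the constructor keywords and the millisecond TTL defaults match the ones visible in this diff, while the hostname and port values are illustrative:

    listener = ::Karafka::Instrumentation::Vendors::Kubernetes::LivenessListener.new(
      hostname: '0.0.0.0',
      port: 3000,
      consuming_ttl: 5 * 60 * 1_000, # ms
      polling_ttl: 5 * 60 * 1_000    # ms
    )

    Karafka.monitor.subscribe(listener)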
data/lib/karafka/instrumentation/vendors/kubernetes/swarm_liveness_listener.rb
CHANGED
@@ -47,6 +47,15 @@ module Karafka
        def healthy?
          (monotonic_now - @controlling) < @controlling_ttl
        end
+
+        # @return [Hash] response body status
+        def status_body
+          super.merge!(
+            errors: {
+              controlling_ttl_exceeded: !healthy?
+            }
+          )
+        end
      end
    end
  end
data/lib/karafka/pro/encryption.rb
CHANGED
@@ -22,7 +22,10 @@ module Karafka
 
      # @param config [Karafka::Core::Configurable::Node] root node config
      def post_setup(config)
-        Encryption::Contracts::Config.new.validate!(
+        Encryption::Contracts::Config.new.validate!(
+          config.to_h,
+          scope: %w[config]
+        )
 
        # Don't inject extra components if encryption is not active
        return unless config.encryption.active
data/lib/karafka/pro/recurring_tasks.rb
CHANGED
@@ -29,7 +29,10 @@ module Karafka
        @schedule.instance_exec(&block)

        @schedule.each do |task|
-          Contracts::Task.new.validate!(
+          Contracts::Task.new.validate!(
+            task.to_h,
+            scope: ['recurring_tasks', task.id]
+          )
        end

        @schedule
@@ -59,7 +62,10 @@ module Karafka

      # @param config [Karafka::Core::Configurable::Node] root node config
      def post_setup(config)
-        RecurringTasks::Contracts::Config.new.validate!(
+        RecurringTasks::Contracts::Config.new.validate!(
+          config.to_h,
+          scope: %w[config]
+        )

        # Published after task is successfully executed
        Karafka.monitor.notifications_bus.register_event('recurring_tasks.task.executed')
data/lib/karafka/pro/routing/features/swarm/contracts/routing.rb
CHANGED
@@ -28,7 +28,8 @@ module Karafka
        # Validates that each node has at least one assignment.
        #
        # @param builder [Karafka::Routing::Builder]
-
+        # @param scope [Array<String>]
+        def validate!(builder, scope: [])
          nodes_setup = Hash.new do |h, node_id|
            h[node_id] = { active: false, node_id: node_id }
          end
@@ -49,7 +50,7 @@ module Karafka
          end

          nodes_setup.each_value do |details|
-            super(details)
+            super(details, scope: scope)
          end
        end

data/lib/karafka/pro/routing/features/swarm.rb
CHANGED
@@ -17,7 +17,10 @@ module Karafka
        # @param config [Karafka::Core::Configurable::Node] app config
        def post_setup(config)
          config.monitor.subscribe('app.before_warmup') do
-            Contracts::Routing.new.validate!(
+            Contracts::Routing.new.validate!(
+              config.internal.routing.builder,
+              scope: %w[swarm]
+            )
          end
        end
      end
data/lib/karafka/pro/scheduled_messages/proxy.rb
CHANGED
@@ -60,7 +60,11 @@ module Karafka
        # We need to ensure that the message we want to proxy is fully legit. Otherwise, since
        # we envelope details like target topic, we could end up having incorrect data to
        # schedule
-        MSG_CONTRACT.validate!(
+        MSG_CONTRACT.validate!(
+          message,
+          WaterDrop::Errors::MessageInvalidError,
+          scope: %w[scheduled_messages message]
+        )

        headers = (message[:headers] || {}).merge(
          'schedule_schema_version' => ScheduledMessages::SCHEMA_VERSION,
@@ -166,9 +170,17 @@ module Karafka
        # complies with our requirements
        # @param proxy_message [Hash] our message envelope
        def validate!(proxy_message)
-          POST_CONTRACT.validate!(
+          POST_CONTRACT.validate!(
+            proxy_message,
+            scope: %w[scheduled_messages message]
+          )
+
          # After proxy specific validations we also ensure, that the final form is correct
-          MSG_CONTRACT.validate!(
+          MSG_CONTRACT.validate!(
+            proxy_message,
+            WaterDrop::Errors::MessageInvalidError,
+            scope: %w[scheduled_messages message]
+          )
        end
      end
    end
data/lib/karafka/pro/scheduled_messages.rb
CHANGED
@@ -51,7 +51,10 @@ module Karafka
 
      # @param config [Karafka::Core::Configurable::Node] root node config
      def post_setup(config)
-
+        ScheduledMessages::Contracts::Config.new.validate!(
+          config.to_h,
+          scope: %w[config]
+        )
      end

      # Basically since we may have custom producers configured that are not the same as the
data/lib/karafka/routing/builder.rb
CHANGED
@@ -50,15 +50,24 @@ module Karafka

      # Ensures high-level routing details consistency
      # Contains checks that require knowledge about all the consumer groups to operate
-      Contracts::Routing.new.validate!(
+      Contracts::Routing.new.validate!(
+        map(&:to_h),
+        scope: %w[routes]
+      )

      each do |consumer_group|
        # Validate consumer group settings
-        Contracts::ConsumerGroup.new.validate!(
+        Contracts::ConsumerGroup.new.validate!(
+          consumer_group.to_h,
+          scope: ['routes', consumer_group.name]
+        )

        # and then its topics settings
        consumer_group.topics.each do |topic|
-          Contracts::Topic.new.validate!(
+          Contracts::Topic.new.validate!(
+            topic.to_h,
+            scope: ['routes', consumer_group.name, topic.name]
+          )
        end

        # Initialize subscription groups after all the routing is done
data/lib/karafka/routing/features/base/expander.rb
CHANGED
@@ -38,13 +38,19 @@ module Karafka

        each do |consumer_group|
          if scope::Contracts.const_defined?('ConsumerGroup', false)
-            scope::Contracts::ConsumerGroup.new.validate!(
+            scope::Contracts::ConsumerGroup.new.validate!(
+              consumer_group.to_h,
+              scope: ['routes', consumer_group.name]
+            )
          end

          next unless scope::Contracts.const_defined?('Topic', false)

          consumer_group.topics.each do |topic|
-            scope::Contracts::Topic.new.validate!(
+            scope::Contracts::Topic.new.validate!(
+              topic.to_h,
+              scope: ['routes', consumer_group.name, topic.name]
+            )
          end
        end

data/lib/karafka/server.rb
CHANGED
@@ -51,7 +51,10 @@ module Karafka
      # embedded
      # We cannot validate this during the start because config needs to be populated and routes
      # need to be defined.
-      cli_contract.validate!(
+      cli_contract.validate!(
+        activity_manager.to_h,
+        scope: %w[cli]
+      )

      # We clear as we do not want parent handlers in case of working from fork
      process.clear
data/lib/karafka/setup/config.rb
CHANGED
@@ -131,11 +131,20 @@ module Karafka
      # option max_wait_time [Integer] We wait only for this amount of time before raising error
      # as we intercept this error and retry after checking that the operation was finished or
      # failed using external factor.
-
+      #
+      # For async this will finish immediately but for sync operations this will wait and we
+      # will get a confirmation. 60 seconds is ok for both cases as for async, the re-wait will
+      # kick in
+      setting :max_wait_time, default: 60 * 1_000
+
+      # How long should we wait on admin operation retrying before giving up and raising an
+      # error that result is not visible
+      setting :max_retries_duration, default: 60_000

-      #
-      #
-
+      # In case of fast-finished async work, this `retry_backoff` help us not re-query Kafka
+      # too fast after previous call to check the async operation results. Basically prevents
+      # us from spamming metadata requests to Kafka in a loop
+      setting :retry_backoff, default: 500

      # option poll_timeout [Integer] time in ms
      # How long should a poll wait before yielding on no results (rdkafka-ruby setting)
@@ -352,7 +361,10 @@ module Karafka

        configure(&block)

-        Contracts::Config.new.validate!(
+        Contracts::Config.new.validate!(
+          config.to_h,
+          scope: %w[config]
+        )

        configure_components

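The three admin timing settings work together: `max_wait_time` bounds a single handler wait, `retry_backoff` spaces out the visibility re-checks, and `max_retries_duration` caps the whole retry window. A sketch of overriding them during setup (values are illustrative; unrelated required settings are omitted):

    Karafka::App.setup do |config|
      config.admin.max_wait_time = 60 * 1_000    # single wait, ms
      config.admin.retry_backoff = 500           # delay between re-checks, ms (>= 100)
      config.admin.max_retries_duration = 60_000 # total retry budget, ms (>= 1_000)
    end

    # Worst case this allows roughly 60_000 / 500 == 120 visibility re-checks
    # before Karafka::Errors::ResultNotVisibleError is raised.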
data/lib/karafka/swarm/supervisor.rb
CHANGED
@@ -42,7 +42,10 @@ module Karafka
      # Creates needed number of forks, installs signals and starts supervision
      def run
        # Validate the CLI provided options the same way as we do for the regular server
-        cli_contract.validate!(
+        cli_contract.validate!(
+          activity_manager.to_h,
+          scope: %w[swarm cli]
+        )

        # Close producer just in case. While it should not be used, we do not want even a
        # theoretical case since librdkafka is not thread-safe.
@@ -154,7 +157,7 @@ module Karafka
        # Run forceful kill
        manager.terminate
        # And wait until linux kills them
-        # This prevents us from
+        # This prevents us from exiting forcefully with any dead child process still existing
        # Since we have sent the `KILL` signal, it must die, so we can wait until all dead
        sleep(supervision_sleep) until manager.stopped?

data/lib/karafka/version.rb
CHANGED
metadata
CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: karafka
 version: !ruby/object:Gem::Version
-  version: 2.5.0
+  version: 2.5.0
 platform: ruby
 authors:
 - Maciej Mensfeld
@@ -29,7 +29,7 @@ dependencies:
   requirements:
   - - ">="
     - !ruby/object:Gem::Version
-      version: 2.5.
+      version: 2.5.2
   - - "<"
     - !ruby/object:Gem::Version
       version: 2.6.0
@@ -39,7 +39,7 @@ dependencies:
   requirements:
   - - ">="
     - !ruby/object:Gem::Version
-      version: 2.5.
+      version: 2.5.2
   - - "<"
     - !ruby/object:Gem::Version
       version: 2.6.0