karafka 2.0.36 → 2.0.38

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 128d19fa065cbb43fae7a23a424e44ec30077fe58c93d7c122da299ec5b98df1
-  data.tar.gz: 0b396fa75d93257f32a9ddf94f0dc89d2456b6ed07acad5c581908d148cf0a6f
+  metadata.gz: 1b9653385cf5a3b1e27eae06d53b9761c9a1f265252f721773258459eb3df1e7
+  data.tar.gz: c0af983ab0539e8463bf2612068a6b261de1325078c3e8600b0d6df0f596d100
 SHA512:
-  metadata.gz: 68d4da94eaf7a1033c95dd02bdc9a4bd8e69040df4db304dbcb6f8255997a40c003c999eea6442f6c1a1640c3e1ed44032c3826191e96085afbccc8600a8f2b6
-  data.tar.gz: 025fbc8edb26f6f8442e11ea90afdc3e07873efcba3f6778549cd66e0738dfec73d956d877c44ea39bc1064c22df9fca6d3c68a42ef97ad4438ed7812b690ea1
+  metadata.gz: d9000a8f71d7fff762db5f567956f6ea68e436b428014c509ae233730c9f75fd6ac311e51b0022999dfdce64362c86dab6912ce549378d9def231e5749961140
+  data.tar.gz: f980261b5ada2f46efbf919aac86ab63da5bccce26639b9e7d98c07c6012cc3c727189a548627687092ee2802aca8df3d5459bcdcc8d9d29b35f2d6da92a64fc
checksums.yaml.gz.sig CHANGED
Binary file
data/CHANGELOG.md CHANGED
@@ -1,5 +1,15 @@
 # Karafka framework changelog
 
+## 2.0.38 (2023-03-27)
+- [Improvement] Introduce `Karafka::Admin#read_watermark_offsets` to get the low and high watermark offset values.
+- [Improvement] Track `active_job_id` in instrumentation (#1372)
+- [Improvement] Improve `#read_topic` reading for compacted partitions where the requested offset is below the low watermark offset. Reading now starts at the low watermark and never attempts offsets below it.
+- [Improvement] Allow `#read_topic` to accept per-call settings that overwrite the defaults, to customize reading behaviour.
+
+## 2.0.37 (2023-03-20)
+- [Fix] Declarative topics execution on a secondary cluster runs topics creation on the primary one (#1365)
+- [Fix] Admin read operations commit offsets when not needed (#1369)
+
 ## 2.0.36 (2023-03-17)
 - [Refactor] Rename internal naming of `Structurable` to `Declaratives` for the declarative topics feature.
 - [Fix] AJ + DLQ + MOM + LRJ is pausing indefinitely after the first job (#1362)
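
In practice, the two `Karafka::Admin` additions compose as below. A minimal sketch assuming a topic named `events` with partition `0` (illustrative names; the signatures come from the `data/lib/karafka/admin.rb` diff further down):

```ruby
# New in 2.0.38: low and high watermark offsets for a topic partition
low, high = Karafka::Admin.read_watermark_offsets('events', 0)

# Read at most 10 messages starting from the latest offset (-1), passing
# per-call kafka settings via the new optional last argument
messages = Karafka::Admin.read_topic(
  'events', 0, 10, -1,
  { 'fetch.message.max.bytes': 1_048_576 }
)
```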
data/Gemfile.lock CHANGED
@@ -1,7 +1,7 @@
 PATH
   remote: .
   specs:
-    karafka (2.0.36)
+    karafka (2.0.38)
       karafka-core (>= 2.0.12, < 3.0.0)
       thor (>= 0.20)
       waterdrop (>= 2.4.10, < 3.0.0)
@@ -10,10 +10,10 @@ PATH
 GEM
   remote: https://rubygems.org/
   specs:
-    activejob (7.0.4.2)
-      activesupport (= 7.0.4.2)
+    activejob (7.0.4.3)
+      activesupport (= 7.0.4.3)
       globalid (>= 0.3.6)
-    activesupport (7.0.4.2)
+    activesupport (7.0.4.3)
       concurrent-ruby (~> 1.0, >= 1.0.2)
       i18n (>= 1.6, < 2)
       minitest (>= 5.1)
data/README.md CHANGED
@@ -86,7 +86,7 @@ bundle exec karafka server
 
 I also sell Karafka Pro subscriptions. It includes a commercial-friendly license, priority support, architecture consultations, enhanced Web UI and high throughput data processing-related features (virtual partitions, long-running jobs, and more).
 
-**20%** of the income will be distributed back to other OSS projects that Karafka uses under the hood.
+**10%** of the income will be distributed back to other OSS projects that Karafka uses under the hood.
 
 Help me provide high-quality open-source software. Please see the Karafka [homepage](https://karafka.io/#become-pro) for more details.
 
@@ -12,16 +12,31 @@ module Karafka
         messages.each do |message|
           break if Karafka::App.stopping?
 
-          # We technically speaking could set this as deserializer and reference it from the
-          # message instead of using the `#raw_payload`. This is not done on purpose to simplify
-          # the ActiveJob setup here
-          job = ::ActiveSupport::JSON.decode(message.raw_payload)
+          consume_job(message)
 
-          tags.add(:job_class, job['job_class'])
+          mark_as_consumed(message)
+        end
+      end
 
-          ::ActiveJob::Base.execute(job)
+      private
 
-          mark_as_consumed(message)
+      # Consumes a message with the job and runs needed instrumentation
+      #
+      # @param job_message [Karafka::Messages::Message] message with active job
+      def consume_job(job_message)
+        # We technically speaking could set this as deserializer and reference it from the
+        # message instead of using the `#raw_payload`. This is not done on purpose to simplify
+        # the ActiveJob setup here
+        job = ::ActiveSupport::JSON.decode(job_message.raw_payload)
+
+        tags.add(:job_class, job['job_class'])
+
+        payload = { caller: self, job: job, message: job_message }
+
+        # We publish both to make it consistent with `consumer.x` events
+        Karafka.monitor.instrument('active_job.consume', payload)
+        Karafka.monitor.instrument('active_job.consumed', payload) do
+          ::ActiveJob::Base.execute(job)
         end
       end
     end
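
Both new events are registered on the notifications bus (see the `EVENTS` list further down), so they can be subscribed to like any other Karafka event. A minimal sketch; the log message format and the `time` payload key (assumed to be populated by the instrumentation layer for block-based events) are illustrative assumptions:

```ruby
Karafka.monitor.subscribe('active_job.consumed') do |event|
  job = event[:job]

  # `job` is the decoded ActiveJob payload hash from the event above
  Karafka.logger.info(
    "ActiveJob #{job['job_class']} consumed in #{event[:time]} ms"
  )
end
```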
data/lib/karafka/admin.rb CHANGED
@@ -28,7 +28,10 @@ module Karafka
       'enable.partition.eof': true,
       'statistics.interval.ms': 0,
       # Fetch at most 5 MBs when using admin
-      'fetch.message.max.bytes': 5 * 1_048_576
+      'fetch.message.max.bytes': 5 * 1_048_576,
+      # Do not commit offsets automatically; this prevents offset tracking for operations
+      # involving a consumer instance
+      'enable.auto.commit': false
     }.freeze
 
     private_constant :Topic, :CONFIG_DEFAULTS, :MAX_WAIT_TIMEOUT, :MAX_ATTEMPTS
@@ -41,17 +44,32 @@ module Karafka
       # @param count [Integer] how many messages we want to get at most
       # @param start_offset [Integer] offset from which we should start. If -1 is provided
       #   (default) we will start from the latest offset
+      # @param settings [Hash] extra kafka settings (optional)
       #
       # @return [Array<Karafka::Messages::Message>] array with messages
-      def read_topic(name, partition, count, start_offset = -1)
+      def read_topic(name, partition, count, start_offset = -1, settings = {})
        messages = []
        tpl = Rdkafka::Consumer::TopicPartitionList.new
+        low_offset, high_offset = nil
 
-        with_consumer do |consumer|
-          offsets = consumer.query_watermark_offsets(name, partition)
-          end_offset = offsets.last
+        with_consumer(settings) do |consumer|
+          low_offset, high_offset = consumer.query_watermark_offsets(name, partition)
+
+          # Select the offset dynamically if -1 or less was provided
+          start_offset = high_offset - count if start_offset.negative?
 
-          start_offset = [0, offsets.last - count].max if start_offset.negative?
+          # Build the requested range - since the first element is on the start offset, we
+          # subtract one from the requested count to end up with the expected number of elements
+          requested_range = (start_offset..start_offset + (count - 1))
+          # Establish the theoretically available range. Note that this does not handle cases
+          # related to log retention or compaction
+          available_range = (low_offset..high_offset)
+          # Select only offsets that can actually be fetched. This removes all the potential
+          # offsets that are below the low watermark offset
+          possible_range = requested_range.select { |offset| available_range.include?(offset) }
+
+          start_offset = possible_range.first
+          count = possible_range.count
 
          tpl.add_topic_and_partitions_with_offsets(name, partition => start_offset)
          consumer.assign(tpl)
@@ -61,11 +79,15 @@ module Karafka
          loop do
            # If we've got as many messages as we've wanted stop
            break if messages.size >= count
-           # If we've reached end of the topic messages, don't process more
-           break if !messages.empty? && end_offset <= messages.last.offset
 
            message = consumer.poll(200)
-           messages << message if message
+
+           next unless message
+
+           # If the message we've got is beyond the requested range, stop
+           break unless possible_range.include?(message.offset)
+
+           messages << message
          rescue Rdkafka::RdkafkaError => e
            # End of partition
            break if e.code == :partition_eof
@@ -74,7 +96,7 @@ module Karafka
          end
        end
 
-      messages.map do |message|
+      messages.map! do |message|
         Messages::Builders::Message.call(
           message,
           # Use topic from routes if we can match it or create a dummy one
@@ -133,6 +155,17 @@ module Karafka
        end
      end
 
+      # Fetches the watermark offsets for a given topic partition
+      #
+      # @param name [String, Symbol] topic name
+      # @param partition [Integer] partition
+      # @return [Array<Integer, Integer>] low watermark offset and high watermark offset
+      def read_watermark_offsets(name, partition)
+        with_consumer do |consumer|
+          consumer.query_watermark_offsets(name, partition)
+        end
+      end
+
      # @return [Rdkafka::Metadata] cluster metadata info
      def cluster_info
        with_admin do |admin|
@@ -156,15 +189,16 @@ module Karafka
 
      # Creates admin instance and yields it. After usage it closes the admin instance
      def with_admin
-        admin = config(:producer).admin
+        admin = config(:producer, {}).admin
        yield(admin)
      ensure
        admin&.close
      end
 
      # Creates consumer instance and yields it. After usage it closes the consumer instance
-      def with_consumer
-        consumer = config(:consumer).consumer
+      # @param settings [Hash] extra settings to customize consumer
+      def with_consumer(settings = {})
+        consumer = config(:consumer, settings).consumer
        yield(consumer)
      ensure
        consumer&.close
@@ -193,11 +227,12 @@ module Karafka
      end
 
      # @param type [Symbol] type of config we want
+      # @param settings [Hash] extra settings for config (if needed)
      # @return [::Rdkafka::Config] rdkafka config
-      def config(type)
+      def config(type, settings)
        config_hash = Karafka::Setup::AttributesMap.public_send(
          type,
-          Karafka::App.config.kafka.dup.merge(CONFIG_DEFAULTS)
+          Karafka::App.config.kafka.dup.merge(CONFIG_DEFAULTS).merge!(settings)
        )
 
        ::Rdkafka::Config.new(config_hash)
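
The clamping logic that `#read_topic` gained above is easiest to follow with concrete numbers. A worked sketch with assumed watermark values for a compacted partition:

```ruby
low_offset, high_offset = 95, 100 # compaction/retention moved the low watermark up
start_offset, count = 90, 8       # caller asked for offsets 90..97

requested_range = (start_offset..start_offset + (count - 1)) # 90..97
available_range = (low_offset..high_offset)                  # 95..100
possible_range  = requested_range.select { |offset| available_range.include?(offset) }
# => [95, 96, 97]

start_offset = possible_range.first # 95 - clamped to the low watermark
count = possible_range.count        # 3  - nothing below the low watermark is polled
```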
@@ -116,11 +116,14 @@ module Karafka
        return @declaratives_routing_topics if @declaratives_routing_topics
 
        collected_topics = {}
+        default_servers = Karafka::App.config.kafka[:'bootstrap.servers']
 
        App.consumer_groups.each do |consumer_group|
          consumer_group.topics.each do |topic|
            # Skip topics that were explicitly disabled from management
            next unless topic.declaratives.active?
+            # If bootstrap servers are different, consider this a different cluster
+            next unless default_servers == topic.kafka[:'bootstrap.servers']
 
            collected_topics[topic.name] ||= topic
          end
@@ -17,6 +17,9 @@ module Karafka
     # complete list of all the events. Please use the #available_events on a fully loaded
     # Karafka system to determine all of the events you can use.
     EVENTS = %w[
+      active_job.consume
+      active_job.consumed
+
       app.initialized
       app.running
       app.quieting
@@ -22,7 +22,7 @@ module Karafka
        #
        # It contains slightly better revocation warranties than the regular blocking consumer as
        # it can stop processing batch of jobs in the middle after the revocation.
-        class Consumer < Karafka::Pro::BaseConsumer
+        class Consumer < ::Karafka::ActiveJob::Consumer
          # Runs ActiveJob jobs processing and handles lrj if needed
          def consume
            messages.each do |message|
@@ -31,11 +31,7 @@ module Karafka
              break if revoked?
              break if Karafka::App.stopping?
 
-              job = ::ActiveSupport::JSON.decode(message.raw_payload)
-
-              tags.add(:job_class, job['job_class'])
-
-              ::ActiveJob::Base.execute(job)
+              consume_job(message)
 
              # We cannot mark jobs as done after each if there are virtual partitions. Otherwise
              # this could create random markings.
@@ -3,5 +3,5 @@
 # Main module namespace
 module Karafka
   # Current Karafka version
-  VERSION = '2.0.36'
+  VERSION = '2.0.38'
 end
data.tar.gz.sig CHANGED
Binary file
metadata CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: karafka
 version: !ruby/object:Gem::Version
-  version: 2.0.36
+  version: 2.0.38
 platform: ruby
 authors:
 - Maciej Mensfeld
@@ -35,7 +35,7 @@ cert_chain:
   Qf04B9ceLUaC4fPVEz10FyobjaFoY4i32xRto3XnrzeAgfEe4swLq8bQsR3w/EF3
   MGU0FeSV2Yj7Xc2x/7BzLK8xQn5l7Yy75iPF+KP3vVmDHnNl
 -----END CERTIFICATE-----
-date: 2023-03-17 00:00:00.000000000 Z
+date: 2023-03-27 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: karafka-core
metadata.gz.sig CHANGED
Binary file