karafka 2.0.32 → 2.0.34

This diff shows the content of publicly available package versions released to one of the supported registries. It is provided for informational purposes only and reflects the changes between those versions as they appear in their public registries.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: f5364527333b7924241340cbf9df8b3c189447ccf6c1b79612845d2d170990fe
- data.tar.gz: d590b8f940a5fa00926d386e196607124e090888e0fa1fff321d935cb0818d47
+ metadata.gz: 36d890d825aaeaee5349dcc653d888da3a023c01a837864544a905db977569c4
+ data.tar.gz: be442485812a05a030bab33da31a8e2fda684add8c4d59a0af78f517bb2519bd
  SHA512:
- metadata.gz: 25cec5ed66eb1199ec92c0206a34fba4583d79a9ba7d3ce68041855c48e1bb4bf97590ada597649c801340026270c604ed2a00c57c12f90c6fb861e3d85fd0b3
- data.tar.gz: dbd604d94c1dc1df6a0040b24e4c5754205303ec7d7a3a15271d2309df3caa56f34f94ef7ea44904b020b57154c17dd06688b23b94ded4230dde321e5f3f1d91
+ metadata.gz: d92be137485c436c1ed02435669785422e6e4da194ab19b97dca31f70b530fe7e0ae4e6b0c7c54895dbceca2d8fe4ef1bdcabdb287affe06e377942062777979
+ data.tar.gz: 0525b652373088a7a692134a6a4c89487e495e01934504d256a60ae7805c92f65a308d38c8d75ad806f39f5128d2d00865e10b1b4ae326c7dd60e7594c68558e
checksums.yaml.gz.sig CHANGED
Binary file
data/CHANGELOG.md CHANGED
@@ -1,11 +1,55 @@
  # Karafka framework changelog
 
- ## 2.0.32 (2022-02-13)
+ ## 2.0.34 (2023-03-04)
+ - [Improvement] Attach an `embedded` tag to Karafka processes started using the embedded API.
+ - [Change] Renamed `Datadog::Listener` to `Datadog::MetricsListener` for consistency. (#1124)
+
+ ### Upgrade notes
+
+ 1. Replace `Datadog::Listener` references with `Datadog::MetricsListener`.
+
+ ## 2.0.33 (2023-02-24)
+ - **[Feature]** Support `perform_all_later` in the ActiveJob adapter for Rails `7.1+`
+ - **[Feature]** Introduce the ability to assign and re-assign tags in consumer instances. This can be used for extra instrumentation that is context aware.
+ - **[Feature]** Introduce the ability to assign and re-assign tags to the `Karafka::Process`.
+ - [Improvement] When using the `ActiveJob` adapter, automatically tag jobs with the name of the `ActiveJob` class that is running inside the `ActiveJob` consumer.
+ - [Improvement] Make the `::Karafka::Instrumentation::Notifications::EVENTS` list public for anyone wanting to re-bind those events into a different notification bus.
+ - [Improvement] Set `fetch.message.max.bytes` for `Karafka::Admin` to `5MB` to make sure that all data is fetched correctly for the Web UI under heavy load (many consumers).
+ - [Improvement] Introduce a `strict_topics_namespacing` config option to enable/disable the strict topic naming validations. This can be useful when working with pre-existing topics that we cannot or do not want to rename.
+ - [Fix] Karafka monitor is prematurely cached (#1314)
+
+ ### Upgrade notes
+
+ Since `#tags` were introduced on consumers, the `#tags` method is now part of the consumers API.
+
+ This means that if you were using a method called `#tags` in your consumers, you will have to rename it:
+
+ ```ruby
+ class EventsConsumer < ApplicationConsumer
+   def consume
+     messages.each do |message|
+       tags << message.payload.tag
+     end
+
+     tags.each { |tag| puts tag }
+   end
+
+   private
+
+   # This will collide with the tagging API
+   # This NEEDS to be renamed so it does not collide with the `#tags` method provided by the consumers API.
+   def tags
+     @tags ||= Set.new
+   end
+ end
+ ```
+
+ ## 2.0.32 (2023-02-13)
  - [Fix] Many non-existing topic subscriptions propagate poll errors beyond client
  - [Improvement] Ignore `unknown_topic_or_part` errors in dev when `allow.auto.create.topics` is on.
  - [Improvement] Optimize temporary errors handling in polling for a better backoff policy
 
- ## 2.0.31 (2022-02-12)
+ ## 2.0.31 (2023-02-12)
  - [Feature] Allow for adding partitions via `Admin#create_partitions` API.
  - [Fix] Do not ignore admin errors upon invalid configuration (#1254)
  - [Fix] Topic name validation (#1300) - CandyFet
@@ -13,7 +57,7 @@
  - [Maintenance] Require `karafka-core` >= `2.0.11` and switch to shared RSpec locator.
  - [Maintenance] Require `karafka-rdkafka` >= `0.12.1`
 
- ## 2.0.30 (2022-01-31)
+ ## 2.0.30 (2023-01-31)
  - [Improvement] Alias `--consumer-groups` with `--include-consumer-groups`
  - [Improvement] Alias `--subscription-groups` with `--include-subscription-groups`
  - [Improvement] Alias `--topics` with `--include-topics`
@@ -63,7 +107,7 @@ class KarafkaApp < Karafka::App
  - [Improvement] Expand `LoggerListener` with `client.resume` notification.
  - [Improvement] Replace random anonymous subscription group ids with stable ones.
  - [Improvement] Add `consumer.consume`, `consumer.revoke` and `consumer.shutting_down` notification events and move the revocation logic calling to strategies.
- - [Change] Rename job queue statistics `processing` key to `busy`. No changes needed because naming in the DataDog listener stays the same.
+ - [Change] Rename job queue statistics `processing` key to `busy`. No changes needed because naming in the DataDog listener stays the same.
  - [Fix] Fix proctitle listener state changes reporting on new states.
  - [Fix] Make sure all file descriptors are closed in the integration specs.
  - [Fix] Fix a case where empty subscription groups could leak into the execution flow.
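The 2.0.34 upgrade note above boils down to a one-line rename. A minimal sketch of wiring the renamed listener, assuming the `dogstatsd-ruby` client and a locally reachable StatsD agent (the host, port and `listener` variable are illustrative):

```ruby
require 'socket'
require 'datadog/statsd'
require 'karafka/instrumentation/vendors/datadog/metrics_listener'

listener = ::Karafka::Instrumentation::Vendors::Datadog::MetricsListener.new do |config|
  # dogstatsd-ruby client pointing at the local agent (illustrative address)
  config.client = Datadog::Statsd.new('localhost', 8125)
  config.default_tags = ["host:#{Socket.gethostname}"]
end

# Subscribe the listener so it receives statistics and error notifications
Karafka.monitor.subscribe(listener)
```

The old `Datadog::Listener` constant keeps working for now because the diff below aliases it to `MetricsListener`, but new code should use the new name.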
data/Gemfile.lock CHANGED
@@ -1,8 +1,8 @@
  PATH
  remote: .
  specs:
- karafka (2.0.32)
- karafka-core (>= 2.0.11, < 3.0.0)
+ karafka (2.0.34)
+ karafka-core (>= 2.0.12, < 3.0.0)
  thor (>= 0.20)
  waterdrop (>= 2.4.10, < 3.0.0)
  zeitwerk (~> 2.3)
@@ -19,7 +19,7 @@ GEM
  minitest (>= 5.1)
  tzinfo (~> 2.0)
  byebug (11.1.3)
- concurrent-ruby (1.2.0)
+ concurrent-ruby (1.2.2)
  diff-lcs (1.5.0)
  docile (1.4.0)
  factory_bot (6.2.1)
@@ -29,7 +29,7 @@ GEM
  activesupport (>= 5.0)
  i18n (1.12.0)
  concurrent-ruby (~> 1.0)
- karafka-core (2.0.11)
+ karafka-core (2.0.12)
  concurrent-ruby (>= 1.1)
  karafka-rdkafka (>= 0.12.1)
  karafka-rdkafka (0.12.1)
@@ -61,13 +61,12 @@ GEM
  thor (1.2.1)
  tzinfo (2.0.6)
  concurrent-ruby (~> 1.0)
- waterdrop (2.4.10)
- karafka-core (>= 2.0.9, < 3.0.0)
+ waterdrop (2.4.11)
+ karafka-core (>= 2.0.12, < 3.0.0)
  zeitwerk (~> 2.3)
- zeitwerk (2.6.6)
+ zeitwerk (2.6.7)
 
  PLATFORMS
- arm64-darwin-21
  x86_64-linux
 
  DEPENDENCIES
@@ -45,18 +45,23 @@ en:
  dead_letter_queue.topic_format: 'needs to be a string with a Kafka accepted format'
  dead_letter_queue.active_format: needs to be either true or false
  active_format: needs to be either true or false
- inconsistent_namespacing: needs to be consistent namespacing style
+ inconsistent_namespacing: |
+ needs to be consistent namespacing style
+ disable this validation by setting config.strict_topics_namespacing to false
 
  consumer_group:
  missing: needs to be present
  topics_names_not_unique: all topic names within a single consumer group must be unique
- topics_namespaced_names_not_unique: all topic names within a single consumer group must be unique considering namespacing styles
  id_format: 'needs to be a string with a Kafka accepted format'
  topics_format: needs to be a non-empty array
+ topics_namespaced_names_not_unique: |
+ all topic names within a single consumer group must be unique considering namespacing styles
+ disable this validation by setting config.strict_topics_namespacing to false
 
  job_options:
  missing: needs to be present
  dispatch_method_format: needs to be either :produce_async or :produce_sync
+ dispatch_many_method_format: needs to be either :produce_many_async or :produce_many_sync
  partitioner_format: 'needs to respond to #call'
  partition_key_type_format: 'needs to be either :key or :partition_key'
 
data/karafka.gemspec CHANGED
@@ -21,7 +21,7 @@ Gem::Specification.new do |spec|
  without having to focus on things that are not your business domain.
  DESC
 
- spec.add_dependency 'karafka-core', '>= 2.0.11', '< 3.0.0'
+ spec.add_dependency 'karafka-core', '>= 2.0.12', '< 3.0.0'
  spec.add_dependency 'thor', '>= 0.20'
  spec.add_dependency 'waterdrop', '>= 2.4.10', '< 3.0.0'
  spec.add_dependency 'zeitwerk', '~> 2.3'
@@ -11,7 +11,13 @@ module ActiveJob
  #
  # @param job [Object] job that should be enqueued
  def enqueue(job)
- ::Karafka::App.config.internal.active_job.dispatcher.call(job)
+ ::Karafka::App.config.internal.active_job.dispatcher.dispatch(job)
+ end
+
+ # Enqueues multiple jobs in one go
+ # @param jobs [Array<Object>] jobs that we want to enqueue
+ def enqueue_all(jobs)
+ ::Karafka::App.config.internal.active_job.dispatcher.dispatch_many(jobs)
  end
 
  # Raises info, that Karafka backend does not support scheduling jobs
@@ -12,12 +12,14 @@ module Karafka
  messages.each do |message|
  break if Karafka::App.stopping?
 
- ::ActiveJob::Base.execute(
- # We technically speaking could set this as deserializer and reference it from the
- # message instead of using the `#raw_payload`. This is not done on purpose to simplify
- # the ActiveJob setup here
- ::ActiveSupport::JSON.decode(message.raw_payload)
- )
+ # We technically speaking could set this as deserializer and reference it from the
+ # message instead of using the `#raw_payload`. This is not done on purpose to simplify
+ # the ActiveJob setup here
+ job = ::ActiveSupport::JSON.decode(message.raw_payload)
+
+ tags.add(:job_class, job['job_class'])
+
+ ::ActiveJob::Base.execute(job)
 
  mark_as_consumed(message)
  end
@@ -7,13 +7,14 @@ module Karafka
  # Defaults for dispatching
  # The can be updated by using `#karafka_options` on the job
  DEFAULTS = {
- dispatch_method: :produce_async
+ dispatch_method: :produce_async,
+ dispatch_many_method: :produce_many_async
  }.freeze
 
  private_constant :DEFAULTS
 
  # @param job [ActiveJob::Base] job
- def call(job)
+ def dispatch(job)
  ::Karafka.producer.public_send(
  fetch_option(job, :dispatch_method, DEFAULTS),
  topic: job.queue_name,
@@ -21,6 +22,30 @@ module Karafka
  )
  end
 
+ # Bulk dispatches multiple jobs using the Rails 7.1+ API
+ # @param jobs [Array<ActiveJob::Base>] jobs we want to dispatch
+ def dispatch_many(jobs)
+ # Group jobs by their desired dispatch method
+ # It can be configured per job class, so we need to make sure we divide them
+ dispatches = Hash.new { |hash, key| hash[key] = [] }
+
+ jobs.each do |job|
+ d_method = fetch_option(job, :dispatch_many_method, DEFAULTS)
+
+ dispatches[d_method] << {
+ topic: job.queue_name,
+ payload: ::ActiveSupport::JSON.encode(job.serialize)
+ }
+ end
+
+ dispatches.each do |type, messages|
+ ::Karafka.producer.public_send(
+ type,
+ messages
+ )
+ end
+ end
+
  private
 
  # @param job [ActiveJob::Base] job
@@ -15,7 +15,18 @@ module Karafka
  ).fetch('en').fetch('validations').fetch('job_options')
  end
 
- optional(:dispatch_method) { |val| %i[produce_async produce_sync].include?(val) }
+ optional(:dispatch_method) do |val|
+ %i[
+ produce_async
+ produce_sync
+ ].include?(val)
+ end
+ optional(:dispatch_many_method) do |val|
+ %i[
+ produce_many_async
+ produce_many_sync
+ ].include?(val)
+ end
  end
  end
  end
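The hunks above add `enqueue_all` to the ActiveJob queue adapter, a `dispatch_many` dispatcher method, and contract validation for the new `dispatch_many_method` option. A usage sketch, assuming Rails 7.1+ and a hypothetical `EventsJob` class (`karafka_options` is Karafka's per-job configuration helper):

```ruby
# Hypothetical job used only for illustration
class EventsJob < ActiveJob::Base
  queue_as :events

  # Bulk dispatches for this job go out synchronously instead of the
  # default :produce_many_async
  karafka_options(dispatch_many_method: :produce_many_sync)

  def perform(event_id)
    puts "Handling event #{event_id}"
  end
end

# Rails 7.1+ bulk enqueue; Karafka groups the jobs by their dispatch method
# and hands each group to WaterDrop's produce_many_* API in one call
jobs = (1..100).map { |id| EventsJob.new(id) }
ActiveJob.perform_all_later(jobs)
```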
data/lib/karafka/admin.rb CHANGED
@@ -26,7 +26,9 @@ module Karafka
  'group.id': 'karafka_admin',
  # We want to know when there is no more data not to end up with an endless loop
  'enable.partition.eof': true,
- 'statistics.interval.ms': 0
+ 'statistics.interval.ms': 0,
+ # Fetch at most 5 MBs when using admin
+ 'fetch.message.max.bytes': 5 * 1_048_576
  }.freeze
 
  private_constant :Topic, :CONFIG_DEFAULTS, :MAX_WAIT_TIMEOUT, :MAX_ATTEMPTS
@@ -4,6 +4,9 @@
  module Karafka
  # Base consumer from which all Karafka consumers should inherit
  class BaseConsumer
+ # Allow for consumer instance tagging for instrumentation
+ include ::Karafka::Core::Taggable
+
  # @return [String] id of the current consumer
  attr_reader :id
  # @return [Karafka::Routing::Topic] topic to which a given consumer is subscribed
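Including `::Karafka::Core::Taggable` is what gives every consumer instance the `#tags` API referenced in the 2.0.33 upgrade notes. A minimal sketch of context-aware tagging inside `#consume` (the `:tenant` tag and payload key are illustrative):

```ruby
class OrdersConsumer < ApplicationConsumer
  def consume
    messages.each do |message|
      # Tag the consumer instance so instrumentation listeners can report
      # which tenant this batch belonged to
      tags.add(:tenant, message.payload['tenant_id'])

      mark_as_consumed(message)
    end
  end
end
```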
@@ -431,8 +431,7 @@ module Karafka
  Instrumentation::Callbacks::Statistics.new(
  @subscription_group.id,
  @subscription_group.consumer_group_id,
- @name,
- ::Karafka::App.config.monitor
+ @name
  )
  )
 
@@ -442,8 +441,7 @@ module Karafka
  Instrumentation::Callbacks::Error.new(
  @subscription_group.id,
  @subscription_group.consumer_group_id,
- @name,
- ::Karafka::App.config.monitor
+ @name
  )
  )
 
@@ -27,6 +27,7 @@ module Karafka
 
  virtual do |data, errors|
  next unless errors.empty?
+ next unless ::Karafka::App.config.strict_topics_namespacing
 
  names = data.fetch(:topics).map { |topic| topic[:name] }
  names_hash = names.each_with_object({}) { |n, h| h[n] = true }
@@ -51,6 +51,7 @@ module Karafka
 
  virtual do |data, errors|
  next unless errors.empty?
+ next unless ::Karafka::App.config.strict_topics_namespacing
 
  value = data.fetch(:name)
  namespacing_chars_count = value.chars.find_all { |c| ['.', '_'].include?(c) }.uniq.count
@@ -7,7 +7,10 @@ module Karafka
  # Starts Karafka without supervision and without ownership of signals in a background thread
  # so it won't interrupt other things running
  def start
- Thread.new { Karafka::Server.start }
+ Thread.new do
+ Karafka::Process.tags.add(:execution_mode, 'embedded')
+ Karafka::Server.start
+ end
  end
 
  # Stops Karafka upon any event
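This hunk tags the process before booting the server in a background thread, which is what the changelog's `embedded` tag entry refers to. A sketch of the embedded mode it targets, assuming this module is Karafka's `Karafka::Embedded` API:

```ruby
# For example, in an initializer of an already booted Ruby process where a
# separate `karafka server` process is not wanted
Karafka::Embedded.start # spawns the consumer thread and tags the process as embedded

# ... the primary workload of the process runs here ...

# Shut the embedded consumer down together with the host process
at_exit { Karafka::Embedded.stop }
```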
@@ -9,12 +9,10 @@ module Karafka
  # @param subscription_group_id [String] id of the current subscription group instance
  # @param consumer_group_id [String] id of the current consumer group
  # @param client_name [String] rdkafka client name
- # @param monitor [WaterDrop::Instrumentation::Monitor] monitor we are using
- def initialize(subscription_group_id, consumer_group_id, client_name, monitor)
+ def initialize(subscription_group_id, consumer_group_id, client_name)
  @subscription_group_id = subscription_group_id
  @consumer_group_id = consumer_group_id
  @client_name = client_name
- @monitor = monitor
  end
 
  # Runs the instrumentation monitor with error
@@ -26,7 +24,7 @@ module Karafka
  # Same as with statistics (mor explanation there)
  return unless @client_name == client_name
 
- @monitor.instrument(
+ ::Karafka.monitor.instrument(
  'error.occurred',
  caller: self,
  subscription_group_id: @subscription_group_id,
@@ -10,12 +10,10 @@ module Karafka
  # @param subscription_group_id [String] id of the current subscription group
  # @param consumer_group_id [String] id of the current consumer group
  # @param client_name [String] rdkafka client name
- # @param monitor [WaterDrop::Instrumentation::Monitor] monitor we are using
- def initialize(subscription_group_id, consumer_group_id, client_name, monitor)
+ def initialize(subscription_group_id, consumer_group_id, client_name)
  @subscription_group_id = subscription_group_id
  @consumer_group_id = consumer_group_id
  @client_name = client_name
- @monitor = monitor
  @statistics_decorator = ::Karafka::Core::Monitoring::StatisticsDecorator.new
  end
 
@@ -28,7 +26,7 @@ module Karafka
  # all the time.
  return unless @client_name == statistics['name']
 
- @monitor.instrument(
+ ::Karafka.monitor.instrument(
  'statistics.emitted',
  subscription_group_id: @subscription_group_id,
  consumer_group_id: @consumer_group_id,
@@ -54,8 +54,6 @@ module Karafka
  error.occurred
  ].freeze
 
- private_constant :EVENTS
-
  # @return [Karafka::Instrumentation::Monitor] monitor instance for system instrumentation
  def initialize
  super
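Dropping `private_constant :EVENTS` makes the events list public, per the 2.0.33 note about re-binding notifications into a different bus. A sketch of forwarding them, assuming `ActiveSupport::Notifications` as the target bus:

```ruby
require 'active_support/notifications'

# Re-emit every Karafka notification on ActiveSupport::Notifications so other
# tooling can subscribe there as well
::Karafka::Instrumentation::Notifications::EVENTS.each do |event_name|
  Karafka.monitor.subscribe(event_name) do |event|
    ActiveSupport::Notifications.instrument("karafka.#{event_name}", event.payload)
  end
end
```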
@@ -1,258 +1,15 @@
  # frozen_string_literal: true
 
+ require_relative 'metrics_listener'
+
  module Karafka
  module Instrumentation
  # Namespace for vendor specific instrumentation
  module Vendors
  # Datadog specific instrumentation
  module Datadog
- # Listener that can be used to subscribe to Karafka to receive stats via StatsD
- # and/or Datadog
- #
- # @note You need to setup the `dogstatsd-ruby` client and assign it
- class Listener
- include ::Karafka::Core::Configurable
- extend Forwardable
-
- def_delegators :config, :client, :rd_kafka_metrics, :namespace, :default_tags
-
- # Value object for storing a single rdkafka metric publishing details
- RdKafkaMetric = Struct.new(:type, :scope, :name, :key_location)
-
- # Namespace under which the DD metrics should be published
- setting :namespace, default: 'karafka'
-
- # Datadog client that we should use to publish the metrics
- setting :client
-
- # Default tags we want to publish (for example hostname)
- # Format as followed (example for hostname): `["host:#{Socket.gethostname}"]`
- setting :default_tags, default: []
-
- # All the rdkafka metrics we want to publish
- #
- # By default we publish quite a lot so this can be tuned
- # Note, that the once with `_d` come from Karafka, not rdkafka or Kafka
- setting :rd_kafka_metrics, default: [
- # Client metrics
- RdKafkaMetric.new(:count, :root, 'messages.consumed', 'rxmsgs_d'),
- RdKafkaMetric.new(:count, :root, 'messages.consumed.bytes', 'rxmsg_bytes'),
-
- # Broker metrics
- RdKafkaMetric.new(:count, :brokers, 'consume.attempts', 'txretries_d'),
- RdKafkaMetric.new(:count, :brokers, 'consume.errors', 'txerrs_d'),
- RdKafkaMetric.new(:count, :brokers, 'receive.errors', 'rxerrs_d'),
- RdKafkaMetric.new(:count, :brokers, 'connection.connects', 'connects_d'),
- RdKafkaMetric.new(:count, :brokers, 'connection.disconnects', 'disconnects_d'),
- RdKafkaMetric.new(:gauge, :brokers, 'network.latency.avg', %w[rtt avg]),
- RdKafkaMetric.new(:gauge, :brokers, 'network.latency.p95', %w[rtt p95]),
- RdKafkaMetric.new(:gauge, :brokers, 'network.latency.p99', %w[rtt p99]),
-
- # Topics metrics
- RdKafkaMetric.new(:gauge, :topics, 'consumer.lags', 'consumer_lag_stored'),
- RdKafkaMetric.new(:gauge, :topics, 'consumer.lags_delta', 'consumer_lag_stored_d')
- ].freeze
-
- configure
-
- # @param block [Proc] configuration block
- def initialize(&block)
- configure
- setup(&block) if block
- end
-
- # @param block [Proc] configuration block
- # @note We define this alias to be consistent with `WaterDrop#setup`
- def setup(&block)
- configure(&block)
- end
-
- # Hooks up to WaterDrop instrumentation for emitted statistics
- #
- # @param event [Karafka::Core::Monitoring::Event]
- def on_statistics_emitted(event)
- statistics = event[:statistics]
- consumer_group_id = event[:consumer_group_id]
-
- base_tags = default_tags + ["consumer_group:#{consumer_group_id}"]
-
- rd_kafka_metrics.each do |metric|
- report_metric(metric, statistics, base_tags)
- end
- end
-
- # Increases the errors count by 1
- #
- # @param event [Karafka::Core::Monitoring::Event]
- def on_error_occurred(event)
- extra_tags = ["type:#{event[:type]}"]
-
- if event.payload[:caller].respond_to?(:messages)
- extra_tags += consumer_tags(event.payload[:caller])
- end
-
- count('error_occurred', 1, tags: default_tags + extra_tags)
- end
-
- # Reports how many messages we've polled and how much time did we spend on it
- #
- # @param event [Karafka::Core::Monitoring::Event]
- def on_connection_listener_fetch_loop_received(event)
- time_taken = event[:time]
- messages_count = event[:messages_buffer].size
-
- consumer_group_id = event[:subscription_group].consumer_group_id
-
- extra_tags = ["consumer_group:#{consumer_group_id}"]
-
- histogram('listener.polling.time_taken', time_taken, tags: default_tags + extra_tags)
- histogram('listener.polling.messages', messages_count, tags: default_tags + extra_tags)
- end
-
- # Here we report majority of things related to processing as we have access to the
- # consumer
- # @param event [Karafka::Core::Monitoring::Event]
- def on_consumer_consumed(event)
- consumer = event.payload[:caller]
- messages = consumer.messages
- metadata = messages.metadata
-
- tags = default_tags + consumer_tags(consumer)
-
- count('consumer.messages', messages.count, tags: tags)
- count('consumer.batches', 1, tags: tags)
- gauge('consumer.offset', metadata.last_offset, tags: tags)
- histogram('consumer.consumed.time_taken', event[:time], tags: tags)
- histogram('consumer.batch_size', messages.count, tags: tags)
- histogram('consumer.processing_lag', metadata.processing_lag, tags: tags)
- histogram('consumer.consumption_lag', metadata.consumption_lag, tags: tags)
- end
-
- # @param event [Karafka::Core::Monitoring::Event]
- def on_consumer_revoked(event)
- tags = default_tags + consumer_tags(event.payload[:caller])
-
- count('consumer.revoked', 1, tags: tags)
- end
-
- # @param event [Karafka::Core::Monitoring::Event]
- def on_consumer_shutdown(event)
- tags = default_tags + consumer_tags(event.payload[:caller])
-
- count('consumer.shutdown', 1, tags: tags)
- end
-
- # Worker related metrics
- # @param event [Karafka::Core::Monitoring::Event]
- def on_worker_process(event)
- jq_stats = event[:jobs_queue].statistics
-
- gauge('worker.total_threads', Karafka::App.config.concurrency, tags: default_tags)
- histogram('worker.processing', jq_stats[:busy], tags: default_tags)
- histogram('worker.enqueued_jobs', jq_stats[:enqueued], tags: default_tags)
- end
-
- # We report this metric before and after processing for higher accuracy
- # Without this, the utilization would not be fully reflected
- # @param event [Karafka::Core::Monitoring::Event]
- def on_worker_processed(event)
- jq_stats = event[:jobs_queue].statistics
-
- histogram('worker.processing', jq_stats[:busy], tags: default_tags)
- end
-
- private
-
- %i[
- count
- gauge
- histogram
- increment
- decrement
- ].each do |metric_type|
- class_eval <<~METHODS, __FILE__, __LINE__ + 1
- def #{metric_type}(key, *args)
- client.#{metric_type}(
- namespaced_metric(key),
- *args
- )
- end
- METHODS
- end
-
- # Wraps metric name in listener's namespace
- # @param metric_name [String] RdKafkaMetric name
- # @return [String]
- def namespaced_metric(metric_name)
- "#{namespace}.#{metric_name}"
- end
-
- # Reports a given metric statistics to Datadog
- # @param metric [RdKafkaMetric] metric value object
- # @param statistics [Hash] hash with all the statistics emitted
- # @param base_tags [Array<String>] base tags we want to start with
- def report_metric(metric, statistics, base_tags)
- case metric.scope
- when :root
- public_send(
- metric.type,
- metric.name,
- statistics.fetch(*metric.key_location),
- tags: base_tags
- )
- when :brokers
- statistics.fetch('brokers').each_value do |broker_statistics|
- # Skip bootstrap nodes
- # Bootstrap nodes have nodeid -1, other nodes have positive
- # node ids
- next if broker_statistics['nodeid'] == -1
-
- public_send(
- metric.type,
- metric.name,
- broker_statistics.dig(*metric.key_location),
- tags: base_tags + ["broker:#{broker_statistics['nodename']}"]
- )
- end
- when :topics
- statistics.fetch('topics').each do |topic_name, topic_values|
- topic_values['partitions'].each do |partition_name, partition_statistics|
- next if partition_name == '-1'
- # Skip until lag info is available
- next if partition_statistics['consumer_lag'] == -1
-
- public_send(
- metric.type,
- metric.name,
- partition_statistics.dig(*metric.key_location),
- tags: base_tags + [
- "topic:#{topic_name}",
- "partition:#{partition_name}"
- ]
- )
- end
- end
- else
- raise ArgumentError, metric.scope
- end
- end
-
- # Builds basic per consumer tags for publication
- #
- # @param consumer [Karafka::BaseConsumer]
- # @return [Array<String>]
- def consumer_tags(consumer)
- messages = consumer.messages
- metadata = messages.metadata
- consumer_group_id = consumer.topic.consumer_group.id
-
- [
- "topic:#{metadata.topic}",
- "partition:#{metadata.partition}",
- "consumer_group:#{consumer_group_id}"
- ]
- end
- end
+ # Alias to keep backwards compatibility
+ Listener = MetricsListener
  end
  end
  end
@@ -0,0 +1,259 @@
+ # frozen_string_literal: true
+
+ module Karafka
+ module Instrumentation
+ # Namespace for vendor specific instrumentation
+ module Vendors
+ # Datadog specific instrumentation
+ module Datadog
+ # Listener that can be used to subscribe to Karafka to receive stats via StatsD
+ # and/or Datadog
+ #
+ # @note You need to setup the `dogstatsd-ruby` client and assign it
+ class MetricsListener
+ include ::Karafka::Core::Configurable
+ extend Forwardable
+
+ def_delegators :config, :client, :rd_kafka_metrics, :namespace, :default_tags
+
+ # Value object for storing a single rdkafka metric publishing details
+ RdKafkaMetric = Struct.new(:type, :scope, :name, :key_location)
+
+ # Namespace under which the DD metrics should be published
+ setting :namespace, default: 'karafka'
+
+ # Datadog client that we should use to publish the metrics
+ setting :client
+
+ # Default tags we want to publish (for example hostname)
+ # Format as followed (example for hostname): `["host:#{Socket.gethostname}"]`
+ setting :default_tags, default: []
+
+ # All the rdkafka metrics we want to publish
+ #
+ # By default we publish quite a lot so this can be tuned
+ # Note, that the once with `_d` come from Karafka, not rdkafka or Kafka
+ setting :rd_kafka_metrics, default: [
+ # Client metrics
+ RdKafkaMetric.new(:count, :root, 'messages.consumed', 'rxmsgs_d'),
+ RdKafkaMetric.new(:count, :root, 'messages.consumed.bytes', 'rxmsg_bytes'),
+
+ # Broker metrics
+ RdKafkaMetric.new(:count, :brokers, 'consume.attempts', 'txretries_d'),
+ RdKafkaMetric.new(:count, :brokers, 'consume.errors', 'txerrs_d'),
+ RdKafkaMetric.new(:count, :brokers, 'receive.errors', 'rxerrs_d'),
+ RdKafkaMetric.new(:count, :brokers, 'connection.connects', 'connects_d'),
+ RdKafkaMetric.new(:count, :brokers, 'connection.disconnects', 'disconnects_d'),
+ RdKafkaMetric.new(:gauge, :brokers, 'network.latency.avg', %w[rtt avg]),
+ RdKafkaMetric.new(:gauge, :brokers, 'network.latency.p95', %w[rtt p95]),
+ RdKafkaMetric.new(:gauge, :brokers, 'network.latency.p99', %w[rtt p99]),
+
+ # Topics metrics
+ RdKafkaMetric.new(:gauge, :topics, 'consumer.lags', 'consumer_lag_stored'),
+ RdKafkaMetric.new(:gauge, :topics, 'consumer.lags_delta', 'consumer_lag_stored_d')
+ ].freeze
+
+ configure
+
+ # @param block [Proc] configuration block
+ def initialize(&block)
+ configure
+ setup(&block) if block
+ end
+
+ # @param block [Proc] configuration block
+ # @note We define this alias to be consistent with `WaterDrop#setup`
+ def setup(&block)
+ configure(&block)
+ end
+
+ # Hooks up to WaterDrop instrumentation for emitted statistics
+ #
+ # @param event [Karafka::Core::Monitoring::Event]
+ def on_statistics_emitted(event)
+ statistics = event[:statistics]
+ consumer_group_id = event[:consumer_group_id]
+
+ base_tags = default_tags + ["consumer_group:#{consumer_group_id}"]
+
+ rd_kafka_metrics.each do |metric|
+ report_metric(metric, statistics, base_tags)
+ end
+ end
+
+ # Increases the errors count by 1
+ #
+ # @param event [Karafka::Core::Monitoring::Event]
+ def on_error_occurred(event)
+ extra_tags = ["type:#{event[:type]}"]
+
+ if event.payload[:caller].respond_to?(:messages)
+ extra_tags += consumer_tags(event.payload[:caller])
+ end
+
+ count('error_occurred', 1, tags: default_tags + extra_tags)
+ end
+
+ # Reports how many messages we've polled and how much time did we spend on it
+ #
+ # @param event [Karafka::Core::Monitoring::Event]
+ def on_connection_listener_fetch_loop_received(event)
+ time_taken = event[:time]
+ messages_count = event[:messages_buffer].size
+
+ consumer_group_id = event[:subscription_group].consumer_group_id
+
+ extra_tags = ["consumer_group:#{consumer_group_id}"]
+
+ histogram('listener.polling.time_taken', time_taken, tags: default_tags + extra_tags)
+ histogram('listener.polling.messages', messages_count, tags: default_tags + extra_tags)
+ end
+
+ # Here we report majority of things related to processing as we have access to the
+ # consumer
+ # @param event [Karafka::Core::Monitoring::Event]
+ def on_consumer_consumed(event)
+ consumer = event.payload[:caller]
+ messages = consumer.messages
+ metadata = messages.metadata
+
+ tags = default_tags + consumer_tags(consumer)
+
+ count('consumer.messages', messages.count, tags: tags)
+ count('consumer.batches', 1, tags: tags)
+ gauge('consumer.offset', metadata.last_offset, tags: tags)
+ histogram('consumer.consumed.time_taken', event[:time], tags: tags)
+ histogram('consumer.batch_size', messages.count, tags: tags)
+ histogram('consumer.processing_lag', metadata.processing_lag, tags: tags)
+ histogram('consumer.consumption_lag', metadata.consumption_lag, tags: tags)
+ end
+
+ # @param event [Karafka::Core::Monitoring::Event]
+ def on_consumer_revoked(event)
+ tags = default_tags + consumer_tags(event.payload[:caller])
+
+ count('consumer.revoked', 1, tags: tags)
+ end
+
+ # @param event [Karafka::Core::Monitoring::Event]
+ def on_consumer_shutdown(event)
+ tags = default_tags + consumer_tags(event.payload[:caller])
+
+ count('consumer.shutdown', 1, tags: tags)
+ end
+
+ # Worker related metrics
+ # @param event [Karafka::Core::Monitoring::Event]
+ def on_worker_process(event)
+ jq_stats = event[:jobs_queue].statistics
+
+ gauge('worker.total_threads', Karafka::App.config.concurrency, tags: default_tags)
+ histogram('worker.processing', jq_stats[:busy], tags: default_tags)
+ histogram('worker.enqueued_jobs', jq_stats[:enqueued], tags: default_tags)
+ end
+
+ # We report this metric before and after processing for higher accuracy
+ # Without this, the utilization would not be fully reflected
+ # @param event [Karafka::Core::Monitoring::Event]
+ def on_worker_processed(event)
+ jq_stats = event[:jobs_queue].statistics
+
+ histogram('worker.processing', jq_stats[:busy], tags: default_tags)
+ end
+
+ private
+
+ %i[
+ count
+ gauge
+ histogram
+ increment
+ decrement
+ ].each do |metric_type|
+ class_eval <<~METHODS, __FILE__, __LINE__ + 1
+ def #{metric_type}(key, *args)
+ client.#{metric_type}(
+ namespaced_metric(key),
+ *args
+ )
+ end
+ METHODS
+ end
+
+ # Wraps metric name in listener's namespace
+ # @param metric_name [String] RdKafkaMetric name
+ # @return [String]
+ def namespaced_metric(metric_name)
+ "#{namespace}.#{metric_name}"
+ end
+
+ # Reports a given metric statistics to Datadog
+ # @param metric [RdKafkaMetric] metric value object
+ # @param statistics [Hash] hash with all the statistics emitted
+ # @param base_tags [Array<String>] base tags we want to start with
+ def report_metric(metric, statistics, base_tags)
+ case metric.scope
+ when :root
+ public_send(
+ metric.type,
+ metric.name,
+ statistics.fetch(*metric.key_location),
+ tags: base_tags
+ )
+ when :brokers
+ statistics.fetch('brokers').each_value do |broker_statistics|
+ # Skip bootstrap nodes
+ # Bootstrap nodes have nodeid -1, other nodes have positive
+ # node ids
+ next if broker_statistics['nodeid'] == -1
+
+ public_send(
+ metric.type,
+ metric.name,
+ broker_statistics.dig(*metric.key_location),
+ tags: base_tags + ["broker:#{broker_statistics['nodename']}"]
+ )
+ end
+ when :topics
+ statistics.fetch('topics').each do |topic_name, topic_values|
+ topic_values['partitions'].each do |partition_name, partition_statistics|
+ next if partition_name == '-1'
+ # Skip until lag info is available
+ next if partition_statistics['consumer_lag'] == -1
+
+ public_send(
+ metric.type,
+ metric.name,
+ partition_statistics.dig(*metric.key_location),
+ tags: base_tags + [
+ "topic:#{topic_name}",
+ "partition:#{partition_name}"
+ ]
+ )
+ end
+ end
+ else
+ raise ArgumentError, metric.scope
+ end
+ end
+
+ # Builds basic per consumer tags for publication
+ #
+ # @param consumer [Karafka::BaseConsumer]
+ # @return [Array<String>]
+ def consumer_tags(consumer)
+ messages = consumer.messages
+ metadata = messages.metadata
+ consumer_group_id = consumer.topic.consumer_group.id
+
+ [
+ "topic:#{metadata.topic}",
+ "partition:#{metadata.partition}",
+ "consumer_group:#{consumer_group_id}"
+ ]
+ end
+ end
+ end
+ end
+ end
+ end
@@ -31,9 +31,11 @@ module Karafka
  break if revoked?
  break if Karafka::App.stopping?
 
- ::ActiveJob::Base.execute(
- ::ActiveSupport::JSON.decode(message.raw_payload)
- )
+ job = ::ActiveSupport::JSON.decode(message.raw_payload)
+
+ tags.add(:job_class, job['job_class'])
+
+ ::ActiveJob::Base.execute(job)
 
  # We cannot mark jobs as done after each if there are virtual partitions. Otherwise
  # this could create random markings.
@@ -23,6 +23,7 @@ module Karafka
  # They can be updated by using `#karafka_options` on the job
  DEFAULTS = {
  dispatch_method: :produce_async,
+ dispatch_many_method: :produce_many_async,
  # We don't create a dummy proc based partitioner as we would have to evaluate it with
  # each job.
  partitioner: nil,
@@ -33,7 +34,7 @@ module Karafka
  private_constant :DEFAULTS
 
  # @param job [ActiveJob::Base] job
- def call(job)
+ def dispatch(job)
  ::Karafka.producer.public_send(
  fetch_option(job, :dispatch_method, DEFAULTS),
  dispatch_details(job).merge!(
@@ -43,6 +44,28 @@ module Karafka
  )
  end
 
+ # Bulk dispatches multiple jobs using the Rails 7.1+ API
+ # @param jobs [Array<ActiveJob::Base>] jobs we want to dispatch
+ def dispatch_many(jobs)
+ dispatches = Hash.new { |hash, key| hash[key] = [] }
+
+ jobs.each do |job|
+ d_method = fetch_option(job, :dispatch_many_method, DEFAULTS)
+
+ dispatches[d_method] << dispatch_details(job).merge!(
+ topic: job.queue_name,
+ payload: ::ActiveSupport::JSON.encode(job.serialize)
+ )
+ end
+
+ dispatches.each do |type, messages|
+ ::Karafka.producer.public_send(
+ type,
+ messages
+ )
+ end
+ end
+
  private
 
  # @param job [ActiveJob::Base] job instance
@@ -25,9 +25,20 @@ module Karafka
  ).fetch('en').fetch('validations').fetch('job_options')
  end
 
- optional(:dispatch_method) { |val| %i[produce_async produce_sync].include?(val) }
  optional(:partitioner) { |val| val.respond_to?(:call) }
  optional(:partition_key_type) { |val| %i[key partition_key].include?(val) }
+ optional(:dispatch_method) do |val|
+ %i[
+ produce_async
+ produce_sync
+ ].include?(val)
+ end
+ optional(:dispatch_many_method) do |val|
+ %i[
+ produce_many_async
+ produce_many_sync
+ ].include?(val)
+ end
  end
  end
  end
@@ -4,6 +4,9 @@ module Karafka
  # Class used to catch signals from ruby Signal class in order to manage Karafka stop
  # @note There might be only one process - this class is a singleton
  class Process
+ # Allow for process tagging for instrumentation
+ extend ::Karafka::Core::Taggable
+
  # Signal types that we handle
  HANDLED_SIGNALS = %i[
  SIGINT
@@ -89,6 +89,11 @@ module Karafka
  # option [::WaterDrop::Producer, nil]
  # Unless configured, will be created once Karafka is configured based on user Karafka setup
  setting :producer, default: nil
+ # option [Boolean] when set to true, Karafka will ensure that the routing topic naming
+ # convention is strict
+ # Disabling this may be needed in scenarios where we do not have control over topics names
+ # and/or we work with existing systems where we cannot change topics names.
+ setting :strict_topics_namespacing, default: true
 
  # rdkafka default options
  # @see https://github.com/edenhill/librdkafka/blob/master/CONFIGURATION.md
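The new `strict_topics_namespacing` setting gates the namespacing checks added to the contracts earlier in this diff. A minimal sketch of opting out, assuming a standard `KarafkaApp < Karafka::App` setup:

```ruby
class KarafkaApp < Karafka::App
  setup do |config|
    config.kafka = { 'bootstrap.servers': '127.0.0.1:9092' }
    # Pre-existing topics mix "." and "_" naming styles and cannot be renamed,
    # so skip the strict namespacing validations
    config.strict_topics_namespacing = false
  end
end
```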
@@ -168,6 +173,9 @@ module Karafka
 
  configure_components
 
+ # Refreshes the references that are cached that might have been changed by the config
+ ::Karafka.refresh!
+
  # Runs things that need to be executed after config is defined and all the components
  # are also configured
  Pro::Loader.post_setup(config) if Karafka.pro?
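The `::Karafka.refresh!` call added at the end of setup is the fix for the prematurely cached monitor (#1314): the module-level accessors are re-read from the final config. A sketch of the behaviour it enables, assuming a hypothetical `MyMonitor` subclass:

```ruby
require 'logger'

# Hypothetical monitor used only for illustration
class MyMonitor < Karafka::Instrumentation::Monitor
end

class KarafkaApp < Karafka::App
  setup do |config|
    config.monitor = MyMonitor.new
    config.logger = Logger.new($stdout)
  end
end

# Previously Karafka.monitor could keep returning the already memoized default;
# after setup it now reflects the configured instances
Karafka.monitor # => the MyMonitor instance
Karafka.logger  # => the configured logger
```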
@@ -3,5 +3,5 @@
  # Main module namespace
  module Karafka
  # Current Karafka version
- VERSION = '2.0.32'
+ VERSION = '2.0.34'
  end
data/lib/karafka.rb CHANGED
@@ -95,6 +95,19 @@ module Karafka
  def boot_file
  Pathname.new(ENV['KARAFKA_BOOT_FILE'] || File.join(Karafka.root, 'karafka.rb'))
  end
+
+ # We need to be able to overwrite both monitor and logger after the configuration in case they
+ # would be changed because those two (with defaults) can be used prior to the setup and their
+ # state change should be reflected in the updated setup
+ #
+ # This method refreshes the things that might have been altered by the configuration
+ def refresh!
+ config = ::Karafka::App.config
+
+ @logger = config.logger
+ @producer = config.producer
+ @monitor = config.monitor
+ end
  end
  end
 
data.tar.gz.sig CHANGED
Binary file
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: karafka
  version: !ruby/object:Gem::Version
- version: 2.0.32
+ version: 2.0.34
  platform: ruby
  authors:
  - Maciej Mensfeld
@@ -35,7 +35,7 @@ cert_chain:
  Qf04B9ceLUaC4fPVEz10FyobjaFoY4i32xRto3XnrzeAgfEe4swLq8bQsR3w/EF3
  MGU0FeSV2Yj7Xc2x/7BzLK8xQn5l7Yy75iPF+KP3vVmDHnNl
  -----END CERTIFICATE-----
- date: 2023-02-14 00:00:00.000000000 Z
+ date: 2023-03-04 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: karafka-core
@@ -43,7 +43,7 @@ dependencies:
  requirements:
  - - ">="
  - !ruby/object:Gem::Version
- version: 2.0.11
+ version: 2.0.12
  - - "<"
  - !ruby/object:Gem::Version
  version: 3.0.0
@@ -53,7 +53,7 @@ dependencies:
  requirements:
  - - ">="
  - !ruby/object:Gem::Version
- version: 2.0.11
+ version: 2.0.12
  - - "<"
  - !ruby/object:Gem::Version
  version: 3.0.0
@@ -198,6 +198,7 @@ files:
  - lib/karafka/instrumentation/vendors/datadog/dashboard.json
  - lib/karafka/instrumentation/vendors/datadog/listener.rb
  - lib/karafka/instrumentation/vendors/datadog/logger_listener.rb
+ - lib/karafka/instrumentation/vendors/datadog/metrics_listener.rb
  - lib/karafka/licenser.rb
  - lib/karafka/messages/batch_metadata.rb
  - lib/karafka/messages/builders/batch_metadata.rb
metadata.gz.sig CHANGED
Binary file