karafka 2.0.32 → 2.0.34

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: f5364527333b7924241340cbf9df8b3c189447ccf6c1b79612845d2d170990fe
-   data.tar.gz: d590b8f940a5fa00926d386e196607124e090888e0fa1fff321d935cb0818d47
+   metadata.gz: 36d890d825aaeaee5349dcc653d888da3a023c01a837864544a905db977569c4
+   data.tar.gz: be442485812a05a030bab33da31a8e2fda684add8c4d59a0af78f517bb2519bd
  SHA512:
-   metadata.gz: 25cec5ed66eb1199ec92c0206a34fba4583d79a9ba7d3ce68041855c48e1bb4bf97590ada597649c801340026270c604ed2a00c57c12f90c6fb861e3d85fd0b3
-   data.tar.gz: dbd604d94c1dc1df6a0040b24e4c5754205303ec7d7a3a15271d2309df3caa56f34f94ef7ea44904b020b57154c17dd06688b23b94ded4230dde321e5f3f1d91
+   metadata.gz: d92be137485c436c1ed02435669785422e6e4da194ab19b97dca31f70b530fe7e0ae4e6b0c7c54895dbceca2d8fe4ef1bdcabdb287affe06e377942062777979
+   data.tar.gz: 0525b652373088a7a692134a6a4c89487e495e01934504d256a60ae7805c92f65a308d38c8d75ad806f39f5128d2d00865e10b1b4ae326c7dd60e7594c68558e
checksums.yaml.gz.sig CHANGED
Binary file
data/CHANGELOG.md CHANGED
@@ -1,11 +1,55 @@
  # Karafka framework changelog
 
- ## 2.0.32 (2022-02-13)
+ ## 2.0.34 (2023-03-04)
+ - [Improvement] Attach an `embedded` tag to Karafka processes started using the embedded API.
+ - [Change] Renamed `Datadog::Listener` to `Datadog::MetricsListener` for consistency. (#1124)
+
+ ### Upgrade notes
+
+ 1. Replace `Datadog::Listener` references with `Datadog::MetricsListener`.
+
+ ## 2.0.33 (2023-02-24)
+ - **[Feature]** Support `perform_all_later` in the ActiveJob adapter for Rails `7.1+`
+ - **[Feature]** Introduce ability to assign and re-assign tags in consumer instances. This can be used for extra instrumentation that is context aware.
+ - **[Feature]** Introduce ability to assign and re-assign tags to the `Karafka::Process`.
+ - [Improvement] When using the `ActiveJob` adapter, automatically tag jobs with the name of the `ActiveJob` class that is running inside the `ActiveJob` consumer.
+ - [Improvement] Make the `::Karafka::Instrumentation::Notifications::EVENTS` list public for anyone wanting to re-bind those events into a different notification bus.
+ - [Improvement] Set `fetch.message.max.bytes` for `Karafka::Admin` to `5MB` to make sure that all data is fetched correctly for the Web UI under heavy load (many consumers).
+ - [Improvement] Introduce a `strict_topics_namespacing` config option to enable/disable the strict topic naming validations. This can be useful when working with pre-existing topics which we cannot or do not want to rename.
+ - [Fix] Karafka monitor is prematurely cached (#1314)
+
+ ### Upgrade notes
+
+ Since `#tags` were introduced on consumers, the `#tags` method is now part of the consumers API.
+
+ This means that if you were using a method called `#tags` in your consumers, you will have to rename it:
+
+ ```ruby
+ class EventsConsumer < ApplicationConsumer
+   def consume
+     messages.each do |message|
+       tags << message.payload.tag
+     end
+
+     tags.each { |tag| puts tag }
+   end
+
+   private
+
+   # This will collide with the tagging API
+   # This NEEDS to be renamed not to collide with the `#tags` method provided by the consumers API.
+   def tags
+     @tags ||= Set.new
+   end
+ end
+ ```
+
+ ## 2.0.32 (2023-02-13)
  - [Fix] Many non-existing topic subscriptions propagate poll errors beyond the client
  - [Improvement] Ignore `unknown_topic_or_part` errors in dev when `allow.auto.create.topics` is on.
  - [Improvement] Optimize temporary errors handling in polling for a better backoff policy
 
- ## 2.0.31 (2022-02-12)
+ ## 2.0.31 (2023-02-12)
  - [Feature] Allow for adding partitions via the `Admin#create_partitions` API.
  - [Fix] Do not ignore admin errors upon invalid configuration (#1254)
  - [Fix] Topic name validation (#1300) - CandyFet
@@ -13,7 +57,7 @@
  - [Maintenance] Require `karafka-core` >= `2.0.11` and switch to shared RSpec locator.
  - [Maintenance] Require `karafka-rdkafka` >= `0.12.1`
 
- ## 2.0.30 (2022-01-31)
+ ## 2.0.30 (2023-01-31)
  - [Improvement] Alias `--consumer-groups` with `--include-consumer-groups`
  - [Improvement] Alias `--subscription-groups` with `--include-subscription-groups`
  - [Improvement] Alias `--topics` with `--include-topics`
@@ -63,7 +107,7 @@ class KarafkaApp < Karafka::App
  - [Improvement] Expand `LoggerListener` with `client.resume` notification.
  - [Improvement] Replace random anonymous subscription group ids with stable ones.
  - [Improvement] Add `consumer.consume`, `consumer.revoke` and `consumer.shutting_down` notification events and move the revocation logic calling to strategies.
- - [Change] Rename job queue statistics `processing` key to `busy`. No changes needed because naming in the DataDog listener stays the same.
+ - [Change] Rename job queue statistics `processing` key to `busy`. No changes needed because naming in the DataDog listener stays the same.
  - [Fix] Fix proctitle listener state changes reporting on new states.
  - [Fix] Make sure all file descriptors are closed in the integration specs.
  - [Fix] Fix a case where empty subscription groups could leak into the execution flow.
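The `2.0.33` upgrade example above shows the collision but not the resolution; a minimal sketch of the rename (the `payload_tags` name is hypothetical — any name that does not shadow `#tags` works):

```ruby
require 'set'

class EventsConsumer < ApplicationConsumer
  def consume
    messages.each do |message|
      # Collect payload tags under a name that does not shadow the tagging API
      payload_tags << message.payload.tag
    end

    payload_tags.each { |tag| puts tag }
  end

  private

  # Renamed from `#tags`, so `#tags` from the consumers API stays available
  def payload_tags
    @payload_tags ||= Set.new
  end
end
```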
data/Gemfile.lock CHANGED
@@ -1,8 +1,8 @@
  PATH
    remote: .
    specs:
-     karafka (2.0.32)
-       karafka-core (>= 2.0.11, < 3.0.0)
+     karafka (2.0.34)
+       karafka-core (>= 2.0.12, < 3.0.0)
        thor (>= 0.20)
        waterdrop (>= 2.4.10, < 3.0.0)
        zeitwerk (~> 2.3)
@@ -19,7 +19,7 @@ GEM
        minitest (>= 5.1)
        tzinfo (~> 2.0)
      byebug (11.1.3)
-     concurrent-ruby (1.2.0)
+     concurrent-ruby (1.2.2)
      diff-lcs (1.5.0)
      docile (1.4.0)
      factory_bot (6.2.1)
@@ -29,7 +29,7 @@ GEM
        activesupport (>= 5.0)
      i18n (1.12.0)
        concurrent-ruby (~> 1.0)
-     karafka-core (2.0.11)
+     karafka-core (2.0.12)
        concurrent-ruby (>= 1.1)
        karafka-rdkafka (>= 0.12.1)
      karafka-rdkafka (0.12.1)
@@ -61,13 +61,12 @@ GEM
      thor (1.2.1)
      tzinfo (2.0.6)
        concurrent-ruby (~> 1.0)
-     waterdrop (2.4.10)
-       karafka-core (>= 2.0.9, < 3.0.0)
+     waterdrop (2.4.11)
+       karafka-core (>= 2.0.12, < 3.0.0)
        zeitwerk (~> 2.3)
-     zeitwerk (2.6.6)
+     zeitwerk (2.6.7)
 
  PLATFORMS
-   arm64-darwin-21
    x86_64-linux
 
  DEPENDENCIES
@@ -45,18 +45,23 @@ en:
      dead_letter_queue.topic_format: 'needs to be a string with a Kafka accepted format'
      dead_letter_queue.active_format: needs to be either true or false
      active_format: needs to be either true or false
-     inconsistent_namespacing: needs to be consistent namespacing style
+     inconsistent_namespacing: |
+       needs to be consistent namespacing style
+       disable this validation by setting config.strict_topics_namespacing to false
 
    consumer_group:
      missing: needs to be present
      topics_names_not_unique: all topic names within a single consumer group must be unique
-     topics_namespaced_names_not_unique: all topic names within a single consumer group must be unique considering namespacing styles
      id_format: 'needs to be a string with a Kafka accepted format'
      topics_format: needs to be a non-empty array
+     topics_namespaced_names_not_unique: |
+       all topic names within a single consumer group must be unique considering namespacing styles
+       disable this validation by setting config.strict_topics_namespacing to false
 
    job_options:
      missing: needs to be present
      dispatch_method_format: needs to be either :produce_async or :produce_sync
+     dispatch_many_method_format: needs to be either :produce_many_async or :produce_many_sync
      partitioner_format: 'needs to respond to #call'
      partition_key_type_format: 'needs to be either :key or :partition_key'
 
data/karafka.gemspec CHANGED
@@ -21,7 +21,7 @@ Gem::Specification.new do |spec|
      without having to focus on things that are not your business domain.
    DESC
 
-   spec.add_dependency 'karafka-core', '>= 2.0.11', '< 3.0.0'
+   spec.add_dependency 'karafka-core', '>= 2.0.12', '< 3.0.0'
    spec.add_dependency 'thor', '>= 0.20'
    spec.add_dependency 'waterdrop', '>= 2.4.10', '< 3.0.0'
    spec.add_dependency 'zeitwerk', '~> 2.3'
@@ -11,7 +11,13 @@ module ActiveJob
      #
      # @param job [Object] job that should be enqueued
      def enqueue(job)
-       ::Karafka::App.config.internal.active_job.dispatcher.call(job)
+       ::Karafka::App.config.internal.active_job.dispatcher.dispatch(job)
+     end
+
+     # Enqueues multiple jobs in one go
+     # @param jobs [Array<Object>] jobs that we want to enqueue
+     def enqueue_all(jobs)
+       ::Karafka::App.config.internal.active_job.dispatcher.dispatch_many(jobs)
      end
 
      # Raises info that the Karafka backend does not support scheduling jobs
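With `enqueue_all` in place, Rails 7.1's `ActiveJob.perform_all_later` hands a whole batch to the dispatcher in one call instead of enqueuing job by job; a hedged sketch (the job class and arguments are illustrative):

```ruby
class WelcomeEmailJob < ApplicationJob
  queue_as :welcome_emails

  def perform(user_id)
    # email delivery would go here
  end
end

# Rails 7.1+ routes this through KarafkaAdapter#enqueue_all, which in turn
# calls the dispatcher's #dispatch_many
ActiveJob.perform_all_later([1, 2, 3].map { |id| WelcomeEmailJob.new(id) })
```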
@@ -12,12 +12,14 @@ module Karafka
        messages.each do |message|
          break if Karafka::App.stopping?
 
-         ::ActiveJob::Base.execute(
-           # Technically speaking, we could set this as the deserializer and reference it from
-           # the message instead of using `#raw_payload`. This is not done on purpose to simplify
-           # the ActiveJob setup here
-           ::ActiveSupport::JSON.decode(message.raw_payload)
-         )
+         # Technically speaking, we could set this as the deserializer and reference it from
+         # the message instead of using `#raw_payload`. This is not done on purpose to simplify
+         # the ActiveJob setup here
+         job = ::ActiveSupport::JSON.decode(message.raw_payload)
+
+         tags.add(:job_class, job['job_class'])
+
+         ::ActiveJob::Base.execute(job)
 
          mark_as_consumed(message)
        end
@@ -7,13 +7,14 @@ module Karafka
      # Defaults for dispatching
      # They can be updated by using `#karafka_options` on the job
      DEFAULTS = {
-       dispatch_method: :produce_async
+       dispatch_method: :produce_async,
+       dispatch_many_method: :produce_many_async
      }.freeze
 
      private_constant :DEFAULTS
 
      # @param job [ActiveJob::Base] job
-     def call(job)
+     def dispatch(job)
        ::Karafka.producer.public_send(
          fetch_option(job, :dispatch_method, DEFAULTS),
          topic: job.queue_name,
@@ -21,6 +22,30 @@ module Karafka
        )
      end
 
+     # Bulk dispatches multiple jobs using the Rails 7.1+ API
+     # @param jobs [Array<ActiveJob::Base>] jobs we want to dispatch
+     def dispatch_many(jobs)
+       # Group jobs by their desired dispatch method
+       # It can be configured per job class, so we need to make sure we divide them
+       dispatches = Hash.new { |hash, key| hash[key] = [] }
+
+       jobs.each do |job|
+         d_method = fetch_option(job, :dispatch_many_method, DEFAULTS)
+
+         dispatches[d_method] << {
+           topic: job.queue_name,
+           payload: ::ActiveSupport::JSON.encode(job.serialize)
+         }
+       end
+
+       dispatches.each do |type, messages|
+         ::Karafka.producer.public_send(
+           type,
+           messages
+         )
+       end
+     end
+
      private
 
      # @param job [ActiveJob::Base] job
@@ -15,7 +15,18 @@ module Karafka
        ).fetch('en').fetch('validations').fetch('job_options')
      end
 
-     optional(:dispatch_method) { |val| %i[produce_async produce_sync].include?(val) }
+     optional(:dispatch_method) do |val|
+       %i[
+         produce_async
+         produce_sync
+       ].include?(val)
+     end
+     optional(:dispatch_many_method) do |val|
+       %i[
+         produce_many_async
+         produce_many_sync
+       ].include?(val)
+     end
    end
  end
  end
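Both options validated above can be tuned per job class via `karafka_options`; a sketch assuming the Karafka ActiveJob adapter is active:

```ruby
class ReportsJob < ApplicationJob
  queue_as :reports

  # Single jobs go out asynchronously, while `perform_all_later` batches are
  # flushed synchronously (values must be among those validated above)
  karafka_options(
    dispatch_method: :produce_async,
    dispatch_many_method: :produce_many_sync
  )

  def perform(report_id)
    # report building would go here
  end
end
```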
data/lib/karafka/admin.rb CHANGED
@@ -26,7 +26,9 @@ module Karafka
      'group.id': 'karafka_admin',
      # We want to know when there is no more data, not to end up with an endless loop
      'enable.partition.eof': true,
-     'statistics.interval.ms': 0
+     'statistics.interval.ms': 0,
+     # Fetch at most 5 MB when using admin
+     'fetch.message.max.bytes': 5 * 1_048_576
    }.freeze
 
    private_constant :Topic, :CONFIG_DEFAULTS, :MAX_WAIT_TIMEOUT, :MAX_ATTEMPTS
@@ -4,6 +4,9 @@
  module Karafka
    # Base consumer from which all Karafka consumers should inherit
    class BaseConsumer
+     # Allow for consumer instance tagging for instrumentation
+     include ::Karafka::Core::Taggable
+
      # @return [String] id of the current consumer
      attr_reader :id
      # @return [Karafka::Routing::Topic] topic to which a given consumer is subscribed
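With `Taggable` included, every consumer instance exposes the `#tags` API used by the ActiveJob consumers above; a minimal sketch of context-aware tagging (the tag names are illustrative):

```ruby
class OrdersConsumer < ApplicationConsumer
  def consume
    # Tags are visible to instrumentation listeners; re-adding a tag under
    # the same key replaces the previous value
    tags.add(:processing_mode, 'batch')
    tags.add(:batch_size, messages.count.to_s)

    messages.each { |message| puts message.payload }
  end
end
```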
@@ -431,8 +431,7 @@ module Karafka
          Instrumentation::Callbacks::Statistics.new(
            @subscription_group.id,
            @subscription_group.consumer_group_id,
-           @name,
-           ::Karafka::App.config.monitor
+           @name
          )
        )
 
@@ -442,8 +441,7 @@ module Karafka
          Instrumentation::Callbacks::Error.new(
            @subscription_group.id,
            @subscription_group.consumer_group_id,
-           @name,
-           ::Karafka::App.config.monitor
+           @name
          )
        )
 
@@ -27,6 +27,7 @@ module Karafka
 
        virtual do |data, errors|
          next unless errors.empty?
+         next unless ::Karafka::App.config.strict_topics_namespacing
 
          names = data.fetch(:topics).map { |topic| topic[:name] }
          names_hash = names.each_with_object({}) { |n, h| h[n] = true }
@@ -51,6 +51,7 @@ module Karafka
 
        virtual do |data, errors|
          next unless errors.empty?
+         next unless ::Karafka::App.config.strict_topics_namespacing
 
          value = data.fetch(:name)
          namespacing_chars_count = value.chars.find_all { |c| ['.', '_'].include?(c) }.uniq.count
@@ -7,7 +7,10 @@ module Karafka
    # Starts Karafka without supervision and without ownership of signals in a background thread
    # so it won't interrupt other things running
    def start
-     Thread.new { Karafka::Server.start }
+     Thread.new do
+       Karafka::Process.tags.add(:execution_mode, 'embedded')
+       Karafka::Server.start
+     end
    end
 
    # Stops Karafka upon any event
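The embedded API itself is unchanged; starting it now also tags the process, so embedded instances are distinguishable in instrumentation. A usage sketch (e.g. from a Puma or Sidekiq initializer):

```ruby
# Runs a non-supervised Karafka server in a background thread of the current
# process; the process carries an `execution_mode: embedded` tag
Karafka::Embedded.start

# ...

# Stop it as part of the host process shutdown
Karafka::Embedded.stop
```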
@@ -9,12 +9,10 @@ module Karafka
        # @param subscription_group_id [String] id of the current subscription group instance
        # @param consumer_group_id [String] id of the current consumer group
        # @param client_name [String] rdkafka client name
-       # @param monitor [WaterDrop::Instrumentation::Monitor] monitor we are using
-       def initialize(subscription_group_id, consumer_group_id, client_name, monitor)
+       def initialize(subscription_group_id, consumer_group_id, client_name)
          @subscription_group_id = subscription_group_id
          @consumer_group_id = consumer_group_id
          @client_name = client_name
-         @monitor = monitor
        end
 
        # Runs the instrumentation monitor with the error
@@ -26,7 +24,7 @@ module Karafka
          # Same as with statistics (more explanation there)
          return unless @client_name == client_name
 
-         @monitor.instrument(
+         ::Karafka.monitor.instrument(
            'error.occurred',
            caller: self,
            subscription_group_id: @subscription_group_id,
@@ -10,12 +10,10 @@ module Karafka
        # @param subscription_group_id [String] id of the current subscription group
        # @param consumer_group_id [String] id of the current consumer group
        # @param client_name [String] rdkafka client name
-       # @param monitor [WaterDrop::Instrumentation::Monitor] monitor we are using
-       def initialize(subscription_group_id, consumer_group_id, client_name, monitor)
+       def initialize(subscription_group_id, consumer_group_id, client_name)
          @subscription_group_id = subscription_group_id
          @consumer_group_id = consumer_group_id
          @client_name = client_name
-         @monitor = monitor
          @statistics_decorator = ::Karafka::Core::Monitoring::StatisticsDecorator.new
        end
 
@@ -28,7 +26,7 @@ module Karafka
          # all the time.
          return unless @client_name == statistics['name']
 
-         @monitor.instrument(
+         ::Karafka.monitor.instrument(
            'statistics.emitted',
            subscription_group_id: @subscription_group_id,
            consumer_group_id: @consumer_group_id,
@@ -54,8 +54,6 @@ module Karafka
        error.occurred
      ].freeze
 
-     private_constant :EVENTS
-
      # @return [Karafka::Instrumentation::Monitor] monitor instance for system instrumentation
      def initialize
        super
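With `EVENTS` no longer private, the full event list can be re-bound elsewhere; a sketch bridging every Karafka event onto `ActiveSupport::Notifications` (the `karafka.` prefix is an arbitrary choice):

```ruby
require 'active_support/notifications'

::Karafka::Instrumentation::Notifications::EVENTS.each do |event_name|
  # Re-emit each Karafka event on the ActiveSupport bus under a prefixed name
  Karafka.monitor.subscribe(event_name) do |event|
    ActiveSupport::Notifications.instrument("karafka.#{event_name}", event.payload)
  end
end
```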
@@ -1,258 +1,15 @@
  # frozen_string_literal: true
 
+ require_relative 'metrics_listener'
+
  module Karafka
    module Instrumentation
      # Namespace for vendor specific instrumentation
      module Vendors
        # Datadog specific instrumentation
        module Datadog
-         # Listener that can be used to subscribe to Karafka to receive stats via StatsD
-         # and/or Datadog
-         #
-         # @note You need to set up the `dogstatsd-ruby` client and assign it
-         class Listener
-           include ::Karafka::Core::Configurable
-           extend Forwardable
-
-           def_delegators :config, :client, :rd_kafka_metrics, :namespace, :default_tags
-
-           # Value object for storing a single rdkafka metric publishing details
-           RdKafkaMetric = Struct.new(:type, :scope, :name, :key_location)
-
-           # Namespace under which the DD metrics should be published
-           setting :namespace, default: 'karafka'
-
-           # Datadog client that we should use to publish the metrics
-           setting :client
-
-           # Default tags we want to publish (for example hostname)
-           # Format as follows (example for hostname): `["host:#{Socket.gethostname}"]`
-           setting :default_tags, default: []
-
-           # All the rdkafka metrics we want to publish
-           #
-           # By default we publish quite a lot, so this can be tuned
-           # Note that the ones with `_d` come from Karafka, not rdkafka or Kafka
-           setting :rd_kafka_metrics, default: [
-             # Client metrics
-             RdKafkaMetric.new(:count, :root, 'messages.consumed', 'rxmsgs_d'),
-             RdKafkaMetric.new(:count, :root, 'messages.consumed.bytes', 'rxmsg_bytes'),
-
-             # Broker metrics
-             RdKafkaMetric.new(:count, :brokers, 'consume.attempts', 'txretries_d'),
-             RdKafkaMetric.new(:count, :brokers, 'consume.errors', 'txerrs_d'),
-             RdKafkaMetric.new(:count, :brokers, 'receive.errors', 'rxerrs_d'),
-             RdKafkaMetric.new(:count, :brokers, 'connection.connects', 'connects_d'),
-             RdKafkaMetric.new(:count, :brokers, 'connection.disconnects', 'disconnects_d'),
-             RdKafkaMetric.new(:gauge, :brokers, 'network.latency.avg', %w[rtt avg]),
-             RdKafkaMetric.new(:gauge, :brokers, 'network.latency.p95', %w[rtt p95]),
-             RdKafkaMetric.new(:gauge, :brokers, 'network.latency.p99', %w[rtt p99]),
-
-             # Topics metrics
-             RdKafkaMetric.new(:gauge, :topics, 'consumer.lags', 'consumer_lag_stored'),
-             RdKafkaMetric.new(:gauge, :topics, 'consumer.lags_delta', 'consumer_lag_stored_d')
-           ].freeze
-
-           configure
-
-           # @param block [Proc] configuration block
-           def initialize(&block)
-             configure
-             setup(&block) if block
-           end
-
-           # @param block [Proc] configuration block
-           # @note We define this alias to be consistent with `WaterDrop#setup`
-           def setup(&block)
-             configure(&block)
-           end
-
-           # Hooks up to WaterDrop instrumentation for emitted statistics
-           #
-           # @param event [Karafka::Core::Monitoring::Event]
-           def on_statistics_emitted(event)
-             statistics = event[:statistics]
-             consumer_group_id = event[:consumer_group_id]
-
-             base_tags = default_tags + ["consumer_group:#{consumer_group_id}"]
-
-             rd_kafka_metrics.each do |metric|
-               report_metric(metric, statistics, base_tags)
-             end
-           end
-
-           # Increases the errors count by 1
-           #
-           # @param event [Karafka::Core::Monitoring::Event]
-           def on_error_occurred(event)
-             extra_tags = ["type:#{event[:type]}"]
-
-             if event.payload[:caller].respond_to?(:messages)
-               extra_tags += consumer_tags(event.payload[:caller])
-             end
-
-             count('error_occurred', 1, tags: default_tags + extra_tags)
-           end
-
-           # Reports how many messages we've polled and how much time we spent on it
-           #
-           # @param event [Karafka::Core::Monitoring::Event]
-           def on_connection_listener_fetch_loop_received(event)
-             time_taken = event[:time]
-             messages_count = event[:messages_buffer].size
-
-             consumer_group_id = event[:subscription_group].consumer_group_id
-
-             extra_tags = ["consumer_group:#{consumer_group_id}"]
-
-             histogram('listener.polling.time_taken', time_taken, tags: default_tags + extra_tags)
-             histogram('listener.polling.messages', messages_count, tags: default_tags + extra_tags)
-           end
-
-           # Here we report the majority of things related to processing, as we have access
-           # to the consumer
-           # @param event [Karafka::Core::Monitoring::Event]
-           def on_consumer_consumed(event)
-             consumer = event.payload[:caller]
-             messages = consumer.messages
-             metadata = messages.metadata
-
-             tags = default_tags + consumer_tags(consumer)
-
-             count('consumer.messages', messages.count, tags: tags)
-             count('consumer.batches', 1, tags: tags)
-             gauge('consumer.offset', metadata.last_offset, tags: tags)
-             histogram('consumer.consumed.time_taken', event[:time], tags: tags)
-             histogram('consumer.batch_size', messages.count, tags: tags)
-             histogram('consumer.processing_lag', metadata.processing_lag, tags: tags)
-             histogram('consumer.consumption_lag', metadata.consumption_lag, tags: tags)
-           end
-
-           # @param event [Karafka::Core::Monitoring::Event]
-           def on_consumer_revoked(event)
-             tags = default_tags + consumer_tags(event.payload[:caller])
-
-             count('consumer.revoked', 1, tags: tags)
-           end
-
-           # @param event [Karafka::Core::Monitoring::Event]
-           def on_consumer_shutdown(event)
-             tags = default_tags + consumer_tags(event.payload[:caller])
-
-             count('consumer.shutdown', 1, tags: tags)
-           end
-
-           # Worker related metrics
-           # @param event [Karafka::Core::Monitoring::Event]
-           def on_worker_process(event)
-             jq_stats = event[:jobs_queue].statistics
-
-             gauge('worker.total_threads', Karafka::App.config.concurrency, tags: default_tags)
-             histogram('worker.processing', jq_stats[:busy], tags: default_tags)
-             histogram('worker.enqueued_jobs', jq_stats[:enqueued], tags: default_tags)
-           end
-
-           # We report this metric before and after processing for higher accuracy
-           # Without this, the utilization would not be fully reflected
-           # @param event [Karafka::Core::Monitoring::Event]
-           def on_worker_processed(event)
-             jq_stats = event[:jobs_queue].statistics
-
-             histogram('worker.processing', jq_stats[:busy], tags: default_tags)
-           end
-
-           private
-
-           %i[
-             count
-             gauge
-             histogram
-             increment
-             decrement
-           ].each do |metric_type|
-             class_eval <<~METHODS, __FILE__, __LINE__ + 1
-               def #{metric_type}(key, *args)
-                 client.#{metric_type}(
-                   namespaced_metric(key),
-                   *args
-                 )
-               end
-             METHODS
-           end
-
-           # Wraps metric name in listener's namespace
-           # @param metric_name [String] RdKafkaMetric name
-           # @return [String]
-           def namespaced_metric(metric_name)
-             "#{namespace}.#{metric_name}"
-           end
-
-           # Reports a given metric statistics to Datadog
-           # @param metric [RdKafkaMetric] metric value object
-           # @param statistics [Hash] hash with all the statistics emitted
-           # @param base_tags [Array<String>] base tags we want to start with
-           def report_metric(metric, statistics, base_tags)
-             case metric.scope
-             when :root
-               public_send(
-                 metric.type,
-                 metric.name,
-                 statistics.fetch(*metric.key_location),
-                 tags: base_tags
-               )
-             when :brokers
-               statistics.fetch('brokers').each_value do |broker_statistics|
-                 # Skip bootstrap nodes
-                 # Bootstrap nodes have nodeid -1, other nodes have positive
-                 # node ids
-                 next if broker_statistics['nodeid'] == -1
-
-                 public_send(
-                   metric.type,
-                   metric.name,
-                   broker_statistics.dig(*metric.key_location),
-                   tags: base_tags + ["broker:#{broker_statistics['nodename']}"]
-                 )
-               end
-             when :topics
-               statistics.fetch('topics').each do |topic_name, topic_values|
-                 topic_values['partitions'].each do |partition_name, partition_statistics|
-                   next if partition_name == '-1'
-                   # Skip until lag info is available
-                   next if partition_statistics['consumer_lag'] == -1
-
-                   public_send(
-                     metric.type,
-                     metric.name,
-                     partition_statistics.dig(*metric.key_location),
-                     tags: base_tags + [
-                       "topic:#{topic_name}",
-                       "partition:#{partition_name}"
-                     ]
-                   )
-                 end
-               end
-             else
-               raise ArgumentError, metric.scope
-             end
-           end
-
-           # Builds basic per consumer tags for publication
-           #
-           # @param consumer [Karafka::BaseConsumer]
-           # @return [Array<String>]
-           def consumer_tags(consumer)
-             messages = consumer.messages
-             metadata = messages.metadata
-             consumer_group_id = consumer.topic.consumer_group.id
-
-             [
-               "topic:#{metadata.topic}",
-               "partition:#{metadata.partition}",
-               "consumer_group:#{consumer_group_id}"
-             ]
-           end
-         end
+         # Alias to keep backwards compatibility
+         Listener = MetricsListener
        end
      end
    end
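Because the alias is a plain constant assignment, both names resolve to the same class object and existing `Datadog::Listener` setups keep working unchanged:

```ruby
listener_class = Karafka::Instrumentation::Vendors::Datadog::Listener
metrics_class  = Karafka::Instrumentation::Vendors::Datadog::MetricsListener

listener_class.equal?(metrics_class) # => true
```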
@@ -0,0 +1,259 @@
+ # frozen_string_literal: true
+
+ module Karafka
+   module Instrumentation
+     # Namespace for vendor specific instrumentation
+     module Vendors
+       # Datadog specific instrumentation
+       module Datadog
+         # Listener that can be used to subscribe to Karafka to receive stats via StatsD
+         # and/or Datadog
+         #
+         # @note You need to set up the `dogstatsd-ruby` client and assign it
+         class MetricsListener
+           include ::Karafka::Core::Configurable
+           extend Forwardable
+
+           def_delegators :config, :client, :rd_kafka_metrics, :namespace, :default_tags
+
+           # Value object for storing a single rdkafka metric publishing details
+           RdKafkaMetric = Struct.new(:type, :scope, :name, :key_location)
+
+           # Namespace under which the DD metrics should be published
+           setting :namespace, default: 'karafka'
+
+           # Datadog client that we should use to publish the metrics
+           setting :client
+
+           # Default tags we want to publish (for example hostname)
+           # Format as follows (example for hostname): `["host:#{Socket.gethostname}"]`
+           setting :default_tags, default: []
+
+           # All the rdkafka metrics we want to publish
+           #
+           # By default we publish quite a lot, so this can be tuned
+           # Note that the ones with `_d` come from Karafka, not rdkafka or Kafka
+           setting :rd_kafka_metrics, default: [
+             # Client metrics
+             RdKafkaMetric.new(:count, :root, 'messages.consumed', 'rxmsgs_d'),
+             RdKafkaMetric.new(:count, :root, 'messages.consumed.bytes', 'rxmsg_bytes'),
+
+             # Broker metrics
+             RdKafkaMetric.new(:count, :brokers, 'consume.attempts', 'txretries_d'),
+             RdKafkaMetric.new(:count, :brokers, 'consume.errors', 'txerrs_d'),
+             RdKafkaMetric.new(:count, :brokers, 'receive.errors', 'rxerrs_d'),
+             RdKafkaMetric.new(:count, :brokers, 'connection.connects', 'connects_d'),
+             RdKafkaMetric.new(:count, :brokers, 'connection.disconnects', 'disconnects_d'),
+             RdKafkaMetric.new(:gauge, :brokers, 'network.latency.avg', %w[rtt avg]),
+             RdKafkaMetric.new(:gauge, :brokers, 'network.latency.p95', %w[rtt p95]),
+             RdKafkaMetric.new(:gauge, :brokers, 'network.latency.p99', %w[rtt p99]),
+
+             # Topics metrics
+             RdKafkaMetric.new(:gauge, :topics, 'consumer.lags', 'consumer_lag_stored'),
+             RdKafkaMetric.new(:gauge, :topics, 'consumer.lags_delta', 'consumer_lag_stored_d')
+           ].freeze
+
+           configure
+
+           # @param block [Proc] configuration block
+           def initialize(&block)
+             configure
+             setup(&block) if block
+           end
+
+           # @param block [Proc] configuration block
+           # @note We define this alias to be consistent with `WaterDrop#setup`
+           def setup(&block)
+             configure(&block)
+           end
+
+           # Hooks up to WaterDrop instrumentation for emitted statistics
+           #
+           # @param event [Karafka::Core::Monitoring::Event]
+           def on_statistics_emitted(event)
+             statistics = event[:statistics]
+             consumer_group_id = event[:consumer_group_id]
+
+             base_tags = default_tags + ["consumer_group:#{consumer_group_id}"]
+
+             rd_kafka_metrics.each do |metric|
+               report_metric(metric, statistics, base_tags)
+             end
+           end
+
+           # Increases the errors count by 1
+           #
+           # @param event [Karafka::Core::Monitoring::Event]
+           def on_error_occurred(event)
+             extra_tags = ["type:#{event[:type]}"]
+
+             if event.payload[:caller].respond_to?(:messages)
+               extra_tags += consumer_tags(event.payload[:caller])
+             end
+
+             count('error_occurred', 1, tags: default_tags + extra_tags)
+           end
+
+           # Reports how many messages we've polled and how much time we spent on it
+           #
+           # @param event [Karafka::Core::Monitoring::Event]
+           def on_connection_listener_fetch_loop_received(event)
+             time_taken = event[:time]
+             messages_count = event[:messages_buffer].size
+
+             consumer_group_id = event[:subscription_group].consumer_group_id
+
+             extra_tags = ["consumer_group:#{consumer_group_id}"]
+
+             histogram('listener.polling.time_taken', time_taken, tags: default_tags + extra_tags)
+             histogram('listener.polling.messages', messages_count, tags: default_tags + extra_tags)
+           end
+
+           # Here we report the majority of things related to processing, as we have access
+           # to the consumer
+           # @param event [Karafka::Core::Monitoring::Event]
+           def on_consumer_consumed(event)
+             consumer = event.payload[:caller]
+             messages = consumer.messages
+             metadata = messages.metadata
+
+             tags = default_tags + consumer_tags(consumer)
+
+             count('consumer.messages', messages.count, tags: tags)
+             count('consumer.batches', 1, tags: tags)
+             gauge('consumer.offset', metadata.last_offset, tags: tags)
+             histogram('consumer.consumed.time_taken', event[:time], tags: tags)
+             histogram('consumer.batch_size', messages.count, tags: tags)
+             histogram('consumer.processing_lag', metadata.processing_lag, tags: tags)
+             histogram('consumer.consumption_lag', metadata.consumption_lag, tags: tags)
+           end
+
+           # @param event [Karafka::Core::Monitoring::Event]
+           def on_consumer_revoked(event)
+             tags = default_tags + consumer_tags(event.payload[:caller])
+
+             count('consumer.revoked', 1, tags: tags)
+           end
+
+           # @param event [Karafka::Core::Monitoring::Event]
+           def on_consumer_shutdown(event)
+             tags = default_tags + consumer_tags(event.payload[:caller])
+
+             count('consumer.shutdown', 1, tags: tags)
+           end
+
+           # Worker related metrics
+           # @param event [Karafka::Core::Monitoring::Event]
+           def on_worker_process(event)
+             jq_stats = event[:jobs_queue].statistics
+
+             gauge('worker.total_threads', Karafka::App.config.concurrency, tags: default_tags)
+             histogram('worker.processing', jq_stats[:busy], tags: default_tags)
+             histogram('worker.enqueued_jobs', jq_stats[:enqueued], tags: default_tags)
+           end
+
+           # We report this metric before and after processing for higher accuracy
+           # Without this, the utilization would not be fully reflected
+           # @param event [Karafka::Core::Monitoring::Event]
+           def on_worker_processed(event)
+             jq_stats = event[:jobs_queue].statistics
+
+             histogram('worker.processing', jq_stats[:busy], tags: default_tags)
+           end
+
+           private
+
+           %i[
+             count
+             gauge
+             histogram
+             increment
+             decrement
+           ].each do |metric_type|
+             class_eval <<~METHODS, __FILE__, __LINE__ + 1
+               def #{metric_type}(key, *args)
+                 client.#{metric_type}(
+                   namespaced_metric(key),
+                   *args
+                 )
+               end
+             METHODS
+           end
+
+           # Wraps metric name in listener's namespace
+           # @param metric_name [String] RdKafkaMetric name
+           # @return [String]
+           def namespaced_metric(metric_name)
+             "#{namespace}.#{metric_name}"
+           end
+
+           # Reports a given metric statistics to Datadog
+           # @param metric [RdKafkaMetric] metric value object
+           # @param statistics [Hash] hash with all the statistics emitted
+           # @param base_tags [Array<String>] base tags we want to start with
+           def report_metric(metric, statistics, base_tags)
+             case metric.scope
+             when :root
+               public_send(
+                 metric.type,
+                 metric.name,
+                 statistics.fetch(*metric.key_location),
+                 tags: base_tags
+               )
+             when :brokers
+               statistics.fetch('brokers').each_value do |broker_statistics|
+                 # Skip bootstrap nodes
+                 # Bootstrap nodes have nodeid -1, other nodes have positive
+                 # node ids
+                 next if broker_statistics['nodeid'] == -1
+
+                 public_send(
+                   metric.type,
+                   metric.name,
+                   broker_statistics.dig(*metric.key_location),
+                   tags: base_tags + ["broker:#{broker_statistics['nodename']}"]
+                 )
+               end
+             when :topics
+               statistics.fetch('topics').each do |topic_name, topic_values|
+                 topic_values['partitions'].each do |partition_name, partition_statistics|
+                   next if partition_name == '-1'
+                   # Skip until lag info is available
+                   next if partition_statistics['consumer_lag'] == -1
+
+                   public_send(
+                     metric.type,
+                     metric.name,
+                     partition_statistics.dig(*metric.key_location),
+                     tags: base_tags + [
+                       "topic:#{topic_name}",
+                       "partition:#{partition_name}"
+                     ]
+                   )
+                 end
+               end
+             else
+               raise ArgumentError, metric.scope
+             end
+           end
+
+           # Builds basic per consumer tags for publication
+           #
+           # @param consumer [Karafka::BaseConsumer]
+           # @return [Array<String>]
+           def consumer_tags(consumer)
+             messages = consumer.messages
+             metadata = messages.metadata
+             consumer_group_id = consumer.topic.consumer_group.id
+
+             [
+               "topic:#{metadata.topic}",
+               "partition:#{metadata.partition}",
+               "consumer_group:#{consumer_group_id}"
+             ]
+           end
+         end
+       end
+     end
+   end
+ end
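A hedged setup sketch for the renamed listener, assuming the `dogstatsd-ruby` gem is installed and provides `Datadog::Statsd`:

```ruby
require 'socket'
require 'datadog/statsd'
require 'karafka/instrumentation/vendors/datadog/metrics_listener'

listener = ::Karafka::Instrumentation::Vendors::Datadog::MetricsListener.new do |config|
  config.client = Datadog::Statsd.new('localhost', 8125)
  config.default_tags = ["host:#{Socket.gethostname}"]
end

# Subscribes the listener to all events it defines `on_*` handlers for
Karafka.monitor.subscribe(listener)
```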
@@ -31,9 +31,11 @@ module Karafka
            break if revoked?
            break if Karafka::App.stopping?
 
-           ::ActiveJob::Base.execute(
-             ::ActiveSupport::JSON.decode(message.raw_payload)
-           )
+           job = ::ActiveSupport::JSON.decode(message.raw_payload)
+
+           tags.add(:job_class, job['job_class'])
+
+           ::ActiveJob::Base.execute(job)
 
            # We cannot mark jobs as done after each if there are virtual partitions. Otherwise
            # this could create random markings.
@@ -23,6 +23,7 @@ module Karafka
        # They can be updated by using `#karafka_options` on the job
        DEFAULTS = {
          dispatch_method: :produce_async,
+         dispatch_many_method: :produce_many_async,
          # We don't create a dummy proc based partitioner as we would have to evaluate it with
          # each job.
          partitioner: nil,
@@ -33,7 +34,7 @@ module Karafka
        private_constant :DEFAULTS
 
        # @param job [ActiveJob::Base] job
-       def call(job)
+       def dispatch(job)
          ::Karafka.producer.public_send(
            fetch_option(job, :dispatch_method, DEFAULTS),
            dispatch_details(job).merge!(
@@ -43,6 +44,28 @@ module Karafka
          )
        end
 
+       # Bulk dispatches multiple jobs using the Rails 7.1+ API
+       # @param jobs [Array<ActiveJob::Base>] jobs we want to dispatch
+       def dispatch_many(jobs)
+         dispatches = Hash.new { |hash, key| hash[key] = [] }
+
+         jobs.each do |job|
+           d_method = fetch_option(job, :dispatch_many_method, DEFAULTS)
+
+           dispatches[d_method] << dispatch_details(job).merge!(
+             topic: job.queue_name,
+             payload: ::ActiveSupport::JSON.encode(job.serialize)
+           )
+         end
+
+         dispatches.each do |type, messages|
+           ::Karafka.producer.public_send(
+             type,
+             messages
+           )
+         end
+       end
+
        private
 
        # @param job [ActiveJob::Base] job instance
@@ -25,9 +25,20 @@ module Karafka
          ).fetch('en').fetch('validations').fetch('job_options')
        end
 
-       optional(:dispatch_method) { |val| %i[produce_async produce_sync].include?(val) }
        optional(:partitioner) { |val| val.respond_to?(:call) }
        optional(:partition_key_type) { |val| %i[key partition_key].include?(val) }
+       optional(:dispatch_method) do |val|
+         %i[
+           produce_async
+           produce_sync
+         ].include?(val)
+       end
+       optional(:dispatch_many_method) do |val|
+         %i[
+           produce_many_async
+           produce_many_sync
+         ].include?(val)
+       end
      end
    end
  end
@@ -4,6 +4,9 @@ module Karafka
    # Class used to catch signals from the ruby Signal class in order to manage Karafka stop
    # @note There might be only one process - this class is a singleton
    class Process
+     # Allow for process tagging for instrumentation
+     extend ::Karafka::Core::Taggable
+
      # Signal types that we handle
      HANDLED_SIGNALS = %i[
        SIGINT
@@ -89,6 +89,11 @@ module Karafka
      # option [::WaterDrop::Producer, nil]
      # Unless configured, will be created once Karafka is configured based on user Karafka setup
      setting :producer, default: nil
+     # option [Boolean] when set to true, Karafka will ensure that the routing topic naming
+     # convention is strict
+     # Disabling this may be needed in scenarios where we do not have control over topic names
+     # and/or we work with existing systems where we cannot change topic names.
+     setting :strict_topics_namespacing, default: true
 
      # rdkafka default options
      # @see https://github.com/edenhill/librdkafka/blob/master/CONFIGURATION.md
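Disabling the validation referenced in the error messages above is a one-line setup change; a minimal sketch:

```ruby
class KarafkaApp < Karafka::App
  setup do |config|
    config.kafka = { 'bootstrap.servers': 'localhost:9092' }
    # Accept pre-existing topics that mix `.` and `_` namespacing styles
    config.strict_topics_namespacing = false
  end
end
```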
@@ -168,6 +173,9 @@ module Karafka
 
        configure_components
 
+       # Refreshes the cached references that might have been changed by the config
+       ::Karafka.refresh!
+
        # Runs things that need to be executed after config is defined and all the components
        # are also configured
        Pro::Loader.post_setup(config) if Karafka.pro?
@@ -3,5 +3,5 @@
  # Main module namespace
  module Karafka
    # Current Karafka version
-   VERSION = '2.0.32'
+   VERSION = '2.0.34'
  end
data/lib/karafka.rb CHANGED
@@ -95,6 +95,19 @@ module Karafka
      def boot_file
        Pathname.new(ENV['KARAFKA_BOOT_FILE'] || File.join(Karafka.root, 'karafka.rb'))
      end
+
+     # We need to be able to overwrite the monitor, logger and producer after the configuration
+     # in case they were changed, because those two (with defaults) can be used prior to the
+     # setup and their state change should be reflected in the updated setup
+     #
+     # This method refreshes the things that might have been altered by the configuration
+     def refresh!
+       config = ::Karafka::App.config
+
+       @logger = config.logger
+       @producer = config.producer
+       @monitor = config.monitor
+     end
    end
  end
 
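This `refresh!` is what fixes the prematurely cached monitor (#1314): `Karafka.monitor` could be memoized with the default before setup assigned a custom one. A behavior sketch under that assumption (the subclass is illustrative):

```ruby
# A do-nothing subclass, purely to make the change observable
class MyMonitor < Karafka::Instrumentation::Monitor
end

Karafka::App.setup do |config|
  config.monitor = MyMonitor.new
end

# Setup now ends with `Karafka.refresh!`, so the memoized reference points
# at the configured monitor instead of a stale default
Karafka.monitor.is_a?(MyMonitor) # => true
```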
data.tar.gz.sig CHANGED
Binary file
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: karafka
  version: !ruby/object:Gem::Version
-   version: 2.0.32
+   version: 2.0.34
  platform: ruby
  authors:
  - Maciej Mensfeld
@@ -35,7 +35,7 @@ cert_chain:
    Qf04B9ceLUaC4fPVEz10FyobjaFoY4i32xRto3XnrzeAgfEe4swLq8bQsR3w/EF3
    MGU0FeSV2Yj7Xc2x/7BzLK8xQn5l7Yy75iPF+KP3vVmDHnNl
    -----END CERTIFICATE-----
- date: 2023-02-14 00:00:00.000000000 Z
+ date: 2023-03-04 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: karafka-core
@@ -43,7 +43,7 @@ dependencies:
    requirements:
    - - ">="
      - !ruby/object:Gem::Version
-       version: 2.0.11
+       version: 2.0.12
    - - "<"
      - !ruby/object:Gem::Version
        version: 3.0.0
@@ -53,7 +53,7 @@ dependencies:
    requirements:
    - - ">="
      - !ruby/object:Gem::Version
-       version: 2.0.11
+       version: 2.0.12
    - - "<"
      - !ruby/object:Gem::Version
        version: 3.0.0
@@ -198,6 +198,7 @@ files:
  - lib/karafka/instrumentation/vendors/datadog/dashboard.json
  - lib/karafka/instrumentation/vendors/datadog/listener.rb
  - lib/karafka/instrumentation/vendors/datadog/logger_listener.rb
+ - lib/karafka/instrumentation/vendors/datadog/metrics_listener.rb
  - lib/karafka/licenser.rb
  - lib/karafka/messages/batch_metadata.rb
  - lib/karafka/messages/builders/batch_metadata.rb
metadata.gz.sig CHANGED
Binary file