karafka 2.0.32 → 2.0.34

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: f5364527333b7924241340cbf9df8b3c189447ccf6c1b79612845d2d170990fe
-   data.tar.gz: d590b8f940a5fa00926d386e196607124e090888e0fa1fff321d935cb0818d47
+   metadata.gz: 36d890d825aaeaee5349dcc653d888da3a023c01a837864544a905db977569c4
+   data.tar.gz: be442485812a05a030bab33da31a8e2fda684add8c4d59a0af78f517bb2519bd
  SHA512:
-   metadata.gz: 25cec5ed66eb1199ec92c0206a34fba4583d79a9ba7d3ce68041855c48e1bb4bf97590ada597649c801340026270c604ed2a00c57c12f90c6fb861e3d85fd0b3
-   data.tar.gz: dbd604d94c1dc1df6a0040b24e4c5754205303ec7d7a3a15271d2309df3caa56f34f94ef7ea44904b020b57154c17dd06688b23b94ded4230dde321e5f3f1d91
+   metadata.gz: d92be137485c436c1ed02435669785422e6e4da194ab19b97dca31f70b530fe7e0ae4e6b0c7c54895dbceca2d8fe4ef1bdcabdb287affe06e377942062777979
+   data.tar.gz: 0525b652373088a7a692134a6a4c89487e495e01934504d256a60ae7805c92f65a308d38c8d75ad806f39f5128d2d00865e10b1b4ae326c7dd60e7594c68558e
checksums.yaml.gz.sig CHANGED
Binary file
data/CHANGELOG.md CHANGED
@@ -1,11 +1,55 @@
  # Karafka framework changelog
 
- ## 2.0.32 (2022-02-13)
+ ## 2.0.34 (2023-03-04)
+ - [Improvement] Attach an `embedded` tag to Karafka processes started using the embedded API.
+ - [Change] Renamed `Datadog::Listener` to `Datadog::MetricsListener` for consistency. (#1124)
+
+ ### Upgrade notes
+
+ 1. Replace `Datadog::Listener` references with `Datadog::MetricsListener`.
+
+ ## 2.0.33 (2023-02-24)
+ - **[Feature]** Support `perform_all_later` in the ActiveJob adapter for Rails `7.1+`
+ - **[Feature]** Introduce ability to assign and re-assign tags in consumer instances. This can be used for extra instrumentation that is context aware.
+ - **[Feature]** Introduce ability to assign and re-assign tags to the `Karafka::Process`.
+ - [Improvement] When using the `ActiveJob` adapter, automatically tag jobs with the name of the `ActiveJob` class that is running inside the `ActiveJob` consumer.
+ - [Improvement] Make the `::Karafka::Instrumentation::Notifications::EVENTS` list public for anyone wanting to re-bind those events into a different notification bus.
+ - [Improvement] Set `fetch.message.max.bytes` for `Karafka::Admin` to `5MB` to make sure that all data is fetched correctly for the Web UI under heavy load (many consumers).
+ - [Improvement] Introduce a `strict_topics_namespacing` config option to enable/disable the strict topic naming validations. This can be useful when working with pre-existing topics which we cannot or do not want to rename.
+ - [Fix] Karafka monitor is prematurely cached (#1314)
+
+ ### Upgrade notes
+
+ Since `#tags` were introduced on consumers, the `#tags` method is now part of the consumers API.
+
+ This means that if you were using a method called `#tags` in your consumers, you will have to rename it:
+
+ ```ruby
+ class EventsConsumer < ApplicationConsumer
+   def consume
+     messages.each do |message|
+       tags << message.payload.tag
+     end
+
+     tags.each { |tag| puts tag }
+   end
+
+   private
+
+   # This will collide with the tagging API
+   # This NEEDS to be renamed not to collide with the `#tags` method provided by the consumers API.
+   def tags
+     @tags ||= Set.new
+   end
+ end
+ ```
+
+ ## 2.0.32 (2023-02-13)
  - [Fix] Many non-existing topic subscriptions propagate poll errors beyond the client
  - [Improvement] Ignore `unknown_topic_or_part` errors in dev when `allow.auto.create.topics` is on.
  - [Improvement] Optimize temporary errors handling in polling for a better backoff policy
 
- ## 2.0.31 (2022-02-12)
+ ## 2.0.31 (2023-02-12)
  - [Feature] Allow for adding partitions via the `Admin#create_partitions` API.
  - [Fix] Do not ignore admin errors upon invalid configuration (#1254)
  - [Fix] Topic name validation (#1300) - CandyFet
@@ -13,7 +57,7 @@
  - [Maintenance] Require `karafka-core` >= `2.0.11` and switch to shared RSpec locator.
  - [Maintenance] Require `karafka-rdkafka` >= `0.12.1`
 
- ## 2.0.30 (2022-01-31)
+ ## 2.0.30 (2023-01-31)
  - [Improvement] Alias `--consumer-groups` with `--include-consumer-groups`
  - [Improvement] Alias `--subscription-groups` with `--include-subscription-groups`
  - [Improvement] Alias `--topics` with `--include-topics`
@@ -63,7 +107,7 @@ class KarafkaApp < Karafka::App
  - [Improvement] Expand `LoggerListener` with `client.resume` notification.
  - [Improvement] Replace random anonymous subscription group ids with stable ones.
  - [Improvement] Add `consumer.consume`, `consumer.revoke` and `consumer.shutting_down` notification events and move the revocation logic calling to strategies.
- - [Change] Rename job queue statistics `processing` key to `busy`. No changes needed because naming in the DataDog listener stays the same.
+ - [Change] Rename job queue statistics `processing` key to `busy`. No changes needed because naming in the DataDog listener stays the same.
  - [Fix] Fix proctitle listener state changes reporting on new states.
  - [Fix] Make sure all file descriptors are closed in the integration specs.
  - [Fix] Fix a case where empty subscription groups could leak into the execution flow.
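The `2.0.33` upgrade example above shows the collision but not the resolution; a minimal sketch of the rename (the `payload_tags` name is hypothetical — any name that does not shadow `#tags` works):

```ruby
require 'set'

class EventsConsumer < ApplicationConsumer
  def consume
    messages.each do |message|
      # Collect payload tags under a name that does not shadow the tagging API
      payload_tags << message.payload.tag
    end

    payload_tags.each { |tag| puts tag }
  end

  private

  # Renamed from `#tags`, so `#tags` from the consumers API stays available
  def payload_tags
    @payload_tags ||= Set.new
  end
end
```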
data/Gemfile.lock CHANGED
@@ -1,8 +1,8 @@
  PATH
    remote: .
    specs:
-     karafka (2.0.32)
-       karafka-core (>= 2.0.11, < 3.0.0)
+     karafka (2.0.34)
+       karafka-core (>= 2.0.12, < 3.0.0)
        thor (>= 0.20)
        waterdrop (>= 2.4.10, < 3.0.0)
        zeitwerk (~> 2.3)
@@ -19,7 +19,7 @@ GEM
        minitest (>= 5.1)
        tzinfo (~> 2.0)
      byebug (11.1.3)
-     concurrent-ruby (1.2.0)
+     concurrent-ruby (1.2.2)
      diff-lcs (1.5.0)
      docile (1.4.0)
      factory_bot (6.2.1)
@@ -29,7 +29,7 @@ GEM
        activesupport (>= 5.0)
      i18n (1.12.0)
        concurrent-ruby (~> 1.0)
-     karafka-core (2.0.11)
+     karafka-core (2.0.12)
        concurrent-ruby (>= 1.1)
        karafka-rdkafka (>= 0.12.1)
      karafka-rdkafka (0.12.1)
@@ -61,13 +61,12 @@ GEM
      thor (1.2.1)
      tzinfo (2.0.6)
        concurrent-ruby (~> 1.0)
-     waterdrop (2.4.10)
-       karafka-core (>= 2.0.9, < 3.0.0)
+     waterdrop (2.4.11)
+       karafka-core (>= 2.0.12, < 3.0.0)
        zeitwerk (~> 2.3)
-     zeitwerk (2.6.6)
+     zeitwerk (2.6.7)
 
  PLATFORMS
-   arm64-darwin-21
    x86_64-linux
 
  DEPENDENCIES
@@ -45,18 +45,23 @@ en:
      dead_letter_queue.topic_format: 'needs to be a string with a Kafka accepted format'
      dead_letter_queue.active_format: needs to be either true or false
      active_format: needs to be either true or false
-     inconsistent_namespacing: needs to be consistent namespacing style
+     inconsistent_namespacing: |
+       needs to be consistent namespacing style
+       disable this validation by setting config.strict_topics_namespacing to false
 
    consumer_group:
      missing: needs to be present
      topics_names_not_unique: all topic names within a single consumer group must be unique
-     topics_namespaced_names_not_unique: all topic names within a single consumer group must be unique considering namespacing styles
      id_format: 'needs to be a string with a Kafka accepted format'
      topics_format: needs to be a non-empty array
+     topics_namespaced_names_not_unique: |
+       all topic names within a single consumer group must be unique considering namespacing styles
+       disable this validation by setting config.strict_topics_namespacing to false
 
    job_options:
      missing: needs to be present
      dispatch_method_format: needs to be either :produce_async or :produce_sync
+     dispatch_many_method_format: needs to be either :produce_many_async or :produce_many_sync
      partitioner_format: 'needs to respond to #call'
      partition_key_type_format: 'needs to be either :key or :partition_key'
 
data/karafka.gemspec CHANGED
@@ -21,7 +21,7 @@ Gem::Specification.new do |spec|
      without having to focus on things that are not your business domain.
    DESC
 
-   spec.add_dependency 'karafka-core', '>= 2.0.11', '< 3.0.0'
+   spec.add_dependency 'karafka-core', '>= 2.0.12', '< 3.0.0'
    spec.add_dependency 'thor', '>= 0.20'
    spec.add_dependency 'waterdrop', '>= 2.4.10', '< 3.0.0'
    spec.add_dependency 'zeitwerk', '~> 2.3'
@@ -11,7 +11,13 @@ module ActiveJob
      #
      # @param job [Object] job that should be enqueued
      def enqueue(job)
-       ::Karafka::App.config.internal.active_job.dispatcher.call(job)
+       ::Karafka::App.config.internal.active_job.dispatcher.dispatch(job)
+     end
+
+     # Enqueues multiple jobs in one go
+     # @param jobs [Array<Object>] jobs that we want to enqueue
+     def enqueue_all(jobs)
+       ::Karafka::App.config.internal.active_job.dispatcher.dispatch_many(jobs)
      end
 
      # Raises info that the Karafka backend does not support scheduling jobs
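With `enqueue_all` in place, Rails 7.1's `ActiveJob.perform_all_later` hands a whole batch to the dispatcher in one call instead of enqueuing job by job; a hedged sketch (the job class and arguments are illustrative):

```ruby
class WelcomeEmailJob < ApplicationJob
  queue_as :welcome_emails

  def perform(user_id)
    # email delivery would go here
  end
end

# Rails 7.1+ routes this through KarafkaAdapter#enqueue_all, which in turn
# calls the dispatcher's #dispatch_many
ActiveJob.perform_all_later([1, 2, 3].map { |id| WelcomeEmailJob.new(id) })
```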
@@ -12,12 +12,14 @@ module Karafka
        messages.each do |message|
          break if Karafka::App.stopping?
 
-         ::ActiveJob::Base.execute(
-           # Technically speaking, we could set this as the deserializer and reference it from
-           # the message instead of using `#raw_payload`. This is not done on purpose to simplify
-           # the ActiveJob setup here
-           ::ActiveSupport::JSON.decode(message.raw_payload)
-         )
+         # Technically speaking, we could set this as the deserializer and reference it from
+         # the message instead of using `#raw_payload`. This is not done on purpose to simplify
+         # the ActiveJob setup here
+         job = ::ActiveSupport::JSON.decode(message.raw_payload)
+
+         tags.add(:job_class, job['job_class'])
+
+         ::ActiveJob::Base.execute(job)
 
          mark_as_consumed(message)
        end
@@ -7,13 +7,14 @@ module Karafka
      # Defaults for dispatching
      # They can be updated by using `#karafka_options` on the job
      DEFAULTS = {
-       dispatch_method: :produce_async
+       dispatch_method: :produce_async,
+       dispatch_many_method: :produce_many_async
      }.freeze
 
      private_constant :DEFAULTS
 
      # @param job [ActiveJob::Base] job
-     def call(job)
+     def dispatch(job)
        ::Karafka.producer.public_send(
          fetch_option(job, :dispatch_method, DEFAULTS),
          topic: job.queue_name,
@@ -21,6 +22,30 @@ module Karafka
        )
      end
 
+     # Bulk dispatches multiple jobs using the Rails 7.1+ API
+     # @param jobs [Array<ActiveJob::Base>] jobs we want to dispatch
+     def dispatch_many(jobs)
+       # Group jobs by their desired dispatch method
+       # It can be configured per job class, so we need to make sure we divide them
+       dispatches = Hash.new { |hash, key| hash[key] = [] }
+
+       jobs.each do |job|
+         d_method = fetch_option(job, :dispatch_many_method, DEFAULTS)
+
+         dispatches[d_method] << {
+           topic: job.queue_name,
+           payload: ::ActiveSupport::JSON.encode(job.serialize)
+         }
+       end
+
+       dispatches.each do |type, messages|
+         ::Karafka.producer.public_send(
+           type,
+           messages
+         )
+       end
+     end
+
      private
 
      # @param job [ActiveJob::Base] job
@@ -15,7 +15,18 @@ module Karafka
        ).fetch('en').fetch('validations').fetch('job_options')
      end
 
-     optional(:dispatch_method) { |val| %i[produce_async produce_sync].include?(val) }
+     optional(:dispatch_method) do |val|
+       %i[
+         produce_async
+         produce_sync
+       ].include?(val)
+     end
+     optional(:dispatch_many_method) do |val|
+       %i[
+         produce_many_async
+         produce_many_sync
+       ].include?(val)
+     end
    end
  end
  end
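Both options validated above can be tuned per job class via `karafka_options`; a sketch assuming the Karafka ActiveJob adapter is active:

```ruby
class ReportsJob < ApplicationJob
  queue_as :reports

  # Single jobs go out asynchronously, while `perform_all_later` batches are
  # flushed synchronously (values must be among those validated above)
  karafka_options(
    dispatch_method: :produce_async,
    dispatch_many_method: :produce_many_sync
  )

  def perform(report_id)
    # report building would go here
  end
end
```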
data/lib/karafka/admin.rb CHANGED
@@ -26,7 +26,9 @@ module Karafka
      'group.id': 'karafka_admin',
      # We want to know when there is no more data, not to end up with an endless loop
      'enable.partition.eof': true,
-     'statistics.interval.ms': 0
+     'statistics.interval.ms': 0,
+     # Fetch at most 5 MB when using admin
+     'fetch.message.max.bytes': 5 * 1_048_576
    }.freeze
 
    private_constant :Topic, :CONFIG_DEFAULTS, :MAX_WAIT_TIMEOUT, :MAX_ATTEMPTS
@@ -4,6 +4,9 @@
  module Karafka
    # Base consumer from which all Karafka consumers should inherit
    class BaseConsumer
+     # Allow for consumer instance tagging for instrumentation
+     include ::Karafka::Core::Taggable
+
      # @return [String] id of the current consumer
      attr_reader :id
      # @return [Karafka::Routing::Topic] topic to which a given consumer is subscribed
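With `Taggable` included, every consumer instance exposes the `#tags` API used by the ActiveJob consumers above; a minimal sketch of context-aware tagging (the tag names are illustrative):

```ruby
class OrdersConsumer < ApplicationConsumer
  def consume
    # Tags are visible to instrumentation listeners; re-adding a tag under
    # the same key replaces the previous value
    tags.add(:processing_mode, 'batch')
    tags.add(:batch_size, messages.count.to_s)

    messages.each { |message| puts message.payload }
  end
end
```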
@@ -431,8 +431,7 @@ module Karafka
          Instrumentation::Callbacks::Statistics.new(
            @subscription_group.id,
            @subscription_group.consumer_group_id,
-           @name,
-           ::Karafka::App.config.monitor
+           @name
          )
        )
 
@@ -442,8 +441,7 @@ module Karafka
          Instrumentation::Callbacks::Error.new(
            @subscription_group.id,
            @subscription_group.consumer_group_id,
-           @name,
-           ::Karafka::App.config.monitor
+           @name
          )
        )
 
@@ -27,6 +27,7 @@ module Karafka
 
        virtual do |data, errors|
          next unless errors.empty?
+         next unless ::Karafka::App.config.strict_topics_namespacing
 
          names = data.fetch(:topics).map { |topic| topic[:name] }
          names_hash = names.each_with_object({}) { |n, h| h[n] = true }
@@ -51,6 +51,7 @@ module Karafka
 
        virtual do |data, errors|
          next unless errors.empty?
+         next unless ::Karafka::App.config.strict_topics_namespacing
 
          value = data.fetch(:name)
          namespacing_chars_count = value.chars.find_all { |c| ['.', '_'].include?(c) }.uniq.count
@@ -7,7 +7,10 @@ module Karafka
    # Starts Karafka without supervision and without ownership of signals in a background thread
    # so it won't interrupt other things running
    def start
-     Thread.new { Karafka::Server.start }
+     Thread.new do
+       Karafka::Process.tags.add(:execution_mode, 'embedded')
+       Karafka::Server.start
+     end
    end
 
    # Stops Karafka upon any event
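The embedded API itself is unchanged; starting it now also tags the process, so embedded instances are distinguishable in instrumentation. A usage sketch (e.g. from a Puma or Sidekiq initializer):

```ruby
# Runs a non-supervised Karafka server in a background thread of the current
# process; the process carries an `execution_mode: embedded` tag
Karafka::Embedded.start

# ...

# Stop it as part of the host process shutdown
Karafka::Embedded.stop
```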
@@ -9,12 +9,10 @@ module Karafka
        # @param subscription_group_id [String] id of the current subscription group instance
        # @param consumer_group_id [String] id of the current consumer group
        # @param client_name [String] rdkafka client name
-       # @param monitor [WaterDrop::Instrumentation::Monitor] monitor we are using
-       def initialize(subscription_group_id, consumer_group_id, client_name, monitor)
+       def initialize(subscription_group_id, consumer_group_id, client_name)
          @subscription_group_id = subscription_group_id
          @consumer_group_id = consumer_group_id
          @client_name = client_name
-         @monitor = monitor
        end
 
        # Runs the instrumentation monitor with the error
@@ -26,7 +24,7 @@ module Karafka
          # Same as with statistics (more explanation there)
          return unless @client_name == client_name
 
-         @monitor.instrument(
+         ::Karafka.monitor.instrument(
            'error.occurred',
            caller: self,
            subscription_group_id: @subscription_group_id,
@@ -10,12 +10,10 @@ module Karafka
        # @param subscription_group_id [String] id of the current subscription group
        # @param consumer_group_id [String] id of the current consumer group
        # @param client_name [String] rdkafka client name
-       # @param monitor [WaterDrop::Instrumentation::Monitor] monitor we are using
-       def initialize(subscription_group_id, consumer_group_id, client_name, monitor)
+       def initialize(subscription_group_id, consumer_group_id, client_name)
          @subscription_group_id = subscription_group_id
          @consumer_group_id = consumer_group_id
          @client_name = client_name
-         @monitor = monitor
          @statistics_decorator = ::Karafka::Core::Monitoring::StatisticsDecorator.new
        end
 
@@ -28,7 +26,7 @@ module Karafka
          # all the time.
          return unless @client_name == statistics['name']
 
-         @monitor.instrument(
+         ::Karafka.monitor.instrument(
            'statistics.emitted',
            subscription_group_id: @subscription_group_id,
            consumer_group_id: @consumer_group_id,
@@ -54,8 +54,6 @@ module Karafka
        error.occurred
      ].freeze
 
-     private_constant :EVENTS
-
      # @return [Karafka::Instrumentation::Monitor] monitor instance for system instrumentation
      def initialize
        super
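With `EVENTS` no longer private, the full event list can be re-bound elsewhere; a sketch bridging every Karafka event onto `ActiveSupport::Notifications` (the `karafka.` prefix is an arbitrary choice):

```ruby
require 'active_support/notifications'

::Karafka::Instrumentation::Notifications::EVENTS.each do |event_name|
  # Re-emit each Karafka event on the ActiveSupport bus under a prefixed name
  Karafka.monitor.subscribe(event_name) do |event|
    ActiveSupport::Notifications.instrument("karafka.#{event_name}", event.payload)
  end
end
```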
@@ -1,258 +1,15 @@
  # frozen_string_literal: true
 
+ require_relative 'metrics_listener'
+
  module Karafka
    module Instrumentation
      # Namespace for vendor specific instrumentation
      module Vendors
        # Datadog specific instrumentation
        module Datadog
-         # Listener that can be used to subscribe to Karafka to receive stats via StatsD
-         # and/or Datadog
-         #
-         # @note You need to set up the `dogstatsd-ruby` client and assign it
-         class Listener
-           include ::Karafka::Core::Configurable
-           extend Forwardable
-
-           def_delegators :config, :client, :rd_kafka_metrics, :namespace, :default_tags
-
-           # Value object for storing a single rdkafka metric publishing details
-           RdKafkaMetric = Struct.new(:type, :scope, :name, :key_location)
-
-           # Namespace under which the DD metrics should be published
-           setting :namespace, default: 'karafka'
-
-           # Datadog client that we should use to publish the metrics
-           setting :client
-
-           # Default tags we want to publish (for example hostname)
-           # Format as follows (example for hostname): `["host:#{Socket.gethostname}"]`
-           setting :default_tags, default: []
-
-           # All the rdkafka metrics we want to publish
-           #
-           # By default we publish quite a lot, so this can be tuned
-           # Note that the ones with `_d` come from Karafka, not rdkafka or Kafka
-           setting :rd_kafka_metrics, default: [
-             # Client metrics
-             RdKafkaMetric.new(:count, :root, 'messages.consumed', 'rxmsgs_d'),
-             RdKafkaMetric.new(:count, :root, 'messages.consumed.bytes', 'rxmsg_bytes'),
-
-             # Broker metrics
-             RdKafkaMetric.new(:count, :brokers, 'consume.attempts', 'txretries_d'),
-             RdKafkaMetric.new(:count, :brokers, 'consume.errors', 'txerrs_d'),
-             RdKafkaMetric.new(:count, :brokers, 'receive.errors', 'rxerrs_d'),
-             RdKafkaMetric.new(:count, :brokers, 'connection.connects', 'connects_d'),
-             RdKafkaMetric.new(:count, :brokers, 'connection.disconnects', 'disconnects_d'),
-             RdKafkaMetric.new(:gauge, :brokers, 'network.latency.avg', %w[rtt avg]),
-             RdKafkaMetric.new(:gauge, :brokers, 'network.latency.p95', %w[rtt p95]),
-             RdKafkaMetric.new(:gauge, :brokers, 'network.latency.p99', %w[rtt p99]),
-
-             # Topics metrics
-             RdKafkaMetric.new(:gauge, :topics, 'consumer.lags', 'consumer_lag_stored'),
-             RdKafkaMetric.new(:gauge, :topics, 'consumer.lags_delta', 'consumer_lag_stored_d')
-           ].freeze
-
-           configure
-
-           # @param block [Proc] configuration block
-           def initialize(&block)
-             configure
-             setup(&block) if block
-           end
-
-           # @param block [Proc] configuration block
-           # @note We define this alias to be consistent with `WaterDrop#setup`
-           def setup(&block)
-             configure(&block)
-           end
-
-           # Hooks up to WaterDrop instrumentation for emitted statistics
-           #
-           # @param event [Karafka::Core::Monitoring::Event]
-           def on_statistics_emitted(event)
-             statistics = event[:statistics]
-             consumer_group_id = event[:consumer_group_id]
-
-             base_tags = default_tags + ["consumer_group:#{consumer_group_id}"]
-
-             rd_kafka_metrics.each do |metric|
-               report_metric(metric, statistics, base_tags)
-             end
-           end
-
-           # Increases the errors count by 1
-           #
-           # @param event [Karafka::Core::Monitoring::Event]
-           def on_error_occurred(event)
-             extra_tags = ["type:#{event[:type]}"]
-
-             if event.payload[:caller].respond_to?(:messages)
-               extra_tags += consumer_tags(event.payload[:caller])
-             end
-
-             count('error_occurred', 1, tags: default_tags + extra_tags)
-           end
-
-           # Reports how many messages we've polled and how much time we spent on it
-           #
-           # @param event [Karafka::Core::Monitoring::Event]
-           def on_connection_listener_fetch_loop_received(event)
-             time_taken = event[:time]
-             messages_count = event[:messages_buffer].size
-
-             consumer_group_id = event[:subscription_group].consumer_group_id
-
-             extra_tags = ["consumer_group:#{consumer_group_id}"]
-
-             histogram('listener.polling.time_taken', time_taken, tags: default_tags + extra_tags)
-             histogram('listener.polling.messages', messages_count, tags: default_tags + extra_tags)
-           end
-
-           # Here we report the majority of things related to processing, as we have access
-           # to the consumer
-           # @param event [Karafka::Core::Monitoring::Event]
-           def on_consumer_consumed(event)
-             consumer = event.payload[:caller]
-             messages = consumer.messages
-             metadata = messages.metadata
-
-             tags = default_tags + consumer_tags(consumer)
-
-             count('consumer.messages', messages.count, tags: tags)
-             count('consumer.batches', 1, tags: tags)
-             gauge('consumer.offset', metadata.last_offset, tags: tags)
-             histogram('consumer.consumed.time_taken', event[:time], tags: tags)
-             histogram('consumer.batch_size', messages.count, tags: tags)
-             histogram('consumer.processing_lag', metadata.processing_lag, tags: tags)
-             histogram('consumer.consumption_lag', metadata.consumption_lag, tags: tags)
-           end
-
-           # @param event [Karafka::Core::Monitoring::Event]
-           def on_consumer_revoked(event)
-             tags = default_tags + consumer_tags(event.payload[:caller])
-
-             count('consumer.revoked', 1, tags: tags)
-           end
-
-           # @param event [Karafka::Core::Monitoring::Event]
-           def on_consumer_shutdown(event)
-             tags = default_tags + consumer_tags(event.payload[:caller])
-
-             count('consumer.shutdown', 1, tags: tags)
-           end
-
-           # Worker related metrics
-           # @param event [Karafka::Core::Monitoring::Event]
-           def on_worker_process(event)
-             jq_stats = event[:jobs_queue].statistics
-
-             gauge('worker.total_threads', Karafka::App.config.concurrency, tags: default_tags)
-             histogram('worker.processing', jq_stats[:busy], tags: default_tags)
-             histogram('worker.enqueued_jobs', jq_stats[:enqueued], tags: default_tags)
-           end
-
-           # We report this metric before and after processing for higher accuracy
-           # Without this, the utilization would not be fully reflected
-           # @param event [Karafka::Core::Monitoring::Event]
-           def on_worker_processed(event)
-             jq_stats = event[:jobs_queue].statistics
-
-             histogram('worker.processing', jq_stats[:busy], tags: default_tags)
-           end
-
-           private
-
-           %i[
-             count
-             gauge
-             histogram
-             increment
-             decrement
-           ].each do |metric_type|
-             class_eval <<~METHODS, __FILE__, __LINE__ + 1
-               def #{metric_type}(key, *args)
-                 client.#{metric_type}(
-                   namespaced_metric(key),
-                   *args
-                 )
-               end
-             METHODS
-           end
-
-           # Wraps metric name in listener's namespace
-           # @param metric_name [String] RdKafkaMetric name
-           # @return [String]
-           def namespaced_metric(metric_name)
-             "#{namespace}.#{metric_name}"
-           end
-
-           # Reports a given metric statistics to Datadog
-           # @param metric [RdKafkaMetric] metric value object
-           # @param statistics [Hash] hash with all the statistics emitted
-           # @param base_tags [Array<String>] base tags we want to start with
-           def report_metric(metric, statistics, base_tags)
-             case metric.scope
-             when :root
-               public_send(
-                 metric.type,
-                 metric.name,
-                 statistics.fetch(*metric.key_location),
-                 tags: base_tags
-               )
-             when :brokers
-               statistics.fetch('brokers').each_value do |broker_statistics|
-                 # Skip bootstrap nodes
-                 # Bootstrap nodes have nodeid -1, other nodes have positive
-                 # node ids
-                 next if broker_statistics['nodeid'] == -1
-
-                 public_send(
-                   metric.type,
-                   metric.name,
-                   broker_statistics.dig(*metric.key_location),
-                   tags: base_tags + ["broker:#{broker_statistics['nodename']}"]
-                 )
-               end
-             when :topics
-               statistics.fetch('topics').each do |topic_name, topic_values|
-                 topic_values['partitions'].each do |partition_name, partition_statistics|
-                   next if partition_name == '-1'
-                   # Skip until lag info is available
-                   next if partition_statistics['consumer_lag'] == -1
-
-                   public_send(
-                     metric.type,
-                     metric.name,
-                     partition_statistics.dig(*metric.key_location),
-                     tags: base_tags + [
-                       "topic:#{topic_name}",
-                       "partition:#{partition_name}"
-                     ]
-                   )
-                 end
-               end
-             else
-               raise ArgumentError, metric.scope
-             end
-           end
-
-           # Builds basic per consumer tags for publication
-           #
-           # @param consumer [Karafka::BaseConsumer]
-           # @return [Array<String>]
-           def consumer_tags(consumer)
-             messages = consumer.messages
-             metadata = messages.metadata
-             consumer_group_id = consumer.topic.consumer_group.id
-
-             [
-               "topic:#{metadata.topic}",
-               "partition:#{metadata.partition}",
-               "consumer_group:#{consumer_group_id}"
-             ]
-           end
-         end
+         # Alias to keep backwards compatibility
+         Listener = MetricsListener
        end
      end
    end
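Because the alias is a plain constant assignment, both names resolve to the same class object and existing `Datadog::Listener` setups keep working unchanged:

```ruby
listener_class = Karafka::Instrumentation::Vendors::Datadog::Listener
metrics_class  = Karafka::Instrumentation::Vendors::Datadog::MetricsListener

listener_class.equal?(metrics_class) # => true
```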
@@ -0,0 +1,259 @@
+ # frozen_string_literal: true
+
+ module Karafka
+   module Instrumentation
+     # Namespace for vendor specific instrumentation
+     module Vendors
+       # Datadog specific instrumentation
+       module Datadog
+         # Listener that can be used to subscribe to Karafka to receive stats via StatsD
+         # and/or Datadog
+         #
+         # @note You need to set up the `dogstatsd-ruby` client and assign it
+         class MetricsListener
+           include ::Karafka::Core::Configurable
+           extend Forwardable
+
+           def_delegators :config, :client, :rd_kafka_metrics, :namespace, :default_tags
+
+           # Value object for storing a single rdkafka metric publishing details
+           RdKafkaMetric = Struct.new(:type, :scope, :name, :key_location)
+
+           # Namespace under which the DD metrics should be published
+           setting :namespace, default: 'karafka'
+
+           # Datadog client that we should use to publish the metrics
+           setting :client
+
+           # Default tags we want to publish (for example hostname)
+           # Format as follows (example for hostname): `["host:#{Socket.gethostname}"]`
+           setting :default_tags, default: []
+
+           # All the rdkafka metrics we want to publish
+           #
+           # By default we publish quite a lot, so this can be tuned
+           # Note that the ones with `_d` come from Karafka, not rdkafka or Kafka
+           setting :rd_kafka_metrics, default: [
+             # Client metrics
+             RdKafkaMetric.new(:count, :root, 'messages.consumed', 'rxmsgs_d'),
+             RdKafkaMetric.new(:count, :root, 'messages.consumed.bytes', 'rxmsg_bytes'),
+
+             # Broker metrics
+             RdKafkaMetric.new(:count, :brokers, 'consume.attempts', 'txretries_d'),
+             RdKafkaMetric.new(:count, :brokers, 'consume.errors', 'txerrs_d'),
+             RdKafkaMetric.new(:count, :brokers, 'receive.errors', 'rxerrs_d'),
+             RdKafkaMetric.new(:count, :brokers, 'connection.connects', 'connects_d'),
+             RdKafkaMetric.new(:count, :brokers, 'connection.disconnects', 'disconnects_d'),
+             RdKafkaMetric.new(:gauge, :brokers, 'network.latency.avg', %w[rtt avg]),
+             RdKafkaMetric.new(:gauge, :brokers, 'network.latency.p95', %w[rtt p95]),
+             RdKafkaMetric.new(:gauge, :brokers, 'network.latency.p99', %w[rtt p99]),
+
+             # Topics metrics
+             RdKafkaMetric.new(:gauge, :topics, 'consumer.lags', 'consumer_lag_stored'),
+             RdKafkaMetric.new(:gauge, :topics, 'consumer.lags_delta', 'consumer_lag_stored_d')
+           ].freeze
+
+           configure
+
+           # @param block [Proc] configuration block
+           def initialize(&block)
+             configure
+             setup(&block) if block
+           end
+
+           # @param block [Proc] configuration block
+           # @note We define this alias to be consistent with `WaterDrop#setup`
+           def setup(&block)
+             configure(&block)
+           end
+
+           # Hooks up to WaterDrop instrumentation for emitted statistics
+           #
+           # @param event [Karafka::Core::Monitoring::Event]
+           def on_statistics_emitted(event)
+             statistics = event[:statistics]
+             consumer_group_id = event[:consumer_group_id]
+
+             base_tags = default_tags + ["consumer_group:#{consumer_group_id}"]
+
+             rd_kafka_metrics.each do |metric|
+               report_metric(metric, statistics, base_tags)
+             end
+           end
+
+           # Increases the errors count by 1
+           #
+           # @param event [Karafka::Core::Monitoring::Event]
+           def on_error_occurred(event)
+             extra_tags = ["type:#{event[:type]}"]
+
+             if event.payload[:caller].respond_to?(:messages)
+               extra_tags += consumer_tags(event.payload[:caller])
+             end
+
+             count('error_occurred', 1, tags: default_tags + extra_tags)
+           end
+
+           # Reports how many messages we've polled and how much time we spent on it
+           #
+           # @param event [Karafka::Core::Monitoring::Event]
+           def on_connection_listener_fetch_loop_received(event)
+             time_taken = event[:time]
+             messages_count = event[:messages_buffer].size
+
+             consumer_group_id = event[:subscription_group].consumer_group_id
+
+             extra_tags = ["consumer_group:#{consumer_group_id}"]
+
+             histogram('listener.polling.time_taken', time_taken, tags: default_tags + extra_tags)
+             histogram('listener.polling.messages', messages_count, tags: default_tags + extra_tags)
+           end
+
+           # Here we report the majority of things related to processing, as we have access
+           # to the consumer
+           # @param event [Karafka::Core::Monitoring::Event]
+           def on_consumer_consumed(event)
+             consumer = event.payload[:caller]
+             messages = consumer.messages
+             metadata = messages.metadata
+
+             tags = default_tags + consumer_tags(consumer)
+
+             count('consumer.messages', messages.count, tags: tags)
+             count('consumer.batches', 1, tags: tags)
+             gauge('consumer.offset', metadata.last_offset, tags: tags)
+             histogram('consumer.consumed.time_taken', event[:time], tags: tags)
+             histogram('consumer.batch_size', messages.count, tags: tags)
+             histogram('consumer.processing_lag', metadata.processing_lag, tags: tags)
+             histogram('consumer.consumption_lag', metadata.consumption_lag, tags: tags)
+           end
+
+           # @param event [Karafka::Core::Monitoring::Event]
+           def on_consumer_revoked(event)
+             tags = default_tags + consumer_tags(event.payload[:caller])
+
+             count('consumer.revoked', 1, tags: tags)
+           end
+
+           # @param event [Karafka::Core::Monitoring::Event]
+           def on_consumer_shutdown(event)
+             tags = default_tags + consumer_tags(event.payload[:caller])
+
+             count('consumer.shutdown', 1, tags: tags)
+           end
+
+           # Worker related metrics
+           # @param event [Karafka::Core::Monitoring::Event]
+           def on_worker_process(event)
+             jq_stats = event[:jobs_queue].statistics
+
+             gauge('worker.total_threads', Karafka::App.config.concurrency, tags: default_tags)
+             histogram('worker.processing', jq_stats[:busy], tags: default_tags)
+             histogram('worker.enqueued_jobs', jq_stats[:enqueued], tags: default_tags)
+           end
+
+           # We report this metric before and after processing for higher accuracy
+           # Without this, the utilization would not be fully reflected
+           # @param event [Karafka::Core::Monitoring::Event]
+           def on_worker_processed(event)
+             jq_stats = event[:jobs_queue].statistics
+
+             histogram('worker.processing', jq_stats[:busy], tags: default_tags)
+           end
+
+           private
+
+           %i[
+             count
+             gauge
+             histogram
+             increment
+             decrement
+           ].each do |metric_type|
+             class_eval <<~METHODS, __FILE__, __LINE__ + 1
+               def #{metric_type}(key, *args)
+                 client.#{metric_type}(
+                   namespaced_metric(key),
+                   *args
+                 )
+               end
+             METHODS
+           end
+
+           # Wraps metric name in listener's namespace
+           # @param metric_name [String] RdKafkaMetric name
+           # @return [String]
+           def namespaced_metric(metric_name)
+             "#{namespace}.#{metric_name}"
+           end
+
+           # Reports a given metric statistics to Datadog
+           # @param metric [RdKafkaMetric] metric value object
+           # @param statistics [Hash] hash with all the statistics emitted
+           # @param base_tags [Array<String>] base tags we want to start with
+           def report_metric(metric, statistics, base_tags)
+             case metric.scope
+             when :root
+               public_send(
+                 metric.type,
+                 metric.name,
+                 statistics.fetch(*metric.key_location),
+                 tags: base_tags
+               )
+             when :brokers
+               statistics.fetch('brokers').each_value do |broker_statistics|
+                 # Skip bootstrap nodes
+                 # Bootstrap nodes have nodeid -1, other nodes have positive
+                 # node ids
+                 next if broker_statistics['nodeid'] == -1
+
+                 public_send(
+                   metric.type,
+                   metric.name,
+                   broker_statistics.dig(*metric.key_location),
+                   tags: base_tags + ["broker:#{broker_statistics['nodename']}"]
+                 )
+               end
+             when :topics
+               statistics.fetch('topics').each do |topic_name, topic_values|
+                 topic_values['partitions'].each do |partition_name, partition_statistics|
+                   next if partition_name == '-1'
+                   # Skip until lag info is available
+                   next if partition_statistics['consumer_lag'] == -1
+
+                   public_send(
+                     metric.type,
+                     metric.name,
+                     partition_statistics.dig(*metric.key_location),
+                     tags: base_tags + [
+                       "topic:#{topic_name}",
+                       "partition:#{partition_name}"
+                     ]
+                   )
+                 end
+               end
+             else
+               raise ArgumentError, metric.scope
+             end
+           end
+
+           # Builds basic per consumer tags for publication
+           #
+           # @param consumer [Karafka::BaseConsumer]
+           # @return [Array<String>]
+           def consumer_tags(consumer)
+             messages = consumer.messages
+             metadata = messages.metadata
+             consumer_group_id = consumer.topic.consumer_group.id
+
+             [
+               "topic:#{metadata.topic}",
+               "partition:#{metadata.partition}",
+               "consumer_group:#{consumer_group_id}"
+             ]
+           end
+         end
+       end
+     end
+   end
+ end
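A hedged setup sketch for the renamed listener, assuming the `dogstatsd-ruby` gem is installed and provides `Datadog::Statsd`:

```ruby
require 'socket'
require 'datadog/statsd'
require 'karafka/instrumentation/vendors/datadog/metrics_listener'

listener = ::Karafka::Instrumentation::Vendors::Datadog::MetricsListener.new do |config|
  config.client = Datadog::Statsd.new('localhost', 8125)
  config.default_tags = ["host:#{Socket.gethostname}"]
end

# Subscribes the listener to all events it defines `on_*` handlers for
Karafka.monitor.subscribe(listener)
```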
@@ -31,9 +31,11 @@ module Karafka
            break if revoked?
            break if Karafka::App.stopping?
 
-           ::ActiveJob::Base.execute(
-             ::ActiveSupport::JSON.decode(message.raw_payload)
-           )
+           job = ::ActiveSupport::JSON.decode(message.raw_payload)
+
+           tags.add(:job_class, job['job_class'])
+
+           ::ActiveJob::Base.execute(job)
 
            # We cannot mark jobs as done after each if there are virtual partitions. Otherwise
            # this could create random markings.
@@ -23,6 +23,7 @@ module Karafka
        # They can be updated by using `#karafka_options` on the job
        DEFAULTS = {
          dispatch_method: :produce_async,
+         dispatch_many_method: :produce_many_async,
          # We don't create a dummy proc based partitioner as we would have to evaluate it with
          # each job.
          partitioner: nil,
@@ -33,7 +34,7 @@ module Karafka
        private_constant :DEFAULTS
 
        # @param job [ActiveJob::Base] job
-       def call(job)
+       def dispatch(job)
          ::Karafka.producer.public_send(
            fetch_option(job, :dispatch_method, DEFAULTS),
            dispatch_details(job).merge!(
@@ -43,6 +44,28 @@ module Karafka
          )
        end
 
+       # Bulk dispatches multiple jobs using the Rails 7.1+ API
+       # @param jobs [Array<ActiveJob::Base>] jobs we want to dispatch
+       def dispatch_many(jobs)
+         dispatches = Hash.new { |hash, key| hash[key] = [] }
+
+         jobs.each do |job|
+           d_method = fetch_option(job, :dispatch_many_method, DEFAULTS)
+
+           dispatches[d_method] << dispatch_details(job).merge!(
+             topic: job.queue_name,
+             payload: ::ActiveSupport::JSON.encode(job.serialize)
+           )
+         end
+
+         dispatches.each do |type, messages|
+           ::Karafka.producer.public_send(
+             type,
+             messages
+           )
+         end
+       end
+
        private
 
        # @param job [ActiveJob::Base] job instance
@@ -25,9 +25,20 @@ module Karafka
          ).fetch('en').fetch('validations').fetch('job_options')
        end
 
-       optional(:dispatch_method) { |val| %i[produce_async produce_sync].include?(val) }
        optional(:partitioner) { |val| val.respond_to?(:call) }
        optional(:partition_key_type) { |val| %i[key partition_key].include?(val) }
+       optional(:dispatch_method) do |val|
+         %i[
+           produce_async
+           produce_sync
+         ].include?(val)
+       end
+       optional(:dispatch_many_method) do |val|
+         %i[
+           produce_many_async
+           produce_many_sync
+         ].include?(val)
+       end
      end
    end
  end
@@ -4,6 +4,9 @@ module Karafka
    # Class used to catch signals from the ruby Signal class in order to manage Karafka stop
    # @note There might be only one process - this class is a singleton
    class Process
+     # Allow for process tagging for instrumentation
+     extend ::Karafka::Core::Taggable
+
      # Signal types that we handle
      HANDLED_SIGNALS = %i[
        SIGINT
@@ -89,6 +89,11 @@ module Karafka
      # option [::WaterDrop::Producer, nil]
      # Unless configured, will be created once Karafka is configured based on user Karafka setup
      setting :producer, default: nil
+     # option [Boolean] when set to true, Karafka will ensure that the routing topic naming
+     # convention is strict
+     # Disabling this may be needed in scenarios where we do not have control over topic names
+     # and/or we work with existing systems where we cannot change topic names.
+     setting :strict_topics_namespacing, default: true
 
      # rdkafka default options
      # @see https://github.com/edenhill/librdkafka/blob/master/CONFIGURATION.md
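Disabling the validation referenced in the error messages above is a one-line setup change; a minimal sketch:

```ruby
class KarafkaApp < Karafka::App
  setup do |config|
    config.kafka = { 'bootstrap.servers': 'localhost:9092' }
    # Accept pre-existing topics that mix `.` and `_` namespacing styles
    config.strict_topics_namespacing = false
  end
end
```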
@@ -168,6 +173,9 @@ module Karafka
 
        configure_components
 
+       # Refreshes the cached references that might have been changed by the config
+       ::Karafka.refresh!
+
        # Runs things that need to be executed after config is defined and all the components
        # are also configured
        Pro::Loader.post_setup(config) if Karafka.pro?
@@ -3,5 +3,5 @@
  # Main module namespace
  module Karafka
    # Current Karafka version
-   VERSION = '2.0.32'
+   VERSION = '2.0.34'
  end
data/lib/karafka.rb CHANGED
@@ -95,6 +95,19 @@ module Karafka
      def boot_file
        Pathname.new(ENV['KARAFKA_BOOT_FILE'] || File.join(Karafka.root, 'karafka.rb'))
      end
+
+     # We need to be able to overwrite the monitor, logger and producer after the configuration
+     # in case they were changed, because those two (with defaults) can be used prior to the
+     # setup and their state change should be reflected in the updated setup
+     #
+     # This method refreshes the things that might have been altered by the configuration
+     def refresh!
+       config = ::Karafka::App.config
+
+       @logger = config.logger
+       @producer = config.producer
+       @monitor = config.monitor
+     end
    end
  end
 
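This `refresh!` is what fixes the prematurely cached monitor (#1314): `Karafka.monitor` could be memoized with the default before setup assigned a custom one. A behavior sketch under that assumption (the subclass is illustrative):

```ruby
# A do-nothing subclass, purely to make the change observable
class MyMonitor < Karafka::Instrumentation::Monitor
end

Karafka::App.setup do |config|
  config.monitor = MyMonitor.new
end

# Setup now ends with `Karafka.refresh!`, so the memoized reference points
# at the configured monitor instead of a stale default
Karafka.monitor.is_a?(MyMonitor) # => true
```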
data.tar.gz.sig CHANGED
Binary file
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: karafka
  version: !ruby/object:Gem::Version
-   version: 2.0.32
+   version: 2.0.34
  platform: ruby
  authors:
  - Maciej Mensfeld
@@ -35,7 +35,7 @@ cert_chain:
    Qf04B9ceLUaC4fPVEz10FyobjaFoY4i32xRto3XnrzeAgfEe4swLq8bQsR3w/EF3
    MGU0FeSV2Yj7Xc2x/7BzLK8xQn5l7Yy75iPF+KP3vVmDHnNl
    -----END CERTIFICATE-----
- date: 2023-02-14 00:00:00.000000000 Z
+ date: 2023-03-04 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: karafka-core
@@ -43,7 +43,7 @@ dependencies:
    requirements:
    - - ">="
      - !ruby/object:Gem::Version
-       version: 2.0.11
+       version: 2.0.12
    - - "<"
      - !ruby/object:Gem::Version
        version: 3.0.0
@@ -53,7 +53,7 @@ dependencies:
    requirements:
    - - ">="
      - !ruby/object:Gem::Version
-       version: 2.0.11
+       version: 2.0.12
    - - "<"
      - !ruby/object:Gem::Version
        version: 3.0.0
@@ -198,6 +198,7 @@ files:
  - lib/karafka/instrumentation/vendors/datadog/dashboard.json
  - lib/karafka/instrumentation/vendors/datadog/listener.rb
  - lib/karafka/instrumentation/vendors/datadog/logger_listener.rb
+ - lib/karafka/instrumentation/vendors/datadog/metrics_listener.rb
  - lib/karafka/licenser.rb
  - lib/karafka/messages/batch_metadata.rb
  - lib/karafka/messages/builders/batch_metadata.rb
metadata.gz.sig CHANGED
Binary file