karafka 2.2.11 → 2.2.13

This diff shows the changes between publicly released versions of the package, as published to one of the supported public registries. It is provided for informational purposes only.
Files changed (37)
  1. checksums.yaml +4 -4
  2. checksums.yaml.gz.sig +2 -4
  3. data/CHANGELOG.md +12 -0
  4. data/Gemfile.lock +13 -13
  5. data/config/locales/errors.yml +3 -1
  6. data/docker-compose.yml +1 -1
  7. data/karafka.gemspec +2 -2
  8. data/lib/karafka/connection/client.rb +77 -11
  9. data/lib/karafka/connection/consumer_group_coordinator.rb +3 -3
  10. data/lib/karafka/connection/listener.rb +30 -7
  11. data/lib/karafka/connection/listeners_batch.rb +6 -1
  12. data/lib/karafka/contracts/config.rb +5 -1
  13. data/lib/karafka/helpers/interval_runner.rb +39 -0
  14. data/lib/karafka/instrumentation/notifications.rb +1 -0
  15. data/lib/karafka/instrumentation/vendors/datadog/logger_listener.rb +1 -9
  16. data/lib/karafka/pro/loader.rb +2 -1
  17. data/lib/karafka/pro/processing/coordinator.rb +12 -6
  18. data/lib/karafka/pro/processing/jobs_queue.rb +109 -0
  19. data/lib/karafka/pro/processing/scheduler.rb +2 -3
  20. data/lib/karafka/pro/processing/strategies/default.rb +2 -0
  21. data/lib/karafka/pro/processing/strategies/lrj/default.rb +9 -0
  22. data/lib/karafka/pro/processing/strategies/vp/default.rb +8 -4
  23. data/lib/karafka/processing/coordinator.rb +13 -7
  24. data/lib/karafka/processing/inline_insights/consumer.rb +2 -0
  25. data/lib/karafka/processing/jobs_queue.rb +41 -13
  26. data/lib/karafka/processing/scheduler.rb +19 -3
  27. data/lib/karafka/processing/strategies/default.rb +2 -0
  28. data/lib/karafka/processing/timed_queue.rb +62 -0
  29. data/lib/karafka/routing/builder.rb +32 -17
  30. data/lib/karafka/routing/subscription_group.rb +11 -6
  31. data/lib/karafka/runner.rb +1 -1
  32. data/lib/karafka/setup/config.rb +13 -1
  33. data/lib/karafka/version.rb +1 -1
  34. data/lib/karafka.rb +0 -1
  35. data.tar.gz.sig +0 -0
  36. metadata +9 -6
  37. metadata.gz.sig +0 -0
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: '08439276b10ee121dabe7ef6496334d7aef3cf6a8a49fdc723e6333a147d57d7'
-  data.tar.gz: f1022962556a4e3397ec256bdd63e7ced077cadabddf72849473eef8ec190186
+  metadata.gz: 4056d72f0d37ac46c52597ebcfed87de031f9f250d57a64ec5c665d3423a3087
+  data.tar.gz: 95aeab42e351043873d548a5289e8355fe48fa7b7f27aaf1549a220c76eac9c1
 SHA512:
-  metadata.gz: 75f6a68aba0fa013bcdbcd618c9186f5b2e8870723aaef87bbfb8cb745c4a33862efac55c2a46938b7ad843f1f5e6640ebe381861c4365f459df8f115288cf2d
-  data.tar.gz: 6e495a3376f1c9650039534260c3d21ea697b77104ad2b9d7393b1ae8301cc29a116a6757efb9ead13716932ec7f6b188ab2404f3f0d86a68942f2c9972a5dc6
+  metadata.gz: 8e41da4dff00dc3cb9749874568a275cdad81b7a762182cee7ea497bfe373dd1b3f777dd40638d0c30ff13f50c5913cdcad175edcc8b9b36a3e26fb5658fc986
+  data.tar.gz: 738352dea20404d42a80340c2fc27359d54185565e8069f8245662e02d33c8630ce7922c3938b06b07e5587bd007342c65439229484ed529ae050e356872f150
checksums.yaml.gz.sig CHANGED
@@ -1,4 +1,2 @@
(binary signature contents changed; not representable as text)
data/CHANGELOG.md CHANGED
@@ -1,5 +1,17 @@
 # Karafka framework changelog

+## 2.2.13 (2023-11-17)
+- **[Feature]** Introduce low-level extended Scheduling API for granular control of schedulers and jobs execution [Pro].
+- [Improvement] Use a separate lock for user-facing synchronization.
+- [Improvement] Instrument `consumer.before_enqueue`.
+- [Improvement] Limit usage of `concurrent-ruby` (plan to remove it as a dependency fully).
+- [Improvement] Provide the same `#synchronize` API as in VPs for LRJs to allow for lifecycle events and consumption synchronization.
+
+## 2.2.12 (2023-11-09)
+- [Improvement] Rewrite the polling engine to update statistics and error callbacks despite longer non-LRJ processing or long `max_wait_time` setups. This change provides stability to the statistics and background error emitting, making them time-reliable.
+- [Improvement] Auto-update Inline Insights if new insights are present, for all consumers and not only LRJ (OSS and Pro).
+- [Improvement] Alias `#insights` with `#inline_insights` and `#insights?` with `#inline_insights?`.
+
 ## 2.2.11 (2023-11-03)
 - [Improvement] Allow marking as consumed in the user `#synchronize` block.
 - [Improvement] Make whole Pro VP marking as consumed concurrency safe for both async and sync scenarios.
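
The user-facing `#synchronize` mentioned in the 2.2.13 notes is, per this release, backed by a lock separate from Karafka's internal flow lock. A minimal usage sketch (the consumer class and workload are illustrative, not part of this diff):

```ruby
class EventsConsumer < Karafka::BaseConsumer
  def consume
    messages.each do |message|
      # ... per-message work that may run in parallel jobs ...

      # User-facing critical section; as of 2.2.13 it uses a dedicated
      # shared mutex, so it cannot block Karafka's internal coordination
      synchronize do
        mark_as_consumed(message)
      end
    end
  end
end
```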
data/Gemfile.lock CHANGED
@@ -1,18 +1,18 @@
 PATH
   remote: .
   specs:
-    karafka (2.2.11)
-      karafka-core (>= 2.2.6, < 2.3.0)
-      waterdrop (>= 2.6.10, < 3.0.0)
+    karafka (2.2.13)
+      karafka-core (>= 2.2.7, < 2.3.0)
+      waterdrop (>= 2.6.11, < 3.0.0)
       zeitwerk (~> 2.3)

 GEM
   remote: https://rubygems.org/
   specs:
-    activejob (7.1.1)
-      activesupport (= 7.1.1)
+    activejob (7.1.2)
+      activesupport (= 7.1.2)
       globalid (>= 0.3.6)
-    activesupport (7.1.1)
+    activesupport (7.1.2)
       base64
       bigdecimal
       concurrent-ruby (~> 1.0, >= 1.0.2)
@@ -22,14 +22,14 @@ GEM
       minitest (>= 5.1)
       mutex_m
       tzinfo (~> 2.0)
-    base64 (0.1.1)
+    base64 (0.2.0)
     bigdecimal (3.1.4)
     byebug (11.1.3)
     concurrent-ruby (1.2.2)
     connection_pool (2.4.1)
     diff-lcs (1.5.0)
     docile (1.4.0)
-    drb (2.1.1)
+    drb (2.2.0)
       ruby2_keywords
     erubi (1.12.0)
     factory_bot (6.3.0)
@@ -39,10 +39,10 @@ GEM
       activesupport (>= 6.1)
     i18n (1.14.1)
       concurrent-ruby (~> 1.0)
-    karafka-core (2.2.6)
+    karafka-core (2.2.7)
       concurrent-ruby (>= 1.1)
-      karafka-rdkafka (>= 0.13.8, < 0.15.0)
-    karafka-rdkafka (0.13.8)
+      karafka-rdkafka (>= 0.13.9, < 0.15.0)
+    karafka-rdkafka (0.14.0)
       ffi (~> 1.15)
       mini_portile2 (~> 2.6)
       rake (> 12)
@@ -54,10 +54,10 @@ GEM
       tilt (~> 2.0)
     mini_portile2 (2.8.5)
     minitest (5.20.0)
-    mutex_m (0.1.2)
+    mutex_m (0.2.0)
     rack (3.0.8)
     rake (13.1.0)
-    roda (3.73.0)
+    roda (3.74.0)
       rack
     rspec (3.12.0)
       rspec-core (~> 3.12.0)
data/config/locales/errors.yml CHANGED
@@ -16,7 +16,8 @@ en:
     max_wait_time_format: needs to be an integer bigger than 0
     kafka_format: needs to be a filled hash
     internal.processing.jobs_builder_format: cannot be nil
-    internal.processing.scheduler_format: cannot be nil
+    internal.processing.jobs_queue_class_format: cannot be nil
+    internal.processing.scheduler_class_format: cannot be nil
     internal.processing.coordinator_class_format: cannot be nil
     internal.processing.partitioner_class_format: cannot be nil
     internal.processing.strategy_selector_format: cannot be nil
@@ -26,6 +27,7 @@ en:
     internal.active_job.consumer_class: cannot be nil
     internal.status_format: needs to be present
     internal.process_format: needs to be present
+    internal.tick_interval_format: needs to be an integer bigger or equal to 1000
     internal.routing.builder_format: needs to be present
     internal.routing.subscription_groups_builder_format: needs to be present
     internal.connection.proxy.query_watermark_offsets.timeout_format: needs to be an integer bigger than 0
data/docker-compose.yml CHANGED
@@ -3,7 +3,7 @@ version: '2'
 services:
   kafka:
     container_name: kafka
-    image: confluentinc/cp-kafka:7.5.1
+    image: confluentinc/cp-kafka:7.5.2

     ports:
       - 9092:9092
data/karafka.gemspec CHANGED
@@ -21,8 +21,8 @@ Gem::Specification.new do |spec|
     without having to focus on things that are not your business domain.
   DESC

-  spec.add_dependency 'karafka-core', '>= 2.2.6', '< 2.3.0'
-  spec.add_dependency 'waterdrop', '>= 2.6.10', '< 3.0.0'
+  spec.add_dependency 'karafka-core', '>= 2.2.7', '< 2.3.0'
+  spec.add_dependency 'waterdrop', '>= 2.6.11', '< 3.0.0'
   spec.add_dependency 'zeitwerk', '~> 2.3'

   if $PROGRAM_NAME.end_with?('gem')
data/lib/karafka/connection/client.rb CHANGED
@@ -43,11 +43,13 @@ module Karafka
       @closed = false
       @subscription_group = subscription_group
       @buffer = RawMessagesBuffer.new
+      @tick_interval = ::Karafka::App.config.internal.tick_interval
       @rebalance_manager = RebalanceManager.new(@subscription_group.id)
       @rebalance_callback = Instrumentation::Callbacks::Rebalance.new(
         @subscription_group.id,
         @subscription_group.consumer_group.id
       )
+      @events_poller = Helpers::IntervalRunner.new { events_poll }
       @kafka = build_consumer
       # There are few operations that can happen in parallel from the listener threads as well
       # as from the workers. They are not fully thread-safe because they may be composed out of
@@ -64,6 +66,8 @@ module Karafka

     # Fetches messages within boundaries defined by the settings (time, size, topics, etc).
     #
+    # Also periodically runs the events polling to trigger events callbacks.
+    #
     # @return [Karafka::Connection::MessagesBuffer] messages buffer that holds messages per topic
     #   partition
     # @note This method should not be executed from many threads at the same time
@@ -73,38 +77,46 @@ module Karafka
       @buffer.clear
       @rebalance_manager.clear

+      events_poll
+
       loop do
         time_poll.start

         # Don't fetch more messages if we do not have any time left
         break if time_poll.exceeded?
-        # Don't fetch more messages if we've fetched max as we've wanted
+        # Don't fetch more messages if we've fetched max that we've wanted
         break if @buffer.size >= @subscription_group.max_messages

         # Fetch message within our time boundaries
-        message = poll(time_poll.remaining)
+        response = poll(time_poll.remaining)

         # Put a message to the buffer if there is one
-        @buffer << message if message
+        @buffer << response if response && response != :tick_time

         # Upon polling rebalance manager might have been updated.
         # If partition revocation happens, we need to remove messages from revoked partitions
         # as well as ensure we do not have duplicated due to the offset reset for partitions
         # that we got assigned
+        #
         # We also do early break, so the information about rebalance is used as soon as possible
         if @rebalance_manager.changed?
+          # Since rebalances do not occur often, we can run events polling as well without
+          # any throttling
+          events_poll
           remove_revoked_and_duplicated_messages
           break
         end

+        @events_poller.call
+
         # Track time spent on all of the processing and polling
         time_poll.checkpoint

         # Finally once we've (potentially) removed revoked, etc, if no messages were returned
-        # we can break.
+        # and it was not an early poll exit, we can break.
         # Worth keeping in mind, that the rebalance manager might have been updated despite no
         # messages being returned during a poll
-        break unless message
+        break unless response
       end

       @buffer
@@ -299,22 +311,38 @@ module Karafka
     def reset
       close

+      @events_poller.reset
       @closed = false
       @paused_tpls.clear
       @kafka = build_consumer
     end

-    # Runs a single poll ignoring all the potential errors
+    # Runs a single poll on the main queue and consumer queue ignoring all the potential errors
     # This is used as a keep-alive in the shutdown stage and any errors that happen here are
     # irrelevant from the shutdown process perspective
     #
-    # This is used only to trigger rebalance callbacks
+    # This is used only to trigger rebalance callbacks and other callbacks
     def ping
+      events_poll(100)
       poll(100)
     rescue Rdkafka::RdkafkaError
       nil
     end

+    # Triggers the rdkafka main queue events by consuming this queue. This is not the consumer
+    # consumption queue but the one with:
+    #   - error callbacks
+    #   - stats callbacks
+    #   - OAUTHBEARER token refresh callbacks
+    #
+    # @param timeout [Integer] number of milliseconds to wait on events or 0 not to wait.
+    #
+    # @note It is non-blocking when timeout 0 and will not wait if queue empty. It costs up to
+    #   2ms when no callbacks are triggered.
+    def events_poll(timeout = 0)
+      @kafka.events_poll(timeout)
+    end
+
     private

     # When we cannot store an offset, it means we no longer own the partition
@@ -464,18 +492,52 @@ module Karafka
       @kafka.position(tpl).to_h.fetch(topic).first.offset || -1
     end

-    # Performs a single poll operation and handles retries and error
+    # Performs a single poll operation and handles retries and errors
+    #
+    # Keep in mind, that this timeout will be limited by a tick interval value, because we cannot
+    # block on a single poll longer than that. Otherwise our events polling would not be able to
+    # run frequently enough. This means, that even if you provide big value, it will not block
+    # for that long. This is anyhow compensated by the `#batch_poll` that can run for extended
+    # period of time but will run events polling frequently while waiting for the requested total
+    # time.
     #
-    # @param timeout [Integer] timeout for a single poll
-    # @return [Rdkafka::Consumer::Message, nil] fetched message or nil if nothing polled
+    # @param timeout [Integer] timeout for a single poll.
+    # @return [Rdkafka::Consumer::Message, nil, Symbol] fetched message, nil if nothing polled
+    #   within the time we had or symbol indicating the early return reason
     def poll(timeout)
       time_poll ||= TimeTrackers::Poll.new(timeout)

       return nil if time_poll.exceeded?

       time_poll.start
+      remaining = time_poll.remaining
+
+      # We should not run a single poll longer than the tick frequency. Otherwise during a single
+      # `#batch_poll` we would not be able to run `#events_poll` often enough, effectively
+      # blocking events from being handled.
+      poll_tick = timeout > @tick_interval ? @tick_interval : timeout
+
+      result = @kafka.poll(poll_tick)
+
+      # If we've got a message, we can return it
+      return result if result
+
+      time_poll.checkpoint
+
+      # We need to check if we have used all the allocated time as depending on the outcome, the
+      # batch loop behavior will differ. Using all time means, that we had nothing to do as no
+      # messages were present but if we did not exceed total time, it means we can still try
+      # polling again as we are within user expected max wait time
+      used = remaining - time_poll.remaining
+
+      # In case we did not use enough time, it means that an internal event occurred that means
+      # that something has changed without messages being published. For example a rebalance.
+      # In cases like this we finish early as well
+      return nil if used < poll_tick

-      @kafka.poll(timeout)
+      # If we did not exceed total time allocated, it means that we finished because of the
+      # tick interval time limitations and not because time ran out without any data
+      time_poll.exceeded? ? nil : :tick_time
     rescue ::Rdkafka::RdkafkaError => e
       early_report = false

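To make the clamping above concrete, a small worked example (the 5_000 ms tick value is assumed for illustration; the contract below only enforces a minimum of 1_000 ms):

```ruby
# Worked example of the poll_tick clamping introduced above (values illustrative)
tick_interval = 5_000    # internal.tick_interval
timeout       = 10_000   # remaining user max_wait_time for this poll

poll_tick = timeout > tick_interval ? tick_interval : timeout
# => 5_000: one blocking rdkafka poll never exceeds the tick interval,
#    so #batch_poll can interleave #events_poll at least every 5 seconds
#    while still honoring the overall 10s wait via repeated polls
```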
@@ -535,6 +597,10 @@ module Karafka
       ::Rdkafka::Config.logger = ::Karafka::App.config.logger
       config = ::Rdkafka::Config.new(@subscription_group.kafka)
       config.consumer_rebalance_listener = @rebalance_callback
+      # We want to manage the events queue independently from the messages queue. Thanks to that
+      # we can ensure, that we get statistics and errors often enough even when not polling
+      # new messages. This allows us to report statistics while data is still being processed
+      config.consumer_poll_set = false

       consumer = config.consumer
       @name = consumer.name
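
The net effect of `consumer_poll_set = false` plus `#events_poll` is that statistics and error callbacks no longer depend on the message polling cadence. A condensed sketch of the pattern outside of Karafka (assuming the karafka-rdkafka fork's `consumer_poll_set` and `events_poll` APIs shown in this diff; `kafka_settings` and `process` are placeholders):

```ruby
require 'rdkafka'

config = Rdkafka::Config.new(kafka_settings)
# Keep the main (events) queue separate from the consumer (messages) queue
config.consumer_poll_set = false

consumer = config.consumer
consumer.subscribe('example_topic')

loop do
  message = consumer.poll(250) # messages only, short blocking poll
  consumer.events_poll(0)      # stats / error / OAUTHBEARER callbacks, non-blocking
  process(message) if message
end
```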
data/lib/karafka/connection/consumer_group_coordinator.rb CHANGED
@@ -16,7 +16,7 @@ module Karafka
   class ConsumerGroupCoordinator
     # @param group_size [Integer] number of separate subscription groups in a consumer group
     def initialize(group_size)
-      @shutdown_lock = Mutex.new
+      @shutdown_mutex = Mutex.new
       @group_size = group_size
       @finished = Set.new
     end
@@ -30,12 +30,12 @@ module Karafka
     # @return [Boolean] can we start shutdown on a given listener
     # @note If true, will also obtain a lock so no-one else will be closing the same time we do
     def shutdown?
-      finished? && @shutdown_lock.try_lock
+      finished? && @shutdown_mutex.try_lock
     end

     # Unlocks the shutdown lock
     def unlock
-      @shutdown_lock.unlock if @shutdown_lock.owned?
+      @shutdown_mutex.unlock if @shutdown_mutex.owned?
     end

     # Marks given listener as finished
data/lib/karafka/connection/listener.rb CHANGED
@@ -14,11 +14,18 @@ module Karafka
     # @return [String] id of this listener
     attr_reader :id

+    # How long to wait in the initial events poll. Increases chances of having the initial events
+    # immediately available
+    INITIAL_EVENTS_POLL_TIMEOUT = 100
+
+    private_constant :INITIAL_EVENTS_POLL_TIMEOUT
+
     # @param consumer_group_coordinator [Karafka::Connection::ConsumerGroupCoordinator]
     # @param subscription_group [Karafka::Routing::SubscriptionGroup]
     # @param jobs_queue [Karafka::Processing::JobsQueue] queue where we should push work
+    # @param scheduler [Karafka::Processing::Scheduler] scheduler we want to use
     # @return [Karafka::Connection::Listener] listener instance
-    def initialize(consumer_group_coordinator, subscription_group, jobs_queue)
+    def initialize(consumer_group_coordinator, subscription_group, jobs_queue, scheduler)
       proc_config = ::Karafka::App.config.internal.processing

       @id = SecureRandom.hex(6)
@@ -30,8 +37,8 @@ module Karafka
       @executors = Processing::ExecutorsBuffer.new(@client, subscription_group)
       @jobs_builder = proc_config.jobs_builder
       @partitioner = proc_config.partitioner_class.new(subscription_group)
-      # We reference scheduler here as it is much faster than fetching this each time
-      @scheduler = proc_config.scheduler
+      @scheduler = scheduler
+      @events_poller = Helpers::IntervalRunner.new { @client.events_poll }
       # We keep one buffer for messages to preserve memory and not allocate extra objects
       # We can do this that way because we always first schedule jobs using messages before we
       # fetch another batch.
@@ -84,6 +91,15 @@ module Karafka
     # Kafka connections / Internet connection issues / Etc. Business logic problems should not
     # propagate this far.
     def fetch_loop
+      # Run the initial events fetch to improve chances of having metrics and initial callbacks
+      # triggers on start.
+      #
+      # In theory this may slow down the initial boot but we limit it up to 100ms, so it should
+      # not have a big initial impact. It may not be enough but Karafka does not give the boot
+      # warranties of statistics or other callbacks being immediately available, hence this is
+      # a fair trade-off
+      @client.events_poll(INITIAL_EVENTS_POLL_TIMEOUT)
+
       # Run the main loop as long as we are not stopping or moving into quiet mode
       until Karafka::App.done?
         Karafka.monitor.instrument(
@@ -227,7 +243,7 @@ module Karafka
         end
       end

-      @scheduler.schedule_revocation(@jobs_queue, jobs)
+      @scheduler.schedule_revocation(jobs)
     end

     # Enqueues the shutdown jobs for all the executors that exist in our subscription group
@@ -240,7 +256,7 @@ module Karafka
         jobs << job
       end

-      @scheduler.schedule_shutdown(@jobs_queue, jobs)
+      @scheduler.schedule_shutdown(jobs)
     end

     # Polls messages within the time and amount boundaries defined in the settings and then
@@ -282,12 +298,15 @@ module Karafka

       jobs.each(&:before_enqueue)

-      @scheduler.schedule_consumption(@jobs_queue, jobs)
+      @scheduler.schedule_consumption(jobs)
     end

     # Waits for all the jobs from a given subscription group to finish before moving forward
     def wait
-      @jobs_queue.wait(@subscription_group.id)
+      @jobs_queue.wait(@subscription_group.id) do
+        @events_poller.call
+        @scheduler.manage
+      end
     end

     # Waits without blocking the polling
@@ -303,6 +322,8 @@ module Karafka
     def wait_pinging(wait_until:, after_ping: -> {})
       until wait_until.call
         @client.ping
+        @scheduler.manage
+
         after_ping.call
         sleep(0.2)
       end
@@ -318,6 +339,8 @@ module Karafka
       # resetting.
       @jobs_queue.wait(@subscription_group.id)
       @jobs_queue.clear(@subscription_group.id)
+      @scheduler.clear(@subscription_group.id)
+      @events_poller.reset
       @client.reset
       @coordinators.reset
       @executors = Processing::ExecutorsBuffer.new(@client, @subscription_group)
data/lib/karafka/connection/listeners_batch.rb CHANGED
@@ -11,6 +11,10 @@ module Karafka
     # @param jobs_queue [JobsQueue]
     # @return [ListenersBatch]
     def initialize(jobs_queue)
+      # We need one scheduler for all the listeners because in case of complex schedulers, they
+      # should be able to distribute work whenever any work is done in any of the listeners
+      scheduler = App.config.internal.processing.scheduler_class.new(jobs_queue)
+
       @coordinators = []

       @batch = App.subscription_groups.flat_map do |_consumer_group, subscription_groups|
@@ -24,7 +28,8 @@ module Karafka
         Connection::Listener.new(
           consumer_group_coordinator,
           subscription_group,
-          jobs_queue,
+          jobs_queue,
+          scheduler
         )
       end
     end
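
Because the scheduler is now a class resolved from config and shared across all listeners, it can in principle be swapped out. A hypothetical sketch (the `schedule_consumption(jobs_array)` signature is taken from this diff; that the base class stores the jobs queue passed to `#initialize` in `@queue`, and that consumption jobs expose `#messages`, are assumptions about this version):

```ruby
# Hypothetical custom scheduler: enqueue jobs with the largest message
# batches first instead of the default FIFO ordering
class LargestBatchFirstScheduler < Karafka::Processing::Scheduler
  # @param jobs_array [Array] consumption jobs to be scheduled
  def schedule_consumption(jobs_array)
    jobs_array.sort_by { |job| -job.messages.size }.each do |job|
      @queue << job
    end
  end
end

Karafka::App.setup do |config|
  config.internal.processing.scheduler_class = LargestBatchFirstScheduler
end
```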
data/lib/karafka/contracts/config.rb CHANGED
@@ -46,6 +46,9 @@ module Karafka
       nested(:internal) do
         required(:status) { |val| !val.nil? }
         required(:process) { |val| !val.nil? }
+        # In theory this could be less than a second, however this would impact the maximum time
+        # of a single consumer queue poll, hence we prevent it
+        required(:tick_interval) { |val| val.is_a?(Integer) && val >= 1_000 }

         nested(:connection) do
           nested(:proxy) do
@@ -70,7 +73,8 @@ module Karafka

         nested(:processing) do
           required(:jobs_builder) { |val| !val.nil? }
-          required(:scheduler) { |val| !val.nil? }
+          required(:jobs_queue_class) { |val| !val.nil? }
+          required(:scheduler_class) { |val| !val.nil? }
           required(:coordinator_class) { |val| !val.nil? }
           required(:partitioner_class) { |val| !val.nil? }
           required(:strategy_selector) { |val| !val.nil? }
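
The new `tick_interval` can be tuned within the contract's bounds. A sketch (note that `internal.*` settings are not part of Karafka's public configuration API, so treat this as illustrative):

```ruby
Karafka::App.setup do |config|
  config.client_id = 'my_app'
  # Must be an Integer >= 1_000 (milliseconds), per the contract above;
  # lower values would also shorten the maximum single consumer-queue poll
  config.internal.tick_interval = 2_500
end
```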
data/lib/karafka/helpers/interval_runner.rb ADDED
@@ -0,0 +1,39 @@
+# frozen_string_literal: true
+
+module Karafka
+  module Helpers
+    # Object responsible for running given code with a given interval. It won't run given code
+    # more often than with a given interval.
+    #
+    # This allows us to execute certain code only once in a while.
+    #
+    # This can be used when we have code that could be invoked often due to it being in loops
+    # or other places but would only slow things down if it ran with each tick.
+    class IntervalRunner
+      include Karafka::Core::Helpers::Time
+
+      # @param interval [Integer] interval in ms for running the provided code. Defaults to the
+      #   `internal.tick_interval` value
+      # @param block [Proc] block of code we want to run once in a while
+      def initialize(interval: ::Karafka::App.config.internal.tick_interval, &block)
+        @block = block
+        @interval = interval
+        @last_called_at = monotonic_now - @interval
+      end
+
+      # Runs the requested code if it was not executed recently
+      def call
+        return if monotonic_now - @last_called_at < @interval
+
+        @last_called_at = monotonic_now
+
+        @block.call
+      end
+
+      # Resets the runner, so the next `#call` will run the underlying code
+      def reset
+        @last_called_at = monotonic_now - @interval
+      end
+    end
+  end
+end
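
Usage of the helper is straightforward; a minimal sketch based directly on the class above:

```ruby
# Run the block at most once per second, no matter how often #call happens
poller = Karafka::Helpers::IntervalRunner.new(interval: 1_000) { puts 'tick' }

10.times do
  poller.call   # executes immediately on the first call, then throttles
  sleep(0.25)
end

# After #reset, the very next #call runs the block again right away
poller.reset
poller.call
```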
data/lib/karafka/instrumentation/notifications.rb CHANGED
@@ -43,6 +43,7 @@ module Karafka
         rebalance.partitions_revoke
         rebalance.partitions_revoked

+        consumer.before_enqueue
         consumer.consume
         consumer.consumed
         consumer.consuming.pause
data/lib/karafka/instrumentation/vendors/datadog/logger_listener.rb CHANGED
@@ -137,15 +137,7 @@ module Karafka
       def push_tags
         return unless Karafka.logger.respond_to?(:push_tags)

-        # Older versions of ddtrace do not have the `#log_correlation` method, so we fallback
-        # to the older method for tags
-        tags = if client.respond_to?(:log_correlation)
-          client.log_correlation
-        else
-          client.active_correlation.to_s
-        end
-
-        Karafka.logger.push_tags(tags)
+        Karafka.logger.push_tags(client.log_correlation)
       end

       # Pops datadog's tags from the logger
data/lib/karafka/pro/loader.rb CHANGED
@@ -84,7 +84,8 @@ module Karafka

       icfg.processing.coordinator_class = Processing::Coordinator
       icfg.processing.partitioner_class = Processing::Partitioner
-      icfg.processing.scheduler = Processing::Scheduler.new
+      icfg.processing.scheduler_class = Processing::Scheduler
+      icfg.processing.jobs_queue_class = Processing::JobsQueue
       icfg.processing.jobs_builder = Processing::JobsBuilder.new
       icfg.processing.strategy_selector = Processing::StrategySelector.new

data/lib/karafka/pro/processing/coordinator.rb CHANGED
@@ -21,14 +21,20 @@ module Karafka

         def_delegators :@collapser, :collapsed?, :collapse_until!

-        attr_reader :filter, :virtual_offset_manager
+        attr_reader :filter, :virtual_offset_manager, :shared_mutex

         # @param args [Object] anything the base coordinator accepts
         def initialize(*args)
           super

           @executed = []
-          @flow_lock = Mutex.new
+          @flow_mutex = Mutex.new
+          # Lock for user code synchronization
+          # We do not want to mix the coordinator lock with the user lock, not to create cases
+          # where a user-imposed lock would lock the internal operations of Karafka
+          # This shared lock can be used by the end user as it is not used internally by the
+          # framework and can be used for user-facing locking
+          @shared_mutex = Mutex.new
           @collapser = Collapser.new
           @filter = FiltersApplier.new(self)

@@ -89,7 +95,7 @@ module Karafka
         # Runs synchronized code once for a collective of virtual partitions prior to work being
         # enqueued
         def on_enqueued
-          @flow_lock.synchronize do
+          @flow_mutex.synchronize do
             return unless executable?(:on_enqueued)

             yield(@last_message)
@@ -98,7 +104,7 @@ module Karafka

         # Runs given code only once per all the coordinated jobs upon starting first of them
         def on_started
-          @flow_lock.synchronize do
+          @flow_mutex.synchronize do
             return unless executable?(:on_started)

             yield(@last_message)
@@ -109,7 +115,7 @@ module Karafka
         # It runs once per all the coordinated jobs and should be used to run any type of post
         # jobs coordination processing execution
         def on_finished
-          @flow_lock.synchronize do
+          @flow_mutex.synchronize do
             return unless finished?
             return unless executable?(:on_finished)

@@ -119,7 +125,7 @@ module Karafka

         # Runs once after a partition is revoked
         def on_revoked
-          @flow_lock.synchronize do
+          @flow_mutex.synchronize do
             return unless executable?(:on_revoked)

             yield(@last_message)
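
The `shared_mutex` exposed above is what the consumer-facing `#synchronize` from the changelog plausibly builds on. A simplified sketch of that delegation (the strategy code itself is not part of this diff, so the exact wiring is an assumption):

```ruby
# Assumed wiring (not shown in this diff): the consumer strategy exposes
# #synchronize by delegating to the coordinator's user-facing lock, keeping
# user code fully separate from @flow_mutex used for internal flow control
def synchronize(&block)
  coordinator.shared_mutex.synchronize(&block)
end
```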