karafka 2.2.13 → 2.2.14

Files changed (46)
  1. checksums.yaml +4 -4
  2. checksums.yaml.gz.sig +0 -0
  3. data/CHANGELOG.md +138 -125
  4. data/Gemfile.lock +3 -3
  5. data/docker-compose.yml +2 -0
  6. data/lib/karafka/admin.rb +109 -3
  7. data/lib/karafka/app.rb +7 -0
  8. data/lib/karafka/base_consumer.rb +23 -30
  9. data/lib/karafka/connection/client.rb +13 -10
  10. data/lib/karafka/connection/listener.rb +11 -9
  11. data/lib/karafka/instrumentation/assignments_tracker.rb +96 -0
  12. data/lib/karafka/instrumentation/callbacks/rebalance.rb +10 -7
  13. data/lib/karafka/instrumentation/logger_listener.rb +0 -9
  14. data/lib/karafka/instrumentation/notifications.rb +6 -4
  15. data/lib/karafka/instrumentation/vendors/datadog/logger_listener.rb +2 -2
  16. data/lib/karafka/pro/instrumentation/performance_tracker.rb +85 -0
  17. data/lib/karafka/pro/loader.rb +2 -2
  18. data/lib/karafka/pro/processing/schedulers/base.rb +127 -0
  19. data/lib/karafka/pro/processing/schedulers/default.rb +109 -0
  20. data/lib/karafka/pro/processing/strategies/aj/lrj_mom_vp.rb +1 -1
  21. data/lib/karafka/pro/processing/strategies/default.rb +2 -2
  22. data/lib/karafka/pro/processing/strategies/lrj/default.rb +1 -1
  23. data/lib/karafka/pro/processing/strategies/lrj/mom.rb +1 -1
  24. data/lib/karafka/pro/processing/strategies/vp/default.rb +1 -1
  25. data/lib/karafka/processing/executor.rb +27 -3
  26. data/lib/karafka/processing/executors_buffer.rb +3 -3
  27. data/lib/karafka/processing/jobs/base.rb +19 -2
  28. data/lib/karafka/processing/jobs/consume.rb +3 -3
  29. data/lib/karafka/processing/jobs/idle.rb +5 -0
  30. data/lib/karafka/processing/jobs/revoked.rb +5 -0
  31. data/lib/karafka/processing/jobs/shutdown.rb +5 -0
  32. data/lib/karafka/processing/jobs_queue.rb +19 -8
  33. data/lib/karafka/processing/schedulers/default.rb +41 -0
  34. data/lib/karafka/processing/strategies/base.rb +13 -4
  35. data/lib/karafka/processing/strategies/default.rb +17 -7
  36. data/lib/karafka/processing/worker.rb +4 -1
  37. data/lib/karafka/routing/proxy.rb +4 -3
  38. data/lib/karafka/routing/topics.rb +1 -1
  39. data/lib/karafka/setup/config.rb +4 -1
  40. data/lib/karafka/version.rb +1 -1
  41. data.tar.gz.sig +0 -0
  42. metadata +7 -5
  43. metadata.gz.sig +0 -0
  44. data/lib/karafka/pro/performance_tracker.rb +0 -84
  45. data/lib/karafka/pro/processing/scheduler.rb +0 -74
  46. data/lib/karafka/processing/scheduler.rb +0 -38
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 4056d72f0d37ac46c52597ebcfed87de031f9f250d57a64ec5c665d3423a3087
- data.tar.gz: 95aeab42e351043873d548a5289e8355fe48fa7b7f27aaf1549a220c76eac9c1
+ metadata.gz: 69d8242fa695121f63b2e582d7a0b97f090d58f82047513c450f3b21107703b3
+ data.tar.gz: a9fb3db88cc6fbb3a25db24e95b8010a0b01f7ab09fc2f54d201e311581db9a5
  SHA512:
- metadata.gz: 8e41da4dff00dc3cb9749874568a275cdad81b7a762182cee7ea497bfe373dd1b3f777dd40638d0c30ff13f50c5913cdcad175edcc8b9b36a3e26fb5658fc986
- data.tar.gz: 738352dea20404d42a80340c2fc27359d54185565e8069f8245662e02d33c8630ce7922c3938b06b07e5587bd007342c65439229484ed529ae050e356872f150
+ metadata.gz: 83e22a8317f10328c11f3f4ac4c90109ecebb7f1ca0b089da2875c4e0700b58338adfb6b7c70e30df6fedecb26e2aaa4a11df347cc0bd898781adf709ad7a87c
+ data.tar.gz: 15eb23000600be7d2f2c49316ae8d3355ddef4ab2d9f75585a5b63ea0f8b27a87f473d8fe5fcf926a5815f319413541f6822fa15660fc7962bb5baf31771f00a
checksums.yaml.gz.sig CHANGED
Binary file
data/CHANGELOG.md CHANGED
@@ -1,26 +1,39 @@
  # Karafka framework changelog

+ ## 2.2.14 (2023-12-07)
+ - **[Feature]** Provide `Karafka::Admin#delete_consumer_group` and `Karafka::Admin#seek_consumer_group`.
+ - **[Feature]** Provide `Karafka::App.assignments` that will return real-time assignments tracking.
+ - [Enhancement] Make sure that the Scheduling API is thread-safe by default and allow for lock-less schedulers when schedulers are stateless.
+ - [Enhancement] "Blockless" topics with defaults
+ - [Enhancement] Provide a `finished?` method to the jobs for advanced reference based job schedulers.
+ - [Enhancement] Provide `client.reset` notification event.
+ - [Enhancement] Remove all usage of concurrent-ruby from Karafka
+ - [Change] Replace single #before_schedule with appropriate methods and events for scheduling various types of work. This is needed as we may run different framework logic on those and, second, for accurate job tracking with advanced schedulers.
+ - [Change] Rename `before_enqueue` to `before_schedule` to reflect what it does and when (internal).
+ - [Change] Remove not needed error catchers for strategies code. This code if errors, should be considered critical and should not be silenced.
+ - [Change] Remove not used notifications events.
+
  ## 2.2.13 (2023-11-17)
  - **[Feature]** Introduce low-level extended Scheduling API for granular control of schedulers and jobs execution [Pro].
- - [Improvement] Use separate lock for user-facing synchronization.
- - [Improvement] Instrument `consumer.before_enqueue`.
- - [Improvement] Limit usage of `concurrent-ruby` (plan to remove it as a dependency fully)
- - [Improvement] Provide `#synchronize` API same as in VPs for LRJs to allow for lifecycle events and consumption synchronization.
+ - [Enhancement] Use separate lock for user-facing synchronization.
+ - [Enhancement] Instrument `consumer.before_enqueue`.
+ - [Enhancement] Limit usage of `concurrent-ruby` (plan to remove it as a dependency fully)
+ - [Enhancement] Provide `#synchronize` API same as in VPs for LRJs to allow for lifecycle events and consumption synchronization.

  ## 2.2.12 (2023-11-09)
- - [Improvement] Rewrite the polling engine to update statistics and error callbacks despite longer non LRJ processing or long `max_wait_time` setups. This change provides stability to the statistics and background error emitting making them time-reliable.
- - [Improvement] Auto-update Inline Insights if new insights are present for all consumers and not only LRJ (OSS and Pro).
- - [Improvement] Alias `#insights` with `#inline_insights` and `#insights?` with `#inline_insights?`
+ - [Enhancement] Rewrite the polling engine to update statistics and error callbacks despite longer non LRJ processing or long `max_wait_time` setups. This change provides stability to the statistics and background error emitting making them time-reliable.
+ - [Enhancement] Auto-update Inline Insights if new insights are present for all consumers and not only LRJ (OSS and Pro).
+ - [Enhancement] Alias `#insights` with `#inline_insights` and `#insights?` with `#inline_insights?`

  ## 2.2.11 (2023-11-03)
- - [Improvement] Allow marking as consumed in the user `#synchronize` block.
- - [Improvement] Make whole Pro VP marking as consumed concurrency safe for both async and sync scenarios.
- - [Improvement] Provide new alias to `karafka server`, that is: `karafka consumer`.
+ - [Enhancement] Allow marking as consumed in the user `#synchronize` block.
+ - [Enhancement] Make whole Pro VP marking as consumed concurrency safe for both async and sync scenarios.
+ - [Enhancement] Provide new alias to `karafka server`, that is: `karafka consumer`.

  ## 2.2.10 (2023-11-02)
- - [Improvement] Allow for running `#pause` without specifying the offset (provide offset or `:consecutive`). This allows for pausing on the consecutive message (last received + 1), so after resume we will get last message received + 1 effectively not using `#seek` and not purging `librdafka` buffer preserving on networking. Please be mindful that this uses notion of last message passed from **librdkafka**, and not the last one available in the consumer (`messages.last`). While for regular cases they will be the same, when using things like DLQ, LRJs, VPs or Filtering API, those may not be the same.
- - [Improvement] **Drastically** improve network efficiency of operating with LRJ by using the `:consecutive` offset as default strategy for running LRJs without moving the offset in place and purging the data.
- - [Improvement] Do not "seek in place". When pausing and/or seeking to the same location as the current position, do nothing not to purge buffers and not to move to the same place where we are.
+ - [Enhancement] Allow for running `#pause` without specifying the offset (provide offset or `:consecutive`). This allows for pausing on the consecutive message (last received + 1), so after resume we will get last message received + 1 effectively not using `#seek` and not purging `librdafka` buffer preserving on networking. Please be mindful that this uses notion of last message passed from **librdkafka**, and not the last one available in the consumer (`messages.last`). While for regular cases they will be the same, when using things like DLQ, LRJs, VPs or Filtering API, those may not be the same.
+ - [Enhancement] **Drastically** improve network efficiency of operating with LRJ by using the `:consecutive` offset as default strategy for running LRJs without moving the offset in place and purging the data.
+ - [Enhancement] Do not "seek in place". When pausing and/or seeking to the same location as the current position, do nothing not to purge buffers and not to move to the same place where we are.
  - [Fix] Pattern regexps should not be part of declaratives even when configured.

  ### Upgrade Notes
@@ -28,13 +41,13 @@
  In the latest Karafka release, there are no breaking changes. However, please note the updates to #pause and #seek. If you spot any issues, please report them immediately. Your feedback is crucial.

  ## 2.2.9 (2023-10-24)
- - [Improvement] Allow using negative offset references in `Karafka::Admin#read_topic`.
+ - [Enhancement] Allow using negative offset references in `Karafka::Admin#read_topic`.
  - [Change] Make sure that WaterDrop `2.6.10` or higher is used with this release to support transactions fully and the Web-UI.

  ## 2.2.8 (2023-10-20)
  - **[Feature]** Introduce Appsignal integration for errors and metrics tracking.
- - [Improvement] Expose `#synchronize` for VPs to allow for locks when cross-VP consumers work is needed.
- - [Improvement] Provide `#collapse_until!` direct consumer API to allow for collapsed virtual partitions consumer operations together with the Filtering API for advanced use-cases.
+ - [Enhancement] Expose `#synchronize` for VPs to allow for locks when cross-VP consumers work is needed.
+ - [Enhancement] Provide `#collapse_until!` direct consumer API to allow for collapsed virtual partitions consumer operations together with the Filtering API for advanced use-cases.
  - [Refactor] Reorganize how rebalance events are propagated from `librdkafka` to Karafka. Replace `connection.client.rebalance_callback` with `rebalance.partitions_assigned` and `rebalance.partitions_revoked`. Introduce two extra events: `rebalance.partitions_assign` and `rebalance.partitions_revoke` to handle pre-rebalance future work.
  - [Refactor] Remove `thor` as a CLI layer and rely on Ruby `OptParser`

@@ -136,31 +149,31 @@ If you want to maintain the `2.1` behavior, that is `karafka_admin` admin group,

  ## 2.1.9 (2023-08-06)
  - **[Feature]** Introduce ability to customize pause strategy on a per topic basis (Pro).
- - [Improvement] Disable the extensive messages logging in the default `karafka.rb` template.
+ - [Enhancement] Disable the extensive messages logging in the default `karafka.rb` template.
  - [Change] Require `waterdrop` `>= 2.6.6` due to extra `LoggerListener` API.

  ## 2.1.8 (2023-07-29)
- - [Improvement] Introduce `Karafka::BaseConsumer#used?` method to indicate, that at least one invocation of `#consume` took or will take place. This can be used as a replacement to the non-direct `messages.count` check for shutdown and revocation to ensure, that the consumption took place or is taking place (in case of running LRJ).
- - [Improvement] Make `messages#to_a` return copy of the underlying array to prevent scenarios, where the mutation impacts offset management.
- - [Improvement] Mitigate a librdkafka `cooperative-sticky` rebalance crash issue.
- - [Improvement] Provide ability to overwrite `consumer_persistence` per subscribed topic. This is mostly useful for plugins and extensions developers.
+ - [Enhancement] Introduce `Karafka::BaseConsumer#used?` method to indicate, that at least one invocation of `#consume` took or will take place. This can be used as a replacement to the non-direct `messages.count` check for shutdown and revocation to ensure, that the consumption took place or is taking place (in case of running LRJ).
+ - [Enhancement] Make `messages#to_a` return copy of the underlying array to prevent scenarios, where the mutation impacts offset management.
+ - [Enhancement] Mitigate a librdkafka `cooperative-sticky` rebalance crash issue.
+ - [Enhancement] Provide ability to overwrite `consumer_persistence` per subscribed topic. This is mostly useful for plugins and extensions developers.
  - [Fix] Fix a case where the performance tracker would crash in case of mutation of messages to an empty state.

  ## 2.1.7 (2023-07-22)
- - [Improvement] Always query for watermarks in the Iterator to improve the initial response time.
- - [Improvement] Add `max_wait_time` option to the Iterator.
+ - [Enhancement] Always query for watermarks in the Iterator to improve the initial response time.
+ - [Enhancement] Add `max_wait_time` option to the Iterator.
  - [Fix] Fix a case where `Admin#read_topic` would wait for poll interval on non-existing messages instead of early exit.
  - [Fix] Fix a case where Iterator with per partition offsets with negative lookups would go below the number of available messages.
  - [Fix] Remove unused constant from Admin module.
  - [Fix] Add missing `connection.client.rebalance_callback.error` to the `LoggerListener` instrumentation hook.

  ## 2.1.6 (2023-06-29)
- - [Improvement] Provide time support for iterator
- - [Improvement] Provide time support for admin `#read_topic`
- - [Improvement] Provide time support for consumer `#seek`.
- - [Improvement] Remove no longer needed locks for client operations.
- - [Improvement] Raise `Karafka::Errors::TopicNotFoundError` when trying to iterate over non-existing topic.
- - [Improvement] Ensure that Kafka multi-command operations run under mutex together.
+ - [Enhancement] Provide time support for iterator
+ - [Enhancement] Provide time support for admin `#read_topic`
+ - [Enhancement] Provide time support for consumer `#seek`.
+ - [Enhancement] Remove no longer needed locks for client operations.
+ - [Enhancement] Raise `Karafka::Errors::TopicNotFoundError` when trying to iterate over non-existing topic.
+ - [Enhancement] Ensure that Kafka multi-command operations run under mutex together.
  - [Change] Require `waterdrop` `>= 2.6.2`
  - [Change] Require `karafka-core` `>= 2.1.1`
  - [Refactor] Clean-up iterator code.
@@ -172,13 +185,13 @@ If you want to maintain the `2.1` behavior, that is `karafka_admin` admin group,
  - [Fix] Make sure, that `#pause` and `#resume` with one underlying connection do not race-condition.

  ## 2.1.5 (2023-06-19)
- - [Improvement] Drastically improve `#revoked?` response quality by checking the real time assignment lost state on librdkafka.
- - [Improvement] Improve eviction of saturated jobs that would run on already revoked assignments.
- - [Improvement] Expose `#commit_offsets` and `#commit_offsets!` methods in the consumer to provide ability to commit offsets directly to Kafka without having to mark new messages as consumed.
- - [Improvement] No longer skip offset commit when no messages marked as consumed as `librdkafka` has fixed the crashes there.
- - [Improvement] Remove no longer needed patches.
- - [Improvement] Ensure, that the coordinator revocation status is switched upon revocation detection when using `#revoked?`
- - [Improvement] Add benchmarks for marking as consumed (sync and async).
+ - [Enhancement] Drastically improve `#revoked?` response quality by checking the real time assignment lost state on librdkafka.
+ - [Enhancement] Improve eviction of saturated jobs that would run on already revoked assignments.
+ - [Enhancement] Expose `#commit_offsets` and `#commit_offsets!` methods in the consumer to provide ability to commit offsets directly to Kafka without having to mark new messages as consumed.
+ - [Enhancement] No longer skip offset commit when no messages marked as consumed as `librdkafka` has fixed the crashes there.
+ - [Enhancement] Remove no longer needed patches.
+ - [Enhancement] Ensure, that the coordinator revocation status is switched upon revocation detection when using `#revoked?`
+ - [Enhancement] Add benchmarks for marking as consumed (sync and async).
  - [Change] Require `karafka-core` `>= 2.1.0`
  - [Change] Require `waterdrop` `>= 2.6.1`

@@ -202,12 +215,12 @@ If you want to maintain the `2.1` behavior, that is `karafka_admin` admin group,
  - **[Feature]** Provide ability to use CurrentAttributes with ActiveJob's Karafka adapter (federicomoretti).
  - **[Feature]** Introduce collective Virtual Partitions offset management.
  - **[Feature]** Use virtual offsets to filter out messages that would be re-processed upon retries.
- - [Improvement] No longer break processing on failing parallel virtual partitions in ActiveJob because it is compensated by virtual marking.
- - [Improvement] Always use Virtual offset management for Pro ActiveJobs.
- - [Improvement] Do not attempt to mark offsets on already revoked partitions.
- - [Improvement] Make sure, that VP components are not injected into non VP strategies.
- - [Improvement] Improve complex strategies inheritance flow.
- - [Improvement] Optimize offset management for DLQ + MoM feature combinations.
+ - [Enhancement] No longer break processing on failing parallel virtual partitions in ActiveJob because it is compensated by virtual marking.
+ - [Enhancement] Always use Virtual offset management for Pro ActiveJobs.
+ - [Enhancement] Do not attempt to mark offsets on already revoked partitions.
+ - [Enhancement] Make sure, that VP components are not injected into non VP strategies.
+ - [Enhancement] Improve complex strategies inheritance flow.
+ - [Enhancement] Optimize offset management for DLQ + MoM feature combinations.
  - [Change] Removed `Karafka::Pro::BaseConsumer` in favor of `Karafka::BaseConsumer`. (#1345)
  - [Fix] Fix for `max_messages` and `max_wait_time` not having reference in errors.yml (#1443)

@@ -219,16 +232,16 @@ If you want to maintain the `2.1` behavior, that is `karafka_admin` admin group,

  ## 2.0.41 (2023-04-19)
  - **[Feature]** Provide `Karafka::Pro::Iterator` for anonymous topic/partitions iterations and messages lookups (#1389 and #1427).
- - [Improvement] Optimize topic lookup for `read_topic` admin method usage.
- - [Improvement] Report via `LoggerListener` information about the partition on which a given job has started and finished.
- - [Improvement] Slightly normalize the `LoggerListener` format. Always report partition related operations as followed: `TOPIC_NAME/PARTITION`.
- - [Improvement] Do not retry recovery from `unknown_topic_or_part` when Karafka is shutting down as there is no point and no risk of any data losses.
- - [Improvement] Report `client.software.name` and `client.software.version` according to `librdkafka` recommendation.
- - [Improvement] Report ten longest integration specs after the suite execution.
- - [Improvement] Prevent user originating errors related to statistics processing after listener loop crash from potentially crashing the listener loop and hanging Karafka process.
+ - [Enhancement] Optimize topic lookup for `read_topic` admin method usage.
+ - [Enhancement] Report via `LoggerListener` information about the partition on which a given job has started and finished.
+ - [Enhancement] Slightly normalize the `LoggerListener` format. Always report partition related operations as followed: `TOPIC_NAME/PARTITION`.
+ - [Enhancement] Do not retry recovery from `unknown_topic_or_part` when Karafka is shutting down as there is no point and no risk of any data losses.
+ - [Enhancement] Report `client.software.name` and `client.software.version` according to `librdkafka` recommendation.
+ - [Enhancement] Report ten longest integration specs after the suite execution.
+ - [Enhancement] Prevent user originating errors related to statistics processing after listener loop crash from potentially crashing the listener loop and hanging Karafka process.

  ## 2.0.40 (2023-04-13)
- - [Improvement] Introduce `Karafka::Messages::Messages#empty?` method to handle Idle related cases where shutdown or revocation would be called on an empty messages set. This method allows for checking if there are any messages in the messages batch.
+ - [Enhancement] Introduce `Karafka::Messages::Messages#empty?` method to handle Idle related cases where shutdown or revocation would be called on an empty messages set. This method allows for checking if there are any messages in the messages batch.
  - [Refactor] Require messages builder to accept partition and do not fetch it from messages.
  - [Refactor] Use empty messages set for internal APIs (Idle) (so there always is `Karafka::Messages::Messages`)
  - [Refactor] Allow for empty messages set initialization with -1001 and -1 on metadata (similar to `librdkafka`)
@@ -238,17 +251,17 @@ If you want to maintain the `2.1` behavior, that is `karafka_admin` admin group,
  - **[Feature]** Provide Delayed Topics (#1000)
  - **[Feature]** Provide ability to expire messages (expiring topics)
  - **[Feature]** Provide ability to apply filters after messages are polled and before enqueued. This is a generic filter API for any usage.
- - [Improvement] When using ActiveJob with Virtual Partitions, Karafka will stop if collectively VPs are failing. This minimizes number of jobs that will be collectively re-processed.
- - [Improvement] `#retrying?` method has been added to consumers to provide ability to check, that we're reprocessing data after a failure. This is useful for branching out processing based on errors.
- - [Improvement] Track active_job_id in instrumentation (#1372)
- - [Improvement] Introduce new housekeeping job type called `Idle` for non-consumption execution flows.
- - [Improvement] Change how a manual offset management works with Long-Running Jobs. Use the last message offset to move forward instead of relying on the last message marked as consumed for a scenario where no message is marked.
- - [Improvement] Prioritize in Pro non-consumption jobs execution over consumption despite LJF. This will ensure, that housekeeping as well as other non-consumption events are not saturated when running a lot of work.
- - [Improvement] Normalize the DLQ behaviour with MoM. Always pause on dispatch for all the strategies.
- - [Improvement] Improve the manual offset management and DLQ behaviour when no markings occur for OSS.
- - [Improvement] Do not early stop ActiveJob work running under virtual partitions to prevent extensive reprocessing.
- - [Improvement] Drastically increase number of scenarios covered by integration specs (OSS and Pro).
- - [Improvement] Introduce a `Coordinator#synchronize` lock for cross virtual partitions operations.
+ - [Enhancement] When using ActiveJob with Virtual Partitions, Karafka will stop if collectively VPs are failing. This minimizes number of jobs that will be collectively re-processed.
+ - [Enhancement] `#retrying?` method has been added to consumers to provide ability to check, that we're reprocessing data after a failure. This is useful for branching out processing based on errors.
+ - [Enhancement] Track active_job_id in instrumentation (#1372)
+ - [Enhancement] Introduce new housekeeping job type called `Idle` for non-consumption execution flows.
+ - [Enhancement] Change how a manual offset management works with Long-Running Jobs. Use the last message offset to move forward instead of relying on the last message marked as consumed for a scenario where no message is marked.
+ - [Enhancement] Prioritize in Pro non-consumption jobs execution over consumption despite LJF. This will ensure, that housekeeping as well as other non-consumption events are not saturated when running a lot of work.
+ - [Enhancement] Normalize the DLQ behaviour with MoM. Always pause on dispatch for all the strategies.
+ - [Enhancement] Improve the manual offset management and DLQ behaviour when no markings occur for OSS.
+ - [Enhancement] Do not early stop ActiveJob work running under virtual partitions to prevent extensive reprocessing.
+ - [Enhancement] Drastically increase number of scenarios covered by integration specs (OSS and Pro).
+ - [Enhancement] Introduce a `Coordinator#synchronize` lock for cross virtual partitions operations.
  - [Fix] Do not resume partition that is not paused.
  - [Fix] Fix `LoggerListener` cases where logs would not include caller id (when available)
  - [Fix] Fix not working benchmark tests.
@@ -262,10 +275,10 @@ If you want to maintain the `2.1` behavior, that is `karafka_admin` admin group,
  - [Refactor] Move `#mark_as_consumed` and `#mark_as_consumed!`into `Strategies::Default` to be able to introduce marking for virtual partitions.

  ## 2.0.38 (2023-03-27)
- - [Improvement] Introduce `Karafka::Admin#read_watermark_offsets` to get low and high watermark offsets values.
- - [Improvement] Track active_job_id in instrumentation (#1372)
- - [Improvement] Improve `#read_topic` reading in case of a compacted partition where the offset is below the low watermark offset. This should optimize reading and should not go beyond the low watermark offset.
- - [Improvement] Allow `#read_topic` to accept instance settings to overwrite any settings needed to customize reading behaviours.
+ - [Enhancement] Introduce `Karafka::Admin#read_watermark_offsets` to get low and high watermark offsets values.
+ - [Enhancement] Track active_job_id in instrumentation (#1372)
+ - [Enhancement] Improve `#read_topic` reading in case of a compacted partition where the offset is below the low watermark offset. This should optimize reading and should not go beyond the low watermark offset.
+ - [Enhancement] Allow `#read_topic` to accept instance settings to overwrite any settings needed to customize reading behaviours.

  ## 2.0.37 (2023-03-20)
  - [Fix] Declarative topics execution on a secondary cluster run topics creation on the primary one (#1365)
@@ -280,7 +293,7 @@ If you want to maintain the `2.1` behavior, that is `karafka_admin` admin group,
  - **[Feature]** Allow for full topics reset and topics repartitioning via the CLI.

  ## 2.0.34 (2023-03-04)
- - [Improvement] Attach an `embedded` tag to Karafka processes started using the embedded API.
+ - [Enhancement] Attach an `embedded` tag to Karafka processes started using the embedded API.
  - [Change] Renamed `Datadog::Listener` to `Datadog::MetricsListener` for consistency. (#1124)

  ### Upgrade Notes
@@ -291,10 +304,10 @@ If you want to maintain the `2.1` behavior, that is `karafka_admin` admin group,
  - **[Feature]** Support `perform_all_later` in ActiveJob adapter for Rails `7.1+`
  - **[Feature]** Introduce ability to assign and re-assign tags in consumer instances. This can be used for extra instrumentation that is context aware.
  - **[Feature]** Introduce ability to assign and reassign tags to the `Karafka::Process`.
- - [Improvement] When using `ActiveJob` adapter, automatically tag jobs with the name of the `ActiveJob` class that is running inside of the `ActiveJob` consumer.
- - [Improvement] Make `::Karafka::Instrumentation::Notifications::EVENTS` list public for anyone wanting to re-bind those into a different notification bus.
- - [Improvement] Set `fetch.message.max.bytes` for `Karafka::Admin` to `5MB` to make sure that all data is fetched correctly for Web UI under heavy load (many consumers).
- - [Improvement] Introduce a `strict_topics_namespacing` config option to enable/disable the strict topics naming validations. This can be useful when working with pre-existing topics which we cannot or do not want to rename.
+ - [Enhancement] When using `ActiveJob` adapter, automatically tag jobs with the name of the `ActiveJob` class that is running inside of the `ActiveJob` consumer.
+ - [Enhancement] Make `::Karafka::Instrumentation::Notifications::EVENTS` list public for anyone wanting to re-bind those into a different notification bus.
+ - [Enhancement] Set `fetch.message.max.bytes` for `Karafka::Admin` to `5MB` to make sure that all data is fetched correctly for Web UI under heavy load (many consumers).
+ - [Enhancement] Introduce a `strict_topics_namespacing` config option to enable/disable the strict topics naming validations. This can be useful when working with pre-existing topics which we cannot or do not want to rename.
  - [Fix] Karafka monitor is prematurely cached (#1314)

  ### Upgrade Notes
@@ -325,39 +338,39 @@ end

  ## 2.0.32 (2023-02-13)
  - [Fix] Many non-existing topic subscriptions propagate poll errors beyond client
- - [Improvement] Ignore `unknown_topic_or_part` errors in dev when `allow.auto.create.topics` is on.
- - [Improvement] Optimize temporary errors handling in polling for a better backoff policy
+ - [Enhancement] Ignore `unknown_topic_or_part` errors in dev when `allow.auto.create.topics` is on.
+ - [Enhancement] Optimize temporary errors handling in polling for a better backoff policy

  ## 2.0.31 (2023-02-12)
  - [Feature] Allow for adding partitions via `Admin#create_partitions` API.
  - [Fix] Do not ignore admin errors upon invalid configuration (#1254)
  - [Fix] Topic name validation (#1300) - CandyFet
- - [Improvement] Increase the `max_wait_timeout` on admin operations to five minutes to make sure no timeout on heavily loaded clusters.
+ - [Enhancement] Increase the `max_wait_timeout` on admin operations to five minutes to make sure no timeout on heavily loaded clusters.
  - [Maintenance] Require `karafka-core` >= `2.0.11` and switch to shared RSpec locator.
  - [Maintenance] Require `karafka-rdkafka` >= `0.12.1`

  ## 2.0.30 (2023-01-31)
- - [Improvement] Alias `--consumer-groups` with `--include-consumer-groups`
- - [Improvement] Alias `--subscription-groups` with `--include-subscription-groups`
- - [Improvement] Alias `--topics` with `--include-topics`
- - [Improvement] Introduce `--exclude-consumer-groups` for ability to exclude certain consumer groups from running
- - [Improvement] Introduce `--exclude-subscription-groups` for ability to exclude certain subscription groups from running
- - [Improvement] Introduce `--exclude-topics` for ability to exclude certain topics from running
+ - [Enhancement] Alias `--consumer-groups` with `--include-consumer-groups`
+ - [Enhancement] Alias `--subscription-groups` with `--include-subscription-groups`
+ - [Enhancement] Alias `--topics` with `--include-topics`
+ - [Enhancement] Introduce `--exclude-consumer-groups` for ability to exclude certain consumer groups from running
+ - [Enhancement] Introduce `--exclude-subscription-groups` for ability to exclude certain subscription groups from running
+ - [Enhancement] Introduce `--exclude-topics` for ability to exclude certain topics from running

  ## 2.0.29 (2023-01-30)
- - [Improvement] Make sure, that the `Karafka#producer` instance has the `LoggerListener` enabled in the install template, so Karafka by default prints both consumer and producer info.
- - [Improvement] Extract the code loading capabilities of Karafka console from the executable, so web can use it to provide CLI commands.
+ - [Enhancement] Make sure, that the `Karafka#producer` instance has the `LoggerListener` enabled in the install template, so Karafka by default prints both consumer and producer info.
+ - [Enhancement] Extract the code loading capabilities of Karafka console from the executable, so web can use it to provide CLI commands.
  - [Fix] Fix for: running karafka console results in NameError with Rails (#1280)
  - [Fix] Make sure, that the `caller` for async errors is being published.
  - [Change] Make sure that WaterDrop `2.4.10` or higher is used with this release to support Web-UI.

  ## 2.0.28 (2023-01-25)
  - **[Feature]** Provide the ability to use Dead Letter Queue with Virtual Partitions.
- - [Improvement] Collapse Virtual Partitions upon retryable error to a single partition. This allows dead letter queue to operate and mitigate issues arising from work virtualization. This removes uncertainties upon errors that can be retried and processed. Affects given topic partition virtualization only for multi-topic and multi-partition parallelization. It also minimizes potential "flickering" where given data set has potentially many corrupted messages. The collapse will last until all the messages from the collective corrupted batch are processed. After that, virtualization will resume.
- - [Improvement] Introduce `#collapsed?` consumer method available for consumers using Virtual Partitions.
- - [Improvement] Allow for customization of DLQ dispatched message details in Pro (#1266) via the `#enhance_dlq_message` consumer method.
- - [Improvement] Include `original_consumer_group` in the DLQ dispatched messages in Pro.
- - [Improvement] Use Karafka `client_id` as kafka `client.id` value by default
+ - [Enhancement] Collapse Virtual Partitions upon retryable error to a single partition. This allows dead letter queue to operate and mitigate issues arising from work virtualization. This removes uncertainties upon errors that can be retried and processed. Affects given topic partition virtualization only for multi-topic and multi-partition parallelization. It also minimizes potential "flickering" where given data set has potentially many corrupted messages. The collapse will last until all the messages from the collective corrupted batch are processed. After that, virtualization will resume.
+ - [Enhancement] Introduce `#collapsed?` consumer method available for consumers using Virtual Partitions.
+ - [Enhancement] Allow for customization of DLQ dispatched message details in Pro (#1266) via the `#enhance_dlq_message` consumer method.
+ - [Enhancement] Include `original_consumer_group` in the DLQ dispatched messages in Pro.
+ - [Enhancement] Use Karafka `client_id` as kafka `client.id` value by default

  ### Upgrade Notes

@@ -378,14 +391,14 @@ class KarafkaApp < Karafka::App

  ## 2.0.26 (2023-01-10)
  - **[Feature]** Allow for disabling given topics by setting `active` to false. It will exclude them from consumption but will allow to have their definitions for using admin APIs, etc.
- - [Improvement] Early terminate on `read_topic` when reaching the last offset available on the request time.
- - [Improvement] Introduce a `quiet` state that indicates that Karafka is not only moving to quiet mode but actually that it reached it and no work will happen anymore in any of the consumer groups.
- - [Improvement] Use Karafka defined routes topics when possible for `read_topic` admin API.
- - [Improvement] Introduce `client.pause` and `client.resume` instrumentation hooks for tracking client topic partition pausing and resuming. This is alongside of `consumer.consuming.pause` that can be used to track both manual and automatic pausing with more granular consumer related details. The `client.*` should be used for low level tracking.
- - [Improvement] Replace `LoggerListener` pause notification with one based on `client.pause` instead of `consumer.consuming.pause`.
- - [Improvement] Expand `LoggerListener` with `client.resume` notification.
- - [Improvement] Replace random anonymous subscription groups ids with stable once.
- - [Improvement] Add `consumer.consume`, `consumer.revoke` and `consumer.shutting_down` notification events and move the revocation logic calling to strategies.
+ - [Enhancement] Early terminate on `read_topic` when reaching the last offset available on the request time.
+ - [Enhancement] Introduce a `quiet` state that indicates that Karafka is not only moving to quiet mode but actually that it reached it and no work will happen anymore in any of the consumer groups.
+ - [Enhancement] Use Karafka defined routes topics when possible for `read_topic` admin API.
+ - [Enhancement] Introduce `client.pause` and `client.resume` instrumentation hooks for tracking client topic partition pausing and resuming. This is alongside of `consumer.consuming.pause` that can be used to track both manual and automatic pausing with more granular consumer related details. The `client.*` should be used for low level tracking.
+ - [Enhancement] Replace `LoggerListener` pause notification with one based on `client.pause` instead of `consumer.consuming.pause`.
+ - [Enhancement] Expand `LoggerListener` with `client.resume` notification.
+ - [Enhancement] Replace random anonymous subscription groups ids with stable once.
+ - [Enhancement] Add `consumer.consume`, `consumer.revoke` and `consumer.shutting_down` notification events and move the revocation logic calling to strategies.
  - [Change] Rename job queue statistics `processing` key to `busy`. No changes needed because naming in the DataDog listener stays the same.
  - [Fix] Fix proctitle listener state changes reporting on new states.
  - [Fix] Make sure all files descriptors are closed in the integration specs.
@@ -398,17 +411,17 @@ class KarafkaApp < Karafka::App

  ## 2.0.24 (2022-12-19)
  - **[Feature]** Provide out of the box encryption support for Pro.
- - [Improvement] Add instrumentation upon `#pause`.
- - [Improvement] Add instrumentation upon retries.
- - [Improvement] Assign `#id` to consumers similar to other entities for ease of debugging.
- - [Improvement] Add retries and pausing to the default `LoggerListener`.
- - [Improvement] Introduce a new final `terminated` state that will kick in prior to exit but after all the instrumentation and other things are done.
- - [Improvement] Ensure that state transitions are thread-safe and ensure state transitions can occur in one direction.
- - [Improvement] Optimize status methods proxying to `Karafka::App`.
- - [Improvement] Allow for easier state usage by introducing explicit `#to_s` for reporting.
- - [Improvement] Change auto-generated id from `SecureRandom#uuid` to `SecureRandom#hex(6)`
- - [Improvement] Emit statistic every 5 seconds by default.
- - [Improvement] Introduce general messages parser that can be swapped when needed.
+ - [Enhancement] Add instrumentation upon `#pause`.
+ - [Enhancement] Add instrumentation upon retries.
+ - [Enhancement] Assign `#id` to consumers similar to other entities for ease of debugging.
+ - [Enhancement] Add retries and pausing to the default `LoggerListener`.
+ - [Enhancement] Introduce a new final `terminated` state that will kick in prior to exit but after all the instrumentation and other things are done.
+ - [Enhancement] Ensure that state transitions are thread-safe and ensure state transitions can occur in one direction.
+ - [Enhancement] Optimize status methods proxying to `Karafka::App`.
+ - [Enhancement] Allow for easier state usage by introducing explicit `#to_s` for reporting.
+ - [Enhancement] Change auto-generated id from `SecureRandom#uuid` to `SecureRandom#hex(6)`
+ - [Enhancement] Emit statistic every 5 seconds by default.
+ - [Enhancement] Introduce general messages parser that can be swapped when needed.
  - [Fix] Do not trigger code reloading when `consumer_persistence` is enabled.
  - [Fix] Shutdown producer after all the consumer components are down and the status is stopped. This will ensure, that any instrumentation related Kafka messaging can still operate.

@@ -429,17 +442,17 @@ end

  ## 2.0.23 (2022-12-07)
  - [Maintenance] Align with `waterdrop` and `karafka-core`
- - [Improvement] Provide `Admin#read_topic` API to get topic data without subscribing.
- - [Improvement] Upon an end user `#pause`, do not commit the offset in automatic offset management mode. This will prevent from a scenario where pause is needed but during it a rebalance occurs and a different assigned process starts not from the pause location but from the automatic offset that may be different. This still allows for using the `#mark_as_consumed`.
+ - [Enhancement] Provide `Admin#read_topic` API to get topic data without subscribing.
+ - [Enhancement] Upon an end user `#pause`, do not commit the offset in automatic offset management mode. This will prevent from a scenario where pause is needed but during it a rebalance occurs and a different assigned process starts not from the pause location but from the automatic offset that may be different. This still allows for using the `#mark_as_consumed`.
  - [Fix] Fix a scenario where manual `#pause` would be overwritten by a resume initiated by the strategy.
  - [Fix] Fix a scenario where manual `#pause` in LRJ would cause infinite pause.

  ## 2.0.22 (2022-12-02)
- - [Improvement] Load Pro components upon Karafka require so they can be altered prior to setup.
- - [Improvement] Do not run LRJ jobs that were added to the jobs queue but were revoked meanwhile.
- - [Improvement] Allow running particular named subscription groups similar to consumer groups.
- - [Improvement] Allow running particular topics similar to consumer groups.
- - [Improvement] Raise configuration error when trying to run Karafka with options leading to no subscriptions.
+ - [Enhancement] Load Pro components upon Karafka require so they can be altered prior to setup.
+ - [Enhancement] Do not run LRJ jobs that were added to the jobs queue but were revoked meanwhile.
+ - [Enhancement] Allow running particular named subscription groups similar to consumer groups.
+ - [Enhancement] Allow running particular topics similar to consumer groups.
+ - [Enhancement] Raise configuration error when trying to run Karafka with options leading to no subscriptions.
  - [Fix] Fix `karafka info` subscription groups count reporting as it was misleading.
  - [Fix] Allow for defining subscription groups with symbols similar to consumer groups and topics to align the API.
  - [Fix] Do not allow for an explicit `nil` as a `subscription_group` block argument.
@@ -449,23 +462,23 @@ end
  - [Fix] Duplicated logs in development environment for Rails when logger set to `$stdout`.

  ## 20.0.21 (2022-11-25)
- - [Improvement] Make revocation jobs for LRJ topics non-blocking to prevent blocking polling when someone uses non-revocation aware LRJ jobs and revocation happens.
+ - [Enhancement] Make revocation jobs for LRJ topics non-blocking to prevent blocking polling when someone uses non-revocation aware LRJ jobs and revocation happens.

  ## 2.0.20 (2022-11-24)
- - [Improvement] Support `group.instance.id` assignment (static group membership) for a case where a single consumer group has multiple subscription groups (#1173).
+ - [Enhancement] Support `group.instance.id` assignment (static group membership) for a case where a single consumer group has multiple subscription groups (#1173).

  ## 2.0.19 (2022-11-20)
  - **[Feature]** Provide ability to skip failing messages without dispatching them to an alternative topic (DLQ).
- - [Improvement] Improve the integration with Ruby on Rails by preventing double-require of components.
- - [Improvement] Improve stability of the shutdown process upon critical errors.
- - [Improvement] Improve stability of the integrations spec suite.
+ - [Enhancement] Improve the integration with Ruby on Rails by preventing double-require of components.
+ - [Enhancement] Improve stability of the shutdown process upon critical errors.
+ - [Enhancement] Improve stability of the integrations spec suite.
  - [Fix] Fix an issue where upon fast startup of multiple subscription groups from the same consumer group, a ghost queue would be created due to problems in `Concurrent::Hash`.

  ## 2.0.18 (2022-11-18)
  - **[Feature]** Support quiet mode via `TSTP` signal. When used, Karafka will finish processing current messages, run `shutdown` jobs, and switch to a quiet mode where no new work is being accepted. At the same time, it will keep the consumer group quiet, and thus no rebalance will be triggered. This can be particularly useful during deployments.
- - [Improvement] Trigger `#revoked` for jobs in case revocation would happen during shutdown when jobs are still running. This should ensure, we get a notion of revocation for Pro LRJ jobs even when revocation happening upon shutdown (#1150).
- - [Improvement] Stabilize the shutdown procedure for consumer groups with many subscription groups that have non-aligned processing cost per batch.
- - [Improvement] Remove double loading of Karafka via Rails railtie.
+ - [Enhancement] Trigger `#revoked` for jobs in case revocation would happen during shutdown when jobs are still running. This should ensure, we get a notion of revocation for Pro LRJ jobs even when revocation happening upon shutdown (#1150).
+ - [Enhancement] Stabilize the shutdown procedure for consumer groups with many subscription groups that have non-aligned processing cost per batch.
+ - [Enhancement] Remove double loading of Karafka via Rails railtie.
  - [Fix] Fix invalid class references in YARD docs.
  - [Fix] prevent parallel closing of many clients.
  - [Fix] fix a case where information about revocation for a combination of LRJ + VP would not be dispatched until all VP work is done.
@@ -494,11 +507,11 @@ end
  ## 2.0.16 (2022-11-09)
  - **[Breaking]** Disable the root `manual_offset_management` setting and require it to be configured per topic. This is part of "topic features" configuration extraction for better code organization.
  - **[Feature]** Introduce **Dead Letter Queue** feature and Pro **Enhanced Dead Letter Queue** feature
- - [Improvement] Align attributes available in the instrumentation bus for listener related events.
- - [Improvement] Include consumer group id in consumption related events (#1093)
- - [Improvement] Delegate pro components loading to Zeitwerk
- - [Improvement] Include `Datadog::LoggerListener` for tracking logger data with DataDog (@bruno-b-martins)
- - [Improvement] Include `seek_offset` in the `consumer.consume.error` event payload (#1113)
+ - [Enhancement] Align attributes available in the instrumentation bus for listener related events.
+ - [Enhancement] Include consumer group id in consumption related events (#1093)
+ - [Enhancement] Delegate pro components loading to Zeitwerk
+ - [Enhancement] Include `Datadog::LoggerListener` for tracking logger data with DataDog (@bruno-b-martins)
+ - [Enhancement] Include `seek_offset` in the `consumer.consume.error` event payload (#1113)
  - [Refactor] Remove unused logger listener event handler.
  - [Refactor] Internal refactoring of routing validations flow.
  - [Refactor] Reorganize how routing related features are represented internally to simplify features management.
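For orientation, the headline 2.2.14 additions listed above (`Karafka::Admin#delete_consumer_group`, `Karafka::Admin#seek_consumer_group`, `Karafka::App.assignments` and the `client.reset` notification event) might be exercised roughly as follows. This is a sketch based only on the names in the changelog entries; the argument shapes and the example group/topic names are assumptions, not taken from this diff.

```ruby
# Sketch only: method and event names come from the 2.2.14 changelog entries
# above; argument shapes and example names ('obsolete_group', 'events', etc.)
# are assumptions.

# Remove a consumer group that is no longer needed
Karafka::Admin.delete_consumer_group('obsolete_group')

# Reposition a consumer group's committed offsets
# (a topic => { partition => offset } layout is assumed here)
Karafka::Admin.seek_consumer_group('my_group', { 'events' => { 0 => 100 } })

# Inspect the real-time assignments tracked for the running process
puts Karafka::App.assignments.inspect

# React to the new client.reset notification event
Karafka.monitor.subscribe('client.reset') do |event|
  puts "librdkafka client was reset (event: #{event.id})"
end
```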
data/Gemfile.lock CHANGED
@@ -1,7 +1,7 @@
  PATH
  remote: .
  specs:
- karafka (2.2.13)
+ karafka (2.2.14)
  karafka-core (>= 2.2.7, < 2.3.0)
  waterdrop (>= 2.6.11, < 3.0.0)
  zeitwerk (~> 2.3)
@@ -32,7 +32,7 @@ GEM
  drb (2.2.0)
  ruby2_keywords
  erubi (1.12.0)
- factory_bot (6.3.0)
+ factory_bot (6.4.2)
  activesupport (>= 5.0.0)
  ffi (1.16.3)
  globalid (1.2.1)
@@ -42,7 +42,7 @@ GEM
  karafka-core (2.2.7)
  concurrent-ruby (>= 1.1)
  karafka-rdkafka (>= 0.13.9, < 0.15.0)
- karafka-rdkafka (0.14.0)
+ karafka-rdkafka (0.14.1)
  ffi (~> 1.15)
  mini_portile2 (~> 2.6)
  rake (> 12)
data/docker-compose.yml CHANGED
@@ -23,3 +23,5 @@ services:
  KAFKA_AUTO_CREATE_TOPICS_ENABLE: 'true'
  KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
  KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1
+ KAFKA_ALLOW_EVERYONE_IF_NO_ACL_FOUND: "true"
+ KAFKA_AUTHORIZER_CLASS_NAME: org.apache.kafka.metadata.authorizer.StandardAuthorizer