waterdrop 2.9.0 → 2.10.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 38a7f384d48b06062104341df54cf2532c6e338549dd370507d8b2e820721c73
-  data.tar.gz: 2b66e9f62a94a4fb1baafc8492f9bcb9d51cee3bebc9de937da2860576f82bfb
+  metadata.gz: 5a840a99425c1700eb3ea2cad5da08279e11ba0f4a2600b046dcef14bca9255b
+  data.tar.gz: 6107f58c3ed66912e56379660a021eb87c1e18a0d7fee7702460c1e90b75fea3
 SHA512:
-  metadata.gz: 941854a59acf37d5f19d49186cd0950ca699a441653995a1f09cec740fd44c98880364f954aada62377a2313a61e1e164d20e38601384a9f77c3671156150d5b
-  data.tar.gz: 70fdbebdff760f2fba63fc001cfe4c151a25c64b9e9ce61f5fbc4f17b9c13b392f06a04cff41d02ae32edd015d2ca801b9fd744f7804417aff2a16bde6d4a75e
+  metadata.gz: 454ff01bc3baa3c2b47c46c6538dd8310c3b2af10cafe3551f00630e78d5980b7afeec7eb2047f769b24d0d6106aba4366e04e99d91045154a4ddb0b34edd4b1
+  data.tar.gz: '0980ac5585f18983d4d6918f99cf8bf070160d5c99964459c714b532195d74372785dd9b3e40366fae709f92aea9c364e2b71d33c3ffadde0528f60e863d8348'
data/.gitignore CHANGED
@@ -67,3 +67,6 @@ pickle-email-*.html
 .yardoc
 
 .byebug_history
+
+# Integration test lock files (generated locally with paths, CI runs bundle install)
+test/integrations/**/Gemfile.lock
data/.ruby-version CHANGED
@@ -1 +1 @@
-4.0.2
+4.0.3
data/CHANGELOG.md CHANGED
@@ -1,5 +1,12 @@
 # WaterDrop changelog
 
+## 2.10.0 (2026-05-07)
+- [Fix] Clean up native rdkafka client, global instrumentation callbacks, and poller registration when `init_transactions` fails during producer client construction. Previously, each failed attempt permanently leaked native threads, pipe file descriptors, and callback registry entries because the started `rd_kafka_t` handle was abandoned without being destroyed.
+- **[Breaking]** Skip emitting librdkafka statistics when nothing is subscribed to `statistics.emitted` at the time the underlying rdkafka client is constructed. When no listener is present at build time, `statistics.interval.ms` is forced to `0` regardless of user configuration and the statistics callback is not registered, saving substantial allocations in the hot path (no JSON parsing, no statistics hash materialization, no decoration work). To use statistics, subscribe a listener to `statistics.emitted` BEFORE the first producer use (before the underlying client is lazily initialized).
+- **[Breaking]** Raise `WaterDrop::Errors::StatisticsNotEnabledError` when attempting to subscribe to `statistics.emitted` (either via block or via a listener that responds to `on_statistics_emitted`) on a monitor where librdkafka statistics have been disabled at client build time. This replaces the "silent nothing" failure mode with an immediate, actionable error that pinpoints the timing mistake.
+- [Feature] Add tombstone API (`#tombstone_sync`, `#tombstone_async`, `#tombstone_many_sync`, `#tombstone_many_async`) for producing tombstone records (nil-payload messages) with required key and partition validation. Works with variants.
+- [Fix] Add `ensure_same_process!` to `Poller#unregister` for fork safety. Without this, a child process that inherited a pre-fork producer would deadlock on `producer.close` because `unregister` waited on a latch the dead parent poller thread would never release.
+
 ## 2.9.0 (2026-04-08)
 - [Fix] Use `delete` in the variant ensure block to avoid leaving stale nil entries in `Fiber.current.waterdrop_clients` and prevent memory leaks in long-running processes (#836).
 - [Fix] Exclude test files, `.github/`, and `log/` directories from gem releases to reduce package size.
data/Gemfile.lock CHANGED
@@ -1,7 +1,7 @@
 PATH
   remote: .
   specs:
-    waterdrop (2.9.0)
+    waterdrop (2.10.0)
       karafka-core (>= 2.5.12, < 3.0.0)
       karafka-rdkafka (>= 0.24.0)
       zeitwerk (~> 2.3)
@@ -28,7 +28,7 @@ GEM
       rake (> 12)
     logger (1.7.0)
     mini_portile2 (2.8.9)
-    minitest (6.0.2)
+    minitest (6.0.6)
       drb (~> 2.0)
       prism (~> 1.5)
     mocha (3.1.0)
@@ -74,7 +74,7 @@ CHECKSUMS
   karafka-rdkafka (0.25.0) sha256=67b316b942cf9ff7e9d7bbf9029e6f2d91eba97b4c9dc93b9f49fd207dfb80f8
   logger (1.7.0) sha256=196edec7cc44b66cfb40f9755ce11b392f21f7967696af15d274dde7edff0203
   mini_portile2 (2.8.9) sha256=0cd7c7f824e010c072e33f68bc02d85a00aeb6fce05bb4819c03dfd3c140c289
-  minitest (6.0.2) sha256=db6e57956f6ecc6134683b4c87467d6dd792323c7f0eea7b93f66bd284adbc3d
+  minitest (6.0.6) sha256=153ea36d1d987a62942382b61075745042a2b3123b1cd48f4c3675af9cc7d6f1
   mocha (3.1.0) sha256=75f42d69ebfb1f10b32489dff8f8431d37a418120ecdfc07afe3bc183d4e1d56
   ostruct (0.6.3) sha256=95a2ed4a4bd1d190784e666b47b2d3f078e4a9efda2fccf18f84ddc6538ed912
   prism (1.9.0) sha256=7b530c6a9f92c24300014919c9dcbc055bf4cdf51ec30aed099b06cd6674ef85
@@ -85,7 +85,7 @@ CHECKSUMS
   simplecov-html (0.13.2) sha256=bd0b8e54e7c2d7685927e8d6286466359b6f16b18cb0df47b508e8d73c777246
   simplecov_json_formatter (0.1.4) sha256=529418fbe8de1713ac2b2d612aa3daa56d316975d307244399fa4838c601b428
   warning (1.5.0) sha256=0f12c49fea0c06757778eefdcc7771e4fd99308901e3d55c504d87afdd718c53
-  waterdrop (2.9.0)
+  waterdrop (2.10.0)
   zeitwerk (2.7.5) sha256=d8da92128c09ea6ec62c949011b00ed4a20242b255293dd66bf41545398f73dd
 
 BUNDLED WITH
@@ -6,6 +6,7 @@
 allowed_patterns=(
   "Performing controller activation"
   "registered with feature metadata.version"
+  "TOPIC_ALREADY_EXISTS"
 )
 
 # Get all warnings
@@ -61,6 +61,11 @@ en:
       headers_invalid_key_type: all headers keys need to be of type String
       headers_invalid_value_type: all headers values need to be strings or arrays of strings
 
+    tombstone:
+      missing: must be present
+      key_format: must be a non-empty string
+      partition_format: must be an integer greater or equal to 0
+
     transactional_offset:
       consumer_format: 'must respond to #consumer_group_metadata_pointer method'
       message_format: 'must respond to #topic, #partition and #offset'
@@ -0,0 +1,26 @@
+services:
+  kafka-sasl:
+    image: confluentinc/cp-kafka:8.2.0
+    container_name: kafka-sasl
+    ports:
+      - "9095:9095"
+    environment:
+      CLUSTER_ID: kafka-sasl-cluster-1
+      KAFKA_BROKER_ID: 1
+      KAFKA_PROCESS_ROLES: broker,controller
+      KAFKA_CONTROLLER_QUORUM_VOTERS: 1@kafka-sasl:9093
+      KAFKA_CONTROLLER_LISTENER_NAMES: CONTROLLER
+      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
+      KAFKA_LISTENERS: PLAINTEXT://:9092,CONTROLLER://:9093,SASL_PLAINTEXT://:9095
+      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka-sasl:9092,SASL_PLAINTEXT://127.0.0.1:9095
+      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT,SASL_PLAINTEXT:SASL_PLAINTEXT
+      KAFKA_SASL_ENABLED_MECHANISMS: PLAIN
+      KAFKA_SASL_MECHANISM_INTER_BROKER_PROTOCOL: PLAINTEXT
+      KAFKA_LISTENER_NAME_SASL__PLAINTEXT_PLAIN_SASL_JAAS_CONFIG: 'org.apache.kafka.common.security.plain.PlainLoginModule required username="admin" password="admin-secret" user_admin="admin-secret" user_testuser="testuser-secret";'
+      KAFKA_AUTO_CREATE_TOPICS_ENABLE: "true"
+      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
+      KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
+      KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1
+      KAFKA_ALLOW_EVERYONE_IF_NO_ACL_FOUND: "true"
+      KAFKA_AUTHORIZER_CLASS_NAME: org.apache.kafka.metadata.authorizer.StandardAuthorizer
+      KAFKA_OPTS: ""
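A client talking to the SASL listener above needs matching settings on its side. A hedged sketch — the property names below are standard librdkafka configuration keys, not taken from this diff, and the credentials simply mirror the `testuser` entry in the JAAS config:

```ruby
# Hypothetical WaterDrop/rdkafka kafka config matching the SASL_PLAINTEXT
# listener exposed on 127.0.0.1:9095 by the compose file above.
kafka_config = {
  :"bootstrap.servers" => "127.0.0.1:9095",
  :"security.protocol" => "sasl_plaintext",
  :"sasl.mechanism" => "PLAIN",
  :"sasl.username" => "testuser",
  :"sasl.password" => "testuser-secret"
}
```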
@@ -11,68 +11,185 @@ module WaterDrop
         # @param producer [WaterDrop::Producer] producer instance with its config, etc
         # @note We overwrite this that way, because we do not care
         def new(producer)
-          kafka_config = producer.config.kafka.to_h
           monitor = producer.config.monitor
+          kafka_config, statistics_enabled = prepare_statistics(
+            producer.config.kafka.to_h,
+            monitor
+          )
 
-          # When FD polling is enabled, we disable the native librdkafka polling thread
-          # and use our own Ruby-based poller instead
-          producer_options = { native_kafka_auto_start: false }
+          client = build_rdkafka_client(producer, kafka_config)
+
+          register_instrumentation_callbacks(
+            producer,
+            client,
+            monitor,
+            statistics_enabled: statistics_enabled
+          )
+
+          # This callback is not global and is per client, thus we do not have to wrap it with a
+          # callbacks manager to make it work
+          client.delivery_callback = Instrumentation::Callbacks::Delivery.new(
+            producer.id,
+            producer.transactional?,
+            monitor
+          )
+
+          subscribe_oauth_listener(producer, monitor)
+          activate_client(producer, client, kafka_config)
+
+          client
+        end
+
+        private
+
+        # Decides whether librdkafka statistics should be enabled for this client and returns
+        # the (possibly mutated) kafka config together with the decision.
+        #
+        # When no one is subscribed to `statistics.emitted` at the time the underlying rdkafka
+        # client is being built, we force `statistics.interval.ms` to 0 regardless of user
+        # configuration. This prevents librdkafka from computing statistics periodically and
+        # saves a significant number of allocations on the Ruby side (no JSON parsing, no
+        # statistics hash materialization, no decorator work). Any listener subscribed after
+        # the client has been built will not receive `statistics.emitted` events because
+        # librdkafka never emits them in the first place — to use statistics, subscribe a
+        # listener BEFORE the first producer use.
+        #
+        # When statistics end up disabled (either because the user explicitly set the interval
+        # to 0, or because we forced it to 0 here), we freeze the statistics listener slot on
+        # the monitor. Any later subscription attempt raises instead of silently being a no-op,
+        # surfacing the timing mistake to the user immediately.
+        #
+        # @param kafka_config [Hash] kafka config hash taken from the producer config
+        # @param monitor [WaterDrop::Instrumentation::Monitor] per-producer monitor
+        # @return [Array] two-element array `[kafka_config, statistics_enabled]`. The returned
+        #   hash is a duped copy when we need to mutate the interval, so the producer's own
+        #   config hash is never touched.
+        def prepare_statistics(kafka_config, monitor)
+          statistics_enabled = kafka_config[:"statistics.interval.ms"].to_i.positive?
+
+          if statistics_enabled && !statistics_listener?(monitor)
+            kafka_config = kafka_config.dup
+            kafka_config[:"statistics.interval.ms"] = 0
+            statistics_enabled = false
+          end
+
+          monitor.freeze_statistics_listeners! unless statistics_enabled
 
+          [kafka_config, statistics_enabled]
+        end
+
+        # Instantiates the underlying rdkafka producer with the correct polling options. When
+        # FD polling is enabled, we disable librdkafka's native background polling thread and
+        # use our own Ruby-based poller instead.
+        #
+        # @param producer [WaterDrop::Producer]
+        # @param kafka_config [Hash]
+        # @return [::Rdkafka::Producer]
+        def build_rdkafka_client(producer, kafka_config)
+          producer_options = { native_kafka_auto_start: false }
           producer_options[:run_polling_thread] = false if producer.fd_polling?
 
-          client = ::Rdkafka::Config.new(kafka_config).producer(**producer_options)
+          ::Rdkafka::Config.new(kafka_config).producer(**producer_options)
+        end
 
-          # Register statistics runner for this particular type of callbacks
-          ::Karafka::Core::Instrumentation.statistics_callbacks.add(
-            producer.id,
-            Instrumentation::Callbacks::Statistics.new(
+        # Registers the global callbacks (statistics, error, oauth refresh) for this producer
+        # on the shared `Karafka::Core::Instrumentation` managers. The statistics callback is
+        # only registered when librdkafka is actually going to emit statistics — otherwise it
+        # would never fire and would only waste memory and a manager slot.
+        #
+        # @param producer [WaterDrop::Producer]
+        # @param client [::Rdkafka::Producer]
+        # @param monitor [WaterDrop::Instrumentation::Monitor]
+        # @param statistics_enabled [Boolean]
+        def register_instrumentation_callbacks(producer, client, monitor, statistics_enabled:)
+          if statistics_enabled
+            ::Karafka::Core::Instrumentation.statistics_callbacks.add(
               producer.id,
-              client.name,
-              monitor,
-              producer.config.statistics_decorator
+              Instrumentation::Callbacks::Statistics.new(
+                producer.id,
+                client.name,
+                monitor,
+                producer.config.statistics_decorator
+              )
             )
-          )
+          end
 
-          # Register error tracking callback
           ::Karafka::Core::Instrumentation.error_callbacks.add(
             producer.id,
             Instrumentation::Callbacks::Error.new(producer.id, client.name, monitor)
           )
 
-          # Register oauth bearer refresh for this particular type of callbacks
           ::Karafka::Core::Instrumentation.oauthbearer_token_refresh_callbacks.add(
             producer.id,
             Instrumentation::Callbacks::OauthbearerTokenRefresh.new(client, monitor)
           )
+        end
 
-          # This callback is not global and is per client, thus we do not have to wrap it with a
-          # callbacks manager to make it work
-          client.delivery_callback = Instrumentation::Callbacks::Delivery.new(
-            producer.id,
-            producer.transactional?,
-            monitor
-          )
-
+        # Subscribes the oauth bearer token refresh listener to the monitor if one is configured.
+        #
+        # We need to subscribe it here because we want it to be ready before any producer
+        # callbacks run. In theory because the WaterDrop rdkafka producer is lazy loaded, the
+        # user would have enough time to subscribe it himself, but then it would not coop with
+        # auto-configuration coming from Karafka. The way it is done here, if it is configured
+        # it will be subscribed and if not, the user always can subscribe it himself as long as
+        # it is done prior to first usage.
+        #
+        # @param producer [WaterDrop::Producer]
+        # @param monitor [WaterDrop::Instrumentation::Monitor]
+        def subscribe_oauth_listener(producer, monitor)
           oauth_listener = producer.config.oauth.token_provider_listener
-          # We need to subscribe the oauth listener here because we want it to be ready before
-          # any producer callbacks run. In theory because WaterDrop rdkafka producer is lazy loaded
-          # we would have enough time to make user subscribe it himself, but then it would not
-          # coop with auto-configuration coming from Karafka. The way it is done below, if it is
-          # configured it will be subscribed and if not, user always can subscribe it himself as
-          # long as it is done prior to first usage
           monitor.subscribe(oauth_listener) if oauth_listener
+        end
 
+        # Transitions the freshly built client into an active state: starts the native side,
+        # registers it with our FD poller (when FD polling is enabled), and initializes
+        # transactions if the user configured a transactional id. Must run last so all
+        # callbacks are already wired up before the client goes live.
+        #
+        # If any step after `client.start` fails (most commonly `init_transactions` timing
+        # out when Kafka is unreachable), we must clean up everything that was already set up:
+        # unregister from the poller, remove the global instrumentation callbacks, and close
+        # the native client. Without this, each failed attempt leaks native threads, pipe
+        # file descriptors, and callback registry entries permanently.
+        #
+        # @param producer [WaterDrop::Producer]
+        # @param client [::Rdkafka::Producer]
+        # @param kafka_config [Hash]
+        def activate_client(producer, client, kafka_config)
           client.start
 
-          # Register with poller if FD polling is enabled
-          # Uses the producer's configured poller (custom or global singleton)
-          # This must happen after client.start to ensure the client is ready
+          # Register with poller if FD polling is enabled. Uses the producer's configured poller
+          # (custom or global singleton). This must happen after client.start to ensure the
+          # client is ready.
          producer.poller.register(producer, client) if producer.fd_polling?
 
-          # Switch to the transactional mode if user provided the transactional id
+          # Switch to transactional mode if user provided a transactional id
           client.init_transactions if kafka_config.key?(:"transactional.id")
+        rescue
+          # Unwind everything we set up before re-raising:
+          # 1. Unregister from poller (if we registered)
+          producer.poller.unregister(producer) if producer.fd_polling?
 
-          client
+          # 2. Remove global instrumentation callbacks so they don't accumulate
+          ::Karafka::Core::Instrumentation.statistics_callbacks.delete(producer.id)
+          ::Karafka::Core::Instrumentation.error_callbacks.delete(producer.id)
+          ::Karafka::Core::Instrumentation.oauthbearer_token_refresh_callbacks.delete(producer.id)
+
+          # 3. Close the native client to join its threads and release pipe FDs
+          client.close
+
+          raise
+        end
+
+        # Checks whether there is at least one subscriber to the `statistics.emitted` event on
+        # the per-producer monitor. We use this at client build time to decide whether to enable
+        # librdkafka statistics at all.
+        #
+        # @param monitor [WaterDrop::Instrumentation::Monitor] per-producer monitor
+        # @return [Boolean] true if any listener is registered for `statistics.emitted`
+        def statistics_listener?(monitor)
+          listeners = monitor.listeners["statistics.emitted"]
+          listeners && !listeners.empty?
         end
       end
     end
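The statistics gating in `prepare_statistics` above boils down to a small pure transformation on the kafka config hash. A standalone sketch of that logic — the listener lookup is reduced here to a plain boolean argument instead of a real monitor:

```ruby
# Mirrors the diff's build-time statistics decision: if the user configured a
# positive statistics interval but nobody is listening, force the interval to 0
# (on a duped hash, so the caller's config is untouched) and report "disabled".
def prepare_statistics(kafka_config, listener_present)
  statistics_enabled = kafka_config[:"statistics.interval.ms"].to_i.positive?

  if statistics_enabled && !listener_present
    kafka_config = kafka_config.dup
    kafka_config[:"statistics.interval.ms"] = 0
    statistics_enabled = false
  end

  [kafka_config, statistics_enabled]
end

config = { :"statistics.interval.ms" => 5_000 }

gated, enabled = prepare_statistics(config, false)
# interval forced to 0, decision false, original hash left intact
```

Note the dup-before-mutate: a producer rebuilt later with a listener in place still sees its originally configured interval.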
@@ -0,0 +1,21 @@
+# frozen_string_literal: true
+
+module WaterDrop
+  module Contracts
+    # Contract for validating tombstone-specific message requirements.
+    # Tombstones require a non-nil key and an explicit partition.
+    #
+    # @note Topic, headers, and other standard message attributes are validated separately
+    #   by the {Message} contract during the produce delegation flow.
+    class Tombstone < ::Karafka::Core::Contractable::Contract
+      configure do |config|
+        config.error_messages = YAML.safe_load_file(
+          File.join(WaterDrop.gem_root, "config", "locales", "errors.yml")
+        ).fetch("en").fetch("validations").fetch("tombstone")
+      end
+
+      required(:key) { |val| val.is_a?(String) && !val.empty? }
+      required(:partition) { |val| val.is_a?(Integer) && val >= 0 }
+    end
+  end
+end
@@ -59,6 +59,13 @@ module WaterDrop
     # Raised when an error occurs in the polling loop
     PollerError = Class.new(BaseError)
 
+    # Raised when trying to subscribe to `statistics.emitted` after the underlying rdkafka client
+    # has been built without any listener present at build time. In that case, librdkafka
+    # statistics are disabled entirely for performance, and late subscriptions would silently
+    # receive nothing. To fix: subscribe the listener BEFORE first producer use (i.e. before the
+    # underlying client is lazily initialized).
+    StatisticsNotEnabledError = Class.new(BaseError)
+
     # Raised when during messages producing something bad happened inline
     class ProduceManyError < ProduceError
       attr_reader :dispatched
@@ -6,6 +6,15 @@ module WaterDrop
     # By default uses our internal notifications bus but can be used with
     # `ActiveSupport::Notifications` as well
     class Monitor < ::Karafka::Core::Monitoring::Monitor
+      # Event name for librdkafka statistics emissions
+      STATISTICS_EVENT = "statistics.emitted"
+
+      # Method name a listener object must implement in order to receive
+      # `statistics.emitted` events via object-based subscription
+      STATISTICS_LISTENER_METHOD = :on_statistics_emitted
+
+      private_constant :STATISTICS_EVENT, :STATISTICS_LISTENER_METHOD
+
       # @param notifications_bus [Object] either our internal notifications bus or
       #   `ActiveSupport::Notifications`
       # @param namespace [String, nil] namespace for events or nil if no namespace
@@ -14,6 +23,58 @@ module WaterDrop
         namespace = nil
       )
         super
+        @statistics_listeners_frozen = false
+      end
+
+      # Marks this monitor as no longer accepting new subscriptions to `statistics.emitted`.
+      # Called by the rdkafka client builder when it decides to leave librdkafka statistics
+      # disabled (because no listener was present at build time). Any subsequent attempt to
+      # subscribe to `statistics.emitted` — either via a block or via a listener object that
+      # responds to `on_statistics_emitted` — will raise
+      # `WaterDrop::Errors::StatisticsNotEnabledError` instead of silently doing nothing.
+      def freeze_statistics_listeners!
+        @statistics_listeners_frozen = true
+      end
+
+      # Subscribes to the notifications bus, raising if the user tries to subscribe to
+      # `statistics.emitted` after statistics have been disabled at client build time. This
+      # prevents the "silent nothing" pitfall where a user expects statistics but no events
+      # ever arrive because librdkafka statistics were turned off entirely.
+      #
+      # @param event_id_or_listener [String, Symbol, Object] event id (with block) or listener
+      # @param block [Proc, nil] handler block when subscribing to a named event
+      # @raise [WaterDrop::Errors::StatisticsNotEnabledError] when the subscription targets
+      #   `statistics.emitted` and this monitor has been frozen for statistics
+      def subscribe(event_id_or_listener, &block)
+        if @statistics_listeners_frozen && targets_statistics?(event_id_or_listener, block)
+          raise Errors::StatisticsNotEnabledError, <<~MSG.tr("\n", " ").strip
+            Cannot subscribe to `statistics.emitted` after the producer has been connected.
+            Statistics are disabled because no listener was subscribed before the underlying
+            rdkafka client was built, so librdkafka is not emitting statistics at all.
+            Subscribe your listener BEFORE the first producer use (before the underlying
+            client is lazily initialized), or explicitly keep statistics enabled by leaving
+            a listener in place at build time.
+          MSG
+        end
+
+        super
+      end
+
+      private
+
+      # Determines whether a subscription call targets `statistics.emitted`. Handles both
+      # block-based subscription (where the first argument is the event id string) and
+      # listener-object subscription (where the listener responds to `on_statistics_emitted`).
+      #
+      # @param event_id_or_listener [String, Symbol, Object]
+      # @param block [Proc, nil]
+      # @return [Boolean]
+      def targets_statistics?(event_id_or_listener, block)
+        if block
+          event_id_or_listener.to_s == STATISTICS_EVENT
+        else
+          event_id_or_listener.respond_to?(STATISTICS_LISTENER_METHOD)
+        end
       end
     end
   end
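The subscribe-time detection in `targets_statistics?` can be exercised in isolation. A minimal sketch of the same predicate outside the Monitor class — constants are module-level here purely for the sketch:

```ruby
# Mirrors the Monitor change above: a block-based subscribe is matched by
# event name; a listener object is matched by whether it responds to the
# statistics hook method.
STATISTICS_EVENT = "statistics.emitted"
STATISTICS_LISTENER_METHOD = :on_statistics_emitted

def targets_statistics?(event_id_or_listener, block)
  if block
    # subscribe("statistics.emitted") { ... } — compare the event id
    event_id_or_listener.to_s == STATISTICS_EVENT
  else
    # subscribe(listener_object) — duck-type on the listener hook
    event_id_or_listener.respond_to?(STATISTICS_LISTENER_METHOD)
  end
end
```

This is why the `.to_s` matters: subscribing with a symbol event id still matches the frozen check.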
@@ -18,6 +18,7 @@ module WaterDrop
     # This ensures the producer is fully drained and removed from the poller
     # before returning control to the caller, preventing race conditions.
     class Latch
+      # Initializes a new latch in the unreleased state.
       def initialize
         @mutex = Mutex.new
         @cv = ConditionVariable.new
@@ -47,6 +47,8 @@ module WaterDrop
       # @return [Integer] unique identifier for this poller instance
       attr_reader :id
 
+      # Initializes an empty poller with no registered producers. The background thread is
+      # not started until the first producer is registered.
       def initialize
         @id = self.class.next_id
         @mutex = Mutex.new
@@ -142,6 +144,8 @@ module WaterDrop
       # This matches the threaded polling behavior which drains without timeout
       # @param producer [WaterDrop::Producer] the producer instance
      def unregister(producer)
+        ensure_same_process!
+
         state, thread = @mutex.synchronize { [@producers[producer.id], @thread] }
 
         return unless state
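The diff only shows the `ensure_same_process!` call site, not its body. A common way such a fork guard is implemented is to remember the owning pid at construction and compare it on every guarded call — the class below is a hypothetical sketch of that pattern, not WaterDrop's actual code:

```ruby
# Hypothetical pid-based ownership guard illustrating the fork-safety fix:
# in a forked child, Process.pid changes while inherited objects keep the
# parent's state, so failing fast beats deadlocking on a latch the parent's
# (now nonexistent) poller thread will never release.
class ProcessGuard
  def initialize
    # Pid of the process that created this object
    @owner_pid = Process.pid
  end

  # Raises when invoked from any process other than the creator
  def ensure_same_process!
    return if @owner_pid == Process.pid

    raise "operation attempted from a forked process"
  end
end
```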
@@ -0,0 +1,78 @@
+# frozen_string_literal: true
+
+module WaterDrop
+  class Producer
+    # Component for tombstone producer operations
+    #
+    # Tombstone records are Kafka messages with a nil payload, used to signal deletion of a key
+    # in compacted topics. This module provides a dedicated API so users don't have to manually
+    # construct `produce_*(topic:, key:, payload: nil, ...)` calls.
+    module Tombstone
+      # Produces a tombstone message to Kafka and waits for it to be delivered
+      #
+      # @param message [Hash] hash with at least `:topic`, `:key`, and `:partition` keys.
+      #   `:payload` is not accepted — it will be silently removed if present.
+      #
+      # @return [Rdkafka::Producer::DeliveryReport] delivery report
+      #
+      # @raise [Errors::MessageInvalidError] When `:key` or `:partition` is missing
+      def tombstone_sync(message)
+        produce_sync(prepare_tombstone(message))
+      end
+
+      # Produces a tombstone message to Kafka and does not wait for results
+      #
+      # @param message [Hash] hash with at least `:topic`, `:key`, and `:partition` keys.
+      #   `:payload` is not accepted — it will be silently removed if present.
+      #
+      # @return [Rdkafka::Producer::DeliveryHandle] delivery handle
+      #
+      # @raise [Errors::MessageInvalidError] When `:key` or `:partition` is missing
+      def tombstone_async(message)
+        produce_async(prepare_tombstone(message))
+      end
+
+      # Produces many tombstone messages to Kafka and waits for them to be delivered
+      #
+      # @param messages [Array<Hash>] array of hashes, each with `:topic`, `:key`, and
+      #   `:partition` keys
+      #
+      # @return [Array<Rdkafka::Producer::DeliveryHandle>] delivery handles
+      #
+      # @raise [Errors::MessageInvalidError] When any message is missing `:key` or `:partition`
+      def tombstone_many_sync(messages)
+        produce_many_sync(messages.map { |message| prepare_tombstone(message) })
+      end
+
+      # Produces many tombstone messages to Kafka and does not wait for them to be delivered
+      #
+      # @param messages [Array<Hash>] array of hashes, each with `:topic`, `:key`, and
+      #   `:partition` keys
+      #
+      # @return [Array<Rdkafka::Producer::DeliveryHandle>] delivery handles
+      #
+      # @raise [Errors::MessageInvalidError] When any message is missing `:key` or `:partition`
+      def tombstone_many_async(messages)
+        produce_many_async(messages.map { |message| prepare_tombstone(message) })
+      end
+
+      private
+
+      # Validates and prepares a tombstone message by ensuring required keys are present
+      # and setting payload to nil
+      #
+      # @param message [Hash] the original message hash
+      # @return [Hash] a new message hash with payload set to nil
+      # @raise [Errors::MessageInvalidError] when key or partition is missing
+      def prepare_tombstone(message)
+        message = message.dup
+        message.delete(:payload)
+        message[:payload] = nil
+
+        Contracts::Tombstone.new.validate!(message, Errors::MessageInvalidError)
+
+        message
+      end
+    end
+  end
+end
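The tombstone preparation rule above can be sketched standalone. The real code delegates validation to `Contracts::Tombstone`; here those checks are inlined as plain predicates so the sketch runs without the gem:

```ruby
# Standalone mimic of prepare_tombstone: payload is forced to nil on a duped
# hash, and key/partition are validated with the same predicates the contract
# uses (non-empty String key, non-negative Integer partition).
def prepare_tombstone(message)
  message = message.dup
  message[:payload] = nil

  unless message[:key].is_a?(String) && !message[:key].empty?
    raise ArgumentError, "tombstone key must be a non-empty string"
  end

  unless message[:partition].is_a?(Integer) && message[:partition] >= 0
    raise ArgumentError, "tombstone partition must be an integer >= 0"
  end

  message
end

# Any payload handed in is discarded — a tombstone always ships nil
msg = prepare_tombstone(topic: "users", key: "user-1", partition: 0, payload: "ignored")
```

The explicit partition requirement exists because a tombstone must land in the same partition as the records it deletes; relying on default partitioner behavior would make compaction-based deletion unreliable if partitioning config ever changes.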
@@ -71,6 +71,7 @@ module WaterDrop
       Async,
       Buffer,
       Sync,
+      Tombstone,
       Transactions
     ].each do |scope|
       scope.instance_methods(false).each do |method_name|
@@ -7,6 +7,7 @@ module WaterDrop
     include Sync
     include Async
     include Buffer
+    include Tombstone
     include Transactions
     include Idempotence
     include ClassMonitor
@@ -3,5 +3,5 @@
 # WaterDrop library
 module WaterDrop
   # Current WaterDrop version
-  VERSION = "2.9.0"
+  VERSION = "2.10.0"
 end
data/renovate.json CHANGED
@@ -45,5 +45,8 @@
   "minimumReleaseAge": "7 days",
   "labels": [
     "dependencies"
-  ]
+  ],
+  "lockFileMaintenance": {
+    "enabled": true
+  }
 }
metadata CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: waterdrop
 version: !ruby/object:Gem::Version
-  version: 2.9.0
+  version: 2.10.0
 platform: ruby
 authors:
 - Maciej Mensfeld
@@ -82,6 +82,7 @@ files:
 - bin/verify_topics_naming
 - config/locales/errors.yml
 - docker-compose.oauth.yml
+- docker-compose.sasl.yml
 - docker-compose.yml
 - lib/waterdrop.rb
 - lib/waterdrop/clients/buffered.rb
@@ -93,6 +94,7 @@ files:
 - lib/waterdrop/contracts/config.rb
 - lib/waterdrop/contracts/message.rb
 - lib/waterdrop/contracts/poller_config.rb
+- lib/waterdrop/contracts/tombstone.rb
 - lib/waterdrop/contracts/transactional_offset.rb
 - lib/waterdrop/contracts/variant.rb
 - lib/waterdrop/errors.rb
@@ -125,6 +127,7 @@ files:
 - lib/waterdrop/producer/status.rb
 - lib/waterdrop/producer/sync.rb
 - lib/waterdrop/producer/testing.rb
+- lib/waterdrop/producer/tombstone.rb
 - lib/waterdrop/producer/transactions.rb
 - lib/waterdrop/producer/variant.rb
 - lib/waterdrop/version.rb