waterdrop 2.9.0 → 2.10.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +4 -4
- data/.gitignore +3 -0
- data/.ruby-version +1 -1
- data/CHANGELOG.md +7 -0
- data/Gemfile.lock +4 -4
- data/bin/verify_kafka_warnings +1 -0
- data/config/locales/errors.yml +5 -0
- data/docker-compose.sasl.yml +26 -0
- data/lib/waterdrop/clients/rdkafka.rb +151 -34
- data/lib/waterdrop/contracts/tombstone.rb +21 -0
- data/lib/waterdrop/errors.rb +7 -0
- data/lib/waterdrop/instrumentation/monitor.rb +61 -0
- data/lib/waterdrop/polling/latch.rb +1 -0
- data/lib/waterdrop/polling/poller.rb +4 -0
- data/lib/waterdrop/producer/tombstone.rb +78 -0
- data/lib/waterdrop/producer/variant.rb +1 -0
- data/lib/waterdrop/producer.rb +1 -0
- data/lib/waterdrop/version.rb +1 -1
- data/renovate.json +4 -1
- metadata +4 -1
checksums.yaml
CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 5a840a99425c1700eb3ea2cad5da08279e11ba0f4a2600b046dcef14bca9255b
+  data.tar.gz: 6107f58c3ed66912e56379660a021eb87c1e18a0d7fee7702460c1e90b75fea3
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 454ff01bc3baa3c2b47c46c6538dd8310c3b2af10cafe3551f00630e78d5980b7afeec7eb2047f769b24d0d6106aba4366e04e99d91045154a4ddb0b34edd4b1
+  data.tar.gz: '0980ac5585f18983d4d6918f99cf8bf070160d5c99964459c714b532195d74372785dd9b3e40366fae709f92aea9c364e2b71d33c3ffadde0528f60e863d8348'
data/.gitignore
CHANGED
data/.ruby-version
CHANGED

@@ -1 +1 @@
-4.0.
+4.0.3
data/CHANGELOG.md
CHANGED

@@ -1,5 +1,12 @@
 # WaterDrop changelog
 
+## 2.10.0 (2026-05-07)
+- [Fix] Clean up native rdkafka client, global instrumentation callbacks, and poller registration when `init_transactions` fails during producer client construction. Previously, each failed attempt permanently leaked native threads, pipe file descriptors, and callback registry entries because the started `rd_kafka_t` handle was abandoned without being destroyed.
+- **[Breaking]** Skip emitting librdkafka statistics when nothing is subscribed to `statistics.emitted` at the time the underlying rdkafka client is constructed. When no listener is present at build time, `statistics.interval.ms` is forced to `0` regardless of user configuration and the statistics callback is not registered, saving substantial allocations in the hot path (no JSON parsing, no statistics hash materialization, no decoration work). To use statistics, subscribe a listener to `statistics.emitted` BEFORE the first producer use (before the underlying client is lazily initialized).
+- **[Breaking]** Raise `WaterDrop::Errors::StatisticsNotEnabledError` when attempting to subscribe to `statistics.emitted` (either via block or via a listener that responds to `on_statistics_emitted`) on a monitor where librdkafka statistics have been disabled at client build time. This replaces the "silent nothing" failure mode with an immediate, actionable error that pinpoints the timing mistake.
+- [Feature] Add tombstone API (`#tombstone_sync`, `#tombstone_async`, `#tombstone_many_sync`, `#tombstone_many_async`) for producing tombstone records (nil-payload messages) with required key and partition validation. Works with variants.
+- [Fix] Add `ensure_same_process!` to `Poller#unregister` for fork safety. Without this, a child process that inherited a pre-fork producer would deadlock on `producer.close` because `unregister` waited on a latch the dead parent poller thread would never release.
+
 ## 2.9.0 (2026-04-08)
 - [Fix] Use `delete` in the variant ensure block to avoid leaving stale nil entries in `Fiber.current.waterdrop_clients` and prevent memory leaks in long-running processes (#836).
 - [Fix] Exclude test files, `.github/`, and `log/` directories from gem releases to reduce package size.
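The first breaking change above boils down to a build-time decision that can be modeled as a small standalone function (a simplified sketch: the `listener_present` flag stands in for the gem's actual monitor lookup, and the function name mirrors, but is not, the gem's internal method):

```ruby
# Simplified model of the 2.10.0 build-time statistics decision: if the user
# configured a positive statistics interval but nobody is listening when the
# client is built, the interval is forced to 0 and statistics stay disabled.
# The caller's config hash is never mutated; a duped copy is changed instead.
def prepare_statistics(kafka_config, listener_present)
  statistics_enabled = kafka_config[:"statistics.interval.ms"].to_i.positive?

  if statistics_enabled && !listener_present
    kafka_config = kafka_config.dup
    kafka_config[:"statistics.interval.ms"] = 0
    statistics_enabled = false
  end

  [kafka_config, statistics_enabled]
end
```

The practical consequence for users: a listener subscribed after the first produce call arrives too late, because the decision is made once, when the lazy client is built.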
data/Gemfile.lock
CHANGED

@@ -1,7 +1,7 @@
 PATH
   remote: .
   specs:
-    waterdrop (2.
+    waterdrop (2.10.0)
       karafka-core (>= 2.5.12, < 3.0.0)
       karafka-rdkafka (>= 0.24.0)
       zeitwerk (~> 2.3)

@@ -28,7 +28,7 @@ GEM
       rake (> 12)
     logger (1.7.0)
     mini_portile2 (2.8.9)
-    minitest (6.0.
+    minitest (6.0.6)
       drb (~> 2.0)
       prism (~> 1.5)
     mocha (3.1.0)

@@ -74,7 +74,7 @@ CHECKSUMS
   karafka-rdkafka (0.25.0) sha256=67b316b942cf9ff7e9d7bbf9029e6f2d91eba97b4c9dc93b9f49fd207dfb80f8
   logger (1.7.0) sha256=196edec7cc44b66cfb40f9755ce11b392f21f7967696af15d274dde7edff0203
   mini_portile2 (2.8.9) sha256=0cd7c7f824e010c072e33f68bc02d85a00aeb6fce05bb4819c03dfd3c140c289
-  minitest (6.0.
+  minitest (6.0.6) sha256=153ea36d1d987a62942382b61075745042a2b3123b1cd48f4c3675af9cc7d6f1
   mocha (3.1.0) sha256=75f42d69ebfb1f10b32489dff8f8431d37a418120ecdfc07afe3bc183d4e1d56
   ostruct (0.6.3) sha256=95a2ed4a4bd1d190784e666b47b2d3f078e4a9efda2fccf18f84ddc6538ed912
   prism (1.9.0) sha256=7b530c6a9f92c24300014919c9dcbc055bf4cdf51ec30aed099b06cd6674ef85

@@ -85,7 +85,7 @@ CHECKSUMS
   simplecov-html (0.13.2) sha256=bd0b8e54e7c2d7685927e8d6286466359b6f16b18cb0df47b508e8d73c777246
   simplecov_json_formatter (0.1.4) sha256=529418fbe8de1713ac2b2d612aa3daa56d316975d307244399fa4838c601b428
   warning (1.5.0) sha256=0f12c49fea0c06757778eefdcc7771e4fd99308901e3d55c504d87afdd718c53
-  waterdrop (2.
+  waterdrop (2.10.0)
   zeitwerk (2.7.5) sha256=d8da92128c09ea6ec62c949011b00ed4a20242b255293dd66bf41545398f73dd
 
 BUNDLED WITH
data/bin/verify_kafka_warnings
CHANGED
data/config/locales/errors.yml
CHANGED

@@ -61,6 +61,11 @@ en:
     headers_invalid_key_type: all headers keys need to be of type String
     headers_invalid_value_type: all headers values need to be strings or arrays of strings
 
+    tombstone:
+      missing: must be present
+      key_format: must be a non-empty string
+      partition_format: must be an integer greater or equal to 0
+
     transactional_offset:
       consumer_format: 'must respond to #consumer_group_metadata_pointer method'
      message_format: 'must respond to #topic, #partition and #offset'
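The new tombstone contract resolves these messages by digging into the `en` → `validations` → `tombstone` subtree of this file. A minimal standalone sketch of that lookup (the YAML is inlined here for illustration; the gem loads the real file from `config/locales/errors.yml`):

```ruby
require "yaml"

# Inline copy of the new locale subtree added in this release; the contract
# fetches exactly this nested hash to use as its error message catalog.
locales = YAML.safe_load(<<~YAML)
  en:
    validations:
      tombstone:
        missing: must be present
        key_format: must be a non-empty string
        partition_format: must be an integer greater or equal to 0
YAML

error_messages = locales.fetch("en").fetch("validations").fetch("tombstone")
```

Using `fetch` instead of `[]` means a missing subtree fails loudly at load time rather than producing nil error messages later.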
data/docker-compose.sasl.yml
ADDED

@@ -0,0 +1,26 @@
+services:
+  kafka-sasl:
+    image: confluentinc/cp-kafka:8.2.0
+    container_name: kafka-sasl
+    ports:
+      - "9095:9095"
+    environment:
+      CLUSTER_ID: kafka-sasl-cluster-1
+      KAFKA_BROKER_ID: 1
+      KAFKA_PROCESS_ROLES: broker,controller
+      KAFKA_CONTROLLER_QUORUM_VOTERS: 1@kafka-sasl:9093
+      KAFKA_CONTROLLER_LISTENER_NAMES: CONTROLLER
+      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
+      KAFKA_LISTENERS: PLAINTEXT://:9092,CONTROLLER://:9093,SASL_PLAINTEXT://:9095
+      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka-sasl:9092,SASL_PLAINTEXT://127.0.0.1:9095
+      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT,SASL_PLAINTEXT:SASL_PLAINTEXT
+      KAFKA_SASL_ENABLED_MECHANISMS: PLAIN
+      KAFKA_SASL_MECHANISM_INTER_BROKER_PROTOCOL: PLAINTEXT
+      KAFKA_LISTENER_NAME_SASL__PLAINTEXT_PLAIN_SASL_JAAS_CONFIG: 'org.apache.kafka.common.security.plain.PlainLoginModule required username="admin" password="admin-secret" user_admin="admin-secret" user_testuser="testuser-secret";'
+      KAFKA_AUTO_CREATE_TOPICS_ENABLE: "true"
+      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
+      KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
+      KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1
+      KAFKA_ALLOW_EVERYONE_IF_NO_ACL_FOUND: "true"
+      KAFKA_AUTHORIZER_CLASS_NAME: org.apache.kafka.metadata.authorizer.StandardAuthorizer
+      KAFKA_OPTS: ""
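A producer pointed at this broker's SASL listener would use librdkafka's standard SASL properties. A hedged sketch of the matching kafka config hash (values mirror the compose file above; the surrounding producer setup and the exact option set your deployment needs are assumptions, not something this diff specifies):

```ruby
# librdkafka settings matching the SASL_PLAINTEXT listener advertised on
# 127.0.0.1:9095 above. Credentials come from the PlainLoginModule line
# (user_testuser / testuser-secret). Illustrative only.
kafka_sasl_config = {
  :"bootstrap.servers" => "127.0.0.1:9095",
  :"security.protocol" => "sasl_plaintext",
  :"sasl.mechanisms" => "PLAIN",
  :"sasl.username" => "testuser",
  :"sasl.password" => "testuser-secret"
}
```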
data/lib/waterdrop/clients/rdkafka.rb
CHANGED

@@ -11,68 +11,185 @@ module WaterDrop
       # @param producer [WaterDrop::Producer] producer instance with its config, etc
       # @note We overwrite this that way, because we do not care
       def new(producer)
-        kafka_config = producer.config.kafka.to_h
         monitor = producer.config.monitor
+        kafka_config, statistics_enabled = prepare_statistics(
+          producer.config.kafka.to_h,
+          monitor
+        )
 
-
-
-
+        client = build_rdkafka_client(producer, kafka_config)
+
+        register_instrumentation_callbacks(
+          producer,
+          client,
+          monitor,
+          statistics_enabled: statistics_enabled
+        )
+
+        # This callback is not global and is per client, thus we do not have to wrap it with a
+        # callbacks manager to make it work
+        client.delivery_callback = Instrumentation::Callbacks::Delivery.new(
+          producer.id,
+          producer.transactional?,
+          monitor
+        )
+
+        subscribe_oauth_listener(producer, monitor)
+        activate_client(producer, client, kafka_config)
+
+        client
+      end
+
+      private
+
+      # Decides whether librdkafka statistics should be enabled for this client and returns
+      # the (possibly mutated) kafka config together with the decision.
+      #
+      # When no one is subscribed to `statistics.emitted` at the time the underlying rdkafka
+      # client is being built, we force `statistics.interval.ms` to 0 regardless of user
+      # configuration. This prevents librdkafka from computing statistics periodically and
+      # saves a significant number of allocations on the Ruby side (no JSON parsing, no
+      # statistics hash materialization, no decorator work). Any listener subscribed after
+      # the client has been built will not receive `statistics.emitted` events because
+      # librdkafka never emits them in the first place — to use statistics, subscribe a
+      # listener BEFORE the first producer use.
+      #
+      # When statistics end up disabled (either because the user explicitly set the interval
+      # to 0, or because we forced it to 0 here), we freeze the statistics listener slot on
+      # the monitor. Any later subscription attempt raises instead of silently being a no-op,
+      # surfacing the timing mistake to the user immediately.
+      #
+      # @param kafka_config [Hash] kafka config hash taken from the producer config
+      # @param monitor [WaterDrop::Instrumentation::Monitor] per-producer monitor
+      # @return [Array] two-element array `[kafka_config, statistics_enabled]`. The returned
+      #   hash is a duped copy when we need to mutate the interval, so the producer's own
+      #   config hash is never touched.
+      def prepare_statistics(kafka_config, monitor)
+        statistics_enabled = kafka_config[:"statistics.interval.ms"].to_i.positive?
+
+        if statistics_enabled && !statistics_listener?(monitor)
+          kafka_config = kafka_config.dup
+          kafka_config[:"statistics.interval.ms"] = 0
+          statistics_enabled = false
+        end
+
+        monitor.freeze_statistics_listeners! unless statistics_enabled
 
+        [kafka_config, statistics_enabled]
+      end
+
+      # Instantiates the underlying rdkafka producer with the correct polling options. When
+      # FD polling is enabled, we disable librdkafka's native background polling thread and
+      # use our own Ruby-based poller instead.
+      #
+      # @param producer [WaterDrop::Producer]
+      # @param kafka_config [Hash]
+      # @return [::Rdkafka::Producer]
+      def build_rdkafka_client(producer, kafka_config)
+        producer_options = { native_kafka_auto_start: false }
         producer_options[:run_polling_thread] = false if producer.fd_polling?
 
-
+        ::Rdkafka::Config.new(kafka_config).producer(**producer_options)
+      end
 
-
-
-
-
+      # Registers the global callbacks (statistics, error, oauth refresh) for this producer
+      # on the shared `Karafka::Core::Instrumentation` managers. The statistics callback is
+      # only registered when librdkafka is actually going to emit statistics — otherwise it
+      # would never fire and would only waste memory and a manager slot.
+      #
+      # @param producer [WaterDrop::Producer]
+      # @param client [::Rdkafka::Producer]
+      # @param monitor [WaterDrop::Instrumentation::Monitor]
+      # @param statistics_enabled [Boolean]
+      def register_instrumentation_callbacks(producer, client, monitor, statistics_enabled:)
+        if statistics_enabled
+          ::Karafka::Core::Instrumentation.statistics_callbacks.add(
            producer.id,
-
-
-
+            Instrumentation::Callbacks::Statistics.new(
+              producer.id,
+              client.name,
+              monitor,
+              producer.config.statistics_decorator
+            )
          )
-
+        end
 
-        # Register error tracking callback
        ::Karafka::Core::Instrumentation.error_callbacks.add(
          producer.id,
          Instrumentation::Callbacks::Error.new(producer.id, client.name, monitor)
        )
 
-        # Register oauth bearer refresh for this particular type of callbacks
        ::Karafka::Core::Instrumentation.oauthbearer_token_refresh_callbacks.add(
          producer.id,
          Instrumentation::Callbacks::OauthbearerTokenRefresh.new(client, monitor)
        )
+      end
 
-
-
-
-
-
-
-
-
+      # Subscribes the oauth bearer token refresh listener to the monitor if one is configured.
+      #
+      # We need to subscribe it here because we want it to be ready before any producer
+      # callbacks run. In theory because the WaterDrop rdkafka producer is lazy loaded, the
+      # user would have enough time to subscribe it himself, but then it would not coop with
+      # auto-configuration coming from Karafka. The way it is done here, if it is configured
+      # it will be subscribed and if not, the user always can subscribe it himself as long as
+      # it is done prior to first usage.
+      #
+      # @param producer [WaterDrop::Producer]
+      # @param monitor [WaterDrop::Instrumentation::Monitor]
+      def subscribe_oauth_listener(producer, monitor)
        oauth_listener = producer.config.oauth.token_provider_listener
-        # We need to subscribe the oauth listener here because we want it to be ready before
-        # any producer callbacks run. In theory because WaterDrop rdkafka producer is lazy loaded
-        # we would have enough time to make user subscribe it himself, but then it would not
-        # coop with auto-configuration coming from Karafka. The way it is done below, if it is
-        # configured it will be subscribed and if not, user always can subscribe it himself as
-        # long as it is done prior to first usage
        monitor.subscribe(oauth_listener) if oauth_listener
+      end
 
+      # Transitions the freshly built client into an active state: starts the native side,
+      # registers it with our FD poller (when FD polling is enabled), and initializes
+      # transactions if the user configured a transactional id. Must run last so all
+      # callbacks are already wired up before the client goes live.
+      #
+      # If any step after `client.start` fails (most commonly `init_transactions` timing
+      # out when Kafka is unreachable), we must clean up everything that was already set up:
+      # unregister from the poller, remove the global instrumentation callbacks, and close
+      # the native client. Without this, each failed attempt leaks native threads, pipe
+      # file descriptors, and callback registry entries permanently.
+      #
+      # @param producer [WaterDrop::Producer]
+      # @param client [::Rdkafka::Producer]
+      # @param kafka_config [Hash]
+      def activate_client(producer, client, kafka_config)
        client.start
 
-        # Register with poller if FD polling is enabled
-        #
-        #
+        # Register with poller if FD polling is enabled. Uses the producer's configured poller
+        # (custom or global singleton). This must happen after client.start to ensure the
+        # client is ready.
        producer.poller.register(producer, client) if producer.fd_polling?
 
-        # Switch to
+        # Switch to transactional mode if user provided a transactional id
        client.init_transactions if kafka_config.key?(:"transactional.id")
+      rescue
+        # Unwind everything we set up before re-raising:
+        # 1. Unregister from poller (if we registered)
+        producer.poller.unregister(producer) if producer.fd_polling?
 
-
+        # 2. Remove global instrumentation callbacks so they don't accumulate
+        ::Karafka::Core::Instrumentation.statistics_callbacks.delete(producer.id)
+        ::Karafka::Core::Instrumentation.error_callbacks.delete(producer.id)
+        ::Karafka::Core::Instrumentation.oauthbearer_token_refresh_callbacks.delete(producer.id)
+
+        # 3. Close the native client to join its threads and release pipe FDs
+        client.close
+
+        raise
+      end
+
+      # Checks whether there is at least one subscriber to the `statistics.emitted` event on
+      # the per-producer monitor. We use this at client build time to decide whether to enable
+      # librdkafka statistics at all.
+      #
+      # @param monitor [WaterDrop::Instrumentation::Monitor] per-producer monitor
+      # @return [Boolean] true if any listener is registered for `statistics.emitted`
+      def statistics_listener?(monitor)
+        listeners = monitor.listeners["statistics.emitted"]
+        listeners && !listeners.empty?
      end
    end
  end
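The failure unwind in `activate_client` is a plain rescue-cleanup-reraise pattern. A standalone model of the cleanup ordering (hypothetical names, no rdkafka involved; the simulated failure stands in for `init_transactions` timing out):

```ruby
# Standalone model of activate_client's unwind: when a step after startup
# fails, everything that was set up is torn down, then the original error is
# re-raised so the caller still sees the failure.
def activate_with_unwind(log)
  log << :client_started
  log << :poller_registered
  raise "init_transactions: timed out" # simulated failure when Kafka is unreachable
rescue
  log << :poller_unregistered
  log << :callbacks_deleted
  log << :client_closed
  raise
end
```

The key property, which the 2.10.0 fix restores, is that a failed construction attempt leaves no registrations or native resources behind, so retrying does not accumulate leaks.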
data/lib/waterdrop/contracts/tombstone.rb
ADDED

@@ -0,0 +1,21 @@
+# frozen_string_literal: true
+
+module WaterDrop
+  module Contracts
+    # Contract for validating tombstone-specific message requirements.
+    # Tombstones require a non-nil key and an explicit partition.
+    #
+    # @note Topic, headers, and other standard message attributes are validated separately
+    #   by the {Message} contract during the produce delegation flow.
+    class Tombstone < ::Karafka::Core::Contractable::Contract
+      configure do |config|
+        config.error_messages = YAML.safe_load_file(
+          File.join(WaterDrop.gem_root, "config", "locales", "errors.yml")
+        ).fetch("en").fetch("validations").fetch("tombstone")
+      end
+
+      required(:key) { |val| val.is_a?(String) && !val.empty? }
+      required(:partition) { |val| val.is_a?(Integer) && val >= 0 }
+    end
+  end
+end
data/lib/waterdrop/errors.rb
CHANGED

@@ -59,6 +59,13 @@ module WaterDrop
     # Raised when an error occurs in the polling loop
     PollerError = Class.new(BaseError)
 
+    # Raised when trying to subscribe to `statistics.emitted` after the underlying rdkafka client
+    # has been built without any listener present at build time. In that case, librdkafka
+    # statistics are disabled entirely for performance, and late subscriptions would silently
+    # receive nothing. To fix: subscribe the listener BEFORE first producer use (i.e. before the
+    # underlying client is lazily initialized).
+    StatisticsNotEnabledError = Class.new(BaseError)
+
     # Raised when during messages producing something bad happened inline
     class ProduceManyError < ProduceError
       attr_reader :dispatched
data/lib/waterdrop/instrumentation/monitor.rb
CHANGED

@@ -6,6 +6,15 @@ module WaterDrop
   # By default uses our internal notifications bus but can be used with
   # `ActiveSupport::Notifications` as well
   class Monitor < ::Karafka::Core::Monitoring::Monitor
+    # Event name for librdkafka statistics emissions
+    STATISTICS_EVENT = "statistics.emitted"
+
+    # Method name a listener object must implement in order to receive
+    # `statistics.emitted` events via object-based subscription
+    STATISTICS_LISTENER_METHOD = :on_statistics_emitted
+
+    private_constant :STATISTICS_EVENT, :STATISTICS_LISTENER_METHOD
+
     # @param notifications_bus [Object] either our internal notifications bus or
     #   `ActiveSupport::Notifications`
     # @param namespace [String, nil] namespace for events or nil if no namespace

@@ -14,6 +23,58 @@ module WaterDrop
       namespace = nil
     )
       super
+      @statistics_listeners_frozen = false
+    end
+
+    # Marks this monitor as no longer accepting new subscriptions to `statistics.emitted`.
+    # Called by the rdkafka client builder when it decides to leave librdkafka statistics
+    # disabled (because no listener was present at build time). Any subsequent attempt to
+    # subscribe to `statistics.emitted` — either via a block or via a listener object that
+    # responds to `on_statistics_emitted` — will raise
+    # `WaterDrop::Errors::StatisticsNotEnabledError` instead of silently doing nothing.
+    def freeze_statistics_listeners!
+      @statistics_listeners_frozen = true
+    end
+
+    # Subscribes to the notifications bus, raising if the user tries to subscribe to
+    # `statistics.emitted` after statistics have been disabled at client build time. This
+    # prevents the "silent nothing" pitfall where a user expects statistics but no events
+    # ever arrive because librdkafka statistics were turned off entirely.
+    #
+    # @param event_id_or_listener [String, Symbol, Object] event id (with block) or listener
+    # @param block [Proc, nil] handler block when subscribing to a named event
+    # @raise [WaterDrop::Errors::StatisticsNotEnabledError] when the subscription targets
+    #   `statistics.emitted` and this monitor has been frozen for statistics
+    def subscribe(event_id_or_listener, &block)
+      if @statistics_listeners_frozen && targets_statistics?(event_id_or_listener, block)
+        raise Errors::StatisticsNotEnabledError, <<~MSG.tr("\n", " ").strip
+          Cannot subscribe to `statistics.emitted` after the producer has been connected.
+          Statistics are disabled because no listener was subscribed before the underlying
+          rdkafka client was built, so librdkafka is not emitting statistics at all.
+          Subscribe your listener BEFORE the first producer use (before the underlying
+          client is lazily initialized), or explicitly keep statistics enabled by leaving
+          a listener in place at build time.
+        MSG
+      end
+
+      super
+    end
+
+    private
+
+    # Determines whether a subscription call targets `statistics.emitted`. Handles both
+    # block-based subscription (where the first argument is the event id string) and
+    # listener-object subscription (where the listener responds to `on_statistics_emitted`).
+    #
+    # @param event_id_or_listener [String, Symbol, Object]
+    # @param block [Proc, nil]
+    # @return [Boolean]
+    def targets_statistics?(event_id_or_listener, block)
+      if block
+        event_id_or_listener.to_s == STATISTICS_EVENT
+      else
+        event_id_or_listener.respond_to?(STATISTICS_LISTENER_METHOD)
+      end
     end
   end
 end
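The monitor's subscribe guard can be modeled with a minimal standalone class (illustrative only, not the gem's class; the plain RuntimeError stands in for `StatisticsNotEnabledError`). Once frozen, both block-based and listener-object subscriptions targeting statistics raise instead of silently never firing:

```ruby
# Minimal model of the frozen-statistics guard: subscriptions to other events
# keep working; only statistics-targeting ones raise after the freeze.
class StatisticsGate
  STATISTICS_EVENT = "statistics.emitted"

  def initialize
    @frozen = false
  end

  def freeze_statistics_listeners!
    @frozen = true
  end

  def subscribe(event_id_or_listener, &block)
    raise "statistics not enabled" if @frozen && targets_statistics?(event_id_or_listener, block)

    :subscribed
  end

  private

  def targets_statistics?(event_id_or_listener, block)
    if block
      # Block-based subscription: first argument is the event id
      event_id_or_listener.to_s == STATISTICS_EVENT
    else
      # Listener-object subscription: detected via the expected handler method
      event_id_or_listener.respond_to?(:on_statistics_emitted)
    end
  end
end
```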
data/lib/waterdrop/polling/latch.rb
CHANGED

@@ -18,6 +18,7 @@ module WaterDrop
     # This ensures the producer is fully drained and removed from the poller
     # before returning control to the caller, preventing race conditions.
     class Latch
+      # Initializes a new latch in the unreleased state.
       def initialize
         @mutex = Mutex.new
         @cv = ConditionVariable.new
data/lib/waterdrop/polling/poller.rb
CHANGED

@@ -47,6 +47,8 @@ module WaterDrop
     # @return [Integer] unique identifier for this poller instance
     attr_reader :id
 
+    # Initializes an empty poller with no registered producers. The background thread is
+    # not started until the first producer is registered.
     def initialize
       @id = self.class.next_id
       @mutex = Mutex.new

@@ -142,6 +144,8 @@ module WaterDrop
     # This matches the threaded polling behavior which drains without timeout
     # @param producer [WaterDrop::Producer] the producer instance
     def unregister(producer)
+      ensure_same_process!
+
       state, thread = @mutex.synchronize { [@producers[producer.id], @thread] }
 
       return unless state
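The fork-safety guard added to `unregister` follows a common pattern: record the owning PID at construction and fail fast when called from a different process, instead of waiting on a latch that the parent's (now nonexistent) poller thread would never release. A standalone sketch of that idea (hypothetical class and message, not the gem's implementation):

```ruby
# Standalone model of a fork-safety guard: after fork, the child sees a
# different Process.pid than the one recorded at construction time and fails
# fast instead of deadlocking on state owned by the parent process.
class ForkGuard
  def initialize
    @owner_pid = Process.pid
  end

  def ensure_same_process!
    return if @owner_pid == Process.pid

    raise "created in process #{@owner_pid}, used in process #{Process.pid}"
  end
end
```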
data/lib/waterdrop/producer/tombstone.rb
ADDED

@@ -0,0 +1,78 @@
+# frozen_string_literal: true
+
+module WaterDrop
+  class Producer
+    # Component for tombstone producer operations
+    #
+    # Tombstone records are Kafka messages with a nil payload, used to signal deletion of a key
+    # in compacted topics. This module provides a dedicated API so users don't have to manually
+    # construct `produce_*(topic:, key:, payload: nil, ...)` calls.
+    module Tombstone
+      # Produces a tombstone message to Kafka and waits for it to be delivered
+      #
+      # @param message [Hash] hash with at least `:topic`, `:key`, and `:partition` keys.
+      #   `:payload` is not accepted — it will be silently removed if present.
+      #
+      # @return [Rdkafka::Producer::DeliveryReport] delivery report
+      #
+      # @raise [Errors::MessageInvalidError] When `:key` or `:partition` is missing
+      def tombstone_sync(message)
+        produce_sync(prepare_tombstone(message))
+      end
+
+      # Produces a tombstone message to Kafka and does not wait for results
+      #
+      # @param message [Hash] hash with at least `:topic`, `:key`, and `:partition` keys.
+      #   `:payload` is not accepted — it will be silently removed if present.
+      #
+      # @return [Rdkafka::Producer::DeliveryHandle] delivery handle
+      #
+      # @raise [Errors::MessageInvalidError] When `:key` or `:partition` is missing
+      def tombstone_async(message)
+        produce_async(prepare_tombstone(message))
+      end
+
+      # Produces many tombstone messages to Kafka and waits for them to be delivered
+      #
+      # @param messages [Array<Hash>] array of hashes, each with `:topic`, `:key`, and
+      #   `:partition` keys
+      #
+      # @return [Array<Rdkafka::Producer::DeliveryHandle>] delivery handles
+      #
+      # @raise [Errors::MessageInvalidError] When any message is missing `:key` or `:partition`
+      def tombstone_many_sync(messages)
+        produce_many_sync(messages.map { |message| prepare_tombstone(message) })
+      end
+
+      # Produces many tombstone messages to Kafka and does not wait for them to be delivered
+      #
+      # @param messages [Array<Hash>] array of hashes, each with `:topic`, `:key`, and
+      #   `:partition` keys
+      #
+      # @return [Array<Rdkafka::Producer::DeliveryHandle>] delivery handles
+      #
+      # @raise [Errors::MessageInvalidError] When any message is missing `:key` or `:partition`
+      def tombstone_many_async(messages)
+        produce_many_async(messages.map { |message| prepare_tombstone(message) })
+      end
+
+      private
+
+      # Validates and prepares a tombstone message by ensuring required keys are present
+      # and setting payload to nil
+      #
+      # @param message [Hash] the original message hash
+      # @return [Hash] a new message hash with payload set to nil
+      # @raise [Errors::MessageInvalidError] when key or partition is missing
+      def prepare_tombstone(message)
+        message = message.dup
+        message.delete(:payload)
+        message[:payload] = nil
+
+        Contracts::Tombstone.new.validate!(message, Errors::MessageInvalidError)
+
+        message
+      end
+    end
+  end
+end
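The preparation step above can be modeled without the gem: force the payload to nil, then validate key and partition. A standalone sketch (validation simplified to plain checks with `ArgumentError`; the real code delegates to `Contracts::Tombstone` and raises `Errors::MessageInvalidError`):

```ruby
# Standalone model of tombstone preparation: any user-supplied payload is
# silently discarded and replaced with nil; key and partition are required.
def prepare_tombstone(message)
  message = message.dup
  message[:payload] = nil

  key = message[:key]
  partition = message[:partition]

  raise ArgumentError, "key must be a non-empty string" unless key.is_a?(String) && !key.empty?
  raise ArgumentError, "partition must be an integer >= 0" unless partition.is_a?(Integer) && partition >= 0

  message
end
```

Requiring an explicit partition matters for compaction: the tombstone must land on the same partition as the record it deletes, so relying on the default partitioner would be unsafe if the original record was produced with an explicit partition.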
data/lib/waterdrop/producer.rb
CHANGED
data/lib/waterdrop/version.rb
CHANGED
data/renovate.json
CHANGED
metadata
CHANGED

@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: waterdrop
 version: !ruby/object:Gem::Version
-  version: 2.
+  version: 2.10.0
 platform: ruby
 authors:
 - Maciej Mensfeld

@@ -82,6 +82,7 @@ files:
 - bin/verify_topics_naming
 - config/locales/errors.yml
 - docker-compose.oauth.yml
+- docker-compose.sasl.yml
 - docker-compose.yml
 - lib/waterdrop.rb
 - lib/waterdrop/clients/buffered.rb

@@ -93,6 +94,7 @@ files:
 - lib/waterdrop/contracts/config.rb
 - lib/waterdrop/contracts/message.rb
 - lib/waterdrop/contracts/poller_config.rb
+- lib/waterdrop/contracts/tombstone.rb
 - lib/waterdrop/contracts/transactional_offset.rb
 - lib/waterdrop/contracts/variant.rb
 - lib/waterdrop/errors.rb

@@ -125,6 +127,7 @@ files:
 - lib/waterdrop/producer/status.rb
 - lib/waterdrop/producer/sync.rb
 - lib/waterdrop/producer/testing.rb
+- lib/waterdrop/producer/tombstone.rb
 - lib/waterdrop/producer/transactions.rb
 - lib/waterdrop/producer/variant.rb
 - lib/waterdrop/version.rb