waterdrop 2.0.7 → 2.6.11
- checksums.yaml +4 -4
- checksums.yaml.gz.sig +0 -0
- data/.github/FUNDING.yml +1 -0
- data/.github/workflows/ci.yml +22 -11
- data/.ruby-version +1 -1
- data/CHANGELOG.md +200 -0
- data/Gemfile +0 -2
- data/Gemfile.lock +32 -75
- data/README.md +22 -275
- data/certs/cert_chain.pem +26 -0
- data/config/locales/errors.yml +33 -0
- data/docker-compose.yml +19 -12
- data/lib/waterdrop/clients/buffered.rb +90 -0
- data/lib/waterdrop/clients/dummy.rb +69 -0
- data/lib/waterdrop/clients/rdkafka.rb +34 -0
- data/lib/{water_drop → waterdrop}/config.rb +39 -16
- data/lib/waterdrop/contracts/config.rb +43 -0
- data/lib/waterdrop/contracts/message.rb +64 -0
- data/lib/{water_drop → waterdrop}/errors.rb +14 -7
- data/lib/waterdrop/instrumentation/callbacks/delivery.rb +102 -0
- data/lib/{water_drop → waterdrop}/instrumentation/callbacks/error.rb +6 -2
- data/lib/{water_drop → waterdrop}/instrumentation/callbacks/statistics.rb +1 -1
- data/lib/{water_drop/instrumentation/stdout_listener.rb → waterdrop/instrumentation/logger_listener.rb} +66 -21
- data/lib/waterdrop/instrumentation/monitor.rb +20 -0
- data/lib/{water_drop/instrumentation/monitor.rb → waterdrop/instrumentation/notifications.rb} +12 -14
- data/lib/waterdrop/instrumentation/vendors/datadog/dashboard.json +1 -0
- data/lib/waterdrop/instrumentation/vendors/datadog/metrics_listener.rb +210 -0
- data/lib/waterdrop/middleware.rb +50 -0
- data/lib/{water_drop → waterdrop}/producer/async.rb +40 -4
- data/lib/{water_drop → waterdrop}/producer/buffer.rb +12 -30
- data/lib/{water_drop → waterdrop}/producer/builder.rb +6 -11
- data/lib/{water_drop → waterdrop}/producer/sync.rb +44 -15
- data/lib/waterdrop/producer/transactions.rb +170 -0
- data/lib/waterdrop/producer.rb +308 -0
- data/lib/{water_drop → waterdrop}/version.rb +1 -1
- data/lib/waterdrop.rb +28 -2
- data/renovate.json +6 -0
- data/waterdrop.gemspec +14 -11
- data.tar.gz.sig +0 -0
- metadata +71 -111
- metadata.gz.sig +0 -0
- data/certs/mensfeld.pem +0 -25
- data/config/errors.yml +0 -6
- data/lib/water_drop/contracts/config.rb +0 -26
- data/lib/water_drop/contracts/message.rb +0 -42
- data/lib/water_drop/instrumentation/callbacks/delivery.rb +0 -30
- data/lib/water_drop/instrumentation/callbacks/statistics_decorator.rb +0 -77
- data/lib/water_drop/instrumentation/callbacks_manager.rb +0 -39
- data/lib/water_drop/instrumentation.rb +0 -20
- data/lib/water_drop/patches/rdkafka/bindings.rb +0 -42
- data/lib/water_drop/patches/rdkafka/producer.rb +0 -20
- data/lib/water_drop/producer/dummy_client.rb +0 -32
- data/lib/water_drop/producer.rb +0 -162
- data/lib/water_drop.rb +0 -36
- /data/lib/{water_drop → waterdrop}/contracts.rb +0 -0
- /data/lib/{water_drop → waterdrop}/producer/status.rb +0 -0
data/README.md
CHANGED
@@ -1,84 +1,40 @@
 # WaterDrop
 
-**Note**: Documentation presented here refers to WaterDrop `2.0.0`.
-
-WaterDrop `2.0` does **not** work with Karafka `1.*` and aims to either work as a standalone producer outside of Karafka `1.*` ecosystem or as a part of not yet released Karafka `2.0.*`.
-
-Please refer to [this](https://github.com/karafka/waterdrop/tree/1.4) branch and its documentation for details about WaterDrop `1.*` usage.
-
 [![Build Status](https://github.com/karafka/waterdrop/workflows/ci/badge.svg)](https://github.com/karafka/waterdrop/actions?query=workflow%3Aci)
 [![Gem Version](https://badge.fury.io/rb/waterdrop.svg)](http://badge.fury.io/rb/waterdrop)
 [![Join the chat at https://slack.karafka.io](https://raw.githubusercontent.com/karafka/misc/master/slack.svg)](https://slack.karafka.io)
 
-
+WaterDrop is a standalone gem that sends messages to Kafka easily with an extra validation layer. It is a part of the [Karafka](https://github.com/karafka/karafka) ecosystem.
 
 It:
 
-- Is thread
+- Is thread-safe
 - Supports sync producing
 - Supports async producing
+- Supports transactions
 - Supports buffering
 - Supports producing messages to multiple clusters
 - Supports multiple delivery policies
-- Works with Kafka 1.0
-
-## Table of contents
+- Works with Kafka `1.0+` and Ruby `2.7+`
+- Works with and without Karafka
 
-
-- [Setup](#setup)
-  * [WaterDrop configuration options](#waterdrop-configuration-options)
-  * [Kafka configuration options](#kafka-configuration-options)
-- [Usage](#usage)
-  * [Basic usage](#basic-usage)
-  * [Buffering](#buffering)
-    + [Using WaterDrop to buffer messages based on the application logic](#using-waterdrop-to-buffer-messages-based-on-the-application-logic)
-    + [Using WaterDrop with rdkafka buffers to achieve periodic auto-flushing](#using-waterdrop-with-rdkafka-buffers-to-achieve-periodic-auto-flushing)
-- [Instrumentation](#instrumentation)
-  * [Usage statistics](#usage-statistics)
-  * [Error notifications](#error-notifications)
-  * [Forking and potential memory problems](#forking-and-potential-memory-problems)
-- [Note on contributions](#note-on-contributions)
-
-## Installation
-
-```ruby
-gem install waterdrop
-```
-
-or add this to your Gemfile:
-
-```ruby
-gem 'waterdrop'
-```
-
-and run
-
-```
-bundle install
-```
+## Documentation
 
-
+Karafka ecosystem components documentation, including WaterDrop, can be found [here](https://karafka.io/docs/#waterdrop).
 
-
+## Getting Started
 
-
-- Kafka driver options - options related to `rdkafka`
+If you want to both produce and consume messages, please use [Karafka](https://github.com/karafka/karafka/). It integrates WaterDrop automatically.
 
-To
+To get started with WaterDrop:
 
-
-producer = WaterDrop::Producer.new
+1. Add it to your Gemfile:
 
-
-
-  config.kafka = {
-    'bootstrap.servers': 'localhost:9092',
-    'request.required.acks': 1
-  }
-end
+```bash
+bundle add waterdrop
 ```
 
-
+2. Create and configure a producer:
 
 ```ruby
 producer = WaterDrop::Producer.new do |config|
@@ -90,41 +46,17 @@ producer = WaterDrop::Producer.new do |config|
 end
 ```
 
-
+3. Use it as follows:
 
-| Option             | Description                                                      |
-|--------------------|------------------------------------------------------------------|
-| `id`               | id of the producer for instrumentation and logging               |
-| `logger`           | Logger that we want to use                                       |
-| `deliver`          | Should we send messages to Kafka or just fake the delivery      |
-| `max_wait_timeout` | Waits that long for the delivery report or raises an error      |
-| `wait_timeout`     | Waits that long before re-check of delivery report availability |
-
-### Kafka configuration options
-
-You can create producers with different `kafka` settings. Documentation of the available configuration options is available on https://github.com/edenhill/librdkafka/blob/master/CONFIGURATION.md.
-
-## Usage
-
-Please refer to the [documentation](https://www.rubydoc.info/gems/waterdrop) in case you're interested in the more advanced API.
-
-### Basic usage
-
-To send Kafka messages, just create a producer and use it:
 
 ```ruby
-
-
-producer.setup do |config|
-  config.kafka = { 'bootstrap.servers': 'localhost:9092' }
-end
-
+# sync producing
 producer.produce_sync(topic: 'my-topic', payload: 'my message')
 
 # or for async
 producer.produce_async(topic: 'my-topic', payload: 'my message')
 
-# or in batches
+# or in sync batches
 producer.produce_many_sync(
   [
     { topic: 'my-topic', payload: 'my message'},
@@ -132,7 +64,7 @@ producer.produce_many_sync(
   ]
 )
 
-#
+# and async batches
 producer.produce_many_async(
   [
     { topic: 'my-topic', payload: 'my message'},
@@ -140,194 +72,9 @@ producer.produce_many_async(
   ]
 )
 
-#
-producer.
-
-
-Each message that you want to publish, will have its value checked.
-
-Here are all the things you can provide in the message hash:
-
-| Option          | Required | Value type    | Description                                               |
-|-----------------|----------|---------------|-----------------------------------------------------------|
-| `topic`         | true     | String        | The Kafka topic that should be written to                 |
-| `payload`       | true     | String        | Data you want to send to Kafka                            |
-| `key`           | false    | String        | The key that should be set in the Kafka message           |
-| `partition`     | false    | Integer       | A specific partition number that should be written to     |
-| `partition_key` | false    | String        | Key to indicate the destination partition of the message  |
-| `timestamp`     | false    | Time, Integer | The timestamp that should be set on the message           |
-| `headers`       | false    | Hash          | Headers for the message                                   |
-
-Keep in mind, that message you want to send should be either binary or stringified (to_s, to_json, etc).
-
-### Buffering
-
-WaterDrop producers support buffering messages in their internal buffers and on the `rdkafka` level via `queue.buffering.*` set of settings.
-
-This means that depending on your use case, you can achieve both granular buffering and flushing control when needed with context awareness and periodic and size-based flushing functionalities.
-
-#### Using WaterDrop to buffer messages based on the application logic
-
-```ruby
-producer = WaterDrop::Producer.new
-
-producer.setup do |config|
-  config.kafka = { 'bootstrap.servers': 'localhost:9092' }
-end
-
-# Simulating some events states of a transaction - notice, that the messages will be flushed to
-# kafka only upon arrival of the `finished` state.
-%w[
-  started
-  processed
-  finished
-].each do |state|
-  producer.buffer(topic: 'events', payload: state)
-
-  puts "The messages buffer size #{producer.messages.size}"
-  producer.flush_sync if state == 'finished'
-  puts "The messages buffer size #{producer.messages.size}"
-end
-
-producer.close
-```
-
-#### Using WaterDrop with rdkafka buffers to achieve periodic auto-flushing
-
-```ruby
-producer = WaterDrop::Producer.new
-
-producer.setup do |config|
-  config.kafka = {
-    'bootstrap.servers': 'localhost:9092',
-    # Accumulate messages for at most 10 seconds
-    'queue.buffering.max.ms' => 10_000
-  }
-end
-
-# WaterDrop will flush messages minimum once every 10 seconds
-30.times do |i|
-  producer.produce_async(topic: 'events', payload: i.to_s)
-  sleep(1)
-end
-
-producer.close
-```
-
-## Instrumentation
-
-Each of the producers after the `#setup` is done, has a custom monitor to which you can subscribe.
-
-```ruby
-producer = WaterDrop::Producer.new
-
-producer.setup do |config|
-  config.kafka = { 'bootstrap.servers': 'localhost:9092' }
-end
-
-producer.monitor.subscribe('message.produced_async') do |event|
-  puts "A message was produced to '#{event[:message][:topic]}' topic!"
-end
-
-producer.produce_async(topic: 'events', payload: 'data')
-
-producer.close
-```
-
-See the `WaterDrop::Instrumentation::Monitor::EVENTS` for the list of all the supported events.
-
-### Usage statistics
-
-WaterDrop may be configured to emit internal metrics at a fixed interval by setting the `kafka` `statistics.interval.ms` configuration property to a value > `0`. Once that is done, emitted statistics are available after subscribing to the `statistics.emitted` publisher event.
-
-The statistics include all of the metrics from `librdkafka` (full list [here](https://github.com/edenhill/librdkafka/blob/master/STATISTICS.md)) as well as the diff of those against the previously emitted values.
-
-For several attributes like `txmsgs`, `librdkafka` publishes only the totals. In order to make it easier to track the progress (for example number of messages sent between statistics emitted events), WaterDrop diffs all the numeric values against previously available numbers. All of those metrics are available under the same key as the metric but with additional `_d` postfix:
-
-
-```ruby
-producer = WaterDrop::Producer.new do |config|
-  config.kafka = {
-    'bootstrap.servers': 'localhost:9092',
-    'statistics.interval.ms': 2_000 # emit statistics every 2 seconds
-  }
-end
-
-producer.monitor.subscribe('statistics.emitted') do |event|
-  sum = event[:statistics]['txmsgs']
-  diff = event[:statistics]['txmsgs_d']
-
-  p "Sent messages: #{sum}"
-  p "Messages sent from last statistics report: #{diff}"
-end
-
-sleep(2)
-
-# Sent messages: 0
-# Messages sent from last statistics report: 0
-
-20.times { producer.produce_async(topic: 'events', payload: 'data') }
-
-# Sent messages: 20
-# Messages sent from last statistics report: 20
-
-sleep(2)
-
-20.times { producer.produce_async(topic: 'events', payload: 'data') }
-
-# Sent messages: 40
-# Messages sent from last statistics report: 20
-
-sleep(2)
-
-# Sent messages: 40
-# Messages sent from last statistics report: 0
-
-producer.close
-```
-
-Note: The metrics returned may not be completely consistent between brokers, toppars and totals, due to the internal asynchronous nature of librdkafka. E.g., the top level tx total may be less than the sum of the broker tx values which it represents.
-
-### Error notifications
-
-Aside from errors related to publishing messages like `buffer.flushed_async.error`, WaterDrop allows you to listen to errors that occur in its internal background threads. Things like reconnecting to Kafka upon network errors and others unrelated to publishing messages are all available under `error.emitted` notification key. You can subscribe to this event to ensure your setup is healthy and without any problems that would otherwise go unnoticed as long as messages are delivered.
-
-```ruby
-producer = WaterDrop::Producer.new do |config|
-  # Note invalid connection port...
-  config.kafka = { 'bootstrap.servers': 'localhost:9090' }
-end
-
-producer.monitor.subscribe('error.emitted') do |event|
-  error = event[:error]
-
-  p "Internal error occurred: #{error}"
+# transactions
+producer.transaction do
+  producer.produce_async(topic: 'my-topic', payload: 'my message')
+  producer.produce_async(topic: 'my-topic', payload: 'my message')
 end
-
-# Run this code without Kafka cluster
-loop do
-  producer.produce_async(topic: 'events', payload: 'data')
-
-  sleep(1)
-end
-
-# After you stop your Kafka cluster, you will see a lot of those:
-#
-# Internal error occurred: Local: Broker transport failure (transport)
-#
-# Internal error occurred: Local: Broker transport failure (transport)
 ```
-
-### Forking and potential memory problems
-
-If you work with forked processes, make sure you **don't** use the producer before the fork. You can easily configure the producer and then fork and use it.
-
-To tackle this [obstacle](https://github.com/appsignal/rdkafka-ruby/issues/15) related to rdkafka, WaterDrop adds finalizer to each of the producers to close the rdkafka client before the Ruby process is shutdown. Due to the [nature of the finalizers](https://www.mikeperham.com/2010/02/24/the-trouble-with-ruby-finalizers/), this implementation prevents producers from being GCed (except upon VM shutdown) and can cause memory leaks if you don't use persistent/long-lived producers in a long-running process or if you don't use the `#close` method of a producer when it is no longer needed. Creating a producer instance for each message is anyhow a rather bad idea, so we recommend not to.
-
-## Note on contributions
-
-First, thank you for considering contributing to the Karafka ecosystem! It's people like you that make the open source community such a great community!
-
-Each pull request must pass all the RSpec specs, integration tests and meet our quality requirements.
-
-Fork it, update and wait for the Github Actions results.
data/certs/cert_chain.pem
ADDED
@@ -0,0 +1,26 @@
+-----BEGIN CERTIFICATE-----
+MIIEcDCCAtigAwIBAgIBATANBgkqhkiG9w0BAQsFADA/MRAwDgYDVQQDDAdjb250
+YWN0MRcwFQYKCZImiZPyLGQBGRYHa2FyYWZrYTESMBAGCgmSJomT8ixkARkWAmlv
+MB4XDTIzMDgyMTA3MjU1NFoXDTI0MDgyMDA3MjU1NFowPzEQMA4GA1UEAwwHY29u
+dGFjdDEXMBUGCgmSJomT8ixkARkWB2thcmFma2ExEjAQBgoJkiaJk/IsZAEZFgJp
+bzCCAaIwDQYJKoZIhvcNAQEBBQADggGPADCCAYoCggGBAOuZpyQKEwsTG9plLat7
+8bUaNuNBEnouTsNMr6X+XTgvyrAxTuocdsyP1sNCjdS1B8RiiDH1/Nt9qpvlBWon
+sdJ1SYhaWNVfqiYStTDnCx3PRMmHRdD4KqUWKpN6VpZ1O/Zu+9Mw0COmvXgZuuO9
+wMSJkXRo6dTCfMedLAIxjMeBIxtoLR2e6Jm6MR8+8WYYVWrO9kSOOt5eKQLBY7aK
+b/Dc40EcJKPg3Z30Pia1M9ZyRlb6SOj6SKpHRqc7vbVQxjEw6Jjal1lZ49m3YZMd
+ArMAs9lQZNdSw5/UX6HWWURLowg6k10RnhTUtYyzO9BFev0JFJftHnmuk8vtb+SD
+5VPmjFXg2VOcw0B7FtG75Vackk8QKfgVe3nSPhVpew2CSPlbJzH80wChbr19+e3+
+YGr1tOiaJrL6c+PNmb0F31NXMKpj/r+n15HwlTMRxQrzFcgjBlxf2XFGnPQXHhBm
+kp1OFnEq4GG9sON4glRldkwzi/f/fGcZmo5fm3d+0ZdNgwIDAQABo3cwdTAJBgNV
+HRMEAjAAMAsGA1UdDwQEAwIEsDAdBgNVHQ4EFgQUPVH5+dLA80A1kJ2Uz5iGwfOa
+1+swHQYDVR0RBBYwFIESY29udGFjdEBrYXJhZmthLmlvMB0GA1UdEgQWMBSBEmNv
+bnRhY3RAa2FyYWZrYS5pbzANBgkqhkiG9w0BAQsFAAOCAYEAnpa0jcN7JzREHMTQ
+bfZ+xcvlrzuROMY6A3zIZmQgbnoZZNuX4cMRrT1p1HuwXpxdpHPw7dDjYqWw3+1h
+3mXLeMuk7amjQpYoSWU/OIZMhIsARra22UN8qkkUlUj3AwTaChVKN/bPJOM2DzfU
+kz9vUgLeYYFfQbZqeI6SsM7ltilRV4W8D9yNUQQvOxCFxtLOetJ00fC/E7zMUzbK
+IBwYFQYsbI6XQzgAIPW6nGSYKgRhkfpmquXSNKZRIQ4V6bFrufa+DzD0bt2ZA3ah
+fMmJguyb5L2Gf1zpDXzFSPMG7YQFLzwYz1zZZvOU7/UCpQsHpID/YxqDp4+Dgb+Y
+qma0whX8UG/gXFV2pYWpYOfpatvahwi+A1TwPQsuZwkkhi1OyF1At3RY+hjSXyav
+AnG1dJU+yL2BK7vaVytLTstJME5mepSZ46qqIJXMuWob/YPDmVaBF39TDSG9e34s
+msG3BiCqgOgHAnL23+CN3Rt8MsuRfEtoTKpJVcCfoEoNHOkc
+-----END CERTIFICATE-----
data/config/locales/errors.yml
ADDED
@@ -0,0 +1,33 @@
+en:
+  validations:
+    config:
+      missing: must be present
+      logger_format: must be present
+      deliver_format: must be boolean
+      id_format: must be a non-empty string
+      max_payload_size_format: must be an integer that is equal or bigger than 1
+      wait_timeout_format: must be a numeric that is bigger than 0
+      max_wait_timeout_format: must be an integer that is equal or bigger than 0
+      kafka_format: must be a hash with symbol based keys
+      kafka_key_must_be_a_symbol: All keys under the kafka settings scope need to be symbols
+      wait_on_queue_full_format: must be boolean
+      wait_backoff_on_queue_full_format: must be a numeric that is bigger or equal to 0
+      wait_timeout_on_queue_full_format: must be a numeric that is bigger or equal to 0
+
+    message:
+      missing: must be present
+      partition_format: must be an integer greater or equal to -1
+      topic_format: 'does not match the topic allowed format'
+      partition_key_format: must be a non-empty string
+      timestamp_format: must be either time or integer
+      payload_format: must be string or nil
+      headers_format: must be a hash
+      key_format: must be a non-empty string
+      payload_max_size: is more than `max_payload_size` config value
+      headers_invalid_key_type: all headers keys need to be of type String
+      headers_invalid_value_type: all headers values need to be of type String
+
+    test:
+      missing: must be present
+      nested.id_format: 'is invalid'
+      nested.id2_format: 'is invalid'
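
These translations back the config and message contracts listed in the file summary above (`contracts/config.rb`, `contracts/message.rb`). A minimal sketch of how they surface at runtime, assuming a configured `producer` and the 2.6.x error classes:

```ruby
begin
  # :topic is required by the message contract, so this never reaches Kafka
  producer.produce_sync(payload: 'data')
rescue WaterDrop::Errors::MessageInvalidError => e
  # The error carries the validation details, e.g. "must be present" for :topic
  puts e.message
end
```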
data/docker-compose.yml
CHANGED
@@ -1,18 +1,25 @@
 version: '2'
+
 services:
-  zookeeper:
-    image: wurstmeister/zookeeper
-    ports:
-      - "2181:2181"
   kafka:
-
+    container_name: kafka
+    image: confluentinc/cp-kafka:7.5.1
+
     ports:
-      -
+      - 9092:9092
+
     environment:
-
-
-
+      CLUSTER_ID: kafka-docker-cluster-1
+      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
+      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
+      KAFKA_PROCESS_ROLES: broker,controller
+      KAFKA_CONTROLLER_LISTENER_NAMES: CONTROLLER
+      KAFKA_LISTENERS: PLAINTEXT://:9092,CONTROLLER://:9093
+      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT
+      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://127.0.0.1:9092
+      KAFKA_BROKER_ID: 1
+      KAFKA_CONTROLLER_QUORUM_VOTERS: 1@127.0.0.1:9093
+      ALLOW_PLAINTEXT_LISTENER: 'yes'
       KAFKA_AUTO_CREATE_TOPICS_ENABLE: 'true'
-
-
-      - /var/run/docker.sock:/var/run/docker.sock
+      KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
+      KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1
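
The replacement compose file drops ZooKeeper and runs a single KRaft-mode broker advertised at `127.0.0.1:9092`. A minimal sketch of pointing a producer at it (the topic name is illustrative):

```ruby
require 'waterdrop'

producer = WaterDrop::Producer.new do |config|
  # Matches KAFKA_ADVERTISED_LISTENERS from the compose file above
  config.kafka = { 'bootstrap.servers': '127.0.0.1:9092' }
end

producer.produce_sync(topic: 'compose-smoke-test', payload: 'hello')
producer.close
```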
data/lib/waterdrop/clients/buffered.rb
ADDED
@@ -0,0 +1,90 @@
+# frozen_string_literal: true
+
+module WaterDrop
+  module Clients
+    # Client used to buffer messages that we send out in specs and other places.
+    class Buffered < Clients::Dummy
+      attr_accessor :messages
+
+      # @param args [Object] anything accepted by `Clients::Dummy`
+      def initialize(*args)
+        super
+        @messages = []
+        @topics = Hash.new { |k, v| k[v] = [] }
+
+        @transaction_active = false
+        @transaction_messages = []
+        @transaction_topics = Hash.new { |k, v| k[v] = [] }
+        @transaction_level = 0
+      end
+
+      # "Produces" message to Kafka: it acknowledges it locally, adds it to the internal buffer
+      # @param message [Hash] `WaterDrop::Producer#produce_sync` message hash
+      # @return [Dummy::Handle] fake delivery handle that can be materialized into a report
+      def produce(message)
+        if @transaction_active
+          @transaction_topics[message.fetch(:topic)] << message
+          @transaction_messages << message
+        else
+          # We pre-validate the message payload, so topic is ensured to be present
+          @topics[message.fetch(:topic)] << message
+          @messages << message
+        end
+
+        super(**message.to_h)
+      end
+
+      # Starts the transaction on a given level
+      def begin_transaction
+        @transaction_level += 1
+        @transaction_active = true
+      end
+
+      # Finishes the given level of the transaction
+      def commit_transaction
+        @transaction_level -= 1
+
+        return unless @transaction_level.zero?
+
+        # Transfer transactional data on success
+        @transaction_topics.each do |topic, messages|
+          @topics[topic] += messages
+        end
+
+        @messages += @transaction_messages
+
+        @transaction_topics.clear
+        @transaction_messages.clear
+        @transaction_active = false
+      end
+
+      # Aborts the transaction
+      def abort_transaction
+        @transaction_level -= 1
+
+        return unless @transaction_level.zero?
+
+        @transaction_topics.clear
+        @transaction_messages.clear
+        @transaction_active = false
+      end
+
+      # Returns messages produced to a given topic
+      # @param topic [String]
+      def messages_for(topic)
+        @topics[topic]
+      end
+
+      # Clears internal buffer
+      # Used in between specs so messages do not leak out
+      def reset
+        @transaction_level = 0
+        @transaction_active = false
+        @transaction_topics.clear
+        @transaction_messages.clear
+        @messages.clear
+        @topics.each_value(&:clear)
+      end
+    end
+  end
+end
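
A sketch of using this buffered client in specs. It assumes the 2.6.x `client_class` setting routes client construction here and that `producer.client` exposes the built instance; verify both against your installed version:

```ruby
producer = WaterDrop::Producer.new do |config|
  config.kafka = { 'bootstrap.servers': 'localhost:9092' }
  # Assumption: swaps the real rdkafka client for this in-memory buffer
  config.client_class = WaterDrop::Clients::Buffered
end

producer.produce_sync(topic: 'events', payload: 'spec-data')

producer.client.messages_for('events').size # => 1
producer.client.reset # clear between specs so messages do not leak out
```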
data/lib/waterdrop/clients/dummy.rb
ADDED
@@ -0,0 +1,69 @@
+# frozen_string_literal: true
+
+module WaterDrop
+  module Clients
+    # A dummy client that is supposed to be used instead of Rdkafka::Producer in case we don't
+    # want to dispatch anything to Kafka.
+    #
+    # It does not store anything and just ignores messages. It does, however, return a proper
+    # delivery handle that can be materialized into a report.
+    class Dummy
+      # `::Rdkafka::Producer::DeliveryHandle` object API compatible dummy object
+      class Handle < ::Rdkafka::Producer::DeliveryHandle
+        # @param topic [String] topic where we want to dispatch message
+        # @param partition [Integer] target partition
+        # @param offset [Integer] offset assigned by our fake "Kafka"
+        def initialize(topic, partition, offset)
+          @topic = topic
+          @partition = partition
+          @offset = offset
+        end
+
+        # Does not wait, just creates the result
+        #
+        # @param _args [Array] anything the wait handle would accept
+        # @return [::Rdkafka::Producer::DeliveryReport]
+        def wait(*_args)
+          create_result
+        end
+
+        # Creates a delivery report with details where the message went
+        #
+        # @return [::Rdkafka::Producer::DeliveryReport]
+        def create_result
+          ::Rdkafka::Producer::DeliveryReport.new(
+            @partition,
+            @offset,
+            @topic
+          )
+        end
+      end
+
+      # @param _producer [WaterDrop::Producer]
+      # @return [Dummy] dummy instance
+      def initialize(_producer)
+        @counters = Hash.new { |h, k| h[k] = -1 }
+      end
+
+      # "Produces" the message
+      # @param topic [String, Symbol] topic where we want to dispatch message
+      # @param partition [Integer] target partition
+      # @param _args [Hash] remaining details that are ignored in the dummy mode
+      # @return [Handle] delivery handle
+      def produce(topic:, partition: 0, **_args)
+        Handle.new(topic.to_s, partition, @counters["#{topic}#{partition}"] += 1)
+      end
+
+      # @param _args [Object] anything really, this dummy is supposed to support anything
+      def respond_to_missing?(*_args)
+        true
+      end
+
+      # @param _args [Object] anything really, this dummy is supposed to support anything
+      # @return [self] returns self for chaining cases
+      def method_missing(*_args)
+        self || super
+      end
+    end
+  end
+end
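
This dummy is what the `deliver` producer setting (visible in the removed README options table above) switches to when delivery is faked. A sketch, assuming `config.deliver = false` routes dispatches here as in 2.6.x:

```ruby
producer = WaterDrop::Producer.new do |config|
  config.deliver = false # validate and instrument, but never hit Kafka
  config.kafka = { 'bootstrap.servers': 'localhost:9092' }
end

report = producer.produce_sync(topic: 'events', payload: 'data')
# Offsets are counted per topic+partition, starting at 0
report.offset # => 0 on the first fake delivery
```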
data/lib/waterdrop/clients/rdkafka.rb
ADDED
@@ -0,0 +1,34 @@
+# frozen_string_literal: true
+
+module WaterDrop
+  # Namespace for all the clients that WaterDrop may use under the hood
+  module Clients
+    # Default Rdkafka client.
+    # Since we use the ::Rdkafka::Producer under the hood, this is just a module that aligns
+    # with the client building API for convenience.
+    module Rdkafka
+      class << self
+        # @param producer [WaterDrop::Producer] producer instance with its config, etc
+        # @note We overwrite this that way, because we do not care
+        def new(producer)
+          config = producer.config.kafka.to_h
+
+          client = ::Rdkafka::Config.new(config).producer
+
+          # This callback is not global and is per client, thus we do not have to wrap it with
+          # a callbacks manager to make it work
+          client.delivery_callback = Instrumentation::Callbacks::Delivery.new(
+            producer.id,
+            producer.transactional?,
+            producer.config.monitor
+          )
+
+          # Switch to the transactional mode if user provided the transactional id
+          client.init_transactions if config.key?(:'transactional.id')
+
+          client
+        end
+      end
+    end
+  end
+end
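
As the builder above shows, a producer becomes transactional purely by having `transactional.id` in its kafka scope (`init_transactions` then runs at build time). A configuration sketch with an illustrative id:

```ruby
producer = WaterDrop::Producer.new do |config|
  config.kafka = {
    'bootstrap.servers': 'localhost:9092',
    # Presence of this key is what triggers init_transactions in the builder
    'transactional.id': 'my-app-producer-1'
  }
end

producer.transactional? # => true

producer.transaction do
  producer.produce_async(topic: 'events', payload: 'all-or-nothing')
end
```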