waterdrop 2.0.1 → 2.0.5
- checksums.yaml +4 -4
- checksums.yaml.gz.sig +0 -0
- data/.github/workflows/ci.yml +7 -3
- data/.ruby-version +1 -1
- data/CHANGELOG.md +33 -0
- data/Gemfile.lock +31 -34
- data/MIT-LICENSE +18 -0
- data/README.md +101 -33
- data/certs/mensfeld.pem +21 -21
- data/docker-compose.yml +2 -1
- data/lib/water_drop/config.rb +43 -8
- data/lib/water_drop/contracts/message.rb +1 -0
- data/lib/water_drop/instrumentation/callbacks/delivery.rb +30 -0
- data/lib/water_drop/instrumentation/callbacks/error.rb +35 -0
- data/lib/water_drop/instrumentation/callbacks/statistics.rb +41 -0
- data/lib/water_drop/instrumentation/callbacks/statistics_decorator.rb +77 -0
- data/lib/water_drop/instrumentation/callbacks_manager.rb +35 -0
- data/lib/water_drop/instrumentation/monitor.rb +8 -2
- data/lib/water_drop/instrumentation.rb +14 -0
- data/lib/water_drop/patches/rdkafka/bindings.rb +42 -0
- data/lib/water_drop/patches/rdkafka/producer.rb +20 -0
- data/lib/water_drop/producer/builder.rb +7 -42
- data/lib/water_drop/producer.rb +22 -3
- data/lib/water_drop/version.rb +1 -1
- data/lib/water_drop.rb +6 -0
- data/waterdrop.gemspec +6 -6
- data.tar.gz.sig +0 -0
- metadata +43 -38
- metadata.gz.sig +0 -0
- data/.github/FUNDING.yml +0 -1
- data/LICENSE +0 -165
- data/lib/water_drop/producer/statistics_decorator.rb +0 -71
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA256:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: 310a3d7e1a4d0e5825b3a01f59b29c22a9f180c639951763bdf936a23c1119fd
|
4
|
+
data.tar.gz: f6c0c498266ba067201e7983d5bdea7a0aee7810a403be1cd4f4b3d62ab60633
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: 4e486cfa6aa673e008eeaccb8cf920fbb30fce1d23277021d3c6a02e36ee14b8a280e9114b9be778bdb68ba4b07eb2d64371362c454c607edf3c4b57a26a0066
|
7
|
+
data.tar.gz: 50301b9c5a5e67434f46247b5d1a83e4af2577e0f3b8f251a2795bc48aaba8c59135025e606b8143ca57560a2eac6666c530bd5d1b6059ce2e61d008e1eb9385
|
checksums.yaml.gz.sig
CHANGED
Binary file
|
data/.github/workflows/ci.yml
CHANGED
@@ -17,7 +17,7 @@ jobs:
|
|
17
17
|
- '3.0'
|
18
18
|
- '2.7'
|
19
19
|
- '2.6'
|
20
|
-
- 'jruby-
|
20
|
+
- 'jruby-9.3.1.0'
|
21
21
|
include:
|
22
22
|
- ruby: '3.0'
|
23
23
|
coverage: 'true'
|
@@ -29,6 +29,12 @@ jobs:
|
|
29
29
|
uses: ruby/setup-ruby@v1
|
30
30
|
with:
|
31
31
|
ruby-version: ${{matrix.ruby}}
|
32
|
+
- name: Run Kafka with docker-compose
|
33
|
+
# We need to give Kafka enough time to start and create all the needed topics, etc
|
34
|
+
# If anyone has a better idea on how to do it smart and easily, please contact me
|
35
|
+
run: |
|
36
|
+
docker-compose up -d
|
37
|
+
sleep 5
|
32
38
|
- name: Install latest bundler
|
33
39
|
run: |
|
34
40
|
gem install bundler --no-document
|
@@ -37,8 +43,6 @@ jobs:
|
|
37
43
|
run: |
|
38
44
|
bundle config set without development
|
39
45
|
bundle install --jobs 4 --retry 3
|
40
|
-
- name: Run Kafka with docker-compose
|
41
|
-
run: docker-compose up -d
|
42
46
|
- name: Run all tests
|
43
47
|
env:
|
44
48
|
GITHUB_COVERAGE: ${{matrix.coverage}}
|
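The comment in the new CI step asks for a better way to wait for Kafka than a fixed `sleep 5`. One possible direction, sketched below, is to poll the broker port until it accepts connections. The helper name `wait_for_port` and its parameters are hypothetical and not part of WaterDrop or this workflow. An open port only shows that the broker is listening; it does not guarantee that the topics listed in `KAFKA_CREATE_TOPICS` already exist.

```ruby
require 'socket'

# Hypothetical helper, not part of WaterDrop or its CI workflow.
# Returns true once host:port accepts a TCP connection, false on timeout.
def wait_for_port(host, port, timeout: 30, interval: 0.5)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout

  loop do
    begin
      # The block form closes the socket as soon as the block exits
      Socket.tcp(host, port, connect_timeout: 1) { return true }
    rescue SystemCallError, SocketError
      # Give up once the deadline has passed, otherwise retry after a pause
      return false if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline

      sleep(interval)
    end
  end
end
```

A CI step could call `wait_for_port('localhost', 9092)` after `docker-compose up -d` and fail the job when it returns `false`, instead of sleeping for a fixed time.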
data/.ruby-version
CHANGED
@@ -1 +1 @@
|
|
1
|
-
3.0.
|
1
|
+
3.0.2
|
data/CHANGELOG.md
CHANGED
@@ -1,5 +1,38 @@
|
|
1
1
|
# WaterDrop changelog
|
2
2
|
|
3
|
+
## 2.0.5 (2021-11-28)
|
4
|
+
|
5
|
+
### Bug fixes
|
6
|
+
|
7
|
+
- Fixes an issue where multiple producers would emit stats of other producers causing the same stats to be published several times (as many times as a number of producers). This could cause invalid reporting for multi-kafka setups.
|
8
|
+
- Fixes a bug where emitted statistics would contain their first value as the first delta value for first stats emitted.
|
9
|
+
- Fixes a bug where decorated statistics would include a delta for a root field with non-numeric values.
|
10
|
+
|
11
|
+
### Changes and features
|
12
|
+
- Introduces support for error callbacks instrumentation notifications with `error.emitted` monitor emitted key for tracking background errors that would occur on the producer (disconnects, etc).
|
13
|
+
- Removes the `:producer` key from `statistics.emitted` and replaces it with `:producer_id` not to inject whole producer into the payload
|
14
|
+
- Removes the `:producer` key from `message.acknowledged` and replaces it with `:producer_id` not to inject whole producer into the payload
|
15
|
+
- Cleanup and refactor of callbacks support to simplify the API and make it work with Rdkafka way of things.
|
16
|
+
- Introduces a callbacks manager concept that will also be within in Karafka `2.0` for both statistics and errors tracking per client.
|
17
|
+
- Sets default Kafka `client.id` to `waterdrop` when not set.
|
18
|
+
- Updates specs to always emit statistics for better test coverage.
|
19
|
+
- Adds statistics and errors integration specs running against Kafka.
|
20
|
+
- Replaces direct `RSpec.describe` reference with auto-discovery
|
21
|
+
- Patches `rdkafka` to provide functionalities that are needed for granular callback support.
|
22
|
+
|
23
|
+
## 2.0.4 (2021-09-19)
|
24
|
+
- Update `dry-*` to the recent versions and update settings syntax to match it
|
25
|
+
- Update Zeitwerk requirement
|
26
|
+
|
27
|
+
## 2.0.3 (2021-09-05)
|
28
|
+
- Remove rdkafka patch in favour of spec topic pre-creation
|
29
|
+
- Do not close client that was never used upon closing producer
|
30
|
+
|
31
|
+
## 2.0.2 (2021-08-13)
|
32
|
+
- Add support for `partition_key`
|
33
|
+
- Switch license from `LGPL-3.0` to `MIT`
|
34
|
+
- Switch flushing on close to sync
|
35
|
+
|
3
36
|
## 2.0.1 (2021-06-05)
|
4
37
|
- Remove Ruby 2.5 support and update minimum Ruby requirement to 2.6
|
5
38
|
- Fix the `finalizer references object to be finalized` warning issued with 3.0
|
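The two statistics bug fixes in the 2.0.5 entry describe how deltas should behave: the first emission should not report its own value as a delta, and non-numeric fields should get no delta at all. The sketch below illustrates those two rules on plain hashes. It is not the code from `statistics_decorator.rb`; the class name `DeltaDecorator` and the `_d` key suffix are assumptions made for this example.

```ruby
# Hypothetical sketch of delta decoration, not WaterDrop's StatisticsDecorator.
class DeltaDecorator
  def initialize
    # Raw statistics from the previous emission (empty before the first one)
    @previous = {}
  end

  # Returns a copy of `current` where every numeric value gets a "<key>_d" delta
  def call(current)
    decorated = decorate(@previous, current)
    @previous = current
    decorated
  end

  private

  def decorate(previous, current)
    current.each_with_object({}) do |(key, value), result|
      case value
      when Hash
        nested_previous = previous[key].is_a?(Hash) ? previous[key] : {}
        result[key] = decorate(nested_previous, value)
      when Numeric
        result[key] = value
        # No previous numeric value (first emission) means a delta of 0
        result["#{key}_d"] = previous[key].is_a?(Numeric) ? value - previous[key] : 0
      else
        # Non-numeric values (names, states) are copied without a delta
        result[key] = value
      end
    end
  end
end

decorator = DeltaDecorator.new
first = decorator.call('name' => 'producer-1', 'txmsgs' => 10, 'brokers' => { 'b1' => { 'tx' => 4 } })
second = decorator.call('name' => 'producer-1', 'txmsgs' => 25, 'brokers' => { 'b1' => { 'tx' => 9 } })

puts first['txmsgs_d']               # 0
puts second['txmsgs_d']              # 15
puts second['brokers']['b1']['tx_d'] # 5
```

Keeping one decorator instance per producer, as in this sketch, is also one way to avoid mixing the statistics of several producers.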
data/Gemfile.lock
CHANGED
@@ -1,18 +1,18 @@
|
|
1
1
|
PATH
|
2
2
|
remote: .
|
3
3
|
specs:
|
4
|
-
waterdrop (2.0.
|
4
|
+
waterdrop (2.0.5)
|
5
5
|
concurrent-ruby (>= 1.1)
|
6
|
-
dry-configurable (~> 0.
|
7
|
-
dry-monitor (~> 0.
|
8
|
-
dry-validation (~> 1.
|
9
|
-
rdkafka (>= 0.
|
10
|
-
zeitwerk (~> 2.
|
6
|
+
dry-configurable (~> 0.13)
|
7
|
+
dry-monitor (~> 0.5)
|
8
|
+
dry-validation (~> 1.7)
|
9
|
+
rdkafka (>= 0.10)
|
10
|
+
zeitwerk (~> 2.3)
|
11
11
|
|
12
12
|
GEM
|
13
13
|
remote: https://rubygems.org/
|
14
14
|
specs:
|
15
|
-
activesupport (6.1.
|
15
|
+
activesupport (6.1.4.1)
|
16
16
|
concurrent-ruby (~> 1.0, >= 1.0.2)
|
17
17
|
i18n (>= 1.6, < 2)
|
18
18
|
minitest (>= 5.1)
|
@@ -22,30 +22,29 @@ GEM
|
|
22
22
|
concurrent-ruby (1.1.9)
|
23
23
|
diff-lcs (1.4.4)
|
24
24
|
docile (1.4.0)
|
25
|
-
dry-configurable (0.
|
25
|
+
dry-configurable (0.13.0)
|
26
26
|
concurrent-ruby (~> 1.0)
|
27
|
-
dry-core (~> 0.
|
28
|
-
dry-container (0.
|
27
|
+
dry-core (~> 0.6)
|
28
|
+
dry-container (0.9.0)
|
29
29
|
concurrent-ruby (~> 1.0)
|
30
|
-
dry-configurable (~> 0.
|
31
|
-
dry-core (0.
|
30
|
+
dry-configurable (~> 0.13, >= 0.13.0)
|
31
|
+
dry-core (0.7.1)
|
32
32
|
concurrent-ruby (~> 1.0)
|
33
|
-
dry-equalizer (0.3.0)
|
34
33
|
dry-events (0.3.0)
|
35
34
|
concurrent-ruby (~> 1.0)
|
36
35
|
dry-core (~> 0.5, >= 0.5)
|
37
|
-
dry-inflector (0.2.
|
36
|
+
dry-inflector (0.2.1)
|
38
37
|
dry-initializer (3.0.4)
|
39
38
|
dry-logic (1.2.0)
|
40
39
|
concurrent-ruby (~> 1.0)
|
41
40
|
dry-core (~> 0.5, >= 0.5)
|
42
|
-
dry-monitor (0.
|
43
|
-
dry-configurable (~> 0.
|
41
|
+
dry-monitor (0.5.0)
|
42
|
+
dry-configurable (~> 0.13, >= 0.13.0)
|
44
43
|
dry-core (~> 0.5, >= 0.5)
|
45
44
|
dry-events (~> 0.2)
|
46
|
-
dry-schema (1.
|
45
|
+
dry-schema (1.8.0)
|
47
46
|
concurrent-ruby (~> 1.0)
|
48
|
-
dry-configurable (~> 0.
|
47
|
+
dry-configurable (~> 0.13, >= 0.13.0)
|
49
48
|
dry-core (~> 0.5, >= 0.5)
|
50
49
|
dry-initializer (~> 3.0)
|
51
50
|
dry-logic (~> 1.0)
|
@@ -56,25 +55,24 @@ GEM
|
|
56
55
|
dry-core (~> 0.5, >= 0.5)
|
57
56
|
dry-inflector (~> 0.1, >= 0.1.2)
|
58
57
|
dry-logic (~> 1.0, >= 1.0.2)
|
59
|
-
dry-validation (1.
|
58
|
+
dry-validation (1.7.0)
|
60
59
|
concurrent-ruby (~> 1.0)
|
61
60
|
dry-container (~> 0.7, >= 0.7.1)
|
62
|
-
dry-core (~> 0.
|
63
|
-
dry-equalizer (~> 0.2)
|
61
|
+
dry-core (~> 0.5, >= 0.5)
|
64
62
|
dry-initializer (~> 3.0)
|
65
|
-
dry-schema (~> 1.
|
63
|
+
dry-schema (~> 1.8, >= 1.8.0)
|
66
64
|
factory_bot (6.2.0)
|
67
65
|
activesupport (>= 5.0.0)
|
68
|
-
ffi (1.15.
|
69
|
-
i18n (1.8.
|
66
|
+
ffi (1.15.4)
|
67
|
+
i18n (1.8.11)
|
70
68
|
concurrent-ruby (~> 1.0)
|
71
|
-
mini_portile2 (2.
|
69
|
+
mini_portile2 (2.7.1)
|
72
70
|
minitest (5.14.4)
|
73
|
-
rake (13.0.
|
74
|
-
rdkafka (0.
|
75
|
-
ffi (~> 1.
|
76
|
-
mini_portile2 (~> 2.
|
77
|
-
rake (
|
71
|
+
rake (13.0.6)
|
72
|
+
rdkafka (0.11.0)
|
73
|
+
ffi (~> 1.15)
|
74
|
+
mini_portile2 (~> 2.7)
|
75
|
+
rake (> 12)
|
78
76
|
rspec (3.10.0)
|
79
77
|
rspec-core (~> 3.10.0)
|
80
78
|
rspec-expectations (~> 3.10.0)
|
@@ -87,7 +85,7 @@ GEM
|
|
87
85
|
rspec-mocks (3.10.2)
|
88
86
|
diff-lcs (>= 1.2.0, < 2.0)
|
89
87
|
rspec-support (~> 3.10.0)
|
90
|
-
rspec-support (3.10.
|
88
|
+
rspec-support (3.10.3)
|
91
89
|
simplecov (0.21.2)
|
92
90
|
docile (~> 1.1)
|
93
91
|
simplecov-html (~> 0.11)
|
@@ -96,11 +94,10 @@ GEM
|
|
96
94
|
simplecov_json_formatter (0.1.3)
|
97
95
|
tzinfo (2.0.4)
|
98
96
|
concurrent-ruby (~> 1.0)
|
99
|
-
zeitwerk (2.
|
97
|
+
zeitwerk (2.5.1)
|
100
98
|
|
101
99
|
PLATFORMS
|
102
100
|
x86_64-darwin
|
103
|
-
x86_64-darwin-19
|
104
101
|
x86_64-linux
|
105
102
|
|
106
103
|
DEPENDENCIES
|
@@ -112,4 +109,4 @@ DEPENDENCIES
|
|
112
109
|
waterdrop!
|
113
110
|
|
114
111
|
BUNDLED WITH
|
115
|
-
2.2.
|
112
|
+
2.2.31
|
data/MIT-LICENSE
ADDED
@@ -0,0 +1,18 @@
|
|
1
|
+
Permission is hereby granted, free of charge, to any person obtaining
|
2
|
+
a copy of this software and associated documentation files (the
|
3
|
+
"Software"), to deal in the Software without restriction, including
|
4
|
+
without limitation the rights to use, copy, modify, merge, publish,
|
5
|
+
distribute, sublicense, and/or sell copies of the Software, and to
|
6
|
+
permit persons to whom the Software is furnished to do so, subject to
|
7
|
+
the following conditions:
|
8
|
+
|
9
|
+
The above copyright notice and this permission notice shall be
|
10
|
+
included in all copies or substantial portions of the Software.
|
11
|
+
|
12
|
+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
|
13
|
+
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
|
14
|
+
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
|
15
|
+
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
|
16
|
+
LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
|
17
|
+
OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
|
18
|
+
WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
|
data/README.md
CHANGED
@@ -4,11 +4,11 @@
|
|
4
4
|
|
5
5
|
WaterDrop `2.0` does **not** work with Karafka `1.*` and aims to either work as a standalone producer outside of Karafka `1.*` ecosystem or as a part of not yet released Karafka `2.0.*`.
|
6
6
|
|
7
|
-
Please refer to [this](https://github.com/karafka/waterdrop/tree/1.4) branch and
|
7
|
+
Please refer to [this](https://github.com/karafka/waterdrop/tree/1.4) branch and its documentation for details about WaterDrop `1.*` usage.
|
8
8
|
|
9
9
|
[![Build Status](https://github.com/karafka/waterdrop/workflows/ci/badge.svg)](https://github.com/karafka/waterdrop/actions?query=workflow%3Aci)
|
10
10
|
[![Gem Version](https://badge.fury.io/rb/waterdrop.svg)](http://badge.fury.io/rb/waterdrop)
|
11
|
-
[![Join the chat at https://
|
11
|
+
[![Join the chat at https://slack.karafka.io](https://raw.githubusercontent.com/karafka/misc/master/slack.svg)](https://slack.karafka.io)
|
12
12
|
|
13
13
|
Gem used to send messages to Kafka in an easy way with an extra validation layer. It is a part of the [Karafka](https://github.com/karafka/karafka) ecosystem.
|
14
14
|
|
@@ -20,7 +20,24 @@ It:
|
|
20
20
|
- Supports buffering
|
21
21
|
- Supports producing messages to multiple clusters
|
22
22
|
- Supports multiple delivery policies
|
23
|
-
- Works with Kafka 1.0+ and Ruby 2.
|
23
|
+
- Works with Kafka 1.0+ and Ruby 2.6+
|
24
|
+
|
25
|
+
## Table of contents
|
26
|
+
|
27
|
+
- [Installation](#installation)
|
28
|
+
- [Setup](#setup)
|
29
|
+
* [WaterDrop configuration options](#waterdrop-configuration-options)
|
30
|
+
* [Kafka configuration options](#kafka-configuration-options)
|
31
|
+
- [Usage](#usage)
|
32
|
+
* [Basic usage](#basic-usage)
|
33
|
+
* [Buffering](#buffering)
|
34
|
+
+ [Using WaterDrop to buffer messages based on the application logic](#using-waterdrop-to-buffer-messages-based-on-the-application-logic)
|
35
|
+
+ [Using WaterDrop with rdkafka buffers to achieve periodic auto-flushing](#using-waterdrop-with-rdkafka-buffers-to-achieve-periodic-auto-flushing)
|
36
|
+
- [Instrumentation](#instrumentation)
|
37
|
+
* [Usage statistics](#usage-statistics)
|
38
|
+
* [Error notifications](#error-notifications)
|
39
|
+
* [Forking and potential memory problems](#forking-and-potential-memory-problems)
|
40
|
+
- [Note on contributions](#note-on-contributions)
|
24
41
|
|
25
42
|
## Installation
|
26
43
|
|
@@ -44,8 +61,8 @@ bundle install
|
|
44
61
|
|
45
62
|
WaterDrop is a complex tool, that contains multiple configuration options. To keep everything organized, all the configuration options were divided into two groups:
|
46
63
|
|
47
|
-
- WaterDrop options - options directly related to
|
48
|
-
- Kafka driver options - options related to `
|
64
|
+
- WaterDrop options - options directly related to WaterDrop and its components
|
65
|
+
- Kafka driver options - options related to `rdkafka`
|
49
66
|
|
50
67
|
To apply all those configuration options, you need to create a producer instance and use the ```#setup``` method:
|
51
68
|
|
@@ -131,20 +148,25 @@ Each message that you want to publish, will have its value checked.
|
|
131
148
|
|
132
149
|
Here are all the things you can provide in the message hash:
|
133
150
|
|
134
|
-
| Option
|
135
|
-
|
136
|
-
| `topic`
|
137
|
-
| `payload`
|
138
|
-
| `key`
|
139
|
-
| `partition`
|
140
|
-
| `
|
141
|
-
| `
|
151
|
+
| Option | Required | Value type | Description |
|
152
|
+
|-----------------|----------|---------------|----------------------------------------------------------|
|
153
|
+
| `topic` | true | String | The Kafka topic that should be written to |
|
154
|
+
| `payload` | true | String | Data you want to send to Kafka |
|
155
|
+
| `key` | false | String | The key that should be set in the Kafka message |
|
156
|
+
| `partition` | false | Integer | A specific partition number that should be written to |
|
157
|
+
| `partition_key` | false | String | Key to indicate the destination partition of the message |
|
158
|
+
| `timestamp` | false | Time, Integer | The timestamp that should be set on the message |
|
159
|
+
| `headers` | false | Hash | Headers for the message |
|
142
160
|
|
143
161
|
Keep in mind, that message you want to send should be either binary or stringified (to_s, to_json, etc).
|
144
162
|
|
145
163
|
### Buffering
|
146
164
|
|
147
|
-
WaterDrop producers support buffering
|
165
|
+
WaterDrop producers support buffering messages in their internal buffers and on the `rdkafka` level via `queue.buffering.*` set of settings.
|
166
|
+
|
167
|
+
This means that depending on your use case, you can achieve both granular buffering and flushing control when needed with context awareness and periodic and size-based flushing functionalities.
|
168
|
+
|
169
|
+
#### Using WaterDrop to buffer messages based on the application logic
|
148
170
|
|
149
171
|
```ruby
|
150
172
|
producer = WaterDrop::Producer.new
|
@@ -153,16 +175,41 @@ producer.setup do |config|
|
|
153
175
|
config.kafka = { 'bootstrap.servers': 'localhost:9092' }
|
154
176
|
end
|
155
177
|
|
156
|
-
|
178
|
+
# Simulating some events states of a transaction - notice, that the messages will be flushed to
|
179
|
+
# kafka only upon arrival of the `finished` state.
|
180
|
+
%w[
|
181
|
+
started
|
182
|
+
processed
|
183
|
+
finished
|
184
|
+
].each do |state|
|
185
|
+
producer.buffer(topic: 'events', payload: state)
|
186
|
+
|
187
|
+
puts "The messages buffer size #{producer.messages.size}"
|
188
|
+
producer.flush_sync if state == 'finished'
|
189
|
+
puts "The messages buffer size #{producer.messages.size}"
|
190
|
+
end
|
191
|
+
|
192
|
+
producer.close
|
193
|
+
```
|
194
|
+
|
195
|
+
#### Using WaterDrop with rdkafka buffers to achieve periodic auto-flushing
|
196
|
+
|
197
|
+
```ruby
|
198
|
+
producer = WaterDrop::Producer.new
|
157
199
|
|
158
|
-
|
159
|
-
|
160
|
-
|
200
|
+
producer.setup do |config|
|
201
|
+
config.kafka = {
|
202
|
+
'bootstrap.servers': 'localhost:9092',
|
203
|
+
# Accumulate messages for at most 10 seconds
|
204
|
+
'queue.buffering.max.ms' => 10_000
|
205
|
+
}
|
161
206
|
end
|
162
207
|
|
163
|
-
|
164
|
-
|
165
|
-
|
208
|
+
# WaterDrop will flush messages minimum once every 10 seconds
|
209
|
+
30.times do |i|
|
210
|
+
producer.produce_async(topic: 'events', payload: i.to_s)
|
211
|
+
sleep(1)
|
212
|
+
end
|
166
213
|
|
167
214
|
producer.close
|
168
215
|
```
|
@@ -241,25 +288,46 @@ producer.close
|
|
241
288
|
|
242
289
|
Note: The metrics returned may not be completely consistent between brokers, toppars and totals, due to the internal asynchronous nature of librdkafka. E.g., the top level tx total may be less than the sum of the broker tx values which it represents.
|
243
290
|
|
291
|
+
### Error notifications
|
292
|
+
|
293
|
+
Aside from errors related to publishing messages like `buffer.flushed_async.error`, WaterDrop allows you to listen to errors that occur in its internal background threads. Things like reconnecting to Kafka upon network errors and others unrelated to publishing messages are all available under `error.emitted` notification key. You can subscribe to this event to ensure your setup is healthy and without any problems that would otherwise go unnoticed as long as messages are delivered.
|
294
|
+
|
295
|
+
```ruby
|
296
|
+
producer = WaterDrop::Producer.new do |config|
|
297
|
+
# Note invalid connection port...
|
298
|
+
config.kafka = { 'bootstrap.servers': 'localhost:9090' }
|
299
|
+
end
|
300
|
+
|
301
|
+
producer.monitor.subscribe('error.emitted') do |event|
|
302
|
+
error = event[:error]
|
303
|
+
|
304
|
+
p "Internal error occurred: #{error}"
|
305
|
+
end
|
306
|
+
|
307
|
+
# Run this code without Kafka cluster
|
308
|
+
loop do
|
309
|
+
producer.produce_async(topic: 'events', payload: 'data')
|
310
|
+
|
311
|
+
sleep(1)
|
312
|
+
end
|
313
|
+
|
314
|
+
# After you stop your Kafka cluster, you will see a lot of those:
|
315
|
+
#
|
316
|
+
# Internal error occurred: Local: Broker transport failure (transport)
|
317
|
+
#
|
318
|
+
# Internal error occurred: Local: Broker transport failure (transport)
|
319
|
+
```
|
320
|
+
|
244
321
|
### Forking and potential memory problems
|
245
322
|
|
246
323
|
If you work with forked processes, make sure you **don't** use the producer before the fork. You can easily configure the producer and then fork and use it.
|
247
324
|
|
248
325
|
To tackle this [obstacle](https://github.com/appsignal/rdkafka-ruby/issues/15) related to rdkafka, WaterDrop adds finalizer to each of the producers to close the rdkafka client before the Ruby process is shutdown. Due to the [nature of the finalizers](https://www.mikeperham.com/2010/02/24/the-trouble-with-ruby-finalizers/), this implementation prevents producers from being GCed (except upon VM shutdown) and can cause memory leaks if you don't use persistent/long-lived producers in a long-running process or if you don't use the `#close` method of a producer when it is no longer needed. Creating a producer instance for each message is anyhow a rather bad idea, so we recommend not to.
|
249
326
|
|
250
|
-
## References
|
251
|
-
|
252
|
-
* [WaterDrop code documentation](https://www.rubydoc.info/github/karafka/waterdrop)
|
253
|
-
* [Karafka framework](https://github.com/karafka/karafka)
|
254
|
-
* [WaterDrop Actions CI](https://github.com/karafka/waterdrop/actions?query=workflow%3Ac)
|
255
|
-
* [WaterDrop Coditsu](https://app.coditsu.io/karafka/repositories/waterdrop)
|
256
|
-
|
257
327
|
## Note on contributions
|
258
328
|
|
259
|
-
First, thank you for considering contributing to
|
260
|
-
|
261
|
-
Each pull request must pass all the RSpec specs and meet our quality requirements.
|
329
|
+
First, thank you for considering contributing to the Karafka ecosystem! It's people like you that make the open source community such a great community!
|
262
330
|
|
263
|
-
|
331
|
+
Each pull request must pass all the RSpec specs, integration tests and meet our quality requirements.
|
264
332
|
|
265
|
-
|
333
|
+
Fork it, update and wait for the Github Actions results.
|
data/certs/mensfeld.pem
CHANGED
@@ -1,25 +1,25 @@
|
|
1
1
|
-----BEGIN CERTIFICATE-----
|
2
2
|
MIIEODCCAqCgAwIBAgIBATANBgkqhkiG9w0BAQsFADAjMSEwHwYDVQQDDBhtYWNp
|
3
|
-
|
4
|
-
|
5
|
-
|
6
|
-
|
7
|
-
|
8
|
-
|
9
|
-
|
10
|
-
|
11
|
-
|
12
|
-
|
13
|
-
|
14
|
-
|
3
|
+
ZWovREM9bWVuc2ZlbGQvREM9cGwwHhcNMjEwODExMTQxNTEzWhcNMjIwODExMTQx
|
4
|
+
NTEzWjAjMSEwHwYDVQQDDBhtYWNpZWovREM9bWVuc2ZlbGQvREM9cGwwggGiMA0G
|
5
|
+
CSqGSIb3DQEBAQUAA4IBjwAwggGKAoIBgQDV2jKH4Ti87GM6nyT6D+ESzTI0MZDj
|
6
|
+
ak2/TEwnxvijMJyCCPKT/qIkbW4/f0VHM4rhPr1nW73sb5SZBVFCLlJcOSKOBdUY
|
7
|
+
TMY+SIXN2EtUaZuhAOe8LxtxjHTgRHvHcqUQMBENXTISNzCo32LnUxweu66ia4Pd
|
8
|
+
1mNRhzOqNv9YiBZvtBf7IMQ+sYdOCjboq2dlsWmJiwiDpY9lQBTnWORnT3mQxU5x
|
9
|
+
vPSwnLB854cHdCS8fQo4DjeJBRZHhEbcE5sqhEMB3RZA3EtFVEXOxlNxVTS3tncI
|
10
|
+
qyNXiWDaxcipaens4ObSY1C2HTV7OWb7OMqSCIybeYTSfkaSdqmcl4S6zxXkjH1J
|
11
|
+
tnjayAVzD+QVXGijsPLE2PFnJAh9iDET2cMsjabO1f6l1OQNyAtqpcyQcgfnyW0z
|
12
|
+
g7tGxTYD+6wJHffM9d9txOUw6djkF6bDxyqB8lo4Z3IObCx18AZjI9XPS9QG7w6q
|
13
|
+
LCWuMG2lkCcRgASqaVk9fEf9yMc2xxz5o3kCAwEAAaN3MHUwCQYDVR0TBAIwADAL
|
14
|
+
BgNVHQ8EBAMCBLAwHQYDVR0OBBYEFBqUFCKCOe5IuueUVqOB991jyCLLMB0GA1Ud
|
15
15
|
EQQWMBSBEm1hY2llakBtZW5zZmVsZC5wbDAdBgNVHRIEFjAUgRJtYWNpZWpAbWVu
|
16
|
-
|
17
|
-
|
18
|
-
|
19
|
-
|
20
|
-
|
21
|
-
|
22
|
-
|
23
|
-
|
24
|
-
|
16
|
+
c2ZlbGQucGwwDQYJKoZIhvcNAQELBQADggGBADD0/UuTTFgW+CGk2U0RDw2RBOca
|
17
|
+
W2LTF/G7AOzuzD0Tc4voc7WXyrgKwJREv8rgBimLnNlgmFJLmtUCh2U/MgxvcilH
|
18
|
+
yshYcbseNvjkrtYnLRlWZR4SSB6Zei5AlyGVQLPkvdsBpNegcG6w075YEwzX/38a
|
19
|
+
8V9B/Yri2OGELBz8ykl7BsXUgNoUPA/4pHF6YRLz+VirOaUIQ4JfY7xGj6fSOWWz
|
20
|
+
/rQ/d77r6o1mfJYM/3BRVg73a3b7DmRnE5qjwmSaSQ7u802pJnLesmArch0xGCT/
|
21
|
+
fMmRli1Qb+6qOTl9mzD6UDMAyFR4t6MStLm0mIEqM0nBO5nUdUWbC7l9qXEf8XBE
|
22
|
+
2DP28p3EqSuS+lKbAWKcqv7t0iRhhmaod+Yn9mcrLN1sa3q3KSQ9BCyxezCD4Mk2
|
23
|
+
R2P11bWoCtr70BsccVrN8jEhzwXngMyI2gVt750Y+dbTu1KgRqZKp/ECe7ZzPzXj
|
24
|
+
pIy9vHxTANKYVyI4qj8OrFdEM5BQNu8oQpL0iQ==
|
25
25
|
-----END CERTIFICATE-----
|
data/docker-compose.yml
CHANGED
@@ -5,7 +5,7 @@ services:
|
|
5
5
|
ports:
|
6
6
|
- "2181:2181"
|
7
7
|
kafka:
|
8
|
-
image: wurstmeister/kafka:
|
8
|
+
image: wurstmeister/kafka:2.12-2.5.0
|
9
9
|
ports:
|
10
10
|
- "9092:9092"
|
11
11
|
environment:
|
@@ -13,5 +13,6 @@ services:
|
|
13
13
|
KAFKA_ADVERTISED_PORT: 9092
|
14
14
|
KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
|
15
15
|
KAFKA_AUTO_CREATE_TOPICS_ENABLE: 'true'
|
16
|
+
KAFKA_CREATE_TOPICS: 'example_topic:1:1'
|
16
17
|
volumes:
|
17
18
|
- /var/run/docker.sock:/var/run/docker.sock
|
data/lib/water_drop/config.rb
CHANGED
@@ -7,33 +7,52 @@ module WaterDrop
|
|
7
7
|
class Config
|
8
8
|
include Dry::Configurable
|
9
9
|
|
10
|
+
# Defaults for kafka settings, that will be overwritten only if not present already
|
11
|
+
KAFKA_DEFAULTS = {
|
12
|
+
'client.id' => 'waterdrop'
|
13
|
+
}.freeze
|
14
|
+
|
15
|
+
private_constant :KAFKA_DEFAULTS
|
16
|
+
|
10
17
|
# WaterDrop options
|
11
18
|
#
|
12
19
|
# option [String] id of the producer. This can be helpful when building producer specific
|
13
20
|
# instrumentation or loggers. It is not the kafka producer id
|
14
|
-
setting(
|
21
|
+
setting(
|
22
|
+
:id,
|
23
|
+
default: false,
|
24
|
+
constructor: ->(id) { id || SecureRandom.uuid }
|
25
|
+
)
|
15
26
|
# option [Instance] logger that we want to use
|
16
27
|
# @note Due to how rdkafka works, this setting is global for all the producers
|
17
|
-
setting(
|
28
|
+
setting(
|
29
|
+
:logger,
|
30
|
+
default: false,
|
31
|
+
constructor: ->(logger) { logger || Logger.new($stdout, level: Logger::WARN) }
|
32
|
+
)
|
18
33
|
# option [Instance] monitor that we want to use. See instrumentation part of the README for
|
19
34
|
# more details
|
20
|
-
setting(
|
35
|
+
setting(
|
36
|
+
:monitor,
|
37
|
+
default: false,
|
38
|
+
constructor: ->(monitor) { monitor || WaterDrop::Instrumentation::Monitor.new }
|
39
|
+
)
|
21
40
|
# option [Integer] max payload size allowed for delivery to Kafka
|
22
|
-
setting :max_payload_size, 1_000_012
|
41
|
+
setting :max_payload_size, default: 1_000_012
|
23
42
|
# option [Integer] Wait that long for the delivery report or raise an error if this takes
|
24
43
|
# longer than the timeout.
|
25
|
-
setting :max_wait_timeout, 5
|
44
|
+
setting :max_wait_timeout, default: 5
|
26
45
|
# option [Numeric] how long should we wait between re-checks on the availability of the
|
27
46
|
# delivery report. In a really robust systems, this describes the min-delivery time
|
28
47
|
# for a single sync message when produced in isolation
|
29
|
-
setting :wait_timeout, 0.005 # 5 milliseconds
|
48
|
+
setting :wait_timeout, default: 0.005 # 5 milliseconds
|
30
49
|
# option [Boolean] should we send messages. Setting this to false can be really useful when
|
31
50
|
# testing and or developing because when set to false, won't actually ping Kafka but will
|
32
51
|
# run all the validations, etc
|
33
|
-
setting :deliver, true
|
52
|
+
setting :deliver, default: true
|
34
53
|
# rdkafka options
|
35
54
|
# @see https://github.com/edenhill/librdkafka/blob/master/CONFIGURATION.md
|
36
|
-
setting :kafka, {}
|
55
|
+
setting :kafka, default: {}
|
37
56
|
|
38
57
|
# Configuration method
|
39
58
|
# @yield Runs a block of code providing a config singleton instance to it
|
@@ -41,12 +60,28 @@ module WaterDrop
|
|
41
60
|
def setup
|
42
61
|
configure do |config|
|
43
62
|
yield(config)
|
63
|
+
|
64
|
+
merge_kafka_defaults!(config)
|
44
65
|
validate!(config.to_h)
|
66
|
+
|
67
|
+
::Rdkafka::Config.logger = config.logger
|
45
68
|
end
|
46
69
|
end
|
47
70
|
|
48
71
|
private
|
49
72
|
|
73
|
+
# Propagates the kafka setting defaults unless they are already present
|
74
|
+
# This makes it easier to set some values that users usually don't change but still allows them
|
75
|
+
# to overwrite the whole hash if they want to
|
76
|
+
# @param config [Dry::Configurable::Config] dry config of this producer
|
77
|
+
def merge_kafka_defaults!(config)
|
78
|
+
KAFKA_DEFAULTS.each do |key, value|
|
79
|
+
next if config.kafka.key?(key)
|
80
|
+
|
81
|
+
config.kafka[key] = value
|
82
|
+
end
|
83
|
+
end
|
84
|
+
|
50
85
|
# Validates the configuration and if anything is wrong, will raise an exception
|
51
86
|
# @param config_hash [Hash] config hash with setup details
|
52
87
|
# @raise [WaterDrop::Errors::ConfigurationInvalidError] raised when something is wrong with
|
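The `merge_kafka_defaults!` method added above fills in a default only when the key is missing from the user's `kafka` hash. The sketch below reproduces that rule on a plain Hash so it can be run without the gem. The free-standing method `merge_kafka_defaults` is a hypothetical stand-in that takes the hash directly instead of a Dry config object.

```ruby
# Same default as in the diff above
KAFKA_DEFAULTS = {
  'client.id' => 'waterdrop'
}.freeze

# Hypothetical stand-in for Config#merge_kafka_defaults!, working on a plain Hash
def merge_kafka_defaults(kafka)
  KAFKA_DEFAULTS.each do |key, value|
    # A value provided by the user always wins over the default
    next if kafka.key?(key)

    kafka[key] = value
  end

  kafka
end

p merge_kafka_defaults({})
p merge_kafka_defaults('client.id' => 'my-app')
# String and symbol keys are different Hash keys in Ruby
p merge_kafka_defaults('client.id': 'my-app')
```

The last call shows a detail worth keeping in mind: the default is stored under the String key `'client.id'`, while the README examples write settings with symbol keys (`'bootstrap.servers': ...`). In a plain Ruby Hash these do not collide, so a symbol-keyed `client.id` would sit next to the String-keyed default. How the producer treats such a hash is not shown in this diff.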
data/lib/water_drop/contracts/message.rb
CHANGED
@@ -22,6 +22,7 @@ module WaterDrop
|
|
22
22
|
required(:payload).filled(:str?)
|
23
23
|
optional(:key).maybe(:str?, :filled?)
|
24
24
|
optional(:partition).filled(:int?, gteq?: -1)
|
25
|
+
optional(:partition_key).maybe(:str?, :filled?)
|
25
26
|
optional(:timestamp).maybe { time? | int? }
|
26
27
|
optional(:headers).maybe(:hash?)
|
27
28
|
end
|
data/lib/water_drop/instrumentation/callbacks/delivery.rb
ADDED
@@ -0,0 +1,30 @@
|
|
1
|
+
# frozen_string_literal: true
|
2
|
+
|
3
|
+
module WaterDrop
|
4
|
+
module Instrumentation
|
5
|
+
module Callbacks
|
6
|
+
# Creates a callable that we want to run upon each message delivery or failure
|
7
|
+
#
|
8
|
+
# @note We don't have to provide client_name here as this callback is per client instance
|
9
|
+
class Delivery
|
10
|
+
# @param producer_id [String] id of the current producer
|
11
|
+
# @param monitor [WaterDrop::Instrumentation::Monitor] monitor we are using
|
12
|
+
def initialize(producer_id, monitor)
|
13
|
+
@producer_id = producer_id
|
14
|
+
@monitor = monitor
|
15
|
+
end
|
16
|
+
|
17
|
+
# Emits delivery details to the monitor
|
18
|
+
# @param delivery_report [Rdkafka::Producer::DeliveryReport] delivery report
|
19
|
+
def call(delivery_report)
|
20
|
+
@monitor.instrument(
|
21
|
+
'message.acknowledged',
|
22
|
+
producer_id: @producer_id,
|
23
|
+
offset: delivery_report.offset,
|
24
|
+
partition: delivery_report.partition
|
25
|
+
)
|
26
|
+
end
|
27
|
+
end
|
28
|
+
end
|
29
|
+
end
|
30
|
+
end
|
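The `Delivery` callback above is a plain callable: it receives a delivery report and forwards `producer_id`, `offset` and `partition` to the monitor under the `message.acknowledged` key. The sketch below exercises the same shape without Kafka. `RecordingMonitor`, `FakeDeliveryReport` and `DeliveryCallback` are hypothetical stand-ins written for this example; they are not WaterDrop or rdkafka classes.

```ruby
# Hypothetical stand-in for Rdkafka::Producer::DeliveryReport
FakeDeliveryReport = Struct.new(:partition, :offset)

# Hypothetical monitor that records every instrumented event
class RecordingMonitor
  attr_reader :events

  def initialize
    @events = []
  end

  def instrument(event_id, **payload)
    @events << [event_id, payload]
  end
end

# Mirrors the structure of the Delivery callback from the diff
class DeliveryCallback
  def initialize(producer_id, monitor)
    @producer_id = producer_id
    @monitor = monitor
  end

  # Emits delivery details to the monitor
  def call(delivery_report)
    @monitor.instrument(
      'message.acknowledged',
      producer_id: @producer_id,
      offset: delivery_report.offset,
      partition: delivery_report.partition
    )
  end
end

monitor = RecordingMonitor.new
callback = DeliveryCallback.new('producer-1', monitor)
callback.call(FakeDeliveryReport.new(0, 42))

p monitor.events
```

Because the payload carries only `producer_id`, a subscriber that needs the producer itself has to look it up by that id, which matches the changelog note about replacing `:producer` with `:producer_id`.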