waterdrop 1.4.2 → 2.0.2
- checksums.yaml +4 -4
- checksums.yaml.gz.sig +0 -0
- data.tar.gz.sig +0 -0
- data/.github/workflows/ci.yml +1 -2
- data/.gitignore +2 -0
- data/.ruby-version +1 -1
- data/CHANGELOG.md +17 -5
- data/Gemfile +9 -0
- data/Gemfile.lock +42 -29
- data/{MIT-LICENCE → MIT-LICENSE} +0 -0
- data/README.md +244 -57
- data/certs/mensfeld.pem +21 -21
- data/config/errors.yml +3 -16
- data/docker-compose.yml +1 -1
- data/lib/water_drop.rb +4 -24
- data/lib/water_drop/config.rb +41 -142
- data/lib/water_drop/contracts.rb +0 -2
- data/lib/water_drop/contracts/config.rb +8 -121
- data/lib/water_drop/contracts/message.rb +42 -0
- data/lib/water_drop/errors.rb +31 -5
- data/lib/water_drop/instrumentation/monitor.rb +16 -22
- data/lib/water_drop/instrumentation/stdout_listener.rb +113 -32
- data/lib/water_drop/patches/rdkafka_producer.rb +49 -0
- data/lib/water_drop/producer.rb +143 -0
- data/lib/water_drop/producer/async.rb +51 -0
- data/lib/water_drop/producer/buffer.rb +113 -0
- data/lib/water_drop/producer/builder.rb +63 -0
- data/lib/water_drop/producer/dummy_client.rb +32 -0
- data/lib/water_drop/producer/statistics_decorator.rb +71 -0
- data/lib/water_drop/producer/status.rb +52 -0
- data/lib/water_drop/producer/sync.rb +65 -0
- data/lib/water_drop/version.rb +1 -1
- data/waterdrop.gemspec +4 -4
- metadata +44 -45
- metadata.gz.sig +0 -0
- data/lib/water_drop/async_producer.rb +0 -26
- data/lib/water_drop/base_producer.rb +0 -57
- data/lib/water_drop/config_applier.rb +0 -52
- data/lib/water_drop/contracts/message_options.rb +0 -19
- data/lib/water_drop/sync_producer.rb +0 -24
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA256:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: 24189103d7583fac3c911f4b32244655ffa2cd1b5a95abe0e73dbd4344530c5b
|
4
|
+
data.tar.gz: b4f5c0e95af559e0a0a929d14a3b56b39c4da5edc9b1d9ffe995e8b1852d9c14
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: bd49152ec9f6325cf58d9d470ed1ccfdb0c5b5d9a899bcdca289500ab56008b1153db35bf90f8e5e167692be8c79cd5bf96bbf66f9f515f97922bb217a0544d5
|
7
|
+
data.tar.gz: 32b36daf1f26227c58690fea5a6846c7599d1e8859e074ca3f2075cf0effa187cfbb4cb86a50ea22dfe169d3af54fd2e3232e14c569efcaa625f52166f227dbb
|
checksums.yaml.gz.sig
CHANGED
Binary file
|
data.tar.gz.sig
CHANGED
Binary file
|
data/.github/workflows/ci.yml
CHANGED
data/.gitignore
CHANGED
data/.ruby-version
CHANGED
@@ -1 +1 @@
|
|
1
|
-
3.0.
|
1
|
+
3.0.2
|
data/CHANGELOG.md
CHANGED
@@ -1,10 +1,22 @@
|
|
1
1
|
# WaterDrop changelog
|
2
2
|
|
3
|
-
##
|
4
|
-
-
|
5
|
-
|
6
|
-
|
7
|
-
|
3
|
+
## 2.0.2 (2021-08-13)
|
4
|
+
- Add support for `partition_key`
|
5
|
+
- Switch license from `LGPL-3.0` to `MIT`
|
6
|
+
- Switch flushing on close to sync
|
7
|
+
|
8
|
+
## 2.0.1 (2021-06-05)
|
9
|
+
- Remove Ruby 2.5 support and update minimum Ruby requirement to 2.6
|
10
|
+
- Fix the `finalizer references object to be finalized` warning issued with 3.0
|
11
|
+
|
12
|
+
## 2.0.0 (2020-12-13)
|
13
|
+
- Redesign of the whole API (see `README.md` for the use-cases and the current API)
|
14
|
+
- Replace `ruby-kafka` with `rdkafka`
|
15
|
+
- Switch license from `MIT` to `LGPL-3.0`
|
16
|
+
- #113 - Add some basic validations of the kafka scope of the config (Azdaroth)
|
17
|
+
- Global state removed
|
18
|
+
- Redesigned metrics that use `rdkafka` internal data + custom diffing
|
19
|
+
- Restore JRuby support
|
8
20
|
|
9
21
|
## 1.4.0 (2020-08-25)
|
10
22
|
- Release to match Karafka 1.4 versioning.
|
data/Gemfile
CHANGED
data/Gemfile.lock
CHANGED
@@ -1,49 +1,49 @@
|
|
1
1
|
PATH
|
2
2
|
remote: .
|
3
3
|
specs:
|
4
|
-
waterdrop (
|
5
|
-
|
4
|
+
waterdrop (2.0.2)
|
5
|
+
concurrent-ruby (>= 1.1)
|
6
6
|
dry-configurable (~> 0.8)
|
7
7
|
dry-monitor (~> 0.3)
|
8
|
-
dry-validation (~> 1.
|
9
|
-
|
8
|
+
dry-validation (~> 1.3)
|
9
|
+
rdkafka (>= 0.6.0)
|
10
10
|
zeitwerk (~> 2.1)
|
11
11
|
|
12
12
|
GEM
|
13
13
|
remote: https://rubygems.org/
|
14
14
|
specs:
|
15
|
-
|
16
|
-
|
17
|
-
|
18
|
-
|
15
|
+
activesupport (6.1.4)
|
16
|
+
concurrent-ruby (~> 1.0, >= 1.0.2)
|
17
|
+
i18n (>= 1.6, < 2)
|
18
|
+
minitest (>= 5.1)
|
19
|
+
tzinfo (~> 2.0)
|
20
|
+
zeitwerk (~> 2.3)
|
21
|
+
byebug (11.1.3)
|
22
|
+
concurrent-ruby (1.1.9)
|
19
23
|
diff-lcs (1.4.4)
|
20
|
-
|
21
|
-
rake (>= 12.0.0, < 14.0.0)
|
22
|
-
docile (1.3.5)
|
24
|
+
docile (1.4.0)
|
23
25
|
dry-configurable (0.12.1)
|
24
26
|
concurrent-ruby (~> 1.0)
|
25
27
|
dry-core (~> 0.5, >= 0.5.0)
|
26
|
-
dry-container (0.
|
28
|
+
dry-container (0.8.0)
|
27
29
|
concurrent-ruby (~> 1.0)
|
28
30
|
dry-configurable (~> 0.1, >= 0.1.3)
|
29
|
-
dry-core (0.
|
31
|
+
dry-core (0.7.1)
|
30
32
|
concurrent-ruby (~> 1.0)
|
31
33
|
dry-equalizer (0.3.0)
|
32
|
-
dry-events (0.
|
34
|
+
dry-events (0.3.0)
|
33
35
|
concurrent-ruby (~> 1.0)
|
34
|
-
dry-core (~> 0.
|
35
|
-
|
36
|
-
dry-inflector (0.2.0)
|
36
|
+
dry-core (~> 0.5, >= 0.5)
|
37
|
+
dry-inflector (0.2.1)
|
37
38
|
dry-initializer (3.0.4)
|
38
|
-
dry-logic (1.
|
39
|
+
dry-logic (1.2.0)
|
39
40
|
concurrent-ruby (~> 1.0)
|
40
41
|
dry-core (~> 0.5, >= 0.5)
|
41
|
-
dry-monitor (0.
|
42
|
+
dry-monitor (0.4.0)
|
42
43
|
dry-configurable (~> 0.5)
|
43
|
-
dry-core (~> 0.
|
44
|
-
dry-equalizer (~> 0.2)
|
44
|
+
dry-core (~> 0.5, >= 0.5)
|
45
45
|
dry-events (~> 0.2)
|
46
|
-
dry-schema (1.
|
46
|
+
dry-schema (1.7.0)
|
47
47
|
concurrent-ruby (~> 1.0)
|
48
48
|
dry-configurable (~> 0.8, >= 0.8.3)
|
49
49
|
dry-core (~> 0.5, >= 0.5)
|
@@ -63,8 +63,18 @@ GEM
|
|
63
63
|
dry-equalizer (~> 0.2)
|
64
64
|
dry-initializer (~> 3.0)
|
65
65
|
dry-schema (~> 1.5, >= 1.5.2)
|
66
|
-
|
67
|
-
|
66
|
+
factory_bot (6.2.0)
|
67
|
+
activesupport (>= 5.0.0)
|
68
|
+
ffi (1.15.3)
|
69
|
+
i18n (1.8.10)
|
70
|
+
concurrent-ruby (~> 1.0)
|
71
|
+
mini_portile2 (2.6.1)
|
72
|
+
minitest (5.14.4)
|
73
|
+
rake (13.0.6)
|
74
|
+
rdkafka (0.9.0)
|
75
|
+
ffi (~> 1.9)
|
76
|
+
mini_portile2 (~> 2.1)
|
77
|
+
rake (>= 12.3)
|
68
78
|
rspec (3.10.0)
|
69
79
|
rspec-core (~> 3.10.0)
|
70
80
|
rspec-expectations (~> 3.10.0)
|
@@ -78,24 +88,27 @@ GEM
|
|
78
88
|
diff-lcs (>= 1.2.0, < 2.0)
|
79
89
|
rspec-support (~> 3.10.0)
|
80
90
|
rspec-support (3.10.2)
|
81
|
-
ruby-kafka (1.3.0)
|
82
|
-
digest-crc
|
83
91
|
simplecov (0.21.2)
|
84
92
|
docile (~> 1.1)
|
85
93
|
simplecov-html (~> 0.11)
|
86
94
|
simplecov_json_formatter (~> 0.1)
|
87
95
|
simplecov-html (0.12.3)
|
88
|
-
simplecov_json_formatter (0.1.
|
96
|
+
simplecov_json_formatter (0.1.3)
|
97
|
+
tzinfo (2.0.4)
|
98
|
+
concurrent-ruby (~> 1.0)
|
89
99
|
zeitwerk (2.4.2)
|
90
100
|
|
91
101
|
PLATFORMS
|
92
|
-
x86_64-darwin
|
102
|
+
x86_64-darwin
|
93
103
|
x86_64-linux
|
94
104
|
|
95
105
|
DEPENDENCIES
|
106
|
+
byebug
|
107
|
+
factory_bot
|
108
|
+
rdkafka
|
96
109
|
rspec
|
97
110
|
simplecov
|
98
111
|
waterdrop!
|
99
112
|
|
100
113
|
BUNDLED WITH
|
101
|
-
2.2.
|
114
|
+
2.2.25
|
data/{MIT-LICENCE → MIT-LICENSE}
RENAMED
File without changes
|
data/README.md
CHANGED
@@ -1,17 +1,45 @@
|
|
1
1
|
# WaterDrop
|
2
2
|
|
3
|
-
|
4
|
-
[![Join the chat at https://gitter.im/karafka/karafka](https://badges.gitter.im/karafka/karafka.svg)](https://gitter.im/karafka/karafka?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge)
|
3
|
+
**Note**: Documentation presented here refers to WaterDrop `2.0.0`.
|
5
4
|
|
6
|
-
|
5
|
+
WaterDrop `2.0` does **not** work with Karafka `1.*`. It aims to work either as a standalone producer outside of the Karafka `1.*` ecosystem or as a part of the not yet released Karafka `2.0.*`.
|
6
|
+
|
7
|
+
Please refer to [this](https://github.com/karafka/waterdrop/tree/1.4) branch and its documentation for details about WaterDrop `1.*` usage.
|
7
8
|
|
8
|
-
|
9
|
+
[![Build Status](https://github.com/karafka/waterdrop/workflows/ci/badge.svg)](https://github.com/karafka/waterdrop/actions?query=workflow%3Aci)
|
10
|
+
[![Gem Version](https://badge.fury.io/rb/waterdrop.svg)](http://badge.fury.io/rb/waterdrop)
|
11
|
+
[![Join the chat at https://gitter.im/karafka/karafka](https://badges.gitter.im/karafka/karafka.svg)](https://gitter.im/karafka/karafka)
|
9
12
|
|
10
|
-
It is
|
13
|
+
Gem used to send messages to Kafka in an easy way with an extra validation layer. It is a part of the [Karafka](https://github.com/karafka/karafka) ecosystem.
|
11
14
|
|
12
|
-
|
13
|
-
|
14
|
-
-
|
15
|
+
It:
|
16
|
+
|
17
|
+
- Is thread safe
|
18
|
+
- Supports sync producing
|
19
|
+
- Supports async producing
|
20
|
+
- Supports buffering
|
21
|
+
- Supports producing messages to multiple clusters
|
22
|
+
- Supports multiple delivery policies
|
23
|
+
- Works with Kafka 1.0+ and Ruby 2.6+
|
24
|
+
|
25
|
+
## Table of contents
|
26
|
+
|
27
|
+
- [WaterDrop](#waterdrop)
|
28
|
+
* [Table of contents](#table-of-contents)
|
29
|
+
* [Installation](#installation)
|
30
|
+
* [Setup](#setup)
|
31
|
+
+ [WaterDrop configuration options](#waterdrop-configuration-options)
|
32
|
+
+ [Kafka configuration options](#kafka-configuration-options)
|
33
|
+
* [Usage](#usage)
|
34
|
+
+ [Basic usage](#basic-usage)
|
35
|
+
+ [Buffering](#buffering)
|
36
|
+
- [Using WaterDrop to buffer messages based on the application logic](#using-waterdrop-to-buffer-messages-based-on-the-application-logic)
|
37
|
+
- [Using WaterDrop with rdkafka buffers to achieve periodic auto-flushing](#using-waterdrop-with-rdkafka-buffers-to-achieve-periodic-auto-flushing)
|
38
|
+
* [Instrumentation](#instrumentation)
|
39
|
+
+ [Usage statistics](#usage-statistics)
|
40
|
+
+ [Forking and potential memory problems](#forking-and-potential-memory-problems)
|
41
|
+
* [References](#references)
|
42
|
+
* [Note on contributions](#note-on-contributions)
|
15
43
|
|
16
44
|
## Installation
|
17
45
|
|
@@ -35,83 +63,244 @@ bundle install
|
|
35
63
|
|
36
64
|
WaterDrop is a complex tool that contains multiple configuration options. To keep everything organized, the configuration options are divided into two groups:
|
37
65
|
|
38
|
-
- WaterDrop options - options directly related to
|
39
|
-
-
|
66
|
+
- WaterDrop options - options directly related to WaterDrop and its components
|
67
|
+
- Kafka driver options - options related to `rdkafka`
|
68
|
+
|
69
|
+
To apply all those configuration options, you need to create a producer instance and use the `#setup` method:
|
70
|
+
|
71
|
+
```ruby
|
72
|
+
producer = WaterDrop::Producer.new
|
73
|
+
|
74
|
+
producer.setup do |config|
|
75
|
+
config.deliver = true
|
76
|
+
config.kafka = {
|
77
|
+
'bootstrap.servers': 'localhost:9092',
|
78
|
+
'request.required.acks': 1
|
79
|
+
}
|
80
|
+
end
|
81
|
+
```
|
40
82
|
|
41
|
-
|
83
|
+
or you can do the same while initializing the producer:
|
42
84
|
|
43
85
|
```ruby
|
44
|
-
WaterDrop.
|
86
|
+
producer = WaterDrop::Producer.new do |config|
|
45
87
|
config.deliver = true
|
46
|
-
config.kafka
|
88
|
+
config.kafka = {
|
89
|
+
'bootstrap.servers': 'localhost:9092',
|
90
|
+
'request.required.acks': 1
|
91
|
+
}
|
47
92
|
end
|
48
93
|
```
|
49
94
|
|
50
95
|
### WaterDrop configuration options
|
51
96
|
|
52
|
-
| Option
|
53
|
-
|
54
|
-
|
|
55
|
-
| logger
|
56
|
-
| deliver
|
97
|
+
| Option | Description |
|
98
|
+
|--------------------|-----------------------------------------------------------------|
|
99
|
+
| `id` | id of the producer for instrumentation and logging |
|
100
|
+
| `logger` | Logger that we want to use |
|
101
|
+
| `deliver` | Should we send messages to Kafka or just fake the delivery |
|
102
|
+
| `max_wait_timeout` | Waits that long for the delivery report or raises an error |
|
103
|
+
| `wait_timeout` | Waits that long before re-check of delivery report availability |
|
57
104
|
|
58
|
-
###
|
105
|
+
### Kafka configuration options
|
59
106
|
|
60
|
-
|
107
|
+
You can create producers with different `kafka` settings. The available configuration options are documented at https://github.com/edenhill/librdkafka/blob/master/CONFIGURATION.md.
|
61
108
|
|
62
|
-
|
109
|
+
## Usage
|
110
|
+
|
111
|
+
Please refer to the [documentation](https://www.rubydoc.info/gems/waterdrop) in case you're interested in the more advanced API.
|
63
112
|
|
64
|
-
|
65
|
-
|--------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------|
|
66
|
-
| raise_on_buffer_overflow | Should we raise an exception, when messages can't be sent in an async way due to the message buffer overflow or should we just drop them |
|
67
|
-
| delivery_interval | The number of seconds between background message deliveries. Disable timer-based background deliveries by setting this to 0. |
|
68
|
-
| delivery_threshold | The number of buffered messages that will trigger a background message delivery. Disable buffer size based background deliveries by setting this to 0.|
|
69
|
-
| required_acks | The number of Kafka replicas that must acknowledge messages before they're considered as successfully written. |
|
70
|
-
| ack_timeout | A timeout executed by a broker when the client is sending messages to it. |
|
71
|
-
| max_retries | The number of retries when attempting to deliver messages. |
|
72
|
-
| retry_backoff | The number of seconds to wait after a failed attempt to send messages to a Kafka broker before retrying. |
|
73
|
-
| max_buffer_bytesize | The maximum number of bytes allowed in the buffer before new messages are rejected. |
|
74
|
-
| max_buffer_size | The maximum number of messages allowed in the buffer before new messages are rejected. |
|
75
|
-
| max_queue_size | The maximum number of messages allowed in the queue before new messages are rejected. |
|
76
|
-
| sasl_plain_username | The username used to authenticate. |
|
77
|
-
| sasl_plain_password | The password used to authenticate. |
|
113
|
+
### Basic usage
|
78
114
|
|
79
|
-
|
115
|
+
To send Kafka messages, just create a producer and use it:
|
80
116
|
|
81
117
|
```ruby
|
82
|
-
WaterDrop.
|
83
|
-
|
84
|
-
|
118
|
+
producer = WaterDrop::Producer.new
|
119
|
+
|
120
|
+
producer.setup do |config|
|
121
|
+
config.kafka = { 'bootstrap.servers': 'localhost:9092' }
|
85
122
|
end
|
123
|
+
|
124
|
+
producer.produce_sync(topic: 'my-topic', payload: 'my message')
|
125
|
+
|
126
|
+
# or for async
|
127
|
+
producer.produce_async(topic: 'my-topic', payload: 'my message')
|
128
|
+
|
129
|
+
# or in batches
|
130
|
+
producer.produce_many_sync(
|
131
|
+
[
|
132
|
+
{ topic: 'my-topic', payload: 'my message'},
|
133
|
+
{ topic: 'my-topic', payload: 'my message'}
|
134
|
+
]
|
135
|
+
)
|
136
|
+
|
137
|
+
# both sync and async
|
138
|
+
producer.produce_many_async(
|
139
|
+
[
|
140
|
+
{ topic: 'my-topic', payload: 'my message'},
|
141
|
+
{ topic: 'my-topic', payload: 'my message'}
|
142
|
+
]
|
143
|
+
)
|
144
|
+
|
145
|
+
# Don't forget to close the producer once you're done to flush the internal buffers, etc
|
146
|
+
producer.close
|
86
147
|
```
|
87
148
|
|
88
|
-
|
149
|
+
Each message that you want to publish will have its values checked.
|
150
|
+
|
151
|
+
Here are all the things you can provide in the message hash:
|
152
|
+
|
153
|
+
| Option | Required | Value type | Description |
|
154
|
+
|-----------------|----------|---------------|----------------------------------------------------------|
|
155
|
+
| `topic` | true | String | The Kafka topic that should be written to |
|
156
|
+
| `payload` | true | String | Data you want to send to Kafka |
|
157
|
+
| `key` | false | String | The key that should be set in the Kafka message |
|
158
|
+
| `partition` | false | Integer | A specific partition number that should be written to |
|
159
|
+
| `partition_key` | false | String | Key to indicate the destination partition of the message |
|
160
|
+
| `timestamp` | false | Time, Integer | The timestamp that should be set on the message |
|
161
|
+
| `headers` | false | Hash | Headers for the message |
|
89
162
|
|
90
|
-
|
163
|
+
Keep in mind that the message you want to send should be either binary or stringified (`to_s`, `to_json`, etc.).
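As a rough illustration of the kind of checks the table above implies, here is a minimal, dependency-free sketch. It is not WaterDrop's own contract (that lives in `lib/water_drop/contracts/message.rb` and is not reproduced here); the `MESSAGE_RULES` constant and the `message_errors` helper are hypothetical names introduced only for this example.

```ruby
# Hypothetical validator mirroring the options table above.
# This is an illustration only, not WaterDrop's actual message contract.
MESSAGE_RULES = {
  topic:         { required: true,  types: [String] },
  payload:       { required: true,  types: [String] },
  key:           { required: false, types: [String] },
  partition:     { required: false, types: [Integer] },
  partition_key: { required: false, types: [String] },
  timestamp:     { required: false, types: [Time, Integer] },
  headers:       { required: false, types: [Hash] }
}.freeze

# Returns an array of human-readable errors; empty when the message looks valid.
def message_errors(message)
  errors = []

  MESSAGE_RULES.each do |name, rule|
    if !message.key?(name)
      # Only required options are reported when absent
      errors << "#{name} is missing" if rule[:required]
    elsif rule[:types].none? { |type| message[name].is_a?(type) }
      errors << "#{name} must be #{rule[:types].join(' or ')}"
    end
  end

  errors
end

p message_errors(topic: 'my-topic', payload: 'my message')
p message_errors(topic: 'my-topic', payload: { id: 1 }, partition: '0')
```

In this sketch a Hash payload is rejected, which matches the note above: non-string data would need to be stringified (for example with `to_json`) before being handed to the producer.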
|
164
|
+
|
165
|
+
### Buffering
|
166
|
+
|
167
|
+
WaterDrop producers support buffering messages in their internal buffers and on the `rdkafka` level via `queue.buffering.*` set of settings.
|
168
|
+
|
169
|
+
This means that, depending on your use case, you can combine granular, context-aware control over buffering and flushing with periodic and size-based flushing.
|
170
|
+
|
171
|
+
#### Using WaterDrop to buffer messages based on the application logic
|
91
172
|
|
92
173
|
```ruby
|
93
|
-
WaterDrop::
|
94
|
-
|
95
|
-
|
174
|
+
producer = WaterDrop::Producer.new
|
175
|
+
|
176
|
+
producer.setup do |config|
|
177
|
+
config.kafka = { 'bootstrap.servers': 'localhost:9092' }
|
178
|
+
end
|
179
|
+
|
180
|
+
# Simulating some event states of a transaction - note that the messages will be flushed to
|
181
|
+
# kafka only upon arrival of the `finished` state.
|
182
|
+
%w[
|
183
|
+
started
|
184
|
+
processed
|
185
|
+
finished
|
186
|
+
].each do |state|
|
187
|
+
producer.buffer(topic: 'events', payload: state)
|
188
|
+
|
189
|
+
puts "The messages buffer size #{producer.messages.size}"
|
190
|
+
producer.flush_sync if state == 'finished'
|
191
|
+
puts "The messages buffer size #{producer.messages.size}"
|
192
|
+
end
|
193
|
+
|
194
|
+
producer.close
|
96
195
|
```
|
97
196
|
|
98
|
-
|
197
|
+
#### Using WaterDrop with rdkafka buffers to achieve periodic auto-flushing
|
99
198
|
|
100
|
-
|
101
|
-
|
102
|
-
|
103
|
-
|
104
|
-
|
105
|
-
|
106
|
-
|
107
|
-
|
199
|
+
```ruby
|
200
|
+
producer = WaterDrop::Producer.new
|
201
|
+
|
202
|
+
producer.setup do |config|
|
203
|
+
config.kafka = {
|
204
|
+
'bootstrap.servers': 'localhost:9092',
|
205
|
+
# Accumulate messages for at most 10 seconds
|
206
|
+
'queue.buffering.max.ms': 10_000
|
207
|
+
}
|
208
|
+
end
|
108
209
|
|
109
|
-
|
210
|
+
# WaterDrop will flush messages at least once every 10 seconds
|
211
|
+
30.times do |i|
|
212
|
+
producer.produce_async(topic: 'events', payload: i.to_s)
|
213
|
+
sleep(1)
|
214
|
+
end
|
215
|
+
|
216
|
+
producer.close
|
217
|
+
```
|
218
|
+
|
219
|
+
## Instrumentation
|
220
|
+
|
221
|
+
After `#setup` is done, each producer has a custom monitor to which you can subscribe.
|
222
|
+
|
223
|
+
```ruby
|
224
|
+
producer = WaterDrop::Producer.new
|
225
|
+
|
226
|
+
producer.setup do |config|
|
227
|
+
config.kafka = { 'bootstrap.servers': 'localhost:9092' }
|
228
|
+
end
|
229
|
+
|
230
|
+
producer.monitor.subscribe('message.produced_async') do |event|
|
231
|
+
puts "A message was produced to '#{event[:message][:topic]}' topic!"
|
232
|
+
end
|
233
|
+
|
234
|
+
producer.produce_async(topic: 'events', payload: 'data')
|
235
|
+
|
236
|
+
producer.close
|
237
|
+
```
|
238
|
+
|
239
|
+
See `WaterDrop::Instrumentation::Monitor::EVENTS` for the list of all the supported events.
|
240
|
+
|
241
|
+
### Usage statistics
|
242
|
+
|
243
|
+
WaterDrop may be configured to emit internal metrics at a fixed interval by setting the `kafka` `statistics.interval.ms` configuration property to a value > `0`. Once that is done, emitted statistics are available after subscribing to the `statistics.emitted` publisher event.
|
244
|
+
|
245
|
+
The statistics include all of the metrics from `librdkafka` (full list [here](https://github.com/edenhill/librdkafka/blob/master/STATISTICS.md)) as well as the diff of those against the previously emitted values.
|
246
|
+
|
247
|
+
For several attributes like `txmsgs`, `librdkafka` publishes only the totals. To make it easier to track progress (for example, the number of messages sent between two `statistics.emitted` events), WaterDrop diffs all the numeric values against the previously available numbers. Each diff is available under the same key as the metric, with an additional `_d` suffix:
|
248
|
+
|
249
|
+
|
250
|
+
```ruby
|
251
|
+
producer = WaterDrop::Producer.new do |config|
|
252
|
+
config.kafka = {
|
253
|
+
'bootstrap.servers': 'localhost:9092',
|
254
|
+
'statistics.interval.ms': 2_000 # emit statistics every 2 seconds
|
255
|
+
}
|
256
|
+
end
|
257
|
+
|
258
|
+
producer.monitor.subscribe('statistics.emitted') do |event|
|
259
|
+
sum = event[:statistics]['txmsgs']
|
260
|
+
diff = event[:statistics]['txmsgs_d']
|
261
|
+
|
262
|
+
p "Sent messages: #{sum}"
|
263
|
+
p "Messages sent from last statistics report: #{diff}"
|
264
|
+
end
|
265
|
+
|
266
|
+
sleep(2)
|
267
|
+
|
268
|
+
# Sent messages: 0
|
269
|
+
# Messages sent from last statistics report: 0
|
270
|
+
|
271
|
+
20.times { producer.produce_async(topic: 'events', payload: 'data') }
|
272
|
+
|
273
|
+
# Sent messages: 20
|
274
|
+
# Messages sent from last statistics report: 20
|
275
|
+
|
276
|
+
sleep(2)
|
277
|
+
|
278
|
+
20.times { producer.produce_async(topic: 'events', payload: 'data') }
|
279
|
+
|
280
|
+
# Sent messages: 40
|
281
|
+
# Messages sent from last statistics report: 20
|
282
|
+
|
283
|
+
sleep(2)
|
284
|
+
|
285
|
+
# Sent messages: 40
|
286
|
+
# Messages sent from last statistics report: 0
|
287
|
+
|
288
|
+
producer.close
|
289
|
+
```
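To make the `_d` values more concrete, here is a small, self-contained sketch of diffing numeric statistics against a previous snapshot. It is not WaterDrop's `StatisticsDecorator`; the `diff_statistics` helper is a hypothetical name, and the sketch only assumes that statistics arrive as nested hashes with string keys and numeric leaves.

```ruby
# Hypothetical helper (not part of WaterDrop): returns a copy of `current`
# in which every numeric value gets a sibling "<key>_d" entry holding the
# difference against the same key in `previous` (0 when there is no previous value).
def diff_statistics(previous, current)
  current.each_with_object({}) do |(key, value), result|
    prev_value = previous.is_a?(Hash) ? previous[key] : nil

    case value
    when Hash
      # Recurse into nested sections such as per-broker statistics
      result[key] = diff_statistics(prev_value || {}, value)
    when Numeric
      result[key] = value
      result["#{key}_d"] = prev_value.is_a?(Numeric) ? value - prev_value : 0
    else
      # Non-numeric values (names, states) are copied as-is
      result[key] = value
    end
  end
end

first  = diff_statistics({}, { 'txmsgs' => 20, 'brokers' => { 'b1' => { 'tx' => 5 } } })
second = diff_statistics(first, { 'txmsgs' => 40, 'brokers' => { 'b1' => { 'tx' => 8 } } })

puts second['txmsgs']   # running total: 40
puts second['txmsgs_d'] # change since the previous snapshot: 20
```

Keeping the totals untouched and adding the diffs under separate keys means a subscriber can read either value without tracking state between events.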
|
290
|
+
|
291
|
+
Note: The metrics returned may not be completely consistent between brokers, toppars and totals, due to the internal asynchronous nature of librdkafka. E.g., the top level tx total may be less than the sum of the broker tx values which it represents.
|
292
|
+
|
293
|
+
### Forking and potential memory problems
|
294
|
+
|
295
|
+
If you work with forked processes, make sure you **don't** use the producer before the fork. You can easily configure the producer and then fork and use it.
|
296
|
+
|
297
|
+
To tackle this [obstacle](https://github.com/appsignal/rdkafka-ruby/issues/15) related to rdkafka, WaterDrop adds a finalizer to each of the producers to close the rdkafka client before the Ruby process shuts down. Due to the [nature of the finalizers](https://www.mikeperham.com/2010/02/24/the-trouble-with-ruby-finalizers/), this implementation prevents producers from being GCed (except upon VM shutdown) and can cause memory leaks if you don't use persistent/long-lived producers in a long-running process or if you don't call the `#close` method of a producer when it is no longer needed. Creating a producer instance for each message is a rather bad idea anyway, so we recommend against it.
|
110
298
|
|
111
299
|
## References
|
112
300
|
|
301
|
+
* [WaterDrop code documentation](https://www.rubydoc.info/github/karafka/waterdrop)
|
113
302
|
* [Karafka framework](https://github.com/karafka/karafka)
|
114
|
-
* [WaterDrop
|
303
|
+
* [WaterDrop Actions CI](https://github.com/karafka/waterdrop/actions?query=workflow%3Ac)
|
115
304
|
* [WaterDrop Coditsu](https://app.coditsu.io/karafka/repositories/waterdrop)
|
116
305
|
|
117
306
|
## Note on contributions
|
@@ -123,5 +312,3 @@ Each pull request must pass all the RSpec specs and meet our quality requirement
|
|
123
312
|
To check if everything is as it should be, we use [Coditsu](https://coditsu.io) that combines multiple linters and code analyzers for both code and documentation. Once you're done with your changes, submit a pull request.
|
124
313
|
|
125
314
|
Coditsu will automatically check your work against our quality standards. You can find your commit check results on the [builds page](https://app.coditsu.io/karafka/repositories/waterdrop/builds/commit_builds) of WaterDrop repository.
|
126
|
-
|
127
|
-
[![coditsu](https://coditsu.io/assets/quality_bar.svg)](https://app.coditsu.io/karafka/repositories/waterdrop/builds/commit_builds)
|