waterdrop 1.4.4 → 2.0.0.rc1
- checksums.yaml +4 -4
- checksums.yaml.gz.sig +0 -0
- data/.github/FUNDING.yml +1 -0
- data/.github/workflows/ci.yml +3 -25
- data/.gitignore +2 -0
- data/.ruby-version +1 -1
- data/CHANGELOG.md +8 -13
- data/Gemfile +9 -0
- data/Gemfile.lock +81 -60
- data/LICENSE +165 -0
- data/README.md +200 -57
- data/certs/mensfeld.pem +21 -21
- data/config/errors.yml +3 -16
- data/lib/water_drop/config.rb +42 -143
- data/lib/water_drop/contracts/config.rb +8 -121
- data/lib/water_drop/contracts/message.rb +41 -0
- data/lib/water_drop/contracts.rb +0 -2
- data/lib/water_drop/errors.rb +30 -5
- data/lib/water_drop/instrumentation/monitor.rb +16 -22
- data/lib/water_drop/instrumentation/stdout_listener.rb +113 -32
- data/lib/water_drop/producer/async.rb +51 -0
- data/lib/water_drop/producer/buffer.rb +113 -0
- data/lib/water_drop/producer/builder.rb +63 -0
- data/lib/water_drop/producer/dummy_client.rb +32 -0
- data/lib/water_drop/producer/statistics_decorator.rb +71 -0
- data/lib/water_drop/producer/status.rb +52 -0
- data/lib/water_drop/producer/sync.rb +65 -0
- data/lib/water_drop/producer.rb +142 -0
- data/lib/water_drop/version.rb +1 -1
- data/lib/water_drop.rb +4 -24
- data/waterdrop.gemspec +8 -8
- data.tar.gz.sig +0 -0
- metadata +53 -54
- metadata.gz.sig +0 -0
- data/MIT-LICENCE +0 -18
- data/lib/water_drop/async_producer.rb +0 -26
- data/lib/water_drop/base_producer.rb +0 -57
- data/lib/water_drop/config_applier.rb +0 -52
- data/lib/water_drop/contracts/message_options.rb +0 -19
- data/lib/water_drop/sync_producer.rb +0 -24
data/README.md
CHANGED
@@ -1,17 +1,25 @@
  # WaterDrop

-
- [![Join the chat at https://slack.karafka.io](https://raw.githubusercontent.com/karafka/misc/master/slack.svg)](https://slack.karafka.io)
+ **Note**: Documentation presented here refers to WaterDrop `2.0.0.pre1`.

-
+ WaterDrop `2.0` does **not** work with Karafka `1.*` and aims to either work as a standalone producer outside of the Karafka `1.*` ecosystem or as a part of the not yet released Karafka `2.0.*`.
+
+ Please refer to [this](https://github.com/karafka/waterdrop/tree/1.4) branch and its documentation for details about WaterDrop `1.*` usage.

-
+ [![Build Status](https://github.com/karafka/waterdrop/workflows/ci/badge.svg)](https://github.com/karafka/waterdrop/actions?query=workflow%3Aci)
+ [![Join the chat at https://gitter.im/karafka/karafka](https://badges.gitter.im/karafka/karafka.svg)](https://gitter.im/karafka/karafka?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge)

- It is
+ A gem used to send messages to Kafka in an easy way, with an extra validation layer. It is a part of the [Karafka](https://github.com/karafka/karafka) ecosystem.

-
-
- -
+ It:
+
+ - Is thread safe
+ - Supports sync producing
+ - Supports async producing
+ - Supports buffering
+ - Supports producing messages to multiple clusters
+ - Supports multiple delivery policies
+ - Works with Kafka 1.0+ and Ruby 2.5+

  ## Installation

@@ -36,88 +44,223 @@ bundle install
  WaterDrop is a complex tool that contains multiple configuration options. To keep everything organized, all the configuration options were divided into two groups:

  - WaterDrop options - options directly related to the Karafka framework and its components
- -
+ - Kafka driver options - options related to `Kafka`
+
+ To apply all those configuration options, you need to create a producer instance and use the `#setup` method:
+
+ ```ruby
+ producer = WaterDrop::Producer.new
+
+ producer.setup do |config|
+   config.deliver = true
+   config.kafka = {
+     'bootstrap.servers': 'localhost:9092',
+     'request.required.acks': 1
+   }
+ end
+ ```

-
+ or you can do the same while initializing the producer:

  ```ruby
- WaterDrop.
+ producer = WaterDrop::Producer.new do |config|
    config.deliver = true
-   config.kafka
+   config.kafka = {
+     'bootstrap.servers': 'localhost:9092',
+     'request.required.acks': 1
+   }
  end
  ```

  ### WaterDrop configuration options

- | Option
-
- |
- | logger
- | deliver
+ | Option             | Description                                                      |
+ |--------------------|------------------------------------------------------------------|
+ | `id`               | id of the producer for instrumentation and logging               |
+ | `logger`           | Logger that we want to use                                       |
+ | `deliver`          | Should we send messages to Kafka or just fake the delivery       |
+ | `max_wait_timeout` | Waits that long for the delivery report or raises an error       |
+ | `wait_timeout`     | Waits that long before re-checking delivery report availability  |

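A note on `deliver`: since the flag only toggles the actual dispatch, a producer with `deliver` set to `false` still runs validations and instrumentation (see the new `config.rb` further down), which makes it handy in tests. A minimal sketch based on the option descriptions above:

```ruby
# Test-mode producer sketch: with deliver set to false, the message goes
# through validation and instrumentation but is not sent to Kafka
# (per the `deliver` option description above)
producer = WaterDrop::Producer.new do |config|
  config.deliver = false
  config.kafka = { 'bootstrap.servers': 'localhost:9092' }
end

# No broker round-trip should happen here
producer.produce_sync(topic: 'test-topic', payload: 'test payload')
producer.close
```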
- ###
+ ### Kafka configuration options

-
+ You can create producers with different `kafka` settings. Documentation of the available configuration options is available at https://github.com/edenhill/librdkafka/blob/master/CONFIGURATION.md.

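Because each producer instance carries its own `kafka` hash, targeting multiple clusters comes down to creating one producer per cluster. A minimal sketch (the broker addresses are made up for illustration):

```ruby
# Two independent producers, each pointed at a different cluster
# (hypothetical broker addresses, for illustration only)
producer_eu = WaterDrop::Producer.new do |config|
  config.kafka = { 'bootstrap.servers': 'kafka-eu-1:9092,kafka-eu-2:9092' }
end

producer_us = WaterDrop::Producer.new do |config|
  config.kafka = { 'bootstrap.servers': 'kafka-us-1:9092' }
end

producer_eu.produce_sync(topic: 'events', payload: 'eu event')
producer_us.produce_sync(topic: 'events', payload: 'us event')

[producer_eu, producer_us].each(&:close)
```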
-
+ ## Usage

-
- |--------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------|
- | raise_on_buffer_overflow | Should we raise an exception, when messages can't be sent in an async way due to the message buffer overflow or should we just drop them |
- | delivery_interval | The number of seconds between background message deliveries. Disable timer-based background deliveries by setting this to 0. |
- | delivery_threshold | The number of buffered messages that will trigger a background message delivery. Disable buffer size based background deliveries by setting this to 0.|
- | required_acks | The number of Kafka replicas that must acknowledge messages before they're considered as successfully written. |
- | ack_timeout | A timeout executed by a broker when the client is sending messages to it. |
- | max_retries | The number of retries when attempting to deliver messages. |
- | retry_backoff | The number of seconds to wait after a failed attempt to send messages to a Kafka broker before retrying. |
- | max_buffer_bytesize | The maximum number of bytes allowed in the buffer before new messages are rejected. |
- | max_buffer_size | The maximum number of messages allowed in the buffer before new messages are rejected. |
- | max_queue_size | The maximum number of messages allowed in the queue before new messages are rejected. |
- | sasl_plain_username | The username used to authenticate. |
- | sasl_plain_password | The password used to authenticate. |
+ Please refer to the [documentation](https://www.rubydoc.info/github/karafka/waterdrop) in case you're interested in the more advanced API.

-
+ ### Basic usage
+
+ To send Kafka messages, just create a producer and use it:

  ```ruby
- WaterDrop.
-
-
+ producer = WaterDrop::Producer.new
+
+ producer.setup do |config|
+   config.kafka = { 'bootstrap.servers': 'localhost:9092' }
  end
+
+ producer.produce_sync(topic: 'my-topic', payload: 'my message')
+
+ # or for async
+ producer.produce_async(topic: 'my-topic', payload: 'my message')
+
+ # or in batches
+ producer.produce_many_sync(
+   [
+     { topic: 'my-topic', payload: 'my message' },
+     { topic: 'my-topic', payload: 'my message' }
+   ]
+ )
+
+ # both sync and async
+ producer.produce_many_async(
+   [
+     { topic: 'my-topic', payload: 'my message' },
+     { topic: 'my-topic', payload: 'my message' }
+   ]
+ )
+
+ # Don't forget to close the producer once you're done to flush the internal buffers, etc.
+ producer.close
  ```

-
+ Each message that you want to publish will have its value checked.

-
+ Here are all the things you can provide in the message hash:
+
+ | Option      | Required | Value type    | Description                                            |
+ |-------------|----------|---------------|--------------------------------------------------------|
+ | `topic`     | true     | String        | The Kafka topic that should be written to              |
+ | `payload`   | true     | String        | Data you want to send to Kafka                         |
+ | `key`       | false    | String        | The key that should be set in the Kafka message        |
+ | `partition` | false    | Integer       | A specific partition number that should be written to  |
+ | `timestamp` | false    | Time, Integer | The timestamp that should be set on the message        |
+ | `headers`   | false    | Hash          | Headers for the message                                |
+
+ Keep in mind that the message you want to send should be either binary or stringified (`to_s`, `to_json`, etc).
+
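Since the `payload` must already be a string, structured data needs explicit serialization before producing. A minimal sketch reusing the `producer` from the basic usage example:

```ruby
require 'json'

# Structured data must be stringified before producing - the message
# contract expects a String payload (and a String key)
event = { user_id: 42, action: 'login' }

producer.produce_sync(
  topic: 'user-events',
  key: event[:user_id].to_s,
  payload: event.to_json
)
```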
+ ### Buffering
+
+ WaterDrop producers support buffering of messages, which means that you can easily implement periodic flushing for long-running processes, as well as buffer several messages to be flushed at the same moment:

  ```ruby
- WaterDrop::
-
-
+ producer = WaterDrop::Producer.new
+
+ producer.setup do |config|
+   config.kafka = { 'bootstrap.servers': 'localhost:9092' }
+ end
+
+ time = Time.now - 10
+
+ while time < Time.now
+   time += 1
+   producer.buffer(topic: 'times', payload: Time.now.to_s)
+ end
+
+ puts "The messages buffer size #{producer.messages.size}"
+ producer.flush_sync
+ puts "The messages buffer size #{producer.messages.size}"
+
+ producer.close
  ```

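For long-running processes, flushing does not have to block the producing flow. A minimal periodic-flush sketch, assuming an async counterpart `#flush_async` exists alongside `#flush_sync` in the buffering API added in `producer/buffer.rb`:

```ruby
# Periodic background flusher sketch: buffer from the main flow and flush
# every second without blocking (assumes #flush_async exists next to
# #flush_sync in the new buffering API)
flusher = Thread.new do
  loop do
    sleep(1)
    producer.flush_async
  end
end

100.times { |i| producer.buffer(topic: 'counters', payload: i.to_s) }
sleep(2)

flusher.kill
producer.close # closing flushes whatever is still buffered
```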
-
+ ## Instrumentation

-
- |-------------------- |----------|------------|---------------------------------------------------------------------|
- | ```topic``` | true | String | The Kafka topic that should be written to |
- | ```key``` | false | String | The key that should be set in the Kafka message |
- | ```partition``` | false | Integer | A specific partition number that should be written to |
- | ```partition_key``` | false | String | A string that can be used to deterministically select the partition |
- | ```create_time``` | false | Time | The timestamp that should be set on the message |
- | ```headers``` | false | Hash | Headers for the message |
+ Each producer, once `#setup` is done, has a custom monitor to which you can subscribe.

-
+ ```ruby
+ producer = WaterDrop::Producer.new
+
+ producer.setup do |config|
+   config.kafka = { 'bootstrap.servers': 'localhost:9092' }
+ end
+
+ producer.monitor.subscribe('message.produced_async') do |event|
+   puts "A message was produced to '#{event[:message][:topic]}' topic!"
+ end
+
+ producer.produce_async(topic: 'events', payload: 'data')
+
+ producer.close
+ ```
+
+ See `WaterDrop::Instrumentation::Monitor::EVENTS` for the list of all the supported events.
+
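Delivery acknowledgements can be tracked the same way. A minimal sketch, assuming `message.acknowledged` is among the events listed in `WaterDrop::Instrumentation::Monitor::EVENTS`:

```ruby
# Counting broker acknowledgements via the monitor (assumes the
# 'message.acknowledged' event is part of Monitor::EVENTS)
acks = 0

producer.monitor.subscribe('message.acknowledged') do |_event|
  acks += 1
end

10.times { producer.produce_async(topic: 'events', payload: 'data') }

producer.close # close flushes the internal buffers before shutdown
puts "Acknowledged messages: #{acks}"
```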
+ ### Usage statistics
+
+ WaterDrop may be configured to emit internal metrics at a fixed interval by setting the `kafka` `statistics.interval.ms` configuration property to a value > `0`. Once that is done, emitted statistics are available after subscribing to the `statistics.emitted` publisher event.
+
+ The statistics include all of the metrics from `librdkafka` (full list [here](https://github.com/edenhill/librdkafka/blob/master/STATISTICS.md)) as well as the diff of those against the previously emitted values.
+
+ For several attributes like `txmsgs`, `librdkafka` publishes only the totals. To make it easier to track the progress (for example, the number of messages sent between statistics-emitted events), WaterDrop diffs all the numeric values against the previously available numbers. All of those metrics are available under the same key as the original metric but with an additional `_d` postfix:
+
+ ```ruby
+ producer = WaterDrop::Producer.new do |config|
+   config.kafka = {
+     'bootstrap.servers': 'localhost:9092',
+     'statistics.interval.ms': 2_000 # emit statistics every 2 seconds
+   }
+ end
+
+ producer.monitor.subscribe('statistics.emitted') do |event|
+   sum = event[:statistics]['txmsgs']
+   diff = event[:statistics]['txmsgs_d']
+
+   p "Sent messages: #{sum}"
+   p "Messages sent from last statistics report: #{diff}"
+ end
+
+ sleep(2)
+
+ # Sent messages: 0
+ # Messages sent from last statistics report: 0
+
+ 20.times { producer.produce_async(topic: 'events', payload: 'data') }
+
+ # Sent messages: 20
+ # Messages sent from last statistics report: 20
+
+ sleep(2)
+
+ 20.times { producer.produce_async(topic: 'events', payload: 'data') }
+
+ # Sent messages: 40
+ # Messages sent from last statistics report: 20
+
+ sleep(2)
+
+ # Sent messages: 40
+ # Messages sent from last statistics report: 0
+
+ producer.close
+ ```
+
+ Note: the metrics returned may not be completely consistent between brokers, toppars and totals, due to the internal asynchronous nature of librdkafka. E.g., the top-level tx total may be less than the sum of the broker tx values that it represents.
+
+ ### Forking and potential memory problems
+
+ If you work with forked processes, make sure you **don't** use the producer before the fork. You can easily configure the producer, then fork and use it (see the sketch after this section).
+
+ To tackle this [obstacle](https://github.com/appsignal/rdkafka-ruby/issues/15) related to rdkafka, WaterDrop adds a finalizer to each of the producers to close the rdkafka client before the Ruby process shuts down. Due to the [nature of finalizers](https://www.mikeperham.com/2010/02/24/the-trouble-with-ruby-finalizers/), this implementation prevents producers from being GCed (except upon VM shutdown) and can cause memory leaks if you don't use persistent/long-lived producers in a long-running process, or if you don't use the `#close` method on a producer when it is no longer needed. Creating a producer instance for each message is in any case a rather bad idea, so we recommend against it.

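A minimal sketch of the configure-then-fork pattern described above, assuming the underlying rdkafka client is created lazily on first use (so each child process builds its own connection):

```ruby
# Configure before forking, produce only after the fork
producer = WaterDrop::Producer.new do |config|
  config.kafka = { 'bootstrap.servers': 'localhost:9092' }
end

pids = 2.times.map do |i|
  fork do
    # first use happens here, inside the child process
    producer.produce_sync(topic: 'forked-events', payload: "from child #{i}")
    producer.close # close in every process that used the producer
  end
end

pids.each { |pid| Process.wait(pid) }
```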
  ## References

+ * [WaterDrop code documentation](https://www.rubydoc.info/github/karafka/waterdrop)
  * [Karafka framework](https://github.com/karafka/karafka)
- * [WaterDrop
+ * [WaterDrop Actions CI](https://github.com/karafka/waterdrop/actions?query=workflow%3Aci)
  * [WaterDrop Coditsu](https://app.coditsu.io/karafka/repositories/waterdrop)

  ## Note on contributions

- First, thank you for considering contributing to
+ First, thank you for considering contributing to WaterDrop! It's people like you that make the open source community great!
+
+ Each pull request must pass all the RSpec specs and meet our quality requirements.
+
+ To check if everything is as it should be, we use [Coditsu](https://coditsu.io), which combines multiple linters and code analyzers for both code and documentation. Once you're done with your changes, submit a pull request.

-
+ Coditsu will automatically check your work against our quality standards. You can find your commit check results on the [builds page](https://app.coditsu.io/karafka/repositories/waterdrop/builds/commit_builds) of the WaterDrop repository.

-
+ [![coditsu](https://coditsu.io/assets/quality_bar.svg)](https://app.coditsu.io/karafka/repositories/waterdrop/builds/commit_builds)
data/certs/mensfeld.pem
CHANGED
@@ -1,25 +1,25 @@
  -----BEGIN CERTIFICATE-----
  MIIEODCCAqCgAwIBAgIBATANBgkqhkiG9w0BAQsFADAjMSEwHwYDVQQDDBhtYWNp
- (old certificate lines 3-14 not preserved in this diff view)
+ ZWovREM9bWVuc2ZlbGQvREM9cGwwHhcNMjAwODExMDkxNTM3WhcNMjEwODExMDkx
+ NTM3WjAjMSEwHwYDVQQDDBhtYWNpZWovREM9bWVuc2ZlbGQvREM9cGwwggGiMA0G
+ CSqGSIb3DQEBAQUAA4IBjwAwggGKAoIBgQDCpXsCgmINb6lHBXXBdyrgsBPSxC4/
+ 2H+weJ6L9CruTiv2+2/ZkQGtnLcDgrD14rdLIHK7t0o3EKYlDT5GhD/XUVhI15JE
+ N7IqnPUgexe1fbZArwQ51afxz2AmPQN2BkB2oeQHXxnSWUGMhvcEZpfbxCCJH26w
+ hS0Ccsma8yxA6hSlGVhFVDuCr7c2L1di6cK2CtIDpfDaWqnVNJEwBYHIxrCoWK5g
+ sIGekVt/admS9gRhIMaIBg+Mshth5/DEyWO2QjteTodItlxfTctrfmiAl8X8T5JP
+ VXeLp5SSOJ5JXE80nShMJp3RFnGw5fqjX/ffjtISYh78/By4xF3a25HdWH9+qO2Z
+ tx0wSGc9/4gqNM0APQnjN/4YXrGZ4IeSjtE+OrrX07l0TiyikzSLFOkZCAp8oBJi
+ Fhlosz8xQDJf7mhNxOaZziqASzp/hJTU/tuDKl5+ql2icnMv5iV/i6SlmvU29QNg
+ LCV71pUv0pWzN+OZbHZKWepGhEQ3cG9MwvkCAwEAAaN3MHUwCQYDVR0TBAIwADAL
+ BgNVHQ8EBAMCBLAwHQYDVR0OBBYEFImGed2AXS070ohfRidiCEhXEUN+MB0GA1Ud
  EQQWMBSBEm1hY2llakBtZW5zZmVsZC5wbDAdBgNVHRIEFjAUgRJtYWNpZWpAbWVu
- (old certificate lines 16-24 not preserved in this diff view)
+ c2ZlbGQucGwwDQYJKoZIhvcNAQELBQADggGBAKiHpwoENVrMi94V1zD4o8/6G3AU
+ gWz4udkPYHTZLUy3dLznc/sNjdkJFWT3E6NKYq7c60EpJ0m0vAEg5+F5pmNOsvD3
+ 2pXLj9kisEeYhR516HwXAvtngboUcb75skqvBCU++4Pu7BRAPjO1/ihLSBexbwSS
+ fF+J5OWNuyHHCQp+kGPLtXJe2yUYyvSWDj3I2//Vk0VhNOIlaCS1+5/P3ZJThOtm
+ zJUBI7h3HgovwRpcnmk2mXTmU4Zx/bCzX8EA6VY0khEvnmiq7S6eBF0H9qH8KyQ6
+ EkVLpvmUDFcf/uNaBQdazEMB5jYtwoA8gQlANETNGPi51KlkukhKgaIEDMkBDJOx
+ 65N7DzmkcyY0/GwjIVIxmRhcrCt1YeCUElmfFx0iida1/YRm6sB2AXqScc1+ECRi
+ 2DND//YJUikn1zwbz1kT70XmHd97B4Eytpln7K+M1u2g1pHVEPW4owD/ammXNpUy
+ nt70FcDD4yxJQ+0YNiHd0N8IcVBM1TMIVctMNQ==
  -----END CERTIFICATE-----
data/config/errors.yml
CHANGED
@@ -1,19 +1,6 @@
  en:
    dry_validation:
      errors:
-       (old lines 4-6 not preserved in this diff view)
-         Example: kafka://127.0.0.1:9092 or kafka+ssl://127.0.0.1:9092
-       ssl_client_cert_with_ssl_client_cert_key: >
-         Both ssl_client_cert and ssl_client_cert_key need to be provided.
-       ssl_client_cert_key_with_ssl_client_cert: >
-         Both ssl_client_cert_key and ssl_client_cert need to be provided.
-       ssl_client_cert_chain_with_ssl_client_cert: >
-         Both ssl_client_cert_chain and ssl_client_cert need to be provided.
-       ssl_client_cert_chain_with_ssl_client_cert_key: >
-         Both ssl_client_cert_chain and ssl_client_cert_key need to be provided.
-       ssl_client_cert_key_password_with_ssl_client_cert_key: >
-         Both ssl_client_cert_key_password and ssl_client_cert_key need to be provided.
-       sasl_oauth_token_provider_respond_to_token: >
-         sasl_oauth_token_provider needs to respond to a #token method.
+       invalid_key_type: all keys need to be of type String
+       invalid_value_type: all values need to be of type String
+       max_payload_size: is more than `max_payload_size` config value
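These messages back the new message contract (`lib/water_drop/contracts/message.rb`). A minimal sketch of how a violation could surface at produce time; the error class name is an assumption based on the reworked `errors.rb`:

```ruby
producer = WaterDrop::Producer.new do |config|
  config.kafka = { 'bootstrap.servers': 'localhost:9092' }
  config.max_payload_size = 10 # deliberately tiny, to trip the contract
end

begin
  # a payload longer than max_payload_size violates the message contract
  producer.produce_sync(topic: 'events', payload: 'way too long payload')
rescue WaterDrop::Errors::MessageInvalidError => e
  # expected to mention: is more than `max_payload_size` config value
  puts e.message
end
```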
data/lib/water_drop/config.rb
CHANGED
@@ -5,158 +5,57 @@
  module WaterDrop
    # Configuration object for setting up all options required by WaterDrop
    class Config
-
-
-     # Config schema definition
-     # @note We use a single instance not to create new one upon each usage
-     SCHEMA = Contracts::Config.new.freeze
-
-     private_constant :SCHEMA
+     include Dry::Configurable

      # WaterDrop options
-     #
-
-     #
-     setting
+     #
+     # option [String] id of the producer. This can be helpful when building producer specific
+     # instrumentation or loggers. It is not the kafka producer id
+     setting(:id, false) { |id| id || SecureRandom.uuid }
+     # option [Instance] logger that we want to use
+     # @note Due to how rdkafka works, this setting is global for all the producers
+     setting(:logger, false) { |logger| logger || Logger.new($stdout, level: Logger::WARN) }
      # option [Instance] monitor that we want to use. See instrumentation part of the README for
      # more details
-     setting
+     setting(:monitor, false) { |monitor| monitor || WaterDrop::Instrumentation::Monitor.new }
+     # option [Integer] max payload size allowed for delivery to Kafka
+     setting :max_payload_size, 1_000_012
+     # option [Integer] Wait that long for the delivery report or raise an error if this takes
+     # longer than the timeout.
+     setting :max_wait_timeout, 5
+     # option [Numeric] how long should we wait between re-checks on the availability of the
+     # delivery report. In really robust systems, this describes the min-delivery time
+     # for a single sync message when produced in isolation
+     setting :wait_timeout, 0.005 # 5 milliseconds
      # option [Boolean] should we send messages. Setting this to false can be really useful when
-     # testing and or developing because when set to false, won't actually ping Kafka
-
-
-     #
-     #
-
-
-
-     #
-
-
-
-
-
-     # option connect_timeout [Integer] Sets the number of seconds to wait while connecting to
-     # a broker for the first time. When ruby-kafka initializes, it needs to connect to at
-     # least one host.
-     setting :connect_timeout, default: 10
-     # option socket_timeout [Integer] Sets the number of seconds to wait when reading from or
-     # writing to a socket connection to a broker. After this timeout expires the connection
-     # will be killed. Note that some Kafka operations are by definition long-running, such as
-     # waiting for new messages to arrive in a partition, so don't set this value too low
-     setting :socket_timeout, default: 30
-
-     # Buffering for async producer
-     # @option [Integer] The maximum number of bytes allowed in the buffer before new messages
-     # are rejected.
-     setting :max_buffer_bytesize, default: 10_000_000
-     # @option [Integer] The maximum number of messages allowed in the buffer before new messages
-     # are rejected.
-     setting :max_buffer_size, default: 1000
-     # @option [Integer] The maximum number of messages allowed in the queue before new messages
-     # are rejected. The queue is used to ferry messages from the foreground threads of your
-     # application to the background thread that buffers and delivers messages.
-     setting :max_queue_size, default: 1000
-
-     # option [Integer] A timeout executed by a broker when the client is sending messages to it.
-     # It defines the number of seconds the broker should wait for replicas to acknowledge the
-     # write before responding to the client with an error. As such, it relates to the
-     # required_acks setting. It should be set lower than socket_timeout.
-     setting :ack_timeout, default: 5
-     # option [Integer] The number of seconds between background message
-     # deliveries. Default is 10 seconds. Disable timer-based background deliveries by
-     # setting this to 0.
-     setting :delivery_interval, default: 10
-     # option [Integer] The number of buffered messages that will trigger a background message
-     # delivery. Default is 100 messages. Disable buffer size based background deliveries by
-     # setting this to 0.
-     setting :delivery_threshold, default: 100
-     # option [Boolean]
-     setting :idempotent, default: false
-     # option [Boolean]
-     setting :transactional, default: false
-     # option [Integer]
-     setting :transactional_timeout, default: 60
-
-     # option [Integer] The number of retries when attempting to deliver messages.
-     setting :max_retries, default: 2
-     # option [Integer]
-     setting :required_acks, default: -1
-     # option [Integer]
-     setting :retry_backoff, default: 1
-
-     # option [Integer] The minimum number of messages that must be buffered before compression is
-     # attempted. By default only one message is required. Only relevant if compression_codec
-     # is set.
-     setting :compression_threshold, default: 1
-     # option [Symbol] The codec used to compress messages. Must be either snappy or gzip.
-     setting :compression_codec, default: nil
-
-     # SSL authentication related settings
-     # option ca_cert [String, nil] SSL CA certificate
-     setting :ssl_ca_cert, default: nil
-     # option ssl_ca_cert_file_path [String, nil] SSL CA certificate file path
-     setting :ssl_ca_cert_file_path, default: nil
-     # option ssl_ca_certs_from_system [Boolean] Use the CA certs from your system's default
-     # certificate store
-     setting :ssl_ca_certs_from_system, default: false
-     # option ssl_verify_hostname [Boolean] Verify the hostname for client certs
-     setting :ssl_verify_hostname, default: true
-     # option ssl_client_cert [String, nil] SSL client certificate
-     setting :ssl_client_cert, default: nil
-     # option ssl_client_cert_key [String, nil] SSL client certificate password
-     setting :ssl_client_cert_key, default: nil
-     # option sasl_gssapi_principal [String, nil] sasl principal
-     setting :sasl_gssapi_principal, default: nil
-     # option sasl_gssapi_keytab [String, nil] sasl keytab
-     setting :sasl_gssapi_keytab, default: nil
-     # option sasl_plain_authzid [String] The authorization identity to use
-     setting :sasl_plain_authzid, default: ''
-     # option sasl_plain_username [String, nil] The username used to authenticate
-     setting :sasl_plain_username, default: nil
-     # option sasl_plain_password [String, nil] The password used to authenticate
-     setting :sasl_plain_password, default: nil
-     # option sasl_scram_username [String, nil] The username used to authenticate
-     setting :sasl_scram_username, default: nil
-     # option sasl_scram_password [String, nil] The password used to authenticate
-     setting :sasl_scram_password, default: nil
-     # option sasl_scram_mechanism [String, nil] Scram mechanism, either 'sha256' or 'sha512'
-     setting :sasl_scram_mechanism, default: nil
-     # option sasl_over_ssl [Boolean] whether to enforce SSL with SASL
-     setting :sasl_over_ssl, default: true
-     # option ssl_client_cert_chain [String, nil] client cert chain or nil if not used
-     setting :ssl_client_cert_chain, default: nil
-     # option ssl_client_cert_key_password [String, nil] the password required to read
-     # the ssl_client_cert_key
-     setting :ssl_client_cert_key_password, default: nil
-     # @param sasl_oauth_token_provider [Object, nil] OAuthBearer Token Provider instance that
-     # implements method token.
-     setting :sasl_oauth_token_provider, default: nil
-   end
-
-   class << self
-     # Configuration method
-     # @yield Runs a block of code providing a config singleton instance to it
-     # @yieldparam [WaterDrop::Config] WaterDrop config instance
-     def setup
-       configure do |config|
-         yield(config)
-         validate!(config.to_h)
-       end
+     # testing and/or developing because when set to false, won't actually ping Kafka but will
+     # run all the validations, etc.
+     setting :deliver, true
+     # rdkafka options
+     # @see https://github.com/edenhill/librdkafka/blob/master/CONFIGURATION.md
+     setting :kafka, {}
+
+     # Configuration method
+     # @yield Runs a block of code providing a config singleton instance to it
+     # @yieldparam [WaterDrop::Config] WaterDrop config instance
+     def setup
+       configure do |config|
+         yield(config)
+         validate!(config.to_h)
        end
+     end

-
+     private

-
-
-
-
-
-
-
+     # Validates the configuration and if anything is wrong, will raise an exception
+     # @param config_hash [Hash] config hash with setup details
+     # @raise [WaterDrop::Errors::ConfigurationInvalidError] raised when something is wrong with
+     #   the configuration
+     def validate!(config_hash)
+       result = Contracts::Config.new.call(config_hash)
+       return true if result.success?

-
-     end
+       raise Errors::ConfigurationInvalidError, result.errors.to_h
      end
    end
  end
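The `#setup` flow above validates the whole config hash through `Contracts::Config` and raises when it does not pass. A minimal sketch of the failure path from the caller's side (the invalid value is hypothetical; the exact rules live in `contracts/config.rb`):

```ruby
producer = WaterDrop::Producer.new

begin
  producer.setup do |config|
    # hypothetical bad value - assuming the contract rejects negative timeouts
    config.max_wait_timeout = -1
    config.kafka = { 'bootstrap.servers': 'localhost:9092' }
  end
rescue WaterDrop::Errors::ConfigurationInvalidError => e
  puts "Invalid producer configuration: #{e.message}"
end
```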