waterdrop 1.4.4 → 2.0.0.rc1

data/README.md CHANGED
@@ -1,17 +1,25 @@
  # WaterDrop
 
- [![Build Status](https://travis-ci.org/karafka/waterdrop.svg)](https://travis-ci.org/karafka/waterdrop)
- [![Join the chat at https://slack.karafka.io](https://raw.githubusercontent.com/karafka/misc/master/slack.svg)](https://slack.karafka.io)
+ **Note**: Documentation presented here refers to WaterDrop `2.0.0.rc1`.
 
- Gem used to send messages to Kafka in an easy way with an extra validation layer. It is a part of the [Karafka](https://github.com/karafka/karafka) ecosystem.
+ WaterDrop `2.0` does **not** work with Karafka `1.*`. It aims to work either as a standalone producer outside of the Karafka `1.*` ecosystem or as a part of the not yet released Karafka `2.0.*`.
+
+ Please refer to [this](https://github.com/karafka/waterdrop/tree/1.4) branch and its documentation for details about WaterDrop `1.*` usage.
 
- WaterDrop is based on Zendesk's [delivery_boy](https://github.com/zendesk/delivery_boy) gem.
+ [![Build Status](https://github.com/karafka/waterdrop/workflows/ci/badge.svg)](https://github.com/karafka/waterdrop/actions?query=workflow%3Aci)
+ [![Join the chat at https://gitter.im/karafka/karafka](https://badges.gitter.im/karafka/karafka.svg)](https://gitter.im/karafka/karafka?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge)
 
- It is:
+ Gem used to send messages to Kafka in an easy way with an extra validation layer. It is a part of the [Karafka](https://github.com/karafka/karafka) ecosystem.
 
- - Thread safe
- - Supports sync and async producers
- - Working with 0.11+ Kafka
+ It:
+
+ - Is thread safe
+ - Supports sync producing
+ - Supports async producing
+ - Supports buffering
+ - Supports producing messages to multiple clusters
+ - Supports multiple delivery policies
+ - Works with Kafka 1.0+ and Ruby 2.5+
 
  ## Installation
 
@@ -36,88 +44,223 @@ bundle install
  WaterDrop is a complex tool that contains multiple configuration options. To keep everything organized, all the configuration options were divided into two groups:
 
  - WaterDrop options - options directly related to the Karafka framework and its components
- - Ruby-Kafka driver options - options related to Ruby-Kafka/Delivery boy
+ - Kafka driver options - options related to `Kafka`
+
+ To apply all those configuration options, you need to create a producer instance and use the ```#setup``` method:
+
+ ```ruby
+ producer = WaterDrop::Producer.new
+
+ producer.setup do |config|
+   config.deliver = true
+   config.kafka = {
+     'bootstrap.servers': 'localhost:9092',
+     'request.required.acks': 1
+   }
+ end
+ ```
 
- To apply all those configuration options, you need to use the ```#setup``` method:
+ or you can do the same while initializing the producer:
 
  ```ruby
- WaterDrop.setup do |config|
+ producer = WaterDrop::Producer.new do |config|
    config.deliver = true
-   config.kafka.seed_brokers = %w[kafka://localhost:9092]
+   config.kafka = {
+     'bootstrap.servers': 'localhost:9092',
+     'request.required.acks': 1
+   }
  end
  ```
 
  ### WaterDrop configuration options
 
- | Option    | Description                                                       |
- |-----------|-------------------------------------------------------------------|
- | client_id | This is how the client will identify itself to the Kafka brokers  |
- | logger    | Logger that we want to use                                        |
- | deliver   | Should we send messages to Kafka                                  |
+ | Option             | Description                                                      |
+ |--------------------|------------------------------------------------------------------|
+ | `id`               | id of the producer for instrumentation and logging               |
+ | `logger`           | Logger that we want to use                                       |
+ | `deliver`          | Should we send messages to Kafka or just fake the delivery       |
+ | `max_wait_timeout` | Waits that long for the delivery report or raises an error       |
+ | `wait_timeout`     | Waits that long before re-check of delivery report availability  |
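
To make the table above concrete, here is a minimal sketch (the broker address and producer id are placeholders) exercising `deliver`, `max_wait_timeout` and `wait_timeout`; with `deliver` set to `false`, validations and instrumentation still run, but nothing is dispatched to Kafka:

```ruby
producer = WaterDrop::Producer.new do |config|
  config.id = 'my-app-producer'  # surfaces in instrumentation and logging
  config.deliver = false         # fake the delivery - handy in tests
  config.max_wait_timeout = 10   # wait up to 10 seconds for a delivery report
  config.wait_timeout = 0.01     # re-check report availability every 10ms
  config.kafka = { 'bootstrap.servers': 'localhost:9092' }
end

# With deliver = false this returns without contacting Kafka
producer.produce_sync(topic: 'test', payload: 'dry-run')
producer.close
```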
 
- ### Ruby-Kafka driver and Delivery boy configuration options
+ ### Kafka configuration options
 
- **Note:** We've listed here only **the most important** configuration options. If you're interested in all the options, please go to the [config.rb](https://github.com/karafka/waterdrop/blob/master/lib/water_drop/config.rb) file for more details.
+ You can create producers with different `kafka` settings. Documentation of the available configuration options is available at https://github.com/edenhill/librdkafka/blob/master/CONFIGURATION.md.
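
For example (an illustrative sketch; the property values are arbitrary), any producer property from that document can be passed through the `kafka` hash:

```ruby
producer = WaterDrop::Producer.new do |config|
  config.kafka = {
    # all keys below are standard librdkafka producer properties
    'bootstrap.servers': 'localhost:9092',
    'request.required.acks': -1,   # wait for all in-sync replicas
    'compression.codec': 'gzip',   # compress message batches
    'message.timeout.ms': 30_000   # give up on deliveries after 30s
  }
end
```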
 
- **Note:** All the options are subject to validations. In order to check what is and what is not acceptable, please go to the [config.rb validation schema](https://github.com/karafka/waterdrop/blob/master/lib/water_drop/schemas/config.rb) file.
+ ## Usage
 
- | Option                   | Description                                                                                                                              |
- |--------------------------|------------------------------------------------------------------------------------------------------------------------------------------|
- | raise_on_buffer_overflow | Should we raise an exception, when messages can't be sent in an async way due to the message buffer overflow or should we just drop them  |
- | delivery_interval        | The number of seconds between background message deliveries. Disable timer-based background deliveries by setting this to 0.              |
- | delivery_threshold       | The number of buffered messages that will trigger a background message delivery. Disable buffer size based background deliveries by setting this to 0. |
- | required_acks            | The number of Kafka replicas that must acknowledge messages before they're considered as successfully written.                            |
- | ack_timeout              | A timeout executed by a broker when the client is sending messages to it.                                                                 |
- | max_retries              | The number of retries when attempting to deliver messages.                                                                                |
- | retry_backoff            | The number of seconds to wait after a failed attempt to send messages to a Kafka broker before retrying.                                  |
- | max_buffer_bytesize      | The maximum number of bytes allowed in the buffer before new messages are rejected.                                                       |
- | max_buffer_size          | The maximum number of messages allowed in the buffer before new messages are rejected.                                                    |
- | max_queue_size           | The maximum number of messages allowed in the queue before new messages are rejected.                                                     |
- | sasl_plain_username      | The username used to authenticate.                                                                                                        |
- | sasl_plain_password      | The password used to authenticate.                                                                                                        |
+ Please refer to the [documentation](https://www.rubydoc.info/github/karafka/waterdrop) in case you're interested in the more advanced API.
 
- This configuration can be also placed in *config/initializers* and can vary based on the environment:
+ ### Basic usage
+
+ To send Kafka messages, just create a producer and use it:
 
  ```ruby
- WaterDrop.setup do |config|
-   config.deliver = Rails.env.production?
-   config.kafka.seed_brokers = [Rails.env.production? ? 'kafka://prod-host:9091' : 'kafka://localhost:9092']
+ producer = WaterDrop::Producer.new
+
+ producer.setup do |config|
+   config.kafka = { 'bootstrap.servers': 'localhost:9092' }
  end
+
+ producer.produce_sync(topic: 'my-topic', payload: 'my message')
+
+ # or for async
+ producer.produce_async(topic: 'my-topic', payload: 'my message')
+
+ # or in sync batches
+ producer.produce_many_sync(
+   [
+     { topic: 'my-topic', payload: 'my message' },
+     { topic: 'my-topic', payload: 'my message' }
+   ]
+ )
+
+ # or in async batches
+ producer.produce_many_async(
+   [
+     { topic: 'my-topic', payload: 'my message' },
+     { topic: 'my-topic', payload: 'my message' }
+   ]
+ )
+
+ # Don't forget to close the producer once you're done, to flush the internal buffers, etc.
+ producer.close
  ```
 
- ## Usage
+ Each message that you want to publish will have its value checked.
 
- To send Kafka messages, just use one of the producers:
+ Here are all the things you can provide in the message hash:
+
+ | Option      | Required | Value type    | Description                                            |
+ |-------------|----------|---------------|--------------------------------------------------------|
+ | `topic`     | true     | String        | The Kafka topic that should be written to              |
+ | `payload`   | true     | String        | Data you want to send to Kafka                         |
+ | `key`       | false    | String        | The key that should be set in the Kafka message        |
+ | `partition` | false    | Integer       | A specific partition number that should be written to  |
+ | `timestamp` | false    | Time, Integer | The timestamp that should be set on the message        |
+ | `headers`   | false    | Hash          | Headers for the message                                |
+
+ Keep in mind that the message you want to send should be either binary or stringified (`to_s`, `to_json`, etc).
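
Putting the table together, a message using every field might look like the following sketch (topic, key and header values are placeholders; note the stringified payload):

```ruby
producer.produce_async(
  topic: 'user-events',               # required
  payload: { id: 42 }.to_json,        # required, stringified as noted above
  key: '42',                          # optional partitioning/compaction key
  partition: 0,                       # optional explicit partition
  timestamp: Time.now,                # optional message timestamp
  headers: { 'source' => 'billing' }  # optional String => String headers
)
```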
+
+ ### Buffering
+
+ WaterDrop producers support buffering of messages, which means that you can easily implement periodic flushing for long-running processes, as well as buffer several messages to be flushed at the same moment:
 
  ```ruby
- WaterDrop::SyncProducer.call('message', topic: 'my-topic')
- # or for async
- WaterDrop::AsyncProducer.call('message', topic: 'my-topic')
+ producer = WaterDrop::Producer.new
+
+ producer.setup do |config|
+   config.kafka = { 'bootstrap.servers': 'localhost:9092' }
+ end
+
+ time = Time.now - 10
+
+ while time < Time.now
+   time += 1
+   producer.buffer(topic: 'times', payload: Time.now.to_s)
+ end
+
+ puts "The messages buffer size #{producer.messages.size}"
+ producer.flush_sync
+ puts "The messages buffer size #{producer.messages.size}"
+
+ producer.close
  ```
 
- Both ```SyncProducer``` and ```AsyncProducer``` accept the following options:
+ ## Instrumentation
 
- | Option              | Required | Value type | Description                                                          |
- |---------------------|----------|------------|-----------------------------------------------------------------------|
- | ```topic```         | true     | String     | The Kafka topic that should be written to                             |
- | ```key```           | false    | String     | The key that should be set in the Kafka message                       |
- | ```partition```     | false    | Integer    | A specific partition number that should be written to                 |
- | ```partition_key``` | false    | String     | A string that can be used to deterministically select the partition   |
- | ```create_time```   | false    | Time       | The timestamp that should be set on the message                       |
- | ```headers```       | false    | Hash       | Headers for the message                                               |
+ Each of the producers, once `#setup` is done, has a custom monitor to which you can subscribe.
 
- Keep in mind, that message you want to send should be either binary or stringified (to_s, to_json, etc).
+ ```ruby
+ producer = WaterDrop::Producer.new
+
+ producer.setup do |config|
+   config.kafka = { 'bootstrap.servers': 'localhost:9092' }
+ end
+
+ producer.monitor.subscribe('message.produced_async') do |event|
+   puts "A message was produced to '#{event[:message][:topic]}' topic!"
+ end
+
+ producer.produce_async(topic: 'events', payload: 'data')
+
+ producer.close
+ ```
+
+ See `WaterDrop::Instrumentation::Monitor::EVENTS` for the list of all the supported events.
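
For a quick overview of what fires when, you can subscribe a catch-all listener to every event from that list. A sketch, assuming `EVENTS` holds event name strings and that events expose `#payload` (as dry-events based monitors typically do); `producer` is the instance from the snippet above:

```ruby
WaterDrop::Instrumentation::Monitor::EVENTS.each do |event_name|
  producer.monitor.subscribe(event_name) do |event|
    puts "[#{event_name}] payload keys: #{event.payload.keys}"
  end
end
```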
+
+ ### Usage statistics
+
+ WaterDrop may be configured to emit internal metrics at a fixed interval by setting the `kafka` `statistics.interval.ms` configuration property to a value > `0`. Once that is done, emitted statistics are available after subscribing to the `statistics.emitted` publisher event.
+
+ The statistics include all of the metrics from `librdkafka` (full list [here](https://github.com/edenhill/librdkafka/blob/master/STATISTICS.md)) as well as the diff of those against the previously emitted values.
+
+ For several attributes like `txmsgs`, `librdkafka` publishes only the totals. In order to make it easier to track the progress (for example, the number of messages sent between statistics emitted events), WaterDrop diffs all the numeric values against the previously available numbers. All of those metrics are available under the same key as the metric but with an additional `_d` postfix:
+
+ ```ruby
+ producer = WaterDrop::Producer.new do |config|
+   config.kafka = {
+     'bootstrap.servers': 'localhost:9092',
+     'statistics.interval.ms': 2_000 # emit statistics every 2 seconds
+   }
+ end
+
+ producer.monitor.subscribe('statistics.emitted') do |event|
+   sum = event[:statistics]['txmsgs']
+   diff = event[:statistics]['txmsgs_d']
+
+   p "Sent messages: #{sum}"
+   p "Messages sent from last statistics report: #{diff}"
+ end
+
+ sleep(2)
+
+ # Sent messages: 0
+ # Messages sent from last statistics report: 0
+
+ 20.times { producer.produce_async(topic: 'events', payload: 'data') }
+
+ # Sent messages: 20
+ # Messages sent from last statistics report: 20
+
+ sleep(2)
+
+ 20.times { producer.produce_async(topic: 'events', payload: 'data') }
+
+ # Sent messages: 40
+ # Messages sent from last statistics report: 20
+
+ sleep(2)
+
+ # Sent messages: 40
+ # Messages sent from last statistics report: 0
+
+ producer.close
+ ```
+
+ Note: The metrics returned may not be completely consistent between brokers, toppars (topic-partitions) and totals, due to the internal asynchronous nature of librdkafka. E.g., the top-level tx total may be less than the sum of the broker tx values which it represents.
+
+ ### Forking and potential memory problems
+
+ If you work with forked processes, make sure you **don't** use the producer before the fork. You can easily configure the producer and then fork and use it.
+
+ To tackle this [obstacle](https://github.com/appsignal/rdkafka-ruby/issues/15) related to rdkafka, WaterDrop adds a finalizer to each of the producers so the rdkafka client is closed before the Ruby process shuts down. Due to the [nature of the finalizers](https://www.mikeperham.com/2010/02/24/the-trouble-with-ruby-finalizers/), this implementation prevents producers from being GCed (except upon VM shutdown) and can cause memory leaks if you don't use persistent/long-lived producers in a long-running process, or if you don't use the `#close` method of a producer when it is no longer needed. Creating a producer instance for each message is in any case a bad idea, so we recommend against it.
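
A minimal sketch of the configure-then-fork pattern described above (broker address and topic are placeholders): the producer is configured in the parent, but it is only ever used inside the children:

```ruby
producer = WaterDrop::Producer.new do |config|
  config.kafka = { 'bootstrap.servers': 'localhost:9092' }
end

pids = 2.times.map do
  fork do
    # first actual usage happens after the fork
    producer.produce_sync(topic: 'events', payload: "from pid #{Process.pid}")
    producer.close
  end
end

pids.each { |pid| Process.wait(pid) }
```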
 
  ## References
 
+ * [WaterDrop code documentation](https://www.rubydoc.info/github/karafka/waterdrop)
  * [Karafka framework](https://github.com/karafka/karafka)
- * [WaterDrop Travis CI](https://travis-ci.org/karafka/waterdrop)
+ * [WaterDrop Actions CI](https://github.com/karafka/waterdrop/actions?query=workflow%3Aci)
  * [WaterDrop Coditsu](https://app.coditsu.io/karafka/repositories/waterdrop)
 
  ## Note on contributions
 
- First, thank you for considering contributing to the Karafka ecosystem! It's people like you that make the open source community such a great community!
+ First, thank you for considering contributing to WaterDrop! It's people like you that make the open source community such a great place!
+
+ Each pull request must pass all the RSpec specs and meet our quality requirements.
+
+ To check if everything is as it should be, we use [Coditsu](https://coditsu.io) that combines multiple linters and code analyzers for both code and documentation. Once you're done with your changes, submit a pull request.
 
- Each pull request must pass all the RSpec specs, integration tests and meet our quality requirements.
+ Coditsu will automatically check your work against our quality standards. You can find your commit check results on the [builds page](https://app.coditsu.io/karafka/repositories/waterdrop/builds/commit_builds) of the WaterDrop repository.
 
- Fork it, update and wait for the Github Actions results.
+ [![coditsu](https://coditsu.io/assets/quality_bar.svg)](https://app.coditsu.io/karafka/repositories/waterdrop/builds/commit_builds)
data/certs/mensfeld.pem CHANGED
@@ -1,25 +1,25 @@
  -----BEGIN CERTIFICATE-----
  MIIEODCCAqCgAwIBAgIBATANBgkqhkiG9w0BAQsFADAjMSEwHwYDVQQDDBhtYWNp
- ZWovREM9bWVuc2ZlbGQvREM9cGwwHhcNMjEwODExMTQxNTEzWhcNMjIwODExMTQx
- NTEzWjAjMSEwHwYDVQQDDBhtYWNpZWovREM9bWVuc2ZlbGQvREM9cGwwggGiMA0G
- CSqGSIb3DQEBAQUAA4IBjwAwggGKAoIBgQDV2jKH4Ti87GM6nyT6D+ESzTI0MZDj
- ak2/TEwnxvijMJyCCPKT/qIkbW4/f0VHM4rhPr1nW73sb5SZBVFCLlJcOSKOBdUY
- TMY+SIXN2EtUaZuhAOe8LxtxjHTgRHvHcqUQMBENXTISNzCo32LnUxweu66ia4Pd
- 1mNRhzOqNv9YiBZvtBf7IMQ+sYdOCjboq2dlsWmJiwiDpY9lQBTnWORnT3mQxU5x
- vPSwnLB854cHdCS8fQo4DjeJBRZHhEbcE5sqhEMB3RZA3EtFVEXOxlNxVTS3tncI
- qyNXiWDaxcipaens4ObSY1C2HTV7OWb7OMqSCIybeYTSfkaSdqmcl4S6zxXkjH1J
- tnjayAVzD+QVXGijsPLE2PFnJAh9iDET2cMsjabO1f6l1OQNyAtqpcyQcgfnyW0z
- g7tGxTYD+6wJHffM9d9txOUw6djkF6bDxyqB8lo4Z3IObCx18AZjI9XPS9QG7w6q
- LCWuMG2lkCcRgASqaVk9fEf9yMc2xxz5o3kCAwEAAaN3MHUwCQYDVR0TBAIwADAL
- BgNVHQ8EBAMCBLAwHQYDVR0OBBYEFBqUFCKCOe5IuueUVqOB991jyCLLMB0GA1Ud
+ ZWovREM9bWVuc2ZlbGQvREM9cGwwHhcNMjAwODExMDkxNTM3WhcNMjEwODExMDkx
+ NTM3WjAjMSEwHwYDVQQDDBhtYWNpZWovREM9bWVuc2ZlbGQvREM9cGwwggGiMA0G
+ CSqGSIb3DQEBAQUAA4IBjwAwggGKAoIBgQDCpXsCgmINb6lHBXXBdyrgsBPSxC4/
+ 2H+weJ6L9CruTiv2+2/ZkQGtnLcDgrD14rdLIHK7t0o3EKYlDT5GhD/XUVhI15JE
+ N7IqnPUgexe1fbZArwQ51afxz2AmPQN2BkB2oeQHXxnSWUGMhvcEZpfbxCCJH26w
+ hS0Ccsma8yxA6hSlGVhFVDuCr7c2L1di6cK2CtIDpfDaWqnVNJEwBYHIxrCoWK5g
+ sIGekVt/admS9gRhIMaIBg+Mshth5/DEyWO2QjteTodItlxfTctrfmiAl8X8T5JP
+ VXeLp5SSOJ5JXE80nShMJp3RFnGw5fqjX/ffjtISYh78/By4xF3a25HdWH9+qO2Z
+ tx0wSGc9/4gqNM0APQnjN/4YXrGZ4IeSjtE+OrrX07l0TiyikzSLFOkZCAp8oBJi
+ Fhlosz8xQDJf7mhNxOaZziqASzp/hJTU/tuDKl5+ql2icnMv5iV/i6SlmvU29QNg
+ LCV71pUv0pWzN+OZbHZKWepGhEQ3cG9MwvkCAwEAAaN3MHUwCQYDVR0TBAIwADAL
+ BgNVHQ8EBAMCBLAwHQYDVR0OBBYEFImGed2AXS070ohfRidiCEhXEUN+MB0GA1Ud
  EQQWMBSBEm1hY2llakBtZW5zZmVsZC5wbDAdBgNVHRIEFjAUgRJtYWNpZWpAbWVu
- c2ZlbGQucGwwDQYJKoZIhvcNAQELBQADggGBADD0/UuTTFgW+CGk2U0RDw2RBOca
- W2LTF/G7AOzuzD0Tc4voc7WXyrgKwJREv8rgBimLnNlgmFJLmtUCh2U/MgxvcilH
- yshYcbseNvjkrtYnLRlWZR4SSB6Zei5AlyGVQLPkvdsBpNegcG6w075YEwzX/38a
- 8V9B/Yri2OGELBz8ykl7BsXUgNoUPA/4pHF6YRLz+VirOaUIQ4JfY7xGj6fSOWWz
- /rQ/d77r6o1mfJYM/3BRVg73a3b7DmRnE5qjwmSaSQ7u802pJnLesmArch0xGCT/
- fMmRli1Qb+6qOTl9mzD6UDMAyFR4t6MStLm0mIEqM0nBO5nUdUWbC7l9qXEf8XBE
- 2DP28p3EqSuS+lKbAWKcqv7t0iRhhmaod+Yn9mcrLN1sa3q3KSQ9BCyxezCD4Mk2
- R2P11bWoCtr70BsccVrN8jEhzwXngMyI2gVt750Y+dbTu1KgRqZKp/ECe7ZzPzXj
- pIy9vHxTANKYVyI4qj8OrFdEM5BQNu8oQpL0iQ==
+ c2ZlbGQucGwwDQYJKoZIhvcNAQELBQADggGBAKiHpwoENVrMi94V1zD4o8/6G3AU
+ gWz4udkPYHTZLUy3dLznc/sNjdkJFWT3E6NKYq7c60EpJ0m0vAEg5+F5pmNOsvD3
+ 2pXLj9kisEeYhR516HwXAvtngboUcb75skqvBCU++4Pu7BRAPjO1/ihLSBexbwSS
+ fF+J5OWNuyHHCQp+kGPLtXJe2yUYyvSWDj3I2//Vk0VhNOIlaCS1+5/P3ZJThOtm
+ zJUBI7h3HgovwRpcnmk2mXTmU4Zx/bCzX8EA6VY0khEvnmiq7S6eBF0H9qH8KyQ6
+ EkVLpvmUDFcf/uNaBQdazEMB5jYtwoA8gQlANETNGPi51KlkukhKgaIEDMkBDJOx
+ 65N7DzmkcyY0/GwjIVIxmRhcrCt1YeCUElmfFx0iida1/YRm6sB2AXqScc1+ECRi
+ 2DND//YJUikn1zwbz1kT70XmHd97B4Eytpln7K+M1u2g1pHVEPW4owD/ammXNpUy
+ nt70FcDD4yxJQ+0YNiHd0N8IcVBM1TMIVctMNQ==
  -----END CERTIFICATE-----
data/config/errors.yml CHANGED
@@ -1,19 +1,6 @@
  en:
    dry_validation:
      errors:
-       broker_schema: >
-         has an invalid format.
-         Expected schema, host and port number.
-         Example: kafka://127.0.0.1:9092 or kafka+ssl://127.0.0.1:9092
-       ssl_client_cert_with_ssl_client_cert_key: >
-         Both ssl_client_cert and ssl_client_cert_key need to be provided.
-       ssl_client_cert_key_with_ssl_client_cert: >
-         Both ssl_client_cert_key and ssl_client_cert need to be provided.
-       ssl_client_cert_chain_with_ssl_client_cert: >
-         Both ssl_client_cert_chain and ssl_client_cert need to be provided.
-       ssl_client_cert_chain_with_ssl_client_cert_key: >
-         Both ssl_client_cert_chain and ssl_client_cert_key need to be provided.
-       ssl_client_cert_key_password_with_ssl_client_cert_key: >
-         Both ssl_client_cert_key_password and ssl_client_cert_key need to be provided.
-       sasl_oauth_token_provider_respond_to_token: >
-         sasl_oauth_token_provider needs to respond to a #token method.
+       invalid_key_type: all keys need to be of type String
+       invalid_value_type: all values need to be of type String
+       max_payload_size: is more than `max_payload_size` config value
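
These messages back the message-level validations. As a sketch of how they surface (the exact error class name is an assumption based on the `Errors` namespace in this release), an oversized payload should be rejected before any delivery attempt:

```ruby
producer = WaterDrop::Producer.new do |config|
  config.max_payload_size = 10 # artificially tiny limit to trigger the validation
  config.kafka = { 'bootstrap.servers': 'localhost:9092' }
end

begin
  producer.produce_sync(topic: 'test', payload: 'definitely more than ten bytes')
rescue WaterDrop::Errors::MessageInvalidError => e # assumed error class
  puts e.message
end
```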
data/lib/water_drop/config.rb CHANGED
@@ -5,158 +5,57 @@
  module WaterDrop
    # Configuration object for setting up all options required by WaterDrop
    class Config
-     extend Dry::Configurable
-
-     # Config schema definition
-     # @note We use a single instance not to create a new one upon each usage
-     SCHEMA = Contracts::Config.new.freeze
-
-     private_constant :SCHEMA
+     include Dry::Configurable
 
      # WaterDrop options
-     # option client_id [String] identifier of this producer
-     setting :client_id, default: 'waterdrop'
-     # option [Instance, nil] logger that we want to use or nil to fall back to the ruby-kafka logger
-     setting :logger, default: Logger.new($stdout, level: Logger::WARN)
+     #
+     # option [String] id of the producer. This can be helpful when building producer specific
+     #   instrumentation or loggers. It is not the kafka producer id
+     setting(:id, false) { |id| id || SecureRandom.uuid }
+     # option [Instance] logger that we want to use
+     # @note Due to how rdkafka works, this setting is global for all the producers
+     setting(:logger, false) { |logger| logger || Logger.new($stdout, level: Logger::WARN) }
      # option [Instance] monitor that we want to use. See the instrumentation part of the README
      #   for more details
-     setting :monitor, default: WaterDrop::Instrumentation::Monitor.new
+     setting(:monitor, false) { |monitor| monitor || WaterDrop::Instrumentation::Monitor.new }
+     # option [Integer] max payload size allowed for delivery to Kafka
+     setting :max_payload_size, 1_000_012
+     # option [Integer] Wait that long for the delivery report or raise an error if this takes
+     #   longer than the timeout.
+     setting :max_wait_timeout, 5
+     # option [Numeric] how long should we wait between re-checks on the availability of the
+     #   delivery report. In really robust systems, this describes the min-delivery time
+     #   for a single sync message when produced in isolation
+     setting :wait_timeout, 0.005 # 5 milliseconds
      # option [Boolean] should we send messages. Setting this to false can be really useful when
-     #   testing and/or developing because when set to false, it won't actually ping Kafka
-     setting :deliver, default: true
-     # option [Boolean] if you're producing messages faster than the framework or the network can
-     #   send them off, ruby-kafka might reject them. If that happens, WaterDrop will either raise
-     #   or ignore - this setting manages that behavior. This only applies to the async producer,
-     #   as the sync producer will always raise upon problems
-     setting :raise_on_buffer_overflow, default: true
-
-     # Settings directly related to the Kafka driver
-     setting :kafka do
-       # option [Array<String>] Array that contains Kafka seed broker hosts with ports
-       setting :seed_brokers
-
-       # Network timeouts
-       # option connect_timeout [Integer] Sets the number of seconds to wait while connecting to
-       #   a broker for the first time. When ruby-kafka initializes, it needs to connect to at
-       #   least one host.
-       setting :connect_timeout, default: 10
-       # option socket_timeout [Integer] Sets the number of seconds to wait when reading from or
-       #   writing to a socket connection to a broker. After this timeout expires the connection
-       #   will be killed. Note that some Kafka operations are by definition long-running, such as
-       #   waiting for new messages to arrive in a partition, so don't set this value too low
-       setting :socket_timeout, default: 30
-
-       # Buffering for async producer
-       # @option [Integer] The maximum number of bytes allowed in the buffer before new messages
-       #   are rejected.
-       setting :max_buffer_bytesize, default: 10_000_000
-       # @option [Integer] The maximum number of messages allowed in the buffer before new
-       #   messages are rejected.
-       setting :max_buffer_size, default: 1000
-       # @option [Integer] The maximum number of messages allowed in the queue before new messages
-       #   are rejected. The queue is used to ferry messages from the foreground threads of your
-       #   application to the background thread that buffers and delivers messages.
-       setting :max_queue_size, default: 1000
-
-       # option [Integer] A timeout executed by a broker when the client is sending messages to it.
-       #   It defines the number of seconds the broker should wait for replicas to acknowledge the
-       #   write before responding to the client with an error. As such, it relates to the
-       #   required_acks setting. It should be set lower than socket_timeout.
-       setting :ack_timeout, default: 5
-       # option [Integer] The number of seconds between background message
-       #   deliveries. Default is 10 seconds. Disable timer-based background deliveries by
-       #   setting this to 0.
-       setting :delivery_interval, default: 10
-       # option [Integer] The number of buffered messages that will trigger a background message
-       #   delivery. Default is 100 messages. Disable buffer size based background deliveries by
-       #   setting this to 0.
-       setting :delivery_threshold, default: 100
-       # option [Boolean]
-       setting :idempotent, default: false
-       # option [Boolean]
-       setting :transactional, default: false
-       # option [Integer]
-       setting :transactional_timeout, default: 60
-
-       # option [Integer] The number of retries when attempting to deliver messages.
-       setting :max_retries, default: 2
-       # option [Integer]
-       setting :required_acks, default: -1
-       # option [Integer]
-       setting :retry_backoff, default: 1
-
-       # option [Integer] The minimum number of messages that must be buffered before compression
-       #   is attempted. By default only one message is required. Only relevant if
-       #   compression_codec is set.
-       setting :compression_threshold, default: 1
-       # option [Symbol] The codec used to compress messages. Must be either snappy or gzip.
-       setting :compression_codec, default: nil
-
-       # SSL authentication related settings
-       # option ca_cert [String, nil] SSL CA certificate
-       setting :ssl_ca_cert, default: nil
-       # option ssl_ca_cert_file_path [String, nil] SSL CA certificate file path
-       setting :ssl_ca_cert_file_path, default: nil
-       # option ssl_ca_certs_from_system [Boolean] Use the CA certs from your system's default
-       #   certificate store
-       setting :ssl_ca_certs_from_system, default: false
-       # option ssl_verify_hostname [Boolean] Verify the hostname for client certs
-       setting :ssl_verify_hostname, default: true
-       # option ssl_client_cert [String, nil] SSL client certificate
-       setting :ssl_client_cert, default: nil
-       # option ssl_client_cert_key [String, nil] SSL client certificate key
-       setting :ssl_client_cert_key, default: nil
-       # option sasl_gssapi_principal [String, nil] sasl principal
-       setting :sasl_gssapi_principal, default: nil
-       # option sasl_gssapi_keytab [String, nil] sasl keytab
-       setting :sasl_gssapi_keytab, default: nil
-       # option sasl_plain_authzid [String] The authorization identity to use
-       setting :sasl_plain_authzid, default: ''
-       # option sasl_plain_username [String, nil] The username used to authenticate
-       setting :sasl_plain_username, default: nil
-       # option sasl_plain_password [String, nil] The password used to authenticate
-       setting :sasl_plain_password, default: nil
-       # option sasl_scram_username [String, nil] The username used to authenticate
-       setting :sasl_scram_username, default: nil
-       # option sasl_scram_password [String, nil] The password used to authenticate
-       setting :sasl_scram_password, default: nil
-       # option sasl_scram_mechanism [String, nil] Scram mechanism, either 'sha256' or 'sha512'
-       setting :sasl_scram_mechanism, default: nil
-       # option sasl_over_ssl [Boolean] whether to enforce SSL with SASL
-       setting :sasl_over_ssl, default: true
-       # option ssl_client_cert_chain [String, nil] client cert chain or nil if not used
-       setting :ssl_client_cert_chain, default: nil
-       # option ssl_client_cert_key_password [String, nil] the password required to read
-       #   the ssl_client_cert_key
-       setting :ssl_client_cert_key_password, default: nil
-       # @param sasl_oauth_token_provider [Object, nil] OAuthBearer Token Provider instance that
-       #   implements a #token method.
-       setting :sasl_oauth_token_provider, default: nil
-     end
-
-     class << self
-       # Configuration method
-       # @yield Runs a block of code providing a config singleton instance to it
-       # @yieldparam [WaterDrop::Config] WaterDrop config instance
-       def setup
-         configure do |config|
-           yield(config)
-           validate!(config.to_h)
-         end
+     #   testing and/or developing because when set to false, it won't actually ping Kafka but
+     #   will run all the validations, etc
+     setting :deliver, true
+     # rdkafka options
+     # @see https://github.com/edenhill/librdkafka/blob/master/CONFIGURATION.md
+     setting :kafka, {}
+
+     # Configuration method
+     # @yield Runs a block of code providing a config singleton instance to it
+     # @yieldparam [WaterDrop::Config] WaterDrop config instance
+     def setup
+       configure do |config|
+         yield(config)
+         validate!(config.to_h)
        end
+     end
 
-       private
+     private
 
-       # Validates the configuration and if anything is wrong, will raise an exception
-       # @param config_hash [Hash] config hash with setup details
-       # @raise [WaterDrop::Errors::InvalidConfiguration] raised when something is wrong with
-       #   the configuration
-       def validate!(config_hash)
-         validation_result = SCHEMA.call(config_hash)
-         return true if validation_result.success?
+     # Validates the configuration and if anything is wrong, will raise an exception
+     # @param config_hash [Hash] config hash with setup details
+     # @raise [WaterDrop::Errors::ConfigurationInvalidError] raised when something is wrong with
+     #   the configuration
+     def validate!(config_hash)
+       result = Contracts::Config.new.call(config_hash)
+       return true if result.success?
 
-         raise Errors::InvalidConfiguration, validation_result.errors.to_h
-       end
+       raise Errors::ConfigurationInvalidError, result.errors.to_h
      end
    end
  end
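
Following the `#validate!` flow above, a failing contract surfaces as `Errors::ConfigurationInvalidError` with the dry-validation errors hash attached. A sketch (which exact values the contract rejects is an assumption; a negative timeout is used here as a likely violation):

```ruby
producer = WaterDrop::Producer.new

begin
  producer.setup do |config|
    config.max_wait_timeout = -1 # assumed to violate the config contract
    config.kafka = { 'bootstrap.servers': 'localhost:9092' }
  end
rescue WaterDrop::Errors::ConfigurationInvalidError => e
  puts e.message # the stringified validation errors hash
end
```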