waterdrop 2.5.0 → 2.5.1
- checksums.yaml +4 -4
- checksums.yaml.gz.sig +0 -0
- data/CHANGELOG.md +3 -0
- data/Gemfile.lock +2 -2
- data/README.md +34 -7
- data/lib/waterdrop/config.rb +11 -0
- data/lib/waterdrop/producer.rb +31 -0
- data/lib/waterdrop/version.rb +1 -1
- data.tar.gz.sig +0 -0
- metadata +2 -2
- metadata.gz.sig +0 -0
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 2b66f5a9cb1c6fe80fe594777cb60f9fd20f120c2a897ab439404c825503bb37
+  data.tar.gz: 985491a90694c7c729e5c2dd8a581127c96ec26a5eb5eda53d03bc32ab463ee6
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 6ec7c01eb151ad4142f7eccfb988c56394f53b4679e480a9d8706c73323e3f6a25f8f704595fe56d5ba45f7995756add34931cf91bfcb079be6873bcc5563371
+  data.tar.gz: 94261472ac4786fd7911bce919b45267b0d1dc7298965c37ceee51718733d72a650c50745fabac7a0a8604909414b4efb6233640f3f24c499e51f42316d3597a
checksums.yaml.gz.sig
CHANGED
Binary file
data/CHANGELOG.md
CHANGED
@@ -1,5 +1,8 @@
 # WaterDrop changelog
 
+## 2.5.1 (2023-03-09)
+- [Feature] Introduce a configurable backoff upon `librdkafka` queue full (false by default).
+
 ## 2.5.0 (2023-03-04)
 - [Feature] Pipe **all** the errors including synchronous errors via the `error.occurred`.
 - [Improvement] Pipe delivery errors that occurred not via the error callback using the `error.occurred` channel.
data/Gemfile.lock
CHANGED
@@ -1,7 +1,7 @@
 PATH
   remote: .
   specs:
-    waterdrop (2.5.0)
+    waterdrop (2.5.1)
       karafka-core (>= 2.0.12, < 3.0.0)
       zeitwerk (~> 2.3)
 
@@ -30,7 +30,7 @@ GEM
       mini_portile2 (~> 2.6)
       rake (> 12)
     mini_portile2 (2.8.1)
-    minitest (5.
+    minitest (5.18.0)
     rake (13.0.6)
     rspec (3.12.0)
       rspec-core (~> 3.12.0)
data/README.md
CHANGED
@@ -29,6 +29,7 @@ It:
 * [Buffering](#buffering)
   + [Using WaterDrop to buffer messages based on the application logic](#using-waterdrop-to-buffer-messages-based-on-the-application-logic)
   + [Using WaterDrop with rdkafka buffers to achieve periodic auto-flushing](#using-waterdrop-with-rdkafka-buffers-to-achieve-periodic-auto-flushing)
+* [Idempotence](#idempotence)
 * [Compression](#compression)
 - [Instrumentation](#instrumentation)
 * [Usage statistics](#usage-statistics)
@@ -92,13 +93,15 @@ end
 
 Some of the options are:
 
-| Option
-|
-| `id`
-| `logger`
-| `deliver`
-| `max_wait_timeout`
-| `wait_timeout`
+| Option                       | Description                                                      |
+|------------------------------|------------------------------------------------------------------|
+| `id`                         | id of the producer for instrumentation and logging               |
+| `logger`                     | Logger that we want to use                                       |
+| `deliver`                    | Should we send messages to Kafka or just fake the delivery       |
+| `max_wait_timeout`           | Waits that long for the delivery report or raises an error       |
+| `wait_timeout`               | Waits that long before re-check of delivery report availability  |
+| `wait_on_queue_full`         | Should we wait on queue full or raise an error when that happens |
+| `wait_on_queue_full_timeout` | Waits that long before retry when queue is full                  |
 
 Full list of the root configuration options is available [here](https://github.com/karafka/waterdrop/blob/master/lib/waterdrop/config.rb#L25).
 
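The two new queue-full options can be combined with the existing ones in a producer setup. A minimal sketch, assuming the 2.5.1 API shown in this diff (the broker address is a placeholder):

```ruby
require 'waterdrop'

producer = WaterDrop::Producer.new do |config|
  config.deliver = true
  config.kafka = { 'bootstrap.servers': 'localhost:9092' }
  # Instead of raising when librdkafka reports a full queue, back off and retry
  config.wait_on_queue_full = true
  # Sleep this long (in seconds) between retries; 0.1 is the 2.5.1 default
  config.wait_on_queue_full_timeout = 0.5
end
```

With `wait_on_queue_full` left at its `false` default, a full queue raises an error instead of retrying.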
@@ -206,6 +209,30 @@ WaterDrop producers support buffering messages in their internal buffers and on
 
 This means that depending on your use case, you can achieve both granular buffering and flushing control when needed with context awareness and periodic and size-based flushing functionalities.
 
+### Idempotence
+
+When idempotence is enabled, the producer will ensure that messages are successfully produced exactly once and in the original production order.
+
+To enable idempotence, you need to set the `enable.idempotence` kafka scope setting to `true`:
+
+```ruby
+WaterDrop::Producer.new do |config|
+  config.deliver = true
+  config.kafka = {
+    'bootstrap.servers': 'localhost:9092',
+    'enable.idempotence': true
+  }
+end
+```
+
+The following Kafka configuration properties are adjusted automatically (if not modified by the user) when idempotence is enabled:
+
+- `max.in.flight.requests.per.connection` set to `5`
+- `retries` set to `2147483647`
+- `acks` set to `all`
+
+The idempotent producer ensures that messages are always delivered in the correct order and without duplicates. In other words, when an idempotent producer sends a message, the messaging system ensures that the message is only delivered once to the message broker and subsequently to the consumers, even if the producer tries to send the message multiple times.
+
 ### Compression
 
 WaterDrop supports following compression types:
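The idempotence section added above lists three librdkafka properties that are adjusted automatically when the user has not set them. A hedged sketch of the equivalent explicit `kafka` scope configuration (values copied from that list; the broker address is a placeholder):

```ruby
config.kafka = {
  'bootstrap.servers': 'localhost:9092',
  'enable.idempotence': true,
  # Explicit equivalents of the automatic adjustments:
  'max.in.flight.requests.per.connection': 5,
  'retries': 2_147_483_647,
  'acks': 'all'
}
```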
data/lib/waterdrop/config.rb
CHANGED
@@ -50,6 +50,17 @@ module WaterDrop
     # delivery report. In a really robust systems, this describes the min-delivery time
     # for a single sync message when produced in isolation
     setting :wait_timeout, default: 0.005 # 5 milliseconds
+    # option [Boolean] should we upon detecting full librdkafka queue backoff and retry or should
+    # we raise an exception.
+    # When this is set to `true`, upon full queue, we won't raise an error. There will be error
+    # in the `error.occurred` notification pipeline with a proper type as while this is
+    # recoverable, in a high number it still may mean issues.
+    # Waiting is one of the recommended strategies.
+    setting :wait_on_queue_full, default: false
+    # option [Integer] how long (in seconds) should we backoff before a retry when queue is full
+    # The retry will happen with the same message and backoff should give us some time to
+    # dispatch previously buffered messages.
+    setting :wait_on_queue_full_timeout, default: 0.1
     # option [Boolean] should we send messages. Setting this to false can be really useful when
     # testing and or developing because when set to false, won't actually ping Kafka but will
     # run all the validations, etc
data/lib/waterdrop/producer.rb
CHANGED
@@ -173,6 +173,37 @@ module WaterDrop
     # @param message [Hash] message we want to send
     def produce(message)
       client.produce(**message)
+    rescue SUPPORTED_FLOW_ERRORS.first => e
+      # Unless we want to wait and retry and it's a full queue, we raise normally
+      raise unless @config.wait_on_queue_full
+      raise unless e.code == :queue_full
+
+      # We use this syntax here because we want to preserve the original `#cause` when we
+      # instrument the error and there is no way to manually assign `#cause` value. We want to keep
+      # the original cause to maintain the same API across all the errors dispatched to the
+      # notifications pipeline.
+      begin
+        raise Errors::ProduceError
+      rescue Errors::ProduceError => e
+        # We want to instrument on this event even when we restart it.
+        # The reason is simple: instrumentation and visibility.
+        # We can recover from this, but despite that we should be able to instrument this.
+        # If this type of event happens too often, it may indicate that the buffer settings are not
+        # well configured.
+        @monitor.instrument(
+          'error.occurred',
+          producer_id: id,
+          message: message,
+          error: e,
+          type: 'message.produce'
+        )
+
+        # We do not poll the producer because polling happens in a background thread
+        # It also should not be a frequent case (queue full), hence it's ok to just throttle.
+        sleep @config.wait_on_queue_full_timeout
+      end
+
+      retry
     end
 
     # Waits on a given handler
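The rescued `produce` flow above can be sketched as a self-contained simulation. `FakeClient`, `QueueFullError`, and `produce_with_backoff` are hypothetical stand-ins for the real client, the rdkafka error, and WaterDrop's method; the `error.occurred` instrumentation is omitted:

```ruby
# NOT the WaterDrop source: a minimal sketch of the queue-full backoff-and-retry
# flow added in 2.5.1, using stand-in classes so it runs without the gem.
class QueueFullError < StandardError
  def code
    :queue_full
  end
end

# Raises :queue_full a fixed number of times, then accepts the message.
class FakeClient
  attr_reader :delivered

  def initialize(failures)
    @failures = failures
    @delivered = []
  end

  def produce(payload:)
    if @failures.positive?
      @failures -= 1
      raise QueueFullError
    end
    @delivered << payload
  end
end

# Mirrors the rescued logic: retry with a sleep on :queue_full when the
# wait_on_queue_full option is enabled, otherwise re-raise. Returns the
# number of attempts made so the behavior is observable.
def produce_with_backoff(client, message, wait_on_queue_full:, wait_on_queue_full_timeout: 0.001)
  attempts = 0
  begin
    attempts += 1
    client.produce(**message)
  rescue QueueFullError => e
    raise unless wait_on_queue_full
    raise unless e.code == :queue_full

    # In WaterDrop this is also where `error.occurred` is instrumented
    sleep wait_on_queue_full_timeout
    retry
  end
  attempts
end

client = FakeClient.new(2)
attempts = produce_with_backoff(client, { payload: 'hello' }, wait_on_queue_full: true)
puts attempts                 # 3: two queue-full hits, then success
puts client.delivered.inspect # ["hello"]
```

With `wait_on_queue_full: false` the first `QueueFullError` propagates to the caller, matching the pre-2.5.1 behavior.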
data/lib/waterdrop/version.rb
CHANGED
data.tar.gz.sig
CHANGED
Binary file
metadata
CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: waterdrop
 version: !ruby/object:Gem::Version
-  version: 2.5.0
+  version: 2.5.1
 platform: ruby
 authors:
 - Maciej Mensfeld
 
@@ -35,7 +35,7 @@ cert_chain:
   Qf04B9ceLUaC4fPVEz10FyobjaFoY4i32xRto3XnrzeAgfEe4swLq8bQsR3w/EF3
   MGU0FeSV2Yj7Xc2x/7BzLK8xQn5l7Yy75iPF+KP3vVmDHnNl
   -----END CERTIFICATE-----
-date: 2023-03-
+date: 2023-03-09 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: karafka-core
metadata.gz.sig
CHANGED
Binary file