waterdrop 2.4.6 → 2.4.7
- checksums.yaml +4 -4
- checksums.yaml.gz.sig +0 -0
- data/CHANGELOG.md +5 -0
- data/Gemfile.lock +3 -3
- data/README.md +33 -0
- data/lib/waterdrop/config.rb +6 -0
- data/lib/waterdrop/instrumentation/logger_listener.rb +1 -0
- data/lib/waterdrop/middleware.rb +50 -0
- data/lib/waterdrop/patches/rdkafka/metadata.rb +9 -1
- data/lib/waterdrop/producer/async.rb +4 -0
- data/lib/waterdrop/producer/buffer.rb +3 -0
- data/lib/waterdrop/producer/sync.rb +4 -0
- data/lib/waterdrop/producer.rb +3 -0
- data/lib/waterdrop/version.rb +1 -1
- data/waterdrop.gemspec +1 -1
- data.tar.gz.sig +3 -3
- metadata +5 -4
- metadata.gz.sig +0 -0
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 8aa78b8b5f2d8534689cb9fe3db46d579610ce3b4767cef46b16d8fb1e19d48e
+  data.tar.gz: 75ff38cc56317bc74047fe9ba53ee1f42b098299f694c1ad5767f80ed4c2cf7e
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 3e5589c065d8db716a277bb78e985b85fbe94c2d064eff1cae780f9b4fa6f64ddd95ef0591c99de2dcefedd5582c80ad83442f1cd53091b27349fb193bcd98f6
+  data.tar.gz: 9af7835c4419dd10af2cde93b42c7797e1811f3f049fe7ae4a50a8ccad50fc6f86a8703777c60410d10c1615386eaee2acbeacf073dd640ba24af24ed534326a
checksums.yaml.gz.sig
CHANGED
Binary file
data/CHANGELOG.md
CHANGED
@@ -1,5 +1,10 @@
 # WaterDrop changelog
 
+## 2.4.7 (2022-12-18)
+- Add support to customizable middlewares that can modify message hash prior to validation and dispatch.
+- Fix a case where upon not-available leader, metadata request would not be retried
+- Require `karafka-core` 2.0.7.
+
 ## 2.4.6 (2022-12-10)
 - Set `statistics.interval.ms` to 5 seconds by default, so the defaults cover all the instrumentation out of the box.
 
data/Gemfile.lock
CHANGED
@@ -1,8 +1,8 @@
 PATH
   remote: .
   specs:
-    waterdrop (2.4.
-      karafka-core (>= 2.0.
+    waterdrop (2.4.7)
+      karafka-core (>= 2.0.7, < 3.0.0)
       zeitwerk (~> 2.3)
 
 GEM
@@ -22,7 +22,7 @@ GEM
     ffi (1.15.5)
     i18n (1.12.0)
       concurrent-ruby (~> 1.0)
-    karafka-core (2.0.
+    karafka-core (2.0.7)
       concurrent-ruby (>= 1.1)
       rdkafka (>= 0.12)
     mini_portile2 (2.8.0)
data/README.md
CHANGED
@@ -35,6 +35,7 @@ It:
   * [Error notifications](#error-notifications)
   * [Datadog and StatsD integration](#datadog-and-statsd-integration)
   * [Forking and potential memory problems](#forking-and-potential-memory-problems)
+- [Middleware](#middleware)
 - [Note on contributions](#note-on-contributions)
 
 ## Installation
@@ -420,6 +421,38 @@ If you work with forked processes, make sure you **don't** use the producer befo
 
 To tackle this [obstacle](https://github.com/appsignal/rdkafka-ruby/issues/15) related to rdkafka, WaterDrop adds finalizer to each of the producers to close the rdkafka client before the Ruby process is shutdown. Due to the [nature of the finalizers](https://www.mikeperham.com/2010/02/24/the-trouble-with-ruby-finalizers/), this implementation prevents producers from being GCed (except upon VM shutdown) and can cause memory leaks if you don't use persistent/long-lived producers in a long-running process or if you don't use the `#close` method of a producer when it is no longer needed. Creating a producer instance for each message is anyhow a rather bad idea, so we recommend not to.
 
+## Middleware
+
+WaterDrop supports injecting middleware similar to Rack.
+
+Middleware can be used to provide extra functionalities like auto-serialization of data or any other modifications of messages before their validation and dispatch.
+
+Each middleware accepts the message hash as input and expects a message hash as a result.
+
+There are two methods to register middlewares:
+
+- `#prepend` - registers middleware as the first in the order of execution
+- `#append` - registers middleware as the last in the order of execution
+
+Below you can find an example middleware that converts the incoming payload into a JSON string by running `#to_json` automatically:
+
+```ruby
+class AutoMapper
+  def call(message)
+    message[:payload] = message[:payload].to_json
+    message
+  end
+end
+
+# Register middleware
+producer.middleware.append(AutoMapper.new)
+
+# Dispatch without manual casting
+producer.produce_async(topic: 'users', payload: user)
+```
+
+**Note**: It is up to the end user to decide whether to modify the provided message or deep copy it and update the newly created one.
+
 ## Note on contributions
 
 First, thank you for considering contributing to the Karafka ecosystem! It's people like you that make the open source community such a great community!
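The README note above leaves the choice between mutating and copying to the user. As a minimal sketch of the copying variant, the hypothetical `CopyingAutoMapper` below (not part of WaterDrop) builds a new hash with `Hash#merge`, so the caller's hash is left untouched. It relies only on the contract described in the README: `#call` receives a message hash and returns a message hash. Note that `merge` makes a shallow copy, which is enough here because only the top-level `:payload` key is replaced.

```ruby
require 'json'

# Hypothetical middleware (not shipped with WaterDrop): serializes the payload
# into a new hash instead of modifying the one passed in.
class CopyingAutoMapper
  # @param message [Hash] message hash
  # @return [Hash] shallow copy of the message with a JSON string payload
  def call(message)
    message.merge(payload: JSON.generate(message[:payload]))
  end
end

original = { topic: 'users', payload: { id: 1 } }
mapped = CopyingAutoMapper.new.call(original)

puts mapped[:payload]            # JSON string built from the payload
puts original[:payload].class    # still a Hash, the input was not modified
```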
data/lib/waterdrop/config.rb
CHANGED
@@ -57,6 +57,12 @@ module WaterDrop
     # rdkafka options
     # @see https://github.com/edenhill/librdkafka/blob/master/CONFIGURATION.md
     setting :kafka, default: {}
+    # Middleware chain that can be expanded with useful middleware steps
+    setting(
+      :middleware,
+      default: false,
+      constructor: ->(middleware) { middleware || WaterDrop::Middleware.new }
+    )
 
     # Configuration method
     # @yield Runs a block of code providing a config singleton instance to it
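The `constructor` lambda in the setting above falls back to a fresh `WaterDrop::Middleware` whenever the value it receives is falsy, which is the case for the `false` default. The sketch below isolates just that fallback. `SketchMiddleware` is a stand-in class so the snippet runs without the gem, and how karafka-core invokes the constructor is not visible in this diff and is not modelled here.

```ruby
# Stand-in for WaterDrop::Middleware so the sketch runs without the gem.
class SketchMiddleware; end

# Same shape as the lambda passed to `constructor:` in the diff
constructor = ->(middleware) { middleware || SketchMiddleware.new }

custom = SketchMiddleware.new

# A falsy value yields a newly built chain
from_default = constructor.call(false)
# A provided object is returned as-is
from_custom = constructor.call(custom)

puts from_default.class
puts from_custom.equal?(custom)
```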
data/lib/waterdrop/instrumentation/logger_listener.rb
CHANGED
@@ -1,6 +1,7 @@
 # frozen_string_literal: true
 
 module WaterDrop
+  # WaterDrop instrumentation related module
   module Instrumentation
     # Default listener that hooks up to our instrumentation and uses its events for logging
     # It can be removed/replaced or anything without any harm to the Waterdrop flow
data/lib/waterdrop/middleware.rb
ADDED
@@ -0,0 +1,50 @@
+# frozen_string_literal: true
+
+module WaterDrop
+  # Simple middleware layer for manipulating messages prior to their validation
+  class Middleware
+    def initialize
+      @mutex = Mutex.new
+      @steps = []
+    end
+
+    # Runs middleware on a single message prior to validation
+    #
+    # @param message [Hash] message hash
+    # @return [Hash] message hash. Either the same if transformed in place, or a copy if modified
+    # into a new object.
+    # @note You need to decide yourself whether you don't use the message hash data anywhere else
+    # and you want to save on memory by modifying it in place or do you want to do a deep copy
+    def run(message)
+      @steps.each do |step|
+        message = step.call(message)
+      end
+
+      message
+    end
+
+    # @param messages [Array<Hash>] messages on which we want to run middlewares
+    # @return [Array<Hash>] transformed messages
+    def run_many(messages)
+      messages.map do |message|
+        run(message)
+      end
+    end
+
+    # Register given middleware as the first one in the chain
+    # @param step [#call] step that needs to return the message
+    def prepend(step)
+      @mutex.synchronize do
+        @steps.prepend step
+      end
+    end
+
+    # Register given middleware as the last one in the chain
+    # @param step [#call] step that needs to return the message
+    def append(step)
+      @mutex.synchronize do
+        @steps.append step
+      end
+    end
+  end
+end
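To illustrate how `prepend` and `append` affect execution order, here is a small sketch. `SketchChain` is a condensed stand-in for the class in the diff (it uses `reduce` instead of the `each` loop, which should behave the same), written so the example runs without the gem. The `:trace` key is purely illustrative and not a WaterDrop message attribute.

```ruby
# Condensed stand-in for the middleware class in the diff above.
class SketchChain
  def initialize
    @mutex = Mutex.new
    @steps = []
  end

  # Feeds the result of each step into the next one
  def run(message)
    @steps.reduce(message) { |current, step| step.call(current) }
  end

  def prepend(step)
    @mutex.synchronize { @steps.prepend(step) }
  end

  def append(step)
    @mutex.synchronize { @steps.append(step) }
  end
end

chain = SketchChain.new

# Any object responding to #call works as a step, including lambdas
chain.append(->(message) { message.merge(trace: message[:trace] + [:appended]) })
chain.prepend(->(message) { message.merge(trace: message[:trace] + [:prepended]) })

input = { topic: 'users', payload: 'x', trace: [] }
result = chain.run(input)

puts result[:trace].inspect # → [:prepended, :appended]
```

Although the prepended step was registered second, it runs first, which matches the ordering described in the README.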
data/lib/waterdrop/patches/rdkafka/metadata.rb
CHANGED
@@ -7,6 +7,14 @@ module WaterDrop
     module Rdkafka
       # Rdkafka::Metadata patches
       module Metadata
+        # Errors upon which we retry the metadata fetch
+        RETRIED_ERRORS = %i[
+          timed_out
+          leader_not_available
+        ].freeze
+
+        private_constant :RETRIED_ERRORS
+
         # We overwrite this method because there were reports of metadata operation timing out
         # when Kafka was under stress. While the messages dispatch will be retried, metadata
         # fetch happens prior to that, effectively crashing the process. Metadata fetch was not
@@ -19,7 +27,7 @@ module WaterDrop
 
           super(*args)
         rescue ::Rdkafka::RdkafkaError => e
-          raise unless e.code
+          raise unless RETRIED_ERRORS.include?(e.code)
           raise if attempt > 10
 
           backoff_factor = 2**attempt
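Only fragments of the retry logic are visible in this diff: the allow-list check, the `attempt > 10` cut-off and `backoff_factor = 2**attempt`. The sketch below combines these into a standalone helper to show the general pattern. `SketchError`, `with_metadata_retries`, the `base_delay` value and the injectable `sleeper` are all assumptions made for illustration and are not the patch's actual code.

```ruby
# Error codes worth retrying, mirroring RETRIED_ERRORS in the diff
RETRIED_ERRORS = %i[timed_out leader_not_available].freeze

# Hypothetical error carrying a code, standing in for Rdkafka::RdkafkaError
class SketchError < StandardError
  attr_reader :code

  def initialize(code)
    @code = code
    super(code.to_s)
  end
end

# Retries the block on allow-listed codes with exponential backoff.
# base_delay (seconds) and sleeper are assumptions of this sketch.
def with_metadata_retries(base_delay: 0.1, sleeper: ->(seconds) { sleep(seconds) })
  attempt = 0

  begin
    attempt += 1
    yield(attempt)
  rescue SketchError => e
    # Anything outside the allow-list is re-raised immediately
    raise unless RETRIED_ERRORS.include?(e.code)
    raise if attempt > 10

    backoff_factor = 2**attempt
    sleeper.call(base_delay * backoff_factor)
    retry
  end
end

# Usage: fail twice with a retriable code, then succeed.
# The sleeper records delays instead of actually sleeping.
delays = []
result = with_metadata_retries(sleeper: ->(seconds) { delays << seconds }) do |attempt|
  raise SketchError, :leader_not_available if attempt < 3

  :metadata
end

puts result.inspect
puts delays.inspect
```

Injecting the sleeper keeps the example fast and makes the backoff schedule observable.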
data/lib/waterdrop/producer/async.rb
CHANGED
@@ -15,6 +15,8 @@ module WaterDrop
       # message could not be sent to Kafka
       def produce_async(message)
         ensure_active!
+
+        message = middleware.run(message)
         validate_message!(message)
 
         @monitor.instrument(
@@ -36,6 +38,8 @@ module WaterDrop
       # and the message could not be sent to Kafka
       def produce_many_async(messages)
         ensure_active!
+
+        messages = middleware.run_many(messages)
         messages.each { |message| validate_message!(message) }
 
         @monitor.instrument(
data/lib/waterdrop/producer/buffer.rb
CHANGED
@@ -19,6 +19,7 @@ module WaterDrop
       # message could not be sent to Kafka
       def buffer(message)
         ensure_active!
+        message = middleware.run(message)
         validate_message!(message)
 
         @monitor.instrument(
@@ -37,6 +38,8 @@ module WaterDrop
       # and the message could not be sent to Kafka
       def buffer_many(messages)
         ensure_active!
+
+        messages = middleware.run_many(messages)
         messages.each { |message| validate_message!(message) }
 
         @monitor.instrument(
data/lib/waterdrop/producer/sync.rb
CHANGED
@@ -17,6 +17,8 @@ module WaterDrop
       # message could not be sent to Kafka
       def produce_sync(message)
         ensure_active!
+
+        message = middleware.run(message)
         validate_message!(message)
 
         @monitor.instrument(
@@ -47,6 +49,8 @@ module WaterDrop
       # and the message could not be sent to Kafka
       def produce_many_sync(messages)
         ensure_active!
+
+        messages = middleware.run_many(messages)
         messages.each { |message| validate_message!(message) }
 
         @monitor.instrument('messages.produced_sync', producer_id: id, messages: messages) do
data/lib/waterdrop/producer.rb
CHANGED
@@ -3,10 +3,13 @@
 module WaterDrop
   # Main WaterDrop messages producer
   class Producer
+    extend Forwardable
     include Sync
     include Async
     include Buffer
 
+    def_delegators :config, :middleware
+
     # @return [String] uuid of the current producer
     attr_reader :id
     # @return [Status] producer status object
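`def_delegators :config, :middleware` uses Ruby's standard `Forwardable` module, so `producer.middleware` becomes a shortcut for `producer.config.middleware`. A self-contained sketch of the same delegation follows. `SketchConfig` and `SketchProducer` are hypothetical stand-ins for the real classes, and only the delegation itself mirrors the diff.

```ruby
require 'forwardable'

# Hypothetical stand-in for the producer config
SketchConfig = Struct.new(:middleware)

# Hypothetical stand-in for the producer
class SketchProducer
  extend Forwardable

  # Defines SketchProducer#middleware, forwarding to config.middleware
  def_delegators :config, :middleware

  attr_reader :config

  def initialize(config)
    @config = config
  end
end

producer = SketchProducer.new(SketchConfig.new([:step]))

# Both calls return the very same object
puts producer.middleware.equal?(producer.config.middleware)
```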
data/lib/waterdrop/version.rb
CHANGED
data/waterdrop.gemspec
CHANGED
@@ -16,7 +16,7 @@ Gem::Specification.new do |spec|
   spec.description = spec.summary
   spec.license = 'MIT'
 
-  spec.add_dependency 'karafka-core', '>= 2.0.
+  spec.add_dependency 'karafka-core', '>= 2.0.7', '< 3.0.0'
   spec.add_dependency 'zeitwerk', '~> 2.3'
 
   spec.required_ruby_version = '>= 2.7'
data.tar.gz.sig
CHANGED
Binary file
metadata
CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: waterdrop
 version: !ruby/object:Gem::Version
-  version: 2.4.
+  version: 2.4.7
 platform: ruby
 authors:
 - Maciej Mensfeld
@@ -35,7 +35,7 @@ cert_chain:
   Qf04B9ceLUaC4fPVEz10FyobjaFoY4i32xRto3XnrzeAgfEe4swLq8bQsR3w/EF3
   MGU0FeSV2Yj7Xc2x/7BzLK8xQn5l7Yy75iPF+KP3vVmDHnNl
   -----END CERTIFICATE-----
-date: 2022-12-
+date: 2022-12-18 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: karafka-core
@@ -43,7 +43,7 @@ dependencies:
     requirements:
     - - ">="
       - !ruby/object:Gem::Version
-        version: 2.0.
+        version: 2.0.7
     - - "<"
       - !ruby/object:Gem::Version
         version: 3.0.0
@@ -53,7 +53,7 @@ dependencies:
     requirements:
     - - ">="
       - !ruby/object:Gem::Version
-        version: 2.0.
+        version: 2.0.7
     - - "<"
       - !ruby/object:Gem::Version
         version: 3.0.0
@@ -108,6 +108,7 @@ files:
 - lib/waterdrop/instrumentation/notifications.rb
 - lib/waterdrop/instrumentation/vendors/datadog/dashboard.json
 - lib/waterdrop/instrumentation/vendors/datadog/listener.rb
+- lib/waterdrop/middleware.rb
 - lib/waterdrop/patches/rdkafka/metadata.rb
 - lib/waterdrop/patches/rdkafka/producer.rb
 - lib/waterdrop/producer.rb
metadata.gz.sig
CHANGED
Binary file