waterdrop 2.4.6 → 2.4.7
- checksums.yaml +4 -4
- checksums.yaml.gz.sig +0 -0
- data/CHANGELOG.md +5 -0
- data/Gemfile.lock +3 -3
- data/README.md +33 -0
- data/lib/waterdrop/config.rb +6 -0
- data/lib/waterdrop/instrumentation/logger_listener.rb +1 -0
- data/lib/waterdrop/middleware.rb +50 -0
- data/lib/waterdrop/patches/rdkafka/metadata.rb +9 -1
- data/lib/waterdrop/producer/async.rb +4 -0
- data/lib/waterdrop/producer/buffer.rb +3 -0
- data/lib/waterdrop/producer/sync.rb +4 -0
- data/lib/waterdrop/producer.rb +3 -0
- data/lib/waterdrop/version.rb +1 -1
- data/waterdrop.gemspec +1 -1
- data.tar.gz.sig +3 -3
- metadata +5 -4
- metadata.gz.sig +0 -0
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 8aa78b8b5f2d8534689cb9fe3db46d579610ce3b4767cef46b16d8fb1e19d48e
+  data.tar.gz: 75ff38cc56317bc74047fe9ba53ee1f42b098299f694c1ad5767f80ed4c2cf7e
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 3e5589c065d8db716a277bb78e985b85fbe94c2d064eff1cae780f9b4fa6f64ddd95ef0591c99de2dcefedd5582c80ad83442f1cd53091b27349fb193bcd98f6
+  data.tar.gz: 9af7835c4419dd10af2cde93b42c7797e1811f3f049fe7ae4a50a8ccad50fc6f86a8703777c60410d10c1615386eaee2acbeacf073dd640ba24af24ed534326a
checksums.yaml.gz.sig
CHANGED
Binary file
data/CHANGELOG.md
CHANGED
@@ -1,5 +1,10 @@
 # WaterDrop changelog
 
+## 2.4.7 (2022-12-18)
+- Add support to customizable middlewares that can modify message hash prior to validation and dispatch.
+- Fix a case where upon not-available leader, metadata request would not be retried
+- Require `karafka-core` 2.0.7.
+
 ## 2.4.6 (2022-12-10)
 - Set `statistics.interval.ms` to 5 seconds by default, so the defaults cover all the instrumentation out of the box.
 
data/Gemfile.lock
CHANGED
@@ -1,8 +1,8 @@
 PATH
   remote: .
   specs:
-    waterdrop (2.4.
-      karafka-core (>= 2.0.
+    waterdrop (2.4.7)
+      karafka-core (>= 2.0.7, < 3.0.0)
       zeitwerk (~> 2.3)
 
 GEM
@@ -22,7 +22,7 @@ GEM
     ffi (1.15.5)
     i18n (1.12.0)
       concurrent-ruby (~> 1.0)
-    karafka-core (2.0.
+    karafka-core (2.0.7)
       concurrent-ruby (>= 1.1)
       rdkafka (>= 0.12)
     mini_portile2 (2.8.0)
data/README.md
CHANGED
@@ -35,6 +35,7 @@ It:
   * [Error notifications](#error-notifications)
   * [Datadog and StatsD integration](#datadog-and-statsd-integration)
   * [Forking and potential memory problems](#forking-and-potential-memory-problems)
+- [Middleware](#middleware)
 - [Note on contributions](#note-on-contributions)
 
 ## Installation
@@ -420,6 +421,38 @@ If you work with forked processes, make sure you **don't** use the producer befo
 
 To tackle this [obstacle](https://github.com/appsignal/rdkafka-ruby/issues/15) related to rdkafka, WaterDrop adds finalizer to each of the producers to close the rdkafka client before the Ruby process is shutdown. Due to the [nature of the finalizers](https://www.mikeperham.com/2010/02/24/the-trouble-with-ruby-finalizers/), this implementation prevents producers from being GCed (except upon VM shutdown) and can cause memory leaks if you don't use persistent/long-lived producers in a long-running process or if you don't use the `#close` method of a producer when it is no longer needed. Creating a producer instance for each message is anyhow a rather bad idea, so we recommend not to.
 
+## Middleware
+
+WaterDrop supports injecting middleware similar to Rack.
+
+Middleware can be used to provide extra functionalities like auto-serialization of data or any other modifications of messages before their validation and dispatch.
+
+Each middleware accepts the message hash as input and expects a message hash as a result.
+
+There are two methods to register middlewares:
+
+- `#prepend` - registers middleware as the first in the order of execution
+- `#append` - registers middleware as the last in the order of execution
+
+Below you can find an example middleware that converts the incoming payload into a JSON string by running `#to_json` automatically:
+
+```ruby
+class AutoMapper
+  def call(message)
+    message[:payload] = message[:payload].to_json
+    message
+  end
+end
+
+# Register middleware
+producer.middleware.append(AutoMapper.new)
+
+# Dispatch without manual casting
+producer.produce_async(topic: 'users', payload: user)
+```
+
+**Note**: It is up to the end user to decide whether to modify the provided message or deep copy it and update the newly created one.
+
 ## Note on contributions
 
 First, thank you for considering contributing to the Karafka ecosystem! It's people like you that make the open source community such a great community!
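Reviewer sketch: the README note above leaves the choice between mutating the message and copying it to the user. The variant below illustrates the copying option. `NonMutatingAutoMapper` is a hypothetical name, not part of WaterDrop, and `Hash#merge` only makes a shallow copy, which is enough here because only the `:payload` key is replaced.

```ruby
require 'json'

# Hypothetical middleware (not shipped with WaterDrop): serializes the payload
# into a JSON string without touching the hash that the caller passed in.
class NonMutatingAutoMapper
  # @param message [Hash] message hash as given to the producer
  # @return [Hash] new hash whose :payload is a JSON string
  def call(message)
    # Hash#merge returns a new hash; the caller's hash keeps its original payload
    message.merge(payload: message[:payload].to_json)
  end
end

original = { topic: 'users', payload: { id: 1 } }
mapped = NonMutatingAutoMapper.new.call(original)
```

Because the step only needs to respond to `#call` and return a hash, it could presumably be registered the same way as the README's `AutoMapper`, through `producer.middleware.append`.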
data/lib/waterdrop/config.rb
CHANGED
@@ -57,6 +57,12 @@ module WaterDrop
     # rdkafka options
     # @see https://github.com/edenhill/librdkafka/blob/master/CONFIGURATION.md
     setting :kafka, default: {}
+    # Middleware chain that can be expanded with useful middleware steps
+    setting(
+      :middleware,
+      default: false,
+      constructor: ->(middleware) { middleware || WaterDrop::Middleware.new }
+    )
 
     # Configuration method
     # @yield Runs a block of code providing a config singleton instance to it
data/lib/waterdrop/instrumentation/logger_listener.rb
CHANGED
@@ -1,6 +1,7 @@
 # frozen_string_literal: true
 
 module WaterDrop
+  # WaterDrop instrumentation related module
   module Instrumentation
     # Default listener that hooks up to our instrumentation and uses its events for logging
     # It can be removed/replaced or anything without any harm to the Waterdrop flow
data/lib/waterdrop/middleware.rb
ADDED
@@ -0,0 +1,50 @@
+# frozen_string_literal: true
+
+module WaterDrop
+  # Simple middleware layer for manipulating messages prior to their validation
+  class Middleware
+    def initialize
+      @mutex = Mutex.new
+      @steps = []
+    end
+
+    # Runs middleware on a single message prior to validation
+    #
+    # @param message [Hash] message hash
+    # @return [Hash] message hash. Either the same if transformed in place, or a copy if modified
+    # into a new object.
+    # @note You need to decide yourself whether you don't use the message hash data anywhere else
+    # and you want to save on memory by modifying it in place or do you want to do a deep copy
+    def run(message)
+      @steps.each do |step|
+        message = step.call(message)
+      end
+
+      message
+    end
+
+    # @param messages [Array<Hash>] messages on which we want to run middlewares
+    # @return [Array<Hash>] transformed messages
+    def run_many(messages)
+      messages.map do |message|
+        run(message)
+      end
+    end
+
+    # Register given middleware as the first one in the chain
+    # @param step [#call] step that needs to return the message
+    def prepend(step)
+      @mutex.synchronize do
+        @steps.prepend step
+      end
+    end
+
+    # Register given middleware as the last one in the chain
+    # @param step [#call] step that needs to return the message
+    def append(step)
+      @mutex.synchronize do
+        @steps.append step
+      end
+    end
+  end
+end
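Reviewer sketch: to see how `#prepend` and `#append` affect execution order, the snippet below uses a stand-in class that mirrors the chain logic of the new file, so it runs without the gem installed. `ChainSketch` and the `:trace` key are hypothetical and exist only for this illustration. Plain lambdas work as steps because a step only has to respond to `#call`.

```ruby
# Stand-in that mirrors the run/prepend/append logic of the middleware file
# above. It is a sketch for illustration, not WaterDrop::Middleware itself.
class ChainSketch
  def initialize
    @mutex = Mutex.new
    @steps = []
  end

  # Passes the message through every step, feeding each result into the next
  def run(message)
    @steps.each { |step| message = step.call(message) }
    message
  end

  # Registers a step at the front of the chain
  def prepend(step)
    @mutex.synchronize { @steps.prepend(step) }
  end

  # Registers a step at the end of the chain
  def append(step)
    @mutex.synchronize { @steps.append(step) }
  end
end

chain = ChainSketch.new
# Each step records its own name so the order of execution becomes visible
chain.append(->(message) { message.merge(trace: message[:trace] + [:appended]) })
chain.prepend(->(message) { message.merge(trace: message[:trace] + [:prepended]) })

result = chain.run({ topic: 'users', payload: 'x', trace: [] })
```

Although the appended step was registered first, the prepended one runs before it, so `result[:trace]` ends up as `[:prepended, :appended]`.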
data/lib/waterdrop/patches/rdkafka/metadata.rb
CHANGED
@@ -7,6 +7,14 @@ module WaterDrop
     module Rdkafka
       # Rdkafka::Metadata patches
       module Metadata
+        # Errors upon which we retry the metadata fetch
+        RETRIED_ERRORS = %i[
+          timed_out
+          leader_not_available
+        ].freeze
+
+        private_constant :RETRIED_ERRORS
+
         # We overwrite this method because there were reports of metadata operation timing out
         # when Kafka was under stress. While the messages dispatch will be retried, metadata
         # fetch happens prior to that, effectively crashing the process. Metadata fetch was not
@@ -19,7 +27,7 @@ module WaterDrop
 
           super(*args)
         rescue ::Rdkafka::RdkafkaError => e
-          raise unless e.code
+          raise unless RETRIED_ERRORS.include?(e.code)
           raise if attempt > 10
 
           backoff_factor = 2**attempt
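Reviewer sketch: the hunk above retries the metadata fetch only for error codes on a whitelist and backs off by a factor of `2**attempt`. The snippet below shows that pattern in isolation. `SketchError`, `SKETCH_RETRIED_ERRORS`, `with_metadata_retries`, the injectable `sleeper` and the 0.1 second base delay are all assumptions made for this example. The diff shows only the code check, the attempt limit and the backoff factor, not the delay unit or the rest of the method.

```ruby
# Hypothetical error type standing in for ::Rdkafka::RdkafkaError
class SketchError < StandardError
  attr_reader :code

  def initialize(code)
    @code = code
    super(code.to_s)
  end
end

# Same codes as the constant added in the hunk above
SKETCH_RETRIED_ERRORS = %i[timed_out leader_not_available].freeze

# Runs the block and retries it only for whitelisted error codes.
# The sleeper is injectable so the sketch can run without real waiting.
def with_metadata_retries(max_attempts: 10, sleeper: ->(seconds) { sleep(seconds) })
  attempt = 0

  begin
    attempt += 1
    yield
  rescue SketchError => e
    # Anything that is not whitelisted is re-raised immediately
    raise unless SKETCH_RETRIED_ERRORS.include?(e.code)
    raise if attempt > max_attempts

    # Exponential backoff; the 0.1s base is an assumption of this sketch
    sleeper.call((2**attempt) * 0.1)
    retry
  end
end

calls = 0
delays = []
result = with_metadata_retries(sleeper: ->(seconds) { delays << seconds }) do
  calls += 1
  raise SketchError.new(:leader_not_available) if calls < 3

  :metadata
end
```

In this run the block fails twice with `:leader_not_available`, is retried with growing delays, and succeeds on the third call. An error with any other code would propagate on the first failure.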
data/lib/waterdrop/producer/async.rb
CHANGED
@@ -15,6 +15,8 @@ module WaterDrop
       # message could not be sent to Kafka
       def produce_async(message)
         ensure_active!
+
+        message = middleware.run(message)
         validate_message!(message)
 
         @monitor.instrument(
@@ -36,6 +38,8 @@ module WaterDrop
       # and the message could not be sent to Kafka
       def produce_many_async(messages)
         ensure_active!
+
+        messages = middleware.run_many(messages)
         messages.each { |message| validate_message!(message) }
 
         @monitor.instrument(
data/lib/waterdrop/producer/buffer.rb
CHANGED
@@ -19,6 +19,7 @@ module WaterDrop
       # message could not be sent to Kafka
       def buffer(message)
         ensure_active!
+        message = middleware.run(message)
         validate_message!(message)
 
         @monitor.instrument(
@@ -37,6 +38,8 @@ module WaterDrop
       # and the message could not be sent to Kafka
       def buffer_many(messages)
         ensure_active!
+
+        messages = middleware.run_many(messages)
         messages.each { |message| validate_message!(message) }
 
         @monitor.instrument(
data/lib/waterdrop/producer/sync.rb
CHANGED
@@ -17,6 +17,8 @@ module WaterDrop
       # message could not be sent to Kafka
       def produce_sync(message)
         ensure_active!
+
+        message = middleware.run(message)
         validate_message!(message)
 
         @monitor.instrument(
@@ -47,6 +49,8 @@ module WaterDrop
       # and the message could not be sent to Kafka
       def produce_many_sync(messages)
         ensure_active!
+
+        messages = middleware.run_many(messages)
         messages.each { |message| validate_message!(message) }
 
         @monitor.instrument('messages.produced_sync', producer_id: id, messages: messages) do
data/lib/waterdrop/producer.rb
CHANGED
@@ -3,10 +3,13 @@
 module WaterDrop
   # Main WaterDrop messages producer
   class Producer
+    extend Forwardable
     include Sync
     include Async
     include Buffer
 
+    def_delegators :config, :middleware
+
     # @return [String] uuid of the current producer
     attr_reader :id
     # @return [Status] producer status object
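Reviewer sketch: the `def_delegators :config, :middleware` line above is what makes `producer.middleware` available. A minimal, self-contained illustration of that delegation follows. `ConfigSketch` and `ProducerSketch` are hypothetical stand-ins, and a plain array plays the role of the middleware chain.

```ruby
require 'forwardable'

# Hypothetical stand-in for the producer config; it only holds the chain
ConfigSketch = Struct.new(:middleware)

# Hypothetical stand-in for the producer
class ProducerSketch
  extend Forwardable

  # Same delegation as in the hunk: #middleware is forwarded to #config
  def_delegators :config, :middleware

  attr_reader :config

  def initialize(config)
    @config = config
  end
end

steps = []
producer = ProducerSketch.new(ConfigSketch.new(steps))
# The producer exposes the very same object that the config holds
producer.middleware << :step
```

Since the delegated method returns the config's own object rather than a copy, anything registered through the producer lands in the chain stored in its config.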
data/lib/waterdrop/version.rb
CHANGED
data/waterdrop.gemspec
CHANGED
@@ -16,7 +16,7 @@ Gem::Specification.new do |spec|
   spec.description = spec.summary
   spec.license = 'MIT'
 
-  spec.add_dependency 'karafka-core', '>= 2.0.
+  spec.add_dependency 'karafka-core', '>= 2.0.7', '< 3.0.0'
   spec.add_dependency 'zeitwerk', '~> 2.3'
 
   spec.required_ruby_version = '>= 2.7'
data.tar.gz.sig
CHANGED
Binary file
metadata
CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: waterdrop
 version: !ruby/object:Gem::Version
-  version: 2.4.
+  version: 2.4.7
 platform: ruby
 authors:
 - Maciej Mensfeld
@@ -35,7 +35,7 @@ cert_chain:
   Qf04B9ceLUaC4fPVEz10FyobjaFoY4i32xRto3XnrzeAgfEe4swLq8bQsR3w/EF3
   MGU0FeSV2Yj7Xc2x/7BzLK8xQn5l7Yy75iPF+KP3vVmDHnNl
   -----END CERTIFICATE-----
-date: 2022-12-
+date: 2022-12-18 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: karafka-core
@@ -43,7 +43,7 @@ dependencies:
     requirements:
     - - ">="
       - !ruby/object:Gem::Version
-        version: 2.0.
+        version: 2.0.7
     - - "<"
       - !ruby/object:Gem::Version
         version: 3.0.0
@@ -53,7 +53,7 @@ dependencies:
     requirements:
     - - ">="
       - !ruby/object:Gem::Version
-        version: 2.0.
+        version: 2.0.7
     - - "<"
       - !ruby/object:Gem::Version
         version: 3.0.0
@@ -108,6 +108,7 @@ files:
 - lib/waterdrop/instrumentation/notifications.rb
 - lib/waterdrop/instrumentation/vendors/datadog/dashboard.json
 - lib/waterdrop/instrumentation/vendors/datadog/listener.rb
+- lib/waterdrop/middleware.rb
 - lib/waterdrop/patches/rdkafka/metadata.rb
 - lib/waterdrop/patches/rdkafka/producer.rb
 - lib/waterdrop/producer.rb
metadata.gz.sig
CHANGED
Binary file