waterdrop 2.5.3 → 2.6.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: ca3df6bcca5db1eed27ca612023b0df892e9a4c790b28353180184e0bdd4d784
-  data.tar.gz: f4d60c59bbfff8a8af5d64f7580b4fa0509e15b3f8382849913a80a2e49830e5
+  metadata.gz: ed73a2332f0161e71e385fd2250c96bc43383eead9f2945b2580226f855f2643
+  data.tar.gz: 20705696d8534e5b3e10e84a4ae85f05a01025007879b0f59cff7d26f61d28e3
 SHA512:
-  metadata.gz: 60d191c91cf40275895324a17c5f82838fe2f5215b3da983ad1008fa6a2ebf30ac228451b6108af5db56b0f104578541a6e35dc9fb645d0e74e38f7f9e99f233
-  data.tar.gz: 7185c404cdcfb406ee19be95290d85a316291268bdc5c668fb482c4f496d55e648f4604b5a707cbb1d6ed6b9d79553753f6e123714c19932e9088e83b3cf5f58
+  metadata.gz: 92533a6e46992a10b2c7d4f3c6cece7f905697cb3c7c2576ec3f0364400b14c71695f1202d8c4b896168753c1d38b888e996632e6894a7cdbdc9f92e940540a6
+  data.tar.gz: '0945428add01cbf32e84c15458db0e39cbc1caabf8e7af9db4221b8e6149162a26946206632858154c202e702c2564a0857d9d420cce3711916f64ccfa1eab87'
checksums.yaml.gz.sig CHANGED
Binary file
data/CHANGELOG.md CHANGED
@@ -1,9 +1,23 @@
 # WaterDrop changelog

+### 2.6.0 (2023-06-11)
+- [Improvement] Introduce a `client_class` setting for the ability to replace the underlying client with anything specific to a given env (dev, test, etc).
+- [Improvement] Introduce `Clients::Buffered`, useful for writing specs that do not have to talk with Kafka (id-ilych)
+- [Improvement] Make the `#produce` method private to avoid confusion and to make sure it is not used directly (it is not part of the official API).
+- [Change] Change the `wait_on_queue_full` default from `false` to `true`.
+- [Change] Rename `wait_on_queue_full_timeout` to `wait_backoff_on_queue_full` to match what it actually does.
+- [Enhancement] Introduce `wait_timeout_on_queue_full` with a proper meaning: the time after which, despite backoff, the error will be raised. This allows raising an error when the backoff attempts were insufficient and prevents an infinite retry loop on messages that can never be delivered.
+- [Fix] Provide a `type` for queue full errors that correctly references the appropriate public API method.
+
+### Upgrade notes
+
+1. Rename `wait_on_queue_full_timeout` to `wait_backoff_on_queue_full`.
+2. Set `wait_on_queue_full` to `false` if you did not use this feature and do not want the new default behavior.
+
 ## 2.5.3 (2023-05-26)
-- Require `karafka-core` `2.0.13`
-- Include topic name in the `error.occurred` notification payload.
-- Include topic name in the `message.acknowledged` notification payload.
+- [Enhancement] Include topic name in the `error.occurred` notification payload.
+- [Enhancement] Include topic name in the `message.acknowledged` notification payload.
+- [Maintenance] Require `karafka-core` `2.0.13`

 ## 2.5.2 (2023-04-24)
 - [Fix] Require missing Pathname (#345)
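
Upgrade note 1 above is a one-line rename in producer configuration. A hedged sketch of a 2.6.0 setup (the bootstrap server and option values here are illustrative, not taken from this diff):

```ruby
# Before (2.5.x):
#   config.wait_on_queue_full_timeout = 0.5
#
# After (2.6.0) -- same semantics, new name, plus the new deadline setting:
producer = WaterDrop::Producer.new do |config|
  config.kafka = { 'bootstrap.servers': 'localhost:9092' } # illustrative
  config.wait_on_queue_full = true        # now the default in 2.6.0
  config.wait_backoff_on_queue_full = 0.5 # renamed from wait_on_queue_full_timeout
  config.wait_timeout_on_queue_full = 10  # new: give up and re-raise after 10s of backoff
end
```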
data/Gemfile.lock CHANGED
@@ -1,7 +1,7 @@
 PATH
   remote: .
   specs:
-    waterdrop (2.5.3)
+    waterdrop (2.6.0)
       karafka-core (>= 2.0.13, < 3.0.0)
       zeitwerk (~> 2.3)

data/README.md CHANGED
@@ -93,15 +93,16 @@ end

 Some of the options are:

-| Option                       | Description                                                      |
-|------------------------------|------------------------------------------------------------------|
-| `id`                         | id of the producer for instrumentation and logging               |
-| `logger`                     | Logger that we want to use                                       |
-| `deliver`                    | Should we send messages to Kafka or just fake the delivery       |
-| `max_wait_timeout`           | Waits that long for the delivery report or raises an error       |
-| `wait_timeout`               | Waits that long before re-check of delivery report availability  |
-| `wait_on_queue_full`         | Should be wait on queue full or raise an error when that happens |
-| `wait_on_queue_full_timeout` | Waits that long before retry when queue is full                  |
+| Option                       | Description                                                                      |
+|------------------------------|----------------------------------------------------------------------------------|
+| `id`                         | id of the producer for instrumentation and logging                               |
+| `logger`                     | Logger that we want to use                                                       |
+| `deliver`                    | Should we send messages to Kafka or just fake the delivery                       |
+| `max_wait_timeout`           | Waits that long for the delivery report or raises an error                       |
+| `wait_timeout`               | Waits that long before re-checking delivery report availability                  |
+| `wait_on_queue_full`         | Should we wait on queue full or raise an error when that happens                 |
+| `wait_backoff_on_queue_full` | Waits that long before a retry when the queue is full                            |
+| `wait_timeout_on_queue_full` | How long to keep backing off and retrying before the queue full error is raised  |

 Full list of the root configuration options is available [here](https://github.com/karafka/waterdrop/blob/master/lib/waterdrop/config.rb#L25).

data/config/locales/errors.yml CHANGED
@@ -10,6 +10,9 @@ en:
     max_wait_timeout_format: must be an integer that is equal or bigger than 0
     kafka_format: must be a hash with symbol based keys
     kafka_key_must_be_a_symbol: All keys under the kafka settings scope need to be symbols
+    wait_on_queue_full_format: must be boolean
+    wait_backoff_on_queue_full_format: must be a numeric that is bigger or equal to 0
+    wait_timeout_on_queue_full_format: must be a numeric that is bigger or equal to 0

   message:
     missing: must be present
data/lib/waterdrop/clients/buffered.rb ADDED
@@ -0,0 +1,47 @@
+# frozen_string_literal: true
+
+module WaterDrop
+  module Clients
+    # Client used to buffer messages that we send out in specs and other places.
+    class Buffered < Clients::Dummy
+      attr_accessor :messages
+
+      # Sync fake response for the message delivery to Kafka, since we do not dispatch anything
+      class SyncResponse
+        # @param _args Handler wait arguments (irrelevant as waiting is fake here)
+        def wait(*_args)
+          false
+        end
+      end
+
+      # @param args [Object] anything accepted by `Clients::Dummy`
+      def initialize(*args)
+        super
+        @messages = []
+        @topics = Hash.new { |k, v| k[v] = [] }
+      end
+
+      # "Produces" message to Kafka: it acknowledges it locally, adds it to the internal buffer
+      # @param message [Hash] `WaterDrop::Producer#produce_sync` message hash
+      def produce(message)
+        topic = message.fetch(:topic) { raise ArgumentError, ':topic is missing' }
+        @topics[topic] << message
+        @messages << message
+        SyncResponse.new
+      end
+
+      # Returns messages produced to a given topic
+      # @param topic [String]
+      def messages_for(topic)
+        @topics[topic]
+      end
+
+      # Clears internal buffer
+      # Used in between specs so messages do not leak out
+      def reset
+        @messages.clear
+        @topics.each_value(&:clear)
+      end
+    end
+  end
+end
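
The buffering behavior of the file above can be sketched standalone. The class below mirrors the diff but drops the `Clients::Dummy` parent and the `SyncResponse` return so it runs without the gem; it is an illustration, not the shipped class:

```ruby
# Minimal standalone mirror of WaterDrop::Clients::Buffered for illustration.
class BufferedClientSketch
  attr_reader :messages

  def initialize
    @messages = []
    # Default each topic key to its own fresh array
    @topics = Hash.new { |hash, topic| hash[topic] = [] }
  end

  # Stores the message locally instead of dispatching it to Kafka
  def produce(message)
    topic = message.fetch(:topic) { raise ArgumentError, ':topic is missing' }
    @topics[topic] << message
    @messages << message
  end

  # Messages produced to a given topic
  def messages_for(topic)
    @topics[topic]
  end

  # Clears buffers so messages do not leak between specs
  def reset
    @messages.clear
    @topics.each_value(&:clear)
  end
end
```

In an application spec you would instead point the new `client_class` setting at `WaterDrop::Clients::Buffered` and assert on the producer client's `messages_for` output.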
data/lib/waterdrop/producer/dummy_client.rb → data/lib/waterdrop/clients/dummy.rb RENAMED
@@ -1,12 +1,15 @@
 # frozen_string_literal: true

 module WaterDrop
-  class Producer
+  module Clients
     # A dummy client that is supposed to be used instead of Rdkafka::Producer in case we don't
-    # want to dispatch anything to Kafka
-    class DummyClient
-      # @return [DummyClient] dummy instance
-      def initialize
+    # want to dispatch anything to Kafka.
+    #
+    # It does not store anything and just ignores messages.
+    class Dummy
+      # @param _producer [WaterDrop::Producer]
+      # @return [Dummy] dummy instance
+      def initialize(_producer)
        @counter = -1
      end

data/lib/waterdrop/clients/rdkafka.rb ADDED
@@ -0,0 +1,28 @@
+# frozen_string_literal: true
+
+module WaterDrop
+  # Namespace for all the clients that WaterDrop may use under the hood
+  module Clients
+    # Default Rdkafka client.
+    # Since we use the ::Rdkafka::Producer under the hood, this is just a module that aligns with
+    # the client building API for convenience.
+    module Rdkafka
+      class << self
+        # @param producer [WaterDrop::Producer] producer instance with its config, etc
+        # @note We overwrite this that way, because we do not care
+        def new(producer)
+          client = ::Rdkafka::Config.new(producer.config.kafka.to_h).producer
+
+          # This callback is not global and is per client, thus we do not have to wrap it with a
+          # callbacks manager to make it work
+          client.delivery_callback = Instrumentation::Callbacks::Delivery.new(
+            producer.id,
+            producer.config.monitor
+          )
+
+          client
+        end
+      end
+    end
+  end
+end
data/lib/waterdrop/config.rb CHANGED
@@ -56,15 +56,20 @@ module WaterDrop
     # in the `error.occurred` notification pipeline with a proper type as while this is
     # recoverable, in a high number it still may mean issues.
     # Waiting is one of the recommended strategies.
-    setting :wait_on_queue_full, default: false
+    setting :wait_on_queue_full, default: true
     # option [Integer] how long (in seconds) should we backoff before a retry when queue is full
     # The retry will happen with the same message and backoff should give us some time to
     # dispatch previously buffered messages.
-    setting :wait_on_queue_full_timeout, default: 0.1
+    setting :wait_backoff_on_queue_full, default: 0.1
+    # option [Numeric] how many seconds should we wait with the backoff on queue having space for
+    # more messages before re-raising the error.
+    setting :wait_timeout_on_queue_full, default: 10
     # option [Boolean] should we send messages. Setting this to false can be really useful when
     # testing and or developing because when set to false, won't actually ping Kafka but will
     # run all the validations, etc
     setting :deliver, default: true
+    # option [Class] class for usage when creating the underlying client used to dispatch messages
+    setting :client_class, default: Clients::Rdkafka
     # rdkafka options
     # @see https://github.com/edenhill/librdkafka/blob/master/CONFIGURATION.md
     setting :kafka, default: {}
data/lib/waterdrop/contracts/config.rb CHANGED
@@ -19,6 +19,9 @@ module WaterDrop
       required(:max_wait_timeout) { |val| val.is_a?(Numeric) && val >= 0 }
       required(:wait_timeout) { |val| val.is_a?(Numeric) && val.positive? }
       required(:kafka) { |val| val.is_a?(Hash) && !val.empty? }
+      required(:wait_on_queue_full) { |val| [true, false].include?(val) }
+      required(:wait_backoff_on_queue_full) { |val| val.is_a?(Numeric) && val >= 0 }
+      required(:wait_timeout_on_queue_full) { |val| val.is_a?(Numeric) && val >= 0 }

       # rdkafka allows both symbols and strings as keys for config but then casts them to strings
       # This can be confusing, so we expect all keys to be symbolized
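
The three new contract rules reduce to plain predicates; a sketch of the same checks outside the karafka-core contract DSL (the `valid_setting?` helper is hypothetical, for illustration only):

```ruby
# Standalone versions of the three new validation predicates.
# The real ones run inside WaterDrop's contract via `required(...)`.
VALIDATIONS = {
  wait_on_queue_full: ->(val) { [true, false].include?(val) },
  wait_backoff_on_queue_full: ->(val) { val.is_a?(Numeric) && val >= 0 },
  wait_timeout_on_queue_full: ->(val) { val.is_a?(Numeric) && val >= 0 }
}.freeze

# Hypothetical helper: true when the value satisfies the rule for the key
def valid_setting?(key, value)
  VALIDATIONS.fetch(key).call(value)
end
```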
data/lib/waterdrop/producer/builder.rb CHANGED
@@ -10,18 +10,13 @@ module WaterDrop
       # @return [Rdkafka::Producer, Producer::DummyClient] raw rdkafka producer or a dummy producer
       # when we don't want to dispatch any messages
       def call(producer, config)
-        return DummyClient.new unless config.deliver
+        klass = config.client_class
+        # This allows us to have backwards compatibility.
+        # If it is the default client and delivery is set to false, we use dummy as we used to
+        # before `client_class` was introduced
+        klass = Clients::Dummy if klass == Clients::Rdkafka && !config.deliver

-        client = Rdkafka::Config.new(config.kafka.to_h).producer
-
-        # This callback is not global and is per client, thus we do not have to wrap it with a
-        # callbacks manager to make it work
-        client.delivery_callback = Instrumentation::Callbacks::Delivery.new(
-          producer.id,
-          config.monitor
-        )
-
-        client
+        klass.new(producer)
       end
     end
   end
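
The builder's backwards-compatibility fallback is a small pure rule: the dummy client replaces the default only when the user kept `Clients::Rdkafka` AND disabled delivery. A sketch with stand-in modules (not the gem's actual constants):

```ruby
# Stand-ins for the real client classes, just to exercise the selection rule
module Clients
  module Rdkafka; end
  module Dummy; end
end

# Mirrors the builder logic: a custom client_class always wins, even with
# deliver: false, because only the default class gets the dummy fallback.
def resolve_client_class(client_class, deliver)
  return Clients::Dummy if client_class == Clients::Rdkafka && !deliver

  client_class
end
```

This design keeps pre-2.6.0 behavior intact: `deliver: false` with the default client still yields the dummy, exactly as before `client_class` existed.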
data/lib/waterdrop/producer.rb CHANGED
@@ -7,6 +7,7 @@ module WaterDrop
     include Sync
     include Async
     include Buffer
+    include ::Karafka::Core::Helpers::Time

     # Which of the inline flow errors do we want to intercept and re-bind
     SUPPORTED_FLOW_ERRORS = [
@@ -168,15 +169,35 @@ module WaterDrop
       @contract.validate!(message, Errors::MessageInvalidError)
     end

+    # Waits on a given handler
+    #
+    # @param handler [Rdkafka::Producer::DeliveryHandle]
+    def wait(handler)
+      handler.wait(
+        max_wait_timeout: @config.max_wait_timeout,
+        wait_timeout: @config.wait_timeout
+      )
+    end
+
+    private
+
     # Runs the client produce method with a given message
     #
     # @param message [Hash] message we want to send
     def produce(message)
+      produce_time ||= monotonic_now
+
       client.produce(**message)
     rescue SUPPORTED_FLOW_ERRORS.first => e
       # Unless we want to wait and retry and it's a full queue, we raise normally
       raise unless @config.wait_on_queue_full
       raise unless e.code == :queue_full
+      # If we're running for longer than the timeout, we need to re-raise the queue full.
+      # This prevents a situation where the cluster is down forever and we retry
+      # in an infinite loop, effectively hanging the processing
+      raise unless monotonic_now - produce_time < @config.wait_timeout_on_queue_full * 1_000
+
+      label = caller_locations(2, 1)[0].label.split(' ').last

       # We use this syntax here because we want to preserve the original `#cause` when we
       # instrument the error and there is no way to manually assign `#cause` value. We want to keep
@@ -195,25 +216,15 @@ module WaterDrop
         producer_id: id,
         message: message,
         error: e,
-        type: 'message.produce'
+        type: "message.#{label}"
       )

       # We do not poll the producer because polling happens in a background thread
       # It also should not be a frequent case (queue full), hence it's ok to just throttle.
-      sleep @config.wait_on_queue_full_timeout
+      sleep @config.wait_backoff_on_queue_full
      end

      retry
    end
-
-    # Waits on a given handler
-    #
-    # @param handler [Rdkafka::Producer::DeliveryHandle]
-    def wait(handler)
-      handler.wait(
-        max_wait_timeout: @config.max_wait_timeout,
-        wait_timeout: @config.wait_timeout
-      )
-    end
   end
 end
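
The interplay of the queue-full settings in `#produce` above can be sketched as a plain retry loop: back off `wait_backoff_on_queue_full` seconds per attempt, but re-raise once `wait_timeout_on_queue_full` seconds have elapsed since the first attempt (the diff compares milliseconds, hence the `* 1_000`). Names and values below are illustrative, not the gem's API:

```ruby
# Sketch of the queue-full retry policy from the diff, against a fake client.
def monotonic_now_ms
  Process.clock_gettime(Process::CLOCK_MONOTONIC) * 1_000
end

QueueFullError = Class.new(StandardError)

# attempt_fn raises QueueFullError while the (fake) queue is full.
# `produce_time ||=` survives `retry`, so the deadline is measured
# from the FIRST attempt, exactly as in the diff.
def produce_with_backoff(attempt_fn, backoff: 0.01, timeout: 0.1)
  produce_time ||= monotonic_now_ms

  attempt_fn.call
rescue QueueFullError
  # Past the deadline: give up, mirroring `raise unless ... < timeout * 1_000`
  raise unless monotonic_now_ms - produce_time < timeout * 1_000

  sleep(backoff)
  retry
end
```

Because `retry` re-runs the method body without resetting locals, `produce_time` keeps its first value, which is what turns an otherwise unbounded retry loop into one with a hard deadline.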
data/lib/waterdrop/version.rb CHANGED
@@ -3,5 +3,5 @@
 # WaterDrop library
 module WaterDrop
   # Current WaterDrop version
-  VERSION = '2.5.3'
+  VERSION = '2.6.0'
 end
data.tar.gz.sig CHANGED
Binary file
metadata CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: waterdrop
 version: !ruby/object:Gem::Version
-  version: 2.5.3
+  version: 2.6.0
 platform: ruby
 authors:
 - Maciej Mensfeld
@@ -35,7 +35,7 @@ cert_chain:
   Qf04B9ceLUaC4fPVEz10FyobjaFoY4i32xRto3XnrzeAgfEe4swLq8bQsR3w/EF3
   MGU0FeSV2Yj7Xc2x/7BzLK8xQn5l7Yy75iPF+KP3vVmDHnNl
 -----END CERTIFICATE-----
-date: 2023-05-26 00:00:00.000000000 Z
+date: 2023-06-11 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: karafka-core
@@ -95,6 +95,9 @@ files:
 - config/locales/errors.yml
 - docker-compose.yml
 - lib/waterdrop.rb
+- lib/waterdrop/clients/buffered.rb
+- lib/waterdrop/clients/dummy.rb
+- lib/waterdrop/clients/rdkafka.rb
 - lib/waterdrop/config.rb
 - lib/waterdrop/contracts.rb
 - lib/waterdrop/contracts/config.rb
@@ -116,7 +119,6 @@ files:
 - lib/waterdrop/producer/async.rb
 - lib/waterdrop/producer/buffer.rb
 - lib/waterdrop/producer/builder.rb
-- lib/waterdrop/producer/dummy_client.rb
 - lib/waterdrop/producer/status.rb
 - lib/waterdrop/producer/sync.rb
 - lib/waterdrop/version.rb
metadata.gz.sig CHANGED
Binary file