waterdrop 2.4.11 → 2.5.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: c5190e7c6d460afe6d9a55cb177220e23cd5e89262430f361c3e29ab7df685e6
- data.tar.gz: 8cf622a618610bc1bdd283ba2046e0d233d660c0e71dfc70c78ecea2f41db22e
+ metadata.gz: 8aa4c6d8e6d5364a4a1a93739a0292410d664df16f7d93545bb478b5659ee443
+ data.tar.gz: 2bf8a36f8332be75eef108b90b2179ecc18f7301750f551eadce9d90eec6c5a9
  SHA512:
- metadata.gz: 48b3d383c175c392acbdd4a3a41a543192c0ca0e404d755f7fc8eae862350e8d62c9855fb8fea8dc354e358162574f4ff052c850c2ae0fc13a763f951330b97d
- data.tar.gz: 7da204b9493122e752933dac8856057c80c4ca640ccd31c8c5787cb9f20fbd8add2ef1851a447ef04010a72e70722d54689978697c4ef1def6ee5c3f2dc23fbe
+ metadata.gz: 35269366d659f9dcada5fafd5120f2434784c29417c021a6d78fad722496c37acde6617c4e4e6fee3132052248e5b935c958cb45a1113c8510d995cfd6136ccb
+ data.tar.gz: 8ff77a6a0c61118cf5dcad89f7d7414c510107bb559683fd8b0c133798c686cf6bbf5eef95fb14ecb5bf5762354148eb545b6835e597730457125c69cd996a41
checksums.yaml.gz.sig CHANGED
Binary file
data/CHANGELOG.md CHANGED
@@ -1,5 +1,36 @@
  # WaterDrop changelog
 
+ ## 2.5.0 (2023-03-04)
+ - [Feature] Pipe **all** errors, including synchronous ones, through the `error.occurred` channel.
+ - [Improvement] Pipe delivery errors that did not surface via the error callback through the `error.occurred` channel.
+ - [Improvement] Introduce `WaterDrop::Errors::ProduceError` and `WaterDrop::Errors::ProduceManyError` for any errors raised inline. The original error is available via `#cause`.
+ - [Improvement] Include the `#dispatched` message handles in the `WaterDrop::Errors::ProduceManyError` error, so you can tell which messages were delegated to `librdkafka` before the failure.
+ - [Maintenance] Remove `WaterDrop::Errors::FlushFailureError` in favour of the actual error that occurred, to unify error handling.
+ - [Maintenance] Rename `Datadog::Listener` to `Datadog::MetricsListener` to align with Karafka (#329).
+ - [Fix] Do **not** flush when there is no data in the internal buffer.
+ - [Fix] Wait on the final data flush for short-lived producers, to make sure messages are actually dispatched by `librdkafka` or time out.
+
+ ### Upgrade notes
+
+ Please note, this **is** a **breaking** release, hence `2.5.0`.
+
+ 1. If you used to catch `WaterDrop::Errors::FlushFailureError`, you now need to catch `WaterDrop::Errors::ProduceError`. `WaterDrop::Errors::ProduceManyError` inherits from `ProduceError`, so catching the latter is enough.
+ 2. Prior to `2.5.0`, there was always a chance of partial dispatches via the `produce_many_` methods. Now you can get info on all the errors via `error.occurred`.
+ 3. Inline `Rdkafka::RdkafkaError` errors are now re-raised via `WaterDrop::Errors::ProduceError` and available under `#cause`. Async `Rdkafka::RdkafkaError` errors are still directly available, and you can differentiate between errors using the event `type`.
+ 4. If you are using the Datadog listener, you need to:
+
+ ```ruby
+ # Replace this require:
+ require 'waterdrop/instrumentation/vendors/datadog/listener'
+ # with:
+ require 'waterdrop/instrumentation/vendors/datadog/metrics_listener'
+
+ # Replace references to
+ ::WaterDrop::Instrumentation::Vendors::Datadog::Listener.new
+ # with
+ ::WaterDrop::Instrumentation::Vendors::Datadog::MetricsListener.new
+ ```
+
  ## 2.4.11 (2023-02-24)
  - Replace the local rspec locator with the generalized core one.
  - Make `::WaterDrop::Instrumentation::Notifications::EVENTS` list public for anyone wanting to re-bind those into a different notification bus.
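As a quick illustration of the re-raise pattern from the upgrade notes, the sketch below reproduces it with plain Ruby exceptions. No Kafka or WaterDrop is required: `ProduceError` here is a local stand-in for `WaterDrop::Errors::ProduceError`, and `IOError` stands in for an inline `Rdkafka::RdkafkaError`. The key mechanism is that Ruby automatically links the original error via `#cause` when a new error is raised inside a `rescue` block.

```ruby
# Stand-in for WaterDrop::Errors::ProduceError (illustration only)
class ProduceError < StandardError; end

def produce
  # Stand-in for an inline librdkafka failure
  raise IOError, 'Local: Broker transport failure'
rescue IOError
  # Re-raising inside rescue makes Ruby set #cause to the original error,
  # which is the same mechanism the upgrade notes describe for 2.5.0
  raise ProduceError, 'producing failed'
end

begin
  produce
rescue ProduceError => e
  puts e.message       # => producing failed
  puts e.cause.message # => Local: Broker transport failure
end
```

This is why catching `ProduceError` is sufficient after the upgrade: the low-level error is not lost, it travels along on the wrapper's `#cause`.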
data/Gemfile.lock CHANGED
@@ -1,7 +1,7 @@
  PATH
    remote: .
    specs:
-     waterdrop (2.4.11)
+     waterdrop (2.5.0)
      karafka-core (>= 2.0.12, < 3.0.0)
      zeitwerk (~> 2.3)
 
@@ -14,7 +14,7 @@ GEM
      minitest (>= 5.1)
      tzinfo (~> 2.0)
    byebug (11.1.3)
-   concurrent-ruby (1.2.0)
+   concurrent-ruby (1.2.2)
    diff-lcs (1.5.0)
    docile (1.4.0)
    factory_bot (6.2.1)
@@ -67,4 +67,4 @@ DEPENDENCIES
  waterdrop!
 
 BUNDLED WITH
-   2.4.6
+   2.4.7
data/README.md CHANGED
@@ -387,6 +387,8 @@ end
  # WaterDrop error occurred: Local: Broker transport failure (transport)
  ```
 
+ **Note:** `error.occurred` will also include any errors originating from `librdkafka` for synchronous operations, including those that are raised back to the end user.
+
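To make the note above concrete, here is a minimal, self-contained sketch of how an `error.occurred` subscriber can tell the cases apart using the event `type` key (the `type` values come from this release's instrumentation code). The `Monitor` class below is a simplified stand-in for illustration only, not WaterDrop's actual monitor; in WaterDrop itself you would subscribe via `producer.monitor.subscribe`.

```ruby
# Simplified stand-in for a notification bus (illustration only)
class Monitor
  def initialize
    @subscriptions = Hash.new { |hash, key| hash[key] = [] }
  end

  # Register a block to run whenever the named event is instrumented
  def subscribe(event_name, &block)
    @subscriptions[event_name] << block
  end

  # Invoke every subscriber registered for the named event
  def instrument(event_name, payload)
    @subscriptions[event_name].each { |block| block.call(payload) }
  end
end

monitor = Monitor.new
seen_types = []

# One subscription sees both synchronous produce errors and async
# librdkafka dispatch errors; `type` differentiates them
monitor.subscribe('error.occurred') { |event| seen_types << event[:type] }

monitor.instrument('error.occurred', type: 'message.produce_sync')
monitor.instrument('error.occurred', type: 'librdkafka.dispatch_error')

seen_types # => ["message.produce_sync", "librdkafka.dispatch_error"]
```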
  ### Acknowledgment notifications
 
  WaterDrop allows you to listen to Kafka messages' acknowledgment events. This lets you monitor message deliveries from WaterDrop even when using asynchronous dispatch methods.
@@ -423,7 +425,7 @@ WaterDrop comes with (optional) full Datadog and StatsD integration that you can
  ```ruby
  # require datadog/statsd and the listener as it is not loaded by default
  require 'datadog/statsd'
- require 'waterdrop/instrumentation/vendors/datadog/listener'
+ require 'waterdrop/instrumentation/vendors/datadog/metrics_listener'
 
  # initialize your producer with statistics.interval.ms enabled so the metrics are published
  producer = WaterDrop::Producer.new do |config|
@@ -435,7 +437,7 @@ producer = WaterDrop::Producer.new do |config|
  end
 
  # initialize the listener with the statsd client
- listener = ::WaterDrop::Instrumentation::Vendors::Datadog::Listener.new do |config|
+ listener = ::WaterDrop::Instrumentation::Vendors::Datadog::MetricsListener.new do |config|
  config.client = Datadog::Statsd.new('localhost', 8125)
  # Publish the host as a tag alongside the rest of the tags
  config.default_tags = ["host:#{Socket.gethostname}"]
@@ -29,15 +29,18 @@ module WaterDrop
  # contact us as it is an error.
  StatusInvalidError = Class.new(BaseError)
 
- # Raised when during messages flushing something bad happened
- class FlushFailureError < BaseError
- attr_reader :dispatched_messages
+ # Raised when there is an inline error during a single message produce operation
+ ProduceError = Class.new(BaseError)
 
- # @param dispatched_messages [Array<Rdkafka::Producer::DeliveryHandle>] handlers of the
+ # Raised when something bad happened inline during the producing of many messages
+ class ProduceManyError < ProduceError
+ attr_reader :dispatched
+
+ # @param dispatched [Array<Rdkafka::Producer::DeliveryHandle>] handles of the
  # messages that we've dispatched
- def initialize(dispatched_messages)
+ def initialize(dispatched)
  super()
- @dispatched_messages = dispatched_messages
+ @dispatched = dispatched
  end
  end
  end
@@ -17,6 +17,30 @@ module WaterDrop
  # Emits delivery details to the monitor
  # @param delivery_report [Rdkafka::Producer::DeliveryReport] delivery report
  def call(delivery_report)
+ if delivery_report.error.to_i.positive?
+ instrument_error(delivery_report)
+ else
+ instrument_acknowledged(delivery_report)
+ end
+ end
+
+ private
+
+ # @param delivery_report [Rdkafka::Producer::DeliveryReport] delivery report
+ def instrument_error(delivery_report)
+ @monitor.instrument(
+ 'error.occurred',
+ caller: self,
+ error: ::Rdkafka::RdkafkaError.new(delivery_report.error),
+ producer_id: @producer_id,
+ offset: delivery_report.offset,
+ partition: delivery_report.partition,
+ type: 'librdkafka.dispatch_error'
+ )
+ end
+
+ # @param delivery_report [Rdkafka::Producer::DeliveryReport] delivery report
+ def instrument_acknowledged(delivery_report)
  @monitor.instrument(
  'message.acknowledged',
  producer_id: @producer_id,
@@ -18,6 +18,8 @@ module WaterDrop
  # @param client_name [String] rdkafka client name
  # @param error [Rdkafka::Error] error that occurred
  # @note It will only instrument on errors of the client of our producer
+ # @note When there is a particular message produce error (not an internal error), the error
+ # is shipped via the delivery callback, not via the error callback.
  def call(client_name, error)
  # Emit only errors related to our client
  # Same as with statistics (more explanation there)
@@ -10,7 +10,7 @@ module WaterDrop
  # and/or Datadog
  #
  # @note You need to set up the `dogstatsd-ruby` client and assign it
- class Listener
+ class MetricsListener
  include ::Karafka::Core::Configurable
  extend Forwardable
 
@@ -0,0 +1,34 @@
+ # frozen_string_literal: true
+
+ module WaterDrop
+ module Patches
+ module Rdkafka
+ # Patches for the producer client
+ module Client
+ # @param _object_id [nil] rdkafka API compatibility argument
+ # @param timeout_ms [Integer] final flush timeout in ms
+ def close(_object_id = nil, timeout_ms = 5_000)
+ return unless @native
+
+ # Indicate to the polling thread that we're closing
+ @polling_thread[:closing] = true
+ # Wait for the polling thread to finish up
+ @polling_thread.join
+
+ ::Rdkafka::Bindings.rd_kafka_flush(@native, timeout_ms)
+ ::Rdkafka::Bindings.rd_kafka_destroy(@native)
+
+ @native = nil
+ end
+ end
+ end
+ end
+ end
+
+ ::Rdkafka::Bindings.attach_function(
+ :rd_kafka_flush,
+ %i[pointer int],
+ :void
+ )
+
+ Rdkafka::Producer::Client.prepend WaterDrop::Patches::Rdkafka::Client
@@ -70,6 +70,14 @@ module WaterDrop
 
  @_inner_kafka
  end
+
+ # Closes our librdkafka instance with the flush patch
+ # @param timeout_ms [Integer] flush timeout
+ def close(timeout_ms = 5_000)
+ ObjectSpace.undefine_finalizer(self)
+
+ @client.close(nil, timeout_ms)
+ end
  end
  end
  end
@@ -23,7 +23,19 @@ module WaterDrop
  'message.produced_async',
  producer_id: id,
  message: message
- ) { client.produce(**message) }
+ ) { produce(message) }
+ rescue *SUPPORTED_FLOW_ERRORS
+ re_raised = Errors::ProduceError.new
+
+ @monitor.instrument(
+ 'error.occurred',
+ producer_id: id,
+ message: message,
+ error: re_raised,
+ type: 'message.produce_async'
+ )
+
+ raise re_raised
  end
 
  # Produces many messages to Kafka and does not wait for them to be delivered
@@ -39,6 +51,7 @@
  def produce_many_async(messages)
  ensure_active!
 
+ dispatched = []
  messages = middleware.run_many(messages)
  messages.each { |message| validate_message!(message) }
 
@@ -47,8 +60,25 @@
  producer_id: id,
  messages: messages
  ) do
- messages.map { |message| client.produce(**message) }
+ messages.each do |message|
+ dispatched << produce(message)
+ end
+
+ dispatched
  end
+ rescue *SUPPORTED_FLOW_ERRORS
+ re_raised = Errors::ProduceManyError.new(dispatched)
+
+ @monitor.instrument(
+ 'error.occurred',
+ producer_id: id,
+ messages: messages,
+ dispatched: dispatched,
+ error: re_raised,
+ type: 'messages.produce_many_async'
+ )
+
+ raise re_raised
  end
  end
  end
@@ -4,14 +4,6 @@ module WaterDrop
  class Producer
  # Component for buffered operations
  module Buffer
- # Exceptions we catch when dispatching messages from a buffer
- RESCUED_ERRORS = [
- Rdkafka::RdkafkaError,
- Rdkafka::Producer::DeliveryHandle::WaitTimeoutError
- ].freeze
-
- private_constant :RESCUED_ERRORS
-
  # Adds given message into the internal producer buffer without flushing it to Kafka
  #
  # @param message [Hash] hash that complies with the {Contracts::Message} contract
@@ -85,39 +77,21 @@ module WaterDrop
  # @param sync [Boolean] should it flush in a sync way
  # @return [Array<Rdkafka::Producer::DeliveryHandle, Rdkafka::Producer::DeliveryReport>]
  # delivery handles for async or delivery reports for sync
- # @raise [Errors::FlushFailureError] when there was a failure in flushing
+ # @raise [Errors::ProduceManyError] when there was a failure in flushing
  # @note We use this method underneath to provide a different instrumentation for sync and
  # async flushing within the public API
  def flush(sync)
  data_for_dispatch = nil
- dispatched = []
 
  @buffer_mutex.synchronize do
  data_for_dispatch = @messages
  @messages = Concurrent::Array.new
  end
 
- dispatched = data_for_dispatch.map { |message| client.produce(**message) }
-
- return dispatched unless sync
-
- dispatched.map do |handler|
- handler.wait(
- max_wait_timeout: @config.max_wait_timeout,
- wait_timeout: @config.wait_timeout
- )
- end
- rescue *RESCUED_ERRORS => e
- @monitor.instrument(
- 'error.occurred',
- caller: self,
- error: e,
- producer_id: id,
- dispatched: dispatched,
- type: sync ? 'buffer.flushed_sync.error' : 'buffer.flush_async.error'
- )
+ # Do nothing if there is nothing to flush
+ return data_for_dispatch if data_for_dispatch.empty?
 
- raise Errors::FlushFailureError.new(dispatched)
+ sync ? produce_many_sync(data_for_dispatch) : produce_many_async(data_for_dispatch)
  end
  end
  end
@@ -26,13 +26,20 @@ module WaterDrop
  producer_id: id,
  message: message
  ) do
- client
- .produce(**message)
- .wait(
- max_wait_timeout: @config.max_wait_timeout,
- wait_timeout: @config.wait_timeout
- )
+ wait(produce(message))
  end
+ rescue *SUPPORTED_FLOW_ERRORS
+ re_raised = Errors::ProduceError.new
+
+ @monitor.instrument(
+ 'error.occurred',
+ producer_id: id,
+ message: message,
+ error: re_raised,
+ type: 'message.produce_sync'
+ )
+
+ raise re_raised
  end
 
  # Produces many messages to Kafka and waits for them to be delivered
@@ -48,21 +55,37 @@ module WaterDrop
  # @raise [Errors::MessageInvalidError] When any of the provided message details are invalid
  # and the message could not be sent to Kafka
  def produce_many_sync(messages)
- ensure_active!
+ ensure_active! unless @closing_thread_id && @closing_thread_id == Thread.current.object_id
 
  messages = middleware.run_many(messages)
  messages.each { |message| validate_message!(message) }
 
+ dispatched = []
+
  @monitor.instrument('messages.produced_sync', producer_id: id, messages: messages) do
- messages
- .map { |message| client.produce(**message) }
- .map! do |handler|
- handler.wait(
- max_wait_timeout: @config.max_wait_timeout,
- wait_timeout: @config.wait_timeout
- )
- end
+ messages.each do |message|
+ dispatched << produce(message)
+ end
+
+ dispatched.map! do |handler|
+ wait(handler)
+ end
+
+ dispatched
  end
+ rescue *SUPPORTED_FLOW_ERRORS
+ re_raised = Errors::ProduceManyError.new(dispatched)
+
+ @monitor.instrument(
+ 'error.occurred',
+ producer_id: id,
+ messages: messages,
+ dispatched: dispatched,
+ error: re_raised,
+ type: 'messages.produce_many_sync'
+ )
+
+ raise re_raised
  end
  end
  end
@@ -8,6 +8,14 @@ module WaterDrop
  include Async
  include Buffer
 
+ # Inline flow errors that we want to intercept and re-bind
+ SUPPORTED_FLOW_ERRORS = [
+ Rdkafka::RdkafkaError,
+ Rdkafka::Producer::DeliveryHandle::WaitTimeoutError
+ ].freeze
+
+ private_constant :SUPPORTED_FLOW_ERRORS
+
  def_delegators :config, :middleware
 
  # @return [String] uuid of the current producer
@@ -117,6 +125,10 @@ module WaterDrop
  # This should be used only in case a producer was not closed properly and forgotten
  ObjectSpace.undefine_finalizer(id)
 
+ # We save this thread id because we need to bypass the activity verification on the
+ # producer for the final flush of buffers.
+ @closing_thread_id = Thread.current.object_id
+
  # Flush has its own buffer mutex but even if it is blocked, flushing can still happen
  # as we close the client after the flushing (even if blocked by the mutex)
  flush(true)
@@ -125,7 +137,7 @@ module WaterDrop
  # It is safe to run it several times but not exactly at the same moment
  # We also mark it as closed only if it was connected; if not, it would trigger a new
  # connection that anyhow would be immediately closed
- client.close if @client
+ client.close(@config.max_wait_timeout) if @client
 
  # Remove callbacks runners that were registered
  ::Karafka::Core::Instrumentation.statistics_callbacks.delete(@id)
@@ -155,5 +167,22 @@ module WaterDrop
  def validate_message!(message)
  @contract.validate!(message, Errors::MessageInvalidError)
  end
+
+ # Runs the client produce method with a given message
+ #
+ # @param message [Hash] message we want to send
+ def produce(message)
+ client.produce(**message)
+ end
+
+ # Waits on a given handler
+ #
+ # @param handler [Rdkafka::Producer::DeliveryHandle]
+ def wait(handler)
+ handler.wait(
+ max_wait_timeout: @config.max_wait_timeout,
+ wait_timeout: @config.wait_timeout
+ )
+ end
  end
 end
@@ -3,5 +3,5 @@
 # WaterDrop library
 module WaterDrop
  # Current WaterDrop version
- VERSION = '2.4.11'
+ VERSION = '2.5.0'
 end
data.tar.gz.sig CHANGED
Binary file
metadata CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: waterdrop
 version: !ruby/object:Gem::Version
-  version: 2.4.11
+  version: 2.5.0
 platform: ruby
 authors:
 - Maciej Mensfeld
@@ -35,7 +35,7 @@ cert_chain:
   Qf04B9ceLUaC4fPVEz10FyobjaFoY4i32xRto3XnrzeAgfEe4swLq8bQsR3w/EF3
   MGU0FeSV2Yj7Xc2x/7BzLK8xQn5l7Yy75iPF+KP3vVmDHnNl
   -----END CERTIFICATE-----
-date: 2023-02-24 00:00:00.000000000 Z
+date: 2023-03-04 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: karafka-core
@@ -107,8 +107,9 @@ files:
 - lib/waterdrop/instrumentation/monitor.rb
 - lib/waterdrop/instrumentation/notifications.rb
 - lib/waterdrop/instrumentation/vendors/datadog/dashboard.json
-- lib/waterdrop/instrumentation/vendors/datadog/listener.rb
+- lib/waterdrop/instrumentation/vendors/datadog/metrics_listener.rb
 - lib/waterdrop/middleware.rb
+- lib/waterdrop/patches/rdkafka/client.rb
 - lib/waterdrop/patches/rdkafka/metadata.rb
 - lib/waterdrop/patches/rdkafka/producer.rb
 - lib/waterdrop/producer.rb
metadata.gz.sig CHANGED
Binary file