waterdrop 2.0.4 → 2.0.5

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 85bb80807690f36f2dff0e6da8e4382b1b03e00e34464cd9ba17fc3ca476e69a
- data.tar.gz: 636af3b96412184c7ae744805a941d13008ede5e4e86323c0f6117d0bdf6747b
+ metadata.gz: 310a3d7e1a4d0e5825b3a01f59b29c22a9f180c639951763bdf936a23c1119fd
+ data.tar.gz: f6c0c498266ba067201e7983d5bdea7a0aee7810a403be1cd4f4b3d62ab60633
  SHA512:
- metadata.gz: f67e059760f6019a455f0ffad0dcef24080e638ff5c34bafbc3cb56cec614457c2aa991e8c964685cc55bb93e8bce7c5e0e25745cd743e0ed060ea5b275d118b
- data.tar.gz: 5451c3dcba29bf66c4a30630733d8b8dfbf9d05935cba1414c61dd0834907f116a6ebadd8fe5df571423abe09ae687cee8aaad5acbcdf2b3eecfe595a95d4ff2
+ metadata.gz: 4e486cfa6aa673e008eeaccb8cf920fbb30fce1d23277021d3c6a02e36ee14b8a280e9114b9be778bdb68ba4b07eb2d64371362c454c607edf3c4b57a26a0066
+ data.tar.gz: 50301b9c5a5e67434f46247b5d1a83e4af2577e0f3b8f251a2795bc48aaba8c59135025e606b8143ca57560a2eac6666c530bd5d1b6059ce2e61d008e1eb9385
checksums.yaml.gz.sig CHANGED
Binary file
@@ -17,7 +17,7 @@ jobs:
  - '3.0'
  - '2.7'
  - '2.6'
- - 'jruby-head'
+ - 'jruby-9.3.1.0'
  include:
  - ruby: '3.0'
  coverage: 'true'
data/CHANGELOG.md CHANGED
@@ -1,6 +1,26 @@
  # WaterDrop changelog

- ## 2.0.4 (Unreleased)
+ ## 2.0.5 (2021-11-28)
+
+ ### Bug fixes
+
+ - Fixes an issue where multiple producers would emit the stats of other producers, causing the same stats to be published several times (as many times as the number of producers). This could cause invalid reporting for multi-Kafka setups.
+ - Fixes a bug where the first emitted statistics would contain their absolute values as their first delta values.
+ - Fixes a bug where decorated statistics would include a delta for a root field with non-numeric values.
+
+ ### Changes and features
+ - Introduces support for instrumenting error callbacks via the `error.emitted` monitor key, for tracking background errors that occur on the producer (disconnects, etc.).
+ - Removes the `:producer` key from `statistics.emitted` and replaces it with `:producer_id`, so the whole producer is not injected into the payload.
+ - Removes the `:producer` key from `message.acknowledged` and replaces it with `:producer_id`, so the whole producer is not injected into the payload.
+ - Cleans up and refactors callbacks support to simplify the API and align it with the Rdkafka way of doing things.
+ - Introduces a callbacks manager concept that will also be used in Karafka `2.0` for per-client statistics and error tracking.
+ - Sets the default Kafka `client.id` to `waterdrop` when not set.
+ - Updates specs to always emit statistics for better test coverage.
+ - Adds statistics and errors integration specs running against Kafka.
+ - Replaces the direct `RSpec.describe` reference with auto-discovery.
+ - Patches `rdkafka` to provide the functionality needed for granular callback support.
+
+ ## 2.0.4 (2021-09-19)
  - Update `dry-*` to the recent versions and update settings syntax to match it
  - Update Zeitwerk requirement

data/Gemfile.lock CHANGED
@@ -1,7 +1,7 @@
  PATH
  remote: .
  specs:
- waterdrop (2.0.4)
+ waterdrop (2.0.5)
  concurrent-ruby (>= 1.1)
  dry-configurable (~> 0.13)
  dry-monitor (~> 0.5)
@@ -64,15 +64,15 @@ GEM
  factory_bot (6.2.0)
  activesupport (>= 5.0.0)
  ffi (1.15.4)
- i18n (1.8.10)
+ i18n (1.8.11)
  concurrent-ruby (~> 1.0)
- mini_portile2 (2.7.0)
+ mini_portile2 (2.7.1)
  minitest (5.14.4)
  rake (13.0.6)
- rdkafka (0.10.0)
- ffi (~> 1.9)
- mini_portile2 (~> 2.1)
- rake (>= 12.3)
+ rdkafka (0.11.0)
+ ffi (~> 1.15)
+ mini_portile2 (~> 2.7)
+ rake (> 12)
  rspec (3.10.0)
  rspec-core (~> 3.10.0)
  rspec-expectations (~> 3.10.0)
@@ -85,7 +85,7 @@ GEM
  rspec-mocks (3.10.2)
  diff-lcs (>= 1.2.0, < 2.0)
  rspec-support (~> 3.10.0)
- rspec-support (3.10.2)
+ rspec-support (3.10.3)
  simplecov (0.21.2)
  docile (~> 1.1)
  simplecov-html (~> 0.11)
@@ -94,9 +94,10 @@ GEM
  simplecov_json_formatter (0.1.3)
  tzinfo (2.0.4)
  concurrent-ruby (~> 1.0)
- zeitwerk (2.4.2)
+ zeitwerk (2.5.1)

  PLATFORMS
+ x86_64-darwin
  x86_64-linux

  DEPENDENCIES
@@ -108,4 +109,4 @@ DEPENDENCIES
  waterdrop!

  BUNDLED WITH
- 2.2.27
+ 2.2.31
data/README.md CHANGED
@@ -24,22 +24,20 @@ It:

  ## Table of contents

- - [WaterDrop](#waterdrop)
- * [Table of contents](#table-of-contents)
- * [Installation](#installation)
- * [Setup](#setup)
- + [WaterDrop configuration options](#waterdrop-configuration-options)
- + [Kafka configuration options](#kafka-configuration-options)
- * [Usage](#usage)
- + [Basic usage](#basic-usage)
- + [Buffering](#buffering)
- - [Using WaterDrop to buffer messages based on the application logic](#using-waterdrop-to-buffer-messages-based-on-the-application-logic)
- - [Using WaterDrop with rdkafka buffers to achieve periodic auto-flushing](#using-waterdrop-with-rdkafka-buffers-to-achieve-periodic-auto-flushing)
- * [Instrumentation](#instrumentation)
- + [Usage statistics](#usage-statistics)
- + [Forking and potential memory problems](#forking-and-potential-memory-problems)
- * [References](#references)
- * [Note on contributions](#note-on-contributions)
+ - [Installation](#installation)
+ - [Setup](#setup)
+ * [WaterDrop configuration options](#waterdrop-configuration-options)
+ * [Kafka configuration options](#kafka-configuration-options)
+ - [Usage](#usage)
+ * [Basic usage](#basic-usage)
+ * [Buffering](#buffering)
+ + [Using WaterDrop to buffer messages based on the application logic](#using-waterdrop-to-buffer-messages-based-on-the-application-logic)
+ + [Using WaterDrop with rdkafka buffers to achieve periodic auto-flushing](#using-waterdrop-with-rdkafka-buffers-to-achieve-periodic-auto-flushing)
+ - [Instrumentation](#instrumentation)
+ * [Usage statistics](#usage-statistics)
+ * [Error notifications](#error-notifications)
+ * [Forking and potential memory problems](#forking-and-potential-memory-problems)
+ - [Note on contributions](#note-on-contributions)

  ## Installation

@@ -290,19 +288,42 @@ producer.close

  Note: The metrics returned may not be completely consistent between brokers, toppars and totals, due to the internal asynchronous nature of librdkafka. E.g., the top-level tx total may be less than the sum of the broker tx values which it represents.

+ ### Error notifications
+
+ Aside from errors related to publishing messages like `buffer.flushed_async.error`, WaterDrop allows you to listen to errors that occur in its internal background threads. Things like reconnecting to Kafka upon network errors and others unrelated to publishing messages are all available under the `error.emitted` notification key. You can subscribe to this event to ensure your setup is healthy and without any problems that would otherwise go unnoticed as long as messages are delivered.
+
+ ```ruby
+ producer = WaterDrop::Producer.new do |config|
+ # Note invalid connection port...
+ config.kafka = { 'bootstrap.servers': 'localhost:9090' }
+ end
+
+ producer.monitor.subscribe('error.emitted') do |event|
+ error = event[:error]
+
+ p "Internal error occurred: #{error}"
+ end
+
+ # Run this code without Kafka cluster
+ loop do
+ producer.produce_async(topic: 'events', payload: 'data')
+
+ sleep(1)
+ end
+
+ # After you stop your Kafka cluster, you will see a lot of those:
+ #
+ # Internal error occurred: Local: Broker transport failure (transport)
+ #
+ # Internal error occurred: Local: Broker transport failure (transport)
+ ```
+
  ### Forking and potential memory problems

  If you work with forked processes, make sure you **don't** use the producer before the fork. You can easily configure the producer and then fork and use it.

  To tackle this [obstacle](https://github.com/appsignal/rdkafka-ruby/issues/15) related to rdkafka, WaterDrop adds a finalizer to each of the producers to close the rdkafka client before the Ruby process is shut down. Due to the [nature of the finalizers](https://www.mikeperham.com/2010/02/24/the-trouble-with-ruby-finalizers/), this implementation prevents producers from being GCed (except upon VM shutdown) and can cause memory leaks if you don't use persistent/long-lived producers in a long-running process or if you don't use the `#close` method of a producer when it is no longer needed. Creating a producer instance for each message is anyhow a rather bad idea, so we recommend not to.

- ## References
-
- * [WaterDrop code documentation](https://www.rubydoc.info/github/karafka/waterdrop)
- * [Karafka framework](https://github.com/karafka/karafka)
- * [WaterDrop Actions CI](https://github.com/karafka/waterdrop/actions?query=workflow%3Ac)
- * [WaterDrop Coditsu](https://app.coditsu.io/karafka/repositories/waterdrop)
-
  ## Note on contributions

  First, thank you for considering contributing to the Karafka ecosystem! It's people like you that make the open source community such a great community!
@@ -7,6 +7,13 @@ module WaterDrop
  class Config
  include Dry::Configurable

+ # Defaults for kafka settings that will be applied unless already present
+ KAFKA_DEFAULTS = {
+ 'client.id' => 'waterdrop'
+ }.freeze
+
+ private_constant :KAFKA_DEFAULTS
+
  # WaterDrop options
  #
  # option [String] id of the producer. This can be helpful when building producer specific
@@ -53,12 +60,28 @@ module WaterDrop
  def setup
  configure do |config|
  yield(config)
+
+ merge_kafka_defaults!(config)
  validate!(config.to_h)
+
+ ::Rdkafka::Config.logger = config.logger
  end
  end

  private

+ # Propagates the kafka setting defaults unless they are already present
+ # This makes it easier to set some values that users usually don't change but still allows them
+ # to overwrite the whole hash if they want to
+ # @param config [Dry::Configurable::Config] dry config of this producer
+ def merge_kafka_defaults!(config)
+ KAFKA_DEFAULTS.each do |key, value|
+ next if config.kafka.key?(key)
+
+ config.kafka[key] = value
+ end
+ end
+
  # Validates the configuration and if anything is wrong, will raise an exception
  # @param config_hash [Hash] config hash with setup details
  # @raise [WaterDrop::Errors::ConfigurationInvalidError] raised when something is wrong with
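The defaults-merging behavior above is small enough to exercise standalone. A minimal sketch, assuming a plain Hash in place of the dry-configurable kafka setting (the real code operates on `config.kafka`):

```ruby
# Defaults are applied only for keys the user has not already set;
# user-provided values always win.
KAFKA_DEFAULTS = { 'client.id' => 'waterdrop' }.freeze

def merge_kafka_defaults!(kafka)
  KAFKA_DEFAULTS.each do |key, value|
    # Skip keys the user already configured
    next if kafka.key?(key)

    kafka[key] = value
  end

  kafka
end

defaults_applied = merge_kafka_defaults!('bootstrap.servers' => 'localhost:9092')
# defaults_applied['client.id'] #=> "waterdrop"

user_wins = merge_kafka_defaults!('client.id' => 'my-app')
# user_wins['client.id'] #=> "my-app"
```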
@@ -0,0 +1,30 @@
+ # frozen_string_literal: true
+
+ module WaterDrop
+ module Instrumentation
+ module Callbacks
+ # Creates a callable that we want to run upon each message delivery or failure
+ #
+ # @note We don't have to provide client_name here as this callback is per client instance
+ class Delivery
+ # @param producer_id [String] id of the current producer
+ # @param monitor [WaterDrop::Instrumentation::Monitor] monitor we are using
+ def initialize(producer_id, monitor)
+ @producer_id = producer_id
+ @monitor = monitor
+ end
+
+ # Emits delivery details to the monitor
+ # @param delivery_report [Rdkafka::Producer::DeliveryReport] delivery report
+ def call(delivery_report)
+ @monitor.instrument(
+ 'message.acknowledged',
+ producer_id: @producer_id,
+ offset: delivery_report.offset,
+ partition: delivery_report.partition
+ )
+ end
+ end
+ end
+ end
+ end
@@ -0,0 +1,35 @@
+ # frozen_string_literal: true
+
+ module WaterDrop
+ module Instrumentation
+ module Callbacks
+ # Callback that kicks in when an error occurs and is published in a background thread
+ class Error
+ # @param producer_id [String] id of the current producer
+ # @param client_name [String] rdkafka client name
+ # @param monitor [WaterDrop::Instrumentation::Monitor] monitor we are using
+ def initialize(producer_id, client_name, monitor)
+ @producer_id = producer_id
+ @client_name = client_name
+ @monitor = monitor
+ end
+
+ # Runs the instrumentation monitor with error
+ # @param client_name [String] rdkafka client name
+ # @param error [Rdkafka::Error] error that occurred
+ # @note It will only instrument errors of our producer's client
+ def call(client_name, error)
+ # Emit only errors related to our client
+ # Same as with statistics (more explanation there)
+ return unless @client_name == client_name
+
+ @monitor.instrument(
+ 'error.emitted',
+ producer_id: @producer_id,
+ error: error
+ )
+ end
+ end
+ end
+ end
+ end
@@ -0,0 +1,41 @@
+ # frozen_string_literal: true
+
+ module WaterDrop
+ module Instrumentation
+ # Namespace for handlers of callbacks emitted by the kafka client lib
+ module Callbacks
+ # Statistics callback handler
+ # @note We decorate the statistics with our own decorator because some of the metrics from
+ # rdkafka are absolute. For example number of sent messages increases not in reference to
+ # previous statistics emit but from the beginning of the process. We decorate it with diff
+ # of all the numeric values against the data from the previous callback emit
+ class Statistics
+ # @param producer_id [String] id of the current producer
+ # @param client_name [String] rdkafka client name
+ # @param monitor [WaterDrop::Instrumentation::Monitor] monitor we are using
+ def initialize(producer_id, client_name, monitor)
+ @producer_id = producer_id
+ @client_name = client_name
+ @monitor = monitor
+ @statistics_decorator = StatisticsDecorator.new
+ end
+
+ # Emits decorated statistics to the monitor
+ # @param statistics [Hash] rdkafka statistics
+ def call(statistics)
+ # Emit only statistics related to our client
+ # rdkafka does not have a per-instance statistics hook, thus we need to make sure that
+ # we emit only stats that are related to the current producer. Otherwise we would emit
+ # the stats of all producers all the time.
+ return unless @client_name == statistics['name']
+
+ @monitor.instrument(
+ 'statistics.emitted',
+ producer_id: @producer_id,
+ statistics: @statistics_decorator.call(statistics)
+ )
+ end
+ end
+ end
+ end
+ end
@@ -0,0 +1,77 @@
+ # frozen_string_literal: true
+
+ module WaterDrop
+ module Instrumentation
+ module Callbacks
+ # Many of the librdkafka statistics are absolute values instead of a gauge.
+ # This means, that for example number of messages sent is an absolute growing value
+ # instead of being a value of messages sent from the last statistics report.
+ # This decorator calculates the diff against previously emitted stats, so we also get
+ # the diff together with the original values
+ class StatisticsDecorator
+ def initialize
+ @previous = {}.freeze
+ end
+
+ # @param emitted_stats [Hash] original emitted statistics
+ # @return [Hash] emitted statistics extended with the diff data
+ # @note We modify the emitted statistics, instead of creating new ones. Since we don't
+ # expose any API to get raw data, users can just assume that the result of this
+ # decoration is the proper raw stats that they can use
+ def call(emitted_stats)
+ diff(
+ @previous,
+ emitted_stats
+ )
+
+ @previous = emitted_stats
+
+ emitted_stats.freeze
+ end
+
+ private
+
+ # Calculates the diff of the provided values and modifies the emitted statistics in place
+ #
+ # @param previous [Object] previous value from the given scope in which
+ # we are
+ # @param current [Object] current scope from emitted statistics
+ # @return [Object] the diff if the values were numeric or the current scope
+ def diff(previous, current)
+ if current.is_a?(Hash)
+ # @note We cannot use #each_key as we modify the content of the current scope
+ # in place (in case it's a hash)
+ current.keys.each do |key|
+ append(
+ current,
+ key,
+ diff((previous || {})[key], (current || {})[key])
+ )
+ end
+ end
+
+ # Diff can be computed only for numerics
+ return current unless current.is_a?(Numeric)
+ # If there was no previous value, delta is always zero
+ return 0 unless previous
+ # Should never happen but just in case, a type changed in between stats
+ return current unless previous.is_a?(Numeric)
+
+ current - previous
+ end
+
+ # Appends the result of the diff to a given key as long as the result is numeric
+ #
+ # @param current [Hash] current scope
+ # @param key [Symbol] key based on which we were diffing
+ # @param result [Object] diff result
+ def append(current, key, result)
+ return unless result.is_a?(Numeric)
+ return if current.frozen?
+
+ current["#{key}_d"] = result
+ end
+ end
+ end
+ end
+ end
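The decorator's delta algorithm can be illustrated with a dependency-free sketch. This is a simplified rewrite of the `diff`/`append` pair above, not the shipped class: numeric leaves get a sibling `<key>_d` entry holding the difference against the previous emit, with `0` when there was no previous value.

```ruby
# Recursively walks the stats hash; for each numeric leaf, stores the delta
# against the previous emit under "<key>_d" next to the original value.
def diff(previous, current)
  if current.is_a?(Hash)
    # Snapshot keys first, since we add "_d" keys while iterating
    current.keys.each do |key|
      result = diff((previous || {})[key], current[key])
      current["#{key}_d"] = result if result.is_a?(Numeric)
    end
  end

  # Deltas exist only for numerics
  return current unless current.is_a?(Numeric)
  # No previous value means the delta is zero
  return 0 unless previous
  # Type changed between emits: fall back to the current value
  return current unless previous.is_a?(Numeric)

  current - previous
end

first = { 'txmsgs' => 10, 'name' => 'producer-1' }
diff({}, first)
# first['txmsgs_d'] #=> 0 (first emit, no previous value)

second = { 'txmsgs' => 25, 'name' => 'producer-1' }
diff(first, second)
# second['txmsgs_d'] #=> 15; 'name' gets no delta (non-numeric)
```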
@@ -0,0 +1,35 @@
+ # frozen_string_literal: true
+
+ module WaterDrop
+ module Instrumentation
+ # This manager allows us to register multiple callbacks into a hook that is supposed to
+ # support a single callback
+ class CallbacksManager
+ # @return [::WaterDrop::Instrumentation::CallbacksManager]
+ def initialize
+ @callbacks = Concurrent::Hash.new
+ end
+
+ # Invokes all the callbacks registered one after another
+ #
+ # @param args [Object] any args that should go to the callbacks
+ def call(*args)
+ @callbacks.each_value { |a| a.call(*args) }
+ end
+
+ # Adds a callback to the manager
+ #
+ # @param id [String] id of the callback (used when deleting it)
+ # @param callable [#call] object that responds to a `#call` method
+ def add(id, callable)
+ @callbacks[id] = callable
+ end
+
+ # Removes the callback from the manager
+ # @param id [String] id of the callback we want to remove
+ def delete(id)
+ @callbacks.delete(id)
+ end
+ end
+ end
+ end
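The fan-out idea behind the class above can be sketched without any dependencies: rdkafka exposes a single global hook, so the manager itself is registered as that one callback and dispatches each invocation to every registered callable. A plain `Hash` stands in for `Concurrent::Hash` here to keep the sketch self-contained; the real class needs the thread-safe variant because librdkafka invokes the hook from background threads.

```ruby
# Minimal fan-out manager: one #call entry point, many registered callables
class CallbacksManager
  def initialize
    @callbacks = {}
  end

  # Invokes all registered callbacks one after another
  def call(*args)
    @callbacks.each_value { |callable| callable.call(*args) }
  end

  def add(id, callable)
    @callbacks[id] = callable
  end

  def delete(id)
    @callbacks.delete(id)
  end
end

manager = CallbacksManager.new
received = []

manager.add('producer-1', ->(stats) { received << [:p1, stats] })
manager.add('producer-2', ->(stats) { received << [:p2, stats] })

manager.call('txmsgs' => 5)  # both callbacks fire
manager.delete('producer-2')
manager.call('txmsgs' => 7)  # only producer-1 remains registered
```

In WaterDrop each callback then filters by client name itself, so a single global hook still yields per-producer notifications.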
@@ -13,18 +13,24 @@ module WaterDrop
  # @note The non-error ones support timestamp benchmarking
  EVENTS = %w[
  producer.closed
+
  message.produced_async
  message.produced_sync
+ message.acknowledged
+ message.buffered
+
  messages.produced_async
  messages.produced_sync
- message.buffered
  messages.buffered
- message.acknowledged
+
  buffer.flushed_async
  buffer.flushed_async.error
  buffer.flushed_sync
  buffer.flushed_sync.error
+
  statistics.emitted
+
+ error.emitted
  ].freeze

  private_constant :EVENTS
@@ -2,6 +2,20 @@

  module WaterDrop
  # Namespace for all the things related with WaterDrop instrumentation process
+ # @note We do not
  module Instrumentation
+ class << self
+ # Builds a manager for statistics callbacks
+ # @return [WaterDrop::CallbacksManager]
+ def statistics_callbacks
+ @statistics_callbacks ||= CallbacksManager.new
+ end
+
+ # Builds a manager for error callbacks
+ # @return [WaterDrop::CallbacksManager]
+ def error_callbacks
+ @error_callbacks ||= CallbacksManager.new
+ end
+ end
  end
  end
@@ -0,0 +1,42 @@
+ # frozen_string_literal: true
+
+ module WaterDrop
+ module Patches
+ module Rdkafka
+ # Extends `Rdkafka::Bindings` with some extra methods and updates callbacks that we intend
+ # to work with in a bit different way than rdkafka itself
+ module Bindings
+ class << self
+ # Add extra methods that we need
+ # @param mod [::Rdkafka::Bindings] rdkafka bindings module
+ def included(mod)
+ mod.attach_function :rd_kafka_name, [:pointer], :string
+
+ # Default rdkafka setup for errors does not propagate client details, thus it always
+ # publishes all the stuff for all rdkafka instances. We change that by providing a
+ # function that fetches the instance name, allowing us to have better notifications
+ mod.send(:remove_const, :ErrorCallback)
+ mod.const_set(:ErrorCallback, build_error_callback)
+ end
+
+ # @return [FFI::Function] overwritten callback function
+ def build_error_callback
+ FFI::Function.new(
+ :void, %i[pointer int string pointer]
+ ) do |client_ptr, err_code, reason, _opaque|
+ return nil unless ::Rdkafka::Config.error_callback
+
+ name = ::Rdkafka::Bindings.rd_kafka_name(client_ptr)
+
+ error = ::Rdkafka::RdkafkaError.new(err_code, broker_message: reason)
+
+ ::Rdkafka::Config.error_callback.call(name, error)
+ end
+ end
+ end
+ end
+ end
+ end
+
+ ::Rdkafka::Bindings.include(::WaterDrop::Patches::Rdkafka::Bindings)
@@ -0,0 +1,20 @@
+ # frozen_string_literal: true
+
+ module WaterDrop
+ # Patches to external components
+ module Patches
+ # Rdkafka related patches
+ module Rdkafka
+ # Rdkafka::Producer patches
+ module Producer
+ # Adds a method that allows us to get the native kafka producer name
+ # @return [String] producer instance name
+ def name
+ ::Rdkafka::Bindings.rd_kafka_name(@native_kafka)
+ end
+ end
+ end
+ end
+ end
+
+ ::Rdkafka::Producer.include ::WaterDrop::Patches::Rdkafka::Producer
@@ -12,51 +12,16 @@ module WaterDrop
  def call(producer, config)
  return DummyClient.new unless config.deliver

- Rdkafka::Config.logger = config.logger
- Rdkafka::Config.statistics_callback = build_statistics_callback(producer, config.monitor)
-
  client = Rdkafka::Config.new(config.kafka.to_h).producer
- client.delivery_callback = build_delivery_callback(producer, config.monitor)
- client
- end

- private
+ # This callback is not global and is per client, thus we do not have to wrap it with a
+ # callbacks manager to make it work
+ client.delivery_callback = Instrumentation::Callbacks::Delivery.new(
+ producer.id,
+ config.monitor
+ )

- # Creates a proc that we want to run upon each successful message delivery
- #
- # @param producer [Producer]
- # @param monitor [Object] monitor we want to use
- # @return [Proc] delivery callback
- def build_delivery_callback(producer, monitor)
- lambda do |delivery_report|
- monitor.instrument(
- 'message.acknowledged',
- producer: producer,
- offset: delivery_report.offset,
- partition: delivery_report.partition
- )
- end
- end
-
- # Creates a proc that we want to run upon each statistics callback execution
- #
- # @param producer [Producer]
- # @param monitor [Object] monitor we want to use
- # @return [Proc] statistics callback
- # @note We decorate the statistics with our own decorator because some of the metrics from
- # rdkafka are absolute. For example number of sent messages increases not in reference to
- # previous statistics emit but from the beginning of the process. We decorate it with diff
- # of all the numeric values against the data from the previous callback emit
- def build_statistics_callback(producer, monitor)
- statistics_decorator = StatisticsDecorator.new
-
- lambda do |statistics|
- monitor.instrument(
- 'statistics.emitted',
- producer: producer,
- statistics: statistics_decorator.call(statistics)
- )
- end
+ client
  end
  end
  end
@@ -80,6 +80,19 @@ module WaterDrop

  @pid = Process.pid
  @client = Builder.new.call(self, @config)
+
+ # Register statistics runner for this particular type of callbacks
+ ::WaterDrop::Instrumentation.statistics_callbacks.add(
+ @id,
+ Instrumentation::Callbacks::Statistics.new(@id, @client.name, @config.monitor)
+ )
+
+ # Register error tracking callback
+ ::WaterDrop::Instrumentation.error_callbacks.add(
+ @id,
+ Instrumentation::Callbacks::Error.new(@id, @client.name, @config.monitor)
+ )
+
  @status.connected!
  end

@@ -111,6 +124,10 @@ module WaterDrop
  # connection that anyhow would be immediately closed
  client.close if @client

+ # Remove callbacks runners that were registered
+ ::WaterDrop::Instrumentation.statistics_callbacks.delete(@id)
+ ::WaterDrop::Instrumentation.error_callbacks.delete(@id)
+
  @status.closed!
  end
  end
@@ -3,5 +3,5 @@
  # WaterDrop library
  module WaterDrop
  # Current WaterDrop version
- VERSION = '2.0.4'
+ VERSION = '2.0.5'
  end
data/lib/water_drop.rb CHANGED
@@ -28,3 +28,9 @@ Zeitwerk::Loader
  .tap { |loader| loader.ignore("#{__dir__}/waterdrop.rb") }
  .tap(&:setup)
  .tap(&:eager_load)
+
+ # Rdkafka uses a single global callback for things. We bypass that by injecting a manager for
+ # each callback type. Callback manager allows us to register more than one callback
+ # @note Those managers are also used by Karafka for consumer related statistics
+ Rdkafka::Config.statistics_callback = WaterDrop::Instrumentation.statistics_callbacks
+ Rdkafka::Config.error_callback = WaterDrop::Instrumentation.error_callbacks
data.tar.gz.sig CHANGED
Binary file
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: waterdrop
  version: !ruby/object:Gem::Version
- version: 2.0.4
+ version: 2.0.5
  platform: ruby
  authors:
  - Maciej Mensfeld
@@ -34,7 +34,7 @@ cert_chain:
  R2P11bWoCtr70BsccVrN8jEhzwXngMyI2gVt750Y+dbTu1KgRqZKp/ECe7ZzPzXj
  pIy9vHxTANKYVyI4qj8OrFdEM5BQNu8oQpL0iQ==
  -----END CERTIFICATE-----
- date: 2021-09-19 00:00:00.000000000 Z
+ date: 2021-11-28 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: concurrent-ruby
@@ -149,14 +149,20 @@ files:
  - lib/water_drop/contracts/message.rb
  - lib/water_drop/errors.rb
  - lib/water_drop/instrumentation.rb
+ - lib/water_drop/instrumentation/callbacks/delivery.rb
+ - lib/water_drop/instrumentation/callbacks/error.rb
+ - lib/water_drop/instrumentation/callbacks/statistics.rb
+ - lib/water_drop/instrumentation/callbacks/statistics_decorator.rb
+ - lib/water_drop/instrumentation/callbacks_manager.rb
  - lib/water_drop/instrumentation/monitor.rb
  - lib/water_drop/instrumentation/stdout_listener.rb
+ - lib/water_drop/patches/rdkafka/bindings.rb
+ - lib/water_drop/patches/rdkafka/producer.rb
  - lib/water_drop/producer.rb
  - lib/water_drop/producer/async.rb
  - lib/water_drop/producer/buffer.rb
  - lib/water_drop/producer/builder.rb
  - lib/water_drop/producer/dummy_client.rb
- - lib/water_drop/producer/statistics_decorator.rb
  - lib/water_drop/producer/status.rb
  - lib/water_drop/producer/sync.rb
  - lib/water_drop/version.rb
@@ -182,7 +188,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
  - !ruby/object:Gem::Version
  version: '0'
  requirements: []
- rubygems_version: 3.2.27
+ rubygems_version: 3.2.25
  signing_key:
  specification_version: 4
  summary: Kafka messaging made easy!
metadata.gz.sig CHANGED
Binary file
@@ -1,71 +0,0 @@
- # frozen_string_literal: true
-
- module WaterDrop
- class Producer
- # Many of the librdkafka statistics are absolute values instead of a gauge.
- # This means, that for example number of messages sent is an absolute growing value
- # instead of being a value of messages sent from the last statistics report.
- # This decorator calculates the diff against previously emited stats, so we get also
- # the diff together with the original values
- class StatisticsDecorator
- def initialize
- @previous = {}.freeze
- end
-
- # @param emited_stats [Hash] original emited statistics
- # @return [Hash] emited statistics extended with the diff data
- # @note We modify the emited statistics, instead of creating new. Since we don't expose
- # any API to get raw data, users can just assume that the result of this decoration is the
- # proper raw stats that they can use
- def call(emited_stats)
- diff(
- @previous,
- emited_stats
- )
-
- @previous = emited_stats
-
- emited_stats.freeze
- end
-
- private
-
- # Calculates the diff of the provided values and modifies in place the emited statistics
- #
- # @param previous [Object] previous value from the given scope in which
- # we are
- # @param current [Object] current scope from emitted statistics
- # @return [Object] the diff if the values were numerics or the current scope
- def diff(previous, current)
- if current.is_a?(Hash)
- # @note We cannot use #each_key as we modify the content of the current scope
- # in place (in case it's a hash)
- current.keys.each do |key|
- append(
- current,
- key,
- diff((previous || {})[key], (current || {})[key])
- )
- end
- end
-
- if current.is_a?(Numeric) && previous.is_a?(Numeric)
- current - previous
- else
- current
- end
- end
-
- # Appends the result of the diff to a given key as long as the result is numeric
- #
- # @param current [Hash] current scope
- # @param key [Symbol] key based on which we were diffing
- # @param result [Object] diff result
- def append(current, key, result)
- return unless result.is_a?(Numeric)
-
- current["#{key}_d"] = result
- end
- end
- end
- end