waterdrop 1.4.2 → 2.0.2

Files changed (40)
  1. checksums.yaml +4 -4
  2. checksums.yaml.gz.sig +0 -0
  3. data.tar.gz.sig +0 -0
  4. data/.github/workflows/ci.yml +1 -2
  5. data/.gitignore +2 -0
  6. data/.ruby-version +1 -1
  7. data/CHANGELOG.md +17 -5
  8. data/Gemfile +9 -0
  9. data/Gemfile.lock +42 -29
  10. data/{MIT-LICENCE → MIT-LICENSE} +0 -0
  11. data/README.md +244 -57
  12. data/certs/mensfeld.pem +21 -21
  13. data/config/errors.yml +3 -16
  14. data/docker-compose.yml +1 -1
  15. data/lib/water_drop.rb +4 -24
  16. data/lib/water_drop/config.rb +41 -142
  17. data/lib/water_drop/contracts.rb +0 -2
  18. data/lib/water_drop/contracts/config.rb +8 -121
  19. data/lib/water_drop/contracts/message.rb +42 -0
  20. data/lib/water_drop/errors.rb +31 -5
  21. data/lib/water_drop/instrumentation/monitor.rb +16 -22
  22. data/lib/water_drop/instrumentation/stdout_listener.rb +113 -32
  23. data/lib/water_drop/patches/rdkafka_producer.rb +49 -0
  24. data/lib/water_drop/producer.rb +143 -0
  25. data/lib/water_drop/producer/async.rb +51 -0
  26. data/lib/water_drop/producer/buffer.rb +113 -0
  27. data/lib/water_drop/producer/builder.rb +63 -0
  28. data/lib/water_drop/producer/dummy_client.rb +32 -0
  29. data/lib/water_drop/producer/statistics_decorator.rb +71 -0
  30. data/lib/water_drop/producer/status.rb +52 -0
  31. data/lib/water_drop/producer/sync.rb +65 -0
  32. data/lib/water_drop/version.rb +1 -1
  33. data/waterdrop.gemspec +4 -4
  34. metadata +44 -45
  35. metadata.gz.sig +0 -0
  36. data/lib/water_drop/async_producer.rb +0 -26
  37. data/lib/water_drop/base_producer.rb +0 -57
  38. data/lib/water_drop/config_applier.rb +0 -52
  39. data/lib/water_drop/contracts/message_options.rb +0 -19
  40. data/lib/water_drop/sync_producer.rb +0 -24
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 3264233762968462432ad2d4c37809289966e943908df7c8c08d9939d585e621
-  data.tar.gz: '079c4e36222105d110f7268dff2abb3cfbdab20a019963a80552555d084f20ae'
+  metadata.gz: 24189103d7583fac3c911f4b32244655ffa2cd1b5a95abe0e73dbd4344530c5b
+  data.tar.gz: b4f5c0e95af559e0a0a929d14a3b56b39c4da5edc9b1d9ffe995e8b1852d9c14
 SHA512:
-  metadata.gz: 7b8759d6f7c7d3b7bb685ce725497e5b171bcc85efbdd3a5c5620c51b9a78d0c97715efe7f6ff2c3544fd39628a40122f5dbdc58164ba98355c789e487a4b4de
-  data.tar.gz: 38e8d05a29d782446865a203289757f68d4af23f0f63e161c429b96f87dd17cfeec0786f716c1ef71615e3b2136ef6a4b88e78d96c3a4396e39945b9e3a089a8
+  metadata.gz: bd49152ec9f6325cf58d9d470ed1ccfdb0c5b5d9a899bcdca289500ab56008b1153db35bf90f8e5e167692be8c79cd5bf96bbf66f9f515f97922bb217a0544d5
+  data.tar.gz: 32b36daf1f26227c58690fea5a6846c7599d1e8859e074ca3f2075cf0effa187cfbb4cb86a50ea22dfe169d3af54fd2e3232e14c569efcaa625f52166f227dbb
checksums.yaml.gz.sig CHANGED
Binary file
data.tar.gz.sig CHANGED
Binary file
data/.github/workflows/ci.yml CHANGED
@@ -17,8 +17,7 @@ jobs:
 - '3.0'
 - '2.7'
 - '2.6'
-- '2.5'
-- 'jruby'
+- 'jruby-head'
 include:
 - ruby: '3.0'
   coverage: 'true'
data/.gitignore CHANGED
@@ -4,6 +4,7 @@
 /vendor/ruby/
 /ruby/
 app.god
+example.rb
 
 # minimal Rails specific artifacts
 db/*.sqlite3
@@ -12,6 +13,7 @@ db/*.sqlite3
 *.gem
 *.~
 /.coditsu/local.yml
+.byebug_history
 
 # various artifacts
 **.war
data/.ruby-version CHANGED
@@ -1 +1 @@
-3.0.0
+3.0.2
data/CHANGELOG.md CHANGED
@@ -1,10 +1,22 @@
 # WaterDrop changelog
 
-## 1.4.2 (2021-03-30)
-- Additional 3.0 fixes (ojab)
-
-## 1.4.1 (2021-03-23)
-- Support for Ruby 3.0
+## 2.0.2 (2021-08-13)
+- Add support for `partition_key`
+- Switch license from `LGPL-3.0` to `MIT`
+- Switch flushing on close to sync
+
+## 2.0.1 (2021-06-05)
+- Remove Ruby 2.5 support and update minimum Ruby requirement to 2.6
+- Fix the `finalizer references object to be finalized` warning issued with 3.0
+
+## 2.0.0 (2020-12-13)
+- Redesign of the whole API (see `README.md` for the use-cases and the current API)
+- Replace `ruby-kafka` with `rdkafka`
+- Switch license from `MIT` to `LGPL-3.0`
+- #113 - Add some basic validations of the kafka scope of the config (Azdaroth)
+- Global state removed
+- Redesigned metrics that use `rdkafka` internal data + custom diffing
+- Restore JRuby support
 
 ## 1.4.0 (2020-08-25)
 - Release to match Karafka 1.4 versioning.
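The `2.0.2` entry adds `partition_key` support. As a rough illustration of what a partition key gives you: it deterministically selects a destination partition from the key. This pure-Ruby sketch shows the general idea only; `partition_for` is a hypothetical helper, not WaterDrop's or librdkafka's actual partitioner:

```ruby
require 'zlib'

# Hypothetical helper illustrating deterministic key-to-partition mapping.
# Real partitioning happens inside librdkafka; this only shows the concept.
def partition_for(partition_key, partition_count)
  Zlib.crc32(partition_key) % partition_count
end

# The same key always maps to the same partition, so related messages
# (e.g. all events of one user) stay ordered within that partition.
partition_for('user-42', 6) == partition_for('user-42', 6) # => true
```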
data/Gemfile CHANGED
@@ -2,9 +2,18 @@
 
 source 'https://rubygems.org'
 
+plugin 'diffend'
+
 gemspec
 
+gem 'rdkafka'
+
+group :development do
+  gem 'byebug'
+end
+
 group :test do
+  gem 'factory_bot'
   gem 'rspec'
   gem 'simplecov'
 end
data/Gemfile.lock CHANGED
@@ -1,49 +1,49 @@
 PATH
   remote: .
   specs:
-    waterdrop (1.4.2)
-      delivery_boy (>= 0.2, < 2.x)
+    waterdrop (2.0.2)
+      concurrent-ruby (>= 1.1)
       dry-configurable (~> 0.8)
       dry-monitor (~> 0.3)
-      dry-validation (~> 1.2)
-      ruby-kafka (>= 0.7.8)
+      dry-validation (~> 1.3)
+      rdkafka (>= 0.6.0)
       zeitwerk (~> 2.1)
 
 GEM
   remote: https://rubygems.org/
   specs:
-    concurrent-ruby (1.1.8)
-    delivery_boy (1.1.0)
-      king_konf (~> 1.0)
-      ruby-kafka (~> 1.0)
+    activesupport (6.1.4)
+      concurrent-ruby (~> 1.0, >= 1.0.2)
+      i18n (>= 1.6, < 2)
+      minitest (>= 5.1)
+      tzinfo (~> 2.0)
+      zeitwerk (~> 2.3)
+    byebug (11.1.3)
+    concurrent-ruby (1.1.9)
     diff-lcs (1.4.4)
-    digest-crc (0.6.3)
-      rake (>= 12.0.0, < 14.0.0)
-    docile (1.3.5)
+    docile (1.4.0)
     dry-configurable (0.12.1)
       concurrent-ruby (~> 1.0)
       dry-core (~> 0.5, >= 0.5.0)
-    dry-container (0.7.2)
+    dry-container (0.8.0)
       concurrent-ruby (~> 1.0)
       dry-configurable (~> 0.1, >= 0.1.3)
-    dry-core (0.5.0)
+    dry-core (0.7.1)
       concurrent-ruby (~> 1.0)
     dry-equalizer (0.3.0)
-    dry-events (0.2.0)
+    dry-events (0.3.0)
       concurrent-ruby (~> 1.0)
-      dry-core (~> 0.4)
-      dry-equalizer (~> 0.2)
-    dry-inflector (0.2.0)
+      dry-core (~> 0.5, >= 0.5)
+    dry-inflector (0.2.1)
     dry-initializer (3.0.4)
-    dry-logic (1.1.0)
+    dry-logic (1.2.0)
       concurrent-ruby (~> 1.0)
       dry-core (~> 0.5, >= 0.5)
-    dry-monitor (0.3.2)
+    dry-monitor (0.4.0)
       dry-configurable (~> 0.5)
-      dry-core (~> 0.4)
-      dry-equalizer (~> 0.2)
+      dry-core (~> 0.5, >= 0.5)
       dry-events (~> 0.2)
-    dry-schema (1.6.1)
+    dry-schema (1.7.0)
       concurrent-ruby (~> 1.0)
       dry-configurable (~> 0.8, >= 0.8.3)
       dry-core (~> 0.5, >= 0.5)
@@ -63,8 +63,18 @@ GEM
       dry-equalizer (~> 0.2)
       dry-initializer (~> 3.0)
       dry-schema (~> 1.5, >= 1.5.2)
-    king_konf (1.0.0)
-    rake (13.0.3)
+    factory_bot (6.2.0)
+      activesupport (>= 5.0.0)
+    ffi (1.15.3)
+    i18n (1.8.10)
+      concurrent-ruby (~> 1.0)
+    mini_portile2 (2.6.1)
+    minitest (5.14.4)
+    rake (13.0.6)
+    rdkafka (0.9.0)
+      ffi (~> 1.9)
+      mini_portile2 (~> 2.1)
+      rake (>= 12.3)
     rspec (3.10.0)
       rspec-core (~> 3.10.0)
       rspec-expectations (~> 3.10.0)
@@ -78,24 +88,27 @@ GEM
       diff-lcs (>= 1.2.0, < 2.0)
       rspec-support (~> 3.10.0)
     rspec-support (3.10.2)
-    ruby-kafka (1.3.0)
-      digest-crc
     simplecov (0.21.2)
       docile (~> 1.1)
       simplecov-html (~> 0.11)
       simplecov_json_formatter (~> 0.1)
     simplecov-html (0.12.3)
-    simplecov_json_formatter (0.1.2)
+    simplecov_json_formatter (0.1.3)
+    tzinfo (2.0.4)
+      concurrent-ruby (~> 1.0)
     zeitwerk (2.4.2)
 
 PLATFORMS
-  x86_64-darwin-19
+  x86_64-darwin
   x86_64-linux
 
 DEPENDENCIES
+  byebug
+  factory_bot
+  rdkafka
   rspec
   simplecov
   waterdrop!
 
 BUNDLED WITH
-   2.2.15
+   2.2.25
data/{MIT-LICENCE → MIT-LICENSE} CHANGED
File without changes
data/README.md CHANGED
@@ -1,17 +1,45 @@
 # WaterDrop
 
-[![Build Status](https://travis-ci.org/karafka/waterdrop.svg)](https://travis-ci.org/karafka/waterdrop)
-[![Join the chat at https://gitter.im/karafka/karafka](https://badges.gitter.im/karafka/karafka.svg)](https://gitter.im/karafka/karafka?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge)
+**Note**: The documentation presented here refers to WaterDrop `2.0.0`.
 
-Gem used to send messages to Kafka in an easy way with an extra validation layer. It is a part of the [Karafka](https://github.com/karafka/karafka) ecosystem.
+WaterDrop `2.0` does **not** work with Karafka `1.*`. It aims to work either as a standalone producer outside of the Karafka `1.*` ecosystem or as a part of the not yet released Karafka `2.0.*`.
+
+Please refer to [this](https://github.com/karafka/waterdrop/tree/1.4) branch and its documentation for details about WaterDrop `1.*` usage.
 
-WaterDrop is based on Zendesks [delivery_boy](https://github.com/zendesk/delivery_boy) gem.
+[![Build Status](https://github.com/karafka/waterdrop/workflows/ci/badge.svg)](https://github.com/karafka/waterdrop/actions?query=workflow%3Aci)
+[![Gem Version](https://badge.fury.io/rb/waterdrop.svg)](http://badge.fury.io/rb/waterdrop)
+[![Join the chat at https://gitter.im/karafka/karafka](https://badges.gitter.im/karafka/karafka.svg)](https://gitter.im/karafka/karafka)
 
-It is:
+Gem used to send messages to Kafka in an easy way with an extra validation layer. It is a part of the [Karafka](https://github.com/karafka/karafka) ecosystem.
 
-- Thread safe
-- Supports sync and async producers
-- Working with 0.11+ Kafka
+It:
+
+- Is thread safe
+- Supports sync producing
+- Supports async producing
+- Supports buffering
+- Supports producing messages to multiple clusters
+- Supports multiple delivery policies
+- Works with Kafka 1.0+ and Ruby 2.6+
+
+## Table of contents
+
+- [WaterDrop](#waterdrop)
+  * [Table of contents](#table-of-contents)
+  * [Installation](#installation)
+  * [Setup](#setup)
+    + [WaterDrop configuration options](#waterdrop-configuration-options)
+    + [Kafka configuration options](#kafka-configuration-options)
+  * [Usage](#usage)
+    + [Basic usage](#basic-usage)
+    + [Buffering](#buffering)
+      - [Using WaterDrop to buffer messages based on the application logic](#using-waterdrop-to-buffer-messages-based-on-the-application-logic)
+      - [Using WaterDrop with rdkafka buffers to achieve periodic auto-flushing](#using-waterdrop-with-rdkafka-buffers-to-achieve-periodic-auto-flushing)
+  * [Instrumentation](#instrumentation)
+    + [Usage statistics](#usage-statistics)
+    + [Forking and potential memory problems](#forking-and-potential-memory-problems)
+  * [References](#references)
+  * [Note on contributions](#note-on-contributions)
 
 ## Installation
 
@@ -35,83 +63,244 @@ bundle install
 
 WaterDrop is a complex tool, that contains multiple configuration options. To keep everything organized, all the configuration options were divided into two groups:
 
-- WaterDrop options - options directly related to Karafka framework and it's components
-- Ruby-Kafka driver options - options related to Ruby-Kafka/Delivery boy
+- WaterDrop options - options directly related to WaterDrop and its components
+- Kafka driver options - options related to `rdkafka`
+
+To apply all those configuration options, you need to create a producer instance and use the ```#setup``` method:
+
+```ruby
+producer = WaterDrop::Producer.new
+
+producer.setup do |config|
+  config.deliver = true
+  config.kafka = {
+    'bootstrap.servers': 'localhost:9092',
+    'request.required.acks': 1
+  }
+end
+```
 
-To apply all those configuration options, you need to use the ```#setup``` method:
+or you can do the same while initializing the producer:
 
 ```ruby
-WaterDrop.setup do |config|
+producer = WaterDrop::Producer.new do |config|
   config.deliver = true
-  config.kafka.seed_brokers = %w[kafka://localhost:9092]
+  config.kafka = {
+    'bootstrap.servers': 'localhost:9092',
+    'request.required.acks': 1
+  }
 end
 ```
 
 ### WaterDrop configuration options
 
-| Option    | Description                                                      |
-|-----------|------------------------------------------------------------------|
-| client_id | This is how the client will identify itself to the Kafka brokers |
-| logger    | Logger that we want to use                                       |
-| deliver   | Should we send messages to Kafka                                 |
+| Option             | Description                                                     |
+|--------------------|-----------------------------------------------------------------|
+| `id`               | id of the producer for instrumentation and logging              |
+| `logger`           | Logger that we want to use                                      |
+| `deliver`          | Should we send messages to Kafka or just fake the delivery      |
+| `max_wait_timeout` | Waits that long for the delivery report or raises an error      |
+| `wait_timeout`     | Waits that long before re-check of delivery report availability |
 
-### Ruby-Kafka driver and Delivery boy configuration options
+### Kafka configuration options
 
-**Note:** We've listed here only **the most important** configuration options. If you're interested in all the options, please go to the [config.rb](https://github.com/karafka/waterdrop/blob/master/lib/water_drop/config.rb) file for more details.
+You can create producers with different `kafka` settings. Documentation of the available configuration options is available on https://github.com/edenhill/librdkafka/blob/master/CONFIGURATION.md.
 
-**Note:** All the options are subject to validations. In order to check what is and what is not acceptable, please go to the [config.rb validation schema](https://github.com/karafka/waterdrop/blob/master/lib/water_drop/schemas/config.rb) file.
+## Usage
+
+Please refer to the [documentation](https://www.rubydoc.info/gems/waterdrop) in case you're interested in the more advanced API.
 
-| Option                   | Description                                                                                                                                           |
-|--------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------|
-| raise_on_buffer_overflow | Should we raise an exception, when messages can't be sent in an async way due to the message buffer overflow or should we just drop them              |
-| delivery_interval        | The number of seconds between background message deliveries. Disable timer-based background deliveries by setting this to 0.                          |
-| delivery_threshold       | The number of buffered messages that will trigger a background message delivery. Disable buffer size based background deliveries by setting this to 0.|
-| required_acks            | The number of Kafka replicas that must acknowledge messages before they're considered as successfully written.                                        |
-| ack_timeout              | A timeout executed by a broker when the client is sending messages to it.                                                                             |
-| max_retries              | The number of retries when attempting to deliver messages.                                                                                            |
-| retry_backoff            | The number of seconds to wait after a failed attempt to send messages to a Kafka broker before retrying.                                              |
-| max_buffer_bytesize      | The maximum number of bytes allowed in the buffer before new messages are rejected.                                                                   |
-| max_buffer_size          | The maximum number of messages allowed in the buffer before new messages are rejected.                                                                |
-| max_queue_size           | The maximum number of messages allowed in the queue before new messages are rejected.                                                                 |
-| sasl_plain_username      | The username used to authenticate.                                                                                                                    |
-| sasl_plain_password      | The password used to authenticate.                                                                                                                    |
+### Basic usage
 
-This configuration can be also placed in *config/initializers* and can vary based on the environment:
+To send Kafka messages, just create a producer and use it:
 
 ```ruby
-WaterDrop.setup do |config|
-  config.deliver = Rails.env.production?
-  config.kafka.seed_brokers = [Rails.env.production? ? 'kafka://prod-host:9091' : 'kafka://localhost:9092']
+producer = WaterDrop::Producer.new
+
+producer.setup do |config|
+  config.kafka = { 'bootstrap.servers': 'localhost:9092' }
 end
+
+producer.produce_sync(topic: 'my-topic', payload: 'my message')
+
+# or for async
+producer.produce_async(topic: 'my-topic', payload: 'my message')
+
+# or in batches
+producer.produce_many_sync(
+  [
+    { topic: 'my-topic', payload: 'my message' },
+    { topic: 'my-topic', payload: 'my message' }
+  ]
+)
+
+# both sync and async
+producer.produce_many_async(
+  [
+    { topic: 'my-topic', payload: 'my message' },
+    { topic: 'my-topic', payload: 'my message' }
+  ]
+)
+
+# Don't forget to close the producer once you're done to flush the internal buffers, etc
+producer.close
 ```
 
-## Usage
+Each message that you want to publish will have its value checked.
+
+Here are all the things you can provide in the message hash:
+
+| Option          | Required | Value type    | Description                                              |
+|-----------------|----------|---------------|----------------------------------------------------------|
+| `topic`         | true     | String        | The Kafka topic that should be written to                |
+| `payload`       | true     | String        | Data you want to send to Kafka                           |
+| `key`           | false    | String        | The key that should be set in the Kafka message          |
+| `partition`     | false    | Integer       | A specific partition number that should be written to    |
+| `partition_key` | false    | String        | Key to indicate the destination partition of the message |
+| `timestamp`     | false    | Time, Integer | The timestamp that should be set on the message          |
+| `headers`       | false    | Hash          | Headers for the message                                  |
 
-To send Kafka messages, just use one of the producers:
+Keep in mind that the message you want to send should be either binary or stringified (to_s, to_json, etc).
+
+### Buffering
+
+WaterDrop producers support buffering messages in their internal buffers and on the `rdkafka` level via the `queue.buffering.*` set of settings.
+
+This means that depending on your use case, you can achieve both granular buffering and flushing control when needed with context awareness and periodic and size-based flushing functionalities.
+
+#### Using WaterDrop to buffer messages based on the application logic
 
 ```ruby
-WaterDrop::SyncProducer.call('message', topic: 'my-topic')
-# or for async
-WaterDrop::AsyncProducer.call('message', topic: 'my-topic')
+producer = WaterDrop::Producer.new
+
+producer.setup do |config|
+  config.kafka = { 'bootstrap.servers': 'localhost:9092' }
end
+
+# Simulating some events states of a transaction - notice, that the messages will be flushed to
+# kafka only upon arrival of the `finished` state.
+%w[
+  started
+  processed
+  finished
+].each do |state|
+  producer.buffer(topic: 'events', payload: state)
+
+  puts "The messages buffer size #{producer.messages.size}"
+  producer.flush_sync if state == 'finished'
+  puts "The messages buffer size #{producer.messages.size}"
+end
+
+producer.close
 ```
 
-Both ```SyncProducer``` and ```AsyncProducer``` accept following options:
+#### Using WaterDrop with rdkafka buffers to achieve periodic auto-flushing
 
-| Option              | Required | Value type | Description                                                         |
-|---------------------|----------|------------|---------------------------------------------------------------------|
-| ```topic```         | true     | String     | The Kafka topic that should be written to                           |
-| ```key```           | false    | String     | The key that should be set in the Kafka message                     |
-| ```partition```     | false    | Integer    | A specific partition number that should be written to               |
-| ```partition_key``` | false    | String     | A string that can be used to deterministically select the partition |
-| ```create_time```   | false    | Time       | The timestamp that should be set on the message                     |
-| ```headers```       | false    | Hash       | Headers for the message                                             |
+```ruby
+producer = WaterDrop::Producer.new
+
+producer.setup do |config|
+  config.kafka = {
+    'bootstrap.servers': 'localhost:9092',
+    # Accumulate messages for at most 10 seconds
+    'queue.buffering.max.ms' => 10_000
+  }
+end
 
-Keep in mind, that message you want to send should be either binary or stringified (to_s, to_json, etc).
+# WaterDrop will flush messages minimum once every 10 seconds
+30.times do |i|
+  producer.produce_async(topic: 'events', payload: i.to_s)
+  sleep(1)
+end
+
+producer.close
+```
+
+## Instrumentation
+
+Each of the producers, after `#setup` is done, has a custom monitor to which you can subscribe.
+
+```ruby
+producer = WaterDrop::Producer.new
+
+producer.setup do |config|
+  config.kafka = { 'bootstrap.servers': 'localhost:9092' }
+end
+
+producer.monitor.subscribe('message.produced_async') do |event|
+  puts "A message was produced to '#{event[:message][:topic]}' topic!"
+end
+
+producer.produce_async(topic: 'events', payload: 'data')
+
+producer.close
+```
+
+See `WaterDrop::Instrumentation::Monitor::EVENTS` for the list of all the supported events.
+
+### Usage statistics
+
+WaterDrop may be configured to emit internal metrics at a fixed interval by setting the `kafka` `statistics.interval.ms` configuration property to a value > `0`. Once that is done, emitted statistics are available after subscribing to the `statistics.emitted` publisher event.
+
+The statistics include all of the metrics from `librdkafka` (full list [here](https://github.com/edenhill/librdkafka/blob/master/STATISTICS.md)) as well as the diff of those against the previously emitted values.
+
+For several attributes like `txmsgs`, `librdkafka` publishes only the totals. In order to make it easier to track the progress (for example the number of messages sent between statistics emitted events), WaterDrop diffs all the numeric values against the previously available numbers. All of those metrics are available under the same key as the metric but with an additional `_d` postfix:
+
+```ruby
+producer = WaterDrop::Producer.new do |config|
+  config.kafka = {
+    'bootstrap.servers': 'localhost:9092',
+    'statistics.interval.ms': 2_000 # emit statistics every 2 seconds
+  }
+end
+
+producer.monitor.subscribe('statistics.emitted') do |event|
+  sum = event[:statistics]['txmsgs']
+  diff = event[:statistics]['txmsgs_d']
+
+  p "Sent messages: #{sum}"
+  p "Messages sent from last statistics report: #{diff}"
+end
+
+sleep(2)
+
+# Sent messages: 0
+# Messages sent from last statistics report: 0
+
+20.times { producer.produce_async(topic: 'events', payload: 'data') }
+
+# Sent messages: 20
+# Messages sent from last statistics report: 20
+
+sleep(2)
+
+20.times { producer.produce_async(topic: 'events', payload: 'data') }
+
+# Sent messages: 40
+# Messages sent from last statistics report: 20
+
+sleep(2)
+
+# Sent messages: 40
+# Messages sent from last statistics report: 0
+
+producer.close
+```
+
+Note: The metrics returned may not be completely consistent between brokers, toppars and totals, due to the internal asynchronous nature of librdkafka. E.g., the top-level tx total may be less than the sum of the broker tx values which it represents.
+
+### Forking and potential memory problems
+
+If you work with forked processes, make sure you **don't** use the producer before the fork. You can easily configure the producer and then fork and use it.
+
+To tackle this [obstacle](https://github.com/appsignal/rdkafka-ruby/issues/15) related to rdkafka, WaterDrop adds a finalizer to each of the producers to close the rdkafka client before the Ruby process is shut down. Due to the [nature of the finalizers](https://www.mikeperham.com/2010/02/24/the-trouble-with-ruby-finalizers/), this implementation prevents producers from being GCed (except upon VM shutdown) and can cause memory leaks if you don't use persistent/long-lived producers in a long-running process or if you don't use the `#close` method of a producer when it is no longer needed. Creating a producer instance for each message is a rather bad idea anyhow, so we recommend against it.
 
 ## References
 
+* [WaterDrop code documentation](https://www.rubydoc.info/github/karafka/waterdrop)
 * [Karafka framework](https://github.com/karafka/karafka)
-* [WaterDrop Travis CI](https://travis-ci.org/karafka/waterdrop)
+* [WaterDrop Actions CI](https://github.com/karafka/waterdrop/actions?query=workflow%3Aci)
 * [WaterDrop Coditsu](https://app.coditsu.io/karafka/repositories/waterdrop)
 
 ## Note on contributions
@@ -123,5 +312,3 @@ Each pull request must pass all the RSpec specs and meet our quality requirement
 To check if everything is as it should be, we use [Coditsu](https://coditsu.io) that combines multiple linters and code analyzers for both code and documentation. Once you're done with your changes, submit a pull request.
 
 Coditsu will automatically check your work against our quality standards. You can find your commit check results on the [builds page](https://app.coditsu.io/karafka/repositories/waterdrop/builds/commit_builds) of WaterDrop repository.
-
-[![coditsu](https://coditsu.io/assets/quality_bar.svg)](https://app.coditsu.io/karafka/repositories/waterdrop/builds/commit_builds)
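The README's usage-statistics section describes how WaterDrop decorates the `librdkafka` metrics with delta keys carrying a `_d` postfix (implemented in the new `data/lib/water_drop/producer/statistics_decorator.rb`). A minimal pure-Ruby sketch of that diffing idea, assuming a flat hash of numeric metrics; `diff_statistics` is a hypothetical name, not the gem's actual API:

```ruby
# Hypothetical sketch: copy each metric and, for numeric values, add a
# companion `<key>_d` entry with the delta against the previous emission.
def diff_statistics(previous, current)
  current.each_with_object({}) do |(key, value), result|
    result[key] = value
    result["#{key}_d"] = value - previous.fetch(key, 0) if value.is_a?(Numeric)
  end
end

before = { 'txmsgs' => 20 }
after  = { 'txmsgs' => 40 }
diff_statistics(before, after) # => { 'txmsgs' => 40, 'txmsgs_d' => 20 }
```

With this shape, a subscriber can read both the running total (`txmsgs`) and the per-interval progress (`txmsgs_d`) from the same statistics payload.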