deimos-ruby 1.24.2 → 2.0.0.pre.alpha1
- checksums.yaml +4 -4
- data/.rubocop_todo.yml +0 -17
- data/.tool-versions +1 -0
- data/CHANGELOG.md +5 -0
- data/README.md +287 -498
- data/deimos-ruby.gemspec +4 -4
- data/docs/CONFIGURATION.md +133 -226
- data/docs/UPGRADING.md +237 -0
- data/lib/deimos/active_record_consume/batch_consumption.rb +29 -28
- data/lib/deimos/active_record_consume/mass_updater.rb +59 -4
- data/lib/deimos/active_record_consume/message_consumption.rb +15 -21
- data/lib/deimos/active_record_consumer.rb +36 -21
- data/lib/deimos/active_record_producer.rb +28 -9
- data/lib/deimos/backends/base.rb +4 -35
- data/lib/deimos/backends/kafka.rb +6 -22
- data/lib/deimos/backends/kafka_async.rb +6 -22
- data/lib/deimos/backends/{db.rb → outbox.rb} +13 -9
- data/lib/deimos/config/configuration.rb +116 -379
- data/lib/deimos/consume/batch_consumption.rb +24 -124
- data/lib/deimos/consume/message_consumption.rb +36 -63
- data/lib/deimos/consumer.rb +16 -75
- data/lib/deimos/ext/consumer_route.rb +35 -0
- data/lib/deimos/ext/producer_middleware.rb +94 -0
- data/lib/deimos/ext/producer_route.rb +22 -0
- data/lib/deimos/ext/redraw.rb +29 -0
- data/lib/deimos/ext/routing_defaults.rb +72 -0
- data/lib/deimos/ext/schema_route.rb +70 -0
- data/lib/deimos/kafka_message.rb +2 -2
- data/lib/deimos/kafka_source.rb +2 -7
- data/lib/deimos/kafka_topic_info.rb +1 -1
- data/lib/deimos/logging.rb +71 -0
- data/lib/deimos/message.rb +2 -11
- data/lib/deimos/metrics/datadog.rb +40 -1
- data/lib/deimos/metrics/provider.rb +4 -4
- data/lib/deimos/producer.rb +39 -116
- data/lib/deimos/railtie.rb +6 -0
- data/lib/deimos/schema_backends/avro_base.rb +21 -21
- data/lib/deimos/schema_backends/avro_schema_registry.rb +1 -2
- data/lib/deimos/schema_backends/avro_validation.rb +2 -2
- data/lib/deimos/schema_backends/base.rb +19 -12
- data/lib/deimos/schema_backends/mock.rb +6 -1
- data/lib/deimos/schema_backends/plain.rb +47 -0
- data/lib/deimos/schema_class/base.rb +2 -2
- data/lib/deimos/schema_class/enum.rb +1 -1
- data/lib/deimos/schema_class/record.rb +2 -2
- data/lib/deimos/test_helpers.rb +95 -320
- data/lib/deimos/tracing/provider.rb +6 -6
- data/lib/deimos/transcoder.rb +88 -0
- data/lib/deimos/utils/db_poller/base.rb +16 -14
- data/lib/deimos/utils/db_poller/state_based.rb +3 -3
- data/lib/deimos/utils/db_poller/time_based.rb +4 -4
- data/lib/deimos/utils/db_poller.rb +1 -1
- data/lib/deimos/utils/deadlock_retry.rb +1 -1
- data/lib/deimos/utils/{db_producer.rb → outbox_producer.rb} +16 -47
- data/lib/deimos/utils/schema_class.rb +0 -7
- data/lib/deimos/version.rb +1 -1
- data/lib/deimos.rb +79 -26
- data/lib/generators/deimos/{db_backend_generator.rb → outbox_backend_generator.rb} +4 -4
- data/lib/generators/deimos/schema_class_generator.rb +0 -1
- data/lib/generators/deimos/v2/templates/karafka.rb.tt +149 -0
- data/lib/generators/deimos/v2_generator.rb +193 -0
- data/lib/tasks/deimos.rake +5 -7
- data/spec/active_record_batch_consumer_association_spec.rb +22 -13
- data/spec/active_record_batch_consumer_spec.rb +84 -65
- data/spec/active_record_consume/batch_consumption_spec.rb +10 -10
- data/spec/active_record_consume/batch_slicer_spec.rb +12 -12
- data/spec/active_record_consume/mass_updater_spec.rb +137 -0
- data/spec/active_record_consumer_spec.rb +29 -13
- data/spec/active_record_producer_spec.rb +36 -26
- data/spec/backends/base_spec.rb +0 -23
- data/spec/backends/kafka_async_spec.rb +1 -3
- data/spec/backends/kafka_spec.rb +1 -3
- data/spec/backends/{db_spec.rb → outbox_spec.rb} +14 -20
- data/spec/batch_consumer_spec.rb +66 -116
- data/spec/consumer_spec.rb +53 -147
- data/spec/deimos_spec.rb +10 -126
- data/spec/kafka_source_spec.rb +19 -52
- data/spec/karafka/karafka.rb +69 -0
- data/spec/karafka_config/karafka_spec.rb +97 -0
- data/spec/logging_spec.rb +25 -0
- data/spec/message_spec.rb +9 -9
- data/spec/producer_spec.rb +112 -254
- data/spec/rake_spec.rb +1 -3
- data/spec/schema_backends/avro_validation_spec.rb +1 -1
- data/spec/schemas/com/my-namespace/MySchemaWithTitle.avsc +22 -0
- data/spec/snapshots/consumers-no-nest.snap +49 -0
- data/spec/snapshots/consumers.snap +49 -0
- data/spec/snapshots/consumers_and_producers-no-nest.snap +49 -0
- data/spec/snapshots/consumers_and_producers.snap +49 -0
- data/spec/snapshots/consumers_circular-no-nest.snap +49 -0
- data/spec/snapshots/consumers_circular.snap +49 -0
- data/spec/snapshots/consumers_complex_types-no-nest.snap +49 -0
- data/spec/snapshots/consumers_complex_types.snap +49 -0
- data/spec/snapshots/consumers_nested-no-nest.snap +49 -0
- data/spec/snapshots/consumers_nested.snap +49 -0
- data/spec/snapshots/namespace_folders.snap +49 -0
- data/spec/snapshots/namespace_map.snap +49 -0
- data/spec/snapshots/producers_with_key-no-nest.snap +49 -0
- data/spec/snapshots/producers_with_key.snap +49 -0
- data/spec/spec_helper.rb +61 -29
- data/spec/utils/db_poller_spec.rb +49 -39
- data/spec/utils/{db_producer_spec.rb → outbox_producer_spec.rb} +17 -184
- metadata +58 -67
- data/lib/deimos/batch_consumer.rb +0 -7
- data/lib/deimos/config/phobos_config.rb +0 -163
- data/lib/deimos/instrumentation.rb +0 -95
- data/lib/deimos/monkey_patches/phobos_cli.rb +0 -35
- data/lib/deimos/utils/inline_consumer.rb +0 -158
- data/lib/deimos/utils/lag_reporter.rb +0 -186
- data/lib/deimos/utils/schema_controller_mixin.rb +0 -129
- data/spec/config/configuration_spec.rb +0 -321
- data/spec/kafka_listener_spec.rb +0 -55
- data/spec/phobos.bad_db.yml +0 -73
- data/spec/phobos.yml +0 -77
- data/spec/utils/inline_consumer_spec.rb +0 -31
- data/spec/utils/lag_reporter_spec.rb +0 -76
- data/spec/utils/platform_schema_validation_spec.rb +0 -0
- data/spec/utils/schema_controller_mixin_spec.rb +0 -84
- /data/lib/generators/deimos/{db_backend → outbox_backend}/templates/migration +0 -0
- /data/lib/generators/deimos/{db_backend → outbox_backend}/templates/rails3_migration +0 -0
data/README.md
CHANGED
@@ -8,7 +8,10 @@

 A Ruby framework for marrying Kafka, a schema definition like Avro, and/or ActiveRecord and provide
 a useful toolbox of goodies for Ruby-based Kafka development.
-Built on Phobos and hence Ruby-Kafka.
+Built on [Karafka](https://karafka.io/).
+
+> [!IMPORTANT]
+> Deimos 2.x is a major rewrite from 1.x. Please see the [Upgrading Guide](./docs/UPGRADING.md) for information on the changes and how to upgrade.

 <!--ts-->
    * [Additional Documentation](#additional-documentation)
@@ -23,15 +26,15 @@ Built on Phobos and hence Ruby-Kafka.
       * [Kafka Message Keys](#kafka-message-keys)
    * [Consumers](#consumers)
    * [Rails Integration](#rails-integration)
-      * [
-
+      * [Producing](#rails-producing)
+      * [Consuming](#rails-consuming)
+      * [Generating Tables and Models](#generating-tables-and-models)
+      * [Outbox Backend](#outbox-backend)
    * [Database Poller](#database-poller)
    * [Running Consumers](#running-consumers)
    * [Generated Schema Classes](#generated-schema-classes)
    * [Metrics](#metrics)
    * [Testing](#testing)
-      * [Test Helpers](#test-helpers)
-      * [Integration Test Helpers](#integration-test-helpers)
    * [Utilities](#utilities)
    * [Contributing](#contributing)
 <!--te-->
@@ -70,7 +73,7 @@ are for bugfixes or new functionality which does not affect existing code. You
 should be locking your Gemfile to the minor version:

 ```ruby
-gem 'deimos-ruby', '~> 1.1'
+gem 'deimos-ruby', '~> 1.1.0'
 ```

 # Configuration
@@ -100,7 +103,15 @@ To create a new schema backend, please see the existing examples [here](lib/deim

 # Producers

-
+With the correct [configuration](./docs/CONFIGURATION.md), you do not need to use a Deimos producer class in order to send schema-encoded messages to Kafka. You can simply use `Karafka.producer.produce()` (see [here](https://karafka.io/docs/Producing-messages/)). There are a few features that Deimos producers provide:
+
+* Using an instance method to determine partition key based on the provided payload
+* Allowing global disabling of producers (or a particular producer class)
+* Usage of the [Outbox](#outbox) producer backend.
+
+Producer classes in general are a handy way to coerce some object into a hash or [schema class](#generated-schema-classes) that represents the payload.
+
+A Deimos producer could look like this:

 ```ruby
 class MyProducer < Deimos::Producer
@@ -113,27 +124,22 @@ class MyProducer < Deimos::Producer
       payload[:my_id]
     end

-    # You can call publish / publish_list directly, or create new methods
-    # wrapping them.
+    # You can call produce directly, or create new methods wrapping it.

     def send_some_message(an_object)
       payload = {
         'some-key' => an_object.foo,
         'some-key2' => an_object.bar
       }
-
-      #
-
-      self.publish(payload)
+      self.produce([{payload: payload}])
+      # additional keys can be added - see https://karafka.io/docs/WaterDrop-Usage/
+      self.produce([{payload: payload, topic: "other-topic", key: "some-key", partition_key: "some-key2"}])
     end
-
   end
-
-
 end
 ```

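To make the new `produce` API above concrete, here is a minimal usage sketch. It is not part of the diff itself: it assumes the topic is routed to this producer in `karafka.rb` so that Deimos's producer middleware schema-encodes the hash payloads, and `produce_sync` is WaterDrop's synchronous variant of the `Karafka.producer.produce()` call mentioned above.

```ruby
# Sketch only. Assumes MyProducer's topic and schema are configured in
# karafka.rb as described in docs/CONFIGURATION.md.
MyProducer.produce([{payload: {'some-key' => 1, 'some-key2' => 'abc'}}])

# Without a producer class, WaterDrop can be used directly; with Deimos's
# producer middleware registered, the hash payload is schema-encoded in flight.
Karafka.producer.produce_sync(topic: 'my-topic', payload: {'some-key' => 1})
```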
-
+## Auto-added Fields

 If your schema has a field called `message_id`, and the payload you give
 your producer doesn't have this set, Deimos will auto-generate
@@ -143,7 +149,7 @@ so that you can track each sent message via logging.
 You can also provide a field in your schema called `timestamp` which will be
 auto-filled with the current timestamp if not provided.

-
+## Coerced Values

 Deimos will do some simple coercions if you pass values that don't
 exactly match the schema.
@@ -155,60 +161,28 @@ representing a number, will be parsed to Float.
 * If the schema is :string, if the value implements its own `to_s` method,
   this will be called on it. This includes hashes, symbols, numbers, dates, etc.

-
-
-Deimos will send ActiveSupport Notifications.
-You can listen to these notifications e.g. as follows:
+## Disabling Producers

+You can disable producers globally or inside a block. Globally:
 ```ruby
-
-
-  # you can access time, duration, and transaction_id
-  # payload contains :producer, :topic, and :payloads
-  data = event.payload
-end
-```
+Deimos.config.producers.disabled = true
+```

-
-
+For the duration of a block:
+```ruby
+Deimos.disable_producers do
+  # code goes here
+end
+```

-
-  * producer - the class that produced the message
-  * topic
-  * exception_object
-  * payloads - the unencoded payloads
-* `encode_messages` - sent when messages are being schema-encoded.
-  * producer - the class that produced the message
-  * topic
-  * payloads - the unencoded payloads
-* `db_producer.produce` - sent when the DB producer sends messages for the
-  DB backend. Messages that are too large will be caught with this
-  notification - they will be deleted from the table and this notification
-  will be fired with an exception object.
-  * topic
-  * exception_object
-  * messages - the batch of messages (in the form of `Deimos::KafkaMessage`s)
-    that failed - this should have only a single message in the batch.
-* `batch_consumption.valid_records` - sent when the consumer has successfully upserted records. Limited by `max_db_batch_size`.
-  * consumer: class of the consumer that upserted these records
-  * records: Records upserted into the DB (of type `ActiveRecord::Base`)
-* `batch_consumption.invalid_records` - sent when the consumer has rejected records returned from `filtered_records`. Limited by `max_db_batch_size`.
-  * consumer: class of the consumer that rejected these records
-  * records: Rejected records (of type `Deimos::ActiveRecordConsume::BatchRecord`)
-
-Similarly:
+For specific producers only:
 ```ruby
-
-
-
-
-
-Deimos.subscribe('encode_messages') do |event|
-  # ...
-end
-```
+Deimos.disable_producers(Producer1, Producer2) do
+  # code goes here
+end
+```

-
+## Kafka Message Keys

 Topics representing events rather than domain data don't need keys. However,
 best practice for domain messages is to schema-encode message keys
@@ -291,6 +265,40 @@ it will be encoded first against the schema). So your payload would look like
 Remember that if you're using `schema`, the `payload_key` must be a *hash*,
 not a plain value.

+## Instrumentation
+
+Deimos will send events through the [Karafka instrumentation monitor](https://karafka.io/docs/Monitoring-and-Logging/#subscribing-to-the-instrumentation-events).
+You can listen to these notifications e.g. as follows:
+
+```ruby
+Karafka.monitor.subscribe('deimos.encode_message') do |event|
+  # event is a Karafka Event. You can use [] to access keys in the payload.
+  messages = event[:messages]
+end
+```
+
+The following events are produced (in addition to the ones already
+produced by Phobos and RubyKafka):
+
+* `deimos.encode_message` - sent when messages are being schema-encoded.
+  * producer - the class that produced the message
+  * topic
+  * payloads - the unencoded payloads
+* `outbox.produce` - sent when the outbox producer sends messages for the
+  outbox backend. Messages that are too large will be caught with this
+  notification - they will be deleted from the table and this notification
+  will be fired with an exception object.
+  * topic
+  * exception_object
+  * messages - the batch of messages (in the form of `Deimos::KafkaMessage`s)
+    that failed - this should have only a single message in the batch.
+* `deimos.batch_consumption.valid_records` - sent when the consumer has successfully upserted records. Limited by `max_db_batch_size`.
+  * consumer: class of the consumer that upserted these records
+  * records: Records upserted into the DB (of type `ActiveRecord::Base`)
+* `deimos.batch_consumption.invalid_records` - sent when the consumer has rejected records returned from `filtered_records`. Limited by `max_db_batch_size`.
+  * consumer: class of the consumer that rejected these records
+  * records: Rejected records (of type `Deimos::ActiveRecordConsume::BatchRecord`)
+
|
294
302
|
# Consumers
|
295
303
|
|
296
304
|
Here is a sample consumer:
|
@@ -305,18 +313,16 @@ class MyConsumer < Deimos::Consumer
     exception.is_a?(MyBadError)
   end

-  def
-  #
-
-
-
-  # if you need to access it separately from the payload, you can use
-  # metadata[:key]
+  def consume_batch
+    # messages is a Karafka Messages - see https://github.com/karafka/karafka/blob/master/lib/karafka/messages/messages.rb
+    messages.payloads.each do |payload|
+      puts payload
+    end
   end
 end
 ```

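Because `messages` is a regular Karafka batch, each element carries its decoded key alongside the payload. A minimal sketch of the idiom that replaces the old `payloads.zip(metadata[:keys])` pattern (`do_something` is a placeholder):

```ruby
class MyConsumer < Deimos::Consumer
  def consume_batch
    # Each element is a Karafka::Messages::Message, so key and payload
    # travel together instead of living in separate arrays.
    messages.each do |message|
      do_something(message.key, message.payload)
    end
  end
end
```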
-
+## Fatal Errors

 The recommended configuration is for consumers *not* to raise errors
 they encounter while consuming messages. Errors can come from
@@ -330,95 +336,31 @@ can use instrumentation to handle errors you receive. You can also
 specify "fatal errors" either via global configuration (`config.fatal_error`)
 or via overriding a method on an individual consumer (`def fatal_error`).

-
+## Per-Message Consumption

-Instead of consuming messages
-
-
-other consumers in regards to key and payload decoding, etc.
+Instead of consuming messages in a batch, consumers can process one message at a time. This is
+helpful if the logic involved in each message is independent and you don't want to treat the whole
+batch as a single unit.

-To enable
-consumer is set to `
+To enable message consumption, ensure that the `each_message` property of your
+consumer is set to `true`.

-
+Per-message consumers will invoke the `consume_message` method instead of `consume_batch`
 as in this example:

 ```ruby
-class
-
-  def consume_batch(payloads, metadata)
-    # payloads is an array of schema-decoded hashes.
-    # metadata is a hash that contains information like :keys, :topic,
-    # and :first_offset.
-    # Keys are automatically decoded and available as an array with
-    # the same cardinality as the payloads. If you need to iterate
-    # over payloads and keys together, you can use something like this:
-
-    payloads.zip(metadata[:keys]) do |_payload, _key|
-      # Do something
-    end
-  end
-end
-```
-#### Saving data to Multiple Database tables
-
-> This feature is implemented and tested with MySQL database ONLY.
-
-Sometimes, the Kafka message needs to be saved to multiple database tables. For example, if a `User` topic provides you metadata and profile image for users, we might want to save it to multiple tables: `User` and `Image`.
-
-- Return associations as keys in `record_attributes` to enable this feature.
-- The `bulk_import_id_column` config allows you to specify column_name on `record_class` which can be used to retrieve IDs after save. Defaults to `bulk_import_id`. This config is *required* if you have associations but optional if you do not.
-
-You must override the `record_attributes` (and optionally `column` and `key_columns`) methods on your consumer class for this feature to work.
-- `record_attributes` - This method is required to map Kafka messages to ActiveRecord model objects.
-- `columns(klass)` - Should return an array of column names that should be used by ActiveRecord klass during SQL insert operation.
-- `key_columns(messages, klass)` - Should return an array of column name(s) that makes a row unique.
-```ruby
-class User < ApplicationRecord
-  has_many :images
-end
-
-class MyBatchConsumer < Deimos::ActiveRecordConsumer
-
-  record_class User
+class MyMessageConsumer < Deimos::Consumer

-  def
-
-
-      images: [
-        {
-          attr1: payload.image_url
-        },
-        {
-          attr2: payload.other_image_url
-        }
-      ]
-    }
-  end
-
-  def key_columns(klass)
-    case klass
-    when User
-      nil # use default
-    when Image
-      ["image_url", "image_name"]
-    end
-  end
-
-  def columns(klass)
-    case klass
-    when User
-      nil # use default
-    when Image
-      klass.columns.map(&:name) - [:created_at, :updated_at, :id]
-    end
+  def consume_message(message)
+    # message is a Karafka Message object
+    puts message.payload
   end
 end
 ```

 # Rails Integration

-
+## <a name="rails-producing">Producing</a>

 Deimos comes with an ActiveRecordProducer. This takes a single or
 list of ActiveRecord objects or hashes and maps it to the given schema.
@@ -439,7 +381,7 @@ class MyProducer < Deimos::ActiveRecordProducer

   # Optionally override this if you want the message to be
   # sent even if fields that aren't in the schema are changed.
-  def watched_attributes
+  def watched_attributes(_record)
     super + ['a_non_schema_attribute']
   end

@@ -458,28 +400,7 @@ MyProducer.send_events([Widget.new(foo: 1), Widget.new(foo: 2)])
 MyProducer.send_events([{foo: 1}, {foo: 2}])
 ```

-
-
-You can disable producers globally or inside a block. Globally:
-```ruby
-Deimos.config.producers.disabled = true
-```
-
-For the duration of a block:
-```ruby
-Deimos.disable_producers do
-  # code goes here
-end
-```
-
-For specific producers only:
-```ruby
-Deimos.disable_producers(Producer1, Producer2) do
-  # code goes here
-end
-```
-
-#### KafkaSource
+### KafkaSource

 There is a special mixin which can be added to any ActiveRecord class. This
 will create callbacks which will automatically send messages to Kafka whenever
@@ -491,7 +412,7 @@ will not fire if using pure SQL or Arel.
 Note that these messages are sent *during the transaction*, i.e. using
 `after_create`, `after_update` and `after_destroy`. If there are
 questions of consistency between the database and Kafka, it is recommended
-to switch to using the
+to switch to using the outbox backend (see next section) to avoid these issues.

 When the object is destroyed, an empty payload with a payload key consisting of
 the record's primary key is sent to the producer. If your topic's key is
@@ -525,120 +446,7 @@ class Widget < ActiveRecord::Base
 end
 ```

-
-
-Deimos comes with a mixin for `ActionController` which automatically encodes and decodes schema
-payloads. There are some advantages to encoding your data in e.g. Avro rather than straight JSON,
-particularly if your service is talking to another backend service rather than the front-end
-browser:
-
-* It enforces a contract between services. Solutions like [OpenAPI](https://swagger.io/specification/)
-  do this as well, but in order for the client to know the contract, usually some kind of code
-  generation has to happen. Using schemas ensures both sides know the contract without having to change code.
-  In addition, OpenAPI is now a huge and confusing format, and using simpler schema formats
-  can be beneficial.
-* Using Avro or Protobuf ensures both forwards and backwards compatibility,
-  which reduces the need for versioning since both sides can simply ignore fields they aren't aware
-  of.
-* Encoding and decoding using Avro or Protobuf is generally faster than straight JSON, and
-  results in smaller payloads and therefore less network traffic.
-
-To use the mixin, add the following to your `WhateverController`:
-
-```ruby
-class WhateverController < ApplicationController
-  include Deimos::Utils::SchemaControllerMixin
-
-  request_namespace 'my.namespace.requests'
-  response_namespace 'my.namespace.responses'
-
-  # Add a "schemas" line for all routes that should encode/decode schemas.
-  # Default is to match the schema name to the route name.
-  schemas :index
-  # will look for: my.namespace.requests.Index.avsc
-  #                my.namespace.responses.Index.avsc
-
-  # Can use mapping to change the schema but keep the namespaces,
-  # i.e. use the same schema name across the two namespaces
-  schemas create: 'CreateTopic'
-  # will look for: my.namespace.requests.CreateTopic.avsc
-  #                my.namespace.responses.CreateTopic.avsc
-
-  # If all routes use the default, you can add them all at once
-  schemas :index, :show, :update
-
-  # Different schemas can be specified as well
-  schemas :index, :show, request: 'IndexRequest', response: 'IndexResponse'
-
-  # To access the encoded data, use the `payload` helper method, and to render it back,
-  # use the `render_schema` method.
-
-  def index
-    response = { 'response_id' => payload['request_id'] + 'hi mom' }
-    render_schema(response)
-  end
-end
-```
-
-To make use of this feature, your requests and responses need to have the correct content type.
-For Avro content, this is the `avro/binary` content type.
-
-# Database Backend
-
-Deimos provides a way to allow Kafka messages to be created inside a
-database transaction, and send them asynchronously. This ensures that your
-database transactions and Kafka messages related to those transactions
-are always in sync. Essentially, it separates the message logic so that a
-message is first validated, encoded, and saved in the database, and then sent
-on a separate thread. This means if you have to roll back your transaction,
-it also rolls back your Kafka messages.
-
-This is also known as the [Transactional Outbox pattern](https://microservices.io/patterns/data/transactional-outbox.html).
-
-To enable this, first generate the migration to create the relevant tables:
-
-    rails g deimos:db_backend
-
-You can now set the following configuration:
-
-    config.producers.backend = :db
-
-This will save all your Kafka messages to the `kafka_messages` table instead
-of immediately sending to Kafka. Now, you just need to call
-
-    Deimos.start_db_backend!
-
-You can do this inside a thread or fork block.
-If using Rails, you can use a Rake task to do this:
-
-    rails deimos:db_producer
-
-This creates one or more threads dedicated to scanning and publishing these
-messages by using the `kafka_topics` table in a manner similar to
-[Delayed Job](https://github.com/collectiveidea/delayed_job).
-You can pass in a number of threads to the method:
-
-    Deimos.start_db_backend!(thread_count: 2) # OR
-    THREAD_COUNT=5 rails deimos:db_producer
-
-If you want to force a message to send immediately, just call the `publish_list`
-method with `force_send: true`. You can also pass `force_send` into any of the
-other methods that publish events, like `send_event` in `ActiveRecordProducer`.
-
-A couple of gotchas when using this feature:
-* This may result in high throughput depending on your scale. If you're
-  using Rails < 5.1, you should add a migration to change the `id` column
-  to `BIGINT`. Rails >= 5.1 sets it to BIGINT by default.
-* This table is high throughput but should generally be empty. Make sure
-  you optimize/vacuum this table regularly to reclaim the disk space.
-* Currently, threads allow you to scale the *number* of topics but not
-  a single large topic with lots of messages. There is an [issue](https://github.com/flipp-oss/deimos/issues/23)
-  opened that would help with this case.
-
-For more information on how the database backend works and why it was
-implemented, please see [Database Backends](docs/DATABASE_BACKEND.md).
-
-### Consuming
+## <a name="rails-consuming">Consuming</a>

 Deimos provides an ActiveRecordConsumer which will take a payload
 and automatically save it to a provided model. It will take the intersection
@@ -702,42 +510,19 @@ class MyConsumer < Deimos::ActiveRecordConsumer
 end
 ```

-
-
-Deimos provides a generator that takes an existing schema and generates a
-database table based on its fields. By default, any complex sub-types (such as
-records or arrays) are turned into JSON (if supported) or string columns.
-
-Before running this migration, you must first copy the schema into your repo
-in the correct path (in the example above, you would need to have a file
-`{SCHEMA_ROOT}/com/my-namespace/MySchema.avsc`).
-
-To generate a model and migration, run the following:
-
-    rails g deimos:active_record TABLE_NAME FULL_SCHEMA_NAME
-
-Example:
-
-    rails g deimos:active_record my_table com.my-namespace.MySchema
-
-...would generate:
-
-    db/migrate/1234_create_my_table.rb
-    app/models/my_table.rb
-
-#### Batch Consumers
+### Batch Consuming

 Deimos also provides a batch consumption mode for `ActiveRecordConsumer` which
 processes groups of messages at once using the ActiveRecord backend.

-Batch ActiveRecord consumers make use of 
+Batch ActiveRecord consumers make use of
 [activerecord-import](https://github.com/zdennis/activerecord-import) to insert
 or update multiple records in bulk SQL statements. This reduces processing
 time at the cost of skipping ActiveRecord callbacks for individual records.
 Deleted records (tombstones) are grouped into `delete_all` calls and thus also
 skip `destroy` callbacks.

-Batch consumption is used when the `
+Batch consumption is used when the `each_message` setting for your consumer is set to `false` (the default).

 **Note**: Currently, batch consumption only supports primary keys as identifiers out of the box. See
 [the specs](spec/active_record_batch_consumer_spec.rb) for an example of how to use compound keys.
@@ -750,8 +535,6 @@ A sample batch consumer would look as follows:

 ```ruby
 class MyConsumer < Deimos::ActiveRecordConsumer
-  schema 'MySchema'
-  key_config field: 'my_field'
   record_class Widget

   # Controls whether the batch is compacted before consuming.
@@ -760,7 +543,7 @@ class MyConsumer < Deimos::ActiveRecordConsumer
   # If false, messages will be grouped into "slices" of independent keys
   # and each slice will be imported separately.
   #
-
+  compacted false


   # Optional override of the default behavior, which is to call `delete_all`
@@ -778,7 +561,141 @@ class MyConsumer < Deimos::ActiveRecordConsumer
   end
 end
 ```

-
+### Saving data to Multiple Database tables
+
+> This feature is implemented and tested with MySQL ONLY.
+
+Sometimes, a Kafka message needs to be saved to multiple database tables. For example, if a `User` topic provides you metadata and profile image for users, we might want to save it to multiple tables: `User` and `Image`.
+
+- Return associations as keys in `record_attributes` to enable this feature.
+- The `bulk_import_id_column` config allows you to specify column_name on `record_class` which can be used to retrieve IDs after save. Defaults to `bulk_import_id`. This config is *required* if you have associations but optional if you do not.
+
+You must override the `record_attributes` (and optionally `column` and `key_columns`) methods on your consumer class for this feature to work.
+- `record_attributes` - This method is required to map Kafka messages to ActiveRecord model objects.
+- `columns(klass)` - Should return an array of column names that should be used by ActiveRecord klass during SQL insert operation.
+- `key_columns(messages, klass)` - Should return an array of column name(s) that makes a row unique.
+
+```ruby
+class User < ApplicationRecord
+  has_many :images
+end
+
+class MyConsumer < Deimos::ActiveRecordConsumer
+
+  record_class User
+
+  def record_attributes(payload, _key)
+    {
+      first_name: payload.first_name,
+      images: [
+        {
+          attr1: payload.image_url
+        },
+        {
+          attr2: payload.other_image_url
+        }
+      ]
+    }
+  end
+
+  def key_columns(klass)
+    case klass
+    when User
+      nil # use default
+    when Image
+      ["image_url", "image_name"]
+    end
+  end
+
+  def columns(klass)
+    case klass
+    when User
+      nil # use default
+    when Image
+      klass.columns.map(&:name) - [:created_at, :updated_at, :id]
+    end
+  end
+end
+```
+
+## Generating Tables and Models
+
+Deimos provides a generator that takes an existing schema and generates a
+database table based on its fields. By default, any complex sub-types (such as
+records or arrays) are turned into JSON (if supported) or string columns.
+
+Before running this migration, you must first copy the schema into your repo
+in the correct path (in the example above, you would need to have a file
+`{SCHEMA_ROOT}/com/my-namespace/MySchema.avsc`).
+
+To generate a model and migration, run the following:
+
+    rails g deimos:active_record TABLE_NAME FULL_SCHEMA_NAME
+
+Example:
+
+    rails g deimos:active_record my_table com.my-namespace.MySchema
+
+...would generate:
+
+    db/migrate/1234_create_my_table.rb
+    app/models/my_table.rb
+
+# Outbox Backend
+
+Deimos provides a way to allow Kafka messages to be created inside a
+database transaction, and send them asynchronously. This ensures that your
+database transactions and Kafka messages related to those transactions
+are always in sync. Essentially, it separates the message logic so that a
+message is first validated, encoded, and saved in the database, and then sent
+on a separate thread. This means if you have to roll back your transaction,
+it also rolls back your Kafka messages.
+
+This is also known as the [Transactional Outbox pattern](https://microservices.io/patterns/data/transactional-outbox.html).
+
+To enable this, first generate the migration to create the relevant tables:
+
+    rails g deimos:outbox
+
+You can now set the following configuration:
+
+    config.producers.backend = :outbox
+
+This will save all your Kafka messages to the `kafka_messages` table instead
+of immediately sending to Kafka. Now, you just need to call
+
+    Deimos.start_outbox_backend!
+
+You can do this inside a thread or fork block.
+If using Rails, you can use a Rake task to do this:
+
+    rails deimos:outbox
+
+This creates one or more threads dedicated to scanning and publishing these
+messages by using the `kafka_topics` table in a manner similar to
+[Delayed Job](https://github.com/collectiveidea/delayed_job).
+You can pass in a number of threads to the method:
+
+    Deimos.start_outbox_backend!(thread_count: 2) # OR
+    THREAD_COUNT=5 rails deimos:outbox
+
+If you want to force a message to send immediately, just call the `produce`
+method with `backend: kafka`.
+
+A couple of gotchas when using this feature:
+* This may result in high throughput depending on your scale. If you're
+  using Rails < 5.1, you should add a migration to change the `id` column
+  to `BIGINT`. Rails >= 5.1 sets it to BIGINT by default.
+* This table is high throughput but should generally be empty. Make sure
+  you optimize/vacuum this table regularly to reclaim the disk space.
+* Currently, threads allow you to scale the *number* of topics but not
+  a single large topic with lots of messages. There is an [issue](https://github.com/flipp-oss/deimos/issues/23)
+  opened that would help with this case.
+
+For more information on how the database backend works and why it was
+implemented, please see [Database Backends](docs/DATABASE_BACKEND.md).
+
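The `backend: kafka` override mentioned above is not spelled out in the diff; a hedged sketch of what a forced immediate send might look like follows (the exact argument form is an assumption based on the README's wording):

```ruby
# Assumption: produce accepts a backend override, letting a single message
# skip the kafka_messages outbox table and go straight to Kafka.
payload = {'some-key' => 1}
MyProducer.produce([{payload: payload}], backend: :kafka)
```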
+# Database Poller

 Another method of fetching updates from the database to Kafka is by polling
 the database (a process popularized by [Kafka Connect](https://docs.confluent.io/current/connect/index.html)).
@@ -825,7 +742,7 @@ define one additional method on the producer:

 ```ruby
 class MyProducer < Deimos::ActiveRecordProducer
-  ...
+  # ...
   def poll_query(time_from:, time_to:, column_name:, min_id:)
     # Default is to use the timestamp `column_name` to find all records
     # between time_from and time_to, or records where `updated_at` is equal to
@@ -834,6 +751,12 @@ class MyProducer < Deimos::ActiveRecordProducer
   # middle of a timestamp, we won't miss any records.
   # You can override or change this behavior if necessary.
   end
+
+  # You can define this method if you need to do some extra actions with
+  # the collection of elements you just sent to Kafka
+  def post_process(batch)
+    # write some code here
+  end
 end
 ```

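As a concrete (hypothetical) illustration of the `poll_query` hook above, an override can keep the default time-window logic from `super`, which returns an ActiveRecord relation, and simply narrow it:

```ruby
class MyProducer < Deimos::ActiveRecordProducer
  record_class Widget

  # Hypothetical override: publish only widgets flagged as published,
  # on top of the default time_from/time_to window that super builds.
  def poll_query(time_from:, time_to:, column_name:, min_id:)
    super.where(published: true)
  end
end
```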
@@ -847,25 +770,10 @@ have one process running at a time. If a particular poll takes longer than
 the poll interval (i.e. interval is set at 1 minute but it takes 75 seconds)
 the next poll will begin immediately following the first one completing.

-To Post-Process records that are sent to Kafka:
-
-You need to define one additional method in your producer class to post-process the messages sent to Kafka.
-
-```ruby
-class MyProducer < Deimos::ActiveRecordProducer
-  ...
-  def post_process(batch)
-    # If you need to do some extra actions with
-    # the collection of elements you just sent to Kafka
-    # write some code here
-  end
-end
-```
-
 Note that the poller will retry infinitely if it encounters a Kafka-related error such
 as a communication failure. For all other errors, it will retry once by default.

-
+## State-based pollers

 By default, pollers use timestamps and IDs to determine the records to publish. However, you can
 set a different mode whereby it will include all records that match your query, and when done,
@@ -884,7 +792,7 @@ db_poller do
 end
 ```

-
+# Running consumers

 Deimos includes a rake task. Once it's in your gemfile, just run

@@ -895,7 +803,7 @@ which can be useful if you want to figure out if you're inside the task
 as opposed to running your Rails server or console. E.g. you could start your
 DB backend only when your rake task is running.

-
+# Generated Schema Classes

 Deimos offers a way to generate classes from Avro schemas. These classes are documented
 with YARD to aid in IDE auto-complete, and will help to move errors closer to the code.
@@ -925,7 +833,7 @@ One additional configuration option indicates whether nested records should be g

 You can generate a tombstone message (with only a key and no value) by calling the `YourSchemaClass.tombstone(key)` method. If you're using a `:field` key config, you can pass in just the key scalar value. If using a key schema, you can pass it in as a hash or as another schema class.

-
+## Consumer

 The consumer interface uses the `decode_message` method to turn JSON hash into the Schemas
 generated Class and provides it to the `consume`/`consume_batch` methods for their use.
@@ -933,13 +841,13 @@ generated Class and provides it to the `consume`/`consume_batch` methods for the
 Examples of consumers would look like this:
 ```ruby
 class MyConsumer < Deimos::Consumer
-  def
-  # Same method as
-  # rather than a hash.
+  def consume_message(message)
+    # Same method as before but message.payload is now an instance of Deimos::SchemaClass::Record
+    # rather than a hash.
     # You can interact with the schema class instance in the following way:
-    do_something(payload.test_id, payload.some_int)
+    do_something(message.payload.test_id, message.payload.some_int)
     # The original behaviour was as follows:
-    do_something(payload[:test_id], payload[:some_int])
+    do_something(message.payload[:test_id], message.payload[:some_int])
   end
 end
 ```
@@ -958,9 +866,10 @@ class MyActiveRecordConsumer < Deimos::ActiveRecordConsumer
 end
 ```

-
+## Producer
+
 Similarly to the consumer interface, the producer interface for using Schema Classes in your app
-relies on the `
+relies on the `produce` method to convert a _provided_ instance of a Schema Class
 into a hash that can be used freely by the Kafka client.

 Examples of producers would look like this:
@@ -976,8 +885,7 @@ class MyProducer < Deimos::Producer
       test_id: test_id,
       some_int: some_int
     )
-    self.
-    self.publish_list([message])
+    self.produce({payload: message})
   end
 end
 end
@@ -986,8 +894,9 @@ end
 ```ruby
 class MyActiveRecordProducer < Deimos::ActiveRecordProducer
   record_class Widget
-  # @param
+  # @param attributes [Hash]
   # @param _record [Widget]
+  # @return [Deimos::SchemaClass::Record]
   def self.generate_payload(attributes, _record)
     # This method converts your ActiveRecord into a Deimos::SchemaClass::Record. You will be able to use super
     # as an instance of Schemas::MySchema and set values that are not on your ActiveRecord schema.
@@ -1000,51 +909,26 @@ end

 # Metrics

-Deimos includes some metrics reporting out the box. It ships with DataDog support, but you can add custom metric providers as well.
+Deimos includes some metrics reporting out of the box. It adds to the existing [Karafka DataDog support](https://karafka.io/docs/Monitoring-and-Logging/#datadog-and-statsd-integration). It ships with DataDog support, but you can add custom metric providers as well.

 The following metrics are reported:
-* `
-  it's behind the tail of the partition (a gauge). This is only sent if
-  `config.consumers.report_lag` is set to true.
-* `handler` - a count of the number of messages received. Tagged
-  with the following:
-  * `topic:{topic_name}`
-  * `status:received`
-  * `status:success`
-  * `status:error`
-  * `time:consume` (histogram)
-    * Amount of time spent executing handler for each message
-  * Batch Consumers - report counts by number of batches
-    * `status:batch_received`
-    * `status:batch_success`
-    * `status:batch_error`
-    * `time:consume_batch` (histogram)
-      * Amount of time spent executing handler for entire batch
-  * `time:time_delayed` (histogram)
-    * Indicates the amount of time between the `timestamp` property of each
-      payload (if present) and the time that the consumer started processing
-      the message.
-* `publish` - a count of the number of messages received. Tagged
-  with `topic:{topic_name}`
-* `publish_error` - a count of the number of messages which failed
-  to publish. Tagged with `topic:{topic_name}`
-* `
+* `deimos.pending_db_messages_max_wait` - the number of seconds which the
   oldest KafkaMessage in the database has been waiting for, for use
   with the database backend. Tagged with the topic that is waiting.
   Will send a value of 0 with no topics tagged if there are no messages
   waiting.
-* `
+* `deimos.outbox.publish` - the number of messages inserted into the database
   for publishing. Tagged with `topic:{topic_name}`
-* `
+* `deimos.outbox.process` - the number of DB messages processed. Note that this
   is *not* the same as the number of messages *published* if those messages
   are compacted. Tagged with `topic:{topic_name}`

-
+## Configuring Metrics Providers

 See the `metrics` field under [Configuration](#configuration).
 View all available Metrics Providers [here](lib/deimos/metrics)

-
+## Custom Metrics Providers

 Using the above configuration, it is possible to pass in any generic Metrics
 Provider class as long as it exposes the methods and definitions expected by
@@ -1059,17 +943,18 @@ Also see [deimos.rb](lib/deimos.rb) under `Configure metrics` to see how the met
 # Tracing

 Deimos also includes some tracing for kafka consumers. It ships with
-DataDog support, but you can add custom tracing providers as well.
+DataDog support, but you can add custom tracing providers as well. (It does not use the built-in Karafka
+tracers so that it can support per-message tracing, which Karafka does not provide for.)

 Trace spans are used for when incoming messages are schema-decoded, and a
 separate span for message consume logic.

-
+## Configuring Tracing Providers

 See the `tracing` field under [Configuration](#configuration).
 View all available Tracing Providers [here](lib/deimos/tracing)

-
+## Custom Tracing Providers

 Using the above configuration, it is possible to pass in any generic Tracing
 Provider class as long as it exposes the methods and definitions expected by
@@ -1083,7 +968,9 @@ Also see [deimos.rb](lib/deimos.rb) under `Configure tracing` to see how the tra

 # Testing

-Deimos comes with a test helper class which provides useful methods for testing consumers.
+Deimos comes with a test helper class which provides useful methods for testing consumers. This is built on top of
+Karafka's [testing library](https://karafka.io/docs/Testing/) and is primarily helpful because it can decode
+the sent messages for comparison (Karafka only decodes the messages once they have been consumed).

 In `spec_helper.rb`:
 ```ruby
@@ -1097,55 +984,34 @@ end
 ```ruby
 # The following can be added to an rspec file so that each unit
 # test can have the same settings every time it is run
-
-  Deimos::TestHelpers.unit_test!
-  example.run
-  Deimos.config.reset!
-end
-
-# Similarly you can use the Kafka test helper
-around(:each) do |example|
-  Deimos::TestHelpers.kafka_test!
-  example.run
-  Deimos.config.reset!
-end
-
-# Kakfa test helper using schema registry
-around(:each) do |example|
-  Deimos::TestHelpers.full_integration_test!
-  example.run
+after(:each) do
   Deimos.config.reset!
+  Deimos.config.schema.backend = :avro_validation
 end
 ```

-With the help of these helper methods,
-This also prevents Deimos setting changes from leaking in to other examples.
-
-This does not take away the ability to configure Deimos manually in individual examples. Deimos can still be configured like so:
+With the help of these helper methods, RSpec examples can be written without having to tinker with Deimos settings.
+This also prevents Deimos setting changes from leaking in to other examples. You can make these changes on an individual test level and ensure that it resets back to where it needs to go:
 ```ruby
 it 'should not fail this random test' do

   Deimos.configure do |config|
     config.consumers.fatal_error = proc { true }
-    config.consumers.reraise_errors = false
   end
   ...
   expect(some_object).to be_truthy
-  ...
-  Deimos.config.reset!
 end
 ```
-If you are using one of the test helpers in an `around(:each)` block and want to override few settings for one example,
-you can do it like in the example shown above. These settings would only apply to that specific example and the Deimos config should
-reset once the example has finished running.

 ## Test Usage

-
+You can use `karafka.produce()` and `consumer.consume` in your tests without having to go through
+Deimos TestHelpers. However, there are some useful abilities that Deimos gives you:
+
 ```ruby
-# Pass a consumer class (not instance) to validate a payload against it.
-#
-#
+# Pass a consumer class (not instance) to validate a payload against it. This takes either a class
+# or a topic (Karafka only supports topics in its test helpers). This will validate the payload
+# and execute the consumer logic.
 test_consume_message(MyConsumer,
                      { 'some-payload' => 'some-value' }) do |payload, metadata|
   # do some expectation handling here
@@ -1158,15 +1024,6 @@ test_consume_message('my-topic-name',
   # do some expectation handling here
 end

-# Alternatively, you can test the actual consume logic:
-test_consume_message(MyConsumer,
-                     { 'some-payload' => 'some-value' },
-                     call_original: true)
-
-# Test that a given payload is invalid against the schema:
-test_consume_invalid_message(MyConsumer,
-                             { 'some-invalid-payload' => 'some-value' })
-
 # For batch consumers, there are similar methods such as:
 test_consume_batch(MyBatchConsumer,
                    [{ 'some-payload' => 'some-value' },
@@ -1181,7 +1038,7 @@ end
 expect(topic_name).to have_sent(payload, key=nil, partition_key=nil, headers=nil)

 # Inspect sent messages
-message = Deimos::
+message = Deimos::TestHelpers.sent_messages[0]
 expect(message).to eq({
   message: {'some-key' => 'some-value'},
   topic: 'my-topic',
@@ -1190,75 +1047,7 @@ expect(message).to eq({
 })
 ```

-
-
-There is also a helper method that will let you test if an existing schema
-would be compatible with a new version of it. You can use this in your
-Ruby console but it would likely not be part of your RSpec test:
-
-```ruby
-require 'deimos/test_helpers'
-# Can pass a file path, a string or a hash into this:
-Deimos::TestHelpers.schemas_compatible?(schema1, schema2)
-```
-
-You can use the `InlineConsumer` class to help with integration testing,
-with a full external Kafka running.
-
-If you have a consumer you want to test against messages in a Kafka topic,
-use the `consume` method:
-```ruby
-Deimos::Utils::InlineConsumer.consume(
-  topic: 'my-topic',
-  frk_consumer: MyConsumerClass,
-  num_messages: 5
-)
-```
-
-This is a _synchronous_ call which will run the consumer against the
-last 5 messages in the topic. You can set `num_messages` to a number
-like `1_000_000` to always consume all the messages. Once the last
-message is retrieved, the process will wait 1 second to make sure
-they're all done, then continue execution.
-
-If you just want to retrieve the contents of a topic, you can use
-the `get_messages_for` method:
-
-```ruby
-Deimos::Utils::InlineConsumer.get_messages_for(
-  topic: 'my-topic',
-  schema: 'my-schema',
-  namespace: 'my.namespace',
-  key_config: { field: 'id' },
-  num_messages: 5
-)
-```
-
-This will run the process and simply return the last 5 messages on the
-topic, as hashes, once it's done. The format of the messages will simply
-be
-```ruby
-{
-  payload: { key: value }, # payload hash here
-  key: "some_value" # key value or hash here
-}
-```
-
-Both payload and key will be schema-decoded as necessary according to the
-key config.
-
-You can also just pass an existing producer or consumer class into the method,
-and it will extract the necessary configuration from it:
-
-```ruby
-Deimos::Utils::InlineConsumer.get_messages_for(
-  topic: 'my-topic',
-  config_class: MyProducerClass,
-  num_messages: 5
-)
-```
-
-## Utilities
+# Utilities

 You can use your configured schema backend directly if you want to
 encode and decode payloads outside of the context of sending messages.
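The `backend` object used in the snippet below can be obtained from Deimos directly; a minimal sketch (the schema and namespace names are the README's running examples, and `my_payload` is its placeholder hash):

```ruby
# Sketch: fetch the configured schema backend for a given schema/namespace.
backend = Deimos.schema_backend(schema: 'MySchema', namespace: 'com.my-namespace')
encoded = backend.encode(my_payload)  # schema-encode a hash
decoded = backend.decode(encoded)     # decode back to a hash
```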
@@ -1272,14 +1061,14 @@ backend.validate(my_payload) # throws an error if not valid
 fields = backend.schema_fields # list of fields defined in the schema
 ```

-You can also do an even
+You can also do an even more concise encode/decode:

 ```ruby
 encoded = Deimos.encode(schema: 'MySchema', namespace: 'com.my-namespace', payload: my_payload)
 decoded = Deimos.decode(schema: 'MySchema', namespace: 'com.my-namespace', payload: my_encoded_payload)
 ```

-
+# Contributing

 Bug reports and pull requests are welcome on GitHub at https://github.com/flipp-oss/deimos .

@@ -1289,15 +1078,15 @@ You can/should re-generate RBS types when methods or classes change by running t
 rbs collection update
 bundle exec sord --hide-private --no-sord-comments sig/defs.rbs --tags 'override:Override'

-
+## Linting

 Deimos uses Rubocop to lint the code. Please run Rubocop on your code
 before submitting a PR. The GitHub CI will also run rubocop on your pull request.

 ---
-<p
+<p style="text-align: center">
   Sponsored by<br/>
   <a href="https://corp.flipp.com/">
-    <img src="support/flipp-logo.png" title="Flipp logo" style="border:none;"/>
+    <img src="support/flipp-logo.png" title="Flipp logo" style="border:none;width:396px;display:block;margin-left:auto;margin-right:auto;" alt="Flipp logo"/>
  </a>
 </p>