deimos-kafka 1.0.0.pre.beta15
- checksums.yaml +7 -0
- data/.circleci/config.yml +74 -0
- data/.gitignore +41 -0
- data/.gitmodules +0 -0
- data/.rspec +1 -0
- data/.rubocop.yml +321 -0
- data/.ruby-gemset +1 -0
- data/.ruby-version +1 -0
- data/CHANGELOG.md +9 -0
- data/CODE_OF_CONDUCT.md +77 -0
- data/Dockerfile +23 -0
- data/Gemfile +6 -0
- data/Gemfile.lock +165 -0
- data/Guardfile +22 -0
- data/LICENSE.md +195 -0
- data/README.md +742 -0
- data/Rakefile +13 -0
- data/bin/deimos +4 -0
- data/deimos-kafka.gemspec +42 -0
- data/docker-compose.yml +71 -0
- data/docs/DATABASE_BACKEND.md +147 -0
- data/docs/PULL_REQUEST_TEMPLATE.md +34 -0
- data/lib/deimos.rb +134 -0
- data/lib/deimos/active_record_consumer.rb +81 -0
- data/lib/deimos/active_record_producer.rb +64 -0
- data/lib/deimos/avro_data_coder.rb +89 -0
- data/lib/deimos/avro_data_decoder.rb +36 -0
- data/lib/deimos/avro_data_encoder.rb +51 -0
- data/lib/deimos/backends/db.rb +27 -0
- data/lib/deimos/backends/kafka.rb +27 -0
- data/lib/deimos/backends/kafka_async.rb +27 -0
- data/lib/deimos/configuration.rb +88 -0
- data/lib/deimos/consumer.rb +164 -0
- data/lib/deimos/instrumentation.rb +71 -0
- data/lib/deimos/kafka_message.rb +27 -0
- data/lib/deimos/kafka_source.rb +126 -0
- data/lib/deimos/kafka_topic_info.rb +79 -0
- data/lib/deimos/message.rb +74 -0
- data/lib/deimos/metrics/datadog.rb +47 -0
- data/lib/deimos/metrics/mock.rb +39 -0
- data/lib/deimos/metrics/provider.rb +38 -0
- data/lib/deimos/monkey_patches/phobos_cli.rb +35 -0
- data/lib/deimos/monkey_patches/phobos_producer.rb +51 -0
- data/lib/deimos/monkey_patches/ruby_kafka_heartbeat.rb +85 -0
- data/lib/deimos/monkey_patches/schema_store.rb +19 -0
- data/lib/deimos/producer.rb +218 -0
- data/lib/deimos/publish_backend.rb +30 -0
- data/lib/deimos/railtie.rb +8 -0
- data/lib/deimos/schema_coercer.rb +108 -0
- data/lib/deimos/shared_config.rb +59 -0
- data/lib/deimos/test_helpers.rb +356 -0
- data/lib/deimos/tracing/datadog.rb +35 -0
- data/lib/deimos/tracing/mock.rb +40 -0
- data/lib/deimos/tracing/provider.rb +31 -0
- data/lib/deimos/utils/db_producer.rb +95 -0
- data/lib/deimos/utils/executor.rb +117 -0
- data/lib/deimos/utils/inline_consumer.rb +144 -0
- data/lib/deimos/utils/lag_reporter.rb +182 -0
- data/lib/deimos/utils/platform_schema_validation.rb +0 -0
- data/lib/deimos/utils/signal_handler.rb +68 -0
- data/lib/deimos/version.rb +5 -0
- data/lib/generators/deimos/db_backend/templates/migration +24 -0
- data/lib/generators/deimos/db_backend/templates/rails3_migration +30 -0
- data/lib/generators/deimos/db_backend_generator.rb +48 -0
- data/lib/tasks/deimos.rake +17 -0
- data/spec/active_record_consumer_spec.rb +81 -0
- data/spec/active_record_producer_spec.rb +107 -0
- data/spec/avro_data_decoder_spec.rb +18 -0
- data/spec/avro_data_encoder_spec.rb +37 -0
- data/spec/backends/db_spec.rb +35 -0
- data/spec/backends/kafka_async_spec.rb +11 -0
- data/spec/backends/kafka_spec.rb +11 -0
- data/spec/consumer_spec.rb +169 -0
- data/spec/deimos_spec.rb +117 -0
- data/spec/kafka_source_spec.rb +168 -0
- data/spec/kafka_topic_info_spec.rb +88 -0
- data/spec/phobos.bad_db.yml +73 -0
- data/spec/phobos.yml +73 -0
- data/spec/producer_spec.rb +397 -0
- data/spec/publish_backend_spec.rb +10 -0
- data/spec/schemas/com/my-namespace/MySchema-key.avsc +13 -0
- data/spec/schemas/com/my-namespace/MySchema.avsc +18 -0
- data/spec/schemas/com/my-namespace/MySchemaWithBooleans.avsc +18 -0
- data/spec/schemas/com/my-namespace/MySchemaWithDateTimes.avsc +33 -0
- data/spec/schemas/com/my-namespace/MySchemaWithId.avsc +28 -0
- data/spec/schemas/com/my-namespace/MySchemaWithUniqueId.avsc +32 -0
- data/spec/schemas/com/my-namespace/Widget.avsc +27 -0
- data/spec/schemas/com/my-namespace/WidgetTheSecond.avsc +27 -0
- data/spec/spec_helper.rb +207 -0
- data/spec/updateable_schema_store_spec.rb +36 -0
- data/spec/utils/db_producer_spec.rb +208 -0
- data/spec/utils/executor_spec.rb +42 -0
- data/spec/utils/lag_reporter_spec.rb +69 -0
- data/spec/utils/platform_schema_validation_spec.rb +0 -0
- data/spec/utils/signal_handler_spec.rb +16 -0
- data/support/deimos-solo.png +0 -0
- data/support/deimos-with-name-next.png +0 -0
- data/support/deimos-with-name.png +0 -0
- data/support/flipp-logo.png +0 -0
- metadata +452 -0

data/README.md (ADDED)

<p align="center">
  <img src="support/deimos-with-name.png" title="Deimos logo"/>
  <br/>
  <img src="https://img.shields.io/circleci/build/github/flipp-oss/deimos.svg" alt="CircleCI"/>
  <a href="https://badge.fury.io/rb/deimos"><img src="https://badge.fury.io/rb/deimos.svg" alt="Gem Version" height="18"></a>
  <img src="https://img.shields.io/codeclimate/maintainability/flipp-oss/deimos.svg"/>
</p>

A Ruby framework for marrying Kafka, Avro, and/or ActiveRecord, providing
a useful toolbox of goodies for Ruby-based Kafka development.
Built on Phobos and hence Ruby-Kafka.

<!--ts-->
* [Installation](#installation)
* [Versioning](#versioning)
* [Configuration](#configuration)
* [Producers](#producers)
  * [Auto-added Fields](#auto-added-fields)
  * [Coerced Values](#coerced-values)
  * [Instrumentation](#instrumentation)
  * [Kafka Message Keys](#kafka-message-keys)
* [Consumers](#consumers)
* [Rails Integration](#rails-integration)
* [Running Consumers](#running-consumers)
* [Metrics](#metrics)
* [Testing](#testing)
  * [Integration Test Helpers](#integration-test-helpers)
* [Contributing](#contributing)
<!--te-->

# Installation

Add this line to your application's Gemfile:
```ruby
gem 'deimos'
```

And then execute:

    $ bundle

Or install it yourself as:

    $ gem install deimos

# Versioning

We use a variant of semver for this gem. Any change in previous behavior
(something works differently or something old no longer works)
is denoted with a bump in the minor version (0.4 -> 0.5). Patch versions
are for bugfixes or new functionality which does not affect existing code. You
should be locking your Gemfile to the minor version:

```ruby
gem 'deimos', '~> 0.4'
```

# Configuration

To configure the gem, use `configure` in an initializer:

```ruby
Deimos.configure do |config|
  # Configure logger
  config.logger = Rails.logger

  # Phobos settings
  config.phobos_config_file = 'config/phobos.yml'
  config.schema_registry_url = 'https://my-schema-registry.com'
  config.seed_broker = 'my.seed.broker.0.net:9093,my.seed.broker.1.net:9093'
  config.ssl_enabled = ENV['KAFKA_SSL_ENABLED']
  if config.ssl_enabled
    config.ssl_ca_cert = File.read(ENV['SSL_CA_CERT'])
    config.ssl_client_cert = File.read(ENV['SSL_CLIENT_CERT'])
    config.ssl_client_cert_key = File.read(ENV['SSL_CLIENT_CERT_KEY'])
  end

  # Other settings

  # Local path to find schemas, for publishing and testing consumers
  config.schema_path = "#{Rails.root}/app/schemas"

  # Default namespace for producers to use
  config.producer_schema_namespace = 'com.deimos.my_app'

  # Prefix for all topics, e.g. environment name
  config.producer_topic_prefix = 'myenv.'

  # Disable all producers - e.g. when doing heavy data lifting and events
  # would be fired a different way
  config.disable_producers = true

  # Default behavior is to swallow uncaught exceptions and log to DataDog.
  # Set this to true to instead raise all errors. Note that raising an error
  # will ensure that the message cannot be processed - if there is a bad
  # message which will always raise that error, your consumer will not
  # be able to proceed past it and will be stuck forever until you fix
  # your code.
  config.reraise_consumer_errors = true

  # Set to true to send consumer lag metrics
  config.report_lag = %w(production staging).include?(Rails.env)

  # Change the default publish backend. See Database Backend, below.
  config.publish_backend = :db

  # If the DB backend is being used, specify the number of threads to create
  # to process the DB messages.
  config.num_producer_threads = 1

  # Configure the metrics provider (see below).
  config.metrics = Deimos::Metrics::Mock.new({ tags: %w(env:prod my_tag:another_1) })

  # Configure the tracing provider (see below).
  config.tracer = Deimos::Tracing::Mock.new({ service_name: 'my-service' })
end
```

Note that the configuration options from Phobos (seed_broker and the SSL settings)
can be removed from `phobos.yml` since Deimos will load them instead.

# Producers

Producers will look like this:

```ruby
class MyProducer < Deimos::Producer

  # Can override default namespace.
  namespace 'com.deimos.my-app-special'
  topic 'MyApp.MyTopic'
  schema 'MySchema'
  key_config field: 'my_field' # see Kafka Message Keys, below

  # If config.schema_path is app/schemas, assumes there is a file in
  # app/schemas/com/deimos/my-app-special/MySchema.avsc

  class << self

    # Optionally override the default partition key logic, which is to use
    # the payload key if it's provided, and nil if there is no payload key.
    def partition_key(payload)
      payload[:my_id]
    end

    # You can call publish / publish_list directly, or create new methods
    # wrapping them.

    def send_some_message(an_object)
      payload = {
        'some-key' => an_object.foo,
        'some-key2' => an_object.bar
      }
      # You can also publish an array with self.publish_list(payloads)
      self.publish(payload)
    end

  end

end
```

### Auto-added Fields

If your schema has a field called `message_id`, and the payload you give
your producer doesn't have this set, Deimos will auto-generate
a message ID. It is highly recommended to give all schemas a message_id
so that you can track each sent message via logging.

You can also provide a field in your schema called `timestamp` which will be
auto-filled with the current timestamp if not provided.
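
For example, a schema that reserves both auto-filled fields might look like
this (a sketch - the field types shown are assumptions, not a requirement of
Deimos):

```javascript
{
  "namespace": "com.deimos.my_app",
  "name": "MySchema",
  "type": "record",
  "fields": [
    { "name": "message_id", "type": "string", "doc": "Auto-generated if not provided" },
    { "name": "timestamp", "type": "string", "doc": "Auto-filled if not provided" },
    { "name": "test_id", "type": "string", "doc": "test string" }
  ]
}
```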

### Coerced Values

Deimos will do some simple coercions if you pass values that don't
exactly match the schema.

* If the schema is :int or :long, any integer value, or a string representing
  an integer, will be parsed to Integer.
* If the schema is :float or :double, any numeric value, or a string
  representing a number, will be parsed to Float.
* If the schema is :string and the value implements its own `to_s` method,
  it will be called on the value. This includes hashes, symbols, numbers, dates, etc.
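
For example, assuming `some_int` is an `:int` field and `test_id` is a
`:string` field in the producer's schema, mismatched types are coerced
before encoding - a sketch:

```ruby
# 'some_int' arrives as a string and 'test_id' as a symbol...
MyProducer.publish('test_id' => :abc, 'some_int' => '123')
# ...and the encoded message contains { 'test_id' => 'abc', 'some_int' => 123 }
```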

### Instrumentation

Deimos will send ActiveSupport Notifications.
You can listen to these notifications e.g. as follows:

```ruby
Deimos.subscribe('produce') do |event|
  # event is an ActiveSupport::Notifications::Event
  # you can access time, duration, and transaction_id
  # payload contains :producer, :topic, and :payloads
  data = event.payload
end
```

The following events are also produced:

* `produce_error` - sent when an error occurs when producing a message.
  * producer - the class that produced the message
  * topic
  * exception_object
  * payloads - the unencoded payloads
* `encode_messages` - sent when messages are being Avro-encoded.
  * producer - the class that produced the message
  * topic
  * payloads - the unencoded payloads

Similarly:
```ruby
Deimos.subscribe('produce_error') do |event|
  data = event.payload
  Mail.send("Got an error #{data[:exception_object].message} on topic #{data[:topic]} with payloads #{data[:payloads]}")
end

Deimos.subscribe('encode_messages') do |event|
  # ...
end
```

### Kafka Message Keys

Topics representing events rather than domain data don't need keys. However,
best practice for domain messages is to Avro-encode message keys
with a separate Avro schema.

This is enforced by requiring producers to define a `key_config` directive. If
any message comes in with a key, the producer will error out if `key_config` is
not defined.

There are four possible configurations to use:

* `key_config none: true` - this indicates that you are not using keys at all
  for this topic. This *must* be set if your messages won't have keys - either
  all your messages in a topic need to have a key, or they all need to have
  no key. This is a good choice for events that aren't keyed - you can still
  set a partition key.
* `key_config plain: true` - this indicates that you are not using an Avro-encoded
  key. Use this for legacy topics - new topics should not use this setting.
* `key_config schema: 'MyKeySchema-key'` - this tells the producer to look for
  an existing key schema named `MyKeySchema-key` in the schema registry and to
  encode the key using it. Use this if you've already created a key schema
  or the key value does not exist in the existing payload
  (e.g. it is a compound or generated key).
* `key_config field: 'my_field'` - this tells the producer to look for a field
  named `my_field` in the value schema. When a payload comes in, the producer
  will take that value from the payload and insert it in a *dynamically generated*
  key schema. This key schema does not need to live in your codebase. Instead,
  it will be a subset of the value schema with only the key field in it.

If your value schema looks like this:
```javascript
{
  "namespace": "com.my-namespace",
  "name": "MySchema",
  "type": "record",
  "doc": "Test schema",
  "fields": [
    {
      "name": "test_id",
      "type": "string",
      "doc": "test string"
    },
    {
      "name": "some_int",
      "type": "int",
      "doc": "test int"
    }
  ]
}
```

...setting `key_config field: 'test_id'` will create a key schema that looks
like this:

```javascript
{
  "namespace": "com.my-namespace",
  "name": "MySchema-key",
  "type": "record",
  "doc": "Key for com.my-namespace.MySchema",
  "fields": [
    {
      "name": "test_id",
      "type": "string",
      "doc": "test string"
    }
  ]
}
```

If you publish a payload `{ "test_id" => "123", "some_int" => 123 }`, this
will be turned into a key that looks like `{ "test_id" => "123" }` and encoded
via Avro before being sent to Kafka.

If you are using `plain` or `schema` as your config, you will need to add a
special `payload_key` key to your payload hash. This will be extracted and
used as the key (for `plain`, it will be used directly, while for `schema`
it will be encoded first against the schema). So your payload would look like
`{ "test_id" => "123", "some_int" => 123, payload_key: "some_other_key" }`.
Remember that if you're using `schema`, the `payload_key` must be a *hash*,
not a plain value.
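
Putting the key configurations together - a sketch, where each producer name
is hypothetical and stands for a producer defined with the corresponding
`key_config`:

```ruby
# key_config plain: true - the payload_key is used directly as the key:
MyPlainKeyProducer.publish('test_id' => '123', 'some_int' => 123,
                           payload_key: 'some_other_key')

# key_config schema: 'MyKeySchema-key' - the payload_key must be a hash,
# which is then Avro-encoded against MyKeySchema-key:
MySchemaKeyProducer.publish('test_id' => '123', 'some_int' => 123,
                            payload_key: { 'id' => 'some_other_key' })
```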

# Consumers

Here is a sample consumer:

```ruby
class MyConsumer < Deimos::Consumer

  # These are optional but strongly recommended for testing purposes; this
  # will validate against a local schema file used as the reader schema,
  # and will also allow you to write tests against this schema.
  # This is recommended since it ensures you are always getting the values
  # you expect.
  schema 'MySchema'
  namespace 'com.my-namespace'
  # This directive works identically to the producer - see Kafka Message Keys, above.
  # This only affects the `decode_key` method. You need to provide
  # `schema` and `namespace`, above, for this to work.
  key_config field: :my_id

  def consume(payload, metadata)
    # Same method as Phobos consumers.
    # payload is an Avro-decoded hash.
    # metadata is a hash that contains information like :key and :topic. Both
    # key (if configured) and payload will be Avro-decoded.
  end
end
```

# Rails Integration

### Producing

Deimos comes with an ActiveRecordProducer. This takes a single ActiveRecord
object or hash, or a list of them, and maps them to the given schema.

An example would look like this:

```ruby
class MyProducer < Deimos::ActiveRecordProducer

  topic 'MyApp.MyTopic'
  schema 'MySchema'
  key_config field: 'my_field'

  # The record class should be set on every ActiveRecordProducer.
  # By default, if you give the producer a hash, it will re-fetch the
  # record itself for use in the payload generation. This can be useful
  # if you pass a list of hashes to the method e.g. as part of a
  # mass import operation. You can turn off this behavior (e.g. if you're just
  # using the default functionality and don't need to override it)
  # by setting `refetch` to false. This will avoid extra database fetches.
  record_class Widget, refetch: false

  class << self

    # Optionally override this if you want the message to be
    # sent even if fields that aren't in the schema are changed.
    def watched_attributes
      super + ['a_non_schema_attribute']
    end

    # If you want to just use the default functionality you can leave this
    # method out entirely. You only need to use it if you want to massage
    # the payload in some way, e.g. adding fields that don't exist on the
    # record itself.
    def generate_payload(attributes, record)
      super # generates payload based on the record and schema
    end

  end

end

# or `send_event` with just one Widget
MyProducer.send_events([Widget.new(foo: 1), Widget.new(foo: 2)])
MyProducer.send_events([{foo: 1}, {foo: 2}])
```

#### Disabling Producers

You can disable producers globally or inside a block. Globally:
```ruby
Deimos.config.disable_producers = true
```

For the duration of a block:
```ruby
Deimos.disable_producers do
  # code goes here
end
```

For specific producers only:
```ruby
Deimos.disable_producers(Producer1, Producer2) do
  # code goes here
end
```

#### KafkaSource

There is a special mixin which can be added to any ActiveRecord class. This
will create callbacks which automatically send messages to Kafka whenever
an instance of the class is saved. This even includes using the [activerecord-import](https://github.com/zdennis/activerecord-import) gem
to import objects (including using `on_duplicate_key_update`). However,
it will *not* work for `update_all`, `delete` or `delete_all`, and naturally
will not fire if using pure SQL or Arel.

Note that these messages are sent *during the transaction*, i.e. using
`after_create`, `after_update` and `after_destroy`. If there are
questions of consistency between the database and Kafka, it is recommended
to switch to using the DB backend (see next section) to avoid these issues.

When the object is destroyed, an empty payload with a payload key consisting of
the record's primary key is sent to the producer. If your topic's key is
from another field, you will need to override the `deletion_payload` method.

```ruby
class Widget < ActiveRecord::Base
  include Deimos::KafkaSource

  # Class method that defines the ActiveRecordProducer(s) which take the
  # object and turn it into a payload.
  def self.kafka_producers
    [MyProducer]
  end

  def deletion_payload
    { payload_key: self.uuid }
  end

  # Optional - indicate that you want to send messages when these events
  # occur.
  def self.kafka_config
    {
      :update => true,
      :delete => true,
      :import => true,
      :create => true
    }
  end

end
```

#### Database Backend

Deimos provides a way to allow Kafka messages to be created inside a
database transaction, and send them asynchronously. This ensures that your
database transactions and Kafka messages related to those transactions
are always in sync. Essentially, it separates the message logic so that a
message is first validated, encoded, and saved in the database, and then sent
on a separate thread. This means if you have to roll back your transaction,
it also rolls back your Kafka messages.

To enable this, first generate the migration to create the relevant tables:

    rails g deimos:db_backend

You can now set the following configuration:

    config.publish_backend = :db

This will save all your Kafka messages to the `kafka_messages` table instead
of immediately sending to Kafka. Now, you just need to call

    Deimos.start_db_backend!

This creates one or more threads dedicated to scanning and publishing these
messages by using the `kafka_topics` table in a manner similar to
[Delayed Job](https://github.com/collectiveidea/delayed_job).
You can pass in a number of threads to the method:

    Deimos.start_db_backend!(thread_count: 2)

If you want to force a message to send immediately, just call the `publish_list`
method with `force_send: true`. You can also pass `force_send` into any of the
other methods that publish events, like `send_event` in `ActiveRecordProducer`.
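
For example, to push one batch straight to Kafka while the DB backend is
active (a sketch, assuming `force_send` is accepted as a keyword option as
described above):

```ruby
MyProducer.publish_list(
  [{ 'test_id' => '123', 'some_int' => 123 }],
  force_send: true
)
```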

For more information on how the database backend works and why it was
implemented, please see [Database Backends](docs/DATABASE_BACKEND.md).

### Consuming

Deimos provides an ActiveRecordConsumer which will take a payload
and automatically save it to a provided model. It will take the intersection
of the payload fields and the model attributes, and either create a new record
or update an existing record. It will use the message key to find the record
in the database.

To delete a record, simply produce a message with the record's ID as the message
key and a null payload.

Note that to retrieve the key, you must specify the correct [key encoding](#kafka-message-keys)
configuration.

A sample consumer would look as follows:

```ruby
class MyConsumer < Deimos::ActiveRecordConsumer

  schema 'MySchema'
  key_config field: 'my_field'
  record_class Widget

  # Optional override of the default behavior, which is to call `destroy`
  # on the record - e.g. you can replace this with "archiving" the record
  # in some way.
  def destroy_record(record)
    super
  end

  # Optional override to change the attributes of the record before they
  # are saved.
  def record_attributes(payload)
    super.merge(:some_field => 'some_value')
  end
end
```

## Running Consumers

Deimos includes a rake task. Once it's in your Gemfile, just run

    rake deimos:start

This will automatically set an environment variable called `DEIMOS_RAKE_TASK`,
which can be useful if you want to figure out if you're inside the task
as opposed to running your Rails server or console. E.g. you could start your
DB backend only when your rake task is running.
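
For example, in an initializer (a sketch - it assumes checking the variable's
presence is enough, since the value it is set to isn't specified here):

```ruby
# Only spin up the DB-backend producer threads inside `rake deimos:start`:
if ENV['DEIMOS_RAKE_TASK']
  Deimos.start_db_backend!(thread_count: 2)
end
```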

# Metrics

Deimos includes some metrics reporting out of the box. It ships with DataDog support, but you can add custom metric providers as well.

The following metrics are reported:
* `{service_name}.consumer_lag` - for each partition, the number of messages
  it's behind the tail of the partition (a gauge). This is only sent if
  `config.report_lag` is set to true.
* `{service_name}.handler` - a count of the number of messages received. Tagged
  with the following:
  * `topic:{topic_name}`
  * `status:received`
  * `status:success`
  * `status:error`
  * `time:consume` (histogram)
  * `time:time_delayed` (histogram)
* `{service_name}.publish` - a count of the number of messages published. Tagged
  with `topic:{topic_name}`
* `{service_name}.publish_error` - a count of the number of messages which failed
  to publish. Tagged with `topic:{topic_name}`

### Configuring Metrics Providers

See the `# Configure the metrics provider` section under [Configuration](#configuration).
View all available metrics providers [here](lib/deimos/metrics).

### Custom Metrics Providers

Using the above configuration, it is possible to pass in any generic Metrics
Provider class as long as it exposes the methods and definitions expected by
the Metrics module.
The easiest way to do this is to inherit from the `Metrics::Provider` class
and implement the methods in it.

See the [Mock provider](lib/deimos/metrics/mock.rb) as an example. It implements a constructor which receives config, plus the required metrics methods.
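
A minimal custom provider sketch - the method names below mirror the Mock
provider's interface (`increment`, `gauge`, `histogram`, `time`); treat them
as assumptions and check [provider.rb](lib/deimos/metrics/provider.rb) for
the authoritative list:

```ruby
class MyMetricsProvider < Deimos::Metrics::Provider
  # config comes from the initializer in your Deimos.configure block.
  def initialize(config)
    @client = config[:client] # hypothetical client for your metrics backend
  end

  def increment(metric_name, options = {})
    @client.increment(metric_name, options)
  end

  def gauge(metric_name, count, options = {})
    @client.gauge(metric_name, count, options)
  end

  def histogram(metric_name, count, options = {})
    @client.histogram(metric_name, count, options)
  end

  def time(metric_name, options = {})
    start = Time.now
    result = yield
    @client.histogram("#{metric_name}.time", Time.now - start, options)
    result
  end
end
```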

Also see [deimos.rb](lib/deimos.rb) under `Configure metrics` to see how the metrics module is called.

# Tracing

Deimos also includes some tracing for Kafka consumers. It ships with
DataDog support, but you can add custom tracing providers as well.

A trace span is opened when an incoming message is Avro-decoded, and a
separate span wraps the message consume logic.

### Configuring Tracing Providers

See the `# Configure the tracing provider` section under [Configuration](#configuration).
View all available tracing providers [here](lib/deimos/tracing).

### Custom Tracing Providers

Using the above configuration, it is possible to pass in any generic Tracing
Provider class as long as it exposes the methods and definitions expected by
the Tracing module.
The easiest way to do this is to inherit from the `Tracing::Provider` class
and implement the methods in it.

See the [Mock provider](lib/deimos/tracing/mock.rb) as an example. It implements a constructor which receives config, plus the required tracing methods.
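
A minimal custom tracing provider sketch - the method names here are
assumptions modeled on the Mock provider (`start`, `finish`, `set_error`);
check [provider.rb](lib/deimos/tracing/provider.rb) for the real interface:

```ruby
class MyTracingProvider < Deimos::Tracing::Provider
  def initialize(config)
    @service_name = config[:service_name]
  end

  # Returns an object representing the span; Deimos would pass it back
  # to finish / set_error.
  def start(span_name, options = {})
    { name: span_name, started_at: Time.now, options: options }
  end

  def finish(span)
    # Report the span and its duration to your tracing backend here.
  end

  def set_error(span, exception)
    span[:error] = exception.message
  end
end
```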

Also see [deimos.rb](lib/deimos.rb) under `Configure tracing` to see how the tracing module is called.

# Testing

Deimos comes with a test helper class which automatically stubs out
external calls (like metrics and tracing providers and the schema
registry) and provides useful methods for testing consumers.

In `spec_helper.rb`:
```ruby
RSpec.configure do |config|
  config.include Deimos::TestHelpers
  config.before(:each) do
    stub_producers_and_consumers!
  end
end
```

In your test, you now have the following methods available:
```ruby
# Pass a consumer class (not instance) to validate a payload against it.
# This will fail if the payload does not match the schema the consumer
# is set up to consume.
test_consume_message(MyConsumer,
                     { 'some-payload' => 'some-value' }) do |payload, metadata|
  # do some expectation handling here
end

# You can also pass a topic name instead of the consumer class as long
# as the topic is configured in your phobos.yml configuration:
test_consume_message('my-topic-name',
                     { 'some-payload' => 'some-value' }) do |payload, metadata|
  # do some expectation handling here
end

# Alternatively, you can test the actual consume logic:
test_consume_message(MyConsumer,
                     { 'some-payload' => 'some-value' },
                     call_original: true)

# Test that a given payload is invalid against the schema:
test_consume_invalid_message(MyConsumer,
                             { 'some-invalid-payload' => 'some-value' })

# A matcher which allows you to test that a message was sent on the given
# topic, without having to know which class produced it.
expect(topic_name).to have_sent(payload, key=nil)

# Inspect sent messages
message = Deimos::TestHelpers.sent_messages[0]
expect(message).to eq({
  message: { 'some-key' => 'some-value' },
  topic: 'my-topic',
  key: 'my-id'
})
```

**Important note:** To use the `have_sent` helper, your producers need to be
loaded / required *before* starting the test. You can do this in your
`spec_helper` file, or if you are defining producers dynamically, you can
add an `RSpec.prepend_before(:each)` block where you define the producer.
Alternatively, you can use the `stub_producer` and `stub_consumer` methods
in your test.

There is also a helper method that will let you test if an existing schema
would be compatible with a new version of it. You can use this in your
Ruby console but it would likely not be part of your RSpec test:

```ruby
require 'deimos/test_helpers'
# Can pass a file path, a string or a hash into this:
Deimos::TestHelpers.schemas_compatible?(schema1, schema2)
```

### Integration Test Helpers

You can use the `InlineConsumer` class to help with integration testing,
with a full external Kafka running.

If you have a consumer you want to test against messages in a Kafka topic,
use the `consume` method:
```ruby
Deimos::Utils::InlineConsumer.consume(
  topic: 'my-topic',
  frk_consumer: MyConsumerClass,
  num_messages: 5
)
```

This is a _synchronous_ call which will run the consumer against the
last 5 messages in the topic. You can set `num_messages` to a number
like `1_000_000` to always consume all the messages. Once the last
message is retrieved, the process will wait 1 second to make sure
they're all done, then continue execution.

If you just want to retrieve the contents of a topic, you can use
the `get_messages_for` method:

```ruby
Deimos::Utils::InlineConsumer.get_messages_for(
  topic: 'my-topic',
  schema: 'my-schema',
  namespace: 'my.namespace',
  key_config: { field: 'id' },
  num_messages: 5
)
```

This will run the process and simply return the last 5 messages on the
topic, as hashes, once it's done. The format of the messages will simply be

```ruby
{
  payload: { key: value }, # payload hash here
  key: "some_value" # key value or hash here
}
```

Both payload and key will be Avro-decoded as necessary according to the
key config.

You can also just pass an existing producer or consumer class into the method,
and it will extract the necessary configuration from it:

```ruby
Deimos::Utils::InlineConsumer.get_messages_for(
  topic: 'my-topic',
  config_class: MyProducerClass,
  num_messages: 5
)
```

## Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/flipp-oss/deimos.

### Linting

Deimos uses Rubocop to lint the code. Please run Rubocop on your code
before submitting a PR.

---
<p align="center">
  Sponsored by<br/>
  <a href="https://corp.flipp.com/">
    <img src="support/flipp-logo.png" title="Flipp logo" style="border:none;"/>
  </a>
</p>