deimos-kafka 1.0.0.pre.beta15

Files changed (100)
  1. checksums.yaml +7 -0
  2. data/.circleci/config.yml +74 -0
  3. data/.gitignore +41 -0
  4. data/.gitmodules +0 -0
  5. data/.rspec +1 -0
  6. data/.rubocop.yml +321 -0
  7. data/.ruby-gemset +1 -0
  8. data/.ruby-version +1 -0
  9. data/CHANGELOG.md +9 -0
  10. data/CODE_OF_CONDUCT.md +77 -0
  11. data/Dockerfile +23 -0
  12. data/Gemfile +6 -0
  13. data/Gemfile.lock +165 -0
  14. data/Guardfile +22 -0
  15. data/LICENSE.md +195 -0
  16. data/README.md +742 -0
  17. data/Rakefile +13 -0
  18. data/bin/deimos +4 -0
  19. data/deimos-kafka.gemspec +42 -0
  20. data/docker-compose.yml +71 -0
  21. data/docs/DATABASE_BACKEND.md +147 -0
  22. data/docs/PULL_REQUEST_TEMPLATE.md +34 -0
  23. data/lib/deimos.rb +134 -0
  24. data/lib/deimos/active_record_consumer.rb +81 -0
  25. data/lib/deimos/active_record_producer.rb +64 -0
  26. data/lib/deimos/avro_data_coder.rb +89 -0
  27. data/lib/deimos/avro_data_decoder.rb +36 -0
  28. data/lib/deimos/avro_data_encoder.rb +51 -0
  29. data/lib/deimos/backends/db.rb +27 -0
  30. data/lib/deimos/backends/kafka.rb +27 -0
  31. data/lib/deimos/backends/kafka_async.rb +27 -0
  32. data/lib/deimos/configuration.rb +88 -0
  33. data/lib/deimos/consumer.rb +164 -0
  34. data/lib/deimos/instrumentation.rb +71 -0
  35. data/lib/deimos/kafka_message.rb +27 -0
  36. data/lib/deimos/kafka_source.rb +126 -0
  37. data/lib/deimos/kafka_topic_info.rb +79 -0
  38. data/lib/deimos/message.rb +74 -0
  39. data/lib/deimos/metrics/datadog.rb +47 -0
  40. data/lib/deimos/metrics/mock.rb +39 -0
  41. data/lib/deimos/metrics/provider.rb +38 -0
  42. data/lib/deimos/monkey_patches/phobos_cli.rb +35 -0
  43. data/lib/deimos/monkey_patches/phobos_producer.rb +51 -0
  44. data/lib/deimos/monkey_patches/ruby_kafka_heartbeat.rb +85 -0
  45. data/lib/deimos/monkey_patches/schema_store.rb +19 -0
  46. data/lib/deimos/producer.rb +218 -0
  47. data/lib/deimos/publish_backend.rb +30 -0
  48. data/lib/deimos/railtie.rb +8 -0
  49. data/lib/deimos/schema_coercer.rb +108 -0
  50. data/lib/deimos/shared_config.rb +59 -0
  51. data/lib/deimos/test_helpers.rb +356 -0
  52. data/lib/deimos/tracing/datadog.rb +35 -0
  53. data/lib/deimos/tracing/mock.rb +40 -0
  54. data/lib/deimos/tracing/provider.rb +31 -0
  55. data/lib/deimos/utils/db_producer.rb +95 -0
  56. data/lib/deimos/utils/executor.rb +117 -0
  57. data/lib/deimos/utils/inline_consumer.rb +144 -0
  58. data/lib/deimos/utils/lag_reporter.rb +182 -0
  59. data/lib/deimos/utils/platform_schema_validation.rb +0 -0
  60. data/lib/deimos/utils/signal_handler.rb +68 -0
  61. data/lib/deimos/version.rb +5 -0
  62. data/lib/generators/deimos/db_backend/templates/migration +24 -0
  63. data/lib/generators/deimos/db_backend/templates/rails3_migration +30 -0
  64. data/lib/generators/deimos/db_backend_generator.rb +48 -0
  65. data/lib/tasks/deimos.rake +17 -0
  66. data/spec/active_record_consumer_spec.rb +81 -0
  67. data/spec/active_record_producer_spec.rb +107 -0
  68. data/spec/avro_data_decoder_spec.rb +18 -0
  69. data/spec/avro_data_encoder_spec.rb +37 -0
  70. data/spec/backends/db_spec.rb +35 -0
  71. data/spec/backends/kafka_async_spec.rb +11 -0
  72. data/spec/backends/kafka_spec.rb +11 -0
  73. data/spec/consumer_spec.rb +169 -0
  74. data/spec/deimos_spec.rb +117 -0
  75. data/spec/kafka_source_spec.rb +168 -0
  76. data/spec/kafka_topic_info_spec.rb +88 -0
  77. data/spec/phobos.bad_db.yml +73 -0
  78. data/spec/phobos.yml +73 -0
  79. data/spec/producer_spec.rb +397 -0
  80. data/spec/publish_backend_spec.rb +10 -0
  81. data/spec/schemas/com/my-namespace/MySchema-key.avsc +13 -0
  82. data/spec/schemas/com/my-namespace/MySchema.avsc +18 -0
  83. data/spec/schemas/com/my-namespace/MySchemaWithBooleans.avsc +18 -0
  84. data/spec/schemas/com/my-namespace/MySchemaWithDateTimes.avsc +33 -0
  85. data/spec/schemas/com/my-namespace/MySchemaWithId.avsc +28 -0
  86. data/spec/schemas/com/my-namespace/MySchemaWithUniqueId.avsc +32 -0
  87. data/spec/schemas/com/my-namespace/Widget.avsc +27 -0
  88. data/spec/schemas/com/my-namespace/WidgetTheSecond.avsc +27 -0
  89. data/spec/spec_helper.rb +207 -0
  90. data/spec/updateable_schema_store_spec.rb +36 -0
  91. data/spec/utils/db_producer_spec.rb +208 -0
  92. data/spec/utils/executor_spec.rb +42 -0
  93. data/spec/utils/lag_reporter_spec.rb +69 -0
  94. data/spec/utils/platform_schema_validation_spec.rb +0 -0
  95. data/spec/utils/signal_handler_spec.rb +16 -0
  96. data/support/deimos-solo.png +0 -0
  97. data/support/deimos-with-name-next.png +0 -0
  98. data/support/deimos-with-name.png +0 -0
  99. data/support/flipp-logo.png +0 -0
  100. metadata +452 -0
data/README.md (new file, +742 lines):
<p align="center">
  <img src="support/deimos-with-name.png" title="Deimos logo"/>
  <br/>
  <img src="https://img.shields.io/circleci/build/github/flipp-oss/deimos.svg" alt="CircleCI"/>
  <a href="https://badge.fury.io/rb/deimos"><img src="https://badge.fury.io/rb/deimos.svg" alt="Gem Version" height="18"></a>
  <img src="https://img.shields.io/codeclimate/maintainability/flipp-oss/deimos.svg"/>
</p>

A Ruby framework for marrying Kafka, Avro, and/or ActiveRecord, providing
a useful toolbox of goodies for Ruby-based Kafka development.
Built on Phobos and hence Ruby-Kafka.

<!--ts-->
   * [Installation](#installation)
   * [Versioning](#versioning)
   * [Configuration](#configuration)
   * [Producers](#producers)
      * [Auto-added Fields](#auto-added-fields)
      * [Coerced Values](#coerced-values)
      * [Instrumentation](#instrumentation)
      * [Kafka Message Keys](#kafka-message-keys)
   * [Consumers](#consumers)
   * [Rails Integration](#rails-integration)
   * [Running Consumers](#running-consumers)
   * [Metrics](#metrics)
   * [Tracing](#tracing)
   * [Testing](#testing)
      * [Integration Test Helpers](#integration-test-helpers)
   * [Contributing](#contributing)
<!--te-->

# Installation

Add this line to your application's Gemfile:
```ruby
gem 'deimos'
```

And then execute:

    $ bundle

Or install it yourself as:

    $ gem install deimos

# Versioning

We use a version of semver for this gem. Any change in previous behavior
(something works differently or something old no longer works)
is denoted with a bump in the minor version (0.4 -> 0.5). Patch versions
are for bugfixes or new functionality which does not affect existing code. You
should lock your Gemfile to the minor version:

```ruby
gem 'deimos', '~> 0.4'
```

# Configuration

To configure the gem, use `configure` in an initializer:

```ruby
Deimos.configure do |config|
  # Configure logger
  config.logger = Rails.logger

  # Phobos settings
  config.phobos_config_file = 'config/phobos.yml'
  config.schema_registry_url = 'https://my-schema-registry.com'
  config.seed_broker = 'my.seed.broker.0.net:9093,my.seed.broker.1.net:9093'
  config.ssl_enabled = ENV['KAFKA_SSL_ENABLED']
  if config.ssl_enabled
    config.ssl_ca_cert = File.read(ENV['SSL_CA_CERT'])
    config.ssl_client_cert = File.read(ENV['SSL_CLIENT_CERT'])
    config.ssl_client_cert_key = File.read(ENV['SSL_CLIENT_CERT_KEY'])
  end

  # Other settings

  # Local path to find schemas, for publishing and testing consumers
  config.schema_path = "#{Rails.root}/app/schemas"

  # Default namespace for producers to use
  config.producer_schema_namespace = 'com.deimos.my_app'

  # Prefix for all topics, e.g. environment name
  config.producer_topic_prefix = 'myenv.'

  # Disable all producers - e.g. when doing heavy data lifting and events
  # would be fired a different way
  config.disable_producers = true

  # Default behavior is to swallow uncaught exceptions and log to DataDog.
  # Set this to true to instead raise all errors. Note that raising an error
  # will ensure that the message cannot be processed - if there is a bad
  # message which will always raise that error, your consumer will not
  # be able to proceed past it and will be stuck forever until you fix
  # your code.
  config.reraise_consumer_errors = true

  # Set to true to send consumer lag metrics
  config.report_lag = %w(production staging).include?(Rails.env)

  # Change the default backend. See Backends, below.
  config.publish_backend = :db

  # If the DB backend is being used, specify the number of threads to create
  # to process the DB messages.
  config.num_producer_threads = 1

  # Configure the metrics provider (see below).
  config.metrics = Deimos::Metrics::Mock.new({ tags: %w(env:prod my_tag:another_1) })

  # Configure the tracing provider (see below).
  config.tracer = Deimos::Tracing::Mock.new({ service_name: 'my-service' })
end
```

Note that the configuration options from Phobos (`seed_broker` and the SSL settings)
can be removed from `phobos.yml`, since Deimos will load them instead.

# Producers

Producers will look like this:

```ruby
class MyProducer < Deimos::Producer

  # Can override default namespace.
  namespace 'com.deimos.my-app-special'
  topic 'MyApp.MyTopic'
  schema 'MySchema'
  key_config field: 'my_field' # see Kafka Message Keys, below

  # If config.schema_path is app/schemas, assumes there is a file in
  # app/schemas/com/deimos/my-app-special/MySchema.avsc

  class << self

    # Optionally override the default partition key logic, which is to use
    # the payload key if it's provided, and nil if there is no payload key.
    def partition_key(payload)
      payload[:my_id]
    end

    # You can call publish / publish_list directly, or create new methods
    # wrapping them.
    def send_some_message(an_object)
      payload = {
        'some-key' => an_object.foo,
        'some-key2' => an_object.bar
      }
      # You can also publish an array with self.publish_list(payloads)
      self.publish(payload)
    end

  end

end
```

### Auto-added Fields

If your schema has a field called `message_id`, and the payload you give
your producer doesn't have this set, Deimos will auto-generate
a message ID. It is highly recommended to give all schemas a `message_id`
field so that you can track each sent message via logging.

You can also provide a field in your schema called `timestamp`, which will be
auto-filled with the current timestamp if not provided.
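
For example, a minimal sketch (assuming `MyProducer` uses a schema that
includes optional `message_id` and `timestamp` fields):

```ruby
# Neither field is set here, so Deimos fills both in before encoding -
# message_id with a generated ID, timestamp with the current time.
MyProducer.publish('some-key' => 'some-value')
```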

### Coerced Values

Deimos will do some simple coercions if you pass values that don't
exactly match the schema.

* If the schema is :int or :long, any integer value, or a string representing
  an integer, will be parsed to Integer.
* If the schema is :float or :double, any numeric value, or a string
  representing a number, will be parsed to Float.
* If the schema is :string, if the value implements its own `to_s` method,
  this will be called on it. This includes hashes, symbols, numbers, dates, etc.
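
A quick illustration (a hypothetical payload, assuming `test_id` is a :string
field and `some_int` an :int field):

```ruby
# The symbol is coerced via to_s and the numeric string is parsed to Integer:
MyProducer.publish('test_id' => :abc, 'some_int' => '123')
# ...is encoded as { 'test_id' => 'abc', 'some_int' => 123 }
```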

### Instrumentation

Deimos will send ActiveSupport Notifications.
You can listen to these notifications e.g. as follows:

```ruby
Deimos.subscribe('produce') do |event|
  # event is an ActiveSupport::Notifications::Event
  # you can access time, duration, and transaction_id
  # payload contains :producer, :topic, and :payloads
  data = event.payload
end
```

The following events are also produced:

* `produce_error` - sent when an error occurs when producing a message.
  * producer - the class that produced the message
  * topic
  * exception_object
  * payloads - the unencoded payloads
* `encode_messages` - sent when messages are being Avro-encoded.
  * producer - the class that produced the message
  * topic
  * payloads - the unencoded payloads

Similarly:
```ruby
Deimos.subscribe('produce_error') do |event|
  data = event.payload
  Mail.send("Got an error #{data[:exception_object].message} on topic #{data[:topic]} with payloads #{data[:payloads]}")
end

Deimos.subscribe('encode_messages') do |event|
  # ...
end
```

### Kafka Message Keys

Topics representing events rather than domain data don't need keys. However,
best practice for domain messages is to Avro-encode message keys
with a separate Avro schema.

This is enforced by requiring producers to define a `key_config` directive. If
any message comes in with a key, the producer will error out if `key_config` is
not defined.

There are four possible configurations to use:

* `key_config none: true` - this indicates that you are not using keys at all
  for this topic. This *must* be set if your messages won't have keys - either
  all your messages in a topic need to have a key, or they all need to have
  no key. This is a good choice for events that aren't keyed - you can still
  set a partition key.
* `key_config plain: true` - this indicates that you are not using an Avro-encoded
  key. Use this for legacy topics - new topics should not use this setting.
* `key_config schema: 'MyKeySchema-key'` - this tells the producer to look for
  an existing key schema named `MyKeySchema-key` in the schema registry and to
  encode the key using it. Use this if you've already created a key schema
  or the key value does not exist in the existing payload
  (e.g. it is a compound or generated key).
* `key_config field: 'my_field'` - this tells the producer to look for a field
  named `my_field` in the value schema. When a payload comes in, the producer
  will take that value from the payload and insert it in a *dynamically generated*
  key schema. This key schema does not need to live in your codebase. Instead,
  it will be a subset of the value schema with only the key field in it.

If your value schema looks like this:
```javascript
{
  "namespace": "com.my-namespace",
  "name": "MySchema",
  "type": "record",
  "doc": "Test schema",
  "fields": [
    {
      "name": "test_id",
      "type": "string",
      "doc": "test string"
    },
    {
      "name": "some_int",
      "type": "int",
      "doc": "test int"
    }
  ]
}
```

...setting `key_config field: 'test_id'` will create a key schema that looks
like this:

```javascript
{
  "namespace": "com.my-namespace",
  "name": "MySchema-key",
  "type": "record",
  "doc": "Key for com.my-namespace.MySchema",
  "fields": [
    {
      "name": "test_id",
      "type": "string",
      "doc": "test string"
    }
  ]
}
```

If you publish a payload `{ "test_id" => "123", "some_int" => 123 }`, this
will be turned into a key that looks like `{ "test_id" => "123" }` and encoded
via Avro before being sent to Kafka.
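
In code, that flow looks roughly like this (a sketch using the schema above):

```ruby
# With `key_config field: 'test_id'`, the producer extracts the key itself:
MyProducer.publish('test_id' => '123', 'some_int' => 123)
# The message is sent with an Avro-encoded key of { 'test_id' => '123' }.
```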

If you are using `plain` or `schema` as your config, you will need to add a
special `payload_key` key to your payload hash. This will be extracted and
used as the key (for `plain`, it will be used directly, while for `schema`
it will be encoded first against the schema). So your payload would look like
`{ "test_id" => "123", "some_int" => 123, payload_key: "some_other_key" }`.
Remember that if you're using `schema`, the `payload_key` must be a *hash*,
not a plain value.
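
For instance, with a hypothetical producer configured with
`key_config schema: 'MyKeySchema-key'`:

```ruby
# The payload_key entry is stripped out and encoded against MyKeySchema-key;
# the rest of the hash is encoded against the value schema.
MyProducer.publish('test_id' => '123', 'some_int' => 123,
                   payload_key: { 'test_id' => '123' })
```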

# Consumers

Here is a sample consumer:

```ruby
class MyConsumer < Deimos::Consumer

  # These are optional but strongly recommended for testing purposes; this
  # will validate against a local schema file used as the reader schema,
  # and also lets you write tests against this schema.
  # This is recommended since it ensures you are always getting the values
  # you expect.
  schema 'MySchema'
  namespace 'com.my-namespace'
  # This directive works identically to the producer - see Kafka Message Keys, above.
  # This only affects the `decode_key` method below. You need to provide
  # `schema` and `namespace`, above, for this to work.
  key_config field: :my_id

  def consume(payload, metadata)
    # Same method as Phobos consumers.
    # payload is an Avro-decoded hash.
    # metadata is a hash that contains information like :key and :topic. Both
    # key (if configured) and payload will be Avro-decoded.
  end
end
```

# Rails Integration

### Producing

Deimos comes with an ActiveRecordProducer. This takes a single ActiveRecord
object or hash, or a list of them, and maps them to the given schema.

An example would look like this:

```ruby
class MyProducer < Deimos::ActiveRecordProducer

  topic 'MyApp.MyTopic'
  schema 'MySchema'
  key_config field: 'my_field'

  # The record class should be set on every ActiveRecordProducer.
  # By default, if you give the producer a hash, it will re-fetch the
  # record itself for use in the payload generation. This can be useful
  # if you pass a list of hashes to the method e.g. as part of a
  # mass import operation. You can turn off this behavior (e.g. if you're just
  # using the default functionality and don't need to override it)
  # by setting `refetch` to false. This will avoid extra database fetches.
  record_class Widget, refetch: false

  # Optionally override this if you want the message to be
  # sent even if fields that aren't in the schema are changed.
  def watched_attributes
    super + ['a_non_schema_attribute']
  end

  # If you want to just use the default functionality you can leave this
  # method out entirely. You only need to use it if you want to massage
  # the payload in some way, e.g. adding fields that don't exist on the
  # record itself.
  def generate_payload(attributes, record)
    super # generates payload based on the record and schema
  end

end

# Use `send_event` with just one Widget, or `send_events` with a list:
MyProducer.send_events([Widget.new(foo: 1), Widget.new(foo: 2)])
MyProducer.send_events([{foo: 1}, {foo: 2}])
```

#### Disabling Producers

You can disable producers globally or inside a block. Globally:
```ruby
Deimos.config.disable_producers = true
```

For the duration of a block:
```ruby
Deimos.disable_producers do
  # code goes here
end
```

For specific producers only:
```ruby
Deimos.disable_producers(Producer1, Producer2) do
  # code goes here
end
```

#### KafkaSource

There is a special mixin which can be added to any ActiveRecord class. This
will create callbacks which will automatically send messages to Kafka whenever
this class is saved. This even includes using the [activerecord-import](https://github.com/zdennis/activerecord-import) gem
to import objects (including using `on_duplicate_key_update`). However,
it will *not* work for `update_all`, `delete` or `delete_all`, and naturally
will not fire if using pure SQL or Arel.

Note that these messages are sent *during the transaction*, i.e. using
`after_create`, `after_update` and `after_destroy`. If there are
questions of consistency between the database and Kafka, it is recommended
to switch to using the DB backend (see next section) to avoid these issues.

When the object is destroyed, an empty payload with a payload key consisting of
the record's primary key is sent to the producer. If your topic's key comes
from another field, you will need to override the `deletion_payload` method.

```ruby
class Widget < ActiveRecord::Base
  include Deimos::KafkaSource

  # Class method that defines the ActiveRecordProducer(s) that will take the
  # object and turn it into a payload.
  def self.kafka_producers
    [MyProducer]
  end

  def deletion_payload
    { payload_key: self.uuid }
  end

  # Optional - indicate that you want to send messages when these events
  # occur.
  def self.kafka_config
    {
      :update => true,
      :delete => true,
      :import => true,
      :create => true
    }
  end

end
```

#### Database Backend

Deimos provides a way to allow Kafka messages to be created inside a
database transaction and sent asynchronously. This ensures that your
database transactions and the Kafka messages related to those transactions
are always in sync. Essentially, it separates the message logic so that a
message is first validated, encoded, and saved in the database, and then sent
on a separate thread. This means if you have to roll back your transaction,
it also rolls back your Kafka messages.

To enable this, first generate the migration to create the relevant tables:

    rails g deimos:db_backend

You can now set the following configuration:

    config.publish_backend = :db

This will save all your Kafka messages to the `kafka_messages` table instead
of immediately sending to Kafka. Now you just need to call

    Deimos.start_db_backend!

This creates one or more threads dedicated to scanning and publishing these
messages by using the `kafka_topics` table in a manner similar to
[Delayed Job](https://github.com/collectiveidea/delayed_job).
You can pass in a number of threads to the method:

    Deimos.start_db_backend!(thread_count: 2)

If you want to force a message to send immediately, just call the `publish_list`
method with `force_send: true`. You can also pass `force_send` into any of the
other methods that publish events, like `send_event` in `ActiveRecordProducer`.
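
For example (a sketch - the producer and payload are placeholders):

```ruby
# Bypasses the kafka_messages table and sends straight to Kafka:
MyProducer.publish_list([{ 'test_id' => '123', 'some_int' => 123 }],
                        force_send: true)
```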

For more information on how the database backend works and why it was
implemented, please see [Database Backends](docs/DATABASE_BACKEND.md).

### Consuming

Deimos provides an ActiveRecordConsumer which will take a payload
and automatically save it to a provided model. It will take the intersection
of the payload fields and the model attributes, and either create a new record
or update an existing record. It will use the message key to find the record
in the database.

To delete a record, simply produce a message with the record's ID as the message
key and a null payload.

Note that to retrieve the key, you must specify the correct [key encoding](#kafka-message-keys)
configuration.

A sample consumer would look as follows:

```ruby
class MyConsumer < Deimos::ActiveRecordConsumer

  schema 'MySchema'
  key_config field: 'my_field'
  record_class Widget

  # Optional override of the default behavior, which is to call `destroy`
  # on the record - e.g. you can replace this with "archiving" the record
  # in some way.
  def destroy_record(record)
    super
  end

  # Optional override to change the attributes of the record before they
  # are saved.
  def record_attributes(payload)
    super.merge(:some_field => 'some_value')
  end
end
```

## Running consumers

Deimos includes a rake task for running consumers. Once the gem is in your
Gemfile, just run

    rake deimos:start

This will automatically set an environment variable called `DEIMOS_RAKE_TASK`,
which can be useful if you want to figure out if you're inside the task
as opposed to running your Rails server or console. E.g. you could start your
DB backend only when your rake task is running.
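
A sketch of that last idea in an initializer (the thread count is arbitrary):

```ruby
# config/initializers/deimos.rb
# Only spin up the DB backend threads inside `rake deimos:start`:
Deimos.start_db_backend!(thread_count: 2) if ENV['DEIMOS_RAKE_TASK']
```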

# Metrics

Deimos includes some metrics reporting out of the box. It ships with DataDog support, but you can add custom metric providers as well.

The following metrics are reported:
* `{service_name}.consumer_lag` - for each partition, the number of messages
  it's behind the tail of the partition (a gauge). This is only sent if
  `config.report_lag` is set to true.
* `{service_name}.handler` - a count of the number of messages received. Tagged
  with the following:
    * `topic:{topic_name}`
    * `status:received`
    * `status:success`
    * `status:error`
    * `time:consume` (histogram)
    * `time:time_delayed` (histogram)
* `{service_name}.publish` - a count of the number of messages published. Tagged
  with `topic:{topic_name}`
* `{service_name}.publish_error` - a count of the number of messages which failed
  to publish. Tagged with `topic:{topic_name}`

### Configuring Metrics Providers

See the `# Configure the metrics provider` section under [Configuration](#configuration).
View all available metrics providers [here](lib/deimos/metrics).

### Custom Metrics Providers

Using the above configuration, it is possible to pass in any generic metrics
provider class as long as it exposes the methods and definitions expected by
the Metrics module.
The easiest way to do this is to inherit from the `Metrics::Provider` class
and implement the methods in it.

See the [Mock provider](lib/deimos/metrics/mock.rb) as an example. It implements a constructor which receives config, plus the required metrics methods.
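
For illustration, here is a minimal sketch of a custom provider. The method
names are assumed to mirror the Mock provider (`increment`, `gauge`,
`histogram`, `time`) - check your Deimos version for the exact interface -
and `MyStatsClient` is a hypothetical stand-in for your metrics backend:

```ruby
class MyMetricsProvider < Deimos::Metrics::Provider
  def initialize(config)
    @tags = config[:tags] || []
  end

  def increment(metric_name, options={})
    MyStatsClient.increment(metric_name, tags: @tags + (options[:tags] || []))
  end

  def gauge(metric_name, count, options={})
    MyStatsClient.gauge(metric_name, count, tags: @tags + (options[:tags] || []))
  end

  def histogram(metric_name, count, options={})
    MyStatsClient.histogram(metric_name, count, tags: @tags + (options[:tags] || []))
  end

  # Time the given block and report the duration as a histogram.
  def time(metric_name, options={})
    start = Time.now
    result = yield
    histogram(metric_name, ((Time.now - start) * 1000).round, options)
    result
  end
end

# In your initializer:
# config.metrics = MyMetricsProvider.new(tags: %w(env:prod))
```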

Also see [deimos.rb](lib/deimos.rb) under `Configure metrics` to see how the metrics module is called.

# Tracing

Deimos also includes some tracing for Kafka consumers. It ships with
DataDog support, but you can add custom tracing providers as well.

A trace span is created when an incoming message is Avro-decoded, and a
separate span wraps the message consume logic.

### Configuring Tracing Providers

See the `# Configure the tracing provider` section under [Configuration](#configuration).
View all available tracing providers [here](lib/deimos/tracing).

### Custom Tracing Providers

Using the above configuration, it is possible to pass in any generic tracing
provider class as long as it exposes the methods and definitions expected by
the Tracing module.
The easiest way to do this is to inherit from the `Tracing::Provider` class
and implement the methods in it.

See the [Mock provider](lib/deimos/tracing/mock.rb) as an example. It implements a constructor which receives config, plus the required tracing methods.
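
As a sketch only - the method names below (`start`, `finish`, `set_error`) are
assumed to mirror the Mock tracer, so verify them against your Deimos version:

```ruby
class MyTracingProvider < Deimos::Tracing::Provider
  def initialize(config)
    @service = config[:service_name]
  end

  # Begin a span; the returned handle is what `finish` and `set_error` receive.
  def start(span_name, options={})
    { name: span_name, resource: options[:resource], started_at: Time.now }
  end

  def finish(span)
    elapsed = Time.now - span[:started_at]
    Deimos.config.logger.info("[#{@service}] #{span[:name]} finished in #{elapsed}s")
  end

  def set_error(span, exception)
    Deimos.config.logger.error("[#{@service}] #{span[:name]} failed: #{exception.message}")
  end
end
```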

Also see [deimos.rb](lib/deimos.rb) under `Configure tracing` to see how the tracing module is called.

# Testing

Deimos comes with a test helper class which automatically stubs out
external calls (like metrics and tracing providers and the schema
registry) and provides useful methods for testing consumers.

In `spec_helper.rb`:
```ruby
RSpec.configure do |config|
  config.include Deimos::TestHelpers
  config.before(:each) do
    stub_producers_and_consumers!
  end
end
```

In your test, you now have the following methods available:
```ruby
# Pass a consumer class (not instance) to validate a payload against it.
# This will fail if the payload does not match the schema the consumer
# is set up to consume.
test_consume_message(MyConsumer,
                     { 'some-payload' => 'some-value' }) do |payload, metadata|
  # do some expectation handling here
end

# You can also pass a topic name instead of the consumer class as long
# as the topic is configured in your phobos.yml configuration:
test_consume_message('my-topic-name',
                     { 'some-payload' => 'some-value' }) do |payload, metadata|
  # do some expectation handling here
end

# Alternatively, you can test the actual consume logic:
test_consume_message(MyConsumer,
                     { 'some-payload' => 'some-value' },
                     call_original: true)

# Test that a given payload is invalid against the schema:
test_consume_invalid_message(MyConsumer,
                             { 'some-invalid-payload' => 'some-value' })

# A matcher which allows you to test that a message was sent on the given
# topic, without having to know which class produced it. The key argument
# is optional.
expect(topic_name).to have_sent(payload, key=nil)

# Inspect sent messages
message = Deimos::TestHelpers.sent_messages[0]
expect(message).to eq({
  message: { 'some-key' => 'some-value' },
  topic: 'my-topic',
  key: 'my-id'
})
```

**Important note:** To use the `have_sent` helper, your producers need to be
loaded / required *before* starting the test. You can do this in your
`spec_helper` file, or if you are defining producers dynamically, you can
add an `RSpec.prepend_before(:each)` block where you define the producer.
Alternatively, you can use the `stub_producer` and `stub_consumer` methods
in your test.
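
For example, a sketch of the `prepend_before` approach (the producer here is
hypothetical):

```ruby
RSpec.prepend_before(:each) do
  # Define (or require) the producer before the stubbing hooks run so that
  # have_sent can find it:
  class DynamicProducer < Deimos::Producer
    topic 'my-topic'
    schema 'MySchema'
    key_config none: true
  end
end
```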

There is also a helper method that will let you test if an existing schema
would be compatible with a new version of it. You can use this in your
Ruby console, but it would likely not be part of your RSpec test:

```ruby
require 'deimos/test_helpers'
# Can pass a file path, a string or a hash into this:
Deimos::TestHelpers.schemas_compatible?(schema1, schema2)
```

### Integration Test Helpers

You can use the `InlineConsumer` class to help with integration testing,
with a full external Kafka running.

If you have a consumer you want to test against messages in a Kafka topic,
use the `consume` method:
```ruby
Deimos::Utils::InlineConsumer.consume(
  topic: 'my-topic',
  frk_consumer: MyConsumerClass,
  num_messages: 5
)
```

This is a _synchronous_ call which will run the consumer against the
last 5 messages in the topic. You can set `num_messages` to a number
like `1_000_000` to always consume all the messages. Once the last
message is retrieved, the process will wait 1 second to make sure
they're all done, then continue execution.

If you just want to retrieve the contents of a topic, you can use
the `get_messages_for` method:

```ruby
Deimos::Utils::InlineConsumer.get_messages_for(
  topic: 'my-topic',
  schema: 'my-schema',
  namespace: 'my.namespace',
  key_config: { field: 'id' },
  num_messages: 5
)
```

This will run the process and simply return the last 5 messages on the
topic, as hashes, once it's done. The format of the messages will simply
be
```ruby
{
  payload: { key: value }, # payload hash here
  key: "some_value" # key value or hash here
}
```

Both payload and key will be Avro-decoded as necessary according to the
key config.

You can also just pass an existing producer or consumer class into the method,
and it will extract the necessary configuration from it:

```ruby
Deimos::Utils::InlineConsumer.get_messages_for(
  topic: 'my-topic',
  config_class: MyProducerClass,
  num_messages: 5
)
```

## Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/flipp-oss/deimos.

### Linting

Deimos uses Rubocop to lint the code. Please run Rubocop on your code
before submitting a PR.

---
<p align="center">
  Sponsored by<br/>
  <a href="https://corp.flipp.com/">
    <img src="support/flipp-logo.png" title="Flipp logo" style="border:none;"/>
  </a>
</p>