deimos-ruby 1.6.3 → 1.8.1.pre.beta1
- checksums.yaml +4 -4
- data/.circleci/config.yml +9 -0
- data/.rubocop.yml +22 -16
- data/.ruby-version +1 -1
- data/CHANGELOG.md +42 -0
- data/Gemfile.lock +125 -98
- data/README.md +164 -16
- data/Rakefile +1 -1
- data/deimos-ruby.gemspec +4 -3
- data/docs/ARCHITECTURE.md +144 -0
- data/docs/CONFIGURATION.md +27 -0
- data/lib/deimos.rb +8 -7
- data/lib/deimos/active_record_consume/batch_consumption.rb +159 -0
- data/lib/deimos/active_record_consume/batch_slicer.rb +27 -0
- data/lib/deimos/active_record_consume/message_consumption.rb +58 -0
- data/lib/deimos/active_record_consume/schema_model_converter.rb +52 -0
- data/lib/deimos/active_record_consumer.rb +33 -75
- data/lib/deimos/active_record_producer.rb +23 -0
- data/lib/deimos/batch_consumer.rb +2 -140
- data/lib/deimos/config/configuration.rb +28 -10
- data/lib/deimos/consume/batch_consumption.rb +150 -0
- data/lib/deimos/consume/message_consumption.rb +94 -0
- data/lib/deimos/consumer.rb +79 -70
- data/lib/deimos/kafka_message.rb +1 -1
- data/lib/deimos/kafka_topic_info.rb +22 -3
- data/lib/deimos/message.rb +6 -1
- data/lib/deimos/metrics/provider.rb +0 -2
- data/lib/deimos/poll_info.rb +9 -0
- data/lib/deimos/schema_backends/avro_base.rb +28 -1
- data/lib/deimos/schema_backends/base.rb +15 -2
- data/lib/deimos/tracing/provider.rb +0 -2
- data/lib/deimos/utils/db_poller.rb +149 -0
- data/lib/deimos/utils/db_producer.rb +59 -16
- data/lib/deimos/utils/deadlock_retry.rb +68 -0
- data/lib/deimos/utils/lag_reporter.rb +19 -26
- data/lib/deimos/version.rb +1 -1
- data/lib/generators/deimos/active_record/templates/migration.rb.tt +28 -0
- data/lib/generators/deimos/active_record/templates/model.rb.tt +5 -0
- data/lib/generators/deimos/active_record_generator.rb +79 -0
- data/lib/generators/deimos/db_backend/templates/migration +1 -0
- data/lib/generators/deimos/db_backend/templates/rails3_migration +1 -0
- data/lib/generators/deimos/db_poller/templates/migration +11 -0
- data/lib/generators/deimos/db_poller/templates/rails3_migration +16 -0
- data/lib/generators/deimos/db_poller_generator.rb +48 -0
- data/lib/tasks/deimos.rake +7 -0
- data/spec/active_record_batch_consumer_spec.rb +481 -0
- data/spec/active_record_consume/batch_slicer_spec.rb +42 -0
- data/spec/active_record_consume/schema_model_converter_spec.rb +105 -0
- data/spec/active_record_consumer_spec.rb +3 -11
- data/spec/active_record_producer_spec.rb +66 -88
- data/spec/batch_consumer_spec.rb +24 -7
- data/spec/config/configuration_spec.rb +4 -0
- data/spec/consumer_spec.rb +8 -8
- data/spec/deimos_spec.rb +57 -49
- data/spec/generators/active_record_generator_spec.rb +56 -0
- data/spec/handlers/my_batch_consumer.rb +6 -1
- data/spec/handlers/my_consumer.rb +6 -1
- data/spec/kafka_topic_info_spec.rb +39 -16
- data/spec/message_spec.rb +19 -0
- data/spec/producer_spec.rb +3 -3
- data/spec/rake_spec.rb +1 -1
- data/spec/schemas/com/my-namespace/Generated.avsc +71 -0
- data/spec/schemas/com/my-namespace/MySchemaCompound-key.avsc +18 -0
- data/spec/schemas/com/my-namespace/Wibble.avsc +43 -0
- data/spec/spec_helper.rb +62 -6
- data/spec/utils/db_poller_spec.rb +320 -0
- data/spec/utils/db_producer_spec.rb +84 -10
- data/spec/utils/deadlock_retry_spec.rb +74 -0
- data/spec/utils/lag_reporter_spec.rb +29 -22
- metadata +66 -30
- data/lib/deimos/base_consumer.rb +0 -104
- data/lib/deimos/utils/executor.rb +0 -124
- data/lib/deimos/utils/platform_schema_validation.rb +0 -0
- data/lib/deimos/utils/signal_handler.rb +0 -68
- data/spec/utils/executor_spec.rb +0 -53
- data/spec/utils/signal_handler_spec.rb +0 -16
data/README.md
CHANGED
@@ -23,6 +23,7 @@ Built on Phobos and hence Ruby-Kafka.
|
|
23
23
|
* [Consumers](#consumers)
|
24
24
|
* [Rails Integration](#rails-integration)
|
25
25
|
* [Database Backend](#database-backend)
|
26
|
+
* [Database Poller](#database-poller)
|
26
27
|
* [Running Consumers](#running-consumers)
|
27
28
|
* [Metrics](#metrics)
|
28
29
|
* [Testing](#testing)
|
@@ -313,28 +314,19 @@ messages as an array and then process them together. This can improve
|
|
313
314
|
consumer throughput, depending on the use case. Batch consumers behave like
|
314
315
|
other consumers in regards to key and payload decoding, etc.
|
315
316
|
|
316
|
-
To enable batch consumption, ensure that the `delivery` property
|
317
|
+
To enable batch consumption, ensure that the `delivery` property of your
|
318
|
+
consumer is set to `inline_batch`.
|
317
319
|
|
318
|
-
|
319
|
-
|
320
|
-
consumer do
|
321
|
-
class_name 'Consumers::MyBatchConsumer'
|
322
|
-
topic 'my_batched_topic'
|
323
|
-
group_id 'my_group_id'
|
324
|
-
delivery :inline_batch
|
325
|
-
end
|
326
|
-
end
|
327
|
-
```
|
328
|
-
|
329
|
-
Batch consumers must inherit from the Deimos::BatchConsumer class as in
|
330
|
-
this sample:
|
320
|
+
Batch consumers will invoke the `consume_batch` method instead of `consume`
|
321
|
+
as in this example:
|
331
322
|
|
332
323
|
```ruby
|
333
|
-
class MyBatchConsumer < Deimos::
|
324
|
+
class MyBatchConsumer < Deimos::Consumer
|
334
325
|
|
335
326
|
def consume_batch(payloads, metadata)
|
336
327
|
# payloads is an array of schema-decoded hashes.
|
337
|
-
# metadata is a hash that contains information like :keys
|
328
|
+
# metadata is a hash that contains information like :keys, :topic,
|
329
|
+
# and :first_offset.
|
338
330
|
# Keys are automatically decoded and available as an array with
|
339
331
|
# the same cardinality as the payloads. If you need to iterate
|
340
332
|
# over payloads and keys together, you can use something like this:
|
@@ -532,12 +524,14 @@ class MyConsumer < Deimos::ActiveRecordConsumer
|
|
532
524
|
|
533
525
|
# Optional override of the way to fetch records based on payload and
|
534
526
|
# key. Default is to use the key to search the primary key of the table.
|
527
|
+
# Only used in non-batch mode.
|
535
528
|
def fetch_record(klass, payload, key)
|
536
529
|
super
|
537
530
|
end
|
538
531
|
|
539
532
|
# Optional override on how to set primary key for new records.
|
540
533
|
# Default is to set the class's primary key to the message's decoded key.
|
534
|
+
# Only used in non-batch mode.
|
541
535
|
def assign_key(record, payload, key)
|
542
536
|
super
|
543
537
|
end
|
@@ -545,6 +539,7 @@ class MyConsumer < Deimos::ActiveRecordConsumer
|
|
545
539
|
# Optional override of the default behavior, which is to call `destroy`
|
546
540
|
# on the record - e.g. you can replace this with "archiving" the record
|
547
541
|
# in some way.
|
542
|
+
# Only used in non-batch mode.
|
548
543
|
def destroy_record(record)
|
549
544
|
super
|
550
545
|
end
|
@@ -554,9 +549,159 @@ class MyConsumer < Deimos::ActiveRecordConsumer
|
|
554
549
|
def record_attributes(payload)
|
555
550
|
super.merge(:some_field => 'some_value')
|
556
551
|
end
|
552
|
+
|
553
|
+
# Optional override to change the attributes used for identifying records
|
554
|
+
def record_key(payload)
|
555
|
+
super
|
556
|
+
end
|
557
|
+
end
|
558
|
+
```
|
559
|
+
|
560
|
+
#### Generating Tables and Models
|
561
|
+
|
562
|
+
Deimos provides a generator that takes an existing schema and generates a
|
563
|
+
database table based on its fields. By default, any complex sub-types (such as
|
564
|
+
records or arrays) are turned into JSON (if supported) or string columns.
|
565
|
+
|
566
|
+
Before running this migration, you must first copy the schema into your repo
|
567
|
+
in the correct path (in the example above, you would need to have a file
|
568
|
+
`{SCHEMA_ROOT}/com/my-namespace/MySchema.avsc`).
|
569
|
+
|
570
|
+
To generate a model and migration, run the following:
|
571
|
+
|
572
|
+
rails g deimos:active_record TABLE_NAME FULL_SCHEMA_NAME
|
573
|
+
|
574
|
+
Example:
|
575
|
+
|
576
|
+
rails g deimos:active_record my_table com.my-namespace.MySchema
|
577
|
+
|
578
|
+
...would generate:
|
579
|
+
|
580
|
+
db/migrate/1234_create_my_table.rb
|
581
|
+
app/models/my_table.rb
|
582
|
+
|
583
|
+
#### Batch Consumers
|
584
|
+
|
585
|
+
Deimos also provides a batch consumption mode for `ActiveRecordConsumer` which
|
586
|
+
processes groups of messages at once using the ActiveRecord backend.
|
587
|
+
|
588
|
+
Batch ActiveRecord consumers make use of the
|
589
|
+
[activerecord-import](https://github.com/zdennis/activerecord-import) gem to insert
|
590
|
+
or update multiple records in bulk SQL statements. This reduces processing
|
591
|
+
time at the cost of skipping ActiveRecord callbacks for individual records.
|
592
|
+
Deleted records (tombstones) are grouped into `delete_all` calls and thus also
|
593
|
+
skip `destroy` callbacks.
|
594
|
+
|
595
|
+
Batch consumption is used when the `delivery` setting for your consumer is set to `inline_batch`.
|
596
|
+
|
597
|
+
**Note**: Currently, batch consumption supports only primary keys as identifiers out of the box. See
|
598
|
+
[the specs](spec/active_record_batch_consumer_spec.rb) for an example of how to use compound keys.
|
599
|
+
|
600
|
+
By default, batches will be compacted before processing, i.e. only the last
|
601
|
+
message for each unique key in a batch will actually be processed. To change
|
602
|
+
this behaviour, call `compacted false` inside of your consumer definition.
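As a rough illustration of what compaction means here, the plain-Ruby sketch below keeps only the last message for each key in a batch. It is not Deimos's internal implementation: the `Message` struct and the `compact_batch` helper are hypothetical names introduced only for this example.

```ruby
# Hypothetical stand-in for a decoded Kafka message; not a Deimos class.
Message = Struct.new(:key, :payload)

# Keep only the last message seen for each key (illustrative helper).
def compact_batch(messages)
  # Reverse so that `uniq` keeps the last occurrence of each key, then
  # reverse again to restore the original relative order.
  messages.reverse.uniq(&:key).reverse
end

batch = [
  Message.new(1, { 'name' => 'a' }),
  Message.new(2, { 'name' => 'b' }),
  Message.new(1, { 'name' => 'c' })
]

compacted = compact_batch(batch)
compacted.map(&:key) # => [2, 1]
```

The earlier message for key `1` is dropped, so only its final state (`'c'`) would be written.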
|
603
|
+
|
604
|
+
A sample batch consumer would look as follows:
|
605
|
+
|
606
|
+
```ruby
|
607
|
+
class MyConsumer < Deimos::ActiveRecordConsumer
|
608
|
+
schema 'MySchema'
|
609
|
+
key_config field: 'my_field'
|
610
|
+
record_class Widget
|
611
|
+
|
612
|
+
# Controls whether the batch is compacted before consuming.
|
613
|
+
# If true, only the last message for each unique key in a batch will be
|
614
|
+
# processed.
|
615
|
+
# If false, messages will be grouped into "slices" of independent keys
|
616
|
+
# and each slice will be imported separately.
|
617
|
+
#
|
618
|
+
# compacted false
|
619
|
+
|
620
|
+
|
621
|
+
# Optional override of the default behavior, which is to call `delete_all`
|
622
|
+
# on the associated records - e.g. you can replace this with setting a deleted
|
623
|
+
# flag on the record.
|
624
|
+
def remove_records(records)
|
625
|
+
super
|
626
|
+
end
|
627
|
+
|
628
|
+
# Optional override to change the attributes of the record before they
|
629
|
+
# are saved.
|
630
|
+
def record_attributes(payload)
|
631
|
+
super.merge(:some_field => 'some_value')
|
632
|
+
end
|
633
|
+
end
|
634
|
+
```
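The "slices" mentioned in the comments above can be pictured with a small plain-Ruby sketch. This is only one plausible way to split a batch so that no slice contains the same key twice; the `Message` struct and `slice_batch` helper are hypothetical and are not taken from the Deimos source.

```ruby
# Hypothetical stand-in for a decoded Kafka message; not a Deimos class.
Message = Struct.new(:key, :payload)

# Split a batch into slices in which every key appears at most once.
# The n-th message for a given key lands in the n-th slice, so per-key
# ordering is preserved when the slices are imported one after another.
def slice_batch(messages)
  by_key = messages.group_by(&:key)
  depth = by_key.values.map(&:length).max || 0
  Array.new(depth) do |i|
    by_key.values.map { |group| group[i] }.compact
  end
end

batch = [
  Message.new(1, 'a'),
  Message.new(2, 'b'),
  Message.new(1, 'c')
]

slices = slice_batch(batch)
slices.map { |slice| slice.map(&:payload) } # => [["a", "b"], ["c"]]
```

Each slice could then be handed to a bulk import on its own, which is why uncompacted batches cost more statements than compacted ones.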
|
635
|
+
|
636
|
+
## Database Poller
|
637
|
+
|
638
|
+
Another method of fetching updates from the database to Kafka is by polling
|
639
|
+
the database (a process popularized by [Kafka Connect](https://docs.confluent.io/current/connect/index.html)).
|
640
|
+
Deimos provides a database poller, which allows you to use the same pattern but
|
641
|
+
with all the flexibility of real Ruby code, and the added advantage of having
|
642
|
+
a single consistent framework to talk to Kafka.
|
643
|
+
|
644
|
+
One of the disadvantages of polling the database is that it can't detect deletions.
|
645
|
+
You can work around this by configuring a mixin to send messages *only* on deletion,
|
646
|
+
and use the poller to handle all other updates. You can reuse the same producer
|
647
|
+
for both cases to handle joins, changes/mappings, business logic, etc.
|
648
|
+
|
649
|
+
To enable the poller, generate the migration:
|
650
|
+
|
651
|
+
```ruby
|
652
|
+
rails g deimos:db_poller
|
653
|
+
```
|
654
|
+
|
655
|
+
Run the migration:
|
656
|
+
|
657
|
+
```ruby
|
658
|
+
rails db:migrate
|
659
|
+
```
|
660
|
+
|
661
|
+
Add the following configuration:
|
662
|
+
|
663
|
+
```ruby
|
664
|
+
Deimos.configure do
|
665
|
+
db_poller do
|
666
|
+
producer_class 'MyProducer' # an ActiveRecordProducer
|
667
|
+
end
|
668
|
+
db_poller do
|
669
|
+
producer_class 'MyOtherProducer'
|
670
|
+
run_every 2.minutes
|
671
|
+
delay_time 5.seconds # to allow for transactions to finish
|
672
|
+
full_table true # if set, dump the entire table every run; use for small tables
|
673
|
+
end
|
674
|
+
end
|
675
|
+
```
|
676
|
+
|
677
|
+
All the information around connecting and querying the database lives in the
|
678
|
+
producer itself, so you don't need to write any additional code. You can
|
679
|
+
define one additional method on the producer:
|
680
|
+
|
681
|
+
```ruby
|
682
|
+
class MyProducer < Deimos::ActiveRecordProducer
|
683
|
+
...
|
684
|
+
def poll_query(time_from:, time_to:, column_name:, min_id:)
|
685
|
+
# Default is to use the timestamp `column_name` to find all records
|
686
|
+
# between time_from and time_to, or records where that column is equal to
|
687
|
+
# `time_from` but its ID is greater than `min_id`. This is called
|
688
|
+
# successively as the DB is polled to ensure even if a batch ends in the
|
689
|
+
# middle of a timestamp, we won't miss any records.
|
690
|
+
# You can override or change this behavior if necessary.
|
691
|
+
end
|
557
692
|
end
|
558
693
|
```
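To make the default selection rule described in the comment above more concrete, here is a plain-Ruby sketch that applies it to in-memory rows. The `Row` struct and `rows_to_poll` helper are hypothetical, and the exact boundary handling (exclusive lower bound, inclusive upper bound) is an assumption rather than something stated in the README.

```ruby
# Hypothetical in-memory row; a real producer would query an ActiveRecord model.
Row = Struct.new(:id, :updated_at)

# Select rows that a poll covering (time_from, time_to] should pick up,
# plus rows sharing the time_from timestamp that have not been sent yet.
def rows_to_poll(rows, time_from:, time_to:, min_id:)
  rows.select do |row|
    same_timestamp_later_id = row.updated_at == time_from && row.id > min_id
    strictly_newer = row.updated_at > time_from
    (same_timestamp_later_id || strictly_newer) && row.updated_at <= time_to
  end
end

t0 = Time.utc(2020, 5, 12, 12, 0, 0)
rows = [
  Row.new(1, t0),       # same timestamp, id not above min_id: skipped
  Row.new(2, t0),       # same timestamp, higher id: picked up
  Row.new(3, t0 + 30),  # inside the window: picked up
  Row.new(4, t0 + 120)  # after time_to: left for a later poll
]

picked = rows_to_poll(rows, time_from: t0, time_to: t0 + 60, min_id: 1)
picked.map(&:id) # => [2, 3]
```

The `min_id` check is what prevents rows from being missed or resent when a previous batch stopped partway through a single timestamp.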
|
559
694
|
|
695
|
+
To run the DB poller:
|
696
|
+
|
697
|
+
rake deimos:db_poller
|
698
|
+
|
699
|
+
Note that the DB poller creates one thread per configured poller, and is
|
700
|
+
currently designed *not* to be scaled out - i.e. it assumes you will only
|
701
|
+
have one process running at a time. If a particular poll takes longer than
|
702
|
+
the poll interval (e.g. the interval is set at 1 minute but the poll takes 75 seconds),
|
703
|
+
the next poll will begin immediately following the first one completing.
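The timing rule in the paragraph above amounts to a simple calculation, sketched here in plain Ruby. The helper name `seconds_until_next_poll` is hypothetical and only illustrates the arithmetic; it is not part of Deimos.

```ruby
# Seconds to wait before starting the next poll, given the configured
# interval and how long the previous poll took (both in seconds).
def seconds_until_next_poll(run_every:, elapsed:)
  [run_every - elapsed, 0].max
end

seconds_until_next_poll(run_every: 60, elapsed: 20) # => 40
seconds_until_next_poll(run_every: 60, elapsed: 75) # => 0
```

A slow poll therefore never causes overlapping runs within one poller; it only delays the schedule.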
|
704
|
+
|
560
705
|
## Running consumers
|
561
706
|
|
562
707
|
Deimos includes a rake task. Once it's in your gemfile, just run
|
@@ -783,6 +928,9 @@ Deimos::Utils::InlineConsumer.get_messages_for(
|
|
783
928
|
|
784
929
|
Bug reports and pull requests are welcome on GitHub at https://github.com/flipp-oss/deimos .
|
785
930
|
|
931
|
+
We have more information on the [internal architecture](docs/ARCHITECTURE.md) of Deimos
|
932
|
+
for contributors!
|
933
|
+
|
786
934
|
### Linting
|
787
935
|
|
788
936
|
Deimos uses Rubocop to lint the code. Please run Rubocop on your code
|
data/Rakefile
CHANGED
data/deimos-ruby.gemspec
CHANGED
@@ -21,11 +21,12 @@ Gem::Specification.new do |spec|
|
|
21
21
|
spec.add_runtime_dependency('avro_turf', '~> 0.11')
|
22
22
|
spec.add_runtime_dependency('phobos', '~> 1.9')
|
23
23
|
spec.add_runtime_dependency('ruby-kafka', '~> 0.7')
|
24
|
+
spec.add_runtime_dependency('sigurd', '0.0.1')
|
24
25
|
|
25
|
-
spec.add_development_dependency('activerecord', '~>
|
26
|
+
spec.add_development_dependency('activerecord', '~> 6')
|
26
27
|
spec.add_development_dependency('activerecord-import')
|
27
28
|
spec.add_development_dependency('avro', '~> 1.9')
|
28
|
-
spec.add_development_dependency('
|
29
|
+
spec.add_development_dependency('database_cleaner', '~> 1.7')
|
29
30
|
spec.add_development_dependency('ddtrace', '~> 0.11')
|
30
31
|
spec.add_development_dependency('dogstatsd-ruby', '~> 4.2')
|
31
32
|
spec.add_development_dependency('guard', '~> 2')
|
@@ -33,7 +34,7 @@ Gem::Specification.new do |spec|
|
|
33
34
|
spec.add_development_dependency('guard-rubocop', '~> 1')
|
34
35
|
spec.add_development_dependency('mysql2', '~> 0.5')
|
35
36
|
spec.add_development_dependency('pg', '~> 1.1')
|
36
|
-
spec.add_development_dependency('rails', '~>
|
37
|
+
spec.add_development_dependency('rails', '~> 6')
|
37
38
|
spec.add_development_dependency('rake', '~> 13')
|
38
39
|
spec.add_development_dependency('rspec', '~> 3')
|
39
40
|
spec.add_development_dependency('rspec_junit_formatter', '~>0.3')
|
data/docs/ARCHITECTURE.md
ADDED
@@ -0,0 +1,144 @@
|
|
1
|
+
# Deimos Architecture
|
2
|
+
|
3
|
+
Deimos is the third of three libraries that add functionality on top of each
|
4
|
+
other:
|
5
|
+
|
6
|
+
* [RubyKafka](https://github.com/zendesk/ruby-kafka) is the low-level Kafka
|
7
|
+
client, providing APIs for producers, consumers and the client as a whole.
|
8
|
+
* [Phobos](https://github.com/phobos/phobos) is a lightweight wrapper on top
|
9
|
+
of RubyKafka that provides threaded consumers, a simpler way to write
|
10
|
+
producers, and lifecycle management.
|
11
|
+
* [Deimos](https://github.com/flipp-oss/deimos/) is a full-featured framework
|
12
|
+
using Phobos as its base which provides schema integration (e.g. Avro),
|
13
|
+
database integration, metrics, tracing, test helpers and other utilities.
|
14
|
+
|
15
|
+
## Folder structure
|
16
|
+
|
17
|
+
As of May 12, 2020, the following are the important files for understanding how
|
18
|
+
Deimos fits together:
|
19
|
+
* `lib/generators`: Generators to generate database migrations, e.g.
|
20
|
+
for the DB Poller and DB Producer features.
|
21
|
+
* `lib/tasks`: Rake tasks for starting consumers, DB Pollers, etc.
|
22
|
+
* `lib/deimos`: Main Deimos code.
|
23
|
+
* `lib/deimos/deimos.rb`: The bootstrap / startup code for Deimos. Also provides
|
24
|
+
some global convenience methods and (for legacy purposes) the way to
|
25
|
+
start the DB Producer.
|
26
|
+
* `lib/deimos/backends`: The different plug-in producer backends - e.g. produce
|
27
|
+
directly to Kafka, use the DB backend, etc.
|
28
|
+
* `lib/deimos/schema_backends`: The different plug-in schema handlers, such
|
29
|
+
as the various flavors of Avro (with/without schema registry etc.)
|
30
|
+
* `lib/deimos/metrics`: The different plug-in metrics providers, e.g. Datadog.
|
31
|
+
* `lib/deimos/tracing`: The different plug-in tracing providers, e.g. Datadog.
|
32
|
+
* `lib/deimos/utils`: Utility classes for things not directly related to
|
33
|
+
producing and consuming, such as the DB Poller, DB Producer, lag reporter, etc.
|
34
|
+
* `lib/deimos/config`: Classes related to configuring Deimos.
|
35
|
+
* `lib/deimos/monkey_patches`: Monkey patches to existing libraries. These
|
36
|
+
should be removed in a future update.
|
37
|
+
|
38
|
+
## Features
|
39
|
+
|
40
|
+
### Producers and Consumers
|
41
|
+
|
42
|
+
Both producers and consumers include the `SharedConfig` module, which
|
43
|
+
standardizes configuration like schema settings, topic, keys, etc.
|
44
|
+
|
45
|
+
Consumers come in two flavors: `Consumer` and `BatchConsumer`. Both include
|
46
|
+
`BaseConsumer` for shared functionality.
|
47
|
+
|
48
|
+
While produced messages go to Kafka by default, literally anything else
|
49
|
+
can happen when your producer calls `produce`, by swapping out the producer
|
50
|
+
_backend_. This is just a file that needs to inherit from `Deimos::Backends::Base`
|
51
|
+
and must implement a single method, `execute`.
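A minimal sketch of the backend idea follows. To keep it runnable without the gem, it defines its own stand-in base class under a hypothetical `Sketch` namespace instead of inheriting from `Deimos::Backends::Base`; the `publish` entry point and the keyword arguments of `execute` are assumptions made for illustration.

```ruby
module Sketch
  module Backends
    # Hypothetical stand-in for a backend base class.
    class Base
      # Entry point a producer would call; delegates to the subclass.
      def self.publish(producer_class:, messages:)
        execute(producer_class: producer_class, messages: messages)
      end

      def self.execute(producer_class:, messages:)
        raise NotImplementedError
      end
    end

    # A backend that stores messages in memory instead of sending them.
    class InMemory < Base
      class << self
        def sent
          @sent ||= []
        end

        def execute(producer_class:, messages:)
          sent.concat(messages)
        end
      end
    end
  end
end

Sketch::Backends::InMemory.publish(producer_class: nil, messages: [{ 'id' => 1 }])
Sketch::Backends::InMemory.sent # => [{"id"=>1}]
```

Swapping the subclass is all it takes to redirect output, which is the same mechanism that lets a database backend replace direct Kafka publishing.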
|
52
|
+
|
53
|
+
Producers have a complex workflow while processing the payload to publish. This
|
54
|
+
is aided by the `Deimos::Message` class (not to be confused with the
|
55
|
+
`KafkaMessage` class, which is an ActiveRecord used by the DB Producer feature,
|
56
|
+
below).
|
57
|
+
|
58
|
+
### Schemas
|
59
|
+
|
60
|
+
Schema backends are used to encode and decode payloads into different formats
|
61
|
+
such as Avro. These are integrated with producers and consumers, as well
|
62
|
+
as test helpers. These are a bit more involved than producer backends, and
|
63
|
+
must define methods such as:
|
64
|
+
* `encode` a payload or key (when encoding a key, for Avro a key schema
|
65
|
+
may be auto-generated)
|
66
|
+
* `decode` a payload or key
|
67
|
+
* `validate` that a payload is correct for encoding
|
68
|
+
* `coerce` a payload into the given schema (e.g. turn ints into strings)
|
69
|
+
* Get a list of `schema_fields` in the configured schema, used when interacting
|
70
|
+
with ActiveRecord
|
71
|
+
* Define a `mock` backend when the given backend is used. This is used
|
72
|
+
during testing. Typically mock backends will validate values but not
|
73
|
+
actually encode/decode them.
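The responsibilities listed above can be sketched with a toy, mock-style backend in plain Ruby. The class name `SketchSchemaBackend`, its constructor and the `:int`/`:string` type tags are hypothetical; a real backend would inherit from the Deimos schema backend base class, whose exact method signatures are not shown in this document.

```ruby
# Toy schema backend: validates and coerces, but does not serialize.
class SketchSchemaBackend
  # fields maps a field name to :int or :string,
  # e.g. { 'id' => :int, 'name' => :string }
  def initialize(fields)
    @fields = fields
  end

  # Names of the fields in the configured schema.
  def schema_fields
    @fields.keys
  end

  # Convert values to the types the schema expects; unknown keys are dropped.
  def coerce(payload)
    @fields.each_with_object({}) do |(name, type), result|
      next unless payload.key?(name)

      value = payload[name]
      result[name] = type == :string ? value.to_s : Integer(value)
    end
  end

  # Raise if any field is missing or has the wrong type.
  def validate(payload)
    @fields.each do |name, type|
      expected = type == :string ? String : Integer
      raise ArgumentError, "#{name} must be a #{expected}" unless payload[name].is_a?(expected)
    end
    true
  end

  # Mock-style encode: validate, then return the payload unchanged.
  def encode(payload)
    validate(payload)
    payload
  end

  # Mock-style decode: nothing was serialized, so return the input.
  def decode(payload)
    payload
  end
end

backend = SketchSchemaBackend.new({ 'id' => :int, 'name' => :string })
coerced = backend.coerce({ 'id' => '5', 'name' => 42 })
encoded = backend.encode(coerced)
```

Here `coerce` turns the string `'5'` into an integer and the integer `42` into a string, after which `encode` accepts the payload.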
|
74
|
+
|
75
|
+
### Configuration
|
76
|
+
|
77
|
+
Deimos has its own `Configurable` module that makes heavy use of `method_missing`
|
78
|
+
to provide a very succinct but powerful configuration format (including
|
79
|
+
default values, procs, print out as hash, reset, etc.). It also
|
80
|
+
allows for multiple blocks to define different objects of the same type
|
81
|
+
(like producers, consumers, pollers etc.).
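For readers unfamiliar with the technique, the following plain-Ruby sketch shows how `method_missing` can drive a block-style configuration DSL with defaults. It is a simplified illustration and not the actual `Configurable` module; the `SketchConfig` class and its settings are hypothetical.

```ruby
# Tiny configuration object: known settings can be read or written by
# calling a method with the setting's name.
class SketchConfig
  def initialize(defaults = {})
    @settings = defaults.dup
  end

  # `run_every 120` sets a value, `run_every` reads it back.
  def method_missing(name, *args)
    key = name.to_s.chomp('=').to_sym
    return super unless @settings.key?(key)
    return @settings[key] if args.empty?

    @settings[key] = args.first
  end

  def respond_to_missing?(name, include_private = false)
    @settings.key?(name.to_s.chomp('=').to_sym) || super
  end

  # Evaluate the block with this object as `self`, so bare setting names
  # inside the block are routed through `method_missing`.
  def configure(&block)
    instance_eval(&block)
    self
  end

  # Print-out as a hash.
  def to_h
    @settings.dup
  end
end

config = SketchConfig.new(run_every: 60, full_table: false)
config.configure do
  run_every 120
end

config.run_every # => 120
config.to_h      # => {:run_every=>120, :full_table=>false}
```

Unknown names still raise `NoMethodError`, which catches typos in setting names early.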
|
82
|
+
|
83
|
+
The configuration definition for Deimos is in `config/configuration.rb`. In
|
84
|
+
addition, there are methods in `config/phobos_config.rb` which translate to/from
|
85
|
+
the Phobos configuration format and support the old `phobos.yml` method
|
86
|
+
of configuration.
|
87
|
+
|
88
|
+
### Metrics and Tracing
|
89
|
+
|
90
|
+
These are simpler than other plugins and must implement the expected methods
|
91
|
+
(`increment`, `gauge`, `histogram` and `time` for metrics, and `start`, `finish`
|
92
|
+
and `set_error` for tracing). These are used primarily in producers and consumers.
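As an illustration of the metrics half of this interface, here is a sketch of an in-memory provider that implements the four methods named above. The class name, the argument lists and the `:by` option are assumptions made for the example; a real provider would subclass the Deimos metrics provider and match its signatures.

```ruby
# In-memory metrics provider sketch, handy for tests or local debugging.
class InMemoryMetrics
  attr_reader :counters, :gauges, :histograms

  def initialize
    @counters = Hash.new(0)
    @gauges = {}
    @histograms = Hash.new { |hash, key| hash[key] = [] }
  end

  # Add to a counter (by 1 unless the hypothetical :by option is given).
  def increment(name, options = {})
    @counters[name] += options.fetch(:by, 1)
  end

  # Record the latest value of a gauge.
  def gauge(name, value, _options = {})
    @gauges[name] = value
  end

  # Append a sample to a histogram.
  def histogram(name, value, _options = {})
    @histograms[name] << value
  end

  # Time the block, record the duration in seconds, return the block's value.
  def time(name, options = {})
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    result = yield
    histogram(name, Process.clock_gettime(Process::CLOCK_MONOTONIC) - started, options)
    result
  end
end

metrics = InMemoryMetrics.new
metrics.increment('messages.consumed')
metrics.increment('messages.consumed', { by: 2 })
metrics.gauge('consumer.lag', 15)
answer = metrics.time('handler.duration') { 6 * 7 }
```

After these calls the counter holds 3, the gauge holds 15, and one duration sample has been recorded.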
|
93
|
+
|
94
|
+
### ActiveRecord Integration
|
95
|
+
|
96
|
+
Deimos provides an `ActiveRecordConsumer` and `ActiveRecordProducer`. These are
|
97
|
+
relatively lightweight ways to save data into a database or read it off
|
98
|
+
the database as part of app logic. They use things like the `coerce` method
|
99
|
+
of the schema backends to manage the differences between the given payload
|
100
|
+
and the configured schema for the topic.
|
101
|
+
|
102
|
+
### Database Backend / Database Producer
|
103
|
+
|
104
|
+
This feature (which provides better performance and transaction guarantees)
|
105
|
+
is powered by two components:
|
106
|
+
* The `db` _publish backend_, which saves messages to the database rather
|
107
|
+
than to Kafka;
|
108
|
+
* The `DbProducer` utility, which runs as a separate process, pulls data
|
109
|
+
from the database and sends it to Kafka.
|
110
|
+
|
111
|
+
There is a set of utility classes that power the producer, which are largely
|
112
|
+
copied from Phobos:
|
113
|
+
* `Executor` takes a set of "runnable" things (which implement a `start` and `stop`
|
114
|
+
method), puts them in a thread pool and runs them all concurrently. It
|
115
|
+
manages starting and stopping all threads when necessary.
|
116
|
+
* `SignalHandler` wraps the Executor and handles SIGINT and SIGTERM signals
|
117
|
+
to stop the executor gracefully.
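The "runnable" contract described above can be sketched in a few lines of plain Ruby. This is a simplified illustration, not the Phobos or Deimos implementation: `SketchExecutor` and `Ticker` are hypothetical classes, one thread is used per runnable instead of a real pool, and signal handling is left out.

```ruby
# Runs each runnable's `start` in its own thread and stops them on request.
class SketchExecutor
  def initialize(runnables)
    @runnables = runnables
    @threads = []
  end

  def start
    @threads = @runnables.map { |runnable| Thread.new { runnable.start } }
  end

  # Ask every runnable to stop, then wait for its thread to finish.
  def stop
    @runnables.each(&:stop)
    @threads.each(&:join)
  end
end

# A runnable: `start` blocks until `stop` is called from another thread.
class Ticker
  attr_reader :ticks

  def initialize
    @ticks = 0
    @stopped = false
  end

  def start
    until @stopped
      @ticks += 1
      sleep 0.01
    end
  end

  def stop
    @stopped = true
  end
end

workers = [Ticker.new, Ticker.new]
executor = SketchExecutor.new(workers)
executor.start
sleep 0.1
executor.stop
```

A signal handler in this design would only need to trap SIGINT and SIGTERM and call `executor.stop`.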
|
118
|
+
|
119
|
+
In the case of this feature, the `DbProducer` is the runnable object - it
|
120
|
+
can run several threads at once.
|
121
|
+
|
122
|
+
On the database side, the `ActiveRecord` models that power this feature are:
|
123
|
+
* `KafkaMessage`: The actual message, saved to the database. This message
|
124
|
+
is already encoded by the producer, so it only has to be sent.
|
125
|
+
* `KafkaTopicInfo`: Used for locking a topic so only one producer can work
|
126
|
+
on it at once.
|
127
|
+
|
128
|
+
A Rake task (defined in `deimos.rake`) can be used to start the producer.
|
129
|
+
|
130
|
+
### Database Poller
|
131
|
+
|
132
|
+
This feature (which periodically polls the database to send Kafka messages)
|
133
|
+
primarily uses other aspects of Deimos and hence is relatively small in size.
|
134
|
+
The `DbPoller` class acts as a "runnable" and is used by an Executor (above).
|
135
|
+
The `PollInfo` class is saved to the database to keep track of where each
|
136
|
+
poller is up to.
|
137
|
+
|
138
|
+
A Rake task (defined in `deimos.rake`) can be used to start the pollers.
|
139
|
+
|
140
|
+
### Other Utilities
|
141
|
+
|
142
|
+
The `utils` folder also contains the `LagReporter` (which sends metrics on
|
143
|
+
lag) and the `InlineConsumer`, which can read data from a topic and directly
|
144
|
+
pass it into a handler or save it to memory.
|
data/docs/CONFIGURATION.md
CHANGED
@@ -58,6 +58,10 @@ Deimos.configure do
|
|
58
58
|
namespace 'my.namespace'
|
59
59
|
key_config field: :id
|
60
60
|
|
61
|
+
# Setting to :inline_batch will invoke consume_batch instead of consume
|
62
|
+
# for each batch of messages.
|
63
|
+
delivery :batch
|
64
|
+
|
61
65
|
# If config.schema.path is app/schemas, assumes there is a file in
|
62
66
|
# app/schemas/my/namespace/MyTopicSchema.avsc
|
63
67
|
end
|
@@ -89,6 +93,29 @@ offset_commit_threshold|0|Number of messages that can be processed before their
|
|
89
93
|
heartbeat_interval|10|Interval between heartbeats; must be less than the session window.
|
90
94
|
backoff|`(1000..60_000)`|Range representing the minimum and maximum number of milliseconds to back off after a consumer error.
|
91
95
|
|
96
|
+
## Defining Database Pollers
|
97
|
+
|
98
|
+
These are used when polling the database via `rake deimos:db_poller`. You
|
99
|
+
can create a number of pollers, one per topic.
|
100
|
+
|
101
|
+
```ruby
|
102
|
+
Deimos.configure do
|
103
|
+
db_poller do
|
104
|
+
producer_class 'MyProducer'
|
105
|
+
run_every 2.minutes
|
106
|
+
end
|
107
|
+
end
|
108
|
+
```
|
109
|
+
|
110
|
+
Config name|Default|Description
|
111
|
+
-----------|-------|-----------
|
112
|
+
producer_class|nil|ActiveRecordProducer class to use for sending messages.
|
113
|
+
run_every|60|Amount of time in seconds to wait between runs.
|
114
|
+
timestamp_column|`:updated_at`|Name of the column to query. Remember to add an index to this column!
|
115
|
+
delay_time|2|Amount of time in seconds to wait before picking up records, to allow for transactions to finish.
|
116
|
+
full_table|false|If set to true, do a full table dump to Kafka each run. Good for very small tables.
|
117
|
+
start_from_beginning|true|If false, start from the current time instead of the beginning of time if this is the first time running the poller.
|
118
|
+
|
92
119
|
## Kafka Configuration
|
93
120
|
|
94
121
|
Config name|Default|Description
|