dionysus-rb 0.1.0
- checksums.yaml +7 -0
- data/.circleci/config.yml +61 -0
- data/.github/workflows/ci.yml +77 -0
- data/.gitignore +12 -0
- data/.rspec +3 -0
- data/.rubocop.yml +175 -0
- data/.rubocop_todo.yml +53 -0
- data/CHANGELOG.md +227 -0
- data/Gemfile +10 -0
- data/Gemfile.lock +258 -0
- data/LICENSE.txt +21 -0
- data/README.md +1206 -0
- data/Rakefile +10 -0
- data/assets/logo.svg +51 -0
- data/bin/console +11 -0
- data/bin/karafka_health_check +14 -0
- data/bin/outbox_worker_health_check +12 -0
- data/bin/setup +8 -0
- data/dionysus-rb.gemspec +64 -0
- data/docker-compose.yml +44 -0
- data/lib/dionysus/checks/health_check.rb +50 -0
- data/lib/dionysus/checks.rb +7 -0
- data/lib/dionysus/consumer/batch_events_publisher.rb +33 -0
- data/lib/dionysus/consumer/config.rb +97 -0
- data/lib/dionysus/consumer/deserializer.rb +231 -0
- data/lib/dionysus/consumer/dionysus_event.rb +42 -0
- data/lib/dionysus/consumer/karafka_consumer_generator.rb +56 -0
- data/lib/dionysus/consumer/params_batch_processor.rb +65 -0
- data/lib/dionysus/consumer/params_batch_transformations/remove_duplicates_strategy.rb +54 -0
- data/lib/dionysus/consumer/params_batch_transformations.rb +4 -0
- data/lib/dionysus/consumer/persistor.rb +157 -0
- data/lib/dionysus/consumer/registry.rb +84 -0
- data/lib/dionysus/consumer/synced_data/assign_columns_from_synced_data.rb +27 -0
- data/lib/dionysus/consumer/synced_data/assign_columns_from_synced_data_job.rb +26 -0
- data/lib/dionysus/consumer/synced_data.rb +4 -0
- data/lib/dionysus/consumer/synchronizable_model.rb +93 -0
- data/lib/dionysus/consumer/workers_group.rb +18 -0
- data/lib/dionysus/consumer.rb +36 -0
- data/lib/dionysus/monitor.rb +48 -0
- data/lib/dionysus/producer/base_responder.rb +46 -0
- data/lib/dionysus/producer/config.rb +104 -0
- data/lib/dionysus/producer/deleted_record_serializer.rb +17 -0
- data/lib/dionysus/producer/genesis/performed.rb +11 -0
- data/lib/dionysus/producer/genesis/stream_job.rb +13 -0
- data/lib/dionysus/producer/genesis/streamer/base_job.rb +44 -0
- data/lib/dionysus/producer/genesis/streamer/standard_job.rb +43 -0
- data/lib/dionysus/producer/genesis/streamer.rb +40 -0
- data/lib/dionysus/producer/genesis.rb +62 -0
- data/lib/dionysus/producer/karafka_responder_generator.rb +133 -0
- data/lib/dionysus/producer/key.rb +14 -0
- data/lib/dionysus/producer/model_serializer.rb +105 -0
- data/lib/dionysus/producer/outbox/active_record_publishable.rb +74 -0
- data/lib/dionysus/producer/outbox/datadog_latency_reporter.rb +26 -0
- data/lib/dionysus/producer/outbox/datadog_latency_reporter_job.rb +11 -0
- data/lib/dionysus/producer/outbox/datadog_latency_reporter_scheduler.rb +47 -0
- data/lib/dionysus/producer/outbox/datadog_tracer.rb +32 -0
- data/lib/dionysus/producer/outbox/duplicates_filter.rb +26 -0
- data/lib/dionysus/producer/outbox/event_name.rb +26 -0
- data/lib/dionysus/producer/outbox/health_check.rb +48 -0
- data/lib/dionysus/producer/outbox/latency_tracker.rb +43 -0
- data/lib/dionysus/producer/outbox/model.rb +117 -0
- data/lib/dionysus/producer/outbox/producer.rb +26 -0
- data/lib/dionysus/producer/outbox/publishable.rb +106 -0
- data/lib/dionysus/producer/outbox/publisher.rb +131 -0
- data/lib/dionysus/producer/outbox/records_processor.rb +56 -0
- data/lib/dionysus/producer/outbox/runner.rb +120 -0
- data/lib/dionysus/producer/outbox/tombstone_publisher.rb +22 -0
- data/lib/dionysus/producer/outbox.rb +103 -0
- data/lib/dionysus/producer/partition_key.rb +42 -0
- data/lib/dionysus/producer/registry/validator.rb +32 -0
- data/lib/dionysus/producer/registry.rb +165 -0
- data/lib/dionysus/producer/serializer.rb +52 -0
- data/lib/dionysus/producer/suppressor.rb +18 -0
- data/lib/dionysus/producer.rb +121 -0
- data/lib/dionysus/railtie.rb +9 -0
- data/lib/dionysus/rb/version.rb +5 -0
- data/lib/dionysus/rb.rb +8 -0
- data/lib/dionysus/support/rspec/outbox_publishable.rb +78 -0
- data/lib/dionysus/topic_name.rb +15 -0
- data/lib/dionysus/utils/default_message_filter.rb +25 -0
- data/lib/dionysus/utils/exponential_backoff.rb +7 -0
- data/lib/dionysus/utils/karafka_datadog_listener.rb +20 -0
- data/lib/dionysus/utils/karafka_sentry_listener.rb +9 -0
- data/lib/dionysus/utils/null_error_handler.rb +6 -0
- data/lib/dionysus/utils/null_event_bus.rb +5 -0
- data/lib/dionysus/utils/null_hermes_event_producer.rb +5 -0
- data/lib/dionysus/utils/null_instrumenter.rb +7 -0
- data/lib/dionysus/utils/null_lock_client.rb +13 -0
- data/lib/dionysus/utils/null_model_factory.rb +5 -0
- data/lib/dionysus/utils/null_mutex_provider.rb +7 -0
- data/lib/dionysus/utils/null_retry_provider.rb +7 -0
- data/lib/dionysus/utils/null_tracer.rb +5 -0
- data/lib/dionysus/utils/null_transaction_provider.rb +15 -0
- data/lib/dionysus/utils/sidekiq_batched_job_distributor.rb +24 -0
- data/lib/dionysus/utils.rb +6 -0
- data/lib/dionysus/version.rb +7 -0
- data/lib/dionysus-rb.rb +3 -0
- data/lib/dionysus.rb +133 -0
- data/lib/tasks/dionysus.rake +18 -0
- data/log/development.log +0 -0
- data/sig/dionysus/rb.rbs +6 -0
- metadata +585 -0
data/README.md
ADDED
@@ -0,0 +1,1206 @@
# Dionysus::Rb

![Dionysus](assets/logo.svg)

`Dionysus` - a framework on top of [Karafka](http://github.com/karafka/karafka) for Change Data Capture on the domain model level.

In distributed systems, transferring data between applications is often a challenge. There are multiple ways of doing this, especially when using Kafka. There is a good chance that you are familiar with the [Change Data Capture](https://www.confluent.io/learn/change-data-capture/) pattern, often applied to relational databases such as PostgreSQL, which is a way of extracting row-level changes in real time. In that case, CDC focuses on INSERTs, UPDATEs and DELETEs of rows. If you are familiar with logical replication, this concept should ring a bell. When exploring Kafka, you might have heard of [Debezium](https://debezium.io), which makes CDC via Kafka simple.

However, there is one problem with this kind of CDC - it is all about row-level changes. This could work for simple cases, but in more complex domains there is a good chance that a database row is not a great representation of a domain model. This is especially true if you apply the Domain-Driven Design methodology and what you would like to replicate is an Aggregate that could be composed of several rows coming from different tables.

Fortunately, mighty Dionysus himself, powered by wine from [Karafka](https://karafka.io/docs/), has got your back - Dionysus can handle CDC on the domain model level. On the producer side, it will publish `model_created`, `model_updated` and `model_destroyed` events with a snapshot of a given model using custom serializers, also handling dependencies and computed properties (where the value of an attribute depends on the value from another model), with a possibility of using the [transactional outbox pattern](https://karolgalanciak.com/blog/2022/11/12/the-inherent-unreliability-of-after_commit-callback-and-most-service-objects-implementation/) to ensure that everything gets published. On the consumer side, it will make sure that the snapshots of models are persisted and that you can react to all changes not only via ActiveRecord callbacks but also via an event bus. And all of this is achievable merely via a couple of config options and a powerful DSL!


## Installation
Install the gem and add to the application's Gemfile by executing:

    $ bundle add "dionysus-rb"

If bundler is not being used to manage dependencies, install the gem by executing:

    $ gem install "dionysus-rb"

## Usage

Please read [this article first](https://www.smily.com/engineering/integration-patterns-for-distributed-architecture-how-we-use-kafka-in-smily-and-why) to understand the context of how this gem was built. Also, it has only recently been made public, so some parts of the docs might require clarification. If you find any section like that, don't hesitate to submit an issue.

### TODO - update when the article is published.
Also, [read this article], which is an introduction to the gem.


Any application can be both a consumer and a producer of Karafka events, so let's take a look at how to handle the configuration for both scenarios.

### Producer

First, you need to define a file `karafka.rb` with content like this:

``` rb
# frozen_string_literal: true

Dionysus.initialize_application!(
  environment: ENV["RAILS_ENV"],
  seed_brokers: ENV.fetch("DIONYSUS_SEED_BROKER").split(";"),
  client_id: "NAME_OF_THE_APP",
  logger: Rails.logger
)
```

`DIONYSUS_SEED_BROKER` is a string containing all the brokers separated by a *semicolon*, e.g. `localhost:9092`. The protocol should not be included.

This is going to handle the initialization process.

If you are migrating from the gem prior to making `dionysus-rb` public, most likely you will need to also provide `consumer_group_prefix` for backwards compatibility:

``` rb
Dionysus.initialize_application!(
  environment: ENV["RAILS_ENV"],
  seed_brokers: ENV.fetch("DIONYSUS_SEED_BROKER").split(";"),
  client_id: "NAME_OF_THE_APP",
  logger: Rails.logger,
  consumer_group_prefix: "prometheus_consumer_group_for"
)
```

By default, the name of the consumer group will be "NAME_OF_THE_APP_dionysus_consumer_group_for_NAME_OF_THE_APP" where `dionysus_consumer_group_for` is the `consumer_group_prefix`.


And define a `dionysus.rb` initializer with your Kafka topics:

``` rb config/initializers/dionysus.rb
Rails.application.config.to_prepare do
  Karafka::App.setup do |config|
    config.producer = ::WaterDrop::Producer.new do |producer_config|
      producer_config.kafka = {
        'bootstrap.servers': 'localhost:9092', # this needs to be a comma-separated list of brokers
        'request.required.acks': 1,
        "client.id": "id_of_the_producer_goes_here"
      }
      producer_config.id = "id_of_the_producer_goes_here"
      producer_config.deliver = true
    end
  end

  Dionysus::Producer.declare do
    namespace :v3 do # the name of the namespace is supposed to group topics that use the same serializer, think of it as API versioning. The name of the namespace is going to be included in the topics' names, e.g. `v3_accounts`
      serializer YourCustomSerializerClass

      topic :accounts, genesis_replica: true, partition_key: :id do # Refer to the Genesis section for more details about this option, by default it's false
        publish Account
      end

      topic :rentals, partition_key: :account_id do # partition key as a name of the attribute
        publish Availability
        publish Bathroom
        publish Bedroom
        publish Rental
      end

      bookings_topic_partition_key_resolver = ->(resource) do # a partition key can also be a lambda
        resource.id.to_s if resource.class.name == "Booking"
        resource.rental_id.to_s if resource.respond_to?(:rental_id)
      end

      topic :bookings, partition_key: bookings_topic_partition_key_resolver do
        publish Booking, with: [BookingsFee, BookingsTax]
      end

      topic :los_records, partition_key: :rental_id do
        publish LosRecord
      end
    end
  end
end
```


There are a couple of important things to understand here.
- A namespace might be used for versioning so that you can have e.g. the `v3` and `v4` formats working at the same time, with consumers consuming from different ones as they need. The namespace is a part of the topic name; in the example above the following topics are declared: `v3_accounts`, `v3_rentals`, `v3_bookings`, `v3_los_records`. Most likely you will need to create them manually in the production environment, depending on the Kafka cluster configuration.
- `topic` is a declaration of Kafka `topics`. To understand more about topics and what some rules of thumb are when designing them, please [read this article](https://www.smily.com/engineering/integration-patterns-for-distributed-architecture-intro-to-kafka).
- Some entities might have attributes depending on other entities (computed properties) or might need to be always published together (kind of like a Domain-Driven Design Aggregate). For these cases, use the `with` directive, which is an equivalent of sideloading from REST APIs. E.g., Booking could have a `final_price` attribute that depends on other models, like BookingsFee or BookingsTax, which contribute to that price. Publishing these items separately, e.g. first BookingsFee and then Booking with the changed final price, might lead to inconsistency on the consumer side where the `final_price` value doesn't match the value that would be obtained by summing all elements of the price. That's why all these records need to be published together, and that's what the `with` option is supposed to cover: `publish Booking, with: [BookingsFee, BookingsTax]`. Thanks to that declaration, any update to Booking or a change to its dependencies (BookingsFee, BookingsTax), such as creation/update/deletion, will result in publishing a `booking_updated` event.


#### Serializer

Serializer is a class that needs to implement a `serialize` method with the following signature:

``` rb
class YourCustomSerializerClass
  def self.serialize(record_or_records, dependencies:)
    # do stuff here
  end
end
```

`record_or_records` is either a single record or an array of records, and dependencies are what is defined via the `with` option; in most cases this is going to be an empty array, but in cases like Bookings in the example above it is going to be an array of dependencies to be sideloaded. The job of the serializer is to figure out how to find the right serializer (kind of like a factory) for a given model, how to sideload the dependencies, and to return an array of serialized payloads (it could be a one-element array when passing a single record, but it needs to be an array).

The best way to implement the serialization part would be to create a `YourCustomSerializerClass` class inheriting from `Dionysus::Producer::Serializer`. Then, you would need to implement just a single method: `infer_serializer`:

``` rb
class YourCustomSerializerClass < Dionysus::Producer::Serializer
  def infer_serializer
    somehow_figure_out_the_right_serializer_for_the_model_klass(model_klass)
  end
end
```

The `record` method will be available inside the class, so that's how you can get a serializer for a specific model. And to implement the actual serializer for the model, you can create classes inheriting from `ModelSerializer`:

``` rb
class SomeModelSerializer < Dionysus::Producer::ModelSerializer
  attributes :name, :some_other_attribute

  has_one :account
  has_many :related_records
end
```

The declared attributes/relationships will be delegated to the given record by default, although you can override these methods.

To resolve serializers for declared relationships, `YourCustomSerializerClass` will also be used.

When testing serializers, you can just limit the scope of the test to the `as_json` method:

``` rb
SomeModelSerializer.new(record_to_serialize, include: array_of_underscored_dependencies_to_be_sideloaded, context_serializer: YourCustomSerializerClass).as_json
```

You can also try testing using `YourCustomSerializerClass`, so that you can also verify that the `infer_serializer` method works as expected:

``` rb
YourCustomSerializerClass.serialize(record_or_records, dependencies: dependencies)
```


### Config options

#### Bypassing serializers


For a large volume of data, sometimes it doesn't make sense to use serializers to serialize records individually for certain use cases. One example would be deleting records for models that are soft-deletable. If the only thing you expect your consumers to do is to delete records by ID for a given model and the amount of data is huge, you will be better off sending a single event with a lot of IDs instead of sending multiple events for every record individually while performing the full serialization. This is usually combined with the `import` option on the consumer side to make the consuming even more efficient.


Here is a complete example of a use case where serialization is bypassed using the `serialize: false` option, with some extras that will be useful when thinking about the `import` option on the consumer side. Let's consider a hypothetical `Record` model:


``` rb
Dionysus::Producer.responders_for(Record).each do |responder|
  partition_key = account.id.to_s
  key = "RecordsCollection:#{account.id}"
  created_records = Record.for_accounts(account).visible
  canceled_records = Record.for_accounts(account).soft_deleted

  message = [].tap do |current_message|
    current_message << ["record_created", created_records.to_a, {}]
    if canceled_records.any?
      current_message << ["record_destroyed", canceled_records.map { |record| RecordDionysusDTO.new(record) }, { serialize: false }]
    end
  end

  result = responder.call(message, partition_key: partition_key, key: key)
end

class RecordDionysusDTO < SimpleDelegator
  def as_json
    {
      id: id
    }
  end
end
```

That way, the serializer will not load any relationships, etc. - it will just serialize IDs for the `record_destroyed` event. As a bonus, it covers the case where it might be useful not to deal with records one by one but with a huge batch at once, and then use something like [activerecord-import](https://github.com/zdennis/activerecord-import) on the consumer side.

#### Responders

Prior to Karafka 2.0, there used to be a concept of Responders that were responsible for publishing messages. This concept was dropped in Karafka 2.0, but a similar concept is still used in Dionysus as its predecessor was built on top of Karafka 1.x.

Responders implement a `call` method that takes `message` as a positional argument and `partition_key` and message `key` as keyword arguments. Most likely you are not going to need to use this knowledge, but in case you need to do something really custom, here is an API to get responders:

- `Dionysus::Producer.responders_for(model_klass)` - get all responders for a given model class, regardless of the topic
- `Dionysus::Producer.responders_for_model_for_topic(model_klass, topic)` - get all responders for a given model class, for a given topic
- `Dionysus::Producer.responders_for_dependency_parent(model_klass)` - get all parent-responders for a given model class that is a dependency (when using the `with` directive), regardless of a topic
- `Dionysus::Producer.responders_for_dependency_parent(model_klass, topic)` - get all parent-responders for a given model class that is a dependency (when using the `with` directive), for a given topic

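For example, to grab a responder for a model and publish a message through it manually (a minimal sketch based on the `Rental` examples above - the message triple format mirrors the example in the Bypassing serializers section, while the `key` format and the `rental_id` variable are arbitrary choices for illustration):

``` rb
rental = Rental.find(rental_id) # hypothetical record
responder = Dionysus::Producer.responders_for(Rental).first

message = [["rental_updated", [rental], {}]]
responder.call(message, partition_key: rental.account_id.to_s, key: "Rental:#{rental.id}")
```
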
#### Instrumentation & Event Bus
|
228
|
+
|
229
|
+
|
230
|
+
Instrumenter - an object for instrumentation expecting the following interface (this is the default class):
|
231
|
+
|
232
|
+
``` rb
|
233
|
+
class Dionysus::Utils::NullInstrumenter
|
234
|
+
def self.instrument(name, payload = {})
|
235
|
+
yield
|
236
|
+
end
|
237
|
+
end
|
238
|
+
```
|
239
|
+
|
240
|
+
|
241
|
+
Event Bus is useful if you want to react to some events, with the following interface ((this is the default class)):
|
242
|
+
|
243
|
+
``` rb
|
244
|
+
class Dionysus::Utils::NullEventBus
|
245
|
+
def self.publish(name, payload)
|
246
|
+
end
|
247
|
+
end
|
248
|
+
```
|
249
|
+
|
250
|
+
For the instrumentation, the entire publishing logic is wrapped with the following block: `instrumenter.instrument("Dionysus.respond.#{responder_class_name}")`.
|
251
|
+
|
252
|
+
For the event_bus, the event it published after getting a success response from Kafka: `event_bus.publish("Dionysus.respond", topic_name: topic_name, message: message, options: final_options)`
|
253
|
+
|
254
|
+
You can configure those dependencies in the initializer:
|
255
|
+
|
256
|
+
``` rb
|
257
|
+
Dionysus::Producer.configure do |config|
|
258
|
+
config.instrumenter = MyInstrumentation
|
259
|
+
config.event_bus = MyEventBusForDionysus
|
260
|
+
end
|
261
|
+
```
|
262
|
+
|
263
|
+
They are not required though; null-object-pattern-based objects are injected by default.
|
264
|
+
|
265
|
+
|
266
|
+
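For illustration, a custom event bus could simply forward these events to `ActiveSupport::Notifications` (a sketch - only the `publish(name, payload)` interface comes from the gem; the class name and the notification prefix are arbitrary):

``` rb
class MyEventBusForDionysus
  # Dionysus only requires a `publish` class method taking the event name and a payload hash.
  def self.publish(name, payload)
    ActiveSupport::Notifications.instrument("dionysus.#{name}", payload)
  end
end
```
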
#### Sentry and Datadog integration

This is applicable to both consumers and producers. For Sentry and Datadog integration, add these 2 lines to your initializer:

``` rb
Karafka.monitor.subscribe(Dionysus::Utils::KarafkaSentryListener)
Karafka.monitor.subscribe(Dionysus::Utils::KarafkaDatadogListener)
```

Don't put these inside the `Rails.application.config.to_prepare do` block.

#### Transactional Outbox Pattern

The typical problem you will experience on the producer's side is the possibility of losing some messages due to the lack of transactional boundaries, as things like publishing events usually happen in an `after_commit` callback.

To prevent that, you can take advantage of the [Transactional Outbox Pattern](https://microservices.io/patterns/data/transactional-outbox.html), which is implemented in this gem.

The idea is simple - store the messages in a temporary table (in the same transaction where creating/updating/deleting the publishable record happens) and then publish them in a separate process and mark these messages as published.

Dionysus also has an extra optimization allowing it to publish both from after-commit callbacks (for performance reasons) and, after a certain delay, from a separate worker that reads data from the transactional outbox table. This covers the cases where some records were not published - they will not be lost, just retried later.

##### Making models publishable

To make ActiveRecord models publishable, you need to make sure that the `Dionysus::Producer::Outbox::ActiveRecordPublishable` module is included in the model. This should be handled automatically by the gem when a model is declared inside `Dionysus::Producer.declare`.

Thanks to that, an outbox record will be created after each create/update/destroy event.

In some cases, you might want to publish update events even after the record is soft-deleted. To do that, you need to override the `dionysus_publish_updates_after_soft_delete?` method:

```rb
def dionysus_publish_updates_after_soft_delete?
  true
end
```


##### Outbox configuration

``` rb
Dionysus::Producer.configure do |config|
  config.database_connection_provider = ActiveRecord::Base # required
  config.transaction_provider = ActiveRecord::Base # required
  config.outbox_model = DionysusOutbox # required
  config.outbox_publishing_batch_size = 100 # not required, defaults to 100
  config.lock_client = Redlock::Client.new([ENV["REDIS_URL"]]) # required if you want to use more than a single worker/more than a single thread per worker, defaults to Dionysus::Producer::Outbox::NullLockClient. Check its interface and the interface of the `redlock` gem. To cut the long story short, when the lock is acquired, a hash with the structure outlined in Dionysus::Producer::Outbox::NullLockClient should be yielded. If the lock is not acquired, nil should be yielded.
  config.lock_expiry_time = 10_000 # not required, defaults to 10_000, in milliseconds
  config.error_handler = Sentry # not required but highly recommended, defaults to Dionysus::Utils::NullErrorHandler. When using Sentry, you will probably want to exclude SignalException: `config.excluded_exceptions += ["SignalException"]`.
  config.soft_delete_column = :deleted_at # defaults to "canceled_at" when not provided
  config.default_partition_key = :some_id # defaults to :account_id when not provided, you can override it per topic when declaring them with the `partition_key` config option. You can pass either a symbol or a lambda taking the resource as the argument.
  config.outbox_worker_sleep_seconds = 1 # defaults to 0.2 seconds when not provided, it's the time interval between each iteration of the outbox worker, which fetches publishable records, publishes them to Kafka and marks them as finished
  config.transactional_outbox_enabled = false # not required, defaults to `true`. Set it to `false` only if you want to disable creating outbox records (which might be useful for the migration period). If you are not sure whether you need this config setting or not, then you probably don't
  config.publish_after_commit = true # not required, defaults to `false`. Check the `Publishing records right after the transaction is committed` section for more details.
  config.outbox_worker_publishing_delay = 5 # not required, defaults to 0, a delay in seconds until the outbox record is considered publishable. Check the `Publishing records right after the transaction is committed` section for more details.
  config.remove_consecutive_duplicates_before_publishing = true # not required, defaults to false. If set to true, consecutive duplicates in the publishable batch will be removed and only one message will be published to a given topic. For example, if for whatever reason there are ten messages in a row for a given topic to publish a `user_updated` event, only the last one will be published. Check `Dionysus::Consumer::ParamsBatchTransformations::RemoveDuplicatesStrategy` for the exact implementation. To verify whether this feature is useful, it's recommended to browse Karafka UI and check the messages in the topics for any obvious duplicates happening often.
  config.observers_inline_maximum_size = 100 # not required, defaults to 1000. This config setting matters in case there is a huge amount of dependent records (observers). If the threshold is exceeded, the observers will be published via the Genesis process so as not to cause issues like blocking the outbox worker.
  config.sidekiq_queue = :default # required, defaults to :dionysus. The queue will be used for the genesis process
end
```

##### DionysusOutbox model

Generate a model for the outbox:

```
rails generate model DionysusOutbox
```

and use the following migration code:

``` rb
create_table(:dionysus_outboxes) do |t|
  t.string "resource_class", null: false
  t.string "resource_id", null: false
  t.string "event_name", null: false
  t.string "topic", null: false
  t.string "partition_key"
  t.datetime "published_at"
  t.datetime "failed_at"
  t.datetime "retry_at"
  t.string "error_class"
  t.string "error_message"
  t.integer "attempts", null: false, default: 0
  t.datetime "created_at", precision: 6, null: false
  t.datetime "updated_at", precision: 6, null: false

  # some of these indexes are not needed, but they are here for convenience when checking stuff in the console or when using tartarus for archiving
  t.index ["topic", "created_at"], name: "index_dionysus_outboxes_publishing_idx", where: "published_at IS NULL"
  t.index ["resource_class", "event_name"], name: "index_dionysus_outboxes_on_resource_class_and_event"
  t.index ["resource_class", "resource_id"], name: "index_dionysus_outboxes_on_resource_class_and_resource_id"
  t.index ["topic"], name: "index_dionysus_outboxes_on_topic"
  t.index ["created_at"], name: "index_dionysus_outboxes_on_created_at"
  t.index ["resource_class", "created_at"], name: "index_dionysus_outboxes_on_resource_class_and_created_at"
  t.index ["resource_class", "published_at"], name: "index_dionysus_outboxes_on_resource_class_and_published_at"
  t.index ["published_at"], name: "index_dionysus_outboxes_on_published_at"
end
```

You also need to include the `Dionysus::Producer::Outbox::Model` module in your model:

``` rb
class DionysusOutbox < ApplicationRecord
  include Dionysus::Producer::Outbox::Model
end
```

For testing publishable models, you can take advantage of the `"Dionysus Transactional Outbox Publishable"` shared behavior. First, you need to require the following file:

``` rb
require "dionysus/support/rspec/outbox_publishable"
```

And then just add `it_behaves_like "Dionysus Transactional Outbox Publishable"` in the models' specs.


##### Running outbox worker

Use the following Rake task:

```
DIONYSUS_THREADS_NUMBER=5 DB_POOL=10 bundle exec rake dionysus:producer
```

If you want to use just a single thread:

```
bundle exec rake dionysus:producer
```

##### Publishing records right after the transaction is committed

When the throughput of outbox records' creation is really high, there is a very good chance that it might take even a few minutes for the workers to publish some records (due to the limited capacity).

In such a case you might consider publishing records right after the transaction is committed. To do so, you need to:

1. Enable publishing globally:

``` rb
Dionysus::Producer.configure do |config|
  config.publish_after_commit = true
end
```

2. Or enable/disable it per model where `Dionysus::Producer::Outbox::ActiveRecordPublishable` is included:

``` rb
class MyModel < ApplicationRecord
  include Dionysus::Producer::Outbox::ActiveRecordPublishable

  private

  def publish_after_commit?
    true
  end
end
```

To avoid double publishing or conflicts between publishing records right after the transaction (from the `after_commit` callback) and publishing them from the outbox worker, it is recommended to add some delay before the outbox records are considered publishable:

``` rb
Dionysus::Producer.configure do |config|
  config.outbox_worker_publishing_delay = 5 # in seconds, defaults to 0
end
```

By default, the records will be considered publishable right away. With that config option, it will take 5 seconds after creation until they are considered publishable.

##### Outbox Publishing Latency Tracking

It's highly recommended to track the latency of publishing outbox records, defined as the difference between the `published_at` and `created_at` timestamps.

``` rb
Dionysus::Producer.configure do |config|
  config.datadog_statsd_client = Datadog::Statsd.new("localhost", 8125, namespace: "application_name.production") # required for latency tracking, defaults to `nil`
  config.high_priority_sidekiq_queue = :critical # not required, defaults to `:dionysus_high_priority`
end
```

You also need to add a job to the sidekiq-cron schedule that will run every minute:

``` rb
Sidekiq.configure_server do |config|
  config.on(:startup) do
    Dionysus::Producer::Outbox::DatadogLatencyReporterScheduler.new.add_to_schedule
  end
end
```

With this setup, you will have the following metrics available on DataDog:

- `"#{namespace}.dionysus.producer.outbox.latency.minimum"`
- `"#{namespace}.dionysus.producer.outbox.latency.maximum"`
- `"#{namespace}.dionysus.producer.outbox.latency.average"`
- `"#{namespace}.dionysus.producer.outbox.latency.highest_since_creation_date"`

##### Archiving old outbox records

You will probably want to periodically archive/delete published outbox records. It's recommended to use [tartarus-rb](https://github.com/BookingSync/tartarus-rb) for that.

Here is an example config:

```
tartarus.register do |item|
  item.model = DionysusOutbox
  item.cron = "5 4 * * *"
  item.queue = "default"
  item.archive_items_older_than = -> { 3.days.ago }
  item.timestamp_field = :published_at
  item.archive_with = :delete_all_using_limit_in_batches
end
```


##### Events, hooks and monitors

You can subscribe to certain events that are published by `Dionysus.monitor`. The monitor is based on [`dry-monitor`](https://github.com/dry-rb/dry-monitor).

Available events and arguments are:

- "outbox_producer.started", no arguments
- "outbox_producer.stopped", no arguments
- "outbox_producer.shutting_down", no arguments
- "outbox_producer.error", arguments: error, error_message
- "outbox_producer.publishing_failed", arguments: outbox_record
- "outbox_producer.published", arguments: outbox_record
- "outbox_producer.processing_topic", arguments: topic
- "outbox_producer.processed_topic", arguments: topic
- "outbox_producer.lock_exists_for_topic", arguments: topic

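For example, to log every successfully published outbox record (a sketch - `Dionysus.monitor` uses the dry-monitor subscription API, and reading the record from `event[:outbox_record]` is an assumption based on the arguments listed above):

``` rb
Dionysus.monitor.subscribe("outbox_producer.published") do |event|
  # the event payload carries the outbox record that was just published
  Rails.logger.info("[Dionysus] published outbox record: #{event[:outbox_record].inspect}")
end
```
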
##### Outbox Worker Health Check

You need to explicitly enable the health check (e.g. in the initializer, but it needs to be outside the `Rails.application.config.to_prepare` block):

``` rb
Dionysus.enable_outbox_worker_healthcheck
```

To perform the actual health check, use `bin/outbox_worker_health_check`. On success, the script exits with `0` status and on failure, it logs the error and exits with `1` status.

```
bundle exec outbox_worker_health_check
```

It works for both readiness and liveness checks.

#### Tombstoning records

The only way to get rid of messages under a given key from Kafka is to tombstone them. Use `Dionysus::Producer::Outbox::TombstonePublisher` to do it:

``` rb
Dionysus::Producer::Outbox::TombstonePublisher.new.publish(resource, responder)
```

Or if you want a custom `key`/`partition_key`:

``` rb
Dionysus::Producer::Outbox::TombstonePublisher.new.publish(resource, responder, partition_key: partition_key, key: key)
```


#### Genesis

When you add `dionysus-rb` to an existing application, there is a good chance that you will need to stream all of the existing records of the publishable models. Or maybe you changed the schema of the serializer, introducing some new attributes, and you want to re-stream the records. In either case you need to publish everything from scratch. Or, in other words, perform a Genesis.

The way to handle this is to use the `Dionysus::Producer::Genesis#stream` method, which is going to enqueue some Sidekiq jobs.

This method takes the following keyword arguments (see the example call below):
- `topic` - required, the name of the topic where you want to publish a given model (this is necessary as one model might be published to multiple topics)
- `model` - required, the model class you want to publish
- `from` - not required, to be used together with `to`; it establishes the timeline defined by the `from` and `to` timestamps to scope the records to the ones that were updated only during this time. Defaults to `nil`. Don't provide any value if you want to publish all records.
- `to` - not required, to be used together with `from`; it establishes the timeline defined by the `from` and `to` timestamps to scope the records to the ones that were updated only during this time. Defaults to `nil`. Don't provide any value if you want to publish all records.
- `number_of_days` - required, this argument defines the timeline for executing all the jobs. If you set it to 7, it means the jobs to publish records to Kafka will be evenly distributed over 7 days. You can use fractions here as well, e.g. 0.5 for half a day (12 hours).
- `streamer_job` - not required, defaults to `Dionysus::Producer::Genesis::Streamer::StandardJob`. In the majority of cases, you don't want to change this argument, but sometimes it might happen that you want to use a different strategy for streaming the records. For example, you might use a model X that has a relationship to model Y and usually a single record X contains thousands of models Y. In such a case, you might intercept model X and provide some custom publishing logic for model Y. Check `Dionysus::Producer::Genesis::Streamer::BaseJob` if you want to apply some customization.

If you need the full mapping of all available topics and models, use `Dionysus::Producer.topics_models_mapping`.

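A call kicking off a Genesis for a single model could look like this (a sketch - the topic and model names come from the earlier examples, and the `from`/`to` window is left out to stream everything):

``` rb
Dionysus::Producer::Genesis.new.stream(
  topic: "v3_rentals",
  model: Rental,
  number_of_days: 7 # spread the publishing jobs evenly over 7 days
)
```
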
When executing Genesis, a `Dionysus::Producer::Genesis::Performed` event is going to be published via [Hermes](http://github.com/BookingSync/hermes-rb) if that gem is included. If for whatever reason you don't want to use Hermes, you can use `Dionysus::Utils::NullHermesEventProducer` (which is the default); the config options are described below.

Config options dedicated to the Genesis feature:

``` rb
Dionysus::Producer.configure do |config|
  config.sidekiq_queue = :messaging # not required, defaults to `:dionysus`. Remember that you need to add this queue to the Sidekiq config file.
  config.publisher_service_name = "my_service" # not required, defaults to `WaterDrop.config.client_id`
  config.genesis_consistency_safety_delay = 120.seconds # not required, defaults to `60.seconds`, this is an extra delay taking into consideration the time it might take to schedule the jobs, so that you can have an accurate timeline for the Genesis window, which is needed for the `Dionysus::Producer::Genesis::Performed` event
  config.hermes_event_producer = Dionysus::Utils::NullHermesEventProducer # not required
end
```

However, such a setup might not be ideal. If you have just a single topic where you publish both the current events actually happening in the application at a given moment and also want to re-stream all the records, there is a good chance that you will end up with a huge lag on the consumers' side at some point.

The recommended approach is to have two separate topics:

1. a standard one, used for publishing current events - e.g. "v3_rentals". This topic should also have a limited retention configured, e.g. 7 days.
2. a genesis one, used for publishing everything - e.g. "v3_rentals_genesis". You might consider having an infinite retention in this topic.

Thanks to such a separation, there will not be an extra lag on the consumers' side causing delays with processing potentially critical events.

To achieve this result, add the `genesis_replica: true` option when declaring a topic on the Producer's side:

``` rb
Dionysus::Producer.declare do
  namespace :v3 do
    serializer YourCustomSerializerClass

    topic :accounts, genesis_replica: true do
      publish Account
    end
  end
end
```

When a topic is declared as such, there are 2 possible scenarios of publishing events:
1. When calling `Dionysus::Producer::Genesis#stream` with the *primary* topic as an argument (based on the example above: `v3_accounts`), the events will be published to both the `v3_accounts` and `v3_accounts_genesis` topics
2. When calling `Dionysus::Producer::Genesis#stream` with the *genesis* topic as an argument (based on the example above: `v3_accounts_genesis`), the events will be published only to the `v3_accounts_genesis` topic

That implies that the event is always published to the genesis topic. Only the primary one can be skipped. **IMPORTANT** This behavior is exactly the same during "standard" publishing, outside Genesis - the event will be published to both the standard and genesis topics if the topic is declared as a genesis one.

The reason behind this behavior is that the Genesis topic cannot have stale data, especially as it's expected to have an infinite retention.

It's a highly opinionated design choice; if you don't want to maintain a separate topic because you don't need infinite storage, you can either set a super-short retention for the Genesis replica topic, or enable/disable the feature conditionally, e.g. via an ENV variable:

``` rb
use_genesis_replica_for_accounts_topics = (ENV.fetch("USE_GENESIS_REPLICA_FOR_ACCOUNTS_TOPIC", false).to_s == "true")

topic :accounts, genesis_replica: use_genesis_replica_for_accounts_topics do
  publish Account
end
```

Alternatively, feel free to submit a PR with a cleaner solution.

**Notice for consumers**: if you decide to introduce such a separation, it is recommended to use dedicated consumers just for the genesis topic.

### Observers for dependencies for computed properties

Imagine the case where you have, for example, a Rental model with some config attribute, for example `check_in_time`. Such an attribute might not necessarily be something that is directly readable from the `rentals` table as a simple column. The logic might work in such a way that the value from the `rentals` table is returned if it's present, with a fallback that delegates to a related `Account`. That means that the `Rental` has a dependency on `Account`, and you probably want to observe `Account` and publish related rentals if some `default_check_in_time` attribute changes.

To handle it, you need to do 2 things:

1. Add a `changeset` column to the outbox model. If you don't need encryption, just use the `jsonb` type for the column. If you need encryption, use the `text` type.
2. Add a proper topic declaration. For the example described above, it could look like this:
``` rb
topic :rentals do
  publish Rental, observe: [
    {
      model: Account,
      attributes: %i[default_check_in_time],
      association_name: :rentals
    }
  ]
end
```

It's going to work for both to-one and to-many relationships.

To make sure the columns specified in `attributes` actually exist, you can use the following service to validate them:

``` rb
Dionysus::Producer::Registry::Validator.new.validate_columns
```

You can put it in a separate spec to keep things simple or just use the following rake task:

```
bundle exec rake dionysus:validate_columns
```

You could also pass a string of chained methods as `association_name`, for example: `association_name: "other.association.rentals"`. Note that when using strings, the validation of whether a given association exists (which is performed for symbols) will be skipped.

#### Encryption of changesets

If you store some sensitive data (e.g. anything in the scope of GDPR), it will be a good idea to encrypt the `changeset`. The recommended solution would be to use the [crypt_keeper](https://github.com/jmazzi/crypt_keeper) gem. To make outbox records work with encrypted changesets, call the `encrypts_changeset!` class method after declaring the encryption:

``` rb
class DionysusOutboxEncrChangeset < ApplicationRecord
  include Dionysus::Producer::Outbox::Model

  crypt_keeper :changeset, encryptor: :postgres_pgp, key: ENV.fetch("CRYPT_KEEPER_KEY"), encoding: "UTF-8"

  encrypts_changeset!
end
```

### Consumer

First, you need to define a file `karafka.rb` with content like this:

``` rb karafka.rb
# frozen_string_literal: true

Dionysus.initialize_application!(
  environment: ENV["RAILS_ENV"],
  seed_brokers: [ENV.fetch("DIONYSUS_SEED_BROKER")],
  client_id: "NAME_OF_THE_APP",
  logger: Rails.logger
)
```

`DIONYSUS_SEED_BROKER` is a string containing all the brokers separated by a semicolon, e.g. `localhost:9092`. The protocol should not be included.


If you are migrating from the gem prior to making `dionysus-rb` public, most likely you will need to also provide `consumer_group_prefix` for backwards compatibility:

``` rb
Dionysus.initialize_application!(
  environment: ENV["RAILS_ENV"],
  seed_brokers: ENV.fetch("DIONYSUS_SEED_BROKER").split(";"),
  client_id: "NAME_OF_THE_APP",
  logger: Rails.logger,
  consumer_group_prefix: "prometheus_consumer_group_for"
)
```

By default, the name of the consumer group will be "NAME_OF_THE_APP_dionysus_consumer_group_for_NAME_OF_THE_APP" where `dionysus_consumer_group_for` is the `consumer_group_prefix`.


And define a `dionysus.rb` initializer:

``` rb config/initializers/dionysus.rb
Rails.application.config.to_prepare do
  Dionysus::Consumer.declare do
    namespace :v3 do
      topic :rentals do
        dead_letter_queue(topic: "dead_messages", max_retries: 2)
      end
    end
  end

  Dionysus::Consumer.configure do |config|
    config.transaction_provider = ActiveRecord::Base # not required, but highly recommended
    config.model_factory = DionysusModelFactory # required
  end

  Dionysus.initialize_application!(
    environment: ENV["RAILS_ENV"],
    seed_brokers: [ENV.fetch("DIONYSUS_SEED_BROKER")],
    client_id: "NAME_OF_THE_APP",
    logger: Rails.logger
  )
end
```

Notice that you can provide a block to the `topic` method, which allows you to provide some extra configuration options (the same ones as in Karafka, e.g. a Dead Letter Queue config).

The structure of namespaces/topics must reflect what is configured by the Producer! You just don't need to declare specific models - that happens automatically and can be configured with `model_factory`, where you could e.g. return nil for the models that you don't want to be processed.

`model_factory` is an object that returns a model class (or a proper factory - it does not need to be a model class, but returning an ActiveRecord model class will work and will be the simplest way to deal with it; check the specs for more details if you want to decouple it from using model classes directly) for a given name, e.g.:

``` rb
class DionysusModelFactory
  def self.for_model(model_name)
    model_name.classify.gsub("::", "").constantize rescue nil
  end
end
```

Start `karafka server`:

```
bundle exec karafka server
```

That will be enough to process `_created`, `_updated`, and `_destroyed` events in a generic way.

So far, Dionysus expects the format to be compliant with [BookingSync API v3](https://developers.bookingsync.com/reference/). It also performs some special mapping (the notation below is: attribute from the payload -> local attribute; see the migration sketch further down for the corresponding columns):
- id -> synced_id
- created_at -> synced_created_at
- updated_at -> synced_updated_at
- canceled_at -> synced_canceled_at
- relationship_id -> synced_relationship_id
- relationship_type -> synced_relationship_type (for polymorphic associations)

Also, Dionysus checks timestamps (`updated_at` or `created_at` from the payload against the local `synced_updated_at` or `synced_created_at` values). If the remote timestamp is from the past compared to the local timestamps, the persistence will not be executed. `synced_updated_at`/`synced_created_at` are configurable (check the config options reference).

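These `synced_*` attributes are expected to exist as columns on the consumed models. A migration adding them to a hypothetical `rentals` table could look roughly like this (a sketch only - the column types and the `synced_account_id` example are assumptions; only the column names follow the mapping above):

``` rb
class AddSyncedColumnsToRentals < ActiveRecord::Migration[7.0]
  def change
    change_table :rentals, bulk: true do |t|
      t.bigint :synced_id # remote id from the payload
      t.datetime :synced_created_at
      t.datetime :synced_updated_at
      t.datetime :synced_canceled_at # used for soft-delete handling
      t.bigint :synced_account_id # remote foreign key, e.g. account_id -> synced_account_id
    end

    add_index :rentals, :synced_id, unique: true
  end
end
```
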
#### Consumer Base Class

If you are happy with `Karafka::BaseConsumer` being a base class for all your consumers, you don't need to do anything as this is the default. If you want to customize it, you have two options:

1. Global config - specify a base class in the Consumer Config in an initializer via the `consumer_base_class` attribute:


``` rb
Dionysus::Consumer::Config.configure do |config|
  config.consumer_base_class = CustomConsumerClassInheritingFromKarafkaBaseConsumer
end
```

2. Specify per topic - which also takes precedence over the global config (so you can use both of these options!) via the `consumer_base_class` option:

``` rb
topic :rentals, consumer_base_class: CustomConsumerClassInheritingFromKarafkaBaseConsumer
```

Here is an example:

``` rb
class CustomConsumerClassInheritingFromKarafkaBaseConsumer < Karafka::BaseConsumer
  alias_method :original_on_consume, :on_consume

  def on_consume
    Retryable.perform(times: 3, errors: errors_to_retry, before_retry: BeforeRetry) do
      original_on_consume
    end
  end

  private

  def errors_to_retry
    @errors_to_retry ||= [ActiveRecord::StatementInvalid, PG::ConnectionBad, PG::Error]
  end

  class Retryable
    def self.perform(times:, errors:, before_retry: ->(_error) {})
      executed = 0
      begin
        executed += 1
        yield
      rescue *errors => e
        if executed < times
          before_retry.call(e)
          retry
        else
          raise e
        end
      end
    end
  end

  class BeforeRetry
    def self.call(_error)
      ActiveRecord::Base.clear_active_connections!
    end
  end
end
```


#### Retryable consuming

When consuming the events, it might happen that some errors will occur (a similar case was already mentioned in the consumer base class section). If you want to retry on such errors in some way, you can inject a custom `retry_provider`, which is supposed to be an object implementing a `retry` method that yields a block. You can specify it on the config level:

``` rb
Dionysus::Consumer::Config.configure do |config|
  config.retry_provider = CustomRetryProvider.new
end
```

Here is an example:


``` rb
class CustomRetryProvider
  def retry(&block)
    Retryable.perform(times: 3, errors: errors_to_retry, before_retry: BeforeRetry, &block)
  end

  private

  def errors_to_retry
    @errors_to_retry ||= [ActiveRecord::StatementInvalid, PG::ConnectionBad, PG::Error]
  end

  class Retryable
    def self.perform(times:, errors:, before_retry: ->(_error) {})
      executed = 0
      begin
        executed += 1
        yield
      rescue *errors => e
        if executed < times
          before_retry.call(e)
          retry
        else
          raise e
        end
      end
    end
  end

  class BeforeRetry
    def self.call(_error)
      ActiveRecord::Base.clear_active_connections!
    end
  end
end
```

#### Association/disassociation of relationships

For relationships, especially the sideloaded ones, Dionysus doesn't know whether something is a `has_many` or a `has_many :through` relationship, so it doesn't automatically perform linking between records. If Booking is serialized with BookingsFee, it will create/update Booking and BookingsFee as if they were separate events, but it will not magically link them. Most likely, in this scenario, BookingsFee will be linked to Booking anyway via foreign keys in the synced attributes (BookingsFee will have `synced_booking_id`), but for a `has_many :through` relationship it is not going to happen. Dionysus doesn't try to guess and lets you define the way associations should be linked on the consumer side. The models need to implement the following methods:

``` rb
def resolve_to_one_association(name, id_from_remote_payload)
end

def resolve_to_many_association(name, ids_from_remote_payload)
end
```

If you don't have `has_one :through` relationships, you can leave `resolve_to_one_association` empty. If you don't have `has_many :through` relationships, you can implement `resolve_to_many_association` in the following way:

``` rb
def resolve_to_many_association(name, ids_from_remote_payload)
  public_send(name).where.not(id: ids_from_remote_payload).destroy_all
end
```

and add it to `ApplicationRecord`.

That way, e.g., BookingsFees that are locally associated with the Booking but were not removed yet (but were on the Producer side, which is why they are no longer in the payload) will be cleaned up.

### Handling deletion/cancelation/restoration (soft-delete)
|
879
|
+
|
880
|
+
By default, all records are restored on create/update event by setting `soft_deleted_at_timestamp_attribute` (by default, `synced_canceled_at`) to nil.
|
881
|
+
|
882
|
+
For soft delete, there are a couple of ways this can work:
|
883
|
+
- if it's possible to soft delete record by setting a timestamp, it will be done that way (e.g., by setting `:synced_canceled_at` to a timestamp from payload)
|
884
|
+
- if `canceled_at` is not available in the payload, but the model responds to the method configured via `soft_delete_strategy` (by default: `:cancel`), that method will be called.
|
885
|
+
- if there is no other option to soft-delete the record, `destroy` method will be called.
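
A minimal sketch of a consumer-side model that works with these defaults (the `Booking` name and the `synced_canceled_at` column are assumptions; adjust them to your schema and config):

``` rb
class Booking < ApplicationRecord
  # 1. if the payload carries a cancellation timestamp, Dionysus sets the column
  #    configured via `soft_deleted_at_timestamp_attribute` (here: synced_canceled_at)

  # 2. fallback used when no timestamp is available, because the model responds to
  #    the method configured via `soft_delete_strategy` (default: :cancel)
  def cancel
    update!(synced_canceled_at: Time.current)
  end

  # 3. if neither of the above applies, `destroy` is called
end
```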

#### Batch import

For a high volume of data, you probably don't want to process records one by one but batch-import them instead. In that case, you can specify the `import` option for the topic:

``` rb
topic :heavy_records, import: true
```

That way, for the `heavy_record_created` event, persistence will not be executed one record at a time; instead, the `dionysus_import` method will be called on the object returned by the model factory (here, most likely the HeavyRecord class, or any other model class). The argument of the method is an array of deserialized data-access objects responding to the `attributes`, `has_many` and `has_one` methods. You may want to inspect the payload of each of them, although most likely you will only be interested in `attributes` (unless something is sideloaded), which contains the payload serialized on the producer side, with some transformations applied on top for the reserved attributes (id, created_at, canceled_at, updated_at).

This also impacts the `heavy_record_destroyed` event. In that case, you need to handle the logic using the `dionysus_destroy` method, which is called in exactly the same way as `dionysus_import`. The recommended way to handle the payload is to extract the IDs, find the corresponding records and (soft-)delete them with a single query using the `update_all` method.
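
A sketch of what such a model could look like. The `HeavyRecord` class, the attribute names and the `upsert_all`/`update_all` strategy are illustrative assumptions, and the exact keys available in `attributes` depend on your deserializer configuration:

``` rb
class HeavyRecord < ApplicationRecord
  def self.dionysus_import(deserialized_records)
    rows = deserialized_records.map do |record|
      # `attributes` holds the (transformed) payload serialized on the producer side
      record.attributes.slice("synced_id", "synced_created_at", "synced_updated_at", "synced_data")
    end
    # single bulk upsert (Rails 6+); requires a unique index on synced_id
    upsert_all(rows, unique_by: :synced_id)
  end

  def self.dionysus_destroy(deserialized_records)
    ids = deserialized_records.map { |record| record.attributes["synced_id"] }
    # soft-delete all matching records with a single query
    where(synced_id: ids).update_all(synced_canceled_at: Time.current)
  end
end
```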

#### Batch transformation

You can perform transformations on the batch of records before the batch is processed. By default, `Dionysus::Consumer::ParamsBatchTransformations::RemoveDuplicatesStrategy` is applied, which removes duplicate `_updated` events from the batch (based on the message `key`) and keeps only the most recent event.

You can disable it by explicitly setting it to `nil`:

``` rb
topic :my_topic, params_batch_transformation: nil
```

It is also a very useful addition when using the `import: true` option. `params_batch` will always contain multiple items that are processed sequentially, one by one. With `import: true` that probably doesn't make much sense for performance reasons, and it might be a better idea to merge all the batches into a single one or into a few grouped batches.

You can do that by applying `params_batch_transformation`, which expects an object with a lambda-like interface responding to the `call` method and taking a single argument, the `params_batch`:

``` rb
topic :heavy_records, import: true, params_batch_transformation: ->(params_batch) { do_some_merging_logic_here }
```
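
The transformation doesn't have to be a lambda; any object responding to `call` works. A sketch of a class-based transformation (the merging logic itself is left as a placeholder, since the message and payload structure depend on your Karafka setup and topics):

``` rb
class MergeHeavyRecordBatches
  # receives the batch of messages for the current poll and must return
  # the (possibly transformed) batch that Dionysus will process
  def call(params_batch)
    # put your merging/grouping logic here, e.g. combine all `_created` messages
    # into fewer, larger batches before `dionysus_import` is invoked
    params_batch
  end
end

topic :heavy_records, import: true, params_batch_transformation: MergeHeavyRecordBatches.new
```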

#### Concurrency

If you process records from only a single topic/partition, you will not have any issues with concurrent processing of the same record, but if you consume from multiple partitions where the same record can get published, you might run into conflicts. In such a case, you might consider using a mutex, like a Postgres advisory lock. You can use `processing_mutex_provider` and `processing_mutex_method_name` for that:

``` rb
Dionysus::Consumer.configure do |config|
  config.processing_mutex_provider = ActiveRecord::Base # optional
  config.processing_mutex_method_name = :with_advisory_lock # optional, https://github.com/ClosureTree/with_advisory_lock
end
```

Keep in mind that this is going to impact database load.

#### Storing entire payload

It might be the case that when you added a given model to the consuming application, it didn't contain all the attributes that were serialized in the message, and those attributes will be needed in the future. You have two options for handling this:

1. Reset the offset and consume everything from the beginning (absolutely not recommended for a large volume of data)
2. Store all the attributes that were present in the message for a given record so that you can reuse them later

Dionysus forces you to go with the second option and expects that all entities have a `synced_data` accessor (although the name is configurable) which will store that payload.
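
For the common case of storing it as a `jsonb` column, a migration sketch (the table name is just an example):

``` rb
class AddSyncedDataToRentals < ActiveRecord::Migration[7.0]
  def change
    add_column :rentals, :synced_data, :jsonb, null: false, default: {}
  end
end
```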

If you want to configure the behavior of this attribute (you might, for example, want to store the data as a separate model), just override the attribute:

``` rb
class MyModel < ApplicationRecord
  after_save :persist_synced_data

  attr_accessor :synced_data

  def synced_data=(data)
    @synced_data = data
  end

  private

  def persist_synced_data
    synced_data_entity = SyncedDataEntity.find_or_initialize_by(model_name: self.class.to_s, model_id: id)
    synced_data_entity.synced_data = synced_data
    synced_data_entity.save!
  end
end
```

That way you can store the payloads under a polymorphic SyncedDataEntity model. Or you can avoid storing anything if that's your choice.

##### Assigning values from `synced_data`

Sometimes it might happen that you would like to assign a value from `synced_data` to a model's column, e.g., when some column was previously missing.

To do that, you can use `Dionysus::Consumer::SyncedData::AssignColumnsFromSyncedDataJob.enqueue`:

``` rb
Dionysus::Consumer::SyncedData::AssignColumnsFromSyncedDataJob.enqueue(model_class, columns, batch_size:) # batch_size defaults to 1000
```

To make it work, you need to make sure these config values are properly set:

``` rb
Dionysus::Consumer.configure do |config|
  config.resolve_synced_data_hash_proc = ->(record) { record.synced_data_model.synced_data_hash } # optional, defaults to ->(record) { record.public_send(Dionysus::Consumer.configuration.synced_data_attribute).to_h }
  config.sidekiq_queue = :default # optional, defaults to `:dionysus`
end
```

If you store `synced_data` as a `jsonb` attribute on the model level, you don't need to adjust `resolve_synced_data_hash_proc`.

#### Globalize extensions

(This is not related to Dionysus itself, but it might be useful.)

If you use the `globalize` gem, there is a chance that you serialize translatable attributes in the following format:

```rb
{
  "translatable_attribute" => {
    "en" => "English",
    "fr" => "French"
  }
}
```

To handle translated attributes correctly on the consumer, you might need the following patch for the `globalize` gem; you can put it, e.g., in an initializer:

``` rb
# frozen_string_literal: true

module Globalize::ActiveRecord::ClassMethods
  protected

  def define_translated_attr_writer(name)
    define_method(:"#{name}=") do |value|
      if value.is_a?(Hash)
        send("#{name}_translations").each_key { |locale| value[locale] ||= "" }
        value.each do |(locale, val)|
          write_attribute(name, val, locale: locale)
        end
      else
        write_attribute(name, value)
      end
    end
  end

  def define_translations_accessor(name)
    attribute(name, ::ActiveRecord::Type::Value.new) if Globalize.rails_5?
    define_translations_reader(name)
    define_translations_writer(name)
    define_translation_used_locales(name)
  end

  def define_translation_used_locales(name)
    define_method(:"#{name}_used_locales") do
      send("#{name}_translations").select { |_key, value| value.present? }.keys
    end
  end
end
```

And the specs (you might need to adjust the `model`):

``` rb
# frozen_string_literal: true

require "rails_helper"

RSpec.describe "Globalize extensions" do
  describe "assigning hash" do
    subject(:model_name_translations) { model.name_translations }

    context "when hash is not empty" do
      let(:assign_name) { model.name = name_translations }
      let(:model) { Record.new }
      let(:name_translations) do
        {
          "en" => "record",
          "fr" => "record in French"
        }
      end

      it "assigns values to a proper locale" do
        assign_name

        expect(model_name_translations).to eq name_translations
      end
    end

    context "when hash is empty and some translations were assigned before" do
      let(:assign_name) { model.name = name_translations }
      let(:model) { Record.new }
      let(:original_translations) do
        {
          "en" => "record",
          "fr" => "record in French"
        }
      end
      let(:name_translations) do
        {}
      end
      let(:expected_result) do
        {
          "en" => "",
          "fr" => ""
        }
      end

      before do
        model.name = original_translations
      end

      it "assigns nullified values for all locales" do
        assign_name

        expect(model_name_translations).to eq expected_result
      end
    end
  end

  describe "used locales" do
    subject(:name_used_locales) { model.name_used_locales }

    let(:assign_name) { model.name = name_translations }
    let(:model) { Record.new }
    let(:name_translations) do
      {
        "en" => "record",
        "fr" => "record in French"
      }
    end

    it "adds a method that extracts used locales" do
      assign_name

      expect(name_used_locales).to match_array %w[en fr]
    end
  end
end
```

#### Config options

Full config reference:

``` rb
Dionysus::Consumer.configure do |config|
  config.transaction_provider = ActiveRecord::Base # not required, but highly recommended
  config.model_factory = DionysusModelFactory # required
  config.instrumenter = MyInstrumentation # optional
  config.processing_mutex_provider = ActiveRecord::Base # optional
  config.processing_mutex_method_name = :with_advisory_lock # optional
  config.event_bus = MyEventBusForDionysus # optional
  config.soft_delete_strategy = :cancel # optional, default: :cancel
  config.soft_deleted_at_timestamp_attribute = :synced_canceled_at # optional, default: :synced_canceled_at
  config.synced_created_at_timestamp_attribute = :synced_created_at # optional, default: :synced_created_at
  config.synced_updated_at_timestamp_attribute = :synced_updated_at # optional, default: :synced_updated_at
  config.synced_id_attribute = :synced_id # optional, default: :synced_id
  config.synced_data_attribute = :synced_data # required, default: :synced_data
  config.resolve_synced_data_hash_proc = ->(record) { record.synced_data_model.synced_data_hash } # optional, defaults to ->(record) { record.public_send(Dionysus::Consumer.configuration.synced_data_attribute).to_h }
  config.sidekiq_queue = :default # optional, defaults to `:dionysus`
  config.message_filter = FilterIgnoringLargeMessageToAvoidOutofMemoryErrors.new(error_handler: Sentry) # not required, defaults to Dionysus::Utils::DefaultMessageFilter, which doesn't ignore any messages. It can be useful when you want to ignore some messages, e.g. very large ones that would cause an OOM error. Check the implementation of `Dionysus::Utils::DefaultMessageFilter` for more details to understand what kind of arguments are available to set the condition. `error_handler` needs to implement a Sentry-like interface.

  # if you ever need to provide a mapping:
  config.add_attributes_mapping_for_model("Rental") do
    {
      local_rental_type: :remote_rental_type
    }
  end
end
```

#### Instrumentation & Event Bus

Check the publisher section for reference about instrumentation and the event bus. The only difference is in the methods that are instrumented and the events that are published.

For the event bus, you may expect the `dionysus.consume` event. It contains the following attributes:
- `topic_name`, e.g. "v3_inbox", "v3_rentals"
- `model_name`, e.g. "Conversation", "Rental"
- `event_name`, e.g. "rental_created", "conversation_updated", "message_destroyed"
- `transformed_data`, the deserialized event payload. Please check out DeserializedRecord in `Dionysus::Consumer::Deserializer`
- `local_changes`, which contains all changes that took place while handling this event. It is a hash whose keys are two-element arrays: the model/relationship name and its id from Core. Every value is the result of `ActiveModel#changes` called before committing the record locally. It covers the changes of the main resource as well as of all included relationships. An example of a possible value:

``` rb
{
  ["Rental", 1] => { "name" => ["old name", "Villa Saganaki"] },
  ["bookings", 101] => { "start_at" => [nil, 1] }
}
```

The event bus is the recommended way to do something upon consuming events if you want to avoid putting that logic into ActiveRecord callbacks.
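
A sketch of a minimal subscriber, assuming the object you assign to `config.event_bus` is called with `publish(event_name, payload)` (check the publisher section for the exact contract; the payload key access below is illustrative):

``` rb
class MyEventBusForDionysus
  def self.publish(event_name, payload)
    return unless event_name == "dionysus.consume"
    return unless payload[:model_name] == "Rental"

    # react to the documented attributes instead of using ActiveRecord callbacks
    payload[:local_changes].each do |(model_name, remote_id), changes|
      Rails.logger.info("#{model_name}##{remote_id} changed: #{changes.keys.join(', ')}")
    end
  end
end
```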

#### Karafka Worker Health check

If you want to perform a Karafka health check (for consumer apps), use `Dionysus::Checks::HealthCheck.check`.

To make it work, you need to assign the health check to `Dionysus`:

``` rb
# in the initializer, after calling `initialize_application!`
Dionysus.health_check = Dionysus::Checks::HealthCheck.new
```

It works for both readiness and liveness probes. However, keep in mind that you need to enable statistics emission in Karafka for liveness checks to work (by setting `statistics.interval.ms` - [more about it here](https://karafka.io/docs/Monitoring-and-Logging/#naming-considerations-for-custom-events)).

To perform the actual health check, use `bin/karafka_health_check`. On success, the script exits with a `0` status; on failure, it logs the error and exits with a `1` status.

```
bundle exec karafka_health_check
```

## Development

After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake spec` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.

To install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and the created tag, and push the `.gem` file to [rubygems.org](https://rubygems.org).

## Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/BookingSync/dionysus-rb.

## License

The gem is available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT).

["NAME_OF_THE_APP"]: