deimos-ruby 1.20.1 → 1.22
- checksums.yaml +4 -4
- data/CHANGELOG.md +8 -0
- data/README.md +23 -16
- data/docs/CONFIGURATION.md +2 -0
- data/lib/deimos/active_record_consume/batch_consumption.rb +78 -152
- data/lib/deimos/active_record_consume/batch_record.rb +78 -0
- data/lib/deimos/active_record_consume/batch_record_list.rb +78 -0
- data/lib/deimos/active_record_consume/mass_updater.rb +92 -0
- data/lib/deimos/active_record_consumer.rb +3 -12
- data/lib/deimos/config/configuration.rb +17 -1
- data/lib/deimos/config/phobos_config.rb +2 -1
- data/lib/deimos/version.rb +1 -1
- data/spec/active_record_batch_consumer_association_spec.rb +288 -0
- data/spec/active_record_batch_consumer_spec.rb +3 -19
- data/spec/active_record_consume/mass_updater_spec.rb +75 -0
- metadata +9 -5
- data/CHANGELOG.md.orig +0 -517
- data/spec/active_record_batch_consumer_mysql_spec.rb +0 -244
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 3f8d5ac90c65bc034c6ddcd9992cdc2244100e0c6ac8f7091e253055094580ad
+  data.tar.gz: 385c04f21fd913f4c913e851010c8a27915e6cb253643fc334b6a73497c0312f
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 33edfce1a07f88bd039cde29a33b63e59411b50237c8869ab06432e9d2622d8c2c506d7311d163adc4bc5ee05344ee379cb5c2e32d99661568446c77c93abf5e
+  data.tar.gz: ed6f9cff6af0cd6e391c4f891e46874628cdbe41702d6114925417872231b2961d18b2239b83312f2f4fa45105a8d3d29282693252111eb0271a9e83ed2a98a6
data/CHANGELOG.md
CHANGED
@@ -7,6 +7,14 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ## UNRELEASED
 
+# 1.22 - 2023-05-01
+
+- Feature: Added `replace_associations` and changed default behavior for multi-table consuming. No longer relies on Rails direct associations and wonky magic for new vs. existing records.
+- Fix: `bulk_import_id` is now handled by Deimos and does not need to be set by application code.
+- ***BREAKING CHANGE*** Replaced `filter_records` with `should_consume?` on ActiveRecordConsumer.
+- ***BREAKING CHANGE*** Replaced the behavior of `build_records` on ActiveRecordConsumer with a more powerful `record_attributes`.
+- ***BREAKING CHANGE*** Removed the `association_list` config as it can now be inferred from the data.
+
 # 1.21.1 - 2023-04-18
 
 - Fix: Datadog tracing now works with Datadog 1.x
|
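The `filter_records` to `should_consume?` change listed above moves filtering from a list-level hook to a per-record predicate. The following is a minimal sketch of that difference in plain Ruby. It assumes a hypothetical `Widget` struct standing in for an ActiveRecord model, and free-standing methods (including the hypothetical `consume_list`) standing in for consumer overrides; none of it is Deimos code.

```ruby
# Hypothetical stand-in for an ActiveRecord model; not part of Deimos.
Widget = Struct.new(:name, :active)

# Pre-1.22 shape: the hook receives the whole list and returns the keepers.
def filter_records(records)
  records.select(&:active)
end

# 1.22 shape: the hook answers yes or no for a single record.
def should_consume?(record)
  record.active
end

# Applies the per-record predicate to a list, roughly what a list-level
# filter does when it is handed the predicate as a proc.
def consume_list(records)
  records.select { |r| should_consume?(r) }
end

WIDGETS = [Widget.new('a', true), Widget.new('b', false)].freeze
```

Per the `mass_updater.rb` changes in this release, `validate!` is called when records are saved, so an override of `should_consume?` only has to express the business rule for skipping a record.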
data/README.md
CHANGED
@@ -359,33 +359,40 @@ end
 
 Sometimes, the Kafka message needs to be saved to multiple database tables. For example, if a `User` topic provides you metadata and profile image for users, we might want to save it to multiple tables: `User` and `Image`.
 
--
+- Return associations as keys in `record_attributes` to enable this feature.
 - The `bulk_import_id_column` config allows you to specify column_name on `record_class` which can be used to retrieve IDs after save. Defaults to `bulk_import_id`. This config is *required* if you have associations but optional if you do not.
 
-You must override the `
-- `
+You must override the `record_attributes` (and optionally `column` and `key_columns`) methods on your consumer class for this feature to work.
+- `record_attributes` - This method is required to map Kafka messages to ActiveRecord model objects.
 - `columns(klass)` - Should return an array of column names that should be used by ActiveRecord klass during SQL insert operation.
 - `key_columns(messages, klass)` - Should return an array of column name(s) that makes a row unique.
 ```ruby
+class User < ApplicationRecord
+  has_many :images
+end
+
 class MyBatchConsumer < Deimos::ActiveRecordConsumer
 
   record_class User
-
-
-
-
-
-
-
-
-
-
+
+  def record_attributes(payload, _key)
+    {
+      first_name: payload.first_name,
+      images: [
+        {
+          attr1: payload.image_url
+        },
+        {
+          attr2: payload.other_image_url
+        }
+      ]
+    }
   end
 
-  def key_columns(
+  def key_columns(klass)
     case klass
     when User
-
+      nil # use default
     when Image
       ["image_url", "image_name"]
     end
@@ -394,7 +401,7 @@ class MyBatchConsumer < Deimos::ActiveRecordConsumer
   def columns(klass)
     case klass
     when User
-
+      nil # use default
     when Image
       klass.columns.map(&:name) - [:created_at, :updated_at, :id]
     end
|
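One thing to watch when adapting the README example above: `case klass when User` relies on `Module#===`, which tests whether `klass` is an *instance* of `User`, so the branch does not match when `klass` is the class object itself. Likewise, ActiveRecord column names are strings, so subtracting symbols from them leaves the list unchanged. The sketch below shows both points in plain Ruby. The `User` and `Image` classes are empty hypothetical stand-ins, and the method names are illustrative only.

```ruby
# Hypothetical stand-ins for ActiveRecord models; not Deimos or Rails code.
class User; end
class Image; end

# `case klass when User` calls User === klass, which is true only when
# klass is an instance of User, so a Class argument falls through to nil.
def key_columns_with_case(klass)
  case klass
  when User then :user_branch
  when Image then :image_branch
  end
end

# Comparing the class objects directly matches as intended.
# Any class other than Image returns nil, i.e. "use the default".
def key_columns_by_class(klass)
  %w(image_url image_name) if klass == Image
end

# Column names are strings, so the exclusion list must be strings too.
def insertable_columns(column_names)
  column_names - %w(created_at updated_at id)
end
```

A consumer override could use the `klass == Image` comparison (or `klass.name`) in place of `case klass`, assuming the class itself is what gets passed in, as the `MassUpdater` code in this diff suggests.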
data/docs/CONFIGURATION.md
CHANGED
@@ -84,6 +84,8 @@ key_config|nil|Configuration hash for message keys. See [Kafka Message Keys](../
 disabled|false|Set to true to skip starting an actual listener for this consumer on startup.
 group_id|nil|ID of the consumer group.
 use_schema_classes|nil|Set to true or false to enable or disable using the consumers schema classes. See [Generated Schema Classes](../README.md#generated-schema-classes)
+bulk_import_id_column|:bulk_import_id|Name of the column to use for multi-table imports.
+replace_associations|true|If false, append to associations in multi-table imports rather than replacing them.
 max_db_batch_size|nil|Maximum limit for batching database calls to reduce the load on the db.
 max_concurrency|1|Number of threads created for this listener. Each thread will behave as an independent consumer. They don't share any state.
 start_from_beginning|true|Once the consumer group has checkpointed its progress in the topic's partitions, the consumers will always start from the checkpointed offsets, regardless of config. As such, this setting only applies when the consumer initially starts consuming from a topic
|
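The effect of `replace_associations` can be pictured with plain data. The sketch below is a simplified model under stated assumptions, not Deimos code: child rows are hashes, `sync_children` is a hypothetical helper, and the import ID plays the role that `delete_old_records` gives it elsewhere in this diff (rows for the same parent that carry a different import ID are dropped when replacing).

```ruby
require 'securerandom'

# Hypothetical in-memory model of a child table; rows are plain hashes.
# existing: rows already stored; incoming: rows from the current batch.
def sync_children(existing, incoming, parent_id:, replace_associations: true)
  # One shared ID per import when replacing, nil when appending.
  import_id = replace_associations ? SecureRandom.uuid : nil
  stamped = incoming.map do |row|
    row.merge(parent_id: parent_id, bulk_import_id: import_id)
  end
  rows = existing + stamped
  return rows unless import_id

  # Replacing: drop this parent's rows that were not written by this import.
  rows.reject do |row|
    row[:parent_id] == parent_id && row[:bulk_import_id] != import_id
  end
end

OLD_ROWS = [{ parent_id: 1, url: 'old.png', bulk_import_id: 'earlier' }].freeze
NEW_ROWS = [{ url: 'new.png' }].freeze
```

With the default (`true`), `sync_children(OLD_ROWS, NEW_ROWS, parent_id: 1)` keeps only the new row; with `replace_associations: false` both rows survive. One difference from a real database: in SQL, `!=` never matches `NULL`, so existing rows with a `NULL` import ID would not be removed by a `!=` comparison, whereas this in-memory model would drop them.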
@@ -1,6 +1,10 @@
 # frozen_string_literal: true
 
 require 'deimos/active_record_consume/batch_slicer'
+require 'deimos/active_record_consume/batch_record'
+require 'deimos/active_record_consume/batch_record_list'
+require 'deimos/active_record_consume/mass_updater'
+
 require 'deimos/utils/deadlock_retry'
 require 'deimos/message'
 require 'deimos/exceptions'
@@ -40,10 +44,29 @@ module Deimos
         end
       end
 
+      protected
+
+      # Get the set of attribute names that uniquely identify messages in the
+      # batch. Requires at least one record.
+      # The parameters are mutually exclusive. records is used by default implementation.
+      # @param _klass [Class < ActiveRecord::Base] Class Name can be used to fetch columns
+      # @return [Array<String>,nil] List of attribute names.
+      # @raise If records is empty.
+      def key_columns(_klass)
+        nil
+      end
+
+      # Get the list of database table column names that should be saved to the database
+      # @param _klass [Class < ActiveRecord::Base] ActiveRecord class associated to the Entity Object
+      # @return [Array<String>,nil] list of table columns
+      def columns(_klass)
+        nil
+      end
+
       # Get unique key for the ActiveRecord instance from the incoming key.
       # Override this method (with super) to customize the set of attributes that
       # uniquely identifies each record in the database.
-      # @param key [String] The encoded key.
+      # @param key [String,Hash] The encoded key.
       # @return [Hash] The key attributes.
       def record_key(key)
         if key.nil?
@@ -57,7 +80,35 @@ module Deimos
         end
       end
 
-
+      # Create an ActiveRecord relation that matches all of the passed
+      # records. Used for bulk deletion.
+      # @param records [Array<Message>] List of messages.
+      # @return [ActiveRecord::Relation] Matching relation.
+      def deleted_query(records)
+        keys = records.
+          map { |m| record_key(m.key)[@klass.primary_key] }.
+          reject(&:nil?)
+
+        @klass.unscoped.where(@klass.primary_key => keys)
+      end
+
+      # @param _record [ActiveRecord::Base]
+      # @return [Boolean]
+      def should_consume?(_record)
+        true
+      end
+
+      private
+
+      # Compact a batch of messages, taking only the last message for each
+      # unique key.
+      # @param batch [Array<Message>] Batch of messages.
+      # @return [Array<Message>] Compacted batch.
+      def compact_messages(batch)
+        return batch unless batch.first&.key.present?
+
+        batch.reverse.uniq(&:key).reverse!
+      end
 
       # Perform database operations for a batch of messages without compaction.
       # All messages are split into slices containing only unique keys, and
|
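The `compact_messages` method in the hunk above keeps only the last message per key by reversing, de-duplicating, and reversing again. A minimal plain-Ruby sketch of that idea follows. The `Message` struct is a hypothetical stand-in for `Deimos::Message`, and the nil/empty check is only an approximation of ActiveSupport's `present?`.

```ruby
# Hypothetical stand-in for Deimos::Message with just the fields used here.
Message = Struct.new(:key, :payload)

# Keep only the last message for each key, preserving relative order.
# Array#uniq keeps the first occurrence it sees, so reverse first to make
# the latest message "first", then reverse again to restore the order.
def compact_messages(batch)
  first_key = batch.first&.key
  # Plain-Ruby approximation of `present?`: skip compaction for keyless batches.
  return batch if first_key.nil? || first_key.to_s.empty?

  batch.reverse.uniq(&:key).reverse
end

BATCH = [
  Message.new(1, 'a'),
  Message.new(2, 'b'),
  Message.new(1, 'c')
].freeze
```

For `BATCH`, the earlier message with key `1` is discarded and the remaining two keep their relative order.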
@@ -103,63 +154,40 @@ module Deimos
       # records to either be updated or inserted.
       # @return [void]
       def upsert_records(messages)
-
+        record_list = build_records(messages)
+        record_list.filter!(self.method(:should_consume?).to_proc)
 
-
-        upserts = build_records(messages)
-        # If overridden record_attributes indicated no record, skip
-        upserts.compact!
-        # apply ActiveRecord validations and fetch valid Records
-        valid_upserts = filter_records(upserts)
+        return if record_list.empty?
 
-
+        key_col_proc = self.method(:key_columns).to_proc
+        col_proc = self.method(:columns).to_proc
 
-
-
+        updater = MassUpdater.new(@klass,
+                                  key_col_proc: key_col_proc,
+                                  col_proc: col_proc,
+                                  replace_associations: self.class.config[:replace_associations])
+        updater.mass_update(record_list)
       end
 
-
-
-
-
-
-
-                    {
-                      on_duplicate_key_update: columns
-                    }
+      # @param messages [Array<Deimos::Message>]
+      # @return [BatchRecordList]
+      def build_records(messages)
+        records = messages.map do |m|
+          attrs = if self.method(:record_attributes).parameters.size == 2
+                    record_attributes(m.payload, m.key)
                   else
-
-                      on_duplicate_key_update: {
-                        conflict_target: key_cols,
-                        columns: columns
-                      }
-                    }
+                    record_attributes(m.payload)
                   end
-
-      end
+          next nil if attrs.nil?
 
-
-
-      # @association_list configured on the consumer helps identify the ones required to be saved.
-      def import_associations(entities)
-        _validate_associations(entities)
-        _fill_primary_key_on_entities(entities)
+          attrs = attrs.merge(record_key(m.key))
+          next unless attrs
 
-
-
-
-
-
-            # Get associated `has_one` or `has_many` records for each entity
-            sub_records = Array(entity.send(assoc.name))
-            # Set IDS from master to each of the records in `has_one` or `has_many` relation
-            sub_records.each { |d| d.send("#{assoc.foreign_key}=", entity.send(assoc.active_record_primary_key)) }
-            sub_records
-          }.flatten
-
-          columns = key_columns(nil, assoc.klass)
-          save_records_to_database(assoc.klass, columns, sub_records) if sub_records.any?
-        end
+          BatchRecord.new(klass: @klass,
+                          attributes: attrs,
+                          bulk_import_column: self.class.bulk_import_id_column)
+        end
+        BatchRecordList.new(records.compact)
       end
 
       # Delete any records with a tombstone.
|
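`build_records` in the hunk above decides whether to pass the message key to `record_attributes` by inspecting how many parameters the override declares. The sketch below isolates that dispatch in plain Ruby. The two consumer classes and the `attributes_for` helper are hypothetical and exist only for illustration.

```ruby
# Hypothetical consumers; only the record_attributes signature differs.
class OneArgConsumer
  def record_attributes(payload)
    { name: payload[:name] }
  end
end

class TwoArgConsumer
  def record_attributes(payload, key)
    { name: payload[:name], source_key: key }
  end
end

# Call record_attributes with or without the key, depending on how many
# parameters the override declares (Method#parameters lists them all).
def attributes_for(consumer, payload, key)
  handler = consumer.method(:record_attributes)
  if handler.parameters.size == 2
    handler.call(payload, key)
  else
    handler.call(payload)
  end
end
```

Because `Method#parameters` counts optional parameters as well, an override declared as `record_attributes(payload, key = nil)` would also be called with both arguments under this check.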
@@ -171,108 +199,6 @@ module Deimos
 
         clause.delete_all
       end
-
-      # Create an ActiveRecord relation that matches all of the passed
-      # records. Used for bulk deletion.
-      # @param records [Array<Message>] List of messages.
-      # @return [ActiveRecord::Relation] Matching relation.
-      def deleted_query(records)
-        keys = records.
-          map { |m| record_key(m.key)[@klass.primary_key] }.
-          reject(&:nil?)
-
-        @klass.unscoped.where(@klass.primary_key => keys)
-      end
-
-      # Get the set of attribute names that uniquely identify messages in the
-      # batch. Requires at least one record.
-      # The parameters are mutually exclusive. records is used by default implementation.
-      # @param records [Array<Message>] Non-empty list of messages.
-      # @param _klass [ActiveRecord::Class] Class Name can be used to fetch columns
-      # @return [Array<String>] List of attribute names.
-      # @raise If records is empty.
-      def key_columns(records, _klass)
-        raise 'Cannot determine key from empty batch' if records.empty?
-
-        first_key = records.first.key
-        record_key(first_key).keys
-      end
-
-      # Get the list of database table column names that should be saved to the database
-      # @param record_class [Class] ActiveRecord class associated to the Entity Object
-      # @return Array[String] list of table columns
-      def columns(record_class)
-        # In-memory records contain created_at and updated_at as nil
-        # which messes up ActiveRecord-Import bulk_import.
-        # It is necessary to ignore timestamp columns when using ActiveRecord objects
-        ignored_columns = %w(created_at updated_at)
-        record_class.columns.map(&:name) - ignored_columns
-      end
-
-      # Compact a batch of messages, taking only the last message for each
-      # unique key.
-      # @param batch [Array<Message>] Batch of messages.
-      # @return [Array<Message>] Compacted batch.
-      def compact_messages(batch)
-        return batch unless batch.first&.key.present?
-
-        batch.reverse.uniq(&:key).reverse!
-      end
-
-      # Turns Kafka payload into ActiveRecord Objects by mapping relevant fields
-      # Override this method to build object and associations with message payload
-      # @param messages [Array<Deimos::Message>] the array of deimos messages in batch mode
-      # @return [Array<ActiveRecord>] Array of ActiveRecord objects
-      def build_records(messages)
-        messages.map do |m|
-          attrs = if self.method(:record_attributes).parameters.size == 2
-                    record_attributes(m.payload, m.key)
-                  else
-                    record_attributes(m.payload)
-                  end
-
-          attrs = attrs&.merge(record_key(m.key))
-          @klass.new(attrs) unless attrs.nil?
-        end
-      end
-
-      # Filters list of Active Records by applying active record validations.
-      # Tip: Add validates_associated in ActiveRecord model to validate associated models
-      # Optionally inherit this method and apply more filters in the application code
-      # The default implementation throws ActiveRecord::RecordInvalid by default
-      # @param records Array<ActiveRecord> - List of active records which will be subjected to model validations
-      # @return valid Array<ActiveRecord> - Subset of records that passed the model validations
-      def filter_records(records)
-        records.each(&:validate!)
-      end
-
-      # Returns true if MySQL Adapter is currently used
-      def mysql_adapter?
-        ActiveRecord::Base.connection.adapter_name.downcase =~ /mysql/
-      end
-
-      # Checks whether the entities has necessary columns for `association_list` to work
-      # @return void
-      def _validate_associations(entities)
-        raise Deimos::MissingImplementationError unless mysql_adapter?
-
-        return if entities.first.respond_to?(@bulk_import_id_column)
-
-        raise "Create bulk_import_id on #{entities.first.class} and set it in `build_records` for associations." \
-              ' Run rails g deimos:bulk_import_id {table} to create the migration.'
-      end
-
-      # Fills Primary Key ID on in-memory objects.
-      # Uses @bulk_import_id_column on in-memory records to fetch saved records in database.
-      # @return void
-      def _fill_primary_key_on_entities(entities)
-        table_by_bulk_import_id = @klass.
-          where(@bulk_import_id_column => entities.map { |e| e[@bulk_import_id_column] }).
-          select(:id, @bulk_import_id_column).
-          index_by { |e| e[@bulk_import_id_column] }
-        # update IDs in upsert entity
-        entities.each { |entity| entity.id = table_by_bulk_import_id[entity[@bulk_import_id_column]].id }
-      end
     end
   end
 end
@@ -0,0 +1,78 @@
+# frozen_string_literal: true
+
+module Deimos
+  module ActiveRecordConsume
+    # Keeps track of both an ActiveRecord instance and more detailed attributes.
+    # The attributes are needed for nested associations.
+    class BatchRecord
+      # @return [ActiveRecord::Base]
+      attr_accessor :record
+      # @return [Hash] a set of association information, represented by a hash of attributes.
+      # For has_one, the format would be e.g. { 'detail' => { 'foo' => 'bar'}}. For has_many, it would
+      # be an array, e.g. { 'details' => [{'foo' => 'bar'}, {'foo' => 'baz'}]}
+      attr_accessor :associations
+      # @return [String] A unique UUID used to associate the auto-increment ID back to
+      # the in-memory record.
+      attr_accessor :bulk_import_id
+      # @return [String] The column name to use for bulk IDs - defaults to `bulk_import_id`.
+      attr_accessor :bulk_import_column
+
+      delegate :valid?, to: :record
+
+      # @param klass [Class < ActiveRecord::Base]
+      # @param attributes [Hash] the full attribute list, including associations.
+      # @param bulk_import_column [String]
+      def initialize(klass:, attributes:, bulk_import_column: nil)
+        @klass = klass
+        if bulk_import_column
+          self.bulk_import_column = bulk_import_column
+          validate_import_id!
+          self.bulk_import_id = SecureRandom.uuid
+          attributes[bulk_import_column] = bulk_import_id
+        end
+        attributes = attributes.with_indifferent_access
+        self.record = klass.new(attributes.slice(*klass.column_names))
+        assoc_keys = attributes.keys.select { |k| klass.reflect_on_association(k) }
+        # a hash with just the association keys, removing all actual column information.
+        self.associations = attributes.slice(*assoc_keys)
+      end
+
+      # Checks whether the entities has necessary columns for association saving to work
+      # @return void
+      def validate_import_id!
+        return if @klass.column_names.include?(self.bulk_import_column.to_s)
+
+        raise "Create bulk_import_id on the #{@klass.table_name} table." \
+              ' Run rails g deimos:bulk_import_id {table} to create the migration.'
+      end
+
+      # @return [Class < ActiveRecord::Base]
+      def klass
+        self.record.class
+      end
+
+      # Create a list of BatchRecord instances representing associated objects for the given
+      # association name.
+      # @param assoc_name [String]
+      # @param bulk_import_id [String] A UUID which should be set on *every* sub-record. Unlike the
+      # parent bulk_insert_id, where each record has a unique UUID,
+      # this is used to detect and delete old data, so this is basically a "session ID" for this
+      # bulk upsert.
+      # @return [Array<BatchRecord>]
+      def sub_records(assoc_name, bulk_import_id=nil)
+        attr_list = self.associations[assoc_name.to_s]
+        assoc = self.klass.reflect_on_association(assoc_name)
+        Array.wrap(attr_list).map { |attrs|
+          # Set the ID of the original object, e.g. widgets -> details, this will set widget_id.
+          attrs[assoc.foreign_key] = self.record[assoc.active_record_primary_key]
+          if bulk_import_id
+            attrs[self.bulk_import_column] = bulk_import_id
+          end
+          BatchRecord.new(klass: assoc.klass, attributes: attrs) if attrs
+        }.compact
+      end
+
+      # @return [String,Integer]
+    end
+  end
+end
|
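The `BatchRecord` constructor above splits one attribute hash into two parts: values that map to table columns and values that describe associations. The following plain-Ruby sketch illustrates that split. It assumes hard-coded `COLUMN_NAMES` and `ASSOCIATION_NAMES` lists as hypothetical stand-ins for `klass.column_names` and `klass.reflect_on_association`, and `split_attributes` is an illustrative helper, not a Deimos method.

```ruby
# Hypothetical metadata standing in for klass.column_names and
# klass.reflect_on_association; a real model would supply both.
COLUMN_NAMES = %w(id first_name bulk_import_id).freeze
ASSOCIATION_NAMES = %w(images).freeze

# Split one attribute hash into column values and association payloads.
# Keys are normalized to strings so symbol and string keys behave alike.
def split_attributes(attributes)
  attrs = attributes.transform_keys(&:to_s)
  columns = attrs.select { |k, _| COLUMN_NAMES.include?(k) }
  associations = attrs.select { |k, _| ASSOCIATION_NAMES.include?(k) }
  [columns, associations]
end
```

Keys that are neither a column nor an association are simply dropped from both hashes in this sketch.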
@@ -0,0 +1,78 @@
+# frozen_string_literal: true
+
+module Deimos
+  module ActiveRecordConsume
+    # A set of BatchRecords which typically are worked with together (hence the batching!)
+    class BatchRecordList
+      # @return [Array<BatchRecord>]
+      attr_accessor :batch_records
+      attr_accessor :klass, :bulk_import_column
+
+      delegate :empty?, :map, to: :batch_records
+
+      # @param records [Array<BatchRecord>]
+      def initialize(records)
+        self.batch_records = records
+        self.klass = records.first&.klass
+        self.bulk_import_column = records.first&.bulk_import_column&.to_sym
+      end
+
+      # Filter out any invalid records.
+      # @param method [Proc]
+      def filter!(method)
+        self.batch_records.delete_if { |record| !method.call(record.record) }
+      end
+
+      # Get the original ActiveRecord objects.
+      # @return [Array<ActiveRecord::Base>]
+      def records
+        self.batch_records.map(&:record)
+      end
+
+      # Get the list of relevant associations, based on the keys of the association hashes of all
+      # records in this list.
+      # @return [Array<ActiveRecord::Reflection::AssociationReflection>]
+      def associations
+        return @associations if @associations
+
+        keys = self.batch_records.map { |r| r.associations.keys }.flatten.uniq.map(&:to_sym)
+        @associations = self.klass.reflect_on_all_associations.select { |assoc| keys.include?(assoc.name) }
+      end
+
+      # Go back to the DB and use the bulk_import_id to set the actual primary key (`id`) of the
+      # records.
+      def fill_primary_keys!
+        primary_col = self.klass.primary_key
+        bulk_import_map = self.klass.
+          where(self.bulk_import_column => self.batch_records.map(&:bulk_import_id)).
+          select(primary_col, self.bulk_import_column).
+          index_by(&self.bulk_import_column).to_h
+        self.batch_records.each do |r|
+          r.record[primary_col] = bulk_import_map[r.bulk_import_id][primary_col]
+        end
+      end
+
+      # @param [String] assoc_name
+      # @return [Array<Integer,String>]
+      def primary_keys(assoc_name)
+        assoc = self.associations.find { |a| a.name == assoc_name }
+        self.records.map do |record|
+          record[assoc.active_record_primary_key]
+        end
+      end
+
+      # @param assoc [ActiveRecord::Reflection::AssociationReflection]
+      # @param import_id [String]
+      def delete_old_records(assoc, import_id)
+        return if self.batch_records.none?
+
+        primary_keys = self.primary_keys(assoc.name)
+        assoc.klass.
+          where(assoc.foreign_key => primary_keys).
+          where("#{self.bulk_import_column} != ?", import_id).
+          delete_all
+      end
+
+    end
+  end
+end
|
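`fill_primary_keys!` above recovers database-assigned IDs after a bulk insert by looking rows up through the per-record UUID written to the bulk import column. A minimal plain-Ruby sketch of the matching step follows. The `PendingRecord` struct and the `db_rows` array of hashes are hypothetical stand-ins for ActiveRecord objects and a `SELECT` result.

```ruby
require 'securerandom'

# Hypothetical in-memory record: id is unknown until the rows are re-read.
PendingRecord = Struct.new(:id, :name, :bulk_import_id)

# Copy database-assigned ids onto in-memory records by matching the
# per-record UUID that was written alongside each row.
# db_rows: hashes with :id and :bulk_import_id, as a SELECT might return.
def fill_primary_keys!(records, db_rows)
  by_import_id = db_rows.to_h { |row| [row[:bulk_import_id], row] }
  records.each do |record|
    # fetch raises KeyError if a record was never written, which surfaces
    # the mismatch instead of silently leaving id as nil.
    row = by_import_id.fetch(record.bulk_import_id)
    record.id = row[:id]
  end
end
```

The lookup does not depend on row order, which matters because a bulk insert gives no guarantee that rows come back in the order they were submitted.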
@@ -0,0 +1,92 @@
+# frozen_string_literal: true
+
+module Deimos
+  module ActiveRecordConsume
+    # Responsible for updating the database itself.
+    class MassUpdater
+
+      # @param klass [Class < ActiveRecord::Base]
+      def default_keys(klass)
+        [klass.primary_key]
+      end
+
+      # @param klass [Class < ActiveRecord::Base]
+      def default_cols(klass)
+        klass.column_names - %w(created_at updated_at)
+      end
+
+      # @param klass [Class < ActiveRecord::Base]
+      # @param key_col_proc [Proc<Class < ActiveRecord::Base>]
+      # @param col_proc [Proc<Class < ActiveRecord::Base>]
+      # @param replace_associations [Boolean]
+      def initialize(klass, key_col_proc: nil, col_proc: nil, replace_associations: true)
+        @klass = klass
+        @replace_associations = replace_associations
+
+        @key_cols = {}
+        @key_col_proc = key_col_proc
+
+        @columns = {}
+        @col_proc = col_proc
+      end
+
+      # @param klass [Class < ActiveRecord::Base]
+      def columns(klass)
+        @columns[klass] ||= @col_proc&.call(klass) || self.default_cols(klass)
+      end
+
+      # @param klass [Class < ActiveRecord::Base]
+      def key_cols(klass)
+        @key_cols[klass] ||= @key_col_proc&.call(klass) || self.default_keys(klass)
+      end
+
+      # @param record_list [BatchRecordList]
+      def save_records_to_database(record_list)
+        columns = self.columns(record_list.klass)
+        key_cols = self.key_cols(record_list.klass)
+        record_list.records.each(&:validate!)
+
+        options = if @key_cols.empty?
+                    {} # Can't upsert with no key, just do regular insert
+                  elsif ActiveRecord::Base.connection.adapter_name.downcase =~ /mysql/
+                    {
+                      on_duplicate_key_update: columns
+                    }
+                  else
+                    {
+                      on_duplicate_key_update: {
+                        conflict_target: key_cols,
+                        columns: columns
+                      }
+                    }
+                  end
+        record_list.klass.import!(columns, record_list.records, options)
+      end
+
+      # Imports associated objects and import them to database table
+      # The base table is expected to contain bulk_import_id column for indexing associated objects with id
+      # @param record_list [BatchRecordList]
+      def import_associations(record_list)
+        record_list.fill_primary_keys!
+
+        import_id = @replace_associations ? SecureRandom.uuid : nil
+        record_list.associations.each do |assoc|
+          sub_records = record_list.map { |r| r.sub_records(assoc.name, import_id) }.flatten
+          next unless sub_records.any?
+
+          sub_record_list = BatchRecordList.new(sub_records)
+
+          save_records_to_database(sub_record_list)
+          record_list.delete_old_records(assoc, import_id) if import_id
+        end
+      end
+
+      # @param record_list [BatchRecordList]
+      def mass_update(record_list)
+        save_records_to_database(record_list)
+        import_associations(record_list) if record_list.associations.any?
+      end
+
+    end
+  end
+end
|
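Two small decisions in `MassUpdater` above are easy to check in isolation: falling back to defaults when a consumer override returns `nil`, and choosing the upsert options by database adapter. The sketch below models both in plain Ruby. `upsert_options` and `resolve` are hypothetical helpers that take their inputs as arguments, whereas the class in the diff reads them from ActiveRecord and from the procs it was given.

```ruby
# Build the options hash for an activerecord-import style upsert.
# adapter_name, key_cols and columns are passed in explicitly here.
def upsert_options(adapter_name, key_cols, columns)
  if key_cols.empty?
    {} # no key to conflict on, so fall back to a plain insert
  elsif adapter_name.downcase.include?('mysql')
    { on_duplicate_key_update: columns }
  else
    { on_duplicate_key_update: { conflict_target: key_cols, columns: columns } }
  end
end

# Prefer the consumer-supplied value and fall back to a default when the
# override is missing or returns nil (the `proc&.call || default` pattern).
def resolve(override, default)
  override&.call || default
end
```

This is why the README example can return `nil # use default` from `columns` and `key_columns`. Note that the sketch tests the resolved `key_cols` list for the class being saved; in the diff, the emptiness check is written against `@key_cols`, the per-class cache hash.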
@@ -30,16 +30,9 @@ module Deimos
         config[:record_class] = klass
       end
 
-      # @
-
-
-      def association_list(associations)
-        config[:association_list] = Array(associations)
-      end
-
-      # @param
-      def bulk_import_id_column(name)
-        config[:bulk_import_id_column] = name
+      # @return [String,nil]
+      def bulk_import_id_column
+        config[:bulk_import_id_column]
       end
 
       # @param val [Boolean] Turn pre-compaction of the batch on or off. If true,
@@ -59,8 +52,6 @@ module Deimos
     # Setup
     def initialize
       @klass = self.class.config[:record_class]
-      @association_list = self.class.config[:association_list]
-      @bulk_import_id_column = self.class.config[:bulk_import_id_column] || :bulk_import_id
       @converter = ActiveRecordConsume::SchemaModelConverter.new(self.class.decoder, @klass)
 
       if self.class.config[:key_schema]