elasticsearch-model-transactional_callbacks 0.1.0

@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA256:
3
+ metadata.gz: dd0a0e00471ca007bf1fe2da314b50cabc442a6caade0732ae83132d880d69f2
4
+ data.tar.gz: 9a259c08c9a516118694b8a9b38433051b4d6ca4f7f08aabe1a2a243a18641a3
5
+ SHA512:
6
+ metadata.gz: 9df1626b1e0b86eb567177fb8a19d2475e4ae18038a1050e15b4e85023f86a68ef73cf2738e15d42958de739a52186783a6f0ae1778514d32dca65d090435905
7
+ data.tar.gz: 7c2cd5646a7c8b636702bd7484e702a46cf9f086effb4bb0fc7b4d77416c5c8feec07d3fff7e450a48baa05eac5ef93ba482b2e3fa4363b1104efab4fbfde973
@@ -0,0 +1,104 @@
1
+ # Elasticsearch::Model::TransactionalCallbacks
2
+
3
+ The [`elasticsearch-model`](https://github.com/elastic/elasticsearch-rails/tree/master/elasticsearch-model)
4
+ works great in simplifying the integration of Ruby classes ("models") with the
5
+ [Elasticsearch](http://www.elasticsearch.org/) search and analytics engine.
6
+ However, it falls short when it comes to updating the indexed documents asynchronously.
7
+
8
+ Built-in support for updating the indexed documents comes in the form of `Elasticsearch::Model::Callbacks`
9
+ which updates each related document individually inside the same thread where the changes were made.
10
+ Depending on the size of your application, and the size of the changes themselves, triggering N
11
+ indexing requests to Elasticsearch could be negligible, or it could slow down the request-response
12
+ cycle considerably and render it unusable.
13
+
14
+ This gem aims to solve this by providing a way to update the index asynchronously via `ActiveJob`.
15
+
16
+ ## Usage
17
+
18
+ The minimum is to include `Elasticsearch::Model::TransactionalCallbacks` into any model
19
+ which could benefit from asynchronous indexing, e.g.
20
+
21
+ ```ruby
22
+ class User < ApplicationRecord
23
+ include Elasticsearch::Model
24
+ include Elasticsearch::Model::TransactionalCallbacks
25
+
26
+ index_name 'users'
27
+ document_type 'user'
28
+
29
+ mappings do
30
+ # indexes for users
31
+ end
32
+ end
33
+ ```
34
+
35
+ However, this trades N+1 indexing requests for N+1 database queries when your `#as_indexed_json`
36
+ pulls data from associated models, e.g.
37
+
38
+ ```ruby
39
+ class Post < ApplicationRecord
40
+ include Elasticsearch::Model
41
+ include Elasticsearch::Model::TransactionalCallbacks
42
+
43
+ has_many :taggings, as: :taggable
44
+ has_many :tags, through: :taggings
45
+
46
+ index_name 'posts'
47
+ document_type 'post'
48
+
49
+ mappings dynamic: false do
50
+ indexes :subject, type: 'text', analyzer: 'english'
51
+ indexes :tags, type: 'keyword'
52
+ end
53
+
54
+ def as_indexed_json(_options = {})
55
+ {
56
+ subject: subject,
57
+ tags: tags.map(&:key) # FIXME: this triggers n+1 queries
58
+ }
59
+ end
60
+ end
61
+ ```
62
+
63
+ To get around this, you can define a `scope` called `preload_for_import` like so:
64
+
65
+ ```ruby
66
+ class Post < ApplicationRecord
67
+ # ...snip...
68
+ scope :preload_for_import, -> { preload(:tags) }
69
+ # ...snip...
70
+ end
71
+ ```
72
+
73
+ The library will call this scope automatically when it loads records for indexing.
74
+
75
+ ## Compatibility
76
+ This library is compatible with, and tested against, Elasticsearch 5. Some work might be needed to make it work with Elasticsearch 6.
77
+
78
+ ## Installation
79
+ Add this line to your application's Gemfile:
80
+
81
+ ```ruby
82
+ gem 'elasticsearch-model-transactional_callbacks'
83
+ ```
84
+
85
+ And then execute:
86
+ ```bash
87
+ $ bundle
88
+ ```
89
+
90
+ Or install it yourself as:
91
+ ```bash
92
+ $ gem install elasticsearch-model-transactional_callbacks
93
+ ```
94
+
95
+ ## Contributing
96
+ Any and all kinds of help are welcome! We are especially interested in:
97
+
98
+ - sample use cases which are not yet supported,
99
+ - compatibility with elasticsearch 6.0
100
+
101
+ Feel free to file an issue/PR with a sample mapping!
102
+
103
+ ## License
104
+ The gem is available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT).
@@ -0,0 +1,29 @@
1
+ # frozen_string_literal: true
2
+
3
+ begin
4
+ require 'bundler/setup'
5
+ rescue LoadError
6
+ puts 'You must `gem install bundler` and `bundle install` to run rake tasks'
7
+ end
8
+
9
+ require 'rdoc/task'
10
+
11
+ RDoc::Task.new(:rdoc) do |rdoc|
12
+ rdoc.rdoc_dir = 'rdoc'
13
+ rdoc.title = 'Elasticsearch::Model::TransactionalCallbacks'
14
+ rdoc.options << '--line-numbers'
15
+ rdoc.rdoc_files.include('README.md')
16
+ rdoc.rdoc_files.include('lib/**/*.rb')
17
+ end
18
+
19
+ require 'bundler/gem_tasks'
20
+
21
+ require 'rake/testtask'
22
+
23
+ Rake::TestTask.new(:test) do |t|
24
+ t.libs << 'test'
25
+ t.pattern = 'test/**/*_test.rb'
26
+ t.verbose = false
27
+ end
28
+
29
+ task default: :test
@@ -0,0 +1,32 @@
1
+ # frozen_string_literal: true
2
+
3
+ require 'elasticsearch/model'
4
+ require_relative './transactional_callbacks/railtie'
5
+ require_relative './transactional_callbacks/manager'
6
+
7
+ module Elasticsearch
8
+ module Model
9
+ # Extend Elasticsearch::Model with transactional callbacks for asynchronous indexing
10
+ module TransactionalCallbacks
11
+ extend ActiveSupport::Concern
12
+
13
+ included do
14
+ after_commit :batch_index_document, on: :create
15
+ after_commit :batch_update_document, on: :update
16
+ after_commit :batch_delete_document, on: :destroy
17
+ end
18
+
19
+ def batch_index_document
20
+ Manager.queue.push(:index, self)
21
+ end
22
+
23
+ def batch_update_document
24
+ Manager.queue.push(:update, self)
25
+ end
26
+
27
+ def batch_delete_document
28
+ Manager.queue.push(:delete, self)
29
+ end
30
+ end
31
+ end
32
+ end
@@ -0,0 +1,96 @@
1
+ # frozen_string_literal: true
2
+
3
+ module Elasticsearch
4
+ module Model
5
+ module TransactionalCallbacks
6
+ ##
7
+ # Background job which handles the request to index/update/delete documents asynchronously
8
+ #
9
+ # Elasticsearch::Model::TransactionalCallbacks::BulkIndexingJob.perform_later(
10
+ # document_type: {
11
+ # index: [{ _id: document.id }],
12
+ # update: [{ _id: document.id }],
13
+ # delete: [{ _id: document.id }],
14
+ # }
15
+ # )
16
+ #
17
+ class BulkIndexingJob < ::ActiveJob::Base
18
+ queue_as :default
19
+
20
+ def perform(indexables)
21
+ indexables.each do |document_type, action_map|
22
+ klass = document_type.to_s.camelcase.constantize
23
+ body = transform_batches(klass, action_map)
24
+
25
+ response = bulk_index klass, body
26
+
27
+ ::Rails.logger.error "[ELASTICSEARCH] Bulk request failed: #{response['items']}" if response&.dig('errors')
28
+ end
29
+ end
30
+
31
+ private
32
+
33
+ def transform_batches(klass, action_map)
34
+ reverse_map = build_reverse_map(action_map)
35
+ resources = klass.where id: reverse_map.keys
36
+
37
+ preload(resources).find_each.map { |resource|
38
+ action, option = reverse_map[resource.id]
39
+
40
+ send "transform_#{action}", resource, option
41
+ } + action_map.fetch(:delete, []).map { |option|
42
+ transform_delete(option)
43
+ }
44
+ end
45
+
46
+ def build_reverse_map(action_map)
47
+ action_map.each_with_object({}) { |map, memo|
48
+ action, options = map
49
+
50
+ next if action == :delete
51
+
52
+ options.each do |option|
53
+ memo[option[:_id]] = [action, option]
54
+ end
55
+ }
56
+ end
57
+
58
+ def preload(resources)
59
+ resources.respond_to?(:preload_for_import) ? resources.preload_for_import : resources
60
+ end
61
+
62
+ def transform_index(resource, option)
63
+ { index: option.merge(data: to_indexed_json(resource)) }
64
+ end
65
+ # Elasticsearch does support the update operation in its bulk API,
66
+ # but it fails when the update targets a missing document,
67
+ # while index works for both new and existing documents.
68
+ #
69
+ # Because of this, we use index for updates to avoid a race condition
70
+ # where a document is updated immediately after it is created,
71
+ # at which point Elasticsearch might not be aware of the document yet.
72
+ alias transform_update transform_index
73
+
74
+ def transform_delete(option)
75
+ { delete: option }
76
+ end
77
+
78
+ def to_indexed_json(resource)
79
+ return resource.as_indexed_json if resource.respond_to?(:as_indexed_json)
80
+
81
+ resource.__elasticsearch__.as_indexed_json
82
+ end
83
+
84
+ def bulk_index(klass, body)
85
+ return if body.blank?
86
+
87
+ klass.__elasticsearch__.client.bulk(
88
+ index: klass.index_name,
89
+ type: klass.document_type,
90
+ body: body
91
+ )
92
+ end
93
+ end
94
+ end
95
+ end
96
+ end
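The shape of the bulk request body built by the job above can be sketched in plain Ruby. This is a minimal illustration, not the gem's code: the top-level `transform_index` and `transform_delete` methods are stand-ins for the job's private helpers, and the `documents` hash is a hypothetical substitute for records loaded from the database and serialized with `as_indexed_json`.

```ruby
# Stand-ins for the job's private transform_* helpers (assumed shapes).
def transform_index(document, option)
  { index: option.merge(data: document) }
end

def transform_delete(option)
  { delete: option }
end

# Shape of the argument the job receives for a single document type.
action_map = {
  index: [{ _id: 1 }],
  update: [{ _id: 2 }],
  delete: [{ _id: 3 }]
}

# Hypothetical serialized documents, keyed by id.
documents = {
  1 => { subject: 'First post' },
  2 => { subject: 'Edited post' }
}

body = action_map.flat_map do |action, options|
  options.map do |option|
    if action == :delete
      # Deleted records no longer exist, so only the id is sent.
      transform_delete(option)
    else
      # Updates are sent as index operations, mirroring the alias in the job.
      transform_index(documents.fetch(option[:_id]), option)
    end
  end
end
```

Each entry in `body` is a single-key hash naming the bulk operation, which is the form the job hands to the client's `bulk` call.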
@@ -0,0 +1,42 @@
1
+ # frozen_string_literal: true
2
+
3
+ require_relative './bulk_indexing_job'
4
+ require_relative './queue'
5
+
6
+ module Elasticsearch
7
+ module Model
8
+ module TransactionalCallbacks
9
+ module Manager # :nodoc:
10
+ class << self
11
+ def capture
12
+ counter_stack.push(:lol)
13
+
14
+ yield.tap do
15
+ register_job if counter_stack.length == 1
16
+ end
17
+ ensure
18
+ counter_stack.pop
19
+ end
20
+
21
+ def queue
22
+ Thread.current[:elasticsearch_transactional_queue] ||= Queue.new
23
+ end
24
+
25
+ private
26
+
27
+ def counter_stack
28
+ Thread.current[:elasticsearch_transactional_counter] ||= []
29
+ end
30
+
31
+ def register_job
32
+ return if queue.empty?
33
+
34
+ BulkIndexingJob.perform_later(queue.to_h)
35
+
36
+ queue.reset!
37
+ end
38
+ end
39
+ end
40
+ end
41
+ end
42
+ end
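The nesting behaviour of `Manager.capture` can be sketched without Rails. In this simplified stand-in, `CaptureSketch`, `pending` and `flushed` are hypothetical names: work is collected per thread and flushed only when the outermost block finishes, which is the role `register_job` plays above.

```ruby
# Simplified stand-in for Manager: flushes collected work only when the
# outermost capture block finishes.
module CaptureSketch
  class << self
    # Batches that have been "enqueued" (here: just recorded).
    def flushed
      @flushed ||= []
    end

    # Work collected in the current thread.
    def pending
      Thread.current[:capture_sketch_pending] ||= []
    end

    def capture
      depth_stack.push(true)

      result = yield
      flush if depth_stack.length == 1
      result
    ensure
      depth_stack.pop
    end

    private

    def depth_stack
      Thread.current[:capture_sketch_depth] ||= []
    end

    def flush
      return if pending.empty?

      flushed << pending.dup
      pending.clear
    end
  end
end

CaptureSketch.capture do
  CaptureSketch.pending << :a

  CaptureSketch.capture do
    CaptureSketch.pending << :b
  end

  # The inner block has finished, but nothing has been flushed yet.
  CaptureSketch.pending << :c
end
```

Because the flush is skipped whenever the depth is greater than one, nested blocks end up contributing to a single batch.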
@@ -0,0 +1,93 @@
1
+ # frozen_string_literal: true
2
+
3
+ module Elasticsearch
4
+ module Model
5
+ module TransactionalCallbacks
6
+ ##
7
+ # Responsible for storing a queue of resources to be indexed/updated/deleted from elasticsearch
8
+ #
9
+ class Queue
10
+ attr_reader :state
11
+
12
+ delegate :empty?, to: :state
13
+
14
+ def initialize
15
+ reset!
16
+ end
17
+
18
+ def reset!
19
+ @state = {}
20
+ end
21
+
22
+ def push(action, resource)
23
+ do_push action, resource, _id: resource.id, _parent: parent_id(resource)
24
+ end
25
+
26
+ def push_all(action, relation)
27
+ return unless ::Elasticsearch::Model::TransactionalCallbacks.in?(relation.included_modules)
28
+
29
+ pluck_ids(relation) do |ids|
30
+ do_push action, relation, ids
31
+ end
32
+ end
33
+
34
+ def to_h
35
+ state
36
+ end
37
+
38
+ private
39
+
40
+ def do_push(action, resource_or_relation, options)
41
+ type = document_type(resource_or_relation)
42
+
43
+ prepare_state_for(type)
44
+
45
+ state[type][action] << options.compact
46
+ state[type][action].uniq!
47
+ end
48
+
49
+ def prepare_state_for(type)
50
+ state[type] ||= {
51
+ index: [],
52
+ update: [],
53
+ delete: []
54
+ }
55
+ end
56
+
57
+ def resource_class(resource_or_relation)
58
+ resource_or_relation.respond_to?(:document_type) ? resource_or_relation : resource_or_relation.class
59
+ end
60
+
61
+ def document_type(resource_or_relation)
62
+ resource_class(resource_or_relation).document_type.to_sym
63
+ end
64
+
65
+ def child?(resource_or_relation)
66
+ parent_type(resource_or_relation).present?
67
+ end
68
+
69
+ def parent_type(resource_or_relation)
70
+ resource_class(resource_or_relation).mapping.options.dig(:_parent, :type)
71
+ end
72
+
73
+ def parent_id(resource)
74
+ return unless child?(resource)
75
+
76
+ resource.public_send "#{parent_type resource}_id"
77
+ end
78
+
79
+ def pluck_ids(relation)
80
+ if child?(relation)
81
+ relation.pluck(:id, "#{parent_type relation}_id").map do |ids|
82
+ yield _id: ids[0], _parent: ids[1]
83
+ end
84
+ else
85
+ relation.pluck(:id).map do |id|
86
+ yield _id: id
87
+ end
88
+ end
89
+ end
90
+ end
91
+ end
92
+ end
93
+ end
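The state that `Queue` accumulates can be illustrated with a small plain-Ruby sketch. The `push` lambda below is a hypothetical stand-in for `do_push`, with the document type passed in directly instead of being derived from the model.

```ruby
state = {}

# Stand-in for do_push: nil options are dropped and duplicates collapsed.
push = lambda do |type, action, options|
  state[type] ||= { index: [], update: [], delete: [] }
  state[type][action] << options.compact
  state[type][action].uniq!
end

# A record without a parent: the nil _parent is removed by compact.
push.call(:post, :update, { _id: 1, _parent: nil })
# Pushing the same record again does not add a second entry.
push.call(:post, :update, { _id: 1, _parent: nil })
# A child record keeps its _parent id.
push.call(:post, :delete, { _id: 2, _parent: 7 })
```

The resulting hash, keyed by document type and then by action, is what `to_h` would return.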
@@ -0,0 +1,20 @@
1
+ # frozen_string_literal: true
2
+
3
+ require_relative './relation'
4
+ require_relative './transaction'
5
+
6
+ module Elasticsearch
7
+ module Model
8
+ module TransactionalCallbacks
9
+ class Railtie < ::Rails::Railtie # :nodoc:
10
+ initializer 'elasticsearch.model.transactional_callbacks.initialize' do
11
+ ActiveSupport.on_load(:active_record) do
12
+ ActiveRecord::Relation.prepend Elasticsearch::Model::TransactionalCallbacks::Relation
13
+
14
+ extend Elasticsearch::Model::TransactionalCallbacks::Transaction
15
+ end
16
+ end
17
+ end
18
+ end
19
+ end
20
+ end
@@ -0,0 +1,33 @@
1
+ # frozen_string_literal: true
2
+
3
+ require_relative './manager'
4
+
5
+ module Elasticsearch
6
+ module Model
7
+ module TransactionalCallbacks
8
+ ##
9
+ # Override .update_all and .delete_all of ActiveRecord::Relation to batch update/delete
10
+ # the index if the resources in question have a corresponding elasticsearch index
11
+ #
12
+ # This module is automatically prepended to ActiveRecord::Relation by the railtie
13
+ #
14
+ module Relation
15
+ def update_all(*arguments)
16
+ Manager.capture do
17
+ Manager.queue.push_all(:update, self)
18
+
19
+ super(*arguments)
20
+ end
21
+ end
22
+
23
+ def delete_all
24
+ Manager.capture do
25
+ Manager.queue.push_all(:delete, self)
26
+
27
+ super
28
+ end
29
+ end
30
+ end
31
+ end
32
+ end
33
+ end
@@ -0,0 +1,23 @@
1
+ # frozen_string_literal: true
2
+
3
+ require_relative './manager'
4
+
5
+ module Elasticsearch
6
+ module Model
7
+ module TransactionalCallbacks
8
+ ##
9
+ # Override ActiveRecord::Base.transaction to allow Manager to listen for
10
+ # any indexing request from active record after_commit callback
11
+ #
12
+ # ActiveRecord::Base is automatically extended with this module by the railtie
13
+ #
14
+ module Transaction
15
+ def transaction(*args, &block)
16
+ Manager.capture do
17
+ super(*args, &block)
18
+ end
19
+ end
20
+ end
21
+ end
22
+ end
23
+ end
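How an extended module can wrap an existing class method through `super` can be sketched as follows. This assumes the original method itself comes from an extended module; `BaseClassMethods`, `Wrapping` and `Record` are hypothetical names used only for illustration.

```ruby
# Plays the role of the module that originally provides .transaction.
module BaseClassMethods
  def transaction
    yield
  end
end

# Plays the role of the Transaction module: wraps the original method.
module Wrapping
  def transaction(&block)
    log << :enter
    super(&block).tap { log << :leave }
  end
end

class Record
  extend BaseClassMethods

  def self.log
    @log ||= []
  end

  # Extended later, so it sits earlier in method lookup than
  # BaseClassMethods and `super` reaches the original implementation.
  extend Wrapping
end

result = Record.transaction { :done }
```

Note that a method defined directly on the class with `def self.transaction` would take precedence over any extended module, so this pattern relies on the original living in a module.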
@@ -0,0 +1,9 @@
1
+ # frozen_string_literal: true
2
+
3
+ module Elasticsearch
4
+ module Model
5
+ module TransactionalCallbacks
6
+ VERSION = '0.1.0'
7
+ end
8
+ end
9
+ end
@@ -0,0 +1,6 @@
1
+ # frozen_string_literal: true
2
+
3
+ # desc "Explaining what the task does"
4
+ # task :elasticsearch_model_transactional_callbacks do
5
+ # # Task goes here
6
+ # end
metadata ADDED
@@ -0,0 +1,140 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: elasticsearch-model-transactional_callbacks
3
+ version: !ruby/object:Gem::Version
4
+ version: 0.1.0
5
+ platform: ruby
6
+ authors:
7
+ - Ignatius Reza
8
+ autorequire:
9
+ bindir: bin
10
+ cert_chain: []
11
+ date: 2019-02-12 00:00:00.000000000 Z
12
+ dependencies:
13
+ - !ruby/object:Gem::Dependency
14
+ name: elasticsearch-model
15
+ requirement: !ruby/object:Gem::Requirement
16
+ requirements:
17
+ - - ">="
18
+ - !ruby/object:Gem::Version
19
+ version: 5.0.0
20
+ type: :runtime
21
+ prerelease: false
22
+ version_requirements: !ruby/object:Gem::Requirement
23
+ requirements:
24
+ - - ">="
25
+ - !ruby/object:Gem::Version
26
+ version: 5.0.0
27
+ - !ruby/object:Gem::Dependency
28
+ name: rails
29
+ requirement: !ruby/object:Gem::Requirement
30
+ requirements:
31
+ - - ">="
32
+ - !ruby/object:Gem::Version
33
+ version: 5.0.0
34
+ type: :runtime
35
+ prerelease: false
36
+ version_requirements: !ruby/object:Gem::Requirement
37
+ requirements:
38
+ - - ">="
39
+ - !ruby/object:Gem::Version
40
+ version: 5.0.0
41
+ - !ruby/object:Gem::Dependency
42
+ name: minitest-ci
43
+ requirement: !ruby/object:Gem::Requirement
44
+ requirements:
45
+ - - ">="
46
+ - !ruby/object:Gem::Version
47
+ version: '0'
48
+ type: :development
49
+ prerelease: false
50
+ version_requirements: !ruby/object:Gem::Requirement
51
+ requirements:
52
+ - - ">="
53
+ - !ruby/object:Gem::Version
54
+ version: '0'
55
+ - !ruby/object:Gem::Dependency
56
+ name: minitest-stub_any_instance
57
+ requirement: !ruby/object:Gem::Requirement
58
+ requirements:
59
+ - - ">="
60
+ - !ruby/object:Gem::Version
61
+ version: '0'
62
+ type: :development
63
+ prerelease: false
64
+ version_requirements: !ruby/object:Gem::Requirement
65
+ requirements:
66
+ - - ">="
67
+ - !ruby/object:Gem::Version
68
+ version: '0'
69
+ - !ruby/object:Gem::Dependency
70
+ name: rubocop
71
+ requirement: !ruby/object:Gem::Requirement
72
+ requirements:
73
+ - - ">="
74
+ - !ruby/object:Gem::Version
75
+ version: '0'
76
+ type: :development
77
+ prerelease: false
78
+ version_requirements: !ruby/object:Gem::Requirement
79
+ requirements:
80
+ - - ">="
81
+ - !ruby/object:Gem::Version
82
+ version: '0'
83
+ - !ruby/object:Gem::Dependency
84
+ name: sqlite3
85
+ requirement: !ruby/object:Gem::Requirement
86
+ requirements:
87
+ - - "~>"
88
+ - !ruby/object:Gem::Version
89
+ version: 1.3.6
90
+ type: :development
91
+ prerelease: false
92
+ version_requirements: !ruby/object:Gem::Requirement
93
+ requirements:
94
+ - - "~>"
95
+ - !ruby/object:Gem::Version
96
+ version: 1.3.6
97
+ description: Reduce load from your application server by offloading indexing into
98
+ background jobs
99
+ email:
100
+ - lyoneil.de.sire@gmail.com
101
+ executables: []
102
+ extensions: []
103
+ extra_rdoc_files: []
104
+ files:
105
+ - README.md
106
+ - Rakefile
107
+ - lib/elasticsearch/model/transactional_callbacks.rb
108
+ - lib/elasticsearch/model/transactional_callbacks/bulk_indexing_job.rb
109
+ - lib/elasticsearch/model/transactional_callbacks/manager.rb
110
+ - lib/elasticsearch/model/transactional_callbacks/queue.rb
111
+ - lib/elasticsearch/model/transactional_callbacks/railtie.rb
112
+ - lib/elasticsearch/model/transactional_callbacks/relation.rb
113
+ - lib/elasticsearch/model/transactional_callbacks/transaction.rb
114
+ - lib/elasticsearch/model/transactional_callbacks/version.rb
115
+ - lib/tasks/elasticsearch/model/transactional_callbacks_tasks.rake
116
+ homepage: https://github.com/ignatiusreza/elasticsearch-model-transactional_callbacks
117
+ licenses:
118
+ - MIT
119
+ metadata: {}
120
+ post_install_message:
121
+ rdoc_options: []
122
+ require_paths:
123
+ - lib
124
+ required_ruby_version: !ruby/object:Gem::Requirement
125
+ requirements:
126
+ - - ">="
127
+ - !ruby/object:Gem::Version
128
+ version: '0'
129
+ required_rubygems_version: !ruby/object:Gem::Requirement
130
+ requirements:
131
+ - - ">="
132
+ - !ruby/object:Gem::Version
133
+ version: '0'
134
+ requirements: []
135
+ rubygems_version: 3.0.2
136
+ signing_key:
137
+ specification_version: 4
138
+ summary: Extend ElasticSearch::Model with transactional callbacks for asynchronous
139
+ indexing
140
+ test_files: []