elasticsearch-model-transactional_callbacks 0.1.0

@@ -0,0 +1,7 @@
+ ---
+ SHA256:
+   metadata.gz: dd0a0e00471ca007bf1fe2da314b50cabc442a6caade0732ae83132d880d69f2
+   data.tar.gz: 9a259c08c9a516118694b8a9b38433051b4d6ca4f7f08aabe1a2a243a18641a3
+ SHA512:
+   metadata.gz: 9df1626b1e0b86eb567177fb8a19d2475e4ae18038a1050e15b4e85023f86a68ef73cf2738e15d42958de739a52186783a6f0ae1778514d32dca65d090435905
+   data.tar.gz: 7c2cd5646a7c8b636702bd7484e702a46cf9f086effb4bb0fc7b4d77416c5c8feec07d3fff7e450a48baa05eac5ef93ba482b2e3fa4363b1104efab4fbfde973
@@ -0,0 +1,104 @@
+ # Elasticsearch::Model::TransactionalCallbacks
+
+ The [`elasticsearch-model`](https://github.com/elastic/elasticsearch-rails/tree/master/elasticsearch-model)
+ gem works great at simplifying the integration of Ruby classes ("models") with the
+ [Elasticsearch](http://www.elasticsearch.org/) search and analytics engine.
+ However, it falls short when it comes to updating the indexed documents asynchronously.
+
+ Built-in support for updating the indexed documents comes in the form of `Elasticsearch::Model::Callbacks`,
+ which updates each related document individually, inside the same thread where the changes were made.
+ Depending on the size of your application, and the size of the changes themselves, triggering N
+ indexing requests to Elasticsearch could be negligible, or it could slow down the request-response
+ cycle considerably and render the application unusable.
+
+ This gem aims to solve this by providing a way to update the index asynchronously via `ActiveJob`.
+
+ ## Usage
+
+ At a minimum, include `Elasticsearch::Model::TransactionalCallbacks` in any model
+ that could benefit from asynchronous indexing, e.g.
+
+ ```ruby
+ class User < ApplicationRecord
+   include Elasticsearch::Model
+   include Elasticsearch::Model::TransactionalCallbacks
+
+   index_name 'users'
+   document_type 'user'
+
+   mappings do
+     # indexes for users
+   end
+ end
+ ```
+
+ However, this ends up trading N+1 index updates for N+1 database queries if your `#as_indexed_json`
+ pulls data from associated models, e.g.
+
+ ```ruby
+ class Post < ApplicationRecord
+   include Elasticsearch::Model
+   include Elasticsearch::Model::TransactionalCallbacks
+
+   has_many :taggings, as: :taggable
+   has_many :tags, through: :taggings
+
+   index_name 'posts'
+   document_type 'post'
+
+   mappings dynamic: false do
+     indexes :subject, type: 'text', analyzer: 'english'
+     indexes :tags, type: 'keyword'
+   end
+
+   def as_indexed_json(_options = {})
+     {
+       subject: subject,
+       tags: tags.map(&:key) # FIXME: this triggers n+1 queries
+     }
+   end
+ end
+ ```
+
+ To get around this, you can define a `scope` called `preload_for_import`, like so:
+
+ ```ruby
+ class Post < ApplicationRecord
+   # ...snip...
+   scope :preload_for_import, -> { preload(:tags) }
+   # ...snip...
+ end
+ ```
+
+ The library calls this scope automatically when it loads records for indexing.
+
+ ## Compatibility
+ This library is compatible with, and tested against, Elasticsearch 5. Some work might be needed to make it work with Elasticsearch 6.
+
+ ## Installation
+ Add this line to your application's Gemfile:
+
+ ```ruby
+ gem 'elasticsearch-model-transactional_callbacks'
+ ```
+
+ And then execute:
+ ```bash
+ $ bundle
+ ```
+
+ Or install it yourself as:
+ ```bash
+ $ gem install elasticsearch-model-transactional_callbacks
+ ```
+
+ ## Contributing
+ Any and all kinds of help are welcome! We are especially interested in:
+
+ - sample use cases which are not yet supported,
+ - compatibility with Elasticsearch 6.0
+
+ Feel free to file an issue/PR with a sample mapping!
+
+ ## License
+ The gem is available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT).
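The README's central claim, that N per-record indexing requests can collapse into one background job, may be easier to picture with a small sketch. The plain-Ruby example below is only an illustration: the `changes` list and the `batch` helper are hypothetical names that are not part of the gem, and the payload shape is modeled on the example in the `BulkIndexingJob` documentation comment.

```ruby
# Hypothetical change events, as [document_type, action, id] triples.
changes = [
  [:post, :index, 1],
  [:post, :update, 1],
  [:post, :update, 2],
  [:post, :update, 2], # a repeated update to the same record
  [:user, :delete, 7]
]

# Group the events by document type and action, dropping exact duplicates,
# so that a single job payload can describe all of them.
def batch(changes)
  changes.each_with_object({}) do |(type, action, id), payload|
    payload[type] ||= { index: [], update: [], delete: [] }
    entry = { _id: id }
    payload[type][action] << entry unless payload[type][action].include?(entry)
  end
end

payload = batch(changes)
puts payload.inspect
```

Five change events end up in one hash, which is the kind of argument a single job could receive instead of five separate indexing calls.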
@@ -0,0 +1,29 @@
+ # frozen_string_literal: true
+
+ begin
+   require 'bundler/setup'
+ rescue LoadError
+   puts 'You must `gem install bundler` and `bundle install` to run rake tasks'
+ end
+
+ require 'rdoc/task'
+
+ RDoc::Task.new(:rdoc) do |rdoc|
+   rdoc.rdoc_dir = 'rdoc'
+   rdoc.title = 'Elasticsearch::Model::TransactionalCallbacks'
+   rdoc.options << '--line-numbers'
+   rdoc.rdoc_files.include('README.md')
+   rdoc.rdoc_files.include('lib/**/*.rb')
+ end
+
+ require 'bundler/gem_tasks'
+
+ require 'rake/testtask'
+
+ Rake::TestTask.new(:test) do |t|
+   t.libs << 'test'
+   t.pattern = 'test/**/*_test.rb'
+   t.verbose = false
+ end
+
+ task default: :test
@@ -0,0 +1,32 @@
+ # frozen_string_literal: true
+
+ require 'elasticsearch/model'
+ require_relative './transactional_callbacks/railtie'
+ require_relative './transactional_callbacks/manager'
+
+ module Elasticsearch
+   module Model
+     # Extend Elasticsearch::Model with transactional callbacks for asynchronous indexing
+     module TransactionalCallbacks
+       extend ActiveSupport::Concern
+
+       included do
+         after_commit :batch_index_document, on: :create
+         after_commit :batch_update_document, on: :update
+         after_commit :batch_delete_document, on: :destroy
+       end
+
+       def batch_index_document
+         Manager.queue.push(:index, self)
+       end
+
+       def batch_update_document
+         Manager.queue.push(:update, self)
+       end
+
+       def batch_delete_document
+         Manager.queue.push(:delete, self)
+       end
+     end
+   end
+ end
@@ -0,0 +1,96 @@
+ # frozen_string_literal: true
+
+ module Elasticsearch
+   module Model
+     module TransactionalCallbacks
+       ##
+       # Background job which handles requests to index/update/delete documents asynchronously
+       #
+       #   Elasticsearch::Model::TransactionalCallbacks::BulkIndexingJob.perform_later(
+       #     document_type: {
+       #       index: [{ _id: document.id }],
+       #       update: [{ _id: document.id }],
+       #       delete: [{ _id: document.id }],
+       #     }
+       #   )
+       #
+       class BulkIndexingJob < ::ActiveJob::Base
+         queue_as :default
+
+         def perform(indexables)
+           indexables.each do |document_type, action_map|
+             klass = document_type.to_s.camelcase.constantize
+             body = transform_batches(klass, action_map)
+
+             response = bulk_index klass, body
+
+             ::Rails.logger.error "[ELASTICSEARCH] Bulk request failed: #{response['items']}" if response&.dig('errors')
+           end
+         end
+
+         private
+
+         def transform_batches(klass, action_map)
+           reverse_map = build_reverse_map(action_map)
+           resources = klass.where id: reverse_map.keys
+
+           preload(resources).find_each.map { |resource|
+             action, option = reverse_map[resource.id]
+
+             send "transform_#{action}", resource, option
+           } + action_map.fetch(:delete, []).map { |option|
+             transform_delete(option)
+           }
+         end
+
+         def build_reverse_map(action_map)
+           action_map.each_with_object({}) { |map, memo|
+             action, options = map
+
+             next if action == :delete
+
+             options.each do |option|
+               memo[option[:_id]] = [action, option]
+             end
+           }
+         end
+
+         def preload(resources)
+           resources.respond_to?(:preload_for_import) ? resources.preload_for_import : resources
+         end
+
+         def transform_index(resource, option)
+           { index: option.merge(data: to_indexed_json(resource)) }
+         end
+         # Elasticsearch does support an update operation in its bulk API,
+         # but it fails when the update targets a missing document,
+         # while index works for both new and existing documents.
+         #
+         # Because of this, we use index for updates to avoid a race condition
+         # where a document is updated immediately after it is created,
+         # at which point Elasticsearch might not be aware of the document yet.
+         alias transform_update transform_index
+
+         def transform_delete(option)
+           { delete: option }
+         end
+
+         def to_indexed_json(resource)
+           return resource.as_indexed_json if resource.respond_to?(:as_indexed_json)
+
+           resource.__elasticsearch__.as_indexed_json
+         end
+
+         def bulk_index(klass, body)
+           return if body.blank?
+
+           klass.__elasticsearch__.client.bulk(
+             index: klass.index_name,
+             type: klass.document_type,
+             body: body
+           )
+         end
+       end
+     end
+   end
+ end
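To make the job's transformation step easier to follow, here is a minimal plain-Ruby sketch of how an action map could be turned into a bulk request body. It mirrors the reverse-map idea and the "index for update" choice from the file above, but it is a simplification under stated assumptions: `Record` is a hypothetical stand-in for an ActiveRecord model, `bulk_body` is an illustrative helper, and no Elasticsearch client is involved.

```ruby
# Hypothetical stand-in for a database-backed model.
Record = Struct.new(:id, :subject) do
  def as_indexed_json
    { subject: subject }
  end
end

# Map each record id to the [action, option] pair that was queued for it.
# Deletes are skipped because the deleted rows can no longer be loaded.
def build_reverse_map(action_map)
  action_map.each_with_object({}) do |(action, options), memo|
    next if action == :delete

    options.each { |option| memo[option[:_id]] = [action, option] }
  end
end

# Build the list of bulk operations. Both :index and :update become an
# index operation, so a not-yet-indexed document does not cause a failure.
def bulk_body(records, action_map)
  reverse_map = build_reverse_map(action_map)

  upserts = records.select { |record| reverse_map.key?(record.id) }.map do |record|
    _action, option = reverse_map[record.id]
    { index: option.merge(data: record.as_indexed_json) }
  end
  deletes = action_map.fetch(:delete, []).map { |option| { delete: option } }

  upserts + deletes
end

records = [Record.new(1, 'Hello'), Record.new(2, 'World')]
action_map = { index: [{ _id: 1 }], update: [{ _id: 2 }], delete: [{ _id: 3 }] }
body = bulk_body(records, action_map)
puts body.inspect
```

Note that the delete entry is built from the queued option alone, which is why the sketch never needs to load record 3.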
@@ -0,0 +1,42 @@
+ # frozen_string_literal: true
+
+ require_relative './bulk_indexing_job'
+ require_relative './queue'
+
+ module Elasticsearch
+   module Model
+     module TransactionalCallbacks
+       module Manager # :nodoc:
+         class << self
+           def capture
+             counter_stack.push(:lol)
+
+             yield.tap do
+               register_job if counter_stack.length == 1
+             end
+           ensure
+             counter_stack.pop
+           end
+
+           def queue
+             Thread.current[:elasticsearch_transactional_queue] ||= Queue.new
+           end
+
+           private
+
+           def counter_stack
+             Thread.current[:elasticsearch_transactional_counter] ||= []
+           end
+
+           def register_job
+             return if queue.empty?
+
+             BulkIndexingJob.perform_later(queue.to_h)
+
+             queue.reset!
+           end
+         end
+       end
+     end
+   end
+ end
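The `capture` method above relies on a nesting counter so that only the outermost block triggers a job. The following plain-Ruby sketch shows that pattern in isolation. It is a simplified illustration rather than the gem's code: `BatchScope` is a hypothetical class, it keeps its state in instance variables instead of `Thread.current`, and "flushing" just records the batch instead of enqueuing a job.

```ruby
# Hypothetical illustration of "flush only when the outermost block ends".
class BatchScope
  attr_reader :flushed

  def initialize
    @depth = 0
    @pending = []
    @flushed = []
  end

  def push(item)
    @pending << item
  end

  # Nested calls only increase the depth; the flush happens once,
  # when the outermost block returns without raising.
  def capture
    @depth += 1
    result = yield
    flush if @depth == 1
    result
  ensure
    @depth -= 1
  end

  private

  def flush
    return if @pending.empty?

    @flushed << @pending.dup
    @pending.clear
  end
end

scope = BatchScope.new
value = scope.capture do
  scope.push(:a)
  scope.capture { scope.push(:b) } # nested: does not flush on its own
  scope.push(:c)
  :done
end
puts scope.flushed.inspect
```

Everything pushed inside the nested blocks ends up in one batch, which matches the intent of wrapping nested transactions in a single bulk job.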
@@ -0,0 +1,93 @@
+ # frozen_string_literal: true
+
+ module Elasticsearch
+   module Model
+     module TransactionalCallbacks
+       ##
+       # Responsible for storing a queue of resources to be indexed/updated/deleted from elasticsearch
+       #
+       class Queue
+         attr_reader :state
+
+         delegate :empty?, to: :state
+
+         def initialize
+           reset!
+         end
+
+         def reset!
+           @state = {}
+         end
+
+         def push(action, resource)
+           do_push action, resource, _id: resource.id, _parent: parent_id(resource)
+         end
+
+         def push_all(action, relation)
+           return unless ::Elasticsearch::Model::TransactionalCallbacks.in?(relation.included_modules)
+
+           pluck_ids(relation) do |ids|
+             do_push action, relation, ids
+           end
+         end
+
+         def to_h
+           state
+         end
+
+         private
+
+         def do_push(action, resource_or_relation, options)
+           type = document_type(resource_or_relation)
+
+           prepare_state_for(type)
+
+           state[type][action] << options.compact
+           state[type][action].uniq!
+         end
+
+         def prepare_state_for(type)
+           state[type] ||= {
+             index: [],
+             update: [],
+             delete: []
+           }
+         end
+
+         def resource_class(resource_or_relation)
+           resource_or_relation.respond_to?(:document_type) ? resource_or_relation : resource_or_relation.class
+         end
+
+         def document_type(resource_or_relation)
+           resource_class(resource_or_relation).document_type.to_sym
+         end
+
+         def child?(resource_or_relation)
+           parent_type(resource_or_relation).present?
+         end
+
+         def parent_type(resource_or_relation)
+           resource_class(resource_or_relation).mapping.options.dig(:_parent, :type)
+         end
+
+         def parent_id(resource)
+           return unless child?(resource)
+
+           resource.public_send "#{parent_type resource}_id"
+         end
+
+         def pluck_ids(relation)
+           if child?(relation)
+             relation.pluck(:id, "#{parent_type relation}_id").map do |ids|
+               yield _id: ids[0], _parent: ids[1]
+             end
+           else
+             relation.pluck(:id).map do |id|
+               yield _id: id
+             end
+           end
+         end
+       end
+     end
+   end
+ end
@@ -0,0 +1,20 @@
+ # frozen_string_literal: true
+
+ require_relative './relation'
+ require_relative './transaction'
+
+ module Elasticsearch
+   module Model
+     module TransactionalCallbacks
+       class Railtie < ::Rails::Railtie # :nodoc:
+         initializer 'elasticsearch.model.transactional_callbacks.initialize' do
+           ActiveSupport.on_load(:active_record) do
+             ActiveRecord::Relation.prepend Elasticsearch::Model::TransactionalCallbacks::Relation
+
+             extend Elasticsearch::Model::TransactionalCallbacks::Transaction
+           end
+         end
+       end
+     end
+   end
+ end
@@ -0,0 +1,33 @@
+ # frozen_string_literal: true
+
+ require_relative './manager'
+
+ module Elasticsearch
+   module Model
+     module TransactionalCallbacks
+       ##
+       # Override .update_all and .delete_all of ActiveRecord::Relation to batch update/delete
+       # the index if the resources in question have a corresponding elasticsearch index
+       #
+       # This module is automatically prepended to ActiveRecord::Relation by the railtie
+       #
+       module Relation
+         def update_all(*arguments)
+           Manager.capture do
+             Manager.queue.push_all(:update, self)
+
+             super(*arguments)
+           end
+         end
+
+         def delete_all
+           Manager.capture do
+             Manager.queue.push_all(:delete, self)
+
+             super
+           end
+         end
+       end
+     end
+   end
+ end
@@ -0,0 +1,23 @@
+ # frozen_string_literal: true
+
+ require_relative './manager'
+
+ module Elasticsearch
+   module Model
+     module TransactionalCallbacks
+       ##
+       # Override ActiveRecord::Base.transaction to allow Manager to listen for
+       # any indexing request from the Active Record after_commit callbacks
+       #
+       # This module is automatically extended onto ActiveRecord::Base by the railtie
+       #
+       module Transaction
+         def transaction(*args, &block)
+           Manager.capture do
+             super(*args, &block)
+           end
+         end
+       end
+     end
+   end
+ end
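Both `Relation` and `Transaction` wrap an existing method and then call `super`, one via `prepend` on a class and the other via `extend` on a class object. The plain-Ruby sketch below shows why `super` reaches the original method in each case. All names in it (`Database`, `Base`, `Tracing`, `Relation`, `Audited`) are hypothetical stand-ins and do not come from Rails or from this gem.

```ruby
# Stand-in for the module that provides the original class-level method.
module Database
  def transaction
    "committed: #{yield}"
  end
end

class Base
  extend Database
end

# A later `extend` places this module ahead of Database in the lookup
# chain of Base's singleton class, so `super` reaches the original method.
module Tracing
  def log
    @log ||= []
  end

  def transaction(&block)
    log << :begin
    super(&block).tap { log << :end }
  end
end

Base.extend(Tracing)

# Stand-in for a class whose instance method gets wrapped via `prepend`.
class Relation
  def delete_all
    :deleted
  end
end

module Audited
  def delete_all
    Base.log << :delete_all
    super
  end
end

Relation.prepend(Audited)

result = Base.transaction { 'work' }
deleted = Relation.new.delete_all
puts result
puts Base.log.inspect
```

In both cases the wrapper runs first and the original behavior is preserved, which is what lets the gem hook into these methods without redefining them.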
@@ -0,0 +1,9 @@
+ # frozen_string_literal: true
+
+ module Elasticsearch
+   module Model
+     module TransactionalCallbacks
+       VERSION = '0.1.0'
+     end
+   end
+ end
@@ -0,0 +1,6 @@
+ # frozen_string_literal: true
+
+ # desc "Explaining what the task does"
+ # task :elasticsearch_model_transactional_callbacks do
+ #   # Task goes here
+ # end
metadata ADDED
@@ -0,0 +1,140 @@
+ --- !ruby/object:Gem::Specification
+ name: elasticsearch-model-transactional_callbacks
+ version: !ruby/object:Gem::Version
+   version: 0.1.0
+ platform: ruby
+ authors:
+ - Ignatius Reza
+ autorequire:
+ bindir: bin
+ cert_chain: []
+ date: 2019-02-12 00:00:00.000000000 Z
+ dependencies:
+ - !ruby/object:Gem::Dependency
+   name: elasticsearch-model
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: 5.0.0
+   type: :runtime
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: 5.0.0
+ - !ruby/object:Gem::Dependency
+   name: rails
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: 5.0.0
+   type: :runtime
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: 5.0.0
+ - !ruby/object:Gem::Dependency
+   name: minitest-ci
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: '0'
+   type: :development
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: '0'
+ - !ruby/object:Gem::Dependency
+   name: minitest-stub_any_instance
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: '0'
+   type: :development
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: '0'
+ - !ruby/object:Gem::Dependency
+   name: rubocop
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: '0'
+   type: :development
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: '0'
+ - !ruby/object:Gem::Dependency
+   name: sqlite3
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: 1.3.6
+   type: :development
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: 1.3.6
+ description: Reduce load from your application server by offloading indexing into
+   background jobs
+ email:
+ - lyoneil.de.sire@gmail.com
+ executables: []
+ extensions: []
+ extra_rdoc_files: []
+ files:
+ - README.md
+ - Rakefile
+ - lib/elasticsearch/model/transactional_callbacks.rb
+ - lib/elasticsearch/model/transactional_callbacks/bulk_indexing_job.rb
+ - lib/elasticsearch/model/transactional_callbacks/manager.rb
+ - lib/elasticsearch/model/transactional_callbacks/queue.rb
+ - lib/elasticsearch/model/transactional_callbacks/railtie.rb
+ - lib/elasticsearch/model/transactional_callbacks/relation.rb
+ - lib/elasticsearch/model/transactional_callbacks/transaction.rb
+ - lib/elasticsearch/model/transactional_callbacks/version.rb
+ - lib/tasks/elasticsearch/model/transactional_callbacks_tasks.rake
+ homepage: https://github.com/ignatiusreza/elasticsearch-model-transactional_callbacks
+ licenses:
+ - MIT
+ metadata: {}
+ post_install_message:
+ rdoc_options: []
+ require_paths:
+ - lib
+ required_ruby_version: !ruby/object:Gem::Requirement
+   requirements:
+   - - ">="
+     - !ruby/object:Gem::Version
+       version: '0'
+ required_rubygems_version: !ruby/object:Gem::Requirement
+   requirements:
+   - - ">="
+     - !ruby/object:Gem::Version
+       version: '0'
+ requirements: []
+ rubygems_version: 3.0.2
+ signing_key:
+ specification_version: 4
+ summary: Extend ElasticSearch::Model with transactional callbacks for asynchronous
+   indexing
+ test_files: []