chewy 0.8.4 → 7.3.4
- checksums.yaml +5 -5
- data/.github/CODEOWNERS +1 -0
- data/.github/ISSUE_TEMPLATE/bug_report.md +39 -0
- data/.github/ISSUE_TEMPLATE/feature_request.md +20 -0
- data/.github/PULL_REQUEST_TEMPLATE.md +16 -0
- data/.github/workflows/ruby.yml +74 -0
- data/.gitignore +1 -0
- data/.rubocop.yml +61 -0
- data/.rubocop_todo.yml +132 -0
- data/.yardopts +5 -0
- data/CHANGELOG.md +554 -245
- data/CODE_OF_CONDUCT.md +14 -0
- data/CONTRIBUTING.md +63 -0
- data/Gemfile +14 -11
- data/Guardfile +8 -6
- data/LICENSE.txt +1 -1
- data/README.md +748 -623
- data/Rakefile +11 -1
- data/chewy.gemspec +15 -19
- data/gemfiles/rails.5.2.activerecord.gemfile +11 -0
- data/gemfiles/rails.6.0.activerecord.gemfile +11 -0
- data/gemfiles/rails.6.1.activerecord.gemfile +13 -0
- data/gemfiles/rails.7.0.activerecord.gemfile +13 -0
- data/lib/chewy/config.rb +64 -50
- data/lib/chewy/errors.rb +10 -16
- data/lib/chewy/fields/base.rb +122 -32
- data/lib/chewy/fields/root.rb +48 -23
- data/lib/chewy/index/actions.rb +140 -54
- data/lib/chewy/index/adapter/active_record.rb +112 -0
- data/lib/chewy/{type → index}/adapter/base.rb +31 -12
- data/lib/chewy/index/adapter/object.rb +249 -0
- data/lib/chewy/index/adapter/orm.rb +194 -0
- data/lib/chewy/index/aliases.rb +14 -4
- data/lib/chewy/index/crutch.rb +40 -0
- data/lib/chewy/index/import/bulk_builder.rb +311 -0
- data/lib/chewy/index/import/bulk_request.rb +77 -0
- data/lib/chewy/index/import/journal_builder.rb +44 -0
- data/lib/chewy/index/import/routine.rb +139 -0
- data/lib/chewy/index/import.rb +243 -0
- data/lib/chewy/{type → index}/mapping.rb +79 -68
- data/lib/chewy/index/observe/active_record_methods.rb +87 -0
- data/lib/chewy/index/observe/callback.rb +34 -0
- data/lib/chewy/index/observe.rb +17 -0
- data/lib/chewy/index/settings.rb +10 -5
- data/lib/chewy/index/specification.rb +61 -0
- data/lib/chewy/index/syncer.rb +221 -0
- data/lib/chewy/{type → index}/witchcraft.rb +100 -39
- data/lib/chewy/index/wrapper.rb +95 -0
- data/lib/chewy/index.rb +216 -140
- data/lib/chewy/journal.rb +66 -0
- data/lib/chewy/log_subscriber.rb +8 -8
- data/lib/chewy/minitest/helpers.rb +150 -0
- data/lib/chewy/minitest/search_index_receiver.rb +76 -0
- data/lib/chewy/minitest.rb +1 -0
- data/lib/chewy/multi_search.rb +62 -0
- data/lib/chewy/railtie.rb +12 -25
- data/lib/chewy/rake_helper.rb +335 -37
- data/lib/chewy/repository.rb +2 -2
- data/lib/chewy/rspec/build_query.rb +12 -0
- data/lib/chewy/rspec/helpers.rb +55 -0
- data/lib/chewy/rspec/update_index.rb +106 -90
- data/lib/chewy/rspec.rb +3 -1
- data/lib/chewy/runtime/version.rb +4 -4
- data/lib/chewy/runtime.rb +1 -1
- data/lib/chewy/search/loader.rb +61 -0
- data/lib/chewy/{query → search}/pagination/kaminari.rb +13 -5
- data/lib/chewy/search/parameters/aggs.rb +16 -0
- data/lib/chewy/search/parameters/allow_partial_search_results.rb +27 -0
- data/lib/chewy/search/parameters/collapse.rb +16 -0
- data/lib/chewy/search/parameters/concerns/bool_storage.rb +24 -0
- data/lib/chewy/search/parameters/concerns/hash_storage.rb +23 -0
- data/lib/chewy/search/parameters/concerns/integer_storage.rb +14 -0
- data/lib/chewy/search/parameters/concerns/query_storage.rb +238 -0
- data/lib/chewy/search/parameters/concerns/string_array_storage.rb +23 -0
- data/lib/chewy/search/parameters/concerns/string_storage.rb +14 -0
- data/lib/chewy/search/parameters/docvalue_fields.rb +12 -0
- data/lib/chewy/search/parameters/explain.rb +16 -0
- data/lib/chewy/search/parameters/filter.rb +47 -0
- data/lib/chewy/search/parameters/highlight.rb +16 -0
- data/lib/chewy/search/parameters/ignore_unavailable.rb +27 -0
- data/lib/chewy/search/parameters/indices.rb +78 -0
- data/lib/chewy/search/parameters/indices_boost.rb +52 -0
- data/lib/chewy/search/parameters/limit.rb +17 -0
- data/lib/chewy/search/parameters/load.rb +32 -0
- data/lib/chewy/search/parameters/min_score.rb +16 -0
- data/lib/chewy/search/parameters/none.rb +25 -0
- data/lib/chewy/search/parameters/offset.rb +17 -0
- data/lib/chewy/search/parameters/order.rb +51 -0
- data/lib/chewy/search/parameters/post_filter.rb +19 -0
- data/lib/chewy/search/parameters/preference.rb +16 -0
- data/lib/chewy/search/parameters/profile.rb +16 -0
- data/lib/chewy/search/parameters/query.rb +19 -0
- data/lib/chewy/search/parameters/request_cache.rb +27 -0
- data/lib/chewy/search/parameters/rescore.rb +29 -0
- data/lib/chewy/search/parameters/script_fields.rb +16 -0
- data/lib/chewy/search/parameters/search_after.rb +20 -0
- data/lib/chewy/search/parameters/search_type.rb +16 -0
- data/lib/chewy/search/parameters/source.rb +77 -0
- data/lib/chewy/search/parameters/storage.rb +95 -0
- data/lib/chewy/search/parameters/stored_fields.rb +63 -0
- data/lib/chewy/search/parameters/suggest.rb +16 -0
- data/lib/chewy/search/parameters/terminate_after.rb +16 -0
- data/lib/chewy/search/parameters/timeout.rb +16 -0
- data/lib/chewy/search/parameters/track_scores.rb +16 -0
- data/lib/chewy/search/parameters/track_total_hits.rb +16 -0
- data/lib/chewy/search/parameters/version.rb +16 -0
- data/lib/chewy/search/parameters.rb +170 -0
- data/lib/chewy/search/query_proxy.rb +264 -0
- data/lib/chewy/search/request.rb +1071 -0
- data/lib/chewy/search/response.rb +119 -0
- data/lib/chewy/search/scoping.rb +49 -0
- data/lib/chewy/search/scrolling.rb +137 -0
- data/lib/chewy/search.rb +68 -28
- data/lib/chewy/stash.rb +68 -0
- data/lib/chewy/strategy/active_job.rb +3 -2
- data/lib/chewy/strategy/atomic.rb +2 -4
- data/lib/chewy/strategy/atomic_no_refresh.rb +18 -0
- data/lib/chewy/strategy/base.rb +13 -3
- data/lib/chewy/strategy/bypass.rb +1 -2
- data/lib/chewy/strategy/delayed_sidekiq/scheduler.rb +148 -0
- data/lib/chewy/strategy/delayed_sidekiq/worker.rb +52 -0
- data/lib/chewy/strategy/delayed_sidekiq.rb +17 -0
- data/lib/chewy/strategy/lazy_sidekiq.rb +64 -0
- data/lib/chewy/strategy/sidekiq.rb +15 -2
- data/lib/chewy/strategy/urgent.rb +1 -1
- data/lib/chewy/strategy.rb +16 -20
- data/lib/chewy/version.rb +1 -1
- data/lib/chewy.rb +81 -82
- data/lib/generators/chewy/install_generator.rb +3 -3
- data/lib/tasks/chewy.rake +99 -32
- data/migration_guide.md +56 -0
- data/spec/chewy/config_spec.rb +87 -15
- data/spec/chewy/fields/base_spec.rb +542 -233
- data/spec/chewy/fields/root_spec.rb +115 -17
- data/spec/chewy/fields/time_fields_spec.rb +13 -12
- data/spec/chewy/index/actions_spec.rb +595 -77
- data/spec/chewy/index/adapter/active_record_spec.rb +601 -0
- data/spec/chewy/index/adapter/object_spec.rb +243 -0
- data/spec/chewy/index/aliases_spec.rb +5 -5
- data/spec/chewy/index/import/bulk_builder_spec.rb +494 -0
- data/spec/chewy/index/import/bulk_request_spec.rb +95 -0
- data/spec/chewy/index/import/journal_builder_spec.rb +87 -0
- data/spec/chewy/index/import/routine_spec.rb +110 -0
- data/spec/chewy/index/import_spec.rb +615 -0
- data/spec/chewy/index/mapping_spec.rb +135 -0
- data/spec/chewy/index/observe/active_record_methods_spec.rb +68 -0
- data/spec/chewy/index/observe/callback_spec.rb +139 -0
- data/spec/chewy/index/observe_spec.rb +143 -0
- data/spec/chewy/index/settings_spec.rb +103 -50
- data/spec/chewy/index/specification_spec.rb +159 -0
- data/spec/chewy/index/syncer_spec.rb +118 -0
- data/spec/chewy/index/witchcraft_spec.rb +245 -0
- data/spec/chewy/index/wrapper_spec.rb +100 -0
- data/spec/chewy/index_spec.rb +149 -121
- data/spec/chewy/journal_spec.rb +223 -0
- data/spec/chewy/minitest/helpers_spec.rb +198 -0
- data/spec/chewy/minitest/search_index_receiver_spec.rb +118 -0
- data/spec/chewy/multi_search_spec.rb +84 -0
- data/spec/chewy/rake_helper_spec.rb +656 -0
- data/spec/chewy/repository_spec.rb +8 -8
- data/spec/chewy/rspec/build_query_spec.rb +34 -0
- data/spec/chewy/rspec/helpers_spec.rb +61 -0
- data/spec/chewy/rspec/update_index_spec.rb +220 -114
- data/spec/chewy/runtime_spec.rb +2 -2
- data/spec/chewy/search/loader_spec.rb +83 -0
- data/spec/chewy/search/pagination/kaminari_examples.rb +69 -0
- data/spec/chewy/search/pagination/kaminari_spec.rb +21 -0
- data/spec/chewy/search/parameters/aggs_spec.rb +5 -0
- data/spec/chewy/search/parameters/bool_storage_examples.rb +53 -0
- data/spec/chewy/search/parameters/collapse_spec.rb +5 -0
- data/spec/chewy/search/parameters/docvalue_fields_spec.rb +5 -0
- data/spec/chewy/search/parameters/explain_spec.rb +5 -0
- data/spec/chewy/search/parameters/filter_spec.rb +5 -0
- data/spec/chewy/search/parameters/hash_storage_examples.rb +59 -0
- data/spec/chewy/search/parameters/highlight_spec.rb +5 -0
- data/spec/chewy/search/parameters/ignore_unavailable_spec.rb +67 -0
- data/spec/chewy/search/parameters/indices_spec.rb +99 -0
- data/spec/chewy/search/parameters/integer_storage_examples.rb +32 -0
- data/spec/chewy/search/parameters/limit_spec.rb +5 -0
- data/spec/chewy/search/parameters/load_spec.rb +60 -0
- data/spec/chewy/search/parameters/min_score_spec.rb +32 -0
- data/spec/chewy/search/parameters/none_spec.rb +5 -0
- data/spec/chewy/search/parameters/offset_spec.rb +5 -0
- data/spec/chewy/search/parameters/order_spec.rb +72 -0
- data/spec/chewy/search/parameters/post_filter_spec.rb +5 -0
- data/spec/chewy/search/parameters/preference_spec.rb +5 -0
- data/spec/chewy/search/parameters/profile_spec.rb +5 -0
- data/spec/chewy/search/parameters/query_spec.rb +5 -0
- data/spec/chewy/search/parameters/query_storage_examples.rb +434 -0
- data/spec/chewy/search/parameters/request_cache_spec.rb +67 -0
- data/spec/chewy/search/parameters/rescore_spec.rb +62 -0
- data/spec/chewy/search/parameters/script_fields_spec.rb +5 -0
- data/spec/chewy/search/parameters/search_after_spec.rb +35 -0
- data/spec/chewy/search/parameters/search_type_spec.rb +5 -0
- data/spec/chewy/search/parameters/source_spec.rb +162 -0
- data/spec/chewy/search/parameters/storage_spec.rb +60 -0
- data/spec/chewy/search/parameters/stored_fields_spec.rb +126 -0
- data/spec/chewy/search/parameters/string_array_storage_examples.rb +63 -0
- data/spec/chewy/search/parameters/string_storage_examples.rb +32 -0
- data/spec/chewy/search/parameters/suggest_spec.rb +5 -0
- data/spec/chewy/search/parameters/terminate_after_spec.rb +5 -0
- data/spec/chewy/search/parameters/timeout_spec.rb +5 -0
- data/spec/chewy/search/parameters/track_scores_spec.rb +5 -0
- data/spec/chewy/search/parameters/track_total_hits_spec.rb +5 -0
- data/spec/chewy/search/parameters/version_spec.rb +5 -0
- data/spec/chewy/search/parameters_spec.rb +161 -0
- data/spec/chewy/search/query_proxy_spec.rb +119 -0
- data/spec/chewy/search/request_spec.rb +880 -0
- data/spec/chewy/search/response_spec.rb +202 -0
- data/spec/chewy/search/scrolling_spec.rb +171 -0
- data/spec/chewy/search_spec.rb +82 -55
- data/spec/chewy/stash_spec.rb +85 -0
- data/spec/chewy/strategy/active_job_spec.rb +27 -8
- data/spec/chewy/strategy/atomic_no_refresh_spec.rb +60 -0
- data/spec/chewy/strategy/atomic_spec.rb +13 -11
- data/spec/chewy/strategy/delayed_sidekiq_spec.rb +190 -0
- data/spec/chewy/strategy/lazy_sidekiq_spec.rb +214 -0
- data/spec/chewy/strategy/sidekiq_spec.rb +19 -7
- data/spec/chewy/strategy_spec.rb +19 -15
- data/spec/chewy_spec.rb +65 -88
- data/spec/spec_helper.rb +11 -20
- data/spec/support/active_record.rb +48 -6
- data/spec/support/class_helpers.rb +4 -19
- metadata +299 -183
- data/.travis.yml +0 -76
- data/Appraisals +0 -76
- data/gemfiles/rails.3.2.activerecord.gemfile +0 -15
- data/gemfiles/rails.3.2.activerecord.kaminari.gemfile +0 -14
- data/gemfiles/rails.3.2.activerecord.will_paginate.gemfile +0 -14
- data/gemfiles/rails.4.0.activerecord.gemfile +0 -15
- data/gemfiles/rails.4.0.activerecord.kaminari.gemfile +0 -14
- data/gemfiles/rails.4.0.activerecord.will_paginate.gemfile +0 -14
- data/gemfiles/rails.4.0.mongoid.4.0.0.gemfile +0 -15
- data/gemfiles/rails.4.0.mongoid.4.0.0.kaminari.gemfile +0 -14
- data/gemfiles/rails.4.0.mongoid.4.0.0.will_paginate.gemfile +0 -14
- data/gemfiles/rails.4.0.mongoid.5.1.0.gemfile +0 -15
- data/gemfiles/rails.4.0.mongoid.5.1.0.kaminari.gemfile +0 -14
- data/gemfiles/rails.4.0.mongoid.5.1.0.will_paginate.gemfile +0 -14
- data/gemfiles/rails.4.1.activerecord.gemfile +0 -15
- data/gemfiles/rails.4.1.activerecord.kaminari.gemfile +0 -14
- data/gemfiles/rails.4.1.activerecord.will_paginate.gemfile +0 -14
- data/gemfiles/rails.4.1.mongoid.4.0.0.gemfile +0 -15
- data/gemfiles/rails.4.1.mongoid.4.0.0.kaminari.gemfile +0 -14
- data/gemfiles/rails.4.1.mongoid.4.0.0.will_paginate.gemfile +0 -14
- data/gemfiles/rails.4.1.mongoid.5.1.0.gemfile +0 -15
- data/gemfiles/rails.4.1.mongoid.5.1.0.kaminari.gemfile +0 -14
- data/gemfiles/rails.4.1.mongoid.5.1.0.will_paginate.gemfile +0 -14
- data/gemfiles/rails.4.2.activerecord.gemfile +0 -16
- data/gemfiles/rails.4.2.activerecord.kaminari.gemfile +0 -15
- data/gemfiles/rails.4.2.activerecord.will_paginate.gemfile +0 -15
- data/gemfiles/rails.4.2.mongoid.4.0.0.gemfile +0 -15
- data/gemfiles/rails.4.2.mongoid.4.0.0.kaminari.gemfile +0 -14
- data/gemfiles/rails.4.2.mongoid.4.0.0.will_paginate.gemfile +0 -14
- data/gemfiles/rails.4.2.mongoid.5.1.0.gemfile +0 -15
- data/gemfiles/rails.4.2.mongoid.5.1.0.kaminari.gemfile +0 -14
- data/gemfiles/rails.4.2.mongoid.5.1.0.will_paginate.gemfile +0 -14
- data/gemfiles/rails.5.0.0.beta3.activerecord.gemfile +0 -16
- data/gemfiles/rails.5.0.0.beta3.activerecord.kaminari.gemfile +0 -16
- data/gemfiles/rails.5.0.0.beta3.activerecord.will_paginate.gemfile +0 -15
- data/gemfiles/sequel.4.31.gemfile +0 -13
- data/lib/chewy/backports/deep_dup.rb +0 -46
- data/lib/chewy/backports/duplicable.rb +0 -90
- data/lib/chewy/query/compose.rb +0 -69
- data/lib/chewy/query/criteria.rb +0 -181
- data/lib/chewy/query/filters.rb +0 -227
- data/lib/chewy/query/loading.rb +0 -111
- data/lib/chewy/query/nodes/and.rb +0 -25
- data/lib/chewy/query/nodes/base.rb +0 -17
- data/lib/chewy/query/nodes/bool.rb +0 -32
- data/lib/chewy/query/nodes/equal.rb +0 -34
- data/lib/chewy/query/nodes/exists.rb +0 -20
- data/lib/chewy/query/nodes/expr.rb +0 -28
- data/lib/chewy/query/nodes/field.rb +0 -106
- data/lib/chewy/query/nodes/has_child.rb +0 -14
- data/lib/chewy/query/nodes/has_parent.rb +0 -14
- data/lib/chewy/query/nodes/has_relation.rb +0 -61
- data/lib/chewy/query/nodes/match_all.rb +0 -11
- data/lib/chewy/query/nodes/missing.rb +0 -20
- data/lib/chewy/query/nodes/not.rb +0 -25
- data/lib/chewy/query/nodes/or.rb +0 -25
- data/lib/chewy/query/nodes/prefix.rb +0 -18
- data/lib/chewy/query/nodes/query.rb +0 -20
- data/lib/chewy/query/nodes/range.rb +0 -63
- data/lib/chewy/query/nodes/raw.rb +0 -15
- data/lib/chewy/query/nodes/regexp.rb +0 -31
- data/lib/chewy/query/nodes/script.rb +0 -20
- data/lib/chewy/query/pagination/will_paginate.rb +0 -27
- data/lib/chewy/query/pagination.rb +0 -16
- data/lib/chewy/query/scoping.rb +0 -20
- data/lib/chewy/query.rb +0 -1026
- data/lib/chewy/strategy/resque.rb +0 -26
- data/lib/chewy/type/actions.rb +0 -19
- data/lib/chewy/type/adapter/active_record.rb +0 -72
- data/lib/chewy/type/adapter/mongoid.rb +0 -58
- data/lib/chewy/type/adapter/object.rb +0 -89
- data/lib/chewy/type/adapter/orm.rb +0 -156
- data/lib/chewy/type/adapter/sequel.rb +0 -75
- data/lib/chewy/type/crutch.rb +0 -31
- data/lib/chewy/type/import.rb +0 -224
- data/lib/chewy/type/observe.rb +0 -76
- data/lib/chewy/type/wrapper.rb +0 -53
- data/lib/chewy/type.rb +0 -89
- data/lib/sequel/plugins/chewy_observe.rb +0 -78
- data/spec/chewy/query/criteria_spec.rb +0 -433
- data/spec/chewy/query/filters_spec.rb +0 -173
- data/spec/chewy/query/loading_spec.rb +0 -86
- data/spec/chewy/query/nodes/and_spec.rb +0 -16
- data/spec/chewy/query/nodes/bool_spec.rb +0 -22
- data/spec/chewy/query/nodes/equal_spec.rb +0 -32
- data/spec/chewy/query/nodes/exists_spec.rb +0 -18
- data/spec/chewy/query/nodes/has_child_spec.rb +0 -40
- data/spec/chewy/query/nodes/has_parent_spec.rb +0 -40
- data/spec/chewy/query/nodes/match_all_spec.rb +0 -11
- data/spec/chewy/query/nodes/missing_spec.rb +0 -15
- data/spec/chewy/query/nodes/not_spec.rb +0 -16
- data/spec/chewy/query/nodes/or_spec.rb +0 -16
- data/spec/chewy/query/nodes/prefix_spec.rb +0 -16
- data/spec/chewy/query/nodes/query_spec.rb +0 -12
- data/spec/chewy/query/nodes/range_spec.rb +0 -32
- data/spec/chewy/query/nodes/raw_spec.rb +0 -11
- data/spec/chewy/query/nodes/regexp_spec.rb +0 -31
- data/spec/chewy/query/nodes/script_spec.rb +0 -15
- data/spec/chewy/query/pagination/kaminari_spec.rb +0 -57
- data/spec/chewy/query/pagination/will_paginage_spec.rb +0 -60
- data/spec/chewy/query/pagination_spec.rb +0 -36
- data/spec/chewy/query_spec.rb +0 -632
- data/spec/chewy/strategy/resque_spec.rb +0 -40
- data/spec/chewy/type/actions_spec.rb +0 -31
- data/spec/chewy/type/adapter/active_record_spec.rb +0 -317
- data/spec/chewy/type/adapter/mongoid_spec.rb +0 -253
- data/spec/chewy/type/adapter/object_spec.rb +0 -139
- data/spec/chewy/type/adapter/sequel_spec.rb +0 -320
- data/spec/chewy/type/import_spec.rb +0 -433
- data/spec/chewy/type/mapping_spec.rb +0 -106
- data/spec/chewy/type/observe_spec.rb +0 -127
- data/spec/chewy/type/witchcraft_spec.rb +0 -154
- data/spec/chewy/type/wrapper_spec.rb +0 -58
- data/spec/chewy/type_spec.rb +0 -33
- data/spec/support/mongoid.rb +0 -81
- data/spec/support/sequel.rb +0 -75
data/README.md
CHANGED
@@ -1,20 +1,15 @@
 [![Gem Version](https://badge.fury.io/rb/chewy.svg)](http://badge.fury.io/rb/chewy)
-[![
+[![GitHub Actions](https://github.com/toptal/chewy/actions/workflows/ruby.yml/badge.svg)](https://github.com/toptal/chewy/actions/workflows/ruby.yml)
 [![Code Climate](https://codeclimate.com/github/toptal/chewy.svg)](https://codeclimate.com/github/toptal/chewy)
 [![Inline docs](http://inch-ci.org/github/toptal/chewy.svg?branch=master)](http://inch-ci.org/github/toptal/chewy)
 
-<p align="right">Sponsored by</p>
-<p align="right"><a href="http://www.toptal.com/"><img src="http://www.toptal.com/assets/public/blocks/logo/big.png" alt="Toptal" width="105" height="34"></a></p>
-
 # Chewy
 
-Chewy is an ODM
+Chewy is an ODM (Object Document Mapper), built on top of [the official Elasticsearch client](https://github.com/elastic/elasticsearch-ruby).
 
 ## Why Chewy?
 
-
-
-Index classes are independent from ORM/ODM models. Now, implementing e.g. cross-model autocomplete is much easier. You can just define the index and work with it in an object-oriented style. You can define several types for index - one per indexed model.
+In this section we'll cover why you might want to use Chewy instead of the official `elasticsearch-ruby` client gem.
 
 * Every index is observable by all the related models.
 
@@ -28,12 +23,11 @@ Chewy is an ODM and wrapper for [the official Elasticsearch client](https://gith
 
 Chewy has an ActiveRecord-style query DSL. It is chainable, mergeable and lazy, so you can produce queries in the most efficient way. It also has object-oriented query and filter builders.
 
-* Support for ActiveRecord
-
+* Support for ActiveRecord.
 
 ## Installation
 
-Add this line to your application's Gemfile
+Add this line to your application's `Gemfile`:
 
     gem 'chewy'
 
@@ -45,19 +39,181 @@ Or install it yourself as:
 
     $ gem install chewy
 
-##
+## Compatibility
 
-###
+### Ruby
+
+Chewy is compatible with MRI 2.6-3.0¹.
+
+> ¹ Ruby 3 is only supported with Rails 6.1
+
+### Elasticsearch compatibility matrix
+
+| Chewy version | Elasticsearch version              |
+| ------------- | ---------------------------------- |
+| 7.2.x         | 7.x                                |
+| 7.1.x         | 7.x                                |
+| 7.0.x         | 6.8, 7.x                           |
+| 6.0.0         | 5.x, 6.x                           |
+| 5.x           | 5.x, limited support for 1.x & 2.x |
+
+**Important:** Chewy doesn't follow SemVer, so you should always
+check the release notes before upgrading. The major version is linked to the
+newest supported Elasticsearch and the minor version bumps may include breaking changes.
+
+See our [migration guide](migration_guide.md) for detailed upgrade instructions between
+various Chewy versions.
+
+### Active Record
+
+5.2, 6.0, 6.1 Active Record versions are supported by all Chewy versions.
+
+## Getting Started
+
+Chewy provides functionality for Elasticsearch index handling, documents import mappings, index update strategies and chainable query DSL.
+
+### Minimal client setting
+
+Create `config/initializers/chewy.rb` with this line:
+
+```ruby
+Chewy.settings = {host: 'localhost:9250'}
+```
+
+And run `rails g chewy:install` to generate `chewy.yml`:
+
+```yaml
+# config/chewy.yml
+# separate environment configs
+test:
+  host: 'localhost:9250'
+  prefix: 'test'
+development:
+  host: 'localhost:9200'
+```
+
+### Elasticsearch
+
+Make sure you have Elasticsearch up and running. You can [install](https://www.elastic.co/guide/en/elasticsearch/reference/current/install-elasticsearch.html) it locally, but the easiest way is to use [Docker](https://www.docker.com/get-started):
+
+```shell
+$ docker run --rm --name elasticsearch -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" elasticsearch:7.11.1
+```
+
+### Index
+
+Create `app/chewy/users_index.rb` with User Index:
+
+```ruby
+class UsersIndex < Chewy::Index
+  settings analysis: {
+    analyzer: {
+      email: {
+        tokenizer: 'keyword',
+        filter: ['lowercase']
+      }
+    }
+  }
 
-
+  index_scope User
+  field :first_name
+  field :last_name
+  field :email, analyzer: 'email'
+end
+```
 
-
+### Model
+
+Add User model, table and migrate it:
+
+```shell
+$ bundle exec rails g model User first_name last_name email
+$ bundle exec rails db:migrate
+```
+
+Add `update_index` to app/models/user.rb:
+
+```ruby
+class User < ApplicationRecord
+  update_index('users') { self }
+end
+```
+
+### Example of data request
+
+1. Once a record is created (could be done via the Rails console), it creates User index too:
+
+```
+User.create(
+  first_name: "test1",
+  last_name: "test1",
+  email: 'test1@example.com',
+  # other fields
+)
+# UsersIndex Import (355.3ms) {:index=>1}
+# => #<User id: 1, first_name: "test1", last_name: "test1", email: "test1@example.com", # other fields>
+```
+
+2. A query could be exposed at a given `UsersController`:
+
+```ruby
+def search
+  @users = UsersIndex.query(query_string: { fields: [:first_name, :last_name, :email, ...], query: search_params[:query], default_operator: 'and' })
+  render json: @users.to_json, status: :ok
+end
+
+private
+
+def search_params
+  params.permit(:query, :page, :per)
+end
+```
+
+3. So a request against `http://localhost:3000/users/search?query=test1@example.com` issuing a response like:
+
+```json
+[
+  {
+    "attributes":{
+      "id":"1",
+      "first_name":"test1",
+      "last_name":"test1",
+      "email":"test1@example.com",
+      ...
+      "_score":0.9808291,
+      "_explanation":null
+    },
+    "_data":{
+      "_index":"users",
+      "_type":"_doc",
+      "_id":"1",
+      "_score":0.9808291,
+      "_source":{
+        "first_name":"test1",
+        "last_name":"test1",
+        "email":"test1@example.com",
+        ...
+      }
+    }
+  }
+]
+```
+
+## Usage and configuration
+
+### Client settings
+
+To configure the Chewy client you need to add `chewy.rb` file with `Chewy.settings` hash:
 
 ```ruby
 # config/initializers/chewy.rb
 Chewy.settings = {host: 'localhost:9250'} # do not use environments
 ```
 
+And add `chewy.yml` configuration file.
+
+You can create `chewy.yml` manually or run `rails g chewy:install` to generate it:
+
 ```yaml
 # config/chewy.yml
 # separate environment configs
@@ -83,7 +239,31 @@ Chewy.logger = Logger.new(STDOUT)
 
 See [config.rb](lib/chewy/config.rb) for more details.
 
-
+#### AWS Elasticsearch
+
+If you would like to use AWS's Elasticsearch using an IAM user policy, you will need to sign your requests for the `es:*` action by injecting the appropriate headers passing a proc to `transport_options`.
+You'll need an additional gem for Faraday middleware: add `gem 'faraday_middleware-aws-sigv4'` to your Gemfile.
+
+```ruby
+require 'faraday_middleware/aws_sigv4'
+
+Chewy.settings = {
+  host: 'http://my-es-instance-on-aws.us-east-1.es.amazonaws.com:80',
+  port: 80, # 443 for https host
+  transport_options: {
+    headers: { content_type: 'application/json' },
+    proc: -> (f) do
+      f.request :aws_sigv4,
+                service: 'es',
+                region: 'us-east-1',
+                access_key_id: ENV['AWS_ACCESS_KEY'],
+                secret_access_key: ENV['AWS_SECRET_ACCESS_KEY']
+    end
+  }
+}
+```
+
+#### Index definition
 
 1. Create `/app/chewy/users_index.rb`
 
@@ -93,41 +273,38 @@ See [config.rb](lib/chewy/config.rb) for more details.
 end
 ```
 
-2.
+2. Define index scope (you can omit this part if you don't need to specify a scope (i.e. use PORO objects for import) or options)
 
 ```ruby
 class UsersIndex < Chewy::Index
-
+  index_scope User.active # or just model instead_of scope: index_scope User
 end
 ```
 
-
-
-3. Add some type mappings
+3. Add some mappings
 
 ```ruby
 class UsersIndex < Chewy::Index
-
-
-
-
-
-
-
-
-
-
-    end
-    field :rating, type: 'integer' # custom data type
-    field :created, type: 'date', include_in_all: false,
-      value: ->{ created_at } # value proc for source object context
+  index_scope User.active.includes(:country, :badges, :projects)
+  field :first_name, :last_name # multiple fields without additional options
+  field :email, analyzer: 'email' # Elasticsearch-related options
+  field :country, value: ->(user) { user.country.name } # custom value proc
+  field :badges, value: ->(user) { user.badges.map(&:name) } # passing array values to index
+  field :projects do # the same block syntax for multi_field, if `:type` is specified
+    field :title
+    field :description # default data type is `text`
+    # additional top-level objects passed to value proc:
+    field :categories, value: ->(project, user) { project.categories.map(&:name) if user.active? }
   end
+  field :rating, type: 'integer' # custom data type
+  field :created, type: 'date', include_in_all: false,
+    value: ->{ created_at } # value proc for source object context
 end
 ```
 
-[See here for mapping definitions](
+[See here for mapping definitions](https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping.html).
 
-4. Add some index-
+4. Add some index-related settings. Analyzer repositories might be used as well. See `Chewy::Index.settings` docs for details:
 
 ```ruby
 class UsersIndex < Chewy::Index
@@ -140,69 +317,61 @@ See [config.rb](lib/chewy/config.rb) for more details.
     }
   }
 
-
-
-
-
-
-
-
-
-
-
-
-      end
-      field :about_translations, type: 'object' # pass object type explicitly if necessary
-      field :rating, type: 'integer'
-      field :created, type: 'date', include_in_all: false,
-        value: ->{ created_at }
+  index_scope User.active.includes(:country, :badges, :projects)
+  root date_detection: false do
+    template 'about_translations.*', type: 'text', analyzer: 'standard'
+
+    field :first_name, :last_name
+    field :email, analyzer: 'email'
+    field :country, value: ->(user) { user.country.name }
+    field :badges, value: ->(user) { user.badges.map(&:name) }
+    field :projects do
+      field :title
+      field :description
     end
+    field :about_translations, type: 'object' # pass object type explicitly if necessary
+    field :rating, type: 'integer'
+    field :created, type: 'date', include_in_all: false,
+      value: ->{ created_at }
   end
 end
 ```
 
-[See index settings here](
-[See root object settings here](
+[See index settings here](https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-update-settings.html).
+[See root object settings here](https://www.elastic.co/guide/en/elasticsearch/reference/current/dynamic-field-mapping.html).
 
-See [mapping.rb](lib/chewy/
+See [mapping.rb](lib/chewy/index/mapping.rb) for more details.
 
 5. Add model-observing code
 
 ```ruby
 class User < ActiveRecord::Base
-  update_index('users
+  update_index('users') { self } # specifying index and back-reference
                                  # for updating after user save or destroy
 end
 
 class Country < ActiveRecord::Base
   has_many :users
 
-  update_index('users
+  update_index('users') { users } # return single object or collection
 end
 
 class Project < ActiveRecord::Base
-  update_index('users
-end
-
-class Badge < ActiveRecord::Base
-  has_and_belongs_to_many :users
-
-  update_index('users') { users } # if index has only one type
-                                  # there is no need to specify updated type
+  update_index('users') { user if user.active? } # you can return even `nil` from the back-reference
 end
 
 class Book < ActiveRecord::Base
-  update_index(->(book) {"
-
-
+  update_index(->(book) {"books_#{book.language}"}) { self } # dynamic index name with proc.
+                                                              # For book with language == "en"
+                                                              # this code will generate `books_en`
 end
 ```
 
 Also, you can use the second argument for method name passing:
 
 ```ruby
-update_index('users
-update_index('users
+update_index('users', :self)
+update_index('users', :users)
 ```
 
 In the case of a belongs_to association you may need to update both associated objects, previous and current:
@@ -211,47 +380,28 @@ See [config.rb](lib/chewy/config.rb) for more details.
|
|
211
380
|
class City < ActiveRecord::Base
|
212
381
|
belongs_to :country
|
213
382
|
|
214
|
-
update_index('cities
|
215
|
-
update_index 'countries
|
216
|
-
# For the latest active_record changed values are
|
217
|
-
# already in `previous_changes` hash,
|
218
|
-
# but for mongoid you have to use `changes` hash
|
383
|
+
update_index('cities') { self }
|
384
|
+
update_index 'countries' do
|
219
385
|
previous_changes['country_id'] || country
|
220
386
|
end
|
221
387
|
end
|
222
388
|
```
|
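To see what the `countries` back-reference returns, here is a small sketch without Rails. `FakeCity` is a hypothetical stand-in, and it assumes the ActiveRecord convention that `previous_changes` maps an attribute name to an `[old_value, new_value]` pair.

```ruby
# Hypothetical stand-in for an ActiveRecord model (not part of Chewy)
FakeCity = Struct.new(:previous_changes, :country) do
  # Same expression as the update_index 'countries' block above
  def countries_to_reindex
    previous_changes['country_id'] || country
  end
end

# The city moved from country 1 to country 2: both ids are returned
moved = FakeCity.new({'country_id' => [1, 2]}, :country_2)
moved_result = moved.countries_to_reindex

# Only the name changed: fall back to the current country
renamed = FakeCity.new({'name' => %w[Old New]}, :country_2)
renamed_result = renamed.countries_to_reindex

p moved_result   # prints [1, 2]
p renamed_result # prints :country_2
```

Returning both ids lets the previous and the current country be reindexed in one pass.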
223
389
|
|
224
|
-
|
390
|
+
### Default import options
|
225
391
|
|
226
|
-
|
227
|
-
class User < Sequel::Model
|
228
|
-
update_index('users#user') { self }
|
229
|
-
end
|
230
|
-
```
|
231
|
-
|
232
|
-
However, to make it work, you must load the chewy plugin into Sequel model:
|
233
|
-
|
234
|
-
```ruby
|
235
|
-
Sequel::Model.plugin :chewy_observe # for all models, or...
|
236
|
-
User.plugin :chewy_observe # just for User
|
237
|
-
```
|
238
|
-
|
239
|
-
### Type default import options
|
240
|
-
|
241
|
-
Every type has `default_import_options` configuration to specify, suddenly, default import options:
|
392
|
+
Every index has a `default_import_options` configuration to specify default import options:
|
242
393
|
|
243
394
|
```ruby
|
244
395
|
class ProductsIndex < Chewy::Index
|
245
|
-
|
246
|
-
|
396
|
+
index_scope Post.includes(:tags)
|
397
|
+
default_import_options batch_size: 100, bulk_size: 10.megabytes, refresh: false
|
247
398
|
|
248
|
-
|
249
|
-
|
250
|
-
end
|
399
|
+
field :name
|
400
|
+
field :tags, value: -> { tags.map(&:name) }
|
251
401
|
end
|
252
402
|
```
|
253
403
|
|
254
|
-
See [import.rb](lib/chewy/
|
404
|
+
See [import.rb](lib/chewy/index/import.rb) for available options.
|
255
405
|
|
256
406
|
### Multi (nested) and object field types
|
257
407
|
|
@@ -269,17 +419,17 @@ This will automatically set the type or root field to `object`. You may also spe
|
|
269
419
|
To define a multi field you have to specify any type except for `object` or `nested` in the root field:
|
270
420
|
|
271
421
|
```ruby
|
272
|
-
field :full_name, type: '
|
422
|
+
field :full_name, type: 'text', value: ->{ full_name.strip } do
|
273
423
|
field :ordered, analyzer: 'ordered'
|
274
|
-
field :untouched,
|
424
|
+
field :untouched, type: 'keyword'
|
275
425
|
end
|
276
426
|
```
|
277
427
|
|
278
|
-
The `value:` option for internal fields
|
428
|
+
The `value:` option for internal fields will no longer be effective.
|
279
429
|
|
280
430
|
### Geo Point fields
|
281
431
|
|
282
|
-
You can use [Elasticsearch's geo mapping](https://www.elastic.co/guide/en/elasticsearch/reference/current/
|
432
|
+
You can use [Elasticsearch's geo mapping](https://www.elastic.co/guide/en/elasticsearch/reference/current/geo-point.html) with the `geo_point` field type, allowing you to query, filter and order by latitude and longitude. You can use the following hash format:
|
283
433
|
|
284
434
|
```ruby
|
285
435
|
field :coordinates, type: 'geo_point', value: ->{ {lat: latitude, lon: longitude} }
|
@@ -296,20 +446,36 @@ end
|
|
296
446
|
|
297
447
|
See the section on *Script fields* for details on calculating distance in a search.
|
298
448
|
|
449
|
+
### Join fields
|
450
|
+
|
451
|
+
You can use a [join field](https://www.elastic.co/guide/en/elasticsearch/reference/current/parent-join.html)
|
452
|
+
to implement parent-child relationships between documents.
|
453
|
+
It [replaces the old `parent_id` based parent-child mapping](https://www.elastic.co/guide/en/elasticsearch/reference/current/removal-of-types.html#parent-child-mapping-types).
|
454
|
+
|
455
|
+
To use it, you need to pass `relations` and `join` (with `type` and `id`) options:
|
456
|
+
```ruby
|
457
|
+
field :hierarchy_link, type: :join, relations: {question: %i[answer comment], answer: :vote, vote: :subvote}, join: {type: :comment_type, id: :commented_id}
|
458
|
+
```
|
459
|
+
assuming you have `comment_type` and `commented_id` fields in your model.
|
460
|
+
|
461
|
+
Note that when you reindex a parent, its children and grandchildren will be reindexed as well.
|
462
|
+
This may require additional queries to the primary database and to Elasticsearch.
|
463
|
+
|
464
|
+
Also note that the join field doesn't support crutches (it should be a field directly defined on the model).
|
465
|
+
|
299
466
|
### Crutches™ technology
|
300
467
|
|
301
468
|
Assume you are defining your index like this (product has_many categories through product_categories):
|
302
469
|
|
303
470
|
```ruby
|
304
471
|
class ProductsIndex < Chewy::Index
|
305
|
-
|
306
|
-
|
307
|
-
|
308
|
-
end
|
472
|
+
index_scope Product.includes(:categories)
|
473
|
+
field :name
|
474
|
+
field :category_names, value: ->(product) { product.categories.map(&:name) } # or shorter just -> { categories.map(&:name) }
|
309
475
|
end
|
310
476
|
```
|
311
477
|
|
312
|
-
Then the Chewy reindexing flow
|
478
|
+
Then the Chewy reindexing flow will look like the following pseudo-code:
|
313
479
|
|
314
480
|
```ruby
|
315
481
|
Product.includes(:categories).find_in_batches(1000) do |batch|
|
@@ -321,30 +487,27 @@ Product.includes(:categories).find_in_batches(1000) do |batch|
|
|
321
487
|
end
|
322
488
|
```
|
323
489
|
|
324
|
-
|
325
|
-
|
326
|
-
Then you can replace Rails associations with Chewy Crutches™ technology:
|
490
|
+
If you run into complicated cases where associations are not applicable, you can replace Rails associations with Chewy Crutches™ technology:
|
327
491
|
|
328
492
|
```ruby
|
329
493
|
class ProductsIndex < Chewy::Index
|
330
|
-
|
331
|
-
|
332
|
-
|
333
|
-
|
334
|
-
|
335
|
-
|
336
|
-
|
337
|
-
|
338
|
-
end
|
339
|
-
|
340
|
-
field :name
|
341
|
-
# simply use crutch-fetched data as a value:
|
342
|
-
field :category_names, value: ->(product, crutches) { crutches.categories[product.id] }
|
494
|
+
index_scope Product
|
495
|
+
crutch :categories do |collection| # collection here is a current batch of products
|
496
|
+
# data is fetched with a lightweight query without objects initialization
|
497
|
+
data = ProductCategory.joins(:category).where(product_id: collection.map(&:id)).pluck(:product_id, 'categories.name')
|
498
|
+
# then we have to convert fetched data to appropriate format
|
499
|
+
# this will return our data in structure like:
|
500
|
+
# {123 => ['sweets', 'juices'], 456 => ['meat']}
|
501
|
+
data.each.with_object({}) { |(id, name), result| (result[id] ||= []).push(name) }
|
343
502
|
end
|
503
|
+
|
504
|
+
field :name
|
505
|
+
# simply use crutch-fetched data as a value:
|
506
|
+
field :category_names, value: ->(product, crutches) { crutches[:categories][product.id] }
|
344
507
|
end
|
345
508
|
```
|
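The grouping step inside the `crutch` block is plain Ruby and can be checked in isolation. In this sketch the `rows` array is made-up sample data standing in for the result of the `pluck` call.

```ruby
# Sample data in the shape returned by pluck(:product_id, 'categories.name')
rows = [[123, 'sweets'], [123, 'juices'], [456, 'meat']]

# Same expression as in the crutch block: group names by product id
grouped = rows.each.with_object({}) do |(id, name), result|
  (result[id] ||= []).push(name)
end

p grouped # prints {123=>["sweets", "juices"], 456=>["meat"]} (inspect format varies by Ruby version)

# A product without categories has no key, so a plain lookup returns nil
missing = grouped[789]
safe = grouped.fetch(789, [])
```

If a field should be an empty array rather than `nil` for products without categories, `fetch(product.id, [])` is one way to get that.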
346
509
|
|
347
|
-
An example flow
|
510
|
+
An example flow will look like this:
|
348
511
|
|
349
512
|
```ruby
|
350
513
|
Product.includes(:categories).find_in_batches(1000) do |batch|
|
@@ -362,22 +525,21 @@ So Chewy Crutches™ technology is able to increase your indexing performance in
|
|
362
525
|
|
363
526
|
### Witchcraft™ technology
|
364
527
|
|
365
|
-
One more experimental technology to increase import performance. As far as you know, chewy defines value proc for every imported field in mapping, so at the import time each of
|
528
|
+
One more experimental technology to increase import performance. As you know, Chewy defines a value proc for every imported field in the mapping, so at import time each of these procs is executed on the imported object to extract the resulting document. It would be great for performance to use one huge whole-document-returning proc instead. So the basic idea of Witchcraft™ technology is to compile a single document-returning proc from the index definition.
|
366
529
|
|
367
530
|
```ruby
|
368
|
-
|
369
|
-
|
370
|
-
|
371
|
-
|
372
|
-
|
373
|
-
|
374
|
-
|
375
|
-
|
376
|
-
end
|
531
|
+
index_scope Product
|
532
|
+
witchcraft!
|
533
|
+
|
534
|
+
field :title
|
535
|
+
field :tags, value: -> { tags.map(&:name) }
|
536
|
+
field :categories do
|
537
|
+
field :name, value: -> (product, category) { category.name }
|
538
|
+
field :type, value: -> (product, category, crutch) { crutch.types[category.name] }
|
377
539
|
end
|
378
540
|
```
|
379
541
|
|
380
|
-
The
|
542
|
+
The index definition above will be compiled to something close to:
|
381
543
|
|
382
544
|
```ruby
|
383
545
|
-> (object, crutches) do
|
@@ -395,19 +557,128 @@ end
|
|
395
557
|
```
|
396
558
|
|
397
559
|
And don't even ask how it is possible; it is witchcraft.
|
398
|
-
Obviously not every type of definition might be compiled
|
560
|
+
Obviously, not every type of definition can be compiled. There are some restrictions:
|
561
|
+
|
562
|
+
1. Use reasonable formatting so that `method_source` can extract the field value proc sources.
|
563
|
+
2. Value procs with splat arguments are not supported right now.
|
564
|
+
3. If you are generating fields dynamically, use value procs with arguments; argumentless value procs are not supported yet:
|
565
|
+
|
566
|
+
```ruby
|
567
|
+
[:first_name, :last_name].each do |name|
|
568
|
+
field name, value: -> (o) { o.send(name) }
|
569
|
+
end
|
570
|
+
```
|
571
|
+
|
572
|
+
However, it is quite possible that your index definition will be supported by Witchcraft™ technology out of the box in most of the cases.
|
573
|
+
|
574
|
+
### Raw Import
|
575
|
+
|
576
|
+
Another way to speed up import time is Raw Imports. This technology is only available in ActiveRecord adapter. Very often, ActiveRecord model instantiation is what consumes most of the CPU and RAM resources. Precious time is wasted on converting, say, timestamps from strings and then serializing them back to strings. Chewy can operate on raw hashes of data directly obtained from the database. All you need is to provide a way to convert that hash to a lightweight object that mimics the behaviour of the normal ActiveRecord object.
|
577
|
+
|
578
|
+
```ruby
|
579
|
+
class LightweightProduct
|
580
|
+
def initialize(attributes)
|
581
|
+
@attributes = attributes
|
582
|
+
end
|
583
|
+
|
584
|
+
# Depending on the database, `created_at` might
|
585
|
+
# be in different formats. In PostgreSQL, for example,
|
586
|
+
# you might see the following format:
|
587
|
+
# "2016-03-22 16:23:22"
|
588
|
+
#
|
589
|
+
# Taking into account that Elastic expects something different,
|
590
|
+
# one might do something like the following, just to avoid
|
591
|
+
# unnecessary String -> DateTime -> String conversion.
|
592
|
+
#
|
593
|
+
# "2016-03-22 16:23:22" -> "2016-03-22T16:23:22Z"
|
594
|
+
def created_at
|
595
|
+
@attributes['created_at'].tr(' ', 'T') << 'Z'
|
596
|
+
end
|
597
|
+
end
|
598
|
+
|
599
|
+
index_scope Product
|
600
|
+
default_import_options raw_import: ->(hash) {
|
601
|
+
LightweightProduct.new(hash)
|
602
|
+
}
|
399
603
|
|
400
|
-
|
604
|
+
field :created_at, type: 'date'
|
605
|
+
```
|
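The timestamp conversion in `LightweightProduct` can be verified without a database. This sketch repeats the class from the example and feeds it a hand-written hash in place of a raw database row.

```ruby
class LightweightProduct
  def initialize(attributes)
    @attributes = attributes
  end

  # "2016-03-22 16:23:22" -> "2016-03-22T16:23:22Z" without building a Time object
  def created_at
    # tr returns a new string, so appending 'Z' does not mutate the raw attribute
    @attributes['created_at'].tr(' ', 'T') << 'Z'
  end
end

# Hand-written hash standing in for a raw row fetched from the database
raw_row = {'id' => 1, 'created_at' => '2016-03-22 16:23:22'}
product = LightweightProduct.new(raw_row)
converted = product.created_at

puts converted # prints "2016-03-22T16:23:22Z"
```

Note that appending `Z` assumes the stored timestamps are already in UTC.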
606
|
+
|
607
|
+
Also, you can pass `:raw_import` option to the `import` method explicitly.
|
608
|
+
|
609
|
+
### Index creation during import
|
610
|
+
|
611
|
+
By default, when you perform an import, Chewy checks whether the index exists and creates it if it's absent.
|
612
|
+
You can turn off this feature to reduce the number of Elasticsearch requests.
|
613
|
+
To do so, set the `skip_index_creation_on_import` parameter to `true` in your `config/chewy.yml`.
|
614
|
+
|
615
|
+
### Skip record fields during import
|
616
|
+
|
617
|
+
You can use `ignore_blank: true` to skip fields that return `true` for the `.blank?` method:
|
618
|
+
|
619
|
+
```ruby
|
620
|
+
index_scope Country
|
621
|
+
field :id
|
622
|
+
field :cities, ignore_blank: true do
|
623
|
+
field :id
|
624
|
+
field :name
|
625
|
+
field :surname, ignore_blank: true
|
626
|
+
field :description
|
627
|
+
end
|
628
|
+
```
|
629
|
+
|
630
|
+
#### Default values for different types
|
631
|
+
|
632
|
+
By default `ignore_blank` is false on every type except `geo_point`.
|
633
|
+
|
634
|
+
### Journaling
|
401
635
|
|
402
|
-
You can
|
636
|
+
You can record all performed actions in a separate journal index in Elasticsearch.
|
637
|
+
When you create/update/destroy your documents, these actions will be saved in this special index.
|
638
|
+
If you do something with a batch of documents (e.g. during index reset), it will be saved as one record, including the primary keys of each document that was affected.
|
639
|
+
A typical journal record looks like this:
|
640
|
+
|
641
|
+
```json
|
642
|
+
{
|
643
|
+
"action": "index",
|
644
|
+
"object_id": [1, 2, 3],
|
645
|
+
"index_name": "...",
|
646
|
+
"created_at": "<timestamp>"
|
647
|
+
}
|
648
|
+
```
|
649
|
+
|
650
|
+
This feature is turned off by default.
|
651
|
+
You can turn it on by setting `journal` to `true` in `config/chewy.yml`.
|
652
|
+
Also, you can specify the journal index name. For example:
|
653
|
+
|
654
|
+
```yaml
|
655
|
+
# config/chewy.yml
|
656
|
+
production:
|
657
|
+
journal: true
|
658
|
+
journal_name: my_super_journal
|
659
|
+
```
|
660
|
+
|
661
|
+
Also, you can provide this option while you're importing some index:
|
403
662
|
|
404
663
|
```ruby
|
405
|
-
|
406
|
-
UsersIndex.type_hash['user'] # => UsersIndex::User
|
407
|
-
UsersIndex.types # => [UsersIndex::User]
|
408
|
-
UsersIndex.type_names # => ['user']
|
664
|
+
CityIndex.import journal: true
|
409
665
|
```
|
410
666
|
|
667
|
+
Or as a default import option for an index:
|
668
|
+
|
669
|
+
```ruby
|
670
|
+
class CityIndex < Chewy::Index
|
671
|
+
index_scope City
|
672
|
+
default_import_options journal: true
|
673
|
+
end
|
674
|
+
```
|
675
|
+
|
676
|
+
You may be wondering why you need it. The answer is simple: not to lose the data.
|
677
|
+
|
678
|
+
Imagine that you reset your index in a zero-downtime manner (to separate index), and in the meantime somebody keeps updating the data frequently (to old index). So all these actions will be written to the journal index and you'll be able to apply them after index reset using the `Chewy::Journal` interface.
|
679
|
+
|
680
|
+
When enabled, the journal can grow to an enormous size. Consider setting up a cron job that cleans it occasionally using the [`chewy:journal:clean` rake task](#chewyjournal).
|
681
|
+
|
411
682
|
### Index manipulation
|
412
683
|
|
413
684
|
```ruby
|
@@ -420,24 +691,22 @@ UsersIndex.create! # use bang or non-bang methods
|
|
420
691
|
UsersIndex.purge
|
421
692
|
UsersIndex.purge! # deletes then creates index
|
422
693
|
|
423
|
-
UsersIndex
|
424
|
-
|
425
|
-
UsersIndex
|
426
|
-
UsersIndex
|
427
|
-
UsersIndex
|
694
|
+
UsersIndex.import # import with 0 arguments processes all the data specified in index_scope definition
|
695
|
+
UsersIndex.import User.where('rating > 100') # or import specified users scope
|
696
|
+
UsersIndex.import User.where('rating > 100').to_a # or import specified users array
|
697
|
+
UsersIndex.import [1, 2, 42] # pass even ids for import, it will be handled in the most effective way
|
698
|
+
UsersIndex.import User.where('rating > 100'), update_fields: [:email] # if update fields are specified - it will update their values only with the `update` bulk action
|
699
|
+
UsersIndex.import! # raises an exception in case of any import errors
|
428
700
|
|
429
|
-
UsersIndex.import # import every defined type
|
430
|
-
UsersIndex.import user: User.where('rating > 100') # import only active users to `user` type.
|
431
|
-
# Other index types, if exists, will be imported with default scope from the type definition.
|
432
701
|
UsersIndex.reset! # purges index and imports default data for all types
|
433
702
|
```
|
434
703
|
|
435
|
-
If the passed user is `#destroyed?`, or satisfies a `delete_if`
|
704
|
+
If the passed user is `#destroyed?`, or satisfies a `delete_if` index_scope option, or the specified id does not exist in the database, import will perform delete from index action for this object.
|
436
705
|
|
437
706
|
```ruby
|
438
|
-
|
439
|
-
|
440
|
-
|
707
|
+
index_scope User, delete_if: :deleted_at
|
708
|
+
index_scope User, delete_if: -> { deleted_at }
|
709
|
+
index_scope User, delete_if: ->(user) { user.deleted_at }
|
441
710
|
```
|
442
711
|
|
443
712
|
See [actions.rb](lib/chewy/index/actions.rb) for more details.
|
@@ -448,13 +717,12 @@ Assume you've got the following code:
|
|
448
717
|
|
449
718
|
```ruby
|
450
719
|
class City < ActiveRecord::Base
|
451
|
-
update_index 'cities
|
720
|
+
update_index 'cities', :self
|
452
721
|
end
|
453
722
|
|
454
723
|
class CitiesIndex < Chewy::Index
|
455
|
-
|
456
|
-
|
457
|
-
end
|
724
|
+
index_scope City
|
725
|
+
field :name
|
458
726
|
end
|
459
727
|
```
|
460
728
|
|
@@ -474,26 +742,112 @@ end
|
|
474
742
|
|
475
743
|
Using this strategy delays the index update request until the end of the block. Updated records are aggregated and the index update happens with the bulk API. So this strategy is highly optimized.
|
476
744
|
|
477
|
-
#### `:
|
745
|
+
#### `:sidekiq`
|
478
746
|
|
479
|
-
This does the same thing as `:atomic`, but asynchronously using
|
747
|
+
This does the same thing as `:atomic`, but asynchronously using Sidekiq. You can patch `Chewy::Strategy::Sidekiq::Worker` to improve index updates.
|
480
748
|
|
481
749
|
```ruby
|
482
|
-
Chewy.strategy(:
|
750
|
+
Chewy.strategy(:sidekiq) do
|
483
751
|
City.popular.map(&:do_some_update_action!)
|
484
752
|
end
|
485
753
|
```
|
486
754
|
|
487
|
-
|
755
|
+
The default queue name is `chewy`. You can customize it in settings with `sidekiq.queue`:
|
756
|
+
```
|
757
|
+
Chewy.settings[:sidekiq] = {queue: :low}
|
758
|
+
```
|
488
759
|
|
489
|
-
|
760
|
+
#### `:lazy_sidekiq`
|
761
|
+
|
762
|
+
This does the same thing as `:sidekiq`, but with lazy evaluation. Beware that it does not allow you to use any non-persistent record state for indices and conditions, because the record will be re-fetched from the database asynchronously using Sidekiq. However, for destroyed records the strategy falls back to `:sidekiq`, because it's not possible to re-fetch deleted records from the database.
|
763
|
+
|
764
|
+
The purpose of this strategy is to improve the response time of the code that should update indexes: it defers not only the actual ES calls to a background job but also the evaluation of `update_index` callbacks (for created and updated objects). Similar to `:sidekiq`, the index update is asynchronous, so this strategy cannot be used when data and index synchronization is required.
|
490
765
|
|
491
766
|
```ruby
|
492
|
-
Chewy.strategy(:
|
767
|
+
Chewy.strategy(:lazy_sidekiq) do
|
493
768
|
City.popular.map(&:do_some_update_action!)
|
494
769
|
end
|
495
770
|
```
|
496
771
|
|
772
|
+
The default queue name is `chewy`. You can customize it in settings with `sidekiq.queue`:
|
773
|
+
```
|
774
|
+
Chewy.settings[:sidekiq] = {queue: :low}
|
775
|
+
```
|
776
|
+
|
777
|
+
#### `:delayed_sidekiq`
|
778
|
+
|
779
|
+
It accumulates ids of records to be reindexed during the latency window in redis and then does the reindexing of all accumulated records at once.
|
780
|
+
The strategy is very useful in case of frequently mutated records.
|
781
|
+
It supports the `update_fields` option, so it will try to select just enough data from the DB.
|
782
|
+
|
783
|
+
The following options can be defined in the index:
|
784
|
+
```ruby
|
785
|
+
class CitiesIndex < Chewy::Index
|
786
|
+
strategy_config delayed_sidekiq: {
|
787
|
+
latency: 3,
|
788
|
+
margin: 2,
|
789
|
+
ttl: 60 * 60 * 24,
|
790
|
+
reindex_wrapper: ->(&reindex) {
|
791
|
+
ActiveRecord::Base.connected_to(role: :reading) { reindex.call }
|
792
|
+
}
|
793
|
+
# latency - will prevent scheduling identical jobs
|
794
|
+
# margin - main purpose is to cover db replication lag by the margin
|
795
|
+
# ttl - a chunk expiration time (in seconds)
|
796
|
+
# reindex_wrapper - lambda that accepts a block and wraps the reindex process, e.g. in an AR connection block.
|
797
|
+
}
|
798
|
+
|
799
|
+
# ...
|
800
|
+
end
|
801
|
+
```
|
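The latency window idea can be illustrated with a purely conceptual, in-memory sketch. This is not Chewy's implementation (which, as described above, keeps the accumulated ids in Redis); `WindowedAccumulator` and all of its methods are hypothetical names used only to show how ids saved within one window end up in a single reindex.

```ruby
require 'set'

# Conceptual model only: groups record ids by the end of their latency window
class WindowedAccumulator
  def initialize(latency:)
    @latency = latency
    @buckets = {}
  end

  # End of the window that the given moment falls into
  def window_for(now)
    (now.to_i / @latency + 1) * @latency
  end

  # Returns true when a new window is opened, i.e. when a job would be scheduled
  def add(ids, now:)
    window = window_for(now)
    opened = !@buckets.key?(window)
    (@buckets[window] ||= Set.new).merge(ids)
    opened
  end

  # All ids accumulated for a window, deduplicated; removes the window
  def flush(window)
    (@buckets.delete(window) || Set.new).to_a.sort
  end
end

acc = WindowedAccumulator.new(latency: 3)
first = acc.add([1, 2], now: 100)  # opens the window ending at 102
second = acc.add([2, 3], now: 101) # same window, no new job needed
third = acc.add([4], now: 102)     # next window, ending at 105
batch = acc.flush(102)
```

In this model a record updated many times within one window is reindexed once, which is why the strategy suits frequently mutated records.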
802
|
+
|
803
|
+
Also you can define defaults in the `initializers/chewy.rb`
|
804
|
+
```ruby
|
805
|
+
Chewy.settings = {
|
806
|
+
strategy_config: {
|
807
|
+
delayed_sidekiq: {
|
808
|
+
latency: 3,
|
809
|
+
margin: 2,
|
810
|
+
ttl: 60 * 60 * 24,
|
811
|
+
reindex_wrapper: ->(&reindex) {
|
812
|
+
ActiveRecord::Base.connected_to(role: :reading) { reindex.call }
|
813
|
+
}
|
814
|
+
}
|
815
|
+
}
|
816
|
+
}
|
817
|
+
|
818
|
+
```
|
819
|
+
or in `config/chewy.yml`
|
820
|
+
```yaml
|
821
|
+
strategy_config:
|
822
|
+
delayed_sidekiq:
|
823
|
+
latency: 3
|
824
|
+
margin: 2
|
825
|
+
ttl: <%= 60 * 60 * 24 %>
|
826
|
+
# reindex_wrapper setting is not possible here!!! use the initializer instead
|
827
|
+
```
|
828
|
+
|
829
|
+
You can use the strategy identically to other strategies
|
830
|
+
```ruby
|
831
|
+
Chewy.strategy(:delayed_sidekiq) do
|
832
|
+
City.popular.map(&:do_some_update_action!)
|
833
|
+
end
|
834
|
+
```
|
835
|
+
|
836
|
+
The default queue name is `chewy`. You can customize it in settings with `sidekiq.queue`:
|
837
|
+
```
|
838
|
+
Chewy.settings[:sidekiq] = {queue: :low}
|
839
|
+
```
|
840
|
+
|
841
|
+
Explicit call of the reindex using the `:delayed_sidekiq` strategy:
|
842
|
+
```ruby
|
843
|
+
CitiesIndex.import([1, 2, 3], strategy: :delayed_sidekiq)
|
844
|
+
```
|
845
|
+
|
846
|
+
Explicit call of the reindex using `:delayed_sidekiq` strategy with `:update_fields` support
|
847
|
+
```ruby
|
848
|
+
CitiesIndex.import([1, 2, 3], update_fields: [:name], strategy: :delayed_sidekiq)
|
849
|
+
```
|
850
|
+
|
497
851
|
#### `:active_job`
|
498
852
|
|
499
853
|
This does the same thing as `:atomic`, but using ActiveJob. This will inherit the ActiveJob configuration settings including the `active_job.queue_adapter` setting for the environment. Patch `Chewy::Strategy::ActiveJob::Worker` for index updates improving.
|
@@ -504,6 +858,11 @@ Chewy.strategy(:active_job) do
|
|
504
858
|
end
|
505
859
|
```
|
506
860
|
|
861
|
+
The default queue name is `chewy`. You can customize it in settings with `active_job.queue`:
|
862
|
+
```
|
863
|
+
Chewy.settings[:active_job] = {queue: :low}
|
864
|
+
```
|
865
|
+
|
507
866
|
#### `:urgent`
|
508
867
|
|
509
868
|
The following strategy is convenient if you are going to update documents in your index one by one.
|
@@ -514,7 +873,7 @@ Chewy.strategy(:urgent) do
|
|
514
873
|
end
|
515
874
|
```
|
516
875
|
|
517
|
-
This code
|
876
|
+
This code will perform `City.popular.count` requests for ES documents update.
|
518
877
|
|
519
878
|
It is convenient for use in e.g. the Rails console with non-block notation:
|
520
879
|
|
@@ -525,7 +884,9 @@ It is convenient for use in e.g. the Rails console with non-block notation:
|
|
525
884
|
|
526
885
|
#### `:bypass`
|
527
886
|
|
528
|
-
|
887
|
+
When the bypass strategy is active the index will not be automatically updated on object save.
|
888
|
+
|
889
|
+
For example, on `City.first.save!` the cities index would not be updated.
|
529
890
|
|
530
891
|
#### Nesting
|
531
892
|
|
@@ -579,582 +940,341 @@ RSpec.configure do |config|
|
|
579
940
|
end
|
580
941
|
```
|
581
942
|
|
582
|
-
###
|
943
|
+
### Elasticsearch client options
|
583
944
|
|
584
|
-
|
585
|
-
scope = UsersIndex.query(term: {name: 'foo'})
|
586
|
-
.filter(range: {rating: {gte: 100}})
|
587
|
-
.order(created: :desc)
|
588
|
-
.limit(20).offset(100)
|
589
|
-
|
590
|
-
scope.to_a # => will produce array of UserIndex::User or other types instances
|
591
|
-
scope.map { |user| user.email }
|
592
|
-
scope.total_count # => will return total objects count
|
945
|
+
All connection options, except `:prefix`, are passed to `Elasticsearch::Client.new` ([chewy/lib/chewy.rb](https://github.com/toptal/chewy/blob/f5bad9f83c21416ac10590f6f34009c645062e89/lib/chewy.rb#L153-L160)):
|
593
946
|
|
594
|
-
|
595
|
-
scope.explain.map { |user| user._explanation }
|
596
|
-
scope.only(:id, :email) # returns ids and emails only
|
947
|
+
Here's the relevant Elasticsearch documentation on the subject: https://rubydoc.info/gems/elasticsearch-transport#setting-hosts
|
597
948
|
|
598
|
-
|
599
|
-
```
|
949
|
+
### `ActiveSupport::Notifications` support
|
600
950
|
|
601
|
-
|
951
|
+
Chewy publishes notifications for the following events:
|
602
952
|
|
603
|
-
|
604
|
-
UsersIndex::User.filter(term: {name: 'foo'}) # will return UserIndex::User collection only
|
605
|
-
```
|
953
|
+
#### `search_query.chewy` payload
|
606
954
|
|
607
|
-
|
608
|
-
`
|
955
|
+
* `payload[:index]`: requested index class
|
956
|
+
* `payload[:request]`: request hash
|
609
957
|
|
610
|
-
|
958
|
+
#### `import_objects.chewy` payload
|
611
959
|
|
612
|
-
|
960
|
+
* `payload[:index]`: currently imported index name
|
961
|
+
* `payload[:import]`: imports stats, total imported and deleted objects count:
|
613
962
|
|
614
|
-
|
963
|
+
```ruby
|
964
|
+
{index: 30, delete: 5}
|
965
|
+
```
|
615
966
|
|
616
|
-
|
617
|
-
UsersIndex::User.filter{ name == 'Fred' }.filter{ age < 42 } # will be wrapped with `and` filter
|
618
|
-
UsersIndex::User.filter{ name == 'Fred' }.filter{ age < 42 }.filter_mode(:should) # will be wrapped with bool `should` filter
|
619
|
-
UsersIndex::User.filter{ name == 'Fred' }.filter{ age < 42 }.filter_mode('75%') # will be wrapped with bool `should` filter with `minimum_should_match: '75%'`
|
620
|
-
```
|
967
|
+
* `payload[:errors]`: might not exist. Contains grouped errors with objects ids list:
|
621
968
|
|
622
|
-
|
969
|
+
```ruby
|
970
|
+
{index: {
|
971
|
+
'error 1 text' => ['1', '2', '3'],
|
972
|
+
'error 2 text' => ['4']
|
973
|
+
}, delete: {
|
974
|
+
'delete error text' => ['10', '12']
|
975
|
+
}}
|
976
|
+
```
|
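The grouped structure of `payload[:errors]` is easy to turn into log lines. The sketch below works on a plain hash shaped like the example above; `summarize_import_errors` is a hypothetical helper, not part of Chewy.

```ruby
# Sample hash in the shape of payload[:errors] shown above
errors = {
  index: {
    'error 1 text' => %w[1 2 3],
    'error 2 text' => ['4']
  },
  delete: {
    'delete error text' => %w[10 12]
  }
}

# Hypothetical helper: one readable line per action/message pair
def summarize_import_errors(errors)
  errors.flat_map do |action, messages|
    messages.map { |message, ids| "#{action}: #{message} (ids: #{ids.join(', ')})" }
  end
end

summary = summarize_import_errors(errors)
puts summary
```

Since `payload[:errors]` might not exist, a subscriber would typically pass `payload[:errors] || {}` to such a helper.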
623
977
|
|
624
|
-
###
|
978
|
+
### NewRelic integration
|
625
979
|
|
626
|
-
|
980
|
+
To integrate with NewRelic you may use the following example source (config/initializers/chewy.rb):
|
627
981
|
|
628
982
|
```ruby
|
629
|
-
|
630
|
-
UsersIndex::User.delete_all
|
631
|
-
UsersIndex.filter{ age < 42 }.delete_all
|
632
|
-
UsersIndex::User.filter{ age < 42 }.delete_all
|
633
|
-
```
|
634
|
-
|
635
|
-
### Filters query DSL
|
983
|
+
require 'new_relic/agent/instrumentation/evented_subscriber'
|
636
984
|
|
637
|
-
|
985
|
+
class ChewySubscriber < NewRelic::Agent::Instrumentation::EventedSubscriber
|
986
|
+
def start(name, id, payload)
|
987
|
+
event = ChewyEvent.new(name, Time.current, nil, id, payload)
|
988
|
+
push_event(event)
|
989
|
+
end
|
638
990
|
|
639
|
-
|
640
|
-
|
641
|
-
|
642
|
-
```
|
991
|
+
def finish(_name, id, _payload)
|
992
|
+
pop_event(id).finish
|
993
|
+
end
|
643
994
|
|
644
|
-
|
995
|
+
class ChewyEvent < NewRelic::Agent::Instrumentation::Event
|
996
|
+
OPERATIONS = {
|
997
|
+
'import_objects.chewy' => 'import',
|
998
|
+
'search_query.chewy' => 'search',
|
999
|
+
'delete_query.chewy' => 'delete'
|
1000
|
+
}.freeze
|
645
1001
|
|
646
|
-
*
|
1002
|
+
def initialize(*args)
|
1003
|
+
super
|
1004
|
+
@segment = start_segment
|
1005
|
+
end
|
647
1006
|
|
648
|
-
|
649
|
-
|
650
|
-
|
651
|
-
|
1007
|
+
def start_segment
|
1008
|
+
segment = NewRelic::Agent::Transaction::DatastoreSegment.new product, operation, collection, host, port
|
1009
|
+
if (txn = state.current_transaction)
|
1010
|
+
segment.transaction = txn
|
1011
|
+
end
|
1012
|
+
segment.notice_sql @payload[:request].to_s
|
1013
|
+
segment.start
|
1014
|
+
segment
|
1015
|
+
end
|
652
1016
|
|
653
|
-
|
654
|
-
|
1017
|
+
def finish
|
1018
|
+
if (txn = state.current_transaction)
|
1019
|
+
txn.add_segment @segment
|
1020
|
+
end
|
1021
|
+
@segment.finish
|
1022
|
+
end
|
655
1023
|
|
656
|
-
|
657
|
-
UsersIndex.filter{ name == 'Name' } # simple field term filter
|
658
|
-
UsersIndex.filter{ name(:bool) == ['Name1', 'Name2'] } # terms query with `execution: :bool` option passed
|
659
|
-
UsersIndex.filter{ answers.title =~ /regexp/ } # regexp filter for `answers.title` field
|
660
|
-
```
|
1024
|
+
private
|
661
1025
|
|
662
|
-
|
1026
|
+
def state
|
1027
|
+
@state ||= NewRelic::Agent::TransactionState.tl_get
|
1028
|
+
end
|
663
1029
|
|
664
|
-
|
665
|
-
|
666
|
-
|
667
|
-
must(
|
668
|
-
should(name =~ 'Fr').should_not(name == 'Fred') & (age == 42), email =~ /gmail\.com/
|
669
|
-
) | ((roles.admin == true) & name?)
|
670
|
-
} # many of the combination possibilities
|
671
|
-
```
|
1030
|
+
def product
|
1031
|
+
'Elasticsearch'
|
1032
|
+
end
|
672
1033
|
|
673
|
-
|
1034
|
+
def operation
|
1035
|
+
OPERATIONS[name]
|
1036
|
+
end
|
674
1037
|
|
675
|
-
|
676
|
-
|
677
|
-
|
1038
|
+
def collection
|
1039
|
+
payload.values_at(:type, :index)
|
1040
|
+
.reject { |value| value.try(:empty?) }
|
1041
|
+
.first
|
1042
|
+
.to_s
|
1043
|
+
end
|
678
1044
|
|
679
|
-
|
680
|
-
|
681
|
-
|
1045
|
+
def host
|
1046
|
+
Chewy.client.transport.hosts.first[:host]
|
1047
|
+
end
|
682
1048
|
|
683
|
-
|
684
|
-
|
685
|
-
|
1049
|
+
def port
|
1050
|
+
Chewy.client.transport.hosts.first[:port]
|
1051
|
+
end
|
1052
|
+
end
|
1053
|
+
end
|
686
1054
|
|
687
|
-
|
688
|
-
UsersIndex.filter{ name(cache: 'name_regexp') =~ /Name/ }
|
689
|
-
# Or not
|
690
|
-
UsersIndex.filter{ name(cache: true) =~ /Name/ }
|
1055
|
+
ActiveSupport::Notifications.subscribe(/\.chewy$/, ChewySubscriber.new)
|
691
1056
|
```
|
692
1057
|
|
693
|
-
|
694
|
-
|
695
|
-
* Term filter
|
696
|
-
|
697
|
-
```json
|
698
|
-
{"term": {"name": "Fred"}}
|
699
|
-
{"not": {"term": {"name": "Johny"}}}
|
700
|
-
```
|
701
|
-
|
702
|
-
```ruby
|
703
|
-
UsersIndex.filter{ name == 'Fred' }
|
704
|
-
UsersIndex.filter{ name != 'Johny' }
|
705
|
-
```
|
706
|
-
|
707
|
-
* Terms filter
|
1058
|
+
### Search requests
|
708
1059
|
|
709
|
-
|
710
|
-
{"terms": {"name": ["Fred", "Johny"]}}
|
711
|
-
{"not": {"terms": {"name": ["Fred", "Johny"]}}}
|
1060
|
+
Quick introduction.
|
712
1061
|
|
713
|
-
|
1062
|
+
#### Composing requests
|
714
1063
|
|
715
|
-
|
1064
|
+
The request DSL has the same chainable nature as ActiveRecord. The main class is `Chewy::Search::Request`.
|
716
1065
|
|
717
|
-
|
718
|
-
|
719
|
-
|
720
|
-
```
|
721
|
-
|
722
|
-
```ruby
|
723
|
-
UsersIndex.filter{ name == ['Fred', 'Johny'] }
|
724
|
-
UsersIndex.filter{ name != ['Fred', 'Johny'] }
|
725
|
-
|
726
|
-
UsersIndex.filter{ name(:|) == ['Fred', 'Johny'] }
|
727
|
-
UsersIndex.filter{ name(:or) == ['Fred', 'Johny'] }
|
728
|
-
UsersIndex.filter{ name(execution: :or) == ['Fred', 'Johny'] }
|
729
|
-
|
730
|
-
UsersIndex.filter{ name(:&) == ['Fred', 'Johny'] }
|
731
|
-
UsersIndex.filter{ name(:and) == ['Fred', 'Johny'] }
|
732
|
-
UsersIndex.filter{ name(execution: :and) == ['Fred', 'Johny'] }
|
733
|
-
|
734
|
-
UsersIndex.filter{ name(:b) == ['Fred', 'Johny'] }
|
735
|
-
UsersIndex.filter{ name(:bool) == ['Fred', 'Johny'] }
|
736
|
-
UsersIndex.filter{ name(execution: :bool) == ['Fred', 'Johny'] }
|
737
|
-
|
738
|
-
UsersIndex.filter{ name(:f) == ['Fred', 'Johny'] }
|
739
|
-
UsersIndex.filter{ name(:fielddata) == ['Fred', 'Johny'] }
|
740
|
-
UsersIndex.filter{ name(execution: :fielddata) == ['Fred', 'Johny'] }
|
741
|
-
```
|
742
|
-
|
743
|
-
* Regexp filter (== and =~ are equivalent)
|
744
|
-
|
745
|
-
```json
|
746
|
-
{"regexp": {"name.first": "s.*y"}}
|
747
|
-
|
748
|
-
{"not": {"regexp": {"name.first": "s.*y"}}}
|
749
|
-
|
750
|
-
{"regexp": {"name.first": {"value": "s.*y", "flags": "ANYSTRING|INTERSECTION"}}}
|
751
|
-
```
|
752
|
-
|
753
|
-
```ruby
|
754
|
-
UsersIndex.filter{ name.first == /s.*y/ }
|
755
|
-
UsersIndex.filter{ name.first =~ /s.*y/ }
|
756
|
-
|
757
|
-
UsersIndex.filter{ name.first != /s.*y/ }
|
758
|
-
UsersIndex.filter{ name.first !~ /s.*y/ }
|
759
|
-
|
760
|
-
UsersIndex.filter{ name.first(:anystring, :intersection) == /s.*y/ }
|
761
|
-
UsersIndex.filter{ name.first(flags: [:anystring, :intersection]) == /s.*y/ }
|
762
|
-
```
|
763
|
-
|
764
|
-
* Prefix filter
|
765
|
-
|
766
|
-
```json
|
767
|
-
{"prefix": {"name": "Fre"}}
|
768
|
-
{"not": {"prefix": {"name": "Joh"}}}
|
769
|
-
```
|
770
|
-
|
771
|
-
```ruby
|
772
|
-
UsersIndex.filter{ name =~ 'Fre' }
|
773
|
-
UsersIndex.filter{ name !~ 'Joh' }
|
774
|
-
```
|
775
|
-
|
776
|
-
* Exists filter
|
777
|
-
|
778
|
-
```json
|
779
|
-
{"exists": {"field": "name"}}
|
780
|
-
```
|
781
|
-
|
782
|
-
```ruby
|
783
|
-
UsersIndex.filter{ name? }
|
784
|
-
UsersIndex.filter{ !!name }
|
785
|
-
UsersIndex.filter{ !!name? }
|
786
|
-
UsersIndex.filter{ name != nil }
|
787
|
-
UsersIndex.filter{ !(name == nil) }
|
788
|
-
```
|
789
|
-
|
790
|
-
* Missing filter
|
791
|
-
|
792
|
-
```json
|
793
|
-
{"missing": {"field": "name", "existence": true, "null_value": false}}
|
794
|
-
{"missing": {"field": "name", "existence": true, "null_value": true}}
|
795
|
-
{"missing": {"field": "name", "existence": false, "null_value": true}}
|
796
|
-
```
|
797
|
-
|
798
|
-
```ruby
|
799
|
-
UsersIndex.filter{ !name }
|
800
|
-
UsersIndex.filter{ !name? }
|
801
|
-
UsersIndex.filter{ name == nil }
|
802
|
-
```
|
803
|
-
|
804
|
-
* Range
|
805
|
-
|
806
|
-
```json
|
807
|
-
{"range": {"age": {"gt": 42}}}
|
808
|
-
{"range": {"age": {"gte": 42}}}
|
809
|
-
{"range": {"age": {"lt": 42}}}
|
810
|
-
{"range": {"age": {"lte": 42}}}
|
811
|
-
|
812
|
-
{"range": {"age": {"gt": 40, "lt": 50}}}
|
813
|
-
{"range": {"age": {"gte": 40, "lte": 50}}}
|
814
|
-
|
815
|
-
{"range": {"age": {"gt": 40, "lte": 50}}}
|
816
|
-
{"range": {"age": {"gte": 40, "lt": 50}}}
|
817
|
-
```
|
818
|
-
|
819
|
-
```ruby
|
820
|
-
UsersIndex.filter{ age > 42 }
|
821
|
-
UsersIndex.filter{ age >= 42 }
|
822
|
-
UsersIndex.filter{ age < 42 }
|
823
|
-
UsersIndex.filter{ age <= 42 }
|
824
|
-
|
825
|
-
UsersIndex.filter{ age == (40..50) }
|
826
|
-
UsersIndex.filter{ (age > 40) & (age < 50) }
|
827
|
-
UsersIndex.filter{ age == [40..50] }
|
828
|
-
UsersIndex.filter{ (age >= 40) & (age <= 50) }
|
829
|
-
|
830
|
-
UsersIndex.filter{ (age > 40) & (age <= 50) }
|
831
|
-
UsersIndex.filter{ (age >= 40) & (age < 50) }
|
832
|
-
```
|
833
|
-
|
834
|
-
* Bool filter
|
835
|
-
|
836
|
-
```json
|
837
|
-
{"bool": {
|
838
|
-
"must": [{"term": {"name": "Name"}}],
|
839
|
-
"should": [{"term": {"age": 42}}, {"term": {"age": 45}}]
|
840
|
-
}}
|
841
|
-
```
|
842
|
-
|
843
|
-
```ruby
|
844
|
-
UsersIndex.filter{ must(name == 'Name').should(age == 42, age == 45) }
|
845
|
-
```
|
846
|
-
|
847
|
-
* And filter
|
848
|
-
|
849
|
-
```json
|
850
|
-
{"and": [{"term": {"name": "Name"}}, {"range": {"age": {"lt": 42}}}]}
|
851
|
-
```
|
1066
|
+
```ruby
|
1067
|
+
CitiesIndex.query(match: {name: 'London'})
|
1068
|
+
```
|
852
1069
|
|
853
|
-
|
854
|
-
UsersIndex.filter{ (name == 'Name') & (age < 42) }
|
855
|
-
```
|
1070
|
+
The main methods of the request DSL are `query`, `filter` and `post_filter`. You can pass pure query hashes or use `elasticsearch-dsl`.
|
856
1071
|
|
857
|
-
|
1072
|
+
```ruby
|
1073
|
+
CitiesIndex
|
1074
|
+
.filter(term: {name: 'Bangkok'})
|
1075
|
+
.query(match: {name: 'London'})
|
1076
|
+
.query.not(range: {population: {gt: 1_000_000}})
|
1077
|
+
```
|
858
1078
|
|
859
|
-
|
860
|
-
{"or": [{"term": {"name": "Name"}}, {"range": {"age": {"lt": 42}}}]}
|
861
|
-
```
|
1079
|
+
You can query a set of indexes at once:
|
862
1080
|
|
863
|
-
|
864
|
-
|
865
|
-
|
1081
|
+
```ruby
|
1082
|
+
CitiesIndex.indices(CountriesIndex).query(match: {name: 'Some'})
|
1083
|
+
```
|
866
1084
|
|
867
|
-
|
868
|
-
{"not": {"term": {"name": "Name"}}}
|
869
|
-
{"not": {"range": {"age": {"lt": 42}}}}
|
870
|
-
```
|
1085
|
+
See https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl.html and https://github.com/elastic/elasticsearch-dsl-ruby for more details.
|
871
1086
|
|
872
|
-
|
873
|
-
UsersIndex.filter{ !(name == 'Name') } # or UsersIndex.filter{ name != 'Name' }
|
874
|
-
UsersIndex.filter{ !(age < 42) }
|
875
|
-
```
|
1087
|
+
An important part of request manipulation is merging. There are 4 methods to perform it: `merge`, `and`, `or`, `not`. See [Chewy::Search::QueryProxy](lib/chewy/search/query_proxy.rb) for details. Also, the `only` and `except` methods help to remove unneeded parts of the request.
|
876
1088
|
|
877
|
-
|
1089
|
+
Every other request part is covered by a bunch of additional methods, see [Chewy::Search::Request](lib/chewy/search/request.rb) for details:
|
878
1090
|
|
879
|
-
|
880
|
-
|
881
|
-
|
1091
|
+
```ruby
|
1092
|
+
CitiesIndex.limit(10).offset(30).order(:name, {population: {order: :desc}})
|
1093
|
+
```
|
882
1094
|
|
883
|
-
|
884
|
-
UsersIndex.filter{ match_all }
|
885
|
-
```
|
1095
|
+
Request DSL also provides additional scope actions, like `delete_all`, `exists?`, `count`, `pluck`, etc.
|
886
1096
|
|
887
|
-
|
1097
|
+
#### Pagination
|
888
1098
|
|
889
|
-
|
890
|
-
{"has_child": {"type": "blog_tag", "query": {"term": {"tag": "something"}}}
|
891
|
-
{"has_child": {"type": "comment", "filter": {"term": {"user": "john"}}}
|
892
|
-
```
|
1099
|
+
The request DSL supports pagination with `Kaminari`. An extension is enabled on initialization if `Kaminari` is available. See [Chewy::Search](lib/chewy/search.rb) and [Chewy::Search::Pagination::Kaminari](lib/chewy/search/pagination/kaminari.rb) for details.
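As a rough illustration of what page-based pagination boils down to, the sketch below converts a page number and a page size into the offset/limit pair that `limit` and `offset` work with. `page_window` is a hypothetical helper written for this example; it is not part of Chewy or Kaminari.

```ruby
# Hypothetical helper for illustration only; not part of Chewy or Kaminari.
# Converts a 1-based page number and a page size into an offset/limit pair.
def page_window(page:, per:)
  page = [page.to_i, 1].max   # treat missing or invalid pages as page 1
  per  = [per.to_i, 1].max    # guard against zero or negative page sizes
  { offset: (page - 1) * per, limit: per }
end

window = page_window(page: 4, per: 10)
puts "offset=#{window[:offset]} limit=#{window[:limit]}" # prints offset=30 limit=10
```

Under this arithmetic, page 4 with 10 records per page covers the same window as `limit(10).offset(30)`.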
|
893
1100
|
|
894
|
-
|
895
|
-
UsersIndex.filter{ has_child(:blog_tag).query(term: {tag: 'something'}) }
|
896
|
-
UsersIndex.filter{ has_child(:comment).filter{ user == 'john' } }
|
897
|
-
```
|
1101
|
+
#### Named scopes
|
898
1102
|
|
899
|
-
|
1103
|
+
Chewy supports named scopes. There is no specialized DSL for defining them: a named scope is simply a class method.
|
900
1104
|
|
901
|
-
|
902
|
-
{"has_parent": {"type": "blog", "query": {"term": {"tag": "something"}}}}
|
903
|
-
{"has_parent": {"type": "blog", "filter": {"term": {"text": "bonsai three"}}}}
|
904
|
-
```
|
1105
|
+
See [Chewy::Search::Scoping](lib/chewy/search/scoping.rb) for details.
|
905
1106
|
|
906
|
-
|
907
|
-
UsersIndex.filter{ has_parent(:blog).query(term: {tag: 'something'}) }
|
908
|
-
UsersIndex.filter{ has_parent(:blog).filter{ text == 'bonsai three' } }
|
909
|
-
```
|
1107
|
+
#### Scroll API
|
910
1108
|
|
911
|
-
|
1109
|
+
ElasticSearch scroll API is utilized by a bunch of methods: `scroll_batches`, `scroll_hits`, `scroll_wrappers` and `scroll_objects`.
|
912
1110
|
|
913
|
-
|
1111
|
+
See [Chewy::Search::Scrolling](lib/chewy/search/scrolling.rb) for details.
|
914
1112
|
|
915
|
-
|
1113
|
+
#### Loading objects
|
916
1114
|
|
917
|
-
|
1115
|
+
It is possible to load ORM/ODM source objects with the `objects` method. To provide additional loading options use `load` method:
|
918
1116
|
|
919
1117
|
```ruby
|
920
|
-
|
1118
|
+
CitiesIndex.load(scope: -> { active }).to_a # to_a returns `Chewy::Index` wrappers.
|
1119
|
+
CitiesIndex.load(scope: -> { active }).objects # An array of AR source objects.
|
921
1120
|
```
|
922
1121
|
|
923
|
-
|
924
|
-
|
925
|
-
The response will include the `:facets` sidechannel:
|
926
|
-
|
927
|
-
```
|
928
|
-
< { ... ,"facets":{"countries":{"_type":"terms","missing":?,"total":?,"other":?,"terms":[{"term":"USA","count":?},{"term":"Brazil","count":?}, ...}}
|
929
|
-
```
|
1122
|
+
See [Chewy::Search::Loader](lib/chewy/search/loader.rb) for more details.
|
930
1123
|
|
931
|
-
|
932
|
-
|
933
|
-
Aggregations are part of the optional sidechannel that can be requested with a query.
|
934
|
-
|
935
|
-
You interact with aggregations using the composable #aggregations method (or its alias #aggs)
|
936
|
-
|
937
|
-
Let's look at an example.
|
1124
|
+
When you need to iterate through the wrappers and the objects simultaneously, the `object_hash` method helps:
|
938
1125
|
|
939
1126
|
```ruby
|
940
|
-
|
941
|
-
|
942
|
-
|
943
|
-
field :rating
|
944
|
-
end
|
1127
|
+
scope = CitiesIndex.load(scope: -> { active })
|
1128
|
+
scope.each do |wrapper|
|
1129
|
+
scope.object_hash[wrapper]
|
945
1130
|
end
|
946
|
-
|
947
|
-
all_johns = UsersIndex::User.filter { name == 'john' }.aggs({ avg_rating: { avg: { field: 'rating' } } })
|
948
|
-
|
949
|
-
avg_johns_rating = all_johns.aggs
|
950
|
-
# => {"avg_rating"=>{"value"=>3.5}}
|
951
1131
|
```
|
952
1132
|
|
953
|
-
|
954
|
-
which is also available under the .agg alias method.
|
1133
|
+
### Rake tasks
|
955
1134
|
|
956
|
-
|
1135
|
+
For a Rails application, some index-maintaining rake tasks are defined.
|
957
1136
|
|
958
|
-
|
959
|
-
class UsersIndex < Chewy::Index
|
960
|
-
define_type User do
|
961
|
-
field :name
|
962
|
-
field :rating, type: "long"
|
963
|
-
agg :avg_rating do
|
964
|
-
{ avg: { field: 'rating' } }
|
965
|
-
end
|
966
|
-
end
|
967
|
-
end
|
1137
|
+
#### `chewy:reset`
|
968
1138
|
|
969
|
-
|
1139
|
+
Performs zero-downtime reindexing as described [here](https://www.elastic.co/blog/changing-mapping-with-zero-downtime). So the rake task creates a new index with unique suffix and then simply aliases it to the common index name. The previous index is deleted afterwards (see `Chewy::Index.reset!` for more details).
|
970
1140
|
|
971
|
-
|
972
|
-
#
|
1141
|
+
```bash
|
1142
|
+
rake chewy:reset # resets all the existing indices
|
1143
|
+
rake chewy:reset[users] # resets UsersIndex only
|
1144
|
+
rake chewy:reset[users,cities] # resets UsersIndex and CitiesIndex
|
1145
|
+
rake chewy:reset[-users,cities] # resets every index in the application except specified ones
|
973
1146
|
```
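The argument convention shown in the commands above (no arguments, a list of names, or a list starting with `-` for exclusions) can be sketched in plain Ruby. This is a minimal illustration, assuming indexes are identified by plain names; `select_indexes` is a hypothetical function, not Chewy's actual rake helper.

```ruby
# Hypothetical sketch of the argument convention; not Chewy's actual code.
# `available` is the list of known index names, `args` the rake task arguments.
def select_indexes(available, args)
  names = args.map(&:to_s).reject(&:empty?)
  return available if names.empty? # no arguments: every index

  if names.first.start_with?('-')
    # A leading "-" on the first argument turns the whole list into exclusions.
    excluded = names.map { |name| name.delete_prefix('-') }
    available - excluded
  else
    available & names # keep only the requested indexes
  end
end

all = %w[users cities countries]
p select_indexes(all, [])                # => ["users", "cities", "countries"]
p select_indexes(all, %w[users])         # => ["users"]
p select_indexes(all, %w[-users cities]) # => ["countries"]
```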
|
974
1147
|
|
975
|
-
|
976
|
-
with the same name. To explicitly reference an aggregation you provide a string to the #aggs method of the form:
|
977
|
-
`index_name#document_type.aggregation_name`
|
1148
|
+
#### `chewy:upgrade`
|
978
1149
|
|
979
|
-
|
1150
|
+
Performs reset exactly the same way as `chewy:reset` does, but only when the index specification (setting or mapping) was changed.
|
980
1151
|
|
981
|
-
|
982
|
-
class UsersIndex < Chewy::Index
|
983
|
-
define_type User do
|
984
|
-
field :name
|
985
|
-
field :rating, type: "long"
|
986
|
-
agg :avg_rating do
|
987
|
-
{ avg: { field: 'rating' } }
|
988
|
-
end
|
989
|
-
end
|
990
|
-
define_type Post do
|
991
|
-
field :title
|
992
|
-
field :body
|
993
|
-
field :comments do
|
994
|
-
field :message
|
995
|
-
field :rating, type: "long"
|
996
|
-
end
|
997
|
-
agg :avg_rating do
|
998
|
-
{ avg: { field: 'comments.rating' } }
|
999
|
-
end
|
1000
|
-
end
|
1001
|
-
end
|
1002
|
-
|
1003
|
-
all_docs = UsersIndex.filter {match_all}.aggs("users#user.avg_rating")
|
1004
|
-
all_docs.aggs
|
1005
|
-
# => {"users#user.avg_rating"=>{"value"=>3.5}}
|
1006
|
-
```
|
1152
|
+
It works only when the index specification is locked in the `Chewy::Stash::Specification` index. The first run will reset all indexes and lock their specifications.
|
1007
1153
|
|
1008
|
-
|
1154
|
+
See [Chewy::Stash::Specification](lib/chewy/stash.rb) and [Chewy::Index::Specification](lib/chewy/index/specification.rb) for more details.
|
1009
1155
|
|
1010
|
-
Script fields allow you to execute Elasticsearch's scripting languages such as groovy and javascript. More about supported languages and what scripting is [here](https://www.elastic.co/guide/en/elasticsearch/reference/0.90/modules-scripting.html). This feature allows you to calculate the distance between geo points, for example. This is how to use the DSL:
|
1011
1156
|
|
1012
|
-
```
|
1013
|
-
|
1014
|
-
|
1015
|
-
|
1016
|
-
|
1017
|
-
lon: -122.351591
|
1018
|
-
},
|
1019
|
-
script: "doc['coordinates'].distanceInMiles(lat, lon)"
|
1020
|
-
}
|
1021
|
-
)
|
1157
|
+
```bash
|
1158
|
+
rake chewy:upgrade # upgrades all the existing indices
|
1159
|
+
rake chewy:upgrade[users] # upgrades UsersIndex only
|
1160
|
+
rake chewy:upgrade[users,cities] # upgrades UsersIndex and CitiesIndex
|
1161
|
+
rake chewy:upgrade[-users,cities] # upgrades every index in the application except specified ones
|
1022
1162
|
```
|
1023
|
-
Here, `coordinates` is a field with type `geo_point`. There will be a `distance` field for the index's model in the search result.
|
1024
1163
|
|
1025
|
-
|
1164
|
+
#### `chewy:update`
|
1026
1165
|
|
1027
|
-
|
1166
|
+
It doesn't create indexes, it simply imports everything to the existing ones and fails if the index was not created before.
|
1028
1167
|
|
1029
|
-
```
|
1030
|
-
|
1168
|
+
```bash
|
1169
|
+
rake chewy:update # updates all the existing indices
|
1170
|
+
rake chewy:update[users] # updates UsersIndex only
|
1171
|
+
rake chewy:update[users,cities] # updates UsersIndex and CitiesIndex
|
1172
|
+
rake chewy:update[-users,cities] # updates every index in the application except UsersIndex and CitiesIndex
|
1031
1173
|
```
|
1032
1174
|
|
1033
|
-
|
1175
|
+
#### `chewy:sync`
|
1034
1176
|
|
1035
|
-
|
1177
|
+
Provides a way to synchronize outdated indexes with the source quickly and without doing a full reset. By default field `updated_at` is used to find outdated records, but this could be customized by `outdated_sync_field` as described at [Chewy::Index::Syncer](lib/chewy/index/syncer.rb).
|
1036
1178
|
|
1037
|
-
|
1038
|
-
|
1179
|
+
Arguments are similar to the ones taken by `chewy:update` task.
|
1180
|
+
|
1181
|
+
See [Chewy::Index::Syncer](lib/chewy/index/syncer.rb) for more details.
|
1182
|
+
|
1183
|
+
```bash
|
1184
|
+
rake chewy:sync # synchronizes all the existing indices
|
1185
|
+
rake chewy:sync[users] # synchronizes UsersIndex only
|
1186
|
+
rake chewy:sync[users,cities] # synchronizes UsersIndex and CitiesIndex
|
1187
|
+
rake chewy:sync[-users,cities] # synchronizes every index in the application except UsersIndex and CitiesIndex
|
1039
1188
|
```
|
1040
1189
|
|
1041
|
-
|
1190
|
+
#### `chewy:deploy`
|
1042
1191
|
|
1043
|
-
It is
|
1192
|
+
This rake task is especially useful during the production deploy. It is a combination of `chewy:upgrade` and `chewy:sync` and the latter is called only for the indexes that were not reset during the first stage.
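The two-stage flow described here can be sketched as follows. This is only an illustration of the ordering, assuming the upgrade step reports which indexes it reset; `deploy` and the two lambdas are hypothetical stand-ins, not Chewy's implementation.

```ruby
# Hypothetical sketch; the lambdas stand in for the real upgrade and sync steps.
def deploy(indexes, upgrade:, sync:)
  # Stage 1: `upgrade` returns true for every index it actually reset.
  reset = indexes.select { |index| upgrade.call(index) }
  # Stage 2: sync only the indexes that were not reset in stage 1.
  synced = (indexes - reset).each { |index| sync.call(index) }
  { reset: reset, synced: synced }
end

changed = %w[users] # pretend only this specification changed
result = deploy(
  %w[users cities],
  upgrade: ->(index) { changed.include?(index) },
  sync:    ->(index) { index }
)
p result[:reset]  # => ["users"]
p result[:synced] # => ["cities"]
```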
|
1044
1193
|
|
1045
|
-
|
1046
|
-
scope = UsersIndex.filter(range: {rating: {gte: 100}})
|
1194
|
+
It is not possible to specify any particular indexes for this task as it doesn't make much sense.
|
1047
1195
|
|
1048
|
-
|
1049
|
-
scope.load.query(...) # => since objects are loaded lazily you can complete scope
|
1050
|
-
scope.load(user: { scope: ->{ includes(:country) }}) # you can also pass loading scopes for each
|
1051
|
-
# possibly returned type
|
1052
|
-
scope.load(user: { scope: User.includes(:country) }) # the second scope passing way.
|
1053
|
-
scope.load(scope: ->{ includes(:country) }) # and more common scope applied to every loaded object type.
|
1196
|
+
Right now the approach is this: if some data has been updated but the index definition has not changed (no changes satisfying the synchronization algorithm were made), it is much faster to perform a manual partial index update inside data migrations or even manually after the deploy.
|
1054
1197
|
|
1055
|
-
|
1056
|
-
```
|
1198
|
+
Also, there is always full reset alternative with `rake chewy:reset`.
|
1057
1199
|
|
1058
|
-
|
1200
|
+
#### `chewy:create_missing_indexes`
|
1059
1201
|
|
1060
|
-
|
1061
|
-
UsersIndex.filter(range: {rating: {gte: 100}}).preload(...).query(...).map(&:_object)
|
1062
|
-
```
|
1202
|
+
This rake task creates newly defined indexes in ElasticSearch and skips existing ones. Useful for production-like environments.
|
1063
1203
|
|
1064
|
-
|
1204
|
+
#### Parallelizing rake tasks
|
1065
1205
|
|
1066
|
-
|
1206
|
+
Every task described above has its own parallel version. Every parallel rake task takes the number of processes to use as the first argument, and the rest of the arguments are exactly the same as for the non-parallel version of the task.
|
1067
1207
|
|
1068
|
-
|
1208
|
+
[https://github.com/grosser/parallel](https://github.com/grosser/parallel) gem is required to use these tasks.
|
1069
1209
|
|
1070
|
-
|
1210
|
+
If the number of processes is not specified explicitly, the `parallel` gem tries to derive the number of processes to use automatically.
|
1071
1211
|
|
1072
|
-
|
1073
|
-
|
1212
|
+
```bash
|
1213
|
+
rake chewy:parallel:reset
|
1214
|
+
rake chewy:parallel:upgrade[4]
|
1215
|
+
rake chewy:parallel:update[4,cities]
|
1216
|
+
rake chewy:parallel:sync[4,-users]
|
1217
|
+
rake chewy:parallel:deploy[4] # performs parallel upgrade and parallel sync afterwards
|
1218
|
+
```
|
1074
1219
|
|
1075
|
-
#### `
|
1220
|
+
#### `chewy:journal`
|
1076
1221
|
|
1077
|
-
|
1078
|
-
* `payload[:import]`: imports stats, total imported and deleted objects count:
|
1222
|
+
This namespace contains two tasks for journal manipulation: `chewy:journal:apply` and `chewy:journal:clean`. Both take a time as the first argument (optional for `clean`) and a list of indexes, exactly like the tasks above. The time can be in any format parsable by ActiveSupport.
|
1079
1223
|
|
1080
|
-
|
1081
|
-
|
1082
|
-
|
1224
|
+
```bash
|
1225
|
+
rake chewy:journal:apply["$(date -v-1H -u +%FT%TZ)"] # apply journaled changes for the past hour
|
1226
|
+
rake chewy:journal:apply["$(date -v-1H -u +%FT%TZ)",users] # apply journaled changes for the past hour on UsersIndex only
|
1227
|
+
```
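Note that `date -v-1H` in the commands above is BSD/macOS `date` syntax; GNU `date` does not accept `-v`. One portable option is to compute the timestamp in Ruby instead. A minimal sketch, where `journal_time` is a hypothetical helper name chosen for this example:

```ruby
# Hypothetical helper: a UTC ISO 8601 timestamp `hours` hours in the past,
# in the same shape that `date -u +%FT%TZ` prints.
def journal_time(hours, now: Time.now)
  (now.getutc - hours * 3600).strftime('%FT%TZ')
end

puts journal_time(1) # one hour before the current time
puts journal_time(1, now: Time.utc(2021, 1, 1, 12, 0, 0)) # prints 2021-01-01T11:00:00Z
```

The printed value can then be passed as the first argument of `chewy:journal:apply`.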
|
1083
1228
|
|
1084
|
-
|
1229
|
+
When the size of the journal becomes very large, the classical way of deletion would be obstructive and resource consuming. Fortunately, Chewy internally uses [delete-by-query](https://www.elastic.co/guide/en/elasticsearch/reference/7.17/docs-delete-by-query.html#docs-delete-by-query-task-api) ES function which supports async execution with batching and [throttling](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-delete-by-query.html#docs-delete-by-query-throttle).
|
1085
1230
|
|
1086
|
-
|
1087
|
-
|
1088
|
-
|
1089
|
-
|
1090
|
-
}, delete: {
|
1091
|
-
'delete error text' => ['10', '12']
|
1092
|
-
}}
|
1093
|
-
```
|
1231
|
+
The available options, which can be set by ENV variables, are listed below:
|
1232
|
+
* `WAIT_FOR_COMPLETION` - a boolean flag that controls async execution. The task waits by default. When set to `false` (`0`, `f`, `false` or `off`, in any letter case, are all accepted as `false`), Elasticsearch performs some preflight checks, launches the request, and returns a task reference you can use to cancel the task or get its status.
|
1233
|
+
* `REQUESTS_PER_SECOND` - float. The throttle for this request in sub-requests per second. No throttling is enforced by default.
|
1234
|
+
* `SCROLL_SIZE` - integer. The number of documents to be deleted in single sub-request. The default batch size is 1000.
|
1094
1235
|
|
1095
|
-
|
1236
|
+
```bash
|
1237
|
+
rake chewy:journal:clean WAIT_FOR_COMPLETION=false REQUESTS_PER_SECOND=10 SCROLL_SIZE=5000
|
1238
|
+
```
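To make the option semantics above concrete, here is a minimal sketch of how such environment variables could be interpreted, using the defaults stated in the list. `journal_clean_options` and `FALSE_VALUES` are hypothetical names; this is not Chewy's actual parsing code.

```ruby
# Hypothetical sketch of reading the options described above from ENV-like
# input; not Chewy's actual implementation.
FALSE_VALUES = %w[0 f false off].freeze

def journal_clean_options(env)
  raw_wait = env['WAIT_FOR_COMPLETION']
  {
    # Waits by default; only the listed spellings (in any case) switch it off.
    wait_for_completion: raw_wait.nil? || !FALSE_VALUES.include?(raw_wait.strip.downcase),
    # nil means that no throttling is requested.
    requests_per_second: env['REQUESTS_PER_SECOND']&.to_f,
    # Default batch size of 1000 documents per sub-request.
    scroll_size: (env['SCROLL_SIZE'] || 1000).to_i
  }
end

options = journal_clean_options(
  'WAIT_FOR_COMPLETION' => 'OFF', 'REQUESTS_PER_SECOND' => '10', 'SCROLL_SIZE' => '5000'
)
p options[:wait_for_completion] # => false
p options[:requests_per_second] # => 10.0
p options[:scroll_size]         # => 5000
```

Passing `ENV.to_h` instead of a literal hash would read the real environment.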
|
1096
1239
|
|
1097
|
-
|
1240
|
+
### RSpec integration
|
1098
1241
|
|
1099
|
-
|
1100
|
-
ActiveSupport::Notifications.subscribe('import_objects.chewy') do |name, start, finish, id, payload|
|
1101
|
-
metric_name = "Database/ElasticSearch/import"
|
1102
|
-
duration = (finish - start).to_f
|
1103
|
-
logged = "#{payload[:type]} #{payload[:import].to_a.map{ |i| i.join(':') }.join(', ')}"
|
1104
|
-
|
1105
|
-
self.class.trace_execution_scoped([metric_name]) do
|
1106
|
-
NewRelic::Agent.instance.transaction_sampler.notice_sql(logged, nil, duration)
|
1107
|
-
NewRelic::Agent.instance.sql_sampler.notice_sql(logged, metric_name, nil, duration)
|
1108
|
-
NewRelic::Agent.record_metric(metric_name, duration)
|
1109
|
-
end
|
1110
|
-
end
|
1242
|
+
Just add `require 'chewy/rspec'` to your spec_helper.rb and you will get additional features:
|
1111
1243
|
|
1112
|
-
|
1113
|
-
|
1114
|
-
|
1115
|
-
|
1244
|
+
[update_index](lib/chewy/rspec/update_index.rb) helper
|
1245
|
+
`mock_elasticsearch_response` helper to mock elasticsearch response
|
1246
|
+
`mock_elasticsearch_response_sources` helper to mock elasticsearch response sources
|
1247
|
+
`build_query` matcher to compare request and expected query (returns `true`/`false`)
|
1116
1248
|
|
1117
|
-
|
1118
|
-
NewRelic::Agent.instance.transaction_sampler.notice_sql(logged, nil, duration)
|
1119
|
-
NewRelic::Agent.instance.sql_sampler.notice_sql(logged, metric_name, nil, duration)
|
1120
|
-
NewRelic::Agent.record_metric(metric_name, duration)
|
1121
|
-
end
|
1122
|
-
end
|
1123
|
-
```
|
1249
|
+
To use `mock_elasticsearch_response` and `mock_elasticsearch_response_sources` helpers add `include Chewy::Rspec::Helpers` to your tests.
|
1124
1250
|
|
1125
|
-
|
1251
|
+
See [chewy/rspec/](lib/chewy/rspec/) for more details.
|
1126
1252
|
|
1127
|
-
|
1253
|
+
### Minitest integration
|
1128
1254
|
|
1129
|
-
|
1130
|
-
rake chewy:reset # resets all the existing indices, declared in app/chewy
|
1131
|
-
rake chewy:reset[users] # resets UsersIndex only
|
1255
|
+
Add `require 'chewy/minitest'` to your test_helper.rb, and then `include Chewy::Minitest::Helpers` in the tests where you would like indexing test hooks.
|
1132
1256
|
|
1133
|
-
|
1134
|
-
|
1135
|
-
|
1257
|
+
You can set the `:bypass` strategy for test suites, handle imports for the index manually, and flush test indices manually using `Chewy.massacre`. This helps reduce unnecessary ES requests.
|
1258
|
+
|
1259
|
+
If you need Chewy to index and update models regularly in your test suite, you can specify the `:urgent` strategy for document indexing. Add `Chewy.strategy(:urgent)` to test_helper.rb.
|
1136
1260
|
|
1137
|
-
|
1261
|
+
Also, you can use additional helpers:
|
1138
1262
|
|
1263
|
+
`mock_elasticsearch_response` to mock elasticsearch response
|
1264
|
+
`mock_elasticsearch_response_sources` to mock elasticsearch response sources
|
1265
|
+
`assert_elasticsearch_query` to compare request and expected query (returns `true`/`false`)
|
1139
1266
|
|
1140
|
-
|
1267
|
+
See [chewy/minitest/](lib/chewy/minitest/) for more details.
|
1141
1268
|
|
1142
|
-
|
1269
|
+
### DatabaseCleaner
|
1143
1270
|
|
1144
|
-
If you use `DatabaseCleaner` in your tests with [the `transaction` strategy](https://github.com/DatabaseCleaner/database_cleaner#how-to-use), you may run into the problem that `ActiveRecord`'s models are not indexed automatically on save despite the fact that you set the callbacks to do this with the `update_index` method. The issue arises because `chewy`
|
1271
|
+
If you use `DatabaseCleaner` in your tests with [the `transaction` strategy](https://github.com/DatabaseCleaner/database_cleaner#how-to-use), you may run into the problem that `ActiveRecord` models are not indexed automatically on save, even though you set the callbacks to do this with the `update_index` method. The issue arises because `chewy` indexes data on `after_commit` by default, but `after_commit` callbacks are not run with `DatabaseCleaner`'s `transaction` strategy. You can solve this issue by changing the `Chewy.use_after_commit_callbacks` option. Just add the following initializer to your Rails application:
|
1145
1272
|
|
1146
1273
|
```ruby
|
1147
1274
|
#config/initializers/chewy.rb
|
1148
1275
|
Chewy.use_after_commit_callbacks = !Rails.env.test?
|
1149
1276
|
```
|
1150
1277
|
|
1151
|
-
## TODO a.k.a coming soon:
|
1152
|
-
|
1153
|
-
* Typecasting support
|
1154
|
-
* Advanced (simplified) query DSL: `UsersIndex.query { email == 'my@gmail.com' }` will produce term query
|
1155
|
-
* update_all support
|
1156
|
-
* Maybe, closer ORM/ODM integration, creating index classes implicitly
|
1157
|
-
|
1158
1278
|
## Contributing
|
1159
1279
|
|
1160
1280
|
1. Fork it (http://github.com/toptal/chewy/fork)
|
@@ -1164,9 +1284,14 @@ Chewy.use_after_commit_callbacks = !Rails.env.test?
|
|
1164
1284
|
5. Push to the branch (`git push origin my-new-feature`)
|
1165
1285
|
6. Create new Pull Request
|
1166
1286
|
|
1167
|
-
Use the following Rake tasks to control the Elasticsearch cluster while developing
|
1287
|
+
Use the following Rake tasks to control the Elasticsearch cluster while developing, if you prefer native Elasticsearch installation over the dockerized one:
|
1168
1288
|
|
1169
1289
|
```bash
|
1170
1290
|
rake elasticsearch:start # start Elasticsearch cluster on 9250 port for tests
|
1171
1291
|
rake elasticsearch:stop # stop Elasticsearch
|
1172
1292
|
```
|
1293
|
+
|
1294
|
+
## Copyright
|
1295
|
+
|
1296
|
+
Copyright (c) 2013-2021 Toptal, LLC. See [LICENSE.txt](LICENSE.txt) for
|
1297
|
+
further details.
|