chewy 0.10.1 → 7.3.2
- checksums.yaml +5 -5
- data/.github/CODEOWNERS +1 -0
- data/.github/ISSUE_TEMPLATE/bug_report.md +39 -0
- data/.github/ISSUE_TEMPLATE/feature_request.md +20 -0
- data/.github/PULL_REQUEST_TEMPLATE.md +16 -0
- data/.github/workflows/ruby.yml +74 -0
- data/.rubocop.yml +28 -23
- data/.rubocop_todo.yml +110 -22
- data/CHANGELOG.md +480 -298
- data/CODE_OF_CONDUCT.md +14 -0
- data/CONTRIBUTING.md +63 -0
- data/Gemfile +3 -5
- data/Guardfile +3 -1
- data/LICENSE.txt +1 -1
- data/README.md +571 -333
- data/chewy.gemspec +12 -15
- data/gemfiles/rails.5.2.activerecord.gemfile +11 -0
- data/gemfiles/rails.6.0.activerecord.gemfile +11 -0
- data/gemfiles/rails.6.1.activerecord.gemfile +13 -0
- data/gemfiles/rails.7.0.activerecord.gemfile +13 -0
- data/lib/chewy/config.rb +48 -77
- data/lib/chewy/errors.rb +4 -10
- data/lib/chewy/fields/base.rb +88 -16
- data/lib/chewy/fields/root.rb +15 -21
- data/lib/chewy/index/actions.rb +67 -38
- data/lib/chewy/{type → index}/adapter/active_record.rb +18 -4
- data/lib/chewy/{type → index}/adapter/base.rb +11 -12
- data/lib/chewy/{type → index}/adapter/object.rb +28 -32
- data/lib/chewy/{type → index}/adapter/orm.rb +26 -24
- data/lib/chewy/index/aliases.rb +14 -5
- data/lib/chewy/index/crutch.rb +40 -0
- data/lib/chewy/index/import/bulk_builder.rb +311 -0
- data/lib/chewy/{type → index}/import/bulk_request.rb +10 -9
- data/lib/chewy/{type → index}/import/journal_builder.rb +11 -12
- data/lib/chewy/{type → index}/import/routine.rb +19 -18
- data/lib/chewy/{type → index}/import.rb +82 -36
- data/lib/chewy/{type → index}/mapping.rb +63 -62
- data/lib/chewy/index/observe/active_record_methods.rb +87 -0
- data/lib/chewy/index/observe/callback.rb +34 -0
- data/lib/chewy/index/observe.rb +17 -0
- data/lib/chewy/index/settings.rb +2 -0
- data/lib/chewy/index/specification.rb +13 -10
- data/lib/chewy/{type → index}/syncer.rb +62 -63
- data/lib/chewy/{type → index}/witchcraft.rb +15 -9
- data/lib/chewy/{type → index}/wrapper.rb +16 -6
- data/lib/chewy/index.rb +68 -93
- data/lib/chewy/journal.rb +25 -14
- data/lib/chewy/minitest/helpers.rb +91 -18
- data/lib/chewy/minitest/search_index_receiver.rb +29 -33
- data/lib/chewy/multi_search.rb +62 -0
- data/lib/chewy/railtie.rb +8 -24
- data/lib/chewy/rake_helper.rb +141 -112
- data/lib/chewy/rspec/build_query.rb +12 -0
- data/lib/chewy/rspec/helpers.rb +55 -0
- data/lib/chewy/rspec/update_index.rb +58 -49
- data/lib/chewy/rspec.rb +2 -0
- data/lib/chewy/runtime.rb +1 -1
- data/lib/chewy/search/loader.rb +19 -41
- data/lib/chewy/search/parameters/allow_partial_search_results.rb +27 -0
- data/lib/chewy/search/parameters/collapse.rb +16 -0
- data/lib/chewy/search/parameters/concerns/query_storage.rb +6 -5
- data/lib/chewy/search/parameters/ignore_unavailable.rb +27 -0
- data/lib/chewy/search/parameters/indices.rb +78 -0
- data/lib/chewy/search/parameters/none.rb +1 -3
- data/lib/chewy/search/parameters/order.rb +6 -19
- data/lib/chewy/search/parameters/source.rb +5 -1
- data/lib/chewy/search/parameters/track_total_hits.rb +16 -0
- data/lib/chewy/search/parameters.rb +28 -8
- data/lib/chewy/search/query_proxy.rb +9 -2
- data/lib/chewy/search/request.rb +207 -157
- data/lib/chewy/search/response.rb +5 -5
- data/lib/chewy/search/scoping.rb +7 -8
- data/lib/chewy/search/scrolling.rb +14 -13
- data/lib/chewy/search.rb +7 -26
- data/lib/chewy/stash.rb +27 -29
- data/lib/chewy/strategy/active_job.rb +2 -2
- data/lib/chewy/strategy/atomic.rb +1 -1
- data/lib/chewy/strategy/atomic_no_refresh.rb +18 -0
- data/lib/chewy/strategy/base.rb +10 -0
- data/lib/chewy/strategy/delayed_sidekiq/scheduler.rb +148 -0
- data/lib/chewy/strategy/delayed_sidekiq/worker.rb +52 -0
- data/lib/chewy/strategy/delayed_sidekiq.rb +17 -0
- data/lib/chewy/strategy/lazy_sidekiq.rb +64 -0
- data/lib/chewy/strategy/sidekiq.rb +3 -2
- data/lib/chewy/strategy.rb +6 -19
- data/lib/chewy/version.rb +1 -1
- data/lib/chewy.rb +37 -80
- data/lib/generators/chewy/install_generator.rb +1 -1
- data/lib/tasks/chewy.rake +26 -32
- data/migration_guide.md +56 -0
- data/spec/chewy/config_spec.rb +27 -57
- data/spec/chewy/fields/base_spec.rb +457 -174
- data/spec/chewy/fields/root_spec.rb +24 -32
- data/spec/chewy/fields/time_fields_spec.rb +5 -5
- data/spec/chewy/index/actions_spec.rb +425 -60
- data/spec/chewy/{type → index}/adapter/active_record_spec.rb +110 -44
- data/spec/chewy/{type → index}/adapter/object_spec.rb +21 -6
- data/spec/chewy/index/aliases_spec.rb +3 -3
- data/spec/chewy/index/import/bulk_builder_spec.rb +494 -0
- data/spec/chewy/{type → index}/import/bulk_request_spec.rb +5 -12
- data/spec/chewy/{type → index}/import/journal_builder_spec.rb +22 -30
- data/spec/chewy/{type → index}/import/routine_spec.rb +19 -19
- data/spec/chewy/{type → index}/import_spec.rb +154 -95
- data/spec/chewy/index/mapping_spec.rb +135 -0
- data/spec/chewy/index/observe/active_record_methods_spec.rb +68 -0
- data/spec/chewy/index/observe/callback_spec.rb +139 -0
- data/spec/chewy/index/observe_spec.rb +143 -0
- data/spec/chewy/index/settings_spec.rb +3 -1
- data/spec/chewy/index/specification_spec.rb +32 -33
- data/spec/chewy/{type → index}/syncer_spec.rb +14 -19
- data/spec/chewy/{type → index}/witchcraft_spec.rb +34 -21
- data/spec/chewy/index/wrapper_spec.rb +100 -0
- data/spec/chewy/index_spec.rb +99 -114
- data/spec/chewy/journal_spec.rb +56 -101
- data/spec/chewy/minitest/helpers_spec.rb +122 -14
- data/spec/chewy/minitest/search_index_receiver_spec.rb +24 -26
- data/spec/chewy/multi_search_spec.rb +84 -0
- data/spec/chewy/rake_helper_spec.rb +325 -101
- data/spec/chewy/rspec/build_query_spec.rb +34 -0
- data/spec/chewy/rspec/helpers_spec.rb +61 -0
- data/spec/chewy/rspec/update_index_spec.rb +106 -102
- data/spec/chewy/runtime_spec.rb +2 -2
- data/spec/chewy/search/loader_spec.rb +19 -53
- data/spec/chewy/search/pagination/kaminari_examples.rb +3 -5
- data/spec/chewy/search/pagination/kaminari_spec.rb +1 -1
- data/spec/chewy/search/parameters/collapse_spec.rb +5 -0
- data/spec/chewy/search/parameters/ignore_unavailable_spec.rb +67 -0
- data/spec/chewy/search/parameters/indices_spec.rb +99 -0
- data/spec/chewy/search/parameters/none_spec.rb +1 -1
- data/spec/chewy/search/parameters/order_spec.rb +18 -11
- data/spec/chewy/search/parameters/query_storage_examples.rb +67 -21
- data/spec/chewy/search/parameters/search_after_spec.rb +4 -1
- data/spec/chewy/search/parameters/source_spec.rb +8 -2
- data/spec/chewy/search/parameters/track_total_hits_spec.rb +5 -0
- data/spec/chewy/search/parameters_spec.rb +39 -8
- data/spec/chewy/search/query_proxy_spec.rb +68 -17
- data/spec/chewy/search/request_spec.rb +360 -149
- data/spec/chewy/search/response_spec.rb +35 -25
- data/spec/chewy/search/scrolling_spec.rb +28 -26
- data/spec/chewy/search_spec.rb +73 -53
- data/spec/chewy/stash_spec.rb +16 -26
- data/spec/chewy/strategy/active_job_spec.rb +23 -10
- data/spec/chewy/strategy/atomic_no_refresh_spec.rb +60 -0
- data/spec/chewy/strategy/atomic_spec.rb +9 -10
- data/spec/chewy/strategy/delayed_sidekiq_spec.rb +190 -0
- data/spec/chewy/strategy/lazy_sidekiq_spec.rb +214 -0
- data/spec/chewy/strategy/sidekiq_spec.rb +14 -10
- data/spec/chewy/strategy_spec.rb +19 -15
- data/spec/chewy_spec.rb +17 -110
- data/spec/spec_helper.rb +7 -22
- data/spec/support/active_record.rb +43 -5
- metadata +123 -198
- data/.travis.yml +0 -53
- data/Appraisals +0 -79
- data/LEGACY_DSL.md +0 -497
- data/gemfiles/rails.4.0.activerecord.gemfile +0 -14
- data/gemfiles/rails.4.1.activerecord.gemfile +0 -14
- data/gemfiles/rails.4.2.activerecord.gemfile +0 -15
- data/gemfiles/rails.4.2.mongoid.5.1.gemfile +0 -15
- data/gemfiles/rails.5.0.activerecord.gemfile +0 -15
- data/gemfiles/rails.5.0.mongoid.6.0.gemfile +0 -15
- data/gemfiles/rails.5.1.activerecord.gemfile +0 -15
- data/gemfiles/rails.5.1.mongoid.6.1.gemfile +0 -15
- data/gemfiles/sequel.4.45.gemfile +0 -11
- data/lib/chewy/backports/deep_dup.rb +0 -46
- data/lib/chewy/backports/duplicable.rb +0 -91
- data/lib/chewy/query/compose.rb +0 -68
- data/lib/chewy/query/criteria.rb +0 -191
- data/lib/chewy/query/filters.rb +0 -227
- data/lib/chewy/query/loading.rb +0 -111
- data/lib/chewy/query/nodes/and.rb +0 -25
- data/lib/chewy/query/nodes/base.rb +0 -17
- data/lib/chewy/query/nodes/bool.rb +0 -34
- data/lib/chewy/query/nodes/equal.rb +0 -34
- data/lib/chewy/query/nodes/exists.rb +0 -20
- data/lib/chewy/query/nodes/expr.rb +0 -28
- data/lib/chewy/query/nodes/field.rb +0 -110
- data/lib/chewy/query/nodes/has_child.rb +0 -15
- data/lib/chewy/query/nodes/has_parent.rb +0 -15
- data/lib/chewy/query/nodes/has_relation.rb +0 -59
- data/lib/chewy/query/nodes/match_all.rb +0 -11
- data/lib/chewy/query/nodes/missing.rb +0 -20
- data/lib/chewy/query/nodes/not.rb +0 -25
- data/lib/chewy/query/nodes/or.rb +0 -25
- data/lib/chewy/query/nodes/prefix.rb +0 -19
- data/lib/chewy/query/nodes/query.rb +0 -20
- data/lib/chewy/query/nodes/range.rb +0 -63
- data/lib/chewy/query/nodes/raw.rb +0 -15
- data/lib/chewy/query/nodes/regexp.rb +0 -35
- data/lib/chewy/query/nodes/script.rb +0 -20
- data/lib/chewy/query/pagination.rb +0 -25
- data/lib/chewy/query.rb +0 -1098
- data/lib/chewy/search/pagination/will_paginate.rb +0 -43
- data/lib/chewy/search/parameters/types.rb +0 -20
- data/lib/chewy/strategy/resque.rb +0 -27
- data/lib/chewy/strategy/shoryuken.rb +0 -40
- data/lib/chewy/type/actions.rb +0 -43
- data/lib/chewy/type/adapter/mongoid.rb +0 -69
- data/lib/chewy/type/adapter/sequel.rb +0 -95
- data/lib/chewy/type/crutch.rb +0 -32
- data/lib/chewy/type/import/bulk_builder.rb +0 -122
- data/lib/chewy/type/observe.rb +0 -78
- data/lib/chewy/type.rb +0 -117
- data/lib/sequel/plugins/chewy_observe.rb +0 -78
- data/spec/chewy/query/criteria_spec.rb +0 -700
- data/spec/chewy/query/filters_spec.rb +0 -201
- data/spec/chewy/query/loading_spec.rb +0 -124
- data/spec/chewy/query/nodes/and_spec.rb +0 -12
- data/spec/chewy/query/nodes/bool_spec.rb +0 -14
- data/spec/chewy/query/nodes/equal_spec.rb +0 -32
- data/spec/chewy/query/nodes/exists_spec.rb +0 -18
- data/spec/chewy/query/nodes/has_child_spec.rb +0 -59
- data/spec/chewy/query/nodes/has_parent_spec.rb +0 -59
- data/spec/chewy/query/nodes/match_all_spec.rb +0 -11
- data/spec/chewy/query/nodes/missing_spec.rb +0 -16
- data/spec/chewy/query/nodes/not_spec.rb +0 -13
- data/spec/chewy/query/nodes/or_spec.rb +0 -12
- data/spec/chewy/query/nodes/prefix_spec.rb +0 -16
- data/spec/chewy/query/nodes/query_spec.rb +0 -12
- data/spec/chewy/query/nodes/range_spec.rb +0 -32
- data/spec/chewy/query/nodes/raw_spec.rb +0 -11
- data/spec/chewy/query/nodes/regexp_spec.rb +0 -43
- data/spec/chewy/query/nodes/script_spec.rb +0 -15
- data/spec/chewy/query/pagination/kaminari_spec.rb +0 -5
- data/spec/chewy/query/pagination/will_paginate_spec.rb +0 -5
- data/spec/chewy/query/pagination_spec.rb +0 -39
- data/spec/chewy/query_spec.rb +0 -636
- data/spec/chewy/search/pagination/will_paginate_examples.rb +0 -63
- data/spec/chewy/search/pagination/will_paginate_spec.rb +0 -23
- data/spec/chewy/search/parameters/indices_boost_spec.rb +0 -83
- data/spec/chewy/search/parameters/types_spec.rb +0 -5
- data/spec/chewy/strategy/resque_spec.rb +0 -46
- data/spec/chewy/strategy/shoryuken_spec.rb +0 -64
- data/spec/chewy/type/actions_spec.rb +0 -50
- data/spec/chewy/type/adapter/mongoid_spec.rb +0 -372
- data/spec/chewy/type/adapter/sequel_spec.rb +0 -472
- data/spec/chewy/type/import/bulk_builder_spec.rb +0 -279
- data/spec/chewy/type/mapping_spec.rb +0 -142
- data/spec/chewy/type/observe_spec.rb +0 -137
- data/spec/chewy/type/wrapper_spec.rb +0 -98
- data/spec/chewy/type_spec.rb +0 -55
- data/spec/support/mongoid.rb +0 -93
- data/spec/support/sequel.rb +0 -80
data/README.md
CHANGED
@@ -1,66 +1,15 @@
 [![Gem Version](https://badge.fury.io/rb/chewy.svg)](http://badge.fury.io/rb/chewy)
-[![
+[![GitHub Actions](https://github.com/toptal/chewy/actions/workflows/ruby.yml/badge.svg)](https://github.com/toptal/chewy/actions/workflows/ruby.yml)
 [![Code Climate](https://codeclimate.com/github/toptal/chewy.svg)](https://codeclimate.com/github/toptal/chewy)
 [![Inline docs](http://inch-ci.org/github/toptal/chewy.svg?branch=master)](http://inch-ci.org/github/toptal/chewy)
 
-<p align="right">Sponsored by</p>
-<p align="right"><a href="https://www.toptal.com/"><img src="https://www.toptal.com/assets/public/blocks/logo/big.png" alt="Toptal" width="105" height="34"></a></p>
-
 # Chewy
 
-Chewy is an ODM
-
-## Table of Contents
-
-* [Why Chewy?](#why-chewy)
-* [Installation](#installation)
-* [Usage](#usage)
-* [Client settings](#client-settings)
-* [AWS ElasticSearch configuration](#aws-elastic-search)
-* [Index definition](#index-definition)
-* [Type default import options](#type-default-import-options)
-* [Multi (nested) and object field types](#multi-nested-and-object-field-types)
-* [Parent and children types](#parent-and-children-types)
-* [Geo Point fields](#geo-point-fields)
-* [Crutches™ technology](#crutches-technology)
-* [Witchcraft™ technology](#witchcraft-technology)
-* [Raw Import](#raw-import)
-* [Index creation during import](#index-creation-during-import)
-* [Journaling](#journaling)
-* [Types access](#types-access)
-* [Index manipulation](#index-manipulation)
-* [Index update strategies](#index-update-strategies)
-* [Nesting](#nesting)
-* [Non-block notation](#non-block-notation)
-* [Designing your own strategies](#designing-your-own-strategies)
-* [Rails application strategies integration](#rails-application-strategies-integration)
-* [ActiveSupport::Notifications support](#activesupport-notifications-support)
-* [NewRelic integration](#newrelic-integration)
-* [Search requests](#search-requests)
-* [Composing requests](#composing-requests)
-* [Pagination](#pagination)
-* [Named scopes](#named-scopes)
-* [Scroll API](#scroll-api)
-* [Loading objects](#loading-objects)
-* [Legacy DSL incompatibilities](#legacy-dsl-incompatibilities)
-* [Rake tasks](#rake-tasks)
-* [chewy:reset](#chewyreset)
-* [chewy:upgrade](#chewyupgrade)
-* [chewy:update](#chewyupdate)
-* [chewy:sync](#chewysync)
-* [chewy:deploy](#chewydeploy)
-* [Parallelizing rake tasks](#parallelizing-rake-tasks)
-* [chewy:journal](#chewyjournal)
-* [Rspec integration](#rspec-integration)
-* [Minitest integration](#minitest-integration)
-* [TODO a.k.a coming soon](#todo-aka-coming-soon)
-* [Contributing](#contributing)
+Chewy is an ODM (Object Document Mapper), built on top of [the official Elasticsearch client](https://github.com/elastic/elasticsearch-ruby).
 
 ## Why Chewy?
 
-
-
-Index classes are independent from ORM/ODM models. Now, implementing e.g. cross-model autocomplete is much easier. You can just define the index and work with it in an object-oriented style. You can define several types for index - one per indexed model.
+In this section we'll cover why you might want to use Chewy instead of the official `elasticsearch-ruby` client gem.
 
 * Every index is observable by all the related models.
 
@@ -74,12 +23,11 @@ Chewy is an ODM and wrapper for [the official Elasticsearch client](https://gith
 
 Chewy has an ActiveRecord-style query DSL. It is chainable, mergeable and lazy, so you can produce queries in the most efficient way. It also has object-oriented query and filter builders.
 
-* Support for ActiveRecord
-
+* Support for ActiveRecord.
 
 ## Installation
 
-Add this line to your application's Gemfile
+Add this line to your application's `Gemfile`:
 
     gem 'chewy'
 
@@ -91,19 +39,181 @@ Or install it yourself as:
 
     $ gem install chewy
 
-##
+## Compatibility
 
-###
+### Ruby
+
+Chewy is compatible with MRI 2.6-3.0¹.
+
+> ¹ Ruby 3 is only supported with Rails 6.1
+
+### Elasticsearch compatibility matrix
+
+| Chewy version | Elasticsearch version |
+| ------------- | ---------------------------------- |
+| 7.2.x | 7.x |
+| 7.1.x | 7.x |
+| 7.0.x | 6.8, 7.x |
+| 6.0.0 | 5.x, 6.x |
+| 5.x | 5.x, limited support for 1.x & 2.x |
+
+**Important:** Chewy doesn't follow SemVer, so you should always
+check the release notes before upgrading. The major version is linked to the
+newest supported Elasticsearch and the minor version bumps may include breaking changes.
+
+See our [migration guide](migration_guide.md) for detailed upgrade instructions between
+various Chewy versions.
+
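Review note on the SemVer warning added above: since minor bumps may include breaking changes, one option is a patch-level pessimistic constraint. The sketch below uses RubyGems' `Gem::Requirement` to show which versions such a constraint admits. The constraint `~> 7.2.0`, the constant name and the `allowed?` helper are illustrative assumptions, not a recommendation taken from the README.

```ruby
require 'rubygems'

# Illustrative constraint only: "~> 7.2.0" means >= 7.2.0 and < 7.3,
# so patch releases are admitted and the next minor line is not.
CHEWY_CONSTRAINT = Gem::Requirement.new('~> 7.2.0')

# Hypothetical helper: does the constraint admit this version string?
def allowed?(version)
  CHEWY_CONSTRAINT.satisfied_by?(Gem::Version.new(version))
end

puts allowed?('7.2.7') # a patch release inside the pinned line
puts allowed?('7.3.0') # a minor bump, which may carry breaking changes
```

In a `Gemfile` the same constraint would be written as `gem 'chewy', '~> 7.2.0'`.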
+### Active Record
+
+5.2, 6.0, 6.1 Active Record versions are supported by all Chewy versions.
+
+## Getting Started
+
+Chewy provides functionality for Elasticsearch index handling, documents import mappings, index update strategies and chainable query DSL.
+
+### Minimal client setting
+
+Create `config/initializers/chewy.rb` with this line:
+
+```ruby
+Chewy.settings = {host: 'localhost:9250'}
+```
+
+And run `rails g chewy:install` to generate `chewy.yml`:
+
+```yaml
+# config/chewy.yml
+# separate environment configs
+test:
+  host: 'localhost:9250'
+  prefix: 'test'
+development:
+  host: 'localhost:9200'
+```
+
+### Elasticsearch
 
-
+Make sure you have Elasticsearch up and running. You can [install](https://www.elastic.co/guide/en/elasticsearch/reference/current/install-elasticsearch.html) it locally, but the easiest way is to use [Docker](https://www.docker.com/get-started):
 
-
+```shell
+$ docker run --rm --name elasticsearch -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" elasticsearch:7.11.1
+```
+
+### Index
+
+Create `app/chewy/users_index.rb` with User Index:
+
+```ruby
+class UsersIndex < Chewy::Index
+  settings analysis: {
+    analyzer: {
+      email: {
+        tokenizer: 'keyword',
+        filter: ['lowercase']
+      }
+    }
+  }
+
+  index_scope User
+  field :first_name
+  field :last_name
+  field :email, analyzer: 'email'
+end
+```
+
+### Model
+
+Add User model, table and migrate it:
+
+```shell
+$ bundle exec rails g model User first_name last_name email
+$ bundle exec rails db:migrate
+```
+
+Add `update_index` to app/models/user.rb:
+
+```ruby
+class User < ApplicationRecord
+  update_index('users') { self }
+end
+```
+
+### Example of data request
+
+1. Once a record is created (could be done via the Rails console), it creates User index too:
+
+```
+User.create(
+  first_name: "test1",
+  last_name: "test1",
+  email: 'test1@example.com',
+  # other fields
+)
+# UsersIndex Import (355.3ms) {:index=>1}
+# => #<User id: 1, first_name: "test1", last_name: "test1", email: "test1@example.com", # other fields>
+```
+
+2. A query could be exposed at a given `UsersController`:
+
+```ruby
+def search
+  @users = UsersIndex.query(query_string: { fields: [:first_name, :last_name, :email, ...], query: search_params[:query], default_operator: 'and' })
+  render json: @users.to_json, status: :ok
+end
+
+private
+
+def search_params
+  params.permit(:query, :page, :per)
+end
+```
+
+3. So a request against `http://localhost:3000/users/search?query=test1@example.com` issuing a response like:
+
+```json
+[
+  {
+    "attributes":{
+      "id":"1",
+      "first_name":"test1",
+      "last_name":"test1",
+      "email":"test1@example.com",
+      ...
+      "_score":0.9808291,
+      "_explanation":null
+    },
+    "_data":{
+      "_index":"users",
+      "_type":"_doc",
+      "_id":"1",
+      "_score":0.9808291,
+      "_source":{
+        "first_name":"test1",
+        "last_name":"test1",
+        "email":"test1@example.com",
+        ...
+      }
+    }
+  }
+]
+```
+
+## Usage and configuration
+
+### Client settings
+
+To configure the Chewy client you need to add `chewy.rb` file with `Chewy.settings` hash:
 
 ```ruby
 # config/initializers/chewy.rb
 Chewy.settings = {host: 'localhost:9250'} # do not use environments
 ```
 
+And add `chewy.yml` configuration file.
+
+You can create `chewy.yml` manually or run `rails g chewy:install` to generate it:
+
 ```yaml
 # config/chewy.yml
 # separate environment configs
@@ -129,27 +239,31 @@ Chewy.logger = Logger.new(STDOUT)
 
 See [config.rb](lib/chewy/config.rb) for more details.
 
-####
-
+#### AWS Elasticsearch
+
+If you would like to use AWS's Elasticsearch using an IAM user policy, you will need to sign your requests for the `es:*` action by injecting the appropriate headers passing a proc to `transport_options`.
+You'll need an additional gem for Faraday middleware: add `gem 'faraday_middleware-aws-sigv4'` to your Gemfile.
 
 ```ruby
-
-
-
-
-
-
-
-
-
-
-
-
-
+require 'faraday_middleware/aws_sigv4'
+
+Chewy.settings = {
+  host: 'http://my-es-instance-on-aws.us-east-1.es.amazonaws.com:80',
+  port: 80, # 443 for https host
+  transport_options: {
+    headers: { content_type: 'application/json' },
+    proc: -> (f) do
+      f.request :aws_sigv4,
+                service: 'es',
+                region: 'us-east-1',
+                access_key_id: ENV['AWS_ACCESS_KEY'],
+                secret_access_key: ENV['AWS_SECRET_ACCESS_KEY']
+    end
   }
-
+}
+```
 
-
+#### Index definition
 
 1. Create `/app/chewy/users_index.rb`
 
@@ -159,41 +273,38 @@ If you would like to use AWS's ElasticSearch using an IAM user policy, you will
 end
 ```
 
-2.
+2. Define index scope (you can omit this part if you don't need to specify a scope (i.e. use PORO objects for import) or options)
 
 ```ruby
 class UsersIndex < Chewy::Index
-
+  index_scope User.active # or just model instead_of scope: index_scope User
 end
 ```
 
-
-
-3. Add some type mappings
+3. Add some mappings
 
 ```ruby
 class UsersIndex < Chewy::Index
-
-
-
-
-
-
-
-
-
-
-    end
-    field :rating, type: 'integer' # custom data type
-    field :created, type: 'date', include_in_all: false,
-      value: ->{ created_at } # value proc for source object context
+  index_scope User.active.includes(:country, :badges, :projects)
+  field :first_name, :last_name # multiple fields without additional options
+  field :email, analyzer: 'email' # Elasticsearch-related options
+  field :country, value: ->(user) { user.country.name } # custom value proc
+  field :badges, value: ->(user) { user.badges.map(&:name) } # passing array values to index
+  field :projects do # the same block syntax for multi_field, if `:type` is specified
+    field :title
+    field :description # default data type is `text`
+    # additional top-level objects passed to value proc:
+    field :categories, value: ->(project, user) { project.categories.map(&:name) if user.active? }
   end
+  field :rating, type: 'integer' # custom data type
+  field :created, type: 'date', include_in_all: false,
+    value: ->{ created_at } # value proc for source object context
 end
 ```
 
 [See here for mapping definitions](https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping.html).
 
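Review note on the mapping example above: it shows two value-proc styles, `->(user) { ... }`, which receives the source object as an argument, and `->{ created_at }`, which runs in the source object's context. A minimal plain-Ruby sketch of how such a dispatch can work with `Proc#arity` and `instance_exec` follows. The `User` struct and the `evaluate_value` helper are hypothetical, cover only the zero- and one-argument cases, and are not Chewy's actual implementation.

```ruby
# Hypothetical stand-in for a model; not part of Chewy.
User = Struct.new(:first_name, :created_at)

# Hypothetical helper: evaluate a zero-arity proc in the object's own
# context, otherwise call the proc with the object as its argument.
def evaluate_value(value_proc, object)
  if value_proc.arity.zero?
    object.instance_exec(&value_proc)
  else
    value_proc.call(object)
  end
end

user = User.new('Ann', '2024-01-01')
puts evaluate_value(->(u) { u.first_name.upcase }, user)
puts evaluate_value(-> { created_at }, user)
```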
-4. Add some index-
+4. Add some index-related settings. Analyzer repositories might be used as well. See `Chewy::Index.settings` docs for details:
 
 ```ruby
 class UsersIndex < Chewy::Index
@@ -206,23 +317,22 @@ If you would like to use AWS's ElasticSearch using an IAM user policy, you will
     }
   }
 
-
-
-
-
-
-
-
-
-
-
-
-      end
-      field :about_translations, type: 'object' # pass object type explicitly if necessary
-      field :rating, type: 'integer'
-      field :created, type: 'date', include_in_all: false,
-        value: ->{ created_at }
+  index_scope User.active.includes(:country, :badges, :projects)
+  root date_detection: false do
+    template 'about_translations.*', type: 'text', analyzer: 'standard'
+
+    field :first_name, :last_name
+    field :email, analyzer: 'email'
+    field :country, value: ->(user) { user.country.name }
+    field :badges, value: ->(user) { user.badges.map(&:name) }
+    field :projects do
+      field :title
+      field :description
     end
+    field :about_translations, type: 'object' # pass object type explicitly if necessary
+    field :rating, type: 'integer'
+    field :created, type: 'date', include_in_all: false,
+      value: ->{ created_at }
   end
 end
 ```
@@ -230,45 +340,38 @@ If you would like to use AWS's ElasticSearch using an IAM user policy, you will
 [See index settings here](https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-update-settings.html).
 [See root object settings here](https://www.elastic.co/guide/en/elasticsearch/reference/current/dynamic-field-mapping.html).
 
-See [mapping.rb](lib/chewy/
+See [mapping.rb](lib/chewy/index/mapping.rb) for more details.
 
 5. Add model-observing code
 
 ```ruby
 class User < ActiveRecord::Base
-  update_index('users
+  update_index('users') { self } # specifying index and back-reference
                                  # for updating after user save or destroy
 end
 
 class Country < ActiveRecord::Base
   has_many :users
 
-  update_index('users
+  update_index('users') { users } # return single object or collection
 end
 
 class Project < ActiveRecord::Base
-  update_index('users
-end
-
-class Badge < ActiveRecord::Base
-  has_and_belongs_to_many :users
-
-  update_index('users') { users } # if index has only one type
-  # there is no need to specify updated type
+  update_index('users') { user if user.active? } # you can return even `nil` from the back-reference
 end
 
 class Book < ActiveRecord::Base
-  update_index(->(book) {"
-
-
+  update_index(->(book) {"books_#{book.language}"}) { self } # dynamic index name with proc.
+  # For book with language == "en"
+  # this code will generate `books_en`
 end
 ```
 
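Review note on the `Book` example above: the first argument to `update_index` is an ordinary lambda that receives the record and returns the index name. The plain-Ruby sketch below exercises only that lambda, without Chewy or ActiveRecord. `Book` is a hypothetical `Struct` standing in for the model, and `INDEX_NAME` is an illustrative constant.

```ruby
# Hypothetical stand-in for the ActiveRecord model in the README.
Book = Struct.new(:language)

# Same lambda shape as the README's first argument to update_index.
INDEX_NAME = ->(book) { "books_#{book.language}" }

puts INDEX_NAME.call(Book.new('en'))
```

For a book whose language is `"en"` the lambda returns `books_en`, matching the comment in the README hunk.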
 Also, you can use the second argument for method name passing:
 
 ```ruby
-update_index('users
-update_index('users
+update_index('users', :self)
+update_index('users', :users)
 ```
 
 In the case of a belongs_to association you may need to update both associated objects, previous and current:
@@ -277,47 +380,28 @@ If you would like to use AWS's ElasticSearch using an IAM user policy, you will
 class City < ActiveRecord::Base
   belongs_to :country
 
-  update_index('cities
-  update_index 'countries
-    # For the latest active_record changed values are
-    # already in `previous_changes` hash,
-    # but for mongoid you have to use `changes` hash
+  update_index('cities') { self }
+  update_index 'countries' do
     previous_changes['country_id'] || country
   end
 end
 ```
 
-
-
-```ruby
-class User < Sequel::Model
-  update_index('users#user') { self }
-end
-```
+### Default import options
 
-
-
-```ruby
-Sequel::Model.plugin :chewy_observe # for all models, or...
-User.plugin :chewy_observe # just for User
-```
-
-### Type default import options
-
-Every type has `default_import_options` configuration to specify, suddenly, default import options:
+Every index has `default_import_options` configuration to specify, suddenly, default import options:
 
 ```ruby
 class ProductsIndex < Chewy::Index
-
-
+  index_scope Post.includes(:tags)
+  default_import_options batch_size: 100, bulk_size: 10.megabytes, refresh: false
 
-
-
-  end
+  field :name
+  field :tags, value: -> { tags.map(&:name) }
 end
 ```
 
-See [import.rb](lib/chewy/
+See [import.rb](lib/chewy/index/import.rb) for available options.
 
 ### Multi (nested) and object field types
 
@@ -335,26 +419,14 @@ This will automatically set the type or root field to `object`. You may also spe
 To define a multi field you have to specify any type except for `object` or `nested` in the root field:
 
 ```ruby
-field :full_name, type: '
+field :full_name, type: 'text', value: ->{ full_name.strip } do
   field :ordered, analyzer: 'ordered'
-  field :untouched,
+  field :untouched, type: 'keyword'
 end
 ```
 
 The `value:` option for internal fields will no longer be effective.
 
-### Parent and children types
-
-To define [parent](https://www.elastic.co/guide/en/elasticsearch/guide/current/parent-child-mapping.html) type for a given index_type, you can include root options for the type where you can specify parent_type and parent_id
-
-```ruby
-define_type User.includes(:account) do
-  root parent: 'account', parent_id: ->{ account_id } do
-    field :created_at, type: 'date'
-    field :task_id, type: 'integer'
-  end
-end
-```
 ### Geo Point fields
 
 You can use [Elasticsearch's geo mapping](https://www.elastic.co/guide/en/elasticsearch/reference/current/geo-point.html) with the `geo_point` field type, allowing you to query, filter and order by latitude and longitude. You can use the following hash format:
@@ -374,20 +446,36 @@ end
|
|
374
446
|
|
375
447
|
See the section on *Script fields* for details on calculating distance in a search.
|
376
448
|
|
449
|
+
### Join fields
|
450
|
+
|
451
|
+
You can use a [join field](https://www.elastic.co/guide/en/elasticsearch/reference/current/parent-join.html)
|
452
|
+
to implement parent-child relationships between documents.
|
453
|
+
It [replaces the old `parent_id` based parent-child mapping](https://www.elastic.co/guide/en/elasticsearch/reference/current/removal-of-types.html#parent-child-mapping-types).
|
454
|
+
|
455
|
+
To use it, you need to pass `relations` and `join` (with `type` and `id`) options:
|
456
|
+
```ruby
|
457
|
+
field :hierarchy_link, type: :join, relations: {question: %i[answer comment], answer: :vote, vote: :subvote}, join: {type: :comment_type, id: :commented_id}
|
458
|
+
```
|
459
|
+
assuming you have `comment_type` and `commented_id` fields in your model.
|
460
|
+
|
461
|
+
Note that when you reindex a parent, its children and grandchildren will be reindexed as well.
|
462
|
+
This may require additional queries to the primary database and to Elasticsearch.
|
463
|
+
|
464
|
+
Also note that the join field doesn't support crutches (it should be a field directly defined on the model).
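The reindexing cascade mentioned above (a parent pulls in its children and grandchildren) follows from the shape of the `relations` hash. The plain-Ruby sketch below walks that hash to list every relation name below a given parent. The `descendants` helper is hypothetical and only illustrates the hierarchy; it is not how Chewy itself resolves documents.

```ruby
# The same relations hash as in the field definition above.
RELATIONS = {question: %i[answer comment], answer: :vote, vote: :subvote}.freeze

# Hypothetical helper: collects all relation names below `name`,
# direct children first, then their descendants.
def descendants(relations, name)
  children = Array(relations[name]) # a single symbol or an array of symbols
  children + children.flat_map { |child| descendants(relations, child) }
end

descendants(RELATIONS, :question) # => [:answer, :comment, :vote, :subvote]
descendants(RELATIONS, :comment)  # => []
```

Under this reading, reindexing a `question` touches every level of the hierarchy, while reindexing a leaf such as `comment` touches nothing else.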
|
465
|
+
|
377
466
|
### Crutches™ technology
|
378
467
|
|
379
468
|
Assume you are defining your index like this (product has_many categories through product_categories):
|
380
469
|
|
381
470
|
```ruby
|
382
471
|
class ProductsIndex < Chewy::Index
|
383
|
-
|
384
|
-
|
385
|
-
|
386
|
-
end
|
472
|
+
index_scope Product.includes(:categories)
|
473
|
+
field :name
|
474
|
+
field :category_names, value: ->(product) { product.categories.map(&:name) } # or shorter just -> { categories.map(&:name) }
|
387
475
|
end
|
388
476
|
```
|
389
477
|
|
390
|
-
Then the Chewy reindexing flow will look like the following pseudo-code
|
478
|
+
Then the Chewy reindexing flow will look like the following pseudo-code:
|
391
479
|
|
392
480
|
```ruby
|
393
481
|
Product.includes(:categories).find_in_batches(1000) do |batch|
|
@@ -399,26 +487,23 @@ Product.includes(:categories).find_in_batches(1000) do |batch|
|
|
399
487
|
end
|
400
488
|
```
|
401
489
|
|
402
|
-
|
403
|
-
|
404
|
-
Then you can replace Rails associations with Chewy Crutches™ technology:
|
490
|
+
In complicated cases where associations are not applicable, you can replace Rails associations with Chewy Crutches™ technology:
|
405
491
|
|
406
492
|
```ruby
|
407
493
|
class ProductsIndex < Chewy::Index
|
408
|
-
|
409
|
-
|
410
|
-
|
411
|
-
|
412
|
-
|
413
|
-
|
414
|
-
|
415
|
-
|
416
|
-
end
|
417
|
-
|
418
|
-
field :name
|
419
|
-
# simply use crutch-fetched data as a value:
|
420
|
-
field :category_names, value: ->(product, crutches) { crutches.categories[product.id] }
|
494
|
+
index_scope Product
|
495
|
+
crutch :categories do |collection| # collection here is a current batch of products
|
496
|
+
# data is fetched with a lightweight query without objects initialization
|
497
|
+
data = ProductCategory.joins(:category).where(product_id: collection.map(&:id)).pluck(:product_id, 'categories.name')
|
498
|
+
# then we have to convert fetched data to appropriate format
|
499
|
+
# this will return our data in structure like:
|
500
|
+
# {123 => ['sweets', 'juices'], 456 => ['meat']}
|
501
|
+
data.each.with_object({}) { |(id, name), result| (result[id] ||= []).push(name) }
|
421
502
|
end
|
503
|
+
|
504
|
+
field :name
|
505
|
+
# simply use crutch-fetched data as a value:
|
506
|
+
field :category_names, value: ->(product, crutches) { crutches[:categories][product.id] }
|
422
507
|
end
|
423
508
|
```
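The grouping step inside the crutch block can be tried in isolation. The sketch below uses made-up rows in the shape that `pluck(:product_id, 'categories.name')` would return and groups them by product id with plain Ruby; the sample ids and names are illustrative only.

```ruby
# Sample rows: [product_id, category_name] pairs (made-up data).
rows = [[123, 'sweets'], [123, 'juices'], [456, 'meat']]

# Group category names under their product id.
categories_by_product = rows.each_with_object({}) do |(id, name), result|
  (result[id] ||= []).push(name)
end

categories_by_product[123] # => ['sweets', 'juices']
categories_by_product[789] # => nil, this product has no categories in the batch
```

Note that a product without categories yields `nil` rather than an empty array, so a field lambda may want `crutches[:categories][product.id] || []` if an empty list is preferred.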
|
424
509
|
|
@@ -440,22 +525,21 @@ So Chewy Crutches™ technology is able to increase your indexing performance in
|
|
440
525
|
|
441
526
|
### Witchcraft™ technology
|
442
527
|
|
443
|
-
One more experimental technology to increase import performance. As far as you know, chewy defines value proc for every imported field in mapping, so at the import time each of
|
528
|
+
One more experimental technology to increase import performance. As you may know, Chewy defines a value proc for every imported field in the mapping, so at import time each of these procs is executed on the imported object to build the resulting document. It would be better for performance to use one whole-document-returning proc instead. So the idea of Witchcraft™ technology is to compile a single document-returning proc from the index definition.
|
444
529
|
|
445
530
|
```ruby
|
446
|
-
|
447
|
-
|
448
|
-
|
449
|
-
|
450
|
-
|
451
|
-
|
452
|
-
|
453
|
-
|
454
|
-
end
|
531
|
+
index_scope Product
|
532
|
+
witchcraft!
|
533
|
+
|
534
|
+
field :title
|
535
|
+
field :tags, value: -> { tags.map(&:name) }
|
536
|
+
field :categories do
|
537
|
+
field :name, value: -> (product, category) { category.name }
|
538
|
+
field :type, value: -> (product, category, crutch) { crutch.types[category.name] }
|
455
539
|
end
|
456
540
|
```
|
457
541
|
|
458
|
-
The
|
542
|
+
The index definition above will be compiled to something close to:
|
459
543
|
|
460
544
|
```ruby
|
461
545
|
-> (object, crutches) do
|
@@ -485,7 +569,7 @@ Obviously not every type of definition might be compiled. There are some restric
|
|
485
569
|
end
|
486
570
|
```
|
487
571
|
|
488
|
-
However, it is quite possible that your
|
572
|
+
However, it is quite possible that your index definition will be supported by Witchcraft™ technology out of the box in most cases.
|
489
573
|
|
490
574
|
### Raw Import
|
491
575
|
|
@@ -512,13 +596,12 @@ class LightweightProduct
|
|
512
596
|
end
|
513
597
|
end
|
514
598
|
|
515
|
-
|
516
|
-
|
517
|
-
|
518
|
-
|
599
|
+
index_scope Product
|
600
|
+
default_import_options raw_import: ->(hash) {
|
601
|
+
LightweightProduct.new(hash)
|
602
|
+
}
|
519
603
|
|
520
|
-
|
521
|
-
end
|
604
|
+
field :created_at, 'datetime'
|
522
605
|
```
|
523
606
|
|
524
607
|
Also, you can pass `:raw_import` option to the `import` method explicitly.
|
@@ -529,6 +612,24 @@ By default, when you perform import Chewy checks whether an index exists and cre
|
|
529
612
|
You can turn off this feature to decrease Elasticsearch hits count.
|
530
613
|
To do so you need to set `skip_index_creation_on_import` parameter to `false` in your `config/chewy.yml`
|
531
614
|
|
615
|
+
### Skip record fields during import
|
616
|
+
|
617
|
+
You can use `ignore_blank: true` to skip fields that return `true` for the `.blank?` method:
|
618
|
+
|
619
|
+
```ruby
|
620
|
+
index_scope Country
|
621
|
+
field :id
|
622
|
+
field :cities, ignore_blank: true do
|
623
|
+
field :id
|
624
|
+
field :name
|
625
|
+
field :surname, ignore_blank: true
|
626
|
+
field :description
|
627
|
+
end
|
628
|
+
```
|
629
|
+
|
630
|
+
#### Default values for different types
|
631
|
+
|
632
|
+
By default `ignore_blank` is false on every type except `geo_point`.
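To make the effect of `ignore_blank` concrete, here is a minimal plain-Ruby sketch of composing a document while skipping blank values. The `blank_value?` helper is a stand-in for ActiveSupport's `blank?`, and `compose` is a hypothetical function; neither reflects Chewy's internal implementation.

```ruby
# Stand-in for ActiveSupport's `blank?`: nil, false, empty collections
# and whitespace-only strings count as blank.
def blank_value?(value)
  return true if value.nil? || value == false
  return value.strip.empty? if value.is_a?(String)

  value.respond_to?(:empty?) && value.empty?
end

# Hypothetical composer: `fields` maps a field name to its ignore_blank flag.
def compose(record, fields)
  fields.each_with_object({}) do |(name, ignore_blank), doc|
    value = record[name]
    next if ignore_blank && blank_value?(value)

    doc[name] = value
  end
end

fields = {id: false, name: false, surname: true, description: false}
record = {id: 1, name: 'Anna', surname: '  ', description: ''}

compose(record, fields) # => {id: 1, name: 'Anna', description: ''}
```

Only `surname` is dropped: `description` is just as blank, but its flag is off, so it stays in the document.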
|
532
633
|
|
533
634
|
### Journaling
|
534
635
|
|
@@ -542,7 +643,6 @@ Common journal record looks like this:
|
|
542
643
|
"action": "index",
|
543
644
|
"object_id": [1, 2, 3],
|
544
645
|
"index_name": "...",
|
545
|
-
"type_name": "...",
|
546
646
|
"created_at": "<timestamp>"
|
547
647
|
}
|
548
648
|
```
|
@@ -568,28 +668,16 @@ Or as a default import option for an index:
|
|
568
668
|
|
569
669
|
```ruby
|
570
670
|
class CityIndex
|
571
|
-
|
572
|
-
|
573
|
-
end
|
671
|
+
index_scope City
|
672
|
+
default_import_options journal: true
|
574
673
|
end
|
575
674
|
```
|
576
675
|
|
577
676
|
You may be wondering why you need it. The answer is simple: not to lose the data.
|
578
677
|
|
579
|
-
Imagine that you reset your index in a zero-downtime manner (to separate index), and
|
580
|
-
|
581
|
-
### Types access
|
678
|
+
Imagine that you reset your index in a zero-downtime manner (to a separate index), and in the meantime somebody keeps updating the data frequently (in the old index). All these actions will be written to the journal index, and you'll be able to apply them after the index reset using the `Chewy::Journal` interface.
|
582
679
|
|
583
|
-
|
584
|
-
|
585
|
-
```ruby
|
586
|
-
UsersIndex::User # => UsersIndex::User
|
587
|
-
UsersIndex.type_hash['user'] # => UsersIndex::User
|
588
|
-
UsersIndex.type('user') # => UsersIndex::User
|
589
|
-
UsersIndex.type('foo') # => raises error UndefinedType("Unknown type in UsersIndex: foo")
|
590
|
-
UsersIndex.types # => [UsersIndex::User]
|
591
|
-
UsersIndex.type_names # => ['user']
|
592
|
-
```
|
680
|
+
When enabled, the journal can grow to an enormous size, so consider setting up a cron job that cleans it occasionally using the [`chewy:journal:clean` rake task](#chewyjournal).
|
593
681
|
|
594
682
|
### Index manipulation
|
595
683
|
|
@@ -603,25 +691,22 @@ UsersIndex.create! # use bang or non-bang methods
|
|
603
691
|
UsersIndex.purge
|
604
692
|
UsersIndex.purge! # deletes then creates index
|
605
693
|
|
606
|
-
UsersIndex
|
607
|
-
|
608
|
-
UsersIndex
|
609
|
-
UsersIndex
|
610
|
-
UsersIndex
|
611
|
-
UsersIndex
|
694
|
+
UsersIndex.import # import with 0 arguments processes all the data specified in the index_scope definition
|
695
|
+
UsersIndex.import User.where('rating > 100') # or import specified users scope
|
696
|
+
UsersIndex.import User.where('rating > 100').to_a # or import specified users array
|
697
|
+
UsersIndex.import [1, 2, 42] # pass even ids for import, it will be handled in the most effective way
|
698
|
+
UsersIndex.import User.where('rating > 100'), update_fields: [:email] # if update fields are specified - it will update their values only with the `update` bulk action
|
699
|
+
UsersIndex.import! # raises an exception in case of any import errors
|
612
700
|
|
613
|
-
UsersIndex.import # import every defined type
|
614
|
-
UsersIndex.import user: User.where('rating > 100') # import only active users to `user` type.
|
615
|
-
# Other index types, if exists, will be imported with default scope from the type definition.
|
616
701
|
UsersIndex.reset! # purges index and imports default data for all types
|
617
702
|
```
|
618
703
|
|
619
|
-
If the passed user is `#destroyed?`, or satisfies a `delete_if`
|
704
|
+
If the passed user is `#destroyed?`, or satisfies a `delete_if` index_scope option, or the specified id does not exist in the database, import will perform delete from index action for this object.
|
620
705
|
|
621
706
|
```ruby
|
622
|
-
|
623
|
-
|
624
|
-
|
707
|
+
index_scope User, delete_if: :deleted_at
|
708
|
+
index_scope User, delete_if: -> { deleted_at }
|
709
|
+
index_scope User, delete_if: ->(user) { user.deleted_at }
|
625
710
|
```
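The three `delete_if` forms above differ only in how the condition reaches the record. The plain-Ruby sketch below shows one way such a condition could be evaluated. The `delete?` helper and the `User` struct are hypothetical stand-ins for illustration, not Chewy's actual code.

```ruby
# Minimal stand-in for a model with a soft-delete timestamp.
User = Struct.new(:id, :deleted_at)

# Hypothetical evaluator for the three condition forms.
def delete?(record, condition)
  case condition
  when Symbol, String
    !!record.public_send(condition)        # delete_if: :deleted_at
  when Proc
    if condition.arity.zero?
      !!record.instance_exec(&condition)   # delete_if: -> { deleted_at }
    else
      !!condition.call(record)             # delete_if: ->(user) { user.deleted_at }
    end
  else
    false
  end
end

active  = User.new(1, nil)
removed = User.new(2, Time.now)

delete?(removed, :deleted_at)                      # => true
delete?(removed, -> { deleted_at })                # => true
delete?(active, ->(user) { user.deleted_at })      # => false
```

All three forms answer the same question; the zero-argument lambda simply runs in the context of the record itself.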
|
626
711
|
|
627
712
|
See [actions.rb](lib/chewy/index/actions.rb) for more details.
|
@@ -632,13 +717,12 @@ Assume you've got the following code:
|
|
632
717
|
|
633
718
|
```ruby
|
634
719
|
class City < ActiveRecord::Base
|
635
|
-
update_index 'cities
|
720
|
+
update_index 'cities', :self
|
636
721
|
end
|
637
722
|
|
638
723
|
class CitiesIndex < Chewy::Index
|
639
|
-
|
640
|
-
|
641
|
-
end
|
724
|
+
index_scope City
|
725
|
+
field :name
|
642
726
|
end
|
643
727
|
```
|
644
728
|
|
@@ -658,46 +742,127 @@ end
|
|
658
742
|
|
659
743
|
Using this strategy delays the index update request until the end of the block. Updated records are aggregated and the index update happens with the bulk API. So this strategy is highly optimized.
|
660
744
|
|
661
|
-
#### `:
|
745
|
+
#### `:sidekiq`
|
662
746
|
|
663
|
-
This does the same thing as `:atomic`, but asynchronously using
|
747
|
+
This does the same thing as `:atomic`, but asynchronously using sidekiq. Patch `Chewy::Strategy::Sidekiq::Worker` if you need to improve index updates.
|
664
748
|
|
665
749
|
```ruby
|
666
|
-
Chewy.strategy(:
|
750
|
+
Chewy.strategy(:sidekiq) do
|
667
751
|
City.popular.map(&:do_some_update_action!)
|
668
752
|
end
|
669
753
|
```
|
670
754
|
|
671
|
-
|
755
|
+
The default queue name is `chewy`, you can customize it in settings: `sidekiq.queue_name`
|
756
|
+
```
|
757
|
+
Chewy.settings[:sidekiq] = {queue: :low}
|
758
|
+
```
|
672
759
|
|
673
|
-
|
760
|
+
#### `:lazy_sidekiq`
|
761
|
+
|
762
|
+
This does the same thing as `:sidekiq`, but with lazy evaluation. Beware that it does not allow you to use any non-persistent record state for indices and conditions, because the record will be re-fetched from the database asynchronously using sidekiq. However, for destroyed records the strategy will fall back to `:sidekiq`, because it's not possible to re-fetch deleted records from the database.
|
763
|
+
|
764
|
+
The purpose of this strategy is to improve the response time of the code that should update indexes, as it does not only defer actual ES calls to a background job but `update_index` callbacks evaluation (for created and updated objects) too. Similar to `:sidekiq`, index update is asynchronous so this strategy cannot be used when data and index synchronization is required.
|
674
765
|
|
675
766
|
```ruby
|
676
|
-
Chewy.strategy(:
|
767
|
+
Chewy.strategy(:lazy_sidekiq) do
|
677
768
|
City.popular.map(&:do_some_update_action!)
|
678
769
|
end
|
679
770
|
```
|
680
771
|
|
681
|
-
|
772
|
+
The default queue name is `chewy`, you can customize it in settings: `sidekiq.queue_name`
|
773
|
+
```
|
774
|
+
Chewy.settings[:sidekiq] = {queue: :low}
|
775
|
+
```
|
682
776
|
|
683
|
-
|
777
|
+
#### `:delayed_sidekiq`
|
778
|
+
|
779
|
+
It accumulates the ids of records to be reindexed in Redis during the latency window and then reindexes all accumulated records at once.
|
780
|
+
The strategy is very useful in case of frequently mutated records.
|
781
|
+
It supports the `update_fields` option, so it will try to select just enough data from the DB.
|
684
782
|
|
783
|
+
There are several options that can be defined in the index:
|
685
784
|
```ruby
|
686
|
-
|
785
|
+
class CitiesIndex...
|
786
|
+
strategy_config delayed_sidekiq: {
|
787
|
+
latency: 3,
|
788
|
+
margin: 2,
|
789
|
+
ttl: 60 * 60 * 24,
|
790
|
+
reindex_wrapper: ->(&reindex) {
|
791
|
+
ActiveRecord::Base.connected_to(role: :reading) { reindex.call }
|
792
|
+
}
|
793
|
+
# latency - will prevent scheduling identical jobs
|
794
|
+
# margin - main purpose is to cover db replication lag by the margin
|
795
|
+
# ttl - a chunk expiration time (in seconds)
|
796
|
+
# reindex_wrapper - lambda that accepts a block and wraps the reindex process, e.g. in an AR connection block.
|
797
|
+
}
|
798
|
+
|
799
|
+
...
|
800
|
+
end
|
801
|
+
```
|
802
|
+
|
803
|
+
Also you can define defaults in the `initializers/chewy.rb`
|
804
|
+
```ruby
|
805
|
+
Chewy.settings = {
|
806
|
+
strategy_config: {
|
807
|
+
delayed_sidekiq: {
|
808
|
+
latency: 3,
|
809
|
+
margin: 2,
|
810
|
+
ttl: 60 * 60 * 24,
|
811
|
+
reindex_wrapper: ->(&reindex) {
|
812
|
+
ActiveRecord::Base.connected_to(role: :reading) { reindex.call }
|
813
|
+
}
|
814
|
+
}
|
815
|
+
}
|
816
|
+
}
|
817
|
+
|
818
|
+
```
|
819
|
+
or in `config/chewy.yml`
|
820
|
+
```yaml
|
821
|
+
strategy_config:
|
822
|
+
delayed_sidekiq:
|
823
|
+
latency: 3
|
824
|
+
margin: 2
|
825
|
+
ttl: <%= 60 * 60 * 24 %>
|
826
|
+
# reindex_wrapper setting is not possible here!!! use the initializer instead
|
827
|
+
```
|
828
|
+
|
829
|
+
You can use the strategy identically to other strategies
|
830
|
+
```ruby
|
831
|
+
Chewy.strategy(:delayed_sidekiq) do
|
687
832
|
City.popular.map(&:do_some_update_action!)
|
688
833
|
end
|
689
834
|
```
|
690
835
|
|
691
|
-
|
836
|
+
The default queue name is `chewy`, you can customize it in settings: `sidekiq.queue_name`
|
837
|
+
```
|
838
|
+
Chewy.settings[:sidekiq] = {queue: :low}
|
839
|
+
```
|
692
840
|
|
693
|
-
|
841
|
+
Explicit call of the reindex using the `:delayed_sidekiq` strategy
|
842
|
+
```ruby
|
843
|
+
CitiesIndex.import([1, 2, 3], strategy: :delayed_sidekiq)
|
844
|
+
```
|
694
845
|
|
846
|
+
Explicit call of the reindex using `:delayed_sidekiq` strategy with `:update_fields` support
|
695
847
|
```ruby
|
696
|
-
|
848
|
+
CitiesIndex.import([1, 2, 3], update_fields: [:name], strategy: :delayed_sidekiq)
|
849
|
+
```
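The idea behind `latency` and `margin` can be modelled without Redis or Sidekiq. The sketch below is an in-memory simplification under stated assumptions: ids pushed within the same latency window land in one chunk, duplicates collapse, and a chunk becomes due once its window end plus the margin has passed. The `WindowedAccumulator` class and its bucketing formula are hypothetical and are not Chewy's actual scheduler.

```ruby
require 'set'

# Hypothetical in-memory model of window-based accumulation.
class WindowedAccumulator
  def initialize(latency:, margin:)
    @latency = latency
    @margin = margin
    @chunks = Hash.new { |hash, key| hash[key] = Set.new }
  end

  # Stores the id in the chunk of the current window and returns
  # the time (epoch seconds) at which that chunk becomes due.
  def push(id, now)
    window_end = (now / @latency + 1) * @latency
    @chunks[window_end] << id
    window_end + @margin
  end

  # Removes and returns the ids of every chunk that is due at `now`.
  def drain(now)
    due = @chunks.keys.select { |window_end| window_end + @margin <= now }
    due.flat_map { |window_end| @chunks.delete(window_end).to_a }
  end
end

accumulator = WindowedAccumulator.new(latency: 3, margin: 2)
accumulator.push(1, 100) # => 104
accumulator.push(2, 101) # => 104, same window
accumulator.push(1, 101) # => 104, duplicate id is stored once
accumulator.push(3, 102) # => 107, next window

first_batch  = accumulator.drain(104) # => [1, 2]
second_batch = accumulator.drain(107) # => [3]
```

A frequently mutated record therefore costs one reindex per window rather than one per update, which is the case this strategy targets.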
|
850
|
+
|
851
|
+
#### `:active_job`
|
852
|
+
|
853
|
+
This does the same thing as `:atomic`, but using ActiveJob. This will inherit the ActiveJob configuration settings, including the `active_job.queue_adapter` setting for the environment. Patch `Chewy::Strategy::ActiveJob::Worker` if you need to improve index updates.
|
854
|
+
|
855
|
+
```ruby
|
856
|
+
Chewy.strategy(:active_job) do
|
697
857
|
City.popular.map(&:do_some_update_action!)
|
698
858
|
end
|
699
859
|
```
|
700
860
|
|
861
|
+
The default queue name is `chewy`, you can customize it in settings: `active_job.queue_name`
|
862
|
+
```
|
863
|
+
Chewy.settings[:active_job] = {queue: :low}
|
864
|
+
```
|
865
|
+
|
701
866
|
#### `:urgent`
|
702
867
|
|
703
868
|
The following strategy is convenient if you are going to update documents in your index one by one.
|
@@ -719,7 +884,9 @@ It is convenient for use in e.g. the Rails console with non-block notation:
|
|
719
884
|
|
720
885
|
#### `:bypass`
|
721
886
|
|
722
|
-
|
887
|
+
When the bypass strategy is active the index will not be automatically updated on object save.
|
888
|
+
|
889
|
+
For example, on `City.first.save!` the cities index would not be updated.
|
723
890
|
|
724
891
|
#### Nesting
|
725
892
|
|
@@ -773,6 +940,12 @@ RSpec.configure do |config|
|
|
773
940
|
end
|
774
941
|
```
|
775
942
|
|
943
|
+
### Elasticsearch client options
|
944
|
+
|
945
|
+
All connection options, except the `:prefix`, are passed to the `Elasticsearch::Client.new` ([chewy/lib/chewy.rb](https://github.com/toptal/chewy/blob/f5bad9f83c21416ac10590f6f34009c645062e89/lib/chewy.rb#L153-L160)):
|
946
|
+
|
947
|
+
Here's the relevant Elasticsearch documentation on the subject: https://rubydoc.info/gems/elasticsearch-transport#setting-hosts
|
948
|
+
|
776
949
|
### `ActiveSupport::Notifications` support
|
777
950
|
|
778
951
|
Chewy notifies about the following events:
|
@@ -784,14 +957,14 @@ Chewy has notifying the following events:
|
|
784
957
|
|
785
958
|
#### `import_objects.chewy` payload
|
786
959
|
|
787
|
-
* `payload[:
|
960
|
+
* `payload[:index]`: currently imported index name
|
788
961
|
* `payload[:import]`: imports stats, total imported and deleted objects count:
|
789
962
|
|
790
963
|
```ruby
|
791
964
|
{index: 30, delete: 5}
|
792
965
|
```
|
793
966
|
|
794
|
-
* `payload[:errors]`: might not
|
967
|
+
* `payload[:errors]`: might not exist. Contains grouped errors with objects ids list:
|
795
968
|
|
796
969
|
```ruby
|
797
970
|
{index: {
|
@@ -807,72 +980,123 @@ Chewy has notifying the following events:
|
|
807
980
|
To integrate with NewRelic you may use the following example source (config/initializers/chewy.rb):
|
808
981
|
|
809
982
|
```ruby
|
810
|
-
|
811
|
-
|
812
|
-
|
813
|
-
|
814
|
-
|
815
|
-
|
816
|
-
|
817
|
-
|
818
|
-
|
983
|
+
require 'new_relic/agent/instrumentation/evented_subscriber'
|
984
|
+
|
985
|
+
class ChewySubscriber < NewRelic::Agent::Instrumentation::EventedSubscriber
|
986
|
+
def start(name, id, payload)
|
987
|
+
event = ChewyEvent.new(name, Time.current, nil, id, payload)
|
988
|
+
push_event(event)
|
989
|
+
end
|
990
|
+
|
991
|
+
def finish(_name, id, _payload)
|
992
|
+
pop_event(id).finish
|
819
993
|
end
|
820
|
-
end
|
821
994
|
|
822
|
-
|
823
|
-
|
824
|
-
|
825
|
-
|
995
|
+
class ChewyEvent < NewRelic::Agent::Instrumentation::Event
|
996
|
+
OPERATIONS = {
|
997
|
+
'import_objects.chewy' => 'import',
|
998
|
+
'search_query.chewy' => 'search',
|
999
|
+
'delete_query.chewy' => 'delete'
|
1000
|
+
}.freeze
|
1001
|
+
|
1002
|
+
def initialize(*args)
|
1003
|
+
super
|
1004
|
+
@segment = start_segment
|
1005
|
+
end
|
1006
|
+
|
1007
|
+
def start_segment
|
1008
|
+
segment = NewRelic::Agent::Transaction::DatastoreSegment.new product, operation, collection, host, port
|
1009
|
+
if (txn = state.current_transaction)
|
1010
|
+
segment.transaction = txn
|
1011
|
+
end
|
1012
|
+
segment.notice_sql @payload[:request].to_s
|
1013
|
+
segment.start
|
1014
|
+
segment
|
1015
|
+
end
|
1016
|
+
|
1017
|
+
def finish
|
1018
|
+
if (txn = state.current_transaction)
|
1019
|
+
txn.add_segment @segment
|
1020
|
+
end
|
1021
|
+
@segment.finish
|
1022
|
+
end
|
1023
|
+
|
1024
|
+
private
|
1025
|
+
|
1026
|
+
def state
|
1027
|
+
@state ||= NewRelic::Agent::TransactionState.tl_get
|
1028
|
+
end
|
1029
|
+
|
1030
|
+
def product
|
1031
|
+
'Elasticsearch'
|
1032
|
+
end
|
1033
|
+
|
1034
|
+
def operation
|
1035
|
+
OPERATIONS[name]
|
1036
|
+
end
|
826
1037
|
|
827
|
-
|
828
|
-
|
829
|
-
|
830
|
-
|
1038
|
+
def collection
|
1039
|
+
payload.values_at(:type, :index)
|
1040
|
+
.reject { |value| value.try(:empty?) }
|
1041
|
+
.first
|
1042
|
+
.to_s
|
1043
|
+
end
|
1044
|
+
|
1045
|
+
def host
|
1046
|
+
Chewy.client.transport.hosts.first[:host]
|
1047
|
+
end
|
1048
|
+
|
1049
|
+
def port
|
1050
|
+
Chewy.client.transport.hosts.first[:port]
|
1051
|
+
end
|
831
1052
|
end
|
832
1053
|
end
|
1054
|
+
|
1055
|
+
ActiveSupport::Notifications.subscribe(/.chewy$/, ChewySubscriber.new)
|
833
1056
|
```
|
834
1057
|
|
835
1058
|
### Search requests
|
836
1059
|
|
837
|
-
|
838
|
-
|
839
|
-
If you want to use it - simply do `Chewy.search_class = Chewy::Query` somewhere before indices are initialized.
|
840
|
-
|
841
|
-
The new DSL is enabled by default, here is a quick introduction.
|
1060
|
+
Quick introduction.
|
842
1061
|
|
843
1062
|
#### Composing requests
|
844
1063
|
|
845
|
-
The request DSL have the same chainable nature as AR
|
1064
|
+
The request DSL has the same chainable nature as AR. The main class is `Chewy::Search::Request`.
|
846
1065
|
|
847
1066
|
```ruby
|
848
|
-
|
849
|
-
PlaceIndex::City.query(match: {name: 'London'}) # returns cities only.
|
1067
|
+
CitiesIndex.query(match: {name: 'London'})
|
850
1068
|
```
|
851
1069
|
|
852
|
-
Main methods of the request DSL are: `query`, `filter` and `post_filter`, it is possible to pass pure query hashes or use `elasticsearch-dsl`.
|
1070
|
+
The main methods of the request DSL are `query`, `filter` and `post_filter`. It is possible to pass pure query hashes or use `elasticsearch-dsl`.
|
853
1071
|
|
854
1072
|
```ruby
|
855
|
-
|
1073
|
+
CitiesIndex
|
856
1074
|
.filter(term: {name: 'Bangkok'})
|
857
|
-
.query {
|
1075
|
+
.query(match: {name: 'London'})
|
858
1076
|
.query.not(range: {population: {gt: 1_000_000}})
|
859
1077
|
```
|
860
1078
|
|
861
|
-
|
1079
|
+
You can query a set of indexes at once:
|
1080
|
+
|
1081
|
+
```ruby
|
1082
|
+
CitiesIndex.indices(CountriesIndex).query(match: {name: 'Some'})
|
1083
|
+
```
|
1084
|
+
|
1085
|
+
See https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl.html and https://github.com/elastic/elasticsearch-dsl-ruby for more details.
|
862
1086
|
|
863
1087
|
An important part of requests manipulation is merging. There are 4 methods to perform it: `merge`, `and`, `or`, `not`. See [Chewy::Search::QueryProxy](lib/chewy/search/query_proxy.rb) for details. Also, `only` and `except` methods help to remove unneeded parts of the request.
|
864
1088
|
|
865
1089
|
Every other request part is covered by a bunch of additional methods, see [Chewy::Search::Request](lib/chewy/search/request.rb) for details:
|
866
1090
|
|
867
1091
|
```ruby
|
868
|
-
|
1092
|
+
CitiesIndex.limit(10).offset(30).order(:name, {population: {order: :desc}})
|
869
1093
|
```
|
870
1094
|
|
871
1095
|
Request DSL also provides additional scope actions, like `delete_all`, `exists?`, `count`, `pluck`, etc.
|
872
1096
|
|
873
1097
|
#### Pagination
|
874
1098
|
|
875
|
-
The request DSL supports pagination with `Kaminari
|
1099
|
+
The request DSL supports pagination with `Kaminari`. An extension is enabled on initialization if `Kaminari` is available. See [Chewy::Search](lib/chewy/search.rb) and [Chewy::Search::Pagination::Kaminari](lib/chewy/search/pagination/kaminari.rb) for details.
|
876
1100
|
|
877
1101
|
#### Named scopes
|
878
1102
|
|
@@ -891,8 +1115,8 @@ See [Chewy::Search::Scrolling](lib/chewy/search/scrolling.rb) for details.
|
|
891
1115
|
It is possible to load ORM/ODM source objects with the `objects` method. To provide additional loading options use `load` method:
|
892
1116
|
|
893
1117
|
```ruby
|
894
|
-
|
895
|
-
|
1118
|
+
CitiesIndex.load(scope: -> { active }).to_a # to_a returns `Chewy::Index` wrappers.
|
1119
|
+
CitiesIndex.load(scope: -> { active }).objects # An array of AR source objects.
|
896
1120
|
```
|
897
1121
|
|
898
1122
|
See [Chewy::Search::Loader](lib/chewy/search/loader.rb) for more details.
|
@@ -900,23 +1124,12 @@ See [Chewy::Search::Loader](lib/chewy/search/loader.rb) for more details.
|
|
900
1124
|
In case when it is necessary to iterate through both of the wrappers and objects simultaneously, `object_hash` method helps a lot:
|
901
1125
|
|
902
1126
|
```ruby
|
903
|
-
scope =
|
1127
|
+
scope = CitiesIndex.load(scope: -> { active })
|
904
1128
|
scope.each do |wrapper|
|
905
1129
|
scope.object_hash[wrapper]
|
906
1130
|
end
|
907
1131
|
```
|
908
1132
|
|
909
|
-
#### Legacy DSL incompatibilities
|
910
|
-
|
911
|
-
* Filters advanced block DSL is not supported anymore, `elasticsearch-dsl` is used instead.
|
912
|
-
* Things like `query_mode` and `filter_mode` are in past, use advanced DSL to achieve similar behavior. See [Chewy::Search::QueryProxy](lib/chewy/search/query_proxy.rb) for details.
|
913
|
-
* `preload` method is no more, the collection returned by scope doesn't depend on loading options, scope always returns `Chewy::Type` wrappers. To get ORM/ODM objects, use `#objects` method.
|
914
|
-
* Some of the methods have changed their purpose: `only` was used to filter fields before, now it filters the scope. To filter fields use `source` or `stored_fields`.
|
915
|
-
* `types!` method is no more, use `except(:types).types(...)`
|
916
|
-
* Named aggregations are not supported, use named scopes instead.
|
917
|
-
* A lot of query-level methods were not ported: everything that is related to boost and scoring. Use `query` manipulation to provide them.
|
918
|
-
* `Chewy::Type#_object` returns nil always. Use `Chewy::Search::Response#object_hash` instead.
|
919
|
-
|
920
1133
|
### Rake tasks
|
921
1134
|
|
922
1135
|
For a Rails application, some index-maintaining rake tasks are defined.
|
@@ -928,59 +1141,57 @@ Performs zero-downtime reindexing as described [here](https://www.elastic.co/blo
|
|
928
1141
|
```bash
|
929
1142
|
rake chewy:reset # resets all the existing indices
|
930
1143
|
rake chewy:reset[users] # resets UsersIndex only
|
931
|
-
rake chewy:reset[users,
|
932
|
-
rake chewy:reset[-users,
|
1144
|
+
rake chewy:reset[users,cities] # resets UsersIndex and CitiesIndex
|
1145
|
+
rake chewy:reset[-users,cities] # resets every index in the application except specified ones
|
933
1146
|
```
|
934
1147
|
|
935
1148
|
#### `chewy:upgrade`
|
936
1149
|
|
937
1150
|
Performs reset exactly the same way as `chewy:reset` does, but only when the index specification (setting or mapping) was changed.
|
938
1151
|
|
939
|
-
It works only when index specification is locked in `Chewy::Stash` index. The first run will reset all indexes and lock their specifications.
|
1152
|
+
It works only when index specification is locked in `Chewy::Stash::Specification` index. The first run will reset all indexes and lock their specifications.
|
940
1153
|
|
941
|
-
See [Chewy::Stash](lib/chewy/stash.rb) and [Chewy::Index::Specification](lib/chewy/index/specification.rb) for more details.
|
1154
|
+
See [Chewy::Stash::Specification](lib/chewy/stash.rb) and [Chewy::Index::Specification](lib/chewy/index/specification.rb) for more details.
|
942
1155
|
|
943
1156
|
|
944
1157
|
```bash
|
945
1158
|
rake chewy:upgrade # upgrades all the existing indices
|
946
1159
|
rake chewy:upgrade[users] # upgrades UsersIndex only
|
947
|
-
rake chewy:upgrade[users,
|
948
|
-
rake chewy:upgrade[-users,
|
1160
|
+
rake chewy:upgrade[users,cities] # upgrades UsersIndex and CitiesIndex
|
1161
|
+
rake chewy:upgrade[-users,cities] # upgrades every index in the application except specified ones
|
949
1162
|
```
|
950
1163
|
|
951
1164
|
#### `chewy:update`
|
952
1165
|
|
953
1166
|
It doesn't create indexes, it simply imports everything to the existing ones and fails if the index was not created before.
|
954
1167
|
|
955
|
-
Unlike `reset` or `upgrade` tasks, it is possible to pass type references to update the particular type. In index name is passed without the type specified, it will update all the types defined for this index.
|
956
|
-
|
957
1168
|
```bash
|
958
1169
|
rake chewy:update # updates all the existing indices
|
959
1170
|
rake chewy:update[users] # updates UsersIndex only
|
960
|
-
rake chewy:update[users,
|
961
|
-
rake chewy:update[-users,
|
1171
|
+
rake chewy:update[users,cities] # updates UsersIndex and CitiesIndex
|
1172
|
+
rake chewy:update[-users,cities] # updates every index in the application except UsersIndex and CitiesIndex
|
962
1173
|
```
|
963
1174
|
|
964
1175
|
#### `chewy:sync`
|
965
1176
|
|
966
|
-
Provides a way to synchronize outdated indexes with the source quickly and without doing a full reset.
|
1177
|
+
Provides a way to synchronize outdated indexes with the source quickly and without doing a full reset. By default field `updated_at` is used to find outdated records, but this could be customized by `outdated_sync_field` as described at [Chewy::Index::Syncer](lib/chewy/index/syncer.rb).
|
967
1178
|
|
968
|
-
Arguments are similar to the ones taken by `chewy:update` task.
|
1179
|
+
Arguments are similar to the ones taken by `chewy:update` task.
|
969
1180
|
|
970
|
-
See [Chewy::
|
1181
|
+
See [Chewy::Index::Syncer](lib/chewy/index/syncer.rb) for more details.
|
971
1182
|
|
972
1183
|
```bash
|
973
1184
|
rake chewy:sync # synchronizes all the existing indices
|
974
1185
|
rake chewy:sync[users] # synchronizes UsersIndex only
|
975
|
-
rake chewy:sync[users,
|
976
|
-
rake chewy:sync[-users,
|
1186
|
+
rake chewy:sync[users,cities] # synchronizes UsersIndex and CitiesIndex
|
1187
|
+
rake chewy:sync[-users,cities] # synchronizes every index in the application except UsersIndex and CitiesIndex
|
977
1188
|
```
|
978
1189
|
|
979
1190
|
#### `chewy:deploy`
|
980
1191
|
|
981
1192
|
This rake task is especially useful during the production deploy. It is a combination of `chewy:upgrade` and `chewy:sync` and the latter is called only for the indexes that were not reset during the first stage.
|
982
1193
|
|
983
|
-
It is not possible to specify any particular
|
1194
|
+
It is not possible to specify any particular indexes for this task as it doesn't make much sense.
|
984
1195
|
|
985
1196
|
Right now the approach is that if some data had been updated, but index definition was not changed (no changes satisfying the synchronization algorithm were done), it would be much faster to perform manual partial index update inside data migrations or even manually after the deploy.
|
986
1197
|
|
@@ -997,23 +1208,43 @@ If the number of processes is not specified explicitly - `parallel` gem tries to
|
|
997
1208
|
```bash
|
998
1209
|
rake chewy:parallel:reset
|
999
1210
|
rake chewy:parallel:upgrade[4]
|
1000
|
-
rake chewy:parallel:update[4,
|
1211
|
+
rake chewy:parallel:update[4,cities]
|
1001
1212
|
rake chewy:parallel:sync[4,-users]
|
1002
1213
|
rake chewy:parallel:deploy[4] # performs parallel upgrade and parallel sync afterwards
|
1003
1214
|
```
|
1004
1215
|
|
1005
1216
|
#### `chewy:journal`
|
1006
1217
|
|
1007
|
-
This namespace contains two tasks for the journal manipulations: `chewy:journal:apply` and `chewy:journal:clean`. Both are taking time as the first argument (optional for clean) and a list of indexes
|
1218
|
+
This namespace contains two tasks for journal manipulation: `chewy:journal:apply` and `chewy:journal:clean`. Both take a time as the first argument (optional for `clean`) and a list of indexes, exactly as the tasks above do. The time can be in any format parsable by ActiveSupport.
|
1008
1219
|
|
1009
1220
|
```bash
|
1010
1221
|
rake chewy:journal:apply["$(date -v-1H -u +%FT%TZ)"] # apply journaled changes for the past hour
|
1011
1222
|
rake chewy:journal:apply["$(date -v-1H -u +%FT%TZ)",users] # apply journaled changes for the past hour on UsersIndex only
|
1012
1223
|
```
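
The `date -v-1H` flag used above is BSD/macOS syntax; GNU `date` spells the same thing as `-d '1 hour ago'`. If a portable alternative is needed, the timestamp can be built in Ruby instead. The sketch below uses only the standard library; `journal_time` is a hypothetical helper name, not something Chewy provides.

```ruby
require 'time'

# Hypothetical helper: returns a UTC ISO 8601 timestamp for a moment
# `seconds_ago` seconds before `now`.
def journal_time(seconds_ago, now: Time.now)
  (now.getutc - seconds_ago).iso8601
end

puts journal_time(3600) # a timestamp one hour in the past
```

The printed value has the same shape as the output of `date -u +%FT%TZ` and can be passed as the first argument to `chewy:journal:apply`.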
|
1013
1224
|
|
1014
|
-
|
1225
|
+
When the journal becomes very large, the classical way of deletion would be obstructive and resource-consuming. Fortunately, Chewy internally uses the [delete-by-query](https://www.elastic.co/guide/en/elasticsearch/reference/7.17/docs-delete-by-query.html#docs-delete-by-query-task-api) ES function, which supports async execution with batching and [throttling](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-delete-by-query.html#docs-delete-by-query-throttle).
|
1015
1226
|
|
1016
|
-
|
1227
|
+
The available options, which can be set by ENV variables, are listed below:
|
1228
|
+
* `WAIT_FOR_COMPLETION` - a boolean flag that controls async execution. The task waits for completion by default. When set to `false` (`0`, `f`, `false`, or `off`, in any letter case, are all accepted as `false`), Elasticsearch performs some preflight checks, launches the request, and returns a task reference you can use to cancel the task or get its status.
|
1229
|
+
* `REQUESTS_PER_SECOND` - float. The throttle for this request in sub-requests per second. No throttling is enforced by default.
|
1230
|
+
* `SCROLL_SIZE` - integer. The number of documents to be deleted in a single sub-request. The default batch size is 1000.
|
1231
|
+
|
1232
|
+
```bash
|
1233
|
+
rake chewy:journal:clean WAIT_FOR_COMPLETION=false REQUESTS_PER_SECOND=10 SCROLL_SIZE=5000
|
1234
|
+
```
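
The `WAIT_FOR_COMPLETION` rule described above (`0`, `f`, `false` or `off`, in any case, mean `false`; waiting is the default) can be sketched as follows. This is an illustration under stated assumptions, not Chewy's actual parsing code: `wait_for_completion?` and `FALSE_VALUES` are hypothetical names, and treating every other value as `true` is an assumption of this sketch.

```ruby
# Hypothetical sketch, NOT Chewy's implementation.
FALSE_VALUES = %w[0 f false off].freeze

def wait_for_completion?(env = ENV)
  value = env['WAIT_FOR_COMPLETION']
  # Unset variable: wait for completion, as the documented default says.
  return true if value.nil?

  # Assumption of this sketch: any value outside the list counts as true.
  !FALSE_VALUES.include?(value.strip.downcase)
end

p wait_for_completion?({})                               # variable not set
p wait_for_completion?('WAIT_FOR_COMPLETION' => 'OFF')   # falsy spelling, upper case
p wait_for_completion?('WAIT_FOR_COMPLETION' => 'true')  # any other value
```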
|
1235
|
+
|
1236
|
+
### RSpec integration
|
1237
|
+
|
1238
|
+
Just add `require 'chewy/rspec'` to your `spec_helper.rb` and you will get additional features:
|
1239
|
+
|
1240
|
+
[update_index](lib/chewy/rspec/update_index.rb) helper
|
1241
|
+
`mock_elasticsearch_response` helper to mock elasticsearch response
|
1242
|
+
`mock_elasticsearch_response_sources` helper to mock elasticsearch response sources
|
1243
|
+
`build_query` matcher to compare request and expected query (returns `true`/`false`)
|
1244
|
+
|
1245
|
+
To use the `mock_elasticsearch_response` and `mock_elasticsearch_response_sources` helpers, add `include Chewy::Rspec::Helpers` to your tests.
|
1246
|
+
|
1247
|
+
See [chewy/rspec/](lib/chewy/rspec/) for more details.
|
1017
1248
|
|
1018
1249
|
### Minitest integration
|
1019
1250
|
|
@@ -1023,6 +1254,14 @@ Since you can set `:bypass` strategy for test suites and manually handle import
|
|
1023
1254
|
|
1024
1255
|
But if you need chewy to index/update models regularly in your test suite, you can specify the `:urgent` strategy for document indexing. Add `Chewy.strategy(:urgent)` to `test_helper.rb`.
|
1025
1256
|
|
1257
|
+
Also, you can use additional helpers:
|
1258
|
+
|
1259
|
+
`mock_elasticsearch_response` to mock elasticsearch response
|
1260
|
+
`mock_elasticsearch_response_sources` to mock elasticsearch response sources
|
1261
|
+
`assert_elasticsearch_query` to compare request and expected query (returns `true`/`false`)
|
1262
|
+
|
1263
|
+
See [chewy/minitest/](lib/chewy/minitest/) for more details.
|
1264
|
+
|
1026
1265
|
### DatabaseCleaner
|
1027
1266
|
|
1028
1267
|
If you use `DatabaseCleaner` in your tests with [the `transaction` strategy](https://github.com/DatabaseCleaner/database_cleaner#how-to-use), you may run into the problem that `ActiveRecord` models are not indexed automatically on save even though you set up the callbacks to do this with the `update_index` method. The issue arises because `chewy` indexes data in `after_commit` callbacks by default, and `after_commit` callbacks are not run with `DatabaseCleaner`'s `transaction` strategy. You can solve this issue by changing the `Chewy.use_after_commit_callbacks` option. Just add the following initializer in your Rails application:
|
@@ -1032,12 +1271,6 @@ If you use `DatabaseCleaner` in your tests with [the `transaction` strategy](htt
|
|
1032
1271
|
Chewy.use_after_commit_callbacks = !Rails.env.test?
|
1033
1272
|
```
|
1034
1273
|
|
1035
|
-
## TODO a.k.a coming soon:
|
1036
|
-
|
1037
|
-
* Typecasting support
|
1038
|
-
* update_all support
|
1039
|
-
* Maybe, closer ORM/ODM integration, creating index classes implicitly
|
1040
|
-
|
1041
1274
|
## Contributing
|
1042
1275
|
|
1043
1276
|
1. Fork it (http://github.com/toptal/chewy/fork)
|
@@ -1047,9 +1280,14 @@ Chewy.use_after_commit_callbacks = !Rails.env.test?
|
|
1047
1280
|
5. Push to the branch (`git push origin my-new-feature`)
|
1048
1281
|
6. Create new Pull Request
|
1049
1282
|
|
1050
|
-
Use the following Rake tasks to control the Elasticsearch cluster while developing
|
1283
|
+
Use the following Rake tasks to control the Elasticsearch cluster while developing, if you prefer native Elasticsearch installation over the dockerized one:
|
1051
1284
|
|
1052
1285
|
```bash
|
1053
1286
|
rake elasticsearch:start # start Elasticsearch cluster on 9250 port for tests
|
1054
1287
|
rake elasticsearch:stop # stop Elasticsearch
|
1055
1288
|
```
|
1289
|
+
|
1290
|
+
## Copyright
|
1291
|
+
|
1292
|
+
Copyright (c) 2013-2021 Toptal, LLC. See [LICENSE.txt](LICENSE.txt) for
|
1293
|
+
further details.
|