chewy 6.0.0 → 7.6.0

Files changed (188)
  1. checksums.yaml +4 -4
  2. data/.github/CODEOWNERS +1 -0
  3. data/.github/ISSUE_TEMPLATE/bug_report.md +39 -0
  4. data/.github/ISSUE_TEMPLATE/feature_request.md +20 -0
  5. data/.github/PULL_REQUEST_TEMPLATE.md +16 -0
  6. data/.github/dependabot.yml +42 -0
  7. data/.github/workflows/ruby.yml +60 -0
  8. data/.rubocop.yml +16 -8
  9. data/.rubocop_todo.yml +110 -22
  10. data/CHANGELOG.md +396 -105
  11. data/CODE_OF_CONDUCT.md +14 -0
  12. data/CONTRIBUTING.md +63 -0
  13. data/Gemfile +4 -10
  14. data/Guardfile +3 -1
  15. data/README.md +497 -275
  16. data/chewy.gemspec +5 -20
  17. data/gemfiles/base.gemfile +12 -0
  18. data/gemfiles/rails.6.1.activerecord.gemfile +10 -15
  19. data/gemfiles/rails.7.0.activerecord.gemfile +14 -0
  20. data/gemfiles/rails.7.1.activerecord.gemfile +14 -0
  21. data/lib/chewy/config.rb +60 -52
  22. data/lib/chewy/elastic_client.rb +31 -0
  23. data/lib/chewy/errors.rb +7 -10
  24. data/lib/chewy/fields/base.rb +79 -13
  25. data/lib/chewy/fields/root.rb +4 -14
  26. data/lib/chewy/index/actions.rb +54 -37
  27. data/lib/chewy/{type → index}/adapter/active_record.rb +30 -6
  28. data/lib/chewy/{type → index}/adapter/base.rb +2 -3
  29. data/lib/chewy/{type → index}/adapter/object.rb +27 -31
  30. data/lib/chewy/{type → index}/adapter/orm.rb +17 -18
  31. data/lib/chewy/index/aliases.rb +14 -5
  32. data/lib/chewy/index/crutch.rb +40 -0
  33. data/lib/chewy/index/import/bulk_builder.rb +311 -0
  34. data/lib/chewy/{type → index}/import/bulk_request.rb +6 -7
  35. data/lib/chewy/{type → index}/import/journal_builder.rb +11 -12
  36. data/lib/chewy/{type → index}/import/routine.rb +18 -17
  37. data/lib/chewy/{type → index}/import.rb +76 -32
  38. data/lib/chewy/{type → index}/mapping.rb +29 -34
  39. data/lib/chewy/index/observe/active_record_methods.rb +87 -0
  40. data/lib/chewy/index/observe/callback.rb +34 -0
  41. data/lib/chewy/index/observe.rb +17 -0
  42. data/lib/chewy/index/specification.rb +1 -0
  43. data/lib/chewy/{type → index}/syncer.rb +59 -59
  44. data/lib/chewy/{type → index}/witchcraft.rb +11 -7
  45. data/lib/chewy/{type → index}/wrapper.rb +2 -2
  46. data/lib/chewy/index.rb +67 -94
  47. data/lib/chewy/journal.rb +25 -14
  48. data/lib/chewy/log_subscriber.rb +5 -1
  49. data/lib/chewy/minitest/helpers.rb +86 -13
  50. data/lib/chewy/minitest/search_index_receiver.rb +24 -26
  51. data/lib/chewy/railtie.rb +6 -20
  52. data/lib/chewy/rake_helper.rb +169 -113
  53. data/lib/chewy/rspec/build_query.rb +12 -0
  54. data/lib/chewy/rspec/helpers.rb +55 -0
  55. data/lib/chewy/rspec/update_index.rb +55 -44
  56. data/lib/chewy/rspec.rb +2 -0
  57. data/lib/chewy/runtime/version.rb +1 -1
  58. data/lib/chewy/runtime.rb +1 -1
  59. data/lib/chewy/search/loader.rb +19 -41
  60. data/lib/chewy/search/parameters/collapse.rb +16 -0
  61. data/lib/chewy/search/parameters/concerns/query_storage.rb +2 -2
  62. data/lib/chewy/search/parameters/ignore_unavailable.rb +27 -0
  63. data/lib/chewy/search/parameters/indices.rb +13 -58
  64. data/lib/chewy/search/parameters/knn.rb +16 -0
  65. data/lib/chewy/search/parameters/order.rb +6 -19
  66. data/lib/chewy/search/parameters/source.rb +5 -1
  67. data/lib/chewy/search/parameters/storage.rb +1 -1
  68. data/lib/chewy/search/parameters/track_total_hits.rb +16 -0
  69. data/lib/chewy/search/parameters.rb +6 -4
  70. data/lib/chewy/search/query_proxy.rb +9 -2
  71. data/lib/chewy/search/request.rb +169 -134
  72. data/lib/chewy/search/response.rb +5 -5
  73. data/lib/chewy/search/scoping.rb +7 -8
  74. data/lib/chewy/search/scrolling.rb +13 -13
  75. data/lib/chewy/search.rb +9 -19
  76. data/lib/chewy/stash.rb +19 -30
  77. data/lib/chewy/strategy/active_job.rb +1 -1
  78. data/lib/chewy/strategy/atomic_no_refresh.rb +18 -0
  79. data/lib/chewy/strategy/base.rb +10 -0
  80. data/lib/chewy/strategy/delayed_sidekiq/scheduler.rb +168 -0
  81. data/lib/chewy/strategy/delayed_sidekiq/worker.rb +76 -0
  82. data/lib/chewy/strategy/delayed_sidekiq.rb +30 -0
  83. data/lib/chewy/strategy/lazy_sidekiq.rb +64 -0
  84. data/lib/chewy/strategy/sidekiq.rb +2 -1
  85. data/lib/chewy/strategy.rb +6 -19
  86. data/lib/chewy/version.rb +1 -1
  87. data/lib/chewy.rb +39 -86
  88. data/lib/generators/chewy/install_generator.rb +1 -1
  89. data/lib/tasks/chewy.rake +36 -32
  90. data/migration_guide.md +46 -8
  91. data/spec/chewy/config_spec.rb +16 -41
  92. data/spec/chewy/elastic_client_spec.rb +26 -0
  93. data/spec/chewy/fields/base_spec.rb +432 -147
  94. data/spec/chewy/fields/root_spec.rb +20 -28
  95. data/spec/chewy/fields/time_fields_spec.rb +5 -5
  96. data/spec/chewy/index/actions_spec.rb +368 -59
  97. data/spec/chewy/{type → index}/adapter/active_record_spec.rb +156 -40
  98. data/spec/chewy/{type → index}/adapter/object_spec.rb +21 -6
  99. data/spec/chewy/index/aliases_spec.rb +3 -3
  100. data/spec/chewy/index/import/bulk_builder_spec.rb +494 -0
  101. data/spec/chewy/{type → index}/import/bulk_request_spec.rb +5 -12
  102. data/spec/chewy/{type → index}/import/journal_builder_spec.rb +9 -19
  103. data/spec/chewy/{type → index}/import/routine_spec.rb +19 -19
  104. data/spec/chewy/{type → index}/import_spec.rb +164 -98
  105. data/spec/chewy/index/mapping_spec.rb +135 -0
  106. data/spec/chewy/index/observe/active_record_methods_spec.rb +68 -0
  107. data/spec/chewy/index/observe/callback_spec.rb +139 -0
  108. data/spec/chewy/index/observe_spec.rb +143 -0
  109. data/spec/chewy/index/settings_spec.rb +3 -1
  110. data/spec/chewy/index/specification_spec.rb +20 -30
  111. data/spec/chewy/{type → index}/syncer_spec.rb +14 -19
  112. data/spec/chewy/{type → index}/witchcraft_spec.rb +20 -22
  113. data/spec/chewy/index/wrapper_spec.rb +100 -0
  114. data/spec/chewy/index_spec.rb +60 -105
  115. data/spec/chewy/journal_spec.rb +25 -74
  116. data/spec/chewy/minitest/helpers_spec.rb +123 -15
  117. data/spec/chewy/minitest/search_index_receiver_spec.rb +28 -30
  118. data/spec/chewy/multi_search_spec.rb +4 -5
  119. data/spec/chewy/rake_helper_spec.rb +315 -55
  120. data/spec/chewy/rspec/build_query_spec.rb +34 -0
  121. data/spec/chewy/rspec/helpers_spec.rb +61 -0
  122. data/spec/chewy/rspec/update_index_spec.rb +74 -71
  123. data/spec/chewy/runtime_spec.rb +2 -2
  124. data/spec/chewy/search/loader_spec.rb +19 -53
  125. data/spec/chewy/search/pagination/kaminari_examples.rb +4 -6
  126. data/spec/chewy/search/pagination/kaminari_spec.rb +2 -2
  127. data/spec/chewy/search/parameters/collapse_spec.rb +5 -0
  128. data/spec/chewy/search/parameters/ignore_unavailable_spec.rb +67 -0
  129. data/spec/chewy/search/parameters/indices_spec.rb +26 -117
  130. data/spec/chewy/search/parameters/knn_spec.rb +5 -0
  131. data/spec/chewy/search/parameters/order_spec.rb +18 -11
  132. data/spec/chewy/search/parameters/query_storage_examples.rb +67 -21
  133. data/spec/chewy/search/parameters/search_after_spec.rb +4 -1
  134. data/spec/chewy/search/parameters/source_spec.rb +8 -2
  135. data/spec/chewy/search/parameters/track_total_hits_spec.rb +5 -0
  136. data/spec/chewy/search/parameters_spec.rb +18 -4
  137. data/spec/chewy/search/query_proxy_spec.rb +68 -17
  138. data/spec/chewy/search/request_spec.rb +292 -110
  139. data/spec/chewy/search/response_spec.rb +12 -12
  140. data/spec/chewy/search/scrolling_spec.rb +10 -17
  141. data/spec/chewy/search_spec.rb +40 -34
  142. data/spec/chewy/stash_spec.rb +9 -21
  143. data/spec/chewy/strategy/active_job_spec.rb +16 -16
  144. data/spec/chewy/strategy/atomic_no_refresh_spec.rb +60 -0
  145. data/spec/chewy/strategy/atomic_spec.rb +9 -10
  146. data/spec/chewy/strategy/delayed_sidekiq_spec.rb +208 -0
  147. data/spec/chewy/strategy/lazy_sidekiq_spec.rb +214 -0
  148. data/spec/chewy/strategy/sidekiq_spec.rb +12 -12
  149. data/spec/chewy/strategy_spec.rb +19 -15
  150. data/spec/chewy_spec.rb +24 -107
  151. data/spec/spec_helper.rb +3 -22
  152. data/spec/support/active_record.rb +25 -7
  153. metadata +78 -339
  154. data/.circleci/config.yml +0 -240
  155. data/Appraisals +0 -81
  156. data/gemfiles/rails.5.2.activerecord.gemfile +0 -17
  157. data/gemfiles/rails.5.2.mongoid.6.4.gemfile +0 -17
  158. data/gemfiles/rails.6.0.activerecord.gemfile +0 -17
  159. data/gemfiles/sequel.4.45.gemfile +0 -11
  160. data/lib/chewy/backports/deep_dup.rb +0 -46
  161. data/lib/chewy/backports/duplicable.rb +0 -91
  162. data/lib/chewy/search/pagination/will_paginate.rb +0 -43
  163. data/lib/chewy/search/parameters/types.rb +0 -20
  164. data/lib/chewy/strategy/resque.rb +0 -27
  165. data/lib/chewy/strategy/shoryuken.rb +0 -40
  166. data/lib/chewy/type/actions.rb +0 -43
  167. data/lib/chewy/type/adapter/mongoid.rb +0 -67
  168. data/lib/chewy/type/adapter/sequel.rb +0 -93
  169. data/lib/chewy/type/crutch.rb +0 -32
  170. data/lib/chewy/type/import/bulk_builder.rb +0 -122
  171. data/lib/chewy/type/observe.rb +0 -82
  172. data/lib/chewy/type.rb +0 -120
  173. data/lib/sequel/plugins/chewy_observe.rb +0 -63
  174. data/spec/chewy/search/pagination/will_paginate_examples.rb +0 -63
  175. data/spec/chewy/search/pagination/will_paginate_spec.rb +0 -23
  176. data/spec/chewy/search/parameters/types_spec.rb +0 -5
  177. data/spec/chewy/strategy/resque_spec.rb +0 -46
  178. data/spec/chewy/strategy/shoryuken_spec.rb +0 -70
  179. data/spec/chewy/type/actions_spec.rb +0 -50
  180. data/spec/chewy/type/adapter/mongoid_spec.rb +0 -372
  181. data/spec/chewy/type/adapter/sequel_spec.rb +0 -472
  182. data/spec/chewy/type/import/bulk_builder_spec.rb +0 -194
  183. data/spec/chewy/type/mapping_spec.rb +0 -175
  184. data/spec/chewy/type/observe_spec.rb +0 -137
  185. data/spec/chewy/type/wrapper_spec.rb +0 -100
  186. data/spec/chewy/type_spec.rb +0 -55
  187. data/spec/support/mongoid.rb +0 -93
  188. data/spec/support/sequel.rb +0 -80
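
Reading aid for the `{type → index}` renames listed above: in the README diff that follows, `update_index` references lose their `#type` suffix (for example `'users#user'` becomes `'users'`). The snippet below is a minimal sketch of that string change only. `index_only_reference` is a hypothetical helper written for illustration; it is not part of Chewy and does not appear in either package version. It also does not cover dynamic references such as `books#book_en`, which the README rewrites by hand to `books_en`.

```ruby
# Hypothetical migration helper, NOT part of Chewy.
# Turns a Chewy 6 style "index#type" reference into the index-only
# form used by the 7.x README examples.
def index_only_reference(reference)
  # Split on the first '#' only; anything after it is the old type name.
  index_name, _type_name = reference.to_s.split('#', 2)
  index_name
end

# Print the old and new form for a few references taken from the README diff.
%w[users#user cities#city countries#country users].each do |reference|
  puts "#{reference} -> #{index_only_reference(reference)}"
end
```

A reference that already has no `#` passes through unchanged, so under these assumptions the helper can be applied to a mixed list of old and new strings.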
data/README.md CHANGED
@@ -1,64 +1,16 @@
1
1
  [![Gem Version](https://badge.fury.io/rb/chewy.svg)](http://badge.fury.io/rb/chewy)
2
- [![CircleCI](https://circleci.com/gh/toptal/chewy/tree/master.svg?style=svg)](https://circleci.com/gh/toptal/chewy/tree/master)
2
+ [![GitHub Actions](https://github.com/toptal/chewy/actions/workflows/ruby.yml/badge.svg)](https://github.com/toptal/chewy/actions/workflows/ruby.yml)
3
3
  [![Code Climate](https://codeclimate.com/github/toptal/chewy.svg)](https://codeclimate.com/github/toptal/chewy)
4
4
  [![Inline docs](http://inch-ci.org/github/toptal/chewy.svg?branch=master)](http://inch-ci.org/github/toptal/chewy)
5
5
 
6
6
  # Chewy
7
7
 
8
- Chewy is an ODM (Object Document Mapper), built on top of the [the official Elasticsearch client](https://github.com/elastic/elasticsearch-ruby).
9
-
10
- ## Table of Contents
11
-
12
- * [Why Chewy?](#why-chewy)
13
- * [Installation](#installation)
14
- * [Usage](#usage)
15
- * [Client settings](#client-settings)
16
- * [AWS ElasticSearch configuration](#aws-elastic-search)
17
- * [Index definition](#index-definition)
18
- * [Type default import options](#type-default-import-options)
19
- * [Multi (nested) and object field types](#multi-nested-and-object-field-types)
20
- * [Parent and children types](#parent-and-children-types)
21
- * [Geo Point fields](#geo-point-fields)
22
- * [Crutches™ technology](#crutches-technology)
23
- * [Witchcraft™ technology](#witchcraft-technology)
24
- * [Raw Import](#raw-import)
25
- * [Index creation during import](#index-creation-during-import)
26
- * [Journaling](#journaling)
27
- * [Types access](#types-access)
28
- * [Index manipulation](#index-manipulation)
29
- * [Index update strategies](#index-update-strategies)
30
- * [Nesting](#nesting)
31
- * [Non-block notation](#non-block-notation)
32
- * [Designing your own strategies](#designing-your-own-strategies)
33
- * [Rails application strategies integration](#rails-application-strategies-integration)
34
- * [ActiveSupport::Notifications support](#activesupportnotifications-support)
35
- * [NewRelic integration](#newrelic-integration)
36
- * [Search requests](#search-requests)
37
- * [Composing requests](#composing-requests)
38
- * [Pagination](#pagination)
39
- * [Named scopes](#named-scopes)
40
- * [Scroll API](#scroll-api)
41
- * [Loading objects](#loading-objects)
42
- * [Rake tasks](#rake-tasks)
43
- * [chewy:reset](#chewyreset)
44
- * [chewy:upgrade](#chewyupgrade)
45
- * [chewy:update](#chewyupdate)
46
- * [chewy:sync](#chewysync)
47
- * [chewy:deploy](#chewydeploy)
48
- * [Parallelizing rake tasks](#parallelizing-rake-tasks)
49
- * [chewy:journal](#chewyjournal)
50
- * [RSpec integration](#rspec-integration)
51
- * [Minitest integration](#minitest-integration)
52
- * [Contributing](#contributing)
8
+ Chewy is an ODM (Object Document Mapper), built on top of [the official Elasticsearch client](https://github.com/elastic/elasticsearch-ruby).
53
9
 
54
10
  ## Why Chewy?
55
11
 
56
12
  In this section we'll cover why you might want to use Chewy instead of the official `elasticsearch-ruby` client gem.
57
13
 
58
- * Multi-model indices.
59
-
60
- Index classes are independent from ORM/ODM models. Now, implementing e.g. cross-model autocomplete is much easier. You can just define the index and work with it in an object-oriented style. You can define several types for index - one per indexed model.
61
-
62
14
  * Every index is observable by all the related models.
63
15
 
64
16
  Most of the indexed models are related to other and sometimes it is necessary to denormalize this related data and put at the same object. For example, you need to index an array of tags together with an article. Chewy allows you to specify an updateable index for every model separately - so corresponding articles will be reindexed on any tag update.
@@ -71,7 +23,7 @@ In this section we'll cover why you might want to use Chewy instead of the offic
71
23
 
72
24
  Chewy has an ActiveRecord-style query DSL. It is chainable, mergeable and lazy, so you can produce queries in the most efficient way. It also has object-oriented query and filter builders.
73
25
 
74
- * Support for ActiveRecord, [Mongoid](https://github.com/mongoid/mongoid) and [Sequel](https://github.com/jeremyevans/sequel).
26
+ * Support for ActiveRecord.
75
27
 
76
28
  ## Installation
77
29
 
@@ -91,35 +43,177 @@ Or install it yourself as:
91
43
 
92
44
  ### Ruby
93
45
 
94
- Chewy is compatible with MRI 2.5-3.0¹.
46
+ Chewy is compatible with MRI 3.0-3.2¹.
95
47
 
96
48
  > ¹ Ruby 3 is only supported with Rails 6.1
97
49
 
50
+ ### Elasticsearch compatibility matrix
51
+
52
+ | Chewy version | Elasticsearch version |
53
+ | ------------- | ---------------------------------- |
54
+ | 7.2.x | 7.x |
55
+ | 7.1.x | 7.x |
56
+ | 7.0.x | 6.8, 7.x |
57
+ | 6.0.0 | 5.x, 6.x |
58
+ | 5.x | 5.x, limited support for 1.x & 2.x |
59
+
60
+ **Important:** Chewy doesn't follow SemVer, so you should always
61
+ check the release notes before upgrading. The major version is linked to the
62
+ newest supported Elasticsearch and the minor version bumps may include breaking changes.
63
+
64
+ See our [migration guide](migration_guide.md) for detailed upgrade instructions between
65
+ various Chewy versions.
66
+
67
+ ### Active Record
68
+
69
+ 5.2, 6.0, 6.1 Active Record versions are supported by all Chewy versions.
70
+
71
+ ## Getting Started
72
+
73
+ Chewy provides functionality for Elasticsearch index handling, documents import mappings, index update strategies and chainable query DSL.
74
+
75
+ ### Minimal client setting
76
+
77
+ Create `config/initializers/chewy.rb` with this line:
78
+
79
+ ```ruby
80
+ Chewy.settings = {host: 'localhost:9250'}
81
+ ```
82
+
83
+ And run `rails g chewy:install` to generate `chewy.yml`:
84
+
85
+ ```yaml
86
+ # config/chewy.yml
87
+ # separate environment configs
88
+ test:
89
+ host: 'localhost:9250'
90
+ prefix: 'test'
91
+ development:
92
+ host: 'localhost:9200'
93
+ ```
94
+
98
95
  ### Elasticsearch
99
96
 
100
- Chewy 5 is compatible with Elasticsearch 5.
97
+ Make sure you have Elasticsearch up and running. You can [install](https://www.elastic.co/guide/en/elasticsearch/reference/current/install-elasticsearch.html) it locally, but the easiest way is to use [Docker](https://www.docker.com/get-started):
101
98
 
102
- Chewy 6 is compatible with Elasticsearch 6. See [Migration guide](migration_guide.md).
99
+ ```shell
100
+ $ docker run --rm --name elasticsearch -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" elasticsearch:7.11.1
101
+ ```
103
102
 
104
- Future versions of Chewy will support Elasticsearch 7.
103
+ ### Index
105
104
 
106
- ## Usage
105
+ Create `app/chewy/users_index.rb` with User Index:
107
106
 
108
- ### Client settings
107
+ ```ruby
108
+ class UsersIndex < Chewy::Index
109
+ settings analysis: {
110
+ analyzer: {
111
+ email: {
112
+ tokenizer: 'keyword',
113
+ filter: ['lowercase']
114
+ }
115
+ }
116
+ }
117
+
118
+ index_scope User
119
+ field :first_name
120
+ field :last_name
121
+ field :email, analyzer: 'email'
122
+ end
123
+ ```
124
+
125
+ ### Model
126
+
127
+ Add User model, table and migrate it:
128
+
129
+ ```shell
130
+ $ bundle exec rails g model User first_name last_name email
131
+ $ bundle exec rails db:migrate
132
+ ```
133
+
134
+ Add `update_index` to app/models/user.rb:
135
+
136
+ ```ruby
137
+ class User < ApplicationRecord
138
+ update_index('users') { self }
139
+ end
140
+ ```
141
+
142
+ ### Example of data request
143
+
144
+ 1. Once a record is created (could be done via the Rails console), it creates User index too:
145
+
146
+ ```
147
+ User.create(
148
+ first_name: "test1",
149
+ last_name: "test1",
150
+ email: 'test1@example.com',
151
+ # other fields
152
+ )
153
+ # UsersIndex Import (355.3ms) {:index=>1}
154
+ # => #<User id: 1, first_name: "test1", last_name: "test1", email: "test1@example.com", # other fields>
155
+ ```
156
+
157
+ 2. A query could be exposed at a given `UsersController`:
158
+
159
+ ```ruby
160
+ def search
161
+ @users = UsersIndex.query(query_string: { fields: [:first_name, :last_name, :email, ...], query: search_params[:query], default_operator: 'and' })
162
+ render json: @users.to_json, status: :ok
163
+ end
164
+
165
+ private
166
+
167
+ def search_params
168
+ params.permit(:query, :page, :per)
169
+ end
170
+ ```
171
+
172
+ 3. So a request against `http://localhost:3000/users/search?query=test1@example.com` issuing a response like:
173
+
174
+ ```json
175
+ [
176
+ {
177
+ "attributes":{
178
+ "id":"1",
179
+ "first_name":"test1",
180
+ "last_name":"test1",
181
+ "email":"test1@example.com",
182
+ ...
183
+ "_score":0.9808291,
184
+ "_explanation":null
185
+ },
186
+ "_data":{
187
+ "_index":"users",
188
+ "_type":"_doc",
189
+ "_id":"1",
190
+ "_score":0.9808291,
191
+ "_source":{
192
+ "first_name":"test1",
193
+ "last_name":"test1",
194
+ "email":"test1@example.com",
195
+ ...
196
+ }
197
+ }
198
+ }
199
+ ]
200
+ ```
109
201
 
110
- There are two ways to configure the Chewy client:
202
+ ## Usage and configuration
111
203
 
112
- * via the hash `Chewy.settings`
113
- * via the configuration file `chewy.yml`
204
+ ### Client settings
114
205
 
115
- You can create `chewy.yml` manually or run `rails g chewy:install` to
116
- generate it.
206
+ To configure the Chewy client you need to add `chewy.rb` file with `Chewy.settings` hash:
117
207
 
118
208
  ```ruby
119
209
  # config/initializers/chewy.rb
120
210
  Chewy.settings = {host: 'localhost:9250'} # do not use environments
121
211
  ```
122
212
 
213
+ And add `chewy.yml` configuration file.
214
+
215
+ You can create `chewy.yml` manually or run `rails g chewy:install` to generate it:
216
+
123
217
  ```yaml
124
218
  # config/chewy.yml
125
219
  # separate environment configs
@@ -169,7 +263,7 @@ Chewy.settings = {
169
263
  }
170
264
  ```
171
265
 
172
- ### Index definition
266
+ #### Index definition
173
267
 
174
268
  1. Create `/app/chewy/users_index.rb`
175
269
 
@@ -179,41 +273,38 @@ Chewy.settings = {
179
273
  end
180
274
  ```
181
275
 
182
- 2. Add one or more types mapping
276
+ 2. Define index scope (you can omit this part if you don't need to specify a scope (i.e. use PORO objects for import) or options)
183
277
 
184
278
  ```ruby
185
279
  class UsersIndex < Chewy::Index
186
- define_type User.active # or just model instead_of scope: define_type User
280
+ index_scope User.active # or just model instead_of scope: index_scope User
187
281
  end
188
282
  ```
189
283
 
190
- Newly-defined index type class is accessible via `UsersIndex.user` or `UsersIndex::User`
191
-
192
- 3. Add some type mappings
284
+ 3. Add some mappings
193
285
 
194
286
  ```ruby
195
287
  class UsersIndex < Chewy::Index
196
- define_type User.active.includes(:country, :badges, :projects) do
197
- field :first_name, :last_name # multiple fields without additional options
198
- field :email, analyzer: 'email' # Elasticsearch-related options
199
- field :country, value: ->(user) { user.country.name } # custom value proc
200
- field :badges, value: ->(user) { user.badges.map(&:name) } # passing array values to index
201
- field :projects do # the same block syntax for multi_field, if `:type` is specified
202
- field :title
203
- field :description # default data type is `text`
204
- # additional top-level objects passed to value proc:
205
- field :categories, value: ->(project, user) { project.categories.map(&:name) if user.active? }
206
- end
207
- field :rating, type: 'integer' # custom data type
208
- field :created, type: 'date', include_in_all: false,
209
- value: ->{ created_at } # value proc for source object context
288
+ index_scope User.active.includes(:country, :badges, :projects)
289
+ field :first_name, :last_name # multiple fields without additional options
290
+ field :email, analyzer: 'email' # Elasticsearch-related options
291
+ field :country, value: ->(user) { user.country.name } # custom value proc
292
+ field :badges, value: ->(user) { user.badges.map(&:name) } # passing array values to index
293
+ field :projects do # the same block syntax for multi_field, if `:type` is specified
294
+ field :title
295
+ field :description # default data type is `text`
296
+ # additional top-level objects passed to value proc:
297
+ field :categories, value: ->(project, user) { project.categories.map(&:name) if user.active? }
210
298
  end
299
+ field :rating, type: 'integer' # custom data type
300
+ field :created, type: 'date', include_in_all: false,
301
+ value: ->{ created_at } # value proc for source object context
211
302
  end
212
303
  ```
213
304
 
214
305
  [See here for mapping definitions](https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping.html).
215
306
 
216
- 4. Add some index- and type-related settings. Analyzer repositories might be used as well. See `Chewy::Index.settings` docs for details:
307
+ 4. Add some index-related settings. Analyzer repositories might be used as well. See `Chewy::Index.settings` docs for details:
217
308
 
218
309
  ```ruby
219
310
  class UsersIndex < Chewy::Index
@@ -226,23 +317,22 @@ Chewy.settings = {
226
317
  }
227
318
  }
228
319
 
229
- define_type User.active.includes(:country, :badges, :projects) do
230
- root date_detection: false do
231
- template 'about_translations.*', type: 'text', analyzer: 'standard'
232
-
233
- field :first_name, :last_name
234
- field :email, analyzer: 'email'
235
- field :country, value: ->(user) { user.country.name }
236
- field :badges, value: ->(user) { user.badges.map(&:name) }
237
- field :projects do
238
- field :title
239
- field :description
240
- end
241
- field :about_translations, type: 'object' # pass object type explicitly if necessary
242
- field :rating, type: 'integer'
243
- field :created, type: 'date', include_in_all: false,
244
- value: ->{ created_at }
320
+ index_scope User.active.includes(:country, :badges, :projects)
321
+ root date_detection: false do
322
+ template 'about_translations.*', type: 'text', analyzer: 'standard'
323
+
324
+ field :first_name, :last_name
325
+ field :email, analyzer: 'email'
326
+ field :country, value: ->(user) { user.country.name }
327
+ field :badges, value: ->(user) { user.badges.map(&:name) }
328
+ field :projects do
329
+ field :title
330
+ field :description
245
331
  end
332
+ field :about_translations, type: 'object' # pass object type explicitly if necessary
333
+ field :rating, type: 'integer'
334
+ field :created, type: 'date', include_in_all: false,
335
+ value: ->{ created_at }
246
336
  end
247
337
  end
248
338
  ```
@@ -250,45 +340,38 @@ Chewy.settings = {
250
340
  [See index settings here](https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-update-settings.html).
251
341
  [See root object settings here](https://www.elastic.co/guide/en/elasticsearch/reference/current/dynamic-field-mapping.html).
252
342
 
253
- See [mapping.rb](lib/chewy/type/mapping.rb) for more details.
343
+ See [mapping.rb](lib/chewy/index/mapping.rb) for more details.
254
344
 
255
345
  5. Add model-observing code
256
346
 
257
347
  ```ruby
258
348
  class User < ActiveRecord::Base
259
- update_index('users#user') { self } # specifying index, type and back-reference
349
+ update_index('users') { self } # specifying index and back-reference
260
350
  # for updating after user save or destroy
261
351
  end
262
352
 
263
353
  class Country < ActiveRecord::Base
264
354
  has_many :users
265
355
 
266
- update_index('users#user') { users } # return single object or collection
356
+ update_index('users') { users } # return single object or collection
267
357
  end
268
358
 
269
359
  class Project < ActiveRecord::Base
270
- update_index('users#user') { user if user.active? } # you can return even `nil` from the back-reference
271
- end
272
-
273
- class Badge < ActiveRecord::Base
274
- has_and_belongs_to_many :users
275
-
276
- update_index('users') { users } # if index has only one type
277
- # there is no need to specify updated type
360
+ update_index('users') { user if user.active? } # you can return even `nil` from the back-reference
278
361
  end
279
362
 
280
363
  class Book < ActiveRecord::Base
281
- update_index(->(book) {"books#book_#{book.language}"}) { self } # dynamic index and type with proc.
282
- # For book with language == "en"
283
- # this code will generate `books#book_en`
364
+ update_index(->(book) {"books_#{book.language}"}) { self } # dynamic index name with proc.
365
+ # For book with language == "en"
366
+ # this code will generate `books_en`
284
367
  end
285
368
  ```
286
369
 
287
370
  Also, you can use the second argument for method name passing:
288
371
 
289
372
  ```ruby
290
- update_index('users#user', :self)
291
- update_index('users#user', :users)
373
+ update_index('users', :self)
374
+ update_index('users', :users)
292
375
  ```
293
376
 
294
377
  In the case of a belongs_to association you may need to update both associated objects, previous and current:
@@ -297,47 +380,28 @@ Chewy.settings = {
297
380
  class City < ActiveRecord::Base
298
381
  belongs_to :country
299
382
 
300
- update_index('cities#city') { self }
301
- update_index 'countries#country' do
302
- # For the latest active_record changed values are
303
- # already in `previous_changes` hash,
304
- # but for mongoid you have to use `changes` hash
383
+ update_index('cities') { self }
384
+ update_index 'countries' do
305
385
  previous_changes['country_id'] || country
306
386
  end
307
387
  end
308
388
  ```
309
389
 
310
- You can observe Sequel models in the same way as ActiveRecord:
390
+ ### Default import options
311
391
 
312
- ```ruby
313
- class User < Sequel::Model
314
- update_index('users#user') { self }
315
- end
316
- ```
317
-
318
- However, to make it work, you must load the chewy plugin into Sequel model:
319
-
320
- ```ruby
321
- Sequel::Model.plugin :chewy_observe # for all models, or...
322
- User.plugin :chewy_observe # just for User
323
- ```
324
-
325
- ### Type default import options
326
-
327
- Every type has `default_import_options` configuration to specify, suddenly, default import options:
392
+ Every index has `default_import_options` configuration to specify, suddenly, default import options:
328
393
 
329
394
  ```ruby
330
395
  class ProductsIndex < Chewy::Index
331
- define_type Post.includes(:tags) do
332
- default_import_options batch_size: 100, bulk_size: 10.megabytes, refresh: false
396
+ index_scope Post.includes(:tags)
397
+ default_import_options batch_size: 100, bulk_size: 10.megabytes, refresh: false
333
398
 
334
- field :name
335
- field :tags, value: -> { tags.map(&:name) }
336
- end
399
+ field :name
400
+ field :tags, value: -> { tags.map(&:name) }
337
401
  end
338
402
  ```
339
403
 
340
- See [import.rb](lib/chewy/type/import.rb) for available options.
404
+ See [import.rb](lib/chewy/index/import.rb) for available options.
341
405
 
342
406
  ### Multi (nested) and object field types
343
407
 
@@ -363,18 +427,6 @@ end
363
427
 
364
428
  The `value:` option for internal fields will no longer be effective.
365
429
 
366
- ### Parent and children types
367
-
368
- To define [parent](https://www.elastic.co/guide/en/elasticsearch/guide/current/parent-child-mapping.html) type for a given index_type, you can include root options for the type where you can specify parent_type and parent_id
369
-
370
- ```ruby
371
- define_type User.includes(:account) do
372
- root parent: 'account', parent_id: ->{ account_id } do
373
- field :created_at, type: 'date'
374
- field :task_id, type: 'integer'
375
- end
376
- end
377
- ```
378
430
  ### Geo Point fields
379
431
 
380
432
  You can use [Elasticsearch's geo mapping](https://www.elastic.co/guide/en/elasticsearch/reference/current/geo-point.html) with the `geo_point` field type, allowing you to query, filter and order by latitude and longitude. You can use the following hash format:
@@ -394,20 +446,36 @@ end
394
446
 
395
447
  See the section on *Script fields* for details on calculating distance in a search.
396
448
 
449
+ ### Join fields
450
+
451
+ You can use a [join field](https://www.elastic.co/guide/en/elasticsearch/reference/current/parent-join.html)
452
+ to implement parent-child relationships between documents.
453
+ It [replaces the old `parent_id` based parent-child mapping](https://www.elastic.co/guide/en/elasticsearch/reference/current/removal-of-types.html#parent-child-mapping-types)
454
+
455
+ To use it, you need to pass `relations` and `join` (with `type` and `id`) options:
456
+ ```ruby
457
+ field :hierarchy_link, type: :join, relations: {question: %i[answer comment], answer: :vote, vote: :subvote}, join: {type: :comment_type, id: :commented_id}
458
+ ```
459
+ assuming you have `comment_type` and `commented_id` fields in your model.
460
+
461
+ Note that when you reindex a parent, its children and grandchildren will be reindexed as well.
462
+ This may require additional queries to the primary database and to elastisearch.
463
+
464
+ Also note that the join field doesn't support crutches (it should be a field directly defined on the model).
465
+
397
466
  ### Crutches™ technology
398
467
 
399
468
  Assume you are defining your index like this (product has_many categories through product_categories):
400
469
 
401
470
  ```ruby
402
471
  class ProductsIndex < Chewy::Index
403
- define_type Product.includes(:categories) do
404
- field :name
405
- field :category_names, value: ->(product) { product.categories.map(&:name) } # or shorter just -> { categories.map(&:name) }
406
- end
472
+ index_scope Product.includes(:categories)
473
+ field :name
474
+ field :category_names, value: ->(product) { product.categories.map(&:name) } # or shorter just -> { categories.map(&:name) }
407
475
  end
408
476
  ```
409
477
 
410
- Then the Chewy reindexing flow will look like the following pseudo-code (even in Mongoid):
478
+ Then the Chewy reindexing flow will look like the following pseudo-code:
411
479
 
412
480
  ```ruby
413
481
  Product.includes(:categories).find_in_batches(1000) do |batch|
@@ -419,26 +487,23 @@ Product.includes(:categories).find_in_batches(1000) do |batch|
419
487
  end
420
488
  ```
421
489
 
422
- But in Rails 4.1 and 4.2 you may face a problem with slow associations (take a look at https://github.com/rails/rails/pull/19423). Also, there might be really complicated cases when associations are not applicable.
423
-
424
- Then you can replace Rails associations with Chewy Crutches™ technology:
490
+ If you run into complicated cases where associations are not applicable, you can replace Rails associations with Chewy Crutches™ technology:
425
491
 
426
492
  ```ruby
427
493
  class ProductsIndex < Chewy::Index
428
- define_type Product do
429
- crutch :categories do |collection| # collection here is a current batch of products
430
- # data is fetched with a lightweight query without objects initialization
431
- data = ProductCategory.joins(:category).where(product_id: collection.map(&:id)).pluck(:product_id, 'categories.name')
432
- # then we have to convert fetched data to appropriate format
433
- # this will return our data in structure like:
434
- # {123 => ['sweets', 'juices'], 456 => ['meat']}
435
- data.each.with_object({}) { |(id, name), result| (result[id] ||= []).push(name) }
436
- end
437
-
438
- field :name
439
- # simply use crutch-fetched data as a value:
440
- field :category_names, value: ->(product, crutches) { crutches.categories[product.id] }
494
+ index_scope Product
495
+ crutch :categories do |collection| # collection here is a current batch of products
496
+ # data is fetched with a lightweight query without objects initialization
497
+ data = ProductCategory.joins(:category).where(product_id: collection.map(&:id)).pluck(:product_id, 'categories.name')
498
+ # then we have to convert fetched data to appropriate format
499
+ # this will return our data in structure like:
500
+ # {123 => ['sweets', 'juices'], 456 => ['meat']}
501
+ data.each.with_object({}) { |(id, name), result| (result[id] ||= []).push(name) }
441
502
  end
503
+
504
+ field :name
505
+ # simply use crutch-fetched data as a value:
506
+ field :category_names, value: ->(product, crutches) { crutches[:categories][product.id] }
442
507
  end
443
508
  ```
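
The grouping step inside the crutch block can be tried on its own in plain Ruby. This is a minimal sketch: the `rows` array is made-up sample data standing in for the result of the `pluck` call, and `category_names` is a hypothetical lambda that mimics the hash lookup done by the field's `value:` proc.

```ruby
# Rows shaped like the result of `pluck(:product_id, 'categories.name')`.
# The values are invented for illustration only.
rows = [[123, 'sweets'], [123, 'juices'], [456, 'meat']]

# Group category names by product id, the same way the crutch block does.
categories_by_product = rows.each.with_object({}) do |(id, name), result|
  (result[id] ||= []).push(name)
end

# Hypothetical stand-in for the field value proc: a plain hash lookup per product id.
category_names = ->(product_id) { categories_by_product[product_id] }
```

The point of the design is that the expensive query runs once per batch, while each document only pays for a hash lookup. A product without categories gets `nil` here, because the hash has no entry for its id.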
444
509
 
@@ -460,22 +525,21 @@ So Chewy Crutches™ technology is able to increase your indexing performance in
460
525
 
461
526
  ### Witchcraft™ technology
462
527
 
463
- One more experimental technology to increase import performance. As far as you know, chewy defines value proc for every imported field in mapping, so at the import time each of this procs is executed on imported object to extract result document to import. It would be great for performance to use one huge whole-document-returning proc instead. So basically the idea or Witchcraft™ technology is to compile a single document-returning proc from the type definition.
528
+ One more experimental technology to increase import performance. As you may know, Chewy defines a value proc for every imported field in the mapping, so at import time each of these procs is executed on the imported object to build the resulting document. It would be great for performance to use one huge whole-document-returning proc instead. So the basic idea of Witchcraft™ technology is to compile a single document-returning proc from the index definition.
464
529
 
465
530
  ```ruby
466
- define_type Product do
467
- witchcraft!
468
-
469
- field :title
470
- field :tags, value: -> { tags.map(&:name) }
471
- field :categories do
472
- field :name, value: -> (product, category) { category.name }
473
- field :type, value: -> (product, category, crutch) { crutch.types[category.name] }
474
- end
531
+ index_scope Product
532
+ witchcraft!
533
+
534
+ field :title
535
+ field :tags, value: -> { tags.map(&:name) }
536
+ field :categories do
537
+ field :name, value: -> (product, category) { category.name }
538
+ field :type, value: -> (product, category, crutch) { crutch.types[category.name] }
475
539
  end
476
540
  ```
477
541
 
478
- The type definition above will be compiled to something close to:
542
+ The index definition above will be compiled to something close to:
479
543
 
480
544
  ```ruby
481
545
  -> (object, crutches) do
@@ -505,7 +569,7 @@ Obviously not every type of definition might be compiled. There are some restric
505
569
  end
506
570
  ```
507
571
 
508
- However, it is quite possible that your type definition will be supported by Witchcraft™ technology out of the box in the most of the cases.
572
+ However, it is quite possible that your index definition will be supported by Witchcraft™ technology out of the box in most cases.
509
573
 
510
574
  ### Raw Import
511
575
 
@@ -532,13 +596,12 @@ class LightweightProduct
532
596
  end
533
597
  end
534
598
 
535
- define_type Product do
536
- default_import_options raw_import: ->(hash) {
537
- LightweightProduct.new(hash)
538
- }
599
+ index_scope Product
600
+ default_import_options raw_import: ->(hash) {
601
+ LightweightProduct.new(hash)
602
+ }
539
603
 
540
- field :created_at, 'datetime'
541
- end
604
+ field :created_at, 'datetime'
542
605
  ```
543
606
 
544
607
  Also, you can pass `:raw_import` option to the `import` method explicitly.
@@ -549,6 +612,24 @@ By default, when you perform import Chewy checks whether an index exists and cre
549
612
  You can turn off this feature to decrease Elasticsearch hits count.
550
613
  To do so you need to set the `skip_index_creation_on_import` parameter to `true` in your `config/chewy.yml`
551
614
 
615
+ ### Skip record fields during import
616
+
617
+ You can use `ignore_blank: true` to skip fields that return `true` for the `.blank?` method:
618
+
619
+ ```ruby
620
+ index_scope Country
621
+ field :id
622
+ field :cities, ignore_blank: true do
623
+ field :id
624
+ field :name
625
+ field :surname, ignore_blank: true
626
+ field :description
627
+ end
628
+ ```
629
+
630
+ #### Default values for different types
631
+
632
+ By default `ignore_blank` is false on every type except `geo_point`.
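
The effect of `ignore_blank: true` on a single document can be pictured with a plain-Ruby sketch. This only illustrates the idea and is not Chewy's implementation: `blank_value?` is a hypothetical approximation of ActiveSupport's `blank?`, and `compact_blank_fields` is a hypothetical helper invented for this example.

```ruby
# Hypothetical approximation of ActiveSupport's `blank?`, so the sketch runs without Rails.
def blank_value?(value)
  return true if value.nil? || value == false
  return value.strip.empty? if value.is_a?(String)

  value.respond_to?(:empty?) ? value.empty? : false
end

# Hypothetical helper: drop only those keys that are marked as ignore_blank
# and whose values are blank. Other keys are kept even when blank.
def compact_blank_fields(document, ignore_blank_keys)
  document.reject { |key, value| ignore_blank_keys.include?(key) && blank_value?(value) }
end

document = { id: 1, cities: [], description: '' }
compacted = compact_blank_fields(document, [:cities])
```

In this sketch `cities` is dropped because it is marked and empty, while `description` stays in the document because it was not marked.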
552
633
 
553
634
  ### Journaling
554
635
 
@@ -562,7 +643,6 @@ Common journal record looks like this:
562
643
  "action": "index",
563
644
  "object_id": [1, 2, 3],
564
645
  "index_name": "...",
565
- "type_name": "...",
566
646
  "created_at": "<timestamp>"
567
647
  }
568
648
  ```
@@ -588,28 +668,16 @@ Or as a default import option for an index:
588
668
 
589
669
  ```ruby
590
670
  class CityIndex
591
- define_type City do
592
- default_import_options journal: true
593
- end
671
+ index_scope City
672
+ default_import_options journal: true
594
673
  end
595
674
  ```
596
675
 
597
676
  You may be wondering why you need it. The answer is simple: not to lose the data.
598
677
 
599
- Imagine that you reset your index in a zero-downtime manner (to separate index), and at the meantime somebody keeps updating the data frequently (to old index). So all these actions will be written to the journal index and you'll be able to apply them after index reset using the `Chewy::Journal` interface.
600
-
601
- ### Types access
678
+ Imagine that you reset your index in a zero-downtime manner (to separate index), and in the meantime somebody keeps updating the data frequently (to old index). So all these actions will be written to the journal index and you'll be able to apply them after index reset using the `Chewy::Journal` interface.
602
679
 
603
- You can access index-defined types with the following API:
604
-
605
- ```ruby
606
- UsersIndex::User # => UsersIndex::User
607
- UsersIndex.type_hash['user'] # => UsersIndex::User
608
- UsersIndex.type('user') # => UsersIndex::User
609
- UsersIndex.type('foo') # => raises error UndefinedType("Unknown type in UsersIndex: foo")
610
- UsersIndex.types # => [UsersIndex::User]
611
- UsersIndex.type_names # => ['user']
612
- ```
680
+ When enabled, the journal can grow to an enormous size; consider setting up a cron job that cleans it periodically using the [`chewy:journal:clean` rake task](#chewyjournal).
613
681
 
614
682
  ### Index manipulation
615
683
 
@@ -623,25 +691,22 @@ UsersIndex.create! # use bang or non-bang methods
623
691
  UsersIndex.purge
624
692
  UsersIndex.purge! # deletes then creates index
625
693
 
626
- UsersIndex::User.import # import with 0 arguments process all the data specified in type definition
627
- # literally, User.active.includes(:country, :badges, :projects).find_in_batches
628
- UsersIndex::User.import User.where('rating > 100') # or import specified users scope
629
- UsersIndex::User.import User.where('rating > 100').to_a # or import specified users array
630
- UsersIndex::User.import [1, 2, 42] # pass even ids for import, it will be handled in the most effective way
631
- UsersIndex::User.import User.where('rating > 100'), update_fields: [:email] # if update fields are specified - it will update their values only with the `update` bulk action.
694
+ UsersIndex.import # import with 0 arguments processes all the data specified in the index_scope definition
695
+ UsersIndex.import User.where('rating > 100') # or import specified users scope
696
+ UsersIndex.import User.where('rating > 100').to_a # or import specified users array
697
+ UsersIndex.import [1, 2, 42] # pass even ids for import, it will be handled in the most effective way
698
+ UsersIndex.import User.where('rating > 100'), update_fields: [:email] # if update fields are specified - it will update their values only with the `update` bulk action
699
+ UsersIndex.import! # raises an exception in case of any import errors
632
700
 
633
- UsersIndex.import # import every defined type
634
- UsersIndex.import user: User.where('rating > 100') # import only active users to `user` type.
635
- # Other index types, if exists, will be imported with default scope from the type definition.
636
701
  UsersIndex.reset! # purges index and imports default data for all types
637
702
  ```
638
703
 
639
- If the passed user is `#destroyed?`, or satisfies a `delete_if` type option, or the specified id does not exist in the database, import will perform delete from index action for this object.
704
+ If the passed user is `#destroyed?`, or satisfies a `delete_if` index_scope option, or the specified id does not exist in the database, import will perform delete from index action for this object.
640
705
 
641
706
  ```ruby
642
- define_type User, delete_if: :deleted_at
643
- define_type User, delete_if: -> { deleted_at }
644
- define_type User, delete_if: ->(user) { user.deleted_at }
707
+ index_scope User, delete_if: :deleted_at
708
+ index_scope User, delete_if: -> { deleted_at }
709
+ index_scope User, delete_if: ->(user) { user.deleted_at }
645
710
  ```
646
711
 
647
712
  See [actions.rb](lib/chewy/index/actions.rb) for more details.
@@ -652,13 +717,12 @@ Assume you've got the following code:
652
717
 
653
718
  ```ruby
654
719
  class City < ActiveRecord::Base
655
- update_index 'cities#city', :self
720
+ update_index 'cities', :self
656
721
  end
657
722
 
658
723
  class CitiesIndex < Chewy::Index
659
- define_type City do
660
- field :name
661
- end
724
+ index_scope City
725
+ field :name
662
726
  end
663
727
  ```
664
728
 
@@ -678,22 +742,29 @@ end
678
742
 
679
743
  Using this strategy delays the index update request until the end of the block. Updated records are aggregated and the index update happens with the bulk API. So this strategy is highly optimized.
680
744
 
681
- #### `:resque`
745
+ #### `:sidekiq`
682
746
 
683
- This does the same thing as `:atomic`, but asynchronously using resque. The default queue name is `chewy`. Patch `Chewy::Strategy::Resque::Worker` for index updates improving.
747
+ This does the same thing as `:atomic`, but asynchronously using sidekiq. Patch `Chewy::Strategy::Sidekiq::Worker` for index updates improving.
684
748
 
685
749
  ```ruby
686
- Chewy.strategy(:resque) do
750
+ Chewy.strategy(:sidekiq) do
687
751
  City.popular.map(&:do_some_update_action!)
688
752
  end
689
753
  ```
690
754
 
691
- #### `:sidekiq`
755
+ The default queue name is `chewy`, you can customize it in settings: `sidekiq.queue_name`
756
+ ```
757
+ Chewy.settings[:sidekiq] = {queue: :low}
758
+ ```
692
759
 
693
- This does the same thing as `:atomic`, but asynchronously using sidekiq. Patch `Chewy::Strategy::Sidekiq::Worker` for index updates improving.
760
+ #### `:lazy_sidekiq`
761
+
762
+ This does the same thing as `:sidekiq`, but with lazy evaluation. Beware that it does not allow you to use any non-persistent record state for indices and conditions, because the record will be re-fetched from the database asynchronously using sidekiq. However, for destroyed records the strategy will fall back to `:sidekiq`, because it's not possible to re-fetch deleted records from the database.
763
+
764
+ The purpose of this strategy is to improve the response time of the code that should update indexes, as it does not only defer actual ES calls to a background job but `update_index` callbacks evaluation (for created and updated objects) too. Similar to `:sidekiq`, index update is asynchronous so this strategy cannot be used when data and index synchronization is required.
694
765
 
695
766
  ```ruby
696
- Chewy.strategy(:sidekiq) do
767
+ Chewy.strategy(:lazy_sidekiq) do
697
768
  City.popular.map(&:do_some_update_action!)
698
769
  end
699
770
  ```
@@ -703,31 +774,104 @@ The default queue name is `chewy`, you can customize it in settings: `sidekiq.qu
703
774
  Chewy.settings[:sidekiq] = {queue: :low}
704
775
  ```
705
776
 
706
- #### `:active_job`
777
+ #### `:delayed_sidekiq`
707
778
 
708
- This does the same thing as `:atomic`, but using ActiveJob. This will inherit the ActiveJob configuration settings including the `active_job.queue_adapter` setting for the environment. Patch `Chewy::Strategy::ActiveJob::Worker` for index updates improving.
779
+ It accumulates IDs of records to be reindexed during the latency window in Redis and then performs the reindexing of all accumulated records at once.
780
+ This strategy is very useful in the case of frequently mutated records.
781
+ It supports the `update_fields` option, so it will attempt to select just enough data from the database.
709
782
 
783
+ Keep in mind, this strategy does not guarantee reindexing in the event of Sidekiq worker termination or an error during the reindexing phase.
784
+ This behavior is intentional to prevent continuous growth of Redis db.
785
+
786
+ The following options can be defined in the index:
710
787
  ```ruby
711
- Chewy.strategy(:active_job) do
788
+ class CitiesIndex...
789
+ strategy_config delayed_sidekiq: {
790
+ latency: 3,
791
+ margin: 2,
792
+ ttl: 60 * 60 * 24,
793
+ reindex_wrapper: ->(&reindex) {
794
+ ActiveRecord::Base.connected_to(role: :reading) { reindex.call }
795
+ }
796
+ # latency - will prevent scheduling identical jobs
797
+ # margin - main purpose is to cover db replication lag by the margin
798
+ # ttl - a chunk expiration time (in seconds)
799
+ # reindex_wrapper - lambda that accepts a block and wraps the reindex process, e.g. in an AR connection block
800
+ }
801
+
802
+ ...
803
+ end
804
+ ```
805
+
806
+ Also you can define defaults in the `initializers/chewy.rb`
807
+ ```ruby
808
+ Chewy.settings = {
809
+ strategy_config: {
810
+ delayed_sidekiq: {
811
+ latency: 3,
812
+ margin: 2,
813
+ ttl: 60 * 60 * 24,
814
+ reindex_wrapper: ->(&reindex) {
815
+ ActiveRecord::Base.connected_to(role: :reading) { reindex.call }
816
+ }
817
+ }
818
+ }
819
+ }
820
+
821
+ ```
822
+ or in `config/chewy.yml`
823
+ ```yaml
824
+ strategy_config:
825
+ delayed_sidekiq:
826
+ latency: 3
827
+ margin: 2
828
+ ttl: <%= 60 * 60 * 24 %>
829
+ # reindex_wrapper setting is not possible here!!! use the initializer instead
830
+ ```
831
+
832
+ You can use the strategy identically to other strategies
833
+ ```ruby
834
+ Chewy.strategy(:delayed_sidekiq) do
712
835
  City.popular.map(&:do_some_update_action!)
713
836
  end
714
837
  ```
715
838
 
716
- The default queue name is `chewy`, you can customize it in settings: `active_job.queue_name`
839
+ The default queue name is `chewy`, you can customize it in settings: `sidekiq.queue_name`
717
840
  ```
718
- Chewy.settings[:active_job] = {queue: :low}
841
+ Chewy.settings[:sidekiq] = {queue: :low}
842
+ ```
843
+
844
+ Explicit call of the reindex using `:delayed_sidekiq` strategy
845
+ ```ruby
846
+ CitiesIndex.import([1, 2, 3], strategy: :delayed_sidekiq)
847
+ ```
848
+
849
+ Explicit call of the reindex using `:delayed_sidekiq` strategy with `:update_fields` support
850
+ ```ruby
851
+ CitiesIndex.import([1, 2, 3], update_fields: [:name], strategy: :delayed_sidekiq)
719
852
  ```
720
853
 
721
- #### `:shoryuken`
854
+ When running tests with the `delayed_sidekiq` strategy while Sidekiq uses a real Redis instance that is NOT cleaned up between tests (e.g. via `Sidekiq.redis(&:flushdb)`), you'll want to clean up some Redis keys between tests to avoid state leaking and flaky tests. Chewy provides a convenience method for that:
855
+ ```ruby
856
+ # it might be a good idea to also add to your testing setup, e.g.: a rspec `before` hook
857
+ Chewy::Strategy::DelayedSidekiq.clear_timechunks!
858
+ ```
722
859
 
723
- This does the same thing as `:atomic`, but asynchronously using shoryuken. Patch `Chewy::Strategy::Shoryuken::Worker` for index updates improving.
860
+ #### `:active_job`
861
+
862
+ This does the same thing as `:atomic`, but using ActiveJob. This will inherit the ActiveJob configuration settings including the `active_job.queue_adapter` setting for the environment. Patch `Chewy::Strategy::ActiveJob::Worker` for index updates improving.
724
863
 
725
864
  ```ruby
726
- Chewy.strategy(:shoryuken) do
865
+ Chewy.strategy(:active_job) do
727
866
  City.popular.map(&:do_some_update_action!)
728
867
  end
729
868
  ```
730
869
 
870
+ The default queue name is `chewy`, you can customize it in settings: `active_job.queue_name`
871
+ ```
872
+ Chewy.settings[:active_job] = {queue: :low}
873
+ ```
874
+
731
875
  #### `:urgent`
732
876
 
733
877
  The following strategy is convenient if you are going to update documents in your index one by one.
@@ -749,7 +893,9 @@ It is convenient for use in e.g. the Rails console with non-block notation:
749
893
 
750
894
  #### `:bypass`
751
895
 
752
- The bypass strategy simply silences index updates.
896
+ When the bypass strategy is active the index will not be automatically updated on object save.
897
+
898
+ For example, on `City.first.save!` the cities index would not be updated.
753
899
 
754
900
  #### Nesting
755
901
 
@@ -803,6 +949,12 @@ RSpec.configure do |config|
803
949
  end
804
950
  ```
805
951
 
952
+ ### Elasticsearch client options
953
+
954
+ All connection options, except the `:prefix`, are passed to the `Elasticsearch::Client.new` ([chewy/lib/chewy.rb](https://github.com/toptal/chewy/blob/f5bad9f83c21416ac10590f6f34009c645062e89/lib/chewy.rb#L153-L160)).
955
+
956
+ Here's the relevant Elasticsearch documentation on the subject: https://rubydoc.info/gems/elasticsearch-transport#setting-hosts
957
+
806
958
  ### `ActiveSupport::Notifications` support
807
959
 
808
960
  Chewy emits notifications for the following events:
@@ -814,14 +966,14 @@ Chewy has notifying the following events:
814
966
 
815
967
  #### `import_objects.chewy` payload
816
968
 
817
- * `payload[:type]`: currently imported type
969
+ * `payload[:index]`: currently imported index name
818
970
  * `payload[:import]`: imports stats, total imported and deleted objects count:
819
971
 
820
972
  ```ruby
821
973
  {index: 30, delete: 5}
822
974
  ```
823
975
 
824
- * `payload[:errors]`: might not exists. Contains grouped errors with objects ids list:
976
+ * `payload[:errors]`: might not exist. Contains grouped errors with objects ids list:
825
977
 
826
978
  ```ruby
827
979
  {index: {
@@ -918,37 +1070,42 @@ Quick introduction.
918
1070
 
919
1071
  #### Composing requests
920
1072
 
921
- The request DSL have the same chainable nature as AR or Mongoid ones. The main class is `Chewy::Search::Request`. It is possible to perform requests on behalf of indices or types:
1073
+ The request DSL have the same chainable nature as AR. The main class is `Chewy::Search::Request`.
922
1074
 
923
1075
  ```ruby
924
- PlaceIndex.query(match: {name: 'London'}) # returns documents of any type
925
- PlaceIndex::City.query(match: {name: 'London'}) # returns cities only.
1076
+ CitiesIndex.query(match: {name: 'London'})
926
1077
  ```
927
1078
 
928
- Main methods of the request DSL are: `query`, `filter` and `post_filter`, it is possible to pass pure query hashes or use `elasticsearch-dsl`. Also, there is an additional
1079
+ The main methods of the request DSL are `query`, `filter` and `post_filter`. It is possible to pass pure query hashes or use `elasticsearch-dsl`.
929
1080
 
930
1081
  ```ruby
931
- PlaceIndex
1082
+ CitiesIndex
932
1083
  .filter(term: {name: 'Bangkok'})
933
- .query { match name: 'London' }
1084
+ .query(match: {name: 'London'})
934
1085
  .query.not(range: {population: {gt: 1_000_000}})
935
1086
  ```
936
1087
 
937
- See https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl.html and https://github.com/elastic/elasticsearch-ruby/tree/master/elasticsearch-dsl for more details.
1088
+ You can query a set of indexes at once:
1089
+
1090
+ ```ruby
1091
+ CitiesIndex.indices(CountriesIndex).query(match: {name: 'Some'})
1092
+ ```
1093
+
1094
+ See https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl.html and https://github.com/elastic/elasticsearch-dsl-ruby for more details.
938
1095
 
939
1096
  An important part of requests manipulation is merging. There are 4 methods to perform it: `merge`, `and`, `or`, `not`. See [Chewy::Search::QueryProxy](lib/chewy/search/query_proxy.rb) for details. Also, `only` and `except` methods help to remove unneeded parts of the request.
940
1097
 
941
1098
  Every other request part is covered by a bunch of additional methods, see [Chewy::Search::Request](lib/chewy/search/request.rb) for details:
942
1099
 
943
1100
  ```ruby
944
- PlaceIndex.limit(10).offset(30).order(:name, {population: {order: :desc}})
1101
+ CitiesIndex.limit(10).offset(30).order(:name, {population: {order: :desc}})
945
1102
  ```
946
1103
 
947
1104
  Request DSL also provides additional scope actions, like `delete_all`, `exists?`, `count`, `pluck`, etc.
948
1105
 
949
1106
  #### Pagination
950
1107
 
951
- The request DSL supports pagination with `Kaminari` and `WillPaginate`. An appropriate extension is enabled on initializtion if any of libraries is available. See [Chewy::Search](lib/chewy/search.rb) and [Chewy::Search::Pagination](lib/chewy/search/pagination/) namespace for details.
1108
+ The request DSL supports pagination with `Kaminari`. An extension is enabled on initialization if `Kaminari` is available. See [Chewy::Search](lib/chewy/search.rb) and [Chewy::Search::Pagination::Kaminari](lib/chewy/search/pagination/kaminari.rb) for details.
952
1109
 
953
1110
  #### Named scopes
954
1111
 
@@ -967,8 +1124,8 @@ See [Chewy::Search::Scrolling](lib/chewy/search/scrolling.rb) for details.
967
1124
  It is possible to load ORM/ODM source objects with the `objects` method. To provide additional loading options use `load` method:
968
1125
 
969
1126
  ```ruby
970
- PlacesIndex.load(scope: -> { active }).to_a # to_a returns `Chewy::Type` wrappers.
971
- PlacesIndex.load(scope: -> { active }).objects # An array of AR source objects.
1127
+ CitiesIndex.load(scope: -> { active }).to_a # to_a returns `Chewy::Index` wrappers.
1128
+ CitiesIndex.load(scope: -> { active }).objects # An array of AR source objects.
972
1129
  ```
973
1130
 
974
1131
  See [Chewy::Search::Loader](lib/chewy/search/loader.rb) for more details.
@@ -976,7 +1133,7 @@ See [Chewy::Search::Loader](lib/chewy/search/loader.rb) for more details.
976
1133
  In case when it is necessary to iterate through both of the wrappers and objects simultaneously, `object_hash` method helps a lot:
977
1134
 
978
1135
  ```ruby
979
- scope = PlacesIndex.load(scope: -> { active })
1136
+ scope = CitiesIndex.load(scope: -> { active })
980
1137
  scope.each do |wrapper|
981
1138
  scope.object_hash[wrapper]
982
1139
  end
@@ -993,8 +1150,8 @@ Performs zero-downtime reindexing as described [here](https://www.elastic.co/blo
993
1150
  ```bash
994
1151
  rake chewy:reset # resets all the existing indices
995
1152
  rake chewy:reset[users] # resets UsersIndex only
996
- rake chewy:reset[users,places] # resets UsersIndex and PlacesIndex
997
- rake chewy:reset[-users,places] # resets every index in the application except specified ones
1153
+ rake chewy:reset[users,cities] # resets UsersIndex and CitiesIndex
1154
+ rake chewy:reset[-users,cities] # resets every index in the application except specified ones
998
1155
  ```
999
1156
 
1000
1157
  #### `chewy:upgrade`
@@ -1009,48 +1166,50 @@ See [Chewy::Stash::Specification](lib/chewy/stash.rb) and [Chewy::Index::Specifi
1009
1166
  ```bash
1010
1167
  rake chewy:upgrade # upgrades all the existing indices
1011
1168
  rake chewy:upgrade[users] # upgrades UsersIndex only
1012
- rake chewy:upgrade[users,places] # upgrades UsersIndex and PlacesIndex
1013
- rake chewy:upgrade[-users,places] # upgrades every index in the application except specified ones
1169
+ rake chewy:upgrade[users,cities] # upgrades UsersIndex and CitiesIndex
1170
+ rake chewy:upgrade[-users,cities] # upgrades every index in the application except specified ones
1014
1171
  ```
1015
1172
 
1016
1173
  #### `chewy:update`
1017
1174
 
1018
1175
  It doesn't create indexes, it simply imports everything to the existing ones and fails if the index was not created before.
1019
1176
 
1020
- Unlike `reset` or `upgrade` tasks, it is possible to pass type references to update the particular type. In index name is passed without the type specified, it will update all the types defined for this index.
1021
-
1022
1177
  ```bash
1023
1178
  rake chewy:update # updates all the existing indices
1024
1179
  rake chewy:update[users] # updates UsersIndex only
1025
- rake chewy:update[users,places#city] # updates the whole UsersIndex and PlacesIndex::City type
1026
- rake chewy:update[-users,places#city] # updates every index in the application except every type defined in UsersIndex and the rest of the types defined in PlacesIndex
1180
+ rake chewy:update[users,cities] # updates UsersIndex and CitiesIndex
1181
+ rake chewy:update[-users,cities] # updates every index in the application except UsersIndex and CitiesIndex
1027
1182
  ```
1028
1183
 
1029
1184
  #### `chewy:sync`
1030
1185
 
1031
- Provides a way to synchronize outdated indexes with the source quickly and without doing a full reset.
1186
+ Provides a way to synchronize outdated indexes with the source quickly and without doing a full reset. By default field `updated_at` is used to find outdated records, but this could be customized by `outdated_sync_field` as described at [Chewy::Index::Syncer](lib/chewy/index/syncer.rb).
1032
1187
 
1033
- Arguments are similar to the ones taken by `chewy:update` task. It is possible to specify a particular type or a whole index.
1188
+ Arguments are similar to the ones taken by `chewy:update` task.
1034
1189
 
1035
- See [Chewy::Type::Syncer](lib/chewy/type/syncer.rb) for more details.
1190
+ See [Chewy::Index::Syncer](lib/chewy/index/syncer.rb) for more details.
1036
1191
 
1037
1192
  ```bash
1038
1193
  rake chewy:sync # synchronizes all the existing indices
1039
1194
  rake chewy:sync[users] # synchronizes UsersIndex only
1040
- rake chewy:sync[users,places#city] # synchronizes the whole UsersIndex and PlacesIndex::City type
1041
- rake chewy:sync[-users,places#city] # synchronizes every index in the application except every type defined in UsersIndex and the rest of the types defined in PlacesIndex
1195
+ rake chewy:sync[users,cities] # synchronizes UsersIndex and CitiesIndex
1196
+ rake chewy:sync[-users,cities] # synchronizes every index in the application except UsersIndex and CitiesIndex
1042
1197
  ```
1043
1198
 
1044
1199
  #### `chewy:deploy`
1045
1200
 
1046
1201
  This rake task is especially useful during the production deploy. It is a combination of `chewy:upgrade` and `chewy:sync` and the latter is called only for the indexes that were not reset during the first stage.
1047
1202
 
1048
- It is not possible to specify any particular types/indexes for this task as it doesn't make much sense.
1203
+ It is not possible to specify any particular indexes for this task as it doesn't make much sense.
1049
1204
 
1050
1205
  Right now the approach is that if some data had been updated, but index definition was not changed (no changes satisfying the synchronization algorithm were done), it would be much faster to perform manual partial index update inside data migrations or even manually after the deploy.
1051
1206
 
1052
1207
  Also, there is always full reset alternative with `rake chewy:reset`.
1053
1208
 
1209
+ #### `chewy:create_missing_indexes`
1210
+
1211
+ This rake task creates newly defined indexes in Elasticsearch and skips existing ones. Useful for production-like environments.
1212
+
1054
1213
  #### Parallelizing rake tasks
1055
1214
 
1056
1215
  Every task described above has its own parallel version. Every parallel rake task takes the number of processes to run as the first argument; the rest of the arguments are exactly the same as for the non-parallel task version.
@@ -1062,23 +1221,43 @@ If the number of processes is not specified explicitly - `parallel` gem tries to
1062
1221
  ```bash
1063
1222
  rake chewy:parallel:reset
1064
1223
  rake chewy:parallel:upgrade[4]
1065
- rake chewy:parallel:update[4,places#city]
1224
+ rake chewy:parallel:update[4,cities]
1066
1225
  rake chewy:parallel:sync[4,-users]
1067
1226
  rake chewy:parallel:deploy[4] # performs parallel upgrade and parallel sync afterwards
1068
1227
  ```
1069
1228
 
1070
1229
  #### `chewy:journal`
1071
1230
 
1072
- This namespace contains two tasks for the journal manipulations: `chewy:journal:apply` and `chewy:journal:clean`. Both are taking time as the first argument (optional for clean) and a list of indexes/types exactly as the tasks above. Time can be in any format parsable by ActiveSupport.
1231
+ This namespace contains two tasks for the journal manipulations: `chewy:journal:apply` and `chewy:journal:clean`. Both are taking time as the first argument (optional for clean) and a list of indexes exactly as the tasks above. Time can be in any format parsable by ActiveSupport.
1073
1232
 
1074
1233
  ```bash
1075
1234
  rake chewy:journal:apply["$(date -v-1H -u +%FT%TZ)"] # apply journaled changes for the past hour
1076
1235
  rake chewy:journal:apply["$(date -v-1H -u +%FT%TZ)",users] # apply journaled changes for the past hour on UsersIndex only
1077
1236
  ```
1078
1237
 
1238
+ When the size of the journal becomes very large, the classical way of deletion would be obstructive and resource consuming. Fortunately, Chewy internally uses [delete-by-query](https://www.elastic.co/guide/en/elasticsearch/reference/7.17/docs-delete-by-query.html#docs-delete-by-query-task-api) ES function which supports async execution with batching and [throttling](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-delete-by-query.html#docs-delete-by-query-throttle).
1239
+
1240
+ The available options, which can be set by ENV variables, are listed below:
1241
+ * `WAIT_FOR_COMPLETION` - a boolean flag that controls async execution. The task waits for completion by default. When set to `false` (`0`, `f`, `false` or `off`, in any letter case, is accepted as `false`), Elasticsearch performs some preflight checks, launches the request, and returns a task reference you can use to cancel the task or get its status.
1242
+ * `REQUESTS_PER_SECOND` - float. The throttle for this request in sub-requests per second. No throttling is enforced by default.
1243
+ * `SCROLL_SIZE` - integer. The number of documents deleted in a single sub-request. The default batch size is 1000.
1244
+
1245
+ ```bash
1246
+ rake chewy:journal:clean WAIT_FOR_COMPLETION=false REQUESTS_PER_SECOND=10 SCROLL_SIZE=5000
1247
+ ```
1248
+
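
The accepted spellings of `false` for `WAIT_FOR_COMPLETION` can be illustrated with a small sketch. The helper name `wait_for_completion?` and the handling of an empty value are assumptions made for illustration; this is not Chewy's actual implementation:

```ruby
# Values treated as false, per the option description above.
FALSE_VALUES = %w[0 f false off].freeze

# Hypothetical helper (not Chewy's code): interprets a raw ENV value.
# Assumption: an unset or empty value means "wait", matching the documented default.
def wait_for_completion?(raw)
  return true if raw.nil? || raw.strip.empty?

  !FALSE_VALUES.include?(raw.strip.downcase)
end

puts wait_for_completion?(ENV['WAIT_FOR_COMPLETION'])
```

With this reading, `WAIT_FOR_COMPLETION=OFF` and `WAIT_FOR_COMPLETION=f` both request asynchronous execution, while any other non-empty value keeps the default waiting behavior.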
1079
1249
  ### RSpec integration
1080
1250
 
1081
- Just add `require 'chewy/rspec'` to your spec_helper.rb and you will get additional features: See [update_index.rb](lib/chewy/rspec/update_index.rb) for more details.
1251
+ Just add `require 'chewy/rspec'` to your spec_helper.rb and you will get additional features:
1252
+
1253
+ * [update_index](lib/chewy/rspec/update_index.rb) helper
1254
+ * `mock_elasticsearch_response` helper to mock an Elasticsearch response
1255
+ * `mock_elasticsearch_response_sources` helper to mock Elasticsearch response sources
1256
+ * `build_query` matcher to compare a request with the expected query (returns `true`/`false`)
1257
+
1258
+ To use the `mock_elasticsearch_response` and `mock_elasticsearch_response_sources` helpers, add `include Chewy::Rspec::Helpers` to your tests.
1259
+
1260
+ See [chewy/rspec/](lib/chewy/rspec/) for more details.
1082
1261
 
1083
1262
  ### Minitest integration
1084
1263
 
@@ -1088,6 +1267,14 @@ Since you can set `:bypass` strategy for test suites and manually handle import
1088
1267
 
1089
1268
  If you need chewy to index and update models as usual in your test suite, you can specify the `:urgent` strategy for document indexing. Add `Chewy.strategy(:urgent)` to test_helper.rb.
1090
1269
 
1270
+ Also, you can use additional helpers:
1271
+
1272
+ * `mock_elasticsearch_response` to mock an Elasticsearch response
1273
+ * `mock_elasticsearch_response_sources` to mock Elasticsearch response sources
1274
+ * `assert_elasticsearch_query` to compare a request with the expected query (returns `true`/`false`)
1275
+
1276
+ See [chewy/minitest/](lib/chewy/minitest/) for more details.
1277
+
1091
1278
  ### DatabaseCleaner
1092
1279
 
1093
1280
  If you use `DatabaseCleaner` in your tests with [the `transaction` strategy](https://github.com/DatabaseCleaner/database_cleaner#how-to-use), you may find that `ActiveRecord` models are not indexed automatically on save, even though you set up the callbacks to do this with the `update_index` method. The issue arises because `chewy` indexes data in `after_commit` callbacks by default, and `after_commit` callbacks are not run with `DatabaseCleaner`'s `transaction` strategy. You can solve this issue by changing the `Chewy.use_after_commit_callbacks` option. Just add the following initializer in your Rails application:
@@ -1097,6 +1284,41 @@ If you use `DatabaseCleaner` in your tests with [the `transaction` strategy](htt
1097
1284
  Chewy.use_after_commit_callbacks = !Rails.env.test?
1098
1285
  ```
1099
1286
 
1287
+ ### Pre-request Filter
1288
+
1289
+ If you need to inspect a query before it is dispatched to Elasticsearch, you can use `before_es_request_filter`, which is a callable object, as demonstrated below:
1290
+
1291
+ ```ruby
1292
+ Chewy.before_es_request_filter = -> (method_name, args, kw_args) { ... }
1293
+ ```
1294
+
1295
+ While using the `before_es_request_filter`, please consider the following:
1296
+
1297
+ * `before_es_request_filter` acts as a simple proxy before any request made via the `Elasticsearch::Client`. The arguments passed to this filter include:
1298
+   * `method_name` - the name of the method being called, for example `search`, `count`, or `bulk`.
1299
+   * `args` and `kw_args` - the positional and keyword arguments provided in the method call.
1300
+ * The operation is synchronous, so avoid executing any heavy or time-consuming operations within the filter to prevent performance degradation.
1301
+ * The return value of the proc is disregarded. This filter is intended for inspection or modification of the query rather than generating a response.
1302
+ * Any exception raised inside the callback will propagate upward and halt the execution of the query. It is essential to handle potential errors adequately to ensure the stability of your search functionality.
1303
+
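
The considerations above can be put together in a minimal sketch of a logging filter. It assumes `args` arrives as an Array and `kw_args` as a Hash; the constant names and the in-memory logger are hypothetical and chosen only so the example runs without an Elasticsearch cluster:

```ruby
require 'logger'
require 'stringio'

# Hypothetical in-memory log target, so the sketch runs standalone.
LOG_OUTPUT = StringIO.new
FILTER_LOGGER = Logger.new(LOG_OUTPUT)

# A cheap, synchronous filter: it only logs, and it rescues its own errors
# so that a logging failure does not halt the query.
REQUEST_LOGGING_FILTER = lambda do |method_name, args, kw_args|
  begin
    FILTER_LOGGER.info("es request: #{method_name} args=#{args.size} kw_args=#{kw_args.keys.inspect}")
  rescue StandardError => e
    warn("before_es_request_filter failed: #{e.message}")
  end
  nil # the return value is disregarded anyway
end

# Register the filter only when Chewy is loaded.
Chewy.before_es_request_filter = REQUEST_LOGGING_FILTER if defined?(Chewy)

# Direct call, for illustration only.
REQUEST_LOGGING_FILTER.call(:search, [{ index: 'users' }], {})
```

Rescuing inside the filter is a design choice for pure inspection; a filter meant to block certain requests would instead let its exception propagate.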
1304
+ ### Import scope clean-up behavior
1305
+
1306
+ Whenever you set the `import_scope` for an index backed by ActiveRecord, the
1307
+ `order`, `offset` and `limit` options are removed from the scope. You can configure
1308
+ how chewy reacts before it performs this clean-up.
1309
+
1310
+ The default behavior is to send a warning to the Chewy logger (`:warn`). A more
1311
+ restrictive option is to raise an exception (`:raise`). Both options have a
1312
+ negative impact on performance, since verifying whether the scope uses any of
1313
+ these options requires building an Arel query.
1314
+
1315
+ To avoid the loading-time impact, you can skip the check (`:ignore`) before
1316
+ the clean-up.
1317
+
1318
+ ```ruby
1319
+ Chewy.import_scope_cleanup_behavior = :ignore
1320
+ ```
1321
+
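
One possible way to use the three values is to choose the behavior per environment. This is a sketch under assumptions: the helper `import_scope_cleanup_behavior_for` is hypothetical, and the mapping of environments to behaviors is only an illustration, not a recommendation from the Chewy authors:

```ruby
# Hypothetical helper (not part of Chewy): picks a clean-up behavior
# for a given environment name.
def import_scope_cleanup_behavior_for(env_name)
  case env_name.to_s
  when 'production' then :ignore # skip the check to avoid building the Arel query
  when 'test' then :raise        # fail loudly on scopes with order/offset/limit
  else :warn                     # the documented default
  end
end

# Apply the setting only when both Chewy and Rails are loaded.
if defined?(Chewy) && defined?(Rails)
  Chewy.import_scope_cleanup_behavior = import_scope_cleanup_behavior_for(Rails.env)
end
```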
1100
1322
  ## Contributing
1101
1323
 
1102
1324
  1. Fork it (http://github.com/toptal/chewy/fork)
@@ -1106,7 +1328,7 @@ Chewy.use_after_commit_callbacks = !Rails.env.test?
1106
1328
  5. Push to the branch (`git push origin my-new-feature`)
1107
1329
  6. Create new Pull Request
1108
1330
 
1109
- Use the following Rake tasks to control the Elasticsearch cluster while developing.
1331
+ Use the following Rake tasks to control the Elasticsearch cluster while developing, if you prefer a native Elasticsearch installation over the dockerized one:
1110
1332
 
1111
1333
  ```bash
1112
1334
  rake elasticsearch:start # start Elasticsearch cluster on 9250 port for tests