chewy 8.0.0 → 8.1.0

Files changed (144)
  1. checksums.yaml +4 -4
  2. data/CHANGELOG.md +42 -0
  3. data/README.md +30 -16
  4. data/lib/chewy/errors.rb +3 -0
  5. data/lib/chewy/fields/root.rb +3 -3
  6. data/lib/chewy/index/crutch.rb +12 -2
  7. data/lib/chewy/index/import/bulk_builder.rb +4 -3
  8. data/lib/chewy/index/import/routine.rb +2 -1
  9. data/lib/chewy/index/import.rb +4 -4
  10. data/lib/chewy/index/witchcraft.rb +24 -8
  11. data/lib/chewy/multi_search.rb +1 -1
  12. data/lib/chewy/search/parameters/runtime_mappings.rb +14 -0
  13. data/lib/chewy/search/request.rb +18 -2
  14. data/lib/chewy/search/scrolling.rb +14 -6
  15. data/lib/chewy/stash.rb +10 -6
  16. data/lib/chewy/version.rb +1 -1
  17. metadata +5 -131
  18. data/.github/CODEOWNERS +0 -1
  19. data/.github/ISSUE_TEMPLATE/bug_report.md +0 -39
  20. data/.github/ISSUE_TEMPLATE/feature_request.md +0 -20
  21. data/.github/PULL_REQUEST_TEMPLATE.md +0 -16
  22. data/.github/dependabot.yml +0 -42
  23. data/.github/workflows/ruby.yml +0 -61
  24. data/.gitignore +0 -22
  25. data/.rspec +0 -2
  26. data/.rubocop.yml +0 -64
  27. data/.rubocop_todo.yml +0 -225
  28. data/.yardopts +0 -5
  29. data/CODE_OF_CONDUCT.md +0 -14
  30. data/CONTRIBUTING.md +0 -63
  31. data/Gemfile +0 -22
  32. data/Guardfile +0 -25
  33. data/Rakefile +0 -17
  34. data/chewy.gemspec +0 -24
  35. data/docker-compose.yml +0 -14
  36. data/docs/README.md +0 -16
  37. data/docs/configuration.md +0 -440
  38. data/docs/import.md +0 -122
  39. data/docs/indexing.md +0 -329
  40. data/docs/querying.md +0 -72
  41. data/docs/rake_tasks.md +0 -108
  42. data/docs/testing.md +0 -41
  43. data/docs/troubleshooting.md +0 -101
  44. data/filters +0 -78
  45. data/gemfiles/base.gemfile +0 -12
  46. data/gemfiles/rails.7.2.activerecord.gemfile +0 -14
  47. data/gemfiles/rails.8.0.activerecord.gemfile +0 -14
  48. data/migration_guide.md +0 -56
  49. data/spec/chewy/config_spec.rb +0 -110
  50. data/spec/chewy/elastic_client_spec.rb +0 -26
  51. data/spec/chewy/fields/base_spec.rb +0 -700
  52. data/spec/chewy/fields/root_spec.rb +0 -142
  53. data/spec/chewy/fields/time_fields_spec.rb +0 -28
  54. data/spec/chewy/index/actions_spec.rb +0 -851
  55. data/spec/chewy/index/adapter/active_record_spec.rb +0 -663
  56. data/spec/chewy/index/adapter/object_spec.rb +0 -243
  57. data/spec/chewy/index/aliases_spec.rb +0 -49
  58. data/spec/chewy/index/import/bulk_builder_spec.rb +0 -494
  59. data/spec/chewy/index/import/bulk_request_spec.rb +0 -95
  60. data/spec/chewy/index/import/journal_builder_spec.rb +0 -87
  61. data/spec/chewy/index/import/routine_spec.rb +0 -110
  62. data/spec/chewy/index/import_spec.rb +0 -615
  63. data/spec/chewy/index/mapping_spec.rb +0 -135
  64. data/spec/chewy/index/observe/active_record_methods_spec.rb +0 -68
  65. data/spec/chewy/index/observe/callback_spec.rb +0 -139
  66. data/spec/chewy/index/observe_spec.rb +0 -143
  67. data/spec/chewy/index/settings_spec.rb +0 -136
  68. data/spec/chewy/index/specification_spec.rb +0 -156
  69. data/spec/chewy/index/syncer_spec.rb +0 -118
  70. data/spec/chewy/index/witchcraft_spec.rb +0 -245
  71. data/spec/chewy/index/wrapper_spec.rb +0 -100
  72. data/spec/chewy/index_spec.rb +0 -269
  73. data/spec/chewy/journal_spec.rb +0 -223
  74. data/spec/chewy/minitest/helpers_spec.rb +0 -194
  75. data/spec/chewy/minitest/search_index_receiver_spec.rb +0 -120
  76. data/spec/chewy/multi_search_spec.rb +0 -84
  77. data/spec/chewy/rake_helper_spec.rb +0 -656
  78. data/spec/chewy/repository_spec.rb +0 -50
  79. data/spec/chewy/rspec/build_query_spec.rb +0 -34
  80. data/spec/chewy/rspec/helpers_spec.rb +0 -61
  81. data/spec/chewy/rspec/update_index_spec.rb +0 -313
  82. data/spec/chewy/runtime/version_spec.rb +0 -48
  83. data/spec/chewy/runtime_spec.rb +0 -9
  84. data/spec/chewy/search/loader_spec.rb +0 -83
  85. data/spec/chewy/search/pagination/kaminari_examples.rb +0 -69
  86. data/spec/chewy/search/pagination/kaminari_spec.rb +0 -21
  87. data/spec/chewy/search/parameters/aggs_spec.rb +0 -5
  88. data/spec/chewy/search/parameters/bool_storage_examples.rb +0 -53
  89. data/spec/chewy/search/parameters/collapse_spec.rb +0 -5
  90. data/spec/chewy/search/parameters/docvalue_fields_spec.rb +0 -5
  91. data/spec/chewy/search/parameters/explain_spec.rb +0 -5
  92. data/spec/chewy/search/parameters/filter_spec.rb +0 -5
  93. data/spec/chewy/search/parameters/hash_storage_examples.rb +0 -59
  94. data/spec/chewy/search/parameters/highlight_spec.rb +0 -5
  95. data/spec/chewy/search/parameters/ignore_unavailable_spec.rb +0 -67
  96. data/spec/chewy/search/parameters/indices_spec.rb +0 -99
  97. data/spec/chewy/search/parameters/integer_storage_examples.rb +0 -32
  98. data/spec/chewy/search/parameters/knn_spec.rb +0 -5
  99. data/spec/chewy/search/parameters/limit_spec.rb +0 -5
  100. data/spec/chewy/search/parameters/load_spec.rb +0 -60
  101. data/spec/chewy/search/parameters/min_score_spec.rb +0 -32
  102. data/spec/chewy/search/parameters/none_spec.rb +0 -5
  103. data/spec/chewy/search/parameters/offset_spec.rb +0 -5
  104. data/spec/chewy/search/parameters/order_spec.rb +0 -72
  105. data/spec/chewy/search/parameters/post_filter_spec.rb +0 -5
  106. data/spec/chewy/search/parameters/preference_spec.rb +0 -5
  107. data/spec/chewy/search/parameters/profile_spec.rb +0 -5
  108. data/spec/chewy/search/parameters/query_spec.rb +0 -5
  109. data/spec/chewy/search/parameters/query_storage_examples.rb +0 -434
  110. data/spec/chewy/search/parameters/request_cache_spec.rb +0 -67
  111. data/spec/chewy/search/parameters/rescore_spec.rb +0 -62
  112. data/spec/chewy/search/parameters/script_fields_spec.rb +0 -5
  113. data/spec/chewy/search/parameters/search_after_spec.rb +0 -35
  114. data/spec/chewy/search/parameters/search_type_spec.rb +0 -5
  115. data/spec/chewy/search/parameters/source_spec.rb +0 -162
  116. data/spec/chewy/search/parameters/storage_spec.rb +0 -60
  117. data/spec/chewy/search/parameters/stored_fields_spec.rb +0 -126
  118. data/spec/chewy/search/parameters/string_array_storage_examples.rb +0 -63
  119. data/spec/chewy/search/parameters/string_storage_examples.rb +0 -32
  120. data/spec/chewy/search/parameters/suggest_spec.rb +0 -5
  121. data/spec/chewy/search/parameters/terminate_after_spec.rb +0 -5
  122. data/spec/chewy/search/parameters/timeout_spec.rb +0 -5
  123. data/spec/chewy/search/parameters/track_scores_spec.rb +0 -5
  124. data/spec/chewy/search/parameters/track_total_hits_spec.rb +0 -5
  125. data/spec/chewy/search/parameters/version_spec.rb +0 -5
  126. data/spec/chewy/search/parameters_spec.rb +0 -161
  127. data/spec/chewy/search/query_proxy_spec.rb +0 -95
  128. data/spec/chewy/search/request_spec.rb +0 -886
  129. data/spec/chewy/search/response_spec.rb +0 -180
  130. data/spec/chewy/search/scrolling_spec.rb +0 -171
  131. data/spec/chewy/search_spec.rb +0 -127
  132. data/spec/chewy/stash_spec.rb +0 -85
  133. data/spec/chewy/strategy/active_job_spec.rb +0 -73
  134. data/spec/chewy/strategy/atomic_no_refresh_spec.rb +0 -60
  135. data/spec/chewy/strategy/atomic_spec.rb +0 -61
  136. data/spec/chewy/strategy/delayed_sidekiq_spec.rb +0 -225
  137. data/spec/chewy/strategy/lazy_sidekiq_spec.rb +0 -214
  138. data/spec/chewy/strategy/sidekiq_spec.rb +0 -52
  139. data/spec/chewy/strategy_spec.rb +0 -125
  140. data/spec/chewy_spec.rb +0 -100
  141. data/spec/spec_helper.rb +0 -69
  142. data/spec/support/active_record.rb +0 -124
  143. data/spec/support/class_helpers.rb +0 -16
  144. data/spec/support/fail_helpers.rb +0 -13
data/docs/import.md DELETED
@@ -1,122 +0,0 @@
1
- # Import
2
-
3
- ## Default import options
4
-
5
- Every index has a `default_import_options` configuration for specifying default import options:
6
-
7
- ```ruby
8
- class ProductsIndex < Chewy::Index
9
- index_scope Post.includes(:tags)
10
- default_import_options batch_size: 100, bulk_size: 10.megabytes, refresh: false
11
-
12
- field :name
13
- field :tags, value: -> { tags.map(&:name) }
14
- end
15
- ```
16
-
17
- See [import.rb](../lib/chewy/index/import.rb) for available options. For field definitions (`field`, `index_scope`, etc.), see [indexing.md](indexing.md#index-definition).
18
-
19
- ## Raw import
20
-
21
- Another way to speed up import is raw import. It is only available in the ActiveRecord adapter. Very often, ActiveRecord model instantiation is what consumes most of the CPU and RAM resources. Precious time is wasted on converting, say, timestamps from strings and then serializing them back to strings. Chewy can operate on raw hashes of data obtained directly from the database. All you need is to provide a way to convert that hash to a lightweight object that mimics the behaviour of the normal ActiveRecord object.
22
-
23
- ```ruby
24
- class LightweightProduct
25
- def initialize(attributes)
26
- @attributes = attributes
27
- end
28
-
29
- # Depending on the database, `created_at` might
30
- # be in different formats. In PostgreSQL, for example,
31
- # you might see the following format:
32
- # "2016-03-22 16:23:22"
33
- #
34
- # Taking into account that Elastic expects something different,
35
- # one might do something like the following, just to avoid
36
- # unnecessary String -> DateTime -> String conversion.
37
- #
38
- # "2016-03-22 16:23:22" -> "2016-03-22T16:23:22Z"
39
- def created_at
40
- @attributes['created_at'].tr(' ', 'T') << 'Z'
41
- end
42
- end
43
-
44
- index_scope Product
45
- default_import_options raw_import: ->(hash) {
46
- LightweightProduct.new(hash)
47
- }
48
-
49
- field :created_at, 'datetime'
50
- ```
51
-
52
- Also, you can pass `:raw_import` option to the `import` method explicitly.
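
To make the raw import contract concrete, here is a minimal plain-Ruby sketch that runs without Chewy or a database. The `row` hash is made-up sample data standing in for what the adapter would hand to the `raw_import` lambda, and the conversion assumes the stored timestamp is already in UTC.

```ruby
# Lightweight stand-in for an ActiveRecord object, built from a raw row hash.
class LightweightProduct
  def initialize(attributes)
    @attributes = attributes
  end

  # "2016-03-22 16:23:22" -> "2016-03-22T16:23:22Z"
  # Assumption: the database value is already in UTC.
  def created_at
    "#{@attributes['created_at'].tr(' ', 'T')}Z"
  end
end

# Same shape as the lambda given to `raw_import`: hash in, object out.
raw_import = ->(hash) { LightweightProduct.new(hash) }

# Sample row (hypothetical data), as a database driver might return it.
row = { 'id' => 1, 'created_at' => '2016-03-22 16:23:22' }
product = raw_import.call(row)
puts product.created_at
```

String interpolation is used here instead of `<<` so that the conversion never mutates a string owned by the row hash.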
53
-
54
- ## Index creation during import
55
-
56
- By default, when you perform an import, Chewy checks whether the index exists and creates it if it is absent.
57
- You can turn off this check to reduce the number of requests sent to Elasticsearch.
58
- To do so, set the `skip_index_creation_on_import` parameter to `true` in your `config/chewy.yml`.
59
-
60
- ## Skip record fields during import
61
-
62
- You can use `ignore_blank: true` to skip fields that return `true` for the `.blank?` method:
63
-
64
- ```ruby
65
- index_scope Country
66
- field :id
67
- field :cities, ignore_blank: true do
68
- field :id
69
- field :name
70
- field :surname, ignore_blank: true
71
- field :description
72
- end
73
- ```
74
-
75
- ### Default values for different types
76
-
77
- By default `ignore_blank` is false on every type except `geo_point`.
78
-
79
- ## Journaling
80
-
81
- You can record all actions in a separate journal index in Elasticsearch.
82
- When you create, update or destroy documents, each action is saved in this special index.
83
- If you act on a batch of documents (e.g. during an index reset), the action is saved as one record that includes the primary keys of every affected document.
84
- A typical journal record looks like this:
85
-
86
- ```json
87
- {
88
- "action": "index",
89
- "object_id": [1, 2, 3],
90
- "index_name": "...",
91
- "created_at": "<timestamp>"
92
- }
93
- ```
94
-
95
- This feature is turned off by default.
96
- You can turn it on by setting `journal` option to `true` in `config/chewy.yml`.
97
-
98
- Also, you can provide this option while you're importing some index:
99
-
100
- ```ruby
101
- CityIndex.import journal: true
102
- ```
103
-
104
- Or as a default import option for an index:
105
-
106
- ```ruby
107
- class CityIndex
108
- index_scope City
109
- default_import_options journal: true
110
- end
111
- ```
112
-
113
- You may be wondering why you need it. The answer is simple: not to lose the data.
114
-
115
- Imagine that you reset your index in a zero-downtime manner (into a separate index),
116
- and in the meantime somebody keeps updating the data frequently (in the old
117
- index). So all these actions will be written to the journal index and you'll be
118
- able to apply them after index reset using the `Chewy::Journal` interface. You can subscribe to journal events via `ActiveSupport::Notifications` — see [configuration.md](configuration.md#activesupportnotifications-support) for details.
119
-
120
- When enabled, the journal can grow to an enormous size; consider setting up a cron job
121
- that would clean it occasionally using [`chewy:journal:clean` rake
122
- task](rake_tasks.md#chewyjournal).
data/docs/indexing.md DELETED
@@ -1,329 +0,0 @@
1
- # Indexing
2
-
3
- ## Index definition
4
-
5
- 1. Create `/app/chewy/users_index.rb`
6
-
7
- ```ruby
8
- class UsersIndex < Chewy::Index
9
-
10
- end
11
- ```
12
-
13
- 2. Define index scope (you can omit this part if you don't need to specify a scope (i.e. use PORO objects for import) or options)
14
-
15
- ```ruby
16
- class UsersIndex < Chewy::Index
17
- index_scope User.active # or just a model instead of a scope: index_scope User
18
- end
19
- ```
20
-
21
- 3. Add some mappings
22
-
23
- ```ruby
24
- class UsersIndex < Chewy::Index
25
- index_scope User.active.includes(:country, :badges, :projects)
26
- field :first_name, :last_name # multiple fields without additional options
27
- field :email, analyzer: 'email' # Elasticsearch-related options
28
- field :country, value: ->(user) { user.country.name } # custom value proc
29
- field :badges, value: ->(user) { user.badges.map(&:name) } # passing array values to index
30
- field :projects do # the same block syntax for multi_field, if `:type` is specified
31
- field :title
32
- field :description # default data type is `text`
33
- # additional top-level objects passed to value proc:
34
- field :categories, value: ->(project, user) { project.categories.map(&:name) if user.active? }
35
- end
36
- field :rating, type: 'integer' # custom data type
37
- field :created, type: 'date', include_in_all: false,
38
- value: ->{ created_at } # value proc for source object context
39
- end
40
- ```
41
-
42
- [See here for mapping definitions](https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping.html).
43
-
44
- 4. Add some index-related settings. Analyzer repositories might be used as well. See `Chewy::Index.settings` docs for details:
45
-
46
- ```ruby
47
- class UsersIndex < Chewy::Index
48
- settings analysis: {
49
- analyzer: {
50
- email: {
51
- tokenizer: 'keyword',
52
- filter: ['lowercase']
53
- }
54
- }
55
- }
56
-
57
- index_scope User.active.includes(:country, :badges, :projects)
58
- root date_detection: false do
59
- template 'about_translations.*', type: 'text', analyzer: 'standard'
60
-
61
- field :first_name, :last_name
62
- field :email, analyzer: 'email'
63
- field :country, value: ->(user) { user.country.name }
64
- field :badges, value: ->(user) { user.badges.map(&:name) }
65
- field :projects do
66
- field :title
67
- field :description
68
- end
69
- field :about_translations, type: 'object' # pass object type explicitly if necessary
70
- field :rating, type: 'integer'
71
- field :created, type: 'date', include_in_all: false,
72
- value: ->{ created_at }
73
- end
74
- end
75
- ```
76
-
77
- [See index settings here](https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-update-settings.html).
78
- [See root object settings here](https://www.elastic.co/guide/en/elasticsearch/reference/current/dynamic-field-mapping.html).
79
-
80
- See [mapping.rb](../lib/chewy/index/mapping.rb) for more details.
81
-
82
- 5. Add model-observing code
83
-
84
- ```ruby
85
- class User < ActiveRecord::Base
86
- update_index('users') { self } # specifying index and back-reference
87
- # for updating after user save or destroy
88
- end
89
-
90
- class Country < ActiveRecord::Base
91
- has_many :users
92
-
93
- update_index('users') { users } # return single object or collection
94
- end
95
-
96
- class Project < ActiveRecord::Base
97
- update_index('users') { user if user.active? } # you can return even `nil` from the back-reference
98
- end
99
-
100
- class Book < ActiveRecord::Base
101
- update_index(->(book) {"books_#{book.language}"}) { self } # dynamic index name with proc.
102
- # For book with language == "en"
103
- # this code will generate `books_en`
104
- end
105
- ```
106
-
107
- The `update_index` callback requires an active update strategy to be set. See [configuration.md](configuration.md#index-update-strategies) for available strategies and how they integrate with Rails.
108
-
109
- Also, you can use the second argument for method name passing:
110
-
111
- ```ruby
112
- update_index('users', :self)
113
- update_index('users', :users)
114
- ```
115
-
116
- In the case of a belongs_to association you may need to update both associated objects, previous and current:
117
-
118
- ```ruby
119
- class City < ActiveRecord::Base
120
- belongs_to :country
121
-
122
- update_index('cities') { self }
123
- update_index 'countries' do
124
- previous_changes['country_id'] || country
125
- end
126
- end
127
- ```
128
-
129
- ## Multi (nested) and object field types
130
-
131
- To define an object field you can simply nest fields in the DSL:
132
-
133
- ```ruby
134
- field :projects do
135
- field :title
136
- field :description
137
- end
138
- ```
139
-
140
- This will automatically set the type of the root field to `object`. You may also specify `type: 'object'` explicitly.
141
-
142
- To define a multi field you have to specify any type except for `object` or `nested` in the root field:
143
-
144
- ```ruby
145
- field :full_name, type: 'text', value: ->{ full_name.strip } do
146
- field :ordered, analyzer: 'ordered'
147
- field :untouched, type: 'keyword'
148
- end
149
- ```
150
-
151
- The `value:` option for internal fields will no longer be effective.
152
-
153
- ## Geo Point fields
154
-
155
- You can use [Elasticsearch's geo mapping](https://www.elastic.co/guide/en/elasticsearch/reference/current/geo-point.html) with the `geo_point` field type, allowing you to query, filter and order by latitude and longitude. You can use the following hash format:
156
-
157
- ```ruby
158
- field :coordinates, type: 'geo_point', value: ->{ {lat: latitude, lon: longitude} }
159
- ```
160
-
161
- or by using nested fields:
162
-
163
- ```ruby
164
- field :coordinates, type: 'geo_point' do
165
- field :lat, value: ->{ latitude }
166
- field :lon, value: ->{ longitude }
167
- end
168
- ```
169
-
170
- See the section on *Script fields* for details on calculating distance in a search.
171
-
172
- ## Join fields
173
-
174
- You can use a [join field](https://www.elastic.co/guide/en/elasticsearch/reference/current/parent-join.html)
175
- to implement parent-child relationships between documents.
176
- It [replaces the old `parent_id` based parent-child mapping](https://www.elastic.co/guide/en/elasticsearch/reference/current/removal-of-types.html#parent-child-mapping-types)
177
-
178
- To use it, you need to pass `relations` and `join` (with `type` and `id`) options:
179
- ```ruby
180
- field :hierarchy_link, type: :join, relations: {question: %i[answer comment], answer: :vote, vote: :subvote}, join: {type: :comment_type, id: :commented_id}
181
- ```
182
- assuming you have `comment_type` and `commented_id` fields in your model.
183
-
184
- Note that when you reindex a parent, its children and grandchildren will be reindexed as well.
185
- This may require additional queries to the primary database and to Elasticsearch.
186
-
187
- Also note that the join field doesn't support crutches (it should be a field directly defined on the model).
188
-
189
- ## Crutches technology
190
-
191
- Assume you are defining your index like this (product has_many categories through product_categories):
192
-
193
- ```ruby
194
- class ProductsIndex < Chewy::Index
195
- index_scope Product.includes(:categories)
196
- field :name
197
- field :category_names, value: ->(product) { product.categories.map(&:name) } # or shorter just -> { categories.map(&:name) }
198
- end
199
- ```
200
-
201
- Then the Chewy reindexing flow will look like the following pseudo-code:
202
-
203
- ```ruby
204
- Product.includes(:categories).find_in_batches(batch_size: 1000) do |batch|
205
- bulk_body = batch.map do |object|
206
- {name: object.name, category_names: object.categories.map(&:name)}.to_json
207
- end
208
- # here we are sending every batch of data to ES
209
- Chewy.client.bulk bulk_body
210
- end
211
- ```
212
-
213
- If you run into complicated cases where associations are not applicable, you can replace Rails associations with Chewy Crutches:
214
-
215
- ```ruby
216
- class ProductsIndex < Chewy::Index
217
- index_scope Product
218
- crutch :categories do |collection| # collection here is a current batch of products
219
- # data is fetched with a lightweight query without objects initialization
220
- data = ProductCategory.joins(:category).where(product_id: collection.map(&:id)).pluck(:product_id, 'categories.name')
221
- # then we have to convert fetched data to appropriate format
222
- # this will return our data in structure like:
223
- # {123 => ['sweets', 'juices'], 456 => ['meat']}
224
- data.each.with_object({}) { |(id, name), result| (result[id] ||= []).push(name) }
225
- end
226
-
227
- field :name
228
- # simply use crutch-fetched data as a value:
229
- field :category_names, value: ->(product, crutches) { crutches[:categories][product.id] }
230
- end
231
- ```
232
-
233
- An example flow will look like this:
234
-
235
- ```ruby
236
- Product.find_in_batches(batch_size: 1000) do |batch|
237
- crutches[:categories] = ProductCategory.joins(:category).where(product_id: batch.map(&:id)).pluck(:product_id, 'categories.name')
238
- .each.with_object({}) { |(id, name), result| (result[id] ||= []).push(name) }
239
-
240
- bulk_body = batch.map do |object|
241
- {name: object.name, category_names: crutches[:categories][object.id]}.to_json
242
- end
243
- Chewy.client.bulk bulk_body
244
- end
245
- ```
246
-
247
- So Chewy Crutches can increase your indexing performance, in some cases up to a hundredfold or even more, depending on the complexity of your associations. For another approach to import performance, see [Raw import](import.md#raw-import).
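
The grouping step inside the crutch block can be tried in isolation. This is a plain-Ruby sketch: `rows` is sample data standing in for the result of `pluck(:product_id, 'categories.name')`, and `category_names` is a hypothetical lookup that mirrors what the field's value proc does.

```ruby
# Sample rows (assumed data), shaped like the output of
# pluck(:product_id, 'categories.name'): one [product_id, name] pair per join row.
rows = [[123, 'sweets'], [123, 'juices'], [456, 'meat']]

# Group category names by product id in a single pass.
categories_by_product = rows.each_with_object({}) do |(product_id, name), result|
  (result[product_id] ||= []) << name
end

# Hypothetical lookup mirroring the value proc; falls back to an empty
# array for products that have no categories in the current batch.
category_names = ->(product_id) { categories_by_product.fetch(product_id, []) }

p categories_by_product
p category_names.call(123)
p category_names.call(789)
```

The `fetch` fallback is a design choice of this sketch: the documented value proc returns `nil` for a product without categories, which may or may not be what the mapping expects.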
248
-
249
- ## Witchcraft technology
250
-
251
- One more experimental technology to increase import performance. As you know, Chewy defines a value proc for every imported field in the mapping, so at import time each of these procs is executed on the imported object to build the resulting document. It would be better for performance to use one whole-document-returning proc instead. So the idea of Witchcraft is to compile a single document-returning proc from the index definition.
252
-
253
- ```ruby
254
- index_scope Product
255
- witchcraft!
256
-
257
- field :title
258
- field :tags, value: -> { tags.map(&:name) }
259
- field :categories do
260
- field :name, value: -> (product, category) { category.name }
261
- field :type, value: -> (product, category, crutch) { crutch.types[category.name] }
262
- end
263
- ```
264
-
265
- The index definition above will be compiled to something close to:
266
-
267
- ```ruby
268
- -> (object, crutches) do
269
- {
270
- title: object.title,
271
- tags: object.tags.map(&:name),
272
- categories: object.categories.map do |object2|
273
- {
274
- name: object2.name,
275
- type: crutches.types[object2.name]
276
- }
277
- end
278
- }
279
- end
280
- ```
281
-
282
- And don't even ask how it is possible; it is witchcraft.
283
- Obviously, not every kind of definition can be compiled. There are some restrictions:
284
-
285
- 1. Use reasonable formatting so that `method_source` can extract the sources of field value procs.
286
- 2. Value procs with splat arguments are not supported right now.
287
- 3. If you are generating fields dynamically, use value procs with arguments; argumentless value procs are not supported yet:
288
-
289
- ```ruby
290
- [:first_name, :last_name].each do |name|
291
- field name, value: -> (o) { o.send(name) }
292
- end
293
- ```
294
-
295
- However, it is quite possible that your index definition will be supported by Witchcraft technology out of the box in most of the cases.
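
The difference between per-field procs and one document-returning proc can be illustrated without Chewy. The sketch below is a hand-written analogy, not Witchcraft's actual compilation: `Product`, `Tag`, `field_procs`, `per_field` and `single_proc` are all hypothetical names, and the "compiled" proc is simply typed out by hand.

```ruby
Product = Struct.new(:title, :tags)
Tag = Struct.new(:name)

# One value proc per field, each evaluated in the context of the source object.
field_procs = {
  title: -> { title },
  tags: -> { tags.map(&:name) }
}

# Per-field composition: one proc invocation per field per object.
per_field = lambda do |object|
  field_procs.transform_values { |value_proc| object.instance_exec(&value_proc) }
end

# Hand-written equivalent of a single document-returning proc:
# the whole document is produced by one call.
single_proc = ->(object) { { title: object.title, tags: object.tags.map(&:name) } }

product = Product.new('Tea', [Tag.new('hot'), Tag.new('drink')])
p per_field.call(product)
p single_proc.call(product)
```

Both forms build the same document; the second avoids the per-field dispatch, which is the overhead the section describes.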
296
-
297
- ## Index manipulation
298
-
299
- ```ruby
300
- UsersIndex.delete # destroy index if it exists
301
- UsersIndex.delete!
302
-
303
- UsersIndex.create
304
- UsersIndex.create! # use bang or non-bang methods
305
-
306
- UsersIndex.purge
307
- UsersIndex.purge! # deletes then creates index
308
-
309
- UsersIndex.import # import with no arguments processes all the data specified in the index_scope definition
310
- UsersIndex.import User.where('rating > 100') # or import specified users scope
311
- UsersIndex.import User.where('rating > 100').to_a # or import specified users array
312
- UsersIndex.import [1, 2, 42] # pass even ids for import, it will be handled in the most effective way
313
- UsersIndex.import User.where('rating > 100'), update_fields: [:email] # if update fields are specified - it will update their values only with the `update` bulk action
314
- UsersIndex.import! # raises an exception in case of any import errors
315
-
316
- UsersIndex.reset! # purges index and imports default data for all types
317
- ```
318
-
319
- For more on import options, batching and journaling, see [import.md](import.md).
320
-
321
- If the passed user is `#destroyed?`, or satisfies a `delete_if` index_scope option, or the specified id does not exist in the database, import will perform delete from index action for this object.
322
-
323
- ```ruby
324
- index_scope User, delete_if: :deleted_at
325
- index_scope User, delete_if: -> { deleted_at }
326
- index_scope User, delete_if: ->(user) { user.deleted_at }
327
- ```
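
The three `delete_if` forms above differ only in how the condition reaches the object. A plain-Ruby sketch of one way to evaluate them follows; `delete?` is a hypothetical helper written for illustration and is not Chewy's internal implementation, and `User` is a stand-in struct.

```ruby
User = Struct.new(:name, :deleted_at)

# Hypothetical helper (not Chewy's internals): decides whether an object
# should be deleted from the index, given any of the three condition forms.
def delete?(object, condition)
  case condition
  when Symbol, String
    # delete_if: :deleted_at  -> call the method on the object
    !!object.public_send(condition)
  when Proc
    if condition.arity.zero?
      # delete_if: -> { deleted_at }  -> evaluate in the object's context
      !!object.instance_exec(&condition)
    else
      # delete_if: ->(user) { user.deleted_at }  -> pass the object explicitly
      !!condition.call(object)
    end
  else
    false
  end
end

active = User.new('Ann', nil)
removed = User.new('Bob', '2024-01-01')

conditions = [:deleted_at, -> { deleted_at }, ->(user) { user.deleted_at }]
conditions.each do |condition|
  puts "#{delete?(active, condition)} #{delete?(removed, condition)}"
end
```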
328
-
329
- See [actions.rb](../lib/chewy/index/actions.rb) for more details.
data/docs/querying.md DELETED
@@ -1,72 +0,0 @@
1
- # Querying
2
-
3
- ## Composing requests
4
-
5
- The request DSL has the same chainable nature as ActiveRecord. The main class is `Chewy::Search::Request`.
6
-
7
- ```ruby
8
- CitiesIndex.query(match: {name: 'London'})
9
- ```
10
-
11
- The main methods of the request DSL are `query`, `filter` and `post_filter`. You can pass plain query hashes or use `elasticsearch-dsl`.
12
-
13
- ```ruby
14
- CitiesIndex
15
- .filter(term: {name: 'Bangkok'})
16
- .query(match: {name: 'London'})
17
- .query.not(range: {population: {gt: 1_000_000}})
18
- ```
19
-
20
- You can query a set of indexes at once:
21
-
22
- ```ruby
23
- CitiesIndex.indices(CountriesIndex).query(match: {name: 'Some'})
24
- ```
25
-
26
- See https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl.html and https://github.com/elastic/elasticsearch-dsl-ruby for more details.
27
-
28
- An important part of requests manipulation is merging. There are 4 methods to perform it: `merge`, `and`, `or`, `not`. See [Chewy::Search::QueryProxy](../lib/chewy/search/query_proxy.rb) for details. Also, `only` and `except` methods help to remove unneeded parts of the request.
29
-
30
- Every other request part is covered by a bunch of additional methods, see [Chewy::Search::Request](../lib/chewy/search/request.rb) for details:
31
-
32
- ```ruby
33
- CitiesIndex.limit(10).offset(30).order(:name, {population: {order: :desc}})
34
- ```
35
-
36
- Request DSL also provides additional scope actions, like `delete_all`, `exists?`, `count`, `pluck`, etc.
37
-
38
- ## Pagination
39
-
40
- The request DSL supports pagination with `Kaminari`. An extension is enabled on initialization if `Kaminari` is available. See [Chewy::Search](../lib/chewy/search.rb) and [Chewy::Search::Pagination::Kaminari](../lib/chewy/search/pagination/kaminari.rb) for details.
41
-
42
- ## Named scopes
43
-
44
- Chewy supports named scopes functionality. There is no specialized DSL for named scopes definition, it is simply about defining class methods.
45
-
46
- See [Chewy::Search::Scoping](../lib/chewy/search/scoping.rb) for details.
47
-
48
- ## Scroll API
49
-
50
- Elasticsearch scroll API is utilized by a bunch of methods: `scroll_batches`, `scroll_hits`, `scroll_wrappers` and `scroll_objects`.
51
-
52
- See [Chewy::Search::Scrolling](../lib/chewy/search/scrolling.rb) for details.
53
-
54
- ## Loading objects
55
-
56
- It is possible to load ORM/ODM source objects with the `objects` method. To provide additional loading options use `load` method:
57
-
58
- ```ruby
59
- CitiesIndex.load(scope: -> { active }).to_a # to_a returns `Chewy::Index` wrappers.
60
- CitiesIndex.load(scope: -> { active }).objects # An array of AR source objects.
61
- ```
62
-
63
- See [Chewy::Search::Loader](../lib/chewy/search/loader.rb) for more details.
64
-
65
- When it is necessary to iterate over both the wrappers and the objects simultaneously, the `object_hash` method helps:
66
-
67
- ```ruby
68
- scope = CitiesIndex.load(scope: -> { active })
69
- scope.each do |wrapper|
70
- scope.object_hash[wrapper]
71
- end
72
- ```
data/docs/rake_tasks.md DELETED
@@ -1,108 +0,0 @@
1
- # Rake Tasks
2
-
3
- For a Rails application, some index-maintaining rake tasks are defined.
4
-
5
- ## `chewy:reset`
6
-
7
- Performs zero-downtime reindexing as described [here](https://www.elastic.co/blog/changing-mapping-with-zero-downtime). So the rake task creates a new index with unique suffix and then simply aliases it to the common index name. The previous index is deleted afterwards (see `Chewy::Index.reset!` for more details).
8
-
9
- ```bash
10
- rake chewy:reset # resets all the existing indices
11
- rake chewy:reset[users] # resets UsersIndex only
12
- rake chewy:reset[users,cities] # resets UsersIndex and CitiesIndex
13
- rake chewy:reset[-users,cities] # resets every index in the application except specified ones
14
- ```
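
The bracket arguments follow one convention across these tasks: a leading `-` on the first name turns the whole list into an exclusion list. The sketch below shows one way such arguments could be interpreted; `parse_index_args` is a hypothetical helper for illustration, not Chewy's rake helper.

```ruby
# Hypothetical parser (not Chewy's implementation) for arguments such as
# `users,cities` or `-users,cities`.
def parse_index_args(args)
  names = args.map(&:to_s).reject(&:empty?)

  if names.first&.start_with?('-')
    # A leading dash on the first name means "every index except these".
    { except: names.map { |name| name.delete_prefix('-') } }
  else
    # No dash: only the listed indexes; an empty list means "all".
    { only: names }
  end
end

p parse_index_args([])
p parse_index_args(%w[users])
p parse_index_args(%w[users cities])
p parse_index_args(%w[-users cities])
```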
15
-
16
- ## `chewy:upgrade`
17
-
18
- Performs reset exactly the same way as `chewy:reset` does, but only when the index specification (setting or mapping) was changed.
19
-
20
- It works only when index specification is locked in `Chewy::Stash::Specification` index. The first run will reset all indexes and lock their specifications.
21
-
22
- See [Chewy::Stash::Specification](../lib/chewy/stash.rb) and [Chewy::Index::Specification](../lib/chewy/index/specification.rb) for more details.
23
-
24
-
25
- ```bash
26
- rake chewy:upgrade # upgrades all the existing indices
27
- rake chewy:upgrade[users] # upgrades UsersIndex only
28
- rake chewy:upgrade[users,cities] # upgrades UsersIndex and CitiesIndex
29
- rake chewy:upgrade[-users,cities] # upgrades every index in the application except specified ones
30
- ```
31
-
32
- ## `chewy:update`
33
-
34
- It doesn't create indexes, it simply imports everything to the existing ones and fails if the index was not created before.
35
-
36
- ```bash
37
- rake chewy:update # updates all the existing indices
38
- rake chewy:update[users] # updates UsersIndex only
39
- rake chewy:update[users,cities] # updates UsersIndex and CitiesIndex
40
- rake chewy:update[-users,cities] # updates every index in the application except UsersIndex and CitiesIndex
41
- ```
42
-
43
- ## `chewy:sync`
44
-
45
- Provides a way to synchronize outdated indexes with the source quickly and without doing a full reset. By default field `updated_at` is used to find outdated records, but this could be customized by `outdated_sync_field` as described at [Chewy::Index::Syncer](../lib/chewy/index/syncer.rb).
46
-
47
- Arguments are similar to the ones taken by `chewy:update` task.
48
-
49
- See [Chewy::Index::Syncer](../lib/chewy/index/syncer.rb) for more details.
50
-
51
- ```bash
52
- rake chewy:sync # synchronizes all the existing indices
53
- rake chewy:sync[users] # synchronizes UsersIndex only
54
- rake chewy:sync[users,cities] # synchronizes UsersIndex and CitiesIndex
55
- rake chewy:sync[-users,cities] # synchronizes every index in the application except UsersIndex and CitiesIndex
56
- ```
57
-
58
- ## `chewy:deploy`
59
-
60
- This rake task is especially useful during production deploys. It is a combination of `chewy:upgrade` and `chewy:sync`, and the latter is called only for the indexes that were not reset during the first stage.
61
-
62
- It is not possible to specify any particular indexes for this task as it doesn't make much sense.
63
-
64
- The current approach is this: if some data has been updated but the index definition has not changed (that is, no changes that would satisfy the synchronization algorithm were made), it is much faster to perform a manual partial index update inside data migrations, or even manually after the deploy.
65
-
66
- Also, there is always the full reset alternative with `rake chewy:reset`. See [configuration.md](configuration.md#index-update-strategies) for how update strategies interact with deployment.
67
-
68
- ## `chewy:create_missing_indexes`
69
-
70
- This rake task creates newly defined indexes in Elasticsearch and skips existing ones. Useful for production-like environments.
71
-
72
- ## Parallelizing rake tasks
73
-
74
- Every task described above has its own parallel version. Every parallel rake task takes the number of processes to use as its first argument; the rest of the arguments are exactly the same as for the non-parallel version of the task.
75
-
76
- The [https://github.com/grosser/parallel](https://github.com/grosser/parallel) gem is required to use these tasks.
77
-
78
- If the number of processes is not specified explicitly, the `parallel` gem tries to derive the number of processes to use automatically.
79
-
80
- ```bash
81
- rake chewy:parallel:reset
82
- rake chewy:parallel:upgrade[4]
83
- rake chewy:parallel:update[4,cities]
84
- rake chewy:parallel:sync[4,-users]
85
- rake chewy:parallel:deploy[4] # performs parallel upgrade and parallel sync afterwards
86
- ```
87
-
88
- ## `chewy:journal`
89
-
90
- This namespace contains two tasks for journal manipulation: `chewy:journal:apply` and `chewy:journal:clean`. Both take a time as the first argument (optional for `clean`) and a list of indexes, exactly as the tasks above do. The time can be in any format parsable by ActiveSupport.
91
-
92
- ```bash
93
- rake chewy:journal:apply["$(date -v-1H -u +%FT%TZ)"] # apply journaled changes for the past hour
94
- rake chewy:journal:apply["$(date -v-1H -u +%FT%TZ)",users] # apply journaled changes for the past hour on UsersIndex only
95
- ```
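Note that `date -v-1H` in the examples above is BSD/macOS syntax; GNU `date` uses `-d '1 hour ago'` instead. One portable option is to build the timestamp in Ruby. The following is a minimal sketch, assuming the task accepts an ISO 8601 UTC string as the examples suggest; the helper name `journal_since` is hypothetical and not part of Chewy.

```ruby
# Hypothetical helper (not part of Chewy): returns a UTC timestamp
# `seconds_ago` seconds before `now`, shaped like `date -u +%FT%TZ`.
def journal_since(seconds_ago, now: Time.now)
  # getutc returns a new Time in UTC without modifying `now`
  (now.getutc - seconds_ago).strftime('%FT%TZ')
end

# Timestamp for "one hour ago", suitable as the first task argument
puts journal_since(3600)
```

The printed value can be passed as the first argument of `chewy:journal:apply` in place of the `$(date ...)` substitution.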
96
-
97
- See [import.md](import.md#journaling) for how journaling works and how to enable it.
98
-
99
- When the size of the journal becomes very large, the classical way of deletion would be obstructive and resource consuming. Fortunately, Chewy internally uses [delete-by-query](https://www.elastic.co/guide/en/elasticsearch/reference/7.17/docs-delete-by-query.html#docs-delete-by-query-task-api) ES function which supports async execution with batching and [throttling](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-delete-by-query.html#docs-delete-by-query-throttle).
100
-
101
- The available options, which can be set by ENV variables, are listed below:
102
- * `WAIT_FOR_COMPLETION` - a boolean flag that controls async execution. The task waits for completion by default. When set to `false` (`0`, `f`, `false` or `off`, in any letter case, are all accepted as `false`), Elasticsearch performs some preflight checks, launches the request, and returns a task reference you can use to cancel the task or get its status.
103
- * `REQUESTS_PER_SECOND` - float. The throttle for this request in sub-requests per second. No throttling is enforced by default.
104
- * `SCROLL_SIZE` - integer. The number of documents to be deleted in a single sub-request. The default batch size is 1000.
105
-
106
- ```bash
107
- rake chewy:journal:clean WAIT_FOR_COMPLETION=false REQUESTS_PER_SECOND=10 SCROLL_SIZE=5000
108
- ```
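The rule for `WAIT_FOR_COMPLETION` described above can be illustrated with a short sketch. This is not Chewy's actual implementation; the constant `FALSE_VALUES` and the method `wait_for_completion?` are hypothetical names that simply mirror the documented behavior (wait by default; `0`, `f`, `false` or `off` in any letter case mean false).

```ruby
# Hypothetical names, not part of Chewy: the values the docs list as false.
FALSE_VALUES = %w[0 f false off].freeze

# Returns true unless WAIT_FOR_COMPLETION is set to one of the false values.
# `env` defaults to ENV but accepts any Hash-like object, which eases testing.
def wait_for_completion?(env = ENV)
  raw = env['WAIT_FOR_COMPLETION']
  return true if raw.nil? # documented default: wait for completion

  !FALSE_VALUES.include?(raw.strip.downcase)
end

puts wait_for_completion?('WAIT_FOR_COMPLETION' => 'Off')
```

How values outside the documented list (for example `no`) are treated is an assumption here: the sketch counts anything not listed as true.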
data/docs/testing.md DELETED
@@ -1,41 +0,0 @@
1
- # Testing
2
-
3
- ## RSpec integration
4
-
5
- Just add `require 'chewy/rspec'` to your spec_helper.rb and you will get additional features:
6
-
7
- [update_index](../lib/chewy/rspec/update_index.rb) helper
8
- `mock_elasticsearch_response` helper to mock elasticsearch response
9
- `mock_elasticsearch_response_sources` helper to mock elasticsearch response sources
10
- `build_query` matcher to compare request and expected query (returns `true`/`false`)
11
-
12
- To use `mock_elasticsearch_response` and `mock_elasticsearch_response_sources` helpers add `include Chewy::Rspec::Helpers` to your tests.
13
-
14
- See [chewy/rspec/](../lib/chewy/rspec/) for more details.
15
-
16
- ## Minitest integration
17
-
18
- Add `require 'chewy/minitest'` to your test_helper.rb, and then `include Chewy::Minitest::Helpers` in the tests where you'd like indexing test hooks.
19
-
20
- You can set the `:bypass` strategy for test suites and manually handle imports and flush test indices using `Chewy.massacre`. This will help reduce unnecessary ES requests.
21
-
22
- But if you need chewy to index/update models regularly in your test suite, you can specify the `:urgent` strategy for document indexing. Add `Chewy.strategy(:urgent)` to test_helper.rb.
23
-
24
- Also, you can use additional helpers:
25
-
26
- `mock_elasticsearch_response` to mock elasticsearch response
27
- `mock_elasticsearch_response_sources` to mock elasticsearch response sources
28
- `assert_elasticsearch_query` to compare request and expected query (returns `true`/`false`)
29
-
30
- See [chewy/minitest/](../lib/chewy/minitest/) for more details.
31
-
32
- ## DatabaseCleaner
33
-
34
- If you use `DatabaseCleaner` in your tests with [the `transaction` strategy](https://github.com/DatabaseCleaner/database_cleaner#how-to-use), you may find that `ActiveRecord` models are not indexed automatically on save even though you set up callbacks to do this with the `update_index` method. The issue arises because `chewy` indexes data in `after_commit` callbacks by default, but `after_commit` callbacks are not run with `DatabaseCleaner`'s `transaction` strategy. You can solve this by changing the `Chewy.use_after_commit_callbacks` option. Just add the following initializer to your Rails application:
35
-
36
- ```ruby
37
- # config/initializers/chewy.rb
38
- Chewy.use_after_commit_callbacks = !Rails.env.test?
39
- ```
40
-
41
- If you're seeing other unexpected behavior in tests, check [troubleshooting.md](troubleshooting.md) for common issues and debugging tips.