chewy 7.2.5 → 7.2.7
- checksums.yaml +4 -4
- data/.github/CODEOWNERS +1 -0
- data/.github/workflows/ruby.yml +8 -1
- data/CHANGELOG.md +25 -1
- data/README.md +14 -0
- data/lib/chewy/fields/base.rb +1 -1
- data/lib/chewy/index/import/bulk_builder.rb +4 -5
- data/lib/chewy/journal.rb +17 -6
- data/lib/chewy/rake_helper.rb +38 -5
- data/lib/chewy/search/parameters/collapse.rb +16 -0
- data/lib/chewy/search/request.rb +33 -10
- data/lib/chewy/stash.rb +3 -3
- data/lib/chewy/version.rb +1 -1
- data/lib/tasks/chewy.rake +7 -1
- data/spec/chewy/fields/base_spec.rb +1 -0
- data/spec/chewy/index/import/bulk_builder_spec.rb +4 -0
- data/spec/chewy/journal_spec.rb +13 -49
- data/spec/chewy/rake_helper_spec.rb +68 -0
- data/spec/chewy/search/parameters/collapse_spec.rb +5 -0
- data/spec/chewy/search/request_spec.rb +35 -0
- metadata +10 -6
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: f720ba38bdda37de9d1a9b54d0b9c28efbe444e3c9cdfb8e3a1fef0f01aeaf85
+  data.tar.gz: 01041cfe59a33c5b9b07bc46f4e5c0d9a68802a7ddf6ad99aebf7a34c85b32b2
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 01567be06ff1aa7deb0cd9e2459c9a094d80c113e088e1eac7dbdea0cf6b74bbb78579923bf7335bb2114884a3cb3a99ca946009fe3f762566b76a82c17ea7f2
+  data.tar.gz: 0cd235971d30435c1d5ad2b33f70643733fb5af02d7d49891668d216396fdaa0326b7aad4c01bcbf3ed207ef3ee9ac37c02ff269087b9bb6c8e498304dd902d9
data/.github/CODEOWNERS
ADDED
@@ -0,0 +1 @@
+.github/workflows @toptal/rogue-one
data/.github/workflows/ruby.yml
CHANGED
data/CHANGELOG.md
CHANGED
@@ -8,6 +8,29 @@
 
 ### Bugs Fixed
 
+## 7.2.7 (2022-11-15)
+
+### New Features
+
+* [#857](https://github.com/toptal/chewy/pull/857): Allow passing `wait_for_completion`, `request_per_second` and `scroll_size` options to `chewy:journal:clean` rake task and `delete_all` query builder method. ([@konalegi][])([@barthez][])
+
+### Changes
+
+### Bugs Fixed
+
+* [#863](https://github.com/toptal/chewy/pull/863): Fix `crutches` call doesn't respect `update_fields` option. ([@skcc321][])
+
+## 7.2.6 (2022-06-13)
+
+### New Features
+
+* [#841](https://github.com/toptal/chewy/pull/841): Add the [`collapse`](https://www.elastic.co/guide/en/elasticsearch/reference/current/collapse-search-results.html) option to the request. ([@jkostolansky][])
+
+### Bugs Fixed
+
+* [#842](https://github.com/toptal/chewy/issues/842): Fix `ignore_blank` handling. ([@rabotyaga][])
+* [#848](https://github.com/toptal/chewy/issues/848): Fix invalid journal pagination. ([@konalegi][])
+
 ## 7.2.5 (2022-03-04)
 
 ### New Features
@@ -15,7 +38,7 @@
 * [#827](https://github.com/toptal/chewy/pull/827): Add `:lazy_sidekiq` strategy, that defers not only importing but also `update_index` callback evaluation for created and updated objects. ([@sl4vr][])
 * [#827](https://github.com/toptal/chewy/pull/827): Add `:atomic_no_refresh` strategy. Like `:atomic`, but `refresh=false` parameter is set. ([@barthez][])
 * [#827](https://github.com/toptal/chewy/pull/827): Add `:no_refresh` chain call to `update_index` matcher to ensure import was called with `refresh=false`. ([@barthez][])
-
+
 ### Bugs Fixed
 
 * [#835](https://github.com/toptal/chewy/pull/835): Support keyword arguments in named scopes. ([@milk1000cc][])
@@ -696,6 +719,7 @@
 [@jimmybaker]: https://github.com/jimmybaker
 [@jirikolarik]: https://github.com/jirikolarik
 [@jirutka]: https://github.com/jirutka
+[@jkostolansky]: https://github.com/jkostolansky
 [@joeljunstrom]: https://github.com/joeljunstrom
 [@jondavidford]: https://github.com/jondavidford
 [@joonty]: https://github.com/joonty
data/README.md
CHANGED
@@ -677,6 +677,8 @@ You may be wondering why do you need it? The answer is simple: not to lose the d
 
 Imagine that you reset your index in a zero-downtime manner (to separate index), and at the meantime somebody keeps updating the data frequently (to old index). So all these actions will be written to the journal index and you'll be able to apply them after index reset using the `Chewy::Journal` interface.
 
+When enabled, journal can grow to enormous size, consider setting up cron job that would clean it occasionally using [`chewy:journal:clean` rake task](#chewyjournal).
+
 ### Index manipulation
 
 ```ruby
@@ -694,6 +696,7 @@ UsersIndex.import User.where('rating > 100') # or import specified users scope
 UsersIndex.import User.where('rating > 100').to_a # or import specified users array
 UsersIndex.import [1, 2, 42] # pass even ids for import, it will be handled in the most effective way
 UsersIndex.import User.where('rating > 100'), update_fields: [:email] # if update fields are specified - it will update their values only with the `update` bulk action
+UsersIndex.import! # raises an exception in case of any import errors
 
 UsersIndex.reset! # purges index and imports default data for all types
 ```
@@ -1143,6 +1146,17 @@ rake chewy:journal:apply["$(date -v-1H -u +%FT%TZ)"] # apply journaled changes f
 rake chewy:journal:apply["$(date -v-1H -u +%FT%TZ)",users] # apply journaled changes for the past hour on UsersIndex only
 ```
 
+When the size of the journal becomes very large, the classical way of deletion would be obstructive and resource consuming. Fortunately, Chewy internally uses [delete-by-query](https://www.elastic.co/guide/en/elasticsearch/reference/7.17/docs-delete-by-query.html#docs-delete-by-query-task-api) ES function which supports async execution with batching and [throttling](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-delete-by-query.html#docs-delete-by-query-throttle).
+
+The available options, which can be set by ENV variables, are listed below:
+* `WAIT_FOR_COMPLETION` - a boolean flag. It controls async execution. It waits by default. When set to `false` (`0`, `f`, `false` or `off` in any case spelling is accepted as `false`), Elasticsearch performs some preflight checks, launches the request, and returns a task reference you can use to cancel the task or get its status.
+* `REQUESTS_PER_SECOND` - float. The throttle for this request in sub-requests per second. No throttling is enforced by default.
+* `SCROLL_SIZE` - integer. The number of documents to be deleted in single sub-request. The default batch size is 1000.
+
+```bash
+rake chewy:journal:clean WAIT_FOR_COMPLETION=false REQUESTS_PER_SECOND=10 SCROLL_SIZE=5000
+```
+
 ### RSpec integration
 
 Just add `require 'chewy/rspec'` to your spec_helper.rb and you will get additional features:
data/lib/chewy/fields/base.rb
CHANGED
data/lib/chewy/index/import/bulk_builder.rb
CHANGED
@@ -48,12 +48,11 @@ module Chewy
         def index_entry(object)
           entry = {}
           entry[:_id] = index_object_ids[object] if index_object_ids[object]
+          entry[:routing] = routing(object) if join_field?
 
-          data = data_for(object)
           parent = cache(entry[:_id])
-
-
-          if parent_changed?(data, parent)
+          data = data_for(object) if parent.present?
+          if parent.present? && parent_changed?(data, parent)
             reindex_entries(object, data) + reindex_descendants(object)
           elsif @fields.present?
             return [] unless entry[:_id]
@@ -61,7 +60,7 @@ module Chewy
             entry[:data] = {doc: data_for(object, fields: @fields)}
             [{update: entry}]
           else
-            entry[:data] = data
+            entry[:data] = data || data_for(object)
             [{index: entry}]
           end
         end
data/lib/chewy/journal.rb
CHANGED
@@ -16,14 +16,17 @@ module Chewy
     # specified indexes.
     #
     # @param since_time [Time, DateTime] timestamp from which changes will be applied
-    # @param
+    # @param fetch_limit [Int] amount of entries to be fetched on each cycle
     # @return [Integer] the amount of journal entries found
-    def apply(since_time,
+    def apply(since_time, fetch_limit: 10, **import_options)
       stage = 1
       since_time -= 1
       count = 0
-
-
+
+      total_count = entries(since_time, fetch_limit).total_count
+
+      while count < total_count
+        entries = entries(since_time, fetch_limit).to_a.presence or break
         count += entries.size
         groups = reference_groups(entries)
         ActiveSupport::Notifications.instrument 'apply_journal.chewy', stage: stage, groups: groups
@@ -40,12 +43,20 @@ module Chewy
     #
     # @param until_time [Time, DateTime] time to clean up until it
     # @return [Hash] delete_by_query ES API call result
-    def clean(until_time = nil)
-      Chewy::Stash::Journal.clean(
+    def clean(until_time = nil, delete_by_query_options: {})
+      Chewy::Stash::Journal.clean(
+        until_time,
+        only: @only,
+        delete_by_query_options: delete_by_query_options.merge(refresh: false)
+      )
     end
 
     private
 
+    def entries(since_time, fetch_limit)
+      Chewy::Stash::Journal.entries(since_time, only: @only).order(:created_at).limit(fetch_limit)
+    end
+
     def reference_groups(entries)
       entries.group_by(&:index_name)
         .transform_keys { |index_name| Chewy.derive_name(index_name) }
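The paging loop introduced in `apply` can be illustrated with a small in-memory sketch. It is not Chewy code: `Entry`, `fetch` and `replay` are hypothetical names, no Elasticsearch is involved, and the cursor update (moving `since_time` to the newest `created_at` of each batch) is an assumption, since that part of the method is outside the hunks shown above.

```ruby
# Hypothetical stand-in for a journal entry; only the fields the sketch needs.
Entry = Struct.new(:id, :created_at)

# Mimics `entries(since_time, fetch_limit)`: entries newer than the cursor,
# ordered by created_at and capped at fetch_limit.
def fetch(journal, since_time, fetch_limit)
  journal.select { |entry| entry.created_at > since_time }
         .sort_by(&:created_at)
         .first(fetch_limit)
end

# Replays the journal in batches and returns the ids in processing order.
def replay(journal, since_time, fetch_limit: 10)
  processed = []
  total_count = journal.count { |entry| entry.created_at > since_time }

  while processed.size < total_count
    batch = fetch(journal, since_time, fetch_limit)
    break if batch.empty?

    processed.concat(batch.map(&:id))
    # Assumption: the cursor moves to the newest timestamp seen so far.
    since_time = batch.map(&:created_at).max
  end

  processed
end

# Entries written out of chronological order, similar to the new spec.
journal = [Entry.new(2, 60), Entry.new(4, 180), Entry.new(1, 120), Entry.new(3, 240)]
replayed = replay(journal, 0, fetch_limit: 2)
```

Sorting by `created_at` before limiting is what lets a fixed-size window walk the whole journal. In this sketch the comparison is strict, so entries sharing a boundary timestamp would be skipped; how the real query treats that boundary is not visible in this diff.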
data/lib/chewy/rake_helper.rb
CHANGED
@@ -19,6 +19,9 @@ module Chewy
       output.puts " Applying journal to #{targets}, #{count} entries, stage #{payload[:stage]}"
     end
 
+    DELETE_BY_QUERY_OPTIONS = %w[WAIT_FOR_COMPLETION REQUESTS_PER_SECOND SCROLL_SIZE].freeze
+    FALSE_VALUES = %w[0 f false off].freeze
+
     class << self
       # Performs zero-downtime reindexing of all documents for the specified indexes
       #
@@ -162,7 +165,7 @@ module Chewy
 
         subscribed_task_stats(output) do
           output.puts "Applying journal entries created after #{time}"
-          count = Chewy::Journal.new(
+          count = Chewy::Journal.new(journal_indexes_from(only: only, except: except)).apply(time)
           output.puts 'No journal entries were created after the specified time' if count.zero?
         end
       end
@@ -181,12 +184,16 @@ module Chewy
       # @param except [Array<Chewy::Index, String>, Chewy::Index, String] indexes to exclude from processing
       # @param output [IO] output io for logging
       # @return [Array<Chewy::Index>] indexes that were actually updated
-      def journal_clean(time: nil, only: nil, except: nil, output: $stdout)
+      def journal_clean(time: nil, only: nil, except: nil, delete_by_query_options: {}, output: $stdout)
         subscribed_task_stats(output) do
           output.puts "Cleaning journal entries created before #{time}" if time
-          response = Chewy::Journal.new(
-
-
+          response = Chewy::Journal.new(journal_indexes_from(only: only, except: except)).clean(time, delete_by_query_options: delete_by_query_options)
+          if response.key?('task')
+            output.puts "Task to cleanup the journal has been created, #{response['task']}"
+          else
+            count = response['deleted'] || response['_indices']['_all']['deleted']
+            output.puts "Cleaned up #{count} journal entries"
+          end
         end
       end
 
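The two response shapes handled by `journal_clean` can be tried in isolation with the sketch below. `cleanup_message` is a hypothetical helper that copies the branch from the hunk above, and the sample hashes (task id and counts) are made up for illustration rather than taken from a real Elasticsearch response.

```ruby
# Hypothetical helper mirroring the branch added to `journal_clean`.
def cleanup_message(response)
  if response.key?('task')
    # Asynchronous run: only a task reference is available.
    "Task to cleanup the journal has been created, #{response['task']}"
  else
    # Synchronous run: the count is read from one of two response layouts.
    count = response['deleted'] || response['_indices']['_all']['deleted']
    "Cleaned up #{count} journal entries"
  end
end

async_response  = {'task' => 'node-1:42'}                        # made-up task id
sync_response   = {'deleted' => 3}                               # made-up count
nested_response = {'_indices' => {'_all' => {'deleted' => 5}}}   # made-up count

messages = [async_response, sync_response, nested_response].map { |r| cleanup_message(r) }
```

With `WAIT_FOR_COMPLETION=false` only the first branch applies, so the rake output reports a task reference instead of a deleted count.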
@@ -228,6 +235,26 @@ module Chewy
         end
       end
 
+      # Reads options that are required to run journal cleanup asynchronously from ENV hash
+      # @see https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-delete-by-query.html
+      #
+      # @example
+      # Chewy::RakeHelper.delete_by_query_options_from_env({'WAIT_FOR_COMPLETION' => 'false','REQUESTS_PER_SECOND' => '10','SCROLL_SIZE' => '5000'})
+      # # => { wait_for_completion: false, requests_per_second: 10.0, scroll_size: 5000 }
+      #
+      def delete_by_query_options_from_env(env)
+        env
+          .slice(*DELETE_BY_QUERY_OPTIONS)
+          .transform_keys { |k| k.downcase.to_sym }
+          .to_h do |key, value|
+            case key
+            when :wait_for_completion then [key, !FALSE_VALUES.include?(value.downcase)]
+            when :requests_per_second then [key, value.to_f]
+            when :scroll_size then [key, value.to_i]
+            end
+          end
+      end
+
       def normalize_indexes(*identifiers)
         identifiers.flatten(1).map { |identifier| normalize_index(identifier) }
       end
@@ -248,6 +275,12 @@ module Chewy
 
       private
 
+      def journal_indexes_from(only: nil, except: nil)
+        return if Array.wrap(only).empty? && Array.wrap(except).empty?
+
+        indexes_from(only: only, except: except)
+      end
+
       def indexes_from(only: nil, except: nil)
         indexes = if only.present?
           normalize_indexes(Array.wrap(only))
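For experimenting with the ENV parsing outside Chewy, the helper can be copied into a plain Ruby script (Ruby 2.6 or newer, for `to_h` with a block). The sketch below reproduces the two constants and the method body from the hunks above as a top-level method; the sample inputs are illustrative and are passed as a plain `Hash` rather than the real `ENV`.

```ruby
DELETE_BY_QUERY_OPTIONS = %w[WAIT_FOR_COMPLETION REQUESTS_PER_SECOND SCROLL_SIZE].freeze
FALSE_VALUES = %w[0 f false off].freeze

# Standalone copy of the helper, defined at top level for experimentation.
def delete_by_query_options_from_env(env)
  env
    .slice(*DELETE_BY_QUERY_OPTIONS)              # ignore unrelated variables
    .transform_keys { |k| k.downcase.to_sym }
    .to_h do |key, value|
      case key
      when :wait_for_completion then [key, !FALSE_VALUES.include?(value.downcase)]
      when :requests_per_second then [key, value.to_f]
      when :scroll_size then [key, value.to_i]
      end
    end
end

typical = delete_by_query_options_from_env(
  'WAIT_FOR_COMPLETION' => 'OFF', 'REQUESTS_PER_SECOND' => '10', 'SCROLL_SIZE' => '5000', 'HOME' => '/tmp'
)

# Edge cases: any spelling outside FALSE_VALUES counts as true,
# and a non-numeric throttle silently becomes 0.0.
edge = delete_by_query_options_from_env('WAIT_FOR_COMPLETION' => 'no', 'REQUESTS_PER_SECOND' => 'abc')
```

The edge cases show that the parser only recognises the listed false spellings, so a value such as `no` keeps the synchronous behaviour.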
data/lib/chewy/search/parameters/collapse.rb
ADDED
@@ -0,0 +1,16 @@
+require 'chewy/search/parameters/storage'
+
+module Chewy
+  module Search
+    class Parameters
+      # Just a standard hash storage. Nothing to see here.
+      #
+      # @see Chewy::Search::Parameters::HashStorage
+      # @see Chewy::Search::Request#collapse
+      # @see https://www.elastic.co/guide/en/elasticsearch/reference/current/collapse-search-results.html
+      class Collapse < Storage
+        include HashStorage
+      end
+    end
+  end
+end
data/lib/chewy/search/request.rb
CHANGED
@@ -24,7 +24,7 @@ module Chewy
         track_scores track_total_hits request_cache explain version profile
         search_type preference limit offset terminate_after
         timeout min_score source stored_fields search_after
-        load script_fields suggest aggs aggregations none
+        load script_fields suggest aggs aggregations collapse none
         indices_boost rescore highlight total total_count
         total_entries indices types delete_all count exists?
         exist? find pluck scroll_batches scroll_hits
@@ -41,7 +41,7 @@ module Chewy
       EXTRA_STORAGES = %i[aggs suggest].freeze
       # An array of storage names that are changing the returned hist collection in any way.
       WHERE_STORAGES = %i[
-        query filter post_filter none min_score rescore indices_boost
+        query filter post_filter none min_score rescore indices_boost collapse
       ].freeze
 
       delegate :hits, :wrappers, :objects, :records, :documents,
@@ -509,7 +509,18 @@ module Chewy
       # @see https://www.elastic.co/guide/en/elasticsearch/reference/current/multi-index.html#multi-index
       # @param value [true, false, nil]
       # @return [Chewy::Search::Request]
-
+      #
+      # @!method collapse(value)
+      # Replaces the value of the `collapse` request part.
+      #
+      # @example
+      # PlacesIndex.collapse(field: :name)
+      # # => <PlacesIndex::Query {..., :body=>{:collapse=>{"field"=>:name}}}>
+      # @see Chewy::Search::Parameters::Collapse
+      # @see https://www.elastic.co/guide/en/elasticsearch/reference/current/collapse-search-results.html
+      # @param value [Hash]
+      # @return [Chewy::Search::Request]
+      %i[request_cache search_type preference timeout limit offset terminate_after min_score ignore_unavailable collapse].each do |name|
         define_method name do |value|
           modify(name) { replace!(value) }
         end
@@ -800,8 +811,8 @@ module Chewy
       # Returns a new scope containing only specified storages.
       #
       # @example
-      # PlacesIndex.limit(10).offset(10).order(:name).
-      # # => <PlacesIndex::Query {..., :body=>{:
+      # PlacesIndex.limit(10).offset(10).order(:name).only(:offset, :order)
+      # # => <PlacesIndex::Query {..., :body=>{:from=>10, :sort=>["name"]}}>
       # @param values [Array<String, Symbol>]
       # @return [Chewy::Search::Request] new scope
       def only(*values)
@@ -811,8 +822,8 @@ module Chewy
       # Returns a new scope containing all the storages except specified.
       #
       # @example
-      # PlacesIndex.limit(10).offset(10).order(:name).
-      # # => <PlacesIndex::Query {..., :body=>{:
+      # PlacesIndex.limit(10).offset(10).order(:name).except(:offset, :order)
+      # # => <PlacesIndex::Query {..., :body=>{:size=>10}}>
       # @param values [Array<String, Symbol>]
       # @return [Chewy::Search::Request] new scope
       def except(*values)
@@ -951,10 +962,22 @@ module Chewy
       #
       # @see https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-delete-by-query.html
       # @note The result hash is different for different API used.
-      # @param refresh [true, false]
+      # @param refresh [true, false] Refreshes all shards involved in the delete by query
+      # @param wait_for_completion [true, false] wait for request completion or run it asynchronously
+      # and return task reference at `.tasks/task/${taskId}`.
+      # @param requests_per_second [Float] The throttle for this request in sub-requests per second
+      # @param scroll_size [Integer] Size of the scroll request that powers the operation
+
       # @return [Hash] the result of query execution
-      def delete_all(refresh: true)
-        request_body = only(WHERE_STORAGES).render.merge(
+      def delete_all(refresh: true, wait_for_completion: nil, requests_per_second: nil, scroll_size: nil)
+        request_body = only(WHERE_STORAGES).render.merge(
+          {
+            refresh: refresh,
+            wait_for_completion: wait_for_completion,
+            requests_per_second: requests_per_second,
+            scroll_size: scroll_size
+          }.compact
+        )
         ActiveSupport::Notifications.instrument 'delete_query.chewy', notification_payload(request: request_body) do
           request_body[:body] = {query: {match_all: {}}} if request_body[:body].empty?
           Chewy.client.delete_by_query(request_body)
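The `.compact` call in `delete_all` is what keeps unset options out of the request. The sketch below isolates that step; `delete_by_query_params` is a hypothetical helper name, and it builds only the option hash, not the rendered query that the real method merges it into.

```ruby
# Hypothetical helper: builds only the option part of the delete-by-query request.
def delete_by_query_params(refresh: true, wait_for_completion: nil, requests_per_second: nil, scroll_size: nil)
  {
    refresh: refresh,
    wait_for_completion: wait_for_completion,
    requests_per_second: requests_per_second,
    scroll_size: scroll_size
  }.compact # drops nil values only; an explicit `false` is kept
end

defaults = delete_by_query_params
async = delete_by_query_params(refresh: false, wait_for_completion: false, requests_per_second: 10.0)
```

Because `Hash#compact` removes only `nil`, passing `wait_for_completion: false` still reaches Elasticsearch, while options left at their `nil` default are omitted so the server-side defaults apply.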
data/lib/chewy/stash.rb
CHANGED
@@ -28,12 +28,12 @@ module Chewy
       # Cleans up all the journal entries until the specified time. If nothing is
       # specified - cleans up everything.
       #
-      # @param
+      # @param until_time [Time, DateTime] Clean everything before that date
       # @param only [Chewy::Index, Array<Chewy::Index>] indexes to clean up journal entries for
-      def self.clean(until_time = nil, only: [])
+      def self.clean(until_time = nil, only: [], delete_by_query_options: {})
         scope = self.for(only)
         scope = scope.filter(range: {created_at: {lte: until_time}}) if until_time
-        scope.delete_all
+        scope.delete_all(**delete_by_query_options)
       end
 
       # Selects all the journal entries for the specified indices.
data/lib/chewy/version.rb
CHANGED
data/lib/tasks/chewy.rake
CHANGED
@@ -94,7 +94,13 @@ namespace :chewy do
 
     desc 'Removes journal records created before the specified timestamp for the specified indexes/types or all of them'
     task clean: :environment do |_task, args|
-      Chewy::RakeHelper.
+      delete_options = Chewy::RakeHelper.delete_by_query_options_from_env(ENV)
+      Chewy::RakeHelper.journal_clean(
+        [
+          parse_journal_args(args.extras),
+          {delete_by_query_options: delete_options}
+        ].reduce({}, :merge)
+      )
     end
   end
 end
data/spec/chewy/index/import/bulk_builder_spec.rb
CHANGED
@@ -62,6 +62,8 @@ describe Chewy::Index::Import::BulkBuilder do
         let(:to_index) { cities.first(2) }
         let(:delete) { [cities.last] }
         specify do
+          expect(subject).to receive(:data_for).with(cities.first).and_call_original
+          expect(subject).to receive(:data_for).with(cities.second).and_call_original
           expect(subject.bulk_body).to eq([
             {index: {_id: 1, data: {'name' => 'City17', 'rating' => 42}}},
             {index: {_id: 2, data: {'name' => 'City18', 'rating' => 42}}},
@@ -72,6 +74,8 @@ describe Chewy::Index::Import::BulkBuilder do
         context ':fields' do
           let(:fields) { %w[name] }
           specify do
+            expect(subject).to receive(:data_for).with(cities.first, fields: [:name]).and_call_original
+            expect(subject).to receive(:data_for).with(cities.second, fields: [:name]).and_call_original
             expect(subject.bulk_body).to eq([
               {update: {_id: 1, data: {doc: {'name' => 'City17'}}}},
               {update: {_id: 2, data: {doc: {'name' => 'City18'}}}},
data/spec/chewy/journal_spec.rb
CHANGED
@@ -199,59 +199,23 @@ describe Chewy::Journal do
     end
   end
 
-  context '
-    let(:time) { Time.now
-    before do
-      Timecop.freeze
-      Chewy.strategy(:urgent)
-      City.create!(id: 1)
-    end
-
-    after do
-      Chewy.strategy.pop
-      Timecop.return
-    end
-
-    specify 'journal was cleaned after the first call' do
-      expect(Chewy::Stash::Journal).to receive(:entries).exactly(2).and_call_original
-      expect(described_class.new.apply(time)).to eq(1)
-    end
-
-    context 'endless journal' do
-      let(:count_of_checks) { 10 } # default
-      let!(:journal_entries) do
-        record = Chewy::Stash::Journal.entries(time).first
-        Array.new(count_of_checks) do |i|
-          Chewy::Stash::Journal.new(
-            record.attributes.merge(
-              'created_at' => time.to_i + i,
-              'references' => [i.to_s]
-            )
-          )
-        end
-      end
-
-      specify '10 retries by default' do
-        expect(Chewy::Stash::Journal)
-          .to receive(:entries).exactly(count_of_checks) { [journal_entries.shift].compact }
-        expect(described_class.new.apply(time)).to eq(10)
-      end
+  context 'when order is not preserved' do
+    let(:time) { Time.now }
 
-
-
-
-
+    it 'paginates properly through all items' do
+      Chewy.strategy(:urgent) do
+        Timecop.travel(time + 1.minute) { City.create!(id: 2) }
+        Timecop.travel(time + 3.minute) { City.create!(id: 4) }
+        Timecop.travel(time + 2.minute) { City.create!(id: 1) }
+        Timecop.travel(time + 4.minute) { City.create!(id: 3) }
       end
 
-
-
+      CitiesIndex.purge!
+      expect(CitiesIndex.all.to_a.length).to eq 0
 
-
-
-
-          expect(described_class.new.apply(time, retries: retries)).to eq(5)
-        end
-      end
+      # Replay on specific index
+      expect(described_class.new(CitiesIndex).apply(time, fetch_limit: 2)).to eq(4)
+      expect(CitiesIndex.all.to_a.map(&:id).sort).to eq([1, 2, 3, 4])
     end
   end
 end
data/spec/chewy/rake_helper_spec.rb
CHANGED
@@ -426,6 +426,33 @@ Total: \\d+s\\Z
       described_class.journal_clean(except: CitiesIndex, output: output)
       expect(output.string).to match(Regexp.new(<<-OUTPUT, Regexp::MULTILINE))
 \\ACleaned up 1 journal entries
+Total: \\d+s\\Z
+      OUTPUT
+    end
+
+    it 'executes asynchronously' do
+      output = StringIO.new
+      expect(Chewy.client).to receive(:delete_by_query).with(
+        {
+          body: {query: {match_all: {}}},
+          index: ['chewy_journal'],
+          refresh: false,
+          requests_per_second: 10.0,
+          scroll_size: 200,
+          wait_for_completion: false
+        }
+      ).and_call_original
+      described_class.journal_clean(
+        output: output,
+        delete_by_query_options: {
+          wait_for_completion: false,
+          requests_per_second: 10.0,
+          scroll_size: 200
+        }
+      )
+
+      expect(output.string).to match(Regexp.new(<<-OUTPUT, Regexp::MULTILINE))
+\\ATask to cleanup the journal has been created, [^\\n]*
 Total: \\d+s\\Z
       OUTPUT
     end
@@ -502,4 +529,45 @@ Total: \\d+s\\Z
       end
     end
   end
+
+  describe '.delete_by_query_options_from_env' do
+    subject(:options) { described_class.delete_by_query_options_from_env(env) }
+    let(:env) do
+      {
+        'WAIT_FOR_COMPLETION' => 'false',
+        'REQUESTS_PER_SECOND' => '10',
+        'SCROLL_SIZE' => '5000'
+      }
+    end
+
+    it 'parses the options' do
+      expect(options).to eq(
+        wait_for_completion: false,
+        requests_per_second: 10.0,
+        scroll_size: 5000
+      )
+    end
+
+    context 'with different boolean values' do
+      it 'parses the option correctly' do
+        %w[1 t true TRUE on ON].each do |v|
+          expect(described_class.delete_by_query_options_from_env({'WAIT_FOR_COMPLETION' => v}))
+            .to eq(wait_for_completion: true)
+        end
+
+        %w[0 f false FALSE off OFF].each do |v|
+          expect(described_class.delete_by_query_options_from_env({'WAIT_FOR_COMPLETION' => v}))
+            .to eq(wait_for_completion: false)
+        end
+      end
+    end
+
+    context 'with other env' do
+      let(:env) { {'SOME_ENV' => '123', 'REQUESTS_PER_SECOND' => '15'} }
+
+      it 'parses only the options' do
+        expect(options).to eq(requests_per_second: 15.0)
+      end
+    end
+  end
 end
data/spec/chewy/search/request_spec.rb
CHANGED
@@ -314,6 +314,16 @@ describe Chewy::Search::Request do
     end
   end
 
+  describe '#collapse' do
+    specify { expect(subject.collapse(foo: {bar: 42}).render[:body]).to include(collapse: {'foo' => {bar: 42}}) }
+    specify do
+      expect(subject.collapse(foo: {bar: 42}).collapse(moo: {baz: 43}).render[:body])
+        .to include(collapse: {'moo' => {baz: 43}})
+    end
+    specify { expect(subject.collapse(foo: {bar: 42}).collapse(nil).render[:body]).to be_blank }
+    specify { expect { subject.collapse(foo: {bar: 42}) }.not_to change { subject.render } }
+  end
+
   describe '#docvalue_fields' do
     specify { expect(subject.docvalue_fields(:foo).render[:body]).to include(docvalue_fields: ['foo']) }
     specify do
@@ -807,6 +817,31 @@ describe Chewy::Search::Request do
           request: {index: ['products'], body: {query: {match: {name: 'name3'}}}, refresh: false}
         )
       end
+
+      it 'delete records asynchronously' do
+        outer_payload = nil
+        ActiveSupport::Notifications.subscribe('delete_query.chewy') do |_name, _start, _finish, _id, payload|
+          outer_payload = payload
+        end
+        subject.query(match: {name: 'name3'}).delete_all(
+          refresh: false,
+          wait_for_completion: false,
+          requests_per_second: 10.0,
+          scroll_size: 2000
+        )
+        expect(outer_payload).to eq(
+          index: ProductsIndex,
+          indexes: [ProductsIndex],
+          request: {
+            index: ['products'],
+            body: {query: {match: {name: 'name3'}}},
+            refresh: false,
+            wait_for_completion: false,
+            requests_per_second: 10.0,
+            scroll_size: 2000
+          }
+        )
+      end
     end
 
     describe '#response=' do
metadata
CHANGED
@@ -1,15 +1,15 @@
 --- !ruby/object:Gem::Specification
 name: chewy
 version: !ruby/object:Gem::Version
-  version: 7.2.
+  version: 7.2.7
 platform: ruby
 authors:
 - Toptal, LLC
 - pyromaniac
-autorequire:
+autorequire:
 bindir: bin
 cert_chain: []
-date: 2022-
+date: 2022-11-15 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: database_cleaner
@@ -222,6 +222,7 @@ executables: []
 extensions: []
 extra_rdoc_files: []
 files:
+- ".github/CODEOWNERS"
 - ".github/ISSUE_TEMPLATE/bug_report.md"
 - ".github/ISSUE_TEMPLATE/feature_request.md"
 - ".github/PULL_REQUEST_TEMPLATE.md"
@@ -293,6 +294,7 @@ files:
 - lib/chewy/search/parameters.rb
 - lib/chewy/search/parameters/aggs.rb
 - lib/chewy/search/parameters/allow_partial_search_results.rb
+- lib/chewy/search/parameters/collapse.rb
 - lib/chewy/search/parameters/concerns/bool_storage.rb
 - lib/chewy/search/parameters/concerns/hash_storage.rb
 - lib/chewy/search/parameters/concerns/integer_storage.rb
@@ -389,6 +391,7 @@ files:
 - spec/chewy/search/pagination/kaminari_spec.rb
 - spec/chewy/search/parameters/aggs_spec.rb
 - spec/chewy/search/parameters/bool_storage_examples.rb
+- spec/chewy/search/parameters/collapse_spec.rb
 - spec/chewy/search/parameters/docvalue_fields_spec.rb
 - spec/chewy/search/parameters/explain_spec.rb
 - spec/chewy/search/parameters/filter_spec.rb
@@ -446,7 +449,7 @@ homepage: https://github.com/toptal/chewy
 licenses:
 - MIT
 metadata: {}
-post_install_message:
+post_install_message:
 rdoc_options: []
 require_paths:
 - lib
@@ -461,8 +464,8 @@ required_rubygems_version: !ruby/object:Gem::Requirement
     - !ruby/object:Gem::Version
       version: '0'
 requirements: []
-rubygems_version: 3.
-signing_key:
+rubygems_version: 3.2.33
+signing_key:
 specification_version: 4
 summary: Elasticsearch ODM client wrapper
 test_files:
@@ -505,6 +508,7 @@ test_files:
 - spec/chewy/search/pagination/kaminari_spec.rb
 - spec/chewy/search/parameters/aggs_spec.rb
 - spec/chewy/search/parameters/bool_storage_examples.rb
+- spec/chewy/search/parameters/collapse_spec.rb
 - spec/chewy/search/parameters/docvalue_fields_spec.rb
 - spec/chewy/search/parameters/explain_spec.rb
 - spec/chewy/search/parameters/filter_spec.rb