searchkick 5.0.3 → 5.0.4
- checksums.yaml +4 -4
- data/CHANGELOG.md +5 -0
- data/README.md +127 -80
- data/lib/searchkick/index.rb +1 -1
- data/lib/searchkick/index_options.rb +2 -2
- data/lib/searchkick/model.rb +1 -1
- data/lib/searchkick/query.rb +6 -0
- data/lib/searchkick/version.rb +1 -1
- data/lib/searchkick.rb +4 -3
- metadata +2 -2
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz:
-   data.tar.gz:
+   metadata.gz: 89c6a97d4c898be7f1f494cc4bfafc8aed5acc202a855588e81f73d86ea7b123
+   data.tar.gz: c362bb2916ec0b1fa83d72efd2314e747077b5cd696ed2ece089204be9452010
  SHA512:
-   metadata.gz:
-   data.tar.gz:
+   metadata.gz: bf1a9191ee97a19afde4c44b84c02b34b7f6b8b3ac464cdd34671dd60cb8a191ec338a028db8221c2a502fadf0059987f957c918221175631e8aa70344371ee7
+   data.tar.gz: 32e7b4b9f0899088e709a77389d8fa845ef68a3fd819d3823229eb1ffd99a6121647d5ab82d209e6254e4c242530d7698441d3b97f65a1e2436acbced6b6086f
data/CHANGELOG.md
CHANGED
data/README.md
CHANGED
@@ -66,7 +66,7 @@ gem "elasticsearch" # select one
  gem "opensearch-ruby" # select one
  ```

- The latest version works with Elasticsearch 7 and 8 and OpenSearch 1. For Elasticsearch 6, use version 4.6.3 and [this readme](https://github.com/ankane/searchkick/blob/v4.6.3/README.md).
+ The latest version works with Elasticsearch 7 and 8 and OpenSearch 1 and 2. For Elasticsearch 6, use version 4.6.3 and [this readme](https://github.com/ankane/searchkick/blob/v4.6.3/README.md).

  Add searchkick to models you want to search.

@@ -592,6 +592,14 @@ There are four strategies for keeping the index synced with your database.
  end
  ```

+ And reindex a record or relation manually.
+
+ ```ruby
+ product.reindex
+ # or
+ store.products.reindex(mode: :async)
+ ```
+
  You can also do bulk updates.

  ```ruby
@@ -608,6 +616,12 @@ Searchkick.callbacks(false) do
  end
  ```

+ Or override the model’s strategy.
+
+ ```ruby
+ product.reindex(mode: :async) # :inline or :queue
+ ```
+
  ### Associations

  Data is **not** automatically synced when an association is updated. If this is desired, add a callback to reindex:
@@ -654,20 +668,16 @@ The best starting point to improve your search **by far** is to track searches a
  Product.search("apple", track: {user_id: current_user.id})
  ```

- [See the docs](https://github.com/ankane/searchjoy) for how to install and use.
-
- Focus on:
-
- - top searches with low conversions
- - top searches with no results
+ [See the docs](https://github.com/ankane/searchjoy) for how to install and use. Focus on top searches with a low conversion rate.

- Searchkick can then use the conversion data to learn what users are looking for. If a user searches for “ice cream” and adds Ben & Jerry’s Chunky Monkey to the cart (our conversion metric at Instacart), that item gets a little more weight for similar searches.
+ Searchkick can then use the conversion data to learn what users are looking for. If a user searches for “ice cream” and adds Ben & Jerry’s Chunky Monkey to the cart (our conversion metric at Instacart), that item gets a little more weight for similar searches. This can make a huge difference on the quality of your search.

  Add conversion data with:

  ```ruby
  class Product < ApplicationRecord
-   has_many :
+   has_many :conversions, class_name: "Searchjoy::Conversion", as: :convertable
+   has_many :searches, class_name: "Searchjoy::Search", through: :conversions

    searchkick conversions: [:conversions] # name of field

@@ -681,15 +691,100 @@ class Product < ApplicationRecord
  end
  ```

- Reindex and set up a cron job to add new conversions daily.
+ Reindex and set up a cron job to add new conversions daily. For zero downtime deployment, temporarily set `conversions: false` in your search calls until the data is reindexed.

-
-
+ ### Performant Conversions
+
+ A performant way to do conversions is to cache them to prevent N+1 queries. For Postgres, create a migration with:
+
+ ```ruby
+ add_column :products, :search_conversions, :jsonb
+ ```
+
+ For MySQL, use `:json`, and for others, use `:text` with a [JSON serializer](https://api.rubyonrails.org/classes/ActiveRecord/AttributeMethods/Serialization/ClassMethods.html).
+
+ Next, update your model. Create a separate method for conversion data so you can use [partial reindexing](#partial-reindexing).
+
+ ```ruby
+ class Product < ApplicationRecord
+   searchkick conversions: [:conversions]
+
+   def search_data
+     {
+       name: name,
+       category: category
+     }.merge(conversions_data)
+   end
+
+   def conversions_data
+     {
+       conversions: search_conversions || {}
+     }
+   end
+ end
+ ```
+
+ Deploy and reindex your data. For zero downtime deployment, temporarily set `conversions: false` in your search calls until the data is reindexed.
+
+ ```ruby
+ Product.reindex
+ ```
+
+ Then, create a job to update the conversions column and reindex records with new conversions. Here’s one you can use for Searchjoy:
+
+ ```ruby
+ class UpdateConversionsJob < ApplicationJob
+   def perform(class_name, since: nil, update: true, reindex: true)
+     model = Searchkick.load_model(class_name)
+
+     # get records that have a recent conversion
+     recently_converted_ids =
+       Searchjoy::Conversion.where(convertable_type: class_name).where(created_at: since..)
+       .order(:convertable_id).distinct.pluck(:convertable_id)
+
+     # split into batches
+     recently_converted_ids.in_groups_of(1000, false) do |ids|
+       if update
+         # fetch conversions
+         conversions =
+           Searchjoy::Conversion.where(convertable_id: ids, convertable_type: class_name)
+           .joins(:search).where.not(searchjoy_searches: {user_id: nil})
+           .group(:convertable_id, :query).distinct.count(:user_id)
+
+         # group by record
+         conversions_by_record = {}
+         conversions.each do |(id, query), count|
+           (conversions_by_record[id] ||= {})[query] = count
+         end
+
+         # update conversions column
+         model.transaction do
+           conversions_by_record.each do |id, conversions|
+             model.where(id: id).update_all(search_conversions: conversions)
+           end
+         end
+       end
+
+       if reindex
+         # reindex conversions data
+         model.where(id: ids).reindex(:conversions_data)
+       end
+     end
+   end
+ end
  ```

-
+ Run the job:

-
+ ```ruby
+ UpdateConversionsJob.perform_now("Product")
+ ```
+
+ And set it up to run daily.
+
+ ```ruby
+ UpdateConversionsJob.perform_later("Product", since: 1.day.ago)
+ ```

  ## Personalized Results

@@ -1575,11 +1670,12 @@ Reindex a subset of attributes to reduce time spent generating search data and c
  class Product < ApplicationRecord
    def search_data
      {
-       name: name
-
+       name: name,
+       category: category
+     }.merge(prices_data)
    end

-   def
+   def prices_data
      {
        price: price,
        sale_price: sale_price
@@ -1591,68 +1687,7 @@ end
  And use:

  ```ruby
- Product.reindex(:
- ```
-
- ### Performant Conversions
-
- Split out conversions into a separate method so you can use partial reindexing, and cache conversions to prevent N+1 queries. Be sure to use a centralized cache store like Memcached or Redis.
-
- ```ruby
- class Product < ApplicationRecord
-   def search_data
-     {
-       name: name
-     }.merge(search_conversions)
-   end
-
-   def search_conversions
-     {
-       conversions: Rails.cache.read("search_conversions:#{self.class.name}:#{id}") || {}
-     }
-   end
- end
- ```
-
- Create a job to update the cache and reindex records with new conversions.
-
- ```ruby
- class ReindexConversionsJob < ApplicationJob
-   def perform(class_name)
-     # get records that have a recent conversion
-     recently_converted_ids =
-       Searchjoy::Search.where("convertable_type = ? AND converted_at > ?", class_name, 1.day.ago)
-       .order(:convertable_id).distinct.pluck(:convertable_id)
-
-     # split into groups
-     recently_converted_ids.in_groups_of(1000, false) do |ids|
-       # fetch conversions
-       conversions =
-         Searchjoy::Search.where(convertable_id: ids, convertable_type: class_name)
-         .group(:convertable_id, :query).distinct.count(:user_id)
-
-       # group conversions by record
-       conversions_by_record = {}
-       conversions.each do |(id, query), count|
-         (conversions_by_record[id] ||= {})[query] = count
-       end
-
-       # write to cache
-       conversions_by_record.each do |id, conversions|
-         Rails.cache.write("search_conversions:#{class_name}:#{id}", conversions)
-       end
-
-       # partial reindex
-       class_name.constantize.where(id: ids).reindex(:search_conversions)
-     end
-   end
- end
- ```
-
- Run the job with:
-
- ```ruby
- ReindexConversionsJob.perform_later("Product")
+ Product.reindex(:prices_data)
  ```

  ## Advanced
@@ -2036,12 +2071,24 @@ Turn on misspellings after a certain number of characters
  Product.search("api", misspellings: {prefix_length: 2}) # api, apt, no ahi
  ```

- **Note:** With this option, if the query length is the same as `prefix_length`, misspellings are turned off with Elasticsearch 7 and OpenSearch
+ **Note:** With this option, if the query length is the same as `prefix_length`, misspellings are turned off with Elasticsearch 7 and OpenSearch 1

  ```ruby
  Product.search("ah", misspellings: {prefix_length: 2}) # ah, no aha
  ```

+ BigDecimal values are indexed as floats by default so they can be used for boosting. Convert them to strings to keep full precision.
+
+ ```ruby
+ class Product < ApplicationRecord
+   def search_data
+     {
+       units: units.to_s("F")
+     }
+   end
+ end
+ ```
+
  ## Gotchas

  ### Consistency
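The "group by record" step of `UpdateConversionsJob` in the README changes above regroups a hash keyed by `[id, query]` pairs into one nested hash per record. A minimal plain-Ruby sketch of just that step, assuming made-up sample data in place of the real ActiveRecord result:

```ruby
# Hypothetical sample standing in for the result of
# .group(:convertable_id, :query).distinct.count(:user_id):
# keys are [record_id, query] pairs, values are distinct user counts.
conversions = {
  [1, "ice cream"] => 3,
  [1, "chunky monkey"] => 2,
  [2, "ice cream"] => 1
}

# Regroup into {record_id => {query => count}}, the shape the job
# writes to the search_conversions column.
conversions_by_record = {}
conversions.each do |(id, query), count|
  (conversions_by_record[id] ||= {})[query] = count
end

p conversions_by_record
```

Each inner hash is what `conversions_data` later exposes under the `conversions` key, so one column read per record replaces a query per record.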
data/lib/searchkick/index.rb
CHANGED
@@ -418,7 +418,7 @@ module Searchkick
  true
  end
  rescue => e
- if Searchkick.transport_error?(e) && e.message.include?("No handler for type [text]")
+ if Searchkick.transport_error?(e) && (e.message.include?("No handler for type [text]") || e.message.include?("class java.util.ArrayList cannot be cast to class java.util.Map"))
    raise UnsupportedVersionError
  end

data/lib/searchkick/index_options.rb
CHANGED
@@ -19,7 +19,7 @@ module Searchkick
  mappings = generate_mappings.deep_symbolize_keys.deep_merge(custom_mappings)
  end

- set_deep_paging(settings) if options[:deep_paging]
+ set_deep_paging(settings) if options[:deep_paging] || options[:max_result_window]

  {
  settings: settings,
@@ -525,7 +525,7 @@ module Searchkick
  def set_deep_paging(settings)
    if !settings.dig(:index, :max_result_window) && !settings[:"index.max_result_window"]
      settings[:index] ||= {}
-     settings[:index][:max_result_window] = 1_000_000_000
+     settings[:index][:max_result_window] = options[:max_result_window] || 1_000_000_000
    end
  end

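To make the effect of the `set_deep_paging` change easier to see, here is a standalone sketch of the same logic. It assumes `options` is passed in as an argument (in the gem it comes from the index options) and returns the settings hash for inspection; both are changes made only for illustration.

```ruby
# Standalone sketch of the updated logic; the extra `options` parameter
# and the return value are assumptions made for this example.
def set_deep_paging(settings, options)
  # An explicitly configured window in either key style is left untouched.
  if !settings.dig(:index, :max_result_window) && !settings[:"index.max_result_window"]
    settings[:index] ||= {}
    # New in 5.0.4: prefer the max_result_window option over the large default.
    settings[:index][:max_result_window] = options[:max_result_window] || 1_000_000_000
  end
  settings
end

p set_deep_paging({}, {deep_paging: true})
p set_deep_paging({}, {max_result_window: 50_000})
p set_deep_paging({index: {max_result_window: 20_000}}, {max_result_window: 50_000})
```

Going by the whitelist change in model.rb, the option would presumably be set on the model as `searchkick max_result_window: 50_000`.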
data/lib/searchkick/model.rb
CHANGED
@@ -5,7 +5,7 @@ module Searchkick

  unknown_keywords = options.keys - [:_all, :_type, :batch_size, :callbacks, :case_sensitive, :conversions, :deep_paging, :default_fields,
  :filterable, :geo_shape, :highlight, :ignore_above, :index_name, :index_prefix, :inheritance, :language,
- :locations, :mappings, :match, :merge_mappings, :routing, :searchable, :search_synonyms, :settings, :similarity,
+ :locations, :mappings, :match, :max_result_window, :merge_mappings, :routing, :searchable, :search_synonyms, :settings, :similarity,
  :special_characters, :stem, :stemmer, :stem_conversions, :stem_exclusion, :stemmer_override, :suggest, :synonyms, :text_end,
  :text_middle, :text_start, :unscope, :word, :word_end, :word_middle, :word_start]
  raise ArgumentError, "unknown keywords: #{unknown_keywords.join(", ")}" if unknown_keywords.any?
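The model.rb change only adds `:max_result_window` to the list of accepted options. A rough sketch of how that whitelist check behaves, assuming a shortened list and a hypothetical `validate_options!` helper (neither is part of the gem):

```ruby
# Hypothetical, shortened whitelist; the real list in model.rb is much longer.
KNOWN_KEYWORDS = [:callbacks, :conversions, :deep_paging, :max_result_window, :settings].freeze

# Hypothetical helper wrapping the same subtraction-and-raise check.
def validate_options!(options)
  unknown_keywords = options.keys - KNOWN_KEYWORDS
  raise ArgumentError, "unknown keywords: #{unknown_keywords.join(", ")}" if unknown_keywords.any?
  options
end

validate_options!({max_result_window: 10_000}) # accepted once the key is whitelisted

begin
  validate_options!({max_result_windows: 10_000}) # misspelled key
rescue ArgumentError => e
  puts e.message
end
```

Before 5.0.4, passing `max_result_window` would have hit the same `ArgumentError` path as the misspelled key here.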
data/lib/searchkick/query.rb
CHANGED
@@ -254,6 +254,12 @@ module Searchkick
  offset = options[:offset] || (page - 1) * per_page + padding
  scroll = options[:scroll]

+ max_result_window = searchkick_options[:max_result_window]
+ if max_result_window
+   offset = max_result_window if offset > max_result_window
+   per_page = max_result_window - offset if offset + per_page > max_result_window
+ end
+
  # model and eager loading
  load = options[:load].nil? ? true : options[:load]

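The clamping added to query.rb keeps `offset + per_page` within the configured window. A small sketch of the same arithmetic as a standalone function; the name `clamp_pagination` and its return value are assumptions made for illustration, not part of the gem:

```ruby
# Hypothetical helper mirroring the two clamping lines added in query.rb.
def clamp_pagination(offset, per_page, max_result_window)
  if max_result_window
    # Never start past the window.
    offset = max_result_window if offset > max_result_window
    # Shrink the page so it ends exactly at the window boundary.
    per_page = max_result_window - offset if offset + per_page > max_result_window
  end
  [offset, per_page]
end

p clamp_pagination(0, 20, 10_000)      # well inside the window
p clamp_pagination(9_990, 20, 10_000)  # page straddles the boundary
p clamp_pagination(20_000, 20, 10_000) # page starts past the boundary
p clamp_pagination(20_000, 20, nil)    # option not set
```

With a window of 10,000, a page that straddles the boundary is shortened and a page that starts past it becomes empty, so the request sent to the server stays within `index.max_result_window`.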
data/lib/searchkick/version.rb
CHANGED
data/lib/searchkick.rb
CHANGED
@@ -135,8 +135,9 @@ module Searchkick
  @opensearch
  end

-
-
+ # TODO always check true version in Searchkick 6
+ def self.server_below?(version, true_version = false)
+ server_version = !true_version && opensearch? ? "7.10.2" : self.server_version
  Gem::Version.new(server_version.split("-")[0]) < Gem::Version.new(version.split("-")[0])
  end

@@ -284,7 +285,7 @@ module Searchkick
  relation
  end

- #
+ # public (for reindexing conversions)
  def self.load_model(class_name, allow_child: false)
  model = class_name.safe_constantize
  raise Error, "Could not find class: #{class_name}" unless model
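The new `true_version` flag on `server_below?` decides whether an OpenSearch server is compared by its real version or treated as Elasticsearch 7.10.2, the release OpenSearch 1.0 was forked from. A standalone sketch of that comparison, assuming the server version and the OpenSearch flag are passed in as keyword arguments (in the gem they are read from the client):

```ruby
require "rubygems" # Gem::Version; normally already loaded

# Standalone sketch: actual_version and opensearch are hypothetical
# parameters standing in for Searchkick.server_version and Searchkick.opensearch?.
def server_below?(version, actual_version:, opensearch:, true_version: false)
  # Unless the true version is requested, OpenSearch is compared as 7.10.2.
  server_version = !true_version && opensearch ? "7.10.2" : actual_version
  # Strip suffixes such as "-SNAPSHOT" before comparing.
  Gem::Version.new(server_version.split("-")[0]) < Gem::Version.new(version.split("-")[0])
end

p server_below?("8.0.0", actual_version: "2.0.0", opensearch: true)
p server_below?("7.0.0", actual_version: "2.0.0", opensearch: true)
p server_below?("7.0.0", actual_version: "2.0.0", opensearch: true, true_version: true)
p server_below?("8.0.0", actual_version: "8.2.0-SNAPSHOT", opensearch: false)
```

The default keeps existing Elasticsearch-oriented checks working against OpenSearch, while `true_version` allows a check against the OpenSearch release itself.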
metadata
CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: searchkick
  version: !ruby/object:Gem::Version
-   version: 5.0.
+   version: 5.0.4
  platform: ruby
  authors:
  - Andrew Kane
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2022-
+ date: 2022-06-17 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: activemodel