searchkick 5.0.3 → 5.0.4
- checksums.yaml +4 -4
- data/CHANGELOG.md +5 -0
- data/README.md +127 -80
- data/lib/searchkick/index.rb +1 -1
- data/lib/searchkick/index_options.rb +2 -2
- data/lib/searchkick/model.rb +1 -1
- data/lib/searchkick/query.rb +6 -0
- data/lib/searchkick/version.rb +1 -1
- data/lib/searchkick.rb +4 -3
- metadata +2 -2
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 89c6a97d4c898be7f1f494cc4bfafc8aed5acc202a855588e81f73d86ea7b123
+  data.tar.gz: c362bb2916ec0b1fa83d72efd2314e747077b5cd696ed2ece089204be9452010
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: bf1a9191ee97a19afde4c44b84c02b34b7f6b8b3ac464cdd34671dd60cb8a191ec338a028db8221c2a502fadf0059987f957c918221175631e8aa70344371ee7
+  data.tar.gz: 32e7b4b9f0899088e709a77389d8fa845ef68a3fd819d3823229eb1ffd99a6121647d5ab82d209e6254e4c242530d7698441d3b97f65a1e2436acbced6b6086f
data/CHANGELOG.md
CHANGED
data/README.md
CHANGED
@@ -66,7 +66,7 @@ gem "elasticsearch" # select one
 gem "opensearch-ruby" # select one
 ```
 
-The latest version works with Elasticsearch 7 and 8 and OpenSearch 1. For Elasticsearch 6, use version 4.6.3 and [this readme](https://github.com/ankane/searchkick/blob/v4.6.3/README.md).
+The latest version works with Elasticsearch 7 and 8 and OpenSearch 1 and 2. For Elasticsearch 6, use version 4.6.3 and [this readme](https://github.com/ankane/searchkick/blob/v4.6.3/README.md).
 
 Add searchkick to models you want to search.
 
@@ -592,6 +592,14 @@ There are four strategies for keeping the index synced with your database.
 end
 ```
 
+And reindex a record or relation manually.
+
+```ruby
+product.reindex
+# or
+store.products.reindex(mode: :async)
+```
+
 You can also do bulk updates.
 
 ```ruby
@@ -608,6 +616,12 @@ Searchkick.callbacks(false) do
 end
 ```
 
+Or override the model’s strategy.
+
+```ruby
+product.reindex(mode: :async) # :inline or :queue
+```
+
 ### Associations
 
 Data is **not** automatically synced when an association is updated. If this is desired, add a callback to reindex:
@@ -654,20 +668,16 @@ The best starting point to improve your search **by far** is to track searches a
 Product.search("apple", track: {user_id: current_user.id})
 ```
 
-[See the docs](https://github.com/ankane/searchjoy) for how to install and use.
-
-Focus on:
-
-- top searches with low conversions
-- top searches with no results
+[See the docs](https://github.com/ankane/searchjoy) for how to install and use. Focus on top searches with a low conversion rate.
 
-Searchkick can then use the conversion data to learn what users are looking for. If a user searches for “ice cream” and adds Ben & Jerry’s Chunky Monkey to the cart (our conversion metric at Instacart), that item gets a little more weight for similar searches.
+Searchkick can then use the conversion data to learn what users are looking for. If a user searches for “ice cream” and adds Ben & Jerry’s Chunky Monkey to the cart (our conversion metric at Instacart), that item gets a little more weight for similar searches. This can make a huge difference on the quality of your search.
 
 Add conversion data with:
 
 ```ruby
 class Product < ApplicationRecord
-  has_many :
+  has_many :conversions, class_name: "Searchjoy::Conversion", as: :convertable
+  has_many :searches, class_name: "Searchjoy::Search", through: :conversions
 
   searchkick conversions: [:conversions] # name of field
 
@@ -681,15 +691,100 @@ class Product < ApplicationRecord
 end
 ```
 
-Reindex and set up a cron job to add new conversions daily.
+Reindex and set up a cron job to add new conversions daily. For zero downtime deployment, temporarily set `conversions: false` in your search calls until the data is reindexed.
 
-
-
+### Performant Conversions
+
+A performant way to do conversions is to cache them to prevent N+1 queries. For Postgres, create a migration with:
+
+```ruby
+add_column :products, :search_conversions, :jsonb
+```
+
+For MySQL, use `:json`, and for others, use `:text` with a [JSON serializer](https://api.rubyonrails.org/classes/ActiveRecord/AttributeMethods/Serialization/ClassMethods.html).
+
+Next, update your model. Create a separate method for conversion data so you can use [partial reindexing](#partial-reindexing).
+
+```ruby
+class Product < ApplicationRecord
+  searchkick conversions: [:conversions]
+
+  def search_data
+    {
+      name: name,
+      category: category
+    }.merge(conversions_data)
+  end
+
+  def conversions_data
+    {
+      conversions: search_conversions || {}
+    }
+  end
+end
+```
+
+Deploy and reindex your data. For zero downtime deployment, temporarily set `conversions: false` in your search calls until the data is reindexed.
+
+```ruby
+Product.reindex
+```
+
+Then, create a job to update the conversions column and reindex records with new conversions. Here’s one you can use for Searchjoy:
+
+```ruby
+class UpdateConversionsJob < ApplicationJob
+  def perform(class_name, since: nil, update: true, reindex: true)
+    model = Searchkick.load_model(class_name)
+
+    # get records that have a recent conversion
+    recently_converted_ids =
+      Searchjoy::Conversion.where(convertable_type: class_name).where(created_at: since..)
+        .order(:convertable_id).distinct.pluck(:convertable_id)
+
+    # split into batches
+    recently_converted_ids.in_groups_of(1000, false) do |ids|
+      if update
+        # fetch conversions
+        conversions =
+          Searchjoy::Conversion.where(convertable_id: ids, convertable_type: class_name)
+            .joins(:search).where.not(searchjoy_searches: {user_id: nil})
+            .group(:convertable_id, :query).distinct.count(:user_id)
+
+        # group by record
+        conversions_by_record = {}
+        conversions.each do |(id, query), count|
+          (conversions_by_record[id] ||= {})[query] = count
+        end
+
+        # update conversions column
+        model.transaction do
+          conversions_by_record.each do |id, conversions|
+            model.where(id: id).update_all(search_conversions: conversions)
+          end
+        end
+      end
+
+      if reindex
+        # reindex conversions data
+        model.where(id: ids).reindex(:conversions_data)
+      end
+    end
+  end
+end
 ```
 
-
+Run the job:
 
-
+```ruby
+UpdateConversionsJob.perform_now("Product")
+```
+
+And set it up to run daily.
+
+```ruby
+UpdateConversionsJob.perform_later("Product", since: 1.day.ago)
+```
 
 ## Personalized Results
 
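The "group by record" step in the new `UpdateConversionsJob` can be tried in isolation. The sketch below assumes the grouped count comes back as a hash keyed by `[convertable_id, query]` pairs, which is how ActiveRecord returns a multi-column `group(...).count`; the sample IDs, queries, and counts are made up for illustration.

```ruby
# Hypothetical result of the grouped count query:
# keys are [convertable_id, query] pairs, values are distinct user counts.
conversions = {
  [1, "ice cream"] => 3,
  [1, "chunky monkey"] => 1,
  [2, "ice cream"] => 2
}

# Same loop as in the job: fold the flat pairs into one hash per record.
conversions_by_record = {}
conversions.each do |(id, query), count|
  (conversions_by_record[id] ||= {})[query] = count
end

# Each value is what the job would write to that record's search_conversions column.
conversions_by_record.each do |id, by_query|
  puts "#{id}: #{by_query.inspect}"
end
```

Each record ends up with a single `{query => count}` hash, which is the shape the README's `conversions_data` method hands to Searchkick.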
@@ -1575,11 +1670,12 @@ Reindex a subset of attributes to reduce time spent generating search data and c
 class Product < ApplicationRecord
   def search_data
     {
-      name: name
-
+      name: name,
+      category: category
+    }.merge(prices_data)
   end
 
-  def
+  def prices_data
     {
       price: price,
       sale_price: sale_price
@@ -1591,68 +1687,7 @@ end
 And use:
 
 ```ruby
-Product.reindex(:
-```
-
-### Performant Conversions
-
-Split out conversions into a separate method so you can use partial reindexing, and cache conversions to prevent N+1 queries. Be sure to use a centralized cache store like Memcached or Redis.
-
-```ruby
-class Product < ApplicationRecord
-  def search_data
-    {
-      name: name
-    }.merge(search_conversions)
-  end
-
-  def search_conversions
-    {
-      conversions: Rails.cache.read("search_conversions:#{self.class.name}:#{id}") || {}
-    }
-  end
-end
-```
-
-Create a job to update the cache and reindex records with new conversions.
-
-```ruby
-class ReindexConversionsJob < ApplicationJob
-  def perform(class_name)
-    # get records that have a recent conversion
-    recently_converted_ids =
-      Searchjoy::Search.where("convertable_type = ? AND converted_at > ?", class_name, 1.day.ago)
-        .order(:convertable_id).distinct.pluck(:convertable_id)
-
-    # split into groups
-    recently_converted_ids.in_groups_of(1000, false) do |ids|
-      # fetch conversions
-      conversions =
-        Searchjoy::Search.where(convertable_id: ids, convertable_type: class_name)
-          .group(:convertable_id, :query).distinct.count(:user_id)
-
-      # group conversions by record
-      conversions_by_record = {}
-      conversions.each do |(id, query), count|
-        (conversions_by_record[id] ||= {})[query] = count
-      end
-
-      # write to cache
-      conversions_by_record.each do |id, conversions|
-        Rails.cache.write("search_conversions:#{class_name}:#{id}", conversions)
-      end
-
-      # partial reindex
-      class_name.constantize.where(id: ids).reindex(:search_conversions)
-    end
-  end
-end
-```
-
-Run the job with:
-
-```ruby
-ReindexConversionsJob.perform_later("Product")
+Product.reindex(:prices_data)
 ```
 
 ## Advanced
@@ -2036,12 +2071,24 @@ Turn on misspellings after a certain number of characters
 Product.search("api", misspellings: {prefix_length: 2}) # api, apt, no ahi
 ```
 
-**Note:** With this option, if the query length is the same as `prefix_length`, misspellings are turned off with Elasticsearch 7 and OpenSearch
+**Note:** With this option, if the query length is the same as `prefix_length`, misspellings are turned off with Elasticsearch 7 and OpenSearch 1
 
 ```ruby
 Product.search("ah", misspellings: {prefix_length: 2}) # ah, no aha
 ```
 
+BigDecimal values are indexed as floats by default so they can be used for boosting. Convert them to strings to keep full precision.
+
+```ruby
+class Product < ApplicationRecord
+  def search_data
+    {
+      units: units.to_s("F")
+    }
+  end
+end
+```
+
 ## Gotchas
 
 ### Consistency
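The BigDecimal note added to the README can be checked with plain Ruby, outside Searchkick. This sketch only compares `to_f` with `to_s("F")` on a made-up value; it does not show how Searchkick itself serializes the field.

```ruby
require "bigdecimal"

# Hypothetical value with more significant digits than a Float can hold.
units = BigDecimal("0.1000000000000000000001")

as_float  = units.to_f        # lossy: rounded to the nearest double
as_string = units.to_s("F")   # plain decimal notation, all digits kept

puts as_float
puts as_string

# Round-tripping the string recovers the original value exactly;
# round-tripping the float's printed form does not.
puts BigDecimal(as_string) == units
puts BigDecimal(as_float.to_s) == units
```

The string form keeps every digit, but a string field can no longer be used as a number, which is the tradeoff behind the README's remark about boosting.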
data/lib/searchkick/index.rb
CHANGED
@@ -418,7 +418,7 @@ module Searchkick
     true
   end
 rescue => e
-  if Searchkick.transport_error?(e) && e.message.include?("No handler for type [text]")
+  if Searchkick.transport_error?(e) && (e.message.include?("No handler for type [text]") || e.message.include?("class java.util.ArrayList cannot be cast to class java.util.Map"))
     raise UnsupportedVersionError
   end
 
data/lib/searchkick/index_options.rb
CHANGED
@@ -19,7 +19,7 @@ module Searchkick
   mappings = generate_mappings.deep_symbolize_keys.deep_merge(custom_mappings)
 end
 
-set_deep_paging(settings) if options[:deep_paging]
+set_deep_paging(settings) if options[:deep_paging] || options[:max_result_window]
 
 {
   settings: settings,
@@ -525,7 +525,7 @@ module Searchkick
 def set_deep_paging(settings)
   if !settings.dig(:index, :max_result_window) && !settings[:"index.max_result_window"]
     settings[:index] ||= {}
-    settings[:index][:max_result_window] = 1_000_000_000
+    settings[:index][:max_result_window] = options[:max_result_window] || 1_000_000_000
   end
 end
 
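A rough sketch of how the changed `set_deep_paging` resolves the window size. In the gem the method reads `options` from the surrounding object; here it is passed as a second argument (an assumption made so the snippet runs on its own), and the settings hashes are invented.

```ruby
# Simplified stand-in for the method in the hunk above.
# `options` is an explicit parameter here; the real method takes only `settings`.
def set_deep_paging(settings, options)
  # An explicit index.max_result_window in custom settings always wins.
  if !settings.dig(:index, :max_result_window) && !settings[:"index.max_result_window"]
    settings[:index] ||= {}
    settings[:index][:max_result_window] = options[:max_result_window] || 1_000_000_000
  end
  settings
end

p set_deep_paging({}, {max_result_window: 10_000})
p set_deep_paging({}, {deep_paging: true})
p set_deep_paging({index: {max_result_window: 500}}, {max_result_window: 10_000})
```

The order of precedence in this sketch is: a window already present in the custom settings, then the new `max_result_window` option, then the existing `deep_paging` default of 1_000_000_000.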
data/lib/searchkick/model.rb
CHANGED
@@ -5,7 +5,7 @@ module Searchkick
 
 unknown_keywords = options.keys - [:_all, :_type, :batch_size, :callbacks, :case_sensitive, :conversions, :deep_paging, :default_fields,
   :filterable, :geo_shape, :highlight, :ignore_above, :index_name, :index_prefix, :inheritance, :language,
-  :locations, :mappings, :match, :merge_mappings, :routing, :searchable, :search_synonyms, :settings, :similarity,
+  :locations, :mappings, :match, :max_result_window, :merge_mappings, :routing, :searchable, :search_synonyms, :settings, :similarity,
   :special_characters, :stem, :stemmer, :stem_conversions, :stem_exclusion, :stemmer_override, :suggest, :synonyms, :text_end,
   :text_middle, :text_start, :unscope, :word, :word_end, :word_middle, :word_start]
 raise ArgumentError, "unknown keywords: #{unknown_keywords.join(", ")}" if unknown_keywords.any?
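The allow-list check in this hunk is why `:max_result_window` had to be added to the array before the option could be passed to `searchkick`. A minimal sketch of the same pattern, with an abbreviated keyword list and a hypothetical helper name (`validate_options!` is not part of the gem):

```ruby
# Abbreviated allow-list; the real one in model.rb is much longer.
KNOWN_KEYWORDS = [:callbacks, :conversions, :deep_paging, :max_result_window].freeze

# Hypothetical helper mirroring the check in the hunk above.
def validate_options!(options)
  unknown_keywords = options.keys - KNOWN_KEYWORDS
  raise ArgumentError, "unknown keywords: #{unknown_keywords.join(", ")}" if unknown_keywords.any?
  options
end

validate_options!(max_result_window: 10_000) # passes

begin
  validate_options!(max_result_windw: 10_000) # misspelled on purpose
rescue ArgumentError => e
  puts e.message
end
```

Array difference keeps only the keys missing from the allow-list, so a misspelled option fails at model load time instead of being silently ignored.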
data/lib/searchkick/query.rb
CHANGED
@@ -254,6 +254,12 @@ module Searchkick
 offset = options[:offset] || (page - 1) * per_page + padding
 scroll = options[:scroll]
 
+max_result_window = searchkick_options[:max_result_window]
+if max_result_window
+  offset = max_result_window if offset > max_result_window
+  per_page = max_result_window - offset if offset + per_page > max_result_window
+end
+
 # model and eager loading
 load = options[:load].nil? ? true : options[:load]
 
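The clamping added to `query.rb` is easier to follow with concrete numbers. The sketch below lifts the two assignments into a standalone function; `clamp_paging` is a hypothetical name, and the values are examples rather than gem defaults.

```ruby
# Hypothetical wrapper around the clamping logic from the hunk above.
def clamp_paging(offset, per_page, max_result_window)
  if max_result_window
    # Never start past the window...
    offset = max_result_window if offset > max_result_window
    # ...and shrink the page so it ends exactly at the window.
    per_page = max_result_window - offset if offset + per_page > max_result_window
  end
  [offset, per_page]
end

p clamp_paging(0, 20, 10_000)      # well inside the window
p clamp_paging(9_990, 20, 10_000)  # page straddles the window
p clamp_paging(20_000, 20, 10_000) # page starts past the window
p clamp_paging(20_000, 20, nil)    # option not set
```

A request that starts past the window is reduced to an empty page at the window boundary, so the `from + size` sent to the server never exceeds the configured limit.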
data/lib/searchkick/version.rb
CHANGED
data/lib/searchkick.rb
CHANGED
@@ -135,8 +135,9 @@ module Searchkick
   @opensearch
 end
 
-
-
+# TODO always check true version in Searchkick 6
+def self.server_below?(version, true_version = false)
+  server_version = !true_version && opensearch? ? "7.10.2" : self.server_version
   Gem::Version.new(server_version.split("-")[0]) < Gem::Version.new(version.split("-")[0])
 end
 
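The new `true_version` flag on `server_below?` changes which version string is compared. A standalone sketch, with the server version and the OpenSearch flag passed in as keyword arguments (in the gem they come from `Searchkick.server_version` and `Searchkick.opensearch?`); the version numbers are examples only.

```ruby
require "rubygems" # Gem::Version; normally already loaded

# Standalone variant of the method in the hunk above.
# server_version: and opensearch: are explicit here, unlike in the gem.
def server_below?(version, server_version:, opensearch:, true_version: false)
  # Unless the true version is requested, OpenSearch is compared as 7.10.2.
  effective = !true_version && opensearch ? "7.10.2" : server_version
  # Drop suffixes such as "-SNAPSHOT" before comparing.
  Gem::Version.new(effective.split("-")[0]) < Gem::Version.new(version.split("-")[0])
end

p server_below?("8.0.0", server_version: "2.0.0", opensearch: true)
p server_below?("2.0.0", server_version: "1.3.2", opensearch: true)
p server_below?("2.0.0", server_version: "1.3.2", opensearch: true, true_version: true)
p server_below?("7.0.0", server_version: "7.17.0-SNAPSHOT", opensearch: false)
```

Without the flag, every OpenSearch server compares as 7.10.2, so existing Elasticsearch-oriented checks keep working; with it, a caller can tell OpenSearch 1 from OpenSearch 2.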
@@ -284,7 +285,7 @@ module Searchkick
   relation
 end
 
-#
+# public (for reindexing conversions)
 def self.load_model(class_name, allow_child: false)
   model = class_name.safe_constantize
   raise Error, "Could not find class: #{class_name}" unless model
metadata
CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: searchkick
 version: !ruby/object:Gem::Version
-  version: 5.0.
+  version: 5.0.4
 platform: ruby
 authors:
 - Andrew Kane
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2022-
+date: 2022-06-17 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: activemodel