searchkick 5.0.3 → 5.0.4

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: ec7741cb306a56f1a5ae5a07450c5c102236bc6103079fc34a5f602fa2853b31
- data.tar.gz: 2bd747ee31846c901ce2a58125b34e9c90af6937a2c7dc041f2dd9f69701f1e9
+ metadata.gz: 89c6a97d4c898be7f1f494cc4bfafc8aed5acc202a855588e81f73d86ea7b123
+ data.tar.gz: c362bb2916ec0b1fa83d72efd2314e747077b5cd696ed2ece089204be9452010
  SHA512:
- metadata.gz: f92f2a3c7bb27862b1768f5ecedc19ad0eab515d62515f94d72412ed7c22a20493049dd0e52cba21b3a43d8c846b0dc0c6cda40c68edc12f6af37e8487fb9403
- data.tar.gz: f69a1cfc401bad0f09bda3f2788b1096cc004dcb765eb20b3d8ada06202185eeae630d953005298e5944ed1b8c239fa808eeebb32e9440cc7ae9a32e771469dc
+ metadata.gz: bf1a9191ee97a19afde4c44b84c02b34b7f6b8b3ac464cdd34671dd60cb8a191ec338a028db8221c2a502fadf0059987f957c918221175631e8aa70344371ee7
+ data.tar.gz: 32e7b4b9f0899088e709a77389d8fa845ef68a3fd819d3823229eb1ffd99a6121647d5ab82d209e6254e4c242530d7698441d3b97f65a1e2436acbced6b6086f
data/CHANGELOG.md CHANGED
@@ -1,3 +1,8 @@
+ ## 5.0.4 (2022-06-16)
+
+ - Added `max_result_window` option
+ - Improved error message for unsupported versions of Elasticsearch
+
  ## 5.0.3 (2022-03-13)

  - Fixed context for index name for inherited models
data/README.md CHANGED
@@ -66,7 +66,7 @@ gem "elasticsearch" # select one
  gem "opensearch-ruby" # select one
  ```

- The latest version works with Elasticsearch 7 and 8 and OpenSearch 1. For Elasticsearch 6, use version 4.6.3 and [this readme](https://github.com/ankane/searchkick/blob/v4.6.3/README.md).
+ The latest version works with Elasticsearch 7 and 8 and OpenSearch 1 and 2. For Elasticsearch 6, use version 4.6.3 and [this readme](https://github.com/ankane/searchkick/blob/v4.6.3/README.md).

  Add searchkick to models you want to search.
 
@@ -592,6 +592,14 @@ There are four strategies for keeping the index synced with your database.
  end
  ```

+ And reindex a record or relation manually.
+
+ ```ruby
+ product.reindex
+ # or
+ store.products.reindex(mode: :async)
+ ```
+
  You can also do bulk updates.

  ```ruby
@@ -608,6 +616,12 @@ Searchkick.callbacks(false) do
  end
  ```

+ Or override the model’s strategy.
+
+ ```ruby
+ product.reindex(mode: :async) # :inline or :queue
+ ```
+
  ### Associations

  Data is **not** automatically synced when an association is updated. If this is desired, add a callback to reindex:
@@ -654,20 +668,16 @@ The best starting point to improve your search **by far** is to track searches a
  Product.search("apple", track: {user_id: current_user.id})
  ```

- [See the docs](https://github.com/ankane/searchjoy) for how to install and use.
-
- Focus on:
-
- - top searches with low conversions
- - top searches with no results
+ [See the docs](https://github.com/ankane/searchjoy) for how to install and use. Focus on top searches with a low conversion rate.

- Searchkick can then use the conversion data to learn what users are looking for. If a user searches for “ice cream” and adds Ben & Jerry’s Chunky Monkey to the cart (our conversion metric at Instacart), that item gets a little more weight for similar searches.
+ Searchkick can then use the conversion data to learn what users are looking for. If a user searches for “ice cream” and adds Ben & Jerry’s Chunky Monkey to the cart (our conversion metric at Instacart), that item gets a little more weight for similar searches. This can make a huge difference on the quality of your search.

  Add conversion data with:

  ```ruby
  class Product < ApplicationRecord
- has_many :searches, class_name: "Searchjoy::Search", as: :convertable
+ has_many :conversions, class_name: "Searchjoy::Conversion", as: :convertable
+ has_many :searches, class_name: "Searchjoy::Search", through: :conversions

  searchkick conversions: [:conversions] # name of field
 
@@ -681,15 +691,100 @@ class Product < ApplicationRecord
  end
  ```

- Reindex and set up a cron job to add new conversions daily.
+ Reindex and set up a cron job to add new conversions daily. For zero downtime deployment, temporarily set `conversions: false` in your search calls until the data is reindexed.

- ```sh
- rake searchkick:reindex CLASS=Product
+ ### Performant Conversions
+
+ A performant way to do conversions is to cache them to prevent N+1 queries. For Postgres, create a migration with:
+
+ ```ruby
+ add_column :products, :search_conversions, :jsonb
+ ```
+
+ For MySQL, use `:json`, and for others, use `:text` with a [JSON serializer](https://api.rubyonrails.org/classes/ActiveRecord/AttributeMethods/Serialization/ClassMethods.html).
+
+ Next, update your model. Create a separate method for conversion data so you can use [partial reindexing](#partial-reindexing).
+
+ ```ruby
+ class Product < ApplicationRecord
+ searchkick conversions: [:conversions]
+
+ def search_data
+ {
+ name: name,
+ category: category
+ }.merge(conversions_data)
+ end
+
+ def conversions_data
+ {
+ conversions: search_conversions || {}
+ }
+ end
+ end
+ ```
+
+ Deploy and reindex your data. For zero downtime deployment, temporarily set `conversions: false` in your search calls until the data is reindexed.
+
+ ```ruby
+ Product.reindex
+ ```
+
+ Then, create a job to update the conversions column and reindex records with new conversions. Here’s one you can use for Searchjoy:
+
+ ```ruby
+ class UpdateConversionsJob < ApplicationJob
+ def perform(class_name, since: nil, update: true, reindex: true)
+ model = Searchkick.load_model(class_name)
+
+ # get records that have a recent conversion
+ recently_converted_ids =
+ Searchjoy::Conversion.where(convertable_type: class_name).where(created_at: since..)
+ .order(:convertable_id).distinct.pluck(:convertable_id)
+
+ # split into batches
+ recently_converted_ids.in_groups_of(1000, false) do |ids|
+ if update
+ # fetch conversions
+ conversions =
+ Searchjoy::Conversion.where(convertable_id: ids, convertable_type: class_name)
+ .joins(:search).where.not(searchjoy_searches: {user_id: nil})
+ .group(:convertable_id, :query).distinct.count(:user_id)
+
+ # group by record
+ conversions_by_record = {}
+ conversions.each do |(id, query), count|
+ (conversions_by_record[id] ||= {})[query] = count
+ end
+
+ # update conversions column
+ model.transaction do
+ conversions_by_record.each do |id, conversions|
+ model.where(id: id).update_all(search_conversions: conversions)
+ end
+ end
+ end
+
+ if reindex
+ # reindex conversions data
+ model.where(id: ids).reindex(:conversions_data)
+ end
+ end
+ end
+ end
  ```

- This can make a huge difference on the quality of your search.
+ Run the job:

- For a more performant way to reindex conversion data, check out [performant conversions](#performant-conversions).
+ ```ruby
+ UpdateConversionsJob.perform_now("Product")
+ ```
+
+ And set it up to run daily.
+
+ ```ruby
+ UpdateConversionsJob.perform_later("Product", since: 1.day.ago)
+ ```

  ## Personalized Results
 
@@ -1575,11 +1670,12 @@ Reindex a subset of attributes to reduce time spent generating search data and c
  class Product < ApplicationRecord
  def search_data
  {
- name: name
- }.merge(search_prices)
+ name: name,
+ category: category
+ }.merge(prices_data)
  end

- def search_prices
+ def prices_data
  {
  price: price,
  sale_price: sale_price
@@ -1591,68 +1687,7 @@ end
  And use:

  ```ruby
- Product.reindex(:search_prices)
- ```
-
- ### Performant Conversions
-
- Split out conversions into a separate method so you can use partial reindexing, and cache conversions to prevent N+1 queries. Be sure to use a centralized cache store like Memcached or Redis.
-
- ```ruby
- class Product < ApplicationRecord
- def search_data
- {
- name: name
- }.merge(search_conversions)
- end
-
- def search_conversions
- {
- conversions: Rails.cache.read("search_conversions:#{self.class.name}:#{id}") || {}
- }
- end
- end
- ```
-
- Create a job to update the cache and reindex records with new conversions.
-
- ```ruby
- class ReindexConversionsJob < ApplicationJob
- def perform(class_name)
- # get records that have a recent conversion
- recently_converted_ids =
- Searchjoy::Search.where("convertable_type = ? AND converted_at > ?", class_name, 1.day.ago)
- .order(:convertable_id).distinct.pluck(:convertable_id)
-
- # split into groups
- recently_converted_ids.in_groups_of(1000, false) do |ids|
- # fetch conversions
- conversions =
- Searchjoy::Search.where(convertable_id: ids, convertable_type: class_name)
- .group(:convertable_id, :query).distinct.count(:user_id)
-
- # group conversions by record
- conversions_by_record = {}
- conversions.each do |(id, query), count|
- (conversions_by_record[id] ||= {})[query] = count
- end
-
- # write to cache
- conversions_by_record.each do |id, conversions|
- Rails.cache.write("search_conversions:#{class_name}:#{id}", conversions)
- end
-
- # partial reindex
- class_name.constantize.where(id: ids).reindex(:search_conversions)
- end
- end
- end
- ```
-
- Run the job with:
-
- ```ruby
- ReindexConversionsJob.perform_later("Product")
+ Product.reindex(:prices_data)
  ```

  ## Advanced
@@ -2036,12 +2071,24 @@ Turn on misspellings after a certain number of characters
  Product.search("api", misspellings: {prefix_length: 2}) # api, apt, no ahi
  ```

- **Note:** With this option, if the query length is the same as `prefix_length`, misspellings are turned off with Elasticsearch 7 and OpenSearch
+ **Note:** With this option, if the query length is the same as `prefix_length`, misspellings are turned off with Elasticsearch 7 and OpenSearch 1

  ```ruby
  Product.search("ah", misspellings: {prefix_length: 2}) # ah, no aha
  ```

+ BigDecimal values are indexed as floats by default so they can be used for boosting. Convert them to strings to keep full precision.
+
+ ```ruby
+ class Product < ApplicationRecord
+ def search_data
+ {
+ units: units.to_s("F")
+ }
+ end
+ end
+ ```
+
  ## Gotchas

  ### Consistency
@@ -418,7 +418,7 @@ module Searchkick
  true
  end
  rescue => e
- if Searchkick.transport_error?(e) && e.message.include?("No handler for type [text]")
+ if Searchkick.transport_error?(e) && (e.message.include?("No handler for type [text]") || e.message.include?("class java.util.ArrayList cannot be cast to class java.util.Map"))
  raise UnsupportedVersionError
  end
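The broadened condition in this hunk amounts to a substring test against a list of known server error messages. A minimal sketch of that idea, assuming a hypothetical helper name `unsupported_version_message?` (it is not part of Searchkick) and leaving out the `Searchkick.transport_error?` guard that the real code applies first:

```ruby
# Hypothetical helper, for illustration only; not part of Searchkick.
# Both marker strings are copied from the hunk above.
UNSUPPORTED_VERSION_MARKERS = [
  "No handler for type [text]",
  "class java.util.ArrayList cannot be cast to class java.util.Map"
].freeze

# True when the error message contains any known marker string.
def unsupported_version_message?(message)
  UNSUPPORTED_VERSION_MARKERS.any? { |marker| message.include?(marker) }
end
```

Keeping the markers in a list means a further message can be recognized by adding one entry, without extending the boolean expression.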
 
@@ -19,7 +19,7 @@ module Searchkick
  mappings = generate_mappings.deep_symbolize_keys.deep_merge(custom_mappings)
  end

- set_deep_paging(settings) if options[:deep_paging]
+ set_deep_paging(settings) if options[:deep_paging] || options[:max_result_window]

  {
  settings: settings,
@@ -525,7 +525,7 @@ module Searchkick
  def set_deep_paging(settings)
  if !settings.dig(:index, :max_result_window) && !settings[:"index.max_result_window"]
  settings[:index] ||= {}
- settings[:index][:max_result_window] = 1_000_000_000
+ settings[:index][:max_result_window] = options[:max_result_window] || 1_000_000_000
  end
  end
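To see how the new fallback behaves, the method can be tried in isolation. This is a standalone sketch: it takes `options` as an explicit second argument and returns `settings`, both of which are assumptions made for illustration (the method in the hunk reads `options` from its surrounding object).

```ruby
# Standalone sketch of the logic in the hunk above.
# Assumption: `options` is passed in explicitly and `settings` is returned.
def set_deep_paging(settings, options)
  # An explicitly configured window (nested or dotted key) is left untouched.
  if !settings.dig(:index, :max_result_window) && !settings[:"index.max_result_window"]
    settings[:index] ||= {}
    # Use the model's max_result_window if given, else the deep paging default.
    settings[:index][:max_result_window] = options[:max_result_window] || 1_000_000_000
  end
  settings
end

set_deep_paging({}, {max_result_window: 10_000})
set_deep_paging({}, {deep_paging: true})
```

The first call sets the window to the requested value, while the second falls back to the previous fixed default of `1_000_000_000`.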
 
@@ -5,7 +5,7 @@ module Searchkick

  unknown_keywords = options.keys - [:_all, :_type, :batch_size, :callbacks, :case_sensitive, :conversions, :deep_paging, :default_fields,
  :filterable, :geo_shape, :highlight, :ignore_above, :index_name, :index_prefix, :inheritance, :language,
- :locations, :mappings, :match, :merge_mappings, :routing, :searchable, :search_synonyms, :settings, :similarity,
+ :locations, :mappings, :match, :max_result_window, :merge_mappings, :routing, :searchable, :search_synonyms, :settings, :similarity,
  :special_characters, :stem, :stemmer, :stem_conversions, :stem_exclusion, :stemmer_override, :suggest, :synonyms, :text_end,
  :text_middle, :text_start, :unscope, :word, :word_end, :word_middle, :word_start]
  raise ArgumentError, "unknown keywords: #{unknown_keywords.join(", ")}" if unknown_keywords.any?
@@ -254,6 +254,12 @@ module Searchkick
  offset = options[:offset] || (page - 1) * per_page + padding
  scroll = options[:scroll]

+ max_result_window = searchkick_options[:max_result_window]
+ if max_result_window
+ offset = max_result_window if offset > max_result_window
+ per_page = max_result_window - offset if offset + per_page > max_result_window
+ end
+
  # model and eager loading
  load = options[:load].nil? ? true : options[:load]
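The clamping added in this hunk keeps `offset + per_page` from exceeding the configured window. A small sketch of the same arithmetic as a pure function, assuming a hypothetical name `clamp_paging` (the real code mutates local variables inside the query builder):

```ruby
# Hypothetical helper mirroring the two clamping lines from the hunk above.
def clamp_paging(offset, per_page, max_result_window)
  if max_result_window
    # An offset past the window is pulled back to the window's edge.
    offset = max_result_window if offset > max_result_window
    # The page is shortened so it ends exactly at the window's edge.
    per_page = max_result_window - offset if offset + per_page > max_result_window
  end
  [offset, per_page]
end

clamp_paging(0, 10, 100)   # => [0, 10]
clamp_paging(95, 10, 100)  # => [95, 5]
clamp_paging(200, 10, 100) # => [100, 0]
```

A request that starts beyond the window ends up with a page size of zero, which reads as an empty page being returned in place of a server-side window error.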
 
@@ -1,3 +1,3 @@
  module Searchkick
- VERSION = "5.0.3"
+ VERSION = "5.0.4"
  end
data/lib/searchkick.rb CHANGED
@@ -135,8 +135,9 @@ module Searchkick
  @opensearch
  end

- def self.server_below?(version)
- server_version = opensearch? ? "7.10.2" : self.server_version
+ # TODO always check true version in Searchkick 6
+ def self.server_below?(version, true_version = false)
+ server_version = !true_version && opensearch? ? "7.10.2" : self.server_version
  Gem::Version.new(server_version.split("-")[0]) < Gem::Version.new(version.split("-")[0])
  end
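The comparison at the heart of `server_below?` can be exercised on its own with `Gem::Version`. In this sketch, `version_below?` is a hypothetical name and the version strings are made-up inputs; only the comparison expression comes from the hunk.

```ruby
# Hypothetical standalone version of the comparison used by server_below?.
# Any suffix after "-" (for example "-SNAPSHOT") is dropped before comparing.
def version_below?(server_version, version)
  Gem::Version.new(server_version.split("-")[0]) < Gem::Version.new(version.split("-")[0])
end

version_below?("7.10.2", "8.0.0")         # => true
version_below?("8.0.0-SNAPSHOT", "8.0.0") # => false
version_below?("2.0.0", "7.0.0")          # => true
```

The last call suggests why the new `true_version` flag exists: a server's own version number such as `2.0.0` compares below `7.0.0`, so by default OpenSearch is treated as `"7.10.2"`, and callers that need the real number can pass `true` as the second argument.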
 
@@ -284,7 +285,7 @@ module Searchkick
  relation
  end

- # private
+ # public (for reindexing conversions)
  def self.load_model(class_name, allow_child: false)
  model = class_name.safe_constantize
  raise Error, "Could not find class: #{class_name}" unless model
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: searchkick
  version: !ruby/object:Gem::Version
- version: 5.0.3
+ version: 5.0.4
  platform: ruby
  authors:
  - Andrew Kane
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2022-03-13 00:00:00.000000000 Z
+ date: 2022-06-17 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: activemodel