searchkick 5.0.3 → 5.0.4

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: ec7741cb306a56f1a5ae5a07450c5c102236bc6103079fc34a5f602fa2853b31
-   data.tar.gz: 2bd747ee31846c901ce2a58125b34e9c90af6937a2c7dc041f2dd9f69701f1e9
+   metadata.gz: 89c6a97d4c898be7f1f494cc4bfafc8aed5acc202a855588e81f73d86ea7b123
+   data.tar.gz: c362bb2916ec0b1fa83d72efd2314e747077b5cd696ed2ece089204be9452010
  SHA512:
-   metadata.gz: f92f2a3c7bb27862b1768f5ecedc19ad0eab515d62515f94d72412ed7c22a20493049dd0e52cba21b3a43d8c846b0dc0c6cda40c68edc12f6af37e8487fb9403
-   data.tar.gz: f69a1cfc401bad0f09bda3f2788b1096cc004dcb765eb20b3d8ada06202185eeae630d953005298e5944ed1b8c239fa808eeebb32e9440cc7ae9a32e771469dc
+   metadata.gz: bf1a9191ee97a19afde4c44b84c02b34b7f6b8b3ac464cdd34671dd60cb8a191ec338a028db8221c2a502fadf0059987f957c918221175631e8aa70344371ee7
+   data.tar.gz: 32e7b4b9f0899088e709a77389d8fa845ef68a3fd819d3823229eb1ffd99a6121647d5ab82d209e6254e4c242530d7698441d3b97f65a1e2436acbced6b6086f
data/CHANGELOG.md CHANGED
@@ -1,3 +1,8 @@
+ ## 5.0.4 (2022-06-16)
+
+ - Added `max_result_window` option
+ - Improved error message for unsupported versions of Elasticsearch
+
  ## 5.0.3 (2022-03-13)

  - Fixed context for index name for inherited models
data/README.md CHANGED
@@ -66,7 +66,7 @@ gem "elasticsearch" # select one
  gem "opensearch-ruby" # select one
  ```

- The latest version works with Elasticsearch 7 and 8 and OpenSearch 1. For Elasticsearch 6, use version 4.6.3 and [this readme](https://github.com/ankane/searchkick/blob/v4.6.3/README.md).
+ The latest version works with Elasticsearch 7 and 8 and OpenSearch 1 and 2. For Elasticsearch 6, use version 4.6.3 and [this readme](https://github.com/ankane/searchkick/blob/v4.6.3/README.md).

  Add searchkick to models you want to search.

@@ -592,6 +592,14 @@ There are four strategies for keeping the index synced with your database.
  end
  ```

+ And reindex a record or relation manually.
+
+ ```ruby
+ product.reindex
+ # or
+ store.products.reindex(mode: :async)
+ ```
+
  You can also do bulk updates.

  ```ruby
@@ -608,6 +616,12 @@ Searchkick.callbacks(false) do
  end
  ```

+ Or override the model’s strategy.
+
+ ```ruby
+ product.reindex(mode: :async) # :inline or :queue
+ ```
+
  ### Associations

  Data is **not** automatically synced when an association is updated. If this is desired, add a callback to reindex:
@@ -654,20 +668,16 @@ The best starting point to improve your search **by far** is to track searches a
  Product.search("apple", track: {user_id: current_user.id})
  ```

- [See the docs](https://github.com/ankane/searchjoy) for how to install and use.
-
- Focus on:
-
- - top searches with low conversions
- - top searches with no results
+ [See the docs](https://github.com/ankane/searchjoy) for how to install and use. Focus on top searches with a low conversion rate.

- Searchkick can then use the conversion data to learn what users are looking for. If a user searches for “ice cream” and adds Ben & Jerry’s Chunky Monkey to the cart (our conversion metric at Instacart), that item gets a little more weight for similar searches.
+ Searchkick can then use the conversion data to learn what users are looking for. If a user searches for “ice cream” and adds Ben & Jerry’s Chunky Monkey to the cart (our conversion metric at Instacart), that item gets a little more weight for similar searches. This can make a huge difference on the quality of your search.

  Add conversion data with:

  ```ruby
  class Product < ApplicationRecord
-   has_many :searches, class_name: "Searchjoy::Search", as: :convertable
+   has_many :conversions, class_name: "Searchjoy::Conversion", as: :convertable
+   has_many :searches, class_name: "Searchjoy::Search", through: :conversions

    searchkick conversions: [:conversions] # name of field

@@ -681,15 +691,100 @@ class Product < ApplicationRecord
  end
  ```

- Reindex and set up a cron job to add new conversions daily.
+ Reindex and set up a cron job to add new conversions daily. For zero downtime deployment, temporarily set `conversions: false` in your search calls until the data is reindexed.

- ```sh
- rake searchkick:reindex CLASS=Product
+ ### Performant Conversions
+
+ A performant way to do conversions is to cache them to prevent N+1 queries. For Postgres, create a migration with:
+
+ ```ruby
+ add_column :products, :search_conversions, :jsonb
+ ```
+
+ For MySQL, use `:json`, and for others, use `:text` with a [JSON serializer](https://api.rubyonrails.org/classes/ActiveRecord/AttributeMethods/Serialization/ClassMethods.html).
+
+ Next, update your model. Create a separate method for conversion data so you can use [partial reindexing](#partial-reindexing).
+
+ ```ruby
+ class Product < ApplicationRecord
+   searchkick conversions: [:conversions]
+
+   def search_data
+     {
+       name: name,
+       category: category
+     }.merge(conversions_data)
+   end
+
+   def conversions_data
+     {
+       conversions: search_conversions || {}
+     }
+   end
+ end
+ ```
+
+ Deploy and reindex your data. For zero downtime deployment, temporarily set `conversions: false` in your search calls until the data is reindexed.
+
+ ```ruby
+ Product.reindex
+ ```
+
+ Then, create a job to update the conversions column and reindex records with new conversions. Here’s one you can use for Searchjoy:
+
+ ```ruby
+ class UpdateConversionsJob < ApplicationJob
+   def perform(class_name, since: nil, update: true, reindex: true)
+     model = Searchkick.load_model(class_name)
+
+     # get records that have a recent conversion
+     recently_converted_ids =
+       Searchjoy::Conversion.where(convertable_type: class_name).where(created_at: since..)
+         .order(:convertable_id).distinct.pluck(:convertable_id)
+
+     # split into batches
+     recently_converted_ids.in_groups_of(1000, false) do |ids|
+       if update
+         # fetch conversions
+         conversions =
+           Searchjoy::Conversion.where(convertable_id: ids, convertable_type: class_name)
+             .joins(:search).where.not(searchjoy_searches: {user_id: nil})
+             .group(:convertable_id, :query).distinct.count(:user_id)
+
+         # group by record
+         conversions_by_record = {}
+         conversions.each do |(id, query), count|
+           (conversions_by_record[id] ||= {})[query] = count
+         end
+
+         # update conversions column
+         model.transaction do
+           conversions_by_record.each do |id, conversions|
+             model.where(id: id).update_all(search_conversions: conversions)
+           end
+         end
+       end
+
+       if reindex
+         # reindex conversions data
+         model.where(id: ids).reindex(:conversions_data)
+       end
+     end
+   end
+ end
  ```

- This can make a huge difference on the quality of your search.
+ Run the job:

- For a more performant way to reindex conversion data, check out [performant conversions](#performant-conversions).
+ ```ruby
+ UpdateConversionsJob.perform_now("Product")
+ ```
+
+ And set it up to run daily.
+
+ ```ruby
+ UpdateConversionsJob.perform_later("Product", since: 1.day.ago)
+ ```
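The "group by record" step inside `UpdateConversionsJob` can be tried without a database. This is a minimal sketch, assuming the grouped count returns a hash keyed by `[convertable_id, query]` pairs; the helper name `group_conversions_by_record` and the sample data are hypothetical and not part of Searchkick or Searchjoy.

```ruby
# Hypothetical helper: reshapes {[record_id, query] => count} into
# {record_id => {query => count}}, the shape stored in search_conversions.
def group_conversions_by_record(conversions)
  conversions_by_record = {}
  conversions.each do |(id, query), count|
    (conversions_by_record[id] ||= {})[query] = count
  end
  conversions_by_record
end

# Made-up sample shaped like the result of
# .group(:convertable_id, :query).distinct.count(:user_id)
sample = {
  [1, "apple"] => 3,
  [1, "pear"] => 1,
  [2, "apple"] => 2
}

grouped = group_conversions_by_record(sample)
p grouped
```

Each inner hash is what the job writes to a record's `search_conversions` column before the partial reindex.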

  ## Personalized Results

@@ -1575,11 +1670,12 @@ Reindex a subset of attributes to reduce time spent generating search data and c
  class Product < ApplicationRecord
    def search_data
      {
-       name: name
-     }.merge(search_prices)
+       name: name,
+       category: category
+     }.merge(prices_data)
    end

-   def search_prices
+   def prices_data
      {
        price: price,
        sale_price: sale_price
@@ -1591,68 +1687,7 @@ end
  And use:

  ```ruby
- Product.reindex(:search_prices)
- ```
-
- ### Performant Conversions
-
- Split out conversions into a separate method so you can use partial reindexing, and cache conversions to prevent N+1 queries. Be sure to use a centralized cache store like Memcached or Redis.
-
- ```ruby
- class Product < ApplicationRecord
-   def search_data
-     {
-       name: name
-     }.merge(search_conversions)
-   end
-
-   def search_conversions
-     {
-       conversions: Rails.cache.read("search_conversions:#{self.class.name}:#{id}") || {}
-     }
-   end
- end
- ```
-
- Create a job to update the cache and reindex records with new conversions.
-
- ```ruby
- class ReindexConversionsJob < ApplicationJob
-   def perform(class_name)
-     # get records that have a recent conversion
-     recently_converted_ids =
-       Searchjoy::Search.where("convertable_type = ? AND converted_at > ?", class_name, 1.day.ago)
-         .order(:convertable_id).distinct.pluck(:convertable_id)
-
-     # split into groups
-     recently_converted_ids.in_groups_of(1000, false) do |ids|
-       # fetch conversions
-       conversions =
-         Searchjoy::Search.where(convertable_id: ids, convertable_type: class_name)
-           .group(:convertable_id, :query).distinct.count(:user_id)
-
-       # group conversions by record
-       conversions_by_record = {}
-       conversions.each do |(id, query), count|
-         (conversions_by_record[id] ||= {})[query] = count
-       end
-
-       # write to cache
-       conversions_by_record.each do |id, conversions|
-         Rails.cache.write("search_conversions:#{class_name}:#{id}", conversions)
-       end
-
-       # partial reindex
-       class_name.constantize.where(id: ids).reindex(:search_conversions)
-     end
-   end
- end
- ```
-
- Run the job with:
-
- ```ruby
- ReindexConversionsJob.perform_later("Product")
+ Product.reindex(:prices_data)
  ```

  ## Advanced
@@ -2036,12 +2071,24 @@ Turn on misspellings after a certain number of characters
  Product.search("api", misspellings: {prefix_length: 2}) # api, apt, no ahi
  ```

- **Note:** With this option, if the query length is the same as `prefix_length`, misspellings are turned off with Elasticsearch 7 and OpenSearch
+ **Note:** With this option, if the query length is the same as `prefix_length`, misspellings are turned off with Elasticsearch 7 and OpenSearch 1

  ```ruby
  Product.search("ah", misspellings: {prefix_length: 2}) # ah, no aha
  ```

+ BigDecimal values are indexed as floats by default so they can be used for boosting. Convert them to strings to keep full precision.
+
+ ```ruby
+ class Product < ApplicationRecord
+   def search_data
+     {
+       units: units.to_s("F")
+     }
+   end
+ end
+ ```
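The precision point in the added README text can be checked in plain Ruby. This is a small sketch using only the standard `bigdecimal` library; the value of `units` is made up for illustration.

```ruby
require "bigdecimal"

# Made-up value with more significant digits than a double can hold
units = BigDecimal("1234.5678901234567890123")

as_string = units.to_s("F") # plain decimal notation, digits preserved
as_float = units.to_f       # nearest double, trailing digits are rounded away

puts as_string
puts as_float
```

Parsing `as_string` back into a `BigDecimal` gives the original value, while going through the float does not.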
+
  ## Gotchas

  ### Consistency
@@ -418,7 +418,7 @@ module Searchkick
      true
    end
  rescue => e
-   if Searchkick.transport_error?(e) && e.message.include?("No handler for type [text]")
+   if Searchkick.transport_error?(e) && (e.message.include?("No handler for type [text]") || e.message.include?("class java.util.ArrayList cannot be cast to class java.util.Map"))
      raise UnsupportedVersionError
    end

@@ -19,7 +19,7 @@ module Searchkick
    mappings = generate_mappings.deep_symbolize_keys.deep_merge(custom_mappings)
  end

- set_deep_paging(settings) if options[:deep_paging]
+ set_deep_paging(settings) if options[:deep_paging] || options[:max_result_window]

  {
    settings: settings,
@@ -525,7 +525,7 @@ module Searchkick
  def set_deep_paging(settings)
    if !settings.dig(:index, :max_result_window) && !settings[:"index.max_result_window"]
      settings[:index] ||= {}
-     settings[:index][:max_result_window] = 1_000_000_000
+     settings[:index][:max_result_window] = options[:max_result_window] || 1_000_000_000
    end
  end
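The effect of this change on the index settings can be sketched outside the gem. The version below is an assumption-laden stand-in: it takes `options` as an explicit argument (in the diff, `options` comes from the surrounding object) and returns the settings hash for convenience.

```ruby
# Stand-in for the method in the diff, with options passed explicitly.
def set_deep_paging(settings, options)
  # Respect a window the user already set, in either nested or dotted form
  if !settings.dig(:index, :max_result_window) && !settings[:"index.max_result_window"]
    settings[:index] ||= {}
    settings[:index][:max_result_window] = options[:max_result_window] || 1_000_000_000
  end
  settings
end

with_option = set_deep_paging({}, {max_result_window: 10_000})
with_deep_paging = set_deep_paging({}, {deep_paging: true})
already_set = set_deep_paging({index: {max_result_window: 5_000}}, {max_result_window: 10_000})

p with_option
p with_deep_paging
p already_set
```

An explicit `max_result_window` in custom settings still wins; the new option only replaces the hard-coded default.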
 
@@ -5,7 +5,7 @@ module Searchkick

  unknown_keywords = options.keys - [:_all, :_type, :batch_size, :callbacks, :case_sensitive, :conversions, :deep_paging, :default_fields,
    :filterable, :geo_shape, :highlight, :ignore_above, :index_name, :index_prefix, :inheritance, :language,
-   :locations, :mappings, :match, :merge_mappings, :routing, :searchable, :search_synonyms, :settings, :similarity,
+   :locations, :mappings, :match, :max_result_window, :merge_mappings, :routing, :searchable, :search_synonyms, :settings, :similarity,
    :special_characters, :stem, :stemmer, :stem_conversions, :stem_exclusion, :stemmer_override, :suggest, :synonyms, :text_end,
    :text_middle, :text_start, :unscope, :word, :word_end, :word_middle, :word_start]
  raise ArgumentError, "unknown keywords: #{unknown_keywords.join(", ")}" if unknown_keywords.any?
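Why the new option has to be added to this list can be seen in a reduced sketch of the same check. `KNOWN_KEYWORDS` here is a hypothetical three-item subset of the real list, and `validate_options!` is an illustrative name, not a Searchkick method.

```ruby
# Hypothetical subset of the allowed keywords shown in the diff
KNOWN_KEYWORDS = [:deep_paging, :max_result_window, :settings].freeze

# Illustrative helper: rejects any option key that is not whitelisted
def validate_options!(options)
  unknown_keywords = options.keys - KNOWN_KEYWORDS
  raise ArgumentError, "unknown keywords: #{unknown_keywords.join(", ")}" if unknown_keywords.any?
  options
end

validate_options!({max_result_window: 10_000}) # accepted

begin
  validate_options!({max_window: 10_000}) # misspelled key
rescue ArgumentError => e
  puts e.message
end
```

Without the whitelist entry, passing `max_result_window` to `searchkick` would fail with the same kind of error.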
@@ -254,6 +254,12 @@ module Searchkick
  offset = options[:offset] || (page - 1) * per_page + padding
  scroll = options[:scroll]

+ max_result_window = searchkick_options[:max_result_window]
+ if max_result_window
+   offset = max_result_window if offset > max_result_window
+   per_page = max_result_window - offset if offset + per_page > max_result_window
+ end
+
  # model and eager loading
  load = options[:load].nil? ? true : options[:load]
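The clamping added in this hunk keeps `offset + per_page` within the configured window. A minimal sketch of the same arithmetic, wrapped in a hypothetical helper named `clamp_to_window` so it can run on its own:

```ruby
# Hypothetical helper mirroring the two clamp lines in the diff.
# Returns the adjusted [offset, per_page] pair.
def clamp_to_window(offset, per_page, max_result_window)
  return [offset, per_page] unless max_result_window

  # An offset past the window is pulled back to the window edge
  offset = max_result_window if offset > max_result_window
  # The page is shortened so it does not extend past the window
  per_page = max_result_window - offset if offset + per_page > max_result_window
  [offset, per_page]
end

p clamp_to_window(0, 20, 10_000)      # page fully inside the window
p clamp_to_window(9_990, 20, 10_000)  # page straddles the window edge
p clamp_to_window(20_000, 20, 10_000) # page entirely past the window
```

A request entirely past the window ends up with a page size of zero, so it returns no hits rather than triggering a server-side error.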
 
@@ -1,3 +1,3 @@
  module Searchkick
-   VERSION = "5.0.3"
+   VERSION = "5.0.4"
  end
data/lib/searchkick.rb CHANGED
@@ -135,8 +135,9 @@ module Searchkick
    @opensearch
  end

- def self.server_below?(version)
-   server_version = opensearch? ? "7.10.2" : self.server_version
+ # TODO always check true version in Searchkick 6
+ def self.server_below?(version, true_version = false)
+   server_version = !true_version && opensearch? ? "7.10.2" : self.server_version
    Gem::Version.new(server_version.split("-")[0]) < Gem::Version.new(version.split("-")[0])
  end
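The new `true_version` flag only matters on OpenSearch, where the reported version is otherwise replaced with `"7.10.2"`. The sketch below reproduces the comparison as a standalone function; the explicit `actual_version` and `opensearch:` arguments are assumptions made so it runs without a server, whereas the real method reads both from the client.

```ruby
# Standalone stand-in for Searchkick.server_below? (arguments are hypothetical)
def server_below?(actual_version, version, opensearch: false, true_version: false)
  # OpenSearch is treated as Elasticsearch 7.10.2 unless the true version is requested
  server_version = !true_version && opensearch ? "7.10.2" : actual_version
  # Suffixes such as "-SNAPSHOT" are dropped before comparing
  Gem::Version.new(server_version.split("-")[0]) < Gem::Version.new(version.split("-")[0])
end

p server_below?("2.0.0", "8.0.0", opensearch: true)
p server_below?("1.3.2", "2.0.0", opensearch: true, true_version: true)
p server_below?("2.0.0", "2.0.0", opensearch: true, true_version: true)
```

With the flag set, callers can tell OpenSearch 1 from OpenSearch 2, which the fixed `"7.10.2"` substitution hides.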
142
143
 
@@ -284,7 +285,7 @@ module Searchkick
    relation
  end

- # private
+ # public (for reindexing conversions)
  def self.load_model(class_name, allow_child: false)
    model = class_name.safe_constantize
    raise Error, "Could not find class: #{class_name}" unless model
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: searchkick
  version: !ruby/object:Gem::Version
-   version: 5.0.3
+   version: 5.0.4
  platform: ruby
  authors:
  - Andrew Kane
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2022-03-13 00:00:00.000000000 Z
+ date: 2022-06-17 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: activemodel