searchkick 5.4.0 → 5.5.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 5dfa66d383b0e1a91288b4636fef42b22ac73ec63294a7955ac20298851df1c1
- data.tar.gz: fbe59c3e85352b01674c16831f67b3906367e6a032769582ee552038853194f8
+ metadata.gz: 37b82bea130dfffde9634bf74e4e078e97bbe6e8d0b4e540c13800c765ddc6c3
+ data.tar.gz: 2435f38ff18bf471673e1df11f0fd96cfacb408cf390e04afcc068488777a358
  SHA512:
- metadata.gz: 25d7df6cf3861522a99851c8c561f0726a6e922d4aafb0fe646883b86c37ba4c1ec78735940a5eb70056433422fa2987989701260ec7b1e2969fb452758d2d43
- data.tar.gz: d9b637bba6c90e1c08de0b9cd8239e6b8b03aa86eb0a77aebc6b855ff0744220d775c8b7c8a12e1df736f98ee44dac3c6f11ad3a0506aa51e5c22f120f3c13d3
+ metadata.gz: f0541cc6673c2382ec82bce16bb5c230fa0fb3220d6430324739c8fb722ca75f8f70d33c1833642ee03edc936f8dba88c37282cdf6e200ead84ae16688054307
+ data.tar.gz: 7522e671cf005d61a5df2856e6754605570f281f6e4b359f537f4a9d6cfe17eccad1466b81adbb683b782c9941caac4f935a2244094f7af11b09f58954d173ff
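Reviewer note: the digests above can be checked against a locally unpacked gem archive. The following is a minimal sketch using Ruby's standard `Digest` library. The helper name `digest_matches?` and the file location in the usage comment are assumptions made for illustration; the expected value is the new SHA256 for `data.tar.gz` listed in this diff.

```ruby
require "digest"

# Hypothetical helper (not part of RubyGems or Searchkick): compares the
# SHA256 digest of a local file against an expected hex string.
def digest_matches?(path, expected_hex)
  Digest::SHA256.file(path).hexdigest == expected_hex.downcase
end

# Usage sketch, assuming the .gem archive has been unpacked so that
# data.tar.gz sits in the current directory:
#
#   digest_matches?(
#     "data.tar.gz",
#     "2435f38ff18bf471673e1df11f0fd96cfacb408cf390e04afcc068488777a358"
#   )
```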
data/CHANGELOG.md CHANGED
@@ -1,3 +1,11 @@
+ ## 5.5.0 (2025-04-03)
+
+ - Added `m` and `ef_construction` to `knn` index option
+ - Added `ef_search` to `knn` search option
+ - Fixed exact cosine distance for OpenSearch 2.19+
+ - Dropped support for Ruby < 3.2 and Active Record < 7.1
+ - Dropped support for Mongoid < 8
+
  ## 5.4.0 (2024-09-04)

  - Added `knn` option
@@ -833,4 +841,22 @@ Breaking changes

  ## 0.1.2 (2013-07-30)

- - Launch
+ - Use conversions by default
+
+ ## 0.1.1 (2013-07-29)
+
+ - Renamed `_source` to `search_data`
+ - Renamed `searchkick_import` to `search_import`
+
+ ## 0.1.0 (2013-07-28)
+
+ - Added `_source` method
+ - Added `index_name` option
+
+ ## 0.0.2 (2013-07-17)
+
+ - Added `conversions` option
+
+ ## 0.0.1 (2013-07-14)
+
+ - First release
data/LICENSE.txt CHANGED
@@ -1,4 +1,4 @@
- Copyright (c) 2013-2023 Andrew Kane
+ Copyright (c) 2013-2025 Andrew Kane

  MIT License

data/README.md CHANGED
@@ -821,7 +821,7 @@ Autocomplete predicts what a user will type, making the search experience faster

  **Note 2:** If you only have a few thousand records, don’t use Searchkick for autocomplete. It’s *much* faster to load all records into JavaScript and autocomplete there (eliminates network requests).

- First, specify which fields use this feature. This is necessary since autocomplete can increase the index size significantly, but don’t worry - this gives you blazing faster queries.
+ First, specify which fields use this feature. This is necessary since autocomplete can increase the index size significantly, but don’t worry - this gives you blazing fast queries.

  ```ruby
  class Movie < ApplicationRecord
@@ -835,7 +835,7 @@ Reindex and search with:
  Movie.search("jurassic pa", fields: [:title], match: :word_start)
  ```

- Typically, you want to use a JavaScript library like [typeahead.js](https://twitter.github.io/typeahead.js/) or [jQuery UI](https://jqueryui.com/autocomplete/).
+ Use a front-end library like [typeahead.js](https://twitter.github.io/typeahead.js/) to show the results.

  #### Here’s how to make it work with Rails

@@ -844,13 +844,14 @@ First, add a route and controller action.
  ```ruby
  class MoviesController < ApplicationController
  def autocomplete
- render json: Movie.search(params[:query], {
+ render json: Movie.search(
+ params[:query],
  fields: ["title^5", "director"],
  match: :word_start,
  limit: 10,
  load: false,
  misspellings: {below: 5}
- }).map(&:title)
+ ).map(&:title)
  end
  end
  ```
@@ -1023,7 +1024,7 @@ Additional options can be specified for each field:
  Band.search("cinema", fields: [:name], highlight: {fields: {name: {fragment_size: 200}}})
  ```

- You can find available highlight options in the [Elasticsearch reference](https://www.elastic.co/guide/en/elasticsearch/reference/current/highlighting.html).
+ You can find available highlight options in the [Elasticsearch](https://www.elastic.co/guide/en/elasticsearch/reference/current/highlighting.html) or [OpenSearch](https://opensearch.org/docs/latest/search-plugins/searching-data/highlight/) reference.

  ## Similar Items

@@ -1511,13 +1512,25 @@ See [Production Rails](https://github.com/ankane/production_rails) for other goo

  ### JSON Generation

- Significantly increase performance with faster JSON generation. Add [Oj](https://github.com/ohler55/oj) to your Gemfile.
+ Increase performance with faster JSON generation. Add to your Gemfile:

  ```ruby
- gem "oj"
+ gem "json", ">= 2.10.2"
  ```

- This speeds up all JSON generation and parsing in your application (automatically!)
+ And create an initializer with:
+
+ ```ruby
+ class SearchSerializer
+ def dump(object)
+ JSON.generate(object)
+ end
+ end
+
+ Elasticsearch::API.settings[:serializer] = SearchSerializer.new
+ # or
+ OpenSearch::API.settings[:serializer] = SearchSerializer.new
+ ```

  ### Persistent HTTP Connections

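Reviewer note: the serializer added to the README in this hunk defines a single `dump` method that delegates to `JSON.generate`. A standalone sketch of what that call produces, assuming only Ruby's bundled `json` library. The sample request body is made up for illustration, and the two client assignment lines from the README are left out because they need the `elasticsearch` or `opensearch-ruby` gem to be loaded.

```ruby
require "json"

# Same shape as the README's serializer: one `dump` method.
class SearchSerializer
  def dump(object)
    JSON.generate(object)
  end
end

# Illustrative request body (not taken from Searchkick itself)
body = SearchSerializer.new.dump({query: {match: {title: "milk"}}, size: 10})
puts body
```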
@@ -1533,8 +1546,6 @@ To reduce log noise, create an initializer with:
  Ethon.logger = Logger.new(nil)
  ```

- If you run into issues on Windows, check out [this post](https://www.rastating.com/fixing-issues-in-typhoeus-and-httparty-on-windows/).
-
  ### Searchable Fields

  By default, all string fields are searchable (can be used in `fields` option). Speed up indexing and reduce index size by only making some fields searchable.
@@ -1563,7 +1574,7 @@ For large data sets, you can use background jobs to parallelize reindexing.

  ```ruby
  Product.reindex(mode: :async)
- # {index_name: "products_production_20170111210018065"}
+ # {index_name: "products_production_20250111210018065"}
  ```

  Once the jobs complete, promote the new index with:
@@ -1590,19 +1601,19 @@ You can also have Searchkick wait for reindexing to complete
  Product.reindex(mode: :async, wait: true)
  ```

- You can use [ActiveJob::TrafficControl](https://github.com/nickelser/activejob-traffic_control) to control concurrency. Install the gem:
+ You can use your background job framework to control concurrency. For Solid Queue, create an initializer with:

  ```ruby
- gem "activejob-traffic_control", ">= 0.1.3"
- ```
-
- And create an initializer with:
+ module SearchkickBulkReindexConcurrency
+ extend ActiveSupport::Concern

- ```ruby
- ActiveJob::TrafficControl.client = Searchkick.redis
+ included do
+ limits_concurrency to: 3, key: ""
+ end
+ end

- class Searchkick::BulkReindexJob
- concurrency 3
+ Rails.application.config.after_initialize do
+ Searchkick::BulkReindexJob.include(SearchkickBulkReindexConcurrency)
  end
  ```

@@ -1616,7 +1627,7 @@ You can specify a longer refresh interval while reindexing to increase performan
  Product.reindex(mode: :async, refresh_interval: "30s")
  ```

- **Note:** This only makes a noticable difference with parallel reindexing.
+ **Note:** This only makes a noticeable difference with parallel reindexing.

  When promoting, have it restored to the value in your mapping (defaults to `1s`).

@@ -1863,9 +1874,27 @@ Reindex and search with:
  Product.search(knn: {field: :embedding, vector: [1, 2, 3]}, limit: 10)
  ```

+ ### HNSW Options
+
+ Nearest neighbor search uses [HNSW](https://en.wikipedia.org/wiki/Hierarchical_navigable_small_world) for indexing.
+
+ Specify `m` and `ef_construction`
+
+ ```ruby
+ class Product < ApplicationRecord
+ searchkick knn: {embedding: {dimensions: 3, distance: "cosine", m: 16, ef_construction: 100}}
+ end
+ ```
+
+ Specify `ef_search`
+
+ ```ruby
+ Product.search(knn: {field: :embedding, vector: [1, 2, 3], ef_search: 40}, limit: 10)
+ ```
+
  ## Semantic Search

- First, add [nearest neighbor search](#nearest-neighbor-search-unreleased-experimental) to your model
+ First, add [nearest neighbor search](#nearest-neighbor-search) to your model

  ```ruby
  class Product < ApplicationRecord
@@ -2205,42 +2234,6 @@ end

  For convenience, this is set by default in the test environment.

- ## Upgrading
-
- ### 5.0
-
- Searchkick 5 supports both the `elasticsearch` and `opensearch-ruby` gems. Add the one you want to use to your Gemfile:
-
- ```ruby
- gem "elasticsearch"
- # or
- gem "opensearch-ruby"
- ```
-
- If using the deprecated `faraday_middleware-aws-signers-v4` gem, switch to `faraday_middleware-aws-sigv4`.
-
- Also, searches now use lazy loading:
-
- ```ruby
- # search not executed
- Product.search("milk")
-
- # search executed
- Product.search("milk").to_a
- ```
-
- You can reindex relations in the background:
-
- ```ruby
- store.products.reindex(mode: :async)
- # or
- store.products.reindex(mode: :queue)
- ```
-
- And there’s a [new option](#default-scopes) for models with default scopes.
-
- Check out the [changelog](https://github.com/ankane/searchkick/blob/master/CHANGELOG.md#500-2022-02-21) for the full list of changes.
-
  ## History

  View the [changelog](https://github.com/ankane/searchkick/blob/master/CHANGELOG.md).
@@ -451,7 +451,8 @@ module Searchkick
  vector_options[:method] = {
  name: "hnsw",
  space_type: space_type,
- engine: "lucene"
+ engine: "lucene",
+ parameters: knn_options.slice(:m, :ef_construction)
  }
  end

@@ -475,6 +476,14 @@ module Searchkick
  else
  raise ArgumentError, "Unknown distance: #{distance}"
  end
+
+ vector_index_options = knn_options.slice(:m, :ef_construction)
+ if vector_index_options.any?
+ # TODO no quantization by default in Searchkick 6
+ # int8_hnsw was made the default in Elasticsearch 8.14.0
+ type = Searchkick.server_below?("8.14.0") ? "hnsw" : "int8_hnsw"
+ vector_options[:index_options] = {type: type}.merge(vector_index_options)
+ end
  end

  mapping[field.to_s] = vector_options
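Reviewer note: the two hunks above appear to route `m` and `ef_construction` into different places depending on the server: under `method.parameters` in the branch that sets `engine: "lucene"` (OpenSearch), and under `index_options` in the branch that mentions Elasticsearch 8.14.0. The sketch below restates that logic as a standalone function so it can be run without a cluster. The name `hnsw_mapping_fragment` and its keyword arguments are hypothetical, `space_type` and the other mapping keys are left out, and `Gem::Version` stands in for `Searchkick.server_below?`.

```ruby
require "rubygems" # Gem::Version; normally loaded by default

# Hypothetical helper, not Searchkick's API: a simplified stand-in for the
# two branches touched by the hunks above.
def hnsw_mapping_fragment(knn_options, opensearch:, server_version:)
  hnsw = knn_options.slice(:m, :ef_construction)

  if opensearch
    # OpenSearch branch: parameters nest under the method definition
    # (space_type omitted for brevity)
    {method: {name: "hnsw", engine: "lucene", parameters: hnsw}}
  elsif hnsw.any?
    # Elasticsearch branch: index_options is only set when an HNSW
    # option was given; the type depends on the server version
    below_8_14 = Gem::Version.new(server_version) < Gem::Version.new("8.14.0")
    {index_options: {type: below_8_14 ? "hnsw" : "int8_hnsw"}.merge(hnsw)}
  else
    {}
  end
end

options = {dimensions: 3, distance: "cosine", m: 16, ef_construction: 100}
p hnsw_mapping_fragment(options, opensearch: false, server_version: "8.15.0")
p hnsw_mapping_fragment(options, opensearch: true, server_version: "2.19.0")
```

One consequence worth noting for review: on the Elasticsearch side, a model that sets neither option gets no `index_options` key at all, so existing mappings are unchanged.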
@@ -891,6 +891,7 @@ module Searchkick
  exact = knn[:exact]
  exact = field_options[:distance].nil? || distance != field_options[:distance] if exact.nil?
  k = per_page + offset
+ ef_search = knn[:ef_search]
  filter = payload.delete(:query)

  if distance.nil?
@@ -934,22 +935,31 @@ module Searchkick
  space_type: space_type
  }
  },
- boost: distance == "cosine" ? 0.5 : 1.0
+ boost: distance == "cosine" && Searchkick.server_below?("2.19.0", true) ? 0.5 : 1.0
  }
  }
  else
+ if ef_search && Searchkick.server_below?("2.16.0", true)
+ raise Error, "ef_search requires OpenSearch 2.16+"
+ end
+
  payload[:query] = {
  knn: {
  field.to_sym => {
  vector: vector,
  k: k,
  filter: filter
- }
+ }.merge(ef_search ? {method_parameters: {ef_search: ef_search}} : {})
  }
  }
  end
  else
  if exact
+ # prevent incorrect distances/results with Elasticsearch 9.0.0-rc1
+ if !below90? && field_options[:distance] == "cosine" && distance != "cosine"
+ raise ArgumentError, "distance must match searchkick options"
+ end
+
  # https://github.com/elastic/elasticsearch/blob/main/docs/reference/vectors/vector-functions.asciidoc
  source =
  case distance
@@ -987,7 +997,7 @@ module Searchkick
  query_vector: vector,
  k: k,
  filter: filter
- }
+ }.merge(ef_search ? {num_candidates: ef_search} : {})
  end
  end
  end
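Reviewer note: in the hunks above, the single `ef_search` search option is translated into `method_parameters: {ef_search: ...}` in the branch guarded by the OpenSearch 2.16 check, and into `num_candidates` in the branch that uses `query_vector`. A minimal sketch of that mapping follows. The function name and the `opensearch:` flag are hypothetical, and `filter` plus the surrounding payload structure are left out.

```ruby
# Hypothetical helper, not Searchkick's API: shows only how ef_search is
# merged into the per-field knn options in each branch of the diff.
def knn_search_options(vector:, k:, ef_search: nil, opensearch: false)
  if opensearch
    extra = ef_search ? {method_parameters: {ef_search: ef_search}} : {}
    {vector: vector, k: k}.merge(extra)
  else
    extra = ef_search ? {num_candidates: ef_search} : {}
    {query_vector: vector, k: k}.merge(extra)
  end
end

p knn_search_options(vector: [1, 2, 3], k: 10, ef_search: 40, opensearch: true)
p knn_search_options(vector: [1, 2, 3], k: 10, ef_search: 40)
p knn_search_options(vector: [1, 2, 3], k: 10)
```

Because the merge uses an empty hash when `ef_search` is nil, requests that do not pass the option keep the same shape as in 5.4.0.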
@@ -1134,13 +1144,17 @@ module Searchkick
  range_query =
  case op
  when :gt
- {from: op_value, include_lower: false}
+ # TODO always use gt in Searchkick 6
+ below90? ? {from: op_value, include_lower: false} : {gt: op_value}
  when :gte
- {from: op_value, include_lower: true}
+ # TODO always use gte in Searchkick 6
+ below90? ? {from: op_value, include_lower: true} : {gte: op_value}
  when :lt
- {to: op_value, include_upper: false}
+ # TODO always use lt in Searchkick 6
+ below90? ? {to: op_value, include_upper: false} : {lt: op_value}
  when :lte
- {to: op_value, include_upper: true}
+ # TODO always use lte in Searchkick 6
+ below90? ? {to: op_value, include_upper: true} : {lte: op_value}
  else
  raise ArgumentError, "Unknown where operator: #{op.inspect}"
  end
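Reviewer note: this hunk keeps the older `from`/`to` plus `include_lower`/`include_upper` range syntax for servers below 9.0.0 and switches to `gt`/`gte`/`lt`/`lte` otherwise. The sketch below restates the branch as a standalone function so the two output shapes can be compared side by side. `range_clause` and its `server_version:` argument are hypothetical, and `Gem::Version` stands in for the `below90?` helper added later in this diff.

```ruby
require "rubygems" # Gem::Version; normally loaded by default

# Hypothetical helper, not Searchkick's API: mirrors the case statement in
# the hunk above with the version check made explicit.
def range_clause(op, value, server_version:)
  below_9 = Gem::Version.new(server_version) < Gem::Version.new("9.0.0")

  case op
  when :gt
    below_9 ? {from: value, include_lower: false} : {gt: value}
  when :gte
    below_9 ? {from: value, include_lower: true} : {gte: value}
  when :lt
    below_9 ? {to: value, include_upper: false} : {lt: value}
  when :lte
    below_9 ? {to: value, include_upper: true} : {lte: value}
  else
    raise ArgumentError, "Unknown where operator: #{op.inspect}"
  end
end

p range_clause(:gt, 5, server_version: "8.17.0")
p range_clause(:gt, 5, server_version: "9.0.0")
```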
@@ -1301,5 +1315,9 @@ module Searchkick
  def below80?
  Searchkick.server_below?("8.0.0")
  end
+
+ def below90?
+ Searchkick.server_below?("9.0.0")
+ end
  end
  end
@@ -1,4 +1,4 @@
- module Searckick
+ module Searchkick
  class Railtie < Rails::Railtie
  rake_tasks do
  load "tasks/searchkick.rake"
@@ -1,3 +1,3 @@
  module Searchkick
- VERSION = "5.4.0"
+ VERSION = "5.5.0"
  end
metadata CHANGED
@@ -1,14 +1,13 @@
  --- !ruby/object:Gem::Specification
  name: searchkick
  version: !ruby/object:Gem::Version
- version: 5.4.0
+ version: 5.5.0
  platform: ruby
  authors:
  - Andrew Kane
- autorequire:
  bindir: bin
  cert_chain: []
- date: 2024-09-04 00:00:00.000000000 Z
+ date: 2025-04-03 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: activemodel
@@ -16,14 +15,14 @@ dependencies:
  requirements:
  - - ">="
  - !ruby/object:Gem::Version
- version: '6.1'
+ version: '7.1'
  type: :runtime
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
  - !ruby/object:Gem::Version
- version: '6.1'
+ version: '7.1'
  - !ruby/object:Gem::Dependency
  name: hashie
  requirement: !ruby/object:Gem::Requirement
@@ -38,7 +37,6 @@ dependencies:
  - - ">="
  - !ruby/object:Gem::Version
  version: '0'
- description:
  email: andrew@ankane.org
  executables: []
  extensions: []
@@ -79,7 +77,6 @@ homepage: https://github.com/ankane/searchkick
  licenses:
  - MIT
  metadata: {}
- post_install_message:
  rdoc_options: []
  require_paths:
  - lib
@@ -87,15 +84,14 @@ required_ruby_version: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
  - !ruby/object:Gem::Version
- version: '3.1'
+ version: '3.2'
  required_rubygems_version: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
  - !ruby/object:Gem::Version
  version: '0'
  requirements: []
- rubygems_version: 3.5.11
- signing_key:
+ rubygems_version: 3.6.2
  specification_version: 4
  summary: Intelligent search made easy with Rails and Elasticsearch or OpenSearch
  test_files: []