searchkick 4.4.2 → 4.5.1

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 782d732ce3dca45ba3654f3afa2ee565fc396543179dbdc87a0ee7418a151a3c
- data.tar.gz: 2752fb40094f068a97e77009634fccf7a1433182294835ad077a66632d43bcb6
+ metadata.gz: 44f4a5255a208b7d24aa1b6d668808d3aff83a9ebb9f07a1a1a83fd9e3845738
+ data.tar.gz: 0ac096c02151a04a00751de140520547570b9c24fa2fccffd9d4761637b85531
  SHA512:
- metadata.gz: b1125b09caaf20b0dcc8aa7c55755e257338499dda6d09427e139f9167d9e5b2e9971657b830352babe8100c510236412fbaa5a6edf9617c8eefcbe7aff4d2c7
- data.tar.gz: b111187e5e1d05f19120f850833a491405727811fde6b31465f6021dffdd01b44e55c6e9a19b930246a4a92671ddda62aa65f9fa6fb6793342a5581009961985
+ metadata.gz: ff25ecdd74852fdbca33909d37445adbb42aa76785df75d9f1d9903d1dbee5712507d30a654dfaf829b9b6c0db0148cf48fbb4a257e3decf69e6c316c40b66fc
+ data.tar.gz: e07d8ddd9dbcf661db01948a7f20ad0b91d6427adddbbe987d571011983046650bd21726f74a2d0d11caaf9d49cd83b70a92738dc1cb21346e2c74009a1c494c
data/CHANGELOG.md CHANGED
@@ -1,3 +1,22 @@
+ ## 4.5.1 (2021-08-03)
+
+ - Improved performance of reindex queue
+
+ ## 4.5.0 (2021-06-07)
+
+ - Added experimental support for OpenSearch
+ - Added support for synonyms in Japanese
+
+ ## 4.4.4 (2021-03-12)
+
+ - Fixed `too_long_frame_exception` with `scroll` method
+ - Fixed multi-word emoji tokenization
+
+ ## 4.4.3 (2021-02-25)
+
+ - Added support for Hunspell
+ - Fixed warning about accessing system indices
+
  ## 4.4.2 (2020-11-23)
 
  - Added `missing_records` method to results
data/README.md CHANGED
@@ -20,7 +20,7 @@ Plus:
  - autocomplete
  - “Did you mean” suggestions
  - supports many languages
- - works with ActiveRecord, Mongoid, and NoBrainer
+ - works with Active Record, Mongoid, and NoBrainer
 
  Check out [Searchjoy](https://github.com/ankane/searchjoy) for analytics and [Autosuggest](https://github.com/ankane/autosuggest) for query suggestions
 
@@ -45,11 +45,11 @@ Check out [Searchjoy](https://github.com/ankane/searchjoy) for analytics and [Au
 
  ## Getting Started
 
- [Install Elasticsearch](https://www.elastic.co/downloads/elasticsearch). For Homebrew, use:
+ Install [Elasticsearch](https://www.elastic.co/downloads/elasticsearch) or [OpenSearch](https://opensearch.org/downloads.html) (OpenSearch support is experimental). For Homebrew, use:
 
  ```sh
- brew install elasticsearch
- brew services start elasticsearch
+ brew install elasticsearch # or opensearch
+ brew services start elasticsearch # or opensearch
  ```
 
  Add this line to your application’s Gemfile:
@@ -58,7 +58,7 @@ Add this line to your application’s Gemfile:
  gem 'searchkick'
  ```
 
- The latest version works with Elasticsearch 6 and 7. For Elasticsearch 5, use version 3.1.3 and [this readme](https://github.com/ankane/searchkick/blob/v3.1.3/README.md).
+ The latest version works with Elasticsearch 6 and 7 and OpenSearch 1. For Elasticsearch 5, use version 3.1.3 and [this readme](https://github.com/ankane/searchkick/blob/v3.1.3/README.md).
 
  Add searchkick to models you want to search.
 
@@ -176,7 +176,7 @@ Get the full response from Elasticsearch
  results.response
  ```
 
- **Note:** By default, Elasticsearch [limits paging](#deep-paging-master) to the first 10,000 results for performance. With Elasticsearch 7, this applies to the total count as well.
+ **Note:** By default, Elasticsearch [limits paging](#deep-paging) to the first 10,000 results for performance. With Elasticsearch 7, this applies to the total count as well.
 
  ### Boosting
 
@@ -209,7 +209,7 @@ boost_by_recency: {created_at: {scale: "7d", decay: 0.5}}
 
  You can also boost by:
 
- - [Conversions](#keep-getting-better)
+ - [Conversions](#intelligent-search)
  - [Distance](#boost-by-distance)
 
  ### Get Everything
@@ -311,7 +311,7 @@ class Product < ApplicationRecord
  end
  ```
 
- See the [list of stemmers](https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-stemmer-tokenfilter.html). A few languages require plugins:
+ See the [list of languages](https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-stemmer-tokenfilter.html#analysis-stemmer-tokenfilter-configure-parms). A few languages require plugins:
 
  - `chinese` - [analysis-ik plugin](https://github.com/medcl/elasticsearch-analysis-ik)
  - `chinese2` - [analysis-smartcn plugin](https://www.elastic.co/guide/en/elasticsearch/plugins/7.4/analysis-smartcn.html)
@@ -322,6 +322,14 @@ See the [list of stemmers](https://www.elastic.co/guide/en/elasticsearch/referen
  - `ukrainian` - [analysis-ukrainian plugin](https://www.elastic.co/guide/en/elasticsearch/plugins/7.4/analysis-ukrainian.html)
  - `vietnamese` - [analysis-vietnamese plugin](https://github.com/duydo/elasticsearch-analysis-vietnamese)
 
+ You can also use a Hunspell dictionary for stemming.
+
+ ```ruby
+ class Product < ApplicationRecord
+ searchkick stemmer: {type: "hunspell", locale: "en_US"}
+ end
+ ```
+
  Disable stemming with:
 
  ```ruby
@@ -641,7 +649,7 @@ class Product < ApplicationRecord
  def search_data
  {
  name: name,
- conversions: searches.group(:query).uniq.count(:user_id)
+ conversions: searches.group(:query).distinct.count(:user_id)
  # {"ice cream" => 234, "chocolate" => 67, "cream" => 2}
  }
  end
@@ -1204,12 +1212,18 @@ FactoryBot.create(:product, :some_trait, :reindex, some_attribute: "foo")
 
  ### GitHub Actions
 
- Check out [setup-elasticsearch](https://github.com/ankane/setup-elasticsearch) for an easy way to install Elasticsearch.
+ Check out [setup-elasticsearch](https://github.com/ankane/setup-elasticsearch) for an easy way to install Elasticsearch:
 
  ```yml
  - uses: ankane/setup-elasticsearch@v1
  ```
 
+ And [setup-opensearch](https://github.com/ankane/setup-opensearch) for an easy way to install OpenSearch:
+
+ ```yml
+ - uses: ankane/setup-opensearch@v1
+ ```
+
  ## Deployment
 
  Searchkick uses `ENV["ELASTICSEARCH_URL"]` for the Elasticsearch server. This defaults to `http://localhost:9200`.
@@ -1217,7 +1231,7 @@ Searchkick uses `ENV["ELASTICSEARCH_URL"]` for the Elasticsearch server. This de
  - [Elastic Cloud](#elastic-cloud)
  - [Heroku](#heroku)
  - [Amazon Elasticsearch Service](#amazon-elasticsearch-service)
- - [Self-Hosted and Other](#other)
+ - [Self-Hosted and Other](#self-hosted-and-other)
 
  ### Elastic Cloud
 
@@ -1469,7 +1483,7 @@ Product.search_index.promote(index_name, update_refresh_interval: true)
 
  ### Queuing
 
- Push ids of records needing reindexed to a queue and reindex in bulk for better performance. First, set up Redis in an initializer. We recommend using [connection_pool](https://github.com/mperham/connection_pool).
+ Push ids of records needing reindexing to a queue and reindex in bulk for better performance. First, set up Redis in an initializer. We recommend using [connection_pool](https://github.com/mperham/connection_pool).
 
  ```ruby
  Searchkick.redis = ConnectionPool.new { Redis.new }
@@ -1572,14 +1586,14 @@ class ReindexConversionsJob < ApplicationJob
  # get records that have a recent conversion
  recently_converted_ids =
  Searchjoy::Search.where("convertable_type = ? AND converted_at > ?", class_name, 1.day.ago)
- .order(:convertable_id).uniq.pluck(:convertable_id)
+ .order(:convertable_id).distinct.pluck(:convertable_id)
 
  # split into groups
  recently_converted_ids.in_groups_of(1000, false) do |ids|
  # fetch conversions
  conversions =
  Searchjoy::Search.where(convertable_id: ids, convertable_type: class_name)
- .group(:convertable_id, :query).uniq.count(:user_id)
+ .group(:convertable_id, :query).distinct.count(:user_id)
 
  # group conversions by record
  conversions_by_record = {}
@@ -1831,7 +1845,7 @@ class Product < ApplicationRecord
  def search_data
  {
  name: name,
- unique_user_conversions: searches.group(:query).uniq.count(:user_id),
+ unique_user_conversions: searches.group(:query).distinct.count(:user_id),
  # {"ice cream" => 234, "chocolate" => 67, "cream" => 2}
  total_conversions: searches.group(:query).count
  # {"ice cream" => 412, "chocolate" => 117, "cream" => 6}
data/lib/searchkick.rb CHANGED
@@ -74,11 +74,24 @@ module Searchkick
  (defined?(@search_timeout) && @search_timeout) || timeout
  end
 
+ # private
+ def self.server_info
+ @server_info ||= client.info
+ end
+
  def self.server_version
- @server_version ||= client.info["version"]["number"]
+ @server_version ||= server_info["version"]["number"]
+ end
+
+ def self.opensearch?
+ unless defined?(@opensearch)
+ @opensearch = server_info["version"]["distribution"] == "opensearch"
+ end
+ @opensearch
  end
 
  def self.server_below?(version)
+ server_version = opensearch? ? "7.10.2" : self.server_version
  Gem::Version.new(server_version.split("-")[0]) < Gem::Version.new(version.split("-")[0])
  end
 
@@ -105,7 +105,7 @@ module Searchkick
  indices =
  begin
  if client.indices.respond_to?(:get_alias)
- client.indices.get_alias
+ client.indices.get_alias(index: "#{name}*")
  else
  client.indices.get_aliases
  end
@@ -161,6 +161,7 @@ module Searchkick
  RecordData.new(self, record).document_type
  end
 
+ # TODO use like: [{_index: ..., _id: ...}] in Searchkick 5
  def similar_record(record, **options)
  like_text = retrieve(record).to_hash
  .keep_if { |k, _| !options[:fields] || options[:fields].map(&:to_s).include?(k) }
@@ -153,6 +153,7 @@ module Searchkick
  }
  }
 
+ raise ArgumentError, "Can't pass both language and stemmer" if options[:stemmer] && language
  update_language(settings, language)
  update_stemming(settings)
 
@@ -234,6 +235,27 @@ module Searchkick
  type: "kuromoji"
  }
  )
+ when "japanese2"
+ analyzer = {
+ type: "custom",
+ tokenizer: "kuromoji_tokenizer",
+ filter: [
+ "kuromoji_baseform",
+ "kuromoji_part_of_speech",
+ "cjk_width",
+ "ja_stop",
+ "searchkick_stemmer",
+ "lowercase"
+ ]
+ }
+ settings[:analysis][:analyzer].merge!(
+ default_analyzer => analyzer.deep_dup,
+ searchkick_search: analyzer.deep_dup,
+ searchkick_search2: analyzer.deep_dup
+ )
+ settings[:analysis][:filter][:searchkick_stemmer] = {
+ type: "kuromoji_stemmer"
+ }
  when "korean"
  settings[:analysis][:analyzer].merge!(
  default_analyzer => {
@@ -286,6 +308,18 @@ module Searchkick
  end
 
  def update_stemming(settings)
+ if options[:stemmer]
+ stemmer = options[:stemmer]
+ # could also support snowball and stemmer
+ case stemmer[:type]
+ when "hunspell"
+ # supports all token filter options
+ settings[:analysis][:filter][:searchkick_stemmer] = stemmer
+ else
+ raise ArgumentError, "Unknown stemmer: #{stemmer[:type]}"
+ end
+ end
+
  stem = options[:stem]
 
  # language analyzer used
@@ -499,8 +533,18 @@ module Searchkick
  end
  settings[:analysis][:filter][:searchkick_synonym_graph] = synonym_graph
 
- [:searchkick_search2, :searchkick_word_search].each do |analyzer|
- settings[:analysis][:analyzer][analyzer][:filter].insert(2, "searchkick_synonym_graph")
+ if options[:language] == "japanese2"
+ [:searchkick_search, :searchkick_search2].each do |analyzer|
+ settings[:analysis][:analyzer][analyzer][:filter].insert(4, "searchkick_synonym_graph")
+ end
+ else
+ [:searchkick_search2, :searchkick_word_search].each do |analyzer|
+ unless settings[:analysis][:analyzer][analyzer].key?(:filter)
+ raise Searchkick::Error, "Search synonyms are not supported yet for language"
+ end
+
+ settings[:analysis][:analyzer][analyzer][:filter].insert(2, "searchkick_synonym_graph")
+ end
  end
  end
  end
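
The effect of `insert(4, ...)` on the `japanese2` filter chain can be checked in plain Ruby, without Searchkick or Elasticsearch. The sketch below copies the filter list from the `japanese2` analyzer defined earlier in this diff; the constant and the helper name `with_synonym_graph` are hypothetical and exist only for illustration.

```ruby
# Filter chain of the "japanese2" analyzer, copied from this diff.
JAPANESE2_FILTERS = [
  "kuromoji_baseform",
  "kuromoji_part_of_speech",
  "cjk_width",
  "ja_stop",
  "searchkick_stemmer",
  "lowercase"
].freeze

# Hypothetical helper: returns a copy of the chain with the synonym
# filter inserted at the given index, using Array#insert as the diff does.
def with_synonym_graph(filters, position)
  filters.dup.insert(position, "searchkick_synonym_graph")
end

chain = with_synonym_graph(JAPANESE2_FILTERS, 4)
puts chain.inspect
```

With index 4, the synonym filter lands after `ja_stop` and before `searchkick_stemmer`, so in this chain synonyms are applied before stemming and lowercasing.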
@@ -6,7 +6,7 @@ module Searchkick
  unknown_keywords = options.keys - [:_all, :_type, :batch_size, :callbacks, :case_sensitive, :conversions, :deep_paging, :default_fields,
  :filterable, :geo_shape, :highlight, :ignore_above, :index_name, :index_prefix, :inheritance, :language,
  :locations, :mappings, :match, :merge_mappings, :routing, :searchable, :search_synonyms, :settings, :similarity,
- :special_characters, :stem, :stem_conversions, :stem_exclusion, :stemmer_override, :suggest, :synonyms, :text_end,
+ :special_characters, :stem, :stemmer, :stem_conversions, :stem_exclusion, :stemmer_override, :suggest, :synonyms, :text_end,
  :text_middle, :text_start, :word, :wordnet, :word_end, :word_middle, :word_start]
  raise ArgumentError, "unknown keywords: #{unknown_keywords.join(", ")}" if unknown_keywords.any?
 
@@ -11,7 +11,7 @@ module Searchkick
  if record_ids.any?
  batch_options = {
  class_name: class_name,
- record_ids: record_ids,
+ record_ids: record_ids.uniq,
  index_name: index_name
  }
 
@@ -25,7 +25,7 @@ module Searchkick
  term = term.to_s
 
  if options[:emoji]
- term = EmojiParser.parse_unicode(term) { |e| " #{e.name} " }.strip
+ term = EmojiParser.parse_unicode(term) { |e| " #{e.name.tr('_', ' ')} " }.strip
  end
 
  @klass = klass
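
A minimal sketch of why the `tr('_', ' ')` call matters for multi-word emoji names. It does not use the emoji parsing library from the diff; the `EMOJI_NAMES` table and the `expand_emoji` helper are hypothetical stand-ins, and the single mapping shown is only an example.

```ruby
# Hypothetical lookup table standing in for the emoji library's names.
EMOJI_NAMES = { "🍨" => "ice_cream" }.freeze

# Replace each known emoji with its name, turning underscores into spaces
# so that a multi-word name becomes separate search terms.
def expand_emoji(term)
  pattern = Regexp.union(EMOJI_NAMES.keys)
  term.gsub(pattern) { |e| " #{EMOJI_NAMES.fetch(e).tr('_', ' ')} " }.strip
end

puts expand_emoji("🍨")
```

Without the `tr` call the query would contain the single token `ice_cream`, which is unlikely to match a document indexed as "ice cream".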
@@ -353,8 +353,8 @@ module Searchkick
  shared_options[:cutoff_frequency] = 0.001 unless operator.to_s == "and" || field_misspellings == false || (!below73? && !track_total_hits?)
  qs << shared_options.merge(analyzer: "searchkick_search")
 
- # searchkick_search and searchkick_search2 are the same for ukrainian
- unless %w(japanese korean polish ukrainian vietnamese).include?(searchkick_options[:language])
+ # searchkick_search and searchkick_search2 are the same for some languages
+ unless %w(japanese japanese2 korean polish ukrainian vietnamese).include?(searchkick_options[:language])
  qs << shared_options.merge(analyzer: "searchkick_search2")
  end
  exclude_analyzer = "searchkick_search2"
@@ -864,10 +864,11 @@ module Searchkick
  }
  end
 
- # TODO id transformation for arrays
  def set_order(payload)
  order = options[:order].is_a?(Enumerable) ? options[:order] : {options[:order] => :asc}
  id_field = :_id
+ # TODO no longer map id to _id in Searchkick 5
+ # since sorting on _id is deprecated in Elasticsearch
  payload[:sort] = order.is_a?(Array) ? order : Hash[order.map { |k, v| [k.to_s == "id" ? id_field : k, v] }]
  end
 
@@ -14,11 +14,17 @@ module Searchkick
 
  # TODO use reliable queuing
  def reserve(limit: 1000)
- record_ids = Set.new
- while record_ids.size < limit && (record_id = Searchkick.with_redis { |r| r.rpop(redis_key) })
- record_ids << record_id
+ if supports_rpop_with_count?
+ Searchkick.with_redis { |r| r.call("rpop", redis_key, limit) }
+ else
+ record_ids = []
+ Searchkick.with_redis do |r|
+ while record_ids.size < limit && (record_id = r.rpop(redis_key))
+ record_ids << record_id
+ end
+ end
+ record_ids
  end
- record_ids.to_a
  end
 
  def clear
@@ -34,5 +40,13 @@ module Searchkick
  def redis_key
  "searchkick:reindex_queue:#{name}"
  end
+
+ def supports_rpop_with_count?
+ redis_version >= Gem::Version.new("6.2")
+ end
+
+ def redis_version
+ @redis_version ||= Searchkick.with_redis { |r| Gem::Version.new(r.info["redis_version"]) }
+ end
  end
  end
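
The version-gated batching in `reserve` can be sketched without a Redis server. The example below is an illustration under stated assumptions, not the gem's code: `FakeList` is a hypothetical in-memory stand-in for a Redis list, its `rpop` only mimics the single-element and count forms of `RPOP`, and the `|| []` guard for an empty list is an addition of this sketch.

```ruby
require "rubygems" # Gem::Version; normally loaded by default

# Hypothetical in-memory stand-in for a Redis list, for illustration only.
class FakeList
  attr_reader :version

  def initialize(items, version)
    @items = items.dup
    @version = Gem::Version.new(version)
  end

  # Pops one element from the tail, or up to `count` elements (tail first)
  # when a count is given. Returns nil when nothing is left.
  def rpop(count = nil)
    return @items.pop if count.nil?
    popped = @items.pop(count).reverse
    popped.empty? ? nil : popped
  end
end

# One batched call when the server is new enough, otherwise a loop of
# single pops, following the shape of the change in this diff.
def reserve(list, limit: 1000)
  if list.version >= Gem::Version.new("6.2")
    list.rpop(limit) || []
  else
    ids = []
    while ids.size < limit && (id = list.rpop)
      ids << id
    end
    ids
  end
end

puts reserve(FakeList.new(%w[1 2 3], "6.2.0"), limit: 2).inspect
puts reserve(FakeList.new(%w[1 2 3], "5.0.0"), limit: 2).inspect
```

Both paths return the same ids in the same order; the batched path does so in one round trip instead of one per id, which is where the performance gain noted in the changelog would come from.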
@@ -188,14 +188,9 @@ module Searchkick
 
  records.clear_scroll
  else
- params = {
- scroll: options[:scroll],
- scroll_id: scroll_id
- }
-
  begin
  # TODO Active Support notifications for this scroll call
- Searchkick::Results.new(@klass, Searchkick.client.scroll(params), @options)
+ Searchkick::Results.new(@klass, Searchkick.client.scroll(scroll: options[:scroll], body: {scroll_id: scroll_id}), @options)
  rescue Elasticsearch::Transport::Transport::Errors::NotFound => e
  if e.class.to_s =~ /NotFound/ && e.message =~ /search_context_missing_exception/i
  raise Searchkick::Error, "Scroll id has expired"
@@ -236,7 +231,7 @@ module Searchkick
  index_alias = index.split("_")[0..-2].join("_")
  Array((options[:index_mapping] || {})[index_alias])
  end
- raise Searchkick::Error, "Unknown model for index: #{index}" unless models.any?
+ raise Searchkick::Error, "Unknown model for index: #{index}. Pass the `models` option to the search method." unless models.any?
  index_models[index] = models
  end
 
@@ -1,3 +1,3 @@
  module Searchkick
- VERSION = "4.4.2"
+ VERSION = "4.5.1"
  end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: searchkick
  version: !ruby/object:Gem::Version
- version: 4.4.2
+ version: 4.5.1
  platform: ruby
  authors:
  - Andrew Kane
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2020-11-24 00:00:00.000000000 Z
+ date: 2021-08-03 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: activemodel
@@ -102,7 +102,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
  - !ruby/object:Gem::Version
  version: '0'
  requirements: []
- rubygems_version: 3.1.4
+ rubygems_version: 3.2.22
  signing_key:
  specification_version: 4
  summary: Intelligent search made easy with Rails and Elasticsearch