searchkick-hooopo 2.3.0

Files changed (74)
  1. checksums.yaml +7 -0
  2. data/.gitignore +22 -0
  3. data/.travis.yml +35 -0
  4. data/CHANGELOG.md +491 -0
  5. data/Gemfile +12 -0
  6. data/LICENSE.txt +22 -0
  7. data/README.md +1908 -0
  8. data/Rakefile +20 -0
  9. data/benchmark/Gemfile +23 -0
  10. data/benchmark/benchmark.rb +97 -0
  11. data/lib/searchkick/bulk_reindex_job.rb +17 -0
  12. data/lib/searchkick/index.rb +500 -0
  13. data/lib/searchkick/index_options.rb +333 -0
  14. data/lib/searchkick/indexer.rb +28 -0
  15. data/lib/searchkick/logging.rb +242 -0
  16. data/lib/searchkick/middleware.rb +12 -0
  17. data/lib/searchkick/model.rb +156 -0
  18. data/lib/searchkick/process_batch_job.rb +23 -0
  19. data/lib/searchkick/process_queue_job.rb +23 -0
  20. data/lib/searchkick/query.rb +901 -0
  21. data/lib/searchkick/reindex_queue.rb +38 -0
  22. data/lib/searchkick/reindex_v2_job.rb +39 -0
  23. data/lib/searchkick/results.rb +216 -0
  24. data/lib/searchkick/tasks.rb +33 -0
  25. data/lib/searchkick/version.rb +3 -0
  26. data/lib/searchkick.rb +215 -0
  27. data/searchkick.gemspec +28 -0
  28. data/test/aggs_test.rb +197 -0
  29. data/test/autocomplete_test.rb +75 -0
  30. data/test/boost_test.rb +175 -0
  31. data/test/callbacks_test.rb +59 -0
  32. data/test/ci/before_install.sh +17 -0
  33. data/test/errors_test.rb +19 -0
  34. data/test/gemfiles/activerecord31.gemfile +7 -0
  35. data/test/gemfiles/activerecord32.gemfile +7 -0
  36. data/test/gemfiles/activerecord40.gemfile +8 -0
  37. data/test/gemfiles/activerecord41.gemfile +8 -0
  38. data/test/gemfiles/activerecord42.gemfile +7 -0
  39. data/test/gemfiles/activerecord50.gemfile +7 -0
  40. data/test/gemfiles/apartment.gemfile +8 -0
  41. data/test/gemfiles/cequel.gemfile +8 -0
  42. data/test/gemfiles/mongoid2.gemfile +7 -0
  43. data/test/gemfiles/mongoid3.gemfile +6 -0
  44. data/test/gemfiles/mongoid4.gemfile +7 -0
  45. data/test/gemfiles/mongoid5.gemfile +7 -0
  46. data/test/gemfiles/mongoid6.gemfile +8 -0
  47. data/test/gemfiles/nobrainer.gemfile +8 -0
  48. data/test/gemfiles/parallel_tests.gemfile +8 -0
  49. data/test/geo_shape_test.rb +172 -0
  50. data/test/highlight_test.rb +78 -0
  51. data/test/index_test.rb +153 -0
  52. data/test/inheritance_test.rb +83 -0
  53. data/test/marshal_test.rb +8 -0
  54. data/test/match_test.rb +276 -0
  55. data/test/misspellings_test.rb +56 -0
  56. data/test/model_test.rb +42 -0
  57. data/test/multi_search_test.rb +22 -0
  58. data/test/multi_tenancy_test.rb +22 -0
  59. data/test/order_test.rb +46 -0
  60. data/test/pagination_test.rb +53 -0
  61. data/test/partial_reindex_test.rb +58 -0
  62. data/test/query_test.rb +35 -0
  63. data/test/records_test.rb +10 -0
  64. data/test/reindex_test.rb +52 -0
  65. data/test/reindex_v2_job_test.rb +32 -0
  66. data/test/routing_test.rb +23 -0
  67. data/test/should_index_test.rb +32 -0
  68. data/test/similar_test.rb +28 -0
  69. data/test/sql_test.rb +198 -0
  70. data/test/suggest_test.rb +85 -0
  71. data/test/synonyms_test.rb +67 -0
  72. data/test/test_helper.rb +527 -0
  73. data/test/where_test.rb +223 -0
  74. metadata +250 -0
data/README.md ADDED
@@ -0,0 +1,1908 @@
1
+ # Searchkick
2
+
3
+ :rocket: Intelligent search made easy
4
+
5
+ Searchkick learns what **your users** are looking for. As more people search, it gets smarter and the results get better. It’s friendly for developers - and magical for your users.
6
+
7
+ Searchkick handles:
8
+
9
+ - stemming - `tomatoes` matches `tomato`
10
+ - special characters - `jalapeno` matches `jalapeño`
11
+ - extra whitespace - `dishwasher` matches `dish washer`
12
+ - misspellings - `zuchini` matches `zucchini`
13
+ - custom synonyms - `qtip` matches `cotton swab`
14
+
15
+ Plus:
16
+
17
+ - query like SQL - no need to learn a new query language
18
+ - reindex without downtime
19
+ - easily personalize results for each user
20
+ - autocomplete
21
+ - “Did you mean” suggestions
22
+ - works with ActiveRecord, Mongoid, and NoBrainer
23
+
24
+ :speech_balloon: Get [handcrafted updates](http://chartkick.us7.list-manage.com/subscribe?u=952c861f99eb43084e0a49f98&id=6ea6541e8e&group[0][4]=true) for new features
25
+
26
+ :tangerine: Battle-tested at [Instacart](https://www.instacart.com/opensource)
27
+
28
+ [![Build Status](https://travis-ci.org/ankane/searchkick.svg?branch=master)](https://travis-ci.org/ankane/searchkick)
29
+
30
+ ## Contents
31
+
32
+ - [Getting Started](#getting-started)
33
+ - [Querying](#querying)
34
+ - [Indexing](#indexing)
35
+ - [Instant Search / Autocomplete](#instant-search--autocomplete)
36
+ - [Aggregations](#aggregations)
37
+ - [Deployment](#deployment)
38
+ - [Performance](#performance)
39
+ - [Elasticsearch DSL](#advanced)
40
+ - [Reference](#reference)
41
+
42
+ ## Getting Started
43
+
44
+ [Install Elasticsearch](https://www.elastic.co/guide/en/elasticsearch/reference/current/setup.html). For Homebrew, use:
45
+
46
+ ```sh
47
+ brew install elasticsearch
48
+ brew services start elasticsearch
49
+ ```
50
+
51
+ Add this line to your application’s Gemfile:
52
+
53
+ ```ruby
54
+ gem 'searchkick'
55
+ ```
56
+
57
+ The latest version works with Elasticsearch 2 and 5. For Elasticsearch 1, use version 1.5.1 and [this readme](https://github.com/ankane/searchkick/blob/v1.5.1/README.md).
58
+
59
+ Add searchkick to models you want to search.
60
+
61
+ ```ruby
62
+ class Product < ActiveRecord::Base
63
+ searchkick
64
+ end
65
+ ```
66
+
67
+ Add data to the search index.
68
+
69
+ ```ruby
70
+ Product.reindex
71
+ ```
72
+
73
+ And to query, use:
74
+
75
+ ```ruby
76
+ products = Product.search "apples"
77
+ products.each do |product|
78
+ puts product.name
79
+ end
80
+ ```
81
+
82
+ Searchkick supports the complete [Elasticsearch Search API](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-search.html). As your search becomes more advanced, we recommend you use the [Elasticsearch DSL](#advanced) for maximum flexibility.
83
+
84
+ ## Querying
85
+
86
+ Query like SQL
87
+
88
+ ```ruby
89
+ Product.search "apples", where: {in_stock: true}, limit: 10, offset: 50
90
+ ```
91
+
92
+ Search specific fields
93
+
94
+ ```ruby
95
+ fields: [:name, :brand]
96
+ ```
97
+
98
+ Where
99
+
100
+ ```ruby
101
+ where: {
102
+ expires_at: {gt: Time.now}, # lt, gte, lte also available
103
+ orders_count: 1..10, # equivalent to {gte: 1, lte: 10}
104
+ aisle_id: [25, 30], # in
105
+ store_id: {not: 2}, # not
106
+ aisle_id: {not: [25, 30]}, # not in
107
+ user_ids: {all: [1, 3]}, # all elements in array
108
+ category: /frozen .+/, # regexp
109
+ _or: [{in_stock: true}, {backordered: true}]
110
+ }
111
+ ```
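To make the operators above concrete, here is a rough in-memory analogue written in plain Ruby. It is only a sketch of the semantics: Searchkick translates `where` into Elasticsearch filters rather than running Ruby predicates, and the helper names `condition_matches?` and `where_matches?` are hypothetical.

```ruby
# Illustration only: evaluates one `where` condition against a field value.
def condition_matches?(actual, expected)
  case expected
  when Hash
    expected.all? do |op, value|
      case op
      when :gt  then actual > value
      when :gte then actual >= value
      when :lt  then actual < value
      when :lte then actual <= value
      when :not then !condition_matches?(actual, value)
      when :all then value.all? { |v| Array(actual).include?(v) }
      else raise ArgumentError, "unknown operator: #{op}"
      end
    end
  when Range  then expected.cover?(actual)                        # gte/lte pair
  when Array  then expected.any? { |v| Array(actual).include?(v) } # in
  when Regexp then expected.match?(actual.to_s)                   # plain Ruby match
  else Array(actual).include?(expected)                           # equality
  end
end

# Every top-level key must match; `_or` needs at least one matching clause.
def where_matches?(doc, where)
  where.all? do |field, expected|
    if field == :_or
      expected.any? { |clause| where_matches?(doc, clause) }
    else
      condition_matches?(doc[field], expected)
    end
  end
end
```

Elasticsearch's regexp and range handling differ in details (analysis, anchoring, date math), so treat this as a reading aid rather than a reference implementation.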
112
+
113
+ Order
114
+
115
+ ```ruby
116
+ order: {_score: :desc} # most relevant first - default
117
+ ```
118
+
119
+ [All of these sort options are supported](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-sort.html)
120
+
121
+ Limit / offset
122
+
123
+ ```ruby
124
+ limit: 20, offset: 40
125
+ ```
126
+
127
+ Select
128
+
129
+ ```ruby
130
+ select: [:name]
131
+ ```
132
+
133
+ ### Results
134
+
135
+ Searches return a `Searchkick::Results` object. This responds like an array to most methods.
136
+
137
+ ```ruby
138
+ results = Product.search("milk")
139
+ results.size
140
+ results.any?
141
+ results.each { |result| ... }
142
+ ```
143
+
144
+ By default, ids are fetched from Elasticsearch and records are fetched from your database. To fetch everything from Elasticsearch, use:
145
+
146
+ ```ruby
147
+ Product.search("apples", load: false)
148
+ ```
149
+
150
+ Get total results
151
+
152
+ ```ruby
153
+ results.total_count
154
+ ```
155
+
156
+ Get the time the search took (in milliseconds)
157
+
158
+ ```ruby
159
+ results.took
160
+ ```
161
+
162
+ Get the full response from Elasticsearch
163
+
164
+ ```ruby
165
+ results.response
166
+ ```
167
+
168
+ ### Boosting
169
+
170
+ Boost important fields
171
+
172
+ ```ruby
173
+ fields: ["title^10", "description"]
174
+ ```
175
+
176
+ Boost by the value of a field (field must be numeric)
177
+
178
+ ```ruby
179
+ boost_by: [:orders_count] # give popular documents a little boost
180
+ boost_by: {orders_count: {factor: 10}} # default factor is 1
181
+ ```
182
+
183
+ Boost matching documents
184
+
185
+ ```ruby
186
+ boost_where: {user_id: 1}
187
+ boost_where: {user_id: {value: 1, factor: 100}} # default factor is 1000
188
+ boost_where: {user_id: [{value: 1, factor: 100}, {value: 2, factor: 200}]}
189
+ ```
190
+
191
+ [Conversions](#keep-getting-better) are also a great way to boost.
192
+
193
+ ### Get Everything
194
+
195
+ Use a `*` for the query.
196
+
197
+ ```ruby
198
+ Product.search "*"
199
+ ```
200
+
201
+ ### Pagination
202
+
203
+ Plays nicely with kaminari and will_paginate.
204
+
205
+ ```ruby
206
+ # controller
207
+ @products = Product.search "milk", page: params[:page], per_page: 20
208
+ ```
209
+
210
+ View with kaminari
211
+
212
+ ```erb
213
+ <%= paginate @products %>
214
+ ```
215
+
216
+ View with will_paginate
217
+
218
+ ```erb
219
+ <%= will_paginate @products %>
220
+ ```
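The `page` and `per_page` options are a convenience over `limit` and `offset`. The sketch below shows the usual arithmetic, assuming pages are numbered from 1; `page_window` is a hypothetical helper, not part of Searchkick.

```ruby
# Convert a page number and page size into limit/offset values.
def page_window(page, per_page)
  page = [page.to_i, 1].max # treat nil, "" and 0 as the first page
  per_page = per_page.to_i
  {limit: per_page, offset: (page - 1) * per_page}
end
```

With 20 results per page, page 3 corresponds to `limit: 20, offset: 40`.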
221
+
222
+ ### Partial Matches
223
+
224
+ By default, results must match all words in the query.
225
+
226
+ ```ruby
227
+ Product.search "fresh honey" # fresh AND honey
228
+ ```
229
+
230
+ To change this, use:
231
+
232
+ ```ruby
233
+ Product.search "fresh honey", operator: "or" # fresh OR honey
234
+ ```
235
+
236
+ By default, results must match the entire word - `back` will not match `backpack`. You can change this behavior with:
237
+
238
+ ```ruby
239
+ class Product < ActiveRecord::Base
240
+ searchkick word_start: [:name]
241
+ end
242
+ ```
243
+
244
+ And to search (after you reindex):
245
+
246
+ ```ruby
247
+ Product.search "back", fields: [:name], match: :word_start
248
+ ```
249
+
250
+ Available options are:
251
+
252
+ ```ruby
253
+ :word # default
254
+ :word_start
255
+ :word_middle
256
+ :word_end
257
+ :text_start
258
+ :text_middle
259
+ :text_end
260
+ ```
261
+
262
+ ### Exact Matches
263
+
264
+ To match a field exactly (case-insensitive), use:
265
+
266
+ ```ruby
267
+ User.search query, fields: [{email: :exact}, :name]
268
+ ```
269
+
270
+ ### Phrase Matches
271
+
272
+ To only match the exact order, use:
273
+
274
+ ```ruby
275
+ User.search "fresh honey", match: :phrase
276
+ ```
277
+
278
+ ### Language
279
+
280
+ Searchkick defaults to English for stemming. To change this, use:
281
+
282
+ ```ruby
283
+ class Product < ActiveRecord::Base
284
+ searchkick language: "german"
285
+ end
286
+ ```
287
+
288
+ [See the list of stemmers](https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-stemmer-tokenfilter.html)
289
+
290
+ ### Synonyms
291
+
292
+ ```ruby
293
+ class Product < ActiveRecord::Base
294
+ searchkick synonyms: [["scallion", "green onion"], ["qtip", "cotton swab"]]
295
+ end
296
+ ```
297
+
298
+ Call `Product.reindex` after changing synonyms.
299
+
300
+ To read synonyms from a file, use:
301
+
302
+ ```ruby
303
+ synonyms: -> { CSV.read("/some/path/synonyms.csv") }
304
+ ```
305
+
306
+ For directional synonyms, use:
307
+
308
+ ```ruby
309
+ synonyms: ["lightbulb => halogenlamp"]
310
+ ```
311
+
312
+ ### Tags and Dynamic Synonyms
313
+
314
+ The above approach works well when your synonym list is static, but in practice, this is often not the case. When you analyze search conversions, you often want to add new synonyms or tags without a full reindex. You can use a library like [ActsAsTaggableOn](https://github.com/mbleigh/acts-as-taggable-on) and do:
315
+
316
+ ```ruby
317
+ class Product < ActiveRecord::Base
318
+ acts_as_taggable
319
+ scope :search_import, -> { includes(:tags) }
320
+
321
+ def search_data
322
+ {
323
+ name_tagged: "#{name} #{tags.map(&:name).join(" ")}"
324
+ }
325
+ end
326
+ end
327
+ ```
328
+
329
+ Search with:
330
+
331
+ ```ruby
332
+ Product.search query, fields: [:name_tagged]
333
+ ```
334
+
335
+ ### WordNet
336
+
337
+ Prepopulate English synonyms with the [WordNet database](https://en.wikipedia.org/wiki/WordNet).
338
+
339
+ Download [WordNet 3.0](http://wordnetcode.princeton.edu/3.0/WNprolog-3.0.tar.gz) to each Elasticsearch server and move `wn_s.pl` to the `/var/lib` directory.
340
+
341
+ ```sh
342
+ cd /tmp
343
+ curl -o wordnet.tar.gz http://wordnetcode.princeton.edu/3.0/WNprolog-3.0.tar.gz
344
+ tar -zxvf wordnet.tar.gz
345
+ mv prolog/wn_s.pl /var/lib
346
+ ```
347
+
348
+ Tell each model to use it:
349
+
350
+ ```ruby
351
+ class Product < ActiveRecord::Base
352
+ searchkick wordnet: true
353
+ end
354
+ ```
355
+
356
+ ### Misspellings
357
+
358
+ By default, Searchkick handles misspelled queries by returning results with an [edit distance](https://en.wikipedia.org/wiki/Levenshtein_distance) of one.
359
+
360
+ You can change this with:
361
+
362
+ ```ruby
363
+ Product.search "zucini", misspellings: {edit_distance: 2} # zucchini
364
+ ```
365
+
366
+ To prevent poor precision and improve performance for correctly spelled queries (which should be a majority for most applications), Searchkick can first perform a search without misspellings, and if there are too few results, perform another with them.
367
+
368
+ ```ruby
369
+ Product.search "zuchini", misspellings: {below: 5}
370
+ ```
371
+
372
+ If there are fewer than 5 results, a second search is performed with misspellings enabled, and the result of that second search is returned.
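The two-pass idea can be sketched in plain Ruby over a list of names. This is an illustration under simplifying assumptions: it compares whole strings with a textbook Levenshtein distance, whereas the real matching happens inside Elasticsearch on analyzed tokens. The method names `edit_distance` and `search_names` are hypothetical.

```ruby
# Classic dynamic-programming Levenshtein distance between two strings.
def edit_distance(a, b)
  prev = (0..b.length).to_a
  a.each_char.with_index(1) do |ca, i|
    curr = [i]
    b.each_char.with_index(1) do |cb, j|
      cost = ca == cb ? 0 : 1
      curr << [prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost].min
    end
    prev = curr
  end
  prev.last
end

# Exact pass first; fall back to a fuzzy pass only when the exact pass
# returns fewer than `below` results.
def search_names(names, query, below:, max_distance: 1)
  exact = names.select { |name| name == query }
  return exact if exact.size >= below

  names.select { |name| edit_distance(name, query) <= max_distance }
end
```

Here `zuchini` is one edit away from `zucchini`, while `zucini` is two edits away, which is why the earlier example raises `edit_distance` to 2.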
373
+
374
+ Turn off misspellings with:
375
+
376
+ ```ruby
377
+ Product.search "zuchini", misspellings: false # no zucchini
378
+ ```
379
+
380
+ ### Bad Matches
381
+
382
+ If a user searches `butter`, they may also get results for `peanut butter`. To prevent this, use:
383
+
384
+ ```ruby
385
+ Product.search "butter", exclude: ["peanut butter"]
386
+ ```
387
+
388
+ You can map queries and terms to exclude with:
389
+
390
+ ```ruby
391
+ exclude_queries = {
392
+ "butter" => ["peanut butter"],
393
+ "cream" => ["ice cream", "whipped cream"]
394
+ }
395
+
396
+ Product.search query, exclude: exclude_queries[query]
397
+ ```
398
+
399
+ ### Emoji
400
+
401
+ Search :ice_cream::cake: and get `ice cream cake`!
402
+
403
+ Add this line to your application’s Gemfile:
404
+
405
+ ```ruby
406
+ gem 'gemoji-parser'
407
+ ```
408
+
409
+ And use:
410
+
411
+ ```ruby
412
+ Product.search "🍨🍰", emoji: true
413
+ ```
414
+
415
+ ## Indexing
416
+
417
+ Control what data is indexed with the `search_data` method. Call `Product.reindex` after changing this method.
418
+
419
+ ```ruby
420
+ class Product < ActiveRecord::Base
421
+ belongs_to :department
422
+
423
+ def search_data
424
+ {
425
+ name: name,
426
+ department_name: department.name,
427
+ on_sale: sale_price.present?
428
+ }
429
+ end
430
+ end
431
+ ```
432
+
433
+ Searchkick uses `find_in_batches` to import documents. To eager load associations, use the `search_import` scope.
434
+
435
+ ```ruby
436
+ class Product < ActiveRecord::Base
437
+ scope :search_import, -> { includes(:department) }
438
+ end
439
+ ```
440
+
441
+ By default, all records are indexed. To control which records are indexed, use the `should_index?` method together with the `search_import` scope.
442
+
443
+ ```ruby
444
+ class Product < ActiveRecord::Base
445
+ scope :search_import, -> { where(active: true) }
446
+
447
+ def should_index?
448
+ active # only index active records
449
+ end
450
+ end
451
+ ```
452
+
453
+ If a reindex is interrupted, you can resume it with:
454
+
455
+ ```ruby
456
+ Product.reindex(resume: true)
457
+ ```
458
+
459
+ For large data sets, try [parallel reindexing](#parallel-reindexing).
460
+
461
+ ### To Reindex, or Not to Reindex
462
+
463
+ #### Reindex
464
+
465
+ - when you install or upgrade searchkick
466
+ - when you change the `search_data` method
467
+ - when you change the `searchkick` method
468
+
469
+ #### No need to reindex
470
+
471
+ - app starts
472
+
473
+ ### Stay Synced
474
+
475
+ There are four strategies for keeping the index synced with your database.
476
+
477
+ 1. Immediate (default)
478
+
479
+ Anytime a record is inserted, updated, or deleted
480
+
481
+ 2. Asynchronous
482
+
483
+ Use background jobs for better performance
484
+
485
+ ```ruby
486
+ class Product < ActiveRecord::Base
487
+ searchkick callbacks: :async
488
+ end
489
+ ```
490
+
491
+ And [install Active Job](https://github.com/ankane/activejob_backport) for Rails 4.1 and below. Jobs are added to a queue named `searchkick`.
492
+
493
+ 3. Queuing
494
+
495
+ Push the ids of records that need to be updated to a queue, then reindex them in the background in batches. This is more performant than the asynchronous method, which updates records individually. See [how to set up](#queuing).
496
+
497
+ 4. Manual
498
+
499
+ Turn off automatic syncing
500
+
501
+ ```ruby
502
+ class Product < ActiveRecord::Base
503
+ searchkick callbacks: false
504
+ end
505
+ ```
506
+
507
+ You can also do bulk updates.
508
+
509
+ ```ruby
510
+ Searchkick.callbacks(:bulk) do
511
+ User.find_each(&:update_fields)
512
+ end
513
+ ```
514
+
515
+ Or temporarily skip updates.
516
+
517
+ ```ruby
518
+ Searchkick.callbacks(false) do
519
+ User.find_each(&:update_fields)
520
+ end
521
+ ```
522
+
523
+ #### Associations
524
+
525
+ Data is **not** automatically synced when an association is updated. If this is desired, add a callback to reindex:
526
+
527
+ ```ruby
528
+ class Image < ActiveRecord::Base
529
+ belongs_to :product
530
+
531
+ after_commit :reindex_product
532
+
533
+ def reindex_product
534
+ product.reindex # or reindex_async
535
+ end
536
+ end
537
+ ```
538
+
539
+ ### Analytics
540
+
541
+ The best starting point to improve your search **by far** is to track searches and conversions.
542
+
543
+ [Searchjoy](https://github.com/ankane/searchjoy) makes it easy.
544
+
545
+ ```ruby
546
+ Product.search "apple", track: {user_id: current_user.id}
547
+ ```
548
+
549
+ [See the docs](https://github.com/ankane/searchjoy) for how to install and use.
550
+
551
+ Focus on:
552
+
553
+ - top searches with low conversions
554
+ - top searches with no results
555
+
556
+ ### Keep Getting Better
557
+
558
+ Searchkick can use conversion data to learn what users are looking for. If a user searches for “ice cream” and adds Ben & Jerry’s Chunky Monkey to the cart (our conversion metric at Instacart), that item gets a little more weight for similar searches.
559
+
560
+ The first step is to define your conversion metric and start tracking conversions. The database works well for low volume, but feel free to use Redis or another datastore.
561
+
562
+ You do **not** need to clean up the search queries. Searchkick automatically treats `apple` and `APPLES` the same.
563
+
564
+ Next, add conversions to the index.
565
+
566
+ ```ruby
567
+ class Product < ActiveRecord::Base
568
+ has_many :searches, class_name: "Searchjoy::Search", as: :convertable
569
+
570
+ searchkick conversions: ["conversions"] # name of field
571
+
572
+ def search_data
573
+ {
574
+ name: name,
575
+ conversions: searches.group(:query).uniq.count(:user_id)
576
+ # {"ice cream" => 234, "chocolate" => 67, "cream" => 2}
577
+ }
578
+ end
579
+ end
580
+ ```
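The `conversions` value is a hash of query to number of distinct converting users. A database-free sketch of the same aggregation, assuming a hypothetical `ConvertedSearch` struct in place of `Searchjoy::Search` records:

```ruby
# Hypothetical stand-in for a tracked search that led to a conversion.
ConvertedSearch = Struct.new(:query, :user_id)

# Count distinct users per query, mirroring a grouped distinct count.
def conversions_for(searches)
  searches
    .group_by(&:query)
    .transform_values { |rows| rows.map(&:user_id).uniq.size }
end
```

Counting distinct users rather than raw searches keeps one very active user from dominating the weights.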
581
+
582
+ Reindex and set up a cron job to add new conversions daily.
583
+
584
+ ```sh
585
+ rake searchkick:reindex CLASS=Product
586
+ ```
587
+
588
+ **Note:** For a more performant (but more advanced) approach, check out [performant conversions](#performant-conversions).
589
+
590
+ ### Personalized Results
591
+
592
+ Order results differently for each user. For example, show a user’s previously purchased products before other results.
593
+
594
+ ```ruby
595
+ class Product < ActiveRecord::Base
596
+ def search_data
597
+ {
598
+ name: name,
599
+ orderer_ids: orders.pluck(:user_id) # boost this product for these users
600
+ }
601
+ end
602
+ end
603
+ ```
604
+
605
+ Reindex and search with:
606
+
607
+ ```ruby
608
+ Product.search "milk", boost_where: {orderer_ids: current_user.id}
609
+ ```
610
+
611
+ ### Instant Search / Autocomplete
612
+
613
+ Autocomplete predicts what a user will type, making the search experience faster and easier.
614
+
615
+ ![Autocomplete](https://raw.githubusercontent.com/ankane/searchkick/gh-pages/autocomplete.png)
616
+
617
+ **Note:** To autocomplete on general categories (like `cereal` rather than product names), check out [Autosuggest](https://github.com/ankane/autosuggest).
618
+
619
+ **Note 2:** If you only have a few thousand records, don’t use Searchkick for autocomplete. It’s *much* faster to load all records into JavaScript and autocomplete there (eliminates network requests).
620
+
621
+ First, specify which fields use this feature. This is necessary since autocomplete can increase the index size significantly, but don’t worry - this gives you blazing fast queries.
622
+
623
+ ```ruby
624
+ class Movie < ActiveRecord::Base
625
+ searchkick word_start: [:title, :director]
626
+ end
627
+ ```
628
+
629
+ Reindex and search with:
630
+
631
+ ```ruby
632
+ Movie.search "jurassic pa", fields: [:title], match: :word_start
633
+ ```
634
+
635
+ Typically, you want to use a JavaScript library like [typeahead.js](http://twitter.github.io/typeahead.js/) or [jQuery UI](http://jqueryui.com/autocomplete/).
636
+
637
+ #### Here’s how to make it work with Rails
638
+
639
+ First, add a route and controller action.
640
+
641
+ ```ruby
642
+ class MoviesController < ApplicationController
643
+ def autocomplete
644
+ render json: Movie.search(params[:query], {
645
+ fields: ["title^5", "director"],
646
+ match: :word_start,
647
+ limit: 10,
648
+ load: false,
649
+ misspellings: {below: 5}
650
+ }).map(&:title)
651
+ end
652
+ end
653
+ ```
654
+
655
+ **Note:** Use `load: false` and `misspellings: {below: n}` (or `misspellings: false`) for best performance.
656
+
657
+ Then add the search box and JavaScript code to a view.
658
+
659
+ ```html
660
+ <input type="text" id="query" name="query" />
661
+
662
+ <script src="jquery.js"></script>
663
+ <script src="typeahead.bundle.js"></script>
664
+ <script>
665
+ var movies = new Bloodhound({
666
+ datumTokenizer: Bloodhound.tokenizers.whitespace,
667
+ queryTokenizer: Bloodhound.tokenizers.whitespace,
668
+ remote: {
669
+ url: '/movies/autocomplete?query=%QUERY',
670
+ wildcard: '%QUERY'
671
+ }
672
+ });
673
+ $('#query').typeahead(null, {
674
+ source: movies
675
+ });
676
+ </script>
677
+ ```
678
+
679
+ ### Suggestions
680
+
681
+ ![Suggest](https://raw.githubusercontent.com/ankane/searchkick/gh-pages/recursion.png)
682
+
683
+ ```ruby
684
+ class Product < ActiveRecord::Base
685
+ searchkick suggest: [:name] # fields to generate suggestions
686
+ end
687
+ ```
688
+
689
+ Reindex and search with:
690
+
691
+ ```ruby
692
+ products = Product.search "peantu butta", suggest: true
693
+ products.suggestions # ["peanut butter"]
694
+ ```
695
+
696
+ ### Aggregations
697
+
698
+ [Aggregations](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations.html) provide aggregated search data.
699
+
700
+ ![Aggregations](https://raw.githubusercontent.com/ankane/searchkick/gh-pages/facets.png)
701
+
702
+ ```ruby
703
+ products = Product.search "chuck taylor", aggs: [:product_type, :gender, :brand]
704
+ products.aggs
705
+ ```
706
+
707
+ By default, `where` conditions apply to aggregations.
708
+
709
+ ```ruby
710
+ Product.search "wingtips", where: {color: "brandy"}, aggs: [:size]
711
+ # aggregations for brandy wingtips are returned
712
+ ```
713
+
714
+ Change this with:
715
+
716
+ ```ruby
717
+ Product.search "wingtips", where: {color: "brandy"}, aggs: [:size], smart_aggs: false
718
+ # aggregations for all wingtips are returned
719
+ ```
720
+
721
+ Set `where` conditions for each aggregation separately with:
722
+
723
+ ```ruby
724
+ Product.search "wingtips", aggs: {size: {where: {color: "brandy"}}}
725
+ ```
726
+
727
+ Limit
728
+
729
+ ```ruby
730
+ Product.search "apples", aggs: {store_id: {limit: 10}}
731
+ ```
732
+
733
+ Order
734
+
735
+ ```ruby
736
+ Product.search "wingtips", aggs: {color: {order: {"_term" => "asc"}}} # alphabetically
737
+ ```
738
+
739
+ [All of these options are supported](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html#search-aggregations-bucket-terms-aggregation-order)
740
+
741
+ Ranges
742
+
743
+ ```ruby
744
+ price_ranges = [{to: 20}, {from: 20, to: 50}, {from: 50}]
745
+ Product.search "*", aggs: {price: {ranges: price_ranges}}
746
+ ```
747
+
748
+ Minimum document count
749
+
750
+ ```ruby
751
+ Product.search "apples", aggs: {store_id: {min_doc_count: 2}}
752
+ ```
753
+
754
+ Date histogram
755
+
756
+ ```ruby
757
+ Product.search "pear", aggs: {products_per_year: {date_histogram: {field: :created_at, interval: :year}}}
758
+ ```
759
+
760
+ #### Moving From Facets
761
+
762
+ 1. Replace `facets` with `aggs` in searches. **Note:** Stats facets are not supported at this time.
763
+
764
+ ```ruby
765
+ products = Product.search "chuck taylor", facets: [:brand]
766
+ # to
767
+ products = Product.search "chuck taylor", aggs: [:brand]
768
+ ```
769
+
770
+ 2. Replace the `facets` method with `aggs` for results.
771
+
772
+ ```ruby
773
+ products.facets
774
+ # to
775
+ products.aggs
776
+ ```
777
+
778
+ The keys in results differ slightly. Instead of:
779
+
780
+ ```json
781
+ {
782
+ "_type":"terms",
783
+ "missing":0,
784
+ "total":45,
785
+ "other":34,
786
+ "terms":[
787
+ {"term":14.0,"count":11}
788
+ ]
789
+ }
790
+ ```
791
+
792
+ You get:
793
+
794
+ ```json
795
+ {
796
+ "doc_count":45,
797
+ "doc_count_error_upper_bound":0,
798
+ "sum_other_doc_count":34,
799
+ "buckets":[
800
+ {"key":14.0,"doc_count":11}
801
+ ]
802
+ }
803
+ ```
804
+
805
+ Update your application to handle this.
806
+
807
+ 3. By default, `where` conditions apply to aggregations. This is equivalent to `smart_facets: true`. If you have `smart_facets: true`, you can remove it. If this is not desired, set `smart_aggs: false`.
808
+
809
+ 4. If you have any range facets with dates, change the key from `ranges` to `date_ranges`.
810
+
811
+ ```ruby
812
+ facets: {date_field: {ranges: date_ranges}}
813
+ # to
814
+ aggs: {date_field: {date_ranges: date_ranges}}
815
+ ```
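For step 2 of the facet migration, a small adapter can let application code read either result shape while you switch over. This is a sketch based only on the two JSON examples shown in that step; `term_counts` is a hypothetical helper.

```ruby
# Accepts either the old terms-facet shape or the new aggregation shape
# and returns a plain {key => count} hash.
def term_counts(result)
  if result.key?("buckets")
    result["buckets"].map { |b| [b["key"], b["doc_count"]] }.to_h
  else
    result["terms"].map { |t| [t["term"], t["count"]] }.to_h
  end
end
```

Once every caller uses `aggs`, the facet branch can be deleted.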
816
+
817
+ ### Highlight
818
+
819
+ Specify which fields to index with highlighting.
820
+
821
+ ```ruby
822
+ class Product < ActiveRecord::Base
823
+ searchkick highlight: [:name]
824
+ end
825
+ ```
826
+
827
+ Highlight the search query in the results.
828
+
829
+ ```ruby
830
+ bands = Band.search "cinema", fields: [:name], highlight: true
831
+ ```
832
+
833
+ **Note:** The `fields` option is required, unless highlight options are given - see below.
834
+
835
+ View the highlighted fields with:
836
+
837
+ ```ruby
838
+ bands.each do |band|
839
+ band.search_highlights[:name] # "Two Door <em>Cinema</em> Club"
840
+ end
841
+ ```
842
+
843
+ To change the tag, use:
844
+
845
+ ```ruby
846
+ Band.search "cinema", fields: [:name], highlight: {tag: "<strong>"}
847
+ ```
848
+
849
+ To highlight and search different fields, use:
850
+
851
+ ```ruby
852
+ Band.search "cinema", fields: [:name], highlight: {fields: [:description]}
853
+ ```
854
+
855
+ Additional options, including fragment size, can be specified for each field:
856
+
857
+ ```ruby
858
+ Band.search "cinema", fields: [:name], highlight: {fields: {name: {fragment_size: 200}}}
859
+ ```
860
+
861
+ You can find available highlight options in the [Elasticsearch reference](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-highlighting.html#_highlighted_fragments).
862
+
863
+ ### Similar Items
864
+
865
+ Find similar items.
866
+
867
+ ```ruby
868
+ product = Product.first
869
+ product.similar(fields: [:name], where: {size: "12 oz"})
870
+ ```
871
+
872
+ ### Geospatial Searches
873
+
874
+ ```ruby
875
+ class Restaurant < ActiveRecord::Base
876
+ searchkick locations: [:location]
877
+
878
+ def search_data
879
+ attributes.merge location: {lat: latitude, lon: longitude}
880
+ end
881
+ end
882
+ ```
883
+
884
+ Reindex and search with:
885
+
886
+ ```ruby
887
+ Restaurant.search "pizza", where: {location: {near: {lat: 37, lon: -114}, within: "100mi"}} # or 160km
888
+ ```
889
+
890
+ Bounded by a box
891
+
892
+ ```ruby
893
+ Restaurant.search "sushi", where: {location: {top_left: {lat: 38, lon: -123}, bottom_right: {lat: 37, lon: -122}}}
894
+ ```
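A bounding box is defined by its top-left and bottom-right corners. The check it implies can be sketched as follows, assuming the box does not cross the antimeridian; `in_bounding_box?` is a hypothetical helper for illustration.

```ruby
# True when the point lies inside the box spanned by the two corners.
def in_bounding_box?(point, top_left:, bottom_right:)
  point[:lat].between?(bottom_right[:lat], top_left[:lat]) &&
    point[:lon].between?(top_left[:lon], bottom_right[:lon])
end
```

Note that the top-left corner carries the larger latitude and the smaller longitude.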
895
+
896
+ Bounded by a polygon
897
+
898
+ ```ruby
899
+ Restaurant.search "dessert", where: {location: {geo_polygon: {points: [{lat: 38, lon: -123}, {lat: 39, lon: -123}, {lat: 37, lon: -122}]}}}
900
+ ```
901
+
902
+ ### Boost By Distance
903
+
904
+ Boost results by distance - closer results are boosted more
905
+
906
+ ```ruby
907
+ Restaurant.search "noodles", boost_by_distance: {location: {origin: {lat: 37, lon: -122}}}
908
+ ```
909
+
910
+ Also supports [additional options](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-function-score-query.html#_decay_functions)
911
+
912
+ ```ruby
913
+ Restaurant.search "wings", boost_by_distance: {location: {origin: {lat: 37, lon: -122}, function: "linear", scale: "30mi", decay: 0.5}}
914
+ ```
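To get a feel for `scale` and `decay`, here is the linear decay formula from the Elasticsearch function score documentation as a plain Ruby method. It assumes the distance has already been computed in the same unit as `scale` (miles in the example above); `linear_decay` is a hypothetical name.

```ruby
# Linear decay: 1.0 at the origin, `decay` at `scale` away,
# and 0.0 from scale / (1 - decay) onwards.
def linear_decay(distance, scale:, decay: 0.5, offset: 0.0)
  s = scale / (1.0 - decay)
  effective = [distance - offset, 0.0].max
  [(s - effective) / s, 0.0].max
end
```

With `scale: "30mi"` and `decay: 0.5`, a restaurant 30 miles away gets half the multiplier of one at the origin, and anything 60 miles or farther gets none.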
915
+
916
+ ### Geo Shapes
917
+
918
+ You can also index and search geo shapes.
919
+
920
+ ```ruby
921
+ class Restaurant < ActiveRecord::Base
922
+ searchkick geo_shape: {
923
+ bounds: {tree: "geohash", precision: "1km"}
924
+ }
925
+
926
+ def search_data
927
+ attributes.merge(
928
+ bounds: {
929
+ type: "envelope",
930
+ coordinates: [{lat: 4, lon: 1}, {lat: 2, lon: 3}]
931
+ }
932
+ )
933
+ end
934
+ end
935
+ ```
936
+
937
+ See the [Elasticsearch documentation](https://www.elastic.co/guide/en/elasticsearch/reference/current/geo-shape.html) for details.
938
+
939
+ Find shapes intersecting with the query shape
940
+
941
+ ```ruby
942
+ Restaurant.search "soup", where: {bounds: {geo_shape: {type: "polygon", coordinates: [[{lat: 38, lon: -123}, ...]]}}}
943
+ ```
944
+
945
+ Falling entirely within the query shape
946
+
947
+ ```ruby
948
+ Restaurant.search "salad", where: {bounds: {geo_shape: {type: "circle", relation: "within", coordinates: [{lat: 38, lon: -123}], radius: "1km"}}}
949
+ ```
950
+
951
+ Not touching the query shape
952
+
953
+ ```ruby
954
+ Restaurant.search "burger", where: {bounds: {geo_shape: {type: "envelope", relation: "disjoint", coordinates: [{lat: 38, lon: -123}, {lat: 37, lon: -122}]}}}
955
+ ```
956
+
957
+ Containing the query shape (Elasticsearch 2.2+)
958
+
959
+ ```ruby
960
+ Restaurant.search "fries", where: {bounds: {geo_shape: {type: "envelope", relation: "contains", coordinates: [{lat: 38, lon: -123}, {lat: 37, lon: -122}]}}}
961
+ ```
962
+
963
+ ## Inheritance
964
+
965
+ Searchkick supports single table inheritance.
966
+
967
+ ```ruby
968
+ class Dog < Animal
969
+ end
970
+ ```
971
+
972
+ The parent and child model can both reindex.
973
+
974
+ ```ruby
975
+ Animal.reindex
976
+ Dog.reindex # equivalent
977
+ ```
978
+
979
+ And to search, use:
980
+
981
+ ```ruby
982
+ Animal.search "*" # all animals
983
+ Dog.search "*" # just dogs
984
+ Animal.search "*", type: [Dog, Cat] # just cats and dogs
985
+ ```
986
+
987
+ **Note:** The `suggest` option currently retrieves suggestions from the parent model.
988
+
989
+ ```ruby
990
+ Dog.search "airbudd", suggest: true # suggestions for all animals
991
+ ```
992
+
993
+ ## Debugging Queries
994
+
995
+ To help with debugging queries, you can use:
996
+
997
+ ```ruby
998
+ Product.search("soap", debug: true)
999
+ ```
1000
+
1001
+ This prints useful info to `stdout`.
1002
+
1003
+ See how Elasticsearch scores your queries with:
1004
+
1005
+ ```ruby
1006
+ Product.search("soap", explain: true).response
1007
+ ```
1008
+
1009
+ See how Elasticsearch tokenizes your queries with:
1010
+
1011
+ ```ruby
1012
+ Product.search_index.tokens("Dish Washer Soap", analyzer: "searchkick_index")
1013
+ # ["dish", "dishwash", "washer", "washersoap", "soap"]
1014
+
1015
+ Product.search_index.tokens("dishwasher soap", analyzer: "searchkick_search")
1016
+ # ["dishwashersoap"] - no match
1017
+
1018
+ Product.search_index.tokens("dishwasher soap", analyzer: "searchkick_search2")
1019
+ # ["dishwash", "soap"] - match!!
1020
+ ```
1021
+
1022
+ Partial matches
1023
+
1024
+ ```ruby
1025
+ Product.search_index.tokens("San Diego", analyzer: "searchkick_word_start_index")
1026
+ # ["s", "sa", "san", "d", "di", "die", "dieg", "diego"]
1027
+
1028
+ Product.search_index.tokens("dieg", analyzer: "searchkick_word_search")
1029
+ # ["dieg"] - match!!
1030
+ ```
1031
+
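To see why `dieg` matches, here is a minimal plain-Ruby sketch of word-start (edge n-gram) tokenization. It only illustrates the idea and is not how Searchkick or Elasticsearch implement their analyzers; `word_start_tokens` is a hypothetical helper.

```ruby
# Hypothetical helper: emits every prefix of every lowercased word,
# mimicking what a word-start analyzer produces at index time.
def word_start_tokens(text)
  text.downcase.split.flat_map do |word|
    (1..word.length).map { |n| word[0, n] }
  end
end

index_tokens = word_start_tokens("San Diego")
# ["s", "sa", "san", "d", "di", "die", "dieg", "diego"]

# At search time the query term is kept whole, so it matches
# when it equals any of the indexed prefixes.
index_tokens.include?("dieg") # true
```

Because the prefixes are generated at index time, the search side stays a plain term lookup.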
1032
+ See the [complete list of analyzers](https://github.com/ankane/searchkick/blob/31780ddac7a89eab1e0552a32b403f2040a37931/lib/searchkick/index_options.rb#L32).
1033
+
1034
+ ## Deployment
1035
+
1036
+ Searchkick uses `ENV["ELASTICSEARCH_URL"]` for the Elasticsearch server. This defaults to `http://localhost:9200`.
1037
+
1038
+ ### Heroku
1039
+
1040
+ Choose an add-on: [SearchBox](https://elements.heroku.com/addons/searchbox), [Bonsai](https://elements.heroku.com/addons/bonsai), or [Elastic Cloud](https://elements.heroku.com/addons/foundelasticsearch).
1041
+
1042
+ ```sh
1043
+ # SearchBox
1044
+ heroku addons:create searchbox:starter
1045
+ heroku config:set ELASTICSEARCH_URL=`heroku config:get SEARCHBOX_URL`
1046
+
1047
+ # Bonsai
1048
+ heroku addons:create bonsai
1049
+ heroku config:set ELASTICSEARCH_URL=`heroku config:get BONSAI_URL`
1050
+
1051
+ # Elastic Cloud (formerly Found)
1052
+ heroku addons:create foundelasticsearch
1053
+ heroku config:set ELASTICSEARCH_URL=`heroku config:get FOUNDELASTICSEARCH_URL`
1054
+ ```
1055
+
1056
+ Then deploy and reindex:
1057
+
1058
+ ```sh
1059
+ heroku run rake searchkick:reindex CLASS=Product
1060
+ ```
1061
+
1062
+ ### Amazon Elasticsearch Service
1063
+
1064
+ Include `elasticsearch 1.0.15` or greater in your Gemfile.
1065
+
1066
+ ```ruby
1067
+ gem 'elasticsearch', '>= 1.0.15'
1068
+ ```
1069
+
1070
+ Create an initializer `config/initializers/elasticsearch.rb` with:
1071
+
1072
+ ```ruby
1073
+ ENV["ELASTICSEARCH_URL"] = "https://es-domain-1234.us-east-1.es.amazonaws.com"
1074
+ ```
1075
+
1076
+ To use signed requests, include in your Gemfile:
1077
+
1078
+ ```ruby
1079
+ gem 'faraday_middleware-aws-signers-v4'
1080
+ ```
1081
+
1082
+ and add to your initializer:
1083
+
1084
+ ```ruby
1085
+ Searchkick.aws_credentials = {
1086
+ access_key_id: ENV["AWS_ACCESS_KEY_ID"],
1087
+ secret_access_key: ENV["AWS_SECRET_ACCESS_KEY"],
1088
+ region: "us-east-1"
1089
+ }
1090
+ ```
1091
+
1092
+ Then deploy and reindex:
1093
+
1094
+ ```sh
1095
+ rake searchkick:reindex CLASS=Product
1096
+ ```
1097
+
1098
+ ### Other
1099
+
1100
+ Create an initializer `config/initializers/elasticsearch.rb` with:
1101
+
1102
+ ```ruby
1103
+ ENV["ELASTICSEARCH_URL"] = "http://username:password@api.searchbox.io"
1104
+ ```
1105
+
1106
+ Then deploy and reindex:
1107
+
1108
+ ```sh
1109
+ rake searchkick:reindex CLASS=Product
1110
+ ```
1111
+
1112
+ ### Automatic Failover
1113
+
1114
+ Create an initializer `config/initializers/elasticsearch.rb` with multiple hosts:
1115
+
1116
+ ```ruby
1117
+ ENV["ELASTICSEARCH_URL"] = "http://localhost:9200,http://localhost:9201"
1118
+
1119
+ Searchkick.client_options = {
1120
+ retry_on_failure: true
1121
+ }
1122
+ ```
1123
+
1124
+ See [elasticsearch-transport](https://github.com/elastic/elasticsearch-ruby/blob/master/elasticsearch-transport) for a complete list of options.
1125
+
1126
+ ### Lograge
1127
+
1128
+ Add the following to `config/environments/production.rb`:
1129
+
1130
+ ```ruby
1131
+ config.lograge.custom_options = lambda do |event|
1132
+ options = {}
1133
+ options[:search] = event.payload[:searchkick_runtime] if event.payload[:searchkick_runtime].to_f > 0
1134
+ options
1135
+ end
1136
+ ```
1137
+
1138
+ See [Production Rails](https://github.com/ankane/production_rails) for other good practices.
1139
+
1140
+ ## Performance
1141
+
1142
+ ### JSON Generation
1143
+
1144
+ Significantly increase performance with faster JSON generation. Add [Oj](https://github.com/ohler55/oj) to your Gemfile.
1145
+
1146
+ ```ruby
1147
+ gem 'oj'
1148
+ ```
1149
+
1150
+ This automatically speeds up all JSON generation and parsing in your application.
1151
+
1152
+ ### Persistent HTTP Connections
1153
+
1154
+ Significantly increase performance with persistent HTTP connections. Add [Typhoeus](https://github.com/typhoeus/typhoeus) to your Gemfile and it’ll automatically be used.
1155
+
1156
+ ```ruby
1157
+ gem 'typhoeus'
1158
+ ```
1159
+
1160
+ To reduce log noise, create an initializer with:
1161
+
1162
+ ```ruby
1163
+ Ethon.logger = Logger.new("/dev/null")
1164
+ ```
1165
+
1166
+ If you run into issues on Windows, check out [this post](https://www.rastating.com/fixing-issues-in-typhoeus-and-httparty-on-windows/).
1167
+
1168
+ ### Searchable Fields
1169
+
1170
+ By default, all string fields are searchable (can be used in `fields` option). Speed up indexing and reduce index size by only making some fields searchable. This disables the `_all` field unless it’s listed.
1171
+
1172
+ ```ruby
1173
+ class Product < ActiveRecord::Base
1174
+ searchkick searchable: [:name]
1175
+ end
1176
+ ```
1177
+
1178
+ ### Filterable Fields
1179
+
1180
+ By default, all fields are filterable (can be used in `where` option). Speed up indexing and reduce index size by only making some fields filterable.
1181
+
1182
+ ```ruby
1183
+ class Product < ActiveRecord::Base
1184
+ searchkick filterable: [:store_id]
1185
+ end
1186
+ ```
1187
+
1188
+ ### Parallel Reindexing
1189
+
1190
+ For large data sets, you can use background jobs to parallelize reindexing.
1191
+
1192
+ ```ruby
1193
+ Product.reindex(async: true)
1194
+ # {index_name: "products_production_20170111210018065"}
1195
+ ```
1196
+
1197
+ Once the jobs complete, promote the new index with:
1198
+
1199
+ ```ruby
1200
+ Product.search_index.promote(index_name)
1201
+ ```
1202
+
1203
+ You can optionally track the status with Redis:
1204
+
1205
+ ```ruby
1206
+ Searchkick.redis = Redis.new
1207
+ ```
1208
+
1209
+ And use:
1210
+
1211
+ ```ruby
1212
+ Searchkick.reindex_status(index_name)
1213
+ ```
1214
+
1215
+ You can use [ActiveJob::TrafficControl](https://github.com/nickelser/activejob-traffic_control) to control concurrency. Install the gem:
1216
+
1217
+ ```ruby
1218
+ gem 'activejob-traffic_control', '>= 0.1.3'
1219
+ ```
1220
+
1221
+ And create an initializer with:
1222
+
1223
+ ```ruby
1224
+ ActiveJob::TrafficControl.client = Searchkick.redis
1225
+
1226
+ class Searchkick::BulkReindexJob
1227
+ concurrency 3
1228
+ end
1229
+ ```
1230
+
1231
+ This will allow only 3 jobs to run at once.
1232
+
1233
+ ### Refresh Interval
1234
+
1235
+ You can specify a longer refresh interval while reindexing to increase performance.
1236
+
1237
+ ```ruby
1238
+ Product.reindex(async: true, refresh_interval: "30s")
1239
+ ```
1240
+
1241
+ **Note:** This only makes a noticeable difference with parallel reindexing.
1242
+
1243
+ When promoting, have it restored to the value in your mapping (defaults to `1s`).
1244
+
1245
+ ```ruby
1246
+ Product.search_index.promote(index_name, update_refresh_interval: true)
1247
+ ```
1248
+
1249
+ ### Queuing
1250
+
1251
+ Push ids of records that need reindexing to a queue and reindex them in bulk for better performance. First, set up Redis in an initializer. We recommend using [connection_pool](https://github.com/mperham/connection_pool).
1252
+
1253
+ ```ruby
1254
+ Searchkick.redis = ConnectionPool.new { Redis.new }
1255
+ ```
1256
+
1257
+ And ask your models to queue updates.
1258
+
1259
+ ```ruby
1260
+ class Product < ActiveRecord::Base
1261
+ searchkick callbacks: :queue
1262
+ end
1263
+ ```
1264
+
1265
+ Then, set up a background job to run.
1266
+
1267
+ ```ruby
1268
+ Searchkick::ProcessQueueJob.perform_later(class_name: "Product")
1269
+ ```
1270
+
1271
+ You can check the queue length with:
1272
+
1273
+ ```ruby
1274
+ Product.search_index.reindex_queue.length
1275
+ ```
1276
+
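The benefit of queuing can be sketched in plain Ruby. The class below is a hypothetical in-memory stand-in for the Redis-backed queue, written only to show the idea; its name and methods are assumptions for this example, not Searchkick's API.

```ruby
# Hypothetical in-memory stand-in for a Redis-backed reindex queue.
class ReindexQueueSketch
  def initialize
    @ids = []
  end

  # Called whenever a record changes.
  def push(id)
    @ids << id.to_s
  end

  def length
    @ids.length
  end

  # Removes up to `limit` queued ids and returns them without duplicates.
  def reserve(limit: 1000)
    @ids.shift(limit).uniq
  end
end

queue = ReindexQueueSketch.new
[1, 2, 1, 3, 1].each { |id| queue.push(id) } # record 1 was saved three times
batch = queue.reserve(limit: 1000)
# batch is ["1", "2", "3"]: three documents for one bulk request
# instead of five separate updates
```

Repeated saves of the same record collapse into a single reindex, which is where most of the savings come from.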
1277
+ For more tips, check out [Keeping Elasticsearch in Sync](https://www.elastic.co/blog/found-keeping-elasticsearch-in-sync).
1278
+
1279
+ ### Routing
1280
+
1281
+ Searchkick supports [Elasticsearch’s routing feature](https://www.elastic.co/blog/customizing-your-document-routing), which can significantly speed up searches.
1282
+
1283
+ ```ruby
1284
+ class Business < ActiveRecord::Base
1285
+ searchkick routing: true
1286
+
1287
+ def search_routing
1288
+ city_id
1289
+ end
1290
+ end
1291
+ ```
1292
+
1293
+ Reindex and search with:
1294
+
1295
+ ```ruby
1296
+ Business.search "ice cream", routing: params[:city_id]
1297
+ ```
1298
+
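Routing helps because documents with the same routing value are stored on the same shard, so a routed search only has to query that shard. A rough sketch of the idea follows. `shard_for` is a hypothetical helper, the shard count is an assumed value, and `Zlib.crc32` is only a stand-in for the hash function Elasticsearch uses internally.

```ruby
require "zlib"

NUMBER_OF_SHARDS = 5 # assumed value for the example

# Hypothetical helper: maps a routing value to a shard number.
# Zlib.crc32 stands in for Elasticsearch's internal hash.
def shard_for(routing, shards = NUMBER_OF_SHARDS)
  Zlib.crc32(routing.to_s) % shards
end

businesses = [
  {name: "Scoops", city_id: 1},
  {name: "Cones", city_id: 1},
  {name: "Sundaes", city_id: 2}
]

# Businesses with the same city_id always land on the same shard,
# so a search routed by city_id can skip every other shard.
by_shard = businesses.group_by { |b| shard_for(b[:city_id]) }
```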
1299
+ ### Partial Reindexing
1300
+
1301
+ Reindex a subset of attributes to reduce time spent generating search data and cut down on network traffic.
1302
+
1303
+ ```ruby
1304
+ class Product < ActiveRecord::Base
1305
+ def search_data
1306
+ {
1307
+ name: name
1308
+ }.merge(search_prices)
1309
+ end
1310
+
1311
+ def search_prices
1312
+ {
1313
+ price: price,
1314
+ sale_price: sale_price
1315
+ }
1316
+ end
1317
+ end
1318
+ ```
1319
+
1320
+ And use:
1321
+
1322
+ ```ruby
1323
+ Product.reindex(:search_prices)
1324
+ ```
1325
+
1326
+ ### Performant Conversions
1327
+
1328
+ Split out conversions into a separate method so you can use partial reindexing, and cache conversions to prevent N+1 queries. Be sure to use a centralized cache store like Memcached or Redis.
1329
+
1330
+ ```ruby
1331
+ class Product < ActiveRecord::Base
1332
+ def search_data
1333
+ {
1334
+ name: name
1335
+ }.merge(search_conversions)
1336
+ end
1337
+
1338
+ def search_conversions
1339
+ {
1340
+ conversions: Rails.cache.read("search_conversions:#{self.class.name}:#{id}") || {}
1341
+ }
1342
+ end
1343
+ end
1344
+ ```
1345
+
1346
+ Create a job to update the cache and reindex records with new conversions.
1347
+
1348
+ ```ruby
1349
+ class ReindexConversionsJob < ActiveJob::Base
1350
+ def perform(class_name)
1351
+ # get records that have a recent conversion
1352
+ recently_converted_ids =
1353
+ Searchjoy::Search.where("convertable_type = ? AND converted_at > ?", class_name, 1.day.ago)
1354
+ .order(:convertable_id).uniq.pluck(:convertable_id)
1355
+
1356
+ # split into groups
1357
+ recently_converted_ids.in_groups_of(1000, false) do |ids|
1358
+ # fetch conversions
1359
+ conversions =
1360
+ Searchjoy::Search.where(convertable_id: ids, convertable_type: class_name)
1361
+ .group(:convertable_id, :query).uniq.count(:user_id)
1362
+
1363
+ # group conversions by record
1364
+ conversions_by_record = {}
1365
+ conversions.each do |(id, query), count|
1366
+ (conversions_by_record[id] ||= {})[query] = count
1367
+ end
1368
+
1369
+ # write to cache
1370
+ conversions_by_record.each do |id, conversions|
1371
+ Rails.cache.write("search_conversions:#{class_name}:#{id}", conversions)
1372
+ end
1373
+
1374
+ # partial reindex
1375
+ class_name.constantize.where(id: ids).reindex(:search_conversions)
1376
+ end
1377
+ end
1378
+ end
1379
+ ```
1380
+
1381
+ Run the job with:
1382
+
1383
+ ```ruby
1384
+ ReindexConversionsJob.perform_later("Product")
1385
+ ```
1386
+
1387
+ ## Advanced
1388
+
1389
+ Searchkick makes it easy to use the Elasticsearch DSL on its own.
1390
+
1391
+ ### Advanced Mapping
1392
+
1393
+ Create a custom mapping:
1394
+
1395
+ ```ruby
1396
+ class Product < ActiveRecord::Base
1397
+ searchkick mappings: {
1398
+ product: {
1399
+ properties: {
1400
+ name: {type: "string", analyzer: "keyword"}
1401
+ }
1402
+ }
1403
+ }
1404
+ end
1405
+ ```
1406
+ **Note:** If you use a custom mapping, you'll need to use [custom searching](#advanced-search) as well.
1407
+
1408
+ To keep the mappings and settings generated by Searchkick, use:
1409
+
1410
+ ```ruby
1411
+ class Product < ActiveRecord::Base
1412
+ searchkick merge_mappings: true, mappings: {...}
1413
+ end
1414
+ ```
1415
+
1416
+ ### Advanced Search
1417
+
1418
+ Use the `body` option to search:
1419
+
1420
+ ```ruby
1421
+ products = Product.search body: {match: {name: "milk"}}
1422
+ ```
1423
+
1424
+ **Note:** This replaces the entire body, so other options are ignored.
1425
+
1426
+ View the response with:
1427
+
1428
+ ```ruby
1429
+ products.response
1430
+ ```
1431
+
1432
+ To modify the query generated by Searchkick, use:
1433
+
1434
+ ```ruby
1435
+ products = Product.search "milk", body_options: {min_score: 1}
1436
+ ```
1437
+
1438
+ or
1439
+
1440
+ ```ruby
1441
+ products =
1442
+ Product.search "apples" do |body|
1443
+ body[:min_score] = 1
1444
+ end
1445
+ ```
1446
+
1447
+ ### Elasticsearch Gem
1448
+
1449
+ Searchkick is built on top of the [elasticsearch](https://github.com/elastic/elasticsearch-ruby) gem. To access the client directly, use:
1450
+
1451
+ ```ruby
1452
+ Searchkick.client
1453
+ ```
1454
+
1455
+ ## Multi Search
1456
+
1457
+ To batch search requests for performance, use:
1458
+
1459
+ ```ruby
1460
+ fresh_products = Product.search("fresh", execute: false)
1461
+ frozen_products = Product.search("frozen", execute: false)
1462
+ Searchkick.multi_search([fresh_products, frozen_products])
1463
+ ```
1464
+
1465
+ Then use `fresh_products` and `frozen_products` as typical results.
1466
+
1467
+ **Note:** Unlike single requests, errors are not raised. Use the `error` method on each query to check for errors. Also, if you use the `below` option for misspellings, misspellings will be disabled.
1468
+
1469
+ ## Multiple Indices
1470
+
1471
+ Search across multiple indices with:
1472
+
1473
+ ```ruby
1474
+ Searchkick.search "milk", index_name: [Product, Category]
1475
+ ```
1476
+
1477
+ Boost specific indices with:
1478
+
1479
+ ```ruby
1480
+ indices_boost: {Category => 2, Product => 1}
1481
+ ```
1482
+
1483
+ ## Nested Data
1484
+
1485
+ To query nested data, use dot notation.
1486
+
1487
+ ```ruby
1488
+ User.search "san", fields: ["address.city"], where: {"address.zip_code" => 12345}
1489
+ ```
1490
+
1491
+ ## Search Concepts
1492
+
1493
+ ### Precision and Recall
1494
+
1495
+ [Precision and recall](https://en.wikipedia.org/wiki/Precision_and_recall) are two key concepts in search (also known as *information retrieval*). To help illustrate, let’s walk through an example.
1496
+
1497
+ You have a store with 16 types of apples. A user searches for `apples` and gets 10 results. 8 of the results are for apples, and 2 are for apple juice.
1498
+
1499
+ **Precision** is the fraction of documents in the results that are relevant. There are 10 results and 8 are relevant, so precision is 80%.
1500
+
1501
+ **Recall** is the fraction of relevant documents in the results out of all relevant documents. There are 16 apples and only 8 in the results, so recall is 50%.
1502
+
1503
+ There’s typically a trade-off between the two. As you tweak your search to increase precision (not return irrelevant documents), there’s a greater chance a relevant document also isn’t returned, which decreases recall. The opposite also applies. As you try to increase recall (return a higher number of relevant documents), there’s a greater chance you also return an irrelevant document, decreasing precision.
1504
+
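The apples example works out as follows in plain Ruby. The variable names are chosen for this illustration only.

```ruby
# Counts from the apples example
results_returned  = 10 # documents in the results
relevant_returned = 8  # results that are actually apples
relevant_total    = 16 # types of apples in the store

precision = relevant_returned.to_f / results_returned
recall    = relevant_returned.to_f / relevant_total

puts "precision: #{(precision * 100).round}%" # precision: 80%
puts "recall: #{(recall * 100).round}%"       # recall: 50%
```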
1505
+ ## Reference
1506
+
1507
+ Reindex one record
1508
+
1509
+ ```ruby
1510
+ product = Product.find(1)
1511
+ product.reindex
1512
+ # or to reindex in the background
1513
+ product.reindex_async
1514
+ ```
1515
+
1516
+ Reindex multiple records
1517
+
1518
+ ```ruby
1519
+ Product.where(store_id: 1).reindex
1520
+ ```
1521
+
1522
+ Reindex associations
1523
+
1524
+ ```ruby
1525
+ store.products.reindex
1526
+ ```
1527
+
1528
+ Remove old indices
1529
+
1530
+ ```ruby
1531
+ Product.search_index.clean_indices
1532
+ ```
1533
+
1534
+ Use custom settings
1535
+
1536
+ ```ruby
1537
+ class Product < ActiveRecord::Base
1538
+ searchkick settings: {number_of_shards: 3}
1539
+ end
1540
+ ```
1541
+
1542
+ Use a different index name
1543
+
1544
+ ```ruby
1545
+ class Product < ActiveRecord::Base
1546
+ searchkick index_name: "products_v2"
1547
+ end
1548
+ ```
1549
+
1550
+ Use a dynamic index name
1551
+
1552
+ ```ruby
1553
+ class Product < ActiveRecord::Base
1554
+ searchkick index_name: -> { "#{name.tableize}-#{I18n.locale}" }
1555
+ end
1556
+ ```
1557
+
1558
+ Prefix the index name
1559
+
1560
+ ```ruby
1561
+ class Product < ActiveRecord::Base
1562
+ searchkick index_prefix: "datakick"
1563
+ end
1564
+ ```
1565
+
1566
+ Multiple conversion fields
1567
+
1568
+ ```ruby
1569
+ class Product < ActiveRecord::Base
1570
+ has_many :searches, class_name: "Searchjoy::Search"
1571
+
1572
+ # searchkick also supports multiple "conversions" fields
1573
+ searchkick conversions: ["unique_user_conversions", "total_conversions"]
1574
+
1575
+ def search_data
1576
+ {
1577
+ name: name,
1578
+ unique_user_conversions: searches.group(:query).uniq.count(:user_id),
1579
+ # {"ice cream" => 234, "chocolate" => 67, "cream" => 2}
1580
+ total_conversions: searches.group(:query).count
1581
+ # {"ice cream" => 412, "chocolate" => 117, "cream" => 6}
1582
+ }
1583
+ end
1584
+ end
1585
+ ```
1586
+
1587
+ and during query time:
1588
+
1589
+ ```ruby
1590
+ Product.search("banana") # boost by both fields (default)
1591
+ Product.search("banana", conversions: "total_conversions") # only boost by total_conversions
1592
+ Product.search("banana", conversions: false) # no conversion boosting
1593
+ ```
1594
+
1595
+ Change timeout
1596
+
1597
+ ```ruby
1598
+ Searchkick.timeout = 15 # defaults to 10
1599
+ ```
1600
+
1601
+ Set a lower timeout for searches
1602
+
1603
+ ```ruby
1604
+ Searchkick.search_timeout = 3
1605
+ ```
1606
+
1607
+ Change the search method name
1608
+
1609
+ ```ruby
1610
+ Searchkick.search_method_name = :lookup
1611
+ ```
1612
+
1613
+ Change search queue name
1614
+
1615
+ ```ruby
1616
+ Searchkick.queue_name = :search_reindex
1617
+ ```
1618
+
1619
+ Eager load associations
1620
+
1621
+ ```ruby
1622
+ Product.search "milk", includes: [:brand, :stores]
1623
+ ```
1624
+
1625
+ Turn off special characters
1626
+
1627
+ ```ruby
1628
+ class Product < ActiveRecord::Base
1629
+ # A will not match Ä
1630
+ searchkick special_characters: false
1631
+ end
1632
+ ```
1633
+
1634
+ Use a different [similarity algorithm](https://www.elastic.co/guide/en/elasticsearch/reference/current/index-modules-similarity.html) for scoring
1635
+
1636
+ ```ruby
1637
+ class Product < ActiveRecord::Base
1638
+ searchkick similarity: "classic"
1639
+ end
1640
+ ```
1641
+
1642
+ Change import batch size
1643
+
1644
+ ```ruby
1645
+ class Product < ActiveRecord::Base
1646
+ searchkick batch_size: 200 # defaults to 1000
1647
+ end
1648
+ ```
1649
+
1650
+ Create index without importing
1651
+
1652
+ ```ruby
1653
+ Product.reindex(import: false)
1654
+ ```
1655
+
1656
+ Lazy searching
1657
+
1658
+ ```ruby
1659
+ products = Product.search("carrots", execute: false)
1660
+ products.each { ... } # search not executed until here
1661
+ ```
1662
+
1663
+ Add [request parameters](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-uri-request.html), like `search_type` and `query_cache`
1664
+
1665
+ ```ruby
1666
+ Product.search("carrots", request_params: {search_type: "dfs_query_then_fetch"})
1667
+ ```
1668
+
1669
+ Reindex conditionally
1670
+
1671
+ ```ruby
1672
+ class Product < ActiveRecord::Base
1673
+ searchkick callbacks: false
1674
+
1675
+ # add the callbacks manually
1676
+ after_commit :reindex, if: -> (model) { model.previous_changes.key?("name") } # use your own condition
1677
+ end
1678
+ ```
1679
+
1680
+ Reindex all models - Rails only
1681
+
1682
+ ```sh
1683
+ rake searchkick:reindex:all
1684
+ ```
1685
+
1686
+ Turn on misspellings after a certain number of characters
1687
+
1688
+ ```ruby
1689
+ Product.search "api", misspellings: {prefix_length: 2} # api, apt, no ahi
1690
+ ```
1691
+
1692
+ **Note:** With this option, if the query length is the same as `prefix_length`, misspellings are turned off.
1693
+
1694
+ ```ruby
1695
+ Product.search "ah", misspellings: {prefix_length: 2} # ah, no aha
1696
+ ```
1697
+
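The behavior described above can be modeled in plain Ruby. This is a conceptual sketch rather than Searchkick's implementation: `edit_distance` and `fuzzy_match?` are hypothetical helpers, and a maximum edit distance of 1 is assumed.

```ruby
# Hypothetical helper: classic Levenshtein edit distance.
def edit_distance(a, b)
  prev = (0..b.length).to_a
  a.each_char.with_index(1) do |ca, i|
    curr = [i]
    b.each_char.with_index(1) do |cb, j|
      cost = ca == cb ? 0 : 1
      curr << [prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost].min
    end
    prev = curr
  end
  prev.last
end

# Hypothetical model: the first prefix_length characters must match
# exactly, and one edit is allowed after that. When the query is no
# longer than the prefix, only exact matches count.
def fuzzy_match?(query, term, prefix_length:)
  return query == term if query.length <= prefix_length
  return false unless term.start_with?(query[0, prefix_length])
  edit_distance(query, term) <= 1
end

fuzzy_match?("api", "apt", prefix_length: 2) # true
fuzzy_match?("api", "ahi", prefix_length: 2) # false, the prefix differs
fuzzy_match?("ah", "aha", prefix_length: 2)  # false, query is only as long as the prefix
```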
1698
+ ## Testing
1699
+
1700
+ For performance, only enable Searchkick callbacks for the tests that need it.
1701
+
1702
+ ### Minitest
1703
+
1704
+ Add to your `test/test_helper.rb`:
1705
+
1706
+ ```ruby
1707
+ # reindex models
1708
+ Product.reindex
1709
+
1710
+ # and disable callbacks
1711
+ Searchkick.disable_callbacks
1712
+ ```
1713
+
1714
+ And use:
1715
+
1716
+ ```ruby
1717
+ class ProductTest < Minitest::Test
1718
+ def setup
1719
+ Searchkick.enable_callbacks
1720
+ end
1721
+
1722
+ def teardown
1723
+ Searchkick.disable_callbacks
1724
+ end
1725
+
1726
+ def test_search
1727
+ Product.create!(name: "Apple")
1728
+ Product.search_index.refresh
1729
+ assert_equal ["Apple"], Product.search("apple").map(&:name)
1730
+ end
1731
+ end
1732
+ ```
1733
+
1734
+ ### RSpec
1735
+
1736
+ Add to your `spec/spec_helper.rb`:
1737
+
1738
+ ```ruby
1739
+ RSpec.configure do |config|
1740
+ config.before(:suite) do
1741
+ # reindex models
1742
+ Product.reindex
1743
+
1744
+ # and disable callbacks
1745
+ Searchkick.disable_callbacks
1746
+ end
1747
+
1748
+ config.around(:each, search: true) do |example|
1749
+ Searchkick.enable_callbacks
1750
+ example.run
1751
+ Searchkick.disable_callbacks
1752
+ end
1753
+ end
1754
+ ```
1755
+
1756
+ And use:
1757
+
1758
+ ```ruby
1759
+ describe Product, search: true do
1760
+ it "searches" do
1761
+ Product.create!(name: "Apple")
1762
+ Product.search_index.refresh
1763
+ expect(Product.search("apple").map(&:name)).to eq(["Apple"])
1764
+ end
1765
+ end
1766
+ ```
1767
+
1768
+ ### Factory Girl
1769
+
1770
+ Use a trait and an after `create` hook for each indexed model:
1771
+
1772
+ ```ruby
1773
+ FactoryGirl.define do
1774
+ factory :product do
1775
+ # ...
1776
+
1777
+ # Note: This should be the last trait in the list so `reindex` is called
1778
+ # after all the other callbacks complete.
1779
+ trait :reindex do
1780
+ after(:create) do |product, _evaluator|
1781
+ product.reindex(refresh: true)
1782
+ end
1783
+ end
1784
+ end
1785
+ end
1786
+
1787
+ # use it
1788
+ FactoryGirl.create(:product, :some_trait, :reindex, some_attribute: "foo")
1789
+ ```
1790
+
1791
+ ### Parallel Tests
1792
+
1793
+ Set:
1794
+
1795
+ ```ruby
1796
+ Searchkick.index_suffix = ENV["TEST_ENV_NUMBER"]
1797
+ ```
1798
+
1799
+ ## Multi-Tenancy
1800
+
1801
+ Check out [this great post](https://www.tiagoamaro.com.br/2014/12/11/multi-tenancy-with-searchkick/) on the [Apartment](https://github.com/influitive/apartment) gem. Follow a similar pattern if you use another gem.
1802
+
1803
+ ## Upgrading
1804
+
1805
+ View the [changelog](https://github.com/ankane/searchkick/blob/master/CHANGELOG.md).
1806
+
1807
+ Important notes are listed below.
1808
+
1809
+ ### 2.0.0
1810
+
1811
+ - Added support for `reindex` on associations
1812
+
1813
+ #### Breaking Changes
1814
+
1815
+ - Removed support for Elasticsearch 1 as it reaches [end of life](https://www.elastic.co/support/eol)
1816
+ - Removed facets, legacy options, and legacy methods
1817
+ - Invalid options now throw an `ArgumentError`
1818
+ - The `query` and `json` options have been removed in favor of `body`
1819
+ - The `include` option has been removed in favor of `includes`
1820
+ - The `personalize` option has been removed in favor of `boost_where`
1821
+ - The `partial` option has been removed in favor of `operator`
1822
+ - Renamed `select_v2` to `select` (legacy `select` no longer available)
1823
+ - The `_all` field is disabled if `searchable` option is used (for performance)
1824
+ - The `partial_reindex(:method_name)` method has been replaced with `reindex(:method_name)`
1825
+ - The `unsearchable` and `only_analyzed` options have been removed in favor of `searchable` and `filterable`
1826
+ - `load: false` no longer returns an array in Elasticsearch 2
1827
+
1828
+ ### 1.0.0
1829
+
1830
+ - Added support for Elasticsearch 2.0
1831
+ - Facets are deprecated in favor of [aggregations](#aggregations) - see [how to upgrade](#moving-from-facets)
1832
+
1833
+ #### Breaking Changes
1834
+
1835
+ - **ActiveRecord 4.1+ and Mongoid 3+:** Attempting to reindex with a scope now throws a `Searchkick::DangerousOperation` error to keep you from accidentally recreating your index with only a few records.
1836
+
1837
+ ```ruby
1838
+ Product.where(color: "brandy").reindex # error!
1839
+ ```
1840
+
1841
+ If this is what you intend to do, use:
1842
+
1843
+ ```ruby
1844
+ Product.where(color: "brandy").reindex(accept_danger: true)
1845
+ ```
1846
+
1847
+ - Misspellings are enabled by default for [partial matches](#partial-matches). Use `misspellings: false` to disable.
1848
+ - [Transpositions](https://en.wikipedia.org/wiki/Damerau%E2%80%93Levenshtein_distance) are enabled by default for misspellings. Use `misspellings: {transpositions: false}` to disable.
1849
+
1850
+ ### 0.6.0 and 0.7.0
1851
+
1852
+ If running Searchkick `0.6.0` or `0.7.0` and Elasticsearch `0.90`, we recommend upgrading to Searchkick `0.6.1` or `0.7.1` to fix an issue that causes downtime when reindexing.
1853
+
1854
+ ### 0.3.0
1855
+
1856
+ Before `0.3.0`, locations were indexed incorrectly. When upgrading, be sure to reindex immediately.
1857
+
1858
+ ## Elasticsearch Gotchas
1859
+
1860
+ ### Consistency
1861
+
1862
+ Elasticsearch is eventually consistent, meaning it can take up to a second for a change to reflect in search. You can use the `refresh` method to have it show up immediately.
1863
+
1864
+ ```ruby
1865
+ product.save!
1866
+ Product.search_index.refresh
1867
+ ```
1868
+
1869
+ ### Inconsistent Scores
1870
+
1871
+ Due to the distributed nature of Elasticsearch, you can get incorrect results when the number of documents in the index is low. You can [read more about it here](https://www.elastic.co/blog/understanding-query-then-fetch-vs-dfs-query-then-fetch). To fix this, do:
1872
+
1873
+ ```ruby
1874
+ class Product < ActiveRecord::Base
1875
+ searchkick settings: {number_of_shards: 1}
1876
+ end
1877
+ ```
1878
+
1879
+ For convenience, this is set by default in the test environment.
1880
+
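To see why a single shard helps, consider that by default each shard scores documents using its own term statistics. The sketch below uses a simplified inverse document frequency formula and made-up data. It is only an illustration, and real scoring differs in detail.

```ruby
# Simplified inverse document frequency (an assumption for this sketch).
def idf(total_docs, docs_with_term)
  Math.log(1 + (total_docs - docs_with_term + 0.5) / (docs_with_term + 0.5))
end

# Six made-up documents on two shards; "soap" is unevenly distributed.
shards = {
  shard_0: ["soap", "soap dish", "towel"],
  shard_1: ["soap", "brush", "sponge"]
}

# Each shard weighs the term "soap" using only its own documents.
local = shards.transform_values do |docs|
  idf(docs.size, docs.count { |d| d.include?("soap") })
end

# With a single shard, every document would share this one weight.
all_docs = shards.values.flatten
global = idf(all_docs.size, all_docs.count { |d| d.include?("soap") })
```

Here the identical document `"soap"` gets a different term weight depending on which shard holds it. As an index grows, per-shard statistics tend to converge and the effect fades.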
1881
+ ## Thanks
1882
+
1883
+ Thanks to Karel Minarik for [Elasticsearch Ruby](https://github.com/elasticsearch/elasticsearch-ruby) and [Tire](https://github.com/karmi/retire), Jaroslav Kalistsuk for [zero downtime reindexing](https://gist.github.com/jarosan/3124884), and Alex Leschenko for [Elasticsearch autocomplete](https://github.com/leschenko/elasticsearch_autocomplete).
1884
+
1885
+ ## Roadmap
1886
+
1887
+ - Reindex API
1888
+ - Incorporate human eval
1889
+
1890
+ ## Contributing
1891
+
1892
+ Everyone is encouraged to help improve this project. Here are a few ways you can help:
1893
+
1894
+ - [Report bugs](https://github.com/ankane/searchkick/issues)
1895
+ - Fix bugs and [submit pull requests](https://github.com/ankane/searchkick/pulls)
1896
+ - Write, clarify, or fix documentation
1897
+ - Suggest or add new features
1898
+
1899
+ If you’re looking for ideas, [try here](https://github.com/ankane/searchkick/issues?q=is%3Aissue+is%3Aopen+label%3A%22help+wanted%22).
1900
+
1901
+ To get started with development and testing:
1902
+
1903
+ ```sh
1904
+ git clone https://github.com/ankane/searchkick.git
1905
+ cd searchkick
1906
+ bundle install
1907
+ rake test
1908
+ ```