mongodb_meilisearch 1.3.0 → 2.0.0

data/README.md DELETED
@@ -1,531 +0,0 @@
1
- # <!-- :TOC: -->
2
- - [MongodbMeilisearch](#mongodbmeilisearch)
3
- - [Installation](#installation)
4
- - [Usage](#usage)
5
- - [Model Integration](#model-integration)
6
- - [Indexes](#indexes)
7
- - [Searching](#searching)
8
- - [Development](#development)
9
- - [License](#license)
10
- - [Code of Conduct](#code-of-conduct)
11
-
12
- # MongodbMeilisearch
13
-
14
- A simple gem for integrating [Meilisearch](https://www.meilisearch.com) into Ruby† applications that are backed by [MongoDB](https://www.mongodb.com/).
15
-
16
-
17
- † It's currently limited to Rails apps, but hopefully that will change soon.
18
-
19
- ## Installation
20
-
21
- Install the gem and add to the application's Gemfile by executing:
22
-
23
- $ bundle add mongodb_meilisearch
24
-
25
- If bundler is not being used to manage dependencies, install the gem by executing:
26
-
27
- $ gem install mongodb_meilisearch
28
-
29
- ## Usage
30
-
31
- A high level overview
32
-
33
- ### Pre-Requisites
34
-
35
- - [Meilisearch](https://www.meilisearch.com)
36
- - [MongoDB](https://www.mongodb.com/)
37
- - Some models that `include Mongoid::Document`
38
-
39
- ### Configuration
40
-
41
- Define the following variables in your environment (or `.env` file if you're using `dotenv`).
42
- The URL below is the default one Meilisearch uses when run locally.
43
-
44
- ```bash
45
- SEARCH_ENABLED=true
46
- MEILI_MASTER_KEY=<your api key here>
47
- MEILISEARCH_URL=http://127.0.0.1:7700
48
-
49
- ```
50
-
51
- Optional configuration
52
- ```bash
53
- MEILISEARCH_TIMEOUT=10
54
- MEILISEARCH_MAX_RETRIES=2
55
- ```
56
-
57
- ## Model Integration
58
-
59
- Add the following near the top of your model. Only the `extend` and `include` lines are required.
60
- This assumes your model also includes `Mongoid::Document`
61
-
62
- ```ruby
63
- include Search::InstanceMethods
64
- extend Search::ClassMethods
65
- ```
66
-
67
- If you want Rails to automatically add, update, and delete records from the index, add the following to your model.
68
-
69
- You can override these methods if needed, but you're unlikely to want to.
70
-
71
- ```ruby
72
- # enabled?() is controlled by the SEARCH_ENABLED environment variable
73
- if Search::Client.instance.enabled?
74
- after_create :add_to_search
75
- after_update :update_in_search
76
- after_destroy :remove_from_search
77
- end
78
- ```
79
-
80
- Assuming you've done the above a new index will be created with a name that
81
- corresponds to your model's name, only in snake case. All of your model's
82
- fields will be indexed and [filterable](https://www.meilisearch.com/docs/learn/fine_tuning_results/filtering).
83
-
84
-
85
- ### Example Rails Model
86
-
87
- Here's what it looks like when you put it all
88
- together in a Rails model with the default behavior.
89
-
90
- ```ruby
91
- class Person
92
- include Mongoid::Document
93
- include Search::InstanceMethods
- extend Search::ClassMethods
94
-
95
- if Search::Client.instance.enabled?
96
- after_create :add_to_search
97
- after_update :update_in_search
98
- after_destroy :remove_from_search
99
- end
100
-
101
- # normal Mongoid attributes
102
- field :name, type: String
103
- field :description, type: String
104
- field :age, type: Integer
105
- end
106
- ```
107
-
108
- Note that _unless you configure it otherwise_ the ids of `belongs_to` objects
109
- will not be searchable. This is because they're random strings that no human's ever
110
- going to be searching for, and we don't want to waste RAM or storage.
111
-
112
- ### Going Beyond The Defaults
113
- This module strives for sensible defaults, but you can override them with the
114
- following optional constants:
115
-
116
- - `PRIMARY_SEARCH_KEY` - a Symbol matching one of your model's attributes
117
- that is guaranteed unique. This defaults to `_id`
118
- - `SEARCH_INDEX_NAME` - a String - useful if you want to have records from
119
- multiple classes come back in the same search results. This defaults to the
120
- underscored form of the current class name.
121
- - `SEARCH_OPTIONS` - a hash of key value pairs in JS style
122
- - See the [meilisearch search parameter docs](https://www.meilisearch.com/docs/reference/api/search#search-parameters) for details.
123
- - example from [meilisearch-ruby's `multi_params_spec`](https://github.com/meilisearch/meilisearch-ruby/blob/main/spec/meilisearch/index/search/multi_params_spec.rb)
124
- ```ruby
125
- {
126
- attributesToCrop: ['title'],
127
- cropLength: 2,
128
- filter: 'genre = adventure',
129
- attributesToHighlight: ['title'],
130
- limit: 2
131
- }
132
- ```
133
- - `SEARCH_RANKING_RULES` - an array of strings that correspond to meilisearch rules
134
- see [meilisearch ranking rules docs](https://www.meilisearch.com/docs/learn/core_concepts/relevancy#ranking-rules)
135
- You probably don't want to change this.
136
-
137
-
138
- ## Indexes
139
-
140
- Searching is limited to records that have been added to a given index. This means,
141
- if you want to perform one search and get back records from multiple models you'll need to
142
- add them to the same index.
143
-
144
- In order to do that add the `SEARCH_INDEX_NAME` constant to each model whose search data you want to end up in the same index. You can name this just about anything. The important thing is
145
- that all the models that share this index have the same `SEARCH_INDEX_NAME` constant defined. You may want to just add it to a module they all include.
146
-
147
-
148
- ```ruby
149
- SEARCH_INDEX_NAME='general_search'
150
- ```
151
-
152
- If multiple models are using the same index, you should also
153
- add `CLASS_PREFIXED_SEARCH_IDS=true`. This causes the `id` field to
154
- be `<ClassName>_<_id>`. For example, a `Note` record might have an
155
- `id` of `"Note_64274543906b1d7d02c1fcc6"`. If undefined this will default to `false`.
156
- This is not needed if you can absolutely guarantee that there will be
157
- no overlap in ids amongst all the models using a shared index.
158
-
159
- ```ruby
160
- CLASS_PREFIXED_SEARCH_IDS=true
161
- ```
162
-
163
- Setting `CLASS_PREFIXED_SEARCH_IDS` to `true` will also cause the original Mongoid `_id` field to be indexed
164
- as `original_document_id`. This is useful if you want to be able to retrieve the original record from the database.
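
As a rough illustration of the id scheme described above, the sketch below builds the two id fields by hand. `prefixed_search_fields` is a hypothetical helper written for this example, not part of the gem, and the gem's own implementation may differ.

```ruby
# Hypothetical helper (not part of mongodb_meilisearch): mimics the
# documented "<ClassName>_<_id>" scheme used for shared indexes.
def prefixed_search_fields(class_name, mongo_id)
  {
    # unique across every model that writes to the shared index
    "id" => "#{class_name}_#{mongo_id}",
    # the untouched Mongoid _id, kept so the record can be looked up later
    "original_document_id" => mongo_id.to_s
  }
end

fields = prefixed_search_fields("Note", "64274543906b1d7d02c1fcc6")
puts fields["id"]                   # Note_64274543906b1d7d02c1fcc6
puts fields["original_document_id"] # 64274543906b1d7d02c1fcc6
```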
165
-
166
- ### Searchable Data
167
- You probably don't want to index _all_ the fields. For example,
168
- unless you intend to allow users to sort by when a record was created,
169
- there's no point in recording its `created_at` in the search index.
170
- It'll just waste bandwidth, memory, and disk space.
171
-
172
- Define a `SEARCHABLE_ATTRIBUTES` constant with an array of strings to limit things.
173
- These are the field names, and/or names of methods you wish to have indexed.
174
-
175
- By default these will _also_ be the fields you can filter on.
176
-
177
- Note that Meilisearch requires there to be an `id` field and it must be a string.
178
- If you don't define one it will use the string version of your
179
- document's `_id`, which is a `BSON::ObjectId`.
180
-
181
- ```ruby
182
- # explicitly define the fields you want to be searchable
183
- # this should be an array of symbols
184
- SEARCHABLE_ATTRIBUTES = %w[title body]
185
- # OR explicitly define the fields you DON'T want searchable
186
- SEARCHABLE_ATTRIBUTES = searchable_attributes - [:created_at]
187
- ```
188
-
189
- #### Including Foreign Key data
190
- If, for example, your `Person` model declares `belongs_to :group`
191
- and you wanted that group's id to be searchable you would include `group_id`
192
- in the list.
193
-
194
- If you don't specify any `SEARCHABLE_ATTRIBUTES`, the default list will
195
- exclude any fields that are `Mongoid::Fields::ForeignKey` objects.
196
-
197
-
198
- #### Getting Extra Specific
199
- If your searchable data needs to be dynamically generated instead of
200
- just taken directly from the `Mongoid::Document`'s attributes or
201
- existing methods you can define a `search_indexable_hash` method on your class.
202
-
203
- Before you do, please note that as of v1.1 your `SEARCHABLE_ATTRIBUTES`
204
- constant can contain fields and method names in its array of values. Making
205
- a method for each dynamically generated thing you want in the search
206
- and then including it in `SEARCHABLE_ATTRIBUTES` is going to be
207
- the easiest way of accomplishing this.
208
-
209
- Your `search_indexable_hash` must return a hash, and that hash must include the following keys:
210
- - `"id"` - a string that uniquely identifies the record
211
- - `"object_class"` - the name of the class that this record corresponds to.
212
-
213
- The value of `"object_class"` is usually just `self.class.name`.
214
- This is something specific to this gem, and not Meilisearch itself.
215
-
216
- See `InstanceMethods#search_indexable_hash` for an example.
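
For orientation, here is a minimal sketch of the shape such a method might take. `Memo` is a plain Ruby stand-in for a Mongoid model so the example runs without a database, and `"body_word_count"` is a made-up dynamically generated field; only the `"id"` and `"object_class"` keys come from the requirements above.

```ruby
# Plain-Ruby stand-in for a Mongoid model (assumption for this sketch).
# In a real model, `id` would be the document's BSON::ObjectId.
class Memo
  attr_reader :id, :title, :body

  def initialize(id:, title:, body:)
    @id = id
    @title = title
    @body = body
  end

  # Returns a hash containing the two required keys plus any extra data.
  def search_indexable_hash
    {
      "id" => id.to_s,                      # must be a unique string
      "object_class" => self.class.name,    # gem-specific, used for class filtering
      "title" => title,
      "body_word_count" => body.split.size  # hypothetical computed value
    }
  end
end

memo = Memo.new(id: "64274543906b1d7d02c1fcc6", title: "Standup", body: "very full bodied")
p memo.search_indexable_hash
```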
217
-
218
- #### Filterable Fields
219
- If you'd like to only be able to filter on a subset of those then
220
- you can define `FILTERABLE_ATTRIBUTE_NAMES` but it _must_ be a subset
221
- of `SEARCHABLE_ATTRIBUTES`. This is enforced by the gem to guarantee
222
- no complaints from Meilisearch. These must be symbols.
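
If you want to sanity-check the subset requirement yourself before deploying, something like the following would do it. `unsearchable_filters` is a hypothetical helper, not a method the gem provides; it normalizes both lists to symbols so that string and symbol names compare equal.

```ruby
# Hypothetical check (not part of the gem): returns the filterable names
# that are missing from the searchable list. An empty array means the
# subset requirement is satisfied.
def unsearchable_filters(searchable, filterable)
  filterable.map(&:to_sym) - searchable.map(&:to_sym)
end

p unsearchable_filters(%w[title body], %i[title])        # => []
p unsearchable_filters(%w[title body], %i[title author]) # => [:author]
```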
223
-
224
- If you have no direct need for filterable results,
225
- set `UNFILTERABLE_IN_SEARCH=true` in your model. This will save
226
- on index size and speed up indexing, but you won't be able to filter
227
- search results, and that's half of what makes Meilisearch so great.
228
- Note that even if this _is_ set to `true`, this gem
229
- will still add `"object_class"` as a filterable attribute.
230
-
231
- This is the magic that allows you to have an index shared by multiple
232
- models and still be able to retrieve results specifically for one.
233
-
234
- If you decide to re-enable filtering you can remove that constant, or set it to false.
235
- Then call the following. If `FILTERABLE_ATTRIBUTE_NAMES` is defined it will use that,
236
- otherwise it will use whatever `.searchable_attributes` returns.
237
-
238
- ```ruby
239
- MyModel.set_filterable_attributes! # synchronous
240
- MyModel.set_filterable_attributes # asynchronous
241
- ```
242
-
243
- This will cause Meilisearch to reindex all the records for that index. If you
244
- have a large number of records this could take a while. Consider running it
245
- on a background thread. Note that filtering is managed at the index level, not the individual
246
- record level. By setting filterable attributes you're giving Meilisearch
247
- guidance on what to do when indexing your data.
248
-
249
- Note that you will encounter problems in a shared index if you try to
250
- filter on a field that one of the contributing models doesn't have set
251
- as a filterable field, or doesn't have at all.
252
-
253
- ### Sortable Fields
254
-
255
- Sortable fields work in essentially the same way as filterable fields.
256
- By default it's the same as your `FILTERABLE_ATTRIBUTE_NAMES` which, in turn, defaults to your `SEARCHABLE_ATTRIBUTES`. You can
257
- override it by setting `SORTABLE_ATTRIBUTE_NAMES`.
258
-
259
- Note that you will encounter problems in a shared index if you try to
260
- sort on a field that one of the contributing models doesn't have set
261
- as a sortable field, or doesn't have at all.
262
-
263
- ```ruby
264
- MyModel.set_sortable_attributes! # synchronous
265
- MyModel.set_sortable_attributes # asynchronous
266
- ```
267
-
268
- ### Indexing things
269
- **Important note**: By default anything you do that updates the search index (adding, removing, or changing) happens asynchronously.
270
-
271
- Sometimes, especially when debugging something on the console, you want to
272
- update the index _synchronously_. The convention used in this codebase - and in the meilisearch-ruby library we build on - is that
273
- the synchronous methods are the ones with the bang. Similar to how mutating
274
- state is potentially dangerous and noted with a bang, using synchronous methods
275
- is potentially problematic for your users, and thus noted with a bang.
276
-
277
-
278
- For example:
279
- ```ruby
280
- MyModel.reindex # runs asynchronously
281
-
282
- ```
283
-
284
- vs
285
-
286
- ```ruby
287
- MyModel.reindex! # runs synchronously
288
- ```
289
-
290
- #### Reindexing, Adding, Updating, and Deleting
291
-
292
- **Reindexing**
293
- Calling `MyModel.reindex!` deletes all the existing records from the current index,
294
- and then reindexes all the records for the current model. It's safe to run this
295
- even if there aren't any records. In addition to re-indexing your models,
296
- it will update/set the "sortable" and "filterable" fields on the
297
- relevant indexes.
298
-
299
- Note: reindexing behaves slightly differently than all the other methods.
300
- It runs semi-asynchronously by default. The asynchronous form will first
301
- attempt to _synchronously_ delete all the records from the index. If that
302
- fails an exception will be raised. Otherwise you'd think everything was
303
- fine when actually it had failed miserably. If you call `.reindex!`
304
- it will be entirely synchronous.
305
-
306
- Note: adding, updating, and deleting should happen automatically
307
- if you've defined `after_create`, `after_update`, and `after_destroy`
308
- as instructed above. You'll mostly only want to use these when manually
309
- mucking with things in the console.
310
-
311
- **Adding**
312
- Be careful to not add documents that are already in the index.
313
-
314
- - Add everything: `MyClass.add_all_to_search`
315
- - Add a specific instance: `my_instance.add_to_search`
316
- - Add a specific subset of documents: `MyClass.add_documents(documents_hashes)`
317
- IMPORTANT: `documents_hashes` must be an array of hashes that were each generated
318
- via `search_indexable_hash`
319
-
320
- **Updating**
321
- - Update everything: call `reindex`
322
- - Update a specific instance: `my_instance.update_in_search`
323
- - Update a specific subset of documents: `MyClass.update_documents(documents_hashes)`
324
- IMPORTANT: `documents_hashes` must be an array of hashes that were generated
325
- via `search_indexable_hash`. The `PRIMARY_SEARCH_KEY` (`_id` by default) will be
326
- used to find records in the index to update.
327
-
328
-
329
- **Deleting**
330
- - Delete everything: `MyClass.delete_all_documents!`
331
- - Delete a specific record: `my_instance.remove_from_search`
332
- - Delete the index: `MyClass.delete_index!`
333
- WARNING: if you think you should use this, you're probably
334
- mistaken.
335
-
336
- #### Indexes
337
- By default every model gets its own search index. This means that
338
- `Foo.search("some text")` will only search `Foo` objects. To have a
339
- search span multiple models you'll need to use a "Shared Index" (see below).
340
-
341
-
342
- The name of the index isn't important when not using shared indexes.
343
- By default a model's index is the snake cased form of the class name.
344
- For example, data for `MyWidget` models will be stored in the `my_widget` index.
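
The naming rule can be approximated in plain Ruby as shown below. `default_index_name` is a hypothetical helper that imitates ActiveSupport's `String#underscore` for un-namespaced class names; it is an assumption that the gem's own conversion behaves the same way for every input.

```ruby
# Hypothetical approximation of the class-name-to-index-name rule.
# Handles simple, un-namespaced class names only.
def default_index_name(class_name)
  class_name
    .gsub(/([A-Z]+)([A-Z][a-z])/, '\1_\2') # split acronym runs: "HTMLPage" -> "HTML_Page"
    .gsub(/([a-z\d])([A-Z])/, '\1_\2')     # split word boundaries: "MyWidget" -> "My_Widget"
    .downcase
end

puts default_index_name("MyWidget") # my_widget
puts default_index_name("Note")     # note
```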
345
-
346
- #### Shared indexes
347
- Imagine you have a `Note` and a `Comment` model, sharing an index so that
348
- you can perform a single search and have search results for both models
349
- that are ranked by relevance.
350
-
351
- In this case both models would define a `SEARCH_INDEX_NAME` constant with the
352
- same value. You might want to just put this, and the other search stuff
353
- in a common module that they all `include`.
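
A bare-bones sketch of such a common module follows. `GeneralSearchable` is a made-up name, and the Mongoid and `Search` lines are left out so the example runs on its own. It relies on the assumption that the gem resolves these constants through the including class's ancestors, which is how Ruby's ordinary constant lookup works.

```ruby
# Hypothetical shared module holding the search constants.
module GeneralSearchable
  SEARCH_INDEX_NAME = "general_search"
  CLASS_PREFIXED_SEARCH_IDS = true
end

# In a real app these classes would also include Mongoid::Document
# and the gem's Search modules.
class Note
  include GeneralSearchable
end

class Comment
  include GeneralSearchable
end

# Constants defined in an included module are visible through the class.
puts Note::SEARCH_INDEX_NAME    # general_search
puts Comment::SEARCH_INDEX_NAME # general_search
```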
354
-
355
- Then, when you search you can say `Note.search("search term")` and it will _only_
356
- bring back results for `Note` records. If you want to include results that match
357
- `Comment` records too, you can set the optional `filtered_by_class` parameter to `false`.
358
-
359
- For example: `Note.search("search term", filtered_by_class: false)`
360
- will return all matching `Note` results, as well as results for _all_ the
361
- other models that share the same index as `Note`.
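
To make the effect of `filtered_by_class` concrete, here is a toy simulation over an in-memory array. Nothing in it talks to Meilisearch: `SHARED_DOCS` and `simulated_search` are invented for this sketch, and substring matching stands in for real relevance search. The only detail taken from the gem is that each document carries an `"object_class"` value.

```ruby
# In-memory stand-in for a shared index (assumption for this sketch).
SHARED_DOCS = [
  { "id" => "Note_1",    "object_class" => "Note",    "body" => "foo in a note" },
  { "id" => "Comment_7", "object_class" => "Comment", "body" => "foo in a comment" },
  { "id" => "Note_2",    "object_class" => "Note",    "body" => "unrelated" }
].freeze

# Hypothetical simulation of the class filter.
def simulated_search(docs, term, class_name:, filtered_by_class: true)
  hits = docs.select { |doc| doc["body"].include?(term) }
  # When filtering, keep only documents generated by the calling class.
  hits = hits.select { |doc| doc["object_class"] == class_name } if filtered_by_class
  hits.map { |doc| doc["id"] }
end

p simulated_search(SHARED_DOCS, "foo", class_name: "Note")
# => ["Note_1"]
p simulated_search(SHARED_DOCS, "foo", class_name: "Note", filtered_by_class: false)
# => ["Note_1", "Comment_7"]
```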
362
-
363
- ⚠ Models sharing the same index must share the same primary key field as well.
364
- This is a known limitation of the system.
365
-
366
- ## Searching
367
-
368
- To get a list of all the matching objects in the order returned by the search engine
369
- run `MyModel.search("search term")`. Note that this will restrict the results to
370
- records generated by the model you're calling this on. If you have an index
371
- that contains data from multiple models and wish to include all of them in
372
- the results pass in the optional `filtered_by_class` parameter with a `false` value.
373
- E.g. `MyModel.search("search term", filtered_by_class: false)`
374
-
375
- Searching returns a hash whose `"matches"` key holds an array of
376
- String ids or `Mongoid::Document` objects. By default it assumes you want
377
- `Mongoid::Document` objects. The returned hash _also_ includes a key
378
- of `"search_result_metadata"` which includes the metadata provided by Meilisearch regarding
379
- your request. You'll need this for pagination if you have lots of results. To _exclude_
380
- the metadata pass `include_metadata: false` as an option.
381
- E.g. `MyModel.search("search term", include_metadata: false)`
382
-
383
- ### Useful Keyword Parameters
384
-
385
- - `ids_only`
386
- - only return matching ids. These will be an array under the `"matches"` key.
387
- - defaults to `false`
388
- - `filtered_by_class`
389
- - limit results to the class you initiated the search from. E.g. `Note.search("foo")` will only return results from the `Note` class even if there are records from other classes in the same index.
390
- - defaults to `true`
391
- - `include_metadata`
392
- - include the metadata about the search results provided by Meilisearch. If true (default) there will be a `"search_result_metadata"` key, with a hash of the Meilisearch metadata.
393
- - You'll likely need this in order to support pagination, however if you just want to return a single page worth of data, you can set this to `false` to discard it.
394
- - defaults to `true`
395
-
396
- ### Example Search Results
397
-
398
- Search results, ids only, for a class where `CLASS_PREFIXED_SEARCH_IDS=false`.
399
-
400
- ```ruby
401
- Note.search('foo', ids_only: true) # => returns
402
- {
403
- "matches" => [
404
- "64274a5d906b1d7d02c1fcc7",
405
- "643f5e1c906b1d60f9763071",
406
- "64483e63906b1d84f149717a"
407
- ],
408
- "search_result_metadata" => {
409
- "query"=>query_string,
410
- "processingTimeMs"=>1,
411
- "limit"=>50,
412
- "offset"=>0,
413
- "estimatedTotalHits"=>33,
414
- "nbHits"=>33
415
- }
416
- }
417
- ```
418
- If `CLASS_PREFIXED_SEARCH_IDS=true` the above would have ids like `"Note_64274a5d906b1d7d02c1fcc7"`
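
With prefixed ids you may want to recover the class name and the original id from each match. The sketch below does that on a hand-written sample result; `split_prefixed_id` is a hypothetical helper, and it assumes the default primary key, so the part after the last underscore is an ObjectId hex string with no underscores of its own.

```ruby
# Sample result shaped like the ids_only output, with prefixed ids.
results = {
  "matches" => [
    "Note_64274a5d906b1d7d02c1fcc7",
    "Task_64483e63906b1d84f149717a"
  ],
  "search_result_metadata" => { "limit" => 50, "offset" => 0, "estimatedTotalHits" => 2 }
}

# Hypothetical helper: split "<ClassName>_<_id>" on the LAST underscore,
# so class names that contain underscores still come back intact.
def split_prefixed_id(prefixed_id)
  class_name, _separator, original_id = prefixed_id.rpartition("_")
  [class_name, original_id]
end

# Group the original ids by the class that produced them.
ids_by_class = results["matches"]
  .map { |match| split_prefixed_id(match) }
  .group_by(&:first)
  .transform_values { |pairs| pairs.map(&:last) }

p ids_by_class
```

Each group of ids could then be handed to the matching model's finder to load the records.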
419
-
420
-
421
- Without `ids_only` you get full objects in a `matches` array.
422
-
423
-
424
- ```ruby
425
- Note.search('foo') # or Note.search('foo', ids_only: false) # => returns
426
- {
427
- "matches" => [
428
- #<Note _id: 64274a5d906b1d7d02c1fcc7, created_at: 2023-03-15 00:00:00 UTC, updated_at: 2023-03-31 21:02:21.108 UTC, title: "A note from the past", body: "a body", type: "misc", context: "dachary">,
429
- #<Note _id: 643f5e1c906b1d60f9763071, created_at: 2023-04-18 00:00:00 UTC, updated_at: 2023-04-19 03:21:00.41 UTC, title: "offline standup ", body: "onother body", type: "misc", context: "WORK">,
430
- #<Note _id: 64483e63906b1d84f149717a, created_at: 2023-04-25 00:00:00 UTC, updated_at: 2023-04-26 11:23:38.125 UTC, title: "Standup Notes (for wed)", body: "very full bodied", type: "misc", context: "WORK">
431
- ],
432
- "search_result_metadata" => {
433
- "query"=>query_string, "processingTimeMs"=>1, "limit"=>50,
434
- "offset"=>0, "estimatedTotalHits"=>33, "nbHits"=>33
435
- }
436
- }
437
- ```
438
-
439
-
440
-
441
- If `Note` records shared an index with `Task` and they both had `CLASS_PREFIXED_SEARCH_IDS=true` you'd get a result like this.
442
-
443
- ```ruby
444
- Note.search('foo', filtered_by_class: false) # => returns
445
- {
446
- "matches" => [
447
- #<Note _id: 64274a5d906b1d7d02c1fcc7, created_at: 2023-03-15 00:00:00 UTC, updated_at: 2023-03-31 21:02:21.108 UTC, title: "A note from the past", body: "a body", type: "misc", context: "dachary">,
448
- #<Note _id: 643f5e1c906b1d60f9763071, created_at: 2023-04-18 00:00:00 UTC, updated_at: 2023-04-19 03:21:00.41 UTC, title: "offline standup ", body: "onother body", type: "misc", context: "WORK">,
449
- #<Task _id: 64483e63906b1d84f149717a, created_at: 2023-04-25 00:00:00 UTC, updated_at: 2023-04-26 11:23:38.125 UTC, title: "Do the thing", body: "very full bodied", type: "misc", context: "WORK">
450
- ],
451
- "search_result_metadata" => {
452
- "query"=>query_string, "processingTimeMs"=>1, "limit"=>50,
453
- "offset"=>0, "estimatedTotalHits"=>33, "nbHits"=>33
454
- }
455
-
456
- }
457
- ```
458
-
459
- ### Custom Search Options
460
-
461
- To invoke any of Meilisearch's custom search options (see [their documentation](https://www.meilisearch.com/docs/reference/api/search)), pass them in via an options hash.
462
-
463
- `MyModel.search("search term", options: <my custom options>)`
464
-
465
- Currently the Meilisearch-ruby gem can convert keys from snake case to
466
- camel case. For example `hits_per_page` will become `hitsPerPage`.
467
- Meilisearch ultimately wants camel case (`camelCase`) parameter keys,
468
- _but_ `meilisearch-ruby` wants snake case (`snake_case`).
469
-
470
- Follow Meilisearch's documentation
471
- to see what's available and what type of options to pass it, but convert
472
- them to snake case first. Note that your
473
- options keys and values must all be simple JSON values.
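
The key mapping itself is simple to picture. The sketch below is only an illustration of the snake case to camel case correspondence; `camelize_option_keys` is a hypothetical helper, and you should not need to call anything like it yourself since `meilisearch-ruby` performs its own conversion.

```ruby
# Hypothetical illustration of the snake_case -> camelCase key mapping.
def camelize_option_keys(options)
  options.to_h do |key, value|
    head, *tail = key.to_s.split("_")
    # first word stays lowercase, every following word is capitalized
    ["#{head}#{tail.map(&:capitalize).join}", value]
  end
end

snake_options = { hits_per_page: 10, attributes_to_highlight: ["title"] }
p camelize_option_keys(snake_options).keys
# => ["hitsPerPage", "attributesToHighlight"]
```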
474
-
475
- If for some reason that still isn't enough, you can work with the
476
- meilisearch-ruby index directly via
477
- `Search::Client.instance.index(search_index_name)`
478
-
479
- #### Pagination
480
- This gem has no specific pagination handling, as there are multiple libraries for
481
- handling pagination in Ruby. Here's an example of how to get started
482
- with [Pagy](https://github.com/ddnexus/pagy).
483
-
484
- ```ruby
485
- current_page_number = 1
486
- max_items_per_page = 10
487
-
488
- search_results = Note.search('foo')
489
-
490
- Pagy.new(
491
- count: search_results["search_result_metadata"]["nbHits"],
492
- page: current_page_number,
493
- items: max_items_per_page
494
- )
495
- ```
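
Whichever pagination library you pick, the underlying arithmetic is the same. The sketch below turns a page number into the `offset` and `limit` values that appear in the search metadata. `page_window` is a hypothetical helper, and passing its `offset` and `limit` through the `options:` hash is an assumption about how you would wire it up.

```ruby
# Hypothetical helper: compute the slice of results for a given page.
def page_window(page:, per_page:, total_hits:)
  {
    offset: (page - 1) * per_page,                  # results to skip
    limit: per_page,                                # results to return
    total_pages: (total_hits.to_f / per_page).ceil  # round up for a partial last page
  }
end

window = page_window(page: 2, per_page: 10, total_hits: 33)
p window[:offset]      # => 10
p window[:total_pages] # => 4
```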
496
-
497
- ## Development
498
- To contribute to this gem:
499
-
500
-
501
- - Run `bundle install` to install all the dependencies.
502
- - Run `lefthook install` to set up [lefthook](https://github.com/evilmartians/lefthook)
503
- This will do things like make sure the tests still pass, and run rubocop before you commit.
504
- - Start hacking.
505
- - Add RSpec tests.
506
- - Add your name to CONTRIBUTORS.md
507
- - Open a pull request.
508
-
509
- NOTE: by contributing to this repository you are offering to transfer copyright to the current maintainer of the repository.
510
-
511
- To install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and the created tag, and push the `.gem` file to [rubygems.org](https://rubygems.org).
512
-
513
- Bug reports and pull requests are welcome on GitHub at
514
- https://github.com/masukomi/mongodb_meilisearch.
515
- This project is intended to be a safe, welcoming space for collaboration,
516
- and contributors are expected to adhere to the
517
- [code of conduct](https://github.com/masukomi/mongodb_meilisearch/blob/main/CODE_OF_CONDUCT.md).
518
-
519
- ## License
520
-
521
- The gem is available as open source under the terms of the
522
- [Server Side Public License](https://github.com/masukomi/mongodb_meilisearch/blob/main/LICENSE.txt). For those unfamiliar, the short version is that if you use it in a server side app you need to
523
- share all the code for that app and its infrastructure. It's like AGPL on
524
- steroids. Commercial licenses are available if you want to use this in a
525
- commercial setting but not share all your source.
526
-
527
- ## Code of Conduct
528
-
529
- Everyone interacting in this project's codebases, issue trackers,
530
- chat rooms and mailing lists is expected to follow the
531
- [code of conduct](https://github.com/masukomi/mongodb_meilisearch/blob/main/CODE_OF_CONDUCT.md).