mongodb_meilisearch 1.3.0 → 2.0.0

data/README.md DELETED
@@ -1,531 +0,0 @@
1
- # <!-- :TOC: -->
2
- - [MongodbMeilisearch](#mongodbmeilisearch)
3
- - [Installation](#installation)
4
- - [Usage](#usage)
5
- - [Model Integration](#model-integration)
6
- - [Indexes](#indexes)
7
- - [Searching](#searching)
8
- - [Development](#development)
9
- - [License](#license)
10
- - [Code of Conduct](#code-of-conduct)
11
-
12
- # MongodbMeilisearch
13
-
14
- A simple gem for integrating [Meilisearch](https://www.meilisearch.com) into Ruby† applications that are backed by [MongoDB](https://www.mongodb.com/).
15
-
16
-
17
- † It's currently limited to Rails apps, but hopefully that will change soon.
18
-
19
- ## Installation
20
-
21
- Install the gem and add to the application's Gemfile by executing:
22
-
23
- $ bundle add mongodb_meilisearch
24
-
25
- If bundler is not being used to manage dependencies, install the gem by executing:
26
-
27
- $ gem install mongodb_meilisearch
28
-
29
- ## Usage
30
-
31
- A high level overview
32
-
33
- ### Pre-Requisites
34
-
35
- - [Meilisearch](https://www.meilisearch.com)
36
- - [MongoDB](https://www.mongodb.com/)
37
- - Some models that `include Mongoid::Document`
38
-
39
- ### Configuration
40
-
41
- Define the following variables in your environment (or `.env` file if you're using `dotenv`).
42
- The url below is the default one Meilisearch uses when run locally.
43
-
44
- ```bash
45
- SEARCH_ENABLED=true
46
- MEILI_MASTER_KEY=<your api key here>
47
- MEILISEARCH_URL=http://127.0.0.1:7700
48
-
49
- ```
50
-
51
- Optional configuration
52
- ```bash
53
- MEILISEARCH_TIMEOUT=10
54
- MEILISEARCH_MAX_RETRIES=2
55
- ```
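As an illustration only, the environment variables above could be read into a settings hash like this. The helper name `search_settings` and the fallback values are assumptions made for this sketch; the gem's own parsing and defaults may differ.

```ruby
# Hypothetical helper, not part of the gem: collects the documented
# environment variables into one hash. Fallback values are illustrative.
def search_settings(env = ENV)
  {
    enabled: env.fetch("SEARCH_ENABLED", "false") == "true",
    url: env.fetch("MEILISEARCH_URL", "http://127.0.0.1:7700"),
    api_key: env["MEILI_MASTER_KEY"], # nil when unset
    timeout: Integer(env.fetch("MEILISEARCH_TIMEOUT", "10")),
    max_retries: Integer(env.fetch("MEILISEARCH_MAX_RETRIES", "2"))
  }
end

settings = search_settings("SEARCH_ENABLED" => "true")
puts settings[:enabled] # true
puts settings[:url]     # http://127.0.0.1:7700
```

Passing a plain hash instead of `ENV` keeps the sketch easy to try in a console without touching the real environment.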
56
-
57
- ## Model Integration
58
-
59
- Add the following near the top of your model. Only the `extend` and `include` lines are required.
60
- This assumes your model also includes `Mongoid::Document`
61
-
62
- ```ruby
63
- include Search::InstanceMethods
64
- extend Search::ClassMethods
65
- ```
66
-
67
- If you want Rails to automatically add, update, and delete records from the index, add the following to your model.
68
-
69
- You can override these methods if needed, but you're unlikely to want to.
70
-
71
- ```ruby
72
- # enabled?() is controlled by the SEARCH_ENABLED environment variable
73
- if Search::Client.instance.enabled?
74
- after_create :add_to_search
75
- after_update :update_in_search
76
- after_destroy :remove_from_search
77
- end
78
- ```
79
-
80
- Assuming you've done the above, a new index will be created with a name that
81
- corresponds to your model's name, only in snake case. All of your model's
82
- fields will be indexed and [filterable](https://www.meilisearch.com/docs/learn/fine_tuning_results/filtering).
83
-
84
-
85
- ### Example Rails Model
86
-
87
- Here's what it looks like when you put it all
88
- together in a Rails model with the default behavior.
89
-
90
- ```ruby
91
- class Person
92
- include Mongoid::Document
93
- extend Search::ClassMethods
94
-
95
- if Search::Client.instance.enabled?
96
- after_create :add_to_search
97
- after_update :update_in_search
98
- after_destroy :remove_from_search
99
- end
100
-
101
- # normal Mongoid attributes
102
- field :name, type: String
103
- field :description, type: String
104
- field :age, type: Integer
105
- end
106
- ```
107
-
108
- Note that _unless you configure it otherwise_ the ids of `belongs_to` objects
109
- will not be searchable. This is because they're random strings that no human's ever
110
- going to be searching for, and we don't want to waste RAM or storage.
111
-
112
- ### Going Beyond The Defaults
113
- This module strives for sensible defaults, but you can override them with the
114
- following optional constants:
115
-
116
- - `PRIMARY_SEARCH_KEY` - a Symbol matching one of your model's attributes
117
- that is guaranteed unique. This defaults to `_id`
118
- - `SEARCH_INDEX_NAME` - a String - useful if you want to have records from
119
- multiple classes come back in the same search results. This defaults to the
120
- underscored form of the current class name.
121
- - `SEARCH_OPTIONS` - a hash of key value pairs in JS style
122
- - See the [meilisearch search parameter docs](https://www.meilisearch.com/docs/reference/api/search#search-parameters) for details.
123
- - example from [meilisearch-ruby's `multi_params_spec`](https://github.com/meilisearch/meilisearch-ruby/blob/main/spec/meilisearch/index/search/multi_params_spec.rb)
124
- ```ruby
125
- {
126
- attributesToCrop: ['title'],
127
- cropLength: 2,
128
- filter: 'genre = adventure',
129
- attributesToHighlight: ['title'],
130
- limit: 2
131
- }
132
- ```
133
- - `SEARCH_RANKING_RULES` - an array of strings that correspond to meilisearch rules
134
- see [meilisearch ranking rules docs](https://www.meilisearch.com/docs/learn/core_concepts/relevancy#ranking-rules)
135
- You probably don't want to change this.
136
-
137
-
138
- ## Indexes
139
-
140
- Searching is limited to records that have been added to a given index. This means,
141
- if you want to perform one search and get back records from multiple models you'll need to
142
- add them to the same index.
143
-
144
- In order to do that add the `SEARCH_INDEX_NAME` constant to the model whose search stuff you want to end up in the same index. You can name this just about anything. The important thing is
145
- that all the models that share this index have the same `SEARCH_INDEX_NAME` constant defined. You may want to just add it to a module they all import.
146
-
147
-
148
- ```ruby
149
- SEARCH_INDEX_NAME='general_search'
150
- ```
151
-
152
- If multiple models are using the same index, you should also
153
- add `CLASS_PREFIXED_SEARCH_IDS=true`. This causes the `id` field to
154
- be `<ClassName>_<_id>`. For example, a `Note` record might have an
155
- id of `"Note_64274543906b1d7d02c1fcc6"`. If undefined this will default to `false`.
156
- This is not needed if you can absolutely guarantee that there will be
157
- no overlap in ids amongst all the models using a shared index.
158
-
159
- ```ruby
160
- CLASS_PREFIXED_SEARCH_IDS=true
161
- ```
162
-
163
- Setting `CLASS_PREFIXED_SEARCH_IDS` to `true` will also cause the original Mongoid `_id` field to be indexed
164
- as `original_document_id`. This is useful if you want to be able to retrieve the original record from the database.
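A minimal sketch of the id fields described above, assuming ids are handled as strings. `search_id_fields` is a hypothetical helper written for illustration and is not the gem's implementation.

```ruby
# Hypothetical helper, not the gem's code: builds the id-related keys
# described above from a class name and a document id string.
def search_id_fields(class_name, document_id, class_prefixed: false)
  return { "id" => document_id } unless class_prefixed

  {
    "id" => "#{class_name}_#{document_id}",
    "original_document_id" => document_id
  }
end

fields = search_id_fields("Note", "64274543906b1d7d02c1fcc6", class_prefixed: true)
puts fields["id"]                   # Note_64274543906b1d7d02c1fcc6
puts fields["original_document_id"] # 64274543906b1d7d02c1fcc6
```

Keeping the unprefixed id alongside the prefixed one is what makes it possible to look the record up in MongoDB again from a search hit.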
165
-
166
- ### Searchable Data
167
- You probably don't want to index _all_ the fields. For example,
168
- unless you intend to allow users to sort by when a record was created,
169
- there's no point in recording its `created_at` in the search index.
170
- It'll just waste bandwidth, memory, and disk space.
171
-
172
- Define a `SEARCHABLE_ATTRIBUTES` constant with an array of strings to limit things.
173
- These are the field names, and/or names of methods you wish to have indexed.
174
-
175
- By default these will _also_ be the fields you can filter on.
176
-
177
- Note that Meilisearch requires there to be an `id` field and it must be a string.
178
- If you don't define one it will use the string version of your
179
- document's `_id` (a `BSON::ObjectId`).
180
-
181
- ```ruby
182
- # explicitly define the fields you want to be searchable
183
- # this should be an array of strings (field or method names)
184
- SEARCHABLE_ATTRIBUTES = %w[title body]
185
- # OR explicitly define the fields you DON'T want searchable
186
- SEARCHABLE_ATTRIBUTES = searchable_attributes - [:created_at]
187
- ```
188
-
189
- #### Including Foreign Key data
190
- If, for example, your `Person` model `belongs_to :group`
191
- and you wanted that group's id to be searchable you would include `group_id`
192
- in the list.
193
-
194
- If you don't specify any `SEARCHABLE_ATTRIBUTES`, the default list will
195
- exclude any fields that are `Mongoid::Fields::ForeignKey` objects.
196
-
197
-
198
- #### Getting Extra Specific
199
- If your searchable data needs to be dynamically generated instead of
200
- just taken directly from the `Mongoid::Document`'s attributes or
201
- existing methods you can define a `search_indexable_hash` method on your class.
202
-
203
- Before you do, please note that as of v1.1 your `SEARCHABLE_ATTRIBUTES`
204
- constant can contain fields and method names in its array of values. Making
205
- a method for each dynamically generated thing you want in the search
206
- and then including it in SEARCHABLE_ATTRIBUTES is going to be
207
- the easiest way of accomplishing this.
208
-
209
- Your `search_indexable_hash` must return a hash, and that hash must include the following keys:
210
- - `"id"` - a string that uniquely identifies the record
211
- - `"object_class"` the name of the class that this record corresponds to.
212
-
213
- The value of `"object_class"` is usually just `self.class.name`.
214
- This is something specific to this gem, and not Meilisearch itself.
215
-
216
- See `InstanceMethods#search_indexable_hash` for an example.
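For a rough idea of the shape such a method takes, here is a plain-Ruby sketch. `Memo` is a stand-in built with `Struct` rather than a real `Mongoid::Document`, and the `"word_count"` key is a made-up example of a dynamically generated value.

```ruby
# Plain-Ruby stand-in for a model. A real model would include
# Mongoid::Document and the Search modules instead of using Struct.
Memo = Struct.new(:id, :title, :body) do
  # Returns a hash containing the two required keys, "id" and
  # "object_class", plus whatever else should be indexed.
  def search_indexable_hash
    {
      "id" => id.to_s,
      "object_class" => self.class.name,
      "title" => title,
      "word_count" => body.split.size # hypothetical generated value
    }
  end
end

memo = Memo.new("64274a5d906b1d7d02c1fcc7", "A note from the past", "a body")
p memo.search_indexable_hash
```

Every value in the hash is a plain string or integer, which keeps it straightforward to serialize as JSON for the index.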
217
-
218
- #### Filterable Fields
219
- If you'd like to only be able to filter on a subset of those then
220
- you can define `FILTERABLE_ATTRIBUTE_NAMES` but it _must_ be a subset
221
- of `SEARCHABLE_ATTRIBUTES`. This is enforced by the gem to guarantee
222
- no complaints from Meilisearch. These must be symbols.
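The subset rule can be pictured with a small check like the one below. This is a hypothetical sketch, assuming both lists hold symbols; the gem's actual enforcement and error type may differ.

```ruby
# Hypothetical check, not the gem's code: returns the filterable list
# when every entry is also searchable, and raises otherwise.
def validate_filterable!(searchable, filterable)
  extras = filterable - searchable
  return filterable if extras.empty?

  raise ArgumentError, "not in SEARCHABLE_ATTRIBUTES: #{extras.join(', ')}"
end

searchable = %i[title body context]
p validate_filterable!(searchable, %i[title context]) # [:title, :context]

begin
  validate_filterable!(searchable, %i[title created_at])
rescue ArgumentError => e
  puts e.message # not in SEARCHABLE_ATTRIBUTES: created_at
end
```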
223
-
224
- If you have no direct need for filterable results,
225
- set `UNFILTERABLE_IN_SEARCH=true` in your model. This will save
226
- on index size and speed up indexing, but you won't be able to filter
227
- search results, and that's half of what makes Meilisearch so great.
228
- It should be noted, that even if this _is_ set to `true` this gem
229
- will still add `"object_class"` as a filterable attribute.
230
-
231
- This is the magic that allows you to have an index shared by multiple
232
- models and still be able to retrieve results specifically for one.
233
-
234
- If you decide to re-enable filtering you can remove that constant, or set it to false.
235
- Then call the following. If `FILTERABLE_ATTRIBUTE_NAMES` is defined it will use that,
236
- otherwise it will use whatever `.searchable_attributes` returns.
237
-
238
- ```ruby
239
- MyModel.set_filterable_attributes! # synchronous
240
- MyModel.set_filterable_attributes # asynchronous
241
- ```
242
-
243
- This will cause Meilisearch to reindex all the records for that index. If you
244
- have a large number of records this could take a while. Consider running it
245
- on a background thread. Note that filtering is managed at the index level, not the individual
246
- record level. By setting filterable attributes you're giving Meilisearch
247
- guidance on what to do when indexing your data.
248
-
249
- Note that you will encounter problems in a shared index if you try and
250
- filter on a field that one of the contributing models doesn't have set
251
- as a filterable field, or doesn't have at all.
252
-
253
- ### Sortable Fields
254
-
255
- Sortable fields work in essentially the same way as filterable fields.
256
- By default it's the same as your `FILTERABLE_ATTRIBUTE_NAMES` which, in turn, defaults to your `SEARCHABLE_ATTRIBUTES`. You can
257
- override it by setting `SORTABLE_ATTRIBUTE_NAMES`.
258
-
259
- Note that you will encounter problems in a shared index if you try and
260
- sort on a field that one of the contributing models doesn't have set
261
- as a sortable field, or doesn't have at all.
262
-
263
- ```ruby
264
- MyModel.set_sortable_attributes! # synchronous
265
- MyModel.set_sortable_attributes # asynchronous
266
- ```
267
-
268
- ### Indexing things
269
- **Important note**: By default anything you do that updates the search index (adding, removing, or changing) happens asynchronously.
270
-
271
- Sometimes, especially when debugging something on the console, you want to
272
- update the index _synchronously_. The convention used in this codebase - and in the meilisearch-ruby library we build on - is that
273
- the synchronous methods are the ones with the bang. Similar to how mutating
274
- state is potentially dangerous and noted with a bang, using synchronous methods
275
- is potentially problematic for your users, and thus noted with a bang.
276
-
277
-
278
- For example:
279
- ```ruby
280
- MyModel.reindex # runs asynchronously
281
-
282
- ```
283
-
284
- vs
285
-
286
- ```ruby
287
- MyModel.reindex! # runs synchronously
288
- ```
289
-
290
- #### Reindexing, Adding, Updating, and Deleting
291
-
292
- **Reindexing**
293
- Calling `MyModel.reindex!` deletes all the existing records from the current index,
294
- and then reindexes all the records for the current model. It's safe to run this
295
- even if there aren't any records. In addition to re-indexing your models,
296
- it will update/set the "sortable" and "filterable" fields on the
297
- relevant indexes.
298
-
299
- Note: reindexing behaves slightly differently than all the other methods.
300
- It runs semi-asynchronously by default. The asynchronous form will first
301
- attempt to _synchronously_ delete all the records from the index. If that
302
- fails an exception will be raised. Otherwise you'd think everything was
303
- fine when actually it had failed miserably. If you call `.reindex!`
304
- it will be entirely synchronous.
305
-
306
- Note: adding, updating, and deleting should happen automatically
307
- if you've defined `after_create`, `after_update`, and `after_destroy`
308
- as instructed above. You'll mostly only want to use these when manually
309
- mucking with things in the console.
310
-
311
- **Adding**
312
- Be careful to not add documents that are already in the index.
313
-
314
- - Add everything: `MyClass.add_all_to_search`
315
- - Add a specific instance: `my_instance.add_to_search`
316
- - Add a specific subset of documents: `MyClass.add_documents(documents_hashes)`
317
- IMPORTANT: `documents_hashes` must be an array of hashes that were each generated
318
- via `search_indexable_hash`
319
-
320
- **Updating**
321
- - Update everything: call `reindex`
322
- - Update a specific instance: `my_instance.update_in_search`
323
- - Update a specific subset of documents: `MyClass.update_documents(documents_hashes)`
324
- IMPORTANT: `documents_hashes` must be an array of hashes that were generated
325
- via `search_indexable_hash`. The `PRIMARY_SEARCH_KEY` (`_id` by default) will be
326
- used to find records in the index to update.
327
-
328
-
329
- **Deleting**
330
- - Delete everything: `MyClass.delete_all_documents!`
331
- - Delete a specific record: `my_instance.remove_from_search`
332
- - Delete the index: `MyClass.delete_index!`
333
- WARNING: if you think you should use this, you're probably
334
- mistaken.
335
-
336
- #### Indexes
337
- By default every model gets its own search index. This means that
338
- `Foo.search("some text")` will only search `Foo` objects. To have a
339
- search cross objects you'll need to use a "Shared Index" (see below).
340
-
341
-
342
- The name of the index isn't important when not using shared indexes.
343
- By default a model's index is the snake cased form of the class name.
344
- For example, data for `MyWidget` models will be stored in the `my_widget` index.
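To see how a class name maps to its default index name, the following sketch approximates the snake-casing for simple, un-namespaced class names. `default_index_name` is a hypothetical helper; in a Rails app the same result would normally come from ActiveSupport's `String#underscore`.

```ruby
# Hypothetical helper: approximates snake-casing of a simple class name
# without requiring Rails. Namespaced names are not handled here.
def default_index_name(class_name)
  class_name
    .gsub(/([A-Z]+)([A-Z][a-z])/, '\1_\2') # split acronym runs: HTMLParser -> HTML_Parser
    .gsub(/([a-z\d])([A-Z])/, '\1_\2')     # split camel humps: MyWidget -> My_Widget
    .downcase
end

puts default_index_name("MyWidget") # my_widget
puts default_index_name("Note")     # note
```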
345
-
346
- #### Shared indexes
347
- Imagine you have a `Note` and a `Comment` model, sharing an index so that
348
- you can perform a single search and have search results for both models
349
- that are ranked by relevance.
350
-
351
- In this case both models would define a `SEARCH_INDEX_NAME` constant with the
352
- same value. You might want to just put this, and the other search stuff
353
- in a common module that they all `include`.
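One way to picture the common module is the plain-Ruby sketch below. The module name `GeneralSearchable` is hypothetical, and it is an assumption that the gem finds constants inherited from an included module; the sketch only shows that Ruby itself resolves them through the including class.

```ruby
# Hypothetical shared module. The constant names come from this README;
# the module name and values are illustrative.
module GeneralSearchable
  SEARCH_INDEX_NAME = "general_search"
  CLASS_PREFIXED_SEARCH_IDS = true
end

class Note
  include GeneralSearchable
end

class Comment
  include GeneralSearchable
end

# Constants from an included module resolve through the including class.
puts Note::SEARCH_INDEX_NAME            # general_search
puts Comment::CLASS_PREFIXED_SEARCH_IDS # true
```

In a real model the module would also be the natural home for the `include`/`extend` lines and the callbacks, so each model needs only a single `include`.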
354
-
355
- Then, when you search you can say `Note.search("search term")` and it will _only_
356
- bring back results for `Note` records. If you want to include results that match
357
- `Comment` records too, you can set the optional `filtered_by_class` parameter to `false`.
358
-
359
- For example: `Note.search("search term", filtered_by_class: false)`
360
- will return all matching `Note` results, as well as results for _all_ the
361
- other models that share the same index as `Note`.
362
-
363
- ⚠ Models sharing the same index must share the same primary key field as well.
364
- This is a known limitation of the system.
365
-
366
- ## Searching
367
-
368
- To get a list of all the matching objects in the order returned by the search engine
369
- run `MyModel.search("search term")`. Note that this will restrict the results to
370
- records generated by the model you're calling this on. If you have an index
371
- that contains data from multiple models and wish to include all of them in
372
- the results pass in the optional `filtered_by_class` parameter with a `false` value.
373
- E.g. `MyModel.search("search term", filtered_by_class: false)`
374
-
375
- Searching returns a hash, with the class name of the results as the key and an array of
376
- String ids, or `Mongoid::Document` objects as the value. By default it assumes you want
377
- `Mongoid::Document` objects. The returned hash _also_ includes a key
378
- of `"search_result_metadata"` which includes the metadata provided by Meilisearch regarding
379
- your request. You'll need this for pagination if you have lots of results. To _exclude_
380
- the metadata pass `include_metadata: false` as an option.
381
- E.g. `MyModel.search("search term", include_metadata: false)`
382
-
383
- ### Useful Keyword Parameters
384
-
385
- - `ids_only`
386
- - only return matching ids. These will be an array under the `"matches"` key.
387
- - defaults to `false`
388
- - `filtered_by_class`
389
- - limit results to the class you initiated the search from. E.g. `Note.search("foo")` will only return results from the `Note` class even if there are records from other classes in the same index.
390
- - defaults to `true`
391
- - `include_metadata`
392
- - include the metadata about the search results provided by Meilisearch. If true (default) there will be a `"search_result_metadata"` key, with a hash of the Meilisearch metadata.
393
- - You'll likely need this in order to support pagination, however if you just want to return a single page worth of data, you can set this to `false` to discard it.
394
- - defaults to `true`
395
-
396
- ### Example Search Results
397
-
398
- Search results, ids only, for a class where `CLASS_PREFIXED_SEARCH_IDS=false`.
399
-
400
- ```ruby
401
- Note.search('foo', ids_only: true) # => returns
402
- {
403
- "matches" => [
404
- "64274a5d906b1d7d02c1fcc7",
405
- "643f5e1c906b1d60f9763071",
406
- "64483e63906b1d84f149717a"
407
- ],
408
- "search_result_metadata" => {
409
- "query"=>query_string,
410
- "processingTimeMs"=>1,
411
- "limit"=>50,
412
- "offset"=>0,
413
- "estimatedTotalHits"=>33,
414
- "nbHits"=>33
415
- }
416
- }
417
- ```
418
- If `CLASS_PREFIXED_SEARCH_IDS=true` the above would have ids like `"Note_64274a5d906b1d7d02c1fcc7"`
419
-
420
-
421
- Without `ids_only` you get full objects in a `matches` array.
422
-
423
-
424
- ```ruby
425
- Note.search('foo') # or Note.search('foo', ids_only: false) # => returns
426
- {
427
- "matches" => [
428
- #<Note _id: 64274a5d906b1d7d02c1fcc7, created_at: 2023-03-15 00:00:00 UTC, updated_at: 2023-03-31 21:02:21.108 UTC, title: "A note from the past", body: "a body", type: "misc", context: "dachary">,
429
- #<Note _id: 643f5e1c906b1d60f9763071, created_at: 2023-04-18 00:00:00 UTC, updated_at: 2023-04-19 03:21:00.41 UTC, title: "offline standup ", body: "onother body", type: "misc", context: "WORK">,
430
- #<Note _id: 64483e63906b1d84f149717a, created_at: 2023-04-25 00:00:00 UTC, updated_at: 2023-04-26 11:23:38.125 UTC, title: "Standup Notes (for wed)", body: "very full bodied", type: "misc", context: "WORK">
431
- ],
432
- "search_result_metadata" => {
433
- "query"=>query_string, "processingTimeMs"=>1, "limit"=>50,
434
- "offset"=>0, "estimatedTotalHits"=>33, "nbHits"=>33
435
- }
436
- }
437
- ```
438
-
439
-
440
-
441
- If `Note` records shared an index with `Task` and they both had `CLASS_PREFIXED_SEARCH_IDS=true` you'd get a result like this.
442
-
443
- ```ruby
444
- Note.search('foo') #=> returns
445
- {
446
- "matches" => [
447
- #<Note _id: 64274a5d906b1d7d02c1fcc7, created_at: 2023-03-15 00:00:00 UTC, updated_at: 2023-03-31 21:02:21.108 UTC, title: "A note from the past", body: "a body", type: "misc", context: "dachary">,
448
- #<Note _id: 643f5e1c906b1d60f9763071, created_at: 2023-04-18 00:00:00 UTC, updated_at: 2023-04-19 03:21:00.41 UTC, title: "offline standup ", body: "onother body", type: "misc", context: "WORK">,
449
- #<Task _id: 64483e63906b1d84f149717a, created_at: 2023-04-25 00:00:00 UTC, updated_at: 2023-04-26 11:23:38.125 UTC, title: "Do the thing", body: "very full bodied", type: "misc", context: "WORK">
450
- ],
451
- "search_result_metadata" => {
452
- "query"=>query_string, "processingTimeMs"=>1, "limit"=>50,
453
- "offset"=>0, "estimatedTotalHits"=>33, "nbHits"=>33
454
- }
455
-
456
- }
457
- ```
458
-
459
- ### Custom Search Options
460
-
461
- To invoke any of Meilisearch's custom search options (see [their documentation](https://www.meilisearch.com/docs/reference/api/search)), pass them in via an options hash.
462
-
463
- `MyModel.search("search term", options: <my custom options>)`
464
-
465
- Currently the Meilisearch-ruby gem can convert keys from snake case to
466
- camel case. For example `hits_per_page` will become `hitsPerPage`.
467
- Meilisearch ultimately wants camel case (`camelCase`) parameter keys,
468
- _but_ `meilisearch-ruby` wants snake case (`snake_case`).
469
-
470
- Follow Meilisearch's documentation
471
- to see what's available and what type of options to pass it, but convert
472
- them to snake case first. Note that your
473
- options keys and values must all be simple JSON values.
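The key conversion described above can be illustrated with a few lines of plain Ruby. `camelize_key` and `camelize_options` are hypothetical helpers written for this sketch; they are not the `meilisearch-ruby` implementation.

```ruby
# Hypothetical helpers showing the snake_case -> camelCase mapping.
def camelize_key(key)
  head, *tail = key.to_s.split("_")
  ([head] + tail.map(&:capitalize)).join
end

def camelize_options(options)
  options.to_h { |key, value| [camelize_key(key), value] }
end

p camelize_options(hits_per_page: 10, attributes_to_crop: ["title"])
# {"hitsPerPage"=>10, "attributesToCrop"=>["title"]}
```

So an options hash written as `{ hits_per_page: 10 }` corresponds to the `hitsPerPage` parameter in Meilisearch's documentation.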
474
-
475
- If for some reason that still isn't enough, you can work with the
476
- meilisearch-ruby index directly via
477
- `Search::Client.instance.index(search_index_name)`
478
-
479
- #### Pagination
480
- This gem has no specific pagination handling, as there are multiple libraries for
481
- handling pagination in Ruby. Here's an example of how to get started
482
- with [Pagy](https://github.com/ddnexus/pagy).
483
-
484
- ```ruby
485
- current_page_number = 1
486
- max_items_per_page = 10
487
-
488
- search_results = Note.search('foo')
489
-
490
- Pagy.new(
491
- count: search_results["search_result_metadata"]["nbHits"],
492
- page: current_page_number,
493
- items: max_items_per_page
494
- )
495
- ```
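If you would rather not pull in a pagination library, the arithmetic is small enough to sketch by hand. The helpers below are hypothetical; `limit` and `offset` are standard Meilisearch search parameters, but it is an assumption that passing them through `options:` is how you would request a given page with this gem.

```ruby
# Hypothetical helpers, not part of the gem.

# Builds limit/offset search options for a 1-based page number.
def page_options(page:, per_page:)
  { limit: per_page, offset: (page - 1) * per_page }
end

# Derives the page count from the "search_result_metadata" hash,
# preferring "nbHits" and falling back to "estimatedTotalHits".
def total_pages(metadata, per_page:)
  total = metadata["nbHits"] || metadata["estimatedTotalHits"] || 0
  (total.to_f / per_page).ceil
end

metadata = { "limit" => 50, "offset" => 0, "estimatedTotalHits" => 33, "nbHits" => 33 }
p page_options(page: 3, per_page: 10) # {:limit=>10, :offset=>20}
puts total_pages(metadata, per_page: 10) # 4
```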
496
-
497
- ## Development
498
- To contribute to this gem:
499
-
500
-
501
- - Run `bundle install` to install all the dependencies.
502
- - Run `lefthook install` to set up [lefthook](https://github.com/evilmartians/lefthook)
503
- This will do things like make sure the tests still pass, and run rubocop before you commit.
504
- - Start hacking.
505
- - Add RSpec tests.
506
- - Add your name to CONTRIBUTORS.md
507
- - Make PR.
508
-
509
- NOTE: by contributing to this repository you are offering to transfer copyright to the current maintainer of the repository.
510
-
511
- To install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and the created tag, and push the `.gem` file to [rubygems.org](https://rubygems.org).
512
-
513
- Bug reports and pull requests are welcome on GitHub at
514
- https://github.com/masukomi/mongodb_meilisearch.
515
- This project is intended to be a safe, welcoming space for collaboration,
516
- and contributors are expected to adhere to the
517
- [code of conduct](https://github.com/masukomi/mongodb_meilisearch/blob/main/CODE_OF_CONDUCT.md).
518
-
519
- ## License
520
-
521
- The gem is available as open source under the terms of the
522
- [Server Side Public License](https://github.com/masukomi/mongodb_meilisearch/blob/main/LICENSE.txt). For those unfamiliar, the short version is that if you use it in a server side app you need to
523
- share all the code for that app and its infrastructure. It's like AGPL on
524
- steroids. Commercial licenses are available if you want to use this in a
525
- commercial setting but not share all your source.
526
-
527
- ## Code of Conduct
528
-
529
- Everyone interacting in this project's codebases, issue trackers,
530
- chat rooms and mailing lists is expected to follow the
531
- [code of conduct](https://github.com/masukomi/mongodb_meilisearch/blob/main/CODE_OF_CONDUCT.md).