mongodb_meilisearch 1.0.0

data/README.md ADDED
@@ -0,0 +1,435 @@
1
+ # MongodbMeilisearch
2
+
3
+ A simple gem for integrating [Meilisearch](https://www.meilisearch.com) into Ruby† applications that are backed by [MongoDB](https://www.mongodb.com/).
4
+
5
+
6
+ † It's currently limited to Rails apps, but hopefully that will change soon.
7
+
8
+ ## Installation
9
+
10
+ Install the gem and add to the application's Gemfile by executing:
11
+
12
+ $ bundle add mongodb_meilisearch
13
+
14
+ If bundler is not being used to manage dependencies, install the gem by executing:
15
+
16
+ $ gem install mongodb_meilisearch
17
+
18
+ ## Usage
19
+
20
+ A high level overview
21
+
22
+ ### Pre-Requisites
23
+
24
+ - [Meilisearch](https://www.meilisearch.com)
25
+ - [MongoDB](https://www.mongodb.com/)
26
+ - Some models that `include Mongoid::Document`
27
+
28
+ ### Configuration
29
+
30
+ Define the following variables in your environment (or `.env` file if you're using `dotenv`).
31
+ The url below is the default one Meilisearch uses when run locally.
32
+
33
+ ```bash
34
+ SEARCH_ENABLED=true
35
+ MEILISEARCH_API_KEY=<your api key here>
36
+ MEILISEARCH_URL=http://127.0.0.1:7700
37
+
38
+ # optional configuration
39
+ MEILISEARCH_TIMEOUT=10
40
+ MEILISEARCH_MAX_RETRIES=2
41
+ ```
42
+
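As a rough illustration of how these variables could be read, here is a minimal sketch. `search_settings` is a hypothetical helper and not part of this gem. The fallback values for the timeout and retry count are assumptions borrowed from the example above, not documented defaults.

```ruby
# Hypothetical helper (not part of mongodb_meilisearch): collects the search
# settings from an env-like hash. Pass ENV in a real app.
def search_settings(env = ENV)
  {
    # Assumption: only the literal string "true" turns search on.
    enabled: env.fetch("SEARCH_ENABLED", "false") == "true",
    # The local default URL mentioned above.
    url: env.fetch("MEILISEARCH_URL", "http://127.0.0.1:7700"),
    api_key: env["MEILISEARCH_API_KEY"],
    # Assumed fallbacks, taken from the example values above.
    timeout: Integer(env.fetch("MEILISEARCH_TIMEOUT", "10")),
    max_retries: Integer(env.fetch("MEILISEARCH_MAX_RETRIES", "2"))
  }
end

settings = search_settings({ "SEARCH_ENABLED" => "true", "MEILISEARCH_API_KEY" => "example-key" })
```

Taking the environment as an argument keeps the helper easy to exercise from a console or a spec without touching the real `ENV`.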
43
+ ## Model Integration
44
+
45
+ Add the following near the top of your model. Only the `extend` and `include` lines are required.
46
+ This assumes your model also includes `Mongoid::Document`.
47
+
48
+ ```ruby
49
+ extend Search::ClassMethods
50
+ include Search::InstanceMethods
51
+ ```
52
+
53
+ If you want Rails to automatically add, update, and delete records from the index, add the following to your model.
54
+
55
+ You can override these methods if needed, but you're unlikely to want to.
56
+
57
+ ```ruby
58
+ # enabled?() is controlled by the SEARCH_ENABLED environment variable
59
+ if Search::Client.instance.enabled?
60
+ after_create :add_to_search
61
+ after_update :update_in_search
62
+ after_destroy :remove_from_search
63
+ end
64
+ ```
65
+
66
+ Assuming you've done the above, a new index will be created with a name that
67
+ corresponds to your model's name, only in snake case. All of your model's
68
+ attributes will be indexed and [filterable](https://www.meilisearch.com/docs/learn/fine_tuning_results/filtering).
69
+
70
+
71
+ ### Going Beyond The Defaults
72
+ This module strives for sensible defaults, but you can override them with the
73
+ following optional constants:
74
+
75
+ * `PRIMARY_SEARCH_KEY` - a Symbol matching one of your model's attributes
76
+ that is guaranteed unique. This defaults to `_id`
77
+ * `SEARCH_INDEX_NAME` - a String - useful if you want to have records from
78
+ multiple classes come back in the same search results. This defaults to the
79
+ underscored form of the current class name.
80
+ * `SEARCH_OPTIONS` - a hash of key value pairs in JS style
81
+ - See the [meilisearch search parameter docs](https://www.meilisearch.com/docs/reference/api/search#search-parameters) for details.
82
+ - example from [meilisearch's `multi_params_spec`](https://github.com/meilisearch/meilisearch-ruby/blob/main/spec/meilisearch/index/search/multi_params_spec.rb)
83
+ ```ruby
84
+ {
85
+ attributesToCrop: ['title'],
86
+ cropLength: 2,
87
+ filter: 'genre = adventure',
88
+ attributesToHighlight: ['title'],
89
+ limit: 2
90
+ }
91
+ ```
92
+ * `SEARCH_RANKING_RULES` - an array of strings that correspond to meilisearch rules
93
+ see [meilisearch ranking rules docs](https://www.meilisearch.com/docs/learn/core_concepts/relevancy#ranking-rules)
94
+ You probably don't want to change this.
95
+
96
+
97
+ ## Indexes
98
+
99
+ Searching is limited to records that have been added to a given index. This means,
100
+ if you want to perform one search and get back records from multiple models you'll need to
101
+ add them to the same index.
102
+
103
+ To do that, add the `SEARCH_INDEX_NAME` constant to each model whose records you want in the shared index. You can name it just about anything. The important thing is
104
+ that all the models that share this index define the same `SEARCH_INDEX_NAME` value. You may want to define it in a module they all `include`.
105
+
106
+
107
+ ```ruby
108
+ SEARCH_INDEX_NAME='general_search'
109
+ ```
110
+
111
+ If multiple models are using the same index, you should also
112
+ add `CLASS_PREFIXED_SEARCH_IDS=true`. This causes the `id` field to
113
+ be `<ClassName>_<_id>`. For example, a `Note` record might have an
114
+ `id` of `"Note_64274543906b1d7d02c1fcc6"`. If undefined this will default to `false`.
115
+ This is not needed if you can absolutely guarantee that there will be
116
+ no overlap in ids amongst all the models using a shared index.
117
+
118
+ ```ruby
119
+ CLASS_PREFIXED_SEARCH_IDS=true
120
+ ```
121
+
122
+ Setting `CLASS_PREFIXED_SEARCH_IDS` to `true` will also cause the original Mongoid `_id` field to be indexed
123
+ as `original_document_id`. This is useful if you want to be able to retrieve the original record from the database.
124
+
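To make the id scheme concrete, here is a minimal sketch of the two fields described above. `search_identity` is a hypothetical helper written for illustration. It is not the gem's implementation.

```ruby
# Hypothetical helper (not part of mongodb_meilisearch): builds the id fields
# for one record, with or without class prefixing.
def search_identity(class_name, document_id, class_prefixed: false)
  raw_id = document_id.to_s
  return { "id" => raw_id } unless class_prefixed

  {
    # <ClassName>_<_id>, as described above
    "id" => "#{class_name}_#{raw_id}",
    # keeps the untouched Mongoid _id so the record can be looked up later
    "original_document_id" => raw_id
  }
end

prefixed = search_identity("Note", "64274543906b1d7d02c1fcc6", class_prefixed: true)
plain = search_identity("Note", "64274543906b1d7d02c1fcc6")
```

With prefixing on, a `Note` and a `Comment` that happen to share an `_id` still get distinct ids in a shared index.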
125
+ ### Searchable Data
126
+ You probably don't want to index _all_ the fields. For example,
127
+ unless you intend to allow users to sort by when a record was created,
128
+ there's no point in recording its `created_at` in the search index.
129
+ It'll just waste bandwidth, memory, and disk space.
130
+
131
+ Define a `SEARCHABLE_ATTRIBUTES` constant with an array of symbols to limit things.
132
+ By default these will _also_ be the fields you can filter on. Note that
133
+ Meilisearch requires there to be an `id` field and it must be a string.
134
+ If you don't define one, it will use the string version of your
135
+ document's `_id` (a `BSON::ObjectId`).
136
+
137
+ ```ruby
138
+ # explicitly define the fields you want to be searchable
139
+ # this should be an array of symbols
140
+ SEARCHABLE_ATTRIBUTES = %i[title body]
141
+ # OR explicitly define the fields you DON'T want searchable
142
+ SEARCHABLE_ATTRIBUTES = searchable_attributes - [:created_at]
143
+ ```
144
+
145
+ #### Getting Extra Specific
146
+ If your searchable data needs to be dynamically generated instead of
147
+ just taken directly from the `Mongoid::Document`'s attributes you can
148
+ define a `search_indexable_hash` method on your class. This method
149
+ must return a hash, and that hash must include the following keys:
150
+ - `"id"` - a string that uniquely identifies the record
151
+ - `"object_class"` the name of the class that this record corresponds to.
152
+
153
+ The value of `"object_class"` is usually just `self.class.name`. Additionally,
154
+ this is something specific to this gem, and not Meilisearch itself.
155
+
156
+ See `InstanceMethods#search_indexable_hash` for an example.
157
+
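For orientation, here is a minimal sketch of a custom `search_indexable_hash`. `ExampleNote` is a hypothetical stand-in built on `Struct` so the snippet runs without Mongoid. In a real model the method would sit next to the `extend` and `include` lines. The `"word_count"` field is invented purely to show a computed value.

```ruby
# ExampleNote is a hypothetical stand-in for a Mongoid::Document model.
ExampleNote = Struct.new(:id, :title, :body, keyword_init: true) do
  # Returns the hash that would be sent to the search index.
  def search_indexable_hash
    {
      "id" => id.to_s,                       # required: unique string id
      "object_class" => self.class.name,     # required by this gem
      "title" => title,
      "word_count" => body.to_s.split.size   # invented, dynamically generated value
    }
  end
end

indexable = ExampleNote.new(id: 42, title: "Standup", body: "very full bodied").search_indexable_hash
```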
158
+ #### Filterable Fields
159
+ If you'd like to only be able to filter on a subset of those then
160
+ you can define `FILTERABLE_ATTRIBUTE_NAMES` but it _must_ be a subset
161
+ of `SEARCHABLE_ATTRIBUTES`. This is enforced by the gem to guarantee
162
+ no complaints from Meilisearch. These must be symbols.
163
+
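The subset rule can be pictured with a few lines of plain Ruby. This is only a sketch of the idea. `validate_filterable!` is a hypothetical name, and how the gem actually enforces the rule is not shown in this README.

```ruby
# Hypothetical check (not the gem's code): every filterable attribute must
# also be a searchable attribute.
def validate_filterable!(searchable, filterable)
  extras = filterable - searchable
  return filterable if extras.empty?

  raise ArgumentError, "filterable but not searchable: #{extras.inspect}"
end

searchable = %i[title body]
allowed = validate_filterable!(searchable, %i[title])
```

Calling it with `%i[title created_at]` would raise, because `:created_at` is not in the searchable list.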
164
+ If you have no direct need for filterable results,
165
+ set `UNFILTERABLE_IN_SEARCH=true` in your model. This will save
166
+ on index size and speed up indexing, but you won't be able to filter
167
+ search results, and that's half of what makes Meilisearch so great.
168
+ It should be noted that even if this _is_ set to `true` this gem
169
+ will still add `"object_class"` as a filterable attribute.
170
+
171
+ This is the magic that allows you to have an index shared by multiple
172
+ models and still be able to retrieve results specifically for one.
173
+
174
+ If you decide to re-enable filtering you can remove that constant, or set it to false.
175
+ Then call the following. If `FILTERABLE_ATTRIBUTE_NAMES` is defined it will use that,
176
+ otherwise it will use whatever `.searchable_attributes` returns.
177
+
178
+ ```ruby
179
+ MyModel.set_filterable_attributes!
180
+ ```
181
+
182
+ This will cause Meilisearch to reindex all the records for that index. If you
183
+ have a large number of records this could take a while. Consider running it
184
+ on a background thread. Note that filtering is managed at the index level, not the individual
185
+ record level. By setting filterable attributes you're giving Meilisearch
186
+ guidance on what to do when indexing your data.
187
+
188
+
189
+
190
+ ### Indexing things
191
+ **Important note**: By default anything you do that updates the search index (adding, removing, or changing) happens asynchronously.
192
+
193
+ Sometimes, especially when debugging something on the console, you want to
194
+ update the index _synchronously_. The convention used in this codebase is that
195
+ the synchronous methods are the ones with the bang. Similar to how mutating
196
+ state is potentially dangerous and noted with a bang, using synchronous methods
197
+ is potentially problematic for your users, and thus noted with a bang.
198
+
199
+
200
+ For example:
201
+ ```ruby
202
+ MyModel.reindex # runs asynchronously
203
+ # vs
204
+ MyModel.reindex! # runs synchronously
205
+ ```
206
+
207
+ #### Reindexing, Adding, Updating, and Deleting
208
+
209
+ **Reindexing**
210
+ Calling `MyModel.reindex!` deletes all the existing records from the current index,
211
+ and then reindexes all the records for the current model. It's safe to run this
212
+ even if there aren't any records.
213
+
214
+ Note: reindexing behaves slightly differently than all the other methods.
215
+ It runs semi-asynchronously by default. The asynchronous form will first
216
+ attempt to _synchronously_ delete all the records from the index. If that
217
+ fails an exception will be raised. Otherwise you'd think everything was
218
+ fine when actually it had failed miserably. If you call `.reindex!`
219
+ it will be entirely synchronous.
220
+
221
+ Note: adding, updating, and deleting should happen automatically
222
+ if you've defined `after_create`, `after_update`, and `after_destroy`
223
+ as instructed above. You'll mostly only want to use these when manually
224
+ mucking with things in the console.
225
+
226
+ **Adding**
227
+ Be careful to not add documents that are already in the index.
228
+
229
+ - Add everything: `MyClass.add_all_to_search`
230
+ - Add a specific instance: `my_instance.add_to_search`
231
+ - Add a specific subset of documents: `MyClass.add_documents(documents_hashes)`
232
+ IMPORTANT: `documents_hashes` must be an array of hashes that were each generated
233
+ via `search_indexable_hash`
234
+
235
+ **Updating**
236
+ - Update everything: call `reindex`
237
+ - Update a specific instance: `my_instance.update_in_search`
238
+ - Update a specific subset of documents: `MyClass.update_documents(documents_hashes)`
239
+ IMPORTANT: `documents_hashes` must be an array of hashes that were generated
240
+ via `search_indexable_hash`. The `PRIMARY_SEARCH_KEY` (`_id` by default) will be
241
+ used to find records in the index to update.
242
+
243
+
244
+ **Deleting**
245
+ - Delete everything: `MyClass.delete_all_documents!`
246
+ - Delete a specific record: `my_instance.remove_from_search`
247
+ - Delete the index: `MyClass.delete_index!`
248
+ WARNING: if you think you should use this, you're probably
249
+ mistaken.
250
+
251
+ #### Shared indexes
252
+ Imagine you have a `Note` and a `Comment` model, sharing an index so that
253
+ you can perform a single search and have search results for both models
254
+ that are ranked by relevance.
255
+
256
+ In this case both models would define a `SEARCH_INDEX_NAME` constant with the
257
+ same value. You might want to just put this, and the other search stuff
258
+ in a common module that they all `include`.
259
+
260
+ Then, when you search you can say `Note.search("search term")` and it will _only_
261
+ bring back results for `Note` records. If you want to include results that match
262
+ `Comment` records too, you can set the optional `filtered_by_class` parameter to `false`.
263
+
264
+ For example: `Note.search("search term", filtered_by_class: false)`
265
+ will return all matching `Note` results, as well as results for _all_ the
266
+ other models that share the same index as `Note`.
267
+
268
+ ⚠ Models sharing the same index must share the same primary key field as well.
269
+ This is a known limitation of the system.
270
+
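The common-module idea can be sketched in plain Ruby. `GeneralSearch`, `SharedNote`, and `SharedComment` are hypothetical names. Real models would also include `Mongoid::Document` and the `Search` modules. The sketch relies only on Ruby's normal constant lookup through included modules, and it assumes the gem resolves these constants through the model class.

```ruby
# Hypothetical shared module holding the constants every model in the
# shared index needs to agree on.
module GeneralSearch
  SEARCH_INDEX_NAME = "general_search"
  CLASS_PREFIXED_SEARCH_IDS = true
end

class SharedNote
  include GeneralSearch
end

class SharedComment
  include GeneralSearch
end

# Constants defined in an included module resolve through the including class.
same_index = SharedNote::SEARCH_INDEX_NAME == SharedComment::SEARCH_INDEX_NAME
```

Keeping the constants in one place means the index name cannot drift apart between models.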
271
+ ## Searching
272
+
273
+ To get a list of all the matching objects in the order returned by the search engine
274
+ run `MyModel.search("search term")`. Note that this will restrict the results to
275
+ records generated by the model you're calling this on. If you have an index
276
+ that contains data from multiple models and wish to include all of them in
277
+ the results pass in the optional `filtered_by_class` parameter with a `false` value.
278
+ E.g. `MyModel.search("search term", filtered_by_class: false)`
279
+
280
+ Searching returns a hash, with a `"matches"` key and an array of
281
+ String ids, or `Mongoid::Document` objects as the value. By default it assumes you want
282
+ `Mongoid::Document` objects. The returned hash _also_ includes a key
283
+ of `"search_result_metadata"` which includes the metadata provided by Meilisearch regarding
284
+ your request. You'll need this for pagination if you have lots of results. To _exclude_
285
+ the metadata pass `include_metadata: false` as an option.
286
+ E.g. `MyModel.search("search term", include_metadata: false)`
287
+
288
+ ### Useful Keyword Parameters
289
+
290
+ - `ids_only`
291
+ - only return matching ids. These will be an array under the `"matches"` key.
292
+ - defaults to `false`
293
+ - `filtered_by_class`
294
+ - limit results to the class you initiated the search from. E.g. `Note.search("foo")` will only return results from the `Note` class even if there are records from other classes in the same index.
295
+ - defaults to `true`
296
+ - `include_metadata`
297
+ - include the metadata about the search results provided by Meilisearch. If true (default) there will be a `"search_result_metadata"` key, with a hash of the Meilisearch metadata.
298
+ - You'll likely need this in order to support pagination, however if you just want to return a single page worth of data, you can set this to `false` to discard it.
299
+ - defaults to `true`
300
+
301
+ ### Example Search Results
302
+
303
+ Search results, ids only, for a class where `CLASS_PREFIXED_SEARCH_IDS=false`.
304
+
305
+ ```ruby
306
+ Note.search('foo', ids_only: true)
307
+ # returns
308
+ {
309
+ "matches" => [
310
+ "64274a5d906b1d7d02c1fcc7",
311
+ "643f5e1c906b1d60f9763071",
312
+ "64483e63906b1d84f149717a"
313
+ ],
314
+ "search_result_metadata" => {
315
+ "query"=>query_string,
316
+ "processingTimeMs"=>1,
317
+ "limit"=>50,
318
+ "offset"=>0,
319
+ "estimatedTotalHits"=>33,
320
+ "nbHits"=>33
321
+ }
322
+ }
323
+ ```
324
+ If `CLASS_PREFIXED_SEARCH_IDS=true` the above would have ids like `"Note_64274a5d906b1d7d02c1fcc7"`
325
+
326
+
327
+ Without `ids_only` you get full objects in a `matches` array.
328
+
329
+
330
+ ```ruby
331
+ Note.search('foo') # or Note.search('foo', ids_only: false)
332
+ # returns
333
+ {
334
+ "matches" => [
335
+ #<Note _id: 64274a5d906b1d7d02c1fcc7, created_at: 2023-03-15 00:00:00 UTC, updated_at: 2023-03-31 21:02:21.108 UTC, title: "A note from the past", body: "a body", type: "misc", context: "dachary">,
336
+ #<Note _id: 643f5e1c906b1d60f9763071, created_at: 2023-04-18 00:00:00 UTC, updated_at: 2023-04-19 03:21:00.41 UTC, title: "offline standup ", body: "onother body", type: "misc", context: "WORK">,
337
+ #<Note _id: 64483e63906b1d84f149717a, created_at: 2023-04-25 00:00:00 UTC, updated_at: 2023-04-26 11:23:38.125 UTC, title: "Standup Notes (for wed)", body: "very full bodied", type: "misc", context: "WORK">
338
+ ],
339
+ "search_result_metadata" => {
340
+ "query"=>query_string, "processingTimeMs"=>1, "limit"=>50,
341
+ "offset"=>0, "estimatedTotalHits"=>33, "nbHits"=>33
342
+ }
343
+ }
344
+ ```
345
+
346
+
347
+
348
+ If `Note` records shared an index with `Task` and they both had `CLASS_PREFIXED_SEARCH_IDS=true` you'd get a result like this.
349
+
350
+ ```ruby
351
+ Note.search('foo', filtered_by_class: false)
352
+ # returns
353
+ {
354
+ "matches" => [
355
+ #<Note _id: 64274a5d906b1d7d02c1fcc7, created_at: 2023-03-15 00:00:00 UTC, updated_at: 2023-03-31 21:02:21.108 UTC, title: "A note from the past", body: "a body", type: "misc", context: "dachary">,
356
+ #<Note _id: 643f5e1c906b1d60f9763071, created_at: 2023-04-18 00:00:00 UTC, updated_at: 2023-04-19 03:21:00.41 UTC, title: "offline standup ", body: "onother body", type: "misc", context: "WORK">,
357
+ #<Task _id: 64483e63906b1d84f149717a, created_at: 2023-04-25 00:00:00 UTC, updated_at: 2023-04-26 11:23:38.125 UTC, title: "Do the thing", body: "very full bodied", type: "misc", context: "WORK">
358
+ ],
359
+ "search_result_metadata" => {
360
+ "query"=>query_string, "processingTimeMs"=>1, "limit"=>50,
361
+ "offset"=>0, "estimatedTotalHits"=>33, "nbHits"=>33
362
+ }
363
+
364
+ }
365
+ ```
366
+
367
+ ### Custom Search Options
368
+
369
+ To use any of Meilisearch's custom search options (see [their documentation](https://www.meilisearch.com/docs/reference/api/search)), pass them in via an options hash.
370
+
371
+ `MyModel.search("search term", options: <my custom options>)`
372
+
373
+ The Meilisearch-ruby gem should be able to convert keys from snake case to
374
+ camel case. For example `hits_per_page` will become `hitsPerPage`.
375
+ Meilisearch ultimately wants camel case. Follow their documentation
376
+ to see what's available and what type of options to pass it. Note that your
377
+ options keys and values must all be simple JSON values.
378
+
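The snake case to camel case mapping mentioned above looks roughly like this. `camelize_keys` is a hypothetical helper written for illustration. It is not the conversion code used by meilisearch-ruby.

```ruby
# Hypothetical illustration of the key conversion: hits_per_page -> hitsPerPage.
def camelize_keys(options)
  options.to_h do |key, value|
    camel = key.to_s.gsub(/_([a-z])/) { Regexp.last_match(1).upcase }
    [camel, value]
  end
end

converted = camelize_keys({ hits_per_page: 10, attributes_to_highlight: ["title"] })
```

Keys that are already camel case contain no underscores, so they pass through unchanged (as strings).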
379
+ If for some reason that still isn't enough, you can work with the
380
+ meilisearch-ruby index directly via
381
+ `Search::Client.instance.index(search_index_name)`
382
+
383
+ #### Pagination
384
+ This gem has no specific pagination handling, as there are multiple libraries for
385
+ handling pagination in Ruby. Here's an example of how to get started
386
+ with [Pagy](https://github.com/ddnexus/pagy).
387
+
388
+ ```ruby
389
+ current_page_number = 1
390
+ max_items_per_page = 10
391
+
392
+ search_results = Note.search('foo')
393
+
394
+ Pagy.new(
395
+ count: search_results["search_result_metadata"]["nbHits"],
396
+ page: current_page_number,
397
+ items: max_items_per_page
398
+ )
399
+ ```
400
+
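If you would rather do the arithmetic yourself, the sketch below shows one way to turn a page number into Meilisearch's `offset` and `limit` search parameters, and a hit count into a page count. Both helpers are hypothetical. Passing the resulting hash through `options:` is an assumption based on the custom search options section above.

```ruby
# Hypothetical helpers (not part of the gem).

# Translates a 1-based page number into offset/limit search parameters.
def pagination_options(page, per_page)
  { offset: (page - 1) * per_page, limit: per_page }
end

# Number of pages needed to show total_hits results.
def total_pages(total_hits, per_page)
  (total_hits.to_f / per_page).ceil
end

third_page = pagination_options(3, 10)
pages = total_pages(33, 10)

# Assumed usage: Note.search('foo', options: third_page)
```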
401
+ ## Development
402
+ To contribute to this gem:
403
+
404
+
405
+ - Run `bundle install` to install all the dependencies.
406
+ - Run `lefthook install` to set up [lefthook](https://github.com/evilmartians/lefthook)
407
+ This will do things like make sure the tests still pass, and run rubocop before you commit.
408
+ - Start hacking.
409
+ - Add RSpec tests.
410
+ - Add your name to CONTRIBUTORS.md
411
+ - Make PR.
412
+
413
+ NOTE: by contributing to this repository you are offering to transfer copyright to the current maintainer of the repository.
414
+
415
+ To install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and the created tag, and push the `.gem` file to [rubygems.org](https://rubygems.org).
416
+
417
+ Bug reports and pull requests are welcome on GitHub at
418
+ https://github.com/masukomi/mongodb_meilisearch.
419
+ This project is intended to be a safe, welcoming space for collaboration,
420
+ and contributors are expected to adhere to the
421
+ [code of conduct](https://github.com/masukomi/mongodb_meilisearch/blob/main/CODE_OF_CONDUCT.md).
422
+
423
+ ## License
424
+
425
+ The gem is available as open source under the terms of the
426
+ [Server Side Public License](https://github.com/masukomi/mongodb_meilisearch/blob/main/LICENSE.txt). For those unfamiliar, the short version is that if you use it in a server side app you need to
427
+ share all the code for that app and its infrastructure. It's like AGPL on
428
+ steroids. Commercial licenses are available if you want to use this in a
429
+ commercial setting but not share all your source.
430
+
431
+ ## Code of Conduct
432
+
433
+ Everyone interacting in this project's codebases, issue trackers,
434
+ chat rooms and mailing lists is expected to follow the
435
+ [code of conduct](https://github.com/masukomi/mongodb_meilisearch/blob/main/CODE_OF_CONDUCT.md).
data/Rakefile ADDED
@@ -0,0 +1,12 @@
1
+ # frozen_string_literal: true
2
+
3
+ require "bundler/gem_tasks"
4
+ require "rspec/core/rake_task"
5
+
6
+ RSpec::Core::RakeTask.new(:spec)
7
+
8
+ require "rubocop/rake_task"
9
+
10
+ RuboCop::RakeTask.new
11
+
12
+ task default: %i[spec rubocop]
data/lefthook.yml ADDED
@@ -0,0 +1,18 @@
1
+ skip_output:
2
+ - meta
3
+ - skips
4
+ pre-commit:
5
+ parallel: true
6
+ commands:
7
+ rubocop:
8
+ run: bundle exec rubocop -A --force-exclusion {staged_files}
9
+ stage_fixed: true
10
+ tags: linting
11
+ scripts:
12
+ "bad_words":
13
+ exclude: "Gemfile|Gemfile.lock|mongodb_meilisearch.gemspec"
14
+ runner: bash
15
+ tags: bad_words
16
+ "rb_tester":
17
+ runner: ruby
18
+ tags: testing
@@ -0,0 +1,9 @@
1
+ # frozen_string_literal: true
2
+
3
+ module MongodbMeilisearch
4
+ # The current version of MongodbMeilisearch
5
+ # @note This library will adhere to strict semantic versioning.
6
+ # See https://semver.org/
7
+ #
8
+ VERSION = "1.0.0"
9
+ end
@@ -0,0 +1,3 @@
1
+ require "search/class_methods"
2
+ require "search/instance_methods"
3
+ require "search/client"