mongodb_meilisearch 1.0.0

data/README.md ADDED
@@ -0,0 +1,435 @@
1
+ # MongodbMeilisearch
2
+
3
+ A simple gem for integrating [Meilisearch](https://www.meilisearch.com) into Ruby† applications that are backed by [MongoDB](https://www.mongodb.com/).
4
+
5
+
6
+ † It's currently limited to Rails apps, but hopefully that will change soon.
7
+
8
+ ## Installation
9
+
10
+ Install the gem and add it to the application's Gemfile by executing:
11
+
12
+ $ bundle add mongodb_meilisearch
13
+
14
+ If bundler is not being used to manage dependencies, install the gem by executing:
15
+
16
+ $ gem install mongodb_meilisearch
17
+
18
+ ## Usage
19
+
20
+ A high-level overview.
21
+
22
+ ### Pre-Requisites
23
+
24
+ - [Meilisearch](https://www.meilisearch.com)
25
+ - [MongoDB](https://www.mongodb.com/)
26
+ - Some models that `include Mongoid::Document`
27
+
28
+ ### Configuration
29
+
30
+ Define the following variables in your environment (or `.env` file if you're using `dotenv`).
31
+ The url below is the default one Meilisearch uses when run locally.
32
+
33
+ ```bash
34
+ SEARCH_ENABLED=true
35
+ MEILISEARCH_API_KEY=<your api key here>
36
+ MEILISEARCH_URL=http://127.0.0.1:7700
37
+
38
+ # optional configuration
39
+ MEILISEARCH_TIMEOUT=10
40
+ MEILISEARCH_MAX_RETRIES=2
41
+ ```
42
+
43
+ ## Model Integration
44
+
45
+ Add the following near the top of your model. Only the `extend` and `include` lines are required.
46
+ This assumes your model also includes `Mongoid::Document`.
47
+
48
+ ```ruby
49
+ extend Search::ClassMethods
50
+ include Search::InstanceMethods
51
+ ```
52
+
53
+ If you want Rails to automatically add, update, and delete records from the index, add the following to your model.
54
+
55
+ You can override these methods if needed, but you're unlikely to want to.
56
+
57
+ ```ruby
58
+ # enabled?() is controlled by the SEARCH_ENABLED environment variable
59
+ if Search::Client.instance.enabled?
60
+ after_create :add_to_search
61
+ after_update :update_in_search
62
+ after_destroy :remove_from_search
63
+ end
64
+ ```
65
+
66
+ Assuming you've done the above, a new index will be created with a name that
67
+ corresponds to your model's name, only in snake case. All of your model's
68
+ attributes will be indexed and [filterable](https://www.meilisearch.com/docs/learn/fine_tuning_results/filtering).
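
As a rough illustration of the naming rule above, here is a minimal sketch of how a class name maps to a snake case index name. `default_index_name` is a hypothetical helper written for this example; it is not part of the gem, which derives the name itself.

```ruby
# Hypothetical helper (not part of the gem): approximates the "underscored"
# form of a simple, un-namespaced class name.
def default_index_name(class_name)
  class_name
    .gsub(/([A-Z]+)([A-Z][a-z])/, '\1_\2') # split acronym runs: HTTPRequest -> HTTP_Request
    .gsub(/([a-z\d])([A-Z])/, '\1_\2')     # split word boundaries: BlogPost -> Blog_Post
    .downcase
end

puts default_index_name("Note")
puts default_index_name("BlogPost")
```

If you need a different name, `SEARCH_INDEX_NAME` (described below) overrides this default.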
69
+
70
+
71
+ ### Going Beyond The Defaults
72
+ This module strives for sensible defaults, but you can override them with the
73
+ following optional constants:
74
+
75
+ * `PRIMARY_SEARCH_KEY` - a Symbol matching one of your model's attributes
76
+ that is guaranteed unique. This defaults to `_id`
77
+ * `SEARCH_INDEX_NAME` - a String - useful if you want to have records from
78
+ multiple classes come back in the same search results. This defaults to the
79
+ underscored form of the current class name.
80
+ * `SEARCH_OPTIONS` - a hash of key value pairs in JS style
81
+ - See the [meilisearch search parameter docs](https://www.meilisearch.com/docs/reference/api/search#search-parameters) for details.
82
+ - example from [meilisearch-ruby's `multi_params_spec`](https://github.com/meilisearch/meilisearch-ruby/blob/main/spec/meilisearch/index/search/multi_params_spec.rb)
83
+ ```ruby
84
+ {
85
+ attributesToCrop: ['title'],
86
+ cropLength: 2,
87
+ filter: 'genre = adventure',
88
+ attributesToHighlight: ['title'],
89
+ limit: 2
90
+ }
91
+ ```
92
+ * `SEARCH_RANKING_RULES` - an array of strings that correspond to meilisearch rules
93
+ see [meilisearch ranking rules docs](https://www.meilisearch.com/docs/learn/core_concepts/relevancy#ranking-rules)
94
+ You probably don't want to change this.
95
+
96
+
97
+ ## Indexes
98
+
99
+ Searching is limited to records that have been added to a given index. This means,
100
+ if you want to perform one search and get back records from multiple models you'll need to
101
+ add them to the same index.
102
+
103
+ To do that, add the `SEARCH_INDEX_NAME` constant to each model whose records you want to end up in the same index. You can name this just about anything. The important thing is
104
+ that all the models that share this index have the same `SEARCH_INDEX_NAME` constant defined. You may want to just add it to a module they all include.
105
+
106
+
107
+ ```ruby
108
+ SEARCH_INDEX_NAME='general_search'
109
+ ```
110
+
111
+ If multiple models are using the same index, you should also
112
+ add `CLASS_PREFIXED_SEARCH_IDS=true`. This causes the `id` field to
113
+ be `<ClassName>_<_id>`. For example, a `Note` record might have an
114
+ `id` of `"Note_64274543906b1d7d02c1fcc6"`. If undefined this will default to `false`.
115
+ This is not needed if you can absolutely guarantee that there will be
116
+ no overlap in ids amongst all the models using a shared index.
117
+
118
+ ```ruby
119
+ CLASS_PREFIXED_SEARCH_IDS=true
120
+ ```
121
+
122
+ Setting `CLASS_PREFIXED_SEARCH_IDS` to `true` will also cause the original Mongoid `_id` field to be indexed
123
+ as `original_document_id`. This is useful if you want to be able to retrieve the original record from the database.
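
To make the `<ClassName>_<_id>` format concrete, here is a small sketch of building a prefixed id and splitting it back apart. Both helpers are hypothetical and exist only for illustration; the gem builds the prefixed value itself, and in practice you can read `original_document_id` from the indexed document instead of parsing the id.

```ruby
# Hypothetical helpers, for illustration only.

# Build the class-prefixed id described above.
def prefixed_search_id(class_name, id)
  "#{class_name}_#{id}"
end

# Split on the LAST underscore, so class names that contain
# underscores still round-trip correctly.
def split_prefixed_search_id(search_id)
  class_name, _separator, original_id = search_id.rpartition("_")
  [class_name, original_id]
end

search_id = prefixed_search_id("Note", "64274543906b1d7d02c1fcc6")
puts search_id
puts split_prefixed_search_id(search_id).inspect
```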
124
+
125
+ ### Searchable Data
126
+ You probably don't want to index _all_ the fields. For example,
127
+ unless you intend to allow users to sort by when a record was created,
128
+ there's no point in recording its `created_at` in the search index.
129
+ It'll just waste bandwidth, memory, and disk space.
130
+
131
+ Define a `SEARCHABLE_ATTRIBUTES` constant with an array of symbols to limit things.
132
+ By default these will _also_ be the fields you can filter on. Note that
133
+ Meilisearch requires there to be an `id` field and it must be a string.
134
+ If you don't define one, it will use the string version of your
135
+ document's `_id` (a `BSON::ObjectId`).
136
+
137
+ ```ruby
138
+ # explicitly define the fields you want to be searchable
139
+ # this should be an array of symbols
140
+ SEARCHABLE_ATTRIBUTES = %i[title body]
141
+ # OR explicitly define the fields you DON'T want searchable
142
+ SEARCHABLE_ATTRIBUTES = searchable_attributes - [:created_at]
143
+ ```
144
+
145
+ #### Getting Extra Specific
146
+ If your searchable data needs to be dynamically generated instead of
147
+ just taken directly from the `Mongoid::Document`'s attributes you can
148
+ define a `search_indexable_hash` method on your class. This method
149
+ must return a hash, and that hash must include the following keys:
150
+ - `"id"` - a string that uniquely identifies the record
151
+ - `"object_class"` - the name of the class that this record corresponds to.
152
+
153
+ The value of `"object_class"` is usually just `self.class.name`. Additionally,
154
+ this is something specific to this gem, and not Meilisearch itself.
155
+
156
+ See `InstanceMethods#search_indexable_hash` for an example.
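
Here is a minimal sketch of what such a method could look like. A plain Ruby class stands in for a `Mongoid::Document` so the example runs on its own, and the `"summary"` field is a made-up example of dynamically generated data. Only the `"id"` and `"object_class"` keys come from the requirements above.

```ruby
# A plain class stands in for a Mongoid::Document in this sketch.
class Note
  attr_reader :id, :title, :body

  def initialize(id:, title:, body:)
    @id = id
    @title = title
    @body = body
  end

  # Must return a hash containing at least "id" (a String)
  # and "object_class" (the class name).
  def search_indexable_hash
    {
      "id" => id.to_s,
      "object_class" => self.class.name,
      "title" => title,
      # Hypothetical derived field: the first 40 characters of the body.
      "summary" => body.to_s[0, 40]
    }
  end
end

note = Note.new(id: "64274543906b1d7d02c1fcc6", title: "Standup", body: "Discussed the release plan")
puts note.search_indexable_hash.inspect
```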
157
+
158
+ #### Filterable Fields
159
+ If you'd like to only be able to filter on a subset of those then
160
+ you can define `FILTERABLE_ATTRIBUTE_NAMES` but it _must_ be a subset
161
+ of `SEARCHABLE_ATTRIBUTES`. This is enforced by the gem to guarantee
162
+ no complaints from Meilisearch. These must be symbols.
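
If you want to sanity-check your constants before loading the model, the subset rule can be expressed as a simple array difference. This is a hypothetical check written for illustration; the gem performs its own validation and may report problems differently.

```ruby
# Hypothetical check: returns any filterable attributes that are
# NOT also searchable. An empty array means the constants are consistent.
def invalid_filterable_attributes(filterable, searchable)
  filterable - searchable
end

searchable = %i[title body type]

puts invalid_filterable_attributes(%i[type], searchable).inspect
puts invalid_filterable_attributes(%i[type created_at], searchable).inspect
```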
163
+
164
+ If you have no direct need for filterable results,
165
+ set `UNFILTERABLE_IN_SEARCH=true` in your model. This will save
166
+ on index size and speed up indexing, but you won't be able to filter
167
+ search results, and that's half of what makes Meilisearch so great.
168
+ Note that even if this _is_ set to `true` this gem
169
+ will still add `"object_class"` as a filterable attribute.
170
+
171
+ This is the magic that allows you to have an index shared by multiple
172
+ models and still be able to retrieve results specifically for one.
173
+
174
+ If you decide to re-enable filtering you can remove that constant, or set it to false.
175
+ Then call the following. If `FILTERABLE_ATTRIBUTE_NAMES` is defined it will use that,
176
+ otherwise it will use whatever `.searchable_attributes` returns.
177
+
178
+ ```ruby
179
+ MyModel.set_filterable_attributes!
180
+ ```
181
+
182
+ This will cause Meilisearch to reindex all the records for that index. If you
183
+ have a large number of records this could take a while. Consider running it
184
+ on a background thread. Note that filtering is managed at the index level, not the individual
185
+ record level. By setting filterable attributes you're giving Meilisearch
186
+ guidance on what to do when indexing your data.
187
+
188
+
189
+
190
+ ### Indexing things
191
+ **Important note**: By default anything you do that updates the search index (adding, removing, or changing) happens asynchronously.
192
+
193
+ Sometimes, especially when debugging something on the console, you want to
194
+ update the index _synchronously_. The convention used in this codebase is that
195
+ the synchronous methods are the ones with the bang. Similar to how mutating
196
+ state is potentially dangerous and noted with a bang, using synchronous methods
197
+ is potentially problematic for your users, and thus noted with a bang.
198
+
199
+
200
+ For example:
201
+ ```ruby
202
+ MyModel.reindex # runs asynchronously
203
+ # vs
204
+ MyModel.reindex! # runs synchronously
205
+ ```
206
+
207
+ #### Reindexing, Adding, Updating, and Deleting
208
+
209
+ **Reindexing**
210
+ Calling `MyModel.reindex!` deletes all the existing records from the current index,
211
+ and then reindexes all the records for the current model. It's safe to run this
212
+ even if there aren't any records.
213
+
214
+ Note: reindexing behaves slightly differently than all the other methods.
215
+ It runs semi-asynchronously by default. The asynchronous form will first
216
+ attempt to _synchronously_ delete all the records from the index. If that
217
+ fails an exception will be raised. Otherwise you'd think everything was
218
+ fine when actually it had failed miserably. If you call `.reindex!`
219
+ it will be entirely synchronous.
220
+
221
+ Note: adding, updating, and deleting should happen automatically
222
+ if you've defined `after_create`, `after_update`, and `after_destroy`
223
+ as instructed above. You'll mostly only want to use these when manually
224
+ mucking with things in the console.
225
+
226
+ **Adding**
227
+ Be careful to not add documents that are already in the index.
228
+
229
+ - Add everything: `MyClass.add_all_to_search`
230
+ - Add a specific instance: `my_instance.add_to_search`
231
+ - Add a specific subset of documents: `MyClass.add_documents(documents_hashes)`
232
+ IMPORTANT: `documents_hashes` must be an array of hashes that were each generated
233
+ via `search_indexable_hash`
234
+
235
+ **Updating**
236
+ - Update everything: call `reindex`
237
+ - Update a specific instance: `my_instance.update_in_search`
238
+ - Update a specific subset of documents: `MyClass.update_documents(documents_hashes)`
239
+ IMPORTANT: `documents_hashes` must be an array of hashes that were generated
240
+ via `search_indexable_hash`. The `PRIMARY_SEARCH_KEY` (`_id` by default) will be
241
+ used to find records in the index to update.
242
+
243
+
244
+ **Deleting**
245
+ - Delete everything: `MyClass.delete_all_documents!`
246
+ - Delete a specific record: `my_instance.remove_from_search`
247
+ - Delete the index: `MyClass.delete_index!`
248
+ WARNING: if you think you should use this, you're probably
249
+ mistaken.
250
+
251
+ #### Shared indexes
252
+ Imagine you have a `Note` and a `Comment` model, sharing an index so that
253
+ you can perform a single search and have search results for both models
254
+ that are ranked by relevance.
255
+
256
+ In this case both models would define a `SEARCH_INDEX_NAME` constant with the
257
+ same value. You might want to just put this, and the other search stuff
258
+ in a common module that they all `include`.
259
+
260
+ Then, when you search you can say `Note.search("search term")` and it will _only_
261
+ bring back results for `Note` records. If you want to include results that match
262
+ `Comment` records too, you can set the optional `filtered_by_class` parameter to `false`.
263
+
264
+ For example: `Note.search("search term", filtered_by_class: false)`
265
+ will return all matching `Note` results, as well as results for _all_ the
266
+ other models that share the same index as `Note`.
267
+
268
+ ⚠ Models sharing the same index must share the same primary key field as well.
269
+ This is a known limitation of the system.
270
+
271
+ ## Searching
272
+
273
+ To get a list of all the matching objects in the order returned by the search engine
274
+ run `MyModel.search("search term")`. Note that this will restrict the results to
275
+ records generated by the model you're calling this on. If you have an index
276
+ that contains data from multiple models and wish to include all of them in
277
+ the results pass in the optional `filtered_by_class` parameter with a `false` value.
278
+ E.g. `MyModel.search("search term", filtered_by_class: false)`
279
+
280
+ Searching returns a hash, with the class name of the results as the key and an array of
281
+ String ids, or `Mongoid::Document` objects as the value. By default it assumes you want
282
+ `Mongoid::Document` objects. The returned hash _also_ includes a key
283
+ of `"search_result_metadata"` which includes the metadata provided by Meilisearch regarding
284
+ your request. You'll need this for pagination if you have lots of results. To _exclude_
285
+ the metadata pass `include_metadata: false` as an option.
286
+ E.g. `MyModel.search("search term", include_metadata: false)`
287
+
288
+ ### Useful Keyword Parameters
289
+
290
+ - `ids_only`
291
+ - only return matching ids. These will be an array under the `"matches"` key.
292
+ - defaults to `false`
293
+ - `filtered_by_class`
294
+ - limit results to the class you initiated the search from. E.g. `Note.search("foo")` will only return results from the `Note` class even if there are records from other classes in the same index.
295
+ - defaults to `true`
296
+ - `include_metadata`
297
+ - include the metadata about the search results provided by Meilisearch. If true (default) there will be a `"search_result_metadata"` key, with a hash of the Meilisearch metadata.
298
+ - You'll likely need this in order to support pagination, however if you just want to return a single page worth of data, you can set this to `false` to discard it.
299
+ - defaults to `true`
300
+
301
+ ### Example Search Results
302
+
303
+ Search results, ids only, for a class where `CLASS_PREFIXED_SEARCH_IDS=false`.
304
+
305
+ ```ruby
306
+ Note.search('foo', ids_only: true)
307
+ # returns
308
+ {
309
+ "matches" => [
310
+ "64274a5d906b1d7d02c1fcc7",
311
+ "643f5e1c906b1d60f9763071",
312
+ "64483e63906b1d84f149717a"
313
+ ],
314
+ "search_result_metadata" => {
315
+ "query"=>query_string,
316
+ "processingTimeMs"=>1,
317
+ "limit"=>50,
318
+ "offset"=>0,
319
+ "estimatedTotalHits"=>33,
320
+ "nbHits"=>33
321
+ }
322
+ }
323
+ ```
324
+ If `CLASS_PREFIXED_SEARCH_IDS=true` the above would have ids like `"Note_64274a5d906b1d7d02c1fcc7"`
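
As a sketch of how you might consume a result like the one above, the snippet below works on a literal hash shaped like the example output, so it runs without Meilisearch. The `more_available` calculation is an assumption about how you might use the metadata, not something the gem provides.

```ruby
# A literal hash shaped like the example output above.
results = {
  "matches" => %w[64274a5d906b1d7d02c1fcc7 643f5e1c906b1d60f9763071 64483e63906b1d84f149717a],
  "search_result_metadata" => {
    "limit" => 50, "offset" => 0, "estimatedTotalHits" => 33
  }
}

ids = results.fetch("matches")
# Metadata is absent when include_metadata: false was passed, so default to {}.
metadata = results.fetch("search_result_metadata", {})

# Assumption: more results exist if we have not yet reached the estimated total.
seen = metadata.fetch("offset", 0) + ids.length
more_available = seen < metadata.fetch("estimatedTotalHits", seen)

puts "#{ids.length} ids on this page, more available: #{more_available}"
```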
325
+
326
+
327
+ Without `ids_only` you get full objects in a `matches` array.
328
+
329
+
330
+ ```ruby
331
+ Note.search('foo') # or Note.search('foo', ids_only: false)
332
+ # returns
333
+ {
334
+ "matches" => [
335
+ #<Note _id: 64274a5d906b1d7d02c1fcc7, created_at: 2023-03-15 00:00:00 UTC, updated_at: 2023-03-31 21:02:21.108 UTC, title: "A note from the past", body: "a body", type: "misc", context: "dachary">,
336
+ #<Note _id: 643f5e1c906b1d60f9763071, created_at: 2023-04-18 00:00:00 UTC, updated_at: 2023-04-19 03:21:00.41 UTC, title: "offline standup ", body: "onother body", type: "misc", context: "WORK">,
337
+ #<Note _id: 64483e63906b1d84f149717a, created_at: 2023-04-25 00:00:00 UTC, updated_at: 2023-04-26 11:23:38.125 UTC, title: "Standup Notes (for wed)", body: "very full bodied", type: "misc", context: "WORK">
338
+ ],
339
+ "search_result_metadata" => {
340
+ "query"=>query_string, "processingTimeMs"=>1, "limit"=>50,
341
+ "offset"=>0, "estimatedTotalHits"=>33, "nbHits"=>33
342
+ }
343
+ }
344
+ ```
345
+
346
+
347
+
348
+ If `Note` records shared an index with `Task` and they both had `CLASS_PREFIXED_SEARCH_IDS=true` you'd get a result like this.
349
+
350
+ ```ruby
351
+ Note.search('foo', filtered_by_class: false)
352
+ # returns
353
+ {
354
+ "matches" => [
355
+ #<Note _id: 64274a5d906b1d7d02c1fcc7, created_at: 2023-03-15 00:00:00 UTC, updated_at: 2023-03-31 21:02:21.108 UTC, title: "A note from the past", body: "a body", type: "misc", context: "dachary">,
356
+ #<Note _id: 643f5e1c906b1d60f9763071, created_at: 2023-04-18 00:00:00 UTC, updated_at: 2023-04-19 03:21:00.41 UTC, title: "offline standup ", body: "onother body", type: "misc", context: "WORK">,
357
+ #<Task _id: 64483e63906b1d84f149717a, created_at: 2023-04-25 00:00:00 UTC, updated_at: 2023-04-26 11:23:38.125 UTC, title: "Do the thing", body: "very full bodied", type: "misc", context: "WORK">
358
+ ],
359
+ "search_result_metadata" => {
360
+ "query"=>query_string, "processingTimeMs"=>1, "limit"=>50,
361
+ "offset"=>0, "estimatedTotalHits"=>33, "nbHits"=>33
362
+ }
363
+
364
+ }
365
+ ```
366
+
367
+ ### Custom Search Options
368
+
369
+ To invoke any of Meilisearch's custom search options (see [their documentation](https://www.meilisearch.com/docs/reference/api/search)), pass them in via an options hash.
370
+
371
+ `MyModel.search("search term", options: <my custom options>)`
372
+
373
+ The Meilisearch-ruby gem should be able to convert keys from snake case to
374
+ camel case. For example `hits_per_page` will become `hitsPerPage`.
375
+ Meilisearch ultimately wants camel case. Follow their documentation
376
+ to see what's available and what type of options to pass it. Note that your
377
+ options keys and values must all be simple JSON values.
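
To show what the snake case to camel case conversion amounts to, here is a standalone sketch. `camelize_key` is a hypothetical helper for illustration only; it is not the meilisearch-ruby implementation, and you should not need to call anything like it yourself.

```ruby
# Hypothetical helper: converts a snake_case key to camelCase.
def camelize_key(key)
  head, *rest = key.to_s.split("_")
  (head + rest.map(&:capitalize).join).to_sym
end

options = { hits_per_page: 10, page: 2 }
camelized = options.transform_keys { |key| camelize_key(key) }

puts camelized.keys.inspect
```

Passing either form in `options:` should be fine if the conversion behaves as described above; when in doubt, use the camel case names from the Meilisearch documentation directly.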
378
+
379
+ If for some reason that still isn't enough, you can work with the
380
+ meilisearch-ruby index directly via
381
+ `Search::Client.instance.index(search_index_name)`
382
+
383
+ #### Pagination
384
+ This gem has no specific pagination handling, as there are multiple libraries for
385
+ handling pagination in Ruby. Here's an example of how to get started
386
+ with [Pagy](https://github.com/ddnexus/pagy).
387
+
388
+ ```ruby
389
+ current_page_number = 1
390
+ max_items_per_page = 10
391
+
392
+ search_results = Note.search('foo')
393
+
394
+ Pagy.new(
395
+ count: search_results["search_result_metadata"]["nbHits"],
396
+ page: current_page_number,
397
+ items: max_items_per_page
398
+ )
399
+ ```
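
If you'd rather not pull in a pagination library, the arithmetic is small enough to do by hand. The sketch below uses two hypothetical helpers; `limit` and `offset` are standard Meilisearch search parameters, but whether you pass them through `options:` exactly like this is an assumption you should verify against your setup.

```ruby
# Hypothetical helpers, for illustration only.

# Translate a 1-based page number into limit/offset search options.
def page_window(page, per_page)
  { limit: per_page, offset: (page - 1) * per_page }
end

# Number of pages needed to show total_hits results.
def total_pages(total_hits, per_page)
  (total_hits.to_f / per_page).ceil
end

puts page_window(3, 10).inspect
puts total_pages(33, 10)
```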
400
+
401
+ ## Development
402
+ To contribute to this gem:
403
+
404
+
405
+ - Run `bundle install` to install all the dependencies.
406
+ - Run `lefthook install` to set up [lefthook](https://github.com/evilmartians/lefthook).
407
+ This will do things like make sure the tests still pass, and run rubocop before you commit.
408
+ - Start hacking.
409
+ - Add RSpec tests.
410
+ - Add your name to CONTRIBUTORS.md
411
+ - Make a PR.
412
+
413
+ NOTE: by contributing to this repository you are offering to transfer copyright to the current maintainer of the repository.
414
+
415
+ To install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and the created tag, and push the `.gem` file to [rubygems.org](https://rubygems.org).
416
+
417
+ Bug reports and pull requests are welcome on GitHub at
418
+ https://github.com/masukomi/mongodb_meilisearch.
419
+ This project is intended to be a safe, welcoming space for collaboration,
420
+ and contributors are expected to adhere to the
421
+ [code of conduct](https://github.com/masukomi/mongodb_meilisearch/blob/main/CODE_OF_CONDUCT.md).
422
+
423
+ ## License
424
+
425
+ The gem is available as open source under the terms of the
426
+ [Server Side Public License](https://github.com/masukomi/mongodb_meilisearch/blob/main/LICENSE.txt). For those unfamiliar, the short version is that if you use it in a server side app you need to
427
+ share all the code for that app and its infrastructure. It's like AGPL on
428
+ steroids. Commercial licenses are available if you want to use this in a
429
+ commercial setting but not share all your source.
430
+
431
+ ## Code of Conduct
432
+
433
+ Everyone interacting in this project's codebases, issue trackers,
434
+ chat rooms and mailing lists is expected to follow the
435
+ [code of conduct](https://github.com/masukomi/mongodb_meilisearch/blob/main/CODE_OF_CONDUCT.md).
data/Rakefile ADDED
@@ -0,0 +1,12 @@
1
+ # frozen_string_literal: true
2
+
3
+ require "bundler/gem_tasks"
4
+ require "rspec/core/rake_task"
5
+
6
+ RSpec::Core::RakeTask.new(:spec)
7
+
8
+ require "rubocop/rake_task"
9
+
10
+ RuboCop::RakeTask.new
11
+
12
+ task default: %i[spec rubocop]
data/lefthook.yml ADDED
@@ -0,0 +1,18 @@
1
+ skip_output:
2
+ - meta
3
+ - skips
4
+ pre-commit:
5
+ parallel: true
6
+ commands:
7
+ rubocop:
8
+ run: bundle exec rubocop -A --force-exclusion {staged_files}
9
+ stage_fixed: true
10
+ tags: linting
11
+ scripts:
12
+ "bad_words":
13
+ exclude: "Gemfile|Gemfile.lock|mongodb_meilisearch.gemspec"
14
+ runner: bash
15
+ tags: bad_words
16
+ "rb_tester":
17
+ runner: ruby
18
+ tags: testing
@@ -0,0 +1,9 @@
1
+ # frozen_string_literal: true
2
+
3
+ module MongodbMeilisearch
4
+ # The current version of MongodbMeilisearch
5
+ # @note This library will adhere to strict semantic versioning.
6
+ # See https://semver.org/
7
+ #
8
+ VERSION = "1.0.0"
9
+ end
@@ -0,0 +1,3 @@
1
+ require "search/class_methods"
2
+ require "search/instance_methods"
3
+ require "search/client"