redi_search 0.1.0 → 1.0.0

Files changed (55)
  1. checksums.yaml +4 -4
  2. data/README.md +516 -112
  3. data/lib/redi_search.rb +5 -2
  4. data/lib/redi_search/add.rb +70 -0
  5. data/lib/redi_search/alter.rb +30 -0
  6. data/lib/redi_search/create.rb +53 -0
  7. data/lib/redi_search/document.rb +71 -16
  8. data/lib/redi_search/index.rb +31 -26
  9. data/lib/redi_search/lazily_load.rb +65 -0
  10. data/lib/redi_search/log_subscriber.rb +4 -0
  11. data/lib/redi_search/model.rb +41 -18
  12. data/lib/redi_search/schema.rb +17 -8
  13. data/lib/redi_search/schema/text_field.rb +0 -2
  14. data/lib/redi_search/search.rb +22 -44
  15. data/lib/redi_search/search/clauses.rb +60 -31
  16. data/lib/redi_search/search/clauses/and.rb +17 -0
  17. data/lib/redi_search/search/clauses/application_clause.rb +18 -0
  18. data/lib/redi_search/search/clauses/boolean.rb +72 -0
  19. data/lib/redi_search/search/clauses/highlight.rb +47 -0
  20. data/lib/redi_search/search/clauses/in_order.rb +17 -0
  21. data/lib/redi_search/search/clauses/language.rb +23 -0
  22. data/lib/redi_search/search/clauses/limit.rb +27 -0
  23. data/lib/redi_search/search/clauses/no_content.rb +17 -0
  24. data/lib/redi_search/search/clauses/no_stop_words.rb +17 -0
  25. data/lib/redi_search/search/clauses/or.rb +23 -0
  26. data/lib/redi_search/search/clauses/return.rb +23 -0
  27. data/lib/redi_search/search/clauses/slop.rb +23 -0
  28. data/lib/redi_search/search/clauses/sort_by.rb +25 -0
  29. data/lib/redi_search/search/clauses/verbatim.rb +17 -0
  30. data/lib/redi_search/search/clauses/where.rb +66 -0
  31. data/lib/redi_search/search/clauses/with_scores.rb +17 -0
  32. data/lib/redi_search/search/result.rb +46 -0
  33. data/lib/redi_search/search/term.rb +4 -4
  34. data/lib/redi_search/spellcheck.rb +30 -29
  35. data/lib/redi_search/spellcheck/result.rb +44 -0
  36. data/lib/redi_search/version.rb +1 -1
  37. metadata +101 -31
  38. data/.gitignore +0 -11
  39. data/.rubocop.yml +0 -1757
  40. data/.travis.yml +0 -23
  41. data/Gemfile +0 -17
  42. data/Rakefile +0 -12
  43. data/bin/console +0 -8
  44. data/bin/publish +0 -58
  45. data/bin/setup +0 -8
  46. data/bin/test +0 -7
  47. data/lib/redi_search/document/converter.rb +0 -26
  48. data/lib/redi_search/error.rb +0 -6
  49. data/lib/redi_search/result/collection.rb +0 -22
  50. data/lib/redi_search/search/and_clause.rb +0 -15
  51. data/lib/redi_search/search/boolean_clause.rb +0 -72
  52. data/lib/redi_search/search/highlight_clause.rb +0 -43
  53. data/lib/redi_search/search/or_clause.rb +0 -21
  54. data/lib/redi_search/search/where_clause.rb +0 -66
  55. data/redi_search.gemspec +0 -48
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 28ddea6e54cfc064e7b19da040b9da9d6ff137b36366312ec6b0e5a436ae9a30
4
- data.tar.gz: 927872142cc1d81953b5ec1a56d3575cb0d36f0c3197779843214fbb9263fcfa
3
+ metadata.gz: 0666cc15e7fc7d132026337c916b5d233114d65378ff172d9a48ae81ed1a9eae
4
+ data.tar.gz: 575c1230885537ea9a31b753808894d187f00718fe8d0fca282cba426b75e0c8
5
5
  SHA512:
6
- metadata.gz: ba5883e93d5d28a55ad66ee100e628cb9af73f226e4547ee875bfd6caba022fe6a0e947f159e3627efd205bec9977a1752ed1d35d58e7208fd61d92f936d3d96
7
- data.tar.gz: 3753947da050d30d1d815dfb6430b9e19aa5a35bdd526ad65f65afd2d67dc6dd54ad51bebe3a8fdc949fa3483b9a459436fdfb46459de7def71f16da581a4846
6
+ metadata.gz: 0f1486c6d2a7b1a726d6c956398e693317df4efffa0c73e45b180a3acb2ac36b69284e68e3230862e6948015b1ceee0a1ac518dab6182c86e4b2c710f587cd16
7
+ data.tar.gz: c9e179c20fc545612a00fd2f9b46e5a2eae9ce1292d4635625a80abf641d726d248ee753633dfa8e292b0dba1543af410e58cf5a10380d7854c66af2c06326a1
data/README.md CHANGED
@@ -1,42 +1,45 @@
1
+ <p align="center">
2
+ <a href="https://github.com/npezza93/redi_search">
3
+ <img src="https://raw.githubusercontent.com/npezza93/redi_search/master/.github/logo.svg?sanitize=true" width="350">
4
+ </a>
5
+ </p>
6
+
1
7
  # RediSearch
2
8
 
3
9
  [![Build Status](https://travis-ci.com/npezza93/redi_search.svg?branch=master)](https://travis-ci.com/npezza93/redi_search)
4
10
  [![Test Coverage](https://api.codeclimate.com/v1/badges/c6437acac5684de2549d/test_coverage)](https://codeclimate.com/github/npezza93/redi_search/test_coverage)
5
11
  [![Maintainability](https://api.codeclimate.com/v1/badges/c6437acac5684de2549d/maintainability)](https://codeclimate.com/github/npezza93/redi_search/maintainability)
6
12
 
7
- A simple, but powerful Ruby wrapper around RediSearch,
8
- a search engine on top of Redis.
13
+ A simple, but powerful, Ruby wrapper around RediSearch, a search engine on top of
14
+ Redis.
9
15
 
10
16
  ## Installation
11
17
 
12
18
  Firstly, Redis and RediSearch need to be installed.
13
19
 
14
- You can download Redis from https://redis.io/download, and check out installation instructions [here](https://github.com/antirez/redis#installing-redis). Alternatively, on macOS or Linux you can install via Homebrew.
15
-
16
- To install RediSearch:
17
- 1. `git clone https://github.com/RedisLabsModules/RediSearch.git`
18
- 1. `cd RediSearch`
19
- 1. `mkdir build`
20
- 1. `cd build`
21
- 1. `cmake .. -DCMAKE_BUILD_TYPE=RelWithDebInfo`
22
- 1. `make`
23
- 1. `redis-server --loadmodule ./redisearch.so or load the module in your redis.conf`
20
+ You can download Redis from https://redis.io/download, and check out
21
+ installation instructions
22
+ [here](https://github.com/antirez/redis#installing-redis). Alternatively, on
23
+ macOS or Linux you can install via Homebrew.
24
24
 
25
- You can also checkout [here](https://oss.redislabs.com/redisearch/Quick_Start.html) for more detailed installation instructions. If you already have a redis-server running you can also update your redis.conf file to always load the redisearch module. (On macOS the redis.conf file can be found `/usr/local/etc/redis.conf`)
25
+ To install RediSearch, check out
26
+ [https://oss.redislabs.com/redisearch/Quick_Start.html](https://oss.redislabs.com/redisearch/Quick_Start.html).
27
+ Once you have RediSearch built, you can update your redis.conf file to always
28
+ load the redisearch module with `loadmodule /path/to/redisearch.so`. (On macOS
29
+ the redis.conf file can be found at `/usr/local/etc/redis.conf`.)
26
30
 
27
-
28
- After Redis and RediSearch are up and running, add this line to your application's Gemfile:
31
+ After Redis and RediSearch are up and running, add this line to your Gemfile:
29
32
 
30
33
  ```ruby
31
34
  gem 'redi_search'
32
35
  ```
33
36
 
34
- And then execute:
37
+ And then:
35
38
  ```bash
36
39
  ❯ bundle
37
40
  ````
38
41
 
39
- Or install it yourself as:
42
+ Or install it yourself:
40
43
  ```bash
41
44
  ❯ gem install redi_search
42
45
  ```
@@ -46,9 +49,10 @@ and require it:
46
49
  require 'redi_search'
47
50
  ```
48
51
 
49
- ## Usage
52
+ Once the gem is installed and required you'll need to configure it with your
53
+ Redis configuration. If you're on Rails, this should go in an initializer
54
+ (`config/initializers/redi_search.rb`)
50
55
 
51
- ### Configuration
52
56
  ```ruby
53
57
  RediSearch.configure do |config|
54
58
  config.redis_config = {
@@ -58,108 +62,341 @@ RediSearch.configure do |config|
58
62
  end
59
63
  ```
60
64
 
61
- ### Index
65
+ ## Table of Contents
66
+ - [Preface](#preface)
67
+ - [Schema](#schema)
68
+ - [Document](#document)
69
+ - [Index](#index)
70
+ - [Searching](#searching)
71
+ - [Spellcheck](#spellcheck)
72
+ - [Rails Integration](#rails-integration)
73
+
74
+
75
+ ## Preface
76
+ RediSearch revolves around a search index, so let's start with
77
+ defining what a search index is. According to [Swiftype](https://swiftype.com):
78
+ > A search index is a body of structured data that a search engine refers to
79
+ > when looking for results that are relevant to a specific query. Indexes are a
80
+ > critical piece of any search system, since they must be tailored to the
81
+ > specific information retrieval method of the search engine’s algorithm. In
82
+ > this manner, the algorithm and the index are inextricably linked to one
83
+ > another. Index can also be used as a verb (indexing), referring to the process
84
+ > of collecting unstructured website data in a structured format that is
85
+ > tailored for the search engine algorithm.
86
+ >
87
+ > One way to think about indices is to consider the following analogy between a
88
+ > search infrastructure and an office filing system. Imagine you hand an intern
89
+ > a stack of thousands of pieces of paper (documents) and tell them to organize
90
+ > these pieces of paper in a filing cabinet (index) to help the company find
91
+ > information more efficiently. The intern will first have to sort through the
92
+ > papers and get a sense of all the information contained within them, then they
93
+ > will have to decide on a system for arranging them in the filing cabinet, then
94
+ > finally they’ll need to decide what is the most effective manner for searching
95
+ > through and selecting from the files once they are in the cabinet. In this
96
+ > example, the process of organizing and filing the papers corresponds to the
97
+ > process of indexing website content, and the method for searching across these
98
+ > organized files and finding those that are most relevant corresponds to the
99
+ > search algorithm.
100
+
101
+
102
+ ## Schema
103
+
104
+ This defines the fields and the properties of those fields in the index. A
105
+ schema is a hash, with field names as the keys, and the field type (and options)
106
+ as the value. Each field can be one of four types: geo, numeric, tag, or text
107
+ and can have many options. A simple example of a schema is:
108
+ ```ruby
109
+ { first_name: :text, last_name: :text }
110
+ ```
111
+
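As an illustration of the two schema shapes described above (the `:text` shorthand and the nested hash with options), here is a minimal sketch that expands both into one uniform structure. `normalize_schema` and `FIELD_TYPES` are hypothetical names introduced for this example; this is not how the gem itself parses a schema.

```ruby
# Hypothetical helper, not part of redi_search: expands the shorthand
# `{ name: :text }` and the long form `{ name: { text: { weight: 2 } } }`
# into a uniform { type:, options: } entry per field.
FIELD_TYPES = %i[geo numeric tag text].freeze

def normalize_schema(schema)
  schema.each_with_object({}) do |(field, definition), normalized|
    type, options =
      if definition.is_a?(Hash)
        definition.first   # e.g. [:text, { weight: 2 }]
      else
        [definition, {}]   # e.g. [:text, {}]
      end

    unless FIELD_TYPES.include?(type)
      raise ArgumentError, "unknown field type: #{type.inspect}"
    end

    normalized[field] = { type: type, options: options }
  end
end

normalize_schema({ first_name: :text, price: { numeric: { sortable: true } } })
# => { first_name: { type: :text, options: {} },
#      price: { type: :numeric, options: { sortable: true } } }
```

Rejecting unknown types up front mirrors the statement that each field must be one of the four supported types.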
112
+ The supported options for each type are as follows:
113
+
114
+ ##### Text field
115
+ With no options: `{ name: :text }`
116
+
117
+ <details>
118
+ <summary>Options</summary>
119
+ <ul>
120
+ <li>
121
+ <b>weight</b> (default: 1.0)
122
+ <ul>
123
+ <li>Declares the importance of this field when calculating result accuracy. This is a multiplication factor.</li>
124
+ <li>Ex: <code>{ name: { text: { weight: 2 } } }</code></li>
125
+ </ul>
126
+ </li>
127
+ <li>
128
+ <b>phonetic</b>
129
+ <ul>
130
+ <li>Will perform phonetic matching on field in searches by default. The obligatory {matcher} argument specifies the phonetic algorithm and language used. The following matchers are supported:
131
+ <ul>
132
+ <li>dm:en - Double Metaphone for English</li>
133
+ <li>dm:fr - Double Metaphone for French</li>
134
+ <li>dm:pt - Double Metaphone for Portuguese</li>
135
+ <li>dm:es - Double Metaphone for Spanish</li>
136
+ </ul>
137
+ </li>
138
+ <li>
139
+ Ex: <code>{ name: { text: { phonetic: 'dm:en' } } }</code>
140
+ </li>
141
+ </ul>
142
+ </li>
143
+ <li>
144
+ <b>sortable</b> (default: false)
145
+ <ul>
146
+ <li>Allows the user to later sort the results by the value of this field (this adds memory overhead so do not declare it on large text fields).</li>
147
+ <li>Ex: <code>{ name: { text: { sortable: true } } }</code></li>
148
+ </ul>
149
+ </li>
150
+ <li>
151
+ <b>no_index</b> (default: false)
152
+ <ul>
153
+ <li>Field will not be indexed. This is useful in conjunction with <code>sortable</code>, to create fields whose update using PARTIAL will not cause full reindexing of the document. If a field has <code>no_index</code> and doesn't have <code>sortable</code>, it will just be ignored by the index.</li>
154
+ <li>Ex: <code>{ name: { text: { no_index: true } } }</code></li>
155
+ </ul>
156
+ </li>
157
+ <li>
158
+ <b>no_stem</b> (default: false)
159
+ <ul>
160
+ <li>Disable stemming when indexing its values. This may be ideal for things like proper names.</li>
161
+ <li>Ex: <code>{ name: { text: { no_stem: true } } }</code></li>
162
+ </ul>
163
+ </li>
164
+ </ul>
165
+ </details>
166
+
167
+ ##### Numeric field
168
+ With no options: `{ price: :numeric }`
169
+
170
+ <details>
171
+ <summary>Options</summary>
172
+ <ul>
173
+ <li>
174
+ <b>sortable</b> (default: false)
175
+ <ul>
176
+ <li>Allows the user to later sort the results by the value of this field (this adds memory overhead so do not declare it on large text fields).</li>
177
+ <li>Ex: <code>{ id: { numeric: { sortable: true } } }</code></li>
178
+ </ul>
179
+ </li>
180
+ <li>
181
+ <b>no_index</b> (default: false)
182
+ <ul>
183
+ <li>Field will not be indexed. This is useful in conjunction with <code>sortable</code>, to create fields whose update using PARTIAL will not cause full reindexing of the document. If a field has <code>no_index</code> and doesn't have <code>sortable</code>, it will just be ignored by the index.</li>
184
+ <li>Ex: <code>{ id: { numeric: { no_index: true } } }</code></li>
185
+ </ul>
186
+ </li>
187
+ </ul>
188
+ </details>
189
+
190
+ ##### Tag field
191
+ With no options: `{ tag: :tag }`
192
+
193
+ <details>
194
+ <summary>Options</summary>
195
+ <ul>
196
+ <li>
197
+ <b>sortable</b> (default: false)
198
+ <ul>
199
+ <li>Allows the user to later sort the results by the value of this field (this adds memory overhead so do not declare it on large text fields).</li>
200
+ <li>Ex: <code>{ tag: { tag: { sortable: true } } }</code></li>
201
+ </ul>
202
+ </li>
203
+ <li>
204
+ <b>no_index</b> (default: false)
205
+ <ul>
206
+ <li>Field will not be indexed. This is useful in conjunction with <code>sortable</code>, to create fields whose update using PARTIAL will not cause full reindexing of the document. If a field has <code>no_index</code> and doesn't have <code>sortable</code>, it will just be ignored by the index.</li>
207
+ <li>Ex: <code>{ tag: { tag: { no_index: true } } }</code></li>
208
+ </ul>
209
+ </li>
210
+ <li>
211
+ <b>separator</b> (default: ",")
212
+ <ul>
213
+ <li>Indicates how the text contained in the field is to be split into individual tags. The default is <code>,</code>. The value must be a single character.</li>
214
+ <li>Ex: <code>{ tag: { tag: { separator: ',' } } }</code></li>
215
+ </ul>
216
+ </li>
217
+ </ul>
218
+ </details>
219
+
220
+ ##### Geo field
221
+ With no options: `{ place: :geo }`
222
+
223
+ <details>
224
+ <summary>Options</summary>
225
+ <ul>
226
+ <li>
227
+ <b>sortable</b> (default: false)
228
+ <ul>
229
+ <li>Allows the user to later sort the results by the value of this field (this adds memory overhead so do not declare it on large text fields).</li>
230
+ <li>Ex: <code>{ place: { geo: { sortable: true } } }</code></li>
231
+ </ul>
232
+ </li>
233
+ <li>
234
+ <b>no_index</b> (default: false)
235
+ <ul>
236
+ <li>Field will not be indexed. This is useful in conjunction with <code>sortable</code>, to create fields whose update using PARTIAL will not cause full reindexing of the document. If a field has <code>no_index</code> and doesn't have <code>sortable</code>, it will just be ignored by the index.</li>
237
+ <li>Ex: <code>{ place: { geo: { no_index: true } } }</code></li>
238
+ </ul>
239
+ </li>
240
+ </ul>
241
+ </details>
242
+
243
+ ## Document
244
+
245
+ A `Document` is the Ruby representation of a RediSearch document.
246
+
247
+ You can fetch a `Document` using the `.get` or `.mget` class methods.
248
+ - `get(index, document_id)` fetches a single document in an `Index` for a given
249
+ `document_id`.
250
+ - `mget(index, *document_ids)` fetches a collection of `Document`s
251
+ in an `Index` for the given `document_ids`.
252
+
253
+ You can also make a `Document` instance using the
254
+ `.for_object(index, record, serializer: nil, only: [])` class method. It takes
255
+ an `Index` instance and a Ruby object. That object must respond to all the
256
+ fields specified in the index's schema; alternatively, pass a serializer class that accepts
257
+ the object and responds to all the fields specified in the index's schema.
258
+ `only` accepts an array of fields from the schema and limits the fields that are
259
+ passed to the `Document`.
260
+
261
+ Once you have an instance of a `Document`, it responds to all the fields
262
+ specified in the index's schema as methods and `document_id`. `document_id` is
263
+ automatically prepended with the index's name, unless it already is, to ensure
264
+ uniqueness. We prepend the index name because if you have two documents with the
265
+ same id in different indexes we don't want the documents to override each other.
266
+ There is also a `#document_id_without_index` method which removes the prepended
267
+ index name.
268
+
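The prefixing rule described above can be sketched in a few lines. Both helper names below are hypothetical, and the separator-free `"#{index_name}#{id}"` format is an assumption made for illustration; the gem's actual id format may differ.

```ruby
# Hypothetical sketch of the prefixing rule. The exact id format
# ("#{index_name}#{id}", no separator) is an assumption.
def prefixed_document_id(index_name, id)
  id = id.to_s
  # Leave ids alone when they already carry the index name.
  id.start_with?(index_name.to_s) ? id : "#{index_name}#{id}"
end

# Inverse operation: strip the index name again if it is present.
def document_id_without_index(index_name, document_id)
  document_id.to_s.delete_prefix(index_name.to_s)
end

prefixed_document_id("user_idx", 10039)                 # => "user_idx10039"
prefixed_document_id("user_idx", "user_idx10039")       # => "user_idx10039"
document_id_without_index("user_idx", "user_idx10039")  # => "10039"
```

Because the prefix is only added when missing, applying the helper twice gives the same result as applying it once.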
269
+ Finally there is a `#del` method that will remove the document from the index.
270
+ It optionally accepts a `delete_document` named argument that signifies whether
271
+ the document should be completely removed from the Redis instance vs just the
272
+ index.
273
+
274
+
275
+ ## Index
276
+
277
+ To initialize an index, pass the name of the index as a string or symbol and the
278
+ schema.
62
279
 
63
- All actions revolve around indexes. To instantiate one:
64
280
  ```ruby
65
281
  RediSearch::Index.new(name_of_index, schema)
66
282
  ```
67
- The name is a string identifying the index and the schema is the argument is a hash that defines all the fields in an index. Each field can be one of four types: geo, numeric, tag, or text.
68
-
69
- #### Text field options
70
- - *weight* (default: 1.0)
71
- - Declares the importance of this field when calculating result accuracy. This is a multiplication factor.
72
- - Ex: `{ name: { text: { weight: 2 } } }`
73
- - *phonetic*
74
- - Will perform phonetic matching on field in searches by default. The obligatory {matcher} argument specifies the phonetic algorithm and language used. The following matchers are supported:
75
- - dm:en - Double Metaphone for English
76
- - dm:fr - Double Metaphone for French
77
- - dm:pt - Double Metaphone for Portuguese
78
- - dm:es - Double Metaphone for Spanish
79
- - Ex: `{ name: { text: { phonetic: 'dm:en' } } }`
80
- - *sortable* (default: false)
81
- - Allows the user to later sort the results by the value of this field (this adds memory overhead so do not declare it on large text fields).
82
- - Ex: `{ name: { text: { sortable: true } } }`
83
- - *no_index* (default: false)
84
- - Field will not be indexed. This is useful in conjunction with `sortable`, to create fields whose update using PARTIAL will not cause full reindexing of the document. If a field has `no_index` and doesn't have `sortable`, it will just be ignored by the index.
85
- - Ex: `{ name: { text: { no_index: true } } }`
86
- - *no_stem* (default: false)
87
- - Disable stemming when indexing its values. This may be ideal for things like proper names.
88
- - Ex: `{ name: { text: { no_stem: true } } }`
89
-
90
- #### Numeric field options
91
- - *sortable* (default: false)
92
- - Allows the user to later sort the results by the value of this field (this adds memory overhead so do not declare it on large text fields).
93
- - Ex: `{ id: { numeric: { sortable: true } } }`
94
- - *no_index* (default: false)
95
- - Field will not be indexed. This is useful in conjunction with `sortable`, to create fields whose update using PARTIAL will not cause full reindexing of the document. If a field has `no_index` and doesn't have `sortable`, it will just be ignored by the index.
96
- - Ex: `{ id: { numeric: { no_index: true } } }`
97
-
98
- #### Tag field options
99
- - *sortable* (default: false)
100
- - Allows the user to later sort the results by the value of this field (this adds memory overhead so do not declare it on large text fields).
101
- - Ex: `{ tag: { tag: { sortable: true } } }`
102
- - *no_index* (default: false)
103
- - Field will not be indexed. This is useful in conjunction with `sortable`, to create fields whose update using PARTIAL will not cause full reindexing of the document. If a field has `no_index` and doesn't have `sortable`, it will just be ignored by the index.
104
- - Ex: `{ tag: { tag: { no_index: true } } }`
105
- - *separator* (default: ",")
106
- - Indicates how the text contained in the field is to be split into individual tags. The default is ,. The value must be a single character.
107
- - Ex: `{ tag: { tag: { separator: ',' } } }`
108
-
109
- #### Geo field options
110
- - *sortable* (default: false)
111
- - Allows the user to later sort the results by the value of this field (this adds memory overhead so do not declare it on large text fields).
112
- - Ex: `{ place: { geo: { sortable: true } } }`
113
- - *no_index* (default: false)
114
- - Field will not be indexed. This is useful in conjunction with `sortable`, to create fields whose update using PARTIAL will not cause full reindexing of the document. If a field has `no_index` and doesn't have `sortable`, it will just be ignored by the index.
115
- - Ex: `{ place: { geo: { no_index: true } } }`
116
-
117
- Some of the commands that are available on an index are as follows:
118
- - *create*
119
- - creates the index in the Redis instance, returns a boolean. Has an accompanying bang method that will raise an exception upon failure.
120
- - *drop*
121
- - drops the index from the Redis instance, returns a boolean. Has an accompanying bang method that will raise an exception upon failure.
122
- - *exist?*
123
- - Returns a boolean signifying indexes existence.
124
- - *info*
125
- - Returns a hash with all the information for the index
126
- - *fields*
127
- - Returns an array of field names in the index
128
- - *add*
129
- - Takes an object as the first argument and a second argument that is a score (a value between 0.0 and 1.0). The object passed must respond to all the fields in the schema. Has an accompanying bang method that will raise an exception upon failure.
130
- - *add_multiple!*
131
- - Takes an array of objects that respond to all the fields in the schema. This provides a more performant way to add multiple documents to the index.
132
-
133
- ### Searching
134
-
135
- Searching is initiated off an `RediSearch::Index` object.
283
+
284
+ #### Available Commands
285
+
286
+ - `create`
287
+ - Creates the index in the Redis instance, returns a boolean. Has an
288
+ accompanying bang method that will raise an exception upon failure. Will
289
+ return `false` if the index already exists. Accepts a few options:
290
+ - `max_text_fields: #{true || false}`
291
+ - For efficiency, RediSearch encodes indexes differently if they are
292
+ created with less than 32 text fields. This option forces RediSearch
293
+ to encode indexes as if there were more than 32 text fields, which
294
+ allows you to add additional fields (beyond 32) using `alter`.
295
+ - `no_offsets: #{true || false}`
296
+ - If set, we do not store term offsets for documents (saves memory, does
297
+ not allow exact searches or highlighting). Implies `no_highlight`.
298
+ - `temporary: #{seconds}`
299
+ - Create a lightweight temporary index which will expire after `seconds`
300
+ seconds of inactivity. The internal idle timer is reset whenever the
301
+ index is searched or added to. Because such indexes are lightweight,
302
+ you can create thousands of such indexes without negative performance
303
+ implications.
304
+ - `no_highlight: #{true || false}`
305
+ - Conserves storage space and memory by disabling highlighting support.
306
+ If set, we do not store corresponding byte offsets for term positions.
307
+ `no_highlight` is also implied by `no_offsets`.
308
+ - `no_fields: #{true || false}`
309
+ - If set, we do not store field bits for each term. Saves memory, does
310
+ not allow filtering by specific fields.
311
+ - `no_frequencies: #{true || false}`
312
+ - If set, we avoid saving the term frequencies in the index. This saves
313
+ memory but does not allow sorting based on the frequencies of a given
314
+ term within the document.
315
+ - `drop`
316
+ - Drops the index from the Redis instance, returns a boolean. Has an
317
+ accompanying bang method that will raise an exception upon failure. Will
318
+ return `false` if the index has already been dropped.
319
+ - `exist?`
320
+ - Returns a boolean signifying index existence.
321
+ - `info`
322
+ - Returns a struct object with all the information about the index.
323
+ - `fields`
324
+ - Returns an array of the field names in the index.
325
+ - `add(document, score: 1.0, replace: {}, language: nil, no_save: false)`
326
+ - Takes a `Document` object and options. Has an
327
+ accompanying bang method that will raise an exception upon failure.
328
+ - `score` -> The document's rank, a value between 0.0 and 1.0
329
+ - `language` -> Use a stemmer for the supplied language during indexing.
330
+ - `no_save` -> Don't save the actual document in the database and only index it.
331
+ - `replace` -> Accepts a boolean or a hash. If a truthy value is passed, we
332
+ will do an UPSERT style insertion - and delete an older version of the
333
+ document if it exists.
334
+ - `replace: { partial: true }` -> Allows you to not have to specify all
335
+ fields for reindexing. Fields not given to the command will be loaded from
336
+ the current version of the document.
337
+ - `add_multiple!(documents, score: 1.0, replace: {}, language: nil, no_save: false)`
338
+ - Takes an array of `Document` objects. This provides a more performant way to
339
+ add multiple documents to the index. Accepts the same options as `add`.
340
+ - `del(document, delete_document: false)`
341
+ - Removes a document from the index. `delete_document` signifies whether the
342
+ document should be completely removed from the Redis instance vs just the
343
+ index.
344
+ - `document_count`
345
+ - Returns the number of documents in the index
346
+ - `alter(field_name, schema)`
347
+ - Adds a new field to the index. Ex: `index.alter(:first_name, text: { phonetic: "dm:en" })`
348
+ - `reindex(documents, recreate: false, **options)`
349
+ - If `recreate` is `true` the index will be dropped and recreated
350
+ - `options` accepts the same options as `add`
351
+
352
+
353
+ ## Searching
354
+
355
+ Searching is initiated off a `RediSearch::Index` instance with clauses that can
356
+ be chained together. When searching, an array of `Document`s is returned
357
+ which has public reader methods for all the schema fields and a `document_id`
358
+ method which returns the id of the document prefixed with the index name.
359
+
136
360
  ```ruby
137
361
  main ❯ index = RediSearch::Index.new("user_idx", name: { text: { phonetic: "dm:en" } })
362
+ main ❯ index.add RediSearch::Document.for_object(index, User.new("10039", "Gene", "Volkman"))
363
+ main ❯ index.add RediSearch::Document.for_object(index, User.new("9998", "Jeannie", "Ledner"))
138
364
  main ❯ index.search("john")
139
365
  RediSearch (1.1ms) FT.SEARCH user_idx `john`
140
366
  => [#<RediSearch::Document:0x00007f862e241b78 first: "Gene", last: "Volkman", document_id: "10039">,
141
367
  #<RediSearch::Document:0x00007f862e2417b8 first: "Jeannie", last: "Ledner", document_id: "9998">]
142
368
  ```
143
- - Simple phrase query - hello AND world
369
+ **Simple phrase query** - `hello AND world`
144
370
  ```ruby
145
371
  index.search("hello").and("world")
146
372
  ```
147
- - Exact phrase query - hello FOLLOWED BY world
373
+ **Exact phrase query** - `hello FOLLOWED BY world`
148
374
  ```ruby
149
375
  index.search("hello world")
150
376
  ```
151
- - Union: documents containing either hello OR world
377
+ **Union query** - `hello OR world`
152
378
  ```ruby
153
379
  index.search("hello").or("world")
154
380
  ```
155
- - Not: documents containing hello but not world
381
+ **Negation query** - `hello AND NOT world`
156
382
  ```ruby
157
383
  index.search("hello").and.not("world")
158
384
  ```
159
385
 
386
+ Complex intersections and unions:
387
+ ```ruby
388
+ # Intersection of unions
389
+ index.search(index.search("hello").or("halo")).and(index.search("world").or("werld"))
390
+ # Negation of union
391
+ index.search("hello").and.not(index.search("world").or("werld"))
392
+ # Union inside phrase
393
+ index.search("hello").and(index.search("world").or("werld"))
394
+ ```
395
+
160
396
  All terms support a few options that can be applied.
161
397
 
162
- - Prefix Queries: match all terms starting with a prefix
398
+ **Prefix terms**: match all terms starting with a prefix.
399
+ (Akin to `LIKE 'term%'` in SQL.)
163
400
  ```ruby
164
401
  index.search("hel", prefix: true)
165
402
  index.search("hello worl", prefix: true)
@@ -167,29 +404,112 @@ index.search("hel", prefix: true).and("worl", prefix: true)
167
404
  index.search("hello").and.not("worl", prefix: true)
168
405
  ```
169
406
 
170
- - Optional terms with higher priority to ones containing more matches
407
+ **Optional terms**: documents containing the optional terms will rank higher
408
+ than those without
171
409
  ```ruby
172
410
  index.search("foo").and("bar", optional: true).and("baz", optional: true)
173
411
  ```
174
412
 
175
- - Fuzzy matches are performed based on Levenshtein distance (LD). The maximum Levenshtein distance supported is 3.
413
+ **Fuzzy terms**: matches are performed based on Levenshtein distance (LD). The
414
+ maximum Levenshtein distance supported is 3.
176
415
  ```ruby
177
416
  index.search("zuchini", fuzziness: 1)
178
417
  ```
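For readers unfamiliar with Levenshtein distance, the sketch below computes it with the standard dynamic-programming recurrence. It is only meant to show what a `fuzziness` of 1, 2, or 3 allows; RediSearch performs the matching server-side, and `levenshtein` is a hypothetical helper, not part of the gem.

```ruby
# Illustrative only: counts the single-character insertions, deletions and
# substitutions needed to turn string `a` into string `b`.
def levenshtein(a, b)
  previous = (0..b.length).to_a # distances from "" to each prefix of b

  a.each_char.with_index(1) do |char_a, i|
    current = [i] # distance from a[0, i] to ""
    b.each_char.with_index(1) do |char_b, j|
      cost = char_a == char_b ? 0 : 1
      current << [
        previous[j] + 1,       # deletion
        current[j - 1] + 1,    # insertion
        previous[j - 1] + cost # substitution (or match)
      ].min
    end
    previous = current
  end

  previous.last
end

levenshtein("zuchini", "zucchini") # => 1
```

With a distance of 1, the misspelled `zuchini` is within reach of `zucchini`, which is why `fuzziness: 1` is enough in the example above.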
179
418
 
180
- - Complex intersections and unions
419
+ Search terms can also be scoped to specific fields using a `where` clause:
181
420
  ```ruby
182
- # Intersection of unions
183
- index.search(index.search("hello").or("halo")).and(index.search("world").or("werld"))
184
- # Negation of union
185
- index.search("hello").and.not(index.search("world").or("werld"))
186
- # Union inside phrase
187
- index.search("hello").and(index.search("world").or("werld"))
421
+ # Simple field specific query
422
+ index.search.where(name: "john")
423
+ # Using where with options
424
+ index.search.where(first: "jon", fuzziness: 1)
425
+ # Using where with more complex query
426
+ index.search.where(first: index.search("bill").or("bob"))
427
+ ```
428
+
429
+ Searching for numeric fields takes a range:
430
+ ```ruby
431
+ index.search.where(number: 0..100)
432
+ # Searching to infinity
433
+ index.search.where(number: 0..Float::INFINITY)
434
+ index.search.where(number: -Float::INFINITY..0)
188
435
  ```
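RediSearch's query syntax expresses numeric filters as `@field:[min max]`, with `-inf` and `+inf` for open ends and a leading `(` for an exclusive bound. The sketch below shows one way a Ruby range could be translated into that syntax. `numeric_filter` and `format_bound` are hypothetical helpers, and whether the gem produces exactly this string is an assumption.

```ruby
# Hypothetical helpers, not part of redi_search: translate a Ruby Range
# into RediSearch's numeric filter syntax.
def format_bound(value)
  return "+inf" if value == Float::INFINITY
  return "-inf" if value == -Float::INFINITY

  value.to_s
end

def numeric_filter(field, range)
  upper = format_bound(range.end)
  upper = "(#{upper}" if range.exclude_end? # "(" marks an exclusive bound

  "@#{field}:[#{format_bound(range.begin)} #{upper}]"
end

numeric_filter(:number, 0..100)               # => "@number:[0 100]"
numeric_filter(:number, 0..Float::INFINITY)   # => "@number:[0 +inf]"
numeric_filter(:number, -Float::INFINITY..0)  # => "@number:[-inf 0]"
numeric_filter(:number, 0...100)              # => "@number:[0 (100]"
```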
189
436
 
190
- ### Rails Integration
437
+ ##### Query level clauses
438
+ - `slop(level)`
439
+ - Allows a maximum of N intervening unmatched offsets between
440
+ phrase terms (i.e., the slop for exact phrases is 0).
441
+ - `in_order`
442
+ - Usually used in conjunction with `slop`. We make sure the query terms appear
443
+ in the same order in the document as in the query, regardless of the offsets
444
+ between them.
445
+ - `no_content`
446
+ - Only return the document ids and not the content. This is useful if
447
+ RediSearch is being used on a Rails model where the document attributes
448
+ don't matter and it's being converted into ActiveRecord objects.
449
+ - `language(language)`
450
+ - Stemmer to use for the supplied language during search for query expansion.
451
+ If querying documents in Chinese, this should be set to chinese in order to
452
+ properly tokenize the query terms. If an unsupported language is sent, the
453
+ command returns an error.
454
+ - `sort_by(field, order: :asc)`
455
+ - If the supplied field is a sortable field, the results are ordered by the
456
+ value of this field. This applies to both text and numeric fields. Available
457
+ orders are `:asc` or `:desc`
458
+ - `limit(num, offset = 0)`
459
+ - Limit the results to the specified `num` at the `offset`. The default limit
460
+ is set to `10`.
461
+ - `count`
462
+ - Returns the number of documents found in the search query
463
+ - `highlight(fields: [], opening_tag: "<b>", closing_tag: "</b>")`
464
+ - Use this option to format occurrences of matched text. `fields` are an
465
+ array of fields to be highlighted.
466
+ - `verbatim`
467
+ - Do not try to use stemming for query expansion but search the query terms
468
+ verbatim.
469
+ - `no_stop_words`
470
+ - Do not filter stopwords from the query.
471
+ - `with_scores`
472
+ - Include the relative internal score of each document. This can be used to
473
+ merge results from multiple instances. This will add a `score` method to the
474
+ returned `Document` instances.
475
+ - `return(*fields)`
476
+ - Limit which fields from the document are returned.
477
+ - `explain`
478
+ - Returns the execution plan for a complex query. In the returned response,
479
+ a `+` on a term is an indication of stemming.
480
+ - `to_redis`
481
+ - Returns the command to be executed without executing it.
482
+
483
+
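To make the chaining behaviour of the clauses above more concrete, here is a minimal sketch of a builder that collects `FT.SEARCH` arguments and renders them the way `to_redis` is described. `SearchSketch` is a hypothetical class written for illustration; it is not the gem's implementation and covers only a handful of the clauses.

```ruby
# Hypothetical builder, not the gem's implementation: each clause appends the
# matching FT.SEARCH arguments and returns self so that calls can be chained.
class SearchSketch
  def initialize(index_name, query)
    @index_name = index_name
    @query = query
    @clauses = []
  end

  def slop(level)
    add("SLOP", level)
  end

  def in_order
    add("INORDER")
  end

  def no_content
    add("NOCONTENT")
  end

  def sort_by(field, order: :asc)
    raise ArgumentError, "order must be :asc or :desc" unless %i[asc desc].include?(order)

    add("SORTBY", field, order.to_s.upcase)
  end

  # FT.SEARCH expects the offset first, then the number of results.
  def limit(num, offset = 0)
    add("LIMIT", offset, num)
  end

  # Returns the command as a string without executing anything.
  def to_redis
    ["FT.SEARCH", @index_name, @query, *@clauses].join(" ")
  end

  private

  def add(*args)
    @clauses.concat(args.map(&:to_s))
    self
  end
end

puts SearchSketch.new("user_idx", "jimmy").slop(1).in_order.limit(5, 10).to_redis
```

Returning `self` from every clause is what allows the calls to be chained in any order before the command is finally rendered.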
484
+ ## Spellcheck
485
+
486
+ Spellchecking is initiated off a `RediSearch::Index` instance and provides
487
+ suggestions for misspelled search terms. It takes an optional `distance`
488
+ argument which is the maximal Levenshtein distance for spelling suggestions. It
489
+ returns an array where each element contains suggestions for each search term
490
+ and a normalized score based on its occurrences in the index.
491
+
492
+ ```ruby
493
+ main ❯ index = RediSearch::Index.new("user_idx", name: { text: { phonetic: "dm:en" } })
494
+ main ❯ index.spellcheck("jimy")
495
+ RediSearch (1.1ms) FT.SPELLCHECK user_idx jimy DISTANCE 1
496
+ => [#<RediSearch::Spellcheck::Result:0x00007f805591c670
497
+ term: "jimy",
498
+ suggestions:
499
+ [#<struct RediSearch::Spellcheck::Suggestion score=0.0006849315068493151, suggestion="jimmy">,
500
+ #<struct RediSearch::Spellcheck::Suggestion score=0.00019569471624266145, suggestion="jim">]>]
501
+ main ❯ index.spellcheck("jimy", distance: 2).first.suggestions
502
+ RediSearch (0.5ms) FT.SPELLCHECK user_idx jimy DISTANCE 2
503
+ => [#<struct RediSearch::Spellcheck::Suggestion score=0.0006849315068493151, suggestion="jimmy">,
504
+ #<struct RediSearch::Spellcheck::Suggestion score=0.00019569471624266145, suggestion="jim">]
505
+ ```
506
+
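For readers unfamiliar with the `distance` argument, the sketch below computes the Levenshtein distance between two strings in plain Ruby. It is only an illustration of the metric; RediSearch computes suggestions server-side, and the `levenshtein` helper is not part of the gem.

```ruby
# Classic dynamic-programming Levenshtein distance, keeping one row at a time.
def levenshtein(a, b)
  previous = (0..b.length).to_a

  a.each_char.with_index(1) do |char_a, i|
    current = [i]
    b.each_char.with_index(1) do |char_b, j|
      cost = char_a == char_b ? 0 : 1
      current << [
        previous[j] + 1,        # deletion
        current[j - 1] + 1,     # insertion
        previous[j - 1] + cost  # substitution
      ].min
    end
    previous = current
  end

  previous.last
end

puts levenshtein("jimy", "jimmy") # one inserted "m"
puts levenshtein("jimy", "jim")   # one deleted "y"
```

Both suggestions in the session above are a single edit away from `jimy`, which is consistent with them being returned at `DISTANCE 1`.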
507
+
508
+ ## Rails Integration
509
+
510
+ Integration with Rails is super easy! Call `redi_search` with the schema keyword
511
+ arg from inside your model. For example:
191
512
 
192
- Integration with Rails is on by default! All you have to do is add the following to the model you want to search:
193
513
  ```ruby
194
514
  class User < ApplicationRecord
195
515
  redi_search schema: {
@@ -199,22 +519,106 @@ class User < ApplicationRecord
199
519
  end
200
520
  ```
201
521
 
202
- This will automatically add `User.search` and `User.reindex` methods. You can also use `User.redi_search_index` to get the `RediSearch::Index` instance. `User.reindex` will first `drop` the index if it exists, create the index with the given schema, and then add all the records to the index.
522
+ This will automatically add `User.search` and `User.spellcheck`
523
+ methods which behave the same as if you called them on an `Index` instance.
524
+
525
+ `User.reindex(only: [], **options)` is also added and behaves similarly to `RediSearch::Index#reindex`. Some of the differences include:
526
+ - By default does an upsert for all documents added using the
527
+ option `replace: { partial: true }`.
528
+ - `Document`s do not need to be passed as the first parameter. The `search_import`
529
+ scope is automatically called and all the records are converted
530
+ to `Document`s.
531
+ - Accepts an optional `only` parameter where you can specify a limited number
532
+ of fields to update. Useful if you alter the schema and only need to index a
533
+ particular field.
534
+
535
+
536
+ The `redi_search` class method also takes an optional `serializer` argument
537
+ which takes the class name of a serializer. The serializer must respond to all
538
+ the fields in a schema as methods. We don't serialize to a JSON object since
539
+ RediSearch doesn't serialize documents that way.
540
+
541
+ ```ruby
542
+ class User < ApplicationRecord
543
+ redi_search schema: {
544
+ name: { text: { phonetic: "dm:en" } }
545
+ }, serializer: UserSerializer
546
+ end
547
+
548
+ class UserSerializer < SimpleDelegator
549
+ def name
550
+ "#{first_name} #{last_name}"
551
+ end
552
+ end
553
+ ```
554
+
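The following self-contained sketch shows why a `SimpleDelegator` works as a serializer: it answers the schema's field names itself and forwards everything else to the wrapped record. The `Struct` standing in for the model and the field-extraction step are assumptions made for illustration; how the gem actually reads the fields may differ.

```ruby
require "delegate"

# Stand-in for an ActiveRecord model; the attribute names are illustrative.
User = Struct.new(:first_name, :last_name)

class UserSerializer < SimpleDelegator
  # first_name and last_name are forwarded to the wrapped record.
  def name
    "#{first_name} #{last_name}"
  end
end

schema_fields = [:name]
serializer = UserSerializer.new(User.new("Jimmy", "Doe"))

# Hypothetical extraction step: call one method per schema field.
document_fields = schema_fields.map { |field| [field, serializer.public_send(field)] }.to_h
puts document_fields.inspect
```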
555
+ You can define a `search_import` scope on the model to eager load relationships
556
+ when indexing, or to limit which records are indexed.
557
+
558
+ ```ruby
559
+ class User < ApplicationRecord
560
+ scope :search_import, -> { includes(:posts) }
561
+ end
562
+ ```
563
+
564
+ The default index name for model indexes is
565
+ `#{model_name.plural}_#{RediSearch.env}`. The `redi_search` method takes an
566
+ optional `index_prefix` argument which gets prepended to the index name:
567
+
568
+ ```ruby
569
+ class User < ApplicationRecord
570
+ redi_search schema: {
571
+ first: { text: { phonetic: "dm:en" } },
572
+ last: { text: { phonetic: "dm:en" } }
573
+ }, index_prefix: 'prefix'
574
+ end
575
+
576
+ User.redi_search_index.name
577
+ # => prefix_users_development
578
+ ```
579
+
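The naming rule can be sketched as a small function. The `index_name` helper below is hypothetical, and joining the parts with an underscore is an assumption inferred from the `prefix_users_development` output shown above.

```ruby
# Hypothetical helper mirroring the naming rule described above; the "_"
# separator is inferred from the example output, not taken from the gem.
def index_name(plural_model_name, env, index_prefix: nil)
  [index_prefix, plural_model_name, env].compact.join("_")
end

puts index_name("users", "development")
puts index_name("users", "development", index_prefix: "prefix")
```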
580
+ When integrating RediSearch into a model, records will automatically be indexed
581
+ after creating and updating and will be removed from the index upon destruction.
582
+
583
+ There are a few more convenience methods that are publicly available:
584
+ - `redi_search_document`
585
+ - Returns the record as a `RediSearch::Document` instance
586
+ - `redi_search_delete_document`
587
+ - Removes the record from the index
588
+ - `redi_search_add_document`
589
+ - Adds the record to the index
590
+ - `redi_search_index`
591
+ - Returns the `RediSearch::Index` instance
592
+
203
593
 
204
594
  ## Development
205
595
 
206
- After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake test` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment. You can also start a rails console if you `cd` into `test/dummy`.
596
+ After checking out the repo, run `bin/setup` to install dependencies. Then, run
597
+ `rake test` to run the tests. You can also run `bin/console` for an interactive
598
+ prompt that will allow you to experiment. You can also start a rails console if
599
+ you `cd` into `test/dummy`.
207
600
 
208
- To install this gem onto your local machine, run `bundle exec rake install`. To release a new version, execute `bin/publish (major|minor|patch)` which will update the version number in `version.rb`, create a git tag for the version, push git commits and tags, and push the `.gem` file to [rubygems.org](https://rubygems.org).
601
+ To install this gem onto your local machine, run `bundle exec rake install`. To
602
+ release a new version, execute `bin/publish (major|minor|patch)` which will
603
+ update the version number in `version.rb`, create a git tag for the version,
604
+ push git commits and tags, and push the `.gem` file to
605
+ [rubygems.org](https://rubygems.org).
209
606
 
210
607
  ## Contributing
211
608
 
212
- Bug reports and pull requests are welcome on GitHub at https://github.com/npezza93/redi_search. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the [Contributor Covenant](http://contributor-covenant.org) code of conduct.
609
+ Bug reports and pull requests are welcome on
610
+ [GitHub](https://github.com/npezza93/redi_search). This project is intended to
611
+ be a safe, welcoming space for collaboration, and contributors are expected to
612
+ adhere to the [Contributor Covenant](http://contributor-covenant.org) code of
613
+ conduct.
213
614
 
214
615
  ## License
215
616
 
216
- The gem is available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT).
617
+ The gem is available as open source under the terms of the
618
+ [MIT License](https://opensource.org/licenses/MIT).
217
619
 
218
620
  ## Code of Conduct
219
621
 
220
- Everyone interacting in the RediSearch project’s codebases, issue trackers, chat rooms and mailing lists is expected to follow the [code of conduct](https://github.com/npezza93/redi_search/blob/master/CODE_OF_CONDUCT.md).
622
+ Everyone interacting in the RediSearch project’s codebases, issue trackers, chat
623
+ rooms and mailing lists is expected to follow the [code of
624
+ conduct](https://github.com/npezza93/redi_search/blob/master/CODE_OF_CONDUCT.md).