redi_search 0.1.0 → 1.0.0
- checksums.yaml +4 -4
- data/README.md +516 -112
- data/lib/redi_search.rb +5 -2
- data/lib/redi_search/add.rb +70 -0
- data/lib/redi_search/alter.rb +30 -0
- data/lib/redi_search/create.rb +53 -0
- data/lib/redi_search/document.rb +71 -16
- data/lib/redi_search/index.rb +31 -26
- data/lib/redi_search/lazily_load.rb +65 -0
- data/lib/redi_search/log_subscriber.rb +4 -0
- data/lib/redi_search/model.rb +41 -18
- data/lib/redi_search/schema.rb +17 -8
- data/lib/redi_search/schema/text_field.rb +0 -2
- data/lib/redi_search/search.rb +22 -44
- data/lib/redi_search/search/clauses.rb +60 -31
- data/lib/redi_search/search/clauses/and.rb +17 -0
- data/lib/redi_search/search/clauses/application_clause.rb +18 -0
- data/lib/redi_search/search/clauses/boolean.rb +72 -0
- data/lib/redi_search/search/clauses/highlight.rb +47 -0
- data/lib/redi_search/search/clauses/in_order.rb +17 -0
- data/lib/redi_search/search/clauses/language.rb +23 -0
- data/lib/redi_search/search/clauses/limit.rb +27 -0
- data/lib/redi_search/search/clauses/no_content.rb +17 -0
- data/lib/redi_search/search/clauses/no_stop_words.rb +17 -0
- data/lib/redi_search/search/clauses/or.rb +23 -0
- data/lib/redi_search/search/clauses/return.rb +23 -0
- data/lib/redi_search/search/clauses/slop.rb +23 -0
- data/lib/redi_search/search/clauses/sort_by.rb +25 -0
- data/lib/redi_search/search/clauses/verbatim.rb +17 -0
- data/lib/redi_search/search/clauses/where.rb +66 -0
- data/lib/redi_search/search/clauses/with_scores.rb +17 -0
- data/lib/redi_search/search/result.rb +46 -0
- data/lib/redi_search/search/term.rb +4 -4
- data/lib/redi_search/spellcheck.rb +30 -29
- data/lib/redi_search/spellcheck/result.rb +44 -0
- data/lib/redi_search/version.rb +1 -1
- metadata +101 -31
- data/.gitignore +0 -11
- data/.rubocop.yml +0 -1757
- data/.travis.yml +0 -23
- data/Gemfile +0 -17
- data/Rakefile +0 -12
- data/bin/console +0 -8
- data/bin/publish +0 -58
- data/bin/setup +0 -8
- data/bin/test +0 -7
- data/lib/redi_search/document/converter.rb +0 -26
- data/lib/redi_search/error.rb +0 -6
- data/lib/redi_search/result/collection.rb +0 -22
- data/lib/redi_search/search/and_clause.rb +0 -15
- data/lib/redi_search/search/boolean_clause.rb +0 -72
- data/lib/redi_search/search/highlight_clause.rb +0 -43
- data/lib/redi_search/search/or_clause.rb +0 -21
- data/lib/redi_search/search/where_clause.rb +0 -66
- data/redi_search.gemspec +0 -48
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA256:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: 0666cc15e7fc7d132026337c916b5d233114d65378ff172d9a48ae81ed1a9eae
|
4
|
+
data.tar.gz: 575c1230885537ea9a31b753808894d187f00718fe8d0fca282cba426b75e0c8
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: 0f1486c6d2a7b1a726d6c956398e693317df4efffa0c73e45b180a3acb2ac36b69284e68e3230862e6948015b1ceee0a1ac518dab6182c86e4b2c710f587cd16
|
7
|
+
data.tar.gz: c9e179c20fc545612a00fd2f9b46e5a2eae9ce1292d4635625a80abf641d726d248ee753633dfa8e292b0dba1543af410e58cf5a10380d7854c66af2c06326a1
|
data/README.md
CHANGED
@@ -1,42 +1,45 @@
|
|
1
|
+
<p align="center">
|
2
|
+
<a href="https://github.com/npezza93/redi_search">
|
3
|
+
<img src="https://raw.githubusercontent.com/npezza93/redi_search/master/.github/logo.svg?sanitize=true" width="350">
|
4
|
+
</a>
|
5
|
+
</p>
|
6
|
+
|
1
7
|
# RediSearch
|
2
8
|
|
3
9
|
[](https://travis-ci.com/npezza93/redi_search)
|
4
10
|
[](https://codeclimate.com/github/npezza93/redi_search/test_coverage)
|
5
11
|
[](https://codeclimate.com/github/npezza93/redi_search/maintainability)
|
6
12
|
|
7
|
-
A simple, but powerful Ruby wrapper around RediSearch,
|
8
|
-
|
13
|
+
A simple, but powerful, Ruby wrapper around RediSearch, a search engine on top of
|
14
|
+
Redis.
|
9
15
|
|
10
16
|
## Installation
|
11
17
|
|
12
18
|
Firstly, Redis and RediSearch need to be installed.
|
13
19
|
|
14
|
-
You can download Redis from https://redis.io/download, and check out
|
15
|
-
|
16
|
-
|
17
|
-
|
18
|
-
1. `cd RediSearch`
|
19
|
-
1. `mkdir build`
|
20
|
-
1. `cd build`
|
21
|
-
1. `cmake .. -DCMAKE_BUILD_TYPE=RelWithDebInfo`
|
22
|
-
1. `make`
|
23
|
-
1. `redis-server --loadmodule ./redisearch.so or load the module in your redis.conf`
|
20
|
+
You can download Redis from https://redis.io/download, and check out
|
21
|
+
installation instructions
|
22
|
+
[here](https://github.com/antirez/redis#installing-redis). Alternatively, on
|
23
|
+
macOS or Linux you can install via Homebrew.
|
24
24
|
|
25
|
-
|
25
|
+
To install RediSearch, check out
|
26
|
+
[https://oss.redislabs.com/redisearch/Quick_Start.html](https://oss.redislabs.com/redisearch/Quick_Start.html).
|
27
|
+
Once you have RediSearch built, you can update your redis.conf file to always
|
28
|
+
load the redisearch module with `loadmodule /path/to/redisearch.so`. (On macOS
|
29
|
+
the redis.conf file can be found at `/usr/local/etc/redis.conf`.)
|
26
30
|
|
27
|
-
|
28
|
-
After Redis and RediSearch are up and running, add this line to your application's Gemfile:
|
31
|
+
After Redis and RediSearch are up and running, add this line to your Gemfile:
|
29
32
|
|
30
33
|
```ruby
|
31
34
|
gem 'redi_search'
|
32
35
|
```
|
33
36
|
|
34
|
-
And then
|
37
|
+
And then:
|
35
38
|
```bash
|
36
39
|
❯ bundle
|
37
40
|
````
|
38
41
|
|
39
|
-
Or install it yourself
|
42
|
+
Or install it yourself:
|
40
43
|
```bash
|
41
44
|
❯ gem install redi_search
|
42
45
|
```
|
@@ -46,9 +49,10 @@ and require it:
|
|
46
49
|
require 'redi_search'
|
47
50
|
```
|
48
51
|
|
49
|
-
|
52
|
+
Once the gem is installed and required, you'll need to configure it with your
|
53
|
+
Redis configuration. If you're on Rails, this should go in an initializer
|
54
|
+
(`config/initializers/redi_search.rb`):
|
50
55
|
|
51
|
-
### Configuration
|
52
56
|
```ruby
|
53
57
|
RediSearch.configure do |config|
|
54
58
|
config.redis_config = {
|
@@ -58,108 +62,341 @@ RediSearch.configure do |config|
|
|
58
62
|
end
|
59
63
|
```
|
60
64
|
|
61
|
-
|
65
|
+
## Table of Contents
|
66
|
+
- [Preface](#preface)
|
67
|
+
- [Schema](#schema)
|
68
|
+
- [Document](#document)
|
69
|
+
- [Index](#index)
|
70
|
+
- [Searching](#searching)
|
71
|
+
- [Spellcheck](#spellcheck)
|
72
|
+
- [Rails Integration](#rails-integration)
|
73
|
+
|
74
|
+
|
75
|
+
## Preface
|
76
|
+
RediSearch revolves around a search index, so let's start with
|
77
|
+
defining what a search index is. According to [Swiftype](https://swiftype.com):
|
78
|
+
> A search index is a body of structured data that a search engine refers to
|
79
|
+
> when looking for results that are relevant to a specific query. Indexes are a
|
80
|
+
> critical piece of any search system, since they must be tailored to the
|
81
|
+
> specific information retrieval method of the search engine’s algorithm. In
|
82
|
+
> this manner, the algorithm and the index are inextricably linked to one
|
83
|
+
> another. Index can also be used as a verb (indexing), referring to the process
|
84
|
+
> of collecting unstructured website data in a structured format that is
|
85
|
+
> tailored for the search engine algorithm.
|
86
|
+
>
|
87
|
+
> One way to think about indices is to consider the following analogy between a
|
88
|
+
> search infrastructure and an office filing system. Imagine you hand an intern
|
89
|
+
> a stack of thousands of pieces of paper (documents) and tell them to organize
|
90
|
+
> these pieces of paper in a filing cabinet (index) to help the company find
|
91
|
+
> information more efficiently. The intern will first have to sort through the
|
92
|
+
> papers and get a sense of all the information contained within them, then they
|
93
|
+
> will have to decide on a system for arranging them in the filing cabinet, then
|
94
|
+
> finally they’ll need to decide what is the most effective manner for searching
|
95
|
+
> through and selecting from the files once they are in the cabinet. In this
|
96
|
+
> example, the process of organizing and filing the papers corresponds to the
|
97
|
+
> process of indexing website content, and the method for searching across these
|
98
|
+
> organized files and finding those that are most relevant corresponds to the
|
99
|
+
> search algorithm.
|
100
|
+
|
101
|
+
|
102
|
+
## Schema
|
103
|
+
|
104
|
+
This defines the fields and the properties of those fields in the index. A
|
105
|
+
schema is a hash, with field names as the keys, and the field type (and options)
|
106
|
+
as the value. Each field can be one of four types: geo, numeric, tag, or text
|
107
|
+
and can have many options. A simple example of a schema is:
|
108
|
+
```ruby
|
109
|
+
{ first_name: :text, last_name: :text }
|
110
|
+
```
|
111
|
+
|
112
|
+
The supported options for each type are as follows:
|
113
|
+
|
114
|
+
##### Text field
|
115
|
+
With no options: `{ name: :text }`
|
116
|
+
|
117
|
+
<details>
|
118
|
+
<summary>Options</summary>
|
119
|
+
<ul>
|
120
|
+
<li>
|
121
|
+
<b>weight</b> (default: 1.0)
|
122
|
+
<ul>
|
123
|
+
<li>Declares the importance of this field when calculating result accuracy. This is a multiplication factor.</li>
|
124
|
+
<li>Ex: <code>{ name: { text: { weight: 2 } } }</code></li>
|
125
|
+
</ul>
|
126
|
+
</li>
|
127
|
+
<li>
|
128
|
+
<b>phonetic</b>
|
129
|
+
<ul>
|
130
|
+
<li>Performs phonetic matching on this field in searches by default. The obligatory {matcher} argument specifies the phonetic algorithm and language used. The following matchers are supported:
|
131
|
+
<ul>
|
132
|
+
<li>dm:en - Double Metaphone for English</li>
|
133
|
+
<li>dm:fr - Double Metaphone for French</li>
|
134
|
+
<li>dm:pt - Double Metaphone for Portuguese</li>
|
135
|
+
<li>dm:es - Double Metaphone for Spanish</li>
|
136
|
+
</ul>
|
137
|
+
</li>
|
138
|
+
<li>
|
139
|
+
Ex: <code>{ name: { text: { phonetic: 'dm:en' } } }</code>
|
140
|
+
</li>
|
141
|
+
</ul>
|
142
|
+
</li>
|
143
|
+
<li>
|
144
|
+
<b>sortable</b> (default: false)
|
145
|
+
<ul>
|
146
|
+
<li>Allows the user to later sort the results by the value of this field (this adds memory overhead so do not declare it on large text fields).</li>
|
147
|
+
<li>Ex: <code>{ name: { text: { sortable: true } } }</code></li>
|
148
|
+
</ul>
|
149
|
+
</li>
|
150
|
+
<li>
|
151
|
+
<b>no_index</b> (default: false)
|
152
|
+
<ul>
|
153
|
+
<li>Field will not be indexed. This is useful in conjunction with <code>sortable</code>, to create fields whose update using PARTIAL will not cause full reindexing of the document. If a field has <code>no_index</code> and doesn't have <code>sortable</code>, it will just be ignored by the index.</li>
|
154
|
+
<li>Ex: <code>{ name: { text: { no_index: true } } }</code></li>
|
155
|
+
</ul>
|
156
|
+
</li>
|
157
|
+
<li>
|
158
|
+
<b>no_stem</b> (default: false)
|
159
|
+
<ul>
|
160
|
+
<li>Disable stemming when indexing its values. This may be ideal for things like proper names.</li>
|
161
|
+
<li>Ex: <code>{ name: { text: { no_stem: true } } }</code></li>
|
162
|
+
</ul>
|
163
|
+
</li>
|
164
|
+
</ul>
|
165
|
+
</details>
|
166
|
+
|
167
|
+
##### Numeric field
|
168
|
+
With no options: `{ price: :numeric }`
|
169
|
+
|
170
|
+
<details>
|
171
|
+
<summary>Options</summary>
|
172
|
+
<ul>
|
173
|
+
<li>
|
174
|
+
<b>sortable</b> (default: false)
|
175
|
+
<ul>
|
176
|
+
<li>Allows the user to later sort the results by the value of this field (this adds memory overhead so do not declare it on large text fields).</li>
|
177
|
+
<li>Ex: <code>{ id: { numeric: { sortable: true } } }</code></li>
|
178
|
+
</ul>
|
179
|
+
</li>
|
180
|
+
<li>
|
181
|
+
<b>no_index</b> (default: false)
|
182
|
+
<ul>
|
183
|
+
<li>Field will not be indexed. This is useful in conjunction with <code>sortable</code>, to create fields whose update using PARTIAL will not cause full reindexing of the document. If a field has <code>no_index</code> and doesn't have <code>sortable</code>, it will just be ignored by the index.</li>
|
184
|
+
<li>Ex: <code>{ id: { numeric: { no_index: true } } }</code></li>
|
185
|
+
</ul>
|
186
|
+
</li>
|
187
|
+
</ul>
|
188
|
+
</details>
|
189
|
+
|
190
|
+
##### Tag field
|
191
|
+
With no options: `{ tag: :tag }`
|
192
|
+
|
193
|
+
<details>
|
194
|
+
<summary>Options</summary>
|
195
|
+
<ul>
|
196
|
+
<li>
|
197
|
+
<b>sortable</b> (default: false)
|
198
|
+
<ul>
|
199
|
+
<li>Allows the user to later sort the results by the value of this field (this adds memory overhead so do not declare it on large text fields).</li>
|
200
|
+
<li>Ex: <code>{ tag: { tag: { sortable: true } } }</code></li>
|
201
|
+
</ul>
|
202
|
+
</li>
|
203
|
+
<li>
|
204
|
+
<b>no_index</b> (default: false)
|
205
|
+
<ul>
|
206
|
+
<li>Field will not be indexed. This is useful in conjunction with <code>sortable</code>, to create fields whose update using PARTIAL will not cause full reindexing of the document. If a field has <code>no_index</code> and doesn't have <code>sortable</code>, it will just be ignored by the index.</li>
|
207
|
+
<li>Ex: <code>{ tag: { tag: { no_index: true } } }</code></li>
|
208
|
+
</ul>
|
209
|
+
</li>
|
210
|
+
<li>
|
211
|
+
<b>separator</b> (default: ",")
|
212
|
+
<ul>
|
213
|
+
<li>Indicates how the text contained in the field is to be split into individual tags. The default is <code>,</code>. The value must be a single character.</li>
|
214
|
+
<li>Ex: <code>{ tag: { tag: { separator: ',' } } }</code></li>
|
215
|
+
</ul>
|
216
|
+
</li>
|
217
|
+
</ul>
|
218
|
+
</details>
|
219
|
+
|
220
|
+
##### Geo field
|
221
|
+
With no options: `{ place: :geo }`
|
222
|
+
|
223
|
+
<details>
|
224
|
+
<summary>Options</summary>
|
225
|
+
<ul>
|
226
|
+
<li>
|
227
|
+
<b>sortable</b> (default: false)
|
228
|
+
<ul>
|
229
|
+
<li>Allows the user to later sort the results by the value of this field (this adds memory overhead so do not declare it on large text fields).</li>
|
230
|
+
<li>Ex: <code>{ place: { geo: { sortable: true } } }</code></li>
|
231
|
+
</ul>
|
232
|
+
</li>
|
233
|
+
<li>
|
234
|
+
<b>no_index</b> (default: false)
|
235
|
+
<ul>
|
236
|
+
<li>Field will not be indexed. This is useful in conjunction with <code>sortable</code>, to create fields whose update using PARTIAL will not cause full reindexing of the document. If a field has <code>no_index</code> and doesn't have <code>sortable</code>, it will just be ignored by the index.</li>
|
237
|
+
<li>Ex: <code>{ place: { geo: { no_index: true } } }</code></li>
|
238
|
+
</ul>
|
239
|
+
</li>
|
240
|
+
</ul>
|
241
|
+
</details>
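
To tie the field types together, here is a small sketch of a schema that mixes all four. The field names (`name`, `bio`, `price`, `tags`, `location`) are made up for illustration; only the type names and option keys come from the lists above. The second half shows that each value is either a bare type symbol or a `{ type => options }` hash.

```ruby
# Hypothetical schema: field names are illustrative, option keys are the
# ones documented above.
schema = {
  name:     { text: { weight: 2, phonetic: "dm:en", sortable: true } },
  bio:      :text,
  price:    { numeric: { sortable: true } },
  tags:     { tag: { separator: "," } },
  location: :geo
}

# Each value is either a bare type symbol or a { type => options } hash,
# so the field type can be read off uniformly.
field_types = schema.transform_values do |definition|
  definition.is_a?(Hash) ? definition.keys.first : definition
end

p field_types
```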
|
242
|
+
|
243
|
+
## Document
|
244
|
+
|
245
|
+
A `Document` is the Ruby representation of a RediSearch document.
|
246
|
+
|
247
|
+
You can fetch a `Document` using the `.get` or `.mget` class methods.
|
248
|
+
- `get(index, document_id)` fetches a single document in an `Index` for a given
|
249
|
+
`document_id`.
|
250
|
+
- `mget(index, *document_ids)` fetches a collection of `Document`s
|
251
|
+
in an `Index` for the given `document_ids`.
|
252
|
+
|
253
|
+
You can also make a `Document` instance using the
|
254
|
+
`.for_object(index, record, serializer: nil, only: [])` class method. It takes
|
255
|
+
an `Index` instance and a Ruby object. That object must respond to all the
|
256
|
+
fields specified in the index's schema, or you can pass a serializer class that accepts
|
257
|
+
the object and responds to all the fields specified in the index's schema.
|
258
|
+
`only` accepts an array of fields from the schema and limits the fields that are
|
259
|
+
passed to the `Document`.
|
260
|
+
|
261
|
+
Once you have an instance of a `Document`, it responds to all the fields
|
262
|
+
specified in the index's schema as methods, plus `document_id`. `document_id` is
|
263
|
+
automatically prefixed with the index's name (unless it already is) to ensure
|
264
|
+
uniqueness. We prepend the index name because if you have two documents with the
|
265
|
+
same id in different indexes we don't want the documents to override each other.
|
266
|
+
There is also a `#document_id_without_index` method which removes the prepended
|
267
|
+
index name.
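
The prefixing rule can be sketched in plain Ruby. This is an illustration of the behavior described above, not the gem's implementation: both helper names are hypothetical, and joining the index name and id by plain concatenation is an assumption.

```ruby
# Illustrative stand-ins, not the gem's code. How the index name and the id
# are joined is an assumption (plain concatenation).
def prefixed_document_id(index_name, id)
  id = id.to_s
  # Leave ids that already carry the index name untouched.
  id.start_with?(index_name) ? id : "#{index_name}#{id}"
end

def unprefixed_document_id(index_name, id)
  id.to_s.delete_prefix(index_name)
end

users_id = prefixed_document_id("user_idx", 1)
posts_id = prefixed_document_id("post_idx", 1)
p users_id, posts_id
```

Two records that share the id `1` end up under different keys, so they cannot clobber each other in Redis.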
|
268
|
+
|
269
|
+
Finally, there is a `#del` method that will remove the document from the index.
|
270
|
+
It optionally accepts a `delete_document` named argument that signifies whether
|
271
|
+
the document should be completely removed from the Redis instance vs just the
|
272
|
+
index.
|
273
|
+
|
274
|
+
|
275
|
+
## Index
|
276
|
+
|
277
|
+
To initialize an index, pass the name of the index as a string or symbol and the
|
278
|
+
schema.
|
62
279
|
|
63
|
-
All actions revolve around indexes. To instantiate one:
|
64
280
|
```ruby
|
65
281
|
RediSearch::Index.new(name_of_index, schema)
|
66
282
|
```
|
67
|
-
|
68
|
-
|
69
|
-
|
70
|
-
-
|
71
|
-
-
|
72
|
-
|
73
|
-
|
74
|
-
|
75
|
-
|
76
|
-
|
77
|
-
|
78
|
-
|
79
|
-
|
80
|
-
-
|
81
|
-
|
82
|
-
|
83
|
-
-
|
84
|
-
|
85
|
-
|
86
|
-
|
87
|
-
|
88
|
-
|
89
|
-
|
90
|
-
|
91
|
-
|
92
|
-
|
93
|
-
|
94
|
-
|
95
|
-
|
96
|
-
|
97
|
-
|
98
|
-
|
99
|
-
-
|
100
|
-
-
|
101
|
-
|
102
|
-
|
103
|
-
|
104
|
-
-
|
105
|
-
-
|
106
|
-
-
|
107
|
-
|
108
|
-
|
109
|
-
|
110
|
-
-
|
111
|
-
|
112
|
-
|
113
|
-
-
|
114
|
-
|
115
|
-
|
116
|
-
|
117
|
-
|
118
|
-
-
|
119
|
-
|
120
|
-
|
121
|
-
|
122
|
-
-
|
123
|
-
|
124
|
-
-
|
125
|
-
-
|
126
|
-
|
127
|
-
|
128
|
-
-
|
129
|
-
-
|
130
|
-
-
|
131
|
-
-
|
132
|
-
|
133
|
-
|
134
|
-
|
135
|
-
|
283
|
+
|
284
|
+
#### Available Commands
|
285
|
+
|
286
|
+
- `create`
|
287
|
+
- Creates the index in the Redis instance, returns a boolean. Has an
|
288
|
+
accompanying bang method that will raise an exception upon failure. Will
|
289
|
+
return `false` if the index already exists. Accepts a few options:
|
290
|
+
- `max_text_fields: #{true || false}`
|
291
|
+
- For efficiency, RediSearch encodes indexes differently if they are
|
292
|
+
created with fewer than 32 text fields. This option forces RediSearch
|
293
|
+
to encode indexes as if there were more than 32 text fields, which
|
294
|
+
allows you to add additional fields (beyond 32) using `alter`.
|
295
|
+
- `no_offsets: #{true || false}`
|
296
|
+
- If set, we do not store term offsets for documents (saves memory, does
|
297
|
+
not allow exact searches or highlighting). Implies `no_highlight`.
|
298
|
+
- `temporary: #{seconds}`
|
299
|
+
- Create a lightweight temporary index which will expire after `seconds`
|
300
|
+
seconds of inactivity. The internal idle timer is reset whenever the
|
301
|
+
index is searched or added to. Because such indexes are lightweight,
|
302
|
+
you can create thousands of such indexes without negative performance
|
303
|
+
implications.
|
304
|
+
- `no_highlight: #{true || false}`
|
305
|
+
- Conserves storage space and memory by disabling highlighting support.
|
306
|
+
If set, we do not store corresponding byte offsets for term positions.
|
307
|
+
`no_highlight` is also implied by `no_offsets`.
|
308
|
+
- `no_fields: #{true || false}`
|
309
|
+
- If set, we do not store field bits for each term. Saves memory, does
|
310
|
+
not allow filtering by specific fields.
|
311
|
+
- `no_frequencies: #{true || false}`
|
312
|
+
- If set, we avoid saving the term frequencies in the index. This saves
|
313
|
+
memory but does not allow sorting based on the frequencies of a given
|
314
|
+
term within the document.
|
315
|
+
- `drop`
|
316
|
+
- Drops the index from the Redis instance, returns a boolean. Has an
|
317
|
+
accompanying bang method that will raise an exception upon failure. Will
|
318
|
+
return `false` if the index has already been dropped.
|
319
|
+
- `exist?`
|
320
|
+
- Returns a boolean signifying index existence.
|
321
|
+
- `info`
|
322
|
+
- Returns a struct object with all the information about the index.
|
323
|
+
- `fields`
|
324
|
+
- Returns an array of the field names in the index.
|
325
|
+
- `add(document, score: 1.0, replace: {}, language: nil, no_save: false)`
|
326
|
+
- Takes a `Document` object and options. Has an
|
327
|
+
accompanying bang method that will raise an exception upon failure.
|
328
|
+
- `score` -> The document's rank, a value between 0.0 and 1.0
|
329
|
+
- `language` -> Use a stemmer for the supplied language during indexing.
|
330
|
+
- `no_save` -> Don't save the actual document in the database and only index it.
|
331
|
+
- `replace` -> Accepts a boolean or a hash. If a truthy value is passed, we
|
332
|
+
will do an UPSERT style insertion - and delete an older version of the
|
333
|
+
document if it exists.
|
334
|
+
- `replace: { partial: true }` -> Allows you to not have to specify all
|
335
|
+
fields for reindexing. Fields not given to the command will be loaded from
|
336
|
+
the current version of the document.
|
337
|
+
- `add_multiple!(documents, score: 1.0, replace: {}, language: nil, no_save: false)`
|
338
|
+
- Takes an array of `Document` objects. This provides a more performant way to
|
339
|
+
add multiple documents to the index. Accepts the same options as `add`.
|
340
|
+
- `del(document, delete_document: false)`
|
341
|
+
- Removes a document from the index. `delete_document` signifies whether the
|
342
|
+
document should be completely removed from the Redis instance vs just the
|
343
|
+
index.
|
344
|
+
- `document_count`
|
345
|
+
- Returns the number of documents in the index
|
346
|
+
- `alter(field_name, schema)`
|
347
|
+
- Adds a new field to the index. Ex: `index.alter(:first_name, text: { phonetic: "dm:en" })`
|
348
|
+
- `reindex(documents, recreate: false, **options)`
|
349
|
+
- If `recreate` is `true` the index will be dropped and recreated
|
350
|
+
- `options` accepts the same options as `add`
|
351
|
+
|
352
|
+
|
353
|
+
## Searching
|
354
|
+
|
355
|
+
Searching is initiated off a `RediSearch::Index` instance with clauses that can
|
356
|
+
be chained together. Searching returns an array of `Document`s, each of
|
357
|
+
which has public reader methods for all the schema fields and a `document_id`
|
358
|
+
method which returns the id of the document prefixed with the index name.
|
359
|
+
|
136
360
|
```ruby
|
137
361
|
main ❯ index = RediSearch::Index.new("user_idx", name: { text: { phonetic: "dm:en" } })
|
362
|
+
main ❯ index.add RediSearch::Document.for_object(index, User.new("10039", "Gene", "Volkman"))
|
363
|
+
main ❯ index.add RediSearch::Document.for_object(index, User.new("9998", "Jeannie", "Ledner"))
|
138
364
|
main ❯ index.search("john")
|
139
365
|
RediSearch (1.1ms) FT.SEARCH user_idx `john`
|
140
366
|
=> [#<RediSearch::Document:0x00007f862e241b78 first: "Gene", last: "Volkman", document_id: "10039">,
|
141
367
|
#<RediSearch::Document:0x00007f862e2417b8 first: "Jeannie", last: "Ledner", document_id: "9998">]
|
142
368
|
```
|
143
|
-
|
369
|
+
**Simple phrase query** - `hello AND world`
|
144
370
|
```ruby
|
145
371
|
index.search("hello").and("world")
|
146
372
|
```
|
147
|
-
|
373
|
+
**Exact phrase query** - `hello FOLLOWED BY world`
|
148
374
|
```ruby
|
149
375
|
index.search("hello world")
|
150
376
|
```
|
151
|
-
|
377
|
+
**Union query** - `hello OR world`
|
152
378
|
```ruby
|
153
379
|
index.search("hello").or("world")
|
154
380
|
```
|
155
|
-
|
381
|
+
**Negation query** - `hello AND NOT world`
|
156
382
|
```ruby
|
157
383
|
index.search("hello").and.not("world")
|
158
384
|
```
|
159
385
|
|
386
|
+
Complex intersections and unions:
|
387
|
+
```ruby
|
388
|
+
# Intersection of unions
|
389
|
+
index.search(index.search("hello").or("halo")).and(index.search("world").or("werld"))
|
390
|
+
# Negation of union
|
391
|
+
index.search("hello").and.not(index.search("world").or("werld"))
|
392
|
+
# Union inside phrase
|
393
|
+
index.search("hello").and(index.search("world").or("werld"))
|
394
|
+
```
|
395
|
+
|
160
396
|
All terms support a few options that can be applied.
|
161
397
|
|
162
|
-
|
398
|
+
**Prefix terms**: match all terms starting with a prefix.
|
399
|
+
(Akin to `like term%` in SQL)
|
163
400
|
```ruby
|
164
401
|
index.search("hel", prefix: true)
|
165
402
|
index.search("hello worl", prefix: true)
|
@@ -167,29 +404,112 @@ index.search("hel", prefix: true).and("worl", prefix: true)
|
|
167
404
|
index.search("hello").and.not("worl", prefix: true)
|
168
405
|
```
|
169
406
|
|
170
|
-
|
407
|
+
**Optional terms**: documents containing the optional terms will rank higher
|
408
|
+
than those without
|
171
409
|
```ruby
|
172
410
|
index.search("foo").and("bar", optional: true).and("baz", optional: true)
|
173
411
|
```
|
174
412
|
|
175
|
-
|
413
|
+
**Fuzzy terms**: matches are performed based on Levenshtein distance (LD). The
|
414
|
+
maximum Levenshtein distance supported is 3.
|
176
415
|
```ruby
|
177
416
|
index.search("zuchini", fuzziness: 1)
|
178
417
|
```
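
For readers unfamiliar with Levenshtein distance, the following standalone sketch computes it with the classic dynamic-programming recurrence. It only illustrates what a `fuzziness` of 1 means; it is not how RediSearch implements fuzzy matching internally, and the method name is made up for this example.

```ruby
# Classic dynamic-programming Levenshtein distance between two strings.
def levenshtein(a, b)
  previous = (0..b.length).to_a
  a.each_char.with_index(1) do |char_a, i|
    current = [i]
    b.each_char.with_index(1) do |char_b, j|
      cost = char_a == char_b ? 0 : 1
      current << [
        previous[j] + 1,        # deletion
        current[j - 1] + 1,     # insertion
        previous[j - 1] + cost  # substitution
      ].min
    end
    previous = current
  end
  previous.last
end

puts levenshtein("zuchini", "zucchini") # prints 1
```

The misspelled `zuchini` is one insertion away from `zucchini`, which is why a fuzziness of 1 is enough in the query above.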
|
179
418
|
|
180
|
-
|
419
|
+
Search terms can also be scoped to specific fields using a `where` clause:
|
181
420
|
```ruby
|
182
|
-
#
|
183
|
-
index.search
|
184
|
-
#
|
185
|
-
index.search
|
186
|
-
#
|
187
|
-
index.search
|
421
|
+
# Simple field specific query
|
422
|
+
index.search.where(name: "john")
|
423
|
+
# Using where with options
|
424
|
+
index.search.where(first: "jon", fuzziness: 1)
|
425
|
+
# Using where with more complex query
|
426
|
+
index.search.where(first: index.search("bill").or("bob"))
|
427
|
+
```
|
428
|
+
|
429
|
+
Searching for numeric fields takes a range:
|
430
|
+
```ruby
|
431
|
+
index.search.where(number: 0..100)
|
432
|
+
# Searching to infinity
|
433
|
+
index.search.where(number: 0..Float::INFINITY)
|
434
|
+
index.search.where(number: -Float::INFINITY..0)
|
188
435
|
```
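
In RediSearch's own query language a numeric filter is written `@field:[min max]`, with `-inf` and `+inf` for open ends. The sketch below shows one way a Ruby range could be rendered in that syntax. `numeric_filter` is a hypothetical helper, not part of the gem, and it only handles inclusive ranges.

```ruby
# Hypothetical helper (not part of the gem): renders an inclusive Ruby Range
# in RediSearch's numeric filter syntax, @field:[min max].
def numeric_bound(value)
  return "-inf" if value == -Float::INFINITY
  return "+inf" if value == Float::INFINITY

  value.to_s
end

def numeric_filter(field, range)
  "@#{field}:[#{numeric_bound(range.begin)} #{numeric_bound(range.end)}]"
end

puts numeric_filter(:number, 0..100)              # prints @number:[0 100]
puts numeric_filter(:number, 0..Float::INFINITY)  # prints @number:[0 +inf]
puts numeric_filter(:number, -Float::INFINITY..0) # prints @number:[-inf 0]
```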
|
189
436
|
|
190
|
-
|
437
|
+
##### Query level clauses
|
438
|
+
- `slop(level)`
|
439
|
+
- Allows at most `level` intervening unmatched offsets between
|
440
|
+
phrase terms (i.e. the slop for exact phrases is 0).
|
441
|
+
- `in_order`
|
442
|
+
- Usually used in conjunction with `slop`. We make sure the query terms appear
|
443
|
+
in the same order in the document as in the query, regardless of the offsets
|
444
|
+
between them.
|
445
|
+
- `no_content`
|
446
|
+
- Only return the document ids and not the content. This is useful if
|
447
|
+
RediSearch is being used on a Rails model where the document attributes
|
448
|
+
don't matter and it's being converted into ActiveRecord objects.
|
449
|
+
- `language(language)`
|
450
|
+
- Stemmer to use for the supplied language during search for query expansion.
|
451
|
+
If querying documents in Chinese, this should be set to `chinese` in order to
|
452
|
+
properly tokenize the query terms. If an unsupported language is sent, the
|
453
|
+
command returns an error.
|
454
|
+
- `sort_by(field, order: :asc)`
|
455
|
+
- If the supplied field is a sortable field, the results are ordered by the
|
456
|
+
value of this field. This applies to both text and numeric fields. Available
|
457
|
+
orders are `:asc` or `:desc`
|
458
|
+
- `limit(num, offset = 0)`
|
459
|
+
- Limit the results to the specified `num` at the `offset`. The default limit
|
460
|
+
is set to `10`.
|
461
|
+
- `count`
|
462
|
+
- Returns the number of documents found in the search query
|
463
|
+
- `highlight(fields: [], opening_tag: "<b>", closing_tag: "</b>")`
|
464
|
+
- Use this option to format occurrences of matched text. `fields` are an
|
465
|
+
array of fields to be highlighted.
|
466
|
+
- `verbatim`
|
467
|
+
- Do not try to use stemming for query expansion but search the query terms
|
468
|
+
verbatim.
|
469
|
+
- `no_stop_words`
|
470
|
+
- Do not filter stopwords from the query.
|
471
|
+
- `with_scores`
|
472
|
+
- Include the relative internal score of each document. This can be used to
|
473
|
+
merge results from multiple instances. This will add a `score` method to the
|
474
|
+
returned `Document` instances.
|
475
|
+
- `return(*fields)`
|
476
|
+
- Limit which fields from the document are returned.
|
477
|
+
- `explain`
|
478
|
+
- Returns the execution plan for a complex query. In the returned response,
|
479
|
+
a + on a term is an indication of stemming.
|
480
|
+
- `to_redis`
|
481
|
+
- Returns the command to be executed without executing it.
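
Because `limit` takes a count first and an offset second, page-based pagination needs a small conversion. The helper below is a hypothetical sketch (the name `limit_args` and the 1-based page convention are assumptions); it only computes the two arguments and does not talk to Redis.

```ruby
# Hypothetical helper: turns a 1-based page number into the (num, offset)
# pair that `limit(num, offset = 0)` expects.
def limit_args(page, per_page: 10)
  raise ArgumentError, "page must be >= 1" if page < 1

  [per_page, (page - 1) * per_page]
end

num, offset = limit_args(3, per_page: 25)
puts "num=#{num} offset=#{offset}" # prints num=25 offset=50
```

With an index at hand, the pair could then be passed along as `index.search("hello").limit(num, offset)`.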
|
482
|
+
|
483
|
+
|
484
|
+
## Spellcheck
|
485
|
+
|
486
|
+
Spellchecking is initiated off a `RediSearch::Index` instance and provides
|
487
|
+
suggestions for misspelled search terms. It takes an optional `distance`
|
488
|
+
argument which is the maximal Levenshtein distance for spelling suggestions. It
|
489
|
+
returns an array where each element contains suggestions for each search term
|
490
|
+
and a normalized score based on its occurrences in the index.
|
491
|
+
|
492
|
+
```ruby
|
493
|
+
main ❯ index = RediSearch::Index.new("user_idx", name: { text: { phonetic: "dm:en" } })
|
494
|
+
main ❯ index.spellcheck("jimy")
|
495
|
+
RediSearch (1.1ms) FT.SPELLCHECK user_idx jimy DISTANCE 1
|
496
|
+
=> [#<RediSearch::Spellcheck::Result:0x00007f805591c670
|
497
|
+
term: "jimy",
|
498
|
+
suggestions:
|
499
|
+
[#<struct RediSearch::Spellcheck::Suggestion score=0.0006849315068493151, suggestion="jimmy">,
|
500
|
+
#<struct RediSearch::Spellcheck::Suggestion score=0.00019569471624266145, suggestion="jim">]>]
|
501
|
+
main ❯ index.spellcheck("jimy", distance: 2).first.suggestions
|
502
|
+
RediSearch (0.5ms) FT.SPELLCHECK user_idx jimy DISTANCE 2
|
503
|
+
=> [#<struct RediSearch::Spellcheck::Suggestion score=0.0006849315068493151, suggestion="jimmy">,
|
504
|
+
#<struct RediSearch::Spellcheck::Suggestion score=0.00019569471624266145, suggestion="jim">]
|
505
|
+
```
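
A common follow-up is picking the single best suggestion for a term. The sketch below uses a plain `Struct` as a stand-in for `RediSearch::Spellcheck::Suggestion` so that it runs without Redis; the scores are copied from the output above.

```ruby
# Stand-in for RediSearch::Spellcheck::Suggestion so the sketch runs without
# Redis. The values are copied from the sample output above.
Suggestion = Struct.new(:score, :suggestion)

suggestions = [
  Suggestion.new(0.00019569471624266145, "jim"),
  Suggestion.new(0.0006849315068493151, "jimmy")
]

# The score is normalized by occurrences in the index, so the highest score
# is the most common candidate.
best = suggestions.max_by(&:score)
puts best.suggestion # prints jimmy
```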
|
506
|
+
|
507
|
+
|
508
|
+
## Rails Integration
|
509
|
+
|
510
|
+
Integration with Rails is super easy! Call `redi_search` with the schema keyword
|
511
|
+
arg from inside your model. Ex:
|
191
512
|
|
192
|
-
Integration with Rails is on by default! All you have to do is add the following to the model you want to search:
|
193
513
|
```ruby
|
194
514
|
class User < ApplicationRecord
|
195
515
|
redi_search schema: {
|
@@ -199,22 +519,106 @@ class User < ApplicationRecord
|
|
199
519
|
end
|
200
520
|
```
|
201
521
|
|
202
|
-
This will automatically add `User.search` and `User.
|
522
|
+
This will automatically add `User.search` and `User.spellcheck`
|
523
|
+
methods which behave the same as if you called them on an `Index` instance.
|
524
|
+
|
525
|
+
`User.reindex(only: [], **options)` is also added and behaves similarly to `RediSearch::Index#reindex`. Some of the differences include:
|
526
|
+
- By default does an upsert for all documents added using the
|
527
|
+
option `replace: { partial: true }`.
|
528
|
+
- `Document`s do not need to be passed as the first parameter. The `search_import`
|
529
|
+
scope is automatically called and all the records are converted
|
530
|
+
to `Document`s.
|
531
|
+
- Accepts an optional `only` parameter where you can specify a limited number
|
532
|
+
of fields to update. Useful if you alter the schema and only need to index a
|
533
|
+
particular field.
|
534
|
+
|
535
|
+
|
536
|
+
The `redi_search` class method also takes an optional `serializer` argument
|
537
|
+
which takes the class name of a serializer. The serializer must respond to all
|
538
|
+
the fields in a schema as methods. We don't serialize to a JSON object since
|
539
|
+
RediSearch doesn't serialize documents that way.
|
540
|
+
|
541
|
+
```ruby
|
542
|
+
class User < ApplicationRecord
|
543
|
+
redi_search schema: {
|
544
|
+
name: { text: { phonetic: "dm:en" } }
|
545
|
+
}, serializer: UserSerializer
|
546
|
+
end
|
547
|
+
|
548
|
+
class UserSerializer < SimpleDelegator
|
549
|
+
def name
|
550
|
+
"#{first_name} #{last_name}"
|
551
|
+
end
|
552
|
+
end
|
553
|
+
```
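
The serializer contract can be tried outside Rails. In this sketch a `Struct` stands in for the ActiveRecord model, and `Person` and `PersonSerializer` are illustrative names. The final check is one possible way to confirm that a serializer responds to every schema field; it is not something the gem is documented to do.

```ruby
require "delegate"

# Plain-Ruby stand-in for an ActiveRecord model.
Person = Struct.new(:first_name, :last_name)

# Wraps a record and exposes the derived `name` field the schema asks for.
class PersonSerializer < SimpleDelegator
  def name
    "#{first_name} #{last_name}"
  end
end

schema = { name: :text }
serializer = PersonSerializer.new(Person.new("Gene", "Volkman"))

# Every schema field must be callable on the serializer.
missing = schema.keys.reject { |field| serializer.respond_to?(field) }
puts serializer.name # prints Gene Volkman
```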
|
554
|
+
|
555
|
+
You can create a scope on the model to eager load relationships when indexing or
|
556
|
+
it can be used to limit the records to index.
|
557
|
+
|
558
|
+
```ruby
|
559
|
+
class User < ApplicationRecord
|
560
|
+
scope :search_import, -> { includes(:posts) }
|
561
|
+
end
|
562
|
+
```
|
563
|
+
|
564
|
+
The default index name for model indexes is
|
565
|
+
`#{model_name.plural}_#{RediSearch.env}`. The `redi_search` method takes an
|
566
|
+
optional `index_prefix` argument which gets prepended to the index name:
|
567
|
+
|
568
|
+
```ruby
|
569
|
+
class User < ApplicationRecord
|
570
|
+
redi_search schema: {
|
571
|
+
first: { text: { phonetic: "dm:en" } },
|
572
|
+
last: { text: { phonetic: "dm:en" } }
|
573
|
+
}, index_prefix: 'prefix'
|
574
|
+
end
|
575
|
+
|
576
|
+
User.redi_search_index.name
|
577
|
+
# => prefix_users_development
|
578
|
+
```
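
The naming rule can be written out as a few lines of plain Ruby. `index_name` is a hypothetical helper that mirrors the rule and the sample output above; it is not the gem's code.

```ruby
# Hypothetical helper mirroring the naming rule described above:
# optional prefix, pluralized model name, then the environment.
def index_name(plural_model_name, env, index_prefix: nil)
  [index_prefix, plural_model_name, env].compact.join("_")
end

puts index_name("users", "development")                         # prints users_development
puts index_name("users", "development", index_prefix: "prefix") # prints prefix_users_development
```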
|
579
|
+
|
580
|
+
When integrating RediSearch into a model, records will automatically be indexed
|
581
|
+
after creating and updating and will be removed from the index upon destruction.
|
582
|
+
|
583
|
+
There are a few more convenience methods that are publicly available:
|
584
|
+
- `redi_search_document`
|
585
|
+
- Returns the record as a `RediSearch::Document` instance
|
586
|
+
- `redi_search_delete_document`
|
587
|
+
- Removes the record from the index
|
588
|
+
- `redi_search_add_document`
|
589
|
+
- Adds the record to the index
|
590
|
+
- `redi_search_index`
|
591
|
+
- Returns the `RediSearch::Index` instance
|
592
|
+
|
203
593
|
|
204
594
|
## Development
|
205
595
|
|
206
|
-
After checking out the repo, run `bin/setup` to install dependencies. Then, run
|
596
|
+
After checking out the repo, run `bin/setup` to install dependencies. Then, run
|
597
|
+
`rake test` to run the tests. You can also run `bin/console` for an interactive
|
598
|
+
prompt that will allow you to experiment. You can also start a rails console if
|
599
|
+
you `cd` into `test/dummy`.
|
207
600
|
|
208
|
-
To install this gem onto your local machine, run `bundle exec rake install`. To
|
601
|
+
To install this gem onto your local machine, run `bundle exec rake install`. To
|
602
|
+
release a new version, execute `bin/publish (major|minor|patch)` which will
|
603
|
+
update the version number in `version.rb`, create a git tag for the version,
|
604
|
+
push git commits and tags, and push the `.gem` file to
|
605
|
+
[rubygems.org](https://rubygems.org).
|
209
606
|
|
210
607
|
## Contributing
|
211
608
|
|
212
|
-
Bug reports and pull requests are welcome on
|
609
|
+
Bug reports and pull requests are welcome on
|
610
|
+
[GitHub](https://github.com/npezza93/redi_search). This project is intended to
|
611
|
+
be a safe, welcoming space for collaboration, and contributors are expected to
|
612
|
+
adhere to the [Contributor Covenant](http://contributor-covenant.org) code of
|
613
|
+
conduct.
|
213
614
|
|
214
615
|
## License
|
215
616
|
|
216
|
-
The gem is available as open source under the terms of the
|
617
|
+
The gem is available as open source under the terms of the
|
618
|
+
[MIT License](https://opensource.org/licenses/MIT).
|
217
619
|
|
218
620
|
## Code of Conduct
|
219
621
|
|
220
|
-
Everyone interacting in the RediSearch project’s codebases, issue trackers, chat
|
622
|
+
Everyone interacting in the RediSearch project’s codebases, issue trackers, chat
|
623
|
+
rooms and mailing lists is expected to follow the [code of
|
624
|
+
conduct](https://github.com/npezza93/redi_search/blob/master/CODE_OF_CONDUCT.md).
|