redi_search 0.1.0 → 1.0.0
- checksums.yaml +4 -4
- data/README.md +516 -112
- data/lib/redi_search.rb +5 -2
- data/lib/redi_search/add.rb +70 -0
- data/lib/redi_search/alter.rb +30 -0
- data/lib/redi_search/create.rb +53 -0
- data/lib/redi_search/document.rb +71 -16
- data/lib/redi_search/index.rb +31 -26
- data/lib/redi_search/lazily_load.rb +65 -0
- data/lib/redi_search/log_subscriber.rb +4 -0
- data/lib/redi_search/model.rb +41 -18
- data/lib/redi_search/schema.rb +17 -8
- data/lib/redi_search/schema/text_field.rb +0 -2
- data/lib/redi_search/search.rb +22 -44
- data/lib/redi_search/search/clauses.rb +60 -31
- data/lib/redi_search/search/clauses/and.rb +17 -0
- data/lib/redi_search/search/clauses/application_clause.rb +18 -0
- data/lib/redi_search/search/clauses/boolean.rb +72 -0
- data/lib/redi_search/search/clauses/highlight.rb +47 -0
- data/lib/redi_search/search/clauses/in_order.rb +17 -0
- data/lib/redi_search/search/clauses/language.rb +23 -0
- data/lib/redi_search/search/clauses/limit.rb +27 -0
- data/lib/redi_search/search/clauses/no_content.rb +17 -0
- data/lib/redi_search/search/clauses/no_stop_words.rb +17 -0
- data/lib/redi_search/search/clauses/or.rb +23 -0
- data/lib/redi_search/search/clauses/return.rb +23 -0
- data/lib/redi_search/search/clauses/slop.rb +23 -0
- data/lib/redi_search/search/clauses/sort_by.rb +25 -0
- data/lib/redi_search/search/clauses/verbatim.rb +17 -0
- data/lib/redi_search/search/clauses/where.rb +66 -0
- data/lib/redi_search/search/clauses/with_scores.rb +17 -0
- data/lib/redi_search/search/result.rb +46 -0
- data/lib/redi_search/search/term.rb +4 -4
- data/lib/redi_search/spellcheck.rb +30 -29
- data/lib/redi_search/spellcheck/result.rb +44 -0
- data/lib/redi_search/version.rb +1 -1
- metadata +101 -31
- data/.gitignore +0 -11
- data/.rubocop.yml +0 -1757
- data/.travis.yml +0 -23
- data/Gemfile +0 -17
- data/Rakefile +0 -12
- data/bin/console +0 -8
- data/bin/publish +0 -58
- data/bin/setup +0 -8
- data/bin/test +0 -7
- data/lib/redi_search/document/converter.rb +0 -26
- data/lib/redi_search/error.rb +0 -6
- data/lib/redi_search/result/collection.rb +0 -22
- data/lib/redi_search/search/and_clause.rb +0 -15
- data/lib/redi_search/search/boolean_clause.rb +0 -72
- data/lib/redi_search/search/highlight_clause.rb +0 -43
- data/lib/redi_search/search/or_clause.rb +0 -21
- data/lib/redi_search/search/where_clause.rb +0 -66
- data/redi_search.gemspec +0 -48
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA256:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: 0666cc15e7fc7d132026337c916b5d233114d65378ff172d9a48ae81ed1a9eae
|
4
|
+
data.tar.gz: 575c1230885537ea9a31b753808894d187f00718fe8d0fca282cba426b75e0c8
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: 0f1486c6d2a7b1a726d6c956398e693317df4efffa0c73e45b180a3acb2ac36b69284e68e3230862e6948015b1ceee0a1ac518dab6182c86e4b2c710f587cd16
|
7
|
+
data.tar.gz: c9e179c20fc545612a00fd2f9b46e5a2eae9ce1292d4635625a80abf641d726d248ee753633dfa8e292b0dba1543af410e58cf5a10380d7854c66af2c06326a1
|
data/README.md
CHANGED
@@ -1,42 +1,45 @@
|
|
1
|
+
<p align="center">
|
2
|
+
<a href="https://github.com/npezza93/redi_search">
|
3
|
+
<img src="https://raw.githubusercontent.com/npezza93/redi_search/master/.github/logo.svg?sanitize=true" width="350">
|
4
|
+
</a>
|
5
|
+
</p>
|
6
|
+
|
1
7
|
# RediSearch
|
2
8
|
|
3
9
|
[![Build Status](https://travis-ci.com/npezza93/redi_search.svg?branch=master)](https://travis-ci.com/npezza93/redi_search)
|
4
10
|
[![Test Coverage](https://api.codeclimate.com/v1/badges/c6437acac5684de2549d/test_coverage)](https://codeclimate.com/github/npezza93/redi_search/test_coverage)
|
5
11
|
[![Maintainability](https://api.codeclimate.com/v1/badges/c6437acac5684de2549d/maintainability)](https://codeclimate.com/github/npezza93/redi_search/maintainability)
|
6
12
|
|
7
|
-
A simple, but powerful Ruby wrapper around RediSearch,
|
8
|
-
|
13
|
+
A simple, but powerful, Ruby wrapper around RediSearch, a search engine on top of
|
14
|
+
Redis.
|
9
15
|
|
10
16
|
## Installation
|
11
17
|
|
12
18
|
Firstly, Redis and RediSearch need to be installed.
|
13
19
|
|
14
|
-
You can download Redis from https://redis.io/download, and check out
|
15
|
-
|
16
|
-
|
17
|
-
|
18
|
-
1. `cd RediSearch`
|
19
|
-
1. `mkdir build`
|
20
|
-
1. `cd build`
|
21
|
-
1. `cmake .. -DCMAKE_BUILD_TYPE=RelWithDebInfo`
|
22
|
-
1. `make`
|
23
|
-
1. `redis-server --loadmodule ./redisearch.so or load the module in your redis.conf`
|
20
|
+
You can download Redis from https://redis.io/download, and check out
|
21
|
+
installation instructions
|
22
|
+
[here](https://github.com/antirez/redis#installing-redis). Alternatively, on
|
23
|
+
macOS or Linux you can install via Homebrew.
|
24
24
|
|
25
|
-
|
25
|
+
To install RediSearch, check out
|
26
|
+
[https://oss.redislabs.com/redisearch/Quick_Start.html](https://oss.redislabs.com/redisearch/Quick_Start.html).
|
27
|
+
Once you have RediSearch built, you can update your redis.conf file to always
|
28
|
+
load the redisearch module with `loadmodule /path/to/redisearch.so`. (On macOS
|
29
|
+
the redis.conf file can be found at `/usr/local/etc/redis.conf`.)
|
26
30
|
|
27
|
-
|
28
|
-
After Redis and RediSearch are up and running, add this line to your application's Gemfile:
|
31
|
+
After Redis and RediSearch are up and running, add this line to your Gemfile:
|
29
32
|
|
30
33
|
```ruby
|
31
34
|
gem 'redi_search'
|
32
35
|
```
|
33
36
|
|
34
|
-
And then
|
37
|
+
And then:
|
35
38
|
```bash
|
36
39
|
❯ bundle
|
37
40
|
```
|
38
41
|
|
39
|
-
Or install it yourself
|
42
|
+
Or install it yourself:
|
40
43
|
```bash
|
41
44
|
❯ gem install redi_search
|
42
45
|
```
|
@@ -46,9 +49,10 @@ and require it:
|
|
46
49
|
require 'redi_search'
|
47
50
|
```
|
48
51
|
|
49
|
-
|
52
|
+
Once the gem is installed and required, you'll need to configure it with your
|
53
|
+
Redis configuration. If you're on Rails, this should go in an initializer
|
54
|
+
(`config/initializers/redi_search.rb`):
|
50
55
|
|
51
|
-
### Configuration
|
52
56
|
```ruby
|
53
57
|
RediSearch.configure do |config|
|
54
58
|
config.redis_config = {
|
@@ -58,108 +62,341 @@ RediSearch.configure do |config|
|
|
58
62
|
end
|
59
63
|
```
|
60
64
|
|
61
|
-
|
65
|
+
## Table of Contents
|
66
|
+
- [Preface](#preface)
|
67
|
+
- [Schema](#schema)
|
68
|
+
- [Document](#document)
|
69
|
+
- [Index](#index)
|
70
|
+
- [Searching](#searching)
|
71
|
+
- [Spellcheck](#spellcheck)
|
72
|
+
- [Rails Integration](#rails-integration)
|
73
|
+
|
74
|
+
|
75
|
+
## Preface
|
76
|
+
RediSearch revolves around a search index, so let's start with
|
77
|
+
defining what a search index is. According to [Swiftype](https://swiftype.com):
|
78
|
+
> A search index is a body of structured data that a search engine refers to
|
79
|
+
> when looking for results that are relevant to a specific query. Indexes are a
|
80
|
+
> critical piece of any search system, since they must be tailored to the
|
81
|
+
> specific information retrieval method of the search engine’s algorithm. In
|
82
|
+
> this manner, the algorithm and the index are inextricably linked to one
|
83
|
+
> another. Index can also be used as a verb (indexing), referring to the process
|
84
|
+
> of collecting unstructured website data in a structured format that is
|
85
|
+
> tailored for the search engine algorithm.
|
86
|
+
>
|
87
|
+
> One way to think about indices is to consider the following analogy between a
|
88
|
+
> search infrastructure and an office filing system. Imagine you hand an intern
|
89
|
+
> a stack of thousands of pieces of paper (documents) and tell them to organize
|
90
|
+
> these pieces of paper in a filing cabinet (index) to help the company find
|
91
|
+
> information more efficiently. The intern will first have to sort through the
|
92
|
+
> papers and get a sense of all the information contained within them, then they
|
93
|
+
> will have to decide on a system for arranging them in the filing cabinet, then
|
94
|
+
> finally they’ll need to decide what is the most effective manner for searching
|
95
|
+
> through and selecting from the files once they are in the cabinet. In this
|
96
|
+
> example, the process of organizing and filing the papers corresponds to the
|
97
|
+
> process of indexing website content, and the method for searching across these
|
98
|
+
> organized files and finding those that are most relevant corresponds to the
|
99
|
+
> search algorithm.
|
100
|
+
|
101
|
+
|
102
|
+
## Schema
|
103
|
+
|
104
|
+
This defines the fields and the properties of those fields in the index. A
|
105
|
+
schema is a hash, with field names as the keys, and the field type (and options)
|
106
|
+
as the value. Each field can be one of four types: geo, numeric, tag, or text
|
107
|
+
and can have many options. A simple example of a schema is:
|
108
|
+
```ruby
|
109
|
+
{ first_name: :text, last_name: :text }
|
110
|
+
```
|
111
|
+
|
112
|
+
The supported options for each type are as follows:
|
113
|
+
|
114
|
+
##### Text field
|
115
|
+
With no options: `{ name: :text }`
|
116
|
+
|
117
|
+
<details>
|
118
|
+
<summary>Options</summary>
|
119
|
+
<ul>
|
120
|
+
<li>
|
121
|
+
<b>weight</b> (default: 1.0)
|
122
|
+
<ul>
|
123
|
+
<li>Declares the importance of this field when calculating result accuracy. This is a multiplication factor.</li>
|
124
|
+
<li>Ex: <code>{ name: { text: { weight: 2 } } }</code></li>
|
125
|
+
</ul>
|
126
|
+
</li>
|
127
|
+
<li>
|
128
|
+
<b>phonetic</b>
|
129
|
+
<ul>
|
130
|
+
<li>Will perform phonetic matching on field in searches by default. The obligatory {matcher} argument specifies the phonetic algorithm and language used. The following matchers are supported:
|
131
|
+
<ul>
|
132
|
+
<li>dm:en - Double Metaphone for English</li>
|
133
|
+
<li>dm:fr - Double Metaphone for French</li>
|
134
|
+
<li>dm:pt - Double Metaphone for Portuguese</li>
|
135
|
+
<li>dm:es - Double Metaphone for Spanish</li>
|
136
|
+
</ul>
|
137
|
+
</li>
|
138
|
+
<li>
|
139
|
+
Ex: <code>{ name: { text: { phonetic: 'dm:en' } } }</code>
|
140
|
+
</li>
|
141
|
+
</ul>
|
142
|
+
</li>
|
143
|
+
<li>
|
144
|
+
<b>sortable</b> (default: false)
|
145
|
+
<ul>
|
146
|
+
<li>Allows the user to later sort the results by the value of this field (this adds memory overhead so do not declare it on large text fields).</li>
|
147
|
+
<li>Ex: <code>{ name: { text: { sortable: true } } }</code></li>
|
148
|
+
</ul>
|
149
|
+
</li>
|
150
|
+
<li>
|
151
|
+
<b>no_index</b> (default: false)
|
152
|
+
<ul>
|
153
|
+
<li>Field will not be indexed. This is useful in conjunction with <code>sortable</code>, to create fields whose update using PARTIAL will not cause full reindexing of the document. If a field has <code>no_index</code> and doesn't have <code>sortable</code>, it will just be ignored by the index.</li>
|
154
|
+
<li>Ex: <code>{ name: { text: { no_index: true } } }</code></li>
|
155
|
+
</ul>
|
156
|
+
</li>
|
157
|
+
<li>
|
158
|
+
<b>no_stem</b> (default: false)
|
159
|
+
<ul>
|
160
|
+
<li>Disable stemming when indexing its values. This may be ideal for things like proper names.</li>
|
161
|
+
<li>Ex: <code>{ name: { text: { no_stem: true } } }</code></li>
|
162
|
+
</ul>
|
163
|
+
</li>
|
164
|
+
</ul>
|
165
|
+
</details>
|
166
|
+
|
167
|
+
##### Numeric field
|
168
|
+
With no options: `{ price: :numeric }`
|
169
|
+
|
170
|
+
<details>
|
171
|
+
<summary>Options</summary>
|
172
|
+
<ul>
|
173
|
+
<li>
|
174
|
+
<b>sortable</b> (default: false)
|
175
|
+
<ul>
|
176
|
+
<li>Allows the user to later sort the results by the value of this field (this adds memory overhead so do not declare it on large text fields).</li>
|
177
|
+
<li>Ex: <code>{ id: { numeric: { sortable: true } } }</code></li>
|
178
|
+
</ul>
|
179
|
+
</li>
|
180
|
+
<li>
|
181
|
+
<b>no_index</b> (default: false)
|
182
|
+
<ul>
|
183
|
+
<li>Field will not be indexed. This is useful in conjunction with <code>sortable</code>, to create fields whose update using PARTIAL will not cause full reindexing of the document. If a field has <code>no_index</code> and doesn't have <code>sortable</code>, it will just be ignored by the index.</li>
|
184
|
+
<li>Ex: <code>{ id: { numeric: { no_index: true } } }</code></li>
|
185
|
+
</ul>
|
186
|
+
</li>
|
187
|
+
</ul>
|
188
|
+
</details>
|
189
|
+
|
190
|
+
##### Tag field
|
191
|
+
With no options: `{ tag: :tag }`
|
192
|
+
|
193
|
+
<details>
|
194
|
+
<summary>Options</summary>
|
195
|
+
<ul>
|
196
|
+
<li>
|
197
|
+
<b>sortable</b> (default: false)
|
198
|
+
<ul>
|
199
|
+
<li>Allows the user to later sort the results by the value of this field (this adds memory overhead so do not declare it on large text fields).</li>
|
200
|
+
<li>Ex: <code>{ tag: { tag: { sortable: true } } }</code></li>
|
201
|
+
</ul>
|
202
|
+
</li>
|
203
|
+
<li>
|
204
|
+
<b>no_index</b> (default: false)
|
205
|
+
<ul>
|
206
|
+
<li>Field will not be indexed. This is useful in conjunction with <code>sortable</code>, to create fields whose update using PARTIAL will not cause full reindexing of the document. If a field has <code>no_index</code> and doesn't have <code>sortable</code>, it will just be ignored by the index.</li>
|
207
|
+
<li>Ex: <code>{ tag: { tag: { no_index: true } } }</code></li>
|
208
|
+
</ul>
|
209
|
+
</li>
|
210
|
+
<li>
|
211
|
+
<b>separator</b> (default: ",")
|
212
|
+
<ul>
|
213
|
+
<li>Indicates how the text contained in the field is to be split into individual tags. The default is <code>,</code>. The value must be a single character.</li>
|
214
|
+
<li>Ex: <code>{ tag: { tag: { separator: ',' } } }</code></li>
|
215
|
+
</ul>
|
216
|
+
</li>
|
217
|
+
</ul>
|
218
|
+
</details>
|
219
|
+
|
220
|
+
##### Geo field
|
221
|
+
With no options: `{ place: :geo }`
|
222
|
+
|
223
|
+
<details>
|
224
|
+
<summary>Options</summary>
|
225
|
+
<ul>
|
226
|
+
<li>
|
227
|
+
<b>sortable</b> (default: false)
|
228
|
+
<ul>
|
229
|
+
<li>Allows the user to later sort the results by the value of this field (this adds memory overhead so do not declare it on large text fields).</li>
|
230
|
+
<li>Ex: <code>{ place: { geo: { sortable: true } } }</code></li>
|
231
|
+
</ul>
|
232
|
+
</li>
|
233
|
+
<li>
|
234
|
+
<b>no_index</b> (default: false)
|
235
|
+
<ul>
|
236
|
+
<li>Field will not be indexed. This is useful in conjunction with <code>sortable</code>, to create fields whose update using PARTIAL will not cause full reindexing of the document. If a field has <code>no_index</code> and doesn't have <code>sortable</code>, it will just be ignored by the index.</li>
|
237
|
+
<li>Ex: <code>{ place: { geo: { no_index: true } } }</code></li>
|
238
|
+
</ul>
|
239
|
+
</li>
|
240
|
+
</ul>
|
241
|
+
</details>
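
To make the shape of a schema hash concrete, here is a minimal sketch of how such a hash could be flattened into the `SCHEMA` tokens that RediSearch's `FT.CREATE` expects. `schema_to_args` and the two lookup constants are hypothetical names introduced for illustration; this is not the gem's own implementation.

```ruby
# Hypothetical helper, NOT part of redi_search. It only illustrates how a
# schema hash could map onto the SCHEMA section of RediSearch's FT.CREATE.
FLAG_OPTIONS = { sortable: "SORTABLE", no_index: "NOINDEX", no_stem: "NOSTEM" }.freeze
VALUE_OPTIONS = { weight: "WEIGHT", phonetic: "PHONETIC", separator: "SEPARATOR" }.freeze

def schema_to_args(schema)
  schema.flat_map do |field_name, definition|
    # `:text` is treated as shorthand for `{ text: {} }`
    type, options = definition.is_a?(Hash) ? definition.first : [definition, {}]

    args = [field_name.to_s, type.to_s.upcase]
    options.each do |option, value|
      if FLAG_OPTIONS.key?(option)
        # Boolean options become a bare keyword when enabled
        args << FLAG_OPTIONS[option] if value
      elsif VALUE_OPTIONS.key?(option)
        # Valued options become a keyword followed by the value
        args.push(VALUE_OPTIONS[option], value.to_s)
      else
        raise ArgumentError, "unknown option #{option.inspect} for #{field_name}"
      end
    end
    args
  end
end

schema = {
  first_name: :text,
  name: { text: { weight: 2, sortable: true } },
  tags: { tag: { separator: "|" } }
}
puts schema_to_args(schema).join(" ")
# prints: first_name TEXT name TEXT WEIGHT 2 SORTABLE tags TAG SEPARATOR |
```

The sketch emits options in hash insertion order; a real client may need to reorder them to match what `FT.CREATE` accepts.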
|
242
|
+
|
243
|
+
## Document
|
244
|
+
|
245
|
+
A `Document` is the Ruby representation of a RediSearch document.
|
246
|
+
|
247
|
+
You can fetch a `Document` using the `.get` or `.mget` class methods.
|
248
|
+
- `get(index, document_id)` fetches a single document in an `Index` for a given
|
249
|
+
`document_id`.
|
250
|
+
- `mget(index, *document_ids)` fetches a collection of `Document`s
|
251
|
+
in an `Index` for the given `document_ids`.
|
252
|
+
|
253
|
+
You can also make a `Document` instance using the
|
254
|
+
`.for_object(index, record, serializer: nil, only: [])` class method. It takes
|
255
|
+
an `Index` instance and a Ruby object. That object must respond to all the
|
256
|
+
fields specified in the index's schema, or you can pass a serializer class that accepts
|
257
|
+
the object and responds to all the fields specified in the index's schema.
|
258
|
+
`only` accepts an array of fields from the schema and limits the fields that are
|
259
|
+
passed to the `Document`.
|
260
|
+
|
261
|
+
Once you have an instance of a `Document`, it responds to all the fields
|
262
|
+
specified in the index's schema as methods, plus `document_id`. `document_id` is
|
263
|
+
automatically prefixed with the index's name, unless it already is, to ensure
|
264
|
+
uniqueness. We prepend the index name because if you have two documents with the
|
265
|
+
same id in different indexes, we don't want the documents to overwrite each other.
|
266
|
+
There is also a `#document_id_without_index` method which removes the prepended
|
267
|
+
index name.
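
The prefixing rule described above can be sketched in plain Ruby. Both helper names below are hypothetical, and joining the index name and the id with no separator is an assumption made for illustration; the gem's actual format may differ.

```ruby
# Hypothetical helpers mirroring the behaviour described above. How
# redi_search actually joins the index name and the id is an assumption.
def prefixed_document_id(index_name, id)
  id = id.to_s
  # Leave ids alone when they already carry the index name
  id.start_with?(index_name) ? id : "#{index_name}#{id}"
end

def document_id_without_index(index_name, document_id)
  document_id.to_s.delete_prefix(index_name)
end

puts prefixed_document_id("user_idx", 10039)                # prints: user_idx10039
puts prefixed_document_id("user_idx", "user_idx10039")      # prints: user_idx10039
puts document_id_without_index("user_idx", "user_idx10039") # prints: 10039
```

Because the prefix is only added when missing, applying the helper twice yields the same id, which is what keeps documents from two indexes from colliding without double-prefixing.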
|
268
|
+
|
269
|
+
Finally there is a `#del` method that will remove the document from the index.
|
270
|
+
It optionally accepts a `delete_document` named argument that signifies whether
|
271
|
+
the document should be completely removed from the Redis instance vs just the
|
272
|
+
index.
|
273
|
+
|
274
|
+
|
275
|
+
## Index
|
276
|
+
|
277
|
+
To initialize an index, pass the name of the index as a string or symbol and the
|
278
|
+
schema.
|
62
279
|
|
63
|
-
All actions revolve around indexes. To instantiate one:
|
64
280
|
```ruby
|
65
281
|
RediSearch::Index.new(name_of_index, schema)
|
66
282
|
```
|
67
|
-
|
68
|
-
|
69
|
-
|
70
|
-
-
|
71
|
-
-
|
72
|
-
|
73
|
-
|
74
|
-
|
75
|
-
|
76
|
-
|
77
|
-
|
78
|
-
|
79
|
-
|
80
|
-
-
|
81
|
-
|
82
|
-
|
83
|
-
-
|
84
|
-
|
85
|
-
|
86
|
-
|
87
|
-
|
88
|
-
|
89
|
-
|
90
|
-
|
91
|
-
|
92
|
-
|
93
|
-
|
94
|
-
|
95
|
-
|
96
|
-
|
97
|
-
|
98
|
-
|
99
|
-
-
|
100
|
-
-
|
101
|
-
|
102
|
-
|
103
|
-
|
104
|
-
-
|
105
|
-
-
|
106
|
-
-
|
107
|
-
|
108
|
-
|
109
|
-
|
110
|
-
-
|
111
|
-
|
112
|
-
|
113
|
-
-
|
114
|
-
|
115
|
-
|
116
|
-
|
117
|
-
|
118
|
-
-
|
119
|
-
|
120
|
-
|
121
|
-
|
122
|
-
-
|
123
|
-
|
124
|
-
-
|
125
|
-
-
|
126
|
-
|
127
|
-
|
128
|
-
-
|
129
|
-
-
|
130
|
-
-
|
131
|
-
-
|
132
|
-
|
133
|
-
|
134
|
-
|
135
|
-
|
283
|
+
|
284
|
+
#### Available Commands
|
285
|
+
|
286
|
+
- `create`
|
287
|
+
- Creates the index in the Redis instance, returns a boolean. Has an
|
288
|
+
accompanying bang method that will raise an exception upon failure. Will
|
289
|
+
return `false` if the index already exists. Accepts a few options:
|
290
|
+
- `max_text_fields: #{true || false}`
|
291
|
+
- For efficiency, RediSearch encodes indexes differently if they are
|
292
|
+
created with fewer than 32 text fields. This option forces RediSearch
|
293
|
+
to encode indexes as if there were more than 32 text fields, which
|
294
|
+
allows you to add additional fields (beyond 32) using `alter`.
|
295
|
+
- `no_offsets: #{true || false}`
|
296
|
+
- If set, we do not store term offsets for documents (saves memory, does
|
297
|
+
not allow exact searches or highlighting). Implies `no_highlight`.
|
298
|
+
- `temporary: #{seconds}`
|
299
|
+
- Create a lightweight temporary index which will expire after `seconds`
|
300
|
+
seconds of inactivity. The internal idle timer is reset whenever the
|
301
|
+
index is searched or added to. Because such indexes are lightweight,
|
302
|
+
you can create thousands of such indexes without negative performance
|
303
|
+
implications.
|
304
|
+
- `no_highlight: #{true || false}`
|
305
|
+
- Conserves storage space and memory by disabling highlighting support.
|
306
|
+
If set, we do not store corresponding byte offsets for term positions.
|
307
|
+
`no_highlight` is also implied by `no_offsets`.
|
308
|
+
- `no_fields: #{true || false}`
|
309
|
+
- If set, we do not store field bits for each term. Saves memory, does
|
310
|
+
not allow filtering by specific fields.
|
311
|
+
- `no_frequencies: #{true || false}`
|
312
|
+
- If set, we avoid saving the term frequencies in the index. This saves
|
313
|
+
memory but does not allow sorting based on the frequencies of a given
|
314
|
+
term within the document.
|
315
|
+
- `drop`
|
316
|
+
- Drops the index from the Redis instance, returns a boolean. Has an
|
317
|
+
accompanying bang method that will raise an exception upon failure. Will
|
318
|
+
return `false` if the index has already been dropped.
|
319
|
+
- `exist?`
|
320
|
+
- Returns a boolean signifying index existence.
|
321
|
+
- `info`
|
322
|
+
- Returns a struct object with all the information about the index.
|
323
|
+
- `fields`
|
324
|
+
- Returns an array of the field names in the index.
|
325
|
+
- `add(document, score: 1.0, replace: {}, language: nil, no_save: false)`
|
326
|
+
- Takes a `Document` object and options. Has an
|
327
|
+
accompanying bang method that will raise an exception upon failure.
|
328
|
+
- `score` -> The document's rank, a value between 0.0 and 1.0
|
329
|
+
- `language` -> Use a stemmer for the supplied language during indexing.
|
330
|
+
- `no_save` -> Don't save the actual document in the database and only index it.
|
331
|
+
- `replace` -> Accepts a boolean or a hash. If a truthy value is passed, we
|
332
|
+
will do an UPSERT style insertion - and delete an older version of the
|
333
|
+
document if it exists.
|
334
|
+
- `replace: { partial: true }` -> Allows you to not have to specify all
|
335
|
+
fields for reindexing. Fields not given to the command will be loaded from
|
336
|
+
the current version of the document.
|
337
|
+
- `add_multiple!(documents, score: 1.0, replace: {}, language: nil, no_save: false)`
|
338
|
+
- Takes an array of `Document` objects. This provides a more performant way to
|
339
|
+
add multiple documents to the index. Accepts the same options as `add`.
|
340
|
+
- `del(document, delete_document: false)`
|
341
|
+
- Removes a document from the index. `delete_document` signifies whether the
|
342
|
+
document should be completely removed from the Redis instance vs just the
|
343
|
+
index.
|
344
|
+
- `document_count`
|
345
|
+
- Returns the number of documents in the index
|
346
|
+
- `alter(field_name, schema)`
|
347
|
+
- Adds a new field to the index. Ex: `index.alter(:first_name, text: { phonetic: "dm:en" })`
|
348
|
+
- `reindex(documents, recreate: false, **options)`
|
349
|
+
- If `recreate` is `true` the index will be dropped and recreated
|
350
|
+
- `options` accepts the same options as `add`
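
As a rough illustration of what the `add` options mean at the Redis level, the sketch below maps them onto the argument order of RediSearch's `FT.ADD` command. `add_command` is a hypothetical helper, not the gem's code, and treating the default `replace: {}` as "do not replace" is an assumption.

```ruby
# Hypothetical builder, not the gem's code: maps the `add` options onto the
# argument order of RediSearch's FT.ADD command.
def add_command(index_name, document_id, fields, score: 1.0, replace: {}, language: nil, no_save: false)
  command = ["FT.ADD", index_name, document_id, score]
  command << "NOSAVE" if no_save

  # Assumption: an empty hash (the default) means "do not replace"
  replacing = replace.is_a?(Hash) ? replace.any? : replace
  if replacing
    command << "REPLACE"
    command << "PARTIAL" if replace.is_a?(Hash) && replace[:partial]
  end

  command.push("LANGUAGE", language) if language
  command.push("FIELDS", *fields.flat_map { |name, value| [name.to_s, value.to_s] })
end

p add_command("user_idx", "user_idx1", { name: "Gene" }, replace: { partial: true })
# prints: ["FT.ADD", "user_idx", "user_idx1", 1.0, "REPLACE", "PARTIAL", "FIELDS", "name", "Gene"]
```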
|
351
|
+
|
352
|
+
|
353
|
+
## Searching
|
354
|
+
|
355
|
+
Searching is initiated off a `RediSearch::Index` instance with clauses that can
|
356
|
+
be chained together. When searching, an array of `Document`s is returned
|
357
|
+
each of which has public reader methods for all the schema fields and a `document_id`
|
358
|
+
method which returns the id of the document prefixed with the index name.
|
359
|
+
|
136
360
|
```ruby
|
137
361
|
main ❯ index = RediSearch::Index.new("user_idx", name: { text: { phonetic: "dm:en" } })
|
362
|
+
main ❯ index.add RediSearch::Document.for_object(index, User.new("10039", "Gene", "Volkman"))
|
363
|
+
main ❯ index.add RediSearch::Document.for_object(index, User.new("9998", "Jeannie", "Ledner"))
|
138
364
|
main ❯ index.search("john")
|
139
365
|
RediSearch (1.1ms) FT.SEARCH user_idx `john`
|
140
366
|
=> [#<RediSearch::Document:0x00007f862e241b78 first: "Gene", last: "Volkman", document_id: "10039">,
|
141
367
|
#<RediSearch::Document:0x00007f862e2417b8 first: "Jeannie", last: "Ledner", document_id: "9998">]
|
142
368
|
```
|
143
|
-
|
369
|
+
**Simple phrase query** - `hello AND world`
|
144
370
|
```ruby
|
145
371
|
index.search("hello").and("world")
|
146
372
|
```
|
147
|
-
|
373
|
+
**Exact phrase query** - `hello FOLLOWED BY world`
|
148
374
|
```ruby
|
149
375
|
index.search("hello world")
|
150
376
|
```
|
151
|
-
|
377
|
+
**Union query** - `hello OR world`
|
152
378
|
```ruby
|
153
379
|
index.search("hello").or("world")
|
154
380
|
```
|
155
|
-
|
381
|
+
**Negation query** - `hello AND NOT world`
|
156
382
|
```ruby
|
157
383
|
index.search("hello").and.not("world")
|
158
384
|
```
|
159
385
|
|
386
|
+
Complex intersections and unions:
|
387
|
+
```ruby
|
388
|
+
# Intersection of unions
|
389
|
+
index.search(index.search("hello").or("halo")).and(index.search("world").or("werld"))
|
390
|
+
# Negation of union
|
391
|
+
index.search("hello").and.not(index.search("world").or("werld"))
|
392
|
+
# Union inside phrase
|
393
|
+
index.search("hello").and(index.search("world").or("werld"))
|
394
|
+
```
|
395
|
+
|
160
396
|
All terms support a few options that can be applied.
|
161
397
|
|
162
|
-
|
398
|
+
**Prefix terms**: match all terms starting with a prefix.
|
399
|
+
(Akin to `like term%` in SQL)
|
163
400
|
```ruby
|
164
401
|
index.search("hel", prefix: true)
|
165
402
|
index.search("hello worl", prefix: true)
|
@@ -167,29 +404,112 @@ index.search("hel", prefix: true).and("worl", prefix: true)
|
|
167
404
|
index.search("hello").and.not("worl", prefix: true)
|
168
405
|
```
|
169
406
|
|
170
|
-
|
407
|
+
**Optional terms**: documents containing the optional terms will rank higher
|
408
|
+
than those without
|
171
409
|
```ruby
|
172
410
|
index.search("foo").and("bar", optional: true).and("baz", optional: true)
|
173
411
|
```
|
174
412
|
|
175
|
-
|
413
|
+
**Fuzzy terms**: matches are performed based on Levenshtein distance (LD). The
|
414
|
+
maximum Levenshtein distance supported is 3.
|
176
415
|
```ruby
|
177
416
|
index.search("zuchini", fuzziness: 1)
|
178
417
|
```
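
For readers unfamiliar with Levenshtein distance, the following plain-Ruby sketch computes it with the standard dynamic-programming recurrence. It is independent of both the gem and RediSearch's internal matcher, and is only meant to show why `"zuchini"` matches `"zucchini"` at `fuzziness: 1`.

```ruby
# Standard dynamic-programming Levenshtein distance: the minimum number of
# single-character insertions, deletions, or substitutions turning a into b.
def levenshtein(a, b)
  # previous[j] holds the distance between the processed prefix of a and b[0, j]
  previous = (0..b.length).to_a

  a.each_char.with_index(1) do |char_a, i|
    current = [i]
    b.each_char.with_index(1) do |char_b, j|
      cost = char_a == char_b ? 0 : 1
      current << [
        previous[j] + 1,        # deletion
        current[j - 1] + 1,     # insertion
        previous[j - 1] + cost  # substitution (or match)
      ].min
    end
    previous = current
  end

  previous.last
end

puts levenshtein("zuchini", "zucchini") # prints: 1
puts levenshtein("kitten", "sitting")   # prints: 3
```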
|
179
418
|
|
180
|
-
|
419
|
+
Search terms can also be scoped to specific fields using a `where` clause:
|
181
420
|
```ruby
|
182
|
-
#
|
183
|
-
index.search
|
184
|
-
#
|
185
|
-
index.search
|
186
|
-
#
|
187
|
-
index.search
|
421
|
+
# Simple field specific query
|
422
|
+
index.search.where(name: "john")
|
423
|
+
# Using where with options
|
424
|
+
index.search.where(first: "jon", fuzziness: 1)
|
425
|
+
# Using where with more complex query
|
426
|
+
index.search.where(first: index.search("bill").or("bob"))
|
427
|
+
```
|
428
|
+
|
429
|
+
Searching for numeric fields takes a range:
|
430
|
+
```ruby
|
431
|
+
index.search.where(number: 0..100)
|
432
|
+
# Searching to infinity
|
433
|
+
index.search.where(number: 0..Float::INFINITY)
|
434
|
+
index.search.where(number: -Float::INFINITY..0)
|
188
435
|
```
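
A Ruby range translates naturally into RediSearch's numeric filter syntax, `@field:[min max]`, where infinite bounds are written `-inf` and `+inf` and an exclusive bound is marked with `(`. The sketch below shows one way that translation could work; `numeric_clause` and `numeric_bound` are hypothetical helpers, and whether the gem handles exclusive ranges this way is an assumption.

```ruby
# Hypothetical helpers, not the gem's code: turn a Ruby Range into a
# RediSearch numeric filter such as "@number:[0 100]".
def numeric_bound(value)
  return "+inf" if value == Float::INFINITY
  return "-inf" if value == -Float::INFINITY

  value.to_s
end

def numeric_clause(field, range)
  upper = numeric_bound(range.end)
  # A three-dot range (0...100) excludes its end; RediSearch marks that with "("
  upper = "(#{upper}" if range.exclude_end?

  "@#{field}:[#{numeric_bound(range.begin)} #{upper}]"
end

puts numeric_clause(:number, 0..100)               # prints: @number:[0 100]
puts numeric_clause(:number, 0..Float::INFINITY)   # prints: @number:[0 +inf]
puts numeric_clause(:number, -Float::INFINITY..0)  # prints: @number:[-inf 0]
puts numeric_clause(:number, 0...100)              # prints: @number:[0 (100]
```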
|
189
436
|
|
190
|
-
|
437
|
+
##### Query level clauses
|
438
|
+
- `slop(level)`
|
439
|
+
- We allow a maximum of N intervening number of unmatched offsets between
|
440
|
+
phrase terms. (i.e., the slop for exact phrases is 0)
|
441
|
+
- `in_order`
|
442
|
+
- Usually used in conjunction with `slop`. We make sure the query terms appear
|
443
|
+
in the same order in the document as in the query, regardless of the offsets
|
444
|
+
between them.
|
445
|
+
- `no_content`
|
446
|
+
- Only return the document ids and not the content. This is useful if
|
447
|
+
RediSearch is being used on a Rails model where the document attributes
|
448
|
+
don't matter and it's being converted into ActiveRecord objects.
|
449
|
+
- `language(language)`
|
450
|
+
- Stemmer to use for the supplied language during search for query expansion.
|
451
|
+
If querying documents in Chinese, this should be set to `chinese` in order to
|
452
|
+
properly tokenize the query terms. If an unsupported language is sent, the
|
453
|
+
command returns an error.
|
454
|
+
- `sort_by(field, order: :asc)`
|
455
|
+
- If the supplied field is a sortable field, the results are ordered by the
|
456
|
+
value of this field. This applies to both text and numeric fields. Available
|
457
|
+
orders are `:asc` or `:desc`
|
458
|
+
- `limit(num, offset = 0)`
|
459
|
+
- Limit the results to the specified `num` at the `offset`. The default limit
|
460
|
+
is set to `10`.
|
461
|
+
- `count`
|
462
|
+
- Returns the number of documents found in the search query
|
463
|
+
- `highlight(fields: [], opening_tag: "<b>", closing_tag: "</b>")`
|
464
|
+
- Use this option to format occurrences of matched text. `fields` are an
|
465
|
+
array of fields to be highlighted.
|
466
|
+
- `verbatim`
|
467
|
+
- Do not try to use stemming for query expansion but search the query terms
|
468
|
+
verbatim.
|
469
|
+
- `no_stop_words`
|
470
|
+
- Do not filter stopwords from the query.
|
471
|
+
- `with_scores`
|
472
|
+
- Include the relative internal score of each document. This can be used to
|
473
|
+
merge results from multiple instances. This will add a `score` method to the
|
474
|
+
returned `Document` instances.
|
475
|
+
- `return(*fields)`
|
476
|
+
- Limit which fields from the document are returned.
|
477
|
+
- `explain`
|
478
|
+
- Returns the execution plan for a complex query. In the returned response,
|
479
|
+
a `+` on a term is an indication of stemming.
|
480
|
+
- `to_redis`
|
481
|
+
- Returns the command to be executed without executing it.
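
Several of the clauses above correspond one-to-one with `FT.SEARCH` keywords. As a rough sketch of that mapping, the hypothetical `search_command` helper below assembles the argument list for a subset of them. It is not the gem's implementation (`to_redis` is the real way to inspect the generated command), and the keyword order it emits is an assumption.

```ruby
# Hypothetical builder, not the gem's code: shows how a subset of the
# query-level clauses could map onto FT.SEARCH keywords.
def search_command(index_name, query, slop: nil, in_order: false, no_content: false,
                   verbatim: false, no_stop_words: false, with_scores: false,
                   language: nil, sort_by: nil, order: :asc, limit: 10, offset: 0)
  command = ["FT.SEARCH", index_name, query]
  command << "NOCONTENT" if no_content
  command << "VERBATIM" if verbatim
  command << "NOSTOPWORDS" if no_stop_words
  command << "WITHSCORES" if with_scores
  command.push("SLOP", slop) if slop
  command << "INORDER" if in_order
  command.push("LANGUAGE", language) if language
  command.push("SORTBY", sort_by.to_s, order.to_s.upcase) if sort_by
  # FT.SEARCH expects the offset first, then the number of results
  command.push("LIMIT", offset, limit)
end

p search_command("user_idx", "hello world", slop: 2, in_order: true,
                 sort_by: :name, order: :desc, limit: 5, offset: 10)
# prints: ["FT.SEARCH", "user_idx", "hello world", "SLOP", 2, "INORDER", "SORTBY", "name", "DESC", "LIMIT", 10, 5]
```

Note that `limit(num, offset = 0)` takes the count first, whereas the underlying `LIMIT` keyword takes the offset first.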
|
482
|
+
|
483
|
+
|
484
|
+
## Spellcheck
|
485
|
+
|
486
|
+
Spellchecking is initiated off a `RediSearch::Index` instance and provides
|
487
|
+
suggestions for misspelled search terms. It takes an optional `distance`
|
488
|
+
argument which is the maximal Levenshtein distance for spelling suggestions. It
|
489
|
+
returns an array where each element contains suggestions for each search term
|
490
|
+
and a normalized score based on its occurrences in the index.
|
491
|
+
|
492
|
+
```ruby
|
493
|
+
main ❯ index = RediSearch::Index.new("user_idx", name: { text: { phonetic: "dm:en" } })
|
494
|
+
main ❯ index.spellcheck("jimy")
|
495
|
+
RediSearch (1.1ms) FT.SPELLCHECK user_idx jimy DISTANCE 1
|
496
|
+
=> [#<RediSearch::Spellcheck::Result:0x00007f805591c670
|
497
|
+
term: "jimy",
|
498
|
+
suggestions:
|
499
|
+
[#<struct RediSearch::Spellcheck::Suggestion score=0.0006849315068493151, suggestion="jimmy">,
|
500
|
+
#<struct RediSearch::Spellcheck::Suggestion score=0.00019569471624266145, suggestion="jim">]>]
|
501
|
+
main ❯ index.spellcheck("jimy", distance: 2).first.suggestions
|
502
|
+
RediSearch (0.5ms) FT.SPELLCHECK user_idx jimy DISTANCE 2
|
503
|
+
=> [#<struct RediSearch::Spellcheck::Suggestion score=0.0006849315068493151, suggestion="jimmy">,
|
504
|
+
#<struct RediSearch::Spellcheck::Suggestion score=0.00019569471624266145, suggestion="jim">]
|
505
|
+
```
|
506
|
+
|
507
|
+
|
508
|
+
## Rails Integration
|
509
|
+
|
510
|
+
Integration with Rails is super easy! Call `redi_search` with the schema keyword
|
511
|
+
arg from inside your model. Ex:
|
191
512
|
|
192
|
-
Integration with Rails is on by default! All you have to do is add the following to the model you want to search:
|
193
513
|
```ruby
|
194
514
|
class User < ApplicationRecord
|
195
515
|
redi_search schema: {
|
@@ -199,22 +519,106 @@ class User < ApplicationRecord
|
|
199
519
|
end
|
200
520
|
```
|
201
521
|
|
202
|
-
This will automatically add `User.search` and `User.
|
522
|
+
This will automatically add `User.search` and `User.spellcheck`
|
523
|
+
methods which behave the same as if you called them on an `Index` instance.
|
524
|
+
|
525
|
+
`User.reindex(only: [], **options)` is also added and behaves similarly to `RediSearch::Index#reindex`. Some of the differences include:
|
526
|
+
- By default does an upsert for all documents added using the
|
527
|
+
option `replace: { partial: true }`.
|
528
|
+
- `Document`s do not need to be passed as the first parameter. The `search_import`
|
529
|
+
scope is automatically called and all the records are converted
|
530
|
+
to `Document`s.
|
531
|
+
- Accepts an optional `only` parameter where you can specify a limited number
|
532
|
+
of fields to update. Useful if you alter the schema and only need to index a
|
533
|
+
particular field.
|
534
|
+
|
535
|
+
|
536
|
+
The `redi_search` class method also takes an optional `serializer` argument
|
537
|
+
which takes the class name of a serializer. The serializer must respond to all
|
538
|
+
the fields in a schema as methods. We don't serialize to a JSON object since
|
539
|
+
RediSearch doesn't serialize documents that way.
|
540
|
+
|
541
|
+
```ruby
|
542
|
+
class User < ApplicationRecord
|
543
|
+
redi_search schema: {
|
544
|
+
name: { text: { phonetic: "dm:en" } }
|
545
|
+
}, serializer: UserSerializer
|
546
|
+
end
|
547
|
+
|
548
|
+
class UserSerializer < SimpleDelegator
|
549
|
+
def name
|
550
|
+
"#{first_name} #{last_name}"
|
551
|
+
end
|
552
|
+
end
|
553
|
+
```
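
To see why `SimpleDelegator` is a convenient base for a serializer, the standalone sketch below runs without Rails. `PlainUser` and `NameSerializer` are hypothetical stand-ins: the serializer answers `name` itself and forwards everything else to the wrapped object, so it responds to computed and existing fields alike.

```ruby
require "delegate"

# Stand-in for an ActiveRecord model; only the two attributes matter here.
PlainUser = Struct.new(:first_name, :last_name)

# Hypothetical serializer: adds a computed `name` field and delegates the rest.
class NameSerializer < SimpleDelegator
  def name
    "#{first_name} #{last_name}"
  end
end

serializer = NameSerializer.new(PlainUser.new("Gene", "Volkman"))
puts serializer.name        # prints: Gene Volkman
puts serializer.first_name  # prints: Gene
```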
|
554
|
+
|
555
|
+
You can create a scope on the model to eager load relationships when indexing or
|
556
|
+
it can be used to limit the records to index.
|
557
|
+
|
558
|
+
```ruby
|
559
|
+
class User < ApplicationRecord
|
560
|
+
scope :search_import, -> { includes(:posts) }
|
561
|
+
end
|
562
|
+
```
|
563
|
+
|
564
|
+
The default index name for model indexes is
|
565
|
+
`#{model_name.plural}_#{RediSearch.env}`. The `redi_search` method takes an
|
566
|
+
optional `index_prefix` argument which gets prepended to the index name:
|
567
|
+
|
568
|
+
```ruby
|
569
|
+
class User < ApplicationRecord
|
570
|
+
redi_search schema: {
|
571
|
+
first: { text: { phonetic: "dm:en" } },
|
572
|
+
last: { text: { phonetic: "dm:en" } }
|
573
|
+
}, index_prefix: 'prefix'
|
574
|
+
end
|
575
|
+
|
576
|
+
User.redi_search_index.name
|
577
|
+
# => prefix_users_development
|
578
|
+
```
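
The naming rule can be expressed as a small pure-Ruby function. `model_index_name` is a hypothetical helper written for illustration; it reproduces the `#{model_name.plural}_#{RediSearch.env}` pattern and the underscore-joined prefix seen in the example above.

```ruby
# Hypothetical helper, not the gem's code: builds the index name from the
# pluralized model name, the environment, and an optional prefix.
def model_index_name(plural_model_name, env, index_prefix: nil)
  # compact drops the prefix when none is given
  [index_prefix, plural_model_name, env].compact.join("_")
end

puts model_index_name("users", "development")                         # prints: users_development
puts model_index_name("users", "development", index_prefix: "prefix") # prints: prefix_users_development
```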
|
579
|
+
|
580
|
+
When integrating RediSearch into a model, records will automatically be indexed
|
581
|
+
after creating and updating and will be removed from the index upon destruction.
|
582
|
+
|
583
|
+
There are a few more convenience methods that are publicly available:
|
584
|
+
- `redi_search_document`
|
585
|
+
- Returns the record as a `RediSearch::Document` instance
|
586
|
+
- `redi_search_delete_document`
|
587
|
+
- Removes the record from the index
|
588
|
+
- `redi_search_add_document`
|
589
|
+
- Adds the record to the index
|
590
|
+
- `redi_search_index`
|
591
|
+
- Returns the `RediSearch::Index` instance
|
592
|
+
|
203
593
|
|
204
594
|
## Development
|
205
595
|
|
206
|
-
After checking out the repo, run `bin/setup` to install dependencies. Then, run
|
596
|
+
After checking out the repo, run `bin/setup` to install dependencies. Then, run
|
597
|
+
`rake test` to run the tests. You can also run `bin/console` for an interactive
|
598
|
+
prompt that will allow you to experiment. You can also start a Rails console if
|
599
|
+
you `cd` into `test/dummy`.
|
207
600
|
|
208
|
-
To install this gem onto your local machine, run `bundle exec rake install`. To
|
601
|
+
To install this gem onto your local machine, run `bundle exec rake install`. To
|
602
|
+
release a new version, execute `bin/publish (major|minor|patch)` which will
|
603
|
+
update the version number in `version.rb`, create a git tag for the version,
|
604
|
+
push git commits and tags, and push the `.gem` file to
|
605
|
+
[rubygems.org](https://rubygems.org).
|
209
606
|
|
210
607
|
## Contributing
|
211
608
|
|
212
|
-
Bug reports and pull requests are welcome on
|
609
|
+
Bug reports and pull requests are welcome on
|
610
|
+
[GitHub](https://github.com/npezza93/redi_search). This project is intended to
|
611
|
+
be a safe, welcoming space for collaboration, and contributors are expected to
|
612
|
+
adhere to the [Contributor Covenant](http://contributor-covenant.org) code of
|
613
|
+
conduct.
|
213
614
|
|
214
615
|
## License
|
215
616
|
|
216
|
-
The gem is available as open source under the terms of the
|
617
|
+
The gem is available as open source under the terms of the
|
618
|
+
[MIT License](https://opensource.org/licenses/MIT).
|
217
619
|
|
218
620
|
## Code of Conduct
|
219
621
|
|
220
|
-
Everyone interacting in the RediSearch project’s codebases, issue trackers, chat
|
622
|
+
Everyone interacting in the RediSearch project’s codebases, issue trackers, chat
|
623
|
+
rooms and mailing lists is expected to follow the [code of
|
624
|
+
conduct](https://github.com/npezza93/redi_search/blob/master/CODE_OF_CONDUCT.md).
|