elasticsearch-model 0.0.1 → 0.1.0.rc1
- data/.gitignore +3 -0
- data/LICENSE.txt +1 -1
- data/README.md +669 -8
- data/Rakefile +52 -0
- data/elasticsearch-model.gemspec +48 -17
- data/examples/activerecord_article.rb +77 -0
- data/examples/activerecord_associations.rb +153 -0
- data/examples/couchbase_article.rb +66 -0
- data/examples/datamapper_article.rb +71 -0
- data/examples/mongoid_article.rb +68 -0
- data/examples/ohm_article.rb +70 -0
- data/examples/riak_article.rb +52 -0
- data/gemfiles/3.gemfile +11 -0
- data/gemfiles/4.gemfile +11 -0
- data/lib/elasticsearch/model.rb +151 -1
- data/lib/elasticsearch/model/adapter.rb +145 -0
- data/lib/elasticsearch/model/adapters/active_record.rb +97 -0
- data/lib/elasticsearch/model/adapters/default.rb +44 -0
- data/lib/elasticsearch/model/adapters/mongoid.rb +90 -0
- data/lib/elasticsearch/model/callbacks.rb +35 -0
- data/lib/elasticsearch/model/client.rb +61 -0
- data/lib/elasticsearch/model/importing.rb +94 -0
- data/lib/elasticsearch/model/indexing.rb +332 -0
- data/lib/elasticsearch/model/naming.rb +101 -0
- data/lib/elasticsearch/model/proxy.rb +127 -0
- data/lib/elasticsearch/model/response.rb +70 -0
- data/lib/elasticsearch/model/response/base.rb +44 -0
- data/lib/elasticsearch/model/response/pagination.rb +96 -0
- data/lib/elasticsearch/model/response/records.rb +71 -0
- data/lib/elasticsearch/model/response/result.rb +50 -0
- data/lib/elasticsearch/model/response/results.rb +32 -0
- data/lib/elasticsearch/model/searching.rb +107 -0
- data/lib/elasticsearch/model/serializing.rb +35 -0
- data/lib/elasticsearch/model/support/forwardable.rb +44 -0
- data/lib/elasticsearch/model/version.rb +1 -1
- data/test/integration/active_record_associations_parent_child.rb +138 -0
- data/test/integration/active_record_associations_test.rb +306 -0
- data/test/integration/active_record_basic_test.rb +139 -0
- data/test/integration/active_record_import_test.rb +74 -0
- data/test/integration/active_record_namespaced_model_test.rb +49 -0
- data/test/integration/active_record_pagination_test.rb +109 -0
- data/test/integration/mongoid_basic_test.rb +178 -0
- data/test/test_helper.rb +57 -0
- data/test/unit/adapter_active_record_test.rb +93 -0
- data/test/unit/adapter_default_test.rb +31 -0
- data/test/unit/adapter_mongoid_test.rb +87 -0
- data/test/unit/adapter_test.rb +69 -0
- data/test/unit/callbacks_test.rb +30 -0
- data/test/unit/client_test.rb +27 -0
- data/test/unit/importing_test.rb +97 -0
- data/test/unit/indexing_test.rb +364 -0
- data/test/unit/module_test.rb +46 -0
- data/test/unit/naming_test.rb +76 -0
- data/test/unit/proxy_test.rb +88 -0
- data/test/unit/response_base_test.rb +40 -0
- data/test/unit/response_pagination_test.rb +159 -0
- data/test/unit/response_records_test.rb +87 -0
- data/test/unit/response_result_test.rb +52 -0
- data/test/unit/response_results_test.rb +31 -0
- data/test/unit/response_test.rb +57 -0
- data/test/unit/searching_search_request_test.rb +73 -0
- data/test/unit/searching_test.rb +39 -0
- data/test/unit/serializing_test.rb +17 -0
- metadata +418 -11
data/.gitignore
CHANGED
data/LICENSE.txt
CHANGED
data/README.md
CHANGED
@@ -1,21 +1,682 @@
|
|
1
1
|
# Elasticsearch::Model
|
2
2
|
|
3
|
-
|
3
|
+
The `elasticsearch-model` library builds on top of
|
4
|
+
the [`elasticsearch`](https://github.com/elasticsearch/elasticsearch-ruby) library.
|
5
|
+
|
6
|
+
It aims to simplify integration of Ruby classes ("models"), commonly found
|
7
|
+
e.g. in [Ruby on Rails](http://rubyonrails.org) applications, with the
|
8
|
+
[Elasticsearch](http://www.elasticsearch.org) search and analytics engine.
|
9
|
+
|
10
|
+
The library is compatible with Ruby 1.9.3 and higher.
|
4
11
|
|
5
12
|
## Installation
|
6
13
|
|
7
|
-
|
14
|
+
Install the package from [Rubygems](https://rubygems.org):
|
15
|
+
|
16
|
+
gem install elasticsearch-model --pre
|
8
17
|
|
9
|
-
|
18
|
+
To use an unreleased version, either add it to your `Gemfile` for [Bundler](http://bundler.io):
|
10
19
|
|
11
|
-
|
20
|
+
gem 'elasticsearch-model', git: 'https://github.com/elasticsearch/elasticsearch-rails.git'
|
12
21
|
|
13
|
-
|
22
|
+
or install it from a source code checkout:
|
14
23
|
|
15
|
-
|
24
|
+
git clone https://github.com/elasticsearch/elasticsearch-rails.git
|
25
|
+
cd elasticsearch-rails/elasticsearch-model
|
26
|
+
bundle install
|
27
|
+
rake install
|
16
28
|
|
17
|
-
$ gem install elasticsearch-model
|
18
29
|
|
19
30
|
## Usage
|
20
31
|
|
21
|
-
|
32
|
+
Let's suppose you have an `Article` model:
|
33
|
+
|
34
|
+
```ruby
|
35
|
+
require 'active_record'
|
36
|
+
ActiveRecord::Base.establish_connection( adapter: 'sqlite3', database: ":memory:" )
|
37
|
+
ActiveRecord::Schema.define(version: 1) { create_table(:articles) { |t| t.string :title } }
|
38
|
+
|
39
|
+
class Article < ActiveRecord::Base; end
|
40
|
+
|
41
|
+
Article.create title: 'Quick brown fox'
|
42
|
+
Article.create title: 'Fast black dogs'
|
43
|
+
Article.create title: 'Swift green frogs'
|
44
|
+
```
|
45
|
+
|
46
|
+
### Setup
|
47
|
+
|
48
|
+
To add the Elasticsearch integration for this model, require `elasticsearch/model`
|
49
|
+
and include the main module in your class:
|
50
|
+
|
51
|
+
```ruby
|
52
|
+
require 'elasticsearch/model'
|
53
|
+
|
54
|
+
class Article < ActiveRecord::Base
|
55
|
+
include Elasticsearch::Model
|
56
|
+
end
|
57
|
+
```
|
58
|
+
|
59
|
+
This will extend the model with functionality related to Elasticsearch.
|
60
|
+
|
61
|
+
#### Feature Extraction Pattern
|
62
|
+
|
63
|
+
Instead of including the `Elasticsearch::Model` module directly in your model,
|
64
|
+
you can include it in a "concern" or "trait" module, which is quite a common pattern in Rails applications,
|
65
|
+
using e.g. `ActiveSupport::Concern` as the mechanism:
|
66
|
+
|
67
|
+
```ruby
|
68
|
+
# In: app/models/concerns/searchable.rb
|
69
|
+
#
|
70
|
+
module Searchable
|
71
|
+
extend ActiveSupport::Concern
|
72
|
+
|
73
|
+
included do
|
74
|
+
include Elasticsearch::Model
|
75
|
+
|
76
|
+
mapping do
|
77
|
+
# ...
|
78
|
+
end
|
79
|
+
end
|
80
|
+
|
81
|
+
module ClassMethods
|
82
|
+
def search(query)
|
83
|
+
# ...
|
84
|
+
end
|
85
|
+
end
|
86
|
+
end
|
87
|
+
|
88
|
+
# In: app/models/article.rb
|
89
|
+
#
|
90
|
+
class Article
|
91
|
+
include Searchable
|
92
|
+
end
|
93
|
+
```
|
94
|
+
|
95
|
+
#### The `__elasticsearch__` Proxy
|
96
|
+
|
97
|
+
The `Elasticsearch::Model` module contains a large number of class and instance methods to provide
|
98
|
+
all its functionality. To prevent polluting your model namespace, this functionality is primarily
|
99
|
+
available via the `__elasticsearch__` class and instance level proxy methods;
|
100
|
+
see the `Elasticsearch::Model::Proxy` class documentation for technical information.
|
101
|
+
|
102
|
+
The module will include important methods, such as `search`, into the including class or module
|
103
|
+
only when they haven't been defined already. The following two calls are thus functionally equivalent:
|
104
|
+
|
105
|
+
```ruby
|
106
|
+
Article.__elasticsearch__.search 'fox'
|
107
|
+
Article.search 'fox'
|
108
|
+
```
|
109
|
+
|
110
|
+
See the `Elasticsearch::Model` module documentation for technical information.
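The "define the shortcut only when it is not already defined" behaviour described above can be sketched in plain Ruby. This is a simplified illustration and not the library's implementation: `SearchProxy`, `SearchIntegration`, `__search__`, `Plain` and `Custom` are hypothetical names, and the `search` method only returns a string instead of talking to Elasticsearch.

```ruby
# Hypothetical stand-in for a class-level proxy object.
class SearchProxy
  def initialize(target)
    @target = target
  end

  # Pretend search: only reports what would be searched.
  def search(query)
    "searching #{@target.name} for #{query}"
  end
end

# Hypothetical integration module (not Elasticsearch::Model itself).
module SearchIntegration
  def self.included(base)
    # The proxy is always reachable under a name that is unlikely to clash.
    base.define_singleton_method(:__search__) do
      @__search__ ||= SearchProxy.new(self)
    end

    # Add the convenient shortcut only if the class has no `search` yet.
    unless base.respond_to?(:search)
      base.define_singleton_method(:search) do |query|
        __search__.search(query)
      end
    end
  end
end

class Plain
  include SearchIntegration
end

class Custom
  def self.search(query)
    "custom search for #{query}"
  end

  include SearchIntegration
end

puts Plain.search('fox')              # delegates to the proxy
puts Custom.search('fox')             # keeps the class's own method
puts Custom.__search__.search('fox')  # the proxy is still available
```

The design keeps the model's public namespace small: everything lives behind one proxy method, and the shortcut never overrides a method the class already owns.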
|
111
|
+
|
112
|
+
### The Elasticsearch client
|
113
|
+
|
114
|
+
The module will set up a [client](https://github.com/elasticsearch/elasticsearch-ruby/tree/master/elasticsearch),
|
115
|
+
connected to `localhost:9200`, by default. You can access and use it like any other `Elasticsearch::Client`:
|
116
|
+
|
117
|
+
```ruby
|
118
|
+
Article.__elasticsearch__.client.cluster.health
|
119
|
+
# => { "cluster_name"=>"elasticsearch", "status"=>"yellow", ... }
|
120
|
+
```
|
121
|
+
|
122
|
+
To use a client with different configuration, just set up a client for the model:
|
123
|
+
|
124
|
+
```ruby
|
125
|
+
Article.__elasticsearch__.client = Elasticsearch::Client.new host: 'api.server.org'
|
126
|
+
```
|
127
|
+
|
128
|
+
Or configure the client for all models:
|
129
|
+
|
130
|
+
```ruby
|
131
|
+
Elasticsearch::Model.client = Elasticsearch::Client.new log: true
|
132
|
+
```
|
133
|
+
|
134
|
+
You might want to do this during your application bootstrap process, e.g. in a Rails initializer.
|
135
|
+
|
136
|
+
Please refer to the
|
137
|
+
[`elasticsearch-transport`](https://github.com/elasticsearch/elasticsearch-ruby/tree/master/elasticsearch-transport)
|
138
|
+
library documentation for all the configuration options, and to the
|
139
|
+
[`elasticsearch-api`](http://rubydoc.info/gems/elasticsearch-api) library documentation
|
140
|
+
for information about the Ruby client API.
|
141
|
+
|
142
|
+
### Importing the data
|
143
|
+
|
144
|
+
The first thing you'll want to do is import your data into the index:
|
145
|
+
|
146
|
+
```ruby
|
147
|
+
Article.import
|
148
|
+
# => 0
|
149
|
+
```
|
150
|
+
|
151
|
+
No errors were reported during importing, so... let's search the index!
|
152
|
+
|
153
|
+
|
154
|
+
### Searching
|
155
|
+
|
156
|
+
For starters, we can try the "simple" type of search:
|
157
|
+
|
158
|
+
```ruby
|
159
|
+
response = Article.search 'fox dogs'
|
160
|
+
|
161
|
+
response.took
|
162
|
+
# => 3
|
163
|
+
|
164
|
+
response.results.total
|
165
|
+
# => 2
|
166
|
+
|
167
|
+
response.results.first._score
|
168
|
+
# => 0.02250402
|
169
|
+
|
170
|
+
response.results.first._source.title
|
171
|
+
# => "Quick brown fox"
|
172
|
+
```
|
173
|
+
|
174
|
+
#### Search results
|
175
|
+
|
176
|
+
The returned `response` object is a rich wrapper around the JSON returned from Elasticsearch,
|
177
|
+
providing access to response metadata and the actual results ("hits").
|
178
|
+
|
179
|
+
Each "hit" is wrapped in the `Result` class, and provides method access
|
180
|
+
to its properties via [`Hashie::Mash`](http://github.com/intridea/hashie).
|
181
|
+
|
182
|
+
The `results` object supports the `Enumerable` interface:
|
183
|
+
|
184
|
+
```ruby
|
185
|
+
response.results.map { |r| r._source.title }
|
186
|
+
# => ["Quick brown fox", "Fast black dogs"]
|
187
|
+
|
188
|
+
response.results.select { |r| r.title =~ /^Q/ }
|
189
|
+
# => [#<Elasticsearch::Model::Response::Result:0x007 ... "_source"=>{"title"=>"Quick brown fox"}}>]
|
190
|
+
```
|
191
|
+
|
192
|
+
In fact, the `response` object will delegate `Enumerable` methods to `results`:
|
193
|
+
|
194
|
+
```ruby
|
195
|
+
response.any? { |r| r.title =~ /fox|dog/ }
|
196
|
+
# => true
|
197
|
+
```
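Delegation of this kind can be sketched with Ruby's standard `Forwardable` module. The classes below (`Hit`, `HitList`, `SearchResponse`) are hypothetical simplifications for illustration and are not the library's own response classes.

```ruby
require 'forwardable'

# Hypothetical, simplified hit wrapper.
Hit = Struct.new(:title, :score)

# Hypothetical collection of hits that supports the Enumerable interface.
class HitList
  include Enumerable

  def initialize(hits)
    @hits = hits
  end

  # Enumerable only needs `each`; map, select, any? etc. come for free.
  def each(&block)
    @hits.each(&block)
  end
end

# Hypothetical response wrapper that forwards selected methods to its results.
class SearchResponse
  extend Forwardable

  attr_reader :results

  def_delegators :results, :each, :map, :select, :any?, :first

  def initialize(hits)
    @results = HitList.new(hits)
  end
end

response = SearchResponse.new(
  [Hit.new('Quick brown fox', 0.5), Hit.new('Fast black dogs', 0.4)]
)

p response.map(&:title)
p response.any? { |hit| hit.title =~ /fox|dog/ }
```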
|
198
|
+
|
199
|
+
#### Search results as database records
|
200
|
+
|
201
|
+
Instead of returning documents from Elasticsearch, the `records` method will return a collection
|
202
|
+
of model instances, fetched from the primary database, ordered by score:
|
203
|
+
|
204
|
+
```ruby
|
205
|
+
response.records.to_a
|
206
|
+
# Article Load (0.3ms) SELECT "articles".* FROM "articles" WHERE "articles"."id" IN (1, 2)
|
207
|
+
# => [#<Article id: 1, title: "Quick brown fox">, #<Article id: 2, title: "Fast black dogs">]
|
208
|
+
```
|
209
|
+
|
210
|
+
The returned object is the genuine collection of model instances returned by your database,
|
211
|
+
i.e. `ActiveRecord::Relation` for ActiveRecord, or `Mongoid::Criteria` in case of MongoDB. This allows you to
|
212
|
+
chain other methods on top of search results, as you would normally do:
|
213
|
+
|
214
|
+
```ruby
|
215
|
+
response.records.where(title: 'Quick brown fox').to_a
|
216
|
+
# Article Load (0.2ms) SELECT "articles".* FROM "articles" WHERE "articles"."id" IN (1, 2) AND "articles"."title" = 'Quick brown fox'
|
217
|
+
# => [#<Article id: 1, title: "Quick brown fox">]
|
218
|
+
|
219
|
+
response.records.records.class
|
220
|
+
# => ActiveRecord::Relation::ActiveRecord_Relation_Article
|
221
|
+
```
|
222
|
+
|
223
|
+
The ordering of the records by score will be preserved, unless you explicitly specify a different
|
224
|
+
order in your model query language:
|
225
|
+
|
226
|
+
```ruby
|
227
|
+
response.records.order(:title).to_a
|
228
|
+
# Article Load (0.2ms) SELECT "articles".* FROM "articles" WHERE "articles"."id" IN (1, 2) ORDER BY "articles".title ASC
|
229
|
+
# => [#<Article id: 2, title: "Fast black dogs">, #<Article id: 1, title: "Quick brown fox">]
|
230
|
+
```
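Because a query such as `WHERE "id" IN (1, 2)` does not by itself guarantee row order, score ordering has to be restored once the records are loaded. A minimal sketch of one way to do that, assuming hits are plain hashes with `:id` and `:score` keys; `Record`, `order_by_hits` and `with_scores` are hypothetical helpers, not part of the library:

```ruby
# Hypothetical record type standing in for a model instance.
Record = Struct.new(:id, :title)

# Re-orders records to follow the order of IDs returned by the search engine.
# IDs are compared as strings, since search hits usually carry string IDs.
def order_by_hits(records, hit_ids)
  position = hit_ids.each_with_index.to_h { |id, index| [id.to_s, index] }
  records.sort_by { |record| position.fetch(record.id.to_s) }
end

# Pairs each record with the score of its hit, similar in spirit to `each_with_hit`.
def with_scores(records, hits)
  scores = hits.to_h { |hit| [hit[:id].to_s, hit[:score]] }
  records.map { |record| [record, scores.fetch(record.id.to_s)] }
end

hits = [{ id: '2', score: 1.2 }, { id: '1', score: 0.7 }]
# The database returns rows in its own order (here: by primary key).
records = [Record.new(1, 'Quick brown fox'), Record.new(2, 'Fast black dogs')]

ordered = order_by_hits(records, hits.map { |hit| hit[:id] })
with_scores(ordered, hits).each { |record, score| puts "* #{record.title}: #{score}" }
```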
|
231
|
+
|
232
|
+
The `records` method returns the real instances of your model, which is useful when you want to access your
|
233
|
+
model methods -- at the expense of slowing down your application, of course.
|
234
|
+
In most cases, working with `results` coming from Elasticsearch is sufficient, and much faster. See the
|
235
|
+
[`elasticsearch-rails`](https://github.com/elasticsearch/elasticsearch-rails/tree/master/elasticsearch-rails)
|
236
|
+
library for more information about compatibility with the Ruby on Rails framework.
|
237
|
+
|
238
|
+
When you want to access both the database `records` and search `results`, use the `each_with_hit`
|
239
|
+
(or `map_with_hit`) iterator:
|
240
|
+
|
241
|
+
```ruby
|
242
|
+
response.records.each_with_hit { |record, hit| puts "* #{record.title}: #{hit._score}" }
|
243
|
+
# * Quick brown fox: 0.02250402
|
244
|
+
# * Fast black dogs: 0.02250402
|
245
|
+
```
|
246
|
+
|
247
|
+
#### Pagination
|
248
|
+
|
249
|
+
You can implement pagination with the `from` and `size` search parameters. However, search results
|
250
|
+
can be automatically paginated with the [`kaminari`](http://rubygems.org/gems/kaminari) gem.
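For the manual approach, the page arithmetic is small enough to write by hand. The sketch below assumes a 1-based page number and a search on the `title` field; `pagination_params` and `paginated_definition` are hypothetical helper names used only for illustration.

```ruby
# Translates a 1-based page number into Elasticsearch `from` and `size` values.
def pagination_params(page, per_page = 10)
  page = [page.to_i, 1].max # treat nil, 0 and negative values as page 1
  { from: (page - 1) * per_page, size: per_page }
end

# Merges the paging parameters into a search definition hash.
def paginated_definition(query, page, per_page = 10)
  { query: { match: { title: query } } }.merge(pagination_params(page, per_page))
end

p pagination_params(1)
p pagination_params(3, 25)
p paginated_definition('fox', 2)
```

The resulting hash can be passed to `search` like any other search definition.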
|
251
|
+
|
252
|
+
If Kaminari is loaded, use the familiar paging methods:
|
253
|
+
|
254
|
+
```ruby
|
255
|
+
response.page(2).results
|
256
|
+
response.page(2).records
|
257
|
+
```
|
258
|
+
|
259
|
+
In a Rails controller, use the `params[:page]` parameter to paginate through results:
|
260
|
+
|
261
|
+
```ruby
|
262
|
+
@articles = Article.search(params[:q]).page(params[:page]).records
|
263
|
+
|
264
|
+
@articles.current_page
|
265
|
+
# => 2
|
266
|
+
@articles.next_page
|
267
|
+
# => 3
|
268
|
+
```
|
269
|
+
To initialize and include the pagination support manually:
|
270
|
+
|
271
|
+
```ruby
|
272
|
+
Kaminari::Hooks.init
|
273
|
+
Elasticsearch::Model::Response::Response.__send__ :include, Elasticsearch::Model::Response::Pagination::Kaminari
|
274
|
+
```
|
275
|
+
|
276
|
+
#### The Elasticsearch DSL
|
277
|
+
|
278
|
+
In most situations, you'll want to pass the search definition
|
279
|
+
in the Elasticsearch [domain-specific language](http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl.html) to the client:
|
280
|
+
|
281
|
+
```ruby
|
282
|
+
response = Article.search query: { match: { title: "Fox Dogs" } },
|
283
|
+
highlight: { fields: { title: {} } }
|
284
|
+
|
285
|
+
response.results.first.highlight.title
|
286
|
+
# => ["Quick brown <em>fox</em>"]
|
287
|
+
```
|
288
|
+
|
289
|
+
You can pass any object which implements a `to_hash` method, or you can use your favourite JSON builder
|
290
|
+
to build the search definition as a JSON string:
|
291
|
+
|
292
|
+
```ruby
|
293
|
+
require 'jbuilder'
|
294
|
+
|
295
|
+
query = Jbuilder.encode do |json|
|
296
|
+
json.query do
|
297
|
+
json.match do
|
298
|
+
json.title do
|
299
|
+
json.query "fox dogs"
|
300
|
+
end
|
301
|
+
end
|
302
|
+
end
|
303
|
+
end
|
304
|
+
|
305
|
+
response = Article.search query
|
306
|
+
response.results.first.title
|
307
|
+
# => "Quick brown fox"
|
308
|
+
```
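As an illustration of the `to_hash` option, a small query object can encapsulate the search definition. `TitleQuery` is a hypothetical class written for this sketch; it uses only the standard `json` library and is not part of `elasticsearch-model`.

```ruby
require 'json'

# Hypothetical query object; any object responding to `to_hash` would do.
class TitleQuery
  def initialize(text, highlight: false)
    @text = text
    @highlight = highlight
  end

  # Builds the search definition as a plain Ruby hash.
  def to_hash
    definition = { query: { match: { title: @text } } }
    definition[:highlight] = { fields: { title: {} } } if @highlight
    definition
  end

  # The same definition as a JSON string.
  def to_json(*args)
    to_hash.to_json(*args)
  end
end

query = TitleQuery.new('fox dogs', highlight: true)
puts JSON.pretty_generate(query.to_hash)
```

Keeping the definition in one object makes it easy to unit-test the query without a running cluster.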
|
309
|
+
|
310
|
+
### Index Configuration
|
311
|
+
|
312
|
+
For the search engine to work well, it's often necessary to configure the index properly.
|
313
|
+
The `Elasticsearch::Model` integration provides class methods to set up index settings and mappings.
|
314
|
+
|
315
|
+
```ruby
|
316
|
+
class Article
|
317
|
+
settings index: { number_of_shards: 1 } do
|
318
|
+
mappings dynamic: 'false' do
|
319
|
+
indexes :title, analyzer: 'english', index_options: 'offsets'
|
320
|
+
end
|
321
|
+
end
|
322
|
+
end
|
323
|
+
|
324
|
+
Article.mappings.to_hash
|
325
|
+
# => {
|
326
|
+
# :article => {
|
327
|
+
# :dynamic => "false",
|
328
|
+
# :properties => {
|
329
|
+
# :title => {
|
330
|
+
# :type => "string",
|
331
|
+
# :analyzer => "english",
|
332
|
+
# :index_options => "offsets"
|
333
|
+
# }
|
334
|
+
# }
|
335
|
+
# }
|
336
|
+
# }
|
337
|
+
|
338
|
+
Article.settings.to_hash
|
339
|
+
# => { :index => { :number_of_shards => 1 } }
|
340
|
+
```
|
341
|
+
|
342
|
+
You can use the defined settings and mappings to create an index with desired configuration:
|
343
|
+
|
344
|
+
```ruby
|
345
|
+
Article.__elasticsearch__.client.indices.delete index: Article.index_name rescue nil
|
346
|
+
Article.__elasticsearch__.client.indices.create \
|
347
|
+
index: Article.index_name,
|
348
|
+
body: { settings: Article.settings.to_hash, mappings: Article.mappings.to_hash }
|
349
|
+
```
|
350
|
+
|
351
|
+
There's a shortcut available for this common operation (convenient e.g. in tests):
|
352
|
+
|
353
|
+
```ruby
|
354
|
+
Article.__elasticsearch__.create_index! force: true
|
355
|
+
Article.__elasticsearch__.refresh_index!
|
356
|
+
```
|
357
|
+
|
358
|
+
By default, the index name and document type will be inferred from your class name;
|
359
|
+
you can set them explicitly, however:
|
360
|
+
|
361
|
+
```ruby
|
362
|
+
class Article
|
363
|
+
index_name "articles-#{Rails.env}"
|
364
|
+
document_type "post"
|
365
|
+
end
|
366
|
+
```
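To give a feel for what "inferred from your class name" can mean, here is a rough plain-Ruby approximation. It is only a sketch of the idea and does not reproduce the library's actual naming rules: the helpers `underscore`, `default_index_name` and `default_document_type` are hypothetical, and pluralization is done naively by appending "s", so irregular plurals are not handled.

```ruby
# Converts "MyNamespace::BlogPost" style names to "my_namespace/blog_post".
def underscore(class_name)
  class_name.gsub('::', '/')
            .gsub(/([a-z\d])([A-Z])/, '\1_\2')
            .downcase
end

# Naive index name: pluralized by appending "s", namespaces joined with "-".
def default_index_name(class_name)
  "#{underscore(class_name).tr('/', '-')}s"
end

# Naive document type: the last segment of the underscored name.
def default_document_type(class_name)
  underscore(class_name).split('/').last
end

puts default_index_name('Article')
puts default_document_type('Article')
puts default_index_name('MyNamespace::BlogPost')
```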
|
367
|
+
|
368
|
+
### Updating the Documents in the Index
|
369
|
+
|
370
|
+
Usually, we need to update the Elasticsearch index when records in the database are created, updated or deleted;
|
371
|
+
use the `index_document`, `update_document` and `delete_document` methods, respectively:
|
372
|
+
|
373
|
+
```ruby
|
374
|
+
Article.first.__elasticsearch__.index_document
|
375
|
+
# => {"ok"=>true, ... "_version"=>2}
|
376
|
+
```
|
377
|
+
|
378
|
+
#### Automatic Callbacks
|
379
|
+
|
380
|
+
You can automatically update the index whenever the record changes, by including
|
381
|
+
the `Elasticsearch::Model::Callbacks` module in your model:
|
382
|
+
|
383
|
+
```ruby
|
384
|
+
class Article
|
385
|
+
include Elasticsearch::Model
|
386
|
+
include Elasticsearch::Model::Callbacks
|
387
|
+
end
|
388
|
+
|
389
|
+
Article.first.update_attribute :title, 'Updated!'
|
390
|
+
|
391
|
+
Article.search('*').map { |r| r.title }
|
392
|
+
# => ["Updated!", "Swift green frogs", "Fast black dogs"]
|
393
|
+
```
|
394
|
+
|
395
|
+
The automatic callback on record update keeps track of changes in your model
|
396
|
+
(via [`ActiveModel::Dirty`](http://api.rubyonrails.org/classes/ActiveModel/Dirty.html)-compliant implementation),
|
397
|
+
and performs a _partial update_ when this support is available.
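The idea behind a partial update is to send only the attributes that changed. The sketch below uses a hypothetical `TrackedArticle` class with a tiny hand-written change tracker in place of `ActiveModel::Dirty`; the `{ doc: ... }` shape follows the body format of the Elasticsearch update API.

```ruby
# Hypothetical model with a very small change tracker.
class TrackedArticle
  attr_reader :attributes, :changes

  def initialize(attributes)
    @attributes = attributes.dup
    @changes = {}
  end

  # Records [old, new] for every attribute whose value actually changes.
  def update(new_attributes)
    new_attributes.each do |name, value|
      next if @attributes[name] == value

      @changes[name] = [@attributes[name], value]
      @attributes[name] = value
    end
    self
  end

  # Body for a partial update: changed attributes only, with their new values.
  def partial_update_body
    { doc: @changes.transform_values(&:last) }
  end
end

article = TrackedArticle.new(id: 1, title: 'Quick brown fox', author: 'John')
article.update(title: 'Updated!', author: 'John')
p article.changes
p article.partial_update_body
```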
|
398
|
+
|
399
|
+
The automatic callbacks are implemented in database adapters coming with `Elasticsearch::Model`. You can easily
|
400
|
+
implement your own adapter: please see the relevant chapter below.
|
401
|
+
|
402
|
+
#### Custom Callbacks
|
403
|
+
|
404
|
+
If you need more control over the indexing process, you can implement these callbacks yourself,
|
405
|
+
by hooking into `after_create`, `after_save`, `after_update` or `after_destroy` operations:
|
406
|
+
|
407
|
+
```ruby
|
408
|
+
class Article
|
409
|
+
include Elasticsearch::Model
|
410
|
+
|
411
|
+
after_save { logger.debug ["Updating document... ", index_document ].join }
|
412
|
+
after_destroy { logger.debug ["Deleting document... ", delete_document].join }
|
413
|
+
end
|
414
|
+
```
|
415
|
+
|
416
|
+
For ActiveRecord-based models, you need to hook into the `after_commit` callback, to protect
|
417
|
+
your data against inconsistencies caused by transaction rollbacks:
|
418
|
+
|
419
|
+
```ruby
|
420
|
+
class Article < ActiveRecord::Base
|
421
|
+
include Elasticsearch::Model
|
422
|
+
|
423
|
+
after_commit on: [:create] do
|
424
|
+
index_document if self.published?
|
425
|
+
end
|
426
|
+
|
427
|
+
after_commit on: [:update] do
|
428
|
+
update_document if self.published?
|
429
|
+
end
|
430
|
+
|
431
|
+
after_commit on: [:destroy] do
|
432
|
+
delete_document if self.published?
|
433
|
+
end
|
434
|
+
end
|
435
|
+
```
|
436
|
+
|
437
|
+
#### Asynchronous Callbacks
|
438
|
+
|
439
|
+
Of course, you're still performing an HTTP request during your database transaction, which is not optimal
|
440
|
+
for large-scale applications. A better option would be to process the index operations in the background,
|
441
|
+
with a tool like [_Resque_](https://github.com/resque/resque) or [_Sidekiq_](https://github.com/mperham/sidekiq):
|
442
|
+
|
443
|
+
```ruby
|
444
|
+
class Article
|
445
|
+
include Elasticsearch::Model
|
446
|
+
|
447
|
+
after_save { Indexer.perform_async(:index, self.id) }
|
448
|
+
after_destroy { Indexer.perform_async(:delete, self.id) }
|
449
|
+
end
|
450
|
+
```
|
451
|
+
|
452
|
+
An example implementation of the `Indexer` worker class could look like this:
|
453
|
+
|
454
|
+
```ruby
|
455
|
+
class Indexer
|
456
|
+
include Sidekiq::Worker
|
457
|
+
sidekiq_options queue: 'elasticsearch', retry: false
|
458
|
+
|
459
|
+
Logger = Sidekiq.logger.level == Logger::DEBUG ? Sidekiq.logger : nil
|
460
|
+
Client = Elasticsearch::Client.new host: 'localhost:9200', logger: Logger
|
461
|
+
|
462
|
+
def perform(operation, record_id)
|
463
|
+
logger.debug [operation, "ID: #{record_id}"]
|
464
|
+
|
465
|
+
case operation.to_s
|
466
|
+
when /index/
|
467
|
+
record = Article.find(record_id)
|
468
|
+
Client.index index: 'articles', type: 'article', id: record.id, body: record.as_indexed_json
|
469
|
+
when /delete/
|
470
|
+
Client.delete index: 'articles', type: 'article', id: record_id
|
471
|
+
else raise ArgumentError, "Unknown operation '#{operation}'"
|
472
|
+
end
|
473
|
+
end
|
474
|
+
end
|
475
|
+
```
|
476
|
+
|
477
|
+
Start the _Sidekiq_ workers with `bundle exec sidekiq --queue elasticsearch --verbose` and
|
478
|
+
update a model:
|
479
|
+
|
480
|
+
```ruby
|
481
|
+
Article.first.update_attribute :title, 'Updated'
|
482
|
+
```
|
483
|
+
|
484
|
+
You'll see the job being processed in the console where you started the _Sidekiq_ worker:
|
485
|
+
|
486
|
+
```
|
487
|
+
Indexer JID-eb7e2daf389a1e5e83697128 DEBUG: ["index", "ID: 7"]
|
488
|
+
Indexer JID-eb7e2daf389a1e5e83697128 INFO: PUT http://localhost:9200/articles/article/1 [status:200, request:0.004s, query:n/a]
|
489
|
+
Indexer JID-eb7e2daf389a1e5e83697128 DEBUG: > {"id":1,"title":"Updated", ...}
|
490
|
+
Indexer JID-eb7e2daf389a1e5e83697128 DEBUG: < {"ok":true,"_index":"articles","_type":"article","_id":"1","_version":6}
|
491
|
+
Indexer JID-eb7e2daf389a1e5e83697128 INFO: done: 0.006 sec
|
492
|
+
```
|
493
|
+
|
494
|
+
### Model Serialization
|
495
|
+
|
496
|
+
By default, the model instance will be serialized to JSON using the `as_indexed_json` method,
|
497
|
+
which is defined automatically by the `Elasticsearch::Model::Serializing` module:
|
498
|
+
|
499
|
+
```ruby
|
500
|
+
Article.first.__elasticsearch__.as_indexed_json
|
501
|
+
# => {"id"=>1, "title"=>"Quick brown fox"}
|
502
|
+
```
|
503
|
+
|
504
|
+
If you want to customize the serialization, just implement the `as_indexed_json` method yourself:
|
505
|
+
|
506
|
+
```ruby
|
507
|
+
class Article
|
508
|
+
include Elasticsearch::Model
|
509
|
+
|
510
|
+
def as_indexed_json(options={})
|
511
|
+
as_json(only: 'title')
|
512
|
+
end
|
513
|
+
end
|
514
|
+
|
515
|
+
Article.first.as_indexed_json
|
516
|
+
# => {"title"=>"Quick brown fox"}
|
517
|
+
```
|
518
|
+
|
519
|
+
The re-defined method will be used in the indexing methods, such as `index_document`.
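How a custom serialization feeds into indexing can be shown without any database or cluster. In this sketch `Note` and `index_request_for` are hypothetical; the argument names (`index`, `type`, `id`, `body`) mirror the `Client.index` call in the `Indexer` example above.

```ruby
# Hypothetical plain-Ruby model; `as_indexed_json` mirrors the method name above.
class Note
  attr_reader :id, :title, :internal_memo

  def initialize(id:, title:, internal_memo:)
    @id = id
    @title = title
    @internal_memo = internal_memo
  end

  # Only the fields that should be searchable are serialized.
  def as_indexed_json(_options = {})
    { 'title' => title }
  end
end

# Builds the arguments for an index request from a model instance.
def index_request_for(record, index:, type:)
  { index: index, type: type, id: record.id, body: record.as_indexed_json }
end

note = Note.new(id: 1, title: 'Quick brown fox', internal_memo: 'do not index')
p index_request_for(note, index: 'notes', type: 'note')
```

Attributes left out of `as_indexed_json`, such as `internal_memo` here, never reach the index.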
|
520
|
+
|
521
|
+
#### Relationships and Associations
|
522
|
+
|
523
|
+
When you have a more complicated structure/schema, you need to customize the `as_indexed_json` method -
|
524
|
+
or perform the indexing separately, on your own.
|
525
|
+
For example, consider an `Article` model, which _has_many_ `Comment`s,
|
526
|
+
`Author`s and `Categories`. We might want to define the serialization like this:
|
527
|
+
|
528
|
+
```ruby
|
529
|
+
def as_indexed_json(options={})
|
530
|
+
self.as_json(
|
531
|
+
include: { categories: { only: :title},
|
532
|
+
authors: { methods: [:full_name], only: [:full_name] },
|
533
|
+
comments: { only: :text }
|
534
|
+
})
|
535
|
+
end
|
536
|
+
|
537
|
+
Article.first.as_indexed_json
|
538
|
+
# => { "id" => 1,
|
539
|
+
# "title" => "First Article",
|
540
|
+
# "created_at" => 2013-12-03 13:39:02 UTC,
|
541
|
+
# "updated_at" => 2013-12-03 13:39:02 UTC,
|
542
|
+
# "categories" => [ { "title" => "One" } ],
|
543
|
+
# "authors" => [ { "full_name" => "John Smith" } ],
|
544
|
+
# "comments" => [ { "text" => "First comment" } ] }
|
545
|
+
```
|
546
|
+
|
547
|
+
Of course, when you want to use the automatic indexing callbacks, you need to hook into the appropriate
|
548
|
+
_ActiveRecord_ callbacks -- please see the full example in `examples/activerecord_associations.rb`.
|
549
|
+
|
550
|
+
### Other ActiveModel Frameworks
|
551
|
+
|
552
|
+
The `Elasticsearch::Model` module is fully compatible with any ActiveModel-compatible model, such as _Mongoid_:
|
553
|
+
|
554
|
+
```ruby
|
555
|
+
require 'mongoid'
|
556
|
+
|
557
|
+
Mongoid.connect_to 'articles'
|
558
|
+
|
559
|
+
class Article
|
560
|
+
include Mongoid::Document
|
561
|
+
|
562
|
+
field :id, type: String
|
563
|
+
field :title, type: String
|
564
|
+
|
565
|
+
attr_accessible :id, :title, :published_at
|
566
|
+
|
567
|
+
include Elasticsearch::Model
|
568
|
+
|
569
|
+
def as_indexed_json(options={})
|
570
|
+
as_json(except: [:id, :_id])
|
571
|
+
end
|
572
|
+
end
|
573
|
+
|
574
|
+
Article.create id: '1', title: 'Quick brown fox'
|
575
|
+
Article.import
|
576
|
+
|
577
|
+
response = Article.search 'fox'
|
578
|
+
response.records.to_a
|
579
|
+
# MOPED: 127.0.0.1:27017 QUERY database=articles collection=articles selector={"_id"=>{"$in"=>["1"]}} ...
|
580
|
+
# => [#<Article _id: 1, id: nil, title: "Quick brown fox", published_at: nil>]
|
581
|
+
```
|
582
|
+
|
583
|
+
Full examples for CouchBase, DataMapper, Mongoid, Ohm and Riak models can be found in the `examples` folder.
|
584
|
+
|
585
|
+
### Adapters
|
586
|
+
|
587
|
+
To support various "OxM" (object-relational- or object-document-mapper) implementations and frameworks,
|
588
|
+
the `Elasticsearch::Model` integration supports an "adapter" concept.
|
589
|
+
|
590
|
+
An adapter provides implementations for common behaviour, such as fetching records from the database,
|
591
|
+
hooking into model callbacks for automatic index updates, or efficient bulk loading from the database.
|
592
|
+
The integration comes with adapters for _ActiveRecord_ and _Mongoid_ out of the box.
|
593
|
+
|
594
|
+
Writing an adapter for your favourite framework is straightforward -- let's see
|
595
|
+
a simplified adapter for [_DataMapper_](http://datamapper.org):
|
596
|
+
|
597
|
+
```ruby
|
598
|
+
module DataMapperAdapter
|
599
|
+
|
600
|
+
# Implement the interface for fetching records
|
601
|
+
#
|
602
|
+
module Records
|
603
|
+
def records
|
604
|
+
klass.all(id: @ids)
|
605
|
+
end
|
606
|
+
|
607
|
+
# ...
|
608
|
+
end
|
609
|
+
end
|
610
|
+
|
611
|
+
# Register the adapter
|
612
|
+
#
|
613
|
+
Elasticsearch::Model::Adapter.register(
|
614
|
+
DataMapperAdapter,
|
615
|
+
lambda { |klass| defined?(::DataMapper::Resource) and klass.ancestors.include?(::DataMapper::Resource) }
|
616
|
+
)
|
617
|
+
```
|
618
|
+
|
619
|
+
Require the adapter and include `Elasticsearch::Model` in the class:
|
620
|
+
|
621
|
+
```ruby
|
622
|
+
require 'datamapper_adapter'
|
623
|
+
|
624
|
+
class Article
|
625
|
+
include DataMapper::Resource
|
626
|
+
include Elasticsearch::Model
|
627
|
+
|
628
|
+
property :id, Serial
|
629
|
+
property :title, String
|
630
|
+
end
|
631
|
+
```
|
632
|
+
|
633
|
+
When accessing the `records` method of the response, for example,
|
634
|
+
the implementation from our adapter will be used now:
|
635
|
+
|
636
|
+
```ruby
|
637
|
+
response = Article.search 'foo'
|
638
|
+
|
639
|
+
response.records.to_a
|
640
|
+
# ~ (0.000057) SELECT "id", "title", "published_at" FROM "articles" WHERE "id" IN (3, 1) ORDER BY "id"
|
641
|
+
# => [#<Article @id=1 @title="Foo" @published_at=nil>, #<Article @id=3 @title="Foo Foo" @published_at=nil>]
|
642
|
+
|
643
|
+
response.records.records.class
|
644
|
+
# => DataMapper::Collection
|
645
|
+
```
|
646
|
+
|
647
|
+
More examples can be found in the `examples` folder. Please see the `Elasticsearch::Model::Adapter`
|
648
|
+
module and its submodules for technical information.
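The registration mechanism itself, an adapter paired with a condition that decides whether it applies to a class, can be sketched as follows. `AdapterRegistry`, `DefaultAdapter`, `DocumentAdapter`, `DocumentModel`, `Page` and `Widget` are hypothetical names for this illustration; the real registry lives in `Elasticsearch::Model::Adapter` and may work differently in detail.

```ruby
# Hypothetical registry; the real one lives in Elasticsearch::Model::Adapter.
class AdapterRegistry
  def initialize(default)
    @default = default
    @adapters = {}
  end

  # Associates an adapter with a condition (anything responding to `call`).
  def register(adapter, condition)
    @adapters[adapter] = condition
  end

  # Returns the first adapter whose condition accepts the class, else the default.
  def adapter_for(klass)
    match = @adapters.find { |_adapter, condition| condition.call(klass) }
    match ? match.first : @default
  end
end

module DefaultAdapter; end
module DocumentAdapter; end

# Marker module playing the role of e.g. DataMapper::Resource.
module DocumentModel; end

class Page
  include DocumentModel
end

class Widget; end

registry = AdapterRegistry.new(DefaultAdapter)
registry.register(
  DocumentAdapter,
  lambda { |klass| klass.ancestors.include?(DocumentModel) }
)

p registry.adapter_for(Page)
p registry.adapter_for(Widget)
```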
|
649
|
+
|
650
|
+
## Development and Community
|
651
|
+
|
652
|
+
For local development, clone the repository and run `bundle install`. See `rake -T` for a list of
|
653
|
+
available Rake tasks for running tests, generating documentation, starting a testing cluster, etc.
|
654
|
+
|
655
|
+
Bug fixes and features must be covered by unit tests.
|
656
|
+
|
657
|
+
GitHub pull requests and issues are used for communication, bug reports and code contributions.
|
658
|
+
|
659
|
+
To run all tests against a test Elasticsearch cluster, use a command like this:
|
660
|
+
|
661
|
+
```bash
|
662
|
+
curl -# https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-1.0.0.RC1.tar.gz | tar xz -C tmp/
|
663
|
+
SERVER=start TEST_CLUSTER_COMMAND=$PWD/tmp/elasticsearch-1.0.0.RC1/bin/elasticsearch bundle exec rake test:all
|
664
|
+
```
|
665
|
+
|
666
|
+
## License
|
667
|
+
|
668
|
+
This software is licensed under the Apache 2 license, quoted below.
|
669
|
+
|
670
|
+
Copyright (c) 2014 Elasticsearch <http://www.elasticsearch.org>
|
671
|
+
|
672
|
+
Licensed under the Apache License, Version 2.0 (the "License");
|
673
|
+
you may not use this file except in compliance with the License.
|
674
|
+
You may obtain a copy of the License at
|
675
|
+
|
676
|
+
http://www.apache.org/licenses/LICENSE-2.0
|
677
|
+
|
678
|
+
Unless required by applicable law or agreed to in writing, software
|
679
|
+
distributed under the License is distributed on an "AS IS" BASIS,
|
680
|
+
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
681
|
+
See the License for the specific language governing permissions and
|
682
|
+
limitations under the License.
|