load_balanced_tire 0.1
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- data/.gitignore +14 -0
- data/.travis.yml +29 -0
- data/Gemfile +4 -0
- data/MIT-LICENSE +20 -0
- data/README.markdown +760 -0
- data/Rakefile +78 -0
- data/examples/rails-application-template.rb +249 -0
- data/examples/tire-dsl.rb +876 -0
- data/lib/tire.rb +55 -0
- data/lib/tire/alias.rb +296 -0
- data/lib/tire/configuration.rb +30 -0
- data/lib/tire/dsl.rb +43 -0
- data/lib/tire/http/client.rb +62 -0
- data/lib/tire/http/clients/curb.rb +61 -0
- data/lib/tire/http/clients/faraday.rb +71 -0
- data/lib/tire/http/response.rb +27 -0
- data/lib/tire/index.rb +361 -0
- data/lib/tire/logger.rb +60 -0
- data/lib/tire/model/callbacks.rb +40 -0
- data/lib/tire/model/import.rb +26 -0
- data/lib/tire/model/indexing.rb +128 -0
- data/lib/tire/model/naming.rb +100 -0
- data/lib/tire/model/percolate.rb +99 -0
- data/lib/tire/model/persistence.rb +71 -0
- data/lib/tire/model/persistence/attributes.rb +143 -0
- data/lib/tire/model/persistence/finders.rb +66 -0
- data/lib/tire/model/persistence/storage.rb +69 -0
- data/lib/tire/model/search.rb +307 -0
- data/lib/tire/results/collection.rb +114 -0
- data/lib/tire/results/item.rb +86 -0
- data/lib/tire/results/pagination.rb +54 -0
- data/lib/tire/rubyext/hash.rb +8 -0
- data/lib/tire/rubyext/ruby_1_8.rb +7 -0
- data/lib/tire/rubyext/symbol.rb +11 -0
- data/lib/tire/search.rb +188 -0
- data/lib/tire/search/facet.rb +74 -0
- data/lib/tire/search/filter.rb +28 -0
- data/lib/tire/search/highlight.rb +37 -0
- data/lib/tire/search/query.rb +186 -0
- data/lib/tire/search/scan.rb +114 -0
- data/lib/tire/search/script_field.rb +23 -0
- data/lib/tire/search/sort.rb +25 -0
- data/lib/tire/tasks.rb +135 -0
- data/lib/tire/utils.rb +17 -0
- data/lib/tire/version.rb +22 -0
- data/test/fixtures/articles/1.json +1 -0
- data/test/fixtures/articles/2.json +1 -0
- data/test/fixtures/articles/3.json +1 -0
- data/test/fixtures/articles/4.json +1 -0
- data/test/fixtures/articles/5.json +1 -0
- data/test/integration/active_model_indexing_test.rb +51 -0
- data/test/integration/active_model_searchable_test.rb +114 -0
- data/test/integration/active_record_searchable_test.rb +446 -0
- data/test/integration/boolean_queries_test.rb +43 -0
- data/test/integration/count_test.rb +34 -0
- data/test/integration/custom_score_queries_test.rb +88 -0
- data/test/integration/dis_max_queries_test.rb +68 -0
- data/test/integration/dsl_search_test.rb +22 -0
- data/test/integration/explanation_test.rb +44 -0
- data/test/integration/facets_test.rb +259 -0
- data/test/integration/filtered_queries_test.rb +66 -0
- data/test/integration/filters_test.rb +63 -0
- data/test/integration/fuzzy_queries_test.rb +20 -0
- data/test/integration/highlight_test.rb +64 -0
- data/test/integration/index_aliases_test.rb +122 -0
- data/test/integration/index_mapping_test.rb +43 -0
- data/test/integration/index_store_test.rb +96 -0
- data/test/integration/index_update_document_test.rb +111 -0
- data/test/integration/mongoid_searchable_test.rb +309 -0
- data/test/integration/percolator_test.rb +111 -0
- data/test/integration/persistent_model_test.rb +130 -0
- data/test/integration/prefix_query_test.rb +43 -0
- data/test/integration/query_return_version_test.rb +70 -0
- data/test/integration/query_string_test.rb +52 -0
- data/test/integration/range_queries_test.rb +36 -0
- data/test/integration/reindex_test.rb +46 -0
- data/test/integration/results_test.rb +39 -0
- data/test/integration/scan_test.rb +56 -0
- data/test/integration/script_fields_test.rb +38 -0
- data/test/integration/sort_test.rb +36 -0
- data/test/integration/text_query_test.rb +39 -0
- data/test/models/active_model_article.rb +31 -0
- data/test/models/active_model_article_with_callbacks.rb +49 -0
- data/test/models/active_model_article_with_custom_document_type.rb +7 -0
- data/test/models/active_model_article_with_custom_index_name.rb +7 -0
- data/test/models/active_record_models.rb +122 -0
- data/test/models/article.rb +15 -0
- data/test/models/mongoid_models.rb +97 -0
- data/test/models/persistent_article.rb +11 -0
- data/test/models/persistent_article_in_namespace.rb +12 -0
- data/test/models/persistent_article_with_casting.rb +28 -0
- data/test/models/persistent_article_with_defaults.rb +11 -0
- data/test/models/persistent_articles_with_custom_index_name.rb +10 -0
- data/test/models/supermodel_article.rb +17 -0
- data/test/models/validated_model.rb +11 -0
- data/test/test_helper.rb +93 -0
- data/test/unit/active_model_lint_test.rb +17 -0
- data/test/unit/configuration_test.rb +74 -0
- data/test/unit/http_client_test.rb +76 -0
- data/test/unit/http_response_test.rb +49 -0
- data/test/unit/index_alias_test.rb +275 -0
- data/test/unit/index_test.rb +894 -0
- data/test/unit/logger_test.rb +125 -0
- data/test/unit/model_callbacks_test.rb +116 -0
- data/test/unit/model_import_test.rb +71 -0
- data/test/unit/model_persistence_test.rb +528 -0
- data/test/unit/model_search_test.rb +913 -0
- data/test/unit/results_collection_test.rb +281 -0
- data/test/unit/results_item_test.rb +162 -0
- data/test/unit/rubyext_test.rb +66 -0
- data/test/unit/search_facet_test.rb +153 -0
- data/test/unit/search_filter_test.rb +42 -0
- data/test/unit/search_highlight_test.rb +46 -0
- data/test/unit/search_query_test.rb +301 -0
- data/test/unit/search_scan_test.rb +113 -0
- data/test/unit/search_script_field_test.rb +26 -0
- data/test/unit/search_sort_test.rb +50 -0
- data/test/unit/search_test.rb +499 -0
- data/test/unit/tire_test.rb +126 -0
- data/tire.gemspec +90 -0
- metadata +549 -0
data/.gitignore
ADDED
data/.travis.yml
ADDED
@@ -0,0 +1,29 @@
|
|
1
|
+
# ---------------------------------------------------------
|
2
|
+
# Configuration file for http://travis-ci.org/#!/karmi/tire
|
3
|
+
# ---------------------------------------------------------
|
4
|
+
|
5
|
+
language: ruby
|
6
|
+
|
7
|
+
rvm:
|
8
|
+
- 1.9.3
|
9
|
+
- 1.8.7
|
10
|
+
- ree
|
11
|
+
|
12
|
+
env:
|
13
|
+
- TEST_COMMAND="rake test:unit"
|
14
|
+
- TEST_COMMAND="rake test:integration"
|
15
|
+
|
16
|
+
script: "bundle exec $TEST_COMMAND"
|
17
|
+
|
18
|
+
before_install:
|
19
|
+
- sudo service elasticsearch start
|
20
|
+
|
21
|
+
matrix:
|
22
|
+
exclude:
|
23
|
+
- rvm: 1.8.7
|
24
|
+
env: TEST_COMMAND="rake test:integration"
|
25
|
+
- rvm: ree
|
26
|
+
env: TEST_COMMAND="rake test:integration"
|
27
|
+
|
28
|
+
notifications:
|
29
|
+
disable: true
|
data/Gemfile
ADDED
data/MIT-LICENSE
ADDED
@@ -0,0 +1,20 @@
|
|
1
|
+
Copyright 2011 Karel Minarik
|
2
|
+
|
3
|
+
Permission is hereby granted, free of charge, to any person obtaining
|
4
|
+
a copy of this software and associated documentation files (the
|
5
|
+
"Software"), to deal in the Software without restriction, including
|
6
|
+
without limitation the rights to use, copy, modify, merge, publish,
|
7
|
+
distribute, sublicense, and/or sell copies of the Software, and to
|
8
|
+
permit persons to whom the Software is furnished to do so, subject to
|
9
|
+
the following conditions:
|
10
|
+
|
11
|
+
The above copyright notice and this permission notice shall be
|
12
|
+
included in all copies or substantial portions of the Software.
|
13
|
+
|
14
|
+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
|
15
|
+
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
|
16
|
+
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
|
17
|
+
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
|
18
|
+
LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
|
19
|
+
OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
|
20
|
+
WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
|
data/README.markdown
ADDED
@@ -0,0 +1,760 @@
|
|
1
|
+
Tire
|
2
|
+
=========
|
3
|
+
|
4
|
+
_Tire_ is a Ruby (1.8 or 1.9) client for the [ElasticSearch](http://www.elasticsearch.org/)
|
5
|
+
search engine/database.
|
6
|
+
|
7
|
+
_ElasticSearch_ is a scalable, distributed, cloud-ready, highly-available,
|
8
|
+
full-text search engine and database with
|
9
|
+
[powerful aggregation features](http://www.elasticsearch.org/guide/reference/api/search/facets/),
|
10
|
+
communicating by JSON over RESTful HTTP, based on [Lucene](http://lucene.apache.org/), written in Java.
|
11
|
+
|
12
|
+
This Readme provides a brief overview of _Tire's_ features. The more detailed documentation is at <http://karmi.github.com/tire/>.
|
13
|
+
|
14
|
+
Both of these documents contain a lot of information. Please set aside some time to read them thoroughly, before you blindly dive into „somehow making it work“. Just skimming through it **won't work** for you. For more information, please refer to the [integration test suite](https://github.com/karmi/tire/tree/master/test/integration)
|
15
|
+
and [issues](https://github.com/karmi/tire/issues).
|
16
|
+
|
17
|
+
Installation
|
18
|
+
------------
|
19
|
+
|
20
|
+
OK. First, you need a running _ElasticSearch_ server. Thankfully, it's easy. Let's define easy:
|
21
|
+
|
22
|
+
$ curl -k -L -o elasticsearch-0.19.0.tar.gz http://github.com/downloads/elasticsearch/elasticsearch/elasticsearch-0.19.0.tar.gz
|
23
|
+
$ tar -zxvf elasticsearch-0.19.0.tar.gz
|
24
|
+
$ ./elasticsearch-0.19.0/bin/elasticsearch -f
|
25
|
+
|
26
|
+
See, easy. On a Mac, you can also use _Homebrew_:
|
27
|
+
|
28
|
+
$ brew install elasticsearch
|
29
|
+
|
30
|
+
Now, let's install the gem via Rubygems:
|
31
|
+
|
32
|
+
$ gem install tire
|
33
|
+
|
34
|
+
Of course, you can install it from the source as well:
|
35
|
+
|
36
|
+
$ git clone git://github.com/karmi/tire.git
|
37
|
+
$ cd tire
|
38
|
+
$ rake install
|
39
|
+
|
40
|
+
|
41
|
+
Usage
|
42
|
+
-----
|
43
|
+
|
44
|
+
_Tire_ exposes an easy-to-use domain-specific language for fluent communication with _ElasticSearch_.
|
45
|
+
|
46
|
+
It easily blends with your _ActiveModel_/_ActiveRecord_ classes for convenient usage in _Rails_ applications.
|
47
|
+
|
48
|
+
To test-drive the core _ElasticSearch_ functionality, let's require the gem:
|
49
|
+
|
50
|
+
```ruby
|
51
|
+
require 'rubygems'
|
52
|
+
require 'tire'
|
53
|
+
```
|
54
|
+
|
55
|
+
Please note that you can copy these snippets from the much more extensive and heavily annotated file
|
56
|
+
in [examples/tire-dsl.rb](http://karmi.github.com/tire/).
|
57
|
+
|
58
|
+
Also, note that we're doing some heavy JSON lifting here. _Tire_ uses the
|
59
|
+
[_multi_json_](https://github.com/intridea/multi_json) gem as a generic JSON wrapper,
|
60
|
+
which allows you to use your preferred JSON library. We'll use the
|
61
|
+
[_yajl-ruby_](https://github.com/brianmario/yajl-ruby) gem in the full on mode here:
|
62
|
+
|
63
|
+
```ruby
|
64
|
+
require 'yajl/json_gem'
|
65
|
+
```
|
66
|
+
|
67
|
+
Let's create an index named `articles` and store/index some documents:
|
68
|
+
|
69
|
+
```ruby
|
70
|
+
Tire.index 'articles' do
|
71
|
+
delete
|
72
|
+
create
|
73
|
+
|
74
|
+
store :title => 'One', :tags => ['ruby']
|
75
|
+
store :title => 'Two', :tags => ['ruby', 'python']
|
76
|
+
store :title => 'Three', :tags => ['java']
|
77
|
+
store :title => 'Four', :tags => ['ruby', 'php']
|
78
|
+
|
79
|
+
refresh
|
80
|
+
end
|
81
|
+
```
|
82
|
+
|
83
|
+
We can also create the index with custom
|
84
|
+
[mapping](http://www.elasticsearch.org/guide/reference/api/admin-indices-create-index.html)
|
85
|
+
for a specific document type:
|
86
|
+
|
87
|
+
```ruby
|
88
|
+
Tire.index 'articles' do
|
89
|
+
delete
|
90
|
+
|
91
|
+
create :mappings => {
|
92
|
+
:article => {
|
93
|
+
:properties => {
|
94
|
+
:id => { :type => 'string', :index => 'not_analyzed', :include_in_all => false },
|
95
|
+
:title => { :type => 'string', :boost => 2.0, :analyzer => 'snowball' },
|
96
|
+
:tags => { :type => 'string', :analyzer => 'keyword' },
|
97
|
+
:content => { :type => 'string', :analyzer => 'snowball' }
|
98
|
+
}
|
99
|
+
}
|
100
|
+
}
|
101
|
+
end
|
102
|
+
```
|
103
|
+
|
104
|
+
Of course, we may have large amounts of data, and it may be impossible or impractical to add them to the index
|
105
|
+
one by one. We can use _ElasticSearch's_
|
106
|
+
[bulk storage](http://www.elasticsearch.org/guide/reference/api/bulk.html).
|
107
|
+
Notice that collection items must have an `id` property or method,
|
108
|
+
and should have a `type` property, if you've set any specific mapping for the index.
|
109
|
+
|
110
|
+
```ruby
|
111
|
+
articles = [
|
112
|
+
{ :id => '1', :type => 'article', :title => 'one', :tags => ['ruby'] },
|
113
|
+
{ :id => '2', :type => 'article', :title => 'two', :tags => ['ruby', 'python'] },
|
114
|
+
{ :id => '3', :type => 'article', :title => 'three', :tags => ['java'] },
|
115
|
+
{ :id => '4', :type => 'article', :title => 'four', :tags => ['ruby', 'php'] }
|
116
|
+
]
|
117
|
+
|
118
|
+
Tire.index 'articles' do
|
119
|
+
import articles
|
120
|
+
end
|
121
|
+
```
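To make the bulk format more concrete, here is a minimal sketch of the kind of newline-delimited payload the _ElasticSearch_ Bulk API expects. The `bulk_payload` helper is hypothetical and is not part of _Tire_; the fallback type name `document` is likewise an assumption made for illustration.

```ruby
require 'json'

# Hypothetical helper (not Tire's implementation): builds a payload where
# every document is preceded by an action line naming index, type and id.
def bulk_payload(index_name, documents)
  rows = documents.map do |document|
    header = { :index => { :_index => index_name,
                           :_type  => document[:type] || 'document', # assumed fallback
                           :_id    => document[:id] } }
    [header.to_json, document.to_json].join("\n")
  end
  rows.join("\n") + "\n" # the Bulk API expects a trailing newline
end

sample_articles = [
  { :id => '1', :type => 'article', :title => 'one', :tags => ['ruby'] },
  { :id => '2', :type => 'article', :title => 'two', :tags => ['ruby', 'python'] }
]

payload = bulk_payload('articles', sample_articles)
puts payload
```

This is why each item needs an `id` and should carry a `type`: both end up in the action line that precedes the document itself.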
|
122
|
+
|
123
|
+
We can easily manipulate the documents before storing them in the index, by passing a block to the
|
124
|
+
`import` method, like this:
|
125
|
+
|
126
|
+
```ruby
|
127
|
+
Tire.index 'articles' do
|
128
|
+
import articles do |documents|
|
129
|
+
|
130
|
+
documents.each { |document| document[:title].capitalize! }
|
131
|
+
end
|
132
|
+
|
133
|
+
refresh
|
134
|
+
end
|
135
|
+
```
|
136
|
+
|
137
|
+
If this _declarative_ notation does not fit well in your context,
|
138
|
+
you can use _Tire's_ classes directly, in a more imperative manner:
|
139
|
+
|
140
|
+
```ruby
|
141
|
+
index = Tire::Index.new('oldskool')
|
142
|
+
index.delete
|
143
|
+
index.create
|
144
|
+
index.store :title => "Let's do it the old way!"
|
145
|
+
index.refresh
|
146
|
+
```
|
147
|
+
|
148
|
+
OK. Now, let's go search all the data.
|
149
|
+
|
150
|
+
We will be searching for articles whose `title` begins with the letter “T”, sorted by `title` in `descending` order,
|
151
|
+
filtering them for ones tagged “ruby”, and also retrieving some [_facets_](http://www.elasticsearch.org/guide/reference/api/search/facets/)
|
152
|
+
from the database:
|
153
|
+
|
154
|
+
```ruby
|
155
|
+
s = Tire.search 'articles' do
|
156
|
+
query do
|
157
|
+
string 'title:T*'
|
158
|
+
end
|
159
|
+
|
160
|
+
filter :terms, :tags => ['ruby']
|
161
|
+
|
162
|
+
sort { by :title, 'desc' }
|
163
|
+
|
164
|
+
facet 'global-tags', :global => true do
|
165
|
+
terms :tags
|
166
|
+
end
|
167
|
+
|
168
|
+
facet 'current-tags' do
|
169
|
+
terms :tags
|
170
|
+
end
|
171
|
+
end
|
172
|
+
```
|
173
|
+
|
174
|
+
(Of course, we may also page the results with `from` and `size` query options, retrieve only specific fields
|
175
|
+
or highlight content matching our query, etc.)
|
176
|
+
|
177
|
+
Let's display the results:
|
178
|
+
|
179
|
+
```ruby
|
180
|
+
s.results.each do |document|
|
181
|
+
puts "* #{ document.title } [tags: #{document.tags.join(', ')}]"
|
182
|
+
end
|
183
|
+
|
184
|
+
# * Two [tags: ruby, python]
|
185
|
+
```
|
186
|
+
|
187
|
+
Let's display the global facets (distribution of tags across the whole database):
|
188
|
+
|
189
|
+
```ruby
|
190
|
+
s.results.facets['global-tags']['terms'].each do |f|
|
191
|
+
puts "#{f['term'].ljust(10)} #{f['count']}"
|
192
|
+
end
|
193
|
+
|
194
|
+
# ruby 3
|
195
|
+
# python 1
|
196
|
+
# php 1
|
197
|
+
# java 1
|
198
|
+
```
|
199
|
+
|
200
|
+
Now, let's display the facets based on the current query (notice that the count for articles
|
201
|
+
tagged with 'java' is included, even though it's not returned by our query;
|
202
|
+
the count for articles tagged 'php' is excluded, since they don't match the current query):
|
203
|
+
|
204
|
+
```ruby
|
205
|
+
s.results.facets['current-tags']['terms'].each do |f|
|
206
|
+
puts "#{f['term'].ljust(10)} #{f['count']}"
|
207
|
+
end
|
208
|
+
|
209
|
+
# ruby 1
|
210
|
+
# python 1
|
211
|
+
# java 1
|
212
|
+
```
|
213
|
+
|
214
|
+
Notice that only local variables from the enclosing scope are accessible inside the block.
|
215
|
+
If we want to access instance variables or methods from the outer scope,
|
216
|
+
we have to use a slight variation of the DSL, by passing the
|
217
|
+
`search` and `query` objects around.
|
218
|
+
|
219
|
+
```ruby
|
220
|
+
@query = 'title:T*'
|
221
|
+
|
222
|
+
Tire.search 'articles' do |search|
|
223
|
+
search.query do |query|
|
224
|
+
query.string @query
|
225
|
+
end
|
226
|
+
end
|
227
|
+
```
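The scoping rule can be reproduced without _ElasticSearch_. The sketch below assumes the DSL evaluates zero-argument blocks with `instance_eval` and calls blocks that take an argument normally; `MiniSearch` is a hypothetical stand-in, not _Tire's_ actual class.

```ruby
# Hypothetical stand-in for a DSL object.
class MiniSearch
  attr_reader :query_string

  def initialize(&block)
    # Zero-arity blocks run with `self` switched to this object;
    # blocks taking an argument keep the caller's `self`.
    block.arity < 1 ? instance_eval(&block) : block.call(self)
  end

  def string(value)
    @query_string = value
  end
end

@query      = 'title:T*'
local_query = 'title:T*'

# @query is looked up on the MiniSearch instance here, so it is nil.
implicit   = MiniSearch.new { string @query }
# Local variables are captured by the closure and stay visible.
with_local = MiniSearch.new { string local_query }
# Passing the object in keeps the outer `self`, so @query resolves.
explicit   = MiniSearch.new { |search| search.string @query }

p implicit.query_string
p with_local.query_string
p explicit.query_string
```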
|
228
|
+
|
229
|
+
Quite often, we need complex queries with boolean logic.
|
230
|
+
Instead of composing long query strings such as `tags:ruby OR tags:java AND NOT tags:python`,
|
231
|
+
we can use the [_bool_](http://www.elasticsearch.org/guide/reference/query-dsl/bool-query.html)
|
232
|
+
query. In _Tire_, we build them declaratively.
|
233
|
+
|
234
|
+
```ruby
|
235
|
+
Tire.search 'articles' do
|
236
|
+
query do
|
237
|
+
boolean do
|
238
|
+
should { string 'tags:ruby' }
|
239
|
+
should { string 'tags:java' }
|
240
|
+
must_not { string 'tags:python' }
|
241
|
+
end
|
242
|
+
end
|
243
|
+
end
|
244
|
+
```
|
245
|
+
|
246
|
+
The best thing about `boolean` queries is that we can easily save these partial queries as Ruby blocks,
|
247
|
+
to mix and reuse them later. So, we may define a query for the _tags_ property:
|
248
|
+
|
249
|
+
```ruby
|
250
|
+
tags_query = lambda do
|
251
|
+
boolean.should { string 'tags:ruby' }
|
252
|
+
boolean.should { string 'tags:java' }
|
253
|
+
end
|
254
|
+
```
|
255
|
+
|
256
|
+
And a query for the _published_on_ property:
|
257
|
+
|
258
|
+
```ruby
|
259
|
+
published_on_query = lambda do
|
260
|
+
boolean.must { string 'published_on:[2011-01-01 TO 2011-01-02]' }
|
261
|
+
end
|
262
|
+
```
|
263
|
+
|
264
|
+
Now, we can combine these queries for different searches:
|
265
|
+
|
266
|
+
```ruby
|
267
|
+
Tire.search 'articles' do
|
268
|
+
query do
|
269
|
+
boolean &tags_query
|
270
|
+
boolean &published_on_query
|
271
|
+
end
|
272
|
+
end
|
273
|
+
```
|
274
|
+
|
275
|
+
Note that you can pass options for configuring queries, facets, etc. by passing a Hash as the last argument to the method call:
|
276
|
+
|
277
|
+
```ruby
|
278
|
+
Tire.search 'articles' do
|
279
|
+
query do
|
280
|
+
string 'ruby python', :default_operator => 'AND', :use_dis_max => true
|
281
|
+
end
|
282
|
+
end
|
283
|
+
```
|
284
|
+
|
285
|
+
You don't have to define the search criteria in one monolithic _Ruby_ block -- you can build the search step by step,
|
286
|
+
until you call the `results` method:
|
287
|
+
|
288
|
+
```ruby
|
289
|
+
s = Tire.search('articles') { query { string 'title:T*' } }
|
290
|
+
s.filter :terms, :tags => ['ruby']
|
291
|
+
p s.results
|
292
|
+
```
|
293
|
+
|
294
|
+
If configuring the search payload with blocks feels somehow too weak for you, you can pass
|
295
|
+
a plain old Ruby `Hash` (or JSON string) with the query declaration to the `search` method:
|
296
|
+
|
297
|
+
```ruby
|
298
|
+
Tire.search 'articles', :query => { :prefix => { :title => 'fou' } }
|
299
|
+
```
|
300
|
+
|
301
|
+
If this sounds like a great idea to you, you are probably able to write your application
|
302
|
+
using just `curl`, `sed` and `awk`.
|
303
|
+
|
304
|
+
Do note again, however, that you're not tied to the declarative block-style DSL _Tire_ offers to you.
|
305
|
+
If it makes more sense in your context, you can use the API directly, in a more imperative style:
|
306
|
+
|
307
|
+
```ruby
|
308
|
+
search = Tire::Search::Search.new('articles')
|
309
|
+
search.query { string('title:T*') }
|
310
|
+
search.filter :terms, :tags => ['ruby']
|
311
|
+
search.sort { by :title, 'desc' }
|
312
|
+
search.facet('global-tags', :global => true) { terms :tags }
|
313
|
+
# ...
|
314
|
+
p search.results
|
315
|
+
```
|
316
|
+
|
317
|
+
To debug the query we have laboriously set up like this,
|
318
|
+
we can display the full query JSON for close inspection:
|
319
|
+
|
320
|
+
```ruby
|
321
|
+
puts s.to_json
|
322
|
+
# {"facets":{"current-tags":{"terms":{"field":"tags"}},"global-tags":{"global":true,"terms":{"field":"tags"}}},"query":{"query_string":{"query":"title:T*"}},"filter":{"terms":{"tags":["ruby"]}},"sort":[{"title":"desc"}]}
|
323
|
+
```
|
324
|
+
|
325
|
+
Or, better, we can display the corresponding `curl` command to recreate and debug the request in the terminal:
|
326
|
+
|
327
|
+
```ruby
|
328
|
+
puts s.to_curl
|
329
|
+
# curl -X POST "http://localhost:9200/articles/_search?pretty=true" -d '{"facets":{"current-tags":{"terms":{"field":"tags"}},"global-tags":{"global":true,"terms":{"field":"tags"}}},"query":{"query_string":{"query":"title:T*"}},"filter":{"terms":{"tags":["ruby"]}},"sort":[{"title":"desc"}]}'
|
330
|
+
```
|
331
|
+
|
332
|
+
However, we can simply log every search query (and other requests) in this `curl`-friendly format:
|
333
|
+
|
334
|
+
```ruby
|
335
|
+
Tire.configure { logger 'elasticsearch.log' }
|
336
|
+
```
|
337
|
+
|
338
|
+
When you set the log level to _debug_:
|
339
|
+
|
340
|
+
```ruby
|
341
|
+
Tire.configure { logger 'elasticsearch.log', :level => 'debug' }
|
342
|
+
```
|
343
|
+
|
344
|
+
the JSON responses are logged as well. This is not a great idea for a production environment,
|
345
|
+
but it's priceless when you want to paste a complicated transaction to the mailing list or IRC channel.
|
346
|
+
|
347
|
+
The _Tire_ DSL tries hard to provide a strong Ruby-like API for the main _ElasticSearch_ features.
|
348
|
+
|
349
|
+
By default, _Tire_ wraps the results collection in an enumerable `Results::Collection` class,
|
350
|
+
and result items in a `Results::Item` class, which looks like a child of `Hash` and `OpenStruct`,
|
351
|
+
for smooth iterating over and displaying the results.
|
352
|
+
|
353
|
+
You may wrap the result items in your own class by setting the `Tire.configuration.wrapper`
|
354
|
+
property. Your class must take a `Hash` of attributes on initialization.
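As a rough illustration of that contract, a wrapper could look like the sketch below. `ArticleWrapper` is a hypothetical name; the only requirement taken from the text above is that the class accepts a `Hash` of attributes on initialization. How the class is registered with _Tire_ is left out on purpose.

```ruby
# Hypothetical wrapper class for result items.
class ArticleWrapper
  attr_reader :attributes

  # The single requirement from the README: accept a Hash of attributes.
  def initialize(attributes = {})
    @attributes = attributes
  end

  # Accept both String and Symbol keys, since parsed JSON uses Strings.
  def title
    (attributes['title'] || attributes[:title]).to_s
  end

  def tags
    Array(attributes['tags'] || attributes[:tags])
  end
end

wrapped = ArticleWrapper.new('title' => 'Two', 'tags' => ['ruby', 'python'])
puts "* #{wrapped.title} [tags: #{wrapped.tags.join(', ')}]"
```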
|
355
|
+
|
356
|
+
If that seems like a great idea to you, there's a big chance you already have such a class.
|
357
|
+
|
358
|
+
One would bet it's an `ActiveRecord` or `ActiveModel` class, containing a model of your Rails application.
|
359
|
+
|
360
|
+
Fortunately, _Tire_ makes blending _ElasticSearch_ features into your models trivially possible.
|
361
|
+
|
362
|
+
|
363
|
+
ActiveModel Integration
|
364
|
+
-----------------------
|
365
|
+
|
366
|
+
If you're the type with no time for lengthy introductions, you can generate a fully working
|
367
|
+
example Rails application, with an `ActiveRecord` model and a search form, to play with
|
368
|
+
(it even downloads _ElasticSearch_ itself, generates the application skeleton and leaves you with
|
369
|
+
a _Git_ repository to explore the steps and the code):
|
370
|
+
|
371
|
+
$ rails new searchapp -m https://raw.github.com/karmi/tire/master/examples/rails-application-template.rb
|
372
|
+
|
373
|
+
For the rest of us, let's suppose you have an `Article` class in your _Rails_ application.
|
374
|
+
|
375
|
+
To make it searchable with _Tire_, just `include` it:
|
376
|
+
|
377
|
+
```ruby
|
378
|
+
class Article < ActiveRecord::Base
|
379
|
+
include Tire::Model::Search
|
380
|
+
include Tire::Model::Callbacks
|
381
|
+
end
|
382
|
+
```
|
383
|
+
|
384
|
+
When you now save a record:
|
385
|
+
|
386
|
+
```ruby
|
387
|
+
Article.create :title => "I Love ElasticSearch",
|
388
|
+
:content => "...",
|
389
|
+
:author => "Captain Nemo",
|
390
|
+
:published_on => Time.now
|
391
|
+
```
|
392
|
+
|
393
|
+
it is automatically added into an index called 'articles', because of the included callbacks.
|
394
|
+
|
395
|
+
The document attributes are indexed exactly as when you call the `Article#to_json` method.
|
396
|
+
|
397
|
+
Now you can search the records:
|
398
|
+
|
399
|
+
```ruby
|
400
|
+
Article.search 'love'
|
401
|
+
```
|
402
|
+
|
403
|
+
OK. This is where the search game stops, often. Not here.
|
404
|
+
|
405
|
+
First of all, you may use the full query DSL, as explained above, with filters, sorting,
|
406
|
+
advanced facet aggregation, highlighting, etc:
|
407
|
+
|
408
|
+
```ruby
|
409
|
+
Article.search do
|
410
|
+
query { string 'love' }
|
411
|
+
facet('timeline') { date :published_on, :interval => 'month' }
|
412
|
+
sort { by :published_on, 'desc' }
|
413
|
+
end
|
414
|
+
```
|
415
|
+
|
416
|
+
Second, dynamic mapping is a godsend when you're prototyping.
|
417
|
+
For serious usage, though, you'll definitely want to define a custom _mapping_ for your models:
|
418
|
+
|
419
|
+
```ruby
|
420
|
+
class Article < ActiveRecord::Base
|
421
|
+
include Tire::Model::Search
|
422
|
+
include Tire::Model::Callbacks
|
423
|
+
|
424
|
+
mapping do
|
425
|
+
indexes :id, :index => :not_analyzed
|
426
|
+
indexes :title, :analyzer => 'snowball', :boost => 100
|
427
|
+
indexes :content, :analyzer => 'snowball'
|
428
|
+
indexes :content_size, :as => 'content.size'
|
429
|
+
indexes :author, :analyzer => 'keyword'
|
430
|
+
indexes :published_on, :type => 'date', :include_in_all => false
|
431
|
+
end
|
432
|
+
end
|
433
|
+
```
|
434
|
+
|
435
|
+
In this case, _only_ the defined model attributes are indexed. The `mapping` declaration creates the
|
436
|
+
index when the class is loaded or when the importing features are used, and _only_ when it does not yet exist.
|
437
|
+
|
438
|
+
You can define different [_analyzers_](http://www.elasticsearch.org/guide/reference/index-modules/analysis/index.html),
|
439
|
+
[_boost_](http://www.elasticsearch.org/guide/reference/mapping/boost-field.html) levels for different properties,
|
440
|
+
or any other configuration for _ElasticSearch_.
|
441
|
+
|
442
|
+
You're not limited to 1:1 mapping between your model properties and the serialized document. With the `:as` option,
|
443
|
+
you can pass a string or a _Proc_ object which is evaluated in the instance context (see the `content_size` property).
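What "evaluated in the instance context" means can be sketched in plain Ruby. The `evaluate_as` helper and the `ArticleRecord` struct below are hypothetical and only illustrate the general technique; they are not _Tire's_ code.

```ruby
# Hypothetical record type standing in for a model instance.
ArticleRecord = Struct.new(:title, :content)

# Hypothetical helper: resolves an `:as`-style value against a record.
def evaluate_as(record, as)
  case as
  when Proc   then record.instance_eval(&as) # block runs with self == record
  when String then record.instance_eval(as)  # string is evaluated as Ruby code
  else as                                    # anything else is used verbatim
  end
end

article = ArticleRecord.new('One', 'Lorem ipsum')

p evaluate_as(article, 'content.size')
p evaluate_as(article, proc { title.upcase })
```

Because the string form is evaluated as code, it should only ever come from the model definition, never from user input.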
|
444
|
+
|
445
|
+
Chances are, you also want to declare custom _settings_ for the index, such as to set the number of shards,
|
446
|
+
replicas, or create elaborate analyzer chains, such as the hipster's choice: [_ngrams_](https://gist.github.com/1160430).
|
447
|
+
In this case, just wrap the `mapping` method in a `settings` one, passing it the settings as a Hash:
|
448
|
+
|
449
|
+
```ruby
|
450
|
+
class URL < ActiveRecord::Base
|
451
|
+
include Tire::Model::Search
|
452
|
+
include Tire::Model::Callbacks
|
453
|
+
|
454
|
+
settings :number_of_shards => 1,
|
455
|
+
:number_of_replicas => 1,
|
456
|
+
:analysis => {
|
457
|
+
:filter => {
|
458
|
+
:url_ngram => {
|
459
|
+
"type" => "nGram",
|
460
|
+
"max_gram" => 5,
|
461
|
+
"min_gram" => 3 }
|
462
|
+
},
|
463
|
+
:analyzer => {
|
464
|
+
:url_analyzer => {
|
465
|
+
"tokenizer" => "lowercase",
|
466
|
+
"filter" => ["stop", "url_ngram"],
|
467
|
+
"type" => "custom" }
|
468
|
+
}
|
469
|
+
} do
|
470
|
+
mapping { indexes :url, :type => 'string', :analyzer => "url_analyzer" }
|
471
|
+
end
|
472
|
+
end
|
473
|
+
```
|
474
|
+
|
475
|
+
It may well be reasonable to wrap the index creation logic declared with `Tire.index('urls').create`
|
476
|
+
in a class method of your model, in a module method, etc., to have better control over index creation when
|
477
|
+
bootstrapping the application with Rake tasks or when setting up the test suite.
|
478
|
+
_Tire_ will not hold that against you.
|
479
|
+
|
480
|
+
You may have just stopped wondering: what if I have my own `settings` class method defined?
|
481
|
+
Or what if some other gem defines `settings`, or some other _Tire_ method, such as `update_index`?
|
482
|
+
Things will break, right? No, they won't.
|
483
|
+
|
484
|
+
In fact, all this time you've been using only _proxies_ to the real _Tire_ methods, which live in the `tire`
|
485
|
+
class and instance methods of your model. Only when not trampling on someone's foot — which is the majority
|
486
|
+
of cases —, will _Tire_ bring its methods to the namespace of your class.
|
487
|
+
|
488
|
+
So, instead of writing `Article.search`, you could write `Article.tire.search`, and instead of
|
489
|
+
`@article.update_index` you could write `@article.tire.update_index`, to be on the safe side.
|
490
|
+
Let's have a look at an example with the `mapping` method:
|
491
|
+
|
492
|
+
```ruby
|
493
|
+
class Article < ActiveRecord::Base
|
494
|
+
include Tire::Model::Search
|
495
|
+
include Tire::Model::Callbacks
|
496
|
+
|
497
|
+
tire.mapping do
|
498
|
+
indexes :id, :type => 'string', :index => :not_analyzed
|
499
|
+
# ...
|
500
|
+
end
|
501
|
+
end
|
502
|
+
```
|
503
|
+
|
504
|
+
Of course, you could also use the block form:
|
505
|
+
|
506
|
+
```ruby
|
507
|
+
class Article < ActiveRecord::Base
|
508
|
+
include Tire::Model::Search
|
509
|
+
include Tire::Model::Callbacks
|
510
|
+
|
511
|
+
tire do
|
512
|
+
mapping do
|
513
|
+
indexes :id, :type => 'string', :index => :not_analyzed
|
514
|
+
# ...
|
515
|
+
end
|
516
|
+
end
|
517
|
+
end
|
518
|
+
```
|
519
|
+
|
520
|
+
Internally, _Tire_ uses these proxy methods exclusively. When you run into issues,
|
521
|
+
use the proxied method, e.g. `Article.tire.mapping`, directly.
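The proxy idea can be sketched in a few lines of plain Ruby. Everything below (`MiniSearchable`, `ClassProxy`, `PlainArticle`, `LegacyArticle`) is hypothetical and only mirrors the behaviour described above; it is not how _Tire_ itself is implemented.

```ruby
# Hypothetical miniature of the proxy pattern.
module MiniSearchable
  class ClassProxy
    def search(query)
      "proxy search for #{query}"
    end
  end

  module ClassMethods
    # The proxy is always reachable, whatever else the class defines.
    def tire
      @tire ||= ClassProxy.new
    end
  end

  def self.included(base)
    base.extend ClassMethods
    # Add the convenience method only when it would not clobber an existing one.
    unless base.respond_to?(:search)
      base.define_singleton_method(:search) { |query| tire.search(query) }
    end
  end
end

class PlainArticle
  include MiniSearchable
end

class LegacyArticle
  def self.search(query)
    "legacy search for #{query}"
  end

  include MiniSearchable
end

puts PlainArticle.search('love')
puts LegacyArticle.search('love')
puts LegacyArticle.tire.search('love')
```

The existing `LegacyArticle.search` is left alone, while the proxied call keeps working in both classes.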
|
522
|
+
|
523
|
+
When you want a tight grip on how the attributes are added to the index, just
|
524
|
+
implement the `to_indexed_json` method in your model.
|
525
|
+
|
526
|
+
The easiest way is to customize the `to_json` serialization support of your model:
|
527
|
+
|
528
|
+
```ruby
|
529
|
+
class Article < ActiveRecord::Base
|
530
|
+
# ...
|
531
|
+
|
532
|
+
self.include_root_in_json = false
|
533
|
+
def to_indexed_json
|
534
|
+
to_json :except => ['updated_at'], :methods => ['length']
|
535
|
+
end
|
536
|
+
end
|
537
|
+
```
|
538
|
+
|
539
|
+
Of course, it may well be reasonable to define the indexed JSON from the ground up:
|
540
|
+
|
541
|
+
```ruby
|
542
|
+
class Article < ActiveRecord::Base
|
543
|
+
# ...
|
544
|
+
|
545
|
+
def to_indexed_json
|
546
|
+
names = author.split(/\W/)
|
547
|
+
last_name = names.pop
|
548
|
+
first_name = names.join
|
549
|
+
|
550
|
+
{
|
551
|
+
:title => title,
|
552
|
+
:content => content,
|
553
|
+
:author => {
|
554
|
+
:first_name => first_name,
|
555
|
+
:last_name => last_name
|
556
|
+
}
|
557
|
+
}.to_json
|
558
|
+
end
|
559
|
+
end
|
560
|
+
```
|
561
|
+
|
562
|
+
Notice that you may want to skip including the `Tire::Model::Callbacks` module in special cases,
|
563
|
+
like when your records are indexed via some external mechanism, let's say a _CouchDB_ or _RabbitMQ_
|
564
|
+
[river](http://www.elasticsearch.org/blog/2010/09/28/the_river.html), or when you need better
|
565
|
+
control over how the documents are added to or removed from the index:
|
566
|
+
|
567
|
+
```ruby
|
568
|
+
class Article < ActiveRecord::Base
|
569
|
+
include Tire::Model::Search
|
570
|
+
|
571
|
+
after_save do
|
572
|
+
update_index if state == 'published'
|
573
|
+
end
|
574
|
+
end
|
575
|
+
```
|
576
|
+
|
577
|
+
The results returned by `Article.search` are wrapped in the aforementioned `Item` class, by default.
|
578
|
+
This way, we have fast and flexible access to the properties returned from _ElasticSearch_ (via the
|
579
|
+
`_source` or `fields` JSON properties). This way, we can index whatever JSON we like in _ElasticSearch_,
|
580
|
+
and retrieve it, simply, via the dot notation:
|
581
|
+
|
582
|
+
```ruby
|
583
|
+
articles = Article.search 'love'
|
584
|
+
articles.each do |article|
|
585
|
+
puts article.title
|
586
|
+
puts article.author.last_name
|
587
|
+
end
|
588
|
+
```
|
589
|
+
|
590
|
+
The `Item` instances masquerade as instances of your model within a _Rails_ application
(based on the `_type` property retrieved from _ElasticSearch_), so you can use them freely;
all the `url_for` or `dom_id` helpers work as expected.

If you need to access the “real” model (e.g. to access its associations or methods not
stored in _ElasticSearch_), just load it from the database:

```ruby
puts article.load(:include => 'comments').comments.size
```

You can see that _Tire_ stays as far from the database as possible. That's because it believes
you have most of the data you want to display stored in _ElasticSearch_. When you need
to eagerly load the records from the database itself, for whatever reason,
you can do it with the `:load` option when searching:

```ruby
# Will call `Article.find [1, 2, 3]`
Article.search 'love', :load => true
```

Instead of a simple `true`, you can pass any options for the model's `find` method:

```ruby
# Will call `Article.find [1, 2, 3], :include => 'comments'`
Article.search :load => { :include => 'comments' } do
  query { string 'love' }
end
```

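The eager loading described above boils down to collecting the IDs of the hits and handing them to the model's `find` method. The following is a rough sketch of that idea using an in-memory stand-in; `FakeArticle`, `load_records`, and the sample hits are hypothetical names, and restoring the relevance order of the hits is an assumption of this sketch rather than documented behaviour.

```ruby
# Hypothetical in-memory model standing in for an ActiveRecord class.
class FakeArticle
  RECORDS = { 1 => 'One', 2 => 'Two', 3 => 'Three' }

  attr_reader :id, :title

  def initialize(id, title)
    @id    = id
    @title = title
  end

  # Mimics `find(ids, options)`: returns one record per requested ID.
  def self.find(ids, options = {})
    ids.sort.map { |id| new(id, RECORDS.fetch(id)) }
  end
end

# Search hits as they might come back from the index, best match first (made-up data).
hits = [{ '_id' => '3' }, { '_id' => '1' }]

def load_records(hits, model, load_option)
  ids     = hits.map { |hit| hit['_id'].to_i }
  options = load_option == true ? {} : load_option
  by_id   = model.find(ids, options).each_with_object({}) { |record, map| map[record.id] = record }
  # Keep the relevance order of the hits rather than the database order.
  ids.map { |id| by_id[id] }
end

records = load_records(hits, FakeArticle, true)
puts records.map(&:title).inspect
```

The fake `find` deliberately returns records sorted by ID, which is what makes the re-ordering step visible.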
Note that _Tire_ search results are fully compatible with [`will_paginate`](https://github.com/mislav/will_paginate),
so you can pass all the usual parameters to the `search` method in the controller:

```ruby
@articles = Article.search params[:q], :page => (params[:page] || 1)
```

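Pagination parameters ultimately have to become an offset and a window size for the search request (_ElasticSearch_ calls them `from` and `size`). Here is a small sketch of that arithmetic; the `search_window` helper is a hypothetical name, and the default of 10 results per page is an assumption of this example.

```ruby
# Translate will_paginate-style options into a from/size window.
# The helper name and the default of 10 per page are assumptions for this sketch.
def search_window(options = {})
  page     = [options.fetch(:page, 1).to_i, 1].max   # a nil or zero page falls back to 1
  per_page = options.fetch(:per_page, 10).to_i
  { :from => (page - 1) * per_page, :size => per_page }
end

p search_window(:page => 3, :per_page => 20)
```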
OK. Chances are, you have lots of records stored in your database. How will you get them to _ElasticSearch_? Easy:

```ruby
Article.index.import Article.all
```

This way, however, all your records are loaded into memory, serialized into JSON,
and sent down the wire to _ElasticSearch_. Not practical, you say? You're right.

Provided your model implements some sort of _pagination_ (and it probably does), you can just run:

```ruby
Article.import
```

In this case, the `Article.paginate` method is called, and your records are sent to the index
in chunks of 1000. If that number doesn't suit you, just provide a better one:

```ruby
Article.import :per_page => 100
```

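The chunked import can be pictured as a loop that asks `paginate` for one page after another until an empty page comes back. This is only a sketch under that assumption: `FakeArticle` and `import_in_chunks` are hypothetical, the rows live in memory, and the block stands in for the bulk request a real importer would send.

```ruby
# Hypothetical model with a will_paginate-like class method over in-memory rows.
class FakeArticle
  ROWS = (1..2500).map { |i| { :id => i, :title => "Article #{i}" } }

  def self.paginate(options = {})
    page     = options.fetch(:page, 1)
    per_page = options.fetch(:per_page, 1000)
    ROWS.each_slice(per_page).to_a[page - 1] || []
  end
end

# Walk the pages until an empty one comes back, handing each chunk to the block.
def import_in_chunks(model, options = {})
  page  = 1
  total = 0
  loop do
    chunk = model.paginate(options.merge(:page => page))
    break if chunk.empty?

    yield chunk            # a real importer would send the chunk as one bulk request
    total += chunk.size
    page  += 1
  end
  total
end

sizes = []
total = import_in_chunks(FakeArticle, :per_page => 1000) { |chunk| sizes << chunk.size }
puts "imported #{total} records in chunks of #{sizes.inspect}"
```

Only one chunk is held in memory at a time, which is the point of importing through pagination instead of `Article.all`.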
Any other parameters you provide to the `import` method are passed down to the `paginate` method.

Are we saying you have to fiddle with this thing in a `rails console` or silly Ruby scripts? No.
Just call the included _Rake_ task on the command line:

```bash
$ rake environment tire:import CLASS='Article'
```

You can also force-import the data by deleting the index first (and creating it with the mapping
provided by the `mapping` block in your model):

```bash
$ rake environment tire:import CLASS='Article' FORCE=true
```

When you spend more time with _ElasticSearch_, you'll notice how
[index aliases](http://www.elasticsearch.org/guide/reference/api/admin-indices-aliases.html)
are the best idea since the invention of the inverted index.
You can index your data into a fresh index (and possibly update an alias once everything's fine):

```bash
$ rake environment tire:import CLASS='Article' INDEX='articles-2011-05'
```

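Updating the alias afterwards means sending a list of `remove` and `add` actions to the _ElasticSearch_ aliases endpoint (`POST /_aliases`). The sketch below only builds the request body; the `alias_swap_body` helper, the alias name `articles`, and the older index name `articles-2011-04` are assumptions made up for the example.

```ruby
require 'json'

# Build the body for a `POST /_aliases` request that moves an alias
# from one index to another in a single request.
def alias_swap_body(alias_name, old_index, new_index)
  {
    :actions => [
      { :remove => { :index => old_index, :alias => alias_name } },
      { :add    => { :index => new_index, :alias => alias_name } }
    ]
  }.to_json
end

# Hypothetical names: the alias and the previous index are not from the README.
puts alias_swap_body('articles', 'articles-2011-04', 'articles-2011-05')
```

Because both actions travel in one request, searches against the alias never see a moment where it points at no index.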
OK. All this time we have been talking about `ActiveRecord` models, since
it is a reasonable default for the storage layer in _Rails_.

But what if you use another database, such as [MongoDB](http://www.mongodb.org/),
with another object mapping library, such as [Mongoid](http://mongoid.org/)?

Well, things stay mostly the same:

```ruby
class Article
  include Mongoid::Document
  field :title,   :type => String
  field :content, :type => String

  include Tire::Model::Search
  include Tire::Model::Callbacks

  # These Mongo guys sure do get funky with their IDs in +serializable_hash+, let's fix it.
  #
  def to_indexed_json
    # Return a JSON string, not the Hash that `as_json` would give
    self.to_json
  end

end

Article.create :title => 'I Love ElasticSearch'

Article.tire.search 'love'
```

_Tire_ does not care what your primary data storage solution is, as long as it has an _ActiveModel_-compatible
adapter. But there's more.

_Tire_ implements not only _searchable_ features, but also _persistence_ features. This means you can use a _Tire_ model **instead of your database**, not just for _searching_ your database. Why would you want to do that?

Well, because you're tired of database migrations and lots of hand-holding with your
database to store stuff like `{ :name => 'Tire', :tags => [ 'ruby', 'search' ] }`.
Because all you need, really, is to just dump a JSON-representation of your data into a database and load it back again.
Because you've noticed that _searching_ your data is a much more effective way of retrieval
than constructing elaborate database query conditions.
Because you have _lots_ of data and want to use _ElasticSearch's_ advanced distributed features.

All good reasons to use _ElasticSearch_ as a schema-free and highly-scalable storage and retrieval/aggregation engine for your data.

To use the persistence mode, we'll include the `Tire::Model::Persistence` module in our class and define its properties;
we can add the standard mapping declarations, set default values, or define casting for the property to create
lightweight associations between the models.

```ruby
class Article
  include Tire::Model::Persistence

  validates_presence_of :title, :author

  property :title,        :analyzer => 'snowball'
  property :published_on, :type => 'date'
  property :tags,         :default => [], :analyzer => 'keyword'
  property :author,       :class => Author
  property :comments,     :class => [Comment]
end
```

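To show what default values and casting amount to, here is a toy version of such a `property` declaration in plain Ruby. Everything in it (`SketchProperties`, `SketchArticle`, the `Author` and `Comment` structs, and casting from plain strings) is hypothetical and simplified for this sketch; the real module works differently in its details.

```ruby
# Hypothetical mini-DSL; a sketch of the idea, not Tire's implementation.
module SketchProperties
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    def properties
      @properties ||= {}
    end

    # Remember the options and define a reader/writer pair for the property.
    def property(name, options = {})
      properties[name] = options
      attr_accessor name
    end
  end

  def initialize(attributes = {})
    self.class.properties.each do |name, options|
      default = options[:default]
      # Copy mutable defaults so instances do not share one array or hash.
      default = default.dup if default.is_a?(Array) || default.is_a?(Hash)
      send("#{name}=", cast(attributes.fetch(name, default), options[:class]))
    end
  end

  private

  # `:class => Foo` wraps a single value, `:class => [Foo]` wraps each element.
  def cast(value, klass)
    return value if klass.nil? || value.nil?

    klass.is_a?(Array) ? value.map { |v| klass.first.new(v) } : klass.new(value)
  end
end

Author  = Struct.new(:name)
Comment = Struct.new(:body)

class SketchArticle
  include SketchProperties

  property :title
  property :tags,     :default => []
  property :author,   :class => Author
  property :comments, :class => [Comment]
end

article = SketchArticle.new(:title => 'Tire', :author => 'John', :comments => ['Nice', 'Thanks'])
puts article.tags.inspect
puts article.author.name
puts article.comments.map(&:body).inspect
```

The casting step is what makes the lightweight associations possible: nested data comes back as objects of the declared class rather than as bare values.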
Please be sure to peruse the [integration test suite](https://github.com/karmi/tire/tree/master/test/integration)
for examples of the API and _ActiveModel_ integration usage.


Extensions and Additions
------------------------

The [_tire-contrib_](http://github.com/karmi/tire-contrib/) project contains additions
and extensions to the core _Tire_ functionality; be sure to check them out.


Other Clients
-------------

Check out [other _ElasticSearch_ clients](http://www.elasticsearch.org/guide/appendix/clients.html).


Feedback
--------

You can send feedback via [e-mail](mailto:karmi@karmi.cz) or via [GitHub Issues](https://github.com/karmi/tire/issues).

-----

[Karel Minarik](http://karmi.cz) and [contributors](http://github.com/karmi/tire/contributors)
|