elasticsearch-persistence 0.0.0 → 0.0.1

Files changed (39)
  1. checksums.yaml +15 -0
  2. data/LICENSE.txt +10 -19
  3. data/README.md +432 -14
  4. data/Rakefile +56 -0
  5. data/elasticsearch-persistence.gemspec +45 -17
  6. data/examples/sinatra/.gitignore +7 -0
  7. data/examples/sinatra/Gemfile +28 -0
  8. data/examples/sinatra/README.markdown +36 -0
  9. data/examples/sinatra/application.rb +238 -0
  10. data/examples/sinatra/config.ru +7 -0
  11. data/examples/sinatra/test.rb +118 -0
  12. data/lib/elasticsearch/persistence.rb +88 -2
  13. data/lib/elasticsearch/persistence/client.rb +51 -0
  14. data/lib/elasticsearch/persistence/repository.rb +75 -0
  15. data/lib/elasticsearch/persistence/repository/class.rb +71 -0
  16. data/lib/elasticsearch/persistence/repository/find.rb +73 -0
  17. data/lib/elasticsearch/persistence/repository/naming.rb +115 -0
  18. data/lib/elasticsearch/persistence/repository/response/results.rb +90 -0
  19. data/lib/elasticsearch/persistence/repository/search.rb +60 -0
  20. data/lib/elasticsearch/persistence/repository/serialize.rb +31 -0
  21. data/lib/elasticsearch/persistence/repository/store.rb +95 -0
  22. data/lib/elasticsearch/persistence/version.rb +1 -1
  23. data/test/integration/repository/custom_class_test.rb +85 -0
  24. data/test/integration/repository/customized_class_test.rb +82 -0
  25. data/test/integration/repository/default_class_test.rb +108 -0
  26. data/test/integration/repository/virtus_model_test.rb +114 -0
  27. data/test/test_helper.rb +46 -0
  28. data/test/unit/persistence_test.rb +32 -0
  29. data/test/unit/repository_class_test.rb +51 -0
  30. data/test/unit/repository_client_test.rb +32 -0
  31. data/test/unit/repository_find_test.rb +375 -0
  32. data/test/unit/repository_indexing_test.rb +37 -0
  33. data/test/unit/repository_module_test.rb +144 -0
  34. data/test/unit/repository_naming_test.rb +146 -0
  35. data/test/unit/repository_response_results_test.rb +98 -0
  36. data/test/unit/repository_search_test.rb +97 -0
  37. data/test/unit/repository_serialize_test.rb +57 -0
  38. data/test/unit/repository_store_test.rb +287 -0
  39. metadata +288 -20
checksums.yaml ADDED
@@ -0,0 +1,15 @@
1
+ ---
2
+ !binary "U0hBMQ==":
3
+ metadata.gz: !binary |-
4
+ YTFjMGYyOGMzYTRjNWY3YjNjOWJhNjA5MWI1Nzc5Y2MyNTJjZmU4MA==
5
+ data.tar.gz: !binary |-
6
+ MDQwNzM2ZGJiZTQ4Yjk2NWZlMDZkM2Y5ODE4ODkzZTMzMzU5NWJmMQ==
7
+ SHA512:
8
+ metadata.gz: !binary |-
9
+ MDY0NjExYzBlNTQ3NzM2YjQzNTUyMTRhYzk3ODk5Njk5MzdlZDBlMzE1ZGM3
10
+ YjYxMmVmM2QxZmM1YzY0N2RkYTZkMjZiYWNiOGFlM2NlOGQzMmZjN2ZhYmFm
11
+ YTI4NjFlMWZhNjE3ZjJhMGFlNjZhYzkyOTA4Y2Q5OGFmNzk3ZWI=
12
+ data.tar.gz: !binary |-
13
+ YmJhMzFhZDBjNTQ4ZjU1ZTA5ZGM3MzI2MmY2MzhlNGIyNTBkNDc3Nzk5ZDAy
14
+ MGE4MGU4MTc4ZjFjMjJiMGExM2JhOTRhMWM0NmVkZTEzYTkzOWExNjQ3ODIx
15
+ NTYxNmQ0NDQ1ODA2NTIzNWJhMTEyYjZkNDhlYmZjZGZkNjkyN2Y=
data/LICENSE.txt CHANGED
@@ -1,22 +1,13 @@
1
- Copyright (c) 2013 Karel Minarik
1
+ Copyright (c) 2014 Elasticsearch <http://www.elasticsearch.org>
2
2
 
3
- MIT License
3
+ Licensed under the Apache License, Version 2.0 (the "License");
4
+ you may not use this file except in compliance with the License.
5
+ You may obtain a copy of the License at
4
6
 
5
- Permission is hereby granted, free of charge, to any person obtaining
6
- a copy of this software and associated documentation files (the
7
- "Software"), to deal in the Software without restriction, including
8
- without limitation the rights to use, copy, modify, merge, publish,
9
- distribute, sublicense, and/or sell copies of the Software, and to
10
- permit persons to whom the Software is furnished to do so, subject to
11
- the following conditions:
7
+ http://www.apache.org/licenses/LICENSE-2.0
12
8
 
13
- The above copyright notice and this permission notice shall be
14
- included in all copies or substantial portions of the Software.
15
-
16
- THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
17
- EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
18
- MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
19
- NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
20
- LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
21
- OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
22
- WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
9
+ Unless required by applicable law or agreed to in writing, software
10
+ distributed under the License is distributed on an "AS IS" BASIS,
11
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12
+ See the License for the specific language governing permissions and
13
+ limitations under the License.
data/README.md CHANGED
@@ -1,29 +1,447 @@
1
1
  # Elasticsearch::Persistence
2
2
 
3
- TODO: Write a gem description
3
+ Persistence layer for Ruby domain objects in Elasticsearch, using the Repository and ActiveRecord patterns.
4
+
5
+ The library is compatible with Ruby 1.9.3 (or higher) and Elasticsearch 1.0 (or higher).
4
6
 
5
7
  ## Installation
6
8
 
7
- Add this line to your application's Gemfile:
9
+ Install the package from [Rubygems](https://rubygems.org):
8
10
 
9
- gem 'elasticsearch-persistence'
11
+ gem install elasticsearch-persistence
10
12
 
11
- And then execute:
13
+ To use an unreleased version, either add it to your `Gemfile` for [Bundler](http://bundler.io):
12
14
 
13
- $ bundle
15
+ gem 'elasticsearch-persistence', git: 'git://github.com/elasticsearch/elasticsearch-rails.git'
14
16
 
15
- Or install it yourself as:
17
+ or install it from a source code checkout:
16
18
 
17
- $ gem install elasticsearch-persistence
19
+ git clone https://github.com/elasticsearch/elasticsearch-rails.git
20
+ cd elasticsearch-rails/elasticsearch-persistence
21
+ bundle install
22
+ rake install
18
23
 
19
24
  ## Usage
20
25
 
21
- TODO: Write usage instructions here
26
+ ### The Repository Pattern
27
+
28
+ The `Elasticsearch::Persistence::Repository` module provides an implementation of the
29
+ [repository pattern](http://martinfowler.com/eaaCatalog/repository.html) and allows you
30
+ to save, delete, find and search objects stored in Elasticsearch, as well as configure
31
+ mappings and settings for the index.
32
+
33
+ Let's have a simple plain old Ruby object (PORO):
34
+
35
+ ```ruby
36
+ class Note
37
+ attr_reader :attributes
38
+
39
+ def initialize(attributes={})
40
+ @attributes = attributes
41
+ end
42
+
43
+ def to_hash
44
+ @attributes
45
+ end
46
+ end
47
+ ```
48
+
49
+ Let's create a default, "dumb" repository, as a first step:
50
+
51
+ ```ruby
52
+ require 'elasticsearch/persistence'
53
+ repository = Elasticsearch::Persistence::Repository.new
54
+ ```
55
+
56
+ We can save a `Note` instance into the repository...
57
+
58
+ ```ruby
59
+ note = Note.new id: 1, text: 'Test'
60
+
61
+ repository.save(note)
62
+ # PUT http://localhost:9200/repository/note/1 [status:201, request:0.210s, query:n/a]
63
+ # > {"id":1,"text":"Test"}
64
+ # < {"_index":"repository","_type":"note","_id":"1","_version":1,"created":true}
65
+ ```
66
+
67
+ ...find it...
68
+
69
+ ```ruby
70
+ n = repository.find(1)
71
+ # GET http://localhost:9200/repository/_all/1 [status:200, request:0.003s, query:n/a]
72
+ # < {"_index":"repository","_type":"note","_id":"1","_version":2,"found":true, "_source" : {"id":1,"text":"Test"}}
73
+ => <Note:0x007fcbfc0c4980 @attributes={"id"=>1, "text"=>"Test"}>
74
+ ```
75
+
76
+ ...search for it...
77
+
78
+ ```ruby
79
+ repository.search(query: { match: { text: 'test' } }).first
80
+ # GET http://localhost:9200/repository/_search [status:200, request:0.005s, query:0.002s]
81
+ # > {"query":{"match":{"text":"test"}}}
82
+ # < {"took":2, ... "hits":{"total":1, ... "hits":[{ ... "_source" : {"id":1,"text":"Test"}}]}}
83
+ => <Note:0x007fcbfc1c7b70 @attributes={"id"=>1, "text"=>"Test"}>
84
+ ```
85
+
86
+ ...or delete it:
87
+
88
+ ```ruby
89
+ repository.delete(note)
90
+ # DELETE http://localhost:9200/repository/note/1 [status:200, request:0.014s, query:n/a]
91
+ # < {"found":true,"_index":"repository","_type":"note","_id":"1","_version":3}
92
+ => {"found"=>true, "_index"=>"repository", "_type"=>"note", "_id"=>"1", "_version"=>3}
93
+ ```
94
+
95
+ The repository module provides a number of features and facilities to configure and customize the behaviour:
96
+
97
+ * Configuring the Elasticsearch [client](https://github.com/elasticsearch/elasticsearch-ruby#usage) being used
98
+ * Setting the index name, document type, and object class for deserialization
99
+ * Composing mappings and settings for the index
100
+ * Creating, deleting or refreshing the index
101
+ * Finding or searching for documents
102
+ * Providing access both to domain objects and hits for search results
103
+ * Providing access to the Elasticsearch response for search results (aggregations, total, ...)
104
+ * Defining the methods for serialization and deserialization
105
+
106
+ You can use the default repository class, or include the module in your own. Let's review it in detail.
107
+
108
+ #### The Default Class
109
+
110
+ For simple cases, you can use the default, bundled repository class, and configure/customize it:
111
+
112
+ ```ruby
113
+ repository = Elasticsearch::Persistence::Repository.new do
114
+ # Configure the Elasticsearch client
115
+ client Elasticsearch::Client.new url: ENV['ELASTICSEARCH_URL'], log: true
116
+
117
+ # Set a custom index name
118
+ index :my_notes
119
+
120
+ # Set a custom document type
121
+ type :my_note
122
+
123
+ # Specify the class to inicialize when deserializing documents
124
+ klass Note
125
+
126
+ # Configure the settings and mappings for the Elasticsearch index
127
+ settings number_of_shards: 1 do
128
+ mapping do
129
+ indexes :text, analyzer: 'snowball'
130
+ end
131
+ end
132
+
133
+ # Customize the serialization logic
134
+ def serialize(document)
135
+ super.merge(my_special_key: 'my_special_stuff')
136
+ end
137
+
138
+ # Customize the de-serialization logic
139
+ def deserialize(document)
140
+ puts "# ***** CUSTOM DESERIALIZE LOGIC KICKING IN... *****"
141
+ super
142
+ end
143
+ end
144
+ ```
145
+
146
+ The custom Elasticsearch client will be used now, with a custom index and type names,
147
+ as well as the custom serialization and de-serialization logic.
148
+
149
+ We can create the index with the desired settings and mappings:
150
+
151
+ ```ruby
152
+ repository.create_index! force: true
153
+ # PUT http://localhost:9200/my_notes
154
+ # > {"settings":{"number_of_shards":1},"mappings":{ ... {"text":{"analyzer":"snowball","type":"string"}}}}}
155
+ ```
156
+
157
+ Save the document with extra properties added by the `serialize` method:
158
+
159
+ ```ruby
160
+ repository.save(note)
161
+ # PUT http://localhost:9200/my_notes/my_note/1
162
+ # > {"id":1,"text":"Test","my_special_key":"my_special_stuff"}
163
+ {"_index"=>"my_notes", "_type"=>"my_note", "_id"=>"1", "_version"=>4, ... }
164
+ ```
165
+
166
+ And `deserialize` it:
167
+
168
+ ```ruby
169
+ repository.find(1)
170
+ # ***** CUSTOM DESERIALIZE LOGIC KICKING IN... *****
171
+ <Note:0x007f9bd782b7a0 @attributes={... "my_special_key"=>"my_special_stuff"}>
172
+ ```
173
+
174
+ #### A Custom Class
175
+
176
+ In most cases, though, you'll want to use a custom class for the repository, so let's do that:
177
+
178
+ ```ruby
179
+ require 'base64'
180
+
181
+ class NoteRepository
182
+ include Elasticsearch::Persistence::Repository
183
+
184
+ def initialize(options={})
185
+ index options[:index] || 'notes'
186
+ client Elasticsearch::Client.new url: options[:url], log: options[:log]
187
+ end
188
+
189
+ klass Note
190
+
191
+ settings number_of_shards: 1 do
192
+ mapping do
193
+ indexes :text, analyzer: 'snowball'
194
+ # Do not index images
195
+ indexes :image, index: 'no'
196
+ end
197
+ end
198
+
199
+ # Base64 encode the "image" field in the document
200
+ #
201
+ def serialize(document)
202
+ hash = document.to_hash.clone
203
+ hash['image'] = Base64.encode64(hash['image']) if hash['image']
204
+ hash.to_hash
205
+ end
206
+
207
+ # Base64 decode the "image" field in the document
208
+ #
209
+ def deserialize(document)
210
+ hash = document['_source']
211
+ hash['image'] = Base64.decode64(hash['image']) if hash['image']
212
+ klass.new hash
213
+ end
214
+ end
215
+ ```
216
+
217
+ Include the `Elasticsearch::Persistence::Repository` module to add the repository methods into the class.
218
+
219
+ You can customize the repository in the familiar way, by calling the DSL-like methods.
220
+
221
+ You can implement a custom initializer for your repository, add complex logic in its
222
+ class and instance methods -- in general, have all the freedom of a standard Ruby class.
223
+
224
+ ```ruby
225
+ repository = NoteRepository.new url: 'http://localhost:9200', log: true
226
+
227
+ # Configure the repository instance
228
+ repository.index = 'notes_development'
229
+ repository.client.transport.logger.formatter = proc { |s, d, p, m| "\e[2m# #{m}\n\e[0m" }
230
+
231
+ repository.create_index! force: true
232
+
233
+ note = Note.new 'id' => 1, 'text' => 'Document with image', 'image' => '... BINARY DATA ...'
234
+
235
+ repository.save(note)
236
+ # PUT http://localhost:9200/notes_development/note/1
237
+ # > {"id":1,"text":"Document with image","image":"Li4uIEJJTkFSWSBEQVRBIC4uLg==\n"}
238
+ puts repository.find(1).attributes['image']
239
+ # GET http://localhost:9200/notes_development/note/1
240
+ # < {... "_source" : { ... "image":"Li4uIEJJTkFSWSBEQVRBIC4uLg==\n"}}
241
+ # => ... BINARY DATA ...
242
+ ```
243
+
244
+ #### Methods Provided by the Repository
245
+
246
+ ##### Client
247
+
248
+ The repository uses the standard Elasticsearch [client](https://github.com/elasticsearch/elasticsearch-ruby#usage),
249
+ which is accessible with the `client` getter and setter methods:
250
+
251
+ ```ruby
252
+ repository.client = Elasticsearch::Client.new url: 'http://search.server.org'
253
+ repository.client.transport.logger = Logger.new(STDERR)
254
+ ```
255
+
256
+ ##### Naming
257
+
258
+ The `index` method specifies the Elasticsearch index to use for storage, lookup and search
259
+ (when not set, the value is inferred from the repository class name):
260
+
261
+ ```ruby
262
+ repository.index = 'notes_development'
263
+ ```
264
+
265
+ The `type` method specifies the Elasticsearch document type to use for storage, lookup and search
266
+ (when not set, the value is inferred from the document class name, or `_all` is used):
267
+
268
+ ```ruby
269
+ repository.type = 'my_note'
270
+ ```
271
+
272
+ The `klass` method specifies the Ruby class name to use when initializing objects from
273
+ documents retrieved from the repository (when not set, the value is inferred from the
274
+ document `_type` as fetched from Elasticsearch):
275
+
276
+ ```ruby
277
+ repository.klass = MyNote
278
+ ```
279
+
280
+ ##### Index Configuration
281
+
282
+ The `settings` and `mappings` methods, provided by the
283
+ [`elasticsearch-model`](http://rubydoc.info/gems/elasticsearch-model/Elasticsearch/Model/Indexing/ClassMethods)
284
+ gem, allow you to configure the index properties:
285
+
286
+ ```ruby
287
+ repository.settings number_of_shards: 1
288
+ repository.settings.to_hash
289
+ # => {:number_of_shards=>1}
290
+
291
+ repository.mappings { indexes :title, analyzer: 'snowball' }
292
+ repository.mappings.to_hash
293
+ # => { :note => {:properties=> ... }}
294
+ ```
295
+
296
+ The convenience methods `create_index!`, `delete_index!` and `refresh_index!` allow you to manage the index lifecycle.
297
+
298
+ ##### Serialization
299
+
300
+ The `serialize` and `deserialize` methods allow you to customize the serialization of the document when passing it
301
+ to the storage, and the initialization procedure when loading it from the storage:
302
+
303
+ ```ruby
304
+ class NoteRepository
305
+ def serialize(document)
306
+ Hash[document.to_hash.map() { |k,v| v.upcase! if k == :title; [k,v] }]
307
+ end
308
+ def deserialize(document)
309
+ MyNote.new ActiveSupport::HashWithIndifferentAccess.new(document['_source']).deep_symbolize_keys
310
+ end
311
+ end
312
+ ```
313
+
314
+ ##### Storage
315
+
316
+ The `save` method allows you to store a domain object in the repository:
317
+
318
+ ```ruby
319
+ note = Note.new id: 1, title: 'Quick Brown Fox'
320
+ repository.save(note)
321
+ # => {"_index"=>"notes_development", "_type"=>"my_note", "_id"=>"1", "_version"=>1, "created"=>true}
322
+ ```
323
+
324
+ The `update` method allows you to perform a partial update of a document in the repository.
325
+ Use either a partial document:
326
+
327
+ ```ruby
328
+ repository.update id: 1, title: 'UPDATED', tags: []
329
+ # => {"_index"=>"notes_development", "_type"=>"note", "_id"=>"1", "_version"=>2}
330
+ ```
331
+
332
+ Or a script (optionally with parameters):
333
+
334
+ ```ruby
335
+ repository.update 1, script: 'if (!ctx._source.tags.contains(t)) { ctx._source.tags += t }', params: { t: 'foo' }
336
+ # => {"_index"=>"notes_development", "_type"=>"note", "_id"=>"1", "_version"=>3}
337
+ ```
338
+
339
+
340
+ The `delete` method allows you to remove objects from the repository (pass either the object itself or its ID):
341
+
342
+ ```ruby
343
+ repository.delete(note)
344
+ repository.delete(1)
345
+ ```
346
+
347
+ ##### Finding
348
+
349
+ The `find` method allows you to find one or many documents in the storage and returns them as deserialized Ruby objects:
350
+
351
+ ```ruby
352
+ repository.save Note.new(id: 2, title: 'Fast White Dog')
353
+
354
+ note = repository.find(1)
355
+ # => <MyNote ... QUICK BROWN FOX>
356
+
357
+ notes = repository.find(1, 2)
358
+ # => [<MyNote... QUICK BROWN FOX>, <MyNote ... FAST WHITE DOG>]
359
+ ```
360
+
361
+ When the document with a specific ID isn't found, a `nil` is returned instead of the deserialized object:
362
+
363
+ ```ruby
364
+ notes = repository.find(1, 3, 2)
365
+ # => [<MyNote ...>, nil, <MyNote ...>]
366
+ ```
367
+
368
+ Handle the missing objects in the application code, or call `compact` on the result.
369
+
370
+ ##### Search
371
+
372
+ The `search` method allows you to retrieve objects from the repository by a query string or a definition in the Elasticsearch DSL:
373
+
374
+ ```ruby
375
+ repository.search('fox or dog').to_a
376
+ # GET http://localhost:9200/notes_development/my_note/_search?q=fox
377
+ # => [<MyNote ... FOX ...>, <MyNote ... DOG ...>]
378
+
379
+ repository.search(query: { match: { title: 'fox dog' } }).to_a
380
+ # GET http://localhost:9200/notes_development/my_note/_search
381
+ # > {"query":{"match":{"title":"fox dog"}}}
382
+ # => [<MyNote ... FOX ...>, <MyNote ... DOG ...>]
383
+ ```
384
+
385
+ The returned object is an instance of the `Elasticsearch::Persistence::Repository::Response::Results` class,
386
+ which provides access to the results, the full returned response and hits.
387
+
388
+ ```ruby
389
+ results = repository.search(query: { match: { title: 'fox dog' } })
390
+
391
+ # Iterate over the objects
392
+ #
393
+ results.each do |note|
394
+ puts "* #{note.attributes[:title]}"
395
+ end
396
+ # * QUICK BROWN FOX
397
+ # * FAST WHITE DOG
398
+
399
+ # Iterate over the objects and hits
400
+ #
401
+ results.each_with_hit do |note, hit|
402
+ puts "* #{note.attributes[:title]}, score: #{hit._score}"
403
+ end
404
+ # * QUICK BROWN FOX, score: 0.29930896
405
+ # * FAST WHITE DOG, score: 0.29930896
406
+
407
+ # Get total results
408
+ #
409
+ results.total
410
+ # => 2
411
+
412
+ # Access the raw response as a Hashie::Mash instance
413
+ results.response._shards.failed
414
+ # => 0
415
+ ```
416
+
417
+ #### Example Application
418
+
419
+ An example Sinatra application is available in
420
+ [`examples/sinatra/application.rb`](examples/sinatra/application.rb),
421
+ and demonstrates a rich set of features of the repository.
422
+
423
+
424
+ ### The ActiveRecord Pattern
425
+
426
+ [_Work in progress_](https://github.com/elasticsearch/elasticsearch-rails/pull/91).
427
+ The ActiveRecord [pattern](http://www.martinfowler.com/eaaCatalog/activeRecord.html) will work
428
+ in a very similar way as `Tire::Model::Persistence`, allowing a drop-in replacement of
429
+ an Elasticsearch-backed model in Ruby on Rails applications.
430
+
431
+ ## License
432
+
433
+ This software is licensed under the Apache 2 license, quoted below.
434
+
435
+ Copyright (c) 2014 Elasticsearch <http://www.elasticsearch.org>
436
+
437
+ Licensed under the Apache License, Version 2.0 (the "License");
438
+ you may not use this file except in compliance with the License.
439
+ You may obtain a copy of the License at
22
440
 
23
- ## Contributing
441
+ http://www.apache.org/licenses/LICENSE-2.0
24
442
 
25
- 1. Fork it
26
- 2. Create your feature branch (`git checkout -b my-new-feature`)
27
- 3. Commit your changes (`git commit -am 'Add some feature'`)
28
- 4. Push to the branch (`git push origin my-new-feature`)
29
- 5. Create new Pull Request
443
+ Unless required by applicable law or agreed to in writing, software
444
+ distributed under the License is distributed on an "AS IS" BASIS,
445
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
446
+ See the License for the specific language governing permissions and
447
+ limitations under the License.