elasticsearch-persistence 0.0.0 → 0.0.1

Files changed (39)
  1. checksums.yaml +15 -0
  2. data/LICENSE.txt +10 -19
  3. data/README.md +432 -14
  4. data/Rakefile +56 -0
  5. data/elasticsearch-persistence.gemspec +45 -17
  6. data/examples/sinatra/.gitignore +7 -0
  7. data/examples/sinatra/Gemfile +28 -0
  8. data/examples/sinatra/README.markdown +36 -0
  9. data/examples/sinatra/application.rb +238 -0
  10. data/examples/sinatra/config.ru +7 -0
  11. data/examples/sinatra/test.rb +118 -0
  12. data/lib/elasticsearch/persistence.rb +88 -2
  13. data/lib/elasticsearch/persistence/client.rb +51 -0
  14. data/lib/elasticsearch/persistence/repository.rb +75 -0
  15. data/lib/elasticsearch/persistence/repository/class.rb +71 -0
  16. data/lib/elasticsearch/persistence/repository/find.rb +73 -0
  17. data/lib/elasticsearch/persistence/repository/naming.rb +115 -0
  18. data/lib/elasticsearch/persistence/repository/response/results.rb +90 -0
  19. data/lib/elasticsearch/persistence/repository/search.rb +60 -0
  20. data/lib/elasticsearch/persistence/repository/serialize.rb +31 -0
  21. data/lib/elasticsearch/persistence/repository/store.rb +95 -0
  22. data/lib/elasticsearch/persistence/version.rb +1 -1
  23. data/test/integration/repository/custom_class_test.rb +85 -0
  24. data/test/integration/repository/customized_class_test.rb +82 -0
  25. data/test/integration/repository/default_class_test.rb +108 -0
  26. data/test/integration/repository/virtus_model_test.rb +114 -0
  27. data/test/test_helper.rb +46 -0
  28. data/test/unit/persistence_test.rb +32 -0
  29. data/test/unit/repository_class_test.rb +51 -0
  30. data/test/unit/repository_client_test.rb +32 -0
  31. data/test/unit/repository_find_test.rb +375 -0
  32. data/test/unit/repository_indexing_test.rb +37 -0
  33. data/test/unit/repository_module_test.rb +144 -0
  34. data/test/unit/repository_naming_test.rb +146 -0
  35. data/test/unit/repository_response_results_test.rb +98 -0
  36. data/test/unit/repository_search_test.rb +97 -0
  37. data/test/unit/repository_serialize_test.rb +57 -0
  38. data/test/unit/repository_store_test.rb +287 -0
  39. metadata +288 -20
checksums.yaml ADDED
@@ -0,0 +1,15 @@
1
+ ---
2
+ !binary "U0hBMQ==":
3
+ metadata.gz: !binary |-
4
+ YTFjMGYyOGMzYTRjNWY3YjNjOWJhNjA5MWI1Nzc5Y2MyNTJjZmU4MA==
5
+ data.tar.gz: !binary |-
6
+ MDQwNzM2ZGJiZTQ4Yjk2NWZlMDZkM2Y5ODE4ODkzZTMzMzU5NWJmMQ==
7
+ SHA512:
8
+ metadata.gz: !binary |-
9
+ MDY0NjExYzBlNTQ3NzM2YjQzNTUyMTRhYzk3ODk5Njk5MzdlZDBlMzE1ZGM3
10
+ YjYxMmVmM2QxZmM1YzY0N2RkYTZkMjZiYWNiOGFlM2NlOGQzMmZjN2ZhYmFm
11
+ YTI4NjFlMWZhNjE3ZjJhMGFlNjZhYzkyOTA4Y2Q5OGFmNzk3ZWI=
12
+ data.tar.gz: !binary |-
13
+ YmJhMzFhZDBjNTQ4ZjU1ZTA5ZGM3MzI2MmY2MzhlNGIyNTBkNDc3Nzk5ZDAy
14
+ MGE4MGU4MTc4ZjFjMjJiMGExM2JhOTRhMWM0NmVkZTEzYTkzOWExNjQ3ODIx
15
+ NTYxNmQ0NDQ1ODA2NTIzNWJhMTEyYjZkNDhlYmZjZGZkNjkyN2Y=
data/LICENSE.txt CHANGED
@@ -1,22 +1,13 @@
1
- Copyright (c) 2013 Karel Minarik
1
+ Copyright (c) 2014 Elasticsearch <http://www.elasticsearch.org>
2
2
 
3
- MIT License
3
+ Licensed under the Apache License, Version 2.0 (the "License");
4
+ you may not use this file except in compliance with the License.
5
+ You may obtain a copy of the License at
4
6
 
5
- Permission is hereby granted, free of charge, to any person obtaining
6
- a copy of this software and associated documentation files (the
7
- "Software"), to deal in the Software without restriction, including
8
- without limitation the rights to use, copy, modify, merge, publish,
9
- distribute, sublicense, and/or sell copies of the Software, and to
10
- permit persons to whom the Software is furnished to do so, subject to
11
- the following conditions:
7
+ http://www.apache.org/licenses/LICENSE-2.0
12
8
 
13
- The above copyright notice and this permission notice shall be
14
- included in all copies or substantial portions of the Software.
15
-
16
- THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
17
- EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
18
- MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
19
- NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
20
- LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
21
- OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
22
- WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
9
+ Unless required by applicable law or agreed to in writing, software
10
+ distributed under the License is distributed on an "AS IS" BASIS,
11
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12
+ See the License for the specific language governing permissions and
13
+ limitations under the License.
data/README.md CHANGED
@@ -1,29 +1,447 @@
1
1
  # Elasticsearch::Persistence
2
2
 
3
- TODO: Write a gem description
3
+ Persistence layer for Ruby domain objects in Elasticsearch, using the Repository and ActiveRecord patterns.
4
+
5
+ The library is compatible with Ruby 1.9.3 (or higher) and Elasticsearch 1.0 (or higher).
4
6
 
5
7
  ## Installation
6
8
 
7
- Add this line to your application's Gemfile:
9
+ Install the package from [Rubygems](https://rubygems.org):
8
10
 
9
- gem 'elasticsearch-persistence'
11
+ gem install elasticsearch-persistence
10
12
 
11
- And then execute:
13
+ To use an unreleased version, either add it to your `Gemfile` for [Bundler](http://bundler.io):
12
14
 
13
- $ bundle
15
+ gem 'elasticsearch-persistence', git: 'git://github.com/elasticsearch/elasticsearch-rails.git'
14
16
 
15
- Or install it yourself as:
17
+ or install it from a source code checkout:
16
18
 
17
- $ gem install elasticsearch-persistence
19
+ git clone https://github.com/elasticsearch/elasticsearch-rails.git
20
+ cd elasticsearch-rails/elasticsearch-persistence
21
+ bundle install
22
+ rake install
18
23
 
19
24
  ## Usage
20
25
 
21
- TODO: Write usage instructions here
26
+ ### The Repository Pattern
27
+
28
+ The `Elasticsearch::Persistence::Repository` module provides an implementation of the
29
+ [repository pattern](http://martinfowler.com/eaaCatalog/repository.html) and allows you
30
+ to save, delete, find and search objects stored in Elasticsearch, as well as configure
31
+ mappings and settings for the index.
32
+
33
+ Let's have a simple plain old Ruby object (PORO):
34
+
35
+ ```ruby
36
+ class Note
37
+ attr_reader :attributes
38
+
39
+ def initialize(attributes={})
40
+ @attributes = attributes
41
+ end
42
+
43
+ def to_hash
44
+ @attributes
45
+ end
46
+ end
47
+ ```
48
+
49
+ Let's create a default, "dumb" repository, as a first step:
50
+
51
+ ```ruby
52
+ require 'elasticsearch/persistence'
53
+ repository = Elasticsearch::Persistence::Repository.new
54
+ ```
55
+
56
+ We can save a `Note` instance into the repository...
57
+
58
+ ```ruby
59
+ note = Note.new id: 1, text: 'Test'
60
+
61
+ repository.save(note)
62
+ # PUT http://localhost:9200/repository/note/1 [status:201, request:0.210s, query:n/a]
63
+ # > {"id":1,"text":"Test"}
64
+ # < {"_index":"repository","_type":"note","_id":"1","_version":1,"created":true}
65
+ ```
66
+
67
+ ...find it...
68
+
69
+ ```ruby
70
+ n = repository.find(1)
71
+ # GET http://localhost:9200/repository/_all/1 [status:200, request:0.003s, query:n/a]
72
+ # < {"_index":"repository","_type":"note","_id":"1","_version":2,"found":true, "_source" : {"id":1,"text":"Test"}}
73
+ => <Note:0x007fcbfc0c4980 @attributes={"id"=>1, "text"=>"Test"}>
74
+ ```
75
+
76
+ ...search for it...
77
+
78
+ ```ruby
79
+ repository.search(query: { match: { text: 'test' } }).first
80
+ # GET http://localhost:9200/repository/_search [status:200, request:0.005s, query:0.002s]
81
+ # > {"query":{"match":{"text":"test"}}}
82
+ # < {"took":2, ... "hits":{"total":1, ... "hits":[{ ... "_source" : {"id":1,"text":"Test"}}]}}
83
+ => <Note:0x007fcbfc1c7b70 @attributes={"id"=>1, "text"=>"Test"}>
84
+ ```
85
+
86
+ ...or delete it:
87
+
88
+ ```ruby
89
+ repository.delete(note)
90
+ # DELETE http://localhost:9200/repository/note/1 [status:200, request:0.014s, query:n/a]
91
+ # < {"found":true,"_index":"repository","_type":"note","_id":"1","_version":3}
92
+ => {"found"=>true, "_index"=>"repository", "_type"=>"note", "_id"=>"1", "_version"=>3}
93
+ ```
94
+
95
+ The repository module provides a number of features and facilities to configure and customize the behaviour:
96
+
97
+ * Configuring the Elasticsearch [client](https://github.com/elasticsearch/elasticsearch-ruby#usage) being used
98
+ * Setting the index name, document type, and object class for deserialization
99
+ * Composing mappings and settings for the index
100
+ * Creating, deleting or refreshing the index
101
+ * Finding or searching for documents
102
+ * Providing access both to domain objects and hits for search results
103
+ * Providing access to the Elasticsearch response for search results (aggregations, total, ...)
104
+ * Defining the methods for serialization and deserialization
105
+
106
+ You can use the default repository class, or include the module in your own. Let's review it in detail.
107
+
108
+ #### The Default Class
109
+
110
+ For simple cases, you can use the default, bundled repository class, and configure/customize it:
111
+
112
+ ```ruby
113
+ repository = Elasticsearch::Persistence::Repository.new do
114
+ # Configure the Elasticsearch client
115
+ client Elasticsearch::Client.new url: ENV['ELASTICSEARCH_URL'], log: true
116
+
117
+ # Set a custom index name
118
+ index :my_notes
119
+
120
+ # Set a custom document type
121
+ type :my_note
122
+
123
+ # Specify the class to initialize when deserializing documents
124
+ klass Note
125
+
126
+ # Configure the settings and mappings for the Elasticsearch index
127
+ settings number_of_shards: 1 do
128
+ mapping do
129
+ indexes :text, analyzer: 'snowball'
130
+ end
131
+ end
132
+
133
+ # Customize the serialization logic
134
+ def serialize(document)
135
+ super.merge(my_special_key: 'my_special_stuff')
136
+ end
137
+
138
+ # Customize the de-serialization logic
139
+ def deserialize(document)
140
+ puts "# ***** CUSTOM DESERIALIZE LOGIC KICKING IN... *****"
141
+ super
142
+ end
143
+ end
144
+ ```
145
+
146
+ The custom Elasticsearch client will be used now, with custom index and type names,
147
+ as well as the custom serialization and de-serialization logic.
148
+
149
+ We can create the index with the desired settings and mappings:
150
+
151
+ ```ruby
152
+ repository.create_index! force: true
153
+ # PUT http://localhost:9200/my_notes
154
+ # > {"settings":{"number_of_shards":1},"mappings":{ ... {"text":{"analyzer":"snowball","type":"string"}}}}}
155
+ ```
156
+
157
+ Save the document with extra properties added by the `serialize` method:
158
+
159
+ ```ruby
160
+ repository.save(note)
161
+ # PUT http://localhost:9200/my_notes/my_note/1
162
+ # > {"id":1,"text":"Test","my_special_key":"my_special_stuff"}
163
+ {"_index"=>"my_notes", "_type"=>"my_note", "_id"=>"1", "_version"=>4, ... }
164
+ ```
165
+
166
+ And `deserialize` it:
167
+
168
+ ```ruby
169
+ repository.find(1)
170
+ # ***** CUSTOM DESERIALIZE LOGIC KICKING IN... *****
171
+ <Note:0x007f9bd782b7a0 @attributes={... "my_special_key"=>"my_special_stuff"}>
172
+ ```
173
+
174
+ #### A Custom Class
175
+
176
+ In most cases, though, you'll want to use a custom class for the repository, so let's do that:
177
+
178
+ ```ruby
179
+ require 'base64'
180
+
181
+ class NoteRepository
182
+ include Elasticsearch::Persistence::Repository
183
+
184
+ def initialize(options={})
185
+ index options[:index] || 'notes'
186
+ client Elasticsearch::Client.new url: options[:url], log: options[:log]
187
+ end
188
+
189
+ klass Note
190
+
191
+ settings number_of_shards: 1 do
192
+ mapping do
193
+ indexes :text, analyzer: 'snowball'
194
+ # Do not index images
195
+ indexes :image, index: 'no'
196
+ end
197
+ end
198
+
199
+ # Base64 encode the "image" field in the document
200
+ #
201
+ def serialize(document)
202
+ hash = document.to_hash.clone
203
+ hash['image'] = Base64.encode64(hash['image']) if hash['image']
204
+ hash.to_hash
205
+ end
206
+
207
+ # Base64 decode the "image" field in the document
208
+ #
209
+ def deserialize(document)
210
+ hash = document['_source']
211
+ hash['image'] = Base64.decode64(hash['image']) if hash['image']
212
+ klass.new hash
213
+ end
214
+ end
215
+ ```
216
+
217
+ Include the `Elasticsearch::Persistence::Repository` module to add the repository methods into the class.
218
+
219
+ You can customize the repository in the familiar way, by calling the DSL-like methods.
220
+
221
+ You can implement a custom initializer for your repository, add complex logic in its
222
+ class and instance methods -- in general, have all the freedom of a standard Ruby class.
223
+
224
+ ```ruby
225
+ repository = NoteRepository.new url: 'http://localhost:9200', log: true
226
+
227
+ # Configure the repository instance
228
+ repository.index = 'notes_development'
229
+ repository.client.transport.logger.formatter = proc { |s, d, p, m| "\e[2m# #{m}\n\e[0m" }
230
+
231
+ repository.create_index! force: true
232
+
233
+ note = Note.new 'id' => 1, 'text' => 'Document with image', 'image' => '... BINARY DATA ...'
234
+
235
+ repository.save(note)
236
+ # PUT http://localhost:9200/notes_development/note/1
237
+ # > {"id":1,"text":"Document with image","image":"Li4uIEJJTkFSWSBEQVRBIC4uLg==\n"}
238
+ puts repository.find(1).attributes['image']
239
+ # GET http://localhost:9200/notes_development/note/1
240
+ # < {... "_source" : { ... "image":"Li4uIEJJTkFSWSBEQVRBIC4uLg==\n"}}
241
+ # => ... BINARY DATA ...
242
+ ```
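The Base64 round trip performed by the custom `serialize` and `deserialize` methods can be checked in isolation, without a running Elasticsearch instance. The following is a minimal sketch only: `encode_image` and `decode_image` are hypothetical helper names that mirror the two method bodies, and plain hashes stand in for the `Note` object and for the document's `_source`.

```ruby
require 'base64'

# Hypothetical helper mirroring the body of the custom `serialize` method:
# Base64-encode the "image" value on a copy of the attributes hash.
def encode_image(attributes)
  copy = attributes.dup
  copy['image'] = Base64.encode64(copy['image']) if copy['image']
  copy
end

# Hypothetical helper mirroring the body of the custom `deserialize` method:
# Base64-decode the "image" value found in the stored source.
def decode_image(source)
  copy = source.dup
  copy['image'] = Base64.decode64(copy['image']) if copy['image']
  copy
end

original = { 'id' => 1, 'text' => 'Document with image', 'image' => '... BINARY DATA ...' }
stored   = encode_image(original)   # what would be sent to the index
restored = decode_image(stored)     # what the application would get back

puts stored['image']
puts restored['image']
```

Working on a copy keeps the caller's hash untouched, which is one reasonable choice when the same object is saved more than once.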
243
+
244
+ #### Methods Provided by the Repository
245
+
246
+ ##### Client
247
+
248
+ The repository uses the standard Elasticsearch [client](https://github.com/elasticsearch/elasticsearch-ruby#usage),
249
+ which is accessible with the `client` getter and setter methods:
250
+
251
+ ```ruby
252
+ repository.client = Elasticsearch::Client.new url: 'http://search.server.org'
253
+ repository.client.transport.logger = Logger.new(STDERR)
254
+ ```
255
+
256
+ ##### Naming
257
+
258
+ The `index` method specifies the Elasticsearch index to use for storage, lookup and search
259
+ (when not set, the value is inferred from the repository class name):
260
+
261
+ ```ruby
262
+ repository.index = 'notes_development'
263
+ ```
264
+
265
+ The `type` method specifies the Elasticsearch document type to use for storage, lookup and search
266
+ (when not set, the value is inferred from the document class name, or `_all` is used):
267
+
268
+ ```ruby
269
+ repository.type = 'my_note'
270
+ ```
271
+
272
+ The `klass` method specifies the Ruby class name to use when initializing objects from
273
+ documents retrieved from the repository (when not set, the value is inferred from the
274
+ document `_type` as fetched from Elasticsearch):
275
+
276
+ ```ruby
277
+ repository.klass = MyNote
278
+ ```
279
+
280
+ ##### Index Configuration
281
+
282
+ The `settings` and `mappings` methods, provided by the
283
+ [`elasticsearch-model`](http://rubydoc.info/gems/elasticsearch-model/Elasticsearch/Model/Indexing/ClassMethods)
284
+ gem, allow you to configure the index properties:
285
+
286
+ ```ruby
287
+ repository.settings number_of_shards: 1
288
+ repository.settings.to_hash
289
+ # => {:number_of_shards=>1}
290
+
291
+ repository.mappings { indexes :title, analyzer: 'snowball' }
292
+ repository.mappings.to_hash
293
+ # => { :note => {:properties=> ... }}
294
+ ```
295
+
296
+ The convenience methods `create_index!`, `delete_index!` and `refresh_index!` allow you to manage the index lifecycle.
297
+
298
+ ##### Serialization
299
+
300
+ The `serialize` and `deserialize` methods allow you to customize the serialization of the document when passing it
301
+ to the storage, and the initialization procedure when loading it from the storage:
302
+
303
+ ```ruby
304
+ class NoteRepository
305
+ def serialize(document)
306
+ Hash[document.to_hash.map { |k, v| [k, k == :title ? v.upcase : v] }]
307
+ end
308
+ def deserialize(document)
309
+ MyNote.new ActiveSupport::HashWithIndifferentAccess.new(document['_source']).deep_symbolize_keys
310
+ end
311
+ end
312
+ ```
313
+
314
+ ##### Storage
315
+
316
+ The `save` method allows you to store a domain object in the repository:
317
+
318
+ ```ruby
319
+ note = Note.new id: 1, title: 'Quick Brown Fox'
320
+ repository.save(note)
321
+ # => {"_index"=>"notes_development", "_type"=>"my_note", "_id"=>"1", "_version"=>1, "created"=>true}
322
+ ```
323
+
324
+ The `update` method allows you to perform a partial update of a document in the repository.
325
+ Use either a partial document:
326
+
327
+ ```ruby
328
+ repository.update id: 1, title: 'UPDATED', tags: []
329
+ # => {"_index"=>"notes_development", "_type"=>"note", "_id"=>"1", "_version"=>2}
330
+ ```
331
+
332
+ Or a script (optionally with parameters):
333
+
334
+ ```ruby
335
+ repository.update 1, script: 'if (!ctx._source.tags.contains(t)) { ctx._source.tags += t }', params: { t: 'foo' }
336
+ # => {"_index"=>"notes_development", "_type"=>"note", "_id"=>"1", "_version"=>3}
337
+ ```
338
+
339
+
340
+ The `delete` method allows you to remove objects from the repository (pass either the object itself or its ID):
341
+
342
+ ```ruby
343
+ repository.delete(note)
344
+ repository.delete(1)
345
+ ```
346
+
347
+ ##### Finding
348
+
349
+ The `find` method allows you to find one or many documents in the storage and returns them as deserialized Ruby objects:
350
+
351
+ ```ruby
352
+ repository.save Note.new(id: 2, title: 'Fast White Dog')
353
+
354
+ note = repository.find(1)
355
+ # => <MyNote ... QUICK BROWN FOX>
356
+
357
+ notes = repository.find(1, 2)
358
+ # => [<MyNote... QUICK BROWN FOX>, <MyNote ... FAST WHITE DOG>]
359
+ ```
360
+
361
+ When the document with a specific ID isn't found, `nil` is returned in its place instead of a deserialized object:
362
+
363
+ ```ruby
364
+ notes = repository.find(1, 3, 2)
365
+ # => [<MyNote ...>, nil, <MyNote ...>]
366
+ ```
367
+
368
+ Handle the missing objects in the application code, or call `compact` on the result.
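Both options can be sketched in plain Ruby. This is an illustration only: the `Doc` struct and the hard-coded `found` array are hypothetical stand-ins for the value returned by `repository.find(1, 3, 2)`, and the sketch assumes the result lines up positionally with the requested IDs, as the example above suggests.

```ruby
# Hypothetical stand-in for deserialized documents
Doc = Struct.new(:id, :title)

ids   = [1, 3, 2]
# Hypothetical stand-in for the array returned by repository.find(*ids)
found = [Doc.new(1, 'QUICK BROWN FOX'), nil, Doc.new(2, 'FAST WHITE DOG')]

# Option 1: drop the missing documents
present = found.compact

# Option 2: work out which IDs were missing and handle them explicitly
missing_ids = ids.zip(found).select { |_id, doc| doc.nil? }.map(&:first)

puts present.map(&:title).inspect
puts missing_ids.inspect
```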
369
+
370
+ ##### Search
371
+
372
+ The `search` method retrieves objects from the repository by a query string or a definition in the Elasticsearch DSL:
373
+
374
+ ```ruby
375
+ repository.search('fox or dog').to_a
376
+ # GET http://localhost:9200/notes_development/my_note/_search?q=fox
377
+ # => [<MyNote ... FOX ...>, <MyNote ... DOG ...>]
378
+
379
+ repository.search(query: { match: { title: 'fox dog' } }).to_a
380
+ # GET http://localhost:9200/notes_development/my_note/_search
381
+ # > {"query":{"match":{"title":"fox dog"}}}
382
+ # => [<MyNote ... FOX ...>, <MyNote ... DOG ...>]
383
+ ```
384
+
385
+ The returned object is an instance of the `Elasticsearch::Persistence::Repository::Response::Results` class,
386
+ which provides access to the results, the full returned response and hits.
387
+
388
+ ```ruby
389
+ results = repository.search(query: { match: { title: 'fox dog' } })
390
+
391
+ # Iterate over the objects
392
+ #
393
+ results.each do |note|
394
+ puts "* #{note.attributes[:title]}"
395
+ end
396
+ # * QUICK BROWN FOX
397
+ # * FAST WHITE DOG
398
+
399
+ # Iterate over the objects and hits
400
+ #
401
+ results.each_with_hit do |note, hit|
402
+ puts "* #{note.attributes[:title]}, score: #{hit._score}"
403
+ end
404
+ # * QUICK BROWN FOX, score: 0.29930896
405
+ # * FAST WHITE DOG, score: 0.29930896
406
+
407
+ # Get total results
408
+ #
409
+ results.total
410
+ # => 2
411
+
412
+ # Access the raw response as a Hashie::Mash instance
413
+ results.response._shards.failed
414
+ # => 0
415
+ ```
416
+
417
+ #### Example Application
418
+
419
+ An example Sinatra application is available in
420
+ [`examples/sinatra/application.rb`](examples/sinatra/application.rb),
421
+ and demonstrates a rich set of features of the repository.
422
+
423
+
424
+ ### The ActiveRecord Pattern
425
+
426
+ [_Work in progress_](https://github.com/elasticsearch/elasticsearch-rails/pull/91).
427
+ The ActiveRecord [pattern](http://www.martinfowler.com/eaaCatalog/activeRecord.html) will work
428
+ in a way very similar to `Tire::Model::Persistence`, allowing a drop-in replacement of
429
+ an Elasticsearch-backed model in Ruby on Rails applications.
430
+
431
+ ## License
432
+
433
+ This software is licensed under the Apache 2 license, quoted below.
434
+
435
+ Copyright (c) 2014 Elasticsearch <http://www.elasticsearch.org>
436
+
437
+ Licensed under the Apache License, Version 2.0 (the "License");
438
+ you may not use this file except in compliance with the License.
439
+ You may obtain a copy of the License at
22
440
 
23
- ## Contributing
441
+ http://www.apache.org/licenses/LICENSE-2.0
24
442
 
25
- 1. Fork it
26
- 2. Create your feature branch (`git checkout -b my-new-feature`)
27
- 3. Commit your changes (`git commit -am 'Add some feature'`)
28
- 4. Push to the branch (`git push origin my-new-feature`)
29
- 5. Create new Pull Request
443
+ Unless required by applicable law or agreed to in writing, software
444
+ distributed under the License is distributed on an "AS IS" BASIS,
445
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
446
+ See the License for the specific language governing permissions and
447
+ limitations under the License.