oedipus-dm 0.0.1

data/.gitignore ADDED
@@ -0,0 +1,10 @@
1
+ *.gem
2
+ .bundle
3
+ Gemfile.lock
4
+ pkg/*
5
+ spec/data/index/*
6
+ spec/data/binlog/*
7
+ spec/data/searchd.*
8
+ spec/data/sphinx.*
9
+ lib/oedipus/oedipus.so
10
+ tmp/
data/.rspec ADDED
@@ -0,0 +1 @@
1
+ --color
data/Gemfile ADDED
@@ -0,0 +1,4 @@
1
+ source 'https://rubygems.org'
2
+
3
+ # Specify your gem's dependencies in oedipus-dm.gemspec
4
+ gemspec
data/LICENSE ADDED
@@ -0,0 +1,22 @@
1
+ Copyright © 2012 Chris Corbyn.
2
+
3
+ MIT License
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining
6
+ a copy of this software and associated documentation files (the
7
+ "Software"), to deal in the Software without restriction, including
8
+ without limitation the rights to use, copy, modify, merge, publish,
9
+ distribute, sublicense, and/or sell copies of the Software, and to
10
+ permit persons to whom the Software is furnished to do so, subject to
11
+ the following conditions:
12
+
13
+ The above copyright notice and this permission notice shall be
14
+ included in all copies or substantial portions of the Software.
15
+
16
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
17
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
18
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
19
+ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
20
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
21
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
22
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,451 @@
1
+ # Oedipus Sphinx Integration for DataMapper
2
+
3
+ This gem is a work in progress, binding [Oedipus](https://github.com/d11wtq/oedipus)
4
+ with [DataMapper](https://github.com/datamapper/dm-core), in order to support
5
+ the querying and updating of Sphinx indexes through DataMapper models.
6
+
7
+ The gem is not yet published, as it is still in development.
8
+
9
+ ## Usage
10
+
11
+ All features of Oedipus will ultimately be supported, but I'm documenting as
12
+ I complete wrapping the features.
13
+
14
+ ### Configure Oedipus
15
+
16
+ Oedipus must be configured to connect to a SphinxQL host. The older searchd
17
+ interface is not supported.
18
+
19
+ ``` ruby
20
+ require "oedipus-dm"
21
+
22
+ Oedipus::DataMapper.configure do |config|
23
+ config.host = "localhost"
24
+ config.port = 9306
25
+ end
26
+ ```
27
+
28
+ In Rails you can do this in an initializer for example. If you prefer not to
29
+ use a global configuration, it is possible to specify how to connect on a
30
+ per-index basis instead.
31
+
32
+ ### Defining an Index
33
+
34
+ The most basic way to connect a Sphinx index with your model is to define a
35
+ `.index` method on the model itself. Oedipus doesn't directly mix behaviour
36
+ into your models by default, as experience suggests this makes testing in
37
+ isolation more difficult (note that you can easily have a standalone `Index`
38
+ that wraps your model, if you prefer this).
39
+
40
+ For a non-realtime index, something like the following would work fine.
41
+
42
+ ``` ruby
43
+ class Post
44
+ include DataMapper::Resource
45
+
46
+ property :id, Serial
47
+ property :title, String
48
+ property :body, Text
49
+ property :view_count, Integer
50
+
51
+ belongs_to :user
52
+
53
+ def self.index
54
+ @index ||= Oedipus::DataMapper::Index.new(self)
55
+ end
56
+ end
57
+ ```
58
+
59
+ Oedipus will use the `storage_name` of your model as the index name in Sphinx.
60
+ If you need to use a different name, pass the `:name` option to the Index.
61
+
62
+ ``` ruby
63
+ def self.index
64
+ @index ||= Oedipus::DataMapper::Index.new(self, name: :posts_rt)
65
+ end
66
+ ```
67
+
68
+ If you have not globally configured Oedipus, or want to specify different
69
+ connection settings, pass the `:connection` option.
70
+
71
+ ``` ruby
72
+ def self.index
73
+ @index ||= Oedipus::DataMapper::Index.new(
74
+ self,
75
+ connection: Oedipus.connect("localhost:9306")
76
+ )
77
+ end
78
+ ```
79
+
80
+ #### Map fields and attributes with your model
81
+
82
+ By default, the only field that Oedipus will map with your model is the `:id`
83
+ attribute, which it will try to map with the key of your model. This
84
+ configuration will work fine for non-realtime indexes in most cases, but it
85
+ is rarely the most efficient one.
86
+
87
+ When Oedipus finds search results, it pulls out all the attributes defined in
88
+ your index, then tries to map them to instances of your model. Mapping `:id`
89
+ alone means that DataMapper will load all of your resources from the database
90
+ when you first try to access any other attribute.
91
+
92
+ Chances are, you have some attributes in your index that can be mapped to your
93
+ model, avoiding the extra database hit. You can add these mappings like so.
94
+
95
+ ``` ruby
96
+ Oedipus::DataMapper::Index.new(self) do |idx|
97
+ idx.map :user_id
98
+ idx.map :views, with: :view_count
99
+ end
100
+ ```
101
+
102
+ `Index#map` takes the name of the attribute in your index. By default it will
103
+ map 1:1 with a property of the same name in your model. If the property name
104
+ in your model differs from that in the index, you may specify that with the
105
+ `:with` option, as you see with the `:views` attribute above.
106
+
107
+ Now when Oedipus loads your search results, they will be loaded with `:id`,
108
+ `:user_id` and `:view_count` pre-loaded.
109
+
110
+ #### Complex mappings
111
+
112
+ The attributes in your index may not always be literal copies of the
113
+ properties in your model. If you need to provide an ad-hoc loading mechanism,
114
+ you can pass a lambda as a `:set` option, which specifies how to set the
115
+ value onto the resource. To give a contrived example:
116
+
117
+ ``` ruby
118
+ Oedipus::DataMapper::Index.new(self) do |idx|
119
+ idx.map :x2_views, set: ->(r, v) { r.view_count = v/2 }
120
+ end
121
+ ```
122
+
123
+ For realtime indexes, the `:get` counterpart exists, which specifies how to
124
+ retrieve the value from your resource, for inserting into the index.
125
+
126
+ ``` ruby
127
+ Oedipus::DataMapper::Index.new(self) do |idx|
128
+ idx.map :x2_views, set: ->(r, v) { r.view_count = v/2 }, get: ->(r) { r.view_count * 2 }
129
+ end
130
+ ```
131
+
132
+ ### Fulltext search for resources, via the index
133
+
134
+ The `Index` class provides a `#search` method, which accepts the same
135
+ arguments as the underlying oedipus gem, but returns collections of
136
+ DataMapper resources, instead of Hashes.
137
+
138
+ ``` ruby
139
+ Post.index.search("badgers").each do |post|
140
+ puts "Found post #{post.title}"
141
+ end
142
+ ```
143
+
144
+ #### Filter by attributes
145
+
146
+ As with the main oedipus gem, attribute filters are specified as options, with
147
+ the notable difference that you may use DataMapper's Symbol operators, for
148
+ style/semantic reasons.
149
+
150
+ ``` ruby
151
+ Post.index.search("badgers", :views.gt => 1000).each do |post|
152
+ puts "Found post #{post.title}"
153
+ end
154
+ ```
155
+
156
+ Of course, the non-Symbol operators provided by Oedipus are supported too:
157
+
158
+ ``` ruby
159
+ Post.index.search("badgers", views: Oedipus.gt(1000)).each do |post|
160
+ puts "Found post #{post.title}"
161
+ end
162
+ ```
163
+
164
+ #### Order the results
165
+
166
+ This works as with the main oedipus gem, but you may use DataMapper's notation
167
+ for style/semantic reasons.
168
+
169
+ ``` ruby
170
+ Post.index.search("badgers", order: [:views.desc]).each do |post|
171
+ puts "Found post #{post.title}"
172
+ end
173
+ ```
174
+
175
+ Oedipus' Hash notation is supported too:
176
+
177
+ ``` ruby
178
+ Post.index.search("badgers", order: {views: :desc}).each do |post|
179
+ puts "Found post #{post.title}"
180
+ end
181
+ ```
182
+
183
+ #### Apply limits and offsets
184
+
185
+ This is done just as you would expect.
186
+
187
+ ``` ruby
188
+ Post.index.search("badgers", limit: 30, offset: 60).each do |post|
189
+ puts "Found post #{post.title}"
190
+ end
191
+ ```
192
+
193
+ ### Integration with dm-pager (a.k.a. dm-pagination)
194
+
195
+ Oedipus integrates well with [dm-pager](https://github.com/visionmedia/dm-pagination),
196
+ allowing you to pass a `:pager` option to the `#search` method. Limits and
197
+ offsets will be applied, and the resulting collection will have a `#pager`
198
+ method that you can use.
199
+
200
+ You must have dm-pager loaded for this to work. Oedipus does not directly
201
+ depend on it.
202
+
203
+ ``` ruby
204
+ Post.index.search(
205
+ "badgers",
206
+ pager: {
207
+ page: 7,
208
+ per_page: 30,
209
+ page_param: :page
210
+ }
211
+ )
212
+ ```
213
+
214
+ In the current version it is *not* possible to do something like `search(..).page(2)`,
215
+ or rather, doing so will not do what you expect, as the results have already been
216
+ loaded. This is on my radar, however.
217
+
218
+ ### Faceted Search
219
+
220
+ Oedipus makes faceted searches really easy. Pass in a `:facets` option, as a
221
+ Hash, where each key names the facet and the value lists the arguments, then
222
+ Oedipus provides the results for each facet nested inside the collection.
223
+
224
+ Each facet inherits the base search, which it may override in some way, such as
225
+ filtering by an attribute, or modifying the fulltext query itself.
226
+
227
+ ``` ruby
228
+ posts = Post.index.search(
229
+ "badgers",
230
+ facets: {
231
+ popular: {:views.gte => 1000},
232
+ in_title: "@title (%{query})",
233
+ popular_farming: ["%{query} & farming", {:views.gte => 200}]
234
+ }
235
+ )
236
+
237
+ puts "Found #{posts.total_found} posts about badgers..."
238
+ posts.each do |post|
239
+ puts "Title: #{post.title}"
240
+ end
241
+
242
+ puts "Found #{posts.facets[:popular].total_found} popular posts about badgers"
243
+ posts.facets[:popular].each do |post|
244
+ puts "Title: #{post.title}"
245
+ end
246
+
247
+ puts "Found #{posts.facets[:in_title].total_found} posts with 'badgers' in the title"
248
+ posts.facets[:in_title].each do |post|
249
+ puts "Title: #{post.title}"
250
+ end
251
+
252
+ puts "Found #{posts.facets[:popular_farming].total_found} popular posts about both 'badgers' and 'farming'"
253
+ posts.facets[:popular_farming].each do |post|
254
+ puts "Title: #{post.title}"
255
+ end
256
+ ```
257
+
258
+ The actual arguments to each facet can be either an array (if overriding both
259
+ `query` and `options`), or just the query or the options to override.
260
+
261
+ Oedipus replaces `%{query}` in your facets with whatever the base query was,
262
+ which is useful if you want to amend the search, rather than completely
263
+ overwrite it (which is also possible).
264
+
265
+ #### Performance tip
266
+
267
+ A common use of faceted search is to provide links to the full listing for
268
+ each facet, but not necessarily to display the actual results. If you only
269
+ need the meta data, such as the count, set `:limit => 0` on each facet. The
270
+ result sets for the facets will be empty, but the `#total_found` will still
271
+ be reflected.
272
+
273
+ ``` ruby
274
+ posts = Post.index.search(
275
+ "badgers",
276
+ facets: {
277
+ popular: {:views.gte => 1000, :limit => 0}
278
+ }
279
+ )
280
+
281
+ puts posts.facets[:popular].total_found
282
+ ```
283
+
284
+ ### Performing multiple searches in parallel
285
+
286
+ It is possible to execute multiple searches in a single request, much like
287
+ performing a faceted search, but with the exception that the queries need
288
+ not be related to each other in any way.
289
+
290
+ This is done through `#multi_search`, which accepts a Hash of named searches.
291
+
292
+ ``` ruby
293
+ Post.index.multi_search(
294
+ badgers: "badgers",
295
+ popular_badgers: ["badgers", :views.gte => 1000],
296
+ rabbits: "rabbits"
297
+ ).each do |name, results|
298
+ puts "Results for #{name}..."
299
+ results.each do |post|
300
+ puts "Title: #{post.title}"
301
+ end
302
+ end
303
+ ```
304
+
305
+ The return value is a Hash whose keys match the names of the searches in the
306
+ input Hash. The end result is much like if you had called `#search`
307
+ repeatedly, except that Sphinx has a chance to optimize the common parts in
308
+ the queries, which it will attempt to do.
309
+
310
+ ## Realtime index management
311
+
312
+ Oedipus allows you to keep realtime indexes up-to-date as your models change.
313
+
314
+ The index definition remains the same, but there are some considerations to
315
+ be made.
316
+
317
+ Since realtime indexes are updated whenever something changes on your models,
318
+ you must also list the fulltext fields in the mappings for your index, so that
319
+ they can be saved. Note that the fields are not returned in Sphinx search
320
+ results, however; they will be lazy-loaded if you try to access them in the
321
+ returned collection.
322
+
323
+ ``` ruby
324
+ Oedipus::DataMapper::Index.new(self) do |idx|
325
+ idx.map :title
326
+ idx.map :body
327
+ idx.map :user_id
328
+ idx.map :views, with: :view_count
329
+ end
330
+ ```
331
+
332
+ ### Inserting a resource into the index
333
+
334
+ You can invoke `#insert` on the index, passing in the resource. The resource
335
+ *must* be saved and *must* have a key.
336
+
337
+ ``` ruby
338
+ Post.index.insert(a_post)
339
+ ```
340
+
341
+ In practice, to keep things in sync, you should do this in an `after :create`
342
+ hook on your model.
343
+
344
+ ``` ruby
345
+ class Post
346
+ # ... snip ...
347
+
348
+ after(:create) { model.index.insert(self) }
349
+ end
350
+ ```
351
+
352
+ ### Updating a resource in the index
353
+
354
+ **NOTE** This behaviour is currently broken in SphinxQL... you should use
355
+ `#replace` instead. I have patches in progress for Sphinx itself.
356
+
357
+ Invoke `#update` on the index, passing in the resource. The resource
358
+ *must* be saved and *must* have a key.
359
+
360
+ ``` ruby
361
+ Post.index.update(a_post)
362
+ ```
363
+
364
+ In practice, to keep things in sync, you should do this in an `after :update`
365
+ hook on your model.
366
+
367
+ ``` ruby
368
+ class Post
369
+ # ... snip ...
370
+
371
+ after(:update) { model.index.update(self) }
372
+ end
373
+ ```
374
+
375
+ ### Replacing a resource in the index
376
+
377
+ Replacing a resource is much like updating it, except that it is completely
378
+ overwritten. Although SphinxQL in theory supports updates, it has never
379
+ worked in practice, so you should use this method for now (current Sphinx
380
+ version 2.0.4 at time of writing).
381
+
382
+ ``` ruby
383
+ Post.index.replace(a_post)
384
+ ```
385
+
386
+ In practice, to keep things in sync, you should do this in an `after :update`
387
+ hook on your model.
388
+
389
+ ``` ruby
390
+ class Post
391
+ # ... snip ...
392
+
393
+ after(:update) { model.index.replace(self) }
394
+ end
395
+ ```
396
+
397
+ You can also use this as a convenience, removing the need for both
398
+ `after :create` and `after :update` hooks. Just put it inside a single
399
+ `after :save` hook, which will work in both cases.
400
+
401
+ ``` ruby
402
+ class Post
403
+ # ... snip ...
404
+
405
+ # works for both inserts and updates
406
+ after(:save) { model.index.replace(self) }
407
+ end
408
+ ```
409
+
410
+ ### Deleting a resource from the index
411
+
412
+ You can invoke `#delete` on the index, passing in the resource. The resource
413
+ *must* be saved and *must* have a key.
414
+
415
+ ``` ruby
416
+ Post.index.delete(a_post)
417
+ ```
418
+
419
+ In practice, to keep things in sync, you should do this in a `before :destroy`
420
+ hook on your model. Note the use of `before` instead of `after`, in order to
421
+ avoid returning missing data in your search results.
422
+
423
+ ``` ruby
424
+ class Post
425
+ # ... snip ...
426
+
427
+ before(:destroy) { model.index.delete(self) }
428
+ end
429
+ ```
430
+
431
+ ## Talking directly to Oedipus
432
+
433
+ If you want to bypass DataMapper and just go straight to Oedipus, which returns
434
+ lightweight results using Arrays and Hashes, you can use the `#raw` method on the
435
+ index.
436
+
437
+ See the [oedipus documentation](https://github.com/d11wtq/oedipus) for details of
438
+ how to work with this object.
439
+
440
+ ``` ruby
441
+ require 'pp'
442
+ pp Post.index.raw.search(
443
+ "badgers",
444
+ user_id: Oedipus.not(7),
445
+ order: {views: :desc}
446
+ )
447
+ ```
448
+
449
+ ## Licensing and Copyright
450
+
451
+ Refer to the LICENSE file for details.
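
Reviewer note on the README's faceted search section: the `%{query}` placeholder and the three accepted facet shapes (query only, options only, or `[query, options]`) can be illustrated with a stand-alone sketch. `expand_facet` is a hypothetical helper written for this note, not a method of oedipus or oedipus-dm, and it uses plain option keys instead of DataMapper's Symbol operators so that it runs without either gem.

```ruby
# Hypothetical helper (an assumption for illustration, not the gem's code).
# Normalizes a facet definition into [query, options], inheriting from the
# base search and substituting %{query} with the base query.
def expand_facet(base_query, base_options, facet)
  query, options =
    case facet
    when Array  then [facet[0], facet[1] || {}] # overrides query and options
    when String then [facet, {}]                # overrides the query only
    when Hash   then [nil, facet]               # overrides the options only
    else raise ArgumentError, "Unsupported facet definition: #{facet.inspect}"
    end

  # The block form of gsub inserts the base query literally, so characters
  # such as backslashes in the base query are not treated specially.
  query = query ? query.gsub("%{query}") { base_query } : base_query

  [query, base_options.merge(options)]
end

base = { limit: 30 }
p expand_facet("badgers", base, "@title (%{query})")
p expand_facet("badgers", base, { limit: 0 })
p expand_facet("badgers", base, ["%{query} & farming", { author_id: 7 }])
```

The `{ limit: 0 }` case mirrors the README's performance tip: the facet keeps the base query but overrides the limit, so only meta data would be fetched for it.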
data/Rakefile ADDED
@@ -0,0 +1,17 @@
1
+ require "bundler/gem_tasks"
2
+ require "rspec/core/rake_task"
3
+
4
+ desc "Run the full RSpec suite (requires SEARCHD environment variable)"
5
+ RSpec::Core::RakeTask.new('spec') do |t|
6
+ t.pattern = 'spec/'
7
+ end
8
+
9
+ desc "Run the RSpec unit tests alone"
10
+ RSpec::Core::RakeTask.new('spec:unit') do |t|
11
+ t.pattern = 'spec/unit/'
12
+ end
13
+
14
+ desc "Run the integration tests (requires SEARCHD environment variable)"
15
+ RSpec::Core::RakeTask.new('spec:integration') do |t|
16
+ t.pattern = 'spec/integration/'
17
+ end
@@ -0,0 +1,52 @@
1
+ # encoding: utf-8
2
+
3
+ ##
4
+ # DataMapper Integration for Oedipus.
5
+ # Copyright © 2012 Chris Corbyn.
6
+ #
7
+ # See LICENSE file for details.
8
+ ##
9
+
10
+ module Oedipus
11
+ module DataMapper
12
+ # Adds some additional methods to DataMapper::Collection to provide meta data.
13
+ class Collection < ::DataMapper::Collection
14
+ attr_reader :time
15
+ attr_reader :total_found
16
+ attr_reader :count
17
+ attr_reader :facets
18
+ attr_reader :keywords
19
+ attr_reader :docs
20
+
21
+ # Initialize a new Collection for the given query and records.
22
+ #
23
+ # @param [DataMapper::Query] query
24
+ # a query constructed to search for records with a set of ids
25
+ #
26
+ # @param [Array] records
27
+ # a pre-loaded collection of records used to hydrate models
28
+ #
29
+ # @param [Hash] options
30
+ # additional options specifying meta data about the results
31
+ #
32
+ # @option [Integer] total_found
33
+ # the total number of records found, without limits applied
34
+ #
35
+ # @option [Integer] count
36
+ # the actual number of results
37
+ #
38
+ # @option [Hash] facets
39
+ # any facets that were also found
40
+ def initialize(query, records = nil, options = {})
41
+ super(query, records)
42
+ @time = options[:time]
43
+ @total_found = options[:total_found]
44
+ @count = options[:count]
45
+ @keywords = options[:keywords]
46
+ @docs = options[:docs]
47
+ @facets = options.fetch(:facets, {})
48
+ @pager = options[:pager]
49
+ end
50
+ end
51
+ end
52
+ end
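
Reviewer note on the collection class above: the constructor copies meta data out of the options Hash, and only `:facets` has a default (an empty Hash), while the other values are `nil` when absent. A minimal sketch of that behaviour follows. `BaseCollection` and `SearchCollection` are stand-ins invented for this note so the example runs without dm-core; the real class extends `::DataMapper::Collection`.

```ruby
# Stand-in for ::DataMapper::Collection (an assumption for this sketch).
class BaseCollection
  attr_reader :query, :records

  def initialize(query, records = nil)
    @query   = query
    @records = records || []
  end
end

# Mirrors the shape of the constructor in the diff, using the stand-in base.
class SearchCollection < BaseCollection
  attr_reader :time, :total_found, :count, :facets, :keywords, :docs

  def initialize(query, records = nil, options = {})
    super(query, records)
    @time        = options[:time]
    @total_found = options[:total_found]
    @count       = options[:count]
    @keywords    = options[:keywords]
    @docs        = options[:docs]
    @facets      = options.fetch(:facets, {}) # never nil, safe to index into
  end
end

results = SearchCollection.new(:some_query, [1, 2], total_found: 120, count: 2)
puts results.total_found
puts results.facets.empty?
puts results.time.inspect
```

Defaulting `:facets` to an empty Hash means callers can write `results.facets[:popular]` without a nil check on `facets` itself, although the lookup still returns `nil` for a facet that was never requested.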
@@ -0,0 +1,64 @@
1
+ # encoding: utf-8
2
+
3
+ ##
4
+ # DataMapper Integration for Oedipus.
5
+ # Copyright © 2012 Chris Corbyn.
6
+ #
7
+ # See LICENSE file for details.
8
+ ##
9
+
10
+ module Oedipus
11
+ module DataMapper
12
+ # Methods for converting between DataMapper and Oedipus types
13
+ module Conversions
14
+ # Performs a deep conversion of DataMapper-style operators to Oedipus operators
15
+ def convert_filters(args)
16
+ query, options = connection[name].send(:extract_query_data, args, nil)
17
+ [
18
+ query,
19
+ options.inject({}) { |o, (k, v)|
20
+ case k
21
+ when ::DataMapper::Query::Operator
22
+ case k.operator
23
+ when :not, :lt, :lte, :gt, :gte
24
+ o.merge!(k.target => Oedipus.send(k.operator, v))
25
+ else
26
+ raise ArgumentError, "Unsupported Sphinx filter operator #{k.operator}"
27
+ end
28
+ when :order
29
+ o.merge!(order: convert_order(v))
30
+ when :facets
31
+ o.merge!(facets: convert_facets(v))
32
+ else
33
+ o.merge!(k => v)
34
+ end
35
+ }
36
+ ].compact
37
+ end
38
+
39
+ private
40
+
41
+ def convert_facets(facets)
42
+ Array(facets).inject({}) { |o, (k, v)| o.merge!(k => convert_filters(v)) }
43
+ end
44
+
45
+ def convert_order(order)
46
+ Hash[
47
+ Array(order).map { |k, v|
48
+ case k
49
+ when ::DataMapper::Query::Operator
50
+ case k.operator
51
+ when :asc, :desc
52
+ [k.target, k.operator]
53
+ else
54
+ raise ArgumentError, "Unsupported Sphinx order operator #{k.operator}"
55
+ end
56
+ else
57
+ [k, v || :asc]
58
+ end
59
+ }
60
+ ]
61
+ end
62
+ end
63
+ end
64
+ end
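
Reviewer note on `convert_order` above: it accepts either DataMapper-style operators (`[:views.desc]`) or the Oedipus Hash notation (`{views: :desc}`) and normalizes both to a Hash, defaulting bare names to `:asc`. The sketch below reproduces that logic outside the module so it can run without dm-core. The `Operator` class here is a stand-in written for this note, an assumption that replaces `::DataMapper::Query::Operator`.

```ruby
# Stand-in for ::DataMapper::Query::Operator (an assumption for this sketch).
# It is a plain class on purpose: it does not respond to to_a or to_ary, so
# Array(...) and block destructuring treat it as a single element.
class Operator
  attr_reader :target, :operator

  def initialize(target, operator)
    @target   = target
    @operator = operator
  end
end

def convert_order(order)
  Hash[
    Array(order).map { |k, v|
      case k
      when Operator
        case k.operator
        when :asc, :desc
          [k.target, k.operator]
        else
          raise ArgumentError, "Unsupported Sphinx order operator #{k.operator}"
        end
      else
        [k, v || :asc] # a bare attribute name sorts ascending
      end
    }
  ]
end

p convert_order([Operator.new(:views, :desc)])
p convert_order({ views: :desc })
p convert_order([:title])
```

All three calls produce a Hash keyed by attribute name, which is the form the README describes as Oedipus' own notation.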