oedipus-dm 0.0.1

data/.gitignore ADDED
@@ -0,0 +1,10 @@
1
+ *.gem
2
+ .bundle
3
+ Gemfile.lock
4
+ pkg/*
5
+ spec/data/index/*
6
+ spec/data/binlog/*
7
+ spec/data/searchd.*
8
+ spec/data/sphinx.*
9
+ lib/oedipus/oedipus.so
10
+ tmp/
data/.rspec ADDED
@@ -0,0 +1 @@
1
+ --color
data/Gemfile ADDED
@@ -0,0 +1,4 @@
1
+ source 'https://rubygems.org'
2
+
3
+ # Specify your gem's dependencies in oedipus-dm.gemspec
4
+ gemspec
data/LICENSE ADDED
@@ -0,0 +1,22 @@
1
+ Copyright © 2012 Chris Corbyn.
2
+
3
+ MIT License
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining
6
+ a copy of this software and associated documentation files (the
7
+ "Software"), to deal in the Software without restriction, including
8
+ without limitation the rights to use, copy, modify, merge, publish,
9
+ distribute, sublicense, and/or sell copies of the Software, and to
10
+ permit persons to whom the Software is furnished to do so, subject to
11
+ the following conditions:
12
+
13
+ The above copyright notice and this permission notice shall be
14
+ included in all copies or substantial portions of the Software.
15
+
16
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
17
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
18
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
19
+ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
20
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
21
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
22
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,451 @@
1
+ # Oedipus Sphinx Integration for DataMapper
2
+
3
+ This gem is a work in progress, binding [Oedipus](https://github.com/d11wtq/oedipus)
4
+ with [DataMapper](https://github.com/datamapper/dm-core), in order to support
5
+ the querying and updating of Sphinx indexes through DataMapper models.
6
+
7
+ The gem is not yet published, as it is still in development.
8
+
9
+ ## Usage
10
+
11
+ All features of Oedipus will ultimately be supported, but I'm documenting them
12
+ as I finish wrapping each one.
13
+
14
+ ### Configure Oedipus
15
+
16
+ Oedipus must be configured to connect to a SphinxQL host. The older searchd
17
+ interface is not supported.
18
+
19
+ ``` ruby
20
+ require "oedipus-dm"
21
+
22
+ Oedipus::DataMapper.configure do |config|
23
+ config.host = "localhost"
24
+ config.port = 9306
25
+ end
26
+ ```
27
+
28
+ In Rails, for example, you can do this in an initializer. If you prefer not to
29
+ use a global configuration, it is possible to specify how to connect on a
30
+ per-index basis instead.
31
+
32
+ ### Defining an Index
33
+
34
+ The most basic way to connect a Sphinx index with your model is to define a
35
+ `.index` method on the model itself. Oedipus doesn't directly mix behaviour
36
+ into your models by default, as experience suggests this makes testing in
37
+ isolation more difficult (note that you can easily have a standalone `Index`
38
+ that wraps your model, if you prefer this).
39
+
40
+ For a non-realtime index, something like the following would work fine.
41
+
42
+ ``` ruby
43
+ class Post
44
+ include DataMapper::Resource
45
+
46
+ property :id, Serial
47
+ property :title, String
48
+ property :body, Text
49
+ property :view_count, Integer
50
+
51
+ belongs_to :user
52
+
53
+ def self.index
54
+ @index ||= Oedipus::DataMapper::Index.new(self)
55
+ end
56
+ end
57
+ ```
58
+
59
+ Oedipus will use the `storage_name` of your model as the index name in Sphinx.
60
+ If you need to use a different name, pass the `:name` option to the Index.
61
+
62
+ ``` ruby
63
+ def self.index
64
+ @index ||= Oedipus::DataMapper::Index.new(self, name: :posts_rt)
65
+ end
66
+ ```
67
+
68
+ If you have not globally configured Oedipus, or want to specify different
69
+ connection settings, pass the `:connection` option.
70
+
71
+ ``` ruby
72
+ def self.index
73
+ @index ||= Oedipus::DataMapper::Index.new(
74
+ self,
75
+ connection: Oedipus.connect("localhost:9306")
76
+ )
77
+ end
78
+ ```
79
+
80
+ #### Map fields and attributes with your model
81
+
82
+ By default, the only field that Oedipus will map with your model is the `:id`
83
+ attribute, which it will try to map with the key of your model. This
84
+ configuration will work for most non-realtime indexes, but it is rarely the
85
+ most efficient choice.
86
+
87
+ When Oedipus finds search results, it pulls out all the attributes defined in
88
+ your index, then tries to map them to instances of your model. Mapping `:id`
89
+ alone means that DataMapper will load all of your resources from the database
90
+ when you first try to access any other attribute.
91
+
92
+ Chances are, you have some attributes in your index that can be mapped to your
93
+ model, avoiding the extra database hit. You can add these mappings like so.
94
+
95
+ ``` ruby
96
+ Oedipus::DataMapper::Index.new(self) do |idx|
97
+ idx.map :user_id
98
+ idx.map :views, with: :view_count
99
+ end
100
+ ```
101
+
102
+ `Index#map` takes the name of the attribute in your index. By default it will
103
+ map 1:1 with a property of the same name in your model. If the property name
104
+ in your model differs from that in the index, you may specify that with the
105
+ `:with` option, as you see with the `:views` attribute above.
106
+
107
+ Now when Oedipus loads your search results, they will be loaded with `:id`,
108
+ `:user_id` and `:view_count` pre-loaded.
109
+
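To make the mapping step concrete, here is a minimal pure-Ruby sketch of how a row of index attributes could be translated into model property values. `MAPPINGS` and `to_properties` are hypothetical names used only for illustration; they are not part of the oedipus-dm API, and the gem's real implementation may differ.

```ruby
# Hypothetical mapping table: index attribute => model property.
MAPPINGS = {
  id:      :id,
  user_id: :user_id,
  views:   :view_count # comparable to `idx.map :views, with: :view_count`
}.freeze

# Translate one result row into a Hash of model properties,
# skipping any attribute that has no mapping.
def to_properties(row, mappings = MAPPINGS)
  row.each_with_object({}) do |(attr, value), props|
    property = mappings[attr]
    props[property] = value if property
  end
end

row = { id: 7, user_id: 2, views: 1200, weight: 1 }
p to_properties(row)
```

Attributes without a mapping (`weight` here) are simply dropped, which is why unmapped properties have to be fetched from the database later.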
110
+ #### Complex mappings
111
+
112
+ The attributes in your index may not always be literal copies of the
113
+ properties in your model. If you need to provide an ad-hoc loading mechanism,
114
+ you can pass a lambda as a `:set` option, which specifies how to set the
115
+ value onto the resource. To give a contrived example:
116
+
117
+ ``` ruby
118
+ Oedipus::DataMapper::Index.new(self) do |idx|
119
+ idx.map :x2_views, set: ->(r, v) { r.view_count = v/2 }
120
+ end
121
+ ```
122
+
123
+ For realtime indexes, the `:get` counterpart exists, which specifies how to
124
+ retrieve the value from your resource, for inserting into the index.
125
+
126
+ ``` ruby
127
+ Oedipus::DataMapper::Index.new(self) do |idx|
128
+ idx.map :x2_views, set: ->(r, v) { r.view_count = v/2 }, get: ->(r) { r.view_count * 2 }
129
+ end
130
+ ```
131
+
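The round trip between `:set` and `:get` can be tried in isolation. The sketch below uses a plain `Struct` as a stand-in for a DataMapper resource; `PostStub`, `SET_X2_VIEWS` and `GET_X2_VIEWS` are hypothetical names, not part of the gem.

```ruby
# Stand-in for a DataMapper resource; only the property we need.
PostStub = Struct.new(:view_count)

# Same shape as the :set and :get lambdas in the mapping above.
SET_X2_VIEWS = ->(resource, value) { resource.view_count = value / 2 }
GET_X2_VIEWS = ->(resource) { resource.view_count * 2 }

post = PostStub.new
SET_X2_VIEWS.call(post, 2000)  # loading from the index halves the value
puts post.view_count
puts GET_X2_VIEWS.call(post)   # writing to the index doubles it again
```

Note that `v/2` is integer division when `v` is an Integer, so an odd index value would not survive the round trip exactly.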
132
+ ### Fulltext search for resources, via the index
133
+
134
+ The `Index` class provides a `#search` method, which accepts the same
135
+ arguments as the underlying oedipus gem, but returns collections of
136
+ DataMapper resources, instead of Hashes.
137
+
138
+ ``` ruby
139
+ Post.index.search("badgers").each do |post|
140
+ puts "Found post #{post.title}"
141
+ end
142
+ ```
143
+
144
+ #### Filter by attributes
145
+
146
+ As with the main oedipus gem, attribute filters are specified as options, with
147
+ the notable difference that you may use DataMapper's Symbol operators, for
148
+ style/semantic reasons.
149
+
150
+ ``` ruby
151
+ Post.index.search("badgers", :views.gt => 1000).each do |post|
152
+ puts "Found post #{post.title}"
153
+ end
154
+ ```
155
+
156
+ Of course, the non-Symbol operators provided by Oedipus are supported too:
157
+
158
+ ``` ruby
159
+ Post.index.search("badgers", views: Oedipus.gt(1000)).each do |post|
160
+ puts "Found post #{post.title}"
161
+ end
162
+ ```
163
+
164
+ #### Order the results
165
+
166
+ This works as with the main oedipus gem, but you may use DataMapper's notation
167
+ for style/semantic reasons.
168
+
169
+ ``` ruby
170
+ Post.index.search("badgers", order: [:views.desc]).each do |post|
171
+ puts "Found post #{post.title}"
172
+ end
173
+ ```
174
+
175
+ Oedipus' Hash notation is supported too:
176
+
177
+ ``` ruby
178
+ Post.index.search("badgers", order: {views: :desc}).each do |post|
179
+ puts "Found post #{post.title}"
180
+ end
181
+ ```
182
+
183
+ #### Apply limits and offsets
184
+
185
+ This is done just as you would expect.
186
+
187
+ ``` ruby
188
+ Post.index.search("badgers", limit: 30, offset: 60).each do |post|
189
+ puts "Found post #{post.title}"
190
+ end
191
+ ```
192
+
193
+ ### Integration with dm-pager (a.k.a dm-pagination)
194
+
195
+ Oedipus integrates well with [dm-pager](https://github.com/visionmedia/dm-pagination),
196
+ allowing you to pass a `:pager` option to the `#search` method. Limits and
197
+ offsets will be applied, and the resulting collection will have a `#pager`
198
+ method that you can use.
199
+
200
+ You must have dm-pager loaded for this to work. Oedipus does not directly
201
+ depend on it.
202
+
203
+ ``` ruby
204
+ Post.index.search(
205
+ "badgers",
206
+ pager: {
207
+ page: 7,
208
+ per_page: 30,
209
+ page_param: :page
210
+ }
211
+ )
212
+ ```
213
+
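As a rough illustration of "limits and offsets will be applied", the sketch below shows one conventional way to derive them from pager options. The helper `pager_to_limit_offset` is hypothetical, and the 1-based page arithmetic is an assumption; how oedipus-dm computes these values internally is not shown in this README.

```ruby
# Assumed translation of pager options into limit/offset values,
# treating pages as 1-based.
def pager_to_limit_offset(page:, per_page:)
  raise ArgumentError, "page must be >= 1" if page < 1

  { limit: per_page, offset: (page - 1) * per_page }
end

p pager_to_limit_offset(page: 7, per_page: 30)
```

Under this assumption, page 7 with 30 results per page skips the first 180 results.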
214
+ In the current version it is *not* possible to do something like `search(..).page(2)`,
215
+ or rather, doing so will not do what you expect, as the results have already been
216
+ loaded. This is on my radar, however.
217
+
218
+ ### Faceted Search
219
+
220
+ Oedipus makes faceted searches really easy. Pass in a `:facets` option, as a
221
+ Hash, where each key names the facet and the value lists the arguments, and
222
+ Oedipus provides the results for each facet nested inside the collection.
223
+
224
+ Each facet inherits the base search, which it may override in some way, such as
225
+ filtering by an attribute, or modifying the fulltext query itself.
226
+
227
+ ``` ruby
228
+ posts = Post.index.search(
229
+ "badgers",
230
+ facets: {
231
+ popular: {:views.gte => 1000},
232
+ in_title: "@title (%{query})",
233
+ popular_farming: ["%{query} & farming", {:views.gte => 200}]
234
+ }
235
+ )
236
+
237
+ puts "Found #{posts.total_found} posts about badgers..."
238
+ posts.each do |post|
239
+ puts "Title: #{post.title}"
240
+ end
241
+
242
+ puts "Found #{posts.facets[:popular].total_found} popular posts about badgers"
243
+ posts.facets[:popular].each do |post|
244
+ puts "Title: #{post.title}"
245
+ end
246
+
247
+ puts "Found #{posts.facets[:in_title].total_found} posts with 'badgers' in the title"
248
+ posts.facets[:in_title].each do |post|
249
+ puts "Title: #{post.title}"
250
+ end
251
+
252
+ puts "Found #{posts.facets[:popular_farming].total_found} popular posts about both 'badgers' and 'farming'"
253
+ posts.facets[:popular_farming].each do |post|
254
+ puts "Title: #{post.title}"
255
+ end
256
+ ```
257
+
258
+ The actual arguments to each facet can be either an array (if overriding both
259
+ `query` and `options`), or just the query or the options to override.
260
+
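The three accepted shapes can be pictured as being normalized into a `[query, options]` pair. The helper below, `normalize_facet_args`, is a hypothetical sketch of that idea and not the gem's actual code; a `nil` query here stands for "inherit the base query".

```ruby
# Normalize facet arguments into a [query, options] pair.
# A nil query means "inherit the base query" in this sketch.
def normalize_facet_args(args)
  case args
  when Array  then [args[0], args[1] || {}]
  when String then [args, {}]
  when Hash   then [nil, args]
  else raise ArgumentError, "Unsupported facet arguments: #{args.inspect}"
  end
end

p normalize_facet_args("@title (%{query})")
p normalize_facet_args(limit: 0)
p normalize_facet_args(["%{query} & farming", { limit: 5 }])
```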
261
+ Oedipus replaces `%{query}` in your facets with whatever the base query was,
262
+ which is useful if you want to amend the search, rather than completely
263
+ overwrite it (which is also possible).
264
+
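The `%{query}` placeholder uses the same syntax as Ruby's named format references, so the substitution can be reproduced with `String#%`. Whether Oedipus uses this exact mechanism internally is an assumption; `expand_facet_query` is a hypothetical helper for illustration.

```ruby
# Ruby's format operator fills named %{...} references from a Hash.
def expand_facet_query(template, base_query)
  template % { query: base_query }
end

puts expand_facet_query("@title (%{query})", "badgers")
puts expand_facet_query("%{query} & farming", "badgers")
```

A facet query with no placeholder is returned unchanged, which corresponds to overwriting the base query completely.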
265
+ #### Performance tip
266
+
267
+ A common use of faceted search is to provide links to the full listing for
268
+ each facet, but not necessarily to display the actual results. If you only
269
+ need the meta data, such as the count, set `:limit => 0` on each facet. The
270
+ result sets for the facets will be empty, but the `#total_found` will still
271
+ be reflected.
272
+
273
+ ``` ruby
274
+ posts = Post.index.search(
275
+ "badgers",
276
+ facets: {
277
+ popular: {:views.gte => 1000, :limit => 0}
278
+ }
279
+ )
280
+
281
+ puts posts.facets[:popular].total_found
282
+ ```
283
+
284
+ ### Performing multiple searches in parallel
285
+
286
+ It is possible to execute multiple searches in a single request, much like
287
+ performing a faceted search, but with the exception that the queries need
288
+ not be related to each other in any way.
289
+
290
+ This is done through `#multi_search`, which accepts a Hash of named searches.
291
+
292
+ ``` ruby
293
+ Post.index.multi_search(
294
+ badgers: "badgers",
295
+ popular_badgers: ["badgers", :views.gte => 1000],
296
+ rabbits: "rabbits"
297
+ ).each do |name, results|
298
+ puts "Results for #{name}..."
299
+ results.each do |post|
300
+ puts "Title: #{post.title}"
301
+ end
302
+ end
303
+ ```
304
+
305
+ The return value is a Hash whose keys match the names of the searches in the
306
+ input Hash. The end result is much like if you had called `#search`
307
+ repeatedly, except that Sphinx has a chance to optimize the common parts in
308
+ the queries, which it will attempt to do.
309
+
310
+ ## Realtime index management
311
+
312
+ Oedipus allows you to keep realtime indexes up-to-date as your models change.
313
+
314
+ The index definition remains the same, but there are some considerations to
315
+ be made.
316
+
317
+ Since realtime indexes are updated whenever something changes on your models,
318
+ you must also list the fulltext fields in the mappings for your index, so that
319
+ they can be saved. Note that the fields are not returned in Sphinx search
320
+ results, however; they will be lazy-loaded if you try to access them in the
321
+ returned collection.
322
+
323
+ ``` ruby
324
+ Oedipus::DataMapper::Index.new(self) do |idx|
325
+ idx.map :title
326
+ idx.map :body
327
+ idx.map :user_id
328
+ idx.map :views, with: :view_count
329
+ end
330
+ ```
331
+
332
+ ### Inserting a resource into the index
333
+
334
+ You can invoke `#insert` on the index, passing in the resource. The resource
335
+ *must* be saved and *must* have a key.
336
+
337
+ ``` ruby
338
+ Post.index.insert(a_post)
339
+ ```
340
+
341
+ In practice, to keep things in sync, you should do this in an `after :create`
342
+ hook on your model.
343
+
344
+ ``` ruby
345
+ class Post
346
+ # ... snip ...
347
+
348
+ after(:create) { model.index.insert(self) }
349
+ end
350
+ ```
351
+
352
+ ### Updating a resource in the index
353
+
354
+ **NOTE** This behaviour is currently broken in SphinxQL... you should use
355
+ `#replace` instead. I have patches in progress for Sphinx itself.
356
+
357
+ Invoke `#update` on the index, passing in the resource. The resource
358
+ *must* be saved and *must* have a key.
359
+
360
+ ``` ruby
361
+ Post.index.update(a_post)
362
+ ```
363
+
364
+ In practice, to keep things in sync, you should do this in an `after :update`
365
+ hook on your model.
366
+
367
+ ``` ruby
368
+ class Post
369
+ # ... snip ...
370
+
371
+ after(:update) { model.index.update(self) }
372
+ end
373
+ ```
374
+
375
+ ### Replacing a resource in the index
376
+
377
+ Replacing a resource is much like updating it, except that it is completely
378
+ overwritten. Although SphinxQL in theory supports updates, it has never
379
+ worked in practice, so you should use this method for now (current Sphinx
380
+ version 2.0.4 at time of writing).
381
+
382
+ ``` ruby
383
+ Post.index.replace(a_post)
384
+ ```
385
+
386
+ In practice, to keep things in sync, you should do this in an `after :update`
387
+ hook on your model.
388
+
389
+ ``` ruby
390
+ class Post
391
+ # ... snip ...
392
+
393
+ after(:update) { model.index.replace(self) }
394
+ end
395
+ ```
396
+
397
+ You can also use this as a convenience, removing the need for both
398
+ `after :create` and `after :update` hooks. Just put it inside a single
399
+ `after :save` hook, which will work in both cases.
400
+
401
+ ``` ruby
402
+ class Post
403
+ # ... snip ...
404
+
405
+ # works for both inserts and updates
406
+ after(:save) { model.index.replace(self) }
407
+ end
408
+ ```
409
+
410
+ ### Deleting a resource from the index
411
+
412
+ You can invoke `#delete` on the index, passing in the resource. The resource
413
+ *must* be saved and *must* have a key.
414
+
415
+ ``` ruby
416
+ Post.index.delete(a_post)
417
+ ```
418
+
419
+ In practice, to keep things in sync, you should do this in a `before :destroy`
420
+ hook on your model. Note the use of `before` instead of `after`, in order to
421
+ avoid returning missing data in your search results.
422
+
423
+ ``` ruby
424
+ class Post
425
+ # ... snip ...
426
+
427
+ before(:destroy) { model.index.delete(self) }
428
+ end
429
+ ```
430
+
431
+ ## Talking directly to Oedipus
432
+
433
+ If you want to bypass DataMapper and just go straight to Oedipus, which returns
434
+ lightweight results using Arrays and Hashes, you can use the `#raw` method on the
435
+ index.
436
+
437
+ See the [oedipus documentation](https://github.com/d11wtq/oedipus) for details of
438
+ how to work with this object.
439
+
440
+ ``` ruby
441
+ require 'pp'
442
+ pp Post.index.raw.search(
443
+ "badgers",
444
+ user_id: Oedipus.not(7),
445
+ order: {views: :desc}
446
+ )
447
+ ```
448
+
449
+ ## Licensing and Copyright
450
+
451
+ Refer to the LICENSE file for details.
data/Rakefile ADDED
@@ -0,0 +1,17 @@
1
+ require "bundler/gem_tasks"
2
+ require "rspec/core/rake_task"
3
+
4
+ desc "Run the full RSpec suite (requires SEARCHD environment variable)"
5
+ RSpec::Core::RakeTask.new('spec') do |t|
6
+ t.pattern = 'spec/'
7
+ end
8
+
9
+ desc "Run the RSpec unit tests alone"
10
+ RSpec::Core::RakeTask.new('spec:unit') do |t|
11
+ t.pattern = 'spec/unit/'
12
+ end
13
+
14
+ desc "Run the integration tests (requires SEARCHD environment variable)"
15
+ RSpec::Core::RakeTask.new('spec:integration') do |t|
16
+ t.pattern = 'spec/integration/'
17
+ end
@@ -0,0 +1,52 @@
1
+ # encoding: utf-8
2
+
3
+ ##
4
+ # DataMapper Integration for Oedipus.
5
+ # Copyright © 2012 Chris Corbyn.
6
+ #
7
+ # See LICENSE file for details.
8
+ ##
9
+
10
+ module Oedipus
11
+ module DataMapper
12
+ # Adds some additional methods to DataMapper::Collection to provide meta data.
13
+ class Collection < ::DataMapper::Collection
14
+ attr_reader :time
15
+ attr_reader :total_found
16
+ attr_reader :count
17
+ attr_reader :facets
18
+ attr_reader :keywords
19
+ attr_reader :docs
20
+
21
+ # Initialize a new Collection for the given query and records.
22
+ #
23
+ # @param [DataMapper::Query] query
24
+ # a query constructed to search for records with a set of ids
25
+ #
26
+ # @param [Array] records
27
+ # a pre-loaded collection of records used to hydrate models
28
+ #
29
+ # @param [Hash] options
30
+ # additional options specifying meta data about the results
31
+ #
32
+ # @option [Integer] total_found
33
+ # the total number of records found, without limits applied
34
+ #
35
+ # @option [Integer] count
36
+ # the actual number of results
37
+ #
38
+ # @option [Hash] facets
39
+ # any facets that were also found
40
+ def initialize(query, records = nil, options = {})
41
+ super(query, records)
42
+ @time = options[:time]
43
+ @total_found = options[:total_found]
44
+ @count = options[:count]
45
+ @keywords = options[:keywords]
46
+ @docs = options[:docs]
47
+ @facets = options.fetch(:facets, {})
48
+ @pager = options[:pager]
49
+ end
50
+ end
51
+ end
52
+ end
@@ -0,0 +1,64 @@
1
+ # encoding: utf-8
2
+
3
+ ##
4
+ # DataMapper Integration for Oedipus.
5
+ # Copyright © 2012 Chris Corbyn.
6
+ #
7
+ # See LICENSE file for details.
8
+ ##
9
+
10
+ module Oedipus
11
+ module DataMapper
12
+ # Methods for converting between DataMapper and Oedipus types
13
+ module Conversions
14
+ # Performs a deep conversion of DataMapper-style operators to Oedipus operators
15
+ def convert_filters(args)
16
+ query, options = connection[name].send(:extract_query_data, args, nil)
17
+ [
18
+ query,
19
+ options.inject({}) { |o, (k, v)|
20
+ case k
21
+ when ::DataMapper::Query::Operator
22
+ case k.operator
23
+ when :not, :lt, :lte, :gt, :gte
24
+ o.merge!(k.target => Oedipus.send(k.operator, v))
25
+ else
26
+ raise ArgumentError, "Unsupported Sphinx filter operator #{k.operator}"
27
+ end
28
+ when :order
29
+ o.merge!(order: convert_order(v))
30
+ when :facets
31
+ o.merge!(facets: convert_facets(v))
32
+ else
33
+ o.merge!(k => v)
34
+ end
35
+ }
36
+ ].compact
37
+ end
38
+
39
+ private
40
+
41
+ def convert_facets(facets)
42
+ Array(facets).inject({}) { |o, (k, v)| o.merge!(k => convert_filters(v)) }
43
+ end
44
+
45
+ def convert_order(order)
46
+ Hash[
47
+ Array(order).map { |k, v|
48
+ case k
49
+ when ::DataMapper::Query::Operator
50
+ case k.operator
51
+ when :asc, :desc
52
+ [k.target, k.operator]
53
+ else
54
+ raise ArgumentError, "Unsupported Sphinx order operator #{k.operator}"
55
+ end
56
+ else
57
+ [k, v || :asc]
58
+ end
59
+ }
60
+ ]
61
+ end
62
+ end
63
+ end
64
+ end
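
The order conversion above can be exercised without DataMapper by swapping in a stand-in type. In the sketch below, `OperatorStub` is a hypothetical replacement for `DataMapper::Query::Operator`, and `sketch_convert_order` mirrors the shape of `convert_order` rather than reproducing the gem's code exactly.

```ruby
# Stand-in for DataMapper::Query::Operator (a target plus an operator).
OperatorStub = Struct.new(:target, :operator)

# Accepts an Array of operators, a Hash, or a bare Symbol, and returns
# a Hash of attribute => direction, defaulting to :asc.
def sketch_convert_order(order)
  Array(order).map { |k, v|
    case k
    when OperatorStub
      unless [:asc, :desc].include?(k.operator)
        raise ArgumentError, "Unsupported Sphinx order operator #{k.operator}"
      end
      [k.target, k.operator]
    else
      [k, v || :asc]
    end
  }.to_h
end

p sketch_convert_order([OperatorStub.new(:views, :desc)])
p sketch_convert_order(views: :desc)
p sketch_convert_order(:views)
```

The stub operators are passed inside an Array here, because `Array()` would otherwise unpack a bare `Struct` into its members.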