xapian_db 1.0 → 1.1

data/CHANGELOG.md CHANGED
@@ -1,3 +1,22 @@
+ ##1.1 (September 7th, 2011)
+
+ Fixes:
+
+ - better handling of the beanstalk-client dependency
+ - recreate the xapian index database if the configured path exists but does not contain a valid xapian index
+ - support for non-integer primary keys (removed unnecessary to_i conversion)
+
+ Features:
+
+ - rails sample app upgraded to 3.1
+ - support for value range queries (strings, dates, numbers)
+ - sorting now works on a global query, too (XapianDb.search...)
+ - global facet queries now have the same options as class-scoped facet queries
+ - support for custom serialization into xapian documents; override the serialization implementation in type_codec.rb or implement your own serialization for specific types (see examples/custom_serialization.rb)
+ - support for reindexing a single object while evaluating an ignore_if block (if present)
+
+ IMPORTANT: YOU MUST REBUILD YOUR XAPIAN INDEX DATABASE SINCE THE INDEX STRUCTURE HAS CHANGED!
+
  ##1.0 (August 17th, 2011)
 
  Features:
data/README.rdoc CHANGED
@@ -1,5 +1,9 @@
1
1
  = XapianDb
2
2
 
3
+ == Important Information
4
+
5
+ If you upgrade from an earlier version of xapian_db to 1.1, you MUST rebuild your entire index (XapianDb.rebuild_xapian_index)!
6
+
3
7
  == What's in the box?
4
8
 
5
9
  XapianDb is a ruby gem that combines features of nosql databases and fulltext indexing into one piece. The result: Rich documents and very fast queries. It is based on {Xapian}[http://xapian.org/], an efficient and powerful indexing library.
@@ -116,6 +120,14 @@ You may add a filter expression to exclude objects from the index. This is handy
116
120
  blueprint.ignore_if {active == false}
117
121
  end
118
122
 
123
+ You can add type information to an attribute. Currently the special types :string, :date and :number are supported (and required for range queries):
124
+
125
+ XapianDb::DocumentBlueprint.setup(Person) do |blueprint|
126
+ blueprint.attribute :age, :as => :number
127
+ blueprint.attribute :date_of_birth, :as => :date
128
+ blueprint.attribute :name, :as => :string
129
+ end
130
+
119
131
  You can override the global adapter configuration in a specific blueprint. Let's say you use ActiveRecord, but you have
120
132
  one more class that is not stored in the database, but you want it to be indexed:
121
133
 
@@ -145,6 +157,10 @@ To rebuild the index for all blueprints, use
145
157
 
146
158
  XapianDb.rebuild_xapian_index
147
159
 
160
+ You can update the index for a single object, too (e.g. to reevaluate an ignore_if block without modifying and saving the object):
161
+
162
+ XapianDb.reindex object
163
+
148
164
  === Query the index
149
165
 
150
166
  A simple query looks like this:
@@ -180,7 +196,26 @@ On class queries you can specifiy order options:
180
196
  results = Person.search "name:Foo", :order => :first_name
181
197
  results = Person.search "Fo*", :order => [:name, :first_name], :sort_decending => true
182
198
 
183
- Please note that the order option is not available for global searches (XapianDb.search...)
199
+ If you define an attribute with a supported type, you can do range searches:
200
+
201
+ XapianDb::DocumentBlueprint.setup(Person) do |blueprint|
202
+ blueprint.attribute :age, :as => :number
203
+ blueprint.attribute :date_of_birth, :as => :date
204
+ blueprint.attribute :name, :as => :string
205
+ end
206
+
207
+ result = XapianDb.search("date_of_birth:2011-01-01..2011-12-31")
208
+ result = XapianDb.search("age:30..40")
209
+ result = XapianDb.search("name:Adam..Chris")
210
+
211
+ Open ranges are supported, too:
212
+
213
+ result = XapianDb.search("age:..40")
214
+ result = XapianDb.search("age:30..")
215
+
216
+ You can combine range query expressions with other expressions:
217
+
218
+ result = XapianDb.search("age:30..40 AND city:Aarau")
184
219
 
185
220
  === Process the results
186
221
 
@@ -216,7 +251,15 @@ Or with kaminari:
216
251
  If you want to implement a simple drilldown for your searches, you can use a global facets query:
217
252
 
218
253
  search_expression = "Foo"
219
- facets = XapianDb.facets(search_expression)
254
+ facets = XapianDb.facets(:name, search_expression)
255
+ facets.each do |name, count|
256
+ puts "#{name}: #{count} hits"
257
+ end
258
+
259
+ If you want the facets based on the indexed class, use the special attribute :indexed_class:
260
+
261
+ search_expression = "Foo"
262
+ facets = XapianDb.facets(:indexed_class, search_expression)
220
263
  facets.each do |klass, count|
221
264
  puts "#{klass.name}: #{count} hits"
222
265
 
@@ -224,7 +267,7 @@ If you want to implement a simple drilldown for your searches, you can use a glo
224
267
  # doc = klass.search search_expression
225
268
  end
226
269
 
227
- A global facet search always groups the results by the class of the indexed objects. There is a class level facet query syntax available, too:
270
+ A class level facet query is possible, too:
228
271
 
229
272
  search_expression = "Foo"
230
273
  facets = Person.facets(:name, search_expression)
@@ -232,7 +275,7 @@ A global facet search always groups the results by the class of the indexed obje
232
275
  puts "#{name}: #{count} hits"
233
276
  end
234
277
 
235
- At the class level, any attribute can be used for a facet query. Use facet queries on attributes that store atomic values like strings, numbers or dates.
278
+ Any attribute declared in a blueprint can be used for a facet query. Use facet queries on attributes that store atomic values like strings, numbers or dates.
236
279
  If you use it on attributes that contain collections (like an array of strings), you might get unexpected results.
237
280
 
238
281
  === Find similar documents
@@ -269,6 +312,12 @@ you can use the auto_indexing_disabled method with a block and rebuild the whole
269
312
  end
270
313
  Person.rebuild_xapian_index
271
314
 
315
+ == Add your own serializers for special objects
316
+
317
+ XapianDb serializes objects to xapian documents using YAML by default. This way, type information is preserved and you get back what you put into a xapian document, not just a string.
318
+
319
+ However, dates need special handling to support date range queries. To support date range queries and allow the addition of other custom data types in the future, XapianDb uses a simple, extensible mechanism to serialize / deserialize your objects. An example of how to extend this mechanism is provided in examples/custom_serialization.rb.
320
+
272
321
  == Production setup
273
322
 
274
323
  Since Xapian allows only one database instance to write to the index, the default setup of XapianDb will not work
data/lib/type_codec.rb ADDED
@@ -0,0 +1,124 @@
+ # encoding: utf-8
+
+ # This class is responsible for encoding and decoding values depending on their
+ # type
+
+ require "bigdecimal"
+
+ module XapianDb
+
+   class TypeCodec
+
+     extend XapianDb::Utilities
+
+     # Get the codec for a type
+     # @param [Symbol, String] type a supported type as a string or symbol.
+     #   The following types are supported:
+     #   - :generic, :string, :date, :number
+     # @return [Class] the codec class for the given type
+     def self.codec_for(type)
+       begin
+         constantize "XapianDb::TypeCodec::#{camelize("#{type}_codec")}"
+       rescue NameError
+         raise ArgumentError.new "no codec defined for type #{type}"
+       end
+     end
+
+     class GenericCodec
+
+       # Encode an object to its yaml representation
+       # @param [Object] object an object to encode
+       # @return [String] the yaml string
+       def self.encode(object)
+         begin
+           if object.respond_to?(:attributes)
+             object.attributes.to_yaml
+           else
+             object.to_yaml
+           end
+         rescue NoMethodError
+           raise ArgumentError.new "#{object} does not support yaml serialization"
+         end
+       end
+
+       # Decode an object from a yaml string
+       # @param [String] yaml_string a yaml string representing the object
+       # @return [Object] the parsed object
+       def self.decode(yaml_string)
+         begin
+           YAML::load yaml_string
+         rescue TypeError
+           raise ArgumentError.new "'#{yaml_string}' cannot be loaded by YAML"
+         end
+       end
+     end
+
+     class StringCodec
+
+       # Encode an object to a string
+       # @param [Object] object an object to encode
+       # @return [String] the string
+       def self.encode(object)
+         object.to_s
+       end
+
+       # Decode a string
+       # @param [String] string a string
+       # @return [String] the string
+       def self.decode(string)
+         string
+       end
+     end
+
+     class DateCodec
+
+       # Encode a date to a string in the format 'yyyymmdd'
+       # @param [Date] date a date object to encode
+       # @return [String] the encoded date
+       def self.encode(date)
+         begin
+           date.strftime "%Y%m%d"
+         rescue NoMethodError
+           raise ArgumentError.new "#{date} was expected to be a date"
+         end
+       end
+
+       # Decode a string to a date
+       # @param [String] date_as_string a string representing a date
+       # @return [Date] the parsed date
+       def self.decode(date_as_string)
+         begin
+           Date.parse date_as_string
+         rescue ArgumentError
+           raise ArgumentError.new "'#{date_as_string}' cannot be converted to a date"
+         end
+       end
+     end
+
+     class NumberCodec
+
+       # Encode a number to a sortable string
+       # @param [Integer, BigDecimal, Float] number a number object to encode
+       # @return [String] the encoded number
+       def self.encode(number)
+         begin
+           Xapian::sortable_serialise number
+         rescue TypeError
+           raise ArgumentError.new "#{number} was expected to be a number"
+         end
+       end
+
+       # Decode a string to a BigDecimal
+       # @param [String] encoded_number a string representing a serialised number
+       # @return [BigDecimal] the decoded number
+       def self.decode(encoded_number)
+         begin
+           BigDecimal.new(Xapian::sortable_unserialise(encoded_number).to_s)
+         rescue TypeError
+           raise ArgumentError.new "#{encoded_number} cannot be unserialized"
+         end
+       end
+     end
+
+   end
+ end
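
The DateCodec above stores dates as 'yyyymmdd' strings. With that format, plain string order matches chronological order, which is presumably what a value range comparison on the stored value relies on. The following is a minimal standalone sketch of that property using only Ruby's standard library; the module name `DateCodecSketch` is hypothetical and not part of xapian_db.

```ruby
require "date"

# Hypothetical stand-in for the DateCodec shown in the diff; not part of xapian_db.
module DateCodecSketch
  # Encode a date as 'yyyymmdd' so that string order equals date order
  def self.encode(date)
    date.strftime "%Y%m%d"
  rescue NoMethodError
    raise ArgumentError, "#{date} was expected to be a date"
  end

  # Parse a 'yyyymmdd' string back into a Date
  def self.decode(date_as_string)
    Date.parse date_as_string
  rescue ArgumentError
    raise ArgumentError, "'#{date_as_string}' cannot be converted to a date"
  end
end

dates   = [Date.new(2011, 9, 7), Date.new(2011, 8, 17), Date.new(2010, 12, 31)]
encoded = dates.map { |date| DateCodecSketch.encode(date) }

# Sorting the encoded strings gives the same order as sorting the dates
sorted_by_string = encoded.sort.map { |string| DateCodecSketch.decode(string) }
```

Here `sorted_by_string` holds the same dates in the same order as `dates.sort`, so a lexicographic range such as "20110101".."20111231" selects exactly the dates of 2011.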
@@ -37,14 +37,9 @@ module XapianDb
37
37
 
38
38
  klass.class_eval do
39
39
 
40
- # add the after save logic
40
+ # add the after commit logic
41
41
  after_commit do
42
- blueprint = XapianDb::DocumentBlueprint.blueprint_for klass
43
- if blueprint.should_index?(self)
44
- XapianDb.index(self)
45
- else
46
- XapianDb.delete_doc_with(self.xapian_id)
47
- end
42
+ XapianDb.reindex(self)
48
43
  end
49
44
 
50
45
  # add the after destroy logic
@@ -33,9 +33,9 @@ module XapianDb
33
33
  order = options.delete :order
34
34
  if order
35
35
  attr_names = [order].flatten
36
- blueprint = XapianDb::DocumentBlueprint.blueprint_for klass
37
- sort_indices = attr_names.map {|attr_name| blueprint.value_index_for(attr_name)}
38
- options[:sort_indices] = attr_names.map {|attr_name| blueprint.value_index_for(attr_name)}
36
+ undefined_attrs = attr_names - XapianDb::DocumentBlueprint.attributes
37
+ raise ArgumentError.new "invalid order clause: attributes #{undefined_attrs.inspect} are not defined" unless undefined_attrs.empty?
38
+ options[:sort_indices] = attr_names.map {|attr_name| XapianDb::DocumentBlueprint.value_number_for(attr_name) }
39
39
  end
40
40
  result = XapianDb.database.search "#{class_scope} and (#{expression})", options
41
41
 
@@ -52,29 +52,9 @@ module XapianDb
52
52
  end
53
53
 
54
54
  # Add a method to search attribute facets of this class
55
- define_singleton_method(:facets) do |attr_name, expression|
56
-
57
- # return an empty hash if no search expression is given
58
- return {} if expression.nil? || expression.strip.empty?
59
-
60
- class_scope = "indexed_class:#{klass.name.downcase}"
61
- blueprint = XapianDb::DocumentBlueprint.blueprint_for klass
62
- value_index = blueprint.value_index_for attr_name.to_sym
63
-
64
- query_parser = QueryParser.new(XapianDb.database)
65
- query = query_parser.parse("#{class_scope} and (#{expression})")
66
- enquiry = Xapian::Enquire.new(XapianDb.database.reader)
67
- enquiry.query = query
68
- enquiry.collapse_key = value_index
69
- facets = {}
70
- enquiry.mset(0, XapianDb.database.size).matches.each do |match|
71
- facet_value = YAML::load match.document.values[value_index].value
72
- # We must add 1 to the collapse_count since collapse_count means
73
- # "how many other matches are there?"
74
- facets[facet_value] = match.collapse_count + 1
75
- end
76
- facets
77
-
55
+ define_singleton_method(:facets) do |attribute, expression|
56
+ class_scope = "indexed_class:#{klass.name.downcase}"
57
+ XapianDb.database.facets attribute, "#{class_scope} and (#{expression})"
78
58
  end
79
59
 
80
60
  end
@@ -67,10 +67,9 @@ module XapianDb
67
67
  if path.to_sym == :memory
68
68
  @_database = XapianDb.create_db
69
69
  else
70
- if File.exist?(path)
70
+ begin
71
71
  @_database = XapianDb.open_db :path => path
72
- else
73
- # Database does not exist; create it
72
+ rescue IOError
74
73
  @_database = XapianDb.create_db :path => path
75
74
  end
76
75
  end
@@ -75,13 +75,9 @@ module XapianDb
75
75
  sort_decending = opts.delete :sort_decending
76
76
 
77
77
  if sort_indices
78
- raise ArgumentError.new("Sorting is available for class scoped searches only") unless expression =~ /^indexed_class:/
79
- sorter = Xapian::MultiValueSorter.new
80
-
81
- sort_indices.each do |index|
82
- sorter.add(index, sort_decending)
83
- end
84
- enquiry.set_sort_by_key_then_relevance(sorter)
78
+ sorter = Xapian::MultiValueKeyMaker.new
79
+ sort_indices.each { |index| sorter.add_value index }
80
+ enquiry.set_sort_by_key_then_relevance(sorter, sort_decending)
85
81
  end
86
82
 
87
83
  opts[:spelling_suggestion] = @query_parser.spelling_suggestion
@@ -123,25 +119,28 @@ module XapianDb
123
119
  Resultset.new(enquiry, :db_size => self.size)
124
120
  end
125
121
 
126
- # A very simple implementation of facets limited to the class facets.
122
+ # A very simple implementation of facets using Xapian collapse key.
123
+ # @param [Symbol, String] attribute the name of an attribute declared in one or more blueprints
127
124
  # @param [String] expression A valid search expression (see {#search} for examples).
128
125
  # @return [Hash<Class, Integer>] A hash containing the classes and the hits per class
129
- def facets(expression)
130
- @query_parser ||= QueryParser.new(self)
131
- query = @query_parser.parse(expression)
132
- enquiry = Xapian::Enquire.new(reader)
126
+ def facets(attribute, expression)
127
+ # return an empty hash if no search expression is given
128
+ return {} if expression.nil? || expression.strip.empty?
129
+ value_number = XapianDb::DocumentBlueprint.value_number_for(attribute)
130
+ query_parser = QueryParser.new(XapianDb.database)
131
+ query = query_parser.parse(expression)
132
+ enquiry = Xapian::Enquire.new(XapianDb.database.reader)
133
133
  enquiry.query = query
134
- enquiry.collapse_key = 0 # Value 0 always contains the class name
134
+ enquiry.collapse_key = value_number
135
135
  facets = {}
136
- enquiry.mset(0, self.size).matches.each do |match|
137
- class_name = match.document.values[0].value
136
+ enquiry.mset(0, XapianDb.database.size).matches.each do |match|
137
+ facet_value = YAML::load match.document.value(value_number)
138
138
  # We must add 1 to the collapse_count since collapse_count means
139
139
  # "how many other matches are there?"
140
- facets[constantize class_name] = match.collapse_count + 1
140
+ facets[facet_value] = match.collapse_count + 1
141
141
  end
142
142
  facets
143
143
  end
144
-
145
144
  end
146
145
 
147
146
  # In Memory database
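
The facets method above collapses matches on one value slot and then adds 1 to each collapse count, because the count covers only the other matches that share the value. A minimal in-memory sketch of that arithmetic follows; no Xapian is involved, and `CollapsedMatch`, `collapse` and `facets_from` are hypothetical names introduced for illustration.

```ruby
# Hypothetical stand-in for a collapsed match: one representative document
# per value, plus the number of other documents sharing that value.
CollapsedMatch = Struct.new(:value, :collapse_count)

# Emulate collapsing on a value: keep one entry per distinct value
def collapse(values)
  values.group_by { |value| value }.map do |value, group|
    CollapsedMatch.new(value, group.size - 1)
  end
end

# Build the facets hash the same way the diff does: collapse_count + 1
def facets_from(matches)
  matches.each_with_object({}) do |match, facets|
    facets[match.value] = match.collapse_count + 1
  end
end

facets = facets_from(collapse(%w(Aarau Bern Aarau Aarau)))
facets.each { |value, count| puts "#{value}: #{count} hits" }
```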
@@ -35,12 +35,17 @@ module XapianDb
35
35
  @blueprints ||= {}
36
36
  blueprint = DocumentBlueprint.new
37
37
  yield blueprint if block_given? # configure the blueprint through the block
38
+ validate_type_consistency_on blueprint
38
39
  # Remove a previously loaded blueprint for this class to avoid stale blueprint definitions
39
- @blueprints.delete_if { |key, blueprint| key.name == klass.name }
40
+ @blueprints.delete_if { |indexed_class, blueprint| indexed_class.name == klass.name }
40
41
  @blueprints[klass] = blueprint
41
42
  @_adapter = blueprint._adapter || XapianDb::Config.adapter || Adapters::GenericAdapter
42
43
  @_adapter.add_class_helper_methods_to klass
43
- @searchable_prefixes = nil # force rebuild of the searchable prefixes
44
+
45
+ @searchable_prefixes = @blueprints.values.map { |blueprint| blueprint.searchable_prefixes }.flatten.compact.uniq || []
46
+ # We can always do a field search on the name of the indexed class
47
+ @searchable_prefixes << "indexed_class"
48
+ @attributes = @blueprints.values.map { |blueprint| blueprint.attribute_names}.flatten.compact.uniq.sort || []
44
49
  end
45
50
 
46
51
  # Get all configured classes
@@ -63,14 +68,54 @@ module XapianDb
63
68
  raise "Blueprint for class #{klass} is not defined"
64
69
  end
65
70
 
71
+ # Get the value number for an attribute. Please note that this is not the index in the values
72
+ # array of a xapian document but the valueno. Therefore, document.values[value_number] returns
73
+ # the wrong data, use document.value(value_number) instead.
74
+ # @param [Symbol, String] attribute The name of an attribute
75
+ # @return [Integer] The value number
76
+ def value_number_for(attribute)
77
+ raise ArgumentError.new "attribute #{attribute} is not configured in any blueprint" if @attributes.nil?
78
+ return 0 if attribute.to_sym == :indexed_class
79
+ position = @attributes.index attribute.to_sym
80
+ if position
81
+ # We add 1 because value slot 0 is reserved for the class name
82
+ return position + 1
83
+ else
84
+ raise ArgumentError.new "attribute #{attribute} is not configured in any blueprint"
85
+ end
86
+ end
87
+
88
+ # Get the type info of an attribute
89
+ # @param [Symbol] attribute The name of an indexed method
90
+ # @return [Symbol, nil] The defined type or nil if no type is defined
91
+ def type_info_for(attribute)
92
+ return nil if @blueprints.nil?
93
+ @blueprints.values.each do |blueprint|
94
+ return blueprint.type_map[attribute] if blueprint.type_map.has_key?(attribute)
95
+ end
96
+ nil
97
+ end
98
+
66
99
  # Return an array of all configured text methods in any blueprint
67
100
  # @return [Array<String>] All searchable prefixes
68
101
  def searchable_prefixes
69
- return [] unless @blueprints
70
- @searchable_prefixes ||= @blueprints.values.map { |blueprint| blueprint.searchable_prefixes }.flatten.compact.uniq
71
- # We can always do a field search on the name of the indexed class
72
- @searchable_prefixes << "indexed_class"
73
- @searchable_prefixes
102
+ @searchable_prefixes || []
103
+ end
104
+
105
+ # Return an array of all defined attributes
106
+ # @return [Array<Symbol>] All defined attributes
107
+ def attributes
108
+ @attributes || []
109
+ end
110
+
111
+ private
112
+
113
+ def validate_type_consistency_on(blueprint)
114
+ blueprint.type_map.each do |method_name, type|
115
+ if type_info_for(method_name) && type_info_for(method_name) != type
116
+ raise ArgumentError.new "ambigous type definition for #{method_name} detected (#{type_info_for(method_name)}, #{type})"
117
+ end
118
+ end
74
119
  end
75
120
 
76
121
  end
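
The hunk above moves value numbers from a per-blueprint index to a global one: the attribute names of all blueprints are merged, de-duplicated and sorted, and slot 0 stays reserved for the class name. A standalone sketch of that numbering scheme follows; the class `ValueNumberRegistry` is hypothetical and only mirrors the idea of `DocumentBlueprint.value_number_for`.

```ruby
# Hypothetical registry; not part of xapian_db.
class ValueNumberRegistry
  # @param attribute_lists one array of attribute names per blueprint
  def initialize(*attribute_lists)
    @attributes = attribute_lists.flatten.compact.uniq.sort
  end

  # Slot 0 is reserved for the class name, so attributes start at 1
  def value_number_for(attribute)
    return 0 if attribute.to_sym == :indexed_class
    position = @attributes.index(attribute.to_sym)
    raise ArgumentError, "attribute #{attribute} is not configured in any blueprint" unless position
    position + 1
  end
end

registry = ValueNumberRegistry.new([:name, :age], [:name, :date_of_birth])
registry.value_number_for(:age)           # slot for :age
registry.value_number_for("name")         # strings are converted to symbols
registry.value_number_for(:indexed_class) # always 0
```

In this sketch, adding an attribute that sorts before existing ones shifts their numbers, so values written under the old numbering would no longer line up.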
@@ -79,6 +124,8 @@ module XapianDb
79
124
  # Instance methods
80
125
  # ---------------------------------------------------------------------------------
81
126
 
127
+ attr_reader :type_map
128
+
82
129
  # Get the names of all configured attributes sorted alphabetically
83
130
  # @return [Array<Symbol>] The names of the attributes
84
131
  def attribute_names
@@ -89,7 +136,7 @@ module XapianDb
89
136
  # @param [Symbol] attribute The name of the attribute
90
137
  # @return [Block] The block
91
138
  def block_for_attribute(attribute)
92
- @attributes_hash[attribute]
139
+ @attributes_hash[attribute][:block]
93
140
  end
94
141
 
95
142
  # Get the names of all configured index methods sorted alphabetically
@@ -105,22 +152,10 @@ module XapianDb
105
152
  @indexed_methods_hash[method]
106
153
  end
107
154
 
108
- # Return the value index of an attribute. Needed to access the value of an attribute
109
- # from a Xapian document.
110
- # @param [String, Symbol] attribute_name The name of the attribute
111
- # @return [Integer] The value index of the attribute
112
- # @raise ArgumentError if the attribute name is unknown
113
- def value_index_for(attribute_name)
114
- index = attribute_names.index attribute_name.to_sym
115
- raise ArgumentError.new("Attribute #{attribute_name} unknown") unless index
116
- # We add 1 because value slot 0 is reserved for the class name
117
- index + 1
118
- end
119
-
120
155
  # Return an array of all configured text methods in this blueprint
121
156
  # @return [Array<String>] All searchable prefixes
122
157
  def searchable_prefixes
123
- @prefixes ||= @indexed_methods_hash.keys
158
+ @searchable_prefixes ||= indexed_method_names
124
159
  end
125
160
 
126
161
  # Should the object go into the index? Evaluates an ignore expression,
@@ -151,10 +186,11 @@ module XapianDb
151
186
 
152
187
  # Add an accessor for each attribute
153
188
  attribute_names.each do |attribute|
154
- index = value_index_for(attribute)
189
+ index = DocumentBlueprint.value_number_for(attribute)
190
+ codec = XapianDb::TypeCodec.codec_for @type_map[attribute]
155
191
  @accessors_module.instance_eval do
156
192
  define_method attribute do
157
- YAML::load(self.values[index].value)
193
+ codec.decode self.value(index)
158
194
  end
159
195
  end
160
196
  end
@@ -174,8 +210,9 @@ module XapianDb
174
210
 
175
211
  # Construct the blueprint
176
212
  def initialize
177
- @attributes_hash = {}
213
+ @attributes_hash = {}
178
214
  @indexed_methods_hash = {}
215
+ @type_map = {}
179
216
  end
180
217
 
181
218
  # Set the adapter
@@ -194,6 +231,7 @@ module XapianDb
194
231
  # @param [Hash] options
195
232
  # @option options [Integer] :weight (1) The weight for this attribute.
196
233
  # @option options [Boolean] :index (true) Should the attribute be indexed?
234
+ # @option options [Symbol] :as should add type info for range queries (:string, :date, :number)
197
235
  # @example For complex attribute configurations you may pass a block:
198
236
  # XapianDb::DocumentBlueprint.setup(IndexedObject) do |blueprint|
199
237
  # blueprint.attribute :complex do
@@ -206,13 +244,15 @@ module XapianDb
206
244
  # end
207
245
  def attribute(name, options={}, &block)
208
246
  raise ArgumentError.new("You cannot use #{name} as an attribute name since it is a reserved method name of Xapian::Document") if reserved_method_name?(name)
209
- opts = {:index => true}.merge(options)
247
+ do_not_index = options.delete(:index) == false
248
+ @type_map[name] = (options.delete(:as) || :generic)
249
+
210
250
  if block_given?
211
- @attributes_hash[name] = block
251
+ @attributes_hash[name] = {:block => block}.merge(options)
212
252
  else
213
- @attributes_hash[name] = nil
253
+ @attributes_hash[name] = options
214
254
  end
215
- self.index(name, opts, &block) if opts[:index]
255
+ self.index(name, options, &block) unless do_not_index
216
256
  end
217
257
 
218
258
  # Add a list of attributes to the blueprint. Attributes will be stored in the xapian documents ans
@@ -221,7 +261,8 @@ module XapianDb
221
261
  def attributes(*attributes)
222
262
  attributes.each do |attr|
223
263
  raise ArgumentError.new("You cannot use #{attr} as an attribute name since it is a reserved method name of Xapian::Document") if reserved_method_name?(attr)
224
- @attributes_hash[attr] = nil
264
+ @attributes_hash[attr] = {}
265
+ @type_map[attr] = :generic
225
266
  self.index attr
226
267
  end
227
268
  end
@@ -249,7 +290,9 @@ module XapianDb
249
290
  when 2
250
291
  # Is it a method name with options?
251
292
  if args.last.is_a? Hash
252
- @indexed_methods_hash[args.first] = IndexOptions.new(args.last.merge(:block => block))
293
+ options = args.last
294
+ assert_valid_keys options, :weight
295
+ @indexed_methods_hash[args.first] = IndexOptions.new(options.merge(:block => block))
253
296
  else
254
297
  add_indexes_from args
255
298
  end
@@ -266,16 +309,16 @@ module XapianDb
266
309
  # Options for an indexed method
267
310
  class IndexOptions
268
311
 
269
- # The weight for the indexed value
270
- attr_accessor :weight, :block
312
+ attr_reader :weight, :block
271
313
 
272
314
  # Constructor
273
315
  # @param [Hash] options
274
316
  # @option options [Integer] :weight (1) The weight for the indexed value
275
- def initialize(options)
317
+ def initialize(options = {})
276
318
  @weight = options[:weight] || 1
277
319
  @block = options[:block]
278
320
  end
321
+
279
322
  end
280
323
 
281
324
  private
@@ -295,4 +338,4 @@ module XapianDb
295
338
 
296
339
  end
297
340
 
298
- end
341
+ end
@@ -3,7 +3,7 @@
3
3
  module XapianDb
4
4
  module IndexWriters
5
5
 
6
- # Worker to update the Xapian index; the worker is used in the beanstalk worker rake task
6
+ # Worker to update the Xapian index; the worker is used in the beanstalk worker script
7
7
  # and uses the DirectWriter to do the real work
8
8
  # @author Gernot Kogler
9
9
  class BeanstalkWorker
@@ -12,7 +12,7 @@ module XapianDb
12
12
 
13
13
  def index_task(options)
14
14
  klass = constantize options[:class]
15
- obj = klass.respond_to?(:get) ? klass.get(options[:id].to_i) : klass.find(options[:id].to_i)
15
+ obj = klass.respond_to?(:get) ? klass.get(options[:id]) : klass.find(options[:id])
16
16
  DirectWriter.index obj
17
17
  end
18
18
 
@@ -43,10 +43,9 @@ module XapianDb
43
43
  value = @obj.send(attribute)
44
44
  end
45
45
 
46
- # If we have an object that responds to attributes (e.g. an Active Record
47
- # or a Datamapper model), we serialize only the attributes
48
- yaml = value.respond_to?(:attributes) ? value.attributes.to_yaml : value.to_yaml
49
- @xapian_doc.add_value(@blueprint.value_index_for(attribute), yaml)
46
+ codec = XapianDb::TypeCodec.codec_for @blueprint.type_map[attribute]
47
+ encoded_string = codec.encode value
48
+ @xapian_doc.add_value DocumentBlueprint.value_number_for(attribute), encoded_string
50
49
  end
51
50
  end
52
51
 
@@ -105,4 +104,4 @@ module XapianDb
105
104
 
106
105
  end
107
106
 
108
- end
107
+ end
@@ -39,7 +39,20 @@ module XapianDb
39
39
 
40
40
  # Add the searchable prefixes to allow searches by field
41
41
  # (like "name:Kogler")
42
- XapianDb::DocumentBlueprint.searchable_prefixes.each{|prefix| parser.add_prefix(prefix.to_s.downcase, "X#{prefix.to_s.upcase}") }
42
+ XapianDb::DocumentBlueprint.searchable_prefixes.each do |prefix|
43
+ parser.add_prefix(prefix.to_s.downcase, "X#{prefix.to_s.upcase}")
44
+ type_info = XapianDb::DocumentBlueprint.type_info_for(prefix)
45
+ next if type_info.nil? || type_info == :generic
46
+ value_number = XapianDb::DocumentBlueprint.value_number_for(prefix)
47
+ case type_info
48
+ when :date
49
+ parser.add_valuerangeprocessor Xapian::DateValueRangeProcessor.new(value_number, "#{prefix}:")
50
+ when :number
51
+ parser.add_valuerangeprocessor Xapian::NumberValueRangeProcessor.new(value_number, "#{prefix}:")
52
+ when :string
53
+ parser.add_valuerangeprocessor Xapian::StringValueRangeProcessor.new(value_number, "#{prefix}:")
54
+ end
55
+ end
43
56
  query = parser.parse_query(expression, @query_flags)
44
57
  @spelling_suggestion = parser.get_corrected_query_string.force_encoding("UTF-8")
45
58
  @spelling_suggestion = nil if @spelling_suggestion.empty?
@@ -48,4 +61,4 @@ module XapianDb
48
61
 
49
62
  end
50
63
 
51
- end
64
+ end
@@ -24,5 +24,11 @@ module XapianDb
24
24
  constant
25
25
  end
26
26
 
27
+ # Taken from Rails
28
+ def assert_valid_keys(hash, *valid_keys)
29
+ unknown_keys = hash.keys - [valid_keys].flatten
30
+ raise(ArgumentError, "Unsupported option(s) detected: #{unknown_keys.join(", ")}") unless unknown_keys.empty?
31
+ end
32
+
27
33
  end
28
34
  end
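
The assert_valid_keys helper above is what lets the index method reject misspelled options instead of silently ignoring them. A self-contained sketch of the same check follows, wrapped in a hypothetical module `OptionCheckSketch` so it runs without xapian_db.

```ruby
# Hypothetical wrapper module; the method body follows the helper in the diff.
module OptionCheckSketch
  def self.assert_valid_keys(hash, *valid_keys)
    unknown_keys = hash.keys - valid_keys.flatten
    unless unknown_keys.empty?
      raise ArgumentError, "Unsupported option(s) detected: #{unknown_keys.join(", ")}"
    end
  end
end

# A known option passes silently
OptionCheckSketch.assert_valid_keys({:weight => 10}, :weight)

# A misspelled option is reported by name
message = begin
  OptionCheckSketch.assert_valid_keys({:wieght => 10}, :weight)
  nil
rescue ArgumentError => error
  error.message
end
```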
data/lib/xapian_db.rb CHANGED
@@ -9,6 +9,22 @@
9
9
  require 'xapian'
10
10
  require 'yaml'
11
11
 
12
+ do_not_require = %w(update_stopwords.rb railtie.rb base_adapter.rb beanstalk_writer.rb utilities.rb install_generator.rb)
13
+ files = Dir.glob("#{File.dirname(__FILE__)}/**/*.rb").reject{|path| do_not_require.include?(File.basename(path))}
14
+ # Require these first
15
+ require "#{File.dirname(__FILE__)}/xapian_db/utilities"
16
+ require "#{File.dirname(__FILE__)}/xapian_db/adapters/base_adapter"
17
+ files.each {|file| require file}
18
+
19
+ # Configure XapianDB if we are in a Rails app
20
+ require File.dirname(__FILE__) + '/xapian_db/railtie' if defined?(Rails)
21
+
22
+ # Try to require the beanstalk writer (depends on beanstalk-client)
23
+ begin
24
+ require File.dirname(__FILE__) + '/xapian_db/index_writers/beanstalk_writer'
25
+ rescue LoadError
26
+ end
27
+
12
28
  module XapianDb
13
29
 
14
30
  # Supported languages
@@ -75,14 +91,21 @@ module XapianDb
75
91
  # See {XapianDb::Database#search} for options
76
92
  # @return [XapianDb::Resultset]
77
93
  def self.search(expression, options={})
94
+ order = options.delete :order
95
+ if order
96
+ attr_names = [order].flatten
97
+ undefined_attrs = attr_names - XapianDb::DocumentBlueprint.attributes
98
+ raise ArgumentError.new "invalid order clause: attributes #{undefined_attrs.inspect} are not defined" unless undefined_attrs.empty?
99
+ options[:sort_indices] = attr_names.map {|attr_name| XapianDb::DocumentBlueprint.value_number_for(attr_name) }
100
+ end
78
101
  XapianDb::Config.database.search(expression, options)
79
102
  end
80
103
 
81
104
  # Get facets from the configured database.
82
105
  # See {XapianDb::Database#facets} for options
83
106
  # @return [Hash<Class, Integer>] A hash containing the classes and the hits per class
84
- def self.facets(expression)
85
- XapianDb::Config.database.facets(expression)
107
+ def self.facets(attribute, expression)
108
+ XapianDb::Config.database.facets attribute, expression
86
109
  end
87
110
 
88
111
  # Update an object in the index
@@ -99,6 +122,17 @@ module XapianDb
99
122
  writer.delete_doc_with xapian_id
100
123
  end
101
124
 
125
+ # Update or delete a xapian document belonging to an object depending on the ignore_if logic (if present)
126
+ # @param [Object] object An instance of a class with a blueprint configuration
127
+ def self.reindex(object)
128
+ blueprint = XapianDb::DocumentBlueprint.blueprint_for object.class
129
+ if blueprint.should_index?(object)
130
+ XapianDb.index object
131
+ else
132
+ XapianDb.delete_doc_with object.xapian_id
133
+ end
134
+ end
135
+
102
136
  # Reindex all objects of a given class
103
137
  # @param [Class] klass The class to reindex
104
138
  # @param [Hash] options Options for reindexing
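
The reindex method above either writes or removes a document, depending on whether the blueprint's ignore_if block applies to the object. The sketch below replays that decision against a plain hash instead of a Xapian database. `ReindexSketch` and `Person` are hypothetical, and evaluating the block with `instance_eval` is an assumption based on the README example `blueprint.ignore_if {active == false}`.

```ruby
# Hypothetical in-memory index; only the index-or-delete decision mirrors the diff.
class ReindexSketch
  attr_reader :index

  def initialize(&ignore_if)
    @ignore_if = ignore_if
    @index = {}
  end

  # Without an ignore_if block every object is indexed
  def should_index?(object)
    return true unless @ignore_if
    !object.instance_eval(&@ignore_if)
  end

  # Add or update the entry, or remove it if the object must be ignored
  def reindex(object)
    if should_index?(object)
      @index[object.id] = object
    else
      @index.delete object.id
    end
  end
end

Person = Struct.new(:id, :name, :active)

sketch = ReindexSketch.new { active == false }
person = Person.new(1, "Kogler", true)

sketch.reindex person   # active, so the person is stored in the index
person.active = false
sketch.reindex person   # inactive, so the entry is removed again
```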
@@ -161,16 +195,3 @@ module XapianDb
161
195
  end
162
196
 
163
197
  end
164
-
165
- do_not_require = %w(update_stopwords.rb railtie.rb base_adapter.rb beanstalk_writer.rb utilities.rb install_generator.rb)
166
- files = Dir.glob("#{File.dirname(__FILE__)}/**/*.rb").reject{|path| do_not_require.include?(File.basename(path))}
167
- # Require these first
168
- require "#{File.dirname(__FILE__)}/xapian_db/utilities"
169
- require "#{File.dirname(__FILE__)}/xapian_db/adapters/base_adapter"
170
- files.each {|file| require file}
171
-
172
- # Configure XapianDB if we are in a Rails app
173
- require File.dirname(__FILE__) + '/xapian_db/railtie' if defined?(Rails)
174
-
175
- # Require the beanstalk writer is beanstalk-client is installed
176
- require File.dirname(__FILE__) + '/xapian_db/index_writers/beanstalk_writer' if Gem.available?('beanstalk-client')
metadata CHANGED
@@ -1,7 +1,7 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: xapian_db
3
3
  version: !ruby/object:Gem::Version
4
- version: '1.0'
4
+ version: '1.1'
5
5
  prerelease:
6
6
  platform: ruby
7
7
  authors:
@@ -9,12 +9,12 @@ authors:
9
9
  autorequire:
10
10
  bindir: bin
11
11
  cert_chain: []
12
- date: 2011-08-17 00:00:00.000000000 +02:00
12
+ date: 2011-09-07 00:00:00.000000000 +02:00
13
13
  default_executable:
14
14
  dependencies:
15
15
  - !ruby/object:Gem::Dependency
16
16
  name: daemons
17
- requirement: &70169658679640 !ruby/object:Gem::Requirement
17
+ requirement: &70318028242380 !ruby/object:Gem::Requirement
18
18
  none: false
19
19
  requirements:
20
20
  - - ! '>='
@@ -22,10 +22,10 @@ dependencies:
22
22
  version: 1.0.10
23
23
  type: :runtime
24
24
  prerelease: false
25
- version_requirements: *70169658679640
25
+ version_requirements: *70318028242380
26
26
  - !ruby/object:Gem::Dependency
27
27
  name: xapian-ruby
28
- requirement: &70169658679180 !ruby/object:Gem::Requirement
28
+ requirement: &70318028241920 !ruby/object:Gem::Requirement
29
29
  none: false
30
30
  requirements:
31
31
  - - ! '>='
@@ -33,10 +33,10 @@ dependencies:
33
33
  version: 1.2.6
34
34
  type: :runtime
35
35
  prerelease: false
36
- version_requirements: *70169658679180
36
+ version_requirements: *70318028241920
37
37
  - !ruby/object:Gem::Dependency
38
38
  name: rspec
39
- requirement: &70169658678720 !ruby/object:Gem::Requirement
39
+ requirement: &70318028241460 !ruby/object:Gem::Requirement
40
40
  none: false
41
41
  requirements:
42
42
  - - ! '>='
@@ -44,10 +44,10 @@ dependencies:
44
44
  version: 2.3.1
45
45
  type: :development
46
46
  prerelease: false
47
- version_requirements: *70169658678720
47
+ version_requirements: *70318028241460
48
48
  - !ruby/object:Gem::Dependency
49
49
  name: simplecov
50
- requirement: &70169658678120 !ruby/object:Gem::Requirement
50
+ requirement: &70318028241000 !ruby/object:Gem::Requirement
51
51
  none: false
52
52
  requirements:
53
53
  - - ! '>='
@@ -55,10 +55,10 @@ dependencies:
55
55
  version: 0.3.7
56
56
  type: :development
57
57
  prerelease: false
58
- version_requirements: *70169658678120
58
+ version_requirements: *70318028241000
59
59
  - !ruby/object:Gem::Dependency
60
60
  name: beanstalk-client
61
- requirement: &70169658677520 !ruby/object:Gem::Requirement
61
+ requirement: &70318028240520 !ruby/object:Gem::Requirement
62
62
  none: false
63
63
  requirements:
64
64
  - - ! '>='
@@ -66,7 +66,7 @@ dependencies:
66
66
  version: 1.1.0
67
67
  type: :development
68
68
  prerelease: false
69
- version_requirements: *70169658677520
69
+ version_requirements: *70318028240520
70
70
  description: XapianDb is a ruby gem that combines features of nosql databases and
71
71
  fulltext indexing. It is based on Xapian, an efficient and powerful indexing library
72
72
  email: gernot.kogler (at) garaio (dot) com
@@ -76,6 +76,7 @@ extra_rdoc_files: []
76
76
  files:
77
77
  - lib/generators/install_generator.rb
78
78
  - lib/generators/templates/beanstalk_worker
79
+ - lib/type_codec.rb
79
80
  - lib/xapian_db/adapters/active_record_adapter.rb
80
81
  - lib/xapian_db/adapters/base_adapter.rb
81
82
  - lib/xapian_db/adapters/datamapper_adapter.rb