xapian_db 0.3.3 → 0.3.4

data/CHANGELOG.md CHANGED
@@ -1,3 +1,14 @@
+ ##0.3.4 (December 14th, 2010)
+
+ Features:
+
+ - perform searches on indexed classes to scope the search to objects of a specific class
+ - specify multiple blueprint attributes and index methods in one statement (without specifying options)
+ - use blocks for complex attribute or index specifications
+ - changed the implementation of Resultset.size to get more accurate estimations
+ - changed the indexing of active_record or datamapper models when declared as attributes or indexes
+   in a blueprint (indexes now all attributes of the object instead of using to_s)
+
  ##0.3.3 (December 13th, 2010)

  Features:
@@ -15,6 +26,7 @@ Changes:
  ##0.3.2 (December 10th, 2010)

  Features:
+
  - Moved the per_page option from Resultset.paginate to Database.search
  - Added support for language settings (global and dynamic per object)
  - Added support for xapian stemmers
@@ -30,6 +42,7 @@ Bugfixes:
  ##0.3.0 (December 4th, 2010)

  Features:
+
  - Rails integration with configuration file (config/xapian_db.yml) and automatic setup

  ##0.2.0 (December 1st, 2010)
data/README.rdoc CHANGED
@@ -116,6 +116,27 @@ you can configure the blueprint to use the language of the object when indexing:

  The method must return the iso code for the language (:en, :de, ...) as a symbol or a string. Don't worry if you have languages in your database that are not supported by Xapian. If the language is not supported, XapianDb will fall back to the global language configuration or none, if you haven't configured one.

+ If you want to declare multiple attributes or indexes with default options, you can do this in one statement:
+
+   XapianDb::DocumentBlueprint.setup(Person) do |blueprint|
+     blueprint.attributes :name, :first_name, :profession
+     blueprint.index :notes, :remarks, :cv
+   end
+
+ Note that you cannot add options using this mass declaration syntax (e.g. <code>blueprint.attributes :name, :weight => 10, :first_name</code> is not valid).
+
+ Use blocks for complex evaluations of attributes or indexed values:
+
+   XapianDb::DocumentBlueprint.setup(IndexedObject) do |blueprint|
+     blueprint.attribute :complex do
+       if @id == 1
+         "One"
+       else
+         "Not one"
+       end
+     end
+   end
+
  You can place this configuration anywhere, e.g. in an initializer.

  === Update the index
@@ -126,6 +147,12 @@ use the method <code>rebuild_xapian_index</code>:

    Person.rebuild_xapian_index

+ To get info about the reindex process, use the verbose option:
+
+   Person.rebuild_xapian_index :verbose => true
+
+ In verbose mode, XapianDb will use the progressbar gem if available.
+
  === Query the index

  A simple query looks like this:
@@ -134,12 +161,20 @@ A simple query looks like this:

  You can use wildcards and boolean operators:

-   results = XapianDb.search("Fo*" OR "Baz")
+   results = XapianDb.search("fo* or baz")

  You can query attributes:

    results = XapianDb.search("name:Foo")

+ You can query objects of a specific class:
+
+   results = Person.search("name:Foo")
+
+ If you want to override the default of 10 docs per page, pass the :per_page argument:
+
+   results = Person.search("name:Foo", :per_page => 20)
+
  === Process the results

  <code>XapianDb.search</code> returns a resultset object. You can access the number of hits directly:
@@ -150,17 +185,17 @@ If you use a persistent database, the resultset may contain a spelling correctio

    # Assuming you have at least one document containing "mouse"
    results = XapianDb.search("moose")
-   results.corrected_query # "mouse"
+   results.spelling_suggestion # "mouse"

  To access the found documents, get a page from the resultset:

-   page = result.paginate # Get the first page with 10 documents
-   page = result.paginate(:page => 2, :per_page => 20) # Get the second page page with documents 21-40
+   page = result.paginate # Get the first page
+   page = result.paginate :page => 2 # Get the second page

  Now you can access the documents:

    doc = page.first
-   puts doc.domain_class # Get the type of the indexed object, e.g. "Person"
+   puts doc.indexed_class # Get the type of the indexed object as a string, e.g. "Person"
    puts doc.name # We can access the configured attributes
    person = doc.indexed_object # Access the object behind this doc (lazy loaded)

@@ -16,7 +16,7 @@ module XapianDb
      # in every found xapian document
      # @author Gernot Kogler

-     class ActiveRecordAdapter
+     class ActiveRecordAdapter < BaseAdapter

        class << self

@@ -24,6 +24,9 @@ module XapianDb
          # @param [Class] klass The class to add the helper methods to
          def add_class_helper_methods_to(klass)

+           # Add the helpers from the base class
+           super klass
+
            klass.instance_eval do
              # define the method to retrieve a unique key
              define_method(:xapian_id) do
@@ -0,0 +1,31 @@
+ # encoding: utf-8
+
+ module XapianDb
+   module Adapters
+
+     # base class for all adapters.
+     # This adapter does the following:
+     # - adds the class method <code>search(expression)</code> to an indexed class
+     # @author Gernot Kogler
+
+     class BaseAdapter
+
+       class << self
+
+         # Implement the class helper methods
+         # @param [Class] klass The class to add the helper methods to
+         def add_class_helper_methods_to(klass)
+
+           klass.class_eval do
+
+             # Add a method to search models of this class
+             define_singleton_method(:search) do |expression|
+               XapianDb.database.search "indexed_class:#{klass.name.downcase} and (#{expression})"
+             end
+
+           end
+         end
+       end
+     end
+   end
+ end
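
The scoping done by the new base adapter can be tried in isolation. The following is a minimal sketch, not the gem's code: `FakeDatabase` and `add_search_to` are hypothetical stand-ins that only record the query string, so no Xapian installation is assumed.

```ruby
# Hypothetical stand-in for XapianDb.database; it only records the
# expression it receives instead of querying a Xapian index.
class FakeDatabase
  attr_reader :last_expression

  def search(expression)
    @last_expression = expression
  end
end

# Hypothetical helper mirroring the idea of add_class_helper_methods_to:
# define a class-level search method that prefixes the caller's expression
# with a field search on the downcased class name.
def add_search_to(klass, database)
  klass.define_singleton_method(:search) do |expression|
    database.search "indexed_class:#{klass.name.downcase} and (#{expression})"
  end
end

class Person; end

db = FakeDatabase.new
add_search_to(Person, db)
scoped_query = Person.search("name:Foo")
puts scoped_query
```

Wrapping the caller's expression in parentheses keeps boolean operators inside it from binding to the class filter.
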
@@ -15,7 +15,7 @@ module XapianDb
      # - adds the instance method <code>indexed_object</code> to the module that will be included
      # in every found xapian document
      # @author Gernot Kogler
-     class DatamapperAdapter
+     class DatamapperAdapter < BaseAdapter

        class << self

@@ -23,6 +23,9 @@ module XapianDb
          # @param [Class] klass The class to add the helper methods to
          def add_class_helper_methods_to(klass)

+           # Add the helpers from the base class
+           super klass
+
            klass.instance_eval do
              # define the method to retrieve a unique key
              define_method(:xapian_id) do
@@ -10,7 +10,7 @@ module XapianDb
      # This adapter does the following:
      # - adds the instance method <code>xapian_id</code> to an indexed class
      # @author Gernot Kogler
-     class GenericAdapter
+     class GenericAdapter < BaseAdapter

        class << self

@@ -27,6 +27,10 @@ module XapianDb
          # @param [Class] klass The class to add the helper methods to
          def add_class_helper_methods_to(klass)
            raise "Unique key is not configured for generic adapter!" if @unique_key_block.nil?
+
+           # Add the helpers from the base class
+           super klass
+
            expression = @unique_key_block
            klass.instance_eval do
              define_method(:xapian_id) do
@@ -59,7 +59,8 @@ module XapianDb
        query = @query_parser.parse(expression)
        enquiry = Xapian::Enquire.new(reader)
        enquiry.query = query
-       opts[:corrected_query] = @query_parser.corrected_query
+       opts[:spelling_suggestion] = @query_parser.spelling_suggestion
+       opts[:db_size] = self.size
        Resultset.new(enquiry, opts)
      end

@@ -13,6 +13,15 @@ module XapianDb
    #     blueprint.attribute :first_name
    #     blueprint.index :remarks
    #   end
+   # @example A document blueprint configuration with a complex attribute for the class Person
+   #   XapianDb::DocumentBlueprint.setup(Person) do |blueprint|
+   #     # Our Person class has a method lang_cd. We use this method to
+   #     # index each person with its language
+   #     blueprint.language_method :lang_cd
+   #     blueprint.attribute :complex, :weight => 10 do
+   #       # add some logic here to evaluate the value of 'complex'
+   #     end
+   #   end
    # @author Gernot Kogler
    class DocumentBlueprint

@@ -53,6 +62,8 @@ module XapianDb
            prefixes << blueprint.searchable_prefixes
          end
          @searchable_prefixes = prefixes.flatten.compact.uniq
+         # We can always do a field search on the name of the indexed class
+         @searchable_prefixes << "indexed_class"
        end

      end
@@ -64,7 +75,7 @@ module XapianDb
      # Return an array of all configured text methods in this blueprint
      # @return [Array<String>] All searchable prefixes
      def searchable_prefixes
-       @prefixes ||= indexed_methods.keys
+       @prefixes ||= indexed_methods_hash.keys
      end

      # Lazily build and return a module that implements accessors for each field
@@ -75,12 +86,12 @@ module XapianDb

        # Add the accessor for the indexed class
        @accessors_module.instance_eval do
-         define_method :domain_class do
+         define_method :indexed_class do
            self.values[0].value
          end
        end

-       @attributes.each_with_index do |field, index|
+       @attributes_hash.keys.each_with_index do |field, index|
          @accessors_module.instance_eval do
            define_method field do
              YAML::load(self.values[index+1].value)
@@ -103,12 +114,12 @@ module XapianDb

      # Collection of the configured attribute methods
      # @return [Array<Symbol>] The names of the configured attribute methods
-     attr_reader :attributes
+     attr_reader :attributes_hash

      # Collection of the configured index methods
      # @return [Hash<Symbol, IndexOptions>] A hashtable containing all index methods as
      # keys and IndexOptions as values
-     attr_reader :indexed_methods
+     attr_reader :indexed_methods_hash

      # Set / read a custom adapter.
      # Use this configuration option if you need a specific adapter for an indexed class.
@@ -117,8 +128,8 @@ module XapianDb

      # Construct the blueprint
      def initialize
-       @attributes = []
-       @indexed_methods = {}
+       @attributes_hash = {}
+       @indexed_methods_hash = {}
      end

      # Set the name of the method to get the language for an indexed object
@@ -134,32 +145,91 @@ module XapianDb
      # @param [Hash] options
      # @option options [Integer] :weight (1) The weight for this attribute.
      # @option options [Boolean] :index (true) Should the attribute be indexed?
-     # @todo Make sure the name does not collide with a method name of Xapian::Document since
-     def attribute(name, options={})
+     # @example For complex attribute configurations you may pass a block:
+     #   XapianDb::DocumentBlueprint.setup(IndexedObject) do |blueprint|
+     #     blueprint.attribute :complex do
+     #       if @id == 1
+     #         "One"
+     #       else
+     #         "Not one"
+     #       end
+     #     end
+     #   end
+     # @todo Make sure the name does not collide with a method name of Xapian::Document
+     def attribute(name, options={}, &block)
        opts = {:index => true}.merge(options)
-       @attributes << name
-       self.index(name, opts) if opts[:index]
+       if block_given?
+         @attributes_hash[name] = block
+       else
+         @attributes_hash[name] = nil
+       end
+       self.index(name, opts, &block) if opts[:index]
+     end
+
+     # Add list of attributes to the blueprint. Attributes will be stored in the xapian documents an can be
+     # accessed from a search result.
+     # @param [Array] attributes An array of method names that deliver the values for the attributes
+     # @todo Make sure the name does not collide with a method name of Xapian::Document
+     def attributes(*attributes)
+       attributes.each do |attr|
+         @attributes_hash[attr] = nil
+         self.index attr
+       end
      end

      # Add an indexed value to the blueprint. Indexed values are not accessible from a search result.
-     # @param [String] name The name of the method that delivers the value for the index
-     # @param [Hash] options
-     # @option options [Integer] :weight (1) The weight for this indexed value
-     def index(name, options={})
-       @indexed_methods[name] = IndexOptions.new(options)
+     # @param [Array] args An array of arguments; you can pass a method name, an array of method names
+     #   or a method name and an options hash.
+     # @param [Block] &block An optional block for complex configurations
+     # Avaliable options:
+     # - :weight (default: 1) The weight for this indexed value
+     # @example Simple index declaration
+     #   blueprint.index :name
+     # @example Index declaration with options
+     #   blueprint.index :name, :weight => 10
+     # @example Mass index declaration
+     #   blueprint.index :name, :first_name, :profession
+     # @example Index declaration with a block
+     #   blueprint.index :complex, :weight => 10 do
+     #     # add some logic here to calculate the value for 'complex'
+     #   end
+     def index(*args, &block)
+       case args.size
+         when 1
+           @indexed_methods_hash[args.first] = IndexOptions.new(:weight => 1, :block => block)
+         when 2
+           # Is it a method name with options?
+           if args.last.is_a? Hash
+             @indexed_methods_hash[args.first] = IndexOptions.new(args.last.merge(:block => block))
+           else
+             add_indexes_from args
+           end
+         else # multiple arguments
+           add_indexes_from args
+       end
      end

      # Options for an indexed method
      class IndexOptions

        # The weight for the indexed value
-       attr_accessor :weight
+       attr_accessor :weight, :block

        # Constructor
        # @param [Hash] options
        # @option options [Integer] :weight (1) The weight for the indexed value
        def initialize(options)
          @weight = options[:weight] || 1
+         @block = options[:block]
+       end
+     end
+
+     private
+
+     # Add index configurations from an array
+     def add_indexes_from(array)
+       array.each do |arg|
+         @indexed_methods_hash[arg] = IndexOptions.new(:weight => 1)
        end
      end
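
The argument dispatch of the new `index(*args, &block)` can be exercised without the gem. The following is a simplified sketch under stated assumptions: `MiniBlueprint` and the `IndexOptions` struct are hypothetical stand-ins for the real classes, and only the `:weight` option is modeled.

```ruby
# Hypothetical, simplified stand-in for DocumentBlueprint::IndexOptions.
IndexOptions = Struct.new(:weight, :block)

# Hypothetical miniature blueprint that mirrors the three call shapes
# handled by index(*args, &block) in the diff above.
class MiniBlueprint
  attr_reader :indexed_methods_hash

  def initialize
    @indexed_methods_hash = {}
  end

  def index(*args, &block)
    if args.size == 2 && args.last.is_a?(Hash)
      # A method name followed by an options hash
      @indexed_methods_hash[args.first] = IndexOptions.new(args.last[:weight] || 1, block)
    elsif args.size == 1
      # A single method name, optionally with a block
      @indexed_methods_hash[args.first] = IndexOptions.new(1, block)
    else
      # Mass declaration: every argument is a method name with default options
      args.each { |name| @indexed_methods_hash[name] = IndexOptions.new(1, nil) }
    end
  end
end

blueprint = MiniBlueprint.new
blueprint.index :name, :weight => 10
blueprint.index :notes, :remarks, :cv
blueprint.index(:complex) { "computed value" }
```

A trailing options hash is only recognized when exactly two arguments are given, which matches the README note that options cannot be combined with the mass declaration syntax.
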
@@ -36,8 +36,12 @@ module XapianDb
        @xapian_doc.add_value(0, @obj.class.name)

        pos = 1
-       @blueprint.attributes.each do |attribute, options|
-         value = @obj.send(attribute)
+       @blueprint.attributes_hash.each do |attribute, block|
+         if block
+           value = @obj.instance_eval(&block)
+         else
+           value = @obj.send(attribute)
+         end
          @xapian_doc.add_value(pos, value.to_yaml)
          pos += 1
        end
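
The choice between `instance_eval` and `send` shown in this hunk can be illustrated with plain Ruby. This is a minimal sketch, assuming a hypothetical `IndexedObject` class and a hypothetical helper `attribute_values_for`; it does not touch Xapian.

```ruby
# Hypothetical indexed class used only for this illustration.
class IndexedObject
  attr_reader :id, :name

  def initialize(id, name)
    @id = id
    @name = name
  end
end

# Attribute name => block, or nil when the value comes from a method of the
# same name. This has the same shape as attributes_hash in the diff.
attributes_hash = {
  :name    => nil,
  :complex => proc { @id == 1 ? "One" : "Not one" }
}

# Resolve every attribute for one object. instance_eval runs the block with
# self set to the object, so @id refers to the object's instance variable.
def attribute_values_for(obj, attributes_hash)
  attributes_hash.map do |attribute, block|
    block ? obj.instance_eval(&block) : obj.send(attribute)
  end
end

values = attribute_values_for(IndexedObject.new(1, "Kogler"), attributes_hash)
other_values = attribute_values_for(IndexedObject.new(2, "Foo"), attributes_hash)
```

Because the block is evaluated in the context of the indexed object, it can read instance variables and private methods that `send` with a plain attribute name would not expose as a single value.
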
@@ -52,20 +56,26 @@ module XapianDb
        if @stemmer
          term_generator.stemmer = @stemmer
          term_generator.stopper = @stopper unless @stopper.nil?
-         # Enable the creation of a spelling index if the database is not in memory
-         if @database.is_a? XapianDb::PersistentDatabase
-           term_generator.set_flags Xapian::TermGenerator::FLAG_SPELLING if @database.is_a? XapianDb::PersistentDatabase
-         end
+         # Enable the creation of a spelling dictionary if the database is not in memory
+         term_generator.set_flags Xapian::TermGenerator::FLAG_SPELLING if @database.is_a? XapianDb::PersistentDatabase
        end

-       # Always index the class and the primary key
-       @xapian_doc.add_term("C#{@obj.class}")
+       # Index the primary key as a unique term
        @xapian_doc.add_term("Q#{@obj.xapian_id}")

-       @blueprint.indexed_methods.each do |method, options|
-         value = @obj.send(method)
-         unless value.nil?
-           values = value.is_a?(Array) ? value : [value]
+       # Index the class with the field name
+       term_generator.index_text("#{@obj.class}".downcase, 1, "XINDEXED_CLASS")
+       @xapian_doc.add_term("C#{@obj.class}")
+
+
+       @blueprint.indexed_methods_hash.each do |method, options|
+         if options.block
+           obj = @obj.instance_eval(&options.block)
+         else
+           obj = @obj.send(method)
+         end
+         unless obj.nil?
+           values = get_values_to_index_from obj
            values.each do |value|
              # Add value with field name
              term_generator.index_text(value.to_s.downcase, options.weight, "X#{method.upcase}")
@@ -96,6 +106,20 @@ module XapianDb

      end

+     # Get the values to index from an object
+     def get_values_to_index_from(obj)
+
+       # if it's an array, that's fine
+       return obj if obj.is_a? Array
+
+       # if the object responds to attributes and attributes is a hash,
+       # we use the attributes values (works well for active_record and datamapper objects)
+       return obj.attributes.values if obj.respond_to?(:attributes) && obj.attributes.is_a?(Hash)
+
+       # The object is unkown and will be indexed by its to_s method
+       return [obj]
+     end
+
    end

  end
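
The three cases handled by `get_values_to_index_from` can be checked with a small stand-alone sketch. `FakeModel` and `values_to_index_from` are hypothetical names; `FakeModel` merely imitates an object that exposes an `attributes` hash, as ActiveRecord and DataMapper models do.

```ruby
# Hypothetical model exposing an attributes hash.
class FakeModel
  def attributes
    { "street" => "Main Street", "city" => "Bern" }
  end
end

# Same decision order as get_values_to_index_from in the diff above:
# arrays pass through, attribute hashes contribute their values,
# anything else is wrapped so it can be indexed via to_s.
def values_to_index_from(obj)
  return obj if obj.is_a?(Array)
  return obj.attributes.values if obj.respond_to?(:attributes) && obj.attributes.is_a?(Hash)
  [obj]
end

from_array = values_to_index_from(["a", "b"])
from_model = values_to_index_from(FakeModel.new)
from_other = values_to_index_from(42)
```
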
@@ -8,7 +8,7 @@ module XapianDb

      # The spelling corrected query (if a language is configured)
      # @return [String]
-     attr_reader :corrected_query
+     attr_reader :spelling_suggestion

      # Constructor
      # @param [XapianDb::Database] database The database to query
@@ -39,7 +39,7 @@ module XapianDb
        # (like "name:Kogler")
        XapianDb::DocumentBlueprint.searchable_prefixes.each{|prefix| parser.add_prefix(prefix.to_s.downcase, "X#{prefix.to_s.upcase}") }
        query = parser.parse_query(expression, @query_flags)
-       @corrected_query = parser.get_corrected_query_string
+       @spelling_suggestion = parser.get_corrected_query_string
        query
      end

@@ -17,19 +17,18 @@ module XapianDb

      # The spelling corrected query (if a language is configured)
      # @return [String]
-     attr_reader :corrected_query
+     attr_reader :spelling_suggestion

      # Constructor
      # @param [Xapian::Enquire] enquiry a Xapian query result (see http://xapian.org/docs/apidoc/html/classXapian_1_1Enquire.html)
      # @param [Hash] options
      # @option options [Integer] :per_page (10) How many docs per page?
-     # @option options [String] :corrected_query (nil) The spelling corrected query (if a language is configured)
+     # @option options [String] :spelling_suggestion (nil) The spelling corrected query (if a language is configured)
      def initialize(enquiry, options)
        @enquiry = enquiry
-       # By passing 0 as the max parameter to the mset method,
-       # we only get statistics about the query, no results
-       @size = enquiry.mset(0, 0).matches_estimated
-       @corrected_query = options[:corrected_query]
+       # To get more accurate results, we pass the doc count to the mset method
+       @size = enquiry.mset(0, options[:db_size]).matches_estimated
+       @spelling_suggestion = options[:spelling_suggestion]
        @per_page = options[:per_page]
      end

data/lib/xapian_db.rb CHANGED
@@ -79,8 +79,10 @@ module XapianDb

  end

- do_not_require = %w(update_stopwords.rb railtie.rb)
+ do_not_require = %w(update_stopwords.rb railtie.rb base_adapter.rb)
  files = Dir.glob("#{File.dirname(__FILE__)}/**/*.rb").reject{|path| do_not_require.include?(File.basename(path))}
+ # Require the base adapter first
+ require "#{File.dirname(__FILE__)}/xapian_db/adapters/base_adapter"
  files.each {|file| require file}

  # Configure XapianDB if we are in a Rails app
metadata CHANGED
@@ -5,8 +5,8 @@ version: !ruby/object:Gem::Version
    segments:
    - 0
    - 3
-   - 3
-   version: 0.3.3
+   - 4
+   version: 0.3.4
  platform: ruby
  authors:
  - Gernot Kogler
@@ -57,6 +57,7 @@ extra_rdoc_files: []

  files:
  - lib/xapian_db/adapters/active_record_adapter.rb
+ - lib/xapian_db/adapters/base_adapter.rb
  - lib/xapian_db/adapters/datamapper_adapter.rb
  - lib/xapian_db/adapters/generic_adapter.rb
  - lib/xapian_db/config.rb