acts_as_xapian 0.1.1

data/.document ADDED
@@ -0,0 +1,5 @@
1
+ README.rdoc
2
+ lib/**/*.rb
3
+ bin/*
4
+ features/**/*.feature
5
+ LICENSE
data/.gitignore ADDED
@@ -0,0 +1,21 @@
1
+ ## MAC OS
2
+ .DS_Store
3
+
4
+ ## TEXTMATE
5
+ *.tmproj
6
+ tmtags
7
+
8
+ ## EMACS
9
+ *~
10
+ \#*
11
+ .\#*
12
+
13
+ ## VIM
14
+ *.swp
15
+
16
+ ## PROJECT::GENERAL
17
+ coverage
18
+ rdoc
19
+ pkg
20
+
21
+ ## PROJECT::SPECIFIC
data/LICENSE ADDED
@@ -0,0 +1,20 @@
1
+ Copyright (c) 2009 Mike Nelson
2
+
3
+ Permission is hereby granted, free of charge, to any person obtaining
4
+ a copy of this software and associated documentation files (the
5
+ "Software"), to deal in the Software without restriction, including
6
+ without limitation the rights to use, copy, modify, merge, publish,
7
+ distribute, sublicense, and/or sell copies of the Software, and to
8
+ permit persons to whom the Software is furnished to do so, subject to
9
+ the following conditions:
10
+
11
+ The above copyright notice and this permission notice shall be
12
+ included in all copies or substantial portions of the Software.
13
+
14
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
15
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
16
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
17
+ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
18
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
19
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
20
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/README.rdoc ADDED
@@ -0,0 +1,148 @@
1
+ =acts_as_xapian_gem / acts_as_xapian
2
+
3
+ == Introduction
4
+
5
+ Xapian[http://www.xapian.org] is a full text search engine library which has Ruby bindings. acts_as_xapian adds support for it to Rails. It is an alternative to acts_as_solr, acts_as_ferret, Ultrasphinx, acts_as_indexed, acts_as_searchable or acts_as_tsearch.
6
+
7
+ acts_as_xapian is deployed in production on these websites.
8
+ * WhatDoTheyKnow[http://www.whatdotheyknow.com]
9
+ * MindBites[http://www.mindbites.com]
10
+
11
+ == A Quick Note
12
+
13
+ This gem was created directly from the acts_as_xapian plugin. There were very few changes, the majority of which were to make the gem handle installation better. If you'd like more information about the original plugin go here[http://www.github.com/frabcus/acts_as_xapian/] or if I've left something crucial out, send me a message via github.
14
+
15
+ == Installation
16
+
17
+ Install Xapian with the Ruby bindings on your machine. For OS X users, I'd recommend using Homebrew[http://github.com/mxcl/homebrew].
18
+
19
+ Then install the gem
20
+ sudo gem install acts_as_xapian
21
+
22
+ Navigate to your project and generate the required files
23
+ script/generate acts_as_xapian
24
+
25
+ Migrate your database
26
+ rake db:migrate
27
+
28
+ == Usage
29
+
30
+ Xapian is an offline indexing search library - only one process can have the Xapian database open for writing at once, and others that try meanwhile are unceremoniously kicked out. For this reason, acts_as_xapian does not support immediate writing to the database when your models change.
31
+
32
+ Instead, there is an ActsAsXapianJob model which records which models need updating or deleting in the search index. A rake task 'xapian:update_index' then applies the changes queued since it was last run. You can run it from a cron job, or similar.
33
+
34
+ Here's how to add indexing to your Rails app:
35
+
36
+ Put acts_as_xapian in your models that need search indexing. e.g.
37
+
38
+ acts_as_xapian :texts => [:name, :short_name],
39
+ :values => [[ :created_at, 0, "created_at", :date ]],
40
+ :terms => [[ :variety, 'V', "variety" ]]
41
+
42
+ Options must include:
43
+
44
+ * :texts, an array of fields for indexing with full text search.
45
+ e.g. :texts => [ :title, :body ]
46
+
47
+ * :values, things which have a range of values for sorting or collapsing. Specify an array quadruple of [ field, identifier, prefix, type ] where _identifier_ is an arbitrary numeric identifier for use in the Xapian database, _prefix_ is the part to use in search queries that goes before the : , and _type_ can be any of :string, :number or :date.
48
+ e.g. :values => [[ :created_at, 0, "created_at", :date ], [ :size, 1, "size", :string ]]
49
+
50
+ * :terms, things which come with a prefix (before a ':') in search queries. Specify an array triple of [ field, char, prefix ] where _char_ is an arbitrary single uppercase character used in the Xapian database (pick any, but use a different one for each prefix), and _prefix_ is the part to use in search queries that goes before the : . For example, if you were building Google and wanted to support a query like "site:www.whatdotheyknow.com", then the prefix would be "site".
51
+ e.g. :terms => [ [ :variety, 'V', "variety" ] ]
52
+
53
+ A 'field' is a symbol referring to either an attribute or a function which returns the text, date or number to index. Both 'identifier' and 'char' must be the same for the same prefix in different models.
54
+
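The rules above for :terms and :values (a single uppercase character per term, a distinct character or identifier per prefix, a type of :string, :number or :date) can be checked mechanically. What follows is a minimal sketch, not part of acts_as_xapian: the name check_xapian_options is hypothetical, and it only inspects one options hash rather than comparing prefixes across models.

```ruby
# Hypothetical helper (not provided by acts_as_xapian): returns a list of
# human-readable problems found in an options hash, or [] if there are none.
def check_xapian_options(options)
  problems = []

  # :terms entries are [field, char, prefix]
  chars = (options[:terms] || []).map { |_field, char, _prefix| char }
  chars.uniq.each do |char|
    unless char.is_a?(String) && char.match?(/\A[A-Z]\z/)
      problems << "term char #{char.inspect} is not a single uppercase letter"
    end
  end
  reused = chars.select { |c| chars.count(c) > 1 }.uniq
  problems << "term chars used more than once: #{reused.join(', ')}" unless reused.empty?

  # :values entries are [field, identifier, prefix, type]
  values = options[:values] || []
  ids = values.map { |_field, identifier, _prefix, _type| identifier }
  reused_ids = ids.select { |i| ids.count(i) > 1 }.uniq
  problems << "value identifiers used more than once: #{reused_ids.join(', ')}" unless reused_ids.empty?
  values.each do |field, _identifier, _prefix, type|
    unless [:string, :number, :date].include?(type)
      problems << "value #{field.inspect} has unknown type #{type.inspect}"
    end
  end

  problems
end

# The options from the example above pass cleanly.
puts check_xapian_options(
  :texts  => [:name, :short_name],
  :values => [[:created_at, 0, "created_at", :date]],
  :terms  => [[:variety, 'V', "variety"]]
).inspect
```

Running a check like this in a test or initializer would catch a reused character before the index is rebuilt; whether that is worth the extra code depends on how many indexed models the application has.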
55
+ Options may include:
56
+
57
+ * :eager_load, added as an :include clause when looking up search results in database
58
+ * :if, either an attribute or a function; if it returns false, the object isn't indexed
59
+
60
+ To build the index
61
+ the first time, call:
62
+ rake xapian:rebuild_index
63
+
64
+ It puts the db in the development/test/production directory in your db directory. See the configuration section below if you want to change this.
65
+
66
+ Then from a cron job or a daemon, or by hand regularly call:
67
+ 'rake xapian:update_index'
68
+
69
+
70
+ == Querying
71
+
72
+
73
+ === Testing indexing
74
+
75
+ If you just want to test indexing is working, you'll find this rake task useful:
76
+ rake xapian:query q="moo"
77
+
78
+ You have a few more options here:
79
+ * models - the models to query (ex: models="User Company"). Omitting searches all xapian models
80
+ * offset - the offset of the results
81
+ * limit - the limiting number of results
82
+ * sort_by_prefix - sort by the prefix specified in value field of the acts_as_xapian call
83
+ * collapse_by_prefix - collapse the results based on best result for its prefix
84
+
85
+ === Performing a query
86
+
87
+ To perform a query from code call ActsAsXapian::Search.new. This takes in turn:
88
+ * model_classes - list of models to search, e.g. [PublicBody, InfoRequestEvent]
89
+ * query_string - Google like syntax, see below
90
+
91
+ And then a hash of options:
92
+ * :offset - Offset of first result (default 0)
93
+ * :limit - Number of results per page
94
+ * :sort_by_prefix - Optionally, prefix of value to sort by, otherwise sort by relevance
95
+ * :sort_by_ascending - Default true (documents with higher values better/earlier), set to false for descending sort
96
+ * :collapse_by_prefix - Optionally, prefix of value to collapse by (i.e. only return most relevant result from group)
97
+
98
+ Google-like query syntax is as described in {Xapian::QueryParser Syntax}[http://www.xapian.org/docs/queryparser.html]. Queries can include prefix:value parts, according to what you indexed in the acts_as_xapian part above. You can also say things like model:InfoRequestEvent to constrain by model in more complex ways than the :model parameter, or modelid:InfoRequestEvent-100 to only find one specific object.
99
+
100
+ Returns an ActsAsXapian::Search object. Useful methods are:
101
+ * description - a techy one, to check how the query has been parsed
102
+ * matches_estimated - a guesstimate at the total number of hits
103
+ * spelling_correction - the corrected query string if there is a correction, otherwise nil
104
+ * words_to_highlight - list of words for you to highlight, perhaps with TextHelper::highlight
105
+ * results - an array of hashes each structured like:
106
+ {:model => YourModel, :weight => 3.92, :percent => 100%, :collapse_count => 0}
107
+ * :model - your Rails model, this is what you most want!
108
+ * :weight - relevancy measure
109
+ * :percent - the weight as a %, 0 meaning the item did not match the query at all
110
+ * :collapse_count - number of results with the same prefix, if you specified collapse_by_prefix
111
+
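Since results is an array of plain hashes, ordinary Enumerable methods are enough to work with it. The sketch below uses made-up sample data in the documented shape, with a Struct standing in for a Rails model; the titles, weights and the 50 percent threshold are all illustrative assumptions, not output from a real search.

```ruby
# Stand-in for a Rails model class; purely illustrative.
page = Struct.new(:title)

# Sample data shaped like the documented result hashes (values are made up).
results = [
  { :model => page.new("Moo"), :weight => 3.92, :percent => 100, :collapse_count => 0 },
  { :model => page.new("Cow"), :weight => 1.10, :percent => 28,  :collapse_count => 0 }
]

# Pull out just the model objects, which is what find_with_xapian does.
models = results.map { |r| r[:model] }

# Keep only the stronger matches; the cut-off of 50 is an arbitrary choice.
strong = results.select { |r| r[:percent] >= 50 }.map { |r| r[:model] }

puts models.map(&:title).join(", ")
puts strong.map(&:title).join(", ")
```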
112
+ === Finding similar models
113
+
114
+ To find models that are similar to a given set of models call ActsAsXapian::Similar.new. This takes:
115
+ * model_classes - list of model classes to return models from within
116
+ * models - list of models that you want to find related ones to
117
+
118
+ Returns an ActsAsXapian::Similar object. Has all methods from ActsAsXapian::Search above, except for words_to_highlight. In addition has:
119
+ * important_terms - the terms extracted from the input models that were used to search for output. Use the results method to get the similar models.
120
+
121
+
122
+ == Configuration
123
+
124
+
125
+ If you want to customise the configuration of acts_as_xapian, it will look for a file called 'xapian.yml' under RAILS_ROOT/config. As is familiar from the format of the database.yml file, separate :development, :test and :production sections are expected.
126
+
127
+ The following options are available:
128
+ * base_db_path - specifies the directory, relative to RAILS_ROOT, in which acts_as_xapian stores its search index databases. Default is the xapiandbs directory within the db directory.
129
+
130
+
131
+ == Performance
132
+
133
+ On development sites, acts_as_xapian automatically logs the time taken to do searches. The time displayed is for the Xapian parts of the query; the Rails database model lookups will be logged separately by ActiveRecord. Example:
134
+
135
+ Xapian query (0.00029s) Search: hello
136
+
137
+ To enable this, and other performance logging, on a production site, temporarily add this to the end of your config/environment.rb
138
+
139
+ ActiveRecord::Base.logger = Logger.new(STDOUT)
140
+
141
+
142
+ == Support
143
+
144
+ Please ask any questions on the {acts_as_xapian Google Group}[http://groups.google.com/group/acts_as_xapian]
145
+
146
+ The official home page and repository for acts_as_xapian are the {acts_as_xapian github page}[http://github.com/frabcus/acts_as_xapian/wikis]
147
+
148
+ For more details about anything, see source code in lib/acts_as_xapian/*.rb
data/Rakefile ADDED
@@ -0,0 +1,46 @@
1
+ require 'rubygems'
2
+ require 'rake'
3
+
4
+ begin
5
+ require 'jeweler'
6
+ Jeweler::Tasks.new do |gem|
7
+ gem.name = "acts_as_xapian"
8
+ gem.summary = %Q{A gem for interacting with the Xapian full text search engine}
9
+ gem.description = %Q{A gem for interacting with the Xapian full text search engine. Completely based on the acts_as_xapian plugin.}
10
+ gem.email = "mdnelson30@gmail.com"
11
+ gem.homepage = "http://github.com/mnelson/acts_as_xapian_gem"
12
+ gem.authors = ["Mike Nelson"]
13
+ gem.add_development_dependency "rspec", ">= 1.2.9"
14
+ gem.add_development_dependency "active_record"
15
+ # gem is a Gem::Specification... see http://www.rubygems.org/read/chapter/20 for additional settings
16
+ end
17
+ Jeweler::GemcutterTasks.new
18
+ rescue LoadError
19
+ puts "Jeweler (or a dependency) not available. Install it with: gem install jeweler"
20
+ end
21
+
22
+ require 'spec/rake/spectask'
23
+ Spec::Rake::SpecTask.new(:spec) do |spec|
24
+ spec.libs << 'lib' << 'spec'
25
+ spec.spec_files = FileList['spec/**/*_spec.rb']
26
+ end
27
+
28
+ Spec::Rake::SpecTask.new(:rcov) do |spec|
29
+ spec.libs << 'lib' << 'spec'
30
+ spec.pattern = 'spec/**/*_spec.rb'
31
+ spec.rcov = true
32
+ end
33
+
34
+ task :spec => :check_dependencies
35
+
36
+ task :default => :spec
37
+
38
+ require 'rake/rdoctask'
39
+ Rake::RDocTask.new do |rdoc|
40
+ version = File.exist?('VERSION') ? File.read('VERSION') : ""
41
+
42
+ rdoc.rdoc_dir = 'rdoc'
43
+ rdoc.title = "acts_as_xapian #{version}"
44
+ rdoc.rdoc_files.include('README*')
45
+ rdoc.rdoc_files.include('lib/**/*.rb')
46
+ end
data/VERSION ADDED
@@ -0,0 +1 @@
1
+ 0.1.1
@@ -0,0 +1,70 @@
1
+ # Generated by jeweler
2
+ # DO NOT EDIT THIS FILE DIRECTLY
3
+ # Instead, edit Jeweler::Tasks in Rakefile, and run the gemspec command
4
+ # -*- encoding: utf-8 -*-
5
+
6
+ Gem::Specification.new do |s|
7
+ s.name = %q{acts_as_xapian}
8
+ s.version = "0.1.1"
9
+
10
+ s.required_rubygems_version = Gem::Requirement.new(">= 0") if s.respond_to? :required_rubygems_version=
11
+ s.authors = ["Mike Nelson"]
12
+ s.date = %q{2010-03-17}
13
+ s.description = %q{A gem for interacting with the Xapian full text search engine. Completely based on the acts_as_xapian plugin.}
14
+ s.email = %q{mdnelson30@gmail.com}
15
+ s.extra_rdoc_files = [
16
+ "LICENSE",
17
+ "README.rdoc"
18
+ ]
19
+ s.files = [
20
+ ".document",
21
+ ".gitignore",
22
+ "LICENSE",
23
+ "README.rdoc",
24
+ "Rakefile",
25
+ "VERSION",
26
+ "acts_as_xapian.gemspec",
27
+ "generators/acts_as_xapian/USAGE",
28
+ "generators/acts_as_xapian/acts_as_xapian_generator.rb",
29
+ "generators/acts_as_xapian/templates/migrations/migration.rb",
30
+ "generators/acts_as_xapian/templates/tasks/xapian.rake",
31
+ "lib/acts_as_xapian.rb",
32
+ "lib/acts_as_xapian/base.rb",
33
+ "lib/acts_as_xapian/core_ext/array.rb",
34
+ "lib/acts_as_xapian/index.rb",
35
+ "lib/acts_as_xapian/query_base.rb",
36
+ "lib/acts_as_xapian/readable_index.rb",
37
+ "lib/acts_as_xapian/search.rb",
38
+ "lib/acts_as_xapian/similar.rb",
39
+ "lib/acts_as_xapian/writeable_index.rb",
40
+ "spec/acts_as_xapian_spec.rb",
41
+ "spec/spec.opts",
42
+ "spec/spec_helper.rb"
43
+ ]
44
+ s.homepage = %q{http://github.com/mnelson/acts_as_xapian_gem}
45
+ s.rdoc_options = ["--charset=UTF-8"]
46
+ s.require_paths = ["lib"]
47
+ s.rubygems_version = %q{1.3.6}
48
+ s.summary = %q{A gem for interacting with the Xapian full text search engine}
49
+ s.test_files = [
50
+ "spec/acts_as_xapian_spec.rb",
51
+ "spec/spec_helper.rb"
52
+ ]
53
+
54
+ if s.respond_to? :specification_version then
55
+ current_version = Gem::Specification::CURRENT_SPECIFICATION_VERSION
56
+ s.specification_version = 3
57
+
58
+ if Gem::Version.new(Gem::RubyGemsVersion) >= Gem::Version.new('1.2.0') then
59
+ s.add_development_dependency(%q<rspec>, [">= 1.2.9"])
60
+ s.add_development_dependency(%q<active_record>, [">= 0"])
61
+ else
62
+ s.add_dependency(%q<rspec>, [">= 1.2.9"])
63
+ s.add_dependency(%q<active_record>, [">= 0"])
64
+ end
65
+ else
66
+ s.add_dependency(%q<rspec>, [">= 1.2.9"])
67
+ s.add_dependency(%q<active_record>, [">= 0"])
68
+ end
69
+ end
70
+
@@ -0,0 +1 @@
1
+ ./script/generate acts_as_xapian
@@ -0,0 +1,14 @@
1
+ class ActsAsXapianGenerator < Rails::Generator::Base
2
+ def manifest
3
+ record do |m|
4
+ m.migration_template 'migrations/migration.rb', 'db/migrate',
5
+ :migration_file_name => "create_acts_as_xapian"
6
+ m.file "tasks/xapian.rake", "lib/tasks/xapian.rake"
7
+ end
8
+ end
9
+
10
+ protected
11
+ def banner
12
+ "Usage: #{$0} acts_as_xapian"
13
+ end
14
+ end
@@ -0,0 +1,14 @@
1
+ class CreateActsAsXapian < ActiveRecord::Migration
2
+ def self.up
3
+ create_table :acts_as_xapian_jobs do |t|
4
+ t.column :model, :string, :null => false
5
+ t.column :model_id, :integer, :null => false
6
+ t.column :action, :string, :null => false
7
+ end
8
+ add_index :acts_as_xapian_jobs, [:model, :model_id], :unique => true
9
+ end
10
+ def self.down
11
+ drop_table :acts_as_xapian_jobs
12
+ end
13
+ end
14
+
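The unique index on [:model, :model_id] means the job table holds at most one pending job per record, and the xapian_mark_needs_index / xapian_mark_needs_destroy callbacks in lib/acts_as_xapian.rb keep to that by deleting any existing job before inserting the new one. A minimal in-memory sketch of that behaviour follows; the JobQueue class is hypothetical and uses a Hash in place of the database table.

```ruby
# Hypothetical in-memory stand-in for the acts_as_xapian_jobs table.
class JobQueue
  def initialize
    @jobs = {} # keyed by [model, model_id], so each record has one job at most
  end

  # Mirrors the delete-then-insert done inside the callbacks' transaction:
  # any earlier job for the same record is replaced by the latest action.
  def mark(model, model_id, action)
    @jobs.delete([model, model_id])
    @jobs[[model, model_id]] = action
  end

  # Pending jobs in insertion order (Ruby hashes preserve it).
  def pending
    @jobs.map do |(model, model_id), action|
      { :model => model, :model_id => model_id, :action => action }
    end
  end
end

queue = JobQueue.new
queue.mark("Page", 1, "update")
queue.mark("Page", 2, "update")
queue.mark("Page", 1, "destroy") # supersedes the earlier update for Page 1
puts queue.pending.inspect
```

The effect is that a record saved many times between two runs of xapian:update_index is only reindexed once, and a record that is updated and then destroyed is only removed.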
@@ -0,0 +1,42 @@
1
+
2
+ namespace :xapian do
3
+
4
+ # Parameters - specify "flush=true" to save changes to the Xapian database
5
+ # after each model that is updated. This is safer, but slower. Specify
6
+ # "verbose=true" to print model name as it is run.
7
+ desc 'Updates Xapian search index with changes to models since last call'
8
+ task :update_index => :environment do
9
+ ActsAsXapian::WriteableIndex.update_index(ENV['flush'] ? true : false, ENV['verbose'] ? true : false)
10
+ end
11
+
12
+ desc 'Pulls all the xapian models from either the params or the project itself'
13
+ task :retrieve_models => :environment do
14
+ @models = (ENV['models'] || ENV['m']) && (ENV['models'] || ENV['m']).split(" ").map{|m| m.constantize} || ActiveRecord::Base.send(:subclasses).select{|klazz| klazz.respond_to?(:xapian?)}
15
+ STDOUT.puts("Found Xapian Models: #{@models.map(&:name).join(', ')}")
16
+ end
17
+ # Parameters - specify 'models="PublicBody User"' to say which models
18
+ # you index with Xapian.
19
+ # This totally rebuilds the database, so you will want to restart any
20
+ # web server afterwards to make sure it gets the changes, rather than
21
+ # still pointing to the old deleted database. Specify "verbose=true" to
22
+ # print model name as it is run.
23
+ desc 'Completely rebuilds Xapian search index (must specify all models)'
24
+ task :rebuild_index => :retrieve_models do
25
+ ActsAsXapian::WriteableIndex.rebuild_index(@models, ENV['verbose'] ? true : false)
26
+ end
27
+
28
+ # Parameters - are models, query, offset, limit, sort_by_prefix,
29
+ # collapse_by_prefix
30
+ desc 'Run a query, return YAML of results'
31
+ task :query => :retrieve_models do
32
+ raise "specify q=\"your terms\" as parameter" if (ENV['query'] || ENV['q']).nil?
33
+ s = ActsAsXapian::Search.new(@models,
34
+ (ENV['query'] || ENV['q']),
35
+ :offset => (ENV['offset'] || 0), :limit => (ENV['limit'] || 10),
36
+ :sort_by_prefix => (ENV['sort_by_prefix'] || nil),
37
+ :collapse_by_prefix => (ENV['collapse_by_prefix'] || nil)
38
+ )
39
+ STDOUT.puts(s.results.to_yaml)
40
+ end
41
+ end
42
+
@@ -0,0 +1,215 @@
1
+ # acts_as_xapian/lib/acts_as_xapian.rb:
2
+ # Xapian full text search in Ruby on Rails.
3
+ #
4
+ # Copyright (c) 2008 UK Citizens Online Democracy. All rights reserved.
5
+ # Email: francis@mysociety.org; WWW: http://www.mysociety.org/
6
+ #
7
+ # Documentation
8
+ # =============
9
+ #
10
+ # See ../README.txt for documentation. Please update that file if you edit
11
+ # code.
12
+
13
+ # Make it so if Xapian isn't installed, the Rails app doesn't fail completely,
14
+ # just when somebody does a search.
15
+ begin
16
+ require 'xapian'
17
+ $acts_as_xapian_bindings_available = true
18
+ rescue LoadError
19
+ STDERR.puts "acts_as_xapian: No Ruby bindings for Xapian installed"
20
+ $acts_as_xapian_bindings_available = false
21
+ end
22
+
23
+ module ActsAsXapian
24
+ class NoXapianRubyBindingsError < StandardError; end
25
+
26
+ # Offline indexing job queue model, create with migration made
27
+ # using "script/generate acts_as_xapian" as described in ../README.txt
28
+ class ActsAsXapianJob < ActiveRecord::Base; end
29
+
30
+ ######################################################################
31
+ # Module level variables
32
+ # XXX must be some kind of cattr_accessor that can do this better
33
+ def self.bindings_available
34
+ $acts_as_xapian_bindings_available
35
+ end
36
+
37
+ ######################################################################
38
+ # Main entry point, add acts_as_xapian to your model.
39
+
40
+ module ActsMethods
41
+ # See top of this file for docs
42
+ def acts_as_xapian(options)
43
+ # Give error only on queries if bindings not available
44
+ return unless ActsAsXapian.bindings_available
45
+
46
+ include InstanceMethods
47
+ extend ClassMethods
48
+
49
+ class_eval('def xapian_boost(term_type, term); 1; end') unless self.instance_methods.include?('xapian_boost')
50
+
51
+ # extend has_many and has_and_belongs_to_many associations with our ProxyFinder to get scoped results
52
+ # I've written a small report in the discussion group why this is the proper way of doing this.
53
+ # see here: XXX - report still to be written
54
+ self.reflections.each do |association_name, r|
55
+ # skip if the associated model isn't indexed by acts_as_xapian
56
+ next unless r.klass.respond_to?(:xapian?) && r.klass.xapian?
57
+ # skip all associations except has_many and habtm
58
+ next unless [:has_many, :has_and_belongs_to_many].include?(r.macro)
59
+
60
+ # XXX todo:
61
+ # extend the associated model xapian options with this term:
62
+ # [proxy_reflection.primary_key_name.to_sym, <magically find a free capital letter>, proxy_reflection.primary_key_name]
63
+ # otherways this assumes that the associated & indexed model indexes this kind of term
64
+
65
+ # but before you do the above, rewrite the options syntax... wich imho is actually very ugly
66
+
67
+ # XXX test this nifty feature on habtm!
68
+
69
+ if r.options[:extend].nil?
70
+ r.options[:extend] = [ProxyFinder]
71
+ elsif !r.options[:extend].include?(ProxyFinder)
72
+ r.options[:extend] << ProxyFinder
73
+ end
74
+ end
75
+
76
+ cattr_accessor :xapian_options
77
+ self.xapian_options = options
78
+
79
+ ActsAsXapian::Index.init(self.class.to_s, options)
80
+
81
+ after_save :xapian_mark_needs_index
82
+ after_destroy :xapian_mark_needs_destroy
83
+ end
84
+ end
85
+
86
+ module ClassMethods
87
+ # Model.find_with_xapian("Search Term OR Phrase")
88
+ # => Array of Records
89
+ #
90
+ # this can be used through association proxies /!\ DANGEROUS MAGIC /!\
91
+ # example:
92
+ # @document = Document.find(params[:id])
93
+ @document_pages = @document.pages.find_with_xapian("Search Term OR Phrase").compact # NOTE THE compact which removes nil objects from the array
94
+ #
95
+ # as seen here: http://pastie.org/270114
96
+ def find_with_xapian(search_term, options = {})
97
+ search_with_xapian(search_term, options).results.map {|x| x[:model] }
98
+ end
99
+
100
+ def search_with_xapian(search_term, options = {})
101
+ ActsAsXapian::Search.new([self], search_term, options)
102
+ end
103
+
104
+ def with_xapian_scope(ids)
105
+ with_scope(:find => {:conditions => {"#{self.table_name}.#{self.primary_key}" => ids}, :include => self.xapian_options[:eager_load]}) { yield }
106
+ end
107
+
108
+ #this method should return true if the integration of xapian on self is complete
109
+ def xapian?
110
+ self.included_modules.include?(InstanceMethods) && self.extended_by.include?(ClassMethods)
111
+ end
112
+ end
113
+
114
+ ######################################################################
115
+ # Instance methods that get injected into your model.
116
+
117
+ module InstanceMethods
118
+ # Used internally
119
+ def xapian_document_term
120
+ "#{self.class}-#{self.id}"
121
+ end
122
+
123
+ # Extract value of a field from the model
124
+ def xapian_value(field, type = nil)
125
+ value = self.respond_to?(field) ? self.send(field) : self[field] # Give preference to method if it exists
126
+ case type
127
+ when :date
128
+ value = value.to_time if value.kind_of?(Date)
129
+ raise "Only Time or Date types supported by acts_as_xapian for :date fields, got #{value.class}" unless value.kind_of?(Time)
130
+ value.utc.strftime("%Y%m%d")
131
+ when :boolean
132
+ value ? true : false
133
+ when :number
134
+ value.nil? ? "" : Xapian::sortable_serialise(value.to_f)
135
+ else
136
+ value.to_s
137
+ end
138
+ end
139
+
140
+ # Store record in the Xapian database
141
+ def xapian_index
142
+ # if we have a conditional function for indexing, call it and destroy the object's index entry if it fails
143
+ if self.class.xapian_options.key?(:if) && !xapian_value(self.class.xapian_options[:if], :boolean)
144
+ self.xapian_destroy
145
+ return
146
+ end
147
+
148
+ # otherwise (re)write the Xapian record for the object
149
+ doc = Xapian::Document.new
150
+ WriteableIndex.term_generator.document = doc
151
+
152
+ doc.data = self.xapian_document_term
153
+
154
+ doc.add_term("M#{self.class}")
155
+ doc.add_term("I#{doc.data}")
156
+ (self.xapian_options[:terms] || []).each do |term|
157
+ WriteableIndex.term_generator.increase_termpos # stop phrases spanning different text fields
158
+ WriteableIndex.term_generator.index_text(xapian_value(term[0]), self.xapian_boost(:term, term[0]), term[1])
159
+ end
160
+ (self.xapian_options[:values] || []).each {|value| doc.add_value(value[1], xapian_value(value[0], value[3])) }
161
+ (self.xapian_options[:texts] || []).each do |text|
162
+ WriteableIndex.term_generator.increase_termpos # stop phrases spanning different text fields
163
+ WriteableIndex.term_generator.index_text(xapian_value(text), self.xapian_boost(:text, text))
164
+ end
165
+
166
+ WriteableIndex.replace_document("I#{doc.data}", doc)
167
+ end
168
+
169
+ # Delete record from the Xapian database
170
+ def xapian_destroy
171
+ WriteableIndex.delete_document("I#{self.xapian_document_term}")
172
+ end
173
+
174
+ # Used to mark changes needed by batch indexer
175
+ def xapian_mark_needs_index
176
+ model = self.class.base_class.to_s
177
+ model_id = self.id
178
+ return false unless model_id # After save gets called even if save fails
179
+ ActiveRecord::Base.transaction do
180
+ found = ActsAsXapianJob.delete_all(["model = ? and model_id = ?", model, model_id])
181
+ job = ActsAsXapianJob.new
182
+ job.model = model
183
+ job.model_id = model_id
184
+ job.action = 'update'
185
+ job.save!
186
+ end
187
+ end
188
+
189
+ def xapian_mark_needs_destroy
190
+ model = self.class.base_class.to_s
191
+ model_id = self.id
192
+ ActiveRecord::Base.transaction do
193
+ found = ActsAsXapianJob.delete_all(["model = ? and model_id = ?", model, model_id])
194
+ job = ActsAsXapianJob.new
195
+ job.model = model
196
+ job.model_id = model_id
197
+ job.action = 'destroy'
198
+ job.save!
199
+ end
200
+ end
201
+ end
202
+
203
+ module ProxyFinder
204
+ def find_with_xapian(search_term, options = {})
205
+ search_with_xapian(search_term, options).results.map {|x| x[:model] }
206
+ end
207
+
208
+ def search_with_xapian(search_term, options = {})
209
+ ActsAsXapian::Search.new([proxy_reflection.klass], "#{proxy_reflection.primary_key_name}:#{proxy_owner.id} #{search_term}", options)
210
+ end
211
+ end
212
+ end
213
+
214
+ # Reopen ActiveRecord and include the acts_as_xapian method
215
+ ActiveRecord::Base.extend ActsAsXapian::ActsMethods
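The :date branch of xapian_value stores dates as "%Y%m%d" strings in UTC. That format is fixed-width with the most significant part first, so plain string comparison gives chronological order, which is what lets the stored value be used for sorting. A minimal standalone sketch of that branch follows; sortable_date is a hypothetical name, and it calls getutc where the original calls utc (which converts the receiver in place).

```ruby
require 'date'

# Sketch of the :date branch of xapian_value, outside any model.
# A Date is converted with to_time, which yields local midnight, so the
# UTC day can differ from the Date's day depending on the local time zone.
def sortable_date(value)
  value = value.to_time if value.is_a?(Date)
  raise ArgumentError, "expected Time or Date, got #{value.class}" unless value.is_a?(Time)
  value.getutc.strftime("%Y%m%d")
end

older = sortable_date(Time.utc(2009, 12, 31, 23, 59))
newer = sortable_date(Time.utc(2010, 1, 1, 0, 0))

puts older
puts newer
puts older < newer # string order matches chronological order
```

Note that the format keeps only day resolution, so two records from the same UTC day serialise to the same value.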
@@ -0,0 +1,24 @@
1
+ module ActsAsXapian
2
+ module ArrayExt
3
+ # Creates an ActsAsXapian::Similar search passing through the options to the search
4
+ #
5
+ # The model classes to search are automatically generated off the classes of the
6
+ # entries in the array. If you want more control over which models to search,
7
+ # specify option :models and it will override the default behavior
8
+ def search_similar(options = {})
9
+ raise "All entries must be xapian models" unless all? {|i| i.class.respond_to?(:xapian?) && i.class.xapian? }
10
+ models = options.delete(:models)
11
+ ActsAsXapian::Similar.new(models || map {|i| i.class }.uniq, self, options)
12
+ end
13
+
14
+ # Runs an ActsAsXapian::Similar search passing back the returned models instead of the
15
+ # search object. Takes all the same options as search_similar
16
+ def find_similar(options = {})
17
+ search_similar(options).results.map {|x| x[:model] }
18
+ end
19
+ end
20
+ end
21
+
22
+ Array.class_eval do
23
+ include ActsAsXapian::ArrayExt
24
+ end