mongoid-haystack 1.0.0 → 1.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
data/README.md ADDED
@@ -0,0 +1,249 @@
1
+ NAME
2
+ ----
3
+
4
+ mongoid-haystack.rb
5
+
6
+ DESCRIPTION
7
+ -----------
8
+
9
+ mongoid-haystack provides a zero-config, POLS, pure mongo, fulltext search
10
+ solution for your mongoid models.
11
+
12
+ SYNOPSIS
13
+ --------
14
+
15
+ ````ruby
16
+
17
+ # simple usage is simple
18
+ #
19
+ class Article
20
+ include Mongoid::Document
21
+ include Mongoid::Haystack
22
+
23
+ field(:content, :type => String)
24
+ end
25
+
26
+ Article.create!(:content => 'teh cats')
27
+
28
+ results = Article.search('cat')
29
+
30
+ article = results.first.model
31
+
32
+
33
+ # haystack stems the search terms and does score based sorting all using a
34
+ # fast b-tree
35
+ #
36
+ a = Article.create!(:content => 'cats are awesome')
37
+ b = Article.create!(:content => 'dogs eat cats')
38
+ c = Article.create!(:content => 'dogs dogs dogs')
39
+
40
+ results = Article.search('dogs cats').models
41
+ results == [b, a, c] #=> true
42
+
43
+ results = Article.search('awesome').models
44
+ results == [a] #=> true
45
+
46
+
47
+ # cross models searching is supported out of the box, and models can
48
+ # customise how they are indexed:
49
+ #
50
+ # - a global score lets some models appear hight in the global results
51
+ # - keywords count more than fulltext
52
+ #
53
+ class Article
54
+ include Mongoid::Document
55
+ include Mongoid::Haystack
56
+
57
+ field(:title, :type => String)
58
+ field(:content, :type => String)
59
+
60
+ def to_haystack
61
+ { :score => 11, :keywords => title, :fulltext => content }
62
+ end
63
+ end
64
+
65
+ class Comment
66
+ include Mongoid::Document
67
+ include Mongoid::Haystack
68
+
69
+ field(:content, :type => String)
70
+
71
+ def to_haystack
72
+ { :score => -11, :fulltext => content }
73
+ end
74
+ end
75
+
76
+ a1 = Article.create!(:title => 'hot pants', :content => 'teh b 52s rock')
77
+ a2 = Article.create!(:title => 'boring title', :content => 'but hot content that rocks')
78
+
79
+ c = Comment.create!(:content => 'those guys rock')
80
+
81
+ results = Mongoid::Haystack.search('rock')
82
+ results.count #=> 3
83
+
84
+ models = results.models
85
+ models == [a1, a2, c] #=> true. articles first beause we generally score them higher
86
+
87
+ results = Mongoid::Haystack.search('hot')
88
+ models = results.models
89
+ models == [a1, a2] #=> true. because keywords score highter than general fulltext
90
+
91
+
92
+ # by default searching returns Mongoid::Haystack::Index objects. you'll want
93
+ # to expand these results to the models they reference in your views, but
94
+ # avoid doing an N+1 query. to do this simply call #models on the result set
95
+ # and the models will be eager loaded using only as many queries as their are
96
+ # model types in your result set
97
+ #
98
+
99
+ @results = Mongoid::Haystack.search('needle').page(params[:page]).per(10)
100
+ @models = @results.models
101
+
102
+
103
+ # you can decorate your search items with arbirtrary meta data and filter
104
+ # searches by it later. this too uses a b-tree index.
105
+ #
106
+ class Article
107
+ include Mongoid::Document
108
+ include Mongoid::Haystack
109
+
110
+ belongs_to :author, :class_name => '::User'
111
+
112
+ field(:title, :type => String)
113
+ field(:content, :type => String)
114
+
115
+ def to_haystack
116
+ {
117
+ :score => author.popularity,
118
+ :keywords => title,
119
+ :fulltext => content,
120
+ :facets => {:author_id => author.id}
121
+ }
122
+ end
123
+ end
124
+
125
+ a =
126
+ author.articles.create!(
127
+ :title => 'iggy and keith',
128
+ :content => 'seen the needles and the damage done...'
129
+ )
130
+
131
+ author_articles = Article.search('needle', :facets => {:author_id => author.id})
132
+
133
+
134
+ ````
135
+
136
+ DESCRIPTION
137
+ -----------
138
+
139
+ there two main pathways to understand in the code. shit going into the
140
+ index, and shit coming out.
141
+
142
+ shit going in entails:
143
+
144
+ - stem and stopword the search terms.
145
+ - create or update a new token for each
146
+ - create an index item reference all the tokens with precomputed scores
147
+
148
+ for example the terms 'dog dogs cat' might result in these tokens
149
+
150
+ ````javascript
151
+
152
+ [
153
+ {
154
+ '_id' : '0x1',
155
+ 'value' : 'dog',
156
+ 'count' : 2
157
+ },
158
+
159
+
160
+ {
161
+ '_id' : '0x2',
162
+ 'value' : 'cat',
163
+ 'count' : 1
164
+ }
165
+ ]
166
+
167
+ ````
168
+
169
+ and this index item
170
+
171
+
172
+ ````javascript
173
+
174
+ {
175
+ '_id' : '50c11759a04745961e000001'
176
+
177
+ 'model_type' : 'Article',
178
+ 'model_id' : '50c11775a04745461f000001'
179
+
180
+ 'tokens' : ['0x1', '0x2'],
181
+
182
+ 'score' : 10,
183
+
184
+ 'keyword_scores' : {
185
+ '0x1' : 2,
186
+ '0x2' : 1
187
+ },
188
+
189
+ 'fulltext_scores' : {
190
+ }
191
+ }
192
+
193
+
194
+ ````
195
+
196
+ being built
197
+
198
+ in addition, some other information is tracked such and the total number of
199
+ search tokens every discovered in the corpus
200
+
201
+
202
+
203
+ a few things to notice:
204
+
205
+ - the tokens are counted and auto-id'd using hex notation and a sequence
206
+ generator. the reason for this is so that their ids are legit hash keys
207
+ in the keyword and fulltext score hashes.
208
+
209
+ - the data structure above allows both filtering for index items that have
210
+ certain tokens, but also ordering them based on global, keyword, and
211
+ fulltext score without resorting to map-reduce: a b-tree index can be
212
+ used.
213
+
214
+ - all tokens have their text/stem stored exactly once. aka: we do not store
215
+ 'hugewords' all over the place but store it once and count occurances of
216
+ it to keep the total index much smaller
217
+
218
+
219
+
220
+
221
+ pulling objects back out in a search involved these logical steps:
222
+
223
+ - filter the search terms through the same tokenizer as when indexed
224
+
225
+ - lookup tokens for each of the tokens in the search string
226
+
227
+ - using the count for each token, plus the global token count that has been
228
+ tracked we can decide to order the results by relatively rare words first
229
+ and, all else being equal (same rarity bin: 0.10, 0.20, 0.30, etc.), the
230
+ order in which the user typed the words
231
+
232
+ - this approach is applies and is valid whether we are doing a union (or) or
233
+ intersection (all) search and regardless of whether facets are included in
234
+ the search. facets, however, never affect the order unless done so by the
235
+ user manually. eg
236
+
237
+ ````ruby
238
+
239
+ results =
240
+ Mongoid::Haystack.
241
+ search('foo bar', :facets => {:hotness.gte => 11}).
242
+ order_by('facets.hotness' => :desc)
243
+
244
+ ````
245
+
246
+
247
+ SEE ALSO
248
+ --------
249
+ tests: <a href='https://github.com/ahoward/mongoid-haystack/blob/master/test/mongoid-haystack_test.rb'>./test/mongoid-haystack_test.rb<a/>
@@ -73,12 +73,13 @@ module Mongoid
73
73
 
74
74
  keyword_scores = Hash.new{|h,k| h[k] = 0}
75
75
  fulltext_scores = Hash.new{|h,k| h[k] = 0}
76
+ token_ids = []
76
77
 
77
78
  Token.values_for(keywords).each do |value|
78
79
  token = Token.add(value)
79
80
  id = token.id
80
81
 
81
- index.tokens.push(id)
82
+ token_ids.push(id)
82
83
  keyword_scores[id] += 1
83
84
  end
84
85
 
@@ -86,7 +87,7 @@ module Mongoid
86
87
  token = Token.add(value)
87
88
  id = token.id
88
89
 
89
- index.tokens.push(id)
90
+ token_ids.push(id)
90
91
  fulltext_scores[id] += 1
91
92
  end
92
93
 
@@ -96,7 +97,7 @@ module Mongoid
96
97
  index.score = score if score
97
98
  index.facets = facets if facets
98
99
 
99
- index.tokens = index.tokens.uniq
100
+ index.token_ids = token_ids
100
101
 
101
102
  index.save!
102
103
  end
@@ -105,16 +106,12 @@ module Mongoid
105
106
  def remove(*args)
106
107
  models_for(*args) do |model|
107
108
  index = where(:model_type => model.class.name, :model_id => model.id).first
108
-
109
- if index
110
- subtract(index)
111
- index.destroy
112
- end
109
+ index.destroy if index
113
110
  end
114
111
  end
115
112
 
116
113
  def subtract(index)
117
- tokens = Token.where(:id.in => index.tokens)
114
+ tokens = index.tokens
118
115
 
119
116
  n = 0
120
117
 
@@ -145,9 +142,11 @@ module Mongoid
145
142
  end
146
143
  end
147
144
 
145
+ before_destroy{|index| Index.subtract(index)}
146
+
148
147
  belongs_to(:model, :polymorphic => true)
149
148
 
150
- field(:tokens, :type => Array, :default => [])
149
+ has_and_belongs_to_many(:tokens, :class_name => '::Mongoid::Haystack::Token', :inverse_of => nil)
151
150
  field(:score, :type => Integer, :default => 0)
152
151
  field(:keyword_scores, :type => Hash, :default => proc{ Hash.new{|h,k| h[k] = 0} })
153
152
  field(:fulltext_scores, :type => Hash, :default => proc{ Hash.new{|h,k| h[k] = 0} })
@@ -156,7 +155,7 @@ module Mongoid
156
155
  index({:model_type => 1})
157
156
  index({:model_id => 1})
158
157
 
159
- index({:tokens => 1})
158
+ index({:token_ids => 1})
160
159
  index({:score => 1})
161
160
  index({:keyword_scores => 1})
162
161
  index({:fulltext_scores => 1})
@@ -5,15 +5,34 @@ module Mongoid
5
5
  options = Map.options_for!(args)
6
6
  search = args.join(' ')
7
7
 
8
+ conditions = {}
9
+ order = []
10
+
11
+ op = :token_ids.in
12
+
13
+ #
14
+ case
15
+ when options[:all]
16
+ op = :token_ids.all
17
+ search += Coerce.string(options[:all])
18
+
19
+ when options[:any]
20
+ op = :token_ids.in
21
+ search += Coerce.string(options[:any])
22
+
23
+ when options[:in]
24
+ op = :token_ids.in
25
+ search += Coerce.string(options[:in])
26
+ end
27
+
8
28
  #
9
29
  tokens = search_tokens_for(search)
30
+ token_ids = tokens.map{|token| token.id}
10
31
 
11
32
  #
12
- conditions = {}
13
- conditions[:tokens.in] = tokens.map{|token| token.id}
33
+ conditions[op] = token_ids
14
34
 
15
35
  #
16
- order = []
17
36
  order.push(["score", :desc])
18
37
 
19
38
  tokens.each do |token|
@@ -26,7 +45,7 @@ module Mongoid
26
45
 
27
46
  #
28
47
  if options[:facets]
29
- conditions[:facets] = options[:facets]
48
+ conditions[:facets] = {'$elemMatch' => options[:facets]}
30
49
  end
31
50
 
32
51
  #
@@ -36,7 +55,9 @@ module Mongoid
36
55
  end
37
56
 
38
57
  #
39
- Index.where(conditions).order_by(order)
58
+ Index.where(conditions).order_by(order).tap do |results|
59
+ results.extend(Denormalize)
60
+ end
40
61
  end
41
62
 
42
63
  def search_tokens_for(search)
@@ -62,7 +83,7 @@ module Mongoid
62
83
  options[:types] = Array(options[:types]).flatten.compact
63
84
  options[:types].push(self)
64
85
  args.push(options)
65
- Haystack.search(*args, &block)
86
+ results = Haystack.search(*args, &block)
66
87
  end
67
88
 
68
89
  after_save do |doc|
@@ -80,6 +101,8 @@ module Mongoid
80
101
  nil
81
102
  end
82
103
  end
104
+
105
+ has_one(:haystack_index, :as => :model, :class_name => '::Mongoid::Haystack::Index')
83
106
  end
84
107
 
85
108
  InstanceMethods = proc do
@@ -92,5 +115,73 @@ module Mongoid
92
115
  other.class_eval(&InstanceMethods)
93
116
  end
94
117
  end
118
+
119
+ module Denormalize
120
+ def denormalize
121
+ ::Mongoid::Haystack.denormalize(self)
122
+ self
123
+ end
124
+
125
+ def models
126
+ denormalize
127
+ map(&:model)
128
+ end
129
+ end
130
+
131
+ def Haystack.denormalize(results)
132
+ queries = Hash.new{|h,k| h[k] = []}
133
+
134
+ results = results.to_a.flatten.compact
135
+
136
+ results.each do |result|
137
+ model_type = result[:model_type]
138
+ model_id = result[:model_id]
139
+ model_class = model_type.constantize
140
+ queries[model_class].push(model_id)
141
+ end
142
+
143
+ index = Hash.new{|h,k| h[k] = {}}
144
+
145
+ queries.each do |model_class, model_ids|
146
+ models =
147
+ begin
148
+ model_class.find(model_ids)
149
+ rescue Mongoid::Errors::DocumentNotFound
150
+ model_ids.map do |model_id|
151
+ begin
152
+ model_class.find(model_id)
153
+ rescue Mongoid::Errors::DocumentNotFound
154
+ nil
155
+ end
156
+ end
157
+ end
158
+
159
+ models.each do |model|
160
+ index[model.class.name] ||= Hash.new
161
+ next unless model
162
+ index[model.class.name][model.id.to_s] = model
163
+ end
164
+ end
165
+
166
+ to_ignore = []
167
+
168
+ results.each_with_index do |result, i|
169
+ model = index[result['model_type']][result['model_id'].to_s]
170
+
171
+ if model.nil?
172
+ to_ignore.push(i)
173
+ next
174
+ else
175
+ result.model = model
176
+ end
177
+
178
+ result.model.freeze
179
+ result.freeze
180
+ end
181
+
182
+ to_ignore.reverse.each{|i| results.delete_at(i)}
183
+
184
+ results.to_a
185
+ end
95
186
  end
96
187
  end
@@ -1,77 +1,80 @@
1
1
  # encoding: utf-8
2
2
 
3
- module Stemming
4
- def stem(*args)
5
- string = args.join(' ')
6
- words = string.scan(/[\w._-]+/)
7
- stems = []
8
- words.each do |word|
9
- word = word.downcase
10
- stem = word.stem.downcase
11
- next if Stopwords.stopword?(word)
12
- next if Stopwords.stopword?(stem)
13
- stems.push(stem)
14
- end
15
- stems
16
- end
3
+ module Mongoid
4
+ module Haystack
5
+ module Stemming
6
+ def stem(*args)
7
+ string = args.join(' ')
8
+ words = Util.words_for(*args)
9
+ stems = []
10
+ words.each do |word|
11
+ stem = word.stem.downcase
12
+ next if Stopwords.stopword?(word)
13
+ next if Stopwords.stopword?(stem)
14
+ stems.push(stem)
15
+ end
16
+ stems
17
+ end
17
18
 
18
- alias_method('for', 'stem')
19
+ alias_method('for', 'stem')
19
20
 
20
- module Stopwords
21
- dirname = __FILE__.sub(/\.rb\Z/, '')
22
- glob = File.join(dirname, 'stopwords', '*.txt')
21
+ module Stopwords
22
+ dirname = __FILE__.sub(/\.rb\Z/, '')
23
+ glob = File.join(dirname, 'stopwords', '*.txt')
23
24
 
24
- List = {}
25
+ List = {}
25
26
 
26
- Dir.glob(glob).each do |wordlist|
27
- basename = File.basename(wordlist)
28
- name = basename.split(/\./).first
27
+ Dir.glob(glob).each do |wordlist|
28
+ basename = File.basename(wordlist)
29
+ name = basename.split(/\./).first
29
30
 
30
- open(wordlist) do |fd|
31
- lines = fd.readlines
32
- words = lines.map{|line| line.strip}
33
- words.delete_if{|word| word.empty?}
34
- words.push('')
35
- List[name] = words
36
- end
37
- end
31
+ open(wordlist) do |fd|
32
+ lines = fd.readlines
33
+ words = lines.map{|line| line.strip}
34
+ words.delete_if{|word| word.empty?}
35
+ words.push('')
36
+ List[name] = words
37
+ end
38
+ end
38
39
 
39
- unless defined?(All)
40
- All = []
41
- All.concat(List['english'])
42
- All.concat(List['full_english'])
43
- All.concat(List['extended_english'])
44
- #All.concat(List['full_french'])
45
- #All.concat(List['full_spanish'])
46
- #All.concat(List['full_portuguese'])
47
- #All.concat(List['full_italian'])
48
- #All.concat(List['full_german'])
49
- #All.concat(List['full_dutch'])
50
- #All.concat(List['full_norwegian'])
51
- #All.concat(List['full_danish'])
52
- #All.concat(List['full_russian'])
53
- #All.concat(List['full_russian_koi8_r'])
54
- #All.concat(List['full_finnish'])
55
- All.sort!
56
- All.uniq!
57
- end
40
+ unless defined?(All)
41
+ All = []
42
+ All.concat(List['english'])
43
+ All.concat(List['full_english'])
44
+ All.concat(List['extended_english'])
45
+ #All.concat(List['full_french'])
46
+ #All.concat(List['full_spanish'])
47
+ #All.concat(List['full_portuguese'])
48
+ #All.concat(List['full_italian'])
49
+ #All.concat(List['full_german'])
50
+ #All.concat(List['full_dutch'])
51
+ #All.concat(List['full_norwegian'])
52
+ #All.concat(List['full_danish'])
53
+ #All.concat(List['full_russian'])
54
+ #All.concat(List['full_russian_koi8_r'])
55
+ #All.concat(List['full_finnish'])
56
+ All.sort!
57
+ All.uniq!
58
+ end
58
59
 
59
- unless defined?(Index)
60
- Index = {}
60
+ unless defined?(Index)
61
+ Index = {}
61
62
 
62
- All.each do |word|
63
- Index[word] = word
63
+ All.each do |word|
64
+ Index[word] = word
65
+ end
66
+ end
67
+
68
+ def stopword?(word)
69
+ !!Index[word]
70
+ end
71
+
72
+ extend(Stopwords)
64
73
  end
65
- end
66
74
 
67
- def stopword?(word)
68
- !!Index[word]
75
+ extend(Stemming)
69
76
  end
70
-
71
- extend(Stopwords)
72
77
  end
73
-
74
- extend(Stemming)
75
78
  end
76
79
 
77
80
  if $0 == __FILE__
@@ -4,33 +4,20 @@ module Mongoid
4
4
  include Mongoid::Document
5
5
 
6
6
  class << Token
7
- def values_for(*args, &block)
8
- string = args.join(' ')
9
- values = string.scan(/[^\s]+/)
10
- Stemming.stem(*values)
7
+ def values_for(*args)
8
+ Haystack.stems_for(*args)
11
9
  end
12
10
 
13
11
  def add(value)
14
- token = nil
15
- created = nil
16
-
17
- Haystack.find_or_create(
18
- proc do
19
- token = where(:value => value).first
20
- created = false if token
21
- token
22
- end,
23
-
24
- proc do
25
- token = create!(:value => value)
26
- created = true if token
27
- token
28
- end
29
- )
12
+ token =
13
+ Haystack.find_or_create(
14
+ ->{ where(:value => value).first },
15
+ ->{ create!(:value => value) }
16
+ )
30
17
 
31
18
  token.inc(:count, 1)
32
19
 
33
- Count[:tokens].inc(1) #if created
20
+ Count[:tokens].inc(1)
34
21
 
35
22
  token
36
23
  end
@@ -33,9 +33,6 @@ module Mongoid
33
33
  models.map{|model| model.destroy_all}
34
34
  end
35
35
 
36
- def stem(*args, &block)
37
- Stemming.stem(*args, &block)
38
- end
39
36
 
40
37
  def find_or_create(finder, creator)
41
38
  doc = finder.call()
@@ -59,6 +56,23 @@ module Mongoid
59
56
  end
60
57
  end
61
58
 
59
+ def words_for(*args)
60
+ string = args.flatten.compact.join(' ').scan(/\w+/).join(' ')
61
+ words = []
62
+ UnicodeUtils.each_word(string) do |word|
63
+ word = UnicodeUtils.nfkd(word.strip)
64
+ word.gsub!(/\A(?:[^\w]|_|\s)+/, '') # leading punctuation/spaces
65
+ word.gsub!(/(?:[^\w]|_|\s+)+\Z/, '') # trailing punctuation/spaces
66
+ next if word.empty?
67
+ words.push(word)
68
+ end
69
+ words
70
+ end
71
+
72
+ def stems_for(*args, &block)
73
+ Stemming.stem(*args, &block)
74
+ end
75
+
62
76
  extend Util
63
77
  end
64
78
 
@@ -2,7 +2,7 @@
2
2
  #
3
3
  module Mongoid
4
4
  module Haystack
5
- const_set :Version, '1.0.0'
5
+ const_set :Version, '1.1.0'
6
6
 
7
7
  class << Haystack
8
8
  def version
@@ -11,9 +11,11 @@
11
11
 
12
12
  def dependencies
13
13
  {
14
- 'mongoid' => [ 'mongoid' , '~> 3.0' ] ,
15
- 'map' => [ 'map' , '~> 6.2' ] ,
16
- 'fattr' => [ 'fattr' , '~> 2.2' ] ,
14
+ 'mongoid' => [ 'mongoid' , '~> 3.0' ] ,
15
+ 'map' => [ 'map' , '~> 6.2' ] ,
16
+ 'fattr' => [ 'fattr' , '~> 2.2' ] ,
17
+ 'coerce' => [ 'coerce' , '~> 0.0.3' ] ,
18
+ 'unicode_utils' => [ 'unicode_utils' , '~> 1.4.0' ] ,
17
19
  }
18
20
  end
19
21
 
@@ -66,6 +68,9 @@
66
68
  end
67
69
  end
68
70
 
71
+ require 'unicode_utils/u'
72
+ require 'unicode_utils/each_word'
73
+
69
74
  load Haystack.libdir('stemming.rb')
70
75
  load Haystack.libdir('util.rb')
71
76
  load Haystack.libdir('count.rb')
@@ -74,6 +79,10 @@
74
79
  load Haystack.libdir('index.rb')
75
80
  load Haystack.libdir('search.rb')
76
81
 
82
+ def Haystack.included(other)
83
+ other.send(:include, Search)
84
+ end
85
+
77
86
  extend Haystack
78
87
  end
79
88
  end
@@ -3,13 +3,14 @@
3
3
 
4
4
  Gem::Specification::new do |spec|
5
5
  spec.name = "mongoid-haystack"
6
- spec.version = "1.0.0"
6
+ spec.version = "1.1.0"
7
7
  spec.platform = Gem::Platform::RUBY
8
8
  spec.summary = "mongoid-haystack"
9
9
  spec.description = "a mongoid 3 zero-config, zero-integration, POLS pure mongo fulltext solution"
10
10
 
11
11
  spec.files =
12
- ["Rakefile",
12
+ ["README.md",
13
+ "Rakefile",
13
14
  "lib",
14
15
  "lib/app",
15
16
  "lib/app/models",
@@ -63,6 +64,10 @@ Gem::Specification::new do |spec|
63
64
 
64
65
  spec.add_dependency(*["fattr", "~> 2.2"])
65
66
 
67
+ spec.add_dependency(*["coerce", "~> 0.0.3"])
68
+
69
+ spec.add_dependency(*["unicode_utils", "~> 1.4.0"])
70
+
66
71
 
67
72
  spec.extensions.push(*[])
68
73
 
data/test/helper.rb CHANGED
@@ -7,22 +7,35 @@ require_relative 'testing'
7
7
  require_relative '../lib/mongoid-haystack.rb'
8
8
 
9
9
  Mongoid::Haystack.connect!
10
+ Mongoid::Haystack.reset!
10
11
 
11
12
  class A
12
13
  include Mongoid::Document
13
14
  field(:content, :type => String)
14
15
  def to_s; content; end
16
+
17
+ field(:a)
18
+ field(:b)
19
+ field(:c)
15
20
  end
16
21
 
17
22
  class B
18
23
  include Mongoid::Document
19
24
  field(:content, :type => String)
20
25
  def to_s; content; end
26
+
27
+ field(:a)
28
+ field(:b)
29
+ field(:c)
21
30
  end
22
31
 
23
32
  class C
24
33
  include Mongoid::Document
25
34
  field(:content, :type => String)
26
35
  def to_s; content; end
36
+
37
+ field(:a)
38
+ field(:b)
39
+ field(:c)
27
40
  end
28
41
 
@@ -1,15 +1,6 @@
1
1
  require_relative 'helper'
2
2
 
3
3
  Testing Mongoid::Haystack do
4
- ##
5
- #
6
- Mongoid::Haystack.reset!
7
-
8
- setup do
9
- [A, B, C].map{|m| m.destroy_all}
10
- Mongoid::Haystack.destroy_all
11
- end
12
-
13
4
  ##
14
5
  #
15
6
  testing 'that models can, at minimum, be indexed and searched' do
@@ -49,7 +40,7 @@ Testing Mongoid::Haystack do
49
40
  ##
50
41
  #
51
42
  testing 'that basic stemming can be performed' do
52
- assert{ Mongoid::Haystack.stem('dogs cats') == %w[ dog cat ] }
43
+ assert{ Mongoid::Haystack.stems_for('dogs cats fishes') == %w[ dog cat fish ] }
53
44
  end
54
45
 
55
46
  testing 'that words are stemmed when they are indexed' do
@@ -80,14 +71,12 @@ Testing Mongoid::Haystack do
80
71
  end
81
72
 
82
73
  testing 'that removing a model from the index decrements counts appropriately' do
83
- #
84
74
  a = A.create!(:content => 'dog')
85
75
  b = A.create!(:content => 'cat')
86
76
  c = A.create!(:content => 'cats dogs')
87
77
 
88
78
  assert{ Mongoid::Haystack.index(A) }
89
79
 
90
- #
91
80
  assert{ Mongoid::Haystack.search('cat').first }
92
81
 
93
82
  assert{ Mongoid::Haystack::Token.where(:value => 'cat').first.count == 2 }
@@ -116,4 +105,159 @@ Testing Mongoid::Haystack do
116
105
  assert{ Mongoid::Haystack::Token.where(:value => 'cat').first.count == 0 }
117
106
  assert{ Mongoid::Haystack::Token.where(:value => 'dog').first.count == 0 }
118
107
  end
108
+
109
+ ##
110
+ #
111
+ testing 'that search uses a b-tree index' do
112
+ a = A.create!(:content => 'dog')
113
+
114
+ assert{ Mongoid::Haystack.index(A) }
115
+ assert{ Mongoid::Haystack.search('dog').explain['cursor'] =~ /BtreeCursor/i }
116
+ end
117
+
118
+ ##
119
+ #
120
+ testing 'that classes can export a custom [score|keywords|fulltext] for the search index' do
121
+ k = new_klass do
122
+ def to_haystack
123
+ colors.push(color = colors.shift)
124
+
125
+ {
126
+ :score => score,
127
+
128
+ :keywords => "cats #{ color }",
129
+
130
+ :fulltext => 'now is the time for all good men...'
131
+ }
132
+ end
133
+
134
+ def self.score
135
+ @score ||= 0
136
+ ensure
137
+ @score += 1
138
+ end
139
+
140
+ def score
141
+ self.class.score
142
+ end
143
+
144
+ def self.colors
145
+ @colors ||= %w( black white )
146
+ end
147
+
148
+ def colors
149
+ self.class.colors
150
+ end
151
+ end
152
+
153
+ a = k.create!(:content => 'dog')
154
+ b = k.create!(:content => 'dogs too')
155
+
156
+ assert{ a.haystack_index.score == 0 }
157
+ assert{ b.haystack_index.score == 1 }
158
+
159
+ assert do
160
+ a.haystack_index.tokens.map(&:value).sort ==
161
+ ["black", "cat", "good", "men", "time"]
162
+ end
163
+ assert do
164
+ b.haystack_index.tokens.map(&:value).sort ==
165
+ ["cat", "good", "men", "time", "white"]
166
+ end
167
+
168
+ assert{ Mongoid::Haystack.search('cat').count == 2 }
169
+ assert{ Mongoid::Haystack.search('black').count == 1 }
170
+ assert{ Mongoid::Haystack.search('white').count == 1 }
171
+ assert{ Mongoid::Haystack.search('good men').count == 2 }
172
+ end
173
+
174
+ ##
175
+ #
176
+ testing 'that set intersection and union are supported via search' do
177
+ a = A.create!(:content => 'dog')
178
+ b = A.create!(:content => 'dog cat')
179
+ c = A.create!(:content => 'dog cat fish')
180
+
181
+ assert{ Mongoid::Haystack.index(A) }
182
+
183
+ assert{ Mongoid::Haystack.search(:any => 'dog').count == 3 }
184
+ assert{ Mongoid::Haystack.search(:any => 'dog cat').count == 3 }
185
+ assert{ Mongoid::Haystack.search(:any => 'dog cat fish').count == 3 }
186
+
187
+ assert{ Mongoid::Haystack.search(:all => 'dog').count == 3 }
188
+ assert{ Mongoid::Haystack.search(:all => 'dog cat').count == 2 }
189
+ assert{ Mongoid::Haystack.search(:all => 'dog cat fish').count == 1 }
190
+ end
191
+
192
+ ##
193
+ #
194
+ testing 'that classes can export custom facets and then search them, again using a b-tree index' do
195
+ k = new_klass do
196
+ field(:to_haystack, :type => Hash, :default => proc{ Hash.new })
197
+ end
198
+
199
+ a = k.create!(:content => 'hello kitty', :to_haystack => { :keywords => 'cat', :facets => {:x => 42.0}})
200
+ b = k.create!(:content => 'hello kitty', :to_haystack => { :keywords => 'cat', :facets => {:x => 4.20}})
201
+
202
+ assert{ Mongoid::Haystack.search('cat').where(:facets => {'x' => 42.0}).first.model == a }
203
+ assert{ Mongoid::Haystack.search('cat').where(:facets => {'x' => 4.20}).first.model == b }
204
+
205
+ assert{ Mongoid::Haystack.search('cat').where('facets.x' => 42.0).first.model == a }
206
+ assert{ Mongoid::Haystack.search('cat').where('facets.x' => 4.20).first.model == b }
207
+
208
+ assert{ Mongoid::Haystack.search('cat').where('facets' => {'x' => 42.0}).explain['cursor'] =~ /BtreeCursor/ }
209
+ assert{ Mongoid::Haystack.search('cat').where('facets' => {'x' => 4.20}).explain['cursor'] =~ /BtreeCursor/ }
210
+
211
+ assert{ Mongoid::Haystack.search('cat').where('facets.x' => 42.0).explain['cursor'] =~ /BtreeCursor/ }
212
+ assert{ Mongoid::Haystack.search('cat').where('facets.x' => 4.20).explain['cursor'] =~ /BtreeCursor/ }
213
+ end
214
+
215
+ ##
216
+ #
217
+ testing 'that keywords are considered more highly than fulltext' do
218
+ k = new_klass do
219
+ field(:title)
220
+ field(:body)
221
+
222
+ def to_haystack
223
+ { :keywords => title, :fulltext => body }
224
+ end
225
+ end
226
+
227
+ a = k.create!(:title => 'the cats', :body => 'like to meow')
228
+ b = k.create!(:title => 'the dogs', :body => 'do not like to meow, they bark at cats')
229
+
230
+ assert{ Mongoid::Haystack.search('cat').count == 2 }
231
+ assert{ Mongoid::Haystack.search('cat').first.model == a }
232
+
233
+ assert{ Mongoid::Haystack.search('meow').count == 2 }
234
+ assert{ Mongoid::Haystack.search('bark').count == 1 }
235
+ assert{ Mongoid::Haystack.search('dog').first.model == b }
236
+ end
237
+
238
+ protected
239
+
240
+ def new_klass(&block)
241
+ Object.send(:remove_const, :K) if Object.send(:const_defined?, :K)
242
+
243
+ k = Class.new(A) do
244
+ self.default_collection_name = :ks
245
+ def self.name() 'K' end
246
+ include ::Mongoid::Haystack::Search
247
+ class_eval(&block) if block
248
+ end
249
+
250
+ Object.const_set(:K, k)
251
+
252
+ k
253
+ end
254
+
255
+ H = Mongoid::Haystack
256
+ T = Mongoid::Haystack::Token
257
+ I = Mongoid::Haystack::Index
258
+
259
+ setup do
260
+ [A, B, C].map{|m| m.destroy_all}
261
+ Mongoid::Haystack.destroy_all
262
+ end
119
263
  end
metadata CHANGED
@@ -1,7 +1,7 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: mongoid-haystack
3
3
  version: !ruby/object:Gem::Version
4
- version: 1.0.0
4
+ version: 1.1.0
5
5
  prerelease:
6
6
  platform: ruby
7
7
  authors:
@@ -59,12 +59,45 @@ dependencies:
59
59
  - - ~>
60
60
  - !ruby/object:Gem::Version
61
61
  version: '2.2'
62
+ - !ruby/object:Gem::Dependency
63
+ name: coerce
64
+ requirement: !ruby/object:Gem::Requirement
65
+ none: false
66
+ requirements:
67
+ - - ~>
68
+ - !ruby/object:Gem::Version
69
+ version: 0.0.3
70
+ type: :runtime
71
+ prerelease: false
72
+ version_requirements: !ruby/object:Gem::Requirement
73
+ none: false
74
+ requirements:
75
+ - - ~>
76
+ - !ruby/object:Gem::Version
77
+ version: 0.0.3
78
+ - !ruby/object:Gem::Dependency
79
+ name: unicode_utils
80
+ requirement: !ruby/object:Gem::Requirement
81
+ none: false
82
+ requirements:
83
+ - - ~>
84
+ - !ruby/object:Gem::Version
85
+ version: 1.4.0
86
+ type: :runtime
87
+ prerelease: false
88
+ version_requirements: !ruby/object:Gem::Requirement
89
+ none: false
90
+ requirements:
91
+ - - ~>
92
+ - !ruby/object:Gem::Version
93
+ version: 1.4.0
62
94
  description: a mongoid 3 zero-config, zero-integration, POLS pure mongo fulltext solution
63
95
  email: ara.t.howard@gmail.com
64
96
  executables: []
65
97
  extensions: []
66
98
  extra_rdoc_files: []
67
99
  files:
100
+ - README.md
68
101
  - Rakefile
69
102
  - lib/app/models/mongoid/haystack/count.rb
70
103
  - lib/app/models/mongoid/haystack/index.rb