mwmitchell-rsolr 0.6.2 → 0.6.3

data/CHANGES.txt CHANGED
@@ -1,3 +1,9 @@
+ 0.6.3 - January 21, 2009
+ Added a new param mapping module: RSolr::Connection::ParamMapping
+ Mapping only for fields that need conversion/escaping or nested (facet.*) etc.
+ This new module can be activated by using the #search method
+ New tests for ParamMapping
+
  0.6.2 - January 14, 2009
  Removed mapping and indexer modules -- seems to me that a general purpose mapping library
  would be more valuable than an embedded module in RSolr ( RMapper ?)
data/README.rdoc CHANGED
@@ -45,7 +45,23 @@ Use the #search method to take advantage of some of the param mapping (currently

  Thanks to a little Ruby magic, we can chain symbols to create Solr "dot" syntax: :facet.field=>'cat'

- ====Pagination
+ ==== Search Params
+ The #search method can accept the following params:
+ ===== When :qt is :standard
+ :page
+ :per_page
+ :queries
+ :filters
+ :phrase_queries
+ :phrase_filters
+ :facets
+ ===== When :qt is :dismax (also includes the :standard params)
+ :alternate_query
+ :query_fields
+ :phrase_fields
+ :boost_query
+
+ ==== Pagination
  Pagination is simplified by using the :page and :per_page params when using the #search method:

  response = solr.search(:page=>1, :per_page=>10, :q=>'*:*')
@@ -56,14 +72,11 @@ Pagination is simplified by using the :page and :per_page params when using the
  response.next_page

  If you use WillPaginate, just pass-in the response to the #will_paginate view helper:
-
+
  <%= will_paginate(@response) %>

  The #search method automatically figures out the :start and :rows values, based on the values of :page and :per_page. The will_paginate view helper uses the methods: #current_page, #previous_page, #next_page and #total_pages to create the pagination view widget.

- The #search method will be providing more param mapping, but not until RSolr gets more use. If you have ideas for a param mapping interface, let me know! I'll be throwing some stuff together as well.
-
-
  === Updating Solr
  Updating is done using native Ruby structures. Hashes are used for single documents and arrays are used for a collection of documents (hashes). These structures get turned into simple XML "messages".

@@ -96,91 +109,13 @@ Commit & Optimize
  solr.optimize


- ==Response Formats
+ == Response Formats
  The default response format is Ruby. When the :wt param is set to :ruby, the response is eval'd and wrapped up in a nice RSolr::Response class. You can get an unwrapped response by setting the :wt to "ruby" - notice, the string -- not a symbol. All other response formats are available as expected, :wt=>'xml' etc.. Currently, the only response format that gets eval'd and wrapped is :ruby.

  You can access the original request context (path, params, url etc.) from response.request. The response.request is a hash that contains the generated params, url, path, post data, headers etc., very useful for debugging and testing.

- ==Data Mapping
- The RSolr::Mapper::Base class provides some nice ways of mapping data. You provide a hash mapping and a "data source". The keys of the hash mapping become the Solr field names. The values of the hash mapping get processed differently based on the value. The data source must be an Enumerable type object. The hash mapping is processed for each item in the data source.
-
- ===Hash Map Processing
- If the value is a string, the value of the String is used as the final Solr field value. If the value is a Symbol, the Symbol is used as a key on the data source. An Enumerable type does the same as the Symbol, but for each item in the set. The most interesting and flexible processing occurs when the value is a Proc. When a Proc is used as a hash mapping value, the mapper executes the Proc's #call method, passing in the current data source item.

- ===Examples
-
- mapping = {
-   :id=>:id,
-   :title=>:title,
-   :source=>'Example',
-   :meta=>[:author, :sub_title],
-   :web_id=>proc {|item|
-     WebService.fetch_item_id_by_name(item[:name])
-   }
- }
-
- data_source = [
-   {
-     :id=>100,
-     :title=>'Doc One',
-     :author=>'Mr. X',
-     :sub_title=>'A first class document.',
-     :name=>'doc_1'
-   },
-   {
-     :id=>200,
-     :title=>'Doc Two',
-     :author=>'Mr. XYZ',
-     :sub_title=>'A second class document.',
-     :name=>'doc_2'
-   }
- ]
-
- mapper = RSolr::Mapper::Base(mapping)
- mapped_data = mapper.map(data_source)
-
- # the following would be true...
- mapped_data == [
-   {
-     :id=>100,
-     :title=>'Doc One',
-     :source=>'Example',
-     :meta=>['Mr. X', 'A first class document'],
-     :web_id=>'web_id_for_doc_1_for_example'
-   },
-   {
-     :id=>200,
-     :title=>'Doc Two',
-     :source=>'Example',
-     :meta=>['Mr. XYZ', 'A second class document'],
-     :web_id=>'web_id_for_doc_2_for_example'
-   }
- ]
-
- ===RSS Mapper
- There is currently one built in mapper, RSolr::Mapper::RSS. Here's an example usage:
-
- mapper = RSolr::Mapper::RSS.new
- mapping = {
-   :channel=>:'channel.title',
-   :url=>:'channel.link',
-   :total=>:'items.size',
-   :title=>proc {|item,m| item.title },
-   :link=>proc {|item,m| item.link },
-   :published=>proc {|item,m| item.date },
-   :description=>proc {|item,m| item.description }
- }
- mapped_data = m.map('http://site.com/feed.rss')
-
- ==Indexing
- RSolr comes with a simple indexer that makes use of the Solr mapper. Here's an example, using the "mapping" and "mapped_data" variables above (RSS mapper):
-
- solr = RSolr.connect(:http)
- i = RSolr::Indexer.new(solr, mapping)
- i.index(mapped_data)
-
-
- ==HTTP Client Adapter
+ == HTTP Client Adapter
  You can specify the http client adapter to use by setting RSolr::Connection::Adapter::HTTP.client_adapter to one of:
  :net_http uses the standard Net::HTTP library
  :curb uses the Ruby "curl" bindings
@@ -194,6 +129,6 @@ Example of using the HTTP client only:
  hclient = RSolr::HTTPClient.connect(url, :curb)
  hclient = RSolr::HTTPClient.connect(url, :net_http)

- After reading this http://apocryph.org/2008/11/09/more_indepth_analysis_ruby_http_client_performance/ - I would recommend using the :curb adapter. NOTE: You can't use the :curb adapter under jRuby. To install curb:
+ After reading this http://apocryph.org/2008/11/09/more_indepth_analysis_ruby_http_client_performance - I would recommend using the :curb adapter. NOTE: You can't use the :curb adapter under jRuby. To install curb:

  sudo gem install curb
@@ -5,11 +5,17 @@ class RSolr::Connection::Base

    attr_reader :adapter, :opts

+   attr_accessor :param_mappers
+
    # "adapter" is instance of:
    # RSolr::Adapter::HTTP
    # RSolr::Adapter::Direct (jRuby only)
    def initialize(adapter, opts={})
      @adapter=adapter
+     @param_mappers = {
+       :standard=>RSolr::Connection::ParamMapping::Standard,
+       :dismax=>RSolr::Connection::ParamMapping::Dismax
+     }
      opts[:global_params]||={}
      default_global_params = {
        :wt=>:ruby,
@@ -39,12 +45,12 @@ class RSolr::Connection::Base
      p[:wt]==:ruby ? RSolr::Response::Query::Base.new(response) : response
    end

-   # same as the #query method, but with additional param mapping
-   # currently only :page and :per_page
-   # TODO: need to get some nice and friendly param mapping here:
-   # search(:fields=>'', :facet_fields=>[], :filters=>{})
-   def search(params)
-     self.query(modify_params_for_pagination(params))
+   # register your own mapper?
+   def search(params,&blk)
+     qt = params[:qt] ? params[:qt].to_sym : :dismax
+     mapper_class = @param_mappers[qt]
+     mapper = mapper_class.new(params)
+     query(mapper.map(&blk))
    end

    # Finds a document by its id
@@ -110,22 +116,4 @@ class RSolr::Connection::Base
      RSolr::Message
    end

-   # given a hash, this method will attempt to produce the
-   # correct :start and :rows values for Solr
-   # -- if the hash contains a :per_page value, it's used for the rows
-   # if the :per_page value doesn't exist (nil), the :rows value is
-   # used, and if :rows is not set, the default value is 10
-   # -- if the hash contains a :page value (the current page)
-   # it is used to calculate the :start value
-   # returns a hash with the :rows and :start values
-   # :per_page and :page are deleted
-   def modify_params_for_pagination(p)
-     per_page = p.delete(:per_page).to_s.to_i
-     p[:rows] = per_page==0 ? (p[:rows] || 10) : per_page
-     page = p.delete(:page).to_s.to_i
-     page = page > 0 ? page : 1
-     p[:start] = (page - 1) * (p[:rows] || 0)
-     p
-   end
-
  end
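The removed modify_params_for_pagination helper and the new :per_page/:page mappings both come down to the same arithmetic. A minimal standalone sketch of it follows; the helper name pagination_params and its default_rows argument are hypothetical and not part of RSolr.

```ruby
# Hypothetical helper (not part of RSolr): converts :page / :per_page
# into Solr's :start / :rows, following the removed
# modify_params_for_pagination logic.
def pagination_params(params, default_rows = 10)
  out = params.dup
  per_page = out.delete(:per_page).to_s.to_i
  # Fall back to an explicit :rows value, then to the default.
  out[:rows] = per_page.zero? ? (out[:rows] || default_rows) : per_page
  page = out.delete(:page).to_s.to_i
  page = 1 unless page > 0
  out[:start] = (page - 1) * out[:rows]
  out
end

first_page = pagination_params({:page => 1})
tenth_page = pagination_params({:page => 10, :per_page => 100})
```

One difference worth checking against the new code: as the diff reads, the Standard mapper only sets :rows when :per_page is supplied, so a :page given without :per_page appears to produce a :start of 0 rather than falling back to 10 rows per page.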
@@ -0,0 +1,41 @@
+ class RSolr::Connection::ParamMapping::Dismax < RSolr::Connection::ParamMapping::Standard
+
+   def setup_mappings
+     super
+
+     mapping_for :alternate_query, :q.alt do |val|
+       format_query(val)
+     end
+
+     mapping_for :query_fields, :qf do |val|
+       create_boost_query(val)
+     end
+
+     mapping_for :phrase_fields, :pf do |val|
+       create_boost_query(val)
+     end
+
+     mapping_for :boost_query, :bq do |val|
+       format_query(val)
+     end
+
+   end
+
+   protected
+
+   def create_boost_query(input)
+     case input
+     when Hash
+       qf = []
+       input.each_pair do |k,v|
+         qf << (v.to_s.empty? ? k : "#{k}^#{v}")
+       end
+       qf.join(' ')
+     when Array
+       input.join(' ')
+     when String
+       input
+     end
+   end
+
+ end
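To make the create_boost_query behaviour concrete, here is a small standalone sketch of the same field^boost formatting. The function name boost_string is hypothetical, and the fallback to to_s for other input types is an assumption of this sketch (the method in the diff returns nil for anything that is not a Hash, Array or String).

```ruby
# Hypothetical standalone version of the boost-string logic:
# a Hash becomes "field^boost" pairs, an Array is space-joined,
# anything else is passed through as a string.
def boost_string(input)
  case input
  when Hash
    input.map { |field, boost|
      boost.to_s.empty? ? field.to_s : "#{field}^#{boost}"
    }.join(' ')
  when Array
    input.join(' ')
  else
    input.to_s
  end
end

qf = boost_string({:title => 20, :body => nil})
pf = boost_string(%w[title body])
```

Hash iteration order is only guaranteed from Ruby 1.9 onwards, which is probably why test_dismax in this release matches :qf with regular expressions instead of comparing the whole string.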
@@ -0,0 +1,115 @@
+ class RSolr::Connection::ParamMapping::Standard
+
+   include RSolr::Connection::ParamMapping::MappingMethods
+
+   attr_reader :input, :output
+
+   def initialize(input)
+     @output = {}
+     @input = input
+     setup_mappings
+   end
+
+   def setup_mappings
+
+     mapping_for :per_page, :rows do |val|
+       val = val.to_s.to_i
+       val < 0 ? 0 : val
+     end
+
+     mapping_for :page, :start do |val|
+       val = val.to_s.to_i
+       page = val > 0 ? val : 1
+       ((page - 1) * (@output[:rows] || 0))
+     end
+
+     mapping_for :queries, :q do |val|
+       format_query(val)
+     end
+
+     mapping_for :phrase_queries, :q do |val|
+       [@output[:q], format_query(val, true)].reject{|v|v.to_s.empty?}.join(' ')
+     end
+
+     mapping_for :filters, :fq do |val|
+       format_query(val)
+     end
+
+     # this must come after the :filter/:fq mapper
+     mapping_for :phrase_filters, :fq do |val|
+       [@output[:fq], format_query(val, true)].reject{|v|v.to_s.empty?}.join(' ')
+     end
+
+     mapping_for :facets do |input|
+       next if input.to_s.empty?
+       @output[:facet] = true
+       @output[:facet.field] = []
+       if input[:queries]
+         # convert to an array if needed
+         input[:queries] = [input[:queries]] unless input[:queries].is_a?(Array)
+         @output[:facet.query] = input[:queries].map{|q|format_query(q)}
+       end
+       common_sub_fields = [:sort, :limit, :missing, :mincount, :prefix, :offset, :method, :enum.cache.minDf]
+       (common_sub_fields).each do |subfield|
+         next unless input[subfield]
+         @output["facet.#{subfield}"] = input[subfield]
+       end
+       if input[:fields]
+         input[:fields].each do |f|
+           if f.kind_of? Hash
+             key = f.keys[0]
+             value = f[key]
+             @output[:facet.field] << key
+             common_sub_fields.each do |subfield|
+               next unless value[subfield]
+               @output["f.#{key}.facet.#{subfield}"] = input[subfield]
+             end
+           else
+             @output[:facet.field] << f
+           end
+         end
+       end
+     end
+   end
+
+   def format_query(input, quote=false)
+     case input
+     when Array
+       format_array_query(input, quote)
+     when Hash
+       format_hash_query(input, quote)
+     else
+       prep_value(input, quote)
+     end
+   end
+
+   def format_array_query(input, quote)
+     input.collect do |v|
+       v.is_a?(Hash) ? format_hash_query(v, quote) : prep_value(v, quote)
+     end.join(' ')
+   end
+
+   # groups values to a single field: title:(value1 value2) instead of title:value1 title:value2
+   # a value can be a range or a string
+   def format_hash_query(input, quote=false)
+     q = []
+     input.each_pair do |field,value|
+       next if value.to_s.empty? # skip blank values!
+       # create the field plus the delimiter if the field is not blank
+       value = [value] unless value.is_a?(Array)
+       fielded_queries = value.collect do |vv|
+         vv.is_a?(Range) ? "[#{vv.min} TO #{vv.max}]" : prep_value(vv, quote)
+       end
+       field = field.to_s.empty? ? '' : "#{field}:"
+       if fielded_queries.size > 0
+         q << (fielded_queries.size > 1 ? "#{field}(#{fielded_queries.join(' ')})" : "#{field}#{fielded_queries.to_s}")
+       end
+     end
+     q.join(' ')
+   end
+
+   def prep_value(val, quote=false)
+     quote ? %(\"#{val}\") : val.to_s
+   end
+
+ end
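The grouping rule described in the format_hash_query comment (title:(value1 value2) rather than title:value1 title:value2) can be tried in isolation. The sketch below is a simplified, standalone approximation; the names quote_value and hash_query are hypothetical, and the blank-value check is written slightly differently from the diff.

```ruby
# Hypothetical helper: optionally wraps a value in double quotes.
def quote_value(val, quote = false)
  quote ? %("#{val}") : val.to_s
end

# Hypothetical standalone sketch of the hash-to-query formatting:
# several values for one field are grouped in parentheses, and
# Range values become Solr range syntax.
def hash_query(input, quote = false)
  clauses = input.map do |field, value|
    # Skip nil and empty strings/arrays.
    next if value.nil? || (value.respond_to?(:empty?) && value.empty?)
    values = value.is_a?(Array) ? value : [value]
    terms = values.map do |v|
      v.is_a?(Range) ? "[#{v.min} TO #{v.max}]" : quote_value(v, quote)
    end
    prefix = field.to_s.empty? ? '' : "#{field}:"
    terms.size > 1 ? "#{prefix}(#{terms.join(' ')})" : "#{prefix}#{terms.first}"
  end
  clauses.compact.join(' ')
end

grouped = hash_query({:title => %w[ruby solr], :price => 1..5})
phrase  = hash_query({:author => 'Mr. X'}, true)
```

The sketch uses terms.first for the single-value case. The diff calls fielded_queries.to_s there, which concatenates the elements on Ruby 1.8 but returns the inspect form (for example ["value"]) on Ruby 1.9 and later.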
@@ -0,0 +1,39 @@
+ module RSolr::Connection::ParamMapping
+
+   autoload :Standard, 'rsolr/connection/param_mapping/standard'
+   autoload :Dismax, 'rsolr/connection/param_mapping/dismax'
+
+   module MappingMethods
+
+     def mappers
+       @mappers ||= []
+     end
+
+     def mapping_for(user_param_name, solr_param_name=nil, &block)
+       return unless @input[user_param_name]
+       if (m = self.mappers.detect{|m|m[:input_name] == user_param_name})
+         self.mappers.delete m
+       end
+       self.mappers << {:input_name=>user_param_name, :output_name=>solr_param_name, :block=>block}
+     end
+
+     def map(&blk)
+       input = @input.dup
+       mappers.each do |m|
+         input_value = input[m[:input_name]]
+         input.delete m[:input_name]
+         if m[:block]
+           value = m[:block].call(input_value)
+         else
+           value = input_value
+         end
+         if m[:output_name]
+           @output[m[:output_name]] = value
+         end
+       end
+       @output.merge(input)
+     end
+
+   end
+
+ end
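The register-then-map flow in MappingMethods may be easier to follow in a reduced form. The class below is a hypothetical, self-contained approximation (TinyMapper is not part of RSolr): rules are registered only for params the caller supplied, applied in registration order, and everything else passes through untouched.

```ruby
# Hypothetical, simplified mapper illustrating the flow of
# mapping_for and map; not the RSolr implementation.
class TinyMapper
  def initialize(input)
    @input = input
    @output = {}
    @mappers = []
  end

  # Register a rule only when the user actually supplied the param;
  # a later rule for the same input name replaces the earlier one.
  def mapping_for(user_name, solr_name = nil, &block)
    return unless @input[user_name]
    @mappers.reject! { |m| m[:input_name] == user_name }
    @mappers << {:input_name => user_name, :output_name => solr_name, :block => block}
  end

  # Apply the rules in order; unmapped params are merged in unchanged.
  def map
    remaining = @input.dup
    @mappers.each do |m|
      raw = remaining.delete(m[:input_name])
      value = m[:block] ? m[:block].call(raw) : raw
      @output[m[:output_name]] = value if m[:output_name]
    end
    @output.merge(remaining)
  end
end

mapper = TinyMapper.new({:per_page => '25', :q => '*:*'})
mapper.mapping_for(:per_page, :rows) { |v| [v.to_i, 0].max }
mapper.mapping_for(:page, :start) { |v| v.to_i } # ignored: no :page given
result = mapper.map
```

Because rules run in registration order and write into a shared output hash, a later rule can read what an earlier one produced, which is how the :page mapping in the Standard mapper picks up the :rows value.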
@@ -2,5 +2,6 @@ module RSolr::Connection

    autoload :Base, 'rsolr/connection/base'
    autoload :Adapter, 'rsolr/connection/adapter'
+   autoload :ParamMapping, 'rsolr/connection/param_mapping'

  end
data/lib/rsolr.rb CHANGED
@@ -7,7 +7,7 @@ proc {|base, files|

  module RSolr

-   VERSION = '0.6.2'
+   VERSION = '0.6.3'

    autoload :Message, 'rsolr/message'
    autoload :Response, 'rsolr/response'
@@ -0,0 +1,61 @@
+ require File.join(File.dirname(__FILE__), '..', 'test_helpers')
+
+ class ParamMappingTest < RSolrBaseTest
+
+   include RSolr::Connection::ParamMapping
+
+   def test_standard_simple
+     input = {
+       :queries=>'a query',
+       :filters=>'a filter',
+       :page=>1,
+       :per_page=>10,
+       :phrase_queries=>'a phrase query',
+       :phrase_filters=>'a phrase filter',
+       :facets=>{
+         :fields=>[:one,:two]
+       }
+     }
+     mapper = Standard.new(input)
+     output = mapper.map
+
+     assert_equal "a query \"a phrase query\"", output[:q]
+     assert_equal "a filter \"a phrase filter\"", output[:fq]
+     assert_equal 0, output[:start]
+     assert_equal 10, output[:rows]
+     # facet.field can be specified multiple times, so we need an array
+     # the url builder automatically adds multiple params for arrays
+     assert_equal [:one, :two], output[:facet.field]
+   end
+
+   def test_standard_complex
+     input = {
+       :queries=>['a query', {:field=>'value'}, 'blah'],
+       :filters=>['a filter', {:filter=>'field'}, 'blah'],
+       :phrase_queries=>['a phrase', {:phrase_field=>'phrase value'}],
+       :phrase_filters=>{:can_also_be_a=>'hash'}
+     }
+     mapper = Standard.new(input)
+     output = mapper.map
+
+     assert_equal "a query field:value blah \"a phrase\" phrase_field:\"phrase value\"", output[:q]
+     assert_equal "a filter filter:field blah can_also_be_a:\"hash\"", output[:fq]
+   end
+
+   def test_dismax
+     input = {
+       :alternate_query=>{:can_be_a_string_hash_or_array=>'OK'},
+       :query_fields=>{:a_field_to_boost=>20, :another_field_to_boost=>200},
+       :phrase_fields=>{:phrase_field=>20},
+       :boost_query=>[{:field_to_use_for_boost_query=>'a'}, 'test']
+     }
+     mapper = Dismax.new(input)
+     output = mapper.map
+     assert_equal 'can_be_a_string_hash_or_array:OK', output[:q.alt]
+     assert output[:qf]=~/another_field_to_boost\^200/
+     assert output[:qf]=~/a_field_to_boost\^20/
+     assert_equal 'phrase_field^20', output[:pf]
+     assert_equal 'field_to_use_for_boost_query:a test', output[:bq]
+   end
+
+ end
@@ -5,6 +5,7 @@

  module ConnectionTestMethods

+
    #def teardown
    # @solr.delete_by_query('id:[* TO *]')
    # @solr.commit
@@ -80,10 +81,11 @@ module ConnectionTestMethods

    def test_add
      assert_equal 0, @solr.query(:q=>'*:*').total
-     response = @solr.add(:id=>100)
+     update_response = @solr.add(:id=>100)
+     assert update_response.is_a?(RSolr::Response::Update)
+     #
      @solr.commit
      assert_equal 1, @solr.query(:q=>'*:*').total
-     assert response.is_a?(RSolr::Response::Update)
    end

    def test_delete_by_id
@@ -16,17 +16,6 @@ class ResponsePaginationTest < RSolrBaseTest
      #assert_equal 0, dummy_connection.send(:calculate_start, 0, 50)
    end

-   def test_connection_modify_params_for_pagination
-     dummy_connection = RSolr::Connection::Base.new(nil)
-     p = dummy_connection.send(:modify_params_for_pagination, {:page=>1})
-     assert_equal 0, p[:start]
-     assert_equal 10, p[:rows]
-     #
-     p = dummy_connection.send(:modify_params_for_pagination, {:page=>10, :per_page=>100})
-     assert_equal 900, p[:start]
-     assert_equal 100, p[:rows]
-   end
-
    def test_math
      response = create_response({'rows'=>5})
      assert_equal response.params['rows'], response.per_page
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: mwmitchell-rsolr
  version: !ruby/object:Gem::Version
-   version: 0.6.2
+   version: 0.6.3
  platform: ruby
  authors:
  - Matt Mitchell
@@ -9,7 +9,7 @@ autorequire:
  bindir: bin
  cert_chain: []

- date: 2009-01-14 00:00:00 -08:00
+ date: 2009-01-21 00:00:00 -08:00
  default_executable:
  dependencies:
  - !ruby/object:Gem::Dependency
@@ -43,6 +43,9 @@ files:
  - lib/rsolr/connection/adapter/http.rb
  - lib/rsolr/connection/adapter.rb
  - lib/rsolr/connection/base.rb
+ - lib/rsolr/connection/param_mapping.rb
+ - lib/rsolr/connection/param_mapping/dismax.rb
+ - lib/rsolr/connection/param_mapping/standard.rb
  - lib/rsolr/connection.rb
  - lib/rsolr/http_client/adapter/curb.rb
  - lib/rsolr/http_client/adapter/net_http.rb
@@ -89,6 +92,7 @@ summary: A Ruby client for Apache Solr
  test_files:
  - test/connection/direct_test.rb
  - test/connection/http_test.rb
+ - test/connection/param_mapping_test.rb
  - test/connection/test_methods.rb
  - test/core_ext_test
  - test/http_client/curb_test.rb