mwmitchell-solr 0.5.3 → 0.5.4

Files changed (38)
  1. data/CHANGES.txt +28 -0
  2. data/README.rdoc +50 -13
  3. data/Rakefile +1 -1
  4. data/examples/direct.rb +6 -4
  5. data/examples/http.rb +7 -2
  6. data/lib/solr/{adapter → connection/adapter}/common_methods.rb +1 -44
  7. data/lib/solr/{adapter → connection/adapter}/direct.rb +28 -13
  8. data/lib/solr/connection/adapter/http.rb +51 -0
  9. data/lib/solr/connection/adapter.rb +7 -0
  10. data/lib/solr/connection/search_ext.rb +28 -24
  11. data/lib/solr/connection.rb +1 -1
  12. data/lib/solr/http_client/adapter/curb.rb +51 -0
  13. data/lib/solr/http_client/adapter/net_http.rb +48 -0
  14. data/lib/solr/http_client/adapter.rb +6 -0
  15. data/lib/solr/http_client.rb +115 -0
  16. data/lib/solr/mapper/rss.rb +2 -0
  17. data/lib/solr/message.rb +5 -1
  18. data/lib/solr/response/base.rb +32 -0
  19. data/lib/solr/response/index_info.rb +22 -0
  20. data/lib/solr/response/query.rb +93 -0
  21. data/lib/solr/response/update.rb +4 -0
  22. data/lib/solr.rb +3 -4
  23. data/test/{direct_test.rb → connection/direct_test.rb} +4 -4
  24. data/test/connection/http_test.rb +19 -0
  25. data/test/{connection_test_methods.rb → connection/test_methods.rb} +14 -4
  26. data/test/http_client/curb_test.rb +19 -0
  27. data/test/http_client/net_http_test.rb +13 -0
  28. data/test/http_client/test_methods.rb +40 -0
  29. data/test/{adapter_common_methods_test.rb → http_client/util_test.rb} +3 -12
  30. data/test/message_test.rb +10 -0
  31. data/test/{ext_pagination_test.rb → pagination_test.rb} +8 -8
  32. data/test/test_helpers.rb +8 -0
  33. metadata +29 -17
  34. data/lib/solr/adapter/http.rb +0 -55
  35. data/lib/solr/adapter.rb +0 -7
  36. data/test/ext_search_test.rb +0 -9
  37. data/test/http_test.rb +0 -15
  38. data/test/indexer_test.rb +0 -14
data/CHANGES.txt ADDED
@@ -0,0 +1,28 @@
+ 0.5.4 - December 29, 2008
+
+ Re-organized the main Solr adapters, they're now in Solr::Connection::Adapter instead of Solr::Adapter
+
+ All responses from HTTPClient and Connection::Adapter::Direct return a hash with the following keys:
+
+ :status_code
+ :body
+ :params
+ :url
+ :path
+ :headers
+ :data
+
+ This hash is now available in the solr response objects as #source - this will be useful in testing and debugging by allowing you to see the generated params and queries... example:
+
+ response = Solr.query(:q=>'*:*')
+ response.source[:params]
+ response.source[:body]
+ response.source[:url]
+
+ Added MultiValue field support in Solr::Message, thanks to Fouad Mardini
+
+ Bug in Solr::Connection::SearchExt where the :q params was not getting generated - fixed by Fouad Mardini
+
+ Organized tests a bit, moved connection tests into test/connection
+
+ Fixed a bug in Solr::Connection::Adapter::HTTP where invalid HTTP POST headers were being generated
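Reviewer note: the response hash described in this changelog entry can be pictured with a minimal sketch. Only the key names come from CHANGES.txt; the `build_context` helper, its arguments and the unescaped query string are assumptions made for illustration and are not part of the gem.

```ruby
# Hypothetical helper (not part of the gem): assembles a hash with the
# keys that CHANGES.txt lists for 0.5.4 responses.
def build_context(path, params, body, status_code = 200)
  # Naive query string, no escaping; good enough to illustrate :url.
  query = params.map { |k, v| "#{k}=#{v}" }.join('&')
  {
    :status_code => status_code,
    :body        => body,
    :params      => params,
    :url         => query.empty? ? path : "#{path}?#{query}",
    :path        => path,
    :headers     => {},
    :data        => nil
  }
end

source = build_context('/select', { :q => 'ipod' }, '{}')
puts source[:url]
puts source.keys.inspect
```

In the gem itself this hash is reached through `response.source`, as the changelog example shows.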
data/README.rdoc CHANGED
@@ -21,25 +21,21 @@ You can set Solr params that will be sent for every request:
 
  solr = Solr.connect(:http, :global_params=>{:wt=>:ruby, :echoParams=>'EXPLICIT'})
 
- Solr.connect also yields the adapter instance if a block is supplied:
-
- solr = Solr.connect(:http) do |net_http|
- net_http.class == Net::HTTP
- # set ssl options etc..
- end
-
 
  == Requests
  Once you have a connection, you can execute queries, updates etc..
 
 
  === Querying
- response = solr.query(:q=>'washington', :facet=>true, :facet.limit=>-1, :facet.field=cat, :facet.field=>inStock)
+ response = solr.query(:q=>'washington', :facet=>true, :facet.limit=>-1, :facet.field=>cat, :facet.field=>inStock)
  response = solr.find_by_id(1)
 
  Using the #search method makes building more complex Solr queries easier:
 
  response = solr.search 'my search', :filters=>{:price=>(0.00..10.00)}
+ response.docs.each do |doc|
+ doc.price
+ end
 
  ====Pagination
  Pagination is simplified by using the :page and :per_page params:
@@ -55,8 +51,10 @@ If you use WillPaginate, just pass-in the response to the #will_paginate view he
 
  <%= will_paginate(@response) %>
 
+ The #query and #search methods automatically figure out the :start and :rows value, based on the values of :page and :per_page. The will_paginate view helper just needs the right methods (#current_page, #previous_page, #next_page and #total_pages) to create the pagination view widget.
+
 
- === Updating
+ === Updating Solr
  Updating is done using native Ruby structures. Hashes are used for single documents and arrays are used for a collection of documents (hashes). These structures get turned into simple XML "messages".
 
  Single document
@@ -89,13 +87,15 @@ Commit & Optimize
 
 
  ==Response Formats
- The default response format is Ruby. When the :wt param is set to :ruby, the response is eval'd and wrapped up in a nice Solr::Response class. You can get raw ruby by setting the :wt to "ruby" - notice, the string -- not a symbol. All other response formats are available as expected, :wt=>'xml' etc.. Currently, the only response format that gets eval'd and wrapped is :ruby.
+ The default response format is Ruby. When the :wt param is set to :ruby, the response is eval'd and wrapped up in a nice Solr::Response class. You can get an unwrapped response by setting the :wt to "ruby" - notice, the string -- not a symbol. All other response formats are available as expected, :wt=>'xml' etc.. Currently, the only response format that gets eval'd and wrapped is :ruby.
+
+ You can access the original request context (path, params, url etc.) from response.source. The response.source is a hash that contains the generated params, url, path, post data, headers etc.. This could be useful for debugging and testing.
 
  ==Data Mapping
- The Solr::Mapper::Base class provides some nice ways of mapping data. You provide a hash mapping and a "data source". The keys of the hash mapping become the Solr field names. The values of the hash mapping get processed differently based on the type of value. The data source must be an Enumerable type object. The hash mapping is called for each item in the data source.
+ The Solr::Mapper::Base class provides some nice ways of mapping data. You provide a hash mapping and a "data source". The keys of the hash mapping become the Solr field names. The values of the hash mapping get processed differently based on the value. The data source must be an Enumerable type object. The hash mapping is processed for each item in the data source.
 
  ===Hash Map Processing
- If the value is a string, the String is used as the final Solr field value. If the value is a Symbol, the Symbol is used as a key on the data source. An Enumerable type does the same as the Symbol, but for each item. The most interesting processing occurs when the value is a Proc. When a Proc is used as a hash mapping value, the Solr::Mapper::Base executes the Proc's #call method, passing in the current data source item.
+ If the value is a string, the value of the String is used as the final Solr field value. If the value is a Symbol, the Symbol is used as a key on the data source. An Enumerable type does the same as the Symbol, but for each item in the set. The most interesting and flexible processing occurs when the value is a Proc. When a Proc is used as a hash mapping value, the mapper executes the Proc's #call method, passing in the current data source item.
 
  ===Examples
 
@@ -145,4 +145,41 @@ If the value is a string, the String is used as the final Solr field value. If t
  :meta=>['Mr. XYZ', 'A second class document'],
  :web_id=>'web_id_for_doc_2_for_example'
  }
- ]
+ ]
+
+ ===RSS Mapper
+ There is currently one built in mapper, Solr::Mapper::RSS. Here's an example usage:
+
+ mapper = Solr::Mapper::RSS.new
+ mapping = {
+ :channel=>:'channel.title',
+ :url=>:'channel.link',
+ :total=>:'items.size',
+ :title=>proc {|item,m| item.title },
+ :link=>proc {|item,m| item.link },
+ :published=>proc {|item,m| item.date },
+ :description=>proc {|item,m| item.description }
+ }
+ mapped_data = m.map('http://site.com/feed.rss')
+
+ ==Indexing
+ Solr (ruby) comes with a simple indexer that makes use of the Solr mapper. Here's an example, using the "mapping" and "mapped_data" variables above (RSS mapper):
+
+ solr = Solr.connect(:http)
+ i = Solr::Indexer.new(solr, mapping)
+ i.index(data_source)
+
+
+ ==HTTP Client Adapter
+ You can specify the Ruby http client to use by setting Solr::Adapter::HTTP.client_adapter to one of:
+ :net_http uses the standard Net::HTTP library
+ :curb uses the Ruby "curl" bindings
+
+ Example:
+
+ hclient = Solr::HTTPClient.connect(url, :curb)
+ hclient = Solr::HTTPClient.connect(url, :net_http)
+
+ After reading this http://apocryph.org/2008/11/09/more_indepth_analysis_ruby_http_client_performance/ - I would recommend using the :curb adapter. NOTE: You can't use the :curb adapter under jRuby. To install curb:
+
+ sudo gem install curb
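Reviewer note: the README text added above says #query and #search derive :start and :rows from :page and :per_page. The gem's own implementation is not part of this diff, so the sketch below only assumes the conventional arithmetic; the helper name `page_to_start_rows` and the default of 10 rows per page are hypothetical.

```ruby
# Hypothetical helper: converts :page/:per_page into Solr's :start/:rows.
# Assumes pages are 1-based and that :start is a zero-based row offset.
def page_to_start_rows(params)
  params   = params.dup                          # leave the caller's hash untouched
  per_page = (params.delete(:per_page) || 10).to_i
  page     = [(params.delete(:page) || 1).to_i, 1].max
  params.merge(:start => (page - 1) * per_page, :rows => per_page)
end

solr_params = page_to_start_rows({ :q => 'ipod', :page => 3, :per_page => 2 })
puts solr_params[:start]
puts solr_params[:rows]
```

Under these assumptions page 3 with 2 rows per page starts at offset 4, which is the value will_paginate-style helpers need to compute the current page back from a response.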
data/Rakefile CHANGED
@@ -17,7 +17,7 @@ task :default => [:test_units]
 
  desc "Run basic tests"
  Rake::TestTask.new("test_units") { |t|
- t.pattern = 'test/*_test.rb'
+ t.pattern = 'test/**/*_test.rb'
  t.verbose = true
  t.warning = true
  }
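Reviewer note: the pattern change matters because this release moves tests into subdirectories (test/connection, test/http_client). A small sketch of the difference between the two globs, using a throwaway directory; the file names are only illustrative.

```ruby
require 'tmpdir'
require 'fileutils'

flat = recursive = nil

Dir.mktmpdir do |root|
  # One test at the top level, one in a subdirectory.
  FileUtils.mkdir_p(File.join(root, 'test', 'connection'))
  FileUtils.touch(File.join(root, 'test', 'message_test.rb'))
  FileUtils.touch(File.join(root, 'test', 'connection', 'http_test.rb'))

  # Old pattern: only files directly under test/.
  flat = Dir.glob('test/*_test.rb', base: root).sort
  # New pattern: "**/" also descends into subdirectories.
  recursive = Dir.glob('test/**/*_test.rb', base: root).sort
end

puts flat.inspect
puts recursive.inspect
```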
data/examples/direct.rb CHANGED
@@ -1,14 +1,16 @@
  # Must be executed using jruby
  require File.join(File.dirname(__FILE__), '..', 'lib', 'solr')
 
- path_to_solr_dist=''
-
  base = File.expand_path( File.dirname(__FILE__) )
  dist = File.join(base, '..', 'apache-solr')
  home = File.join(dist, 'example', 'solr')
 
  solr = Solr.connect(:direct, :home_dir=>home, :dist_dir=>dist)
 
- #`cd ../apache-solr-1.3.0/example/exampledocs && ./post.sh ./*.xml`
+ `cd ../apache-solr/example/exampledocs && ./post.sh ./*.xml`
+
+ response = solr.search 'ipod', :filters=>{:price=>(0..50)}, :per_page=>2, :page=>1
+
+ solr.delete_by_query('*:*')
 
- solr.search 'ipod', :filters=>{:price=>(0..50)}, :per_page=>2, :page=>1
+ puts response.inspect
data/examples/http.rb CHANGED
@@ -1,7 +1,12 @@
+ # Must be executed using jruby
  require File.join(File.dirname(__FILE__), '..', 'lib', 'solr')
 
  solr = Solr.connect(:http)
 
- #`cd ../apache-solr-1.3.0/example/exampledocs && ./post.sh ./*.xml`
+ `cd ../apache-solr/example/exampledocs && ./post.sh ./*.xml`
 
- r = solr.paginate 'ipod', :filters=>{:price=>(0..50)}, :per_page=>2, :page=>1
+ response = solr.search 'ipod', :filters=>{:price=>(0..50)}, :per_page=>2, :page=>1
+
+ solr.delete_by_query('*:*')
+
+ puts response.inspect
data/lib/solr/{adapter → connection/adapter}/common_methods.rb CHANGED
@@ -7,9 +7,7 @@
  # request_path is a string to a handler (/select)
  # params is a hash for query string params
  # data is optional string of xml
- #
- #
- module Solr::Adapter::CommonMethods
+ module Solr::Connection::Adapter::CommonMethods
 
  # send a request to the "select" handler
  def query(params)
@@ -45,45 +43,4 @@ module Solr::Adapter::CommonMethods
  @adapter.send_request(handler_path, params, data)
  end
 
- # escapes a query key/value for http
- def escape(s)
- s.to_s.gsub(/([^ a-zA-Z0-9_.-]+)/n) {
- '%'+$1.unpack('H2'*$1.size).join('%').upcase
- }.tr(' ', '+')
- end
-
- def build_param(k,v)
- "#{escape(k)}=#{escape(v)}"
- end
-
- # takes a path and a hash of query params, returns an escaped url with query params
- def build_url(path, params_hash=nil)
- query = hash_to_params(params_hash)
- query ? path + '?' + query : path
- end
-
- #
- # converts hash into URL query string, keys get an alpha sort
- # if a value is an array, the array values get mapped to the same key:
- # hash_to_params(:q=>'blah', 'facet.field'=>['location_facet', 'format_facet'])
- # returns:
- # ?q=blah&facet.field=location_facet&facet.field=format.facet
- #
- # if a value is empty/nil etc., the key is not added
- def hash_to_params(params)
- return unless params.is_a?(Hash)
- # copy params and convert keys to strings
- params = params.inject({}){|acc,(k,v)| acc.merge({k.to_s, v}) }
- # get sorted keys
- params.keys.sort.inject([]) do |acc,k|
- v = params[k]
- if v.is_a?(Array)
- acc << v.reject{|i|i.to_s.empty?}.collect{|vv|build_param(k, vv)}
- elsif ! v.to_s.empty?
- acc.push(build_param(k, v))
- end
- acc
- end.join('&')
- end
-
  end
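Reviewer note: the URL helpers removed here are still called by the Direct adapter (`build_url`), which now includes Solr::HTTPClient::Util, so they appear to have moved rather than disappeared; that file's contents are not shown in this section. The removed code uses Ruby 1.8 syntax (`{k.to_s, v}`). The sketch below restates the documented behaviour in current Ruby. It swaps the hand-rolled `escape` for the standard library's `URI.encode_www_form_component`, and it omits the "?" when no params remain, which is a deliberate difference from the removed `build_url`.

```ruby
require 'uri'

# Escapes one key/value pair for a query string.
def build_param(key, value)
  "#{URI.encode_www_form_component(key.to_s)}=#{URI.encode_www_form_component(value.to_s)}"
end

# Converts a hash into a query string: keys are sorted alphabetically,
# array values repeat the key, empty/nil values are skipped.
def hash_to_params(params)
  return unless params.is_a?(Hash)
  pairs = params.map { |k, v| [k.to_s, v] }.sort_by(&:first)
  pairs.flat_map do |key, value|
    values = value.is_a?(Array) ? value : [value]
    values.reject { |v| v.to_s.empty? }.map { |v| build_param(key, v) }
  end.join('&')
end

# Joins a path and its query string (assumption: no "?" when nothing is left).
def build_url(path, params = nil)
  query = hash_to_params(params)
  query.nil? || query.empty? ? path : "#{path}?#{query}"
end

puts hash_to_params({ :q => 'blah', 'facet.field' => ['location_facet', 'format_facet'] })
puts build_url('/select', { :q => 'a b', 'facet.field' => ['cat', '', nil] })
```

Because keys are sorted, `facet.field` is emitted before `q`, which differs from the order shown in the removed code comment.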
data/lib/solr/{adapter → connection/adapter}/direct.rb CHANGED
@@ -5,11 +5,12 @@ require 'java'
  #
  # Connection for JRuby + DirectSolrConnection
  #
- class Solr::Adapter::Direct
+ class Solr::Connection::Adapter::Direct
 
- include Solr::Adapter::CommonMethods
+ include Solr::HTTPClient::Util
+ include Solr::Connection::Adapter::CommonMethods
 
- attr_accessor :opts, :connection, :home_dir
+ attr_accessor :opts, :home_dir
 
  # required: opts[:home_dir] is absolute path to solr home (the directory with "data", "config" etc.)
  # opts must also contain either
@@ -19,31 +20,45 @@ class Solr::Adapter::Direct
  # OTHER OPTS:
  # :select_path => 'the/select/handler'
  # :update_path => 'the/update/handler'
- # If a block is given, the @connection instance (DirectSolrConnection) is yielded
  def initialize(opts, &block)
- @home_dir = opts[:home_dir]
- opts[:data_dir] ||= File.join(@home_dir , 'data')
+ @home_dir = opts[:home_dir].to_s
+ opts[:data_dir] ||= File.join(@home_dir, 'data')
  if opts[:dist_dir]
  # add the standard lib and dist directories to the :jar_paths
  opts[:jar_paths] = [File.join(opts[:dist_dir], 'lib'), File.join(opts[:dist_dir], 'dist')]
  end
  @opts = default_options.merge(opts)
- require_jars(@opts[:jar_paths])
- import_dependencies
- @connection = DirectSolrConnection.new(@home_dir, @opts[:data_dir], nil)
- yield @connection if block_given?
+ end
+
+ # loads/imports the java dependencies
+ # sets the @connection instance variable
+ def connection
+ @connection ||= (
+ require_jars(@opts[:jar_paths]) if @opts[:jar_paths]
+ import_dependencies
+ DirectSolrConnection.new(@home_dir, @opts[:data_dir], nil)
+ )
  end
 
  # send a request to the connection
  # request '/update', :wt=>:xml, '</commit>'
- def send_request(request_url_path, params={}, data=nil)
+ def send_request(path, params={}, data=nil)
  data = data.to_xml if data.respond_to?(:to_xml)
- full_path = build_url(request_url_path, params)
+ url = build_url(path, params)
  begin
- @connection.request(full_path, data)
+ body = connection.request(url, data)
  rescue
  raise Solr::RequestError.new($!.message)
  end
+ {
+ :status_code=>'',
+ :body=>body,
+ :url=>url,
+ :path=>path,
+ :params=>params,
+ :data=>data,
+ :headers=>{}
+ }
  end
 
  protected
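Reviewer note: the main behavioural change in this file is that the Java connection is no longer built in `initialize` but on first call to `#connection`, via `@connection ||= (...)`. A minimal sketch of that memoization pattern follows; `LazyAdapter`, `build_count` and the placeholder `Object.new` are hypothetical stand-ins, since DirectSolrConnection needs JRuby and the Solr jars.

```ruby
# Hypothetical stand-in for an adapter whose connection is expensive to build.
class LazyAdapter
  attr_reader :build_count

  def initialize(opts = {})
    @opts = opts
    @build_count = 0 # counts how often the connection is actually built
  end

  # Built on first use, then reused on every later call.
  def connection
    @connection ||= begin
      @build_count += 1
      Object.new # placeholder for DirectSolrConnection.new(...)
    end
  end
end

adapter = LazyAdapter.new
before  = adapter.build_count
first   = adapter.connection
second  = adapter.connection
puts before
puts first.equal?(second)
puts adapter.build_count
```

One consequence visible in the diff: constructing the adapter no longer loads jars or touches the Solr home directory, so configuration errors surface on the first request instead of at connect time.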
data/lib/solr/connection/adapter/http.rb ADDED
@@ -0,0 +1,51 @@
+ #
+ # Connection for standard HTTP Solr server
+ #
+ class Solr::Connection::Adapter::HTTP
+
+ class << self
+ attr_accessor :client_adapter
+ end
+
+ @client_adapter = :net_http
+
+ include Solr::Connection::Adapter::CommonMethods
+
+ attr_reader :opts
+
+ # opts can have:
+ # :url => 'http://localhost:8080/solr'
+ # :select_path => '/the/url/path/to/the/select/handler'
+ # :update_path => '/the/url/path/to/the/update/handler'
+ # :luke_path => '/admin/luke'
+ #
+ def initialize(opts={}, &block)
+ opts[:url]||='http://127.0.0.1:8983/solr'
+ @opts = default_options.merge(opts)
+ end
+
+ def connection
+ @connection ||= Solr::HTTPClient.connect(@opts[:url], self.class.client_adapter)
+ end
+
+ # send a request to the connection
+ # request '/update', :wt=>:xml, '</commit>'
+ def send_request(path, params={}, data=nil)
+ data = data.to_xml if data.respond_to?(:to_xml)
+ if data
+ http_context = connection.post(path, data, params, post_headers)
+ else
+ http_context = connection.get(path, params)
+ end
+ raise Solr::RequestError.new(http_context[:body]) unless http_context[:status_code] == 200
+ http_context
+ end
+
+ protected
+
+ # The standard post headers
+ def post_headers
+ {"Content-Type" => 'text/xml; charset=utf-8'}
+ end
+
+ end
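Reviewer note: `send_request` above POSTs when data is present, GETs otherwise, and raises unless the status code is 200. The flow can be exercised without a Solr server using a stub client. `StubClient`, `RequestError` and the free-standing `send_request` below are hypothetical stand-ins; only the method shapes (`get(path, params)`, `post(path, data, params, headers)`) and the context hash keys are taken from the diff.

```ruby
# Hypothetical stand-in for Solr::RequestError.
class RequestError < RuntimeError; end

# Hypothetical client: GET always succeeds, POST always fails,
# so both branches of the status check can be seen.
class StubClient
  def get(path, params = {})
    { :status_code => 200, :body => 'ok', :path => path, :params => params, :data => nil }
  end

  def post(path, data, params = {}, headers = {})
    { :status_code => 500, :body => 'boom', :path => path, :params => params,
      :data => data, :headers => headers }
  end
end

POST_HEADERS = { 'Content-Type' => 'text/xml; charset=utf-8' }.freeze

# Same dispatch as the adapter: POST when there is a body, GET otherwise.
def send_request(client, path, params = {}, data = nil)
  context = data ? client.post(path, data, params, POST_HEADERS) : client.get(path, params)
  raise RequestError, context[:body] unless context[:status_code] == 200
  context
end

client = StubClient.new
puts send_request(client, '/select', { :q => '*:*' })[:body]

begin
  send_request(client, '/update', {}, '<commit/>')
rescue RequestError => e
  puts "request failed: #{e.message}"
end
```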
data/lib/solr/connection/adapter.rb ADDED
@@ -0,0 +1,7 @@
+ module Solr::Connection::Adapter
+
+ autoload :Direct, 'solr/connection/adapter/direct'
+ autoload :HTTP, 'solr/connection/adapter/http'
+ autoload :CommonMethods, 'solr/connection/adapter/common_methods'
+
+ end
data/lib/solr/connection/search_ext.rb CHANGED
@@ -1,32 +1,29 @@
  module Solr::Connection::SearchExt
 
- def search(query, params={})
- if params[:fields].is_a?(Array)
- params[:fl] = params.delete(:fields).join(' ')
- else
- params[:fl] = params.delete :fields
- end
- fq = build_filters(params.delete(:filters)).join(' ') if params[:filters]
- if params[:fq] and fq
- params[:fq] += " AND #{fq}"
- else
- params[:fq] = fq
+ def search(q_param, params={})
+ if params[:fields]
+ fields = params.delete :fields
+ params[:fl] = fields.is_a?(Array) ? fields.join(' ') : fields
  end
+
+ params[:fq] = build_filters(params.delete(:filters)) if params[:filters]
  facets = params.delete(:facets) if params[:facets]
+
  if facets
  if facets.is_a?(Array)
- params << {:facet => true}
- params += build_facets(facets)
+ params.merge!({:facet => true})
+ params.merge! build_facets(facets)
  elsif facets.is_a?(Hash)
- params << {:facet => true}
- params += build_facet(facets)
+ params.mrege!({:facet => true})
+ #params += build_facet(facets)
  elsif facets.is_a?(String)
- params += facets
+ #params += facets
  else
  raise 'facets must either be a Hash or an Array'
  end
  end
- params[:qt] ||= :dismax
+ #params[:qt] ||= :dismax
+ params[:q] = build_query(q_param)
  self.query params
  end
 
@@ -81,10 +78,11 @@ module Solr::Connection::SearchExt
  end
  params
  end
-
+
  def build_facets(facet_array)
- facet_array.inject([]) do |params, facet_hash|
- params.push build_facet(facet_hash)
+ facet_array.inject({}) do |p, facet_hash|
+ build_facet(facet_hash).each {|k| p.merge!(k) }
+ p
  end
  end
 
@@ -96,10 +94,12 @@ module Solr::Connection::SearchExt
  if 'field' == k.to_s
  params << {"facet.field" => v}
  elsif 'query' == k.to_s
- q = build_query("facet.query", v)
- params << q
- elsif ['name', :name].include?(k.to_s)
- # do nothing
+ q = build_query(v)
+ params << {"facet.query"=>q}
+ if facet_name
+ # keep track of names => facet_queries
+ name_to_facet_query[facet_name] = q['facet.query']
+ end
  else
  params << {"f.#{facet_hash[:field]}.facet.#{k}" => v}
  end
@@ -107,4 +107,8 @@ module Solr::Connection::SearchExt
  params
  end
 
+ def name_to_facet_query
+ @name_to_facet_query ||= {}
+ end
+
  end
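Reviewer note: the new `build_facets` folds a list of single-key hashes into one hash with `merge!`. With a plain merge, two facets that both produce a "facet.field" key leave only the last value. Whether that is intended cannot be told from this diff, so the sketch below only illustrates the behaviour and shows one possible alternative that collects repeated keys into arrays (which the removed `hash_to_params` would then expand into repeated query params). `merge_collecting` is a hypothetical helper, not part of the gem.

```ruby
# Hypothetical helper: merges single-key hashes, collecting the values of
# repeated keys into an array instead of overwriting them.
def merge_collecting(pairs)
  pairs.each_with_object({}) do |pair, acc|
    pair.each do |key, value|
      acc[key] = acc.key?(key) ? Array(acc[key]) + Array(value) : value
    end
  end
end

pairs = [
  { 'facet.field' => 'cat' },
  { 'facet.field' => 'inStock' },
  { 'f.cat.facet.limit' => 5 }
]

# Plain merge, as in the diff: the second facet.field replaces the first.
plain = pairs.inject({}) { |acc, pair| acc.merge(pair) }
collected = merge_collecting(pairs)

puts plain['facet.field'].inspect
puts collected['facet.field'].inspect
```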