rsolr 1.0.0.beta → 1.0.0.beta2

data/README.rdoc CHANGED
@@ -1,11 +1,13 @@
1
1
  =RSolr
2
2
 
3
+ A simple, extensible Ruby client for Apache Solr.
4
+
3
5
  Notice: This document is only for the 1.0 (pre-release) in the master branch. The last stable gem release documentation can be found here: http://github.com/mwmitchell/rsolr/tree/v0.12.1
4
6
 
5
- A simple, extensible Ruby client for Apache Solr.
7
+ ==Documentation
8
+ The code docs can be viewed here: http://rdoc.info/projects/mwmitchell/rsolr
6
9
 
7
10
  == Installation:
8
- gem sources -a http://gemcutter.org
9
11
  sudo gem install rsolr
10
12
 
11
13
  == Example:
@@ -13,44 +15,50 @@ A simple, extensible Ruby client for Apache Solr.
13
15
  require 'rsolr'
14
16
 
15
17
  # Direct connection
16
- solr = RSolr.connect 'http://solrserver.com'
18
+ solr = RSolr.connect :url => 'http://solrserver.com'
17
19
 
18
20
  # Connecting over a proxy server
19
- solr = RSolr.connect 'http://solrserver.com', :proxy=>'http://user:pass@proxy.example.com:8080'
21
+ solr = RSolr.connect :url => 'http://solrserver.com', :proxy=>'http://user:pass@proxy.example.com:8080'
20
22
 
21
23
  # send a request to /select
22
- response = solr.get 'select', :q=>'*:*'
23
-
24
- # send a request to a custom request handler; /catalog
25
- response = solr.get 'catalog', :q=>'*:*'
24
+ response = solr.get 'select', :params => {:q => '*:*'}
26
25
 
26
+ # send a request to /catalog
27
+ response = solr.get 'catalog', :params => {:q => '*:*'}
28
+
27
29
  == Querying
28
- Use the #select method to send requests to the /select handler:
29
- response = solr.get('select', {
30
+ Use the #get / #post methods to send search requests to the /select handler:
31
+ response = solr.get 'select', :params => {
30
32
  :q=>'washington',
31
33
  :start=>0,
32
34
  :rows=>10
33
- })
35
+ }
34
36
 
35
- The params sent into the method are sent to Solr as-is. The one exception is if a value is an array. When an array is used, multiple parameters *with the same name* are generated for the Solr query. Example:
37
+ The :params sent into the method are sent to Solr as-is. When an array is used, multiple parameters *with the same name* are generated for the Solr query. Example:
36
38
 
37
- solr.get 'select', :q=>'roses', :fq=>['red', 'violet']
39
+ solr.get 'select', :params => {:q=>'roses', :fq=>['red', 'violet']}
38
40
 
39
41
  The above statement generates this Solr query:
40
42
 
41
43
  select?q=roses&fq=red&fq=violet
42
44
 
43
- There may be cases where the query string is too long for a GET request. RSolr solves this issue by providing a simple way to POST a query to Solr:
44
- response = solr.post "select", nil, enormous_params_hash
45
+ ===Method Missing
46
+ The RSolr::Client class also uses method_missing for setting the request handler/path:
47
+
48
+ solr.paintings :params => {:q=>'roses', :fq=>['red', 'violet']}
49
+
50
+ This is sent to Solr as:
51
+ paintings?q=roses&fq=red&fq=violet
45
52
 
46
- nil is passed in as the query string data. The enormous_params_hash variable ends up serialized as a form-encoded query string, and the correct content-type headers are sent along to Solr.
47
53
 
48
- == Updating Solr
49
- Updating us done using native Ruby objects. Hashes are used for single documents and arrays are used for a collection of documents (hashes). These objects get turned into simple XML "messages". Raw XML strings can also be used.
54
+ ===Using POST for Search Queries
55
+ There may be cases where the query string is too long for a GET request. RSolr solves this issue by converting hash objects into form-encoded strings:
56
+ response = solr.post "select", :data => enormous_params_hash
50
57
 
51
- Raw XML via #update
52
- solr.update '<commit/>'
53
- solr.update '<optimize/>'
58
+ The :data hash is serialized as a form-encoded query string, and the correct content-type headers are sent along to Solr.
59
+
60
+ == Updating Solr
61
+ Updating is done using native Ruby objects. Hashes are used for single documents and arrays are used for a collection of documents (hashes). These objects get turned into simple XML "messages". Raw XML strings can also be used.
54
62
 
55
63
  Single document via #add
56
64
  solr.add :id=>1, :price=>1.00
@@ -59,17 +67,28 @@ Multiple documents via #add
59
67
  documents = [{:id=>1, :price=>1.00}, {:id=>2, :price=>10.50}]
60
68
  solr.add documents
61
69
 
62
- When adding, you can also supply "add" xml element attributes and/or a block for manipulating other "add" related elements (docs and fields) when using the #add method:
70
+ The optional :add_attributes hash can also be used to set Solr "add" document attributes:
71
+ solr.add documents, :add_attributes => {:commitWithin => 10}
72
+
73
+ Raw XML via #update
74
+ solr.update :data => '<commit/>'
75
+ solr.update :data => '<optimize/>'
76
+
77
+ When adding, you can also supply "add" xml element attributes and/or a block for manipulating other "add" related elements (docs and fields) by calling the +xml+ method directly:
63
78
 
64
79
  doc = {:id=>1, :price=>1.00}
65
- add_attributes = {:allowDups=>false, :commitWithin=>10.0}
66
- solr.add(doc, add_attributes) do |doc|
80
+ add_attributes = {:allowDups=>false, :commitWithin=>10}
81
+ add_xml = solr.xml.add(doc, add_attributes) do |doc|
67
82
  # boost each document
68
83
  doc.attrs[:boost] = 1.5
69
84
  # boost the price field:
70
85
  doc.field_by_name(:price).attrs[:boost] = 2.0
71
86
  end
72
87
 
88
+ Now the "add_xml" object can be sent to Solr like:
89
+ solr.update :data => add_xml
90
+
91
+ ===Deleting
73
92
  Delete by id
74
93
  solr.delete_by_id 1
75
94
  or an array of ids
@@ -80,34 +99,26 @@ Delete by query:
80
99
  Delete by array of queries
81
100
  solr.delete_by_query ['price:1.00', 'price:10.00']
82
101
 
83
- Commit & optimize shortcuts
84
- solr.commit
85
- solr.optimize
102
+ ===Commit / Optimize
103
+ solr.commit :commit_attributes => {}
104
+ solr.optimize :optimize_attributes => {}
86
105
 
87
106
  == Response Formats
88
107
  The default response format is Ruby. When the :wt param is set to :ruby, the response is eval'd resulting in a Hash. You can get a raw response by setting the :wt to "ruby" - notice, the string -- not a symbol. RSolr will eval the Ruby string ONLY if the :wt value is :ruby. All other response formats are available as expected, :wt=>'xml' etc..
89
108
 
90
109
  ===Evaluated Ruby (default)
91
- solr.get 'select', :wt=>:ruby # notice :ruby is a Symbol
110
+ solr.get 'select', :params => {:wt => :ruby} # notice :ruby is a Symbol
92
111
  ===Raw Ruby
93
- solr.get 'select', :wt=>'ruby' # notice 'ruby' is a String
112
+ solr.get 'select', :params => {:wt => 'ruby'} # notice 'ruby' is a String
94
113
 
95
114
  ===XML:
96
- solr.get 'select', :wt=>:xml
115
+ solr.get 'select', :params => {:wt => :xml}
97
116
  ===JSON:
98
- solr.get 'select', :wt=>:json
99
-
100
- You can access the original request context (path, params, url etc.) by calling the #request method:
101
- result = solr.get 'select', :q=>'*:*'
102
- result.request[:uri]
103
- result.request[:params]
104
- etc..
117
+ solr.get 'select', :params => {:wt => :json}
105
118
 
106
- Similarly, the object returned has a response object. This contains any headers that Solr returned, along with the raw response body:
107
- result = solr.get 'select', :q=>'*:*'
108
- result.response[:headers]
109
- result.response[:status]
110
- result.response[:body]
119
+ ==Http Request Methods: +get+, +post+, and +head+
120
+ RSolr can send GET, POST and HEAD requests to Solr:
121
+ response = solr.head "admin"
111
122
 
112
123
  ==Related Resources & Projects
113
124
  * {RSolr Google Group}[http://groups.google.com/group/rsolr] -- The RSolr discussion group
@@ -139,6 +150,7 @@ Similarly, the object returned has a response object. This contains any headers
139
150
  * Send me a pull request. Bonus points for topic branches.
140
151
 
141
152
  ==Contributors
153
+ * Colin Steele
142
154
  * Lorenzo Riccucci
143
155
  * Mike Perham
144
156
  * Mat Brown
@@ -155,4 +167,4 @@ Matt Mitchell <mailto:goodieboy@gmail.com>
155
167
 
156
168
  ==Copyright
157
169
 
158
- Copyright (c) 2008-2010 mwmitchell. See LICENSE for details.
170
+ Copyright (c) 2008-2010 Matt Mitchell. See LICENSE for details.
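The README's Querying section states that an array value produces multiple parameters with the same name. The following is a minimal sketch of that expansion in plain Ruby. `params_to_query` is a hypothetical helper written for illustration only; it is not RSolr's serializer (the diff refers to `RSolr::Uri.params_to_solr`, whose implementation is not shown here).

```ruby
require 'cgi'

# Hypothetical helper (not part of RSolr): turns a params hash into a
# query string, repeating the key once per element when a value is an array.
def params_to_query(params)
  params.flat_map do |key, value|
    Array(value).map { |v| "#{CGI.escape(key.to_s)}=#{CGI.escape(v.to_s)}" }
  end.join('&')
end

puts params_to_query(:q => 'roses', :fq => ['red', 'violet'])
```

For the README's example params this yields the same `q=roses&fq=red&fq=violet` string shown there. A Hash passed as :data for a POST would presumably be serialized along similar lines, but that is an assumption.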
data/VERSION CHANGED
@@ -1 +1 @@
1
- 1.0.0.beta
1
+ 1.0.0.beta2
data/lib/rsolr/client.rb CHANGED
@@ -6,43 +6,55 @@ class RSolr::Client
6
6
  @connection = connection
7
7
  end
8
8
 
9
- # GET request
10
- def get path = '', params = nil, headers = nil
11
- send_request :get, path, params, nil, headers
12
- end
13
-
14
- # essentially a GET, but no response body
15
- def head path = '', params = nil, headers = nil
16
- send_request :head, path, params, nil, headers
9
+ %W(get post head).each do |meth|
10
+ class_eval <<-RUBY
11
+ def #{meth} path, opts = {}, &block
12
+ send_request path, opts.merge(:method => :#{meth}), &block
13
+ end
14
+ RUBY
17
15
  end
18
16
 
19
- # A path is required for a POST since, well...
20
- # the / resource doesn't do anything with a POST.
21
- # Also, Solr doesn't do headers with a POST
22
- def post path, data = nil, params = nil, headers = nil
23
- send_request :post, path, params, data, headers
17
+ def method_missing name, *args
18
+ send_request name, *args
24
19
  end
25
20
 
26
21
  # POST XML messages to /update with optional params
27
- def update data, params = {}, headers = {}
28
- headers['Content-Type'] ||= 'text/xml'
29
- post 'update', data, params, headers
22
+ #
23
+ # If not set, opts[:headers] will be set to a hash with the key
24
+ # 'Content-Type' set to 'text/xml'
25
+ #
26
+ # +opts+ can/should contain:
27
+ #
28
+ # :data - posted data
29
+ # :headers - http headers
30
+ # :params - query parameter hash
31
+ #
32
+ def update opts = {}
33
+ opts[:headers] ||= {}
34
+ opts[:headers]['Content-Type'] ||= 'text/xml'
35
+ post 'update', opts
30
36
  end
31
37
 
32
38
  #
39
+ # +add+ creates xml "add" documents and sends the xml data to the +update+ method
33
40
  # single record:
34
41
  # solr.add(:id=>1, :name=>'one')
35
42
  #
36
43
  # update using an array
37
- # solr.update([{:id=>1, :name=>'one'}, {:id=>2, :name=>'two'}])
38
- #
39
- def add(doc, params={}, &block)
40
- update xml.add(doc, params, &block)
44
+ #
45
+ # solr.add(
46
+ # [{:id=>1, :name=>'one'}, {:id=>2, :name=>'two'}],
47
+ # :add_attributes => {:boost=>5.0, :commitWithin=>10}
48
+ # )
49
+ #
50
+ def add doc, opts = {}
51
+ add_attributes = opts.delete :add_attributes
52
+ update opts.merge(:data => xml.add(doc, add_attributes))
41
53
  end
42
54
 
43
- # send "commit" xml with options
55
+ # send "commit" xml with opts
44
56
  #
45
- # Options recognized by solr
57
+ # opts recognized by solr
46
58
  #
47
59
  # :maxSegments => N - optimizes down to at most N number of segments
48
60
  # :waitFlush => true|false - do not return until changes are flushed to disk
@@ -51,13 +63,14 @@ class RSolr::Client
51
63
  #
52
64
  # *NOTE* :expungeDeletes is Solr 1.4 only
53
65
  #
54
- def commit( options = {} )
55
- update xml.commit( options )
66
+ def commit opts = {}
67
+ commit_attrs = opts.delete :commit_attributes
68
+ update opts.merge(:data => xml.commit( commit_attrs ))
56
69
  end
57
70
 
58
- # send "optimize" xml with options.
71
+ # send "optimize" xml with opts.
59
72
  #
60
- # Options recognized by solr
73
+ # opts recognized by solr
61
74
  #
62
75
  # :maxSegments => N - optimizes down to at most N number of segments
63
76
  # :waitFlush => true|false - do not return until changes are flushed to disk
@@ -66,28 +79,29 @@ class RSolr::Client
66
79
  #
67
80
  # *NOTE* :expungeDeletes is Solr 1.4 only
68
81
  #
69
- def optimize( options = {} )
70
- update xml.optimize( options )
82
+ def optimize opts = {}
83
+ optimize_attributes = opts.delete :optimize_attributes
84
+ update opts.merge(:data => xml.optimize(optimize_attributes))
71
85
  end
72
-
86
+
73
87
  # send <rollback/>
74
88
  # NOTE: solr 1.4 only
75
- def rollback
76
- update xml.rollback
89
+ def rollback opts = {}
90
+ update opts.merge(:data => xml.rollback)
77
91
  end
78
92
 
79
93
  # Delete one or many documents by id
80
94
  # solr.delete_by_id 10
81
95
  # solr.delete_by_id([12, 41, 199])
82
- def delete_by_id(id)
83
- update xml.delete_by_id(id)
96
+ def delete_by_id id, opts = {}
97
+ update opts.merge(:data => xml.delete_by_id(id))
84
98
  end
85
99
 
86
100
  # delete one or many documents by query
87
101
  # solr.delete_by_query 'available:0'
88
102
  # solr.delete_by_query ['quantity:0', 'manu:"FQ"']
89
- def delete_by_query(query)
90
- update xml.delete_by_query(query)
103
+ def delete_by_query query, opts = {}
104
+ update opts.merge(:data => xml.delete_by_query(query))
91
105
  end
92
106
 
93
107
  # shortcut to RSolr::Xml::Generator
@@ -95,62 +109,33 @@ class RSolr::Client
95
109
  @xml ||= RSolr::Xml::Generator.new
96
110
  end
97
111
 
98
- def send_request method, path, params, data, headers
99
- params = map_params params
100
- uri, data, headers = build_request path, params, data, headers
101
- request_context = {:connection=>connection, :method => method, :uri => uri, :data => data, :headers => headers, :params => params}
102
- begin
103
- response = data ? connection.send(method, uri, data, headers) : connection.send(method, uri, headers)
104
- rescue
105
- $!.extend(RSolr::Error::SolrContext).request = request_context
106
- raise $!
107
- end
108
- raise "The connection adapter returned an unexpected object" unless response.is_a?(Hash)
109
- raise RSolr::Error::Http.new request_context, response unless [200,302].include?(response[:status])
110
- adapt_response request_context, response
111
- end
112
-
113
- def map_params params
114
- params = params.nil? ? {} : params.dup
115
- params[:wt] ||= :ruby
116
- params
112
+ # +send_request+ is the main request method.
113
+ #
114
+ # "path" : A string value that usually represents a solr request handler
115
+ # "opts" : A hash, which can contain the following keys:
116
+ # :method : required - the http method (:get, :post or :head)
117
+ # :params : optional - the query string params in hash form
118
+ # :data : optional - post data -- if a hash is given, it's sent as "application/x-www-form-urlencoded"
119
+ # :headers : optional - hash of request headers
120
+ # All other options are passed right along to the connection request method (:get, :post, or :head)
121
+ #
122
+ # +send_request+ returns either a string or hash on a successful ruby request.
123
+ # When the :params[:wt] => :ruby, the response will be a hash, else a string.
124
+ #
125
+ def send_request path, opts = {}
126
+ connection.send_request path, opts
117
127
  end
118
128
 
119
- def build_request path, params, data, headers
120
- params ||= {}
121
- headers ||= {}
122
- request_uri = params.any? ? "#{path}?#{RSolr::Uri.params_to_solr params}" : path
123
- if data
124
- if data.is_a? Hash
125
- data = RSolr::Uri.params_to_solr data
126
- headers['Content-Type'] ||= 'application/x-www-form-urlencoded'
127
- end
128
- end
129
- [request_uri, data, headers]
129
+ # used for debugging/inspection
130
+ # - accepts the same args as send_request
131
+ def build_request path, opts = {}
132
+ connection.build_request path, opts
130
133
  end
131
134
 
132
- # This method will evaluate the :body value
133
- # if the params[:uri].params[:wt] == :ruby
134
- # ... otherwise, the body is returned as is.
135
- # The return object has a special method attached called #context.
136
- # This method gives you access to the original
137
- # request and response from the connection.
138
- # This method will raise an InvalidRubyResponse
139
- # if the :wt => :ruby and the body
140
- # couldn't be evaluated.
141
- def adapt_response request, response
142
- data = response[:body]
143
- if request[:params][:wt] == :ruby
144
- begin
145
- data = Kernel.eval data.to_s
146
- rescue SyntaxError
147
- raise RSolr::Error::InvalidRubyResponse.new request, response
148
- end
149
- end
150
- data.extend Module.new.instance_eval{attr_accessor :request, :response; self}
151
- data.request = request
152
- data.response = response
153
- data
135
+ # used for debugging/inspection
136
+ # - accepts a request context hash and a raw response hash
137
+ def adapt_response request_context, response
138
+ connection.adapt_response request_context, response
154
139
  end
155
140
 
156
141
  end
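The client.rb changes above define `get`, `post` and `head` in a single loop via `class_eval`, and route any other method name through `method_missing` so that the name becomes the request path. The sketch below reproduces both dispatch techniques in a toy setting. `ToyClient` and `RecordingConnection` are hypothetical stand-ins invented for this example; they are not RSolr classes, and no HTTP request is made.

```ruby
# Hypothetical stand-in for a connection: it only records what it is asked to send.
class RecordingConnection
  attr_reader :calls

  def initialize
    @calls = []
  end

  def send_request(path, opts)
    @calls << [path, opts]
    :ok
  end
end

# Toy client (not RSolr::Client) showing the two dispatch techniques.
class ToyClient
  def initialize(connection)
    @connection = connection
  end

  # Define #get, #post and #head in one loop; each merges its own
  # HTTP verb into the options hash before delegating.
  %w(get post head).each do |verb|
    class_eval <<-RUBY
      def #{verb}(path, opts = {})
        send_request path, opts.merge(:method => :#{verb})
      end
    RUBY
  end

  def send_request(path, opts = {})
    @connection.send_request path, opts
  end

  # Any other method name is treated as the request path.
  def method_missing(name, *args)
    send_request name, *args
  end
end

conn = RecordingConnection.new
client = ToyClient.new(conn)
client.get 'select', :params => {:q => '*:*'}
client.paintings :params => {:q => 'roses'}
p conn.calls
```

In this sketch the `paintings` call reaches the connection with a Symbol path and no :method key. In the diff, `build_request` fills in `:get` when :method is absent, which is why the README's method_missing example results in a GET.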
@@ -0,0 +1,95 @@
1
+ # Connectable is designed to be shared across solr driver implementations.
2
+ # If the driver uses an http url/proxy and returns the standard http response
3
+ # data (status, body, headers) then this module could be used.
4
+ module RSolr::Connectable
5
+
6
+ attr_reader :uri, :proxy, :options
7
+
8
+ def initialize options = {}
9
+ url = options[:url] || 'http://127.0.0.1:8983/solr/'
10
+ url << "/" unless url[-1] == ?/
11
+ proxy_url = options[:proxy]
12
+ proxy_url << "/" unless proxy_url.nil? or proxy_url[-1] == ?/
13
+ @uri = RSolr::Uri.create url
14
+ @proxy = RSolr::Uri.create proxy_url if proxy_url
15
+ @options = options
16
+ end
17
+
18
+ #
19
+ def base_request_uri
20
+ base_uri.request_uri
21
+ end
22
+
23
+ def base_uri
24
+ @proxy || @uri
25
+ end
26
+
27
+ # creates a request context hash,
28
+ # sends it to the connection.execute method
29
+ # which returns a simple hash,
30
+ # then passes the request/response into adapt_response.
31
+ def send_request path, opts
32
+ request_context = build_request path, opts
33
+ raw_response = execute request_context
34
+ adapt_response request_context, raw_response
35
+ end
36
+
37
+ # all connection implementations that use this mixin need to create an execute method
38
+ def execute request_context
39
+ raise "You gotta implement this method and return a hash like => {:status => <integer>, :body => <string>, :headers => <hash>}"
40
+ end
41
+
42
+ # build_request sets up the uri/query string
43
+ # and converts the +data+ arg to form-urlencoded
44
+ # if the +data+ arg is a hash.
45
+ # returns a hash with the following keys:
46
+ # :method
47
+ # :params
48
+ # :headers
49
+ # :data
50
+ # :uri
51
+ # :path
52
+ # :query
53
+ def build_request path, opts
54
+ opts[:method] ||= :get
55
+ raise "The :data option can only be used if :method => :post" if opts[:method] != :post and opts[:data]
56
+ opts[:params] = opts[:params].nil? ? {:wt => :ruby} : {:wt => :ruby}.merge(opts[:params])
57
+ query = RSolr::Uri.params_to_solr(opts[:params]) unless opts[:params].empty?
58
+ opts[:query] = query
59
+ if opts[:data].is_a? Hash
60
+ opts[:data] = RSolr::Uri.params_to_solr opts[:data]
61
+ opts[:headers] ||= {}
62
+ opts[:headers]['Content-Type'] ||= 'application/x-www-form-urlencoded'
63
+ end
64
+ opts[:path] = path
65
+ opts[:uri] = base_uri.merge(path.to_s + (query ? "?#{query}" : "")) if base_uri
66
+ opts
67
+ end
68
+
69
+ # This method will evaluate the :body value
70
+ # if request[:params][:wt] == :ruby
71
+ # ... otherwise, the body is returned as is.
72
+ # The return object has methods attached, :request and :response.
73
+ # These methods give you access to the original
74
+ # request and response from the connection.
75
+ #
76
+ # +adapt_response+ will raise an InvalidRubyResponse
77
+ # if :wt == :ruby and the body
78
+ # couldn't be evaluated.
79
+ def adapt_response request, response
80
+ raise "The response does not have the correct keys => :body, :headers, :status" unless
81
+ %W(body headers status) == response.keys.map{|k|k.to_s}.sort
82
+ raise RSolr::Error::Http.new request, response unless
83
+ [200,302].include? response[:status]
84
+ data = response[:body]
85
+ if request[:params][:wt] == :ruby
86
+ begin
87
+ data = Kernel.eval data.to_s
88
+ rescue SyntaxError
89
+ raise RSolr::Error::InvalidRubyResponse.new request, response
90
+ end
91
+ end
92
+ data
93
+ end
94
+
95
+ end
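`build_request` fills in `:wt => :ruby` on the params, while the README says a caller can pass a different :wt (for example the String 'ruby', or :xml) to receive the raw body. Which side of `Hash#merge` holds the default decides whether a caller-supplied value survives, because on duplicate keys the argument's value wins. A small sketch in plain Ruby follows; `with_default_wt` is a hypothetical helper, not an RSolr method.

```ruby
# Hypothetical helper (not part of RSolr): applies :wt => :ruby only when
# the caller has not chosen a response writer.
def with_default_wt(params)
  { :wt => :ruby }.merge(params || {})
end

p with_default_wt(nil)
p with_default_wt(:q => '*:*', :wt => 'xml')

# For comparison, passing the default as the argument overrides the caller's choice:
p({ :q => '*:*', :wt => 'xml' }.merge(:wt => :ruby))
```

Keeping the default as the receiver and the caller's params as the argument is the ordering that preserves an explicit :wt.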
data/lib/rsolr/error.rb CHANGED
@@ -6,26 +6,17 @@ module RSolr::Error
6
6
 
7
7
  def to_s
8
8
  m = "#{super.to_s}"
9
-
10
9
  if response
11
10
  m << " - #{response[:status]} #{Http::STATUS_CODES[response[:status].to_i]}"
12
11
  details = parse_solr_error_response response[:body]
13
- m << "Error: #{details}\n" if details
14
- end
15
-
16
- m << "\n" + self.backtrace[0..10].join("\n")
17
- m << "\n\nSolr Request:"
18
- m << "\n Method: #{request[:method].to_s.upcase}"
19
- m << "\n Base URL: #{request[:connection].uri.to_s}"
20
- m << "\n URL: #{request[:uri]}"
21
- m << "\n Params: #{request[:params].inspect}"
22
- m << "\n Data: #{request[:data].inspect}" if request[:data]
23
- m << "\n Headers: #{request[:headers].inspect}"
24
- if response
25
- m << "\n\nSolr Response:"
26
- m << "\n Code: #{response[:status]}"
27
- m << "\n Headers: #{response[:headers].inspect}"
12
+ m << "\nError: #{details}\n" if details
28
13
  end
14
+ p = "\nQuery: #{request[:path]}?#{request[:query]}"
15
+ p << "\nRequest Headers: #{request[:headers].inspect}" if request[:headers]
16
+ p << "\nRequest Data: #{request[:data].inspect}" if request[:data]
17
+ p << "\n"
18
+ p << "\nBacktrace: " + self.backtrace[0..10].join("\n")
19
+ m << p
29
20
  m
30
21
  end
31
22