backupify-rsolr-nokogiri 0.12.1.1

data/LICENSE ADDED
@@ -0,0 +1,13 @@
1
+ Copyright 2008-2010 Matt Mitchell
2
+
3
+ Licensed under the Apache License, Version 2.0 (the "License");
4
+ you may not use this file except in compliance with the License.
5
+ You may obtain a copy of the License at
6
+
7
+ http://www.apache.org/licenses/LICENSE-2.0
8
+
9
+ Unless required by applicable law or agreed to in writing, software
10
+ distributed under the License is distributed on an "AS IS" BASIS,
11
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12
+ See the License for the specific language governing permissions and
13
+ limitations under the License.
@@ -0,0 +1,129 @@
1
+ =RSolr
2
+
3
+ A simple, extensible Ruby client for Apache Solr.
4
+
5
+ == Installation:
6
+ gem sources -a http://gemcutter.org
7
+ sudo gem install rsolr
8
+
9
+ == Example:
10
+ require 'rubygems'
11
+ require 'rsolr'
12
+ solr = RSolr.connect :url=>'http://solrserver.com'
13
+
14
+ # send a request to /select
15
+ response = solr.select :q=>'*:*'
16
+
17
+ # send a request to a custom request handler; /catalog
18
+ response = solr.request '/catalog', :q=>'*:*'
19
+
20
+ # alternative to above:
21
+ response = solr.catalog :q=>'*:*'
22
+
23
+ == Querying
24
+ Use the #select method to send requests to the /select handler:
25
+ response = solr.select({
26
+ :q=>'washington',
27
+ :start=>0,
28
+ :rows=>10
29
+ })
30
+
31
+ The params sent into the method are sent to Solr as-is. The one exception is if a value is an array. When an array is used, multiple parameters *with the same name* are generated for the Solr query. Example:
32
+
33
+ solr.select :q=>'roses', :fq=>['red', 'violet']
34
+
35
+ The above statement generates this Solr query:
36
+
37
+ ?q=roses&fq=red&fq=violet
38
+
39
+ Use the #request method for a custom request handler path:
40
+ response = solr.request '/documents', :q=>'test'
41
+
42
+ As a shortcut for the above example, use a method call instead:
43
+ response = solr.documents :q=>'test'
44
+
45
+
46
+ == Updating Solr
47
+ Updating uses native Ruby structures. Hashes are used for single documents and arrays are used for a collection of documents (hashes). These structures get turned into simple XML "messages". Raw XML strings can also be used.
48
+
49
+ Raw XML via #update
50
+ solr.update '<commit/>'
51
+ solr.update '<optimize/>'
52
+
53
+ Single document via #add
54
+ solr.add :id=>1, :price=>1.00
55
+
56
+ Multiple documents via #add
57
+ documents = [{:id=>1, :price=>1.00}, {:id=>2, :price=>10.50}]
58
+ solr.add documents
59
+
60
+ When adding, you can also supply "add" xml element attributes and/or a block for manipulating other "add" related elements (docs and fields) when using the #add method:
61
+
62
+ doc = {:id=>1, :price=>1.00}
63
+ add_attributes = {:allowDups=>false, :commitWithin=>10.0}
64
+ solr.add(doc, add_attributes) do |doc|
65
+ # boost each document
66
+ doc.attrs[:boost] = 1.5
67
+ # boost the price field:
68
+ doc.field_by_name(:price).attrs[:boost] = 2.0
69
+ end
70
+
71
+ Delete by id
72
+ solr.delete_by_id 1
73
+ or an array of ids
74
+ solr.delete_by_id [1, 2, 3, 4]
75
+
76
+ Delete by query:
77
+ solr.delete_by_query 'price:1.00'
78
+ Delete by array of queries
79
+ solr.delete_by_query ['price:1.00', 'price:10.00']
80
+
81
+ Commit & optimize shortcuts
82
+ solr.commit
83
+ solr.optimize
84
+
85
+ == Response Formats
86
+ The default response format is Ruby. When the :wt param is set to :ruby (a Symbol), the response is eval'd, resulting in a Hash. To get the raw, unevaluated response, set :wt to "ruby" (a String, not a Symbol). RSolr evals the Ruby string ONLY if the :wt value is the Symbol :ruby. All other response formats are returned as raw strings, e.g. :wt=>'xml'.
87
+
88
+ ===Evaluated Ruby (default)
89
+ solr.select(:wt=>:ruby) # notice :ruby is a Symbol
90
+ ===Raw Ruby
91
+ solr.select(:wt=>'ruby') # notice 'ruby' is a String
92
+
93
+ ===XML:
94
+ solr.select(:wt=>:xml)
95
+ ===JSON:
96
+ solr.select(:wt=>:json)
97
+
98
+ You can access the original request context (path, params, url etc.) by calling the #raw method:
99
+ response = solr.select :q=>'*:*'
100
+ response.raw[:status_code]
101
+ response.raw[:body]
102
+ response.raw[:url]
103
+
104
+ The #raw value is a hash containing the generated params, url, path, post data, headers etc., which is very useful for debugging and testing.
105
+
106
+ ==Related Resources & Projects
107
+ * {RSolr Google Group}[http://groups.google.com/group/rsolr] -- The RSolr discussion group
108
+ * {rsolr-ext}[http://github.com/mwmitchell/rsolr-ext] -- An extension kit for RSolr
109
+ * {rsolr-direct}[http://github.com/mwmitchell/rsolr-direct] -- JRuby direct connection for RSolr
110
+ * {SunSpot}[http://github.com/outoftime/sunspot] -- An awesome Solr DSL, built with RSolr
111
+ * {Blacklight}[http://blacklightopac.org] -- A "next generation" Library OPAC, built with RSolr
112
+ * {java_bin}[http://github.com/kennyj/java_bin] -- Provides javabin/binary parsing for RSolr
113
+ * {Solr}[http://lucene.apache.org/solr/] -- The Apache Solr project
114
+ * {solr-ruby}[http://wiki.apache.org/solr/solr-ruby] -- The original Solr Ruby Gem!
115
+
116
+ == Note on Patches/Pull Requests
117
+ * Fork the project.
118
+ * Make your feature addition or bug fix.
119
+ * Add tests for it. This is important so I don't break it in a future version unintentionally.
120
+ * Commit, do not mess with rakefile, version, or history
121
+ (if you want to have your own version, that is fine but bump version in a commit by itself I can ignore when I pull)
122
+ * Send me a pull request. Bonus points for topic branches.
123
+
124
+ ==Contributors
125
+ * mperham
126
+ * Mat Brown
127
+ * shairontoledo
128
+ * Matthew Rudy
129
+ * Fouad Mardini
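The README's "Updating Solr" section says that hashes and arrays of hashes are turned into simple XML messages. The message generator itself is not shown in this diff, so the following is only a minimal sketch of the idea, assuming Solr's standard `<add><doc><field>` update format. `build_add_message` is a hypothetical helper, not part of RSolr's API.

```ruby
require 'cgi'

# Hypothetical helper: turns one document (Hash) or many (Array of Hashes)
# into an XML "add" message. Not RSolr's actual generator.
def build_add_message(docs)
  docs = [docs] if docs.is_a?(Hash)
  body = docs.map { |doc|
    fields = doc.map { |name, value|
      # Escape names and values so characters like < and & stay valid XML.
      %(<field name="#{CGI.escapeHTML(name.to_s)}">#{CGI.escapeHTML(value.to_s)}</field>)
    }.join
    "<doc>#{fields}</doc>"
  }.join
  "<add>#{body}</add>"
end

puts build_add_message(:id => 1, :price => 1.00)
# prints <add><doc><field name="id">1</field><field name="price">1.0</field></doc></add>
```

A string like this could then be passed to `solr.update`, which the README describes as accepting raw XML.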
@@ -0,0 +1,14 @@
1
+ require 'rake'
2
+ require 'rake/testtask'
3
+ require 'rdoc/task'
4
+ require 'rubygems/package_task'
5
+
6
+ ENV['RUBYOPT'] = '-W1'
7
+
8
+ task :environment do
9
+ require File.dirname(__FILE__) + '/lib/rsolr'
10
+ end
11
+
12
+ Dir['tasks/**/*.rake'].each { |t| load t }
13
+
14
+ task :default => ['spec:api']
data/VERSION ADDED
@@ -0,0 +1 @@
1
+ 0.12.1.1
@@ -0,0 +1,50 @@
1
+
2
+ require 'rubygems'
3
+ $:.unshift File.dirname(__FILE__) unless $:.include?(File.dirname(__FILE__))
4
+
5
+ module RSolr
6
+
7
+ def self.version
8
+ @version ||= File.read(File.join(File.dirname(__FILE__), '..', 'VERSION'))
9
+ end
10
+
11
+ VERSION = self.version
12
+
13
+ autoload :Message, 'rsolr/message'
14
+ autoload :Client, 'rsolr/client'
15
+ autoload :Connection, 'rsolr/connection'
16
+
17
+ module Connectable
18
+
19
+ def connect opts={}
20
+ Client.new Connection::NetHttp.new(opts)
21
+ end
22
+
23
+ end
24
+
25
+ extend Connectable
26
+
27
+ # A module that contains string related methods
28
+ module Char
29
+
30
+ # escape - from the solr-ruby library
31
+ # RSolr.escape('asdf')
32
+ # backslash everything that isn't a word character
33
+ def escape(value)
34
+ value.gsub(/(\W)/, '\\\\\1')
35
+ end
36
+
37
+ end
38
+
39
+ # send the escape method into the Client class ->
40
+ # solr = RSolr.connect
41
+ # solr.escape('asdf')
42
+ RSolr::Client.send(:include, Char)
43
+
44
+ # bring escape into this module (RSolr) -> RSolr.escape('asdf')
45
+ extend Char
46
+
47
+ # RequestError is a common/generic exception class used by the adapters
48
+ class RequestError < RuntimeError; end
49
+
50
+ end
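The `escape` method in the `Char` module prefixes every non-word character with a backslash. A minimal stand-alone sketch of that substitution follows; `solr_escape` is a hypothetical name used here so the snippet runs without the gem, and the query string is made up for illustration.

```ruby
# Stand-alone copy of the substitution used by Char#escape.
# The name solr_escape is an assumption made for this example.
def solr_escape(value)
  # In a single-quoted literal, '\\\\\1' is the replacement pattern
  # "literal backslash, then capture group 1".
  value.gsub(/(\W)/, '\\\\\1')
end

puts solr_escape('title:(ruby solr)')
# prints title\:\(ruby\ solr\)
```

Word characters (letters, digits and underscore) pass through unchanged, so `solr_escape('asdf')` returns `'asdf'`.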
@@ -0,0 +1,114 @@
1
+ class RSolr::Client
2
+
3
+ attr_reader :connection
4
+
5
+ # "connection" is instance of:
6
+ # RSolr::Connection::NetHttp
7
+ # RSolr::Adapter::Direct (jRuby only)
8
+ # or any other class that uses the connection "interface"
9
+ def initialize(connection)
10
+ @connection = connection
11
+ end
12
+
13
+ # Send a request to a request handler using the method name.
14
+ # For example, solr.catalog(:q=>'*:*') sends a request to /catalog.
15
+ def method_missing(method_name, *args, &blk)
16
+ request("/#{method_name}", *args, &blk)
17
+ end
18
+
19
+ # sends data to the update handler
20
+ # data can be a string of xml, or an object that returns xml from its #to_xml method
21
+ def update(data, params={})
22
+ request '/update', params, data
23
+ end
24
+
25
+ # send a request to solr
26
+ # params is hash with valid solr request params (:q, :fl, :qf etc..)
27
+ # if params[:wt] is not set, the default is :ruby
28
+ # if :wt is something other than :ruby, the raw response body is used
29
+ # otherwise, a simple Hash is returned
30
+ # NOTE: to get raw ruby, use :wt=>'ruby' <- a string, not a symbol like :ruby
31
+ #
32
+ #
33
+ def request(path, params={}, *extra)
34
+ response = @connection.request(path, map_params(params), *extra)
35
+ adapt_response(response)
36
+ end
37
+
38
+ #
39
+ # single record:
40
+ # solr.add(:id=>1, :name=>'one')
41
+ #
42
+ # add using an array
43
+ # solr.add([{:id=>1, :name=>'one'}, {:id=>2, :name=>'two'}])
44
+ #
45
+ def add(doc, &block)
46
+ update message.add(doc, &block)
47
+ end
48
+
49
+ # send <commit/>
50
+ def commit
51
+ update message.commit
52
+ end
53
+
54
+ # send <optimize/>
55
+ def optimize
56
+ update message.optimize
57
+ end
58
+
59
+ # send <rollback/>
60
+ # NOTE: solr 1.4 only
61
+ def rollback
62
+ update message.rollback
63
+ end
64
+
65
+ # Delete one or many documents by id
66
+ # solr.delete_by_id 10
67
+ # solr.delete_by_id([12, 41, 199])
68
+ def delete_by_id(id)
69
+ update message.delete_by_id(id)
70
+ end
71
+
72
+ # delete one or many documents by query
73
+ # solr.delete_by_query 'available:0'
74
+ # solr.delete_by_query ['quantity:0', 'manu:"FQ"']
75
+ def delete_by_query(query)
76
+ update message.delete_by_query(query)
77
+ end
78
+
79
+ # shortcut to RSolr::Message::Generator
80
+ def message *opts
81
+ @message ||= RSolr::Message::Generator.new
82
+ end
83
+
84
+ protected
85
+
86
+ # sets default params etc.. - could be used as a mapping hook
87
+ # type of request should be passed in here? -> map_params(:query, {})
88
+ def map_params(params)
89
+ params||={}
90
+ {:wt=>:ruby}.merge(params)
91
+ end
92
+
93
+ # "connection_response" must be a hash with the following keys:
94
+ # :params - a sub hash of standard solr params
95
+ # :body - the raw response body from the solr server
96
+ # This method will evaluate the :body value if the params[:wt] == :ruby
97
+ # otherwise, the body is returned
98
+ # The return object has a special method attached called #raw
99
+ # This method gives you access to the original response from the connection,
100
+ # so you can access things like the actual :url sent to solr,
101
+ # the raw :body, original :params and original :data
102
+ def adapt_response(connection_response)
103
+ data = connection_response[:body]
104
+ # if the wt is :ruby, evaluate the ruby string response
105
+ if connection_response[:params][:wt] == :ruby
106
+ data = Kernel.eval(data)
107
+ end
108
+ # attach a method called #raw that returns the original connection response value
109
+ def data.raw; @raw end
110
+ data.send(:instance_variable_set, '@raw', connection_response)
111
+ data
112
+ end
113
+
114
+ end
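The comments on `adapt_response` describe two behaviours: the body is evaluated only when `:wt` is the Symbol `:ruby`, and the returned object gains a `#raw` method exposing the original connection response. The sketch below imitates that logic under stated assumptions: `adapt` is a hypothetical name, the response body is invented, and `define_singleton_method` is swapped in for the `def data.raw` plus instance-variable approach used in the source.

```ruby
# Hypothetical stand-in for Client#adapt_response.
def adapt(connection_response)
  data = connection_response[:body]
  # Only the Symbol :ruby triggers evaluation; the String 'ruby' does not.
  data = Kernel.eval(data) if connection_response[:params][:wt] == :ruby
  # Expose the untouched connection response through a singleton #raw method.
  data.define_singleton_method(:raw) { connection_response }
  data
end

# Invented body in the style of a Ruby-format response.
body = "{'response'=>{'numFound'=>2}}"

evaluated = adapt(:body => body.dup, :params => {:wt => :ruby})
raw_text  = adapt(:body => body.dup, :params => {:wt => 'ruby'})

puts evaluated['response']['numFound']  # prints 2
puts raw_text.class                     # prints String
puts evaluated.raw[:params][:wt].inspect
```

Because `eval` executes whatever the server returns, this pattern is only reasonable when the Solr server is trusted.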
@@ -0,0 +1,9 @@
1
+ require 'uri'
2
+
3
+ module RSolr::Connection
4
+
5
+ autoload :NetHttp, 'rsolr/connection/net_http'
6
+ autoload :Utils, 'rsolr/connection/utils'
7
+ autoload :Requestable, 'rsolr/connection/requestable'
8
+
9
+ end
@@ -0,0 +1,48 @@
1
+ require 'net/http'
2
+
3
+ #
4
+ # Connection for standard HTTP Solr server
5
+ #
6
+ class RSolr::Connection::NetHttp
7
+
8
+ include RSolr::Connection::Requestable
9
+
10
+ def connection
11
+ @connection ||= Net::HTTP.new(@uri.host, @uri.port)
12
+ end
13
+
14
+ def get path, params={}
15
+ url = self.build_url path, params
16
+ net_http_response = self.connection.get url
17
+ create_http_context net_http_response, url, path, params
18
+ end
19
+
20
+ def post path, data, params={}, headers={}
21
+ url = self.build_url path, params
22
+ net_http_response = self.connection.post url, data, headers
23
+ create_http_context net_http_response, url, path, params, data, headers
24
+ end
25
+
26
+ def create_http_context net_http_response, url, path, params, data=nil, headers={}
27
+ full_url = "#{@uri.scheme}://#{@uri.host}"
28
+ full_url += @uri.port ? ":#{@uri.port}" : ''
29
+ full_url += url
30
+ {
31
+ :status_code=>net_http_response.code.to_i,
32
+ :url=>full_url,
33
+ :body=> encode_utf8(net_http_response.body),
34
+ :path=>path,
35
+ :params=>params,
36
+ :data=>data,
37
+ :headers=>headers,
38
+ :message => net_http_response.message
39
+ }
40
+ end
41
+
42
+ # accepts a path/string and optional hash of query params
43
+ def build_url path, params={}
44
+ full_path = @uri.path + path
45
+ super full_path, params, @uri.query
46
+ end
47
+
48
+ end
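`NetHttp` relies on the parsed `@uri` for the host, port, base path and any query string carried by the configured URL. As a rough illustration of which pieces `URI.parse` yields, here is a small sketch; the URLs are hypothetical.

```ruby
require 'uri'

# Hypothetical Solr URL carrying a base path and a default query string.
uri = URI.parse('http://localhost:8983/solr?wt=ruby')

base         = "#{uri.scheme}://#{uri.host}:#{uri.port}"
request_path = uri.path + '/select'   # mirrors @uri.path + path in build_url

puts base          # prints http://localhost:8983
puts request_path  # prints /solr/select
puts uri.query     # prints wt=ruby

# An http URL without an explicit port still reports the default port.
puts URI.parse('http://solrserver.com').port  # prints 80
```

Since an http URI always reports a port, the `@uri.port ?` check in `create_http_context` takes the first branch for such URLs, and the default port appears in the reported `:url`.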
@@ -0,0 +1,43 @@
1
+ # A module that defines the interface and top-level logic for http based connection classes.
2
+ module RSolr::Connection::Requestable
3
+
4
+ include RSolr::Connection::Utils
5
+
6
+ attr_reader :opts, :uri
7
+
8
+ # opts can have:
9
+ # :url => 'http://localhost:8080/solr'
10
+ def initialize opts={}
11
+ opts[:url] ||= 'http://127.0.0.1:8983/solr'
12
+ @opts = opts
13
+ @uri = URI.parse opts[:url]
14
+ end
15
+
16
+ # send a request to the connection
17
+ # request '/select', :q=>'*:*'
18
+ #
19
+ # request '/update', {:wt=>:xml}, '<commit/>'
20
+ #
21
+ # force a post where the post body is the param query
22
+ # request '/update', "<optimize/>", :method=>:post
23
+ #
24
+ def request path, params={}, *extra
25
+ opts = extra[-1].kind_of?(Hash) ? extra.pop : {}
26
+ data = extra[0]
27
+ # force a POST, use the query string as the POST body
28
+ if opts[:method] == :post and data.to_s.empty?
29
+ http_context = self.post(path, hash_to_query(params), {}, {'Content-Type' => 'application/x-www-form-urlencoded'})
30
+ else
31
+ if data
32
+ # standard POST, using "data" as the POST body
33
+ http_context = self.post(path, data, params, {"Content-Type" => 'text/xml; charset=utf-8'})
34
+ else
35
+ # standard GET
36
+ http_context = self.get(path, params)
37
+ end
38
+ end
39
+ raise RSolr::RequestError.new("Solr Response: #{http_context[:message]}") unless http_context[:status_code] == 200
40
+ http_context
41
+ end
42
+
43
+ end
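The `request` method picks one of three transports depending on the trailing arguments. The following sketch isolates only that decision so it can be run without a server; `dispatch_kind` and the returned symbols are hypothetical names introduced for illustration.

```ruby
# Hypothetical helper mirroring the branching in Requestable#request.
# It takes the arguments that follow `params` and names the transport.
def dispatch_kind(*extra)
  # A trailing Hash is treated as options, as in the original method.
  opts = extra.last.is_a?(Hash) ? extra.pop : {}
  data = extra.first
  if opts[:method] == :post && data.to_s.empty?
    :form_post   # params are sent as a urlencoded POST body
  elsif data
    :xml_post    # data is sent as the POST body, params stay in the URL
  else
    :get         # plain GET with params in the query string
  end
end

puts dispatch_kind                            # prints get
puts dispatch_kind('<commit/>')               # prints xml_post
puts dispatch_kind(nil, {:method => :post})   # prints form_post
```

Forcing a form POST can be useful for queries too long to fit comfortably in a URL.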
@@ -0,0 +1,73 @@
1
+ # Helpful utility methods for building queries to a Solr server
2
+ # This includes helpers that the Direct connection can use.
3
+ module RSolr::Connection::Utils
4
+
5
+ # Performs URI escaping so that you can construct proper
6
+ # query strings faster. Use this rather than the cgi.rb
7
+ # version since it's faster. (Stolen from Rack).
8
+ def escape(s)
9
+ s.to_s.gsub(/([^ a-zA-Z0-9_.-]+)/n) {
10
+ #'%'+$1.unpack('H2'*$1.size).join('%').upcase
11
+ '%'+$1.unpack('H2'*bytesize($1)).join('%').upcase
12
+ }.tr(' ', '+')
13
+ end
14
+
15
+ # encodes the string as utf-8 in Ruby 1.9
16
+ # returns the unaltered string in Ruby 1.8
17
+ def encode_utf8 string
18
+ (string.respond_to?(:force_encoding) and string.respond_to?(:encoding)) ?
19
+ string.force_encoding(Encoding::UTF_8) : string
20
+ end
21
+
22
+ # Return the bytesize of String; uses String#length under Ruby 1.8 and
23
+ # String#bytesize under 1.9.
24
+ if ''.respond_to?(:bytesize)
25
+ def bytesize(string)
26
+ string.bytesize
27
+ end
28
+ else
29
+ def bytesize(string)
30
+ string.size
31
+ end
32
+ end
33
+
34
+ # creates and returns a url as a string
35
+ # "url" is the base url
36
+ # "params" is an optional hash of GET style query params
37
+ # "string_query" is an extra query string that will be appended to the
38
+ # result of "url" and "params".
39
+ def build_url url='', params={}, string_query=''
40
+ queries = [string_query, hash_to_query(params)]
41
+ queries.delete_if{|i| i.to_s.empty?}
42
+ url += "?#{queries.join('&')}" unless queries.empty?
43
+ url
44
+ end
45
+
46
+ # converts a key value pair to an escaped string:
47
+ # Example:
48
+ # build_param(:id, 1) == "id=1"
49
+ def build_param(k,v)
50
+ "#{escape(k)}=#{escape(v)}"
51
+ end
52
+
53
+ #
54
+ # converts hash into URL query string
55
+ # if a value is an array, the array values get mapped to the same key:
56
+ # hash_to_query(:q=>'blah', :fq=>['blah', 'blah'], :facet=>{:field=>['location_facet', 'format_facet']})
57
+ # returns:
58
+ # ?q=blah&fq=blah&fq=blah&facet.field=location_facet&facet.field=format_facet
59
+ #
60
+ # if a value is empty/nil etc., it is not added
61
+ def hash_to_query(params)
62
+ mapped = params.map do |k, v|
63
+ next if v.to_s.empty?
64
+ if v.class == Array
65
+ hash_to_query(v.map { |x| [k, x] })
66
+ else
67
+ build_param k, v
68
+ end
69
+ end
70
+ mapped.compact.join("&")
71
+ end
72
+
73
+ end
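To see how `escape` and `hash_to_query` combine, here is a minimal stand-alone sketch of the same approach. It is a simplified re-implementation for illustration, not the module itself: it works on a binary copy of the string instead of using the `/n` regexp flag, and the sample parameters are made up.

```ruby
# Simplified re-implementation of the Utils helpers, for illustration only.
def escape(value)
  # Percent-encode each run of unsafe bytes, then turn spaces into '+'.
  value.to_s.b.gsub(/[^ a-zA-Z0-9_.-]+/) { |run|
    '%' + run.unpack('H2' * run.bytesize).join('%').upcase
  }.tr(' ', '+')
end

def hash_to_query(params)
  params.map { |key, value|
    next if value.to_s.empty?   # nil and empty values are skipped
    if value.is_a?(Array)
      # Each array element becomes its own key=value pair with the same key.
      hash_to_query(value.map { |item| [key, item] })
    else
      "#{escape(key)}=#{escape(value)}"
    end
  }.compact.join('&')
end

puts escape('*:*')                                          # prints %2A%3A%2A
puts hash_to_query(:q => 'red roses', :fq => ['a', 'b'])    # prints q=red+roses&fq=a&fq=b
```

Pairs appear in the hash's insertion order, and nested hashes are not expanded by this logic.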