backupify-rsolr-nokogiri 0.12.1.1

data/LICENSE ADDED
@@ -0,0 +1,13 @@
+ Copyright 2008-2010 Matt Mitchell
+
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
@@ -0,0 +1,129 @@
+ =RSolr
+
+ A simple, extensible Ruby client for Apache Solr.
+
+ == Installation:
+   gem sources -a http://gemcutter.org
+   sudo gem install rsolr
+
+ == Example:
+   require 'rubygems'
+   require 'rsolr'
+   solr = RSolr.connect :url=>'http://solrserver.com'
+
+   # send a request to /select
+   response = solr.select :q=>'*:*'
+
+   # send a request to a custom request handler; /catalog
+   response = solr.request '/catalog', :q=>'*:*'
+
+   # alternative to above:
+   response = solr.catalog :q=>'*:*'
+
+ == Querying
+ Use the #select method to send requests to the /select handler:
+   response = solr.select({
+     :q=>'washington',
+     :start=>0,
+     :rows=>10
+   })
+
+ The params passed to the method are sent to Solr as-is, with one exception: array values. When a value is an array, multiple parameters *with the same name* are generated for the Solr query. Example:
+
+   solr.select :q=>'roses', :fq=>['red', 'violet']
+
+ The above statement generates this Solr query:
+
+   ?q=roses&fq=red&fq=violet
+
+ Use the #request method for a custom request handler path:
+   response = solr.request '/documents', :q=>'test'
+
+ As a shortcut for the above example, use a method call instead:
+   response = solr.documents :q=>'test'
+
+
+ == Updating Solr
+ Updating uses native Ruby structures. Hashes are used for single documents and arrays are used for a collection of documents (hashes). These structures get turned into simple XML "messages". Raw XML strings can also be used.
+
+ Raw XML via #update
+   solr.update '<commit/>'
+   solr.update '<optimize/>'
+
+ Single document via #add
+   solr.add :id=>1, :price=>1.00
+
+ Multiple documents via #add
+   documents = [{:id=>1, :price=>1.00}, {:id=>2, :price=>10.50}]
+   solr.add documents
+
+ When using the #add method, you can also supply "add" XML element attributes and/or a block for manipulating other "add"-related elements (docs and fields):
+
+   doc = {:id=>1, :price=>1.00}
+   add_attributes = {:allowDups=>false, :commitWithin=>10.0}
+   solr.add(doc, add_attributes) do |doc|
+     # boost each document
+     doc.attrs[:boost] = 1.5
+     # boost the price field:
+     doc.field_by_name(:price).attrs[:boost] = 2.0
+   end
+
+ Delete by id
+   solr.delete_by_id 1
+ or an array of ids
+   solr.delete_by_id [1, 2, 3, 4]
+
+ Delete by query:
+   solr.delete_by_query 'price:1.00'
+ Delete by array of queries
+   solr.delete_by_query ['price:1.00', 'price:10.00']
+
+ Commit & optimize shortcuts
+   solr.commit
+   solr.optimize
+
+ == Response Formats
+ The default response format is Ruby. When the :wt param is set to :ruby (a Symbol), the response is eval'd, resulting in a Hash. To get the raw, unevaluated response, set :wt to "ruby" (a String, not a Symbol). RSolr will eval the Ruby string ONLY if the :wt value is :ruby. All other response formats are available as expected, e.g. :wt=>'xml'.
+
+ ===Evaluated Ruby (default)
+   solr.select(:wt=>:ruby) # notice :ruby is a Symbol
+ ===Raw Ruby
+   solr.select(:wt=>'ruby') # notice 'ruby' is a String
+
+ ===XML:
+   solr.select(:wt=>:xml)
+ ===JSON:
+   solr.select(:wt=>:json)
+
+ You can access the original request context (path, params, url etc.) by calling the #raw method:
+   response = solr.select :q=>'*:*'
+   response.raw[:status_code]
+   response.raw[:body]
+   response.raw[:url]
+
+ The #raw value is a hash that contains the generated params, url, path, post data, headers etc., which is very useful for debugging and testing.
+
+ ==Related Resources & Projects
+ * {RSolr Google Group}[http://groups.google.com/group/rsolr] -- The RSolr discussion group
+ * {rsolr-ext}[http://github.com/mwmitchell/rsolr-ext] -- An extension kit for RSolr
+ * {rsolr-direct}[http://github.com/mwmitchell/rsolr-direct] -- JRuby direct connection for RSolr
+ * {SunSpot}[http://github.com/outoftime/sunspot] -- An awesome Solr DSL, built with RSolr
+ * {Blacklight}[http://blacklightopac.org] -- A "next generation" Library OPAC, built with RSolr
+ * {java_bin}[http://github.com/kennyj/java_bin] -- Provides javabin/binary parsing for RSolr
+ * {Solr}[http://lucene.apache.org/solr/] -- The Apache Solr project
+ * {solr-ruby}[http://wiki.apache.org/solr/solr-ruby] -- The original Solr Ruby Gem!
+
+ == Note on Patches/Pull Requests
+ * Fork the project.
+ * Make your feature addition or bug fix.
+ * Add tests for it. This is important so I don't break it in a future version unintentionally.
+ * Commit, do not mess with rakefile, version, or history
+   (if you want to have your own version, that is fine but bump version in a commit by itself I can ignore when I pull)
+ * Send me a pull request. Bonus points for topic branches.
+
+ ==Contributors
+ * mperham
+ * Mat Brown
+ * shairontoledo
+ * Matthew Rudy
+ * Fouad Mardini
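The handler shortcut described in the README's Querying section (solr.documents in place of solr.request '/documents') relies on Ruby's method_missing. The following is a minimal sketch of that dispatch, not the gem's code: RecordingClient is a hypothetical stand-in that records the path and params instead of contacting Solr.

```ruby
# Hypothetical stand-in for a Solr client: it records requests instead of
# sending them, so the dispatch can be inspected without a server.
class RecordingClient
  attr_reader :requests

  def initialize
    @requests = []
  end

  # Record the handler path and params that would be sent.
  def request(path, params = {})
    @requests << [path, params]
    path
  end

  # Any unknown method name becomes a request to "/<method_name>".
  def method_missing(method_name, *args, &blk)
    request("/#{method_name}", *args, &blk)
  end

  # Keep respond_to? consistent with method_missing.
  def respond_to_missing?(_method_name, _include_private = false)
    true
  end
end

client = RecordingClient.new
client.documents(:q => 'test')
client.catalog(:q => '*:*')
p client.requests
```

Because every unknown method name is turned into a handler path, a misspelled method name would also be sent to Solr as a request rather than failing locally.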
@@ -0,0 +1,14 @@
+ require 'rake'
+ require 'rake/testtask'
+ require 'rdoc/task'
+ require 'rubygems/package_task'
+
+ ENV['RUBYOPT'] = '-W1'
+
+ task :environment do
+   require File.dirname(__FILE__) + '/lib/rsolr'
+ end
+
+ Dir['tasks/**/*.rake'].each { |t| load t }
+
+ task :default => ['spec:api']
data/VERSION ADDED
@@ -0,0 +1 @@
+ 0.12.1.1
@@ -0,0 +1,50 @@
+
+ require 'rubygems'
+ $:.unshift File.dirname(__FILE__) unless $:.include?(File.dirname(__FILE__))
+
+ module RSolr
+
+   def self.version
+     @version ||= File.read(File.join(File.dirname(__FILE__), '..', 'VERSION'))
+   end
+
+   VERSION = self.version
+
+   autoload :Message, 'rsolr/message'
+   autoload :Client, 'rsolr/client'
+   autoload :Connection, 'rsolr/connection'
+
+   module Connectable
+
+     def connect opts={}
+       Client.new Connection::NetHttp.new(opts)
+     end
+
+   end
+
+   extend Connectable
+
+   # A module that contains string related methods
+   module Char
+
+     # escape - from the solr-ruby library
+     # RSolr.escape('asdf')
+     # backslash everything that isn't a word character
+     def escape(value)
+       value.gsub(/(\W)/, '\\\\\1')
+     end
+
+   end
+
+   # send the escape method into the Client class ->
+   # solr = RSolr.connect
+   # solr.escape('asdf')
+   RSolr::Client.send(:include, Char)
+
+   # bring escape into this module (RSolr) -> RSolr.escape('asdf')
+   extend Char
+
+   # RequestError is a common/generic exception class used by the adapters
+   class RequestError < RuntimeError; end
+
+ end
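To make the escaping rule in Char#escape concrete, here is a small standalone sketch that applies the same gsub call outside the gem. The name solr_escape is hypothetical and used only for this illustration.

```ruby
# Same substitution as Char#escape in the diff: every non-word character
# (anything other than letters, digits and underscore) gets a backslash
# placed in front of it.
def solr_escape(value)
  value.gsub(/(\W)/, '\\\\\1')
end

puts solr_escape('price:1.00')   # colon and dot are escaped
puts solr_escape('hello world')  # the space is escaped as well
```

Note that the rule is deliberately broad: characters such as spaces and dots are escaped too, not only Solr query operators.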
@@ -0,0 +1,114 @@
+ class RSolr::Client
+
+   attr_reader :connection
+
+   # "connection" is an instance of:
+   #   RSolr::Connection::NetHttp
+   #   or any other class that uses the connection "interface"
+   def initialize(connection)
+     @connection = connection
+   end
+
+   # Send a request to a request handler using the method name.
+   def method_missing(method_name, *args, &blk)
+     request("/#{method_name}", *args, &blk)
+   end
+
+   # sends data to the update handler
+   # data can be a string of xml, or an object that returns xml from its #to_xml method
+   def update(data, params={})
+     request '/update', params, data
+   end
+
+   # send a request to solr
+   # params is a hash with valid solr request params (:q, :fl, :qf etc..)
+   # if params[:wt] is not set, the default is :ruby
+   # if :wt is something other than :ruby, the raw response body is used
+   # otherwise, a simple Hash is returned
+   # NOTE: to get raw ruby, use :wt=>'ruby' <- a string, not a symbol like :ruby
+   #
+   #
+   def request(path, params={}, *extra)
+     response = @connection.request(path, map_params(params), *extra)
+     adapt_response(response)
+   end
+
+   #
+   # single record:
+   # solr.add(:id=>1, :name=>'one')
+   #
+   # add using an array
+   # solr.add([{:id=>1, :name=>'one'}, {:id=>2, :name=>'two'}])
+   #
+   def add(doc, &block)
+     update message.add(doc, &block)
+   end
+
+   # send <commit/>
+   def commit
+     update message.commit
+   end
+
+   # send <optimize/>
+   def optimize
+     update message.optimize
+   end
+
+   # send <rollback/>
+   # NOTE: solr 1.4 only
+   def rollback
+     update message.rollback
+   end
+
+   # Delete one or many documents by id
+   # solr.delete_by_id 10
+   # solr.delete_by_id([12, 41, 199])
+   def delete_by_id(id)
+     update message.delete_by_id(id)
+   end
+
+   # delete one or many documents by query
+   # solr.delete_by_query 'available:0'
+   # solr.delete_by_query ['quantity:0', 'manu:"FQ"']
+   def delete_by_query(query)
+     update message.delete_by_query(query)
+   end
+
+   # shortcut to RSolr::Message::Generator
+   def message *opts
+     @message ||= RSolr::Message::Generator.new
+   end
+
+   protected
+
+   # sets default params etc.. - could be used as a mapping hook
+   # type of request should be passed in here? -> map_params(:query, {})
+   def map_params(params)
+     params||={}
+     {:wt=>:ruby}.merge(params)
+   end
+
+   # "connection_response" must be a hash with the following keys:
+   #   :params - a sub hash of standard solr params
+   #   :body - the raw response body from the solr server
+   # This method will evaluate the :body value if the params[:wt] == :ruby
+   # otherwise, the body is returned
+   # The return object has a special method attached called #raw
+   # This method gives you access to the original response from the connection,
+   # so you can access things like the actual :url sent to solr,
+   # the raw :body, original :params and original :data
+   def adapt_response(connection_response)
+     data = connection_response[:body]
+     # if the wt is :ruby, evaluate the ruby string response
+     if connection_response[:params][:wt] == :ruby
+       data = Kernel.eval(data)
+     end
+     # attach a method called #raw that returns the original connection response value
+     def data.raw; @raw end
+     data.send(:instance_variable_set, '@raw', connection_response)
+     data
+   end
+
+ end
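The Symbol-versus-String distinction for :wt is easy to miss, so here is a minimal sketch of the adapt_response logic run against hand-built response hashes. The response bodies are made-up sample data, and the function is lifted out of the class purely for illustration. Since the body is passed to eval, this approach is only appropriate when the Solr server is trusted.

```ruby
# Standalone version of the adapt_response logic shown in the diff.
# "connection_response" is a hash with at least :params and :body.
def adapt_response(connection_response)
  data = connection_response[:body]
  # Only the Symbol :ruby triggers evaluation; the String 'ruby' does not.
  if connection_response[:params][:wt] == :ruby
    data = Kernel.eval(data)
  end
  # Attach #raw so callers can reach the original connection response.
  def data.raw; @raw end
  data.instance_variable_set('@raw', connection_response)
  data
end

# Made-up sample body in Solr's Ruby response style.
body = "{'responseHeader'=>{'status'=>0}}"

evaluated = adapt_response({:params => {:wt => :ruby},  :body => String.new(body)})
raw_ruby  = adapt_response({:params => {:wt => 'ruby'}, :body => String.new(body)})

p evaluated.class          # a Hash built by eval
p raw_ruby.class           # still a String
p evaluated.raw[:params]   # the original request params
```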
@@ -0,0 +1,9 @@
+ require 'uri'
+
+ module RSolr::Connection
+
+   autoload :NetHttp, 'rsolr/connection/net_http'
+   autoload :Utils, 'rsolr/connection/utils'
+   autoload :Requestable, 'rsolr/connection/requestable'
+
+ end
@@ -0,0 +1,48 @@
+ require 'net/http'
+
+ #
+ # Connection for standard HTTP Solr server
+ #
+ class RSolr::Connection::NetHttp
+
+   include RSolr::Connection::Requestable
+
+   def connection
+     @connection ||= Net::HTTP.new(@uri.host, @uri.port)
+   end
+
+   def get path, params={}
+     url = self.build_url path, params
+     net_http_response = self.connection.get url
+     create_http_context net_http_response, url, path, params
+   end
+
+   def post path, data, params={}, headers={}
+     url = self.build_url path, params
+     net_http_response = self.connection.post url, data, headers
+     create_http_context net_http_response, url, path, params, data, headers
+   end
+
+   def create_http_context net_http_response, url, path, params, data=nil, headers={}
+     full_url = "#{@uri.scheme}://#{@uri.host}"
+     full_url += @uri.port ? ":#{@uri.port}" : ''
+     full_url += url
+     {
+       :status_code=>net_http_response.code.to_i,
+       :url=>full_url,
+       :body=> encode_utf8(net_http_response.body),
+       :path=>path,
+       :params=>params,
+       :data=>data,
+       :headers=>headers,
+       :message => net_http_response.message
+     }
+   end
+
+   # accepts a path/string and optional hash of query params
+   def build_url path, params={}
+     full_path = @uri.path + path
+     super full_path, params, @uri.query
+   end
+
+ end
+ end
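As an illustration of how create_http_context assembles the :url value, the sketch below repeats the same string building with Ruby's standard URI class. The helper name full_url and the sample host are assumptions made for this example only.

```ruby
require 'uri'

# Rebuilds the :url value the same way create_http_context does.
# "request_url" stands for the path plus query string returned by build_url.
def full_url(uri, request_url)
  url = "#{uri.scheme}://#{uri.host}"
  url += uri.port ? ":#{uri.port}" : ''
  url + request_url
end

uri = URI.parse('http://solrserver.com/solr')
puts full_url(uri, uri.path + '/select?q=roses')
```

For http URLs, URI fills in the default port when none is given, so the reported URL includes an explicit ":80" even though the original URL did not.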
@@ -0,0 +1,43 @@
+ # A module that defines the interface and top-level logic for http based connection classes.
+ module RSolr::Connection::Requestable
+
+   include RSolr::Connection::Utils
+
+   attr_reader :opts, :uri
+
+   # opts can have:
+   #   :url => 'http://localhost:8080/solr'
+   def initialize opts={}
+     opts[:url] ||= 'http://127.0.0.1:8983/solr'
+     @opts = opts
+     @uri = URI.parse opts[:url]
+   end
+
+   # send a request to the connection
+   # request '/select', :q=>'*:*'
+   #
+   # post raw data to the update handler
+   # request '/update', {:wt=>:xml}, '<commit/>'
+   #
+   # force a post where the post body is the param query
+   # request '/select', {:q=>'*:*'}, :method=>:post
+   #
+   def request path, params={}, *extra
+     opts = extra[-1].kind_of?(Hash) ? extra.pop : {}
+     data = extra[0]
+     # force a POST, use the query string as the POST body
+     if opts[:method] == :post and data.to_s.empty?
+       http_context = self.post(path, hash_to_query(params), {}, {'Content-Type' => 'application/x-www-form-urlencoded'})
+     else
+       if data
+         # standard POST, using "data" as the POST body
+         http_context = self.post(path, data, params, {"Content-Type" => 'text/xml; charset=utf-8'})
+       else
+         # standard GET
+         http_context = self.get(path, params)
+       end
+     end
+     raise RSolr::RequestError.new("Solr Response: #{http_context[:message]}") unless http_context[:status_code] == 200
+     http_context
+   end
+
+ end
+ end
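The branching in Requestable#request decides between three kinds of HTTP call. The sketch below isolates only that decision so it can be tried without a server; request_kind and the returned symbols are hypothetical names invented for this illustration.

```ruby
# Hypothetical helper: reports which branch Requestable#request would take
# for a given argument list, without performing any HTTP call.
def request_kind(params = {}, *extra)
  # A trailing Hash is treated as options; the first extra item is the body.
  opts = extra[-1].kind_of?(Hash) ? extra.pop : {}
  data = extra[0]
  if opts[:method] == :post && data.to_s.empty?
    :post_params_as_body   # forced POST, params become the form body
  elsif data
    :post_data             # POST with "data" as the XML body
  else
    :get                   # plain GET with params in the query string
  end
end

p request_kind({:q => '*:*'})
p request_kind({:wt => :xml}, '<commit/>')
p request_kind({:q => '*:*'}, :method => :post)
```

One consequence of this design is that :method=>:post only takes effect when no body is supplied; when data is present, the request is already a POST and the option changes nothing.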
@@ -0,0 +1,73 @@
+ # Helpful utility methods for building queries to a Solr server
+ # This includes helpers that the Direct connection can use.
+ module RSolr::Connection::Utils
+
+   # Performs URI escaping so that you can construct proper
+   # query strings faster. Use this rather than the cgi.rb
+   # version since it's faster. (Stolen from Rack).
+   def escape(s)
+     s.to_s.gsub(/([^ a-zA-Z0-9_.-]+)/n) {
+       #'%'+$1.unpack('H2'*$1.size).join('%').upcase
+       '%'+$1.unpack('H2'*bytesize($1)).join('%').upcase
+     }.tr(' ', '+')
+   end
+
+   # encodes the string as utf-8 in Ruby 1.9
+   # returns the unaltered string in Ruby 1.8
+   def encode_utf8 string
+     (string.respond_to?(:force_encoding) and string.respond_to?(:encoding)) ?
+       string.force_encoding(Encoding::UTF_8) : string
+   end
+
+   # Return the bytesize of String; uses String#length under Ruby 1.8 and
+   # String#bytesize under 1.9.
+   if ''.respond_to?(:bytesize)
+     def bytesize(string)
+       string.bytesize
+     end
+   else
+     def bytesize(string)
+       string.size
+     end
+   end
+
+   # creates and returns a url as a string
+   # "url" is the base url
+   # "params" is an optional hash of GET style query params
+   # "string_query" is an extra query string that will be appended to the
+   # result of "url" and "params".
+   def build_url url='', params={}, string_query=''
+     queries = [string_query, hash_to_query(params)]
+     queries.delete_if{|i| i.to_s.empty?}
+     url += "?#{queries.join('&')}" unless queries.empty?
+     url
+   end
+
+   # converts a key value pair to an escaped string:
+   # Example:
+   # build_param(:id, 1) == "id=1"
+   def build_param(k,v)
+     "#{escape(k)}=#{escape(v)}"
+   end
+
+   #
+   # converts hash into URL query string
+   # if a value is an array, the array values get mapped to the same key:
+   #   hash_to_query(:q=>'blah', :fq=>['blah', 'blah'])
+   # returns:
+   #   q=blah&fq=blah&fq=blah
+   #
+   # if a value is empty/nil etc., it is not added
+   def hash_to_query(params)
+     mapped = params.map do |k, v|
+       next if v.to_s.empty?
+       if v.class == Array
+         hash_to_query(v.map { |x| [k, x] })
+       else
+         build_param k, v
+       end
+     end
+     mapped.compact.join("&")
+   end
+
+ end
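To see how the helpers in this file turn a params hash into a query string, the following standalone sketch copies escape, build_param and hash_to_query out of the module. One deliberate difference is an assumption of this sketch rather than part of the gem: String#b is called before matching so that the binary (/n) regexp runs against a binary string on current Ruby versions.

```ruby
# Percent-encode everything except letters, digits, "_", ".", "-" and space,
# then turn spaces into "+". String#b is an addition made for this sketch.
def escape(s)
  s.to_s.b.gsub(/([^ a-zA-Z0-9_.-]+)/n) {
    '%' + $1.unpack('H2' * $1.bytesize).join('%').upcase
  }.tr(' ', '+')
end

# One escaped key=value pair.
def build_param(k, v)
  "#{escape(k)}=#{escape(v)}"
end

# Array values repeat the key; nil and empty values are dropped.
def hash_to_query(params)
  mapped = params.map do |k, v|
    next if v.to_s.empty?
    if v.class == Array
      hash_to_query(v.map { |x| [k, x] })
    else
      build_param k, v
    end
  end
  mapped.compact.join('&')
end

puts hash_to_query(:q => 'roses', :fq => ['red', 'violet'], :start => nil)
puts hash_to_query(:q => 'title:"war and peace"')
```

The first call shows the repeated-key behavior for arrays and the skipping of nil values; the second shows percent-encoding of the colon and quotes, with spaces becoming plus signs.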