gsolr 0.12.2

data/.gitignore ADDED
@@ -0,0 +1,3 @@
1
+ pkg/*
2
+ *.gem
3
+ .bundle
data/Gemfile ADDED
@@ -0,0 +1,11 @@
1
+ source "http://rubygems.org"
2
+
3
+ # Specify your gem's dependencies in gsolr.gemspec
4
+ gemspec
5
+
6
+ # gem 'builder'
7
+
8
+ group :test do
9
+ gem 'rspec'
10
+ gem 'rspec-core'
11
+ end
data/Gemfile.lock ADDED
@@ -0,0 +1,30 @@
1
+ PATH
2
+ remote: .
3
+ specs:
4
+ gsolr (0.0.1)
5
+ json (~> 1.4.6)
6
+
7
+ GEM
8
+ remote: http://rubygems.org/
9
+ specs:
10
+ diff-lcs (1.1.2)
11
+ json (1.4.6)
12
+ rspec (2.0.1)
13
+ rspec-core (~> 2.0.1)
14
+ rspec-expectations (~> 2.0.1)
15
+ rspec-mocks (~> 2.0.1)
16
+ rspec-core (2.0.1)
17
+ rspec-expectations (2.0.1)
18
+ diff-lcs (>= 1.1.2)
19
+ rspec-mocks (2.0.1)
20
+ rspec-core (~> 2.0.1)
21
+ rspec-expectations (~> 2.0.1)
22
+
23
+ PLATFORMS
24
+ ruby
25
+
26
+ DEPENDENCIES
27
+ gsolr!
28
+ json (~> 1.4.6)
29
+ rspec
30
+ rspec-core
data/LICENSE ADDED
@@ -0,0 +1,13 @@
1
+ Copyright 2008-2010 Matt Mitchell
2
+
3
+ Licensed under the Apache License, Version 2.0 (the "License");
4
+ you may not use this file except in compliance with the License.
5
+ You may obtain a copy of the License at
6
+
7
+ http://www.apache.org/licenses/LICENSE-2.0
8
+
9
+ Unless required by applicable law or agreed to in writing, software
10
+ distributed under the License is distributed on an "AS IS" BASIS,
11
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12
+ See the License for the specific language governing permissions and
13
+ limitations under the License.
data/README.md ADDED
@@ -0,0 +1,160 @@
1
+ # GSolr
2
+
3
+ A simple, extensible Ruby client for the Solr interface. Capable of talking to Solr and to Riak.
4
+
5
+ ## Installation:
6
+ sudo gem install gsolr
7
+
8
+ ## Example:
9
+ require 'rubygems'
10
+ require 'gsolr'
11
+ solr = GSolr.connect :url => "http://solrserver.com"
12
+
13
+ # send a request to /select
14
+ response = solr.select :q=>'*:*'
15
+
16
+ # send a request to a custom request handler; /catalog
17
+ response = solr.request '/catalog', :q=>'*:*'
18
+
19
+ # alternative to above:
20
+ response = solr.catalog :q=>'*:*'
21
+
22
+ ## Querying
23
+ Use the #select method to send requests to the /select handler:
24
+
25
+ response = solr.select(
26
+ :q=>'washington',
27
+ :start=>0,
28
+ :rows=>10
29
+ )
30
+
31
+ The params sent into the method are sent to Solr as-is. The one exception is if a value is an array. When an array is used, multiple parameters *with the same name* are generated for the Solr query. Example:
32
+
33
+ solr.select(
34
+ :q => 'roses',
35
+ :fq => ['red', 'violet']
36
+ )
37
+
38
+ The above statement generates this Solr query:
39
+
40
+ .../?q=roses&fq=red&fq=violet
41
+
42
+ Use the #request method for a custom request handler path:
43
+
44
+ response = solr.request '/documents', :q=>'test'
45
+
46
+ As a shortcut for the above example, use a method call instead:
47
+
48
+ response = solr.documents :q=>'test'
49
+
50
+
51
+ ## Updating Solr
52
+ Updating uses native Ruby structures. Hashes are used for single documents and arrays are used for a collection of documents (hashes). These structures get turned into simple XML "messages". Raw XML strings can also be used.
53
+
54
+ Raw XML via #update
55
+
56
+ solr.update '<commit/>'
57
+ solr.update '<optimize/>'
58
+
59
+ Single document via #add
60
+
61
+ solr.add(
62
+ :id => 1,
63
+ :price => 1.00
64
+ )
65
+
66
+ Multiple documents via #add
67
+
68
+ documents = [{
69
+ :id => 1,
70
+ :price => 1.00
71
+ }, {
72
+ :id => 2,
73
+ :price => 10.50
74
+ }]
75
+
76
+ solr.add documents
77
+
78
+ When adding, you can also supply "add" xml element attributes and/or a block for manipulating other "add" related elements (docs and fields) when using the #add method:
79
+
80
+ add_doc = {:id=>1, :price=>1.00}
81
+ add_attr = {:allowDups=>false, :commitWithin=>10.0}
82
+
83
+ solr.add(add_doc, add_attr) do |doc|
84
+ # boost each document
85
+ doc.attrs[:boost] = 1.5
86
+ # boost the price field:
87
+ doc.field_by_name(:price).attrs[:boost] = 2.0
88
+ end
89
+
90
+ Delete by id
91
+
92
+ solr.delete_by_id 1
93
+
94
+ or an array of ids
95
+
96
+ solr.delete_by_id [1, 2, 3, 4]
97
+
98
+ Delete by query:
99
+
100
+ solr.delete_by_query 'price:1.00'
101
+
102
+ Delete by array of queries
103
+
104
+ solr.delete_by_query ['price:1.00', 'price:10.00']
105
+
106
+ Commit & optimize shortcuts
107
+
108
+ solr.commit
109
+ solr.optimize
110
+
111
+ ## Response Formats
112
+ The response format is controlled by the :wt param. When :wt is set to the Symbol :ruby, the response is eval'd, resulting in a Hash. You can get the raw, unevaluated Ruby response by setting :wt to "ruby" (note: the String, not the Symbol). GSolr will eval the Ruby string ONLY if the :wt value is :ruby. All other response formats are returned as raw strings, e.g. :wt=>'xml'. Note that Client#map_params in this release fills in :wt=>:json when no :wt is given, so pass :wt=>:ruby explicitly to get a Hash.
113
+
114
+ ### Evaluated Ruby
115
+
116
+ solr.select(:wt=>:ruby) # notice :ruby is a Symbol
117
+
118
+ ### Raw Ruby
119
+
120
+ solr.select(:wt=>'ruby') # notice 'ruby' is a String
121
+
122
+ ### XML:
123
+
124
+ solr.select(:wt=>:xml)
125
+
126
+ ### JSON:
127
+
128
+ solr.select(:wt=>:json)
129
+
130
+ You can access the original request context (path, params, url etc.) by calling the #raw method:
131
+
132
+ response = solr.select :q=>'*:*'
133
+
134
+ response.raw[:status_code]
135
+ response.raw[:body]
136
+ response.raw[:url]
137
+
138
+ The #raw value is a hash that contains the generated params, url, path, post data, headers etc., which is very useful for debugging and testing.
139
+
140
+ ## Related Resources & Projects
141
+ * The Apache Solr project
142
+ * [Solr](http://lucene.apache.org/solr/)
143
+ * The original Solr Ruby Gem
144
+ * [solr-ruby](http://wiki.apache.org/solr/solr-ruby)
145
+ * The RSolr Gem, from which this was hijacked
146
+ * [RSolr](https://github.com/mwmitchell/rsolr)
147
+
148
+ ## Note on Patches/Pull Requests
149
+ * Fork the project.
150
+ * Add tests for your contribution.
151
+ * Write your contribution.
152
+ * Commit only that contribution. Changes to the Rakefile, version, or history should be made in a separate commit.
153
+ * Send a pull request.
154
+
155
+ ## Contributors (to the RSolr project, who therefore contributed to this)
156
+ * mperham
157
+ * Mat Brown
158
+ * shairontoledo
159
+ * Matthew Rudy
160
+ * Fouad Mardini
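The README's "Updating Solr" section says that hashes and arrays of hashes are turned into simple XML messages. The sketch below shows one way that conversion could look for Solr's `<add>` message. It is only an illustration: `add_message`, `xml_escape`, and `XML_ESCAPES` are hypothetical names, not part of the gem, and the gem's own message generator is not shown here.

```ruby
# Hypothetical helpers, for illustration only (not the gem's generator).
XML_ESCAPES = { '&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;' }.freeze

# Escape the characters that are unsafe in XML text and attribute values.
def xml_escape(text)
  text.to_s.gsub(/[&<>"]/, XML_ESCAPES)
end

# Build an <add> message from one document (Hash) or many (Array of Hashes).
# "add_attrs" become attributes on the <add> element.
def add_message(docs, add_attrs = {})
  docs = [docs] unless docs.is_a?(Array)
  attr_text = add_attrs.map { |k, v| " #{k}=\"#{xml_escape(v)}\"" }.join

  doc_xml = docs.map do |doc|
    fields = doc.map do |name, values|
      # An array value yields one <field> per element (multi-valued field).
      values = [values] unless values.is_a?(Array)
      values.map { |v| "<field name=\"#{xml_escape(name)}\">#{xml_escape(v)}</field>" }.join
    end
    "<doc>#{fields.join}</doc>"
  end

  "<add#{attr_text}>#{doc_xml.join}</add>"
end

single = add_message({ :id => 1, :price => 1.00 })
many   = add_message([{ :id => 1 }, { :id => 2 }], :commitWithin => 10)
```

Here `single` is a one-document message and `many` wraps two documents in a single `<add commitWithin="10">` element.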
data/Rakefile ADDED
@@ -0,0 +1,40 @@
1
+ $:.unshift(File.dirname(__FILE__)) unless
2
+ $:.include?(File.dirname(__FILE__)) || $:.include?(File.expand_path(File.dirname(__FILE__)))
3
+
4
+ require 'bundler'
5
+
6
+ require 'rake'
7
+ require 'rake/testtask'
8
+
9
+ require 'rake/gempackagetask'
10
+
11
+ gemspec = eval File.read('gsolr.gemspec')
12
+
13
+ # Gem packaging tasks
14
+ Rake::GemPackageTask.new(gemspec) do |pkg|
15
+ pkg.need_zip = false
16
+ pkg.need_tar = false
17
+ end
18
+
19
+ task :gem => :gemspec
20
+
21
+ desc %{Build the gemspec file.}
22
+ task :gemspec do
23
+ gemspec.validate
24
+ end
25
+
26
+ desc %{Release the gem to RubyGems.org}
27
+ task :release => :gem do
28
+ system "gem push pkg/#{gemspec.name}-#{gemspec.version}.gem"
29
+ end
30
+
31
+
32
+ ENV['RUBYOPT'] = '-W1'
33
+
34
+ task :environment do
35
+ require File.dirname(__FILE__) + '/lib/gsolr'
36
+ end
37
+
38
+ Dir['tasks/**/*.rake'].each { |t| load t }
39
+
40
+ task :default => ['spec:api']
data/gsolr.gemspec ADDED
@@ -0,0 +1,25 @@
1
+ # -*- encoding: utf-8 -*-
2
+ $:.push File.expand_path("../lib", __FILE__)
3
+ require "gsolr/version"
4
+
5
+ Gem::Specification.new do |s|
6
+ s.name = "gsolr"
7
+ s.version = Gsolr::VERSION
8
+ s.platform = Gem::Platform::RUBY
9
+ s.authors = ["Scott Gonyea"]
10
+ s.email = ["me@sgonyea.com"]
11
+ s.homepage = "http://rubygems.org/gems/gsolr"
12
+ s.summary = %q{Generic Solr Client}
13
+ s.description = %q{This is a generic solr client, capable of talking to Solr, as well as Riak}
14
+
15
+ s.rubyforge_project = "gsolr"
16
+
17
+ s.add_dependency('json', '~>1.4.6')
18
+
19
+ s.add_development_dependency "rspec"
20
+
21
+ s.files = `git ls-files`.split("\n")
22
+ s.test_files = `git ls-files -- {test,spec,features}/*`.split("\n")
23
+ s.executables = `git ls-files -- bin/*`.split("\n").map{ |f| File.basename(f) }
24
+ s.require_paths = ["lib"]
25
+ end
@@ -0,0 +1,121 @@
1
+ module GSolr
2
+ class Client
3
+
4
+ attr_reader :connection
5
+
6
+ # "connection" is instance of:
7
+ # GSolr::Connection::NetHttp
8
+ # GSolr::Adapter::Direct (jRuby only)
9
+ # or any other class that uses the connection "interface"
10
+ def initialize(connection)
11
+ @connection = connection
12
+ end
13
+
14
+ # Send a request to a request handler using the method name.
15
+ # For example, solr.catalog(:q=>'*:*') is sent to the /catalog request handler.
16
+ def method_missing(method_name, *args, &blk)
17
+ request("/#{method_name}", *args, &blk)
18
+ end
19
+
20
+ # sends data to the update handler
21
+ # data can be a string of xml, or an object that returns xml from its #to_xml method
22
+ def update(data, params={})
23
+ request '/update', params, data
24
+ end
25
+
26
+ # send a request to solr
27
+ # params is hash with valid solr request params (:q, :fl, :qf etc..)
28
+ # if params[:wt] is not set, the default is :json (see #map_params)
29
+ # if :wt is something other than :ruby, the raw response body is used
30
+ # otherwise, a simple Hash is returned
31
+ # NOTE: to get raw ruby, use :wt=>'ruby' <- a string, not a symbol like :ruby
32
+ #
33
+ def request(path, params={}, *extra)
34
+ response = @connection.request(path, map_params(params), *extra)
35
+ adapt_response(response)
36
+ end
37
+
38
+ #
39
+ # single record:
40
+ # solr.add(:id=>1, :name=>'one')
41
+ #
42
+ # add using an array
43
+ # solr.add([{:id=>1, :name=>'one'}, {:id=>2, :name=>'two'}])
44
+ #
45
+ def add(doc, &block)
46
+ update message.add(doc, &block)
47
+ end
48
+
49
+ # send <commit/>
50
+ def commit
51
+ update message.commit
52
+ end
53
+
54
+ # send <optimize/>
55
+ def optimize
56
+ update message.optimize
57
+ end
58
+
59
+ # send <rollback/>
60
+ # NOTE: solr 1.4 only
61
+ def rollback
62
+ update message.rollback
63
+ end
64
+
65
+ # Delete one or many documents by id
66
+ # solr.delete_by_id 10
67
+ # solr.delete_by_id([12, 41, 199])
68
+ def delete_by_id(id)
69
+ update message.delete_by_id(id)
70
+ end
71
+
72
+ # delete one or many documents by query
73
+ # solr.delete_by_query 'available:0'
74
+ # solr.delete_by_query ['quantity:0', 'manu:"FQ"']
75
+ def delete_by_query(query)
76
+ update message.delete_by_query(query)
77
+ end
78
+
79
+ # shortcut to GSolr::Message::Generator
80
+ def message *opts
81
+ @message ||= GSolr::Message::Generator.new
82
+ end
83
+
84
+ protected
85
+
86
+ # sets default params etc.. - could be used as a mapping hook
87
+ # type of request should be passed in here? -> map_params(:query, {})
88
+ def map_params(params)
89
+ params||={}
90
+ {:wt=>:json}.merge(params)
91
+ end
92
+
93
+ # "connection_response" must be a hash with the following keys:
94
+ # :params - a sub hash of standard solr params
95
+ # :body - the raw response body from the solr server
96
+ # This method will evaluate the :body value if the params[:wt] == :ruby
97
+ # otherwise, the body is returned
98
+ # The return object has a special method attached called #raw
99
+ # This method gives you access to the original response from the connection,
100
+ # so you can access things like the actual :url sent to solr,
101
+ # the raw :body, original :params and original :data
102
+ def adapt_response(connection_response)
103
+ data = connection_response[:body]
104
+
105
+ # if the wt is :ruby, evaluate the ruby string response
106
+ if connection_response[:params][:wt] == :ruby
107
+ data = Kernel.eval(data)
108
+ end
109
+
110
+ # attach a method called #raw that returns the original connection response value
111
+ def data.raw
112
+ @raw
113
+ end
114
+
115
+ data.send(:instance_variable_set, '@raw', connection_response)
116
+
117
+ return data
118
+ end
119
+
120
+ end # class Client
121
+ end # module GSolr
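The `adapt_response` method above does two things: it evaluates the body only when `:wt` is the Symbol `:ruby`, and it attaches a `#raw` reader to the returned object. A minimal standalone sketch of the same idea follows. `adapt_response_sketch` and the `fake` response hash are hypothetical stand-ins, and `define_singleton_method` is used here in place of the original `def data.raw` plus instance variable. As in the original, `eval` should only ever see output from a trusted Solr server.

```ruby
# Hypothetical stand-in for Client#adapt_response, for illustration only.
def adapt_response_sketch(connection_response)
  data = connection_response[:body]

  # Only the Symbol :ruby triggers evaluation; the String 'ruby' does not.
  data = Kernel.eval(data) if connection_response[:params][:wt] == :ruby

  # Attach the original connection response to this one object only.
  data.define_singleton_method(:raw) { connection_response }
  data
end

# A made-up connection response; String.new keeps the body unfrozen.
fake = {
  :body   => String.new("{'numFound'=>2}"),
  :params => { :wt => :ruby },
  :url    => 'http://127.0.0.1:8983/solr/select?wt=ruby'
}

evaluated = adapt_response_sketch(fake)                                   # a Hash
raw_text  = adapt_response_sketch(fake.merge(:params => { :wt => 'ruby' })) # still a String
```

Both return values answer `#raw`, so `evaluated.raw[:url]` and `raw_text.raw[:params]` give back the original request context.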
@@ -0,0 +1,56 @@
1
+ require 'net/http'
2
+
3
+ #
4
+ # Connection for standard HTTP Solr server
5
+ #
6
+ module GSolr
7
+ module Connection
8
+ class NetHttp
9
+
10
+ include GSolr::Connection::Requestable
11
+
12
+ def connection
13
+ @connection ||= Net::HTTP.new(@uri.host, @uri.port)
14
+ end
15
+
16
+ def get(path, params={})
17
+ url = self.build_url path, params
18
+ net_http_response = self.connection.get url
19
+ create_http_context net_http_response, url, path, params
20
+ end
21
+
22
+ def post(path, data, params={}, headers={})
23
+ url = self.build_url path, params
24
+ net_http_response = self.connection.post url, data, headers
25
+ create_http_context net_http_response, url, path, params, data, headers
26
+ end
27
+
28
+ def create_http_context(net_http_response, url, path, params, data=nil, headers={})
29
+ full_url = "#{@uri.scheme}://#{@uri.host}"
30
+
31
+ full_url += ":#{@uri.port}" if @uri.port
32
+
33
+ full_url += url
34
+
35
+ return {
36
+ :status_code => net_http_response.code.to_i,
37
+ :url => full_url,
38
+ :body => encode_utf8(net_http_response.body),
39
+ :path => path,
40
+ :params => params,
41
+ :data => data,
42
+ :headers => headers,
43
+ :message => net_http_response.message
44
+ }
45
+ end
46
+
47
+ # accepts a path/string and optional hash of query params
48
+ def build_url(path, params={})
49
+ full_path = @uri.path + path
50
+
51
+ super full_path, params, @uri.query
52
+ end
53
+
54
+ end # class NetHttp
55
+ end # module Connection
56
+ end # module GSolr
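One detail of `create_http_context` above is worth a quick check: `URI.parse` reports the scheme's default port when the URL does not name one, so the `if @uri.port` guard is true for ordinary http URLs and the port is always written into `:url`. The sketch below reproduces just that string assembly. `full_url` is a hypothetical helper, not a method of the gem.

```ruby
require 'uri'

# Hypothetical helper mirroring how create_http_context assembles :url.
def full_url(base, request_path)
  uri = URI.parse(base)
  url = "#{uri.scheme}://#{uri.host}"
  url += ":#{uri.port}" if uri.port # port is 80 for http when not given
  url + uri.path + request_path
end

with_default_port  = full_url('http://solrserver.com', '/select?q=roses')
# => "http://solrserver.com:80/select?q=roses"
with_explicit_port = full_url('http://127.0.0.1:8983/solr', '/select')
# => "http://127.0.0.1:8983/solr/select"
```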
@@ -0,0 +1,48 @@
1
+ # A module that defines the interface and top-level logic for http based connection classes.
2
+ module GSolr
3
+ module Connection
4
+ module Requestable
5
+
6
+ include GSolr::Connection::Utils
7
+
8
+ attr_reader :opts, :uri
9
+
10
+ # opts can have:
11
+ # :url => 'http://localhost:8080/solr'
12
+ def initialize(opts={})
13
+ opts[:url] ||= 'http://127.0.0.1:8983/solr'
14
+ @opts = opts
15
+ @uri = URI.parse opts[:url]
16
+ end
17
+
18
+ # send a request to the connection
19
+ # request '/select', :q=>'*:*'
20
+ #
21
+ # request '/update', {:wt=>:xml}, '<commit/>'
22
+ #
23
+ # force a post where the post body is the param query
24
+ # request '/select', {:q=>'*:*'}, :method=>:post
25
+ #
26
+ def request(path, params={}, *extra)
27
+ opts = extra[-1].kind_of?(Hash) ? extra.pop : {}
28
+ data = extra[0]
29
+ # force a POST, use the query string as the POST body
30
+ if opts[:method] == :post and data.to_s.empty?
31
+ http_context = self.post(path, hash_to_query(params), {}, {'Content-Type' => 'application/x-www-form-urlencoded'})
32
+ else
33
+ if data
34
+ # standard POST, using "data" as the POST body
35
+ http_context = self.post(path, data, params, {"Content-Type" => 'text/xml; charset=utf-8'})
36
+ else
37
+ # standard GET
38
+ http_context = self.get(path, params)
39
+ end
40
+ end
41
+
42
+ raise GSolr::RequestError.new("Solr Response: #{http_context[:message]}") unless http_context[:status_code] == 200
43
+
44
+ return http_context
45
+ end
46
+ end # module Requestable
47
+ end # module Connection
48
+ end # module GSolr
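The branching in `Requestable#request` above is easier to follow when separated from the HTTP calls. The sketch below reproduces only the decision, under the assumption that the argument handling matches the method above. `plan_request` and the three symbols it returns are hypothetical names used for illustration.

```ruby
# Hypothetical helper: reports which kind of HTTP call #request would make.
def plan_request(params = {}, *extra)
  # A trailing Hash is treated as options; whatever remains first is the body.
  opts = extra.last.is_a?(Hash) ? extra.pop : {}
  data = extra.first

  if opts[:method] == :post && data.to_s.empty?
    :form_post # params are sent as a form-encoded POST body
  elsif data
    :xml_post  # "data" is sent as the POST body, params go in the URL
  else
    :get       # params go in the query string
  end
end

plan_request({ :q => '*:*' })                    # => :get
plan_request({ :wt => :xml }, '<commit/>')       # => :xml_post
plan_request({ :q => '*:*' }, :method => :post)  # => :form_post
```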
@@ -0,0 +1,82 @@
1
+ # Helpful utility methods for building queries to a Solr server
2
+ # This includes helpers that the Direct connection can use.
3
+ module GSolr
4
+ module Connection
5
+ module Utils
6
+ # Performs URI escaping so that you can construct proper
7
+ # query strings faster. Use this rather than the cgi.rb
8
+ # version since it's faster. (Stolen from Rack).
9
+ def escape(s)
10
+ s.to_s.gsub(/([^ a-zA-Z0-9_.-]+)/n) {
11
+ #'%'+$1.unpack('H2'*$1.size).join('%').upcase
12
+ '%'+$1.unpack('H2'*bytesize($1)).join('%').upcase
13
+ }.tr(' ', '+')
14
+ end
15
+
16
+ # marks the string as utf-8 in Ruby 1.9 (force_encoding relabels the bytes; it does not transcode)
17
+ # returns the unaltered string in Ruby 1.8
18
+ def encode_utf8(string)
19
+ (string.respond_to?(:force_encoding) and string.respond_to?(:encoding)) ?
20
+ string.force_encoding(Encoding::UTF_8) : string
21
+ end
22
+
23
+ # Return the bytesize of String; uses String#length under Ruby 1.8 and
24
+ # String#bytesize under 1.9.
25
+ if ''.respond_to?(:bytesize)
26
+ def bytesize(string)
27
+ string.bytesize
28
+ end
29
+ else
30
+ def bytesize(string)
31
+ string.size
32
+ end
33
+ end
34
+
35
+ # creates and returns a url as a string
36
+ # "url" is the base url
37
+ # "params" is an optional hash of GET style query params
38
+ # "string_query" is an extra query string that will be appended to the
39
+ # result of "url" and "params".
40
+ def build_url(url='', params={}, string_query='')
41
+ queries = [string_query, hash_to_query(params)]
42
+
43
+ queries.delete_if{|q_elem|
44
+ q_elem.to_s.empty?
45
+ }
46
+
47
+ url += "?#{queries.join('&')}" unless queries.empty?
48
+
49
+ return url
50
+ end
51
+
52
+ # converts a key value pair to an escaped string:
53
+ # Example:
54
+ # build_param(:id, 1) == "id=1"
55
+ def build_param(k,v)
56
+ "#{escape(k)}=#{escape(v)}"
57
+ end
58
+
59
+ #
60
+ # converts hash into URL query string; pairs are emitted in the hash's iteration order (no sorting is done here)
61
+ # if a value is an array, the array values get mapped to the same key:
62
+ # hash_to_query(:q=>'blah', :fq=>['blah', 'blah'], 'facet.field'=>['location_facet', 'format_facet'])
63
+ # returns:
64
+ # q=blah&fq=blah&fq=blah&facet.field=location_facet&facet.field=format_facet
65
+ #
66
+ # if a value is empty/nil etc., it is not added
67
+ def hash_to_query(params)
68
+ mapped = params.map do |key, val|
69
+ next if val.to_s.empty?
70
+
71
+ if val.class == Array
72
+ hash_to_query(val.map { |elem| [key, elem] })
73
+ else
74
+ build_param key, val
75
+ end
76
+ end
77
+
78
+ mapped.compact.join("&")
79
+ end
80
+ end # module Utils
81
+ end # module Connection
82
+ end # module GSolr
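The query-building helpers above can be exercised on their own. The following is a minimal standalone sketch, assuming the array branch is meant to recurse on the current value. `QuerySketch` is a hypothetical module name, the escaping is rewritten with `String#bytes` rather than `unpack`, and one extra step (dropping empty fragments) is added that the original does not have.

```ruby
# Hypothetical standalone version of the Utils helpers, for illustration only.
module QuerySketch
  module_function

  # Percent-encode everything except letters, digits, "_", ".", "-" and space;
  # spaces become "+".
  def escape(value)
    value.to_s.gsub(/[^ a-zA-Z0-9_.-]/) { |ch|
      ch.bytes.map { |b| format('%%%02X', b) }.join
    }.tr(' ', '+')
  end

  def build_param(key, val)
    "#{escape(key)}=#{escape(val)}"
  end

  # Array values repeat the key once per element; empty values are skipped.
  def hash_to_query(params)
    mapped = params.map do |key, val|
      next if val.to_s.empty?

      if val.is_a?(Array)
        hash_to_query(val.map { |elem| [key, elem] })
      else
        build_param(key, val)
      end
    end

    # reject(&:empty?) is an addition here: it drops the empty fragment
    # that an empty array would otherwise leave behind.
    mapped.compact.reject(&:empty?).join('&')
  end
end

QuerySketch.hash_to_query(:q => 'roses', :fq => ['red', 'violet'])
# => "q=roses&fq=red&fq=violet"
QuerySketch.hash_to_query(:q => '*:*', :start => nil)
# => "q=%2A%3A%2A"
```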
@@ -0,0 +1,9 @@
1
+ require 'uri'
2
+
3
+ module GSolr
4
+ module Connection
5
+ autoload :NetHttp, 'gsolr/connection/net_http'
6
+ autoload :Utils, 'gsolr/connection/utils'
7
+ autoload :Requestable, 'gsolr/connection/requestable'
8
+ end
9
+ end
@@ -0,0 +1,52 @@
1
+ # A class that represents a "doc" xml element for a solr update
2
+ module GSolr
3
+ module Message
4
+ class Document
5
+
6
+ # "attrs" is a hash for setting the "doc" xml attributes
7
+ # "fields" is an array of Field objects
8
+ attr_accessor :attrs, :fields
9
+
10
+ # "doc_hash" must be a Hash/Mash object
11
+ # If a value in the "doc_hash" is an array,
12
+ # a field object is created for each value...
13
+ def initialize(doc_hash = {})
14
+ @fields = []
15
+ doc_hash.each_pair do |field,values|
16
+ # create a new field for each value (multi-valued)
17
+ # put non-array values into an array
18
+ values = [values] unless values.is_a?(Array)
19
+ values.each do |v|
20
+ next if v.to_s.empty?
21
+ @fields << GSolr::Message::Field.new({:name=>field}, v.to_s)
22
+ end
23
+ end
24
+ @attrs={}
25
+ end
26
+
27
+ # returns an array of fields that match the "name" arg
28
+ def fields_by_name(name)
29
+ @fields.select{|f|f.name==name}
30
+ end
31
+
32
+ # returns the *first* field that matches the "name" arg
33
+ def field_by_name(name)
34
+ @fields.detect{|f|f.name==name}
35
+ end
36
+
37
+ #
38
+ # Add a field value to the document. Options map directly to
39
+ # XML attributes in the Solr <field> node.
40
+ # See http://wiki.apache.org/solr/UpdateXmlMessages#head-8315b8028923d028950ff750a57ee22cbf7977c6
41
+ #
42
+ # === Example:
43
+ #
44
+ # document.add_field('title', 'A Title', :boost => 2.0)
45
+ #
46
+ def add_field(name, value, options = {})
47
+ @fields << GSolr::Message::Field.new(options.merge({:name=>name}), value)
48
+ end
49
+
50
+ end # class Document
51
+ end # module Message
52
+ end # module GSolr
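`Document#initialize` above expands each hash entry into one field per value and skips empty values, and `field_by_name` then finds a field so its attributes can be changed. A compact sketch of that behavior follows. `FieldSketch` and `fields_from` are hypothetical names that stand in for `GSolr::Message::Field` and the constructor loop.

```ruby
# Hypothetical stand-in for GSolr::Message::Field.
FieldSketch = Struct.new(:attrs, :value) do
  def name
    attrs[:name]
  end
end

# Expand a document hash into a flat list of fields, one per non-empty value.
def fields_from(doc_hash)
  doc_hash.flat_map do |name, values|
    values = [values] unless values.is_a?(Array)
    values.reject { |v| v.to_s.empty? }
          .map { |v| FieldSketch.new({ :name => name }, v.to_s) }
  end
end

fields = fields_from(:id => 1, :cat => ['a', '', 'b'], :note => nil)
# three fields remain: one for :id and two for :cat

# Boost a single field, as in the README's #add block example.
price_like = fields.detect { |f| f.name == :id }
price_like.attrs[:boost] = 2.0
```

Names are compared with `==`, so a field created from a Symbol key is found only when looked up by that Symbol, which matches how `field_by_name(:price)` is used in the README.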
@@ -0,0 +1,23 @@
1
+ # A class that represents a "doc"/"field" xml element for a solr update
2
+ module GSolr
3
+ module Message
4
+ class Field
5
+
6
+ # "attrs" is a hash for setting the "field" xml attributes
7
+ # "value" is the text value for the node
8
+ attr_accessor :attrs, :value
9
+
10
+ # "attrs" must be a hash
11
+ # "value" should be something that responds to #to_s
12
+ def initialize(attrs, value)
13
+ @attrs = attrs
14
+ @value = value
15
+ end
16
+
17
+ # the value of the "name" attribute
18
+ def name
19
+ @attrs[:name]
20
+ end
21
+ end # class Field
22
+ end # module Message
23
+ end # module GSolr