rsolr 0.12.0 → 2.6.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (49)
  1. checksums.yaml +7 -0
  2. data/.github/workflows/ruby.yml +29 -0
  3. data/.gitignore +13 -0
  4. data/.rspec +2 -0
  5. data/CHANGES.txt +63 -260
  6. data/Gemfile +13 -0
  7. data/README.rdoc +177 -63
  8. data/Rakefile +19 -0
  9. data/lib/rsolr/char.rb +6 -0
  10. data/lib/rsolr/client.rb +344 -86
  11. data/lib/rsolr/document.rb +66 -0
  12. data/lib/rsolr/error.rb +182 -0
  13. data/lib/rsolr/field.rb +87 -0
  14. data/lib/rsolr/generator.rb +5 -0
  15. data/lib/rsolr/json.rb +60 -0
  16. data/lib/rsolr/response.rb +95 -0
  17. data/lib/rsolr/uri.rb +25 -0
  18. data/lib/rsolr/version.rb +7 -0
  19. data/lib/rsolr/xml.rb +150 -0
  20. data/lib/rsolr.rb +47 -35
  21. data/rsolr.gemspec +44 -31
  22. data/spec/api/client_spec.rb +423 -0
  23. data/spec/api/document_spec.rb +48 -0
  24. data/spec/api/error_spec.rb +158 -0
  25. data/spec/api/json_spec.rb +248 -0
  26. data/spec/api/pagination_spec.rb +31 -0
  27. data/spec/api/rsolr_spec.rb +31 -0
  28. data/spec/api/uri_spec.rb +37 -0
  29. data/spec/api/xml_spec.rb +255 -0
  30. data/spec/fixtures/basic_configs/_rest_managed.json +1 -0
  31. data/spec/fixtures/basic_configs/currency.xml +67 -0
  32. data/spec/fixtures/basic_configs/lang/stopwords_en.txt +54 -0
  33. data/spec/fixtures/basic_configs/protwords.txt +21 -0
  34. data/spec/fixtures/basic_configs/schema.xml +530 -0
  35. data/spec/fixtures/basic_configs/solrconfig.xml +572 -0
  36. data/spec/fixtures/basic_configs/stopwords.txt +14 -0
  37. data/spec/fixtures/basic_configs/synonyms.txt +29 -0
  38. data/spec/integration/solr5_spec.rb +38 -0
  39. data/spec/lib/rsolr/client_spec.rb +19 -0
  40. data/spec/spec_helper.rb +94 -0
  41. metadata +228 -54
  42. data/lib/rsolr/connection/net_http.rb +0 -48
  43. data/lib/rsolr/connection/requestable.rb +0 -43
  44. data/lib/rsolr/connection/utils.rb +0 -73
  45. data/lib/rsolr/connection.rb +0 -9
  46. data/lib/rsolr/message/document.rb +0 -48
  47. data/lib/rsolr/message/field.rb +0 -20
  48. data/lib/rsolr/message/generator.rb +0 -89
  49. data/lib/rsolr/message.rb +0 -8
data/README.rdoc CHANGED
@@ -1,63 +1,131 @@
  =RSolr
 
- A Ruby client for Apache Solr. RSolr has been developed to be simple and extendable. It features transparent JRuby DirectSolrConnection support and a simple Hash-in, Hash-out architecture.
+ A simple, extensible Ruby client for Apache Solr.
+
+ ==Documentation
+ The code docs http://www.rubydoc.info/gems/rsolr
 
  == Installation:
- gem sources -a http://gemcutter.org
- sudo gem install rsolr
+ gem install rsolr
 
- ==Related Resources & Projects
- * {Solr}[http://lucene.apache.org/solr/]
- * {RSolr Google Group}[http://groups.google.com/group/rsolr]
- * {RSolr::Ext}[http://github.com/mwmitchell/rsolr-ext] -- an extension kit for RSolr
- * {Sunspot}[http://github.com/outoftime/sunspot] -- an awesome Solr DSL, built with RSolr
- * {Blacklight}[http://blacklightopac.org] -- a next generation Library OPAC, built with RSolr
- * {solr-ruby}[http://wiki.apache.org/solr/solr-ruby] -- the original Solr Ruby Gem
- * {java_bin}[http://github.com/kennyj/java_bin] -- Provides javabin/binary parsing for Ruby
-
- == Simple usage:
- require 'rubygems'
+ == Example:
  require 'rsolr'
- solr = RSolr.connect :url=>'http://solrserver.com'
-
+
+ # Direct connection
+ solr = RSolr.connect :url => 'http://solrserver.com'
+
+ # Connecting over a proxy server
+ solr = RSolr.connect :url => 'http://solrserver.com', :proxy=>'http://user:pass@proxy.example.com:8080'
+
+ # Using an alternate Faraday adapter
+ solr = RSolr.connect :url => 'http://solrserver.com', :adapter => :em_http
+
+ # Using a custom Faraday connection
+ conn = Faraday.new do |faraday|
+ faraday.response :logger # log requests to STDOUT
+ faraday.adapter Faraday.default_adapter # make requests with Net::HTTP
+ end
+ solr = RSolr.connect conn, :url => 'http://solrserver.com'
+
  # send a request to /select
- response = rsolr.select :q=>'*:*'
-
- # send a request to a custom request handler; /catalog
- response = rsolr.request '/catalog', :q=>'*:*'
-
- # alternative to above:
- response = rsolr.catalog :q=>'*:*'
+ response = solr.get 'select', :params => {:q => '*:*'}
+
+ # send a request to /catalog
+ response = solr.get 'catalog', :params => {:q => '*:*'}
+
+ When the Solr +:wt+ is +:ruby+, then the response will be a Hash. This Hash is the same object returned by Solr, but evaluated as Ruby. If the +:wt+ is not +:ruby+, then the response will be a String.
+
+ The response also exposes 2 attribute readers (for any +:wt+ value), +:request+ and +:response+. Both are Hash objects with symbolized keys.
+
+ The +:request+ attribute contains the original request context. You can use this for debugging or logging. Some of the keys this object contains are +:uri+, +:query+, +:method+ etc..
+
+ The +:response+ attribute contains the original response. This object contains the +:status+, +:body+ and +:headers+ keys.
+
+ == Request formats
+
+ By default, RSolr uses the Solr JSON command format for all requests.
+
+ RSolr.connect :url => 'http://solrserver.com', update_format: :json # the default
+ # or
+ RSolr.connect :url => 'http://solrserver.com', update_format: :xml
+
+ == Timeouts
+ The read and connect timeout settings can be set when creating a new instance of RSolr, and will
+ be passed on to underlying Faraday instance:
+
+ solr = RSolr.connect(:timeout => 120, :open_timeout => 120)
+
+ == Retry 503s
+ A 503 is usually a temporary error which RSolr may retry if requested. You may specify the number of retry attempts with the +:retry_503+ option.
+
+ Only requests which specify a Retry-After header will be retried, after waiting the indicated retry interval, otherwise RSolr will treat the request as a 500. You may specify a maximum Retry-After interval to wait with the +:retry_after_limit+ option (default: one second).
+ solr = RSolr.connect(:retry_503 => 1, :retry_after_limit => 1)
+
+ For additional control, consider using a custom Faraday connection (see above) using its `retry` middleware.
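The retry rule described in the added "Retry 503s" text (retry only when the response is a 503, carries a Retry-After header, and the interval is within the limit) can be sketched in plain Ruby. This is a minimal illustration under stated assumptions, not RSolr's implementation: the helper name `retry_delay_for` and its arguments are hypothetical, and only the delta-seconds form of Retry-After is handled.

```ruby
# Hypothetical helper, not part of RSolr: returns the number of seconds to
# wait before retrying, or nil when the response should not be retried.
def retry_delay_for(status, headers, retry_after_limit: 1)
  # Only a 503 is a candidate for a retry.
  return nil unless status == 503

  raw = headers['Retry-After']
  return nil if raw.nil?

  # Assumption: only the delta-seconds form is parsed; HTTP dates yield nil.
  seconds = Integer(raw, exception: false)
  return nil if seconds.nil? || seconds.negative?

  # Refuse to wait longer than the configured limit.
  seconds <= retry_after_limit ? seconds : nil
end
```

A caller would sleep for the returned number of seconds and reissue the request, or treat a `nil` result as a final failure.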
 
  == Querying
- Use the #select method to send requests to the /select handler:
- response = solr.select({
+ Use the #get / #post method to send search requests to the /select handler:
+ response = solr.get 'select', :params => {
  :q=>'washington',
  :start=>0,
  :rows=>10
- })
+ }
+ response["response"]["docs"].each{|doc| puts doc["id"] }
 
- The params sent into the method are sent to Solr as-is. The one exception is if a value is an array. When an array is used, multiple parameters are generated for the Solr query. Example:
-
- solr.select :q=>'roses', :fq=>['red', 'violet']
+ The +:params+ sent into the method are sent to Solr as-is, which is to say they are converted to Solr url style, but no special mapping is used.
+ When an array is used, multiple parameters *with the same name* are generated for the Solr query. Example:
+
+ solr.get 'select', :params => {:q=>'roses', :fq=>['red', 'violet']}
 
  The above statement generates this Solr query:
-
- ?q=roses&fq=red&fq=violet
 
- Use the #request method for a custom request handler path:
- response = solr.request '/documents', :q=>'test'
+ select?q=roses&fq=red&fq=violet
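The array expansion shown in this hunk (one `fq` parameter per array element) matches how Ruby's standard library form-encodes array values. A minimal sketch using `URI.encode_www_form`; this is the standard library encoder, not RSolr's own parameter handling, which may differ in escaping details:

```ruby
require 'uri'

# An array value produces one key=value pair per element, in order.
params = { q: 'roses', fq: ['red', 'violet'] }
query = URI.encode_www_form(params)

puts "select?#{query}" # prints: select?q=roses&fq=red&fq=violet
```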
 
- A shortcut for the above example:
- response = solr.documents :q=>'test'
+ ===Pagination
+ To paginate through a set of Solr documents, use the paginate method:
+ solr.paginate 1, 10, "select", :params => {:q => "test"}
 
+ The first argument is the current page, the second is how many documents to return for each page. In other words, "page" is the "start" Solr param and "per-page" is the "rows" Solr param.
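The README wording equates "page" with Solr's `start`, but `start` is a document offset, so a page number normally has to be converted first. A minimal sketch of the usual 1-based conversion, assuming `start = (page - 1) * per_page`; the helper name `pagination_params` is hypothetical, and RSolr's own handling of edge cases may differ:

```ruby
# Hypothetical helper, not part of RSolr: maps a 1-based page number and a
# page size onto Solr's "start" (offset) and "rows" (page size) parameters.
def pagination_params(page, per_page)
  page = [page.to_i, 1].max # treat page 0 or negative pages as page 1
  per_page = per_page.to_i
  { start: (page - 1) * per_page, rows: per_page }
end

first_page = pagination_params(1, 10) # start 0, rows 10
third_page = pagination_params(3, 10) # start 20, rows 10
```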
 
- == Updating Solr
- Updating can be done using native Ruby structures. Hashes are used for single documents and arrays are used for a collection of documents (hashes). These structures get turned into simple XML "messages". Raw XML strings can also be used.
+ The paginate method returns WillPaginate ready "docs" objects, so for example in a Rails application, paginating is as simple as:
+ <%= will_paginate @solr_response["response"]["docs"] %>
+
+ ===Method Missing
+ The +RSolr::Client+ class also uses +method_missing+ for setting the request handler/path:
+
+ solr.paintings :params => {:q=>'roses', :fq=>['red', 'violet']}
+
+ This is sent to Solr as:
+ paintings?q=roses&fq=red&fq=violet
+
+ This works with pagination as well:
+
+ solr.paginate_paintings 1, 10, {:q=>'roses', :fq=>['red', 'violet']}
+
+ ===Using POST for Search Queries
+ There may be cases where the query string is too long for a GET request. RSolr solves this issue by converting hash objects into form-encoded strings:
+ response = solr.music :data => {:q => "*:*"}
+
+ The +:data+ hash is serialized as a form-encoded query string, and the correct content-type headers are sent along to Solr.
+
+ ===Sending HEAD Requests
+ There may be cases where you'd like to send a HEAD request to Solr:
+ solr.head("admin/ping").response[:status] == 200
+
+ ==Sending HTTP Headers
+ Solr responds to the request headers listed here: http://wiki.apache.org/solr/SolrAndHTTPCaches
+ To send header information to Solr using RSolr, just use the +:headers+ option:
+ response = solr.head "admin/ping", :headers => {"Cache-Control" => "If-None-Match"}
 
- Raw XML via #update
- solr.update '</commit>'
- solr.update '</optimize>'
+ ===Building a Request
+ +RSolr::Client+ provides a method for building a request context, which can be useful for debugging or logging etc.:
+ request_context = solr.build_request "select", :data => {:q => "*:*"}, :method => :post, :headers => {}
+
+ To build a paginated request use build_paginated_request:
+ request_context = solr.build_paginated_request 1, 10, "select", ...
+
+ == Updating Solr
+ Updating is done using native Ruby objects. Hashes are used for single documents and arrays are used for a collection of documents (hashes). These objects get turned into simple XML "messages". Raw XML strings can also be used.
 
  Single document via #add
  solr.add :id=>1, :price=>1.00
@@ -66,17 +134,28 @@ Multiple documents via #add
  documents = [{:id=>1, :price=>1.00}, {:id=>2, :price=>10.50}]
  solr.add documents
 
- When adding, you can also supply "add" xml element attributes and/or a block for manipulating other "add" related elements (docs and fields) when using the #add method:
-
+ The optional +:add_attributes+ hash can also be used to set Solr "add" document attributes:
+ solr.add documents, :add_attributes => {:commitWithin => 10}
+
+ Raw commands via #update
+ solr.update data: '<commit/>', headers: { 'Content-Type' => 'text/xml' }
+ solr.update data: { optimize: true }.to_json, headers: { 'Content-Type' => 'application/json' }
+
+ When adding, you can also supply "add" xml element attributes and/or a block for manipulating other "add" related elements (docs and fields) by calling the +xml+ method directly:
+
  doc = {:id=>1, :price=>1.00}
- add_attributes = {:allowDups=>false, :commitWithin=>10.0}
- solr.add(doc, add_attributes) do |doc|
+ add_attributes = {:allowDups=>false, :commitWithin=>10}
+ add_xml = solr.xml.add(doc, add_attributes) do |doc|
  # boost each document
  doc.attrs[:boost] = 1.5
  # boost the price field:
  doc.field_by_name(:price).attrs[:boost] = 2.0
  end
 
+ Now the "add_xml" object can be sent to Solr like:
+ solr.update :data => add_xml
+
+ ===Deleting
  Delete by id
  solr.delete_by_id 1
  or an array of ids
@@ -87,27 +166,62 @@ Delete by query:
  Delete by array of queries
  solr.delete_by_query ['price:1.00', 'price:10.00']
 
- Commit & optimize shortcuts
- solr.commit
- solr.optimize
+ ===Commit / Optimize
+ solr.commit, :commit_attributes => {}
+ solr.optimize, :optimize_attributes => {}
 
  == Response Formats
- The default response format is Ruby. When the :wt param is set to :ruby, the response is eval'd resulting in a Hash. You can get a raw response by setting the :wt to "ruby" - notice, the string -- not a symbol. RSolr will eval the Ruby string ONLY if the :wt value is :ruby. All other response formats are available as expected, :wt=>'xml' etc..
-
- ===Evaluated Ruby (default)
- solr.select(:wt=>:ruby) # notice :ruby is a Symbol
- ===Raw Ruby
- solr.select(:wt=>'ruby') # notice 'ruby' is a String
+ The default response format is Ruby. When the +:wt+ param is set to +:ruby+, the response is eval'd resulting in a Hash. You can get a raw response by setting the +:wt+ to +"ruby"+ - notice, the string -- not a symbol. RSolr will eval the Ruby string ONLY if the :wt value is :ruby. All other response formats are available as expected, +:wt=>'xml'+ etc..
 
+ ===Evaluated Ruby:
+ solr.get 'select', :params => {:wt => :ruby} # notice :ruby is a Symbol
+ ===Raw Ruby:
+ solr.get 'select', :params => {:wt => 'ruby'} # notice 'ruby' is a String
  ===XML:
- solr.select(:wt=>:xml)
- ===JSON:
- solr.select(:wt=>:json)
+ solr.get 'select', :params => {:wt => :xml}
+ ===JSON (default):
+ solr.get 'select', :params => {:wt => :json}
 
- You can access the original request context (path, params, url etc.) by calling the #raw method:
- response = solr.select :q=>'*:*'
- response.raw[:status_code]
- response.raw[:body]
- response.raw[:url]
-
- The raw is a hash that contains the generated params, url, path, post data, headers etc., very useful for debugging and testing.
+ ==Related Resources & Projects
+ * {RSolr Google Group}[http://groups.google.com/group/rsolr] -- The RSolr discussion group
+ * {rsolr-ext}[http://github.com/mwmitchell/rsolr-ext] -- An extension kit for RSolr
+ * {rsolr-direct}[http://github.com/mwmitchell/rsolr-direct] -- JRuby direct connection for RSolr
+ * {rsolr-nokogiri}[http://github.com/mwmitchell/rsolr-nokogiri] -- Gives RSolr Nokogiri for XML generation.
+ * {SunSpot}[http://github.com/sunspot/sunspot] -- An awesome Solr DSL, built with RSolr
+ * {Blacklight}[http://blacklightopac.org] -- A "next generation" Library OPAC, built with RSolr
+ * {java_bin}[http://github.com/kennyj/java_bin] -- Provides javabin/binary parsing for RSolr
+ * {Solr}[http://lucene.apache.org/solr/] -- The Apache Solr project
+ * {solr-ruby}[http://wiki.apache.org/solr/solr-ruby] -- The original Solr Ruby Gem!
+
+ == Note on Patches/Pull Requests
+ * Fork the project.
+ * Make your feature addition or bug fix.
+ * Add tests for it. This is important so I don't break it in a future version unintentionally.
+ * Commit, do not mess with rakefile, version, or history
+ (if you want to have your own version, that is fine but bump version in a commit by itself I can ignore when I pull)
+ * Send me a pull request. Bonus points for topic branches.
+
+ ==Contributors
+ * Nathan Witmer
+ * Magnus Bergmark
+ * shima
+ * Randy Souza
+ * Mat Brown
+ * Jeremy Hinegardner
+ * Denis Goeury
+ * shairon toledo
+ * Rob Di Marco
+ * Peter Kieltyka
+ * Mike Perham
+ * Lucas Souza
+ * Dmitry Lihachev
+ * Antoine Latter
+ * Naomi Dushay
+
+ ==Author
+
+ Matt Mitchell <mailto:goodieboy@gmail.com>
+
+ ==Copyright
+
+ Copyright (c) 2008-2010 Matt Mitchell. See LICENSE for details.
data/Rakefile ADDED
@@ -0,0 +1,19 @@
+ require 'bundler/gem_tasks'
+
+ task default: ['spec']
+
+ require 'rspec/core/rake_task'
+
+ RSpec::Core::RakeTask.new(:spec)
+
+ # Rdoc
+ require 'rdoc/task'
+
+ desc 'Generate documentation for the rsolr gem.'
+ RDoc::Task.new(:doc) do |rdoc|
+ rdoc.rdoc_dir = 'doc'
+ rdoc.title = 'RSolr'
+ rdoc.options << '--line-numbers' << '--inline-source'
+ rdoc.rdoc_files.include('README.rdoc')
+ rdoc.rdoc_files.include('lib/**/*.rb')
+ end
data/lib/rsolr/char.rb ADDED
@@ -0,0 +1,6 @@
+ # :nodoc:
+ module RSolr::Char
+ def self.included(*)
+ warn 'RSolr::Char is deprecated without replacement, and will be removed in RSolr 3.x'
+ end
+ end
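Ruby calls the `self.included` hook whenever another class or module includes the module, which is why the deprecation warning in this file fires at include time rather than on a method call. A self-contained sketch of the same pattern, using a hypothetical `LegacyHelpers` module and `Report` class rather than RSolr itself:

```ruby
# Hypothetical module illustrating the include-time deprecation pattern.
module LegacyHelpers
  # Ruby invokes this hook each time the module is included somewhere;
  # `base` is the class or module doing the including.
  def self.included(base)
    warn "LegacyHelpers is deprecated (included by #{base})"
  end
end

class Report
  include LegacyHelpers # the warning is written to stderr here
end
```

Because the hook runs once per `include`, every including class produces its own warning, which helps locate each remaining use before the module is removed.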