traject-solrj_writer 1.0.0-java

checksums.yaml.gz ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA1:
3
+ metadata.gz: f7d301bb6262198a78ec629ad45985652b152cb0
4
+ data.tar.gz: 99196e4d061051b040f99ea553fd5a75328f9990
5
+ SHA512:
6
+ metadata.gz: e4441cbe2d7a76dc4274c760d8d832c2ea7027063e4e1b8e10db310ae0839ef9c4a5c556aa4366c431a0d4441b1cbd9f3c1a97b46f9aaaea0faad3c44824123b
7
+ data.tar.gz: 812e57b618b13f27f5f2024df39db2777c58503aa1fb9ec021066a176ebb851686c809844cefd7a22ea86c922dc2c8aacbe0acd2a7439468fb790212f223ad4e
data/.gitignore ADDED
@@ -0,0 +1,14 @@
1
+ /.bundle/
2
+ /.yardoc
3
+ /Gemfile.lock
4
+ /_yardoc/
5
+ /coverage/
6
+ /doc/
7
+ /pkg/
8
+ /spec/reports/
9
+ /tmp/
10
+ *.bundle
11
+ *.so
12
+ *.o
13
+ *.a
14
+ mkmf.log
data/.travis.yml ADDED
@@ -0,0 +1,3 @@
1
+ language: ruby
2
+ rvm:
3
+ - 2.2.0
data/Gemfile ADDED
@@ -0,0 +1,4 @@
1
+ source 'https://rubygems.org'
2
+
3
+ # Specify your gem's dependencies in traject-solrj_writer.gemspec
4
+ gemspec
data/LICENSE.txt ADDED
@@ -0,0 +1,22 @@
1
+ Copyright (c) 2015 Bill Dueber
2
+
3
+ MIT License
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining
6
+ a copy of this software and associated documentation files (the
7
+ "Software"), to deal in the Software without restriction, including
8
+ without limitation the rights to use, copy, modify, merge, publish,
9
+ distribute, sublicense, and/or sell copies of the Software, and to
10
+ permit persons to whom the Software is furnished to do so, subject to
11
+ the following conditions:
12
+
13
+ The above copyright notice and this permission notice shall be
14
+ included in all copies or substantial portions of the Software.
15
+
16
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
17
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
18
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
19
+ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
20
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
21
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
22
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,116 @@
1
+ # Traject::SolrJWriter
2
+
3
+ Use [Traject](http://github.com/traject-project/traject) to write to
4
+ a Solr index using the `solrj` java library.
5
+
6
+ **This gem requires JRuby and Traject >= 2.0**
7
+
8
+ **This gem is not yet released**
9
+
10
+ ## Notes on using this gem
11
+ * Our benchmarking indicates that `Traject::SolrJsonWriter` (included with Traject) outperforms
12
+ this library by a notable margin. Use that if you can.
13
+ * If you're running a version of Solr < 3.2, you can't use `SolrJsonWriter` at all; this
14
+ becomes your best bet.
15
+ * Given its reliance on loading `.jar` files, `Traject::SolrJWriter` obviously requires JRuby.
16
+
17
+ ## Usage
18
+
19
+ You'll need to make sure this gem is available (e.g., by putting it in your Gemfile)
20
+ and then have code like this:
21
+
22
+ ```ruby
23
+ # Sample traject configuration for using solrj
24
+ require 'traject'
25
+ require 'traject/solrj_writer'
26
+
27
+
28
+ settings do
29
+ # Arguments for any solr writer
30
+ provide "solr.url", ENV["SOLR_URL"] || 'http://localhost:8983/solr/core1'
31
+ provide "solr_writer.commit_on_close", "true"
32
+ provide "solr_writer.thread_pool", 2
33
+ provide "solr_writer.batch_size", 50
34
+
35
+ # SolrJ Specific stuff
36
+ provide "solrj_writer.parser_class_name", "XMLResponseParser"
37
+ provide "writer_class_name", "Traject::SolrJWriter"
38
+
39
+ store 'processing_thread_pool', 5
40
+ store "log.batch_size", 25_000
41
+ end
42
+ ```
43
+
44
+ ...and then use Traject as normal.
45
+
46
+
47
+ ## Full list of settings
48
+
49
+ ### Generic Solr settings (used for both SolrJWriter and SolrJsonWriter)
50
+
51
+ * `solr.url`: Your solr url (required)
52
+ * `solr_writer.commit_on_close`: If true (or string 'true'), send a commit to solr
53
+ at end of #process.
54
+
55
+ * `solr_writer.batch_size`: If non-nil and more than 1, send documents to
56
+ solr in batches of solr_writer.batch_size. If nil/1,
57
+ however, an http transaction with solr will be done
58
+ per doc. Defaults to 100, which seems to be a sweet spot.
59
+
60
+ * `solr_writer.thread_pool`: Defaults to 1. A thread pool is used for submitting docs
61
+ to solr. Set to 0 or nil to disable threading. Set to 1,
62
+ there will still be a single bg thread doing the adds. For
63
+ very fast Solr servers and very fast indexing processes, it may
64
+ make sense to increase this value to throw at Solr as fast as it
65
+ can catch.
66
+
67
+ ### SolrJ-specific settings
68
+
69
+ * `solrj_writer.server_class_name`: Defaults to "HttpSolrServer". You can specify
70
+ another Solr Server sub-class, but it has
71
+ to take a one-arg url constructor. Otherwise,
72
+ subclass this writer class and override
73
+ instantiate_solr_server!
74
+
75
+ * `solrj.jar_dir`: Custom directory containing all of the SolrJ jars. All
76
+ jars in this dir will be loaded. Otherwise,
77
+ we load our own packaged solrj jars. This setting
78
+ can't really be used differently in the same app instance,
79
+ since jars are loaded globally.
80
+
81
+ * `solrj_writer.parser_class_name`: A String name of a class in package
82
+ org.apache.solr.client.solrj.impl,
83
+ we'll instantiate one with a zero-arg
84
+ constructor, and pass it as an arg to setParser on
85
+ the SolrServer instance, if present.
86
+ NOTE: For contacting a Solr 1.x server, with the
87
+ recent version of SolrJ used by default, set to
88
+ "XMLResponseParser"
89
+
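+ Putting a few of the SolrJ-specific settings above together: here is a minimal
+ sketch of a config that loads SolrJ jars from your own directory instead of the
+ copies vendored with this gem (the `/opt/solrj/lib` path is just an illustrative
+ assumption, not something this gem creates or requires):
+
+ ```ruby
+ settings do
+   provide "solr.url", "http://localhost:8983/solr/core1"
+   provide "writer_class_name", "Traject::SolrJWriter"
+
+   # Every *.jar file in this directory will be loaded at startup
+   provide "solrj.jar_dir", "/opt/solrj/lib"
+
+   # Talking to an older (1.x/3.x) Solr? Use the XML response parser
+   provide "solrj_writer.parser_class_name", "XMLResponseParser"
+ end
+ ```
+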
90
+
91
+
92
+
93
+ ## Installation
94
+
95
+ Add this line to your application's Gemfile:
96
+
97
+ ```ruby
98
+ gem 'traject-solrj_writer'
99
+ ```
100
+
101
+ And then execute:
102
+
103
+ $ bundle
104
+
105
+ Or install it yourself as:
106
+
107
+ $ gem install traject-solrj_writer
108
+
109
+
110
+ ## Contributing
111
+
112
+ 1. Fork it ( https://github.com/traject-project/traject-solrj_writer/fork )
113
+ 2. Create your feature branch (`git checkout -b my-new-feature`)
114
+ 3. Commit your changes (`git commit -am 'Add some feature'`)
115
+ 4. Push to the branch (`git push origin my-new-feature`)
116
+ 5. Create a new Pull Request
data/Rakefile ADDED
@@ -0,0 +1,11 @@
1
+ require "bundler/gem_tasks"
2
+ require "rake/testtask"
3
+
4
+ Rake::TestTask.new(:spec) do |t|
5
+ t.pattern = 'spec/**/*_spec.rb'
6
+ t.libs << "spec"
7
+ end
8
+
9
+ task :test => :spec
10
+ task :default => :spec
11
+
data/lib/traject/solrj_writer.rb ADDED
@@ -0,0 +1,427 @@
1
+ require "traject/solrj_writer/version"
2
+ require 'yell'
3
+
4
+ require 'traject'
5
+ require 'traject/util'
6
+ require 'traject/qualified_const_get'
7
+ require 'traject/thread_pool'
8
+
9
+ require 'uri'
10
+ require 'thread' # for Mutex
11
+
12
+ #
13
+ # Writes to a Solr index using SolrJ, and the SolrJ HttpSolrServer.
14
+ #
15
+ # After you call #close, you can check #skipped_record_count
16
+ # for an integer count of skipped records.
17
+ #
18
+ # For fatal errors that raise... async processing with thread_pool means that
19
+ # you may not get a raise immediately after calling #put, you may get it on
20
+ # a FUTURE #put or #close. You should get it eventually though.
21
+ #
22
+ # ## Settings
23
+ #
24
+ # * solr.url: Your solr url (required)
25
+ #
26
+ # * solrj_writer.server_class_name: Defaults to "HttpSolrServer". You can specify
27
+ # another Solr Server sub-class, but it has
28
+ # to take a one-arg url constructor. Otherwise,
29
+ # subclass this writer class and override
30
+ # instantiate_solr_server!
31
+ #
32
+ # * solrj.jar_dir: Custom directory containing all of the SolrJ jars. All
33
+ # jars in this dir will be loaded. Otherwise,
34
+ # we load our own packaged solrj jars. This setting
35
+ # can't really be used differently in the same app instance,
36
+ # since jars are loaded globally.
37
+ #
38
+ # * solrj_writer.parser_class_name: A String name of a class in package
39
+ # org.apache.solr.client.solrj.impl,
40
+ # we'll instantiate one with a zero-arg
41
+ # constructor, and pass it as an arg to setParser on
42
+ # the SolrServer instance, if present.
43
+ # NOTE: For contacting a Solr 1.x server, with the
44
+ # recent version of SolrJ used by default, set to
45
+ # "XMLResponseParser"
46
+ #
47
+ # * solr_writer.commit_on_close: If true (or string 'true'), send a commit to solr
48
+ # at end of #process.
49
+ #
50
+ # * solr_writer.batch_size: If non-nil and more than 1, send documents to
51
+ # solr in batches of solr_writer.batch_size. If nil/1,
52
+ # however, an http transaction with solr will be done
53
+ # per doc. Defaults to 100, which seems to be a sweet spot.
54
+ #
55
+ # * solr_writer.thread_pool: Defaults to 1. A thread pool is used for submitting docs
56
+ # to solr. Set to 0 or nil to disable threading. Set to 1,
57
+ # there will still be a single bg thread doing the adds. For
58
+ # very fast Solr servers and very fast indexing processes, it may
59
+ # make sense to increase this value to throw at Solr as fast as it
60
+ # can catch.
61
+ #
62
+ # ## Example
63
+ #
64
+ # settings do
65
+ # provide "writer_class_name", "Traject::SolrJWriter"
66
+ #
67
+ # # This is just regular ruby, so don't be afraid to have conditionals!
68
+ # # Switch on hostname, for test and production server differences
69
+ # if Socket.gethostname =~ /devhost/
70
+ # provide "solr.url", "http://my.dev.machine:9033/catalog"
71
+ # else
72
+ # provide "solr.url", "http://my.production.machine:9033/catalog"
73
+ # end
74
+ #
75
+ # provide "solrj_writer.parser_class_name", "BinaryResponseParser" # for Solr 4.x
76
+ # # provide "solrj_writer.parser_class_name", "XMLResponseParser" # For solr 1.x or 3.x
77
+ #
78
+ # provide "solrj_writer.commit_on_close", "true"
79
+ # end
80
+ class Traject::SolrJWriter
81
+
82
+
83
+ # just a tuple of a SolrInputDocument
84
+ # and a Traject::Indexer::Context it came from
85
+ class UpdatePackage
86
+ attr_accessor :solr_document, :context
87
+ def initialize(doc, ctx)
88
+ self.solr_document = doc
89
+ self.context = ctx
90
+ end
91
+ end
92
+
93
+ # Class method to load up the jars from vendor if we need to
94
+ # Requires solrj jar(s) from settings['solrj.jar_dir'] if given, otherwise
95
+ # uses jars bundled with traject gem in ./vendor
96
+ #
97
+ # Have to pass in a settings arg, so we can check it for specified jar dir.
98
+ #
99
+ # Tries not to do the dirglob and require if solrj has already been loaded.
100
+ # Will define global constants with classes HttpSolrServer and SolrInputDocument
101
+ # if not already defined.
102
+ #
103
+ # This is all a bit janky, maybe there's a better way to do this? We do want
104
+ # a 'require' method defined in some utility location, so multiple classes can
105
+ # use it, including extra gems. This method may be used by extra gems, so should
106
+ # be considered part of the API -- after it's called, those top-level
107
+ # globals should be available, and solrj should be loaded.
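+ #
+ # For example, a hypothetical caller (say, another writer gem) might do:
+ #
+ #   Traject::SolrJWriter.require_solrj_jars(settings)
+ #   doc = SolrInputDocument.new   # top-level constant is now defined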
108
+ def self.require_solrj_jars(settings)
109
+ jruby_ensure_init!
110
+
111
+ tries = 0
112
+ begin
113
+ tries += 1
114
+
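+ # Referencing the SolrJ classes below raises NameError when the jars
+ # aren't loaded yet, which drops us into the rescue to require them
+ # (from solrj.jar_dir or the vendored copies) and retry.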
115
+ org.apache.solr
116
+ org.apache.solr.client.solrj
117
+
118
+ # java_import which we'd normally use weirdly doesn't work
119
+ # from a class method. https://github.com/jruby/jruby/issues/975
120
+ Object.const_set("HttpSolrServer", org.apache.solr.client.solrj.impl.HttpSolrServer) unless defined? ::HttpSolrServer
121
+ Object.const_set("SolrInputDocument", org.apache.solr.common.SolrInputDocument) unless defined? ::SolrInputDocument
122
+ rescue NameError => e
123
+ included_jar_dir = File.expand_path("../../vendor/solrj/lib", File.dirname(__FILE__))
124
+
125
+ jardir = settings["solrj.jar_dir"] || included_jar_dir
126
+ Dir.glob("#{jardir}/*.jar") do |x|
127
+ require x
128
+ end
129
+ if tries > 1
130
+ raise LoadError.new("Can not find SolrJ java classes")
131
+ else
132
+ retry
133
+ end
134
+ end
135
+ end
136
+
137
+ # just does a `require 'java'` but rescues the exception if we
138
+ # aren't jruby, and raises a better error message.
139
+ # Pass in a developer-presentable name of a feature to include in the error
140
+ # message if you want.
141
+ def self.jruby_ensure_init!(feature = nil)
142
+ begin
143
+ require 'java'
144
+ rescue LoadError => e
145
+ feature ||= "A traject feature is in use that"
146
+ msg = if feature
147
+ "#{feature} requires jruby, but you do not appear to be running under jruby. We recommend `chruby` for managing multiple ruby installs."
148
+ end
149
+ raise LoadError.new(msg)
150
+ end
151
+ end
152
+
153
+
154
+
155
+ include Traject::QualifiedConstGet
156
+
157
+ attr_reader :settings
158
+
159
+ attr_reader :batched_queue
160
+
161
+ def initialize(argSettings)
162
+ @settings = Traject::Indexer::Settings.new(argSettings)
163
+
164
+ # Let's go ahead and alias the old solrj_writer settings to the
165
+ # newer solr_writer settings, so the old config files still work
166
+
167
+ %w[commit_on_close batch_size thread_pool].each do |s|
168
+ swkey = "solr_writer.#{s}"
169
+ sjwkey = "solrj_writer.#{s}"
170
+ @settings[swkey] = @settings[sjwkey] unless @settings[sjwkey].nil?
171
+
172
+ end
173
+
174
+ settings_check!(settings)
175
+
176
+ ensure_solrj_loaded!
177
+
178
+ solr_server # init
179
+
180
+ @batched_queue = java.util.concurrent.LinkedBlockingQueue.new
181
+
182
+ # when multi-threaded exceptions raised in threads are held here
183
+ # we need a HIGH performance queue here to try and avoid slowing things down,
184
+ # since we need to check it frequently.
185
+ @async_exception_queue = java.util.concurrent.ConcurrentLinkedQueue.new
186
+
187
+ # Store error count in an AtomicInteger, so multi threads can increment
188
+ # it safely, if we're threaded.
189
+ @skipped_record_incrementer = java.util.concurrent.atomic.AtomicInteger.new(0)
190
+
191
+ # if our thread pool settings are 0, it'll just create a null threadpool that
192
+ # executes in calling context.
193
+ @thread_pool = Traject::ThreadPool.new( @settings['solr_writer.thread_pool'].to_i )
194
+
195
+ @debug_ascii_progress = (@settings["debug_ascii_progress"].to_s == "true")
196
+
197
+ logger.info(" #{self.class.name} writing to '#{settings['solr.url']}'")
198
+ end
199
+
200
+ # Loads solrj if not already loaded. By loading all jars found
201
+ # in settings["solrj.jar_dir"]
202
+ def ensure_solrj_loaded!
203
+ unless defined?(HttpSolrServer) && defined?(SolrInputDocument)
204
+ self.class.require_solrj_jars(settings)
205
+ end
206
+
207
+ # And for now, SILENCE SolrJ logging
208
+ org.apache.log4j.Logger.getRootLogger().addAppender(org.apache.log4j.varia.NullAppender.new)
209
+ end
210
+
211
+ # Method IS thread-safe, can be called concurrently by multi-threads.
212
+ #
213
+ # Why? If not using batched add, we just use the SolrServer, which is already
214
+ # thread safe itself.
215
+ #
216
+ # If we are using batch add, we surround all access to our shared state batch queue
217
+ # in a mutex -- just a naive implementation. May be able to improve performance
218
+ # with more sophisticated java.util.concurrent data structure (blocking queue etc)
219
+ # I did try a java ArrayBlockingQueue or LinkedBlockingQueue instead of our own
220
+ # mutex -- I did not see consistently different performance. May want to
221
+ # change so doesn't use a mutex at all if multiple mapping threads aren't being
222
+ # used.
223
+ #
224
+ # this class does not at present use any threads itself, all work will be done
225
+ # in the calling thread, including actual http transactions to solr via solrj SolrServer
226
+ # if using batches, then not every #put is a http transaction, but when it is,
227
+ # it's in the calling thread, synchronously.
228
+ def put(context)
229
+ @thread_pool.raise_collected_exception!
230
+
231
+ # package the SolrInputDocument along with the context, so we have
232
+ # the context for error reporting when we actually add.
233
+
234
+ package = UpdatePackage.new(hash_to_solr_document(context.output_hash), context)
235
+
236
+ if settings["solr_writer.batch_size"].to_i > 1
237
+ ready_batch = []
238
+
239
+ batched_queue.add(package)
240
+ if batched_queue.size >= settings["solr_writer.batch_size"].to_i
241
+ batched_queue.drain_to(ready_batch)
242
+ end
243
+
244
+ if ready_batch.length > 0
245
+ if @debug_ascii_progress
246
+ $stderr.write("^")
247
+ if @thread_pool.queue && (@thread_pool.queue.size >= @thread_pool.queue_capacity)
248
+ $stderr.write "!"
249
+ end
250
+ end
251
+
252
+ @thread_pool.maybe_in_thread_pool { batch_add_document_packages(ready_batch) }
253
+ end
254
+ else # non-batched add, add one at a time.
255
+ @thread_pool.maybe_in_thread_pool { add_one_document_package(package) }
256
+ end
257
+ end
258
+
259
+ def hash_to_solr_document(hash)
260
+ doc = SolrInputDocument.new
261
+ hash.each_pair do |key, value_array|
262
+ value_array.each do |value|
263
+ doc.addField( key, value )
264
+ end
265
+ end
266
+ return doc
267
+ end
268
+
269
+ # Takes array and batch adds it to solr -- array of UpdatePackage tuples of
270
+ # SolrInputDocument and context.
271
+ #
272
+ # Catches error in batch add, logs, and re-tries docs individually
273
+ #
274
+ # Is thread-safe, because SolrServer is thread-safe, and we aren't
275
+ # referencing any other shared state. Important that CALLER passes
276
+ # in a doc array that is not shared state, extracting it from
277
+ # shared state batched_queue in a mutex.
278
+ def batch_add_document_packages(current_batch)
279
+ begin
280
+ a = current_batch.collect {|package| package.solr_document }
281
+ solr_server.add( a )
282
+
283
+ $stderr.write "%" if @debug_ascii_progress
284
+ rescue Exception => e
285
+ # Error in batch, none of the docs got added, let's try to re-add
286
+ # em all individually, so those that CAN get added get added, and those
287
+ # that can't get individually logged.
288
+ logger.warn "Error encountered in batch solr add, will re-try documents individually, at a performance penalty...\n" + Traject::Util.exception_to_log_message(e)
289
+ current_batch.each do |package|
290
+ add_one_document_package(package)
291
+ end
292
+ end
293
+ end
294
+
295
+
296
+ # Adds a single SolrInputDocument passed in as an UpdatePackage combo of SolrInputDocument
297
+ # and context.
298
+ #
299
+ # Rescues exceptions thrown by SolrServer.add, logs them, and then raises them
300
+ # again if deemed fatal and should stop indexing. Only intended to be used on a SINGLE
301
+ # document add. If we get an exception on a multi-doc batch add, we need to recover
302
+ # differently.
303
+ def add_one_document_package(package)
304
+ begin
305
+ solr_server.add(package.solr_document)
306
+ # Honestly not sure what the difference is between those types, but SolrJ raises both
307
+ rescue org.apache.solr.common.SolrException, org.apache.solr.client.solrj.SolrServerException => e
308
+ id = package.context.source_record && package.context.source_record['001'] && package.context.source_record['001'].value
309
+ id_str = id ? "001:#{id}" : ""
310
+
311
+ position = package.context.position
312
+ position_str = position ? "at file position #{position} (starting at 1)" : ""
313
+
314
+ logger.error("Could not index record #{id_str} #{position_str}\n" + Traject::Util.exception_to_log_message(e) )
315
+ logger.debug(package.context.source_record.to_s)
316
+
317
+ @skipped_record_incrementer.getAndIncrement() # AtomicInteger, thread-safe increment.
318
+
319
+ if fatal_exception? e
320
+ logger.fatal("SolrJ exception judged fatal, raising...")
321
+ raise e
322
+ end
323
+ end
324
+ end
325
+
326
+ def logger
327
+ settings["logger"] ||= Yell.new(STDERR, :level => "gt.fatal") # null logger
328
+ end
329
+
330
+ # If an exception is encountered talking to Solr, is it one we should
331
+ # entirely give up on? SolrJ doesn't use a useful exception class hierarchy,
332
+ # we have to look into its details and guess.
333
+ def fatal_exception?(e)
334
+
335
+
336
+ root_cause = e.respond_to?(:getRootCause) && e.getRootCause
337
+
338
+ # Various kinds of inability to actually talk to the
339
+ # server look like this:
340
+ if root_cause.kind_of? java.io.IOException
341
+ return true
342
+ end
343
+
344
+ # Consider Solr server returning HTTP 500 Internal Server Error to be fatal.
345
+ # This can mean, for instance, that disk space is exhausted on solr server.
346
+ if e.kind_of?(Java::OrgApacheSolrCommon::SolrException) && e.code == 500
347
+ return true
348
+ end
349
+
350
+ return false
351
+ end
352
+
353
+ def close
354
+ @thread_pool.raise_collected_exception!
355
+
356
+ # Any leftovers in batch buffer? Send em to the threadpool too.
357
+ if batched_queue.length > 0
358
+ packages = []
359
+ batched_queue.drain_to(packages)
360
+
361
+ # we do it in the thread pool for consistency, and so
362
+ # it goes to the end of the queue behind any outstanding
363
+ # work in the pool.
364
+ @thread_pool.maybe_in_thread_pool { batch_add_document_packages( packages ) }
365
+ end
366
+
367
+ # Wait for shutdown, and time it.
368
+ logger.debug "SolrJWriter: Shutting down thread pool, waiting if needed..."
369
+ elapsed = @thread_pool.shutdown_and_wait
370
+ if elapsed > 60
371
+ logger.warn "Waited #{elapsed} seconds for all SolrJWriter threads, you may want to increase solr_writer.thread_pool (currently #{@settings["solr_writer.thread_pool"]})"
372
+ end
373
+ logger.debug "SolrJWriter: Thread pool shutdown complete"
374
+ logger.warn "SolrJWriter: #{skipped_record_count} skipped records" if skipped_record_count > 0
375
+
376
+ # check again now that we've waited, there could still be some
377
+ # that didn't show up before.
378
+ @thread_pool.raise_collected_exception!
379
+
380
+ if settings["solr_writer.commit_on_close"].to_s == "true"
381
+ logger.info "SolrJWriter: Sending commit to solr..."
382
+ solr_server.commit
383
+ end
384
+
385
+ solr_server.shutdown
386
+ @solr_server = nil
387
+ end
388
+
389
+ # Return count of encountered skipped records. Most accurate to call
390
+ # it after #close, in which case it should include full count, even
391
+ # under async thread_pool.
392
+ def skipped_record_count
393
+ @skipped_record_incrementer.get
394
+ end
395
+
396
+
397
+ def solr_server
398
+ @solr_server ||= instantiate_solr_server!
399
+ end
400
+ attr_writer :solr_server # mainly for testing
401
+
402
+ # Instantiates a solr server of class settings["solrj_writer.server_class_name"] or "HttpSolrServer"
403
+ # and initializes it with settings["solr.url"]
404
+ def instantiate_solr_server!
405
+ server_class = qualified_const_get( settings["solrj_writer.server_class_name"] || "HttpSolrServer" )
406
+ server = server_class.new( settings["solr.url"].to_s );
407
+
408
+ if parser_name = settings["solrj_writer.parser_class_name"]
409
+ #parser = org.apache.solr.client.solrj.impl.const_get(parser_name).new
410
+ parser = Java::JavaClass.for_name("org.apache.solr.client.solrj.impl.#{parser_name}").ruby_class.new
411
+ server.setParser( parser )
412
+ end
413
+
414
+ server
415
+ end
416
+
417
+ def settings_check!(settings)
418
+ unless settings.has_key?("solr.url") && ! settings["solr.url"].nil?
419
+ raise ArgumentError.new("SolrJWriter requires a 'solr.url' solr url in settings")
420
+ end
421
+
422
+ unless settings["solr.url"] =~ /^#{URI::regexp}$/
423
+ raise ArgumentError.new("SolrJWriter requires a 'solr.url' setting that looks like a URL, not: `#{settings['solr.url']}`")
424
+ end
425
+ end
426
+
427
+ end
data/lib/traject/solrj_writer/version.rb ADDED
@@ -0,0 +1,5 @@
1
+ module Traject
2
+ class SolrJWriter
3
+ VERSION = "1.0.0"
4
+ end
5
+ end
data/spec/minitest_helper.rb ADDED
@@ -0,0 +1,85 @@
1
+ $LOAD_PATH.unshift File.expand_path('../../lib', __FILE__)
2
+ require 'traject/solrj_writer'
3
+
4
+ require 'minitest/spec'
5
+ require 'minitest/autorun'
6
+
7
+
8
+ # Get a traject context with the given data
9
+ def context_with(hash)
10
+ Traject::Indexer::Context.new(:output_hash => hash)
11
+ end
12
+
13
+
14
+ # pretends to be a SolrJ HTTPServer-like thing, just kind of mocks it up
15
+ # and records what happens and simulates errors in some cases.
16
+ class MockSolrServer
17
+ attr_accessor :things_added, :url, :committed, :parser, :shutted_down
18
+
19
+ def initialize(url)
20
+ @url = url
21
+ @things_added = []
22
+ @add_mutex = Mutex.new
23
+ end
24
+
25
+ def add(thing)
26
+ @add_mutex.synchronize do # easy peasy threadsafety for our mock
27
+ if @url == "http://no.such.place"
28
+ raise org.apache.solr.client.solrj.SolrServerException.new("mock bad uri", java.io.IOException.new)
29
+ end
30
+
31
+ # simulate a multiple id error please
32
+ if [thing].flatten.find {|doc| doc.getField("id").getValueCount() != 1}
33
+ raise org.apache.solr.client.solrj.SolrServerException.new("mock non-1 size of 'id'")
34
+ else
35
+ things_added << thing
36
+ end
37
+ end
38
+ end
39
+
40
+ def commit
41
+ @committed = true
42
+ end
43
+
44
+ def setParser(parser)
45
+ @parser = parser
46
+ end
47
+
48
+ def shutdown
49
+ @shutted_down = true
50
+ end
51
+
52
+ end
53
+
54
+ # keeps things from complaining about "yell-1.4.0/lib/yell/adapters/io.rb:66 warning: syswrite for buffered IO"
55
+ # for reasons I don't entirely understand, involving yell using syswrite and tests sometimes
56
+ # using $stderr.puts. https://github.com/TwP/logging/issues/31
57
+ STDERR.sync = true
58
+
59
+ # Hacky way to turn off Indexer logging by default, say only
60
+ # log things higher than fatal, which is nothing.
61
+ require 'traject/indexer/settings'
62
+ Traject::Indexer::Settings.defaults["log.level"] = "gt.fatal"
63
+
64
+ def support_file_path(relative_path)
65
+ return File.expand_path(File.join("test_support", relative_path), File.dirname(__FILE__))
66
+ end
67
+
68
+ # An assert_length helper; not sure why minitest doesn't provide one
69
+ def assert_length(length, obj, msg = nil)
70
+ unless obj.respond_to? :length
71
+ raise ArgumentError, "object passed to assert_length must respond_to? :length (got #{obj.inspect})"
72
+ end
73
+
74
+
75
+ msg ||= "Expected length of #{obj} to be #{length}, but was #{obj.length}"
76
+
77
+ assert_equal(length, obj.length, msg.to_s )
78
+ end
79
+
80
+ def assert_start_with(start_with, obj, msg = nil)
81
+ msg ||= "expected #{obj} to start with #{start_with}"
82
+
83
+ assert obj.start_with?(start_with), msg
84
+ end
85
+
data/spec/solrj_writer_spec.rb ADDED
@@ -0,0 +1,223 @@
1
+ require 'minitest_helper'
2
+
3
+ require 'traject/solrj_writer'
4
+
5
+ # It's crazy hard to test this effectively, especially under threading.
6
+ # we do our best to test decently, and keep the tests readable,
7
+ # but some things aren't quite reliable under threading, sorry.
8
+
9
+ # creates a solrj_writer, maybe with MockSolrServer, maybe
10
+ # with a real one. With settings in @settings, set or change
11
+ # in before blocks
12
+ #
13
+ # writer left in @writer, with maybe mock solr server in @mock
14
+ def create_solrj_writer
15
+ @writer = Traject::SolrJWriter.new(@settings)
16
+
17
+ if @settings["solrj_writer.server_class_name"] == "MockSolrServer"
18
+ # so we can test it later
19
+ @mock = @writer.solr_server
20
+ end
21
+ end
22
+
23
+ def context_with(hash)
24
+ Traject::Indexer::Context.new(:output_hash => hash)
25
+ end
26
+
27
+
28
+ # Some tests we need to run multiple times in multiple batch/thread scenarios,
29
+ # we DRY them up by creating a method to add the tests in different describe blocks
30
+ def test_handles_errors
31
+ it "errors but does not raise on multiple ID's" do
32
+ @writer.put context_with("id" => ["one", "two"])
33
+ @writer.close
34
+ assert_equal 1, @writer.skipped_record_count, "counts skipped record"
35
+ end
36
+
37
+ it "errors and raises on connection error" do
38
+ @settings.merge!("solr.url" => "http://no.such.place")
39
+ create_solrj_writer
40
+ assert_raises org.apache.solr.client.solrj.SolrServerException do
41
+ @writer.put context_with("id" => ["one"])
42
+ # in batch and/or thread scenarios, sometimes no exception raised until close
43
+ @writer.close
44
+ end
45
+ end
46
+ end
47
+
48
+ $stderr.puts "\n======\nWARNING: Testing SolrJWriter with mock instance, set ENV 'solr_url' to test against real solr\n======\n\n" unless ENV["solr_url"]
49
+ # WARNING. The SolrJWriter talks to a running Solr server.
50
+ #
51
+ # set ENV['solr_url'] to run tests against a real solr server
52
+ # OR
53
+ # the tests will run against a mock SolrJ server instead.
54
+ #
55
+ #
56
+ # This is a pretty limited test right now.
57
+ describe "Traject::SolrJWriter" do
58
+ before do
59
+ @settings = {
60
+ # Use XMLResponseParser just to test, and so it will work
61
+ # with a solr 1.4 test server
62
+ "solrj_writer.parser_class_name" => "XMLResponseParser",
63
+ "solrj_writer.commit_on_close" => "false", # real solr is way too slow if we always have it commit on close
64
+ "solrj_writer.batch_size" => nil
65
+ }
66
+
67
+ if ENV["solr_url"]
68
+ @settings["solr.url"] = ENV["solr_url"]
69
+ else
70
+ @settings["solr.url"] = "http://example.org/solr"
71
+ @settings["solrj_writer.server_class_name"] = "MockSolrServer"
72
+ end
73
+ end
74
+
75
+ it "raises on missing url" do
76
+ assert_raises(ArgumentError) { Traject::SolrJWriter.new }
77
+ assert_raises(ArgumentError) { Traject::SolrJWriter.new("solr.url" => nil) }
78
+ end
79
+
80
+ it "raises on malformed URL" do
81
+ assert_raises(ArgumentError) { Traject::SolrJWriter.new("solr.url" => "") }
82
+ assert_raises(ArgumentError) { Traject::SolrJWriter.new("solr.url" => "adfadf") }
83
+ end
84
+
85
+ it "defaults to solrj_writer.batch_size more than 1" do
86
+ assert 1 < Traject::SolrJWriter.new("solr.url" => "http://example.org/solr").settings["solr_writer.batch_size"].to_i
87
+ end
88
+
89
+ describe "with no threading or batching" do
90
+ before do
91
+ @settings.merge!("solrj_writer.batch_size" => nil, "solrj_writer.thread_pool" => nil)
92
+ create_solrj_writer
93
+ end
94
+
95
+ it "writes a simple document" do
96
+ @writer.put context_with("title_t" => ["MY TESTING TITLE"], "id" => ["TEST_TEST_TEST_0001"])
97
+ @writer.close
98
+
99
+
100
+ if @mock
101
+ assert_kind_of org.apache.solr.client.solrj.impl.XMLResponseParser, @mock.parser
102
+ assert_equal @settings["solr.url"], @mock.url
103
+
104
+ assert_equal 1, @mock.things_added.length
105
+ assert_kind_of SolrInputDocument, @mock.things_added.first.first
106
+
107
+ assert @mock.shutted_down
108
+ end
109
+ end
110
+
111
+ it "commits on close when so set" do
112
+ @settings.merge!("solrj_writer.commit_on_close" => "true")
113
+ create_solrj_writer
114
+
115
+ @writer.put context_with("title_t" => ["MY TESTING TITLE"], "id" => ["TEST_TEST_TEST_0001"])
116
+ @writer.close
117
+
118
+ # if it's not a mock, we don't really test anything, except that
119
+ # no exception was raised. oh well. If it's a mock, we can
120
+ # ask it.
121
+ if @mock
122
+ assert @mock.committed, "mock gets commit called on it"
123
+ end
124
+ end
125
+
126
+ test_handles_errors
127
+
128
+
129
+ # I got to see what serialized marc binary does against a real solr server,
130
+ # sorry this is a bit out of place, but this is the class that talks to real
131
+ # solr server right now. This test won't do much unless you have
132
+ # real solr server set up.
133
+ #
134
+ # Not really a good test right now, just manually checking my solr server,
135
+ # using this to make the add reproducible at least.
136
+ describe "Serialized MARC" do
137
+ it "goes to real solr somehow" do
138
+ record = MARC::Reader.new(support_file_path "manufacturing_consent.marc").to_a.first
139
+
140
+ serialized = record.to_marc # straight binary
141
+ @writer.put context_with("marc_record_t" => [serialized], "id" => ["TEST_TEST_TEST_MARC_BINARY"])
142
+ @writer.close
143
+ end
144
+ end
145
+ end
146
+
147
+ describe "with batching but no threading" do
148
+ before do
149
+ @settings.merge!("solr_writer.batch_size" => 5, "solr_writer.thread_pool" => nil)
150
+ create_solrj_writer
151
+ end
152
+
153
+ it "sends all documents" do
154
+ docs = Array(1..17).collect do |i|
155
+ {"id" => ["item_#{i}"], "title" => ["To be #{i} again!"]}
156
+ end
157
+
158
+ docs.each do |doc|
159
+ @writer.put context_with(doc)
160
+ end
161
+ @writer.close
162
+
163
+ if @mock
164
+ # 3 batches of 5, and the leftover 2 (16, 17)
165
+ assert_length 4, @mock.things_added
166
+
167
+ assert_length 5, @mock.things_added[0]
168
+ assert_length 5, @mock.things_added[1]
169
+ assert_length 5, @mock.things_added[2]
170
+ assert_length 2, @mock.things_added[3]
171
+ end
172
+ end
173
+
174
+ test_handles_errors
175
+ end
176
+
177
+ describe "with batching and threading" do
178
+ before do
179
+ @settings.merge!("solr_writer.batch_size" => 5, "solr_writer.thread_pool" => 2)
180
+ create_solrj_writer
181
+ end
182
+
183
+ it "sends all documents" do
184
+ docs = Array(1..17).collect do |i|
185
+ {"id" => ["item_#{i}"], "title" => ["To be #{i} again!"]}
186
+ end
187
+
188
+ docs.each do |doc|
189
+ @writer.put context_with(doc)
190
+ end
191
+ @writer.close
192
+
193
+ if @mock
194
+ # 3 batches of 5, and the leftover 2 (16, 17)
195
+ assert_length 4, @mock.things_added
196
+
197
+ # we can't be sure of the order under async,
198
+ # just three of 5 and one of 2
199
+ assert_length 3, @mock.things_added.find_all {|array| array.length == 5}
200
+ assert_length 1, @mock.things_added.find_all {|array| array.length == 2}
201
+ end
202
+ end
203
+
204
+ test_handles_errors
205
+ end
206
+
207
+ describe "aliases solrj_writer.* settings to solr_writer.* settings" do
208
+ # commit_on_close batch_size thread_pool
209
+
210
+ it "aliases as needed" do
211
+ @settings.merge!("solrj_writer.commit_on_close" => true, "solrj_writer.batch_size" => 5, "solrj_writer.thread_pool" => 2)
212
+ create_solrj_writer
213
+ assert_equal(true, @writer.settings['solr_writer.commit_on_close'], "commit_on_close")
214
+ assert_equal(5, @writer.settings['solr_writer.batch_size'], "batch_size")
215
+ assert_equal(2, @writer.settings['solr_writer.thread_pool'], "thread_pool")
216
+ end
217
+ end
218
+
219
+
220
+
221
+ end
222
+
223
+ require 'thread' # Mutex
data/spec/test_support/manufacturing_consent.marc ADDED
@@ -0,0 +1 @@
1
+ 02067cam a2200469 a 4500001000800000005001700008008004100025010001700066020001500083020001800098029002100116029001900137029001700156035001200173035001200185035001600197035002000213040006400233049000900297050002200306082002100328084001500349100002200364245014800386246002600534260003900560300002700599500005800626504006600684505032800750650002701078650003101105700001901136856009901155856009101254910002601345938007101371938004001442938003901482991006401521994001201585271018320080307152200.0010831s2002 nyu b 001 0 eng  a 2001050014 a0375714499 a97803757144981 aNLGGCb2461901591 aYDXCPb18130101 aNZ1b6504593 a2710183 a2710183 aocm47971712 a(OCoLC)47971712 aDLCcDLCdUSXdBAKERdNLGGCdNPLdYDXCPdOCLCQdBTCTAdMdBJ aJHEE00aP96.E25bH47 200200a381/.4530223221 a05.302bcl1 aHerman, Edward S.10aManufacturing consent :bthe political economy of the mass media /cEdward S. Herman and Noam Chomsky ; with a new introduction by the authors.14aManugacturing content aNew York :bPantheon Books,c2002. alxiv, 412 p. ;c24 cm. aUpdated ed. of: Manufacturing consent. 1st ed. c1988. aIncludes bibliographical references (p. [331]-393) and index.0 aA propaganda model -- Worthy and unworthy victims -- Legitimizing versus meaningless third world elections: El Salvador, Guatemala, and Nicaragua -- The KGB-Bulgarian plot to kill the Pope: free-market disinformation as "news" -- The Indochina wars (I): Vietnam -- The Indochina wars (II): Laos and Cambodia -- Conclusions. 0aMass mediaxOwnership. 0aMass media and propaganda.1 aChomsky, Noam.423Contributor biographical informationuhttp://www.loc.gov/catdir/bios/random051/2001050014.html423Publisher descriptionuhttp://www.loc.gov/catdir/description/random044/2001050014.html a2710183bHorizon bib# aBaker & TaylorbBKTYc18.95d14.21i0375714499n0003788716sactive aYBP Library ServicesbYANKn1813010 aBaker and TaylorbBTCPn2001050014 aP96.E25 H47 2002flcbelc1cc. 1q0i4659750lembluememsel aC0bJHE
data/traject-solrj_writer.gemspec ADDED
@@ -0,0 +1,27 @@
1
+ # coding: utf-8
2
+ lib = File.expand_path('../lib', __FILE__)
3
+ $LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
4
+ require 'traject/solrj_writer/version'
5
+
6
+ Gem::Specification.new do |spec|
7
+ spec.platform = 'java'
8
+ spec.name = "traject-solrj_writer"
9
+ spec.version = Traject::SolrJWriter::VERSION
10
+ spec.authors = ["Bill Dueber"]
11
+ spec.email = ["bill@dueber.com"]
12
+ spec.summary = %q{Use Traject to index data into Solr using solrj under JRuby}
13
+ spec.homepage = "https://github.com/traject-project/traject-solrj_writer"
14
+ spec.license = "MIT"
15
+
16
+ spec.files = `git ls-files -z`.split("\x0")
17
+ spec.executables = spec.files.grep(%r{^bin/}) { |f| File.basename(f) }
18
+ spec.test_files = spec.files.grep(%r{^(test|spec|features)/})
19
+ spec.require_paths = ["lib"]
20
+
21
+
22
+ spec.add_development_dependency "bundler", "~> 1.7"
23
+ spec.add_development_dependency "rake", "~> 10.0"
24
+ spec.add_development_dependency "minitest"
25
+ spec.add_development_dependency 'simple_solr_client', '>=0.1.2'
26
+
27
+ end
data/vendor/solrj/README ADDED
@@ -0,0 +1,8 @@
1
+ Inside ./lib are all the jar files necessary for solrj. They are used by the SolrJWriter.
2
+
3
+ The build.xml and ivy.xml file included here were used to download the jars, and
4
+ can be used to re-download them. Just run `ant` in this directory, and the contents of `./lib` will be replaced by the current latest release of solrj. Or edit ivy.xml to download a specific solrj version (perhaps change ivy.xml to use a java prop for release, defaulting to latest! ha.) And then commit changes to repo, etc, to update solrj distro'd with traject.
5
+
6
+ This is not necessarily a great way to provide access to solrj .jars. It's just what we're doing now, and it works. See main project README.md for discussion and other potential ideas.
7
+
8
+ Note, the ivy.xml in here currently downloads a bit MORE than we really need, like .jars of docs and source. Haven't yet figured out how to tell it to download all maven-specified solrj jars that we really need, but not the ones we don't need. (we DO need logging-related ones to properly get logging working!) If you can figure it out, it'd be an improvement, as ALL jars in this dir are by default loaded by traject at runtime.
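+
+ For reference, the writer picks these jars up at runtime with a plain glob-and-require,
+ roughly like the following sketch (this is what lib/traject/solrj_writer.rb already does,
+ not extra code that lives in this directory):
+
+   Dir.glob("#{jar_dir}/*.jar") { |jar| require jar }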
data/vendor/solrj/build.xml ADDED
@@ -0,0 +1,39 @@
1
+ <?xml version="1.0" encoding="utf-8"?>
2
+ <project xmlns:ivy="antlib:org.apache.ivy.ant" name="traject-fetch-jars" default="prepare" basedir=".">
3
+
4
+
5
+
6
+
7
+
8
+ <target name="prepare" depends="setup-ivy">
9
+ <mkdir dir="lib"/>
10
+ <ivy:retrieve sync="true"/>
11
+ </target>
12
+
13
+ <target name="clean">
14
+ <delete dir="lib"/>
15
+ </target>
16
+
17
+
18
+
19
+ <property name="ivy.install.version" value="2.3.0"/>
20
+ <property name="ivy.jar.dir" value="ivy"/>
21
+ <property name="ivy.jar.file" value="${ivy.jar.dir}/ivy.jar"/>
22
+
23
+ <available file="${ivy.jar.file}" property="skip.download"/>
24
+
25
+ <target name="download-ivy" unless="skip.download">
26
+ <mkdir dir="${ivy.jar.dir}"/>
27
+
28
+ <echo message="installing ivy..."/>
29
+ <get src="http://repo1.maven.org/maven2/org/apache/ivy/ivy/${ivy.install.version}/ivy-${ivy.install.version}.jar" dest="${ivy.jar.file}" usetimestamp="true"/>
30
+ </target>
31
+
32
+ <target name="setup-ivy" depends="download-ivy" description="--> setup ivy">
33
+ <path id="ivy.lib.path">
34
+ <fileset dir="${ivy.jar.dir}" includes="*.jar"/>
35
+ </path>
36
+ <taskdef resource="org/apache/ivy/ant/antlib.xml" uri="antlib:org.apache.ivy.ant" classpathref="ivy.lib.path"/>
37
+ </target>
38
+
39
+ </project>
data/vendor/solrj/ivy.xml ADDED
@@ -0,0 +1,16 @@
1
+ <ivy-module version="2.0">
2
+ <info organisation="org.code4lib" module="traject"/>
3
+
4
+ <dependencies>
5
+ <!-- downloads EVERYTHING including docs and source we don't need. Oh well, it
6
+ works for prototyping at least... -->
7
+ <dependency org="org.apache.solr" name="solr-solrj" rev="latest.release"/>
8
+
9
+
10
+ <!-- Attempts to give us just what we need, including working logging, still
11
+ not quite right, but leaving here for thinking... -->
12
+ <!-- <dependency org="org.apache.solr" name="solr-solrj" rev="latest.release" conf="default" />
13
+ <dependency org="org.slf4j" name="slf4j-simple" rev="latest.release"/> -->
14
+ </dependencies>
15
+ </ivy-module>
16
+
metadata ADDED
@@ -0,0 +1,133 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: traject-solrj_writer
3
+ version: !ruby/object:Gem::Version
4
+ version: 1.0.0
5
+ platform: java
6
+ authors:
7
+ - Bill Dueber
8
+ autorequire:
9
+ bindir: bin
10
+ cert_chain: []
11
+ date: 2015-02-10 00:00:00.000000000 Z
12
+ dependencies:
13
+ - !ruby/object:Gem::Dependency
14
+ requirement: !ruby/object:Gem::Requirement
15
+ requirements:
16
+ - - ~>
17
+ - !ruby/object:Gem::Version
18
+ version: '1.7'
19
+ name: bundler
20
+ prerelease: false
21
+ type: :development
22
+ version_requirements: !ruby/object:Gem::Requirement
23
+ requirements:
24
+ - - ~>
25
+ - !ruby/object:Gem::Version
26
+ version: '1.7'
27
+ - !ruby/object:Gem::Dependency
28
+ requirement: !ruby/object:Gem::Requirement
29
+ requirements:
30
+ - - ~>
31
+ - !ruby/object:Gem::Version
32
+ version: '10.0'
33
+ name: rake
34
+ prerelease: false
35
+ type: :development
36
+ version_requirements: !ruby/object:Gem::Requirement
37
+ requirements:
38
+ - - ~>
39
+ - !ruby/object:Gem::Version
40
+ version: '10.0'
41
+ - !ruby/object:Gem::Dependency
42
+ requirement: !ruby/object:Gem::Requirement
43
+ requirements:
44
+ - - '>='
45
+ - !ruby/object:Gem::Version
46
+ version: '0'
47
+ name: minitest
48
+ prerelease: false
49
+ type: :development
50
+ version_requirements: !ruby/object:Gem::Requirement
51
+ requirements:
52
+ - - '>='
53
+ - !ruby/object:Gem::Version
54
+ version: '0'
55
+ - !ruby/object:Gem::Dependency
56
+ requirement: !ruby/object:Gem::Requirement
57
+ requirements:
58
+ - - '>='
59
+ - !ruby/object:Gem::Version
60
+ version: 0.1.2
61
+ name: simple_solr_client
62
+ prerelease: false
63
+ type: :development
64
+ version_requirements: !ruby/object:Gem::Requirement
65
+ requirements:
66
+ - - '>='
67
+ - !ruby/object:Gem::Version
68
+ version: 0.1.2
69
+ description:
70
+ email:
71
+ - bill@dueber.com
72
+ executables: []
73
+ extensions: []
74
+ extra_rdoc_files: []
75
+ files:
76
+ - .gitignore
77
+ - .travis.yml
78
+ - Gemfile
79
+ - LICENSE.txt
80
+ - README.md
81
+ - Rakefile
82
+ - lib/traject/solrj_writer.rb
83
+ - lib/traject/solrj_writer/version.rb
84
+ - spec/minitest_helper.rb
85
+ - spec/solrj_writer_spec.rb
86
+ - spec/test_support/manufacturing_consent.marc
87
+ - traject-solrj_writer.gemspec
88
+ - vendor/solrj/README
89
+ - vendor/solrj/build.xml
90
+ - vendor/solrj/ivy.xml
91
+ - vendor/solrj/lib/commons-io-2.3.jar
92
+ - vendor/solrj/lib/httpclient-4.3.1.jar
93
+ - vendor/solrj/lib/httpcore-4.3.jar
94
+ - vendor/solrj/lib/httpmime-4.3.1.jar
95
+ - vendor/solrj/lib/jcl-over-slf4j-1.6.6.jar
96
+ - vendor/solrj/lib/log4j-1.2.16.jar
97
+ - vendor/solrj/lib/noggit-0.5.jar
98
+ - vendor/solrj/lib/slf4j-api-1.7.6.jar
99
+ - vendor/solrj/lib/slf4j-log4j12-1.6.6.jar
100
+ - vendor/solrj/lib/solr-solrj-4.3.1-javadoc.jar
101
+ - vendor/solrj/lib/solr-solrj-4.3.1-sources.jar
102
+ - vendor/solrj/lib/solr-solrj-4.3.1.jar
103
+ - vendor/solrj/lib/wstx-asl-3.2.7.jar
104
+ - vendor/solrj/lib/zookeeper-3.4.6.jar
105
+ homepage: https://github.com/traject-project/traject-solrj_writer
106
+ licenses:
107
+ - MIT
108
+ metadata: {}
109
+ post_install_message:
110
+ rdoc_options: []
111
+ require_paths:
112
+ - lib
113
+ required_ruby_version: !ruby/object:Gem::Requirement
114
+ requirements:
115
+ - - '>='
116
+ - !ruby/object:Gem::Version
117
+ version: '0'
118
+ required_rubygems_version: !ruby/object:Gem::Requirement
119
+ requirements:
120
+ - - '>='
121
+ - !ruby/object:Gem::Version
122
+ version: '0'
123
+ requirements: []
124
+ rubyforge_project:
125
+ rubygems_version: 2.1.9
126
+ signing_key:
127
+ specification_version: 4
128
+ summary: Use Traject to index data into Solr using solrj under JRuby
129
+ test_files:
130
+ - spec/minitest_helper.rb
131
+ - spec/solrj_writer_spec.rb
132
+ - spec/test_support/manufacturing_consent.marc
133
+ has_rdoc: