lyber-core 0.9.6.2.3 → 1.3.0

data/README.rdoc CHANGED
@@ -38,10 +38,28 @@ The BagitBag class requires the bagit gem
  http://github.com/flazz/bagit
 
  == Build and release procedure
+ Modify the version number in lyber-core.gemspec, then push your commits to AFS. DO NOT TAG!
  Run: 'rake dlss_release' to tag, build, and publish the lyber-core gem
  See the Rakefile and the LyberCore::DlssRelease task in lib/lyber_core/rake/dlss_release.rb for more details
 
  == Releases
+ - <b>1.3</b> Started to use Dor::Config for workspace configuration
+ - <b>1.2.1</b> Clean up logging of exceptions in LyberCore::Log
+ - <b>1.2</b> Robots can now run as daemons via the LyberCore::Robots::ServiceController
+ - <b>1.1.2</b> Can pass an array of "command line" arguments to the Robot constructor
+ - <b>1.1.1</b> Robot#start now returns LyberCore::Robots::CONTINUE if it did work without error, LyberCore::Robots::SLEEP if it did no work,
+ and LyberCore::Robots::HALT if it reached its error limit while working on its queue
+ - <b>1.1.0</b> Allow Robots::WorkQueue to resolve an arbitrary number of prerequisites
+ - <b>1.0.0</b> Factored all Dor::* classes and object models out of lyber-core and into a separate dor-services gem. WARNING: MAY BREAK COMPATIBILITY WITH PREVIOUS DOR-ENABLED CODE.
+ - <b>0.9.8</b> Created branch for legacy work "0.9-legacy". Robots can now be configured with fully qualified workflows for prerequisites
+ eg <i>dor:googleScannedBookWF:register-object</i>
+ - <b>0.9.7.4</b> Untangled a couple development dependencies; fixed issue where "include REXML" was polluting the Object namespace
+ - <b>0.9.7.3</b> Logging enhancements
+ - <b>0.9.7.2</b> IdentityMetadata bugfixes
+ - <b>0.9.7.1</b> Enhanced exception handling
+ - <b>0.9.7</b> ActiveMQ message-based robot parallelization as described here: https://consul.stanford.edu/x/tQjdBw . Removal of ROXML models.
+ - <b>0.9.6.3</b> Better error reporting for LyberCore::Utils::FileUtilities.execute, which means when a system command fails we have a better idea of why.
+ - <b>0.9.6.2</b> Handles new response from workflow service when there are no objects in the queue: <objects count="0"\>
  - <b>0.9.6</b> DorService.get_objects_for_workstep can handle one or two completed steps. Trimmed-down gem dependencies now defined in lyber-core.gemspec. 'rake dlss_release' will tag, build and publish gem
  - <b>0.9.5.5</b> Robots now log to ROBOT_ROOT/log/robot_name.log unless specified in constructor
  - <b>0.9.5.4</b> Custom exception classes, more checking of error conditions
data/lib/dlss_service.rb CHANGED
@@ -1,4 +1,3 @@
- require 'rubygems'
  require 'net/http'
  require 'net/https'
  require 'uri'
data/lib/dor_service.rb CHANGED
@@ -4,10 +4,10 @@ require 'uri'
  require 'cgi'
  require 'rexml/document'
 
- include REXML
-
  class DorService
 
+ include REXML
+
  def DorService.get_https_connection(url)
  https = Net::HTTP.new(url.host, url.port)
  if(url.scheme == 'https')
@@ -31,7 +31,7 @@ class DorService
  druid = $1
  return druid
  rescue Exception => e
- LyberCore::Log.error("Unable to create object #{e.backtrace}")
+ LyberCore::Log.debug("Unable to create object #{e.backtrace}")
  raise e
  end
  end
@@ -55,7 +55,7 @@ class DorService
  LyberCore::Log.debug("new druid = #{druid}")
  return druid
  rescue Exception => e
- LyberCore::Log.error("Unable to create object")
+ LyberCore::Log.debug("Unable to create object")
  raise e, "Unable to create object "
  end
  end
@@ -111,7 +111,8 @@ class DorService
  LyberCore::Log.debug("Fetching druid for dor_id #{dor_id} at url #{url_string}")
  url = URI.parse(url_string)
  req = Net::HTTP::Get.new(url.request_uri)
- res = DorService.get_https_connection(url).start {|http| http.request(req) }
+ # res = DorService.get_https_connection(url).start {|http| http.request(req) }
+ res = LyberCore::Connection.send_request(url,req)
 
  case res
  when Net::HTTPSuccess
@@ -121,10 +122,10 @@ class DorService
  LyberCore::Log.debug("Barcode does not yet exist in DOR: #{dor_id}")
  return nil
  when Net::HTTPServerError
- LyberCore::Log.error("Encountered HTTPServerError error when requesting #{url}: #{res.inspect}")
+ LyberCore::Log.debug("Encountered HTTPServerError error when requesting #{url}: #{res.inspect}")
  raise "Encountered 500 error when requesting #{url}: #{res.inspect}"
  else
- LyberCore::Log.error("Encountered unknown error when requesting #{url}: #{res.inspect}")
+ LyberCore::Log.debug("Encountered unknown error when requesting #{url}: #{res.inspect}")
  raise "Encountered unknown error when requesting #{url}: #{res.inspect}"
  end
  end
@@ -132,7 +133,7 @@ class DorService
  ############################################# Start of Datastream methods
  # Until ActiveFedora supports client-side certificate configuration, we are stuck with our own methods to access datastreams
 
- #/objects/{pid}/datastreams/{dsID} ? [controlGroup] [dsLocation] [altIDs] [dsLabel] [versionable] [dsState] [formatURI] [checksumType] [checksum] [logMessage]
+ #/objects/pid/datastreams/dsID ? [controlGroup] [dsLocation] [altIDs] [dsLabel] [versionable] [dsState] [formatURI] [checksumType] [checksum] [logMessage]
  def DorService.add_datastream(druid, ds_id, ds_label, xml, content_type='application/xml', versionable = false )
  DorService.add_datastream_managed(druid, ds_id, ds_label, xml, content_type, versionable)
  end
@@ -168,8 +169,9 @@ class DorService
  LyberCore::Log.debug("Connecting to #{url_string}...")
  req = Net::HTTP::Get.new(url.request_uri)
  LyberCore::Log.debug("request object: #{req.inspect}")
- res = DorService.get_https_connection(url).start {|http| http.request(req) }
-
+ # res = DorService.get_https_connection(url).start {|http| http.request(req) }
+ res = LyberCore::Connection.send_request(url,req)
+
  case res
  when Net::HTTPSuccess
  return res.body
@@ -177,10 +179,10 @@ class DorService
  LyberCore::Log.debug("Datastream not found at url #{url_string}")
  return nil
  when Net::HTTPServerError
- LyberCore::Log.error("Attempted to reach #{url_string} but failed")
+ LyberCore::Log.debug("Attempted to reach #{url_string} but failed")
  raise "Encountered 500 error when requesting #{url_string}: #{res.inspect}"
  else
- LyberCore::Log.error("Encountered unknown error when requesting #{url}: #{res.inspect}")
+ LyberCore::Log.debug("Encountered unknown error when requesting #{url}: #{res.inspect}")
  raise "Encountered unknown error when requesting #{url}: #{res.inspect}"
  end
  rescue Exception => e
@@ -203,7 +205,8 @@ class DorService
  req = Net::HTTP::Get.new(url.request_uri)
  req.basic_auth FEDORA_USER, FEDORA_PASS
  LyberCore::Log.debug("request object: #{req.inspect}")
- res = DorService.get_https_connection(url).start {|http| http.request(req) }
+ # res = DorService.get_https_connection(url).start {|http| http.request(req) }
+ res = LyberCore::Connection.send_request(url,req)
  case res
  when Net::HTTPSuccess
  return res.body
@@ -276,56 +279,69 @@ class DorService
  # </objects>
  def DorService.get_objects_for_workstep(repository, workflow, completed, waiting)
  LyberCore::Log.debug("DorService.get_objects_for_workstep(#{repository}, #{workflow}, #{completed}, #{waiting})")
- begin
- if repository.nil? or workflow.nil? or completed.nil? or waiting.nil?
- LyberCore::Log.fatal("Can't execute DorService.get_objects_for_workstep: missing info")
- end
-
- unless defined?(WORKFLOW_URI) and WORKFLOW_URI != nil
- LyberCore::Log.fatal("WORKFLOW_URI is not set. ROBOT_ROOT = #{ROBOT_ROOT}")
- raise "WORKFLOW_URI is not set"
- end
-
- uri_string = "#{WORKFLOW_URI}/workflow_queue?repository=#{repository}&workflow=#{workflow}&waiting=#{waiting}"
- if(completed.class == Array)
- raise "The workflow service can only handle queries with no more than 2 completed steps" if completed.size > 2
- completed.each {|step| uri_string << "&completed=#{step}"}
- else
- uri_string << "&completed=#{completed}"
- end
- LyberCore::Log.info("Attempting to connect to #{uri_string}")
- url = URI.parse(uri_string)
- req = Net::HTTP::Get.new(url.request_uri)
- res = DorService.get_https_connection(url).start {|http| http.request(req) }
- case res
- when Net::HTTPSuccess
- begin
- doc = Nokogiri::XML(res.body)
- count = doc.root.at_xpath("//objects/@count").content.to_i
- rescue Exception => e
- msg = "Could not parse response from Workflow Service"
- LyberCore::Log.error(msg + "\n#{res.body}")
- raise e, msg
- end
-
- if(count == 0)
- raise LyberCore::Exceptions::EmptyQueue.new, "empty queue"
- else
- return res.body
- end
- else
- LyberCore::Log.fatal("Workflow queue not found for #{workflow} : #{waiting}")
- LyberCore::Log.debug("I am attempting to connect to WORKFLOW_URI #{WORKFLOW_URI}")
- LyberCore::Log.debug("repository: #{repository}")
- LyberCore::Log.debug("workflow: #{workflow}")
- LyberCore::Log.debug("completed: #{completed}")
- LyberCore::Log.debug("waiting: #{waiting}")
- LyberCore::Log.debug(res.inspect)
- raise "Could not connect to url #{uri_string}"
- end
- end
+
+ if repository.nil? or workflow.nil? or completed.nil? or waiting.nil?
+ LyberCore::Log.fatal("Can't execute DorService.get_objects_for_workstep: missing info")
+ end
+
+ unless defined?(WORKFLOW_URI) and WORKFLOW_URI != nil
+ LyberCore::Log.fatal("WORKFLOW_URI is not set. ROBOT_ROOT = #{ROBOT_ROOT}")
+ raise "WORKFLOW_URI is not set"
+ end
+
+ uri_string = "#{WORKFLOW_URI}/workflow_queue?repository=#{repository}&workflow=#{workflow}&waiting=#{waiting}"
+ if(completed.class == Array)
+ raise "The workflow service can only handle queries with no more than 2 completed steps" if completed.size > 2
+ completed.each {|step| uri_string << "&completed=#{step}"}
+ else
+ uri_string << "&completed=#{completed}"
+ end
+
+ return DorService.execute_workflow_xml_query(uri_string)
  end
 
+ # Returns string containing object list XML from a workflow DOR query using fully qualified workflow step names
+ # eg <tt>dor:googleScannedBookWF:register-object</tt>
+ #
+ # @param [String, Array] completed if only querying for one completed step, pass in a String of a fully qualified workflow step.
+ # If querying for two completed steps, pass in an Array of the two completed steps
+ # @param [String] waiting the fully qualified name of the waiting step
+ # @raise [LyberCore::Exceptions::EmptyQueue] When the query is successful, but no objects are found in that queue
+ # @raise [Exception] For other problems like connection failures or passing in non-qualified workflow names
+ # @return [String] XML containing all the objects that match the specific query. It looks like:
+ # <objects>
+ # <object druid="dr:123" url="http://localhost:9999/jersey-spring/objects/dr:123%5c" />
+ # <object druid="dr:abc" url="http://localhost:9999/jersey-spring/objects/dr:abc%5c" />
+ # </objects>
+ def DorService.get_objects_for_qualified_workstep(completed, waiting)
+ LyberCore::Log.debug("DorService.get_objects_for_qualified_workstep(#{completed}, #{waiting})")
+
+ if completed.nil? or waiting.nil?
+ LyberCore::Log.fatal("Can't execute DorService.get_objects_for_qualified_workstep: missing info")
+ end
+
+ unless defined?(WORKFLOW_URI) and WORKFLOW_URI != nil
+ LyberCore::Log.fatal("WORKFLOW_URI is not set. ROBOT_ROOT = #{ROBOT_ROOT}")
+ raise "WORKFLOW_URI is not set"
+ end
+
+ unless(waiting =~ /.+:.+:.+/)
+ raise "The waiting step was not fully qualified or of the form: <repository>:<workflow>:<stepname>. Received #{waiting}"
+ end
+ uri_string = "#{WORKFLOW_URI}/workflow_queue?waiting=#{waiting}"
+
+ completed_steps = Array(completed)
+ raise "The workflow service can only handle queries with no more than 2 completed steps" if completed_steps.size > 2
+ completed_steps.each do |step|
+ raise "A completed step was not fully qualified or of the form: <repository>:<workflow>:<stepname>. Received #{step}" unless(step =~ /.+:.+:.+/)
+ uri_string << "&completed=#{step}"
+ end
+
+ return DorService.execute_workflow_xml_query(uri_string)
+ end
+
+
+
  def DorService.log_and_raise_workflow_connection_problem(repository, workflow, completed, waiting, response)
 
  end
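Note on the hunk above: the validation and query-string assembly in get_objects_for_qualified_workstep can be tried in isolation with the sketch below. It is not the gem's code. The method name build_qualified_queue_uri, the base_uri argument (standing in for the WORKFLOW_URI constant), the use of ArgumentError, and the example step name "descriptive-metadata" are all assumptions made for illustration.

```ruby
# Hypothetical stand-alone helper mirroring the checks shown in the diff.
# A step counts as "fully qualified" when it looks like
# <repository>:<workflow>:<stepname>.
QUALIFIED_STEP = /.+:.+:.+/

def build_qualified_queue_uri(base_uri, completed, waiting)
  unless waiting =~ QUALIFIED_STEP
    raise ArgumentError, "waiting step is not fully qualified: #{waiting}"
  end

  # Array() lets callers pass either a single String or an Array of steps.
  completed_steps = Array(completed)
  if completed_steps.size > 2
    raise ArgumentError, "no more than 2 completed steps are supported"
  end

  uri = "#{base_uri}/workflow_queue?waiting=#{waiting}"
  completed_steps.each do |step|
    unless step =~ QUALIFIED_STEP
      raise ArgumentError, "completed step is not fully qualified: #{step}"
    end
    uri << "&completed=#{step}"
  end
  uri
end

# Example call with an assumed base URI and an assumed waiting step name.
example_uri = build_qualified_queue_uri(
  "http://example.org/workflow",
  "dor:googleScannedBookWF:register-object",
  "dor:googleScannedBookWF:descriptive-metadata"
)
```

Like the method in the diff, the sketch does not URL-encode the step names.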
@@ -400,18 +416,19 @@ class DorService
  req.body = DorService.construct_error_update_request(process, error_msg, error_txt)
  req.content_type = 'application/xml'
  LyberCore::Log::debug("Putting request: #{req.inspect}")
- res = DorService.get_https_connection(url).start {|http| http.request(req) }
+ # res = DorService.get_https_connection(url).start {|http| http.request(req) }
+ res = LyberCore::Connection.send_request(url,req)
  LyberCore::Log::debug("Got response: #{res.inspect}")
  case res
  when Net::HTTPSuccess
- LyberCore::Log.error("#{workflow} - #{process} set to error for " + druid)
+ LyberCore::Log.info("#{workflow} - #{process} set to error for " + druid)
  else
- LyberCore::Log.error(res.body)
+ LyberCore::Log.debug(res.body)
  raise res.error!, "Received error from the workflow service"
  end
  rescue Exception => e
  msg = "Unable to update workflow service at url #{url_string}"
- LyberCore::Log.error(msg)
+ LyberCore::Log.debug(msg)
  raise e, msg
  end
  end
@@ -434,7 +451,7 @@ class DorService
  LyberCore::Log.debug("Successfully queried symphony for #{flexkey}")
  return res.body
  else
- LyberCore::Log.error("Encountered an error from symphony: #{res.body}")
+ LyberCore::Log.debug("Encountered an error from symphony: #{res.body}")
  raise res.error!
  end
  rescue Exception => e
@@ -445,6 +462,40 @@ class DorService
 
 
  private
+
+ def DorService.execute_workflow_xml_query(uri_string)
+ LyberCore::Log.info("Attempting to connect to #{uri_string}")
+ url = URI.parse(uri_string)
+ req = Net::HTTP::Get.new(url.request_uri)
+ # res = DorService.get_https_connection(url).start {|http| http.request(req) }
+ res = LyberCore::Connection.send_request(url,req)
+ case res
+ when Net::HTTPSuccess
+ begin
+ doc = Nokogiri::XML(res.body)
+ count = doc.root.at_xpath("//objects/@count").content.to_i
+ rescue Exception => e
+ msg = "Could not parse response from Workflow Service"
+ LyberCore::Log.debug(msg + "\n#{res.body}")
+ raise e, msg
+ end
+
+ if(count == 0)
+ raise LyberCore::Exceptions::EmptyQueue.new, "empty queue"
+ else
+ return res.body
+ end
+ else
+ LyberCore::Log.fatal("Workflow queue not found for #{workflow} : #{waiting}")
+ LyberCore::Log.debug("I am attempting to connect to WORKFLOW_URI #{WORKFLOW_URI}")
+ LyberCore::Log.debug("repository: #{repository}")
+ LyberCore::Log.debug("workflow: #{workflow}")
+ LyberCore::Log.debug("completed: #{completed}")
+ LyberCore::Log.debug("waiting: #{waiting}")
+ LyberCore::Log.debug(res.inspect)
+ raise "Could not connect to url #{uri_string}"
+ end
+ end
  # druid, ds, url, content_type, method, parms
  def DorService.set_datastream(druid, ds_id, parms, method, content = {})
  begin
@@ -458,15 +509,16 @@ class DorService
  req.basic_auth FEDORA_USER, FEDORA_PASS
  req.body = content[:xml] if(content[:xml])
  req.content_type = content[:type]
- res = DorService.get_https_connection(url).start {|http| http.request(req) }
+ # res = DorService.get_https_connection(url).start {|http| http.request(req) }
+ res = LyberCore::Connection.send_request(url,req)
  case res
  when Net::HTTPSuccess
  return true
  when Net::HTTPServerError
- LyberCore::Log.error("Attempted to set datastream #{url} but failed")
+ LyberCore::Log.debug("Attempted to set datastream #{url} but failed")
  raise "Encountered 500 error setting datastream #{url}: #{res.inspect}"
  else
- LyberCore::Log.error("Encountered unknown error when setting datastream #{url}: #{res.inspect}")
+ LyberCore::Log.debug("Encountered unknown error when setting datastream #{url}: #{res.inspect}")
  raise "Encountered unknown error when setting datastream #{url}: #{res.inspect}"
  end
  rescue Exception => e
@@ -506,15 +558,16 @@ end
  req = Net::HTTP::Put.new(url.path)
  req.body = DorService.construct_xml_for_tag_array(tags)
  req.content_type = 'application/xml'
- res = DorService.get_https_connection(url).start {|http| http.request(req) }
+ # res = DorService.get_https_connection(url).start {|http| http.request(req) }
+ res = LyberCore::Connection.send_request(url,req)
  case res
  when Net::HTTPSuccess
  return true
  when Net::HTTPServerError
- LyberCore::Log.error("Attempted to add identity tags #{url} but failed")
+ LyberCore::Log.debug("Attempted to add identity tags #{url} but failed")
  raise "Encountered 500 error when adding identity tags #{url}: #{res.inspect}"
  else
- LyberCore::Log.error("Encountered unknown error when adding identity tags #{url}: #{res.inspect}")
+ LyberCore::Log.debug("Encountered unknown error when adding identity tags #{url}: #{res.inspect}")
  raise "Encountered unknown error when adding identity tags #{url}: #{res.inspect}"
  end
  rescue Exception => e
data/lib/lyber_core.rb CHANGED
@@ -1,14 +1,17 @@
+ require 'lyber_core/config'
  require 'dlss_service'
- require 'dor/suri_service'
- require 'dor/workflow_service'
- require 'dor/base'
+ require 'dor-services'
  require 'lyber_core/connection'
  require 'lyber_core/destroyer'
  require 'lyber_core/log'
  require 'lyber_core/robots/robot'
+ require 'lyber_core/robots/service_controller'
  require 'lyber_core/robots/workflow'
  require 'lyber_core/robots/workspace'
  require 'lyber_core/robots/work_queue'
  require 'lyber_core/robots/work_item'
  require 'lyber_core/exceptions/empty_queue'
+ require 'lyber_core/exceptions/fatal_error'
+ require 'lyber_core/exceptions/service_error'
+ require 'lyber_core/exceptions/item_error'
 
@@ -0,0 +1,13 @@
+ require 'dor-services'
+
+ module Dor
+
+ Config.declare do
+
+ robots do
+ workspace nil
+ end
+
+ end
+
+ end
@@ -2,6 +2,27 @@ require 'net/https'
  require 'uri'
  require 'cgi'
 
+ # Extend the Integer class to facilitate retries of code blocks if specified exception(s) occur
+ # see: http://blog.josh-nesbitt.net/2010/02/08/writing-contingent-ruby-code-with-retryable/
+ RETRYABLE_SLEEP_VALUE = 300
+ class Integer
+ def tries(options={}, &block)
+ attempts = self
+ exception_classes = [*options[:on] || StandardError]
+ begin
+ # First attempt
+ return yield
+ rescue *exception_classes
+ sleep RETRYABLE_SLEEP_VALUE
+ # 2nd to n-1 attempts
+ retry if (attempts -= 1) > 1
+ end
+ # final (nth) attempt
+ yield
+ end
+ end
+
+
  module LyberCore
  class Connection
  def Connection.get_https_connection(url)
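Note on the hunk above: Integer#tries runs its block up to n times and retries only when one of the listed exception classes is raised, sleeping RETRYABLE_SLEEP_VALUE (300) seconds after each failed attempt except the last. The same pattern can be sketched without reopening Integer. The method name with_retries and its pause keyword are assumptions for illustration and are not part of lyber-core. Unlike the code in the diff, this sketch does not sleep once it has run out of retries.

```ruby
# Hypothetical helper: retry a block up to `attempts` times when one of the
# exception classes in `on` is raised. `pause` (seconds) is an assumed knob;
# the diff hard-codes a 300-second sleep instead.
def with_retries(attempts, on: [StandardError], pause: 0)
  tries_left = attempts
  begin
    yield
  rescue *on
    tries_left -= 1
    raise if tries_left <= 0 # out of attempts: re-raise the last error
    sleep pause
    retry
  end
end

# Usage: the block fails twice with EOFError, then succeeds on the third call.
calls = 0
result = with_retries(3, on: [EOFError]) do
  calls += 1
  raise EOFError, "connection dropped" if calls < 3
  "ok"
end
```

Exceptions that are not listed in `on` propagate immediately without a retry, which matches how `rescue *exception_classes` behaves in the diff.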
@@ -62,7 +83,7 @@ module LyberCore
  req.basic_auth options[:auth_user], options[:auth_password]
  end
 
- res = Connection.get_https_connection(url).start {|http| http.request(req) }
+ res = Connection.send_request(url, req)
  case res
  when Net::HTTPSuccess
  if(block_given?)
@@ -72,9 +93,21 @@ module LyberCore
  end
  else
  raise res.error!
+ # ??? raise LyberCore::Exceptions::ServiceError.new('HTTP Request failed',res.error!)
  end
 
  end
+
+
+ # Send the request to the server, with multiple retries if specified exceptions occur
+ def Connection.send_request(url, req)
+ 3.tries :on => [Timeout::Error, EOFError, Errno::ECONNRESET] do
+ Connection.get_https_connection(url).start {|http| http.request(req) }
+ end
+ rescue Exception => e
+ raise LyberCore::Exceptions::ServiceError.new('HTTP Request failed',e)
+ end
+
  end
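Note on Connection.send_request above: any exception that survives the retries is wrapped in LyberCore::Exceptions::ServiceError, with the original exception passed as the second constructor argument. The definition of ServiceError is not part of this diff, so the sketch below uses a stand-in class, SketchServiceError, whose shape is an assumption. It also rescues StandardError rather than Exception, which is a deliberate deviation from the diff so that signals and interrupts are not swallowed.

```ruby
# Stand-in for LyberCore::Exceptions::ServiceError. The real class is not
# shown in this diff; this constructor shape is assumed from the call
# ServiceError.new('HTTP Request failed', e).
class SketchServiceError < StandardError
  attr_reader :original

  def initialize(message, original = nil)
    super(message)
    @original = original
  end
end

# Hypothetical wrapper: run the block and convert any StandardError into a
# SketchServiceError that carries the original exception.
def send_with_wrapping
  yield
rescue StandardError => e
  raise SketchServiceError.new("HTTP Request failed", e)
end

# Usage: the low-level EOFError surfaces as a SketchServiceError.
wrapped = nil
begin
  send_with_wrapping { raise EOFError, "end of file reached" }
rescue SketchServiceError => err
  wrapped = err
end
```

Because the new error is raised inside a rescue clause, Ruby also records the original exception as `wrapped.cause`, so callers can reach it either way.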
 