lyber-core 0.9.6.2.3 → 1.3.0

data/README.rdoc CHANGED
@@ -38,10 +38,28 @@ The BagitBag class requires the bagit gem
38
38
  http://github.com/flazz/bagit
39
39
 
40
40
  == Build and release procedure
41
+ Modify the version number in lyber-core.gemspec, then push your commits to AFS. DO NOT TAG!
41
42
  Run: 'rake dlss_release' to tag, build, and publish the lyber-core gem
42
43
  See the Rakefile and the LyberCore::DlssRelease task in lib/lyber_core/rake/dlss_release.rb for more details
43
44
 
44
45
  == Releases
46
+ - <b>1.3</b> Started to use Dor::Config for workspace configuration
47
+ - <b>1.2.1</b> Clean up logging of exceptions in LyberCore::Log
48
+ - <b>1.2</b> Robots can now run as daemons via the LyberCore::Robots::ServiceController
49
+ - <b>1.1.2</b> Can pass an array of "command line" arguments to the Robot constructor
50
+ - <b>1.1.1</b> Robot#start now returns LyberCore::Robots::CONTINUE if it did work without error, LyberCore::Robots::SLEEP if it did no work,
51
+ and LyberCore::Robots::HALT if it reached its error limit while working on its queue
52
+ - <b>1.1.0</b> Allow Robots::WorkQueue to resolve an arbitrary number of prerequisites
53
+ - <b>1.0.0</b> Factored all Dor::* classes and object models out of lyber-core and into a separate dor-services gem. WARNING: MAY BREAK COMPATIBILITY WITH PREVIOUS DOR-ENABLED CODE.
54
+ - <b>0.9.8</b> Created branch for legacy work "0.9-legacy". Robots can now be configured with fully qualified workflows for prerequisites
55
+ eg <i>dor:googleScannedBookWF:register-object</i>
56
+ - <b>0.9.7.4</b> Untangled a couple development dependencies; fixed issue where "include REXML" was polluting the Object namespace
57
+ - <b>0.9.7.3</b> Logging enhancements
58
+ - <b>0.9.7.2</b> IdentityMetadata bugfixes
59
+ - <b>0.9.7.1</b> Enhanced exception handling
60
+ - <b>0.9.7</b> ActiveMQ message-based robot parallelization as described here: https://consul.stanford.edu/x/tQjdBw . Removal of ROXML models.
61
+ - <b>0.9.6.3</b> Better error reporting for LyberCore::Utils::FileUtilities.execute, which means when a system command fails we have a better idea of why.
62
+ - <b>0.9.6.2</b> Handles new response from workflow service when there are no objects in the queue: <objects count="0"/>
45
63
  - <b>0.9.6</b> DorService.get_objects_for_workstep can handle one or two completed steps. Trimmed-down gem dependencies now defined in lyber-core.gemspec. 'rake dlss_release' will tag, build and publish gem
46
64
  - <b>0.9.5.5</b> Robots now log to ROBOT_ROOT/log/robot_name.log unless specified in constructor
47
65
  - <b>0.9.5.4</b> Custom exception classes, more checking of error conditions
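Reviewer note on the 1.1.1 entry: the three return values of Robot#start suggest a driver loop that keeps calling the robot until it halts. The sketch below is only an illustration of that idea. The constants are symbol stand-ins (the real values of LyberCore::Robots::CONTINUE, SLEEP and HALT are not shown in this diff), and `ScriptedRobot` and `run_until_halt` are hypothetical names.

```ruby
# Stand-ins for LyberCore::Robots::CONTINUE / SLEEP / HALT (assumed values).
CONTINUE = :continue
SLEEP    = :sleep
HALT     = :halt

# Hypothetical robot that replays a scripted list of statuses.
class ScriptedRobot
  def initialize(statuses)
    @statuses = statuses.dup
  end

  # Mimics the documented contract: one status per call.
  def start
    @statuses.shift || HALT
  end
end

# Calls robot.start until it reports HALT and records every status seen.
def run_until_halt(robot, pause: 0)
  history = []
  loop do
    status = robot.start
    history << status
    case status
    when CONTINUE then next        # did work, poll again immediately
    when SLEEP    then sleep pause # queue was empty, back off first
    when HALT     then break       # error limit reached, stop
    end
  end
  history
end

history = run_until_halt(ScriptedRobot.new([CONTINUE, SLEEP, CONTINUE, HALT]))
```

A real caller would pass a non-zero `pause` so that an empty queue does not turn into a busy loop.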
data/lib/dlss_service.rb CHANGED
@@ -1,4 +1,3 @@
1
- require 'rubygems'
2
1
  require 'net/http'
3
2
  require 'net/https'
4
3
  require 'uri'
data/lib/dor_service.rb CHANGED
@@ -4,10 +4,10 @@ require 'uri'
4
4
  require 'cgi'
5
5
  require 'rexml/document'
6
6
 
7
- include REXML
8
-
9
7
  class DorService
10
8
 
9
+ include REXML
10
+
11
11
  def DorService.get_https_connection(url)
12
12
  https = Net::HTTP.new(url.host, url.port)
13
13
  if(url.scheme == 'https')
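Reviewer note on moving `include REXML` inside the class: a bare `include` at the top level of a file mixes the module into Object, so its constants become visible from every class. That is the namespace pollution the 0.9.7.4 release note refers to. The sketch below demonstrates the difference with a hypothetical `Markup` module standing in for REXML; `Service` and `Bystander` are illustrative names as well.

```ruby
# Hypothetical stand-in for a library module such as REXML.
module Markup
  class Document; end
end

# Scoped include: only Service gains access to Markup's constants.
class Service
  include Markup

  def self.document_class
    Document
  end
end

# An unrelated class that never asked for Markup.
class Bystander
  def self.sees_document?
    !!defined?(Document)
  end
end

# With the include scoped to Service, Object and Bystander are untouched.
scoped_result = [Service.document_class, Object.include?(Markup), Bystander.sees_document?]

# Top-level include: Markup is mixed into Object, so every class sees Document.
include Markup
global_result = [Object.include?(Markup), Bystander.sees_document?]
```

After the top-level `include`, `Bystander` resolves `Document` even though it never included anything, which is how unrelated code can end up shadowing or colliding with library class names.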
@@ -31,7 +31,7 @@ class DorService
31
31
  druid = $1
32
32
  return druid
33
33
  rescue Exception => e
34
- LyberCore::Log.error("Unable to create object #{e.backtrace}")
34
+ LyberCore::Log.debug("Unable to create object #{e.backtrace}")
35
35
  raise e
36
36
  end
37
37
  end
@@ -55,7 +55,7 @@ class DorService
55
55
  LyberCore::Log.debug("new druid = #{druid}")
56
56
  return druid
57
57
  rescue Exception => e
58
- LyberCore::Log.error("Unable to create object")
58
+ LyberCore::Log.debug("Unable to create object")
59
59
  raise e, "Unable to create object "
60
60
  end
61
61
  end
@@ -111,7 +111,8 @@ class DorService
111
111
  LyberCore::Log.debug("Fetching druid for dor_id #{dor_id} at url #{url_string}")
112
112
  url = URI.parse(url_string)
113
113
  req = Net::HTTP::Get.new(url.request_uri)
114
- res = DorService.get_https_connection(url).start {|http| http.request(req) }
114
+ # res = DorService.get_https_connection(url).start {|http| http.request(req) }
115
+ res = LyberCore::Connection.send_request(url,req)
115
116
 
116
117
  case res
117
118
  when Net::HTTPSuccess
@@ -121,10 +122,10 @@ class DorService
121
122
  LyberCore::Log.debug("Barcode does not yet exist in DOR: #{dor_id}")
122
123
  return nil
123
124
  when Net::HTTPServerError
124
- LyberCore::Log.error("Encountered HTTPServerError error when requesting #{url}: #{res.inspect}")
125
+ LyberCore::Log.debug("Encountered HTTPServerError error when requesting #{url}: #{res.inspect}")
125
126
  raise "Encountered 500 error when requesting #{url}: #{res.inspect}"
126
127
  else
127
- LyberCore::Log.error("Encountered unknown error when requesting #{url}: #{res.inspect}")
128
+ LyberCore::Log.debug("Encountered unknown error when requesting #{url}: #{res.inspect}")
128
129
  raise "Encountered unknown error when requesting #{url}: #{res.inspect}"
129
130
  end
130
131
  end
@@ -132,7 +133,7 @@ class DorService
132
133
  ############################################# Start of Datastream methods
133
134
  # Until ActiveFedora supports client-side certificate configuration, we are stuck with our own methods to access datastreams
134
135
 
135
- #/objects/{pid}/datastreams/{dsID} ? [controlGroup] [dsLocation] [altIDs] [dsLabel] [versionable] [dsState] [formatURI] [checksumType] [checksum] [logMessage]
136
+ #/objects/pid/datastreams/dsID ? [controlGroup] [dsLocation] [altIDs] [dsLabel] [versionable] [dsState] [formatURI] [checksumType] [checksum] [logMessage]
136
137
  def DorService.add_datastream(druid, ds_id, ds_label, xml, content_type='application/xml', versionable = false )
137
138
  DorService.add_datastream_managed(druid, ds_id, ds_label, xml, content_type, versionable)
138
139
  end
@@ -168,8 +169,9 @@ class DorService
168
169
  LyberCore::Log.debug("Connecting to #{url_string}...")
169
170
  req = Net::HTTP::Get.new(url.request_uri)
170
171
  LyberCore::Log.debug("request object: #{req.inspect}")
171
- res = DorService.get_https_connection(url).start {|http| http.request(req) }
172
-
172
+ # res = DorService.get_https_connection(url).start {|http| http.request(req) }
173
+ res = LyberCore::Connection.send_request(url,req)
174
+
173
175
  case res
174
176
  when Net::HTTPSuccess
175
177
  return res.body
@@ -177,10 +179,10 @@ class DorService
177
179
  LyberCore::Log.debug("Datastream not found at url #{url_string}")
178
180
  return nil
179
181
  when Net::HTTPServerError
180
- LyberCore::Log.error("Attempted to reach #{url_string} but failed")
182
+ LyberCore::Log.debug("Attempted to reach #{url_string} but failed")
181
183
  raise "Encountered 500 error when requesting #{url_string}: #{res.inspect}"
182
184
  else
183
- LyberCore::Log.error("Encountered unknown error when requesting #{url}: #{res.inspect}")
185
+ LyberCore::Log.debug("Encountered unknown error when requesting #{url}: #{res.inspect}")
184
186
  raise "Encountered unknown error when requesting #{url}: #{res.inspect}"
185
187
  end
186
188
  rescue Exception => e
@@ -203,7 +205,8 @@ class DorService
203
205
  req = Net::HTTP::Get.new(url.request_uri)
204
206
  req.basic_auth FEDORA_USER, FEDORA_PASS
205
207
  LyberCore::Log.debug("request object: #{req.inspect}")
206
- res = DorService.get_https_connection(url).start {|http| http.request(req) }
208
+ # res = DorService.get_https_connection(url).start {|http| http.request(req) }
209
+ res = LyberCore::Connection.send_request(url,req)
207
210
  case res
208
211
  when Net::HTTPSuccess
209
212
  return res.body
@@ -276,56 +279,69 @@ class DorService
276
279
  # </objects>
277
280
  def DorService.get_objects_for_workstep(repository, workflow, completed, waiting)
278
281
  LyberCore::Log.debug("DorService.get_objects_for_workstep(#{repository}, #{workflow}, #{completed}, #{waiting})")
279
- begin
280
- if repository.nil? or workflow.nil? or completed.nil? or waiting.nil?
281
- LyberCore::Log.fatal("Can't execute DorService.get_objects_for_workstep: missing info")
282
- end
283
-
284
- unless defined?(WORKFLOW_URI) and WORKFLOW_URI != nil
285
- LyberCore::Log.fatal("WORKFLOW_URI is not set. ROBOT_ROOT = #{ROBOT_ROOT}")
286
- raise "WORKFLOW_URI is not set"
287
- end
288
-
289
- uri_string = "#{WORKFLOW_URI}/workflow_queue?repository=#{repository}&workflow=#{workflow}&waiting=#{waiting}"
290
- if(completed.class == Array)
291
- raise "The workflow service can only handle queries with no more than 2 completed steps" if completed.size > 2
292
- completed.each {|step| uri_string << "&completed=#{step}"}
293
- else
294
- uri_string << "&completed=#{completed}"
295
- end
296
- LyberCore::Log.info("Attempting to connect to #{uri_string}")
297
- url = URI.parse(uri_string)
298
- req = Net::HTTP::Get.new(url.request_uri)
299
- res = DorService.get_https_connection(url).start {|http| http.request(req) }
300
- case res
301
- when Net::HTTPSuccess
302
- begin
303
- doc = Nokogiri::XML(res.body)
304
- count = doc.root.at_xpath("//objects/@count").content.to_i
305
- rescue Exception => e
306
- msg = "Could not parse response from Workflow Service"
307
- LyberCore::Log.error(msg + "\n#{res.body}")
308
- raise e, msg
309
- end
310
-
311
- if(count == 0)
312
- raise LyberCore::Exceptions::EmptyQueue.new, "empty queue"
313
- else
314
- return res.body
315
- end
316
- else
317
- LyberCore::Log.fatal("Workflow queue not found for #{workflow} : #{waiting}")
318
- LyberCore::Log.debug("I am attempting to connect to WORKFLOW_URI #{WORKFLOW_URI}")
319
- LyberCore::Log.debug("repository: #{repository}")
320
- LyberCore::Log.debug("workflow: #{workflow}")
321
- LyberCore::Log.debug("completed: #{completed}")
322
- LyberCore::Log.debug("waiting: #{waiting}")
323
- LyberCore::Log.debug(res.inspect)
324
- raise "Could not connect to url #{uri_string}"
325
- end
326
- end
282
+
283
+ if repository.nil? or workflow.nil? or completed.nil? or waiting.nil?
284
+ LyberCore::Log.fatal("Can't execute DorService.get_objects_for_workstep: missing info")
285
+ end
286
+
287
+ unless defined?(WORKFLOW_URI) and WORKFLOW_URI != nil
288
+ LyberCore::Log.fatal("WORKFLOW_URI is not set. ROBOT_ROOT = #{ROBOT_ROOT}")
289
+ raise "WORKFLOW_URI is not set"
290
+ end
291
+
292
+ uri_string = "#{WORKFLOW_URI}/workflow_queue?repository=#{repository}&workflow=#{workflow}&waiting=#{waiting}"
293
+ if(completed.class == Array)
294
+ raise "The workflow service can only handle queries with no more than 2 completed steps" if completed.size > 2
295
+ completed.each {|step| uri_string << "&completed=#{step}"}
296
+ else
297
+ uri_string << "&completed=#{completed}"
298
+ end
299
+
300
+ return DorService.execute_workflow_xml_query(uri_string)
327
301
  end
328
302
 
303
+ # Returns string containing object list XML from a workflow DOR query using fully qualified workflow step names
304
+ # eg <tt>dor:googleScannedBookWF:register-object</tt>
305
+ #
306
+ # @param [String, Array] completed if only querying for one completed step, pass in a String of a fully qualified workflow step.
307
+ # If querying for two completed steps, pass in an Array of the two completed steps
308
+ # @param [String] waiting the fully qualified name of the waiting step
309
+ # @raise [LyberCore::Exceptions::EmptyQueue] When the query is successful, but no objects are found in that queue
310
+ # @raise [Exception] For other problems like connection failures or passing in non-qualified workflow names
311
+ # @return [String] XML containing all the objects that match the specific query. It looks like:
312
+ # <objects>
313
+ # <object druid="dr:123" url="http://localhost:9999/jersey-spring/objects/dr:123%5c" />
314
+ # <object druid="dr:abc" url="http://localhost:9999/jersey-spring/objects/dr:abc%5c" />
315
+ # </objects>
316
+ def DorService.get_objects_for_qualified_workstep(completed, waiting)
317
+ LyberCore::Log.debug("DorService.get_objects_for_qualified_workstep(#{completed}, #{waiting})")
318
+
319
+ if completed.nil? or waiting.nil?
320
+ LyberCore::Log.fatal("Can't execute DorService.get_objects_for_qualified_workstep: missing info")
321
+ end
322
+
323
+ unless defined?(WORKFLOW_URI) and WORKFLOW_URI != nil
324
+ LyberCore::Log.fatal("WORKFLOW_URI is not set. ROBOT_ROOT = #{ROBOT_ROOT}")
325
+ raise "WORKFLOW_URI is not set"
326
+ end
327
+
328
+ unless(waiting =~ /.+:.+:.+/)
329
+ raise "The waiting step was not fully qualified or of the form: <repository>:<workflow>:<stepname>. Received #{waiting}"
330
+ end
331
+ uri_string = "#{WORKFLOW_URI}/workflow_queue?waiting=#{waiting}"
332
+
333
+ completed_steps = Array(completed)
334
+ raise "The workflow service can only handle queries with no more than 2 completed steps" if completed_steps.size > 2
335
+ completed_steps.each do |step|
336
+ raise "A completed step was not fully qualified or of the form: <repository>:<workflow>:<stepname>. Received #{step}" unless(step =~ /.+:.+:.+/)
337
+ uri_string << "&completed=#{step}"
338
+ end
339
+
340
+ return DorService.execute_workflow_xml_query(uri_string)
341
+ end
342
+
343
+
344
+
329
345
  def DorService.log_and_raise_workflow_connection_problem(repository, workflow, completed, waiting, response)
330
346
 
331
347
  end
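Reviewer note on get_objects_for_qualified_workstep: the validation and URI assembly can be exercised without a workflow service by pulling them into a pure function. This is a minimal sketch under stated assumptions: `qualified_queue_uri` is a hypothetical helper, the base URI is a placeholder, and the `descriptive-metadata` step name is invented for the example. The regular expression and the two-step limit are the ones in the hunk above.

```ruby
# Same pattern the diff uses to recognise <repository>:<workflow>:<stepname>.
QUALIFIED_STEP = /.+:.+:.+/

# Hypothetical helper: builds the workflow_queue query string only.
def qualified_queue_uri(base_uri, completed, waiting)
  unless waiting =~ QUALIFIED_STEP
    raise ArgumentError, "waiting step is not fully qualified: #{waiting}"
  end

  steps = Array(completed) # accepts a single String or an Array of Strings
  if steps.size > 2
    raise ArgumentError, 'no more than 2 completed steps are supported'
  end

  uri = "#{base_uri}/workflow_queue?waiting=#{waiting}"
  steps.each do |step|
    unless step =~ QUALIFIED_STEP
      raise ArgumentError, "completed step is not fully qualified: #{step}"
    end
    uri << "&completed=#{step}"
  end
  uri
end

uri = qualified_queue_uri(
  'http://localhost/workflow', # placeholder for WORKFLOW_URI
  'dor:googleScannedBookWF:register-object',
  'dor:googleScannedBookWF:descriptive-metadata'
)
```

Like the code in the diff, the sketch does not URL-encode the step names; whether the service needs that is not something this diff settles.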
@@ -400,18 +416,19 @@ class DorService
400
416
  req.body = DorService.construct_error_update_request(process, error_msg, error_txt)
401
417
  req.content_type = 'application/xml'
402
418
  LyberCore::Log::debug("Putting request: #{req.inspect}")
403
- res = DorService.get_https_connection(url).start {|http| http.request(req) }
419
+ # res = DorService.get_https_connection(url).start {|http| http.request(req) }
420
+ res = LyberCore::Connection.send_request(url,req)
404
421
  LyberCore::Log::debug("Got response: #{res.inspect}")
405
422
  case res
406
423
  when Net::HTTPSuccess
407
- LyberCore::Log.error("#{workflow} - #{process} set to error for " + druid)
424
+ LyberCore::Log.info("#{workflow} - #{process} set to error for " + druid)
408
425
  else
409
- LyberCore::Log.error(res.body)
426
+ LyberCore::Log.debug(res.body)
410
427
  raise res.error!, "Received error from the workflow service"
411
428
  end
412
429
  rescue Exception => e
413
430
  msg = "Unable to update workflow service at url #{url_string}"
414
- LyberCore::Log.error(msg)
431
+ LyberCore::Log.debug(msg)
415
432
  raise e, msg
416
433
  end
417
434
  end
@@ -434,7 +451,7 @@ class DorService
434
451
  LyberCore::Log.debug("Successfully queried symphony for #{flexkey}")
435
452
  return res.body
436
453
  else
437
- LyberCore::Log.error("Encountered an error from symphony: #{res.body}")
454
+ LyberCore::Log.debug("Encountered an error from symphony: #{res.body}")
438
455
  raise res.error!
439
456
  end
440
457
  rescue Exception => e
@@ -445,6 +462,40 @@ class DorService
445
462
 
446
463
 
447
464
  private
465
+
466
+ def DorService.execute_workflow_xml_query(uri_string)
467
+ LyberCore::Log.info("Attempting to connect to #{uri_string}")
468
+ url = URI.parse(uri_string)
469
+ req = Net::HTTP::Get.new(url.request_uri)
470
+ # res = DorService.get_https_connection(url).start {|http| http.request(req) }
471
+ res = LyberCore::Connection.send_request(url,req)
472
+ case res
473
+ when Net::HTTPSuccess
474
+ begin
475
+ doc = Nokogiri::XML(res.body)
476
+ count = doc.root.at_xpath("//objects/@count").content.to_i
477
+ rescue Exception => e
478
+ msg = "Could not parse response from Workflow Service"
479
+ LyberCore::Log.debug(msg + "\n#{res.body}")
480
+ raise e, msg
481
+ end
482
+
483
+ if(count == 0)
484
+ raise LyberCore::Exceptions::EmptyQueue.new, "empty queue"
485
+ else
486
+ return res.body
487
+ end
488
+ else
489
+ LyberCore::Log.fatal("Workflow queue not found for #{workflow} : #{waiting}")
490
+ LyberCore::Log.debug("I am attempting to connect to WORKFLOW_URI #{WORKFLOW_URI}")
491
+ LyberCore::Log.debug("repository: #{repository}")
492
+ LyberCore::Log.debug("workflow: #{workflow}")
493
+ LyberCore::Log.debug("completed: #{completed}")
494
+ LyberCore::Log.debug("waiting: #{waiting}")
495
+ LyberCore::Log.debug(res.inspect)
496
+ raise "Could not connect to url #{uri_string}"
497
+ end
498
+ end
448
499
  # druid, ds, url, content_type, method, parms
449
500
  def DorService.set_datastream(druid, ds_id, parms, method, content = {})
450
501
  begin
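Reviewer note on execute_workflow_xml_query: the success branch reads the `count` attribute of the `<objects>` root and treats zero as an empty queue. The sketch below isolates that decision so it can be checked against sample responses. It uses REXML rather than Nokogiri purely to stay self-contained, and `EmptyQueue` and `objects_or_empty` are hypothetical stand-ins, not the gem's own definitions.

```ruby
require 'rexml/document'

# Stand-in for LyberCore::Exceptions::EmptyQueue.
class EmptyQueue < StandardError; end

# Returns the XML unchanged when it lists at least one object,
# raises EmptyQueue when the service reports count="0".
def objects_or_empty(xml)
  root = REXML::Document.new(xml).root
  unless root && root.name == 'objects'
    raise ArgumentError, 'Could not parse response from Workflow Service'
  end

  count_attr = root.attributes['count']
  raise ArgumentError, 'response has no count attribute' if count_attr.nil?

  raise EmptyQueue, 'empty queue' if count_attr.to_i.zero?
  xml
end

populated = '<objects count="1"><object druid="dr:123" url="http://localhost:9999/jersey-spring/objects/dr:123"/></objects>'
body = objects_or_empty(populated)
```

Raising a dedicated exception for the empty case lets a caller tell "nothing to do" apart from a failed request.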
@@ -458,15 +509,16 @@ class DorService
458
509
  req.basic_auth FEDORA_USER, FEDORA_PASS
459
510
  req.body = content[:xml] if(content[:xml])
460
511
  req.content_type = content[:type]
461
- res = DorService.get_https_connection(url).start {|http| http.request(req) }
512
+ # res = DorService.get_https_connection(url).start {|http| http.request(req) }
513
+ res = LyberCore::Connection.send_request(url,req)
462
514
  case res
463
515
  when Net::HTTPSuccess
464
516
  return true
465
517
  when Net::HTTPServerError
466
- LyberCore::Log.error("Attempted to set datastream #{url} but failed")
518
+ LyberCore::Log.debug("Attempted to set datastream #{url} but failed")
467
519
  raise "Encountered 500 error setting datastream #{url}: #{res.inspect}"
468
520
  else
469
- LyberCore::Log.error("Encountered unknown error when setting datastream #{url}: #{res.inspect}")
521
+ LyberCore::Log.debug("Encountered unknown error when setting datastream #{url}: #{res.inspect}")
470
522
  raise "Encountered unknown error when setting datastream #{url}: #{res.inspect}"
471
523
  end
472
524
  rescue Exception => e
@@ -506,15 +558,16 @@ end
506
558
  req = Net::HTTP::Put.new(url.path)
507
559
  req.body = DorService.construct_xml_for_tag_array(tags)
508
560
  req.content_type = 'application/xml'
509
- res = DorService.get_https_connection(url).start {|http| http.request(req) }
561
+ # res = DorService.get_https_connection(url).start {|http| http.request(req) }
562
+ res = LyberCore::Connection.send_request(url,req)
510
563
  case res
511
564
  when Net::HTTPSuccess
512
565
  return true
513
566
  when Net::HTTPServerError
514
- LyberCore::Log.error("Attempted to add identity tags #{url} but failed")
567
+ LyberCore::Log.debug("Attempted to add identity tags #{url} but failed")
515
568
  raise "Encountered 500 error when adding identity tags #{url}: #{res.inspect}"
516
569
  else
517
- LyberCore::Log.error("Encountered unknown error when adding identity tags #{url}: #{res.inspect}")
570
+ LyberCore::Log.debug("Encountered unknown error when adding identity tags #{url}: #{res.inspect}")
518
571
  raise "Encountered unknown error when adding identity tags #{url}: #{res.inspect}"
519
572
  end
520
573
  rescue Exception => e
data/lib/lyber_core.rb CHANGED
@@ -1,14 +1,17 @@
1
+ require 'lyber_core/config'
1
2
  require 'dlss_service'
2
- require 'dor/suri_service'
3
- require 'dor/workflow_service'
4
- require 'dor/base'
3
+ require 'dor-services'
5
4
  require 'lyber_core/connection'
6
5
  require 'lyber_core/destroyer'
7
6
  require 'lyber_core/log'
8
7
  require 'lyber_core/robots/robot'
8
+ require 'lyber_core/robots/service_controller'
9
9
  require 'lyber_core/robots/workflow'
10
10
  require 'lyber_core/robots/workspace'
11
11
  require 'lyber_core/robots/work_queue'
12
12
  require 'lyber_core/robots/work_item'
13
13
  require 'lyber_core/exceptions/empty_queue'
14
+ require 'lyber_core/exceptions/fatal_error'
15
+ require 'lyber_core/exceptions/service_error'
16
+ require 'lyber_core/exceptions/item_error'
14
17
 
@@ -0,0 +1,13 @@
1
+ require 'dor-services'
2
+
3
+ module Dor
4
+
5
+ Config.declare do
6
+
7
+ robots do
8
+ workspace nil
9
+ end
10
+
11
+ end
12
+
13
+ end
@@ -2,6 +2,27 @@ require 'net/https'
2
2
  require 'uri'
3
3
  require 'cgi'
4
4
 
5
+ # Extend the Integer class to facilitate retries of code blocks if specified exception(s) occur
6
+ # see: http://blog.josh-nesbitt.net/2010/02/08/writing-contingent-ruby-code-with-retryable/
7
+ RETRYABLE_SLEEP_VALUE = 300
8
+ class Integer
9
+ def tries(options={}, &block)
10
+ attempts = self
11
+ exception_classes = [*options[:on] || StandardError]
12
+ begin
13
+ # First attempt
14
+ return yield
15
+ rescue *exception_classes
16
+ sleep RETRYABLE_SLEEP_VALUE
17
+ # 2nd to n-1 attempts
18
+ retry if (attempts -= 1) > 1
19
+ end
20
+ # final (nth) attempt
21
+ yield
22
+ end
23
+ end
24
+
25
+
5
26
  module LyberCore
6
27
  class Connection
7
28
  def Connection.get_https_connection(url)
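Reviewer note on Integer#tries: the helper retries a block on the listed exceptions and sleeps a fixed RETRYABLE_SLEEP_VALUE of 300 seconds between attempts, which makes it slow to exercise in tests. The sketch below shows the same retry pattern as a standalone method with the wait passed in. `with_retries` is a hypothetical name, and this is an alternative shape for illustration, not the code in the diff.

```ruby
require 'timeout'

# Hypothetical helper: runs the block up to `attempts` times, retrying only
# when one of the exception classes in `on` is raised.
def with_retries(attempts, on: [StandardError], wait: 0)
  tries_left = attempts
  begin
    yield
  rescue *on
    tries_left -= 1
    raise if tries_left <= 0 # out of attempts: re-raise the last error
    sleep wait
    retry
  end
end

calls = 0
result = with_retries(3, on: [Timeout::Error, EOFError, Errno::ECONNRESET]) do
  calls += 1
  raise EOFError, 'simulated dropped connection' if calls < 3
  :ok
end
```

Unlike the version in the diff, this variant does not sleep after the final failed attempt, and it avoids reopening Integer.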
@@ -62,7 +83,7 @@ module LyberCore
62
83
  req.basic_auth options[:auth_user], options[:auth_password]
63
84
  end
64
85
 
65
- res = Connection.get_https_connection(url).start {|http| http.request(req) }
86
+ res = Connection.send_request(url, req)
66
87
  case res
67
88
  when Net::HTTPSuccess
68
89
  if(block_given?)
@@ -72,9 +93,21 @@ module LyberCore
72
93
  end
73
94
  else
74
95
  raise res.error!
96
+ # ??? raise LyberCore::Exceptions::ServiceError.new('HTTP Request failed',res.error!)
75
97
  end
76
98
 
77
99
  end
100
+
101
+
102
+ # Send the request to the server, with multiple retries if specified exceptions occur
103
+ def Connection.send_request(url, req)
104
+ 3.tries :on => [Timeout::Error, EOFError, Errno::ECONNRESET] do
105
+ Connection.get_https_connection(url).start {|http| http.request(req) }
106
+ end
107
+ rescue Exception => e
108
+ raise LyberCore::Exceptions::ServiceError.new('HTTP Request failed',e)
109
+ end
110
+
78
111
  end
79
112
 
80
113
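Reviewer note on Connection.send_request: after the retries are exhausted, any remaining exception is wrapped in LyberCore::Exceptions::ServiceError together with the original error. The class definition is not part of this diff, so the sketch below uses a hypothetical stand-in whose two-argument constructor only mirrors the call shown above. It also rescues StandardError instead of Exception, a deliberate narrowing so that signals and exits are not wrapped.

```ruby
# Stand-in for LyberCore::Exceptions::ServiceError (assumed shape).
class ServiceError < StandardError
  attr_reader :original

  def initialize(message, original = nil)
    super(message)
    @original = original
  end
end

# Hypothetical wrapper: runs the block and converts failures to ServiceError.
def send_with_wrapping
  yield
rescue StandardError => e
  raise ServiceError.new('HTTP Request failed', e)
end

wrapped = begin
  send_with_wrapping { raise EOFError, 'simulated dropped connection' }
rescue ServiceError => e
  e
end
```

Ruby also records the rescued exception as `cause` on the new one, so callers can reach the underlying error even if the wrapper class keeps no reference of its own.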