harvestdor-indexer 0.0.13 → 1.0.0

Sign up to get free protection for your applications and to get access to all the features.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA1:
3
- metadata.gz: a14d39d89afa8e167b50facf06cf7cf0a6c5fa76
4
- data.tar.gz: 63478e959742b0640189aaf90b2f185315b4b9c3
3
+ metadata.gz: 45d033dffd56d8c3abd90ccafd0fa4666b4379da
4
+ data.tar.gz: f579c955fb390f2d7ed497aa237dbe763361d60b
5
5
  SHA512:
6
- metadata.gz: eb6f6830232b5d7609a29779c8ca58139cbad499d5c51daa9bfeb0252fe52fc633296a360ccd5e2e9c7b2b68313402c7fb9b6e4532a414a2d130c3f0fe867a69
7
- data.tar.gz: f97897beffaebca548732810ed0d3281b169feead5203aaa9ec1a63167cdbf4813a5619052b9ab1cdf8f2fc54bb5068d827487ec9e668b27a1f8b85acd375552
6
+ metadata.gz: a7e9213e27145df273eaee8384b690e9ae720aeffd7f781f64dec9bc13948c6716b942499305037c7210c59ae9b5f07747f5ff67e4fe315577a78bce7d530a81
7
+ data.tar.gz: 896d1c26f157087e2bd6f935e0106024e9a7840cdb5c5db05e6a348ff1fe0579410bc70895b26b3db60a6461883bfd5bfe4a8ee43483d0f8821bf73cad4c2b1c
data/README.rdoc CHANGED
@@ -1,9 +1,8 @@
1
1
  = Harvestdor::Indexer
2
2
  {<img src="https://travis-ci.org/sul-dlss/harvestdor-indexer.svg" alt="Build Status" />}[https://travis-ci.org/sul-dlss/harvestdor-indexer]
3
- {<img src="https://coveralls.io/repos/sul-dlss/harvestdor-indexer/badge.png" alt="Coverage Status" />}[https://coveralls.io/r/sul-dlss/harvestdor-indexer]{<img src="https://gemnasium.com/sul-dlss/harvestdor-indexer.svg" alt="Dependency Status" />}[https://gemnasium.com/sul-dlss/harvestdor-indexer]{<img src="https://badge.fury.io/rb/harvestdor-indexer.svg" alt="Gem Version" />}[http://badge.fury.io/rb/harvestdor-indexer]
4
-
5
-
6
-
3
+ {<img src="https://coveralls.io/repos/sul-dlss/harvestdor-indexer/badge.png" alt="Coverage Status" />}[https://coveralls.io/r/sul-dlss/harvestdor-indexer]
4
+ {<img src="https://gemnasium.com/sul-dlss/harvestdor-indexer.svg" alt="Dependency Status" />}[https://gemnasium.com/sul-dlss/harvestdor-indexer]
5
+ {<img src="https://badge.fury.io/rb/harvestdor-indexer.svg" alt="Gem Version" />}[http://badge.fury.io/rb/harvestdor-indexer]
7
6
 
8
7
  A Gem to harvest meta/data from DOR and the skeleton code to index it and write to Solr.
9
8
 
@@ -42,13 +41,10 @@ Note: Because of an update to underlying HTTP libraries, versions of this gem >
42
41
  See spec/config/ap.yml for an example.
43
42
  You will want to copy that file and change the following settings:
44
43
  1. log_name
45
- 2. default_set (in OAI harvesting params section)
46
- 3. other OAI harvesting params
47
- 4. blacklist or whitelist if you are using them
44
+ 2. default_set
45
+ 3. blacklist or whitelist if you are using them
48
46
 
49
- You can also pass in non-default configurations as a hash
50
-
51
- indexer = Harvestdor::Indexer.new({:oai_repository_url => 'http://my_oai.org, :default_from_date => '2012-12-01'})
47
+ Update the dor-fetcher-client.yml file in the config directory with the location of the URL of the dor-fetcher-service provider. The defaulted value is the 3000 port for a localhost - dor_fetcher_service_url: http://127.0.0.1:3000
52
48
 
53
49
  === Override the Harvestdor::Indexer.index method
54
50
 
@@ -94,17 +90,21 @@ I suggest you write a script to run the code. Your script might look like this:
94
90
  end
95
91
  config_yml_path = ARGV.pop
96
92
  if config_yml_path.nil?
97
- puts "** You must provide the full path to a config yml file **"
93
+ puts "** You must provide the full path to a collection config yml file **"
94
+ exit
95
+ end
96
+ if client_config_path.nil?
97
+ puts "** You must provide the full path to dor-fetcher-client config yml file **"
98
98
  exit
99
99
  end
100
- indexer = Harvestdor::Indexer.new(config_yml_path, opts)
100
+ indexer = Harvestdor::Indexer.new(config_yml_path, client_config_path, opts)
101
101
  indexer.harvest_and_index
102
102
 
103
103
  Then you run the script like so:
104
104
 
105
105
  ./bin/indexer config/(your coll).yml
106
106
 
107
- I suggest you run your code on harvestdor-dev, as it is already set up to be able to harvest from the DOR OAI provider
107
+ I suggest you run your code on harvestdor-dev, as it is already set up to be able to harvest from the DorFetcher
108
108
 
109
109
 
110
110
  == Contributing
@@ -118,6 +118,7 @@ I suggest you run your code on harvestdor-dev, as it is already set up to be abl
118
118
 
119
119
  == Releases
120
120
 
121
+ * <b>1.0.0</b> Replaced OAI harvesting mechanism with dor-fetcher
121
122
  * <b>0.0.13</b> Upgrade to latest faraday HTTP client syntax; Use retries gem (https://github.com/ooyala/retries) to make retrying of index process more robust
122
123
  * <b>0.0.12</b> fix total_object nil error
123
124
  * <b>0.0.11</b> fix error_count and success_count, allow setting of max-tries (retry solr add if error)
@@ -0,0 +1,4 @@
1
+ # ---------- DorFetcher Client parameters -----------
2
+
3
+ # dor_fetcher_service_url: URL of the dor-fetcher-service provider
4
+ dor_fetcher_service_url: http://127.0.0.1:3000
@@ -6,8 +6,8 @@ require 'harvestdor-indexer/version'
6
6
  Gem::Specification.new do |gem|
7
7
  gem.name = "harvestdor-indexer"
8
8
  gem.version = Harvestdor::Indexer::VERSION
9
- gem.authors = ["Naomi Dushay"]
10
- gem.email = ["ndushay@stanford.edu"]
9
+ gem.authors = ["Naomi Dushay", "Bess Sadler", "Laney McGlohon"]
10
+ gem.email = ["ndushay@stanford.edu", "bess@stanford.edu", "laneymcg@stanford.edu"]
11
11
  gem.description = %q{Harvest DOR object metadata via a relationship (e.g. hydra:isGovernedBy rdf:resource="info:fedora/druid:hy787xj5878") and dates, plus code framework to write Solr docs to index}
12
12
  gem.summary = %q{Harvest DOR object metadata and index it to Solr}
13
13
  gem.homepage = "https://consul.stanford.edu/display/chimera/Chimera+project"
@@ -21,6 +21,7 @@ Gem::Specification.new do |gem|
21
21
  gem.add_dependency 'retries'
22
22
  gem.add_dependency 'harvestdor', '>=0.0.14'
23
23
  gem.add_dependency 'stanford-mods'
24
+ gem.add_dependency 'dor-fetcher', '>=1.0.0'
24
25
 
25
26
  # Runtime dependencies
26
27
  gem.add_runtime_dependency 'confstruct'
@@ -36,5 +37,7 @@ Gem::Specification.new do |gem|
36
37
  gem.add_development_dependency 'rspec'
37
38
  gem.add_development_dependency 'coveralls'
38
39
  # gem.add_development_dependency 'ruby-debug19'
40
+ gem.add_development_dependency 'vcr'
41
+ gem.add_development_dependency 'webmock'
39
42
 
40
43
  end
@@ -1,6 +1,6 @@
1
1
  module Harvestdor
2
2
  class Indexer
3
3
  # this is the Ruby Gem version
4
- VERSION = "0.0.13"
4
+ VERSION = "1.0.0"
5
5
  end
6
6
  end
@@ -2,10 +2,12 @@
2
2
  require 'confstruct'
3
3
  require 'rsolr'
4
4
  require 'retries'
5
+ require 'json'
5
6
 
6
7
  # sul-dlss gems
7
8
  require 'harvestdor'
8
9
  require 'stanford-mods'
10
+ require 'dor-fetcher'
9
11
 
10
12
  # stdlib
11
13
  require 'logger'
@@ -18,8 +20,9 @@ module Harvestdor
18
20
 
19
21
  attr_accessor :error_count, :success_count, :max_retries
20
22
  attr_accessor :total_time_to_parse,:total_time_to_solr
23
+ attr_accessor :dor_fetcher_client, :client_config
21
24
 
22
- def initialize yml_path, options = {}
25
+ def initialize yml_path, client_config_path, options = {}
23
26
  @success_count=0 # the number of objects successfully indexed
24
27
  @error_count=0 # the number of objects that failed
25
28
  @max_retries=10 # the number of times to retry an object
@@ -27,8 +30,10 @@ module Harvestdor
27
30
  @total_time_to_parse=0
28
31
  @yml_path = yml_path
29
32
  config.configure(YAML.load_file(yml_path)) if yml_path
30
- config.configure options
33
+ config.configure options
31
34
  yield(config) if block_given?
35
+ @client_config = YAML.load_file(client_config_path) if client_config_path && File.exists?(client_config_path)
36
+ @dor_fetcher_client=DorFetcher::Client.new({:service_url => client_config["dor_fetcher_service_url"]})
32
37
  end
33
38
 
34
39
  def config
@@ -40,7 +45,7 @@ module Harvestdor
40
45
  end
41
46
 
42
47
  # per this Indexer's config options
43
- # harvest the druids via OAI
48
+ # harvest the druids via DorFetcher
44
49
  # create a Solr profiling document for each druid
45
50
  # write the result to the Solr index
46
51
  def harvest_and_index
@@ -67,14 +72,14 @@ module Harvestdor
67
72
  logger.info("Total records processed: #{total_objects}")
68
73
  end
69
74
 
70
- # return Array of druids contained in the OAI harvest indicated by OAI params in yml configuration file
75
+ # return Array of druids contained in the DorFetcher pulling indicated by DorFetcher params
71
76
  # @return [Array<String>] or enumeration over it, if block is given. (strings are druids, e.g. ab123cd1234)
72
77
  def druids
73
78
  if @druids.nil?
74
79
  start_time=Time.now
75
- logger.info("Starting OAI harvest of druids at #{start_time}.")
76
- @druids = harvestdor_client.druids_via_oai
77
- logger.info("Completed OAI harves of druids at #{Time.now}. Found #{@druids.size} druids. Total elapsed time for OAI harvest = #{elapsed_time(start_time,:minutes)} minutes")
80
+ logger.info("Starting DorFetcher pulling of druids at #{start_time}.")
81
+ @druids = @dor_fetcher_client.druid_array(@dor_fetcher_client.get_collection(strip_default_set_string(), {}))
82
+ logger.info("Completed DorFetcher pulling of druids at #{Time.now}. Found #{@druids.size} druids. Total elapsed time for DorFetcher pulling = #{elapsed_time(start_time,:minutes)} minutes")
78
83
  end
79
84
  return @druids
80
85
  end
@@ -224,6 +229,12 @@ module Harvestdor
224
229
  @whitelist ||= []
225
230
  end
226
231
 
232
+ # Get only the druid from the end of the default_set string
233
+ # from the yml file
234
+ def strip_default_set_string()
235
+ @config.default_set.split('_').last
236
+ end
237
+
227
238
  protected #---------------------------------------------------------------------
228
239
 
229
240
  def harvestdor_client
data/spec/config/ap.yml CHANGED
@@ -1,7 +1,6 @@
1
1
  # You will want to copy this file and change the following settings:
2
2
  # 1. log_dir, log_name
3
- # 2. default_set (in OAI harvesting params section)
4
- # 2a. other OAI harvesting params
3
+ # 2. default_set
5
4
  # 3. blacklist or whitelist if you are using them
6
5
  # 4. Solr baseurl
7
6
 
@@ -16,8 +15,9 @@ purl: http://purl.stanford.edu
16
15
 
17
16
  # ---------- White and Black list parameters -----
18
17
 
19
- # name of file containing druids that will NOT be processed even if they are harvested via OAI
20
- # either give absolute path or path relative to where the command will be executed
18
+ # name of file containing druids that will NOT be processed even if they are harvested
19
+ # via DorFetcher either give absolute path or path relative to where the command will
20
+ # be executed
21
21
  #blacklist: config/ap_blacklist.txt
22
22
 
23
23
  # name of file containing druids that WILL be processed (all others will be ignored)
@@ -32,14 +32,9 @@ solr:
32
32
  read_timeout: 60
33
33
  open_timeout: 60
34
34
 
35
- # ---------- OAI harvesting parameters -----------
36
-
37
- # oai_repository_url: URL of the OAI data provider
38
- oai_repository_url: https://dor-oaiprovider-prod.stanford.edu/oai
39
-
40
35
  # default_set: default set for harvest (default: nil)
41
36
  # can be overridden on calls to harvest_ids and harvest_records
42
- default_set: is_governed_by_hy787xj5878
37
+ default_set: is_governed_by_yg867hg1375
43
38
 
44
39
  # default_metadata_prefix: default metadata prefix to be used for harvesting (default: mods)
45
40
  # can be overridden on calls to harvest_ids and harvest_records
@@ -50,8 +45,6 @@ default_set: is_governed_by_hy787xj5878
50
45
  # default_until_date: default until date for harvest (default: nil)
51
46
  # can be overridden on calls to harvest_ids and harvest_records
52
47
 
53
- # oai_client_debug: true for OAI::Client debug mode (default: false)
54
-
55
48
  # Additional options to pass to Faraday http client (https://github.com/technoweenie/faraday)
56
49
  http_options:
57
50
  ssl:
@@ -1,5 +1,5 @@
1
1
  # blacklist containing druids that should NOT be processed.
2
2
  # druids should be in the form aa111bb2222
3
3
 
4
- oo111oo1111
5
- oo222oo2222
4
+ druid:jf275fd6276
5
+ druid:tc552kq0798
@@ -1,5 +1,6 @@
1
1
  # whitelist containing the specific druids to be processed (all others will be ignored)
2
2
  # druids should be in the form aa111bb2222
3
3
 
4
- oo000oo0000
5
- oo222oo2222
4
+ druid:yg867hg1375
5
+ druid:jf275fd6276
6
+ druid:nz353cp1092
@@ -0,0 +1,114 @@
1
+ ---
2
+ http_interactions:
3
+ - request:
4
+ method: get
5
+ uri: http://purl.stanford.edu/oo000oo0000.mods
6
+ body:
7
+ encoding: US-ASCII
8
+ string: ''
9
+ headers:
10
+ Accept:
11
+ - '*/*'
12
+ User-Agent:
13
+ - Ruby
14
+ response:
15
+ status:
16
+ code: 404
17
+ message: ''
18
+ headers:
19
+ Date:
20
+ - Wed, 22 Oct 2014 20:26:30 GMT
21
+ Server:
22
+ - Apache/2.2.15 (Red Hat)
23
+ X-Powered-By:
24
+ - Phusion Passenger (mod_rails/mod_rack) 3.0.19
25
+ X-Ua-Compatible:
26
+ - IE=Edge,chrome=1
27
+ Cache-Control:
28
+ - no-cache
29
+ X-Request-Id:
30
+ - eb7854ee5cc96cbf20bfdafb0e8ea1c2
31
+ X-Runtime:
32
+ - '0.011781'
33
+ X-Rack-Cache:
34
+ - miss
35
+ Status:
36
+ - '404'
37
+ Content-Length:
38
+ - '3015'
39
+ Content-Type:
40
+ - text/html; charset=utf-8
41
+ body:
42
+ encoding: US-ASCII
43
+ string: |
44
+ <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
45
+ <html xmlns="http://www.w3.org/1999/xhtml">
46
+ <head>
47
+ <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
48
+ <title>Stanford Digital Repository</title>
49
+ <link rel="shortcut icon" href="favicon.ico" />
50
+ <link href="/assets/site.css" media="screen" rel="stylesheet" type="text/css" />
51
+ </head>
52
+ <body>
53
+ <div id="container">
54
+ <div id="banner"><h1>Stanford Digital Repository</h1></div>
55
+ <div id="contents">
56
+ <div class="dialog">
57
+ <h2>The item you requested is not available.</h2>
58
+ <p>The item you requested is not yet available. It will be available at this URL when Library processing is completed.</p>
59
+ </div>
60
+
61
+ </div>
62
+ </div>
63
+
64
+ <div id="footer">
65
+ <div class="footer-contents">
66
+ <div class="footer-sul">
67
+ <div class="footer-logo">
68
+ <a href="http://library.stanford.edu" target="_blank"><img src="/images/footer-sul-logo.png"></a>
69
+ </div>
70
+ <div class="footer-links">
71
+ <a href="http://library.stanford.edu" target="_blank">Stanford University Libraries</a>
72
+ <a href="http://searchworks.stanford.edu" target="_blank">SearchWorks</a>
73
+ <a href="http://library.stanford.edu/ejournals" target="_blank">eJournals</a>
74
+ <a href="hhttp://library.stanford.edu/myawrap.html" target="_blank">My Account</a>
75
+ <a href="http://library.stanford.edu/ask" target="_blank">Ask Us</a>
76
+ </div>
77
+ </div>
78
+ <div class="footer-su">
79
+ <div class="footer-logo">
80
+ <a href="http://www.stanford.edu" target="_blank"><img src="/images/footer-stanford-logo.png"></a>
81
+ </div>
82
+ <div class="footer-links">
83
+ <a href="http://www.stanford.edu" target="_blank">SU Home</a>
84
+ <a href="http://visit.stanford.edu/plan/maps.html" target="_blank">Maps &amp; Directions</a>
85
+ <a href="http://www.stanford.edu/search/" target="_blank">Search Stanford</a>
86
+ <a href="http://www.stanford.edu/site/terms.html" target="_blank">Terms of Use</a>
87
+ <a href="http://www.stanford.edu/site/copyright.html" target="_blank">Copyright Complaints</a>
88
+ </br>&copy; Stanford University, Stanford, California 94305
89
+ </div>
90
+ </div>
91
+ </div>
92
+ </div>
93
+ <script src="/assets/jquery.js" type="text/javascript"></script>
94
+ <script src="/assets/jquery.truncator.js" type="text/javascript"></script>
95
+ <script src="/assets/application.js" type="text/javascript"></script>
96
+
97
+ <script type="text/javascript">
98
+
99
+ var _gaq = _gaq || [];
100
+ _gaq.push(['_setAccount', 'UA-7219229-11']);
101
+ _gaq.push(['_trackPageview']);
102
+
103
+ (function() {
104
+ var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
105
+ ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
106
+ var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
107
+ })();
108
+
109
+ </script>
110
+ </body>
111
+ </html>
112
+ http_version:
113
+ recorded_at: Wed, 22 Oct 2014 20:26:30 GMT
114
+ recorded_with: VCR 2.9.3
@@ -0,0 +1,58 @@
1
+ ---
2
+ http_interactions:
3
+ - request:
4
+ method: get
5
+ uri: http://127.0.0.1:3000/collection/yg867hg1375
6
+ body:
7
+ encoding: US-ASCII
8
+ string: ''
9
+ headers:
10
+ Accept:
11
+ - '*/*'
12
+ User-Agent:
13
+ - Ruby
14
+ response:
15
+ status:
16
+ code: 200
17
+ message: 'OK '
18
+ headers:
19
+ X-Frame-Options:
20
+ - SAMEORIGIN
21
+ X-Xss-Protection:
22
+ - 1; mode=block
23
+ X-Content-Type-Options:
24
+ - nosniff
25
+ Content-Type:
26
+ - application/json; charset=utf-8
27
+ Etag:
28
+ - '"682afec57f678e4d153a5841b21395dd"'
29
+ Cache-Control:
30
+ - max-age=0, private, must-revalidate
31
+ X-Request-Id:
32
+ - 0954c447-9cb9-4eeb-8020-d87f13098f07
33
+ X-Runtime:
34
+ - '0.006736'
35
+ Server:
36
+ - WEBrick/1.3.1 (Ruby/2.1.2/2014-05-08)
37
+ Date:
38
+ - Wed, 22 Oct 2014 18:42:32 GMT
39
+ Content-Length:
40
+ - '1121'
41
+ Connection:
42
+ - Keep-Alive
43
+ body:
44
+ encoding: US-ASCII
45
+ string: '{"collection":[{"druid":"druid:yg867hg1375","latest_change":"2013-11-11T23:34:29Z","title":["Francis
46
+ E. Stafford photographs, 1909-1933"]}],"item":[{"druid":"druid:jf275fd6276","latest_change":"2013-11-11T23:34:29Z","title":["Album
47
+ A: Photographs of China''s natural landscapes, urban scenes, cultural landmarks,
48
+ social customs, and people."]},{"druid":"druid:nz353cp1092","latest_change":"2013-11-11T23:34:29Z","title":["Album
49
+ E: Photographs of the Seventh Day Adventist Church missionaries in China"]},{"druid":"druid:tc552kq0798","latest_change":"2013-11-11T23:34:29Z","title":["Album
50
+ D: Photographs of China''s natural landscapes, urban scenes, cultural landmarks,
51
+ social customs, and people."]},{"druid":"druid:th998nk0722","latest_change":"2013-11-11T23:34:29Z","title":["Album
52
+ C: Photographs of the Chinese Revolution of 1911 and the Shanghai Commercial
53
+ Press"]},{"druid":"druid:ww689vs6534","latest_change":"2013-11-11T23:34:29Z","title":["Album
54
+ B: Photographs of China''s natural landscapes, urban scenes, cultural landmarks,
55
+ social customs, and people."]}],"counts":[{"collection":1},{"item":5},{"total_count":6}]}'
56
+ http_version:
57
+ recorded_at: Wed, 22 Oct 2014 18:42:32 GMT
58
+ recorded_with: VCR 2.9.3
@@ -0,0 +1,58 @@
1
+ ---
2
+ http_interactions:
3
+ - request:
4
+ method: get
5
+ uri: http://127.0.0.1:3000/collection/yg867hg1375
6
+ body:
7
+ encoding: US-ASCII
8
+ string: ''
9
+ headers:
10
+ Accept:
11
+ - '*/*'
12
+ User-Agent:
13
+ - Ruby
14
+ response:
15
+ status:
16
+ code: 200
17
+ message: 'OK '
18
+ headers:
19
+ X-Frame-Options:
20
+ - SAMEORIGIN
21
+ X-Xss-Protection:
22
+ - 1; mode=block
23
+ X-Content-Type-Options:
24
+ - nosniff
25
+ Content-Type:
26
+ - application/json; charset=utf-8
27
+ Etag:
28
+ - '"682afec57f678e4d153a5841b21395dd"'
29
+ Cache-Control:
30
+ - max-age=0, private, must-revalidate
31
+ X-Request-Id:
32
+ - 1e0232c6-fc39-49bf-b874-89567e225d00
33
+ X-Runtime:
34
+ - '0.006851'
35
+ Server:
36
+ - WEBrick/1.3.1 (Ruby/2.1.2/2014-05-08)
37
+ Date:
38
+ - Wed, 22 Oct 2014 18:53:15 GMT
39
+ Content-Length:
40
+ - '1121'
41
+ Connection:
42
+ - Keep-Alive
43
+ body:
44
+ encoding: US-ASCII
45
+ string: '{"collection":[{"druid":"druid:yg867hg1375","latest_change":"2013-11-11T23:34:29Z","title":["Francis
46
+ E. Stafford photographs, 1909-1933"]}],"item":[{"druid":"druid:jf275fd6276","latest_change":"2013-11-11T23:34:29Z","title":["Album
47
+ A: Photographs of China''s natural landscapes, urban scenes, cultural landmarks,
48
+ social customs, and people."]},{"druid":"druid:nz353cp1092","latest_change":"2013-11-11T23:34:29Z","title":["Album
49
+ E: Photographs of the Seventh Day Adventist Church missionaries in China"]},{"druid":"druid:tc552kq0798","latest_change":"2013-11-11T23:34:29Z","title":["Album
50
+ D: Photographs of China''s natural landscapes, urban scenes, cultural landmarks,
51
+ social customs, and people."]},{"druid":"druid:th998nk0722","latest_change":"2013-11-11T23:34:29Z","title":["Album
52
+ C: Photographs of the Chinese Revolution of 1911 and the Shanghai Commercial
53
+ Press"]},{"druid":"druid:ww689vs6534","latest_change":"2013-11-11T23:34:29Z","title":["Album
54
+ B: Photographs of China''s natural landscapes, urban scenes, cultural landmarks,
55
+ social customs, and people."]}],"counts":[{"collection":1},{"item":5},{"total_count":6}]}'
56
+ http_version:
57
+ recorded_at: Wed, 22 Oct 2014 18:53:15 GMT
58
+ recorded_with: VCR 2.9.3
@@ -0,0 +1,58 @@
1
+ ---
2
+ http_interactions:
3
+ - request:
4
+ method: get
5
+ uri: http://127.0.0.1:3000/collection/yg867hg1375
6
+ body:
7
+ encoding: US-ASCII
8
+ string: ''
9
+ headers:
10
+ Accept:
11
+ - '*/*'
12
+ User-Agent:
13
+ - Ruby
14
+ response:
15
+ status:
16
+ code: 200
17
+ message: 'OK '
18
+ headers:
19
+ X-Frame-Options:
20
+ - SAMEORIGIN
21
+ X-Xss-Protection:
22
+ - 1; mode=block
23
+ X-Content-Type-Options:
24
+ - nosniff
25
+ Content-Type:
26
+ - application/json; charset=utf-8
27
+ Etag:
28
+ - '"682afec57f678e4d153a5841b21395dd"'
29
+ Cache-Control:
30
+ - max-age=0, private, must-revalidate
31
+ X-Request-Id:
32
+ - d35b0793-e841-496b-bce1-720bfbf2ad04
33
+ X-Runtime:
34
+ - '0.006751'
35
+ Server:
36
+ - WEBrick/1.3.1 (Ruby/2.1.2/2014-05-08)
37
+ Date:
38
+ - Wed, 22 Oct 2014 20:32:36 GMT
39
+ Content-Length:
40
+ - '1121'
41
+ Connection:
42
+ - Keep-Alive
43
+ body:
44
+ encoding: US-ASCII
45
+ string: '{"collection":[{"druid":"druid:yg867hg1375","latest_change":"2013-11-11T23:34:29Z","title":["Francis
46
+ E. Stafford photographs, 1909-1933"]}],"item":[{"druid":"druid:jf275fd6276","latest_change":"2013-11-11T23:34:29Z","title":["Album
47
+ A: Photographs of China''s natural landscapes, urban scenes, cultural landmarks,
48
+ social customs, and people."]},{"druid":"druid:nz353cp1092","latest_change":"2013-11-11T23:34:29Z","title":["Album
49
+ E: Photographs of the Seventh Day Adventist Church missionaries in China"]},{"druid":"druid:tc552kq0798","latest_change":"2013-11-11T23:34:29Z","title":["Album
50
+ D: Photographs of China''s natural landscapes, urban scenes, cultural landmarks,
51
+ social customs, and people."]},{"druid":"druid:th998nk0722","latest_change":"2013-11-11T23:34:29Z","title":["Album
52
+ C: Photographs of the Chinese Revolution of 1911 and the Shanghai Commercial
53
+ Press"]},{"druid":"druid:ww689vs6534","latest_change":"2013-11-11T23:34:29Z","title":["Album
54
+ B: Photographs of China''s natural landscapes, urban scenes, cultural landmarks,
55
+ social customs, and people."]}],"counts":[{"collection":1},{"item":5},{"total_count":6}]}'
56
+ http_version:
57
+ recorded_at: Wed, 22 Oct 2014 20:32:36 GMT
58
+ recorded_with: VCR 2.9.3
@@ -0,0 +1,58 @@
1
+ ---
2
+ http_interactions:
3
+ - request:
4
+ method: get
5
+ uri: http://127.0.0.1:3000/collection/yg867hg1375
6
+ body:
7
+ encoding: US-ASCII
8
+ string: ''
9
+ headers:
10
+ Accept:
11
+ - '*/*'
12
+ User-Agent:
13
+ - Ruby
14
+ response:
15
+ status:
16
+ code: 200
17
+ message: 'OK '
18
+ headers:
19
+ X-Frame-Options:
20
+ - SAMEORIGIN
21
+ X-Xss-Protection:
22
+ - 1; mode=block
23
+ X-Content-Type-Options:
24
+ - nosniff
25
+ Content-Type:
26
+ - application/json; charset=utf-8
27
+ Etag:
28
+ - '"682afec57f678e4d153a5841b21395dd"'
29
+ Cache-Control:
30
+ - max-age=0, private, must-revalidate
31
+ X-Request-Id:
32
+ - a631e18e-8396-4699-b7a9-fd05fd115e02
33
+ X-Runtime:
34
+ - '0.006491'
35
+ Server:
36
+ - WEBrick/1.3.1 (Ruby/2.1.2/2014-05-08)
37
+ Date:
38
+ - Wed, 22 Oct 2014 20:34:01 GMT
39
+ Content-Length:
40
+ - '1121'
41
+ Connection:
42
+ - Keep-Alive
43
+ body:
44
+ encoding: US-ASCII
45
+ string: '{"collection":[{"druid":"druid:yg867hg1375","latest_change":"2013-11-11T23:34:29Z","title":["Francis
46
+ E. Stafford photographs, 1909-1933"]}],"item":[{"druid":"druid:jf275fd6276","latest_change":"2013-11-11T23:34:29Z","title":["Album
47
+ A: Photographs of China''s natural landscapes, urban scenes, cultural landmarks,
48
+ social customs, and people."]},{"druid":"druid:nz353cp1092","latest_change":"2013-11-11T23:34:29Z","title":["Album
49
+ E: Photographs of the Seventh Day Adventist Church missionaries in China"]},{"druid":"druid:tc552kq0798","latest_change":"2013-11-11T23:34:29Z","title":["Album
50
+ D: Photographs of China''s natural landscapes, urban scenes, cultural landmarks,
51
+ social customs, and people."]},{"druid":"druid:th998nk0722","latest_change":"2013-11-11T23:34:29Z","title":["Album
52
+ C: Photographs of the Chinese Revolution of 1911 and the Shanghai Commercial
53
+ Press"]},{"druid":"druid:ww689vs6534","latest_change":"2013-11-11T23:34:29Z","title":["Album
54
+ B: Photographs of China''s natural landscapes, urban scenes, cultural landmarks,
55
+ social customs, and people."]}],"counts":[{"collection":1},{"item":5},{"total_count":6}]}'
56
+ http_version:
57
+ recorded_at: Wed, 22 Oct 2014 20:34:01 GMT
58
+ recorded_with: VCR 2.9.3
@@ -0,0 +1,58 @@
1
+ ---
2
+ http_interactions:
3
+ - request:
4
+ method: get
5
+ uri: http://127.0.0.1:3000/collection/yg867hg1375
6
+ body:
7
+ encoding: US-ASCII
8
+ string: ''
9
+ headers:
10
+ Accept:
11
+ - '*/*'
12
+ User-Agent:
13
+ - Ruby
14
+ response:
15
+ status:
16
+ code: 200
17
+ message: 'OK '
18
+ headers:
19
+ X-Frame-Options:
20
+ - SAMEORIGIN
21
+ X-Xss-Protection:
22
+ - 1; mode=block
23
+ X-Content-Type-Options:
24
+ - nosniff
25
+ Content-Type:
26
+ - application/json; charset=utf-8
27
+ Etag:
28
+ - '"682afec57f678e4d153a5841b21395dd"'
29
+ Cache-Control:
30
+ - max-age=0, private, must-revalidate
31
+ X-Request-Id:
32
+ - 37413f0c-a104-4df1-8a80-1873389200f4
33
+ X-Runtime:
34
+ - '0.006754'
35
+ Server:
36
+ - WEBrick/1.3.1 (Ruby/2.1.2/2014-05-08)
37
+ Date:
38
+ - Wed, 22 Oct 2014 18:42:32 GMT
39
+ Content-Length:
40
+ - '1121'
41
+ Connection:
42
+ - Keep-Alive
43
+ body:
44
+ encoding: US-ASCII
45
+ string: '{"collection":[{"druid":"druid:yg867hg1375","latest_change":"2013-11-11T23:34:29Z","title":["Francis
46
+ E. Stafford photographs, 1909-1933"]}],"item":[{"druid":"druid:jf275fd6276","latest_change":"2013-11-11T23:34:29Z","title":["Album
47
+ A: Photographs of China''s natural landscapes, urban scenes, cultural landmarks,
48
+ social customs, and people."]},{"druid":"druid:nz353cp1092","latest_change":"2013-11-11T23:34:29Z","title":["Album
49
+ E: Photographs of the Seventh Day Adventist Church missionaries in China"]},{"druid":"druid:tc552kq0798","latest_change":"2013-11-11T23:34:29Z","title":["Album
50
+ D: Photographs of China''s natural landscapes, urban scenes, cultural landmarks,
51
+ social customs, and people."]},{"druid":"druid:th998nk0722","latest_change":"2013-11-11T23:34:29Z","title":["Album
52
+ C: Photographs of the Chinese Revolution of 1911 and the Shanghai Commercial
53
+ Press"]},{"druid":"druid:ww689vs6534","latest_change":"2013-11-11T23:34:29Z","title":["Album
54
+ B: Photographs of China''s natural landscapes, urban scenes, cultural landmarks,
55
+ social customs, and people."]}],"counts":[{"collection":1},{"item":5},{"total_count":6}]}'
56
+ http_version:
57
+ recorded_at: Wed, 22 Oct 2014 18:42:32 GMT
58
+ recorded_with: VCR 2.9.3
data/spec/spec_helper.rb CHANGED
@@ -6,5 +6,9 @@ $LOAD_PATH.unshift(File.dirname(__FILE__))
6
6
 
7
7
  require 'harvestdor-indexer'
8
8
 
9
- #RSpec.configure do |config|
10
- #end
9
+ require 'vcr'
10
+
11
+ VCR.configure do |c|
12
+ c.cassette_library_dir = 'spec/fixtures/vcr_cassettes'
13
+ c.hook_into :webmock
14
+ end
@@ -4,7 +4,8 @@ describe Harvestdor::Indexer do
4
4
 
5
5
  before(:all) do
6
6
  @config_yml_path = File.join(File.dirname(__FILE__), "..", "config", "ap.yml")
7
- @indexer = Harvestdor::Indexer.new(@config_yml_path)
7
+ @client_config_path = File.join(File.dirname(__FILE__), "../..", "config", "dor-fetcher-client.yml")
8
+ @indexer = Harvestdor::Indexer.new(@config_yml_path, @client_config_path)
8
9
  require 'yaml'
9
10
  @yaml = YAML.load_file(@config_yml_path)
10
11
  @hdor_client = @indexer.send(:harvestdor_client)
@@ -65,55 +66,85 @@ describe Harvestdor::Indexer do
65
66
  :id => @fake_druid
66
67
  }
67
68
  end
68
- it "should call druids_via_oai and then call :add on rsolr connection" do
69
+ it "should call dor_fetcher_client.druid_array and then call :add on rsolr connection" do
69
70
  @indexer.should_receive(:druids).and_return([@fake_druid])
70
71
  @indexer.solr_client.should_receive(:add).with(@doc_hash)
71
72
  @indexer.solr_client.should_receive(:commit)
72
73
  @indexer.harvest_and_index
73
74
  end
74
- it "should not process druids in blacklist" do
75
- indexer = Harvestdor::Indexer.new(@config_yml_path, {:blacklist => @blacklist_path})
76
- hdor_client = indexer.send(:harvestdor_client)
77
- hdor_client.should_receive(:druids_via_oai).and_return(['oo000oo0000', 'oo111oo1111', 'oo222oo2222', 'oo333oo3333'])
78
- indexer.solr_client.should_receive(:add).with(hash_including({:id => 'oo000oo0000'}))
79
- indexer.solr_client.should_not_receive(:add).with(hash_including({:id => 'oo111oo1111'}))
80
- indexer.solr_client.should_not_receive(:add).with(hash_including({:id => 'oo222oo2222'}))
81
- indexer.solr_client.should_receive(:add).with(hash_including({:id => 'oo333oo3333'}))
82
- indexer.solr_client.should_receive(:commit)
83
- indexer.harvest_and_index
75
+
76
+ it "should only call :commit on rsolr connection once" do
77
+ VCR.use_cassette('single_rsolr_connection_call') do
78
+ indexer = Harvestdor::Indexer.new(@config_yml_path, @client_config_path)
79
+ hdor_client = indexer.send(:harvestdor_client)
80
+ indexer.dor_fetcher_client.should_receive(:druid_array).and_return(["druid:yg867hg1375", "druid:jf275fd6276", "druid:nz353cp1092", "druid:tc552kq0798", "druid:th998nk0722", "druid:ww689vs6534"])
81
+ indexer.solr_client.should_receive(:add).exactly(6).times
82
+ indexer.solr_client.should_receive(:commit).once
83
+ indexer.harvest_and_index
84
+ end
84
85
  end
85
- it "should only process druids in whitelist if it exists" do
86
- indexer = Harvestdor::Indexer.new(@config_yml_path, {:whitelist => @whitelist_path})
87
- hdor_client = indexer.send(:harvestdor_client)
88
- hdor_client.should_not_receive(:druids_via_oai)
89
- indexer.solr_client.should_receive(:add).with(hash_including({:id => 'oo000oo0000'}))
90
- indexer.solr_client.should_receive(:add).with(hash_including({:id => 'oo222oo2222'}))
91
- indexer.solr_client.should_receive(:commit)
92
- indexer.harvest_and_index
86
+
87
+ it "should not process druids in blacklist" do
88
+ VCR.use_cassette('ignore_druids_in_blacklist_call') do
89
+ indexer = Harvestdor::Indexer.new(@config_yml_path, @client_config_path, {:blacklist => @blacklist_path})
90
+ hdor_client = indexer.send(:harvestdor_client)
91
+ indexer.dor_fetcher_client.should_receive(:druid_array).and_return(["druid:yg867hg1375", "druid:jf275fd6276", "druid:nz353cp1092", "druid:tc552kq0798", "druid:th998nk0722", "druid:ww689vs6534"])
92
+ indexer.solr_client.should_receive(:add).with(hash_including({:id => 'druid:nz353cp1092'}))
93
+ indexer.solr_client.should_not_receive(:add).with(hash_including({:id => 'druid:jf275fd6276'}))
94
+ indexer.solr_client.should_not_receive(:add).with(hash_including({:id => 'druid:tc552kq0798'}))
95
+ indexer.solr_client.should_receive(:add).with(hash_including({:id => 'druid:th998nk0722'}))
96
+ indexer.solr_client.should_receive(:commit)
97
+ indexer.harvest_and_index
98
+ end
93
99
  end
94
100
  it "should not process druid if it is in both blacklist and whitelist" do
95
- indexer = Harvestdor::Indexer.new(@config_yml_path, {:blacklist => @blacklist_path, :whitelist => @whitelist_path})
96
- hdor_client = indexer.send(:harvestdor_client)
97
- hdor_client.should_not_receive(:druids_via_oai)
98
- indexer.solr_client.should_receive(:add).with(hash_including({:id => 'oo000oo0000'}))
99
- indexer.solr_client.should_receive(:commit)
100
- indexer.harvest_and_index
101
+ VCR.use_cassette('ignore_druids_in_blacklist_and_whitelist_call') do
102
+ indexer = Harvestdor::Indexer.new(@config_yml_path, @client_config_path, {:blacklist => @blacklist_path, :whitelist => @whitelist_path})
103
+ hdor_client = indexer.send(:harvestdor_client)
104
+ indexer.dor_fetcher_client.should_not_receive(:druid_array)
105
+ indexer.solr_client.should_receive(:add).with(hash_including({:id => 'druid:yg867hg1375'}))
106
+ indexer.solr_client.should_not_receive(:add).with(hash_including({:id => 'druid:jf275fd6276'}))
107
+ indexer.solr_client.should_receive(:commit)
108
+ indexer.harvest_and_index
109
+ end
101
110
  end
102
- it "should only call :commit on rsolr connection once" do
103
- indexer = Harvestdor::Indexer.new(@config_yml_path)
104
- hdor_client = indexer.send(:harvestdor_client)
105
- hdor_client.should_receive(:druids_via_oai).and_return(['1', '2', '3'])
106
- indexer.solr_client.should_receive(:add).exactly(3).times
107
- indexer.solr_client.should_receive(:commit).once
108
- indexer.harvest_and_index
111
+ it "should only process druids in whitelist if it exists" do
112
+ VCR.use_cassette('process_druids_whitelist_call') do
113
+ indexer = Harvestdor::Indexer.new(@config_yml_path, @client_config_path, {:whitelist => @whitelist_path})
114
+ hdor_client = indexer.send(:harvestdor_client)
115
+ indexer.dor_fetcher_client.should_not_receive(:druid_array)
116
+ indexer.solr_client.should_receive(:add).with(hash_including({:id => 'druid:yg867hg1375'}))
117
+ indexer.solr_client.should_receive(:add).with(hash_including({:id => 'druid:jf275fd6276'}))
118
+ indexer.solr_client.should_receive(:add).with(hash_including({:id => 'druid:nz353cp1092'}))
119
+ indexer.solr_client.should_receive(:commit)
120
+ indexer.harvest_and_index
121
+ end
109
122
  end
123
+
110
124
  end
111
125
 
112
- it "druids method should call druids_via_oai method on harvestdor_client" do
113
- @hdor_client.should_receive(:druids_via_oai).and_return([@fake_druid])
114
- @indexer.druids
115
- end
116
-
126
+ # Check for replacement of oai harvesting with dor-fetcher
127
+ context "replacing OAI harvesting with dor-fetcher" do
128
+ it "has a dor-fetcher client" do
129
+ expect(@indexer.dor_fetcher_client).to be_an_instance_of(DorFetcher::Client)
130
+ end
131
+
132
+ it "should strip off is_member_of_collection_ and is_governed_by_ and return only the druid" do
133
+ expect(@indexer.strip_default_set_string()).to eq("yg867hg1375")
134
+ end
135
+
136
+ it "druids method should call druid_array and get_collection methods on fetcher_client" do
137
+ VCR.use_cassette('get_collection_druids_call') do
138
+ expect(@indexer.druids).to eq(["druid:yg867hg1375", "druid:jf275fd6276", "druid:nz353cp1092", "druid:tc552kq0798", "druid:th998nk0722", "druid:ww689vs6534"])
139
+ end
140
+ end
141
+
142
+ it "should get the configuration of the dor-fetcher client from included yml file" do
143
+ expect(@indexer.dor_fetcher_client.service_url).to eq(@indexer.client_config["dor_fetcher_service_url"])
144
+ end
145
+
146
+ end # ending replacing OAI context
147
+
117
148
  context "smods_rec method" do
118
149
  before(:all) do
119
150
  @fake_druid = 'oo000oo0000'
@@ -134,7 +165,9 @@ describe Harvestdor::Indexer do
134
165
  expect { @indexer.smods_rec(@fake_druid) }.to raise_error(RuntimeError, Regexp.new("^Empty MODS metadata for #{@fake_druid}: <"))
135
166
  end
136
167
  it "should raise exception if there is no MODS xml for the druid" do
137
- expect { @indexer.smods_rec(@fake_druid) }.to raise_error(Harvestdor::Errors::MissingMods)
168
+ VCR.use_cassette('exception_no_MODS_call') do
169
+ expect { @indexer.smods_rec(@fake_druid) }.to raise_error(Harvestdor::Errors::MissingMods)
170
+ end
138
171
  end
139
172
  end
140
173
 
@@ -253,30 +286,32 @@ describe Harvestdor::Indexer do
253
286
  @indexer.send(:blacklist).size.should == 2
254
287
  end
255
288
  it "should be empty Array if there was no blacklist config setting" do
256
- indexer = Harvestdor::Indexer.new(@config_yml_path)
289
+ indexer = Harvestdor::Indexer.new(@config_yml_path, @client_config_path)
257
290
  indexer.send(:blacklist).should == []
258
291
  end
259
292
  context "load_blacklist" do
260
293
  it "should not be called if there was no blacklist config setting" do
261
- indexer = Harvestdor::Indexer.new(@config_yml_path)
294
+ VCR.use_cassette('no_blacklist_config_call') do
295
+ indexer = Harvestdor::Indexer.new(@config_yml_path, @client_config_path)
262
296
 
263
- indexer.should_not_receive(:load_blacklist)
297
+ indexer.should_not_receive(:load_blacklist)
264
298
 
265
- hdor_client = indexer.send(:harvestdor_client)
266
- hdor_client.should_receive(:druids_via_oai).and_return([@fake_druid])
267
- indexer.solr_client.should_receive(:add)
268
- indexer.solr_client.should_receive(:commit)
269
- indexer.harvest_and_index
299
+ hdor_client = indexer.send(:harvestdor_client)
300
+ indexer.dor_fetcher_client.should_receive(:druid_array).and_return([@fake_druid])
301
+ indexer.solr_client.should_receive(:add)
302
+ indexer.solr_client.should_receive(:commit)
303
+ indexer.harvest_and_index
304
+ end
270
305
  end
271
306
  it "should only try to load a blacklist once" do
272
- indexer = Harvestdor::Indexer.new(@config_yml_path, {:blacklist => @blacklist_path})
307
+ indexer = Harvestdor::Indexer.new(@config_yml_path, @client_config_path, {:blacklist => @blacklist_path})
273
308
  indexer.send(:blacklist)
274
309
  File.any_instance.should_not_receive(:open)
275
310
  indexer.send(:blacklist)
276
311
  end
277
312
  it "should log an error message and throw RuntimeError if it can't find the indicated blacklist file" do
278
313
  exp_msg = 'Unable to find list of druids at bad_path'
279
- indexer = Harvestdor::Indexer.new(@config_yml_path, {:blacklist => 'bad_path'})
314
+ indexer = Harvestdor::Indexer.new(@config_yml_path, @client_config_path, {:blacklist => 'bad_path'})
280
315
  indexer.logger.should_receive(:fatal).with(exp_msg)
281
316
  expect { indexer.send(:load_blacklist, 'bad_path') }.to raise_error(exp_msg)
282
317
  end
@@ -287,33 +322,35 @@ describe Harvestdor::Indexer do
287
322
  it "should be an Array with an entry for each non-empty line in the file" do
288
323
  @indexer.send(:load_whitelist, @whitelist_path)
289
324
  @indexer.send(:whitelist).should be_an_instance_of(Array)
290
- @indexer.send(:whitelist).size.should == 2
325
+ @indexer.send(:whitelist).size.should == 3
291
326
  end
292
327
  it "should be empty Array if there was no whitelist config setting" do
293
- indexer = Harvestdor::Indexer.new(@config_yml_path)
328
+ indexer = Harvestdor::Indexer.new(@config_yml_path, @client_config_path)
294
329
  indexer.send(:whitelist).should == []
295
330
  end
296
331
  context "load_whitelist" do
297
332
  it "should not be called if there was no whitelist config setting" do
298
- indexer = Harvestdor::Indexer.new(@config_yml_path)
333
+ VCR.use_cassette('no_whitelist_config_call') do
334
+ indexer = Harvestdor::Indexer.new(@config_yml_path, @client_config_path)
299
335
 
300
- indexer.should_not_receive(:load_whitelist)
336
+ indexer.should_not_receive(:load_whitelist)
301
337
 
302
- hdor_client = indexer.send(:harvestdor_client)
303
- hdor_client.should_receive(:druids_via_oai).and_return([@fake_druid])
304
- indexer.solr_client.should_receive(:add)
305
- indexer.solr_client.should_receive(:commit)
306
- indexer.harvest_and_index
338
+ hdor_client = indexer.send(:harvestdor_client)
339
+ indexer.dor_fetcher_client.should_receive(:druid_array).and_return([@fake_druid])
340
+ indexer.solr_client.should_receive(:add)
341
+ indexer.solr_client.should_receive(:commit)
342
+ indexer.harvest_and_index
343
+ end
307
344
  end
308
345
  it "should only try to load a whitelist once" do
309
- indexer = Harvestdor::Indexer.new(@config_yml_path, {:whitelist => @whitelist_path})
346
+ indexer = Harvestdor::Indexer.new(@config_yml_path, @client_config_path, {:whitelist => @whitelist_path})
310
347
  indexer.send(:whitelist)
311
348
  File.any_instance.should_not_receive(:open)
312
349
  indexer.send(:whitelist)
313
350
  end
314
351
  it "should log an error message and throw RuntimeError if it can't find the indicated whitelist file" do
315
352
  exp_msg = 'Unable to find list of druids at bad_path'
316
- indexer = Harvestdor::Indexer.new(@config_yml_path, {:whitelist => 'bad_path'})
353
+ indexer = Harvestdor::Indexer.new(@config_yml_path, @client_config_path, {:whitelist => 'bad_path'})
317
354
  indexer.logger.should_receive(:fatal).with(exp_msg)
318
355
  expect { indexer.send(:load_whitelist, 'bad_path') }.to raise_error(exp_msg)
319
356
  end
@@ -321,7 +358,7 @@ describe Harvestdor::Indexer do
321
358
  end # whitelist
322
359
 
323
360
  it "solr_client should initialize the rsolr client using the options from the config" do
324
- indexer = Harvestdor::Indexer.new(nil, Confstruct::Configuration.new(:solr => { :url => 'http://localhost:2345', :a => 1 }) )
361
+ indexer = Harvestdor::Indexer.new(nil, @client_config_path, Confstruct::Configuration.new(:solr => { :url => 'http://localhost:2345', :a => 1 }) )
325
362
  RSolr.should_receive(:connect).with(hash_including(:a => 1, :url => 'http://localhost:2345')).and_return('foo')
326
363
  indexer.solr_client
327
364
  end
metadata CHANGED
@@ -1,206 +1,263 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: harvestdor-indexer
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.0.13
4
+ version: 1.0.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Naomi Dushay
8
- autorequire:
8
+ - Bess Sadler
9
+ - Laney McGlohon
10
+ autorequire:
9
11
  bindir: bin
10
12
  cert_chain: []
11
- date: 2014-08-05 00:00:00.000000000 Z
13
+ date: 2014-10-24 00:00:00.000000000 Z
12
14
  dependencies:
13
15
  - !ruby/object:Gem::Dependency
14
16
  name: rsolr
17
+ version_requirements: !ruby/object:Gem::Requirement
18
+ requirements:
19
+ - - '>='
20
+ - !ruby/object:Gem::Version
21
+ version: '0'
15
22
  requirement: !ruby/object:Gem::Requirement
16
23
  requirements:
17
- - - ">="
24
+ - - '>='
18
25
  - !ruby/object:Gem::Version
19
26
  version: '0'
20
- type: :runtime
21
27
  prerelease: false
28
+ type: :runtime
29
+ - !ruby/object:Gem::Dependency
30
+ name: retries
22
31
  version_requirements: !ruby/object:Gem::Requirement
23
32
  requirements:
24
- - - ">="
33
+ - - '>='
25
34
  - !ruby/object:Gem::Version
26
35
  version: '0'
27
- - !ruby/object:Gem::Dependency
28
- name: retries
29
36
  requirement: !ruby/object:Gem::Requirement
30
37
  requirements:
31
- - - ">="
38
+ - - '>='
32
39
  - !ruby/object:Gem::Version
33
40
  version: '0'
34
- type: :runtime
35
41
  prerelease: false
42
+ type: :runtime
43
+ - !ruby/object:Gem::Dependency
44
+ name: harvestdor
36
45
  version_requirements: !ruby/object:Gem::Requirement
37
46
  requirements:
38
- - - ">="
47
+ - - '>='
39
48
  - !ruby/object:Gem::Version
40
- version: '0'
41
- - !ruby/object:Gem::Dependency
42
- name: harvestdor
49
+ version: 0.0.14
43
50
  requirement: !ruby/object:Gem::Requirement
44
51
  requirements:
45
- - - ">="
52
+ - - '>='
46
53
  - !ruby/object:Gem::Version
47
54
  version: 0.0.14
48
- type: :runtime
49
55
  prerelease: false
56
+ type: :runtime
57
+ - !ruby/object:Gem::Dependency
58
+ name: stanford-mods
50
59
  version_requirements: !ruby/object:Gem::Requirement
51
60
  requirements:
52
- - - ">="
61
+ - - '>='
53
62
  - !ruby/object:Gem::Version
54
- version: 0.0.14
55
- - !ruby/object:Gem::Dependency
56
- name: stanford-mods
63
+ version: '0'
57
64
  requirement: !ruby/object:Gem::Requirement
58
65
  requirements:
59
- - - ">="
66
+ - - '>='
60
67
  - !ruby/object:Gem::Version
61
68
  version: '0'
62
- type: :runtime
63
69
  prerelease: false
70
+ type: :runtime
71
+ - !ruby/object:Gem::Dependency
72
+ name: dor-fetcher
64
73
  version_requirements: !ruby/object:Gem::Requirement
65
74
  requirements:
66
- - - ">="
75
+ - - '>='
67
76
  - !ruby/object:Gem::Version
68
- version: '0'
77
+ version: 1.0.0
78
+ requirement: !ruby/object:Gem::Requirement
79
+ requirements:
80
+ - - '>='
81
+ - !ruby/object:Gem::Version
82
+ version: 1.0.0
83
+ prerelease: false
84
+ type: :runtime
69
85
  - !ruby/object:Gem::Dependency
70
86
  name: confstruct
87
+ version_requirements: !ruby/object:Gem::Requirement
88
+ requirements:
89
+ - - '>='
90
+ - !ruby/object:Gem::Version
91
+ version: '0'
71
92
  requirement: !ruby/object:Gem::Requirement
72
93
  requirements:
73
- - - ">="
94
+ - - '>='
74
95
  - !ruby/object:Gem::Version
75
96
  version: '0'
76
- type: :runtime
77
97
  prerelease: false
98
+ type: :runtime
99
+ - !ruby/object:Gem::Dependency
100
+ name: rake
78
101
  version_requirements: !ruby/object:Gem::Requirement
79
102
  requirements:
80
- - - ">="
103
+ - - '>='
81
104
  - !ruby/object:Gem::Version
82
105
  version: '0'
83
- - !ruby/object:Gem::Dependency
84
- name: rake
85
106
  requirement: !ruby/object:Gem::Requirement
86
107
  requirements:
87
- - - ">="
108
+ - - '>='
88
109
  - !ruby/object:Gem::Version
89
110
  version: '0'
90
- type: :development
91
111
  prerelease: false
112
+ type: :development
113
+ - !ruby/object:Gem::Dependency
114
+ name: rdoc
92
115
  version_requirements: !ruby/object:Gem::Requirement
93
116
  requirements:
94
- - - ">="
117
+ - - '>='
95
118
  - !ruby/object:Gem::Version
96
119
  version: '0'
97
- - !ruby/object:Gem::Dependency
98
- name: rdoc
99
120
  requirement: !ruby/object:Gem::Requirement
100
121
  requirements:
101
- - - ">="
122
+ - - '>='
102
123
  - !ruby/object:Gem::Version
103
124
  version: '0'
104
- type: :development
105
125
  prerelease: false
126
+ type: :development
127
+ - !ruby/object:Gem::Dependency
128
+ name: yard
106
129
  version_requirements: !ruby/object:Gem::Requirement
107
130
  requirements:
108
- - - ">="
131
+ - - '>='
109
132
  - !ruby/object:Gem::Version
110
133
  version: '0'
111
- - !ruby/object:Gem::Dependency
112
- name: yard
113
134
  requirement: !ruby/object:Gem::Requirement
114
135
  requirements:
115
- - - ">="
136
+ - - '>='
116
137
  - !ruby/object:Gem::Version
117
138
  version: '0'
118
- type: :development
119
139
  prerelease: false
140
+ type: :development
141
+ - !ruby/object:Gem::Dependency
142
+ name: rspec
120
143
  version_requirements: !ruby/object:Gem::Requirement
121
144
  requirements:
122
- - - ">="
145
+ - - '>='
123
146
  - !ruby/object:Gem::Version
124
147
  version: '0'
125
- - !ruby/object:Gem::Dependency
126
- name: rspec
127
148
  requirement: !ruby/object:Gem::Requirement
128
149
  requirements:
129
- - - ">="
150
+ - - '>='
130
151
  - !ruby/object:Gem::Version
131
152
  version: '0'
132
- type: :development
133
153
  prerelease: false
154
+ type: :development
155
+ - !ruby/object:Gem::Dependency
156
+ name: coveralls
134
157
  version_requirements: !ruby/object:Gem::Requirement
135
158
  requirements:
136
- - - ">="
159
+ - - '>='
137
160
  - !ruby/object:Gem::Version
138
161
  version: '0'
139
- - !ruby/object:Gem::Dependency
140
- name: coveralls
141
162
  requirement: !ruby/object:Gem::Requirement
142
163
  requirements:
143
- - - ">="
164
+ - - '>='
144
165
  - !ruby/object:Gem::Version
145
166
  version: '0'
167
+ prerelease: false
146
168
  type: :development
169
+ - !ruby/object:Gem::Dependency
170
+ name: vcr
171
+ version_requirements: !ruby/object:Gem::Requirement
172
+ requirements:
173
+ - - '>='
174
+ - !ruby/object:Gem::Version
175
+ version: '0'
176
+ requirement: !ruby/object:Gem::Requirement
177
+ requirements:
178
+ - - '>='
179
+ - !ruby/object:Gem::Version
180
+ version: '0'
147
181
  prerelease: false
182
+ type: :development
183
+ - !ruby/object:Gem::Dependency
184
+ name: webmock
148
185
  version_requirements: !ruby/object:Gem::Requirement
149
186
  requirements:
150
- - - ">="
187
+ - - '>='
188
+ - !ruby/object:Gem::Version
189
+ version: '0'
190
+ requirement: !ruby/object:Gem::Requirement
191
+ requirements:
192
+ - - '>='
151
193
  - !ruby/object:Gem::Version
152
194
  version: '0'
153
- description: Harvest DOR object metadata via a relationship (e.g. hydra:isGovernedBy
154
- rdf:resource="info:fedora/druid:hy787xj5878") and dates, plus code framework to
155
- write Solr docs to index
195
+ prerelease: false
196
+ type: :development
197
+ description: Harvest DOR object metadata via a relationship (e.g. hydra:isGovernedBy rdf:resource="info:fedora/druid:hy787xj5878") and dates, plus code framework to write Solr docs to index
156
198
  email:
157
199
  - ndushay@stanford.edu
200
+ - bess@stanford.edu
201
+ - laneymcg@stanford.edu
158
202
  executables: []
159
203
  extensions: []
160
204
  extra_rdoc_files: []
161
205
  files:
162
- - ".gitignore"
163
- - ".travis.yml"
164
- - ".yardopts"
206
+ - .gitignore
207
+ - .travis.yml
208
+ - .yardopts
165
209
  - Gemfile
166
210
  - LICENSE.txt
167
211
  - README.rdoc
168
212
  - Rakefile
213
+ - config/dor-fetcher-client.yml
169
214
  - harvestdor-indexer.gemspec
170
215
  - lib/harvestdor-indexer.rb
171
216
  - lib/harvestdor-indexer/version.rb
172
217
  - spec/config/ap.yml
173
218
  - spec/config/ap_blacklist.txt
174
219
  - spec/config/ap_whitelist.txt
220
+ - spec/fixtures/vcr_cassettes/exception_no_MODS_call.yml
221
+ - spec/fixtures/vcr_cassettes/get_collection_druids_call.yml
222
+ - spec/fixtures/vcr_cassettes/ignore_druids_in_blacklist_call.yml
223
+ - spec/fixtures/vcr_cassettes/no_blacklist_config_call.yml
224
+ - spec/fixtures/vcr_cassettes/no_whitelist_config_call.yml
225
+ - spec/fixtures/vcr_cassettes/single_rsolr_connection_call.yml
175
226
  - spec/spec_helper.rb
176
227
  - spec/unit/harvestdor-indexer_spec.rb
177
228
  homepage: https://consul.stanford.edu/display/chimera/Chimera+project
178
229
  licenses: []
179
230
  metadata: {}
180
- post_install_message:
231
+ post_install_message:
181
232
  rdoc_options: []
182
233
  require_paths:
183
234
  - lib
184
235
  required_ruby_version: !ruby/object:Gem::Requirement
185
236
  requirements:
186
- - - ">="
237
+ - - '>='
187
238
  - !ruby/object:Gem::Version
188
239
  version: '0'
189
240
  required_rubygems_version: !ruby/object:Gem::Requirement
190
241
  requirements:
191
- - - ">="
242
+ - - '>='
192
243
  - !ruby/object:Gem::Version
193
244
  version: '0'
194
245
  requirements: []
195
- rubyforge_project:
246
+ rubyforge_project:
196
247
  rubygems_version: 2.2.2
197
- signing_key:
248
+ signing_key:
198
249
  specification_version: 4
199
250
  summary: Harvest DOR object metadata and index it to Solr
200
251
  test_files:
201
252
  - spec/config/ap.yml
202
253
  - spec/config/ap_blacklist.txt
203
254
  - spec/config/ap_whitelist.txt
255
+ - spec/fixtures/vcr_cassettes/exception_no_MODS_call.yml
256
+ - spec/fixtures/vcr_cassettes/get_collection_druids_call.yml
257
+ - spec/fixtures/vcr_cassettes/ignore_druids_in_blacklist_call.yml
258
+ - spec/fixtures/vcr_cassettes/no_blacklist_config_call.yml
259
+ - spec/fixtures/vcr_cassettes/no_whitelist_config_call.yml
260
+ - spec/fixtures/vcr_cassettes/single_rsolr_connection_call.yml
204
261
  - spec/spec_helper.rb
205
262
  - spec/unit/harvestdor-indexer_spec.rb
206
- has_rdoc:
263
+ has_rdoc: