harvestdor-indexer 0.0.13 → 1.0.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
- metadata.gz: a14d39d89afa8e167b50facf06cf7cf0a6c5fa76
- data.tar.gz: 63478e959742b0640189aaf90b2f185315b4b9c3
+ metadata.gz: 45d033dffd56d8c3abd90ccafd0fa4666b4379da
+ data.tar.gz: f579c955fb390f2d7ed497aa237dbe763361d60b
  SHA512:
- metadata.gz: eb6f6830232b5d7609a29779c8ca58139cbad499d5c51daa9bfeb0252fe52fc633296a360ccd5e2e9c7b2b68313402c7fb9b6e4532a414a2d130c3f0fe867a69
- data.tar.gz: f97897beffaebca548732810ed0d3281b169feead5203aaa9ec1a63167cdbf4813a5619052b9ab1cdf8f2fc54bb5068d827487ec9e668b27a1f8b85acd375552
+ metadata.gz: a7e9213e27145df273eaee8384b690e9ae720aeffd7f781f64dec9bc13948c6716b942499305037c7210c59ae9b5f07747f5ff67e4fe315577a78bce7d530a81
+ data.tar.gz: 896d1c26f157087e2bd6f935e0106024e9a7840cdb5c5db05e6a348ff1fe0579410bc70895b26b3db60a6461883bfd5bfe4a8ee43483d0f8821bf73cad4c2b1c
data/README.rdoc CHANGED
@@ -1,9 +1,8 @@
  = Harvestdor::Indexer
  {<img src="https://travis-ci.org/sul-dlss/harvestdor-indexer.svg" alt="Build Status" />}[https://travis-ci.org/sul-dlss/harvestdor-indexer]
- {<img src="https://coveralls.io/repos/sul-dlss/harvestdor-indexer/badge.png" alt="Coverage Status" />}[https://coveralls.io/r/sul-dlss/harvestdor-indexer]{<img src="https://gemnasium.com/sul-dlss/harvestdor-indexer.svg" alt="Dependency Status" />}[https://gemnasium.com/sul-dlss/harvestdor-indexer]{<img src="https://badge.fury.io/rb/harvestdor-indexer.svg" alt="Gem Version" />}[http://badge.fury.io/rb/harvestdor-indexer]
-
-
-
+ {<img src="https://coveralls.io/repos/sul-dlss/harvestdor-indexer/badge.png" alt="Coverage Status" />}[https://coveralls.io/r/sul-dlss/harvestdor-indexer]
+ {<img src="https://gemnasium.com/sul-dlss/harvestdor-indexer.svg" alt="Dependency Status" />}[https://gemnasium.com/sul-dlss/harvestdor-indexer]
+ {<img src="https://badge.fury.io/rb/harvestdor-indexer.svg" alt="Gem Version" />}[http://badge.fury.io/rb/harvestdor-indexer]
 
  A Gem to harvest meta/data from DOR and the skeleton code to index it and write to Solr.
 
@@ -42,13 +41,10 @@ Note: Because of an update to underlying HTTP libraries, versions of this gem >
  See spec/config/ap.yml for an example.
  You will want to copy that file and change the following settings:
  1. log_name
- 2. default_set (in OAI harvesting params section)
- 3. other OAI harvesting params
- 4. blacklist or whitelist if you are using them
+ 2. default_set
+ 3. blacklist or whitelist if you are using them
 
- You can also pass in non-default configurations as a hash
-
- indexer = Harvestdor::Indexer.new({:oai_repository_url => 'http://my_oai.org, :default_from_date => '2012-12-01'})
+ Update the dor-fetcher-client.yml file in the config directory with the location of the URL of the dor-fetcher-service provider. The defaulted value is the 3000 port for a localhost - dor_fetcher_service_url: http://127.0.0.1:3000
 
  === Override the Harvestdor::Indexer.index method
 
@@ -94,17 +90,21 @@ I suggest you write a script to run the code. Your script might look like this:
  end
  config_yml_path = ARGV.pop
  if config_yml_path.nil?
- puts "** You must provide the full path to a config yml file **"
+ puts "** You must provide the full path to a collection config yml file **"
+ exit
+ end
+ if client_config_path.nil?
+ puts "** You must provide the full path to dor-fetcher-client config yml file **"
  exit
  end
- indexer = Harvestdor::Indexer.new(config_yml_path, opts)
+ indexer = Harvestdor::Indexer.new(config_yml_path, client_config_path, opts)
  indexer.harvest_and_index
 
  Then you run the script like so:
 
  ./bin/indexer config/(your coll).yml
 
- I suggest you run your code on harvestdor-dev, as it is already set up to be able to harvest from the DOR OAI provider
+ I suggest you run your code on harvestdor-dev, as it is already set up to be able to harvest from the DorFetcher
 
 
  == Contributing
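Review note: the added check reads `client_config_path`, but no visible line in this hunk assigns it, and the sample invocation still passes a single path. A minimal sketch of how a script might collect both paths, assuming they arrive as two positional arguments in the order shown (an assumption, not something the README states); `parse_config_paths` is a hypothetical helper name:

```ruby
# Hypothetical helper, not part of harvestdor-indexer: pull the two config
# paths out of an argument list. The argument order is an assumption.
def parse_config_paths(argv)
  config_yml_path, client_config_path = argv
  if config_yml_path.nil?
    raise ArgumentError, 'You must provide the full path to a collection config yml file'
  end
  if client_config_path.nil?
    raise ArgumentError, 'You must provide the full path to dor-fetcher-client config yml file'
  end
  [config_yml_path, client_config_path]
end

# In a script this might be used as:
#   config_yml_path, client_config_path = parse_config_paths(ARGV)
#   indexer = Harvestdor::Indexer.new(config_yml_path, client_config_path, opts)
```

Under that assumption the invocation would become `./bin/indexer config/(your coll).yml config/dor-fetcher-client.yml`.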
@@ -118,6 +118,7 @@ I suggest you run your code on harvestdor-dev, as it is already set up to be abl
 
  == Releases
 
+ * <b>1.0.0</b> Replaced OAI harvesting mechanism with dor-fetcher
  * <b>0.0.13</b> Upgrade to latest faraday HTTP client syntax; Use retries gem (https://github.com/ooyala/retries) to make retrying of index process more robust
  * <b>0.0.12</b> fix total_object nil error
  * <b>0.0.11</b> fix error_count and success_count, allow setting of max-tries (retry solr add if error)
@@ -0,0 +1,4 @@
+ # ---------- DorFetcher Client parameters -----------
+
+ # dor_fetcher_service_url: URL of the dor-fetcher-service provider
+ dor_fetcher_service_url: http://127.0.0.1:3000
@@ -6,8 +6,8 @@ require 'harvestdor-indexer/version'
  Gem::Specification.new do |gem|
  gem.name = "harvestdor-indexer"
  gem.version = Harvestdor::Indexer::VERSION
- gem.authors = ["Naomi Dushay"]
- gem.email = ["ndushay@stanford.edu"]
+ gem.authors = ["Naomi Dushay", "Bess Sadler", "Laney McGlohon"]
+ gem.email = ["ndushay@stanford.edu", "bess@stanford.edu", "laneymcg@stanford.edu"]
  gem.description = %q{Harvest DOR object metadata via a relationship (e.g. hydra:isGovernedBy rdf:resource="info:fedora/druid:hy787xj5878") and dates, plus code framework to write Solr docs to index}
  gem.summary = %q{Harvest DOR object metadata and index it to Solr}
  gem.homepage = "https://consul.stanford.edu/display/chimera/Chimera+project"
@@ -21,6 +21,7 @@ Gem::Specification.new do |gem|
  gem.add_dependency 'retries'
  gem.add_dependency 'harvestdor', '>=0.0.14'
  gem.add_dependency 'stanford-mods'
+ gem.add_dependency 'dor-fetcher', '>=1.0.0'
 
  # Runtime dependencies
  gem.add_runtime_dependency 'confstruct'
@@ -36,5 +37,7 @@ Gem::Specification.new do |gem|
  gem.add_development_dependency 'rspec'
  gem.add_development_dependency 'coveralls'
  # gem.add_development_dependency 'ruby-debug19'
+ gem.add_development_dependency 'vcr'
+ gem.add_development_dependency 'webmock'
 
  end
@@ -1,6 +1,6 @@
  module Harvestdor
  class Indexer
  # this is the Ruby Gem version
- VERSION = "0.0.13"
+ VERSION = "1.0.0"
  end
  end
@@ -2,10 +2,12 @@
  require 'confstruct'
  require 'rsolr'
  require 'retries'
+ require 'json'
 
  # sul-dlss gems
  require 'harvestdor'
  require 'stanford-mods'
+ require 'dor-fetcher'
 
  # stdlib
  require 'logger'
@@ -18,8 +20,9 @@ module Harvestdor
 
  attr_accessor :error_count, :success_count, :max_retries
  attr_accessor :total_time_to_parse,:total_time_to_solr
+ attr_accessor :dor_fetcher_client, :client_config
 
- def initialize yml_path, options = {}
+ def initialize yml_path, client_config_path, options = {}
  @success_count=0 # the number of objects successfully indexed
  @error_count=0 # the number of objects that failed
  @max_retries=10 # the number of times to retry an object
@@ -27,8 +30,10 @@ module Harvestdor
  @total_time_to_parse=0
  @yml_path = yml_path
  config.configure(YAML.load_file(yml_path)) if yml_path
- config.configure options
+ config.configure options
  yield(config) if block_given?
+ @client_config = YAML.load_file(client_config_path) if client_config_path && File.exists?(client_config_path)
+ @dor_fetcher_client=DorFetcher::Client.new({:service_url => client_config["dor_fetcher_service_url"]})
  end
 
  def config
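Review note: as written, the added lines leave `@client_config` as `nil` when the path is missing or the file does not exist, so the following `client_config[...]` lookup would raise `NoMethodError`. `File.exists?` is also deprecated and was removed in Ruby 3.2. A minimal sketch of a stricter loader, where `load_service_url` is a hypothetical helper rather than anything in the gem:

```ruby
require 'yaml'

# Hypothetical helper, not part of harvestdor-indexer: read the
# dor_fetcher_service_url setting and fail early with a clear message.
def load_service_url(client_config_path)
  unless client_config_path && File.exist?(client_config_path)
    raise ArgumentError, "client config file not found: #{client_config_path.inspect}"
  end
  # An empty YAML file yields a falsy value, so fall back to an empty hash.
  settings = YAML.load_file(client_config_path) || {}
  settings.fetch('dor_fetcher_service_url') do
    raise KeyError, "dor_fetcher_service_url missing from #{client_config_path}"
  end
end
```

The returned URL could then be passed as the `:service_url` option shown in the hunk above.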
@@ -40,7 +45,7 @@ module Harvestdor
  end
 
  # per this Indexer's config options
- # harvest the druids via OAI
+ # harvest the druids via DorFetcher
  # create a Solr profiling document for each druid
  # write the result to the Solr index
  def harvest_and_index
@@ -67,14 +72,14 @@ module Harvestdor
  logger.info("Total records processed: #{total_objects}")
  end
 
- # return Array of druids contained in the OAI harvest indicated by OAI params in yml configuration file
+ # return Array of druids contained in the DorFetcher pulling indicated by DorFetcher params
  # @return [Array<String>] or enumeration over it, if block is given. (strings are druids, e.g. ab123cd1234)
  def druids
  if @druids.nil?
  start_time=Time.now
- logger.info("Starting OAI harvest of druids at #{start_time}.")
- @druids = harvestdor_client.druids_via_oai
- logger.info("Completed OAI harves of druids at #{Time.now}. Found #{@druids.size} druids. Total elapsed time for OAI harvest = #{elapsed_time(start_time,:minutes)} minutes")
+ logger.info("Starting DorFetcher pulling of druids at #{start_time}.")
+ @druids = @dor_fetcher_client.druid_array(@dor_fetcher_client.get_collection(strip_default_set_string(), {}))
+ logger.info("Completed DorFetcher pulling of druids at #{Time.now}. Found #{@druids.size} druids. Total elapsed time for DorFetcher pulling = #{elapsed_time(start_time,:minutes)} minutes")
  end
  return @druids
  end
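Review note: the recorded fixture responses elsewhere in this diff show the collection endpoint returning JSON with `collection` and `item` arrays, each entry carrying a `druid` key. How `DorFetcher::Client#druid_array` is implemented is not shown here; the sketch below only illustrates one way to flatten such a response into a list of druids, with `druids_from_response` as a hypothetical name:

```ruby
require 'json'

# Hypothetical helper, not the dor-fetcher API: collect every "druid" value
# from the "collection" and "item" arrays of a JSON response body.
def druids_from_response(json_string)
  parsed = JSON.parse(json_string)
  %w[collection item].flat_map do |key|
    # A missing key yields an empty list instead of an error.
    Array(parsed[key]).map { |entry| entry['druid'] }
  end.compact
end
```

Note that the result keeps the `druid:` prefix exactly as it appears in the response, and lists the collection druid ahead of its items.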
@@ -224,6 +229,12 @@ module Harvestdor
  @whitelist ||= []
  end
 
+ # Get only the druid from the end of the default_set string
+ # from the yml file
+ def strip_default_set_string()
+ @config.default_set.split('_').last
+ end
+
  protected #---------------------------------------------------------------------
 
  def harvestdor_client
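Review note: the string handling in `strip_default_set_string` can be reproduced standalone. This sketch assumes the set name ends in `_<druid>`, as in the `ap.yml` example; `druid_from_set` is a hypothetical name, and the `to_s` guard for a nil `default_set` is an addition that is not present in the gem:

```ruby
# Hypothetical standalone version of the added method: take whatever follows
# the last underscore. The to_s call (not in the gem) makes nil return nil
# instead of raising NoMethodError.
def druid_from_set(default_set)
  default_set.to_s.split('_').last
end

druid_from_set('is_governed_by_yg867hg1375') # => "yg867hg1375"
```

A set name without any underscore is returned unchanged, so a malformed `default_set` would be passed through as if it were a druid.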
data/spec/config/ap.yml CHANGED
@@ -1,7 +1,6 @@
  # You will want to copy this file and change the following settings:
  # 1. log_dir, log_name
- # 2. default_set (in OAI harvesting params section)
- # 2a. other OAI harvesting params
+ # 2. default_set
  # 3. blacklist or whitelist if you are using them
  # 4. Solr baseurl
 
@@ -16,8 +15,9 @@ purl: http://purl.stanford.edu
 
  # ---------- White and Black list parameters -----
 
- # name of file containing druids that will NOT be processed even if they are harvested via OAI
- # either give absolute path or path relative to where the command will be executed
+ # name of file containing druids that will NOT be processed even if they are harvested
+ # via DorFetcher either give absolute path or path relative to where the command will
+ # be executed
  #blacklist: config/ap_blacklist.txt
 
  # name of file containing druids that WILL be processed (all others will be ignored)
@@ -32,14 +32,9 @@ solr:
  read_timeout: 60
  open_timeout: 60
 
- # ---------- OAI harvesting parameters -----------
-
- # oai_repository_url: URL of the OAI data provider
- oai_repository_url: https://dor-oaiprovider-prod.stanford.edu/oai
-
  # default_set: default set for harvest (default: nil)
  # can be overridden on calls to harvest_ids and harvest_records
- default_set: is_governed_by_hy787xj5878
+ default_set: is_governed_by_yg867hg1375
 
  # default_metadata_prefix: default metadata prefix to be used for harvesting (default: mods)
  # can be overridden on calls to harvest_ids and harvest_records
@@ -50,8 +45,6 @@ default_set: is_governed_by_hy787xj5878
  # default_until_date: default until date for harvest (default: nil)
  # can be overridden on calls to harvest_ids and harvest_records
 
- # oai_client_debug: true for OAI::Client debug mode (default: false)
-
  # Additional options to pass to Faraday http client (https://github.com/technoweenie/faraday)
  http_options:
  ssl:
@@ -1,5 +1,5 @@
  # blacklist containing druids that should NOT be processed.
  # druids should be in the form aa111bb2222
 
- oo111oo1111
- oo222oo2222
+ druid:jf275fd6276
+ druid:tc552kq0798
@@ -1,5 +1,6 @@
  # whitelist containing the specific druids to be processed (all others will be ignored)
  # druids should be in the form aa111bb2222
 
- oo000oo0000
- oo222oo2222
+ druid:yg867hg1375
+ druid:jf275fd6276
+ druid:nz353cp1092
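Review note: the file comments still describe bare druids (`aa111bb2222`) while the new fixture entries carry a `druid:` prefix. The specs in this diff suggest that when a whitelist is present only its druids are processed, and that blacklisted druids are always skipped. A minimal sketch of that selection logic, assuming one druid per line and `#` comment lines; `load_druid_list` and `druids_to_process` are hypothetical names, not the gem's own methods:

```ruby
# Hypothetical helpers, not part of harvestdor-indexer.

# Read a list file, dropping blank lines and '#' comments.
def load_druid_list(path)
  File.readlines(path, chomp: true)
      .map(&:strip)
      .reject { |line| line.empty? || line.start_with?('#') }
end

# A non-empty whitelist replaces the harvested druids; the blacklist is
# always subtracted from whichever list is used.
def druids_to_process(harvested, whitelist: [], blacklist: [])
  candidates = whitelist.empty? ? harvested : whitelist
  candidates - blacklist
end
```

With the fixture lists above, a druid present in both files (`druid:jf275fd6276`) is excluded, which matches the expectation in the "both blacklist and whitelist" spec.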
@@ -0,0 +1,114 @@
+ ---
+ http_interactions:
+ - request:
+ method: get
+ uri: http://purl.stanford.edu/oo000oo0000.mods
+ body:
+ encoding: US-ASCII
+ string: ''
+ headers:
+ Accept:
+ - '*/*'
+ User-Agent:
+ - Ruby
+ response:
+ status:
+ code: 404
+ message: ''
+ headers:
+ Date:
+ - Wed, 22 Oct 2014 20:26:30 GMT
+ Server:
+ - Apache/2.2.15 (Red Hat)
+ X-Powered-By:
+ - Phusion Passenger (mod_rails/mod_rack) 3.0.19
+ X-Ua-Compatible:
+ - IE=Edge,chrome=1
+ Cache-Control:
+ - no-cache
+ X-Request-Id:
+ - eb7854ee5cc96cbf20bfdafb0e8ea1c2
+ X-Runtime:
+ - '0.011781'
+ X-Rack-Cache:
+ - miss
+ Status:
+ - '404'
+ Content-Length:
+ - '3015'
+ Content-Type:
+ - text/html; charset=utf-8
+ body:
+ encoding: US-ASCII
+ string: |
+ <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
+ <html xmlns="http://www.w3.org/1999/xhtml">
+ <head>
+ <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
+ <title>Stanford Digital Repository</title>
+ <link rel="shortcut icon" href="favicon.ico" />
+ <link href="/assets/site.css" media="screen" rel="stylesheet" type="text/css" />
+ </head>
+ <body>
+ <div id="container">
+ <div id="banner"><h1>Stanford Digital Repository</h1></div>
+ <div id="contents">
+ <div class="dialog">
+ <h2>The item you requested is not available.</h2>
+ <p>The item you requested is not yet available. It will be available at this URL when Library processing is completed.</p>
+ </div>
+
+ </div>
+ </div>
+
+ <div id="footer">
+ <div class="footer-contents">
+ <div class="footer-sul">
+ <div class="footer-logo">
+ <a href="http://library.stanford.edu" target="_blank"><img src="/images/footer-sul-logo.png"></a>
+ </div>
+ <div class="footer-links">
+ <a href="http://library.stanford.edu" target="_blank">Stanford University Libraries</a>
+ <a href="http://searchworks.stanford.edu" target="_blank">SearchWorks</a>
+ <a href="http://library.stanford.edu/ejournals" target="_blank">eJournals</a>
+ <a href="hhttp://library.stanford.edu/myawrap.html" target="_blank">My Account</a>
+ <a href="http://library.stanford.edu/ask" target="_blank">Ask Us</a>
+ </div>
+ </div>
+ <div class="footer-su">
+ <div class="footer-logo">
+ <a href="http://www.stanford.edu" target="_blank"><img src="/images/footer-stanford-logo.png"></a>
+ </div>
+ <div class="footer-links">
+ <a href="http://www.stanford.edu" target="_blank">SU Home</a>
+ <a href="http://visit.stanford.edu/plan/maps.html" target="_blank">Maps &amp; Directions</a>
+ <a href="http://www.stanford.edu/search/" target="_blank">Search Stanford</a>
+ <a href="http://www.stanford.edu/site/terms.html" target="_blank">Terms of Use</a>
+ <a href="http://www.stanford.edu/site/copyright.html" target="_blank">Copyright Complaints</a>
+ </br>&copy; Stanford University, Stanford, California 94305
+ </div>
+ </div>
+ </div>
+ </div>
+ <script src="/assets/jquery.js" type="text/javascript"></script>
+ <script src="/assets/jquery.truncator.js" type="text/javascript"></script>
+ <script src="/assets/application.js" type="text/javascript"></script>
+
+ <script type="text/javascript">
+
+ var _gaq = _gaq || [];
+ _gaq.push(['_setAccount', 'UA-7219229-11']);
+ _gaq.push(['_trackPageview']);
+
+ (function() {
+ var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
+ ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
+ var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
+ })();
+
+ </script>
+ </body>
+ </html>
+ http_version:
+ recorded_at: Wed, 22 Oct 2014 20:26:30 GMT
+ recorded_with: VCR 2.9.3
@@ -0,0 +1,58 @@
+ ---
+ http_interactions:
+ - request:
+ method: get
+ uri: http://127.0.0.1:3000/collection/yg867hg1375
+ body:
+ encoding: US-ASCII
+ string: ''
+ headers:
+ Accept:
+ - '*/*'
+ User-Agent:
+ - Ruby
+ response:
+ status:
+ code: 200
+ message: 'OK '
+ headers:
+ X-Frame-Options:
+ - SAMEORIGIN
+ X-Xss-Protection:
+ - 1; mode=block
+ X-Content-Type-Options:
+ - nosniff
+ Content-Type:
+ - application/json; charset=utf-8
+ Etag:
+ - '"682afec57f678e4d153a5841b21395dd"'
+ Cache-Control:
+ - max-age=0, private, must-revalidate
+ X-Request-Id:
+ - 0954c447-9cb9-4eeb-8020-d87f13098f07
+ X-Runtime:
+ - '0.006736'
+ Server:
+ - WEBrick/1.3.1 (Ruby/2.1.2/2014-05-08)
+ Date:
+ - Wed, 22 Oct 2014 18:42:32 GMT
+ Content-Length:
+ - '1121'
+ Connection:
+ - Keep-Alive
+ body:
+ encoding: US-ASCII
+ string: '{"collection":[{"druid":"druid:yg867hg1375","latest_change":"2013-11-11T23:34:29Z","title":["Francis
+ E. Stafford photographs, 1909-1933"]}],"item":[{"druid":"druid:jf275fd6276","latest_change":"2013-11-11T23:34:29Z","title":["Album
+ A: Photographs of China''s natural landscapes, urban scenes, cultural landmarks,
+ social customs, and people."]},{"druid":"druid:nz353cp1092","latest_change":"2013-11-11T23:34:29Z","title":["Album
+ E: Photographs of the Seventh Day Adventist Church missionaries in China"]},{"druid":"druid:tc552kq0798","latest_change":"2013-11-11T23:34:29Z","title":["Album
+ D: Photographs of China''s natural landscapes, urban scenes, cultural landmarks,
+ social customs, and people."]},{"druid":"druid:th998nk0722","latest_change":"2013-11-11T23:34:29Z","title":["Album
+ C: Photographs of the Chinese Revolution of 1911 and the Shanghai Commercial
+ Press"]},{"druid":"druid:ww689vs6534","latest_change":"2013-11-11T23:34:29Z","title":["Album
+ B: Photographs of China''s natural landscapes, urban scenes, cultural landmarks,
+ social customs, and people."]}],"counts":[{"collection":1},{"item":5},{"total_count":6}]}'
+ http_version:
+ recorded_at: Wed, 22 Oct 2014 18:42:32 GMT
+ recorded_with: VCR 2.9.3
@@ -0,0 +1,58 @@
+ ---
+ http_interactions:
+ - request:
+ method: get
+ uri: http://127.0.0.1:3000/collection/yg867hg1375
+ body:
+ encoding: US-ASCII
+ string: ''
+ headers:
+ Accept:
+ - '*/*'
+ User-Agent:
+ - Ruby
+ response:
+ status:
+ code: 200
+ message: 'OK '
+ headers:
+ X-Frame-Options:
+ - SAMEORIGIN
+ X-Xss-Protection:
+ - 1; mode=block
+ X-Content-Type-Options:
+ - nosniff
+ Content-Type:
+ - application/json; charset=utf-8
+ Etag:
+ - '"682afec57f678e4d153a5841b21395dd"'
+ Cache-Control:
+ - max-age=0, private, must-revalidate
+ X-Request-Id:
+ - 1e0232c6-fc39-49bf-b874-89567e225d00
+ X-Runtime:
+ - '0.006851'
+ Server:
+ - WEBrick/1.3.1 (Ruby/2.1.2/2014-05-08)
+ Date:
+ - Wed, 22 Oct 2014 18:53:15 GMT
+ Content-Length:
+ - '1121'
+ Connection:
+ - Keep-Alive
+ body:
+ encoding: US-ASCII
+ string: '{"collection":[{"druid":"druid:yg867hg1375","latest_change":"2013-11-11T23:34:29Z","title":["Francis
+ E. Stafford photographs, 1909-1933"]}],"item":[{"druid":"druid:jf275fd6276","latest_change":"2013-11-11T23:34:29Z","title":["Album
+ A: Photographs of China''s natural landscapes, urban scenes, cultural landmarks,
+ social customs, and people."]},{"druid":"druid:nz353cp1092","latest_change":"2013-11-11T23:34:29Z","title":["Album
+ E: Photographs of the Seventh Day Adventist Church missionaries in China"]},{"druid":"druid:tc552kq0798","latest_change":"2013-11-11T23:34:29Z","title":["Album
+ D: Photographs of China''s natural landscapes, urban scenes, cultural landmarks,
+ social customs, and people."]},{"druid":"druid:th998nk0722","latest_change":"2013-11-11T23:34:29Z","title":["Album
+ C: Photographs of the Chinese Revolution of 1911 and the Shanghai Commercial
+ Press"]},{"druid":"druid:ww689vs6534","latest_change":"2013-11-11T23:34:29Z","title":["Album
+ B: Photographs of China''s natural landscapes, urban scenes, cultural landmarks,
+ social customs, and people."]}],"counts":[{"collection":1},{"item":5},{"total_count":6}]}'
+ http_version:
+ recorded_at: Wed, 22 Oct 2014 18:53:15 GMT
+ recorded_with: VCR 2.9.3
@@ -0,0 +1,58 @@
+ ---
+ http_interactions:
+ - request:
+ method: get
+ uri: http://127.0.0.1:3000/collection/yg867hg1375
+ body:
+ encoding: US-ASCII
+ string: ''
+ headers:
+ Accept:
+ - '*/*'
+ User-Agent:
+ - Ruby
+ response:
+ status:
+ code: 200
+ message: 'OK '
+ headers:
+ X-Frame-Options:
+ - SAMEORIGIN
+ X-Xss-Protection:
+ - 1; mode=block
+ X-Content-Type-Options:
+ - nosniff
+ Content-Type:
+ - application/json; charset=utf-8
+ Etag:
+ - '"682afec57f678e4d153a5841b21395dd"'
+ Cache-Control:
+ - max-age=0, private, must-revalidate
+ X-Request-Id:
+ - d35b0793-e841-496b-bce1-720bfbf2ad04
+ X-Runtime:
+ - '0.006751'
+ Server:
+ - WEBrick/1.3.1 (Ruby/2.1.2/2014-05-08)
+ Date:
+ - Wed, 22 Oct 2014 20:32:36 GMT
+ Content-Length:
+ - '1121'
+ Connection:
+ - Keep-Alive
+ body:
+ encoding: US-ASCII
+ string: '{"collection":[{"druid":"druid:yg867hg1375","latest_change":"2013-11-11T23:34:29Z","title":["Francis
+ E. Stafford photographs, 1909-1933"]}],"item":[{"druid":"druid:jf275fd6276","latest_change":"2013-11-11T23:34:29Z","title":["Album
+ A: Photographs of China''s natural landscapes, urban scenes, cultural landmarks,
+ social customs, and people."]},{"druid":"druid:nz353cp1092","latest_change":"2013-11-11T23:34:29Z","title":["Album
+ E: Photographs of the Seventh Day Adventist Church missionaries in China"]},{"druid":"druid:tc552kq0798","latest_change":"2013-11-11T23:34:29Z","title":["Album
+ D: Photographs of China''s natural landscapes, urban scenes, cultural landmarks,
+ social customs, and people."]},{"druid":"druid:th998nk0722","latest_change":"2013-11-11T23:34:29Z","title":["Album
+ C: Photographs of the Chinese Revolution of 1911 and the Shanghai Commercial
+ Press"]},{"druid":"druid:ww689vs6534","latest_change":"2013-11-11T23:34:29Z","title":["Album
+ B: Photographs of China''s natural landscapes, urban scenes, cultural landmarks,
+ social customs, and people."]}],"counts":[{"collection":1},{"item":5},{"total_count":6}]}'
+ http_version:
+ recorded_at: Wed, 22 Oct 2014 20:32:36 GMT
+ recorded_with: VCR 2.9.3
@@ -0,0 +1,58 @@
+ ---
+ http_interactions:
+ - request:
+ method: get
+ uri: http://127.0.0.1:3000/collection/yg867hg1375
+ body:
+ encoding: US-ASCII
+ string: ''
+ headers:
+ Accept:
+ - '*/*'
+ User-Agent:
+ - Ruby
+ response:
+ status:
+ code: 200
+ message: 'OK '
+ headers:
+ X-Frame-Options:
+ - SAMEORIGIN
+ X-Xss-Protection:
+ - 1; mode=block
+ X-Content-Type-Options:
+ - nosniff
+ Content-Type:
+ - application/json; charset=utf-8
+ Etag:
+ - '"682afec57f678e4d153a5841b21395dd"'
+ Cache-Control:
+ - max-age=0, private, must-revalidate
+ X-Request-Id:
+ - a631e18e-8396-4699-b7a9-fd05fd115e02
+ X-Runtime:
+ - '0.006491'
+ Server:
+ - WEBrick/1.3.1 (Ruby/2.1.2/2014-05-08)
+ Date:
+ - Wed, 22 Oct 2014 20:34:01 GMT
+ Content-Length:
+ - '1121'
+ Connection:
+ - Keep-Alive
+ body:
+ encoding: US-ASCII
+ string: '{"collection":[{"druid":"druid:yg867hg1375","latest_change":"2013-11-11T23:34:29Z","title":["Francis
+ E. Stafford photographs, 1909-1933"]}],"item":[{"druid":"druid:jf275fd6276","latest_change":"2013-11-11T23:34:29Z","title":["Album
+ A: Photographs of China''s natural landscapes, urban scenes, cultural landmarks,
+ social customs, and people."]},{"druid":"druid:nz353cp1092","latest_change":"2013-11-11T23:34:29Z","title":["Album
+ E: Photographs of the Seventh Day Adventist Church missionaries in China"]},{"druid":"druid:tc552kq0798","latest_change":"2013-11-11T23:34:29Z","title":["Album
+ D: Photographs of China''s natural landscapes, urban scenes, cultural landmarks,
+ social customs, and people."]},{"druid":"druid:th998nk0722","latest_change":"2013-11-11T23:34:29Z","title":["Album
+ C: Photographs of the Chinese Revolution of 1911 and the Shanghai Commercial
+ Press"]},{"druid":"druid:ww689vs6534","latest_change":"2013-11-11T23:34:29Z","title":["Album
+ B: Photographs of China''s natural landscapes, urban scenes, cultural landmarks,
+ social customs, and people."]}],"counts":[{"collection":1},{"item":5},{"total_count":6}]}'
+ http_version:
+ recorded_at: Wed, 22 Oct 2014 20:34:01 GMT
+ recorded_with: VCR 2.9.3
@@ -0,0 +1,58 @@
+ ---
+ http_interactions:
+ - request:
+ method: get
+ uri: http://127.0.0.1:3000/collection/yg867hg1375
+ body:
+ encoding: US-ASCII
+ string: ''
+ headers:
+ Accept:
+ - '*/*'
+ User-Agent:
+ - Ruby
+ response:
+ status:
+ code: 200
+ message: 'OK '
+ headers:
+ X-Frame-Options:
+ - SAMEORIGIN
+ X-Xss-Protection:
+ - 1; mode=block
+ X-Content-Type-Options:
+ - nosniff
+ Content-Type:
+ - application/json; charset=utf-8
+ Etag:
+ - '"682afec57f678e4d153a5841b21395dd"'
+ Cache-Control:
+ - max-age=0, private, must-revalidate
+ X-Request-Id:
+ - 37413f0c-a104-4df1-8a80-1873389200f4
+ X-Runtime:
+ - '0.006754'
+ Server:
+ - WEBrick/1.3.1 (Ruby/2.1.2/2014-05-08)
+ Date:
+ - Wed, 22 Oct 2014 18:42:32 GMT
+ Content-Length:
+ - '1121'
+ Connection:
+ - Keep-Alive
+ body:
+ encoding: US-ASCII
+ string: '{"collection":[{"druid":"druid:yg867hg1375","latest_change":"2013-11-11T23:34:29Z","title":["Francis
+ E. Stafford photographs, 1909-1933"]}],"item":[{"druid":"druid:jf275fd6276","latest_change":"2013-11-11T23:34:29Z","title":["Album
+ A: Photographs of China''s natural landscapes, urban scenes, cultural landmarks,
+ social customs, and people."]},{"druid":"druid:nz353cp1092","latest_change":"2013-11-11T23:34:29Z","title":["Album
+ E: Photographs of the Seventh Day Adventist Church missionaries in China"]},{"druid":"druid:tc552kq0798","latest_change":"2013-11-11T23:34:29Z","title":["Album
+ D: Photographs of China''s natural landscapes, urban scenes, cultural landmarks,
+ social customs, and people."]},{"druid":"druid:th998nk0722","latest_change":"2013-11-11T23:34:29Z","title":["Album
+ C: Photographs of the Chinese Revolution of 1911 and the Shanghai Commercial
+ Press"]},{"druid":"druid:ww689vs6534","latest_change":"2013-11-11T23:34:29Z","title":["Album
+ B: Photographs of China''s natural landscapes, urban scenes, cultural landmarks,
+ social customs, and people."]}],"counts":[{"collection":1},{"item":5},{"total_count":6}]}'
+ http_version:
+ recorded_at: Wed, 22 Oct 2014 18:42:32 GMT
+ recorded_with: VCR 2.9.3
data/spec/spec_helper.rb CHANGED
@@ -6,5 +6,9 @@ $LOAD_PATH.unshift(File.dirname(__FILE__))
 
  require 'harvestdor-indexer'
 
- #RSpec.configure do |config|
- #end
+ require 'vcr'
+
+ VCR.configure do |c|
+ c.cassette_library_dir = 'spec/fixtures/vcr_cassettes'
+ c.hook_into :webmock
+ end
@@ -4,7 +4,8 @@ describe Harvestdor::Indexer do
 
  before(:all) do
  @config_yml_path = File.join(File.dirname(__FILE__), "..", "config", "ap.yml")
- @indexer = Harvestdor::Indexer.new(@config_yml_path)
+ @client_config_path = File.join(File.dirname(__FILE__), "../..", "config", "dor-fetcher-client.yml")
+ @indexer = Harvestdor::Indexer.new(@config_yml_path, @client_config_path)
  require 'yaml'
  @yaml = YAML.load_file(@config_yml_path)
  @hdor_client = @indexer.send(:harvestdor_client)
@@ -65,55 +66,85 @@ describe Harvestdor::Indexer do
  :id => @fake_druid
  }
  end
- it "should call druids_via_oai and then call :add on rsolr connection" do
+ it "should call dor_fetcher_client.druid_array and then call :add on rsolr connection" do
  @indexer.should_receive(:druids).and_return([@fake_druid])
  @indexer.solr_client.should_receive(:add).with(@doc_hash)
  @indexer.solr_client.should_receive(:commit)
  @indexer.harvest_and_index
  end
- it "should not process druids in blacklist" do
- indexer = Harvestdor::Indexer.new(@config_yml_path, {:blacklist => @blacklist_path})
- hdor_client = indexer.send(:harvestdor_client)
- hdor_client.should_receive(:druids_via_oai).and_return(['oo000oo0000', 'oo111oo1111', 'oo222oo2222', 'oo333oo3333'])
- indexer.solr_client.should_receive(:add).with(hash_including({:id => 'oo000oo0000'}))
- indexer.solr_client.should_not_receive(:add).with(hash_including({:id => 'oo111oo1111'}))
- indexer.solr_client.should_not_receive(:add).with(hash_including({:id => 'oo222oo2222'}))
- indexer.solr_client.should_receive(:add).with(hash_including({:id => 'oo333oo3333'}))
- indexer.solr_client.should_receive(:commit)
- indexer.harvest_and_index
+
+ it "should only call :commit on rsolr connection once" do
+ VCR.use_cassette('single_rsolr_connection_call') do
+ indexer = Harvestdor::Indexer.new(@config_yml_path, @client_config_path)
+ hdor_client = indexer.send(:harvestdor_client)
+ indexer.dor_fetcher_client.should_receive(:druid_array).and_return(["druid:yg867hg1375", "druid:jf275fd6276", "druid:nz353cp1092", "druid:tc552kq0798", "druid:th998nk0722", "druid:ww689vs6534"])
+ indexer.solr_client.should_receive(:add).exactly(6).times
+ indexer.solr_client.should_receive(:commit).once
+ indexer.harvest_and_index
+ end
  end
- it "should only process druids in whitelist if it exists" do
- indexer = Harvestdor::Indexer.new(@config_yml_path, {:whitelist => @whitelist_path})
- hdor_client = indexer.send(:harvestdor_client)
- hdor_client.should_not_receive(:druids_via_oai)
- indexer.solr_client.should_receive(:add).with(hash_including({:id => 'oo000oo0000'}))
- indexer.solr_client.should_receive(:add).with(hash_including({:id => 'oo222oo2222'}))
- indexer.solr_client.should_receive(:commit)
- indexer.harvest_and_index
+
+ it "should not process druids in blacklist" do
+ VCR.use_cassette('ignore_druids_in_blacklist_call') do
+ indexer = Harvestdor::Indexer.new(@config_yml_path, @client_config_path, {:blacklist => @blacklist_path})
+ hdor_client = indexer.send(:harvestdor_client)
+ indexer.dor_fetcher_client.should_receive(:druid_array).and_return(["druid:yg867hg1375", "druid:jf275fd6276", "druid:nz353cp1092", "druid:tc552kq0798", "druid:th998nk0722", "druid:ww689vs6534"])
+ indexer.solr_client.should_receive(:add).with(hash_including({:id => 'druid:nz353cp1092'}))
+ indexer.solr_client.should_not_receive(:add).with(hash_including({:id => 'druid:jf275fd6276'}))
+ indexer.solr_client.should_not_receive(:add).with(hash_including({:id => 'druid:tc552kq0798'}))
+ indexer.solr_client.should_receive(:add).with(hash_including({:id => 'druid:th998nk0722'}))
+ indexer.solr_client.should_receive(:commit)
+ indexer.harvest_and_index
+ end
  end
  it "should not process druid if it is in both blacklist and whitelist" do
- indexer = Harvestdor::Indexer.new(@config_yml_path, {:blacklist => @blacklist_path, :whitelist => @whitelist_path})
- hdor_client = indexer.send(:harvestdor_client)
- hdor_client.should_not_receive(:druids_via_oai)
- indexer.solr_client.should_receive(:add).with(hash_including({:id => 'oo000oo0000'}))
- indexer.solr_client.should_receive(:commit)
- indexer.harvest_and_index
+ VCR.use_cassette('ignore_druids_in_blacklist_and_whitelist_call') do
+ indexer = Harvestdor::Indexer.new(@config_yml_path, @client_config_path, {:blacklist => @blacklist_path, :whitelist => @whitelist_path})
+ hdor_client = indexer.send(:harvestdor_client)
+ indexer.dor_fetcher_client.should_not_receive(:druid_array)
+ indexer.solr_client.should_receive(:add).with(hash_including({:id => 'druid:yg867hg1375'}))
+ indexer.solr_client.should_not_receive(:add).with(hash_including({:id => 'druid:jf275fd6276'}))
+ indexer.solr_client.should_receive(:commit)
+ indexer.harvest_and_index
+ end
  end
- it "should only call :commit on rsolr connection once" do
- indexer = Harvestdor::Indexer.new(@config_yml_path)
- hdor_client = indexer.send(:harvestdor_client)
- hdor_client.should_receive(:druids_via_oai).and_return(['1', '2', '3'])
- indexer.solr_client.should_receive(:add).exactly(3).times
- indexer.solr_client.should_receive(:commit).once
- indexer.harvest_and_index
+ it "should only process druids in whitelist if it exists" do
+ VCR.use_cassette('process_druids_whitelist_call') do
+ indexer = Harvestdor::Indexer.new(@config_yml_path, @client_config_path, {:whitelist => @whitelist_path})
+ hdor_client = indexer.send(:harvestdor_client)
+ indexer.dor_fetcher_client.should_not_receive(:druid_array)
+ indexer.solr_client.should_receive(:add).with(hash_including({:id => 'druid:yg867hg1375'}))
117
+ indexer.solr_client.should_receive(:add).with(hash_including({:id => 'druid:jf275fd6276'}))
118
+ indexer.solr_client.should_receive(:add).with(hash_including({:id => 'druid:nz353cp1092'}))
119
+ indexer.solr_client.should_receive(:commit)
120
+ indexer.harvest_and_index
121
+ end
109
122
  end
123
+
110
124
  end
111
125
 
112
- it "druids method should call druids_via_oai method on harvestdor_client" do
113
- @hdor_client.should_receive(:druids_via_oai).and_return([@fake_druid])
114
- @indexer.druids
115
- end
116
-
126
+ # Check for replacement of oai harvesting with dor-fetcher
127
+ context "replacing OAI harvesting with dor-fetcher" do
128
+ it "has a dor-fetcher client" do
129
+ expect(@indexer.dor_fetcher_client).to be_an_instance_of(DorFetcher::Client)
130
+ end
131
+
132
+ it "should strip off is_member_of_collection_ and is_governed_by_ and return only the druid" do
133
+ expect(@indexer.strip_default_set_string()).to eq("yg867hg1375")
134
+ end
135
+
136
+ it "druids method should call druid_array and get_collection methods on fetcher_client" do
137
+ VCR.use_cassette('get_collection_druids_call') do
138
+ expect(@indexer.druids).to eq(["druid:yg867hg1375", "druid:jf275fd6276", "druid:nz353cp1092", "druid:tc552kq0798", "druid:th998nk0722", "druid:ww689vs6534"])
139
+ end
140
+ end
141
+
142
+ it "should get the configuration of the dor-fetcher client from included yml file" do
143
+ expect(@indexer.dor_fetcher_client.service_url).to eq(@indexer.client_config["dor_fetcher_service_url"])
144
+ end
145
+
146
+ end # ending replacing OAI context
147
+
117
148
  context "smods_rec method" do
118
149
  before(:all) do
119
150
  @fake_druid = 'oo000oo0000'
@@ -134,7 +165,9 @@ describe Harvestdor::Indexer do
  expect { @indexer.smods_rec(@fake_druid) }.to raise_error(RuntimeError, Regexp.new("^Empty MODS metadata for #{@fake_druid}: <"))
  end
  it "should raise exception if there is no MODS xml for the druid" do
- expect { @indexer.smods_rec(@fake_druid) }.to raise_error(Harvestdor::Errors::MissingMods)
+ VCR.use_cassette('exception_no_MODS_call') do
+ expect { @indexer.smods_rec(@fake_druid) }.to raise_error(Harvestdor::Errors::MissingMods)
+ end
  end
  end
 
@@ -253,30 +286,32 @@ describe Harvestdor::Indexer do
  @indexer.send(:blacklist).size.should == 2
  end
  it "should be empty Array if there was no blacklist config setting" do
- indexer = Harvestdor::Indexer.new(@config_yml_path)
+ indexer = Harvestdor::Indexer.new(@config_yml_path, @client_config_path)
  indexer.send(:blacklist).should == []
  end
  context "load_blacklist" do
  it "should not be called if there was no blacklist config setting" do
- indexer = Harvestdor::Indexer.new(@config_yml_path)
+ VCR.use_cassette('no_blacklist_config_call') do
+ indexer = Harvestdor::Indexer.new(@config_yml_path, @client_config_path)
 
- indexer.should_not_receive(:load_blacklist)
+ indexer.should_not_receive(:load_blacklist)
 
- hdor_client = indexer.send(:harvestdor_client)
- hdor_client.should_receive(:druids_via_oai).and_return([@fake_druid])
- indexer.solr_client.should_receive(:add)
- indexer.solr_client.should_receive(:commit)
- indexer.harvest_and_index
+ hdor_client = indexer.send(:harvestdor_client)
+ indexer.dor_fetcher_client.should_receive(:druid_array).and_return([@fake_druid])
+ indexer.solr_client.should_receive(:add)
+ indexer.solr_client.should_receive(:commit)
+ indexer.harvest_and_index
+ end
  end
  it "should only try to load a blacklist once" do
- indexer = Harvestdor::Indexer.new(@config_yml_path, {:blacklist => @blacklist_path})
+ indexer = Harvestdor::Indexer.new(@config_yml_path, @client_config_path, {:blacklist => @blacklist_path})
  indexer.send(:blacklist)
  File.any_instance.should_not_receive(:open)
  indexer.send(:blacklist)
  end
  it "should log an error message and throw RuntimeError if it can't find the indicated blacklist file" do
  exp_msg = 'Unable to find list of druids at bad_path'
- indexer = Harvestdor::Indexer.new(@config_yml_path, {:blacklist => 'bad_path'})
+ indexer = Harvestdor::Indexer.new(@config_yml_path, @client_config_path, {:blacklist => 'bad_path'})
  indexer.logger.should_receive(:fatal).with(exp_msg)
  expect { indexer.send(:load_blacklist, 'bad_path') }.to raise_error(exp_msg)
  end
@@ -287,33 +322,35 @@ describe Harvestdor::Indexer do
  it "should be an Array with an entry for each non-empty line in the file" do
  @indexer.send(:load_whitelist, @whitelist_path)
  @indexer.send(:whitelist).should be_an_instance_of(Array)
- @indexer.send(:whitelist).size.should == 2
+ @indexer.send(:whitelist).size.should == 3
  end
  it "should be empty Array if there was no whitelist config setting" do
- indexer = Harvestdor::Indexer.new(@config_yml_path)
+ indexer = Harvestdor::Indexer.new(@config_yml_path, @client_config_path)
  indexer.send(:whitelist).should == []
  end
  context "load_whitelist" do
  it "should not be called if there was no whitelist config setting" do
- indexer = Harvestdor::Indexer.new(@config_yml_path)
+ VCR.use_cassette('no_whitelist_config_call') do
+ indexer = Harvestdor::Indexer.new(@config_yml_path, @client_config_path)
 
- indexer.should_not_receive(:load_whitelist)
+ indexer.should_not_receive(:load_whitelist)
 
- hdor_client = indexer.send(:harvestdor_client)
- hdor_client.should_receive(:druids_via_oai).and_return([@fake_druid])
- indexer.solr_client.should_receive(:add)
- indexer.solr_client.should_receive(:commit)
- indexer.harvest_and_index
+ hdor_client = indexer.send(:harvestdor_client)
+ indexer.dor_fetcher_client.should_receive(:druid_array).and_return([@fake_druid])
+ indexer.solr_client.should_receive(:add)
+ indexer.solr_client.should_receive(:commit)
+ indexer.harvest_and_index
+ end
  end
  it "should only try to load a whitelist once" do
- indexer = Harvestdor::Indexer.new(@config_yml_path, {:whitelist => @whitelist_path})
+ indexer = Harvestdor::Indexer.new(@config_yml_path, @client_config_path, {:whitelist => @whitelist_path})
  indexer.send(:whitelist)
  File.any_instance.should_not_receive(:open)
  indexer.send(:whitelist)
  end
  it "should log an error message and throw RuntimeError if it can't find the indicated whitelist file" do
  exp_msg = 'Unable to find list of druids at bad_path'
- indexer = Harvestdor::Indexer.new(@config_yml_path, {:whitelist => 'bad_path'})
+ indexer = Harvestdor::Indexer.new(@config_yml_path, @client_config_path, {:whitelist => 'bad_path'})
  indexer.logger.should_receive(:fatal).with(exp_msg)
  expect { indexer.send(:load_whitelist, 'bad_path') }.to raise_error(exp_msg)
  end
@@ -321,7 +358,7 @@ describe Harvestdor::Indexer do
  end # whitelist
 
  it "solr_client should initialize the rsolr client using the options from the config" do
- indexer = Harvestdor::Indexer.new(nil, Confstruct::Configuration.new(:solr => { :url => 'http://localhost:2345', :a => 1 }) )
+ indexer = Harvestdor::Indexer.new(nil, @client_config_path, Confstruct::Configuration.new(:solr => { :url => 'http://localhost:2345', :a => 1 }) )
  RSolr.should_receive(:connect).with(hash_including(:a => 1, :url => 'http://localhost:2345')).and_return('foo')
  indexer.solr_client
  end
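
Reviewer note: taken together, the whitelist and blacklist specs in this file suggest a simple selection rule. When a whitelist is configured, `druid_array` is never called and the whitelist stands in for the fetched list; blacklisted druids are then dropped, even if they are also whitelisted. The sketch below illustrates that reading only. `druids_to_index` is a hypothetical helper, not a method of `Harvestdor::Indexer`, and the sample lists are assumptions rather than the contents of the gem's fixture files.

```ruby
# Hypothetical helper (not part of harvestdor-indexer): a sketch of the
# selection rule that the specs above appear to exercise.
def druids_to_index(fetched, whitelist: [], blacklist: [])
  # A non-empty whitelist replaces the fetched list entirely; the specs
  # assert that druid_array is not called when a whitelist is present.
  candidates = whitelist.empty? ? fetched : whitelist
  # Blacklisted druids are dropped even when they are also whitelisted.
  candidates - blacklist
end

# Sample data: assumed for illustration, not read from spec/config/*.txt.
fetched   = %w[druid:yg867hg1375 druid:jf275fd6276 druid:nz353cp1092 druid:tc552kq0798]
whitelist = %w[druid:yg867hg1375 druid:jf275fd6276]
blacklist = %w[druid:jf275fd6276 druid:tc552kq0798]

p druids_to_index(fetched, blacklist: blacklist)
p druids_to_index(fetched, whitelist: whitelist, blacklist: blacklist)
```

Whether the gem applies the two lists in exactly this order is not visible in this diff; only the observable outcome asserted by the specs is modeled here.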
metadata CHANGED
@@ -1,206 +1,263 @@
  --- !ruby/object:Gem::Specification
  name: harvestdor-indexer
  version: !ruby/object:Gem::Version
- version: 0.0.13
+ version: 1.0.0
  platform: ruby
  authors:
  - Naomi Dushay
- autorequire:
+ - Bess Sadler
+ - Laney McGlohon
+ autorequire:
  bindir: bin
  cert_chain: []
- date: 2014-08-05 00:00:00.000000000 Z
+ date: 2014-10-24 00:00:00.000000000 Z
  dependencies:
  dependencies:
13
15
  - !ruby/object:Gem::Dependency
14
16
  name: rsolr
17
+ version_requirements: !ruby/object:Gem::Requirement
18
+ requirements:
19
+ - - '>='
20
+ - !ruby/object:Gem::Version
21
+ version: '0'
15
22
  requirement: !ruby/object:Gem::Requirement
16
23
  requirements:
17
- - - ">="
24
+ - - '>='
18
25
  - !ruby/object:Gem::Version
19
26
  version: '0'
20
- type: :runtime
21
27
  prerelease: false
28
+ type: :runtime
29
+ - !ruby/object:Gem::Dependency
30
+ name: retries
22
31
  version_requirements: !ruby/object:Gem::Requirement
23
32
  requirements:
24
- - - ">="
33
+ - - '>='
25
34
  - !ruby/object:Gem::Version
26
35
  version: '0'
27
- - !ruby/object:Gem::Dependency
28
- name: retries
29
36
  requirement: !ruby/object:Gem::Requirement
30
37
  requirements:
31
- - - ">="
38
+ - - '>='
32
39
  - !ruby/object:Gem::Version
33
40
  version: '0'
34
- type: :runtime
35
41
  prerelease: false
42
+ type: :runtime
43
+ - !ruby/object:Gem::Dependency
44
+ name: harvestdor
36
45
  version_requirements: !ruby/object:Gem::Requirement
37
46
  requirements:
38
- - - ">="
47
+ - - '>='
39
48
  - !ruby/object:Gem::Version
40
- version: '0'
41
- - !ruby/object:Gem::Dependency
42
- name: harvestdor
49
+ version: 0.0.14
43
50
  requirement: !ruby/object:Gem::Requirement
44
51
  requirements:
45
- - - ">="
52
+ - - '>='
46
53
  - !ruby/object:Gem::Version
47
54
  version: 0.0.14
48
- type: :runtime
49
55
  prerelease: false
56
+ type: :runtime
57
+ - !ruby/object:Gem::Dependency
58
+ name: stanford-mods
50
59
  version_requirements: !ruby/object:Gem::Requirement
51
60
  requirements:
52
- - - ">="
61
+ - - '>='
53
62
  - !ruby/object:Gem::Version
54
- version: 0.0.14
55
- - !ruby/object:Gem::Dependency
56
- name: stanford-mods
63
+ version: '0'
57
64
  requirement: !ruby/object:Gem::Requirement
58
65
  requirements:
59
- - - ">="
66
+ - - '>='
60
67
  - !ruby/object:Gem::Version
61
68
  version: '0'
62
- type: :runtime
63
69
  prerelease: false
70
+ type: :runtime
71
+ - !ruby/object:Gem::Dependency
72
+ name: dor-fetcher
64
73
  version_requirements: !ruby/object:Gem::Requirement
65
74
  requirements:
66
- - - ">="
75
+ - - '>='
67
76
  - !ruby/object:Gem::Version
68
- version: '0'
77
+ version: 1.0.0
78
+ requirement: !ruby/object:Gem::Requirement
79
+ requirements:
80
+ - - '>='
81
+ - !ruby/object:Gem::Version
82
+ version: 1.0.0
83
+ prerelease: false
84
+ type: :runtime
  - !ruby/object:Gem::Dependency
  name: confstruct
+ version_requirements: !ruby/object:Gem::Requirement
+ requirements:
+ - - '>='
+ - !ruby/object:Gem::Version
+ version: '0'
  requirement: !ruby/object:Gem::Requirement
  requirements:
- - - ">="
+ - - '>='
  - !ruby/object:Gem::Version
  version: '0'
- type: :runtime
  prerelease: false
+ type: :runtime
+ - !ruby/object:Gem::Dependency
+ name: rake
  version_requirements: !ruby/object:Gem::Requirement
  requirements:
- - - ">="
+ - - '>='
  - !ruby/object:Gem::Version
  version: '0'
- - !ruby/object:Gem::Dependency
- name: rake
  requirement: !ruby/object:Gem::Requirement
  requirements:
- - - ">="
+ - - '>='
  - !ruby/object:Gem::Version
  version: '0'
- type: :development
  prerelease: false
+ type: :development
+ - !ruby/object:Gem::Dependency
+ name: rdoc
  version_requirements: !ruby/object:Gem::Requirement
  requirements:
- - - ">="
+ - - '>='
  - !ruby/object:Gem::Version
  version: '0'
- - !ruby/object:Gem::Dependency
- name: rdoc
  requirement: !ruby/object:Gem::Requirement
  requirements:
- - - ">="
+ - - '>='
  - !ruby/object:Gem::Version
  version: '0'
- type: :development
  prerelease: false
+ type: :development
+ - !ruby/object:Gem::Dependency
+ name: yard
  version_requirements: !ruby/object:Gem::Requirement
  requirements:
- - - ">="
+ - - '>='
  - !ruby/object:Gem::Version
  version: '0'
- - !ruby/object:Gem::Dependency
- name: yard
  requirement: !ruby/object:Gem::Requirement
  requirements:
- - - ">="
+ - - '>='
  - !ruby/object:Gem::Version
  version: '0'
- type: :development
  prerelease: false
+ type: :development
+ - !ruby/object:Gem::Dependency
+ name: rspec
  version_requirements: !ruby/object:Gem::Requirement
  requirements:
- - - ">="
+ - - '>='
  - !ruby/object:Gem::Version
  version: '0'
- - !ruby/object:Gem::Dependency
- name: rspec
  requirement: !ruby/object:Gem::Requirement
  requirements:
- - - ">="
+ - - '>='
  - !ruby/object:Gem::Version
  version: '0'
- type: :development
  prerelease: false
+ type: :development
+ - !ruby/object:Gem::Dependency
+ name: coveralls
  version_requirements: !ruby/object:Gem::Requirement
  requirements:
- - - ">="
+ - - '>='
  - !ruby/object:Gem::Version
  version: '0'
- - !ruby/object:Gem::Dependency
- name: coveralls
  requirement: !ruby/object:Gem::Requirement
  requirements:
- - - ">="
+ - - '>='
  - !ruby/object:Gem::Version
  version: '0'
+ prerelease: false
  type: :development
+ - !ruby/object:Gem::Dependency
+ name: vcr
+ version_requirements: !ruby/object:Gem::Requirement
+ requirements:
+ - - '>='
+ - !ruby/object:Gem::Version
+ version: '0'
+ requirement: !ruby/object:Gem::Requirement
+ requirements:
+ - - '>='
+ - !ruby/object:Gem::Version
+ version: '0'
  prerelease: false
+ type: :development
+ - !ruby/object:Gem::Dependency
+ name: webmock
  version_requirements: !ruby/object:Gem::Requirement
  requirements:
- - - ">="
+ - - '>='
+ - !ruby/object:Gem::Version
+ version: '0'
+ requirement: !ruby/object:Gem::Requirement
+ requirements:
+ - - '>='
  - !ruby/object:Gem::Version
  version: '0'
- description: Harvest DOR object metadata via a relationship (e.g. hydra:isGovernedBy
- rdf:resource="info:fedora/druid:hy787xj5878") and dates, plus code framework to
- write Solr docs to index
+ prerelease: false
+ type: :development
+ description: Harvest DOR object metadata via a relationship (e.g. hydra:isGovernedBy rdf:resource="info:fedora/druid:hy787xj5878") and dates, plus code framework to write Solr docs to index
  email:
  - ndushay@stanford.edu
+ - bess@stanford.edu
+ - laneymcg@stanford.edu
  executables: []
  extensions: []
  extra_rdoc_files: []
  files:
- - ".gitignore"
- - ".travis.yml"
- - ".yardopts"
+ - .gitignore
+ - .travis.yml
+ - .yardopts
  - Gemfile
  - LICENSE.txt
  - README.rdoc
  - Rakefile
+ - config/dor-fetcher-client.yml
  - harvestdor-indexer.gemspec
  - lib/harvestdor-indexer.rb
  - lib/harvestdor-indexer/version.rb
  - spec/config/ap.yml
  - spec/config/ap_blacklist.txt
  - spec/config/ap_whitelist.txt
+ - spec/fixtures/vcr_cassettes/exception_no_MODS_call.yml
+ - spec/fixtures/vcr_cassettes/get_collection_druids_call.yml
+ - spec/fixtures/vcr_cassettes/ignore_druids_in_blacklist_call.yml
+ - spec/fixtures/vcr_cassettes/no_blacklist_config_call.yml
+ - spec/fixtures/vcr_cassettes/no_whitelist_config_call.yml
+ - spec/fixtures/vcr_cassettes/single_rsolr_connection_call.yml
  - spec/spec_helper.rb
  - spec/unit/harvestdor-indexer_spec.rb
  homepage: https://consul.stanford.edu/display/chimera/Chimera+project
  licenses: []
  metadata: {}
- post_install_message:
+ post_install_message:
  rdoc_options: []
  require_paths:
  - lib
  required_ruby_version: !ruby/object:Gem::Requirement
  requirements:
- - - ">="
+ - - '>='
  - !ruby/object:Gem::Version
  version: '0'
  required_rubygems_version: !ruby/object:Gem::Requirement
  requirements:
- - - ">="
+ - - '>='
  - !ruby/object:Gem::Version
  version: '0'
  requirements: []
- rubyforge_project:
+ rubyforge_project:
  rubygems_version: 2.2.2
- signing_key:
+ signing_key:
  specification_version: 4
  summary: Harvest DOR object metadata and index it to Solr
  test_files:
  - spec/config/ap.yml
  - spec/config/ap_blacklist.txt
  - spec/config/ap_whitelist.txt
+ - spec/fixtures/vcr_cassettes/exception_no_MODS_call.yml
+ - spec/fixtures/vcr_cassettes/get_collection_druids_call.yml
+ - spec/fixtures/vcr_cassettes/ignore_druids_in_blacklist_call.yml
+ - spec/fixtures/vcr_cassettes/no_blacklist_config_call.yml
+ - spec/fixtures/vcr_cassettes/no_whitelist_config_call.yml
+ - spec/fixtures/vcr_cassettes/single_rsolr_connection_call.yml
  - spec/spec_helper.rb
  - spec/unit/harvestdor-indexer_spec.rb
- has_rdoc:
+ has_rdoc: