search_solr_tools 7.1.0 → 7.2.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 06c1b12c76f3c9cfdd99f05f85cce8a24c6118cd1dfcbfc8ace538a32865d36f
4
- data.tar.gz: bd22205a6833f05d4d96fa43cdc91b5b3d45452f658ccf597ac3bbd26ac40abc
3
+ metadata.gz: ea0e97fdb6cd3986683da13d8fd8ff26a150b00a3fc3b64de54fee93d3ceaf1d
4
+ data.tar.gz: 9e0c10993ab2fce13b400f9fb419788f0706b4367d9629145e425ed8c5931d05
5
5
  SHA512:
6
- metadata.gz: 6df6573622d4643676f1334b15aa4184f15403bc8f90f8a8cd116c25aa86202c6edc04a8acb9ef4ae428e49bb5429b2fad5b1ecd1afbf5bf5557af38eca28285
7
- data.tar.gz: e75b60f41f1c4e5f8b2e7ad14e34a58fa20ba3af5d45b56fccf6914db51f108d4c015a26a139d21ed30c2cd9c06634f73c5a637043f336a210ca37f00625877c
6
+ metadata.gz: 9fa45abe863da34e10f81183d242db3ccf8e1b37b01b2efb0836f469f50ae1f36dc3cd03ca6ab27a0ea2d0a83db232b93bb98271b8948b3283f717f2fd03ee66
7
+ data.tar.gz: 190b02a7ecd6ed44c762317d4e142e88877f053ad09457187e05c5a9e6de7e41f307118ad64661cc8efe563d0bb23054d795c3aa0a307b3b71df4d764c30fd53
data/CHANGELOG.md CHANGED
@@ -1,3 +1,8 @@
1
+ ## v7.2.0 (2023-10-19)
2
+
3
+ - Force parameter facets based on GCMD keywords to be upper-case.
4
+ - Only use short name for sensor facets in which the short name and long name are identical.
5
+
1
6
  ## v7.1.0 (2023-10-11)
2
7
 
3
8
  - Updating harvesting to harvest storage system and spatial coverage
@@ -50,15 +55,15 @@
50
55
 
51
56
  ## v5.2.0 (2022-08-31)
52
57
 
53
- - Updated the call for identifiers for the json harvester to use the
58
+ - Updated the call for identifiers for the json harvester to use the
54
59
  proper "metadataPrefix" parameter, and request the dif identifiers
55
60
  instead of iso.
56
61
 
57
62
  ## v5.1.0 (2020-07-23)
58
63
 
59
- - Added a CLI method to "ping" the Solr and Source servers for a given
64
+ - Added a CLI method to "ping" the Solr and Source servers for a given
60
65
  data center.
61
- - Added a CLI method "errcode" to get information about the various
66
+ - Added a CLI method "errcode" to get information about the various
62
67
  error codes that may be returned during harvest
63
68
  - Updated the CLI harvest to return more useful error codes on failure.
64
69
 
data/README.md CHANGED
@@ -4,7 +4,7 @@
4
4
 
5
5
  This is a gem that contains:
6
6
 
7
- * Ruby translators to transform various metadata feeds into solr documents
7
+ * Ruby translators to transform NSIDC metadata feeds into solr documents
8
8
  * A command-line utility to access/utilize the gem's translators to harvest
9
9
  metadata into a working solr instance.
10
10
 
@@ -25,30 +25,23 @@ Clone the repository, and install all requirements as noted below.
25
25
  #### Configuration
26
26
 
27
27
  Once you have the code and requirements, edit the configuration file in
28
- `lib/search_solr_tools/config/environments.yaml` to match your environment. The
29
- configuration values are set by environment for each harvester (or specified in
30
- the `common` settings list), with the environment overriding `common` if a
31
- different setting is specified for a given environment.
32
-
33
- Each harvester has its own configuration settings. Most are the target endpoint;
34
- EOL, however, has a list of THREDDS project endpoints and NSIDC has its own
35
- oai/metadata endpoint settings.
36
-
37
- Most users should not need to change the harvester configuration unless they
38
- establish a local test node, or if a provider changes available endpoints;
39
- however, the `host` option for each environment must specify the configured SOLR
28
+ `lib/search_solr_tools/config/environments.yaml` to match your environment.
29
+ Environment settings take precedence over `common` settings.
30
+ The `host` option for each environment must specify the configured SOLR
40
31
  instance you intend to use these tools with.
41
32
 
42
33
  #### Build and Install Gem
43
34
 
44
- Then run:
35
+ Run:
45
36
 
46
37
  `bundle exec gem build ./search_solr_tools.gemspec`
47
38
 
48
- Once you have the gem built in the project directory, install the utility:
39
+ Once you have the gem built in the project directory, install it:
49
40
 
50
41
  `gem install --local ./search_solr_tools-version.gem`
51
42
 
43
+ See _Harvesting Data_ (below) for usage examples.
44
+
52
45
  ## Working on the Project
53
46
 
54
47
  1. Create your feature branch (`git checkout -b my-new-feature`)
@@ -177,8 +170,7 @@ workflow instead.
177
170
 
178
171
  ### SOLR
179
172
 
180
- To harvest data utilizing the gem, you will need an installed instance of [Solr
181
- 8.5.3](https://lucene.apache.org/solr/guide/)
173
+ To harvest data utilizing the gem, you will need an installed instance of [Solr](https://solr.apache.org/guide/solr/latest/index.html)
182
174
 
183
175
  #### NSIDC
184
176
 
@@ -193,7 +185,7 @@ Outside of NSIDC, setup solr using the instructions found in the
193
185
 
194
186
  ### Harvesting Data
195
187
 
196
- The harvester requires additional metadata from services that may not yet be
188
+ The harvester requires additional metadata from services that may not be
197
189
  publicly available, which are referenced in
198
190
  `lib/search_solr_tools/config/environments.yaml`.
199
191
 
@@ -204,7 +196,7 @@ overview of what's available, simply run `search_solr_tools`.
204
196
 
205
197
  Harvesting of data can be done using the `harvest` task, giving it a list of
206
198
  harvesters and an environment. Deletion is possible via the `delete_all` and/or
207
- `delete_by_data_center'`tasks. `list harvesters` will list the valid harvest
199
+ `delete_by_data_center'`tasks. `list_harvesters` will list the valid harvest
208
200
  targets.
209
201
 
210
202
  In addition to feed URLs, `environments.yaml` also defines various environments
@@ -212,6 +204,13 @@ which can be modified, or additional environments can be added by just adding a
212
204
  new YAML stanza with the right keys; this new environment can then be used with
213
205
  the `--environment` flag when running `search_solr_tools harvest`.
214
206
 
207
+ An example harvest of NSIDC metadata into a developer instance of Solr:
208
+
209
+ bundle exec search_solr_tools harvest --data-center=nsidc --environment=dev
210
+
211
+ In this example, the `host` value in the `environments.yaml` `dev` entry
212
+ must reference a valid Solr instance.
213
+
215
214
  #### Logging
216
215
 
217
216
  By default, when running the harvest, harvest logs are written to the file
@@ -40,7 +40,7 @@ module SearchSolrTools
40
40
 
41
41
  status.record_status(Helpers::HarvestStatus::HARVEST_NO_DOCS) if (result[:num_docs]).zero?
42
42
 
43
- # Record the number of harvest failures; note that if this is 0, thats OK, the status will stay at 0
43
+ # Record the number of harvest failures; note that if this is 0, that's OK, the status will stay at 0
44
44
  status.record_status(Helpers::HarvestStatus::HARVEST_FAILURE, result[:failure_ids].length)
45
45
 
46
46
  raise Errors::HarvestError, status unless status.ok?
@@ -89,7 +89,7 @@ module SearchSolrTools
89
89
  binned_facet = bin(FacetConfiguration.get_facet_bin(type), format_string)
90
90
  if binned_facet.nil?
91
91
  format_string
92
- elsif binned_facet.eql?('exclude')
92
+ elsif binned_facet.match?(/\Aexclude\z/i)
93
93
  nil
94
94
  else
95
95
  binned_facet
@@ -98,11 +98,14 @@ module SearchSolrTools
98
98
 
99
99
  def self.parameter_binning(parameter_string)
100
100
  binned_parameter = bin(FacetConfiguration.get_facet_bin('parameter'), parameter_string)
101
- # use variable_level_1 if no mapping exists
102
101
  return binned_parameter unless binned_parameter.nil?
103
102
 
103
+ # if no mapping exists, use variable_level_1.
104
+ # Force it to all upper case for consistency. This is a hacky workaround to
105
+ # deal with deprecated GCMD keywords still in use by some datasets that result
106
+ # in duplicate, case-sensitive entries in the search interface facet list.
104
107
  parts = parameter_string.split '>'
105
- return parts[3].strip if parts.length >= 4
108
+ return parts[3].strip.upcase if parts.length >= 4
106
109
 
107
110
  nil
108
111
  end
@@ -158,7 +161,7 @@ module SearchSolrTools
158
161
  def self.bin(mappings, term)
159
162
  mappings.each do |mapping|
160
163
  term.match(mapping['pattern']) do
161
- return mapping['mapping']
164
+ return mapping['mapping'].upcase
162
165
  end
163
166
  end
164
167
  nil
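
For illustration only, a minimal standalone sketch of how the two binning hunks above behave after this change. The `sample_mappings` list and the keyword strings are hypothetical; the real mappings come from `FacetConfiguration.get_facet_bin('parameter')`, which is not shown in this diff, and the sketch uses `Regexp` patterns for simplicity.

```ruby
# Hypothetical stand-in for FacetConfiguration.get_facet_bin('parameter').
def sample_mappings
  [{ 'pattern' => /Sea Ice/i, 'mapping' => 'Sea Ice' }]
end

# Return the first mapping whose pattern matches the term, upper-cased
# (mirrors the 7.2.0 change to `bin`); nil when nothing matches.
def bin(mappings, term)
  mappings.each do |mapping|
    return mapping['mapping'].upcase if term.match?(mapping['pattern'])
  end
  nil
end

# Use the mapped bin when one exists; otherwise fall back to the fourth
# '>'-separated part of the keyword (variable_level_1), upper-cased.
def parameter_binning(parameter_string, mappings = sample_mappings)
  binned_parameter = bin(mappings, parameter_string)
  return binned_parameter unless binned_parameter.nil?

  parts = parameter_string.split('>')
  return parts[3].strip.upcase if parts.length >= 4

  nil
end
```

Under these assumptions, a keyword with no mapping and at least four parts yields its fourth part in upper case, so two spellings that differ only by case collapse into a single facet value.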
@@ -75,9 +75,10 @@ module SearchSolrTools
75
75
  return facet_values if json.nil?
76
76
 
77
77
  json.each do |json_entry|
78
+ long_name = json_entry['shortName'].eql?(json_entry['longName']) ? '' : json_entry['longName']
78
79
  sensor_bin = Helpers::SolrFormat.facet_binning('sensor', json_entry['shortName'].to_s)
79
80
  facet_values << if sensor_bin.eql? json_entry['shortName']
80
- "#{json_entry['longName']} | #{json_entry['shortName']}"
81
+ "#{long_name} | #{json_entry['shortName']}"
81
82
  else
82
83
  " | #{sensor_bin}"
83
84
  end
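
As a rough sketch of the sensor facet hunk above, the helper below builds a single facet value in isolation. `sensor_facet_value` is a hypothetical name, and its `sensor_bin` argument stands in for the result of `Helpers::SolrFormat.facet_binning('sensor', short_name)`; the sample sensor names are illustrative only.

```ruby
# Hypothetical helper mirroring the 7.2.0 logic for one JSON entry.
def sensor_facet_value(short_name, long_name, sensor_bin)
  # Drop the long name when it merely repeats the short name.
  shown_long_name = short_name.eql?(long_name) ? '' : long_name

  if sensor_bin.eql?(short_name)
    # The sensor was not binned: keep the "long | short" form.
    "#{shown_long_name} | #{short_name}"
  else
    # A bin was applied: only the bin is shown.
    " | #{sensor_bin}"
  end
end
```

With this sketch, an entry whose short and long names are identical produces a value with an empty long-name part rather than the same name twice.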
@@ -1,5 +1,5 @@
1
1
  # frozen_string_literal: true
2
2
 
3
3
  module SearchSolrTools
4
- VERSION = '7.1.0'
4
+ VERSION = '7.2.0'
5
5
  end
@@ -8,7 +8,7 @@ Gem::Specification.new do |spec|
8
8
  spec.name = 'search_solr_tools'
9
9
  spec.version = SearchSolrTools::VERSION
10
10
  spec.authors = ['Chris Chalstrom', 'Michael Brandt', 'Jonathan Kovarik', 'Luis Lopez', 'Stuart Reed', 'Julia Collins', 'Scott Lewis']
11
- spec.email = ['cchalstr@nsidc.org', 'mbrandt@colorado.edu', 'kovarik@nsidc.org', 'luis.lopezespinosa@colorado.edu', 'stuart.reed@colorado.edu', 'jcollins@nsidc.org', 'scott.lewis@nsidc.org']
11
+ spec.email = ['Jonathan.Kovarik@colorado.edu', 'luis.lopezespinosa@colorado.edu', 'collinsj@colorado.edu', 'scott.lewis@colorado.edu']
12
12
  spec.summary = 'Tools to harvest and manage various scientific dataset feeds in a Solr instance.'
13
13
  spec.description = <<-EOF
14
14
  Ruby translators to transform various metadata feeds into solr documents and
metadata CHANGED
@@ -1,7 +1,7 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: search_solr_tools
3
3
  version: !ruby/object:Gem::Version
4
- version: 7.1.0
4
+ version: 7.2.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Chris Chalstrom
@@ -14,7 +14,7 @@ authors:
14
14
  autorequire:
15
15
  bindir: bin
16
16
  cert_chain: []
17
- date: 2023-10-11 00:00:00.000000000 Z
17
+ date: 2023-10-19 00:00:00.000000000 Z
18
18
  dependencies:
19
19
  - !ruby/object:Gem::Dependency
20
20
  name: ffi-geos
@@ -329,13 +329,10 @@ description: |2
329
329
  a command-line utility to access/utilize the gem's translators to harvest
330
330
  metadata into a working solr instance.
331
331
  email:
332
- - cchalstr@nsidc.org
333
- - mbrandt@colorado.edu
334
- - kovarik@nsidc.org
332
+ - Jonathan.Kovarik@colorado.edu
335
333
  - luis.lopezespinosa@colorado.edu
336
- - stuart.reed@colorado.edu
337
- - jcollins@nsidc.org
338
- - scott.lewis@nsidc.org
334
+ - collinsj@colorado.edu
335
+ - scott.lewis@colorado.edu
339
336
  executables:
340
337
  - search_solr_tools
341
338
  extensions: []