search_solr_tools 6.3.0 → 6.5.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: f9ced4643b8adbda2b5ef09192f036af86878e07243fe959448213762e0e5cc1
-  data.tar.gz: 0a5f27a7bc1d8c9c0c07a20b6fbf122d5a3b6163a5654db635d5c478ac4a21bc
+  metadata.gz: daf440444527ee68761274399910b02c51d7994a1b605c57f64b2666d7a243c7
+  data.tar.gz: 13a1a0ccd9d623aad334f68c6d314f7205a48a45546393d3d2ed4d71a95899cc
 SHA512:
-  metadata.gz: cc66f8b40c62e2640fd72ce05aa2ac01aa76c58c730b6c445976fc7cf6e43b88cff29ec088f73e5dff913c879f7a1a31016cb634d851cfb3adb7b8bb735614c8
-  data.tar.gz: f896f7b473f977f0e349d422f6568774342e6bdb66e1a7dad1cf4477d0ffa7e9b05b81184898ab2d4d69c44f8ca98a3aa6791234926190484820cca4b439dc7d
+  metadata.gz: 2b195f73dc88f1fcf83b354f7ea572082419172970ced5e15e7f7afdb89f47da85424b473e9aa87462483f3ba6c5a2e1e2eb33c1493d72a978ff6570d86d9645
+  data.tar.gz: 96b72cadd2046588046b56a9911ee14caefead8813c38d579b93d9df56b00afacfeae13033c3ed5b866076c81aa694ccdf77fe024c354191c52d65db04c7f186
data/CHANGELOG.md CHANGED
@@ -1,3 +1,21 @@
+## v6.5.0 (2023-09-21)
+
+- Added logging functionality to the code, including the ability
+  to specify log file destination and log level for both the file and
+  console output
+
+## v6.4.1 (2023-09-15)
+
+- Added GitHub Action workflows for continuous integration features
+- Updated bump rake task to use Bump gem
+- Removed release rake task, moved it to the CI workflow
+
+## v6.4.0 (2023-08-14)
+
+- Fixed a bug with the sanitization, which was trying to modify the
+  string directly (causing problems with frozen strings). Changed to
+  return a new, sanitized string.
+
 ## v6.3.0 (2023-07-24)
 
 - Update Rubocop configuration to actually run against files, and make
data/README.md CHANGED
@@ -101,51 +101,79 @@ the tests whenever the appropriate files are changed.
 
 Please be sure to run them in the `bundle exec` context if you're utilizing bundler.
 
+By default, tests are run with minimal logging - no log file and only fatal errors
+written to the console. This can be changed by setting the environment variables
+as described in [Logging](#logging) below.
+
 ### Creating Releases (NSIDC devs only)
 
 Requirements:
 
 * Ruby > 3.2.2
 * [Bundler](http://bundler.io/)
-* [Gem Release](https://github.com/svenfuchs/gem-release)
 * [Rake](https://github.com/ruby/rake)
-* A [RubyGems](https://rubygems.org) account that has
-  [ownership](http://guides.rubygems.org/publishing/) of the gem
 * RuboCop and the unit tests should all pass (`rake`)
 
-The [CHANGELOG.md](CHANGELOG.md) is not automatically updated by the
-`rake release:*` tasks. Update it manually to insert the correct version and
-date, and commit the file, before creating the release package.
+To make a release, follow these steps:
 
-**gem release** is used by rake tasks in this project to handle version changes,
-tagging, and publishing to RubyGems.
+1. Confirm no errors are returned by `bundle exec rubocop` *
+2. Confirm all tests pass (`bundle exec rake spec:unit`) *
+3. Ensure that the `CHANGELOG.md` file is up to date with an `Unreleased`
+   header.
+4. Submit a Pull Request
+5. Once the PR has been reviewed and approved, merge the branch into `main`
+6. On your local machine, ensure you are on the `main` branch (and have
+   it up-to-date), and run `bundle exec rake bump:<part>` (see below)
+   * This will trigger the GitHub Actions CI workflow to push a release to
+     RubyGems.
 
-| Command | Description |
-|---------------------------|-------------|
-| `rake release:pre[false]` | Increase the current prerelease version number, push changes |
-| `rake release:pre[true]`  | Increase the current prerelease version number, publish release\* |
-| `rake release:none`       | Drop the prerelease version, publish release\*, then `pre[false]` (does a patch release) |
-| `rake release:minor`      | Increase the minor version number, publish release\*, then `pre[false]` |
-| `rake release:major`      | Increase the major version number, publish release\*, then `pre[false]` |
+The steps marked `*` above don't need to be done manually; every time a commit
+is pushed to the GitHub repository, these tests will be run automatically.
 
-\*"publish release" means each of the following occurs:
+The first 4 steps above are self-explanatory. More information on the remaining
+steps can be found below.
 
-* a new tag is created
-* the changes are pushed
-* the tagged version is built and published to RubyGems
+#### Version Bumping
 
-You will need to have a current Rubygems API key for the _NSIDC developer user_ account in
-order to publish a new version of the gem to Rubygems. To get the lastest API key:
+Running the `bundle exec rake bump:<part>` tasks will do the following actions:
 
-`curl -u <username> https://rubygems.org/api/v1/api_key.yaml > ~/.gem/credentials; chmod 0600 ~/.gem/credentials`
+1. The gem version will be updated locally
+2. The `CHANGELOG.md` file will be updated with the new gem version and date
+3. A tag `vx.x.x` will be created (with the new gem version)
+4. The files updated by the bump will be pushed to the GitHub repository, along
+   with the newly created tag.
+
+The sub-tasks associated with bump determine the type of bump:
+
+| Command           | Description                                                                                        |
+|-------------------|----------------------------------------------------------------------------------------------------|
+| `rake bump:pre`   | Increase the current prerelease version number (v1.2.3 -> v1.2.3.pre1; v1.2.3.pre1 -> v1.2.3.pre2) |
+| `rake bump:patch` | Increase the current patch number (v1.2.0 -> v1.2.1; v1.2.4 -> v1.2.5)                             |
+| `rake bump:minor` | Increase the minor version number (v1.2.0 -> v1.3.0; v1.2.4 -> v1.3.0)                             |
+| `rake bump:major` | Increase the major version number (v1.2.0 -> v2.0.0; v1.2.4 -> v2.0.0)                             |
+
+Using any bump other than `pre` will remove the `pre` suffix from the version as well.
+
+#### Release to RubyGems
+
+When a tag in the format of `vx.y.z` (including a `pre` suffix) is pushed to GitHub,
+it will trigger the GitHub Actions release workflow. This workflow will:
+
+1. Build the gem
+2. Push the gem to RubyGems
+
+The CI workflow has the credentials set up to push to RubyGems, so no user intervention
+is needed, and the workflow itself does not have to be manually triggered.
+
+If needed, the release can also be done locally by running the command
+`bundle exec gem release`. For this to work, you will need a local copy of the
+current Rubygems API key for the _NSIDC developer user_ account. To get the
+latest API key:
 
-## Release steps (summary)
+`curl -u <username> https://rubygems.org/api/v1/api_key.yaml > ~/.gem/credentials; chmod 0600 ~/.gem/credentials`
 
-- Confirm no errors are returned by `bundle exec rubocop`
-- Confirm all tests pass (`bundle exec rake spec:unit`)
-- Update the version number and date manually in `CHANGELOG.md` and commit the
-  changes.
-- Run the appropriate `bundle exec rake release:*` task
+It is recommended that this not be run locally, however; use the GitHub Actions CI
+workflow instead.
 
 ### SOLR
 
@@ -184,6 +212,47 @@ which can be modified, or additional environments can be added by just adding a
 new YAML stanza with the right keys; this new environment can then be used with
 the `--environment` flag when running `search_solr_tools harvest`.
 
+#### Logging
+
+By default, when running the harvest, harvest logs are written to the file
+`/var/log/search-solr-tools.log` (set to `warn` level), as well as to the console
+at `info` level. These settings are configured in the `environments.yaml` config
+file, in the `common` section.
+
+The keys in the `environments.yaml` file to consider are as follows:
+
+* `log_file` - The full name and path of the file to which log output will be
+  written. If set to the special value `none`, no log file will be written at all.
+  Log output will be **appended** to the file, if it exists; otherwise, the file will
+  be created.
+* `log_file_level` - Indicates the level of logging which should be written to the log file.
+* `log_stdout_level` - Indicates the level of logging which should be written to the console.
+  This can be different from the level written to the log file.
+
+You can also override the configuration file settings at the command line with the
+following environment variables (useful for doing development work):
+
+* `SEARCH_SOLR_LOG_FILE` - Overrides the `log_file` setting
+* `SEARCH_SOLR_LOG_LEVEL` - Overrides the `log_file_level` setting
+* `SEARCH_SOLR_STDOUT_LEVEL` - Overrides the `log_stdout_level` setting
+
+When running the spec tests, `SEARCH_SOLR_LOG_FILE` is set to `none` and
+`SEARCH_SOLR_STDOUT_LEVEL` is set to `fatal`, unless you manually set those
+environment variables prior to running the tests. This is to keep the test output
+clean unless you need more detail for debugging.
+
+The following are the levels of logging that can be specified. These levels are
+cumulative; for example, `error` will also output `fatal` log entries, and `debug`
+will output **all** log entries.
+
+* `none` - No logging output will be written.
+* `fatal` - Only outputs errors which result in a crash.
+* `error` - Outputs any error that occurs while harvesting.
+* `warn` - Outputs warnings that do not cause issues with the harvesting,
+  but might indicate things that may need to be addressed (such as deprecations)
+* `info` - Outputs general information, such as harvesting status
+* `debug` - Outputs detailed information that can be used for debugging and code tracing.
+
 ## Organization Info
 
 ### How to contact NSIDC
 
@@ -210,6 +279,5 @@ license is included in the file COPYING.
 
 Andy Grauch, Brendan Billingsley, Chris Chalstrom, Danielle Harper, Ian
 Truslove, Jonathan Kovarik, Luis Lopez, Miao Liu, Michael Brandt, Stuart Reed,
-Julia Collins, Scott Lewis
-(2023): Arctic Data Explorer SOLR Search software tools. The National Snow and
-Ice Data Center. Software.
+Julia Collins, Scott Lewis (2023): Arctic Data Explorer SOLR Search software tools.
+The National Snow and Ice Data Center. Software. https://doi.org/10.7265/n5jq0xzm
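The bump table above relies on RubyGems' prerelease semantics: a version with a `pre` suffix is a prerelease that sorts before the corresponding final release, which is why dropping the suffix "finalizes" a version. A small stdlib sketch with illustrative version numbers (not taken from this gem):

```ruby
require 'rubygems' # Gem::Version ships with RubyGems, loaded by default

pre   = Gem::Version.new('1.2.3.pre1')
final = Gem::Version.new('1.2.3')

pre.prerelease? # a version segment containing letters marks a prerelease
pre < final     # prereleases sort before the final release
```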
@@ -6,8 +6,14 @@ require 'thor'
 
 # rubocop:disable Metrics/AbcSize
 class SolrHarvestCLI < Thor
+  include SSTLogger
+
   map %w[--version -v] => :__print_version
 
+  def self.exit_on_failure?
+    false
+  end
+
   desc '--version, -v', 'print the version'
   def __print_version
     puts SearchSolrTools::VERSION
@@ -39,10 +45,11 @@ class SolrHarvestCLI < Thor
     rescue StandardError => e
       solr_status = false
       source_status = false
-      puts "Error trying to ping for #{target}: #{e}"
+      logger.error "Ping failed for #{target}: #{e}"
     end
     solr_success &&= solr_status
     source_success &&= source_status
+
     puts "Target: #{target}, Solr ping OK? #{solr_status}, data center ping OK? #{source_status}"
   end
 
@@ -61,7 +68,7 @@ class SolrHarvestCLI < Thor
   option :die_on_failure, type: :boolean
   def harvest(die_on_failure = options[:die_on_failure] || false)
     options[:data_center].each do |target|
-      puts "Target: #{target}"
+      logger.info "Target: #{target}"
       begin
         harvest_class = get_harvester_class(target)
        harvester = harvest_class.new(options[:environment], die_on_failure:)
@@ -73,12 +80,12 @@ class SolrHarvestCLI < Thor
 
        harvester.harvest_and_delete
      rescue SearchSolrTools::Errors::HarvestError => e
-        puts "THERE WERE HARVEST STATUS ERRORS:\n#{e.message}"
+        logger.error "THERE WERE HARVEST STATUS ERRORS:\n#{e.message}"
        exit e.exit_code
      rescue StandardError => e
        # If it gets here, there is an error that we aren't expecting.
-        puts "harvest failed for #{target}: #{e.message}"
-        puts e.backtrace
+        logger.error "harvest failed for #{target}: #{e.message}"
+        logger.error e.backtrace
        exit SearchSolrTools::Errors::HarvestError::ERRCODE_OTHER
      end
    end
@@ -93,16 +100,20 @@ class SolrHarvestCLI < Thor
   option :environment, required: true
   def delete_all
     env = SearchSolrTools::SolrEnvironments[options[:environment]]
+    logger.info('DELETE ALL started')
     `curl 'http://#{env[:host]}:#{env[:port]}/solr/update' -H 'Content-Type: text/xml; charset=utf-8' --data '<delete><query>*:*</query></delete>'`
     `curl 'http://#{env[:host]}:#{env[:port]}/solr/update' -H 'Content-Type: text/xml; charset=utf-8' --data '<commit/>'`
+    logger.info('DELETE ALL complete')
   end
 
   desc 'delete_all_auto_suggest', 'Delete all documents from the auto_suggest index'
   option :environment, required: true
   def delete_all_auto_suggest
     env = SearchSolrTools::SolrEnvironments[options[:environment]]
+    logger.info('DELETE ALL AUTO_SUGGEST started')
     `curl 'http://#{env[:host]}:#{env[:port]}/solr/update' -H 'Content-Type: text/xml; charset=utf-8' --data '<delete><query>*:*</query></delete>'`
     `curl 'http://#{env[:host]}:#{env[:port]}/solr/update' -H 'Content-Type: text/xml; charset=utf-8' --data '<commit/>'`
+    logger.info('DELETE ALL AUTO_SUGGEST complete')
   end
 
   desc 'delete_by_data_center', 'Force deletion of documents for a specific data center with timestamps before the passed timestamp in format iso8601 (2014-07-14T21:49:21Z)'
@@ -110,11 +121,13 @@ class SolrHarvestCLI < Thor
   option :environment, required: true
   option :data_center, required: true
   def delete_by_data_center
+    logger.info("DELETE ALL for data center '#{options[:data_center]}' started")
     harvester = get_harvester_class(options[:data_center]).new options[:environment]
     harvester.delete_old_documents(options[:timestamp],
                                    "data_centers:\"#{SearchSolrTools::Helpers::SolrFormat::DATA_CENTER_NAMES[options[:data_center].upcase.to_sym][:long_name]}\"",
                                    SearchSolrTools::SolrEnvironments[harvester.environment][:collection_name],
                                    true)
+    logger.info("DELETE ALL for data center '#{options[:data_center]}' complete")
   end
 
   no_tasks do
@@ -5,10 +5,10 @@ require 'yaml'
 
 module SearchSolrTools
   # configuration to work with solr locally, or on integration/qa/staging/prod
   module SolrEnvironments
-    YAML_ENVS = YAML.load_file(File.expand_path('environments.yaml', __dir__))
+    YAML_ENVS = YAML.load_file(File.expand_path('environments.yaml', __dir__), aliases: true)
 
     def self.[](env = :development)
-      YAML_ENVS[:common].merge(YAML_ENVS[env.to_sym])
+      YAML_ENVS[env.to_sym]
    end
  end
 end
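The `aliases: true` change above is what lets `environments.yaml` use YAML anchors and merge keys instead of a Ruby-side `Hash#merge`. A minimal sketch of the mechanism, with hypothetical keys, assuming Ruby 3.1+ where `YAML.load` accepts the `aliases:` keyword and permits symbols by default:

```ruby
require 'yaml'

yaml = <<~CONFIG
  :common: &common
    :log_file_level: warn
  :dev:
    <<: *common
    :host: localhost
CONFIG

# Without aliases: true, Psych 4+ raises Psych::AliasesNotEnabled here.
envs = YAML.load(yaml, aliases: true)

envs[:dev][:host]           # its own key
envs[:dev][:log_file_level] # merged in from :common via <<: *common
```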
@@ -1,4 +1,4 @@
-:common:
+:common: &common
   :auto_suggest_collection_name: auto_suggest
   :collection_name: nsidc_oai
   :collection_path: solr
@@ -9,33 +9,48 @@
   # should be. GLA01.018 will show up if we use DCS API v2.
   :nsidc_oai_identifiers_url: oai?verb=ListIdentifiers&metadataPrefix=dif&retired=false
 
+  # Log details. Can be overridden by environment-specific values
+  :log_file: /var/log/search-solr-tools.log
+  :log_file_level: warn
+  :log_stdout_level: info
+
 :local:
+  <<: *common
   :host: localhost
   :nsidc_dataset_metadata_url: http://integration.nsidc.org/api/dataset/metadata/
 
-:dev:
+:dev: &dev
+  <<: *common
   ## For the below, you'll need to instantiate your own search-solr instance, and point host to that.
   :host: dev.search-solr.USERNAME.dev.int.nsidc.org
   ## For the metadata content, either set up your own instance of dataset-catalog-services
   ## or change the URL below to point to integration
   :nsidc_dataset_metadata_url: http://dev.dcs.USERNAME.dev.int.nsidc.org:1580/api/dataset/metadata/
 
+:development:
+  <<: *dev
+
 :integration:
+  <<: *common
   :host: integration.search-solr.apps.int.nsidc.org
   :nsidc_dataset_metadata_url: http://integration.nsidc.org/api/dataset/metadata/
 
 :qa:
+  <<: *common
   :host: qa.search-solr.apps.int.nsidc.org
   :nsidc_dataset_metadata_url: http://qa.nsidc.org/api/dataset/metadata/
 
 :staging:
+  <<: *common
   :host: staging.search-solr.apps.int.nsidc.org
   :nsidc_dataset_metadata_url: http://staging.nsidc.org/api/dataset/metadata/
 
 :blue:
+  <<: *common
   :host: blue.search-solr.apps.int.nsidc.org
   :nsidc_dataset_metadata_url: http://nsidc.org/api/dataset/metadata/
 
 :production:
+  <<: *common
   :host: search-solr.apps.int.nsidc.org
   :nsidc_dataset_metadata_url: http://nsidc.org/api/dataset/metadata/
@@ -1,8 +1,12 @@
 # frozen_string_literal: true
 
+require_relative '../logging/sst_logger'
+
 module SearchSolrTools
   module Errors
     class HarvestError < StandardError
+      include SSTLogger
+
       ERRCODE_SOLR_PING = 1
       ERRCODE_SOURCE_PING = 2
       ERRCODE_SOURCE_NO_RESULTS = 4
@@ -73,11 +77,11 @@ module SearchSolrTools
     # rubocop:disable Metrics/AbcSize
     def exit_code
       if @status_data.nil?
-        puts "OTHER ERROR REPORTED: #{@other_message}"
+        logger.error "OTHER ERROR REPORTED: #{@other_message}"
         return ERRCODE_OTHER
       end
 
-      puts "EXIT CODE STATUS:\n#{@status_data.status}"
+      logger.error "EXIT CODE STATUS:\n#{@status_data.status}"
 
       code = 0
       code += ERRCODE_SOLR_PING unless @status_data.ping_solr
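The `ERRCODE_*` constants are distinct powers of two, so `exit_code` can pack several independent failures into a single integer and a caller can decode them again. A standalone sketch (constants copied from the diff above; the failure scenario is illustrative):

```ruby
ERRCODE_SOLR_PING         = 1
ERRCODE_SOURCE_PING       = 2
ERRCODE_SOURCE_NO_RESULTS = 4

code = 0
code += ERRCODE_SOLR_PING         # Solr did not answer the ping
code += ERRCODE_SOURCE_NO_RESULTS # the source returned no results

code                              # 5, i.e. 1 + 4
code.anybits?(ERRCODE_SOLR_PING)  # true: the Solr-ping bit is set
```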
@@ -51,10 +51,10 @@ module SearchSolrTools
       status = insert_solr_doc add_docs, Base::JSON_CONTENT_TYPE, @env_settings[:auto_suggest_collection_name]
 
       if status == Helpers::HarvestStatus::INGEST_OK
-        puts "Added #{add_docs.size} auto suggest documents in one commit"
+        logger.info "Added #{add_docs.size} auto suggest documents in one commit"
         Helpers::HarvestStatus.new(Helpers::HarvestStatus::INGEST_OK => add_docs)
       else
-        puts "Failed adding #{add_docs.size} documents in single commit, retrying one by one"
+        logger.error "Failed adding #{add_docs.size} documents in single commit, retrying one by one"
         new_add_docs = []
         add_docs.each do |doc|
           new_add_docs << { 'add' => { 'doc' => doc } }
@@ -15,6 +15,8 @@ module SearchSolrTools
   module Harvesters
     # base class for solr harvesters
     class Base
+      include SSTLogger
+
       attr_accessor :environment
 
       DELETE_DOCUMENTS_RATIO = 0.1
@@ -50,10 +52,10 @@ module SearchSolrTools
       begin
         RestClient.get(url) do |response, _request, _result|
           success = response.code == 200
-          puts "Error in ping request: #{response.body}" unless success
+          logger.error "Error in ping request: #{response.body}" unless success
         end
       rescue StandardError => e
-        puts "Rest exception while pinging Solr: #{e}"
+        logger.error "Rest exception while pinging Solr: #{e}"
       end
       success
     end
@@ -62,7 +64,7 @@ module SearchSolrTools
     # to "ping" the data center. Returns true if the ping is successful (or, as
     # in this default, no ping method was defined)
     def ping_source
-      puts 'Harvester does not have ping method defined, assuming true'
+      logger.info 'Harvester does not have ping method defined, assuming true'
       true
     end
 
@@ -81,31 +83,31 @@ module SearchSolrTools
       solr = RSolr.connect url: solr_url + "/#{solr_core}"
       unchanged_count = (solr.get 'select', params: { wt: :ruby, q: delete_query, rows: 0 })['response']['numFound'].to_i
       if unchanged_count.zero?
-        puts "All documents were updated after #{timestamp}, nothing to delete"
+        logger.info "All documents were updated after #{timestamp}, nothing to delete"
       else
-        puts "Begin removing documents older than #{timestamp}"
+        logger.info "Begin removing documents older than #{timestamp}"
         remove_documents(solr, delete_query, constraints, force, unchanged_count)
       end
     end
 
     def sanitize_data_centers_constraints(query_string)
       # Remove lucene special characters, preserve the query parameter and compress whitespace
-      query_string.gsub!(/[:&|!~\-\(\)\{\}\[\]\^\*\?\+]+/, ' ')
-      query_string.gsub!('data_centers ', 'data_centers:')
-      query_string.gsub!('source ', 'source:')
+      query_string = query_string.gsub(/[:&|!~\-\(\)\{\}\[\]\^\*\?\+]+/, ' ')
+      query_string = query_string.gsub('data_centers ', 'data_centers:')
+      query_string = query_string.gsub('source ', 'source:')
       query_string.squeeze(' ').strip
     end
 
     def remove_documents(solr, delete_query, constraints, force, numfound)
       all_response_count = (solr.get 'select', params: { wt: :ruby, q: constraints, rows: 0 })['response']['numFound']
       if force || (numfound / all_response_count.to_f < DELETE_DOCUMENTS_RATIO)
-        puts "Deleting #{numfound} documents for #{constraints}"
+        logger.info "Deleting #{numfound} documents for #{constraints}"
         solr.delete_by_query delete_query
         solr.commit
       else
-        puts "Failed to delete records older than current harvest start because they exceeded #{DELETE_DOCUMENTS_RATIO} of the total records for this data center."
-        puts "\tTotal records: #{all_response_count}"
-        puts "\tNon-updated records: #{numfound}"
+        logger.info "Failed to delete records older than current harvest start because they exceeded #{DELETE_DOCUMENTS_RATIO} of the total records for this data center."
+        logger.info "\tTotal records: #{all_response_count}"
+        logger.info "\tNon-updated records: #{numfound}"
       end
     end
 
@@ -121,8 +123,8 @@ module SearchSolrTools
       status.record_status doc_status
       doc_status == Helpers::HarvestStatus::INGEST_OK ? success += 1 : failure += 1
     end
-    puts "#{success} document#{success == 1 ? '' : 's'} successfully added to Solr."
-    puts "#{failure} document#{failure == 1 ? '' : 's'} not added to Solr."
+    logger.info "#{success} document#{success == 1 ? '' : 's'} successfully added to Solr."
+    logger.info "#{failure} document#{failure == 1 ? '' : 's'} not added to Solr."
 
     status
   end
@@ -146,14 +148,14 @@ module SearchSolrTools
     RestClient.post(url, doc_serialized, content_type:) do |response, _request, _result|
       success = response.code == 200
       unless success
-        puts "Error for #{doc_serialized}\n\n response: #{response.body}"
+        logger.error "Error for #{doc_serialized}\n\n response: #{response.body}"
         status = Helpers::HarvestStatus::INGEST_ERR_SOLR_ERROR
       end
     end
   rescue StandardError => e
     # TODO: Need to provide more detail re: this failure so we know whether to
     # exit the job with a status != 0
-    puts "Rest exception while POSTing to Solr: #{e}, for doc: #{doc_serialized}"
+    logger.error "Rest exception while POSTing to Solr: #{e}, for doc: #{doc_serialized}"
     status = Helpers::HarvestStatus::INGEST_ERR_SOLR_ERROR
   end
   status
@@ -177,11 +179,11 @@ module SearchSolrTools
   request_url = encode_data_provider_url(request_url)
 
   begin
-    puts "Request: #{request_url}"
+    logger.debug "Request: #{request_url}"
     response = URI.parse(request_url).open(read_timeout: timeout, 'Content-Type' => content_type)
   rescue OpenURI::HTTPError, Timeout::Error, Errno::ETIMEDOUT => e
     retries_left -= 1
-    puts "## REQUEST FAILED ## #{e.class} ## Retrying #{retries_left} more times..."
+    logger.error "## REQUEST FAILED ## #{e.class} ## Retrying #{retries_left} more times..."
 
     retry if retries_left.positive?
 
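The `sanitize_data_centers_constraints` change above is the frozen-string fix noted in the changelog: under `# frozen_string_literal: true`, `gsub!` raises instead of mutating, while `gsub` returns a new, sanitized string. A standalone sketch with an illustrative query and a reduced regex (not the method's full character class):

```ruby
query = 'data_centers:"NSIDC"'.freeze

mutated = begin
  query.gsub!(/[:"]/, ' ') # in-place edit: raises on a frozen string
  true
rescue FrozenError
  false
end

# The non-mutating form leaves `query` intact and returns the cleaned copy.
sanitized = query.gsub(/[:"]/, ' ').squeeze(' ').strip
```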
@@ -6,7 +6,7 @@ module SearchSolrTools
   module Harvesters
     class NsidcAutoSuggest < AutoSuggest
       def harvest_and_delete
-        puts 'Building auto-suggest indexes for NSIDC'
+        logger.info 'Building auto-suggest indexes for NSIDC'
         super(method(:harvest), 'source:"NSIDC"', @env_settings[:auto_suggest_collection_name])
       end
 
@@ -21,13 +21,13 @@ module SearchSolrTools
         return response.code == 200
       end
     rescue StandardError
-      puts "Error trying to get options for #{nsidc_json_url} (ping)"
+      logger.error "Error trying to get options for #{nsidc_json_url} (ping)"
     end
     false
   end
 
   def harvest_and_delete
-    puts "Running harvest of NSIDC catalog from #{nsidc_json_url}"
+    logger.info "Running harvest of NSIDC catalog from #{nsidc_json_url}"
     super(method(:harvest_nsidc_json_into_solr), "data_centers:\"#{Helpers::SolrFormat::DATA_CENTER_NAMES[:NSIDC][:long_name]}\"")
   end
 
@@ -47,8 +47,8 @@ module SearchSolrTools
   rescue Errors::HarvestError => e
     raise e
   rescue StandardError => e
-    puts "An unexpected exception occurred while trying to harvest or insert: #{e}"
-    puts e.backtrace
+    logger.error "An unexpected exception occurred while trying to harvest or insert: #{e}"
+    logger.error e.backtrace
     status = Helpers::HarvestStatus.new(Helpers::HarvestStatus::OTHER_ERROR => e)
     raise Errors::HarvestError, status
   end
@@ -83,7 +83,7 @@ module SearchSolrTools
   begin
     docs << { 'add' => { 'doc' => @translator.translate(fetch_json_from_nsidc(id)) } }
   rescue StandardError => e
-    puts "Failed to fetch #{id} with error #{e}: #{e.backtrace}"
+    logger.error "Failed to fetch #{id} with error #{e}: #{e.backtrace}"
     failure_ids << id
   end
 end
@@ -0,0 +1,71 @@
+# frozen_string_literal: true
+
+require 'fileutils'
+require 'logging'
+
+require_relative '../config/environments'
+
+module SSTLogger
+  LOG_LEVELS = %w[debug info warn error fatal none].freeze
+
+  def logger
+    SSTLogger.logger
+  end
+
+  class << self
+    def logger
+      @logger ||= new_logger
+    end
+
+    private
+
+    def new_logger
+      logger = Logging.logger['search_solr_tools']
+
+      append_stdout_logger(logger)
+      append_file_logger(logger)
+
+      logger
+    end
+
+    def append_stdout_logger(logger)
+      return if log_stdout_level.nil?
+
+      new_stdout = Logging.appenders.stdout
+      new_stdout.level = log_stdout_level
+      new_stdout.layout = Logging.layouts.pattern(pattern: "%-5l : %m\n")
+      logger.add_appenders new_stdout
+    end
+
+    def append_file_logger(logger)
+      return if log_file == 'none'
+
+      FileUtils.mkdir_p(File.dirname(log_file))
+      new_file = Logging.appenders.file(
+        log_file,
+        layout: Logging.layouts.pattern(pattern: "[%d] %-5l : %m\n")
+      )
+      new_file.level = log_file_level
+      logger.add_appenders new_file
+    end
+
+    def log_file
+      env = SearchSolrTools::SolrEnvironments[]
+      ENV.fetch('SEARCH_SOLR_LOG_FILE', nil) || env[:log_file]
+    end
+
+    def log_file_level
+      env = SearchSolrTools::SolrEnvironments[]
+      log_level(ENV.fetch('SEARCH_SOLR_LOG_LEVEL', nil) || env[:log_file_level])
+    end
+
+    def log_stdout_level
+      env = SearchSolrTools::SolrEnvironments[]
+      log_level(ENV.fetch('SEARCH_SOLR_STDOUT_LEVEL', nil) || env[:log_stdout_level])
+    end
+
+    def log_level(level)
+      LOG_LEVELS.include?(level) ? level : nil
+    end
+  end
+end
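The private helpers in `SSTLogger` above all follow one precedence rule: an environment variable wins over the `environments.yaml` value, and unrecognized level names are dropped (so the corresponding appender is skipped). A standalone sketch of that rule, using a hypothetical config value and no `logging` gem:

```ruby
LOG_LEVELS = %w[debug info warn error fatal none].freeze

def log_level(level)
  LOG_LEVELS.include?(level) ? level : nil
end

config_level = 'warn' # stand-in for env[:log_file_level] from environments.yaml

ENV.delete('SEARCH_SOLR_LOG_LEVEL')
log_level(ENV.fetch('SEARCH_SOLR_LOG_LEVEL', nil) || config_level) # "warn"

ENV['SEARCH_SOLR_LOG_LEVEL'] = 'debug'
log_level(ENV.fetch('SEARCH_SOLR_LOG_LEVEL', nil) || config_level) # "debug"

log_level('verbose') # nil: not a recognized level name
```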
@@ -1,5 +1,5 @@
 # frozen_string_literal: true
 
 module SearchSolrTools
-  VERSION = '6.3.0'
+  VERSION = '6.5.0'
 end
@@ -6,6 +6,8 @@ require_relative 'search_solr_tools/version'
 require_relative 'search_solr_tools/helpers/harvest_status'
 require_relative 'search_solr_tools/errors/harvest_error'
 
+require_relative 'search_solr_tools/logging/sst_logger'
+
 %w[harvesters translators].each do |subdir|
   Dir[File.join(__dir__, 'search_solr_tools', subdir, '*.rb')].each { |file| require file }
 end
@@ -27,14 +27,16 @@ Gem::Specification.new do |spec|
 
   spec.add_runtime_dependency 'ffi-geos', '~> 2.4.0'
   spec.add_runtime_dependency 'iso8601', '~> 0.13.0'
+  spec.add_runtime_dependency 'logging', '~> 2.3.1'
   spec.add_runtime_dependency 'multi_json', '~> 1.15.0'
-  spec.add_runtime_dependency 'nokogiri', '~> 1.15.3'
+  spec.add_runtime_dependency 'nokogiri', '~> 1.15.4'
   spec.add_runtime_dependency 'rest-client', '~> 2.1.0'
   spec.add_runtime_dependency 'rgeo', '~> 3.0.0'
   spec.add_runtime_dependency 'rgeo-geojson', '~> 2.1.1'
   spec.add_runtime_dependency 'rsolr', '~> 2.5.0'
   spec.add_runtime_dependency 'thor', '~> 1.2.2'
 
+  spec.add_development_dependency 'bump', '~> 0.10.0'
   spec.add_development_dependency 'gem-release', '~> 2.2.2'
   spec.add_development_dependency 'guard', '~> 2.18.0'
   spec.add_development_dependency 'guard-rspec', '~> 4.7.3'
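The new `logging` dependency above uses a pessimistic (`~>`) constraint: `~> 2.3.1` allows patch releases in the 2.3 series but excludes 2.4. This can be checked with the RubyGems stdlib classes:

```ruby
require 'rubygems' # Gem::Requirement and Gem::Version are part of RubyGems

req = Gem::Requirement.new('~> 2.3.1') # equivalent to >= 2.3.1, < 2.4.0

req.satisfied_by?(Gem::Version.new('2.3.9')) # true: patch updates allowed
req.satisfied_by?(Gem::Version.new('2.4.0')) # false: outside the ~> range
```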
metadata CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: search_solr_tools
 version: !ruby/object:Gem::Version
-  version: 6.3.0
+  version: 6.5.0
 platform: ruby
 authors:
 - Chris Chalstrom
@@ -14,7 +14,7 @@ authors:
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2023-07-24 00:00:00.000000000 Z
+date: 2023-09-21 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: ffi-geos
@@ -44,6 +44,20 @@ dependencies:
     - - "~>"
       - !ruby/object:Gem::Version
         version: 0.13.0
+- !ruby/object:Gem::Dependency
+  name: logging
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: 2.3.1
+  type: :runtime
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: 2.3.1
 - !ruby/object:Gem::Dependency
   name: multi_json
   requirement: !ruby/object:Gem::Requirement
@@ -64,14 +78,14 @@ dependencies:
     requirements:
     - - "~>"
      - !ruby/object:Gem::Version
-      version: 1.15.3
+      version: 1.15.4
   type: :runtime
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
      - !ruby/object:Gem::Version
-      version: 1.15.3
+      version: 1.15.4
 - !ruby/object:Gem::Dependency
   name: rest-client
   requirement: !ruby/object:Gem::Requirement
@@ -142,6 +156,20 @@ dependencies:
     - - "~>"
      - !ruby/object:Gem::Version
       version: 1.2.2
+- !ruby/object:Gem::Dependency
+  name: bump
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: 0.10.0
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: 0.10.0
 - !ruby/object:Gem::Dependency
   name: gem-release
   requirement: !ruby/object:Gem::Requirement
@@ -332,6 +360,7 @@ files:
 - lib/search_solr_tools/helpers/solr_format.rb
 - lib/search_solr_tools/helpers/translate_spatial_coverage.rb
 - lib/search_solr_tools/helpers/translate_temporal_coverage.rb
+- lib/search_solr_tools/logging/sst_logger.rb
 - lib/search_solr_tools/translators/nsidc_json.rb
 - lib/search_solr_tools/version.rb
 - search_solr_tools.gemspec
@@ -354,7 +383,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
   - !ruby/object:Gem::Version
     version: '0'
 requirements: []
-rubygems_version: 3.4.17
+rubygems_version: 3.4.10
 signing_key:
 specification_version: 4
 summary: Tools to harvest and manage various scientific dataset feeds in a Solr instance.