oddb2xml 3.0.17 → 3.0.18

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: df4185a65a50fc628ba92bc98fb8ac378563d5653882e3e34b54d1c1265e3808
-   data.tar.gz: a62c0d6b0b643458e9885c6ba7d8c1d54c1435696a3c73da2788fd22539a9915
+   metadata.gz: a4aadf4a537b9f625c491c4609116f6aef44c4f558e0a273a0d2da705ad8bfd0
+   data.tar.gz: 946a8f14539626a383dcab8e73cdc5e89410e8df2721a44d0fca50cc868cf11c
  SHA512:
-   metadata.gz: e3718c1c76fbced6bb1ce43350ae13b42fbf747e2dce91980be24df4b9dcea1a78448b20a344ab50685effe3a33341a20ebd520c1a70ed995c996a6fd0dfb0ff
-   data.tar.gz: e0c0dd04ccacc29933a582fd05d19ebee40ce7ae420ca632e4d6f5068ae92de5928ab1fa6daf3b9bd4894f5797dde0090fad74d8bdc25a3610f648b57624aa1c
+   metadata.gz: d91d784a98fd256a0ce056c0cfa36959249c7d4cf6b194a2b97c360569ad6f3e503e57c4395722d8a83b5f163ec3b9f5187b8f28ca193de8cc83eff80cf59c2f
+   data.tar.gz: 75f26b3a4559da9776b77c718a916e7ee76715a49475066b8b1e575ba5aa9790b9c6797245784a799af211bb18c86711600ce7ce8cb1794feae4769bb442eaed
data/Gemfile.lock CHANGED
@@ -1,7 +1,7 @@
  PATH
    remote: .
    specs:
-     oddb2xml (3.0.17)
+     oddb2xml (3.0.18)
        csv
        htmlentities
        httpi
data/History.txt CHANGED
@@ -1,3 +1,7 @@
+ === 3.0.18 / 09.06.2026
+ * Bugfix: stop the FHIR downloader from crashing with "Errno::ENOENT @ rb_file_s_size ... foph-sl-export-latest-de.ndjson" when run with --skip-download (issue #121). FhirDownloader#skip_download? returned true on the bare --skip-download flag and then called File.size on a file that was never downloaded. Each oddb2xml run uses its own ./downloads dir, so deploy scripts that download once and then re-run with --skip-download in a fresh dir hit this every time. skip_download? now requires the target NDJSON to actually exist on disk before honouring the flag; a missing file falls through to a normal download instead of crashing.
+ * New: proxy / connectivity preflight check (issue #121). At the very start of a run, oddb2xml now probes the outbound hosts it needs (honouring the http(s)_proxy environment) and prints a loud warning if any host is blocked by the proxy (HTTP 407 on an allow-list proxy such as Aspectra's Skyhigh gateway) or otherwise unreachable — surfacing the cause up front instead of a later empty-output/Errno symptom. The probed host set is option-aware (e.g. id.gs1.ch only with --firstbase, epl.bag.admin.ch only with --fhir). It only warns and never aborts the run; downloads still proceed and fail individually as before. Checks run concurrently (~6s worst case) and are skipped during tests; set ODDB2XML_SKIP_PROXY_CHECK=1 to silence it.
+
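The "checks run concurrently (~6s worst case)" remark in the 3.0.18 entry can be illustrated with a minimal sketch. It assumes one thread per host and a stubbed probe: `fake_check`, `probe_all`, the `.example`/`.invalid` host names and the 0.2 s delay are hypothetical stand-ins, and nothing here touches the network. Only the `Thread.new` plus `map(&:value)` pattern follows what the release describes.

```ruby
# Hypothetical probe: sleeps instead of opening a connection, then reports
# :unreachable for hosts under the reserved .invalid TLD and :ok otherwise.
def fake_check(host, delay)
  sleep(delay)
  host.end_with?(".invalid") ? :unreachable : :ok
end

# One thread per host. Thread#value joins the thread and returns the value of
# its block, so the results come back in the same order as the hosts.
def probe_all(hosts, delay)
  hosts.map { |host| Thread.new { [host, fake_check(host, delay)] } }.map(&:value)
end

hosts = ["a.example", "b.example", "c.invalid"]
started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
results = probe_all(hosts, 0.2)
elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started

results.each { |host, status| puts format("%-12s %s", host, status) }
puts format("elapsed: %.2fs for %d probes", elapsed, hosts.size)
```

Because the probes overlap, the wall-clock time is governed by the slowest probe rather than by the sum of all of them, which is why a per-host timeout also bounds the whole preflight.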
  === 3.0.17 / 08.06.2026
  * Compatibility: relax the nokogiri requirement from ">= 1.19.3" to ">= 1.18.10" (lower bound only, no upper cap) so oddb2xml installs on Ruby 3.1 again. nokogiri 1.19.x requires Ruby >= 3.2, which made 3.0.13–3.0.16 uninstallable on Ruby 3.1.0 (the bare "gem install" crashed with the misleading RubyGems resolver error "undefined method `request' for nil:NilClass"). With a lower-bound-only floor, RubyGems still resolves the security-fixed 1.19.3 on Ruby >= 3.2, while Ruby 3.1 installs 1.18.10 (the newest 3.1-compatible release). oddb2xml uses no nokogiri 1.19-only API and runs no CSS/XSLT/C14N over untrusted input, so the 1.19.3 advisories do not apply to its usage. Note: Ruby 3.1 is EOL (since March 2025) — upgrading to Ruby >= 3.2 remains recommended.
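The effect of the relaxed lower bound in the 3.0.17 entry can be checked with RubyGems' own requirement classes. This is only a sketch of the version constraint itself: it does not model the resolver's additional filtering by each release's required Ruby version, and the "2.0.0" entry is a hypothetical future version used purely to show that there is no upper cap.

```ruby
require "rubygems" # normally preloaded; explicit so the sketch is self-contained

old_req = Gem::Requirement.new(">= 1.19.3")  # floor before the change
new_req = Gem::Requirement.new(">= 1.18.10") # relaxed floor

# "2.0.0" is a hypothetical version, included only to show the open upper end.
["1.18.10", "1.19.3", "2.0.0"].each do |v|
  version = Gem::Version.new(v)
  puts format("%-8s old: %-5s new: %s", v, old_req.satisfied_by?(version), new_req.satisfied_by?(version))
end
```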
 
data/lib/oddb2xml/cli.rb CHANGED
@@ -3,6 +3,7 @@ require "oddb2xml/downloader"
  require "oddb2xml/extractor"
  require "oddb2xml/compressor"
  require "oddb2xml/options"
+ require "oddb2xml/proxy_check"
  require "oddb2xml/util"
  require "rubyXL"
  require "date" # for today
@@ -38,6 +39,7 @@ module Oddb2xml
      def run
        threads = []
        start_time = Time.now
+       ProxyCheck.run(@options)
        files2rm = Dir.glob(File.join(DOWNLOADS, "*"))
        FileUtils.rm_f(files2rm, verbose: true) if (files2rm.size > 0) && !Oddb2xml.skip_download?
        if @options[:calc] && !(@options[:extended] || @options[:firstbase])
@@ -79,7 +79,12 @@ module Oddb2xml
      end

      def skip_download?
-       @options[:skip_download] || (File.exist?(@file2save) && file_age_hours(@file2save) < 24)
+       # Only skip when the target file actually exists on disk. The bare
+       # @options[:skip_download] flag is not enough: each oddb2xml run uses its
+       # own ./downloads dir, so a flag-only short-circuit made download_one call
+       # File.size on a missing NDJSON and crash with Errno::ENOENT (issue #121).
+       return false unless File.exist?(@file2save)
+       @options[:skip_download] || file_age_hours(@file2save) < 24
      end

      def file_age_hours(file)
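The behavioural difference in the hunk above can be reproduced in isolation. The following is a minimal sketch, not the gem's class: `SkipDemo` is a hypothetical stand-in, and its `file_age_hours` is an assumed implementation based on the file's mtime (the real method body is not part of this diff). Only the two `skip_download` expressions are taken from the removed and added lines.

```ruby
require "tmpdir"

# Hypothetical stand-in for the downloader; only the two skip expressions
# mirror the diff.
class SkipDemo
  def initialize(file2save, options = {})
    @file2save = file2save
    @options = options
  end

  # 3.0.17 logic: the bare flag short-circuits even when the file is missing.
  def skip_download_old?
    @options[:skip_download] || (File.exist?(@file2save) && file_age_hours(@file2save) < 24)
  end

  # 3.0.18 logic: a missing file always means "do not skip".
  def skip_download_new?
    return false unless File.exist?(@file2save)
    @options[:skip_download] || file_age_hours(@file2save) < 24
  end

  # Assumed implementation: age of the file in hours, based on its mtime.
  def file_age_hours(file)
    (Time.now - File.mtime(file)) / 3600.0
  end
end

Dir.mktmpdir do |dir|
  target = File.join(dir, "foph-sl-export-latest-de.ndjson")
  demo = SkipDemo.new(target, skip_download: true)
  p old: demo.skip_download_old?, new: demo.skip_download_new? # file missing
  File.write(target, "{}\n")
  p old: demo.skip_download_old?, new: demo.skip_download_new? # file present
end
```

With the flag set and no file on disk, the old expression answers "skip" while the new one does not, which is the case that previously led to `File.size` being called on a missing NDJSON.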
data/lib/oddb2xml/proxy_check.rb ADDED
@@ -0,0 +1,108 @@
+ # frozen_string_literal: true
+
+ require "net/http"
+ require "uri"
+ require "openssl"
+
+ module Oddb2xml
+   # Preflight connectivity check. Run once at the very start of a CLI run, it
+   # probes every outbound host oddb2xml needs (honouring the http(s)_proxy
+   # environment) and prints a loud warning if any host is blocked by the proxy
+   # (HTTP 407 on an allow-list proxy such as Aspectra's Skyhigh gateway) or is
+   # otherwise unreachable. It never aborts the run -- downloads still proceed and
+   # fail individually as before; this just surfaces the cause up front instead of
+   # leaving the user to decode a later Errno/empty-output symptom. See issue #121.
+   module ProxyCheck
+     module_function
+
+     # host => human-readable description of what breaks when it is unreachable.
+     # Hosts only needed for certain options are added conditionally (see #hosts_for).
+     BASE_HOSTS = {
+       "files.refdata.ch" => "Refdata articles",
+       "www.swissmedic.ch" => "Swissmedic registrations",
+       "raw.githubusercontent.com" => "ATC codes (cpp2sqlite)"
+     }.freeze
+
+     TIMEOUT = 6 # seconds, per host (open + read); checks run concurrently
+
+     def proxy_uri
+       env = ENV["https_proxy"] || ENV["HTTPS_PROXY"] || ENV["http_proxy"] || ENV["HTTP_PROXY"]
+       return nil if env.nil? || env.empty?
+       env = "http://#{env}" unless env.start_with?("http")
+       URI.parse(env)
+     rescue URI::InvalidURIError
+       nil
+     end
+
+     def hosts_for(options = {})
+       hosts = BASE_HOSTS.dup
+       hosts["epl.bag.admin.ch"] = "BAG FHIR data (--fhir)" if options[:fhir]
+       hosts["id.gs1.ch"] = "GS1 NONPHARMA (--firstbase / -b)" if options[:firstbase]
+       hosts["www.spezialitaetenliste.ch"] = "BAG Spezialitätenliste" unless options[:fhir]
+       hosts["www.medregbm.admin.ch"] = "Medizinalberuferegister (-x address)" if options[:address]
+       hosts
+     end
+
+     # Returns :ok, :blocked (proxy 407) or :unreachable for a single host.
+     def check_host(host, proxy)
+       http =
+         if proxy
+           Net::HTTP.new(host, 443, proxy.host, proxy.port, proxy.user, proxy.password)
+         else
+           Net::HTTP.new(host, 443)
+         end
+       http.use_ssl = true
+       http.verify_mode = OpenSSL::SSL::VERIFY_NONE
+       http.open_timeout = TIMEOUT
+       http.read_timeout = TIMEOUT
+       http.start do |h|
+         res = h.head("/")
+         return :blocked if res.code.to_s == "407"
+         return :ok # any HTTP answer (200/301/403/404/...) means the host is reachable
+       end
+     rescue => error
+       msg = error.message.to_s.downcase
+       return :blocked if msg.include?("407") || msg.include?("authenticationrequired") || msg.include?("proxy")
+       :unreachable
+     end
+
+     # Probe all relevant hosts concurrently and warn about any that fail.
+     def run(options = {})
+       return if defined?(RSpec) || defined?(VCR) # never touch the network in tests
+       return if ENV["ODDB2XML_SKIP_PROXY_CHECK"]
+
+       proxy = proxy_uri
+       hosts = hosts_for(options)
+       results = hosts.map do |host, desc|
+         Thread.new { [host, desc, check_host(host, proxy)] }
+       end.map(&:value)
+
+       problems = results.reject { |(_host, _desc, status)| status == :ok }
+       return if problems.empty?
+
+       warn_about(problems, proxy)
+     end
+
+     def warn_about(problems, proxy)
+       line = "=" * 72
+       $stderr.puts line
+       $stderr.puts " oddb2xml CONNECTIVITY WARNING"
+       $stderr.puts " The following hosts could not be reached -- the corresponding"
+       $stderr.puts " downloads will FAIL or produce incomplete data:"
+       problems.each do |(host, desc, status)|
+         tag = (status == :blocked) ? "BLOCKED by proxy (407)" : "UNREACHABLE "
+         $stderr.puts format(" [%s] %-26s %s", tag, host, desc)
+       end
+       if proxy
+         $stderr.puts ""
+         $stderr.puts " Proxy in use: #{proxy.host}:#{proxy.port}"
+         if problems.any? { |(_h, _d, s)| s == :blocked }
+           $stderr.puts " This looks like an allow-list proxy. Ask your admin to allow the"
+           $stderr.puts " hosts above (HTTPS/443), or set credentials in http(s)_proxy."
+         end
+       end
+       $stderr.puts " (Set ODDB2XML_SKIP_PROXY_CHECK=1 to silence this check.)"
+       $stderr.puts line
+     end
+   end
+ end
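How the proxy setting turns into the host, port and credentials handed to `Net::HTTP.new` can be seen with a small standalone sketch. `parse_proxy` is a hypothetical helper that mirrors `proxy_uri` from the new file but takes the raw value as an argument instead of reading `ENV`, so it can be tried without changing the environment; the proxy host names and credentials are made up.

```ruby
require "uri"

# Hypothetical helper mirroring proxy_uri, with the raw setting passed in.
def parse_proxy(value)
  return nil if value.nil? || value.empty?
  # A bare "host:port" gets an http:// scheme so URI can split it.
  value = "http://#{value}" unless value.start_with?("http")
  URI.parse(value)
rescue URI::InvalidURIError
  nil
end

samples = [
  "proxy.example:3128",                    # bare host:port
  "http://user:secret@proxy.example:8080", # with credentials
  "",                                      # unset / empty
  "http://bad host"                        # not a valid URI
]

samples.each do |raw|
  uri = parse_proxy(raw)
  if uri
    puts "#{raw.inspect} -> host=#{uri.host} port=#{uri.port} user=#{uri.user.inspect}"
  else
    puts "#{raw.inspect} -> no proxy"
  end
end
```

One consequence of the `start_with?("http")` test, as written, is that a scheme-less value whose host name itself begins with "http" is not given the `http://` prefix.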
@@ -1,3 +1,3 @@
  module Oddb2xml
-   VERSION = "3.0.17"
+   VERSION = "3.0.18"
  end
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: oddb2xml
  version: !ruby/object:Gem::Version
-   version: 3.0.17
+   version: 3.0.18
  platform: ruby
  authors:
  - Yasuhiro Asaka, Zeno R.R. Davatz, Niklaus Giger
@@ -483,6 +483,7 @@ files:
  - lib/oddb2xml/fhir_support.rb
  - lib/oddb2xml/options.rb
  - lib/oddb2xml/parslet_compositions.rb
+ - lib/oddb2xml/proxy_check.rb
  - lib/oddb2xml/refdata_cleanup.rb
  - lib/oddb2xml/semantic_check.rb
  - lib/oddb2xml/util.rb