oddb2xml 3.0.17 → 3.0.19

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: df4185a65a50fc628ba92bc98fb8ac378563d5653882e3e34b54d1c1265e3808
-   data.tar.gz: a62c0d6b0b643458e9885c6ba7d8c1d54c1435696a3c73da2788fd22539a9915
+   metadata.gz: '081c86c751d4c29fa2ce616abdb6043707e50fab79e3b74a67aab568b2084519'
+   data.tar.gz: e3c91770aa8ccd4de0714c45f5a20eb7f87fc989ea84932a8e4cf1a5a3813a7c
  SHA512:
-   metadata.gz: e3718c1c76fbced6bb1ce43350ae13b42fbf747e2dce91980be24df4b9dcea1a78448b20a344ab50685effe3a33341a20ebd520c1a70ed995c996a6fd0dfb0ff
-   data.tar.gz: e0c0dd04ccacc29933a582fd05d19ebee40ce7ae420ca632e4d6f5068ae92de5928ab1fa6daf3b9bd4894f5797dde0090fad74d8bdc25a3610f648b57624aa1c
+   metadata.gz: b8a928d127496fb06cba79dc630c19bd5edc3b09dd3df0c0dda6ca058cd1897643818d4a8aa9e2ab125023d0ebb319dddee2ab1be0fcdabc8e4708299a136dee
+   data.tar.gz: 27c6ef5bcb9fa0df5b7861496468a88b416209a8b83e7c08e3a995407a75944f0d81849d75399254b7ee404787128ed9e4742ef5f6a78107d90f895845afe707
data/Gemfile.lock CHANGED
@@ -1,7 +1,7 @@
  PATH
    remote: .
    specs:
-     oddb2xml (3.0.17)
+     oddb2xml (3.0.19)
        csv
        htmlentities
        httpi
data/History.txt CHANGED
@@ -1,3 +1,10 @@
+ === 3.0.19 / 09.06.2026
+ * New option --proxy-check: probe connectivity/proxy reachability for every host oddb2xml could need, print a full OK/BLOCKED/UNREACHABLE report (honouring http(s)_proxy) and exit without downloading or building. Exits 0 if all hosts are reachable, 1 otherwise — handy for cron/deploy preflight on allow-list proxies (e.g. "oddb2xml --proxy-check"). Reuses the 3.0.18 proxy checker; checks run concurrently.
+
+ === 3.0.18 / 09.06.2026
+ * Bugfix: stop the FHIR downloader from crashing with "Errno::ENOENT @ rb_file_s_size ... foph-sl-export-latest-de.ndjson" when run with --skip-download (issue #121). FhirDownloader#skip_download? returned true on the bare --skip-download flag and then called File.size on a file that was never downloaded. Each oddb2xml run uses its own ./downloads dir, so deploy scripts that download once and then re-run with --skip-download in a fresh dir hit this every time. skip_download? now requires the target NDJSON to actually exist on disk before honouring the flag; a missing file falls through to a normal download instead of crashing.
+ * New: proxy / connectivity preflight check (issue #121). At the very start of a run, oddb2xml now probes the outbound hosts it needs (honouring the http(s)_proxy environment) and prints a loud warning if any host is blocked by the proxy (HTTP 407 on an allow-list proxy such as Aspectra's Skyhigh gateway) or otherwise unreachable — surfacing the cause up front instead of a later empty-output/Errno symptom. The probed host set is option-aware (e.g. id.gs1.ch only with --firstbase, epl.bag.admin.ch only with --fhir). It only warns and never aborts the run; downloads still proceed and fail individually as before. Checks run concurrently (~6s worst case) and are skipped during tests; set ODDB2XML_SKIP_PROXY_CHECK=1 to silence it.
+
  === 3.0.17 / 08.06.2026
  * Compatibility: relax the nokogiri requirement from ">= 1.19.3" to ">= 1.18.10" (lower bound only, no upper cap) so oddb2xml installs on Ruby 3.1 again. nokogiri 1.19.x requires Ruby >= 3.2, which made 3.0.13–3.0.16 uninstallable on Ruby 3.1.0 (the bare "gem install" crashed with the misleading RubyGems resolver error "undefined method `request' for nil:NilClass"). With a lower-bound-only floor, RubyGems still resolves the security-fixed 1.19.3 on Ruby >= 3.2, while Ruby 3.1 installs 1.18.10 (the newest 3.1-compatible release). oddb2xml uses no nokogiri 1.19-only API and runs no CSS/XSLT/C14N over untrusted input, so the 1.19.3 advisories do not apply to its usage. Note: Ruby 3.1 is EOL (since March 2025) — upgrading to Ruby >= 3.2 remains recommended.
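The --proxy-check behaviour described in the 3.0.19 entry (concurrent probes, a report sorted by host, exit status 0 or 1) can be sketched roughly as follows. This is an illustration under stated assumptions, not the gem's code: `probe_all`, `exit_code_for` and the injected `checker` are hypothetical names, and the checker is stubbed so that nothing touches the network.

```ruby
# Minimal sketch of the fan-out: one thread per host, joined via Thread#value.
# `checker` is a hypothetical injected callable returning :ok, :blocked or
# :unreachable; the gem performs a real HTTPS HEAD request instead.
def probe_all(hosts, checker)
  hosts.map { |host, desc| Thread.new { [host, desc, checker.call(host)] } }
    .map(&:value)
    .sort_by { |(host, _desc, _status)| host }
end

# Maps the collected statuses to a process exit status:
# 0 when every host answered, 1 as soon as one did not.
def exit_code_for(results)
  results.all? { |(_host, _desc, status)| status == :ok } ? 0 : 1
end

hosts = {
  "files.refdata.ch" => "Refdata articles",
  "id.gs1.ch" => "GS1 NONPHARMA (--firstbase / -b)"
}
# Stubbed checker (assumption): pretend the proxy rejects id.gs1.ch with HTTP 407.
stub = ->(host) { (host == "id.gs1.ch") ? :blocked : :ok }

results = probe_all(hosts, stub)
results.each { |(host, desc, status)| puts format("[%s] %-20s %s", status, host, desc) }
puts "exit code: #{exit_code_for(results)}"
```

Because each probe runs in its own thread, the total wait is bounded by the slowest host rather than by the sum of all timeouts, which matches the "~6s worst case" mentioned in the 3.0.18 entry.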
data/lib/oddb2xml/cli.rb CHANGED
@@ -3,6 +3,7 @@ require "oddb2xml/downloader"
  require "oddb2xml/extractor"
  require "oddb2xml/compressor"
  require "oddb2xml/options"
+ require "oddb2xml/proxy_check"
  require "oddb2xml/util"
  require "rubyXL"
  require "date" # for today
@@ -38,6 +39,12 @@ module Oddb2xml
  def run
    threads = []
    start_time = Time.now
+   if @options[:proxy_check]
+     ok = ProxyCheck.report(@options)
+     exit(ok ? 0 : 1) unless defined?(RSpec)
+     return ok
+   end
+   ProxyCheck.run(@options)
    files2rm = Dir.glob(File.join(DOWNLOADS, "*"))
    FileUtils.rm_f(files2rm, verbose: true) if (files2rm.size > 0) && !Oddb2xml.skip_download?
    if @options[:calc] && !(@options[:extended] || @options[:firstbase])
@@ -79,7 +79,12 @@ module Oddb2xml
  end

  def skip_download?
-   @options[:skip_download] || (File.exist?(@file2save) && file_age_hours(@file2save) < 24)
+   # Only skip when the target file actually exists on disk. The bare
+   # @options[:skip_download] flag is not enough: each oddb2xml run uses its
+   # own ./downloads dir, so a flag-only short-circuit made download_one call
+   # File.size on a missing NDJSON and crash with Errno::ENOENT (issue #121).
+   return false unless File.exist?(@file2save)
+   @options[:skip_download] || file_age_hours(@file2save) < 24
  end

  def file_age_hours(file)
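To make the effect of the skip_download? change concrete, here is a small standalone sketch that compares the old and new predicates on a file that does not exist yet. The method names `skip_download_before?` and `skip_download_after?` are hypothetical stand-ins for the two versions of the method, and the body of `file_age_hours` is an assumption, since the diff shows only its signature.

```ruby
require "tmpdir"

# Assumed implementation: age of the file in hours, based on its mtime.
def file_age_hours(file)
  (Time.now - File.mtime(file)) / 3600.0
end

# Pre-3.0.18 logic: the flag alone short-circuits, even for a missing file.
def skip_download_before?(options, file2save)
  options[:skip_download] || (File.exist?(file2save) && file_age_hours(file2save) < 24)
end

# 3.0.18 logic: a missing file never counts as "already downloaded".
def skip_download_after?(options, file2save)
  return false unless File.exist?(file2save)
  options[:skip_download] || file_age_hours(file2save) < 24
end

Dir.mktmpdir do |dir|
  target = File.join(dir, "foph-sl-export-latest-de.ndjson")
  opts = {skip_download: true}

  puts "before, file missing: #{skip_download_before?(opts, target)}"
  puts "after,  file missing: #{skip_download_after?(opts, target)}"

  File.write(target, "{}\n")
  puts "after,  file present: #{skip_download_after?(opts, target)}"
end
```

With the old predicate the caller is told to skip and then goes on to read a file that was never written, which is where the Errno::ENOENT described in the changelog came from. The new predicate falls through to a normal download in that case.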
data/lib/oddb2xml/options.rb CHANGED
@@ -46,6 +46,7 @@ module Oddb2xml
    opt :use_ra11zip, "Use the ra11.zip (a zipped transfer.dat from Galexis)",
      default: File.exist?("ra11.zip") ? "ra11.zip" : nil, type: :string
    opt :firstbase, "Build all NONPHARMA articles on firstbase (GS1 Switzerland CSV from id.gs1.ch)", short: "b", default: false
+   opt :proxy_check, "Only probe connectivity/proxy reachability for every required host, print a report and exit (no download/build). Honours http(s)_proxy. Exits 0 if all reachable, 1 otherwise.", short: :none, default: false
  end

  @opts[:percent] = @opts[:increment]
data/lib/oddb2xml/proxy_check.rb ADDED
@@ -0,0 +1,148 @@
+ # frozen_string_literal: true
+
+ require "net/http"
+ require "uri"
+ require "openssl"
+
+ module Oddb2xml
+   # Preflight connectivity check. Run once at the very start of a CLI run, it
+   # probes every outbound host oddb2xml needs (honouring the http(s)_proxy
+   # environment) and prints a loud warning if any host is blocked by the proxy
+   # (HTTP 407 on an allow-list proxy such as Aspectra's Skyhigh gateway) or is
+   # otherwise unreachable. It never aborts the run -- downloads still proceed and
+   # fail individually as before; this just surfaces the cause up front instead of
+   # leaving the user to decode a later Errno/empty-output symptom. See issue #121.
+   module ProxyCheck
+     module_function
+
+     # host => human-readable description of what breaks when it is unreachable.
+     # Hosts only needed for certain options are added conditionally (see #hosts_for).
+     BASE_HOSTS = {
+       "files.refdata.ch" => "Refdata articles",
+       "www.swissmedic.ch" => "Swissmedic registrations",
+       "raw.githubusercontent.com" => "ATC codes (cpp2sqlite)"
+     }.freeze
+
+     TIMEOUT = 6 # seconds, per host (open + read); checks run concurrently
+
+     def proxy_uri
+       env = ENV["https_proxy"] || ENV["HTTPS_PROXY"] || ENV["http_proxy"] || ENV["HTTP_PROXY"]
+       return nil if env.nil? || env.empty?
+       env = "http://#{env}" unless env.start_with?("http")
+       URI.parse(env)
+     rescue URI::InvalidURIError
+       nil
+     end
+
+     def hosts_for(options = {})
+       hosts = BASE_HOSTS.dup
+       hosts["epl.bag.admin.ch"] = "BAG FHIR data (--fhir)" if options[:fhir]
+       hosts["id.gs1.ch"] = "GS1 NONPHARMA (--firstbase / -b)" if options[:firstbase]
+       hosts["www.spezialitaetenliste.ch"] = "BAG Spezialitätenliste" unless options[:fhir]
+       hosts["www.medregbm.admin.ch"] = "Medizinalberuferegister (-x address)" if options[:address]
+       hosts
+     end
+
+     # Full union of every host any run could need, regardless of options.
+     # Used by --proxy-check so the report covers everything in one go.
+     def all_hosts
+       BASE_HOSTS.merge(
+         "epl.bag.admin.ch" => "BAG FHIR data (--fhir)",
+         "id.gs1.ch" => "GS1 NONPHARMA (--firstbase / -b)",
+         "www.spezialitaetenliste.ch" => "BAG Spezialitätenliste",
+         "www.medregbm.admin.ch" => "Medizinalberuferegister (-x address)"
+       )
+     end
+
+     # Probe every host and print a full OK/BLOCKED/UNREACHABLE table.
+     # Returns true when all hosts are reachable. Used by `oddb2xml --proxy-check`.
+     def report(_options = {})
+       proxy = proxy_uri
+       results = all_hosts.map do |host, desc|
+         Thread.new { [host, desc, check_host(host, proxy)] }
+       end.map(&:value).sort_by { |(host, _desc, _status)| host }
+
+       header = "oddb2xml connectivity check"
+       header += proxy ? " (via proxy #{proxy.host}:#{proxy.port})" : " (no proxy configured)"
+       puts header
+       results.each do |(host, desc, status)|
+         tag = case status
+         when :ok then "OK "
+         when :blocked then "BLOCKED" # proxy returned 407
+         else "UNREACH"
+         end
+         puts format(" [%s] %-28s %s", tag, host, desc)
+       end
+       unreachable = results.reject { |(_host, _desc, status)| status == :ok }
+       if unreachable.empty?
+         puts "All #{results.size} hosts reachable."
+         true
+       else
+         puts "#{unreachable.size} of #{results.size} host(s) NOT reachable -- downloads using them will fail."
+         false
+       end
+     end
+
+     # Returns :ok, :blocked (proxy 407) or :unreachable for a single host.
+     def check_host(host, proxy)
+       http =
+         if proxy
+           Net::HTTP.new(host, 443, proxy.host, proxy.port, proxy.user, proxy.password)
+         else
+           Net::HTTP.new(host, 443)
+         end
+       http.use_ssl = true
+       http.verify_mode = OpenSSL::SSL::VERIFY_NONE
+       http.open_timeout = TIMEOUT
+       http.read_timeout = TIMEOUT
+       http.start do |h|
+         res = h.head("/")
+         return :blocked if res.code.to_s == "407"
+         return :ok # any HTTP answer (200/301/403/404/...) means the host is reachable
+       end
+     rescue => error
+       msg = error.message.to_s.downcase
+       return :blocked if msg.include?("407") || msg.include?("authenticationrequired") || msg.include?("proxy")
+       :unreachable
+     end
+
+     # Probe all relevant hosts concurrently and warn about any that fail.
+     def run(options = {})
+       return if defined?(RSpec) || defined?(VCR) # never touch the network in tests
+       return if ENV["ODDB2XML_SKIP_PROXY_CHECK"]
+
+       proxy = proxy_uri
+       hosts = hosts_for(options)
+       results = hosts.map do |host, desc|
+         Thread.new { [host, desc, check_host(host, proxy)] }
+       end.map(&:value)
+
+       problems = results.reject { |(_host, _desc, status)| status == :ok }
+       return if problems.empty?
+
+       warn_about(problems, proxy)
+     end
+
+     def warn_about(problems, proxy)
+       line = "=" * 72
+       warn line
+       warn " oddb2xml CONNECTIVITY WARNING"
+       warn " The following hosts could not be reached -- the corresponding"
+       warn " downloads will FAIL or produce incomplete data:"
+       problems.each do |(host, desc, status)|
+         tag = (status == :blocked) ? "BLOCKED by proxy (407)" : "UNREACHABLE "
+         warn format(" [%s] %-26s %s", tag, host, desc)
+       end
+       if proxy
+         warn ""
+         warn " Proxy in use: #{proxy.host}:#{proxy.port}"
+         if problems.any? { |(_h, _d, s)| s == :blocked }
+           warn " This looks like an allow-list proxy. Ask your admin to allow the"
+           warn " hosts above (HTTPS/443), or set credentials in http(s)_proxy."
+         end
+       end
+       warn " (Set ODDB2XML_SKIP_PROXY_CHECK=1 to silence this check.)"
+       warn line
+     end
+   end
+ end
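The proxy lookup in proxy_uri is easier to reason about when the environment is passed in explicitly. The sketch below mirrors the precedence and scheme handling shown in the diff. `proxy_uri_from` and its `env` parameter are hypothetical (the gem reads ENV directly), and the proxy host names are made-up examples.

```ruby
require "uri"

# Same precedence as the diff: https_proxy, HTTPS_PROXY, http_proxy, HTTP_PROXY.
# `env` is any Hash-like object; injecting it is an assumption made for testability.
def proxy_uri_from(env)
  raw = env["https_proxy"] || env["HTTPS_PROXY"] || env["http_proxy"] || env["HTTP_PROXY"]
  return nil if raw.nil? || raw.empty?
  # A bare "host:port" value gets an http:// scheme so that URI.parse sees a host.
  raw = "http://#{raw}" unless raw.start_with?("http")
  URI.parse(raw)
rescue URI::InvalidURIError
  nil
end

bare = proxy_uri_from("http_proxy" => "proxy.example.org:3128")
puts "#{bare.host}:#{bare.port}"

with_auth = proxy_uri_from("HTTPS_PROXY" => "http://alice:secret@gw.example.org:8080")
puts "#{with_auth.user}@#{with_auth.host}:#{with_auth.port}"

# An empty https_proxy is still truthy in Ruby, so it wins the || chain and
# the lookup returns nil even though http_proxy is set.
p proxy_uri_from("https_proxy" => "", "http_proxy" => "proxy.example.org:3128")
```

The last call illustrates a consequence of the `||` chain as written: an exported but empty https_proxy masks any lower-priority variable, so the check then runs as if no proxy were configured.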
@@ -1,3 +1,3 @@
  module Oddb2xml
-   VERSION = "3.0.17"
+   VERSION = "3.0.19"
  end
data/spec/options_spec.rb CHANGED
@@ -19,6 +19,7 @@ Oddb2xml::DEFAULT_OPTS = {
    firstbase: false,
    fhir: false,
    fhir_url: nil,
+   proxy_check: false,
  }

  describe Oddb2xml::Options do
@@ -75,6 +76,7 @@ describe Oddb2xml::Options do
    expected[:nonpharma] = true
    expected[:calc] = true
    expected[:price] = :zurrose
+   expected[:fhir] = true
    specify { expect(test_opts).to eq expected }
  end

@@ -86,6 +88,7 @@ describe Oddb2xml::Options do
    expected[:calc] = true
    expected[:price] = :zurrose
    expected[:percent] = 80
+   expected[:fhir] = true
    specify { expect(test_opts).to eq expected }
  end

@@ -165,6 +168,7 @@ describe Oddb2xml::Options do
    expected[:price] = :zurrose
    expected[:extended] = true
    expected[:artikelstamm] = true
+   expected[:fhir] = true
    specify { expect(test_opts).to eq expected }
  end

metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: oddb2xml
  version: !ruby/object:Gem::Version
-   version: 3.0.17
+   version: 3.0.19
  platform: ruby
  authors:
  - Yasuhiro Asaka, Zeno R.R. Davatz, Niklaus Giger
@@ -483,6 +483,7 @@ files:
  - lib/oddb2xml/fhir_support.rb
  - lib/oddb2xml/options.rb
  - lib/oddb2xml/parslet_compositions.rb
+ - lib/oddb2xml/proxy_check.rb
  - lib/oddb2xml/refdata_cleanup.rb
  - lib/oddb2xml/semantic_check.rb
  - lib/oddb2xml/util.rb