oddb2xml 3.0.10 → 3.0.11
This diff shows the changes between two publicly released versions of the package, as they appear in their public registries. It is provided for informational purposes only.
- checksums.yaml +4 -4
- data/CLAUDE.md +2 -0
- data/Gemfile.lock +1 -1
- data/History.txt +3 -0
- data/README.md +1 -1
- data/lib/oddb2xml/builder.rb +12 -4
- data/lib/oddb2xml/chapter_70_hack.rb +23 -3
- data/lib/oddb2xml/version.rb +1 -1
- data/spec/artikelstamm_spec.rb +19 -1
- data/spec/data/varia_De_spa.htm +16 -0
- metadata +3 -1
checksums.yaml
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
---
|
|
2
2
|
SHA256:
|
|
3
|
-
metadata.gz:
|
|
4
|
-
data.tar.gz:
|
|
3
|
+
metadata.gz: 36ede269e855765b80eda67f9926c44dbc78c3eb4130b2e75ee780f635d0dbe3
|
|
4
|
+
data.tar.gz: 81ecf15485597e9a9eea503e63a7cc7aaf40dc3417f34ef2c3ca768092db9262
|
|
5
5
|
SHA512:
|
|
6
|
-
metadata.gz:
|
|
7
|
-
data.tar.gz:
|
|
6
|
+
metadata.gz: a305b713ab53201fa6c52d18db0446aa853df4f3986dbd53bcdb26cada8ad46770765800b7db5135bec1d0d8e9c58b4d793ca6f864dd203e5187912014a04103
|
|
7
|
+
data.tar.gz: 77abfc938a47bafc4579a429d4ff65dcace489566673754e4dedb74834fd0332f902c3aa80155b8a5aaf8140abc191fd6d4f7892092ab36ea541bf5ddff1bdcc
|
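The new digests above can be checked against a downloaded archive. The following is a minimal sketch, assuming the archive has already been unpacked from the `.gem` file; `sha256_matches?` is a hypothetical helper name and the temporary file only stands in for `data.tar.gz`.

```ruby
require "digest"
require "tempfile"

# Hypothetical helper: compare a file's SHA256 digest with the hex string
# recorded under "SHA256:" in checksums.yaml.
def sha256_matches?(path, expected_hex)
  Digest::SHA256.file(path).hexdigest == expected_hex
end

# Stand-in for data.tar.gz so the sketch runs on its own.
Tempfile.create("data.tar.gz") do |f|
  f.write("abc")
  f.flush
  expected = Digest::SHA256.hexdigest("abc")
  puts sha256_matches?(f.path, expected)  # digest of the same bytes
  puts sha256_matches?(f.path, "0" * 64)  # deliberately wrong digest
end
```

For a real check, `expected_hex` would be the `data.tar.gz` value listed under `SHA256:` for 3.0.11.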
data/CLAUDE.md
CHANGED
|
@@ -51,6 +51,8 @@ The system follows a **download → extract → build → compress** pipeline:
|
|
|
51
51
|
|
|
52
52
|
8. **Refdata cleanup** (`lib/oddb2xml/refdata_cleanup.rb`) — Compensates for known data-quality issues in upstream Refdata.Articles.xml before they reach the output. Each fix is guarded by a Swissmedic-side heuristic (e.g. comma in `substance_swissmedic` to distinguish mono products from real combinations). Currently fixes the doubled-dose template bug (`X mg / X mg / Stk`). Called from `Builder#apply_refdata_description_cleanups!` at the start of `prepare_articles`. See GitHub issue #112 for the catalogue.
|
|
53
53
|
|
|
54
|
+
9. **Chapter-70 hack** (`lib/oddb2xml/chapter_70_hack.rb`) — Legacy scraper for the SL "Komplementärarzneimittel" products (homeopathic/anthroposophic/phytotherapeutic), called only from `Builder#build_artikelstamm`. **Deprecated / non-FHIR only (3.0.11 onwards):** the source page `varia_De.htm` was rebuilt as a JavaScript SPA with no static data table, so the scraper now returns nothing there. These products + limitations now come through the FHIR feed (SL classification `20. KOMPLEMENTÄRARZNEIMITTEL`, 221 products on the live DE feed with real GTINs and limitation texts), so `build_artikelstamm` **skips the scraper entirely when `@options[:fhir]`** (the default for `--artikelstamm` since 3.0.9). In `--no-fhir` mode the scraper degrades gracefully (skips non-row/`<script>` nodes and empty tables, warns, returns `[]`) instead of raising `NoMethodError`. See GitHub issue #118.
|
|
55
|
+
|
|
54
56
|
### Key data identifiers
|
|
55
57
|
- **GTIN/EAN13**: Primary article identifier (13-digit barcode)
|
|
56
58
|
- **Pharmacode**: Swiss pharmacy code
|
data/Gemfile.lock
CHANGED
data/History.txt
CHANGED
|
@@ -1,3 +1,6 @@
|
|
|
1
|
+
=== 3.0.11 / 02.06.2026
|
|
2
|
+
* Artikelstamm: fix crash in the chapter-70 hack. The upstream source page http://www.spezialitaetenliste.ch/varia_De.htm has been rebuilt as a JavaScript single-page app and no longer ships the chapter-70 data table as static HTML, so Chapter70xtractor.parse raised NoMethodError. The chapter-70 (Komplementärarzneimittel) products and their limitations now arrive through the FHIR feed (SL classification "20. KOMPLEMENTÄRARZNEIMITTEL"), so the legacy scraper is now skipped entirely in FHIR mode (the default for --artikelstamm since 3.0.9). For --no-fhir runs the scraper degrades gracefully: it skips non-row nodes and an empty/missing table, logs a warning and returns no items instead of crashing (issue #118).
|
|
3
|
+
|
|
1
4
|
=== 3.0.10 / 01.06.2026
|
|
2
5
|
* FHIR: read the BAG Indikationscode (XXXXX.NN) from the explicit `indicationCode` extension now carried on each limitation (BAG SL FHIR export >= v2.0.5) instead of reconstructing it from FOPHDossierNumber + the ClinicalUseDefinition id suffix. The changelog states the limitation code (CUD id) and the indication code are independent fields, so the old derivation was no longer guaranteed correct. Falls back to the previous derivation for older feeds that lack the extension. Output is identical on the current live feed. Other changelog items (limitation text moved to ClinicalUseDefinition, sanitized CUD ids, un-truncated `<` in texts, deduplicated/ordered Ingredients) were already handled or are transparent upstream fixes.
|
|
3
6
|
|
data/README.md
CHANGED
|
@@ -51,7 +51,7 @@ HIN (http://hin.ch) creates daily the actual file. They can be downloaded from `
|
|
|
51
51
|
see `--help`.
|
|
52
52
|
|
|
53
53
|
```
|
|
54
|
-
/opt/src/oddb2xml/bin/oddb2xml version 3.0.
|
|
54
|
+
/opt/src/oddb2xml/bin/oddb2xml version 3.0.11
|
|
55
55
|
Usage:
|
|
56
56
|
oddb2xml [option]
|
|
57
57
|
produced files are found under data
|
data/lib/oddb2xml/builder.rb
CHANGED
|
@@ -1742,7 +1742,7 @@ module Oddb2xml
|
|
|
1742
1742
|
# Set the pharmatype to 'Y' for outdated products, which are no longer found
|
|
1743
1743
|
# in refdata/packungen
|
|
1744
1744
|
chap70 = nil
|
|
1745
|
-
if @chapter70items
|
|
1745
|
+
if @chapter70items&.values&.find { |x| x[:pharmacode]&.eql?(obj[:pharmacode]) }
|
|
1746
1746
|
Oddb2xml.log "found chapter #{obj[:pharmacode]}" if $VERBOSE
|
|
1747
1747
|
chap70 = true
|
|
1748
1748
|
end
|
|
@@ -1804,9 +1804,17 @@ module Oddb2xml
|
|
|
1804
1804
|
end
|
|
1805
1805
|
end
|
|
1806
1806
|
unless @prepared
|
|
1807
|
-
|
|
1808
|
-
|
|
1809
|
-
|
|
1807
|
+
# The chapter-70 ("Komplementärarzneimittel") products and their
|
|
1808
|
+
# limitations now arrive through the FHIR feed (SL classification
|
|
1809
|
+
# "20. KOMPLEMENTÄRARZNEIMITTEL"), so the legacy varia_De.htm scraper
|
|
1810
|
+
# is redundant in FHIR mode. The varia page itself has been rebuilt as
|
|
1811
|
+
# a JavaScript SPA that no longer ships the data table, so the scraper
|
|
1812
|
+
# can only fail there now. See GitHub issue #118.
|
|
1813
|
+
unless @options[:fhir]
|
|
1814
|
+
require "oddb2xml/chapter_70_hack"
|
|
1815
|
+
Oddb2xml::Chapter70xtractor.parse
|
|
1816
|
+
@chapter70items = Oddb2xml::Chapter70xtractor.items
|
|
1817
|
+
end
|
|
1810
1818
|
prepare_limitations
|
|
1811
1819
|
prepare_articles
|
|
1812
1820
|
prepare_products
|
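The new condition at line 1745 relies on safe navigation (`&.`), so a `nil` collection counts as "no match" rather than raising. A minimal sketch of that behaviour follows, assuming `@chapter70items` stays `nil` when the scraper is skipped in FHIR mode; `chapter70?` and the sample hash are hypothetical stand-ins, not the gem's real data.

```ruby
# Hypothetical helper mirroring the new guard: nil-safe on the collection
# and on each entry's pharmacode.
def chapter70?(chapter70items, obj)
  !!chapter70items&.values&.find { |x| x[:pharmacode]&.eql?(obj[:pharmacode]) }
end

# Illustrative items hash keyed by pharmacode (shape assumed for this sketch).
items = {"2069562" => {pharmacode: "2069562", description: "Urtinktur"}}

p chapter70?(nil, {pharmacode: "2069562"})    # scraper skipped, nothing collected
p chapter70?(items, {pharmacode: "2069562"})  # pharmacode present in the items
p chapter70?(items, {pharmacode: "9999999"})  # pharmacode absent from the items
p chapter70?(items, {pharmacode: nil})        # article without a pharmacode
```

Only the second call finds a match; the others fall through without raising, which is what lets `chap70` remain `nil` for articles outside chapter 70.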
data/lib/oddb2xml/chapter_70_hack.rb
CHANGED
|
@@ -39,12 +39,31 @@ module Oddb2xml
|
|
|
39
39
|
effort: :tolerant,
|
|
40
40
|
smart: true
|
|
41
41
|
}
|
|
42
|
-
|
|
42
|
+
parsed = Ox.load(Oddb2xml.uri_open(html_file).read, mode: :hash_no_attrs)
|
|
43
|
+
res = parsed.values.first["body"] if parsed.respond_to?(:values) && parsed.values.first.is_a?(Hash)
|
|
43
44
|
result = []
|
|
44
45
|
idx = 0
|
|
45
46
|
@@items = {}
|
|
46
|
-
res.values
|
|
47
|
-
|
|
47
|
+
unless res.respond_to?(:values)
|
|
48
|
+
warn "Chapter70: varia page has no <body> to parse (got #{res.class}); skipping"
|
|
49
|
+
return []
|
|
50
|
+
end
|
|
51
|
+
# The varia page used to expose the chapter-70 table as static HTML. It
|
|
52
|
+
# is now a JavaScript single-page app whose <body> only contains an empty
|
|
53
|
+
# <sl-root> shell, so there is no data table to walk. Each entry yielded
|
|
54
|
+
# by iterating a Hash is a [tag, content] pair; a real row carries a Hash
|
|
55
|
+
# of cells, while stray nodes (e.g. <script>) carry an Array. Skip the
|
|
56
|
+
# latter so a redesigned/empty page degrades to "no items" instead of
|
|
57
|
+
# raising NoMethodError. See GitHub issue #118.
|
|
58
|
+
rows = res.values.last
|
|
59
|
+
unless rows.respond_to?(:each)
|
|
60
|
+
warn "Chapter70: varia page has no parseable rows (got #{rows.class}); skipping"
|
|
61
|
+
return []
|
|
62
|
+
end
|
|
63
|
+
rows.each do |item|
|
|
64
|
+
cells = item.is_a?(Hash) ? item.values.first : nil
|
|
65
|
+
next unless cells.respond_to?(:each)
|
|
66
|
+
cells.each do |sub_elem|
|
|
48
67
|
what = Chapter70xtractor.parse_td(sub_elem)
|
|
49
68
|
idx += 1
|
|
50
69
|
puts "#{idx}: xx #{what}" if $VERBOSE
|
|
@@ -52,6 +71,7 @@ module Oddb2xml
|
|
|
52
71
|
end
|
|
53
72
|
end
|
|
54
73
|
result2 = result.find_all { |x| (x.is_a?(Array) && x.first.is_a?(String)) && x.first.to_i > 100 }
|
|
74
|
+
warn "Chapter70: varia page yielded no chapter-70 products; skipping" if result2.empty?
|
|
55
75
|
result2.each do |entry|
|
|
56
76
|
data = {}
|
|
57
77
|
pharma_code = entry.first
|
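The guards added to `Chapter70xtractor.parse` can be illustrated in isolation. This is a minimal sketch under stated assumptions: `collect_cells` is a hypothetical helper, and the hashes below are illustrative stand-ins for the parsed `<body>`, not actual Ox output for the real page.

```ruby
# Hypothetical helper reproducing the three guards from the hunk above:
# no body hash, no iterable rows, and non-Hash entries among the rows.
def collect_cells(body)
  return [] unless body.respond_to?(:values)  # no <body> hash at all
  rows = body.values.last
  return [] unless rows.respond_to?(:each)    # e.g. nil or a bare String
  collected = []
  rows.each do |item|
    # Only Hash entries are treated as rows; anything else is skipped.
    cells = item.is_a?(Hash) ? item.values.first : nil
    next unless cells.respond_to?(:each)
    cells.each { |cell| collected << cell }
  end
  collected
end

# Assumed shape of a body that still carries a static table.
table_like = {"table" => [{"tr" => ["2069562", "70.01.10"]}, {"tr" => ["6516727", "70.02"]}]}
# Assumed shape of the SPA shell: an empty root element and script nodes.
spa_like = {"sl-root" => nil, "script" => [nil, nil, nil]}

p collect_cells(table_like)
p collect_cells(spa_like)
p collect_cells(nil)
```

The table-like input yields its four cell values, while the SPA-like and `nil` inputs both yield an empty array, matching the "returns no items instead of crashing" behaviour described in History.txt.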
data/lib/oddb2xml/version.rb
CHANGED
data/spec/artikelstamm_spec.rb
CHANGED
|
@@ -38,7 +38,11 @@ describe Oddb2xml::Builder do
|
|
|
38
38
|
context "when artikelstamm option is given" do
|
|
39
39
|
before(:all) do
|
|
40
40
|
common_run_init
|
|
41
|
-
|
|
41
|
+
# --no-fhir: these fixtures (chapter-70 fake GTINs, BAG-XML preparations)
|
|
42
|
+
# exercise the legacy pipeline. Since 3.0.9 --artikelstamm defaults to the
|
|
43
|
+
# FHIR feed, which has no VCR cassette here and skips the chapter-70 hack
|
|
44
|
+
# (issue #118), so the suite must opt out to keep testing legacy output.
|
|
45
|
+
options = Oddb2xml::Options.parse(["--artikelstamm", "--no-fhir"]) # , '--log'])
|
|
42
46
|
# @res = buildr_capture(:stdout){ Oddb2xml::Cli.new(options).run }
|
|
43
47
|
Oddb2xml::Cli.new(options).run # to debug
|
|
44
48
|
@artikelstamm_name = File.join(Oddb2xml::WORK_DIR, "artikelstamm_#{Date.today.strftime("%d%m%Y")}_v5.xml")
|
|
@@ -503,10 +507,24 @@ Der behandelnde Arzt ist verpflichtet, die erforderlichen Daten laufend im vorge
|
|
|
503
507
|
end
|
|
504
508
|
it "parsing" do
|
|
505
509
|
require "oddb2xml/chapter_70_hack"
|
|
510
|
+
# Stub explicitly so this test is independent of example ordering
|
|
511
|
+
# (the SPA test below re-stubs the same URL).
|
|
512
|
+
url = "http://www.spezialitaetenliste.ch/varia_De.htm"
|
|
513
|
+
stub_request(:any, url).to_return(body: File.read(File.join(Oddb2xml::SpecData, "varia_De.htm")))
|
|
506
514
|
result = Oddb2xml::Chapter70xtractor.parse
|
|
507
515
|
expect(result.class).to eq Array
|
|
508
516
|
expect(result.first).to eq ["2069562", "70.01.10", "Urtinktur", "1--10 g/ml", "13.40", ""]
|
|
509
517
|
expect(result.last).to eq ["6516727", "70.02", "Allergenorum extractum varium / Inj. Susp. \n\tFortsetzungsbehandlung", "1 Durchstfl 1.5 ml", "311.85", "L"]
|
|
510
518
|
end
|
|
519
|
+
it "degrades gracefully when varia page is a JavaScript SPA (issue #118)" do
|
|
520
|
+
require "oddb2xml/chapter_70_hack"
|
|
521
|
+
spa = File.read(File.join(Oddb2xml::SpecData, "varia_De_spa.htm"))
|
|
522
|
+
url = "http://www.spezialitaetenliste.ch/varia_De.htm"
|
|
523
|
+
stub_request(:any, url).to_return(body: spa)
|
|
524
|
+
result = nil
|
|
525
|
+
expect { result = Oddb2xml::Chapter70xtractor.parse }.not_to raise_error
|
|
526
|
+
expect(result).to eq []
|
|
527
|
+
expect(Oddb2xml::Chapter70xtractor.items).to eq({})
|
|
528
|
+
end
|
|
511
529
|
end
|
|
512
530
|
end
|
data/spec/data/varia_De_spa.htm
ADDED
|
@@ -0,0 +1,16 @@
|
|
|
1
|
+
<!doctype html>
|
|
2
|
+
<html lang="en">
|
|
3
|
+
<head>
|
|
4
|
+
<meta charset="utf-8">
|
|
5
|
+
<title>BAG SL</title>
|
|
6
|
+
<base href="/">
|
|
7
|
+
</head>
|
|
8
|
+
<body>
|
|
9
|
+
<noscript><div><section><strong>JavaScript ist erforderlich!</strong></section></div></noscript>
|
|
10
|
+
<div><div><section><strong>Nicht unterstützter Browser!</strong></section></div></div>
|
|
11
|
+
<sl-root></sl-root>
|
|
12
|
+
<script src="polyfills-HE33VVSG.js" type="module"></script>
|
|
13
|
+
<script src="scripts-ALGZMJC4.js" defer></script>
|
|
14
|
+
<script src="main-DRAKRIKO.js" type="module"></script>
|
|
15
|
+
</body>
|
|
16
|
+
</html>
|
metadata
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
|
2
2
|
name: oddb2xml
|
|
3
3
|
version: !ruby/object:Gem::Version
|
|
4
|
-
version: 3.0.
|
|
4
|
+
version: 3.0.11
|
|
5
5
|
platform: ruby
|
|
6
6
|
authors:
|
|
7
7
|
- Yasuhiro Asaka, Zeno R.R. Davatz, Niklaus Giger
|
|
@@ -547,6 +547,7 @@ files:
|
|
|
547
547
|
- spec/data/v5_first.xml
|
|
548
548
|
- spec/data/v5_second.xml
|
|
549
549
|
- spec/data/varia_De.htm
|
|
550
|
+
- spec/data/varia_De_spa.htm
|
|
550
551
|
- spec/data/vcr/transfer.dat
|
|
551
552
|
- spec/data/vcr/transfer.zip
|
|
552
553
|
- spec/data/wsdl_nonpharma.xml
|
|
@@ -644,6 +645,7 @@ test_files:
|
|
|
644
645
|
- spec/data/v5_first.xml
|
|
645
646
|
- spec/data/v5_second.xml
|
|
646
647
|
- spec/data/varia_De.htm
|
|
648
|
+
- spec/data/varia_De_spa.htm
|
|
647
649
|
- spec/data/vcr/transfer.dat
|
|
648
650
|
- spec/data/vcr/transfer.zip
|
|
649
651
|
- spec/data/wsdl_nonpharma.xml
|