oddb2xml 2.9.9 → 3.0.1
- checksums.yaml +4 -4
- data/CLAUDE.md +69 -0
- data/Gemfile +3 -2
- data/Gemfile.lock +66 -64
- data/gemset.nix +103 -72
- data/lib/oddb2xml/cli.rb +23 -3
- data/lib/oddb2xml/compressor.rb +1 -3
- data/lib/oddb2xml/fhir_support.rb +752 -0
- data/lib/oddb2xml/options.rb +6 -0
- data/lib/oddb2xml/version.rb +1 -1
- data/lib/oddb2xml.rb +10 -0
- data/oddb2xml.gemspec +8 -8
- data/spec/downloader_spec.rb +1 -1
- metadata +28 -26
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: c0521593bc12b423a74628f04973dcaf43241ce674c415c4e3a82da3ad34a134
+  data.tar.gz: a7b8fe9076dd737105696b525e3c4c0e17e3174ea4d86d0bab9f948324229c05
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 1f0c6275adc2b60930eb961b1745013f81c167a49c035347c49b40af8f59ead028f884981aeb9aa0ce809b7007a1b4c78993d2adfa5c3e31e3ed56e8866ac035
+  data.tar.gz: c02f68d34884adbabe1465cf7180e6ef51cbb84049a71a931880bcb383abb4c3dc8196f92e21932b3529a68e5211f546fb053112f27037434313517de4d22961
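For readers who want to check a downloaded gem against the values in checksums.yaml, the following is a minimal sketch of how such digests can be computed with Ruby's standard `digest` library. The helper names `member_digests` and `digest_matches?` are hypothetical and not part of oddb2xml or RubyGems; the sketch assumes the bytes of `metadata.gz` or `data.tar.gz` have already been read from the unpacked `.gem` archive.

```ruby
require "digest/sha2"

# Hypothetical helper: compute both digests for the raw bytes of one
# gem member (for example metadata.gz), keyed like checksums.yaml.
def member_digests(bytes)
  {
    "SHA256" => Digest::SHA256.hexdigest(bytes),
    "SHA512" => Digest::SHA512.hexdigest(bytes)
  }
end

# Hypothetical helper: compare a computed digest with the recorded one.
def digest_matches?(bytes, algorithm, recorded_hex)
  member_digests(bytes).fetch(algorithm) == recorded_hex.strip.downcase
end
```

In practice the bytes would come from something like `File.binread("data.tar.gz")`, and the recorded value would be copied from the `+` lines of the checksums.yaml hunk.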
data/CLAUDE.md
ADDED
@@ -0,0 +1,69 @@
+# CLAUDE.md
+
+This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
+## Project Overview
+
+oddb2xml is a Ruby gem that downloads Swiss pharmaceutical data from 10+ sources (Swissmedic, BAG, Refdata, ZurRose, EPha, etc.), parses multiple formats (XML, XLSX, CSV, SOAP, fixed-width DAT), merges/deduplicates them, and generates standardized XML/DAT output files for healthcare systems. It also supports the Elexis EHR Artikelstamm format.
+
+## Common Commands
+
+```bash
+# Install dependencies
+bundle install
+
+# Run full test suite
+bundle exec rake spec
+
+# Run a single test file
+bundle exec rspec spec/builder_spec.rb
+
+# Run a single test by line number
+bundle exec rspec spec/builder_spec.rb:42
+
+# Lint with StandardRB
+bundle exec standardrb
+
+# Auto-fix lint issues
+bundle exec standardrb --fix
+
+# Build the gem
+bundle exec rake build
+```
+
+## Architecture
+
+The system follows a **download → extract → build → compress** pipeline:
+
+1. **CLI** (`lib/oddb2xml/cli.rb`) — Entry point. Parses options via Optimist (`options.rb`), orchestrates the pipeline, manages multi-threaded downloads.
+
+2. **Downloaders** (`lib/oddb2xml/downloader.rb`) — 11 subclasses of `Downloader`, each fetching from a specific Swiss data source. Files cached in `./downloads/`.
+
+3. **Extractors** (`lib/oddb2xml/extractor.rb`) — Matching extractor classes that parse downloaded files into Ruby hashes. Formats include XML (nokogiri/sax-machine), XLSX (rubyXL), SOAP (savon), CSV, and fixed-width text.
+
+4. **Builder** (`lib/oddb2xml/builder.rb`) — The largest file (~1900 lines). Merges extracted data and generates output XML/DAT files. Methods follow `prepare_*` (data assembly) and `build_*` (output generation) naming.
+
+5. **Calc** (`lib/oddb2xml/calc.rb`) — Composition calculation logic, works with `parslet_compositions.rb` and `compositions_syntax.rb` (Parslet-based PEG parser for drug composition strings).
+
+6. **Compressor** (`lib/oddb2xml/compressor.rb`) — Optional ZIP/TAR.GZ output compression.
+
+### Key data identifiers
+- **GTIN/EAN13**: Primary article identifier (13-digit barcode)
+- **Pharmacode**: Swiss pharmacy code
+- **IKSNR**: Swissmedic registration number (5-digit)
+- **Swissmedic sequence/pack numbers**: Combined with IKSNR to form full identifiers
+
+### Static data overrides
+YAML files in `data/` provide manual overrides and mappings: `article_overrides.yaml`, `product_overrides.yaml`, `gtin2ignore.yaml`, `gal_forms.yaml`, `gal_groups.yaml`.
+
+## Testing
+
+- Framework: RSpec with flexmock (mocking), webmock + VCR (HTTP recording/playback)
+- Test fixtures: `spec/data/` (sample files), `spec/fixtures/vcr_cassettes/` (recorded HTTP responses)
+- `spec/spec_helper.rb` defines test constants (GTINs) and configures VCR to avoid real HTTP calls during tests
+- CI runs on Ruby 3.0, 3.1, 3.2
+
+## Ruby Version
+
+- Minimum: Ruby >= 2.5.0 (gemspec)
+- Current development: Ruby 3.2.0 (`.ruby-version`)
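The added CLAUDE.md names GTIN/EAN13 as the primary article identifier (a 13-digit barcode). As an illustration only, here is a small sketch of the standard GTIN-13 check-digit rule: weights alternate 1 and 3 over the first twelve digits, and the check digit brings the weighted sum up to a multiple of ten. The helper names are hypothetical and this is not taken from the oddb2xml source; the sample number is a generic EAN-13 used for demonstration, not a value from the package.

```ruby
# Hypothetical helper: compute the check digit for the first 12 digits
# of a GTIN-13 using the standard 1,3,1,3,... weighting.
def gtin13_check_digit(first12)
  raise ArgumentError, "expected 12 digits" unless first12.match?(/\A\d{12}\z/)

  sum = first12.each_char.with_index.sum do |char, index|
    char.to_i * (index.even? ? 1 : 3)
  end
  (10 - sum % 10) % 10
end

# Hypothetical helper: true when the string is 13 digits and its last
# digit equals the computed check digit.
def valid_gtin13?(gtin)
  return false unless gtin.match?(/\A\d{13}\z/)

  gtin13_check_digit(gtin[0, 12]) == gtin[-1].to_i
end

puts valid_gtin13?("4006381333931") # prints "true"
puts valid_gtin13?("4006381333932") # prints "false"
```

A check like this can catch single-digit typos in test constants or override files, but it says nothing about whether the article actually exists in any of the upstream data sources.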
data/Gemfile
CHANGED
data/Gemfile.lock
CHANGED
@@ -1,54 +1,53 @@
 PATH
   remote: .
   specs:
-    oddb2xml (
+    oddb2xml (3.0.0)
       htmlentities
       httpi
-      mechanize
+      mechanize (>= 2.8.5)
       minitar
       multi_json
-      nokogiri (>= 1.
+      nokogiri (>= 1.19.1)
       optimist
       ox
       parslet
-      rack (
-      rexml
+      rack (>= 3.1.20)
+      rexml (>= 3.3.9)
       rubyXL (~> 3.4.0)
-      rubyntlm (
-      rubyzip
+      rubyntlm (>= 0.6.3)
+      rubyzip (~> 3.0.1)
       savon (~> 2.12.0)
       sax-machine
       spreadsheet
       standardrb
-      webrick
+      webrick (>= 1.8.2)
       xml-simple
 
 GEM
   remote: https://rubygems.org/
   specs:
-    addressable (2.8.
-      public_suffix (>= 2.0.2, <
+    addressable (2.8.8)
+      public_suffix (>= 2.0.2, < 8.0)
     akami (1.3.1)
       gyoku (>= 0.4.0)
       nokogiri
     ast (2.4.2)
-    base64 (0.
+    base64 (0.3.0)
     builder (3.2.4)
     byebug (11.1.3)
     coderay (1.1.3)
-    connection_pool (
+    connection_pool (3.0.2)
     crack (0.4.5)
       rexml
     diff-lcs (1.5.0)
-    domain_name (0.
-      unf (>= 0.0.5, < 1.0.0)
+    domain_name (0.6.20240107)
     flexmock (2.3.8)
     gyoku (1.4.0)
       builder (>= 2.1.2)
       rexml (~> 3.0)
     hashdiff (1.0.1)
     htmlentities (4.3.4)
-    http-cookie (1.0
+    http-cookie (1.1.0)
       domain_name (~> 0.5)
     httpi (2.5.0)
       rack
@@ -56,33 +55,39 @@ GEM
     json (2.6.3)
     language_server-protocol (3.17.0.3)
     lint_roller (1.1.0)
-
-
-
-
-
-
-
-
+    logger (1.7.0)
+    mechanize (2.14.0)
+      addressable (~> 2.8)
+      base64
+      domain_name (~> 0.5, >= 0.5.20190701)
+      http-cookie (~> 1.0, >= 1.0.3)
+      mime-types (~> 3.3)
+      net-http-digest_auth (~> 1.4, >= 1.4.1)
+      net-http-persistent (>= 2.5.2, < 5.0.dev)
+      nkf
+      nokogiri (~> 1.11, >= 1.11.2)
+      rubyntlm (~> 0.6, >= 0.6.3)
       webrick (~> 1.7)
-      webrobots (
+      webrobots (~> 0.1.2)
     method_source (1.0.0)
-    mime-types (3.
-
-
-
+    mime-types (3.7.0)
+      logger
+      mime-types-data (~> 3.2025, >= 3.2025.0507)
+    mime-types-data (3.2026.0203)
+    mini_portile2 (2.8.9)
     minitar (0.9)
     multi_json (1.15.0)
+    mutex_m (0.3.0)
     net-http-digest_auth (1.4.1)
-    net-http-persistent (4.0.
-      connection_pool (
-
-
+    net-http-persistent (4.0.8)
+      connection_pool (>= 2.2.4, < 4)
+    nkf (0.2.0)
+    nokogiri (1.19.1)
+      mini_portile2 (~> 2.8.2)
       racc (~> 1.4)
     nori (2.6.0)
-    ntlm-http (0.1.1)
     optimist (3.1.0)
-    ox (2.14.
+    ox (2.14.14)
     parallel (1.23.0)
     parser (3.2.2.3)
       ast (~> 2.4.1)
@@ -91,21 +96,21 @@ GEM
     pry (0.14.2)
       coderay (~> 1.1)
       method_source (~> 1.0)
-    pry-byebug (3.
+    pry-byebug (3.8.0)
       byebug (~> 11.0)
-      pry (
+      pry (~> 0.10)
     pry-doc (1.4.0)
       pry (~> 0.11)
       yard (~> 0.9.11)
     psych (3.3.4)
-    public_suffix (
-    racc (1.
-    rack (3.
+    public_suffix (7.0.2)
+    racc (1.8.1)
+    rack (3.2.5)
     rainbow (3.1.1)
     rake (13.0.6)
-    rdoc (6.3.
+    rdoc (6.3.4.1)
     regexp_parser (2.8.1)
-    rexml (3.
+    rexml (3.4.4)
     rspec (3.12.0)
       rspec-core (~> 3.12.0)
       rspec-expectations (~> 3.12.0)
@@ -119,21 +124,19 @@ GEM
       diff-lcs (>= 1.2.0, < 2.0)
       rspec-support (~> 3.12.0)
     rspec-support (3.12.1)
-    rubocop (1.
-      base64 (~> 0.1.1)
+    rubocop (1.50.2)
       json (~> 2.3)
-      language_server-protocol (>= 3.17.0)
       parallel (~> 1.10)
-      parser (>= 3.2.
+      parser (>= 3.2.0.0)
       rainbow (>= 2.2.2, < 4.0)
       regexp_parser (>= 1.8, < 3.0)
       rexml (>= 3.2.5, < 4.0)
-      rubocop-ast (>= 1.28.
+      rubocop-ast (>= 1.28.0, < 2.0)
       ruby-progressbar (~> 1.7)
       unicode-display_width (>= 2.4.0, < 3.0)
     rubocop-ast (1.29.0)
       parser (>= 3.2.1.0)
-    rubocop-performance (1.
+    rubocop-performance (1.16.0)
       rubocop (>= 1.7.0, < 2.0)
       rubocop-ast (>= 0.4.0)
     ruby-ole (1.2.12.2)
@@ -141,8 +144,9 @@ GEM
     rubyXL (3.4.25)
       nokogiri (>= 1.10.8)
       rubyzip (>= 1.3.0)
-    rubyntlm (0.5
-
+    rubyntlm (0.6.5)
+      base64
+    rubyzip (3.0.1)
     savon (2.12.1)
       akami (~> 1.2)
       builder (>= 2.1.2)
@@ -155,26 +159,23 @@ GEM
     socksify (1.7.1)
     spreadsheet (1.3.0)
       ruby-ole
-    standard (1.
+    standard (1.28.5)
       language_server-protocol (~> 3.17.0.2)
       lint_roller (~> 1.0)
-      rubocop (~> 1.
+      rubocop (~> 1.50.2)
       standard-custom (~> 1.0.0)
-      standard-performance (~> 1.
+      standard-performance (~> 1.0.1)
     standard-custom (1.0.2)
       lint_roller (~> 1.0)
       rubocop (~> 1.50)
-    standard-performance (1.
-      lint_roller (~> 1.
-      rubocop-performance (~> 1.
+    standard-performance (1.0.1)
+      lint_roller (~> 1.0)
+      rubocop-performance (~> 1.16.0)
     standardrb (1.0.1)
       standard
     timecop (0.9.8)
-    unf (0.1.4)
-      unf_ext
-    unf_ext (0.0.8.2)
     unicode-display_width (2.5.0)
-    vcr (6.
+    vcr (6.1.0)
     wasabi (3.7.0)
       addressable
       httpi (~> 2.0)
@@ -183,7 +184,7 @@ GEM
       addressable (>= 2.8.0)
       crack (>= 0.3.2)
       hashdiff (>= 0.4.0, < 2.0.0)
-    webrick (1.
+    webrick (1.9.2)
     webrobots (0.1.2)
     xml-simple (1.1.9)
       rexml
@@ -195,18 +196,19 @@ PLATFORMS
 DEPENDENCIES
   bundler
   flexmock
-
+  mutex_m
+  nokogiri (>= 1.19.1)
   oddb2xml!
   pry-byebug
   pry-doc
   psych (< 4.0.0)
-  rack (
+  rack (>= 3.1.20)
   rake
-  rdoc (
+  rdoc (>= 6.3.4.1)
   rspec
   timecop
   vcr
   webmock
 
 BUNDLED WITH
-   2.
+   2.4.19
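Several of the new lockfile entries tighten version constraints (for example `rubyzip (~> 3.0.1)` and `public_suffix (>= 2.0.2, < 8.0)`). As a rough sketch of how such constraints can be checked against a resolved version, the snippet below uses `Gem::Requirement` and `Gem::Version` from RubyGems. The helper name `satisfies?` is hypothetical, and splitting a compound constraint on commas is an assumption made here for convenience, not how Bundler itself parses the lockfile.

```ruby
require "rubygems"

# Hypothetical helper: does `version` satisfy a lockfile-style constraint
# string such as "~> 3.0.1" or ">= 2.0.2, < 8.0"?
def satisfies?(constraint, version)
  parts = constraint.split(",").map(&:strip)
  Gem::Requirement.new(*parts).satisfied_by?(Gem::Version.new(version))
end

# Constraints and resolved versions copied from the hunks above.
puts satisfies?("~> 3.0.1", "3.0.1")        # rubyzip, prints "true"
puts satisfies?("~> 3.0.1", "3.1.0")        # outside the pessimistic range, prints "false"
puts satisfies?(">= 2.0.2, < 8.0", "7.0.2") # public_suffix, prints "true"
puts satisfies?(">= 3.1.20", "3.2.5")       # rack, prints "true"
```

The pessimistic operator `~> 3.0.1` allows patch releases from 3.0.1 up to, but not including, 3.1, which is why a later minor release would be rejected.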