oddb2xml 2.9.9 → 3.0.1

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 538eaca672eb756d84b1aa382f5a9a27e4109a3f28575386d4f0ad0aed1f7e58
-  data.tar.gz: 3d7cd4d55f9e738acfabe9ab04b2b391c4144b30db8f21534d7f8c58dbf8e9d3
+  metadata.gz: c0521593bc12b423a74628f04973dcaf43241ce674c415c4e3a82da3ad34a134
+  data.tar.gz: a7b8fe9076dd737105696b525e3c4c0e17e3174ea4d86d0bab9f948324229c05
 SHA512:
-  metadata.gz: 953e11dece40e657143b7e85ca9d0a32d614b0cdd717f4ffb4dd6a6a4ddbc55bb851afd2db2a19d4b8c5e651488514a392951dffee677e763573beb0c2c0b6e2
-  data.tar.gz: 55289f276f21c170ae2bb50ab8c5c53ab197529f3b101079b941d15688f2650571f0a158c7067807106920c87363641baae0196d432e2e7406ce90394dfee7b4
+  metadata.gz: 1f0c6275adc2b60930eb961b1745013f81c167a49c035347c49b40af8f59ead028f884981aeb9aa0ce809b7007a1b4c78993d2adfa5c3e31e3ed56e8866ac035
+  data.tar.gz: c02f68d34884adbabe1465cf7180e6ef51cbb84049a71a931880bcb383abb4c3dc8196f92e21932b3529a68e5211f546fb053112f27037434313517de4d22961
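Note on the hunk above: `checksums.yaml` records SHA256 and SHA512 digests for the two archives packed inside the gem. The following is a minimal sketch of how such entries could be checked against a file's bytes. The helper name `checksums_match?` and the demo payload are hypothetical and are not part of oddb2xml or RubyGems.

```ruby
require "digest"
require "yaml"

# Hypothetical helper: true when both the SHA256 and SHA512 entries for
# `name` in a checksums.yaml document match the digests of `bytes`.
def checksums_match?(checksums_yaml, name, bytes)
  sums = YAML.safe_load(checksums_yaml)
  sums.fetch("SHA256").fetch(name).to_s == Digest::SHA256.hexdigest(bytes) &&
    sums.fetch("SHA512").fetch(name).to_s == Digest::SHA512.hexdigest(bytes)
end

# Made-up bytes standing in for the real data.tar.gz.
payload = "example payload"
demo_yaml = {
  "SHA256" => {"data.tar.gz" => Digest::SHA256.hexdigest(payload)},
  "SHA512" => {"data.tar.gz" => Digest::SHA512.hexdigest(payload)}
}.to_yaml

puts checksums_match?(demo_yaml, "data.tar.gz", payload)       # unchanged bytes
puts checksums_match?(demo_yaml, "data.tar.gz", payload + "x") # tampered bytes
```

Both digests change whenever the packaged files change, which is why all four values differ between the two releases.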
data/CLAUDE.md ADDED
@@ -0,0 +1,69 @@
+# CLAUDE.md
+
+This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
+## Project Overview
+
+oddb2xml is a Ruby gem that downloads Swiss pharmaceutical data from 10+ sources (Swissmedic, BAG, Refdata, ZurRose, EPha, etc.), parses multiple formats (XML, XLSX, CSV, SOAP, fixed-width DAT), merges/deduplicates them, and generates standardized XML/DAT output files for healthcare systems. It also supports the Elexis EHR Artikelstamm format.
+
+## Common Commands
+
+```bash
+# Install dependencies
+bundle install
+
+# Run full test suite
+bundle exec rake spec
+
+# Run a single test file
+bundle exec rspec spec/builder_spec.rb
+
+# Run a single test by line number
+bundle exec rspec spec/builder_spec.rb:42
+
+# Lint with StandardRB
+bundle exec standardrb
+
+# Auto-fix lint issues
+bundle exec standardrb --fix
+
+# Build the gem
+bundle exec rake build
+```
+
+## Architecture
+
+The system follows a **download → extract → build → compress** pipeline:
+
+1. **CLI** (`lib/oddb2xml/cli.rb`) — Entry point. Parses options via Optimist (`options.rb`), orchestrates the pipeline, manages multi-threaded downloads.
+
+2. **Downloaders** (`lib/oddb2xml/downloader.rb`) — 11 subclasses of `Downloader`, each fetching from a specific Swiss data source. Files cached in `./downloads/`.
+
+3. **Extractors** (`lib/oddb2xml/extractor.rb`) — Matching extractor classes that parse downloaded files into Ruby hashes. Formats include XML (nokogiri/sax-machine), XLSX (rubyXL), SOAP (savon), CSV, and fixed-width text.
+
+4. **Builder** (`lib/oddb2xml/builder.rb`) — The largest file (~1900 lines). Merges extracted data and generates output XML/DAT files. Methods follow `prepare_*` (data assembly) and `build_*` (output generation) naming.
+
+5. **Calc** (`lib/oddb2xml/calc.rb`) — Composition calculation logic, works with `parslet_compositions.rb` and `compositions_syntax.rb` (Parslet-based PEG parser for drug composition strings).
+
+6. **Compressor** (`lib/oddb2xml/compressor.rb`) — Optional ZIP/TAR.GZ output compression.
+
+### Key data identifiers
+- **GTIN/EAN13**: Primary article identifier (13-digit barcode)
+- **Pharmacode**: Swiss pharmacy code
+- **IKSNR**: Swissmedic registration number (5-digit)
+- **Swissmedic sequence/pack numbers**: Combined with IKSNR to form full identifiers
+
+### Static data overrides
+YAML files in `data/` provide manual overrides and mappings: `article_overrides.yaml`, `product_overrides.yaml`, `gtin2ignore.yaml`, `gal_forms.yaml`, `gal_groups.yaml`.
+
+## Testing
+
+- Framework: RSpec with flexmock (mocking), webmock + VCR (HTTP recording/playback)
+- Test fixtures: `spec/data/` (sample files), `spec/fixtures/vcr_cassettes/` (recorded HTTP responses)
+- `spec/spec_helper.rb` defines test constants (GTINs) and configures VCR to avoid real HTTP calls during tests
+- CI runs on Ruby 3.0, 3.1, 3.2
+
+## Ruby Version
+
+- Minimum: Ruby >= 2.5.0 (gemspec)
+- Current development: Ruby 3.2.0 (`.ruby-version`)
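Note on the identifiers listed in the added CLAUDE.md: a GTIN/EAN13 is a 13-digit code whose last digit is a GS1 check digit. The sketch below shows the standard check-digit rule only. The method names are hypothetical and are not taken from the oddb2xml code base, and the sample number is a generic EAN-13 rather than a Swiss article.

```ruby
# Compute the GS1 check digit for the first 12 digits of a GTIN-13.
# Weights alternate 1, 3, 1, 3, ... from the leftmost digit.
def gtin13_check_digit(first12)
  weighted = first12.chars.each_with_index.map do |char, index|
    char.to_i * (index.even? ? 1 : 3)
  end
  (10 - weighted.sum % 10) % 10
end

# True when the string is exactly 13 digits and its last digit
# equals the check digit computed from the first 12.
def valid_gtin13?(gtin)
  return false unless gtin.match?(/\A\d{13}\z/)
  gtin13_check_digit(gtin[0, 12]) == gtin[-1].to_i
end

puts valid_gtin13?("4006381333931") # => true
puts valid_gtin13?("4006381333932") # => false (wrong check digit)
```

A check like this could be used to reject malformed GTINs before merging records from several sources. Whether oddb2xml does so is not stated in this diff.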
data/Gemfile CHANGED
@@ -7,5 +7,6 @@ group :debugger do
   gem "pry-doc"
 end
 
-gem "nokogiri", "1.13.9"
-gem "rack", "3.0.11"
+gem "nokogiri", ">= 1.19.1"
+gem "rack", ">= 3.1.20"
+gem "mutex_m"
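Note on the Gemfile change above: a bare version string such as `"1.13.9"` is an exact pin, while `">= 1.19.1"` sets only a lower bound. A small sketch using RubyGems' own `Gem::Requirement` and `Gem::Version` classes illustrates the difference. The version `2.0.0` is only an illustrative value, not a release referenced by this diff.

```ruby
require "rubygems"

# A requirement without an operator defaults to "=", i.e. an exact pin.
old_pin = Gem::Requirement.new("1.13.9")
# The new Gemfile entry only sets a minimum version.
new_min = Gem::Requirement.new(">= 1.19.1")

resolved = Gem::Version.new("1.19.1") # version recorded in the new Gemfile.lock

puts old_pin.satisfied_by?(resolved)                  # => false
puts new_min.satisfied_by?(resolved)                  # => true
puts new_min.satisfied_by?(Gem::Version.new("2.0.0")) # => true (no upper bound)
```

With a lower bound only, the concrete version is decided by `Gemfile.lock`, so the lockfile diff below is where the resolved versions show up.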
data/Gemfile.lock CHANGED
@@ -1,54 +1,53 @@
 PATH
   remote: .
   specs:
-    oddb2xml (2.9.2)
+    oddb2xml (3.0.0)
       htmlentities
       httpi
-      mechanize
+      mechanize (>= 2.8.5)
       minitar
       multi_json
-      nokogiri (>= 1.8.2)
+      nokogiri (>= 1.19.1)
       optimist
       ox
       parslet
-      rack (= 3.0.11)
-      rexml
+      rack (>= 3.1.20)
+      rexml (>= 3.3.9)
       rubyXL (~> 3.4.0)
-      rubyntlm (= 0.5.1)
-      rubyzip
+      rubyntlm (>= 0.6.3)
+      rubyzip (~> 3.0.1)
       savon (~> 2.12.0)
       sax-machine
       spreadsheet
       standardrb
-      webrick
+      webrick (>= 1.8.2)
       xml-simple
 
 GEM
   remote: https://rubygems.org/
   specs:
-    addressable (2.8.5)
-      public_suffix (>= 2.0.2, < 6.0)
+    addressable (2.8.8)
+      public_suffix (>= 2.0.2, < 8.0)
     akami (1.3.1)
       gyoku (>= 0.4.0)
       nokogiri
     ast (2.4.2)
-    base64 (0.1.1)
+    base64 (0.3.0)
     builder (3.2.4)
     byebug (11.1.3)
     coderay (1.1.3)
-    connection_pool (2.4.1)
+    connection_pool (3.0.2)
     crack (0.4.5)
       rexml
     diff-lcs (1.5.0)
-    domain_name (0.5.20190701)
-      unf (>= 0.0.5, < 1.0.0)
+    domain_name (0.6.20240107)
     flexmock (2.3.8)
     gyoku (1.4.0)
       builder (>= 2.1.2)
       rexml (~> 3.0)
     hashdiff (1.0.1)
     htmlentities (4.3.4)
-    http-cookie (1.0.5)
+    http-cookie (1.1.0)
       domain_name (~> 0.5)
     httpi (2.5.0)
       rack
@@ -56,33 +55,39 @@ GEM
     json (2.6.3)
     language_server-protocol (3.17.0.3)
     lint_roller (1.1.0)
-    mechanize (2.7.7)
-      domain_name (~> 0.5, >= 0.5.1)
-      http-cookie (~> 1.0)
-      mime-types (>= 1.17.2)
-      net-http-digest_auth (~> 1.1, >= 1.1.1)
-      net-http-persistent (>= 2.5.2)
-      nokogiri (~> 1.6)
-      ntlm-http (~> 0.1, >= 0.1.1)
+    logger (1.7.0)
+    mechanize (2.14.0)
+      addressable (~> 2.8)
+      base64
+      domain_name (~> 0.5, >= 0.5.20190701)
+      http-cookie (~> 1.0, >= 1.0.3)
+      mime-types (~> 3.3)
+      net-http-digest_auth (~> 1.4, >= 1.4.1)
+      net-http-persistent (>= 2.5.2, < 5.0.dev)
+      nkf
+      nokogiri (~> 1.11, >= 1.11.2)
+      rubyntlm (~> 0.6, >= 0.6.3)
       webrick (~> 1.7)
-      webrobots (>= 0.0.9, < 0.2)
+      webrobots (~> 0.1.2)
     method_source (1.0.0)
-    mime-types (3.5.1)
-      mime-types-data (~> 3.2015)
-    mime-types-data (3.2023.0808)
-    mini_portile2 (2.8.4)
+    mime-types (3.7.0)
+      logger
+      mime-types-data (~> 3.2025, >= 3.2025.0507)
+    mime-types-data (3.2026.0203)
+    mini_portile2 (2.8.9)
     minitar (0.9)
     multi_json (1.15.0)
+    mutex_m (0.3.0)
     net-http-digest_auth (1.4.1)
-    net-http-persistent (4.0.2)
-      connection_pool (~> 2.2)
-    nokogiri (1.13.9)
-      mini_portile2 (~> 2.8.0)
+    net-http-persistent (4.0.8)
+      connection_pool (>= 2.2.4, < 4)
+    nkf (0.2.0)
+    nokogiri (1.19.1)
+      mini_portile2 (~> 2.8.2)
       racc (~> 1.4)
     nori (2.6.0)
-    ntlm-http (0.1.1)
     optimist (3.1.0)
-    ox (2.14.17)
+    ox (2.14.14)
     parallel (1.23.0)
     parser (3.2.2.3)
       ast (~> 2.4.1)
@@ -91,21 +96,21 @@ GEM
     pry (0.14.2)
       coderay (~> 1.1)
       method_source (~> 1.0)
-    pry-byebug (3.10.1)
+    pry-byebug (3.8.0)
       byebug (~> 11.0)
-      pry (>= 0.13, < 0.15)
+      pry (~> 0.10)
     pry-doc (1.4.0)
       pry (~> 0.11)
       yard (~> 0.9.11)
     psych (3.3.4)
-    public_suffix (5.0.3)
-    racc (1.7.1)
-    rack (3.0.11)
+    public_suffix (7.0.2)
+    racc (1.8.1)
+    rack (3.2.5)
     rainbow (3.1.1)
     rake (13.0.6)
-    rdoc (6.3.3)
+    rdoc (6.3.4.1)
     regexp_parser (2.8.1)
-    rexml (3.2.6)
+    rexml (3.4.4)
     rspec (3.12.0)
       rspec-core (~> 3.12.0)
       rspec-expectations (~> 3.12.0)
@@ -119,21 +124,19 @@ GEM
       diff-lcs (>= 1.2.0, < 2.0)
       rspec-support (~> 3.12.0)
     rspec-support (3.12.1)
-    rubocop (1.56.4)
-      base64 (~> 0.1.1)
+    rubocop (1.50.2)
       json (~> 2.3)
-      language_server-protocol (>= 3.17.0)
       parallel (~> 1.10)
-      parser (>= 3.2.2.3)
+      parser (>= 3.2.0.0)
       rainbow (>= 2.2.2, < 4.0)
       regexp_parser (>= 1.8, < 3.0)
       rexml (>= 3.2.5, < 4.0)
-      rubocop-ast (>= 1.28.1, < 2.0)
+      rubocop-ast (>= 1.28.0, < 2.0)
       ruby-progressbar (~> 1.7)
       unicode-display_width (>= 2.4.0, < 3.0)
     rubocop-ast (1.29.0)
       parser (>= 3.2.1.0)
-    rubocop-performance (1.19.1)
+    rubocop-performance (1.16.0)
       rubocop (>= 1.7.0, < 2.0)
       rubocop-ast (>= 0.4.0)
     ruby-ole (1.2.12.2)
@@ -141,8 +144,9 @@ GEM
     rubyXL (3.4.25)
       nokogiri (>= 1.10.8)
       rubyzip (>= 1.3.0)
-    rubyntlm (0.5.1)
-    rubyzip (2.3.2)
+    rubyntlm (0.6.5)
+      base64
+    rubyzip (3.0.1)
     savon (2.12.1)
       akami (~> 1.2)
       builder (>= 2.1.2)
@@ -155,26 +159,23 @@ GEM
     socksify (1.7.1)
     spreadsheet (1.3.0)
       ruby-ole
-    standard (1.31.1)
+    standard (1.28.5)
       language_server-protocol (~> 3.17.0.2)
       lint_roller (~> 1.0)
-      rubocop (~> 1.56.2)
+      rubocop (~> 1.50.2)
       standard-custom (~> 1.0.0)
-      standard-performance (~> 1.2)
+      standard-performance (~> 1.0.1)
     standard-custom (1.0.2)
       lint_roller (~> 1.0)
       rubocop (~> 1.50)
-    standard-performance (1.2.0)
-      lint_roller (~> 1.1)
-      rubocop-performance (~> 1.19.0)
+    standard-performance (1.0.1)
+      lint_roller (~> 1.0)
+      rubocop-performance (~> 1.16.0)
     standardrb (1.0.1)
       standard
     timecop (0.9.8)
-    unf (0.1.4)
-      unf_ext
-    unf_ext (0.0.8.2)
     unicode-display_width (2.5.0)
-    vcr (6.2.0)
+    vcr (6.1.0)
     wasabi (3.7.0)
       addressable
       httpi (~> 2.0)
@@ -183,7 +184,7 @@ GEM
       addressable (>= 2.8.0)
       crack (>= 0.3.2)
       hashdiff (>= 0.4.0, < 2.0.0)
-    webrick (1.8.1)
+    webrick (1.9.2)
     webrobots (0.1.2)
     xml-simple (1.1.9)
       rexml
@@ -195,18 +196,19 @@ PLATFORMS
 DEPENDENCIES
   bundler
   flexmock
-  nokogiri (= 1.13.9)
+  mutex_m
+  nokogiri (>= 1.19.1)
   oddb2xml!
   pry-byebug
   pry-doc
   psych (< 4.0.0)
-  rack (= 3.0.11)
+  rack (>= 3.1.20)
   rake
-  rdoc (~> 6.3.3)
+  rdoc (>= 6.3.4.1)
   rspec
   timecop
   vcr
   webmock
 
 BUNDLED WITH
-   2.3.24
+   2.4.19
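Note on the lockfile changes above: not every entry moved forward. Some resolved versions are lower than before (for example `ox`, `vcr`, `rubocop`), and several constraints use the pessimistic operator `~>`. The sketch below uses RubyGems' `Gem::Version` and `Gem::Requirement` to classify a version change and to show what `~> 3.0.1` accepts. The helper `direction` is hypothetical and exists only for this illustration.

```ruby
require "rubygems"

# Hypothetical helper: classify a version change taken from the diff.
def direction(from, to)
  case Gem::Version.new(to) <=> Gem::Version.new(from)
  when 1 then :upgrade
  when -1 then :downgrade
  else :unchanged
  end
end

puts direction("1.13.9", "1.19.1")   # nokogiri => upgrade
puts direction("2.14.17", "2.14.14") # ox       => downgrade
puts direction("6.2.0", "6.1.0")     # vcr      => downgrade

# "~> 3.0.1" means ">= 3.0.1" and "< 3.1".
rubyzip = Gem::Requirement.new("~> 3.0.1")
puts rubyzip.satisfied_by?(Gem::Version.new("3.0.1")) # => true
puts rubyzip.satisfied_by?(Gem::Version.new("3.1.0")) # => false
puts rubyzip.satisfied_by?(Gem::Version.new("2.3.2")) # => false (the old locked version)
```

Comparing through `Gem::Version` rather than as plain strings matters for segments such as `14` versus `17` or `9` versus `19`, where string comparison would give the wrong order.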