dsv7-parser 7.0.0

checksums.yaml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA256:
3
+   metadata.gz: 46a266cb139c92c4211efbec9a9675f72615b37a0a93bda512193e307f2965c4
4
+   data.tar.gz: c53949825909b9261ba0a2f49a462bc55a2223feb8829e220d33c5c2251de243
5
+ SHA512:
6
+   metadata.gz: d2e86049d2d517aa43733f222bd81c76938f8175022d8377c79cb1df74dc71a6eaad192a5b1bc8a979f9b3eb4f9bdfa4d80d7a92632681f704860f6b0f755fd3
7
+   data.tar.gz: a3d348801f8ba8c4a95be3e707ff5199399f3b626b2b8084fa8c33efa419f3597ff83a172bc9f2f3ee8dcfff4591f91b2244e929e7e4df7306fe7b92796f0d32
data/.gitignore ADDED
@@ -0,0 +1,8 @@
1
+ /.bundle/
2
+ /.yardoc
3
+ /_yardoc/
4
+ /coverage/
5
+ /doc/
6
+ /pkg/
7
+ /tmp/
8
+ *.gem
data/.rubocop.yml ADDED
@@ -0,0 +1,22 @@
1
+ # Starter RuboCop configuration
2
+
3
+ AllCops:
4
+   NewCops: enable
5
+   SuggestExtensions: false
6
+   Exclude:
7
+     - 'bin/*'
8
+     - 'pkg/**/*'
9
+     - 'tmp/**/*'
10
+     - 'vendor/**/*'
11
+
12
+ Layout/LineLength:
13
+   Max: 100
14
+
15
+ Metrics/BlockLength:
16
+   Exclude:
17
+     - 'Rakefile'
18
+     - 'test/**/*'
19
+
20
+ # Keep docs optional for small gems and tests
21
+ Style/Documentation:
22
+   Enabled: false
data/.ruby-version ADDED
@@ -0,0 +1 @@
1
+ 3.4.6
data/AGENTS.md ADDED
@@ -0,0 +1,255 @@
1
+ # Agent Guide for dsv7-parser
2
+
3
+ This repo contains a Ruby gem for parsing and validating DSV7 files (German Swimming Federation, “Format 7”). It already includes a high-level validator and a growing specification captured in Markdown.
4
+
5
+ ## Repo Layout
6
+
7
+ - `lib/dsv7/validator.rb` — public entrypoint `Dsv7::Validator.validate(...)`
8
+ - `lib/dsv7/validator/core.rb` — core stream pipeline (BOM/encoding, line parsing)
9
+ - `lib/dsv7/validator/result.rb` — result container (`errors`, `warnings`, `valid?`)
10
+ - `lib/dsv7/parser.rb` — parser namespace and version loader
11
+ - `specification/dsv7/dsv7_specification.md` — cleaned, structured spec text
12
+ - `test/dsv7/*_test.rb` — Minitest suite covering validator behavior
13
+ - `Rakefile` — `rake` (default: tests + rubocop), `rake ci`, `rake rubocop`
14
+ - `lib/dsv7/stream.rb` — IO helpers (BOM, UTF‑8, CR/LF handling)
15
+ - `lib/dsv7/lex.rb` — lexical helpers (FORMAT, element split)
16
+ - `lib/dsv7/validator/line_analyzer.rb` — streaming line analyzer + orchestration
17
+ - `lib/dsv7/validator/line_analyzer_common.rb` — list‑specific analyzer mixins
18
+ - `lib/dsv7/validator/schemas/*.rb` — per‑list attribute schemas
19
+ - `lib/dsv7/validator/types/*.rb` — datatype and enum checks
20
+ - `lib/dsv7/parser/engine.rb` — streaming parser implementation
21
+ - `lib/dsv7/parser/io_util.rb` — parser IO utilities
22
+
23
+ ## Dev Quick Start
24
+
25
+ - Install deps: `bundle install`
26
+ - Run tests: `bundle exec rake` (or `bundle exec rake test`)
27
+ - Run RuboCop: `bundle exec rake rubocop`
28
+
29
+ ## Spec: “Allgemeines” (Key Rules)
30
+
31
+ From `specification/dsv7/dsv7_specification.md` → section “Allgemeines” and the general rules that impact top‑level validation:
32
+
33
+ - Scope and version
34
+ - Standard for exchanging meet entries and results.
35
+ - Current format is version `7`, effective from 2023‑01‑01; format 6 valid only until 2023‑07‑31.
36
+ - Encoding and separators
37
+ - UTF‑8 without BOM.
38
+ - Attribute delimiter is `;`.
39
+ - No line breaks inside a single element.
40
+ - Comments
41
+ - Syntax: `(* ... *)` (intended to be single‑line; inline allowed).
42
+ - File envelope
43
+ - First effective line must be `FORMAT:<Listentyp>;7;`.
44
+ - Last element is `DATEIENDE`.
45
+ - Allowed list types
46
+ - `Wettkampfdefinitionsliste`, `Vereinsmeldeliste`, `Wettkampfergebnisliste`, `Vereinsergebnisliste`.
47
+ - Filenames
48
+ - Pattern: `JJJJ-MM-TT-Ort-Zusatz.DSV7`.
49
+ - `Ort`: remove spaces/hyphens; map umlauts `ä→ae`, `ö→oe`, `ü→ue`, `ß→ss`; max 8 chars.
50
+ - `Zusatz`: `…-Me` (Vereinsmeldeliste), `…-Pr` (Vereinsergebnisliste), `Pr` (Wettkampfergebnisliste), `Wk` (Wettkampfdefinitionsliste).
51
+ - Datatypes (future schema validation)
52
+ - ZK (string), Zeichen (char), Zahl (int), Zeit (`HH:MM:SS,hh`), Datum (`TT.MM.JJJJ`), Uhrzeit (`HH:MM`), Betrag (`x,yy`), JGAK variants.
53
+
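The filename rules listed above can be illustrated with a minimal Ruby sketch. It is not taken from the gem: `normalize_ort`, `dsv7_filename?`, `UMLAUT_MAP`, and `FILENAME_PATTERN` are hypothetical names, and the order of operations (map umlauts, then drop spaces and hyphens, then truncate to 8 characters) is an assumption. The pattern is the one quoted in the validator coverage notes, anchored with `\A` and `\z` instead of `^` and `$`.

```ruby
# frozen_string_literal: true

# Hypothetical helpers for illustration only; they are not part of dsv7-parser.

# Mapping taken from the filename rules; uppercase umlauts are left untouched
# here because the rules do not mention them (assumption).
UMLAUT_MAP = { 'ä' => 'ae', 'ö' => 'oe', 'ü' => 'ue', 'ß' => 'ss' }.freeze

# Filename pattern from the coverage notes, anchored to the whole string.
FILENAME_PATTERN = /\A\d{4}-\d{2}-\d{2}-[^.]+\.DSV7\z/

# Maps umlauts, removes whitespace and hyphens, then keeps at most 8 characters.
def normalize_ort(place_name)
  mapped = place_name.gsub(/[äöüß]/, UMLAUT_MAP)
  mapped.gsub(/[\s-]/, '')[0, 8]
end

# Checks only the basename, so directory parts of the path are ignored.
def dsv7_filename?(path)
  File.basename(path).match?(FILENAME_PATTERN)
end
```

Under these assumptions, `normalize_ort('Bad Homburg')` yields `BadHombu`, and `dsv7_filename?('2024-05-01-Koeln-Wk.DSV7')` is true.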
54
+ ## Current Validator Coverage (implemented)
55
+
56
+ - FORMAT line parsing and validation
57
+ - Must be first effective line; exact syntax `FORMAT:<Listentyp>;7;`.
58
+ - List type must be one of the four allowed types.
59
+ - Version must be `7`.
60
+ - Terminator
61
+ - `DATEIENDE` must be present; no effective content after it.
62
+ - Encoding and line endings
63
+ - Detects UTF‑8 BOM (errors); validates UTF‑8 and scrubs invalid input (reports error).
64
+ - Detects CRLF present anywhere (adds a warning) — still considered valid.
65
+ - Comments and delimiters
66
+ - Inline comments `(* ... *)` are stripped for checks.
67
+ - Unbalanced comment delimiters on any single line are errors.
68
+ - Every non‑empty, non‑comment data line after FORMAT must contain at least one `;`.
69
+ - Filenames
70
+ - When validating by path, warns if the filename does not match `^\d{4}-\d{2}-\d{2}-[^.]+\.DSV7$`.
71
+
72
+ - Element schemas and datatypes
73
+ - Per‑list element schemas enforce exact attribute counts and required/optional positions.
74
+ - Datatype checks implemented: `ZK`, `Zahl`, `Betrag`, `Zeit`, `Datum`, `Uhrzeit`, plus many enums
75
+ (e.g., `Technik`, `Ausübung`, `Geschlecht`, `Wettkampfart`, `Wertungstyp`, `JG/AK`).
76
+ - Cross‑field rule example: `MELDEGELD` with `WKMELDEGELD` requires Wettkampfnr (attr 3).
77
+ - Accepts synonymous element names where found in the wild (e.g., `STAFFELERGEBNIS`/`STERGEBNIS`).
78
+ - Cardinality checks
79
+ - Enforces required presence and max occurrences per list type (WKDL, VML, ERG, VRL).
80
+
81
+ What is not implemented yet (or partial): normalization of `Ort`/`Zusatz` beyond filename pattern; full
82
+ cross‑element ordering/refs (beyond cardinalities); some rarely used elements (e.g., PFLICHTZEIT) are
83
+ intentionally deferred pending clearer examples.
84
+
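The FORMAT line checks described in this section can be sketched in a few lines of Ruby. This is an illustration under stated assumptions, not the gem's implementation: `check_format_line`, `FORMAT_PATTERN`, `ALLOWED_LIST_TYPES`, and the message texts are hypothetical, and the line is assumed to have had inline comments stripped already.

```ruby
# frozen_string_literal: true

# Hypothetical sketch; names and messages are not taken from dsv7-parser.

# The four list types named in the spec summary.
ALLOWED_LIST_TYPES = %w[
  Wettkampfdefinitionsliste
  Vereinsmeldeliste
  Wettkampfergebnisliste
  Vereinsergebnisliste
].freeze

# FORMAT:<Listentyp>;<version>; with nothing before or after it.
FORMAT_PATTERN = /\AFORMAT:([^;]+);([^;]+);\z/

# Returns an array of error strings; an empty array means the line passed.
def check_format_line(line)
  match = FORMAT_PATTERN.match(line.strip)
  return ['FORMAT line has invalid syntax'] unless match

  errors = []
  errors << "unknown list type #{match[1]}" unless ALLOWED_LIST_TYPES.include?(match[1])
  errors << "unsupported version #{match[2]}" unless match[2] == '7'
  errors
end
```

Returning a list of messages rather than raising mirrors the errors/warnings container the guide describes, where a file is judged by the collected errors.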
85
+ ## Minimal Valid Example
86
+
87
+ ```
88
+ FORMAT:Wettkampfdefinitionsliste;7;
89
+ DATA;ok
90
+ DATEIENDE
91
+ ```
92
+
93
+ Notes:
94
+ - Leading/trailing spaces around `FORMAT`/`DATEIENDE` are tolerated.
95
+ - Inline comments are allowed, e.g. `FORMAT:... (* note *)`.
96
+
97
+ ## Tests Overview and Conventions
98
+
99
+ - Location: `test/dsv7/` using Minitest.
100
+ - Common helpers in tests:
101
+ - `format_line(type = 'Wettkampfdefinitionsliste', version = '7')`
102
+ - `validate_string(content)` → `Dsv7::Validator.validate(content)`
103
+ - Existing coverage slices (examples):
104
+ - Validator envelope and syntax: `validator_test.rb`, `validator_format_syntax_test.rb`,
105
+ `validator_comments_test.rb`, `validator_encoding_test.rb`, `validator_filename_test.rb`,
106
+ `validator_whitespace_test.rb`.
107
+ - Element schemas and datatypes: `validator_schema_base_test.rb`, `validator_wkdl_attributes_test.rb`.
108
+ - List‑specific suites: `validator_vereinsmeldeliste_*_test.rb`, `validator_wettkampfergebnisliste_*_test.rb`,
109
+ `validator_vereinsergebnisliste_*_test.rb`.
110
+ - Parser: `parser_*_test.rb` (including end‑to‑end `e2e_*_test.rb`).
111
+
112
+ ## Suggested Next Steps
113
+
114
+ - Extend schemas to remaining/rare elements (e.g., `PFLICHTZEIT`) once examples are clarified.
115
+ - Add cross‑element reference/order checks (e.g., section and event numbering consistency).
116
+ - Expand filename normalization checks (umlaut mappings, max length, numeric suffixes like `Ort1`).
117
+ - Harden parser/validator against additional real‑world quirks (document with tests).
118
+ - Update README with Parser streaming examples and advanced validation usage.
119
+
120
+ # Coding Guidelines for dsv7-parser
121
+
122
+ Concise conventions for contributing code and tests to this repo. Focus is on a small, dependency‑light Ruby library with a streaming validator and Minitest suite.
123
+
124
+ ## Language & Versions
125
+
126
+ - Ruby version: `2.7.0` in `.ruby-version`; gem supports `>= 2.7` (see `dsv7-parser.gemspec`).
127
+ - Add `# frozen_string_literal: true` to the top of Ruby files.
128
+ - Avoid non‑portable syntax not available in Ruby 2.7.
129
+
130
+ ## Style & Linting
131
+
132
+ - Use RuboCop; run `bundle exec rake rubocop`.
133
+ - Write both library code and tests in a RuboCop‑compliant way; fix all offenses before submitting.
134
+ - Do not change `.rubocop.yml` to satisfy offenses; fix the code instead.
135
+ - Avoid inline `# rubocop:disable` comments; refactor to comply where possible.
136
+ - New cops enabled; line length max `100` (see `.rubocop.yml`).
137
+ - `Metrics/BlockLength` is relaxed for `Rakefile` and `test/**/*`.
138
+ - `Style/Documentation` disabled (no mandatory class/module docs).
139
+ - String literals: prefer single quotes; use double quotes when interpolation/escapes are needed.
140
+ - Keep methods small and cohesive; favor clear names over brevity.
141
+ - Names: prefer full words for variables and methods; avoid abbreviations (e.g., use `line_number` not `line_no`, `add_error` not `add_err`).
142
+
143
+ ## Structure & Namespacing
144
+
145
+ - Place code under `lib/dsv7/` and namespace under `Dsv7`.
146
+ - Keep validator pipeline pieces encapsulated:
147
+ - `Dsv7::Validator` (public API, orchestration)
148
+ - `Dsv7::Validator::Core` (streaming + encoding + lines)
149
+ - `Dsv7::Validator::Result` (errors/warnings container)
150
+ - Keep responsibilities narrow; prefer small classes/modules over large monoliths.
151
+
152
+ ## Public API Expectations
153
+
154
+ - Validator: `Dsv7::Validator.validate(input)` supports IO, file path, or content String.
155
+ - Parser: `Dsv7::Parser.parse(input)`, plus list‑specific `parse_*` helpers, yield events
156
+ `[:format|:element|:end, payload, line_number]` and can be used as enumerators or with blocks.
157
+ - Raise `ArgumentError` for unsupported input types; never print to stdout/stderr in library code.
158
+
159
+ ## Errors, Warnings, Messages
160
+
161
+ - Use precise, user‑actionable wording; match existing phrasing where possible.
162
+ - Errors go to `result.errors`; warnings to `result.warnings`.
163
+ - Include context when helpful (e.g., line numbers); keep format stable for tests.
164
+
165
+ ## IO & Encoding
166
+
167
+ - Stream inputs (avoid loading entire files); set `io.binmode`.
168
+ - Detect and report UTF‑8 BOM; enforce UTF‑8 with `force_encoding` and `valid_encoding?`.
169
+ - Preserve processing order; support both LF and CRLF, adding a single warning for CRLF presence.
170
+
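The IO and encoding guidance above (binary mode, BOM detection, `force_encoding` plus `valid_encoding?`, a single CRLF flag) could look roughly like the following single-pass sketch. `inspect_encoding` and the shape of its return value are hypothetical and not taken from the gem; only standard library calls are used.

```ruby
# frozen_string_literal: true

require 'stringio'

# Hypothetical sketch; not the gem's implementation.
UTF8_BOM = "\xEF\xBB\xBF".b.freeze

# Reads the IO line by line and reports what was seen, without keeping lines.
def inspect_encoding(io)
  io.binmode
  findings = { bom: false, invalid_utf8: false, crlf: false }
  io.each_line.with_index do |raw_line, index|
    bytes = raw_line.b # binary copy, safe to compare byte-wise
    if index.zero? && bytes.start_with?(UTF8_BOM)
      findings[:bom] = true
      bytes = bytes.byteslice(UTF8_BOM.bytesize, bytes.bytesize)
    end
    findings[:crlf] ||= bytes.end_with?("\r\n")
    # Reinterpret the bytes as UTF-8 and check validity per line.
    findings[:invalid_utf8] ||= !bytes.force_encoding(Encoding::UTF_8).valid_encoding?
  end
  findings
end
```

A caller could map `:bom` and `:invalid_utf8` to errors and `:crlf` to one warning, in line with the policy that CRLF files remain valid.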
171
+ ## Performance & Memory
172
+
173
+ - Use streaming and incremental processing wherever possible (enumerate by line).
174
+ - Optimize for low memory usage (avoid reading whole files or accumulating large arrays).
175
+ - Prefer single‑pass algorithms; keep per‑line state minimal and discard intermediate buffers.
176
+
177
+ ## Comments and Parsing Rules
178
+
179
+ - Treat `(* ... *)` as inline comments and strip before structural checks.
180
+ - Ensure balanced comment delimiters on each line.
181
+ - For non‑empty, non‑comment data lines (after `FORMAT`), require at least one `;` delimiter.
182
+
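As an illustration of the comment and delimiter rules above, here is a minimal Ruby sketch. The helper names are hypothetical and not taken from the gem. Two assumptions are made: a line counts as balanced when no `(*` or `*)` remains after complete comments are removed (how the gem treats nested comments is not stated here), and `DATEIENDE` is exempt from the semicolon rule.

```ruby
# frozen_string_literal: true

# Hypothetical sketch; not the gem's implementation.

# Non-greedy match of one complete inline comment.
COMMENT_PATTERN = /\(\*.*?\*\)/

# Removes complete comments and surrounding whitespace.
def strip_comments(line)
  line.gsub(COMMENT_PATTERN, '').strip
end

# A line is balanced if no stray opener or closer is left over.
def balanced_comments?(line)
  remainder = line.gsub(COMMENT_PATTERN, '')
  !remainder.include?('(*') && !remainder.include?('*)')
end

# Empty and comment-only lines pass; DATEIENDE is exempt (assumption);
# every other line needs at least one semicolon.
def data_line_ok?(line)
  content = strip_comments(line)
  content.empty? || content == 'DATEIENDE' || content.include?(';')
end
```

Stripping before the structural check keeps a semicolon inside a comment from satisfying the delimiter rule.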
183
+ ## Testing Conventions
184
+
185
+ - Framework: Minitest; tests live under `test/dsv7/` with `_test.rb` suffix.
186
+ - Helpers used commonly:
187
+ - `format_line(type = 'Wettkampfdefinitionsliste', version = '7')`
188
+ - `validate_string(content)` → `Dsv7::Validator.validate(content)`
189
+ - Use tmp files for filename tests; clean them in `ensure` blocks.
190
+ - Assertions: prefer `assert_includes`, `assert_empty`, and `assert result.valid?` for clarity.
191
+
192
+ ## Tooling
193
+
194
+ - Install deps: `bundle install`
195
+ - Run tests: `bundle exec rake` (default task)
196
+ - Lint: `bundle exec rake rubocop`
197
+
198
+ ## Finish Checklist
199
+
200
+ - Run tests: `rake test` (or `bundle exec rake test`).
201
+ - Auto-correct style: `rubocop -A .` (or `bundle exec rubocop -A .`).
202
+ - Fix any remaining RuboCop items; re-run `rake test` and `rubocop` until green.
203
+ - If you add a new feature or public API, update `README.md` with usage and examples.
204
+
205
+ ## Repo Quick Reference (Appendix)
206
+
207
+ - `lib/dsv7/validator.rb` — public entrypoint `Dsv7::Validator.validate(...)`
208
+ - `lib/dsv7/validator/core.rb` — core stream pipeline (BOM/encoding, line parsing)
209
+ - `lib/dsv7/validator/line_analyzer.rb` — per‑line orchestration and finish checks
210
+ - `lib/dsv7/validator/result.rb` — result container (`errors`, `warnings`, `valid?`)
211
+ - `lib/dsv7/parser.rb` — parser namespace and version loader
212
+ - `lib/dsv7/parser/engine.rb` — streaming parser core
213
+ - `specification/dsv7/dsv7_specification.md` — structured spec text
214
+ - `test/dsv7/*_test.rb` — Minitest suite
215
+ - `Rakefile` — `rake` (default: tests + rubocop), `rake ci`, `rake rubocop`
216
+
217
+ Spec notes driving current behavior: require `FORMAT:<Listentyp>;7;`, final `DATEIENDE`, UTF‑8 without BOM,
218
+ inline comment handling, semicolon on data lines, allowed list types, filename pattern warnings, and
219
+ element/datatype enforcement per list type.
220
+
221
+ ## Patch Discipline
222
+
223
+ - Keep diffs minimal and targeted; avoid unrelated refactors.
224
+ - Do not reformat whole files; only touch necessary lines.
225
+ - Don’t add license headers or banners.
226
+ - Avoid inline code comments unless explicitly requested.
227
+ - Use descriptive names; avoid one‑letter variables.
228
+ - Prefer full-length, intent-revealing variable names; avoid abbreviations (e.g., `line_number` not `line_no`).
229
+
230
+ ## Error/Warning Policy
231
+
232
+ - Errors invalidate the file; warnings do not affect `valid?`.
233
+ - Include line numbers when relevant; keep message text stable.
234
+ - Prefer new checks as additional messages over changing existing texts.
235
+ - For non‑normative guidance, favor warnings instead of errors.
236
+
237
+ ## Dependency Policy
238
+
239
+ - Keep runtime dependencies at zero; prefer stdlib.
240
+ - Discuss before adding any new gem (including dev‑only).
241
+ - Maintain streaming design and dependency‑light footprint.
242
+
243
+ ## API Stability
244
+
245
+ - Do not change `Dsv7::Validator.validate` signature or return type.
246
+ - `Result#valid?` is based solely on presence of errors.
247
+ - Expose new information via additional fields/methods with tests.
248
+
249
+ ## Test Matrix Expectations
250
+
251
+ - Add both accept and reject tests for each new rule.
252
+ - Cover LF and CRLF, whitespace quirks, comments, UTF‑8 edge cases.
253
+ - Test both string and file‑path inputs (use `tmp` and cleanup in `ensure`).
254
+ - Prefer precise assertions (`assert_includes`, `assert_empty`, `assert result.valid?`).
255
+ - Keep tests independent, small, and fast.
data/Gemfile ADDED
@@ -0,0 +1,15 @@
1
+ # frozen_string_literal: true
2
+
3
+ source 'https://rubygems.org'
4
+
5
+ gemspec
6
+
7
+ group :development, :test do
8
+   gem 'minitest', '>= 5.18'
9
+   gem 'rake', '>= 13.0'
10
+   gem 'simplecov', require: false
11
+ end
12
+
13
+ group :development do
14
+   gem 'rubocop', require: false
15
+ end
data/Gemfile.lock ADDED
@@ -0,0 +1,61 @@
1
+ PATH
2
+   remote: .
3
+   specs:
4
+     dsv7-parser (7.0.0)
5
+
6
+ GEM
7
+   remote: https://rubygems.org/
8
+   specs:
9
+     ast (2.4.3)
10
+     docile (1.4.1)
11
+     json (2.15.0)
12
+     language_server-protocol (3.17.0.5)
13
+     lint_roller (1.1.0)
14
+     minitest (5.25.5)
15
+     parallel (1.27.0)
16
+     parser (3.3.9.0)
17
+       ast (~> 2.4.1)
18
+       racc
19
+     prism (1.5.1)
20
+     racc (1.8.1)
21
+     rainbow (3.1.1)
22
+     rake (13.3.0)
23
+     regexp_parser (2.11.3)
24
+     rubocop (1.81.0)
25
+       json (~> 2.3)
26
+       language_server-protocol (~> 3.17.0.2)
27
+       lint_roller (~> 1.1.0)
28
+       parallel (~> 1.10)
29
+       parser (>= 3.3.0.2)
30
+       rainbow (>= 2.2.2, < 4.0)
31
+       regexp_parser (>= 2.9.3, < 3.0)
32
+       rubocop-ast (>= 1.47.1, < 2.0)
33
+       ruby-progressbar (~> 1.7)
34
+       unicode-display_width (>= 2.4.0, < 4.0)
35
+     rubocop-ast (1.47.1)
36
+       parser (>= 3.3.7.2)
37
+       prism (~> 1.4)
38
+     ruby-progressbar (1.13.0)
39
+     simplecov (0.22.0)
40
+       docile (~> 1.1)
41
+       simplecov-html (~> 0.11)
42
+       simplecov_json_formatter (~> 0.1)
43
+     simplecov-html (0.13.2)
44
+     simplecov_json_formatter (0.1.4)
45
+     unicode-display_width (3.2.0)
46
+       unicode-emoji (~> 4.1)
47
+     unicode-emoji (4.1.0)
48
+
49
+ PLATFORMS
50
+   ruby
51
+   x86_64-linux
52
+
53
+ DEPENDENCIES
54
+   dsv7-parser!
55
+   minitest (>= 5.18)
56
+   rake (>= 13.0)
57
+   rubocop
58
+   simplecov
59
+
60
+ BUNDLED WITH
61
+    2.7.2
data/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2025 bigcurl
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.