verbatim 0.1.0

checksums.yaml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA256:
3
+ metadata.gz: b75beb6061fd54c18171fd501e92c6a90daee024dab8b18fc232cf0ec6b4a21e
4
+ data.tar.gz: fa29a0a2aa2348a6f1cb45a77beef2f310525a76780f0aa8a1d69474ef09224c
5
+ SHA512:
6
+ metadata.gz: a1ffa058b4c70bc06dac53b5c98e254ec332944d30ebe4d63ace45d1d7da1ad2f5867f025f97aa97b17ad405011190dd4c03818a38e2a30b461e5eda87a9080b
7
+ data.tar.gz: 335e1735ecbb0e2fc49d8f19e23d888a1d81a6d65336b01467ee3011ade95fc236be68476746c877ff62da1026b02066b53881a8872a09d64506fadd8777ba63
data/CHANGELOG.md ADDED
@@ -0,0 +1,28 @@
1
+ # Changelog
2
+
3
+ All notable changes to this project will be documented in this file. The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/), and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
4
+
5
+ ## [Unreleased]
6
+
7
+ ### Added
8
+
9
+ - (nothing yet)
10
+
11
+ ### Changed
12
+
13
+ - `Schema.parse` enforces a maximum input length of 128 characters (`Verbatim::Schema::MAX_INPUT_LEN`).
14
+
15
+ ## [0.1.0] - 2026-04-08
16
+
17
+ ### Added
18
+
19
+ - `Verbatim::Schema` DSL: `delimiter`, `segment` with `optional`, `lead`, `delimiter_after`, and type-specific options.
20
+ - Built-in segment types: `:uint`, `:int`, `:token`, `:string`, `:semver_ids`.
21
+ - `Verbatim::Schemas::SemVer` and `Verbatim::Schemas::CalVer`.
22
+ - `.parse` / `.format`, value API: readers, `#[]`, `#to_h`, `#to_s`, equality, `Comparable` (SemVer uses 2.0.0 precedence).
23
+ - `Verbatim::ParseError` with message, string, index, and segment.
24
+ - Instance `#with(**attrs)`; `#succ` / `#pred` on `Schema` (default `NotImplementedError`), with CalVer (calendar day) and SemVer (patch bump / core predecessor) implementations.
25
+ - MIT `LICENSE` included in the gem.
26
+
27
+ [Unreleased]: https://github.com/dsdugal/verbatim/compare/v0.1.0...HEAD
28
+ [0.1.0]: https://github.com/dsdugal/verbatim/releases/tag/v0.1.0
data/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 Dustin Dugal
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,133 @@
1
+ # Verbatim
2
+
3
+ Verbatim is a Ruby gem for **declarative version string schemas**. You subclass `Verbatim::Schema`, declare ordered segments in a class-level Domain-Specific Language (DSL), then **parse** strings into structured values and **format** them back with `#to_s`.
4
+
5
+ Requires **Ruby 3.2+**.
6
+
7
+ ## Installation
8
+
9
+ Add to your project's Gemfile:
10
+
11
+ ```ruby
12
+ gem "verbatim"
13
+ ```
14
+
15
+ Install the gem locally:
16
+
17
+ ```bash
18
+ bundle install
19
+ ```
20
+
21
+ ## Use
22
+
23
+ Define a schema, call `.parse`, use readers or `#[]` / `#to_s`:
24
+
25
+ ```ruby
26
+ require "verbatim"
27
+
28
+ class ApiVersion < Verbatim::Schema
29
+ delimiter "."
30
+ segment :major, :uint
31
+ segment :minor, :uint
32
+ end
33
+
34
+ v = ApiVersion.parse("2.15")
35
+ v.major # => 2
36
+ v.minor # => 15
37
+ v[:minor] # => 15
38
+ v.to_s # => "2.15"
39
+ ```
40
+
41
+ Build an instance by hand (useful for tests or construction from other data):
42
+
43
+ ```ruby
44
+ ApiVersion.new(major: 1, minor: 0).to_s # => "1.0"
45
+ ```
46
+
47
+ ## Semantic Version Schema (Built-in)
48
+
49
+ `Verbatim::Schemas::SemVer` encodes the [Semantic Versioning 2.0.0](https://semver.org/) core version plus optional prerelease and build metadata.
50
+
51
+ ```ruby
52
+ v = Verbatim::Schemas::SemVer.parse("1.0.0-rc.1+exp.sha.5114f85")
53
+ v.major # => 1
54
+ v.minor # => 0
55
+ v.patch # => 0
56
+ v.prerelease # => "rc.1"
57
+ v.build # => "exp.sha.5114f85"
58
+ v.to_s # => "1.0.0-rc.1+exp.sha.5114f85"
59
+ ```
60
+
61
+ ## Calendar Version Schema (Built-in)
62
+
63
+ `Verbatim::Schemas::CalVer` follows a common [Calendar Versioning](https://calver.org/)-style **YYYY.0M.0D** layout with **zero-padding** on `#to_s`.
64
+
65
+ ```ruby
66
+ v = Verbatim::Schemas::CalVer.parse("2026.4.8")
67
+ v.to_s # => "2026.04.08"
68
+ ```
69
+
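The zero-padding shown above can be reproduced in plain Ruby. The following is a minimal sketch that does not use the gem: `pad_uint` and `canonical_calver` are hypothetical helpers that only approximate what a `:uint` segment with `pad: 2` is described as doing on format.

```ruby
# Hypothetical helper, not part of Verbatim: left-pads a non-negative integer
# with zeros to at least `width` digits (longer values are left unchanged).
def pad_uint(value, width)
  value.to_s.rjust(width, "0")
end

# Hypothetical helper: splits a YYYY.M.D string into integers, then re-joins
# it with a two-digit month and day.
def canonical_calver(string)
  year, month, day = string.split(".").map { |part| Integer(part, 10) }
  [year.to_s, pad_uint(month, 2), pad_uint(day, 2)].join(".")
end

puts canonical_calver("2026.4.8") # prints 2026.04.08
```

The sketch performs no validation; range checks on month and day are left out on purpose.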
70
+ ## DSL Reference
71
+
72
+ #### `delimiter(string)`:
73
+ Sets the default string placed **between** consecutive segments when neither segment uses a `lead`.
74
+
75
+ #### `segment(name, type, **options)`:
76
+ Declares one segment, in order. Duplicate names are rejected.
77
+
78
+
79
+ | Option | Meaning |
80
+ | ----------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
81
+ | `optional: true` | With `lead`: if the input does not start with `lead`, the segment is `nil` and parsing continues. Optional segments without `lead` are not treated specially (use `lead` for optional tails such as SemVer prerelease/build). |
82
+ | `lead: "..."` | Literal prefix **before** this segment’s value (for example `"-"` or `"+"`). Consumed on parse; emitted on `#to_s`. No default delimiter is consumed before a segment that has a `lead`. |
83
+ | `delimiter_after:` | `:inherit` (default): after this segment, the next segment (if it has no `lead`) is separated with the schema’s default delimiter. `:none`: do not insert the default delimiter before the next segment (typical before optional `-` / `+` tails). Or pass a string for a fixed delimiter after this segment. |
84
+ | Other keyword arguments | Passed through to the segment **type** (for example `leading_zeros: false` on `:uint`, `pattern:` on `:string`, `terminator:` on `:semver_ids`). |
85
+
86
+
87
+ ### Object API
88
+
89
+ - **Access:** reader per segment (e.g. `#major`), plus `#[](name)` and `#to_h`.
90
+ - **`#to_s`:** formats in segment order; omits `nil` optional segments.
91
+ - **`#with(**attrs)`:** returns a **new** instance of the same schema with merged segment values.
92
+ - **`#succ` / `#pred`:** next and previous values along a schema-specific sequence. The base `Verbatim::Schema` raises `NotImplementedError`; override in subclasses or use `#with` to bump fields yourself.
93
+ - **Equality:** `#==`, `#eql?`, and `#hash` use the schema class and segment values.
94
+ - **`Comparable`:** same-class instances sort together (`nil` optionals before non-`nil` in each slot); different schema classes are incomparable. `Verbatim::Schemas::SemVer` uses SemVer 2.0.0 precedence.
95
+ - **`.parse`** returns a frozen instance.
96
+
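The two orderings in the `Comparable` bullet treat a missing prerelease differently: the generic rule sorts `nil` first, while SemVer 2.0.0 ranks a version without a prerelease above the same version with one. A minimal plain-Ruby sketch of the SemVer prerelease rule follows. `compare_prerelease` is a hypothetical helper written from the SemVer 2.0.0 specification; it is not the gem's implementation.

```ruby
# Hypothetical helper, not part of Verbatim: compares two prerelease strings
# by SemVer 2.0.0 precedence. nil means "no prerelease" and ranks highest.
def compare_prerelease(a, b)
  return 0 if a.nil? && b.nil?
  return 1 if a.nil?
  return -1 if b.nil?

  xs = a.split(".")
  ys = b.split(".")
  xs.zip(ys).each do |x, y|
    return 1 if y.nil? # a has extra identifiers after an equal prefix

    x_num = x.match?(/\A\d+\z/)
    y_num = y.match?(/\A\d+\z/)
    cmp =
      if x_num && y_num then x.to_i <=> y.to_i # numeric identifiers compare as integers
      elsif x_num then -1                       # numeric ranks below alphanumeric
      elsif y_num then 1
      else x <=> y                              # alphanumeric compares in ASCII order
      end
    return cmp unless cmp.zero?
  end
  xs.length <=> ys.length # the shorter list ranks lower when it is a prefix
end

puts compare_prerelease("rc.1", nil)          # prints -1
puts compare_prerelease("alpha", "alpha.1")   # prints -1
puts compare_prerelease("beta.11", "beta.2")  # prints 1
```

Build metadata plays no part in precedence, so the sketch takes only the prerelease portion.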
97
+ ### Class methods
98
+
99
+ - `YourSchema.parse(string)` → instance
100
+ - `YourSchema.format(instance)` → canonical string (same rules as `#to_s`)
101
+
102
+ ## Segment Types
103
+
104
+
105
+ | Type | Description | Options |
106
+ | ------------- | ---------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
107
+ | `:uint` | Non-empty ASCII digits → `Integer`; format with `Integer#to_s`, optionally zero-padded via `pad:` | `leading_zeros: false` disallows multi-digit values with a leading `0`. `pad: N` formats with at least `N` digits (e.g. `pad: 2` → `"08"`). `minimum:` / `maximum:` validate the integer after parsing. |
108
+ | `:int` | Optional leading `-`, then digits → `Integer`; format with `Integer#to_s` (negative values allowed) | — |
109
+ | `:token` | Maximal run of `[0-9A-Za-z-]` → string; format the same string | — |
110
+ | `:string` | Rest of input must match anchored `options[:pattern]`; format `value.to_s` | `pattern:` (required Regexp). |
111
+ | `:semver_ids` | Dot-separated SemVer identifiers until end of string or `terminator`; format joins segments with `.` | `terminator:` (e.g. `"+"` so prerelease stops before build metadata). |
112
+
113
+
114
+ ## Custom Segment Types
115
+
116
+ Register a handler object that responds to `parse(cursor, segment, parse_ctx)` and `format(value, segment)`. Parsing should advance `cursor` and return the Ruby value; formatting returns the string fragment for that segment. Use `Verbatim::Cursor` (`#peek`, `#advance`, `#remainder`, `#starts_with?`, `#eos?`) and `segment.options` for type-specific configuration.
117
+
118
+ ```ruby
119
+ Verbatim::Types.register(:my_token, my_handler)
120
+
121
+ class MySchema < Verbatim::Schema
122
+ delimiter "."
123
+ segment :x, :my_token
124
+ end
125
+ ```
126
+
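The handler contract described above can be sketched without the gem installed. In the example below, `StubCursor` is a stand-in that mirrors only `#peek`, `#advance`, and `#pos` from `Verbatim::Cursor`, and `HexHandler` is a hypothetical handler for lowercase hexadecimal segments. Raising `ArgumentError` is an assumption made to keep the sketch self-contained; a real handler would more likely raise `Verbatim::ParseError`.

```ruby
# Stand-in for Verbatim::Cursor so the sketch runs on its own.
class StubCursor
  attr_reader :string, :pos

  def initialize(string)
    @string = string
    @pos = 0
  end

  # Returns up to `count` characters at the current position, or "" at the end.
  def peek(count = 1)
    string[pos, count] || ""
  end

  def advance(delta = 1)
    @pos += delta
  end
end

# Hypothetical handler: parses a run of lowercase hex digits into an Integer.
module HexHandler
  # Advances the cursor past the hex digits and returns their integer value.
  def self.parse(cursor, _segment, _parse_ctx)
    digits = +""
    while cursor.peek.match?(/\A[0-9a-f]\z/)
      digits << cursor.peek
      cursor.advance
    end
    raise ArgumentError, "expected hex digits at index #{cursor.pos}" if digits.empty?

    digits.to_i(16)
  end

  # Returns the string fragment for this segment.
  def self.format(value, _segment)
    value.to_s(16)
  end
end

cursor = StubCursor.new("ff.1")
puts HexHandler.parse(cursor, nil, nil) # prints 255
puts cursor.pos                         # prints 2
puts HexHandler.format(255, nil)        # prints ff
```

The handler stops at the first non-hex character and leaves the cursor there, so the delimiter is left for the caller to consume.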
127
+ ## Errors
128
+
129
+ Input is re-encoded to UTF-8 before parsing, so version strings must be convertible to UTF-8. `Schema.parse` rejects strings longer than `Verbatim::Schema::MAX_INPUT_LEN` (**128** characters, counted as UTF-8 codepoints) with `Verbatim::ParseError`. Any failed parse raises `Verbatim::ParseError`, which exposes `#message`, `#string` (the input), `#index` (0-based **character** index into the string), and `#segment`.
130
+
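Because both the length limit and the error index are counted in characters rather than bytes, multibyte input behaves differently from what a byte count would suggest. The sketch below illustrates this in plain Ruby. `ASSUMED_MAX_INPUT_LEN`, `within_limit?`, and `caret_line` are hypothetical names introduced here; they are not part of the gem.

```ruby
# Mirrors the documented value of Verbatim::Schema::MAX_INPUT_LEN (assumption:
# the gem compares the character count of the UTF-8 string against it).
ASSUMED_MAX_INPUT_LEN = 128

# Hypothetical check: counts characters, not bytes.
def within_limit?(string)
  string.encode(Encoding::UTF_8).length <= ASSUMED_MAX_INPUT_LEN
end

# Hypothetical helper: renders the input with a caret under a character index,
# such as the one a ParseError reports through #index.
def caret_line(string, index)
  "#{string}\n#{' ' * index}^"
end

accented = "\u00e9" * 128
puts accented.bytesize        # prints 256
puts within_limit?(accented)  # prints true
puts within_limit?("1" * 129) # prints false
puts caret_line("2.x", 2)
```

The caret alignment assumes single-width characters, so it is only a rough aid for inputs that contain wide glyphs.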
131
+ ## License
132
+
133
+ MIT (see [LICENSE](LICENSE) and [verbatim.gemspec](verbatim.gemspec)).
@@ -0,0 +1,65 @@
1
+ # frozen_string_literal: true
2
+
3
+ module Verbatim
4
+ # Character-position cursor over a UTF-8 string; used by the parser and custom type handlers.
5
+ #
6
+ # @api public
7
+ #
8
+ class Cursor
9
+ # @return [String] the full string being scanned (UTF-8)
10
+ attr_reader :string
11
+ # @return [Integer] current zero-based character index into {#string}
12
+ attr_reader :pos
13
+
14
+ # @param string [String] input; re-encoded to UTF-8
15
+ # @return [void]
16
+ #
17
+ def initialize(string)
18
+ @string = string.encode(Encoding::UTF_8)
19
+ @pos = 0
20
+ end
21
+
22
+ # Checks if the cursor is at the end of the string.
23
+ #
24
+ # @return [Boolean] +true+ if no characters remain after {#pos}
25
+ #
26
+ def eos?
27
+ pos >= string.length
28
+ end
29
+
30
+ # Returns the remainder of the string from the current position to the end.
31
+ #
32
+ # @return [String] substring from {#pos} through end of {#string}
33
+ #
34
+ def remainder
35
+ string[pos..] || ""
36
+ end
37
+
38
+ # Returns up to +count+ characters at {#pos}, or empty string at end.
39
+ #
40
+ # @param count [Integer] number of characters to peek (default 1)
41
+ # @return [String] up to +count+ characters at {#pos}, or empty string at end
42
+ #
43
+ def peek(count = 1)
44
+ string[pos, count] || ""
45
+ end
46
+
47
+ # Advances the cursor by +delta+ characters.
48
+ #
49
+ # @param delta [Integer] number of character positions to advance (default 1)
50
+ # @return [void]
51
+ #
52
+ def advance(delta = 1)
53
+ @pos += delta
54
+ end
55
+
56
+ # Checks if the cursor starts with a given prefix.
57
+ #
58
+ # @param prefix [String] literal prefix to test
59
+ # @return [Boolean] +true+ if {#string} at {#pos} starts with +prefix+
60
+ #
61
+ def starts_with?(prefix)
62
+ string[pos, prefix.length] == prefix
63
+ end
64
+ end
65
+ end
@@ -0,0 +1,29 @@
1
+ # frozen_string_literal: true
2
+
3
+ module Verbatim
4
+ # Raised when a string does not match the expected schema layout.
5
+ #
6
+ # @api public
7
+ #
8
+ class ParseError < StandardError
9
+ # @return [String] the full input string that was being parsed
10
+ attr_reader :string
11
+ # @return [Integer] zero-based character index of the error position in {#string}
12
+ attr_reader :index
13
+ # @return [Symbol, nil] segment name when known, or +nil+
14
+ attr_reader :segment
15
+
16
+ # @param message [String] human-readable error description
17
+ # @param string [String] full input string
18
+ # @param index [Integer] character index where parsing failed
19
+ # @param segment [Symbol, nil] segment name when applicable
20
+ # @return [void]
21
+ #
22
+ def initialize(message, string:, index:, segment: nil)
23
+ super(message)
24
+ @string = string
25
+ @index = index
26
+ @segment = segment
27
+ end
28
+ end
29
+ end
@@ -0,0 +1,112 @@
1
+ # frozen_string_literal: true
2
+
3
+ module Verbatim
4
+ # Fills a {Schema} instance by consuming a string according to the schema’s segment list.
5
+ #
6
+ # @api public
7
+ #
8
+ class Parser
9
+ # @param schema_class [Class] subclass of {Schema}
10
+ # @param string [String] input to parse
11
+ # @return [void]
12
+ #
13
+ def initialize(schema_class, string)
14
+ @schema_class = schema_class
15
+ @cursor = Cursor.new(string)
16
+ end
17
+
18
+ # Consumes {#initialize}’s string and writes segment values into +instance+.
19
+ #
20
+ # @param instance [Schema] unfrozen or frozen target; receives {Schema#assign_segment}
21
+ # @return [Schema] +instance+
22
+ # @raise [ParseError] on malformed input or trailing data
23
+ #
24
+ def parse_into(instance)
25
+ segments = @schema_class.segments
26
+
27
+ segments.each_with_index do |segment, i|
28
+ if segment.optional? && segment.lead? && !@cursor.starts_with?(segment.lead)
29
+ instance.assign_segment(segment.name, nil)
30
+ next
31
+ end
32
+
33
+ if i.positive? && !segment.lead?
34
+ delim = effective_delimiter_after(segments[i - 1])
35
+ expect_delimiter!(delim, segment) if delim && !delim.empty?
36
+ end
37
+
38
+ if segment.lead?
39
+ expect_lead!(segment)
40
+ @cursor.advance(segment.lead.length)
41
+ end
42
+
43
+ value = Types.parse(segment.type, @cursor, segment, self)
44
+ instance.assign_segment(segment.name, value)
45
+ end
46
+
47
+ unless @cursor.eos?
48
+ raise ParseError.new(
49
+ "unexpected trailing data #{@cursor.remainder.inspect}",
50
+ string: @cursor.string,
51
+ index: @cursor.pos,
52
+ segment: nil
53
+ )
54
+ end
55
+
56
+ instance
57
+ end
58
+
59
+ private
60
+
61
+ # Expects a delimiter at the cursor.
62
+ #
63
+ # @param delim [String, nil]
64
+ # @param segment [Segment]
65
+ # @return [void]
66
+ # @raise [ParseError] if +delim+ is present but not at the cursor
67
+ #
68
+ def expect_delimiter!(delim, segment)
69
+ return if delim.nil? || delim.empty?
70
+
71
+ unless @cursor.starts_with?(delim)
72
+ raise ParseError.new(
73
+ "expected delimiter #{delim.inspect}",
74
+ string: @cursor.string,
75
+ index: @cursor.pos,
76
+ segment: segment.name
77
+ )
78
+ end
79
+ @cursor.advance(delim.length)
80
+ end
81
+
82
+ # Expects a lead at the cursor.
83
+ #
84
+ # @param segment [Segment]
85
+ # @return [void]
86
+ # @raise [ParseError] if the cursor does not start with +segment.lead+
87
+ #
88
+ def expect_lead!(segment)
89
+ return if @cursor.starts_with?(segment.lead)
90
+
91
+ raise ParseError.new(
92
+ "expected #{segment.lead.inspect}",
93
+ string: @cursor.string,
94
+ index: @cursor.pos,
95
+ segment: segment.name
96
+ )
97
+ end
98
+
99
+ # Returns the effective delimiter after a segment.
100
+ #
101
+ # @param segment [Segment]
102
+ # @return [String, nil] delimiter string to expect after +segment+, or +nil+
103
+ #
104
+ def effective_delimiter_after(segment)
105
+ case segment.delimiter_after
106
+ when :inherit then @schema_class.default_delimiter
107
+ when :none then nil
108
+ else segment.delimiter_after.to_s
109
+ end
110
+ end
111
+ end
112
+ end