hummel 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA256:
3
+ metadata.gz: 51c766a08cd0ff86bb1c6e7b24015ac5a0dcb7bccfce1afd50b97943d5e2b2e4
4
+ data.tar.gz: ca7137d3dfe54500e91f34cc83f295b35096a688690168a13edcab8071522d33
5
+ SHA512:
6
+ metadata.gz: 16a70da398fb107964d014cd393a18417fc6112d7bbb854f72297e9626ef718b196bd8b6a128f95c23444c4db95b81add9708d2aa19e55b8fdf1a7fc2773f275
7
+ data.tar.gz: 28018bb7b58fcc429a53c8001ea643d62331aeec9432a8aa267a8c7ab03c3b1a7e1f9a4263d83621cd8b893e8067255b31918abf3d4d3529fbbc05a84ab9d5fa
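The digests above can be checked against the unpacked gem components. The following is a minimal sketch, assuming the files have already been extracted from the `.gem` archive; `checksums_match?` is a hypothetical helper, not part of hummel or RubyGems.

```ruby
require "digest"

# Hypothetical helper (not part of the gem): returns true when the file at
# +path+ hashes to both expected hex digests.
def checksums_match?(path, sha256:, sha512:)
  Digest::SHA256.file(path).hexdigest == sha256 &&
    Digest::SHA512.file(path).hexdigest == sha512
end
```

For example, `checksums_match?("metadata.gz", sha256: "51c766a0...", sha512: "16a70da3...")` would be called with the full digest strings listed in checksums.yaml; the values here are abbreviated for readability only.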
data/.rspec ADDED
@@ -0,0 +1,3 @@
1
+ --format documentation
2
+ --color
3
+ --require spec_helper
data/.ruby-version ADDED
@@ -0,0 +1 @@
1
+ 3.4.7
data/.standard.yml ADDED
@@ -0,0 +1,3 @@
1
+ # For available configuration options, see:
2
+ # https://github.com/standardrb/standard
3
+ ruby_version: 3.4
data/CHANGELOG.md ADDED
@@ -0,0 +1,5 @@
1
+ ## [Unreleased]
2
+
3
+ ## [0.1.0] - 2025-10-12
4
+
5
+ - Initial release
@@ -0,0 +1,132 @@
1
+ # Contributor Covenant Code of Conduct
2
+
3
+ ## Our Pledge
4
+
5
+ We as members, contributors, and leaders pledge to make participation in our
6
+ community a harassment-free experience for everyone, regardless of age, body
7
+ size, visible or invisible disability, ethnicity, sex characteristics, gender
8
+ identity and expression, level of experience, education, socio-economic status,
9
+ nationality, personal appearance, race, caste, color, religion, or sexual
10
+ identity and orientation.
11
+
12
+ We pledge to act and interact in ways that contribute to an open, welcoming,
13
+ diverse, inclusive, and healthy community.
14
+
15
+ ## Our Standards
16
+
17
+ Examples of behavior that contributes to a positive environment for our
18
+ community include:
19
+
20
+ * Demonstrating empathy and kindness toward other people
21
+ * Being respectful of differing opinions, viewpoints, and experiences
22
+ * Giving and gracefully accepting constructive feedback
23
+ * Accepting responsibility and apologizing to those affected by our mistakes,
24
+ and learning from the experience
25
+ * Focusing on what is best not just for us as individuals, but for the overall
26
+ community
27
+
28
+ Examples of unacceptable behavior include:
29
+
30
+ * The use of sexualized language or imagery, and sexual attention or advances of
31
+ any kind
32
+ * Trolling, insulting or derogatory comments, and personal or political attacks
33
+ * Public or private harassment
34
+ * Publishing others' private information, such as a physical or email address,
35
+ without their explicit permission
36
+ * Other conduct which could reasonably be considered inappropriate in a
37
+ professional setting
38
+
39
+ ## Enforcement Responsibilities
40
+
41
+ Community leaders are responsible for clarifying and enforcing our standards of
42
+ acceptable behavior and will take appropriate and fair corrective action in
43
+ response to any behavior that they deem inappropriate, threatening, offensive,
44
+ or harmful.
45
+
46
+ Community leaders have the right and responsibility to remove, edit, or reject
47
+ comments, commits, code, wiki edits, issues, and other contributions that are
48
+ not aligned to this Code of Conduct, and will communicate reasons for moderation
49
+ decisions when appropriate.
50
+
51
+ ## Scope
52
+
53
+ This Code of Conduct applies within all community spaces, and also applies when
54
+ an individual is officially representing the community in public spaces.
55
+ Examples of representing our community include using an official email address,
56
+ posting via an official social media account, or acting as an appointed
57
+ representative at an online or offline event.
58
+
59
+ ## Enforcement
60
+
61
+ Instances of abusive, harassing, or otherwise unacceptable behavior may be
62
+ reported to the community leaders responsible for enforcement at
63
+ [INSERT CONTACT METHOD].
64
+ All complaints will be reviewed and investigated promptly and fairly.
65
+
66
+ All community leaders are obligated to respect the privacy and security of the
67
+ reporter of any incident.
68
+
69
+ ## Enforcement Guidelines
70
+
71
+ Community leaders will follow these Community Impact Guidelines in determining
72
+ the consequences for any action they deem in violation of this Code of Conduct:
73
+
74
+ ### 1. Correction
75
+
76
+ **Community Impact**: Use of inappropriate language or other behavior deemed
77
+ unprofessional or unwelcome in the community.
78
+
79
+ **Consequence**: A private, written warning from community leaders, providing
80
+ clarity around the nature of the violation and an explanation of why the
81
+ behavior was inappropriate. A public apology may be requested.
82
+
83
+ ### 2. Warning
84
+
85
+ **Community Impact**: A violation through a single incident or series of
86
+ actions.
87
+
88
+ **Consequence**: A warning with consequences for continued behavior. No
89
+ interaction with the people involved, including unsolicited interaction with
90
+ those enforcing the Code of Conduct, for a specified period of time. This
91
+ includes avoiding interactions in community spaces as well as external channels
92
+ like social media. Violating these terms may lead to a temporary or permanent
93
+ ban.
94
+
95
+ ### 3. Temporary Ban
96
+
97
+ **Community Impact**: A serious violation of community standards, including
98
+ sustained inappropriate behavior.
99
+
100
+ **Consequence**: A temporary ban from any sort of interaction or public
101
+ communication with the community for a specified period of time. No public or
102
+ private interaction with the people involved, including unsolicited interaction
103
+ with those enforcing the Code of Conduct, is allowed during this period.
104
+ Violating these terms may lead to a permanent ban.
105
+
106
+ ### 4. Permanent Ban
107
+
108
+ **Community Impact**: Demonstrating a pattern of violation of community
109
+ standards, including sustained inappropriate behavior, harassment of an
110
+ individual, or aggression toward or disparagement of classes of individuals.
111
+
112
+ **Consequence**: A permanent ban from any sort of public interaction within the
113
+ community.
114
+
115
+ ## Attribution
116
+
117
+ This Code of Conduct is adapted from the [Contributor Covenant][homepage],
118
+ version 2.1, available at
119
+ [https://www.contributor-covenant.org/version/2/1/code_of_conduct.html][v2.1].
120
+
121
+ Community Impact Guidelines were inspired by
122
+ [Mozilla's code of conduct enforcement ladder][Mozilla CoC].
123
+
124
+ For answers to common questions about this code of conduct, see the FAQ at
125
+ [https://www.contributor-covenant.org/faq][FAQ]. Translations are available at
126
+ [https://www.contributor-covenant.org/translations][translations].
127
+
128
+ [homepage]: https://www.contributor-covenant.org
129
+ [v2.1]: https://www.contributor-covenant.org/version/2/1/code_of_conduct.html
130
+ [Mozilla CoC]: https://github.com/mozilla/diversity
131
+ [FAQ]: https://www.contributor-covenant.org/faq
132
+ [translations]: https://www.contributor-covenant.org/translations
data/LICENSE.txt ADDED
@@ -0,0 +1,21 @@
1
+ The MIT License (MIT)
2
+
3
+ Copyright (c) 2025 Derek Bender
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in
13
+ all copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
21
+ THE SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,95 @@
1
+ # hummel
2
+
3
+ An HUML parser implementation in Ruby.
4
+
5
+ > **Note:** The gem is named `hummel` because `huml` was already taken on rubygems.org.
6
+
7
+ ## Installation
8
+
9
+ Add this line to your application's Gemfile:
10
+
11
+ ```ruby
12
+ gem 'hummel', source: 'https://gem.coop'
13
+ ```
14
+
15
+ And then execute:
16
+
17
+ ```bash
18
+ bundle install
19
+ ```
20
+
21
+ Or install it yourself as:
22
+
23
+ ```bash
24
+ gem install hummel --source https://gem.coop
25
+ ```
26
+
27
+ ## Usage
28
+
29
+ ### Parsing HUML
30
+
31
+ ```ruby
32
+ require 'hummel'
33
+
34
+ huml_string = <<~HUML
35
+ name: "John Doe"
36
+ age: 30
37
+ email: "john@example.com"
38
+ HUML
39
+
40
+ data = Hummel::Decode.parse(huml_string)
41
+ # => {"name"=>"John Doe", "age"=>30, "email"=>"john@example.com"}
42
+ ```
43
+
44
+ ### Encoding to HUML
45
+
46
+ ```ruby
47
+ require 'hummel'
48
+
49
+ data = {
50
+ name: "John Doe",
51
+ age: 30,
52
+ hobbies: ["reading", "coding", "hiking"]
53
+ }
54
+
55
+ huml_string = Hummel::Encode.stringify(data)
56
+ puts huml_string
57
+ # Output:
58
+ # age: 30
59
+ # hobbies::
60
+ #   - "reading"
61
+ #   - "coding"
62
+ #   - "hiking"
63
+ # name: "John Doe"
64
+ ```
65
+
66
+ ### Options
67
+
68
+ You can include the HUML version header when encoding:
69
+
70
+ ```ruby
71
+ Hummel::Encode.stringify(data, include_version: true)
72
+ # Output:
73
+ # %HUML v0.1.0
74
+ #
75
+ # age: 30
76
+ # ...
77
+ ```
78
+
79
+ ## Development
80
+
81
+ After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake spec` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
82
+
83
+ To install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and the created tag, and push the `.gem` file to [gem.coop](https://gem.coop).
84
+
85
+ ## Contributing
86
+
87
+ Bug reports and pull requests are welcome on GitHub at https://github.com/djbender/hummel. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the [code of conduct](https://github.com/djbender/hummel/blob/main/CODE_OF_CONDUCT.md).
88
+
89
+ ## License
90
+
91
+ The gem is available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT).
92
+
93
+ ## Code of Conduct
94
+
95
+ Everyone interacting in the Hummel project's codebases, issue trackers, chat rooms and mailing lists is expected to follow the [code of conduct](https://github.com/djbender/hummel/blob/main/CODE_OF_CONDUCT.md).
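The encoded output in the README lists keys alphabetically (age, hobbies, name) rather than in insertion order. Judging from `encode_object` in the encoder source, which sorts pairs with `sort_by { |key, _| key.to_s }`, this looks deliberate. A minimal sketch of only that ordering step follows; `sorted_keys` is a hypothetical name and nothing else about the encoder is assumed.

```ruby
# Sketch of the key-ordering step only; this is not the gem's encoder.
def sorted_keys(hash)
  # Sort pairs by the string form of each key, as encode_object appears to do.
  hash.sort_by { |key, _| key.to_s }.map(&:first)
end

data = {
  name: "John Doe",
  age: 30,
  hobbies: ["reading", "coding", "hiking"]
}

p sorted_keys(data) # prints [:age, :hobbies, :name]
```

Sorting on `to_s` means symbol and string keys are ordered together, so callers should not rely on hash insertion order surviving an encode.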
data/Rakefile ADDED
@@ -0,0 +1,10 @@
1
+ # frozen_string_literal: true
2
+
3
+ require "bundler/gem_tasks"
4
+ require "rspec/core/rake_task"
5
+
6
+ RSpec::Core::RakeTask.new(:spec)
7
+
8
+ require "standard/rake"
9
+
10
+ task default: %i[spec standard]
@@ -0,0 +1,825 @@
1
+ module Hummel
2
+ module Decode
3
+ def self.parse(input)
4
+ parser = Parser.new(input)
5
+ parser.parse
6
+ end
7
+
8
+ class Error < StandardError; end
9
+ end
10
+
11
+ class Parser
12
+ class ParseError < Hummel::Decode::Error
13
+ attr_reader :line
14
+
15
+ def initialize(message, line)
16
+ @line = line
17
+ super("line #{line}: #{message}")
18
+ end
19
+ end
20
+
21
+ TYPES = {
22
+ INLINE_DICT: 1,
23
+ MULTILINE_DICT: 2,
24
+ EMPTY_LIST: 3,
25
+ EMPTY_DICT: 4,
26
+ MULTILINE_LIST: 5,
27
+ INLINE_LIST: 6,
28
+ SCALAR: 7
29
+ }
30
+ SPECIAL_VALUES = [
31
+ ["true", true],
32
+ ["false", false],
33
+ ["null", nil],
34
+ ["nan", Float::NAN],
35
+ ["inf", Float::INFINITY]
36
+ ]
37
+
38
+ ESCAPE_MAP = {
39
+ '"' => '"',
40
+ "\\" => "\\",
41
+ "/" => "/",
42
+ "n" => "\n",
43
+ "t" => "\t",
44
+ "r" => "\r",
45
+ "f" => "\f",
46
+ "v" => "\v"
47
+ }
48
+
49
+ NUMBER_BASE_PREFIXES = [
50
+ ["0x", 16],
51
+ ["0o", 8],
52
+ ["0b", 2]
53
+ ]
54
+
55
+ attr_reader :data
56
+ attr_accessor :pos, :line
57
+
58
+ def initialize(data)
59
+ @data = data
60
+ @pos = 0
61
+ @line = 1
62
+ end
63
+
64
+ def parse
65
+ raise error("empty document is undefined") if data.empty?
66
+
67
+ if peek_string("%HUML")
68
+ advance(5)
69
+
70
+ if !done? && data[pos] == " "
71
+ advance(1)
72
+
73
+ # parse version string
74
+ starting_pos = pos
75
+ while !done? && ![" ", "\n", "#"].include?(data[pos])
76
+ self.pos += 1
77
+ end
78
+
79
+ if pos > starting_pos
80
+ version = data[starting_pos...pos]
81
+ if version != "v0.1.0"
82
+ raise error("unsupported version '#{version}'. expected 'v0.1.0'")
83
+ end
84
+ end
85
+ end
86
+
87
+ consume_line
88
+ end
89
+
90
+ skip_blank_lines
91
+
92
+ raise error("empty doc is undefined") if done?
93
+ raise error("root element must not be indented") if current_indent != 0
94
+ raise error("'::' indicator not allowed at document root") if peek_string("::")
95
+ raise error("':' indicator not allowed at document root") if peek_string(":") && !key_value_pair?
96
+
97
+ type_handlers = {
98
+ TYPES.fetch(:INLINE_DICT) => -> {
99
+ assert_root_end(parse_inline_vector_contents(TYPES[:INLINE_DICT]), "root inline dict")
100
+ },
101
+ TYPES[:MULTILINE_DICT] => -> {
102
+ parse_multiline_dict(0)
103
+ },
104
+ TYPES[:EMPTY_LIST] => -> {
105
+ advance(2)
106
+ consume_line
107
+ assert_root_end([], "root list")
108
+ },
109
+ TYPES[:EMPTY_DICT] => -> {
110
+ advance(2)
111
+ consume_line
112
+ assert_root_end({}, "root dict")
113
+ },
114
+ TYPES[:MULTILINE_LIST] => -> {
115
+ parse_multiline_list(0)
116
+ },
117
+ TYPES[:INLINE_LIST] => -> {
118
+ assert_root_end(
119
+ parse_inline_vector_contents(TYPES.fetch(:INLINE_LIST)),
120
+ "root inline list"
121
+ )
122
+ },
123
+ TYPES[:SCALAR] => -> {
124
+ val = parse_value(0)
125
+ consume_line
126
+ assert_root_end(val, "root scalar value")
127
+ }
128
+ }
129
+ type_handlers[root_type].call
130
+ end
131
+
132
+ # 90% sure this is a valid ruby version
133
+ def root_type
134
+ if key_value_pair?
135
+ return TYPES.fetch(:INLINE_DICT) if inline_dict_at_root?
136
+ return TYPES.fetch(:MULTILINE_DICT)
137
+ end
138
+ return TYPES.fetch(:EMPTY_LIST) if peek_string("[]")
139
+ return TYPES.fetch(:EMPTY_DICT) if peek_string("{}")
140
+ return TYPES.fetch(:MULTILINE_LIST) if peek_char(pos) == "-"
141
+ return TYPES.fetch(:INLINE_LIST) if inline_list_at_root?
142
+
143
+ TYPES.fetch(:SCALAR)
144
+ end
145
+
146
+ def assert_root_end(val, description)
147
+ skip_blank_lines
148
+ unless done?
149
+ raise error("unexpected content after #{description}")
150
+ end
151
+ val
152
+ end
153
+
154
+ def parse_multiline_dict(indent)
155
+ result = {}
156
+
157
+ loop do
158
+ skip_blank_lines
159
+ break if done?
160
+ break if current_indent < indent
161
+
162
+ if current_indent != indent
163
+ raise error("bad indent #{current_indent}, expected #{indent}")
164
+ end
165
+
166
+ unless key_start?
167
+ raise error("invalid character '#{data[pos]}', expected key")
168
+ end
169
+
170
+ key = parse_key
171
+
172
+ if result.include?(key)
173
+ raise error("duplicate key '#{key}' in dict")
174
+ end
175
+
176
+ indicator = parse_indicator
177
+
178
+ result[key] = if indicator == ":"
179
+ assert_space("after ':'")
180
+
181
+ multiline = peek_string("```") || peek_string('"""')
182
+
183
+ value = parse_value(current_indent)
184
+
185
+ consume_line unless multiline
186
+ value
187
+ else
188
+ parse_vector(current_indent + 2)
189
+ end
190
+ end
191
+
192
+ result
193
+ end
194
+
195
+ def parse_multiline_list(indent)
196
+ result = []
197
+
198
+ loop do
199
+ skip_blank_lines
200
+ break if done?
201
+ break if current_indent < indent
202
+
203
+ if current_indent != indent
204
+ raise error("bad indent #{current_indent}, expected #{indent}")
205
+ end
206
+
207
+ break if data[pos] != "-"
208
+
209
+ advance(1)
210
+ assert_space("after '-'")
211
+
212
+ result << if peek_string("::")
213
+ # nested vector
214
+ advance(2)
215
+ parse_vector(current_indent + 2)
216
+ else
217
+ # scalar value
218
+ parse_value(current_indent).tap do
219
+ consume_line
220
+ end
221
+ end
222
+ end
223
+
224
+ result
225
+ end
226
+
227
+ def multiline_vector_type(indent)
228
+ skip_blank_lines
229
+
230
+ if done? || current_indent < indent
231
+ raise error("ambiguous empty vector after '::'. Use [] or {}.")
232
+ end
233
+
234
+ if data[pos] == "-"
235
+ "list"
236
+ else
237
+ "dict"
238
+ end
239
+ end
240
+
241
+ def parse_vector(indent)
242
+ starting_pos = pos
243
+ skip_spaces
244
+
245
+ if done? || data[pos] == "\n" || data[pos] == "#"
246
+ self.pos = starting_pos
247
+ consume_line
248
+
249
+ vector_type = multiline_vector_type(indent)
250
+ next_indent = current_indent
251
+
252
+ parse_multiline_method = method(:"parse_multiline_#{vector_type}")
253
+ return parse_multiline_method.call(next_indent)
254
+ end
255
+
256
+ self.pos = starting_pos
257
+ assert_space("after '::'")
258
+
259
+ parse_inline_vector
260
+ end
261
+
262
+ def parse_inline_vector
263
+ if peek_string("[]")
264
+ advance(2)
265
+ consume_line
266
+ []
267
+ elsif peek_string("{}")
268
+ advance(2)
269
+ consume_line
270
+ {}
271
+ elsif inline_dict?
272
+ parse_inline_vector_contents(TYPES.fetch(:INLINE_DICT))
273
+ else
274
+ parse_inline_vector_contents(TYPES.fetch(:INLINE_LIST))
275
+ end
276
+ end
277
+
278
+ def parse_inline_vector_contents(type)
279
+ result = if type == TYPES.fetch(:INLINE_DICT)
280
+ {}
281
+ else
282
+ []
283
+ end
284
+
285
+ @first = nil # Reset for each new inline vector
286
+
287
+ while !done? && data[pos] != "\n" && data[pos] != "#"
288
+ skip_first do
289
+ expect_comma
290
+ end
291
+
292
+ if type == TYPES.fetch(:INLINE_DICT)
293
+ key = parse_key
294
+ if done? || data[pos] != ":"
295
+ raise error("expected ':' in inline dict")
296
+ end
297
+
298
+ advance(1)
299
+ assert_space("in inline dict")
300
+
301
+ if result.include?(key)
302
+ raise error("duplicate key '#{key}' in dict")
303
+ end
304
+
305
+ result[key] = parse_value(0)
306
+ else
307
+ result.push(parse_value(0))
308
+ end
309
+
310
+ if !done? && data[pos] == " "
311
+ next_pos = pos + 1
312
+
313
+ next_pos += 1 while next_pos < data.length && data[next_pos] == " "
314
+ if next_pos < data.length && data[next_pos] == ","
315
+ skip_spaces
316
+ else
317
+ break
318
+ end
319
+ end
320
+ end
321
+
322
+ consume_line
323
+ result
324
+ end
325
+
326
+ def parse_key
327
+ skip_spaces
328
+ if peek_char(pos) == '"'
329
+ return parse_string
330
+ end
331
+
332
+ start = pos
333
+ while !done? && (alpha_numeric?(data[pos]) ||
334
+ data[pos] == "-" || data[pos] == "_")
335
+ self.pos += 1
336
+ end
337
+
338
+ if self.pos == start
339
+ raise error("expected a key")
340
+ end
341
+
342
+ data[start...self.pos]
343
+ end
344
+
345
+ def parse_indicator
346
+ if done? || data[pos] != ":"
347
+ raise error("expected ':' or '::' after key")
348
+ end
349
+
350
+ advance(1)
351
+
352
+ if !done? && data[pos] == ":"
353
+ advance(1)
354
+ return "::"
355
+ end
356
+
357
+ ":"
358
+ end
359
+
360
+ def parse_value(key_indent)
361
+ if done?
362
+ raise error("unexpected end of input, expected a value")
363
+ end
364
+ character = data[self.pos]
365
+
366
+ if character == '"'
367
+ return peek_string('"""') ? parse_multiline_string(key_indent, false) : parse_string
368
+ end
369
+
370
+ if character == "`" && peek_string("```")
371
+ return parse_multiline_string(key_indent, true)
372
+ end
373
+
374
+ SPECIAL_VALUES.each do |string, value|
375
+ if peek_string(string)
376
+ advance(string.length)
377
+ return value
378
+ end
379
+ end
380
+
381
+ if character == "+"
382
+ advance(1)
383
+ if peek_string("inf")
384
+ advance(3)
385
+ return Float::INFINITY
386
+ end
387
+
388
+ if digit?(peek_char(self.pos))
389
+ self.pos -= 1
390
+ return parse_number
391
+ end
392
+ raise error("invalid character after '+'")
393
+ end
394
+
395
+ if character == "-"
396
+ advance(1)
397
+ if peek_string("inf")
398
+ advance(3)
399
+ return -Float::INFINITY
400
+ end
401
+
402
+ if digit?(peek_char(self.pos))
403
+ self.pos -= 1
404
+ return parse_number
405
+ end
406
+
407
+ raise error("invalid character after '-'")
408
+ end
409
+
410
+ if digit?(character)
411
+ return parse_number
412
+ end
413
+
414
+ raise error("unexpected character '#{character}' when parsing value")
415
+ end
416
+
417
+ def parse_string
418
+ advance(1)
419
+
420
+ result = ""
421
+ until done?
422
+ character = data[pos]
423
+
424
+ case character
425
+ when '"'
426
+ advance(1)
427
+ return result
428
+ when "\n"
429
+ raise error("newlines not allowed in single-line strings")
430
+ when "\\"
431
+ advance(1)
432
+ if done?
433
+ raise error("incomplete escape sequence")
434
+ end
435
+
436
+ escape = data[pos]
437
+
438
+ if ESCAPE_MAP.include?(escape)
439
+ result << ESCAPE_MAP.fetch(escape)
440
+ else
441
+ raise error("invalid escape character '\\#{escape}'")
442
+ end
443
+ else
444
+ result << character
445
+ end
446
+
447
+ advance(1)
448
+ end
449
+
450
+ raise error("unclosed string")
451
+ end
452
+
453
+ def parse_multiline_string(key_indent, preserve_spaces)
454
+ delimiter = data[pos, 3]
455
+ advance(3)
456
+ consume_line
457
+
458
+ # define line processing based on string type
459
+ process_line = if preserve_spaces
460
+ ->(content, line_indent) do
461
+ # strip required 2-space indent relative to key
462
+ required_indent = key_indent + 2
463
+ if content.length >= required_indent && space_string?(content[0, required_indent])
464
+ return content[required_indent..]
465
+ end
466
+
467
+ content
468
+ end
469
+ else
470
+ ->(content, _line_indent) { content.strip }
471
+ end
472
+
473
+ lines = []
474
+
475
+ until done?
476
+ line_starting_pos = pos
477
+ line_indent = 0
478
+
479
+ # count indentation
480
+ while !done? && data[pos] == " "
481
+ line_indent += 1
482
+ self.pos += 1
483
+ end
484
+
485
+ # check for closing delimiter
486
+ if peek_string(delimiter)
487
+ if line_indent != key_indent
488
+ raise error("multiline closing delimiter must be at same indentation as the key (#{key_indent} spaces)")
489
+ end
490
+
491
+ advance(3)
492
+ consume_line
493
+
494
+ return lines.join("\n")
495
+ end
496
+
497
+ # get line content
498
+ self.pos = line_starting_pos
499
+ line_content = consume_line_content
500
+ lines.push(process_line.call(line_content, line_indent))
501
+ end
502
+
503
+ raise error("unclosed multiline string")
504
+ end
505
+
506
+ # parses numbers in various formats: decimal, hex, octal, binary, float
507
+ def parse_number
508
+ starting_pos = pos
509
+
510
+ # handle sign
511
+ next_character = peek_char(pos)
512
+ if ["+", "-"].include?(next_character)
513
+ advance(1)
514
+ end
515
+
516
+ NUMBER_BASE_PREFIXES.each do |prefix, base|
517
+ if peek_string(prefix)
518
+ return parse_base(start: starting_pos, base: base, prefix:)
519
+ end
520
+ end
521
+
522
+ float = false
523
+
524
+ until done?
525
+ character = data[pos]
526
+
527
+ if digit?(character) || character == "_"
528
+ advance(1) # TODO: make advance always do 1 by default
529
+ elsif character == "."
530
+ float = true
531
+ advance(1)
532
+ elsif character.downcase == "e"
533
+ float = true
534
+ advance(1)
535
+
536
+ if ["+", "-"].include?(peek_char(pos))
537
+ advance(1)
538
+ end
539
+ else
540
+ break
541
+ end
542
+ end
543
+
544
+ # remove underscores and parse
545
+ number_string = data[starting_pos...pos].delete("_")
546
+ begin
547
+ float ? Float(number_string) : Integer(number_string)
548
+ rescue ArgumentError => e
549
+ raise error("invalid number: #{e.message}")
550
+ end
551
+ end
552
+
553
+ def parse_base(start:, base:, prefix:)
554
+ advance(prefix.length)
555
+ number_start = pos
556
+
557
+ validators = {
558
+ 16 => ->(c) { hex?(c) },
559
+ 8 => ->(c) { ("0".."7").cover?(c) },
560
+ 2 => ->(c) { ["0", "1"].include?(c) }
561
+ }
562
+
563
+ while !done? && validators.fetch(base).call(data[pos])
564
+ advance(1)
565
+ end
566
+
567
+ if pos == number_start
568
+ raise error("invalid number literal, requires digits after prefix")
569
+ end
570
+
571
+ sign = (data[start] == "-") ? -1 : 1
572
+ number_string = data[number_start...pos].delete("_")
573
+
574
+ sign * Integer(number_string, base)
575
+ end
576
+
577
+ def skip_blank_lines
578
+ until done?
579
+ line_start = self.pos
580
+ skip_spaces
581
+
582
+ if done?
583
+ raise error("trailing spaces are not allowed") if self.pos > line_start
584
+ return
585
+ end
586
+
587
+ if !["\n", "#"].include?(data[self.pos])
588
+ return # Found content
589
+ end
590
+
591
+ if data[self.pos] == "\n" && self.pos > line_start
592
+ raise error("trailing spaces are not allowed")
593
+ end
594
+
595
+ self.pos = line_start
596
+ consume_line
597
+ end
598
+ end
599
+
600
+ def consume_line
601
+ content_start = self.pos
602
+ skip_spaces
603
+
604
+ if done? || data[self.pos] == "\n"
605
+ if self.pos > content_start
606
+ raise error("trailing spaces are not allowed")
607
+ end
608
+ elsif data[self.pos] == "#"
609
+ if self.pos == content_start && current_indent != self.pos - line_start
610
+ raise error("a value must be separated from an inline comment by a space")
611
+ end
612
+
613
+ self.pos += 1
614
+ if !done? && ![" ", "\n"].include?(data[self.pos])
615
+ raise error("comment hash '#' must be followed by a space")
616
+ end
617
+
618
+ else
619
+ raise error("unexpected content at end of line")
620
+ end
621
+
622
+ # NOTE: this section has been refactored
623
+ next_new_line = data.index("\n", self.pos)
624
+ if next_new_line
625
+ remaining_line = data[self.pos...next_new_line]
626
+ if remaining_line.end_with?(" ") && remaining_line.length > 0
627
+ raise error("trailing spaces are not allowed")
628
+ end
629
+
630
+ self.pos = next_new_line + 1
631
+ @line += 1
632
+ else
633
+ self.pos = data.length
634
+ end
635
+ end
636
+
637
+ def consume_line_content
638
+ starting_pos = pos
639
+ next_newline = data.index("\n", pos)
640
+
641
+ if next_newline.nil?
642
+ # no more newlines, consume to end
643
+ content = data[starting_pos..]
644
+ self.pos = data.length
645
+ return content
646
+ end
647
+
648
+ content = data[starting_pos...next_newline]
649
+ self.pos = next_newline + 1
650
+ self.line += 1
651
+
652
+ content
653
+ end
654
+
655
+ def assert_space(context)
656
+ if done? || data[pos] != " "
657
+ raise error("expected single space #{context}")
658
+ end
659
+
660
+ advance(1)
661
+
662
+ if !done? && data[pos] == " "
663
+ raise error("expected single space #{context}, found multiple")
664
+ end
665
+ end
666
+
667
+ def expect_comma
668
+ skip_spaces
669
+
670
+ if done? || data[pos] != ","
671
+ raise error("expected a comma in inline collection")
672
+ end
673
+
674
+ if pos > 0 && data[pos - 1] == " "
675
+ raise error("no spaces allowed before comma")
676
+ end
677
+
678
+ advance(1)
679
+ assert_space("after comma")
680
+ end
681
+
682
+ def current_indent
683
+ start = line_start
684
+ indent = 0
685
+ while start + indent < data.length && data[start + indent] == " "
686
+ indent += 1
687
+ end
688
+
689
+ indent
690
+ end
691
+
692
+ def line_start
693
+ if self.pos > 0 && self.pos <= data.length && data[self.pos - 1] == "\n"
694
+ self.pos
695
+ else
696
+ return 0 if self.pos <= 0 # prevent -1 array access wrap arounds
697
+
698
+ last_newline = data.rindex("\n", self.pos - 1)
699
+ last_newline ? last_newline + 1 : 0
700
+ end
701
+ end
702
+
703
+ def key_value_pair?
704
+ current_pos = self.pos
705
+ parse_key
706
+ !done? && data[self.pos] == ":"
707
+ rescue ParseError
708
+ false
709
+ ensure
710
+ self.pos = current_pos
711
+ end
712
+
713
+ def inline_dict?
714
+ current_pos = pos
715
+ while current_pos < data.length && !["\n", "#"].include?(data[current_pos])
716
+ if data[current_pos] == ":"
717
+ if current_pos + 1 >= data.length || data[current_pos + 1] != ":"
718
+ return true
719
+ end
720
+ end
721
+ current_pos += 1
722
+ end
723
+ false
724
+ end
725
+
726
+ def inline_list_at_root?
727
+ line_end = data.index("\n", self.pos) || data.length
728
+ line = data[self.pos...line_end]
729
+ comment_index = line.index("#")
730
+ content = comment_index ? line[0...comment_index] : line
731
+
732
+ content.include?(",") && !content.include?(":")
733
+ end
734
+
735
+ def inline_dict_at_root?
736
+ line_end = data.index("\n", pos) || data.length
737
+ comment_index = data.index("#", pos) || data.length
738
+
739
+ marker = [
740
+ line_end,
741
+ comment_index
742
+ ].min
743
+
744
+ line = data[pos...marker]
745
+ has_colon = line.include?(":") && !line.include?("::")
746
+ has_comma = line.include?(",")
747
+
748
+ return false unless has_colon && has_comma
749
+
750
+ # look only at the text after the first line
751
+ remaining_content = data[line_end..]
752
+ .split("\n")
753
+ .drop(1)
754
+ .any? do |rest_line|
755
+ trimmed = rest_line.strip
756
+ !trimmed.empty? && !trimmed.start_with?("#")
757
+ end
758
+
759
+ !remaining_content
760
+ end
761
+
762
+ def key_start?
763
+ !done? && (data[pos] == '"' || alpha?(data[pos]))
764
+ end
765
+
766
+ def done?
767
+ self.pos >= data.length
768
+ end
769
+
770
+ def advance(amount)
771
+ self.pos += amount
772
+ end
773
+
774
+ def skip_spaces
775
+ while !done? && data[self.pos] == " "
776
+ advance(1)
777
+ end
778
+ end
779
+
780
+ def peek_string(string)
781
+ # true when `string` occurs exactly at the current position
782
+ self.pos + string.length <= data.length && data[self.pos..].start_with?(string)
783
+ end
784
+
785
+ def peek_char(position)
786
+ return data[position] if position >= 0 && position < data.length
787
+ "\0"
788
+ end
789
+
790
+ def digit?(character)
791
+ character.match?(/\d/)
792
+ end
793
+
794
+ def alpha?(character)
795
+ character.match?(/[a-zA-Z]/)
796
+ end
797
+
798
+ def alpha_numeric?(character)
799
+ alpha?(character) || digit?(character)
800
+ end
801
+
802
+ def hex?(character)
803
+ digit?(character) ||
804
+ ("a".."f").cover?(character.downcase)
805
+ end
806
+
807
+ def space_string?(string)
808
+ string.strip == ""
809
+ end
810
+
811
+ def error(message)
812
+ ParseError.new(message, self.line)
813
+ end
814
+
815
+ def skip_first
816
+ return unless block_given?
817
+
818
+ @first = true if @first.nil?
819
+ yield unless @first
820
+ @first = false
821
+ end
822
+
823
+ class IncompleteMethod < StandardError; end
824
+ end
825
+ end
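The final step of `parse_number` strips underscores and hands the text to `Kernel#Integer` or `Kernel#Float`. The sketch below isolates that conversion so its behavior is easier to see; `convert_numeric_text` is a hypothetical standalone helper, not a method of the gem, and it assumes the text has already been scanned as a number.

```ruby
# Hypothetical helper (not in the gem): converts already-scanned numeric text
# using Kernel#Integer and Kernel#Float, as parse_number's last step does.
def convert_numeric_text(text)
  cleaned = text.delete("_") # digit-group underscores are dropped
  hex = cleaned.match?(/\A[+-]?0x/i)
  # A '.' or exponent marks a float, except inside hex digits such as 0xFE.
  float = !hex && cleaned.match?(/[.eE]/)
  float ? Float(cleaned) : Integer(cleaned)
end

p convert_numeric_text("1_000_000") # prints 1000000
p convert_numeric_text("0xFF")      # prints 255
p convert_numeric_text("-0x1F")     # prints -31
p convert_numeric_text("0o17")      # prints 15
p convert_numeric_text("0b101")     # prints 5
p convert_numeric_text("1.5e3")     # prints 1500.0
```

One consequence worth knowing: without an explicit base, `Kernel#Integer` also treats a bare leading zero as octal, so `"010"` converts to 8. Whether HUML intends that reading is not stated in the source.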
@@ -0,0 +1,120 @@
1
+ require "json"
2
+
3
+ module Hummel
4
+ module Encode
5
+ # Regular expression to validate bare keys (no quotes needed)
6
+ BARE_KEY_REGEX = /\A[a-zA-Z][a-zA-Z0-9_-]*\z/
7
+
8
+ class << self
9
+ # Convert a Ruby object to HUML format
10
+ def stringify(obj, cfg = {})
11
+ lines = []
12
+ lines.concat(["%HUML v0.1.0", ""]) if cfg[:include_version]
13
+
14
+ lines.concat(encode_value(obj, 0, true))
15
+ lines << "" # Ensure document ends with newline
16
+
17
+ lines.join("\n")
18
+ end
19
+
20
+ private
21
+
22
+ # Core encoding methods
23
+
24
+ # Encode a value to HUML format - returns array of lines
25
+ def encode_value(value, indent, is_root_level = false)
26
+ return ["null"] if value.nil?
27
+
28
+ case value
29
+ when TrueClass, FalseClass
30
+ [value.to_s]
31
+ when Numeric
32
+ [format_number(value)]
33
+ when String
34
+ encode_string(value, indent)
35
+ when Array
36
+ encode_array(value, indent, is_root_level)
37
+ when Hash
38
+ encode_object(value, indent, is_root_level)
39
+ else
40
+ raise ArgumentError, "Unsupported type: #{value.class}"
41
+ end
42
+ end
43
+
44
+ # Type-specific encoding methods
45
+
46
+ # Format a number for HUML output
47
+ def format_number(num)
48
+ return "nan" if num.respond_to?(:nan?) && num.nan?
49
+ return "inf" if num == Float::INFINITY
50
+ return "-inf" if num == -Float::INFINITY
51
+
52
+ num.to_s
53
+ end
54
+
55
+ # Encode a string value - returns array of lines
56
+ def encode_string(str, indent)
57
+ return [str.to_json] unless str.include?("\n")
58
+
59
+ # Multi-line string
60
+ str_lines = str.split("\n")
61
+ str_lines.pop if str_lines.last && str_lines.last.empty?
62
+
63
+ ["```"] + str_lines.map { |line| "#{" " * indent}#{line}" } + ["#{" " * [indent - 2, 0].max}```"]
64
+ end
65
+
66
+ # Encode an array value - returns array of lines
67
+ def encode_array(arr, indent, is_root_level = false)
68
+ return ["[]"] if arr.empty?
69
+
70
+ item_indent = is_root_level ? 0 : indent
71
+
72
+ arr.flat_map do |item|
73
+ item_lines = encode_value(item, item_indent + 2)
74
+
75
+ if vector?(item) && !item.empty?
76
+ # Non-empty vector: "- ::" on one line, value on next lines
77
+ ["#{" " * item_indent}- ::"] + item_lines
78
+ else
79
+ # Scalar or empty vector: "- value" on same line
80
+ ["#{" " * item_indent}- #{item_lines.first}"] + item_lines[1..]
81
+ end
82
+ end
83
+ end
84
+
85
+ # Encode an object value - returns array of lines
86
+ def encode_object(obj, indent, is_root_level = false)
87
+ return ["{}"] if obj.empty?
88
+
89
+ key_indent = is_root_level ? 0 : indent
90
+
91
+ obj.sort_by { |key, _| key.to_s }.flat_map do |key, value|
92
+ is_vec = vector?(value)
93
+ value_lines = encode_value(value, key_indent + 2)
94
+
95
+ if is_vec && !value.empty?
96
+ # Non-empty vector: key:: on one line, value on next lines
97
+ ["#{" " * key_indent}#{quote_key(key)}::"] + value_lines
98
+ else
99
+ # Scalar or empty vector: combine key and value on first line
100
+ separator = is_vec ? ":: " : ": "
101
+ ["#{" " * key_indent}#{quote_key(key)}#{separator}#{value_lines.first}"] + value_lines[1..]
102
+ end
103
+ end
104
+ end
105
+
106
+ # Helper methods
107
+
108
+ # Determines if a value is a vector (array or object)
109
+ def vector?(value)
110
+ value.is_a?(Array) || value.is_a?(Hash)
111
+ end
112
+
113
+ # Quotes a key if necessary
114
+ def quote_key(key)
115
+ key_str = key.to_s
116
+ BARE_KEY_REGEX.match?(key_str) ? key_str : key_str.to_json
117
+ end
118
+ end
119
+ end
120
+ end
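Reviewer note on bare-key validation: in Ruby, `^` and `$` anchor at line boundaries, while `\A` and `\z` anchor at the start and end of the whole string. The sketch below shows how that affects a key containing a newline. It is a standalone illustration, not code from the gem; the constant names `LINE_ANCHORED` and `STRING_ANCHORED` are hypothetical, and only the character classes are taken from the encoder.

```ruby
# Same character classes, different anchors.
LINE_ANCHORED = /^[a-zA-Z][a-zA-Z0-9_-]*$/
STRING_ANCHORED = /\A[a-zA-Z][a-zA-Z0-9_-]*\z/

plain_key = "server-name"
multi_line_key = "name\nnot a bare key"

# Both patterns accept an ordinary bare key.
LINE_ANCHORED.match?(plain_key)   # => true
STRING_ANCHORED.match?(plain_key) # => true

# The line-anchored pattern is satisfied by the first line alone,
# so a key with an embedded newline would be emitted unquoted.
LINE_ANCHORED.match?(multi_line_key)   # => true
STRING_ANCHORED.match?(multi_line_key) # => false
```

With string anchors, any key containing a newline falls through to the quoted (`to_json`) form instead of being written bare.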
@@ -0,0 +1,5 @@
1
+ # frozen_string_literal: true
2
+
3
+ module Hummel
4
+ VERSION = "0.1.0"
5
+ end
data/lib/hummel.rb ADDED
@@ -0,0 +1,10 @@
1
+ # frozen_string_literal: true
2
+
3
+ require_relative "hummel/version"
4
+ require_relative "hummel/decode"
5
+ require_relative "hummel/encode"
6
+
7
+ module Hummel
8
+ class Error < StandardError; end
9
+ # Your code goes here...
10
+ end
data/sig/hummel.rbs ADDED
@@ -0,0 +1,4 @@
1
+ module Hummel
2
+ VERSION: String
3
+ # See the writing guide of rbs: https://github.com/ruby/rbs#guides
4
+ end
metadata ADDED
@@ -0,0 +1,60 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: hummel
3
+ version: !ruby/object:Gem::Version
4
+ version: 0.1.0
5
+ platform: ruby
6
+ authors:
7
+ - Derek Bender
8
+ bindir: exe
9
+ cert_chain: []
10
+ date: 1980-01-02 00:00:00.000000000 Z
11
+ dependencies: []
12
+ description: HUML (Human Markup Language) is a data serialization format designed
13
+ for human readability. This gem provides a complete Ruby implementation including
14
+ a parser for decoding HUML documents and an encoder for converting Ruby objects
15
+ to HUML format.
16
+ email:
17
+ - 170351+djbender@users.noreply.github.com
18
+ executables: []
19
+ extensions: []
20
+ extra_rdoc_files: []
21
+ files:
22
+ - ".rspec"
23
+ - ".ruby-version"
24
+ - ".standard.yml"
25
+ - CHANGELOG.md
26
+ - CODE_OF_CONDUCT.md
27
+ - LICENSE.txt
28
+ - README.md
29
+ - Rakefile
30
+ - lib/hummel.rb
31
+ - lib/hummel/decode.rb
32
+ - lib/hummel/encode.rb
33
+ - lib/hummel/version.rb
34
+ - sig/hummel.rbs
35
+ homepage: https://github.com/djbender/hummel
36
+ licenses:
37
+ - MIT
38
+ metadata:
39
+ allowed_push_host: https://rubygems.org
40
+ homepage_uri: https://github.com/djbender/hummel
41
+ source_code_uri: https://github.com/djbender/hummel
42
+ changelog_uri: https://github.com/djbender/hummel/blob/main/CHANGELOG.md
43
+ rdoc_options: []
44
+ require_paths:
45
+ - lib
46
+ required_ruby_version: !ruby/object:Gem::Requirement
47
+ requirements:
48
+ - - ">="
49
+ - !ruby/object:Gem::Version
50
+ version: 3.1.0
51
+ required_rubygems_version: !ruby/object:Gem::Requirement
52
+ requirements:
53
+ - - ">="
54
+ - !ruby/object:Gem::Version
55
+ version: '0'
56
+ requirements: []
57
+ rubygems_version: 3.6.9
58
+ specification_version: 4
59
+ summary: A Ruby parser and encoder for HUML (Human Markup Language)
60
+ test_files: []