rtoon 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA256:
3
+ metadata.gz: 89392ee400d56b09518657851bd1f69b63c8ca50f9376da702a541cc69eaefae
4
+ data.tar.gz: 96c0f189e8dfa1c559dd8b69e58ad351fba7f5c7b92c1c70f42f8432d70c5344
5
+ SHA512:
6
+ metadata.gz: c6a0a8e40b860a2b19b3b4060b4f51e1fc430ffc41e3b24dca5907ed16fc9cfcf056763e4184a640104ee61ec7da4b14c0f00e81dcdbabbcfc8e12d04d642a07
7
+ data.tar.gz: 38033867f87e4c304e8047d82ed7fb2b94ac4fafed741ed571a29e500c4d1e7eb90a55e69cb66d4639914374a234a1d07237fca2ceb9d120e071b118fd21c1e3
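A consumer who wants to confirm these digests locally could do something like the following. This is a minimal sketch using Ruby's standard `digest` library; `sha256_matches?` is a hypothetical helper name, not something the gem or RubyGems provides.

```ruby
require "digest"

# Hypothetical helper: compares raw bytes with a hex digest copied from checksums.yaml.
def sha256_matches?(bytes, expected_hex)
  Digest::SHA256.hexdigest(bytes) == expected_hex.strip.downcase
end

# Demonstration with the well-known SHA-256 digest of the empty string.
puts sha256_matches?("", "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855")
```

With an unpacked `.gem` archive, the same helper could be called as `sha256_matches?(File.binread("data.tar.gz"), "<digest from checksums.yaml>")`, assuming the archive sits in the current directory.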
data/LICENSE.txt ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2025
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,298 @@
1
+ # TOON Parser - Token Object Oriented Notation
2
+
3
+ A Ruby gem for parsing Token Object Oriented Notation (TOON), a tabular, schema-based data format with indentation-based structure.
4
+
5
+ ## What is TOON?
6
+
7
+ TOON is a data serialization format that combines:
8
+ - **Schema declarations** with field definitions
9
+ - **Tabular data** for compact representation
10
+ - **Indentation-based nesting** for hierarchy
11
+ - **Array size hints** for structure clarity
12
+
13
+ ### Example
14
+
15
+ ```toon
16
+ items[1]{users,status}:
17
+   users[2]{id,name}:
18
+     1,Ada
19
+     2,Bob
20
+   status: active
21
+ ```
22
+
23
+ This parses to:
24
+
25
+ ```ruby
26
+ {
27
+ "items" => [
28
+ {
29
+ "users" => [
30
+ { "id" => "1", "name" => "Ada" },
31
+ { "id" => "2", "name" => "Bob" }
32
+ ],
33
+ "status" => "active"
34
+ }
35
+ ]
36
+ }
37
+ ```
38
+
39
+ ## Installation
40
+
41
+ Add this line to your application's Gemfile:
42
+
43
+ ```ruby
44
+ gem 'rtoon'
45
+ ```
46
+
47
+ Or install it yourself:
48
+
49
+ ```bash
50
+ gem install rtoon
51
+ ```
52
+
53
+ ## Usage
54
+
55
+ ### Basic Parsing
56
+
57
+ ```ruby
58
+ require 'rtoon'
59
+
60
+ toon_string = <<~TOON
61
+   users[2]{id,name}:
62
+     1,Ada
63
+     2,Bob
64
+ TOON
65
+
66
+ result = Rtoon.parse(toon_string)
67
+ # => {"users" => [{"id" => "1", "name" => "Ada"}, {"id" => "2", "name" => "Bob"}]}
68
+ ```
69
+
70
+ ### Schema Declarations
71
+
72
+ Define data structures with schemas:
73
+
74
+ ```toon
75
+ products[3]{id,name,price}:
76
+   1,Widget,100
77
+   2,Gadget,200
78
+   3,Gizmo,300
79
+ ```
80
+
81
+ ### Nested Structures
82
+
83
+ Use indentation for nesting:
84
+
85
+ ```toon
86
+ company[1]{depts,location}:
87
+   depts[2]{name,size}:
88
+     engineering,50
89
+     sales,20
90
+   location: NYC
91
+ ```
92
+
93
+ ### Field Assignments
94
+
95
+ Simple key-value pairs:
96
+
97
+ ```toon
98
+ name: John
99
+ age: 30
100
+ active: true
101
+ ```
102
+
103
+ ## TOON Syntax
104
+
105
+ ### Schema Declaration
106
+
107
+ ```
108
+ identifier[size]{field1,field2,...}:
109
+ ```
110
+
111
+ - **identifier**: Name of the data structure
112
+ - **[size]**: Optional array size hint (can be any number)
113
+ - **{fields}**: Comma-separated list of field names
114
+ - **:** Indicates content follows on indented lines
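To make the header shape described above concrete, the sketch below splits a header line into its three parts with a regular expression. `parse_header` is a hypothetical helper, not part of rtoon; the gem itself recognizes headers through its Racc grammar, so this is only an illustration under that assumption.

```ruby
# Hypothetical helper, not part of rtoon: splits a schema header into its parts.
def parse_header(line)
  # identifier, optional [size], {comma-separated fields}, trailing colon
  pattern = /\A([A-Za-z_][A-Za-z0-9_.]*)(?:\[(\d+)\])?\{([^}]*)\}:\z/
  match = pattern.match(line.strip)
  return nil unless match

  {
    name: match[1],
    size: match[2] && match[2].to_i, # nil when the size hint is omitted
    fields: match[3].split(",").map(&:strip)
  }
end

p parse_header("users[2]{id,name}:")    # name "users", size 2, two fields
p parse_header("settings{theme,lang}:") # size is nil
p parse_header("name: John")            # not a header, so nil
```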
115
+
116
+ ### Data Rows
117
+
118
+ ```
119
+ value1,value2,value3
120
+ ```
121
+
122
+ Comma-separated values matching the schema fields in order.
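The pairing of row values with schema fields can be sketched in a few lines. `rows_to_hashes` is a hypothetical name; the behavior for short and long rows (missing fields are left out, extra values are dropped) mirrors what `build_array_from_rows` in this package's grammar file appears to do.

```ruby
# Hypothetical helper: pairs each row's values with the schema's field names.
def rows_to_hashes(fields, rows)
  rows.map do |row|
    values = row.split(",").map(&:strip)
    # Only as many fields as there are values get a key; extra values are dropped by zip
    fields.first(values.length).zip(values).to_h
  end
end

p rows_to_hashes(%w[id name], ["1,Ada", "2,Bob"])
```

Values stay strings here, matching the parser's documented behavior.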
123
+
124
+ ### Indentation
125
+
126
+ - Use **2 spaces** for each nesting level
127
+ - Indentation defines the structure hierarchy
128
+ - Empty lines are ignored
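The indentation rules above are the classic off-side rule: a stack of open indentation widths, one INDENT when a line is deeper than the top of the stack, and one DEDENT per level closed. The following is a minimal standalone sketch, not the gem's lexer; `indentation_tokens` is a hypothetical name, and closing all open levels at end of input is a simplification chosen for this example.

```ruby
# Hypothetical helper: emits INDENT/DEDENT markers from leading-space counts.
def indentation_tokens(text)
  stack = [0] # indentation widths of the currently open levels
  tokens = []

  text.each_line do |raw|
    line = raw.chomp
    next if line.strip.empty? # empty lines are ignored

    width = line[/\A */].length
    if width > stack.last
      stack.push(width)
      tokens << :INDENT
    else
      # Close every level that is deeper than this line
      while stack.last > width
        stack.pop
        tokens << :DEDENT
      end
    end
    tokens << [:LINE, line.strip]
  end

  # Close whatever is still open at end of input
  tokens.concat([:DEDENT] * (stack.length - 1))
end

sample = <<~TOON
  items[1]{users,status}:
    users[2]{id,name}:
      1,Ada
      2,Bob
    status: active
TOON

p indentation_tokens(sample)
```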
129
+
130
+ ### Complete Example
131
+
132
+ ```toon
133
+ database[1]{tables,version}:
134
+   tables[3]{name,records,indexed}:
135
+     users,1000,yes
136
+     posts,5000,yes
137
+     comments,20000,no
138
+   version: 3
139
+ ```
140
+
141
+ Parses to:
142
+
143
+ ```ruby
144
+ {
145
+ "database" => [
146
+ {
147
+ "tables" => [
148
+ { "name" => "users", "records" => "1000", "indexed" => "yes" },
149
+ { "name" => "posts", "records" => "5000", "indexed" => "yes" },
150
+ { "name" => "comments", "records" => "20000", "indexed" => "no" }
151
+ ],
152
+ "version" => "3"
153
+ }
154
+ ]
155
+ }
156
+ ```
157
+
158
+ ## API Reference
159
+
160
+ ### `Rtoon.parse(string)`
161
+
162
+ Parses a TOON string and returns a Ruby hash/array structure.
163
+
164
+ **Parameters:**
165
+ - `string` (String): TOON formatted string
166
+
167
+ **Returns:**
168
+ - Hash with the parsed data; schema blocks that contain data rows become Arrays of Hashes
169
+
170
+ **Raises:**
171
+ - `Rtoon::Parser::ParseError`: If the string contains invalid TOON syntax (the lexer raises a plain `RuntimeError` for characters it does not recognize)
172
+
173
+ ### `Rtoon.decode(string)`
174
+
175
+ Alias for `Rtoon.parse(string)`.
176
+
177
+ ## Features
178
+
179
+ ✅ Schema-based declarations
180
+ ✅ Tabular data rows
181
+ ✅ Indentation-based nesting
182
+ ✅ Array size hints
183
+ ✅ Field assignments
184
+ ✅ Multi-level nesting
185
+ ✅ Empty line handling
186
+
187
+ ## Grammar
188
+
189
+ TOON uses a context-free grammar parsed with Racc (Ruby parser generator):
190
+
191
+ - **Statements**: Schema blocks or field assignments
192
+ - **Schema Block**: Header + indented content
193
+ - **Schema Header**: `name[size]{fields}:`
194
+ - **Data Rows**: Comma-separated values
195
+ - **Field Assignment**: `name: value`
196
+ - **Indentation**: INDENT/DEDENT tokens (2 spaces)
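Before the grammar rules apply, each line has to be split into tokens. The sketch below shows one way to do that with Ruby's standard `StringScanner`; the token names follow the grammar file shipped in this package, while `tokenize_line` itself is a hypothetical helper and not the gem's lexer, which scans character by character.

```ruby
require "strscan"

# Hypothetical tokenizer for a single line of TOON.
def tokenize_line(line)
  punctuation = {
    "[" => :LBRACKET, "]" => :RBRACKET, "{" => :LBRACE,
    "}" => :RBRACE, ":" => :COLON, "," => :COMMA
  }
  scanner = StringScanner.new(line)
  tokens = []

  until scanner.eos?
    if scanner.skip(/ +/)
      next # whitespace between tokens carries no meaning
    elsif (text = scanner.scan(/[0-9]+(?:\.[0-9]*)?/))
      tokens << [:NUMBER, text]
    elsif (text = scanner.scan(/[A-Za-z_][A-Za-z0-9_.]*/))
      tokens << [:IDENTIFIER, text]
    elsif (text = scanner.scan(/[\[\]{}:,]/))
      tokens << [punctuation.fetch(text), text]
    else
      raise ArgumentError, "Unexpected character: #{scanner.peek(1).inspect}"
    end
  end
  tokens
end

p tokenize_line("users[2]{id,name}:")
p tokenize_line("1,Ada")
```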
197
+
198
+ ## Development
199
+
200
+ ### Prerequisites
201
+
202
+ ```bash
203
+ # Ruby 2.7 or higher
204
+ ruby --version
205
+
206
+ # Racc runtime (ships with Ruby; a bundled gem since Ruby 3.3)
207
+ ruby -e "require 'racc/parser'; puts 'Racc available'"
208
+ ```
209
+
210
+ ### Building from Source
211
+
212
+ ```bash
213
+ # Clone or extract the gem
214
+ cd rtoon
215
+
216
+ # Compile grammar
217
+ racc -o lib/rtoon/parser.tab.rb lib/rtoon/parser.y
218
+
219
+ # Run tests
220
+ ruby test/toon_test.rb
221
+
222
+ # Build gem
223
+ gem build rtoon.gemspec
224
+ ```
225
+
226
+ ### Running Tests
227
+
228
+ ```bash
229
+ ruby test/toon_test.rb
230
+ ```
231
+
232
+ Expected output:
233
+ ```
234
+ 11 runs, 39 assertions, 0 failures, 0 errors, 0 skips
235
+ ```
236
+
237
+ ## Architecture
238
+
239
+ ```
240
+ TOON String
241
+     ↓
242
+ Rtoon::Lexer (tokenization + indentation tracking)
243
+     ↓
244
+ Rtoon::Parser (Racc-generated parser)
245
+     ↓
246
+ Ruby Hash/Array
247
+ ```
248
+
249
+ ### Components
250
+
251
+ - **lib/rtoon/lexer.rb**: Indentation-aware lexer
252
+ - **lib/rtoon/parser.y**: Racc grammar definition
253
+ - **lib/rtoon/parser.tab.rb**: Generated parser (don't edit!)
254
+ - **lib/rtoon/encoder.rb**: TOON encoder (Ruby → TOON)
255
+ - **lib/rtoon.rb**: Main interface
256
+
257
+ ## Use Cases
258
+
259
+ TOON is ideal for:
260
+
261
+ - **Configuration files** with tabular data
262
+ - **Database seeds** with schemas and rows
263
+ - **API responses** with structured tables
264
+ - **Data exports** in readable format
265
+ - **Test fixtures** with clear structure
266
+
267
+ ## Limitations
268
+
269
+ - Values are currently returned as strings (no automatic type conversion)
270
+ - No support for nested arrays in data rows
271
+ - No string escaping (commas in values not supported)
272
+ - Numbers are treated as identifiers/strings
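Because every leaf comes back as a string, callers who need numbers or booleans have to convert them afterwards. The following is a minimal sketch of such a post-processing step; `coerce_value` is a hypothetical helper, not part of rtoon, and the conversion rules (integers, decimals, `true`/`false`) are assumptions chosen for this example.

```ruby
# Hypothetical post-processing step: converts string leaves to richer Ruby types.
def coerce_value(value)
  case value
  when Hash  then value.transform_values { |v| coerce_value(v) }
  when Array then value.map { |v| coerce_value(v) }
  when /\A-?\d+\z/      then value.to_i
  when /\A-?\d+\.\d+\z/ then value.to_f
  when "true"  then true
  when "false" then false
  else value
  end
end

parsed = { "users" => [{ "id" => "1", "name" => "Ada" }], "active" => "true" }
p coerce_value(parsed)
```

Keeping the conversion outside the parser leaves the choice of rules to the caller, which matters for values such as `yes`/`no` or zero-padded identifiers that should stay strings.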
273
+
274
+ ## Future Enhancements
275
+
276
+ - Type inference/conversion (numbers, booleans)
277
+ - String literals with quotes
278
+ - Escape sequences for commas in values
279
+ - Comments support
280
+ - Encoder (Ruby → TOON); an initial version already ships in `lib/rtoon/encoder.rb`
281
+
282
+ ## Contributing
283
+
284
+ 1. Fork the repository
285
+ 2. Create your feature branch
286
+ 3. Add tests for your changes
287
+ 4. Ensure all tests pass
288
+ 5. Submit a pull request
289
+
290
+ ## License
291
+
292
+ MIT License - See LICENSE.txt
293
+
294
+ ## Resources
295
+
296
+ - Racc: https://github.com/ruby/racc
297
+ - Parser theory: https://en.wikipedia.org/wiki/Parsing
298
+ - Indentation-based parsing: https://en.wikipedia.org/wiki/Off-side_rule
data/TOON_SPEC.md ADDED
@@ -0,0 +1,61 @@
1
+ # TOON Format Specification
2
+
3
+ ## Overview
4
+ Token Object Oriented Notation (TOON) is a tabular, schema-based data format with indentation-based structure.
5
+
6
+ ## Syntax Elements
7
+
8
+ ### Schema Declaration
9
+ ```
10
+ identifier[size]{field1,field2,...}:
11
+ ```
12
+
13
+ - **identifier**: Name of the data structure
14
+ - **[size]**: Optional array size hint
15
+ - **{fields}**: Comma-separated list of field names
16
+ - **:** Indicates content follows
17
+
18
+ ### Data Rows
19
+ ```
20
+ value1,value2,value3
21
+ ```
22
+
23
+ Comma-separated values corresponding to the schema fields.
24
+
25
+ ### Nesting
26
+ Uses indentation (2 spaces) to indicate nested structures.
27
+
28
+ ### Example
29
+ ```
30
+ items[1]{users,status}:
31
+   users[2]{id,name}:
32
+     1,Ada
33
+     2,Bob
34
+   status: active
35
+ ```
36
+
37
+ ## Parsed Structure
38
+ The above should parse to:
39
+ ```ruby
40
+ {
41
+ "items" => [
42
+ {
43
+ "users" => [
44
+ { "id" => "1", "name" => "Ada" },
45
+ { "id" => "2", "name" => "Bob" }
46
+ ],
47
+ "status" => "active"
48
+ }
49
+ ]
50
+ }
51
+ ```
52
+
53
+ ## Grammar Rules
54
+
55
+ 1. **Top-level**: Schema declaration or field assignment
56
+ 2. **Schema**: `name[size]{fields}:` followed by indented content
57
+ 3. **Field**: `name: value` (simple assignment)
58
+ 4. **Data rows**: Comma-separated values matching schema fields
59
+ 5. **Indentation**: 2 spaces per level
60
+ 6. **Arrays**: Created from schema declarations with size hint
61
+ 7. **Objects**: Created from schema fields
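The rules treat the size as a hint, and the parser source in this package stores it without comparing it to the number of rows. A caller who wants a stricter reading could add a check along these lines; `size_hint_mismatch` is a hypothetical helper and the message format is an assumption.

```ruby
# Hypothetical check: compares a schema's size hint with the rows actually parsed.
def size_hint_mismatch(name, size_hint, rows)
  # No hint, or a hint that matches, means there is nothing to report
  return nil if size_hint.nil? || rows.length == size_hint

  "#{name}: expected #{size_hint} rows, found #{rows.length}"
end

rows = [{ "id" => "1", "name" => "Ada" }, { "id" => "2", "name" => "Bob" }]
p size_hint_mismatch("users", 2, rows)
p size_hint_mismatch("users", 3, rows)
```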
data/lib/rtoon/encoder.rb ADDED
@@ -0,0 +1,68 @@
1
+ module Rtoon
2
+ class Encoder
3
+ def self.encode(value, indent: 0, key: nil, top_level: true)
4
+ case value
5
+ when Hash
6
+ encode_hash(value, indent: indent, key: key, top_level: top_level)
7
+ when Array
8
+ encode_array(value, indent: indent, key: key, top_level: top_level)
9
+ else
10
+ "#{value}"
11
+ end
12
+ end
13
+
14
+ def self.encode_hash(hash, indent:, key:, top_level:)
15
+ lines = []
16
+ if key
17
+ subkeys = hash.keys
18
+ lines << (" " * indent + "#{key}{#{subkeys.join(',')}}:")
19
+ indent += 1
20
+ end
21
+
22
+ hash.each do |k, v|
23
+ if v.is_a?(Hash)
24
+ lines << encode_hash(v, indent: indent, key: k, top_level: false)
25
+ elsif v.is_a?(Array)
26
+ lines << encode_array(v, indent: indent, key: k, top_level: false)
27
+ else
28
+ lines << (" " * indent + "#{k}: #{v}")
29
+ end
30
+ end
31
+
32
+ lines.join("\n") + (top_level ? "\n" : "")
33
+ end
34
+
35
+ def self.encode_array(array, indent:, key:, top_level:)
36
+ return "" if array.empty?
37
+
38
+ # Check if it's an array of hashes that themselves contain nested structures
39
+ if array.first.is_a?(Hash)
40
+ # If all elements are flat hashes (no arrays inside), compress to CSV form
41
+ if array.all? { |h| h.values.all? { |v| !v.is_a?(Array) && !v.is_a?(Hash) } }
42
+ element_keys = array.first.keys
43
+ lines = []
44
+ lines << (" " * indent + "#{key}[#{array.size}]{#{element_keys.join(',')}}:")
45
+ array.each do |element|
46
+ row = element.values.join(',')
47
+ lines << (" " * (indent + 1) + row)
48
+ end
49
+ return lines.join("\n")
50
+ else
51
+ # Otherwise, treat each element recursively
52
+ lines = []
53
+ lines << (" " * indent + "#{key}[#{array.size}]{#{array.first.keys.join(',')}}:")
54
+ array.each do |element|
55
+ lines << encode_hash(element, indent: indent + 1, key: nil, top_level: false)
56
+ end
57
+ return lines.join("\n")
58
+ end
59
+ else
60
+ # Array of primitives
61
+ lines = []
62
+ lines << (" " * indent + "#{key}[#{array.size}]:")
63
+ array.each { |v| lines << (" " * (indent + 1) + v.to_s) }
64
+ lines.join("\n")
65
+ end
66
+ end
67
+ end
68
+ end
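The tabular branch of the encoder above joins each element's values in whatever order that hash happens to hold them and does not check for commas, which the README lists as unsupported. A more defensive variant might look like the sketch below. `encode_table` is a hypothetical standalone helper, not the gem's API, and its 2-space indentation follows the README rather than the encoder source.

```ruby
# Hypothetical helper, not the gem's Encoder: writes flat hashes as one TOON table.
def encode_table(key, rows, indent: 0)
  raise ArgumentError, "rows must not be empty" if rows.empty?

  fields = rows.first.keys
  pad = "  " * indent # 2 spaces per nesting level
  lines = ["#{pad}#{key}[#{rows.size}]{#{fields.join(',')}}:"]

  rows.each do |row|
    # Look values up by field name so every row follows the header's order
    values = fields.map { |field| row.fetch(field).to_s }
    if values.any? { |value| value.include?(",") }
      raise ArgumentError, "values containing commas cannot be encoded: #{values.inspect}"
    end
    lines << "#{pad}  #{values.join(',')}"
  end

  lines.join("\n")
end

puts encode_table("users", [{ "id" => 1, "name" => "Ada" }, { "id" => 2, "name" => "Bob" }])
```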
data/lib/rtoon/lexer.rb ADDED
@@ -0,0 +1,136 @@
1
+ module Rtoon
2
+ class Lexer
3
+ attr_reader :line_number, :indent_stack
4
+
5
+ def initialize(input)
6
+ @input = input
7
+ @lines = input.split("\n")
8
+ @line_index = 0
9
+ @pos = 0
10
+ @current_line = @lines[@line_index] || ""
11
+ @indent_stack = [0]
12
+ @pending_dedents = 0
13
+ @line_number = 1
14
+ end
15
+
16
+ def next_token
17
+ # Handle pending dedents first
18
+ if @pending_dedents > 0
19
+ @pending_dedents -= 1
20
+ @indent_stack.pop
21
+ return [:DEDENT, nil]
22
+ end
23
+
24
+ # Skip empty lines
25
+ while @current_line && @current_line.strip.empty?
26
+ advance_line
27
+ return [false, false] if @current_line.nil?
28
+ end
29
+
30
+ return [false, false] if @current_line.nil?
31
+
32
+ # Check indentation at start of line
33
+ if @pos == 0
34
+ indent = @current_line[/^\s*/].length
35
+ current_indent = @indent_stack.last
36
+
37
+ if indent > current_indent
38
+ @indent_stack.push(indent)
39
+ @pos = indent
40
+ return [:INDENT, nil]
41
+ elsif indent < current_indent
42
+ # Queue one DEDENT for every open level deeper than this line.
43
+ # Counting is enough here: the levels are popped as the DEDENTs are emitted.
44
+ @pending_dedents = 0
45
+ @indent_stack.reverse_each do |level|
46
+ break if level <= indent
47
+ @pending_dedents += 1
48
+ end
49
+
50
+ if @pending_dedents > 0
51
+ @pending_dedents -= 1
52
+ @indent_stack.pop
53
+ @pos = indent
54
+ return [:DEDENT, nil]
55
+ end
56
+ end
57
+
58
+ @pos = indent
59
+ end
60
+
61
+ # Skip whitespace (but not newlines)
62
+ while @pos < @current_line.length && @current_line[@pos] == ' '
63
+ @pos += 1
64
+ end
65
+
66
+ # End of line
67
+ if @pos >= @current_line.length
68
+ advance_line
69
+ return [:NEWLINE, "\n"]
70
+ end
71
+
72
+ char = @current_line[@pos]
73
+
74
+ case char
75
+ when '['
76
+ @pos += 1
77
+ return [:LBRACKET, '[']
78
+ when ']'
79
+ @pos += 1
80
+ return [:RBRACKET, ']']
81
+ when '{'
82
+ @pos += 1
83
+ return [:LBRACE, '{']
84
+ when '}'
85
+ @pos += 1
86
+ return [:RBRACE, '}']
87
+ when ':'
88
+ @pos += 1
89
+ return [:COLON, ':']
90
+ when ','
91
+ @pos += 1
92
+ return [:COMMA, ',']
93
+ when '0'..'9'
94
+ return [:NUMBER, scan_number]
95
+ when 'a'..'z', 'A'..'Z', '_'
96
+ return [:IDENTIFIER, scan_identifier]
97
+ else
98
+ raise "Unexpected character: #{char.inspect} at line #{@line_number}, position #{@pos}"
99
+ end
100
+ end
101
+
102
+ private
103
+
104
+ def advance_line
105
+ @line_index += 1
106
+ @line_number += 1
107
+ if @line_index < @lines.length
108
+ @current_line = @lines[@line_index]
109
+ @pos = 0
110
+ else
111
+ @current_line = nil
112
+ end
113
+ end
114
+
115
+ def scan_number
116
+ start = @pos
117
+ # Scan digits
118
+ @pos += 1 while @pos < @current_line.length && @current_line[@pos].match?(/[0-9]/)
119
+
120
+ # Check for decimal point
121
+ if @pos < @current_line.length && @current_line[@pos] == '.'
122
+ @pos += 1
123
+ # Scan fractional part
124
+ @pos += 1 while @pos < @current_line.length && @current_line[@pos].match?(/[0-9]/)
125
+ end
126
+
127
+ @current_line[start...@pos]
128
+ end
129
+
130
+ def scan_identifier
131
+ start = @pos
132
+ @pos += 1 while @pos < @current_line.length && @current_line[@pos].match?(/[a-zA-Z0-9_.]/)
133
+ @current_line[start...@pos]
134
+ end
135
+ end
136
+ end
data/lib/rtoon/parser.tab.rb ADDED
@@ -0,0 +1,391 @@
1
+ #
2
+ # DO NOT MODIFY!!!!
3
+ # This file is automatically generated by Racc 1.8.1
4
+ # from Racc grammar file "parser.y".
5
+ #
6
+
7
+ require 'racc/parser.rb'
8
+
9
+ require_relative 'lexer'
10
+
11
+ module Rtoon
12
+ class Parser < Racc::Parser
13
+
14
+ module_eval(<<'...end parser.y/module_eval...', 'parser.y', 74)
15
+
16
+ def parse(str)
17
+ @lexer = Lexer.new(str)
18
+ do_parse
19
+ end
20
+
21
+ def next_token
22
+ @lexer.next_token
23
+ end
24
+
25
+ def on_error(token_id, value, value_stack)
26
+ line = @lexer.line_number
27
+ raise ParseError, "Parse error at line #{line}: unexpected token #{value.inspect}"
28
+ end
29
+
30
+ def process_statements(statements)
31
+ result = {}
32
+ current_schema = nil
33
+ data_rows = []
34
+
35
+ statements.compact.each do |stmt|
36
+ case stmt[:type]
37
+ when :schema
38
+ # First, finalize any pending schema with data rows
39
+ if current_schema && !data_rows.empty?
40
+ result[current_schema[:name]] = build_array_from_rows(current_schema, data_rows)
41
+ data_rows = []
42
+ end
43
+
44
+ current_schema = stmt[:header]
45
+
46
+ # Process nested content
47
+ nested_result = process_statements(stmt[:content])
48
+
49
+ # Check if there are data rows in the nested content
50
+ nested_data_rows = stmt[:content].select { |s| s && s[:type] == :data_row }
51
+
52
+ if !nested_data_rows.empty?
53
+ # This schema has data rows
54
+ result[current_schema[:name]] = build_array_from_rows(current_schema, nested_data_rows)
55
+ current_schema = nil
56
+ elsif !nested_result.empty?
57
+ # This schema has nested schemas
58
+ if current_schema[:size]
59
+ result[current_schema[:name]] = [nested_result]
60
+ else
61
+ result[current_schema[:name]] = nested_result
62
+ end
63
+ current_schema = nil
64
+ end
65
+
66
+ when :field
67
+ result[stmt[:name]] = stmt[:value]
68
+
69
+ when :data_row
70
+ data_rows << stmt[:values]
71
+ end
72
+ end
73
+
74
+ # Finalize any remaining schema
75
+ if current_schema && !data_rows.empty?
76
+ result[current_schema[:name]] = build_array_from_rows(current_schema, data_rows)
77
+ end
78
+
79
+ result
80
+ end
81
+
82
+ def build_array_from_rows(schema, data_row_statements)
83
+ fields = schema[:fields]
84
+
85
+ data_row_statements.map do |row_stmt|
86
+ values = row_stmt[:values]
87
+ obj = {}
88
+ fields.each_with_index do |field, idx|
89
+ obj[field] = values[idx] if idx < values.length
90
+ end
91
+ obj
92
+ end
93
+ end
94
+
95
+ class ParseError < StandardError; end
96
+ ...end parser.y/module_eval...
97
+ ##### State transition tables begin ###
98
+
99
+ racc_action_table = [
100
+ 8, 12, 8, 12, 8, 12, 8, 12, 13, 16,
101
+ 6, 17, 6, 18, 6, 31, 6, 32, 38, 15,
102
+ 32, 26, 12, 26, 12, 19, 20, 21, 22, 24,
103
+ 30, 33, 34, 35, 36, 24, 39 ]
104
+
105
+ racc_action_check = [
106
+ 0, 0, 2, 2, 21, 21, 28, 28, 1, 8,
107
+ 0, 8, 2, 8, 21, 23, 28, 23, 37, 7,
108
+ 37, 18, 18, 19, 19, 10, 13, 15, 16, 17,
109
+ 22, 29, 30, 31, 32, 34, 38 ]
110
+
111
+ racc_action_pointer = [
112
+ -2, 8, 0, nil, nil, nil, nil, 7, 5, nil,
113
+ 16, nil, nil, 26, nil, 17, 25, 27, 19, 21,
114
+ nil, 2, 25, 8, nil, nil, nil, nil, 4, 20,
115
+ 26, 25, 32, nil, 33, nil, nil, 11, 28, nil ]
116
+
117
+ racc_action_default = [
118
+ -2, -22, -1, -3, -5, -6, -7, -22, -20, -16,
119
+ -17, -18, -21, -22, -4, -22, -22, -22, -22, -22,
120
+ 40, -22, -22, -22, -12, -15, -20, -19, -14, -9,
121
+ -22, -22, -22, -8, -22, -11, -13, -22, -22, -10 ]
122
+
123
+ racc_goto_table = [
124
+ 14, 2, 23, 25, 27, 1, 29, nil, nil, nil,
125
+ nil, nil, nil, nil, nil, nil, nil, nil, nil, 37,
126
+ nil, nil, 28, nil, nil, nil, 14 ]
127
+
128
+ racc_goto_check = [
129
+ 3, 2, 8, 9, 9, 1, 7, nil, nil, nil,
130
+ nil, nil, nil, nil, nil, nil, nil, nil, nil, 8,
131
+ nil, nil, 2, nil, nil, nil, 3 ]
132
+
133
+ racc_goto_pointer = [
134
+ nil, 5, 1, -2, nil, nil, nil, -15, -15, -15,
135
+ nil, nil ]
136
+
137
+ racc_goto_default = [
138
+ nil, nil, nil, 3, 4, 5, 7, nil, nil, 11,
139
+ 9, 10 ]
140
+
141
+ racc_reduce_table = [
142
+ 0, 0, :racc_error,
143
+ 1, 14, :_reduce_1,
144
+ 0, 14, :_reduce_2,
145
+ 1, 15, :_reduce_3,
146
+ 2, 15, :_reduce_4,
147
+ 1, 16, :_reduce_5,
148
+ 1, 16, :_reduce_6,
149
+ 1, 16, :_reduce_7,
150
+ 5, 17, :_reduce_8,
151
+ 4, 17, :_reduce_9,
152
+ 8, 19, :_reduce_10,
153
+ 5, 19, :_reduce_11,
154
+ 1, 21, :_reduce_12,
155
+ 3, 21, :_reduce_13,
156
+ 1, 20, :_reduce_14,
157
+ 3, 18, :_reduce_15,
158
+ 1, 18, :_reduce_16,
159
+ 1, 23, :_reduce_17,
160
+ 1, 24, :_reduce_18,
161
+ 3, 24, :_reduce_19,
162
+ 1, 22, :_reduce_20,
163
+ 1, 22, :_reduce_21 ]
164
+
165
+ racc_reduce_n = 22
166
+
167
+ racc_shift_n = 40
168
+
169
+ racc_token_table = {
170
+ false => 0,
171
+ :error => 1,
172
+ :IDENTIFIER => 2,
173
+ :NUMBER => 3,
174
+ :LBRACKET => 4,
175
+ :RBRACKET => 5,
176
+ :LBRACE => 6,
177
+ :RBRACE => 7,
178
+ :COLON => 8,
179
+ :COMMA => 9,
180
+ :INDENT => 10,
181
+ :DEDENT => 11,
182
+ :NEWLINE => 12 }
183
+
184
+ racc_nt_base = 13
185
+
186
+ racc_use_result_var = true
187
+
188
+ Racc_arg = [
189
+ racc_action_table,
190
+ racc_action_check,
191
+ racc_action_default,
192
+ racc_action_pointer,
193
+ racc_goto_table,
194
+ racc_goto_check,
195
+ racc_goto_default,
196
+ racc_goto_pointer,
197
+ racc_nt_base,
198
+ racc_reduce_table,
199
+ racc_token_table,
200
+ racc_shift_n,
201
+ racc_reduce_n,
202
+ racc_use_result_var ]
203
+ Ractor.make_shareable(Racc_arg) if defined?(Ractor)
204
+
205
+ Racc_token_to_s_table = [
206
+ "$end",
207
+ "error",
208
+ "IDENTIFIER",
209
+ "NUMBER",
210
+ "LBRACKET",
211
+ "RBRACKET",
212
+ "LBRACE",
213
+ "RBRACE",
214
+ "COLON",
215
+ "COMMA",
216
+ "INDENT",
217
+ "DEDENT",
218
+ "NEWLINE",
219
+ "$start",
220
+ "document",
221
+ "statements",
222
+ "statement",
223
+ "schema_block",
224
+ "field_assignment",
225
+ "schema_header",
226
+ "block_content",
227
+ "field_list",
228
+ "value",
229
+ "data_row",
230
+ "value_list" ]
231
+ Ractor.make_shareable(Racc_token_to_s_table) if defined?(Ractor)
232
+
233
+ Racc_debug_parser = false
234
+
235
+ ##### State transition tables end #####
236
+
237
+ # reduce 0 omitted
238
+
239
+ module_eval(<<'.,.,', 'parser.y', 10)
240
+ def _reduce_1(val, _values, result)
241
+ result = process_statements(val[0])
242
+ result
243
+ end
244
+ .,.,
245
+
246
+ module_eval(<<'.,.,', 'parser.y', 11)
247
+ def _reduce_2(val, _values, result)
248
+ result = {}
249
+ result
250
+ end
251
+ .,.,
252
+
253
+ module_eval(<<'.,.,', 'parser.y', 15)
254
+ def _reduce_3(val, _values, result)
255
+ result = [val[0]]
256
+ result
257
+ end
258
+ .,.,
259
+
260
+ module_eval(<<'.,.,', 'parser.y', 16)
261
+ def _reduce_4(val, _values, result)
262
+ result = val[0] + [val[1]]
263
+ result
264
+ end
265
+ .,.,
266
+
267
+ module_eval(<<'.,.,', 'parser.y', 20)
268
+ def _reduce_5(val, _values, result)
269
+ result = val[0]
270
+ result
271
+ end
272
+ .,.,
273
+
274
+ module_eval(<<'.,.,', 'parser.y', 21)
275
+ def _reduce_6(val, _values, result)
276
+ result = val[0]
277
+ result
278
+ end
279
+ .,.,
280
+
281
+ module_eval(<<'.,.,', 'parser.y', 22)
282
+ def _reduce_7(val, _values, result)
283
+ result = nil
284
+ result
285
+ end
286
+ .,.,
287
+
288
+ module_eval(<<'.,.,', 'parser.y', 27)
289
+ def _reduce_8(val, _values, result)
290
+ result = { type: :schema, header: val[0], content: val[3] }
291
+ result
292
+ end
293
+ .,.,
294
+
295
+ module_eval(<<'.,.,', 'parser.y', 29)
296
+ def _reduce_9(val, _values, result)
297
+ result = { type: :schema, header: val[0], content: val[3] }
298
+ result
299
+ end
300
+ .,.,
301
+
302
+ module_eval(<<'.,.,', 'parser.y', 34)
303
+ def _reduce_10(val, _values, result)
304
+ result = { name: val[0], size: val[2].to_i, fields: val[5] }
305
+ result
306
+ end
307
+ .,.,
308
+
309
+ module_eval(<<'.,.,', 'parser.y', 36)
310
+ def _reduce_11(val, _values, result)
311
+ result = { name: val[0], size: nil, fields: val[2] }
312
+ result
313
+ end
314
+ .,.,
315
+
316
+ module_eval(<<'.,.,', 'parser.y', 40)
317
+ def _reduce_12(val, _values, result)
318
+ result = [val[0]]
319
+ result
320
+ end
321
+ .,.,
322
+
323
+ module_eval(<<'.,.,', 'parser.y', 41)
324
+ def _reduce_13(val, _values, result)
325
+ result = val[0] + [val[2]]
326
+ result
327
+ end
328
+ .,.,
329
+
330
+ module_eval(<<'.,.,', 'parser.y', 45)
331
+ def _reduce_14(val, _values, result)
332
+ result = val[0]
333
+ result
334
+ end
335
+ .,.,
336
+
337
+ module_eval(<<'.,.,', 'parser.y', 49)
338
+ def _reduce_15(val, _values, result)
339
+ result = { type: :field, name: val[0], value: val[2] }
340
+ result
341
+ end
342
+ .,.,
343
+
344
+ module_eval(<<'.,.,', 'parser.y', 50)
345
+ def _reduce_16(val, _values, result)
346
+ result = { type: :data_row, values: val[0] }
347
+ result
348
+ end
349
+ .,.,
350
+
351
+ module_eval(<<'.,.,', 'parser.y', 54)
352
+ def _reduce_17(val, _values, result)
353
+ result = val[0]
354
+ result
355
+ end
356
+ .,.,
357
+
358
+ module_eval(<<'.,.,', 'parser.y', 58)
359
+ def _reduce_18(val, _values, result)
360
+ result = [val[0]]
361
+ result
362
+ end
363
+ .,.,
364
+
365
+ module_eval(<<'.,.,', 'parser.y', 59)
366
+ def _reduce_19(val, _values, result)
367
+ result = val[0] + [val[2]]
368
+ result
369
+ end
370
+ .,.,
371
+
372
+ module_eval(<<'.,.,', 'parser.y', 63)
373
+ def _reduce_20(val, _values, result)
374
+ result = val[0]
375
+ result
376
+ end
377
+ .,.,
378
+
379
+ module_eval(<<'.,.,', 'parser.y', 64)
380
+ def _reduce_21(val, _values, result)
381
+ result = val[0]
382
+ result
383
+ end
384
+ .,.,
385
+
386
+ def _reduce_none(val, _values, result)
387
+ val[0]
388
+ end
389
+
390
+ end # class Parser
391
+ end # module Rtoon
data/lib/rtoon/parser.y ADDED
@@ -0,0 +1,154 @@
1
+ # TOON Grammar - Tabular, schema-based format
2
+
3
+ class Rtoon::Parser
4
+
5
+ token IDENTIFIER NUMBER
6
+ token LBRACKET RBRACKET LBRACE RBRACE COLON COMMA
7
+ token INDENT DEDENT NEWLINE
8
+
9
+ rule
10
+ document
11
+ : statements { result = process_statements(val[0]) }
12
+ | /* empty */ { result = {} }
13
+ ;
14
+
15
+ statements
16
+ : statement { result = [val[0]] }
17
+ | statements statement { result = val[0] + [val[1]] }
18
+ ;
19
+
20
+ statement
21
+ : schema_block { result = val[0] }
22
+ | field_assignment { result = val[0] }
23
+ | NEWLINE { result = nil }
24
+ ;
25
+
26
+ schema_block
27
+ : schema_header NEWLINE INDENT block_content DEDENT
28
+ { result = { type: :schema, header: val[0], content: val[3] } }
29
+ | schema_header NEWLINE INDENT block_content
30
+ { result = { type: :schema, header: val[0], content: val[3] } }
31
+ ;
32
+
33
+ schema_header
34
+ : IDENTIFIER LBRACKET NUMBER RBRACKET LBRACE field_list RBRACE COLON
35
+ { result = { name: val[0], size: val[2].to_i, fields: val[5] } }
36
+ | IDENTIFIER LBRACE field_list RBRACE COLON
37
+ { result = { name: val[0], size: nil, fields: val[2] } }
38
+ ;
39
+
40
+ field_list
41
+ : IDENTIFIER { result = [val[0]] }
42
+ | field_list COMMA IDENTIFIER { result = val[0] + [val[2]] }
43
+ ;
44
+
45
+ block_content
46
+ : statements { result = val[0] }
47
+ ;
48
+
49
+ field_assignment
50
+ : IDENTIFIER COLON value { result = { type: :field, name: val[0], value: val[2] } }
51
+ | data_row { result = { type: :data_row, values: val[0] } }
52
+ ;
53
+
54
+ data_row
55
+ : value_list { result = val[0] }
56
+ ;
57
+
58
+ value_list
59
+ : value { result = [val[0]] }
60
+ | value_list COMMA value { result = val[0] + [val[2]] }
61
+ ;
62
+
63
+ value
64
+ : IDENTIFIER { result = val[0] }
65
+ | NUMBER { result = val[0] }
66
+ ;
67
+
68
+ end
69
+
70
+ ---- header
71
+ require_relative 'lexer'
72
+
73
+ ---- inner
74
+
75
+ def parse(str)
76
+ @lexer = Lexer.new(str)
77
+ do_parse
78
+ end
79
+
80
+ def next_token
81
+ @lexer.next_token
82
+ end
83
+
84
+ def on_error(token_id, value, value_stack)
85
+ line = @lexer.line_number
86
+ raise ParseError, "Parse error at line #{line}: unexpected token #{value.inspect}"
87
+ end
88
+
89
+ def process_statements(statements)
90
+ result = {}
91
+ current_schema = nil
92
+ data_rows = []
93
+
94
+ statements.compact.each do |stmt|
95
+ case stmt[:type]
96
+ when :schema
97
+ # First, finalize any pending schema with data rows
98
+ if current_schema && !data_rows.empty?
99
+ result[current_schema[:name]] = build_array_from_rows(current_schema, data_rows)
100
+ data_rows = []
101
+ end
102
+
103
+ current_schema = stmt[:header]
104
+
105
+ # Process nested content
106
+ nested_result = process_statements(stmt[:content])
107
+
108
+ # Check if there are data rows in the nested content
109
+ nested_data_rows = stmt[:content].select { |s| s && s[:type] == :data_row }
110
+
111
+ if !nested_data_rows.empty?
112
+ # This schema has data rows
113
+ result[current_schema[:name]] = build_array_from_rows(current_schema, nested_data_rows)
114
+ current_schema = nil
115
+ elsif !nested_result.empty?
116
+ # This schema has nested schemas
117
+ if current_schema[:size]
118
+ result[current_schema[:name]] = [nested_result]
119
+ else
120
+ result[current_schema[:name]] = nested_result
121
+ end
122
+ current_schema = nil
123
+ end
124
+
125
+ when :field
126
+ result[stmt[:name]] = stmt[:value]
127
+
128
+ when :data_row
129
+ data_rows << stmt[:values]
130
+ end
131
+ end
132
+
133
+ # Finalize any remaining schema
134
+ if current_schema && !data_rows.empty?
135
+ result[current_schema[:name]] = build_array_from_rows(current_schema, data_rows)
136
+ end
137
+
138
+ result
139
+ end
140
+
141
+ def build_array_from_rows(schema, data_row_statements)
142
+ fields = schema[:fields]
143
+
144
+ data_row_statements.map do |row_stmt|
145
+ values = row_stmt[:values]
146
+ obj = {}
147
+ fields.each_with_index do |field, idx|
148
+ obj[field] = values[idx] if idx < values.length
149
+ end
150
+ obj
151
+ end
152
+ end
153
+
154
+ class ParseError < StandardError; end
data/lib/rtoon.rb ADDED
@@ -0,0 +1,19 @@
1
+ require_relative 'rtoon/parser.tab'
2
+ require_relative 'rtoon/encoder'
3
+
4
+ module Rtoon
5
+ VERSION = "0.1.0"
6
+
7
+ def self.parse(string)
8
+ parser = Parser.new
9
+ parser.parse(string)
10
+ end
11
+
12
+ def self.decode(string)
13
+ parse(string)
14
+ end
15
+
16
+ def self.encode(hash, indent_level = 0)
17
+ Encoder.encode(hash, indent: indent_level)
18
+ end
19
+ end
metadata ADDED
@@ -0,0 +1,91 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: rtoon
3
+ version: !ruby/object:Gem::Version
4
+ version: 0.1.0
5
+ platform: ruby
6
+ authors:
7
+ - Antonio Chavez
8
+ bindir: bin
9
+ cert_chain: []
10
+ date: 1980-01-02 00:00:00.000000000 Z
11
+ dependencies:
12
+ - !ruby/object:Gem::Dependency
13
+ name: racc
14
+ requirement: !ruby/object:Gem::Requirement
15
+ requirements:
16
+ - - "~>"
17
+ - !ruby/object:Gem::Version
18
+ version: '1.7'
19
+ type: :development
20
+ prerelease: false
21
+ version_requirements: !ruby/object:Gem::Requirement
22
+ requirements:
23
+ - - "~>"
24
+ - !ruby/object:Gem::Version
25
+ version: '1.7'
26
+ - !ruby/object:Gem::Dependency
27
+ name: rake
28
+ requirement: !ruby/object:Gem::Requirement
29
+ requirements:
30
+ - - "~>"
31
+ - !ruby/object:Gem::Version
32
+ version: '13.0'
33
+ type: :development
34
+ prerelease: false
35
+ version_requirements: !ruby/object:Gem::Requirement
36
+ requirements:
37
+ - - "~>"
38
+ - !ruby/object:Gem::Version
39
+ version: '13.0'
40
+ - !ruby/object:Gem::Dependency
41
+ name: minitest
42
+ requirement: !ruby/object:Gem::Requirement
43
+ requirements:
44
+ - - "~>"
45
+ - !ruby/object:Gem::Version
46
+ version: '5.0'
47
+ type: :development
48
+ prerelease: false
49
+ version_requirements: !ruby/object:Gem::Requirement
50
+ requirements:
51
+ - - "~>"
52
+ - !ruby/object:Gem::Version
53
+ version: '5.0'
54
+ description: A Ruby library for parsing TOON, a tabular, schema-based data format
55
+ with indentation-based structure
56
+ email:
57
+ - antonio@zoobean.com
58
+ executables: []
59
+ extensions: []
60
+ extra_rdoc_files: []
61
+ files:
62
+ - LICENSE.txt
63
+ - README.md
64
+ - TOON_SPEC.md
65
+ - lib/rtoon.rb
66
+ - lib/rtoon/encoder.rb
67
+ - lib/rtoon/lexer.rb
68
+ - lib/rtoon/parser.tab.rb
69
+ - lib/rtoon/parser.y
70
+ homepage: https://github.com/zoobean/rtoon
71
+ licenses:
72
+ - MIT
73
+ metadata: {}
74
+ rdoc_options: []
75
+ require_paths:
76
+ - lib
77
+ required_ruby_version: !ruby/object:Gem::Requirement
78
+ requirements:
79
+ - - ">="
80
+ - !ruby/object:Gem::Version
81
+ version: 2.7.0
82
+ required_rubygems_version: !ruby/object:Gem::Requirement
83
+ requirements:
84
+ - - ">="
85
+ - !ruby/object:Gem::Version
86
+ version: '0'
87
+ requirements: []
88
+ rubygems_version: 3.6.7
89
+ specification_version: 4
90
+ summary: Parser for Token Object Oriented Notation (TOON)
91
+ test_files: []