ruby-json-toon 0.2.0 → 1.0.0
- checksums.yaml +4 -4
- data/CHANGELOG.md +10 -4
- data/README.md +42 -17
- data/lib/json_to_toon/encoder.rb +25 -7
- data/lib/json_to_toon.rb +0 -1
- data/lib/ruby_json_toon/version.rb +5 -0
- data/lib/ruby_json_toon.rb +15 -0
- data/lib/toon_to_json/decoder.rb +594 -0
- data/lib/toon_to_json.rb +14 -0
- metadata +7 -18
- data/lib/json_to_toon/version.rb +0 -5
checksums.yaml
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
---
|
|
2
2
|
SHA256:
|
|
3
|
-
metadata.gz:
|
|
4
|
-
data.tar.gz:
|
|
3
|
+
metadata.gz: e84a5a72ebd25a0ce5a32c395a1bdc881768e0f7ca153fc7aa8d082b05964731
|
|
4
|
+
data.tar.gz: 95ced809c09555d02369300a0fcfdbf7998c22ca39dbaec4f8bb0e787c2969c6
|
|
5
5
|
SHA512:
|
|
6
|
-
metadata.gz:
|
|
7
|
-
data.tar.gz:
|
|
6
|
+
metadata.gz: 06e08a7eccb8b565901c3c0c9e07718d742d4709a37cffc2513f7d0f7b0094065c10b0788275d7e82aa6ef5ea477ae756a93cf1b2b07483202612c31d4a066f3
|
|
7
|
+
data.tar.gz: 57170a59a1b8dd083b88df2fd5e29921c6ab179ba49264c64685d100232d82a33851ba0665d9a09c3f14e11f96a65cf749952537adea5c03b413e54fba8dcc03
|
data/CHANGELOG.md
CHANGED
|
@@ -21,11 +21,17 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
|
|
|
21
21
|
- Initial Encoder class implementation
|
|
22
22
|
- Tests for JSON to TOON conversion
|
|
23
23
|
|
|
24
|
-
##[0.2.0] - First
|
|
24
|
+
## [0.2.0] - First Encoder release
|
|
25
25
|
### Added
|
|
26
|
-
|
|
26
|
+
- Updated to run release workflow on tag pushes
|
|
27
27
|
|
|
28
|
-
## [
|
|
28
|
+
## [0.3.0] - Decoder implementation
|
|
29
|
+
### Added
|
|
30
|
+
- Added decoder implementation
|
|
31
|
+
- Added ability to separately require decoder using: require "toon_to_json"
|
|
29
32
|
|
|
33
|
+
## [1.0.0] - Complete JSON to TOON roundtrip conversion
|
|
30
34
|
### Added
|
|
31
|
-
-
|
|
35
|
+
- RubyJsonToon wrapper to encapsulate all logic
|
|
36
|
+
- Simple 2 method implementation
|
|
37
|
+
- Fixed known issues
|
data/README.md
CHANGED
|
@@ -1,10 +1,25 @@
|
|
|
1
1
|
# JSON to TOON
|
|
2
2
|
|
|
3
|
-
|
|
3
|
+
Lightweight Ruby library for converting JSON data to TOON (Token-Oriented Object Notation), achieving 30–60% token reduction for LLM applications.
|
|
4
|
+
|
|
5
|
+
## Summary
|
|
6
|
+
|
|
7
|
+
Convert JSON to TOON (Token-Oriented Object Notation)
|
|
8
|
+
|
|
9
|
+
Authors: Jitendra Neema
|
|
10
|
+
Contact: jitendra.neema.8@gmail.com
|
|
11
|
+
|
|
12
|
+
Homepage: https://github.com/jitendra-neema/ruby-json-toon
|
|
13
|
+
Documentation: https://rubydoc.info/gems/ruby-json-toon
|
|
14
|
+
Changelog: https://github.com/jitendra-neema/ruby-json-toon/blob/main/CHANGELOG.md
|
|
15
|
+
Bug tracker: https://github.com/jitendra-neema/ruby-json-toon/issues
|
|
16
|
+
Rubygems: https://rubygems.org/gems/ruby-json-toon
|
|
17
|
+
|
|
18
|
+
Requires Ruby >= 2.7.0
|
|
4
19
|
|
|
5
20
|
## What is TOON?
|
|
6
21
|
|
|
7
|
-
TOON (Token-Oriented Object Notation) is a compact, indentation-based data format optimized for LLM token efficiency. It uses 30
|
|
22
|
+
TOON (Token-Oriented Object Notation) is a compact, indentation-based data format optimized for LLM token efficiency. It uses roughly 30–60% fewer tokens than JSON while remaining human-readable.
|
|
8
23
|
|
|
9
24
|
### Comparison
|
|
10
25
|
|
|
@@ -27,16 +42,22 @@ users[2]{id,name,role}:
|
|
|
27
42
|
|
|
28
43
|
## Installation
|
|
29
44
|
|
|
30
|
-
|
|
45
|
+
Install the gem:
|
|
46
|
+
|
|
47
|
+
```bash
|
|
48
|
+
gem install ruby-json-toon
|
|
49
|
+
```
|
|
50
|
+
|
|
51
|
+
Or add to your Gemfile:
|
|
31
52
|
|
|
32
53
|
```ruby
|
|
33
|
-
gem '
|
|
54
|
+
gem 'ruby-json-toon'
|
|
34
55
|
```
|
|
35
56
|
|
|
36
|
-
|
|
57
|
+
Require the library in your code (require path follows the library files):
|
|
37
58
|
|
|
38
|
-
```
|
|
39
|
-
|
|
59
|
+
```ruby
|
|
60
|
+
require 'json_to_toon'
|
|
40
61
|
```
|
|
41
62
|
|
|
42
63
|
## Quick Start
|
|
@@ -57,10 +78,6 @@ json_data = JSON.parse('{"users":[{"id":1,"name":"Alice"}]}')
|
|
|
57
78
|
toon = JsonToToon.encode(json_data)
|
|
58
79
|
```
|
|
59
80
|
|
|
60
|
-
## Documentation
|
|
61
|
-
|
|
62
|
-
See full documentation at [rubydoc.info](https://rubydoc.info/gems/json_to_toon)
|
|
63
|
-
|
|
64
81
|
## Options
|
|
65
82
|
|
|
66
83
|
```ruby
|
|
@@ -73,8 +90,13 @@ JsonToToon.encode(data,
|
|
|
73
90
|
|
|
74
91
|
## Development
|
|
75
92
|
|
|
93
|
+
Clone the repo, install dependencies, run tests, and build the gem:
|
|
94
|
+
|
|
76
95
|
```bash
|
|
77
|
-
|
|
96
|
+
git clone https://github.com/jitendra-neema/ruby-json-toon
|
|
97
|
+
cd ruby-json-toon
|
|
98
|
+
|
|
99
|
+
# Install development dependencies
|
|
78
100
|
bundle install
|
|
79
101
|
|
|
80
102
|
# Run tests
|
|
@@ -84,15 +106,18 @@ bundle exec rspec
|
|
|
84
106
|
bundle exec rubocop
|
|
85
107
|
|
|
86
108
|
# Build gem
|
|
87
|
-
gem build
|
|
109
|
+
gem build ruby-json-toon.gemspec
|
|
88
110
|
```
|
|
89
111
|
|
|
112
|
+
Development dependencies (from the gemspec): benchmark-ips, memory_profiler, rake, rspec, rubocop, rubocop-rake, rubocop-rspec, simplecov.
|
|
113
|
+
|
|
90
114
|
## License
|
|
91
115
|
|
|
92
|
-
MIT License
|
|
116
|
+
MIT License — see LICENSE file for details.
|
|
93
117
|
|
|
94
118
|
## Links
|
|
95
119
|
|
|
96
|
-
-
|
|
97
|
-
-
|
|
98
|
-
-
|
|
120
|
+
- TOON Specification: https://toonformat.dev
|
|
121
|
+
- Homepage / source: https://github.com/jitendra-neema/ruby-json-toon
|
|
122
|
+
- Documentation: https://rubydoc.info/gems/ruby-json-toon
|
|
123
|
+
- Bug tracker: https://github.com/jitendra-neema/ruby-json-toon/issues
|
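Note on the README changes above: the hunk header quotes the tabular form `users[2]{id,name,role}:`. The following is a minimal stand-alone sketch of that layout, assuming uniform hashes with primitive values. `tabular_toon` is a hypothetical helper written for illustration and is not part of the gem; the comparison counts characters, which only approximates token usage.

```ruby
require 'json'

# Hypothetical helper (not the gem's API): renders an array of uniform
# hashes as a TOON-style table with a key[count]{fields}: header.
def tabular_toon(key, rows)
  fields = rows.first.keys
  header = "#{key}[#{rows.length}]{#{fields.join(',')}}:"
  # Each row is indented by two spaces and lists values in field order.
  body = rows.map { |row| "  #{fields.map { |f| row[f] }.join(',')}" }
  ([header] + body).join("\n")
end

users = [
  { 'id' => 1, 'name' => 'Alice', 'role' => 'admin' },
  { 'id' => 2, 'name' => 'Bob', 'role' => 'user' }
]

toon = tabular_toon('users', users)
json = JSON.generate('users' => users)

puts toon
puts "JSON: #{json.length} chars, TOON: #{toon.length} chars"
```

The saving comes from stating the field names once in the header instead of repeating them in every object.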
data/lib/json_to_toon/encoder.rb
CHANGED
|
@@ -253,24 +253,42 @@ module JsonToToon
|
|
|
253
253
|
end
|
|
254
254
|
|
|
255
255
|
# COMPLEX: Handle list item where first field is an array
|
|
256
|
+
# Implementation: Handle list item where first field is an array
|
|
256
257
|
def encode_list_item_with_array_first(item, depth)
|
|
257
|
-
# This is a complex edge case from the spec
|
|
258
|
-
# TODO: Implement proper handling per spec
|
|
259
258
|
keys = item.keys
|
|
260
259
|
first_key = keys.first
|
|
260
|
+
first_val = item[first_key]
|
|
261
261
|
|
|
262
262
|
base_indent = indent(depth)
|
|
263
263
|
hyphen_line = "#{base_indent}- "
|
|
264
264
|
field_indent = "#{base_indent} "
|
|
265
265
|
|
|
266
|
-
|
|
267
|
-
|
|
268
|
-
|
|
266
|
+
# Create the compact header: tags[2]:
|
|
267
|
+
length_str = "#{length_prefix}#{first_val.length}"
|
|
268
|
+
marker = delimiter_marker
|
|
269
|
+
array_header = "#{format_key(first_key)}[#{length_str}#{marker}]:"
|
|
270
|
+
|
|
271
|
+
# JOIN the values on the same line as the hyphen
|
|
272
|
+
if all_primitives?(first_val)
|
|
273
|
+
values_str = first_val.map { |v| format_value(v) }.join(@delimiter)
|
|
274
|
+
emit_line("#{hyphen_line}#{array_header} #{values_str}")
|
|
275
|
+
else
|
|
276
|
+
# Fallback for complex arrays (keep as is)
|
|
277
|
+
emit_line("#{hyphen_line}#{array_header}")
|
|
278
|
+
first_val.each { |sub| encode_list_item(sub, depth + 1) }
|
|
279
|
+
end
|
|
269
280
|
|
|
281
|
+
# Encode remaining fields (id, etc.)
|
|
270
282
|
keys[1..].each do |k|
|
|
283
|
+
v = item[k]
|
|
271
284
|
fk = format_key(k)
|
|
272
|
-
|
|
273
|
-
|
|
285
|
+
# NOTE: Use field_indent to align with the start of the keys
|
|
286
|
+
if primitive?(v)
|
|
287
|
+
emit_line("#{field_indent}#{fk}: #{format_value(v)}")
|
|
288
|
+
else
|
|
289
|
+
emit_line("#{field_indent}#{fk}:")
|
|
290
|
+
encode_value(v, depth + 1)
|
|
291
|
+
end
|
|
274
292
|
end
|
|
275
293
|
end
|
|
276
294
|
|
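Note on the encoder change above: the new branch puts the first field's array on the same line as the list hyphen and then aligns the remaining fields under the first key. The sketch below shows that shape only. It assumes primitive values, a comma delimiter and a two-space indent, and `encode_item_with_array_first` here is an illustrative stand-in, not the gem's method (which also handles length prefixes, delimiter markers, key quoting and nested values).

```ruby
# Illustrative stand-in, not the gem's implementation.
def encode_item_with_array_first(item, depth, indent_size: 2, delimiter: ',')
  base_indent = ' ' * (indent_size * depth)
  field_indent = "#{base_indent}  " # aligns with the text after "- "
  first_key, first_val = item.first

  # Compact header plus inline values, e.g. "- tags[2]: a,b"
  lines = ["#{base_indent}- #{first_key}[#{first_val.length}]: #{first_val.join(delimiter)}"]

  # Remaining fields (assumed primitive in this sketch)
  item.drop(1).each { |key, value| lines << "#{field_indent}#{key}: #{value}" }
  lines
end

item = { 'tags' => %w[a b], 'id' => 1 }
puts encode_item_with_array_first(item, 1)
```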
data/lib/json_to_toon.rb
CHANGED
|
data/lib/ruby_json_toon.rb
ADDED
|
@@ -0,0 +1,15 @@
|
|
|
1
|
+
# frozen_string_literal: true
|
|
2
|
+
|
|
3
|
+
require_relative 'ruby_json_toon/version'
|
|
4
|
+
require_relative 'json_to_toon'
|
|
5
|
+
require_relative 'toon_to_json'
|
|
6
|
+
|
|
7
|
+
module RubyJsonToon
|
|
8
|
+
def self.encode(value, options = {})
|
|
9
|
+
JsonToToon.encode(value, options)
|
|
10
|
+
end
|
|
11
|
+
|
|
12
|
+
def self.decode(toon_string)
|
|
13
|
+
ToonToJson.decode(toon_string)
|
|
14
|
+
end
|
|
15
|
+
end
|
data/lib/toon_to_json/decoder.rb
ADDED
|
@@ -0,0 +1,594 @@
|
|
|
1
|
+
# frozen_string_literal: true
|
|
2
|
+
|
|
3
|
+
module ToonToJson
|
|
4
|
+
# Efficiently converts TOON format directly to JSON string
|
|
5
|
+
class Decoder
|
|
6
|
+
def decode(str)
|
|
7
|
+
return 'null' if str.nil? || str.empty?
|
|
8
|
+
|
|
9
|
+
lines = str.to_s.split("\n")
|
|
10
|
+
|
|
11
|
+
# Single-line primitive detection
|
|
12
|
+
if lines.length == 1
|
|
13
|
+
line = lines.first.strip
|
|
14
|
+
|
|
15
|
+
is_structure = line.include?(':') ||
|
|
16
|
+
line.match?(/\A-\s/) ||
|
|
17
|
+
line.match?(/\A\[/)
|
|
18
|
+
|
|
19
|
+
return primitive_to_json(line) unless is_structure
|
|
20
|
+
end
|
|
21
|
+
|
|
22
|
+
@lines = lines.map { |l| { raw: l, indent: leading_spaces(l), text: l.lstrip } }
|
|
23
|
+
@i = 0
|
|
24
|
+
@indent_unit = detect_indent_unit
|
|
25
|
+
@output = []
|
|
26
|
+
|
|
27
|
+
parse_block(0)
|
|
28
|
+
@output.join
|
|
29
|
+
end
|
|
30
|
+
|
|
31
|
+
private
|
|
32
|
+
|
|
33
|
+
def leading_spaces(line)
|
|
34
|
+
line[/^\s*/].size
|
|
35
|
+
end
|
|
36
|
+
|
|
37
|
+
def detect_indent_unit
|
|
38
|
+
prev = 0
|
|
39
|
+
@lines.each do |ln|
|
|
40
|
+
next if ln[:raw].strip.empty?
|
|
41
|
+
return ln[:indent] - prev if ln[:indent] > prev
|
|
42
|
+
|
|
43
|
+
prev = ln[:indent]
|
|
44
|
+
end
|
|
45
|
+
2
|
|
46
|
+
end
|
|
47
|
+
|
|
48
|
+
def parse_block(min_indent)
|
|
49
|
+
return parse_object_block(min_indent) if @i >= @lines.length
|
|
50
|
+
|
|
51
|
+
first_line = @lines[@i][:text]
|
|
52
|
+
if first_line.match?(/\A\[/)
|
|
53
|
+
array_header = parse_array_header(first_line)
|
|
54
|
+
if array_header && array_header[:key].nil?
|
|
55
|
+
@i += 1
|
|
56
|
+
parse_array_body(array_header, @lines[@i - 1][:indent] + @indent_unit)
|
|
57
|
+
return
|
|
58
|
+
end
|
|
59
|
+
end
|
|
60
|
+
|
|
61
|
+
start_i = @i
|
|
62
|
+
while start_i < @lines.length
|
|
63
|
+
ln = @lines[start_i]
|
|
64
|
+
break if ln[:raw].strip.empty?
|
|
65
|
+
break if ln[:indent] < min_indent
|
|
66
|
+
|
|
67
|
+
return parse_list_block(min_indent) if ln[:indent] >= min_indent && ln[:text].match?(/\A-\s/)
|
|
68
|
+
|
|
69
|
+
break if ln[:text].include?(':')
|
|
70
|
+
|
|
71
|
+
start_i += 1
|
|
72
|
+
end
|
|
73
|
+
|
|
74
|
+
parse_object_block(min_indent)
|
|
75
|
+
end
|
|
76
|
+
|
|
77
|
+
def parse_object_block(min_indent)
|
|
78
|
+
@output << '{'
|
|
79
|
+
first = true
|
|
80
|
+
|
|
81
|
+
while @i < @lines.length
|
|
82
|
+
ln = @lines[@i]
|
|
83
|
+
break if ln[:raw].strip.empty?
|
|
84
|
+
break if ln[:indent] < min_indent
|
|
85
|
+
|
|
86
|
+
if (array_header = parse_array_header(ln[:text]))
|
|
87
|
+
@output << ',' unless first
|
|
88
|
+
first = false
|
|
89
|
+
|
|
90
|
+
key = array_header[:key]
|
|
91
|
+
@i += 1
|
|
92
|
+
|
|
93
|
+
if key
|
|
94
|
+
@output << json_string(key)
|
|
95
|
+
@output << ':'
|
|
96
|
+
end
|
|
97
|
+
|
|
98
|
+
parse_array_body(array_header, ln[:indent] + @indent_unit)
|
|
99
|
+
next
|
|
100
|
+
end
|
|
101
|
+
|
|
102
|
+
if (kv = parse_key_value_line(ln[:text]))
|
|
103
|
+
@output << ',' unless first
|
|
104
|
+
first = false
|
|
105
|
+
|
|
106
|
+
@output << json_string(kv[:key])
|
|
107
|
+
@output << ':'
|
|
108
|
+
@output << kv[:value]
|
|
109
|
+
@i += 1
|
|
110
|
+
next
|
|
111
|
+
end
|
|
112
|
+
|
|
113
|
+
if (key = parse_key_only(ln[:text]))
|
|
114
|
+
@output << ',' unless first
|
|
115
|
+
first = false
|
|
116
|
+
|
|
117
|
+
@output << json_string(key)
|
|
118
|
+
@output << ':'
|
|
119
|
+
@i += 1
|
|
120
|
+
parse_block(ln[:indent] + @indent_unit)
|
|
121
|
+
next
|
|
122
|
+
end
|
|
123
|
+
|
|
124
|
+
@i += 1
|
|
125
|
+
end
|
|
126
|
+
|
|
127
|
+
@output << '}'
|
|
128
|
+
end
|
|
129
|
+
|
|
130
|
+
def parse_array_header(text)
|
|
131
|
+
m = text.match(/\A(?:(?<key>.+?)?)?\[(?<len>#?\d+)(?<marker>[\t|]?)\](?:\{(?<fields>[^}]*)\})?:(?:\s*(?<rest>.*))?\z/)
|
|
132
|
+
return nil unless m
|
|
133
|
+
|
|
134
|
+
key = m[:key]&.strip
|
|
135
|
+
key = parse_quoted_key(key) if key && !key.empty?
|
|
136
|
+
|
|
137
|
+
{
|
|
138
|
+
key: key,
|
|
139
|
+
length: m[:len].sub(/^#/, '').to_i,
|
|
140
|
+
fields: m[:fields],
|
|
141
|
+
inline: m[:rest],
|
|
142
|
+
marker: m[:marker]
|
|
143
|
+
}
|
|
144
|
+
end
|
|
145
|
+
|
|
146
|
+
def parse_array_body(header, child_indent)
|
|
147
|
+
@output << '['
|
|
148
|
+
|
|
149
|
+
if header[:inline] && !header[:inline].strip.empty?
|
|
150
|
+
delim = detect_delimiter(header[:marker], header[:fields])
|
|
151
|
+
values = split_with_quotes(header[:inline], delim)
|
|
152
|
+
|
|
153
|
+
values.each_with_index do |v, idx|
|
|
154
|
+
@output << ',' if idx > 0
|
|
155
|
+
@output << value_to_json(v.strip)
|
|
156
|
+
end
|
|
157
|
+
|
|
158
|
+
@output << ']'
|
|
159
|
+
return
|
|
160
|
+
end
|
|
161
|
+
|
|
162
|
+
if header[:fields]
|
|
163
|
+
delim = detect_delimiter(header[:marker], header[:fields])
|
|
164
|
+
fields = split_with_quotes(header[:fields], delim)
|
|
165
|
+
first = true
|
|
166
|
+
|
|
167
|
+
while @i < @lines.length && @lines[@i][:indent] >= child_indent
|
|
168
|
+
row_text = @lines[@i][:text]
|
|
169
|
+
break if row_text.strip.empty?
|
|
170
|
+
|
|
171
|
+
@output << ',' unless first
|
|
172
|
+
first = false
|
|
173
|
+
|
|
174
|
+
values = split_with_quotes(row_text, delim)
|
|
175
|
+
|
|
176
|
+
@output << '{'
|
|
177
|
+
fields.each_with_index do |f, idx|
|
|
178
|
+
@output << ',' if idx > 0
|
|
179
|
+
@output << json_string(unquote_if_quoted(f.strip))
|
|
180
|
+
@output << ':'
|
|
181
|
+
@output << value_to_json(values[idx]&.strip || 'null')
|
|
182
|
+
end
|
|
183
|
+
@output << '}'
|
|
184
|
+
|
|
185
|
+
@i += 1
|
|
186
|
+
end
|
|
187
|
+
|
|
188
|
+
@output << ']'
|
|
189
|
+
return
|
|
190
|
+
end
|
|
191
|
+
|
|
192
|
+
if @i < @lines.length && @lines[@i][:indent] >= child_indent
|
|
193
|
+
peek_text = @lines[@i][:text]
|
|
194
|
+
peek_header = parse_array_header(peek_text)
|
|
195
|
+
|
|
196
|
+
if peek_header && peek_header[:key].nil?
|
|
197
|
+
parse_array_of_arrays(child_indent)
|
|
198
|
+
return
|
|
199
|
+
end
|
|
200
|
+
end
|
|
201
|
+
|
|
202
|
+
if @i < @lines.length && @lines[@i][:indent] >= child_indent &&
|
|
203
|
+
@lines[@i][:text].match?(/\A-\s/)
|
|
204
|
+
first = true
|
|
205
|
+
while @i < @lines.length
|
|
206
|
+
ln = @lines[@i]
|
|
207
|
+
break if ln[:raw].strip.empty?
|
|
208
|
+
break if ln[:indent] < child_indent
|
|
209
|
+
break unless ln[:text].match?(/\A-\s/)
|
|
210
|
+
|
|
211
|
+
@output << ',' unless first
|
|
212
|
+
first = false
|
|
213
|
+
|
|
214
|
+
after = ln[:text][2..]&.strip || ''
|
|
215
|
+
|
|
216
|
+
if after.empty?
|
|
217
|
+
@i += 1
|
|
218
|
+
if @i < @lines.length && @lines[@i][:indent] > ln[:indent]
|
|
219
|
+
parse_block(@lines[@i][:indent])
|
|
220
|
+
else
|
|
221
|
+
@output << '{}'
|
|
222
|
+
end
|
|
223
|
+
next
|
|
224
|
+
end
|
|
225
|
+
|
|
226
|
+
# FIXED: Check if it's an array header first (like "access[2]: read,write")
|
|
227
|
+
if (inline_array = parse_array_header(after))
|
|
228
|
+
@output << '{'
|
|
229
|
+
@output << json_string(inline_array[:key])
|
|
230
|
+
@output << ':'
|
|
231
|
+
|
|
232
|
+
# Parse the inline array directly
|
|
233
|
+
@output << '['
|
|
234
|
+
if inline_array[:inline] && !inline_array[:inline].strip.empty?
|
|
235
|
+
delim = detect_delimiter(inline_array[:marker], inline_array[:fields])
|
|
236
|
+
values = split_with_quotes(inline_array[:inline], delim)
|
|
237
|
+
values.each_with_index do |v, idx|
|
|
238
|
+
@output << ',' if idx > 0
|
|
239
|
+
@output << value_to_json(v.strip)
|
|
240
|
+
end
|
|
241
|
+
end
|
|
242
|
+
@output << ']'
|
|
243
|
+
|
|
244
|
+
@i += 1
|
|
245
|
+
|
|
246
|
+
# Handle remaining fields
|
|
247
|
+
if @i < @lines.length && @lines[@i][:indent] > ln[:indent]
|
|
248
|
+
child_ind = @lines[@i][:indent]
|
|
249
|
+
|
|
250
|
+
while @i < @lines.length && @lines[@i][:indent] >= child_ind
|
|
251
|
+
field_ln = @lines[@i]
|
|
252
|
+
break if field_ln[:text].match?(/\A-\s/)
|
|
253
|
+
|
|
254
|
+
# Check if field is an array header
|
|
255
|
+
if (field_array = parse_array_header(field_ln[:text]))
|
|
256
|
+
@output << ','
|
|
257
|
+
@output << json_string(field_array[:key])
|
|
258
|
+
@output << ':'
|
|
259
|
+
|
|
260
|
+
@output << '['
|
|
261
|
+
if field_array[:inline] && !field_array[:inline].strip.empty?
|
|
262
|
+
delim = detect_delimiter(field_array[:marker], field_array[:fields])
|
|
263
|
+
vals = split_with_quotes(field_array[:inline], delim)
|
|
264
|
+
vals.each_with_index do |v, idx|
|
|
265
|
+
@output << ',' if idx > 0
|
|
266
|
+
@output << value_to_json(v.strip)
|
|
267
|
+
end
|
|
268
|
+
end
|
|
269
|
+
@output << ']'
|
|
270
|
+
@i += 1
|
|
271
|
+
elsif field_kv = parse_key_value_line(field_ln[:text])
|
|
272
|
+
@output << ','
|
|
273
|
+
@output << json_string(field_kv[:key])
|
|
274
|
+
@output << ':'
|
|
275
|
+
@output << field_kv[:value]
|
|
276
|
+
@i += 1
|
|
277
|
+
elsif field_key = parse_key_only(field_ln[:text])
|
|
278
|
+
@output << ','
|
|
279
|
+
@output << json_string(field_key)
|
|
280
|
+
@output << ':'
|
|
281
|
+
@i += 1
|
|
282
|
+
parse_block(field_ln[:indent] + @indent_unit)
|
|
283
|
+
else
|
|
284
|
+
break
|
|
285
|
+
end
|
|
286
|
+
end
|
|
287
|
+
end
|
|
288
|
+
|
|
289
|
+
@output << '}'
|
|
290
|
+
next
|
|
291
|
+
end
|
|
292
|
+
|
|
293
|
+
if (kv = parse_key_value_line(after))
|
|
294
|
+
@output << '{'
|
|
295
|
+
@output << json_string(kv[:key])
|
|
296
|
+
@output << ':'
|
|
297
|
+
@output << kv[:value]
|
|
298
|
+
|
|
299
|
+
@i += 1
|
|
300
|
+
|
|
301
|
+
if @i < @lines.length && @lines[@i][:indent] > ln[:indent]
|
|
302
|
+
child_ind = @lines[@i][:indent]
|
|
303
|
+
|
|
304
|
+
while @i < @lines.length && @lines[@i][:indent] >= child_ind
|
|
305
|
+
field_ln = @lines[@i]
|
|
306
|
+
break if field_ln[:text].match?(/\A-\s/)
|
|
307
|
+
|
|
308
|
+
# FIXED: Check if field is an array header
|
|
309
|
+
if (field_array = parse_array_header(field_ln[:text]))
|
|
310
|
+
@output << ','
|
|
311
|
+
@output << json_string(field_array[:key])
|
|
312
|
+
@output << ':'
|
|
313
|
+
|
|
314
|
+
@output << '['
|
|
315
|
+
if field_array[:inline] && !field_array[:inline].strip.empty?
|
|
316
|
+
delim = detect_delimiter(field_array[:marker], field_array[:fields])
|
|
317
|
+
vals = split_with_quotes(field_array[:inline], delim)
|
|
318
|
+
vals.each_with_index do |v, idx|
|
|
319
|
+
@output << ',' if idx > 0
|
|
320
|
+
@output << value_to_json(v.strip)
|
|
321
|
+
end
|
|
322
|
+
end
|
|
323
|
+
@output << ']'
|
|
324
|
+
@i += 1
|
|
325
|
+
elsif field_kv = parse_key_value_line(field_ln[:text])
|
|
326
|
+
@output << ','
|
|
327
|
+
@output << json_string(field_kv[:key])
|
|
328
|
+
@output << ':'
|
|
329
|
+
@output << field_kv[:value]
|
|
330
|
+
@i += 1
|
|
331
|
+
elsif field_key = parse_key_only(field_ln[:text])
|
|
332
|
+
@output << ','
|
|
333
|
+
@output << json_string(field_key)
|
|
334
|
+
@output << ':'
|
|
335
|
+
@i += 1
|
|
336
|
+
parse_block(field_ln[:indent] + @indent_unit)
|
|
337
|
+
else
|
|
338
|
+
break
|
|
339
|
+
end
|
|
340
|
+
end
|
|
341
|
+
end
|
|
342
|
+
|
|
343
|
+
@output << '}'
|
|
344
|
+
next
|
|
345
|
+
end
|
|
346
|
+
|
|
347
|
+
if after.end_with?(':')
|
|
348
|
+
key = unquote_if_quoted(after[0...-1].strip)
|
|
349
|
+
@i += 1
|
|
350
|
+
|
|
351
|
+
@output << '{'
|
|
352
|
+
@output << json_string(key)
|
|
353
|
+
@output << ':'
|
|
354
|
+
|
|
355
|
+
if @i < @lines.length && @lines[@i][:indent] > ln[:indent]
|
|
356
|
+
parse_block(@lines[@i][:indent])
|
|
357
|
+
else
|
|
358
|
+
@output << '{}'
|
|
359
|
+
end
|
|
360
|
+
|
|
361
|
+
@output << '}'
|
|
362
|
+
next
|
|
363
|
+
end
|
|
364
|
+
|
|
365
|
+
@output << value_to_json(after)
|
|
366
|
+
@i += 1
|
|
367
|
+
end
|
|
368
|
+
|
|
369
|
+
@output << ']'
|
|
370
|
+
return
|
|
371
|
+
end
|
|
372
|
+
|
|
373
|
+
@output << ']'
|
|
374
|
+
end
|
|
375
|
+
|
|
376
|
+
def parse_array_of_arrays(child_indent)
|
|
377
|
+
first_item = true
|
|
378
|
+
|
|
379
|
+
while @i < @lines.length && @lines[@i][:indent] >= child_indent
|
|
380
|
+
child_text = @lines[@i][:text]
|
|
381
|
+
break if child_text.strip.empty?
|
|
382
|
+
|
|
383
|
+
child_header = parse_array_header(child_text)
|
|
384
|
+
break unless child_header && child_header[:key].nil?
|
|
385
|
+
|
|
386
|
+
@output << ',' unless first_item
|
|
387
|
+
first_item = false
|
|
388
|
+
@i += 1
|
|
389
|
+
|
|
390
|
+
parse_array_body(child_header, @lines[@i - 1][:indent] + @indent_unit)
|
|
391
|
+
end
|
|
392
|
+
|
|
393
|
+
@output << ']'
|
|
394
|
+
end
|
|
395
|
+
|
|
396
|
+
def split_with_quotes(text, delimiter)
|
|
397
|
+
return [text] if text.nil? || text.empty?
|
|
398
|
+
|
|
399
|
+
values = []
|
|
400
|
+
current = +''
|
|
401
|
+
in_quotes = false
|
|
402
|
+
i = 0
|
|
403
|
+
|
|
404
|
+
while i < text.length
|
|
405
|
+
c = text[i]
|
|
406
|
+
|
|
407
|
+
if c == '\\' && i + 1 < text.length
|
|
408
|
+
current << c << text[i + 1]
|
|
409
|
+
i += 2
|
|
410
|
+
next
|
|
411
|
+
end
|
|
412
|
+
|
|
413
|
+
if c == '"'
|
|
414
|
+
in_quotes = !in_quotes
|
|
415
|
+
current << c
|
|
416
|
+
i += 1
|
|
417
|
+
next
|
|
418
|
+
end
|
|
419
|
+
|
|
420
|
+
if c == delimiter && !in_quotes
|
|
421
|
+
values << current
|
|
422
|
+
current = +''
|
|
423
|
+
i += 1
|
|
424
|
+
next
|
|
425
|
+
end
|
|
426
|
+
|
|
427
|
+
current << c
|
|
428
|
+
i += 1
|
|
429
|
+
end
|
|
430
|
+
|
|
431
|
+
values << current unless current.empty?
|
|
432
|
+
values.map(&:strip)
|
|
433
|
+
end
|
|
434
|
+
|
|
435
|
+
def detect_delimiter(marker, fields)
|
|
436
|
+
return "\t" if marker == "\t"
|
|
437
|
+
return '|' if marker == '|'
|
|
438
|
+
return "\t" if fields&.include?("\t")
|
|
439
|
+
return '|' if fields&.include?('|')
|
|
440
|
+
|
|
441
|
+
','
|
|
442
|
+
end
|
|
443
|
+
|
|
444
|
+
def parse_key_value_line(text)
|
|
445
|
+
key = nil
|
|
446
|
+
rest = nil
|
|
447
|
+
|
|
448
|
+
if text.start_with?('"')
|
|
449
|
+
idx = 1
|
|
450
|
+
while idx < text.length
|
|
451
|
+
if text[idx] == '\\' && idx + 1 < text.length
|
|
452
|
+
idx += 2
|
|
453
|
+
next
|
|
454
|
+
end
|
|
455
|
+
break if text[idx] == '"'
|
|
456
|
+
|
|
457
|
+
idx += 1
|
|
458
|
+
end
|
|
459
|
+
return nil if idx >= text.length
|
|
460
|
+
|
|
461
|
+
key = unescape_string(text[1...idx])
|
|
462
|
+
after = text[(idx + 1)..]&.lstrip
|
|
463
|
+
return nil unless after&.start_with?(':')
|
|
464
|
+
|
|
465
|
+
rest = after[1..]&.lstrip
|
|
466
|
+
elsif (cpos = text.index(':'))
|
|
467
|
+
key = text[0...cpos].strip
|
|
468
|
+
rest = text[(cpos + 1)..]&.lstrip
|
|
469
|
+
else
|
|
470
|
+
return nil
|
|
471
|
+
end
|
|
472
|
+
|
|
473
|
+
return nil if rest.nil? || rest.empty?
|
|
474
|
+
|
|
475
|
+
{ key: unquote_if_quoted(key), value: value_to_json(rest) }
|
|
476
|
+
end
|
|
477
|
+
|
|
478
|
+
def parse_key_only(text)
|
|
479
|
+
return nil unless text.end_with?(':')
|
|
480
|
+
|
|
481
|
+
key_part = text[0...-1].strip
|
|
482
|
+
return nil if key_part.empty?
|
|
483
|
+
|
|
484
|
+
unquote_if_quoted(key_part)
|
|
485
|
+
end
|
|
486
|
+
|
|
487
|
+
def unquote_if_quoted(str)
|
|
488
|
+
return str unless str&.start_with?('"') && str.end_with?('"')
|
|
489
|
+
|
|
490
|
+
unescape_string(str[1...-1])
|
|
491
|
+
end
|
|
492
|
+
|
|
493
|
+
def parse_quoted_key(key)
|
|
494
|
+
key = key.strip
|
|
495
|
+
if key.start_with?('"') && key.end_with?('"')
|
|
496
|
+
unescape_string(key[1...-1])
|
|
497
|
+
else
|
|
498
|
+
key
|
|
499
|
+
end
|
|
500
|
+
end
|
|
501
|
+
|
|
502
|
+
def value_to_json(text)
|
|
503
|
+
return 'null' if text.nil? || text == 'null'
|
|
504
|
+
|
|
505
|
+
t = text.strip
|
|
506
|
+
|
|
507
|
+
return json_string(unescape_string(t[1...-1])) if t.start_with?('"') && t.end_with?('"')
|
|
508
|
+
|
|
509
|
+
return 'true' if t.casecmp('true').zero?
|
|
510
|
+
return 'false' if t.casecmp('false').zero?
|
|
511
|
+
return 'null' if t.casecmp('null').zero?
|
|
512
|
+
|
|
513
|
+
return t if t.match?(/\A-?\d+(?:\.\d+)?(?:[eE][+-]?\d+)?\z/)
|
|
514
|
+
|
|
515
|
+
json_string(t)
|
|
516
|
+
end
|
|
517
|
+
|
|
518
|
+
def primitive_to_json(token)
|
|
519
|
+
return 'null' if token.nil? || token.casecmp?('null')
|
|
520
|
+
return 'true' if token.casecmp?('true')
|
|
521
|
+
return 'false' if token.casecmp?('false')
|
|
522
|
+
return token if token.match?(/\A-?\d+(?:\.\d+)?(?:[eE][+-]?\d+)?\z/)
|
|
523
|
+
|
|
524
|
+
json_string(token)
|
|
525
|
+
end
|
|
526
|
+
|
|
527
|
+
def json_string(str)
|
|
528
|
+
return '""' if str.nil? || str.empty?
|
|
529
|
+
|
|
530
|
+
result = +'"'
|
|
531
|
+
|
|
532
|
+
str.each_char do |c|
|
|
533
|
+
result << case c
|
|
534
|
+
when '"' then '\\"'
|
|
535
|
+
when '\\' then '\\\\'
|
|
536
|
+
when "\n" then '\\n'
|
|
537
|
+
when "\r" then '\\r'
|
|
538
|
+
when "\t" then '\\t'
|
|
539
|
+
when "\b" then '\\b'
|
|
540
|
+
when "\f" then '\\f'
|
|
541
|
+
else
|
|
542
|
+
if c.ord < 32
|
|
543
|
+
format('\\u%04x', c.ord)
|
|
544
|
+
else
|
|
545
|
+
c
|
|
546
|
+
end
|
|
547
|
+
end
|
|
548
|
+
end
|
|
549
|
+
|
|
550
|
+
result << '"'
|
|
551
|
+
result
|
|
552
|
+
end
|
|
553
|
+
|
|
554
|
+
def unescape_string(s)
|
|
555
|
+
result = +''
|
|
556
|
+
i = 0
|
|
557
|
+
|
|
558
|
+
while i < s.length
|
|
559
|
+
if s[i] == '\\' && i + 1 < s.length
|
|
560
|
+
case s[i + 1]
|
|
561
|
+
when 'n' then result << "\n"
|
|
562
|
+
when 'r' then result << "\r"
|
|
563
|
+
when 't' then result << "\t"
|
|
564
|
+
when '"' then result << '"'
|
|
565
|
+
when '\\' then result << '\\'
|
|
566
|
+
when 'b' then result << "\b"
|
|
567
|
+
when 'f' then result << "\f"
|
|
568
|
+
when 'u'
|
|
569
|
+
if i + 5 < s.length
|
|
570
|
+
hex = s[i + 2..i + 5]
|
|
571
|
+
begin
|
|
572
|
+
result << [hex.to_i(16)].pack('U')
|
|
573
|
+
rescue StandardError
|
|
574
|
+
result << s[i..i + 5]
|
|
575
|
+
end
|
|
576
|
+
i += 6
|
|
577
|
+
next
|
|
578
|
+
else
|
|
579
|
+
result << s[i] << s[i + 1]
|
|
580
|
+
end
|
|
581
|
+
else
|
|
582
|
+
result << s[i] << s[i + 1]
|
|
583
|
+
end
|
|
584
|
+
i += 2
|
|
585
|
+
else
|
|
586
|
+
result << s[i]
|
|
587
|
+
i += 1
|
|
588
|
+
end
|
|
589
|
+
end
|
|
590
|
+
|
|
591
|
+
result
|
|
592
|
+
end
|
|
593
|
+
end
|
|
594
|
+
end
|
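Note on the decoder above: tabular arrays are read by matching an array header, choosing a delimiter, and splitting each row while ignoring delimiters inside quotes. The sketch below walks through those three steps in a simplified form. The pattern is modeled on the one in `parse_array_header` but slightly simplified, and `split_outside_quotes` and `decode_table` are hypothetical names. Unlike the decoder, this sketch returns Ruby hashes with string values, does not handle backslash escapes, and keeps trailing empty fields.

```ruby
# Simplified header pattern: optional key, [len] with optional marker,
# optional {fields}, a colon, then any inline remainder.
HEADER = /\A(?<key>.+?)?\[(?<len>#?\d+)(?<marker>[\t|]?)\](?:\{(?<fields>[^}]*)\})?:\s*(?<rest>.*)\z/

# Splits on the delimiter, ignoring delimiters inside double quotes.
def split_outside_quotes(text, delimiter)
  parts = []
  current = +''
  in_quotes = false
  text.each_char do |c|
    in_quotes = !in_quotes if c == '"'
    if c == delimiter && !in_quotes
      parts << current.strip
      current = +''
    else
      current << c
    end
  end
  parts << current.strip
end

# Turns a header line plus its rows into { key => [row hashes] }.
def decode_table(lines)
  m = HEADER.match(lines.first.strip)
  raise ArgumentError, 'not a tabular array header' unless m && m[:fields]

  delimiter = m[:marker].empty? ? ',' : m[:marker]
  fields = split_outside_quotes(m[:fields], delimiter)
  rows = lines.drop(1).map do |line|
    fields.zip(split_outside_quotes(line.strip, delimiter)).to_h
  end
  { m[:key] => rows }
end

toon = <<~TOON
  users[2]{id,name}:
    1,Alice
    2,"Smith, Jane"
TOON

p decode_table(toon.lines)
```

Quoted values keep their surrounding quotes here; the decoder in the diff unescapes them and re-emits them as JSON strings.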
data/lib/toon_to_json.rb
ADDED
|
@@ -0,0 +1,14 @@
|
|
|
1
|
+
require_relative 'toon_to_json/decoder'
|
|
2
|
+
|
|
3
|
+
module ToonToJson
|
|
4
|
+
class Error < StandardError; end
|
|
5
|
+
|
|
6
|
+
# Decode a TOON-formatted string into a Ruby object
|
|
7
|
+
#
|
|
8
|
+
# @param toon_str [String] The TOON-formatted string to decode
|
|
9
|
+
# @return [Object] The decoded Ruby object (Hash, Array, or primitive)
|
|
10
|
+
|
|
11
|
+
def self.decode(toon_str)
|
|
12
|
+
Decoder.new.decode(toon_str)
|
|
13
|
+
end
|
|
14
|
+
end
|
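Note on `ToonToJson.decode` above: as written in this diff, `Decoder#decode` returns the joined output buffer, which is JSON text, so callers who want Ruby objects would presumably pass the result through `JSON.parse`. The sketch below shows the scalar rules that `value_to_json` appears to apply (case-insensitive `true`/`false`/`null`, a numeric pattern, everything else as a string). `scalar_to_json` is a hypothetical stand-in that leaves quoted tokens untouched rather than unescaping and re-escaping them.

```ruby
require 'json'

# Same numeric shape as the pattern used in the decoder.
NUMBER_PATTERN = /\A-?\d+(?:\.\d+)?(?:[eE][+-]?\d+)?\z/.freeze

# Hypothetical stand-in: maps one TOON scalar token to JSON text.
def scalar_to_json(token)
  t = token.strip
  # Assumption: an already-quoted token is a valid JSON string literal.
  return t if t.length >= 2 && t.start_with?('"') && t.end_with?('"')
  return t.downcase if %w[true false null].include?(t.downcase)
  return t if t.match?(NUMBER_PATTERN)

  JSON.generate(t) # bare words become JSON strings
end

tokens = ['42', 'true', 'Alice', '"007"', '-1.5e3', 'NULL']
json_text = "[#{tokens.map { |t| scalar_to_json(t) }.join(',')}]"

puts json_text
p JSON.parse(json_text)
```

Quoting a value such as `"007"` is what keeps it a string; an unquoted token that matches the numeric pattern is emitted as a number.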
metadata
CHANGED
|
@@ -1,14 +1,14 @@
|
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
|
2
2
|
name: ruby-json-toon
|
|
3
3
|
version: !ruby/object:Gem::Version
|
|
4
|
-
version: 0.
|
|
4
|
+
version: 1.0.0
|
|
5
5
|
platform: ruby
|
|
6
6
|
authors:
|
|
7
7
|
- Jitendra Neema
|
|
8
8
|
autorequire:
|
|
9
9
|
bindir: bin
|
|
10
10
|
cert_chain: []
|
|
11
|
-
date: 2026-01-
|
|
11
|
+
date: 2026-01-12 00:00:00.000000000 Z
|
|
12
12
|
dependencies:
|
|
13
13
|
- !ruby/object:Gem::Dependency
|
|
14
14
|
name: benchmark-ips
|
|
@@ -108,24 +108,10 @@ dependencies:
|
|
|
108
108
|
- - "~>"
|
|
109
109
|
- !ruby/object:Gem::Version
|
|
110
110
|
version: '3.0'
|
|
111
|
-
- !ruby/object:Gem::Dependency
|
|
112
|
-
name: simplecov
|
|
113
|
-
requirement: !ruby/object:Gem::Requirement
|
|
114
|
-
requirements:
|
|
115
|
-
- - "~>"
|
|
116
|
-
- !ruby/object:Gem::Version
|
|
117
|
-
version: '0.22'
|
|
118
|
-
type: :development
|
|
119
|
-
prerelease: false
|
|
120
|
-
version_requirements: !ruby/object:Gem::Requirement
|
|
121
|
-
requirements:
|
|
122
|
-
- - "~>"
|
|
123
|
-
- !ruby/object:Gem::Version
|
|
124
|
-
version: '0.22'
|
|
125
111
|
description: Lightweight Ruby library for converting JSON data to TOON format, achieving
|
|
126
112
|
30-60% token reduction for LLM applications
|
|
127
113
|
email:
|
|
128
|
-
-
|
|
114
|
+
- jitendra.neema.8@gmail.com
|
|
129
115
|
executables: []
|
|
130
116
|
extensions: []
|
|
131
117
|
extra_rdoc_files: []
|
|
@@ -135,7 +121,10 @@ files:
|
|
|
135
121
|
- README.md
|
|
136
122
|
- lib/json_to_toon.rb
|
|
137
123
|
- lib/json_to_toon/encoder.rb
|
|
138
|
-
- lib/
|
|
124
|
+
- lib/ruby_json_toon.rb
|
|
125
|
+
- lib/ruby_json_toon/version.rb
|
|
126
|
+
- lib/toon_to_json.rb
|
|
127
|
+
- lib/toon_to_json/decoder.rb
|
|
139
128
|
homepage: https://github.com/jitendra-neema/ruby-json-toon
|
|
140
129
|
licenses:
|
|
141
130
|
- MIT
|