ruby-json-toon 0.2.0 → 0.3.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +4 -4
- data/CHANGELOG.md +6 -3
- data/README.md +42 -17
- data/lib/json_to_toon/version.rb +1 -1
- data/lib/toon_to_json/decoder.rb +527 -0
- data/lib/toon_to_json/version.rb +5 -0
- data/lib/toon_to_json.rb +14 -0
- metadata +6 -17
checksums.yaml
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
---
|
|
2
2
|
SHA256:
|
|
3
|
-
metadata.gz:
|
|
4
|
-
data.tar.gz:
|
|
3
|
+
metadata.gz: 239c3f74f2d761583b30c69d05882a1072d02c423b84446bbae9e7d241b09a2b
|
|
4
|
+
data.tar.gz: 4a9b92b5663e92fa1f99d3f61f9b80be0ad2b0fef1a5ebf16564573b3338e7b7
|
|
5
5
|
SHA512:
|
|
6
|
-
metadata.gz:
|
|
7
|
-
data.tar.gz:
|
|
6
|
+
metadata.gz: c54ca35838a6697763813f709ee382162d5efdefae022ed00cba6849a63f3135e06af366b217552ea2c6a941941db82cf0069fe03a960f178ff7130fff8f470e
|
|
7
|
+
data.tar.gz: ed409049fb431d60f786edc49ace9dd827429f59b67f01e0b08c238e8cc2619cb1b704dbea26888e392a48e8894d2bb8e1e41ed7ef2c938460908cfd34e79591
|
data/CHANGELOG.md
CHANGED
|
@@ -21,11 +21,14 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
|
|
|
21
21
|
- Initial Encoder class implementation
|
|
22
22
|
- Tests for JSON to TOON conversion
|
|
23
23
|
|
|
24
|
-
##[0.2.0] - First
|
|
24
|
+
## [0.2.0] - First Encoder release
|
|
25
25
|
### Added
|
|
26
|
-
|
|
26
|
+
- Updated to run release workflow on tag pushes
|
|
27
27
|
|
|
28
|
-
## [
|
|
28
|
+
## [0.3.0] - Decoder implementation
|
|
29
|
+
### Added
|
|
30
|
+
- Added decoder implementation
|
|
31
|
+
- Added ability to seperately require decoder using: require "toon_to_json"
|
|
29
32
|
|
|
30
33
|
### Added
|
|
31
34
|
- Initial release
|
data/README.md
CHANGED
|
@@ -1,10 +1,25 @@
|
|
|
1
1
|
# JSON to TOON
|
|
2
2
|
|
|
3
|
-
|
|
3
|
+
Lightweight Ruby library for converting JSON data to TOON (Token-Oriented Object Notation), achieving 30–60% token reduction for LLM applications.
|
|
4
|
+
|
|
5
|
+
## Summary
|
|
6
|
+
|
|
7
|
+
Convert JSON to TOON (Token-Oriented Object Notation)
|
|
8
|
+
|
|
9
|
+
Authors: Jitendra Neema
|
|
10
|
+
Contact: jitendra.neema.8@gmail.com
|
|
11
|
+
|
|
12
|
+
Homepage: https://github.com/jitendra-neema/ruby-json-toon
|
|
13
|
+
Documentation: https://rubydoc.info/gems/ruby-json-toon
|
|
14
|
+
Changelog: https://github.com/jitendra-neema/ruby-json-toon/blob/main/CHANGELOG.md
|
|
15
|
+
Bug tracker: https://github.com/jitendra-neema/ruby-json-toon/issues
|
|
16
|
+
Rubygems: https://rubygems.org/gems/ruby-json-toon
|
|
17
|
+
|
|
18
|
+
Requires Ruby >= 2.7.0
|
|
4
19
|
|
|
5
20
|
## What is TOON?
|
|
6
21
|
|
|
7
|
-
TOON (Token-Oriented Object Notation) is a compact, indentation-based data format optimized for LLM token efficiency. It uses 30
|
|
22
|
+
TOON (Token-Oriented Object Notation) is a compact, indentation-based data format optimized for LLM token efficiency. It uses roughly 30–60% fewer tokens than JSON while remaining human-readable.
|
|
8
23
|
|
|
9
24
|
### Comparison
|
|
10
25
|
|
|
@@ -27,16 +42,22 @@ users[2]{id,name,role}:
|
|
|
27
42
|
|
|
28
43
|
## Installation
|
|
29
44
|
|
|
30
|
-
|
|
45
|
+
Install the gem:
|
|
46
|
+
|
|
47
|
+
```bash
|
|
48
|
+
gem install ruby-json-toon
|
|
49
|
+
```
|
|
50
|
+
|
|
51
|
+
Or add to your Gemfile:
|
|
31
52
|
|
|
32
53
|
```ruby
|
|
33
|
-
gem '
|
|
54
|
+
gem 'ruby-json-toon'
|
|
34
55
|
```
|
|
35
56
|
|
|
36
|
-
|
|
57
|
+
Require the library in your code (require path follows the library files):
|
|
37
58
|
|
|
38
|
-
```
|
|
39
|
-
|
|
59
|
+
```ruby
|
|
60
|
+
require 'json_to_toon'
|
|
40
61
|
```
|
|
41
62
|
|
|
42
63
|
## Quick Start
|
|
@@ -57,10 +78,6 @@ json_data = JSON.parse('{"users":[{"id":1,"name":"Alice"}]}')
|
|
|
57
78
|
toon = JsonToToon.encode(json_data)
|
|
58
79
|
```
|
|
59
80
|
|
|
60
|
-
## Documentation
|
|
61
|
-
|
|
62
|
-
See full documentation at [rubydoc.info](https://rubydoc.info/gems/json_to_toon)
|
|
63
|
-
|
|
64
81
|
## Options
|
|
65
82
|
|
|
66
83
|
```ruby
|
|
@@ -73,8 +90,13 @@ JsonToToon.encode(data,
|
|
|
73
90
|
|
|
74
91
|
## Development
|
|
75
92
|
|
|
93
|
+
Clone the repo, install dependencies, run tests, and build the gem:
|
|
94
|
+
|
|
76
95
|
```bash
|
|
77
|
-
|
|
96
|
+
git clone https://github.com/jitendra-neema/ruby-json-toon
|
|
97
|
+
cd ruby-json-toon
|
|
98
|
+
|
|
99
|
+
# Install development dependencies
|
|
78
100
|
bundle install
|
|
79
101
|
|
|
80
102
|
# Run tests
|
|
@@ -84,15 +106,18 @@ bundle exec rspec
|
|
|
84
106
|
bundle exec rubocop
|
|
85
107
|
|
|
86
108
|
# Build gem
|
|
87
|
-
gem build
|
|
109
|
+
gem build ruby-json-toon.gemspec
|
|
88
110
|
```
|
|
89
111
|
|
|
112
|
+
Development dependencies (from the gemspec): benchmark-ips, memory_profiler, rake, rspec, rubocop, rubocop-rake, rubocop-rspec, simplecov.
|
|
113
|
+
|
|
90
114
|
## License
|
|
91
115
|
|
|
92
|
-
MIT License
|
|
116
|
+
MIT License — see LICENSE file for details.
|
|
93
117
|
|
|
94
118
|
## Links
|
|
95
119
|
|
|
96
|
-
-
|
|
97
|
-
-
|
|
98
|
-
-
|
|
120
|
+
- TOON Specification: https://toonformat.dev
|
|
121
|
+
- Homepage / source: https://github.com/jitendra-neema/ruby-json-toon
|
|
122
|
+
- Documentation: https://rubydoc.info/gems/ruby-json-toon
|
|
123
|
+
- Bug tracker: https://github.com/jitendra-neema/ruby-json-toon/issues
|
data/lib/json_to_toon/version.rb
CHANGED
|
@@ -0,0 +1,527 @@
|
|
|
1
|
+
# frozen_string_literal: true
|
|
2
|
+
|
|
3
|
+
module ToonToJson
|
|
4
|
+
# Efficiently converts TOON format directly to JSON string
|
|
5
|
+
class Decoder
|
|
6
|
+
def decode(str)
|
|
7
|
+
return 'null' if str.nil? || str.empty?
|
|
8
|
+
|
|
9
|
+
lines = str.to_s.split("\n")
|
|
10
|
+
|
|
11
|
+
# Single-line primitive detection
|
|
12
|
+
if lines.length == 1
|
|
13
|
+
line = lines.first.strip
|
|
14
|
+
|
|
15
|
+
# Check if it's a primitive (not a TOON structure)
|
|
16
|
+
# TOON structures have:
|
|
17
|
+
# - Colons (key:value or key:)
|
|
18
|
+
# - List items (starts with "- " - note the space!)
|
|
19
|
+
# - Array headers (starts with [)
|
|
20
|
+
|
|
21
|
+
is_structure = line.include?(':') ||
|
|
22
|
+
line.match?(/\A-\s/) || # "- " with space (list item)
|
|
23
|
+
line.match?(/\A\[/) # Array header
|
|
24
|
+
|
|
25
|
+
return primitive_to_json(line) unless is_structure
|
|
26
|
+
end
|
|
27
|
+
|
|
28
|
+
@lines = lines.map { |l| { raw: l, indent: leading_spaces(l), text: l.lstrip } }
|
|
29
|
+
@i = 0
|
|
30
|
+
@indent_unit = detect_indent_unit
|
|
31
|
+
@output = []
|
|
32
|
+
|
|
33
|
+
parse_block(0)
|
|
34
|
+
@output.join
|
|
35
|
+
end
|
|
36
|
+
|
|
37
|
+
private
|
|
38
|
+
|
|
39
|
+
def leading_spaces(line)
|
|
40
|
+
line[/^\s*/].size
|
|
41
|
+
end
|
|
42
|
+
|
|
43
|
+
def detect_indent_unit
|
|
44
|
+
prev = 0
|
|
45
|
+
@lines.each do |ln|
|
|
46
|
+
next if ln[:raw].strip.empty?
|
|
47
|
+
return ln[:indent] - prev if ln[:indent] > prev
|
|
48
|
+
|
|
49
|
+
prev = ln[:indent]
|
|
50
|
+
end
|
|
51
|
+
2
|
|
52
|
+
end
|
|
53
|
+
|
|
54
|
+
def parse_block(min_indent)
|
|
55
|
+
return parse_object_block(min_indent) if @i >= @lines.length
|
|
56
|
+
|
|
57
|
+
# Check if first line is array header
|
|
58
|
+
first_line = @lines[@i][:text]
|
|
59
|
+
if first_line.match?(/\A\[/)
|
|
60
|
+
array_header = parse_array_header(first_line)
|
|
61
|
+
if array_header && array_header[:key].nil?
|
|
62
|
+
# Root-level array
|
|
63
|
+
@i += 1
|
|
64
|
+
parse_array_body(array_header, @lines[@i - 1][:indent] + @indent_unit)
|
|
65
|
+
return
|
|
66
|
+
end
|
|
67
|
+
end
|
|
68
|
+
|
|
69
|
+
# Check if list format (starts with "- " with space)
|
|
70
|
+
start_i = @i
|
|
71
|
+
while start_i < @lines.length
|
|
72
|
+
ln = @lines[start_i]
|
|
73
|
+
break if ln[:raw].strip.empty?
|
|
74
|
+
break if ln[:indent] < min_indent
|
|
75
|
+
|
|
76
|
+
return parse_list_block(min_indent) if ln[:indent] >= min_indent && ln[:text].match?(/\A-\s/)
|
|
77
|
+
|
|
78
|
+
break if ln[:text].include?(':')
|
|
79
|
+
|
|
80
|
+
start_i += 1
|
|
81
|
+
end
|
|
82
|
+
|
|
83
|
+
parse_object_block(min_indent)
|
|
84
|
+
end
|
|
85
|
+
|
|
86
|
+
def parse_object_block(min_indent)
|
|
87
|
+
@output << '{'
|
|
88
|
+
first = true
|
|
89
|
+
|
|
90
|
+
while @i < @lines.length
|
|
91
|
+
ln = @lines[@i]
|
|
92
|
+
break if ln[:raw].strip.empty?
|
|
93
|
+
break if ln[:indent] < min_indent
|
|
94
|
+
|
|
95
|
+
# Array header
|
|
96
|
+
if (array_header = parse_array_header(ln[:text]))
|
|
97
|
+
@output << ',' unless first
|
|
98
|
+
first = false
|
|
99
|
+
|
|
100
|
+
key = array_header[:key]
|
|
101
|
+
@i += 1
|
|
102
|
+
|
|
103
|
+
if key
|
|
104
|
+
@output << json_string(key)
|
|
105
|
+
@output << ':'
|
|
106
|
+
end
|
|
107
|
+
|
|
108
|
+
parse_array_body(array_header, ln[:indent] + @indent_unit)
|
|
109
|
+
next
|
|
110
|
+
end
|
|
111
|
+
|
|
112
|
+
# Key-value pair
|
|
113
|
+
if (kv = parse_key_value_line(ln[:text]))
|
|
114
|
+
@output << ',' unless first
|
|
115
|
+
first = false
|
|
116
|
+
|
|
117
|
+
@output << json_string(kv[:key])
|
|
118
|
+
@output << ':'
|
|
119
|
+
@output << kv[:value]
|
|
120
|
+
@i += 1
|
|
121
|
+
next
|
|
122
|
+
end
|
|
123
|
+
|
|
124
|
+
# Key-only (nested object)
|
|
125
|
+
if (key = parse_key_only(ln[:text]))
|
|
126
|
+
@output << ',' unless first
|
|
127
|
+
first = false
|
|
128
|
+
|
|
129
|
+
@output << json_string(key)
|
|
130
|
+
@output << ':'
|
|
131
|
+
@i += 1
|
|
132
|
+
parse_block(ln[:indent] + @indent_unit)
|
|
133
|
+
next
|
|
134
|
+
end
|
|
135
|
+
|
|
136
|
+
@i += 1
|
|
137
|
+
end
|
|
138
|
+
|
|
139
|
+
@output << '}'
|
|
140
|
+
end
|
|
141
|
+
|
|
142
|
+
def parse_array_header(text)
|
|
143
|
+
m = text.match(/\A(?:(?<key>.+?)?)?\[(?<len>#?\d+)(?<marker>[\t|]?)\](?:\{(?<fields>[^}]*)\})?:(?:\s*(?<rest>.*))?\z/)
|
|
144
|
+
return nil unless m
|
|
145
|
+
|
|
146
|
+
key = m[:key]&.strip
|
|
147
|
+
key = parse_quoted_key(key) if key && !key.empty?
|
|
148
|
+
|
|
149
|
+
{
|
|
150
|
+
key: key,
|
|
151
|
+
length: m[:len].sub(/^#/, '').to_i,
|
|
152
|
+
fields: m[:fields],
|
|
153
|
+
inline: m[:rest],
|
|
154
|
+
marker: m[:marker]
|
|
155
|
+
}
|
|
156
|
+
end
|
|
157
|
+
|
|
158
|
+
def parse_array_body(header, child_indent)
|
|
159
|
+
@output << '['
|
|
160
|
+
|
|
161
|
+
# Inline values
|
|
162
|
+
if header[:inline] && !header[:inline].strip.empty?
|
|
163
|
+
delim = detect_delimiter(header[:marker], header[:fields])
|
|
164
|
+
values = split_with_quotes(header[:inline], delim)
|
|
165
|
+
|
|
166
|
+
values.each_with_index do |v, idx|
|
|
167
|
+
@output << ',' if idx > 0
|
|
168
|
+
@output << value_to_json(v.strip)
|
|
169
|
+
end
|
|
170
|
+
|
|
171
|
+
@output << ']'
|
|
172
|
+
return
|
|
173
|
+
end
|
|
174
|
+
|
|
175
|
+
# Tabular format
|
|
176
|
+
if header[:fields]
|
|
177
|
+
delim = detect_delimiter(header[:marker], header[:fields])
|
|
178
|
+
fields = split_with_quotes(header[:fields], delim)
|
|
179
|
+
first = true
|
|
180
|
+
|
|
181
|
+
while @i < @lines.length && @lines[@i][:indent] >= child_indent
|
|
182
|
+
row_text = @lines[@i][:text]
|
|
183
|
+
break if row_text.strip.empty?
|
|
184
|
+
|
|
185
|
+
@output << ',' unless first
|
|
186
|
+
first = false
|
|
187
|
+
|
|
188
|
+
values = split_with_quotes(row_text, delim)
|
|
189
|
+
|
|
190
|
+
@output << '{'
|
|
191
|
+
fields.each_with_index do |f, idx|
|
|
192
|
+
@output << ',' if idx > 0
|
|
193
|
+
@output << json_string(unquote_if_quoted(f.strip))
|
|
194
|
+
@output << ':'
|
|
195
|
+
@output << value_to_json(values[idx]&.strip || 'null')
|
|
196
|
+
end
|
|
197
|
+
@output << '}'
|
|
198
|
+
|
|
199
|
+
@i += 1
|
|
200
|
+
end
|
|
201
|
+
|
|
202
|
+
@output << ']'
|
|
203
|
+
return
|
|
204
|
+
end
|
|
205
|
+
|
|
206
|
+
if @i < @lines.length && @lines[@i][:indent] >= child_indent
|
|
207
|
+
peek_text = @lines[@i][:text]
|
|
208
|
+
peek_header = parse_array_header(peek_text)
|
|
209
|
+
|
|
210
|
+
if peek_header && peek_header[:key].nil?
|
|
211
|
+
parse_array_of_arrays(child_indent)
|
|
212
|
+
return
|
|
213
|
+
end
|
|
214
|
+
end
|
|
215
|
+
|
|
216
|
+
# List format - parse items directly
|
|
217
|
+
if @i < @lines.length && @lines[@i][:indent] >= child_indent &&
|
|
218
|
+
@lines[@i][:text].match?(/\A-\s/)
|
|
219
|
+
first = true
|
|
220
|
+
while @i < @lines.length
|
|
221
|
+
ln = @lines[@i]
|
|
222
|
+
break if ln[:raw].strip.empty?
|
|
223
|
+
break if ln[:indent] < child_indent
|
|
224
|
+
break unless ln[:text].match?(/\A-\s/)
|
|
225
|
+
|
|
226
|
+
@output << ',' unless first
|
|
227
|
+
first = false
|
|
228
|
+
|
|
229
|
+
after = ln[:text][2..]&.strip || ''
|
|
230
|
+
|
|
231
|
+
if after.empty?
|
|
232
|
+
@i += 1
|
|
233
|
+
if @i < @lines.length && @lines[@i][:indent] > ln[:indent]
|
|
234
|
+
parse_block(@lines[@i][:indent])
|
|
235
|
+
else
|
|
236
|
+
@output << '{}'
|
|
237
|
+
end
|
|
238
|
+
next
|
|
239
|
+
end
|
|
240
|
+
|
|
241
|
+
if (kv = parse_key_value_line(after))
|
|
242
|
+
@output << '{'
|
|
243
|
+
@output << json_string(kv[:key])
|
|
244
|
+
@output << ':'
|
|
245
|
+
@output << kv[:value]
|
|
246
|
+
|
|
247
|
+
@i += 1
|
|
248
|
+
|
|
249
|
+
if @i < @lines.length && @lines[@i][:indent] > ln[:indent]
|
|
250
|
+
child_ind = @lines[@i][:indent]
|
|
251
|
+
|
|
252
|
+
while @i < @lines.length && @lines[@i][:indent] >= child_ind
|
|
253
|
+
field_ln = @lines[@i]
|
|
254
|
+
break if field_ln[:text].match?(/\A-\s/)
|
|
255
|
+
|
|
256
|
+
if field_kv = parse_key_value_line(field_ln[:text])
|
|
257
|
+
@output << ','
|
|
258
|
+
@output << json_string(field_kv[:key])
|
|
259
|
+
@output << ':'
|
|
260
|
+
@output << field_kv[:value]
|
|
261
|
+
@i += 1
|
|
262
|
+
elsif field_key = parse_key_only(field_ln[:text])
|
|
263
|
+
@output << ','
|
|
264
|
+
@output << json_string(field_key)
|
|
265
|
+
@output << ':'
|
|
266
|
+
@i += 1
|
|
267
|
+
parse_block(field_ln[:indent] + @indent_unit)
|
|
268
|
+
else
|
|
269
|
+
break
|
|
270
|
+
end
|
|
271
|
+
end
|
|
272
|
+
end
|
|
273
|
+
|
|
274
|
+
@output << '}'
|
|
275
|
+
next
|
|
276
|
+
end
|
|
277
|
+
|
|
278
|
+
if after.end_with?(':')
|
|
279
|
+
key = unquote_if_quoted(after[0...-1].strip)
|
|
280
|
+
@i += 1
|
|
281
|
+
|
|
282
|
+
@output << '{'
|
|
283
|
+
@output << json_string(key)
|
|
284
|
+
@output << ':'
|
|
285
|
+
|
|
286
|
+
if @i < @lines.length && @lines[@i][:indent] > ln[:indent]
|
|
287
|
+
parse_block(@lines[@i][:indent])
|
|
288
|
+
else
|
|
289
|
+
@output << '{}'
|
|
290
|
+
end
|
|
291
|
+
|
|
292
|
+
@output << '}'
|
|
293
|
+
next
|
|
294
|
+
end
|
|
295
|
+
|
|
296
|
+
@output << value_to_json(after)
|
|
297
|
+
@i += 1
|
|
298
|
+
end
|
|
299
|
+
|
|
300
|
+
@output << ']'
|
|
301
|
+
return
|
|
302
|
+
end
|
|
303
|
+
|
|
304
|
+
# Empty array
|
|
305
|
+
@output << ']'
|
|
306
|
+
end
|
|
307
|
+
|
|
308
|
+
def parse_array_of_arrays(child_indent)
|
|
309
|
+
first_item = true
|
|
310
|
+
|
|
311
|
+
while @i < @lines.length && @lines[@i][:indent] >= child_indent
|
|
312
|
+
child_text = @lines[@i][:text]
|
|
313
|
+
break if child_text.strip.empty?
|
|
314
|
+
|
|
315
|
+
child_header = parse_array_header(child_text)
|
|
316
|
+
break unless child_header && child_header[:key].nil? # Must be headerless array
|
|
317
|
+
|
|
318
|
+
@output << ',' unless first_item
|
|
319
|
+
first_item = false
|
|
320
|
+
@i += 1
|
|
321
|
+
|
|
322
|
+
parse_array_body(child_header, @lines[@i - 1][:indent] + @indent_unit)
|
|
323
|
+
end
|
|
324
|
+
|
|
325
|
+
@output << ']'
|
|
326
|
+
end
|
|
327
|
+
|
|
328
|
+
def split_with_quotes(text, delimiter)
|
|
329
|
+
return [text] if text.nil? || text.empty?
|
|
330
|
+
|
|
331
|
+
values = []
|
|
332
|
+
current = +''
|
|
333
|
+
in_quotes = false
|
|
334
|
+
i = 0
|
|
335
|
+
|
|
336
|
+
while i < text.length
|
|
337
|
+
c = text[i]
|
|
338
|
+
|
|
339
|
+
if c == '\\' && i + 1 < text.length
|
|
340
|
+
current << c << text[i + 1]
|
|
341
|
+
i += 2
|
|
342
|
+
next
|
|
343
|
+
end
|
|
344
|
+
|
|
345
|
+
if c == '"'
|
|
346
|
+
in_quotes = !in_quotes
|
|
347
|
+
current << c
|
|
348
|
+
i += 1
|
|
349
|
+
next
|
|
350
|
+
end
|
|
351
|
+
|
|
352
|
+
if c == delimiter && !in_quotes
|
|
353
|
+
values << current
|
|
354
|
+
current = +''
|
|
355
|
+
i += 1
|
|
356
|
+
next
|
|
357
|
+
end
|
|
358
|
+
|
|
359
|
+
current << c
|
|
360
|
+
i += 1
|
|
361
|
+
end
|
|
362
|
+
|
|
363
|
+
values << current unless current.empty?
|
|
364
|
+
values.map(&:strip)
|
|
365
|
+
end
|
|
366
|
+
|
|
367
|
+
def detect_delimiter(marker, fields)
|
|
368
|
+
return "\t" if marker == "\t"
|
|
369
|
+
return '|' if marker == '|'
|
|
370
|
+
return "\t" if fields&.include?("\t")
|
|
371
|
+
return '|' if fields&.include?('|')
|
|
372
|
+
|
|
373
|
+
','
|
|
374
|
+
end
|
|
375
|
+
|
|
376
|
+
def parse_key_value_line(text)
|
|
377
|
+
key = nil
|
|
378
|
+
rest = nil
|
|
379
|
+
|
|
380
|
+
if text.start_with?('"')
|
|
381
|
+
idx = 1
|
|
382
|
+
while idx < text.length
|
|
383
|
+
if text[idx] == '\\' && idx + 1 < text.length
|
|
384
|
+
idx += 2
|
|
385
|
+
next
|
|
386
|
+
end
|
|
387
|
+
break if text[idx] == '"'
|
|
388
|
+
|
|
389
|
+
idx += 1
|
|
390
|
+
end
|
|
391
|
+
return nil if idx >= text.length
|
|
392
|
+
|
|
393
|
+
key = unescape_string(text[1...idx])
|
|
394
|
+
after = text[(idx + 1)..]&.lstrip
|
|
395
|
+
return nil unless after&.start_with?(':')
|
|
396
|
+
|
|
397
|
+
rest = after[1..]&.lstrip
|
|
398
|
+
elsif (cpos = text.index(':'))
|
|
399
|
+
key = text[0...cpos].strip
|
|
400
|
+
rest = text[(cpos + 1)..]&.lstrip
|
|
401
|
+
else
|
|
402
|
+
return nil
|
|
403
|
+
end
|
|
404
|
+
|
|
405
|
+
return nil if rest.nil? || rest.empty?
|
|
406
|
+
|
|
407
|
+
{ key: unquote_if_quoted(key), value: value_to_json(rest) }
|
|
408
|
+
end
|
|
409
|
+
|
|
410
|
+
def parse_key_only(text)
|
|
411
|
+
return nil unless text.end_with?(':')
|
|
412
|
+
|
|
413
|
+
key_part = text[0...-1].strip
|
|
414
|
+
return nil if key_part.empty?
|
|
415
|
+
|
|
416
|
+
unquote_if_quoted(key_part)
|
|
417
|
+
end
|
|
418
|
+
|
|
419
|
+
def unquote_if_quoted(str)
|
|
420
|
+
return str unless str&.start_with?('"') && str.end_with?('"')
|
|
421
|
+
|
|
422
|
+
unescape_string(str[1...-1])
|
|
423
|
+
end
|
|
424
|
+
|
|
425
|
+
def parse_quoted_key(key)
|
|
426
|
+
key = key.strip
|
|
427
|
+
if key.start_with?('"') && key.end_with?('"')
|
|
428
|
+
unescape_string(key[1...-1])
|
|
429
|
+
else
|
|
430
|
+
key
|
|
431
|
+
end
|
|
432
|
+
end
|
|
433
|
+
|
|
434
|
+
def value_to_json(text)
|
|
435
|
+
return 'null' if text.nil? || text == 'null'
|
|
436
|
+
|
|
437
|
+
t = text.strip
|
|
438
|
+
|
|
439
|
+
return json_string(unescape_string(t[1...-1])) if t.start_with?('"') && t.end_with?('"')
|
|
440
|
+
|
|
441
|
+
return 'true' if t.casecmp('true').zero?
|
|
442
|
+
return 'false' if t.casecmp('false').zero?
|
|
443
|
+
return 'null' if t.casecmp('null').zero?
|
|
444
|
+
|
|
445
|
+
# Numbers
|
|
446
|
+
return t if t.match?(/\A-?\d+(?:\.\d+)?(?:[eE][+-]?\d+)?\z/)
|
|
447
|
+
|
|
448
|
+
json_string(t)
|
|
449
|
+
end
|
|
450
|
+
|
|
451
|
+
def primitive_to_json(token)
|
|
452
|
+
return 'null' if token.nil? || token.casecmp?('null')
|
|
453
|
+
return 'true' if token.casecmp?('true')
|
|
454
|
+
return 'false' if token.casecmp?('false')
|
|
455
|
+
return token if token.match?(/\A-?\d+(?:\.\d+)?(?:[eE][+-]?\d+)?\z/)
|
|
456
|
+
|
|
457
|
+
json_string(token)
|
|
458
|
+
end
|
|
459
|
+
|
|
460
|
+
def json_string(str)
|
|
461
|
+
return '""' if str.nil? || str.empty?
|
|
462
|
+
|
|
463
|
+
result = +'"'
|
|
464
|
+
|
|
465
|
+
str.each_char do |c|
|
|
466
|
+
result << case c
|
|
467
|
+
when '"' then '\\"'
|
|
468
|
+
when '\\' then '\\\\'
|
|
469
|
+
when "\n" then '\\n'
|
|
470
|
+
when "\r" then '\\r'
|
|
471
|
+
when "\t" then '\\t'
|
|
472
|
+
when "\b" then '\\b'
|
|
473
|
+
when "\f" then '\\f'
|
|
474
|
+
else
|
|
475
|
+
if c.ord < 32
|
|
476
|
+
format('\\u%04x', c.ord)
|
|
477
|
+
else
|
|
478
|
+
c
|
|
479
|
+
end
|
|
480
|
+
end
|
|
481
|
+
end
|
|
482
|
+
|
|
483
|
+
result << '"'
|
|
484
|
+
result
|
|
485
|
+
end
|
|
486
|
+
|
|
487
|
+
def unescape_string(s)
|
|
488
|
+
result = +''
|
|
489
|
+
i = 0
|
|
490
|
+
|
|
491
|
+
while i < s.length
|
|
492
|
+
if s[i] == '\\' && i + 1 < s.length
|
|
493
|
+
case s[i + 1]
|
|
494
|
+
when 'n' then result << "\n"
|
|
495
|
+
when 'r' then result << "\r"
|
|
496
|
+
when 't' then result << "\t"
|
|
497
|
+
when '"' then result << '"'
|
|
498
|
+
when '\\' then result << '\\'
|
|
499
|
+
when 'b' then result << "\b"
|
|
500
|
+
when 'f' then result << "\f"
|
|
501
|
+
when 'u'
|
|
502
|
+
if i + 5 < s.length
|
|
503
|
+
hex = s[i + 2..i + 5]
|
|
504
|
+
begin
|
|
505
|
+
result << [hex.to_i(16)].pack('U')
|
|
506
|
+
rescue StandardError
|
|
507
|
+
result << s[i..i + 5]
|
|
508
|
+
end
|
|
509
|
+
i += 6
|
|
510
|
+
next
|
|
511
|
+
else
|
|
512
|
+
result << s[i] << s[i + 1]
|
|
513
|
+
end
|
|
514
|
+
else
|
|
515
|
+
result << s[i] << s[i + 1]
|
|
516
|
+
end
|
|
517
|
+
i += 2
|
|
518
|
+
else
|
|
519
|
+
result << s[i]
|
|
520
|
+
i += 1
|
|
521
|
+
end
|
|
522
|
+
end
|
|
523
|
+
|
|
524
|
+
result
|
|
525
|
+
end
|
|
526
|
+
end
|
|
527
|
+
end
|
data/lib/toon_to_json.rb
ADDED
|
@@ -0,0 +1,14 @@
|
|
|
1
|
+
require_relative 'toon_to_json/decoder'
|
|
2
|
+
|
|
3
|
+
module ToonToJson
|
|
4
|
+
class Error < StandardError; end
|
|
5
|
+
|
|
6
|
+
# Decode a TOON-formatted string into a Ruby object
|
|
7
|
+
#
|
|
8
|
+
# @param toon_str [String] The TOON-formatted string to decode
|
|
9
|
+
# @return [Object] The decoded Ruby object (Hash, Array, or primitive)
|
|
10
|
+
|
|
11
|
+
def self.decode(toon_str)
|
|
12
|
+
Decoder.new.decode(toon_str)
|
|
13
|
+
end
|
|
14
|
+
end
|
metadata
CHANGED
|
@@ -1,14 +1,14 @@
|
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
|
2
2
|
name: ruby-json-toon
|
|
3
3
|
version: !ruby/object:Gem::Version
|
|
4
|
-
version: 0.
|
|
4
|
+
version: 0.3.0
|
|
5
5
|
platform: ruby
|
|
6
6
|
authors:
|
|
7
7
|
- Jitendra Neema
|
|
8
8
|
autorequire:
|
|
9
9
|
bindir: bin
|
|
10
10
|
cert_chain: []
|
|
11
|
-
date: 2026-01-
|
|
11
|
+
date: 2026-01-09 00:00:00.000000000 Z
|
|
12
12
|
dependencies:
|
|
13
13
|
- !ruby/object:Gem::Dependency
|
|
14
14
|
name: benchmark-ips
|
|
@@ -108,24 +108,10 @@ dependencies:
|
|
|
108
108
|
- - "~>"
|
|
109
109
|
- !ruby/object:Gem::Version
|
|
110
110
|
version: '3.0'
|
|
111
|
-
- !ruby/object:Gem::Dependency
|
|
112
|
-
name: simplecov
|
|
113
|
-
requirement: !ruby/object:Gem::Requirement
|
|
114
|
-
requirements:
|
|
115
|
-
- - "~>"
|
|
116
|
-
- !ruby/object:Gem::Version
|
|
117
|
-
version: '0.22'
|
|
118
|
-
type: :development
|
|
119
|
-
prerelease: false
|
|
120
|
-
version_requirements: !ruby/object:Gem::Requirement
|
|
121
|
-
requirements:
|
|
122
|
-
- - "~>"
|
|
123
|
-
- !ruby/object:Gem::Version
|
|
124
|
-
version: '0.22'
|
|
125
111
|
description: Lightweight Ruby library for converting JSON data to TOON format, achieving
|
|
126
112
|
30-60% token reduction for LLM applications
|
|
127
113
|
email:
|
|
128
|
-
-
|
|
114
|
+
- jitendra.neema.8@gmail.com
|
|
129
115
|
executables: []
|
|
130
116
|
extensions: []
|
|
131
117
|
extra_rdoc_files: []
|
|
@@ -136,6 +122,9 @@ files:
|
|
|
136
122
|
- lib/json_to_toon.rb
|
|
137
123
|
- lib/json_to_toon/encoder.rb
|
|
138
124
|
- lib/json_to_toon/version.rb
|
|
125
|
+
- lib/toon_to_json.rb
|
|
126
|
+
- lib/toon_to_json/decoder.rb
|
|
127
|
+
- lib/toon_to_json/version.rb
|
|
139
128
|
homepage: https://github.com/jitendra-neema/ruby-json-toon
|
|
140
129
|
licenses:
|
|
141
130
|
- MIT
|