cton 0.2.0 → 0.3.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: b010a8f0e0da39e4e4d0a4217eddaa8f9496f1889bf32e12430fdb7737f17fab
4
- data.tar.gz: 6fe6f58ff0a40233a279ae5c8881ccca4ce382fa85cae15c2c5e26782bb02875
3
+ metadata.gz: 1c9161ae830ba6b01d3ec94d1170fc4295aaacfa839c869aaa6adefe2711cc2d
4
+ data.tar.gz: 80c4ba30abbf8a562bde581f26e7dc5529aa46275b61930fe28370f34156db61
5
5
  SHA512:
6
- metadata.gz: 3a85563dd205c2c00b204359d85376514de8fc45ce2b2c98e4d52a0325bff2937e2d88ba5e367fe718a0b82127603deadfe16dd6f60062e77a1b75babc666ec4
7
- data.tar.gz: b4b27bfb483e0145c49def7b9ab735c27e03420dc59fd6bcaabc57d1b2bf6868d7bc5c55fea9866da3270a6c81126df032590129a1e28385827b8b4f3058e92a
6
+ metadata.gz: 914196284081bacd5b7f5f6ac9a1b246ea8924eddbb26cd28796b12a2ee2156718a9ff3b795b86188d483da4fc294b002a93e935815f692b432dac78b5304dcf
7
+ data.tar.gz: d9b1bfb1f7de402fe9de0d7da90f750d490dbdcb4e572e9283bc3ed15b1b43e3f3f9b44a4031f80ecf5d610a73d64719f3c78f170060de78035e04dab3a9d663
data/CHANGELOG.md CHANGED
@@ -5,6 +5,25 @@ All notable changes to this project will be documented in this file.
5
5
  The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
6
6
  and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
7
7
 
8
+ ## [0.3.0] - 2025-11-20
9
+
10
+ ### Added
11
+
12
+ - **Performance tunables**: `Cton.dump` now accepts `decimal_mode: :fast | :precise`, allowing callers to trade float-format determinism for lower allocation pressure. Specs cover both modes.
13
+ - **Benchmark harness**: New `bench/encode_decode_bench.rb` script (wired into the README/Development docs) exercises encode/decode hot paths and prints comparative JSON vs CTON timings. On Ruby 3.1.4/macOS the fast encoder completes 1,000 iterations in ~0.63s and the new inline decoder stress test wraps 400 concatenated documents in ~4.14s.
14
+ - **Regression tests**: Added specs for streaming documents without separators plus validation around the new decimal mode toggle.
15
+
16
+ ### Changed
17
+
18
+ - **Encoder**: Memoizes table schemas per array instance, adds a fast-path for homogeneous scalar lists, and reduces float/BigDecimal copying by favoring Ruby's native float formatting before falling back to `BigDecimal`. Unsupported `decimal_mode` values now raise immediately.
19
+ - **Decoder**: Replaces high-allocation `StringScanner` tokenization with raw string slicing, improves key-boundary detection for inline payloads, and keeps symbolization logic untouched. Boundary heuristics now prefer alphabetic key starts to avoid splitting numeric payloads.
20
+ - **Documentation**: README now calls out the tuning flags, inline caveats, and benchmark instructions; Development workflow highlights how to rerun the perf suite.
21
+
22
+ ### Fixed
23
+
24
+ - **Inline parsing**: Eliminated the runaway allocations and incorrect key splits when processing long documents with `separator: ""`.
25
+ - **Float normalization**: Restored canonical `9.2`-style output in fast mode while keeping the new perf optimizations.
26
+
8
27
  ## [0.2.0] - 2025-11-19
9
28
 
10
29
  ### Added
data/README.md CHANGED
@@ -15,6 +15,8 @@
15
15
  - [Token Savings](#token-savings-vs-json--toon)
16
16
  - [Installation](#installation)
17
17
  - [Usage](#usage)
18
+ - [Performance & Benchmarks](#performance--benchmarks)
19
+ - [Teaching CTON to LLMs](#teaching-cton-to-llms)
18
20
  - [Development](#development)
19
21
  - [Contributing](#contributing)
20
22
  - [License](#license)
@@ -165,6 +167,10 @@ pretty = Cton.dump(payload, pretty: true)
165
167
  File.open("data.cton", "w") do |f|
166
168
  Cton.dump(payload, f)
167
169
  end
170
+
171
+ # Toggle float normalization strategies
172
+ fast = Cton.dump(payload) # default :fast mode
173
+ strict = Cton.dump(payload, decimal_mode: :precise)
168
174
  ```
169
175
 
170
176
  ### CLI Tool
@@ -196,7 +202,7 @@ CTON natively supports serialization for:
196
202
  Whenever an array is made of hashes that all expose the same scalar keys, the encoder flattens it into a table to save tokens. Mixed or nested arrays fall back to `[N]=(value1,value2,...)`.
197
203
 
198
204
  #### Separators & ambiguity
199
- Removing every newline makes certain inputs ambiguous because `sam` and the next key `hikes` can merge into `samhikes`. The default `separator: "\n"` avoids that by inserting a single newline between root segments. You may pass `separator: ""` to `Cton.dump` for maximum compactness, but decoding such strings is only safe if you can guarantee extra quoting or whitespace between segments.
205
+ Removing every newline makes certain inputs ambiguous because `sam` and the next key `hikes` can merge into `samhikes`. The default `separator: "\n"` avoids that by inserting a single newline between root segments. You may pass `separator: ""` to `Cton.dump` for maximum compactness, but decoding such strings is only safe if you can guarantee extra quoting or whitespace between segments. When you intentionally omit separators, keep next-level keys alphabetic (e.g., `payload`, `k42`) so the decoder's boundary heuristic can split `...1payload...` without misclassifying numeric prefixes.
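Reviewer note: to see why alphabetic key starts matter, here is a minimal sketch of the boundary scan, simplified from the `find_key_boundary` change to `decoder.rb` in this diff. The constant names, the `alpha_start:` toggle, and the exact safe-key character class are assumptions introduced for illustration; this is not the gem's API.

```ruby
# Characters allowed inside a bare key (assumed; taken from the 0.2.0 scan regex).
KEY_CHAR = /[0-9A-Za-z_.:-]/
# Characters allowed to start a key at a boundary (mirrors boundary_start_allowed?).
KEY_START = /[A-Za-z_.:-]/
BOUNDARY_TOKENS = ["(", "[", "="].freeze
STOP_CHARS = [",", ";", ")", "]", "}", "(", "[", "{"].freeze

# Returns the index where the next key begins inside an unseparated value,
# or nil when the value runs to a terminator first.
def find_key_boundary(str, from, alpha_start: true)
  idx = from
  while idx < str.length
    char = str[idx]
    return nil if STOP_CHARS.include?(char) || char.match?(/\s/)

    if char.match?(KEY_CHAR)
      key_end = idx
      key_end += 1 while key_end < str.length && str[key_end].match?(KEY_CHAR)
      followed_by_boundary = key_end < str.length && BOUNDARY_TOKENS.include?(str[key_end])
      start_ok = !alpha_start || char.match?(KEY_START)
      return idx if followed_by_boundary && idx > from && start_ok
    end
    idx += 1
  end
  nil
end

inline = "a=12345b=2"
new_split = find_key_boundary(inline, 2)                     # => 7, value "12345", key "b"
old_split = find_key_boundary(inline, 2, alpha_start: false) # => 3, value "1", key "2345b"
```

With the alphabetic-start check the numeric payload stays intact. In this sketch the start class also admits `.`, `:` and `-` (copied from `boundary_start_allowed?`), so a decimal such as `3.14` directly followed by a key still splits at the dot; quoting or a separator remains the safer choice for such values.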
200
206
 
201
207
  #### Literal safety & number normalization
202
208
  Following the TOON specification's guardrails, the encoder now:
@@ -204,6 +210,106 @@ Following the TOON specification's guardrails, the encoder now:
204
210
  - Canonicalizes float/BigDecimal output: no exponent notation, no trailing zeros, and `-0` collapses to `0`.
205
211
  - Converts `NaN` and `±Infinity` inputs to `null`, matching TOON's normalization guidance so downstream decoders don't explode on non-finite numbers.
206
212
 
213
+ #### Decimal normalization modes
214
+ - `decimal_mode: :fast` (default) prefers Ruby's native float representation and only falls back to `BigDecimal` when scientific notation is detected, minimizing allocations on tight loops.
215
+ - `decimal_mode: :precise` forces the legacy `BigDecimal` path for every float, which is slower but useful for audit-grade dumps where you want deterministic decimal expansion.
216
+ - Both modes share the same trailing-zero stripping and `-0 → 0` normalization, so switching modes never affects integer formatting.
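Reviewer note: the two modes can be sketched as a single stand-alone helper. This is a simplified illustration based on the `float_decimal_string` change in `encoder.rb`, assuming finite floats only; `format_float` and `DECIMAL_MODES` are hypothetical names, not part of the gem.

```ruby
require "bigdecimal"

DECIMAL_MODES = %i[fast precise].freeze

# Hypothetical helper: formats a finite Float without exponent notation.
def format_float(value, decimal_mode: :fast)
  raise ArgumentError, "decimal_mode must be :fast or :precise" unless DECIMAL_MODES.include?(decimal_mode)

  text = value.to_s
  # :precise always expands through BigDecimal; :fast only when Float#to_s
  # produced scientific notation such as "1.0e-07".
  text = BigDecimal(text).to_s("F") if decimal_mode == :precise || text.match?(/e/i)

  # Shared normalization: strip trailing zeros, collapse -0 to 0.
  text = text.sub(/\.?0+\z/, "") if text.include?(".")
  text == "-0" ? "0" : text
end

format_float(9.2)                          # => "9.2"
format_float(9.2, decimal_mode: :precise)  # => "9.2"
format_float(1.0e-7)                       # => "0.0000001"
format_float(-0.0)                         # => "0"
```

Both modes go through the same trailing-zero and `-0` handling, so for ordinary values the output is identical and only the amount of `BigDecimal` work differs.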
217
+
218
+ ---
219
+
220
+ ## Performance & Benchmarks
221
+
222
+ CTON focuses on throughput: encoder table schemas are memoized, scalar list encoding keeps a reusable buffer, floats avoid `BigDecimal` when they can, and the decoder slices straight from the raw string to sidestep `StringScanner` allocations. You can reproduce the numbers below with the bundled script:
223
+
224
+ ```bash
225
+ bundle exec ruby bench/encode_decode_bench.rb
226
+ # customize input size / iterations
227
+ ITERATIONS=2000 STREAM_SIZE=400 bundle exec ruby bench/encode_decode_bench.rb
228
+ ```
229
+
230
+ Latest results on Ruby 3.1.4/macOS (M-series), 1,000 iterations, `STREAM_SIZE=200`:
231
+
232
+ | Benchmark | Time (s) |
233
+ | --- | --- |
234
+ | `cton dump` (:fast) | 0.626 |
235
+ | `cton dump` (:precise) | 0.658 |
236
+ | `json generate` | 0.027 |
237
+ | `cton load` | 2.067 |
238
+ | `json parse` | 0.045 |
239
+ | `cton inline load` (separator=`""`, double payload) | 4.140 |
240
+
241
+ `cton inline load` deliberately concatenates documents without separators to stress the new boundary detector; it now finishes without the runaway allocations seen in earlier releases.
242
+
243
+ ---
244
+
245
+ ## Teaching CTON to LLMs
246
+
247
+ Use this system prompt to teach an LLM how to understand and generate CTON:
248
+
249
+ ````markdown
250
+ You are an expert in data serialization and specifically in CTON (Compact Token-Oriented Notation). CTON is a token-efficient data format optimized for LLMs that serves as a compact alternative to JSON.
251
+
252
+ Your task is to interpret CTON input and convert it to JSON, or convert JSON input into valid CTON format, following the specification below.
253
+
254
+ ### CTON Specification
255
+
256
+ CTON minimizes syntax characters (braces, quotes) while preserving structure and type safety.
257
+
258
+ **1. Basic Structure (Key-Value)**
259
+ - **Rule:** Do not use outer curly braces `{}` for the root object.
260
+ - **Rule:** Use `=` to separate keys and values.
261
+ - **Rule:** Use `,` to separate fields.
262
+ - **Rule:** Do not use quotes around "safe" strings (alphanumeric, simple text).
263
+ - **Example:** - JSON: `{"task": "planning", "urgent": true}`
264
+ - CTON: `task=planning,urgent=true`
265
+
266
+ **2. Nested Objects**
267
+ - **Rule:** Use parentheses `()` to denote a nested object instead of `{}`.
268
+ - **Example:**
269
+ - JSON: `{"context": {"user": "Davide", "theme": "dark"}}`
270
+ - CTON: `context(user=Davide,theme=dark)`
271
+
272
+ **3. Arrays of Objects (Table Compression)**
273
+ - **Rule:** Use the syntax `key[count]{columns}=values` for arrays of objects to avoid repeating keys.
274
+ - **Structure:** `key[Length]{col1,col2}=val1,val2;val1,val2`
275
+ - **Details:** - `[N]` denotes the number of items in the array.
276
+ - `{col1,col2}` defines the schema headers.
277
+ - `;` separates distinct objects (rows).
278
+ - `,` separates values within an object.
279
+ - **Example:**
280
+
281
+ JSON:
282
+ ```json
283
+ {
284
+ "files": [
285
+ { "name": "README.md", "size": 1024 },
286
+ { "name": "lib.rb", "size": 2048 }
287
+ ]
288
+ }
289
+ ```
290
+
291
+ CTON: `files[2]{name,size}=README.md,1024;lib.rb,2048`
292
+
293
+ **4. Type Safety & Literals**
294
+ - **Booleans/Null:** `true`, `false`, and `null` are preserved as literals (unquoted).
295
+ - **Numbers:** Integers and floats are written as is (e.g., `1024`, `3.14`).
296
+ - **Escaping:** If a string value looks like a boolean, number, or contains reserved characters (like `,`, `;`, `=`, `(`, `)`), it must be wrapped in double quotes (e.g., `"true"`).
297
+
298
+ ### Examples for Training
299
+
300
+ **Input (JSON):**
301
+ ```json
302
+ {
303
+ "id": 123,
304
+ "active": true,
305
+ "metadata": {
306
+ "created_at": "2023-01-01",
307
+ "tags": "admin"
308
+ }
309
+ }
310
+ ```
311
+ ````
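Reviewer note: rules 1 to 3 of the prompt above can be checked against a toy encoder. This is a minimal sketch, assuming every string is "safe" (no quoting or escaping) and every array is a non-empty list of hashes with identical keys; `encode_fields` and `encode_pair` are hypothetical names, and the real encoder in `lib/cton/encoder.rb` handles many more cases.

```ruby
# Encodes a hash as comma-separated fields without outer braces (rule 1).
def encode_fields(hash)
  hash.map { |key, value| encode_pair(key, value) }.join(",")
end

def encode_pair(key, value)
  case value
  when Hash
    # Rule 2: nested objects use parentheses.
    "#{key}(#{encode_fields(value)})"
  when Array
    # Rule 3: uniform arrays of hashes become key[count]{columns}=rows.
    header = value.first.keys
    rows = value.map { |row| row.values_at(*header).join(",") }.join(";")
    "#{key}[#{value.length}]{#{header.join(',')}}=#{rows}"
  else
    "#{key}=#{value.nil? ? 'null' : value}"
  end
end

encode_fields({ "task" => "planning", "urgent" => true })
# => "task=planning,urgent=true"
encode_fields({ "context" => { "user" => "Davide", "theme" => "dark" } })
# => "context(user=Davide,theme=dark)"
encode_fields({ "files" => [{ "name" => "README.md", "size" => 1024 }, { "name" => "lib.rb", "size" => 2048 }] })
# => "files[2]{name,size}=README.md,1024;lib.rb,2048"
```

The three outputs match the CTON examples given in the prompt, which makes the sketch usable as a quick sanity check when editing the prompt text.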
312
+
207
313
  ---
208
314
 
209
315
  ## Type Safety
@@ -216,6 +322,7 @@ CTON ships with RBS signatures (`sig/cton.rbs`) to support type checking and IDE
216
322
  bin/setup # install dependencies
217
323
  bundle exec rake # run tests and rubocop
218
324
  bin/console # interactive playground
325
+ bundle exec ruby bench/encode_decode_bench.rb # performance smoke test
219
326
  ```
220
327
 
221
328
  To release a new version, bump `Cton::VERSION` and run `bundle exec rake release`.
data/bench/encode_decode_bench.rb ADDED
@@ -0,0 +1,65 @@
1
+ #!/usr/bin/env ruby
2
+ # frozen_string_literal: true
3
+
4
+ require "benchmark"
5
+ require "json"
6
+ require_relative "../lib/cton"
7
+
8
+ ITERATIONS = Integer(ENV.fetch("ITERATIONS", 1_000))
9
+ STREAM_SIZE = Integer(ENV.fetch("STREAM_SIZE", 200))
10
+
11
+ sample_payload = {
12
+   "context" => {
13
+     "task" => "Our favorite hikes together",
14
+     "location" => "Boulder",
15
+     "season" => "spring_2025"
16
+   },
17
+   "friends" => %w[ana luis sam],
18
+   "hikes" => Array.new(STREAM_SIZE) do |idx|
19
+     {
20
+       "id" => idx + 1,
21
+       "name" => "Trail ##{idx + 1}",
22
+       "distanceKm" => (6.0 + ((idx % 5) * 0.5)),
23
+       "elevationGain" => 250 + ((idx % 3) * 50),
24
+       "companion" => %w[ana luis sam][idx % 3],
25
+       "wasSunny" => idx.even?
26
+     }
27
+   end
28
+ }
29
+
30
+ warm_cton = Cton.dump(sample_payload)
31
+ warm_json = JSON.generate(sample_payload)
32
+
33
+ puts "\nEncoding benchmarks (iterations=#{ITERATIONS}, stream_size=#{STREAM_SIZE})"
34
+ Benchmark.bm(25) do |bm|
35
+   bm.report("cton dump fast") do
36
+     ITERATIONS.times { Cton.dump(sample_payload) }
37
+   end
38
+
39
+   bm.report("cton dump precise") do
40
+     ITERATIONS.times { Cton.dump(sample_payload, decimal_mode: :precise) }
41
+   end
42
+
43
+   bm.report("json generate") do
44
+     ITERATIONS.times { JSON.generate(sample_payload) }
45
+   end
46
+ end
47
+
48
+ puts "\nDecoding benchmarks"
49
+ Benchmark.bm(25) do |bm|
50
+   bm.report("cton load") do
51
+     ITERATIONS.times { Cton.load(warm_cton) }
52
+   end
53
+
54
+   bm.report("json parse") do
55
+     ITERATIONS.times { JSON.parse(warm_json) }
56
+   end
57
+ end
58
+
59
+ puts "\nStreaming decode stress (#{STREAM_SIZE * 2} documents, separator=\"\")"
60
+ inline_blob = warm_cton.delete("\n") * 2
61
+ Benchmark.bm(25) do |bm|
62
+   bm.report("cton inline load") do
63
+     ITERATIONS.times { Cton.load(inline_blob) }
64
+   end
65
+ end
data/lib/cton/decoder.rb CHANGED
@@ -5,13 +5,15 @@ require "strscan"
5
5
  module Cton
6
6
  class Decoder
7
7
  TERMINATORS = [",", ";", ")", "]", "}"].freeze
8
+ KEY_VALUE_BOUNDARY_TOKENS = ["(", "[", "="].freeze
8
9
 
9
10
  def initialize(symbolize_names: false)
10
11
  @symbolize_names = symbolize_names
11
12
  end
12
13
 
13
14
  def decode(cton)
14
- @scanner = StringScanner.new(cton.to_s)
15
+ @raw_string = cton.to_s
16
+ @scanner = StringScanner.new(@raw_string)
15
17
  skip_ws
16
18
 
17
19
  value = if key_ahead?
@@ -28,7 +30,7 @@ module Cton
28
30
 
29
31
  private
30
32
 
31
- attr_reader :symbolize_names, :scanner
33
+ attr_reader :symbolize_names, :scanner, :raw_string
32
34
 
33
35
  def raise_error(message)
34
36
  line, col = calculate_location(@scanner.pos)
@@ -36,7 +38,7 @@ module Cton
36
38
  end
37
39
 
38
40
  def calculate_location(pos)
39
- string = @scanner.string
41
+ string = raw_string
40
42
  consumed = string[0...pos]
41
43
  line = consumed.count("\n") + 1
42
44
  last_newline = consumed.rindex("\n")
@@ -168,56 +170,74 @@ module Cton
168
170
  end
169
171
 
170
172
  def scan_until_terminator
171
- @scanner.scan(/[^,;\]\}\)\(\[\{\s]+/)
173
+ start_pos = @scanner.pos
174
+ end_pos = find_terminator_position(start_pos)
175
+ consume_slice(start_pos, end_pos)
172
176
  end
173
177
 
174
178
  def scan_until_boundary_or_terminator
175
179
  start_pos = @scanner.pos
180
+ boundary_pos = find_key_boundary(start_pos)
181
+ end_pos = boundary_pos || find_terminator_position(start_pos)
182
+ consume_slice(start_pos, end_pos)
183
+ end
176
184
 
177
- chunk = @scanner.scan(/[0-9A-Za-z_.:-]+/)
178
- return nil unless chunk
185
+ def consume_slice(start_pos, end_pos)
186
+ return nil if end_pos <= start_pos
179
187
 
180
- boundary_idx = find_key_boundary(start_pos)
188
+ token = raw_string.byteslice(start_pos, end_pos - start_pos)
189
+ @scanner.pos = end_pos
190
+ token
191
+ end
181
192
 
182
- if boundary_idx
183
- length = boundary_idx - start_pos
184
- @scanner.pos = start_pos
185
- token = @scanner.peek(length)
186
- @scanner.pos += length
187
- token
188
- else
189
- @scanner.pos = start_pos + chunk.length
190
- chunk
193
+ def find_terminator_position(start_pos)
194
+ str = raw_string
195
+ len = str.length
196
+ idx = start_pos
197
+
198
+ while idx < len
199
+ char = str[idx]
200
+ break if terminator?(char)
201
+
202
+ idx += 1
191
203
  end
204
+
205
+ idx
192
206
  end
193
207
 
194
208
  def find_key_boundary(from_index)
195
- str = @scanner.string
209
+ str = raw_string
196
210
  len = str.length
197
211
  idx = from_index
198
212
 
199
213
  while idx < len
200
214
  char = str[idx]
201
215
 
202
- return nil if TERMINATORS.include?(char) || whitespace?(char) || "([{".include?(char)
216
+ return nil if terminator?(char)
203
217
 
204
218
  if safe_key_char?(char)
205
219
  key_end = idx
206
220
  key_end += 1 while key_end < len && safe_key_char?(str[key_end])
207
221
 
208
- next_char_idx = key_end
209
-
210
- if next_char_idx < len
211
- next_char = str[next_char_idx]
212
- return idx if ["(", "[", "="].include?(next_char) && (idx > from_index)
222
+ if key_end < len && KEY_VALUE_BOUNDARY_TOKENS.include?(str[key_end]) && idx > from_index && boundary_start_allowed?(str[idx])
223
+ return idx
213
224
  end
214
225
  end
215
226
 
216
227
  idx += 1
217
228
  end
229
+
218
230
  nil
219
231
  end
220
232
 
233
+ def terminator?(char)
234
+ TERMINATORS.include?(char) || whitespace?(char) || ["(", "[", "{"].include?(char)
235
+ end
236
+
237
+ def boundary_start_allowed?(char)
238
+ !char.nil? && char.match?(/[A-Za-z_.:-]/)
239
+ end
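Reviewer note: the slicing approach behind `find_terminator_position` and `consume_slice` can be sketched in isolation. This is a hypothetical stand-alone helper, not the gem's code; `scan_token` and `STOP_BYTES` are names introduced here, and the whitespace set is an assumption. The sketch works in byte offsets throughout because `StringScanner#pos` is a byte offset, whereas `str[idx]` and `str.length` in the hunk above are character based. The two agree for ASCII-only input, so a regression test with multibyte strings may be worth adding.

```ruby
require "strscan"

# Bytes that end a bare token: the TERMINATORS list, opening brackets,
# and ASCII whitespace (the whitespace set is assumed).
STOP_BYTES = ",;)]}([{ \t\r\n".bytes.freeze

# Slices the next bare token straight out of the underlying string and
# advances the scanner, without running a regex.
def scan_token(scanner)
  str = scanner.string
  start_pos = scanner.pos
  end_pos = start_pos
  end_pos += 1 while end_pos < str.bytesize && !STOP_BYTES.include?(str.getbyte(end_pos))
  return nil if end_pos == start_pos

  scanner.pos = end_pos
  str.byteslice(start_pos, end_pos - start_pos)
end

scanner = StringScanner.new("h\u00e9llo,42)")
first = scan_token(scanner)   # "héllo" (6 bytes, 5 characters)
scanner.getch                 # skip the comma
second = scan_token(scanner)  # "42"
third = scan_token(scanner)   # nil, the scanner sits on ")"
```

UTF-8 continuation bytes are never ASCII, so scanning byte by byte for ASCII terminators cannot cut a multibyte character in half.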
240
+
221
241
  def convert_scalar(token)
222
242
  case token
223
243
  when "true" then true
data/lib/cton/encoder.rb CHANGED
@@ -11,10 +11,14 @@ module Cton
11
11
  RESERVED_LITERALS = %w[true false null].freeze
12
12
  FLOAT_DECIMAL_PRECISION = Float::DIG
13
13
 
14
- def initialize(separator: "\n", pretty: false)
14
+ def initialize(separator: "\n", pretty: false, decimal_mode: :fast)
15
15
  @separator = separator || ""
16
16
  @pretty = pretty
17
+ @decimal_mode = decimal_mode
18
+ raise ArgumentError, "decimal_mode must be :fast or :precise" unless %i[fast precise].include?(@decimal_mode)
19
+
17
20
  @indent_level = 0
21
+ @table_schema_cache = {}
18
22
  end
19
23
 
20
24
  def encode(payload, io: nil)
@@ -25,7 +29,7 @@ module Cton
25
29
 
26
30
  private
27
31
 
28
- attr_reader :separator, :io, :pretty, :indent_level
32
+ attr_reader :separator, :io, :pretty, :indent_level, :decimal_mode
29
33
 
30
34
  def encode_root(value)
31
35
  case value
@@ -96,8 +100,8 @@ module Cton
96
100
 
97
101
  io << "[" << length.to_s << "]"
98
102
 
99
- if table_candidate?(list)
100
- encode_table(list)
103
+ if (header = table_schema_for(list))
104
+ encode_table(list, header)
101
105
  else
102
106
  io << "="
103
107
  if list.all? { |value| scalar?(value) }
@@ -108,8 +112,7 @@ module Cton
108
112
  end
109
113
  end
110
114
 
111
- def encode_table(rows)
112
- header = rows.first.keys
115
+ def encode_table(rows, header)
113
116
  io << "{"
114
117
  io << header.map { |key| format_key(key) }.join(",")
115
118
  io << "}="
@@ -150,10 +153,14 @@ module Cton
150
153
  outdent
151
154
  else
152
155
  first = true
153
- list.each do |value|
154
- io << "," unless first
155
- encode_scalar(value)
156
- first = false
156
+ if fast_scalar_stream?(list)
157
+ io << fast_scalar_stream(list)
158
+ else
159
+ list.each do |value|
160
+ io << "," unless first
161
+ encode_scalar(value)
162
+ first = false
163
+ end
157
164
  end
158
165
  end
159
166
  end
@@ -174,30 +181,34 @@ module Cton
174
181
  end
175
182
 
176
183
  def encode_scalar(value)
184
+ io << scalar_to_string(value)
185
+ end
186
+
187
+ def scalar_to_string(value)
177
188
  case value
178
189
  when String
179
- encode_string(value)
190
+ format_string(value)
180
191
  when TrueClass, FalseClass
181
- io << (value ? "true" : "false")
192
+ value ? "true" : "false"
182
193
  when NilClass
183
- io << "null"
194
+ "null"
184
195
  when Numeric
185
- io << format_number(value)
196
+ format_number(value)
186
197
  when Time, Date
187
- encode_string(value.iso8601)
198
+ format_string(value.iso8601)
188
199
  else
189
200
  raise EncodeError, "Unsupported value: #{value.class}"
190
201
  end
191
202
  end
192
203
 
193
- def encode_string(value)
194
- io << if value.empty?
195
- '""'
196
- elsif string_needs_quotes?(value)
197
- quote_string(value)
198
- else
199
- value
200
- end
204
+ def format_string(value)
205
+ if value.empty?
206
+ '""'
207
+ elsif string_needs_quotes?(value)
208
+ quote_string(value)
209
+ else
210
+ value
211
+ end
201
212
  end
202
213
 
203
214
  def format_number(value)
@@ -234,6 +245,17 @@ module Cton
234
245
  end
235
246
 
236
247
  def float_decimal_string(value)
248
+ return precise_float_decimal_string(value) if decimal_mode == :precise
249
+
250
+ decimal = value.to_s
251
+ if decimal.include?("e") || decimal.include?("E")
252
+ precise_float_decimal_string(value)
253
+ else
254
+ decimal
255
+ end
256
+ end
257
+
258
+ def precise_float_decimal_string(value)
237
259
  if defined?(BigDecimal)
238
260
  BigDecimal(value.to_s).to_s("F")
239
261
  else
@@ -278,16 +300,64 @@ module Cton
278
300
  value.is_a?(String) || value.is_a?(Numeric) || value == true || value == false || value.nil? || value.is_a?(Time) || value.is_a?(Date)
279
301
  end
280
302
 
281
- def table_candidate?(rows)
282
- return false if rows.empty?
303
+ def table_schema_for(rows)
304
+ cache_lookup = @table_schema_cache.fetch(rows.object_id, :__missing__)
305
+ return cache_lookup unless cache_lookup == :__missing__
306
+
307
+ schema = compute_table_schema(rows)
308
+ @table_schema_cache[rows.object_id] = schema
309
+ end
310
+
311
+ def compute_table_schema(rows)
312
+ return nil if rows.empty?
283
313
 
284
314
  first = rows.first
285
- return false unless first.is_a?(Hash) && !first.empty?
315
+ return nil unless first.is_a?(Hash) && !first.empty?
316
+
317
+ header = first.keys.freeze
318
+
319
+ rows.each do |row|
320
+ return nil unless row.is_a?(Hash)
321
+ return nil unless row.keys == header
322
+ return nil unless row.values.all? { |val| scalar?(val) }
323
+ end
324
+
325
+ header
326
+ end
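Reviewer note: the memoization pattern in `table_schema_for` can be illustrated with a small stand-alone class. This is a sketch under assumptions, not the gem's implementation: `SchemaMemo` and its `computations` counter are hypothetical, the scalar check omits `Time` and `Date`, and the cache is an identity-compared hash keyed by the array itself instead of `rows.object_id`, which is a substitution made for illustration.

```ruby
SCALAR_CLASSES = [String, Numeric, TrueClass, FalseClass, NilClass].freeze

def scalar?(value)
  SCALAR_CLASSES.any? { |klass| value.is_a?(klass) }
end

# Returns the shared key list when every row is a hash with the same keys
# and only scalar values; returns nil otherwise.
def compute_table_schema(rows)
  first = rows.first
  return nil unless first.is_a?(Hash) && !first.empty?

  header = first.keys
  uniform = rows.all? do |row|
    row.is_a?(Hash) && row.keys == header && row.values.all? { |val| scalar?(val) }
  end
  uniform ? header.freeze : nil
end

class SchemaMemo
  attr_reader :computations

  def initialize
    @cache = {}.compare_by_identity # keyed by the array object, not its contents
    @computations = 0
  end

  def schema_for(rows)
    # key? distinguishes "cached nil" from "never computed".
    return @cache[rows] if @cache.key?(rows)

    @computations += 1
    @cache[rows] = compute_table_schema(rows)
  end
end

memo = SchemaMemo.new
rows = [{ "name" => "README.md", "size" => 1024 }, { "name" => "lib.rb", "size" => 2048 }]
memo.schema_for(rows) # => ["name", "size"], computed once
memo.schema_for(rows) # served from the cache
```

With either keying strategy, a cached schema goes stale if the array is mutated after the first lookup, so the cache is best scoped to a single encode call.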
327
+
328
+ def fast_scalar_stream?(list)
329
+ !pretty && list.length > 4 && homogeneous_scalar_tokens?(list)
330
+ end
331
+
332
+ def homogeneous_scalar_tokens?(list)
333
+ first_class = nil
334
+ list.all? do |value|
335
+ return false unless scalar?(value)
336
+
337
+ token_class = value.class
338
+ first_class ||= token_class
339
+ token_class == first_class && token_does_not_require_quotes?(value)
340
+ end
341
+ end
342
+
343
+ def token_does_not_require_quotes?(value)
344
+ case value
345
+ when String
346
+ !value.empty? && !string_needs_quotes?(value)
347
+ when Integer, TrueClass, FalseClass, NilClass
348
+ true
349
+ else
350
+ false
351
+ end
352
+ end
286
353
 
287
- keys = first.keys
288
- rows.all? do |row|
289
- row.is_a?(Hash) && row.keys == keys && row.values.all? { |val| scalar?(val) }
354
+ def fast_scalar_stream(list)
355
+ buffer = String.new
356
+ list.each_with_index do |value, index|
357
+ buffer << "," unless index.zero?
358
+ buffer << scalar_to_string(value)
290
359
  end
360
+ buffer
291
361
  end
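Reviewer note: the fast path for homogeneous scalar lists can be exercised on its own. This is a minimal sketch, not the gem's code: `string_needs_quotes?` is not shown in this diff, so the `needs_quotes?` rule below (identifier-like, non-reserved strings stay bare) is an assumption, and the `pretty` check is left out.

```ruby
RESERVED = %w[true false null].freeze

# Assumed quoting rule: bare strings must be identifier-like and not reserved.
def needs_quotes?(string)
  string.empty? || RESERVED.include?(string) || !string.match?(/\A[A-Za-z_][A-Za-z0-9_]*\z/)
end

def quote_free?(value)
  case value
  when String then !needs_quotes?(value)
  when Integer, TrueClass, FalseClass, NilClass then true
  else false
  end
end

# The fast path applies to lists longer than four items whose elements all
# share one class and never need quoting.
def fast_scalar_stream?(list)
  return false unless list.length > 4

  first_class = list.first.class
  list.all? { |value| value.class == first_class && quote_free?(value) }
end

# Joins tokens into a single buffer so the caller writes to the IO once.
def fast_scalar_stream(list)
  buffer = String.new
  list.each_with_index do |value, index|
    buffer << "," unless index.zero?
    buffer << (value.nil? ? "null" : value.to_s)
  end
  buffer
end

fast_scalar_stream?([1, 2, 3, 4, 5])                  # => true
fast_scalar_stream([1, 2, 3, 4, 5])                   # => "1,2,3,4,5"
fast_scalar_stream?([true, false, true, false, true]) # => false
```

As in the hunk above, homogeneity is decided by class equality, so a mixed `true`/`false` list falls back to the per-element path (`TrueClass` and `FalseClass` differ), and floats never qualify.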
292
362
 
293
363
  def indent
data/lib/cton/version.rb CHANGED
@@ -1,5 +1,5 @@
1
1
  # frozen_string_literal: true
2
2
 
3
3
  module Cton
4
- VERSION = "0.2.0"
4
+ VERSION = "0.3.0"
5
5
  end
data/lib/cton.rb CHANGED
@@ -28,7 +28,8 @@ module Cton
28
28
 
29
29
  separator = options.fetch(:separator, "\n")
30
30
  pretty = options.fetch(:pretty, false)
31
- Encoder.new(separator: separator, pretty: pretty).encode(payload, io: io)
31
+ decimal_mode = options.fetch(:decimal_mode, :fast)
32
+ Encoder.new(separator: separator, pretty: pretty, decimal_mode: decimal_mode).encode(payload, io: io)
32
33
  end
33
34
  alias generate dump
34
35
 
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: cton
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.2.0
4
+ version: 0.3.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Davide Santangelo
8
8
  autorequire:
9
9
  bindir: exe
10
10
  cert_chain: []
11
- date: 2025-11-19 00:00:00.000000000 Z
11
+ date: 2025-11-20 00:00:00.000000000 Z
12
12
  dependencies: []
13
13
  description: CTON provides a JSON-compatible, token-efficient text representation
14
14
  optimized for LLM prompts.
@@ -25,6 +25,7 @@ files:
25
25
  - LICENSE.txt
26
26
  - README.md
27
27
  - Rakefile
28
+ - bench/encode_decode_bench.rb
28
29
  - lib/cton.rb
29
30
  - lib/cton/decoder.rb
30
31
  - lib/cton/encoder.rb