smarter_json 0.5.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Potentially problematic release.



checksums.yaml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA256:
3
+ metadata.gz: 44970fff9a74d5d18ef8ce1f909afb35ed8db60e92e7fd6ddee0d399ed98f826
4
+ data.tar.gz: 6b02c6491a049ce306bc07de1b5bb06cf71ec14a579f2f6a659ffc634b1b2555
5
+ SHA512:
6
+ metadata.gz: 3040050ab8538786344d7cdafb7959b0b28e2c869b2c5c5609228226de4cc52f2a70a8460c97790f537c19eb44a6defb7774ca8945aacd26ae2040684f3587c6
7
+ data.tar.gz: 389be8c3a940d29cfef4bf14df69c4f917fcb6fc1ff37d4c3fc4c3ec716c3985363a408d17ebc303370a110f057e7fdd200f726a46d05966b83b9e5eb6ac446d
data/.gitignore ADDED
@@ -0,0 +1,46 @@
1
+ # General ignores for a Ruby gem
2
+ *.gem
3
+
4
+ # Ignore Bundler lockfile for gems (leave out for gems, include for apps)
5
+ Gemfile.lock
6
+
7
+ # Ignore RSpec status persistence
8
+ .rspec_status
9
+
10
+ # Ignore build output
11
+ /pkg/
12
+ /tmp/
13
+ /log/
14
+ /coverage/
15
+
16
+ # Ignore RuboCop, Yard, docs, bundler
17
+ /.bundle/
18
+ .yardoc
19
+ /.yardoc/
20
+ /.byebug_history
21
+ /.pry_history
22
+ /.irb-history
23
+
24
+ # Ignore Mac, editor, IDE artifacts
25
+ .DS_Store
26
+ *~
27
+ *.swp
28
+ *.swo
29
+ *.tmp
30
+ .vscode/
31
+ .idea/
32
+
33
+ # Ignore node_modules just in case
34
+ node_modules/
35
+
36
+ # Ignore test coverage output
37
+ coverage/
38
+
39
+ # Ignore binary object files, extensions
40
+ *.so
41
+ *.o
42
+ *.bundle
43
+ *.rbc
44
+
45
+ .claude/
46
+ CLAUDE.md
data/CHANGELOG.md ADDED
@@ -0,0 +1,75 @@
1
+
2
+ # SmarterJSON Change Log
3
+
4
+ ## 0.5.2 (2026-06-01)
5
+ - `generate` now supports pretty-printing via the `indent:` option (spaces per nesting level; default `0` = compact). Empty objects/arrays stay inline; `indent:` combined with `format: :ndjson` raises `ArgumentError`.
6
+ - `generate` adds `sort_keys:` (emit object keys in sorted order), `ascii_only:` (escape non-ASCII as `\uXXXX`, astral chars as surrogate pairs), and `script_safe:` (escape `</` and U+2028/U+2029 for safe embedding in an HTML `<script>` tag).
7
+ - `generate` adds opt-in `coerce:` — when `true`, a value that isn't natively supported (e.g. `Time`, `Date`, app objects) is converted via its own `as_json` (result re-emitted) or `to_json` (spliced); strict-by-default still raises `GenerateError`.
8
+
9
+ ## 0.5.1 (2026-06-01)
10
+ - Unified the error classes under a single `SmarterJSON::Error` base: `ParseError` and `EncodingError` now inherit from it, and `generate` raises a new `GenerateError`. `rescue SmarterJSON::Error` now catches everything the gem raises.
11
+ - Added a CI test matrix (Ruby 2.6–4.0 + head, on Ubuntu and macOS).
12
+ - Fixed the C extension build on Ruby 2.6 (declare `rb_hash_bulk_insert`, which 2.6 exports but does not declare in its headers); set the minimum Ruby to 2.6.
13
+
14
+ ## 0.5.0 (2026-05-31 unreleased)
15
+ - add JSON generation, incl. NDJSON generation
16
+ - add test coverage
17
+
18
+ ## 0.4.0 (2026-05-31 unreleased)
19
+ - rename `flex_json` -> `smarter_json`
20
+
21
+ ## 0.3.10 (2026-05-31 unreleased)
22
+ - change interface to use `.process` and `.process_file`
23
+
24
+
25
+ ## 0.3.9 (2026-05-31 unreleased)
26
+ - `parse` (no block) now handles any input automatically: 0 documents (empty / whitespace / comment-only) → `nil`, 1 document → the value itself, 2+ documents (NDJSON / JSONL / concatenated / whitespace-separated) → an Array of the values. It no longer raises on trailing content.
27
+ - Detection is free (the same trailing-content check that used to raise) and the single-document path allocates no Array, so single-value parsing is unchanged in speed.
28
+ - The block form (`parse(input) { |doc| … }`) is kept as the bounded-memory streaming path. `parse_file(path) { |doc| … }` now forwards the block too, so files stream the same way (previously the block was silently ignored). Bracketless comma lists (`1, 2, 3`) still raise — commas don't separate top-level documents (implicit-root array remains unsupported).
29
+ - The block form allows individual processing of each line in NDJSON files.
30
+ - Supersedes the earlier "raise on trailing content, match Oj" behavior.
31
+
32
+ ## 0.3.8 (2026-05-30 unreleased)
33
+ - Reordered single-character checks so the more common byte is tested first (`-` before `+`).
34
+ - Quoteless-token boundary scan now uses a 256-byte class table: ordinary bytes are classified in one table lookup, and the lookahead byte is read only at a `#`/`/` instead of on every byte. Speeds up quoteless / config-style input (the lenient case the JSON benchmarks don't exercise).
35
+
36
+ ## 0.3.7 (2026-05-30 unreleased)
37
+ - Escaped-string literal runs are bulk-copied with the NEON scanner instead of one byte at a time.
38
+ - Added branch hints (`__builtin_expect`) and prefetch to the hot string-scan loop. Sped up string-heavy files (string_array, github_events, twitter all 12–16% faster).
39
+
40
+ ## 0.3.6 (2026-05-30 unreleased)
41
+ - Fast path for plain numbers inside objects/arrays (`fj_try_member_number`): one scan straight from the cursor, committing when the number meets a delimiter and falling back to the quoteless scanner otherwise. Skips the quoteless boundary scan + classify dispatch for the common case. Broad gains on number-in-container files (weather, canada, usgs, big_decimals).
42
+
43
+ ## 0.3.5 (2026-05-30 unreleased)
44
+ - Rewrote `fj_parse_number` (top-level numbers) as a single pass: finds the token end and accumulates the mantissa/exponent at once, using the string's NUL terminator as a scan sentinel (no per-byte bounds check) and a digit loop that skips the underscore check until an underscore actually appears.
45
+ - Added `fj_try_decimal` for the quoteless path: validates and extracts the number in one scan, replacing the old three scans (validate + significant-digit count + mantissa extraction); skips the significant-digit scan when the number has ≤16 digits.
46
+ - Both number paths now build values through the shared `fj_int_from_parts` / `fj_float_from_parts` helpers so they can't drift; removed the now-dead `fj_validate_decimal` / `fj_int_value` / `fj_decimal_value`.
47
+
48
+ ## 0.3.4 (2026-05-30 unreleased)
49
+ - Dropped a per-member Ruby method call (`key?`) that fired for every object member under the default duplicate-key mode — pure waste on object-heavy files (twitter, github_events, citm).
50
+ - Build objects and arrays from a C value stack with a pre-sized hash + bulk insert (and size-based duplicate detection), instead of inserting one member/element at a time.
51
+ - Added a per-parse key cache so repeated object keys are interned once instead of every occurrence.
52
+
53
+ ## 0.3.3 (2026-05-30 unreleased)
54
+ - Vendored Ryū (Ulf Adams, Apache-2.0) for correctly-rounded string→double conversion: the mantissa is accumulated in one pass and converted with no `strtod`. Large win on float-heavy files (canada, big_decimals).
55
+
56
+ ## 0.3.3 (2026-05-29 unreleased)
57
+ - performance fixes
58
+
59
+ ## 0.3.2 (2026-05-29 unreleased)
60
+ - performance fixes
61
+
62
+ ## 0.3.1 (2026-05-29 unreleased)
63
+ - performance fixes
64
+
65
+ ## 0.3.0 (2026-05-29 unreleased)
66
+ - iterative parser
67
+
68
+ ## 0.2.0 (2026-05-29 unreleased)
69
+ - recursive parser
70
+
71
+ ## 0.1.1 (2026-05-29 unreleased)
72
+ - MVP complete
73
+
74
+ ## 0.1.0 (2026-05-28 unreleased)
75
+ - Initial Ruby version
data/LICENSE.txt ADDED
@@ -0,0 +1,21 @@
1
+ The MIT License (MIT)
2
+
3
+ Copyright (c) 2026 Tilo Sloboda
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in
13
+ all copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
21
+ THE SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,110 @@
1
+ # SmarterJSON
2
+
3
+ ![Gem Version](https://img.shields.io/gem/v/smarter_json) [![codecov](https://codecov.io/gh/tilo/smarter_json/branch/main/graph/badge.svg)](https://codecov.io/gh/tilo/smarter_json) <!-- [![Downloads](https://img.shields.io/gem/dt/smarter_json)](https://rubygems.org/gems/smarter_json) --> [![RubyGems](https://img.shields.io/badge/RubyGems-smarter__json-brightgreen?logo=rubygems&logoColor=white)](https://rubygems.org/gems/smarter_json) [![Ruby Toolbox](https://img.shields.io/badge/Ruby%20Toolbox-smarter__json-brightgreen)](https://www.ruby-toolbox.com/projects/smarter_json)
4
+
5
+ A lenient, fast JSON parser for Ruby. It parses strict JSON, JSON5, HJSON-style config, and the messy JSON-ish input humans actually write, and in benchmarks it matches or beats Oj on nearly every file. SmarterJSON is opinionated: we want your JSON processing to succeed. Other parsers are strict and stop at the first deviation; SmarterJSON keeps going, because it optimizes for getting your data out, not for policing the JSON spec.
6
+
7
+ ## Why SmarterJSON?
8
+
9
+ Most JSON parsers reject anything that isn't perfectly strict JSON. SmarterJSON is built on the opposite principle: **you shouldn't have to care what flavor of JSON you were handed.** Give it strict JSON, JSON5, an HJSON-style config file, newline-delimited JSON, or a copy-pasted blob with comments and trailing commas — it just parses it.
10
+
11
+ Three things set it apart:
12
+
13
+ 1. **One parser, no modes, no flags.** There is no `dialect:` option and no "strict mode" — `SmarterJSON.process(input)` accepts the whole superset, and strict JSON is simply the narrowest case. You don't configure the parser to match your input; it adapts to whatever you give it.
14
+
15
+ 2. **It parses multi-document input automatically — a distinguishing feature.** `SmarterJSON.process` handles NDJSON / JSONL / concatenated JSON with **no block and no special method**: one document returns its value, several documents return an `Array`, empty input returns `nil`. **Only SmarterJSON parses multi-document input via plain `process` — Oj and the stdlib `json` library raise without a block.** For input larger than memory, pass a block to stream one document at a time.
16
+
17
+ 3. **It's fast.** A C extension (with a pure-Ruby fallback that runs everywhere) puts it ahead of Oj on nearly every file we benchmark, and competitive with the stdlib `json` C parser, which is the fastest general-purpose Ruby JSON parser.
18
+
19
+ ## What it accepts, beyond strict JSON
20
+
21
+ - `//`, `/* … */`, and `#` comments (a `#`/`//` only starts a comment when preceded by whitespace, so `url: http://x.com` parses as a string, not a truncated value)
22
+ - Trailing commas; unquoted keys (`{host: localhost}`); single-quoted, triple-quoted (`'''…'''`), and quoteless string values
23
+ - Implicit root object — a config file that starts with `key: value`, no outer `{}`
24
+ - `NaN`, `Infinity`, hex (`0xFF`), leading `+` / `.`, underscores in numbers (`1_000_000`)
25
+ - UTF-8 BOM, smart/curly quotes, Python literals (`True` / `False` / `None`), JavaScript `undefined`
26
+ - Mixed CR / LF / CRLF line endings, and any Ruby-supported input encoding (via `encoding:`)
27
+ - Duplicate keys (last value wins by default; configurable)
28
+
29
+ It raises only on genuinely unparseable input (unterminated string, mismatched bracket), with line and column in the message — never on valid-but-lenient input.
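The comment rule in the list above (a `#` or `//` starts a comment only when preceded by whitespace) can be illustrated with a small sketch. `strip_line_comment` and `COMMENT_START` are hypothetical names, not part of smarter_json, and the sketch makes two assumptions: a marker at the very start of a line also counts, and quoted strings are ignored.

```ruby
# Hypothetical helper, not the gem's scanner: cut a trailing line comment.
# Assumption: a marker at the start of the line counts as "preceded by whitespace".
COMMENT_START = %r{(?:\A|\s)(?:#|//)}.freeze

def strip_line_comment(line)
  match = COMMENT_START.match(line)
  return line unless match

  # Drop the comment and the whitespace that introduced it.
  line[0...match.begin(0)].rstrip
end

puts strip_line_comment("url: http://x.com")       # the "//" follows ":", so nothing is cut
puts strip_line_comment("port: 5432 # primary db") # the "#" follows a space, so it is a comment
```

Under this rule a value such as `http://x.com` survives intact because its `//` is attached to the preceding `:`.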
30
+
31
+ ## Installation
32
+
33
+ ```ruby
34
+ # Gemfile
35
+ gem "smarter_json"
36
+ ```
37
+
38
+ ```bash
39
+ gem install smarter_json
40
+ ```
41
+
42
+ The C extension is built on install and used automatically. On platforms where it can't build, the pure-Ruby parser runs instead and produces identical results.
43
+
44
+ ## Documentation
45
+
46
+ * [Introduction](docs/_introduction.md)
47
+ * [The Basic Read API](docs/basic_read_api.md)
48
+ * [The Basic Write API](docs/basic_write_api.md)
49
+ * [Configuration Options](docs/options.md)
50
+ * [Examples](docs/examples.md)
51
+
52
+ ## Usage
53
+
54
+ ```ruby
55
+ require "smarter_json"
56
+
57
+ SmarterJSON.process('{"a": 1, "b": [2, 3]}') # => {"a"=>1, "b"=>[2, 3]}
58
+ SmarterJSON.process("host: localhost\nport: 5432") # => {"host"=>"localhost", "port"=>5432} (no braces needed)
59
+ SmarterJSON.process_file("config.json5") # read a file, then parse
60
+
61
+ # Multiple documents (NDJSON / JSONL / concatenated) — no block, no special method:
62
+ SmarterJSON.process(%({"id":1}\n{"id":2}\n{"id":3})) # => [{"id"=>1}, {"id"=>2}, {"id"=>3}]
63
+ SmarterJSON.process('{"id":1}') # => {"id"=>1} (one document → the value itself)
64
+ SmarterJSON.process("") # => nil (zero documents)
65
+
66
+ # For input larger than memory, stream one document at a time with a block
67
+ # (process and process_file both forward the block):
68
+ SmarterJSON.process_file("events.ndjson") { |event| EventJob.perform_async(event) }
69
+ ```
70
+
71
+ ### Options
72
+
73
+ | option | default | meaning |
74
+ |-------------------|--------------|-------------------------------------------------------------------------|
75
+ | `symbolize_keys` | `false` | return object keys as Symbols instead of Strings |
76
+ | `duplicate_key` | `:last_wins` | `:last_wins` / `:first_wins` / `:raise` for repeated keys in one object |
77
+ | `bigdecimal_load` | `:auto` | `:auto` keeps high-precision decimals as `BigDecimal`; `:float` forces `Float`; `:bigdecimal` forces `BigDecimal` |
78
+ | `acceleration` | `true` | `true` uses the C extension when compiled and loadable; `false` forces pure Ruby (identical results) |
79
+ | `encoding` | `"UTF-8"` | labels the input's encoding (no transcoding pass; see below) |
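As an illustration of the three `duplicate_key` policies in the table, here is a minimal sketch in plain Ruby. `insert_member` and `build_object` are hypothetical helpers, and `KeyError` is only a stand-in for whatever error the gem actually raises under `:raise`.

```ruby
# Hypothetical sketch of the duplicate-key policies; not the gem's implementation.
def insert_member(object, key, value, policy: :last_wins)
  unless %i[last_wins first_wins raise].include?(policy)
    raise ArgumentError, "unknown duplicate_key policy: #{policy.inspect}"
  end

  if object.key?(key)
    return object if policy == :first_wins # keep the earlier value
    raise KeyError, "duplicate key: #{key.inspect}" if policy == :raise
  end
  object[key] = value # :last_wins, or a key seen for the first time
  object
end

def build_object(pairs, policy: :last_wins)
  pairs.each_with_object({}) do |(key, value), object|
    insert_member(object, key, value, policy: policy)
  end
end

pairs = [["a", 1], ["a", 2]]
p build_object(pairs)                      # later value replaces the earlier one
p build_object(pairs, policy: :first_wins) # earlier value is kept
```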
80
+
81
+ ## Performance
82
+
83
+ Benchmarks: p10 of 40 runs, Apple M1 Max, Ruby 3.4.7, on the standard JSON corpus (canada, citm_catalog, twitter, github_events, …). The apples-to-apples comparisons are **SmarterJSON/C** vs **Oj/strict** vs **stdlib `json`**, all producing `Float` (run `rake report` in `json_benchmarks/` for the full table — numbers vary run to run).
84
+
85
+ - **vs Oj:** SmarterJSON/C matches or beats Oj on nearly every file — typically **1.1–1.7× faster** (e.g. deeply-nested ~1.7×, citm ~1.3×, twitter ~1.3×, usgs/weather ~1.2–1.3×).
86
+ - **vs stdlib `json` (C):** competitive with the fastest Ruby JSON parser — it matches `json` on number- and string-heavy files (e.g. big_decimals, string_array) and trails by ~1.2–1.6× on others.
87
+ - **Numbers:** floats are parsed with Ryū (correctly rounded, single-pass), so number-heavy data is fast and bit-exact.
88
+
89
+ **Two notes on fair comparison:**
90
+
91
+ - **NDJSON:** on multi-document files, **only SmarterJSON parses the input via plain `process`** — Oj and `json` raise without a block, so their cells are `N/A`. That `N/A` reflects real default behavior, not a measurement gap. Plain `process` collects every document into an Array at ~270 MB/s; the streaming block form runs faster (~440 MB/s) because it doesn't hold all documents in memory at once — use it for input larger than RAM.
92
+ - **High-precision decimals (e.g. `canada.json`):** SmarterJSON's default `:auto` mode preserves high-precision numbers as `BigDecimal` (matching Oj's default), which is intrinsically slower than `Float`. Against `Float`-producing parsers it looks slower on such files; pass `bigdecimal_load: :float` to compare like-for-like (it then runs much faster). Against the equivalent `BigDecimal`-producing Oj mode, SmarterJSON is faster.
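To see why the `BigDecimal` versus `Float` choice matters for a file like `canada.json`, consider this sketch of an `:auto`-style decision. `load_decimal` is a hypothetical helper, and the 16-significant-digit cutoff is an assumption made for the sketch, not a documented threshold of the gem. It handles plain decimals only (no exponent).

```ruby
require "bigdecimal"

# Assumed cutoff for this sketch only: a Float cannot reliably hold more digits than this.
FLOAT_SAFE_DIGITS = 16

# Hypothetical helper mirroring the three documented modes.
def load_decimal(token, mode: :auto)
  case mode
  when :float then Float(token)
  when :bigdecimal then BigDecimal(token)
  when :auto
    # Count significant digits, ignoring sign, decimal point, and leading zeros.
    digits = token.delete("^0-9").sub(/\A0+/, "").length
    digits > FLOAT_SAFE_DIGITS ? BigDecimal(token) : Float(token)
  else
    raise ArgumentError, "unknown mode: #{mode.inspect}"
  end
end

token = "65.613616999999977" # 17 significant digits
puts load_decimal(token).to_s("F")        # BigDecimal keeps every digit
puts load_decimal(token, mode: :float)    # Float prints a shorter, rounded form
```

The trade-off is the one described above: the `BigDecimal` path preserves the input exactly, while the `Float` path is cheaper but rounds.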
93
+
94
+ ## Encoding
95
+
96
+ `encoding:` (default `"UTF-8"`) labels what the input is — it does **not** trigger a transcoding pass. The parser works on the bytes in their native encoding and emits string values with the same encoding tag, the same way `smarter_csv` handles encodings. Bytes that are invalid for the claimed encoding raise `SmarterJSON::EncodingError` (a kind of `SmarterJSON::ParseError`).
97
+
98
+ ## Nesting & untrusted input
99
+
100
+ Both the C extension and the pure-Ruby parser are **iterative, not recursive** — they track nesting on an explicit, heap-allocated stack rather than the call stack. So deeply nested input **cannot overflow the call stack or segfault**: nesting is bounded only by available memory, the same posture as Oj (which also ships no nesting limit; the stdlib `json` caps at 100). The `deeply_nested.json` benchmark (212 MB of nesting) parses without issue.
101
+
102
+ The trade-off: there is currently **no fixed nesting or input-size limit**, so extremely large or adversarially-nested untrusted input is bounded by memory (it can exhaust RAM), not by a crash. If you parse untrusted input and want a hard cap, that's a planned opt-in guard — for now, size-limit upstream of the parser.
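One way to size-limit upstream of the parser is a cheap pre-check before the input is handed over. The sketch below is an assumption-laden illustration: `within_limits?` is a hypothetical helper, it only understands double-quoted strings (so lenient input with comments or single quotes would need more care), and the limits are whatever the caller chooses.

```ruby
# Hypothetical upstream guard: reject input that is too large or too deeply nested.
# It tracks depth with a counter, so the check itself cannot overflow the call stack.
def within_limits?(text, max_bytes:, max_depth:)
  return false if text.bytesize > max_bytes

  depth = 0
  in_string = false
  escaped = false
  text.each_char do |char|
    if in_string
      if escaped
        escaped = false
      elsif char == "\\"
        escaped = true
      elsif char == '"'
        in_string = false
      end
    elsif char == '"'
      in_string = true
    elsif ["{", "["].include?(char)
      depth += 1
      return false if depth > max_depth
    elsif ["}", "]"].include?(char)
      depth -= 1
    end
  end
  true
end

puts within_limits?("[[[[1]]]]", max_bytes: 1_000, max_depth: 3)
puts within_limits?('{"a": "[[[["}', max_bytes: 1_000, max_depth: 1)
```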
103
+
104
+ ## Development
105
+
106
+ After checking out the repo, run `bin/setup` to install dependencies, then `rake compile` to build the C extension and `rake spec` to run the tests. The test suite runs every example against **both** the C and pure-Ruby paths, so the two stay behavior-identical.
107
+
108
+ ## License
109
+
110
+ Available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT).
data/Rakefile ADDED
@@ -0,0 +1,22 @@
1
+ # frozen_string_literal: true
2
+
3
+ require "bundler/gem_tasks"
4
+ require "rspec/core/rake_task"
5
+
6
+ RSpec::Core::RakeTask.new(:spec)
7
+
8
+ require "rubocop/rake_task"
9
+
10
+ RuboCop::RakeTask.new
11
+
12
+ require "rake/extensiontask"
13
+ Rake::ExtensionTask.new("smarter_json") do |ext|
14
+   ext.ext_dir = "ext/smarter_json"
15
+   ext.lib_dir = "lib/smarter_json"
16
+ end
17
+
18
+ task spec: :compile
19
+ # rubocop is NOT in the default task: `bundle exec rake` = build + test only, so it
20
+ # runs on every Ruby in the CI matrix (incl. 2.6–2.7, where the latest rubocop won't
21
+ # install). Lint runs as its own CI step on one modern Ruby. Locally: `rake rubocop`.
22
+ task default: %i[clobber compile spec]
@@ -0,0 +1,48 @@
1
+
2
+ ### Contents
3
+
4
+ * [**Introduction**](./_introduction.md)
5
+ * [The Basic Read API](./basic_read_api.md)
6
+ * [The Basic Write API](./basic_write_api.md)
7
+ * [Configuration Options](./options.md)
8
+ * [Examples](./examples.md)
9
+
10
+ --------------
11
+
12
+ # SmarterJSON Introduction
13
+
14
+ `smarter_json` is a fast, lenient JSON parser and writer for Ruby. It reads strict JSON, JSON5, HJSON-style config, newline-delimited JSON (NDJSON / JSONL), and the messy JSON-ish input humans actually paste — and in benchmarks it matches or beats Oj on nearly every file. It is opinionated: it optimizes for getting your data out, not for policing the JSON spec. Where other parsers stop at the first deviation, SmarterJSON keeps going.
15
+
16
+ ## Why another JSON library?
17
+
18
+ Most JSON parsers reject anything that isn't perfectly strict JSON, and they make you tell them up front what shape the input is. SmarterJSON is built on the opposite principle: **you shouldn't have to care what flavor of JSON you were handed.** Give it strict JSON, JSON5, an HJSON-style config file, several concatenated documents, or a copy-pasted blob with comments and trailing commas — it just reads it.
19
+
20
+ ## What sets it apart
21
+
22
+ * **One reader, no modes, no flags.** There is no `dialect:` option and no "strict mode" — `SmarterJSON.process(input)` accepts the whole superset, and strict JSON is simply the narrowest case. You don't configure the reader to match your input; it adapts to whatever you give it.
23
+
24
+ * **It reads multi-document input automatically — a distinguishing feature.** `SmarterJSON.process` handles NDJSON / JSONL / concatenated JSON with **no block and no special method**: zero documents returns `nil`, one document returns its value, two or more return an `Array`. **Only SmarterJSON reads multi-document input via plain `process` — Oj and the stdlib `json` library raise without a block.** For input larger than memory, pass a block to stream one document at a time. See [The Basic Read API](./basic_read_api.md).
25
+
26
+ * **It's fast.** A C extension (with a pure-Ruby fallback that runs everywhere) puts it ahead of Oj on nearly every file we benchmark, and competitive with the stdlib `json` C parser. Floats are parsed with Ryū (correctly rounded, single-pass), so number-heavy data is fast and bit-exact.
27
+
28
+ * **It writes JSON too.** `SmarterJSON.generate` turns Ruby values into strict, interoperable JSON — or into NDJSON, one element per line, the exact inverse of reading NDJSON back into an Array. See [The Basic Write API](./basic_write_api.md).
29
+
30
+ ## What it accepts, beyond strict JSON
31
+
32
+ * `//`, `/* … */`, and `#` comments (a `#`/`//` only starts a comment when preceded by whitespace, so `url: http://x.com` reads as a string, not a truncated value)
33
+ * Trailing commas; unquoted keys (`{host: localhost}`); single-quoted, triple-quoted (`'''…'''`), and quoteless string values
34
+ * Implicit root object — a config file that starts with `key: value`, no outer `{}`
35
+ * `NaN`, `Infinity`, hex (`0xFF`), leading `+` / `.`, underscores in numbers (`1_000_000`)
36
+ * UTF-8 BOM, smart/curly quotes, Python literals (`True` / `False` / `None`), JavaScript `undefined`
37
+ * Mixed CR / LF / CRLF line endings, and any Ruby-supported input encoding (via `encoding:`)
38
+ * Duplicate keys (last value wins by default; configurable — see [Configuration Options](./options.md))
39
+
40
+ It raises only on genuinely unparseable input (unterminated string, mismatched bracket), with line and column in the message — never on valid-but-lenient input.
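The lenient number forms listed above (hex, leading `+` or `.`, underscores, `NaN`, `Infinity`) can be sketched in plain Ruby as follows. `lenient_number` is a hypothetical helper written for illustration; it is not the gem's number scanner and makes no attempt at its single-pass design.

```ruby
# Digits with an optional fraction and exponent, or a bare leading-dot fraction.
NUMBER_BODY = /\A(?:\d+(?:\.\d+)?|\.\d+)(?:[eE][+-]?\d+)?\z/.freeze

# Hypothetical helper: turn one lenient number token into a Ruby number.
def lenient_number(token)
  text = token.delete("_") # 1_000_000 -> 1000000
  sign = text.start_with?("-") ? -1 : 1
  body = text.sub(/\A[+-]/, "")

  case body
  when "NaN" then Float::NAN
  when "Infinity" then sign * Float::INFINITY
  when /\A0[xX](\h+)\z/ then sign * Regexp.last_match(1).to_i(16)
  when /\A\d+\z/ then sign * body.to_i
  when NUMBER_BODY then sign * Float(body.start_with?(".") ? "0#{body}" : body)
  else raise ArgumentError, "not a number: #{token.inspect}"
  end
end

p lenient_number("1_000_000")
p lenient_number("0xFF")
p lenient_number(".5")
```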
41
+
42
+ ## Nesting & untrusted input
43
+
44
+ Both the C extension and the pure-Ruby parser are **iterative, not recursive** — they track nesting on an explicit, heap-allocated stack rather than the call stack. So deeply nested input **cannot overflow the call stack or segfault**: nesting is bounded only by available memory, the same posture as Oj (the stdlib `json` caps at 100). The trade-off: there is currently **no fixed nesting or input-size limit**, so size-limit untrusted input upstream of the parser.
45
+
46
+ ---------------
47
+
48
+ NEXT: [The Basic Read API](./basic_read_api.md) | UP: [README](../README.md)
@@ -0,0 +1,72 @@
1
+
2
+ ### Contents
3
+
4
+ * [Introduction](./_introduction.md)
5
+ * [**The Basic Read API**](./basic_read_api.md)
6
+ * [The Basic Write API](./basic_write_api.md)
7
+ * [Configuration Options](./options.md)
8
+ * [Examples](./examples.md)
9
+
10
+ --------------
11
+
12
+ # SmarterJSON Basic Read API
13
+
14
+ Reading JSON has one entry point for content and one for files. Both accept the same [options](./options.md), and both take an optional block for streaming.
15
+
16
+ ## `SmarterJSON.process` — read a String or an IO
17
+
18
+ ```ruby
19
+ require "smarter_json"
20
+
21
+ SmarterJSON.process('{"a": 1, "b": [2, 3]}') # => {"a"=>1, "b"=>[2, 3]}
22
+ SmarterJSON.process("host: localhost\nport: 5432") # => {"host"=>"localhost", "port"=>5432} (no braces needed)
23
+ ```
24
+
25
+ `process` is polymorphic: its first argument is **either a String of JSON content or an IO to read from**. A String is always treated as content, never as a filename — use `process_file` for paths.
26
+
27
+ ```ruby
28
+ SmarterJSON.process(io) # an open IO (File, StringIO, socket, …) — reads it and parses
29
+ SmarterJSON.process(some_string) # JSON content
30
+ ```
31
+
32
+ ### Return value depends on how many documents the input holds
33
+
34
+ This is the distinguishing feature: `process` reads multi-document input (NDJSON / JSONL / concatenated / whitespace-separated) automatically, with no block and no special method.
35
+
36
+ ```ruby
37
+ SmarterJSON.process("") # => nil (zero documents)
38
+ SmarterJSON.process('{"id":1}') # => {"id"=>1} (one document → the value itself)
39
+ SmarterJSON.process(%({"id":1}\n{"id":2}\n{"id":3})) # => [{"id"=>1}, {"id"=>2}, {"id"=>3}] (two or more → an Array)
40
+ ```
41
+
42
+ Documents are separated by whitespace, newlines, or simple concatenation — **not** by commas (a comma between top-level documents would be read as an implicit root array, which is not supported). Only SmarterJSON reads this via plain `process`: Oj and the stdlib `json` library raise without a block.
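The return-value rule can be sketched with the stdlib `json` library alone. This is an illustration of the rule, not of how smarter_json detects documents: `parse_lines` and `collapse_documents` are hypothetical helpers, and the sketch assumes one strict-JSON document per line.

```ruby
require "json"

# Zero documents -> nil, one -> the value itself, two or more -> an Array.
def collapse_documents(docs)
  case docs.size
  when 0 then nil
  when 1 then docs.first
  else docs
  end
end

# Hypothetical stand-in: parse each non-blank line with stdlib json.
def parse_lines(text)
  docs = text.each_line.map(&:strip).reject(&:empty?).map { |line| JSON.parse(line) }
  collapse_documents(docs)
end

p parse_lines("")
p parse_lines('{"id":1}')
p parse_lines(%({"id":1}\n{"id":2}\n{"id":3}))
```

One consequence of the rule worth keeping in mind: a single document that is itself an array, such as `[1, 2]`, and two separate documents `1` and `2` both come back as the same Ruby Array, so callers that need to tell them apart should use the block form.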
43
+
44
+ ## `SmarterJSON.process_file` — read a file by path
45
+
46
+ ```ruby
47
+ SmarterJSON.process_file("config.json5") # read the file, then parse — same return-value rules as process
48
+ ```
49
+
50
+ `process_file` opens the file, reads it with the labeled [`encoding:`](./options.md) (default `"UTF-8"`, no transcoding pass), and parses it.
51
+
52
+ ## Streaming with a block (bounded memory)
53
+
54
+ For input larger than memory, pass a block. Each top-level document is yielded as it is read, and the method returns `nil` (it never collects the documents into an Array). Both `process` and `process_file` forward the block.
55
+
56
+ ```ruby
57
+ # Stream straight from disk, one document at a time — the whole file is never loaded:
58
+ SmarterJSON.process_file("events.ndjson") { |event| EventJob.perform_async(event) }
59
+
60
+ # Same for an IO:
61
+ SmarterJSON.process(io) { |doc| handle(doc) }
62
+ ```
63
+
64
+ The streaming path reads the input as newline-delimited documents (NDJSON / JSONL), one document per line. A single document that spans multiple lines is not supported by the streaming path — read it without a block instead.
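The shape of the block form (yield each line's document, return `nil`, never build an Array) looks roughly like the following sketch. It uses stdlib `json` and `StringIO`; `each_document` is a hypothetical name and the code is not the gem's streaming path.

```ruby
require "json"
require "stringio"

# Hypothetical sketch: read an IO line by line and yield one document per line.
# Only the current line is held in memory, and nothing is collected.
def each_document(io)
  io.each_line do |line|
    line = line.strip
    next if line.empty? # skip blank lines between records

    yield JSON.parse(line)
  end
  nil
end

io = StringIO.new(%({"id":1}\n\n{"id":2}\n))
each_document(io) { |doc| puts doc["id"] }
```

Because each line is parsed on its own, a document that spans several lines would fail here as well, which matches the restriction described above.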
65
+
66
+ ## The C extension and the pure-Ruby fallback
67
+
68
+ By default (`acceleration: :auto`) the C extension is used when it is compiled and loadable (`SmarterJSON::HAS_ACCELERATION` is then `true`); otherwise the pure-Ruby parser runs and produces identical results. Pass `acceleration: false` to force the pure-Ruby path. See [Configuration Options](./options.md).
69
+
70
+ ---------------
71
+
72
+ PREVIOUS: [Introduction](./_introduction.md) | NEXT: [The Basic Write API](./basic_write_api.md) | UP: [README](../README.md)
@@ -0,0 +1,144 @@
1
+
2
+ ### Contents
3
+
4
+ * [Introduction](./_introduction.md)
5
+ * [The Basic Read API](./basic_read_api.md)
6
+ * [**The Basic Write API**](./basic_write_api.md)
7
+ * [Configuration Options](./options.md)
8
+ * [Examples](./examples.md)
9
+
10
+ --------------
11
+
12
+ # SmarterJSON Basic Write API
13
+
14
+ Writing JSON has one entry point: `SmarterJSON.generate`. It turns a Ruby value into a JSON String — strict, interoperable output by default, or NDJSON when you ask for it.
15
+
16
+ ## `SmarterJSON.generate` — write a Ruby value as JSON
17
+
18
+ ```ruby
19
+ require "smarter_json"
20
+
21
+ SmarterJSON.generate({ "a" => 1, "b" => [2, 3] }) # => '{"a":1,"b":[2,3]}'
22
+ SmarterJSON.generate([1, 2, 3]) # => '[1,2,3]'
23
+ SmarterJSON.generate("hi") # => '"hi"'
24
+ SmarterJSON.generate(42) # => '42'
25
+ SmarterJSON.generate(nil) # => 'null'
26
+ ```
27
+
28
+ The output is always **valid, strict JSON** — there is no lenient write mode. (We are lenient about what we *read*, strict about what we *write*, so the output interoperates with every other JSON parser.)
29
+
30
+ ## How Ruby values map to JSON
31
+
32
+ | Ruby | JSON output |
33
+ |----------------------------------------|---------------------------------------------------------|
34
+ | `Hash` | object `{…}` — keys are stringified (Symbol keys too) |
35
+ | `Array` | array `[…]` |
36
+ | `String` | quoted string, escaped (see below) |
37
+ | `Symbol` | quoted string (`:sym` → `"sym"`) |
38
+ | `Integer` | number |
39
+ | `Float` | number (non-finite raises — see below) |
40
+ | `BigDecimal` | number, full precision (not a string) |
41
+ | `true` / `false` / `nil` | `true` / `false` / `null` |
42
+
43
+ ```ruby
44
+ SmarterJSON.generate({ a: 1, b: :sym }) # => '{"a":1,"b":"sym"}' (Symbol key and value → strings)
45
+ SmarterJSON.generate(BigDecimal("65.613616999999977")) # => '65.613616999999977' (a number, full precision)
46
+ SmarterJSON.generate("café\tx") # => '"café\tx"' (control chars escaped, UTF-8 raw)
47
+ ```
48
+
49
+ Strings escape `"`, `\`, and the control characters `0x00–0x1F`; everything else — including multi-byte UTF-8 — is emitted raw, which is valid JSON.
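A minimal sketch of that escaping rule, in plain Ruby. `quote_json_string` and `SHORT_ESCAPES` are hypothetical names; the choice of short escapes such as `\t` where JSON defines them, and lowercase `\u00XX` for the remaining control characters, is an assumption of the sketch (both forms are valid JSON).

```ruby
# Two-character escapes that JSON defines for common characters.
SHORT_ESCAPES = {
  '"' => '\\"', "\\" => "\\\\", "\b" => "\\b", "\f" => "\\f",
  "\n" => "\\n", "\r" => "\\r", "\t" => "\\t"
}.freeze

# Hypothetical helper: escape quote, backslash, and 0x00-0x1F; leave the rest raw.
def quote_json_string(str)
  body = str.gsub(/["\\\x00-\x1f]/) do |char|
    SHORT_ESCAPES[char] || format("\\u%04x", char.ord)
  end
  %("#{body}")
end

puts quote_json_string("caf\u00e9\tx") # the tab is escaped, the accented letter is not
puts quote_json_string(%(say "hi"))
```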
50
+
51
+ ## What raises
52
+
53
+ `generate` raises `SmarterJSON::GenerateError` on input it cannot represent as strict JSON:
54
+
55
+ ```ruby
56
+ SmarterJSON.generate(Time.now) # raises SmarterJSON::GenerateError — unsupported type
57
+ SmarterJSON.generate(Float::INFINITY) # raises SmarterJSON::GenerateError — non-finite Float
58
+ SmarterJSON.generate(Float::NAN) # raises SmarterJSON::GenerateError — non-finite Float
59
+ ```
60
+
61
+ (`GenerateError` is a kind of `SmarterJSON::Error`, so `rescue SmarterJSON::Error` catches it. `Infinity` and `NaN` are accepted on the *read* side as a leniency, but they are not valid JSON to *write*.)
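The practical effect of a shared base class is that one `rescue` covers every failure while specific classes remain available. The sketch below mirrors the described hierarchy in a stand-in module; `ErrorSketch` and `classify` are hypothetical names and none of this is the gem's own code.

```ruby
# Stand-in hierarchy for illustration only.
module ErrorSketch
  class Error < StandardError; end
  class ParseError < Error; end
  class EncodingError < ParseError; end
  class GenerateError < Error; end
end

# Rescue the specific class first, then fall back to the shared base.
def classify
  yield
  :ok
rescue ErrorSketch::GenerateError
  :generate_failed
rescue ErrorSketch::Error
  :other_library_error
end

p classify { raise ErrorSketch::GenerateError, "unsupported type" }
p classify { raise ErrorSketch::EncodingError, "invalid bytes" }
```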
62
+
63
+ By default `generate` is strict: it only writes the types above and raises on anything else. To serialize `Time`, `Date`, or your own objects, pass `coerce: true` — an unsupported value is then converted by its own `as_json` (whose result is re-emitted, so escaping/`indent`/`sort_keys` still apply) or, failing that, `to_json` (spliced verbatim):
64
+
65
+ ```ruby
66
+ Money = Struct.new(:cents, :currency) do
67
+   def as_json(*)
68
+     { "cents" => cents, "currency" => currency }
69
+   end
70
+ end
71
+
72
+ SmarterJSON.generate({ "price" => Money.new(500, "USD") }, coerce: true)
73
+ # => '{"price":{"cents":500,"currency":"USD"}}'
74
+ ```
75
+
76
+ Strict output stays the default precisely so that you opt in to delegating serialization rather than silently emitting an object's `to_s`. See [Configuration Options](./options.md).
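The order of decisions described for `coerce:` (native type, else `as_json`, else `to_json`, else raise) can be sketched as a small dispatcher. `resolve_value` and `NATIVE_TYPES` are hypothetical; `TypeError` stands in for the gem's `GenerateError`, and `BigDecimal` is left out of the native list only to keep the sketch free of requires.

```ruby
NATIVE_TYPES = [Hash, Array, String, Symbol, Integer, Float,
                TrueClass, FalseClass, NilClass].freeze

# Hypothetical dispatcher: decide how a value would be written.
# :native -> emit directly, :reemit -> emit the as_json result, :splice -> insert to_json verbatim.
def resolve_value(value, coerce: false)
  return [:native, value] if NATIVE_TYPES.any? { |type| value.is_a?(type) }
  raise TypeError, "unsupported type: #{value.class}" unless coerce
  return [:reemit, value.as_json] if value.respond_to?(:as_json)
  return [:splice, value.to_json] if value.respond_to?(:to_json)

  raise TypeError, "cannot coerce #{value.class}"
end

# Example class that only knows how to describe itself as plain data.
class Point
  def as_json(*)
    { "x" => 1, "y" => 2 }
  end
end

p resolve_value(Point.new, coerce: true)
```

Preferring `as_json` means the returned Hash still passes through the writer's own escaping and formatting, whereas a `to_json` String has to be trusted as already valid JSON.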
77
+
78
+ ## Pretty-printing
79
+
80
+ By default `generate` produces compact output (no spaces). Pass `indent:` (a number of spaces per nesting level) to pretty-print:
81
+
82
+ ```ruby
83
+ SmarterJSON.generate({ "a" => 1, "b" => [2, 3] }, indent: 2)
84
+ # => "{\n  \"a\": 1,\n  \"b\": [\n    2,\n    3\n  ]\n}"
85
+ ```
86
+
87
+ which prints as:
88
+
89
+ ```json
90
+ {
91
+   "a": 1,
92
+   "b": [
93
+     2,
94
+     3
95
+   ]
96
+ }
97
+ ```
98
+
99
+ Empty objects and arrays stay inline (`{}` / `[]`) even when indenting. `indent: 0` (the default) is compact output. Pretty-printing is multi-line, so it can't be combined with `format: :ndjson` (where each record must be a single line) — doing so raises `ArgumentError`. See [Configuration Options](./options.md).
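The indentation rules above can be sketched with a short recursive printer. `pretty` and `check_writer_options` are hypothetical helpers that use stdlib `json` only for scalars; the sketch assumes `indent` is greater than zero and is not the gem's writer.

```ruby
require "json"

# Hypothetical option check: multi-line output cannot be one-record-per-line NDJSON.
def check_writer_options(indent:, format:)
  return unless indent.positive? && format == :ndjson

  raise ArgumentError, "indent: cannot be combined with format: :ndjson"
end

# Hypothetical pretty-printer: `indent` spaces per nesting level, empties stay inline.
def pretty(value, indent, level = 0)
  pad = " " * (indent * (level + 1)) # indentation of members at this level
  close = " " * (indent * level)     # indentation of the closing bracket
  case value
  when Hash
    return "{}" if value.empty?

    members = value.map do |key, member|
      "#{pad}#{JSON.generate(key.to_s)}: #{pretty(member, indent, level + 1)}"
    end
    "{\n#{members.join(",\n")}\n#{close}}"
  when Array
    return "[]" if value.empty?

    items = value.map { |item| "#{pad}#{pretty(item, indent, level + 1)}" }
    "[\n#{items.join(",\n")}\n#{close}]"
  else
    JSON.generate(value)
  end
end

puts pretty({ "a" => 1, "b" => [2, 3], "c" => {} }, 2)
```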
100
+
101
+ ## Safe and canonical output
102
+
103
+ Three more options shape the output, and they compose with each other and with `indent:`:
104
+
105
+ - **`sort_keys: true`** — emit object keys in sorted order (Symbol keys sorted by their string form). Handy for canonical, diff-friendly JSON.
106
+ - **`ascii_only: true`** — escape every non-ASCII character as `\uXXXX` (characters above U+FFFF become a UTF-16 surrogate pair). The default emits raw UTF-8.
107
+ - **`script_safe: true`** — escape the `/` in `</` and the JavaScript line separators U+2028 / U+2029, so the output is safe to embed directly in an HTML `<script>` tag without breaking out of it.
108
+
109
+ ```ruby
110
+ SmarterJSON.generate({ "b" => 2, "a" => 1 }, sort_keys: true) # => '{"a":1,"b":2}'
111
+ SmarterJSON.generate("</script>", script_safe: true) # => '"<\/script>"'
112
+ ```
113
+
114
+ See [Configuration Options](./options.md) for the full table.
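For readers who want to see what the two escaping options amount to, here is a sketch that post-processes an already generated JSON string. `ascii_only` and `script_safe` here are hypothetical free functions, not the gem's options, and lowercase hex digits are an assumption of the sketch.

```ruby
# Hypothetical helper: escape every non-ASCII character as \uXXXX.
# Code points above U+FFFF are split into a UTF-16 surrogate pair.
def ascii_only(json)
  json.gsub(/[^\x00-\x7f]/) do |char|
    code = char.ord
    if code > 0xFFFF
      code -= 0x10000
      format("\\u%04x\\u%04x", 0xD800 + (code >> 10), 0xDC00 + (code & 0x3FF))
    else
      format("\\u%04x", code)
    end
  end
end

# Hypothetical helper: make the text safe inside an HTML <script> element.
# The Hash form of gsub inserts the replacements literally.
def script_safe(json)
  json.gsub(%r{</|\u2028|\u2029},
            "</" => "<\\/", "\u2028" => "\\u2028", "\u2029" => "\\u2029")
end

puts ascii_only("\"caf\u00e9\"")
puts script_safe('"</script>"')
```

For example, U+1F600 lies above U+FFFF, so subtracting `0x10000` leaves `0xF600`, whose top ten bits give the high surrogate `0xD83D` and whose low ten bits give the low surrogate `0xDE00`.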
115
+
116
+ ## Writing NDJSON
117
+
118
+ Pass `format: :ndjson` to write newline-delimited JSON. An `Array` writes **one element per line**; any other value writes as a single line. This is the exact inverse of [reading NDJSON](./basic_read_api.md) back into an Array.
119
+
120
+ ```ruby
121
+ SmarterJSON.generate([{ "id" => 1 }, { "id" => 2 }], format: :ndjson) # => "{\"id\":1}\n{\"id\":2}\n"
122
+ SmarterJSON.generate({ "id" => 1 }, format: :ndjson) # => "{\"id\":1}\n" (single value → one line)
123
+ SmarterJSON.generate([], format: :ndjson) # => "" (empty array → no lines)
124
+ ```
125
+
126
+ Note the difference from the default `format: :json`, where a top-level Array is written as a single JSON array (`[…]`), not as NDJSON. See [Configuration Options](./options.md) for the full list of writer options.
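The difference between the two formats is easy to reproduce with the stdlib `json` library. In this sketch `to_ndjson` is a hypothetical helper that follows the rules stated above (one Array element per line, any other value as a single line); it is not the gem's writer.

```ruby
require "json"

# Hypothetical helper: one compact JSON document per line, each ending in "\n".
def to_ndjson(value)
  records = value.is_a?(Array) ? value : [value]
  records.map { |record| "#{JSON.generate(record)}\n" }.join
end

rows = [{ "id" => 1 }, { "id" => 2 }]
print to_ndjson(rows)     # two lines, one object each
puts JSON.generate(rows)  # one line holding a single JSON array
```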
127
+
128
+ ## Round-tripping
129
+
130
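+ `process` and `generate` round-trip values built from JSON-native types (String keys and values, numbers, `true` / `false` / `nil`). With `format: :ndjson` the round trip returns the original Array only when it has two or more elements, because a single document reads back as the value itself and empty input reads back as `nil`: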
131
+
132
+ ```ruby
133
+ obj = { "a" => 1, "b" => [2, "three", nil, true] }
134
+ SmarterJSON.process(SmarterJSON.generate(obj)) == obj # => true
135
+
136
+ arr = [{ "id" => 1 }, { "id" => 2 }, { "id" => 3 }]
137
+ SmarterJSON.process(SmarterJSON.generate(arr, format: :ndjson)) == arr # => true
138
+ ```
139
+
140
+ Check out the [RSpec tests](../spec/generator_spec.rb) for more examples.
141
+
142
+ ---------------
143
+
144
+ PREVIOUS: [The Basic Read API](./basic_read_api.md) | NEXT: [Configuration Options](./options.md) | UP: [README](../README.md)