smarter_json 0.5.1

checksums.yaml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA256:
3
+ metadata.gz: cbc1f54c56cf5a1fd4c569660faf9f4115e5e7ed8442ac9e8e7105bc880a3912
4
+ data.tar.gz: 5e68d7b1dafa55347cf5de1ee10e8ac39a97b645996fd0c58538799bbbb1191d
5
+ SHA512:
6
+ metadata.gz: 0e84fb4caf1fc9b192aa0e88f5111c3178557078abd8a31e7ce73373590e2487344c8df9ad61e718756e78a15239b460fd05b5f7a6fdbede35d8859f1873f5be
7
+ data.tar.gz: 9d808b8b3e8465ce7b12ae861053e67a77ef1550df425c6383ffe4b799ba401a6f927f692b09709942fdb9e690a1c9d25a5f4453d57b9b532e18f979d171199d
data/.gitignore ADDED
@@ -0,0 +1,46 @@
1
+ # General ignores for a Ruby gem
2
+ *.gem
3
+
4
+ # Ignore Bundler lockfile for gems (leave out for gems, include for apps)
5
+ Gemfile.lock
6
+
7
+ # Ignore RSpec status persistence
8
+ .rspec_status
9
+
10
+ # Ignore build output
11
+ /pkg/
12
+ /tmp/
13
+ /log/
14
+ /coverage/
15
+
16
+ # Ignore RuboCop, Yard, docs, bundler
17
+ /.bundle/
18
+ .yardoc
19
+ /.yardoc/
20
+ /.byebug_history
21
+ /.pry_history
22
+ /.irb-history
23
+
24
+ # Ignore Mac, editor, IDE artifacts
25
+ .DS_Store
26
+ *~
27
+ *.swp
28
+ *.swo
29
+ *.tmp
30
+ .vscode/
31
+ .idea/
32
+
33
+ # Ignore node_modules just in case
34
+ node_modules/
35
+
36
+ # Ignore test coverage output
37
+ coverage/
38
+
39
+ # Ignore binary object files, extensions
40
+ *.so
41
+ *.o
42
+ *.bundle
43
+ *.rbc
44
+
45
+ .claude/
46
+ CLAUDE.md
data/CHANGELOG.md ADDED
@@ -0,0 +1,70 @@
1
+
2
+ # SmarterJSON Change Log
3
+
4
+ ## 0.5.1 (2026-06-01)
5
+ - Unified the error classes under a single `SmarterJSON::Error` base: `ParseError` and `EncodingError` now inherit from it, and `generate` raises a new `GenerateError`. `rescue SmarterJSON::Error` now catches everything the gem raises.
6
+ - Added a CI test matrix (Ruby 2.6–4.0 + head, on Ubuntu and macOS).
7
+ - Fixed the C extension build on Ruby 2.6 (declare `rb_hash_bulk_insert`, which 2.6 exports but does not declare in its headers); set the minimum Ruby to 2.6.
8
+
9
+ ## 0.5.0 (2026-05-31 unreleased)
10
+ - add JSON generation, incl. NDJSON generation
11
+ - add test coverage
12
+
13
+ ## 0.4.0 (2026-05-31 unreleased)
14
+ - rename `flex_json` -> `smarter_json`
15
+
16
+ ## 0.3.10 (2026-05-31 unreleased)
17
+ - change interface to use `.process` and `.process_file`
18
+
19
+
20
+ ## 0.3.9 (2026-05-31 unreleased)
21
+ - `parse` (no block) now handles any input automatically: 0 documents (empty / whitespace / comment-only) → `nil`, 1 document → the value itself, 2+ documents (NDJSON / JSONL / concatenated / whitespace-separated) → an Array of the values. It no longer raises on trailing content.
22
+ - Detection is free (the same trailing-content check that used to raise) and the single-document path allocates no Array, so single-value parsing is unchanged in speed.
23
+ - The block form (`parse(input) { |doc| … }`) is kept as the bounded-memory streaming path. `parse_file(path) { |doc| … }` now forwards the block too, so files stream the same way (previously the block was silently ignored). Bracketless comma lists (`1, 2, 3`) still raise — commas don't separate top-level documents (implicit-root array remains unsupported).
24
+ - The block form allows individual processing of each line in NDJSON files.
25
+ - Supersedes the earlier "raise on trailing content, match Oj" behavior.
26
+
27
+ ## 0.3.8 (2026-05-30 unreleased)
28
+ - Reordered single-character checks so the more common byte is tested first (`-` before `+`).
29
+ - Quoteless-token boundary scan now uses a 256-byte class table: ordinary bytes are classified in one table lookup, and the lookahead byte is read only at a `#`/`/` instead of on every byte. Speeds up quoteless / config-style input (the lenient case the JSON benchmarks don't exercise).
30
+
31
+ ## 0.3.7 (2026-05-30 unreleased)
32
+ - Escaped-string literal runs are bulk-copied with the NEON scanner instead of one byte at a time.
33
+ - Added branch hints (`__builtin_expect`) and prefetch to the hot string-scan loop. Sped up string-heavy files (string_array, github_events, twitter all 12–16% faster).
34
+
35
+ ## 0.3.6 (2026-05-30 unreleased)
36
+ - Fast path for plain numbers inside objects/arrays (`fj_try_member_number`): one scan straight from the cursor, committing when the number meets a delimiter and falling back to the quoteless scanner otherwise. Skips the quoteless boundary scan + classify dispatch for the common case. Broad gains on number-in-container files (weather, canada, usgs, big_decimals).
37
+
38
+ ## 0.3.5 (2026-05-30 unreleased)
39
+ - Rewrote `fj_parse_number` (top-level numbers) as a single pass: finds the token end and accumulates the mantissa/exponent at once, using the string's NUL terminator as a scan sentinel (no per-byte bounds check) and a digit loop that skips the underscore check until an underscore actually appears.
40
+ - Added `fj_try_decimal` for the quoteless path: validates and extracts the number in one scan, replacing the old three scans (validate + significant-digit count + mantissa extraction); skips the significant-digit scan when the number has ≤16 digits.
41
+ - Both number paths now build values through the shared `fj_int_from_parts` / `fj_float_from_parts` helpers so they can't drift; removed the now-dead `fj_validate_decimal` / `fj_int_value` / `fj_decimal_value`.
42
+
43
+ ## 0.3.4 (2026-05-30 unreleased)
44
+ - Dropped a per-member Ruby method call (`key?`) that fired for every object member under the default duplicate-key mode — pure waste on object-heavy files (twitter, github_events, citm).
45
+ - Build objects and arrays from a C value stack with a pre-sized hash + bulk insert (and size-based duplicate detection), instead of inserting one member/element at a time.
46
+ - Added a per-parse key cache so repeated object keys are interned once instead of every occurrence.
47
+
48
+ ## 0.3.3 (2026-05-30 unreleased)
49
+ - Vendored Ryū (Ulf Adams, Apache-2.0) for correctly-rounded string→double conversion: the mantissa is accumulated in one pass and converted with no `strtod`. Large win on float-heavy files (canada, big_decimals).
50
+
51
+ ## 0.3.3 (2026-05-29 unreleased)
52
+ - performance fixes
53
+
54
+ ## 0.3.2 (2026-05-29 unreleased)
55
+ - performance fixes
56
+
57
+ ## 0.3.1 (2026-05-29 unreleased)
58
+ - performance fixes
59
+
60
+ ## 0.3.0 (2026-05-29 unreleased)
61
+ - iterative parser
62
+
63
+ ## 0.2.0 (2026-05-29 unreleased)
64
+ - recursive parser
65
+
66
+ ## 0.1.1 (2026-05-29 unreleased)
67
+ - MVP complete
68
+
69
+ ## 0.1.0 (2026-05-28 unreleased)
70
+ - Initial Ruby version
data/LICENSE.txt ADDED
@@ -0,0 +1,21 @@
1
+ The MIT License (MIT)
2
+
3
+ Copyright (c) 2026 Tilo Sloboda
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in
13
+ all copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
21
+ THE SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,110 @@
1
+ # SmarterJSON
2
+
3
+ ![Gem Version](https://img.shields.io/gem/v/smarter_json) [![codecov](https://codecov.io/gh/tilo/smarter_json/branch/main/graph/badge.svg)](https://codecov.io/gh/tilo/smarter_json) [![Downloads](https://img.shields.io/gem/dt/smarter_json)](https://rubygems.org/gems/smarter_json) [![RubyGems](https://img.shields.io/badge/RubyGems-smarter__json-brightgreen?logo=rubygems&logoColor=white)](https://rubygems.org/gems/smarter_json) [![Ruby Toolbox](https://img.shields.io/badge/Ruby%20Toolbox-smarter__json-brightgreen)](https://www.ruby-toolbox.com/projects/smarter_json)
4
+
5
+ A lenient, fast JSON parser for Ruby. It parses strict JSON, JSON5, HJSON-style config, and the messy JSON-ish input humans actually write, and in benchmarks it matches or beats Oj on nearly every file. SmarterJSON is opinionated: we want your JSON processing to succeed. Other parsers are strict and stop at the first deviation; SmarterJSON keeps going, because it optimizes for getting your data out, not for policing the JSON spec.
6
+
7
+ ## Why SmarterJSON?
8
+
9
+ Most JSON parsers reject anything that isn't perfectly strict JSON. SmarterJSON is built on the opposite principle: **you shouldn't have to care what flavor of JSON you were handed.** Give it strict JSON, JSON5, an HJSON-style config file, newline-delimited JSON, or a copy-pasted blob with comments and trailing commas — it just parses it.
10
+
11
+ Three things set it apart:
12
+
13
+ 1. **One parser, no modes, no flags.** There is no `dialect:` option and no "strict mode" — `SmarterJSON.process(input)` accepts the whole superset, and strict JSON is simply the narrowest case. You don't configure the parser to match your input; it adapts to whatever you give it.
14
+
15
+ 2. **It parses multi-document input automatically — a distinguishing feature.** `SmarterJSON.process` handles NDJSON / JSONL / concatenated JSON with **no block and no special method**: one document returns its value, several documents return an `Array`, empty input returns `nil`. **Only SmarterJSON parses multi-document input via plain `process` — Oj and the stdlib `json` library raise without a block.** For input larger than memory, pass a block to stream one document at a time.
16
+
17
+ 3. **It's fast.** A C extension (with a pure-Ruby fallback that runs everywhere) puts it ahead of Oj on nearly every file we benchmark, and keeps it competitive with the stdlib `json` C parser, which is the fastest general-purpose Ruby JSON parser.
18
+
19
+ ## What it accepts, beyond strict JSON
20
+
21
+ - `//`, `/* … */`, and `#` comments (a `#`/`//` only starts a comment when preceded by whitespace, so `url: http://x.com` parses as a string, not a truncated value)
22
+ - Trailing commas; unquoted keys (`{host: localhost}`); single-quoted, triple-quoted (`'''…'''`), and quoteless string values
23
+ - Implicit root object — a config file that starts with `key: value`, no outer `{}`
24
+ - `NaN`, `Infinity`, hex (`0xFF`), leading `+` / `.`, underscores in numbers (`1_000_000`)
25
+ - UTF-8 BOM, smart/curly quotes, Python literals (`True` / `False` / `None`), JavaScript `undefined`
26
+ - Mixed CR / LF / CRLF line endings, and any Ruby-supported input encoding (via `encoding:`)
27
+ - Duplicate keys (last value wins by default; configurable)
28
+
29
+ It raises only on genuinely unparseable input (unterminated string, mismatched bracket), with line and column in the message — never on valid-but-lenient input.
30
+
31
+ ## Installation
32
+
33
+ ```ruby
34
+ # Gemfile
35
+ gem "smarter_json"
36
+ ```
37
+
38
+ ```bash
39
+ gem install smarter_json
40
+ ```
41
+
42
+ The C extension is built on install and used automatically. On platforms where it can't build, the pure-Ruby parser runs instead and produces identical results.
43
+
44
+ ## Documentation
45
+
46
+ * [Introduction](docs/_introduction.md)
47
+ * [The Basic Read API](docs/basic_read_api.md)
48
+ * [The Basic Write API](docs/basic_write_api.md)
49
+ * [Configuration Options](docs/options.md)
50
+ * [Examples](docs/examples.md)
51
+
52
+ ## Usage
53
+
54
+ ```ruby
55
+ require "smarter_json"
56
+
57
+ SmarterJSON.process('{"a": 1, "b": [2, 3]}') # => {"a"=>1, "b"=>[2, 3]}
58
+ SmarterJSON.process("host: localhost\nport: 5432") # => {"host"=>"localhost", "port"=>5432} (no braces needed)
59
+ SmarterJSON.process_file("config.json5") # read a file, then parse
60
+
61
+ # Multiple documents (NDJSON / JSONL / concatenated) — no block, no special method:
62
+ SmarterJSON.process(%({"id":1}\n{"id":2}\n{"id":3})) # => [{"id"=>1}, {"id"=>2}, {"id"=>3}]
63
+ SmarterJSON.process('{"id":1}') # => {"id"=>1} (one document → the value itself)
64
+ SmarterJSON.process("") # => nil (zero documents)
65
+
66
+ # For input larger than memory, stream one document at a time with a block
67
+ # (process and process_file both forward the block):
68
+ SmarterJSON.process_file("events.ndjson") { |event| EventJob.perform_async(event) }
69
+ ```
70
+
71
+ ### Options
72
+
73
+ | option | default | meaning |
74
+ |-------------------|--------------|-------------------------------------------------------------------------|
75
+ | `symbolize_keys` | `false` | return object keys as Symbols instead of Strings |
76
+ | `duplicate_key` | `:last_wins` | `:last_wins` / `:first_wins` / `:raise` for repeated keys in one object |
77
+ | `bigdecimal_load` | `:auto` | `:auto` keeps high-precision decimals as `BigDecimal`; `:float` forces `Float`; `:bigdecimal` forces `BigDecimal` |
78
+ | `acceleration` | `true` | `true` uses the C extension when compiled and loadable; `false` forces pure Ruby (identical results) |
79
+ | `encoding` | `"UTF-8"` | labels the input's encoding (no transcoding pass; see below) |
80
+
81
+ ## Performance
82
+
83
+ Benchmarks: p10 of 40 runs, Apple M1 Max, Ruby 3.4.7, on the standard JSON corpus (canada, citm_catalog, twitter, github_events, …). The apples-to-apples comparisons are **SmarterJSON/C** vs **Oj/strict** vs **stdlib `json`**, all producing `Float` (run `rake report` in `json_benchmarks/` for the full table — numbers vary run to run).
84
+
85
+ - **vs Oj:** SmarterJSON/C matches or beats Oj on nearly every file — typically **1.1–1.7× faster** (e.g. deeply-nested ~1.7×, citm ~1.3×, twitter ~1.3×, usgs/weather ~1.2–1.3×).
86
+ - **vs stdlib `json` (C):** competitive with the fastest Ruby JSON parser — it matches `json` on number- and string-heavy files (e.g. big_decimals, string_array) and trails by ~1.2–1.6× on others.
87
+ - **Numbers:** floats are parsed with Ryū (correctly rounded, single-pass), so number-heavy data is fast and bit-exact.
88
+
89
+ **Two notes on fair comparison:**
90
+
91
+ - **NDJSON:** on multi-document files, **only SmarterJSON parses the input via plain `process`** — Oj and `json` raise without a block, so their cells are `N/A`. That `N/A` reflects real default behavior, not a measurement gap. Plain `process` collects every document into an Array at ~270 MB/s; the streaming block form runs faster (~440 MB/s) because it doesn't hold all documents in memory at once — use it for input larger than RAM.
92
+ - **High-precision decimals (e.g. `canada.json`):** SmarterJSON's default `:auto` mode preserves high-precision numbers as `BigDecimal` (matching Oj's default), which is intrinsically slower than `Float`. Against `Float`-producing parsers it looks slower on such files; pass `bigdecimal_load: :float` to compare like-for-like (it then runs much faster). Against the equivalent `BigDecimal`-producing Oj mode, SmarterJSON is faster.
93
+
94
+ ## Encoding
95
+
96
+ `encoding:` (default `"UTF-8"`) labels what the input is — it does **not** trigger a transcoding pass. The parser works on the bytes in their native encoding and emits string values with the same encoding tag, the same way `smarter_csv` handles encodings. Bytes that are invalid for the claimed encoding raise `SmarterJSON::EncodingError` (a kind of `SmarterJSON::ParseError`).
97
+
98
+ ## Nesting & untrusted input
99
+
100
+ Both the C extension and the pure-Ruby parser are **iterative, not recursive** — they track nesting on an explicit, heap-allocated stack rather than the call stack. So deeply nested input **cannot overflow the call stack or segfault**: nesting is bounded only by available memory, the same posture as Oj (which also ships no nesting limit; the stdlib `json` caps at 100). The `deeply_nested.json` benchmark (212 MB of nesting) parses without issue.
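The explicit-stack idea can be illustrated in a few lines of plain Ruby. The sketch below is not SmarterJSON's parser: `parse_nested_arrays` is a hypothetical helper that only understands `[`, `]`, commas, and whitespace, and it exists to show how nesting can be tracked on a heap-allocated Array instead of the call stack.

```ruby
# Hypothetical illustration only: builds nested Ruby Arrays from bracket text
# without recursion, so nesting depth is limited by memory, not the call stack.
def parse_nested_arrays(text)
  stack = [] # explicit stack of arrays that are still open
  root = nil # most recently closed array; the outermost one closes last

  text.each_char do |ch|
    case ch
    when "["
      node = []
      stack.last << node unless stack.empty? # attach to the enclosing array
      stack.push(node)
    when "]"
      raise ArgumentError, "unbalanced ]" if stack.empty?
      root = stack.pop
    when " ", "\n", "\t", ","
      next # separators carry no structure in this toy grammar
    else
      raise ArgumentError, "unexpected character #{ch.inspect}"
    end
  end

  raise ArgumentError, "unclosed [" unless stack.empty?
  root
end

parse_nested_arrays("[[], [[]]]") # => [[], [[]]]
```

Because no method calls itself, very deep input only grows the `stack` Array; a recursive-descent version of the same helper would be limited by the Ruby call stack.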
101
+
102
+ The trade-off: there is currently **no fixed nesting or input-size limit**, so extremely large or adversarially-nested untrusted input is bounded by memory (it can exhaust RAM), not by a crash. If you parse untrusted input and want a hard cap, that's a planned opt-in guard — for now, size-limit upstream of the parser.
103
+
104
+ ## Development
105
+
106
+ After checking out the repo, run `bin/setup` to install dependencies, then `rake compile` to build the C extension and `rake spec` to run the tests. The test suite runs every example against **both** the C and pure-Ruby paths, so the two stay behavior-identical.
107
+
108
+ ## License
109
+
110
+ Available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT).
data/Rakefile ADDED
@@ -0,0 +1,22 @@
1
+ # frozen_string_literal: true
2
+
3
+ require "bundler/gem_tasks"
4
+ require "rspec/core/rake_task"
5
+
6
+ RSpec::Core::RakeTask.new(:spec)
7
+
8
+ require "rubocop/rake_task"
9
+
10
+ RuboCop::RakeTask.new
11
+
12
+ require "rake/extensiontask"
13
+ Rake::ExtensionTask.new("smarter_json") do |ext|
14
+ ext.ext_dir = "ext/smarter_json"
15
+ ext.lib_dir = "lib/smarter_json"
16
+ end
17
+
18
+ task spec: :compile
19
+ # rubocop is NOT in the default task: `bundle exec rake` = build + test only, so it
20
+ # runs on every Ruby in the CI matrix (incl. 2.6–2.7, where the latest rubocop won't
21
+ # install). Lint runs as its own CI step on one modern Ruby. Locally: `rake rubocop`.
22
+ task default: %i[clobber compile spec]
data/docs/_introduction.md ADDED
@@ -0,0 +1,48 @@
1
+
2
+ ### Contents
3
+
4
+ * [**Introduction**](./_introduction.md)
5
+ * [The Basic Read API](./basic_read_api.md)
6
+ * [The Basic Write API](./basic_write_api.md)
7
+ * [Configuration Options](./options.md)
8
+ * [Examples](./examples.md)
9
+
10
+ --------------
11
+
12
+ # SmarterJSON Introduction
13
+
14
+ `smarter_json` is a fast, lenient JSON parser and writer for Ruby. It reads strict JSON, JSON5, HJSON-style config, newline-delimited JSON (NDJSON / JSONL), and the messy JSON-ish input humans actually paste — and in benchmarks it matches or beats Oj on nearly every file. It is opinionated: it optimizes for getting your data out, not for policing the JSON spec. Where other parsers stop at the first deviation, SmarterJSON keeps going.
15
+
16
+ ## Why another JSON library?
17
+
18
+ Most JSON parsers reject anything that isn't perfectly strict JSON, and they make you tell them up front what shape the input is. SmarterJSON is built on the opposite principle: **you shouldn't have to care what flavor of JSON you were handed.** Give it strict JSON, JSON5, an HJSON-style config file, several concatenated documents, or a copy-pasted blob with comments and trailing commas — it just reads it.
19
+
20
+ ## What sets it apart
21
+
22
+ * **One reader, no modes, no flags.** There is no `dialect:` option and no "strict mode" — `SmarterJSON.process(input)` accepts the whole superset, and strict JSON is simply the narrowest case. You don't configure the reader to match your input; it adapts to whatever you give it.
23
+
24
+ * **It reads multi-document input automatically — a distinguishing feature.** `SmarterJSON.process` handles NDJSON / JSONL / concatenated JSON with **no block and no special method**: zero documents returns `nil`, one document returns its value, two or more return an `Array`. **Only SmarterJSON reads multi-document input via plain `process` — Oj and the stdlib `json` library raise without a block.** For input larger than memory, pass a block to stream one document at a time. See [The Basic Read API](./basic_read_api.md).
25
+
26
+ * **It's fast.** A C extension (with a pure-Ruby fallback that runs everywhere) puts it ahead of Oj on nearly every file we benchmark, and competitive with the stdlib `json` C parser. Floats are parsed with Ryū (correctly rounded, single-pass), so number-heavy data is fast and bit-exact.
27
+
28
+ * **It writes JSON too.** `SmarterJSON.generate` turns Ruby values into strict, interoperable JSON — or into NDJSON, one element per line, the exact inverse of reading NDJSON back into an Array. See [The Basic Write API](./basic_write_api.md).
29
+
30
+ ## What it accepts, beyond strict JSON
31
+
32
+ * `//`, `/* … */`, and `#` comments (a `#`/`//` only starts a comment when preceded by whitespace, so `url: http://x.com` reads as a string, not a truncated value)
33
+ * Trailing commas; unquoted keys (`{host: localhost}`); single-quoted, triple-quoted (`'''…'''`), and quoteless string values
34
+ * Implicit root object — a config file that starts with `key: value`, no outer `{}`
35
+ * `NaN`, `Infinity`, hex (`0xFF`), leading `+` / `.`, underscores in numbers (`1_000_000`)
36
+ * UTF-8 BOM, smart/curly quotes, Python literals (`True` / `False` / `None`), JavaScript `undefined`
37
+ * Mixed CR / LF / CRLF line endings, and any Ruby-supported input encoding (via `encoding:`)
38
+ * Duplicate keys (last value wins by default; configurable — see [Configuration Options](./options.md))
39
+
40
+ It raises only on genuinely unparseable input (unterminated string, mismatched bracket), with line and column in the message — never on valid-but-lenient input.
41
+
42
+ ## Nesting & untrusted input
43
+
44
+ Both the C extension and the pure-Ruby parser are **iterative, not recursive** — they track nesting on an explicit, heap-allocated stack rather than the call stack. So deeply nested input **cannot overflow the call stack or segfault**: nesting is bounded only by available memory, the same posture as Oj (the stdlib `json` caps at 100). The trade-off: there is currently **no fixed nesting or input-size limit**, so size-limit untrusted input upstream of the parser.
45
+
46
+ ---------------
47
+
48
+ NEXT: [The Basic Read API](./basic_read_api.md) | UP: [README](../README.md)
data/docs/basic_read_api.md ADDED
@@ -0,0 +1,72 @@
1
+
2
+ ### Contents
3
+
4
+ * [Introduction](./_introduction.md)
5
+ * [**The Basic Read API**](./basic_read_api.md)
6
+ * [The Basic Write API](./basic_write_api.md)
7
+ * [Configuration Options](./options.md)
8
+ * [Examples](./examples.md)
9
+
10
+ --------------
11
+
12
+ # SmarterJSON Basic Read API
13
+
14
+ Reading JSON has one entry point for content and one for files. Both accept the same [options](./options.md), and both take an optional block for streaming.
15
+
16
+ ## `SmarterJSON.process` — read a String or an IO
17
+
18
+ ```ruby
19
+ require "smarter_json"
20
+
21
+ SmarterJSON.process('{"a": 1, "b": [2, 3]}') # => {"a"=>1, "b"=>[2, 3]}
22
+ SmarterJSON.process("host: localhost\nport: 5432") # => {"host"=>"localhost", "port"=>5432} (no braces needed)
23
+ ```
24
+
25
+ `process` is polymorphic: its first argument is **either a String of JSON content or an IO to read from**. A String is always treated as content, never as a filename — use `process_file` for paths.
26
+
27
+ ```ruby
28
+ SmarterJSON.process(io) # an open IO (File, StringIO, socket, …) — reads it and parses
29
+ SmarterJSON.process(some_string) # JSON content
30
+ ```
31
+
32
+ ### Return value depends on how many documents the input holds
33
+
34
+ This is the distinguishing feature: `process` reads multi-document input (NDJSON / JSONL / concatenated / whitespace-separated) automatically, with no block and no special method.
35
+
36
+ ```ruby
37
+ SmarterJSON.process("") # => nil (zero documents)
38
+ SmarterJSON.process('{"id":1}') # => {"id"=>1} (one document → the value itself)
39
+ SmarterJSON.process(%({"id":1}\n{"id":2}\n{"id":3})) # => [{"id"=>1}, {"id"=>2}, {"id"=>3}] (two or more → an Array)
40
+ ```
41
+
42
+ Documents are separated by whitespace, newlines, or simple concatenation — **not** by commas (a comma between top-level documents would be read as an implicit root array, which is not supported). Only SmarterJSON reads this via plain `process`: Oj and the stdlib `json` library raise without a block.
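The zero / one / many rule can be sketched on top of the stdlib `json` library. This is an illustration of the return-value contract only, under the assumption of one strict JSON document per line; `collapse_documents` and `read_documents` are hypothetical names, not part of SmarterJSON.

```ruby
require "json"

# Hypothetical helper: apply the zero / one / many return-value rule.
def collapse_documents(docs)
  case docs.length
  when 0 then nil        # no documents at all
  when 1 then docs.first # a single document is returned as itself
  else docs              # two or more are returned as an Array
  end
end

# Stand-in reader: parses one strict JSON document per non-blank line.
def read_documents(text)
  docs = text.each_line.map(&:strip).reject(&:empty?).map { |line| JSON.parse(line) }
  collapse_documents(docs)
end

read_documents("")                      # => nil
read_documents('{"id":1}')              # => {"id"=>1}
read_documents(%({"id":1}\n{"id":2}))   # => [{"id"=>1}, {"id"=>2}]
```

One consequence of this contract is worth keeping in mind: a caller that always expects an Array should wrap the result (for example with `Array(result)` for Hash-free data, or an explicit check), since a single document comes back unwrapped.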
43
+
44
+ ## `SmarterJSON.process_file` — read a file by path
45
+
46
+ ```ruby
47
+ SmarterJSON.process_file("config.json5") # read the file, then parse — same return-value rules as process
48
+ ```
49
+
50
+ `process_file` opens the file, reads it with the labeled [`encoding:`](./options.md) (default `"UTF-8"`, no transcoding pass), and parses it.
51
+
52
+ ## Streaming with a block (bounded memory)
53
+
54
+ For input larger than memory, pass a block. Each top-level document is yielded as it is read, and the method returns `nil` (it never collects the documents into an Array). Both `process` and `process_file` forward the block.
55
+
56
+ ```ruby
57
+ # Stream straight from disk, one document at a time — the whole file is never loaded:
58
+ SmarterJSON.process_file("events.ndjson") { |event| EventJob.perform_async(event) }
59
+
60
+ # Same for an IO:
61
+ SmarterJSON.process(io) { |doc| handle(doc) }
62
+ ```
63
+
64
+ The streaming path reads the input as newline-delimited documents (NDJSON / JSONL), one document per line. A single document that spans multiple lines is not supported by the streaming path — read it without a block instead.
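The shape of a bounded-memory, line-at-a-time reader can be sketched with the stdlib alone. The helper below is a hypothetical stand-in (`each_document` is not a SmarterJSON method) and accepts strict JSON only; it shows why a document that spans several lines cannot be handled by a per-line streaming loop.

```ruby
require "json"
require "stringio"

# Hypothetical sketch: yield one parsed document per non-blank line and
# return nil, so the parsed documents are never collected in memory.
def each_document(io)
  io.each_line do |line|
    line = line.strip
    next if line.empty?
    yield JSON.parse(line) # each line must be a complete document
  end
  nil
end

seen = []
result = each_document(StringIO.new(%({"id":1}\n\n{"id":2}\n))) { |doc| seen << doc["id"] }
seen   # => [1, 2]
result # => nil
```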
65
+
66
+ ## The C extension and the pure-Ruby fallback
67
+
68
+ By default (`acceleration: true`) the C extension is used when it is compiled and loadable (`SmarterJSON::HAS_ACCELERATION` is then `true`); otherwise the pure-Ruby parser runs and produces identical results. Pass `acceleration: false` to force the pure-Ruby path. See [Configuration Options](./options.md).
69
+
70
+ ---------------
71
+
72
+ PREVIOUS: [Introduction](./_introduction.md) | NEXT: [The Basic Write API](./basic_write_api.md) | UP: [README](../README.md)
data/docs/basic_write_api.md ADDED
@@ -0,0 +1,91 @@
1
+
2
+ ### Contents
3
+
4
+ * [Introduction](./_introduction.md)
5
+ * [The Basic Read API](./basic_read_api.md)
6
+ * [**The Basic Write API**](./basic_write_api.md)
7
+ * [Configuration Options](./options.md)
8
+ * [Examples](./examples.md)
9
+
10
+ --------------
11
+
12
+ # SmarterJSON Basic Write API
13
+
14
+ Writing JSON has one entry point: `SmarterJSON.generate`. It turns a Ruby value into a JSON String — strict, interoperable output by default, or NDJSON when you ask for it.
15
+
16
+ ## `SmarterJSON.generate` — write a Ruby value as JSON
17
+
18
+ ```ruby
19
+ require "smarter_json"
20
+
21
+ SmarterJSON.generate({ "a" => 1, "b" => [2, 3] }) # => '{"a":1,"b":[2,3]}'
22
+ SmarterJSON.generate([1, 2, 3]) # => '[1,2,3]'
23
+ SmarterJSON.generate("hi") # => '"hi"'
24
+ SmarterJSON.generate(42) # => '42'
25
+ SmarterJSON.generate(nil) # => 'null'
26
+ ```
27
+
28
+ The output is always **valid, strict JSON** — there is no lenient write mode. (We are lenient about what we *read*, strict about what we *write*, so the output interoperates with every other JSON parser.)
29
+
30
+ ## How Ruby values map to JSON
31
+
32
+ | Ruby | JSON output |
33
+ |----------------------------------------|---------------------------------------------------------|
34
+ | `Hash` | object `{…}` — keys are stringified (Symbol keys too) |
35
+ | `Array` | array `[…]` |
36
+ | `String` | quoted string, escaped (see below) |
37
+ | `Symbol` | quoted string (`:sym` → `"sym"`) |
38
+ | `Integer` | number |
39
+ | `Float` | number (non-finite raises — see below) |
40
+ | `BigDecimal` | number, full precision (not a string) |
41
+ | `true` / `false` / `nil` | `true` / `false` / `null` |
42
+
43
+ ```ruby
44
+ SmarterJSON.generate({ a: 1, b: :sym }) # => '{"a":1,"b":"sym"}' (Symbol key and value → strings)
45
+ SmarterJSON.generate(BigDecimal("65.613616999999977")) # => '65.613616999999977' (a number, full precision)
46
+ SmarterJSON.generate("café\tx") # => '"café\tx"' (control chars escaped, UTF-8 raw)
47
+ ```
48
+
49
+ Strings escape `"`, `\`, and the control characters `0x00–0x1F`; everything else — including multi-byte UTF-8 — is emitted raw, which is valid JSON.
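The escaping rule above can be written out as a small pure-Ruby sketch. `escape_json_string` is a hypothetical helper, not the gem's writer; the choice of short escapes (`\n`, `\t`, and so on) versus `\uXXXX` for the remaining control characters is an assumption based on common JSON practice.

```ruby
# Short escapes for the characters that have one; other control characters
# fall back to \uXXXX. This table is an assumption, not taken from the gem.
SHORT_ESCAPES = {
  "\"" => "\\\"", "\\" => "\\\\",
  "\b" => "\\b", "\f" => "\\f", "\n" => "\\n", "\r" => "\\r", "\t" => "\\t"
}.freeze

# Hypothetical sketch: escape only `"`, `\`, and 0x00-0x1F; emit the rest raw.
def escape_json_string(str)
  body = str.gsub(/["\\\x00-\x1f]/) do |ch|
    SHORT_ESCAPES[ch] || format("\\u%04x", ch.ord)
  end
  "\"#{body}\""
end

puts escape_json_string("café\tx")    # prints "café\tx" (backslash + t, é left raw)
puts escape_json_string("bell\u0007") # prints "bell\u0007"
```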
50
+
51
+ ## What raises
52
+
53
+ `generate` raises `SmarterJSON::GenerateError` (a subclass of `SmarterJSON::Error`) on input it cannot represent as strict JSON:
54
+
55
+ ```ruby
56
+ SmarterJSON.generate(Time.now) # raises SmarterJSON::Error — unsupported type
57
+ SmarterJSON.generate(Float::INFINITY) # raises SmarterJSON::Error — non-finite Float
58
+ SmarterJSON.generate(Float::NAN) # raises SmarterJSON::Error — non-finite Float
59
+ ```
60
+
61
+ (`Infinity` and `NaN` are accepted on the *read* side as a leniency, but they are not valid JSON to *write*.)
62
+
63
+ ## Writing NDJSON
64
+
65
+ Pass `format: :ndjson` to write newline-delimited JSON. An `Array` writes **one element per line**; any other value writes as a single line. This is the exact inverse of [reading NDJSON](./basic_read_api.md) back into an Array.
66
+
67
+ ```ruby
68
+ SmarterJSON.generate([{ "id" => 1 }, { "id" => 2 }], format: :ndjson) # => "{\"id\":1}\n{\"id\":2}\n"
69
+ SmarterJSON.generate({ "id" => 1 }, format: :ndjson) # => "{\"id\":1}\n" (single value → one line)
70
+ SmarterJSON.generate([], format: :ndjson) # => "" (empty array → no lines)
71
+ ```
72
+
73
+ Note the difference from the default `format: :json`, where a top-level Array is written as a single JSON array (`[…]`), not as NDJSON. See [Configuration Options](./options.md) for the full list of writer options.
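The one-element-per-line rule is easy to mirror with the stdlib `json` library. The sketch below uses a hypothetical `to_ndjson` helper to show the rule; it is not how `SmarterJSON.generate` is implemented.

```ruby
require "json"

# Hypothetical sketch: an Array becomes one JSON line per element,
# any other value becomes a single line, an empty Array becomes "".
def to_ndjson(value)
  items = value.is_a?(Array) ? value : [value]
  items.map { |item| JSON.generate(item) + "\n" }.join
end

to_ndjson([{ "id" => 1 }, { "id" => 2 }]) # => "{\"id\":1}\n{\"id\":2}\n"
to_ndjson({ "id" => 1 })                  # => "{\"id\":1}\n"
to_ndjson([])                             # => ""
```

Note that under this rule an Array nested inside the top-level Array is still written as an ordinary JSON array on its own line; only the outermost Array is split into lines.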
74
+
75
+ ## Round-tripping
76
+
77
+ `process` and `generate` are inverses:
78
+
79
+ ```ruby
80
+ obj = { "a" => 1, "b" => [2, "three", nil, true] }
81
+ SmarterJSON.process(SmarterJSON.generate(obj)) == obj # => true
82
+
83
+ arr = [{ "id" => 1 }, { "id" => 2 }, { "id" => 3 }]
84
+ SmarterJSON.process(SmarterJSON.generate(arr, format: :ndjson)) == arr # => true
85
+ ```
86
+
87
+ Check out the [RSpec tests](../spec/generator_spec.rb) for more examples.
88
+
89
+ ---------------
90
+
91
+ PREVIOUS: [The Basic Read API](./basic_read_api.md) | NEXT: [Configuration Options](./options.md) | UP: [README](../README.md)
data/docs/examples.md ADDED
@@ -0,0 +1,140 @@
1
+
2
+ ### Contents
3
+
4
+ * [Introduction](./_introduction.md)
5
+ * [The Basic Read API](./basic_read_api.md)
6
+ * [The Basic Write API](./basic_write_api.md)
7
+ * [Configuration Options](./options.md)
8
+ * [**Examples**](./examples.md)
9
+
10
+ --------------
11
+
12
+ # Examples
13
+
14
+ **Rescue from `SmarterJSON::Error` (recommended):** SmarterJSON raises only on genuinely unparseable input (an unterminated string, a mismatched bracket), with line and column in the message. Rescuing from `SmarterJSON::Error` lets your application handle bad input gracefully.
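To show why rescuing the base class is enough, here is a self-contained sketch of such an error hierarchy. `SketchJSON` is a hypothetical stand-in namespace so the example runs without the gem; the exact parent of `EncodingError` (directly under `Error`, or under `ParseError`) is an assumption here.

```ruby
# Hypothetical stand-in for the gem's error classes; names mirror the docs,
# but the definitions are illustrative only.
module SketchJSON
  class Error < StandardError; end
  class ParseError < Error; end
  class EncodingError < ParseError; end # assumed placement
  class GenerateError < Error; end
end

# One rescue clause on the base class handles every subclass.
def describe_failure
  yield
  "ok"
rescue SketchJSON::Error => e
  "handled #{e.class.name.split('::').last}: #{e.message}"
end

describe_failure { raise SketchJSON::ParseError, "unterminated string" }
# => "handled ParseError: unterminated string"
describe_failure { raise SketchJSON::GenerateError, "non-finite Float" }
# => "handled GenerateError: non-finite Float"
```

Errors outside the hierarchy (an `ArgumentError` from your own code, for example) are not swallowed by this rescue, which is usually what you want.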
15
+
16
+ ---
17
+
18
+ 1. [Read a JSON String](#example-1-read-a-json-string)
19
+ 2. [Read a JSON File](#example-2-read-a-json-file)
20
+ 3. [Implicit Root Object (config-style, no braces)](#example-3-implicit-root-object-config-style-no-braces)
21
+ 4. [Multiple Documents (NDJSON) → Array](#example-4-multiple-documents-ndjson--array)
22
+ 5. [Streaming a Large File with a Block](#example-5-streaming-a-large-file-with-a-block)
23
+ 6. [Symbolize Keys](#example-6-symbolize-keys)
24
+ 7. [Duplicate Keys](#example-7-duplicate-keys)
25
+ 8. [High-Precision Numbers: BigDecimal vs Float](#example-8-high-precision-numbers-bigdecimal-vs-float)
26
+ 9. [Lenient Input: Comments, Trailing Commas, Unquoted Keys](#example-9-lenient-input-comments-trailing-commas-unquoted-keys)
27
+ 10. [Write JSON](#example-10-write-json)
28
+ 11. [Write NDJSON](#example-11-write-ndjson)
29
+ 12. [Round-Trip Read and Write](#example-12-round-trip-read-and-write)
30
+
31
+ ---
32
+
33
+ ### Example 1: Read a JSON String
34
+
35
+ ```ruby
36
+ require "smarter_json"
37
+
38
+ SmarterJSON.process('{"a": 1, "b": [2, 3]}') # => {"a"=>1, "b"=>[2, 3]}
39
+ ```
40
+
41
+ ### Example 2: Read a JSON File
42
+
43
+ ```ruby
44
+ SmarterJSON.process_file("config.json") # => the parsed value
45
+ ```
46
+
47
+ `process_file` opens the file, reads it with the labeled [`encoding:`](./options.md) (default `"UTF-8"`), and parses it.
48
+
49
+ ### Example 3: Implicit Root Object (config-style, no braces)
50
+
51
+ A config file that starts with `key: value` and has no outer `{}` is read as an object:
52
+
53
+ ```ruby
54
+ SmarterJSON.process("host: localhost\nport: 5432") # => {"host"=>"localhost", "port"=>5432}
55
+ ```
56
+
57
+ ### Example 4: Multiple Documents (NDJSON) → Array
58
+
59
+ Plain `process` reads NDJSON / JSONL / concatenated documents with no block and no special method. Zero documents → `nil`, one → its value, two or more → an `Array`:
60
+
61
+ ```ruby
62
+ SmarterJSON.process(%({"id":1}\n{"id":2}\n{"id":3})) # => [{"id"=>1}, {"id"=>2}, {"id"=>3}]
63
+ SmarterJSON.process('{"id":1}') # => {"id"=>1}
64
+ SmarterJSON.process("") # => nil
65
+ ```
66
+
67
+ ### Example 5: Streaming a Large File with a Block
68
+
69
+ For input larger than memory, pass a block. Each document is yielded as it is read; the whole file is never loaded:
70
+
71
+ ```ruby
72
+ SmarterJSON.process_file("events.ndjson") { |event| EventJob.perform_async(event) }
73
+ ```
74
+
75
+ ### Example 6: Symbolize Keys
76
+
77
+ ```ruby
78
+ SmarterJSON.process('{"a": 1, "b": 2}', symbolize_keys: true) # => {:a=>1, :b=>2}
79
+ ```
80
+
81
+ ### Example 7: Duplicate Keys
82
+
83
+ By default the last value wins. Choose `:first_wins` or `:raise` instead:
84
+
85
+ ```ruby
86
+ SmarterJSON.process('{"a":1,"a":2}') # => {"a"=>2} (:last_wins, the default)
87
+ SmarterJSON.process('{"a":1,"a":2}', duplicate_key: :first_wins) # => {"a"=>1}
88
+ SmarterJSON.process('{"a":1,"a":2}', duplicate_key: :raise) # raises SmarterJSON::ParseError
89
+ ```
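The three policies can be sketched as a small pure-Ruby function over key/value pairs. `build_object` is a hypothetical helper for illustration; it raises `ArgumentError` where the gem raises `SmarterJSON::ParseError`.

```ruby
# Hypothetical sketch of the duplicate-key policies described above.
def build_object(pairs, duplicate_key: :last_wins)
  pairs.each_with_object({}) do |(key, value), obj|
    unless obj.key?(key)
      obj[key] = value
      next
    end

    case duplicate_key
    when :last_wins
      obj[key] = value # overwrite with the later value
    when :first_wins
      nil              # keep the value that is already stored
    when :raise
      raise ArgumentError, "duplicate key #{key.inspect}"
    else
      raise ArgumentError, "unknown duplicate_key policy #{duplicate_key.inspect}"
    end
  end
end

pairs = [["a", 1], ["a", 2]]
build_object(pairs)                             # => {"a"=>2}
build_object(pairs, duplicate_key: :first_wins) # => {"a"=>1}
```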
90
+
91
+ ### Example 8: High-Precision Numbers: BigDecimal vs Float
92
+
93
+ The default `:auto` keeps high-precision decimals as `BigDecimal` (matching Oj). Force `Float` for raw speed when you don't need the precision:
94
+
95
+ ```ruby
96
+ SmarterJSON.process("65.613616999999977") # => BigDecimal (:auto, the default)
97
+ SmarterJSON.process("65.613616999999977", bigdecimal_load: :float) # => 65.613616999999977 (a Float)
98
+ ```
99
+
100
+ ### Example 9: Lenient Input: Comments, Trailing Commas, Unquoted Keys
101
+
102
+ ```ruby
103
+ SmarterJSON.process(<<~JSON)
104
+ {
105
+ host: localhost, # unquoted key, quoteless value, and a trailing comma
106
+ port: 5432,
107
+ /* block comment */
108
+ url: http://example.com
109
+ }
110
+ JSON
111
+ # => {"host"=>"localhost", "port"=>5432, "url"=>"http://example.com"}
112
+ ```
113
+
114
+ A `#`/`//` only starts a comment when preceded by whitespace, so `http://example.com` stays a string rather than being truncated.
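The whitespace rule can be approximated with a single regular expression. The helper below is a hypothetical sketch, not the gem's scanner: it treats the start of the line as a valid comment position (an assumption) and does not understand quoted strings, so it is only meant to show why `http://` survives.

```ruby
# Hypothetical sketch: drop a trailing line comment, where `#` or `//` opens
# a comment only at the start of the line or directly after whitespace.
def strip_line_comment(line)
  line.chomp.sub(/(\A|\s)(\#|\/\/).*/, "").rstrip
end

strip_line_comment("port: 5432 # db port")    # => "port: 5432"
strip_line_comment("url: http://example.com") # => "url: http://example.com"
strip_line_comment("// just a note")          # => ""
```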
115
+
116
+ ### Example 10: Write JSON
117
+
118
+ ```ruby
119
+ SmarterJSON.generate({ "a" => 1, "b" => [2, 3] }) # => '{"a":1,"b":[2,3]}'
120
+ SmarterJSON.generate([1, 2, 3]) # => '[1,2,3]'
121
+ ```
122
+
123
+ ### Example 11: Write NDJSON
124
+
125
+ An Array writes one element per line:
126
+
127
+ ```ruby
128
+ SmarterJSON.generate([{ "id" => 1 }, { "id" => 2 }], format: :ndjson) # => "{\"id\":1}\n{\"id\":2}\n"
129
+ ```
130
+
131
+ ### Example 12: Round-Trip Read and Write
132
+
133
+ ```ruby
134
+ obj = { "a" => 1, "b" => [2, "three", nil, true] }
135
+ SmarterJSON.process(SmarterJSON.generate(obj)) == obj # => true
136
+ ```
137
+
138
+ ---------------
139
+
140
+ PREVIOUS: [Configuration Options](./options.md) | UP: [README](../README.md)