smarter_json 0.9.2 → 0.9.9
- checksums.yaml +4 -4
- data/.gitignore +1 -0
- data/CHANGELOG.md +77 -54
- data/README.md +215 -72
- data/docs/_introduction.md +6 -12
- data/docs/basic_read_api.md +29 -19
- data/docs/basic_write_api.md +2 -2
- data/docs/examples.md +32 -23
- data/docs/options.md +14 -14
- data/ext/smarter_json/smarter_json.c +223 -89
- data/ext/smarter_json/vendor/LICENSE-fast_float-MIT +27 -0
- data/ext/smarter_json/vendor/eisel_lemire.h +117 -0
- data/ext/smarter_json/vendor/eisel_lemire.md +29 -0
- data/ext/smarter_json/vendor/eisel_lemire_powers.h +663 -0
- data/lib/smarter_json/backports.rb +28 -0
- data/lib/smarter_json/options.rb +52 -0
- data/lib/smarter_json/parser.rb +400 -139
- data/lib/smarter_json/version.rb +1 -1
- data/lib/smarter_json.rb +3 -1
- metadata +9 -5
- data/ext/smarter_json/vendor/ryu.h +0 -819
- data/ext/smarter_json/vendor/ryu.md +0 -22
data/docs/_introduction.md
CHANGED
|
@@ -11,7 +11,7 @@
|
|
|
11
11
|
|
|
12
12
|
# SmarterJSON Introduction
|
|
13
13
|
|
|
14
|
-
`smarter_json` is a fast, lenient JSON
|
|
14
|
+
`smarter_json` is a fast, lenient JSON processor for Ruby. It reads strict JSON, JSON5, HJSON-style config, newline-delimited JSON (NDJSON / JSONL), markdown-wrapped / chatty blobs around a JSON payload, and the messy JSON-ish input humans actually paste — and in benchmarks it matches or beats Oj on every file. It is opinionated: it optimizes for getting your data out, not for policing the JSON spec. Where other parsers stop at the first deviation, SmarterJSON keeps going.
|
|
15
15
|
|
|
16
16
|
## Why another JSON library?
|
|
17
17
|
|
|
@@ -21,27 +21,21 @@ Most JSON parsers reject anything that isn't perfectly strict JSON, and they mak
|
|
|
21
21
|
|
|
22
22
|
* **One reader, no modes, no flags.** There is no `dialect:` option and no "strict mode" — `SmarterJSON.process(input)` accepts the whole superset, and strict JSON is simply the narrowest case. You don't configure the reader to match your input; it adapts to whatever you give it.
|
|
23
23
|
|
|
24
|
-
* **It reads multi-document input automatically — a distinguishing feature.** `SmarterJSON.process` handles NDJSON / JSONL / concatenated JSON with **no block and no special method**:
|
|
24
|
+
* **It reads multi-document input automatically — a distinguishing feature.** `SmarterJSON.process` handles NDJSON / JSONL / concatenated JSON with **no block and no special method**: it always returns an `Array` of the documents found (`[]` / `[doc]` / `[d1, d2, …]`). For the common single-document case, `SmarterJSON.process_one` returns the one value directly (and warns, never raises, if there was more than one). The same rule applies when wrapper noise is stripped and several payloads are recovered from one blob. **Only SmarterJSON reads multi-document input via plain `process` — Oj and the stdlib `json` library raise without a block.** For input larger than memory, pass a block to stream one document at a time. See [The Basic Read API](./basic_read_api.md).
|
|
25
25
|
|
|
26
|
-
* **It's fast.** A C extension (with a pure-Ruby fallback that runs everywhere)
|
|
26
|
+
* **It's fast.** A C extension (with a pure-Ruby fallback that runs everywhere) matches or beats Oj on every file we benchmark, and is competitive with the stdlib `json` C parser. Floats are decoded with the **Eisel-Lemire** algorithm (fast_float), correctly rounded and bit-for-bit identical to `JSON.parse`, so number-heavy data is fast and exact.
|
|
27
27
|
|
|
28
28
|
* **It writes JSON too.** `SmarterJSON.generate` turns Ruby values into strict, interoperable JSON — or into NDJSON, one element per line, the exact inverse of reading NDJSON back into an Array. See [The Basic Write API](./basic_write_api.md).
|
|
29
29
|
|
|
30
30
|
## What it accepts, beyond strict JSON
|
|
31
31
|
|
|
32
|
-
|
|
33
|
-
* Trailing commas; unquoted keys (`{host: localhost}`); single-quoted, triple-quoted (`'''…'''`), and quoteless string values
|
|
34
|
-
* Implicit root object — a config file that starts with `key: value`, no outer `{}`
|
|
35
|
-
* `NaN`, `Infinity`, hex (`0xFF`), leading `+` / `.`, underscores in numbers (`1_000_000`)
|
|
36
|
-
* UTF-8 BOM, smart/curly quotes, Python literals (`True` / `False` / `None`), JavaScript `undefined`
|
|
37
|
-
* Mixed CR / LF / CRLF line endings, and any Ruby-supported input encoding (via `encoding:`)
|
|
38
|
-
* Duplicate keys (last value wins by default; configurable — see [Configuration Options](./options.md))
|
|
32
|
+
Comments (`//`, `/* … */`, `#` — a `#`/`//` only starts a comment when preceded by whitespace, so `url: http://x.com` reads as a string, not a truncated value), markdown-wrapped / chatty blobs around the payload, trailing commas, unquoted / single- / triple-quoted / quoteless strings, an implicit root object (`key: value`, no braces), `NaN` / `Infinity` / hex / underscored numbers, Python (`True` / `False` / `None`) and JavaScript (`undefined`) literals, smart quotes, a UTF-8 BOM, mixed CR / LF / CRLF line endings, any Ruby-supported input encoding (via `encoding:`), and duplicate keys. The full list — with the human-JSON spec references it's drawn from — is kept in one place: [**What it accepts, beyond strict JSON**](../README.md#what-it-accepts-beyond-strict-json) in the README.
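The whitespace rule for comment markers can be illustrated with a small stand-alone sketch. This is not the library's scanner: `strip_line_comment` and `COMMENT_MARKER` are hypothetical names, the helper ignores quoting entirely, and treating the start of a line as "preceded by whitespace" is an assumption made here for illustration.

```ruby
# Hypothetical helper, NOT SmarterJSON's implementation: removes a trailing
# line comment only when the marker (# or //) is at the start of the line or
# follows whitespace. Expects a single line without its trailing newline.
COMMENT_MARKER = %r{(?:\A|\s+)(?:#|//).*\z}

def strip_line_comment(line)
  line.sub(COMMENT_MARKER, "")
end

strip_line_comment("url: http://x.com")      # "//" follows ":", nothing is stripped
strip_line_comment("port: 5432 # database")  # "#" follows a space, the comment is dropped
```

Because the `//` in a URL is glued to the preceding `:`, the value survives intact, which is the behavior the paragraph above describes.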
|
|
39
33
|
|
|
40
|
-
It raises only on genuinely
|
|
34
|
+
It raises only on genuinely unreadable input (unterminated string, mismatched bracket), with line and column in the message — never on valid-but-lenient input.
|
|
41
35
|
|
|
42
36
|
## Nesting & untrusted input
|
|
43
37
|
|
|
44
|
-
Both the C extension and the pure-Ruby
|
|
38
|
+
Both the C extension and the pure-Ruby engine are **iterative, not recursive**: they track nesting on an explicit, heap-allocated stack rather than the call stack. Deeply nested input therefore **cannot overflow the call stack or segfault**; nesting is bounded only by available memory, the same posture as Oj (the stdlib `json` caps nesting at 100 by default). The trade-off: there is currently **no fixed nesting or input-size limit**, so enforce a size limit on untrusted input before it reaches the parser.
|
|
45
39
|
|
|
46
40
|
---------------
|
|
47
41
|
|
data/docs/basic_read_api.md
CHANGED
|
@@ -18,14 +18,15 @@ Reading JSON has one entry point for content and one for files. Both accept the
|
|
|
18
18
|
```ruby
|
|
19
19
|
require "smarter_json"
|
|
20
20
|
|
|
21
|
-
SmarterJSON.process('{"a": 1, "b": [2, 3]}') # => {"a"=>1, "b"=>[2, 3]}
|
|
22
|
-
SmarterJSON.
|
|
21
|
+
SmarterJSON.process('{"a": 1, "b": [2, 3]}') # => [{"a"=>1, "b"=>[2, 3]}] (always an Array of documents)
|
|
22
|
+
SmarterJSON.process_one('{"a": 1, "b": [2, 3]}') # => {"a"=>1, "b"=>[2, 3]} (the single document's value)
|
|
23
|
+
SmarterJSON.process_one("host: localhost\nport: 5432") # => {"host"=>"localhost", "port"=>5432} (no braces needed)
|
|
23
24
|
```
|
|
24
25
|
|
|
25
|
-
`process`
|
|
26
|
+
`process` accepts **either a String of JSON content or an IO to read from** as its first argument. A String is always treated as content, never as a filename — use `process_file` for paths. When the input wraps the payload in obvious markdown / prose / tags, `process` strips that wrapper first and then reads the recovered payload(s).
|
|
26
27
|
|
|
27
|
-
|
|
28
|
-
SmarterJSON.
|
|
28
|
+
````ruby
|
|
29
|
+
SmarterJSON.process_one(<<~TEXT)
|
|
29
30
|
Here is the JSON:
|
|
30
31
|
|
|
31
32
|
```json
|
|
@@ -36,7 +37,7 @@ SmarterJSON.process(<<~TEXT)
|
|
|
36
37
|
TEXT
|
|
37
38
|
# => {"a"=>1}
|
|
38
39
|
|
|
39
|
-
SmarterJSON.
|
|
40
|
+
SmarterJSON.process_one(<<~TEXT)
|
|
40
41
|
Here is the result:
|
|
41
42
|
|
|
42
43
|
{
|
|
@@ -55,36 +56,45 @@ SmarterJSON.process(<<~TEXT)
|
|
|
55
56
|
{"b":2}
|
|
56
57
|
TEXT
|
|
57
58
|
# => [{"a"=>1}, {"b"=>2}]
|
|
58
|
-
|
|
59
|
+
````
|
|
59
60
|
|
|
60
61
|
```ruby
|
|
61
|
-
SmarterJSON.process(io) # an open IO (File, StringIO, socket, …) — reads it and
|
|
62
|
+
SmarterJSON.process(io) # an open IO (File, StringIO, socket, …) — reads it and extracts the data
|
|
62
63
|
SmarterJSON.process(some_string) # JSON content
|
|
63
64
|
```
|
|
64
65
|
|
|
65
|
-
###
|
|
66
|
+
### `process` always returns an Array of documents
|
|
66
67
|
|
|
67
|
-
This is the distinguishing feature: `process` reads multi-document input (NDJSON / JSONL / concatenated
|
|
68
|
+
This is the distinguishing feature: `process` reads multi-document input (NDJSON / JSONL / concatenated) automatically, with no block and no special method, and **always returns an `Array` of the documents** it found:
|
|
68
69
|
|
|
69
70
|
```ruby
|
|
70
|
-
SmarterJSON.process("") # =>
|
|
71
|
-
SmarterJSON.process('{"id":1}') # => {"id"=>1}
|
|
72
|
-
SmarterJSON.process(%({"id":1}\n{"id":2}\n{"id":3})) # => [{"id"=>1}, {"id"=>2}, {"id"=>3}]
|
|
71
|
+
SmarterJSON.process("") # => [] (zero documents)
|
|
72
|
+
SmarterJSON.process('{"id":1}') # => [{"id"=>1}] (one document, still an Array)
|
|
73
|
+
SmarterJSON.process(%({"id":1}\n{"id":2}\n{"id":3})) # => [{"id"=>1}, {"id"=>2}, {"id"=>3}]
|
|
73
74
|
```
|
|
74
75
|
|
|
75
|
-
|
|
76
|
+
For the common single-document case, **`process_one`** returns the one value directly — and *warns* (never raises) if there was more than one, so a stray extra document is never dropped silently:
|
|
77
|
+
|
|
78
|
+
```ruby
|
|
79
|
+
SmarterJSON.process_one('{"id":1}') # => {"id"=>1}
|
|
80
|
+
SmarterJSON.process_one("") # => nil
|
|
81
|
+
```
|
|
82
|
+
|
|
83
|
+
> Type-checking the result? Use `result.is_a?(Array)`, not `result.class == Array` — idiomatic, and future-proof if the return ever becomes a specialized `Array` subclass.
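The difference between the two checks shows up as soon as a subclass is involved. In this sketch `DocumentList` is a hypothetical `Array` subclass invented for illustration; nothing in the source says the library defines such a class.

```ruby
# DocumentList is a made-up Array subclass, used only to show the difference.
class DocumentList < Array; end

result = DocumentList.new([{ "id" => 1 }])

result.is_a?(Array)     # true: is_a? accepts subclasses
result.class == Array   # false: the class is DocumentList, not Array
```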
|
|
84
|
+
|
|
85
|
+
Documents are separated by **newlines, commas, RS (0x1E), or simple concatenation (self-delimiting values)** — **never by a space**. A top-level value must be a recognized JSON value (number / `true` / `false` / `null` / quoted string / object / array) or an implicit-root object (`host: localhost`); a bare top-level run such as `localhost` or `1 2 3` raises `ParseError`. (Quoteless string values *inside* objects and arrays are unchanged.) If wrapper noise is stripped and several payloads are recovered, they come back by the same rule — an `Array` of payloads (`process_one` returns the first). Only SmarterJSON reads multi-document input via plain `process`: Oj and the stdlib `json` library raise without a block.
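To make the stdlib comparison concrete, here is a sketch using only the stdlib `json` library. It shows the whole-input call failing on newline-delimited documents and the manual per-line framing a caller would otherwise write. It covers only the newline separator, not the comma, RS, or concatenation cases listed above.

```ruby
require "json"

ndjson = %({"id":1}\n{"id":2}\n{"id":3})

begin
  JSON.parse(ndjson)
  whole_input = :parsed
rescue JSON::ParserError
  whole_input = :parser_error   # the stdlib parser expects exactly one document
end

# Manual framing: parse each non-blank line as its own document.
documents = ndjson.each_line.map(&:strip).reject(&:empty?).map { |line| JSON.parse(line) }
```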
|
|
76
86
|
|
|
77
87
|
## `SmarterJSON.process_file` — read a file by path
|
|
78
88
|
|
|
79
89
|
```ruby
|
|
80
|
-
SmarterJSON.process_file("config.json5") # read the file, then
|
|
90
|
+
SmarterJSON.process_file("config.json5") # read the file, then process — same return-value rules as process
|
|
81
91
|
```
|
|
82
92
|
|
|
83
|
-
`process_file` opens the file, reads it with the labeled [`encoding:`](./options.md) (default `"UTF-8"`, no transcoding pass), and
|
|
93
|
+
`process_file` opens the file, reads it with the labeled [`encoding:`](./options.md) (default `"UTF-8"`, no transcoding pass), and processes it.
|
|
84
94
|
|
|
85
95
|
## Streaming with a block (bounded memory)
|
|
86
96
|
|
|
87
|
-
For input larger than memory, pass a block. Each recovered top-level document is yielded as it is framed, and the method returns
|
|
97
|
+
For input larger than memory, pass a block. Each recovered top-level document is yielded as it is framed, and the method returns the **document count** instead of collecting the documents into an Array. Both `process` and `process_file` forward the block.
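The yield-and-count contract can be sketched with the stdlib alone. `each_ndjson_document` is a hypothetical helper, not part of SmarterJSON, and it frames one document per line, which is narrower than the whole-document framing described here.

```ruby
require "json"
require "stringio"

# Hypothetical helper: yields one parsed document per non-blank line and
# returns how many were yielded, holding only one document at a time.
def each_ndjson_document(io)
  count = 0
  io.each_line do |line|
    next if line.strip.empty?
    yield JSON.parse(line)
    count += 1
  end
  count
end

io = StringIO.new(%({"id":1}\n{"id":2}\n\n{"id":3}\n))
ids = []
count = each_ndjson_document(io) { |doc| ids << doc["id"] }
```

Returning the count rather than an Array keeps memory bounded while still letting the caller confirm how many documents were processed.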
|
|
88
98
|
|
|
89
99
|
```ruby
|
|
90
100
|
SmarterJSON.process_file("events.ndjson") { |event| EventJob.perform_async(event) }
|
|
@@ -95,7 +105,7 @@ The streaming path now frames whole top-level documents, not just one line at a
|
|
|
95
105
|
|
|
96
106
|
## The C extension and the pure-Ruby fallback
|
|
97
107
|
|
|
98
|
-
By default (`acceleration: true`) the C extension is used when it is compiled and loadable (`SmarterJSON::HAS_ACCELERATION` is then `true`); otherwise the pure-Ruby
|
|
108
|
+
By default (`acceleration: true`) the C extension is used when it is compiled and loadable (`SmarterJSON::HAS_ACCELERATION` is then `true`); otherwise the pure-Ruby implementation runs and produces identical results. Pass `acceleration: false` to force the pure-Ruby path. See [Configuration Options](./options.md).
|
|
99
109
|
|
|
100
110
|
## Seeing what was fixed: `on_warning:`
|
|
101
111
|
|
|
@@ -104,7 +114,7 @@ By default (`acceleration: true`) the C extension is used when it is compiled an
|
|
|
104
114
|
```ruby
|
|
105
115
|
warns = []
|
|
106
116
|
result = SmarterJSON.process("[1,,2]", on_warning: ->(w) { warns << w })
|
|
107
|
-
result # => [1, 2]
|
|
117
|
+
result # => [[1, 2]] (one document: the array [1, 2] with the empty slot collapsed; process always returns an Array of documents)
|
|
108
118
|
warns.map(&:type) # => [:empty_slot]
|
|
109
119
|
warns.first.to_s # => "extra comma, collapsed an empty slot at line 1, col 4"
|
|
110
120
|
```
|
data/docs/basic_write_api.md
CHANGED
|
@@ -127,11 +127,11 @@ Note the difference from the default `format: :json`, where a top-level Array is
|
|
|
127
127
|
|
|
128
128
|
## Round-tripping
|
|
129
129
|
|
|
130
|
-
`process`
|
|
130
|
+
`process_one` reads one document back, `process` reads many (NDJSON) back into an `Array` — each is the inverse of the matching `generate`:
|
|
131
131
|
|
|
132
132
|
```ruby
|
|
133
133
|
obj = { "a" => 1, "b" => [2, "three", nil, true] }
|
|
134
|
-
SmarterJSON.
|
|
134
|
+
SmarterJSON.process_one(SmarterJSON.generate(obj)) == obj # => true
|
|
135
135
|
|
|
136
136
|
arr = [{ "id" => 1 }, { "id" => 2 }, { "id" => 3 }]
|
|
137
137
|
SmarterJSON.process(SmarterJSON.generate(arr, format: :ndjson)) == arr # => true
|
data/docs/examples.md
CHANGED
|
@@ -11,7 +11,9 @@
|
|
|
11
11
|
|
|
12
12
|
# Examples
|
|
13
13
|
|
|
14
|
-
**Rescue from `SmarterJSON::Error` (recommended):** SmarterJSON raises only on genuinely
|
|
14
|
+
**Rescue from `SmarterJSON::Error` (recommended):** SmarterJSON raises only on genuinely unreadable input (an unterminated string, a mismatched bracket), with line and column in the message. Rescuing from `SmarterJSON::Error` lets your application handle bad input gracefully.
|
|
15
|
+
|
|
16
|
+
**`process` vs `process_one`:** `SmarterJSON.process` is the preferred call — it always returns an `Array` of documents, so the count is explicit and you never silently drop one. `SmarterJSON.process_one` is the convenience for the single-document case: it returns that one document's value directly, and *warns* (never raises) if the input turned out to hold more than one. Both appear below; reach for `process` unless you specifically want the single value.
|
|
15
17
|
|
|
16
18
|
---
|
|
17
19
|
|
|
@@ -36,38 +38,46 @@
|
|
|
36
38
|
```ruby
|
|
37
39
|
require "smarter_json"
|
|
38
40
|
|
|
39
|
-
SmarterJSON.process('{"a": 1, "b": [2, 3]}')
|
|
41
|
+
SmarterJSON.process('{"a": 1, "b": [2, 3]}') # => [{"a"=>1, "b"=>[2, 3]}] (always an Array of documents)
|
|
42
|
+
SmarterJSON.process_one('{"a": 1, "b": [2, 3]}') # => {"a"=>1, "b"=>[2, 3]} (the one document's value)
|
|
40
43
|
```
|
|
41
44
|
|
|
42
45
|
### Example 2: Read a JSON File
|
|
43
46
|
|
|
44
47
|
```ruby
|
|
45
|
-
SmarterJSON.process_file("config.json") # =>
|
|
48
|
+
SmarterJSON.process_file("config.json") # => an Array of documents (same return rules as process)
|
|
46
49
|
```
|
|
47
50
|
|
|
48
|
-
`process_file` opens the file, reads it with the labeled [`encoding:`](./options.md) (default `"UTF-8"`), and
|
|
51
|
+
`process_file` opens the file, reads it with the labeled [`encoding:`](./options.md) (default `"UTF-8"`), and processes it.
|
|
49
52
|
|
|
50
53
|
### Example 3: Implicit Root Object (config-style, no braces)
|
|
51
54
|
|
|
52
55
|
A config file that starts with `key: value` and has no outer `{}` is read as an object:
|
|
53
56
|
|
|
54
57
|
```ruby
|
|
55
|
-
SmarterJSON.
|
|
58
|
+
SmarterJSON.process_one("host: localhost\nport: 5432") # => {"host"=>"localhost", "port"=>5432}
|
|
56
59
|
```
|
|
57
60
|
|
|
58
61
|
### Example 4: Multiple Documents (NDJSON) → Array
|
|
59
62
|
|
|
60
|
-
Plain `process` reads NDJSON / JSONL / concatenated documents with no block and no special method
|
|
63
|
+
Plain `process` reads NDJSON / JSONL / concatenated documents with no block and no special method, and always returns an `Array` — `[]` for none, `[doc]` for one, `[d1, d2, …]` for several:
|
|
61
64
|
|
|
62
65
|
```ruby
|
|
63
66
|
SmarterJSON.process(%({"id":1}\n{"id":2}\n{"id":3})) # => [{"id"=>1}, {"id"=>2}, {"id"=>3}]
|
|
64
|
-
SmarterJSON.process('{"id":1}') # => {"id"=>1}
|
|
65
|
-
SmarterJSON.process("") # =>
|
|
67
|
+
SmarterJSON.process('{"id":1}') # => [{"id"=>1}] (one document, still an Array)
|
|
68
|
+
SmarterJSON.process("") # => [] (zero documents)
|
|
69
|
+
```
|
|
70
|
+
|
|
71
|
+
For the single-document case, `process_one` returns the one value directly — and *warns* (never raises) if there was more than one:
|
|
72
|
+
|
|
73
|
+
```ruby
|
|
74
|
+
SmarterJSON.process_one('{"id":1}') # => {"id"=>1}
|
|
75
|
+
SmarterJSON.process_one("") # => nil
|
|
66
76
|
```
|
|
67
77
|
|
|
68
78
|
### Example 5: Streaming a Large File with a Block
|
|
69
79
|
|
|
70
|
-
For input larger than memory, pass a block. Each recovered document is yielded one at a time
|
|
80
|
+
For input larger than memory, pass a block. Each recovered document is yielded one at a time, and the method returns the **document count** instead of building an `Array`:
|
|
71
81
|
|
|
72
82
|
```ruby
|
|
73
83
|
SmarterJSON.process_file("events.ndjson") { |event| EventJob.perform_async(event) }
|
|
@@ -76,17 +86,16 @@ SmarterJSON.process_file("events.ndjson") { |event| EventJob.perform_async(event
|
|
|
76
86
|
### Example 6: Symbolize Keys
|
|
77
87
|
|
|
78
88
|
```ruby
|
|
79
|
-
SmarterJSON.
|
|
89
|
+
SmarterJSON.process_one('{"a": 1, "b": 2}', symbolize_keys: true) # => {:a=>1, :b=>2}
|
|
80
90
|
```
|
|
81
91
|
|
|
82
92
|
### Example 7: Duplicate Keys
|
|
83
93
|
|
|
84
|
-
By default the last value wins.
|
|
94
|
+
By default the last value wins. Pass `duplicate_key: :first_wins` to keep the first instead (either way, the repeat is reported through [`on_warning`](./options.md)):
|
|
85
95
|
|
|
86
96
|
```ruby
|
|
87
|
-
SmarterJSON.
|
|
88
|
-
SmarterJSON.
|
|
89
|
-
SmarterJSON.process('{"a":1,"a":2}', duplicate_key: :raise) # raises SmarterJSON::ParseError
|
|
97
|
+
SmarterJSON.process_one('{"a":1,"a":2}') # => {"a"=>2} (:last_wins, the default)
|
|
98
|
+
SmarterJSON.process_one('{"a":1,"a":2}', duplicate_key: :first_wins) # => {"a"=>1}
|
|
90
99
|
```
|
|
91
100
|
|
|
92
101
|
### Example 8: High-Precision Numbers: BigDecimal vs Float
|
|
@@ -94,14 +103,14 @@ SmarterJSON.process('{"a":1,"a":2}', duplicate_key: :raise) # raises SmarterJS
|
|
|
94
103
|
The default `:auto` keeps high-precision decimals as `BigDecimal` (matching Oj). Force `Float` for raw speed when you don't need the precision:
|
|
95
104
|
|
|
96
105
|
```ruby
|
|
97
|
-
SmarterJSON.
|
|
98
|
-
SmarterJSON.
|
|
106
|
+
SmarterJSON.process_one("65.613616999999977") # => BigDecimal (:auto, the default)
|
|
107
|
+
SmarterJSON.process_one("65.613616999999977", decimal_precision: :float) # => 65.613616999999977 (a Float)
|
|
99
108
|
```
|
|
100
109
|
|
|
101
110
|
### Example 9: Lenient Input: Comments, Trailing Commas, Unquoted Keys
|
|
102
111
|
|
|
103
112
|
```ruby
|
|
104
|
-
SmarterJSON.
|
|
113
|
+
SmarterJSON.process_one(<<~JSON)
|
|
105
114
|
{
|
|
106
115
|
host: localhost, # unquoted key, quoteless value, and a trailing comma
|
|
107
116
|
port: 5432,
|
|
@@ -118,8 +127,8 @@ A `#`/`//` only starts a comment when preceded by whitespace, so `http://example
|
|
|
118
127
|
|
|
119
128
|
#### Fenced payload
|
|
120
129
|
|
|
121
|
-
|
|
122
|
-
SmarterJSON.
|
|
130
|
+
````ruby
|
|
131
|
+
SmarterJSON.process_one(<<~TEXT)
|
|
123
132
|
Here is the JSON:
|
|
124
133
|
|
|
125
134
|
```json
|
|
@@ -129,12 +138,12 @@ SmarterJSON.process(<<~TEXT)
|
|
|
129
138
|
```
|
|
130
139
|
TEXT
|
|
131
140
|
# => {"a"=>1}
|
|
132
|
-
|
|
141
|
+
````
|
|
133
142
|
|
|
134
143
|
#### Prose before / after the payload
|
|
135
144
|
|
|
136
145
|
```ruby
|
|
137
|
-
SmarterJSON.
|
|
146
|
+
SmarterJSON.process_one(<<~TEXT)
|
|
138
147
|
Here is the result:
|
|
139
148
|
|
|
140
149
|
{
|
|
@@ -149,7 +158,7 @@ TEXT
|
|
|
149
158
|
#### Wrapper tags
|
|
150
159
|
|
|
151
160
|
```ruby
|
|
152
|
-
SmarterJSON.
|
|
161
|
+
SmarterJSON.process_one("<json>{\"a\":1}</json>")
|
|
153
162
|
# => {"a"=>1}
|
|
154
163
|
```
|
|
155
164
|
|
|
@@ -185,7 +194,7 @@ SmarterJSON.generate([{ "id" => 1 }, { "id" => 2 }], format: :ndjson) # => "{\
|
|
|
185
194
|
|
|
186
195
|
```ruby
|
|
187
196
|
obj = { "a" => 1, "b" => [2, "three", nil, true] }
|
|
188
|
-
SmarterJSON.
|
|
197
|
+
SmarterJSON.process_one(SmarterJSON.generate(obj)) == obj # => true
|
|
189
198
|
```
|
|
190
199
|
|
|
191
200
|
---------------
|
data/docs/options.md
CHANGED
|
@@ -13,32 +13,32 @@
|
|
|
13
13
|
|
|
14
14
|
## Reading
|
|
15
15
|
|
|
16
|
-
These options are passed to [`SmarterJSON.process`](./basic_read_api.md) and `SmarterJSON.process_file` as the second argument; anything you set overrides the defaults below.
|
|
16
|
+
These options are passed to [`SmarterJSON.process`](./basic_read_api.md), `SmarterJSON.process_one`, and `SmarterJSON.process_file` as the second argument; anything you set overrides the defaults below.
|
|
17
17
|
|
|
18
18
|
| Option | Default | Explanation |
|
|
19
19
|
|-------------------|--------------|------------------------------------------------------------------------------------------------------------------------|
|
|
20
|
-
| `:
|
|
21
|
-
| `:
|
|
22
|
-
| `:
|
|
23
|
-
| `:acceleration` | `true` | Use the C extension when it is compiled and loadable; `false` forces the pure-Ruby parser. Both produce identical results. |
|
|
20
|
+
| `:acceleration` | `true` | Use the C extension when it is compiled and loadable; `false` forces the pure-Ruby implementation. Both produce identical results. |
|
|
21
|
+
| `:decimal_precision` | `:auto` | `:auto` keeps high-precision decimals as `BigDecimal` (matches Oj); `:float` forces every number to `Float`; `:bigdecimal` forces every decimal to `BigDecimal`. |
|
|
22
|
+
| `:duplicate_key` | `:last_wins` | How to handle a key that repeats within one object: `:last_wins` or `:first_wins`. (Every repeat is also reported through `:on_warning` — see below.) |
|
|
24
23
|
| `:encoding` | `nil` | Labels the input's encoding (e.g. `"UTF-8"`). It does **not** trigger a transcoding pass — see below. |
|
|
25
24
|
| `:on_warning` | `nil` | A callable invoked once per lenient fix applied, passed a `SmarterJSON::Warning`. Never changes the return value. See below. |
|
|
25
|
+
| `:symbolize_keys` | `false` | Return object keys as Symbols instead of Strings. |
|
|
26
26
|
|
|
27
27
|
```ruby
|
|
28
|
-
SmarterJSON.
|
|
29
|
-
SmarterJSON.
|
|
30
|
-
SmarterJSON.
|
|
31
|
-
SmarterJSON.
|
|
28
|
+
SmarterJSON.process_one('{"a": 1}', symbolize_keys: true) # => {:a=>1}
|
|
29
|
+
SmarterJSON.process_one('{"a":1,"a":2}', duplicate_key: :first_wins) # => {"a"=>1} (default keeps the 2)
|
|
30
|
+
SmarterJSON.process_one(big_decimal_json, decimal_precision: :float) # every number as Float (fastest)
|
|
31
|
+
SmarterJSON.process_one("[1,,2]", on_warning: ->(w) { puts w }) # => [1, 2], and prints the warning
|
|
32
32
|
```
|
|
33
33
|
|
|
34
34
|
### A note on `:on_warning`
|
|
35
35
|
|
|
36
|
-
`smarter_json` is lenient by design — it salvages your data instead of rejecting the whole document over a stray comma. `on_warning:` keeps that, but also hands you a record of what it had to fix, so leniency is transparent rather than silent. It takes a callable that
|
|
36
|
+
`smarter_json` is lenient by design: it salvages your data instead of rejecting the whole document over a stray comma. `on_warning:` keeps that, but also hands you a record of what it had to fix, so leniency is transparent rather than silent. It takes a callable that SmarterJSON invokes once per fix, passing a `SmarterJSON::Warning` that exposes `type` (a Symbol), `message`, `line`, and `col`. It never changes the return value: `process` still returns its `Array` of documents and `process_one` its single value. It fires on every path, including the streaming block form. With no handler (the default), nothing is recorded and there is zero overhead.
|
|
37
37
|
|
|
38
38
|
```ruby
|
|
39
39
|
warns = []
|
|
40
40
|
result = SmarterJSON.process("[1,,2]", on_warning: ->(w) { warns << w })
|
|
41
|
-
result # => [1, 2]
|
|
41
|
+
result # => [[1, 2]] (one document: the array [1, 2], with the empty slot collapsed)
|
|
42
42
|
warns.map(&:type) # => [:empty_slot]
|
|
43
43
|
warns.first.to_s # => "extra comma, collapsed an empty slot at line 1, col 4"
|
|
44
44
|
```
|
|
@@ -47,11 +47,11 @@ The warning types are `:empty_slot` (a collapsed empty comma slot, e.g. `[1,,2]`
|
|
|
47
47
|
|
|
48
48
|
### A note on `:encoding`
|
|
49
49
|
|
|
50
|
-
`:encoding` labels what the input *is* — it does not transcode.
|
|
50
|
+
`:encoding` labels what the input *is* — it does not transcode. SmarterJSON works on the bytes in their native encoding and emits string values with the same encoding tag, the same way `smarter_csv` handles encodings. Bytes that are invalid for the claimed encoding raise `SmarterJSON::EncodingError` (a kind of `SmarterJSON::ParseError`). A UTF-8 BOM is handled automatically; UTF-16 / UTF-32 input is out of scope.
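The distinction between labeling and transcoding is the same one Ruby's core `String` API draws between `force_encoding` and `encode`. The sketch below uses only core Ruby and does not call SmarterJSON; it illustrates the concept, not the library's internals.

```ruby
bytes = "caf\xE9".b                                   # "café" as ISO-8859-1 bytes

labeled    = bytes.dup.force_encoding("ISO-8859-1")   # relabel only; bytes untouched
transcoded = labeled.encode("UTF-8")                  # a real transcoding pass

labeled.bytes      # [99, 97, 102, 233]
transcoded.bytes   # [99, 97, 102, 195, 169]

# Labeling the same bytes as UTF-8 leaves them invalid for that encoding.
mislabeled = bytes.dup.force_encoding("UTF-8")
mislabeled.valid_encoding?                            # false
```

The last two lines show the kind of mismatch between a claimed encoding and the actual bytes that the paragraph above says is reported as an error.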
|
|
51
51
|
|
|
52
|
-
### A note on `:
|
|
52
|
+
### A note on `:decimal_precision`
|
|
53
53
|
|
|
54
|
-
The default `:auto` preserves high-precision numbers as `BigDecimal`, matching Oj's default. That is intrinsically slower than producing `Float` on number-heavy files (e.g. `canada.json`). For raw speed when you don't need the extra precision, pass `
|
|
54
|
+
The default `:auto` preserves high-precision numbers as `BigDecimal`, matching Oj's default. That is intrinsically slower than producing `Float` on number-heavy files (e.g. `canada.json`). For raw speed when you don't need the extra precision, pass `decimal_precision: :float`.
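The precision trade-off can be seen without the library, using Ruby's own `BigDecimal` and `Float` conversions on the number from the examples. This is a stdlib-only sketch of the two representations; it does not show how SmarterJSON decides between them.

```ruby
require "bigdecimal"

text = "65.613616999999977"

exact  = BigDecimal(text)   # keeps every digit that was written
approx = Float(text)        # nearest IEEE 754 double to the written value

exact.to_s("F")             # "65.613616999999977"
approx.class                # Float
```

A `Float` is cheaper to build and compute with, but only the `BigDecimal` is guaranteed to reproduce the original digits when written back out.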
|
|
55
55
|
|
|
56
56
|
## Writing
|
|
57
57
|
|