smarter_json 0.8.0 → 0.9.2
- checksums.yaml +4 -4
- data/CHANGELOG.md +13 -0
- data/README.md +42 -6
- data/docs/_introduction.md +1 -1
- data/docs/basic_read_api.md +35 -2
- data/docs/examples.md +29 -4
- data/ext/smarter_json/smarter_json.c +46 -16
- data/lib/smarter_json/parser.rb +325 -62
- data/lib/smarter_json/version.rb +1 -1
- metadata +1 -1
checksums.yaml
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
---
|
|
2
2
|
SHA256:
|
|
3
|
-
metadata.gz:
|
|
4
|
-
data.tar.gz:
|
|
3
|
+
metadata.gz: 2256f81fe3b29e83a42dcf948db896a03cdac568bbc799cb3b63b9516a76592d
|
|
4
|
+
data.tar.gz: c13d572f3cb417fdffc16423a38e180018f121adbc31cbbc2490bc39576bf7b5
|
|
5
5
|
SHA512:
|
|
6
|
-
metadata.gz:
|
|
7
|
-
data.tar.gz:
|
|
6
|
+
metadata.gz: 8ccaf09a845726e751740a870f62e008fac272bf314f6de88bba069663fa1fb9ba890d469bb1010ee55def74eb291f2953251372a2adcebae8d641a0609ff541
|
|
7
|
+
data.tar.gz: ce622275f2c90fc5044a0c0a9c2c8efcd601326c54f1869cb62241e3f78a784d9d237c9ff9f4c5b44e734562b22bcc744c0a14577641336ec5adeb24e1926c29
|
data/CHANGELOG.md
CHANGED
|
@@ -3,6 +3,19 @@
|
|
|
3
3
|
|
|
4
4
|
> 🚧 Getting ready for the 1.0.0 release - sorry for the interface changes - thank you for your patience! 🚧
|
|
5
5
|
|
|
6
|
+
## 0.9.2 (2026-06-03)
|
|
7
|
+
- **Fix a residual performance regression affecting every large document.** The "leading label" check (for `JSON: {…}`, which parses successfully but wrongly as an implicit-root object) now uses `String#start_with?(/…/)` instead of `match?(/\A…/)`. A `\A`-anchored `match?` is **not** anchor-optimized — it retries at every byte position and so scanned the entire input (~0.3 s on a 200 MB document) on every parse, which had quietly taxed every large file since the wrapper was introduced (deeply_nested.json and big_decimals.json sat well below their 0.6.0 throughput even after 0.9.1). `start_with?` inspects only the beginning, restoring — and slightly exceeding — 0.6.0 throughput across the board.
|
|
8
|
+
|
|
9
|
+
## 0.9.1 (2026-06-03 unreleased)
|
|
10
|
+
- **Fix a major performance regression on real-world data** (introduced with the 0.8.0 wrapper recovery). Wrapper recovery is now **reactive**: input is parsed first, and the markdown-fence / `<json>` / prose extraction runs only when that parse actually fails. Before, any input that merely *contained* ` ``` ` or `<json>` anywhere — including inside ordinary JSON string values, as GitHub-event payloads and other markdown-bearing data routinely do — was dragged through a full pure-Ruby recovery scan plus a double parse on every call (~30–45× slower on those files). A bare leading label like `JSON: {…}`, which parses successfully but wrongly, is still caught up front before parsing.
|
|
11
|
+
- **Streaming framer**: a multi-byte marker (`//`, `/*`, `'''`, `*/`) whose bytes straddle a read-chunk boundary is no longer mis-scanned — the framer waits for the rest of the marker before deciding, so a brace inside such a comment/string can no longer end a document early.
|
|
12
|
+
- Wrapper warnings (`code_fence_stripped` / `wrapper_tag_stripped`) now fire only when the marker is actually in the stripped text, not when it sits inside a recovered payload's own string value.
|
|
13
|
+
- Shared `SmarterJSON::Bytes` constants for the parser and the framer / recovery scanners (no raw hex byte literals).
|
|
14
|
+
|
|
15
|
+
## 0.9.0 (2026-06-03 unreleased)
|
|
16
|
+
- performance improvements
|
|
17
|
+
- code cleanup
|
|
18
|
+
|
|
6
19
|
## 0.8.0 (2026-06-03)
|
|
7
20
|
- **Robustness** against LLM-generated / wrapped JSON:
|
|
8
21
|
- strips markdown code fences (```json / ```)
|
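The reactive strategy described in the 0.9.1 entry (parse first, run wrapper extraction only when that parse fails) can be sketched roughly as below. This is an illustration under stated assumptions, not the gem's implementation: it uses the stdlib `json` parser as a stand-in, and `parse_reactively` and `extract_fenced` are hypothetical names.

```ruby
require "json"

# Hypothetical helper: return the body of the first markdown code fence, or nil.
def extract_fenced(text)
  match = text.match(/```(?:json)?[ \t]*\r?\n(.*?)```/m)
  match && match[1]
end

# Parse first; only when that parse fails, run the slower wrapper scan.
def parse_reactively(text)
  JSON.parse(text)
rescue JSON::ParserError
  payload = extract_fenced(text)
  raise if payload.nil? # nothing recoverable: re-raise the original parse error

  JSON.parse(payload)
end

# Clean JSON whose string value merely contains a fence stays on the fast path.
parse_reactively('{"body":"```ruby\nputs 1\n```"}')

# Wrapped output fails the first parse and is recovered in the rescue branch.
parse_reactively("Here is the JSON:\n```json\n{\"a\": 1}\n```\n")
```

In this sketch the cost of the fence scan is paid only by input that does not parse as-is, which is the property the changelog entry attributes to the reactive design.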
data/README.md
CHANGED
|
@@ -16,7 +16,7 @@ Three things set it apart:
|
|
|
16
16
|
|
|
17
17
|
1. **One parser, no modes, no flags.** There is no `dialect:` option and no "strict mode" — `SmarterJSON.process(input)` accepts the whole superset, and strict JSON is simply the narrowest case. You don't configure the parser to match your input; it adapts to whatever you give it.
|
|
18
18
|
|
|
19
|
-
2. **It parses multi-document input automatically — a distinguishing feature.** `SmarterJSON.process` handles NDJSON / JSONL / concatenated JSON with **no block and no special method**: one document returns its value, several documents return an `Array`, empty input returns `nil`. The same rule applies when wrapper noise is stripped and several payloads are recovered from one blob. **Only SmarterJSON parses multi-document input via plain `process` — Oj and the stdlib `json` library raise without a block.**
|
|
19
|
+
2. **It parses multi-document input automatically — a distinguishing feature.** `SmarterJSON.process` handles NDJSON / JSONL / concatenated JSON with **no block and no special method**: one document returns its value, several documents return an `Array`, empty input returns `nil`. The same rule applies when wrapper noise is stripped and several payloads are recovered from one blob. **Only SmarterJSON parses multi-document input via plain `process` — Oj and the stdlib `json` library raise without a block.** For input larger than memory, pass a block to stream one document at a time.
|
|
20
20
|
|
|
21
21
|
3. **It's fast.** A C extension (with a pure-Ruby fallback that runs everywhere) puts it ahead of Oj on nearly every file we benchmark, and competitive with the stdlib `json` C parser — the fastest general-purpose Ruby JSON parser.
|
|
22
22
|
|
|
@@ -46,6 +46,12 @@ gem install smarter_json
|
|
|
46
46
|
|
|
47
47
|
The C extension is built on install and used automatically. On platforms where it can't build, the pure-Ruby parser runs instead and produces identical results.
|
|
48
48
|
|
|
49
|
+
## API stability and thread safety
|
|
50
|
+
|
|
51
|
+
The public API is now considered stable: `SmarterJSON.process`, `SmarterJSON.process_file`, `SmarterJSON.generate`, and the documented options in this README/docs are the supported surface.
|
|
52
|
+
|
|
53
|
+
Concurrent calls are safe. The parser/generator keep per-call state local, and the C extension only caches Ruby IDs / constants at load time; it does not share mutable parse state across calls.
|
|
54
|
+
|
|
49
55
|
## Documentation
|
|
50
56
|
|
|
51
57
|
* [Introduction](docs/_introduction.md)
|
|
@@ -68,14 +74,44 @@ SmarterJSON.process(%({"id":1}\n{"id":2}\n{"id":3})) # => [{"id"=>1}, {"id"=>2
|
|
|
68
74
|
SmarterJSON.process('{"id":1}') # => {"id"=>1} (one document → the value itself)
|
|
69
75
|
SmarterJSON.process("") # => nil (zero documents)
|
|
70
76
|
|
|
71
|
-
#
|
|
77
|
+
# For input larger than memory, stream one document at a time with a block
|
|
72
78
|
# (process and process_file both forward the block):
|
|
73
79
|
SmarterJSON.process_file("events.ndjson") { |event| EventJob.perform_async(event) }
|
|
74
80
|
|
|
75
81
|
# Wrapper noise is stripped automatically:
|
|
76
|
-
SmarterJSON.process(
|
|
77
|
-
|
|
78
|
-
|
|
82
|
+
SmarterJSON.process(<<~TEXT)
|
|
83
|
+
Here is the JSON:
|
|
84
|
+
|
|
85
|
+
```json
|
|
86
|
+
{
|
|
87
|
+
"a": 1
|
|
88
|
+
}
|
|
89
|
+
```
|
|
90
|
+
TEXT
|
|
91
|
+
# => {"a"=>1}
|
|
92
|
+
|
|
93
|
+
SmarterJSON.process(<<~TEXT)
|
|
94
|
+
Here is the result:
|
|
95
|
+
|
|
96
|
+
{
|
|
97
|
+
"a": 1
|
|
98
|
+
}
|
|
99
|
+
|
|
100
|
+
Hope this helps.
|
|
101
|
+
TEXT
|
|
102
|
+
# => {"a"=>1}
|
|
103
|
+
|
|
104
|
+
SmarterJSON.process("<json>{\"a\":1}</json>")
|
|
105
|
+
# => {"a"=>1}
|
|
106
|
+
|
|
107
|
+
SmarterJSON.process(<<~TEXT)
|
|
108
|
+
first attempt:
|
|
109
|
+
{"a":1}
|
|
110
|
+
|
|
111
|
+
corrected payload:
|
|
112
|
+
{"b":2}
|
|
113
|
+
TEXT
|
|
114
|
+
# => [{"a"=>1}, {"b"=>2}]
|
|
79
115
|
```
|
|
80
116
|
|
|
81
117
|
### Options
|
|
@@ -112,7 +148,7 @@ Benchmarks: p10 of 40 runs, Apple M1 Max, Ruby 3.4.7, on the standard JSON corpu
|
|
|
112
148
|
|
|
113
149
|
**Two notes on fair comparison:**
|
|
114
150
|
|
|
115
|
-
- **NDJSON:** on multi-document files, **only SmarterJSON parses the input via plain `process`** — Oj and `json` raise without a block, so their cells are `N/A`. That `N/A` reflects real default behavior, not a measurement gap. Plain `process` collects every document into an Array at ~270 MB/s; the block form
|
|
151
|
+
- **NDJSON:** on multi-document files, **only SmarterJSON parses the input via plain `process`** — Oj and `json` raise without a block, so their cells are `N/A`. That `N/A` reflects real default behavior, not a measurement gap. Plain `process` collects every document into an Array at ~270 MB/s; the streaming block form runs faster (~440 MB/s) because it doesn't hold all documents in memory at once.
|
|
116
152
|
- **High-precision decimals (e.g. `canada.json`):** SmarterJSON's default `:auto` mode preserves high-precision numbers as `BigDecimal` (matching Oj's default), which is intrinsically slower than `Float`. Against `Float`-producing parsers it looks slower on such files; pass `bigdecimal_load: :float` to compare like-for-like (it then runs much faster). Against the equivalent `BigDecimal`-producing Oj mode, SmarterJSON is faster.
|
|
117
153
|
|
|
118
154
|
## Encoding
|
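The return rule that the README states for plain `process` (zero documents give `nil`, one document gives its value, several give an `Array`) can be illustrated with a small stand-in. This is a sketch only: `collapse_documents` and `parse_ndjson` are hypothetical names, and the stdlib `json` parser with one document per line replaces SmarterJSON's own framing.

```ruby
require "json"

# Hypothetical helper that applies the documented return rule.
def collapse_documents(docs)
  case docs.length
  when 0 then nil
  when 1 then docs.first
  else docs
  end
end

# Stand-in multi-document reader: one strict-JSON document per non-blank line.
def parse_ndjson(text)
  docs = text.each_line.map(&:strip).reject(&:empty?).map { |line| JSON.parse(line) }
  collapse_documents(docs)
end

parse_ndjson("")                         # zero documents
parse_ndjson('{"id":1}')                 # one document
parse_ndjson(%({"id":1}\n{"id":2}))      # two documents
```

One consequence of this convention, at least in the sketch, is that a caller who needs to tell "one document that is an array" from "several documents" has to use the block form instead.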
data/docs/_introduction.md
CHANGED
|
@@ -21,7 +21,7 @@ Most JSON parsers reject anything that isn't perfectly strict JSON, and they mak
|
|
|
21
21
|
|
|
22
22
|
* **One reader, no modes, no flags.** There is no `dialect:` option and no "strict mode" — `SmarterJSON.process(input)` accepts the whole superset, and strict JSON is simply the narrowest case. You don't configure the reader to match your input; it adapts to whatever you give it.
|
|
23
23
|
|
|
24
|
-
* **It reads multi-document input automatically — a distinguishing feature.** `SmarterJSON.process` handles NDJSON / JSONL / concatenated JSON with **no block and no special method**: zero documents returns `nil`, one document returns its value, two or more return an `Array`. The same rule applies when wrapper noise is stripped and several payloads are recovered from one blob. **Only SmarterJSON reads multi-document input via plain `process` — Oj and the stdlib `json` library raise without a block.**
|
|
24
|
+
* **It reads multi-document input automatically — a distinguishing feature.** `SmarterJSON.process` handles NDJSON / JSONL / concatenated JSON with **no block and no special method**: zero documents returns `nil`, one document returns its value, two or more return an `Array`. The same rule applies when wrapper noise is stripped and several payloads are recovered from one blob. **Only SmarterJSON reads multi-document input via plain `process` — Oj and the stdlib `json` library raise without a block.** For input larger than memory, pass a block to stream one document at a time. See [The Basic Read API](./basic_read_api.md).
|
|
25
25
|
|
|
26
26
|
* **It's fast.** A C extension (with a pure-Ruby fallback that runs everywhere) puts it ahead of Oj on nearly every file we benchmark, and competitive with the stdlib `json` C parser. Floats are parsed with Ryū (correctly rounded, single-pass), so number-heavy data is fast and bit-exact.
|
|
27
27
|
|
data/docs/basic_read_api.md
CHANGED
|
@@ -24,6 +24,39 @@ SmarterJSON.process("host: localhost\nport: 5432") # => {"host"=>"localhost"
|
|
|
24
24
|
|
|
25
25
|
`process` is polymorphic: its first argument is **either a String of JSON content or an IO to read from**. A String is always treated as content, never as a filename — use `process_file` for paths. When the input wraps the payload in obvious markdown / prose / tags, `process` strips that wrapper first and then parses the recovered payload(s).
|
|
26
26
|
|
|
27
|
+
```ruby
|
|
28
|
+
SmarterJSON.process(<<~TEXT)
|
|
29
|
+
Here is the JSON:
|
|
30
|
+
|
|
31
|
+
```json
|
|
32
|
+
{
|
|
33
|
+
"a": 1
|
|
34
|
+
}
|
|
35
|
+
```
|
|
36
|
+
TEXT
|
|
37
|
+
# => {"a"=>1}
|
|
38
|
+
|
|
39
|
+
SmarterJSON.process(<<~TEXT)
|
|
40
|
+
Here is the result:
|
|
41
|
+
|
|
42
|
+
{
|
|
43
|
+
"a": 1
|
|
44
|
+
}
|
|
45
|
+
|
|
46
|
+
Hope this helps.
|
|
47
|
+
TEXT
|
|
48
|
+
# => {"a"=>1}
|
|
49
|
+
|
|
50
|
+
SmarterJSON.process(<<~TEXT)
|
|
51
|
+
first attempt:
|
|
52
|
+
{"a":1}
|
|
53
|
+
|
|
54
|
+
corrected payload:
|
|
55
|
+
{"b":2}
|
|
56
|
+
TEXT
|
|
57
|
+
# => [{"a"=>1}, {"b"=>2}]
|
|
58
|
+
```
|
|
59
|
+
|
|
27
60
|
```ruby
|
|
28
61
|
SmarterJSON.process(io) # an open IO (File, StringIO, socket, …) — reads it and parses
|
|
29
62
|
SmarterJSON.process(some_string) # JSON content
|
|
@@ -49,9 +82,9 @@ SmarterJSON.process_file("config.json5") # read the file, then parse — sam
|
|
|
49
82
|
|
|
50
83
|
`process_file` opens the file, reads it with the labeled [`encoding:`](./options.md) (default `"UTF-8"`, no transcoding pass), and parses it.
|
|
51
84
|
|
|
52
|
-
## Streaming with a block
|
|
85
|
+
## Streaming with a block (bounded memory)
|
|
53
86
|
|
|
54
|
-
|
|
87
|
+
For input larger than memory, pass a block. Each recovered top-level document is yielded as it is framed, and the method returns `nil` instead of collecting the documents into an Array. Both `process` and `process_file` forward the block.
|
|
55
88
|
|
|
56
89
|
```ruby
|
|
57
90
|
SmarterJSON.process_file("events.ndjson") { |event| EventJob.perform_async(event) }
|
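The block form described in this file (each document is yielded as it is framed, and the call returns `nil` rather than an Array) follows a common Ruby streaming pattern. A minimal sketch, assuming one strict-JSON document per line and using the stdlib `json` parser; `each_ndjson_document` is a hypothetical name, not part of SmarterJSON:

```ruby
require "json"
require "stringio"

# Hypothetical line-based streamer: yields each document as soon as its line
# has been read and returns nil, so no Array of all documents is ever built.
def each_ndjson_document(io)
  io.each_line do |line|
    line = line.strip
    next if line.empty?

    yield JSON.parse(line)
  end
  nil
end

seen = []
result = each_ndjson_document(StringIO.new(%({"id":1}\n\n{"id":2}\n))) { |doc| seen << doc["id"] }
```

Because only the current document is alive at any time, memory use is bounded by the largest single document rather than by the size of the whole input.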
data/docs/examples.md
CHANGED
|
@@ -65,9 +65,9 @@ SmarterJSON.process('{"id":1}') # => {"id"=>1}
|
|
|
65
65
|
SmarterJSON.process("") # => nil
|
|
66
66
|
```
|
|
67
67
|
|
|
68
|
-
### Example 5:
|
|
68
|
+
### Example 5: Streaming a Large File with a Block
|
|
69
69
|
|
|
70
|
-
|
|
70
|
+
For input larger than memory, pass a block. Each recovered document is yielded one at a time:
|
|
71
71
|
|
|
72
72
|
```ruby
|
|
73
73
|
SmarterJSON.process_file("events.ndjson") { |event| EventJob.perform_async(event) }
|
|
@@ -116,6 +116,8 @@ A `#`/`//` only starts a comment when preceded by whitespace, so `http://example
|
|
|
116
116
|
|
|
117
117
|
### Example 10: Wrapper Noise Around a Payload
|
|
118
118
|
|
|
119
|
+
#### Fenced payload
|
|
120
|
+
|
|
119
121
|
```ruby
|
|
120
122
|
SmarterJSON.process(<<~TEXT)
|
|
121
123
|
Here is the JSON:
|
|
@@ -127,15 +129,38 @@ SmarterJSON.process(<<~TEXT)
|
|
|
127
129
|
```
|
|
128
130
|
TEXT
|
|
129
131
|
# => {"a"=>1}
|
|
132
|
+
```
|
|
133
|
+
|
|
134
|
+
#### Prose before / after the payload
|
|
135
|
+
|
|
136
|
+
```ruby
|
|
137
|
+
SmarterJSON.process(<<~TEXT)
|
|
138
|
+
Here is the result:
|
|
139
|
+
|
|
140
|
+
{
|
|
141
|
+
"a": 1
|
|
142
|
+
}
|
|
130
143
|
|
|
144
|
+
Hope this helps.
|
|
145
|
+
TEXT
|
|
146
|
+
# => {"a"=>1}
|
|
147
|
+
```
|
|
148
|
+
|
|
149
|
+
#### Wrapper tags
|
|
150
|
+
|
|
151
|
+
```ruby
|
|
131
152
|
SmarterJSON.process("<json>{\"a\":1}</json>")
|
|
132
153
|
# => {"a"=>1}
|
|
154
|
+
```
|
|
133
155
|
|
|
156
|
+
#### Multiple recovered payloads from one noisy blob
|
|
157
|
+
|
|
158
|
+
```ruby
|
|
134
159
|
SmarterJSON.process(<<~TEXT)
|
|
135
|
-
first:
|
|
160
|
+
first attempt:
|
|
136
161
|
{"a":1}
|
|
137
162
|
|
|
138
|
-
|
|
163
|
+
corrected payload:
|
|
139
164
|
{"b":2}
|
|
140
165
|
TEXT
|
|
141
166
|
# => [{"a"=>1}, {"b"=>2}]
|
data/ext/smarter_json/smarter_json.c
CHANGED
|
@@ -41,6 +41,18 @@ static VALUE fj_sym_duplicate_key;
|
|
|
41
41
|
static ID fj_bigdecimal_id; /* cached BigDecimal() method id (set in Init) */
|
|
42
42
|
static ID fj_to_sym_id; /* cached :to_sym (symbolize_keys) */
|
|
43
43
|
static ID fj_key_p_id; /* cached :key? (non-default duplicate_key modes) */
|
|
44
|
+
static ID fj_force_encoding_id;
|
|
45
|
+
static ID fj_valid_encoding_p_id;
|
|
46
|
+
static ID fj_encoding_id;
|
|
47
|
+
static ID fj_name_id;
|
|
48
|
+
static VALUE fj_sym_encoding;
|
|
49
|
+
static VALUE fj_sym_symbolize_keys;
|
|
50
|
+
static VALUE fj_sym_first_wins;
|
|
51
|
+
static VALUE fj_sym_raise;
|
|
52
|
+
static VALUE fj_sym_bigdecimal_load;
|
|
53
|
+
static VALUE fj_sym_float;
|
|
54
|
+
static VALUE fj_sym_bigdecimal;
|
|
55
|
+
static VALUE fj_sym_on_warning;
|
|
44
56
|
|
|
45
57
|
/* Per-parse direct-mapped key cache: key bytes -> the interned (frozen,
|
|
46
58
|
* globally-rooted) String, so repeated keys skip the global fstring lookup.
|
|
@@ -373,11 +385,17 @@ static void fj_consume_keyword(fj_state *st, const char *word) {
|
|
|
373
385
|
fj_advance(st, n);
|
|
374
386
|
}
|
|
375
387
|
|
|
376
|
-
/* Copy a byte range into a fresh String, dropping underscores.
|
|
388
|
+
/* Copy a byte range into a fresh String, dropping underscores. Copies whole
|
|
389
|
+
* underscore-free runs in bulk, rather than one byte at a time. */
|
|
377
390
|
static VALUE fj_strip_underscores(const char *p, long n) {
|
|
378
391
|
VALUE s = rb_str_buf_new(n);
|
|
379
|
-
long i;
|
|
380
|
-
|
|
392
|
+
long i = 0;
|
|
393
|
+
while (i < n) {
|
|
394
|
+
long start = i;
|
|
395
|
+
while (i < n && p[i] != '_') i++;
|
|
396
|
+
if (i > start) rb_str_buf_cat(s, p + start, i - start);
|
|
397
|
+
if (i < n) i++; /* skip '_' */
|
|
398
|
+
}
|
|
381
399
|
return s;
|
|
382
400
|
}
|
|
383
401
|
|
|
@@ -1379,14 +1397,14 @@ static VALUE fj_parse_c(VALUE self, VALUE input, VALUE opts) {
|
|
|
1379
1397
|
|
|
1380
1398
|
Check_Type(input, T_STRING);
|
|
1381
1399
|
|
|
1382
|
-
enc_opt = rb_hash_aref(opts,
|
|
1400
|
+
enc_opt = rb_hash_aref(opts, fj_sym_encoding);
|
|
1383
1401
|
if (!NIL_P(enc_opt)) {
|
|
1384
|
-
input = rb_funcall(rb_str_dup(input),
|
|
1402
|
+
input = rb_funcall(rb_str_dup(input), fj_force_encoding_id, 1, enc_opt);
|
|
1385
1403
|
}
|
|
1386
|
-
if (!RTEST(rb_funcall(input,
|
|
1387
|
-
VALUE name = rb_funcall(rb_funcall(input,
|
|
1404
|
+
if (!RTEST(rb_funcall(input, fj_valid_encoding_p_id, 0))) {
|
|
1405
|
+
VALUE name = rb_funcall(rb_funcall(input, fj_encoding_id, 0), fj_name_id, 0);
|
|
1388
1406
|
VALUE msg = rb_sprintf("invalid byte sequence for %" PRIsVALUE, name);
|
|
1389
|
-
rb_exc_raise(rb_funcall(cEncodingError,
|
|
1407
|
+
rb_exc_raise(rb_funcall(cEncodingError, fj_new_id, 3, msg, Qnil, Qnil));
|
|
1390
1408
|
}
|
|
1391
1409
|
|
|
1392
1410
|
st.buf = RSTRING_PTR(input);
|
|
@@ -1402,19 +1420,19 @@ static VALUE fj_parse_c(VALUE self, VALUE input, VALUE opts) {
|
|
|
1402
1420
|
st.kcache = NULL;
|
|
1403
1421
|
#endif
|
|
1404
1422
|
|
|
1405
|
-
st.symbolize_keys = RTEST(rb_hash_aref(opts,
|
|
1406
|
-
dk = rb_hash_aref(opts,
|
|
1407
|
-
st.dup_first_wins = (dk ==
|
|
1408
|
-
st.dup_raise = (dk ==
|
|
1423
|
+
st.symbolize_keys = RTEST(rb_hash_aref(opts, fj_sym_symbolize_keys));
|
|
1424
|
+
dk = rb_hash_aref(opts, fj_sym_duplicate_key);
|
|
1425
|
+
st.dup_first_wins = (dk == fj_sym_first_wins);
|
|
1426
|
+
st.dup_raise = (dk == fj_sym_raise);
|
|
1409
1427
|
|
|
1410
1428
|
{
|
|
1411
|
-
VALUE bd = rb_hash_aref(opts,
|
|
1412
|
-
if (bd ==
|
|
1413
|
-
else if (bd ==
|
|
1429
|
+
VALUE bd = rb_hash_aref(opts, fj_sym_bigdecimal_load);
|
|
1430
|
+
if (bd == fj_sym_float) st.bigdecimal_load = 0;
|
|
1431
|
+
else if (bd == fj_sym_bigdecimal) st.bigdecimal_load = 2;
|
|
1414
1432
|
else st.bigdecimal_load = 1; /* :auto (default), including nil */
|
|
1415
1433
|
}
|
|
1416
1434
|
|
|
1417
|
-
st.on_warning = rb_hash_aref(opts,
|
|
1435
|
+
st.on_warning = rb_hash_aref(opts, fj_sym_on_warning); /* Qnil when absent */
|
|
1418
1436
|
|
|
1419
1437
|
if (st.len >= 3 && (unsigned char)st.buf[0] == 0xEF &&
|
|
1420
1438
|
(unsigned char)st.buf[1] == 0xBB && (unsigned char)st.buf[2] == 0xBF) {
|
|
@@ -1465,8 +1483,20 @@ void Init_smarter_json(void) {
|
|
|
1465
1483
|
fj_key_p_id = rb_intern("key?");
|
|
1466
1484
|
fj_new_id = rb_intern("new");
|
|
1467
1485
|
fj_call_id = rb_intern("call");
|
|
1486
|
+
fj_force_encoding_id = rb_intern("force_encoding");
|
|
1487
|
+
fj_valid_encoding_p_id = rb_intern("valid_encoding?");
|
|
1488
|
+
fj_encoding_id = rb_intern("encoding");
|
|
1489
|
+
fj_name_id = rb_intern("name");
|
|
1468
1490
|
fj_sym_empty_slot = ID2SYM(rb_intern("empty_slot"));
|
|
1469
1491
|
fj_sym_empty_value = ID2SYM(rb_intern("empty_value"));
|
|
1470
1492
|
fj_sym_duplicate_key = ID2SYM(rb_intern("duplicate_key"));
|
|
1493
|
+
fj_sym_encoding = ID2SYM(rb_intern("encoding"));
|
|
1494
|
+
fj_sym_symbolize_keys = ID2SYM(rb_intern("symbolize_keys"));
|
|
1495
|
+
fj_sym_first_wins = ID2SYM(rb_intern("first_wins"));
|
|
1496
|
+
fj_sym_raise = ID2SYM(rb_intern("raise"));
|
|
1497
|
+
fj_sym_bigdecimal_load = ID2SYM(rb_intern("bigdecimal_load"));
|
|
1498
|
+
fj_sym_float = ID2SYM(rb_intern("float"));
|
|
1499
|
+
fj_sym_bigdecimal = ID2SYM(rb_intern("bigdecimal"));
|
|
1500
|
+
fj_sym_on_warning = ID2SYM(rb_intern("on_warning"));
|
|
1471
1501
|
rb_define_module_function(mSmarterJSON, "parse_c", fj_parse_c, 2);
|
|
1472
1502
|
}
|
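The rewritten `fj_strip_underscores` copies whole underscore-free runs instead of appending one byte at a time. The same loop, ported to Ruby purely for illustration (the method name `strip_underscores` is an assumption; the gem's pure-Ruby path may do this differently):

```ruby
# Illustrative Ruby port of the run-copying loop from the C hunk.
def strip_underscores(bytes)
  underscore = "_".ord
  out = +""
  i = 0
  n = bytes.bytesize
  while i < n
    start = i
    # Advance over a run that contains no underscore.
    i += 1 while i < n && bytes.getbyte(i) != underscore
    # Append the whole run in one call.
    out << bytes.byteslice(start, i - start) if i > start
    # Skip the underscore that ended the run, if any.
    i += 1 if i < n
  end
  out
end

strip_underscores("1_000_000")
```

In everyday Ruby, `"1_000_000".delete("_")` gives the same result; the explicit loop is shown only to mirror the structure of the C code.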
data/lib/smarter_json/parser.rb
CHANGED
|
@@ -63,22 +63,300 @@ module SmarterJSON
|
|
|
63
63
|
end
|
|
64
64
|
end
|
|
65
65
|
|
|
66
|
-
# Stream documents from an IO,
|
|
67
|
-
#
|
|
68
|
-
# spanning multiple lines is not supported by the streaming path.
|
|
66
|
+
# Stream documents from an IO incrementally, yielding each recovered top-level
|
|
67
|
+
# document without slurping the whole input into memory first.
|
|
69
68
|
def stream_io(io, options, &block)
|
|
70
|
-
Recovery.process_string(
|
|
69
|
+
Framer.each_document(io) { |doc| Recovery.process_string(doc, options, &block) }
|
|
70
|
+
nil
|
|
71
71
|
end
|
|
72
72
|
|
|
73
73
|
private_class_method :process_content, :stream_io
|
|
74
74
|
|
|
75
|
+
# Named byte values, shared by the Parser FSM and the Framer / Recovery byte
|
|
76
|
+
# scanners so none of them spell out raw hex. Included where needed.
|
|
77
|
+
module Bytes
|
|
78
|
+
LBRACE = 0x7B
|
|
79
|
+
RBRACE = 0x7D
|
|
80
|
+
LBRACKET = 0x5B
|
|
81
|
+
RBRACKET = 0x5D
|
|
82
|
+
COLON = 0x3A
|
|
83
|
+
COMMA = 0x2C
|
|
84
|
+
DQUOTE = 0x22
|
|
85
|
+
SQUOTE = 0x27
|
|
86
|
+
BACKSLASH = 0x5C
|
|
87
|
+
SLASH = 0x2F
|
|
88
|
+
STAR = 0x2A
|
|
89
|
+
HASH = 0x23
|
|
90
|
+
MINUS = 0x2D
|
|
91
|
+
PLUS = 0x2B
|
|
92
|
+
DOT = 0x2E
|
|
93
|
+
ZERO = 0x30
|
|
94
|
+
NINE = 0x39
|
|
95
|
+
LOWER_E = 0x65
|
|
96
|
+
UPPER_E = 0x45
|
|
97
|
+
LOWER_T = 0x74
|
|
98
|
+
LOWER_F = 0x66
|
|
99
|
+
LOWER_N = 0x6E
|
|
100
|
+
LOWER_U = 0x75
|
|
101
|
+
LOWER_X = 0x78
|
|
102
|
+
UPPER_X = 0x58
|
|
103
|
+
UPPER_I = 0x49
|
|
104
|
+
UPPER_N = 0x4E
|
|
105
|
+
UPPER_T = 0x54
|
|
106
|
+
UPPER_F = 0x46
|
|
107
|
+
UNDERSCORE = 0x5F
|
|
108
|
+
DOLLAR = 0x24
|
|
109
|
+
SPACE = 0x20
|
|
110
|
+
TAB = 0x09
|
|
111
|
+
LF = 0x0A
|
|
112
|
+
CR = 0x0D
|
|
113
|
+
end
|
|
114
|
+
|
|
115
|
+
module Framer
|
|
116
|
+
include Bytes
|
|
117
|
+
|
|
118
|
+
CHUNK_SIZE = 16 * 1024
|
|
119
|
+
|
|
120
|
+
module_function
|
|
121
|
+
|
|
122
|
+
def each_document(io, &block)
|
|
123
|
+
buffer = +""
|
|
124
|
+
scan = 0
|
|
125
|
+
doc_start = nil
|
|
126
|
+
stack = []
|
|
127
|
+
mode = nil
|
|
128
|
+
|
|
129
|
+
while (chunk = read_chunk(io))
|
|
130
|
+
buffer << chunk
|
|
131
|
+
loop do
|
|
132
|
+
emitted, buffer, scan, doc_start, stack, mode = scan_buffer(buffer, scan, doc_start, stack, mode)
|
|
133
|
+
break unless emitted
|
|
134
|
+
|
|
135
|
+
yield emitted
|
|
136
|
+
end
|
|
137
|
+
end
|
|
138
|
+
|
|
139
|
+
yield buffer unless separators_only?(buffer)
|
|
140
|
+
end
|
|
141
|
+
|
|
142
|
+
def read_chunk(io)
|
|
143
|
+
if io.respond_to?(:readpartial)
|
|
144
|
+
io.readpartial(CHUNK_SIZE)
|
|
145
|
+
else
|
|
146
|
+
io.read(CHUNK_SIZE)
|
|
147
|
+
end
|
|
148
|
+
rescue EOFError
|
|
149
|
+
nil
|
|
150
|
+
end
|
|
151
|
+
|
|
152
|
+
def scan_buffer(buffer, scan, doc_start, stack, mode)
|
|
153
|
+
while scan < buffer.bytesize
|
|
154
|
+
b = buffer.getbyte(scan)
|
|
155
|
+
# A multi-byte marker (// /* ''' */) whose lead byte is here but whose
|
|
156
|
+
# remaining bytes have not arrived yet must not be guessed at — advancing
|
|
157
|
+
# past the lead byte would misread the brace/quote that follows it once the
|
|
158
|
+
# next chunk lands. Stop and let each_document append more input, then resume
|
|
159
|
+
# from this same position. At true EOF the leftover is parsed whole instead.
|
|
160
|
+
break if defer_for_split_marker?(buffer, scan, b, mode, doc_start)
|
|
161
|
+
|
|
162
|
+
if mode == :double
|
|
163
|
+
if b == BACKSLASH
|
|
164
|
+
scan += 2
|
|
165
|
+
elsif b == DQUOTE
|
|
166
|
+
mode = nil
|
|
167
|
+
scan += 1
|
|
168
|
+
else
|
|
169
|
+
scan += 1
|
|
170
|
+
end
|
|
171
|
+
elsif mode == :single
|
|
172
|
+
if b == BACKSLASH
|
|
173
|
+
scan += 2
|
|
174
|
+
elsif b == SQUOTE
|
|
175
|
+
mode = nil
|
|
176
|
+
scan += 1
|
|
177
|
+
else
|
|
178
|
+
scan += 1
|
|
179
|
+
end
|
|
180
|
+
elsif mode == :triple
|
|
181
|
+
if buffer.byteslice(scan, 3) == "'''"
|
|
182
|
+
mode = nil
|
|
183
|
+
scan += 3
|
|
184
|
+
else
|
|
185
|
+
scan += 1
|
|
186
|
+
end
|
|
187
|
+
elsif mode == :line_comment
|
|
188
|
+
if [LF, CR].include?(b)
|
|
189
|
+
mode = nil
|
|
190
|
+
else
|
|
191
|
+
scan += 1
|
|
192
|
+
next
|
|
193
|
+
end
|
|
194
|
+
elsif mode == :block_comment
|
|
195
|
+
if buffer.byteslice(scan, 2) == '*/'
|
|
196
|
+
mode = nil
|
|
197
|
+
scan += 2
|
|
198
|
+
else
|
|
199
|
+
scan += 1
|
|
200
|
+
end
|
|
201
|
+
elsif doc_start.nil?
|
|
202
|
+
if whitespace_byte?(b)
|
|
203
|
+
scan += 1
|
|
204
|
+
elsif line_comment_start?(buffer, scan)
|
|
205
|
+
mode = :line_comment
|
|
206
|
+
scan += buffer.getbyte(scan) == HASH ? 1 : 2
|
|
207
|
+
elsif block_comment_start?(buffer, scan)
|
|
208
|
+
mode = :block_comment
|
|
209
|
+
scan += 2
|
|
210
|
+
elsif [LBRACE, LBRACKET].include?(b)
|
|
211
|
+
doc_start = scan
|
|
212
|
+
stack << b
|
|
213
|
+
scan += 1
|
|
214
|
+
else
|
|
215
|
+
scan = buffer.bytesize
|
|
216
|
+
end
|
|
217
|
+
else
|
|
218
|
+
if mode.nil? && line_comment_start?(buffer, scan)
|
|
219
|
+
mode = :line_comment
|
|
220
|
+
scan += buffer.getbyte(scan) == HASH ? 1 : 2
|
|
221
|
+
elsif mode.nil? && block_comment_start?(buffer, scan)
|
|
222
|
+
mode = :block_comment
|
|
223
|
+
scan += 2
|
|
224
|
+
elsif b == DQUOTE
|
|
225
|
+
mode = :double
|
|
226
|
+
scan += 1
|
|
227
|
+
elsif buffer.byteslice(scan, 3) == "'''"
|
|
228
|
+
mode = :triple
|
|
229
|
+
scan += 3
|
|
230
|
+
elsif b == SQUOTE
|
|
231
|
+
mode = :single
|
|
232
|
+
scan += 1
|
|
233
|
+
elsif [LBRACE, LBRACKET].include?(b)
|
|
234
|
+
stack << b
|
|
235
|
+
scan += 1
|
|
236
|
+
elsif b == RBRACE
|
|
237
|
+
stack.pop if stack.last == LBRACE
|
|
238
|
+
scan += 1
|
|
239
|
+
if stack.empty?
|
|
240
|
+
doc = buffer.byteslice(doc_start, scan - doc_start)
|
|
241
|
+
buffer = buffer.byteslice(scan..-1) || +""
|
|
242
|
+
return [doc, buffer, 0, nil, [], nil]
|
|
243
|
+
end
|
|
244
|
+
elsif b == RBRACKET
|
|
245
|
+
stack.pop if stack.last == LBRACKET
|
|
246
|
+
scan += 1
|
|
247
|
+
if stack.empty?
|
|
248
|
+
doc = buffer.byteslice(doc_start, scan - doc_start)
|
|
249
|
+
buffer = buffer.byteslice(scan..-1) || +""
|
|
250
|
+
return [doc, buffer, 0, nil, [], nil]
|
|
251
|
+
end
|
|
252
|
+
else
|
|
253
|
+
scan += 1
|
|
254
|
+
end
|
|
255
|
+
end
|
|
256
|
+
end
|
|
257
|
+
|
|
258
|
+
[nil, buffer, scan, doc_start, stack, mode]
|
|
259
|
+
end
|
|
260
|
+
|
|
261
|
+
# True when `b` is the lead byte of a multi-byte marker but the rest of that
|
|
262
|
+
# marker has not been read into the buffer yet, so we cannot decide what it is.
|
|
263
|
+
# `//` and `/*` need 2 bytes; `'''` (and a closing `'''`) needs 3; a closing
|
|
264
|
+
# `*/` needs 2. Backslash escapes and single-byte delimiters never need this.
|
|
265
|
+
def defer_for_split_marker?(buffer, scan, b, mode, doc_start)
|
|
266
|
+
avail = buffer.bytesize - scan
|
|
267
|
+
case mode
|
|
268
|
+
when :block_comment
|
|
269
|
+
b == STAR && avail < 2
|
|
270
|
+
when :triple
|
|
271
|
+
b == SQUOTE && avail < 3
|
|
272
|
+
when nil
|
|
273
|
+
if doc_start.nil?
|
|
274
|
+
b == SLASH && avail < 2
|
|
275
|
+
else
|
|
276
|
+
(b == SLASH && avail < 2) || (b == SQUOTE && avail < 3)
|
|
277
|
+
end
|
|
278
|
+
else
|
|
279
|
+
false
|
|
280
|
+
end
|
|
281
|
+
end
|
|
282
|
+
|
|
283
|
+
def separators_only?(buffer)
|
|
284
|
+
scan = 0
|
|
285
|
+
mode = nil
|
|
286
|
+
while scan < buffer.bytesize
|
|
287
|
+
b = buffer.getbyte(scan)
|
|
288
|
+
if mode == :line_comment
|
|
289
|
+
if [LF, CR].include?(b)
|
|
290
|
+
mode = nil
|
|
291
|
+
else
|
|
292
|
+
scan += 1
|
|
293
|
+
next
|
|
294
|
+
end
|
|
295
|
+
elsif mode == :block_comment
|
|
296
|
+
if buffer.byteslice(scan, 2) == '*/'
|
|
297
|
+
mode = nil
|
|
298
|
+
scan += 2
|
|
299
|
+
else
|
|
300
|
+
scan += 1
|
|
301
|
+
end
|
|
302
|
+
elsif whitespace_byte?(b)
|
|
303
|
+
scan += 1
|
|
304
|
+
elsif line_comment_start?(buffer, scan)
|
|
305
|
+
mode = :line_comment
|
|
306
|
+
scan += buffer.getbyte(scan) == HASH ? 1 : 2
|
|
307
|
+
elsif block_comment_start?(buffer, scan)
|
|
308
|
+
mode = :block_comment
|
|
309
|
+
scan += 2
|
|
310
|
+
else
|
|
311
|
+
return false
|
|
312
|
+
end
|
|
313
|
+
end
|
|
314
|
+
true
|
|
315
|
+
end
|
|
316
|
+
|
|
317
|
+
def whitespace_byte?(b)
|
|
318
|
+
b == SPACE || (b && b >= TAB && b <= CR)
|
|
319
|
+
end
|
|
320
|
+
|
|
321
|
+
def line_comment_start?(buffer, scan)
|
|
322
|
+
b = buffer.getbyte(scan)
|
|
323
|
+
return preceded_by_ws_or_start?(buffer, scan) if b == HASH
|
|
324
|
+
|
|
325
|
+
b == SLASH && buffer.getbyte(scan + 1) == SLASH && preceded_by_ws_or_start?(buffer, scan)
|
|
326
|
+
end
|
|
327
|
+
|
|
328
|
+
def block_comment_start?(buffer, scan)
|
|
329
|
+
buffer.getbyte(scan) == SLASH && buffer.getbyte(scan + 1) == STAR && preceded_by_ws_or_start?(buffer, scan)
|
|
330
|
+
end
|
|
331
|
+
|
|
332
|
+
def preceded_by_ws_or_start?(buffer, scan)
|
|
333
|
+
return true if scan.zero?
|
|
334
|
+
|
|
335
|
+
prev = buffer.getbyte(scan - 1)
|
|
336
|
+
whitespace_byte?(prev)
|
|
337
|
+
end
|
|
338
|
+
end
|
|
339
|
+
|
|
75
340
|
module Recovery
|
|
341
|
+
include Bytes
|
|
342
|
+
|
|
76
343
|
module_function
|
|
77
344
|
|
|
78
345
|
def process_string(input, options, &block)
|
|
79
346
|
return SmarterJSON.send(:process_content, input, options, &block) unless input.valid_encoding?
|
|
80
347
|
|
|
81
|
-
|
|
348
|
+
# Recovery is REACTIVE: parse first, and only fall back to wrapper extraction when
|
|
349
|
+
# the parse actually fails (the rescue below). Every wrapper shape — code fences,
|
|
350
|
+
# <json>/BEGIN_JSON tags, prose around the payload — makes the parse raise, so the
|
|
351
|
+
# rescue catches it. Crucially this keeps clean input on the single-parse fast path
|
|
352
|
+
# even when its string values legitimately contain ``` or <json> (real-world data
|
|
353
|
+
# like GitHub event payloads is full of markdown), instead of dragging hundreds of
|
|
354
|
+
# MB through the pure-Ruby candidate scan.
|
|
355
|
+
#
|
|
356
|
+
# The one exception is a bare leading label like "JSON: {...}", which parses
|
|
357
|
+
# successfully but WRONGLY (as an implicit-root object keyed by the label), so it
|
|
358
|
+
# must be intercepted before parsing.
|
|
359
|
+
if leading_label?(input)
|
|
82
360
|
payloads = extract_payloads(input, options)
|
|
83
361
|
return replay_payloads(payloads, options, &block) unless payloads.empty?
|
|
84
362
|
end
|
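The deferral in `defer_for_split_marker?` matters when a two-byte marker such as `/*` is split across two read chunks. The following reduced sketch shows only that one idea: it counts block-comment openers in a chunked stream and pauses when a trailing `/` might be the first half of a marker. `defer_for_slash?` and `count_block_comments` are hypothetical names, and strings, `//`, `'''` and comment bodies are deliberately ignored here.

```ruby
# Simplified stand-in for the deferral check: a '/' that is the last byte
# available might begin "//" or "/*", so the scanner has to wait for more input.
def defer_for_slash?(buffer, scan)
  buffer.getbyte(scan) == "/".ord && buffer.bytesize - scan < 2
end

# Count "/*" openers across chunks without mis-scanning a marker that is
# split over a chunk boundary.
def count_block_comments(chunks)
  buffer = +""
  scan = 0
  count = 0
  chunks.each do |chunk|
    buffer << chunk
    while scan < buffer.bytesize
      break if defer_for_slash?(buffer, scan) # resume from here after the next chunk

      if buffer.getbyte(scan) == "/".ord && buffer.getbyte(scan + 1) == "*".ord
        count += 1
        scan += 2
      else
        scan += 1
      end
    end
  end
  count
end

count_block_comments(["{ /", "* } */ }"]) # marker split across the boundary
```

Without the `break`, the scanner would step past the lone `/` at the end of the first chunk and then see the `*` on its own, so the opener would be missed and the `}` inside the comment would be read as structure.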
@@ -93,10 +371,14 @@ module SmarterJSON
       raise
     end

-
-
-
-
+    # Whether the input opens with a bare "JSON:" / "Final answer:" label (which would
+    # otherwise parse, wrongly, as an implicit-root object keyed by the label). We use
+    # String#start_with? with a Regexp rather than match?(/\A.../): start_with? checks
+    # only the beginning, whereas a \A-anchored match? still retries at every byte
+    # position and so scans the WHOLE input (≈0.3s on a 200 MB document) on every parse.
+    # (Caller has already established the input is valid_encoding?.)
+    def leading_label?(input)
+      input.start_with?(/[[:space:]]*(?:JSON|Final answer)[[:space:]]*:/i)
     end

     def replay_payloads(payloads, options, &block)
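The label check added in the hunk above can be tried in isolation. The following is a minimal sketch that copies the regexp from the diff into a standalone script; the constant name `LABEL_RE` and the sample inputs are assumptions made for illustration and are not part of the gem.

```ruby
# Standalone sketch: same regexp as the diff, wrapped in a top-level method.
# LABEL_RE is a hypothetical name used only in this example.
LABEL_RE = /[[:space:]]*(?:JSON|Final answer)[[:space:]]*:/i

def leading_label?(input)
  # String#start_with? accepts a Regexp (Ruby 2.5+) and only tests a match
  # that begins at the start of the string.
  input.start_with?(LABEL_RE)
end

leading_label?('JSON: {"a": 1}')       # => true
leading_label?("  final answer : [1]") # => true
leading_label?('{"note": "JSON: x"}')  # => false (label is not at the start)
```

Only a label at the very beginning of the input is intercepted here; a label-like string inside a value does not match, so such input goes straight to the parser.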
@@ -146,16 +428,31 @@ module SmarterJSON
       last = ranges.last
       prefix = input.byteslice(0, first.begin)
       suffix = input.byteslice(last.end, input.bytesize - last.end)
+      # Look for fence / wrapper markers only in the text we actually strip (outside
+      # every recovered payload), so a ``` or <json> sitting inside a payload's own
+      # string value does not trigger a "stripped a wrapper" warning.
+      outside = non_payload_text(input, ranges)
       {
         prefix: substantive_text?(prefix),
         suffix: substantive_text?(suffix),
-        fence:
-        wrapper:
+        fence: outside.include?("```"),
+        wrapper: outside.match?(/<json\b|BEGIN_JSON\b/i),
         first_pos: line_col_for(input, first.begin),
         last_pos: line_col_for(input, last.begin)
       }
     end

+    def non_payload_text(input, ranges)
+      out = +""
+      pos = 0
+      ranges.each do |range|
+        out << input.byteslice(pos, range.begin - pos) if range.begin > pos
+        pos = range.end
+      end
+      out << input.byteslice(pos, input.bytesize - pos) if pos < input.bytesize
+      out
+    end
+
     def line_col_for(input, offset)
       line = 1
       col = 1
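To see why the fence check now runs on the text outside the payloads, the `non_payload_text` helper from the hunk above can be exercised on its own. This is a sketch: the method body is copied from the diff, while the sample inputs and the way the payload range is located (`index` / `rindex` on ASCII text) are assumptions made for the example.

```ruby
# Copied from the diff: concatenates every byte span that lies outside the
# given payload ranges (exclusive byte ranges, in ascending order).
def non_payload_text(input, ranges)
  out = +""
  pos = 0
  ranges.each do |range|
    out << input.byteslice(pos, range.begin - pos) if range.begin > pos
    pos = range.end
  end
  out << input.byteslice(pos, input.bytesize - pos) if pos < input.bytesize
  out
end

# Hypothetical input: a fenced payload whose own string value contains ```.
fenced = "```json\n{\"md\": \"use ``` here\"}\n```"
payload = (fenced.index("{")...(fenced.rindex("}") + 1)) # ASCII only, so char == byte offsets
outside_fenced = non_payload_text(fenced, [payload])     # only the fence lines remain

# Hypothetical input: the same payload with no wrapper at all.
bare = "{\"md\": \"use ``` here\"}"
outside_bare = non_payload_text(bare, [0...bare.bytesize]) # nothing outside the payload

outside_fenced.include?("```") # => true  (a real fence was stripped)
outside_bare.include?("```")   # => false (the ``` lives inside the payload)
```

Checking the whole input instead would report a fence in both cases, which is the false "stripped a wrapper" warning the code comment in the diff describes.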
@@ -164,15 +461,15 @@ module SmarterJSON
         b = input.getbyte(i)
         break if b.nil?

-        if b ==
+        if b == LF
           line += 1
           col = 1
           i += 1
-        elsif b ==
+        elsif b == CR
           line += 1
           col = 1
           i += 1
-          i += 1 if i < offset && input.getbyte(i) ==
+          i += 1 if i < offset && input.getbyte(i) == LF
         else
           col += 1
           i += 1
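The hunk above shows only the middle of `line_col_for`. The sketch below fills in the parts the diff does not show (the loop setup and the return value) as assumptions, under the hypothetical name `line_col_sketch`, to illustrate how LF, CR, and CRLF each count as one line break. The actual method in the gem may differ.

```ruby
LF = 0x0A
CR = 0x0D

# Hypothetical reconstruction: returns [line, col] (both 1-based) for a byte
# offset. The `i = 0`, `while i < offset`, and return value are assumptions.
def line_col_sketch(input, offset)
  line = 1
  col = 1
  i = 0
  while i < offset
    b = input.getbyte(i)
    break if b.nil?

    if b == LF
      line += 1
      col = 1
      i += 1
    elsif b == CR
      line += 1
      col = 1
      i += 1
      # Treat CRLF as a single line break by skipping the LF that follows.
      i += 1 if i < offset && input.getbyte(i) == LF
    else
      col += 1
      i += 1
    end
  end
  [line, col]
end

line_col_sketch("ab\ncd", 4)  # => [2, 2]
line_col_sketch("a\r\nb", 3)  # => [2, 1]
```

Note that in this sketch the column advances per byte, so multi-byte UTF-8 characters before the offset would count as several columns.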
@@ -203,19 +500,19 @@ module SmarterJSON
       while i < input.bytesize
         b = input.getbyte(i)
         if mode == :double
-          if b ==
+          if b == BACKSLASH
             i += 2
             next
-          elsif b ==
+          elsif b == DQUOTE
             mode = nil
           end
           i += 1
           next
         elsif mode == :single
-          if b ==
+          if b == BACKSLASH
             i += 2
             next
-          elsif b ==
+          elsif b == SQUOTE
             mode = nil
           end
           i += 1
@@ -229,7 +526,7 @@ module SmarterJSON
           end
           next
         elsif mode == :line_comment
-          if [
+          if [LF, CR].include?(b)
             mode = nil
           else
             i += 1
@@ -252,11 +549,11 @@ module SmarterJSON
           mode = :block_comment
           i += 2
           next
-        elsif b ==
+        elsif b == HASH
           mode = :line_comment
           i += 1
           next
-        elsif b ==
+        elsif b == DQUOTE
           mode = :double
           i += 1
           next
@@ -264,21 +561,21 @@ module SmarterJSON
           mode = :triple
           i += 3
           next
-        elsif b ==
+        elsif b == SQUOTE
           mode = :single
           i += 1
           next
-        elsif [
+        elsif [LBRACE, LBRACKET].include?(b)
           start_pos = i if stack.empty?
           stack << b
-        elsif b ==
-          stack.pop if stack.last ==
+        elsif b == RBRACE
+          stack.pop if stack.last == LBRACE
           if stack.empty? && start_pos
             ranges << (start_pos...(i + 1))
             start_pos = nil
           end
-        elsif b ==
-          stack.pop if stack.last ==
+        elsif b == RBRACKET
+          stack.pop if stack.last == LBRACKET
           if stack.empty? && start_pos
             ranges << (start_pos...(i + 1))
             start_pos = nil
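The scanning hunks above are fragments of a larger byte-level loop. As a rough illustration of the technique they show (a bracket stack that records a byte range each time the stack empties), here is a simplified sketch. It handles only double-quoted strings and omits the single-quote, triple-quote, and comment modes visible in the diff; the method name `top_level_ranges` and the sample input are assumptions.

```ruby
LBRACE, RBRACE     = 0x7B, 0x7D
LBRACKET, RBRACKET = 0x5B, 0x5D
DQUOTE, BACKSLASH  = 0x22, 0x5C

# Simplified sketch (hypothetical name): returns exclusive byte ranges of
# every top-level {...} or [...] candidate found in the input.
def top_level_ranges(input)
  ranges = []
  stack = []
  start_pos = nil
  in_string = false
  i = 0
  while i < input.bytesize
    b = input.getbyte(i)
    if in_string
      if b == BACKSLASH
        i += 2 # skip the escaped byte so \" does not close the string
        next
      end
      in_string = false if b == DQUOTE
    elsif b == DQUOTE
      in_string = true
    elsif b == LBRACE || b == LBRACKET
      start_pos = i if stack.empty?
      stack << b
    elsif b == RBRACE || b == RBRACKET
      opener = b == RBRACE ? LBRACE : LBRACKET
      stack.pop if stack.last == opener
      if stack.empty? && start_pos
        ranges << (start_pos...(i + 1))
        start_pos = nil
      end
    end
    i += 1
  end
  ranges
end

text = 'Here: {"a": "}"} and [1, [2]] done'
found = top_level_ranges(text).map { |r| text.byteslice(r.begin, r.end - r.begin) }
# found == ['{"a": "}"}', '[1, [2]]']
```

The closing brace inside the string value is ignored because the scanner is in string mode at that point, which is the same reason the diff tracks `mode` before looking at brackets.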
@@ -304,41 +601,7 @@ module SmarterJSON
   # Python literals (True/False/None) and undefined, underscores in
   # numeric literals, and encoding validation (SmarterJSON::EncodingError).
   class Parser
-
-    RBRACE = 0x7D
-    LBRACKET = 0x5B
-    RBRACKET = 0x5D
-    COLON = 0x3A
-    COMMA = 0x2C
-    DQUOTE = 0x22
-    SQUOTE = 0x27
-    BACKSLASH = 0x5C
-    SLASH = 0x2F
-    STAR = 0x2A
-    HASH = 0x23
-    MINUS = 0x2D
-    PLUS = 0x2B
-    DOT = 0x2E
-    ZERO = 0x30
-    NINE = 0x39
-    LOWER_E = 0x65
-    UPPER_E = 0x45
-    LOWER_T = 0x74
-    LOWER_F = 0x66
-    LOWER_N = 0x6E
-    LOWER_U = 0x75
-    LOWER_X = 0x78
-    UPPER_X = 0x58
-    UPPER_I = 0x49
-    UPPER_N = 0x4E
-    UPPER_T = 0x54
-    UPPER_F = 0x46
-    UNDERSCORE = 0x5F
-    DOLLAR = 0x24
-    SPACE = 0x20
-    TAB = 0x09
-    LF = 0x0A
-    CR = 0x0D
+    include Bytes

     NOT_NUMERIC = Object.new
     HEX_RE = /\A[-+]?0[xX][0-9a-fA-F_]+\z/.freeze
data/lib/smarter_json/version.rb
CHANGED