slimjson 1.0.3 → 1.1.0

package/README_EN.md CHANGED
@@ -1,309 +1,499 @@
1
- # slimjson
2
-
3
- [中文](./README.md) | English
4
-
5
- A lightweight object array compression tool — converts JSON object arrays with repeated keys into a compact `{ keys, rows }` format, with support for omitting `null` values during serialization to further reduce size.
6
-
7
- ## Use Cases
8
-
9
- - **API List Endpoints**: Backend list endpoints where every object carries the same key names, resulting in massive redundancy
10
- - **Heterogeneous Fields**: Objects with different fields (backend omits null fields on demand)
11
- - **Network Transfer Compression**: Minimizing JSON text size for network transmission
12
- - **LLM Context Compression**: Compress large structured data (e.g. database query results, API responses, knowledge base entries) before sending to prompts, reducing token consumption and API costs
13
- - **LLM Tool Calling**: function calling / tool_use results are often structured object arrays — compressing them before feeding back to the model significantly reduces context window usage, enabling the model to handle more complex data within limited tokens
14
- - **LLM-Friendly Format**: The compressed `{ keys, rows }` format separates schema (field definitions) from data, with each key appearing only once. Models can more accurately understand data structures and extract information by field name, with less confusion compared to raw JSON with repeated keys
15
-
16
- ## Installation
17
-
18
- ```bash
19
- npm install slimjson
20
- ```
21
-
22
- ## API
23
-
24
- ### `compress(source)`
25
-
26
- Compresses an object array into a `{ keys, rows }` structure:
27
-
28
- ```js
29
- import { compress } from 'slimjson';
30
-
31
- const users = [
32
- { name: 'Alice', age: 25, city: 'NYC' },
33
- { name: 'Bob', age: 30, city: 'LA' },
34
- ];
35
-
36
- const compressed = compress(users);
37
- // {
38
- // keys: ['name', 'age', 'city'],
39
- // rows: [
40
- // ['Alice', 25, 'NYC'],
41
- // ['Bob', 30, 'LA' ]
42
- // ]
43
- // }
44
- ```
45
-
46
- **Features:**
47
- - `keys` takes the union of all object keys, ordered by first appearance
48
- - Missing fields in an object → fill `null` at the corresponding row position
49
- - Nested objects are recursively processed: represented as `{ "fieldName": [childKeys] }` in `keys`
50
- - Object arrays (e.g. order items) are recursively compressed the same way
51
- - When a plain object is passed (not an array), it is treated as a single-element array
52
-
53
- ```js
54
- compress({ name: 'Alice', age: 25 });
55
- // Equivalent to compress([{ name: 'Alice', age: 25 }])
56
- // { keys: ['name', 'age'], rows: [['Alice', 25]] }
57
- ```
58
-
59
- #### Nested Object Example
60
-
61
- ```js
62
- const data = [
63
- { name: 'Alice', age: 28, profile: { avatar: 'a.jpg', bio: 'Hello' } },
64
- { name: 'Bob', age: 35, profile: { avatar: 'b.jpg' } }, // missing bio
65
- ];
66
-
67
- compress(data);
68
- // {
69
- // keys: ['name', 'age', { profile: ['avatar', 'bio'] }],
70
- // rows: [
71
- // ['Alice', 28, ['a.jpg', 'Hello']],
72
- // ['Bob', 35, ['b.jpg', null ]]
73
- // ]
74
- // }
75
-
76
- stringify(compress(data));
77
- // {keys:[name,age,{profile:[avatar,bio]}],rows:[[Alice,28,[a.jpg,Hello]],[Bob,35,[b.jpg,]]]}
78
- // ^^ null omitted, comma retained
79
- ```
80
-
81
- #### Object Array Example (Order Scenario)
82
-
83
- ```js
84
- const orders = [
85
- { orderId: 'A001', items: [{ name: 'Keyboard', price: 299 }, { name: 'Mouse', price: 99 }] },
86
- { orderId: 'A002', items: [{ name: 'Monitor', price: 1999 }] },
87
- ];
88
-
89
- compress(orders);
90
- // {
91
- // keys: ['orderId', { items: ['name', 'price'] }],
92
- // rows: [
93
- // ['A001', [['Keyboard', 299], ['Mouse', 99]]],
94
- // ['A002', [['Monitor', 1999]]]
95
- // ]
96
- // }
97
-
98
- stringify(compress(orders));
99
- // {keys:[orderId,{items:[name,price]}],rows:[[A001,[[Keyboard,299],[Mouse,99]]],[A002,[[Monitor,1999]]]]}
100
- // ^^^^^ nested object key, no quotes ^^^^ safe string value, no quotes
101
- ```
102
-
103
- #### Three-Level Nesting Example (Order → Item → Specs)
104
-
105
- ```js
106
- const orders = [
107
- {
108
- orderId: 'A001',
109
- customer: 'Alice',
110
- items: [
111
- { name: 'Keyboard', price: 299, specs: { color: 'Black', layout: '104-key' } },
112
- { name: 'Mouse', price: 99, specs: { color: 'White', dpi: '4000' } },
113
- ]
114
- },
115
- {
116
- orderId: 'A002',
117
- customer: 'Bob',
118
- items: [
119
- { name: 'Monitor', price: 1999, specs: { color: 'Silver', size: '27in' } },
120
- ]
121
- },
122
- ];
123
-
124
- compress(orders);
125
- // {
126
- // keys: [
127
- // 'orderId',
128
- // 'customer',
129
- // { items: ['name', 'price', { specs: ['color', 'layout', 'dpi', 'size'] }] }
130
- // ],
131
- // rows: [
132
- // ['A001', 'Alice', [
133
- // ['Keyboard', 299, ['Black', '104-key', null, null]],
134
- // ['Mouse', 99, ['White', null, '4000', null]]
135
- // ]],
136
- // ['A002', 'Bob', [
137
- // ['Monitor', 1999, ['Silver', null, null, '27in']]
138
- // ]]
139
- // ]
140
- // }
141
- // specs keys take the union: order 1 has layout, order 2 has size → both kept, missing fields filled with null
142
-
143
- stringify(compress(orders));
144
- // {keys:[orderId,customer,{items:[name,price,{specs:[color,layout,dpi,size]}]}],rows:[[
145
- // A001,Alice,[[Keyboard,299,[Black,104-key,,]],[Mouse,99,[White,,4000,]]]],[A002,Bob,[[Monitor,1999,[Silver,,,27in]]]]]}
146
- // ^^^^^^^^^^^^^^^^^^^^^^^^^^^ missing fields in specs omitted as empty slots ^^^^^^^^^^^^^^^^^^^^^^^^
147
- ```
148
-
149
- ### `decompress(compressed)`
150
-
151
- Restores `{ keys, rows }` back to the original object array:
152
-
153
- ```js
154
- const restored = decompress(compressed);
155
- // deep-equal to the original array
156
- ```
157
-
158
- ### `stringify(compressed)`
159
-
160
- Serializes the compress result into compact text. Compared to `JSON.stringify`, the following optimization rules are applied:
161
-
162
- ```js
163
- const data = [
164
- { name: 'Alice', age: 25 },
165
- { name: 'Bob', age: 30 },
166
- ];
167
-
168
- const text = stringify(compress(data));
169
- // {keys:[name,age],rows:[[Alice,25],[Bob,30]]}
170
-
171
- JSON.stringify(compress(data));
172
- // {"keys":["name","age"],"rows":[["Alice",25],["Bob",30]]}
173
- ```
174
-
175
- #### Serialization Rules
176
-
177
- | Value Type | Serialized Result | Notes |
178
- |------------|------------------|-------|
179
- | `null` / `undefined` | `null` | |
180
- | Finite number | `25` | Direct output, no quotes |
181
- | `NaN` / `Infinity` | `null` | Non-finite numbers unified to null |
182
- | `true` / `false` | `true` / `false` | — |
183
- | Safe string | `Alice` | Quotes omitted (see rules below) |
184
- | Unsafe string | `"hello world"` | JSON quotes and escaping retained |
185
- | Nested object `{k: v}` | `{k:v}` | Keys follow same safe/unsafe rules |
186
- | Array | See null omission rules below | — |
187
-
188
- #### Safe Strings (Conditions for Omitting Quotes)
189
-
190
- A string can omit quotes only when it satisfies **all** of the following conditions; otherwise `JSON.stringify` escaping is applied:
191
-
192
- 1. Non-empty string
193
- 2. Not a keyword literal: `null`, `true`, `false`
194
- 3. Does not match number pattern: `/^-?(0|[1-9]\d*)(\.\d+)?([eE][+-]?\d+)?$/` (e.g. `"123"`, `"-3.14"`, `"1e10"` all retain quotes)
195
- 4. Does not start with a digit or minus sign `-`
196
- 5. Does not contain whitespace, `[`, `]`, `{`, `}`, `,`, `:`, `"` or similar characters
197
-
198
- | String | Result | Reason |
199
- |--------|--------|--------|
200
- | `"Alice"` | `Alice` | Safe, quotes omitted |
201
- | `"hello world"` | `"hello world"` | Contains space |
202
- | `"123"` | `"123"` | Looks like a number |
203
- | `"-3.14"` | `"-3.14"` | Looks like a number |
204
- | `"null"` | `"null"` | Keyword |
205
- | `""` | `""` | Empty string |
206
- | `"-abc"` | `"-abc"` | Starts with minus sign |
207
- | `"a:b"` | `"a:b"` | Contains colon |
208
-
209
- #### Object Key Quoting Rules
210
-
211
- Nested object keys in `keys` follow the same safe string check:
212
-
213
- ```js
214
- stringify({ keys: [{ profile: ['name', 'age'] }], rows: [...] });
215
- // {keys:[{profile:[name,age]}],rows:[...]} ← profile is safe, quotes omitted
216
-
217
- stringify({ keys: [{ "my-key": ['name'] }], rows: [...] });
218
- // {keys:[{"my-key":[name]}],rows:[...]} ← my-key contains hyphen, quotes retained
219
- ```
220
-
221
- #### Array Null Omission Rules
222
-
223
- `null` / `undefined` values in arrays are omitted as comma slots, taking no text space:
224
-
225
- | Original Array | Serialized Result | Notes |
226
- |---------------|------------------|-------|
227
- | `["a", null, null]` | `[a,,]` | Two trailing empty slots |
228
- | `[null, 1, null]` | `[,1,]` | Leading and trailing empty slots |
229
- | `[]` | `[]` | Empty array |
230
- | `[null]` | `[null]` | **Special**: `[,]` means 2 nulls, so single null retains literal |
231
-
232
- ### `parse(text)`
233
-
234
- Parses text produced by `stringify`, restoring omitted `null` values:
235
-
236
- ```js
237
- const parsed = parse(text);
238
- // deep-equal to compressed
239
- ```
240
-
241
- Supports full JSON type parsing (strings, numbers, booleans, null, nested objects/arrays), compatible with escape characters and Unicode.
242
-
243
- ## Complete Example
244
-
245
- ```js
246
- import { compress, decompress, stringify, parse } from 'slimjson';
247
-
248
- const data = [
249
- { name: 'Alice', age: 28, profile: { avatar: 'a.jpg', bio: 'Hello' } },
250
- { name: 'Bob', age: 35, profile: { avatar: 'b.jpg' } }, // missing bio
251
- ];
252
-
253
- // Compress Stringify Parse Decompress
254
- const compressed = compress(data);
255
- const text = stringify(compressed);
256
- const parsed = parse(text);
257
- const restored = decompress(parsed);
258
-
259
- // restored is deep-equal to data
260
- ```
261
-
262
- ### Compression Ratio Calculation
263
-
264
- ```js
265
- const originalSize = Buffer.byteLength(JSON.stringify(data));
266
- const compressedSize = Buffer.byteLength(stringify(compress(data)));
267
- const ratio = ((originalSize - compressedSize) / originalSize * 100).toFixed(1);
268
- console.log(`Compression ratio: ${ratio}%`);
269
- ```
270
-
271
- ## Compression Results
272
-
273
- Based on actual data from `compress-test.js` benchmarks (18 test cases, average compression ratio **52.20%**, all roundtrip decompressions verified):
274
-
275
- | Data Type | Object Count | Original Size | Compressed | Ratio |
276
- |-----------|-------------|---------------|------------|-------|
277
- | Simple users | 1,000 | 147.85 KB | 87.36 KB | **40.91%** |
278
- | Simple users | 10,000 | 1.45 MB | 882.34 KB | **40.69%** |
279
- | Nested users (with profile.social) | 1,000 | 235.28 KB | 153.03 KB | **34.96%** |
280
- | Orders (1-5 items per order) | 500 | 166.34 KB | 72.10 KB | **56.66%** |
281
- | School data (6 grades x 4 classes x 30 students) | 24 | 215.47 KB | 88.39 KB | **58.98%** |
282
- | Sparse fields (500 records x 30 fields) | 500 | 144.68 KB | 45.39 KB | **68.62%** |
283
- | Sparse fields (2000 records x 50 fields) | 2,000 | 947.88 KB | 292.74 KB | **69.12%** |
284
- | Deep nesting (5-level org structure) | 5 | 634.65 KB | 288.75 KB | **54.50%** |
285
-
286
- **Conclusions:**
287
- 1. Longer field names and more fields yield better compression
288
- 2. Object arrays (order items, student lists) show significant compression (55–59%)
289
- 3. Sparse fields achieve the highest compression — missing field nulls omitted as empty slots (67–69%)
290
- 4. Deeper nested structures achieve better compression
291
- 5. `stringify` quote omission further reduces text size
292
-
293
- ## Development
294
-
295
- ```bash
296
- # Run tests
297
- npm test
298
-
299
- # Run compression ratio benchmarks
300
- node compress-test.js
301
- ```
302
-
303
- ## GitHub
304
-
305
- [https://github.com/LastHeaven/slimjson](https://github.com/LastHeaven/slimjson)
306
-
307
- ## License
308
-
309
- MIT
1
+ # slimjson
2
+
3
+ [中文](./README.md) | English
4
+
5
+ A lightweight object array compression tool — converts JSON object arrays with repeated keys into a compact `{ schema, data }` format, with support for omitting `null` values during serialization to further reduce size.
6
+
7
+ ## Use Cases
8
+
9
+ - **API List Endpoints**: Backend list endpoints where every object carries the same key names, resulting in massive redundancy
10
+ - **Heterogeneous Fields**: Objects with different fields (backend omits null fields on demand)
11
+ - **Network Transfer Compression**: Minimizing JSON text size for network transmission
12
+ - **LLM Context Compression**: Compress large structured data (e.g. database query results, API responses, knowledge base entries) before sending to prompts, reducing token consumption and API costs
13
+ - **LLM Tool Calling**: function calling / tool_use results are often structured object arrays — compressing them before feeding back to the model significantly reduces context window usage, enabling the model to handle more complex data within limited tokens
14
+ - **LLM-Friendly Format**: The compressed `{ schema, data }` format separates schema (field definitions) from data, with each key appearing only once. Models can more accurately understand data structures and extract information by field name, with less confusion compared to raw JSON with repeated keys
15
+
16
+ ## Installation
17
+
18
+ ```bash
19
+ npm install slimjson
20
+ ```
21
+
22
+ ## API
23
+
24
+ ### `compress(source, opts?)`
25
+
26
+ Compresses an object array into a `{ schema, data }` structure:
27
+
28
+ ```js
29
+ import { compress } from 'slimjson';
30
+
31
+ const users = [
32
+ { name: 'Alice', age: 25, city: 'NYC' },
33
+ { name: 'Bob', age: 30, city: 'LA' },
34
+ ];
35
+
36
+ const compressed = compress(users);
37
+ // {
38
+ // schema: [['name', 'age', 'city']],
39
+ // data: [['Alice', 25, 'NYC'], ['Bob', 30, 'LA']]
40
+ // }
41
+ ```
42
+
43
+ **Parameters:**
44
+
45
+ | Parameter | Type | Default | Description |
46
+ |-----------|------|---------|-------------|
47
+ | `source` | `Object[]` or `Object` | | Object array to compress (single object is auto-wrapped) |
48
+ | `opts` | `Object` | | Optional configuration |
49
+ | `opts.trimTrailingNulls` | `boolean` | `false` | Remove trailing `null` values from each row |
50
+
51
+ **Features:**
52
+ - `schema` takes the union of all object keys, ordered by first appearance
53
+ - Missing fields in an object → fill `null` at the corresponding data position
54
+ - Nested objects are recursively processed: represented as `{ "fieldName": [childKeys] }` in `schema`
55
+ - Object arrays (e.g. order items) are recursively compressed the same way
56
+ - When a plain object is passed (not an array), it is treated as a single-element array
57
+
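The first two feature bullets (key union in first-appearance order, `null` for missing fields) can be sketched for the flat case as follows. This is a minimal illustration, not slimjson's actual implementation: `compressFlat` is a hypothetical helper name, and nested objects and object arrays are deliberately left out.

```js
// Sketch only: handles flat objects, not nested objects or object arrays.
// `compressFlat` is a hypothetical name, not part of the slimjson API.
function compressFlat(source) {
  // A single plain object is treated as a one-element array.
  const objects = Array.isArray(source) ? source : [source];

  // Union of all keys, ordered by first appearance.
  const keys = [];
  const seen = new Set();
  for (const obj of objects) {
    for (const key of Object.keys(obj)) {
      if (!seen.has(key)) {
        seen.add(key);
        keys.push(key);
      }
    }
  }

  // One row per object; fields the object lacks (or holds as undefined) become null.
  const data = objects.map((obj) =>
    keys.map((key) =>
      Object.prototype.hasOwnProperty.call(obj, key) && obj[key] !== undefined
        ? obj[key]
        : null
    )
  );

  return { schema: [keys], data };
}

const result = compressFlat([
  { name: 'Alice', age: 25 },
  { name: 'Bob', city: 'LA' },
]);
// result.schema is [['name', 'age', 'city']]
// result.data is [['Alice', 25, null], ['Bob', null, 'LA']]
```

Collecting keys in a first pass and filling rows in a second keeps every row the same length, which is what lets a reader map a row position back to a field name.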
58
+ #### Nested Object Example
59
+
60
+ ```js
61
+ const data = [
62
+ { name: 'Alice', age: 28, profile: { avatar: 'a.jpg', bio: 'Hello' } },
63
+ { name: 'Bob', age: 35, profile: { avatar: 'b.jpg', file: null } },
64
+ { name: 'Carol' },
65
+ ];
66
+
67
+ compress(data);
68
+ // {
69
+ // schema: [['name', 'age', { profile: ['avatar', 'bio', 'file'] }]],
70
+ // data: [
71
+ // ['Alice', 28, ['a.jpg', 'Hello', null]],
72
+ // ['Bob', 35, ['b.jpg', null, null]],
73
+ // ['Carol', null, null]
74
+ // ]
75
+ // }
76
+ ```
77
+
78
+ #### `trimTrailingNulls`: Remove Trailing nulls
79
+
80
+ When enabled, trailing `null` values in each row (and nested sub-rows) are removed for further compression:
81
+
82
+ ```js
83
+ compress(data, { trimTrailingNulls: true });
84
+ // {
85
+ // schema: [['name', 'age', { profile: ['avatar', 'bio', 'file'] }]],
86
+ // data: [
87
+ // ['Alice', 28, ['a.jpg', 'Hello']],
88
+ // ['Bob', 35, ['b.jpg']],
89
+ // ['Carol']
90
+ // ]
91
+ // }
92
+ ```
93
+
94
+ `decompress` automatically fills missing trailing values with `null`, so the roundtrip result is the same as without trimming:
95
+
96
+ ```js
97
+ decompress(compress(data, { trimTrailingNulls: true }));
98
+ // [
99
+ // { name: 'Alice', age: 28, profile: { avatar: 'a.jpg', bio: 'Hello', file: null } },
100
+ // { name: 'Bob', age: 35, profile: { avatar: 'b.jpg', bio: null, file: null } },
101
+ // { name: 'Carol', age: null, profile: null }
102
+ // ]
103
+ ```
104
+
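The trimming step described above can be sketched as a small recursive helper. This is an assumption-laden illustration rather than the library's code: `trimRow` is a hypothetical name, and the sketch treats every nested array as a sub-row, whereas a real implementation would presumably consult the schema to tell sub-rows apart from ordinary array values.

```js
// Sketch only: `trimRow` is a hypothetical helper, not part of the slimjson API.
// Assumption: every nested array is a sub-row and may be trimmed as well.
function trimRow(row) {
  // Trim nested sub-rows first, then drop this row's own trailing nulls.
  const out = row.map((cell) => (Array.isArray(cell) ? trimRow(cell) : cell));
  while (out.length > 0 && out[out.length - 1] == null) {
    out.pop();
  }
  return out;
}

const rows = [
  ['Alice', 28, ['a.jpg', 'Hello', null]],
  ['Bob', 35, ['b.jpg', null, null]],
  ['Carol', null, null],
];
const trimmedRows = rows.map((row) => trimRow(row));
// trimmedRows is [['Alice', 28, ['a.jpg', 'Hello']], ['Bob', 35, ['b.jpg']], ['Carol']]
```

Only trailing nulls are removed. A `null` in the middle of a row has to stay, because dropping it would shift later values onto the wrong fields.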
105
+ #### Object Array Example (Order Scenario)
106
+
107
+ ```js
108
+ const orders = [
109
+ { orderId: 'A001', items: [{ name: 'Keyboard', price: 299 }, { name: 'Mouse', price: 99 }] },
110
+ { orderId: 'A002', items: [{ name: 'Monitor', price: 1999 }] },
111
+ ];
112
+
113
+ compress(orders);
114
+ // {
115
+ // schema: [['orderId', { items: [['name', 'price']] }]],
116
+ // data: [['A001', [['Keyboard', 299], ['Mouse', 99]]], ['A002', [['Monitor', 1999]]]]
117
+ // }
118
+
119
+ stringify(compress(orders));
120
+ // {schema:[[orderId,{items:[[name,price]]}]],data:[[A001,[[Keyboard,299],[Mouse,99]]],[A002,[[Monitor,1999]]]]}
121
+ // ^^^^^ nested object key, no quotes ^^^^ safe string value, no quotes
122
+ ```
123
+
124
+ #### Three-Level Nesting Example (Order → Item → Specs)
125
+
126
+ ```js
127
+ const orders = [
128
+ {
129
+ orderId: 'A001',
130
+ customer: 'Alice',
131
+ items: [
132
+ { name: 'Keyboard', price: 299, specs: { color: 'Black', layout: '104-key' } },
133
+ { name: 'Mouse', price: 99, specs: { color: 'White', dpi: '4000' } },
134
+ ]
135
+ },
136
+ {
137
+ orderId: 'A002',
138
+ customer: 'Bob',
139
+ items: [
140
+ { name: 'Monitor', price: 1999, specs: { color: 'Silver', size: '27in' } },
141
+ ]
142
+ },
143
+ ];
144
+
145
+ compress(orders);
146
+ // {
147
+ // schema: [[
148
+ // 'orderId',
149
+ // 'customer',
150
+ // { items: [['name', 'price', { specs: ['color', 'layout', 'dpi', 'size'] }]] }
151
+ // ]],
152
+ // data: [
153
+ // ['A001', 'Alice', [
154
+ // ['Keyboard', 299, ['Black', '104-key', null, null]],
155
+ // ['Mouse', 99, ['White', null, '4000', null]]
156
+ // ]],
157
+ // ['A002', 'Bob', [
158
+ // ['Monitor', 1999, ['Silver', null, null, '27in']]
159
+ // ]]
160
+ // ]
161
+ // }
162
+ // specs schema takes the union: order 1 has layout, order 2 has size → both kept, missing fields filled with null
163
+
164
+ compress(orders, { trimTrailingNulls: true });
165
+ // data becomes:
166
+ // [
167
+ // ['A001', 'Alice', [
168
+ // ['Keyboard', 299, ['Black', '104-key']],
169
+ // ['Mouse', 99, ['White', null, '4000']]
170
+ // ]],
171
+ // ['A002', 'Bob', [
172
+ // ['Monitor', 1999, ['Silver', null, null, '27in']]
173
+ // ]]
174
+ // ]
175
+ ```
176
+
177
+ ### `decompress(compressed)`
178
+
179
+ Restores `{ schema, data }` back to the original object array. Missing trailing values are automatically filled with `null`:
180
+
181
+ ```js
182
+ const restored = decompress(compressed);
183
+ // deep-equal to the original array
184
+ ```
185
+
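For a flat schema, the null-filling behaviour can be sketched like this. It is a simplified illustration under stated assumptions, not the library's implementation: `decompressFlat` is a hypothetical name, and the schema is assumed to contain only plain string keys with no nested entries.

```js
// Sketch only: `decompressFlat` is a hypothetical helper for flat schemas.
// Assumption: schema[0] is an array of string keys with no nested entries.
function decompressFlat({ schema, data }) {
  const keys = schema[0];
  return data.map((row) => {
    const obj = {};
    keys.forEach((key, i) => {
      // Positions past the end of a trimmed row read as undefined; restore them as null.
      obj[key] = row[i] === undefined ? null : row[i];
    });
    return obj;
  });
}

const restoredFlat = decompressFlat({
  schema: [['name', 'age', 'city']],
  data: [['Alice', 25, 'NYC'], ['Bob']],
});
// restoredFlat is
// [{ name: 'Alice', age: 25, city: 'NYC' }, { name: 'Bob', age: null, city: null }]
```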
186
+ ### `stringify(compressed)`
187
+
188
+ Serializes the compress result into compact text. Compared to `JSON.stringify`, the following optimization rules are applied:
189
+
190
+ ```js
191
+ const data = [
192
+ { name: 'Alice', age: 25 },
193
+ { name: 'Bob', age: 30 },
194
+ ];
195
+
196
+ const text = stringify(compress(data));
197
+ // {schema:[[name,age]],data:[[Alice,25],[Bob,30]]}
198
+
199
+ JSON.stringify(compress(data));
200
+ // {"schema":[["name","age"]],"data":[["Alice",25],["Bob",30]]}
201
+ ```
202
+
203
+ #### Serialization Rules
204
+
205
+ | Value Type | Serialized Result | Notes |
206
+ |------------|------------------|-------|
207
+ | `null` / `undefined` | `null` | |
208
+ | Finite number | `25` | Direct output, no quotes |
209
+ | `NaN` / `Infinity` | `null` | Non-finite numbers unified to null |
210
+ | `true` / `false` | `true` / `false` | — |
211
+ | Safe string | `Alice` | Quotes omitted (see rules below) |
212
+ | Unsafe string | `"hello world"` | JSON quotes and escaping retained |
213
+ | Nested object `{k: v}` | `{k:v}` | Keys follow same safe/unsafe rules |
214
+ | Array | See null omission rules below | — |
215
+
216
+ #### Safe Strings (Conditions for Omitting Quotes)
217
+
218
+ A string can omit quotes only when it satisfies **all** of the following conditions; otherwise `JSON.stringify` escaping is applied:
219
+
220
+ 1. Non-empty string
221
+ 2. Not a keyword literal: `null`, `true`, `false`
222
+ 3. Does not match number pattern: `/^-?(0|[1-9]\d*)(\.\d+)?([eE][+-]?\d+)?$/` (e.g. `"123"`, `"-3.14"`, `"1e10"` all retain quotes)
223
+ 4. Does not start with a digit or minus sign `-`
224
+ 5. Does not contain whitespace, `[`, `]`, `{`, `}`, `,`, `:`, `"` or similar characters
225
+
226
+ | String | Result | Reason |
227
+ |--------|--------|--------|
228
+ | `"Alice"` | `Alice` | Safe, quotes omitted |
229
+ | `"hello world"` | `"hello world"` | Contains space |
230
+ | `"123"` | `"123"` | Looks like a number |
231
+ | `"-3.14"` | `"-3.14"` | Looks like a number |
232
+ | `"null"` | `"null"` | Keyword |
233
+ | `""` | `""` | Empty string |
234
+ | `"-abc"` | `"-abc"` | Starts with minus sign |
235
+ | `"a:b"` | `"a:b"` | Contains colon |
236
+
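The five conditions can be read as a single predicate. The sketch below is one possible reading, not the library's code: `isSafeString` and `quoteIfNeeded` are hypothetical names, and condition 5 is limited to the characters the list names explicitly, since "similar characters" is not spelled out.

```js
// Sketch only: hypothetical helpers, not part of the slimjson API.
const NUMBER_PATTERN = /^-?(0|[1-9]\d*)(\.\d+)?([eE][+-]?\d+)?$/;
const KEYWORDS = new Set(['null', 'true', 'false']);
// Assumption: only the explicitly listed characters are treated as unsafe.
const UNSAFE_CHARS = /[\s\[\]{},:"]/;

function isSafeString(s) {
  if (s.length === 0) return false;          // 1. non-empty
  if (KEYWORDS.has(s)) return false;         // 2. not a keyword literal
  if (NUMBER_PATTERN.test(s)) return false;  // 3. does not look like a number
  if (/^[0-9-]/.test(s)) return false;       // 4. no leading digit or minus sign
  if (UNSAFE_CHARS.test(s)) return false;    // 5. no structural characters
  return true;
}

function quoteIfNeeded(s) {
  // Unsafe strings fall back to standard JSON quoting and escaping.
  return isSafeString(s) ? s : JSON.stringify(s);
}

quoteIfNeeded('Alice');       // Alice
quoteIfNeeded('hello world'); // "hello world"
quoteIfNeeded('123');         // "123"
quoteIfNeeded('a:b');         // "a:b"
```

The checks exist so that an unquoted token can never be misread by the parser as a number, a keyword, or a piece of structure.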
237
+ #### Object Key Quoting Rules
238
+
239
+ Nested object keys in `schema` follow the same safe string check:
240
+
241
+ ```js
242
+ stringify({ schema: [{ profile: ['name', 'age'] }], data: [...] });
243
+ // {schema:[{profile:[name,age]}],data:[...]} ← profile is safe, quotes omitted
244
+
245
+ stringify({ schema: [{ "my key": ['name'] }], data: [...] });
246
+ // {schema:[{"my key":[name]}],data:[...]} ← my key contains space, quotes retained
247
+ ```
248
+
249
+ #### Array Null Omission Rules
250
+
251
+ `null` / `undefined` values in arrays are omitted as comma slots, taking no text space:
252
+
253
+ | Original Array | Serialized Result | Notes |
254
+ |---------------|------------------|-------|
255
+ | `["a", null, null]` | `[a,,]` | Two trailing empty slots |
256
+ | `[null, 1, null]` | `[,1,]` | Leading and trailing empty slots |
257
+ | `[]` | `[]` | Empty array |
258
+ | `[null]` | `[null]` | **Special**: `[,]` means 2 nulls, so single null retains literal |
259
+
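The table above can be reproduced with a few lines. This is a minimal sketch, not the library's serializer: `serializeArray` is a hypothetical name, and element serialization defaults to `String`, which assumes every non-null element is a number or an already-safe string.

```js
// Sketch only: `serializeArray` is a hypothetical helper, not part of the slimjson API.
// Assumption: non-null elements are numbers or safe strings unless a serializer is supplied.
function serializeArray(arr, serializeValue = String) {
  // Special case: "[,]" would read back as two nulls, so a lone null keeps its literal.
  if (arr.length === 1 && arr[0] == null) return '[null]';
  // null / undefined become empty slots; the separating commas are kept.
  const parts = arr.map((value) => (value == null ? '' : serializeValue(value)));
  return '[' + parts.join(',') + ']';
}

serializeArray(['a', null, null]); // [a,,]
serializeArray([null, 1, null]);   // [,1,]
serializeArray([]);                // []
serializeArray([null]);            // [null]
```

Because an array of n elements always yields n − 1 commas, a parser can recover the original length by counting separators, which is why the single-null case needs its own literal.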
260
+ ### `parse(text)`
261
+
262
+ Parses text produced by `stringify`, restoring omitted `null` values:
263
+
264
+ ```js
265
+ const parsed = parse(text);
266
+ // deep-equal to compressed
267
+ ```
268
+
269
+ Supports full JSON type parsing (strings, numbers, booleans, null, nested objects/arrays), compatible with escape characters and Unicode.
270
+
271
+ ## Complete Example
272
+
273
+ ```js
274
+ import { compress, decompress, stringify, parse } from 'slimjson';
275
+
276
+ const data = [
277
+ { name: 'Alice', age: 28, profile: { avatar: 'a.jpg', bio: 'Hello' } },
278
+ { name: 'Bob', age: 35, profile: { avatar: 'b.jpg' } }, // missing bio
279
+ ];
280
+
281
+ // Compress Stringify Parse Decompress
282
+ const compressed = compress(data);
283
+ const text = stringify(compressed);
284
+ const parsed = parse(text);
285
+ const restored = decompress(parsed);
286
+
287
+ // restored is deep-equal to data
288
+
289
+ // Enable trimTrailingNulls for further compression
290
+ const compressedTrim = compress(data, { trimTrailingNulls: true });
291
+ const textTrim = stringify(compressedTrim);
292
+ // textTrim is shorter than text
293
+ ```
294
+
295
+ ### Compression Ratio Calculation
296
+
297
+ ```js
298
+ const originalSize = Buffer.byteLength(JSON.stringify(data));
299
+ const compressedSize = Buffer.byteLength(stringify(compress(data)));
300
+ const ratio = ((originalSize - compressedSize) / originalSize * 100).toFixed(1);
301
+ console.log(`Compression ratio: ${ratio}%`);
302
+ ```
303
+
304
+ ## Compression Results
305
+
306
+ Based on actual data from `compress-test.js` benchmarks (18 test cases, all roundtrip decompressions verified):
307
+
308
+ | Data Type | Count | Original | No trim | Ratio | Trim | Ratio | Diff |
309
+ |-----------|-------|----------|---------|-------|------|-------|------|
310
+ | Simple users | 100 | 14.69 KB | 8.69 KB | 40.82% | 8.69 KB | 40.82% | — |
311
+ | Simple users | 1,000 | 147.74 KB | 87.25 KB | 40.94% | 87.25 KB | 40.94% | — |
312
+ | Simple users | 10,000 | 1.45 MB | 881.58 KB | 40.71% | 881.58 KB | 40.71% | — |
313
+ | Nested users (with profile.social) | 100 | 23.41 KB | 15.28 KB | 34.74% | 15.24 KB | 34.87% | -33 B |
314
+ | Nested users (with profile.social) | 1,000 | 236.03 KB | 153.93 KB | 34.78% | 153.64 KB | 34.91% | -301 B |
315
+ | Nested users (with profile.social) | 5,000 | 1.16 MB | 777.89 KB | 34.58% | 776.42 KB | 34.70% | -1.47 KB |
316
+ | Orders (1-5 items per order) | 100 | 31.28 KB | 13.65 KB | 56.38% | 13.65 KB | 56.38% | — |
317
+ | Orders (1-5 items per order) | 500 | 163.18 KB | 70.83 KB | 56.59% | 70.83 KB | 56.59% | — |
318
+ | Orders (1-5 items per order) | 2,000 | 655.99 KB | 284.29 KB | 56.66% | 284.29 KB | 56.66% | — |
319
+ | School data (2 grades x 2 classes x 10 students) | 4 | 12.26 KB | 5.25 KB | 57.20% | 5.23 KB | 57.36% | -21 B |
320
+ | School data (6 grades x 4 classes x 30 students) | 24 | 217.73 KB | 89.71 KB | 58.80% | 89.31 KB | 58.98% | -406 B |
321
+ | School data (6 grades x 6 classes x 50 students) | 36 | 539.64 KB | 222.56 KB | 58.76% | 221.66 KB | 58.92% | -923 B |
322
+ | Sparse fields (100 records x 20 fields) | 100 | 19.50 KB | 6.34 KB | 67.46% | 6.28 KB | 67.78% | -64 B |
323
+ | Sparse fields (500 records x 30 fields) | 500 | 143.26 KB | 45.09 KB | 68.52% | 44.78 KB | 68.74% | -326 B |
324
+ | Sparse fields (2000 records x 50 fields) | 2,000 | 957.96 KB | 294.69 KB | 69.24% | 293.54 KB | 69.36% | -1.15 KB |
325
+ | Deep nesting (small) | 2 | 17.47 KB | 8.08 KB | 53.73% | 8.08 KB | 53.73% | — |
326
+ | Deep nesting (medium) | 3 | 141.89 KB | 64.55 KB | 54.50% | 64.55 KB | 54.50% | — |
327
+ | Deep nesting (large) | 5 | 629.42 KB | 286.40 KB | 54.50% | 286.40 KB | 54.50% | — |
328
+
329
+ **Conclusions:**
330
+ 1. Longer field names and more fields yield better compression
331
+ 2. Object arrays (order items, student lists) show significant compression (55–59%)
332
+ 3. Sparse fields achieve the highest compression — missing field nulls omitted as empty slots (67–69%)
333
+ 4. `trimTrailingNulls` saves additional space when data has missing trailing fields (up to 1.47 KB for 5,000 records)
334
+ 5. When data has no missing fields, trim provides no extra benefit
335
+ 6. Deeper nested structures achieve better compression
336
+ 7. `stringify` quote omission further reduces text size
337
+
338
+ ## Token Efficiency Comparison
339
+
340
+ Token consumption comparison across formats (based on 6 real-world datasets).
341
+
342
+ #### Mixed-Structure Track
343
+
344
+ Datasets with nested or semi-uniform structures. CSV excluded as it cannot represent these structures.
345
+
346
+ ```
347
+ 🛒 E-commerce orders (nested) ┊ Tabular: 33%
348
+
349
+ slimjson ████████░░░░░░░░░░░░ 46,233 tokens
350
+ ├─ vs JSON (−57.8%) 109,574 tokens
351
+ ├─ vs JSON compact (−33.5%) 69,528 tokens
352
+ ├─ vs TOON (−36.9%) 73,246 tokens
353
+ ├─ vs YAML (−45.9%) 85,451 tokens
354
+ └─ vs XML (−62.5%) 123,272 tokens
355
+
356
+ 📃 Semi-uniform event logs ┊ Tabular: 50%
357
+
358
+ slimjson ██████████░░░░░░░░░░ 91,630 tokens
359
+ ├─ vs JSON (−49.4%) 181,141 tokens
360
+ ├─ vs JSON compact (−28.7%) 128,480 tokens
361
+ ├─ vs TOON (−40.5%) 154,032 tokens
362
+ ├─ vs YAML (−41.0%) 155,346 tokens
363
+ └─ vs XML (−55.5%) 205,796 tokens
364
+
365
+ 🧩 Deeply nested configuration ┊ Tabular: 0%
366
+
367
+ slimjson ████████████░░░░░░░░ 547 tokens
368
+ ├─ vs JSON (−39.6%) 905 tokens
369
+ ├─ vs JSON compact (−0.9%) 552 tokens
370
+ ├─ vs TOON (−11.5%) 618 tokens
371
+ ├─ vs YAML (−17.4%) 662 tokens
372
+ └─ vs XML (−45.1%) 997 tokens
373
+
374
+ ──────────────────────────────────── Total ────────────────────────────────────
375
+ slimjson █████████░░░░░░░░░░░ 138,410 tokens
376
+ ├─ vs JSON (−52.5%) 291,620 tokens
377
+ ├─ vs JSON compact (−30.3%) 198,560 tokens
378
+ ├─ vs TOON (−39.3%) 227,896 tokens
379
+ ├─ vs YAML (−42.7%) 241,459 tokens
380
+ └─ vs XML (−58.1%) 330,065 tokens
381
+ ```
382
+
383
+ #### Flat-Only Track
384
+
385
+ Flat tabular datasets where CSV is applicable.
386
+
387
+ ```
388
+ 👥 Uniform employee records ┊ Tabular: 100%
389
+
390
+ CSV ████████████████████ 47,137 tokens
391
+ slimjson ████████████████████ 47,067 tokens (-0.1% vs CSV)
392
+ ├─ vs JSON (−63.0%) 127,050 tokens
393
+ ├─ vs JSON compact (−40.5%) 79,046 tokens
394
+ ├─ vs TOON (−5.8%) 49,966 tokens
395
+ ├─ vs YAML (−52.9%) 100,033 tokens
396
+ └─ vs XML (−67.9%) 146,596 tokens
397
+
398
+ 📈 Time-series analytics data ┊ Tabular: 100%
399
+
400
+ CSV ███████████████████░ 8,392 tokens
401
+ slimjson ████████████████████ 8,767 tokens (+4.5% vs CSV)
402
+ ├─ vs JSON (−60.6%) 22,254 tokens
403
+ ├─ vs JSON compact (−38.3%) 14,220 tokens
404
+ ├─ vs TOON (−3.9%) 9,124 tokens
405
+ ├─ vs YAML (−50.9%) 17,867 tokens
406
+ └─ vs XML (−67.1%) 26,625 tokens
407
+
408
+ ⭐ Top 100 GitHub repositories ┊ Tabular: 100%
409
+
410
+ CSV ████████████████████ 8,512 tokens
411
+ slimjson ████████████████████ 8,550 tokens (+0.4% vs CSV)
412
+ ├─ vs JSON (−43.5%) 15,144 tokens
413
+ ├─ vs JSON compact (−25.4%) 11,454 tokens
414
+ ├─ vs TOON (−2.2%) 8,744 tokens
415
+ ├─ vs YAML (−34.9%) 13,128 tokens
416
+ └─ vs XML (−50.0%) 17,095 tokens
417
+
418
+ ──────────────────────────────────── Total ────────────────────────────────────
419
+ CSV ████████████████████ 64,041 tokens
420
+ slimjson ████████████████████ 64,384 tokens (+0.5% vs CSV)
421
+ ├─ vs JSON (−60.8%) 164,448 tokens
422
+ ├─ vs JSON compact (−38.5%) 104,720 tokens
423
+ ├─ vs TOON (−5.1%) 67,834 tokens
424
+ ├─ vs YAML (−50.9%) 131,028 tokens
425
+ └─ vs XML (−66.2%) 190,316 tokens
426
+ ```
427
+
428
+ > On mixed-structure data, slimjson uses **52.5%** fewer tokens than JSON. On flat tabular data, it is on par with CSV (only 0.5% more tokens).
429
+
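The percentages in the charts above appear to be relative token differences against each baseline format. A short sketch of that arithmetic, using the totals from the two tracks (`percentChange` is a hypothetical helper name):

```js
// Sketch only: relative difference in token count versus a baseline format.
function percentChange(slimTokens, baselineTokens) {
  return ((slimTokens - baselineTokens) / baselineTokens) * 100;
}

// Mixed-structure track total, slimjson vs JSON.
percentChange(138410, 291620).toFixed(1); // "-52.5"

// Flat-only track total, slimjson vs CSV.
percentChange(64384, 64041).toFixed(1);   // "0.5"
```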
430
+ ## LLM Data Retrieval Accuracy
431
+
432
+ Accuracy tested with 209 data retrieval questions across different input formats.
433
+
434
+ #### Efficiency Ranking (Accuracy per 1K Tokens)
435
+
436
+ ```
437
+ slimjson ████████████████████ 44.4 acc%/1K tok │ 94.7% acc │ 2,134 tokens
438
+ TOON ███████████████░░░░░ 34.0 acc%/1K tok │ 92.8% acc │ 2,734 tokens
439
+ JSON compact ██████████████░░░░░░ 31.0 acc%/1K tok │ 95.2% acc │ 3,072 tokens
440
+ YAML ███████████░░░░░░░░░ 25.4 acc%/1K tok │ 94.3% acc │ 3,716 tokens
441
+ JSON ██████████░░░░░░░░░░ 21.1 acc%/1K tok │ 95.7% acc │ 4,538 tokens
442
+ XML ████████░░░░░░░░░░░░ 18.5 acc%/1K tok │ 95.7% acc │ 5,162 tokens
443
+ ```
444
+
445
+ *Efficiency score = (Accuracy % ÷ Tokens) × 1,000. Higher is better.*
446
+
447
+ > slimjson achieves **94.7%** accuracy (vs JSON's 95.7%) while using **53.0% fewer tokens**.
448
+
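The efficiency score defined above is simple enough to check by hand. The sketch below applies the stated formula to the rounded accuracy and token figures from the ranking; `efficiencyScore` is a hypothetical helper name, and published scores may differ in the last digit if they were computed from unrounded accuracy.

```js
// Sketch only: efficiency score = (accuracy % / tokens) * 1,000, as defined above.
function efficiencyScore(accuracyPercent, tokens) {
  return (accuracyPercent / tokens) * 1000;
}

efficiencyScore(94.7, 2134).toFixed(1); // "44.4" (slimjson)
efficiencyScore(95.7, 4538).toFixed(1); // "21.1" (JSON)
efficiencyScore(95.7, 5162).toFixed(1); // "18.5" (XML)
```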
449
+ #### Per-Model Accuracy
450
+
451
+ ```
452
+ deepseek-v4-flash
453
+ JSON ███████████████████░ 95.7% (200/209)
454
+ XML ███████████████████░ 95.7% (200/209)
455
+ JSON compact ███████████████████░ 95.2% (199/209)
456
+ → slimjson ███████████████████░ 94.7% (198/209)
457
+ YAML ███████████████████░ 94.3% (197/209)
458
+ TOON ███████████████████░ 92.8% (194/209)
459
+ CSV ██████████████████░░ 91.7% (100/109)
460
+ ```
461
+
462
+ #### Accuracy by Question Type
463
+
464
+ | Question Type | JSON | XML | JSON compact | slimjson | YAML | TOON | CSV |
465
+ |---------------|------|-----|-------------|----------|------|------|-----|
466
+ | Field Retrieval | 98.5% | 97.1% | 98.5% | 95.6% | 97.1% | 91.2% | 96.9% |
467
+ | Aggregation | 98.4% | 96.8% | 95.2% | 95.2% | 93.7% | 95.2% | 86.2% |
468
+ | Filtering | 97.9% | 97.9% | 100.0% | 100.0% | 100.0% | 100.0% | 96.3% |
469
+ | Structure Awareness | 88.0% | 92.0% | 84.0% | 92.0% | 88.0% | 88.0% | 87.5% |
470
+ | Structural Validation | 40.0% | 60.0% | 60.0% | 40.0% | 40.0% | 40.0% | 80.0% |
471
+
472
+ #### Datasets Tested
473
+
474
+ | Dataset | Rows | Structure | CSV Support |
475
+ |---------|------|-----------|-------------|
476
+ | Uniform employee records | 100 | uniform | ✓ |
477
+ | E-commerce orders (nested) | 50 | nested | ✗ |
478
+ | Time-series analytics data | 60 | uniform | ✓ |
479
+ | Top 100 GitHub repositories | 100 | uniform | ✓ |
480
+ | Semi-uniform event logs | 75 | semi-uniform | ✗ |
481
+ | Deeply nested configuration | 11 | deep | ✗ |
482
+
483
+ ## Development
484
+
485
+ ```bash
486
+ # Run tests (192 cases, 100% coverage)
487
+ npm test
488
+
489
+ # Run compression ratio benchmarks (with trim comparison)
490
+ node compress-test.js
491
+ ```
492
+
493
+ ## GitHub
494
+
495
+ [https://github.com/LastHeaven/slimjson](https://github.com/LastHeaven/slimjson)
496
+
497
+ ## License
498
+
499
+ MIT