slimjson 1.0.4 → 1.1.0
- package/.claude/settings.local.json +5 -2
- package/.idea/git_toolbox_prj.xml +15 -0
- package/.idea/modules.xml +8 -0
- package/.idea/slimjson.iml +12 -0
- package/.idea/vcs.xml +6 -0
- package/README.md +510 -361
- package/README_EN.md +499 -350
- package/compress-file.js +41 -41
- package/compress-ratio.js +70 -70
- package/compress-test.js +436 -436
- package/compress.js +267 -146
- package/decompress-file.js +42 -42
- package/esm.mjs +4 -4
- package/package.json +24 -24
- package/test.js +975 -975
- package/data/searchGroup.json +0 -96365
package/README_EN.md
CHANGED
@@ -1,350 +1,499 @@
# slimjson

[中文](./README.md) | English

A lightweight compression tool for object arrays. It converts JSON object arrays with repeated keys into a compact `{ schema, data }` format and can omit `null` values during serialization to reduce size further.

## Use Cases

- **API List Endpoints**: Backend list endpoints where every object carries the same key names, resulting in massive redundancy
- **Heterogeneous Fields**: Objects with different fields (backend omits null fields on demand)
- **Network Transfer Compression**: Minimizing JSON text size for network transmission
- **LLM Context Compression**: Compress large structured data (e.g. database query results, API responses, knowledge base entries) before sending it in prompts, reducing token consumption and API costs
- **LLM Tool Calling**: Function calling / tool_use results are often structured object arrays; compressing them before feeding them back to the model reduces context window usage, so the model can handle more complex data within a limited token budget
- **LLM-Friendly Format**: The compressed `{ schema, data }` format separates schema (field definitions) from data, with each key appearing only once. Models can more accurately understand data structures and extract information by field name, with less confusion compared to raw JSON with repeated keys

## Installation

```bash
npm install slimjson
```

## API

### `compress(source, opts?)`

Compresses an object array into a `{ schema, data }` structure:

```js
import { compress } from 'slimjson';

const users = [
  { name: 'Alice', age: 25, city: 'NYC' },
  { name: 'Bob', age: 30, city: 'LA' },
];

const compressed = compress(users);
// {
//   schema: [['name', 'age', 'city']],
//   data: [['Alice', 25, 'NYC'], ['Bob', 30, 'LA']]
// }
```

**Parameters:**

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `source` | `Object[]` or `Object` | — | Object array to compress (single object is auto-wrapped) |
| `opts` | `Object` | — | Optional configuration |
| `opts.trimTrailingNulls` | `boolean` | `false` | Remove trailing `null` values from each row |

**Features:**

- `schema` takes the union of all object keys, ordered by first appearance
- Missing fields in an object → fill `null` at the corresponding data position
- Nested objects are recursively processed: represented as `{ "fieldName": [childKeys] }` in `schema`
- Object arrays (e.g. order items) are recursively compressed the same way
- When a plain object is passed (not an array), it is treated as a single-element array

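
The union-and-fill behaviour in the feature list can be sketched for the flat case as follows. This is an illustrative re-implementation, not slimjson's source: `sketchCompress` is a hypothetical name, and nested objects and object arrays are deliberately left out.

```js
// Hypothetical sketch of the flat case only; not the library's implementation.
function sketchCompress(source) {
  // A plain object is treated as a single-element array.
  const rows = Array.isArray(source) ? source : [source];

  // Union of all keys, ordered by first appearance.
  const keys = [];
  const seen = new Set();
  for (const row of rows) {
    for (const key of Object.keys(row)) {
      if (!seen.has(key)) {
        seen.add(key);
        keys.push(key);
      }
    }
  }

  // One positional row per object; missing fields become null.
  const data = rows.map((row) =>
    keys.map((key) => (row[key] === undefined ? null : row[key]))
  );

  // The README examples wrap the key list in an outer array.
  return { schema: [keys], data };
}

const flat = sketchCompress([
  { name: 'Alice', age: 25 },
  { name: 'Bob', city: 'LA' },
]);
console.log(JSON.stringify(flat));
// {"schema":[["name","age","city"]],"data":[["Alice",25,null],["Bob",null,"LA"]]}
```

Keeping the keys in first-appearance order means a row's position alone identifies the field, which is what lets the key names be dropped from every row.
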
#### Nested Object Example

```js
const data = [
  { name: 'Alice', age: 28, profile: { avatar: 'a.jpg', bio: 'Hello' } },
  { name: 'Bob', age: 35, profile: { avatar: 'b.jpg', file: null } },
  { name: 'Carol' },
];

compress(data);
// {
//   schema: [['name', 'age', { profile: ['avatar', 'bio', 'file'] }]],
//   data: [
//     ['Alice', 28, ['a.jpg', 'Hello', null]],
//     ['Bob', 35, ['b.jpg', null, null]],
//     ['Carol', null, null]
//   ]
// }
```

#### `trimTrailingNulls`: Remove Trailing nulls

When enabled, trailing `null` values in each row (and nested sub-rows) are removed for further compression:

```js
compress(data, { trimTrailingNulls: true });
// {
//   schema: [['name', 'age', { profile: ['avatar', 'bio', 'file'] }]],
//   data: [
//     ['Alice', 28, ['a.jpg', 'Hello']],
//     ['Bob', 35, ['b.jpg']],
//     ['Carol']
//   ]
// }
```

`decompress` automatically fills missing trailing values with `null`, so the roundtrip result is the same as without trimming. Fields that were absent in the source come back as explicit `null`s:

```js
decompress(compress(data, { trimTrailingNulls: true }));
// [
//   { name: 'Alice', age: 28, profile: { avatar: 'a.jpg', bio: 'Hello', file: null } },
//   { name: 'Bob', age: 35, profile: { avatar: 'b.jpg', bio: null, file: null } },
//   { name: 'Carol', age: null, profile: null }
// ]
```

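
A minimal sketch of the trimming step, assuming every nested array inside a row is itself a sub-row. `trimTrailingNullsSketch` is a hypothetical helper, and slimjson's internal handling (for example of sub-rows that become empty) may differ.

```js
// Hypothetical helper; slimjson's internal trimming may differ.
function trimTrailingNullsSketch(row) {
  // Trim nested arrays first so inner trailing nulls are removed too.
  const out = row.map((v) => (Array.isArray(v) ? trimTrailingNullsSketch(v) : v));
  // Then drop nulls from the end; interior nulls keep their position.
  while (out.length > 0 && out[out.length - 1] === null) {
    out.pop();
  }
  return out;
}

console.log(JSON.stringify(trimTrailingNullsSketch(['Mouse', 99, ['White', null, '4000', null]])));
// ["Mouse",99,["White",null,"4000"]]
console.log(JSON.stringify(trimTrailingNullsSketch(['Carol', null, null])));
// ["Carol"]
```

Only trailing nulls can be dropped safely: an interior null still has values after it, so removing it would shift those values onto the wrong schema positions.
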
#### Object Array Example (Order Scenario)

```js
const orders = [
  { orderId: 'A001', items: [{ name: 'Keyboard', price: 299 }, { name: 'Mouse', price: 99 }] },
  { orderId: 'A002', items: [{ name: 'Monitor', price: 1999 }] },
];

compress(orders);
// {
//   schema: [['orderId', { items: [['name', 'price']] }]],
//   data: [['A001', [['Keyboard', 299], ['Mouse', 99]]], ['A002', [['Monitor', 1999]]]]
// }

stringify(compress(orders));
// {schema:[[orderId,{items:[[name,price]]}]],data:[[A001,[[Keyboard,299],[Mouse,99]]],[A002,[[Monitor,1999]]]]}
// `items` (nested object key) and `A001` (safe string value) are written without quotes
```

#### Three-Level Nesting Example (Order → Item → Specs)

```js
const orders = [
  {
    orderId: 'A001',
    customer: 'Alice',
    items: [
      { name: 'Keyboard', price: 299, specs: { color: 'Black', layout: '104-key' } },
      { name: 'Mouse', price: 99, specs: { color: 'White', dpi: '4000' } },
    ]
  },
  {
    orderId: 'A002',
    customer: 'Bob',
    items: [
      { name: 'Monitor', price: 1999, specs: { color: 'Silver', size: '27in' } },
    ]
  },
];

compress(orders);
// {
//   schema: [[
//     'orderId',
//     'customer',
//     { items: [['name', 'price', { specs: ['color', 'layout', 'dpi', 'size'] }]] }
//   ]],
//   data: [
//     ['A001', 'Alice', [
//       ['Keyboard', 299, ['Black', '104-key', null, null]],
//       ['Mouse', 99, ['White', null, '4000', null]]
//     ]],
//     ['A002', 'Bob', [
//       ['Monitor', 1999, ['Silver', null, null, '27in']]
//     ]]
//   ]
// }
// The specs schema is the union of every item's keys (color, layout, dpi, size);
// fields an item does not have are filled with null.

compress(orders, { trimTrailingNulls: true });
// data becomes:
// [
//   ['A001', 'Alice', [
//     ['Keyboard', 299, ['Black', '104-key']],
//     ['Mouse', 99, ['White', null, '4000']]
//   ]],
//   ['A002', 'Bob', [
//     ['Monitor', 1999, ['Silver', null, null, '27in']]
//   ]]
// ]
```

### `decompress(compressed)`

Restores `{ schema, data }` back to an object array. Missing trailing values are automatically filled with `null`, so a field that was absent in the source comes back as an explicit `null`:

```js
const restored = decompress(compressed);
// deep-equal to the original array when no source object omitted a field
```

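
The null-filling step can be sketched for a flat schema as below. `sketchDecompress` is a hypothetical name used for illustration, nested schema entries are not handled, and this is not the library's code.

```js
// Hypothetical sketch for a flat schema (no nested entries); not slimjson's code.
function sketchDecompress({ schema, data }) {
  const keys = schema[0];
  return data.map((row) => {
    const obj = {};
    keys.forEach((key, i) => {
      // Slots beyond the end of a trimmed row become null.
      obj[key] = i < row.length && row[i] !== undefined ? row[i] : null;
    });
    return obj;
  });
}

const restoredSketch = sketchDecompress({
  schema: [['name', 'age', 'city']],
  data: [['Alice', 25, 'NYC'], ['Carol']],
});
console.log(JSON.stringify(restoredSketch));
// [{"name":"Alice","age":25,"city":"NYC"},{"name":"Carol","age":null,"city":null}]
```
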
### `stringify(compressed)`

Serializes the compress result into compact text. Compared to `JSON.stringify`, the following optimization rules are applied:

```js
const data = [
  { name: 'Alice', age: 25 },
  { name: 'Bob', age: 30 },
];

const text = stringify(compress(data));
// {schema:[[name,age]],data:[[Alice,25],[Bob,30]]}

JSON.stringify(compress(data));
// {"schema":[["name","age"]],"data":[["Alice",25],["Bob",30]]}
```

#### Serialization Rules

| Value Type | Serialized Result | Notes |
|------------|------------------|-------|
| `null` / `undefined` | `null` | — |
| Finite number | `25` | Direct output, no quotes |
| `NaN` / `Infinity` | `null` | Non-finite numbers unified to null |
| `true` / `false` | `true` / `false` | — |
| Safe string | `Alice` | Quotes omitted (see rules below) |
| Unsafe string | `"hello world"` | JSON quotes and escaping retained |
| Nested object `{k: v}` | `{k:v}` | Keys follow same safe/unsafe rules |
| Array | See null omission rules below | — |

#### Safe Strings (Conditions for Omitting Quotes)

A string can omit quotes only when it satisfies **all** of the following conditions; otherwise `JSON.stringify` escaping is applied:

1. Non-empty string
2. Not a keyword literal: `null`, `true`, `false`
3. Does not match number pattern: `/^-?(0|[1-9]\d*)(\.\d+)?([eE][+-]?\d+)?$/` (e.g. `"123"`, `"-3.14"`, `"1e10"` all retain quotes)
4. Does not start with a digit or minus sign `-`
5. Does not contain whitespace, `[`, `]`, `{`, `}`, `,`, `:`, `"` or similar characters

| String | Result | Reason |
|--------|--------|--------|
| `"Alice"` | `Alice` | Safe, quotes omitted |
| `"hello world"` | `"hello world"` | Contains space |
| `"123"` | `"123"` | Looks like a number |
| `"-3.14"` | `"-3.14"` | Looks like a number |
| `"null"` | `"null"` | Keyword |
| `""` | `""` | Empty string |
| `"-abc"` | `"-abc"` | Starts with minus sign |
| `"a:b"` | `"a:b"` | Contains colon |

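
The five conditions can be read as a single predicate. The sketch below is an assumption-laden illustration: `isSafeStringSketch` and `emitString` are hypothetical names, and the unsafe-character set covers only the characters the list names explicitly, since "similar characters" is not specified further.

```js
// Hypothetical sketch of the five conditions; not the library's implementation.
const NUMBER_PATTERN = /^-?(0|[1-9]\d*)(\.\d+)?([eE][+-]?\d+)?$/;
const KEYWORDS = new Set(['null', 'true', 'false']);

function isSafeStringSketch(s) {
  if (s.length === 0) return false;           // 1. non-empty
  if (KEYWORDS.has(s)) return false;          // 2. not a keyword literal
  if (NUMBER_PATTERN.test(s)) return false;   // 3. not number-like
  if (/^[0-9-]/.test(s)) return false;        // 4. no leading digit or minus sign
  if (/[\s\[\]{},:"]/.test(s)) return false;  // 5. no whitespace or structural characters
  return true;
}

// Safe strings are written bare; everything else keeps JSON quoting and escaping.
function emitString(s) {
  return isSafeStringSketch(s) ? s : JSON.stringify(s);
}

for (const s of ['Alice', 'hello world', '123', '-3.14', 'null', '', '-abc', 'a:b']) {
  console.log(JSON.stringify(s), '->', emitString(s));
}
```

The checks exist so that a bare token can never be mistaken for a number, a keyword, or a piece of structure when the text is parsed back.
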
#### Object Key Quoting Rules

Nested object keys in `schema` follow the same safe string check:

```js
stringify({ schema: [{ profile: ['name', 'age'] }], data: [...] });
// {schema:[{profile:[name,age]}],data:[...]}  ← profile is safe, quotes omitted

stringify({ schema: [{ "my key": ['name'] }], data: [...] });
// {schema:[{"my key":[name]}],data:[...]}  ← my key contains space, quotes retained
```

#### Array Null Omission Rules

`null` / `undefined` values in arrays are omitted as comma slots, taking no text space:

| Original Array | Serialized Result | Notes |
|---------------|------------------|-------|
| `["a", null, null]` | `[a,,]` | Two trailing empty slots |
| `[null, 1, null]` | `[,1,]` | Leading and trailing empty slots |
| `[]` | `[]` | Empty array |
| `[null]` | `[null]` | **Special**: `[,]` means 2 nulls, so single null retains literal |

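
A minimal sketch of the slot rule for flat arrays of scalars. `emitArraySketch` is a hypothetical name, and for brevity strings keep their JSON quotes here, unlike the real `stringify` output shown in the table.

```js
// Hypothetical sketch: flat arrays of scalars only. Strings keep their JSON
// quotes here for brevity, unlike the real stringify output.
function emitArraySketch(arr) {
  if (arr.length === 1 && (arr[0] === null || arr[0] === undefined)) {
    return '[null]'; // special case: a lone null keeps its literal
  }
  // Every other null or undefined becomes an empty slot between commas.
  const slots = arr.map((v) =>
    v === null || v === undefined ? '' : JSON.stringify(v)
  );
  return '[' + slots.join(',') + ']';
}

console.log(emitArraySketch(['a', null, null])); // ["a",,]
console.log(emitArraySketch([null, 1, null]));   // [,1,]
console.log(emitArraySketch([]));                // []
console.log(emitArraySketch([null]));            // [null]
console.log(emitArraySketch([null, null]));      // [,]
```

Without the special case, a one-element array holding `null` would serialize to `[]` and become indistinguishable from the empty array.
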
### `parse(text)`

Parses text produced by `stringify`, restoring omitted `null` values:

```js
const parsed = parse(text);
// deep-equal to compressed
```

Supports full JSON type parsing (strings, numbers, booleans, null, nested objects/arrays), compatible with escape characters and Unicode.

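
To illustrate how empty slots turn back into `null`, here is a sketch that reads one flat array of scalars. `parseFlatArraySketch` is a hypothetical helper; nested arrays, objects, and escaped quotes inside strings are out of scope, so it is far narrower than the real `parse`.

```js
// Hypothetical sketch: parses one flat array of scalars such as [a,,"x y",1].
// Nested arrays/objects and escaped quotes inside strings are out of scope.
function parseFlatArraySketch(text) {
  const inner = text.slice(1, -1); // strip the surrounding [ and ]
  if (inner === '') return [];     // [] is the empty array

  // Split on commas that are not inside a quoted string.
  const slots = [];
  let current = '';
  let inQuotes = false;
  for (const ch of inner) {
    if (ch === '"') inQuotes = !inQuotes;
    if (ch === ',' && !inQuotes) {
      slots.push(current);
      current = '';
    } else {
      current += ch;
    }
  }
  slots.push(current);

  return slots.map((slot) => {
    if (slot === '' || slot === 'null') return null;   // omitted slot or literal null
    if (slot === 'true') return true;
    if (slot === 'false') return false;
    if (slot.startsWith('"')) return JSON.parse(slot); // quoted string
    if (/^-?(0|[1-9]\d*)(\.\d+)?([eE][+-]?\d+)?$/.test(slot)) return Number(slot);
    return slot;                                       // bare safe string
  });
}

console.log(JSON.stringify(parseFlatArraySketch('[a,,]')));  // ["a",null,null]
console.log(JSON.stringify(parseFlatArraySketch('[,]')));    // [null,null]
console.log(JSON.stringify(parseFlatArraySketch('[Alice,"hello world","123",25,true]')));
// ["Alice","hello world","123",25,true]
```
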
## Complete Example

```js
import { compress, decompress, stringify, parse } from 'slimjson';

const data = [
  { name: 'Alice', age: 28, profile: { avatar: 'a.jpg', bio: 'Hello' } },
  { name: 'Bob', age: 35, profile: { avatar: 'b.jpg' } }, // missing bio
];

// Compress → Stringify → Parse → Decompress
const compressed = compress(data);
const text = stringify(compressed);
const parsed = parse(text);
const restored = decompress(parsed);

// restored matches data, except that Bob's missing bio comes back as bio: null

// Enable trimTrailingNulls for further compression
const compressedTrim = compress(data, { trimTrailingNulls: true });
const textTrim = stringify(compressedTrim);
// textTrim is shorter than text
```

### Compression Ratio Calculation

```js
const originalSize = Buffer.byteLength(JSON.stringify(data));
const compressedSize = Buffer.byteLength(stringify(compress(data)));
const ratio = ((originalSize - compressedSize) / originalSize * 100).toFixed(1);
console.log(`Compression ratio: ${ratio}%`);
```

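
`Buffer` is specific to Node.js. As a sketch for other runtimes, the same ratio can be computed with the standard `TextEncoder` global. The helper names `utf8Bytes` and `savingsPercent` are hypothetical, and the two sample strings are the `JSON.stringify` and `stringify` forms of the two-user example used earlier.

```js
// Hypothetical helpers; TextEncoder is a standard global in browsers and Node.js.
function utf8Bytes(text) {
  return new TextEncoder().encode(text).length;
}

function savingsPercent(originalText, compactText) {
  const original = utf8Bytes(originalText);
  const compact = utf8Bytes(compactText);
  return ((original - compact) / original) * 100;
}

const jsonText = '[{"name":"Alice","age":25},{"name":"Bob","age":30}]';
const compactText = '{schema:[[name,age]],data:[[Alice,25],[Bob,30]]}';
console.log(utf8Bytes(jsonText), utf8Bytes(compactText)); // 51 48
console.log(savingsPercent(jsonText, compactText).toFixed(1) + '%'); // 5.9%
```

With only two rows the schema overhead nearly cancels the savings; the ratio improves as the row count grows, because each key name is written once regardless of how many rows follow.
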
## Compression Results

Based on actual data from `compress-test.js` benchmarks (18 test cases, all roundtrip decompressions verified):

| Data Type | Count | Original | No trim | Ratio (no trim) | Trim | Ratio (trim) | Diff |
|-----------|-------|----------|---------|-----------------|------|--------------|------|
| Simple users | 100 | 14.69 KB | 8.69 KB | 40.82% | 8.69 KB | 40.82% | — |
| Simple users | 1,000 | 147.74 KB | 87.25 KB | 40.94% | 87.25 KB | 40.94% | — |
| Simple users | 10,000 | 1.45 MB | 881.58 KB | 40.71% | 881.58 KB | 40.71% | — |
| Nested users (with profile.social) | 100 | 23.41 KB | 15.28 KB | 34.74% | 15.24 KB | 34.87% | -33 B |
| Nested users (with profile.social) | 1,000 | 236.03 KB | 153.93 KB | 34.78% | 153.64 KB | 34.91% | -301 B |
| Nested users (with profile.social) | 5,000 | 1.16 MB | 777.89 KB | 34.58% | 776.42 KB | 34.70% | -1.47 KB |
| Orders (1-5 items per order) | 100 | 31.28 KB | 13.65 KB | 56.38% | 13.65 KB | 56.38% | — |
| Orders (1-5 items per order) | 500 | 163.18 KB | 70.83 KB | 56.59% | 70.83 KB | 56.59% | — |
| Orders (1-5 items per order) | 2,000 | 655.99 KB | 284.29 KB | 56.66% | 284.29 KB | 56.66% | — |
| School data (2 grades x 2 classes x 10 students) | 4 | 12.26 KB | 5.25 KB | 57.20% | 5.23 KB | 57.36% | -21 B |
| School data (6 grades x 4 classes x 30 students) | 24 | 217.73 KB | 89.71 KB | 58.80% | 89.31 KB | 58.98% | -406 B |
| School data (6 grades x 6 classes x 50 students) | 36 | 539.64 KB | 222.56 KB | 58.76% | 221.66 KB | 58.92% | -923 B |
| Sparse fields (100 records x 20 fields) | 100 | 19.50 KB | 6.34 KB | 67.46% | 6.28 KB | 67.78% | -64 B |
| Sparse fields (500 records x 30 fields) | 500 | 143.26 KB | 45.09 KB | 68.52% | 44.78 KB | 68.74% | -326 B |
| Sparse fields (2000 records x 50 fields) | 2,000 | 957.96 KB | 294.69 KB | 69.24% | 293.54 KB | 69.36% | -1.15 KB |
| Deep nesting (small) | 2 | 17.47 KB | 8.08 KB | 53.73% | 8.08 KB | 53.73% | — |
| Deep nesting (medium) | 3 | 141.89 KB | 64.55 KB | 54.50% | 64.55 KB | 54.50% | — |
| Deep nesting (large) | 5 | 629.42 KB | 286.40 KB | 54.50% | 286.40 KB | 54.50% | — |

**Conclusions:**
1. Longer field names and more fields yield better compression
2. Object arrays (order items, student lists) show significant compression (55–59%)
3. Sparse fields achieve the highest compression (67–69%), because the nulls for missing fields are omitted as empty slots
4. `trimTrailingNulls` saves additional space when data has missing trailing fields (up to 1.47 KB for 5,000 records)
5. When data has no missing fields, trim provides no extra benefit
6. Deeper nested structures achieve better compression
7. `stringify` quote omission further reduces text size

## Token Efficiency Comparison

Token consumption comparison across formats (based on 6 real-world datasets).

#### Mixed-Structure Track

Datasets with nested or semi-uniform structures. CSV excluded as it cannot represent these structures.

```
🛒 E-commerce orders (nested)  ┊  Tabular: 33%
   │
   slimjson            ████████░░░░░░░░░░░░    46,233 tokens
   ├─ vs JSON          (−57.8%)               109,574 tokens
   ├─ vs JSON compact  (−33.5%)                69,528 tokens
   ├─ vs TOON          (−36.9%)                73,246 tokens
   ├─ vs YAML          (−45.9%)                85,451 tokens
   └─ vs XML           (−62.5%)               123,272 tokens

📃 Semi-uniform event logs  ┊  Tabular: 50%
   │
   slimjson            ██████████░░░░░░░░░░    91,630 tokens
   ├─ vs JSON          (−49.4%)               181,141 tokens
   ├─ vs JSON compact  (−28.7%)               128,480 tokens
   ├─ vs TOON          (−40.5%)               154,032 tokens
   ├─ vs YAML          (−41.0%)               155,346 tokens
   └─ vs XML           (−55.5%)               205,796 tokens

🧩 Deeply nested configuration  ┊  Tabular: 0%
   │
   slimjson            ████████████░░░░░░░░       547 tokens
   ├─ vs JSON          (−39.6%)                   905 tokens
   ├─ vs JSON compact  (−0.9%)                    552 tokens
   ├─ vs TOON          (−11.5%)                   618 tokens
   ├─ vs YAML          (−17.4%)                   662 tokens
   └─ vs XML           (−45.1%)                   997 tokens

──────────────────────────────────── Total ────────────────────────────────────
   slimjson            █████████░░░░░░░░░░░   138,410 tokens
   ├─ vs JSON          (−52.5%)               291,620 tokens
   ├─ vs JSON compact  (−30.3%)               198,560 tokens
   ├─ vs TOON          (−39.3%)               227,896 tokens
   ├─ vs YAML          (−42.7%)               241,459 tokens
   └─ vs XML           (−58.1%)               330,065 tokens
```

#### Flat-Only Track

Flat tabular datasets where CSV is applicable.

```
👥 Uniform employee records  ┊  Tabular: 100%
   │
   CSV                 ████████████████████    47,137 tokens
   slimjson            ████████████████████    47,067 tokens   (-0.1% vs CSV)
   ├─ vs JSON          (−63.0%)               127,050 tokens
   ├─ vs JSON compact  (−40.5%)                79,046 tokens
   ├─ vs TOON          (−5.8%)                 49,966 tokens
   ├─ vs YAML          (−52.9%)               100,033 tokens
   └─ vs XML           (−67.9%)               146,596 tokens

📈 Time-series analytics data  ┊  Tabular: 100%
   │
   CSV                 ███████████████████░     8,392 tokens
   slimjson            ████████████████████     8,767 tokens   (+4.5% vs CSV)
   ├─ vs JSON          (−60.6%)                22,254 tokens
   ├─ vs JSON compact  (−38.3%)                14,220 tokens
   ├─ vs TOON          (−3.9%)                  9,124 tokens
   ├─ vs YAML          (−50.9%)                17,867 tokens
   └─ vs XML           (−67.1%)                26,625 tokens

⭐ Top 100 GitHub repositories  ┊  Tabular: 100%
   │
   CSV                 ████████████████████     8,512 tokens
   slimjson            ████████████████████     8,550 tokens   (+0.4% vs CSV)
   ├─ vs JSON          (−43.5%)                15,144 tokens
   ├─ vs JSON compact  (−25.4%)                11,454 tokens
   ├─ vs TOON          (−2.2%)                  8,744 tokens
   ├─ vs YAML          (−34.9%)                13,128 tokens
   └─ vs XML           (−50.0%)                17,095 tokens

──────────────────────────────────── Total ────────────────────────────────────
   CSV                 ████████████████████    64,041 tokens
   slimjson            ████████████████████    64,384 tokens   (+0.5% vs CSV)
   ├─ vs JSON          (−60.8%)               164,448 tokens
   ├─ vs JSON compact  (−38.5%)               104,720 tokens
   ├─ vs TOON          (−5.1%)                 67,834 tokens
   ├─ vs YAML          (−50.9%)               131,028 tokens
   └─ vs XML           (−66.2%)               190,316 tokens
```

> On mixed-structure data, slimjson saves **52.5%** tokens vs JSON. On flat tabular data, it's on par with CSV (only 0.5% more).

## LLM Data Retrieval Accuracy

Accuracy tested with 209 data retrieval questions across different input formats.

#### Efficiency Ranking (Accuracy per 1K Tokens)

```
slimjson       ████████████████████   44.4 acc%/1K tok  │  94.7% acc  │  2,134 tokens
TOON           ███████████████░░░░░   34.0 acc%/1K tok  │  92.8% acc  │  2,734 tokens
JSON compact   ██████████████░░░░░░   31.0 acc%/1K tok  │  95.2% acc  │  3,072 tokens
YAML           ███████████░░░░░░░░░   25.4 acc%/1K tok  │  94.3% acc  │  3,716 tokens
JSON           ██████████░░░░░░░░░░   21.1 acc%/1K tok  │  95.7% acc  │  4,538 tokens
XML            ████████░░░░░░░░░░░░   18.5 acc%/1K tok  │  95.7% acc  │  5,162 tokens
```

*Efficiency score = (Accuracy % ÷ Tokens) × 1,000. Higher is better.*

> slimjson achieves **94.7%** accuracy (vs JSON's 95.7%) while using **53.0% fewer tokens**.

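
As a worked check of the efficiency formula, the sketch below recomputes the scores from the token counts in the ranking and the correct-answer counts in the per-model accuracy listing (for example 198/209 for slimjson). `efficiencyScore` is a hypothetical helper name; using the unrounded accuracy is an assumption that happens to reproduce the published figures.

```js
// Hypothetical helper: accuracy percentage per 1,000 tokens.
function efficiencyScore(correct, total, tokens) {
  const accuracyPercent = (correct / total) * 100;
  return (accuracyPercent / tokens) * 1000;
}

// Correct answers out of 209 questions, and token counts, as listed in this README.
const results = [
  { format: 'slimjson', correct: 198, tokens: 2134 },
  { format: 'TOON', correct: 194, tokens: 2734 },
  { format: 'JSON compact', correct: 199, tokens: 3072 },
  { format: 'YAML', correct: 197, tokens: 3716 },
  { format: 'JSON', correct: 200, tokens: 4538 },
  { format: 'XML', correct: 200, tokens: 5162 },
];

for (const r of results) {
  console.log(r.format.padEnd(12), efficiencyScore(r.correct, 209, r.tokens).toFixed(1));
}

// Token saving of slimjson relative to pretty-printed JSON.
const tokenSaving = (1 - 2134 / 4538) * 100;
console.log(tokenSaving.toFixed(1) + '% fewer tokens'); // 53.0% fewer tokens
```
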
#### Per-Model Accuracy

```
deepseek-v4-flash
   JSON           ███████████████████░    95.7% (200/209)
   XML            ███████████████████░    95.7% (200/209)
   JSON compact   ███████████████████░    95.2% (199/209)
 → slimjson       ███████████████████░    94.7% (198/209)
   YAML           ███████████████████░    94.3% (197/209)
   TOON           ███████████████████░    92.8% (194/209)
   CSV            ██████████████████░░    91.7% (100/109)
```

#### Accuracy by Question Type

| Question Type | JSON | XML | JSON compact | slimjson | YAML | TOON | CSV |
|---------------|------|-----|-------------|----------|------|------|-----|
| Field Retrieval | 98.5% | 97.1% | 98.5% | 95.6% | 97.1% | 91.2% | 96.9% |
| Aggregation | 98.4% | 96.8% | 95.2% | 95.2% | 93.7% | 95.2% | 86.2% |
| Filtering | 97.9% | 97.9% | 100.0% | 100.0% | 100.0% | 100.0% | 96.3% |
| Structure Awareness | 88.0% | 92.0% | 84.0% | 92.0% | 88.0% | 88.0% | 87.5% |
| Structural Validation | 40.0% | 60.0% | 60.0% | 40.0% | 40.0% | 40.0% | 80.0% |

#### Datasets Tested

| Dataset | Rows | Structure | CSV Support |
|---------|------|-----------|-------------|
| Uniform employee records | 100 | uniform | ✓ |
| E-commerce orders (nested) | 50 | nested | ✗ |
| Time-series analytics data | 60 | uniform | ✓ |
| Top 100 GitHub repositories | 100 | uniform | ✓ |
| Semi-uniform event logs | 75 | semi-uniform | ✗ |
| Deeply nested configuration | 11 | deep | ✗ |

## Development

```bash
# Run tests (192 cases, 100% coverage)
npm test

# Run compression ratio benchmarks (with trim comparison)
node compress-test.js
```

## GitHub

[https://github.com/LastHeaven/slimjson](https://github.com/LastHeaven/slimjson)

## License

MIT