slimjson 1.0.2 → 1.0.3

Files changed (3)
  1. package/README.md +65 -13
  2. package/README_EN.md +309 -0
  3. package/package.json +1 -1
package/README.md CHANGED
@@ -1,14 +1,17 @@
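  # slimjson

+ 中文 | [English](./README_EN.md)
+
  A lightweight object array compression tool — converts JSON object arrays with repeated keys into a compact `{ keys, rows }` format, with support for omitting `null` values during serialization to further reduce size.

  ## Use Cases

- - Backend list endpoints where every object carries the same key names, resulting in massive redundancy
- - Objects with different fields (backend omits null fields on demand)
- - Minimizing JSON text size for network transmission
+ - **API List Endpoints**: Backend list endpoints where every object carries the same key names, resulting in massive redundancy
+ - **Heterogeneous Fields**: Objects with different fields (backend omits null fields on demand)
+ - **Network Transfer Compression**: Minimizing JSON text size for network transmission
  - **LLM Context Compression**: Compress large structured data (e.g. database query results, API responses, knowledge base entries) before sending to prompts, reducing token consumption and API costs
  - **LLM Tool Calling**: function calling / tool_use results are often structured object arrays; compressing them before feeding back to the model significantly reduces context window usage, enabling the model to handle more complex data within limited tokens
+ - **LLM-Friendly Format**: The compressed `{ keys, rows }` format separates schema (field definitions) from data, with each key appearing only once. Models can more accurately understand data structures and extract information by field name, with less confusion compared to raw JSON with repeated keys

  ## Installation
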
@@ -23,7 +26,7 @@ npm install slimjson
  Compresses an object array into a `{ keys, rows }` structure:

  ```js
- const { compress } = require('slimjson');
+ import { compress } from 'slimjson';

  const users = [
    { name: 'Alice', age: 25, city: 'NYC' },
@@ -137,6 +140,13 @@ stringify(compress(orders));
  // ^^^^^^^^^^^^^^^^^^^^^^^^^^^ missing fields in specs omitted as empty slots ^^^^^^^^^^^^^^^^^^^^^^^^
  ```

+ #### Single Object Example
+ ```js
+ compress({ name: 'Alice', age: 25 });
+ // Equivalent to compress([{ name: 'Alice', age: 25 }])
+ // { keys: ['name', 'age'], rows: [['Alice', 25]] }
+ ```
+
  ### `decompress(compressed)`

  Restores `{ keys, rows }` back to the original object array:
@@ -148,7 +158,7 @@ const restored = decompress(compressed);
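
  ### `stringify(compressed)`

- Serializes the compress result into text: `null` values in arrays are omitted (the comma is kept as a placeholder) and safe strings drop their quotes:
+ Serializes the compress result into compact text. Compared to `JSON.stringify`, the following optimization rules are applied:

  ```js
  const data = [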
@@ -158,23 +168,65 @@ const data = [

  const text = stringify(compress(data));
  // {keys:[name,age],rows:[[Alice,25],[Bob,30]]}
- ```

- Compared with JSON.stringify:
- ```js
  JSON.stringify(compress(data));
  // {"keys":["name","age"],"rows":[["Alice",25],["Bob",30]]}
- // ↑ quotes ↑ ↑ quotes ↑
  ```

- Array null omission rules:
+ #### Serialization Rules
+
+ | Value Type | Serialized Result | Notes |
+ |------------|------------------|-------|
+ | `null` / `undefined` | `null` | — |
+ | Finite number | `25` | Direct output, no quotes |
+ | `NaN` / `Infinity` | `null` | Non-finite numbers unified to null |
+ | `true` / `false` | `true` / `false` | — |
+ | Safe string | `Alice` | Quotes omitted (see rules below) |
+ | Unsafe string | `"hello world"` | JSON quotes and escaping retained |
+ | Nested object `{k: v}` | `{k:v}` | Keys follow same safe/unsafe rules |
+ | Array | See null omission rules below | — |
+
+ #### Safe Strings (Conditions for Omitting Quotes)
+
+ A string can omit quotes only when it satisfies **all** of the following conditions; otherwise `JSON.stringify` escaping is applied:
+
+ 1. Non-empty string
+ 2. Not a keyword literal: `null`, `true`, `false`
+ 3. Does not match number pattern: `/^-?(0|[1-9]\d*)(\.\d+)?([eE][+-]?\d+)?$/` (e.g. `"123"`, `"-3.14"`, `"1e10"` all retain quotes)
+ 4. Does not start with a digit or minus sign `-`
+ 5. Does not contain whitespace, `[`, `]`, `{`, `}`, `,`, `:`, `"` or similar characters
+
+ | String | Result | Reason |
+ |--------|--------|--------|
+ | `"Alice"` | `Alice` | Safe, quotes omitted |
+ | `"hello world"` | `"hello world"` | Contains space |
+ | `"123"` | `"123"` | Looks like a number |
+ | `"-3.14"` | `"-3.14"` | Looks like a number |
+ | `"null"` | `"null"` | Keyword |
+ | `""` | `""` | Empty string |
+ | `"-abc"` | `"-abc"` | Starts with minus sign |
+ | `"a:b"` | `"a:b"` | Contains colon |
+
+ #### Object Key Quoting Rules
+
+ Nested object keys in `keys` follow the same safe string check:
+
+ ```js
+ stringify({ keys: [{ profile: ['name', 'age'] }], rows: [...] });
+ // {keys:[{profile:[name,age]}],rows:[...]} ← profile is safe, quotes omitted
+
+ stringify({ keys: [{ "my-key": ['name'] }], rows: [...] });
+ // {keys:[{"my-key":[name]}],rows:[...]} ← my-key contains hyphen, quotes retained
+ ```
+
+ #### Array Null Omission Rules
+
+ `null` / `undefined` values in arrays are omitted as comma slots, taking no text space:

  | Original Array | Serialized Result | Notes |
  |----------------------|------------|------|
  | `["a", null, null]` | `[a,,]` | Two trailing empty slots |
  | `[null, 1, null]` | `[,1,]` | Leading and trailing empty slots |
- | `[null, "1", null]` | `[,"1",]` | Leading and trailing empty slots |
- | `[null, "1a", null]` | `[,"1a",]` | Leading and trailing empty slots |
  | `[]` | `[]` | Empty array |
  | `[null]` | `[null]` | **Special**: `[,]` means 2 nulls, so single null retains literal |

@@ -192,7 +244,7 @@ const parsed = parse(text);
  ## Complete Example

  ```js
- const { compress, decompress, stringify, parse } = require('slimjson');
+ import { compress, decompress, stringify, parse } from 'slimjson';

  const data = [
    { name: '张三', age: 28, profile: { avatar: 'a.jpg', bio: 'Hello' } },
package/README_EN.md ADDED
@@ -0,0 +1,309 @@
+ # slimjson
+
+ [中文](./README.md) | English
+
+ A lightweight object array compression tool — converts JSON object arrays with repeated keys into a compact `{ keys, rows }` format, with support for omitting `null` values during serialization to further reduce size.
+
+ ## Use Cases
+
+ - **API List Endpoints**: Backend list endpoints where every object carries the same key names, resulting in massive redundancy
+ - **Heterogeneous Fields**: Objects with different fields (backend omits null fields on demand)
+ - **Network Transfer Compression**: Minimizing JSON text size for network transmission
+ - **LLM Context Compression**: Compress large structured data (e.g. database query results, API responses, knowledge base entries) before sending to prompts, reducing token consumption and API costs
+ - **LLM Tool Calling**: function calling / tool_use results are often structured object arrays — compressing them before feeding back to the model significantly reduces context window usage, enabling the model to handle more complex data within limited tokens
+ - **LLM-Friendly Format**: The compressed `{ keys, rows }` format separates schema (field definitions) from data, with each key appearing only once. Models can more accurately understand data structures and extract information by field name, with less confusion compared to raw JSON with repeated keys
+
+ ## Installation
+
+ ```bash
+ npm install slimjson
+ ```
+
+ ## API
+
+ ### `compress(source)`
+
+ Compresses an object array into a `{ keys, rows }` structure:
+
+ ```js
+ import { compress } from 'slimjson';
+
+ const users = [
+   { name: 'Alice', age: 25, city: 'NYC' },
+   { name: 'Bob', age: 30, city: 'LA' },
+ ];
+
+ const compressed = compress(users);
+ // {
+ //   keys: ['name', 'age', 'city'],
+ //   rows: [
+ //     ['Alice', 25, 'NYC'],
+ //     ['Bob',   30, 'LA' ]
+ //   ]
+ // }
+ ```
+
+ **Features:**
+ - `keys` takes the union of all object keys, ordered by first appearance
+ - Missing fields in an object → fill `null` at the corresponding row position
+ - Nested objects are recursively processed: represented as `{ "fieldName": [childKeys] }` in `keys`
+ - Object arrays (e.g. order items) are recursively compressed the same way
+ - When a plain object is passed (not an array), it is treated as a single-element array
+
+ ```js
+ compress({ name: 'Alice', age: 25 });
+ // Equivalent to compress([{ name: 'Alice', age: 25 }])
+ // { keys: ['name', 'age'], rows: [['Alice', 25]] }
+ ```
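
For illustration only, and not taken from the package: the first two features listed above (key union in first-appearance order, `null` fill for missing fields) could be sketched roughly as follows. `compressFlat` is a hypothetical helper name, it handles flat objects only, and it should not be read as slimjson's actual implementation.

```js
// Hypothetical sketch, not slimjson's source: flat objects only.
function compressFlat(source) {
  // A plain object is treated as a single-element array.
  const list = Array.isArray(source) ? source : [source];

  // Union of all keys, ordered by first appearance.
  const keys = [];
  const seen = new Set();
  for (const obj of list) {
    for (const key of Object.keys(obj)) {
      if (!seen.has(key)) {
        seen.add(key);
        keys.push(key);
      }
    }
  }

  // One row per object; a missing (or undefined) field becomes null.
  const has = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);
  const rows = list.map((obj) =>
    keys.map((key) => (has(obj, key) && obj[key] !== undefined ? obj[key] : null))
  );
  return { keys, rows };
}

const result = compressFlat([{ name: 'Alice', age: 25 }, { name: 'Bob', city: 'LA' }]);
console.log(JSON.stringify(result));
// {"keys":["name","age","city"],"rows":[["Alice",25,null],["Bob",null,"LA"]]}
```
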
+
+ #### Nested Object Example
+
+ ```js
+ const data = [
+   { name: 'Alice', age: 28, profile: { avatar: 'a.jpg', bio: 'Hello' } },
+   { name: 'Bob', age: 35, profile: { avatar: 'b.jpg' } }, // missing bio
+ ];
+
+ compress(data);
+ // {
+ //   keys: ['name', 'age', { profile: ['avatar', 'bio'] }],
+ //   rows: [
+ //     ['Alice', 28, ['a.jpg', 'Hello']],
+ //     ['Bob',   35, ['b.jpg', null   ]]
+ //   ]
+ // }
+
+ stringify(compress(data));
+ // {keys:[name,age,{profile:[avatar,bio]}],rows:[[Alice,28,[a.jpg,Hello]],[Bob,35,[b.jpg,]]]}
+ // ^^ null omitted, comma retained
+ ```
+
+ #### Object Array Example (Order Scenario)
+
+ ```js
+ const orders = [
+   { orderId: 'A001', items: [{ name: 'Keyboard', price: 299 }, { name: 'Mouse', price: 99 }] },
+   { orderId: 'A002', items: [{ name: 'Monitor', price: 1999 }] },
+ ];
+
+ compress(orders);
+ // {
+ //   keys: ['orderId', { items: ['name', 'price'] }],
+ //   rows: [
+ //     ['A001', [['Keyboard', 299], ['Mouse', 99]]],
+ //     ['A002', [['Monitor', 1999]]]
+ //   ]
+ // }
+
+ stringify(compress(orders));
+ // {keys:[orderId,{items:[name,price]}],rows:[[A001,[[Keyboard,299],[Mouse,99]]],[A002,[[Monitor,1999]]]]}
+ // ^^^^^ nested object key, no quotes ^^^^ safe string value, no quotes
+ ```
+
+ #### Three-Level Nesting Example (Order → Item → Specs)
+
+ ```js
+ const orders = [
+   {
+     orderId: 'A001',
+     customer: 'Alice',
+     items: [
+       { name: 'Keyboard', price: 299, specs: { color: 'Black', layout: '104-key' } },
+       { name: 'Mouse', price: 99, specs: { color: 'White', dpi: '4000' } },
+     ]
+   },
+   {
+     orderId: 'A002',
+     customer: 'Bob',
+     items: [
+       { name: 'Monitor', price: 1999, specs: { color: 'Silver', size: '27in' } },
+     ]
+   },
+ ];
+
+ compress(orders);
+ // {
+ //   keys: [
+ //     'orderId',
+ //     'customer',
+ //     { items: ['name', 'price', { specs: ['color', 'layout', 'dpi', 'size'] }] }
+ //   ],
+ //   rows: [
+ //     ['A001', 'Alice', [
+ //       ['Keyboard', 299, ['Black', '104-key', null, null]],
+ //       ['Mouse', 99, ['White', null, '4000', null]]
+ //     ]],
+ //     ['A002', 'Bob', [
+ //       ['Monitor', 1999, ['Silver', null, null, '27in']]
+ //     ]]
+ //   ]
+ // }
+ // specs keys take the union: order 1 has layout, order 2 has size → both kept, missing fields filled with null
+
+ stringify(compress(orders));
+ // {keys:[orderId,customer,{items:[name,price,{specs:[color,layout,dpi,size]}]}],rows:[[
+ // A001,Alice,[[Keyboard,299,[Black,"104-key",,]],[Mouse,99,[White,,"4000",]]]],[A002,Bob,[[Monitor,1999,[Silver,,,"27in"]]]]]}
+ // ^^^^^^^^^^^^^^^^^^^^^^^^^^^ missing fields in specs omitted as empty slots ^^^^^^^^^^^^^^^^^^^^^^^^
+ ```
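
For illustration only, and not taken from the package: the recursive layout shown in the nested and object-array examples above can be approximated with the sketch below. The names `buildKeys`, `buildRow` and `compressSketch` are hypothetical, and the sketch assumes that a field holds either primitives, a plain object, or an array made up entirely of plain objects.

```js
// Hypothetical sketch of the recursive { keys, rows } layout; not slimjson's source.
const isPlainObject = (v) => v !== null && typeof v === 'object' && !Array.isArray(v);

// Collect field names in first-appearance order. A field whose values are
// objects (or arrays of objects) becomes { fieldName: [childKeys] }.
function buildKeys(objects) {
  const children = new Map(); // field name -> nested objects seen under it
  for (const obj of objects) {
    for (const [name, value] of Object.entries(obj)) {
      if (!children.has(name)) children.set(name, []);
      const items = Array.isArray(value) ? value : [value];
      children.get(name).push(...items.filter(isPlainObject));
    }
  }
  return [...children].map(([name, nested]) =>
    nested.length > 0 ? { [name]: buildKeys(nested) } : name
  );
}

// Lay one object out as a row that follows the key schema.
function buildRow(obj, keys) {
  return keys.map((key) => {
    if (typeof key === 'string') return obj[key] ?? null;
    const [name, childKeys] = Object.entries(key)[0];
    const value = obj[name];
    if (value == null) return null;
    return Array.isArray(value)
      ? value.map((item) => buildRow(item, childKeys)) // object array -> array of rows
      : buildRow(value, childKeys);                    // nested object -> one row
  });
}

function compressSketch(source) {
  const list = Array.isArray(source) ? source : [source];
  const keys = buildKeys(list);
  return { keys, rows: list.map((obj) => buildRow(obj, keys)) };
}

const sketchOrders = [
  { orderId: 'A001', items: [{ name: 'Keyboard', price: 299 }, { name: 'Mouse', price: 99 }] },
  { orderId: 'A002', items: [{ name: 'Monitor', price: 1999 }] },
];
console.log(JSON.stringify(compressSketch(sketchOrders)));
// {"keys":["orderId",{"items":["name","price"]}],"rows":[["A001",[["Keyboard",299],["Mouse",99]]],["A002",[["Monitor",1999]]]]}
```

Collecting the child objects of a field across every row before recursing is what lets the child key list be a union, as in the `specs` example.
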
+
+ ### `decompress(compressed)`
+
+ Restores `{ keys, rows }` back to the original object array:
+
+ ```js
+ const restored = decompress(compressed);
+ // deep-equal to the original array
+ ```
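
For illustration only, and not taken from the package: the inverse step for flat data might look like the sketch below. `decompressFlat` is a hypothetical name, and the sketch assumes a `null` slot always stands for a field that was missing in the original object. That assumption matches the deep-equal round trip claimed above, but it is a guess about the library's behavior, and it means an explicit `null` in the input would not survive.

```js
// Hypothetical sketch, not slimjson's source: flat { keys, rows } only.
function decompressFlat({ keys, rows }) {
  return rows.map((row) => {
    const obj = {};
    keys.forEach((key, i) => {
      // Assumption: a null slot means "field absent", so it is skipped.
      if (row[i] != null) obj[key] = row[i];
    });
    return obj;
  });
}

const flat = {
  keys: ['name', 'age', 'city'],
  rows: [['Alice', 25, null], ['Bob', null, 'LA']],
};
console.log(JSON.stringify(decompressFlat(flat)));
// [{"name":"Alice","age":25},{"name":"Bob","city":"LA"}]
```
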
+
+ ### `stringify(compressed)`
+
+ Serializes the compress result into compact text. Compared to `JSON.stringify`, the following optimization rules are applied:
+
+ ```js
+ const data = [
+   { name: 'Alice', age: 25 },
+   { name: 'Bob', age: 30 },
+ ];
+
+ const text = stringify(compress(data));
+ // {keys:[name,age],rows:[[Alice,25],[Bob,30]]}
+
+ JSON.stringify(compress(data));
+ // {"keys":["name","age"],"rows":[["Alice",25],["Bob",30]]}
+ ```
+
+ #### Serialization Rules
+
+ | Value Type | Serialized Result | Notes |
+ |------------|------------------|-------|
+ | `null` / `undefined` | `null` | — |
+ | Finite number | `25` | Direct output, no quotes |
+ | `NaN` / `Infinity` | `null` | Non-finite numbers unified to null |
+ | `true` / `false` | `true` / `false` | — |
+ | Safe string | `Alice` | Quotes omitted (see rules below) |
+ | Unsafe string | `"hello world"` | JSON quotes and escaping retained |
+ | Nested object `{k: v}` | `{k:v}` | Keys follow same safe/unsafe rules |
+ | Array | See null omission rules below | — |
+
+ #### Safe Strings (Conditions for Omitting Quotes)
+
+ A string can omit quotes only when it satisfies **all** of the following conditions; otherwise `JSON.stringify` escaping is applied:
+
+ 1. Non-empty string
+ 2. Not a keyword literal: `null`, `true`, `false`
+ 3. Does not match number pattern: `/^-?(0|[1-9]\d*)(\.\d+)?([eE][+-]?\d+)?$/` (e.g. `"123"`, `"-3.14"`, `"1e10"` all retain quotes)
+ 4. Does not start with a digit or minus sign `-`
+ 5. Does not contain whitespace, `[`, `]`, `{`, `}`, `,`, `:`, `"` or similar characters
+
+ | String | Result | Reason |
+ |--------|--------|--------|
+ | `"Alice"` | `Alice` | Safe, quotes omitted |
+ | `"hello world"` | `"hello world"` | Contains space |
+ | `"123"` | `"123"` | Looks like a number |
+ | `"-3.14"` | `"-3.14"` | Looks like a number |
+ | `"null"` | `"null"` | Keyword |
+ | `""` | `""` | Empty string |
+ | `"-abc"` | `"-abc"` | Starts with minus sign |
+ | `"a:b"` | `"a:b"` | Contains colon |
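
For illustration only, and not taken from the package: the five conditions above translate fairly directly into a predicate. `isSafeString` and `encodeString` are hypothetical names. Condition 5 ends with "or similar characters", so the character class below covers only the characters the list names explicitly; the real library may reject more.

```js
// Hypothetical sketch of the safe-string check; not slimjson's source.
const NUMBER_PATTERN = /^-?(0|[1-9]\d*)(\.\d+)?([eE][+-]?\d+)?$/;
// Only the characters named in condition 5; the actual set may be larger.
const UNSAFE_CHARS = /[\s\[\]{},:"]/;

function isSafeString(str) {
  if (str.length === 0) return false;                                    // 1. non-empty
  if (str === 'null' || str === 'true' || str === 'false') return false; // 2. not a keyword
  if (NUMBER_PATTERN.test(str)) return false;                            // 3. not number-like
  if (/^[0-9-]/.test(str)) return false;                                 // 4. no leading digit or minus
  return !UNSAFE_CHARS.test(str);                                        // 5. no listed unsafe character
}

// Safe strings are written bare; everything else keeps JSON quoting and escaping.
const encodeString = (str) => (isSafeString(str) ? str : JSON.stringify(str));

for (const s of ['Alice', 'hello world', '123', '-abc', 'a:b', '']) {
  console.log(encodeString(s));
}
```

The number-like and leading-digit checks exist so that a bare token can be read back unambiguously: an unquoted `123` could not be told apart from the number 123.
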
+
+ #### Object Key Quoting Rules
+
+ Nested object keys in `keys` follow the same safe string check:
+
+ ```js
+ stringify({ keys: [{ profile: ['name', 'age'] }], rows: [...] });
+ // {keys:[{profile:[name,age]}],rows:[...]} ← profile is safe, quotes omitted
+
+ stringify({ keys: [{ "my-key": ['name'] }], rows: [...] });
+ // {keys:[{"my-key":[name]}],rows:[...]} ← my-key contains hyphen, quotes retained
+ ```
+
+ #### Array Null Omission Rules
+
+ `null` / `undefined` values in arrays are omitted as comma slots, taking no text space:
+
+ | Original Array | Serialized Result | Notes |
+ |---------------|------------------|-------|
+ | `["a", null, null]` | `[a,,]` | Two trailing empty slots |
+ | `[null, 1, null]` | `[,1,]` | Leading and trailing empty slots |
+ | `[]` | `[]` | Empty array |
+ | `[null]` | `[null]` | **Special**: `[,]` means 2 nulls, so single null retains literal |
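
For illustration only, and not taken from the package: the table above follows from one rule, namely that an array of n slots is written with n - 1 commas and a `null` slot contributes no text. A minimal sketch for flat arrays, with `stringifyFlatArray` as a hypothetical name and `String` standing in for the real value encoder:

```js
// Hypothetical sketch of null omission for a flat array; not slimjson's source.
// `encode` turns a non-null value into text. The default, String(), skips the
// quoting rules, so it is only appropriate for numbers and safe strings.
function stringifyFlatArray(values, encode = String) {
  // A lone null would otherwise become "[]", which already means "empty array",
  // so it keeps the literal.
  if (values.length === 1 && values[0] == null) return '[null]';
  const slots = values.map((v) => (v == null ? '' : encode(v)));
  return `[${slots.join(',')}]`;
}

console.log(stringifyFlatArray(['a', null, null])); // [a,,]
console.log(stringifyFlatArray([null, 1, null]));   // [,1,]
console.log(stringifyFlatArray([]));                // []
console.log(stringifyFlatArray([null]));            // [null]
console.log(stringifyFlatArray([null, null]));      // [,]
```
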
+
+ ### `parse(text)`
+
+ Parses text produced by `stringify`, restoring omitted `null` values:
+
+ ```js
+ const parsed = parse(text);
+ // deep-equal to compressed
+ ```
+
+ Supports full JSON type parsing (strings, numbers, booleans, null, nested objects/arrays), compatible with escape characters and Unicode.
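
For illustration only, and not taken from the package: reading a flat array back under the serialization rules described earlier could be sketched as below. `parseFlatArray` is a hypothetical name. The sketch deliberately ignores nesting and commas inside quoted strings, both of which a real parser has to handle.

```js
// Hypothetical sketch: parses one flat array such as "[a,,]"; not slimjson's source.
const NUMBER_TOKEN = /^-?(0|[1-9]\d*)(\.\d+)?([eE][+-]?\d+)?$/;

function parseFlatArray(text) {
  const inner = text.slice(1, -1); // drop the surrounding [ ]
  if (inner === '') return [];     // "[]" is the empty array
  return inner.split(',').map((token) => {
    if (token === '' || token === 'null') return null;   // empty slot or literal null
    if (token === 'true') return true;
    if (token === 'false') return false;
    if (NUMBER_TOKEN.test(token)) return Number(token);
    if (token.startsWith('"')) return JSON.parse(token); // quoted (unsafe) string
    return token;                                        // bare token: safe string
  });
}

console.log(JSON.stringify(parseFlatArray('[a,,]')));     // ["a",null,null]
console.log(JSON.stringify(parseFlatArray('[,"1",]')));   // [null,"1",null]
```
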
+
+ ## Complete Example
+
+ ```js
+ import { compress, decompress, stringify, parse } from 'slimjson';
+
+ const data = [
+   { name: 'Alice', age: 28, profile: { avatar: 'a.jpg', bio: 'Hello' } },
+   { name: 'Bob', age: 35, profile: { avatar: 'b.jpg' } }, // missing bio
+ ];
+
+ // Compress → Stringify → Parse → Decompress
+ const compressed = compress(data);
+ const text = stringify(compressed);
+ const parsed = parse(text);
+ const restored = decompress(parsed);
+
+ // restored is deep-equal to data
+ ```
+
+ ### Compression Ratio Calculation
+
+ ```js
+ const originalSize = Buffer.byteLength(JSON.stringify(data));
+ const compressedSize = Buffer.byteLength(stringify(compress(data)));
+ const ratio = ((originalSize - compressedSize) / originalSize * 100).toFixed(1);
+ console.log(`Compression ratio: ${ratio}%`);
+ ```
+
+ ## Compression Results
+
+ Based on actual data from `compress-test.js` benchmarks (18 test cases, average compression ratio **52.20%**, all roundtrip decompressions verified):
+
+ | Data Type | Object Count | Original Size | Compressed | Ratio |
+ |-----------|-------------|---------------|------------|-------|
+ | Simple users | 1,000 | 147.85 KB | 87.36 KB | **40.91%** |
+ | Simple users | 10,000 | 1.45 MB | 882.34 KB | **40.69%** |
+ | Nested users (with profile.social) | 1,000 | 235.28 KB | 153.03 KB | **34.96%** |
+ | Orders (1-5 items per order) | 500 | 166.34 KB | 72.10 KB | **56.66%** |
+ | School data (6 grades x 4 classes x 30 students) | 24 | 215.47 KB | 88.39 KB | **58.98%** |
+ | Sparse fields (500 records x 30 fields) | 500 | 144.68 KB | 45.39 KB | **68.62%** |
+ | Sparse fields (2000 records x 50 fields) | 2,000 | 947.88 KB | 292.74 KB | **69.12%** |
+ | Deep nesting (5-level org structure) | 5 | 634.65 KB | 288.75 KB | **54.50%** |
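
For illustration only, and not taken from the package: the ratio formula from the Compression Ratio Calculation section can be checked by hand against the first table row. `compressionRatio` and `byteLength` are hypothetical helper names; `byteLength` uses the standard `TextEncoder` so the same measurement also works where Node's `Buffer` is unavailable.

```js
// Ratio = share of the original size that was saved, in percent.
function compressionRatio(originalSize, compressedSize) {
  return ((originalSize - compressedSize) / originalSize) * 100;
}

// UTF-8 byte length without Buffer (TextEncoder is available in browsers and Node).
const byteLength = (str) => new TextEncoder().encode(str).length;

// "Simple users, 1,000 objects" row: 147.85 KB -> 87.36 KB
console.log(compressionRatio(147.85, 87.36).toFixed(2)); // 40.91

// Sizes should be measured in bytes, not characters: non-ASCII text differs.
console.log('张三'.length, byteLength('张三')); // 2 6
```

The other rows may differ from a hand calculation in the last digit, presumably because the published ratios were computed from exact byte counts and the sizes shown are rounded.
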
+
+ **Conclusions:**
+ 1. Longer field names and more fields yield better compression
+ 2. Object arrays (order items, student lists) show significant compression (55–59%)
+ 3. Sparse fields achieve the highest compression — missing field nulls omitted as empty slots (67–69%)
+ 4. Deeper nested structures achieve better compression
+ 5. `stringify` quote omission further reduces text size
+
+ ## Development
+
+ ```bash
+ # Run tests
+ npm test
+
+ # Run compression ratio benchmarks
+ node compress-test.js
+ ```
+
+ ## GitHub
+
+ [https://github.com/LastHeaven/slimjson](https://github.com/LastHeaven/slimjson)
+
+ ## License
+
+ MIT
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "slimjson",
-   "version": "1.0.2",
+   "version": "1.0.3",
    "main": "compress.js",
    "exports": {
      ".": {