slimjson 1.1.0 → 1.1.1
- package/.claude/settings.local.json +3 -1
- package/README.md +33 -24
- package/README_EN.md +33 -24
- package/compress.js +1 -0
- package/esm.mjs +1 -0
- package/package.json +1 -1
package/README.md CHANGED
@@ -440,56 +440,65 @@ console.log(`压缩率: ${ratio}%`);
 
 ## LLM Data Retrieval Accuracy
 
-Tested with 209
+Tested with 209 data retrieval questions on 2 models to measure LLM comprehension accuracy across formats.
 
 #### Efficiency Ranking (Accuracy per 1K Tokens)
 
 ```
-slimjson ████████████████████ 44.
-TOON ███████████████░░░░░
+slimjson ████████████████████ 44.3 acc%/1K tok │ 94.5% acc │ 2,133 tokens
+TOON ███████████████░░░░░ 33.8 acc%/1K tok │ 92.3% acc │ 2,734 tokens
 JSON compact ██████████████░░░░░░ 31.0 acc%/1K tok │ 95.2% acc │ 3,072 tokens
-YAML ███████████░░░░░░░░░
-JSON
-XML ████████░░░░░░░░░░░░ 18.
+YAML ███████████░░░░░░░░░ 24.9 acc%/1K tok │ 92.3% acc │ 3,716 tokens
+JSON █████████░░░░░░░░░░░ 20.3 acc%/1K tok │ 92.3% acc │ 4,538 tokens
+XML ████████░░░░░░░░░░░░ 18.1 acc%/1K tok │ 93.3% acc │ 5,162 tokens
 ```
 
 *Efficiency score = (Accuracy % ÷ tokens) × 1,000. Higher is better.*
 
-> slimjson accuracy **94.
+> slimjson accuracy **94.5%** (vs. 92.3% for JSON) while saving **53.0%** of tokens.
 
 #### Per-Model Accuracy
 
 ```
 deepseek-v4-flash
-JSON ███████████████████░ 95.7% (200/209)
 XML ███████████████████░ 95.7% (200/209)
+JSON ███████████████████░ 95.7% (200/209)
 JSON compact ███████████████████░ 95.2% (199/209)
-→ slimjson ███████████████████░ 94.7% (198/209)
 YAML ███████████████████░ 94.3% (197/209)
+→ slimjson ███████████████████░ 93.3% (195/209)
 TOON ███████████████████░ 92.8% (194/209)
 CSV ██████████████████░░ 91.7% (100/109)
+
+mimo-v2.5-pro
+→ slimjson ███████████████████░ 95.7% (200/209)
+JSON compact ███████████████████░ 95.2% (199/209)
+TOON ██████████████████░░ 91.9% (192/209)
+XML ██████████████████░░ 90.9% (190/209)
+YAML ██████████████████░░ 90.4% (189/209)
+JSON ██████████████████░░ 89.0% (186/209)
+CSV ██████████████████░░ 88.1% (96/109)
 ```
 
 #### Accuracy by Question Type
 
-| Question Type | JSON | XML | JSON
-
-| Field Retrieval |
-| Aggregation |
-| Filtering | 97.9% |
-| Structure Awareness | 88.0% |
-| Structural Validation |
+| Question Type | JSON compact | slimjson | XML | JSON | TOON | YAML | CSV |
+|------|-------------|----------|-----|------|------|------|-----|
+| Field Retrieval | 99.3% | 98.5% | 98.5% | 99.3% | 95.6% | 98.5% | 98.4% |
+| Aggregation | 94.4% | 96.0% | 88.9% | 89.7% | 92.9% | 90.5% | 84.5% |
+| Filtering | 97.9% | 96.9% | 94.8% | 91.7% | 93.8% | 92.7% | 88.9% |
+| Structure Awareness | 88.0% | 88.0% | 90.0% | 90.0% | 90.0% | 88.0% | 87.5% |
+| Structural Validation | 60.0% | 30.0% | 80.0% | 50.0% | 40.0% | 50.0% | 80.0% |
 
 #### Datasets Tested
 
-| Dataset | Rows | Structure | CSV Support |
-
-| Uniform employee records | 100 | uniform | ✓ |
-| E-commerce orders (nested) | 50 | nested | ✗ |
-| Time-series analytics data | 60 | uniform | ✓ |
-| Top 100 GitHub repositories | 100 | uniform | ✓ |
-| Semi-uniform event logs | 75 | semi-uniform | ✗ |
-| Deeply nested configuration | 11 | deep | ✗ |
+| Dataset | Rows | Structure | CSV Support | Tabular % |
+|--------|------|----------|----------|-----------|
+| Uniform employee records | 100 | uniform | ✓ | 100% |
+| E-commerce orders (nested) | 50 | nested | ✗ | 33% |
+| Time-series analytics data | 60 | uniform | ✓ | 100% |
+| Top 100 GitHub repositories | 100 | uniform | ✓ | 100% |
+| Semi-uniform event logs | 75 | semi-uniform | ✗ | 50% |
+| Deeply nested configuration | 11 | deep | ✗ | 0% |
 
 ## Development
 
package/README_EN.md CHANGED
@@ -429,56 +429,65 @@ Flat tabular datasets where CSV is applicable.
 
 ## LLM Data Retrieval Accuracy
 
-Accuracy tested with 209 data retrieval questions across different input formats.
+Accuracy tested with 209 data retrieval questions across 2 LLMs on different input formats.
 
 #### Efficiency Ranking (Accuracy per 1K Tokens)
 
 ```
-slimjson ████████████████████ 44.
-TOON ███████████████░░░░░
+slimjson ████████████████████ 44.3 acc%/1K tok │ 94.5% acc │ 2,133 tokens
+TOON ███████████████░░░░░ 33.8 acc%/1K tok │ 92.3% acc │ 2,734 tokens
 JSON compact ██████████████░░░░░░ 31.0 acc%/1K tok │ 95.2% acc │ 3,072 tokens
-YAML ███████████░░░░░░░░░
-JSON
-XML ████████░░░░░░░░░░░░ 18.
+YAML ███████████░░░░░░░░░ 24.9 acc%/1K tok │ 92.3% acc │ 3,716 tokens
+JSON █████████░░░░░░░░░░░ 20.3 acc%/1K tok │ 92.3% acc │ 4,538 tokens
+XML ████████░░░░░░░░░░░░ 18.1 acc%/1K tok │ 93.3% acc │ 5,162 tokens
 ```
 
 *Efficiency score = (Accuracy % ÷ Tokens) × 1,000. Higher is better.*
 
-> slimjson achieves **94.
+> slimjson achieves **94.5%** accuracy (vs JSON's 92.3%) while using **53.0% fewer tokens**.
 
 #### Per-Model Accuracy
 
 ```
 deepseek-v4-flash
-JSON ███████████████████░ 95.7% (200/209)
 XML ███████████████████░ 95.7% (200/209)
+JSON ███████████████████░ 95.7% (200/209)
 JSON compact ███████████████████░ 95.2% (199/209)
-→ slimjson ███████████████████░ 94.7% (198/209)
 YAML ███████████████████░ 94.3% (197/209)
+→ slimjson ███████████████████░ 93.3% (195/209)
 TOON ███████████████████░ 92.8% (194/209)
 CSV ██████████████████░░ 91.7% (100/109)
+
+mimo-v2.5-pro
+→ slimjson ███████████████████░ 95.7% (200/209)
+JSON compact ███████████████████░ 95.2% (199/209)
+TOON ██████████████████░░ 91.9% (192/209)
+XML ██████████████████░░ 90.9% (190/209)
+YAML ██████████████████░░ 90.4% (189/209)
+JSON ██████████████████░░ 89.0% (186/209)
+CSV ██████████████████░░ 88.1% (96/109)
 ```
 
 #### Accuracy by Question Type
 
-| Question Type | JSON | XML | JSON
-
-| Field Retrieval |
-| Aggregation |
-| Filtering | 97.9% |
-| Structure Awareness | 88.0% |
-| Structural Validation |
+| Question Type | JSON compact | slimjson | XML | JSON | TOON | YAML | CSV |
+|---------------|-------------|----------|-----|------|------|------|-----|
+| Field Retrieval | 99.3% | 98.5% | 98.5% | 99.3% | 95.6% | 98.5% | 98.4% |
+| Aggregation | 94.4% | 96.0% | 88.9% | 89.7% | 92.9% | 90.5% | 84.5% |
+| Filtering | 97.9% | 96.9% | 94.8% | 91.7% | 93.8% | 92.7% | 88.9% |
+| Structure Awareness | 88.0% | 88.0% | 90.0% | 90.0% | 90.0% | 88.0% | 87.5% |
+| Structural Validation | 60.0% | 30.0% | 80.0% | 50.0% | 40.0% | 50.0% | 80.0% |
 
 #### Datasets Tested
 
-| Dataset | Rows | Structure | CSV Support |
-
-| Uniform employee records | 100 | uniform | ✓ |
-| E-commerce orders (nested) | 50 | nested | ✗ |
-| Time-series analytics data | 60 | uniform | ✓ |
-| Top 100 GitHub repositories | 100 | uniform | ✓ |
-| Semi-uniform event logs | 75 | semi-uniform | ✗ |
-| Deeply nested configuration | 11 | deep | ✗ |
+| Dataset | Rows | Structure | CSV Support | Tabular % |
+|---------|------|-----------|-------------|-----------|
+| Uniform employee records | 100 | uniform | ✓ | 100% |
+| E-commerce orders (nested) | 50 | nested | ✗ | 33% |
+| Time-series analytics data | 60 | uniform | ✓ | 100% |
+| Top 100 GitHub repositories | 100 | uniform | ✓ | 100% |
+| Semi-uniform event logs | 75 | semi-uniform | ✗ | 50% |
+| Deeply nested configuration | 11 | deep | ✗ | 0% |
 
 ## Development
 
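The headline figures in the README hunks can be cross-checked from the per-model counts. The sketch below assumes the combined accuracy is pooled over both models (total correct answers divided by total questions asked); the README does not state this, but the assumption reproduces the published 94.5%, 44.3, 20.3, and 53.0% values. The function names are illustrative and not part of slimjson.

```javascript
// Figures copied from the README tables in this diff.
const QUESTIONS_PER_MODEL = 209;
const results = {
  slimjson: { correct: [195, 200], tokens: 2133 }, // deepseek-v4-flash, mimo-v2.5-pro
  JSON: { correct: [200, 186], tokens: 4538 },
};

// Assumption: accuracy is pooled across model runs, in percent.
function pooledAccuracy(correctCounts, questionsPerRun) {
  const correct = correctCounts.reduce((sum, n) => sum + n, 0);
  return (correct / (correctCounts.length * questionsPerRun)) * 100;
}

// Efficiency score as defined in the README: (accuracy % / tokens) * 1,000.
function efficiencyScore(accuracyPct, tokens) {
  return (accuracyPct / tokens) * 1000;
}

// Percentage of tokens saved relative to a baseline format.
function tokenSavings(tokens, baselineTokens) {
  return (1 - tokens / baselineTokens) * 100;
}

const slimAcc = pooledAccuracy(results.slimjson.correct, QUESTIONS_PER_MODEL);
const jsonAcc = pooledAccuracy(results.JSON.correct, QUESTIONS_PER_MODEL);

console.log(slimAcc.toFixed(1)); // 94.5
console.log(efficiencyScore(slimAcc, results.slimjson.tokens).toFixed(1)); // 44.3
console.log(efficiencyScore(jsonAcc, results.JSON.tokens).toFixed(1)); // 20.3
console.log(tokenSavings(results.slimjson.tokens, results.JSON.tokens).toFixed(1)); // 53.0
```

Note that slimjson's lead in the efficiency ranking comes almost entirely from the token count: its pooled accuracy is within about two points of every other format, while its input is less than half the size of plain JSON.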
package/compress.js CHANGED
package/esm.mjs CHANGED
@@ -2,3 +2,4 @@ import { createRequire } from 'node:module';
 const require = createRequire(import.meta.url);
 const { compress, decompress, stringify, parse } = require('./compress.js');
 export { compress, decompress, stringify, parse };
+export default { compress, decompress, stringify, parse };
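The one added line in esm.mjs gives the ES module a default export next to the existing named exports, so a default import of the module should now resolve to an object holding the same four functions. The sketch below models the resulting module namespace as a plain object so it can run without the package installed. The four function bodies are hypothetical stubs (the real ones come from compress.js, which this diff does not show), and the import specifier is left as `'...'` because the package entry points are not visible here.

```javascript
// Hypothetical stand-ins: the real functions are loaded from compress.js,
// and their signatures and behavior are not shown in this diff.
function compress() {}
function decompress() {}
function stringify() {}
function parse() {}

// What an importer sees after 1.1.1: the four named exports, plus a
// `default` object that bundles the same four function references.
const moduleNamespace = {
  compress,
  decompress,
  stringify,
  parse,
  default: { compress, decompress, stringify, parse },
};

// `import slimjson from '...'` binds to moduleNamespace.default;
// `import { compress } from '...'` binds to moduleNamespace.compress.
const slimjson = moduleNamespace.default;

console.log(slimjson.compress === moduleNamespace.compress); // true
console.log(Object.keys(slimjson)); // [ 'compress', 'decompress', 'stringify', 'parse' ]
```

Before this change, a default import of esm.mjs would fail at link time because the module had no `default` export; only the named form was available.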