vectorjson 0.1.0 → 0.3.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +272 -70
- package/dist/engine-wasm.generated.d.ts +2 -0
- package/dist/engine-wasm.generated.d.ts.map +1 -0
- package/dist/engine.wasm +0 -0
- package/dist/index.d.ts +75 -28
- package/dist/index.d.ts.map +1 -1
- package/dist/index.js +5 -5
- package/dist/index.js.map +5 -4
- package/package.json +5 -4
package/README.md
CHANGED
@@ -1,6 +1,9 @@
 # VectorJSON
 
 [](https://github.com/teamchong/vectorjson/actions/workflows/ci.yml)
+[](https://www.npmjs.com/package/vectorjson)
+[](https://www.npmjs.com/package/vectorjson)
+[](https://github.com/teamchong/vectorjson/blob/main/LICENSE)
 
 O(n) streaming JSON parser for LLM tool calls, built on WASM SIMD. Agents act faster with field-level streaming, detect wrong outputs early to abort and save tokens, and offload parsing to Workers with transferable ArrayBuffers.
 
@@ -27,24 +30,26 @@ for await (const chunk of stream) {
 }
 ```
 
-A 50KB tool call streamed in ~12-char chunks means ~4,000 full re-parses — O(n²). At 100KB, Vercel AI SDK spends
+A 50KB tool call streamed in ~12-char chunks means ~4,000 full re-parses — O(n²). At 100KB, Vercel AI SDK spends 6.1 seconds just parsing. Anthropic SDK spends 13.4 seconds.
 
 ## Quick Start
 
-
+Zero-config — just import and use. No `init()`, no WASM setup:
 
 ```js
-import {
-const vj = await init();
+import { parse, createParser, createEventParser } from "vectorjson";
 
-//
-
-
-
-
+// One-shot parse
+const result = parse('{"tool":"file_edit","path":"app.ts"}');
+result.value.tool; // "file_edit" — lazy Proxy over WASM tape
+```
+
+**Streaming** — O(n) incremental parsing, feed chunks, get a live object:
 
-
-
+```js
+import { createParser } from "vectorjson";
+
+const parser = createParser();
 for await (const chunk of stream) {
   parser.feed(chunk);
   result = parser.getValue(); // O(1) — returns live object
@@ -57,7 +62,7 @@ parser.destroy();
 **Or skip intermediate access entirely** — if you only need the final value:
 
 ```js
-const parser =
+const parser = createParser();
 for await (const chunk of stream) {
   const s = parser.feed(chunk); // O(1) — appends bytes to WASM buffer
   if (s === "complete") break;
@@ -69,7 +74,9 @@ parser.destroy();
 **Event-driven** — react to fields as they arrive, O(n) total, no re-parsing:
 
 ```js
-
+import { createEventParser } from "vectorjson";
+
+const parser = createEventParser();
 
 parser.on('tool', (e) => showToolUI(e.value)); // fires immediately
 parser.onDelta('code', (e) => editor.append(e.value)); // streams char-by-char
@@ -85,7 +92,7 @@ parser.destroy();
 
 ```js
 const abort = new AbortController();
-const parser =
+const parser = createEventParser();
 
 parser.on('name', (e) => {
   if (e.value !== 'str_replace_editor') {
@@ -111,7 +118,7 @@ const buf = parser.getRawBuffer();
 postMessage(buf, [buf]); // O(1) transfer — moves pointer, no copy
 
 // On Main thread:
-const result =
+const result = parse(new Uint8Array(buf)); // lazy Proxy
 result.value.name; // only materializes what you touch
 ```
 
@@ -125,18 +132,22 @@ Apple-to-apple: both sides produce a materialized partial object on every chunk.
 
 | Payload | Product | Original | + VectorJSON | Speedup |
 |---------|---------|----------|-------------|---------|
-| 1 KB | Vercel AI SDK |
-| | Anthropic SDK |
-| | TanStack AI |
-| | OpenClaw |
-
-| | Anthropic SDK |
-| | TanStack AI |
-| | OpenClaw |
-
-| | Anthropic SDK |
-| | TanStack AI |
-| | OpenClaw |
+| 1 KB | Vercel AI SDK | 3.9 ms | 283 µs | **14×** |
+| | Anthropic SDK | 3.3 ms | 283 µs | **12×** |
+| | TanStack AI | 3.2 ms | 283 µs | **11×** |
+| | OpenClaw | 3.8 ms | 283 µs | **14×** |
+| 5 KB | Vercel AI SDK | 23.1 ms | 739 µs | **31×** |
+| | Anthropic SDK | 34.7 ms | 739 µs | **47×** |
+| | TanStack AI | — | 739 µs | — |
+| | OpenClaw | — | 739 µs | — |
+| 50 KB | Vercel AI SDK | 1.80 s | 2.7 ms | **664×** |
+| | Anthropic SDK | 3.39 s | 2.7 ms | **1255×** |
+| | TanStack AI | 2.34 s | 2.7 ms | **864×** |
+| | OpenClaw | 2.73 s | 2.7 ms | **1011×** |
+| 100 KB | Vercel AI SDK | 6.1 s | 6.6 ms | **920×** |
+| | Anthropic SDK | 13.4 s | 6.6 ms | **2028×** |
+| | TanStack AI | 7.0 s | 6.6 ms | **1065×** |
+| | OpenClaw | 8.0 s | 6.6 ms | **1222×** |
 
 Stock parsers re-parse the full buffer on every chunk — O(n²). VectorJSON maintains a **live JS object** that grows incrementally on each `feed()`, so `getValue()` is O(1). Total work: O(n).
 
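As an aside to the benchmark hunk above: the O(n²) vs O(n) claim can be illustrated with a dependency-free sketch that simply counts bytes scanned under each strategy. This is plain JS written for this note, not VectorJSON code, and the chunk size matches the README's ~12-char assumption:

```javascript
// Sketch: why re-parsing the whole buffer on every chunk is O(n^2),
// while an incremental parser that touches each byte once is O(n).
function bytesScannedReparse(totalBytes, chunkSize) {
  // A stock parser re-parses everything received so far after each chunk.
  let scanned = 0;
  for (let seen = chunkSize; seen <= totalBytes; seen += chunkSize) {
    scanned += seen; // full re-parse of the growing buffer
  }
  return scanned;
}

function bytesScannedIncremental(totalBytes) {
  // An incremental parser processes only the new bytes of each chunk.
  return totalBytes;
}

// 50 KB streamed in 12-byte chunks, as in the README's example.
const total = 50 * 1024;
const amplification = bytesScannedReparse(total, 12) / bytesScannedIncremental(total);
console.log(amplification); // work amplification factor (roughly 2000x here)
```

The amplification factor grows linearly with payload size, which is why the measured gap widens from 14× at 1 KB to four digits at 50 KB.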
@@ -148,10 +159,10 @@ The real cost isn't just CPU time — it's blocking the agent's main thread. Sim
 
 | Payload | Stock total | VectorJSON total | Main thread freed |
 |---------|-----------|-----------------|-------------------|
-
-
-
-
+| 1 KB | 4.0 ms | 1.7 ms | 2.3 ms sooner |
+| 10 KB | 36.7 ms | 1.9 ms | 35 ms sooner |
+| 50 KB | 665 ms | 3.8 ms | **661 ms sooner** |
+| 100 KB | 2.42 s | 10.2 ms | **2.4 seconds sooner** |
 
 Both approaches detect the tool name (`.name`) at the same chunk — the LLM hasn't streamed more yet. But while VectorJSON finishes processing all chunks in milliseconds, the stock parser blocks the main thread for the entire duration. The agent can't render UI, stream code to the editor, or start running tools until parsing is done.
 
@@ -188,19 +199,24 @@ The event parser (`createEventParser`) adds path-matching on top: it diffs the t
 
 ```bash
 npm install vectorjson
+# or
+pnpm add vectorjson
+# or
+bun add vectorjson
+# or
+yarn add vectorjson
 ```
 
 ## Usage
 
-###
+### Streaming parse
 
-
+Feed chunks as they arrive from any source — raw fetch, WebSocket, SSE, or your own transport:
 
 ```js
-import {
-const vj = await init();
+import { createParser } from "vectorjson";
 
-const parser =
+const parser = createParser();
 for await (const chunk of stream) {
   const s = parser.feed(chunk);
   if (s === "complete" || s === "end_early") break;
@@ -209,7 +225,9 @@ const result = parser.getValue(); // lazy Proxy — materializes on access
 parser.destroy();
 ```
 
-
+### Vercel AI SDK-compatible signature
+
+If you have code that calls `parsePartialJson`, VectorJSON provides a compatible function:
 
 ```js
 // Before
@@ -217,17 +235,20 @@ import { parsePartialJson } from "ai";
 const { value, state } = parsePartialJson(buffer);
 
 // After
-import {
-const
-const { value, state } = vj.parsePartialJson(buffer);
+import { parsePartialJson } from "vectorjson";
+const { value, state } = parsePartialJson(buffer);
 ```
 
+> **Note:** AI SDKs (Vercel, Anthropic, TanStack) parse JSON internally inside `streamObject()`, `MessageStream`, etc. — you don't get access to the raw chunks. To use VectorJSON today, work with the raw LLM stream directly (raw fetch, WebSocket, SSE).
+
 ### Event-driven: React to fields as they stream in
 
 When an LLM streams a tool call, you usually care about specific fields at specific times. `createEventParser` lets you subscribe to paths and get notified the moment a value completes or a string grows:
 
 ```js
-
+import { createEventParser } from "vectorjson";
+
+const parser = createEventParser();
 
 // Get the tool name the moment it's complete
 parser.on('tool_calls[*].name', (e) => {
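The hunk above documents a `parsePartialJson` that returns a repaired value for truncated input. As an illustration of the repair idea only, here is a naive dependency-free sketch written for this note. It is not VectorJSON's implementation (that runs in WASM) and deliberately skips hard cases such as a dangling `"key":` with no value:

```javascript
// Naive partial-JSON repair: close an unterminated string, then close
// any still-open containers in reverse order of opening.
function repairPartialJson(input) {
  const stack = [];        // pending closers, e.g. ["}", "]"]
  let inString = false;
  let escaped = false;
  for (const ch of input) {
    if (inString) {
      if (escaped) escaped = false;
      else if (ch === "\\") escaped = true;
      else if (ch === '"') inString = false;
      continue;
    }
    if (ch === '"') inString = true;
    else if (ch === "{") stack.push("}");
    else if (ch === "[") stack.push("]");
    else if (ch === "}" || ch === "]") stack.pop();
  }
  let out = input;
  if (inString) out += '"';          // terminate the open string
  while (stack.length) out += stack.pop(); // close open containers
  return out;
}

console.log(JSON.parse(repairPartialJson('{"name":"Al'))); // { name: 'Al' }
```

Real repair logic also has to autocomplete dangling keys, trailing commas, and partial literals like `tru`, which is part of why the README delegates this to the WASM engine.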
@@ -254,7 +275,9 @@ parser.destroy();
 Some LLM APIs stream multiple JSON values separated by newlines. VectorJSON auto-resets between values:
 
 ```js
-
+import { createEventParser } from "vectorjson";
+
+const parser = createEventParser({
   multiRoot: true,
   onRoot(event) {
     console.log(`Root #${event.index}:`, event.value);
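For intuition about the multi-root mode documented above, here is a dependency-free sketch of the splitting logic, assuming newline-delimited JSON values. The helper name `createNdjsonFeeder` is invented for this note; VectorJSON does the equivalent inside the parser without buffering strings in JS:

```javascript
// Sketch: feed arbitrary chunks, emit one { index, value } event per
// complete newline-terminated JSON value, buffering partial lines.
function createNdjsonFeeder(onRoot) {
  let buffer = "";
  let index = 0;
  return function feed(chunk) {
    buffer += chunk;
    let nl;
    while ((nl = buffer.indexOf("\n")) !== -1) {
      const line = buffer.slice(0, nl).trim();
      buffer = buffer.slice(nl + 1);
      if (line) onRoot({ index: index++, value: JSON.parse(line) });
    }
  };
}

const roots = [];
const feed = createNdjsonFeeder((e) => roots.push(e));
feed('{"a":1}\n{"b"'); // first value completes, second is still partial
feed(':2}\n');         // second value completes
console.log(roots.length); // 2
```

Note the chunk boundary falls in the middle of the second value; the feeder holds the partial line until the closing newline arrives, which mirrors the auto-reset behavior described in the diff.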
@@ -272,7 +295,9 @@ parser.destroy();
 Some models emit thinking text before JSON, or wrap JSON in code fences. VectorJSON finds the JSON automatically:
 
 ```js
-
+import { createEventParser } from "vectorjson";
+
+const parser = createEventParser();
 parser.on('answer', (e) => console.log(e.value));
 parser.onText((text) => thinkingPanel.append(text)); // opt-in
 
@@ -283,6 +308,73 @@ parser.onText((text) => thinkingPanel.append(text)); // opt-in
 parser.feed(llmOutput);
 ```
 
+### Field picking — only parse what you need
+
+When streaming a large tool call, you often only need 2-3 fields. `pick` tells the parser to skip everything else during byte scanning — skipped fields never allocate JS objects:
+
+```js
+import { createParser } from "vectorjson";
+
+const parser = createParser({ pick: ["name", "age"] });
+parser.feed('{"name":"Alice","age":30,"bio":"...10KB of text...","metadata":{}}');
+parser.getValue(); // { name: "Alice", age: 30 } — bio and metadata never materialized
+parser.destroy();
+```
+
+Nested paths work with dot notation:
+
+```js
+const parser = createParser({ pick: ["user.name", "user.age"] });
+parser.feed('{"user":{"name":"Bob","age":25,"role":"admin"},"extra":"data"}');
+parser.getValue(); // { user: { name: "Bob", age: 25 } }
+parser.destroy();
+```
+
+### `for await` — pull-based streaming from any source
+
+Pass a `source` (ReadableStream or AsyncIterable) and iterate with `for await`. Each iteration yields the growing partial value:
+
+```js
+import { createParser } from "vectorjson";
+
+const parser = createParser({ source: response.body });
+
+for await (const partial of parser) {
+  console.log(partial);
+  // { name: "Ali" }
+  // { name: "Alice" }
+  // { name: "Alice", age: 30 }
+}
+// Parser auto-destroys when the source ends or you break out of the loop
+```
+
+Combine `pick` + `source` for minimal allocation streaming:
+
+```js
+const parser = createParser({
+  pick: ["name", "age"],
+  source: response.body,
+});
+
+for await (const partial of parser) {
+  updateUI(partial); // only picked fields, growing incrementally
+}
+```
+
+Works with any async source — fetch body, WebSocket wrapper, SSE adapter, or a plain async generator:
+
+```js
+async function* chunks() {
+  yield '{"status":"';
+  yield 'ok","data":';
+  yield '[1,2,3]}';
+}
+
+for await (const partial of createParser({ source: chunks() })) {
+  console.log(partial);
+}
+```
+
 ### Schema validation
 
 Validate and auto-infer types with Zod, Valibot, ArkType, or any lib with `.safeParse()`. Works on all three APIs:
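To make the `pick` semantics in the hunk above concrete, here is a dependency-free post-hoc sketch: given an already-parsed object and a list of dot-separated paths, keep only those paths. The function name `pickPaths` is invented for this note; VectorJSON applies the filter during byte scanning in WASM rather than after parsing:

```javascript
// Sketch: keep only the listed dot-separated paths of a parsed object.
function pickPaths(obj, paths) {
  const out = {};
  for (const path of paths) {
    const keys = path.split(".");
    // Walk the source object to the picked leaf.
    let src = obj;
    let found = true;
    for (const k of keys) {
      if (src === null || typeof src !== "object" || !(k in src)) { found = false; break; }
      src = src[k];
    }
    if (!found) continue; // missing paths are simply absent from the result
    // Rebuild the same nested shape in the output.
    let dst = out;
    for (const k of keys.slice(0, -1)) {
      dst = dst[k] ??= {};
    }
    dst[keys[keys.length - 1]] = src;
  }
  return out;
}

const doc = { user: { name: "Bob", age: 25, role: "admin" }, extra: "data" };
console.log(pickPaths(doc, ["user.name", "user.age"]));
// { user: { name: 'Bob', age: 25 } }
```

The WASM version gains its speed by never allocating the skipped fields in the first place; this sketch only shows which fields survive.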
@@ -291,9 +383,11 @@ Validate and auto-infer types with Zod, Valibot, ArkType, or any lib with `.safe
 
 ```ts
 import { z } from 'zod';
+import { createParser } from "vectorjson";
+
 const User = z.object({ name: z.string(), age: z.number() });
 
-const parser =
+const parser = createParser(User); // T inferred from schema
 for await (const chunk of stream) {
   parser.feed(chunk);
   const partial = parser.getValue(); // { name: "Ali" } mid-stream — always available
@@ -307,7 +401,9 @@ parser.destroy();
 **Partial JSON** — returns `DeepPartial<T>` because incomplete JSON has missing fields:
 
 ```ts
-
+import { parsePartialJson } from "vectorjson";
+
+const { value, state } = parsePartialJson('{"name":"Al', User);
 // value: { name: "Al" } — partial object, typed as DeepPartial<{ name: string; age: number }>
 // state: "repaired-parse"
 // TypeScript type: { name?: string; age?: number } | undefined
@@ -326,12 +422,41 @@ parser.on('tool_calls[*]', ToolCall, (event) => {
 
 Schema-agnostic: any object with `{ safeParse(v) → { success: boolean; data?: T } }` works.
 
+### Deep compare — compare JSON without materializing
+
+Compare two parsed values directly in WASM memory. Returns a boolean — no JS objects allocated, no Proxy traps fired. Useful for diffing LLM outputs, caching, or deduplication:
+
+```js
+import { parse, deepCompare } from "vectorjson";
+
+const a = parse('{"name":"Alice","age":30}').value;
+const b = parse('{"age":30,"name":"Alice"}').value;
+
+deepCompare(a, b); // true — key order ignored by default
+deepCompare(a, b, { ignoreKeyOrder: false }); // false — keys must be in same order
+```
+
+By default, `deepCompare` ignores key order — `{"a":1,"b":2}` equals `{"b":2,"a":1}`, just like `fast-deep-equal`. Set `{ ignoreKeyOrder: false }` for strict key order comparison, which is ~2× faster when you know both values come from the same source.
+
+```
+bun --expose-gc bench/deep-compare.mjs
+
+Equal objects (560 KB):
+  JS deepEqual (recursive)         848 ops/s   heap Δ 2.4 MB
+  VJ ignore key order (default)   1.63K ops/s  heap Δ 0.1 MB   2× faster
+  VJ strict key order             3.41K ops/s  heap Δ 0.1 MB   4× faster
+```
+
+Works with any combination: two VJ proxies (fast WASM path), plain JS objects, or mixed (falls back to `JSON.stringify` comparison).
+
 ### Lazy access — only materialize what you touch
 
-`
+`parse()` returns a lazy Proxy backed by the WASM tape. Fields are only materialized into JS objects when you access them. On a 2 MB payload, reading one field is 2× faster than `JSON.parse` because the other 99% is never allocated:
 
 ```js
-
+import { parse } from "vectorjson";
+
+const result = parse(huge2MBToolCall);
 result.value.tool; // "file_edit" — reads from WASM tape, 2.3ms
 result.value.path; // "app.ts"
 // result.value.code (the 50KB field) is never materialized in JS memory
@@ -351,18 +476,28 @@ bun --expose-gc bench/partial-access.mjs
 For non-streaming use cases:
 
 ```js
-
+import { parse } from "vectorjson";
+
+const result = parse('{"users": [{"name": "Alice"}]}');
 result.status; // "complete" | "complete_early" | "incomplete" | "invalid"
 result.value.users; // lazy Proxy — materializes on access
 ```
 
 ## API Reference
 
+### Direct exports (recommended)
+
+All functions are available as direct imports — no `init()` needed:
+
+```js
+import { parse, parsePartialJson, deepCompare, createParser, createEventParser, materialize } from "vectorjson";
+```
+
 ### `init(options?): Promise<VectorJSON>`
 
-
+Returns the cached singleton. Useful for passing custom WASM via `{ engineWasm?: string | URL | BufferSource }`. Called automatically on import.
 
-### `
+### `parse(input: string | Uint8Array): ParseResult`
 
 ```ts
 interface ParseResult {
@@ -380,9 +515,34 @@ interface ParseResult {
 - **`incomplete`** — truncated JSON; value is autocompleted, `isComplete()` tells you what's real
 - **`invalid`** — broken JSON
 
-### `
+### `createParser(schema?): StreamingParser<T>`
+### `createParser(options?): StreamingParser<T>`
+
+Each `feed()` processes only new bytes — O(n) total. Three overloads:
+
+```ts
+createParser();                         // no validation
+createParser(schema);                   // schema validation (Zod, Valibot, etc.)
+createParser({ pick, schema, source }); // options object
+```
 
-
+**Options object:**
+
+```ts
+interface CreateParserOptions<T = unknown> {
+  pick?: string[];     // only include these fields (dot-separated paths)
+  schema?: ZodLike<T>; // validate on complete
+  source?: ReadableStream<Uint8Array> | AsyncIterable<Uint8Array | string>;
+}
+```
+
+When `source` is provided, the parser becomes async-iterable — use `for await` to consume partial values:
+
+```ts
+for await (const partial of createParser({ source: stream, pick: ["name"] })) {
+  console.log(partial); // growing object with only picked fields
+}
+```
 
 ```ts
 interface StreamingParser<T = unknown> {
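A side note on the overload set documented above: a single factory accepting nothing, a schema, or an options object is usually implemented by sniffing the argument shape. The helper below is a hypothetical sketch of that dispatch, written for this note (the name `normalizeCreateParserArgs` is invented, not VectorJSON API); it treats anything with a `safeParse` function as a schema, matching the README's "ZodLike" duck type:

```javascript
// Sketch: normalize createParser's three call shapes into one options object.
function normalizeCreateParserArgs(arg) {
  if (arg == null) return {};                                  // createParser()
  if (typeof arg.safeParse === "function") return { schema: arg }; // createParser(schema)
  return { ...arg };                                           // createParser({ pick, schema, source })
}

const zodLike = { safeParse: (v) => ({ success: true, data: v }) };
console.log(normalizeCreateParserArgs(zodLike).schema === zodLike); // true
console.log(normalizeCreateParserArgs({ pick: ["name"] }).pick);    // [ 'name' ]
```

Dispatching on the `safeParse` duck type rather than `instanceof` is what keeps the factory library-agnostic across Zod, Valibot, and ArkType.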
@@ -392,6 +552,7 @@ interface StreamingParser<T = unknown> {
   getRawBuffer(): ArrayBuffer | null; // transferable buffer for Worker postMessage
   getStatus(): FeedStatus;
   destroy(): void;
+  [Symbol.asyncIterator](): AsyncIterableIterator<T | undefined>; // requires source
 }
 type FeedStatus = "incomplete" | "complete" | "error" | "end_early";
 ```
@@ -400,16 +561,18 @@ While incomplete, `getValue()` returns the **live document** — a mutable JS ob
 
 ```ts
 import { z } from 'zod';
+import { createParser } from "vectorjson";
+
 const User = z.object({ name: z.string(), age: z.number() });
 
-const parser =
+const parser = createParser(User);
 parser.feed('{"name":"Alice","age":30}');
 const val = parser.getValue(); // { name: string; age: number } | undefined ✅
 ```
 
 Works with Zod, Valibot, ArkType — any library with `{ safeParse(v) → { success, data? } }`.
 
-### `
+### `parsePartialJson(input, schema?): PartialJsonResult<DeepPartial<T>>`
 
 Compatible with Vercel AI SDK's `parsePartialJson` signature. Returns a plain JS object (not a Proxy). Pass an optional schema for type-safe validation.
 
@@ -427,7 +590,7 @@ type DeepPartial<T> = T extends object
   : T;
 ```
 
-### `
+### `createEventParser(options?): EventParser`
 
 Event-driven streaming parser. Events fire synchronously during `feed()`.
 
@@ -485,7 +648,23 @@ interface RootEvent {
 }
 ```
 
-### `
+### `deepCompare(a, b, options?): boolean`
+
+Compare two values for deep equality without materializing JS objects. When both values are VJ proxies, comparison runs entirely in WASM memory — zero allocations, zero Proxy traps.
+
+```ts
+deepCompare(
+  a: unknown,
+  b: unknown,
+  options?: { ignoreKeyOrder?: boolean } // default: true
+): boolean
+```
+
+- **`ignoreKeyOrder: true`** (default) — `{"a":1,"b":2}` equals `{"b":2,"a":1}`. Same semantics as `fast-deep-equal`.
+- **`ignoreKeyOrder: false`** — keys must appear in the same order. ~2× faster for same-source comparisons.
+- Falls back to `JSON.stringify` comparison when either value is a plain JS object.
+
+### `materialize(value): unknown`
 
 Convert a lazy Proxy into a plain JS object tree. No-op on plain values.
 
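For reference against the `deepCompare` semantics documented above, here is the plain-JS recursive baseline it is benchmarked against: structural equality with key order ignored, roughly what `fast-deep-equal` computes for JSON values. This sketch is written for this note, not VectorJSON code, and allocates key arrays on every object, which is exactly the overhead the WASM path avoids:

```javascript
// Baseline: recursive structural equality over JSON values, key order ignored.
function deepEqualIgnoreKeyOrder(a, b) {
  if (a === b) return true;
  if (typeof a !== "object" || typeof b !== "object" || a === null || b === null) {
    return false; // differing primitives (NaN and ±0 edge cases ignored here)
  }
  if (Array.isArray(a) !== Array.isArray(b)) return false;
  if (Array.isArray(a)) {
    // Arrays are order-sensitive even when object keys are not.
    return a.length === b.length && a.every((v, i) => deepEqualIgnoreKeyOrder(v, b[i]));
  }
  const keysA = Object.keys(a);
  if (keysA.length !== Object.keys(b).length) return false;
  return keysA.every((k) => k in b && deepEqualIgnoreKeyOrder(a[k], b[k]));
}

console.log(deepEqualIgnoreKeyOrder({ a: 1, b: 2 }, { b: 2, a: 1 })); // true
console.log(deepEqualIgnoreKeyOrder({ a: 1 }, { a: 2 }));             // false
```

The strict-key-order variant in the diff can instead walk both WASM tapes in lockstep, which is why it benchmarks fastest.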
@@ -493,26 +672,41 @@ Convert a lazy Proxy into a plain JS object tree. No-op on plain values.
 
 | Runtime | Status | Notes |
 |---------|--------|-------|
-| Node.js 20+ | ✅ | WASM
-| Bun | ✅ | WASM
-| Browsers | ✅ |
-| Deno | ✅ |
-| Cloudflare Workers | ✅ |
+| Node.js 20+ | ✅ | WASM embedded in bundle — zero config |
+| Bun | ✅ | WASM embedded in bundle — zero config |
+| Browsers | ✅ | WASM embedded in bundle — zero config |
+| Deno | ✅ | WASM embedded in bundle — zero config |
+| Cloudflare Workers | ✅ | WASM embedded in bundle — zero config |
+
+WASM is embedded as base64 in the JS bundle and auto-initialized via top-level `await`. No setup required — just `import { parse } from "vectorjson"`.
 
-For
+For advanced use cases, you can still provide a custom WASM binary via `init()`:
 
 ```js
 import { init } from "vectorjson";
+const vj = await init({ engineWasm: customWasmBytes });
+```
+
+Bundle size: ~148 KB JS with embedded WASM (~47 KB gzipped). No runtime dependencies.
+
+## Runnable Examples
 
-
-const vj = await init({ engineWasm: new URL('./engine.wasm', import.meta.url) });
+The `examples/` directory has working demos you can run immediately:
 
-
-
-
+```bash
+# Anthropic tool call — streams fields as they arrive, early abort demo
+bun examples/anthropic-tool-call.ts --mock
+bun examples/anthropic-tool-call.ts --mock --wrong-tool # early abort
+
+# OpenAI function call — streams function arguments via EventParser
+bun examples/openai-function-call.ts --mock
+
+# With a real API key:
+ANTHROPIC_API_KEY=sk-ant-... bun examples/anthropic-tool-call.ts
+OPENAI_API_KEY=sk-... bun examples/openai-function-call.ts
 ```
 
-
+See also `examples/ai-usage.ts` for additional patterns (MCP stdio, Vercel AI SDK `streamObject`, NDJSON embeddings).
 
 ## Building from Source
 
@@ -528,7 +722,7 @@ sudo apt-get install -y binaryen
 
 ```bash
 bun run build # Zig → WASM → wasm-opt → TypeScript
-bun run test #
+bun run test # 724+ tests including 100MB stress payloads
 bun run test:worker # Worker transferable tests (Playwright + Chromium)
 ```
 
@@ -538,9 +732,17 @@ To reproduce benchmarks:
 bun --expose-gc bench/parse-stream.mjs # one-shot + streaming parse
 cd bench/ai-parsers && bun install && bun --expose-gc bench.mjs # AI SDK comparison
 bun run bench:worker # Worker transfer vs structured clone benchmark
+node --expose-gc bench/deep-compare.mjs # deep compare: VJ vs JS deepEqual
 ```
 
-Benchmark numbers in this README were measured on
+Benchmark numbers in this README were measured on GitHub Actions (Ubuntu, x86_64). Results vary by machine but relative speedups are consistent.
+
+## Acknowledgments
+
+VectorJSON is built on the work of:
+
+- **[zimdjson](https://github.com/EzequielRamis/zimdjson)** by Ezequiel Ramis — a Zig port of simdjson that powers the WASM engine
+- **[simdjson](https://simdjson.org/)** by Daniel Lemire & Geoff Langdale — the SIMD-accelerated JSON parsing research that started it all
 
 ## License
 