vectorjson 0.4.7 → 0.5.1
This diff shows the changes between publicly released versions of this package, as published to a supported registry. It is provided for informational purposes only.
- package/README.md +25 -9
- package/dist/engine-wasm.generated.d.ts +1 -1
- package/dist/engine-wasm.generated.d.ts.map +1 -1
- package/dist/engine.wasm +0 -0
- package/dist/index.d.ts +22 -6
- package/dist/index.d.ts.map +1 -1
- package/dist/index.js +7 -5
- package/dist/index.js.map +4 -4
- package/package.json +2 -2
package/README.md
CHANGED
@@ -5,7 +5,7 @@
 [](https://www.npmjs.com/package/vectorjson)
 [](https://github.com/teamchong/vectorjson/blob/main/LICENSE)
 
-O(n) streaming JSON parser for LLM tool calls, built on WASM SIMD. Agents act faster with field-level streaming, detect wrong outputs early to abort and save tokens
+O(n) streaming JSON parser for LLM tool calls, built on WASM SIMD. Agents act faster with field-level streaming, detect wrong outputs early to abort and save tokens.
 
 ## The Problem
 
@@ -124,9 +124,7 @@ for await (const chunk of llmStream({ signal: abort.signal })) {
 }
 ```
 
-**Worker offload** —
-
-`getTapeBuffer()` exports the SIMD-parsed tape + input as a single packed ArrayBuffer. `postMessage(buf, [buf])` transfers it in O(1). The main thread imports it with `importTape()` — zero parsing, just memcpy into a document slot:
+**Worker offload** — `getTapeBuffer()` + `importTape()` for lazy access without re-parsing:
 
 ```js
 // In Worker:
@@ -136,11 +134,11 @@ postMessage(tape, [tape]); // O(1) transfer — moves pointer, no copy
 
 // On Main thread:
 import { importTape } from "vectorjson";
-const obj = importTape(tape); //
-obj.name; // lazy Proxy
+const obj = importTape(tape); // imports pre-built tape, no parse
+obj.name; // lazy Proxy — only materializes fields you access
 ```
 
-
+**Note:** For full materialization, `JSON.parse` in Worker + structured clone is faster end-to-end — Chrome's C++ structured clone is highly optimized. Tape transfer wins when you only need a few fields on the main thread via lazy Proxy access, avoiding full object materialization.
 
 ## Benchmarks
 
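The O(1) transfer described in the hunk above relies on standard ArrayBuffer transfer semantics rather than anything specific to vectorjson. A minimal sketch of that mechanism, using the `structuredClone` transfer option (Node 17+ and modern browsers) in place of a real `postMessage` so it runs without a Worker:

```javascript
// Zero-copy transfer sketch: the buffer's backing store moves to the new
// owner and the original is detached. This is the same mechanism that
// postMessage(buf, [buf]) uses between a Worker and the main thread.
const buf = new ArrayBuffer(16);
new Uint8Array(buf)[0] = 42;

const moved = structuredClone(buf, { transfer: [buf] });

console.log(new Uint8Array(moved)[0]); // 42 — same bytes, new owner
console.log(buf.byteLength);           // 0 — original detached, no copy made
```

Because the backing store moves instead of being copied, the cost is independent of buffer size, which is why exporting the parsed tape as one packed ArrayBuffer makes the hand-off O(1).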
@@ -169,6 +167,24 @@ Apple-to-apple: both sides produce a materialized partial object on every chunk.
 
 Stock parsers re-parse the full buffer on every chunk — O(n²). VectorJSON maintains a **live JS object** that grows incrementally on each `feed()`, so `getValue()` is O(1). Total work: O(n).
 
+### What about single-shot JSON.parse?
+
+For one-shot full materialization, `JSON.parse` is faster — it's highly optimized C++ running inside V8. On small payloads it's orders of magnitude faster. On large payloads (500KB+) VectorJSON approaches parity but never beats it for full access:
+
+```
+bun --expose-gc bench/parse-stream.mjs   (one-shot section, CI results)
+
+1 KB     JSON.parse 2.16M ops/s   VectorJSON 1.37K ops/s   JSON.parse wins
+10 KB    JSON.parse 9.64K ops/s   VectorJSON 1.37K ops/s   JSON.parse wins
+500 KB   JSON.parse 490 ops/s     VectorJSON 466 ops/s     ~equal (0.95×)
+2 MB     JSON.parse 183 ops/s     VectorJSON 172 ops/s     ~equal (0.94×)
+```
+
+VectorJSON is not a replacement for `JSON.parse`. It wins in different scenarios:
+- **Streaming**: O(n) incremental vs O(n²) re-parse on every chunk
+- **Partial access**: read 3 fields from a 100KB payload without materializing the other 97%
+- **Deep compare**: WASM tape-level comparison with zero JS allocations
+
 ### Why this matters: main thread availability
 
 The real cost isn't just CPU time — it's blocking the agent's main thread. Simulating an Anthropic `tool_use` content block (`str_replace_editor`) streamed in ~12-char chunks:
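The O(n) vs O(n²) claim in the added section above can be made concrete by counting bytes touched. A toy cost model, not vectorjson's implementation: the "stock parser" column re-scans the whole accumulated buffer on every chunk, while the incremental column scans each byte once.

```javascript
// Stock-parser cost: JSON.parse(buffer) on every chunk touches every
// accumulated byte again, so total work grows quadratically.
function reparseCost(chunks) {
  let buffer = "", work = 0;
  for (const c of chunks) {
    buffer += c;
    work += buffer.length; // full re-scan of everything received so far
  }
  return work;
}

// Incremental cost: each byte is scanned exactly once as it arrives.
function incrementalCost(chunks) {
  let work = 0;
  for (const c of chunks) work += c.length;
  return work;
}

// 100 chunks of ~12 chars, like the streamed tool_use block above.
const chunks = Array.from({ length: 100 }, () => "x".repeat(12));
console.log(reparseCost(chunks));     // 60600 — quadratic in chunk count
console.log(incrementalCost(chunks)); // 1200  — linear in input size
```

With 100 chunks the re-parse strategy already does ~50× the work; the gap widens linearly as the stream gets longer.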
@@ -186,11 +202,11 @@ Both approaches detect the tool name (`.name`) at the same chunk — the LLM has
 
 For even more control, use `createEventParser()` for field-level subscriptions or only call `getValue()` once when `feed()` returns `"complete"`.
 
-### Worker Transfer
+### Worker Transfer
 
 `bun run bench:worker` (requires Playwright + Chromium)
 
-Measures
+Measures full round-trip time (postMessage → worker parse → transfer → main receive + field access) in a real browser. For full materialization, `JSON.parse` + structured clone wins — Chrome's C++ clone is faster than WASM parse + tape export/import. Tape transfer is useful when you only access a few fields via lazy Proxy on the main thread.
 
 <details>
 <summary>Which products use which parser</summary>