vectorjson 0.3.1 → 0.4.0 (npm)

Files changed (9)

package/README.md +64 -56
package/dist/engine-wasm.generated.d.ts +1 -1
package/dist/engine-wasm.generated.d.ts.map +1 -1
package/dist/engine.wasm +0 -0
package/dist/index.d.ts +23 -13
package/dist/index.d.ts.map +1 -1
package/dist/index.js +5 -5
package/dist/index.js.map +4 -4
package/package.json +2 -2
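
Reviewer note: the README diff for this release removes the `multiRoot` / `onRoot` options, `skip()`, and the `pick` option, and adds `format: "jsonl" | "json5"` plus `resetForNext()`. The sketch below only illustrates what "yields each value separately" means for newline-delimited input that arrives in arbitrary chunks. It is a hypothetical stand-in written with plain `JSON.parse`, not vectorjson's implementation, and the name `parseJsonlChunks` is an assumption made up for this note.

```javascript
// Hypothetical illustration only: NOT vectorjson code.
// Buffers string chunks and yields one parsed value per newline-terminated line.
function* parseJsonlChunks(chunks) {
  let buffer = "";
  for (const chunk of chunks) {
    buffer += chunk;
    let newline;
    // Emit every complete line currently held in the buffer.
    while ((newline = buffer.indexOf("\n")) !== -1) {
      const line = buffer.slice(0, newline).trim();
      buffer = buffer.slice(newline + 1);
      if (line !== "") yield JSON.parse(line);
    }
  }
  // A final value without a trailing newline is still emitted.
  const tail = buffer.trim();
  if (tail !== "") yield JSON.parse(tail);
}

// Chunk boundaries fall in the middle of values, as they would in a network stream.
const values = [...parseJsonlChunks(['{"user":"Al', 'ice"}\n{"user":', '"Bob"}\n'])];
console.log(values.map((v) => v.user).join(", ")); // prints "Alice, Bob"
```

Code that relied on `multiRoot: true` with `onRoot` in 0.3.1 would presumably move to `format: "jsonl"` as shown in the README changes; that mapping is inferred from the diff, not stated by the package authors.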

package/README.md CHANGED

@@ -98,7 +98,6 @@ const parser = createEventParser();
 parser.on('tool', (e) => showToolUI(e.value));             // fires immediately
 parser.onDelta('code', (e) => editor.append(e.value));     // streams char-by-char
-parser.skip('explanation');                                // never materialized
 for await (const chunk of llmStream) {
   parser.feed(chunk);  // O(n) — only new bytes scanned
@@ -279,35 +278,12 @@ parser.onDelta('tool_calls[0].args.code', (e) => {
   editor.append(e.value); // just the new characters, decoded
 });
-// Don't waste CPU on fields you don't need
-parser.skip('tool_calls[*].args.explanation');
 for await (const chunk of llmStream) {
   parser.feed(chunk);
 }
 parser.destroy();
 ```
-### Multi-root / NDJSON
-Some LLM APIs stream multiple JSON values separated by newlines. VectorJSON auto-resets between values:
-```js
-import { createEventParser } from "vectorjson";
-const parser = createEventParser({
-  multiRoot: true,
-  onRoot(event) {
-    console.log(`Root #${event.index}:`, event.value);
-  }
-});
-for await (const chunk of ndjsonStream) {
-  parser.feed(chunk);
-}
-parser.destroy();
-```
 ### Mixed LLM output (chain-of-thought, code fences)
 Some models emit thinking text before JSON, or wrap JSON in code fences. VectorJSON finds the JSON automatically:
@@ -343,15 +319,6 @@ for await (const partial of createParser({ schema: User, source: response.body }
 }
 ```
-Works on dirty LLM output — think tags, code fences, and leading prose are stripped automatically when a schema is provided:
-```js
-// All of these work with createParser(schema):
-// <think>reasoning</think>{"name":"Alice","age":30}
-// ```json\n{"name":"Alice","age":30}\n```
-// Here's the result: {"name":"Alice","age":30}
-```
 Both `createParser` and `createEventParser` support `source` + `for await`:
 ```js
@@ -487,6 +454,24 @@ result.status;       // "complete" | "complete_early" | "incomplete" | "invalid"
 result.value.users;  // lazy Proxy — materializes on access
 ```
+### JSONL & JSON5
+Both `createParser` and `createEventParser` accept `format: "jsonl" | "json5"`:
+```js
+// JSONL — yields each value separately
+for await (const value of createParser({ format: "jsonl", source: stream })) {
+  console.log(value);  // { user: "Alice" }, { user: "Bob" }, ...
+}
+// JSON5 — comments, trailing commas, unquoted keys, single-quoted strings, hex, Infinity/NaN
+const p = createParser({ format: "json5" });
+p.feed(`{ name: 'Alice', tags: ['admin',], color: 0xFF0000, timeout: Infinity, }`);
+p.getValue(); // { name: "Alice", tags: ["admin"], color: 16711680, timeout: Infinity }
+```
+JSONL push-based: call `resetForNext()` after each value. JSON5 comments are stripped at the byte level during streaming.
 ## API Reference
 ### Direct exports (recommended)
@@ -526,7 +511,7 @@ Each `feed()` processes only new bytes — O(n) total. Three overloads:
 ```ts
 createParser();                    // no validation
-createParser(schema);              // schema validation + auto-pick + dirty input handling
+createParser(schema);              // only parse schema fields, validate on complete
 createParser({ schema, source });  // options object
 ```
@@ -534,16 +519,15 @@ createParser({ schema, source });  // options object
 ```ts
 interface CreateParserOptions<T = unknown> {
-  schema?: ZodLike<T>;   // validate on complete, auto-pick from shape, skip dirty input
+  schema?: ZodLike<T>;   // only parse schema fields, validate on complete
   source?: ReadableStream<Uint8Array> | AsyncIterable<Uint8Array | string>;
-  pick?: string[];       // advanced: explicit field paths (overrides schema auto-pick)
+  format?: "json" | "jsonl" | "json5";  // default: "json"
 }
 ```
 When a `schema` is provided:
-- Fields are auto-picked from the schema's `.shape` — only matching fields are parsed
-- Arrays are transparent — `{ users: z.array(z.object({ name })) }` picks `users.name` through arrays
-- Dirty input (think tags, code fences, leading prose) is stripped before parsing
+- Only fields defined in the schema are parsed — everything else is skipped at the byte level
+- Arrays are transparent — `z.array(z.object({ name }))` parses `name` inside each array element
 - On complete, `safeParse()` validates the final value
 When `source` is provided, the parser becomes async-iterable — use `for await` to consume partial values:
@@ -561,6 +545,7 @@ interface StreamingParser<T = unknown> {
   getRemaining(): Uint8Array | null;
   getRawBuffer(): ArrayBuffer | null;  // transferable buffer for Worker postMessage
   getStatus(): FeedStatus;
+  resetForNext(): number;  // JSONL: reset for next value, returns remaining byte count
   destroy(): void;
   [Symbol.asyncIterator](): AsyncIterableIterator<T | undefined>;  // requires source
 }
@@ -607,16 +592,16 @@ Event-driven streaming parser. Events fire synchronously during `feed()`.
 ```ts
 createEventParser();                              // basic
 createEventParser({ source: stream });            // for-await iteration
-createEventParser({ multiRoot: true, onRoot });   // NDJSON
+createEventParser({ schema, source });              // schema + for-await
 ```
 **Options:**
 ```ts
 {
-  multiRoot?: boolean;   // auto-reset between JSON values (NDJSON)
-  onRoot?: (event: RootEvent) => void;
   source?: ReadableStream<Uint8Array> | AsyncIterable<Uint8Array | string>;
+  schema?: ZodLike<T>;   // only parse schema fields (same as createParser)
+  format?: "json" | "jsonl" | "json5";  // default: "json"
 }
 ```
@@ -637,7 +622,6 @@ interface EventParser {
   on<T>(path: string, schema: { safeParse: Function }, callback: (event: PathEvent & { value: T }) => void): EventParser;
   onDelta(path: string, callback: (event: DeltaEvent) => void): EventParser;
   onText(callback: (text: string) => void): EventParser;
-  skip(...paths: string[]): EventParser;
   off(path: string, callback?: Function): EventParser;
   feed(chunk: string | Uint8Array): FeedStatus;
   getValue(): unknown | undefined;  // undefined while incomplete, throws on parse errors
@@ -649,7 +633,7 @@ interface EventParser {
 }
 ```
-All methods return `self` for chaining: `parser.on(...).onDelta(...).skip(...)`.
+All methods return `self` for chaining: `parser.on(...).onDelta(...)`.
 **Path syntax:**
 - `foo.bar` — exact key
@@ -679,24 +663,48 @@ interface DeltaEvent {
   length: number;         // byte length of delta (raw bytes, not char count)
 }
-interface RootEvent {
-  type: 'root';
-  index: number;          // which root value (0, 1, 2...)
-  value: unknown;         // parsed via doc_parse
-}
 ```
 ### Parser comparison
 | | `createParser` | `createEventParser` |
 |---|---|---|
-| **Use case** | Get a growing partial object | React to individual fields as they arrive |
-| **Schema auto-pick** | Yes — schema `.shape` drives field selection | No — use `skip()` and `on()` for filtering |
-| **Dirty input handling** | Yes (when schema provided) | Yes (always) |
-| **`for await` with source** | Yes | Yes |
-| **Field subscriptions** | No | `on()`, `onDelta()`, `skip()` |
-| **Multi-root / NDJSON** | No | Yes (`multiRoot: true`) |
-| **Text callbacks** | No | `onText()` for non-JSON text |
+| **Completion** | `feed()` returns `"complete"` after one JSON value | Handles multiple JSON values — user calls `destroy()` when done |
+| **Malformed JSON** | `feed()` returns `"error"` | Skips it, finds the next JSON |
+| **Schema** | Pass Zod/Valibot, only schema fields are parsed | Same |
+| **Skip non-JSON** (think tags, code fences, prose) | — | Always |
+| **Field subscriptions** | — | `on()`, `onDelta()` |
+| **JSONL** | `format: "jsonl"` | `format: "jsonl"` |
+| **Text callbacks** | — | `onText()` |
+**`createParser` parses one JSON value** and reports status — you check it and react:
+```js
+const parser = createParser();
+for await (const chunk of stream) {
+  const status = parser.feed(chunk);
+  if (status === "complete") break;  // done — one JSON value parsed
+  if (status === "error") break;     // malformed JSON detected
+}
+const result = parser.getValue();
+parser.destroy();
+```
+**`createEventParser` handles an entire LLM response** — text, thinking, code fences, all in one stream:
+```js
+const parser = createEventParser();
+parser.on('tool', (e) => showToolUI(e.value));
+parser.onText((text) => thinkingPanel.append(text));
+// LLM output with mixed content:
+// <think>let me reason about this...</think>
+// {"tool":"search","query":"weather"}
+for await (const chunk of llmStream) {
+  parser.feed(chunk);  // strips think tags, finds JSON, fires callbacks
+}
+parser.destroy();
+```
 ### `deepCompare(a, b, options?): boolean`