blazen 0.1.151 → 0.1.153

Files changed (5)
  1. package/README.md +318 -1
  2. package/error-classes.js +855 -0
  3. package/index.d.ts +4774 -181
  4. package/index.js +336 -0
  5. package/package.json +11 -10
package/README.md CHANGED
@@ -258,6 +258,293 @@ console.log(result.data.answer);
258
258
 
259
259
  ---
260
260
 
261
+ ## Typed Errors
262
+
263
+ Every error thrown across the FFI boundary is an instance of `BlazenError extends Error`, so you can use `instanceof` to narrow on specific failure modes instead of pattern-matching message strings. The hierarchy spans roughly 87 typed classes -- 18 direct subclasses of `BlazenError` plus per-backend `ProviderError` subclasses (one tree per local-inference backend) and their narrower variants.
264
+
265
+ ```typescript
266
+ import {
267
+ CompletionModel, ChatMessage,
268
+ BlazenError, RateLimitError, AuthError, TimeoutError, ValidationError,
269
+ ContentPolicyError, ProviderError,
270
+ } from "blazen";
271
+
272
+ const model = CompletionModel.openai();
273
+
274
+ try {
275
+ const response = await model.complete([ChatMessage.user("Hello")]);
276
+ console.log(response.content);
277
+ } catch (e) {
278
+ if (e instanceof RateLimitError) {
279
+ // back off and retry
280
+ } else if (e instanceof AuthError) {
281
+ // re-prompt for credentials
282
+ } else if (e instanceof ContentPolicyError) {
283
+ // surface a friendlier message
284
+ } else if (e instanceof BlazenError) {
285
+ // any other Blazen-originated failure
286
+ } else {
287
+ throw e;
288
+ }
289
+ }
290
+ ```
291
+
292
+ ### `ProviderError` and its structured fields
293
+
294
+ `ProviderError` (and every per-backend subclass) carries structured metadata so you can build retry, alerting, and observability logic without parsing strings:
295
+
296
+ ```typescript
297
+ import { ChatMessage, ProviderError } from "blazen";
298
+
299
+ try {
300
+ await model.complete([ChatMessage.user("...")]);
301
+ } catch (e) {
302
+ if (e instanceof ProviderError) {
303
+ console.error({
304
+ provider: e.provider, // e.g. "openai", "anthropic"
305
+ status: e.status, // HTTP status, if any
306
+ endpoint: e.endpoint, // request URL, if known
307
+ requestId: e.requestId, // upstream request ID, if returned
308
+ detail: e.detail, // upstream error detail
309
+ retryAfterMs: e.retryAfterMs, // suggested backoff
310
+ });
311
+ }
312
+ }
313
+ ```
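One possible use of these fields is a retry helper that honors `retryAfterMs`. The sketch below is illustrative only: `ProviderErrorLike` is a local stand-in that mirrors a few of the fields listed above so the snippet runs without the native module, and `withRetry`, `isRetryable`, `maxAttempts`, and `fallbackDelayMs` are hypothetical names, not part of the Blazen API. Treating HTTP 429 and 5xx as retryable is an assumption about policy, not something the library prescribes.

```typescript
// Local stand-in for the structured fields documented above; in real code you
// would test `e instanceof ProviderError` (imported from "blazen") instead.
interface ProviderErrorLike {
  provider: string;
  status?: number | null;
  retryAfterMs?: number | null;
}

function isProviderErrorLike(e: unknown): e is ProviderErrorLike {
  return typeof e === "object" && e !== null && "provider" in e;
}

// Assumption: 429 and 5xx responses are worth retrying; everything else is not.
function isRetryable(e: ProviderErrorLike): boolean {
  const status = e.status ?? 0;
  return status === 429 || (status >= 500 && status < 600);
}

type Sleep = (ms: number) => Promise<void>;
const realSleep: Sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: runs `op` up to `maxAttempts` times, preferring the
// provider-suggested delay and falling back to exponential backoff.
async function withRetry<T>(
  op: () => Promise<T>,
  maxAttempts = 3,
  fallbackDelayMs = 250,
  sleep: Sleep = realSleep,
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await op();
    } catch (e) {
      // Give up on unknown errors, non-retryable statuses, or the last attempt.
      if (!isProviderErrorLike(e) || !isRetryable(e) || attempt >= maxAttempts) {
        throw e;
      }
      const delay = e.retryAfterMs ?? fallbackDelayMs * 2 ** (attempt - 1);
      await sleep(delay);
    }
  }
}
```

With the real SDK, the call site might look like `await withRetry(() => model.complete([ChatMessage.user("...")]))`. The `sleep` parameter exists only so the delay can be replaced in tests.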
314
+
315
+ ### Per-backend error trees
316
+
317
+ Each local-inference backend has its own `ProviderError` subtree:
318
+
319
+ | Tree root | Variants |
320
+ |---|---|
321
+ | `LlamaCppError` | `LlamaCppInvalidOptionsError`, `LlamaCppModelLoadError`, `LlamaCppInferenceError`, `LlamaCppEngineNotAvailableError` |
322
+ | `MistralRsError` | `MistralRsInvalidOptionsError`, `MistralRsInitError`, `MistralRsInferenceError`, `MistralRsEngineNotAvailableError` |
323
+ | `CandleLlmError` | `CandleLlmInvalidOptionsError`, `CandleLlmModelLoadError`, `CandleLlmInferenceError`, `CandleLlmEngineNotAvailableError` |
324
+ | `CandleEmbedError` | `CandleEmbedInvalidOptionsError`, `CandleEmbedModelLoadError`, `CandleEmbedEmbeddingError`, `CandleEmbedEngineNotAvailableError`, `CandleEmbedTaskPanickedError` |
325
+ | `WhisperError` | `WhisperInvalidOptionsError`, `WhisperModelLoadError`, `WhisperTranscriptionError`, `WhisperEngineNotAvailableError`, `WhisperIoError` |
326
+ | `PiperError` | `PiperInvalidOptionsError`, `PiperModelLoadError`, `PiperSynthesisError`, `PiperEngineNotAvailableError` |
327
+ | `DiffusionError` | `DiffusionInvalidOptionsError`, `DiffusionModelLoadError`, `DiffusionGenerationError` |
328
+ | `FastEmbedError` | `EmbedUnknownModelError`, `EmbedInitError`, `EmbedEmbedError`, `EmbedMutexPoisonedError`, `EmbedTaskPanickedError` |
329
+ | `TractError` | (additional ONNX runtime failures) |
330
+
331
+ `PromptError`, `MemoryError`, `CacheError`, `PersistError`, and several `Peer*` errors all extend `BlazenError` with their own narrower subclasses (e.g. `PromptMissingVariableError`, `MemoryNotFoundError`, `DownloadError`).
332
+
333
+ ### `enrichError` -- re-classify across the FFI boundary
334
+
335
+ If an error has been re-thrown as a plain `Error` (for example after being serialized through a structured-clone boundary, or wrapped by user code), call `enrichError(err)` to re-attach the correct `BlazenError` subclass:
336
+
337
+ ```typescript
338
+ import { enrichError, RateLimitError } from "blazen";
339
+
340
+ try {
341
+ await someWrapperThatRethrows();
342
+ } catch (raw) {
343
+ const e = enrichError(raw);
344
+ if (e instanceof RateLimitError) {
345
+ // narrow as usual
346
+ } else {
347
+ throw e;
348
+ }
349
+ }
350
+ ```
351
+
352
+ ---
353
+
354
+ ## Typed Result Classes
355
+
356
+ `AgentResult` and `BatchResult` were previously plain dictionaries. They are now first-class JS classes with typed getters and a useful `toString()` for logging.
357
+
358
+ ### `AgentResult`
359
+
360
+ Returned by agent runs that may invoke tools across multiple iterations.
361
+
362
+ ```typescript
363
+ import type { AgentResult } from "blazen";
364
+
365
+ const result: AgentResult = await agent.run("Summarize this document");
366
+ console.log(result.response); // CompletionResponse from the final model call
367
+ console.log(result.messages); // full message history (incl. tool calls + results)
368
+ console.log(result.iterations); // number of tool-calling iterations
369
+ console.log(result.totalCost); // aggregated USD cost across iterations, or null
370
+ console.log(result.toString()); // matches the Python AgentResult.__repr__
371
+ ```
372
+
373
+ | Getter | Type | Description |
374
+ |---|---|---|
375
+ | `.response` | `CompletionResponse` | Final completion response from the model |
376
+ | `.messages` | `Array<any>` | Full message history including tool calls and results |
377
+ | `.iterations` | `number` | Number of tool-calling iterations performed |
378
+ | `.totalCost` | `number \| null` | Aggregated USD cost across iterations, if available |
379
+
380
+ ### `BatchResult`
381
+
382
+ Returned by batch completion runs. Indices line up with the original input requests.
383
+
384
+ ```typescript
385
+ import type { BatchResult } from "blazen";
386
+
387
+ const batch: BatchResult = await runBatch(requests);
388
+ console.log(`${batch.successCount} / ${batch.length} succeeded`);
389
+ for (let i = 0; i < batch.length; i++) {
390
+ if (batch.responses[i]) {
391
+ console.log(i, batch.responses[i]?.content);
392
+ } else {
393
+ console.error(i, batch.errors[i]);
394
+ }
395
+ }
396
+ console.log("total tokens:", batch.totalUsage?.totalTokens);
397
+ console.log("total cost:", batch.totalCost);
398
+ ```
399
+
400
+ | Getter | Type | Description |
401
+ |---|---|---|
402
+ | `.responses` | `Array<CompletionResponse \| null>` | One response per request; `null` for failures |
403
+ | `.errors` | `Array<string \| null>` | One error message per request; `null` for successes |
404
+ | `.totalUsage` | `TokenUsage \| null` | Aggregated token usage across successful responses |
405
+ | `.totalCost` | `number \| null` | Aggregated USD cost across successful responses |
406
+ | `.successCount` | `number` | Number of successful requests |
407
+ | `.failureCount` | `number` | Number of failed requests |
408
+ | `.length` | `number` | Total number of requests in the batch |
409
+
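Because indices line up with the input requests, failed entries can be collected for a follow-up pass. A minimal sketch, assuming only the getters in the table above: `BatchLike` is a local stand-in for the real `BatchResult` (so the snippet runs on its own), and `partitionBatch` is a hypothetical helper, not part of the SDK.

```typescript
// Local stand-in for the BatchResult getters listed above. Only `content` is
// assumed on each response.
interface BatchLike {
  responses: Array<{ content?: string | null } | null>;
  errors: Array<string | null>;
  length: number;
}

interface BatchPartition {
  succeeded: Array<{ index: number; content: string }>;
  failed: Array<{ index: number; error: string }>;
}

// Hypothetical helper: splits a batch into successes and failures, keeping the
// original request index so failed inputs can be looked up and retried.
function partitionBatch(batch: BatchLike): BatchPartition {
  const out: BatchPartition = { succeeded: [], failed: [] };
  for (let i = 0; i < batch.length; i++) {
    const response = batch.responses[i];
    if (response) {
      out.succeeded.push({ index: i, content: response.content ?? "" });
    } else {
      out.failed.push({ index: i, error: batch.errors[i] ?? "unknown error" });
    }
  }
  return out;
}

// Demo with hand-written data standing in for a real batch run.
const demoPartition = partitionBatch({
  responses: [{ content: "a" }, null, { content: "c" }],
  errors: [null, "rate limited", null],
  length: 3,
});
console.log(demoPartition.failed.map((f) => f.index)); // indices of failed requests
```

The indices in `failed` can then be mapped back onto the original `requests` array to build a smaller retry batch.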
410
+ ---
411
+
412
+ ## Local Inference Types
413
+
414
+ Local inference (mistral.rs, llama.cpp, candle) exposes its own typed result and streaming classes alongside the higher-level `CompletionModel` API. Streams are pulled by repeatedly awaiting `stream.next()` until it returns `null` -- they are **not** `for await`-iterable.
415
+
416
+ ### mistral.rs
417
+
418
+ Nine un-prefixed classes: `ChatMessageInput`, `ChatRole`, and seven `Inference*` types:
419
+
420
+ | Class | Purpose |
421
+ |---|---|
422
+ | `ChatMessageInput` | Message for local inference; constructor `(role, text, images?)`, plus `ChatMessageInput.fromText(role, text)` |
423
+ | `ChatRole` | String enum: `System`, `User`, `Assistant`, `Tool` |
424
+ | `InferenceResult` | Non-streaming result with `.content`, `.reasoningContent`, `.toolCalls`, `.finishReason`, `.model`, `.usage` |
425
+ | `InferenceChunk` | Single streaming chunk with `.delta`, `.reasoningDelta`, `.toolCalls`, `.finishReason` |
426
+ | `InferenceChunkStream` | Pull-based stream -- `await stream.next()` returns `InferenceChunk \| null` |
427
+ | `InferenceImage` | Image attachment; static `.fromBytes(buf)`, `.fromPath(path)`, `.fromSource(src)` |
428
+ | `InferenceImageSource` | Source variant: `.bytes(buf)` or `.path(path)`, inspected with `.kind` / `.data` / `.filePath` |
429
+ | `InferenceToolCall` | Tool call with `.id`, `.name`, `.arguments` (JSON string) |
430
+ | `InferenceUsage` | `.promptTokens`, `.completionTokens`, `.totalTokens`, `.totalTimeSec` |
431
+
432
+ ```typescript
433
+ import { ChatMessageInput, ChatRole } from "blazen";
434
+
435
+ const messages = [
436
+ ChatMessageInput.fromText(ChatRole.System, "You are helpful."),
437
+ ChatMessageInput.fromText(ChatRole.User, "Hello"),
438
+ ];
439
+
440
+ const stream = await provider.inferStream(messages);
441
+ for (let chunk = await stream.next(); chunk !== null; chunk = await stream.next()) {
442
+ process.stdout.write(chunk.delta ?? "");
443
+ if (chunk.finishReason) console.log("\n[done]", chunk.finishReason);
444
+ }
445
+ ```
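If `for await` syntax is preferred at the call site, a pull-based stream can be wrapped in an async generator. This is a sketch under stated assumptions: `PullStream` captures only the `next()` contract described above, `iterate` and `collect` are hypothetical helpers, and `fakeStream` is a stand-in for `InferenceChunkStream` so the example runs without a model.

```typescript
// Minimal shape shared by the pull-based streams described above:
// `next()` resolves to the next chunk, or null when the stream is exhausted.
interface PullStream<T> {
  next(): Promise<T | null>;
}

// Hypothetical adapter: wraps a pull stream in an async generator so it can
// be consumed with `for await`.
async function* iterate<T>(stream: PullStream<T>): AsyncGenerator<T, void, undefined> {
  for (let item = await stream.next(); item !== null; item = await stream.next()) {
    yield item;
  }
}

// Fake stream standing in for InferenceChunkStream (illustration only).
function fakeStream(deltas: string[]): PullStream<{ delta?: string | null }> {
  const queue = [...deltas];
  return {
    async next() {
      const delta = queue.shift();
      return delta === undefined ? null : { delta };
    },
  };
}

// Concatenates every text delta until the stream signals completion.
async function collect(stream: PullStream<{ delta?: string | null }>): Promise<string> {
  let text = "";
  for await (const chunk of iterate(stream)) {
    text += chunk.delta ?? "";
  }
  return text;
}

collect(fakeStream(["Hel", "lo"])).then((text) => console.log(text)); // prints "Hello"
```

The adapter only changes the calling syntax. The underlying stream is still pulled one `next()` at a time, so backpressure behaves the same as in the explicit loop.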
446
+
447
+ ### llama.cpp
448
+
449
+ Six classes prefixed with `LlamaCpp`:
450
+
451
+ | Class | Purpose |
452
+ |---|---|
453
+ | `LlamaCppChatMessageInput` | Constructor `(role, text)` |
454
+ | `LlamaCppChatRole` | String enum: `System`, `User`, `Assistant`, `Tool` |
455
+ | `LlamaCppInferenceResult` | `.content`, `.finishReason`, `.model`, `.usage` |
456
+ | `LlamaCppInferenceChunk` | `.delta`, `.finishReason` |
457
+ | `LlamaCppInferenceChunkStream` | `await stream.next()` returns `LlamaCppInferenceChunk \| null` |
458
+ | `LlamaCppInferenceUsage` | `.promptTokens` and other token counts |
459
+
460
+ ### candle
461
+
462
+ | Class | Purpose |
463
+ |---|---|
464
+ | `CandleInferenceResult` | Constructor `(content, promptTokens, completionTokens, totalTimeSecs)`, with matching getters |
465
+
466
+ ### `MediaSource` type alias
467
+
468
+ `MediaSource` is exported as a type alias for `ImageSource` (which itself aliases the underlying `JsImageSource`). Use whichever name reads better at the call site:
469
+
470
+ ```typescript
471
+ import type { MediaSource, ImageSource } from "blazen";
472
+ // MediaSource and ImageSource refer to the same underlying type.
473
+ ```
474
+
475
+ ---
476
+
477
+ ## Model Download Progress
478
+
479
+ `ProgressCallback` is a subclassable JS class that reports byte-level progress for model downloads. Subclass it (calling `super()` first if you define your own constructor) and override `onProgress(downloaded, total?)`. The `downloaded` and `total` arguments are `bigint` values. Converting with `Number(...)` is exact up to 2^53 - 1 bytes (about 8 PiB), so it is safe for percentage math on realistic downloads; staying in `bigint` avoids the conversion altogether.
480
+
481
+ ```typescript
482
+ import { ProgressCallback, ModelCache } from "blazen";
483
+
484
+ class LoggingProgress extends ProgressCallback {
485
+ onProgress(downloaded: bigint, total?: bigint): void {
486
+ if (total !== undefined && total !== null) {
487
+ const pct = Number((downloaded * 100n) / total);
488
+ console.log(`${pct}%`);
489
+ } else {
490
+ console.log(`${downloaded} bytes`);
491
+ }
492
+ }
493
+ }
494
+
495
+ const cache = ModelCache.create();
496
+ await cache.download("bert-base-uncased", "config.json", new LoggingProgress());
497
+ ```
498
+
499
+ The base `onProgress` always throws -- forgetting to override is caught loudly rather than silently swallowed.
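For finer-grained output than whole percentages, the arithmetic can stay entirely in `bigint`. A small sketch: `formatProgress` is a hypothetical helper (not part of the SDK) that an `onProgress` override could call, and treating a missing or zero `total` as "size unknown" is an assumption.

```typescript
// Hypothetical formatter: computes a percentage with one decimal place using
// only bigint arithmetic, so no conversion to Number is needed.
function formatProgress(downloaded: bigint, total?: bigint | null): string {
  // Assumption: a missing or non-positive total means the size is unknown.
  if (total === undefined || total === null || total <= 0n) {
    return `${downloaded} bytes`;
  }
  const tenths = (downloaded * 1000n) / total; // percentage times 10, truncated
  const whole = tenths / 10n;
  const frac = tenths % 10n;
  return `${whole}.${frac}%`;
}

console.log(formatProgress(1n, 3n)); // prints "33.3%"
console.log(formatProgress(3_000_000_000n, 4_000_000_000n)); // prints "75.0%"
console.log(formatProgress(42n)); // prints "42 bytes"
```

Division truncates toward zero, so the value never overshoots: 100.0% is printed only when `downloaded` has actually reached `total`.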
500
+
501
+ ---
502
+
503
+ ## Pipeline Persistence Callbacks
504
+
505
+ `PipelineBuilder.onPersist(callback)` and `.onPersistJson(callback)` register persist hooks that fire after every stage completes. The callback must return `Promise<void>` (or be `async`); a rejection is wrapped as a `PipelineError` and aborts the running pipeline.
506
+
507
+ - `onPersist` receives a typed `PipelineSnapshot` instance.
508
+ - `onPersistJson` receives the same snapshot serialized to a JSON string -- handy when you just want to ship bytes to a key/value store.
509
+
510
+ ```typescript
511
+ import { PipelineBuilder } from "blazen";
512
+
513
+ const pipeline = new PipelineBuilder("my-pipeline")
514
+ .stage(stageA)
515
+ .stage(stageB)
516
+ .onPersistJson(async (json: string) => {
517
+ // IndexedDB-style "put one row per stage" persist
518
+ await db.put("pipeline-snapshots", { id: pipelineId, json });
519
+ })
520
+ .build();
521
+ ```
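Since a rejected persist callback aborts the running pipeline, callers who treat persistence as optional may want to contain storage failures. The sketch below shows one way to do that: `PersistJsonCallback` reflects only the callback shape described above, while `bestEffort`, `saveToMemory`, and `snapshots` are hypothetical names introduced for illustration.

```typescript
// Shape of the callback accepted by `.onPersistJson(...)` per the text above.
type PersistJsonCallback = (json: string) => Promise<void>;

// Hypothetical wrapper: reports persistence failures through `onError` and
// resolves anyway, so a storage outage does not abort the pipeline.
function bestEffort(
  inner: PersistJsonCallback,
  onError: (err: unknown) => void,
): PersistJsonCallback {
  return async (json) => {
    try {
      await inner(json);
    } catch (err) {
      onError(err);
    }
  };
}

// In-memory stand-in for a real key/value store (illustration only).
const snapshots: string[] = [];
const saveToMemory: PersistJsonCallback = async (json) => {
  snapshots.push(json);
};
```

A builder call could then read `.onPersistJson(bestEffort(saveToMemory, console.error))`. Leave the wrapper out when a failed write should stop the pipeline, which is the default behavior.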
522
+
523
+ ---
524
+
525
+ ## Telemetry: Langfuse
526
+
527
+ Langfuse export is gated behind the `langfuse` Cargo feature (enabled in the published `blazen` npm package). `LangfuseConfig` uses positional constructor arguments; `host`, `batchSize`, and `flushIntervalMs` are optional.
528
+
529
+ ```typescript
530
+ import { LangfuseConfig, initLangfuse } from "blazen";
531
+
532
+ const cfg = new LangfuseConfig(
533
+ process.env.LANGFUSE_PUBLIC_KEY!,
534
+ process.env.LANGFUSE_SECRET_KEY!,
535
+ "https://cloud.langfuse.com", // host (optional)
536
+ 100, // batchSize (optional)
537
+ 5000, // flushIntervalMs (optional)
538
+ );
539
+
540
+ initLangfuse(cfg);
541
+ // Subsequent calls to initLangfuse are no-ops.
542
+ ```
543
+
544
+ > **Note:** The Node binding currently ships `LangfuseConfig` and `initLangfuse` only. `OtlpConfig`, `initOtlp`, and `initPrometheus` are **not** exported from the Node SDK -- use the Rust crate or Python binding if you need those exporters.
545
+
546
+ ---
547
+
261
548
  ## Branching / Fan-Out
262
549
 
263
550
  Return an array of events from a step handler to dispatch multiple events simultaneously. Each event routes to the step that handles its type.
@@ -484,10 +771,24 @@ Full TypeScript type definitions ship with the package -- no `@types` needed. Al
484
771
  import {
485
772
  Workflow, WorkflowHandler, Context, CompletionModel,
486
773
  ChatMessage, Role, version,
774
+ // Typed errors
775
+ BlazenError, RateLimitError, AuthError, ProviderError,
776
+ LlamaCppError, MistralRsError, CandleLlmError, WhisperError,
777
+ PiperError, DiffusionError, FastEmbedError, TractError,
778
+ enrichError,
779
+ // Typed result classes
780
+ AgentResult, BatchResult,
781
+ // Local inference
782
+ ChatMessageInput, ChatRole, InferenceChunkStream,
783
+ LlamaCppChatMessageInput, LlamaCppChatRole, LlamaCppInferenceChunkStream,
784
+ CandleInferenceResult,
785
+ // Misc
786
+ ProgressCallback, PipelineBuilder,
787
+ LangfuseConfig, initLangfuse,
487
788
  } from "blazen";
488
789
  import type {
489
790
  JsWorkflowResult, CompletionResponse, CompletionOptions,
490
- ToolCall, TokenUsage, ContentPart, ImageContent, ImageSource,
791
+ ToolCall, TokenUsage, ContentPart, ImageContent, ImageSource, MediaSource,
491
792
  } from "blazen";
492
793
  ```
493
794
 
@@ -535,6 +836,22 @@ import type {
535
836
  | `TokenUsage` | Interface: `{ promptTokens, completionTokens, totalTokens }` |
536
837
  | `CompletionOptions` | Interface: `{ temperature?, maxTokens?, topP?, model?, tools? }` |
537
838
  | `ContentPart` / `ImageContent` / `ImageSource` | Types for multimodal message content |
839
+ | `MediaSource` | Type alias for `ImageSource` |
840
+ | `AgentResult` | Class: `.response`, `.messages`, `.iterations`, `.totalCost`, `.toString()` |
841
+ | `BatchResult` | Class: `.responses`, `.errors`, `.totalUsage`, `.totalCost`, `.successCount`, `.failureCount`, `.length`, `.toString()` |
842
+ | `BlazenError` | Base class for every typed error thrown by Blazen (extends `Error`) |
843
+ | `RateLimitError` / `AuthError` / `TimeoutError` / `ValidationError` / `ContentPolicyError` / `UnsupportedError` / `ComputeError` / `MediaError` | Direct `BlazenError` subclasses |
844
+ | `ProviderError` | `BlazenError` subclass with structured fields: `provider`, `status`, `endpoint`, `requestId`, `detail`, `retryAfterMs` |
845
+ | `LlamaCppError` / `MistralRsError` / `CandleLlmError` / `CandleEmbedError` / `WhisperError` / `PiperError` / `DiffusionError` / `FastEmbedError` / `TractError` | Per-backend `ProviderError` subtrees with narrower variants |
846
+ | `PromptError` / `MemoryError` / `CacheError` / `PersistError` | Other `BlazenError` subtrees |
847
+ | `enrichError(err)` | Re-classify a re-thrown error back to the correct `BlazenError` subclass |
848
+ | `ProgressCallback` | Subclassable JS class; override `onProgress(downloaded: bigint, total?: bigint)` |
849
+ | `PipelineBuilder.onPersist(callback)` / `.onPersistJson(callback)` | Per-stage persist hooks; callback returns `Promise<void>` |
850
+ | `LangfuseConfig(publicKey, secretKey, host?, batchSize?, flushIntervalMs?)` | Positional ctor for the Langfuse exporter |
851
+ | `initLangfuse(config)` | Install the global Langfuse subscriber (idempotent) |
852
+ | `ChatMessageInput` / `ChatRole` / `InferenceResult` / `InferenceChunk` / `InferenceChunkStream` / `InferenceImage` / `InferenceImageSource` / `InferenceToolCall` / `InferenceUsage` | Local mistral.rs inference types (pull streams with `await stream.next()`) |
853
+ | `LlamaCppChatMessageInput` / `LlamaCppChatRole` / `LlamaCppInferenceResult` / `LlamaCppInferenceChunk` / `LlamaCppInferenceChunkStream` / `LlamaCppInferenceUsage` | Local llama.cpp inference types |
854
+ | `CandleInferenceResult` | Local candle inference result |
538
855
  | `JsWorkflowResult` | Interface: `{ type: string, data: any }` |
539
856
  | `version()` | Returns the blazen library version string |
540
857