blazen 0.1.151 → 0.1.153

Files changed (5)

package/README.md CHANGED

@@ -258,6 +258,293 @@ console.log(result.data.answer);
 ---
+## Typed Errors
+Every error thrown across the FFI boundary is an instance of `BlazenError` (which extends `Error`), so you can narrow on specific failure modes with `instanceof` instead of pattern-matching message strings. The hierarchy spans roughly 87 typed classes -- 18 direct subclasses of `BlazenError` plus per-backend `ProviderError` subclasses (one tree per local-inference backend) and their narrower variants.
+```typescript
+import {
+  CompletionModel, ChatMessage,
+  BlazenError, RateLimitError, AuthError, TimeoutError, ValidationError,
+  ContentPolicyError, ProviderError,
+} from "blazen";
+const model = CompletionModel.openai();
+try {
+  const response = await model.complete([ChatMessage.user("Hello")]);
+  console.log(response.content);
+} catch (e) {
+  if (e instanceof RateLimitError) {
+    // back off and retry
+  } else if (e instanceof AuthError) {
+    // re-prompt for credentials
+  } else if (e instanceof ContentPolicyError) {
+    // surface a friendlier message
+  } else if (e instanceof BlazenError) {
+    // any other Blazen-originated failure
+  } else {
+    throw e;
+  }
+}
+```
+### `ProviderError` and its structured fields
+`ProviderError` (and every per-backend subclass) carries structured metadata so you can build retry, alerting, and observability logic without parsing strings:
+```typescript
+import { ChatMessage, ProviderError } from "blazen";
+// `model` is the CompletionModel instance created in the previous example.
+try {
+  await model.complete([ChatMessage.user("...")]);
+} catch (e) {
+  if (e instanceof ProviderError) {
+    console.error({
+      provider: e.provider,        // e.g. "openai", "anthropic"
+      status: e.status,            // HTTP status, if any
+      endpoint: e.endpoint,        // request URL, if known
+      requestId: e.requestId,      // upstream request ID, if returned
+      detail: e.detail,            // upstream error detail
+      retryAfterMs: e.retryAfterMs, // suggested backoff
+    });
+  }
+}
+```
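The structured fields above lend themselves to retry logic. The following is a minimal sketch, not part of the package: `RetryHint`, `backoffDelayMs`, `isRetryable`, and `withRetry` are hypothetical helpers, and treating HTTP 429 and 5xx as retryable is an assumption rather than documented Blazen behavior. The sketch uses a structural stand-in so it runs without the package; in real code you would narrow with `e instanceof ProviderError` first.

```typescript
// Hypothetical stand-in for the two ProviderError fields used here.
interface RetryHint {
  status?: number | null;
  retryAfterMs?: number | null;
}

// Prefer the upstream hint; otherwise use exponential backoff. Both are capped at maxMs.
function backoffDelayMs(err: RetryHint, attempt: number, baseMs = 250, maxMs = 30_000): number {
  if (typeof err.retryAfterMs === "number" && err.retryAfterMs >= 0) {
    return Math.min(err.retryAfterMs, maxMs);
  }
  return Math.min(baseMs * 2 ** attempt, maxMs);
}

// Assumption: only 429 and 5xx responses are worth retrying.
function isRetryable(err: RetryHint): boolean {
  const status = err.status ?? 0;
  return status === 429 || (status >= 500 && status < 600);
}

// Run `fn`, retrying retryable failures up to `maxAttempts` total attempts.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (e) {
      const hint: RetryHint = typeof e === "object" && e !== null ? (e as RetryHint) : {};
      if (attempt + 1 >= maxAttempts || !isRetryable(hint)) throw e;
      await sleep(backoffDelayMs(hint, attempt));
    }
  }
}
```

A call site might look like `await withRetry(() => model.complete([ChatMessage.user("...")]))`, with the final failure still surfacing as the original typed error.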
+### Per-backend error trees
+Each local-inference backend has its own `ProviderError` subtree:
+| Tree root | Variants |
+|---|---|
+| `LlamaCppError` | `LlamaCppInvalidOptionsError`, `LlamaCppModelLoadError`, `LlamaCppInferenceError`, `LlamaCppEngineNotAvailableError` |
+| `MistralRsError` | `MistralRsInvalidOptionsError`, `MistralRsInitError`, `MistralRsInferenceError`, `MistralRsEngineNotAvailableError` |
+| `CandleLlmError` | `CandleLlmInvalidOptionsError`, `CandleLlmModelLoadError`, `CandleLlmInferenceError`, `CandleLlmEngineNotAvailableError` |
+| `CandleEmbedError` | `CandleEmbedInvalidOptionsError`, `CandleEmbedModelLoadError`, `CandleEmbedEmbeddingError`, `CandleEmbedEngineNotAvailableError`, `CandleEmbedTaskPanickedError` |
+| `WhisperError` | `WhisperInvalidOptionsError`, `WhisperModelLoadError`, `WhisperTranscriptionError`, `WhisperEngineNotAvailableError`, `WhisperIoError` |
+| `PiperError` | `PiperInvalidOptionsError`, `PiperModelLoadError`, `PiperSynthesisError`, `PiperEngineNotAvailableError` |
+| `DiffusionError` | `DiffusionInvalidOptionsError`, `DiffusionModelLoadError`, `DiffusionGenerationError` |
+| `FastEmbedError` | `EmbedUnknownModelError`, `EmbedInitError`, `EmbedEmbedError`, `EmbedMutexPoisonedError`, `EmbedTaskPanickedError` |
+| `TractError` | (additional ONNX runtime failures) |
+`PromptError`, `MemoryError`, `CacheError`, `PersistError`, and several `Peer*` errors all extend `BlazenError` with their own narrower subclasses (e.g. `PromptMissingVariableError`, `MemoryNotFoundError`, `DownloadError`).
+### `enrichError` -- re-classify across the FFI boundary
+If an error has been re-thrown as a plain `Error` (for example after being serialized through a structured-clone boundary, or wrapped by user code), call `enrichError(err)` to restore the correct `BlazenError` subclass:
+```typescript
+import { enrichError, RateLimitError } from "blazen";
+try {
+  await someWrapperThatRethrows();
+} catch (raw) {
+  const e = enrichError(raw);
+  if (e instanceof RateLimitError) {
+    // narrow as usual
+  } else {
+    throw e;
+  }
+}
+```
+---
+## Typed Result Classes
+`AgentResult` and `BatchResult` were previously plain dictionaries. They are now first-class JS classes with typed getters and a useful `toString()` for logging.
+### `AgentResult`
+Returned by agent runs that may invoke tools across multiple iterations.
+```typescript
+import type { AgentResult } from "blazen";
+const result: AgentResult = await agent.run("Summarize this document");
+console.log(result.response);    // CompletionResponse from the final model call
+console.log(result.messages);    // full message history (incl. tool calls + results)
+console.log(result.iterations);  // number of tool-calling iterations
+console.log(result.totalCost);   // aggregated USD cost across iterations, or null
+console.log(result.toString());  // matches the Python AgentResult.__repr__
+```
+| Getter | Type | Description |
+|---|---|---|
+| `.response` | `CompletionResponse` | Final completion response from the model |
+| `.messages` | `Array<any>` | Full message history including tool calls and results |
+| `.iterations` | `number` | Number of tool-calling iterations performed |
+| `.totalCost` | `number \| null` | Aggregated USD cost across iterations, if available |
+### `BatchResult`
+Returned by batch completion runs. Indices line up with the original input requests.
+```typescript
+import type { BatchResult } from "blazen";
+const batch: BatchResult = await runBatch(requests);
+console.log(`${batch.successCount} / ${batch.length} succeeded`);
+for (let i = 0; i < batch.length; i++) {
+  if (batch.responses[i]) {
+    console.log(i, batch.responses[i]?.content);
+  } else {
+    console.error(i, batch.errors[i]);
+  }
+}
+console.log("total tokens:", batch.totalUsage?.totalTokens);
+console.log("total cost:", batch.totalCost);
+```
+| Getter | Type | Description |
+|---|---|---|
+| `.responses` | `Array<CompletionResponse \| null>` | One response per request; `null` for failures |
+| `.errors` | `Array<string \| null>` | One error message per request; `null` for successes |
+| `.totalUsage` | `TokenUsage \| null` | Aggregated token usage across successful responses |
+| `.totalCost` | `number \| null` | Aggregated USD cost across successful responses |
+| `.successCount` | `number` | Number of successful requests |
+| `.failureCount` | `number` | Number of failed requests |
+| `.length` | `number` | Total number of requests in the batch |
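Because `responses` and `errors` share indices with the input requests, a batch can be split into successes and failures without losing track of which request each entry belongs to. This is a rough sketch under stated assumptions: `BatchLike` is a hypothetical structural subset of the getters in the table above, and `partitionBatch` is an illustrative helper, not a Blazen export.

```typescript
// Hypothetical structural subset of the BatchResult getters listed above.
interface BatchLike<R> {
  responses: Array<R | null>;
  errors: Array<string | null>;
  length: number;
}

// Split a batch into successes and failures, keeping the original request index.
function partitionBatch<R>(batch: BatchLike<R>) {
  const ok: Array<{ index: number; response: R }> = [];
  const failed: Array<{ index: number; error: string }> = [];
  for (let i = 0; i < batch.length; i++) {
    const response = batch.responses[i];
    if (response !== null && response !== undefined) {
      ok.push({ index: i, response });
    } else {
      // Fall back to a placeholder if no error message was recorded.
      failed.push({ index: i, error: batch.errors[i] ?? "unknown error" });
    }
  }
  return { ok, failed };
}
```

The `failed` list can then drive a second pass, for example `failed.map((f) => requests[f.index])` to rebuild the subset of requests worth retrying.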
+---
+## Local Inference Types
+Local inference (mistral.rs, llama.cpp, candle) exposes its own typed result and streaming classes alongside the higher-level `CompletionModel` API. Streams are pulled by repeatedly awaiting `stream.next()` until it returns `null` -- they are **not** `for await`-iterable.
+### mistral.rs
+Nine un-prefixed classes: `ChatMessageInput`, `ChatRole`, and the `Inference*` family:
+| Class | Purpose |
+|---|---|
+| `ChatMessageInput` | Message for local inference; constructor `(role, text, images?)`, plus `ChatMessageInput.fromText(role, text)` |
+| `ChatRole` | String enum: `System`, `User`, `Assistant`, `Tool` |
+| `InferenceResult` | Non-streaming result with `.content`, `.reasoningContent`, `.toolCalls`, `.finishReason`, `.model`, `.usage` |
+| `InferenceChunk` | Single streaming chunk with `.delta`, `.reasoningDelta`, `.toolCalls`, `.finishReason` |
+| `InferenceChunkStream` | Pull-based stream -- `await stream.next()` returns `InferenceChunk \| null` |
+| `InferenceImage` | Image attachment; static `.fromBytes(buf)`, `.fromPath(path)`, `.fromSource(src)` |
+| `InferenceImageSource` | Source variant: `.bytes(buf)` or `.path(path)`, inspected with `.kind` / `.data` / `.filePath` |
+| `InferenceToolCall` | Tool call with `.id`, `.name`, `.arguments` (JSON string) |
+| `InferenceUsage` | `.promptTokens`, `.completionTokens`, `.totalTokens`, `.totalTimeSec` |
+```typescript
+import { ChatMessageInput, ChatRole } from "blazen";
+const messages = [
+  ChatMessageInput.fromText(ChatRole.System, "You are helpful."),
+  ChatMessageInput.fromText(ChatRole.User, "Hello"),
+];
+const stream = await provider.inferStream(messages);
+for (let chunk = await stream.next(); chunk !== null; chunk = await stream.next()) {
+  process.stdout.write(chunk.delta ?? "");
+  if (chunk.finishReason) console.log("\n[done]", chunk.finishReason);
+}
+```
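Since the streams are pull-based rather than `for await`-iterable, a small adapter can bridge the two styles when that reads better at the call site. The sketch below assumes only the `next()` contract described above (a chunk, or `null` when exhausted); `PullStream`, `iterate`, `fakeStream`, and `collectText` are hypothetical names, and `fakeStream` exists purely to exercise the adapter without a real backend.

```typescript
// Structural type for the pull contract: `next()` resolves to a chunk,
// or `null` once the stream is exhausted.
interface PullStream<T> {
  next(): Promise<T | null>;
}

// Adapt a pull stream to an async generator so it works with `for await`.
async function* iterate<T>(stream: PullStream<T>): AsyncGenerator<T, void, undefined> {
  for (let chunk = await stream.next(); chunk !== null; chunk = await stream.next()) {
    yield chunk;
  }
}

// Hypothetical in-memory stream, used only to demonstrate the adapter.
function fakeStream<T>(items: T[]): PullStream<T> {
  let i = 0;
  return { next: async () => (i < items.length ? items[i++] : null) };
}

// Concatenate the text deltas of a stream, skipping chunks without one.
async function collectText(stream: PullStream<{ delta?: string | null }>): Promise<string> {
  let text = "";
  for await (const chunk of iterate(stream)) {
    text += chunk.delta ?? "";
  }
  return text;
}
```

With a real stream this would read `for await (const chunk of iterate(stream)) { ... }`; the adapter adds no buffering, so chunks are still pulled one at a time.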
+### llama.cpp
+Six classes prefixed with `LlamaCpp`:
+| Class | Purpose |
+|---|---|
+| `LlamaCppChatMessageInput` | Constructor `(role, text)` |
+| `LlamaCppChatRole` | String enum: `System`, `User`, `Assistant`, `Tool` |
+| `LlamaCppInferenceResult` | `.content`, `.finishReason`, `.model`, `.usage` |
+| `LlamaCppInferenceChunk` | `.delta`, `.finishReason` |
+| `LlamaCppInferenceChunkStream` | `await stream.next()` returns `LlamaCppInferenceChunk \| null` |
+| `LlamaCppInferenceUsage` | `.promptTokens` and other token counts |
+### candle
+| Class | Purpose |
+|---|---|
+| `CandleInferenceResult` | Constructor `(content, promptTokens, completionTokens, totalTimeSecs)`, with matching getters |
+### `MediaSource` type alias
+`MediaSource` is exported as a type alias for `ImageSource` (which itself aliases the underlying `JsImageSource`). Use whichever name reads better at the call site:
+```typescript
+import type { MediaSource, ImageSource } from "blazen";
+// MediaSource and ImageSource refer to the same underlying type.
+```
+---
+## Model Download Progress
+`ProgressCallback` is a subclassable JS class that reports byte-level progress for model downloads. Subclass it, call `super()` in the constructor, and override `onProgress(downloaded, total?)`. The `downloaded` and `total` arguments are `bigint` values (use `Number(...)` for percentage math, or stay in `bigint` to avoid precision loss on multi-GB downloads).
+```typescript
+import { ProgressCallback, ModelCache } from "blazen";
+class LoggingProgress extends ProgressCallback {
+  onProgress(downloaded: bigint, total?: bigint): void {
+    if (total !== undefined && total !== null) {
+      const pct = Number((downloaded * 100n) / total);
+      console.log(`${pct}%`);
+    } else {
+      console.log(`${downloaded} bytes`);
+    }
+  }
+}
+const cache = ModelCache.create();
+await cache.download("bert-base-uncased", "config.json", new LoggingProgress());
+```
+The base `onProgress` always throws -- forgetting to override is caught loudly rather than silently swallowed.
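The note about staying in `bigint` can be illustrated with a small helper that an `onProgress` override might call. This is a sketch only: `percentDone` is a hypothetical function, and it converts to `number` only after the division, so multi-GB byte counts are never rounded.

```typescript
// Percentage with one decimal place, computed in bigint so large byte counts stay exact.
// Returns null when the total is unknown or not positive.
function percentDone(downloaded: bigint, total?: bigint | null): number | null {
  if (total === undefined || total === null || total <= 0n) return null;
  const tenths = (downloaded * 1000n) / total; // integer tenths of a percent
  return Number(tenths) / 10;
}
```

Inside an override this might be used as `const pct = percentDone(downloaded, total)` followed by a log line that falls back to the raw byte count when `pct` is `null`.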
+---
+## Pipeline Persistence Callbacks
+`PipelineBuilder.onPersist(callback)` and `.onPersistJson(callback)` register persist hooks that fire after every stage completes. The callback must return `Promise<void>` (or be `async`); a rejection is wrapped as a `PipelineError` and aborts the running pipeline.
+- `onPersist` receives a typed `PipelineSnapshot` instance.
+- `onPersistJson` receives the same snapshot serialized to a JSON string -- handy when you just want to ship bytes to a key/value store.
+```typescript
+import { PipelineBuilder } from "blazen";
+const pipeline = new PipelineBuilder("my-pipeline")
+  .stage(stageA)
+  .stage(stageB)
+  .onPersistJson(async (json: string) => {
+    // IndexedDB-style "put one row per stage" persist
+    await db.put("pipeline-snapshots", { id: pipelineId, json });
+  })
+  .build();
+```
+---
+## Telemetry: Langfuse
+Langfuse export is gated behind the `langfuse` Cargo feature (enabled in the published `blazen` npm package). `LangfuseConfig` uses positional constructor arguments; `host`, `batchSize`, and `flushIntervalMs` are optional.
+```typescript
+import { LangfuseConfig, initLangfuse } from "blazen";
+const cfg = new LangfuseConfig(
+  process.env.LANGFUSE_PUBLIC_KEY!,
+  process.env.LANGFUSE_SECRET_KEY!,
+  "https://cloud.langfuse.com", // host (optional)
+  100,                           // batchSize (optional)
+  5000,                          // flushIntervalMs (optional)
+);
+initLangfuse(cfg);
+// Calling initLangfuse more than once is a no-op.
+```
+> **Note:** The Node binding currently ships `LangfuseConfig` and `initLangfuse` only. `OtlpConfig`, `initOtlp`, and `initPrometheus` are **not** exported from the Node SDK -- use the Rust crate or Python binding if you need those exporters.
+---
 ## Branching / Fan-Out
 Return an array of events from a step handler to dispatch multiple events simultaneously. Each event routes to the step that handles its type.
@@ -484,10 +771,24 @@ Full TypeScript type definitions ship with the package -- no `@types` needed. Al
 import {
   Workflow, WorkflowHandler, Context, CompletionModel,
   ChatMessage, Role, version,
+  // Typed errors
+  BlazenError, RateLimitError, AuthError, ProviderError,
+  LlamaCppError, MistralRsError, CandleLlmError, WhisperError,
+  PiperError, DiffusionError, FastEmbedError, TractError,
+  enrichError,
+  // Typed result classes
+  AgentResult, BatchResult,
+  // Local inference
+  ChatMessageInput, ChatRole, InferenceChunkStream,
+  LlamaCppChatMessageInput, LlamaCppChatRole, LlamaCppInferenceChunkStream,
+  CandleInferenceResult,
+  // Misc
+  ProgressCallback, PipelineBuilder,
+  LangfuseConfig, initLangfuse,
 } from "blazen";
 import type {
   JsWorkflowResult, CompletionResponse, CompletionOptions,
-  ToolCall, TokenUsage, ContentPart, ImageContent, ImageSource,
+  ToolCall, TokenUsage, ContentPart, ImageContent, ImageSource, MediaSource,
 } from "blazen";
 ```
@@ -535,6 +836,22 @@ import type {
 | `TokenUsage` | Interface: `{ promptTokens, completionTokens, totalTokens }` |
 | `CompletionOptions` | Interface: `{ temperature?, maxTokens?, topP?, model?, tools? }` |
 | `ContentPart` / `ImageContent` / `ImageSource` | Types for multimodal message content |
+| `MediaSource` | Type alias for `ImageSource` |
+| `AgentResult` | Class: `.response`, `.messages`, `.iterations`, `.totalCost`, `.toString()` |
+| `BatchResult` | Class: `.responses`, `.errors`, `.totalUsage`, `.totalCost`, `.successCount`, `.failureCount`, `.length`, `.toString()` |
+| `BlazenError` | Base class for every typed error thrown by Blazen (extends `Error`) |
+| `RateLimitError` / `AuthError` / `TimeoutError` / `ValidationError` / `ContentPolicyError` / `UnsupportedError` / `ComputeError` / `MediaError` | Direct `BlazenError` subclasses |
+| `ProviderError` | `BlazenError` subclass with structured fields: `provider`, `status`, `endpoint`, `requestId`, `detail`, `retryAfterMs` |
+| `LlamaCppError` / `MistralRsError` / `CandleLlmError` / `CandleEmbedError` / `WhisperError` / `PiperError` / `DiffusionError` / `FastEmbedError` / `TractError` | Per-backend `ProviderError` subtrees with narrower variants |
+| `PromptError` / `MemoryError` / `CacheError` / `PersistError` | Other `BlazenError` subtrees |
+| `enrichError(err)` | Re-classify a re-thrown error back to the correct `BlazenError` subclass |
+| `ProgressCallback` | Subclassable JS class; override `onProgress(downloaded: bigint, total?: bigint)` |
+| `PipelineBuilder.onPersist(callback)` / `.onPersistJson(callback)` | Per-stage persist hooks; callback returns `Promise<void>` |
+| `LangfuseConfig(publicKey, secretKey, host?, batchSize?, flushIntervalMs?)` | Positional ctor for the Langfuse exporter |
+| `initLangfuse(config)` | Install the global Langfuse subscriber (idempotent) |
+| `ChatMessageInput` / `ChatRole` / `InferenceResult` / `InferenceChunk` / `InferenceChunkStream` / `InferenceImage` / `InferenceImageSource` / `InferenceToolCall` / `InferenceUsage` | Local mistral.rs inference types (pull streams with `await stream.next()`) |
+| `LlamaCppChatMessageInput` / `LlamaCppChatRole` / `LlamaCppInferenceResult` / `LlamaCppInferenceChunk` / `LlamaCppInferenceChunkStream` / `LlamaCppInferenceUsage` | Local llama.cpp inference types |
+| `CandleInferenceResult` | Local candle inference result |
 | `JsWorkflowResult` | Interface: `{ type: string, data: any }` |
 | `version()` | Returns the blazen library version string |