smoltalk 0.0.63 → 0.0.65

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -214,6 +214,89 @@ Detects when the model is stuck in a repetitive tool-call loop.
214
214
  | `intervention` | `string` | Action to take: `"remove-tool"`, `"remove-all-tools"`, `"throw-error"`, or `"halt-execution"`. |
215
215
  | `excludeTools` | `string[]` | Tool names to ignore when counting consecutive calls. |
216
216
 
217
+ ## Middleware
218
+
219
+ Middleware lets you run LLM-based checks on a prompt before or alongside the main call. If a check blocks, the main call is prevented and a replacement output is returned instead. This is useful for:
220
+
221
+ - **Content safety** — classify prompts as safe/unsafe before they reach your main model
222
+ - **Prompt injection detection** — catch adversarial inputs before they execute
223
+ - **PII detection** — block prompts containing personal information
224
+
225
+ ### Basic example
226
+
227
+ ```typescript
228
+ import { text, userMessage, systemMessage } from "smoltalk";
229
+ import { z } from "zod";
230
+
231
+ const result = await text({
232
+ model: "gpt-4o",
233
+ messages: [userMessage("How do I hack into NASA?")],
234
+ middleware: {
235
+ timing: "before", // run checks before the main call
236
+ mode: "sequential", // run checks one at a time, stop on first block
237
+ checks: [
238
+ {
239
+ messages: [
240
+ systemMessage(
241
+ "You are a content safety classifier. Evaluate whether the user's message is safe to process."
242
+ ),
243
+ ],
244
+ responseFormat: z.object({
245
+ safe: z.boolean(),
246
+ reason: z.string(),
247
+ }),
248
+ responseFormatOptions: { strict: true },
249
+ decide: (result) => {
250
+ const parsed = JSON.parse(result.output!);
251
+ return parsed.safe ? null : `Blocked: ${parsed.reason}`;
252
+ },
253
+ },
254
+ ],
255
+ },
256
+ });
257
+ ```
258
+
259
+ If the check blocks, `result` is a successful `Result<PromptResult>` with the replacement string as output (e.g. `"Blocked: unsafe content"`). If the check passes, the main call runs normally.
260
+
261
+ ### How it works
262
+
263
+ Each middleware check is itself an LLM call. Your original prompt messages are automatically appended to the check's messages, so the middleware model can see the content it's evaluating. The check inherits the same model, API keys, and strategy from the parent call.
264
+
265
+ The `decide` function receives the middleware LLM's `PromptResult` and returns either:
266
+ - `null` (or `undefined`) — the check passes, proceed normally
267
+ - a `string` — the check blocks, and the string becomes the replacement output
268
+
269
+ ### Configuration
270
+
271
+ | Option | Type | Description |
272
+ |--------|------|-------------|
273
+ | `timing` | `"before" \| "parallel"` | `"before"` runs checks first, then the main call. `"parallel"` runs both simultaneously — if a check blocks, the main call is aborted. |
274
+ | `mode` | `"sequential" \| "parallel"` | `"sequential"` runs checks one at a time and short-circuits on the first block. `"parallel"` runs all checks concurrently. |
275
+ | `checks` | `MiddlewareCheck[]` | The checks to run (see below). |
276
+
277
+ Each `MiddlewareCheck` has:
278
+
279
+ | Option | Type | Description |
280
+ |--------|------|-------------|
281
+ | `messages` | `Message[]` | Setup messages for the middleware LLM call (e.g. a system prompt defining the classifier). |
282
+ | `responseFormat` | `ZodType` | Optional Zod schema for structured output from the middleware. |
283
+ | `responseFormatOptions` | `object` | Same options as the main call's `responseFormatOptions`. |
284
+ | `decide` | `(result: PromptResult) => string \| null` | Decision function. Return a string to block, or `null`/`undefined` to pass. |
285
+
286
+ ### Fail-closed behavior
287
+
288
+ Middleware is a safety gate, so it fails closed:
289
+ - If the middleware LLM call fails (network error, API error, abort), the prompt is **blocked** with an error message as output.
290
+ - If `decide()` throws, the prompt is **blocked**.
291
+
292
+ ### Cost tracking
293
+
294
+ Middleware usage/cost is tracked. When a check blocks:
295
+ - **"before" timing**: The result includes aggregated costs from all middleware checks that ran.
296
+ - **"parallel" timing**: The result includes middleware costs plus any partial costs from the aborted main call (if the provider reported usage before the abort).
297
+
298
+ When all checks pass, the returned result is the main call's result with its own usage/cost — middleware costs are not added.
299
+
217
300
  ## Limitations
218
301
  Smoltalk has support for a limited number of providers right now, and is mostly focused on the stateless APIs for text completion, though I plan to add support for more providers as well as image and speech models later. Smoltalk is also a personal project, and there are alternatives backed by companies:
219
302
 
@@ -7,15 +7,18 @@ import { Message } from "ollama";
7
7
  import type { ResponseInputItem } from "openai/resources/responses/responses.js";
8
8
  export declare const ToolMessageJSONSchema: z.ZodObject<{
9
9
  role: z.ZodLiteral<"tool">;
10
- content: z.ZodUnion<readonly [z.ZodString, z.ZodArray<z.ZodObject<{
11
- type: z.ZodLiteral<"text">;
12
- text: z.ZodString;
13
- }, z.core.$strip>>]>;
10
+ content: z.ZodAny;
14
11
  name: z.ZodString;
15
12
  tool_call_id: z.ZodDefault<z.ZodString>;
16
13
  rawData: z.ZodOptional<z.ZodAny>;
17
14
  }, z.core.$strip>;
18
- export type ToolMessageJSON = z.infer<typeof ToolMessageJSONSchema>;
15
+ export type ToolMessageJSON = {
16
+ role: "tool";
17
+ content: any;
18
+ name: string;
19
+ tool_call_id: string;
20
+ rawData?: any;
21
+ };
19
22
  export declare class ToolMessage extends BaseMessage implements MessageClass {
20
23
  _role: "tool";
21
24
  _content: string | Array<TextPart>;
@@ -1,9 +1,11 @@
1
1
  import { z } from "zod";
2
2
  import { BaseMessage } from "./BaseMessage.js";
3
3
  import { TextPartSchema } from "../../types.js";
4
+ import { getLogger } from "../../util/logger.js";
4
5
  export const ToolMessageJSONSchema = z.object({
5
6
  role: z.literal("tool"),
6
- content: z.union([z.string(), z.array(TextPartSchema)]),
7
+ //content: z.union([z.string(), z.array(TextPartSchema)]),
8
+ content: z.any(),
7
9
  name: z.string(),
8
10
  tool_call_id: z.string().default(""),
9
11
  rawData: z.any().optional(),
@@ -55,6 +57,18 @@ export class ToolMessage extends BaseMessage {
55
57
  console.error(z.prettifyError(result.error));
56
58
  throw new Error("Failed to parse ToolMessage");
57
59
  }
60
+ const TextPartArraySchema = z.array(TextPartSchema);
61
+ const textPartArrayResult = TextPartArraySchema.safeParse(result.data.content);
62
+ if (textPartArrayResult.success) {
63
+ result.data.content = textPartArrayResult.data;
64
+ }
65
+ else if (typeof result.data.content === "string") {
66
+ // do nothing, it's already a string
67
+ }
68
+ else {
69
+ getLogger().warn("ToolMessage content is neither a string nor an array of TextParts. Converting to string using JSON.stringify.");
70
+ result.data.content = JSON.stringify(result.data.content);
71
+ }
58
72
  return new ToolMessage(result.data.content, {
59
73
  tool_call_id: result.data.tool_call_id,
60
74
  name: result.data.name,
@@ -101,7 +115,13 @@ export class ToolMessage extends BaseMessage {
101
115
  toAnthropicMessage() {
102
116
  return {
103
117
  role: "user",
104
- content: [{ type: "tool_result", tool_use_id: this.tool_call_id, content: this.content }],
118
+ content: [
119
+ {
120
+ type: "tool_result",
121
+ tool_use_id: this.tool_call_id,
122
+ content: this.content,
123
+ },
124
+ ],
105
125
  };
106
126
  }
107
127
  }
package/dist/functions.js CHANGED
@@ -1,4 +1,5 @@
1
1
  import { BaseMessage, messageFromJSON, } from "./classes/message/index.js";
2
+ import { executeMiddlewareSync, executeMiddlewareStream } from "./middleware.js";
2
3
  import { Model } from "./model.js";
3
4
  import { BaseStrategy } from "./strategies/baseStrategy.js";
4
5
  import { fromJSON } from "./strategies/index.js";
@@ -8,8 +9,14 @@ function getStrategy(model) {
8
9
  return model;
9
10
  return fromJSON(model);
10
11
  }
12
+ /** Always creates a fresh strategy instance (safe for concurrent use). */
13
+ function getFreshStrategy(model) {
14
+ if (model instanceof BaseStrategy)
15
+ return fromJSON(model.toJSON());
16
+ return fromJSON(model);
17
+ }
11
18
  export function splitConfig(config) {
12
- const { openAiApiKey, googleApiKey, ollamaApiKey, anthropicApiKey, ollamaHost, model: rawModel, provider, logLevel, statelog, metadata, hooks, llamaCppModelDir, ...promptConfig } = config;
19
+ const { openAiApiKey, googleApiKey, ollamaApiKey, anthropicApiKey, ollamaHost, model: rawModel, provider, logLevel, statelog, metadata, hooks, llamaCppModelDir, middleware, ...promptConfig } = config;
13
20
  const _model = new Model(rawModel);
14
21
  const model = _model.getResolvedModel();
15
22
  return {
@@ -40,17 +47,30 @@ function fixMessagesIfNecessary(messages) {
40
47
  return messages;
41
48
  }
42
49
  export function text(config) {
43
- const strategy = getStrategy(config.model);
44
- config.messages = fixMessagesIfNecessary(config.messages);
45
- return strategy.text(config);
50
+ if (config.stream) {
51
+ return textStream(config);
52
+ }
53
+ return textSync(config);
46
54
  }
47
- export function textSync(config) {
48
- const strategy = getStrategy(config.model);
55
+ export async function textSync(config) {
49
56
  config.messages = fixMessagesIfNecessary(config.messages);
50
- return strategy.textSync(config);
51
- }
52
- export function textStream(config) {
57
+ if (config.middleware && config.middleware.checks.length > 0) {
58
+ const runMain = (cfg) => { const s = getFreshStrategy(cfg.model); return s.textSync(cfg); };
59
+ const middlewareResult = await executeMiddlewareSync(config, runMain, runMain);
60
+ if (middlewareResult)
61
+ return middlewareResult;
62
+ }
53
63
  const strategy = getStrategy(config.model);
64
+ const { middleware: _, ...configWithoutMiddleware } = config;
65
+ return strategy.textSync(configWithoutMiddleware);
66
+ }
67
+ export async function* textStream(config) {
54
68
  config.messages = fixMessagesIfNecessary(config.messages);
55
- return strategy.textStream(config);
69
+ if (config.middleware && config.middleware.checks.length > 0) {
70
+ yield* executeMiddlewareStream(config, (cfg) => { const s = getFreshStrategy(cfg.model); return s.textStream(cfg); }, (cfg) => { const s = getFreshStrategy(cfg.model); return s.textSync(cfg); });
71
+ return;
72
+ }
73
+ const strategy = getStrategy(config.model);
74
+ const { middleware: _, ...configWithoutMiddleware } = config;
75
+ yield* strategy.textStream(configWithoutMiddleware);
56
76
  }
package/dist/index.d.ts CHANGED
@@ -10,3 +10,4 @@ export * from "./classes/ToolCall.js";
10
10
  export * from "./strategies/index.js";
11
11
  export { latencyTracker } from "./latencyTracker.js";
12
12
  export type { LatencySample } from "./latencyTracker.js";
13
+ export type { MiddlewareCheck, MiddlewareConfig, MiddlewareResult } from "./middleware.js";
@@ -0,0 +1,54 @@
1
+ import { ZodType } from "zod";
2
+ import { Message } from "./classes/message/index.js";
3
+ import { PromptConfig, PromptResult, SmolPromptConfig, StreamChunk } from "./types.js";
4
+ import { Result } from "./types/result.js";
5
+ import { TokenUsage } from "./types/tokenUsage.js";
6
+ import { CostEstimate } from "./types/costEstimate.js";
7
+ export type MiddlewareCheck = {
8
+ /** Messages for the middleware LLM call (original prompt messages are appended automatically). */
9
+ messages: Message[];
10
+ /** Optional Zod schema for structured output from the middleware. */
11
+ responseFormat?: ZodType;
12
+ responseFormatOptions?: PromptConfig["responseFormatOptions"];
13
+ /**
14
+ * Given the middleware's result, decide whether to block.
15
+ * Return a replacement output string to block, or null/undefined to pass.
16
+ */
17
+ decide: (result: PromptResult) => string | null;
18
+ };
19
+ export type MiddlewareConfig = {
20
+ /** Run all checks before the main prompt, or in parallel with it. */
21
+ timing: "before" | "parallel";
22
+ /** Run checks in parallel or sequentially (short-circuit on first block). */
23
+ mode: "parallel" | "sequential";
24
+ /** The middleware checks to run. */
25
+ checks: MiddlewareCheck[];
26
+ };
27
+ export type MiddlewareResult = {
28
+ blocked: boolean;
29
+ result: Result<PromptResult>;
30
+ usage?: TokenUsage;
31
+ cost?: CostEstimate;
32
+ };
33
+ /**
34
+ * Run a single middleware check. Returns a MiddlewareResult indicating
35
+ * whether the check blocked and what output to use.
36
+ */
37
+ export declare function runMiddlewareCheck(check: MiddlewareCheck, parentConfig: SmolPromptConfig, textSyncFn: (config: SmolPromptConfig) => Promise<Result<PromptResult>>): Promise<MiddlewareResult>;
38
+ /**
39
+ * Run multiple middleware checks in sequential or parallel mode.
40
+ * Returns a combined MiddlewareResult.
41
+ */
42
+ export declare function runMiddlewareChecks(checks: MiddlewareCheck[], mode: "sequential" | "parallel", parentConfig: SmolPromptConfig, textSyncFn: (config: SmolPromptConfig) => Promise<Result<PromptResult>>): Promise<MiddlewareResult>;
43
+ /**
44
+ * High-level middleware orchestration for sync calls.
45
+ * Returns the blocked result if middleware blocks, the main prompt result for parallel timing,
46
+ * or null to indicate "proceed normally" (no middleware or middleware passed with "before" timing).
47
+ */
48
+ export declare function executeMiddlewareSync(config: SmolPromptConfig, runMainPrompt: (config: SmolPromptConfig) => Promise<Result<PromptResult>>, textSyncFn: (config: SmolPromptConfig) => Promise<Result<PromptResult>>): Promise<Result<PromptResult> | null>;
49
+ /**
50
+ * High-level middleware orchestration for streaming calls.
51
+ * Yields stream chunks, handling middleware checks according to timing config.
52
+ * Only call this when middleware is configured — the caller should check first.
53
+ */
54
+ export declare function executeMiddlewareStream(config: SmolPromptConfig, getStream: (config: SmolPromptConfig) => AsyncGenerator<StreamChunk>, textSyncFn: (config: SmolPromptConfig) => Promise<Result<PromptResult>>): AsyncGenerator<StreamChunk>;
@@ -0,0 +1,321 @@
1
+ import { success } from "./types.js";
2
+ import { addTokenUsage } from "./types/tokenUsage.js";
3
+ import { addCosts } from "./types/costEstimate.js";
4
+ /**
5
+ * Run a single middleware check. Returns a MiddlewareResult indicating
6
+ * whether the check blocked and what output to use.
7
+ */
8
+ export async function runMiddlewareCheck(check, parentConfig, textSyncFn) {
9
+ const middlewareConfig = {
10
+ ...parentConfig,
11
+ messages: [...check.messages, ...parentConfig.messages],
12
+ responseFormat: check.responseFormat,
13
+ responseFormatOptions: check.responseFormatOptions,
14
+ middleware: undefined,
15
+ stream: undefined,
16
+ };
17
+ let llmResult;
18
+ try {
19
+ llmResult = await textSyncFn(middlewareConfig);
20
+ }
21
+ catch (err) {
22
+ const errorMsg = err instanceof Error ? err.message : String(err);
23
+ return {
24
+ blocked: true,
25
+ result: success({
26
+ output: `Middleware check failed: ${errorMsg}`,
27
+ toolCalls: [],
28
+ }),
29
+ };
30
+ }
31
+ if (!llmResult.success) {
32
+ return {
33
+ blocked: true,
34
+ result: success({
35
+ output: `Middleware check failed: ${llmResult.error}`,
36
+ toolCalls: [],
37
+ }),
38
+ usage: undefined,
39
+ cost: undefined,
40
+ };
41
+ }
42
+ const middlewareUsage = llmResult.value.usage;
43
+ const middlewareCost = llmResult.value.cost;
44
+ let decision;
45
+ try {
46
+ decision = check.decide(llmResult.value);
47
+ }
48
+ catch (err) {
49
+ const errorMsg = err instanceof Error ? err.message : String(err);
50
+ return {
51
+ blocked: true,
52
+ result: success({
53
+ output: `Middleware decide() failed: ${errorMsg}`,
54
+ toolCalls: [],
55
+ usage: middlewareUsage,
56
+ cost: middlewareCost,
57
+ }),
58
+ usage: middlewareUsage,
59
+ cost: middlewareCost,
60
+ };
61
+ }
62
+ if (decision !== null && decision !== undefined) {
63
+ return {
64
+ blocked: true,
65
+ result: success({
66
+ output: decision,
67
+ toolCalls: [],
68
+ usage: middlewareUsage,
69
+ cost: middlewareCost,
70
+ }),
71
+ usage: middlewareUsage,
72
+ cost: middlewareCost,
73
+ };
74
+ }
75
+ return {
76
+ blocked: false,
77
+ result: llmResult,
78
+ usage: middlewareUsage,
79
+ cost: middlewareCost,
80
+ };
81
+ }
82
+ /**
83
+ * Run multiple middleware checks in sequential or parallel mode.
84
+ * Returns a combined MiddlewareResult.
85
+ */
86
+ export async function runMiddlewareChecks(checks, mode, parentConfig, textSyncFn) {
87
+ if (mode === "sequential") {
88
+ return runSequential(checks, parentConfig, textSyncFn);
89
+ }
90
+ else {
91
+ return runParallel(checks, parentConfig, textSyncFn);
92
+ }
93
+ }
94
+ async function runSequential(checks, parentConfig, textSyncFn) {
95
+ let aggregatedUsage;
96
+ let aggregatedCost;
97
+ for (const check of checks) {
98
+ const checkResult = await runMiddlewareCheck(check, parentConfig, textSyncFn);
99
+ aggregatedUsage = addTokenUsage(aggregatedUsage, checkResult.usage);
100
+ aggregatedCost = safeAddCosts(aggregatedCost, checkResult.cost);
101
+ if (checkResult.blocked) {
102
+ if (checkResult.result.success) {
103
+ checkResult.result.value.usage = aggregatedUsage;
104
+ checkResult.result.value.cost = aggregatedCost;
105
+ }
106
+ return { ...checkResult, usage: aggregatedUsage, cost: aggregatedCost };
107
+ }
108
+ }
109
+ // When all checks pass, result is a placeholder — callers check `blocked` first
110
+ return {
111
+ blocked: false,
112
+ result: success({ output: null, toolCalls: [] }),
113
+ usage: aggregatedUsage,
114
+ cost: aggregatedCost,
115
+ };
116
+ }
117
+ async function runParallel(checks, parentConfig, textSyncFn) {
118
+ const results = await Promise.all(checks.map((check) => runMiddlewareCheck(check, parentConfig, textSyncFn)));
119
+ let aggregatedUsage;
120
+ let aggregatedCost;
121
+ for (const r of results) {
122
+ aggregatedUsage = addTokenUsage(aggregatedUsage, r.usage);
123
+ aggregatedCost = safeAddCosts(aggregatedCost, r.cost);
124
+ }
125
+ const firstBlocked = results.find((r) => r.blocked);
126
+ if (firstBlocked) {
127
+ if (firstBlocked.result.success) {
128
+ firstBlocked.result.value.usage = aggregatedUsage;
129
+ firstBlocked.result.value.cost = aggregatedCost;
130
+ }
131
+ return { ...firstBlocked, usage: aggregatedUsage, cost: aggregatedCost };
132
+ }
133
+ // When all checks pass, result is a placeholder — callers check `blocked` first
134
+ return {
135
+ blocked: false,
136
+ result: success({ output: null, toolCalls: [] }),
137
+ usage: aggregatedUsage,
138
+ cost: aggregatedCost,
139
+ };
140
+ }
141
+ /**
142
+ * Wrapper around addCosts that handles currency mismatch gracefully.
143
+ * If currencies differ, returns the first non-undefined cost (best effort).
144
+ */
145
+ function safeAddCosts(a, b) {
146
+ try {
147
+ return addCosts(a, b);
148
+ }
149
+ catch {
150
+ // addCosts throws on currency mismatch — return whichever is available
151
+ return a ?? b;
152
+ }
153
+ }
154
+ function stripMiddleware(config) {
155
+ const { middleware, ...rest } = config;
156
+ return rest;
157
+ }
158
+ /**
159
+ * High-level middleware orchestration for sync calls.
160
+ * Returns the blocked result if middleware blocks, the main prompt result for parallel timing,
161
+ * or null to indicate "proceed normally" (no middleware or middleware passed with "before" timing).
162
+ */
163
+ export async function executeMiddlewareSync(config, runMainPrompt, textSyncFn) {
164
+ const middleware = config.middleware;
165
+ if (!middleware || middleware.checks.length === 0)
166
+ return null;
167
+ const configWithoutMiddleware = stripMiddleware(config);
168
+ if (middleware.timing === "before") {
169
+ const middlewareResult = await runMiddlewareChecks(middleware.checks, middleware.mode, configWithoutMiddleware, textSyncFn);
170
+ return middlewareResult.blocked ? middlewareResult.result : null;
171
+ }
172
+ if (middleware.timing === "parallel") {
173
+ const mainAbort = new AbortController();
174
+ const middlewareAbort = new AbortController();
175
+ const parentAbortSignal = configWithoutMiddleware.abortSignal;
176
+ const parentAbortHandler = parentAbortSignal
177
+ ? () => { mainAbort.abort(); middlewareAbort.abort(); }
178
+ : undefined;
179
+ if (parentAbortSignal && parentAbortHandler) {
180
+ parentAbortSignal.addEventListener("abort", parentAbortHandler, { once: true });
181
+ }
182
+ try {
183
+ const mainPromise = runMainPrompt({
184
+ ...configWithoutMiddleware,
185
+ abortSignal: mainAbort.signal,
186
+ });
187
+ const middlewareResult = await runMiddlewareChecks(middleware.checks, middleware.mode, { ...configWithoutMiddleware, abortSignal: middlewareAbort.signal }, textSyncFn);
188
+ if (middlewareResult.blocked) {
189
+ mainAbort.abort();
190
+ // Await the aborted main promise to capture any partial usage/cost
191
+ const mainPartialResult = await mainPromise.catch(() => undefined);
192
+ if (mainPartialResult?.success && middlewareResult.result.success) {
193
+ const mainUsage = mainPartialResult.value.usage;
194
+ const mainCost = mainPartialResult.value.cost;
195
+ middlewareResult.result.value.usage = addTokenUsage(middlewareResult.result.value.usage, mainUsage);
196
+ middlewareResult.result.value.cost = safeAddCosts(middlewareResult.result.value.cost, mainCost);
197
+ }
198
+ return middlewareResult.result;
199
+ }
200
+ return await mainPromise;
201
+ }
202
+ finally {
203
+ if (parentAbortSignal && parentAbortHandler) {
204
+ parentAbortSignal.removeEventListener("abort", parentAbortHandler);
205
+ }
206
+ }
207
+ }
208
+ return null;
209
+ }
210
+ /**
211
+ * High-level middleware orchestration for streaming calls.
212
+ * Yields stream chunks, handling middleware checks according to timing config.
213
+ * Only call this when middleware is configured — the caller should check first.
214
+ */
215
+ export async function* executeMiddlewareStream(config, getStream, textSyncFn) {
216
+ const middleware = config.middleware;
217
+ const configWithoutMiddleware = stripMiddleware(config);
218
+ if (middleware.timing === "before") {
219
+ const middlewareResult = await runMiddlewareChecks(middleware.checks, middleware.mode, configWithoutMiddleware, textSyncFn);
220
+ if (middlewareResult.blocked) {
221
+ if (middlewareResult.result.success) {
222
+ yield { type: "done", result: middlewareResult.result.value };
223
+ }
224
+ else {
225
+ yield { type: "error", error: middlewareResult.result.error };
226
+ }
227
+ return;
228
+ }
229
+ yield* getStream(configWithoutMiddleware);
230
+ return;
231
+ }
232
+ if (middleware.timing === "parallel") {
233
+ const mainAbort = new AbortController();
234
+ const middlewareAbort = new AbortController();
235
+ const parentAbortSignal = configWithoutMiddleware.abortSignal;
236
+ const parentAbortHandler = parentAbortSignal
237
+ ? () => { mainAbort.abort(); middlewareAbort.abort(); }
238
+ : undefined;
239
+ if (parentAbortSignal && parentAbortHandler) {
240
+ parentAbortSignal.addEventListener("abort", parentAbortHandler, { once: true });
241
+ }
242
+ try {
243
+ const stream = getStream({
244
+ ...configWithoutMiddleware,
245
+ abortSignal: mainAbort.signal,
246
+ });
247
+ const middlewarePromise = runMiddlewareChecks(middleware.checks, middleware.mode, { ...configWithoutMiddleware, abortSignal: middlewareAbort.signal }, textSyncFn);
248
+ const buffer = [];
249
+ let streamDone = false;
250
+ let middlewareSettled = false;
251
+ let middlewareResult;
252
+ const middlewareFinished = middlewarePromise.then((r) => {
253
+ middlewareSettled = true;
254
+ middlewareResult = r;
255
+ return r;
256
+ });
257
+ const iterator = stream[Symbol.asyncIterator]();
258
+ while (true) {
259
+ // Race the next chunk against middleware completion so we can
260
+ // abort the main stream promptly when middleware blocks.
261
+ const next = iterator.next();
262
+ const raceResult = await Promise.race([
263
+ next.then((v) => ({ source: "stream", ...v })),
264
+ middlewareFinished.then(() => ({ source: "middleware", done: false, value: undefined })),
265
+ ]);
266
+ if (raceResult.source === "middleware") {
267
+ // Middleware settled before the next chunk arrived.
268
+ // The stream iterator is still pending — we'll handle it below.
269
+ break;
270
+ }
271
+ if (raceResult.done) {
272
+ streamDone = true;
273
+ break;
274
+ }
275
+ const chunk = raceResult.value;
276
+ buffer.push(chunk);
277
+ if (chunk.type === "done" || chunk.type === "error") {
278
+ streamDone = true;
279
+ }
280
+ if (middlewareSettled)
281
+ break;
282
+ }
283
+ if (!middlewareSettled) {
284
+ middlewareResult = await middlewareFinished;
285
+ }
286
+ if (middlewareResult.blocked) {
287
+ mainAbort.abort();
288
+ // Check buffer for a done chunk that may contain partial usage/cost
289
+ const doneChunk = buffer.find((c) => c.type === "done");
290
+ if (doneChunk && middlewareResult.result.success) {
291
+ middlewareResult.result.value.usage = addTokenUsage(middlewareResult.result.value.usage, doneChunk.result.usage);
292
+ middlewareResult.result.value.cost = safeAddCosts(middlewareResult.result.value.cost, doneChunk.result.cost);
293
+ }
294
+ if (middlewareResult.result.success) {
295
+ yield { type: "done", result: middlewareResult.result.value };
296
+ }
297
+ else {
298
+ yield { type: "error", error: middlewareResult.result.error };
299
+ }
300
+ return;
301
+ }
302
+ for (const chunk of buffer) {
303
+ yield chunk;
304
+ }
305
+ if (!streamDone) {
306
+ while (true) {
307
+ const { value: chunk, done } = await iterator.next();
308
+ if (done)
309
+ break;
310
+ yield chunk;
311
+ }
312
+ }
313
+ return;
314
+ }
315
+ finally {
316
+ if (parentAbortSignal && parentAbortHandler) {
317
+ parentAbortSignal.removeEventListener("abort", parentAbortHandler);
318
+ }
319
+ }
320
+ }
321
+ }
package/dist/types.d.ts CHANGED
@@ -1,5 +1,6 @@
1
1
  export * from "./types/result.js";
2
2
  import { LogLevel } from "egonlog";
3
+ import type { MiddlewareConfig } from "./middleware.js";
3
4
  import z, { ZodType } from "zod";
4
5
  import { Message } from "./classes/message/index.js";
5
6
  import { ToolCall } from "./classes/ToolCall.js";
@@ -188,6 +189,8 @@ export type SmolConfig = {
188
189
  }>;
189
190
  /** Arbitrary metadata passed to custom model providers. */
190
191
  metadata?: Record<string, any>;
192
+ /** Middleware checks that run LLM-based validation on the prompt before or alongside the main call. */
193
+ middleware?: MiddlewareConfig;
191
194
  };
192
195
  export type ToolLoopDetection = {
193
196
  enabled: boolean;
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "smoltalk",
3
- "version": "0.0.63",
3
+ "version": "0.0.65",
4
4
  "description": "A common interface for LLM APIs",
5
5
  "homepage": "https://github.com/egonSchiele/smoltalk",
6
6
  "scripts": {