npm - @context-chef/ai-sdk-middleware - Versions diffs - 1.0.4 → 1.0.6 - Mend

@context-chef/ai-sdk-middleware 1.0.4 → 1.0.6

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (5)

package/README.md CHANGED

@@ -8,6 +8,8 @@
 [Vercel AI SDK](https://ai-sdk.dev) middleware powered by [context-chef](https://github.com/MyPrototypeWhat/context-chef). Transparent history compression, tool result truncation, and token budget management — zero code changes required.
+![Quick Start](../../@context-chef_ai-sdk-middleware.png)
 ## Installation
 ```bash
@@ -27,7 +29,7 @@ const model = withContextChef(openai('gpt-4o'), {
   truncate: { threshold: 5000, headChars: 500, tailChars: 1000 },
 });
-// Everything below stays exactly the same — works with generateText and streamText
+// Everything below stays exactly the same — works with generateText, streamText, and ToolLoopAgent
 const result = await generateText({
   model,
   messages: conversationHistory,
@@ -98,6 +100,36 @@ const model = withContextChef(openai('gpt-4o'), {
 The middleware automatically extracts token usage from `generateText` and `streamText` responses and feeds it back to the compression engine. No manual `reportTokenUsage()` calls needed.
+### Compact (Mechanical Clearing)
+Zero-LLM-cost content clearing for thinking blocks and tool results:
+```typescript
+const model = withContextChef(openai('gpt-4o'), {
+  contextWindow: 128_000,
+  compact: {
+    clear: ['thinking', { target: 'tool-result', keepRecent: 5 }],
+  },
+});
+```
+> **Important: compact + compress interaction**
+>
+> When using `compact` together with `compress`, only clear `thinking` in compact:
+>
+> ```typescript
+> const model = withContextChef(openai('gpt-4o'), {
+>   contextWindow: 128_000,
+>   compact: { clear: ['thinking'] },                // thinking only
+>   compress: { model: openai('gpt-4o-mini') },
+> });
+> ```
+>
+> Clearing `tool-result` before compression causes the compression model to receive
+> empty placeholders instead of actual tool outputs, producing low-quality summaries.
+> Compression's turn-based splitting already manages history length — use `compact`
+> for `tool-result` clearing only when `compress` is **not** configured.
 ## API
 ### `withContextChef(model, options)`
@@ -123,6 +155,7 @@ const wrappedModel = withContextChef(model, options);
 | `truncate.headChars` | `number` | No | Characters to preserve from start (default: `0`) |
 | `truncate.tailChars` | `number` | No | Characters to preserve from end (default: `1000`) |
 | `truncate.storage` | `VFSStorageAdapter` | No | Storage adapter to persist original content before truncation |
+| `compact` | `CompactConfig` | No | Mechanical content clearing (thinking, tool-result). When combined with `compress`, use `clear: ['thinking']` only |
 | `tokenizer` | `(msgs) => number` | No | Custom tokenizer for precise counting |
 | `onCompress` | `(summary, count) => void` | No | Hook called after compression |
@@ -155,7 +188,7 @@ const aiSdkPrompt = toAISDK(irMessages);
 ## How It Works
 ```
-generateText / streamText ({ model: wrappedModel, messages })
+generateText / streamText / ToolLoopAgent ({ model: wrappedModel, messages })
   |
   v
 transformParams (before LLM call)
@@ -186,3 +219,7 @@ The middleware covers the most common use case: transparent compression and trun
 ## License
 ISC
+---
+[中文文档](./README.zh-CN.md)
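
The `compact` option added in this README change clears content mechanically, without calling a model. The following is a minimal sketch of what such clearing could look like, assuming a simplified message shape. The `SketchMessage` type, the `clearMechanically` function, and the `[cleared]` placeholder are hypothetical names introduced for illustration; they are not the package's internals.

```typescript
// Hypothetical, simplified message shape. The package's real IR is not shown in this diff.
type SketchMessage =
  | { role: 'user' | 'assistant'; content: string; thinking?: string }
  | { role: 'tool'; content: string };

// Placeholder text is an assumption made for this sketch.
const CLEARED_PLACEHOLDER = '[cleared]';

// Removes `thinking` from every non-tool message and blanks all but the
// `keepRecent` most recent tool results. No model call is involved.
function clearMechanically(messages: SketchMessage[], keepRecent: number): SketchMessage[] {
  // Positions of all tool results, oldest first.
  const toolIndexes = messages
    .map((m, i) => (m.role === 'tool' ? i : -1))
    .filter((i) => i >= 0);
  // slice(-0) would return everything, so handle keepRecent <= 0 explicitly.
  const keep = new Set(keepRecent > 0 ? toolIndexes.slice(-keepRecent) : []);

  return messages.map((m, i): SketchMessage => {
    if (m.role === 'tool') {
      return keep.has(i) ? m : { role: 'tool', content: CLEARED_PLACEHOLDER };
    }
    // Rebuild the message without the optional `thinking` field.
    return { role: m.role, content: m.content };
  });
}
```

The sketch also shows why the README warns against clearing `tool-result` ahead of `compress`: once a tool result is replaced by a placeholder, a summarizing model that runs afterwards only ever sees the placeholder.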

package/README.zh-CN.md ADDED

@@ -0,0 +1,222 @@
+# @context-chef/ai-sdk-middleware
+[![npm version](https://img.shields.io/npm/v/@context-chef/ai-sdk-middleware.svg)](https://www.npmjs.com/package/@context-chef/ai-sdk-middleware)
+[![npm downloads](https://img.shields.io/npm/dm/@context-chef/ai-sdk-middleware.svg)](https://www.npmjs.com/package/@context-chef/ai-sdk-middleware)
+[![License](https://img.shields.io/npm/l/@context-chef/ai-sdk-middleware.svg)](https://github.com/MyPrototypeWhat/context-chef/blob/main/LICENSE)
+[![TypeScript](https://img.shields.io/badge/TypeScript-5.9-blue.svg)](https://www.typescriptlang.org/)
+[![AI SDK](https://img.shields.io/badge/AI%20SDK-v6-black.svg)](https://ai-sdk.dev)
+[Vercel AI SDK](https://ai-sdk.dev) middleware powered by [context-chef](https://github.com/MyPrototypeWhat/context-chef). Transparent history compression, tool result truncation, and token budget management, with no code changes required.
+[English](./README.md)
+![Quick Start](../../@context-chef_ai-sdk-middleware.png)
+## Installation
+```bash
+npm install @context-chef/ai-sdk-middleware ai
+```
+## Quick Start
+```typescript
+import { withContextChef } from '@context-chef/ai-sdk-middleware';
+import { openai } from '@ai-sdk/openai';
+import { generateText } from 'ai';
+const model = withContextChef(openai('gpt-4o'), {
+  contextWindow: 128_000,
+  compress: { model: openai('gpt-4o-mini') },
+  truncate: { threshold: 5000, headChars: 500, tailChars: 1000 },
+});
+// Everything below stays exactly the same; works with generateText, streamText, and ToolLoopAgent
+const result = await generateText({
+  model,
+  messages: conversationHistory,
+  tools: myTools,
+});
+```
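+That's it. History compression, tool result truncation, and token budget tracking happen automatically in the background.
+## Features
+### History Compression
+When the conversation exceeds the token budget, the middleware compresses older messages to make room. There are two modes:
+**Without a compression model** (default): older messages are discarded and only recent messages are kept: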
+```typescript
+const model = withContextChef(openai('gpt-4o'), {
+  contextWindow: 128_000,
+});
+```
+**With a compression model**: older messages are summarized by a cheap model and replaced with the summary:
+```typescript
+const model = withContextChef(openai('gpt-4o'), {
+  contextWindow: 128_000,
+  compress: {
+    model: openai('gpt-4o-mini'),  // cheap model used for summarization
+    preserveRatio: 0.8,             // keep 80% of the context for recent messages
+  },
+});
+```
+### Tool Result Truncation
+Large tool outputs (terminal logs, API responses) are truncated automatically while the head and tail are preserved:
+```typescript
+const model = withContextChef(openai('gpt-4o'), {
+  contextWindow: 128_000,
+  truncate: {
+    threshold: 5000,   // truncate when the output exceeds 5000 characters
+    headChars: 500,    // keep the first 500 characters
+    tailChars: 1000,   // keep the last 1000 characters
+  },
+});
+```
+Optionally persist the original content through a storage adapter so the LLM can retrieve it later on demand via a `context://vfs/` URI:
+```typescript
+import { FileSystemAdapter } from '@context-chef/core';
+const model = withContextChef(openai('gpt-4o'), {
+  contextWindow: 128_000,
+  truncate: {
+    threshold: 5000,
+    headChars: 500,
+    tailChars: 1000,
+    storage: new FileSystemAdapter('.context_vfs'), // or a custom database adapter
+  },
+});
+```
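+### Token Budget Tracking
+The middleware automatically extracts token usage from `generateText` and `streamText` responses and feeds it back to the compression engine. No manual `reportTokenUsage()` calls needed.
+### Compact (Mechanical Clearing)
+Zero-LLM-cost content clearing for thinking blocks and tool results: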
+```typescript
+const model = withContextChef(openai('gpt-4o'), {
+  contextWindow: 128_000,
+  compact: {
+    clear: ['thinking', { target: 'tool-result', keepRecent: 5 }],
+  },
+});
+```
+> **Important: compact + compress interaction**
+>
+> When using `compact` together with `compress`, only clear `thinking` in compact:
+>
+> ```typescript
+> const model = withContextChef(openai('gpt-4o'), {
+>   contextWindow: 128_000,
+>   compact: { clear: ['thinking'] },                // thinking only
+>   compress: { model: openai('gpt-4o-mini') },
+> });
+> ```
+>
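+> Clearing `tool-result` before compression causes the compression model to receive
+> empty placeholders instead of actual tool outputs, producing low-quality summaries.
+> Compression's turn-based splitting already manages history length; use `compact`
+> for `tool-result` clearing only when `compress` is **not** configured.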
+## API
+### `withContextChef(model, options)`
+Wraps an AI SDK language model with the context-chef middleware.
+```typescript
+import { withContextChef } from '@context-chef/ai-sdk-middleware';
+const wrappedModel = withContextChef(model, options);
+```
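+**Parameters:**
+| Option | Type | Required | Description |
+|---|---|---|---|
+| `contextWindow` | `number` | Yes | The model's context window size (in tokens) |
+| `compress` | `CompressOptions` | No | Enable LLM-based compression |
+| `compress.model` | `LanguageModelV3` | Yes (if `compress` is enabled) | Cheap model used for summarization |
+| `compress.preserveRatio` | `number` | No | Ratio of context to preserve (default: `0.8`) |
+| `truncate` | `TruncateOptions` | No | Enable tool result truncation |
+| `truncate.threshold` | `number` | Yes (if `truncate` is enabled) | Character count that triggers truncation |
+| `truncate.headChars` | `number` | No | Characters to preserve from start (default: `0`) |
+| `truncate.tailChars` | `number` | No | Characters to preserve from end (default: `1000`) |
+| `truncate.storage` | `VFSStorageAdapter` | No | Storage adapter to persist original content before truncation |
+| `compact` | `CompactConfig` | No | Mechanical content clearing (thinking, tool-result). When combined with `compress`, use `clear: ['thinking']` only |
+| `tokenizer` | `(msgs) => number` | No | Custom tokenizer for precise counting |
+| `onCompress` | `(summary, count) => void` | No | Hook called after compression |
+**Returns:** `LanguageModelV3`, the wrapped model; a drop-in replacement anywhere the original model is used.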
+### `createMiddleware(options)`
+Creates the raw `LanguageModelMiddleware`, which you can apply yourself via `wrapLanguageModel`:
+```typescript
+import { createMiddleware } from '@context-chef/ai-sdk-middleware';
+import { wrapLanguageModel } from 'ai';
+const middleware = createMiddleware({ contextWindow: 128_000 });
+const model = wrapLanguageModel({ model: openai('gpt-4o'), middleware });
+```
+### `fromAISDK(prompt)` / `toAISDK(messages)`
+Low-level converters between the AI SDK `LanguageModelV3Prompt` and the context-chef `Message[]` IR. Useful when you apply context-chef modules directly to AI SDK message formats.
+```typescript
+import { fromAISDK, toAISDK } from '@context-chef/ai-sdk-middleware';
+const irMessages = fromAISDK(aiSdkPrompt);
+// ... process with context-chef modules ...
+const aiSdkPrompt = toAISDK(irMessages);
+```
+## How It Works
+```
+generateText / streamText / ToolLoopAgent ({ model: wrappedModel, messages })
+  |
+  v
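+transformParams (before LLM call)
+  1. Truncate large tool results (if configured)
+     - Optionally persist original content to the storage adapter
+  2. AI SDK messages -> context-chef IR
+  3. Run Janitor compression (if over the token budget)
+  4. Convert back to AI SDK messages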
+  |
+  v
+LLM call executes normally
+  |
+  v
+wrapGenerate / wrapStream (after LLM call)
+  5. Extract token usage from the response
+  6. Feed it back to Janitor for the next call's budget check
+  |
+  v
+Result returned unchanged
+```
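+The middleware is **stateful**: it tracks token usage across calls to decide when compression is needed. Create one wrapped model instance per conversation/session.
+## Need More Control?
+The middleware covers the most common use case: transparent compression and truncation. For advanced features such as dynamic state injection, tool namespaces, memory, or snapshot/restore, use [`@context-chef/core`](https://www.npmjs.com/package/@context-chef/core) directly.
+## License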
+ISC
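
The tool result truncation documented in this README keeps the head and tail of an oversized output. A minimal sketch of that idea follows, assuming the documented defaults (`headChars: 0`, `tailChars: 1000`). The `truncateToolOutput` function, the `SketchTruncateOptions` type, and the marker text are hypothetical and only illustrate the behavior; the package's actual implementation is not part of this diff.

```typescript
// Hypothetical option shape mirroring the documented `truncate` fields.
interface SketchTruncateOptions {
  threshold: number;  // character count above which truncation kicks in
  headChars?: number; // characters kept from the start
  tailChars?: number; // characters kept from the end
}

function truncateToolOutput(text: string, opts: SketchTruncateOptions): string {
  const { threshold, headChars = 0, tailChars = 1000 } = opts;
  // Outputs at or below the threshold pass through untouched.
  if (text.length <= threshold) return text;
  // If head and tail together cover the whole text, nothing would be saved.
  if (headChars + tailChars >= text.length) return text;

  const head = text.slice(0, headChars);
  // slice(-0) would return the whole string, so guard the zero case.
  const tail = tailChars > 0 ? text.slice(-tailChars) : '';
  const omitted = text.length - head.length - tail.length;
  // The marker format is an assumption made for this sketch.
  return `${head}\n[... ${omitted} characters truncated ...]\n${tail}`;
}
```

With `threshold: 5000, headChars: 500, tailChars: 1000`, a 6500-character log would keep its first 500 and last 1000 characters and drop the 5000 in between.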

package/dist/index.d.cts CHANGED

@@ -27,6 +27,12 @@ interface CompressOptions {
 /**
  * Mechanical compaction options — zero LLM cost.
  * Runs before LLM-based compression to reduce token usage at no cost.
+ *
+ * **Important:** When using together with `compress`, only clear `thinking`.
+ * Clearing `tool-result` before compression causes the compression model to
+ * receive empty placeholders instead of actual tool outputs, producing
+ * low-quality summaries. Leave tool-result management to compression's
+ * turn-based splitting.
  */
 interface CompactConfig {
   /** Which content types to clear from history. */
@@ -60,6 +66,9 @@ interface ContextChefOptions {
   /**
    * Mechanical compaction before LLM compression.
    * Clears specified content types (tool-result, thinking) at zero LLM cost.
+   *
+   * When combined with `compress`, use `clear: ['thinking']` only.
+   * See CompactConfig for details.
    */
   compact?: CompactConfig;
   /**

package/dist/index.d.mts CHANGED

@@ -27,6 +27,12 @@ interface CompressOptions {
 /**
  * Mechanical compaction options — zero LLM cost.
  * Runs before LLM-based compression to reduce token usage at no cost.
+ *
+ * **Important:** When using together with `compress`, only clear `thinking`.
+ * Clearing `tool-result` before compression causes the compression model to
+ * receive empty placeholders instead of actual tool outputs, producing
+ * low-quality summaries. Leave tool-result management to compression's
+ * turn-based splitting.
  */
 interface CompactConfig {
   /** Which content types to clear from history. */
@@ -60,6 +66,9 @@ interface ContextChefOptions {
   /**
    * Mechanical compaction before LLM compression.
    * Clears specified content types (tool-result, thinking) at zero LLM cost.
+   *
+   * When combined with `compress`, use `clear: ['thinking']` only.
+   * See CompactConfig for details.
    */
   compact?: CompactConfig;
   /**
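
The new doc comments ask callers not to clear `tool-result` in `compact` when `compress` is also configured. A caller could check for that combination before building the model. This is a minimal sketch under stated assumptions: the `SketchOptions` and `SketchClearTarget` shapes are inferred from the README examples in this diff rather than copied from the real declarations, and `clearsToolResultsBeforeCompress` is a hypothetical helper, not an export of the package.

```typescript
// Hypothetical shapes inferred from the README examples; the full
// CompactConfig and ContextChefOptions declarations are not visible in this diff.
type SketchContentType = 'thinking' | 'tool-result';
type SketchClearTarget =
  | SketchContentType
  | { target: SketchContentType; keepRecent?: number };

interface SketchOptions {
  compress?: unknown;
  compact?: { clear: SketchClearTarget[] };
}

// Returns true when the configuration would clear tool results while
// LLM-based compression is also enabled, the combination the docs warn about.
function clearsToolResultsBeforeCompress(options: SketchOptions): boolean {
  if (options.compress === undefined || options.compact === undefined) {
    return false;
  }
  return options.compact.clear.some(
    (entry) => (typeof entry === 'string' ? entry : entry.target) === 'tool-result',
  );
}
```

A check like this could run once at startup and log a warning, which keeps the guidance in the doc comment from depending on every caller having read it.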

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@context-chef/ai-sdk-middleware",
-  "version": "1.0.4",
+  "version": "1.0.6",
   "type": "module",
   "main": "./dist/index.cjs",
   "module": "./dist/index.mjs",
@@ -39,7 +39,7 @@
     "url": "https://github.com/MyPrototypeWhat/context-chef/issues"
   },
   "dependencies": {
-    "@context-chef/core": "3.0.1"
+    "@context-chef/core": "3.0.2"
   },
   "peerDependencies": {
     "@ai-sdk/provider": ">=3",