@context-chef/ai-sdk-middleware 1.0.4 → 1.0.6

package/README.md CHANGED
@@ -8,6 +8,8 @@
 
  [Vercel AI SDK](https://ai-sdk.dev) middleware powered by [context-chef](https://github.com/MyPrototypeWhat/context-chef). Transparent history compression, tool result truncation, and token budget management — zero code changes required.
 
+ ![Quick Start](../../@context-chef_ai-sdk-middleware.png)
+
  ## Installation
 
  ```bash
@@ -27,7 +29,7 @@ const model = withContextChef(openai('gpt-4o'), {
    truncate: { threshold: 5000, headChars: 500, tailChars: 1000 },
  });
 
- // Everything below stays exactly the same — works with generateText and streamText
+ // Everything below stays exactly the same — works with generateText, streamText, and ToolLoopAgent
  const result = await generateText({
    model,
    messages: conversationHistory,
@@ -98,6 +100,36 @@ const model = withContextChef(openai('gpt-4o'), {
 
  The middleware automatically extracts token usage from `generateText` and `streamText` responses and feeds it back to the compression engine. No manual `reportTokenUsage()` calls needed.
 
+ ### Compact (Mechanical Clearing)
+
+ Zero-LLM-cost content clearing for thinking blocks and tool results:
+
+ ```typescript
+ const model = withContextChef(openai('gpt-4o'), {
+   contextWindow: 128_000,
+   compact: {
+     clear: ['thinking', { target: 'tool-result', keepRecent: 5 }],
+   },
+ });
+ ```
+
+ > **Important: compact + compress interaction**
+ >
+ > When using `compact` together with `compress`, only clear `thinking` in compact:
+ >
+ > ```typescript
+ > const model = withContextChef(openai('gpt-4o'), {
+ >   contextWindow: 128_000,
+ >   compact: { clear: ['thinking'] }, // thinking only
+ >   compress: { model: openai('gpt-4o-mini') },
+ > });
+ > ```
+ >
+ > Clearing `tool-result` before compression causes the compression model to receive
+ > empty placeholders instead of actual tool outputs, producing low-quality summaries.
+ > Compression's turn-based splitting already manages history length — use `compact`
+ > for `tool-result` clearing only when `compress` is **not** configured.
+
  ## API
 
  ### `withContextChef(model, options)`
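Reviewer note on the new `compact` option: the hunk above documents the configuration but not the effect on history. The following is a minimal sketch of what mechanical clearing with `clear: ['thinking', { target: 'tool-result', keepRecent: 5 }]` could look like. Everything here is an assumption made for illustration: the message shape (`SketchMessage`, `SketchPart`), the function name `compactSketch`, the `[cleared]` placeholder text, and the choice to drop thinking parts entirely are hypothetical and are not taken from the package source.

```typescript
// Hypothetical message shape for illustration only; not the context-chef IR.
type SketchPart =
  | { type: 'text'; text: string }
  | { type: 'thinking'; text: string }
  | { type: 'tool-result'; output: string };

interface SketchMessage {
  role: 'user' | 'assistant' | 'tool';
  parts: SketchPart[];
}

// Mirrors the shape of the `clear` entries shown in the README hunk.
type SketchClearRule = 'thinking' | { target: 'tool-result'; keepRecent: number };

// Assumed placeholder text; the real placeholder is not shown in this diff.
const CLEARED = '[cleared]';

function compactSketch(messages: SketchMessage[], clear: SketchClearRule[]): SketchMessage[] {
  let clearThinking = false;
  let keepRecent: number | undefined;
  for (const rule of clear) {
    if (rule === 'thinking') clearThinking = true;
    else keepRecent = rule.keepRecent;
  }

  // Count tool results so the newest `keepRecent` can be left untouched.
  let total = 0;
  for (const m of messages) {
    for (const p of m.parts) {
      if (p.type === 'tool-result') total++;
    }
  }
  const clearCount = keepRecent === undefined ? 0 : Math.max(0, total - keepRecent);

  let seen = 0;
  return messages.map((m) => {
    const parts: SketchPart[] = [];
    for (const p of m.parts) {
      if (p.type === 'thinking') {
        // Thinking parts are dropped when 'thinking' is listed (an assumption).
        if (!clearThinking) parts.push(p);
      } else if (p.type === 'tool-result') {
        // Older tool results are replaced by a placeholder; recent ones are kept.
        const cleared: SketchPart = { type: 'tool-result', output: CLEARED };
        parts.push(seen < clearCount ? cleared : p);
        seen++;
      } else {
        parts.push(p);
      }
    }
    return { role: m.role, parts };
  });
}
```

The sketch also makes the README warning concrete: once older tool results have been replaced by a placeholder, a summarization model running afterwards would only see the placeholder, not the original output.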
@@ -123,6 +155,7 @@ const wrappedModel = withContextChef(model, options);
  | `truncate.headChars` | `number` | No | Characters to preserve from start (default: `0`) |
  | `truncate.tailChars` | `number` | No | Characters to preserve from end (default: `1000`) |
  | `truncate.storage` | `VFSStorageAdapter` | No | Storage adapter to persist original content before truncation |
+ | `compact` | `CompactConfig` | No | Mechanical content clearing (thinking, tool-result). When combined with `compress`, use `clear: ['thinking']` only |
  | `tokenizer` | `(msgs) => number` | No | Custom tokenizer for precise counting |
  | `onCompress` | `(summary, count) => void` | No | Hook called after compression |
 
@@ -155,7 +188,7 @@ const aiSdkPrompt = toAISDK(irMessages);
  ## How It Works
 
  ```
- generateText / streamText ({ model: wrappedModel, messages })
+ generateText / streamText / ToolLoopAgent ({ model: wrappedModel, messages })
  |
  v
  transformParams (before LLM call)
@@ -186,3 +219,7 @@ The middleware covers the most common use case: transparent compression and trun
  ## License
 
  ISC
+
+ ---
+
+ [Chinese documentation](./README.zh-CN.md)
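Reviewer note on the `truncate` options that appear in the README hunks above (`threshold`, `headChars`, `tailChars`, with documented defaults of `0` and `1000`): a minimal sketch of head/tail truncation under those options. The function name `truncateSketch`, the interface name, and the marker text inserted between head and tail are hypothetical; the package's actual marker and any `context://vfs/` reference it emits are not shown in this diff.

```typescript
// Hypothetical options type that mirrors the documented option names.
interface SketchTruncateOptions {
  threshold: number;   // character count above which truncation applies
  headChars?: number;  // documented default: 0
  tailChars?: number;  // documented default: 1000
}

function truncateSketch(text: string, options: SketchTruncateOptions): string {
  const { threshold, headChars = 0, tailChars = 1000 } = options;

  // Short outputs pass through unchanged.
  if (text.length <= threshold) return text;

  // If head and tail together already cover the text, nothing would be saved.
  if (headChars + tailChars >= text.length) return text;

  const head = text.slice(0, headChars);
  const tail = tailChars > 0 ? text.slice(text.length - tailChars) : '';
  const omitted = text.length - headChars - tailChars;

  // Assumed marker format, for illustration only.
  return `${head}\n[... ${omitted} characters truncated ...]\n${tail}`;
}
```

With `{ threshold: 5000, headChars: 500, tailChars: 1000 }` as in the Quick Start, a 6000-character tool output would keep its first 500 and last 1000 characters and lose the 4500 in between.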
@@ -0,0 +1,222 @@
+ # @context-chef/ai-sdk-middleware
+
+ [![npm version](https://img.shields.io/npm/v/@context-chef/ai-sdk-middleware.svg)](https://www.npmjs.com/package/@context-chef/ai-sdk-middleware)
+ [![npm downloads](https://img.shields.io/npm/dm/@context-chef/ai-sdk-middleware.svg)](https://www.npmjs.com/package/@context-chef/ai-sdk-middleware)
+ [![License](https://img.shields.io/npm/l/@context-chef/ai-sdk-middleware.svg)](https://github.com/MyPrototypeWhat/context-chef/blob/main/LICENSE)
+ [![TypeScript](https://img.shields.io/badge/TypeScript-5.9-blue.svg)](https://www.typescriptlang.org/)
+ [![AI SDK](https://img.shields.io/badge/AI%20SDK-v6-black.svg)](https://ai-sdk.dev)
+
+ [Vercel AI SDK](https://ai-sdk.dev) middleware built on [context-chef](https://github.com/MyPrototypeWhat/context-chef). Transparent history compression, tool result truncation, and token budget management, with no code changes required.
+
+ [English](./README.md)
+
+ ![Quick Start](../../@context-chef_ai-sdk-middleware.png)
+
+ ## Installation
+
+ ```bash
+ npm install @context-chef/ai-sdk-middleware ai
+ ```
+
+ ## Quick Start
+
+ ```typescript
+ import { withContextChef } from '@context-chef/ai-sdk-middleware';
+ import { openai } from '@ai-sdk/openai';
+ import { generateText } from 'ai';
+
+ const model = withContextChef(openai('gpt-4o'), {
+   contextWindow: 128_000,
+   compress: { model: openai('gpt-4o-mini') },
+   truncate: { threshold: 5000, headChars: 500, tailChars: 1000 },
+ });
+
+ // The code below stays exactly the same; it works with generateText, streamText, and ToolLoopAgent
+ const result = await generateText({
+   model,
+   messages: conversationHistory,
+   tools: myTools,
+ });
+ ```
+
+ That's it. History compression, tool result truncation, and token budget tracking happen automatically in the background.
+
+ ## Features
+
+ ### History Compression
+
+ When the conversation exceeds the token budget, the middleware compresses older messages to free up space. There are two modes:
+
+ **Without a compression model** (default): older messages are discarded and only recent messages are kept:
+
+ ```typescript
+ const model = withContextChef(openai('gpt-4o'), {
+   contextWindow: 128_000,
+ });
+ ```
+
+ **With a compression model**: older messages are replaced by a summary generated by a cheap model:
+
+ ```typescript
+ const model = withContextChef(openai('gpt-4o'), {
+   contextWindow: 128_000,
+   compress: {
+     model: openai('gpt-4o-mini'), // cheap model used for summarization
+     preserveRatio: 0.8, // keep 80% of the context for recent messages
+   },
+ });
+ ```
+
+ ### Tool Result Truncation
+
+ Large tool outputs (terminal logs, API responses) are truncated automatically while preserving the head and tail:
+
+ ```typescript
+ const model = withContextChef(openai('gpt-4o'), {
+   contextWindow: 128_000,
+   truncate: {
+     threshold: 5000, // truncate when output exceeds 5000 characters
+     headChars: 500, // keep the first 500 characters
+     tailChars: 1000, // keep the last 1000 characters
+   },
+ });
+ ```
+
+ Optionally, persist the original content through a storage adapter so the LLM can later retrieve it on demand via a `context://vfs/` URI:
+
+ ```typescript
+ import { FileSystemAdapter } from '@context-chef/core';
+
+ const model = withContextChef(openai('gpt-4o'), {
+   contextWindow: 128_000,
+   truncate: {
+     threshold: 5000,
+     headChars: 500,
+     tailChars: 1000,
+     storage: new FileSystemAdapter('.context_vfs'), // or a custom database adapter
+   },
+ });
+ ```
+
+ ### Token Budget Tracking
+
+ The middleware automatically extracts token usage from `generateText` and `streamText` responses and feeds it back to the compression engine. No manual `reportTokenUsage()` calls are needed.
+
+ ### Compact (Mechanical Clearing)
+
+ Zero-LLM-cost clearing of thinking blocks and tool results:
+
+ ```typescript
+ const model = withContextChef(openai('gpt-4o'), {
+   contextWindow: 128_000,
+   compact: {
+     clear: ['thinking', { target: 'tool-result', keepRecent: 5 }],
+   },
+ });
+ ```
+
+ > **Note: compact + compress interaction**
+ >
+ > When using `compact` together with `compress`, only clear `thinking` in compact:
+ >
+ > ```typescript
+ > const model = withContextChef(openai('gpt-4o'), {
+ >   contextWindow: 128_000,
+ >   compact: { clear: ['thinking'] }, // thinking only
+ >   compress: { model: openai('gpt-4o-mini') },
+ > });
+ > ```
+ >
+ > Clearing `tool-result` before compression causes the compression model to receive empty placeholders instead of actual tool outputs,
+ > producing low-quality summaries. Compression's turn grouping already manages history length, so use `compact` to clear
+ > `tool-result` only when `compress` is **not** configured.
+
+ ## API
+
+ ### `withContextChef(model, options)`
+
+ Wraps an AI SDK language model with the context-chef middleware.
+
+ ```typescript
+ import { withContextChef } from '@context-chef/ai-sdk-middleware';
+
+ const wrappedModel = withContextChef(model, options);
+ ```
+
+ **Parameters:**
+
+ | Option | Type | Required | Description |
+ |---|---|---|---|
+ | `contextWindow` | `number` | Yes | The model's context window size (in tokens) |
+ | `compress` | `CompressOptions` | No | Enables LLM-based compression |
+ | `compress.model` | `LanguageModelV3` | Yes (if compress is enabled) | Cheap model used for summarization |
+ | `compress.preserveRatio` | `number` | No | Ratio of context to preserve (default: `0.8`) |
+ | `truncate` | `TruncateOptions` | No | Enables tool result truncation |
+ | `truncate.threshold` | `number` | Yes (if truncate is enabled) | Character count that triggers truncation |
+ | `truncate.headChars` | `number` | No | Characters to preserve from the start (default: `0`) |
+ | `truncate.tailChars` | `number` | No | Characters to preserve from the end (default: `1000`) |
+ | `truncate.storage` | `VFSStorageAdapter` | No | Storage adapter to persist original content before truncation |
+ | `compact` | `CompactConfig` | No | Mechanical content clearing (thinking, tool-result). When combined with `compress`, use `clear: ['thinking']` only |
+ | `tokenizer` | `(msgs) => number` | No | Custom tokenizer for precise counting |
+ | `onCompress` | `(summary, count) => void` | No | Callback invoked after compression completes |
+
+ **Returns:** `LanguageModelV3`, the wrapped model, usable as a drop-in replacement anywhere the original model is used.
+
+ ### `createMiddleware(options)`
+
+ Creates a raw `LanguageModelMiddleware` that you can apply yourself via `wrapLanguageModel`:
+
+ ```typescript
+ import { createMiddleware } from '@context-chef/ai-sdk-middleware';
+ import { wrapLanguageModel } from 'ai';
+
+ const middleware = createMiddleware({ contextWindow: 128_000 });
+ const model = wrapLanguageModel({ model: openai('gpt-4o'), middleware });
+ ```
+
+ ### `fromAISDK(prompt)` / `toAISDK(messages)`
+
+ Low-level converters between the AI SDK `LanguageModelV3Prompt` and the context-chef `Message[]` IR. Useful when working with AI SDK message formats directly through context-chef modules.
+
+ ```typescript
+ import { fromAISDK, toAISDK } from '@context-chef/ai-sdk-middleware';
+
+ const irMessages = fromAISDK(aiSdkPrompt);
+ // ... process with context-chef modules ...
+ const aiSdkPrompt = toAISDK(irMessages);
+ ```
+
+ ## How It Works
+
+ ```
+ generateText / streamText / ToolLoopAgent ({ model: wrappedModel, messages })
+ |
+ v
+ transformParams (before LLM call)
+ 1. Truncate large tool results (if configured)
+ - Optionally persist original content to the storage adapter
+ 2. AI SDK messages -> context-chef IR
+ 3. Run Janitor compression (if over token budget)
+ 4. Convert back to AI SDK messages
+ |
+ v
+ LLM call executes normally
+ |
+ v
+ wrapGenerate / wrapStream (after LLM call)
+ 5. Extract token usage from the response
+ 6. Feed it back to Janitor for the next call's budget check
+ |
+ v
+ Result returned unchanged
+ ```
+
+ The middleware is **stateful**: it tracks token usage across calls to decide when compression is needed. Create one wrapped model instance per conversation/session.
+
+ ## Need More Control?
+
+ The middleware covers the most common use case: transparent compression and truncation. For advanced features such as dynamic state injection, tool namespaces, memory, or snapshot/restore, use [`@context-chef/core`](https://www.npmjs.com/package/@context-chef/core) directly.
+
+ ## License
+
+ ISC
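Reviewer note on the default compression mode described in the added README (no compression model configured, older messages discarded, recent ones kept): a minimal sketch of that idea, assuming per-message token counts are already known. The names `CountedMessage` and `keepRecentWithinBudget` are hypothetical, and the real Janitor logic, including its turn-based splitting and the exact role of `preserveRatio`, is not visible in this diff and is not modeled here.

```typescript
// Hypothetical shape: a message id plus a precomputed token count.
interface CountedMessage {
  id: string;
  tokens: number;
}

// Keeps the longest suffix of the history whose total token count fits the budget.
function keepRecentWithinBudget(messages: CountedMessage[], budget: number): CountedMessage[] {
  let used = 0;
  let kept = 0;

  // Walk from newest to oldest and stop at the first message that would overflow.
  for (const message of [...messages].reverse()) {
    if (used + message.tokens > budget) break;
    used += message.tokens;
    kept++;
  }

  return messages.slice(messages.length - kept);
}
```

A caller could derive the budget from the configured window, for example `contextWindow * 0.8`, but whether the package computes it that way is an assumption.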
package/dist/index.d.cts CHANGED
@@ -27,6 +27,12 @@ interface CompressOptions {
  /**
   * Mechanical compaction options — zero LLM cost.
   * Runs before LLM-based compression to reduce token usage at no cost.
+  *
+  * **Important:** When using together with `compress`, only clear `thinking`.
+  * Clearing `tool-result` before compression causes the compression model to
+  * receive empty placeholders instead of actual tool outputs, producing
+  * low-quality summaries. Leave tool-result management to compression's
+  * turn-based splitting.
   */
  interface CompactConfig {
    /** Which content types to clear from history. */
@@ -60,6 +66,9 @@ interface ContextChefOptions {
    /**
     * Mechanical compaction before LLM compression.
     * Clears specified content types (tool-result, thinking) at zero LLM cost.
+    *
+    * When combined with `compress`, use `clear: ['thinking']` only.
+    * See CompactConfig for details.
     */
    compact?: CompactConfig;
    /**
package/dist/index.d.mts CHANGED
@@ -27,6 +27,12 @@ interface CompressOptions {
  /**
   * Mechanical compaction options — zero LLM cost.
   * Runs before LLM-based compression to reduce token usage at no cost.
+  *
+  * **Important:** When using together with `compress`, only clear `thinking`.
+  * Clearing `tool-result` before compression causes the compression model to
+  * receive empty placeholders instead of actual tool outputs, producing
+  * low-quality summaries. Leave tool-result management to compression's
+  * turn-based splitting.
   */
  interface CompactConfig {
    /** Which content types to clear from history. */
@@ -60,6 +66,9 @@ interface ContextChefOptions {
    /**
     * Mechanical compaction before LLM compression.
     * Clears specified content types (tool-result, thinking) at zero LLM cost.
+    *
+    * When combined with `compress`, use `clear: ['thinking']` only.
+    * See CompactConfig for details.
     */
    compact?: CompactConfig;
    /**
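Reviewer note on the new doc comments: the guidance ("when combined with `compress`, use `clear: ['thinking']` only") is documentation, and nothing in this diff indicates that the package enforces it at runtime. A project that wants a guard could add a small check of its own. The sketch below is one possible shape; `SketchOptions`, `SketchClearEntry`, and `compactCompressWarnings` are hypothetical names, and the assumption that a bare `'tool-result'` string is a valid `clear` entry is not confirmed by the typings shown here.

```typescript
// Hypothetical, simplified view of the options; not the package's real types.
type SketchClearTarget = 'thinking' | 'tool-result';
type SketchClearEntry =
  | SketchClearTarget
  | { target: SketchClearTarget; keepRecent?: number };

interface SketchOptions {
  compact?: { clear: SketchClearEntry[] };
  compress?: { model: unknown };
}

// Returns one warning per `clear` entry that conflicts with the documented guidance.
function compactCompressWarnings(options: SketchOptions): string[] {
  // The guidance only applies when both features are configured together.
  if (!options.compact || !options.compress) return [];

  const warnings: string[] = [];
  for (const entry of options.compact.clear) {
    const target = typeof entry === 'string' ? entry : entry.target;
    if (target !== 'thinking') {
      warnings.push(
        `compact.clear includes '${target}' while compress is configured; ` +
          'the compression model may receive placeholders instead of tool outputs',
      );
    }
  }
  return warnings;
}
```

Returning warnings instead of throwing keeps the check advisory, which matches the tone of the doc comment.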
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "@context-chef/ai-sdk-middleware",
-   "version": "1.0.4",
+   "version": "1.0.6",
    "type": "module",
    "main": "./dist/index.cjs",
    "module": "./dist/index.mjs",
@@ -39,7 +39,7 @@
      "url": "https://github.com/MyPrototypeWhat/context-chef/issues"
    },
    "dependencies": {
-     "@context-chef/core": "3.0.1"
+     "@context-chef/core": "3.0.2"
    },
    "peerDependencies": {
      "@ai-sdk/provider": ">=3",