pi-cache-optimizer 2.6.6 → 2.6.7
- package/README.md +111 -1
- package/README.zh-CN.md +111 -1
- package/package.json +1 -1
package/README.md
CHANGED
|
@@ -20,6 +20,7 @@ Pi extension for improving provider-side KV / prompt cache hit rates. It keeps s
|
|
|
20
20
|
- [Anthropic adaptive thinking models](#anthropic-adaptive-thinking-models)
|
|
21
21
|
- [Auto-repair with `/cache-optimizer fix`](#auto-repair-with-cache-optimizer-fix)
|
|
22
22
|
- [Footer stats](#footer-stats)
|
|
23
|
+
- [For router / virtual-channel extension authors](#for-router--virtual-channel-extension-authors)
|
|
23
24
|
- [Uninstall](#uninstall)
|
|
24
25
|
- [Verify effect](#verify-effect)
|
|
25
26
|
- [License](#license)
|
|
@@ -51,6 +52,8 @@ pi remove npm:pi-deepseek-cache-optimizer && pi install npm:pi-cache-optimizer
|
|
|
51
52
|
|
|
52
53
|
Run `/reload` in Pi after install/update/remove so extension hooks refresh.
|
|
53
54
|
|
|
55
|
+
On Pi 0.79.7 and newer, `pi update` updates Pi itself only. To update installed Pi packages such as this extension, run `pi update --extensions` (packages only) or `pi update --all` (Pi + packages).
|
|
56
|
+
|
|
54
57
|
## Commands
|
|
55
58
|
|
|
56
59
|
| Command | Effect |
|
|
@@ -213,7 +216,7 @@ If only one model should change, use `modelOverrides`:
|
|
|
213
216
|
|
|
214
217
|
Stats are read-only local counters stored at `~/.pi/agent/pi-cache-optimizer-stats.json` and scoped by Pi session + provider/model. They contain only dates and numeric counters — no API keys, prompts, payloads, headers, responses, or model output.
|
|
215
218
|
|
|
216
|
-
|
|
219
|
+
Pi 0.79+ also includes a built-in footer `CH` marker for the latest prompt cache hit rate. This extension complements that marker with persisted, provider/model/session-scoped counters plus proxy compat diagnostics.
|
|
217
220
|
|
|
218
221
|
Example footer:
|
|
219
222
|
|
|
@@ -227,6 +230,113 @@ Supported footer labels include: DS, Claude, OpenAI, Gemini, Kimi, Qwen, GLM, Mi
|
|
|
227
230
|
|
|
228
231
|
Adapter selection uses only model id/name (plus assistant message model/name on message end). Generic OpenAI-shaped APIs are not treated as OpenAI-family unless the model id/name matches a supported family.
|
|
229
232
|
|
|
233
|
+
## For router / virtual-channel extension authors
|
|
234
|
+
|
|
235
|
+
If your Pi extension provides a virtual routing provider (for example `router/auto`, `router/smart`, or a profile/channel that forwards to a real upstream), this extension can show cache stats for the real upstream provider/model instead of the virtual shell. Integration is optional, versioned, and does **not** require importing this package.
|
|
236
|
+
|
|
237
|
+
### Minimum integration: final assistant message metadata
|
|
238
|
+
|
|
239
|
+
For seamless final cache-stat attribution, relay the real upstream identity on completed assistant messages:
|
|
240
|
+
|
|
241
|
+
```ts
|
|
242
|
+
{
|
|
243
|
+
role: "assistant",
|
|
244
|
+
provider: "anthropic", // real upstream provider
|
|
245
|
+
responseModel: "claude-opus-4-8", // or model: "..."
|
|
246
|
+
api: "anthropic-messages", // upstream Pi API id when known
|
|
247
|
+
usage: {
|
|
248
|
+
input: 1200, // Pi-normalized uncached input tokens, if available
|
|
249
|
+
cacheRead: 8000, // tokens read from provider prompt cache
|
|
250
|
+
cacheWrite: 500, // tokens newly written to provider prompt cache
|
|
251
|
+
},
|
|
252
|
+
}
|
|
253
|
+
```
|
|
254
|
+
|
|
255
|
+
`message_end` treats these assistant-message fields as authoritative. If `provider` + `model`/`responseModel` + cache usage are present, stats update the upstream bucket even when the active model is still `router/auto`. If upstream usage does not expose cache fields, leave them absent/zero; this extension will not fake cache hits.
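As a rough illustration of the relay described above, a router that forwards to Anthropic might map the upstream usage block onto the assistant-message fields as follows. This is a sketch under stated assumptions: `UpstreamUsage`, `AssistantMeta`, and `toAssistantMeta` are hypothetical names, the Pi-side field names are taken from the example above, and the upstream field names follow the usage object of the Anthropic Messages API. Cache counters are copied only when the upstream reported them.

```ts
// Hypothetical shapes for this sketch. Pi-side field names mirror the README
// example; the upstream side follows the Anthropic Messages API usage object.
type UpstreamUsage = {
  input_tokens?: number;
  cache_read_input_tokens?: number | null;
  cache_creation_input_tokens?: number | null;
};

type AssistantMeta = {
  role: "assistant";
  provider: string;
  responseModel: string;
  api?: string;
  usage: { input?: number; cacheRead?: number; cacheWrite?: number };
};

// Copies a counter only when the upstream actually reported a number, so
// missing cache usage stays absent instead of becoming a synthetic value.
function toAssistantMeta(
  provider: string,
  responseModel: string,
  upstream: UpstreamUsage,
  api?: string,
): AssistantMeta {
  const usage: AssistantMeta["usage"] = {};
  if (typeof upstream.input_tokens === "number") usage.input = upstream.input_tokens;
  if (typeof upstream.cache_read_input_tokens === "number") {
    usage.cacheRead = upstream.cache_read_input_tokens;
  }
  if (typeof upstream.cache_creation_input_tokens === "number") {
    usage.cacheWrite = upstream.cache_creation_input_tokens;
  }
  const meta: AssistantMeta = { role: "assistant", provider, responseModel, usage };
  if (api !== undefined) meta.api = api;
  return meta;
}

// Upstream that reports cache counters.
const withCache = toAssistantMeta(
  "anthropic",
  "claude-opus-4-8",
  { input_tokens: 1200, cache_read_input_tokens: 8000, cache_creation_input_tokens: 500 },
  "anthropic-messages",
);

// Upstream that exposes no cache fields: they are left out rather than guessed.
const withoutCache = toAssistantMeta("deepseek", "deepseek-v4", { input_tokens: 900 });
```

Leaving the cache fields off entirely when the upstream is silent keeps the stats under-reported rather than inflated, which matches the guidance above.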
|
|
256
|
+
|
|
257
|
+
### Optional: live route registry for pre-response UX
|
|
258
|
+
|
|
259
|
+
Final message metadata is enough for post-response stats. For pre-response flows — footer display before the first response, `/cache-optimizer doctor`, `/cache-optimizer compat`, `/cache-optimizer reset`, and OpenAI-compatible `prompt_cache_key` fallback — register a live route adapter under `Symbol.for("pi.routing.registry.v1")`.
|
|
260
|
+
|
|
261
|
+
Protocol shape:
|
|
262
|
+
|
|
263
|
+
```ts
|
|
264
|
+
type PiRouteSnapshot = {
|
|
265
|
+
virtualProvider: string;
|
|
266
|
+
virtualModelId: string;
|
|
267
|
+
provider: string;
|
|
268
|
+
modelId: string;
|
|
269
|
+
api?: string;
|
|
270
|
+
canonicalModelId?: string;
|
|
271
|
+
routeLabel?: string;
|
|
272
|
+
status?: "planned" | "trying" | "selected" | "success" | "failed";
|
|
273
|
+
sessionIdHash?: string;
|
|
274
|
+
requestId?: string;
|
|
275
|
+
timestamp: number;
|
|
276
|
+
};
|
|
277
|
+
|
|
278
|
+
type PiRouterAdapterV1 = {
|
|
279
|
+
virtualProvider: string;
|
|
280
|
+
resolveActiveRoute(
|
|
281
|
+
virtualModelId: string,
|
|
282
|
+
hint?: { sessionIdHash?: string; requestId?: string },
|
|
283
|
+
): PiRouteSnapshot | undefined;
|
|
284
|
+
resolveCandidateRoutes?(virtualModelId: string): PiRouteSnapshot[];
|
|
285
|
+
subscribe?(listener: (event: PiRouteSnapshot) => void): () => void;
|
|
286
|
+
};
|
|
287
|
+
```
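One possible way to back this protocol is a small in-memory tracker that the router updates whenever it picks an upstream. The sketch below is an assumption-laden illustration, not part of the package: `createRouteTracker`, `record`, and `RouteInput` are hypothetical names, and the session-hash check is one reasonable policy rather than documented behavior.

```ts
// Subset of the snapshot type above, repeated so the sketch is self-contained.
type PiRouteSnapshot = {
  virtualProvider: string;
  virtualModelId: string;
  provider: string;
  modelId: string;
  api?: string;
  status?: "planned" | "trying" | "selected" | "success" | "failed";
  sessionIdHash?: string;
  requestId?: string;
  timestamp: number;
};
type RouteListener = (event: PiRouteSnapshot) => void;
type RouteInput = Omit<PiRouteSnapshot, "virtualProvider" | "timestamp">;

// Hypothetical helper: remembers the latest route per virtual model id.
function createRouteTracker(virtualProvider: string) {
  const active = new Map<string, PiRouteSnapshot>();
  const listeners = new Set<RouteListener>();

  return {
    virtualProvider,
    // Called by the router's own code each time it selects an upstream.
    record(route: RouteInput): PiRouteSnapshot {
      const snapshot: PiRouteSnapshot = { ...route, virtualProvider, timestamp: Date.now() };
      active.set(snapshot.virtualModelId, snapshot);
      listeners.forEach((listener) => listener(snapshot));
      return snapshot;
    },
    resolveActiveRoute(
      virtualModelId: string,
      hint?: { sessionIdHash?: string; requestId?: string },
    ): PiRouteSnapshot | undefined {
      const snapshot = active.get(virtualModelId);
      if (!snapshot) return undefined;
      // Assumed policy: do not report a route recorded for a different session.
      const known = snapshot.sessionIdHash;
      const asked = hint?.sessionIdHash;
      if (known !== undefined && asked !== undefined && known !== asked) return undefined;
      return snapshot;
    },
    subscribe(listener: RouteListener): () => void {
      listeners.add(listener);
      return () => {
        listeners.delete(listener);
      };
    },
  };
}

const tracker = createRouteTracker("router");
```

The returned object has the `virtualProvider`, `resolveActiveRoute`, and `subscribe` members of `PiRouterAdapterV1`, so it could be passed to `registerRouter` directly; `record` is an extra method for the router's internal use.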
|
|
288
|
+
|
|
289
|
+
Registration pattern:
|
|
290
|
+
|
|
291
|
+
```ts
|
|
292
|
+
const ROUTING = Symbol.for("pi.routing.registry.v1");
|
|
293
|
+
const registry = (globalThis as Record<symbol, unknown>)[ROUTING] as
|
|
294
|
+
| { version: 1; registerRouter(adapter: PiRouterAdapterV1): () => void }
|
|
295
|
+
| undefined;
|
|
296
|
+
|
|
297
|
+
registry?.registerRouter({
|
|
298
|
+
virtualProvider: "router",
|
|
299
|
+
resolveActiveRoute(virtualModelId, hint) {
|
|
300
|
+
return {
|
|
301
|
+
virtualProvider: "router",
|
|
302
|
+
virtualModelId,
|
|
303
|
+
provider: "deepseek",
|
|
304
|
+
modelId: "deepseek-v4",
|
|
305
|
+
api: "openai-completions",
|
|
306
|
+
sessionIdHash: hint?.sessionIdHash,
|
|
307
|
+
timestamp: Date.now(),
|
|
308
|
+
};
|
|
309
|
+
},
|
|
310
|
+
});
|
|
311
|
+
```
|
|
312
|
+
|
|
313
|
+
Do not overwrite an existing registry. If your extension loads before this optimizer, either retry registration on `session_start`, or create a registry with the same V1 shape, but only when no registry exists yet.
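The retry option could be sketched as an idempotent registrar that is safe to call at load time and again from a `session_start` handler. `createRegistrar`, `tryRegister`, `dispose`, and `isRegistryV1` are hypothetical names for this sketch; only `version` and `registerRouter` are taken from the registry shape shown above, and how the handler is wired into Pi is left out because it is not covered here.

```ts
// Reduced types so the sketch is self-contained.
type RouteSnapshot = {
  virtualProvider: string;
  virtualModelId: string;
  provider: string;
  modelId: string;
  timestamp: number;
};
type RouterAdapter = {
  virtualProvider: string;
  resolveActiveRoute(virtualModelId: string): RouteSnapshot | undefined;
};
type RegistryV1 = { version: 1; registerRouter(adapter: RouterAdapter): () => void };

const ROUTING = Symbol.for("pi.routing.registry.v1");
const globals = globalThis as unknown as Record<symbol, unknown>;

// Runtime check so an unexpected occupant of the symbol slot is never called.
function isRegistryV1(value: unknown): value is RegistryV1 {
  if (typeof value !== "object" || value === null) return false;
  const candidate = value as { version?: unknown; registerRouter?: unknown };
  return candidate.version === 1 && typeof candidate.registerRouter === "function";
}

// Hypothetical helper: registers at most once, however often it is called.
function createRegistrar(adapter: RouterAdapter) {
  let unregister: (() => void) | undefined;
  return {
    tryRegister(): boolean {
      if (unregister) return true; // already registered, nothing to do
      const registry = globals[ROUTING];
      if (!isRegistryV1(registry)) return false; // not available yet, retry later
      unregister = registry.registerRouter(adapter);
      return true;
    },
    dispose(): void {
      if (unregister) unregister();
      unregister = undefined;
    },
  };
}

const registrar = createRegistrar({
  virtualProvider: "router",
  resolveActiveRoute: (virtualModelId) => ({
    virtualProvider: "router",
    virtualModelId,
    provider: "deepseek",
    modelId: "deepseek-v4",
    timestamp: Date.now(),
  }),
});

// First attempt at extension load; call registrar.tryRegister() again on session_start.
const registeredAtLoad = registrar.tryRegister();
```

Because the registrar never writes to the symbol slot, it cannot overwrite a registry that another extension installed.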
|
|
314
|
+
|
|
315
|
+
### Optional: query-scoped cache hints
|
|
316
|
+
|
|
317
|
+
Routers that forward to an inner Pi request path can read query-scoped hints from `Symbol.for("pi.cache.hints.v1")`:
|
|
318
|
+
|
|
319
|
+
```ts
|
|
320
|
+
const CACHE_HINTS = Symbol.for("pi.cache.hints.v1");
|
|
321
|
+
const hints = (globalThis as Record<symbol, any>)[CACHE_HINTS]?.getHints?.({
|
|
322
|
+
sessionIdHash,
|
|
323
|
+
virtualProvider: "router",
|
|
324
|
+
virtualModelId: "auto",
|
|
325
|
+
upstreamProvider: "deepseek",
|
|
326
|
+
upstreamModelId: "deepseek-v4",
|
|
327
|
+
api: "openai-completions",
|
|
328
|
+
});
|
|
329
|
+
```
|
|
330
|
+
|
|
331
|
+
When the query matches the current session/route, `hints` may contain `systemPrompt`, `promptCacheKey`, and `cacheRetention: "long"`. Treat these as advisory and sensitive: do not log them, do not expose prompt text, and do not overwrite an existing request-level `prompt_cache_key` / `promptCacheKey`.
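The "do not overwrite" rule can be illustrated with a small merge helper. This is a sketch only: `CacheHints` lists just the fields named above, while `OutgoingRequest` and `withPromptCacheKey` are hypothetical names for whatever request object the router forwards.

```ts
// Fields named in the paragraph above; all of them are optional and advisory.
type CacheHints = {
  systemPrompt?: string;
  promptCacheKey?: string;
  cacheRetention?: "long";
};

// Hypothetical request shape for this sketch.
type OutgoingRequest = {
  model: string;
  prompt_cache_key?: string;
  promptCacheKey?: string;
};

// Applies the hinted key only when the caller has not set one in either spelling.
// Returns a new object instead of mutating the input, and logs nothing.
function withPromptCacheKey(
  request: OutgoingRequest,
  hints: CacheHints | undefined,
): OutgoingRequest {
  const key = hints?.promptCacheKey;
  if (!key) return request;
  if (request.prompt_cache_key !== undefined || request.promptCacheKey !== undefined) {
    return request;
  }
  return { ...request, prompt_cache_key: key };
}

// No request-level key: the hint is used.
const hinted = withPromptCacheKey({ model: "deepseek-v4" }, { promptCacheKey: "session-abc" });

// Request-level key present: it wins over the hint.
const preserved = withPromptCacheKey(
  { model: "deepseek-v4", prompt_cache_key: "caller-key" },
  { promptCacheKey: "session-abc" },
);
```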
|
|
332
|
+
|
|
333
|
+
### Security and correctness rules
|
|
334
|
+
|
|
335
|
+
- Do not import `pi-cache-optimizer`; use `Symbol.for(...)` discovery only.
|
|
336
|
+
- Do not expose API keys, prompts, payloads, headers, response bodies, or model output in route snapshots or logs.
|
|
337
|
+
- Use assistant-message metadata for final attribution; live registry data is advisory and may be stale by response time.
|
|
338
|
+
- Report usage truthfully. When upstream cache usage is missing, let it show as 0 (under-reported) rather than as synthetic hits.
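One way to follow the logging rule above is an allow-list: copy only the routing fields of `PiRouteSnapshot` before anything reaches a log. The helper below is a hypothetical sketch (`LOGGABLE_KEYS` and `toLoggableSnapshot` are not part of the package), and whether identifiers such as `requestId` are acceptable in your logs remains your own call.

```ts
// Field names taken from the PiRouteSnapshot type above.
const LOGGABLE_KEYS = [
  "virtualProvider",
  "virtualModelId",
  "provider",
  "modelId",
  "api",
  "canonicalModelId",
  "routeLabel",
  "status",
  "sessionIdHash",
  "requestId",
  "timestamp",
] as const;

type LoggableSnapshot = Partial<Record<(typeof LOGGABLE_KEYS)[number], string | number>>;

// Keeps only allow-listed string/number fields; everything else is dropped,
// including nested objects such as headers.
function toLoggableSnapshot(input: Record<string, unknown>): LoggableSnapshot {
  const out: LoggableSnapshot = {};
  for (const key of LOGGABLE_KEYS) {
    const value = input[key];
    if (typeof value === "string" || typeof value === "number") out[key] = value;
  }
  return out;
}

const loggable = toLoggableSnapshot({
  virtualProvider: "router",
  virtualModelId: "auto",
  provider: "deepseek",
  modelId: "deepseek-v4",
  timestamp: 1700000000000,
  apiKey: "sk-not-for-logs", // dropped: not on the allow-list
  headers: { authorization: "Bearer not-for-logs" }, // dropped: not on the allow-list
});
```

An allow-list fails closed: a field added to the internal route object later stays out of the logs until someone adds it deliberately.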
|
|
339
|
+
|
|
230
340
|
## Uninstall
|
|
231
341
|
|
|
232
342
|
```bash
|
package/README.zh-CN.md
CHANGED
|
@@ -20,6 +20,7 @@
|
|
|
20
20
|
- [Anthropic adaptive thinking models](#anthropic-adaptive-thinking-模型)
|
|
21
21
|
- [Auto-repair with `/cache-optimizer fix`](#使用-cache-optimizer-fix-自动修复)
|
|
22
22
|
- [Footer stats](#footer-统计)
|
|
23
|
+
- [Guide for router / virtual-channel extension authors](#router--virtual-channel-扩展作者指南)
|
|
23
24
|
- [Uninstall](#卸载)
|
|
24
25
|
- [Verify effect](#验证效果)
|
|
25
26
|
- [License](#license)
|
|
@@ -51,6 +52,8 @@ pi remove npm:pi-deepseek-cache-optimizer && pi install npm:pi-cache-optimizer
|
|
|
51
52
|
|
|
52
53
|
After installing, updating, or removing, run `/reload` in Pi so extension hooks refresh.
|
|
53
54
|
|
|
55
|
+
On Pi 0.79.7 and later, `pi update` updates only Pi itself by default. To update installed Pi packages (including this extension), run `pi update --extensions` (packages only) or `pi update --all` (Pi and packages together).
|
|
56
|
+
|
|
54
57
|
## Commands
|
|
55
58
|
|
|
56
59
|
| Command | Effect |
|
|
@@ -213,7 +216,7 @@ Minimal provider-level override:
|
|
|
213
216
|
|
|
214
217
|
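Stats are read-only local counters stored at `~/.pi/agent/pi-cache-optimizer-stats.json` and isolated by Pi session + provider/model. The file contains only dates and numeric counters; it does not contain API keys, prompts, payloads, headers, responses, or model output.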
|
|
215
218
|
|
|
216
|
-
|
|
219
|
+
Pi 0.79+ has a built-in footer `CH` marker that shows the latest prompt cache hit rate. This extension adds persisted provider/model/session-scoped counters and proxy compat diagnostics on top of it.
|
|
217
220
|
|
|
218
221
|
Example footer:
|
|
219
222
|
|
|
@@ -227,6 +230,113 @@ OpenAI cache 3/10 · 0.002M/0.005M tok (40%) ⚠️ compat
|
|
|
227
230
|
|
|
228
231
|
Adapter selection looks only at model id/name (plus the assistant message's model/name at message_end). Merely using an OpenAI-shaped API is not treated as OpenAI-family unless the model id/name matches a supported family.
|
|
229
232
|
|
|
233
|
+
## Guide for router / virtual-channel extension authors
|
|
234
|
+
|
|
235
|
+
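If your Pi extension provides a virtual routing provider (for example `router/auto`, `router/smart`, or a profile/channel that forwards to a real upstream), this extension can show cache stats for the real upstream provider/model instead of attributing them to the virtual shell. Integration is optional and versioned, and does **not require importing this package**.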
|
|
236
|
+
|
|
237
|
+
### Minimum integration: final assistant message metadata
|
|
238
|
+
|
|
239
|
+
For seamless final cache-stat attribution, pass through the real upstream identity on completed assistant messages:
|
|
240
|
+
|
|
241
|
+
```ts
|
|
242
|
+
{
|
|
243
|
+
role: "assistant",
|
|
244
|
+
provider: "anthropic", // real upstream provider
|
|
245
|
+
responseModel: "claude-opus-4-8", // or model: "..."
|
|
246
|
+
api: "anthropic-messages", // upstream Pi API id when known
|
|
247
|
+
usage: {
|
|
248
|
+
input: 1200, // Pi-normalized uncached input tokens, if available
|
|
249
|
+
cacheRead: 8000, // tokens read from provider prompt cache
|
|
250
|
+
cacheWrite: 500, // tokens newly written to provider prompt cache in this request
|
|
251
|
+
},
|
|
252
|
+
}
|
|
253
|
+
```
|
|
254
|
+
|
|
255
|
+
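`message_end` treats these assistant-message fields as the authoritative source. As long as `provider` + `model`/`responseModel` + cache usage are present, stats update the real upstream bucket even when the active model is still `router/auto`. If upstream usage has no cache fields, leave them absent or 0; this extension will not fake cache hits.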
|
|
256
|
+
|
|
257
|
+
### Optional: live route registry for pre-response UX
|
|
258
|
+
|
|
259
|
+
Final message metadata is enough for post-response stats. To support pre-response flows (footer display before the first response, `/cache-optimizer doctor`, `/cache-optimizer compat`, `/cache-optimizer reset`, and the OpenAI-compatible `prompt_cache_key` fallback), register a live route adapter under `Symbol.for("pi.routing.registry.v1")`.
|
|
260
|
+
|
|
261
|
+
Protocol shape:
|
|
262
|
+
|
|
263
|
+
```ts
|
|
264
|
+
type PiRouteSnapshot = {
|
|
265
|
+
virtualProvider: string;
|
|
266
|
+
virtualModelId: string;
|
|
267
|
+
provider: string;
|
|
268
|
+
modelId: string;
|
|
269
|
+
api?: string;
|
|
270
|
+
canonicalModelId?: string;
|
|
271
|
+
routeLabel?: string;
|
|
272
|
+
status?: "planned" | "trying" | "selected" | "success" | "failed";
|
|
273
|
+
sessionIdHash?: string;
|
|
274
|
+
requestId?: string;
|
|
275
|
+
timestamp: number;
|
|
276
|
+
};
|
|
277
|
+
|
|
278
|
+
type PiRouterAdapterV1 = {
|
|
279
|
+
virtualProvider: string;
|
|
280
|
+
resolveActiveRoute(
|
|
281
|
+
virtualModelId: string,
|
|
282
|
+
hint?: { sessionIdHash?: string; requestId?: string },
|
|
283
|
+
): PiRouteSnapshot | undefined;
|
|
284
|
+
resolveCandidateRoutes?(virtualModelId: string): PiRouteSnapshot[];
|
|
285
|
+
subscribe?(listener: (event: PiRouteSnapshot) => void): () => void;
|
|
286
|
+
};
|
|
287
|
+
```
|
|
288
|
+
|
|
289
|
+
Registration pattern:
|
|
290
|
+
|
|
291
|
+
```ts
|
|
292
|
+
const ROUTING = Symbol.for("pi.routing.registry.v1");
|
|
293
|
+
const registry = (globalThis as Record<symbol, unknown>)[ROUTING] as
|
|
294
|
+
| { version: 1; registerRouter(adapter: PiRouterAdapterV1): () => void }
|
|
295
|
+
| undefined;
|
|
296
|
+
|
|
297
|
+
registry?.registerRouter({
|
|
298
|
+
virtualProvider: "router",
|
|
299
|
+
resolveActiveRoute(virtualModelId, hint) {
|
|
300
|
+
return {
|
|
301
|
+
virtualProvider: "router",
|
|
302
|
+
virtualModelId,
|
|
303
|
+
provider: "deepseek",
|
|
304
|
+
modelId: "deepseek-v4",
|
|
305
|
+
api: "openai-completions",
|
|
306
|
+
sessionIdHash: hint?.sessionIdHash,
|
|
307
|
+
timestamp: Date.now(),
|
|
308
|
+
};
|
|
309
|
+
},
|
|
310
|
+
});
|
|
311
|
+
```
|
|
312
|
+
|
|
313
|
+
Do not overwrite an existing registry. If your extension loads before this optimizer, retry registration on `session_start`, or create the same V1 registry shape only when no registry exists.
|
|
314
|
+
|
|
315
|
+
### Optional: query-scoped cache hints
|
|
316
|
+
|
|
317
|
+
Routers that forward to an inner Pi request path can read query-scoped hints from `Symbol.for("pi.cache.hints.v1")`:
|
|
318
|
+
|
|
319
|
+
```ts
|
|
320
|
+
const CACHE_HINTS = Symbol.for("pi.cache.hints.v1");
|
|
321
|
+
const hints = (globalThis as Record<symbol, any>)[CACHE_HINTS]?.getHints?.({
|
|
322
|
+
sessionIdHash,
|
|
323
|
+
virtualProvider: "router",
|
|
324
|
+
virtualModelId: "auto",
|
|
325
|
+
upstreamProvider: "deepseek",
|
|
326
|
+
upstreamModelId: "deepseek-v4",
|
|
327
|
+
api: "openai-completions",
|
|
328
|
+
});
|
|
329
|
+
```
|
|
330
|
+
|
|
331
|
+
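When the query matches the current session/route, `hints` may contain `systemPrompt`, `promptCacheKey`, and `cacheRetention: "long"`. These hints are advisory and potentially sensitive: do not log them, do not expose prompt text, and do not overwrite an existing request-level `prompt_cache_key` / `promptCacheKey`.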
|
|
332
|
+
|
|
333
|
+
### Security and correctness rules
|
|
334
|
+
|
|
335
|
+
- Do not import `pi-cache-optimizer`; use only the `Symbol.for(...)` discovery protocol.
|
|
336
|
+
- Do not expose API keys, prompts, payloads, headers, response bodies, or model output in route snapshots or logs.
|
|
337
|
+
- Use assistant-message metadata for final attribution; the live registry is advisory only and may be stale by the time the response completes.
|
|
338
|
+
- Keep usage truthful. Missing cache usage should show as 0 or under-reported, not as synthetic hits.
|
|
339
|
+
|
|
230
340
|
## Uninstall
|
|
231
341
|
|
|
232
342
|
```bash
|
package/package.json
CHANGED