pi-cache-optimizer 2.4.6 → 2.4.8

Files changed (4)
  1. package/README.md +64 -4
  2. package/README.zh-CN.md +48 -1
  3. package/index.ts +1203 -141
  4. package/package.json +1 -1
package/README.md CHANGED
@@ -56,6 +56,36 @@ This release keeps the original DeepSeek behavior and adds read-only stats adapt
56
56
  | Microsoft Phi | Model id/name contains `phi-` prefix, or pattern `phi` with safe boundaries | `Phi cache` | Pi-normalized usage, or raw OpenAI-shaped fields when visible |
57
57
  | AI21 Jamba | Model id/name contains `jamba` or `ai21` | `Jamba cache` | Pi-normalized usage, or raw OpenAI-shaped fields when visible |
58
58
  | Upstage Solar | Model id/name contains `solar` or `upstage` | `Solar cache` | Pi-normalized usage, or raw OpenAI-shaped fields when visible |
59
+ | Perplexity / Sonar | Model id/name contains `sonar`, `perplexity`, or pattern `pplx` with safe boundaries | `Sonar cache` | Pi-normalized usage, or raw OpenAI-shaped fields when visible |
60
+ | Amazon Nova | Model id/name contains `amazon-nova`, or pattern `nova` with safe boundaries | `Nova cache` | Pi-normalized usage, or raw OpenAI-shaped fields when visible |
61
+ | Reka | Model id/name contains `reka` | `Reka cache` | Pi-normalized usage, or raw OpenAI-shaped fields when visible |
62
+ | Falcon / TII | Model id/name contains `falcon` or `tiiuae` (not bare `tii`) | `Falcon cache` | Pi-normalized usage, or raw OpenAI-shaped fields when visible |
63
+ | Databricks DBRX | Model id/name contains `dbrx` or `databricks` | `DBRX cache` | Pi-normalized usage, or raw OpenAI-shaped fields when visible |
64
+ | MosaicML MPT | Model id/name contains `mosaicml`, `mpt-` prefix, or pattern `mpt` with safe boundaries | `MPT cache` | Pi-normalized usage, or raw OpenAI-shaped fields when visible |
65
+ | StableLM / Stability AI | Model id/name contains `stablelm`, `stable-lm`, or `stability-ai` | `StableLM cache` | Pi-normalized usage, or raw OpenAI-shaped fields when visible |
66
+ | BAAI / Aquila | Model id/name contains `aquila` or `baai` | `Aquila cache` | Pi-normalized usage, or raw OpenAI-shaped fields when visible |
67
+ | LG EXAONE | Model id/name contains `exaone` | `EXAONE cache` | Pi-normalized usage, or raw OpenAI-shaped fields when visible |
68
+ | Naver HyperCLOVA X | Model id/name contains `hyperclova` or `clova-x` (conservative, not bare `clova`/`naver`) | `HyperCLOVA cache` | Pi-normalized usage, or raw OpenAI-shaped fields when visible |
69
+ | Aleph Alpha Luminous | Model id/name contains `luminous`, `aleph-alpha`, or pattern `aleph` with safe boundaries | `Luminous cache` | Pi-normalized usage, or raw OpenAI-shaped fields when visible |
70
+ | Nous / Hermes / OpenHermes | Model id/name contains `nous`, `hermes`, or `openhermes` | `Hermes cache` | Pi-normalized usage, or raw OpenAI-shaped fields when visible |
71
+ | IBM Granite | Model id/name contains `granite` or `ibm-granite` | `Granite cache` | Pi-normalized usage, or raw OpenAI-shaped fields when visible |
72
+ | Snowflake Arctic | Model id/name contains `snowflake-arctic`, or safe-boundary pattern `arctic` | `Arctic cache` | Pi-normalized usage, or raw OpenAI-shaped fields when visible |
73
+ | Huawei Pangu / 盘古 | Model id/name contains `pangu`, `pan-gu`, `盘古`, or `huawei-pangu` | `Pangu cache` | Pi-normalized usage, or raw OpenAI-shaped fields when visible |
74
+ | SenseTime SenseNova / 商汤 | Model id/name contains `sensenova`, `sense-nova`, `sensechat`, or `商汤` | `SenseNova cache` | Pi-normalized usage, or raw OpenAI-shaped fields when visible |
75
+ | 360 Zhinao / 智脑 | Model id/name contains `360gpt`, `360-gpt`, `zhinao`, or `智脑` (no bare `360`) | `Zhinao cache` | Pi-normalized usage, or raw OpenAI-shaped fields when visible |
76
+ | OpenBMB MiniCPM | Model id/name contains `minicpm`, `mini-cpm`, or `openbmb` | `MiniCPM cache` | Pi-normalized usage, or raw OpenAI-shaped fields when visible |
77
+ | XVERSE | Model id/name contains `xverse` | `XVERSE cache` | Pi-normalized usage, or raw OpenAI-shaped fields when visible |
78
+ | OrionStar Orion | Model id/name contains `orionstar`, `orion-star`, or safe-boundary pattern `orion` | `Orion cache` | Pi-normalized usage, or raw OpenAI-shaped fields when visible |
79
+ | OpenChat | Model id/name contains `openchat` | `OpenChat cache` | Pi-normalized usage, or raw OpenAI-shaped fields when visible |
80
+ | Vicuna | Model id/name contains `vicuna` | `Vicuna cache` | Pi-normalized usage, or raw OpenAI-shaped fields when visible |
81
+ | WizardLM / WizardCoder | Model id/name contains `wizardlm`, `wizard-lm`, `wizardcoder`, or `wizard-coder` | `Wizard cache` | Pi-normalized usage, or raw OpenAI-shaped fields when visible |
82
+ | Zephyr | Model id/name contains `zephyr` | `Zephyr cache` | Pi-normalized usage, or raw OpenAI-shaped fields when visible |
83
+ | Dolphin | Model id/name contains `dolphin` | `Dolphin cache` | Pi-normalized usage, or raw OpenAI-shaped fields when visible |
84
+ | OpenOrca | Model id/name contains `openorca` or `open-orca` | `OpenOrca cache` | Pi-normalized usage, or raw OpenAI-shaped fields when visible |
85
+ | Starling | Model id/name contains `starling` | `Starling cache` | Pi-normalized usage, or raw OpenAI-shaped fields when visible |
86
+ | BLOOM / BigScience | Model id/name contains `bloom` or `bigscience` | `BLOOM cache` | Pi-normalized usage, or raw OpenAI-shaped fields when visible |
87
+ | RWKV | Model id/name contains `rwkv` | `RWKV cache` | Pi-normalized usage, or raw OpenAI-shaped fields when visible |
88
+ | Cohere Aya | Model id/name contains `aya-expanse`, or safe-boundary pattern `aya` (avoid `maya`/`payara`) | `Aya cache` | Pi-normalized usage, or raw OpenAI-shaped fields when visible |
59
89
  | Anthropic / Claude | Model id/name contains `anthropic` or `claude` | `Claude cache` | Pi-normalized usage, or raw `cache_read_input_tokens`, `cache_creation_input_tokens`, `input_tokens` |
60
90
  | Gemini / Vertex | Model id/name contains `gemini` or `vertex` | `Gemini cache` | Pi-normalized usage, or raw Gemini/Vertex cached-content token metadata when visible |
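
The "safe boundaries" wording in the rows above can be illustrated with a minimal sketch. It assumes a boundary means "not directly adjacent to an ASCII letter or digit". The `AdapterRule` shape, the `RULES` subset, and the `detectLabel` helper are hypothetical names chosen for illustration; the package's actual implementation in `index.ts` is not shown in this part of the diff and may differ.

```typescript
// Hypothetical rule shape, not taken from the package source.
interface AdapterRule {
  label: string;
  substrings: string[]; // plain "contains" matches
  bounded: string[];    // tokens that must not touch a letter or digit
}

// A small subset of the table rows, for illustration only.
const RULES: AdapterRule[] = [
  { label: "Nova cache", substrings: ["amazon-nova"], bounded: ["nova"] },
  { label: "SenseNova cache", substrings: ["sensenova", "sense-nova", "sensechat", "商汤"], bounded: [] },
  { label: "Aya cache", substrings: ["aya-expanse"], bounded: ["aya"] },
  { label: "Falcon cache", substrings: ["falcon", "tiiuae"], bounded: [] },
];

// Escape regex metacharacters so a token is always matched literally.
function escapeRegExp(token: string): string {
  return token.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// True when `token` occurs with no ASCII letter or digit directly before or after it.
function matchesBounded(text: string, token: string): boolean {
  const re = new RegExp(`(^|[^a-z0-9])${escapeRegExp(token)}([^a-z0-9]|$)`);
  return re.test(text);
}

// Returns the footer label for a model id/name, or undefined when no rule matches.
function detectLabel(idOrName: string): string | undefined {
  const text = idOrName.toLowerCase();
  for (const rule of RULES) {
    if (rule.substrings.some((s) => text.includes(s))) return rule.label;
    if (rule.bounded.some((t) => matchesBounded(text, t))) return rule.label;
  }
  return undefined;
}
```

Under these assumptions, `detectLabel("nova-lite-v1")` yields `Nova cache`, while `detectLabel("SenseNova-V6")` skips the bounded `nova` token (it is preceded by a letter) and falls through to `SenseNova cache`. Likewise `aya-23-8b` matches `Aya cache`, but `maya-7b` and `payara` match nothing.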
61
91
 
@@ -309,10 +339,40 @@ Edit ~/.pi/agent/models.json -> providers["otokapi"] -> compat (same level as ba
309
339
 
310
340
  ### `/cache-optimizer compat`
311
341
 
312
- Shows only the compat suggestion for the active model, including file path,
313
- provider path, and copyable JSON snippet. When no flags are missing, it shows
314
- `✅ Compat fully configured.` if the model is an applicable third-party proxy,
315
- or `ℹ️ Compat check not applicable for this model.` otherwise.
342
+ Shows the compat suggestion for the active model, including file path,
343
+ provider path, and copyable JSON snippet. When no compat flags are missing,
344
+ it shows `✅ Compat fully configured.` if the model is an applicable
345
+ third-party proxy, or `ℹ️ Compat check not applicable for this model.`
346
+ otherwise.
347
+
348
+ When the model is routed through a known router/channel proxy (OpenRouter,
349
+ Vercel AI Gateway, LiteLLM, OneAPI/NewAPI/VoAPI, or a generic third-party
350
+ OpenAI-compatible proxy), both `doctor` and `compat` subcommands append
351
+ router/channel diagnostics with targeted recommendations.
352
+
353
+ ### Router/channel diagnostics
354
+
355
+ For models using OpenAI-compatible APIs (`openai-completions` or
356
+ `openai-responses`) through a non-official base URL, the extension detects
357
+ common router/channel proxy patterns from `provider`, `baseUrl`, and `compat`
358
+ metadata:
359
+
360
+ | Profile | Detection | Recommendation |
361
+ |---------|-----------|----------------|
362
+ | **OpenRouter** | baseUrl or provider contains `openrouter`/`openrouter.ai` | Fix the upstream provider with `openRouterRouting.only` or `.order` in compat |
363
+ | **Vercel AI Gateway** | baseUrl contains `ai-gateway.vercel.sh` or provider contains `vercel` | Fix the upstream with `vercelGatewayRouting.only` or `.order` in compat |
364
+ | **LiteLLM / OneAPI / NewAPI / VoAPI** | baseUrl or provider contains `litellm`, `oneapi`/`one-api`, `newapi`/`new-api`, `voapi`/`vo-api` | Ensure sticky session routing, forward `prompt_cache_key` + session-affinity headers, return cache usage fields |
365
+ | **Generic third-party proxy** | Any `openai-completions` model with non-official base URL not matching above | General guidance: verify single-upstream routing, forward `prompt_cache_key` + session-affinity headers, return cache usage |
366
+
367
+ These diagnostics are **advisory only**. They do not participate in adapter
368
+ selection (still id/name-only), prompt_cache_key injection, footer stats, or
369
+ any automated configuration changes. Detection uses only metadata exposed by
370
+ Pi (`provider`, `api`, `baseUrl`, `compat`) — no API keys, prompts, payloads,
371
+ headers, or model outputs are read or exposed.
372
+
373
+ Official OpenAI (`api.openai.com`) and custom transports (`kiro-api`,
374
+ `anthropic-messages`, `bedrock-converse-stream`) are excluded from router/
375
+ channel diagnostics.
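
The detection order described in the router/channel table can be sketched roughly as follows. This is an illustration under stated assumptions, not the package's code: `ModelMeta`, `RouterProfile`, and `detectRouterProfile` are hypothetical names, an empty `baseUrl` is assumed to mean the official endpoint, and both `openai-completions` and `openai-responses` are treated as eligible for the generic profile even though the table row names only `openai-completions`.

```typescript
// Hypothetical types; the real metadata shape exposed by Pi may differ.
type RouterProfile =
  | "openrouter"
  | "vercel-ai-gateway"
  | "litellm-oneapi-family"
  | "generic-proxy";

interface ModelMeta {
  provider: string;
  api: string;
  baseUrl: string;
}

const OPENAI_COMPATIBLE_APIS = ["openai-completions", "openai-responses"];

// Returns an advisory profile, or undefined when diagnostics do not apply.
// Only provider, api, and baseUrl are inspected; no keys, prompts, or headers.
function detectRouterProfile(meta: ModelMeta): RouterProfile | undefined {
  // Custom transports (e.g. anthropic-messages) are excluded up front.
  if (!OPENAI_COMPATIBLE_APIS.includes(meta.api)) return undefined;

  const provider = meta.provider.toLowerCase();
  const baseUrl = meta.baseUrl.toLowerCase();

  // Assumption: an empty base URL means the official default endpoint.
  if (baseUrl === "" || baseUrl.includes("api.openai.com")) return undefined;

  const either = (needle: string): boolean =>
    provider.includes(needle) || baseUrl.includes(needle);

  if (either("openrouter")) return "openrouter";
  if (baseUrl.includes("ai-gateway.vercel.sh") || provider.includes("vercel")) {
    return "vercel-ai-gateway";
  }

  const family = ["litellm", "oneapi", "one-api", "newapi", "new-api", "voapi", "vo-api"];
  if (family.some((needle) => either(needle))) return "litellm-oneapi-family";

  // Any other non-official OpenAI-compatible endpoint gets general guidance.
  return "generic-proxy";
}
```

Checking the specific profiles before the generic fallback keeps the recommendation as targeted as possible. Because the result is only a label, it fits the advisory-only role described above and has no effect on adapter selection.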
316
376
 
317
377
  ### Security
318
378
 
package/README.zh-CN.md CHANGED
@@ -59,6 +59,36 @@
59
59
  | Microsoft Phi | model id/name 包含 `phi-` 前缀,或安全边界内 `phi` 模式 | `Phi cache` | Pi 归一化 usage,或可见 OpenAI 形状字段 |
60
60
  | AI21 Jamba | model id/name 包含 `jamba` 或 `ai21` | `Jamba cache` | Pi 归一化 usage,或可见 OpenAI 形状字段 |
61
61
  | Upstage Solar | model id/name 包含 `solar` 或 `upstage` | `Solar cache` | Pi 归一化 usage,或可见 OpenAI 形状字段 |
62
+ | Perplexity / Sonar | model id/name 包含 `sonar`、`perplexity`,或安全边界内 `pplx` 模式 | `Sonar cache` | Pi 归一化 usage,或可见 OpenAI 形状字段 |
63
+ | Amazon Nova | model id/name 包含 `amazon-nova`,或安全边界内 `nova` 模式 | `Nova cache` | Pi 归一化 usage,或可见 OpenAI 形状字段 |
64
+ | Reka | model id/name 包含 `reka` | `Reka cache` | Pi 归一化 usage,或可见 OpenAI 形状字段 |
65
+ | Falcon / TII | model id/name 包含 `falcon` 或 `tiiuae`(不含裸 `tii`) | `Falcon cache` | Pi 归一化 usage,或可见 OpenAI 形状字段 |
66
+ | Databricks DBRX | model id/name 包含 `dbrx` 或 `databricks` | `DBRX cache` | Pi 归一化 usage,或可见 OpenAI 形状字段 |
67
+ | MosaicML MPT | model id/name 包含 `mosaicml`、`mpt-` 前缀,或安全边界内 `mpt` 模式 | `MPT cache` | Pi 归一化 usage,或可见 OpenAI 形状字段 |
68
+ | StableLM / Stability AI | model id/name 包含 `stablelm`、`stable-lm` 或 `stability-ai` | `StableLM cache` | Pi 归一化 usage,或可见 OpenAI 形状字段 |
69
+ | BAAI / Aquila | model id/name 包含 `aquila` 或 `baai` | `Aquila cache` | Pi 归一化 usage,或可见 OpenAI 形状字段 |
70
+ | LG EXAONE | model id/name 包含 `exaone` | `EXAONE cache` | Pi 归一化 usage,或可见 OpenAI 形状字段 |
71
+ | Naver HyperCLOVA X | model id/name 包含 `hyperclova` 或 `clova-x`(保守检测,不含裸 `clova`/`naver`) | `HyperCLOVA cache` | Pi 归一化 usage,或可见 OpenAI 形状字段 |
72
+ | Aleph Alpha Luminous | model id/name 包含 `luminous`、`aleph-alpha`,或安全边界内 `aleph` 模式 | `Luminous cache` | Pi 归一化 usage,或可见 OpenAI 形状字段 |
73
+ | Nous / Hermes / OpenHermes | model id/name 包含 `nous`、`hermes` 或 `openhermes` | `Hermes cache` | Pi 归一化 usage,或可见 OpenAI 形状字段 |
74
+ | IBM Granite | model id/name 包含 `granite` 或 `ibm-granite` | `Granite cache` | Pi 归一化 usage,或可见 OpenAI 形状字段 |
75
+ | Snowflake Arctic | model id/name 包含 `snowflake-arctic`,或安全边界内 `arctic` 模式 | `Arctic cache` | Pi 归一化 usage,或可见 OpenAI 形状字段 |
76
+ | Huawei Pangu / 盘古 | model id/name 包含 `pangu`、`pan-gu`、`盘古` 或 `huawei-pangu` | `Pangu cache` | Pi 归一化 usage,或可见 OpenAI 形状字段 |
77
+ | SenseTime SenseNova / 商汤 | model id/name 包含 `sensenova`、`sense-nova`、`sensechat` 或 `商汤` | `SenseNova cache` | Pi 归一化 usage,或可见 OpenAI 形状字段 |
78
+ | 360 Zhinao / 智脑 | model id/name 包含 `360gpt`、`360-gpt`、`zhinao` 或 `智脑`(不含裸 `360`) | `Zhinao cache` | Pi 归一化 usage,或可见 OpenAI 形状字段 |
79
+ | OpenBMB MiniCPM | model id/name 包含 `minicpm`、`mini-cpm` 或 `openbmb` | `MiniCPM cache` | Pi 归一化 usage,或可见 OpenAI 形状字段 |
80
+ | XVERSE | model id/name 包含 `xverse` | `XVERSE cache` | Pi 归一化 usage,或可见 OpenAI 形状字段 |
81
+ | OrionStar Orion | model id/name 包含 `orionstar`、`orion-star`,或安全边界内 `orion` 模式 | `Orion cache` | Pi 归一化 usage,或可见 OpenAI 形状字段 |
82
+ | OpenChat | model id/name 包含 `openchat` | `OpenChat cache` | Pi 归一化 usage,或可见 OpenAI 形状字段 |
83
+ | Vicuna | model id/name 包含 `vicuna` | `Vicuna cache` | Pi 归一化 usage,或可见 OpenAI 形状字段 |
84
+ | WizardLM / WizardCoder | model id/name 包含 `wizardlm`、`wizard-lm`、`wizardcoder` 或 `wizard-coder` | `Wizard cache` | Pi 归一化 usage,或可见 OpenAI 形状字段 |
85
+ | Zephyr | model id/name 包含 `zephyr` | `Zephyr cache` | Pi 归一化 usage,或可见 OpenAI 形状字段 |
86
+ | Dolphin | model id/name 包含 `dolphin` | `Dolphin cache` | Pi 归一化 usage,或可见 OpenAI 形状字段 |
87
+ | OpenOrca | model id/name 包含 `openorca` 或 `open-orca` | `OpenOrca cache` | Pi 归一化 usage,或可见 OpenAI 形状字段 |
88
+ | Starling | model id/name 包含 `starling` | `Starling cache` | Pi 归一化 usage,或可见 OpenAI 形状字段 |
89
+ | BLOOM / BigScience | model id/name 包含 `bloom` 或 `bigscience` | `BLOOM cache` | Pi 归一化 usage,或可见 OpenAI 形状字段 |
90
+ | RWKV | model id/name 包含 `rwkv` | `RWKV cache` | Pi 归一化 usage,或可见 OpenAI 形状字段 |
91
+ | Cohere Aya | model id/name 包含 `aya-expanse`,或安全边界内 `aya` 模式(避免 `maya`/`payara`) | `Aya cache` | Pi 归一化 usage,或可见 OpenAI 形状字段 |
62
92
  | Anthropic / Claude | model id/name 包含 `anthropic` 或 `claude` | `Claude cache` | Pi 归一化 usage,或可见 raw 字段 `cache_read_input_tokens`、`cache_creation_input_tokens`、`input_tokens` |
63
93
  | Gemini / Vertex | model id/name 包含 `gemini` 或 `vertex` | `Gemini cache` | Pi 归一化 usage,或可见 Gemini/Vertex cached-content token metadata |
64
94
 
@@ -295,7 +325,24 @@ Edit ~/.pi/agent/models.json -> providers["otokapi"] -> compat (same level as ba
295
325
 
296
326
  ### `/cache-optimizer compat`
297
327
 
298
- 仅显示当前模型的 compat 建议,包括文件路径、provider 路径和可复制 JSON 片段。当没有缺失标志时,如果模型是适用的第三方代理则显示 `✅ Compat fully configured.`,否则显示 `ℹ️ Compat check not applicable for this model.`。
328
+ 显示当前模型的 compat 建议,包括文件路径、provider 路径和可复制 JSON 片段。当没有缺失的 compat 标志时,如果模型是适用的第三方代理则显示 `✅ Compat fully configured.`,否则显示 `ℹ️ Compat check not applicable for this model.`。
329
+
330
+ 当模型通过已知的路由器/通道代理(OpenRouter、Vercel AI Gateway、LiteLLM、OneAPI/NewAPI/VoAPI 或通用第三方 OpenAI-compatible 代理)时,`doctor` 和 `compat` 子命令都会附加路由/通道诊断信息和建议。
331
+
332
+ ### 路由/通道诊断
333
+
334
+ 对于通过非官方 base URL 使用 OpenAI-compatible API(`openai-completions` 或 `openai-responses`)的模型,扩展会从 `provider`、`baseUrl` 和 `compat` 元数据中检测常见的路由/通道代理模式:
335
+
336
+ | 类型 | 检测方式 | 建议 |
337
+ |------|----------|------|
338
+ | **OpenRouter** | baseUrl 或 provider 包含 `openrouter`/`openrouter.ai` | 在 compat 中用 `openRouterRouting.only` 或 `.order` 固定上游 provider |
339
+ | **Vercel AI Gateway** | baseUrl 包含 `ai-gateway.vercel.sh` 或 provider 包含 `vercel` | 在 compat 中用 `vercelGatewayRouting.only` 或 `.order` 固定上游 |
340
+ | **LiteLLM / OneAPI / NewAPI / VoAPI** | baseUrl 或 provider 包含 `litellm`、`oneapi`/`one-api`、`newapi`/`new-api`、`voapi`/`vo-api` | 确保每 session 固定路由,转发 `prompt_cache_key` + session-affinity headers,返回缓存用量字段 |
341
+ | **通用第三方代理** | 任何非官方 base URL 的 `openai-completions` 模型,且不匹配以上类型 | 通用建议:验证单上游路由、转发 `prompt_cache_key` + session-affinity headers、返回缓存用量 |
342
+
343
+ 这些诊断**仅用于建议**。它们不参与 adapter selection(仍基于 id/name)、不参与 `prompt_cache_key` 注入、不参与 footer 统计、也不做任何自动化配置修改。检测仅使用 Pi 暴露的元数据(`provider`、`api`、`baseUrl`、`compat`),不会读取或暴露 API key、prompt、payload、headers 或模型输出。
344
+
345
+ 官方 OpenAI(`api.openai.com`)和 custom transport(`kiro-api`、`anthropic-messages`、`bedrock-converse-stream`)不会触发路由/通道诊断。
299
346
 
300
347
  ### 安全说明
301
348