pi-cache-optimizer 2.4.7 → 2.4.8
- package/README.md +34 -4
- package/README.zh-CN.md +18 -1
- package/index.ts +213 -13
- package/package.json +1 -1
package/README.md
CHANGED
@@ -339,10 +339,40 @@ Edit ~/.pi/agent/models.json -> providers["otokapi"] -> compat (same level as ba
 
 ### `/cache-optimizer compat`
 
-Shows
-provider path, and copyable JSON snippet. When no flags are missing,
-`✅ Compat fully configured.` if the model is an applicable
-or `ℹ️ Compat check not applicable for this model.`
+Shows the compat suggestion for the active model, including file path,
+provider path, and copyable JSON snippet. When no compat flags are missing,
+it shows `✅ Compat fully configured.` if the model is an applicable
+third-party proxy, or `ℹ️ Compat check not applicable for this model.`
+otherwise.
+
+When the model is routed through a known router/channel proxy (OpenRouter,
+Vercel AI Gateway, LiteLLM, OneAPI/NewAPI/VoAPI, or a generic third-party
+OpenAI-compatible proxy), both `doctor` and `compat` subcommands append
+router/channel diagnostics with targeted recommendations.
+
+### Router/channel diagnostics
+
+For models using OpenAI-compatible APIs (`openai-completions` or
+`openai-responses`) through a non-official base URL, the extension detects
+common router/channel proxy patterns from `provider`, `baseUrl`, and `compat`
+metadata:
+
+| Profile | Detection | Recommendation |
+|---------|-----------|----------------|
+| **OpenRouter** | baseUrl or provider contains `openrouter`/`openrouter.ai` | Fix the upstream provider with `openRouterRouting.only` or `.order` in compat |
+| **Vercel AI Gateway** | baseUrl contains `ai-gateway.vercel.sh` or provider contains `vercel` | Fix the upstream with `vercelGatewayRouting.only` or `.order` in compat |
+| **LiteLLM / OneAPI / NewAPI / VoAPI** | baseUrl or provider contains `litellm`, `oneapi`/`one-api`, `newapi`/`new-api`, `voapi`/`vo-api` | Ensure sticky session routing, forward `prompt_cache_key` + session-affinity headers, return cache usage fields |
+| **Generic third-party proxy** | Any `openai-completions` model with non-official base URL not matching above | General guidance: verify single-upstream routing, forward `prompt_cache_key` + session-affinity headers, return cache usage |
+
+These diagnostics are **advisory only**. They do not participate in adapter
+selection (still id/name-only), prompt_cache_key injection, footer stats, or
+any automated configuration changes. Detection uses only metadata exposed by
+Pi (`provider`, `api`, `baseUrl`, `compat`) — no API keys, prompts, payloads,
+headers, or model outputs are read or exposed.
+
+Official OpenAI (`api.openai.com`) and custom transports (`kiro-api`,
+`anthropic-messages`, `bedrock-converse-stream`) are excluded from router/
+channel diagnostics.
 
 ### Security
 
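The detection order described in the README table can be sketched as a small classifier. This is a minimal illustration only, not the package's implementation: `ModelMeta`, `RouterProfile`, `classifyRouterProfile`, and `looksLikeOfficialOpenAI` are hypothetical names, and the "official base URL" test is assumed to be a plain host substring check.

```typescript
// Hypothetical, simplified model shape; the package's real PiModel type is not shown in this diff.
interface ModelMeta {
  provider: string;
  api: string;
  baseUrl?: string;
}

type RouterProfile = "openrouter" | "vercel-ai-gateway" | "aggregation" | "generic" | "none";

const AGGREGATION_PATTERNS = ["litellm", "oneapi", "one-api", "newapi", "new-api", "voapi", "vo-api"];

// Assumption: "official" means the base URL points at api.openai.com.
// The package's own helper may apply different rules.
function looksLikeOfficialOpenAI(baseUrl: string): boolean {
  return baseUrl.includes("api.openai.com");
}

function classifyRouterProfile(model: ModelMeta): RouterProfile {
  const api = model.api.toLowerCase();
  const baseUrl = (model.baseUrl ?? "").toLowerCase();
  const provider = model.provider.toLowerCase();

  // Custom transports and official OpenAI never produce diagnostics.
  if (api !== "openai-completions" && api !== "openai-responses") return "none";
  if (looksLikeOfficialOpenAI(baseUrl)) return "none";

  // Profiles are checked in table order; the first match wins.
  if (baseUrl.includes("openrouter") || provider.includes("openrouter")) return "openrouter";
  if (baseUrl.includes("ai-gateway.vercel.sh") || provider.includes("vercel")) return "vercel-ai-gateway";
  if (AGGREGATION_PATTERNS.some((p) => baseUrl.includes(p) || provider.includes(p))) return "aggregation";

  // The generic fallback applies only to openai-completions with a base URL set.
  if (api === "openai-completions" && baseUrl !== "") return "generic";
  return "none";
}
```

Because the first match wins, a provider id such as `openrouter-via-litellm` would be classified as OpenRouter rather than as an aggregation proxy in this sketch.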
package/README.zh-CN.md
CHANGED
@@ -325,7 +325,24 @@ Edit ~/.pi/agent/models.json -> providers["otokapi"] -> compat (same level as ba
 
 ### `/cache-optimizer compat`
 
-
+Shows the compat suggestion for the current model, including the file path, provider path, and a copyable JSON snippet. When no compat flags are missing, it shows `✅ Compat fully configured.` if the model is an applicable third-party proxy, and `ℹ️ Compat check not applicable for this model.` otherwise.
+
+When the model goes through a known router/channel proxy (OpenRouter, Vercel AI Gateway, LiteLLM, OneAPI/NewAPI/VoAPI, or a generic third-party OpenAI-compatible proxy), both the `doctor` and `compat` subcommands append router/channel diagnostics and recommendations.
+
+### Router/channel diagnostics
+
+For models that use an OpenAI-compatible API (`openai-completions` or `openai-responses`) through a non-official base URL, the extension detects common router/channel proxy patterns from the `provider`, `baseUrl`, and `compat` metadata:
+
+| Type | Detection | Recommendation |
+|------|----------|------|
+| **OpenRouter** | baseUrl or provider contains `openrouter`/`openrouter.ai` | Pin the upstream provider with `openRouterRouting.only` or `.order` in compat |
+| **Vercel AI Gateway** | baseUrl contains `ai-gateway.vercel.sh` or provider contains `vercel` | Pin the upstream with `vercelGatewayRouting.only` or `.order` in compat |
+| **LiteLLM / OneAPI / NewAPI / VoAPI** | baseUrl or provider contains `litellm`, `oneapi`/`one-api`, `newapi`/`new-api`, `voapi`/`vo-api` | Ensure routing is fixed per session, forward `prompt_cache_key` + session-affinity headers, return cache usage fields |
+| **Generic third-party proxy** | Any `openai-completions` model with a non-official base URL that matches none of the types above | General guidance: verify single-upstream routing, forward `prompt_cache_key` + session-affinity headers, return cache usage |
+
+These diagnostics are **advisory only**. They do not take part in adapter selection (still based on id/name), `prompt_cache_key` injection, or footer stats, and they make no automated configuration changes. Detection uses only metadata exposed by Pi (`provider`, `api`, `baseUrl`, `compat`); no API keys, prompts, payloads, headers, or model outputs are read or exposed.
+
+Official OpenAI (`api.openai.com`) and custom transports (`kiro-api`, `anthropic-messages`, `bedrock-converse-stream`) do not trigger router/channel diagnostics.
 
 ### Security notes
 
package/index.ts
CHANGED
@@ -2621,6 +2621,171 @@ function isCompatCheckApplicable(model: PiModel): boolean {
   return lower(model.api) === "openai-completions" && !isOfficialOpenAIBaseUrl(model);
 }
 
+/**
+ * Detect router / channel profiles from a PiModel and return diagnostic notes.
+ *
+ * This function is advisory only — it does NOT participate in adapter selection,
+ * prompt_cache_key injection, or footer stats. It inspects provider, api, baseUrl,
+ * and compat to identify common proxy/router patterns where cache performance may
+ * be degraded due to multi-backend routing.
+ *
+ * Known profiles (checked in order):
+ *   1. OpenRouter — baseUrl or provider id matching openrouter.ai / openrouter
+ *   2. Vercel AI Gateway — baseUrl matching ai-gateway.vercel.sh, or provider
+ *      matching vercel / vercel-ai-gateway
+ *   3. LiteLLM / OneAPI / NewAPI / VoAPI — baseUrl or provider matching litellm,
+ *      oneapi, one-api, newapi, new-api, voapi, vo-api (self-hosted aggregation)
+ *   4. Generic third-party OpenAI-compatible proxy — any openai-completions model
+ *      with a non-official base URL that does not match a higher-profile above.
+ *
+ * Official OpenAI (api.openai.com) and custom transports (kiro-api, anthropic-messages,
+ * bedrock-converse-stream) do NOT produce notes.
+ */
+function describeRouterChannelDiagnostics(model: PiModel): string[] {
+  const notes: string[] = [];
+  const api = lower(model.api);
+  const baseUrl = lower(model.baseUrl || "");
+  const provider = lower(model.provider);
+
+  // Only OpenAI-compatible APIs are applicable for router/channel diagnostics.
+  // Custom transports like kiro-api, anthropic-messages, bedrock-converse-stream
+  // or non-OpenAI APIs are excluded.
+  if (api !== "openai-completions" && api !== "openai-responses") {
+    return notes;
+  }
+
+  // Official OpenAI bypass — no notes needed.
+  if (isOfficialOpenAIBaseUrl(model)) {
+    return notes;
+  }
+
+  // ── 1. OpenRouter ────────────────────────────────────────────────
+  if (
+    baseUrl.includes("openrouter.ai") ||
+    baseUrl.includes("openrouter") ||
+    provider.includes("openrouter")
+  ) {
+    const compat = getCompat(model);
+    const hasOnly = !!(compat as Record<string, unknown>)["openRouterRouting"]?.only;
+    const hasOrder = !!(compat as Record<string, unknown>)["openRouterRouting"]?.order;
+
+    notes.push(
+      "🔀 Router/channel: OpenRouter detected. OpenRouter is a multi-provider router; " +
+      "low cache hit rates are common when each turn lands on a different upstream provider.",
+    );
+
+    if (!hasOnly && !hasOrder) {
+      notes.push(
+        " Suggestion: Add an openRouterRouting config to fix the upstream provider. " +
+        "Example for models.json -> providers[\"<providerId>\"] -> compat:",
+      );
+      notes.push(
+        ` { "sendSessionAffinityHeaders": true, "supportsLongCacheRetention": true, ` +
+        `"openRouterRouting": { "only": ["<provider-slug>"] } }`,
+      );
+      notes.push(
+        ' Replace <provider-slug> with the actual OpenRouter provider slug (e.g. "openai", "anthropic").',
+      );
+      notes.push(
+        " Alternatively, use openRouterRouting.order: [\"<provider-slug>\", \"...\"] for fallback order. " +
+        "Only set supportsLongCacheRetention if your upstream supports long cache retention.",
+      );
+    }
+
+    return notes;
+  }
+
+  // ── 2. Vercel AI Gateway ─────────────────────────────────────────
+  if (
+    baseUrl.includes("ai-gateway.vercel.sh") ||
+    provider.includes("vercel") ||
+    provider.includes("vercel-ai-gateway")
+  ) {
+    const compat = getCompat(model);
+    const hasOnly = !!(compat as Record<string, unknown>)["vercelGatewayRouting"]?.only;
+    const hasOrder = !!(compat as Record<string, unknown>)["vercelGatewayRouting"]?.order;
+
+    notes.push(
+      "🔀 Router/channel: Vercel AI Gateway detected. The gateway may route to different " +
+      "provider endpoints per request, reducing cache locality.",
+    );
+
+    if (!hasOnly && !hasOrder) {
+      notes.push(
+        " Suggestion: Add a vercelGatewayRouting config to fix the upstream. " +
+        "Example for models.json -> providers[\"<providerId>\"] -> compat:",
+      );
+      notes.push(
+        ` { "sendSessionAffinityHeaders": true, "supportsLongCacheRetention": true, ` +
+        `"vercelGatewayRouting": { "only": ["<provider-id>"] } }`,
+      );
+      notes.push(
+        " Replace <provider-id> with the actual Vercel provider ID (e.g. \"openai\").",
+      );
+      notes.push(
+        " Only set supportsLongCacheRetention if your upstream supports it.",
+      );
+    }
+
+    return notes;
+  }
+
+  // ── 3. LiteLLM / OneAPI / NewAPI / VoAPI (self-hosted aggregation) ──
+  const aggregationPatterns = ["litellm", "oneapi", "one-api", "newapi", "new-api", "voapi", "vo-api"];
+  if (
+    aggregationPatterns.some((p) => baseUrl.includes(p)) ||
+    aggregationPatterns.some((p) => provider.includes(p))
+  ) {
+    notes.push(
+      "🔀 Router/channel: Self-hosted aggregation proxy detected (LiteLLM / OneAPI / NewAPI / VoAPI). " +
+      "These proxies route to multiple upstream accounts or instances, which can split the cache.",
+    );
+    notes.push(
+      " Suggestions:",
+    );
+    notes.push(
+      " • Ensure the proxy can fix to a single upstream per session (session_id affinity).",
+    );
+    notes.push(
+      " • Forward prompt_cache_key and session-affinity headers to the upstream.",
+    );
+    notes.push(
+      " • Return cache usage fields (prompt_cache_hit_tokens, etc.) in the response.",
+    );
+    notes.push(
+      ` Example compat: { "sendSessionAffinityHeaders": true, "supportsLongCacheRetention": true }`,
+    );
+
+    return notes;
+  }
+
+  // ── 4. Generic third-party OpenAI-compatible proxy ─────────────────
+  if (api === "openai-completions" && baseUrl) {
+    const missing = describeMissingOpenAICompatibleProxyCompat(model);
+    notes.push(
+      "🔀 Router/channel: Third-party OpenAI-compatible proxy. If cache hit rates are low:",
+    );
+    notes.push(
+      " • Verify the proxy routes to the same upstream account/instance per session.",
+    );
+    notes.push(
+      " • Ensure the proxy forwards prompt_cache_key and sends session-affinity headers.",
+    );
+    notes.push(
+      " • Check that the proxy returns cache usage fields (prompt_cache_hit_tokens etc.).",
+    );
+    if (missing.length > 0) {
+      notes.push(
+        ` • The compat flags above (${missing.join(", ")}) are recommended for cache stability.`,
+      );
+    }
+
+    return notes;
+  }
+
+  return notes;
+}
+
 function buildDoctorDiagnosis(model: PiModel): string {
   const lines: string[] = [];
   lines.push(`Provider: ${model.provider}`);
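The `hasOnly`/`hasOrder` checks in this hunk index a `Record<string, unknown>` and then read `.only` and `.order` through optional chaining. Under strict type checking, property access on an `unknown` value is normally rejected, so a narrowing helper may be a safer way to express the same truthiness test. The sketch below is an assumption-laden illustration: `isRecord` and `hasRoutingPin` are hypothetical names that do not appear in the package.

```typescript
// Hypothetical helper, not part of the package: true for plain (non-array) objects.
function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Returns true when compat[key] carries a truthy `only` or `order` field.
// This mirrors the `!!x?.only` / `!!x?.order` truthiness checks in the diff,
// so an empty array still counts as "set".
function hasRoutingPin(compat: unknown, key: string): boolean {
  if (!isRecord(compat)) return false;
  const routing = compat[key];
  if (!isRecord(routing)) return false;
  return Boolean(routing["only"]) || Boolean(routing["order"]);
}
```

With a helper like this, the OpenRouter branch could test `hasRoutingPin(compat, "openRouterRouting")` and the Vercel branch `hasRoutingPin(compat, "vercelGatewayRouting")`, without casting.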
@@ -2648,6 +2813,15 @@ function buildDoctorDiagnosis(model: PiModel): string {
     lines.push("ℹ️ Compat check not applicable for this model.");
   }
 
+  // ── Router/channel diagnostics ──
+  const routerNotes = describeRouterChannelDiagnostics(model);
+  if (routerNotes.length > 0) {
+    lines.push("");
+    for (const note of routerNotes) {
+      lines.push(note);
+    }
+  }
+
   // ── Integrity diagnostics ──
   if (lastPromptIntegrityWarningAt > 0) {
     const ago = Date.now() - lastPromptIntegrityWarningAt;
@@ -2670,21 +2844,46 @@ function buildDoctorDiagnosis(model: PiModel): string {
 
 function buildCompatDiagnosis(model: PiModel): string | undefined {
   const missing = describeMissingOpenAICompatibleProxyCompat(model);
-
+  const routerNotes = describeRouterChannelDiagnostics(model);
+
+  if (missing.length === 0 && routerNotes.length === 0) return undefined;
 
   const key = modelKey(model);
-const
-
-
-
-
-
-
-`
-` (
-
-`
-
+  const lines: string[] = [];
+
+  if (missing.length > 0) {
+    const slashIdx = key.indexOf("/");
+    const providerLabel = slashIdx > 0 ? key.slice(0, slashIdx) : key;
+    const suggestion = Object.fromEntries(missing.map((f) => [f, true]));
+    const modelsJsonPath = getModelsJsonDisplayPath();
+    lines.push(`Active model: ${key}`);
+    lines.push(`Missing: ${missing.join(", ")}`);
+    lines.push("");
+    lines.push(`Edit ${modelsJsonPath} -> providers["${providerLabel}"] -> compat`);
+    lines.push(`(at the same level as baseUrl/api/apiKey/models) and add:`);
+    lines.push(JSON.stringify(suggestion, null, 2));
+    lines.push("");
+    lines.push(`Only enable if your endpoint supports them.`);
+  }
+
+  // When compat is fully configured but router notes exist, prefix the status.
+  if (routerNotes.length > 0 && missing.length === 0) {
+    if (isCompatCheckApplicable(model)) {
+      lines.push("✅ Compat fully configured.");
+    } else {
+      lines.push("ℹ️ Compat check not applicable for this model.");
+    }
+    lines.push("");
+  }
+
+  if (routerNotes.length > 0) {
+    if (missing.length > 0) lines.push("");
+    for (const note of routerNotes) {
+      lines.push(note);
+    }
+  }
+
+  return lines.join("\n");
 }
 
 // Internal helpers exported only so the task verification script
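The new `buildCompatDiagnosis` has three outcomes: nothing to report, a missing-flags block, and a status line followed by router notes. A stripped-down sketch of that branching is shown here so the cases can be exercised in isolation. `assembleCompatReport` is a hypothetical stand-in that takes its inputs as plain arguments and omits the file-path and provider-label lines, so its output is deliberately shorter than what the package prints.

```typescript
// Simplified stand-in for the branching in buildCompatDiagnosis; all names are illustrative.
function assembleCompatReport(
  missing: string[],
  routerNotes: string[],
  applicable: boolean,
): string | undefined {
  // Nothing to report: no missing flags and no router/channel notes.
  if (missing.length === 0 && routerNotes.length === 0) return undefined;

  const lines: string[] = [];
  if (missing.length > 0) {
    // Each missing flag becomes `"flag": true` in the copyable snippet.
    const suggestion: Record<string, boolean> = {};
    for (const flag of missing) suggestion[flag] = true;
    lines.push(`Missing: ${missing.join(", ")}`, JSON.stringify(suggestion));
  } else {
    // No flags are missing but router notes exist, so lead with the status line.
    lines.push(
      applicable
        ? "✅ Compat fully configured."
        : "ℹ️ Compat check not applicable for this model.",
    );
  }

  // Router notes always follow a blank separator line.
  if (routerNotes.length > 0) lines.push("", ...routerNotes);
  return lines.join("\n");
}
```

One consequence of the changed early return is that a model with complete compat flags no longer yields `undefined` when router notes exist, so callers that treated `undefined` as "fully configured" would now receive a report instead.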
@@ -2835,6 +3034,7 @@ export const __internals_for_tests = {
   isCompatCheckApplicable,
   buildDoctorDiagnosis,
   buildCompatDiagnosis,
+  describeRouterChannelDiagnostics,
   // Cache stats helpers (module-level, usable from verify script)
   addUsageToCacheStats,
   formatCacheStats,
package/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "pi-cache-optimizer",
-  "version": "2.4.
+  "version": "2.4.8",
   "description": "Pi extension that improves provider-side KV/prompt cache hit rates (DeepSeek, OpenAI, Claude, Gemini) by reordering the system prompt, requesting long retention, and showing footer cache stats. Renamed from pi-deepseek-cache-optimizer.",
   "keywords": [
     "pi-package",