npm - pi-cache-optimizer - Versions diffs - 2.5.3 → 2.5.5 - Mend

pi-cache-optimizer 2.5.3 → 2.5.5

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (4)

package/README.md CHANGED

@@ -8,8 +8,6 @@
 Pi extension for improving provider-side KV / prompt cache hit rates. It keeps stable prompt content near the front, adds a conservative OpenAI-compatible `prompt_cache_key` fallback, warns about common proxy cache-routing gaps, and shows read-only footer cache stats.
-**GitHub About:** Improve Pi prompt/KV cache hit rates with stable prompts, OpenAI-compatible cache keys, proxy compat warnings, and footer cache stats.
 > Renamed from `pi-deepseek-cache-optimizer`. Existing footer counters migrate automatically. This package never creates, edits, backs up, or deletes your `~/.pi/agent/models.json`.
 ## Contents
@@ -100,6 +98,7 @@ Notes:
 - `supportsLongCacheRetention: true` is optional. Add it only when the endpoint explicitly supports OpenAI long prompt cache retention.
 - If you see `400 Unsupported parameter: prompt_cache_retention`, remove/avoid `supportsLongCacheRetention` for that channel. Keep `sendSessionAffinityHeaders` if supported.
 - Use `/cache-optimizer compat` or `/cache-optimizer doctor` to see model-specific advice.
+- For DeepSeek models, the Pi Mono guidance expects `compat.requiresReasoningContentOnAssistantMessages: true` and `compat.thinkingFormat: "deepseek"` alongside cache/session-affinity flags when the endpoint supports them.
 - This extension only advises; it does not edit `models.json`.
 ## Footer stats
@@ -114,7 +113,7 @@ OpenAI cache 3/10 · 0.002M/0.005M tok (40%) ⚠️ compat
 Format: `<label> <hit requests>/<total requests> · <cached input tokens>/<total input tokens> tok (<token hit rate>)`. Some adapters may also append `· write <tokens> tok`, and runtime diagnostics may append `⚠️ compat` or `⚠️ integrity`.
-Supported footer labels include: DS, Claude, OpenAI, Gemini, Kimi, Qwen, GLM, MiniMax, Hunyuan, Mistral, Grok, Llama, Nemotron, Cohere, Yi, Doubao, ERNIE, Baichuan, StepFun, Spark, InternLM, Gemma, Phi, Jamba, Solar, Sonar, Nova, Reka, Falcon, DBRX, MPT, StableLM, Aquila, EXAONE, HyperCLOVA, Luminous, Hermes, Granite, Arctic, Pangu, SenseNova, Zhinao, MiniCPM, XVERSE, Orion, OpenChat, Vicuna, Wizard, Zephyr, Dolphin, OpenOrca, Starling, BLOOM, RWKV, and Aya.
+Supported footer labels include: DS, Claude, OpenAI, Gemini, Kimi, Qwen, GLM, MiniMax, Mimo, Hunyuan, Mistral, Grok, Llama, Nemotron, Cohere, Yi, Doubao, ERNIE, Baichuan, StepFun, Spark, InternLM, Gemma, Phi, Jamba, Solar, Sonar, Nova, Reka, Falcon, DBRX, MPT, StableLM, Aquila, EXAONE, HyperCLOVA, Luminous, Hermes, Granite, Arctic, Pangu, SenseNova, Zhinao, MiniCPM, XVERSE, Orion, OpenChat, Vicuna, Wizard, Zephyr, Dolphin, OpenOrca, Starling, BLOOM, RWKV, and Aya.
 Adapter selection uses only model id/name (plus assistant message model/name on message end). Generic OpenAI-shaped APIs are not treated as OpenAI-family unless the model id/name matches a supported family.
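
The footer format described in the README hunk above can be illustrated with a small formatter. This is a minimal sketch, not the package's implementation: the `FooterStats` shape, `formatMillions`, `formatFooter`, and the three-decimal `M` rendering are assumptions chosen to reproduce the README's sample line, and the percentage uses the same rounded cached/total ratio that `index.ts` uses for `todayHitRatio`.

```typescript
// Hypothetical stats shape; the extension's real counters type is not shown in this diff.
interface FooterStats {
  label: string;            // e.g. "OpenAI cache" or "Mimo cache"
  hitRequests: number;      // requests that reported cached input tokens
  totalRequests: number;
  cachedInputTokens: number;
  totalInputTokens: number;
}

// Assumption: token counts are rendered in millions with three decimals.
function formatMillions(tokens: number): string {
  return `${(tokens / 1_000_000).toFixed(3)}M`;
}

// Builds `<label> <hits>/<total> · <cached>/<input> tok (<rate>%)` plus optional markers.
function formatFooter(stats: FooterStats, markers: string[] = []): string {
  // Guard against division by zero when no input tokens have been recorded yet.
  const hitRate = stats.totalInputTokens > 0
    ? Math.round((stats.cachedInputTokens / stats.totalInputTokens) * 100)
    : 0;
  const base =
    `${stats.label} ${stats.hitRequests}/${stats.totalRequests} · ` +
    `${formatMillions(stats.cachedInputTokens)}/${formatMillions(stats.totalInputTokens)} tok (${hitRate}%)`;
  return [base, ...markers].join(" ");
}

const sample: FooterStats = {
  label: "OpenAI cache",
  hitRequests: 3,
  totalRequests: 10,
  cachedInputTokens: 2000,
  totalInputTokens: 5000,
};
console.log(formatFooter(sample, ["⚠️ compat"]));
```

With these inputs the sketch yields the same `3/10 · 0.002M/0.005M tok (40%)` body as the README example; runtime markers such as `⚠️ compat` are simply appended.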

package/README.zh-CN.md CHANGED

@@ -8,8 +8,6 @@
 用于提升 Pi 中 provider 侧 KV Cache / Prompt Cache 命中率的扩展：把稳定 prompt 内容前置，给 OpenAI-compatible 请求补保守的 `prompt_cache_key`，提示代理渠道常见缓存路由兼容问题，并在底部显示只读缓存统计。
-**GitHub About：** Improve Pi prompt/KV cache hit rates with stable prompts, OpenAI-compatible cache keys, proxy compat warnings, and footer cache stats.
 > 本包已从 `pi-deepseek-cache-optimizer` 改名。已有底部统计会自动迁移。本扩展绝不会创建、修改、备份或删除你的 `~/.pi/agent/models.json`。
 ## 目录
@@ -100,6 +98,7 @@ LiteLLM / OneAPI / NewAPI / 类 OpenRouter 渠道等第三方 `openai-completion
 - `supportsLongCacheRetention: true` 是可选项。只有 endpoint 明确支持 OpenAI long prompt cache retention 时才添加。
 - 如果出现 `400 Unsupported parameter: prompt_cache_retention`，请为该渠道移除 / 避免 `supportsLongCacheRetention`；如支持，可保留 `sendSessionAffinityHeaders`。
 - 使用 `/cache-optimizer compat` 或 `/cache-optimizer doctor` 查看当前模型的具体建议。
+- 对 DeepSeek 模型，Pi Mono 指南期望在支持时同时设置 `compat.requiresReasoningContentOnAssistantMessages: true` 和 `compat.thinkingFormat: "deepseek"`，再配合缓存 / session-affinity 相关 compat。
 - 本扩展只给建议，不会修改 `models.json`。
 ## Footer 统计
@@ -114,7 +113,7 @@ OpenAI cache 3/10 · 0.002M/0.005M tok (40%) ⚠️ compat
 格式：`<label> <命中请求数>/<总请求数> · <cached input tokens>/<total input tokens> tok (<token 命中率>)`。部分 adapter 还可能追加 `· write <tokens> tok`，运行时诊断可能追加 `⚠️ compat` 或 `⚠️ integrity`。
-支持的 footer label 包括：DS、Claude、OpenAI、Gemini、Kimi、Qwen、GLM、MiniMax、Hunyuan、Mistral、Grok、Llama、Nemotron、Cohere、Yi、Doubao、ERNIE、Baichuan、StepFun、Spark、InternLM、Gemma、Phi、Jamba、Solar、Sonar、Nova、Reka、Falcon、DBRX、MPT、StableLM、Aquila、EXAONE、HyperCLOVA、Luminous、Hermes、Granite、Arctic、Pangu、SenseNova、Zhinao、MiniCPM、XVERSE、Orion、OpenChat、Vicuna、Wizard、Zephyr、Dolphin、OpenOrca、Starling、BLOOM、RWKV、Aya。
+支持的 footer label 包括：DS、Claude、OpenAI、Gemini、Kimi、Qwen、GLM、MiniMax、Mimo、Hunyuan、Mistral、Grok、Llama、Nemotron、Cohere、Yi、Doubao、ERNIE、Baichuan、StepFun、Spark、InternLM、Gemma、Phi、Jamba、Solar、Sonar、Nova、Reka、Falcon、DBRX、MPT、StableLM、Aquila、EXAONE、HyperCLOVA、Luminous、Hermes、Granite、Arctic、Pangu、SenseNova、Zhinao、MiniCPM、XVERSE、Orion、OpenChat、Vicuna、Wizard、Zephyr、Dolphin、OpenOrca、Starling、BLOOM、RWKV、Aya。
 Adapter 选择只看模型 id/name（以及 message_end 时 assistant message 的 model/name）。仅使用 OpenAI-shaped API 不会被当作 OpenAI-family，除非模型 id/name 匹配受支持的家族。

package/index.ts CHANGED

@@ -126,6 +126,7 @@ const MIN_STABLE_CANDIDATE_LENGTH = 8;
 const ASSISTANT_MESSAGE_MODEL_TOKEN_KEYS = ["model", "name"];
 const OPENAI_REASONING_MODEL_PATTERN = /(^|[/\s:_-])o[1345]($|[-_.:/\s])/;
 const XAI_MODEL_PATTERN = /(^|[/\s:_-])xai($|[-_.:/\s])/;
+const MIMO_MODEL_PATTERN = /(^|[/\s:_-])mi-?mo($|[-_.:/\s])/i;
 const PPLX_MODEL_PATTERN = /(^|[/\s:_-])pplx($|[-_.:/\s])/i;
 const NOVA_MODEL_PATTERN = /(^|[/\s:_-])nova($|[-_.:/\s])/i;
 const MPT_MODEL_PATTERN = /(^|[/\s:_-])mpt($|[-_.:/\s])/i;
@@ -141,6 +142,7 @@ type CacheCompat = {
   sendSessionIdHeader?: boolean;
   supportsLongCacheRetention?: boolean;
   thinkingFormat?: string;
+  requiresReasoningContentOnAssistantMessages?: boolean;
   cacheControlFormat?: string;
 };
@@ -831,6 +833,18 @@ function isMiniMaxLikeAssistantMessage(message: unknown, model: PiModel | undefi
   return modelOrAssistantMessageHas(message, model, ["minimax"]);
 }
+function isMimoLikeModel(model: PiModel | undefined): boolean {
+  const tokens = getModelIdNameTokenValues(model);
+  return hasAnyTokenContaining(tokens, ["xiaomimimo"]) || tokens.some((t) => MIMO_MODEL_PATTERN.test(t));
+}
+function isMimoLikeAssistantMessage(message: unknown, model: PiModel | undefined): boolean {
+  const allTokens = [
+    ...getModelIdNameTokenValues(model),
+    ...getAssistantMessageModelTokenValues(message),
+  ];
+  return hasAnyTokenContaining(allTokens, ["xiaomimimo"]) || allTokens.some((t) => MIMO_MODEL_PATTERN.test(t));
+}
 function isHunyuanLikeModel(model: PiModel | undefined): boolean {
   return hasAnyTokenContaining(getModelIdNameTokenValues(model), ["hunyuan"]);
 }
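
As a reading aid for the Mimo detection added in the hunks above, the sketch below shows which model ids the new `MIMO_MODEL_PATTERN` accepts. The regular expression is copied from the diff; `isMimoLikeToken` is a hypothetical single-token stand-in for `isMimoLikeModel`, and it assumes that `hasAnyTokenContaining` performs a case-insensitive substring check (that helper's body is not part of this diff).

```typescript
// Copied from the diff: "mimo" or "mi-mo" delimited by start/end or a separator character.
const MIMO_MODEL_PATTERN = /(^|[/\s:_-])mi-?mo($|[-_.:/\s])/i;

// Hypothetical stand-in for isMimoLikeModel, operating on one id/name token.
// Assumption: the real hasAnyTokenContaining is a case-insensitive substring match.
function isMimoLikeToken(token: string): boolean {
  return token.toLowerCase().includes("xiaomimimo") || MIMO_MODEL_PATTERN.test(token);
}

const candidates = [
  "mimo-7b",          // delimited "mimo" at the start
  "xiaomi/MiMo-v2",   // delimited by "/" and "-", matched case-insensitively
  "mi-mo",            // optional hyphen form
  "xiaomimimo-chat",  // caught only by the substring fallback
  "mimosa-1",         // "mimo" followed by a letter, so no match
  "minimax-m2",       // different family
];

for (const id of candidates) {
  console.log(`${id}: ${isMimoLikeToken(id)}`);
}
```

The delimiter groups are what keep ids such as `mimosa-1` from being misclassified, while the `xiaomimimo` substring check covers provider-prefixed names in which `mimo` is not preceded by a separator.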
@@ -1492,7 +1506,7 @@ function describeMissingOpenAIFamilyProxyCompat(model: PiModel): string[] {
 /**
  * Like describeMissingOpenAIFamilyProxyCompat but without the isOpenAIFamilyModel
  * gate. Warns for ANY model using openai-completions through a non-official base
- * URL — covers GPT, Kimi, Qwen, GLM, MiniMax, Hunyuan, and any other
+ * URL — covers GPT, Kimi, Qwen, GLM, MiniMax, Mimo, Hunyuan, and any other
  * OpenAI-compatible proxy.
  */
 function describeMissingOpenAICompatibleProxyCompat(model: PiModel): string[] {
@@ -1590,10 +1604,88 @@ function describeMissingDeepSeekCompat(model: PiModel): string[] {
   } else if (compat.sendSessionAffinityHeaders !== true) {
     missing.push("sendSessionAffinityHeaders");
   }
+  if (compat.requiresReasoningContentOnAssistantMessages !== true) {
+    missing.push("requiresReasoningContentOnAssistantMessages");
+  }
+  if (compat.thinkingFormat !== "deepseek") {
+    missing.push("thinkingFormat");
+  }
   return missing;
 }
+function isDeepSeekCompatCheckApplicable(model: PiModel): boolean {
+  return isDeepSeekLikeModel(model) && isOpenAICompatibleApi(model.api);
+}
+function describeMissingCacheCompatForModel(model: PiModel): string[] {
+  if (isDeepSeekCompatCheckApplicable(model)) {
+    return describeMissingDeepSeekCompat(model);
+  }
+  return describeMissingOpenAICompatibleProxyCompat(model);
+}
+function buildDeepSeekCompatSuggestion(missing: string[]): Record<string, unknown> {
+  const suggestion: Record<string, unknown> = {};
+  if (missing.includes("supportsLongCacheRetention")) {
+    suggestion.supportsLongCacheRetention = true;
+  }
+  if (missing.includes("sendSessionIdHeader")) {
+    suggestion.sendSessionIdHeader = true;
+  }
+  if (missing.includes("sendSessionAffinityHeaders")) {
+    suggestion.sendSessionAffinityHeaders = true;
+  }
+  if (missing.includes("requiresReasoningContentOnAssistantMessages")) {
+    suggestion.requiresReasoningContentOnAssistantMessages = true;
+  }
+  if (missing.includes("thinkingFormat")) {
+    suggestion.thinkingFormat = "deepseek";
+  }
+  return suggestion;
+}
+function appendDeepSeekCompatAdviceLines(lines: string[], missing: string[]): void {
+  const suggestion = buildDeepSeekCompatSuggestion(missing);
+  if (Object.keys(suggestion).length > 0) {
+    lines.push("Recommended DeepSeek compat snippet:");
+    lines.push(JSON.stringify(suggestion, null, 2));
+  }
+  if (missing.includes("requiresReasoningContentOnAssistantMessages")) {
+    lines.push('- requiresReasoningContentOnAssistantMessages: true keeps replayed assistant turns compatible with DeepSeek reasoning_content requirements.');
+  }
+  if (missing.includes("thinkingFormat")) {
+    lines.push('- thinkingFormat: "deepseek" tells Pi to use DeepSeek reasoning/thinking parameter format.');
+  }
+  if (missing.includes("sendSessionAffinityHeaders")) {
+    lines.push("- sendSessionAffinityHeaders: recommended for OpenAI-compatible DeepSeek proxies when supported; it helps keep one Pi session on the same upstream/backend.");
+  }
+  if (missing.includes("sendSessionIdHeader")) {
+    lines.push("- sendSessionIdHeader: recommended for OpenAI Responses-compatible DeepSeek proxies when supported.");
+  }
+  if (missing.includes("supportsLongCacheRetention")) {
+    lines.push("- supportsLongCacheRetention: enable for DeepSeek-compatible endpoints that support long cache retention.");
+  }
+}
+function buildDeepSeekCompatWarningText(key: string, missing: string[]): string {
+  const slashIdx = key.indexOf("/");
+  const providerLabel = slashIdx > 0 ? key.slice(0, slashIdx) : key;
+  const modelsJsonPath = getModelsJsonDisplayPath();
+  const lines: string[] = [
+    `💡 pi-cache-optimizer: ${key} is DeepSeek-like but merged compat lacks ${missing.join(" and ")}.`,
+    `Proxies may reduce or hide cache hits. Edit ${modelsJsonPath} -> providers["${providerLabel}"] -> compat (at the same level as baseUrl/api/apiKey/models).`,
+    "",
+  ];
+  appendDeepSeekCompatAdviceLines(lines, missing);
+  return lines.join("\n");
+}
 const CACHE_PROVIDER_ADAPTERS: CacheProviderAdapter[] = [
   {
     id: "deepseek",
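
The two reasoning-related checks added to `describeMissingDeepSeekCompat`, and the way `buildDeepSeekCompatSuggestion` turns missing keys into a snippet, can be condensed into the following sketch. It is a partial illustration under stated assumptions: `ReasoningCompat`, `missingReasoningCompat`, and `suggestReasoningCompat` are hypothetical names, and the cache and session-affinity checks are omitted because their conditions are not fully visible in this diff.

```typescript
// Hypothetical subset of CacheCompat holding only the two reasoning-related fields.
type ReasoningCompat = {
  thinkingFormat?: string;
  requiresReasoningContentOnAssistantMessages?: boolean;
};

// Mirrors the two checks added in 2.5.5: each key is reported unless it has the exact expected value.
function missingReasoningCompat(compat: ReasoningCompat): string[] {
  const missing: string[] = [];
  if (compat.requiresReasoningContentOnAssistantMessages !== true) {
    missing.push("requiresReasoningContentOnAssistantMessages");
  }
  if (compat.thinkingFormat !== "deepseek") {
    missing.push("thinkingFormat");
  }
  return missing;
}

// Mirrors buildDeepSeekCompatSuggestion for the same two keys only.
function suggestReasoningCompat(missing: string[]): Record<string, unknown> {
  const suggestion: Record<string, unknown> = {};
  if (missing.includes("requiresReasoningContentOnAssistantMessages")) {
    suggestion.requiresReasoningContentOnAssistantMessages = true;
  }
  if (missing.includes("thinkingFormat")) {
    suggestion.thinkingFormat = "deepseek";
  }
  return suggestion;
}

// A compat block with a non-DeepSeek thinking format reports both keys as missing.
const exampleMissing = missingReasoningCompat({ thinkingFormat: "openai" });
console.log(JSON.stringify(suggestReasoningCompat(exampleMissing), null, 2));
```

Because the checks compare against exact values, a `thinkingFormat` set to anything other than `"deepseek"` is treated the same as an absent one, and the suggestion contains only the keys that still need to be added under the provider's `compat` block.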
@@ -1613,13 +1705,7 @@ const CACHE_PROVIDER_ADAPTERS: CacheProviderAdapter[] = [
       if (missing.length === 0) return undefined;
       const key = modelKey(model);
-      const slashIdx = key.indexOf("/");
-      const providerLabel = slashIdx > 0 ? key.slice(0, slashIdx) : key;
-      const modelsJsonPath = getModelsJsonDisplayPath();
-      return (
-        `💡 pi-cache-optimizer: ${key} is DeepSeek-like but merged compat lacks ${missing.join(" and ")}. ` +
-        `Proxies may reduce or hide cache hits. Edit ${modelsJsonPath} -> providers["${providerLabel}"] -> compat (at the same level as baseUrl/api/apiKey/models).`
-      );
+      return buildDeepSeekCompatWarningText(key, missing);
     },
   },
   {
@@ -1742,6 +1828,23 @@ const CACHE_PROVIDER_ADAPTERS: CacheProviderAdapter[] = [
       return buildOpenAIProxyCompatWarningText(modelKey(model), missing);
     },
   },
+  {
+    id: "openai" as CacheProviderId,
+    label: "Mimo cache",
+    matchesModel: isMimoLikeModel,
+    matchesAssistantMessage(message, model) {
+      if (!isAssistantMessage(message)) return false;
+      return isMimoLikeAssistantMessage(message, model);
+    },
+    normalizeUsage(message) {
+      return normalizeWithFallback(message, getOpenAIRawUsage);
+    },
+    warningText(model) {
+      const missing = describeMissingOpenAICompatibleProxyCompat(model);
+      if (missing.length === 0) return undefined;
+      return buildOpenAIProxyCompatWarningText(modelKey(model), missing);
+    },
+  },
   {
     id: "openai" as CacheProviderId,
     label: "Hunyuan cache",
@@ -3028,6 +3131,12 @@ function isCompatCheckApplicable(model: PiModel): boolean {
   return lower(model.api) === "openai-completions" && !isOfficialOpenAIBaseUrl(model);
 }
+function isPromptCacheRetention400Applicable(model: PiModel): boolean {
+  return isOpenAICompatibleApi(model.api) &&
+    !isOfficialOpenAIBaseUrl(model) &&
+    getCompat(model).supportsLongCacheRetention === true;
+}
 /**
  * Detect router / channel profiles from a PiModel and return diagnostic notes.
  *
@@ -3171,7 +3280,7 @@ function describeRouterChannelDiagnostics(model: PiModel): string[] {
   // ── 4. Generic third-party OpenAI-compatible proxy ─────────────────
   if (api === "openai-completions" && baseUrl) {
-    const missing = describeMissingOpenAICompatibleProxyCompat(model);
+    const missing = describeMissingCacheCompatForModel(model);
     notes.push(
       "🔀 Router/channel: Third-party OpenAI-compatible proxy. If cache hit rates are low:",
     );
@@ -3207,7 +3316,8 @@ function buildDoctorDiagnosis(model: PiModel, options: { promptCacheRetention400
   const compat = getCompat(model);
   lines.push(`Compat:   ${JSON.stringify(compat)}`);
-  const missing = describeMissingOpenAICompatibleProxyCompat(model);
+  const deepSeekCompatApplicable = isDeepSeekCompatCheckApplicable(model);
+  const missing = describeMissingCacheCompatForModel(model);
   if (missing.length > 0) {
     lines.push(`⚠️  Missing compat flags: ${missing.join(", ")}`);
     const key = modelKey(model);
@@ -3215,14 +3325,18 @@ function buildDoctorDiagnosis(model: PiModel, options: { promptCacheRetention400
     const providerLabel = slashIdx > 0 ? key.slice(0, slashIdx) : key;
     const modelsJsonPath = getModelsJsonDisplayPath();
     lines.push(`Edit ${modelsJsonPath} -> providers["${providerLabel}"] -> compat (same level as baseUrl/api/apiKey/models).`);
-    appendOpenAIProxyCompatAdviceLines(lines, missing);
-  } else if (isCompatCheckApplicable(model)) {
+    if (deepSeekCompatApplicable) {
+      appendDeepSeekCompatAdviceLines(lines, missing);
+    } else {
+      appendOpenAIProxyCompatAdviceLines(lines, missing);
+    }
+  } else if (deepSeekCompatApplicable || isCompatCheckApplicable(model)) {
     lines.push("✅ Compat fully configured.");
   } else {
     lines.push("ℹ️ Compat check not applicable for this model.");
   }
-  if (isCompatCheckApplicable(model) && compat.supportsLongCacheRetention === true) {
+  if (isPromptCacheRetention400Applicable(model)) {
     lines.push("");
     if (options.promptCacheRetention400) {
       lines.push("⚠️  A 400 response was observed while supportsLongCacheRetention is enabled.");
@@ -3274,8 +3388,8 @@ function buildLowHitDiagnosis(
 ): string[] {
   const lines: string[] = [];
-  // 1. Missing compat flags (reuse existing check)
-  const missingCompat = describeMissingOpenAICompatibleProxyCompat(model);
+  // 1. Missing compat flags (adapter-aware: DeepSeek has extra reasoning compat)
+  const missingCompat = describeMissingCacheCompatForModel(model);
   // 2. Router/channel risk (reuse existing check)
   const routerNotes = describeRouterChannelDiagnostics(model);
@@ -3297,6 +3411,13 @@ function buildLowHitDiagnosis(
   const hasRouterRisk = routerNotes.length > 0;
   const hasUsageMissing = missingUsageSamples > 0;
+  // Today's cached-token ratio is used both inside and outside the recent-sample
+  // branch. Keep it block-external so doctor/stats never throw for low-hit
+  // models that have persisted counters but no recent in-memory samples.
+  const todayHitRatio = todayStats.totalInputTokens > 0
+    ? Math.round((todayStats.cachedInputTokens / todayStats.totalInputTokens) * 100)
+    : 0;
   // Determine if there are actual issues worth flagging
   const hasActualIssues = hasMissingCompat || hasUsageMissing ||
     // Low hit trend (today total > 3 and hit ratio < 30%)
@@ -3337,10 +3458,6 @@ function buildLowHitDiagnosis(
   // Priority 4: recent trend low
   if (recent10Total > 0) {
     const hitRatio = recent10Input > 0 ? Math.round((recent10Cached / recent10Input) * 100) : 0;
-    const todayHitRatio = todayStats.totalInputTokens > 0
-      ? Math.round((todayStats.cachedInputTokens / todayStats.totalInputTokens) * 100)
-      : 0;
     if (recent10Hits === 0 && todayStats.totalRequests > 3 && todayHitRatio < 30) {
       lines.push(`📉 Cache hit rate is low: ${todayHitRatio}% today (${recent10Total} recent samples).`);
       lines.push("   Likely causes: proxy routing to different backends per request,");
@@ -3371,7 +3488,8 @@ function buildLowHitDiagnosis(
 }
 function buildCompatDiagnosis(model: PiModel): string | undefined {
-  const missing = describeMissingOpenAICompatibleProxyCompat(model);
+  const missing = describeMissingCacheCompatForModel(model);
+  const deepSeekCompatApplicable = isDeepSeekCompatCheckApplicable(model);
   const routerNotes = describeRouterChannelDiagnostics(model);
   if (missing.length === 0 && routerNotes.length === 0) return undefined;
@@ -3388,14 +3506,18 @@ function buildCompatDiagnosis(model: PiModel): string | undefined {
     lines.push("");
     lines.push(`Edit ${modelsJsonPath} -> providers["${providerLabel}"] -> compat`);
     lines.push(`(at the same level as baseUrl/api/apiKey/models).`);
-    appendOpenAIProxyCompatAdviceLines(lines, missing);
+    if (deepSeekCompatApplicable) {
+      appendDeepSeekCompatAdviceLines(lines, missing);
+    } else {
+      appendOpenAIProxyCompatAdviceLines(lines, missing);
+    }
   }
   // When compat is fully configured but router notes exist, prefix the status.
   if (routerNotes.length > 0 && missing.length === 0) {
-    if (isCompatCheckApplicable(model)) {
+    if (deepSeekCompatApplicable || isCompatCheckApplicable(model)) {
       lines.push("✅ Compat fully configured.");
-      if (getCompat(model).supportsLongCacheRetention === true) {
+      if (isPromptCacheRetention400Applicable(model)) {
         lines.push(getPromptCacheRetentionUnsupportedHint());
       }
     } else {
@@ -3441,9 +3563,16 @@ export const __internals_for_tests = {
   isOpenAIFamilyToken,
   describeMissingOpenAIFamilyProxyCompat,
   describeMissingOpenAICompatibleProxyCompat,
+  describeMissingDeepSeekCompat,
+  isDeepSeekCompatCheckApplicable,
+  describeMissingCacheCompatForModel,
+  buildDeepSeekCompatSuggestion,
+  buildDeepSeekCompatWarningText,
   buildSafeOpenAIProxyCompatSuggestion,
   getPromptCacheRetentionUnsupportedHint,
   isOfficialOpenAIBaseUrl,
+  isCompatCheckApplicable,
+  isPromptCacheRetention400Applicable,
   // Non-GPT OpenAI-compatible model detection
   isKimiLikeModel,
   isKimiLikeAssistantMessage,
@@ -3453,6 +3582,8 @@ export const __internals_for_tests = {
   isGLMLikeAssistantMessage,
   isMiniMaxLikeModel,
   isMiniMaxLikeAssistantMessage,
+  isMimoLikeModel,
+  isMimoLikeAssistantMessage,
   isHunyuanLikeModel,
   isHunyuanLikeAssistantMessage,
   // Additional OpenAI-compatible model detection
@@ -3551,6 +3682,8 @@ export const __internals_for_tests = {
   isRwkvLikeAssistantMessage,
   isAyaLikeModel,
   isAyaLikeAssistantMessage,
+  selectAdapterForModel,
+  selectAdapterForAssistantMessage,
   buildOpenAIProxyCompatWarningText,
   getModelIdNameTokenValues,
   getAssistantMessageModelTokenValues,
@@ -3899,15 +4032,15 @@ export default function (pi: ExtensionAPI) {
       }
     }
-    // ⚠️ compat footer marker: if the active model is a non-official
-    // openai-completions model with missing supportsLongCacheRetention
-    // or sendSessionAffinityHeaders, append the marker to indicate that
-    // compat configuration is incomplete. Re-evaluated on every status
-    // update so the marker persists through stats changes and day
-    // rollovers. Redundant setStatus calls are blocked by the
+    // ⚠️ compat footer marker: if the active model has adapter-specific
+    // missing compat (DeepSeek reasoning/cache compat, or a non-official
+    // openai-completions model missing cache/session-affinity flags), append
+    // the marker to indicate that compat configuration is incomplete.
+    // Re-evaluated on every status update so the marker persists through stats
+    // changes and day rollovers. Redundant setStatus calls are blocked by the
     // `lastStatusText` early return above.
     if (runtimeOptimizerEnabled && statusText !== undefined && model) {
-      const compatMissing = describeMissingOpenAICompatibleProxyCompat(model);
+      const compatMissing = describeMissingCacheCompatForModel(model);
       if (compatMissing.length > 0) {
         statusText = statusText + " ⚠️ compat";
       }
@@ -4027,8 +4160,7 @@ export default function (pi: ExtensionAPI) {
     const model = ctx.model;
     if (!runtimeOptimizerEnabled || !model) return;
     if (event.status !== 400) return;
-    if (!isCompatCheckApplicable(model)) return;
-    if (getCompat(model).supportsLongCacheRetention !== true) return;
+    if (!isPromptCacheRetention400Applicable(model)) return;
     const key = modelKey(model);
     promptCacheRetention400Models.add(key);
@@ -4140,7 +4272,7 @@ export default function (pi: ExtensionAPI) {
           cmdCtx.ui.notify(compatResult, "warning");
         } else {
           cmdCtx.ui.notify(
-            isCompatCheckApplicable(model)
+            isDeepSeekCompatCheckApplicable(model) || isCompatCheckApplicable(model)
               ? "✅ Compat fully configured."
               : "ℹ️ Compat check not applicable for this model.",
             "info",
@@ -4238,7 +4370,7 @@ export default function (pi: ExtensionAPI) {
                 cmdCtx.ui.notify(compatResult, "warning");
               } else {
                 cmdCtx.ui.notify(
-                  isCompatCheckApplicable(model)
+                  isDeepSeekCompatCheckApplicable(model) || isCompatCheckApplicable(model)
                     ? "✅ Compat fully configured."
                     : "ℹ️ Compat check not applicable for this model.",
                   "info",
@@ -4285,11 +4417,11 @@ export default function (pi: ExtensionAPI) {
         diagnosis.push("");
         if (model) {
           const displayKey = modelKey(model);
-          const missing = describeMissingOpenAICompatibleProxyCompat(model);
+          const missing = describeMissingCacheCompatForModel(model);
           if (missing.length > 0) {
             diagnosis.push(`⚠️  Active model "${displayKey}" missing compat: ${missing.join(", ")}`);
             diagnosis.push('Run "/cache-optimizer compat" for edit instructions.');
-          } else if (isCompatCheckApplicable(model)) {
+          } else if (isDeepSeekCompatCheckApplicable(model) || isCompatCheckApplicable(model)) {
             diagnosis.push(`✅ Active model "${displayKey}": compat fully configured.`);
           } else {
             diagnosis.push(`ℹ️ Active model "${displayKey}": compat check not applicable.`);

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "pi-cache-optimizer",
-  "version": "2.5.3",
+  "version": "2.5.5",
   "description": "Improve Pi prompt/KV cache hit rates with stable prompts, OpenAI-compatible cache keys, proxy compat warnings, and footer cache stats.",
   "keywords": [
     "pi-package",