pi-cache-optimizer 2.5.3 → 2.5.4

package/README.md CHANGED
@@ -8,8 +8,6 @@
 
  Pi extension for improving provider-side KV / prompt cache hit rates. It keeps stable prompt content near the front, adds a conservative OpenAI-compatible `prompt_cache_key` fallback, warns about common proxy cache-routing gaps, and shows read-only footer cache stats.
 
- **GitHub About:** Improve Pi prompt/KV cache hit rates with stable prompts, OpenAI-compatible cache keys, proxy compat warnings, and footer cache stats.
-
  > Renamed from `pi-deepseek-cache-optimizer`. Existing footer counters migrate automatically. This package never creates, edits, backs up, or deletes your `~/.pi/agent/models.json`.
 
  ## Contents
@@ -100,6 +98,7 @@ Notes:
  - `supportsLongCacheRetention: true` is optional. Add it only when the endpoint explicitly supports OpenAI long prompt cache retention.
  - If you see `400 Unsupported parameter: prompt_cache_retention`, remove/avoid `supportsLongCacheRetention` for that channel. Keep `sendSessionAffinityHeaders` if supported.
  - Use `/cache-optimizer compat` or `/cache-optimizer doctor` to see model-specific advice.
+ - For DeepSeek models, the Pi Mono guidance expects `compat.requiresReasoningContentOnAssistantMessages: true` and `compat.thinkingFormat: "deepseek"` alongside cache/session-affinity flags when the endpoint supports them.
  - This extension only advises; it does not edit `models.json`.
 
  ## Footer stats
@@ -114,7 +113,7 @@ OpenAI cache 3/10 · 0.002M/0.005M tok (40%) ⚠️ compat
 
  Format: `<label> <hit requests>/<total requests> · <cached input tokens>/<total input tokens> tok (<token hit rate>)`. Some adapters may also append `· write <tokens> tok`, and runtime diagnostics may append `⚠️ compat` or `⚠️ integrity`.
 
- Supported footer labels include: DS, Claude, OpenAI, Gemini, Kimi, Qwen, GLM, MiniMax, Hunyuan, Mistral, Grok, Llama, Nemotron, Cohere, Yi, Doubao, ERNIE, Baichuan, StepFun, Spark, InternLM, Gemma, Phi, Jamba, Solar, Sonar, Nova, Reka, Falcon, DBRX, MPT, StableLM, Aquila, EXAONE, HyperCLOVA, Luminous, Hermes, Granite, Arctic, Pangu, SenseNova, Zhinao, MiniCPM, XVERSE, Orion, OpenChat, Vicuna, Wizard, Zephyr, Dolphin, OpenOrca, Starling, BLOOM, RWKV, and Aya.
+ Supported footer labels include: DS, Claude, OpenAI, Gemini, Kimi, Qwen, GLM, MiniMax, Mimo, Hunyuan, Mistral, Grok, Llama, Nemotron, Cohere, Yi, Doubao, ERNIE, Baichuan, StepFun, Spark, InternLM, Gemma, Phi, Jamba, Solar, Sonar, Nova, Reka, Falcon, DBRX, MPT, StableLM, Aquila, EXAONE, HyperCLOVA, Luminous, Hermes, Granite, Arctic, Pangu, SenseNova, Zhinao, MiniCPM, XVERSE, Orion, OpenChat, Vicuna, Wizard, Zephyr, Dolphin, OpenOrca, Starling, BLOOM, RWKV, and Aya.
 
  Adapter selection uses only model id/name (plus assistant message model/name on message end). Generic OpenAI-shaped APIs are not treated as OpenAI-family unless the model id/name matches a supported family.
 
package/README.zh-CN.md CHANGED
@@ -8,8 +8,6 @@
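 
  An extension for improving provider-side KV cache / prompt cache hit rates in Pi: it moves stable prompt content to the front, adds a conservative `prompt_cache_key` to OpenAI-compatible requests, warns about common cache-routing compatibility problems on proxy channels, and shows read-only cache stats in the footer.
 
- **GitHub About:** Improve Pi prompt/KV cache hit rates with stable prompts, OpenAI-compatible cache keys, proxy compat warnings, and footer cache stats.
-
  > This package was renamed from `pi-deepseek-cache-optimizer`. Existing footer stats migrate automatically. This extension never creates, modifies, backs up, or deletes your `~/.pi/agent/models.json`.
 
  ## Contents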
@@ -100,6 +98,7 @@ LiteLLM / OneAPI / NewAPI / OpenRouter-like channels and other third-party `openai-completion
  - `supportsLongCacheRetention: true` is optional. Add it only when the endpoint explicitly supports OpenAI long prompt cache retention.
  - If `400 Unsupported parameter: prompt_cache_retention` appears, remove / avoid `supportsLongCacheRetention` for that channel; keep `sendSessionAffinityHeaders` if it is supported.
  - Use `/cache-optimizer compat` or `/cache-optimizer doctor` to see specific advice for the current model.
+ - For DeepSeek models, the Pi Mono guidance expects both `compat.requiresReasoningContentOnAssistantMessages: true` and `compat.thinkingFormat: "deepseek"` to be set when supported, together with the cache / session-affinity compat flags.
  - This extension only gives advice; it does not modify `models.json`.
 
  ## Footer stats
@@ -114,7 +113,7 @@ OpenAI cache 3/10 · 0.002M/0.005M tok (40%) ⚠️ compat
 
  Format: `<label> <hit requests>/<total requests> · <cached input tokens>/<total input tokens> tok (<token hit rate>)`. Some adapters may also append `· write <tokens> tok`, and runtime diagnostics may append `⚠️ compat` or `⚠️ integrity`.
 
- Supported footer labels include: DS, Claude, OpenAI, Gemini, Kimi, Qwen, GLM, MiniMax, Hunyuan, Mistral, Grok, Llama, Nemotron, Cohere, Yi, Doubao, ERNIE, Baichuan, StepFun, Spark, InternLM, Gemma, Phi, Jamba, Solar, Sonar, Nova, Reka, Falcon, DBRX, MPT, StableLM, Aquila, EXAONE, HyperCLOVA, Luminous, Hermes, Granite, Arctic, Pangu, SenseNova, Zhinao, MiniCPM, XVERSE, Orion, OpenChat, Vicuna, Wizard, Zephyr, Dolphin, OpenOrca, Starling, BLOOM, RWKV, Aya.
+ Supported footer labels include: DS, Claude, OpenAI, Gemini, Kimi, Qwen, GLM, MiniMax, Mimo, Hunyuan, Mistral, Grok, Llama, Nemotron, Cohere, Yi, Doubao, ERNIE, Baichuan, StepFun, Spark, InternLM, Gemma, Phi, Jamba, Solar, Sonar, Nova, Reka, Falcon, DBRX, MPT, StableLM, Aquila, EXAONE, HyperCLOVA, Luminous, Hermes, Granite, Arctic, Pangu, SenseNova, Zhinao, MiniCPM, XVERSE, Orion, OpenChat, Vicuna, Wizard, Zephyr, Dolphin, OpenOrca, Starling, BLOOM, RWKV, Aya.
 
  Adapter selection looks only at the model id/name (plus the assistant message's model/name at message_end). Merely using an OpenAI-shaped API is not treated as OpenAI-family unless the model id/name matches a supported family.
 
package/index.ts CHANGED
@@ -126,6 +126,7 @@ const MIN_STABLE_CANDIDATE_LENGTH = 8;
  const ASSISTANT_MESSAGE_MODEL_TOKEN_KEYS = ["model", "name"];
  const OPENAI_REASONING_MODEL_PATTERN = /(^|[/\s:_-])o[1345]($|[-_.:/\s])/;
  const XAI_MODEL_PATTERN = /(^|[/\s:_-])xai($|[-_.:/\s])/;
+ const MIMO_MODEL_PATTERN = /(^|[/\s:_-])mi-?mo($|[-_.:/\s])/i;
  const PPLX_MODEL_PATTERN = /(^|[/\s:_-])pplx($|[-_.:/\s])/i;
  const NOVA_MODEL_PATTERN = /(^|[/\s:_-])nova($|[-_.:/\s])/i;
  const MPT_MODEL_PATTERN = /(^|[/\s:_-])mpt($|[-_.:/\s])/i;
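A quick check of what the new `MIMO_MODEL_PATTERN` accepts may help when reviewing this hunk. The pattern is copied from the diff; the sample model ids are hypothetical and chosen only to exercise the boundary groups on either side of `mi-?mo`.

```typescript
// Copied verbatim from the added line in this hunk.
const MIMO_MODEL_PATTERN = /(^|[/\s:_-])mi-?mo($|[-_.:/\s])/i;

// Hypothetical model ids, for illustration only.
const sampleIds = ["mimo-v2", "xiaomi/MiMo-7B", "mi-mo", "xiaomimimo", "mimosa-1"];

// The pattern has no `g` flag, so repeated .test() calls are stateless.
const matchedIds = sampleIds.filter((id) => MIMO_MODEL_PATTERN.test(id));

console.log(matchedIds); // ["mimo-v2", "xiaomi/MiMo-7B", "mi-mo"]
```

`xiaomimimo` does not match because no separator precedes `mimo`, which is presumably why the `isMimoLikeModel` helper added later in this diff also checks for the `xiaomimimo` substring. `mimosa-1` is rejected because `mimo` is followed by a letter rather than a separator or the end of the string.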
@@ -141,6 +142,7 @@ type CacheCompat = {
    sendSessionIdHeader?: boolean;
    supportsLongCacheRetention?: boolean;
    thinkingFormat?: string;
+   requiresReasoningContentOnAssistantMessages?: boolean;
    cacheControlFormat?: string;
  };
 
@@ -831,6 +833,18 @@ function isMiniMaxLikeAssistantMessage(message: unknown, model: PiModel | undefi
    return modelOrAssistantMessageHas(message, model, ["minimax"]);
  }
 
+ function isMimoLikeModel(model: PiModel | undefined): boolean {
+   const tokens = getModelIdNameTokenValues(model);
+   return hasAnyTokenContaining(tokens, ["xiaomimimo"]) || tokens.some((t) => MIMO_MODEL_PATTERN.test(t));
+ }
+ function isMimoLikeAssistantMessage(message: unknown, model: PiModel | undefined): boolean {
+   const allTokens = [
+     ...getModelIdNameTokenValues(model),
+     ...getAssistantMessageModelTokenValues(message),
+   ];
+   return hasAnyTokenContaining(allTokens, ["xiaomimimo"]) || allTokens.some((t) => MIMO_MODEL_PATTERN.test(t));
+ }
+
  function isHunyuanLikeModel(model: PiModel | undefined): boolean {
    return hasAnyTokenContaining(getModelIdNameTokenValues(model), ["hunyuan"]);
  }
@@ -1492,7 +1506,7 @@ function describeMissingOpenAIFamilyProxyCompat(model: PiModel): string[] {
  /**
   * Like describeMissingOpenAIFamilyProxyCompat but without the isOpenAIFamilyModel
   * gate. Warns for ANY model using openai-completions through a non-official base
-  * URL — covers GPT, Kimi, Qwen, GLM, MiniMax, Hunyuan, and any other
+  * URL — covers GPT, Kimi, Qwen, GLM, MiniMax, Mimo, Hunyuan, and any other
   * OpenAI-compatible proxy.
   */
  function describeMissingOpenAICompatibleProxyCompat(model: PiModel): string[] {
@@ -1590,10 +1604,88 @@ function describeMissingDeepSeekCompat(model: PiModel): string[] {
    } else if (compat.sendSessionAffinityHeaders !== true) {
      missing.push("sendSessionAffinityHeaders");
    }
+   if (compat.requiresReasoningContentOnAssistantMessages !== true) {
+     missing.push("requiresReasoningContentOnAssistantMessages");
+   }
+   if (compat.thinkingFormat !== "deepseek") {
+     missing.push("thinkingFormat");
+   }
 
    return missing;
  }
 
+ function isDeepSeekCompatCheckApplicable(model: PiModel): boolean {
+   return isDeepSeekLikeModel(model) && isOpenAICompatibleApi(model.api);
+ }
+
+ function describeMissingCacheCompatForModel(model: PiModel): string[] {
+   if (isDeepSeekCompatCheckApplicable(model)) {
+     return describeMissingDeepSeekCompat(model);
+   }
+   return describeMissingOpenAICompatibleProxyCompat(model);
+ }
+
+ function buildDeepSeekCompatSuggestion(missing: string[]): Record<string, unknown> {
+   const suggestion: Record<string, unknown> = {};
+
+   if (missing.includes("supportsLongCacheRetention")) {
+     suggestion.supportsLongCacheRetention = true;
+   }
+   if (missing.includes("sendSessionIdHeader")) {
+     suggestion.sendSessionIdHeader = true;
+   }
+   if (missing.includes("sendSessionAffinityHeaders")) {
+     suggestion.sendSessionAffinityHeaders = true;
+   }
+   if (missing.includes("requiresReasoningContentOnAssistantMessages")) {
+     suggestion.requiresReasoningContentOnAssistantMessages = true;
+   }
+   if (missing.includes("thinkingFormat")) {
+     suggestion.thinkingFormat = "deepseek";
+   }
+
+   return suggestion;
+ }
+
+ function appendDeepSeekCompatAdviceLines(lines: string[], missing: string[]): void {
+   const suggestion = buildDeepSeekCompatSuggestion(missing);
+   if (Object.keys(suggestion).length > 0) {
+     lines.push("Recommended DeepSeek compat snippet:");
+     lines.push(JSON.stringify(suggestion, null, 2));
+   }
+
+   if (missing.includes("requiresReasoningContentOnAssistantMessages")) {
+     lines.push('- requiresReasoningContentOnAssistantMessages: true keeps replayed assistant turns compatible with DeepSeek reasoning_content requirements.');
+   }
+   if (missing.includes("thinkingFormat")) {
+     lines.push('- thinkingFormat: "deepseek" tells Pi to use DeepSeek reasoning/thinking parameter format.');
+   }
+   if (missing.includes("sendSessionAffinityHeaders")) {
+     lines.push("- sendSessionAffinityHeaders: recommended for OpenAI-compatible DeepSeek proxies when supported; it helps keep one Pi session on the same upstream/backend.");
+   }
+   if (missing.includes("sendSessionIdHeader")) {
+     lines.push("- sendSessionIdHeader: recommended for OpenAI Responses-compatible DeepSeek proxies when supported.");
+   }
+   if (missing.includes("supportsLongCacheRetention")) {
+     lines.push("- supportsLongCacheRetention: enable for DeepSeek-compatible endpoints that support long cache retention.");
+   }
+ }
+
+ function buildDeepSeekCompatWarningText(key: string, missing: string[]): string {
+   const slashIdx = key.indexOf("/");
+   const providerLabel = slashIdx > 0 ? key.slice(0, slashIdx) : key;
+   const modelsJsonPath = getModelsJsonDisplayPath();
+   const lines: string[] = [
+     `💡 pi-cache-optimizer: ${key} is DeepSeek-like but merged compat lacks ${missing.join(" and ")}.`,
+     `Proxies may reduce or hide cache hits. Edit ${modelsJsonPath} -> providers["${providerLabel}"] -> compat (at the same level as baseUrl/api/apiKey/models).`,
+     "",
+   ];
+
+   appendDeepSeekCompatAdviceLines(lines, missing);
+
+   return lines.join("\n");
+ }
+
  const CACHE_PROVIDER_ADAPTERS: CacheProviderAdapter[] = [
    {
      id: "deepseek",
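The two reasoning-related checks added to `describeMissingDeepSeekCompat`, and the way `buildDeepSeekCompatSuggestion` turns missing flags into a snippet, can be tried in isolation with the reduced sketch below. `ReasoningCompat`, `hasFlag`, `missingReasoningCompat`, and `suggestReasoningCompat` are hypothetical names; the cache and session-affinity flags are deliberately left out because the start of `describeMissingDeepSeekCompat` is not visible in this diff.

```typescript
// Hypothetical reduced type: only the two fields this hunk adds checks for.
type ReasoningCompat = {
  thinkingFormat?: string;
  requiresReasoningContentOnAssistantMessages?: boolean;
};

// Small helper so the sketch does not depend on Array.prototype.includes.
const hasFlag = (list: string[], flag: string): boolean => list.indexOf(flag) !== -1;

// Mirrors the two added checks: anything other than the exact expected value counts as missing.
function missingReasoningCompat(compat: ReasoningCompat): string[] {
  const missing: string[] = [];
  if (compat.requiresReasoningContentOnAssistantMessages !== true) {
    missing.push("requiresReasoningContentOnAssistantMessages");
  }
  if (compat.thinkingFormat !== "deepseek") {
    missing.push("thinkingFormat");
  }
  return missing;
}

// Mirrors the matching branches of buildDeepSeekCompatSuggestion.
function suggestReasoningCompat(missing: string[]): Record<string, unknown> {
  const suggestion: Record<string, unknown> = {};
  if (hasFlag(missing, "requiresReasoningContentOnAssistantMessages")) {
    suggestion.requiresReasoningContentOnAssistantMessages = true;
  }
  if (hasFlag(missing, "thinkingFormat")) {
    suggestion.thinkingFormat = "deepseek";
  }
  return suggestion;
}

// An empty compat object is missing both flags, so both appear in the suggestion.
console.log(JSON.stringify(suggestReasoningCompat(missingReasoningCompat({})), null, 2));
```

Because the checks use strict comparison, a `thinkingFormat` set to any other string is reported as missing, and the suggestion then overwrites it with `"deepseek"`.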
@@ -1613,13 +1705,7 @@ const CACHE_PROVIDER_ADAPTERS: CacheProviderAdapter[] = [
        if (missing.length === 0) return undefined;
 
        const key = modelKey(model);
-       const slashIdx = key.indexOf("/");
-       const providerLabel = slashIdx > 0 ? key.slice(0, slashIdx) : key;
-       const modelsJsonPath = getModelsJsonDisplayPath();
-       return (
-         `💡 pi-cache-optimizer: ${key} is DeepSeek-like but merged compat lacks ${missing.join(" and ")}. ` +
-         `Proxies may reduce or hide cache hits. Edit ${modelsJsonPath} -> providers["${providerLabel}"] -> compat (at the same level as baseUrl/api/apiKey/models).`
-       );
+       return buildDeepSeekCompatWarningText(key, missing);
      },
    },
    {
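The warning text, both before and after this refactor, derives the provider label from the model key by splitting at the first `/`. A minimal sketch of that step follows; `providerLabelOf` is a hypothetical wrapper name and the keys are made-up examples.

```typescript
// Same expression as in the diff: the text before the first "/" is the provider label.
// A key with no "/" (or with "/" at index 0) is used whole.
function providerLabelOf(key: string): string {
  const slashIdx = key.indexOf("/");
  return slashIdx > 0 ? key.slice(0, slashIdx) : key;
}

// Hypothetical keys for illustration.
console.log(providerLabelOf("my-proxy/deepseek-chat")); // "my-proxy"
console.log(providerLabelOf("deepseek-chat"));          // "deepseek-chat"
```

The label is what the message interpolates into `providers["..."]` when it tells the user where to add the `compat` block.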
@@ -1742,6 +1828,23 @@ const CACHE_PROVIDER_ADAPTERS: CacheProviderAdapter[] = [
        return buildOpenAIProxyCompatWarningText(modelKey(model), missing);
      },
    },
+   {
+     id: "openai" as CacheProviderId,
+     label: "Mimo cache",
+     matchesModel: isMimoLikeModel,
+     matchesAssistantMessage(message, model) {
+       if (!isAssistantMessage(message)) return false;
+       return isMimoLikeAssistantMessage(message, model);
+     },
+     normalizeUsage(message) {
+       return normalizeWithFallback(message, getOpenAIRawUsage);
+     },
+     warningText(model) {
+       const missing = describeMissingOpenAICompatibleProxyCompat(model);
+       if (missing.length === 0) return undefined;
+       return buildOpenAIProxyCompatWarningText(modelKey(model), missing);
+     },
+   },
    {
      id: "openai" as CacheProviderId,
      label: "Hunyuan cache",
@@ -3028,6 +3131,12 @@ function isCompatCheckApplicable(model: PiModel): boolean {
    return lower(model.api) === "openai-completions" && !isOfficialOpenAIBaseUrl(model);
  }
 
+ function isPromptCacheRetention400Applicable(model: PiModel): boolean {
+   return isOpenAICompatibleApi(model.api) &&
+     !isOfficialOpenAIBaseUrl(model) &&
+     getCompat(model).supportsLongCacheRetention === true;
+ }
+
  /**
   * Detect router / channel profiles from a PiModel and return diagnostic notes.
   *
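The new `isPromptCacheRetention400Applicable` guard combines three conditions. The sketch below reproduces its shape with stand-ins, since the bodies of `isOpenAICompatibleApi` and `isOfficialOpenAIBaseUrl` are not part of this diff. `ModelLike`, both stand-in helpers, and the example URLs are assumptions made for illustration, not the extension's real logic.

```typescript
// Hypothetical reduced model shape.
interface ModelLike {
  api: string;
  baseUrl?: string;
  compat?: { supportsLongCacheRetention?: boolean };
}

// Assumption: stand-in for a helper whose body is not shown in the diff.
const isOpenAICompatibleApiSketch = (api: string): boolean =>
  api.toLowerCase().indexOf("openai-") === 0;

// Assumption: stand-in that treats only api.openai.com as official.
const isOfficialOpenAIBaseUrlSketch = (m: ModelLike): boolean =>
  (m.baseUrl ?? "").indexOf("api.openai.com") !== -1;

// Same three-part conjunction as the added function.
function retention400Applicable(m: ModelLike): boolean {
  return isOpenAICompatibleApiSketch(m.api) &&
    !isOfficialOpenAIBaseUrlSketch(m) &&
    m.compat?.supportsLongCacheRetention === true;
}

console.log(retention400Applicable({
  api: "openai-completions",
  baseUrl: "https://proxy.example.com/v1",
  compat: { supportsLongCacheRetention: true },
})); // true
```

Later hunks replace the pair `isCompatCheckApplicable(model)` plus an explicit `supportsLongCacheRetention === true` test with this single helper, so the 400 hint and the retention warning share one definition of when they apply.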
@@ -3171,7 +3280,7 @@ function describeRouterChannelDiagnostics(model: PiModel): string[] {
 
    // ── 4. Generic third-party OpenAI-compatible proxy ─────────────────
    if (api === "openai-completions" && baseUrl) {
-     const missing = describeMissingOpenAICompatibleProxyCompat(model);
+     const missing = describeMissingCacheCompatForModel(model);
      notes.push(
        "🔀 Router/channel: Third-party OpenAI-compatible proxy. If cache hit rates are low:",
      );
@@ -3207,7 +3316,8 @@ function buildDoctorDiagnosis(model: PiModel, options: { promptCacheRetention400
    const compat = getCompat(model);
    lines.push(`Compat: ${JSON.stringify(compat)}`);
 
-   const missing = describeMissingOpenAICompatibleProxyCompat(model);
+   const deepSeekCompatApplicable = isDeepSeekCompatCheckApplicable(model);
+   const missing = describeMissingCacheCompatForModel(model);
    if (missing.length > 0) {
      lines.push(`⚠️ Missing compat flags: ${missing.join(", ")}`);
      const key = modelKey(model);
@@ -3215,14 +3325,18 @@ function buildDoctorDiagnosis(model: PiModel, options: { promptCacheRetention400
      const providerLabel = slashIdx > 0 ? key.slice(0, slashIdx) : key;
      const modelsJsonPath = getModelsJsonDisplayPath();
      lines.push(`Edit ${modelsJsonPath} -> providers["${providerLabel}"] -> compat (same level as baseUrl/api/apiKey/models).`);
-     appendOpenAIProxyCompatAdviceLines(lines, missing);
-   } else if (isCompatCheckApplicable(model)) {
+     if (deepSeekCompatApplicable) {
+       appendDeepSeekCompatAdviceLines(lines, missing);
+     } else {
+       appendOpenAIProxyCompatAdviceLines(lines, missing);
+     }
+   } else if (deepSeekCompatApplicable || isCompatCheckApplicable(model)) {
      lines.push("✅ Compat fully configured.");
    } else {
      lines.push("ℹ️ Compat check not applicable for this model.");
    }
 
-   if (isCompatCheckApplicable(model) && compat.supportsLongCacheRetention === true) {
+   if (isPromptCacheRetention400Applicable(model)) {
      lines.push("");
      if (options.promptCacheRetention400) {
        lines.push("⚠️ A 400 response was observed while supportsLongCacheRetention is enabled.");
@@ -3274,8 +3388,8 @@ function buildLowHitDiagnosis(
  ): string[] {
    const lines: string[] = [];
 
-   // 1. Missing compat flags (reuse existing check)
-   const missingCompat = describeMissingOpenAICompatibleProxyCompat(model);
+   // 1. Missing compat flags (adapter-aware: DeepSeek has extra reasoning compat)
+   const missingCompat = describeMissingCacheCompatForModel(model);
 
    // 2. Router/channel risk (reuse existing check)
    const routerNotes = describeRouterChannelDiagnostics(model);
@@ -3371,7 +3485,8 @@ function buildLowHitDiagnosis(
  }
 
  function buildCompatDiagnosis(model: PiModel): string | undefined {
-   const missing = describeMissingOpenAICompatibleProxyCompat(model);
+   const missing = describeMissingCacheCompatForModel(model);
+   const deepSeekCompatApplicable = isDeepSeekCompatCheckApplicable(model);
    const routerNotes = describeRouterChannelDiagnostics(model);
 
    if (missing.length === 0 && routerNotes.length === 0) return undefined;
@@ -3388,14 +3503,18 @@ function buildCompatDiagnosis(model: PiModel): string | undefined {
      lines.push("");
      lines.push(`Edit ${modelsJsonPath} -> providers["${providerLabel}"] -> compat`);
      lines.push(`(at the same level as baseUrl/api/apiKey/models).`);
-     appendOpenAIProxyCompatAdviceLines(lines, missing);
+     if (deepSeekCompatApplicable) {
+       appendDeepSeekCompatAdviceLines(lines, missing);
+     } else {
+       appendOpenAIProxyCompatAdviceLines(lines, missing);
+     }
    }
 
    // When compat is fully configured but router notes exist, prefix the status.
    if (routerNotes.length > 0 && missing.length === 0) {
-     if (isCompatCheckApplicable(model)) {
+     if (deepSeekCompatApplicable || isCompatCheckApplicable(model)) {
        lines.push("✅ Compat fully configured.");
-       if (getCompat(model).supportsLongCacheRetention === true) {
+       if (isPromptCacheRetention400Applicable(model)) {
          lines.push(getPromptCacheRetentionUnsupportedHint());
        }
      } else {
@@ -3441,9 +3560,16 @@ export const __internals_for_tests = {
    isOpenAIFamilyToken,
    describeMissingOpenAIFamilyProxyCompat,
    describeMissingOpenAICompatibleProxyCompat,
+   describeMissingDeepSeekCompat,
+   isDeepSeekCompatCheckApplicable,
+   describeMissingCacheCompatForModel,
+   buildDeepSeekCompatSuggestion,
+   buildDeepSeekCompatWarningText,
    buildSafeOpenAIProxyCompatSuggestion,
    getPromptCacheRetentionUnsupportedHint,
    isOfficialOpenAIBaseUrl,
+   isCompatCheckApplicable,
+   isPromptCacheRetention400Applicable,
    // Non-GPT OpenAI-compatible model detection
    isKimiLikeModel,
    isKimiLikeAssistantMessage,
@@ -3453,6 +3579,8 @@ export const __internals_for_tests = {
    isGLMLikeAssistantMessage,
    isMiniMaxLikeModel,
    isMiniMaxLikeAssistantMessage,
+   isMimoLikeModel,
+   isMimoLikeAssistantMessage,
    isHunyuanLikeModel,
    isHunyuanLikeAssistantMessage,
    // Additional OpenAI-compatible model detection
@@ -3551,6 +3679,8 @@ export const __internals_for_tests = {
    isRwkvLikeAssistantMessage,
    isAyaLikeModel,
    isAyaLikeAssistantMessage,
+   selectAdapterForModel,
+   selectAdapterForAssistantMessage,
    buildOpenAIProxyCompatWarningText,
    getModelIdNameTokenValues,
    getAssistantMessageModelTokenValues,
@@ -3899,15 +4029,15 @@ export default function (pi: ExtensionAPI) {
      }
    }
 
-   // ⚠️ compat footer marker: if the active model is a non-official
-   // openai-completions model with missing supportsLongCacheRetention
-   // or sendSessionAffinityHeaders, append the marker to indicate that
-   // compat configuration is incomplete. Re-evaluated on every status
-   // update so the marker persists through stats changes and day
-   // rollovers. Redundant setStatus calls are blocked by the
+   // ⚠️ compat footer marker: if the active model has adapter-specific
+   // missing compat (DeepSeek reasoning/cache compat, or a non-official
+   // openai-completions model missing cache/session-affinity flags), append
+   // the marker to indicate that compat configuration is incomplete.
+   // Re-evaluated on every status update so the marker persists through stats
+   // changes and day rollovers. Redundant setStatus calls are blocked by the
    // `lastStatusText` early return above.
    if (runtimeOptimizerEnabled && statusText !== undefined && model) {
-     const compatMissing = describeMissingOpenAICompatibleProxyCompat(model);
+     const compatMissing = describeMissingCacheCompatForModel(model);
      if (compatMissing.length > 0) {
        statusText = statusText + " ⚠️ compat";
      }
@@ -4027,8 +4157,7 @@ export default function (pi: ExtensionAPI) {
    const model = ctx.model;
    if (!runtimeOptimizerEnabled || !model) return;
    if (event.status !== 400) return;
-   if (!isCompatCheckApplicable(model)) return;
-   if (getCompat(model).supportsLongCacheRetention !== true) return;
+   if (!isPromptCacheRetention400Applicable(model)) return;
 
    const key = modelKey(model);
    promptCacheRetention400Models.add(key);
@@ -4140,7 +4269,7 @@ export default function (pi: ExtensionAPI) {
      cmdCtx.ui.notify(compatResult, "warning");
    } else {
      cmdCtx.ui.notify(
-       isCompatCheckApplicable(model)
+       isDeepSeekCompatCheckApplicable(model) || isCompatCheckApplicable(model)
          ? "✅ Compat fully configured."
          : "ℹ️ Compat check not applicable for this model.",
        "info",
@@ -4238,7 +4367,7 @@ export default function (pi: ExtensionAPI) {
      cmdCtx.ui.notify(compatResult, "warning");
    } else {
      cmdCtx.ui.notify(
-       isCompatCheckApplicable(model)
+       isDeepSeekCompatCheckApplicable(model) || isCompatCheckApplicable(model)
          ? "✅ Compat fully configured."
          : "ℹ️ Compat check not applicable for this model.",
        "info",
@@ -4285,11 +4414,11 @@ export default function (pi: ExtensionAPI) {
    diagnosis.push("");
    if (model) {
      const displayKey = modelKey(model);
-     const missing = describeMissingOpenAICompatibleProxyCompat(model);
+     const missing = describeMissingCacheCompatForModel(model);
      if (missing.length > 0) {
        diagnosis.push(`⚠️ Active model "${displayKey}" missing compat: ${missing.join(", ")}`);
        diagnosis.push('Run "/cache-optimizer compat" for edit instructions.');
-     } else if (isCompatCheckApplicable(model)) {
+     } else if (isDeepSeekCompatCheckApplicable(model) || isCompatCheckApplicable(model)) {
        diagnosis.push(`✅ Active model "${displayKey}": compat fully configured.`);
      } else {
        diagnosis.push(`ℹ️ Active model "${displayKey}": compat check not applicable.`);
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "pi-cache-optimizer",
-   "version": "2.5.3",
+   "version": "2.5.4",
    "description": "Improve Pi prompt/KV cache hit rates with stable prompts, OpenAI-compatible cache keys, proxy compat warnings, and footer cache stats.",
    "keywords": [
      "pi-package",