npm - claude-code-cache-fix - Versions diffs - 3.3.0 → 3.5.0 - Mend

claude-code-cache-fix 3.3.0 → 3.5.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (15)

package/README.ko.md +3 -3
package/README.md +58 -3
package/README.zh.md +3 -3
package/package.json +2 -2
package/proxy/extensions/cache-telemetry.mjs +139 -17
package/proxy/extensions/identity-normalization.mjs +1 -1
package/proxy/extensions/image-strip.mjs +7 -2
package/proxy/extensions/messages-cache-breakpoint.mjs +314 -0
package/proxy/extensions/microcompact-stability.mjs +429 -0
package/proxy/extensions/ttl-management.mjs +2 -1
package/proxy/extensions/ttl-tier-detect.mjs +33 -0
package/proxy/extensions.json +3 -0
package/tools/cache-test.sh +19 -11
package/tools/cross-version-cache-test.sh +4 -4
package/tools/quota-statusline.sh +75 -19

package/README.ko.md CHANGED

@@ -39,7 +39,7 @@ ANTHROPIC_BASE_URL=http://127.0.0.1:9801 claude
 | `identity-normalization` | Normalizes message identity fields for prefix stability |
 | `fresh-session-sort` | Fixes non-deterministic ordering on the first turn |
 | `cache-control-normalize` | Normalizes cache_control markers across messages |
-| `cache-telemetry` | Extracts cache stats from response headers and writes them to `~/.claude/quota-status.json` |
+| `cache-telemetry` | Extracts cache stats from response headers and writes them to `~/.claude/quota-status/{account.json,sessions/<id>.json}` |
 Extensions are hot-reloaded: add, remove, or modify `.mjs` files in `proxy/extensions/` and the change applies from the next request without restarting the proxy. Configuration lives in `proxy/extensions.json`.
@@ -202,7 +202,7 @@ Fixes are disabled — consider re-enabling to recover cache performance.
 ## Status line: real-time quota warnings
-Both proxy and preload modes write quota state to `~/.claude/quota-status.json` on every API call. The included `tools/quota-statusline.sh` script can display a live status:
+Both modes write quota state on every API call. Proxy mode (v3.5.0+) splits it into `~/.claude/quota-status/account.json` (account-global: Q5h/Q7d, status, overage) and `~/.claude/quota-status/sessions/<id>.json` (per-session: TTL tier, hit rate). Preload mode keeps the existing `~/.claude/quota-status.json` (single-session by construction). The included `tools/quota-statusline.sh` script can display a live status:
 - **Q5h %** (burn rate, %/min)
 - **Q7d %** (burn rate, %/hr)
@@ -292,7 +292,7 @@ npm install sharp
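 ## Monitoring & diagnostics
-The preload interceptor includes monitoring for microcompact degradation, false rate limiters, GrowthBook flag state, usage telemetry, and cost reports. Quota tracking works in both proxy and preload modes via `~/.claude/quota-status.json`.
+The preload interceptor includes monitoring for microcompact degradation, false rate limiters, GrowthBook flag state, usage telemetry, and cost reports. Quota tracking works via `~/.claude/quota-status/` (proxy: per-session split) or `~/.claude/quota-status.json` (preload: single-session legacy path).
 See [docs/monitoring.md](docs/monitoring.md) for full details, debug mode, prefix diffing, environment variables, and the built-in quota analysis tool.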

package/README.md CHANGED

@@ -39,7 +39,7 @@ On every `/v1/messages` request, 7 extensions run in order:
 | `identity-normalization` | Normalizes message identity fields for prefix stability |
 | `fresh-session-sort` | Fixes non-deterministic ordering on first turn |
 | `cache-control-normalize` | Normalizes cache_control markers across messages |
-| `cache-telemetry` | Extracts cache stats from response headers → `~/.claude/quota-status.json` |
+| `cache-telemetry` | Extracts cache stats from response headers → `~/.claude/quota-status/{account.json,sessions/<id>.json}` |
 Extensions are hot-reloadable — add, remove, or modify `.mjs` files in `proxy/extensions/` and changes apply to the next request without restarting. Configuration in `proxy/extensions.json`.
@@ -280,7 +280,7 @@ The interceptor can only *help* or *do nothing*. It cannot make things worse.
 ## Status line — quota warnings in real time
-Both proxy and preload modes write quota state to `~/.claude/quota-status.json` on every API call. The included `tools/quota-statusline.sh` script displays a live status line showing:
+Both modes write quota state on every API call. Proxy mode (v3.5.0+) splits into `~/.claude/quota-status/account.json` (account-global fields: Q5h/Q7d, status, overage) plus `~/.claude/quota-status/sessions/<id>.json` (per-session cache fields: TTL tier, hit rate). Preload mode keeps the legacy `~/.claude/quota-status.json` (single-session by construction). The included `tools/quota-statusline.sh` script displays a live status line showing:
 - **Q5h %** with burn rate (%/min)
 - **Q7d %** with burn rate (%/hr)
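
A consumer of the v3.5.0+ proxy layout has to combine two files, and may still meet the legacy single file under preload mode. The sketch below shows one way a reader might do that in Node. `readJsonOrNull` and `readQuotaStatus` are hypothetical names that are not part of the package, and the fallback order is an assumption based only on the paths named in the README text above.

```javascript
import { readFileSync } from "node:fs";
import { join } from "node:path";
import { homedir } from "node:os";

// Hypothetical helper: parse a JSON file, or return null when the file is
// missing or does not contain valid JSON.
function readJsonOrNull(path) {
  try {
    return JSON.parse(readFileSync(path, "utf8"));
  } catch {
    return null;
  }
}

// Hypothetical reader. `sessionFile` is the already-sanitized file name
// without the ".json" suffix. The split proxy layout is tried first; the
// legacy single file is used only when neither split file can be read.
function readQuotaStatus(sessionFile, claudeDir = join(homedir(), ".claude")) {
  const quotaDir = join(claudeDir, "quota-status");
  const account = readJsonOrNull(join(quotaDir, "account.json"));
  const session = readJsonOrNull(join(quotaDir, "sessions", `${sessionFile}.json`));
  if (account || session) return { source: "proxy", account, session };

  const legacy = readJsonOrNull(join(claudeDir, "quota-status.json"));
  return legacy ? { source: "legacy", legacy } : null;
}
```

Returning `null` when nothing is readable lets a status-line script print a neutral placeholder instead of failing on a fresh install.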
@@ -409,13 +409,66 @@ If `sharp` is missing, Pass 3 skips cleanly (telemetry records `library_missing:
 | `CACHE_FIX_IMAGE_REQUEST_SIZE_MAX` | 31457280 (30 MB) | Pass 2 byte budget. 2 MB headroom from Anthropic's 32 MB ceiling. |
 | `CACHE_FIX_IMAGE_COUNT_MAX` | 100 | Hard image-count cap. Set to 600 for legacy Claude 1/2.x/Instant if needed. |
+## Cache breakpoints (proxy mode, opt-in)
+Anthropic's prompt cache supports up to **four** `cache_control` markers per request. Claude Code currently uses three of the four; the third (between auto-injected `messages[0]` content — hooks, skills, project CLAUDE.md, deferred tools, MCP server descriptions — and the first real user content) is missing entirely. Without that marker, every change inside the auto-injected span busts the cache for everything that follows. wadabum projected ~6,500 token savings per fresh-session first turn from adding it ([anthropics/claude-code#47098](https://github.com/anthropics/claude-code/issues/47098)).
+The proxy can inject the missing marker on opt-in. Default off until validated against community data.
+```sh
+export CACHE_FIX_INJECT_MESSAGES_BREAKPOINT=1
+```
+The injection is conservative: it only fires when the request already carries 1–3 markers (typical CC shape) and refuses if the request is at the 4-marker limit (would 400) or has zero markers (Agent SDK / API-direct shape this extension isn't built for). Boundary detection covers all five observed auto-injected block kinds — hooks, skills, CLAUDE.md, deferred-tools, MCP — and lands the marker on the LAST auto-injected block.
+A diagnostic-only env var dumps the structural shape of `messages[0]` for fixture sourcing without mutating the request:
+```sh
+export CACHE_FIX_DUMP_MESSAGES_HEAD=/tmp/messages-head.jsonl
+```
+| Env var | Default | Purpose |
+|---------|---------|---------|
+| `CACHE_FIX_INJECT_MESSAGES_BREAKPOINT` | unset | Enable breakpoint #3 injection (`=1` opt-in). |
+| `CACHE_FIX_DUMP_MESSAGES_HEAD` | unset | Diagnostic JSONL dump of `messages[0].content` shape — read-only, no mutation. |
+## Microcompact stability (proxy mode, opt-in)
+After ~90 minutes idle, Claude Code's `time_based_microcompact` (and the cold-compact path triggered by `FDY()`) replaces old `tool_result` content with a sentinel string. The original content is gone for cache purposes; that part is unrecoverable from the proxy. But the sentinel itself can carry an embedded timestamp (`[Old tool result content cleared at 2026-04-30T13:42:11Z]`), which means a *second* microcompact pass against the same already-cleared position writes different bytes — busting the cache for everything after that position even though no new content was added.
+This extension addresses the recoverable half: normalize the sentinel to a byte-stable canonical form so repeat microcompacts don't churn the cache. **Phase 1 only** — diagnostic + opt-in normalization. Phase 2 (snapshot-and-restore of original tool_result content) is deferred to v3.5.0+ pending Phase 1 production data.
+```sh
+# Step 1 (diagnostic): characterize what CC's sentinel actually looks like.
+export CACHE_FIX_DUMP_MICROCOMPACT=/tmp/microcompact-dump.jsonl
+# Step 2 (normalize): once the sentinel format is confirmed, opt-in.
+export CACHE_FIX_NORMALIZE_MICROCOMPACT=1
+```
+Detection has two modes:
+- **Mode A** — exact match against confirmed CC sentinel patterns (the bare form and the ISO-8601 timestamp variant). Mode A matches are eligible for normalization.
+- **Mode B** — prefix-only match (text begins with `[Old tool result content cleared` but does not exactly match a Mode A pattern). Mode B is **diagnostic-only**: never normalized, dump records redact to a 64-char prefix only.
+The Mode A/B separation protects against cases where the sentinel might be followed by user-derived content (e.g., a tool that echoed user input back into its result) — the redaction guarantee on Mode B keeps that content out of the diagnostic dump.
+| Env var | Default | Purpose |
+|---------|---------|---------|
+| `CACHE_FIX_DUMP_MICROCOMPACT` | unset | Path for diagnostic JSONL dump of detected sentinels. Read-only — no mutation. |
+| `CACHE_FIX_NORMALIZE_MICROCOMPACT` | unset | Enable normalization (`=1` opts in). Mutates Mode A matches to canonical form. |
+| `CACHE_FIX_MICROCOMPACT_NORMALIZED` | `[Old tool result content cleared]` | Override the canonical replacement string. |
+| `CACHE_FIX_MICROCOMPACT_SENTINEL_PATTERN_<N>` | unset | Add custom Mode A regex pattern(s). Numbered (1-indexed, sparse OK). |
+| `CACHE_FIX_MICROCOMPACT_SENTINEL_PREFIX_<N>` | unset | Custom Mode B literal prefix(es). Pair with a custom Mode A pattern from a non-default sentinel family so prefix-only variants of that family also get redacted Mode B capture. |
+| `CACHE_FIX_MICROCOMPACT_REDACT_LEN` | `64` | Mode B prefix length in dump records. Set to `0` to suppress the prefix entirely. |
+| `CACHE_FIX_DUMP_MICROCOMPACT_INCLUDE_NORMALIZED` | unset | Add post-normalization text alongside (not replacing) raw `sentinel_text` in dump records. |
 ## System prompt rewrite (preload mode, optional)
 The interceptor can rewrite Claude Code's `# Output efficiency` system-prompt section. Disabled by default. Enable with `CACHE_FIX_OUTPUT_EFFICIENCY_REPLACEMENT`. See [docs/output-efficiency-prompts.md](docs/output-efficiency-prompts.md) for the three known prompt variants and usage instructions.
 ## Monitoring & diagnostics
-The preload interceptor includes monitoring for microcompact degradation, false rate limiters, GrowthBook flag state, usage telemetry, and cost reporting. Quota tracking works in both proxy and preload modes via `~/.claude/quota-status.json`.
+The preload interceptor includes monitoring for microcompact degradation, false rate limiters, GrowthBook flag state, usage telemetry, and cost reporting. Quota tracking works in both proxy and preload modes via `~/.claude/quota-status/` (proxy: per-session split) or `~/.claude/quota-status.json` (preload: single-session legacy path).
 See [docs/monitoring.md](docs/monitoring.md) for full details, debug mode, prefix diffing, environment variables, and the bundled quota analysis tool.
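
The injection guard described in the cache-breakpoints section above (act only when the request already carries 1 to 3 markers) can be sketched as a pure function. This is an illustration under stated assumptions, not the code in `messages-cache-breakpoint.mjs`, which is not reproduced here. `countCacheControlMarkers` and `breakpointDecision` are hypothetical names, and the count only inspects top-level `tools`, `system` blocks, and message content blocks. Boundary detection is out of scope.

```javascript
// Per-request marker limit, as stated in the README text.
const MAX_MARKERS = 4;

// Hypothetical helper: count cache_control markers on tools, system blocks,
// and message content blocks of a Messages API request body.
function countCacheControlMarkers(body) {
  let count = 0;
  const visit = (block) => {
    if (block && typeof block === "object" && block.cache_control) count += 1;
  };
  for (const tool of body.tools ?? []) visit(tool);
  if (Array.isArray(body.system)) body.system.forEach(visit);
  for (const msg of body.messages ?? []) {
    // String-valued content cannot carry a marker, so only arrays are walked.
    if (Array.isArray(msg.content)) msg.content.forEach(visit);
  }
  return count;
}

// Returns a decision instead of mutating the request, so the guard can be
// tested on its own.
function breakpointDecision(body) {
  const existing = countCacheControlMarkers(body);
  if (existing === 0) return { inject: false, reason: "no-markers", existing };
  if (existing >= MAX_MARKERS) return { inject: false, reason: "at-limit", existing };
  return { inject: true, reason: "ok", existing };
}
```

Keeping the two refusal reasons distinct mirrors the README's rationale: zero markers suggests a request shape the extension is not built for, while four markers means one more would be rejected.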
@@ -440,6 +493,7 @@ We monitor 30+ upstream Claude Code issues related to cache, quota, and context
 ## Used in production
 - **[Crunchloop DAP](https://dap.crunchloop.ai)** — Agent SDK / DAP development environment. First production team to merge the interceptor to trunk for team-wide deployment (2026-04-10). Identified two distinct cache regression patterns through real-world testing — tool ordering jitter and the fresh-session sort gap — and contributed debug traces that drove the v1.5.1 and v1.6.2 fixes.
+- **[VM Farms](https://vmfarms.com)** ([@vmfarms](https://github.com/vmfarms)) — Agent development environment running concurrent multi-runner workloads with `--resume --fork-session`. Surfaced three cache-fix proxy-mode bugs: the resume-marker regex no-op (#96), TTL tier detection gap vs preload mode (#97), and image-strip stderr leak past `CACHE_FIX_DEBUG` (#98) — all addressed in the v3.4.0 release.
 ## Contributors
@@ -456,6 +510,7 @@ We monitor 30+ upstream Claude Code issues related to cache, quota, and context
 - **[@JEONG-JIWOO](https://github.com/JEONG-JIWOO)** — VS Code extension investigation: discovered `claudeCode.claudeProcessWrapper` as the working integration path, wrote the C wrapper for Windows (#16)
 - **[@X-15](https://github.com/X-15)** — VS Code extension validation, per-fix health status analysis confirming safety check behavior on v2.1.105 (#16)
 - **[@deafsquad](https://github.com/deafsquad)** — Universal smoosh_split un-smoosh fix (PR #26), source-level function attribution of resume scatter bug (anthropics/claude-code#43657), OTEL telemetry discovery, proposed and built proxy architecture for v3.0.0
+- **[@vmfarms](https://github.com/vmfarms)** — Concurrent multi-runner production validation, surfaced proxy-mode resume-marker regex no-op (#96), TTL tier detection gap (#97), and image-strip stderr leak (#98)
 If you contributed to the community effort on these issues and aren't listed here, please open an issue or PR — we want to credit everyone properly.
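
The Mode A / Mode B split from the microcompact-stability section of this README diff can be illustrated with a small classifier. The two regular expressions below are assumptions inferred from the README's own examples (the bare sentinel and the ISO-8601 timestamp variant); the patterns actually shipped in `microcompact-stability.mjs` may differ. `classifySentinel` is a hypothetical name.

```javascript
const CANONICAL = "[Old tool result content cleared]";
const PREFIX = "[Old tool result content cleared";

// Assumed Mode A patterns. No `g` flag, so RegExp.prototype.test is stateless.
const MODE_A = [
  /^\[Old tool result content cleared\]$/,
  /^\[Old tool result content cleared at \d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?Z\]$/,
];

// Hypothetical classifier: Mode A is eligible for normalization, Mode B is
// diagnostic-only and exposes at most `redactLen` leading characters.
function classifySentinel(text, redactLen = 64) {
  if (typeof text !== "string") return { mode: null };
  if (MODE_A.some((re) => re.test(text))) {
    return { mode: "A", normalized: CANONICAL };
  }
  if (text.startsWith(PREFIX)) {
    return { mode: "B", redacted: text.slice(0, redactLen) };
  }
  return { mode: null };
}

console.log(classifySentinel("[Old tool result content cleared at 2026-04-30T13:42:11Z]").mode); // "A"
console.log(classifySentinel("[Old tool result content cleared] plus trailing text").mode);      // "B"
```

Anchoring Mode A at both ends is what keeps trailing, possibly user-derived text out of the normalization path. Passing `0` as `redactLen` yields an empty prefix, matching the documented behavior of `CACHE_FIX_MICROCOMPACT_REDACT_LEN=0`.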

package/README.zh.md CHANGED

@@ -39,7 +39,7 @@ ANTHROPIC_BASE_URL=http://127.0.0.1:9801 claude
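 | `identity-normalization` | Normalizes message identity fields to keep the prefix stable |
 | `fresh-session-sort` | Fixes non-deterministic ordering on the first turn |
 | `cache-control-normalize` | Normalizes cache_control markers across messages |
-| `cache-telemetry` | Extracts cache stats from response headers → `~/.claude/quota-status.json` |
+| `cache-telemetry` | Extracts cache stats from response headers → `~/.claude/quota-status/{account.json,sessions/<id>.json}` |
 Extensions support hot reloading: add, remove, or modify `.mjs` files in `proxy/extensions/` and the changes take effect on the next request, with no restart needed. Configuration is in `proxy/extensions.json`.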
@@ -167,7 +167,7 @@ NODE_OPTIONS="--import claude-code-cache-fix" claude
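 ## Status line: real-time quota warnings
-Both proxy and preload modes write quota state to `~/.claude/quota-status.json` on every API call. The built-in `tools/quota-statusline.sh` script displays a live status line:
+Both modes write quota state on every API call. Proxy mode (v3.5.0+) splits it into `~/.claude/quota-status/account.json` (account-level: Q5h/Q7d, status, overage) and `~/.claude/quota-status/sessions/<id>.json` (per-session: TTL tier, hit rate). Preload mode keeps the legacy `~/.claude/quota-status.json` (single-session by construction). The built-in `tools/quota-statusline.sh` script displays a live status line:
 - **Q5h %** with burn rate (%/min)
 - **Q7d %** with burn rate (%/hr)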
@@ -200,7 +200,7 @@ export CACHE_FIX_IMAGE_KEEP_LAST=3
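 ## Monitoring & diagnostics
-The preload interceptor includes monitoring for microcompact degradation, false rate limiters, GrowthBook flag state, usage telemetry, and cost reports. Quota tracking works in both proxy and preload modes via `~/.claude/quota-status.json`.
+The preload interceptor includes monitoring for microcompact degradation, false rate limiters, GrowthBook flag state, usage telemetry, and cost reports. Quota tracking works via `~/.claude/quota-status/` (proxy: per-session split) or `~/.claude/quota-status.json` (preload: single-session legacy path).
 See [docs/monitoring.md](docs/monitoring.md) for full details, debug mode, prefix diffing, environment variables, and the built-in quota analysis tool.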

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "claude-code-cache-fix",
-  "version": "3.3.0",
+  "version": "3.5.0",
   "description": "Cache optimization proxy and interceptor for Claude Code. Fixes prompt cache bugs, stabilizes prefix, reduces quota burn.",
   "type": "module",
   "exports": "./preload.mjs",
@@ -53,5 +53,5 @@
     "url": "https://buymeacoffee.com/vsits"
   },
   "license": "MIT",
-  "author": "Chris Nighswonger <chris@veritassuperaitsolutions.com> (https://veritassuperaitsolutions.com)"
+  "author": "Chris Nighswonger <dev@vsits.co> (https://vsits.co)"
 }

package/proxy/extensions/cache-telemetry.mjs CHANGED

@@ -1,8 +1,63 @@
-import { writeFileSync, mkdirSync } from "node:fs";
+import {
+  writeFileSync,
+  renameSync,
+  unlinkSync,
+  mkdirSync,
+  readdirSync,
+  statSync,
+} from "node:fs";
 import { join } from "node:path";
 import { homedir } from "node:os";
+import { createHash, randomBytes } from "node:crypto";
-const QUOTA_PATH = join(homedir(), ".claude", "quota-status.json");
+// Paths are resolved per call (not cached at module load) so tests can swap
+// $HOME between cases. The homedir() call is essentially free.
+function paths() {
+  const home = homedir();
+  const quotaDir = join(home, ".claude", "quota-status");
+  return {
+    quotaDir,
+    accountPath: join(quotaDir, "account.json"),
+    sessionsDir: join(quotaDir, "sessions"),
+    legacyPath: join(home, ".claude", "quota-status.json"),
+  };
+}
+const SAFE_NAME_RE = /^[A-Za-z0-9_-]{1,128}$/;
+const SWEEP_THROTTLE_MS = 60_000;
+const DEFAULT_TTL_DAYS = 7;
+// --- Module-scope state ---
+let legacyCleanupDone = false;
+let lastSweepMs = 0;
+// Per directive `proxy-quota-status-per-session.md` — derive a filesystem-safe
+// filename from a raw session id. Both writer (this extension) and readers
+// (tools/quota-statusline.sh, etc.) must apply the same rule.
+//
+// Rules:
+//   - null/undefined/empty/whitespace → "unknown"
+//   - matches /^[A-Za-z0-9_-]{1,128}$/ → raw passthrough
+//   - else → "inv-" + sha256(raw)[:16]
+//
+// Exported for unit testing and for the directive's writer/reader contract.
+export function sessionFilename(rawId) {
+  if (rawId === null || rawId === undefined) return "unknown";
+  const s = String(rawId).trim();
+  if (s.length === 0) return "unknown";
+  if (SAFE_NAME_RE.test(s)) return s;
+  return "inv-" + createHash("sha256").update(s).digest("hex").slice(0, 16);
+}
+function resolveSessionId(headers) {
+  if (!headers) return null;
+  const sid =
+    headers["x-claude-code-session-id"] ||
+    headers["x-session-id"] ||
+    headers["x-anthropic-session-id"] ||
+    null;
+  return sid || null;
+}
 function parseHeaders(headers) {
   const get = (key) => headers[key] || "";
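
The filename rules listed in the comment above `sessionFilename()` are easiest to check with a few sample inputs. The block below repeats the function from the hunk so it runs standalone; the sample ids are made up for illustration.

```javascript
import { createHash } from "node:crypto";

const SAFE_NAME_RE = /^[A-Za-z0-9_-]{1,128}$/;

// Same rules as the diff's sessionFilename(), copied so the example is
// self-contained.
function sessionFilename(rawId) {
  if (rawId === null || rawId === undefined) return "unknown";
  const s = String(rawId).trim();
  if (s.length === 0) return "unknown";
  if (SAFE_NAME_RE.test(s)) return s;
  return "inv-" + createHash("sha256").update(s).digest("hex").slice(0, 16);
}

console.log(sessionFilename(undefined));          // "unknown"
console.log(sessionFilename("   "));              // "unknown"
console.log(sessionFilename("abc-123_DEF"));      // "abc-123_DEF"
console.log(sessionFilename("../../etc/passwd")); // "inv-" followed by 16 hex characters
```

UUID-style ids contain only hex digits and hyphens, so they pass through unchanged. Anything containing a path separator or dot is replaced by a hash-derived name, which keeps the write inside the `sessions/` directory. As the source comment notes, any reader has to apply the same mapping to locate the file.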
@@ -44,9 +99,56 @@ function parseHeaders(headers) {
   };
 }
+function atomicWrite(finalPath, content) {
+  const tmp = `${finalPath}.tmp.${process.pid}.${randomBytes(4).toString("hex")}`;
+  writeFileSync(tmp, content);
+  renameSync(tmp, finalPath);
+}
+function cleanupLegacyOnce() {
+  if (legacyCleanupDone) return;
+  legacyCleanupDone = true;
+  try {
+    unlinkSync(paths().legacyPath);
+  } catch {}
+}
+function sweepStaleSessions(ttlDays) {
+  const now = Date.now();
+  if (now - lastSweepMs < SWEEP_THROTTLE_MS) return;
+  lastSweepMs = now;
+  const cutoffMs = now - ttlDays * 86_400_000;
+  const { sessionsDir } = paths();
+  let entries;
+  try {
+    entries = readdirSync(sessionsDir);
+  } catch {
+    return;
+  }
+  for (const name of entries) {
+    const p = join(sessionsDir, name);
+    try {
+      const st = statSync(p);
+      if (st.mtimeMs < cutoffMs) {
+        try {
+          unlinkSync(p);
+        } catch {}
+      }
+    } catch {}
+  }
+}
+function getTtlDays() {
+  const raw = process.env.CACHE_FIX_QUOTA_STATUS_TTL_DAYS;
+  if (raw === undefined || raw === "") return DEFAULT_TTL_DAYS;
+  const n = Number(raw);
+  return Number.isFinite(n) && n >= 0 ? n : DEFAULT_TTL_DAYS;
+}
 export default {
   name: "cache-telemetry",
-  description: "Extract cache stats from response stream, persist quota state to ~/.claude/quota-status.json",
+  description: "Extract cache stats from response stream, persist quota state to ~/.claude/quota-status/{account.json,sessions/<filename>.json}",
   order: 600,
   async onResponseStart(ctx) {
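
The TTL parsing in `getTtlDays()` and the cutoff test in `sweepStaleSessions()` interact in ways that are easier to see as pure functions. The sketch below restates both without touching the environment or the filesystem; `parseTtlDays` and `isStale` are hypothetical names introduced for this illustration.

```javascript
const DEFAULT_TTL_DAYS = 7;
const MS_PER_DAY = 86_400_000;

// Mirrors getTtlDays(), but takes the raw string as an argument instead of
// reading process.env.
function parseTtlDays(raw) {
  if (raw === undefined || raw === "") return DEFAULT_TTL_DAYS;
  const n = Number(raw);
  return Number.isFinite(n) && n >= 0 ? n : DEFAULT_TTL_DAYS;
}

// Mirrors the sweep's comparison: a file is stale when its mtime is strictly
// older than the cutoff.
function isStale(mtimeMs, nowMs, ttlDays) {
  return mtimeMs < nowMs - ttlDays * MS_PER_DAY;
}

console.log(parseTtlDays(undefined)); // 7
console.log(parseTtlDays("abc"));     // 7
console.log(parseTtlDays("-1"));      // 7
console.log(parseTtlDays("1.5"));     // 1.5
console.log(parseTtlDays("0"));       // 0
```

Two edge cases follow from this logic. A TTL of `0` puts the cutoff at the sweep instant, so any session file with an earlier mtime qualifies for removal. A whitespace-only value is not caught by the empty-string check: `Number(" ")` is `0`, so it would be read as a zero-day TTL rather than falling back to the default.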
@@ -56,6 +158,7 @@ export default {
     if (!quota) return;
     ctx.meta._quotaData = quota;
+    ctx.meta._sessionId = resolveSessionId(ctx.headers);
   },
   async onStreamEvent(ctx) {
@@ -89,24 +192,43 @@ export default {
       const ttl = cr > 0 ? "1h" : (cc > 0 ? "5m" : "unknown");
-      const output = {
-        cache: {
-          ttl_tier: ttl,
-          cache_creation: cc,
-          cache_read: cr,
-          ephemeral_1h: ephemeral1h,
-          ephemeral_5m: ephemeral5m,
-          hit_rate: hitRate,
-          timestamp: new Date().toISOString(),
+      const timestamp = new Date().toISOString();
+      const rawSid = ctx.meta._sessionId;
+      const filename = sessionFilename(rawSid);
+      const accountPayload = JSON.stringify({ ...quota, timestamp }, null, 2);
+      const sessionPayload = JSON.stringify(
+        {
+          cache: {
+            ttl_tier: ttl,
+            cache_creation: cc,
+            cache_read: cr,
+            ephemeral_1h: ephemeral1h,
+            ephemeral_5m: ephemeral5m,
+            hit_rate: hitRate,
+            timestamp,
+          },
+          timestamp,
+          session_id: rawSid,
         },
-        timestamp: new Date().toISOString(),
-        ...quota,
-      };
+        null,
+        2,
+      );
       try {
-        mkdirSync(join(homedir(), ".claude"), { recursive: true });
-        writeFileSync(QUOTA_PATH, JSON.stringify(output, null, 2));
+        cleanupLegacyOnce();
+        const { sessionsDir, accountPath } = paths();
+        mkdirSync(sessionsDir, { recursive: true });
+        atomicWrite(accountPath, accountPayload);
+        atomicWrite(join(sessionsDir, `${filename}.json`), sessionPayload);
+        sweepStaleSessions(getTtlDays());
       } catch {}
     }
   },
+  // Test-only: reset module state between tests.
+  __resetForTests() {
+    legacyCleanupDone = false;
+    lastSweepMs = 0;
+  },
 };
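
The `atomicWrite()` helper in this diff uses the write-to-temp-then-rename pattern, so a concurrent reader sees either the old file or the new one. The variant below is an illustration rather than the package's code: it adds temp-file cleanup on failure, which the diff's version does not do. `atomicWriteWithCleanup` is a hypothetical name, and the scratch directory exists only for the demo.

```javascript
import {
  writeFileSync, renameSync, unlinkSync,
  readFileSync, readdirSync, mkdtempSync,
} from "node:fs";
import { join } from "node:path";
import { tmpdir } from "node:os";
import { randomBytes } from "node:crypto";

// Variant of the diff's atomicWrite(): same temp-name scheme, plus removal of
// the temp file if the write or the rename throws.
function atomicWriteWithCleanup(finalPath, content) {
  const tmp = `${finalPath}.tmp.${process.pid}.${randomBytes(4).toString("hex")}`;
  try {
    writeFileSync(tmp, content);
    // rename() replaces the target atomically on POSIX when both paths are on
    // the same filesystem, which holds here because tmp sits beside finalPath.
    renameSync(tmp, finalPath);
  } catch (err) {
    try { unlinkSync(tmp); } catch {}
    throw err;
  }
}

// Usage: write twice into a scratch directory, then confirm that only the
// final file remains and that it holds the second payload.
const dir = mkdtempSync(join(tmpdir(), "quota-demo-"));
const target = join(dir, "account.json");
atomicWriteWithCleanup(target, JSON.stringify({ n: 1 }));
atomicWriteWithCleanup(target, JSON.stringify({ n: 2 }));
console.log(readFileSync(target, "utf8")); // {"n":2}
console.log(readdirSync(dir));             // [ 'account.json' ]
```

Including the process id and random bytes in the temp name, as the diff does, keeps concurrent proxy processes from writing to the same temp path.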

package/proxy/extensions/identity-normalization.mjs CHANGED

@@ -2,7 +2,7 @@ import { createHash } from "node:crypto";
 const _pinnedBlocks = new Map();
-const SESSION_START_RESUME_MARKER = /SessionStart:startup hook success:/g;
+const SESSION_START_RESUME_MARKER = /SessionStart:resume hook success:/g;
 const SESSION_START_ID_TAG = /\n?<session-id>[^<]*<\/session-id>/g;
 const SESSION_START_LAST_ACTIVE_LINE = /\nLast active:[^\n]*/g;
 const CONTINUE_TRAILER_TEXT = "Continue from where you left off.";
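
The one-line change above fixes a pattern that could never match the text it was named for. The snippet below shows the difference on a made-up input string. How the extension applies the pattern (replace, test, or split) is not visible in this hunk, so `replace` is used here purely as an assumption for demonstration.

```javascript
// 3.3.0 pattern: named "resume marker" but written against the startup hook.
const OLD_MARKER = /SessionStart:startup hook success:/g;
// 3.5.0 pattern.
const NEW_MARKER = /SessionStart:resume hook success:/g;

// Hypothetical sample text; real hook output may carry different trailing content.
const resumed = "SessionStart:resume hook success: restored context";

console.log(resumed.replace(OLD_MARKER, "<marker>")); // unchanged: the old pattern does not match
console.log(resumed.replace(NEW_MARKER, "<marker>")); // "<marker> restored context"
```

This matches the README's description of issue #96 as a "resume-marker regex no-op": with the old pattern, resumed sessions passed through without normalization.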

package/proxy/extensions/image-strip.mjs CHANGED

@@ -48,6 +48,9 @@ function getRequestSizeMax() {
   const v = parseInt(process.env.CACHE_FIX_IMAGE_REQUEST_SIZE_MAX || "31457280", 10);
   return v > 0 ? v : 31457280;
 }
+function isDebug() {
+  return process.env.CACHE_FIX_DEBUG === "1";
+}
 function getImageCountMax() {
   // Default 100 — single cap covering the only model family in active CC use.
   // Users on legacy Claude 1/2.x/Instant who genuinely need 600 can override.
@@ -649,7 +652,9 @@ export default {
       if (logParts.length > 0) {
         ctx.body.messages = messages;
-        process.stderr.write(`[image-strip] ${logParts.join("; ")}\n`);
+        if (isDebug()) {
+          process.stderr.write(`[image-strip] ${logParts.join("; ")}\n`);
+        }
       }
       return;
     }
@@ -676,7 +681,7 @@ export default {
       stats.resize_succeeded > 0 ||
       stats.unsupported_format_count > 0 ||
       stats.dimension_probe_fail_count > 0;
-    if (didSomething) {
+    if (didSomething && isDebug()) {
       const parts = [];
       if (stats.resize_succeeded > 0) parts.push(`resized=${stats.resize_succeeded}`);
       if (stats.resize_failed > 0) parts.push(`resize_failed=${stats.resize_failed}`);