claude-code-cache-fix 1.7.1 → 1.8.0

package/README.md CHANGED
@@ -105,6 +105,103 @@ The module intercepts `globalThis.fetch` before Claude Code makes API calls to `
105
105
 
106
106
  All fixes are idempotent — if nothing needs fixing, the request passes through unmodified. The interceptor is read-only with respect to your conversation; it only normalizes the request structure before it hits the API.
107
107
 
108
+ ## Graduating from Fixes
109
+
110
+ The interceptor serves three purposes with different lifecycles:
111
+
112
+ | Purpose | Examples | When to disable |
113
+ |---------|----------|-----------------|
114
+ | **Bug fixes** | Block relocation, fingerprint, tool sort, TTL | When CC fixes the underlying bug — check the health line |
115
+ | **Monitoring** | Quota tracking, microcompact detection, GrowthBook flags | Keep permanently — these detect future regressions |
116
+ | **Optimizations** | Image stripping, output efficiency rewrite | Keep as long as they help your workflow |
117
+
118
+ ### Health status
119
+
120
+ On first API call, the interceptor logs a health status line (requires `CACHE_FIX_DEBUG=1`):
121
+
122
+ ```
123
+ cache-fix health: relocate=active(2h ago) fingerprint=dormant(5 skips) tool_sort=active(2h ago) ttl=active(2h ago) identity=waiting
124
+ ```
125
+
126
+ Status meanings:
127
+ - **active(Xh ago)** — fix was applied recently
128
+ - **dormant(N clean sessions)** / **dormant(N skips)** — bug not detected in N resume sessions (relocate), or fix skipped N times without ever being applied (other fixes); CC may have fixed it
129
+ - **safety-blocked(Nx)** — round-trip verification failed; CC changed its algorithm, fix auto-disabled
130
+ - **waiting** — fix hasn't been triggered yet
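The status rules above can be sketched as a small pure function. This is an illustrative sketch, not the package's code: `describeFixStatus` and the `cleanRuns` field are hypothetical names, and the hour-only age formatting is a simplification.

```javascript
// Hypothetical helper: maps per-fix counters to one of the four status strings.
// `stats` shape is an assumption: { applied, cleanRuns, safetyBlocked, lastApplied }.
function describeFixStatus(stats, nowMs, dormantThreshold = 5) {
  // A failed round-trip check takes priority so it is never hidden.
  if (stats.safetyBlocked > 0) {
    return `safety-blocked(${stats.safetyBlocked}x)`;
  }
  // Enough clean runs without the fix ever applying: the bug looks gone.
  if (stats.cleanRuns >= dormantThreshold && stats.applied === 0) {
    return `dormant(${stats.cleanRuns} clean sessions)`;
  }
  // The fix has applied at some point; report how long ago (whole hours).
  if (stats.lastApplied) {
    const hours = Math.floor((nowMs - Date.parse(stats.lastApplied)) / 3600000);
    return `active(${hours}h ago)`;
  }
  return "waiting";
}
```

Passing `nowMs` explicitly keeps the function deterministic, which makes it easy to check each status in isolation.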
131
+
132
+ When a fix shows `dormant`, you can safely disable it:
133
+ ```bash
134
+ export CACHE_FIX_SKIP_RELOCATE=1 # example
135
+ ```
136
+
137
+ To disable all fixes but keep monitoring:
138
+ ```bash
139
+ export CACHE_FIX_DISABLED=1
140
+ ```
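How the two kinds of switch combine can be sketched as follows. This is a minimal sketch under assumptions: `isFixEnabled` is a hypothetical name, and the environment is passed in as a plain object so the logic can be exercised without touching `process.env`.

```javascript
// Hypothetical helper: a fix runs only if neither the master switch
// (CACHE_FIX_DISABLED) nor its own CACHE_FIX_SKIP_<NAME> switch is "1".
function isFixEnabled(fixName, env) {
  if (env.CACHE_FIX_DISABLED === "1") return false;
  const skipKey = `CACHE_FIX_SKIP_${fixName.toUpperCase()}`;
  return env[skipKey] !== "1";
}

// Example: only the relocate fix is skipped.
const exampleEnv = { CACHE_FIX_SKIP_RELOCATE: "1" };
const relocateOn = isFixEnabled("relocate", exampleEnv);
const ttlOn = isFixEnabled("ttl", exampleEnv);
```

Only the exact string `"1"` turns a switch on in this sketch; any other value leaves the fix enabled.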
141
+
142
+ ### Regression detection
143
+
144
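+ If the cache_read ratio stays below 50% on each of the last 5 tracked calls, the interceptor logs a warning to the debug log (requires `CACHE_FIX_DEBUG=1`). With fixes disabled, it reads: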
145
+ ```
146
+ REGRESSION WARNING: cache_read ratio averaged 12% across last 5 calls (threshold: 50%).
147
+ Fixes are disabled — consider re-enabling to recover cache performance.
148
+ ```
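The check behind this warning can be sketched in a few lines. The function names here are illustrative; the usage field names (`cache_read_input_tokens`, `cache_creation_input_tokens`, `input_tokens`) are the ones the interceptor reads from the API response.

```javascript
// Share of input tokens served from cache for one call, or null if unknown.
function cacheReadRatio(usage) {
  const read = usage.cache_read_input_tokens || 0;
  const creation = usage.cache_creation_input_tokens || 0;
  const input = usage.input_tokens || 0;
  const total = read + creation + input;
  return total === 0 ? null : read / total;
}

// True when each of the last `minCalls` ratios is below `minRatio`.
function isCacheRegressed(ratios, minCalls = 5, minRatio = 0.5) {
  if (ratios.length < minCalls) return false;
  return ratios.slice(-minCalls).every((r) => r < minRatio);
}
```

Requiring every recent call to be low, rather than the average, means a single cold call in an otherwise healthy session does not trigger the warning.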
149
+
150
+ ## Safety
151
+
152
+ ### Fingerprint round-trip verification
153
+
154
+ Before rewriting the `cc_version` fingerprint, the interceptor verifies that its
155
+ hardcoded salt and character indices reproduce the fingerprint Claude Code sent.
156
+ If verification fails (CC changed its algorithm), the rewrite is skipped automatically.
157
+ This keeps a stale fingerprint rewrite from making cache performance *worse* than stock CC.
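The verify-then-rewrite pattern can be sketched as below. `toyFingerprint` is a made-up stand-in, since the real salt and character indices are not shown here, and `rewriteIfVerified` is a hypothetical name.

```javascript
// Toy stand-in for the real fingerprint function (illustrative only):
// a 3-hex-digit rolling hash over "version:text".
function toyFingerprint(text, version) {
  let h = 0;
  for (const ch of `${version}:${text}`) {
    h = (h * 31 + ch.codePointAt(0)) % 4096;
  }
  return h.toString(16).padStart(3, "0");
}

// Recompute the fingerprint CC sent from the same input CC used. Only if the
// result matches do we trust our constants enough to emit a stable replacement.
function rewriteIfVerified(sentFingerprint, originalText, stableText, version, compute) {
  if (compute(originalText, version) !== sentFingerprint) {
    return null; // constants look stale: leave the request untouched
  }
  return compute(stableText, version);
}
```

Returning `null` on a mismatch lets the caller fall back to stock behavior without needing to know why verification failed.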
158
+
159
+ ### Fail-safe design
160
+
161
+ Every fix is designed to fail to a no-op:
162
+ - If block detection regexes don't match → blocks aren't relocated (CC behavior)
163
+ - If fingerprint format changes → fingerprint isn't rewritten (CC behavior)
164
+ - If tool sort produces no changes → payload passes through untouched
165
+ - If TTL injection target structure changes → TTL isn't injected (CC behavior)
166
+
167
+ The interceptor can only *help* or *do nothing*. It cannot make things worse.
168
+
169
+ ## Status line — quota warnings in real time
170
+
171
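+ The interceptor writes quota state to `~/.claude/quota-status.json` on every API call. The included `tools/quota-statusline.sh` script reads the TTL tier, cache hit rate, and peak-hour flag from this file, takes the quota percentages from the last line of `~/.claude/claude-meter.jsonl`, and displays a live status line in Claude Code showing: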
172
+
173
+ - **Q5h %** with burn rate (%/min)
174
+ - **Q7d %** with burn rate (%/hr)
175
+ - **TTL tier** — shows `TTL:1h` when healthy, **`TTL:5m` in red when the server has downgraded you** (typically at Q5h ≥ 100%)
176
+ - **PEAK** in yellow during weekday peak hours (13:00–19:00 UTC)
177
+ - **Cache hit rate %**
178
+ - **OVERAGE** flag when active
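The burn rate is measured from the start of the current quota window, which is the reset time minus the window length. A minimal sketch of that arithmetic, assuming the reset time is a Unix timestamp in seconds (`burnRatePerMinute` is a hypothetical name):

```javascript
// Percent of quota consumed per minute since the window opened.
// Returns null when the window has barely started or nothing has been used.
function burnRatePerMinute(pct, resetEpochSec, nowMs, windowHours = 5) {
  const windowStartMs = resetEpochSec * 1000 - windowHours * 3600 * 1000;
  const elapsedMin = (nowMs - windowStartMs) / 60000;
  if (elapsedMin <= 1 || pct <= 0) return null;
  return pct / elapsedMin;
}
```

For the 7-day figure, the same calculation applies with `windowHours = 168` and the result multiplied by 60 to get percent per hour.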
179
+
180
+ ### Setup
181
+
182
+ Copy the script and configure Claude Code to use it:
183
+
184
+ ```bash
185
+ # Copy from the npm package to Claude Code's hooks directory
186
+ mkdir -p ~/.claude/hooks
187
+ cp "$(npm root -g)/claude-code-cache-fix/tools/quota-statusline.sh" ~/.claude/hooks/
188
+ chmod +x ~/.claude/hooks/quota-statusline.sh
189
+ ```
190
+
191
+ Add to `~/.claude/settings.json`:
192
+
193
+ ```json
194
+ {
195
+ "statusLine": {
196
+ "command": "~/.claude/hooks/quota-statusline.sh"
197
+ }
198
+ }
199
+ ```
200
+
201
+ ### Why this matters
202
+
203
+ When the server downgrades your TTL to 5m (Layer 2 — quota-aware downgrade at Q5h ≥ 100%), **every idle period longer than 5 minutes causes a full context rebuild**. Without the status line, this is invisible — you just notice things getting slower and more expensive. With the status line, the red `TTL:5m` warning tells you immediately: **stop working, wait for the Q5h window to reset, then resume**. Powering through overage compounds the drain; pausing breaks the cycle.
204
+
108
205
  ## Image stripping
109
206
 
110
207
  Images read via the Read tool are encoded as base64 and stored in `tool_result` blocks in conversation history. They ride along on **every subsequent API call** until compaction. A single 500KB image costs ~62,500 tokens per turn in carry-forward.
@@ -305,6 +402,12 @@ Snapshots are saved to `~/.claude/cache-fix-snapshots/` and diff reports are gen
305
402
  | `CACHE_FIX_IMAGE_KEEP_LAST` | `0` | Keep images in last N user messages (0 = disabled) |
306
403
  | `CACHE_FIX_OUTPUT_EFFICIENCY_REPLACEMENT` | unset | Replace Claude Code's `# Output efficiency` system-prompt section before the request is sent |
307
404
  | `CACHE_FIX_USAGE_LOG` | `~/.claude/usage.jsonl` | Path for per-call usage telemetry log |
405
+ | `CACHE_FIX_DISABLED` | `0` | Disable all bug fixes; keep monitoring + optimizations active |
406
+ | `CACHE_FIX_SKIP_RELOCATE` | `0` | Skip block relocation fix (Bug 1) |
407
+ | `CACHE_FIX_SKIP_FINGERPRINT` | `0` | Skip fingerprint stabilization (Bug 2b) |
408
+ | `CACHE_FIX_SKIP_TOOL_SORT` | `0` | Skip tool ordering stabilization (Bug 2a) |
409
+ | `CACHE_FIX_SKIP_TTL` | `0` | Skip 1h TTL injection (Bug 5) |
410
+ | `CACHE_FIX_SKIP_IDENTITY` | `0` | Skip identity normalization (Bug 6) |
308
411
 
309
412
  ## Limitations
310
413
 
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "claude-code-cache-fix",
3
- "version": "1.7.1",
3
+ "version": "1.8.0",
4
4
  "description": "Fixes prompt cache regression in Claude Code that causes up to 20x cost increase on resumed sessions",
5
5
  "type": "module",
6
6
  "exports": "./preload.mjs",
package/preload.mjs CHANGED
@@ -83,6 +83,25 @@ function extractRealUserMessageText(messages) {
83
83
  return "";
84
84
  }
85
85
 
86
+ /**
87
+ * Extract text from messages[0] the way CC's original fingerprint code does —
88
+ * including meta/attachment blocks. Used only for round-trip verification.
89
+ */
90
+ function extractFirstMessageText(messages) {
91
+ if (!Array.isArray(messages) || messages.length === 0) return "";
92
+ const first = messages[0];
93
+ if (!first || first.role !== "user") return "";
94
+ const content = first.content;
95
+ if (typeof content === "string") return content;
96
+ if (!Array.isArray(content)) return "";
97
+ for (const block of content) {
98
+ if (block.type === "text" && typeof block.text === "string") {
99
+ return block.text;
100
+ }
101
+ }
102
+ return "";
103
+ }
104
+
86
105
  /**
87
106
  * Extract current cc_version from system prompt blocks and recompute with
88
107
  * stable fingerprint. Returns { oldVersion, newVersion, stableFingerprint }.
@@ -107,6 +126,23 @@ function stabilizeFingerprint(system, messages) {
107
126
  const baseVersion = dotParts.slice(0, 3).join("."); // "2.1.87"
108
127
  const oldFingerprint = dotParts[3]; // "a3f"
109
128
 
129
+ // --- SAFETY: Round-trip verification ---
130
+ // Verify our salt/indices reproduce CC's fingerprint for the ORIGINAL
131
+ // message text (messages[0] content, which is what CC used).
132
+ // If our computation doesn't match, our constants are stale — skip rewrite.
133
+ const originalText = extractFirstMessageText(messages);
134
+ const verification = computeFingerprint(originalText, baseVersion);
135
+ if (verification !== oldFingerprint) {
136
+ debugLog(
137
+ "FINGERPRINT SAFETY: round-trip verification failed.",
138
+ `CC sent '${oldFingerprint}', we computed '${verification}'.`,
139
+ "Salt/indices may have changed in this CC version. Skipping rewrite."
140
+ );
141
+ recordFixResult("fingerprint", "safety_blocked");
142
+ return null;
143
+ }
144
+ // --- END SAFETY ---
145
+
110
146
  // Compute stable fingerprint from real user text
111
147
  const realText = extractRealUserMessageText(messages);
112
148
  const stableFingerprint = computeFingerprint(realText, baseVersion);
@@ -588,7 +624,7 @@ function replaceOutputEfficiencySection(text) {
588
624
  // Set CACHE_FIX_DEBUG=1 to enable
589
625
  // --------------------------------------------------------------------------
590
626
 
591
- import { appendFileSync, readFileSync, writeFileSync, mkdirSync } from "node:fs";
627
+ import { appendFileSync, readFileSync, writeFileSync, mkdirSync, renameSync } from "node:fs";
592
628
  import { homedir } from "node:os";
593
629
  import { join } from "node:path";
594
630
 
@@ -605,6 +641,103 @@ function debugLog(...args) {
605
641
  try { appendFileSync(LOG_PATH, line); } catch {}
606
642
  }
607
643
 
644
+ // --------------------------------------------------------------------------
645
+ // Kill switches — disable fixes while keeping monitoring active
646
+ // --------------------------------------------------------------------------
647
+
648
+ const FIXES_DISABLED = process.env.CACHE_FIX_DISABLED === "1";
649
+
650
+ /**
651
+ * Check if a specific fix should be applied.
652
+ * Returns false if master kill switch is on OR individual fix is skipped.
653
+ * Monitoring and optimizations (image strip, output efficiency) are NOT
654
+ * affected by CACHE_FIX_DISABLED — only bug fixes are.
655
+ */
656
+ function shouldApplyFix(fixName) {
657
+ if (FIXES_DISABLED) return false;
658
+ const skipKey = `CACHE_FIX_SKIP_${fixName.toUpperCase()}`;
659
+ if (process.env[skipKey] === "1") return false;
660
+ return true;
661
+ }
662
+
663
+ // --------------------------------------------------------------------------
664
+ // Persistent effectiveness stats
665
+ // --------------------------------------------------------------------------
666
+
667
+ const STATS_PATH = join(homedir(), ".claude", "cache-fix-stats.json");
668
+
669
+ const _STATS_SCHEMA = {
670
+ relocate: { applied: 0, skipped: 0, bugPresent: 0, resumeScanned: 0, lastApplied: null, lastScanned: null },
671
+ fingerprint: { applied: 0, skipped: 0, safetyBlocked: 0, lastApplied: null },
672
+ tool_sort: { applied: 0, skipped: 0, lastApplied: null },
673
+ ttl: { applied: 0, skipped: 0, lastApplied: null },
674
+ identity: { applied: 0, skipped: 0, lastApplied: null },
675
+ };
676
+
677
+ function _createEmptyStats() {
678
+ return {
679
+ version: 1,
680
+ created: new Date().toISOString(),
681
+ lastUpdated: null,
682
+ fixes: JSON.parse(JSON.stringify(_STATS_SCHEMA)),
683
+ };
684
+ }
685
+
686
+ /** Read stats from disk. Returns empty stats on any error. */
687
+ function readStats() {
688
+ try {
689
+ const data = JSON.parse(readFileSync(STATS_PATH, "utf8"));
690
+ if (data.created) {
691
+ const ageDays = (Date.now() - new Date(data.created).getTime()) / (1000 * 60 * 60 * 24);
692
+ if (ageDays > 30) return _createEmptyStats();
693
+ }
694
+ for (const [key, schema] of Object.entries(_STATS_SCHEMA)) {
695
+ if (!data.fixes[key]) data.fixes[key] = { ...schema };
696
+ }
697
+ return data;
698
+ } catch {
699
+ return _createEmptyStats();
700
+ }
701
+ }
702
+
703
+ /** Atomic write: temp file + rename to avoid corruption. */
704
+ function writeStats(stats) {
705
+ try {
706
+ stats.lastUpdated = new Date().toISOString();
707
+ const tmp = STATS_PATH + ".tmp";
708
+ writeFileSync(tmp, JSON.stringify(stats, null, 2));
709
+ renameSync(tmp, STATS_PATH);
710
+ } catch (e) {
711
+ debugLog("STATS WRITE ERROR:", e?.message);
712
+ }
713
+ }
714
+
715
+ function recordFixResult(fixName, result) {
716
+ const stats = readStats();
717
+ if (!stats.fixes[fixName]) return;
718
+ const now = new Date().toISOString();
719
+ stats.lastUpdated = now;
720
+ if (result === "applied") {
721
+ stats.fixes[fixName].applied++;
722
+ stats.fixes[fixName].lastApplied = now;
723
+ } else if (result === "skipped") {
724
+ stats.fixes[fixName].skipped++;
725
+ } else if (result === "safety_blocked") {
726
+ stats.fixes[fixName].safetyBlocked = (stats.fixes[fixName].safetyBlocked || 0) + 1;
727
+ }
728
+ writeStats(stats);
729
+ }
730
+
731
+ function recordRelocateScan(bugFound) {
732
+ const stats = readStats();
733
+ const now = new Date().toISOString();
734
+ stats.lastUpdated = now;
735
+ stats.fixes.relocate.resumeScanned++;
736
+ stats.fixes.relocate.lastScanned = now;
737
+ if (bugFound) stats.fixes.relocate.bugPresent++;
738
+ writeStats(stats);
739
+ }
740
+
608
741
  // --------------------------------------------------------------------------
609
742
  // Prefix snapshot — captures message prefix for cross-process diff.
610
743
  // Set CACHE_FIX_PREFIXDIFF=1 to enable.
@@ -656,6 +789,59 @@ function dumpGrowthBookFlags() {
656
789
  }
657
790
  }
658
791
 
792
+ // --------------------------------------------------------------------------
793
+ // Startup health status line
794
+ // --------------------------------------------------------------------------
795
+
796
+ let _healthLinePrinted = false;
797
+
798
+ function _formatTimeSince(isoString) {
799
+ if (!isoString) return "never";
800
+ const ms = Date.now() - new Date(isoString).getTime();
801
+ const hours = Math.floor(ms / (1000 * 60 * 60));
802
+ const days = Math.floor(hours / 24);
803
+ if (days > 0) return `${days}d ago`;
804
+ if (hours > 0) return `${hours}h ago`;
805
+ const mins = Math.floor(ms / (1000 * 60));
806
+ return `${mins}m ago`;
807
+ }
808
+
809
+ function _formatFixStatus(fixName, fixStats, dormantThreshold = 5) {
810
+ if (fixName === "relocate") {
811
+ if (fixStats.resumeScanned >= dormantThreshold && fixStats.bugPresent === 0) {
812
+ return `dormant(${fixStats.resumeScanned} clean sessions)`;
813
+ }
814
+ } else {
815
+ if (fixStats.skipped >= dormantThreshold && fixStats.applied === 0 && !fixStats.safetyBlocked) {
816
+ return `dormant(${fixStats.skipped} skips)`;
817
+ }
818
+ }
819
+ if (fixStats.safetyBlocked > 0) return `safety-blocked(${fixStats.safetyBlocked}x)`;
820
+ if (fixStats.lastApplied) return `active(${_formatTimeSince(fixStats.lastApplied)})`;
821
+ return "waiting";
822
+ }
823
+
824
+ function printHealthLine() {
825
+ if (_healthLinePrinted) return;
826
+ _healthLinePrinted = true;
827
+ const stats = readStats();
828
+ const parts = [];
829
+ for (const [name, fixStats] of Object.entries(stats.fixes)) {
830
+ const status = _formatFixStatus(name, fixStats);
831
+ parts.push(`${name}=${status}`);
832
+ if (status.startsWith("dormant")) {
833
+ debugLog(`DORMANT: ${name} — CC may have fixed this. Consider CACHE_FIX_SKIP_${name.toUpperCase()}=1`);
834
+ }
835
+ if (status.startsWith("safety-blocked")) {
836
+ debugLog(`SAFETY: ${name} — salt/indices may have changed. Fix is auto-disabled.`);
837
+ }
838
+ }
839
+ debugLog(`HEALTH: ${parts.join(" ")}`);
840
+ if (FIXES_DISABLED) {
841
+ debugLog("HEALTH: all fixes disabled via CACHE_FIX_DISABLED=1 (monitoring active)");
842
+ }
843
+ }
844
+
659
845
  // --------------------------------------------------------------------------
660
846
  // Microcompact / budget monitoring
661
847
  // --------------------------------------------------------------------------
@@ -801,6 +987,50 @@ function snapshotPrefix(payload) {
801
987
  }
802
988
  }
803
989
 
990
+ // --------------------------------------------------------------------------
991
+ // Cache regression detector
992
+ // --------------------------------------------------------------------------
993
+
994
+ const _cacheHistory = []; // in-memory ring buffer of { ratio, turn }
995
+ const REGRESSION_MIN_CALLS = 5;
996
+ const REGRESSION_MIN_RATIO = 0.5;
997
+ let _apiCallCount = 0;
998
+
999
+ function _computeCacheRatio(usage) {
1000
+ if (!usage) return null;
1001
+ const read = usage.cache_read_input_tokens || 0;
1002
+ const creation = usage.cache_creation_input_tokens || 0;
1003
+ const input = usage.input_tokens || 0;
1004
+ const total = read + creation + input;
1005
+ if (total === 0) return null;
1006
+ return read / total;
1007
+ }
1008
+
1009
+ function _checkCacheRegression() {
1010
+ if (_cacheHistory.length < REGRESSION_MIN_CALLS) return;
1011
+ const recent = _cacheHistory.slice(-REGRESSION_MIN_CALLS);
1012
+ const allLow = recent.every((h) => h.ratio < REGRESSION_MIN_RATIO);
1013
+ if (allLow) {
1014
+ const avgRatio = recent.reduce((sum, h) => sum + h.ratio, 0) / recent.length;
1015
+ debugLog(
1016
+ `REGRESSION WARNING: cache_read ratio averaged ${Math.round(avgRatio * 100)}%`,
1017
+ `across last ${REGRESSION_MIN_CALLS} calls (threshold: ${REGRESSION_MIN_RATIO * 100}%).`,
1018
+ FIXES_DISABLED
1019
+ ? "Fixes are disabled — consider re-enabling to recover cache performance."
1020
+ : "Fixes are active but cache is still degraded — CC may have introduced a new bug."
1021
+ );
1022
+ }
1023
+ }
1024
+
1025
+ function _trackCacheRatio(usage) {
1026
+ if (_apiCallCount <= 1) return; // skip first call (cache creation, no reads)
1027
+ const ratio = _computeCacheRatio(usage);
1028
+ if (ratio === null) return;
1029
+ _cacheHistory.push({ ratio, turn: _apiCallCount });
1030
+ if (_cacheHistory.length > 20) _cacheHistory.shift(); // ring buffer
1031
+ _checkCacheRegression();
1032
+ }
1033
+
804
1034
  // --------------------------------------------------------------------------
805
1035
  // Fetch interceptor
806
1036
  // --------------------------------------------------------------------------
@@ -817,11 +1047,17 @@ globalThis.fetch = async function (url, options) {
817
1047
 
818
1048
  if (isMessagesEndpoint && options?.body && typeof options.body === "string") {
819
1049
  try {
1050
+ _apiCallCount++;
820
1051
  const payload = JSON.parse(options.body);
821
1052
  let modified = false;
822
1053
 
823
1054
  // One-time GrowthBook flag dump on first API call
824
1055
  dumpGrowthBookFlags();
1056
+ printHealthLine();
1057
+
1058
+ if (FIXES_DISABLED) {
1059
+ debugLog("CACHE_FIX_DISABLED=1 — all bug fixes bypassed, monitoring active");
1060
+ }
825
1061
 
826
1062
  debugLog("--- API call to", urlStr);
827
1063
  debugLog("message count:", payload.messages?.length);
@@ -832,7 +1068,7 @@ globalThis.fetch = async function (url, options) {
832
1068
  }
833
1069
 
834
1070
  // Bug 1: Relocate resume attachment blocks
835
- if (payload.messages) {
1071
+ if (payload.messages && shouldApplyFix("relocate")) {
836
1072
  // Log message structure for debugging
837
1073
  if (DEBUG) {
838
1074
  let firstUserIdx = -1, lastUserIdx = -1;
@@ -868,13 +1104,21 @@ globalThis.fetch = async function (url, options) {
868
1104
  }
869
1105
 
870
1106
  const normalized = normalizeResumeMessages(payload.messages);
1107
+ // Track bug presence for dormancy detection (resume = messages > 5)
1108
+ const isResume = payload.messages.length > 5;
1109
+ if (isResume) recordRelocateScan(normalized !== payload.messages);
1110
+
871
1111
  if (normalized !== payload.messages) {
872
1112
  payload.messages = normalized;
873
1113
  modified = true;
874
1114
  debugLog("APPLIED: resume message relocation");
1115
+ recordFixResult("relocate", "applied");
875
1116
  } else {
876
1117
  debugLog("SKIPPED: resume relocation (not a resume or already correct)");
1118
+ recordFixResult("relocate", "skipped");
877
1119
  }
1120
+ } else if (payload.messages && !shouldApplyFix("relocate")) {
1121
+ debugLog("SKIPPED: relocate fix disabled via env var");
878
1122
  }
879
1123
 
880
1124
  // Image stripping: remove old tool_result images to reduce token waste
@@ -895,7 +1139,7 @@ globalThis.fetch = async function (url, options) {
895
1139
  }
896
1140
 
897
1141
  // Bug 2a: Stabilize tool ordering
898
- if (payload.tools) {
1142
+ if (payload.tools && shouldApplyFix("tool_sort")) {
899
1143
  const sorted = stabilizeToolOrder(payload.tools);
900
1144
  const changed = sorted.some(
901
1145
  (t, i) => t.name !== payload.tools[i]?.name
@@ -904,11 +1148,16 @@ globalThis.fetch = async function (url, options) {
904
1148
  payload.tools = sorted;
905
1149
  modified = true;
906
1150
  debugLog("APPLIED: tool order stabilization");
1151
+ recordFixResult("tool_sort", "applied");
1152
+ } else {
1153
+ recordFixResult("tool_sort", "skipped");
907
1154
  }
1155
+ } else if (payload.tools && !shouldApplyFix("tool_sort")) {
1156
+ debugLog("SKIPPED: tool sort fix disabled via env var");
908
1157
  }
909
1158
 
910
1159
  // Bug 2b: Stabilize fingerprint in attribution header
911
- if (payload.system && payload.messages) {
1160
+ if (payload.system && payload.messages && shouldApplyFix("fingerprint")) {
912
1161
  const fix = stabilizeFingerprint(payload.system, payload.messages);
913
1162
  if (fix) {
914
1163
  payload.system = [...payload.system];
@@ -918,7 +1167,12 @@ globalThis.fetch = async function (url, options) {
918
1167
  };
919
1168
  modified = true;
920
1169
  debugLog("APPLIED: fingerprint stabilized from", fix.oldFingerprint, "to", fix.stableFingerprint);
1170
+ recordFixResult("fingerprint", "applied");
1171
+ } else {
1172
+ recordFixResult("fingerprint", "skipped");
921
1173
  }
1174
+ } else if (payload.system && payload.messages && !shouldApplyFix("fingerprint")) {
1175
+ debugLog("SKIPPED: fingerprint fix disabled via env var");
922
1176
  }
923
1177
 
924
1178
  // Bug 6: Identity string normalization for Agent()/SendMessage() cache parity
@@ -931,7 +1185,7 @@ globalThis.fetch = async function (url, options) {
931
1185
  // turn even though system[2] (the actual instructions) is byte-identical.
932
1186
  // Confirmed by @labzink via mitmproxy on #44724.
933
1187
  // Opt-in because it's a model-perceivable behavior change (subagent thinks it's CC).
934
- if (NORMALIZE_IDENTITY && payload.system && Array.isArray(payload.system)) {
1188
+ if (NORMALIZE_IDENTITY && shouldApplyFix("identity") && payload.system && Array.isArray(payload.system)) {
935
1189
  const CANONICAL = "You are Claude Code, Anthropic's official CLI for Claude.";
936
1190
  const AGENT_SDK = "You are a Claude agent, built on Anthropic's Claude Agent SDK.";
937
1191
  let normalized = 0;
@@ -949,6 +1203,9 @@ globalThis.fetch = async function (url, options) {
949
1203
  if (normalized > 0) {
950
1204
  modified = true;
951
1205
  debugLog(`APPLIED: identity normalized on ${normalized} system block(s) (Agent SDK → Claude Code)`);
1206
+ recordFixResult("identity", "applied");
1207
+ } else {
1208
+ recordFixResult("identity", "skipped");
952
1209
  }
953
1210
  }
954
1211
 
@@ -971,7 +1228,7 @@ globalThis.fetch = async function (url, options) {
971
1228
  // send cache_control without ttl (defaulting to 5m server-side).
972
1229
  // The server honors whatever TTL the client requests — so we inject it.
973
1230
  // Discovered by @TigerKay1926 on #42052 using our GrowthBook flag dump.
974
- if (payload.system) {
1231
+ if (payload.system && shouldApplyFix("ttl")) {
975
1232
  let ttlInjected = 0;
976
1233
  payload.system = payload.system.map((block) => {
977
1234
  if (block.cache_control?.type === "ephemeral" && !block.cache_control.ttl) {
@@ -996,7 +1253,12 @@ globalThis.fetch = async function (url, options) {
996
1253
  if (ttlInjected > 0) {
997
1254
  modified = true;
998
1255
  debugLog(`APPLIED: 1h TTL injected on ${ttlInjected} cache_control block(s)`);
1256
+ recordFixResult("ttl", "applied");
1257
+ } else {
1258
+ recordFixResult("ttl", "skipped");
999
1259
  }
1260
+ } else if (payload.system && !shouldApplyFix("ttl")) {
1261
+ debugLog("SKIPPED: TTL injection disabled via env var");
1000
1262
  }
1001
1263
 
1002
1264
  if (modified) {
@@ -1199,6 +1461,7 @@ async function drainTTLFromClone(clone, model, quotaHeaders) {
1199
1461
  if (event.type === "message_start" && event.message?.usage) {
1200
1462
  const u = event.message.usage;
1201
1463
  startUsage = u;
1464
+ _trackCacheRatio(u);
1202
1465
  const cc = u.cache_creation || {};
1203
1466
  const e1h = cc.ephemeral_1h_input_tokens ?? 0;
1204
1467
  const e5m = cc.ephemeral_5m_input_tokens ?? 0;
@@ -0,0 +1,86 @@
1
+ #!/bin/bash
2
+ # Status line: show quota % and burn rate from claude-meter JSONL
3
+ # Rate is calculated from window start (reset_time - window_size) to now
4
+ # No prev file needed — each reading is self-contained
5
+
6
+ input=$(cat)
7
+
8
+ JSONL="$HOME/.claude/claude-meter.jsonl"
9
+
10
+ if [ -f "$JSONL" ]; then
11
+ last=$(tail -1 "$JSONL" 2>/dev/null)
12
+
13
+ result=$(echo "$last" | python3 -c "
14
+ import sys, json
15
+ from datetime import datetime, timezone
16
+
17
+ r = json.load(sys.stdin)
18
+ q5h = int(r['q5h'] * 100)
19
+ q7d = int(r.get('q7d', 0) * 100)
20
+ overage = r.get('qoverage', '')
21
+ ts = r.get('ts', '')
22
+ q5h_reset = r.get('q5h_reset', 0)
23
+ q7d_reset = r.get('q7d_reset', 0)
24
+
25
+ now = datetime.fromisoformat(ts.replace('Z', '+00:00'))
26
+
27
+ # Q5h: 5-hour window, rate = pct / minutes elapsed since window start
28
+ rate5 = ''
29
+ if q5h_reset > 0:
30
+ window_start = datetime.fromtimestamp(q5h_reset, tz=timezone.utc) - __import__('datetime').timedelta(hours=5)
31
+ elapsed_min = (now - window_start).total_seconds() / 60
32
+ if elapsed_min > 1 and q5h > 0:
33
+ rate5 = '{:+.1f}'.format(q5h / elapsed_min)
34
+
35
+ # Q7d: 7-day window
36
+ rate7 = ''
37
+ if q7d_reset > 0:
38
+ window_start_7d = datetime.fromtimestamp(q7d_reset, tz=timezone.utc) - __import__('datetime').timedelta(days=7)
39
+ elapsed_min_7d = (now - window_start_7d).total_seconds() / 60
40
+ if elapsed_min_7d > 1 and q7d > 0:
41
+ rate7 = '{:+.1f}'.format(q7d / (elapsed_min_7d / 60))
42
+
43
+ label = 'Q5h: {}%'.format(q5h)
44
+ if rate5:
45
+ label += ' ({}%/m)'.format(rate5)
46
+ label += ' | Q7d: {}%'.format(q7d)
47
+ if rate7:
48
+ label += ' ({}%/hr)'.format(rate7)
49
+ if overage == 'active':
50
+ label += ' | OVERAGE'
51
+
52
+ # Add TTL tier from quota-status.json (written by interceptor)
53
+ import os, pathlib
54
+ qs_path = pathlib.Path.home() / '.claude' / 'quota-status.json'
55
+ try:
56
+ qs = json.load(open(qs_path))
57
+ ttl = qs.get('cache', {}).get('ttl_tier', '')
58
+ hit = qs.get('cache', {}).get('hit_rate', '')
59
+ if ttl:
60
+ if ttl == '5m':
61
+ label += ' | \033[31mTTL:5m\033[0m' # red
62
+ # When on 5m tier, show the cold-rebuild size so users know
63
+ # the cost of idling past 5 minutes
64
+ cache_cr = qs.get('cache', {}).get('cache_creation', 0)
65
+ cache_rd = qs.get('cache', {}).get('cache_read', 0)
66
+ prefix = cache_cr + cache_rd
67
+ if prefix > 0:
68
+ if prefix >= 1_000_000:
69
+ label += ' \033[31m\u26A0 idle >5m = {:.1f}M rebuild\033[0m'.format(prefix / 1_000_000)
70
+ else:
71
+ label += ' \033[31m\u26A0 idle >5m = {:.0f}K rebuild\033[0m'.format(prefix / 1_000)
72
+ else:
73
+ label += ' | TTL:' + ttl
74
+ if hit and hit != 'N/A':
75
+ label += ' ' + hit + '%'
76
+ peak = qs.get('peak_hour', False)
77
+ if peak:
78
+ label += ' | \033[33mPEAK\033[0m' # yellow
79
+ except Exception:
80
+ pass
81
+
82
+ print(label)
83
+ " 2>/dev/null)
84
+
85
+ [ -n "$result" ] && echo "$result"
86
+ fi