claude-code-cache-fix 1.7.1 → 1.8.0
- package/README.md +103 -0
- package/package.json +1 -1
- package/preload.mjs +269 -6
- package/tools/quota-statusline.sh +86 -0
package/README.md
CHANGED
@@ -105,6 +105,103 @@ The module intercepts `globalThis.fetch` before Claude Code makes API calls to `
 
 All fixes are idempotent — if nothing needs fixing, the request passes through unmodified. The interceptor is read-only with respect to your conversation; it only normalizes the request structure before it hits the API.
 
+## Graduating from Fixes
+
+The interceptor serves three purposes with different lifecycles:
+
+| Purpose | Examples | When to disable |
+|---------|----------|-----------------|
+| **Bug fixes** | Block relocation, fingerprint, tool sort, TTL | When CC fixes the underlying bug — check the health line |
+| **Monitoring** | Quota tracking, microcompact detection, GrowthBook flags | Keep permanently — these detect future regressions |
+| **Optimizations** | Image stripping, output efficiency rewrite | Keep as long as they help your workflow |
+
+### Health status
+
+On first API call, the interceptor logs a health status line (requires `CACHE_FIX_DEBUG=1`):
+
+```
+cache-fix health: relocate=active(2h ago) fingerprint=dormant(5 clean sessions) tool_sort=active ttl=active identity=waiting
+```
+
+Status meanings:
+- **active(Xh ago)** — fix was applied recently
+- **dormant(N clean sessions)** — bug not detected in N resume sessions; CC may have fixed it
+- **safety-blocked(Nx)** — round-trip verification failed; CC changed its algorithm, fix auto-disabled
+- **waiting** — fix hasn't been triggered yet
+
+When a fix shows `dormant`, you can safely disable it:
+```bash
+export CACHE_FIX_SKIP_RELOCATE=1 # example
+```
+
+To disable all fixes but keep monitoring:
+```bash
+export CACHE_FIX_DISABLED=1
+```
+
+### Regression detection
+
+If cache_read ratio drops below 50% across 5+ calls after disabling fixes, you'll see:
+```
+REGRESSION WARNING: cache_read ratio averaged 12% across last 5 calls.
+Fixes are disabled — consider re-enabling to recover cache performance.
+```
+
+## Safety
+
+### Fingerprint round-trip verification
+
+Before rewriting the `cc_version` fingerprint, the interceptor verifies that its
+hardcoded salt and character indices reproduce the fingerprint Claude Code sent.
+If verification fails (CC changed its algorithm), the rewrite is skipped automatically.
+This ensures the interceptor can never make cache performance *worse* than stock CC.
+
+### Fail-safe design
+
+Every fix is designed to fail to a no-op:
+- If block detection regexes don't match → blocks aren't relocated (CC behavior)
+- If fingerprint format changes → fingerprint isn't rewritten (CC behavior)
+- If tool sort produces no changes → payload passes through untouched
+- If TTL injection target structure changes → TTL isn't injected (CC behavior)
+
+The interceptor can only *help* or *do nothing*. It cannot make things worse.
+
+## Status line — quota warnings in real time
+
+The interceptor writes quota state to `~/.claude/quota-status.json` on every API call. The included `tools/quota-statusline.sh` script reads this file and displays a live status line in Claude Code showing:
+
+- **Q5h %** with burn rate (%/min)
+- **Q7d %** with burn rate (%/hr)
+- **TTL tier** — shows `TTL:1h` when healthy, **`TTL:5m` in red when the server has downgraded you** (typically at Q5h ≥ 100%)
+- **PEAK** in yellow during weekday peak hours (13:00–19:00 UTC)
+- **Cache hit rate %**
+- **OVERAGE** flag when active
+
+### Setup
+
+Copy the script and configure Claude Code to use it:
+
+```bash
+# Copy from the npm package to Claude Code's hooks directory
+mkdir -p ~/.claude/hooks
+cp "$(npm root -g)/claude-code-cache-fix/tools/quota-statusline.sh" ~/.claude/hooks/
+chmod +x ~/.claude/hooks/quota-statusline.sh
+```
+
+Add to `~/.claude/settings.json`:
+
+```json
+{
+  "statusLine": {
+    "command": "~/.claude/hooks/quota-statusline.sh"
+  }
+}
+```
+
+### Why this matters
+
+When the server downgrades your TTL to 5m (Layer 2 — quota-aware downgrade at Q5h ≥ 100%), **every idle longer than 5 minutes causes a full context rebuild**. Without the status line, this is invisible — you just notice things getting slower and more expensive. With the status line, the red `TTL:5m` warning tells you immediately: **stop working, wait for the Q5h window to reset, then resume**. Powering through overage compounds the drain; pausing breaks the cycle.
+
 ## Image stripping
 
 Images read via the Read tool are encoded as base64 and stored in `tool_result` blocks in conversation history. They ride along on **every subsequent API call** until compaction. A single 500KB image costs ~62,500 tokens per turn in carry-forward.
@@ -305,6 +402,12 @@ Snapshots are saved to `~/.claude/cache-fix-snapshots/` and diff reports are gen
 | `CACHE_FIX_IMAGE_KEEP_LAST` | `0` | Keep images in last N user messages (0 = disabled) |
 | `CACHE_FIX_OUTPUT_EFFICIENCY_REPLACEMENT` | unset | Replace Claude Code's `# Output efficiency` system-prompt section before the request is sent |
 | `CACHE_FIX_USAGE_LOG` | `~/.claude/usage.jsonl` | Path for per-call usage telemetry log |
+| `CACHE_FIX_DISABLED` | `0` | Disable all bug fixes; keep monitoring + optimizations active |
+| `CACHE_FIX_SKIP_RELOCATE` | `0` | Skip block relocation fix (Bug 1) |
+| `CACHE_FIX_SKIP_FINGERPRINT` | `0` | Skip fingerprint stabilization (Bug 2b) |
+| `CACHE_FIX_SKIP_TOOL_SORT` | `0` | Skip tool ordering stabilization (Bug 2a) |
+| `CACHE_FIX_SKIP_TTL` | `0` | Skip 1h TTL injection (Bug 5) |
+| `CACHE_FIX_SKIP_IDENTITY` | `0` | Skip identity normalization (Bug 6) |
 
 ## Limitations
 
package/package.json
CHANGED
package/preload.mjs
CHANGED
@@ -83,6 +83,25 @@ function extractRealUserMessageText(messages) {
   return "";
 }
 
+/**
+ * Extract text from messages[0] the way CC's original fingerprint code does —
+ * including meta/attachment blocks. Used only for round-trip verification.
+ */
+function extractFirstMessageText(messages) {
+  if (!Array.isArray(messages) || messages.length === 0) return "";
+  const first = messages[0];
+  if (!first || first.role !== "user") return "";
+  const content = first.content;
+  if (typeof content === "string") return content;
+  if (!Array.isArray(content)) return "";
+  for (const block of content) {
+    if (block.type === "text" && typeof block.text === "string") {
+      return block.text;
+    }
+  }
+  return "";
+}
+
 /**
  * Extract current cc_version from system prompt blocks and recompute with
  * stable fingerprint. Returns { oldVersion, newVersion, stableFingerprint }.
@@ -107,6 +126,23 @@ function stabilizeFingerprint(system, messages) {
   const baseVersion = dotParts.slice(0, 3).join("."); // "2.1.87"
   const oldFingerprint = dotParts[3]; // "a3f"
 
+  // --- SAFETY: Round-trip verification ---
+  // Verify our salt/indices reproduce CC's fingerprint for the ORIGINAL
+  // message text (messages[0] content, which is what CC used).
+  // If our computation doesn't match, our constants are stale — skip rewrite.
+  const originalText = extractFirstMessageText(messages);
+  const verification = computeFingerprint(originalText, baseVersion);
+  if (verification !== oldFingerprint) {
+    debugLog(
+      "FINGERPRINT SAFETY: round-trip verification failed.",
+      `CC sent '${oldFingerprint}', we computed '${verification}'.`,
+      "Salt/indices may have changed in this CC version. Skipping rewrite."
+    );
+    recordFixResult("fingerprint", "safety_blocked");
+    return null;
+  }
+  // --- END SAFETY ---
+
   // Compute stable fingerprint from real user text
   const realText = extractRealUserMessageText(messages);
   const stableFingerprint = computeFingerprint(realText, baseVersion);
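The round-trip guard in the hunk above can be illustrated with a minimal, self-contained sketch. The hash function, `SALT`, and `INDICES` here are hypothetical stand-ins, since the real `computeFingerprint` and its constants are not shown in this diff. Only the control flow mirrors the patch: recompute the fingerprint from the original first-message text and return `null` when it does not match what was sent.

```javascript
// Hypothetical constants: the real salt and character indices are not part of this diff.
const SALT = "example-salt";
const INDICES = [0, 3, 7];

// Toy stand-in for computeFingerprint: FNV-1a over salt + sampled chars + version,
// truncated to three hex characters.
function toyFingerprint(text, version) {
  const sampled = INDICES.map((i) => text[i] ?? "").join("");
  const input = `${SALT}|${sampled}|${version}`;
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, "0").slice(0, 3);
}

// Mirrors the guard: rewrite only when our constants reproduce the sent fingerprint.
function stabilizeIfVerified(sentFingerprint, originalText, stableText, version) {
  const verification = toyFingerprint(originalText, version);
  if (verification !== sentFingerprint) return null; // constants look stale: leave the request alone
  return {
    oldFingerprint: sentFingerprint,
    stableFingerprint: toyFingerprint(stableText, version),
  };
}

const original = "<meta> resume attachment";
const sent = toyFingerprint(original, "2.1.87");
const ok = stabilizeIfVerified(sent, original, "fix the login bug", "2.1.87");
const blocked = stabilizeIfVerified("zzz", original, "fix the login bug", "2.1.87");
console.log(ok !== null, blocked === null);
```

The design point is that a mismatch is treated as evidence that the constants are out of date, so the safest response is to pass the request through unchanged.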
@@ -588,7 +624,7 @@ function replaceOutputEfficiencySection(text) {
 // Set CACHE_FIX_DEBUG=1 to enable
 // --------------------------------------------------------------------------
 
-import { appendFileSync, readFileSync, writeFileSync, mkdirSync } from "node:fs";
+import { appendFileSync, readFileSync, writeFileSync, mkdirSync, renameSync } from "node:fs";
 import { homedir } from "node:os";
 import { join } from "node:path";
 
@@ -605,6 +641,103 @@ function debugLog(...args) {
   try { appendFileSync(LOG_PATH, line); } catch {}
 }
 
+// --------------------------------------------------------------------------
+// Kill switches — disable fixes while keeping monitoring active
+// --------------------------------------------------------------------------
+
+const FIXES_DISABLED = process.env.CACHE_FIX_DISABLED === "1";
+
+/**
+ * Check if a specific fix should be applied.
+ * Returns false if master kill switch is on OR individual fix is skipped.
+ * Monitoring and optimizations (image strip, output efficiency) are NOT
+ * affected by CACHE_FIX_DISABLED — only bug fixes are.
+ */
+function shouldApplyFix(fixName) {
+  if (FIXES_DISABLED) return false;
+  const skipKey = `CACHE_FIX_SKIP_${fixName.toUpperCase()}`;
+  if (process.env[skipKey] === "1") return false;
+  return true;
+}
+
+// --------------------------------------------------------------------------
+// Persistent effectiveness stats
+// --------------------------------------------------------------------------
+
+const STATS_PATH = join(homedir(), ".claude", "cache-fix-stats.json");
+
+const _STATS_SCHEMA = {
+  relocate: { applied: 0, skipped: 0, bugPresent: 0, resumeScanned: 0, lastApplied: null, lastScanned: null },
+  fingerprint: { applied: 0, skipped: 0, safetyBlocked: 0, lastApplied: null },
+  tool_sort: { applied: 0, skipped: 0, lastApplied: null },
+  ttl: { applied: 0, skipped: 0, lastApplied: null },
+  identity: { applied: 0, skipped: 0, lastApplied: null },
+};
+
+function _createEmptyStats() {
+  return {
+    version: 1,
+    created: new Date().toISOString(),
+    lastUpdated: null,
+    fixes: JSON.parse(JSON.stringify(_STATS_SCHEMA)),
+  };
+}
+
+/** Read stats from disk. Returns empty stats on any error. */
+function readStats() {
+  try {
+    const data = JSON.parse(readFileSync(STATS_PATH, "utf8"));
+    if (data.created) {
+      const ageDays = (Date.now() - new Date(data.created).getTime()) / (1000 * 60 * 60 * 24);
+      if (ageDays > 30) return _createEmptyStats();
+    }
+    for (const [key, schema] of Object.entries(_STATS_SCHEMA)) {
+      if (!data.fixes[key]) data.fixes[key] = { ...schema };
+    }
+    return data;
+  } catch {
+    return _createEmptyStats();
+  }
+}
+
+/** Atomic write: temp file + rename to avoid corruption. */
+function writeStats(stats) {
+  try {
+    stats.lastUpdated = new Date().toISOString();
+    const tmp = STATS_PATH + ".tmp";
+    writeFileSync(tmp, JSON.stringify(stats, null, 2));
+    renameSync(tmp, STATS_PATH);
+  } catch (e) {
+    debugLog("STATS WRITE ERROR:", e?.message);
+  }
+}
+
+function recordFixResult(fixName, result) {
+  const stats = readStats();
+  if (!stats.fixes[fixName]) return;
+  const now = new Date().toISOString();
+  stats.lastUpdated = now;
+  if (result === "applied") {
+    stats.fixes[fixName].applied++;
+    stats.fixes[fixName].lastApplied = now;
+  } else if (result === "skipped") {
+    stats.fixes[fixName].skipped++;
+  } else if (result === "safety_blocked") {
+    stats.fixes[fixName].safetyBlocked = (stats.fixes[fixName].safetyBlocked || 0) + 1;
+  }
+  writeStats(stats);
+}
+
+function recordRelocateScan(bugFound) {
+  const stats = readStats();
+  const now = new Date().toISOString();
+  stats.lastUpdated = now;
+  stats.fixes.relocate.resumeScanned++;
+  stats.fixes.relocate.lastScanned = now;
+  if (bugFound) stats.fixes.relocate.bugPresent++;
+  writeStats(stats);
+}
+
 // --------------------------------------------------------------------------
 // Prefix snapshot — captures message prefix for cross-process diff.
 // Set CACHE_FIX_PREFIXDIFF=1 to enable.
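As a quick illustration of how the kill switches in the hunk above map fix names to environment variables, here is a small sketch. `shouldApplyFixWith` is a hypothetical variant of `shouldApplyFix` that takes the environment as a parameter, so it can be run without touching `process.env`. Unlike the patch, which reads `CACHE_FIX_DISABLED` once at module load, this variant reads it on every call.

```javascript
// Same decision logic as shouldApplyFix in the patch, with the environment
// passed in explicitly (an assumption made for this example only).
function shouldApplyFixWith(env, fixName) {
  if (env.CACHE_FIX_DISABLED === "1") return false; // master kill switch
  return env[`CACHE_FIX_SKIP_${fixName.toUpperCase()}`] !== "1"; // per-fix switch
}

const fixes = ["relocate", "fingerprint", "tool_sort", "ttl", "identity"];

// Only the tool-sort fix is skipped; note the underscore survives toUpperCase().
const env = { CACHE_FIX_SKIP_TOOL_SORT: "1" };
const enabled = fixes.filter((name) => shouldApplyFixWith(env, name));
console.log(enabled.join(","));

// Only the exact string "1" counts; other truthy-looking values do not disable a fix.
const stillOn = shouldApplyFixWith({ CACHE_FIX_SKIP_TTL: "true" }, "ttl");
console.log(stillOn);
```

Because the comparison is a strict match against `"1"`, values such as `true` or `yes` leave the fix enabled.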
@@ -656,6 +789,59 @@ function dumpGrowthBookFlags() {
   }
 }
 
+// --------------------------------------------------------------------------
+// Startup health status line
+// --------------------------------------------------------------------------
+
+let _healthLinePrinted = false;
+
+function _formatTimeSince(isoString) {
+  if (!isoString) return "never";
+  const ms = Date.now() - new Date(isoString).getTime();
+  const hours = Math.floor(ms / (1000 * 60 * 60));
+  const days = Math.floor(hours / 24);
+  if (days > 0) return `${days}d ago`;
+  if (hours > 0) return `${hours}h ago`;
+  const mins = Math.floor(ms / (1000 * 60));
+  return `${mins}m ago`;
+}
+
+function _formatFixStatus(fixName, fixStats, dormantThreshold = 5) {
+  if (fixName === "relocate") {
+    if (fixStats.resumeScanned >= dormantThreshold && fixStats.bugPresent === 0) {
+      return `dormant(${fixStats.resumeScanned} clean sessions)`;
+    }
+  } else {
+    if (fixStats.skipped >= dormantThreshold && fixStats.applied === 0) {
+      return `dormant(${fixStats.skipped} skips)`;
+    }
+  }
+  if (fixStats.safetyBlocked > 0) return `safety-blocked(${fixStats.safetyBlocked}x)`;
+  if (fixStats.lastApplied) return `active(${_formatTimeSince(fixStats.lastApplied)})`;
+  return "waiting";
+}
+
+function printHealthLine() {
+  if (_healthLinePrinted) return;
+  _healthLinePrinted = true;
+  const stats = readStats();
+  const parts = [];
+  for (const [name, fixStats] of Object.entries(stats.fixes)) {
+    const status = _formatFixStatus(name, fixStats);
+    parts.push(`${name}=${status}`);
+    if (status.startsWith("dormant")) {
+      debugLog(`DORMANT: ${name} — CC may have fixed this. Consider CACHE_FIX_SKIP_${name.toUpperCase()}=1`);
+    }
+    if (status.startsWith("safety-blocked")) {
+      debugLog(`SAFETY: ${name} — salt/indices may have changed. Fix is auto-disabled.`);
+    }
+  }
+  debugLog(`HEALTH: ${parts.join(" ")}`);
+  if (FIXES_DISABLED) {
+    debugLog("HEALTH: all fixes disabled via CACHE_FIX_DISABLED=1 (monitoring active)");
+  }
+}
+
 // --------------------------------------------------------------------------
 // Microcompact / budget monitoring
 // --------------------------------------------------------------------------
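The order of checks in `_formatFixStatus` decides which label wins when several conditions hold at once. The sketch below is a condensed copy of that precedence, with time formatting omitted and the function renamed `formatFixStatus` to mark it as illustrative. One consequence, as the patch reads: the fingerprint call site records a `skipped` result whenever `stabilizeFingerprint` returns `null`, which includes the safety-blocked path, so a fix that has been blocked five or more times without ever being applied would be reported as dormant rather than safety-blocked.

```javascript
// Condensed copy of the status precedence in _formatFixStatus (time formatting omitted).
function formatFixStatus(fixName, s, dormantThreshold = 5) {
  if (fixName === "relocate") {
    if (s.resumeScanned >= dormantThreshold && s.bugPresent === 0) {
      return `dormant(${s.resumeScanned} clean sessions)`;
    }
  } else if (s.skipped >= dormantThreshold && s.applied === 0) {
    return `dormant(${s.skipped} skips)`;
  }
  if (s.safetyBlocked > 0) return `safety-blocked(${s.safetyBlocked}x)`;
  if (s.lastApplied) return "active";
  return "waiting";
}

// Dormancy is checked first, so it masks the safety-blocked count here.
const masked = formatFixStatus("fingerprint", { applied: 0, skipped: 5, safetyBlocked: 5, lastApplied: null });
// Below the dormant threshold, the safety-blocked label is visible.
const blockedEarly = formatFixStatus("fingerprint", { applied: 0, skipped: 2, safetyBlocked: 2, lastApplied: null });
// The relocate fix uses resume scans rather than skips.
const relocate = formatFixStatus("relocate", { applied: 3, skipped: 0, bugPresent: 0, resumeScanned: 6, lastApplied: "2025-01-01T00:00:00Z" });
const fresh = formatFixStatus("ttl", { applied: 0, skipped: 0, lastApplied: null });
console.log(masked, blockedEarly, relocate, fresh);
```

When reading a health line, it may therefore be worth checking the `safetyBlocked` counter in `~/.claude/cache-fix-stats.json` before disabling a fix that shows as dormant.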
@@ -801,6 +987,50 @@ function snapshotPrefix(payload) {
   }
 }
 
+// --------------------------------------------------------------------------
+// Cache regression detector
+// --------------------------------------------------------------------------
+
+const _cacheHistory = []; // in-memory ring buffer of { ratio, turn }
+const REGRESSION_MIN_CALLS = 5;
+const REGRESSION_MIN_RATIO = 0.5;
+let _apiCallCount = 0;
+
+function _computeCacheRatio(usage) {
+  if (!usage) return null;
+  const read = usage.cache_read_input_tokens || 0;
+  const creation = usage.cache_creation_input_tokens || 0;
+  const input = usage.input_tokens || 0;
+  const total = read + creation + input;
+  if (total === 0) return null;
+  return read / total;
+}
+
+function _checkCacheRegression() {
+  if (_cacheHistory.length < REGRESSION_MIN_CALLS) return;
+  const recent = _cacheHistory.slice(-REGRESSION_MIN_CALLS);
+  const allLow = recent.every((h) => h.ratio < REGRESSION_MIN_RATIO);
+  if (allLow) {
+    const avgRatio = recent.reduce((sum, h) => sum + h.ratio, 0) / recent.length;
+    debugLog(
+      `REGRESSION WARNING: cache_read ratio averaged ${Math.round(avgRatio * 100)}%`,
+      `across last ${REGRESSION_MIN_CALLS} calls (threshold: ${REGRESSION_MIN_RATIO * 100}%).`,
+      FIXES_DISABLED
+        ? "Fixes are disabled — consider re-enabling to recover cache performance."
+        : "Fixes are active but cache is still degraded — CC may have introduced a new bug."
+    );
+  }
+}
+
+function _trackCacheRatio(usage) {
+  if (_apiCallCount <= 1) return; // skip first call (cache creation, no reads)
+  const ratio = _computeCacheRatio(usage);
+  if (ratio === null) return;
+  _cacheHistory.push({ ratio, turn: _apiCallCount });
+  if (_cacheHistory.length > 20) _cacheHistory.shift(); // ring buffer
+  _checkCacheRegression();
+}
+
 // --------------------------------------------------------------------------
 // Fetch interceptor
 // --------------------------------------------------------------------------
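A short worked example may help make the regression detector's arithmetic concrete. The token counts below are made up for illustration, and `cacheReadRatio` and `isRegression` are simplified restatements of `_computeCacheRatio` and `_checkCacheRegression`, not the package's exported API.

```javascript
// ratio = cache reads / (cache reads + cache writes + uncached input)
function cacheReadRatio(usage) {
  const read = usage.cache_read_input_tokens || 0;
  const creation = usage.cache_creation_input_tokens || 0;
  const input = usage.input_tokens || 0;
  const total = read + creation + input;
  return total === 0 ? null : read / total;
}

const MIN_CALLS = 5;
const MIN_RATIO = 0.5;

// Warn only when every one of the last MIN_CALLS ratios is below the threshold.
function isRegression(ratios) {
  if (ratios.length < MIN_CALLS) return false;
  return ratios.slice(-MIN_CALLS).every((r) => r < MIN_RATIO);
}

// Hypothetical usage payloads: a warm cache and a mostly rebuilt one.
const healthy = cacheReadRatio({ cache_read_input_tokens: 9000, cache_creation_input_tokens: 500, input_tokens: 500 });
const cold = cacheReadRatio({ cache_read_input_tokens: 1200, cache_creation_input_tokens: 8000, input_tokens: 800 });

const sustained = isRegression([cold, cold, cold, cold, cold]);
const interrupted = isRegression([cold, cold, cold, cold, healthy]);
console.log(healthy, cold, sustained, interrupted);
```

Note that the trigger is "all of the last five calls are low", not "the average is low": a single healthy call among the last five suppresses the warning, and the average is only computed for the message text.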
@@ -817,11 +1047,17 @@ globalThis.fetch = async function (url, options) {
 
   if (isMessagesEndpoint && options?.body && typeof options.body === "string") {
     try {
+      _apiCallCount++;
       const payload = JSON.parse(options.body);
       let modified = false;
 
       // One-time GrowthBook flag dump on first API call
       dumpGrowthBookFlags();
+      printHealthLine();
+
+      if (FIXES_DISABLED) {
+        debugLog("CACHE_FIX_DISABLED=1 — all bug fixes bypassed, monitoring active");
+      }
 
       debugLog("--- API call to", urlStr);
       debugLog("message count:", payload.messages?.length);
@@ -832,7 +1068,7 @@ globalThis.fetch = async function (url, options) {
       }
 
       // Bug 1: Relocate resume attachment blocks
-      if (payload.messages) {
+      if (payload.messages && shouldApplyFix("relocate")) {
         // Log message structure for debugging
         if (DEBUG) {
           let firstUserIdx = -1, lastUserIdx = -1;
@@ -868,13 +1104,21 @@ globalThis.fetch = async function (url, options) {
         }
 
         const normalized = normalizeResumeMessages(payload.messages);
+        // Track bug presence for dormancy detection (resume = messages > 5)
+        const isResume = payload.messages.length > 5;
+        if (isResume) recordRelocateScan(normalized !== payload.messages);
+
         if (normalized !== payload.messages) {
           payload.messages = normalized;
           modified = true;
           debugLog("APPLIED: resume message relocation");
+          recordFixResult("relocate", "applied");
         } else {
           debugLog("SKIPPED: resume relocation (not a resume or already correct)");
+          recordFixResult("relocate", "skipped");
         }
+      } else if (payload.messages && !shouldApplyFix("relocate")) {
+        debugLog("SKIPPED: relocate fix disabled via env var");
       }
 
       // Image stripping: remove old tool_result images to reduce token waste
@@ -895,7 +1139,7 @@ globalThis.fetch = async function (url, options) {
       }
 
       // Bug 2a: Stabilize tool ordering
-      if (payload.tools) {
+      if (payload.tools && shouldApplyFix("tool_sort")) {
         const sorted = stabilizeToolOrder(payload.tools);
         const changed = sorted.some(
           (t, i) => t.name !== payload.tools[i]?.name
@@ -904,11 +1148,16 @@ globalThis.fetch = async function (url, options) {
           payload.tools = sorted;
           modified = true;
           debugLog("APPLIED: tool order stabilization");
+          recordFixResult("tool_sort", "applied");
+        } else {
+          recordFixResult("tool_sort", "skipped");
         }
+      } else if (payload.tools && !shouldApplyFix("tool_sort")) {
+        debugLog("SKIPPED: tool sort fix disabled via env var");
       }
 
       // Bug 2b: Stabilize fingerprint in attribution header
-      if (payload.system && payload.messages) {
+      if (payload.system && payload.messages && shouldApplyFix("fingerprint")) {
         const fix = stabilizeFingerprint(payload.system, payload.messages);
         if (fix) {
           payload.system = [...payload.system];
@@ -918,7 +1167,12 @@ globalThis.fetch = async function (url, options) {
           };
           modified = true;
           debugLog("APPLIED: fingerprint stabilized from", fix.oldFingerprint, "to", fix.stableFingerprint);
+          recordFixResult("fingerprint", "applied");
+        } else {
+          recordFixResult("fingerprint", "skipped");
         }
+      } else if (payload.system && payload.messages && !shouldApplyFix("fingerprint")) {
+        debugLog("SKIPPED: fingerprint fix disabled via env var");
       }
 
       // Bug 6: Identity string normalization for Agent()/SendMessage() cache parity
@@ -931,7 +1185,7 @@ globalThis.fetch = async function (url, options) {
       // turn even though system[2] (the actual instructions) is byte-identical.
       // Confirmed by @labzink via mitmproxy on #44724.
       // Opt-in because it's a model-perceivable behavior change (subagent thinks it's CC).
-      if (NORMALIZE_IDENTITY && payload.system && Array.isArray(payload.system)) {
+      if (NORMALIZE_IDENTITY && shouldApplyFix("identity") && payload.system && Array.isArray(payload.system)) {
         const CANONICAL = "You are Claude Code, Anthropic's official CLI for Claude.";
         const AGENT_SDK = "You are a Claude agent, built on Anthropic's Claude Agent SDK.";
         let normalized = 0;
@@ -949,6 +1203,9 @@ globalThis.fetch = async function (url, options) {
         if (normalized > 0) {
           modified = true;
           debugLog(`APPLIED: identity normalized on ${normalized} system block(s) (Agent SDK → Claude Code)`);
+          recordFixResult("identity", "applied");
+        } else {
+          recordFixResult("identity", "skipped");
         }
       }
 
@@ -971,7 +1228,7 @@ globalThis.fetch = async function (url, options) {
       // send cache_control without ttl (defaulting to 5m server-side).
       // The server honors whatever TTL the client requests — so we inject it.
       // Discovered by @TigerKay1926 on #42052 using our GrowthBook flag dump.
-      if (payload.system) {
+      if (payload.system && shouldApplyFix("ttl")) {
         let ttlInjected = 0;
         payload.system = payload.system.map((block) => {
           if (block.cache_control?.type === "ephemeral" && !block.cache_control.ttl) {
@@ -996,7 +1253,12 @@ globalThis.fetch = async function (url, options) {
         if (ttlInjected > 0) {
           modified = true;
           debugLog(`APPLIED: 1h TTL injected on ${ttlInjected} cache_control block(s)`);
+          recordFixResult("ttl", "applied");
+        } else {
+          recordFixResult("ttl", "skipped");
         }
+      } else if (payload.system && !shouldApplyFix("ttl")) {
+        debugLog("SKIPPED: TTL injection disabled via env var");
       }
 
       if (modified) {
@@ -1199,6 +1461,7 @@ async function drainTTLFromClone(clone, model, quotaHeaders) {
   if (event.type === "message_start" && event.message?.usage) {
     const u = event.message.usage;
     startUsage = u;
+    _trackCacheRatio(u);
     const cc = u.cache_creation || {};
     const e1h = cc.ephemeral_1h_input_tokens ?? 0;
     const e5m = cc.ephemeral_5m_input_tokens ?? 0;
package/tools/quota-statusline.sh
ADDED
@@ -0,0 +1,86 @@
+#!/bin/bash
+# Status line: show quota % and burn rate from claude-meter JSONL
+# Rate is calculated from window start (reset_time - window_size) to now
+# No prev file needed — each reading is self-contained
+
+input=$(cat)
+
+JSONL="$HOME/.claude/claude-meter.jsonl"
+
+if [ -f "$JSONL" ]; then
+  last=$(tail -1 "$JSONL" 2>/dev/null)
+
+  result=$(echo "$last" | python3 -c "
+import sys, json
+from datetime import datetime, timezone
+
+r = json.load(sys.stdin)
+q5h = int(r['q5h'] * 100)
+q7d = int(r.get('q7d', 0) * 100)
+overage = r.get('qoverage', '')
+ts = r.get('ts', '')
+q5h_reset = r.get('q5h_reset', 0)
+q7d_reset = r.get('q7d_reset', 0)
+
+now = datetime.fromisoformat(ts.replace('Z', '+00:00'))
+
+# Q5h: 5-hour window, rate = pct / minutes elapsed since window start
+rate5 = ''
+if q5h_reset > 0:
+    window_start = datetime.fromtimestamp(q5h_reset, tz=timezone.utc) - __import__('datetime').timedelta(hours=5)
+    elapsed_min = (now - window_start).total_seconds() / 60
+    if elapsed_min > 1 and q5h > 0:
+        rate5 = '{:+.1f}'.format(q5h / elapsed_min)
+
+# Q7d: 7-day window
+rate7 = ''
+if q7d_reset > 0:
+    window_start_7d = datetime.fromtimestamp(q7d_reset, tz=timezone.utc) - __import__('datetime').timedelta(days=7)
+    elapsed_min_7d = (now - window_start_7d).total_seconds() / 60
+    if elapsed_min_7d > 1 and q7d > 0:
+        rate7 = '{:+.1f}'.format(q7d / (elapsed_min_7d / 60))
+
+label = 'Q5h: {}%'.format(q5h)
+if rate5:
+    label += ' ({}%/m)'.format(rate5)
+label += ' | Q7d: {}%'.format(q7d)
+if rate7:
+    label += ' ({}%/hr)'.format(rate7)
+if overage == 'active':
+    label += ' | OVERAGE'
+
+# Add TTL tier from quota-status.json (written by interceptor)
+import os, pathlib
+qs_path = pathlib.Path.home() / '.claude' / 'quota-status.json'
+try:
+    qs = json.load(open(qs_path))
+    ttl = qs.get('cache', {}).get('ttl_tier', '')
+    hit = qs.get('cache', {}).get('hit_rate', '')
+    if ttl:
+        if ttl == '5m':
+            label += ' | \033[31mTTL:5m\033[0m' # red
+            # When on 5m tier, show the cold-rebuild size so users know
+            # the cost of idling past 5 minutes
+            cache_cr = qs.get('cache', {}).get('cache_creation', 0)
+            cache_rd = qs.get('cache', {}).get('cache_read', 0)
+            prefix = cache_cr + cache_rd
+            if prefix > 0:
+                if prefix >= 1_000_000:
+                    label += ' \033[31m\u26A0 idle >5m = {:.1f}M rebuild\033[0m'.format(prefix / 1_000_000)
+                else:
+                    label += ' \033[31m\u26A0 idle >5m = {:.0f}K rebuild\033[0m'.format(prefix / 1_000)
+        else:
+            label += ' | TTL:' + ttl
+    if hit and hit != 'N/A':
+        label += ' ' + hit + '%'
+    peak = qs.get('peak_hour', False)
+    if peak:
+        label += ' | \033[33mPEAK\033[0m' # yellow
+except:
+    pass
+
+print(label)
+" 2>/dev/null)
+
+  [ -n "$result" ] && echo "$result"
+fi
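The burn-rate arithmetic in the script above can be checked with a small worked example. This is a JavaScript restatement of the script's Python logic, written only for illustration: `burnRate` and `formatRate` are hypothetical helper names, and the timestamps and percentages are made up.

```javascript
// Burn rate = percent used / time elapsed since the window opened,
// where window start = reset time minus window length.
// perMinutes is the unit size in minutes (1 for %/min, 60 for %/hr).
function burnRate(pctUsed, resetEpochSec, nowEpochSec, windowHours, perMinutes) {
  const windowStartSec = resetEpochSec - windowHours * 3600;
  const elapsedMin = (nowEpochSec - windowStartSec) / 60;
  if (elapsedMin <= 1 || pctUsed <= 0) return null; // same guards as the script
  return pctUsed / (elapsedMin / perMinutes);
}

// Signed, one-decimal formatting, matching the script's '{:+.1f}' for these inputs.
function formatRate(rate) {
  return (rate >= 0 ? "+" : "") + rate.toFixed(1);
}

const now = 1_700_000_000; // hypothetical reading time, in epoch seconds

// 5h window that resets in 200 minutes, so 100 minutes have elapsed.
const rate5 = burnRate(40, now + 200 * 60, now, 5, 1);

// 7d window that resets in 5 days, so 48 hours have elapsed.
const rate7 = burnRate(24, now + 5 * 24 * 3600, now, 7 * 24, 60);

console.log(`Q5h: 40% (${formatRate(rate5)}%/m) | Q7d: 24% (${formatRate(rate7)}%/hr)`);
```

Because the rate is averaged over the whole window so far, a burst of usage late in a window raises it only gradually; it is a window-average figure rather than an instantaneous one.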