claude-code-cache-fix 1.0.0 → 1.2.0
- package/README.md +104 -5
- package/package.json +1 -1
- package/preload.mjs +607 -135
package/README.md
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
# claude-code-cache-fix
|
|
2
2
|
|
|
3
|
-
Fixes
|
|
3
|
+
Fixes prompt cache regressions in [Claude Code](https://github.com/anthropics/claude-code) that cause **up to a 20x cost increase** on resumed sessions, and adds monitoring for silent context degradation. Confirmed through v2.1.92.
|
|
4
4
|
|
|
5
5
|
## The problem
|
|
6
6
|
|
|
@@ -14,6 +14,8 @@ Three bugs cause this:
|
|
|
14
14
|
|
|
15
15
|
3. **Non-deterministic tool ordering** — Tool definitions can arrive in different orders between turns, changing request bytes and invalidating the cache key.
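To illustrate the third bug, here is a minimal sketch of why a stable tool order matters. The helper name `stabilizeToolOrder` and the choice of sorting by `name` are assumptions for illustration; this diff does not show how the package itself orders tools.

```javascript
// Hypothetical helper: return a copy of the tools array sorted by name, so the
// serialized request bytes no longer depend on the order the tools arrived in.
function stabilizeToolOrder(tools) {
  if (!Array.isArray(tools)) return tools;
  return [...tools].sort((a, b) => {
    const an = String(a?.name ?? "");
    const bn = String(b?.name ?? "");
    return an < bn ? -1 : an > bn ? 1 : 0;
  });
}

// Same tools, different arrival order on two turns.
const turn1 = [{ name: "Read" }, { name: "Bash" }, { name: "Grep" }];
const turn2 = [{ name: "Grep" }, { name: "Read" }, { name: "Bash" }];

const sameBytes =
  JSON.stringify(stabilizeToolOrder(turn1)) ===
  JSON.stringify(stabilizeToolOrder(turn2));
console.log(sameBytes); // true
```

Without the sort, the two serialized arrays differ even though the tool set is identical, which is enough to change the bytes the cache key depends on.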
|
|
16
16
|
|
|
17
|
+
Additionally, images read via the Read tool persist as base64 in conversation history and are sent on every subsequent API call, compounding token costs silently.
|
|
18
|
+
|
|
17
19
|
## Installation
|
|
18
20
|
|
|
19
21
|
Requires Node.js >= 18 and Claude Code installed via npm (not the standalone binary).
|
|
@@ -76,6 +78,63 @@ The module intercepts `globalThis.fetch` before Claude Code makes API calls to `
|
|
|
76
78
|
|
|
77
79
|
All fixes are idempotent — if nothing needs fixing, the request passes through unmodified. The interceptor is read-only with respect to your conversation; it only normalizes the request structure before it hits the API.
|
|
78
80
|
|
|
81
|
+
## Image stripping
|
|
82
|
+
|
|
83
|
+
Images read via the Read tool are encoded as base64 and stored in `tool_result` blocks in conversation history. They ride along on **every subsequent API call** until compaction. A single 500KB image costs ~62,500 tokens per turn in carry-forward.
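The carry-forward figure can be reproduced with a short calculation. This sketch assumes the same 0.125 tokens-per-byte factor that `preload.mjs` uses for its `estimatedTokens` statistic; the function name `carryForwardTokens` is hypothetical.

```javascript
// Rough estimate only: 0.125 tokens per byte is the factor preload.mjs uses
// for its estimatedTokens statistic, not an exact tokenizer measurement.
const TOKENS_PER_BYTE = 0.125;

// Hypothetical helper: tokens re-sent for one image over a number of turns.
function carryForwardTokens(imageBytes, turns) {
  return Math.ceil(imageBytes * TOKENS_PER_BYTE) * turns;
}

console.log(carryForwardTokens(500_000, 1));  // 62500
console.log(carryForwardTokens(500_000, 20)); // 1250000
```

Under this estimate, one 500KB image left in history for 20 turns accounts for about 1.25 million input tokens.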
|
|
84
|
+
|
|
85
|
+
Enable image stripping to remove old images from tool results:
|
|
86
|
+
|
|
87
|
+
```bash
|
|
88
|
+
export CACHE_FIX_IMAGE_KEEP_LAST=3
|
|
89
|
+
```
|
|
90
|
+
|
|
91
|
+
This keeps images in the last 3 user messages and replaces older ones with a text placeholder. It only targets images inside `tool_result` blocks (Read tool output); user-pasted images are never touched. Files remain on disk for re-reading if needed.
|
|
92
|
+
|
|
93
|
+
Set to `0` (default) to disable.
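The keep-last behavior can be sketched as follows. This is a condensed illustration of the cutoff logic described above, not the package's exact code; the function name `stripOldImages` and the placeholder text are hypothetical.

```javascript
const PLACEHOLDER = "[image stripped from history]"; // illustrative text only

// Condensed sketch: replace images inside tool_result blocks in user messages
// that are older than the last `keepLast` user messages.
function stripOldImages(messages, keepLast) {
  const userIdx = messages.flatMap((m, i) => (m.role === "user" ? [i] : []));
  if (keepLast <= 0 || userIdx.length <= keepLast) return messages;
  const cutoff = userIdx[userIdx.length - keepLast]; // first "recent" message
  return messages.map((msg, i) => {
    if (msg.role !== "user" || i >= cutoff || !Array.isArray(msg.content)) return msg;
    const content = msg.content.map((block) => {
      if (block.type !== "tool_result" || !Array.isArray(block.content)) return block;
      const inner = block.content.map((item) =>
        item.type === "image" ? { type: "text", text: PLACEHOLDER } : item
      );
      return { ...block, content: inner };
    });
    return { ...msg, content };
  });
}

const conversation = [
  { role: "user", content: [
    { type: "tool_result", tool_use_id: "t1",
      content: [{ type: "image", source: { type: "base64", data: "AAAA" } }] },
    { type: "image", source: { type: "base64", data: "BBBB" } }, // user-pasted
  ] },
  { role: "assistant", content: [{ type: "text", text: "ok" }] },
  { role: "user", content: [{ type: "text", text: "next" }] },
  { role: "assistant", content: [{ type: "text", text: "ok" }] },
  { role: "user", content: [{ type: "text", text: "last" }] },
];

const stripped = stripOldImages(conversation, 2);
console.log(stripped[0].content[0].content[0].type); // "text"
console.log(stripped[0].content[1].type);            // "image"
```

With `keepLast` set to 2, only the first user message falls before the cutoff: its Read-tool image becomes a placeholder, while the pasted image in the same message is left alone.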
|
|
94
|
+
|
|
95
|
+
## Prefix lock (resume cache hit)
|
|
96
|
+
|
|
97
|
+
Even with the block relocation fix, the first API call after `--resume` triggers a full cache rebuild because CC reassembles messages with different system-reminder blocks, changing the prefix bytes. On a 300k token context at Opus rates, that's ~$2.80 per resume.
|
|
98
|
+
|
|
99
|
+
The prefix lock eliminates this by saving the exact `messages[0]` content after all fixes are applied, then replaying it on the next resume to produce a byte-identical prefix.
|
|
100
|
+
|
|
101
|
+
```bash
|
|
102
|
+
export CACHE_FIX_PREFIX_LOCK=1
|
|
103
|
+
```
|
|
104
|
+
|
|
105
|
+
Safety guards: the lock only fires when ALL of the following hold:
|
|
106
|
+
- System prompt hash (same project, no CLAUDE.md changes)
|
|
107
|
+
- Tools hash (no MCP/plugin changes)
|
|
108
|
+
- User message text (same conversation)
|
|
109
|
+
- User content hash (no substantive context changes)
|
|
110
|
+
- Not a post-compaction conversation
|
|
111
|
+
|
|
112
|
+
If any guard fails, the lock skips and falls back to normal behavior. The worst case is a skip — the lock cannot increase costs or cause context loss.
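The guard list above can be read as a chain of early exits. The following is a sketch under stated assumptions: the snapshot field names mirror identifiers visible in `preload.mjs` (`sysHash`, `toolHash`), while `userText`, `contentHash`, `isCompacted`, the function name, and the reason strings are hypothetical.

```javascript
// Hypothetical guard check: returns a skip reason, or null when every guard
// passes and replaying the saved messages[0] would be allowed.
function prefixLockSkipReason(saved, current) {
  if (!saved) return "no saved snapshot";
  if (current.isCompacted) return "post-compaction conversation";
  if (saved.sysHash !== current.sysHash) return "system prompt changed";
  if (saved.toolHash !== current.toolHash) return "tools changed";
  if (saved.userText !== current.userText) return "user message text changed";
  if (saved.contentHash !== current.contentHash) return "user content changed";
  return null;
}

const saved = { sysHash: "a1", toolHash: "b2", userText: "fix the bug", contentHash: "c3" };

console.log(prefixLockSkipReason(saved, { ...saved, isCompacted: false })); // null
console.log(prefixLockSkipReason(saved, { ...saved, toolHash: "zz", isCompacted: false })); // "tools changed"
```

Because every failing guard returns before any replay happens, a mismatch leaves the request on the normal path.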
|
|
113
|
+
|
|
114
|
+
Set to `0` (default) to disable.
|
|
115
|
+
|
|
116
|
+
## Monitoring
|
|
117
|
+
|
|
118
|
+
The interceptor includes monitoring for several additional issues identified by the community:
|
|
119
|
+
|
|
120
|
+
### Microcompact / budget enforcement
|
|
121
|
+
|
|
122
|
+
Claude Code silently replaces old tool results with `[Old tool result content cleared]` via server-controlled mechanisms (GrowthBook flags). A 200,000-character aggregate cap and per-tool caps (Bash: 30K, Grep: 20K) truncate older results without notification. There is no `DISABLE_MICROCOMPACT` environment variable.
|
|
123
|
+
|
|
124
|
+
The interceptor detects cleared tool results and logs counts. When total tool result characters approach the 200K threshold, a warning is logged.
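A minimal sketch of this kind of scan is shown below. The marker string and the 200,000-character cap come from the text above; the function name `scanToolResults` and the 80% warning ratio are assumptions, since the README does not say how close "approach" is.

```javascript
const CLEARED_MARKER = "[Old tool result content cleared]";
const BUDGET_CAP = 200_000;
const WARN_RATIO = 0.8; // assumption: warn at 80% of the cap

// Hypothetical scan: count tool results, cleared results, and total characters.
function scanToolResults(messages) {
  let total = 0;
  let cleared = 0;
  let chars = 0;
  for (const msg of messages) {
    if (msg.role !== "user" || !Array.isArray(msg.content)) continue;
    for (const block of msg.content) {
      if (block.type !== "tool_result") continue;
      total++;
      let text = "";
      if (typeof block.content === "string") {
        text = block.content;
      } else if (Array.isArray(block.content)) {
        text = block.content
          .filter((item) => item.type === "text")
          .map((item) => item.text ?? "")
          .join("");
      }
      if (text.includes(CLEARED_MARKER)) cleared++;
      chars += text.length;
    }
  }
  return { total, cleared, chars, nearBudget: chars >= BUDGET_CAP * WARN_RATIO };
}

const report = scanToolResults([
  { role: "user", content: [{ type: "tool_result", content: CLEARED_MARKER }] },
  { role: "user", content: [{ type: "tool_result", content: [{ type: "text", text: "x".repeat(170_000) }] }] },
]);
console.log(report.total, report.cleared, report.nearBudget); // 2 1 true
```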
|
|
125
|
+
|
|
126
|
+
### False rate limiter
|
|
127
|
+
|
|
128
|
+
The client can generate synthetic "Rate limit reached" errors without making an API call, identifiable by `"model": "<synthetic>"`. The interceptor logs these events.
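Detection reduces to a single field check. In this sketch the payload is assumed to be an already-parsed JSON object; where the field appears (response body or transcript entry) is not specified in this README, and the function name is hypothetical.

```javascript
// Hypothetical check: a client-generated rate-limit error carries the
// placeholder model name "<synthetic>" instead of a real model id.
function isSyntheticRateLimit(payload) {
  return payload !== null && typeof payload === "object" && payload.model === "<synthetic>";
}

console.log(isSyntheticRateLimit({ model: "<synthetic>" }));     // true
console.log(isSyntheticRateLimit({ model: "some-real-model" })); // false
```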
|
|
129
|
+
|
|
130
|
+
### GrowthBook flag dump
|
|
131
|
+
|
|
132
|
+
On the first API call, the interceptor reads `~/.claude.json` and logs the current state of cost/cache-relevant server-controlled flags (hawthorn_window, pewter_kestrel, slate_heron, session_memory, etc.).
|
|
133
|
+
|
|
134
|
+
### Quota tracking
|
|
135
|
+
|
|
136
|
+
Response headers are parsed for `anthropic-ratelimit-unified-5h-utilization` and `7d-utilization`, saved to `~/.claude/quota-status.json` for consumption by status line hooks or other tools.
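Reading one of these headers might look like the sketch below. It uses the standard `Headers` API available as a global in Node.js 18+; the helper name `readUtilization` is hypothetical, and the unit of the value (fraction or percentage) is not stated in this README, so the number is returned as-is. Writing the result to `~/.claude/quota-status.json` is omitted.

```javascript
// Header name as given in the README. The 7-day header is only abbreviated
// there, so it is left out rather than guessed.
const FIVE_HOUR_HEADER = "anthropic-ratelimit-unified-5h-utilization";

// Hypothetical helper: parse a numeric utilization header, or return null
// when the header is missing or not a number.
function readUtilization(headers, name) {
  const raw = headers.get(name);
  if (raw === null) return null;
  const value = Number.parseFloat(raw);
  return Number.isNaN(value) ? null : value;
}

const sample = new Headers({ [FIVE_HOUR_HEADER]: "0.42" });
console.log(readUtilization(sample, FIVE_HOUR_HEADER)); // 0.42
console.log(readUtilization(new Headers(), FIVE_HOUR_HEADER)); // null
```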
|
|
137
|
+
|
|
79
138
|
## Debug mode
|
|
80
139
|
|
|
81
140
|
Enable debug logging to verify the fix is working:
|
|
@@ -88,31 +147,71 @@ Logs are written to `~/.claude/cache-fix-debug.log`. Look for:
|
|
|
88
147
|
- `APPLIED: resume message relocation` — block scatter was detected and fixed
|
|
89
148
|
- `APPLIED: tool order stabilization` — tools were reordered
|
|
90
149
|
- `APPLIED: fingerprint stabilized from XXX to YYY` — fingerprint was corrected
|
|
91
|
-
- `
|
|
150
|
+
- `APPLIED: stripped N images from old tool results` — images were stripped
|
|
151
|
+
- `MICROCOMPACT: N/M tool results cleared` — microcompact degradation detected
|
|
152
|
+
- `BUDGET WARNING: tool result chars at N / 200,000 threshold` — approaching budget cap
|
|
153
|
+
- `FALSE RATE LIMIT: synthetic model detected` — client-side false rate limit
|
|
154
|
+
- `GROWTHBOOK FLAGS: {...}` — server-controlled feature flags on first call
|
|
155
|
+
- `PREFIX LOCK: APPLIED — replayed saved messages[0]` — resume cache hit achieved
|
|
156
|
+
- `PREFIX LOCK: skipped — <reason>` — guard prevented lock (expected, safe)
|
|
157
|
+
- `SKIPPED: resume relocation (not a resume or already correct)` — no fix needed
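For a quick overview of a long debug log, the markers above can be tallied. This is a sketch only: it assumes each log entry is a single line containing one of the listed prefixes followed by a colon, and the function name `summarizeLog` is hypothetical. It takes the log text as a string, so reading `~/.claude/cache-fix-debug.log` is left to the caller.

```javascript
// Marker prefixes taken from the list above.
const KINDS = [
  "APPLIED", "MICROCOMPACT", "BUDGET WARNING", "FALSE RATE LIMIT",
  "GROWTHBOOK FLAGS", "PREFIX LOCK", "SKIPPED",
];

// Hypothetical helper: count log lines per marker prefix.
function summarizeLog(text) {
  const counts = Object.fromEntries(KINDS.map((kind) => [kind, 0]));
  for (const line of text.split("\n")) {
    const kind = KINDS.find((k) => line.includes(`${k}:`));
    if (kind) counts[kind]++;
  }
  return counts;
}

const sampleLog = [
  "[2025-01-01T00:00:00.000Z] APPLIED: tool order stabilization",
  "[2025-01-01T00:00:01.000Z] APPLIED: stripped 2 images from old tool results",
  "[2025-01-01T00:00:02.000Z] MICROCOMPACT: 3/40 tool results cleared",
  "[2025-01-01T00:00:03.000Z] SKIPPED: resume relocation (not a resume or already correct)",
].join("\n");

const summary = summarizeLog(sampleLog);
console.log(summary.APPLIED, summary.MICROCOMPACT, summary.SKIPPED); // 2 1 1
```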
|
|
158
|
+
|
|
159
|
+
### Prefix diff mode
|
|
160
|
+
|
|
161
|
+
Enable cross-process prefix snapshot diffing to diagnose cache busts on restart:
|
|
162
|
+
|
|
163
|
+
```bash
|
|
164
|
+
CACHE_FIX_PREFIXDIFF=1 claude-fixed
|
|
165
|
+
```
|
|
166
|
+
|
|
167
|
+
Snapshots are saved to `~/.claude/cache-fix-snapshots/` and diff reports are generated on the first API call after a restart.
|
|
168
|
+
|
|
169
|
+
## Environment variables
|
|
170
|
+
|
|
171
|
+
| Variable | Default | Description |
|
|
172
|
+
|----------|---------|-------------|
|
|
173
|
+
| `CACHE_FIX_DEBUG` | `0` | Enable debug logging to `~/.claude/cache-fix-debug.log` |
|
|
174
|
+
| `CACHE_FIX_PREFIXDIFF` | `0` | Enable prefix snapshot diffing |
|
|
175
|
+
| `CACHE_FIX_IMAGE_KEEP_LAST` | `0` | Keep images in last N user messages (0 = disabled) |
|
|
176
|
+
| `CACHE_FIX_PREFIX_LOCK` | `0` | Replay saved messages[0] on resume for cache hit (0 = disabled) |
|
|
92
177
|
|
|
93
178
|
## Limitations
|
|
94
179
|
|
|
95
180
|
- **npm installation only** — The standalone Claude Code binary has Zig-level attestation that bypasses Node.js. This fix only works with the npm package (`npm install -g @anthropic-ai/claude-code`).
|
|
96
181
|
- **Overage TTL downgrade** — Exceeding 100% of the 5-hour quota triggers a server-enforced TTL downgrade from 1h to 5m. This is a server-side decision and cannot be fixed client-side. The interceptor prevents the cache instability that can push you into overage in the first place.
|
|
182
|
+
- **Microcompact is not preventable** — The monitoring features detect context degradation but cannot prevent it. The microcompact and budget enforcement mechanisms are server-controlled via GrowthBook flags with no client-side disable option.
|
|
97
183
|
- **Version coupling** — The fingerprint salt and block detection heuristics are derived from Claude Code internals. A major refactor could require an update to this package.
|
|
98
184
|
|
|
99
185
|
## Tracked issues
|
|
100
186
|
|
|
101
187
|
- [#34629](https://github.com/anthropics/claude-code/issues/34629) — Original resume cache regression report
|
|
102
|
-
- [#40524](https://github.com/anthropics/claude-code/issues/40524) — Within-session fingerprint invalidation
|
|
103
|
-
- [#42052](https://github.com/anthropics/claude-code/issues/42052) — Community interceptor development
|
|
188
|
+
- [#40524](https://github.com/anthropics/claude-code/issues/40524) — Within-session fingerprint invalidation, image persistence
|
|
189
|
+
- [#42052](https://github.com/anthropics/claude-code/issues/42052) — Community interceptor development, TTL downgrade discovery
|
|
104
190
|
- [#43044](https://github.com/anthropics/claude-code/issues/43044) — Resume loads 0% context on v2.1.91
|
|
105
191
|
- [#43657](https://github.com/anthropics/claude-code/issues/43657) — Resume cache invalidation confirmed on v2.1.92
|
|
106
192
|
- [#44045](https://github.com/anthropics/claude-code/issues/44045) — SDK-level reproduction with token measurements
|
|
107
193
|
|
|
194
|
+
## Related research
|
|
195
|
+
|
|
196
|
+
- **[@ArkNill/claude-code-hidden-problem-analysis](https://github.com/ArkNill/claude-code-hidden-problem-analysis)** — Systematic proxy-based analysis of 7 bugs including microcompact, budget enforcement, false rate limiter, and extended thinking quota impact. The monitoring features in v1.1.0 are informed by this research.
|
|
197
|
+
- **[@Renvect/X-Ray-Claude-Code-Interceptor](https://github.com/Renvect/X-Ray-Claude-Code-Interceptor)** — Diagnostic HTTPS proxy with real-time dashboard, system prompt section diffing, per-tool stripping thresholds, and multi-stream JSONL logging. Works with any Claude client that supports `ANTHROPIC_BASE_URL` (CLI, VS Code extension, desktop app), complementing this package's CLI-only `NODE_OPTIONS` approach.
|
|
198
|
+
|
|
108
199
|
## Contributors
|
|
109
200
|
|
|
110
201
|
- **[@VictorSun92](https://github.com/VictorSun92)** — Original monkey-patch fix for v2.1.88, identified partial scatter on v2.1.90, contributed forward-scan detection, correct block ordering, and tighter block matchers
|
|
111
202
|
- **[@jmarianski](https://github.com/jmarianski)** — Root cause analysis via MITM proxy capture and Ghidra reverse engineering, multi-mode cache test script
|
|
112
|
-
- **[@cnighswonger](https://github.com/cnighswonger)** — Fingerprint stabilization, tool ordering fix,
|
|
203
|
+
- **[@cnighswonger](https://github.com/cnighswonger)** — Fingerprint stabilization, tool ordering fix, image stripping, monitoring features, overage TTL downgrade discovery, package maintainer
|
|
204
|
+
- **[@ArkNill](https://github.com/ArkNill)** — Microcompact mechanism analysis, GrowthBook flag documentation, false rate limiter identification
|
|
205
|
+
- **[@Renvect](https://github.com/Renvect)** — Image duplication discovery, cross-project directory contamination analysis
|
|
113
206
|
|
|
114
207
|
If you contributed to the community effort on these issues and aren't listed here, please open an issue or PR — we want to credit everyone properly.
|
|
115
208
|
|
|
209
|
+
## Support
|
|
210
|
+
|
|
211
|
+
If this tool saved you money, consider buying me a coffee:
|
|
212
|
+
|
|
213
|
+
<a href="https://buymeacoffee.com/vsits" target="_blank"><img src="https://cdn.buymeacoffee.com/buttons/v2/default-yellow.png" alt="Buy Me A Coffee" style="height: 60px !important;width: 217px !important;" ></a>
|
|
214
|
+
|
|
116
215
|
## License
|
|
117
216
|
|
|
118
217
|
[MIT](LICENSE)
|
package/package.json
CHANGED
package/preload.mjs
CHANGED
|
@@ -8,51 +8,42 @@
|
|
|
8
8
|
// later user messages instead of messages[0]. This breaks the prompt cache
|
|
9
9
|
// prefix match. Fix: relocate them to messages[0] on every API call.
|
|
10
10
|
// (github.com/anthropics/claude-code/issues/34629)
|
|
11
|
-
// (github.com/anthropics/claude-code/issues/43657)
|
|
12
|
-
// (github.com/anthropics/claude-code/issues/44045)
|
|
13
11
|
//
|
|
14
12
|
// Bug 2: Fingerprint instability
|
|
15
13
|
// The cc_version fingerprint in the attribution header is computed from
|
|
16
14
|
// messages[0] content INCLUDING meta/attachment blocks. When those blocks
|
|
17
|
-
// change between turns, the fingerprint changes
|
|
18
|
-
//
|
|
15
|
+
// change between turns, the fingerprint changes, busting cache within the
|
|
16
|
+
// same session. Fix: stabilize the fingerprint from the real user message.
|
|
19
17
|
// (github.com/anthropics/claude-code/issues/40524)
|
|
20
18
|
//
|
|
21
|
-
// Bug 3:
|
|
22
|
-
//
|
|
23
|
-
//
|
|
19
|
+
// Bug 3: Image carry-forward in conversation history
|
|
20
|
+
// Images read via the Read tool persist as base64 in conversation history
|
|
21
|
+
// and are sent on every subsequent API call. A single 500KB image costs
|
|
22
|
+
// ~62,500 tokens per turn in carry-forward. Fix: strip base64 image blocks
|
|
23
|
+
// from tool_result content older than N user turns.
|
|
24
|
+
// Set CACHE_FIX_IMAGE_KEEP_LAST=N to enable (default: 0 = disabled).
|
|
25
|
+
// (github.com/anthropics/claude-code/issues/40524)
|
|
26
|
+
//
|
|
27
|
+
// Monitoring:
|
|
28
|
+
// - GrowthBook flag dump on first API call (CACHE_FIX_DEBUG=1)
|
|
29
|
+
// - Microcompact / budget enforcement detection (logs cleared tool results)
|
|
30
|
+
// - False rate limiter detection (model: "<synthetic>")
|
|
31
|
+
// - Quota utilization tracking (writes ~/.claude/quota-status.json)
|
|
32
|
+
// - Prefix snapshot diffing across process restarts (CACHE_FIX_PREFIXDIFF=1)
|
|
24
33
|
//
|
|
25
|
-
// Based on community
|
|
26
|
-
//
|
|
34
|
+
// Based on community fix by @VictorSun92 / @jmarianski (issue #34629),
|
|
35
|
+
// enhanced with fingerprint stabilization, image stripping, and monitoring.
|
|
36
|
+
// Bug research informed by @ArkNill's claude-code-hidden-problem-analysis.
|
|
27
37
|
//
|
|
28
|
-
//
|
|
38
|
+
// Load via: NODE_OPTIONS="--import $HOME/.claude/cache-fix-preload.mjs"
|
|
29
39
|
|
|
30
40
|
import { createHash } from "node:crypto";
|
|
31
|
-
import { appendFileSync } from "node:fs";
|
|
32
|
-
import { homedir } from "node:os";
|
|
33
|
-
import { join } from "node:path";
|
|
34
|
-
|
|
35
|
-
// ---------------------------------------------------------------------------
|
|
36
|
-
// Debug logging (writes to ~/.claude/cache-fix-debug.log)
|
|
37
|
-
// Set CACHE_FIX_DEBUG=1 to enable
|
|
38
|
-
// ---------------------------------------------------------------------------
|
|
39
|
-
|
|
40
|
-
const DEBUG = process.env.CACHE_FIX_DEBUG === "1";
|
|
41
|
-
const LOG_PATH = join(homedir(), ".claude", "cache-fix-debug.log");
|
|
42
|
-
|
|
43
|
-
function debugLog(...args) {
|
|
44
|
-
if (!DEBUG) return;
|
|
45
|
-
const line = `[${new Date().toISOString()}] ${args.join(" ")}\n`;
|
|
46
|
-
try {
|
|
47
|
-
appendFileSync(LOG_PATH, line);
|
|
48
|
-
} catch {}
|
|
49
|
-
}
|
|
50
41
|
|
|
51
|
-
//
|
|
42
|
+
// --------------------------------------------------------------------------
|
|
52
43
|
// Fingerprint stabilization (Bug 2)
|
|
53
|
-
//
|
|
44
|
+
// --------------------------------------------------------------------------
|
|
54
45
|
|
|
55
|
-
// Must match
|
|
46
|
+
// Must match src/utils/fingerprint.ts exactly.
|
|
56
47
|
const FINGERPRINT_SALT = "59cf53e54c78";
|
|
57
48
|
const FINGERPRINT_INDICES = [4, 7, 20];
|
|
58
49
|
|
|
@@ -77,20 +68,14 @@ function extractRealUserMessageText(messages) {
|
|
|
77
68
|
if (msg.role !== "user") continue;
|
|
78
69
|
const content = msg.content;
|
|
79
70
|
if (!Array.isArray(content)) {
|
|
80
|
-
if (
|
|
81
|
-
typeof content === "string" &&
|
|
82
|
-
!content.startsWith("<system-reminder>")
|
|
83
|
-
) {
|
|
71
|
+
if (typeof content === "string" && !content.startsWith("<system-reminder>")) {
|
|
84
72
|
return content;
|
|
85
73
|
}
|
|
86
74
|
continue;
|
|
87
75
|
}
|
|
76
|
+
// Find first text block that isn't a system-reminder
|
|
88
77
|
for (const block of content) {
|
|
89
|
-
if (
|
|
90
|
-
block.type === "text" &&
|
|
91
|
-
typeof block.text === "string" &&
|
|
92
|
-
!block.text.startsWith("<system-reminder>")
|
|
93
|
-
) {
|
|
78
|
+
if (block.type === "text" && typeof block.text === "string" && !block.text.startsWith("<system-reminder>")) {
|
|
94
79
|
return block.text;
|
|
95
80
|
}
|
|
96
81
|
}
|
|
@@ -100,17 +85,14 @@ function extractRealUserMessageText(messages) {
|
|
|
100
85
|
|
|
101
86
|
/**
|
|
102
87
|
* Extract current cc_version from system prompt blocks and recompute with
|
|
103
|
-
* stable fingerprint. Returns {
|
|
104
|
-
* or null if no fix needed.
|
|
88
|
+
* stable fingerprint. Returns { attrIdx, newText, oldFingerprint, stableFingerprint }, or null when no fix applies.
|
|
105
89
|
*/
|
|
106
90
|
function stabilizeFingerprint(system, messages) {
|
|
107
91
|
if (!Array.isArray(system)) return null;
|
|
108
92
|
|
|
93
|
+
// Find the attribution header block
|
|
109
94
|
const attrIdx = system.findIndex(
|
|
110
|
-
(b) =>
|
|
111
|
-
b.type === "text" &&
|
|
112
|
-
typeof b.text === "string" &&
|
|
113
|
-
b.text.includes("x-anthropic-billing-header:")
|
|
95
|
+
(b) => b.type === "text" && typeof b.text === "string" && b.text.includes("x-anthropic-billing-header:")
|
|
114
96
|
);
|
|
115
97
|
if (attrIdx === -1) return null;
|
|
116
98
|
|
|
@@ -118,13 +100,14 @@ function stabilizeFingerprint(system, messages) {
|
|
|
118
100
|
const versionMatch = attrBlock.text.match(/cc_version=([^;]+)/);
|
|
119
101
|
if (!versionMatch) return null;
|
|
120
102
|
|
|
121
|
-
const fullVersion = versionMatch[1]; // e.g. "2.1.
|
|
103
|
+
const fullVersion = versionMatch[1]; // e.g. "2.1.87.a3f"
|
|
122
104
|
const dotParts = fullVersion.split(".");
|
|
123
105
|
if (dotParts.length < 4) return null;
|
|
124
106
|
|
|
125
|
-
const baseVersion = dotParts.slice(0, 3).join("."); // "2.1.
|
|
107
|
+
const baseVersion = dotParts.slice(0, 3).join("."); // "2.1.87"
|
|
126
108
|
const oldFingerprint = dotParts[3]; // "a3f"
|
|
127
109
|
|
|
110
|
+
// Compute stable fingerprint from real user text
|
|
128
111
|
const realText = extractRealUserMessageText(messages);
|
|
129
112
|
const stableFingerprint = computeFingerprint(realText, baseVersion);
|
|
130
113
|
|
|
@@ -139,38 +122,28 @@ function stabilizeFingerprint(system, messages) {
|
|
|
139
122
|
return { attrIdx, newText, oldFingerprint, stableFingerprint };
|
|
140
123
|
}
|
|
141
124
|
|
|
142
|
-
//
|
|
125
|
+
// --------------------------------------------------------------------------
|
|
143
126
|
// Resume message relocation (Bug 1)
|
|
144
|
-
//
|
|
127
|
+
// --------------------------------------------------------------------------
|
|
145
128
|
|
|
146
129
|
function isSystemReminder(text) {
|
|
147
130
|
return typeof text === "string" && text.startsWith("<system-reminder>");
|
|
148
131
|
}
|
|
149
|
-
|
|
132
|
+
// FIX: Match block headers with startsWith to avoid false positives from
|
|
133
|
+
// quoted content (e.g. "Note:" file-change reminders embedding debug logs).
|
|
150
134
|
const SR = "<system-reminder>\n";
|
|
151
|
-
|
|
152
135
|
function isHooksBlock(text) {
|
|
153
|
-
|
|
154
|
-
|
|
155
|
-
);
|
|
136
|
+
// Hooks block header varies; fall back to head-region check
|
|
137
|
+
return isSystemReminder(text) && text.substring(0, 200).includes("hook success");
|
|
156
138
|
}
|
|
157
139
|
function isSkillsBlock(text) {
|
|
158
|
-
return (
|
|
159
|
-
typeof text === "string" &&
|
|
160
|
-
text.startsWith(SR + "The following skills are available")
|
|
161
|
-
);
|
|
140
|
+
return typeof text === "string" && text.startsWith(SR + "The following skills are available");
|
|
162
141
|
}
|
|
163
142
|
function isDeferredToolsBlock(text) {
|
|
164
|
-
return (
|
|
165
|
-
typeof text === "string" &&
|
|
166
|
-
text.startsWith(SR + "The following deferred tools are now available")
|
|
167
|
-
);
|
|
143
|
+
return typeof text === "string" && text.startsWith(SR + "The following deferred tools are now available");
|
|
168
144
|
}
|
|
169
145
|
function isMcpBlock(text) {
|
|
170
|
-
return (
|
|
171
|
-
typeof text === "string" &&
|
|
172
|
-
text.startsWith(SR + "# MCP Server Instructions")
|
|
173
|
-
);
|
|
146
|
+
return typeof text === "string" && text.startsWith(SR + "# MCP Server Instructions");
|
|
174
147
|
}
|
|
175
148
|
function isRelocatableBlock(text) {
|
|
176
149
|
return (
|
|
@@ -208,18 +181,21 @@ function stripSessionKnowledge(text) {
|
|
|
208
181
|
}
|
|
209
182
|
|
|
210
183
|
/**
|
|
211
|
-
* Core fix: on EVERY
|
|
184
|
+
* Core fix: on EVERY call, scan the entire message array for the LATEST
|
|
212
185
|
* relocatable blocks (skills, MCP, deferred tools, hooks) and ensure they
|
|
213
186
|
* are in messages[0]. This matches fresh session behavior where attachments
|
|
214
|
-
* are always prepended to messages[0].
|
|
187
|
+
* are always prepended to messages[0] on every API call.
|
|
215
188
|
*
|
|
216
|
-
* The
|
|
217
|
-
*
|
|
218
|
-
*
|
|
189
|
+
* The original community fix only checked the last user message, which
|
|
190
|
+
* broke on subsequent turns because:
|
|
191
|
+
* - Call 1: skills in last msg → relocated to messages[0] (3 blocks)
|
|
192
|
+
* - Call 2: in-memory state unchanged, skills now in a middle msg,
|
|
193
|
+
* last msg has no relocatable blocks → messages[0] back to 2 blocks
|
|
194
|
+
* - Prefix changed → cache bust
|
|
219
195
|
*
|
|
220
196
|
* This version scans backwards to find the latest instance of each
|
|
221
197
|
* relocatable block type, removes them from wherever they are, and
|
|
222
|
-
* prepends them to messages[0]
|
|
198
|
+
* prepends them to messages[0]. Idempotent across calls.
|
|
223
199
|
*/
|
|
224
200
|
function normalizeResumeMessages(messages) {
|
|
225
201
|
if (!Array.isArray(messages) || messages.length < 2) return messages;
|
|
@@ -236,13 +212,11 @@ function normalizeResumeMessages(messages) {
|
|
|
236
212
|
const firstMsg = messages[firstUserIdx];
|
|
237
213
|
if (!Array.isArray(firstMsg?.content)) return messages;
|
|
238
214
|
|
|
239
|
-
// Check if ANY relocatable blocks are scattered outside first user msg.
|
|
215
|
+
// FIX: Check if ANY relocatable blocks are scattered outside first user msg.
|
|
216
|
+
// The old check (firstAlreadyHas → skip) missed partial scatter where some
|
|
217
|
+
// blocks stay in messages[0] but others drift to later messages (v2.1.89+).
|
|
240
218
|
let hasScatteredBlocks = false;
|
|
241
|
-
for (
|
|
242
|
-
let i = firstUserIdx + 1;
|
|
243
|
-
i < messages.length && !hasScatteredBlocks;
|
|
244
|
-
i++
|
|
245
|
-
) {
|
|
219
|
+
for (let i = firstUserIdx + 1; i < messages.length && !hasScatteredBlocks; i++) {
|
|
246
220
|
const msg = messages[i];
|
|
247
221
|
if (msg.role !== "user" || !Array.isArray(msg.content)) continue;
|
|
248
222
|
for (const block of msg.content) {
|
|
@@ -254,8 +228,8 @@ function normalizeResumeMessages(messages) {
|
|
|
254
228
|
}
|
|
255
229
|
if (!hasScatteredBlocks) return messages;
|
|
256
230
|
|
|
257
|
-
// Scan ALL user messages in reverse to collect the LATEST
|
|
258
|
-
// block type. This handles both full and partial scatter.
|
|
231
|
+
// Scan ALL user messages (including first) in reverse to collect the LATEST
|
|
232
|
+
// version of each block type. This handles both full and partial scatter.
|
|
259
233
|
const found = new Map();
|
|
260
234
|
|
|
261
235
|
for (let i = messages.length - 1; i >= firstUserIdx; i--) {
|
|
@@ -267,6 +241,7 @@ function normalizeResumeMessages(messages) {
|
|
|
267
241
|
const text = block.text || "";
|
|
268
242
|
if (!isRelocatableBlock(text)) continue;
|
|
269
243
|
|
|
244
|
+
// Determine block type for dedup
|
|
270
245
|
let blockType;
|
|
271
246
|
if (isSkillsBlock(text)) blockType = "skills";
|
|
272
247
|
else if (isMcpBlock(text)) blockType = "mcp";
|
|
@@ -274,6 +249,7 @@ function normalizeResumeMessages(messages) {
|
|
|
274
249
|
else if (isHooksBlock(text)) blockType = "hooks";
|
|
275
250
|
else continue;
|
|
276
251
|
|
|
252
|
+
// Keep only the LATEST (first found scanning backwards)
|
|
277
253
|
if (!found.has(blockType)) {
|
|
278
254
|
let fixedText = text;
|
|
279
255
|
if (blockType === "hooks") fixedText = stripSessionKnowledge(text);
|
|
@@ -287,17 +263,15 @@ function normalizeResumeMessages(messages) {
|
|
|
287
263
|
|
|
288
264
|
if (found.size === 0) return messages;
|
|
289
265
|
|
|
290
|
-
// Remove ALL relocatable blocks from ALL user messages
|
|
266
|
+
// Remove ALL relocatable blocks from ALL user messages (both first and later)
|
|
291
267
|
const result = messages.map((msg) => {
|
|
292
268
|
if (msg.role !== "user" || !Array.isArray(msg.content)) return msg;
|
|
293
|
-
const filtered = msg.content.filter(
|
|
294
|
-
(b) => !isRelocatableBlock(b.text || "")
|
|
295
|
-
);
|
|
269
|
+
const filtered = msg.content.filter((b) => !isRelocatableBlock(b.text || ""));
|
|
296
270
|
if (filtered.length === msg.content.length) return msg;
|
|
297
271
|
return { ...msg, content: filtered };
|
|
298
272
|
});
|
|
299
273
|
|
|
300
|
-
// Order must match fresh session layout: deferred
|
|
274
|
+
// FIX: Order must match fresh session layout: deferred → mcp → skills → hooks
|
|
301
275
|
const ORDER = ["deferred", "mcp", "skills", "hooks"];
|
|
302
276
|
const toRelocate = ORDER.filter((t) => found.has(t)).map((t) => found.get(t));
|
|
303
277
|
|
|
@@ -309,12 +283,245 @@ function normalizeResumeMessages(messages) {
|
|
|
309
283
|
return result;
|
|
310
284
|
}
|
|
311
285
|
|
|
312
|
-
//
|
|
313
|
-
//
|
|
314
|
-
//
|
|
286
|
+
// --------------------------------------------------------------------------
|
|
287
|
+
// Image stripping from old tool results (cost optimization)
|
|
288
|
+
// --------------------------------------------------------------------------
|
|
289
|
+
|
|
290
|
+
// CACHE_FIX_IMAGE_KEEP_LAST=N — keep images only in the last N user messages.
|
|
291
|
+
// Unset or 0 = disabled (all images preserved, backward compatible).
|
|
292
|
+
// Images in tool_result blocks older than N user messages from the end are
|
|
293
|
+
// replaced with a text placeholder. User-pasted images (direct image blocks
|
|
294
|
+
// in user messages, not inside tool_result) are left alone.
|
|
295
|
+
const IMAGE_KEEP_LAST = parseInt(process.env.CACHE_FIX_IMAGE_KEEP_LAST || "0", 10);
|
|
296
|
+
|
|
297
|
+
/**
|
|
298
|
+
* Strip base64 image blocks from tool_result content in older messages.
|
|
299
|
+
* Returns { messages, stats } where stats has stripping metrics.
|
|
300
|
+
*/
|
|
301
|
+
function stripOldToolResultImages(messages, keepLast) {
|
|
302
|
+
if (!keepLast || keepLast <= 0 || !Array.isArray(messages)) {
|
|
303
|
+
return { messages, stats: null };
|
|
304
|
+
}
|
|
305
|
+
|
|
306
|
+
// Find user message indices (turns) so we can count from the end
|
|
307
|
+
const userMsgIndices = [];
|
|
308
|
+
for (let i = 0; i < messages.length; i++) {
|
|
309
|
+
if (messages[i].role === "user") userMsgIndices.push(i);
|
|
310
|
+
}
|
|
311
|
+
|
|
312
|
+
if (userMsgIndices.length <= keepLast) {
|
|
313
|
+
return { messages, stats: null }; // not enough turns to strip anything
|
|
314
|
+
}
|
|
315
|
+
|
|
316
|
+
// Messages at or after this index are "recent" — keep their images
|
|
317
|
+
const cutoffIdx = userMsgIndices[userMsgIndices.length - keepLast];
|
|
318
|
+
|
|
319
|
+
let strippedCount = 0;
|
|
320
|
+
let strippedBytes = 0;
|
|
321
|
+
|
|
322
|
+
const result = messages.map((msg, msgIdx) => {
|
|
323
|
+
// Only process user messages before the cutoff (tool_result is in user msgs)
|
|
324
|
+
if (msg.role !== "user" || msgIdx >= cutoffIdx || !Array.isArray(msg.content)) {
|
|
325
|
+
return msg;
|
|
326
|
+
}
|
|
327
|
+
|
|
328
|
+
let msgModified = false;
|
|
329
|
+
const newContent = msg.content.map((block) => {
|
|
330
|
+
// Only strip images inside tool_result blocks, not user-pasted images
|
|
331
|
+
if (block.type === "tool_result" && Array.isArray(block.content)) {
|
|
332
|
+
let toolModified = false;
|
|
333
|
+
const newToolContent = block.content.map((item) => {
|
|
334
|
+
if (item.type === "image") {
|
|
335
|
+
strippedCount++;
|
|
336
|
+
if (item.source?.data) {
|
|
337
|
+
strippedBytes += item.source.data.length;
|
|
338
|
+
}
|
|
339
|
+
toolModified = true;
|
|
340
|
+
return {
|
|
341
|
+
type: "text",
|
|
342
|
+
text: "[image stripped from history — file may still be on disk]",
|
|
343
|
+
};
|
|
344
|
+
}
|
|
345
|
+
return item;
|
|
346
|
+
});
|
|
347
|
+
if (toolModified) {
|
|
348
|
+
msgModified = true;
|
|
349
|
+
return { ...block, content: newToolContent };
|
|
350
|
+
}
|
|
351
|
+
}
|
|
352
|
+
return block;
|
|
353
|
+
});
|
|
354
|
+
|
|
355
|
+
if (msgModified) {
|
|
356
|
+
return { ...msg, content: newContent };
|
|
357
|
+
}
|
|
358
|
+
return msg;
|
|
359
|
+
});
|
|
360
|
+
|
|
361
|
+
const stats = strippedCount > 0
|
|
362
|
+
? { strippedCount, strippedBytes, estimatedTokens: Math.ceil(strippedBytes * 0.125) }
|
|
363
|
+
: null;
|
|
364
|
+
|
|
365
|
+
return { messages: strippedCount > 0 ? result : messages, stats };
|
|
366
|
+
}
|
|
367
|
+
|
|
368
|
+
// --------------------------------------------------------------------------
|
|
369
|
+
// Prefix lock — replay saved messages[0] on resume for cache hit
|
|
370
|
+
// --------------------------------------------------------------------------
|
|
371
|
+
|
|
372
|
+
// CACHE_FIX_PREFIX_LOCK=1 — save messages[0] on every call and replay it on
|
|
373
|
+
// resume to avoid a cache rebuild. Disabled by default.
|
|
374
|
+
//
|
|
375
|
+
// On resume, CC reassembles messages with blocks in different positions and
|
|
376
|
+
// injects fresh system-reminders, changing the prefix bytes. Even after our
|
|
377
|
+
// relocation fix corrects the blocks, the prefix differs from what the server
|
|
378
|
+
// cached on the last pre-exit call, causing a full cache rebuild.
|
|
379
|
+
//
|
|
380
|
+
// This feature saves the exact messages[0] content after all fixes are applied.
|
|
381
|
+
// On the first call of a new process (resume), if system prompt hash and tools
|
|
382
|
+
// hash match the saved snapshot, and the real user message text matches, we
|
|
383
|
+
// replay the saved messages[0] to produce a byte-identical prefix → cache hit.
|
|
384
|
+
|
|
385
|
+
const PREFIX_LOCK = process.env.CACHE_FIX_PREFIX_LOCK === "1";
|
|
386
|
+
const PREFIX_LOCK_FILE = join(homedir(), ".claude", "cache-fix-prefix-lock.json");
|
|
387
|
+
|
|
388
|
+
let _prefixLockFirstCall = true;
|
|
389
|
+
|
|
390
|
+
/**
|
|
391
|
+
* Compute hashes for prefix lock comparison.
|
|
392
|
+
*/
|
|
393
|
+
function computePrefixHashes(system, tools) {
|
|
394
|
+
const sysHash = system
|
|
395
|
+
? createHash("sha256").update(JSON.stringify(system)).digest("hex").slice(0, 16)
|
|
396
|
+
: "none";
|
|
397
|
+
const toolHash = tools
|
|
398
|
+
? createHash("sha256").update(JSON.stringify(tools.map(t => t.name).sort())).digest("hex").slice(0, 16)
|
|
399
|
+
: "none";
|
|
400
|
+
return { sysHash, toolHash };
|
|
401
|
+
}
|
|
402
|
+
|
|
403
|
+
/**
|
|
404
|
+
* Extract the real user message text from messages[0] (skipping system-reminders).
|
|
405
|
+
*/
|
|
406
|
+
function extractUserTextFromFirstMsg(msg) {
|
|
407
|
+
if (!msg || !Array.isArray(msg.content)) return "";
|
|
408
|
+
for (const block of msg.content) {
|
|
409
|
+
if (block.type === "text" && typeof block.text === "string" &&
|
|
410
|
+
!block.text.startsWith("<system-reminder>") &&
|
|
411
|
+
!block.text.startsWith("<local-command")) {
|
|
412
|
+
return block.text.slice(0, 200); // enough to identify, not too much to compare
|
|
413
|
+
}
|
|
414
|
+
}
|
|
415
|
+
return "";
|
|
416
|
+
}
|
|
417
|
+
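As a usage sketch, the function above can be exercised against a hand-built first message. The function body is copied so the example runs on its own; the message contents are made up for illustration.

```javascript
// Copied from the function above so the example is self-contained.
function extractUserTextFromFirstMsg(msg) {
  if (!msg || !Array.isArray(msg.content)) return "";
  for (const block of msg.content) {
    if (block.type === "text" && typeof block.text === "string" &&
        !block.text.startsWith("<system-reminder>") &&
        !block.text.startsWith("<local-command")) {
      return block.text.slice(0, 200);
    }
  }
  return "";
}

// Hypothetical first user message: one injected reminder, then the real prompt.
const firstMsg = {
  role: "user",
  content: [
    { type: "text", text: "<system-reminder>injected context</system-reminder>" },
    { type: "text", text: "Refactor the parser module" },
  ],
};

console.log(extractUserTextFromFirstMsg(firstMsg)); // Refactor the parser module

// String-form content is not an array, so the result is the empty string.
console.log(extractUserTextFromFirstMsg({ role: "user", content: "plain string" }) === ""); // true
```

Note that any message whose `content` is a plain string yields `""`, so two such messages are indistinguishable to this check.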
|
|
418
|
+
/**
|
|
419
|
+
* Hash all non-system-reminder user content in messages[0] to detect
|
|
420
|
+
* substantive changes that the userText check (first 200 chars) might miss.
|
|
421
|
+
*/
|
|
422
|
+
function hashUserContent(msg) {
|
|
423
|
+
if (!msg || !Array.isArray(msg.content)) return "empty";
|
|
424
|
+
const userBlocks = msg.content.filter(b =>
|
|
425
|
+
b.type === "text" && typeof b.text === "string" &&
|
|
426
|
+
!b.text.startsWith("<system-reminder>") &&
|
|
427
|
+
!b.text.startsWith("<local-command")
|
|
428
|
+
);
|
|
429
|
+
if (userBlocks.length === 0) return "empty";
|
|
430
|
+
return createHash("sha256")
|
|
431
|
+
.update(userBlocks.map(b => b.text).join("\n"))
|
|
432
|
+
.digest("hex").slice(0, 16);
|
|
433
|
+
}
|
|
434
|
+
|
|
435
|
+
/**
|
|
436
|
+
* On resume: try to replay saved messages[0] for cache hit.
|
|
437
|
+
* Returns the locked messages array or the original if lock doesn't apply.
|
|
438
|
+
*/
|
|
439
|
+
function applyPrefixLock(messages, system, tools) {
|
|
440
|
+
if (!PREFIX_LOCK || !Array.isArray(messages) || messages.length < 2) return messages;
|
|
441
|
+
|
|
442
|
+
const firstUserIdx = messages.findIndex(m => m.role === "user");
|
|
443
|
+
if (firstUserIdx === -1) return messages;
|
|
444
|
+
|
|
445
|
+
const { sysHash, toolHash } = computePrefixHashes(system, tools);
|
|
446
|
+
const currentUserText = extractUserTextFromFirstMsg(messages[firstUserIdx]);
|
|
447
|
+
const currentContentHash = hashUserContent(messages[firstUserIdx]);
|
|
448
|
+
|
|
449
|
+
// Skip if this looks like a compacted conversation (system-reminder as first block
|
|
450
|
+
// with compaction summary markers)
|
|
451
|
+
const firstBlock = messages[firstUserIdx]?.content?.[0];
|
|
452
|
+
if (firstBlock?.text?.includes("CompactBoundary") || firstBlock?.text?.includes("compacted")) {
|
|
453
|
+
debugLog("PREFIX LOCK: skipped — compacted conversation detected");
|
|
454
|
+
return messages;
|
|
455
|
+
}
|
|
456
|
+
|
|
457
|
+
if (_prefixLockFirstCall) {
|
|
458
|
+
_prefixLockFirstCall = false;
|
|
459
|
+
|
|
460
|
+
// Try to load and apply saved prefix
|
|
461
|
+
try {
|
|
462
|
+
const saved = JSON.parse(readFileSync(PREFIX_LOCK_FILE, "utf8"));
|
|
463
|
+
|
|
464
|
+
if (saved.sysHash !== sysHash) {
|
|
465
|
+
debugLog("PREFIX LOCK: skipped — system prompt changed");
|
|
466
|
+
} else if (saved.toolHash !== toolHash) {
|
|
467
|
+
debugLog("PREFIX LOCK: skipped — tools changed");
|
|
468
|
+
} else if (saved.userText !== currentUserText) {
|
|
469
|
+
debugLog("PREFIX LOCK: skipped — user message text changed");
|
|
470
|
+
} else if (saved.contentHash && saved.contentHash !== currentContentHash) {
|
|
471
|
+
debugLog("PREFIX LOCK: skipped — user content hash changed (substantive context change)");
|
|
472
|
+
} else if (!saved.content || !Array.isArray(saved.content)) {
|
|
473
|
+
debugLog("PREFIX LOCK: skipped — saved content invalid");
|
|
474
|
+
} else {
|
|
475
|
+
// Apply the saved messages[0] content
|
|
476
|
+
const result = [...messages];
|
|
477
|
+
result[firstUserIdx] = { ...result[firstUserIdx], content: saved.content };
|
|
478
|
+
debugLog(`PREFIX LOCK: APPLIED — replayed saved messages[0] (${saved.content.length} blocks)`);
|
|
479
|
+
return result;
|
|
480
|
+
}
|
|
481
|
+
} catch {
|
|
482
|
+
debugLog("PREFIX LOCK: no saved prefix found (first run or file missing)");
|
|
483
|
+
}
|
|
484
|
+
}
|
|
485
|
+
|
|
486
|
+
return messages;
|
|
487
|
+
}
|
|
488
|
+
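The chain of checks in `applyPrefixLock` can be read as a single decision function. The sketch below is a hypothetical restatement (the name `prefixLockSkipReason` and the sample hash values are not part of the package); it applies the same comparisons in the same order and returns `null` when the saved prefix would be replayed.

```javascript
// Hypothetical helper mirroring the checks in applyPrefixLock, in order.
// Returns null when replay is allowed, otherwise the reason it is skipped.
function prefixLockSkipReason(saved, current) {
  if (saved.sysHash !== current.sysHash) return "system prompt changed";
  if (saved.toolHash !== current.toolHash) return "tools changed";
  if (saved.userText !== current.userText) return "user message text changed";
  // A snapshot without contentHash skips this check, as in the original.
  if (saved.contentHash && saved.contentHash !== current.contentHash) {
    return "user content hash changed";
  }
  if (!saved.content || !Array.isArray(saved.content)) return "saved content invalid";
  return null;
}

// Made-up snapshot and request fingerprints.
const saved = {
  sysHash: "aaaa", toolHash: "bbbb", userText: "Refactor the parser module",
  contentHash: "cccc", content: [{ type: "text", text: "Refactor the parser module" }],
};
const sameSession = {
  sysHash: "aaaa", toolHash: "bbbb", userText: "Refactor the parser module", contentHash: "cccc",
};
const newTools = { ...sameSession, toolHash: "dddd" };

console.log(prefixLockSkipReason(saved, sameSession)); // null
console.log(prefixLockSkipReason(saved, newTools));    // tools changed
```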
|
|
489
|
+
/**
|
|
490
|
+
* Save current messages[0] content for future resume replay.
|
|
491
|
+
* Called after all fixes are applied, before the request is sent.
|
|
492
|
+
*/
|
|
493
|
+
function savePrefixLock(messages, system, tools) {
|
|
494
|
+
if (!PREFIX_LOCK || !Array.isArray(messages)) return;
|
|
495
|
+
|
|
496
|
+
const firstUserIdx = messages.findIndex(m => m.role === "user");
|
|
497
|
+
if (firstUserIdx === -1) return;
|
|
498
|
+
|
|
499
|
+
const { sysHash, toolHash } = computePrefixHashes(system, tools);
|
|
500
|
+
const userText = extractUserTextFromFirstMsg(messages[firstUserIdx]);
|
|
501
|
+
const contentHash = hashUserContent(messages[firstUserIdx]);
|
|
502
|
+
const content = messages[firstUserIdx].content;
|
|
503
|
+
|
|
504
|
+
try {
|
|
505
|
+
writeFileSync(PREFIX_LOCK_FILE, JSON.stringify({
|
|
506
|
+
timestamp: new Date().toISOString(),
|
|
507
|
+
sysHash,
|
|
508
|
+
toolHash,
|
|
509
|
+
userText,
|
|
510
|
+
contentHash,
|
|
511
|
+
content,
|
|
512
|
+
}));
|
|
513
|
+
} catch (e) {
|
|
514
|
+
debugLog("PREFIX LOCK: failed to save:", e?.message);
|
|
515
|
+
}
|
|
516
|
+
}
|
|
517
|
+
|
|
518
|
+
// --------------------------------------------------------------------------
|
|
519
|
+
// Tool schema stabilization (Bug 2 secondary cause)
|
|
520
|
+
// --------------------------------------------------------------------------
|
|
315
521
|
|
|
316
522
|
/**
|
|
317
|
-
* Sort tool definitions by name for deterministic ordering.
|
|
523
|
+
* Sort tool definitions by name for deterministic ordering. Tool schema bytes
|
|
524
|
+
* changing mid-session was acknowledged as a bug in the v2.1.88 changelog.
|
|
318
525
|
*/
|
|
319
526
|
function stabilizeToolOrder(tools) {
|
|
320
527
|
if (!Array.isArray(tools) || tools.length === 0) return tools;
|
|
@@ -325,9 +532,228 @@ function stabilizeToolOrder(tools) {
|
|
|
325
532
|
});
|
|
326
533
|
}
|
|
327
534
|
|
|
328
|
-
//
|
|
535
|
+
// --------------------------------------------------------------------------
|
|
536
|
+
// Fetch interceptor
|
|
537
|
+
// --------------------------------------------------------------------------
|
|
538
|
+
|
|
539
|
+
// --------------------------------------------------------------------------
|
|
540
|
+
// Debug logging (writes to ~/.claude/cache-fix-debug.log)
|
|
541
|
+
// Set CACHE_FIX_DEBUG=1 to enable
|
|
542
|
+
// --------------------------------------------------------------------------
|
|
543
|
+
|
|
544
|
+
import { appendFileSync, readFileSync, writeFileSync, mkdirSync } from "node:fs";
|
|
545
|
+
import { homedir } from "node:os";
|
|
546
|
+
import { join } from "node:path";
|
|
547
|
+
|
|
548
|
+
const DEBUG = process.env.CACHE_FIX_DEBUG === "1";
|
|
549
|
+
const PREFIXDIFF = process.env.CACHE_FIX_PREFIXDIFF === "1";
|
|
550
|
+
const LOG_PATH = join(homedir(), ".claude", "cache-fix-debug.log");
|
|
551
|
+
const SNAPSHOT_DIR = join(homedir(), ".claude", "cache-fix-snapshots");
|
|
552
|
+
|
|
553
|
+
function debugLog(...args) {
|
|
554
|
+
if (!DEBUG) return;
|
|
555
|
+
const line = `[${new Date().toISOString()}] ${args.join(" ")}\n`;
|
|
556
|
+
try { appendFileSync(LOG_PATH, line); } catch {}
|
|
557
|
+
}
|
|
558
|
+
|
|
559
|
+
// --------------------------------------------------------------------------
|
|
560
|
+
// Prefix snapshot — captures message prefix for cross-process diff.
|
|
561
|
+
// Set CACHE_FIX_PREFIXDIFF=1 to enable.
|
|
562
|
+
//
|
|
563
|
+
// On each API call: saves JSON of first 5 messages + system + tools hash
|
|
564
|
+
// to ~/.claude/cache-fix-snapshots/<session-hash>-last.json
|
|
565
|
+
//
|
|
566
|
+
// On first call after startup: compares against saved snapshot and writes
|
|
567
|
+
// a diff report to ~/.claude/cache-fix-snapshots/<session-hash>-diff.json
|
|
568
|
+
// --------------------------------------------------------------------------
|
|
569
|
+
|
|
570
|
+
let _prefixDiffFirstCall = true;
|
|
571
|
+
|
|
572
|
+
// --------------------------------------------------------------------------
|
|
573
|
+
// GrowthBook flag dump (runs once on first API call)
|
|
574
|
+
// --------------------------------------------------------------------------
|
|
575
|
+
|
|
576
|
+
let _growthBookDumped = false;
|
|
577
|
+
|
|
578
|
+
function dumpGrowthBookFlags() {
|
|
579
|
+
if (_growthBookDumped || !DEBUG) return;
|
|
580
|
+
_growthBookDumped = true;
|
|
581
|
+
try {
|
|
582
|
+
const claudeJson = JSON.parse(readFileSync(join(homedir(), ".claude.json"), "utf8"));
|
|
583
|
+
const features = claudeJson.cachedGrowthBookFeatures;
|
|
584
|
+
if (!features) { debugLog("GROWTHBOOK: no cachedGrowthBookFeatures found"); return; }
|
|
585
|
+
|
|
586
|
+
// Log the flags that matter for cost/cache/context behavior
|
|
587
|
+
const interesting = {
|
|
588
|
+
hawthorn_window: features.tengu_hawthorn_window,
|
|
589
|
+
pewter_kestrel: features.tengu_pewter_kestrel,
|
|
590
|
+
summarize_tool_results: features.tengu_summarize_tool_results,
|
|
591
|
+
slate_heron: features.tengu_slate_heron,
|
|
592
|
+
session_memory: features.tengu_session_memory,
|
|
593
|
+
sm_compact: features.tengu_sm_compact,
|
|
594
|
+
sm_compact_config: features.tengu_sm_compact_config,
|
|
595
|
+
sm_config: features.tengu_sm_config,
|
|
596
|
+
cache_plum_violet: features.tengu_cache_plum_violet,
|
|
597
|
+
prompt_cache_1h_config: features.tengu_prompt_cache_1h_config,
|
|
598
|
+
crystal_beam: features.tengu_crystal_beam,
|
|
599
|
+
cold_compact: features.tengu_cold_compact,
|
|
600
|
+
system_prompt_global_cache: features.tengu_system_prompt_global_cache,
|
|
601
|
+
compact_cache_prefix: features.tengu_compact_cache_prefix,
|
|
602
|
+
};
|
|
603
|
+
debugLog("GROWTHBOOK FLAGS:", JSON.stringify(interesting, null, 2));
|
|
604
|
+
} catch (e) {
|
|
605
|
+
debugLog("GROWTHBOOK: failed to read ~/.claude.json:", e?.message);
|
|
606
|
+
}
|
|
607
|
+
}
|
|
608
|
+
|
|
609
|
+
// --------------------------------------------------------------------------
|
|
610
|
+
// Microcompact / budget monitoring
|
|
611
|
+
// --------------------------------------------------------------------------
|
|
612
|
+
|
|
613
|
+
/**
|
|
614
|
+
* Scan outgoing messages for signs of microcompact clearing and budget
|
|
615
|
+
* enforcement. Counts tool results that have been gutted and reports stats.
|
|
616
|
+
*/
|
|
617
|
+
function monitorContextDegradation(messages) {
|
|
618
|
+
if (!Array.isArray(messages)) return null;
|
|
619
|
+
|
|
620
|
+
let clearedToolResults = 0;
|
|
621
|
+
let totalToolResultChars = 0;
|
|
622
|
+
let totalToolResults = 0;
|
|
623
|
+
|
|
624
|
+
for (const msg of messages) {
|
|
625
|
+
if (msg.role !== "user" || !Array.isArray(msg.content)) continue;
|
|
626
|
+
for (const block of msg.content) {
|
|
627
|
+
if (block.type === "tool_result") {
|
|
628
|
+
totalToolResults++;
|
|
629
|
+
const content = block.content;
|
|
630
|
+
if (typeof content === "string") {
|
|
631
|
+
if (content === "[Old tool result content cleared]") {
|
|
632
|
+
clearedToolResults++;
|
|
633
|
+
} else {
|
|
634
|
+
totalToolResultChars += content.length;
|
|
635
|
+
}
|
|
636
|
+
} else if (Array.isArray(content)) {
|
|
637
|
+
for (const item of content) {
|
|
638
|
+
if (item.type === "text") {
|
|
639
|
+
if (item.text === "[Old tool result content cleared]") {
|
|
640
|
+
clearedToolResults++;
|
|
641
|
+
} else {
|
|
642
|
+
totalToolResultChars += item.text.length;
|
|
643
|
+
}
|
|
644
|
+
}
|
|
645
|
+
}
|
|
646
|
+
}
|
|
647
|
+
}
|
|
648
|
+
}
|
|
649
|
+
}
|
|
650
|
+
|
|
651
|
+
if (totalToolResults === 0) return null;
|
|
652
|
+
|
|
653
|
+
const stats = { totalToolResults, clearedToolResults, totalToolResultChars };
|
|
654
|
+
|
|
655
|
+
if (clearedToolResults > 0) {
|
|
656
|
+
debugLog(`MICROCOMPACT: ${clearedToolResults}/${totalToolResults} tool results cleared`);
|
|
657
|
+
}
|
|
658
|
+
|
|
659
|
+
// Warn when tool result text approaches the 200,000-character budget threshold
|
|
660
|
+
if (totalToolResultChars > 150000) {
|
|
661
|
+
debugLog(`BUDGET WARNING: tool result chars at ${totalToolResultChars.toLocaleString()} / 200,000 threshold`);
|
|
662
|
+
}
|
|
663
|
+
|
|
664
|
+
return stats;
|
|
665
|
+
}
|
|
666
|
+
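A condensed sketch of the counting logic in `monitorContextDegradation`, with logging left out, run against a made-up message list. The helper name `countToolResults` and the sample messages are illustrative assumptions; the cleared-result marker string is the one the function above matches on.

```javascript
const CLEARED = "[Old tool result content cleared]";

// Condensed sketch of the counting logic above (no logging, no thresholds).
function countToolResults(messages) {
  let total = 0, cleared = 0, chars = 0;
  for (const msg of messages) {
    if (msg.role !== "user" || !Array.isArray(msg.content)) continue;
    for (const block of msg.content) {
      if (block.type !== "tool_result") continue;
      total++;
      // Normalize string content to the array-of-text-items form.
      const items = typeof block.content === "string"
        ? [{ type: "text", text: block.content }]
        : Array.isArray(block.content) ? block.content : [];
      for (const item of items) {
        if (item.type !== "text") continue;
        if (item.text === CLEARED) cleared++;
        else chars += item.text.length;
      }
    }
  }
  return { total, cleared, chars };
}

// Made-up conversation: one cleared result, one intact result.
const messages = [
  { role: "user", content: [{ type: "tool_result", tool_use_id: "t1", content: CLEARED }] },
  { role: "assistant", content: [{ type: "text", text: "ok" }] },
  { role: "user", content: [{ type: "tool_result", tool_use_id: "t2", content: [{ type: "text", text: "hello" }] }] },
];

console.log(countToolResults(messages)); // { total: 2, cleared: 1, chars: 5 }
```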
|
|
667
|
+
function snapshotPrefix(payload) {
|
|
668
|
+
if (!PREFIXDIFF) return;
|
|
669
|
+
try {
|
|
670
|
+
mkdirSync(SNAPSHOT_DIR, { recursive: true });
|
|
671
|
+
|
|
672
|
+
// Session key: use system prompt hash — stable across restarts for the same project.
|
|
673
|
+
// Different projects get different snapshots, same project matches across resume.
|
|
674
|
+
const sessionKey = payload.system
|
|
675
|
+
? createHash("sha256").update(JSON.stringify(payload.system).slice(0, 2000)).digest("hex").slice(0, 12)
|
|
676
|
+
: "default";
|
|
677
|
+
|
|
678
|
+
const snapshotFile = join(SNAPSHOT_DIR, `${sessionKey}-last.json`);
|
|
679
|
+
const diffFile = join(SNAPSHOT_DIR, `${sessionKey}-diff.json`);
|
|
680
|
+
|
|
681
|
+
// Build prefix snapshot: first 5 messages, stripped of cache_control
|
|
682
|
+
const prefixMsgs = (payload.messages || []).slice(0, 5).map(msg => {
|
|
683
|
+
const content = Array.isArray(msg.content)
|
|
684
|
+
? msg.content.map(b => {
|
|
685
|
+
const { cache_control, ...rest } = b;
|
|
686
|
+
// Truncate long text blocks for diffing
|
|
687
|
+
if (rest.text && rest.text.length > 500) {
|
|
688
|
+
rest.text = rest.text.slice(0, 500) + `...[${rest.text.length} chars]`;
|
|
689
|
+
}
|
|
690
|
+
return rest;
|
|
691
|
+
})
|
|
692
|
+
: msg.content;
|
|
693
|
+
return { role: msg.role, content };
|
|
694
|
+
});
|
|
695
|
+
|
|
696
|
+
const toolsHash = payload.tools
|
|
697
|
+
? createHash("sha256").update(JSON.stringify(payload.tools.map(t => t.name))).digest("hex").slice(0, 16)
|
|
698
|
+
: "none";
|
|
699
|
+
|
|
700
|
+
const systemHash = payload.system
|
|
701
|
+
? createHash("sha256").update(JSON.stringify(payload.system)).digest("hex").slice(0, 16)
|
|
702
|
+
: "none";
|
|
703
|
+
|
|
704
|
+
const snapshot = {
|
|
705
|
+
timestamp: new Date().toISOString(),
|
|
706
|
+
messageCount: payload.messages?.length || 0,
|
|
707
|
+
toolsHash,
|
|
708
|
+
systemHash,
|
|
709
|
+
prefixMessages: prefixMsgs,
|
|
710
|
+
};
|
|
711
|
+
|
|
712
|
+
// On first call: compare against saved
|
|
713
|
+
if (_prefixDiffFirstCall) {
|
|
714
|
+
_prefixDiffFirstCall = false;
|
|
715
|
+
try {
|
|
716
|
+
const prev = JSON.parse(readFileSync(snapshotFile, "utf8"));
|
|
717
|
+
const diff = {
|
|
718
|
+
timestamp: snapshot.timestamp,
|
|
719
|
+
prevTimestamp: prev.timestamp,
|
|
720
|
+
toolsMatch: prev.toolsHash === snapshot.toolsHash,
|
|
721
|
+
systemMatch: prev.systemHash === snapshot.systemHash,
|
|
722
|
+
messageCountPrev: prev.messageCount,
|
|
723
|
+
messageCountNow: snapshot.messageCount,
|
|
724
|
+
prefixDiffs: [],
|
|
725
|
+
};
|
|
726
|
+
|
|
727
|
+
const maxIdx = Math.max(prev.prefixMessages.length, snapshot.prefixMessages.length);
|
|
728
|
+
for (let i = 0; i < maxIdx; i++) {
|
|
729
|
+
const prevMsg = JSON.stringify(prev.prefixMessages[i] || null);
|
|
730
|
+
const nowMsg = JSON.stringify(snapshot.prefixMessages[i] || null);
|
|
731
|
+
if (prevMsg !== nowMsg) {
|
|
732
|
+
diff.prefixDiffs.push({
|
|
733
|
+
index: i,
|
|
734
|
+
prev: prev.prefixMessages[i] || null,
|
|
735
|
+
now: snapshot.prefixMessages[i] || null,
|
|
736
|
+
});
|
|
737
|
+
}
|
|
738
|
+
}
|
|
739
|
+
|
|
740
|
+
writeFileSync(diffFile, JSON.stringify(diff, null, 2));
|
|
741
|
+
debugLog(`PREFIX DIFF: ${diff.prefixDiffs.length} differences in first 5 messages. tools=${diff.toolsMatch ? "match" : "DIFFER"} system=${diff.systemMatch ? "match" : "DIFFER"}`);
|
|
742
|
+
} catch {
|
|
743
|
+
// No previous snapshot — first run
|
|
744
|
+
}
|
|
745
|
+
}
|
|
746
|
+
|
|
747
|
+
// Save current snapshot
|
|
748
|
+
writeFileSync(snapshotFile, JSON.stringify(snapshot, null, 2));
|
|
749
|
+
} catch (e) {
|
|
750
|
+
debugLog("PREFIX SNAPSHOT ERROR:", e?.message);
|
|
751
|
+
}
|
|
752
|
+
}
|
|
753
|
+
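The per-index comparison inside `snapshotPrefix` can be isolated as follows. This is a sketch under assumptions: `diffPrefixes` is a hypothetical name, and the two message arrays are invented to show an injected reminder changing the first message.

```javascript
// Hypothetical helper isolating the per-index comparison from snapshotPrefix.
function diffPrefixes(prevMsgs, nowMsgs) {
  const diffs = [];
  const maxIdx = Math.max(prevMsgs.length, nowMsgs.length);
  for (let i = 0; i < maxIdx; i++) {
    const prev = prevMsgs[i] || null;
    const now = nowMsgs[i] || null;
    // Serialized comparison: block order and key order both count as differences.
    if (JSON.stringify(prev) !== JSON.stringify(now)) {
      diffs.push({ index: i, prev, now });
    }
  }
  return diffs;
}

// Made-up prefixes: the resumed session has an extra block in message 0.
const before = [
  { role: "user", content: [{ type: "text", text: "hi" }] },
  { role: "assistant", content: [{ type: "text", text: "hello" }] },
];
const after = [
  { role: "user", content: [
    { type: "text", text: "<system-reminder>new</system-reminder>" },
    { type: "text", text: "hi" },
  ] },
  { role: "assistant", content: [{ type: "text", text: "hello" }] },
];

console.log(diffPrefixes(before, after).map(d => d.index)); // [ 0 ]
```

Because the comparison is on serialized JSON, a reordering of otherwise identical blocks is reported as a difference, which matches the byte-level sensitivity of the cache prefix.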
|
|
754
|
+
// --------------------------------------------------------------------------
|
|
329
755
|
// Fetch interceptor
|
|
330
|
-
//
|
|
756
|
+
// --------------------------------------------------------------------------
|
|
331
757
|
|
|
332
758
|
const _origFetch = globalThis.fetch;
|
|
333
759
|
|
|
@@ -339,23 +765,27 @@ globalThis.fetch = async function (url, options) {
|
|
|
339
765
|
!urlStr.includes("batches") &&
|
|
340
766
|
!urlStr.includes("count_tokens");
|
|
341
767
|
|
|
342
|
-
if (
|
|
343
|
-
isMessagesEndpoint &&
|
|
344
|
-
options?.body &&
|
|
345
|
-
typeof options.body === "string"
|
|
346
|
-
) {
|
|
768
|
+
if (isMessagesEndpoint && options?.body && typeof options.body === "string") {
|
|
347
769
|
try {
|
|
348
770
|
const payload = JSON.parse(options.body);
|
|
349
771
|
let modified = false;
|
|
350
772
|
|
|
773
|
+
// One-time GrowthBook flag dump on first API call
|
|
774
|
+
dumpGrowthBookFlags();
|
|
775
|
+
|
|
351
776
|
debugLog("--- API call to", urlStr);
|
|
352
777
|
debugLog("message count:", payload.messages?.length);
|
|
353
778
|
|
|
354
|
-
//
|
|
779
|
+
// Detect synthetic model (false rate limiter, B3)
|
|
780
|
+
if (payload.model === "<synthetic>") {
|
|
781
|
+
debugLog("FALSE RATE LIMIT: synthetic model detected — client-side rate limit, no real API call");
|
|
782
|
+
}
|
|
783
|
+
|
|
784
|
+
// Bug 1: Relocate resume attachment blocks
|
|
355
785
|
if (payload.messages) {
|
|
786
|
+
// Log message structure for debugging
|
|
356
787
|
if (DEBUG) {
|
|
357
|
-
let firstUserIdx = -1;
|
|
358
|
-
let lastUserIdx = -1;
|
|
788
|
+
let firstUserIdx = -1, lastUserIdx = -1;
|
|
359
789
|
for (let i = 0; i < payload.messages.length; i++) {
|
|
360
790
|
if (payload.messages[i].role === "user") {
|
|
361
791
|
if (firstUserIdx === -1) firstUserIdx = i;
|
|
@@ -365,39 +795,20 @@ globalThis.fetch = async function (url, options) {
|
|
|
365
795
|
if (firstUserIdx !== -1) {
|
|
366
796
|
const firstContent = payload.messages[firstUserIdx].content;
|
|
367
797
|
const lastContent = payload.messages[lastUserIdx].content;
|
|
368
|
-
debugLog(
|
|
369
|
-
|
|
370
|
-
firstUserIdx,
|
|
371
|
-
"lastUserIdx:",
|
|
372
|
-
lastUserIdx
|
|
373
|
-
);
|
|
374
|
-
debugLog(
|
|
375
|
-
"first user msg blocks:",
|
|
376
|
-
Array.isArray(firstContent) ? firstContent.length : "string"
|
|
377
|
-
);
|
|
798
|
+
debugLog("firstUserIdx:", firstUserIdx, "lastUserIdx:", lastUserIdx);
|
|
799
|
+
debugLog("first user msg blocks:", Array.isArray(firstContent) ? firstContent.length : "string");
|
|
378
800
|
if (Array.isArray(firstContent)) {
|
|
379
801
|
for (const b of firstContent) {
|
|
380
802
|
const t = (b.text || "").substring(0, 80);
|
|
381
|
-
debugLog(
|
|
382
|
-
" first[block]:",
|
|
383
|
-
isRelocatableBlock(b.text) ? "RELOCATABLE" : "keep",
|
|
384
|
-
JSON.stringify(t)
|
|
385
|
-
);
|
|
803
|
+
debugLog(" first[block]:", isRelocatableBlock(b.text) ? "RELOCATABLE" : "keep", JSON.stringify(t));
|
|
386
804
|
}
|
|
387
805
|
}
|
|
388
806
|
if (firstUserIdx !== lastUserIdx) {
|
|
389
|
-
debugLog(
|
|
390
|
-
"last user msg blocks:",
|
|
391
|
-
Array.isArray(lastContent) ? lastContent.length : "string"
|
|
392
|
-
);
|
|
807
|
+
debugLog("last user msg blocks:", Array.isArray(lastContent) ? lastContent.length : "string");
|
|
393
808
|
if (Array.isArray(lastContent)) {
|
|
394
809
|
for (const b of lastContent) {
|
|
395
810
|
const t = (b.text || "").substring(0, 80);
|
|
396
|
-
debugLog(
|
|
397
|
-
" last[block]:",
|
|
398
|
-
isRelocatableBlock(b.text) ? "RELOCATABLE" : "keep",
|
|
399
|
-
JSON.stringify(t)
|
|
400
|
-
);
|
|
811
|
+
debugLog(" last[block]:", isRelocatableBlock(b.text) ? "RELOCATABLE" : "keep", JSON.stringify(t));
|
|
401
812
|
}
|
|
402
813
|
}
|
|
403
814
|
} else {
|
|
@@ -412,13 +823,37 @@ globalThis.fetch = async function (url, options) {
|
|
|
412
823
|
modified = true;
|
|
413
824
|
debugLog("APPLIED: resume message relocation");
|
|
414
825
|
} else {
|
|
826
|
+
debugLog("SKIPPED: resume relocation (not a resume or already correct)");
|
|
827
|
+
}
|
|
828
|
+
}
|
|
829
|
+
|
|
830
|
+
// Image stripping: remove old tool_result images to reduce token waste
|
|
831
|
+
if (payload.messages && IMAGE_KEEP_LAST > 0) {
|
|
832
|
+
const { messages: imgStripped, stats: imgStats } = stripOldToolResultImages(
|
|
833
|
+
payload.messages, IMAGE_KEEP_LAST
|
|
834
|
+
);
|
|
835
|
+
if (imgStats) {
|
|
836
|
+
payload.messages = imgStripped;
|
|
837
|
+
modified = true;
|
|
415
838
|
debugLog(
|
|
416
|
-
|
|
839
|
+
`APPLIED: stripped ${imgStats.strippedCount} images from old tool results`,
|
|
840
|
+
`(~${imgStats.strippedBytes} base64 bytes, ~${imgStats.estimatedTokens} tokens saved)`
|
|
417
841
|
);
|
|
842
|
+
} else if (IMAGE_KEEP_LAST > 0) {
|
|
843
|
+
debugLog("SKIPPED: image stripping (no old images found or not enough turns)");
|
|
844
|
+
}
|
|
845
|
+
}
|
|
846
|
+
|
|
847
|
+
// Prefix lock: replay saved messages[0] on resume for cache hit
|
|
848
|
+
if (payload.messages && payload.system) {
|
|
849
|
+
const locked = applyPrefixLock(payload.messages, payload.system, payload.tools);
|
|
850
|
+
if (locked !== payload.messages) {
|
|
851
|
+
payload.messages = locked;
|
|
852
|
+
modified = true;
|
|
418
853
|
}
|
|
419
854
|
}
|
|
420
855
|
|
|
421
|
-
// Bug
|
|
856
|
+
// Bug 2a: Stabilize tool ordering
|
|
422
857
|
if (payload.tools) {
|
|
423
858
|
const sorted = stabilizeToolOrder(payload.tools);
|
|
424
859
|
const changed = sorted.some(
|
|
@@ -431,7 +866,7 @@ globalThis.fetch = async function (url, options) {
|
|
|
431
866
|
}
|
|
432
867
|
}
|
|
433
868
|
|
|
434
|
-
// Bug
|
|
869
|
+
// Bug 2b: Stabilize fingerprint in attribution header
|
|
435
870
|
if (payload.system && payload.messages) {
|
|
436
871
|
const fix = stabilizeFingerprint(payload.system, payload.messages);
|
|
437
872
|
if (fix) {
|
|
@@ -441,12 +876,7 @@ globalThis.fetch = async function (url, options) {
|
|
|
441
876
|
text: fix.newText,
|
|
442
877
|
};
|
|
443
878
|
modified = true;
|
|
444
|
-
debugLog(
|
|
445
|
-
"APPLIED: fingerprint stabilized from",
|
|
446
|
-
fix.oldFingerprint,
|
|
447
|
-
"to",
|
|
448
|
-
fix.stableFingerprint
|
|
449
|
-
);
|
|
879
|
+
debugLog("APPLIED: fingerprint stabilized from", fix.oldFingerprint, "to", fix.stableFingerprint);
|
|
450
880
|
}
|
|
451
881
|
}
|
|
452
882
|
|
|
@@ -454,11 +884,53 @@ globalThis.fetch = async function (url, options) {
|
|
|
454
884
|
options = { ...options, body: JSON.stringify(payload) };
|
|
455
885
|
debugLog("Request body rewritten");
|
|
456
886
|
}
|
|
887
|
+
|
|
888
|
+
// Save prefix lock after all fixes applied
|
|
889
|
+
if (payload.messages && payload.system) {
|
|
890
|
+
savePrefixLock(payload.messages, payload.system, payload.tools);
|
|
891
|
+
}
|
|
892
|
+
|
|
893
|
+
// Monitor for microcompact / budget enforcement degradation
|
|
894
|
+
if (payload.messages) {
|
|
895
|
+
monitorContextDegradation(payload.messages);
|
|
896
|
+
}
|
|
897
|
+
|
|
898
|
+
// Capture prefix snapshot for cross-process diff analysis
|
|
899
|
+
snapshotPrefix(payload);
|
|
900
|
+
|
|
457
901
|
} catch (e) {
|
|
458
902
|
debugLog("ERROR in interceptor:", e?.message);
|
|
459
903
|
// Parse failure — pass through unmodified
|
|
460
904
|
}
|
|
461
905
|
}
|
|
462
906
|
|
|
463
|
-
|
|
907
|
+
const response = await _origFetch.apply(this, [url, options]);
|
|
908
|
+
|
|
909
|
+
// Extract quota utilization from response headers and save for hooks/MCP
|
|
910
|
+
if (isMessagesEndpoint) {
|
|
911
|
+
try {
|
|
912
|
+
const h5 = response.headers.get("anthropic-ratelimit-unified-5h-utilization");
|
|
913
|
+
const h7d = response.headers.get("anthropic-ratelimit-unified-7d-utilization");
|
|
914
|
+
const reset5h = response.headers.get("anthropic-ratelimit-unified-5h-reset");
|
|
915
|
+
const reset7d = response.headers.get("anthropic-ratelimit-unified-7d-reset");
|
|
916
|
+
const status = response.headers.get("anthropic-ratelimit-unified-status");
|
|
917
|
+
const overage = response.headers.get("anthropic-ratelimit-unified-overage-status");
|
|
918
|
+
|
|
919
|
+
if (h5 || h7d) {
|
|
920
|
+
const quota = {
|
|
921
|
+
timestamp: new Date().toISOString(),
|
|
922
|
+
five_hour: h5 ? { utilization: parseFloat(h5), pct: Math.round(parseFloat(h5) * 100), resets_at: reset5h ? parseInt(reset5h, 10) : null } : null,
|
|
923
|
+
seven_day: h7d ? { utilization: parseFloat(h7d), pct: Math.round(parseFloat(h7d) * 100), resets_at: reset7d ? parseInt(reset7d, 10) : null } : null,
|
|
924
|
+
status: status || null,
|
|
925
|
+
overage_status: overage || null,
|
|
926
|
+
};
|
|
927
|
+
const quotaFile = join(homedir(), ".claude", "quota-status.json");
|
|
928
|
+
writeFileSync(quotaFile, JSON.stringify(quota, null, 2));
|
|
929
|
+
}
|
|
930
|
+
} catch {
|
|
931
|
+
// Non-critical — don't break the response
|
|
932
|
+
}
|
|
933
|
+
}
|
|
934
|
+
|
|
935
|
+
return response;
|
|
464
936
|
};
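The quota extraction above can be sketched as a small parser for one rate-limit window. The helper name `parseQuotaWindow` is hypothetical and the header values are made up. The header names are the ones read by the interceptor, and the sketch assumes, as the `* 100` in the original implies, that utilization arrives as a fraction between 0 and 1.

```javascript
// Hypothetical helper: `headers` is anything with a get(name) method,
// such as the headers of a fetch Response, or a Map in this example.
function parseQuotaWindow(headers, window) {
  const util = headers.get(`anthropic-ratelimit-unified-${window}-utilization`);
  if (!util) return null;
  const reset = headers.get(`anthropic-ratelimit-unified-${window}-reset`);
  const utilization = parseFloat(util);
  return {
    utilization,
    pct: Math.round(utilization * 100),
    resets_at: reset ? parseInt(reset, 10) : null,
  };
}

// Made-up header values for illustration only.
const headers = new Map([
  ["anthropic-ratelimit-unified-5h-utilization", "0.42"],
  ["anthropic-ratelimit-unified-5h-reset", "1775000000"],
]);

console.log(parseQuotaWindow(headers, "5h")); // { utilization: 0.42, pct: 42, resets_at: 1775000000 }
console.log(parseQuotaWindow(headers, "7d")); // null
```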
|