claude-code-cache-fix 1.0.0 → 1.1.0
- package/README.md +79 -5
- package/package.json +1 -1
- package/preload.mjs +443 -135
package/README.md
CHANGED
@@ -1,6 +1,6 @@
 # claude-code-cache-fix
 
-Fixes
+Fixes prompt cache regressions in [Claude Code](https://github.com/anthropics/claude-code) that cause **up to 20x cost increase** on resumed sessions, plus monitoring for silent context degradation. Confirmed through v2.1.92.
 
 ## The problem
 
@@ -14,6 +14,8 @@ Three bugs cause this:
 
 3. **Non-deterministic tool ordering** — Tool definitions can arrive in different orders between turns, changing request bytes and invalidating the cache key.
 
+Additionally, images read via the Read tool persist as base64 in conversation history and are sent on every subsequent API call, compounding token costs silently.
+
 ## Installation
 
 Requires Node.js >= 18 and Claude Code installed via npm (not the standalone binary).
@@ -76,6 +78,42 @@ The module intercepts `globalThis.fetch` before Claude Code makes API calls to `
 
 All fixes are idempotent — if nothing needs fixing, the request passes through unmodified. The interceptor is read-only with respect to your conversation; it only normalizes the request structure before it hits the API.
 
+## Image stripping
+
+Images read via the Read tool are encoded as base64 and stored in `tool_result` blocks in conversation history. They ride along on **every subsequent API call** until compaction. A single 500KB image costs ~62,500 tokens per turn in carry-forward.
+
+Enable image stripping to remove old images from tool results:
+
+```bash
+export CACHE_FIX_IMAGE_KEEP_LAST=3
+```
+
+This keeps images in the last 3 user messages and replaces older ones with a text placeholder. Only targets images inside `tool_result` blocks (Read tool output) — user-pasted images are never touched. Files remain on disk for re-reading if needed.
+
+Set to `0` (default) to disable.
+
+## Monitoring
+
+The interceptor includes monitoring for several additional issues identified by the community:
+
+### Microcompact / budget enforcement
+
+Claude Code silently replaces old tool results with `[Old tool result content cleared]` via server-controlled mechanisms (GrowthBook flags). A 200,000-character aggregate cap and per-tool caps (Bash: 30K, Grep: 20K) truncate older results without notification. There is no `DISABLE_MICROCOMPACT` environment variable.
+
+The interceptor detects cleared tool results and logs counts. When total tool result characters approach the 200K threshold, a warning is logged.
+
+### False rate limiter
+
+The client can generate synthetic "Rate limit reached" errors without making an API call, identifiable by `"model": "<synthetic>"`. The interceptor logs these events.
+
+### GrowthBook flag dump
+
+On the first API call, the interceptor reads `~/.claude.json` and logs the current state of cost/cache-relevant server-controlled flags (hawthorn_window, pewter_kestrel, slate_heron, session_memory, etc.).
+
+### Quota tracking
+
+Response headers are parsed for `anthropic-ratelimit-unified-5h-utilization` and `7d-utilization`, saved to `~/.claude/quota-status.json` for consumption by status line hooks or other tools.
+
 ## Debug mode
 
 Enable debug logging to verify the fix is working:
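As a rough sanity check on the image carry-forward figure quoted in the README hunk above, the sketch below applies the 0.125 tokens-per-byte factor that `preload.mjs` uses for its `estimatedTokens` stat. The helper name `estimateCarryForwardTokens` and the multiplication by a turn count are illustrative assumptions, not part of the package, and "500KB" is taken here as 500,000 bytes.

```javascript
// Hypothetical helper (not in the package): estimate the tokens spent
// re-sending one image, using the 0.125 tokens-per-byte factor that
// preload.mjs applies when it reports estimatedTokens.
const TOKENS_PER_BYTE = 0.125;

function estimateCarryForwardTokens(imageBytes, turns) {
  const perTurn = Math.ceil(imageBytes * TOKENS_PER_BYTE);
  return { perTurn, total: perTurn * turns };
}

// A 500KB image (assumed to mean 500,000 bytes) carried for 10 turns.
const estimate = estimateCarryForwardTokens(500_000, 10);
console.log(estimate.perTurn, estimate.total);
```

Under these assumptions the per-turn figure matches the README's ~62,500 tokens, and the total grows linearly with every turn the image stays in history.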
@@ -88,31 +126,67 @@ Logs are written to `~/.claude/cache-fix-debug.log`. Look for:
 - `APPLIED: resume message relocation` — block scatter was detected and fixed
 - `APPLIED: tool order stabilization` — tools were reordered
 - `APPLIED: fingerprint stabilized from XXX to YYY` — fingerprint was corrected
-- `
+- `APPLIED: stripped N images from old tool results` — images were stripped
+- `MICROCOMPACT: N/M tool results cleared` — microcompact degradation detected
+- `BUDGET WARNING: tool result chars at N / 200,000 threshold` — approaching budget cap
+- `FALSE RATE LIMIT: synthetic model detected` — client-side false rate limit
+- `GROWTHBOOK FLAGS: {...}` — server-controlled feature flags on first call
+- `SKIPPED: resume relocation (not a resume or already correct)` — no fix needed
+
+### Prefix diff mode
+
+Enable cross-process prefix snapshot diffing to diagnose cache busts on restart:
+
+```bash
+CACHE_FIX_PREFIXDIFF=1 claude-fixed
+```
+
+Snapshots are saved to `~/.claude/cache-fix-snapshots/` and diff reports are generated on the first API call after a restart.
+
+## Environment variables
+
+| Variable | Default | Description |
+|----------|---------|-------------|
+| `CACHE_FIX_DEBUG` | `0` | Enable debug logging to `~/.claude/cache-fix-debug.log` |
+| `CACHE_FIX_PREFIXDIFF` | `0` | Enable prefix snapshot diffing |
+| `CACHE_FIX_IMAGE_KEEP_LAST` | `0` | Keep images in last N user messages (0 = disabled) |
 
 ## Limitations
 
 - **npm installation only** — The standalone Claude Code binary has Zig-level attestation that bypasses Node.js. This fix only works with the npm package (`npm install -g @anthropic-ai/claude-code`).
 - **Overage TTL downgrade** — Exceeding 100% of the 5-hour quota triggers a server-enforced TTL downgrade from 1h to 5m. This is a server-side decision and cannot be fixed client-side. The interceptor prevents the cache instability that can push you into overage in the first place.
+- **Microcompact is not preventable** — The monitoring features detect context degradation but cannot prevent it. The microcompact and budget enforcement mechanisms are server-controlled via GrowthBook flags with no client-side disable option.
 - **Version coupling** — The fingerprint salt and block detection heuristics are derived from Claude Code internals. A major refactor could require an update to this package.
 
 ## Tracked issues
 
 - [#34629](https://github.com/anthropics/claude-code/issues/34629) — Original resume cache regression report
-- [#40524](https://github.com/anthropics/claude-code/issues/40524) — Within-session fingerprint invalidation
-- [#42052](https://github.com/anthropics/claude-code/issues/42052) — Community interceptor development
+- [#40524](https://github.com/anthropics/claude-code/issues/40524) — Within-session fingerprint invalidation, image persistence
+- [#42052](https://github.com/anthropics/claude-code/issues/42052) — Community interceptor development, TTL downgrade discovery
 - [#43044](https://github.com/anthropics/claude-code/issues/43044) — Resume loads 0% context on v2.1.91
 - [#43657](https://github.com/anthropics/claude-code/issues/43657) — Resume cache invalidation confirmed on v2.1.92
 - [#44045](https://github.com/anthropics/claude-code/issues/44045) — SDK-level reproduction with token measurements
 
+## Related research
+
+- **[@ArkNill/claude-code-hidden-problem-analysis](https://github.com/ArkNill/claude-code-hidden-problem-analysis)** — Systematic proxy-based analysis of 7 bugs including microcompact, budget enforcement, false rate limiter, and extended thinking quota impact. The monitoring features in v1.1.0 are informed by this research.
+
 ## Contributors
 
 - **[@VictorSun92](https://github.com/VictorSun92)** — Original monkey-patch fix for v2.1.88, identified partial scatter on v2.1.90, contributed forward-scan detection, correct block ordering, and tighter block matchers
 - **[@jmarianski](https://github.com/jmarianski)** — Root cause analysis via MITM proxy capture and Ghidra reverse engineering, multi-mode cache test script
-- **[@cnighswonger](https://github.com/cnighswonger)** — Fingerprint stabilization, tool ordering fix,
+- **[@cnighswonger](https://github.com/cnighswonger)** — Fingerprint stabilization, tool ordering fix, image stripping, monitoring features, overage TTL downgrade discovery, package maintainer
+- **[@ArkNill](https://github.com/ArkNill)** — Microcompact mechanism analysis, GrowthBook flag documentation, false rate limiter identification
+- **[@Renvect](https://github.com/Renvect)** — Image duplication discovery, cross-project directory contamination analysis
 
 If you contributed to the community effort on these issues and aren't listed here, please open an issue or PR — we want to credit everyone properly.
 
+## Support
+
+If this tool saved you money, consider buying me a coffee:
+
+<a href="https://buymeacoffee.com/vsits" target="_blank"><img src="https://cdn.buymeacoffee.com/buttons/v2/default-yellow.png" alt="Buy Me A Coffee" style="height: 60px !important;width: 217px !important;" ></a>
+
 ## License
 
 [MIT](LICENSE)
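The three environment variables in the README table above could be read roughly as follows. The comparison against `"1"` and the `parseInt(... || "0", 10)` expression mirror lines that appear in `preload.mjs`; the wrapper `readCacheFixConfig` and its `env` parameter are hypothetical, added only so the parsing can be exercised without touching `process.env`.

```javascript
// Hypothetical wrapper; only the three expressions inside mirror preload.mjs.
function readCacheFixConfig(env) {
  return {
    // Flags are enabled only by the exact string "1".
    debug: env.CACHE_FIX_DEBUG === "1",
    prefixDiff: env.CACHE_FIX_PREFIXDIFF === "1",
    // Unset or "0" leaves image stripping disabled.
    imageKeepLast: parseInt(env.CACHE_FIX_IMAGE_KEEP_LAST || "0", 10),
  };
}

const defaults = readCacheFixConfig({});
const tuned = readCacheFixConfig({
  CACHE_FIX_DEBUG: "1",
  CACHE_FIX_IMAGE_KEEP_LAST: "3",
});
console.log(defaults, tuned);
```

One consequence of the strict comparison is that values such as `true` or `yes` would leave a flag off, so `1` is the safe value to export.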
package/package.json
CHANGED
package/preload.mjs
CHANGED
@@ -8,51 +8,42 @@
 // later user messages instead of messages[0]. This breaks the prompt cache
 // prefix match. Fix: relocate them to messages[0] on every API call.
 // (github.com/anthropics/claude-code/issues/34629)
-// (github.com/anthropics/claude-code/issues/43657)
-// (github.com/anthropics/claude-code/issues/44045)
 //
 // Bug 2: Fingerprint instability
 // The cc_version fingerprint in the attribution header is computed from
 // messages[0] content INCLUDING meta/attachment blocks. When those blocks
-// change between turns, the fingerprint changes
-//
+// change between turns, the fingerprint changes, busting cache within the
+// same session. Fix: stabilize the fingerprint from the real user message.
 // (github.com/anthropics/claude-code/issues/40524)
 //
-// Bug 3:
-//
-//
+// Bug 3: Image carry-forward in conversation history
+// Images read via the Read tool persist as base64 in conversation history
+// and are sent on every subsequent API call. A single 500KB image costs
+// ~62,500 tokens per turn in carry-forward. Fix: strip base64 image blocks
+// from tool_result content older than N user turns.
+// Set CACHE_FIX_IMAGE_KEEP_LAST=N to enable (default: 0 = disabled).
+// (github.com/anthropics/claude-code/issues/40524)
+//
+// Monitoring:
+// - GrowthBook flag dump on first API call (CACHE_FIX_DEBUG=1)
+// - Microcompact / budget enforcement detection (logs cleared tool results)
+// - False rate limiter detection (model: "<synthetic>")
+// - Quota utilization tracking (writes ~/.claude/quota-status.json)
+// - Prefix snapshot diffing across process restarts (CACHE_FIX_PREFIXDIFF=1)
 //
-// Based on community
-//
+// Based on community fix by @VictorSun92 / @jmarianski (issue #34629),
+// enhanced with fingerprint stabilization, image stripping, and monitoring.
+// Bug research informed by @ArkNill's claude-code-hidden-problem-analysis.
 //
-//
+// Load via: NODE_OPTIONS="--import $HOME/.claude/cache-fix-preload.mjs"
 
 import { createHash } from "node:crypto";
-import { appendFileSync } from "node:fs";
-import { homedir } from "node:os";
-import { join } from "node:path";
 
-//
-// Debug logging (writes to ~/.claude/cache-fix-debug.log)
-// Set CACHE_FIX_DEBUG=1 to enable
-// ---------------------------------------------------------------------------
-
-const DEBUG = process.env.CACHE_FIX_DEBUG === "1";
-const LOG_PATH = join(homedir(), ".claude", "cache-fix-debug.log");
-
-function debugLog(...args) {
-  if (!DEBUG) return;
-  const line = `[${new Date().toISOString()}] ${args.join(" ")}\n`;
-  try {
-    appendFileSync(LOG_PATH, line);
-  } catch {}
-}
-
-// ---------------------------------------------------------------------------
+// --------------------------------------------------------------------------
 // Fingerprint stabilization (Bug 2)
-//
+// --------------------------------------------------------------------------
 
-// Must match
+// Must match src/utils/fingerprint.ts exactly.
 const FINGERPRINT_SALT = "59cf53e54c78";
 const FINGERPRINT_INDICES = [4, 7, 20];
 
@@ -77,20 +68,14 @@ function extractRealUserMessageText(messages) {
     if (msg.role !== "user") continue;
     const content = msg.content;
     if (!Array.isArray(content)) {
-      if (
-        typeof content === "string" &&
-        !content.startsWith("<system-reminder>")
-      ) {
+      if (typeof content === "string" && !content.startsWith("<system-reminder>")) {
         return content;
       }
       continue;
     }
+    // Find first text block that isn't a system-reminder
     for (const block of content) {
-      if (
-        block.type === "text" &&
-        typeof block.text === "string" &&
-        !block.text.startsWith("<system-reminder>")
-      ) {
+      if (block.type === "text" && typeof block.text === "string" && !block.text.startsWith("<system-reminder>")) {
         return block.text;
       }
     }
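To see what the loop body in the hunk above does with a concrete payload, here is a minimal self-contained sketch. The body is copied from the new side of the diff; the `for (const msg of messages)` header and the final `return ""` fallback are assumptions, since the hunk does not show them, and the sample messages are invented for illustration.

```javascript
// Sketch only: the loop header and the final fallback are assumptions,
// because the hunk shows just the loop body.
function extractRealUserMessageText(messages) {
  for (const msg of messages) {
    if (msg.role !== "user") continue;
    const content = msg.content;
    if (!Array.isArray(content)) {
      if (typeof content === "string" && !content.startsWith("<system-reminder>")) {
        return content;
      }
      continue;
    }
    // Find first text block that isn't a system-reminder
    for (const block of content) {
      if (block.type === "text" && typeof block.text === "string" && !block.text.startsWith("<system-reminder>")) {
        return block.text;
      }
    }
  }
  return ""; // assumed fallback when no real user text exists
}

// Invented sample: a meta block precedes the text the user actually typed.
const sample = [
  {
    role: "user",
    content: [
      { type: "text", text: "<system-reminder>\nThe following skills are available" },
      { type: "text", text: "Refactor the parser" },
    ],
  },
  { role: "assistant", content: "Sure." },
];
const realText = extractRealUserMessageText(sample);
console.log(realText);
```

Because the meta block is skipped, the text that feeds the fingerprint stays the same even if the reminder blocks around it change between turns.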
@@ -100,17 +85,14 @@ function extractRealUserMessageText(messages) {
 
 /**
  * Extract current cc_version from system prompt blocks and recompute with
- * stable fingerprint. Returns {
- * or null if no fix needed.
+ * stable fingerprint. Returns { oldVersion, newVersion, stableFingerprint }.
  */
 function stabilizeFingerprint(system, messages) {
   if (!Array.isArray(system)) return null;
 
+  // Find the attribution header block
   const attrIdx = system.findIndex(
-    (b) =>
-      b.type === "text" &&
-      typeof b.text === "string" &&
-      b.text.includes("x-anthropic-billing-header:")
+    (b) => b.type === "text" && typeof b.text === "string" && b.text.includes("x-anthropic-billing-header:")
   );
   if (attrIdx === -1) return null;
 
@@ -118,13 +100,14 @@ function stabilizeFingerprint(system, messages) {
   const versionMatch = attrBlock.text.match(/cc_version=([^;]+)/);
   if (!versionMatch) return null;
 
-  const fullVersion = versionMatch[1]; // e.g. "2.1.
+  const fullVersion = versionMatch[1]; // e.g. "2.1.87.a3f"
   const dotParts = fullVersion.split(".");
   if (dotParts.length < 4) return null;
 
-  const baseVersion = dotParts.slice(0, 3).join("."); // "2.1.
+  const baseVersion = dotParts.slice(0, 3).join("."); // "2.1.87"
   const oldFingerprint = dotParts[3]; // "a3f"
 
+  // Compute stable fingerprint from real user text
   const realText = extractRealUserMessageText(messages);
   const stableFingerprint = computeFingerprint(realText, baseVersion);
 
@@ -139,38 +122,28 @@ function stabilizeFingerprint(system, messages) {
   return { attrIdx, newText, oldFingerprint, stableFingerprint };
 }
 
-//
+// --------------------------------------------------------------------------
 // Resume message relocation (Bug 1)
-//
+// --------------------------------------------------------------------------
 
 function isSystemReminder(text) {
   return typeof text === "string" && text.startsWith("<system-reminder>");
 }
-
+// FIX: Match block headers with startsWith to avoid false positives from
+// quoted content (e.g. "Note:" file-change reminders embedding debug logs).
 const SR = "<system-reminder>\n";
-
 function isHooksBlock(text) {
-
-
-  );
+  // Hooks block header varies; fall back to head-region check
+  return isSystemReminder(text) && text.substring(0, 200).includes("hook success");
 }
 function isSkillsBlock(text) {
-  return (
-    typeof text === "string" &&
-    text.startsWith(SR + "The following skills are available")
-  );
+  return typeof text === "string" && text.startsWith(SR + "The following skills are available");
 }
 function isDeferredToolsBlock(text) {
-  return (
-    typeof text === "string" &&
-    text.startsWith(SR + "The following deferred tools are now available")
-  );
+  return typeof text === "string" && text.startsWith(SR + "The following deferred tools are now available");
 }
 function isMcpBlock(text) {
-  return (
-    typeof text === "string" &&
-    text.startsWith(SR + "# MCP Server Instructions")
-  );
+  return typeof text === "string" && text.startsWith(SR + "# MCP Server Instructions");
 }
 function isRelocatableBlock(text) {
   return (
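The "FIX" comment in the hunk above says header-anchored `startsWith` matching avoids false positives from quoted content. The sketch below contrasts the diff's `isSkillsBlock` with a looser `includes`-based matcher; `looseSkillsMatch` and both sample strings are hypothetical, written only to illustrate the failure mode the comment describes.

```javascript
const SR = "<system-reminder>\n";

// Header-anchored matcher, as in the new isSkillsBlock.
function isSkillsBlock(text) {
  return typeof text === "string" && text.startsWith(SR + "The following skills are available");
}

// Hypothetical looser matcher, shown only for contrast.
function looseSkillsMatch(text) {
  return typeof text === "string" && text.includes("The following skills are available");
}

// Invented samples: a genuine skills block, and a file-change reminder that
// merely quotes the same sentence further down in its body.
const realBlock = SR + "The following skills are available:\n- deploy";
const quotedLog = SR + "Note: debug.log changed. Contents:\nThe following skills are available:\n- deploy";

const strict = [isSkillsBlock(realBlock), isSkillsBlock(quotedLog)];
const loose = [looseSkillsMatch(realBlock), looseSkillsMatch(quotedLog)];
console.log(strict, loose);
```

With the anchored matcher only the genuine block qualifies for relocation, whereas the loose matcher would also pull the quoting reminder into `messages[0]`.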
@@ -208,18 +181,21 @@ function stripSessionKnowledge(text) {
 }
 
 /**
- * Core fix: on EVERY
+ * Core fix: on EVERY call, scan the entire message array for the LATEST
  * relocatable blocks (skills, MCP, deferred tools, hooks) and ensure they
  * are in messages[0]. This matches fresh session behavior where attachments
- * are always prepended to messages[0].
+ * are always prepended to messages[0] on every API call.
  *
- * The
- *
- *
+ * The original community fix only checked the last user message, which
+ * broke on subsequent turns because:
+ * - Call 1: skills in last msg → relocated to messages[0] (3 blocks)
+ * - Call 2: in-memory state unchanged, skills now in a middle msg,
+ * last msg has no relocatable blocks → messages[0] back to 2 blocks
+ * - Prefix changed → cache bust
  *
  * This version scans backwards to find the latest instance of each
  * relocatable block type, removes them from wherever they are, and
- * prepends them to messages[0]
+ * prepends them to messages[0]. Idempotent across calls.
  */
 function normalizeResumeMessages(messages) {
   if (!Array.isArray(messages) || messages.length < 2) return messages;
@@ -236,13 +212,11 @@ function normalizeResumeMessages(messages) {
   const firstMsg = messages[firstUserIdx];
   if (!Array.isArray(firstMsg?.content)) return messages;
 
-  // Check if ANY relocatable blocks are scattered outside first user msg.
+  // FIX: Check if ANY relocatable blocks are scattered outside first user msg.
+  // The old check (firstAlreadyHas → skip) missed partial scatter where some
+  // blocks stay in messages[0] but others drift to later messages (v2.1.89+).
   let hasScatteredBlocks = false;
-  for (
-    let i = firstUserIdx + 1;
-    i < messages.length && !hasScatteredBlocks;
-    i++
-  ) {
+  for (let i = firstUserIdx + 1; i < messages.length && !hasScatteredBlocks; i++) {
     const msg = messages[i];
     if (msg.role !== "user" || !Array.isArray(msg.content)) continue;
     for (const block of msg.content) {
@@ -254,8 +228,8 @@ function normalizeResumeMessages(messages) {
   }
   if (!hasScatteredBlocks) return messages;
 
-  // Scan ALL user messages in reverse to collect the LATEST
-  // block type. This handles both full and partial scatter.
+  // Scan ALL user messages (including first) in reverse to collect the LATEST
+  // version of each block type. This handles both full and partial scatter.
   const found = new Map();
 
   for (let i = messages.length - 1; i >= firstUserIdx; i--) {
@@ -267,6 +241,7 @@ function normalizeResumeMessages(messages) {
       const text = block.text || "";
       if (!isRelocatableBlock(text)) continue;
 
+      // Determine block type for dedup
       let blockType;
       if (isSkillsBlock(text)) blockType = "skills";
       else if (isMcpBlock(text)) blockType = "mcp";
@@ -274,6 +249,7 @@ function normalizeResumeMessages(messages) {
       else if (isHooksBlock(text)) blockType = "hooks";
       else continue;
 
+      // Keep only the LATEST (first found scanning backwards)
       if (!found.has(blockType)) {
         let fixedText = text;
         if (blockType === "hooks") fixedText = stripSessionKnowledge(text);
@@ -287,17 +263,15 @@ function normalizeResumeMessages(messages) {
 
   if (found.size === 0) return messages;
 
-  // Remove ALL relocatable blocks from ALL user messages
+  // Remove ALL relocatable blocks from ALL user messages (both first and later)
   const result = messages.map((msg) => {
     if (msg.role !== "user" || !Array.isArray(msg.content)) return msg;
-    const filtered = msg.content.filter(
-      (b) => !isRelocatableBlock(b.text || "")
-    );
+    const filtered = msg.content.filter((b) => !isRelocatableBlock(b.text || ""));
     if (filtered.length === msg.content.length) return msg;
     return { ...msg, content: filtered };
   });
 
-  // Order must match fresh session layout: deferred
+  // FIX: Order must match fresh session layout: deferred → mcp → skills → hooks
   const ORDER = ["deferred", "mcp", "skills", "hooks"];
   const toRelocate = ORDER.filter((t) => found.has(t)).map((t) => found.get(t));
 
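The ordering step at the end of the hunk above is easy to check in isolation. In this sketch, `ORDER` and the `filter`/`map` expression are copied from the diff; the wrapper `orderRelocatable` and the `Map` contents are hypothetical stand-ins for real blocks.

```javascript
// ORDER and the filter/map expression are copied from the hunk; the wrapper
// name and the Map contents below are made up for illustration.
const ORDER = ["deferred", "mcp", "skills", "hooks"];

function orderRelocatable(found) {
  return ORDER.filter((t) => found.has(t)).map((t) => found.get(t));
}

// Insertion order mimics a backwards scan that met hooks first and mcp last.
// No "deferred" block was found in this example.
const found = new Map([
  ["hooks", { type: "text", text: "hooks block" }],
  ["skills", { type: "text", text: "skills block" }],
  ["mcp", { type: "text", text: "mcp block" }],
]);

const toRelocate = orderRelocatable(found);
console.log(toRelocate.map((b) => b.text));
```

Driving the output from `ORDER` instead of the `Map`'s insertion order means the discovery order of the backwards scan never leaks into the request bytes, and missing block types are simply skipped.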
@@ -309,12 +283,95 @@ function normalizeResumeMessages(messages) {
   return result;
 }
 
-//
-//
-//
+// --------------------------------------------------------------------------
+// Image stripping from old tool results (cost optimization)
+// --------------------------------------------------------------------------
+
+// CACHE_FIX_IMAGE_KEEP_LAST=N — keep images only in the last N user messages.
+// Unset or 0 = disabled (all images preserved, backward compatible).
+// Images in tool_result blocks older than N user messages from the end are
+// replaced with a text placeholder. User-pasted images (direct image blocks
+// in user messages, not inside tool_result) are left alone.
+const IMAGE_KEEP_LAST = parseInt(process.env.CACHE_FIX_IMAGE_KEEP_LAST || "0", 10);
+
+/**
+ * Strip base64 image blocks from tool_result content in older messages.
+ * Returns { messages, stats } where stats has stripping metrics.
+ */
+function stripOldToolResultImages(messages, keepLast) {
+  if (!keepLast || keepLast <= 0 || !Array.isArray(messages)) {
+    return { messages, stats: null };
+  }
+
+  // Find user message indices (turns) so we can count from the end
+  const userMsgIndices = [];
+  for (let i = 0; i < messages.length; i++) {
+    if (messages[i].role === "user") userMsgIndices.push(i);
+  }
+
+  if (userMsgIndices.length <= keepLast) {
+    return { messages, stats: null }; // not enough turns to strip anything
+  }
+
+  // Messages at or after this index are "recent" — keep their images
+  const cutoffIdx = userMsgIndices[userMsgIndices.length - keepLast];
+
+  let strippedCount = 0;
+  let strippedBytes = 0;
+
+  const result = messages.map((msg, msgIdx) => {
+    // Only process user messages before the cutoff (tool_result is in user msgs)
+    if (msg.role !== "user" || msgIdx >= cutoffIdx || !Array.isArray(msg.content)) {
+      return msg;
+    }
+
+    let msgModified = false;
+    const newContent = msg.content.map((block) => {
+      // Only strip images inside tool_result blocks, not user-pasted images
+      if (block.type === "tool_result" && Array.isArray(block.content)) {
+        let toolModified = false;
+        const newToolContent = block.content.map((item) => {
+          if (item.type === "image") {
+            strippedCount++;
+            if (item.source?.data) {
+              strippedBytes += item.source.data.length;
+            }
+            toolModified = true;
+            return {
+              type: "text",
+              text: "[image stripped from history — file may still be on disk]",
+            };
+          }
+          return item;
+        });
+        if (toolModified) {
+          msgModified = true;
+          return { ...block, content: newToolContent };
+        }
+      }
+      return block;
+    });
+
+    if (msgModified) {
+      return { ...msg, content: newContent };
+    }
+    return msg;
+  });
+
+  const stats = strippedCount > 0
+    ? { strippedCount, strippedBytes, estimatedTokens: Math.ceil(strippedBytes * 0.125) }
+    : null;
+
+  return { messages: strippedCount > 0 ? result : messages, stats };
+}
+
+// --------------------------------------------------------------------------
+// Tool schema stabilization (Bug 2 secondary cause)
+// --------------------------------------------------------------------------
 
 /**
- * Sort tool definitions by name for deterministic ordering.
+ * Sort tool definitions by name for deterministic ordering. Tool schema bytes
+ * changing mid-session was acknowledged as a bug in the v2.1.88 changelog.
  */
 function stabilizeToolOrder(tools) {
   if (!Array.isArray(tools) || tools.length === 0) return tools;
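The cutoff logic inside `stripOldToolResultImages` in the hunk above decides which messages count as "recent". The sketch below isolates that computation so the boundary can be checked by hand; `findImageCutoff` is a hypothetical helper, not a function in the package, and the message list is invented.

```javascript
// Hypothetical helper isolating the cutoff computation used by
// stripOldToolResultImages; not part of the package itself.
function findImageCutoff(messages, keepLast) {
  const userMsgIndices = [];
  for (let i = 0; i < messages.length; i++) {
    if (messages[i].role === "user") userMsgIndices.push(i);
  }
  // Not enough user turns: nothing is old enough to strip.
  if (keepLast <= 0 || userMsgIndices.length <= keepLast) return null;
  // Messages at or after this index keep their images.
  return userMsgIndices[userMsgIndices.length - keepLast];
}

// Invented conversation: user messages sit at indices 0, 2, 4 and 6.
const roles = ["user", "assistant", "user", "assistant", "user", "assistant", "user"];
const messages = roles.map((role) => ({ role, content: [] }));

const cutoff = findImageCutoff(messages, 2);
console.log(cutoff);
```

With `keepLast = 2` the cutoff falls on the third user message, so only tool results in the user messages at indices 0 and 2 would be eligible for stripping. The count is in user messages, not in human-typed prompts, so tool-result-only user messages presumably consume the budget too.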
@@ -325,9 +382,228 @@ function stabilizeToolOrder(tools) {
|
|
|
325
382
|
});
|
|
326
383
|
}
|
|
327
384
|
|
|
328
|
-
//
|
|
385
|
+
// --------------------------------------------------------------------------
|
|
329
386
|
// Fetch interceptor
|
|
330
|
-
//
|
|
387
|
+
// --------------------------------------------------------------------------
|
|
388
|
+
|
|
389
|
+
// --------------------------------------------------------------------------
|
|
390
|
+
// Debug logging (writes to ~/.claude/cache-fix-debug.log)
|
|
391
|
+
// Set CACHE_FIX_DEBUG=1 to enable
|
|
392
|
+
// --------------------------------------------------------------------------
|
|
393
|
+
|
|
394
|
+
import { appendFileSync, readFileSync, writeFileSync, mkdirSync } from "node:fs";
|
|
395
|
+
import { homedir } from "node:os";
|
|
396
|
+
import { join } from "node:path";
|
|
397
|
+
|
|
398
|
+
const DEBUG = process.env.CACHE_FIX_DEBUG === "1";
|
|
399
|
+
const PREFIXDIFF = process.env.CACHE_FIX_PREFIXDIFF === "1";
|
|
400
|
+
const LOG_PATH = join(homedir(), ".claude", "cache-fix-debug.log");
|
|
401
|
+
const SNAPSHOT_DIR = join(homedir(), ".claude", "cache-fix-snapshots");
|
|
402
|
+
|
|
403
|
+
function debugLog(...args) {
|
|
404
|
+
if (!DEBUG) return;
|
|
405
|
+
const line = `[${new Date().toISOString()}] ${args.join(" ")}\n`;
|
|
406
|
+
try { appendFileSync(LOG_PATH, line); } catch {}
|
|
407
|
+
}
|
|
408
|
+
|
|
409
|
+
// --------------------------------------------------------------------------
|
|
410
|
+
// Prefix snapshot — captures message prefix for cross-process diff.
|
|
411
|
+
// Set CACHE_FIX_PREFIXDIFF=1 to enable.
|
|
412
|
+
//
|
|
413
|
+
// On each API call: saves JSON of first 5 messages + system + tools hash
|
|
414
|
+
// to ~/.claude/cache-fix-snapshots/<session-hash>-last.json
|
|
415
|
+
//
|
|
416
|
+
// On first call after startup: compares against saved snapshot and writes
|
|
417
|
+
// a diff report to ~/.claude/cache-fix-snapshots/<session-hash>-diff.json
|
|
418
|
+
// --------------------------------------------------------------------------
|
|
419
|
+
|
|
420
|
+
let _prefixDiffFirstCall = true;
|
|
421
|
+
|
|
422
|
+
// --------------------------------------------------------------------------
|
|
423
|
+
// GrowthBook flag dump (runs once on first API call)
|
|
424
|
+
// --------------------------------------------------------------------------
|
|
425
|
+
|
|
426
|
+
let _growthBookDumped = false;
|
|
427
|
+
|
|
428
|
+
function dumpGrowthBookFlags() {
|
|
429
|
+
if (_growthBookDumped || !DEBUG) return;
|
|
430
|
+
_growthBookDumped = true;
|
|
431
|
+
try {
|
|
432
|
+
const claudeJson = JSON.parse(readFileSync(join(homedir(), ".claude.json"), "utf8"));
|
|
433
|
+
const features = claudeJson.cachedGrowthBookFeatures;
|
|
434
|
+
if (!features) { debugLog("GROWTHBOOK: no cachedGrowthBookFeatures found"); return; }
|
|
435
|
+
|
|
436
|
+
// Log the flags that matter for cost/cache/context behavior
|
|
437
|
+
const interesting = {
|
|
438
|
+
hawthorn_window: features.tengu_hawthorn_window,
|
|
439
|
+
pewter_kestrel: features.tengu_pewter_kestrel,
|
|
440
|
+
summarize_tool_results: features.tengu_summarize_tool_results,
|
|
441
|
+
slate_heron: features.tengu_slate_heron,
|
|
442
|
+
session_memory: features.tengu_session_memory,
|
|
443
|
+
sm_compact: features.tengu_sm_compact,
|
|
444
|
+
sm_compact_config: features.tengu_sm_compact_config,
|
|
445
|
+
sm_config: features.tengu_sm_config,
|
|
446
|
+
cache_plum_violet: features.tengu_cache_plum_violet,
|
|
447
|
+
prompt_cache_1h_config: features.tengu_prompt_cache_1h_config,
|
|
448
|
+
crystal_beam: features.tengu_crystal_beam,
|
|
449
|
+
cold_compact: features.tengu_cold_compact,
|
|
450
|
+
system_prompt_global_cache: features.tengu_system_prompt_global_cache,
|
|
451
|
+
compact_cache_prefix: features.tengu_compact_cache_prefix,
|
|
452
|
+
};
|
|
453
|
+
debugLog("GROWTHBOOK FLAGS:", JSON.stringify(interesting, null, 2));
|
|
454
|
+
} catch (e) {
|
|
455
|
+
debugLog("GROWTHBOOK: failed to read ~/.claude.json:", e?.message);
|
|
456
|
+
}
|
|
457
|
+
}
|
|
458
|
+
|
|
459
|
+
// --------------------------------------------------------------------------
|
|
460
|
+
// Microcompact / budget monitoring
|
|
461
|
+
// --------------------------------------------------------------------------
|
|
462
|
+
|
|
463
|
+
/**
|
|
464
|
+
* Scan outgoing messages for signs of microcompact clearing and budget
|
|
465
|
+
* enforcement. Counts tool results that have been gutted and reports stats.
|
|
466
|
+
*/
|
|
467
|
+
function monitorContextDegradation(messages) {
|
|
468
|
+
if (!Array.isArray(messages)) return null;
|
|
469
|
+
|
|
470
|
+
let clearedToolResults = 0;
|
|
471
|
+
let totalToolResultChars = 0;
|
|
472
|
+
let totalToolResults = 0;
|
|
473
|
+
|
|
474
|
+
for (const msg of messages) {
|
|
475
|
+
if (msg.role !== "user" || !Array.isArray(msg.content)) continue;
|
|
476
|
+
for (const block of msg.content) {
|
|
477
|
+
if (block.type === "tool_result") {
|
|
478
|
+
totalToolResults++;
|
|
479
|
+
const content = block.content;
|
|
480
|
+
if (typeof content === "string") {
|
|
481
|
+
if (content === "[Old tool result content cleared]") {
|
|
482
|
+
clearedToolResults++;
|
|
483
|
+
} else {
|
|
484
|
+
totalToolResultChars += content.length;
|
|
485
|
+
}
|
|
486
|
+
} else if (Array.isArray(content)) {
|
|
487
|
+
for (const item of content) {
|
|
488
|
+
if (item.type === "text" && typeof item.text === "string") {
|
|
489
|
+
if (item.text === "[Old tool result content cleared]") {
|
|
490
|
+
clearedToolResults++;
|
|
491
|
+
} else {
|
|
492
|
+
totalToolResultChars += item.text.length;
|
|
493
|
+
}
|
|
494
|
+
}
|
|
495
|
+
}
|
|
496
|
+
}
|
|
497
|
+
}
|
|
498
|
+
}
|
|
499
|
+
}
|
|
500
|
+
|
|
501
|
+
if (totalToolResults === 0) return null;
|
|
502
|
+
|
|
503
|
+
const stats = { totalToolResults, clearedToolResults, totalToolResultChars };
|
|
504
|
+
|
|
505
|
+
if (clearedToolResults > 0) {
|
|
506
|
+
debugLog(`MICROCOMPACT: ${clearedToolResults}/${totalToolResults} tool results cleared`);
|
|
507
|
+
}
|
|
508
|
+
|
|
509
|
+
// Warn once tool result text exceeds 150K chars (75% of the 200K budget threshold)
|
|
510
|
+
if (totalToolResultChars > 150000) {
|
|
511
|
+
debugLog(`BUDGET WARNING: tool result chars at ${totalToolResultChars.toLocaleString()} / 200,000 threshold`);
|
|
512
|
+
}
|
|
513
|
+
|
|
514
|
+
return stats;
|
|
515
|
+
}
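The counting rules in `monitorContextDegradation` can be tried on a small hand-made payload. The sketch below is a condensed restatement for illustration, not the package's code: `countToolResults` and `sampleMessages` are hypothetical names, and the message shapes are assumed from the fields the function above reads.

```javascript
// Hypothetical, condensed restatement of the counting logic (illustration only).
const CLEARED_MARKER = "[Old tool result content cleared]";

function countToolResults(messages) {
  let total = 0, cleared = 0, chars = 0;
  for (const msg of messages) {
    // Only user messages with block-array content can carry tool_result blocks.
    if (msg.role !== "user" || !Array.isArray(msg.content)) continue;
    for (const block of msg.content) {
      if (block.type !== "tool_result") continue;
      total++;
      // Normalize string content to the array-of-text-items form.
      const items = typeof block.content === "string"
        ? [{ type: "text", text: block.content }]
        : Array.isArray(block.content) ? block.content : [];
      for (const item of items) {
        if (item.type !== "text" || typeof item.text !== "string") continue;
        if (item.text === CLEARED_MARKER) cleared++;
        else chars += item.text.length;
      }
    }
  }
  return { total, cleared, chars };
}

// Made-up sample: one cleared result, one surviving 5-character result.
const sampleMessages = [
  { role: "user", content: "plain string prompt, skipped" },
  { role: "assistant", content: [{ type: "text", text: "ok" }] },
  { role: "user", content: [
    { type: "tool_result", tool_use_id: "t1", content: CLEARED_MARKER },
    { type: "tool_result", tool_use_id: "t2", content: [{ type: "text", text: "hello" }] },
  ] },
];

const stats = countToolResults(sampleMessages);
console.log(stats); // { total: 2, cleared: 1, chars: 5 }
```

As in the function above, cleared markers are counted per text item, so a single tool result that holds several cleared items contributes more than once to the cleared count.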
|
|
516
|
+
|
|
517
|
+
function snapshotPrefix(payload) {
|
|
518
|
+
if (!PREFIXDIFF) return;
|
|
519
|
+
try {
|
|
520
|
+
mkdirSync(SNAPSHOT_DIR, { recursive: true });
|
|
521
|
+
|
|
522
|
+
// Session key: hash of the first 2000 chars of the serialized system prompt, stable across restarts for the same project.
|
|
523
|
+
// Different projects get different snapshots, same project matches across resume.
|
|
524
|
+
const sessionKey = payload.system
|
|
525
|
+
? createHash("sha256").update(JSON.stringify(payload.system).slice(0, 2000)).digest("hex").slice(0, 12)
|
|
526
|
+
: "default";
|
|
527
|
+
|
|
528
|
+
const snapshotFile = join(SNAPSHOT_DIR, `${sessionKey}-last.json`);
|
|
529
|
+
const diffFile = join(SNAPSHOT_DIR, `${sessionKey}-diff.json`);
|
|
530
|
+
|
|
531
|
+
// Build prefix snapshot: first 5 messages, stripped of cache_control
|
|
532
|
+
const prefixMsgs = (payload.messages || []).slice(0, 5).map(msg => {
|
|
533
|
+
const content = Array.isArray(msg.content)
|
|
534
|
+
? msg.content.map(b => {
|
|
535
|
+
const { cache_control, ...rest } = b;
|
|
536
|
+
// Truncate long text blocks for diffing
|
|
537
|
+
if (rest.text && rest.text.length > 500) {
|
|
538
|
+
rest.text = rest.text.slice(0, 500) + `...[${rest.text.length} chars]`;
|
|
539
|
+
}
|
|
540
|
+
return rest;
|
|
541
|
+
})
|
|
542
|
+
: msg.content;
|
|
543
|
+
return { role: msg.role, content };
|
|
544
|
+
});
|
|
545
|
+
|
|
546
|
+
const toolsHash = payload.tools
|
|
547
|
+
? createHash("sha256").update(JSON.stringify(payload.tools.map(t => t.name))).digest("hex").slice(0, 16)
|
|
548
|
+
: "none";
|
|
549
|
+
|
|
550
|
+
const systemHash = payload.system
|
|
551
|
+
? createHash("sha256").update(JSON.stringify(payload.system)).digest("hex").slice(0, 16)
|
|
552
|
+
: "none";
|
|
553
|
+
|
|
554
|
+
const snapshot = {
|
|
555
|
+
timestamp: new Date().toISOString(),
|
|
556
|
+
messageCount: payload.messages?.length || 0,
|
|
557
|
+
toolsHash,
|
|
558
|
+
systemHash,
|
|
559
|
+
prefixMessages: prefixMsgs,
|
|
560
|
+
};
|
|
561
|
+
|
|
562
|
+
// On first call: compare against saved
|
|
563
|
+
if (_prefixDiffFirstCall) {
|
|
564
|
+
_prefixDiffFirstCall = false;
|
|
565
|
+
try {
|
|
566
|
+
const prev = JSON.parse(readFileSync(snapshotFile, "utf8"));
|
|
567
|
+
const diff = {
|
|
568
|
+
timestamp: snapshot.timestamp,
|
|
569
|
+
prevTimestamp: prev.timestamp,
|
|
570
|
+
toolsMatch: prev.toolsHash === snapshot.toolsHash,
|
|
571
|
+
systemMatch: prev.systemHash === snapshot.systemHash,
|
|
572
|
+
messageCountPrev: prev.messageCount,
|
|
573
|
+
messageCountNow: snapshot.messageCount,
|
|
574
|
+
prefixDiffs: [],
|
|
575
|
+
};
|
|
576
|
+
|
|
577
|
+
const maxIdx = Math.max(prev.prefixMessages.length, snapshot.prefixMessages.length);
|
|
578
|
+
for (let i = 0; i < maxIdx; i++) {
|
|
579
|
+
const prevMsg = JSON.stringify(prev.prefixMessages[i] || null);
|
|
580
|
+
const nowMsg = JSON.stringify(snapshot.prefixMessages[i] || null);
|
|
581
|
+
if (prevMsg !== nowMsg) {
|
|
582
|
+
diff.prefixDiffs.push({
|
|
583
|
+
index: i,
|
|
584
|
+
prev: prev.prefixMessages[i] || null,
|
|
585
|
+
now: snapshot.prefixMessages[i] || null,
|
|
586
|
+
});
|
|
587
|
+
}
|
|
588
|
+
}
|
|
589
|
+
|
|
590
|
+
writeFileSync(diffFile, JSON.stringify(diff, null, 2));
|
|
591
|
+
debugLog(`PREFIX DIFF: ${diff.prefixDiffs.length} differences in first 5 messages. tools=${diff.toolsMatch ? "match" : "DIFFER"} system=${diff.systemMatch ? "match" : "DIFFER"}`);
|
|
592
|
+
} catch {
|
|
593
|
+
// No usable previous snapshot (first run, or an unreadable/malformed file): skip the diff
|
|
594
|
+
}
|
|
595
|
+
}
|
|
596
|
+
|
|
597
|
+
// Save current snapshot
|
|
598
|
+
writeFileSync(snapshotFile, JSON.stringify(snapshot, null, 2));
|
|
599
|
+
} catch (e) {
|
|
600
|
+
debugLog("PREFIX SNAPSHOT ERROR:", e?.message);
|
|
601
|
+
}
|
|
602
|
+
}
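The index-by-index comparison inside `snapshotPrefix` can be sketched in isolation. `diffPrefixes`, `saved`, and `resumed` are hypothetical names introduced for this example, and the message contents are made up. The comparison mirrors the one above: each slot is serialized with `JSON.stringify` and a missing slot is treated as `null`.

```javascript
// Hypothetical helper mirroring the prefix comparison (illustration only).
function diffPrefixes(prevMsgs, nowMsgs) {
  const diffs = [];
  const maxIdx = Math.max(prevMsgs.length, nowMsgs.length);
  for (let i = 0; i < maxIdx; i++) {
    const prev = prevMsgs[i] ?? null;
    const now = nowMsgs[i] ?? null;
    // String comparison of the serialized form; sensitive to property order.
    if (JSON.stringify(prev) !== JSON.stringify(now)) {
      diffs.push({ index: i, prev, now });
    }
  }
  return diffs;
}

// Made-up snapshots: the resumed session gained an extra block in message 0
// and one additional message at index 2.
const saved = [
  { role: "user", content: [{ type: "text", text: "hello" }] },
  { role: "assistant", content: [{ type: "text", text: "hi" }] },
];
const resumed = [
  { role: "user", content: [
    { type: "text", text: "<system-reminder>...</system-reminder>" },
    { type: "text", text: "hello" },
  ] },
  { role: "assistant", content: [{ type: "text", text: "hi" }] },
  { role: "user", content: [{ type: "text", text: "next" }] },
];

const prefixDiffs = diffPrefixes(saved, resumed);
console.log(prefixDiffs.map((d) => d.index)); // [ 0, 2 ]
```

Because the comparison works on serialized strings, two blocks with the same fields in a different property order would be reported as different. That is likely acceptable for a diagnostic whose purpose is to spot byte-level changes in the cached prefix.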
|
|
603
|
+
|
|
604
|
+
// --------------------------------------------------------------------------
|
|
605
|
+
// Fetch interceptor
|
|
606
|
+
// --------------------------------------------------------------------------
|
|
331
607
|
|
|
332
608
|
const _origFetch = globalThis.fetch;
|
|
333
609
|
|
|
@@ -339,23 +615,27 @@ globalThis.fetch = async function (url, options) {
|
|
|
339
615
|
!urlStr.includes("batches") &&
|
|
340
616
|
!urlStr.includes("count_tokens");
|
|
341
617
|
|
|
342
|
-
if (
|
|
343
|
-
isMessagesEndpoint &&
|
|
344
|
-
options?.body &&
|
|
345
|
-
typeof options.body === "string"
|
|
346
|
-
) {
|
|
618
|
+
if (isMessagesEndpoint && options?.body && typeof options.body === "string") {
|
|
347
619
|
try {
|
|
348
620
|
const payload = JSON.parse(options.body);
|
|
349
621
|
let modified = false;
|
|
350
622
|
|
|
623
|
+
// One-time GrowthBook flag dump on first API call
|
|
624
|
+
dumpGrowthBookFlags();
|
|
625
|
+
|
|
351
626
|
debugLog("--- API call to", urlStr);
|
|
352
627
|
debugLog("message count:", payload.messages?.length);
|
|
353
628
|
|
|
354
|
-
//
|
|
629
|
+
// Detect synthetic model (false rate limiter, B3)
|
|
630
|
+
if (payload.model === "<synthetic>") {
|
|
631
|
+
debugLog("FALSE RATE LIMIT: synthetic model detected — client-side rate limit, no real API call");
|
|
632
|
+
}
|
|
633
|
+
|
|
634
|
+
// Bug 1: Relocate resume attachment blocks
|
|
355
635
|
if (payload.messages) {
|
|
636
|
+
// Log message structure for debugging
|
|
356
637
|
if (DEBUG) {
|
|
357
|
-
let firstUserIdx = -1;
|
|
358
|
-
let lastUserIdx = -1;
|
|
638
|
+
let firstUserIdx = -1, lastUserIdx = -1;
|
|
359
639
|
for (let i = 0; i < payload.messages.length; i++) {
|
|
360
640
|
if (payload.messages[i].role === "user") {
|
|
361
641
|
if (firstUserIdx === -1) firstUserIdx = i;
|
|
@@ -365,39 +645,20 @@ globalThis.fetch = async function (url, options) {
|
|
|
365
645
|
if (firstUserIdx !== -1) {
|
|
366
646
|
const firstContent = payload.messages[firstUserIdx].content;
|
|
367
647
|
const lastContent = payload.messages[lastUserIdx].content;
|
|
368
|
-
debugLog(
|
|
369
|
-
|
|
370
|
-
firstUserIdx,
|
|
371
|
-
"lastUserIdx:",
|
|
372
|
-
lastUserIdx
|
|
373
|
-
);
|
|
374
|
-
debugLog(
|
|
375
|
-
"first user msg blocks:",
|
|
376
|
-
Array.isArray(firstContent) ? firstContent.length : "string"
|
|
377
|
-
);
|
|
648
|
+
debugLog("firstUserIdx:", firstUserIdx, "lastUserIdx:", lastUserIdx);
|
|
649
|
+
debugLog("first user msg blocks:", Array.isArray(firstContent) ? firstContent.length : "string");
|
|
378
650
|
if (Array.isArray(firstContent)) {
|
|
379
651
|
for (const b of firstContent) {
|
|
380
652
|
const t = (b.text || "").substring(0, 80);
|
|
381
|
-
debugLog(
|
|
382
|
-
" first[block]:",
|
|
383
|
-
isRelocatableBlock(b.text) ? "RELOCATABLE" : "keep",
|
|
384
|
-
JSON.stringify(t)
|
|
385
|
-
);
|
|
653
|
+
debugLog(" first[block]:", isRelocatableBlock(b.text) ? "RELOCATABLE" : "keep", JSON.stringify(t));
|
|
386
654
|
}
|
|
387
655
|
}
|
|
388
656
|
if (firstUserIdx !== lastUserIdx) {
|
|
389
|
-
debugLog(
|
|
390
|
-
"last user msg blocks:",
|
|
391
|
-
Array.isArray(lastContent) ? lastContent.length : "string"
|
|
392
|
-
);
|
|
657
|
+
debugLog("last user msg blocks:", Array.isArray(lastContent) ? lastContent.length : "string");
|
|
393
658
|
if (Array.isArray(lastContent)) {
|
|
394
659
|
for (const b of lastContent) {
|
|
395
660
|
const t = (b.text || "").substring(0, 80);
|
|
396
|
-
debugLog(
|
|
397
|
-
" last[block]:",
|
|
398
|
-
isRelocatableBlock(b.text) ? "RELOCATABLE" : "keep",
|
|
399
|
-
JSON.stringify(t)
|
|
400
|
-
);
|
|
661
|
+
debugLog(" last[block]:", isRelocatableBlock(b.text) ? "RELOCATABLE" : "keep", JSON.stringify(t));
|
|
401
662
|
}
|
|
402
663
|
}
|
|
403
664
|
} else {
|
|
@@ -412,13 +673,28 @@ globalThis.fetch = async function (url, options) {
|
|
|
412
673
|
modified = true;
|
|
413
674
|
debugLog("APPLIED: resume message relocation");
|
|
414
675
|
} else {
|
|
676
|
+
debugLog("SKIPPED: resume relocation (not a resume or already correct)");
|
|
677
|
+
}
|
|
678
|
+
}
|
|
679
|
+
|
|
680
|
+
// Image stripping: remove old tool_result images to reduce token waste
|
|
681
|
+
if (payload.messages && IMAGE_KEEP_LAST > 0) {
|
|
682
|
+
const { messages: imgStripped, stats: imgStats } = stripOldToolResultImages(
|
|
683
|
+
payload.messages, IMAGE_KEEP_LAST
|
|
684
|
+
);
|
|
685
|
+
if (imgStats) {
|
|
686
|
+
payload.messages = imgStripped;
|
|
687
|
+
modified = true;
|
|
415
688
|
debugLog(
|
|
416
|
-
|
|
689
|
+
`APPLIED: stripped ${imgStats.strippedCount} images from old tool results`,
|
|
690
|
+
`(~${imgStats.strippedBytes} base64 bytes, ~${imgStats.estimatedTokens} tokens saved)`
|
|
417
691
|
);
|
|
692
|
+
} else {
|
|
693
|
+
debugLog("SKIPPED: image stripping (no old images found or not enough turns)");
|
|
418
694
|
}
|
|
419
695
|
}
|
|
420
696
|
|
|
421
|
-
// Bug
|
|
697
|
+
// Bug 2a: Stabilize tool ordering
|
|
422
698
|
if (payload.tools) {
|
|
423
699
|
const sorted = stabilizeToolOrder(payload.tools);
|
|
424
700
|
const changed = sorted.some(
|
|
@@ -431,7 +707,7 @@ globalThis.fetch = async function (url, options) {
|
|
|
431
707
|
}
|
|
432
708
|
}
|
|
433
709
|
|
|
434
|
-
// Bug
|
|
710
|
+
// Bug 2b: Stabilize fingerprint in attribution header
|
|
435
711
|
if (payload.system && payload.messages) {
|
|
436
712
|
const fix = stabilizeFingerprint(payload.system, payload.messages);
|
|
437
713
|
if (fix) {
|
|
@@ -441,12 +717,7 @@ globalThis.fetch = async function (url, options) {
|
|
|
441
717
|
text: fix.newText,
|
|
442
718
|
};
|
|
443
719
|
modified = true;
|
|
444
|
-
debugLog(
|
|
445
|
-
"APPLIED: fingerprint stabilized from",
|
|
446
|
-
fix.oldFingerprint,
|
|
447
|
-
"to",
|
|
448
|
-
fix.stableFingerprint
|
|
449
|
-
);
|
|
720
|
+
debugLog("APPLIED: fingerprint stabilized from", fix.oldFingerprint, "to", fix.stableFingerprint);
|
|
450
721
|
}
|
|
451
722
|
}
|
|
452
723
|
|
|
@@ -454,11 +725,48 @@ globalThis.fetch = async function (url, options) {
|
|
|
454
725
|
options = { ...options, body: JSON.stringify(payload) };
|
|
455
726
|
debugLog("Request body rewritten");
|
|
456
727
|
}
|
|
728
|
+
|
|
729
|
+
// Monitor for microcompact / budget enforcement degradation
|
|
730
|
+
if (payload.messages) {
|
|
731
|
+
monitorContextDegradation(payload.messages);
|
|
732
|
+
}
|
|
733
|
+
|
|
734
|
+
// Capture prefix snapshot for cross-process diff analysis
|
|
735
|
+
snapshotPrefix(payload);
|
|
736
|
+
|
|
457
737
|
} catch (e) {
|
|
458
738
|
debugLog("ERROR in interceptor:", e?.message);
|
|
459
739
|
// Parse failure — pass through unmodified
|
|
460
740
|
}
|
|
461
741
|
}
|
|
462
742
|
|
|
463
|
-
|
|
743
|
+
const response = await _origFetch.apply(this, [url, options]);
|
|
744
|
+
|
|
745
|
+
// Extract quota utilization from response headers and save for hooks/MCP
|
|
746
|
+
if (isMessagesEndpoint) {
|
|
747
|
+
try {
|
|
748
|
+
const h5 = response.headers.get("anthropic-ratelimit-unified-5h-utilization");
|
|
749
|
+
const h7d = response.headers.get("anthropic-ratelimit-unified-7d-utilization");
|
|
750
|
+
const reset5h = response.headers.get("anthropic-ratelimit-unified-5h-reset");
|
|
751
|
+
const reset7d = response.headers.get("anthropic-ratelimit-unified-7d-reset");
|
|
752
|
+
const status = response.headers.get("anthropic-ratelimit-unified-status");
|
|
753
|
+
const overage = response.headers.get("anthropic-ratelimit-unified-overage-status");
|
|
754
|
+
|
|
755
|
+
if (h5 || h7d) {
|
|
756
|
+
const quota = {
|
|
757
|
+
timestamp: new Date().toISOString(),
|
|
758
|
+
five_hour: h5 ? { utilization: parseFloat(h5), pct: Math.round(parseFloat(h5) * 100), resets_at: reset5h ? parseInt(reset5h, 10) : null } : null,
|
|
759
|
+
seven_day: h7d ? { utilization: parseFloat(h7d), pct: Math.round(parseFloat(h7d) * 100), resets_at: reset7d ? parseInt(reset7d, 10) : null } : null,
|
|
760
|
+
status: status || null,
|
|
761
|
+
overage_status: overage || null,
|
|
762
|
+
};
|
|
763
|
+
const quotaFile = join(homedir(), ".claude", "quota-status.json");
|
|
764
|
+
writeFileSync(quotaFile, JSON.stringify(quota, null, 2));
|
|
765
|
+
}
|
|
766
|
+
} catch {
|
|
767
|
+
// Non-critical — don't break the response
|
|
768
|
+
}
|
|
769
|
+
}
|
|
770
|
+
|
|
771
|
+
return response;
|
|
464
772
|
};
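The quota extraction at the end of the interceptor can be exercised without a live API call. In this minimal sketch `parseQuotaWindow` is a hypothetical helper, and the header values are made-up samples rather than captured responses. It follows the code above in treating the utilization header as a fraction that is multiplied by 100 to get a percentage. The global `Headers` class is assumed to be available (Node.js >= 18).

```javascript
// Hypothetical helper: parse one rate-limit window ("5h" or "7d") from headers.
function parseQuotaWindow(headers, windowName) {
  const util = headers.get(`anthropic-ratelimit-unified-${windowName}-utilization`);
  if (util === null) return null; // header absent: no data for this window
  const utilization = Number.parseFloat(util);
  if (Number.isNaN(utilization)) return null; // unparseable value
  const reset = headers.get(`anthropic-ratelimit-unified-${windowName}-reset`);
  const resetsAt = reset === null ? null : Number.parseInt(reset, 10);
  return {
    utilization,
    pct: Math.round(utilization * 100),
    resets_at: Number.isNaN(resetsAt) ? null : resetsAt,
  };
}

// Made-up sample headers: only the 5-hour window is present.
const sampleHeaders = new Headers({
  "anthropic-ratelimit-unified-5h-utilization": "0.42",
  "anthropic-ratelimit-unified-5h-reset": "1775000000",
});

const fiveHour = parseQuotaWindow(sampleHeaders, "5h");
const sevenDay = parseQuotaWindow(sampleHeaders, "7d");
console.log(fiveHour, sevenDay);
```

Parsing each header once and checking for `NaN` keeps a malformed value from being written to `quota-status.json` as `NaN`, which `JSON.stringify` would serialize as `null` without any warning.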
|