claude-code-cache-fix 1.7.0 → 1.7.1

package/README.md CHANGED
@@ -70,6 +70,31 @@ NODE_OPTIONS="--import claude-code-cache-fix" claude
 
  > **Note**: This only works if `claude` points to the npm/Node installation. The standalone binary uses a different execution path that bypasses Node.js preloads.
 
+ ### Windows users
+
+ On Windows, `NODE_OPTIONS="--import ..."` doesn't work the same way as on Linux/macOS. Use the included `claude-fixed.bat` wrapper instead:
+
+ 1. After installing both packages globally:
+    ```bat
+    npm install -g claude-code-cache-fix
+    npm install -g @anthropic-ai/claude-code
+    ```
+
+ 2. Copy `claude-fixed.bat` from this package to a directory in your PATH (e.g., `C:\Users\<you>\bin\`):
+    ```bat
+    copy "%NPM_ROOT%\claude-code-cache-fix\claude-fixed.bat" C:\Users\%USERNAME%\bin\
+    ```
+    Or find the file manually at your npm global root (run `npm root -g` to locate it).
+
+ 3. Run Claude Code with the interceptor active:
+    ```bat
+    claude-fixed [any claude args...]
+    ```
+
+ The wrapper dynamically resolves your npm global root, constructs a `file:///` URL for the preload module (converting backslashes to forward slashes for Node.js), and launches Claude Code with the interceptor loaded. All environment variables (`CACHE_FIX_DEBUG`, `CACHE_FIX_IMAGE_KEEP_LAST`, etc.) work the same as on Linux/macOS.
+
+ Credit: [@TomTheMenace](https://github.com/anthropics/claude-code/issues/38335) contributed the Windows wrapper and validated the interceptor across a 7.5-hour, 536-call Opus 4.6 session on Windows — 98.4% cache hit rate, 81% of calls had fingerprint instability that the interceptor corrected.
+
  ## How it works
 
  The module intercepts `globalThis.fetch` before Claude Code makes API calls to `/v1/messages`. On each call it:
@@ -362,6 +387,7 @@ measurable signature of cache-efficiency degradation.
  - **[@ArkNill](https://github.com/ArkNill)** — Microcompact mechanism analysis, GrowthBook flag documentation, false rate limiter identification
  - **[@Renvect](https://github.com/Renvect)** — Image duplication discovery, cross-project directory contamination analysis
  - **[@fgrosswig](https://github.com/fgrosswig)** — [claude-usage-dashboard](https://github.com/fgrosswig/claude-usage-dashboard) forensic methodology: cost-factor overhead ratio metric, `anthropic-*` header capture pattern, proxy NDJSON schema that informed our dashboard interop layer
+ - **[@TomTheMenace](https://github.com/TomTheMenace)** — Windows `.bat` wrapper for the interceptor, first Windows platform validation (7.5h/536-call Opus 4.6 session, 98.4% cache hit rate, 81% fingerprint instability corrected)
 
  If you contributed to the community effort on these issues and aren't listed here, please open an issue or PR — we want to credit everyone properly.
 
@@ -0,0 +1,22 @@
+ @echo off
+ REM claude-fixed.bat — Windows wrapper for Claude Code with cache-fix interceptor.
+ REM
+ REM Resolves the npm global root dynamically, constructs a file:/// URL for the
+ REM preload module (converting backslashes to forward slashes for Node.js), and
+ REM launches Claude Code with the interceptor active.
+ REM
+ REM Usage:
+ REM   claude-fixed [any claude args...]
+ REM
+ REM Prerequisites:
+ REM   npm install -g claude-code-cache-fix
+ REM   npm install -g @anthropic-ai/claude-code
+ REM
+ REM Save this file somewhere in your PATH (e.g. C:\Users\<you>\bin\claude-fixed.bat).
+ REM
+ REM Credit: @TomTheMenace (https://github.com/anthropics/claude-code/issues/38335)
+ REM Part of claude-code-cache-fix: https://github.com/cnighswonger/claude-code-cache-fix
+
+ for /f "delims=" %%G in ('npm root -g') do set "NPM_GLOBAL=%%G"
+ set NODE_OPTIONS=--import file:///%NPM_GLOBAL:\=/%/claude-code-cache-fix/preload.mjs
+ node "%NPM_GLOBAL%\@anthropic-ai\claude-code\cli.js" %*
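
Reviewer's note: the path-to-URL step in the wrapper above can be checked without a Windows shell. The sketch below mirrors the batch substitution that swaps every backslash in the npm global root for a forward slash and appends the preload path. The function name `preload_url` and the sample path are illustrative assumptions, not part of the package. Like the batch file, it does no percent-encoding, so a global root that contains spaces would probably need quoting before it is safe inside `NODE_OPTIONS`.

```python
def preload_url(npm_root: str) -> str:
    """Build the file:/// URL that the wrapper hands to `node --import`.

    Mirrors the batch substring substitution on NPM_GLOBAL: backslashes
    become forward slashes and nothing is percent-encoded.
    """
    forward = npm_root.replace("\\", "/")
    return f"file:///{forward}/claude-code-cache-fix/preload.mjs"


if __name__ == "__main__":
    # Hypothetical npm global root, used only for illustration.
    sample = r"C:\Users\me\AppData\Roaming\npm\node_modules"
    print(preload_url(sample))
```

Comparing the printed URL with the `NODE_OPTIONS` value that `claude-fixed.bat` sets on a real machine is a quick way to confirm the wrapper resolved the expected install.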
package/package.json CHANGED
@@ -1,13 +1,14 @@
  {
    "name": "claude-code-cache-fix",
-   "version": "1.7.0",
+   "version": "1.7.1",
    "description": "Fixes prompt cache regression in Claude Code that causes up to 20x cost increase on resumed sessions",
    "type": "module",
    "exports": "./preload.mjs",
    "main": "./preload.mjs",
    "files": [
      "preload.mjs",
-     "tools/"
+     "tools/",
+     "claude-fixed.bat"
    ],
    "engines": {
      "node": ">=18"
@@ -0,0 +1,304 @@
+ #!/usr/bin/env bash
+ # cross-version-cache-test — replicable cache-behavior test across installed Claude Code versions.
+ #
+ # What it tests:
+ #   Phase A (always): per-version steady-state cache behavior via 5 sequential Haiku -p calls,
+ #     fired within seconds of each other. Captures:
+ #       - Turn 1 prefix size (cache_creation cold start)
+ #       - Turns 2-5 cache hit stability (should be ~100% cache_read if TTL holds)
+ #       - Per-turn q5h_pct delta
+ #       - TTL tier granted by server
+ #   Phase B (optional, --include-idle): per-version idle-gap behavior via two calls 6 minutes apart.
+ #     Captures whether the 1h TTL grant holds across a >5-minute idle, or whether
+ #     the server flips to 5m tier and forces a rebuild.
+ #
+ # Safety:
+ #   - Uses Haiku exclusively (~$0.006/call at Haiku 4.5 rates; full test at ~30 calls = ~$0.20)
+ #   - No deliberate quota burn; exits gracefully if Q5h > 80% at start
+ #   - Runs against fixed seed prompt to keep per-call overhead minimal
+ #   - Does not trigger overage, does not pin quota state for the session
+ #
+ # Usage:
+ #   ./cross-version-cache-test.sh # Phase A only, quick
+ #   ./cross-version-cache-test.sh --include-idle # Phase A + Phase B (takes ~25 minutes)
+ #   ./cross-version-cache-test.sh --output /some/path # Custom output dir
+ #
+ # Output:
+ #   /tmp/cross-version-test-YYYYMMDD-HHMMSS/ (default) containing:
+ #     - <version>-phase-a.jsonl # one usage.jsonl record per call
+ #     - <version>-phase-b.jsonl # optional, only with --include-idle
+ #     - summary.md # tabulated comparison across versions
+ #     - raw-quota-status-*.json # quota state snapshots
+ #
+ # Part of claude-code-cache-fix. Requires:
+ #   - ~/bin/cc-version launcher (see repo)
+ #   - Installed versions at ~/cc-versions/<version>/ (this script checks and warns)
+ #   - Interceptor active (the script verifies usage.jsonl grows per call)
+ #
+ # First created 2026-04-11 for the March 23 regression investigation follow-up.
+
+ set -euo pipefail
+
+ # ─── Configuration ──────────────────────────────────────────────────────────
+
+ VERSIONS=(2.1.81 2.1.83 2.1.90 2.1.101)
+ STEADY_STATE_TURNS=5
+ IDLE_GAP_SECONDS=360 # 6 minutes, crosses the 5m TTL boundary
+ SEED_PROMPT='Reply with exactly: ok'
+ MODEL='haiku'
+
+ # ─── CLI parsing ────────────────────────────────────────────────────────────
+
+ INCLUDE_IDLE=0
+ OUTPUT_DIR=""
+
+ while [[ $# -gt 0 ]]; do
+   case "$1" in
+     --include-idle) INCLUDE_IDLE=1; shift ;;
+     --output) OUTPUT_DIR="$2"; shift 2 ;;
+     -h|--help)
+       sed -n '3,34p' "$0" | sed 's/^# \?//'
+       exit 0
+       ;;
+     *)
+       echo "unknown flag: $1" >&2
+       exit 1
+       ;;
+   esac
+ done
+
+ # Default output dir
+ if [[ -z "$OUTPUT_DIR" ]]; then
+   OUTPUT_DIR="/tmp/cross-version-test-$(date +%Y%m%d-%H%M%S)"
+ fi
+
+ mkdir -p "$OUTPUT_DIR"
+ SUMMARY="$OUTPUT_DIR/summary.md"
+ echo "# Cross-Version Cache Test — $(date -u +%Y-%m-%dT%H:%M:%SZ)" > "$SUMMARY"
+ echo "" >> "$SUMMARY"
+ echo "Output directory: \`$OUTPUT_DIR\`" >> "$SUMMARY"
+ echo "" >> "$SUMMARY"
+
+ # ─── Preflight ──────────────────────────────────────────────────────────────
+
+ echo "=== Cross-version cache test ===" | tee -a "$SUMMARY"
+
+ # Check launcher
+ if [[ ! -x "$HOME/bin/cc-version" ]]; then
+   echo "ERROR: $HOME/bin/cc-version not found or not executable" >&2
+   exit 1
+ fi
+
+ # Check installed versions
+ for v in "${VERSIONS[@]}"; do
+   if [[ ! -f "$HOME/cc-versions/$v/node_modules/@anthropic-ai/claude-code/cli.js" ]]; then
+     echo "ERROR: v$v not installed at ~/cc-versions/$v — run the install snippet in docs/march-23-regression-investigation.md" >&2
+     exit 1
+   fi
+ done
+
+ # Quota safety check — abort if Q5h is already high
+ Q5H=$(python3 -c "
+ import json
+ try:
+     q = json.load(open('$HOME/.claude/quota-status.json'))
+     print(q['five_hour']['pct'])
+ except Exception:
+     print(0)
+ " 2>/dev/null || echo 0)
+
+ if [[ "$Q5H" -gt 80 ]]; then
+   echo "ABORT: Q5h is at ${Q5H}% — too close to cap. Test deferred." | tee -a "$SUMMARY"
+   exit 2
+ fi
+
+ echo "Preflight OK: Q5h at ${Q5H}%, 4 versions installed, launcher present." | tee -a "$SUMMARY"
+ echo "" | tee -a "$SUMMARY"
+
+ # Snapshot quota state at start
+ cp "$HOME/.claude/quota-status.json" "$OUTPUT_DIR/raw-quota-status-start.json" 2>/dev/null || true
+
+ # ─── Phase A: steady-state per version ─────────────────────────────────────
+
+ echo "## Phase A — Steady-state" | tee -a "$SUMMARY"
+ echo "" | tee -a "$SUMMARY"
+ echo "5 sequential Haiku calls per version, fired in quick succession (<30s gap each)." | tee -a "$SUMMARY"
+ echo "" | tee -a "$SUMMARY"
+
+ for v in "${VERSIONS[@]}"; do
+   echo "--- Phase A: v$v ---"
+   OUTFILE="$OUTPUT_DIR/$v-phase-a.jsonl"
+   : > "$OUTFILE"
+
+   for i in $(seq 1 "$STEADY_STATE_TURNS"); do
+     USAGE_LINES_BEFORE=$(wc -l < "$HOME/.claude/usage.jsonl" 2>/dev/null || echo 0)
+     echo "$SEED_PROMPT" | "$HOME/bin/cc-version" "$v" -p --model "$MODEL" > /dev/null 2>&1 || {
+       echo "WARNING: v$v turn $i failed" | tee -a "$SUMMARY"
+       continue
+     }
+     USAGE_LINES_AFTER=$(wc -l < "$HOME/.claude/usage.jsonl" 2>/dev/null || echo 0)
+     if [[ "$USAGE_LINES_AFTER" -gt "$USAGE_LINES_BEFORE" ]]; then
+       # Capture the newly-added usage.jsonl line(s) for this version
+       tail -n "$((USAGE_LINES_AFTER - USAGE_LINES_BEFORE))" "$HOME/.claude/usage.jsonl" >> "$OUTFILE"
+     fi
+     # Tiny sleep to let the interceptor finish writing the telemetry
+     sleep 0.5
+   done
+
+   TURNS_CAPTURED=$(wc -l < "$OUTFILE")
+   echo " v$v: $TURNS_CAPTURED turns captured → $OUTFILE"
+ done
+
+ echo "" | tee -a "$SUMMARY"
+
+ # ─── Phase B: idle-gap (optional) ──────────────────────────────────────────
+
+ if [[ "$INCLUDE_IDLE" -eq 1 ]]; then
+   echo "## Phase B — Idle-gap behavior" | tee -a "$SUMMARY"
+   echo "" | tee -a "$SUMMARY"
+   echo "Per version: turn 1, wait ${IDLE_GAP_SECONDS}s (crosses 5m TTL), turn 2." | tee -a "$SUMMARY"
+   echo "" | tee -a "$SUMMARY"
+
+   for v in "${VERSIONS[@]}"; do
+     echo "--- Phase B: v$v ---"
+     OUTFILE="$OUTPUT_DIR/$v-phase-b.jsonl"
+     : > "$OUTFILE"
+
+     # Turn 1
+     USAGE_LINES_BEFORE=$(wc -l < "$HOME/.claude/usage.jsonl" 2>/dev/null || echo 0)
+     echo "$SEED_PROMPT" | "$HOME/bin/cc-version" "$v" -p --model "$MODEL" > /dev/null 2>&1 || true
+     USAGE_LINES_AFTER=$(wc -l < "$HOME/.claude/usage.jsonl" 2>/dev/null || echo 0)
+     if [[ "$USAGE_LINES_AFTER" -gt "$USAGE_LINES_BEFORE" ]]; then
+       tail -n "$((USAGE_LINES_AFTER - USAGE_LINES_BEFORE))" "$HOME/.claude/usage.jsonl" >> "$OUTFILE"
+     fi
+     echo " v$v: turn 1 done, waiting ${IDLE_GAP_SECONDS}s..."
+
+     sleep "$IDLE_GAP_SECONDS"
+
+     # Turn 2
+     USAGE_LINES_BEFORE=$(wc -l < "$HOME/.claude/usage.jsonl" 2>/dev/null || echo 0)
+     echo "$SEED_PROMPT" | "$HOME/bin/cc-version" "$v" -p --model "$MODEL" > /dev/null 2>&1 || true
+     USAGE_LINES_AFTER=$(wc -l < "$HOME/.claude/usage.jsonl" 2>/dev/null || echo 0)
+     if [[ "$USAGE_LINES_AFTER" -gt "$USAGE_LINES_BEFORE" ]]; then
+       tail -n "$((USAGE_LINES_AFTER - USAGE_LINES_BEFORE))" "$HOME/.claude/usage.jsonl" >> "$OUTFILE"
+     fi
+     echo " v$v: turn 2 done"
+   done
+
+   echo "" | tee -a "$SUMMARY"
+ fi
+
+ # Snapshot quota state at end
+ cp "$HOME/.claude/quota-status.json" "$OUTPUT_DIR/raw-quota-status-end.json" 2>/dev/null || true
+
+ # ─── Analysis ──────────────────────────────────────────────────────────────
+
+ echo "## Phase A Results" >> "$SUMMARY"
+ echo "" >> "$SUMMARY"
+
+ python3 <<EOF >> "$SUMMARY"
+ import json, os
+
+ output_dir = "$OUTPUT_DIR"
+ versions = ["2.1.81", "2.1.83", "2.1.90", "2.1.101"]
+ include_idle = $INCLUDE_IDLE
+
+ def load_jsonl(path):
+     if not os.path.exists(path):
+         return []
+     rows = []
+     with open(path) as f:
+         for line in f:
+             line = line.strip()
+             if line:
+                 try:
+                     rows.append(json.loads(line))
+                 except Exception:
+                     pass
+     return rows
+
+ # Phase A steady-state table
+ print("### Per-version per-turn usage (Phase A)")
+ print("")
+ print("| Version | Turn | cc (creation) | cr (read) | prefix | out | ttl | q5h% |")
+ print("|---|---:|---:|---:|---:|---:|---|---:|")
+
+ for v in versions:
+     rows = load_jsonl(os.path.join(output_dir, f"{v}-phase-a.jsonl"))
+     for i, r in enumerate(rows, 1):
+         cc = r.get("cache_creation_input_tokens", 0)
+         cr = r.get("cache_read_input_tokens", 0)
+         prefix = cc + cr
+         out = r.get("output_tokens", 0)
+         ttl = r.get("ttl_tier", "?")
+         q5h = r.get("q5h_pct", "?")
+         print(f"| v{v} | {i} | {cc:>6,} | {cr:>6,} | {prefix:>6,} | {out:>3} | {ttl} | {q5h}% |")
+
+ print("")
+
+ # Steady-state summary: turn-2-onwards averages
+ print("### Steady-state averages (turns 2-5)")
+ print("")
+ print("| Version | avg prefix | avg cc | avg cr | cache hit rate | Turn 1 cold cc | q5h delta turn 1→5 |")
+ print("|---|---:|---:|---:|---:|---:|---:|")
+ for v in versions:
+     rows = load_jsonl(os.path.join(output_dir, f"{v}-phase-a.jsonl"))
+     if len(rows) < 2:
+         print(f"| v{v} | (insufficient data) | | | | | |")
+         continue
+     turn1 = rows[0]
+     tail = rows[1:]
+     avg_prefix = sum((r.get("cache_creation_input_tokens",0) + r.get("cache_read_input_tokens",0)) for r in tail) / len(tail)
+     avg_cc = sum(r.get("cache_creation_input_tokens",0) for r in tail) / len(tail)
+     avg_cr = sum(r.get("cache_read_input_tokens",0) for r in tail) / len(tail)
+     hit_rate = avg_cr / avg_prefix if avg_prefix > 0 else 0
+     q5h_start = rows[0].get("q5h_pct", 0)
+     q5h_end = rows[-1].get("q5h_pct", 0)
+     q5h_delta = (q5h_end - q5h_start) if isinstance(q5h_start, (int, float)) and isinstance(q5h_end, (int, float)) else "?"
+     print(f"| v{v} | {avg_prefix:>7,.0f} | {avg_cc:>6,.0f} | {avg_cr:>6,.0f} | {hit_rate*100:.1f}% | {turn1.get('cache_creation_input_tokens',0):>7,} | {q5h_delta}% |")
+
+ print("")
+
+ if include_idle:
+     print("## Phase B Results (idle-gap behavior)")
+     print("")
+     print("| Version | Turn 1 prefix | Turn 1 ttl | idle (s) | Turn 2 cc | Turn 2 cr | Turn 2 ttl | rebuilt? |")
+     print("|---|---:|---|---:|---:|---:|---|:---:|")
+     for v in versions:
+         rows = load_jsonl(os.path.join(output_dir, f"{v}-phase-b.jsonl"))
+         if len(rows) < 2:
+             print(f"| v{v} | (incomplete) | | | | | | |")
+             continue
+         t1, t2 = rows[0], rows[1]
+         t1_prefix = t1.get("cache_creation_input_tokens",0) + t1.get("cache_read_input_tokens",0)
+         t2_cc = t2.get("cache_creation_input_tokens",0)
+         t2_cr = t2.get("cache_read_input_tokens",0)
+         # Idle gap we configured
+         idle_s = $IDLE_GAP_SECONDS
+         # Rebuilt = turn 2 had substantial cache_creation relative to turn 1 prefix
+         rebuilt = "✗ expired" if t2_cc > (t1_prefix * 0.5) else "✓ warm"
+         print(f"| v{v} | {t1_prefix:>7,} | {t1.get('ttl_tier','?')} | {idle_s} | {t2_cc:>7,} | {t2_cr:>7,} | {t2.get('ttl_tier','?')} | {rebuilt} |")
+     print("")
+
+ print("---")
+ print("")
+ print("*Generated by cross-version-cache-test.sh*")
+ EOF
+
+ echo ""
+ echo "=== Test complete ==="
+ echo "Summary written to: $SUMMARY"
+ echo ""
+ echo "Raw per-version JSONLs in: $OUTPUT_DIR"
+ echo ""
+ if [[ "$Q5H" -lt 50 ]]; then
+   NEW_Q5H=$(python3 -c "
+ import json
+ try:
+     print(json.load(open('$HOME/.claude/quota-status.json'))['five_hour']['pct'])
+ except Exception:
+     print('?')
+ " 2>/dev/null)
+   echo "Q5h at start: ${Q5H}%"
+   echo "Q5h at end: ${NEW_Q5H}%"
+ fi
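
Reviewer's note: the two derived columns in the generated `summary.md` (steady-state cache hit rate and the Phase B "rebuilt?" flag) can be reproduced by hand from the captured JSONL records. The sketch below repeats that arithmetic, assuming each record carries the `cache_creation_input_tokens` and `cache_read_input_tokens` fields that the script reads. The helper names and the sample token counts are made up for illustration and do not come from a real run.

```python
def steady_state_hit_rate(rows):
    """Cache hit rate over turns 2..N: cache_read tokens / total prefix tokens."""
    tail = rows[1:]  # turn 1 is the cold start, so it is excluded
    read = sum(r.get("cache_read_input_tokens", 0) for r in tail)
    created = sum(r.get("cache_creation_input_tokens", 0) for r in tail)
    prefix = read + created
    return read / prefix if prefix > 0 else 0.0


def rebuilt_after_idle(turn1, turn2, threshold=0.5):
    """True when turn 2 re-created more than `threshold` of turn 1's prefix."""
    t1_prefix = (turn1.get("cache_creation_input_tokens", 0)
                 + turn1.get("cache_read_input_tokens", 0))
    return turn2.get("cache_creation_input_tokens", 0) > t1_prefix * threshold


if __name__ == "__main__":
    # Illustrative records only; real values come from <version>-phase-a.jsonl.
    phase_a = [
        {"cache_creation_input_tokens": 12000, "cache_read_input_tokens": 0},
        {"cache_creation_input_tokens": 100, "cache_read_input_tokens": 11900},
        {"cache_creation_input_tokens": 0, "cache_read_input_tokens": 12000},
    ]
    print(f"hit rate: {steady_state_hit_rate(phase_a):.1%}")
    cold = {"cache_creation_input_tokens": 12000, "cache_read_input_tokens": 0}
    print("rebuilt:", rebuilt_after_idle(cold, cold))
```

Dividing summed reads by the summed prefix gives the same ratio as the script's `avg_cr / avg_prefix`, since both averages share the same denominator. The 0.5 threshold is taken from the script's own "rebuilt" heuristic.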