claude-code-cache-fix 3.0.0 → 3.0.1

Files changed (2)
  1. package/README.md +79 -5
  2. package/package.json +2 -2
package/README.md CHANGED
@@ -4,9 +4,76 @@
 
  English | [中文](./README.zh.md) | [한국어](./README.ko.md) | [Português](./docs/guia-pt-br.md)
 
- Fixes prompt cache regressions in [Claude Code](https://github.com/anthropics/claude-code) that cause **up to 20x cost increase** on resumed sessions, plus monitoring for silent context degradation. Confirmed through v2.1.112. Opus 4.7 compatible.
+ Cache optimization proxy and interceptor for [Claude Code](https://github.com/anthropics/claude-code). Fixes prompt cache bugs that cause excessive quota burn, stabilizes the request prefix, and monitors for silent regressions. Works with all CC versions including the v2.1.113+ Bun binary.
 
- > **Opus 4.7 advisory:** Our metered data shows 4.7 burns Q5h quota at **~2.4x the rate of 4.6** for equivalent visible token counts. Two factors: a new tokenizer (up to 35% more tokens, [documented](https://platform.claude.com/docs/en/about-claude/models/whats-new-claude-4-7)) and adaptive thinking overhead (~105%, not documented in usage response). Workaround: `CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING=1` (may reduce quality). Image stripping (`CACHE_FIX_IMAGE_KEEP_LAST`) is even more important on 4.7 due to high-res image support increasing image token counts. See [Discussion #25](https://github.com/cnighswonger/claude-code-cache-fix/discussions/25) for full analysis.
+ > **v3.0.0** adds a local HTTP proxy with hot-reloadable extensions. This is the recommended path for CC v2.1.113+ where the preload interceptor no longer works. A/B tested on v2.1.117: **95.5% cache hit rate through proxy vs 82.3% direct** on first warm turn. [Full release notes](https://github.com/cnighswonger/claude-code-cache-fix/releases/tag/v3.0.0)
+
+ > **Opus 4.7 advisory:** Metered data shows 4.7 burns Q5h quota at **~2.4x the rate of 4.6** for equivalent visible token counts. Two factors: a new tokenizer (up to 35% more tokens, [documented](https://platform.claude.com/docs/en/about-claude/models/whats-new-claude-4-7)) and adaptive thinking overhead (~105%, not documented in usage response). Workaround: `CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING=1` (may reduce quality). See [Discussion #25](https://github.com/cnighswonger/claude-code-cache-fix/discussions/25) for full analysis.
+
+ ## Quick Start: Proxy (recommended for CC v2.1.113+)
+
+ The proxy works with any CC version — Node.js or Bun binary. It sits between Claude Code and the Anthropic API, applying cache fixes as hot-reloadable extensions.
+
+ ```bash
+ # Install
+ npm install -g claude-code-cache-fix
+
+ # Start the proxy (runs on localhost:9801)
+ node "$(npm root -g)/claude-code-cache-fix/proxy/server.mjs" &
+
+ # Launch Claude Code through it
+ ANTHROPIC_BASE_URL=http://127.0.0.1:9801 claude
+ ```
+
+ That's it. The proxy applies all 7 cache-fix extensions automatically. No wrapper scripts, no `NODE_OPTIONS`, no preload.
+
+ ### What the proxy does
+
+ On every request passing through, 7 extensions run in order:
+
+ | Extension | What it fixes |
+ |-----------|--------------|
+ | `fingerprint-strip` | Removes unstable cc_version fingerprint from system prompt |
+ | `sort-stabilization` | Deterministic ordering of tool and MCP definitions |
+ | `ttl-management` | Detects server TTL tier, injects correct cache_control markers |
+ | `identity-normalization` | Normalizes message identity fields for prefix stability |
+ | `fresh-session-sort` | Fixes non-deterministic ordering on first turn |
+ | `cache-control-normalize` | Normalizes cache_control markers across messages |
+ | `cache-telemetry` | Extracts cache stats from response headers → `~/.claude/quota-status.json` |
+
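As a rough illustration of the `sort-stabilization` row above, the sketch below shows why a deterministic tool order matters for prefix caching: two requests that carry the same tools in a different order serialize to different bytes unless they are sorted first. This is not the package's extension code. The function name `stabilizeToolOrder` is hypothetical, and sorting by each tool's `name` field is an assumption about what "deterministic ordering" means here.

```javascript
// Hypothetical helper, NOT the package's actual sort-stabilization extension.
// Returns a copy of the request body whose `tools` array is sorted by name,
// so the serialized request prefix is identical from turn to turn.
function stabilizeToolOrder(body) {
  if (!Array.isArray(body.tools)) return body; // nothing to sort
  const tools = [...body.tools].sort((a, b) => {
    const x = String(a.name ?? "");
    const y = String(b.name ?? "");
    return x < y ? -1 : x > y ? 1 : 0; // code-unit comparison, locale-independent
  });
  return { ...body, tools };
}

// Same tools, different arrival order: after sorting they serialize identically.
const first = stabilizeToolOrder({ tools: [{ name: "Read" }, { name: "Bash" }] });
const second = stabilizeToolOrder({ tools: [{ name: "Bash" }, { name: "Read" }] });
console.log(JSON.stringify(first) === JSON.stringify(second)); // true
```

A plain code-unit comparison is used instead of `localeCompare` so that the order does not depend on the machine's locale settings.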
+ Extensions are hot-reloadable — add, remove, or modify `.mjs` files in `proxy/extensions/` and changes apply to the next request without restarting. Configuration in `proxy/extensions.json`.
+
+ ### Running as a service
+
+ For persistent use, run the proxy in the background:
+
+ ```bash
+ # Start in background with logging
+ nohup node "$(npm root -g)/claude-code-cache-fix/proxy/server.mjs" > /tmp/cache-fix-proxy.log 2>&1 &
+
+ # Add to your shell profile
+ echo 'export ANTHROPIC_BASE_URL=http://127.0.0.1:9801' >> ~/.bashrc
+ ```
+
+ ### Health check
+
+ ```bash
+ curl http://127.0.0.1:9801/health
+ # {"status":"ok"}
+ ```
+
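When the proxy is started from a script, it can be useful to wait until the health endpoint answers before launching Claude Code. The sketch below polls the `/health` path and checks for the `{"status":"ok"}` body shown above. The helper name `waitForProxy` and its `retries`, `delayMs`, and `fetchImpl` options are hypothetical and not part of the package. It assumes Node.js 18 or later for the global `fetch`.

```javascript
// Hypothetical helper; only the /health path and the {"status":"ok"} body
// come from the README. Resolves to true once the proxy reports healthy,
// or false after `retries` unsuccessful attempts.
async function waitForProxy(baseUrl, { retries = 10, delayMs = 500, fetchImpl = fetch } = {}) {
  for (let attempt = 0; attempt < retries; attempt++) {
    try {
      const res = await fetchImpl(`${baseUrl}/health`);
      if (res.ok) {
        const body = await res.json();
        if (body.status === "ok") return true;
      }
    } catch {
      // Request failed (for example, connection refused while the proxy starts).
    }
    if (attempt < retries - 1) {
      // Pause before the next attempt.
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return false;
}

// Example usage (left commented out so nothing runs on load):
// waitForProxy("http://127.0.0.1:9801").then((up) =>
//   console.log(up ? "proxy ready" : "proxy not reachable"));
```

Passing `fetchImpl` explicitly keeps the helper easy to exercise without a running proxy.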
+ ## Quick Start: Preload (for CC v2.1.112 and earlier)
+
+ If you're on a Node.js-based CC version (v2.1.112 or earlier), the preload interceptor still works and requires no proxy:
+
+ ```bash
+ npm install -g claude-code-cache-fix
+ NODE_OPTIONS="--import claude-code-cache-fix" claude
+ ```
+
+ > **Note:** The preload does NOT work on CC v2.1.113+ (Bun binary). Use the proxy path above.
+
+ See [Preload Setup Details](#preload-setup-details) below for wrapper scripts, shell aliases, and Windows instructions.
 
  ## Security model
 
@@ -34,7 +101,12 @@ Three bugs cause this:
 
  Additionally, images read via the Read tool persist as base64 in conversation history and are sent on every subsequent API call, compounding token costs silently.
 
- ## Installation
+ ## Preload Setup Details
+
+ <details>
+ <summary>Expand for preload interceptor setup (CC v2.1.112 and earlier only)</summary>
+
+ ### Installation
 
  Requires Node.js >= 18 and Claude Code installed via npm (not the standalone binary).
 
@@ -42,9 +114,9 @@ Requires Node.js >= 18 and Claude Code installed via npm (not the standalone bin
  npm install -g claude-code-cache-fix
  ```
 
- ## Usage
+ ### Usage
 
- The fix works as a Node.js preload module that intercepts API requests before they leave your machine.
+ The preload works as a Node.js module that intercepts API requests before they leave your machine.
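To make the interception idea concrete, here is a minimal sketch of wrapping a `fetch` function so that JSON bodies sent to `/v1/messages` are rewritten before the request goes out. It is an illustration only and may differ substantially from the package's real `preload.mjs`. The names `wrapFetch` and `transform` are hypothetical, and the sketch only handles request bodies passed as strings in `init.body`.

```javascript
// Hypothetical sketch, NOT the package's preload implementation.
// Returns a fetch-compatible function that passes JSON bodies bound for
// /v1/messages through `transform` and forwards everything else unchanged.
function wrapFetch(originalFetch, transform) {
  return async function patchedFetch(input, init = {}) {
    // `input` may be a string, a URL object, or a Request-like object.
    const url = typeof input === "string" ? input : input.url ?? String(input);
    if (url.includes("/v1/messages") && typeof init.body === "string") {
      try {
        const body = transform(JSON.parse(init.body));
        init = { ...init, body: JSON.stringify(body) };
      } catch {
        // If parsing or transforming fails, send the original request as-is.
      }
    }
    return originalFetch(input, init);
  };
}

// Example installation (commented out so loading this file has no side effects):
// globalThis.fetch = wrapFetch(globalThis.fetch, (body) => body);
```

Falling back to the untouched request on any error is a deliberate choice in this sketch: a broken rewrite should never block the API call itself.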
 
  ### Option A: Wrapper script (recommended)
 
@@ -184,6 +256,8 @@ Then set in VS Code `settings.json`:
 
  Credit: [@JEONG-JIWOO](https://github.com/JEONG-JIWOO) and [@X-15](https://github.com/X-15) for the VS Code extension investigation and C wrapper ([#16](https://github.com/cnighswonger/claude-code-cache-fix/issues/16)).
 
+ </details>
+
  ## How it works
 
  The module intercepts `globalThis.fetch` before Claude Code makes API calls to `/v1/messages`. On each call it:
package/package.json CHANGED
@@ -1,12 +1,12 @@
  {
  "name": "claude-code-cache-fix",
- "version": "3.0.0",
+ "version": "3.0.1",
  "description": "Cache optimization proxy and interceptor for Claude Code. Fixes prompt cache bugs, stabilizes prefix, reduces quota burn.",
  "type": "module",
  "exports": "./preload.mjs",
  "main": "./preload.mjs",
  "bin": {
- "claude-via-proxy": "./bin/claude-via-proxy.mjs"
+ "cache-fix-proxy": "./bin/claude-via-proxy.mjs"
  },
  "files": [
  "preload.mjs",