claude-code-cache-fix 2.0.6 → 3.0.1
- package/README.md +79 -5
- package/bin/claude-via-proxy.mjs +113 -0
- package/package.json +8 -3
- package/proxy/config.mjs +23 -0
- package/proxy/extensions/cache-control-normalize.mjs +59 -0
- package/proxy/extensions/cache-telemetry.mjs +24 -0
- package/proxy/extensions/fingerprint-strip.mjs +105 -0
- package/proxy/extensions/fresh-session-sort.mjs +188 -0
- package/proxy/extensions/identity-normalization.mjs +129 -0
- package/proxy/extensions/request-log.mjs +35 -0
- package/proxy/extensions/sort-stabilization.mjs +62 -0
- package/proxy/extensions/ttl-management.mjs +49 -0
- package/proxy/extensions.json +10 -0
- package/proxy/pipeline.mjs +96 -0
- package/proxy/server.mjs +168 -0
- package/proxy/stream.mjs +110 -0
- package/proxy/upstream.mjs +93 -0
- package/proxy/watcher.mjs +42 -0
- package/tools/MANUAL-COMPACT.md +41 -2
package/README.md
CHANGED
@@ -4,9 +4,76 @@
 
 English | [中文](./README.zh.md) | [한국어](./README.ko.md) | [Português](./docs/guia-pt-br.md)
 
-
+Cache optimization proxy and interceptor for [Claude Code](https://github.com/anthropics/claude-code). Fixes prompt cache bugs that cause excessive quota burn, stabilizes the request prefix, and monitors for silent regressions. Works with all CC versions including the v2.1.113+ Bun binary.
 
-> **
+> **v3.0.0** adds a local HTTP proxy with hot-reloadable extensions. This is the recommended path for CC v2.1.113+ where the preload interceptor no longer works. A/B tested on v2.1.117: **95.5% cache hit rate through proxy vs 82.3% direct** on first warm turn. [Full release notes →](https://github.com/cnighswonger/claude-code-cache-fix/releases/tag/v3.0.0)
+
+> **Opus 4.7 advisory:** Metered data shows 4.7 burns Q5h quota at **~2.4x the rate of 4.6** for equivalent visible token counts. Two factors: a new tokenizer (up to 35% more tokens, [documented](https://platform.claude.com/docs/en/about-claude/models/whats-new-claude-4-7)) and adaptive thinking overhead (~105%, not documented in usage response). Workaround: `CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING=1` (may reduce quality). See [Discussion #25](https://github.com/cnighswonger/claude-code-cache-fix/discussions/25) for full analysis.
+
+## Quick Start: Proxy (recommended for CC v2.1.113+)
+
+The proxy works with any CC version — Node.js or Bun binary. It sits between Claude Code and the Anthropic API, applying cache fixes as hot-reloadable extensions.
+
+```bash
+# Install
+npm install -g claude-code-cache-fix
+
+# Start the proxy (runs on localhost:9801)
+node "$(npm root -g)/claude-code-cache-fix/proxy/server.mjs" &
+
+# Launch Claude Code through it
+ANTHROPIC_BASE_URL=http://127.0.0.1:9801 claude
+```
+
+That's it. The proxy applies all 7 cache-fix extensions automatically. No wrapper scripts, no `NODE_OPTIONS`, no preload.
+
+### What the proxy does
+
+On every request passing through, 7 extensions run in order:
+
+| Extension | What it fixes |
+|-----------|--------------|
+| `fingerprint-strip` | Removes unstable cc_version fingerprint from system prompt |
+| `sort-stabilization` | Deterministic ordering of tool and MCP definitions |
+| `ttl-management` | Detects server TTL tier, injects correct cache_control markers |
+| `identity-normalization` | Normalizes message identity fields for prefix stability |
+| `fresh-session-sort` | Fixes non-deterministic ordering on first turn |
+| `cache-control-normalize` | Normalizes cache_control markers across messages |
+| `cache-telemetry` | Extracts cache stats from response headers → `~/.claude/quota-status.json` |
+
+Extensions are hot-reloadable — add, remove, or modify `.mjs` files in `proxy/extensions/` and changes apply to the next request without restarting. Configuration in `proxy/extensions.json`.
+
+### Running as a service
+
+For persistent use, run the proxy in the background:
+
+```bash
+# Start in background with logging
+nohup node "$(npm root -g)/claude-code-cache-fix/proxy/server.mjs" > /tmp/cache-fix-proxy.log 2>&1 &
+
+# Add to your shell profile
+echo 'export ANTHROPIC_BASE_URL=http://127.0.0.1:9801' >> ~/.bashrc
+```
+
+### Health check
+
+```bash
+curl http://127.0.0.1:9801/health
+# {"status":"ok"}
+```
+
+## Quick Start: Preload (for CC v2.1.112 and earlier)
+
+If you're on a Node.js-based CC version (v2.1.112 or earlier), the preload interceptor still works and requires no proxy:
+
+```bash
+npm install -g claude-code-cache-fix
+NODE_OPTIONS="--import claude-code-cache-fix" claude
+```
+
+> **Note:** The preload does NOT work on CC v2.1.113+ (Bun binary). Use the proxy path above.
+
+See [Preload Setup Details](#preload-setup-details) below for wrapper scripts, shell aliases, and Windows instructions.
 
 ## Security model
 
@@ -34,7 +101,12 @@ Three bugs cause this:
 
 Additionally, images read via the Read tool persist as base64 in conversation history and are sent on every subsequent API call, compounding token costs silently.
 
-##
+## Preload Setup Details
+
+<details>
+<summary>Expand for preload interceptor setup (CC v2.1.112 and earlier only)</summary>
+
+### Installation
 
 Requires Node.js >= 18 and Claude Code installed via npm (not the standalone binary).
 
@@ -42,9 +114,9 @@ Requires Node.js >= 18 and Claude Code installed via npm (not the standalone bin
 npm install -g claude-code-cache-fix
 ```
 
-
+### Usage
 
-The
+The preload works as a Node.js module that intercepts API requests before they leave your machine.
 
 ### Option A: Wrapper script (recommended)
 
@@ -184,6 +256,8 @@ Then set in VS Code `settings.json`:
 
 Credit: [@JEONG-JIWOO](https://github.com/JEONG-JIWOO) and [@X-15](https://github.com/X-15) for the VS Code extension investigation and C wrapper ([#16](https://github.com/cnighswonger/claude-code-cache-fix/issues/16)).
 
+</details>
+
 ## How it works
 
 The module intercepts `globalThis.fetch` before Claude Code makes API calls to `/v1/messages`. On each call it:
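The README change above says the extensions "run in order", and each extension file in this diff exports an object with a `name`, a numeric `order`, and an async hook such as `onRequest(ctx)`. As a rough illustration of how hooks of that shape could be applied in ascending `order`, here is a minimal sketch. The function `runRequestHooks` and the toy extensions are hypothetical; this is not the package's `pipeline.mjs`, whose contents are not shown in this section.

```javascript
// Hypothetical runner: applies each extension's onRequest hook in ascending `order`.
// Extensions without an onRequest hook (e.g. stream-only ones) are skipped.
async function runRequestHooks(extensions, ctx) {
  const sorted = [...extensions].sort((a, b) => a.order - b.order);
  for (const ext of sorted) {
    if (typeof ext.onRequest === "function") {
      await ext.onRequest(ctx);
    }
  }
  return ctx;
}

// Toy extensions (made up for this example) that record the order they ran in.
const tagA = { name: "tag-a", order: 200, async onRequest(ctx) { ctx.body.trace.push("a"); } };
const tagB = { name: "tag-b", order: 100, async onRequest(ctx) { ctx.body.trace.push("b"); } };
const observer = { name: "observer", order: 300, async onStreamEvent() {} };

runRequestHooks([tagA, tagB, observer], { body: { trace: [] }, meta: {} })
  .then((ctx) => console.log(ctx.body.trace.join(",")));
```

Sorting a copy rather than the input keeps the caller's array untouched, which would matter if the extension list is rebuilt on hot reload.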
package/bin/claude-via-proxy.mjs
ADDED
@@ -0,0 +1,113 @@
+#!/usr/bin/env node
+
+import { fork, spawn } from "node:child_process";
+import { fileURLToPath } from "node:url";
+import { dirname, resolve } from "node:path";
+import http from "node:http";
+
+const __dirname = dirname(fileURLToPath(import.meta.url));
+const SERVER_PATH = resolve(__dirname, "../proxy/server.mjs");
+
+const args = process.argv.slice(2);
+let proxyPort = 9801;
+let proxyUpstream = undefined;
+const claudeArgs = [];
+
+for (let i = 0; i < args.length; i++) {
+  if (args[i] === "--proxy-port" && args[i + 1]) {
+    proxyPort = parseInt(args[++i], 10);
+  } else if (args[i] === "--proxy-upstream" && args[i + 1]) {
+    proxyUpstream = args[++i];
+  } else {
+    claudeArgs.push(args[i]);
+  }
+}
+
+const proxyEnv = { ...process.env, CACHE_FIX_PROXY_PORT: String(proxyPort) };
+if (proxyUpstream) proxyEnv.CACHE_FIX_PROXY_UPSTREAM = proxyUpstream;
+
+const proxyProc = fork(SERVER_PATH, [], {
+  stdio: ["ignore", "pipe", "pipe", "ipc"],
+  env: proxyEnv,
+});
+
+let claudeProc = null;
+let exiting = false;
+
+function cleanup() {
+  if (exiting) return;
+  exiting = true;
+  if (claudeProc && !claudeProc.killed) claudeProc.kill("SIGTERM");
+  if (proxyProc && !proxyProc.killed) proxyProc.kill("SIGTERM");
+}
+
+proxyProc.on("exit", (code) => {
+  if (!exiting) {
+    process.stderr.write(`proxy exited unexpectedly (code ${code})\n`);
+    cleanup();
+    process.exit(1);
+  }
+});
+
+proxyProc.stderr.on("data", (chunk) => {
+  process.stderr.write(chunk);
+});
+
+function waitForReady() {
+  return new Promise((resolve, reject) => {
+    let output = "";
+    proxyProc.stdout.on("data", (chunk) => {
+      output += chunk.toString();
+      const match = output.match(/listening on ([\d.]+):(\d+)/);
+      if (match) resolve(parseInt(match[2], 10));
+    });
+    proxyProc.on("exit", (code) => {
+      reject(new Error(`Proxy exited (code ${code}) before ready`));
+    });
+    setTimeout(() => reject(new Error("Proxy failed to start within 10s")), 10000);
+  });
+}
+
+let actualPort;
+try {
+  actualPort = await waitForReady();
+} catch (err) {
+  process.stderr.write(`${err.message}\n`);
+  cleanup();
+  process.exit(1);
+}
+
+const claudeEnv = {
+  ...process.env,
+  ANTHROPIC_BASE_URL: `http://127.0.0.1:${actualPort}`,
+};
+
+const spawnOpts = { stdio: ["inherit", "pipe", "pipe"], env: claudeEnv };
+if (process.env.CACHE_FIX_CLAUDE_CMD) {
+  const parts = process.env.CACHE_FIX_CLAUDE_CMD.split(" ");
+  claudeProc = spawn(parts[0], [...parts.slice(1), ...claudeArgs], spawnOpts);
+} else {
+  claudeProc = spawn("claude", claudeArgs, spawnOpts);
+}
+
+claudeProc.stdout.on("data", (chunk) => process.stdout.write(chunk));
+claudeProc.stderr.on("data", (chunk) => process.stderr.write(chunk));
+
+claudeProc.on("error", (err) => {
+  if (err.code === "ENOENT") {
+    process.stderr.write("Error: 'claude' command not found. Is Claude Code installed?\n");
+  } else {
+    process.stderr.write(`Failed to start claude: ${err.message}\n`);
+  }
+  cleanup();
+  process.exit(1);
+});
+
+claudeProc.on("close", (code) => {
+  const exitCode = code ?? 0;
+  cleanup();
+  process.exit(exitCode);
+});
+
+process.on("SIGINT", () => { cleanup(); process.exit(130); });
+process.on("SIGTERM", () => { cleanup(); process.exit(143); });
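To make the wrapper's flag handling and readiness check easier to review, the sketch below restates both as pure functions. The helper names `splitWrapperArgs` and `parseListeningPort` are hypothetical, and the sample log line is an assumption: the proxy's actual startup message is not shown in this section, only the pattern the wrapper matches against.

```javascript
// Separates the wrapper's own flags from arguments passed through to `claude`.
// Mirrors the loop in bin/claude-via-proxy.mjs; the function name is hypothetical.
function splitWrapperArgs(argv) {
  let proxyPort = 9801;
  let proxyUpstream;
  const claudeArgs = [];
  for (let i = 0; i < argv.length; i++) {
    if (argv[i] === "--proxy-port" && argv[i + 1]) {
      proxyPort = parseInt(argv[++i], 10);
    } else if (argv[i] === "--proxy-upstream" && argv[i + 1]) {
      proxyUpstream = argv[++i];
    } else {
      // Anything else, including a trailing flag with no value, goes to claude.
      claudeArgs.push(argv[i]);
    }
  }
  return { proxyPort, proxyUpstream, claudeArgs };
}

// Extracts the port from proxy stdout using the same pattern as waitForReady().
function parseListeningPort(output) {
  const match = output.match(/listening on ([\d.]+):(\d+)/);
  return match ? parseInt(match[2], 10) : null;
}

console.log(splitWrapperArgs(["--proxy-port", "9900", "--resume", "abc"]));
// Assumed log line; only the "listening on host:port" part matters.
console.log(parseListeningPort("proxy listening on 127.0.0.1:9900\n"));
```

One detail worth noting from the diff itself: `CACHE_FIX_CLAUDE_CMD` is split on single spaces, so a command path that contains spaces would be broken into several arguments.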
package/package.json
CHANGED
@@ -1,15 +1,20 @@
 {
   "name": "claude-code-cache-fix",
-  "version": "
-  "description": "
+  "version": "3.0.1",
+  "description": "Cache optimization proxy and interceptor for Claude Code. Fixes prompt cache bugs, stabilizes prefix, reduces quota burn.",
   "type": "module",
   "exports": "./preload.mjs",
   "main": "./preload.mjs",
+  "bin": {
+    "cache-fix-proxy": "./bin/claude-via-proxy.mjs"
+  },
   "files": [
     "preload.mjs",
     "postinstall.js",
     "tools/",
-    "claude-fixed.bat"
+    "claude-fixed.bat",
+    "proxy/",
+    "bin/"
   ],
   "engines": {
     "node": ">=18"
package/proxy/config.mjs
ADDED
@@ -0,0 +1,23 @@
+import { fileURLToPath } from "node:url";
+import { dirname, join } from "node:path";
+
+function envInt(name, fallback) {
+  const raw = process.env[name];
+  if (raw === undefined || raw === "") return fallback;
+  const parsed = parseInt(raw, 10);
+  return Number.isNaN(parsed) ? fallback : parsed;
+}
+
+const __dirname = dirname(fileURLToPath(import.meta.url));
+
+const config = {
+  port: envInt("CACHE_FIX_PROXY_PORT", 9801),
+  bind: process.env.CACHE_FIX_PROXY_BIND || "127.0.0.1",
+  upstream: process.env.CACHE_FIX_PROXY_UPSTREAM || "https://api.anthropic.com",
+  timeout: envInt("CACHE_FIX_PROXY_TIMEOUT", 600_000),
+  extensionsDir: process.env.CACHE_FIX_EXTENSIONS_DIR || join(__dirname, "extensions"),
+  extensionsConfig: process.env.CACHE_FIX_EXTENSIONS_CONFIG || join(__dirname, "extensions.json"),
+  debug: process.env.CACHE_FIX_DEBUG === "1",
+};
+
+export default config;
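The `envInt` helper in `config.mjs` decides which values of `CACHE_FIX_PROXY_PORT` and `CACHE_FIX_PROXY_TIMEOUT` are accepted. The sketch below reproduces its logic with the environment passed in as a parameter (an adaptation made here so the behavior can be exercised without touching `process.env`), and shows how unset, empty, non-numeric, and partially numeric values are treated.

```javascript
// Same fallback logic as envInt() in proxy/config.mjs, but reading from an
// explicit `env` object instead of process.env (an adaptation for this example).
function envInt(env, name, fallback) {
  const raw = env[name];
  if (raw === undefined || raw === "") return fallback;
  const parsed = parseInt(raw, 10);
  return Number.isNaN(parsed) ? fallback : parsed;
}

const cases = [
  {},                                  // unset: falls back
  { CACHE_FIX_PROXY_PORT: "" },        // empty string: falls back
  { CACHE_FIX_PROXY_PORT: "9900" },    // plain integer: used as-is
  { CACHE_FIX_PROXY_PORT: "abc" },     // not a number: falls back
  { CACHE_FIX_PROXY_PORT: "98xx" },    // parseInt reads the leading digits only
];

for (const env of cases) {
  console.log(JSON.stringify(env), "->", envInt(env, "CACHE_FIX_PROXY_PORT", 9801));
}
```

The last case is a consequence of how `parseInt` works: a value with trailing garbage is accepted using its leading digits rather than rejected.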
package/proxy/extensions/cache-control-normalize.mjs
ADDED
@@ -0,0 +1,59 @@
+function stripCacheControlMarkers(msg) {
+  if (!msg || msg.role !== "user" || !Array.isArray(msg.content)) return 0;
+  let n = 0;
+  for (let i = 0; i < msg.content.length; i++) {
+    const block = msg.content[i];
+    if (block && typeof block === "object" && block.cache_control) {
+      const { cache_control, ...rest } = block;
+      msg.content[i] = rest;
+      n++;
+    }
+  }
+  return n;
+}
+
+function countUserCacheControlMarkers(body) {
+  if (!body || !Array.isArray(body.messages)) return 0;
+  let n = 0;
+  for (const msg of body.messages) {
+    if (msg?.role !== "user" || !Array.isArray(msg.content)) continue;
+    for (const block of msg.content) {
+      if (block && typeof block === "object" && block.cache_control) n++;
+    }
+  }
+  return n;
+}
+
+export default {
+  name: "cache-control-normalize",
+  description: "Strip scattered cache_control markers from user messages and apply canonical placement",
+  order: 400,
+
+  async onRequest(ctx) {
+    const { body } = ctx;
+    if (!Array.isArray(body.messages)) return;
+
+    const markerCount = countUserCacheControlMarkers(body);
+    if (markerCount === 0) return;
+
+    for (const msg of body.messages) {
+      if (msg.role === "user") {
+        stripCacheControlMarkers(msg);
+      }
+    }
+
+    // Apply canonical cache_control at the last block of the last user message
+    for (let i = body.messages.length - 1; i >= 0; i--) {
+      const msg = body.messages[i];
+      if (msg.role !== "user" || !Array.isArray(msg.content) || msg.content.length === 0) continue;
+      const lastBlock = msg.content[msg.content.length - 1];
+      if (lastBlock && typeof lastBlock === "object") {
+        msg.content[msg.content.length - 1] = {
+          ...lastBlock,
+          cache_control: { type: "ephemeral" },
+        };
+      }
+      break;
+    }
+  },
+};
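The effect of `cache-control-normalize` is easiest to see on a small request body. The following sketch condenses the extension's two steps (strip every `cache_control` marker from user messages, then place a single marker on the last block of the last user message) into one hypothetical function, `normalizeUserCacheControl`, and applies it to a made-up conversation. It is an illustration of the behavior shown in the diff, not a drop-in replacement for the extension.

```javascript
// Condensed restatement of the extension's behavior; the function name is hypothetical.
function normalizeUserCacheControl(body) {
  if (!body || !Array.isArray(body.messages)) return body;
  const isMarked = (b) => Boolean(b && typeof b === "object" && b.cache_control);
  const userMsgs = body.messages.filter((m) => m?.role === "user" && Array.isArray(m.content));

  // Like the original, do nothing when no user block carries a marker.
  if (!userMsgs.some((m) => m.content.some(isMarked))) return body;

  // Step 1: strip every marker from user message blocks.
  for (const msg of userMsgs) {
    msg.content = msg.content.map((b) => {
      if (!isMarked(b)) return b;
      const { cache_control, ...rest } = b;
      return rest;
    });
  }

  // Step 2: put one marker on the last block of the last non-empty user message.
  for (let i = body.messages.length - 1; i >= 0; i--) {
    const msg = body.messages[i];
    if (msg.role !== "user" || !Array.isArray(msg.content) || msg.content.length === 0) continue;
    const last = msg.content[msg.content.length - 1];
    if (last && typeof last === "object") {
      msg.content[msg.content.length - 1] = { ...last, cache_control: { type: "ephemeral" } };
    }
    break;
  }
  return body;
}

// Made-up sample body with a marker in an early user message.
const sampleBody = {
  messages: [
    { role: "user", content: [{ type: "text", text: "a", cache_control: { type: "ephemeral" } }, { type: "text", text: "b" }] },
    { role: "assistant", content: [{ type: "text", text: "ok" }] },
    { role: "user", content: [{ type: "text", text: "c" }, { type: "text", text: "d" }] },
  ],
};

console.log(JSON.stringify(normalizeUserCacheControl(sampleBody), null, 2));
```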
package/proxy/extensions/cache-telemetry.mjs
ADDED
@@ -0,0 +1,24 @@
+export default {
+  name: "cache-telemetry",
+  description: "Extract cache hit/miss stats from response stream for monitoring",
+  order: 600,
+
+  async onStreamEvent(ctx) {
+    const { event, telemetry } = ctx;
+    if (!event || !telemetry) return;
+
+    if (event.type === "message_start" && event.message?.usage) {
+      const usage = event.message.usage;
+      ctx.meta.cacheStats = {
+        cacheRead: usage.cache_read_input_tokens || 0,
+        cacheCreation: usage.cache_creation_input_tokens || 0,
+        inputTokens: usage.input_tokens || 0,
+      };
+    }
+
+    if (event.type === "message_delta" && event.usage) {
+      if (!ctx.meta.cacheStats) ctx.meta.cacheStats = {};
+      ctx.meta.cacheStats.outputTokens = event.usage.output_tokens || 0;
+    }
+  },
+};
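The telemetry extension collects usage counters from `message_start` and `message_delta` stream events. The sketch below runs the same field extraction over a made-up event list and then derives a hit rate from the result. Both helper names are hypothetical, the token counts are invented, and the hit-rate formula (cache reads divided by all input-side tokens) is an assumption made for illustration; this section does not show how the package computes the percentages quoted in the README.

```javascript
// Folds stream events into one stats object, using the same fields the extension reads.
function collectCacheStats(events) {
  const stats = {};
  for (const event of events) {
    if (event.type === "message_start" && event.message?.usage) {
      const usage = event.message.usage;
      stats.cacheRead = usage.cache_read_input_tokens || 0;
      stats.cacheCreation = usage.cache_creation_input_tokens || 0;
      stats.inputTokens = usage.input_tokens || 0;
    }
    if (event.type === "message_delta" && event.usage) {
      stats.outputTokens = event.usage.output_tokens || 0;
    }
  }
  return stats;
}

// ASSUMPTION: hit rate = cache reads / (cache reads + cache writes + uncached input).
function cacheHitRate(stats) {
  const read = stats.cacheRead || 0;
  const total = read + (stats.cacheCreation || 0) + (stats.inputTokens || 0);
  return total === 0 ? 0 : read / total;
}

// Invented numbers, for illustration only.
const sampleEvents = [
  { type: "message_start", message: { usage: { input_tokens: 50, cache_read_input_tokens: 9550, cache_creation_input_tokens: 400 } } },
  { type: "content_block_delta", delta: { type: "text_delta", text: "hi" } },
  { type: "message_delta", usage: { output_tokens: 120 } },
];

const stats = collectCacheStats(sampleEvents);
console.log(stats, (cacheHitRate(stats) * 100).toFixed(1) + "%");
```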
package/proxy/extensions/fingerprint-strip.mjs
ADDED
@@ -0,0 +1,105 @@
+import { createHash } from "node:crypto";
+
+const FINGERPRINT_SALT = "59cf53e54c78";
+const FINGERPRINT_INDICES = [4, 7, 20];
+
+function computeFingerprint(messageText, version) {
+  const chars = FINGERPRINT_INDICES.map((i) => messageText[i] || "0").join("");
+  const input = `${FINGERPRINT_SALT}${chars}${version}`;
+  return createHash("sha256").update(input).digest("hex").slice(0, 3);
+}
+
+function extractRealUserMessageText(messages) {
+  for (const msg of messages) {
+    if (msg.role !== "user") continue;
+    const content = msg.content;
+    if (!Array.isArray(content)) {
+      if (typeof content === "string" && !content.startsWith("<system-reminder>")) {
+        return content;
+      }
+      continue;
+    }
+    for (const block of content) {
+      if (block.type === "text" && typeof block.text === "string" && !block.text.startsWith("<system-reminder>")) {
+        return block.text;
+      }
+    }
+  }
+  return "";
+}
+
+function extractFirstMessageText(messages) {
+  if (!Array.isArray(messages) || messages.length === 0) return "";
+  const first = messages[0];
+  if (!first || first.role !== "user") return "";
+  const content = first.content;
+  if (typeof content === "string") return content;
+  if (!Array.isArray(content)) return "";
+  for (const block of content) {
+    if (block.type === "text" && typeof block.text === "string") {
+      return block.text;
+    }
+  }
+  return "";
+}
+
+function stabilizeFingerprint(system, messages) {
+  if (!Array.isArray(system)) return null;
+
+  const attrIdx = system.findIndex(
+    (b) => b.type === "text" && typeof b.text === "string" && b.text.includes("x-anthropic-billing-header:")
+  );
+  if (attrIdx === -1) return null;
+
+  const attrBlock = system[attrIdx];
+  const versionMatch = attrBlock.text.match(/cc_version=([^;]+)/);
+  if (!versionMatch) return null;
+
+  const fullVersion = versionMatch[1];
+  const dotParts = fullVersion.split(".");
+  if (dotParts.length < 4) return null;
+
+  const baseVersion = dotParts.slice(0, 3).join(".");
+  const oldFingerprint = dotParts[3];
+
+  const realText = extractRealUserMessageText(messages);
+  const realVerification = computeFingerprint(realText, baseVersion);
+  const legacyText = extractFirstMessageText(messages);
+  const legacyVerification = computeFingerprint(legacyText, baseVersion);
+
+  let verificationPassed = false;
+  if (realVerification === oldFingerprint) {
+    verificationPassed = true;
+  } else if (legacyVerification === oldFingerprint) {
+    verificationPassed = true;
+  }
+
+  if (!verificationPassed) return null;
+
+  const stableFingerprint = computeFingerprint(realText, baseVersion);
+  if (stableFingerprint === oldFingerprint) return null;
+
+  const newVersion = `${baseVersion}.${stableFingerprint}`;
+  const newText = attrBlock.text.replace(
+    `cc_version=${fullVersion}`,
+    `cc_version=${newVersion}`
+  );
+
+  return { attrIdx, newText, oldFingerprint, stableFingerprint };
+}
+
+export default {
+  name: "fingerprint-strip",
+  description: "Stabilize cc_version fingerprint in system prompt for cache prefix consistency",
+  order: 100,
+
+  async onRequest(ctx) {
+    const { body } = ctx;
+    if (!body.system || !body.messages) return;
+
+    const result = stabilizeFingerprint(body.system, body.messages);
+    if (result) {
+      body.system[result.attrIdx] = { ...body.system[result.attrIdx], text: result.newText };
+    }
+  },
+};
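Two inputs feed the fingerprint in `fingerprint-strip.mjs`: three characters sampled from a message at fixed positions, and the three-part base version parsed out of `cc_version`. The sketch below isolates those two steps and leaves out the salted SHA-256 hash so it needs no imports. The helper names are hypothetical, and the sample attribution string is invented: only the `cc_version=<a.b.c.fingerprint>;` shape is taken from the regular expression in the diff.

```javascript
const FINGERPRINT_INDICES = [4, 7, 20];

// Samples the characters at positions 4, 7 and 20, padding with "0" when the
// text is too short (same rule as computeFingerprint in the diff).
function pickFingerprintChars(messageText) {
  return FINGERPRINT_INDICES.map((i) => messageText[i] || "0").join("");
}

// Splits "cc_version=a.b.c.fff" into the base version and the trailing fingerprint.
function splitCcVersion(attributionText) {
  const match = attributionText.match(/cc_version=([^;]+)/);
  if (!match) return null;
  const parts = match[1].split(".");
  if (parts.length < 4) return null;
  return { baseVersion: parts.slice(0, 3).join("."), fingerprint: parts[3] };
}

console.log(pickFingerprintChars("Hello, world")); // "ow0": 'o' at 4, 'w' at 7, padding at 20
// Invented attribution line for illustration.
console.log(splitCcVersion("x-anthropic-billing-header: cc_version=2.1.117.a3f; cc_entrypoint=cli"));
```

Because the sampled characters come from message text, extracting from a different message yields a different fingerprint, which is consistent with the extension recomputing it from the first user text that is not a system reminder.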
package/proxy/extensions/fresh-session-sort.mjs
ADDED
@@ -0,0 +1,188 @@
+import { createHash } from "node:crypto";
+
+const SR = "<system-reminder>\n";
+
+function isSystemReminder(text) {
+  return typeof text === "string" && text.startsWith("<system-reminder>");
+}
+
+function isHooksBlock(text) {
+  return isSystemReminder(text) && text.substring(0, 200).includes("hook success");
+}
+
+function isSkillsBlock(text) {
+  return typeof text === "string" && text.startsWith(SR + "The following skills are available");
+}
+
+function isDeferredToolsBlock(text) {
+  return typeof text === "string" && text.startsWith(SR + "The following deferred tools are now available");
+}
+
+function isMcpBlock(text) {
+  return typeof text === "string" && text.startsWith(SR + "# MCP Server Instructions");
+}
+
+function isRelocatableBlock(text) {
+  return isHooksBlock(text) || isSkillsBlock(text) || isDeferredToolsBlock(text) || isMcpBlock(text);
+}
+
+function isClearArtifact(text) {
+  if (typeof text !== "string") return false;
+  return (
+    text.startsWith("<local-command-caveat>") ||
+    text.startsWith("<command-name>") ||
+    text.startsWith("<local-command-stdout>")
+  );
+}
+
+function sortSkillsBlock(text) {
+  const match = text.match(/^([\s\S]*?\n\n)(- [\s\S]+?)(\n<\/system-reminder>\s*)$/);
+  if (!match) return text;
+  const [, header, entriesText, footer] = match;
+  const entries = entriesText.split(/\n(?=- )/);
+  entries.sort();
+  return header + entries.join("\n") + footer;
+}
+
+function sortDeferredToolsBlock(text) {
+  const match = text.match(
+    /^(<system-reminder>\nThe following deferred tools are now available[^\n]*\n)([\s\S]+?)(\n<\/system-reminder>\s*)$/
+  );
+  if (!match) return text;
+  const [, header, toolsList, footer] = match;
+  const tools = toolsList.split("\n").map((t) => t.trim()).filter(Boolean);
+  tools.sort();
+  return header + tools.join("\n") + footer;
+}
+
+function stripSessionKnowledge(text) {
+  return text.replace(/\n<session_knowledge[^>]*>[\s\S]*?<\/session_knowledge>/g, "");
+}
+
+const _pinnedBlocks = new Map();
+
+function pinBlockContent(blockType, text) {
+  const normalized = text.replace(/\s+(<\/system-reminder>)\s*$/, "\n$1");
+  const hash = createHash("sha256").update(normalized).digest("hex").slice(0, 16);
+  const pinned = _pinnedBlocks.get(blockType);
+  if (pinned && pinned.hash === hash) return pinned.text;
+  _pinnedBlocks.set(blockType, { hash, text: normalized });
+  return normalized;
+}
+
+function getBlockType(text) {
+  if (isSkillsBlock(text)) return "skills";
+  if (isDeferredToolsBlock(text)) return "deferred";
+  if (isMcpBlock(text)) return "mcp";
+  if (isHooksBlock(text)) return "hooks";
+  return null;
+}
+
+function fixBlockText(blockType, text) {
+  let fixed = text;
+  if (blockType === "skills") fixed = sortSkillsBlock(fixed);
+  else if (blockType === "deferred") fixed = sortDeferredToolsBlock(fixed);
+  else if (blockType === "hooks") fixed = stripSessionKnowledge(fixed);
+  return pinBlockContent(blockType, fixed);
+}
+
+export default {
+  name: "fresh-session-sort",
+  description: "Relocate scattered blocks to messages[0] in deterministic fresh-session order",
+  order: 250,
+
+  async onRequest(ctx) {
+    const { body } = ctx;
+    if (!Array.isArray(body.messages)) return;
+
+    let firstUserIdx = -1;
+    for (let i = 0; i < body.messages.length; i++) {
+      if (body.messages[i].role === "user") {
+        firstUserIdx = i;
+        break;
+      }
+    }
+    if (firstUserIdx === -1) return;
+
+    const firstMsg = body.messages[firstUserIdx];
+    if (!Array.isArray(firstMsg?.content)) return;
+
+    // Strip /clear artifacts from first user message
+    const beforeLen = firstMsg.content.length;
+    firstMsg.content = firstMsg.content.filter((b) => !isClearArtifact(b.text || ""));
+
+    // Check for scattered relocatable blocks outside first user message
+    let hasScatteredBlocks = false;
+    for (let i = firstUserIdx + 1; i < body.messages.length && !hasScatteredBlocks; i++) {
+      const msg = body.messages[i];
+      if (msg.role !== "user" || !Array.isArray(msg.content)) continue;
+      for (const block of msg.content) {
+        if (isRelocatableBlock(block.text || "")) {
+          hasScatteredBlocks = true;
+          break;
+        }
+      }
+    }
+
+    if (!hasScatteredBlocks) {
+      // Still sort and pin blocks in-place for deterministic first-call baseline
+      let modified = false;
+      const newContent = firstMsg.content.map((block) => {
+        const text = block.text || "";
+        const blockType = getBlockType(text);
+        if (!blockType) return block;
+
+        const fixedText = fixBlockText(blockType, text);
+        if (fixedText !== text) {
+          modified = true;
+          const { cache_control, ...rest } = block;
+          return { ...rest, text: fixedText };
+        }
+        return block;
+      });
+
+      if (modified || firstMsg.content.length !== beforeLen) {
+        body.messages[firstUserIdx] = { ...firstMsg, content: newContent };
+      }
+      return;
+    }
+
+    // Scan backwards to find latest instance of each relocatable block type
+    const found = new Map();
+    for (let i = body.messages.length - 1; i >= firstUserIdx; i--) {
+      const msg = body.messages[i];
+      if (msg.role !== "user" || !Array.isArray(msg.content)) continue;
+      for (let j = msg.content.length - 1; j >= 0; j--) {
+        const block = msg.content[j];
+        const text = block.text || "";
+        const blockType = getBlockType(text);
+        if (!blockType || found.has(blockType)) continue;
+
+        const fixedText = fixBlockText(blockType, text);
+        const { cache_control, ...rest } = block;
+        found.set(blockType, { ...rest, text: fixedText });
+      }
+    }
+
+    if (found.size === 0) return;
+
+    // Remove all relocatable blocks from all user messages
+    for (let i = 0; i < body.messages.length; i++) {
+      const msg = body.messages[i];
+      if (msg.role !== "user" || !Array.isArray(msg.content)) continue;
+      const filtered = msg.content.filter((b) => !isRelocatableBlock(b.text || ""));
+      if (filtered.length !== msg.content.length) {
+        body.messages[i] = { ...msg, content: filtered };
+      }
+    }
+
+    // Prepend in deterministic order: deferred → mcp → skills → hooks
+    const ORDER = ["deferred", "mcp", "skills", "hooks"];
+    const toRelocate = ORDER.filter((t) => found.has(t)).map((t) => found.get(t));
+
+    body.messages[firstUserIdx] = {
+      ...body.messages[firstUserIdx],
+      content: [...toRelocate, ...body.messages[firstUserIdx].content],
+    };
+  },
+};
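To illustrate the sorting step in `fresh-session-sort.mjs`, the sketch below copies `sortDeferredToolsBlock` from the diff and runs it on a small sample block. The tool names in the sample are invented, and the exact wording Claude Code uses after "now available" is an assumption; the function only requires that the block start with the prefix matched by the regular expression.

```javascript
// Copied from the diff: sorts the tool names inside a deferred-tools reminder block
// so the same set of tools always serializes to the same text.
function sortDeferredToolsBlock(text) {
  const match = text.match(
    /^(<system-reminder>\nThe following deferred tools are now available[^\n]*\n)([\s\S]+?)(\n<\/system-reminder>\s*)$/
  );
  if (!match) return text; // unrecognized shape: leave untouched
  const [, header, toolsList, footer] = match;
  const tools = toolsList.split("\n").map((t) => t.trim()).filter(Boolean);
  tools.sort();
  return header + tools.join("\n") + footer;
}

// Invented sample block; two listings of the same tools in different order.
const header = "<system-reminder>\nThe following deferred tools are now available:\n";
const footer = "\n</system-reminder>";
const blockA = header + ["zeta_tool", "alpha_tool", "mid_tool"].join("\n") + footer;
const blockB = header + ["mid_tool", "zeta_tool", "alpha_tool"].join("\n") + footer;

console.log(sortDeferredToolsBlock(blockA));
console.log(sortDeferredToolsBlock(blockA) === sortDeferredToolsBlock(blockB)); // true
```

The point of the sort is that two requests listing the same tools in a different order produce byte-identical text, which a prefix-based cache needs in order to match.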