claude-code-cache-fix 2.0.6 → 3.0.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -4,9 +4,76 @@
4
4
 
5
5
  English | [中文](./README.zh.md) | [한국어](./README.ko.md) | [Português](./docs/guia-pt-br.md)
6
6
 
7
- Fixes prompt cache regressions in [Claude Code](https://github.com/anthropics/claude-code) that cause **up to 20x cost increase** on resumed sessions, plus monitoring for silent context degradation. Confirmed through v2.1.112. Opus 4.7 compatible.
7
+ Cache optimization proxy and interceptor for [Claude Code](https://github.com/anthropics/claude-code). Fixes prompt cache bugs that cause excessive quota burn, stabilizes the request prefix, and monitors for silent regressions. Works with all CC versions including the v2.1.113+ Bun binary.
8
8
 
9
- > **Opus 4.7 advisory:** Our metered data shows 4.7 burns Q5h quota at **~2.4x the rate of 4.6** for equivalent visible token counts. Two factors: a new tokenizer (up to 35% more tokens, [documented](https://platform.claude.com/docs/en/about-claude/models/whats-new-claude-4-7)) and adaptive thinking overhead (~105%, not documented in usage response). Workaround: `CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING=1` (may reduce quality). Image stripping (`CACHE_FIX_IMAGE_KEEP_LAST`) is even more important on 4.7 due to high-res image support increasing image token counts. See [Discussion #25](https://github.com/cnighswonger/claude-code-cache-fix/discussions/25) for full analysis.
9
+ > **v3.0.0** adds a local HTTP proxy with hot-reloadable extensions. This is the recommended path for CC v2.1.113+ where the preload interceptor no longer works. A/B tested on v2.1.117: **95.5% cache hit rate through proxy vs 82.3% direct** on first warm turn. [Full release notes](https://github.com/cnighswonger/claude-code-cache-fix/releases/tag/v3.0.0)
10
+
11
+ > **Opus 4.7 advisory:** Metered data shows 4.7 burns Q5h quota at **~2.4x the rate of 4.6** for equivalent visible token counts. Two factors: a new tokenizer (up to 35% more tokens, [documented](https://platform.claude.com/docs/en/about-claude/models/whats-new-claude-4-7)) and adaptive thinking overhead (~105%, not documented in usage response). Workaround: `CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING=1` (may reduce quality). See [Discussion #25](https://github.com/cnighswonger/claude-code-cache-fix/discussions/25) for full analysis.
12
+
13
+ ## Quick Start: Proxy (recommended for CC v2.1.113+)
14
+
15
+ The proxy works with any CC version — Node.js or Bun binary. It sits between Claude Code and the Anthropic API, applying cache fixes as hot-reloadable extensions.
16
+
17
+ ```bash
18
+ # Install
19
+ npm install -g claude-code-cache-fix
20
+
21
+ # Start the proxy (runs on localhost:9801)
22
+ node "$(npm root -g)/claude-code-cache-fix/proxy/server.mjs" &
23
+
24
+ # Launch Claude Code through it
25
+ ANTHROPIC_BASE_URL=http://127.0.0.1:9801 claude
26
+ ```
27
+
28
+ That's it. The proxy applies all 7 cache-fix extensions automatically. No wrapper scripts, no `NODE_OPTIONS`, no preload.
29
+
30
+ ### What the proxy does
31
+
32
+ On every request passing through, 7 extensions run in order:
33
+
34
+ | Extension | What it fixes |
35
+ |-----------|--------------|
36
+ | `fingerprint-strip` | Stabilizes the cc_version fingerprint in the system prompt so the cached prefix stays consistent |
37
+ | `sort-stabilization` | Deterministic ordering of tool and MCP definitions |
38
+ | `ttl-management` | Detects server TTL tier, injects correct cache_control markers |
39
+ | `identity-normalization` | Normalizes message identity fields for prefix stability |
40
+ | `fresh-session-sort` | Fixes non-deterministic ordering on first turn |
41
+ | `cache-control-normalize` | Normalizes cache_control markers across messages |
42
+ | `cache-telemetry` | Extracts cache stats from usage events in the response stream → `~/.claude/quota-status.json` |
43
+
44
+ Extensions are hot-reloadable — add, remove, or modify `.mjs` files in `proxy/extensions/` and changes apply to the next request without restarting. Configuration in `proxy/extensions.json`.
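For illustration, a custom extension might look like the sketch below. The object shape (`name`, `description`, `order`, and an async `onRequest(ctx)` that reads `ctx.body`) mirrors the bundled extensions shown later in this diff. The extension name `strip-empty-text`, its `order` value, and its behaviour are hypothetical. That lower `order` values run first is an assumption, and whether a new file must also be listed in `proxy/extensions.json` is not confirmed by this diff.

```javascript
// Hypothetical extension: drops empty text blocks from user messages.
// Shape (name / description / order / onRequest) follows the bundled extensions;
// the name, order value, and behaviour here are illustrative assumptions.
const stripEmptyText = {
  name: "strip-empty-text",
  description: "Remove empty text blocks from user messages (illustrative only)",
  order: 450, // assumed: lower values run first, so this lands between 400 and 600

  async onRequest(ctx) {
    const { body } = ctx;
    if (!Array.isArray(body.messages)) return;
    for (const msg of body.messages) {
      if (msg?.role !== "user" || !Array.isArray(msg.content)) continue;
      // Keep every block except text blocks whose text is the empty string.
      msg.content = msg.content.filter(
        (block) => !(block?.type === "text" && block.text === "")
      );
    }
  },
};

// In a real proxy/extensions/strip-empty-text.mjs file this object would be
// the module's default export: export default stripEmptyText;
```

Because the hook mutates `ctx.body` in place, it follows the same convention as the bundled `onRequest` handlers and needs no return value.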
45
+
46
+ ### Running as a service
47
+
48
+ For persistent use, run the proxy in the background:
49
+
50
+ ```bash
51
+ # Start in background with logging
52
+ nohup node "$(npm root -g)/claude-code-cache-fix/proxy/server.mjs" > /tmp/cache-fix-proxy.log 2>&1 &
53
+
54
+ # Add to your shell profile
55
+ echo 'export ANTHROPIC_BASE_URL=http://127.0.0.1:9801' >> ~/.bashrc
56
+ ```
57
+
58
+ ### Health check
59
+
60
+ ```bash
61
+ curl http://127.0.0.1:9801/health
62
+ # {"status":"ok"}
63
+ ```
64
+
65
+ ## Quick Start: Preload (for CC v2.1.112 and earlier)
66
+
67
+ If you're on a Node.js-based CC version (v2.1.112 or earlier), the preload interceptor still works and requires no proxy:
68
+
69
+ ```bash
70
+ npm install -g claude-code-cache-fix
71
+ NODE_OPTIONS="--import claude-code-cache-fix" claude
72
+ ```
73
+
74
+ > **Note:** The preload does NOT work on CC v2.1.113+ (Bun binary). Use the proxy path above.
75
+
76
+ See [Preload Setup Details](#preload-setup-details) below for wrapper scripts, shell aliases, and Windows instructions.
10
77
 
11
78
  ## Security model
12
79
 
@@ -34,7 +101,12 @@ Three bugs cause this:
34
101
 
35
102
  Additionally, images read via the Read tool persist as base64 in conversation history and are sent on every subsequent API call, compounding token costs silently.
36
103
 
37
- ## Installation
104
+ ## Preload Setup Details
105
+
106
+ <details>
107
+ <summary>Expand for preload interceptor setup (CC v2.1.112 and earlier only)</summary>
108
+
109
+ ### Installation
38
110
 
39
111
  Requires Node.js >= 18 and Claude Code installed via npm (not the standalone binary).
40
112
 
@@ -42,9 +114,9 @@ Requires Node.js >= 18 and Claude Code installed via npm (not the standalone bin
42
114
  npm install -g claude-code-cache-fix
43
115
  ```
44
116
 
45
- ## Usage
117
+ ### Usage
46
118
 
47
- The fix works as a Node.js preload module that intercepts API requests before they leave your machine.
119
+ The preload works as a Node.js module that intercepts API requests before they leave your machine.
48
120
 
49
121
  ### Option A: Wrapper script (recommended)
50
122
 
@@ -184,6 +256,8 @@ Then set in VS Code `settings.json`:
184
256
 
185
257
  Credit: [@JEONG-JIWOO](https://github.com/JEONG-JIWOO) and [@X-15](https://github.com/X-15) for the VS Code extension investigation and C wrapper ([#16](https://github.com/cnighswonger/claude-code-cache-fix/issues/16)).
186
258
 
259
+ </details>
260
+
187
261
  ## How it works
188
262
 
189
263
  The module intercepts `globalThis.fetch` before Claude Code makes API calls to `/v1/messages`. On each call it:
@@ -0,0 +1,113 @@
1
+ #!/usr/bin/env node
2
+
3
+ import { fork, spawn } from "node:child_process";
4
+ import { fileURLToPath } from "node:url";
5
+ import { dirname, resolve } from "node:path";
6
+ import http from "node:http";
7
+
8
+ const __dirname = dirname(fileURLToPath(import.meta.url));
9
+ const SERVER_PATH = resolve(__dirname, "../proxy/server.mjs");
10
+
11
+ const args = process.argv.slice(2);
12
+ let proxyPort = 9801;
13
+ let proxyUpstream = undefined;
14
+ const claudeArgs = [];
15
+
16
+ for (let i = 0; i < args.length; i++) {
17
+ if (args[i] === "--proxy-port" && args[i + 1]) {
18
+ proxyPort = parseInt(args[++i], 10);
19
+ } else if (args[i] === "--proxy-upstream" && args[i + 1]) {
20
+ proxyUpstream = args[++i];
21
+ } else {
22
+ claudeArgs.push(args[i]);
23
+ }
24
+ }
25
+
26
+ const proxyEnv = { ...process.env, CACHE_FIX_PROXY_PORT: String(proxyPort) };
27
+ if (proxyUpstream) proxyEnv.CACHE_FIX_PROXY_UPSTREAM = proxyUpstream;
28
+
29
+ const proxyProc = fork(SERVER_PATH, [], {
30
+ stdio: ["ignore", "pipe", "pipe", "ipc"],
31
+ env: proxyEnv,
32
+ });
33
+
34
+ let claudeProc = null;
35
+ let exiting = false;
36
+
37
+ function cleanup() {
38
+ if (exiting) return;
39
+ exiting = true;
40
+ if (claudeProc && !claudeProc.killed) claudeProc.kill("SIGTERM");
41
+ if (proxyProc && !proxyProc.killed) proxyProc.kill("SIGTERM");
42
+ }
43
+
44
+ proxyProc.on("exit", (code) => {
45
+ if (!exiting) {
46
+ process.stderr.write(`proxy exited unexpectedly (code ${code})\n`);
47
+ cleanup();
48
+ process.exit(1);
49
+ }
50
+ });
51
+
52
+ proxyProc.stderr.on("data", (chunk) => {
53
+ process.stderr.write(chunk);
54
+ });
55
+
56
+ function waitForReady() {
57
+ return new Promise((resolve, reject) => {
58
+ let output = "";
59
+ proxyProc.stdout.on("data", (chunk) => {
60
+ output += chunk.toString();
61
+ const match = output.match(/listening on ([\d.]+):(\d+)/);
62
+ if (match) resolve(parseInt(match[2], 10));
63
+ });
64
+ proxyProc.on("exit", (code) => {
65
+ reject(new Error(`Proxy exited (code ${code}) before ready`));
66
+ });
67
+ setTimeout(() => reject(new Error("Proxy failed to start within 10s")), 10000);
68
+ });
69
+ }
70
+
71
+ let actualPort;
72
+ try {
73
+ actualPort = await waitForReady();
74
+ } catch (err) {
75
+ process.stderr.write(`${err.message}\n`);
76
+ cleanup();
77
+ process.exit(1);
78
+ }
79
+
80
+ const claudeEnv = {
81
+ ...process.env,
82
+ ANTHROPIC_BASE_URL: `http://127.0.0.1:${actualPort}`,
83
+ };
84
+
85
+ const spawnOpts = { stdio: ["inherit", "pipe", "pipe"], env: claudeEnv };
86
+ if (process.env.CACHE_FIX_CLAUDE_CMD) {
87
+ const parts = process.env.CACHE_FIX_CLAUDE_CMD.trim().split(/\s+/);
88
+ claudeProc = spawn(parts[0], [...parts.slice(1), ...claudeArgs], spawnOpts);
89
+ } else {
90
+ claudeProc = spawn("claude", claudeArgs, spawnOpts);
91
+ }
92
+
93
+ claudeProc.stdout.on("data", (chunk) => process.stdout.write(chunk));
94
+ claudeProc.stderr.on("data", (chunk) => process.stderr.write(chunk));
95
+
96
+ claudeProc.on("error", (err) => {
97
+ if (err.code === "ENOENT") {
98
+ process.stderr.write("Error: 'claude' command not found. Is Claude Code installed?\n");
99
+ } else {
100
+ process.stderr.write(`Failed to start claude: ${err.message}\n`);
101
+ }
102
+ cleanup();
103
+ process.exit(1);
104
+ });
105
+
106
+ claudeProc.on("close", (code) => {
107
+ const exitCode = code ?? 0;
108
+ cleanup();
109
+ process.exit(exitCode);
110
+ });
111
+
112
+ process.on("SIGINT", () => { cleanup(); process.exit(130); });
113
+ process.on("SIGTERM", () => { cleanup(); process.exit(143); });
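The argument handling in this wrapper can be read as a pure function: the two wrapper-specific flags are consumed and everything else is forwarded to `claude` untouched. The sketch below restates that loop for illustration only; `splitWrapperArgs` is a hypothetical name and is not part of the package.

```javascript
// Hypothetical helper restating the wrapper's argv loop as a pure function.
function splitWrapperArgs(args, defaultPort = 9801) {
  let proxyPort = defaultPort;
  let proxyUpstream;
  const claudeArgs = [];

  for (let i = 0; i < args.length; i++) {
    if (args[i] === "--proxy-port" && args[i + 1]) {
      proxyPort = parseInt(args[++i], 10); // consumes the flag and its value
    } else if (args[i] === "--proxy-upstream" && args[i + 1]) {
      proxyUpstream = args[++i];
    } else {
      claudeArgs.push(args[i]); // anything else is passed through to claude
    }
  }
  return { proxyPort, proxyUpstream, claudeArgs };
}

const parsed = splitWrapperArgs(["--proxy-port", "9900", "--resume", "-p", "hello"]);
console.log(parsed);
// proxyPort is 9900, proxyUpstream is undefined, claudeArgs is ["--resume", "-p", "hello"]
```

One consequence of the `args[i + 1]` guard: a trailing `--proxy-port` with no value is not treated as a wrapper flag and is forwarded to `claude` as-is.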
package/package.json CHANGED
@@ -1,15 +1,20 @@
1
1
  {
2
2
  "name": "claude-code-cache-fix",
3
- "version": "2.0.6",
4
- "description": "Fixes prompt cache regression in Claude Code that causes up to 20x cost increase on resumed sessions",
3
+ "version": "3.0.1",
4
+ "description": "Cache optimization proxy and interceptor for Claude Code. Fixes prompt cache bugs, stabilizes prefix, reduces quota burn.",
5
5
  "type": "module",
6
6
  "exports": "./preload.mjs",
7
7
  "main": "./preload.mjs",
8
+ "bin": {
9
+ "cache-fix-proxy": "./bin/claude-via-proxy.mjs"
10
+ },
8
11
  "files": [
9
12
  "preload.mjs",
10
13
  "postinstall.js",
11
14
  "tools/",
12
- "claude-fixed.bat"
15
+ "claude-fixed.bat",
16
+ "proxy/",
17
+ "bin/"
13
18
  ],
14
19
  "engines": {
15
20
  "node": ">=18"
@@ -0,0 +1,23 @@
1
+ import { fileURLToPath } from "node:url";
2
+ import { dirname, join } from "node:path";
3
+
4
+ function envInt(name, fallback) {
5
+ const raw = process.env[name];
6
+ if (raw === undefined || raw === "") return fallback;
7
+ const parsed = parseInt(raw, 10);
8
+ return Number.isNaN(parsed) ? fallback : parsed;
9
+ }
10
+
11
+ const __dirname = dirname(fileURLToPath(import.meta.url));
12
+
13
+ const config = {
14
+ port: envInt("CACHE_FIX_PROXY_PORT", 9801),
15
+ bind: process.env.CACHE_FIX_PROXY_BIND || "127.0.0.1",
16
+ upstream: process.env.CACHE_FIX_PROXY_UPSTREAM || "https://api.anthropic.com",
17
+ timeout: envInt("CACHE_FIX_PROXY_TIMEOUT", 600_000),
18
+ extensionsDir: process.env.CACHE_FIX_EXTENSIONS_DIR || join(__dirname, "extensions"),
19
+ extensionsConfig: process.env.CACHE_FIX_EXTENSIONS_CONFIG || join(__dirname, "extensions.json"),
20
+ debug: process.env.CACHE_FIX_DEBUG === "1",
21
+ };
22
+
23
+ export default config;
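The fallback rules of `envInt` are easy to miss on a first read, so here is a small sketch that applies the same logic to a plain object instead of `process.env`. `envIntFrom` and the `BAD_VALUE` / `LENIENT_VALUE` variables are hypothetical names used only for this illustration.

```javascript
// Same parsing rules as envInt in the config module, but reading from a plain
// object so the behaviour can be checked without touching process.env.
function envIntFrom(env, name, fallback) {
  const raw = env[name];
  if (raw === undefined || raw === "") return fallback;
  const parsed = parseInt(raw, 10);
  return Number.isNaN(parsed) ? fallback : parsed;
}

const sampleEnv = {
  CACHE_FIX_PROXY_PORT: "9900",
  CACHE_FIX_PROXY_TIMEOUT: "",
  BAD_VALUE: "abc",        // hypothetical variable, for illustration only
  LENIENT_VALUE: "9900px", // hypothetical variable, for illustration only
};

console.log(envIntFrom(sampleEnv, "CACHE_FIX_PROXY_PORT", 9801));       // 9900
console.log(envIntFrom(sampleEnv, "CACHE_FIX_PROXY_TIMEOUT", 600_000)); // 600000 (empty string falls back)
console.log(envIntFrom(sampleEnv, "BAD_VALUE", 1));                     // 1 (not a number, falls back)
console.log(envIntFrom(sampleEnv, "LENIENT_VALUE", 1));                 // 9900 (parseInt accepts a numeric prefix)
console.log(envIntFrom(sampleEnv, "UNSET", 42));                        // 42
```

The fourth case is worth noting: because `parseInt` stops at the first non-digit, a value such as `9900px` is accepted rather than rejected.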
@@ -0,0 +1,59 @@
1
+ function stripCacheControlMarkers(msg) {
2
+ if (!msg || msg.role !== "user" || !Array.isArray(msg.content)) return 0;
3
+ let n = 0;
4
+ for (let i = 0; i < msg.content.length; i++) {
5
+ const block = msg.content[i];
6
+ if (block && typeof block === "object" && block.cache_control) {
7
+ const { cache_control, ...rest } = block;
8
+ msg.content[i] = rest;
9
+ n++;
10
+ }
11
+ }
12
+ return n;
13
+ }
14
+
15
+ function countUserCacheControlMarkers(body) {
16
+ if (!body || !Array.isArray(body.messages)) return 0;
17
+ let n = 0;
18
+ for (const msg of body.messages) {
19
+ if (msg?.role !== "user" || !Array.isArray(msg.content)) continue;
20
+ for (const block of msg.content) {
21
+ if (block && typeof block === "object" && block.cache_control) n++;
22
+ }
23
+ }
24
+ return n;
25
+ }
26
+
27
+ export default {
28
+ name: "cache-control-normalize",
29
+ description: "Strip scattered cache_control markers from user messages and apply canonical placement",
30
+ order: 400,
31
+
32
+ async onRequest(ctx) {
33
+ const { body } = ctx;
34
+ if (!Array.isArray(body.messages)) return;
35
+
36
+ const markerCount = countUserCacheControlMarkers(body);
37
+ if (markerCount === 0) return;
38
+
39
+ for (const msg of body.messages) {
40
+ if (msg.role === "user") {
41
+ stripCacheControlMarkers(msg);
42
+ }
43
+ }
44
+
45
+ // Apply canonical cache_control at the last block of the last user message
46
+ for (let i = body.messages.length - 1; i >= 0; i--) {
47
+ const msg = body.messages[i];
48
+ if (msg.role !== "user" || !Array.isArray(msg.content) || msg.content.length === 0) continue;
49
+ const lastBlock = msg.content[msg.content.length - 1];
50
+ if (lastBlock && typeof lastBlock === "object") {
51
+ msg.content[msg.content.length - 1] = {
52
+ ...lastBlock,
53
+ cache_control: { type: "ephemeral" },
54
+ };
55
+ }
56
+ break;
57
+ }
58
+ },
59
+ };
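To make the effect of `cache-control-normalize` concrete, the sketch below condenses its two passes into one function and runs it on a made-up conversation. It is a restatement for illustration, not the shipped code: `normalizeUserCacheControl` is a hypothetical name, and unlike the extension it returns a new array instead of mutating the request body.

```javascript
// Condensed restatement of the extension's effect, for illustration only.
function normalizeUserCacheControl(messages) {
  const isUserWithBlocks = (m) => m?.role === "user" && Array.isArray(m.content);

  // Nothing to do when no user block carries a marker (mirrors the early return).
  const hasMarker = messages.some(
    (m) => isUserWithBlocks(m) && m.content.some((b) => b && typeof b === "object" && b.cache_control)
  );
  if (!hasMarker) return messages;

  // 1. Strip every cache_control marker from user message blocks.
  const stripped = messages.map((m) => {
    if (!isUserWithBlocks(m)) return m;
    return {
      ...m,
      content: m.content.map((b) => {
        if (!b || typeof b !== "object" || !b.cache_control) return b;
        const { cache_control, ...rest } = b;
        return rest;
      }),
    };
  });

  // 2. Re-apply one marker on the last block of the last non-empty user message.
  for (let i = stripped.length - 1; i >= 0; i--) {
    const m = stripped[i];
    if (!isUserWithBlocks(m) || m.content.length === 0) continue;
    const last = m.content[m.content.length - 1];
    if (last && typeof last === "object") {
      m.content[m.content.length - 1] = { ...last, cache_control: { type: "ephemeral" } };
    }
    break;
  }
  return stripped;
}

// Made-up conversation: the marker sits on an early user block.
const before = [
  { role: "user", content: [{ type: "text", text: "first", cache_control: { type: "ephemeral" } }] },
  { role: "assistant", content: [{ type: "text", text: "reply" }] },
  { role: "user", content: [{ type: "text", text: "second" }, { type: "text", text: "third" }] },
];
const after = normalizeUserCacheControl(before);
console.log(JSON.stringify(after[0].content[0])); // {"type":"text","text":"first"}
console.log(JSON.stringify(after[2].content[1])); // {"type":"text","text":"third","cache_control":{"type":"ephemeral"}}
```

Note that requests with no markers on user messages pass through unchanged, so the extension only relocates markers that Claude Code already emitted.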
@@ -0,0 +1,24 @@
1
+ export default {
2
+ name: "cache-telemetry",
3
+ description: "Extract cache hit/miss stats from response stream for monitoring",
4
+ order: 600,
5
+
6
+ async onStreamEvent(ctx) {
7
+ const { event, telemetry } = ctx;
8
+ if (!event || !telemetry) return;
9
+
10
+ if (event.type === "message_start" && event.message?.usage) {
11
+ const usage = event.message.usage;
12
+ ctx.meta.cacheStats = {
13
+ cacheRead: usage.cache_read_input_tokens || 0,
14
+ cacheCreation: usage.cache_creation_input_tokens || 0,
15
+ inputTokens: usage.input_tokens || 0,
16
+ };
17
+ }
18
+
19
+ if (event.type === "message_delta" && event.usage) {
20
+ if (!ctx.meta.cacheStats) ctx.meta.cacheStats = {};
21
+ ctx.meta.cacheStats.outputTokens = event.usage.output_tokens || 0;
22
+ }
23
+ },
24
+ };
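The telemetry hook reads only two stream event types. The sketch below feeds synthetic events with made-up token counts through an equivalent reducer and derives a hit rate from the result. `collectCacheStats` and `cacheHitRate` are hypothetical helpers, and the hit-rate formula is an assumption: this diff does not show how the percentages quoted in the README were computed.

```javascript
// Reducer mirroring the two event types the extension reads.
function collectCacheStats(events) {
  const stats = {};
  for (const event of events) {
    if (event.type === "message_start" && event.message?.usage) {
      const usage = event.message.usage;
      stats.cacheRead = usage.cache_read_input_tokens || 0;
      stats.cacheCreation = usage.cache_creation_input_tokens || 0;
      stats.inputTokens = usage.input_tokens || 0;
    }
    if (event.type === "message_delta" && event.usage) {
      stats.outputTokens = event.usage.output_tokens || 0;
    }
  }
  return stats;
}

// Assumed definition: share of prompt tokens that were served from cache.
function cacheHitRate({ cacheRead = 0, cacheCreation = 0, inputTokens = 0 }) {
  const total = cacheRead + cacheCreation + inputTokens;
  return total === 0 ? 0 : cacheRead / total;
}

// Synthetic events with made-up token counts.
const events = [
  { type: "message_start", message: { usage: { input_tokens: 50, cache_read_input_tokens: 9000, cache_creation_input_tokens: 950 } } },
  { type: "content_block_delta", delta: { type: "text_delta", text: "..." } },
  { type: "message_delta", usage: { output_tokens: 120 } },
];

const stats = collectCacheStats(events);
console.log(stats, cacheHitRate(stats)); // hit rate: 9000 / 10000 = 0.9
```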
@@ -0,0 +1,105 @@
1
+ import { createHash } from "node:crypto";
2
+
3
+ const FINGERPRINT_SALT = "59cf53e54c78";
4
+ const FINGERPRINT_INDICES = [4, 7, 20];
5
+
6
+ function computeFingerprint(messageText, version) {
7
+ const chars = FINGERPRINT_INDICES.map((i) => messageText[i] || "0").join("");
8
+ const input = `${FINGERPRINT_SALT}${chars}${version}`;
9
+ return createHash("sha256").update(input).digest("hex").slice(0, 3);
10
+ }
11
+
12
+ function extractRealUserMessageText(messages) {
13
+ for (const msg of messages) {
14
+ if (msg.role !== "user") continue;
15
+ const content = msg.content;
16
+ if (!Array.isArray(content)) {
17
+ if (typeof content === "string" && !content.startsWith("<system-reminder>")) {
18
+ return content;
19
+ }
20
+ continue;
21
+ }
22
+ for (const block of content) {
23
+ if (block.type === "text" && typeof block.text === "string" && !block.text.startsWith("<system-reminder>")) {
24
+ return block.text;
25
+ }
26
+ }
27
+ }
28
+ return "";
29
+ }
30
+
31
+ function extractFirstMessageText(messages) {
32
+ if (!Array.isArray(messages) || messages.length === 0) return "";
33
+ const first = messages[0];
34
+ if (!first || first.role !== "user") return "";
35
+ const content = first.content;
36
+ if (typeof content === "string") return content;
37
+ if (!Array.isArray(content)) return "";
38
+ for (const block of content) {
39
+ if (block.type === "text" && typeof block.text === "string") {
40
+ return block.text;
41
+ }
42
+ }
43
+ return "";
44
+ }
45
+
46
+ function stabilizeFingerprint(system, messages) {
47
+ if (!Array.isArray(system)) return null;
48
+
49
+ const attrIdx = system.findIndex(
50
+ (b) => b.type === "text" && typeof b.text === "string" && b.text.includes("x-anthropic-billing-header:")
51
+ );
52
+ if (attrIdx === -1) return null;
53
+
54
+ const attrBlock = system[attrIdx];
55
+ const versionMatch = attrBlock.text.match(/cc_version=([^;]+)/);
56
+ if (!versionMatch) return null;
57
+
58
+ const fullVersion = versionMatch[1];
59
+ const dotParts = fullVersion.split(".");
60
+ if (dotParts.length < 4) return null;
61
+
62
+ const baseVersion = dotParts.slice(0, 3).join(".");
63
+ const oldFingerprint = dotParts[3];
64
+
65
+ const realText = extractRealUserMessageText(messages);
66
+ const realVerification = computeFingerprint(realText, baseVersion);
67
+ const legacyText = extractFirstMessageText(messages);
68
+ const legacyVerification = computeFingerprint(legacyText, baseVersion);
69
+
70
+ let verificationPassed = false;
71
+ if (realVerification === oldFingerprint) {
72
+ verificationPassed = true;
73
+ } else if (legacyVerification === oldFingerprint) {
74
+ verificationPassed = true;
75
+ }
76
+
77
+ if (!verificationPassed) return null;
78
+
79
+ const stableFingerprint = computeFingerprint(realText, baseVersion);
80
+ if (stableFingerprint === oldFingerprint) return null;
81
+
82
+ const newVersion = `${baseVersion}.${stableFingerprint}`;
83
+ const newText = attrBlock.text.replace(
84
+ `cc_version=${fullVersion}`,
85
+ `cc_version=${newVersion}`
86
+ );
87
+
88
+ return { attrIdx, newText, oldFingerprint, stableFingerprint };
89
+ }
90
+
91
+ export default {
92
+ name: "fingerprint-strip",
93
+ description: "Stabilize cc_version fingerprint in system prompt for cache prefix consistency",
94
+ order: 100,
95
+
96
+ async onRequest(ctx) {
97
+ const { body } = ctx;
98
+ if (!body.system || !body.messages) return;
99
+
100
+ const result = stabilizeFingerprint(body.system, body.messages);
101
+ if (result) {
102
+ body.system[result.attrIdx] = { ...body.system[result.attrIdx], text: result.newText };
103
+ }
104
+ },
105
+ };
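Two details of `fingerprint-strip` are easier to follow with concrete values: how the `cc_version` string is split, and which characters of the message text feed the hash. The sketch below isolates both steps without the hashing itself. `splitCcVersion` and `sampleChars` are hypothetical names, and the header text is a made-up sample.

```javascript
const FINGERPRINT_INDICES = [4, 7, 20];

// Splits "cc_version=<major>.<minor>.<patch>.<fingerprint>" the way the
// extension does. Returns null when there is no fingerprint suffix.
function splitCcVersion(text) {
  const versionMatch = text.match(/cc_version=([^;]+)/);
  if (!versionMatch) return null;
  const dotParts = versionMatch[1].split(".");
  if (dotParts.length < 4) return null;
  return {
    baseVersion: dotParts.slice(0, 3).join("."),
    fingerprint: dotParts[3],
  };
}

// Picks the characters that are mixed into the hash; missing positions become "0".
function sampleChars(messageText) {
  return FINGERPRINT_INDICES.map((i) => messageText[i] || "0").join("");
}

// Made-up header text for illustration.
const sample = "x-anthropic-billing-header: cc_version=2.1.117.a3f; cc_entrypoint=cli";
console.log(splitCcVersion(sample));                  // baseVersion "2.1.117", fingerprint "a3f"
console.log(splitCcVersion("cc_version=2.1.117;"));   // null
console.log(sampleChars("abcdefghijklmnopqrstuvwxyz")); // "ehu"
console.log(sampleChars("hi"));                         // "000"
```

Since the sampled characters come from message text, a fingerprint computed from a `<system-reminder>` block differs from one computed from the first real user message, which is why the extension recomputes it from the latter.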
@@ -0,0 +1,188 @@
1
+ import { createHash } from "node:crypto";
2
+
3
+ const SR = "<system-reminder>\n";
4
+
5
+ function isSystemReminder(text) {
6
+ return typeof text === "string" && text.startsWith("<system-reminder>");
7
+ }
8
+
9
+ function isHooksBlock(text) {
10
+ return isSystemReminder(text) && text.substring(0, 200).includes("hook success");
11
+ }
12
+
13
+ function isSkillsBlock(text) {
14
+ return typeof text === "string" && text.startsWith(SR + "The following skills are available");
15
+ }
16
+
17
+ function isDeferredToolsBlock(text) {
18
+ return typeof text === "string" && text.startsWith(SR + "The following deferred tools are now available");
19
+ }
20
+
21
+ function isMcpBlock(text) {
22
+ return typeof text === "string" && text.startsWith(SR + "# MCP Server Instructions");
23
+ }
24
+
25
+ function isRelocatableBlock(text) {
26
+ return isHooksBlock(text) || isSkillsBlock(text) || isDeferredToolsBlock(text) || isMcpBlock(text);
27
+ }
28
+
29
+ function isClearArtifact(text) {
30
+ if (typeof text !== "string") return false;
31
+ return (
32
+ text.startsWith("<local-command-caveat>") ||
33
+ text.startsWith("<command-name>") ||
34
+ text.startsWith("<local-command-stdout>")
35
+ );
36
+ }
37
+
38
+ function sortSkillsBlock(text) {
39
+ const match = text.match(/^([\s\S]*?\n\n)(- [\s\S]+?)(\n<\/system-reminder>\s*)$/);
40
+ if (!match) return text;
41
+ const [, header, entriesText, footer] = match;
42
+ const entries = entriesText.split(/\n(?=- )/);
43
+ entries.sort();
44
+ return header + entries.join("\n") + footer;
45
+ }
46
+
47
+ function sortDeferredToolsBlock(text) {
48
+ const match = text.match(
49
+ /^(<system-reminder>\nThe following deferred tools are now available[^\n]*\n)([\s\S]+?)(\n<\/system-reminder>\s*)$/
50
+ );
51
+ if (!match) return text;
52
+ const [, header, toolsList, footer] = match;
53
+ const tools = toolsList.split("\n").map((t) => t.trim()).filter(Boolean);
54
+ tools.sort();
55
+ return header + tools.join("\n") + footer;
56
+ }
57
+
58
+ function stripSessionKnowledge(text) {
59
+ return text.replace(/\n<session_knowledge[^>]*>[\s\S]*?<\/session_knowledge>/g, "");
60
+ }
61
+
62
+ const _pinnedBlocks = new Map();
63
+
64
+ function pinBlockContent(blockType, text) {
65
+ const normalized = text.replace(/\s+(<\/system-reminder>)\s*$/, "\n$1");
66
+ const hash = createHash("sha256").update(normalized).digest("hex").slice(0, 16);
67
+ const pinned = _pinnedBlocks.get(blockType);
68
+ if (pinned && pinned.hash === hash) return pinned.text;
69
+ _pinnedBlocks.set(blockType, { hash, text: normalized });
70
+ return normalized;
71
+ }
72
+
73
+ function getBlockType(text) {
74
+ if (isSkillsBlock(text)) return "skills";
75
+ if (isDeferredToolsBlock(text)) return "deferred";
76
+ if (isMcpBlock(text)) return "mcp";
77
+ if (isHooksBlock(text)) return "hooks";
78
+ return null;
79
+ }
80
+
81
+ function fixBlockText(blockType, text) {
82
+ let fixed = text;
83
+ if (blockType === "skills") fixed = sortSkillsBlock(fixed);
84
+ else if (blockType === "deferred") fixed = sortDeferredToolsBlock(fixed);
85
+ else if (blockType === "hooks") fixed = stripSessionKnowledge(fixed);
86
+ return pinBlockContent(blockType, fixed);
87
+ }
88
+
89
+ export default {
90
+ name: "fresh-session-sort",
91
+ description: "Relocate scattered blocks to messages[0] in deterministic fresh-session order",
92
+ order: 250,
93
+
94
+ async onRequest(ctx) {
95
+ const { body } = ctx;
96
+ if (!Array.isArray(body.messages)) return;
97
+
98
+ let firstUserIdx = -1;
99
+ for (let i = 0; i < body.messages.length; i++) {
100
+ if (body.messages[i].role === "user") {
101
+ firstUserIdx = i;
102
+ break;
103
+ }
104
+ }
105
+ if (firstUserIdx === -1) return;
106
+
107
+ const firstMsg = body.messages[firstUserIdx];
108
+ if (!Array.isArray(firstMsg?.content)) return;
109
+
110
+ // Strip /clear artifacts from first user message
111
+ const beforeLen = firstMsg.content.length;
112
+ firstMsg.content = firstMsg.content.filter((b) => !isClearArtifact(b.text || ""));
113
+
114
+ // Check for scattered relocatable blocks outside first user message
115
+ let hasScatteredBlocks = false;
116
+ for (let i = firstUserIdx + 1; i < body.messages.length && !hasScatteredBlocks; i++) {
117
+ const msg = body.messages[i];
118
+ if (msg.role !== "user" || !Array.isArray(msg.content)) continue;
119
+ for (const block of msg.content) {
120
+ if (isRelocatableBlock(block.text || "")) {
121
+ hasScatteredBlocks = true;
122
+ break;
123
+ }
124
+ }
125
+ }
126
+
127
+ if (!hasScatteredBlocks) {
128
+ // Still sort and pin blocks in-place for deterministic first-call baseline
129
+ let modified = false;
130
+ const newContent = firstMsg.content.map((block) => {
131
+ const text = block.text || "";
132
+ const blockType = getBlockType(text);
133
+ if (!blockType) return block;
134
+
135
+ const fixedText = fixBlockText(blockType, text);
136
+ if (fixedText !== text) {
137
+ modified = true;
138
+ const { cache_control, ...rest } = block;
139
+ return { ...rest, text: fixedText };
140
+ }
141
+ return block;
142
+ });
143
+
144
+ if (modified || firstMsg.content.length !== beforeLen) {
145
+ body.messages[firstUserIdx] = { ...firstMsg, content: newContent };
146
+ }
147
+ return;
148
+ }
149
+
150
+ // Scan backwards to find latest instance of each relocatable block type
151
+ const found = new Map();
152
+ for (let i = body.messages.length - 1; i >= firstUserIdx; i--) {
153
+ const msg = body.messages[i];
154
+ if (msg.role !== "user" || !Array.isArray(msg.content)) continue;
155
+ for (let j = msg.content.length - 1; j >= 0; j--) {
156
+ const block = msg.content[j];
157
+ const text = block.text || "";
158
+ const blockType = getBlockType(text);
159
+ if (!blockType || found.has(blockType)) continue;
160
+
161
+ const fixedText = fixBlockText(blockType, text);
162
+ const { cache_control, ...rest } = block;
163
+ found.set(blockType, { ...rest, text: fixedText });
164
+ }
165
+ }
166
+
167
+ if (found.size === 0) return;
168
+
169
+ // Remove all relocatable blocks from all user messages
170
+ for (let i = 0; i < body.messages.length; i++) {
171
+ const msg = body.messages[i];
172
+ if (msg.role !== "user" || !Array.isArray(msg.content)) continue;
173
+ const filtered = msg.content.filter((b) => !isRelocatableBlock(b.text || ""));
174
+ if (filtered.length !== msg.content.length) {
175
+ body.messages[i] = { ...msg, content: filtered };
176
+ }
177
+ }
178
+
179
+ // Prepend in deterministic order: deferred → mcp → skills → hooks
180
+ const ORDER = ["deferred", "mcp", "skills", "hooks"];
181
+ const toRelocate = ORDER.filter((t) => found.has(t)).map((t) => found.get(t));
182
+
183
+ body.messages[firstUserIdx] = {
184
+ ...body.messages[firstUserIdx],
185
+ content: [...toRelocate, ...body.messages[firstUserIdx].content],
186
+ };
187
+ },
188
+ };
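The point of the sorting helpers in `fresh-session-sort` is that the same set of entries always serializes to the same bytes, whatever order Claude Code emitted them in. The sketch below copies `sortDeferredToolsBlock` from the file above so the snippet runs on its own, then applies it to two made-up blocks that differ only in ordering. The `wrap` helper, the tool names, and the "via ToolSearch:" wording are illustrative assumptions.

```javascript
// Copied from the extension so this snippet is self-contained.
function sortDeferredToolsBlock(text) {
  const match = text.match(
    /^(<system-reminder>\nThe following deferred tools are now available[^\n]*\n)([\s\S]+?)(\n<\/system-reminder>\s*)$/
  );
  if (!match) return text;
  const [, header, toolsList, footer] = match;
  const tools = toolsList.split("\n").map((t) => t.trim()).filter(Boolean);
  tools.sort();
  return header + tools.join("\n") + footer;
}

// Hypothetical helper that builds a deferred-tools block from a list of names.
const wrap = (tools) =>
  `<system-reminder>\nThe following deferred tools are now available via ToolSearch:\n${tools.join("\n")}\n</system-reminder>`;

const turnA = wrap(["WebFetch", "NotebookEdit", "TodoWrite"]);
const turnB = wrap(["TodoWrite", "WebFetch", "NotebookEdit"]);

console.log(sortDeferredToolsBlock(turnA) === sortDeferredToolsBlock(turnB)); // true
console.log(sortDeferredToolsBlock(turnA).split("\n").slice(2, 5)); // NotebookEdit, TodoWrite, WebFetch
```

Text that does not match the expected block shape is returned unchanged, so the helper is safe to call on any block.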