claude-code-cache-fix 3.2.1 → 3.4.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.ko.md +32 -0
- package/README.md +108 -1
- package/package.json +7 -2
- package/proxy/extensions/identity-normalization.mjs +1 -1
- package/proxy/extensions/image-strip.mjs +566 -39
- package/proxy/extensions/messages-cache-breakpoint.mjs +314 -0
- package/proxy/extensions/microcompact-stability.mjs +428 -0
- package/proxy/extensions/ttl-management.mjs +2 -1
- package/proxy/extensions/ttl-tier-detect.mjs +33 -0
- package/proxy/extensions.json +4 -0
- package/proxy/image-resize.mjs +133 -0
package/README.ko.md
CHANGED
@@ -254,6 +254,38 @@ export CACHE_FIX_IMAGE_KEEP_LAST=3

 Keeps images in the last 3 user messages and replaces older ones with a text placeholder. Only `tool_result` blocks are targeted; images pasted directly by the user are unaffected.

+### Image-guard pipeline (v3.3.0)
+
+A conditional pipeline that mirrors Anthropic's actual image rules. Opt in explicitly with a single environment variable:
+
+```bash
+export CACHE_FIX_IMAGE_GUARD=1
+```
+
+When enabled, the proxy runs:
+
+| Pass | Trigger | Action |
+|------|---------|--------|
+| **Pass 0** (legacy) | `CACHE_FIX_IMAGE_KEEP_LAST=N` set | Strip tool_result images from user messages other than the N most recent |
+| **Pass 3** | `CACHE_FIX_IMAGE_PRESERVE_DETAIL=1` AND long edge > model native cap | Lanczos resize via `sharp` to the native cap (2576 px for Opus 4.7, 1568 px otherwise), preserving aspect ratio and media type |
+| **Pass 1** | long edge > active rejection cap | Strip and replace with a forensic placeholder. Active cap = `MAX_DIM` if set, else 2000 px (when count > 20) or 8000 px (count ≤ 20) |
+| **Pass 2** | request body exceeds `CACHE_FIX_IMAGE_REQUEST_SIZE_MAX` (default 30 MB) | Drop the oldest images first until under budget |
+| **Count cap** | remaining image count > `CACHE_FIX_IMAGE_COUNT_MAX` (default 100) | Drop the oldest images down to the cap |
+
+Execution order: **Pass 0 → Pass 3 → Pass 1 → Pass 2 → count cap**. Each pass is independent: Pass 1 never resizes, and Pass 3 never strips.
+
+#### Optional `sharp` dependency
+
+Pass 3 requires [sharp](https://www.npmjs.com/package/sharp) for Lanczos resizing. It is declared as an **optional peer dependency**; install it separately to use Pass 3:
+
+```bash
+npm install sharp
+```
+
+If `sharp` is missing, Pass 3 skips cleanly (telemetry records `library_missing: true`), and Pass 1, Pass 2, and the count cap run normally.
+
+For the full precedence matrix (every combination of legacy and new environment variables) and the tunables, see [README.md](README.md#image-guard-pipeline-v330).
+
 ## System prompt rewrite (preload mode, optional)

 The interceptor can rewrite Claude Code's `# Output efficiency` system-prompt section. Disabled by default. Enable it with `CACHE_FIX_OUTPUT_EFFICIENCY_REPLACEMENT`. See [docs/output-efficiency-prompts.md](docs/output-efficiency-prompts.md) for the three known prompt variants and usage.
package/README.md
CHANGED
@@ -334,7 +334,7 @@ export CACHE_FIX_IMAGE_KEEP_LAST=3

 Keeps images in the last 3 user messages, replaces older ones with a text placeholder. Only targets `tool_result` blocks; user-pasted images are never touched.

-### Oversized-image guard
+### Oversized-image guard (legacy, v3.2.1)

 ```bash
 export CACHE_FIX_IMAGE_MAX_DIM=2000
@@ -355,6 +355,113 @@ The two compose: with both set, `KEEP_LAST` runs first (drops the count), then `

 Pure-JS PNG and JPEG header parsing, with no native deps. Other formats (GIF, WebP, AVIF, BMP) pass through unchanged regardless of dimension. Fail-open: images whose dimensions can't be parsed (truncated header, unsupported format) are kept rather than stripped, because it is better to send a request that might error than to strip a valid image we just couldn't measure.

+### Image-guard pipeline (v3.3.0)
+
+A conditional pipeline that mirrors Anthropic's actual rules. Strictly opt-in via a single env var:
+
+```bash
+export CACHE_FIX_IMAGE_GUARD=1
+```
+
+When enabled, the proxy runs:
+
+| Pass | Trigger | Action |
+|------|---------|--------|
+| **Pass 0** (legacy) | `CACHE_FIX_IMAGE_KEEP_LAST=N` set | Strip tool_result images from user messages older than the N most recent |
+| **Pass 3** | `CACHE_FIX_IMAGE_PRESERVE_DETAIL=1` AND image long edge > model native cap | Lanczos resize via `sharp` to the native cap (2576 px for Opus 4.7, 1568 px otherwise), preserving aspect ratio and media type |
+| **Pass 1** | image long edge > active rejection cap | Strip and replace with a forensic placeholder. Active cap = `MAX_DIM` if set, else 2000 px (when count > 20) or 8000 px (count ≤ 20) |
+| **Pass 2** | request body exceeds `CACHE_FIX_IMAGE_REQUEST_SIZE_MAX` (default 30 MB) | Drop oldest images until under budget |
+| **Count cap** | surviving image count > `CACHE_FIX_IMAGE_COUNT_MAX` (default 100) | Drop oldest images down to the cap |
+
+Execution order: **Pass 0 → Pass 3 → Pass 1 → Pass 2 → count cap**. Each pass is independent: Pass 1 never resizes, and Pass 3 never strips.
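
Reviewer sketch (not part of the package): the Pass 3 and Pass 1 triggers in the table reduce to a few comparisons on an image's long edge. The helper names below (`nativeCap`, `resizeTarget`, `activeRejectionCap`, `shouldStrip`) are hypothetical; only the numeric thresholds come from the README. How the proxy actually detects the model or rounds dimensions is not shown in this diff, so those parts are assumptions.

```javascript
// Pass 3: native long-edge cap. The README names 2576 px for Opus 4.7 and
// 1568 px otherwise; how the model is detected is left to the caller here.
function nativeCap(isOpus47) {
  return isOpus47 ? 2576 : 1568;
}

// Pass 3: target size that brings the long edge down to the cap while
// keeping the aspect ratio. Returns null when the trigger is not met.
// Rounding with Math.round is an assumption of this sketch.
function resizeTarget(width, height, cap) {
  const longEdge = Math.max(width, height);
  if (longEdge <= cap) return null;
  const scale = cap / longEdge;
  return {
    width: Math.max(1, Math.round(width * scale)),
    height: Math.max(1, Math.round(height * scale)),
  };
}

// Pass 1: MAX_DIM wins when set; otherwise the cap depends on image count.
function activeRejectionCap(imageCount, maxDim) {
  if (Number.isFinite(maxDim) && maxDim > 0) return maxDim;
  return imageCount > 20 ? 2000 : 8000;
}

// Pass 1 strips (never resizes) when the long edge exceeds the active cap.
function shouldStrip(width, height, cap) {
  return Math.max(width, height) > cap;
}
```

Because Pass 3 runs before Pass 1, an image resized to the native cap no longer exceeds either conditional rejection cap, so it survives Pass 1 unless `MAX_DIM` is set below the native cap.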
+
+#### Optional `sharp` dependency
+
+Pass 3 requires [sharp](https://www.npmjs.com/package/sharp) for Lanczos resize. It's declared as an **optional peer dependency**; install it separately if you want Pass 3:
+
+```bash
+npm install sharp
+```
+
+If `sharp` is missing, Pass 3 skips cleanly (telemetry records `library_missing: true`); Pass 1, Pass 2, and the count cap still run.
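
Reviewer sketch (not part of the package): one common way to treat an optional peer dependency is a guarded dynamic `import()` that resolves to `null` when the module is absent. The function names and the return shape below are hypothetical; only the `library_missing` telemetry field and the Lanczos resize via `sharp` come from the README. The `resize` options used (`fit: "inside"`, `withoutEnlargement`, `kernel: "lanczos3"`) are documented sharp options, but whether the package uses these exact settings is an assumption.

```javascript
// Hypothetical loader: resolves to the sharp factory, or null when the
// optional peer dependency is not installed.
async function loadSharp() {
  try {
    const mod = await import("sharp");
    return mod.default ?? mod;
  } catch {
    return null;
  }
}

// Decide whether Pass 3 should touch an image. `image` is assumed to be
// { buffer, width, height }.
function pass3Decision(sharp, image, cap) {
  if (!sharp) return { run: false, telemetry: { library_missing: true } };
  const longEdge = Math.max(image.width, image.height);
  return { run: longEdge > cap, telemetry: { library_missing: false } };
}

// Resize so the long edge fits within the cap; the image passes through
// untouched when sharp is missing or the image is already small enough.
async function pass3Resize(sharp, image, cap) {
  const decision = pass3Decision(sharp, image, cap);
  if (!decision.run) return { image, telemetry: decision.telemetry };
  const { data, info } = await sharp(image.buffer)
    .resize({ width: cap, height: cap, fit: "inside", withoutEnlargement: true, kernel: "lanczos3" })
    .toBuffer({ resolveWithObject: true });
  return {
    image: { buffer: data, width: info.width, height: info.height },
    telemetry: decision.telemetry,
  };
}
```

A caller would run `const sharp = await loadSharp();` once at startup and pass the result into `pass3Resize` for each candidate image, so a missing library costs one failed import rather than one per request.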
+
+#### Precedence matrix
+
+| Env var combination | Behavior |
+|---|---|
+| Nothing set | No image processing (back-compat default; the extension short-circuits). |
+| `KEEP_LAST=N` only | Existing v3.2.1: count cap on tool_result images in user messages, runs first. No pipeline. |
+| `MAX_DIM=N` only | Existing v3.2.1: hard size cap, strip-only. No pipeline. |
+| `KEEP_LAST=N` + `MAX_DIM=N` | Existing v3.2.1 composition: `KEEP_LAST` runs first (drops count), then `MAX_DIM` runs on survivors (caps size). No pipeline, no Pass 2, no Pass 3. |
+| `IMAGE_GUARD=1` | New pipeline: Pass 1 (conditional cap) + Pass 2 (request-size guard) + image-count cap. |
+| `IMAGE_GUARD=1` + `MAX_DIM=N` | `MAX_DIM` overrides Pass 1's conditional cap (acts as the cap value); Pass 2 still runs. |
+| `IMAGE_GUARD=1` + `PRESERVE_DETAIL=1` | Adds Pass 3 (Lanczos resize via `sharp`). When `sharp` is unavailable, falls back to strip behavior. |
+| `IMAGE_GUARD=1` + `KEEP_LAST=N` | `KEEP_LAST` runs first as count cap (Pass 0); pipeline runs on remainder. |
+| `IMAGE_GUARD=1` + `KEEP_LAST=N` + `MAX_DIM=N` | Three-way: `KEEP_LAST` runs first; pipeline runs on remainder, but `MAX_DIM` overrides Pass 1's conditional cap; Pass 2 still runs. |
+| `PRESERVE_DETAIL=1` without `IMAGE_GUARD=1` | Logs a warning and treats it as a no-op. `PRESERVE_DETAIL` is meaningless without the pipeline running. |
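
Reviewer sketch (not part of the package): the matrix can be read as a small resolver from environment variables to an ordered list of passes. `resolveImagePlan`, the pass labels, and the returned object shape are hypothetical; the env var names and the ordering rules are taken from the README. Accepting only the literal value `"1"` for the boolean gates is an assumption.

```javascript
// Parse a positive integer env value; anything else counts as "unset".
function parsePositiveInt(value) {
  const n = Number.parseInt(value, 10);
  return Number.isInteger(n) && n > 0 ? n : null;
}

// Map an env object (for example process.env) to the passes that would run.
function resolveImagePlan(env) {
  const guard = env.CACHE_FIX_IMAGE_GUARD === "1";
  const preserve = env.CACHE_FIX_IMAGE_PRESERVE_DETAIL === "1";
  const keepLast = parsePositiveInt(env.CACHE_FIX_IMAGE_KEEP_LAST);
  const maxDim = parsePositiveInt(env.CACHE_FIX_IMAGE_MAX_DIM);

  const passes = [];
  const warnings = [];

  // KEEP_LAST always runs first, with or without the pipeline.
  if (keepLast !== null) passes.push("pass0");

  if (guard) {
    if (preserve) passes.push("pass3");
    passes.push("pass1", "pass2", "countCap");
  } else {
    // Legacy v3.2.1 behavior: MAX_DIM is a strip-only hard cap.
    if (maxDim !== null) passes.push("legacyMaxDim");
    if (preserve) warnings.push("PRESERVE_DETAIL ignored without IMAGE_GUARD");
  }

  // With the pipeline on, MAX_DIM replaces Pass 1's conditional cap.
  const pass1Cap = guard ? (maxDim ?? "conditional") : null;
  return { passes, pass1Cap, warnings };
}
```

For example, `resolveImagePlan({ CACHE_FIX_IMAGE_GUARD: "1", CACHE_FIX_IMAGE_KEEP_LAST: "3" })` yields Pass 0 followed by Pass 1, Pass 2, and the count cap, which matches the `IMAGE_GUARD=1` + `KEEP_LAST=N` row.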
+
+#### Tunables
+
+| Env var | Default | Purpose |
+|---------|---------|---------|
+| `CACHE_FIX_IMAGE_GUARD` | unset | Top-level pipeline gate (`=1` enables). |
+| `CACHE_FIX_IMAGE_PRESERVE_DETAIL` | unset | Enable Pass 3 Lanczos resize via `sharp`. |
+| `CACHE_FIX_IMAGE_REQUEST_SIZE_MAX` | 31457280 (30 MB) | Pass 2 byte budget. Leaves 2 MB of headroom below Anthropic's 32 MB ceiling. |
+| `CACHE_FIX_IMAGE_COUNT_MAX` | 100 | Hard image-count cap. Set to 600 for legacy Claude 1/2.x/Instant if needed. |
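
Reviewer sketch (not part of the package): Pass 2 and the count cap both drop the oldest images first, one against a byte budget and one against a count. The function name, the `{ id, bytes }` image shape, and the `baseBytes` parameter are hypothetical. The sketch also ignores the size of whatever placeholder replaces a dropped image, which the real extension would have to account for when it re-measures the request body.

```javascript
const DEFAULT_REQUEST_SIZE_MAX = 30 * 1024 * 1024; // 31457280 bytes
const DEFAULT_COUNT_MAX = 100;

// `images` is ordered oldest first; `baseBytes` is the request size
// excluding image payloads.
function dropOldest(images, baseBytes, sizeMax = DEFAULT_REQUEST_SIZE_MAX, countMax = DEFAULT_COUNT_MAX) {
  const kept = [...images];
  const dropped = [];
  let total = baseBytes + kept.reduce((sum, img) => sum + img.bytes, 0);

  // Pass 2: drop oldest images until the request fits the byte budget.
  while (kept.length > 0 && total > sizeMax) {
    const oldest = kept.shift();
    total -= oldest.bytes;
    dropped.push(oldest.id);
  }

  // Count cap: drop oldest survivors down to the cap.
  while (kept.length > countMax) {
    dropped.push(kept.shift().id);
  }

  return { kept, dropped };
}
```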
+
+## Cache breakpoints (proxy mode, opt-in)
+
+Anthropic's prompt cache supports up to **four** `cache_control` markers per request. Claude Code currently uses three of the four; the missing one, breakpoint #3, belongs between the auto-injected `messages[0]` content (hooks, skills, project CLAUDE.md, deferred tools, MCP server descriptions) and the first real user content. Without that marker, every change inside the auto-injected span busts the cache for everything that follows. wadabum projected ~6,500 token savings per fresh-session first turn from adding it ([anthropics/claude-code#47098](https://github.com/anthropics/claude-code/issues/47098)).
+
+The proxy can inject the missing marker on opt-in. Default off until validated against community data.
+
+```sh
+export CACHE_FIX_INJECT_MESSAGES_BREAKPOINT=1
+```
+
+The injection is conservative: it only fires when the request already carries 1–3 markers (the typical CC shape) and refuses if the request is at the 4-marker limit (a fifth would return HTTP 400) or has zero markers (the Agent SDK / API-direct shape this extension isn't built for). Boundary detection covers all five observed auto-injected block kinds (hooks, skills, CLAUDE.md, deferred-tools, MCP) and lands the marker on the LAST auto-injected block.
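
Reviewer sketch (not part of the package): the guard conditions above amount to counting existing `cache_control` markers and refusing outside the 1–3 range. The function names and the `isAutoInjected` predicate are hypothetical, and the sketch only counts markers on top-level `tools`, `system`, and message content blocks; the real boundary detection for the five block kinds lives in `messages-cache-breakpoint.mjs`, which is not shown in this part of the diff. The `{ type: "ephemeral" }` value is the marker type the Anthropic Messages API documents for prompt caching.

```javascript
// Count cache_control markers on tools, system blocks, and message blocks.
function countCacheControlMarkers(body) {
  let count = 0;
  const visit = (blocks) => {
    if (!Array.isArray(blocks)) return; // e.g. a plain-string system prompt
    for (const block of blocks) {
      if (block && typeof block === "object" && block.cache_control) count += 1;
    }
  };
  visit(body.tools);
  visit(body.system);
  for (const message of body.messages ?? []) visit(message.content);
  return count;
}

// Place a marker on the last auto-injected block of messages[0], if the
// request shape allows it. `isAutoInjected` is supplied by the caller.
function injectBreakpoint(body, isAutoInjected) {
  const markers = countCacheControlMarkers(body);
  if (markers === 0) return { injected: false, reason: "no-markers" };
  if (markers >= 4) return { injected: false, reason: "at-limit" };

  const head = body.messages?.[0]?.content;
  if (!Array.isArray(head)) return { injected: false, reason: "no-block-content" };

  let last = -1;
  head.forEach((block, i) => {
    if (isAutoInjected(block)) last = i;
  });
  if (last === -1) return { injected: false, reason: "no-auto-injected-blocks" };
  if (head[last].cache_control) return { injected: false, reason: "already-marked" };

  head[last] = { ...head[last], cache_control: { type: "ephemeral" } };
  return { injected: true, index: last };
}
```

Note that this sketch mutates `body.messages[0].content` in place; a proxy that needs the original request for diagnostics would clone it first.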
+
+A diagnostic-only env var dumps the structural shape of `messages[0]` for fixture sourcing without mutating the request:
+
+```sh
+export CACHE_FIX_DUMP_MESSAGES_HEAD=/tmp/messages-head.jsonl
+```
+
+| Env var | Default | Purpose |
+|---------|---------|---------|
+| `CACHE_FIX_INJECT_MESSAGES_BREAKPOINT` | unset | Enable breakpoint #3 injection (`=1` opt-in). |
+| `CACHE_FIX_DUMP_MESSAGES_HEAD` | unset | Diagnostic JSONL dump of `messages[0].content` shape; read-only, no mutation. |
+
+## Microcompact stability (proxy mode, opt-in)
+
+After ~90 minutes idle, Claude Code's `time_based_microcompact` (and the cold-compact path triggered by `FDY()`) replaces old `tool_result` content with a sentinel string. The original content is gone for cache purposes; that part is unrecoverable from the proxy. But the sentinel itself can carry an embedded timestamp (`[Old tool result content cleared at 2026-04-30T13:42:11Z]`), which means a *second* microcompact pass against the same already-cleared position writes different bytes, busting the cache for everything after that position even though no new content was added.
+
+This extension addresses the recoverable half: normalize the sentinel to a byte-stable canonical form so repeat microcompacts don't churn the cache. **Phase 1 only**: diagnostic plus opt-in normalization. Phase 2 (snapshot-and-restore of original tool_result content) is deferred to v3.5.0+ pending Phase 1 production data.
+
+```sh
+# Step 1 (diagnostic): characterize what CC's sentinel actually looks like.
+export CACHE_FIX_DUMP_MICROCOMPACT=/tmp/microcompact-dump.jsonl
+
+# Step 2 (normalize): once the sentinel format is confirmed, opt-in.
+export CACHE_FIX_NORMALIZE_MICROCOMPACT=1
+```
+
+Detection has two modes:
+- **Mode A**: exact match against confirmed CC sentinel patterns (the bare form and the ISO-8601 timestamp variant). Mode A matches are eligible for normalization.
+- **Mode B**: prefix-only match (text begins with `[Old tool result content cleared` but does not exactly match a Mode A pattern). Mode B is **diagnostic-only**: it is never normalized, and dump records are redacted to a 64-char prefix.
+
+The Mode A/B separation protects against cases where the sentinel might be followed by user-derived content (e.g., a tool that echoed user input back into its result); the redaction guarantee on Mode B keeps that content out of the diagnostic dump.
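
Reviewer sketch (not part of the package): the two detection modes can be expressed as an anchored exact match followed by a prefix check. The function names and the regexes below are illustrative guesses built from the single timestamped example in the README; the patterns actually shipped in `microcompact-stability.mjs` are not visible in this part of the diff. Only the prefix string, the default canonical form, the 64-char redaction default, and the `sentinel_text` field name come from the source.

```javascript
const SENTINEL_PREFIX = "[Old tool result content cleared";
const CANONICAL = "[Old tool result content cleared]";

// Assumed Mode A patterns: the bare form and an ISO-8601 UTC timestamp form.
// No `g` flag, so RegExp.prototype.test stays stateless between calls.
const MODE_A_PATTERNS = [
  /^\[Old tool result content cleared\]$/,
  /^\[Old tool result content cleared at \d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?Z\]$/,
];

// "A" for an exact sentinel, "B" for prefix-only, null for anything else.
function classifySentinel(text) {
  if (MODE_A_PATTERNS.some((re) => re.test(text))) return "A";
  if (text.startsWith(SENTINEL_PREFIX)) return "B";
  return null;
}

// Only Mode A is rewritten; Mode B and ordinary text pass through untouched.
function normalizeSentinel(text, canonical = CANONICAL) {
  return classifySentinel(text) === "A" ? canonical : text;
}

// Dump record: Mode B text is truncated so trailing user content is not logged.
function dumpRecord(text, redactLen = 64) {
  const mode = classifySentinel(text);
  if (mode === null) return null;
  const sentinel_text = mode === "A" ? text : text.slice(0, redactLen);
  return { mode, sentinel_text };
}
```

With this shape, two microcompact passes that differ only in their timestamp normalize to identical bytes, which is the property the extension relies on to keep the cache prefix stable.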
+
+| Env var | Default | Purpose |
+|---------|---------|---------|
+| `CACHE_FIX_DUMP_MICROCOMPACT` | unset | Path for diagnostic JSONL dump of detected sentinels. Read-only; no mutation. |
+| `CACHE_FIX_NORMALIZE_MICROCOMPACT` | unset | Enable normalization (`=1` opts in). Mutates Mode A matches to canonical form. |
+| `CACHE_FIX_MICROCOMPACT_NORMALIZED` | `[Old tool result content cleared]` | Override the canonical replacement string. |
+| `CACHE_FIX_MICROCOMPACT_SENTINEL_PATTERN_<N>` | unset | Add custom Mode A regex pattern(s). Numbered (1-indexed; gaps in the numbering are fine). |
+| `CACHE_FIX_MICROCOMPACT_SENTINEL_PREFIX_<N>` | unset | Custom Mode B literal prefix(es). Pair with a custom Mode A pattern from a non-default sentinel family so prefix-only variants of that family also get redacted Mode B capture. |
+| `CACHE_FIX_MICROCOMPACT_REDACT_LEN` | `64` | Mode B prefix length in dump records. Set to `0` to suppress the prefix entirely. |
+| `CACHE_FIX_DUMP_MICROCOMPACT_INCLUDE_NORMALIZED` | unset | Add post-normalization text alongside (not replacing) raw `sentinel_text` in dump records. |
+
 ## System prompt rewrite (preload mode, optional)

 The interceptor can rewrite Claude Code's `# Output efficiency` system-prompt section. Disabled by default. Enable with `CACHE_FIX_OUTPUT_EFFICIENCY_REPLACEMENT`. See [docs/output-efficiency-prompts.md](docs/output-efficiency-prompts.md) for the three known prompt variants and usage instructions.
package/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "claude-code-cache-fix",
-  "version": "3.
+  "version": "3.4.0",
   "description": "Cache optimization proxy and interceptor for Claude Code. Fixes prompt cache bugs, stabilizes prefix, reduces quota burn.",
   "type": "module",
   "exports": "./preload.mjs",
@@ -27,6 +27,11 @@
   "dependencies": {
     "hpagent": "^1.2.0"
   },
+  "peerDependenciesMeta": {
+    "sharp": {
+      "optional": true
+    }
+  },
   "keywords": [
     "claude-code",
     "claude",
@@ -48,5 +53,5 @@
     "url": "https://buymeacoffee.com/vsits"
   },
   "license": "MIT",
-  "author": "Chris Nighswonger <
+  "author": "Chris Nighswonger <dev@vsits.co> (https://vsits.co)"
 }
package/proxy/extensions/identity-normalization.mjs
CHANGED
@@ -2,7 +2,7 @@ import { createHash } from "node:crypto";

 const _pinnedBlocks = new Map();

-const SESSION_START_RESUME_MARKER = /SessionStart:
+const SESSION_START_RESUME_MARKER = /SessionStart:resume hook success:/g;
 const SESSION_START_ID_TAG = /\n?<session-id>[^<]*<\/session-id>/g;
 const SESSION_START_LAST_ACTIVE_LINE = /\nLast active:[^\n]*/g;
 const CONTINUE_TRAILER_TEXT = "Continue from where you left off.";
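
Reviewer sketch (not part of the package): the hunk shows only the constant definitions, so how the extension applies them is not visible here. Assuming the session-id and last-active patterns exist to remove per-session fields so that resumed sessions produce identical bytes, a minimal use would look like the following. `stripSessionFields` is a hypothetical name, and the two regexes are copied from the unchanged context lines above.

```javascript
const SESSION_START_ID_TAG = /\n?<session-id>[^<]*<\/session-id>/g;
const SESSION_START_LAST_ACTIVE_LINE = /\nLast active:[^\n]*/g;

// Remove fields that differ between sessions. String.prototype.replace
// with a global regex replaces every match and ignores lastIndex, so the
// shared constants can be reused safely across calls.
function stripSessionFields(text) {
  return text
    .replace(SESSION_START_ID_TAG, "")
    .replace(SESSION_START_LAST_ACTIVE_LINE, "");
}
```

One caveat with `g`-flagged module-level regexes: calling `.test()` or `.exec()` on them advances `lastIndex`, so repeated checks against different strings can give alternating results. Sticking to `replace` avoids that.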