claude-code-cache-fix 3.4.0 → 3.5.0

package/README.ko.md CHANGED
@@ -39,7 +39,7 @@ ANTHROPIC_BASE_URL=http://127.0.0.1:9801 claude
  | `identity-normalization` | Normalizes message ID fields for prefix stability |
  | `fresh-session-sort` | Fixes non-deterministic ordering on the first turn |
  | `cache-control-normalize` | Normalizes cache_control markers across messages |
- | `cache-telemetry` | Extracts cache stats from response headers and writes them to `~/.claude/quota-status.json` |
+ | `cache-telemetry` | Extracts cache stats from response headers and writes them to `~/.claude/quota-status/{account.json,sessions/<id>.json}` |

  Extensions are hot-reloaded: add, remove, or modify `.mjs` files in `proxy/extensions/` and the change applies from the next request without restarting the proxy. Configuration lives in `proxy/extensions.json`.

@@ -202,7 +202,7 @@ Fixes are disabled — consider re-enabling to recover cache performance.

  ## Status line: real-time quota warnings

- Both proxy and preload modes write quota state to `~/.claude/quota-status.json` on every API call. The included `tools/quota-statusline.sh` script can display live status:
+ Both modes write quota state on every API call. Proxy mode (v3.5.0+) splits it into `~/.claude/quota-status/account.json` (account-global: Q5h/Q7d, status, overage) and `~/.claude/quota-status/sessions/<id>.json` (per-session: TTL tier, hit rate). Preload mode keeps the legacy `~/.claude/quota-status.json` (single-session by construction). The included `tools/quota-statusline.sh` script can display live status:

  - **Q5h %** (burn rate, %/min)
  - **Q7d %** (burn rate, %/hr)
@@ -292,7 +292,7 @@ npm install sharp

  ## Monitoring & diagnostics

- The preload interceptor includes monitoring for microcompact degradation, false rate limiters, GrowthBook flag state, usage telemetry, and cost reports. Quota tracking works in both proxy and preload modes via `~/.claude/quota-status.json`.
+ The preload interceptor includes monitoring for microcompact degradation, false rate limiters, GrowthBook flag state, usage telemetry, and cost reports. Quota tracking works via `~/.claude/quota-status/` (proxy: per-session split) or `~/.claude/quota-status.json` (preload: single-session legacy path).

  See [docs/monitoring.md](docs/monitoring.md) for full details, debug mode, prefix diffing, environment variables, and the built-in quota analysis tool.

package/README.md CHANGED
@@ -39,7 +39,7 @@ On every `/v1/messages` request, 7 extensions run in order:
  | `identity-normalization` | Normalizes message identity fields for prefix stability |
  | `fresh-session-sort` | Fixes non-deterministic ordering on first turn |
  | `cache-control-normalize` | Normalizes cache_control markers across messages |
- | `cache-telemetry` | Extracts cache stats from response headers → `~/.claude/quota-status.json` |
+ | `cache-telemetry` | Extracts cache stats from response headers → `~/.claude/quota-status/{account.json,sessions/<id>.json}` |

  Extensions are hot-reloadable — add, remove, or modify `.mjs` files in `proxy/extensions/` and changes apply to the next request without restarting. Configuration in `proxy/extensions.json`.

@@ -280,7 +280,7 @@ The interceptor can only *help* or *do nothing*. It cannot make things worse.

  ## Status line — quota warnings in real time

- Both proxy and preload modes write quota state to `~/.claude/quota-status.json` on every API call. The included `tools/quota-statusline.sh` script displays a live status line showing:
+ Both modes write quota state on every API call. Proxy mode (v3.5.0+) splits into `~/.claude/quota-status/account.json` (account-global fields: Q5h/Q7d, status, overage) plus `~/.claude/quota-status/sessions/<id>.json` (per-session cache fields: TTL tier, hit rate). Preload mode keeps the legacy `~/.claude/quota-status.json` (single-session by construction). The included `tools/quota-statusline.sh` script displays a live status line showing:

  - **Q5h %** with burn rate (%/min)
  - **Q7d %** with burn rate (%/hr)
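
To make the two layouts in the paragraph above concrete, here is a minimal sketch of which files a reader would consult in each mode. The helper name `quotaStatusPaths` and the `mode` strings are hypothetical and not part of the package; only the paths themselves come from the README text.

```javascript
// Hypothetical helper (not part of the package): which quota-status files
// a reader would consult, per the layout described in the README.
function quotaStatusPaths(home, mode, sessionFile) {
  const base = `${home}/.claude`;
  if (mode === "proxy") {
    // v3.5.0+ proxy mode: one account-global file plus one file per session.
    return {
      account: `${base}/quota-status/account.json`,
      session: `${base}/quota-status/sessions/${sessionFile}.json`,
    };
  }
  // Preload mode keeps the single legacy file for both kinds of fields.
  const legacy = `${base}/quota-status.json`;
  return { account: legacy, session: legacy };
}

console.log(quotaStatusPaths("/home/alice", "proxy", "abc-123"));
console.log(quotaStatusPaths("/home/alice", "preload", "abc-123"));
```

The `sessionFile` argument is assumed to be an already sanitized filename, not a raw session id.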
@@ -468,7 +468,7 @@ The interceptor can rewrite Claude Code's `# Output efficiency` system-prompt se

  ## Monitoring & diagnostics

- The preload interceptor includes monitoring for microcompact degradation, false rate limiters, GrowthBook flag state, usage telemetry, and cost reporting. Quota tracking works in both proxy and preload modes via `~/.claude/quota-status.json`.
+ The preload interceptor includes monitoring for microcompact degradation, false rate limiters, GrowthBook flag state, usage telemetry, and cost reporting. Quota tracking works in both proxy and preload modes via `~/.claude/quota-status/` (proxy: per-session split) or `~/.claude/quota-status.json` (preload: single-session legacy path).

  See [docs/monitoring.md](docs/monitoring.md) for full details, debug mode, prefix diffing, environment variables, and the bundled quota analysis tool.

@@ -493,6 +493,7 @@ We monitor 30+ upstream Claude Code issues related to cache, quota, and context
  ## Used in production

  - **[Crunchloop DAP](https://dap.crunchloop.ai)** — Agent SDK / DAP development environment. First production team to merge the interceptor to trunk for team-wide deployment (2026-04-10). Identified two distinct cache regression patterns through real-world testing — tool ordering jitter and the fresh-session sort gap — and contributed debug traces that drove the v1.5.1 and v1.6.2 fixes.
+ - **[VM Farms](https://vmfarms.com)** ([@vmfarms](https://github.com/vmfarms)) — Agent development environment running concurrent multi-runner workloads with `--resume --fork-session`. Surfaced three cache-fix proxy-mode bugs: the resume-marker regex no-op (#96), TTL tier detection gap vs preload mode (#97), and image-strip stderr leak past `CACHE_FIX_DEBUG` (#98) — all addressed in the v3.4.0 release.

  ## Contributors

@@ -509,6 +510,7 @@ We monitor 30+ upstream Claude Code issues related to cache, quota, and context
  - **[@JEONG-JIWOO](https://github.com/JEONG-JIWOO)** — VS Code extension investigation: discovered `claudeCode.claudeProcessWrapper` as the working integration path, wrote the C wrapper for Windows (#16)
  - **[@X-15](https://github.com/X-15)** — VS Code extension validation, per-fix health status analysis confirming safety check behavior on v2.1.105 (#16)
  - **[@deafsquad](https://github.com/deafsquad)** — Universal smoosh_split un-smoosh fix (PR #26), source-level function attribution of resume scatter bug (anthropics/claude-code#43657), OTEL telemetry discovery, proposed and built proxy architecture for v3.0.0
+ - **[@vmfarms](https://github.com/vmfarms)** — Concurrent multi-runner production validation, surfaced proxy-mode resume-marker regex no-op (#96), TTL tier detection gap (#97), and image-strip stderr leak (#98)

  If you contributed to the community effort on these issues and aren't listed here, please open an issue or PR — we want to credit everyone properly.

package/README.zh.md CHANGED
@@ -39,7 +39,7 @@ ANTHROPIC_BASE_URL=http://127.0.0.1:9801 claude
  | `identity-normalization` | Normalizes message identity fields for prefix stability |
  | `fresh-session-sort` | Fixes non-deterministic ordering on the first turn |
  | `cache-control-normalize` | Normalizes cache_control markers across messages |
- | `cache-telemetry` | Extracts cache stats from response headers → `~/.claude/quota-status.json` |
+ | `cache-telemetry` | Extracts cache stats from response headers → `~/.claude/quota-status/{account.json,sessions/<id>.json}` |

  Extensions support hot reload: add, remove, or modify `.mjs` files in `proxy/extensions/` and the change takes effect on the next request, with no restart needed. Configuration is in `proxy/extensions.json`.

@@ -167,7 +167,7 @@ NODE_OPTIONS="--import claude-code-cache-fix" claude

  ## Status line: real-time quota warnings

- Both proxy and preload modes write quota state to `~/.claude/quota-status.json` on every API call. The built-in `tools/quota-statusline.sh` script displays a live status line:
+ Both modes write quota state on every API call. Proxy mode (v3.5.0+) splits it into `~/.claude/quota-status/account.json` (account-level: Q5h/Q7d, status, overage) and `~/.claude/quota-status/sessions/<id>.json` (per-session: TTL tier, hit rate). Preload mode keeps the legacy `~/.claude/quota-status.json` (single-session by construction). The built-in `tools/quota-statusline.sh` script displays a live status line:

  - **Q5h %** with burn rate (%/min)
  - **Q7d %** with burn rate (%/hr)
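@@ -200,7 +200,7 @@ export CACHE_FIX_IMAGE_KEEP_LAST=3

  ## Monitoring and diagnostics

- The preload interceptor includes monitoring for microcompact degradation, false rate limiters, GrowthBook flag state, usage telemetry, and cost reports. Quota tracking works in both proxy and preload modes via `~/.claude/quota-status.json`.
+ The preload interceptor includes monitoring for microcompact degradation, false rate limiters, GrowthBook flag state, usage telemetry, and cost reports. Quota tracking works via `~/.claude/quota-status/` (proxy: per-session split) or `~/.claude/quota-status.json` (preload: single-session legacy path).

  See [docs/monitoring.md](docs/monitoring.md) for full details, debug mode, prefix diffing, environment variables, and the built-in quota analysis tool.
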
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "claude-code-cache-fix",
-   "version": "3.4.0",
+   "version": "3.5.0",
    "description": "Cache optimization proxy and interceptor for Claude Code. Fixes prompt cache bugs, stabilizes prefix, reduces quota burn.",
    "type": "module",
    "exports": "./preload.mjs",
package/proxy/extensions/cache-telemetry.mjs CHANGED
@@ -1,8 +1,63 @@
- import { writeFileSync, mkdirSync } from "node:fs";
+ import {
+   writeFileSync,
+   renameSync,
+   unlinkSync,
+   mkdirSync,
+   readdirSync,
+   statSync,
+ } from "node:fs";
  import { join } from "node:path";
  import { homedir } from "node:os";
+ import { createHash, randomBytes } from "node:crypto";

- const QUOTA_PATH = join(homedir(), ".claude", "quota-status.json");
+ // Paths are resolved per call (not cached at module load) so tests can swap
+ // $HOME between cases. The homedir() call is essentially free.
+ function paths() {
+   const home = homedir();
+   const quotaDir = join(home, ".claude", "quota-status");
+   return {
+     quotaDir,
+     accountPath: join(quotaDir, "account.json"),
+     sessionsDir: join(quotaDir, "sessions"),
+     legacyPath: join(home, ".claude", "quota-status.json"),
+   };
+ }
+
+ const SAFE_NAME_RE = /^[A-Za-z0-9_-]{1,128}$/;
+ const SWEEP_THROTTLE_MS = 60_000;
+ const DEFAULT_TTL_DAYS = 7;
+
+ // --- Module-scope state ---
+ let legacyCleanupDone = false;
+ let lastSweepMs = 0;
+
+ // Per directive `proxy-quota-status-per-session.md` — derive a filesystem-safe
+ // filename from a raw session id. Both writer (this extension) and readers
+ // (tools/quota-statusline.sh, etc.) must apply the same rule.
+ //
+ // Rules:
+ // - null/undefined/empty/whitespace → "unknown"
+ // - matches /^[A-Za-z0-9_-]{1,128}$/ → raw passthrough
+ // - else → "inv-" + sha256(raw)[:16]
+ //
+ // Exported for unit testing and for the directive's writer/reader contract.
+ export function sessionFilename(rawId) {
+   if (rawId === null || rawId === undefined) return "unknown";
+   const s = String(rawId).trim();
+   if (s.length === 0) return "unknown";
+   if (SAFE_NAME_RE.test(s)) return s;
+   return "inv-" + createHash("sha256").update(s).digest("hex").slice(0, 16);
+ }
+
+ function resolveSessionId(headers) {
+   if (!headers) return null;
+   const sid =
+     headers["x-claude-code-session-id"] ||
+     headers["x-session-id"] ||
+     headers["x-anthropic-session-id"] ||
+     null;
+   return sid || null;
+ }

  function parseHeaders(headers) {
    const get = (key) => headers[key] || "";
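
The filename rule introduced in this hunk can be exercised on its own. The sketch below re-types the rule from the diff so it runs standalone; the sample ids are made up for illustration. The hashed results are not shown because they depend on the SHA-256 of each input.

```javascript
import { createHash } from "node:crypto";

// Same allowlist as in the diff: 1 to 128 letters, digits, "_" or "-".
const SAFE_NAME_RE = /^[A-Za-z0-9_-]{1,128}$/;

// Mirrors the rule shown in the diff; re-typed here so the sketch runs alone.
function sessionFilename(rawId) {
  if (rawId === null || rawId === undefined) return "unknown";
  const s = String(rawId).trim();
  if (s.length === 0) return "unknown";
  if (SAFE_NAME_RE.test(s)) return s;
  // Anything else (path separators, dots, over-long ids) is hashed.
  return "inv-" + createHash("sha256").update(s).digest("hex").slice(0, 16);
}

// Illustrative inputs (invented for this sketch).
for (const id of [null, "   ", "abc-123_DEF", "../etc/passwd", "a.b"]) {
  console.log(JSON.stringify(id), "->", sessionFilename(id));
}
```

Because unsafe ids are hashed instead of rejected, an id such as `../etc/passwd` cannot escape the `sessions/` directory, and the same raw id always maps to the same file.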
@@ -44,9 +99,56 @@ function parseHeaders(headers) {
    };
  }

+ function atomicWrite(finalPath, content) {
+   const tmp = `${finalPath}.tmp.${process.pid}.${randomBytes(4).toString("hex")}`;
+   writeFileSync(tmp, content);
+   renameSync(tmp, finalPath);
+ }
+
+ function cleanupLegacyOnce() {
+   if (legacyCleanupDone) return;
+   legacyCleanupDone = true;
+   try {
+     unlinkSync(paths().legacyPath);
+   } catch {}
+ }
+
+ function sweepStaleSessions(ttlDays) {
+   const now = Date.now();
+   if (now - lastSweepMs < SWEEP_THROTTLE_MS) return;
+   lastSweepMs = now;
+
+   const cutoffMs = now - ttlDays * 86_400_000;
+   const { sessionsDir } = paths();
+   let entries;
+   try {
+     entries = readdirSync(sessionsDir);
+   } catch {
+     return;
+   }
+   for (const name of entries) {
+     const p = join(sessionsDir, name);
+     try {
+       const st = statSync(p);
+       if (st.mtimeMs < cutoffMs) {
+         try {
+           unlinkSync(p);
+         } catch {}
+       }
+     } catch {}
+   }
+ }
+
+ function getTtlDays() {
+   const raw = process.env.CACHE_FIX_QUOTA_STATUS_TTL_DAYS;
+   if (raw === undefined || raw === "") return DEFAULT_TTL_DAYS;
+   const n = Number(raw);
+   return Number.isFinite(n) && n >= 0 ? n : DEFAULT_TTL_DAYS;
+ }
+
  export default {
    name: "cache-telemetry",
-   description: "Extract cache stats from response stream, persist quota state to ~/.claude/quota-status.json",
+   description: "Extract cache stats from response stream, persist quota state to ~/.claude/quota-status/{account.json,sessions/<filename>.json}",
    order: 600,

    async onResponseStart(ctx) {
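
The TTL parsing and the stale-file cutoff added in this hunk are plain arithmetic and can be checked in isolation. In the sketch below, `ttlDaysFrom` takes the environment as an argument and `isStale` is split out as a separate function. Both are assumptions made for testability, not the package's API. Only the variable name `CACHE_FIX_QUOTA_STATUS_TTL_DAYS`, the default of 7 days, and the comparison come from the diff.

```javascript
const DEFAULT_TTL_DAYS = 7;
const MS_PER_DAY = 86_400_000;

// Same parsing rule as getTtlDays in the diff, but reading from an object
// passed in by the caller (an assumption of this sketch).
function ttlDaysFrom(env) {
  const raw = env.CACHE_FIX_QUOTA_STATUS_TTL_DAYS;
  if (raw === undefined || raw === "") return DEFAULT_TTL_DAYS;
  const n = Number(raw);
  // Non-numeric and negative values fall back to the default.
  return Number.isFinite(n) && n >= 0 ? n : DEFAULT_TTL_DAYS;
}

// A file is stale when its mtime is older than "now minus TTL".
function isStale(mtimeMs, nowMs, ttlDays) {
  return mtimeMs < nowMs - ttlDays * MS_PER_DAY;
}

console.log(ttlDaysFrom({}));                                         // 7
console.log(ttlDaysFrom({ CACHE_FIX_QUOTA_STATUS_TTL_DAYS: "0.5" })); // 0.5
console.log(ttlDaysFrom({ CACHE_FIX_QUOTA_STATUS_TTL_DAYS: "-1" }));  // 7
console.log(ttlDaysFrom({ CACHE_FIX_QUOTA_STATUS_TTL_DAYS: "abc" })); // 7
```

Under this rule, a value of `0` puts the cutoff at the current time, so any file whose mtime is earlier than the moment of the sweep counts as stale.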
@@ -56,6 +158,7 @@ export default {
      if (!quota) return;

      ctx.meta._quotaData = quota;
+     ctx.meta._sessionId = resolveSessionId(ctx.headers);
    },

    async onStreamEvent(ctx) {
@@ -89,24 +192,43 @@ export default {

        const ttl = cr > 0 ? "1h" : (cc > 0 ? "5m" : "unknown");

-       const output = {
-         cache: {
-           ttl_tier: ttl,
-           cache_creation: cc,
-           cache_read: cr,
-           ephemeral_1h: ephemeral1h,
-           ephemeral_5m: ephemeral5m,
-           hit_rate: hitRate,
-           timestamp: new Date().toISOString(),
+       const timestamp = new Date().toISOString();
+       const rawSid = ctx.meta._sessionId;
+       const filename = sessionFilename(rawSid);
+
+       const accountPayload = JSON.stringify({ ...quota, timestamp }, null, 2);
+       const sessionPayload = JSON.stringify(
+         {
+           cache: {
+             ttl_tier: ttl,
+             cache_creation: cc,
+             cache_read: cr,
+             ephemeral_1h: ephemeral1h,
+             ephemeral_5m: ephemeral5m,
+             hit_rate: hitRate,
+             timestamp,
+           },
+           timestamp,
+           session_id: rawSid,
          },
-         timestamp: new Date().toISOString(),
-         ...quota,
-       };
+         null,
+         2,
+       );

        try {
-         mkdirSync(join(homedir(), ".claude"), { recursive: true });
-         writeFileSync(QUOTA_PATH, JSON.stringify(output, null, 2));
+         cleanupLegacyOnce();
+         const { sessionsDir, accountPath } = paths();
+         mkdirSync(sessionsDir, { recursive: true });
+         atomicWrite(accountPath, accountPayload);
+         atomicWrite(join(sessionsDir, `${filename}.json`), sessionPayload);
+         sweepStaleSessions(getTtlDays());
        } catch {}
      }
    },
+
+   // Test-only: reset module state between tests.
+   __resetForTests() {
+     legacyCleanupDone = false;
+     lastSweepMs = 0;
+   },
  };
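
The write path above replaces a direct `writeFileSync` with a write to a temporary file followed by a rename. The sketch below demonstrates that pattern in a throwaway directory. The directory prefix and payloads are invented for the example, and the atomicity comment assumes a POSIX filesystem where the temp file and the target share a directory.

```javascript
import { mkdtempSync, readFileSync, readdirSync, renameSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";
import { randomBytes } from "node:crypto";

// Write to a uniquely named sibling file, then rename over the target.
// On POSIX filesystems a rename within one directory replaces the target
// in a single step, so a concurrent reader sees old or new content,
// never a half-written file.
function atomicWrite(finalPath, content) {
  const tmp = `${finalPath}.tmp.${process.pid}.${randomBytes(4).toString("hex")}`;
  writeFileSync(tmp, content);
  renameSync(tmp, finalPath);
}

// Exercise it in a scratch directory (name chosen for this sketch only).
const dir = mkdtempSync(join(tmpdir(), "quota-status-sketch-"));
const target = join(dir, "account.json");
atomicWrite(target, JSON.stringify({ n: 1 }));
atomicWrite(target, JSON.stringify({ n: 2 }));

console.log(readFileSync(target, "utf8")); // {"n":2}
console.log(readdirSync(dir));             // [ 'account.json' ]
```

The process id and random suffix keep concurrent writers from colliding on the temporary name, which matters when several proxy processes update `account.json` at once.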
@@ -234,6 +234,7 @@ export function normalizeToolResultContent(messages, match, canonicalText) {
  function hashSessionId(reqCtx) {
    const sid =
      reqCtx?.meta?.session_id ||
+     reqCtx?.headers?.["x-claude-code-session-id"] ||
      reqCtx?.headers?.["x-session-id"] ||
      reqCtx?.headers?.["x-anthropic-session-id"] ||
      null;
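
The lookup order after this change can be written as a small standalone function. `pickSessionId` and `SESSION_HEADERS` are hypothetical names used only in this sketch. The order of the sources follows the hunk above, and header keys are assumed to be lowercase already.

```javascript
// Header names in the order the code in this hunk checks them.
const SESSION_HEADERS = [
  "x-claude-code-session-id",
  "x-session-id",
  "x-anthropic-session-id",
];

// Hypothetical standalone version: prefer meta.session_id, then the first
// non-empty header in the list, else null.
function pickSessionId(reqCtx) {
  const fromMeta = reqCtx?.meta?.session_id;
  if (fromMeta) return fromMeta;
  for (const name of SESSION_HEADERS) {
    const value = reqCtx?.headers?.[name];
    if (value) return value;
  }
  return null;
}

console.log(pickSessionId({ headers: { "x-session-id": "b", "x-claude-code-session-id": "a" } })); // a
console.log(pickSessionId({ meta: { session_id: "m" }, headers: { "x-session-id": "b" } }));       // m
console.log(pickSessionId({}));                                                                     // null
```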
@@ -19,7 +19,9 @@ set -euo pipefail

  CLAUDE_CLI="$HOME/.npm-global/lib/node_modules/@anthropic-ai/claude-code/cli.js"
  PRELOAD="$HOME/.claude/cache-fix-preload.mjs"
- QUOTA_FILE="$HOME/.claude/quota-status.json"
+ QUOTA_DIR="$HOME/.claude/quota-status"
+ ACCOUNT_FILE="$QUOTA_DIR/account.json"
+ SESSIONS_DIR="$QUOTA_DIR/sessions"
  USAGE_LOG="$HOME/.claude/usage.jsonl"
  DEBUG_LOG="$HOME/.claude/cache-fix-debug.log"
  REPORT_DIR="/tmp/cache-test-$(date +%Y%m%d_%H%M%S)"
@@ -54,21 +56,27 @@ echo ""

  mkdir -p "$REPORT_DIR"

- # Helper: snapshot cache state from quota-status.json
+ # Helper: snapshot cache state from the most-recent per-session quota-status
+ # file. Each one-shot CC invocation generates its own session, so the latest
+ # sessions/<filename>.json corresponds to the call we just made.
  snapshot_cache() {
      local label="$1"
      local outfile="$REPORT_DIR/${label}.json"
-     if [ -f "$QUOTA_FILE" ]; then
-         cp "$QUOTA_FILE" "$outfile"
-         local tier=$(python3 -c "import json; d=json.load(open('$QUOTA_FILE')); print(d.get('cache',{}).get('ttl_tier','?'))" 2>/dev/null || echo "?")
-         local create=$(python3 -c "import json; d=json.load(open('$QUOTA_FILE')); print(d.get('cache',{}).get('cache_creation',0))" 2>/dev/null || echo "?")
-         local read=$(python3 -c "import json; d=json.load(open('$QUOTA_FILE')); print(d.get('cache',{}).get('cache_read',0))" 2>/dev/null || echo "?")
-         local e1h=$(python3 -c "import json; d=json.load(open('$QUOTA_FILE')); print(d.get('cache',{}).get('ephemeral_1h',0))" 2>/dev/null || echo "?")
-         local e5m=$(python3 -c "import json; d=json.load(open('$QUOTA_FILE')); print(d.get('cache',{}).get('ephemeral_5m',0))" 2>/dev/null || echo "?")
-         local hit=$(python3 -c "import json; d=json.load(open('$QUOTA_FILE')); print(d.get('cache',{}).get('hit_rate','?'))" 2>/dev/null || echo "?")
+     local sess_file=""
+     if [ -d "$SESSIONS_DIR" ]; then
+         sess_file=$(ls -t "$SESSIONS_DIR"/*.json 2>/dev/null | head -1)
+     fi
+     if [ -n "$sess_file" ] && [ -f "$sess_file" ]; then
+         cp "$sess_file" "$outfile"
+         local tier=$(python3 -c "import json; d=json.load(open('$sess_file')); print(d.get('cache',{}).get('ttl_tier','?'))" 2>/dev/null || echo "?")
+         local create=$(python3 -c "import json; d=json.load(open('$sess_file')); print(d.get('cache',{}).get('cache_creation',0))" 2>/dev/null || echo "?")
+         local read=$(python3 -c "import json; d=json.load(open('$sess_file')); print(d.get('cache',{}).get('cache_read',0))" 2>/dev/null || echo "?")
+         local e1h=$(python3 -c "import json; d=json.load(open('$sess_file')); print(d.get('cache',{}).get('ephemeral_1h',0))" 2>/dev/null || echo "?")
+         local e5m=$(python3 -c "import json; d=json.load(open('$sess_file')); print(d.get('cache',{}).get('ephemeral_5m',0))" 2>/dev/null || echo "?")
+         local hit=$(python3 -c "import json; d=json.load(open('$sess_file')); print(d.get('cache',{}).get('hit_rate','?'))" 2>/dev/null || echo "?")
          echo " [$label] TTL=$tier create=$create read=$read 1h=$e1h 5m=$e5m hit=$hit%"
      else
-         echo " [$label] No quota-status.json found"
+         echo " [$label] No per-session quota-status file found in $SESSIONS_DIR"
      fi
  }

@@ -101,7 +101,7 @@ done
  Q5H=$(python3 -c "
  import json
  try:
-     q = json.load(open('$HOME/.claude/quota-status.json'))
+     q = json.load(open('$HOME/.claude/quota-status/account.json'))
      print(q['five_hour']['pct'])
  except Exception:
      print(0)
@@ -116,7 +116,7 @@ echo "Preflight OK: Q5h at ${Q5H}%, 4 versions installed, launcher present." | t
  echo "" | tee -a "$SUMMARY"

  # Snapshot quota state at start
- cp "$HOME/.claude/quota-status.json" "$OUTPUT_DIR/raw-quota-status-start.json" 2>/dev/null || true
+ cp "$HOME/.claude/quota-status/account.json" "$OUTPUT_DIR/raw-quota-status-start.json" 2>/dev/null || true

  # ─── Phase A: steady-state per version ─────────────────────────────────────

@@ -189,7 +189,7 @@ if [[ "$INCLUDE_IDLE" -eq 1 ]]; then
  fi

  # Snapshot quota state at end
- cp "$HOME/.claude/quota-status.json" "$OUTPUT_DIR/raw-quota-status-end.json" 2>/dev/null || true
+ cp "$HOME/.claude/quota-status/account.json" "$OUTPUT_DIR/raw-quota-status-end.json" 2>/dev/null || true

  # ─── Analysis ──────────────────────────────────────────────────────────────

@@ -295,7 +295,7 @@ if [[ "$Q5H" -lt 50 ]]; then
      NEW_Q5H=$(python3 -c "
  import json
  try:
-     print(json.load(open('$HOME/.claude/quota-status.json'))['five_hour']['pct'])
+     print(json.load(open('$HOME/.claude/quota-status/account.json'))['five_hour']['pct'])
  except Exception:
      print('?')
  " 2>/dev/null)
package/tools/quota-statusline.sh CHANGED
@@ -1,25 +1,82 @@
  #!/bin/bash
- # Status line: show quota % and burn rate from quota-status.json
+ # Status line: show quota % and burn rate from per-session quota-status files.
  # Written by cache-fix proxy's cache-telemetry extension on every API call.
+ #
+ # Layout (post-v3.5.0):
+ # ~/.claude/quota-status/account.json — global quota fields (5h/7d, status, overage)
+ # ~/.claude/quota-status/sessions/<filename>.json — per-session cache fields (ttl_tier, hit_rate)
+ #
+ # CC pipes hook input as JSON on stdin including `session_id`, which we map to
+ # the per-session filename via the canonical rule (matches the writer in
+ # proxy/extensions/cache-telemetry.mjs:sessionFilename).

  input=$(cat)

- QS="$HOME/.claude/quota-status.json"
+ ACCOUNT="$HOME/.claude/quota-status/account.json"
+ SESSIONS_DIR="$HOME/.claude/quota-status/sessions"

- if [ -f "$QS" ]; then
-     result=$(python3 -c "
- import sys, json, os
+ # Show quota even if no per-session file exists yet (fresh session, first
+ # request hasn't fired). Per-session block just gets blank.
+ if [ ! -f "$ACCOUNT" ]; then
+     exit 0
+ fi
+
+ result=$(python3 -c "
+ import sys, json, os, re, hashlib
  from datetime import datetime, timezone, timedelta

- qs = json.load(open(os.path.expanduser('~/.claude/quota-status.json')))
+ home = os.path.expanduser('~')
+ account_path = os.path.join(home, '.claude', 'quota-status', 'account.json')
+ sessions_dir = os.path.join(home, '.claude', 'quota-status', 'sessions')
+
+ # Parse stdin JSON (CC hook input) for session_id. Pass the raw value
+ # (including null / "" / whitespace) through session_filename so the
+ # canonical rule decides — the writer maps all those to 'unknown',
+ # the reader must do the same to keep the contract identical.
+ try:
+     stdin_data = json.loads('''$input''') if '''$input''' else {}
+ except Exception:
+     stdin_data = {}
+ sess_id_raw = stdin_data.get('session_id')
+
+ # Canonical filename derivation — must match cache-telemetry.mjs:sessionFilename.
+ # Allowlist: [A-Za-z0-9_-]{1,128}; else inv-<sha256(s)[:16]>; null/empty/whitespace -> 'unknown'.
+ SAFE = re.compile(r'^[A-Za-z0-9_-]{1,128}\$')
+ def session_filename(raw):
+     if raw is None:
+         return 'unknown'
+     s = str(raw).strip()
+     if not s:
+         return 'unknown'
+     if SAFE.match(s):
+         return s
+     return 'inv-' + hashlib.sha256(s.encode('utf-8')).hexdigest()[:16]

- q5h = qs.get('five_hour', {}).get('pct', 0)
- q7d = qs.get('seven_day', {}).get('pct', 0)
- q5h_reset = qs.get('five_hour', {}).get('resets_at', 0)
- q7d_reset = qs.get('seven_day', {}).get('resets_at', 0)
- status = qs.get('status', '')
- overage = qs.get('overage_status', '')
- ts = qs.get('timestamp', '')
+ # Read account.json (account-global fields).
+ try:
+     acc = json.load(open(account_path))
+ except Exception:
+     sys.exit(0)
+
+ # Read this session's per-session file (cache fields). Apply the rule
+ # unconditionally — null/empty/whitespace land at sessions/unknown.json,
+ # matching where the writer would have placed them. If the file doesn't
+ # exist (e.g. unknown.json never written, or this is a fresh session
+ # whose first request hasn't fired), statusline still shows quota % —
+ # just no TTL/hit-rate block.
+ sess_filename = session_filename(sess_id_raw)
+ try:
+     sess = json.load(open(os.path.join(sessions_dir, sess_filename + '.json')))
+ except Exception:
+     sess = {}
+
+ q5h = acc.get('five_hour', {}).get('pct', 0)
+ q7d = acc.get('seven_day', {}).get('pct', 0)
+ q5h_reset = acc.get('five_hour', {}).get('resets_at', 0)
+ q7d_reset = acc.get('seven_day', {}).get('resets_at', 0)
+ status = acc.get('status', '')
+ overage = acc.get('overage_status', '')
+ ts = sess.get('timestamp') or acc.get('timestamp', '')

  now = datetime.fromisoformat(ts.replace('Z', '+00:00')) if ts else datetime.now(timezone.utc)

@@ -48,9 +105,9 @@ if rate7:
  if overage == 'active':
      label += ' | OVERAGE'

- # TTL and cache stats
- ttl = qs.get('cache', {}).get('ttl_tier', '')
- hit = qs.get('cache', {}).get('hit_rate', '')
+ # Per-session TTL and cache stats
+ ttl = sess.get('cache', {}).get('ttl_tier', '')
+ hit = sess.get('cache', {}).get('hit_rate', '')
  if ttl:
      if ttl == '5m':
          label += ' | \033[31mTTL:5m\033[0m'
@@ -59,12 +116,11 @@ if ttl:
      if hit and hit != 'N/A':
          label += ' ' + hit + '%'

- peak = qs.get('peak_hour', False)
+ peak = acc.get('peak_hour', False)
  if peak:
      label += ' | \033[33mPEAK\033[0m'

  print(label)
  " 2>/dev/null)

-     [ -n "$result" ] && echo "$result"
- fi
+ [ -n "$result" ] && echo "$result"
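
The reader above pastes `$input` into the Python source text, so a hook payload containing `'''` or backslashes would alter that source before it is parsed. As a point of comparison, here is a minimal sketch of the same extraction step that treats the hook input purely as data. The function name `sessionIdFromHookInput` is hypothetical. Only the `session_id` field name comes from the script's comments.

```javascript
// Hypothetical helper: extract session_id from the hook input text.
// Returns undefined when the text is empty, is not valid JSON, or does
// not decode to an object.
function sessionIdFromHookInput(text) {
  if (!text || !text.trim()) return undefined;
  try {
    const data = JSON.parse(text);
    return data !== null && typeof data === "object" ? data.session_id : undefined;
  } catch {
    return undefined;
  }
}

console.log(sessionIdFromHookInput('{"session_id":"abc-123"}'));            // abc-123
console.log(sessionIdFromHookInput(""));                                     // undefined
console.log(sessionIdFromHookInput("not json"));                             // undefined
console.log(sessionIdFromHookInput(`{"session_id":"it's '''quoted'''"}`));  // it's '''quoted'''
```

A missing or unparseable id would then go through the same filename rule as any other value and end up at `unknown`.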