claude-code-cache-fix 1.6.3 → 1.7.0

package/README.md CHANGED
@@ -217,6 +217,26 @@ node tools/cost-report.mjs --admin-key <key> # cross-reference with Admin API
217
217
 
218
218
  Also works with any JSONL containing Anthropic usage fields (`--file`, stdin) — useful for SDK users and proxy setups. See `docs/cost-report.md` for full documentation.
219
219
 
220
+ ### Quota analysis (5-hour quota counting)
221
+
222
+ The same `usage.jsonl` log can be analyzed to test how Anthropic's 5-hour quota is actually computed. Run the bundled tool:
223
+
224
+ ```bash
225
+ node tools/quota-analysis.mjs # analyze your default log
226
+ node tools/quota-analysis.mjs --since 24h # last 24 hours only
227
+ node tools/quota-analysis.mjs --json # machine-readable output
228
+ ```
229
+
230
+ The tool answers three questions from your own data:
231
+
232
+ 1. **Does `cache_read` count toward your 5-hour quota?** Tests three hypotheses (cache_read costs 0x / 0.1x / 1x of input rate) and reports which one best explains your `q5h_pct` trajectory across reset windows. A lower coefficient of variation across windows indicates a better fit.
233
+ 2. **Do peak hours cost more quota per token?** Splits windows into peak-dominant (≥80% peak calls) and off-peak-dominant (≤20%) and compares the implied 100% quota under the best-fit model.
234
+ 3. **What is your account's effective 5-hour quota in token-equivalents?** Reports a concrete number you can compare against your subscription tier or against what other users measure.
235
+
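Reviewer sketch (not part of the package): the "implied 100% quota" in questions 1 to 3 can be worked through for one reset window as below. The weights mirror the constants in `tools/quota-analysis.mjs`; the helper names and the sample window are invented for illustration.

```javascript
// Weights copied from tools/quota-analysis.mjs; everything else here is a
// hypothetical illustration with made-up numbers.
const WEIGHTS = { input: 1.0, output: 5.0, cacheCreation: 2.0 };

// Weighted token total for one window under a given cache_read weight.
function tokenEquivalents(totals, cacheReadWeight) {
  return totals.input * WEIGHTS.input
    + totals.output * WEIGHTS.output
    + totals.cacheCreation * WEIGHTS.cacheCreation
    + totals.cacheRead * cacheReadWeight;
}

// If `eq` token-equivalents moved q5h_pct by `deltaPct` points, a full
// quota (100 points) corresponds to eq / (deltaPct / 100).
function impliedFullQuota(totals, deltaPct, cacheReadWeight) {
  return tokenEquivalents(totals, cacheReadWeight) / (deltaPct / 100);
}

// Invented window: heavy cache reads, q5h_pct rose by 10 points.
const sampleWindow = { input: 20000, output: 10000, cacheCreation: 15000, cacheRead: 2000000 };
const sampleDeltaPct = 10;

const implied = {
  zero: impliedFullQuota(sampleWindow, sampleDeltaPct, 0.0),
  billed: impliedFullQuota(sampleWindow, sampleDeltaPct, 0.1),
  full: impliedFullQuota(sampleWindow, sampleDeltaPct, 1.0),
};
console.log(implied);
```

For this window the three hypotheses imply roughly 1M, 3M, and 21M token-equivalents. A single window cannot tell them apart; the tool compares how stable each figure stays across many windows.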
236
+ Requires `q5h_pct`, `q7d_pct`, and `peak_hour` fields in `usage.jsonl`, which were added in v1.6.1 (2026-04-09). Older entries are silently filtered out.
237
+
238
+ **Help us validate across accounts:** if you run this on your own log, please open an issue or PR on this repo with your output (or just the best-fit hypothesis name and your peak/off-peak ratio). Cross-validating across multiple accounts is the only way to distinguish per-account variance from real findings. Reference: [anthropics/claude-code#45756](https://github.com/anthropics/claude-code/issues/45756).
239
+
220
240
  ## Debug mode
221
241
 
222
242
  Enable debug logging to verify the fix is working:
@@ -283,14 +303,65 @@ Snapshots are saved to `~/.claude/cache-fix-snapshots/` and diff reports are gen
283
303
 
284
304
  - **[@ArkNill/claude-code-hidden-problem-analysis](https://github.com/ArkNill/claude-code-hidden-problem-analysis)** — Systematic proxy-based analysis of 7 bugs including microcompact, budget enforcement, false rate limiter, and extended thinking quota impact. The monitoring features in v1.1.0 are informed by this research.
285
305
  - **[@Renvect/X-Ray-Claude-Code-Interceptor](https://github.com/Renvect/X-Ray-Claude-Code-Interceptor)** — Diagnostic HTTPS proxy with real-time dashboard, system prompt section diffing, per-tool stripping thresholds, and multi-stream JSONL logging. Works with any Claude client that supports `ANTHROPIC_BASE_URL` (CLI, VS Code extension, desktop app), complementing this package's CLI-only `NODE_OPTIONS` approach.
306
+ - **[@fgrosswig/claude-usage-dashboard](https://github.com/fgrosswig/claude-usage-dashboard)** — Self-hosted forensic dashboard with SSE live monitoring, multi-host aggregation, cache-health scoring, and forced-restart/compaction detection. Reads from Claude Code's native session JSONL files and optionally from an HTTP proxy NDJSON stream. v1.4.0 documented the forced-session-restart mechanism at quota-cap boundaries (~490K tokens per event) and the 78–91% cache-wipe pattern at compaction events. Complementary to our interceptor's in-process vantage point. See [Works with @fgrosswig's dashboard](#works-with-fgrosswigs-dashboard) below for the interop pattern.
307
+
308
+ ## Works with @fgrosswig's dashboard
309
+
310
+ This interceptor and [@fgrosswig](https://github.com/fgrosswig)'s
311
+ [claude-usage-dashboard](https://github.com/fgrosswig/claude-usage-dashboard)
312
+ solve strongly complementary problems. The interceptor captures per-call API
313
+ data from inside the Node.js process — cache metrics, quota state, TTL tier,
314
+ rewrites applied. The dashboard provides the visualization layer — historical
315
+ trending, per-day charts, multi-host aggregation, cache-health scoring.
316
+
317
+ Running both gives you the best of both tools, and the integration is a
318
+ one-liner thanks to the dashboard's tolerant NDJSON ingest and our new
319
+ `usage-to-dashboard-ndjson` translator.
320
+
321
+ ### Quick setup
322
+
323
+ ```bash
324
+ # Install both tools
325
+ npm install -g claude-code-cache-fix
326
+ # (follow fgrosswig's dashboard install: https://github.com/fgrosswig/claude-usage-dashboard)
327
+
328
+ # One-shot translation (reads ~/.claude/usage.jsonl, writes to
329
+ # ~/.claude/anthropic-proxy-logs/proxy-YYYY-MM-DD.ndjson, which the
330
+ # dashboard already watches)
331
+ node $(npm root -g)/claude-code-cache-fix/tools/usage-to-dashboard-ndjson.mjs
332
+
333
+ # Or keep it live-updating as the interceptor logs new calls
334
+ node $(npm root -g)/claude-code-cache-fix/tools/usage-to-dashboard-ndjson.mjs --follow &
335
+ ```
336
+
337
+ No configuration required on the dashboard side — fgrosswig's
338
+ `collectProxyNdjsonFiles()` auto-discovers files in
339
+ `~/.claude/anthropic-proxy-logs/` (or `$ANTHROPIC_PROXY_LOG_DIR`), and our
340
+ translator writes to exactly that path with the expected `proxy-YYYY-MM-DD.ndjson`
341
+ filename convention. The dashboard's tolerant ingestion layer ignores unknown
342
+ fields, so interceptor-specific extras (`ttl_tier`, `ephemeral_1h_input_tokens`,
343
+ `ephemeral_5m_input_tokens`, `peak_hour`, quota state) pass through cleanly
344
+ and remain available to downstream consumers that know to read them.
345
+
346
+ The `cost_factor` metric in `tools/cost-report.mjs` also comes from
347
+ fgrosswig's methodology — the `(input + output + cache_read + cache_creation) / output`
348
+ ratio that gives a single-number measure of how much context is being paid
349
+ per useful output token. A rising cost factor across a long session is the
350
+ measurable signature of cache-efficiency degradation.
351
+
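Reviewer sketch: the cost factor described above is a single division. The field names (`input`, `output`, `cache_read`, `cache_1h`, `cache_5m`) follow `summary.totals` in `tools/cost-report.mjs`; the function name and the two sample sessions are hypothetical.

```javascript
// Gross tokens divided by output tokens. Returns null when there is no
// output, matching the guard used in the JSON report.
function costFactor(totals) {
  const cacheCreation = (totals.cache_1h || 0) + (totals.cache_5m || 0);
  const gross = (totals.input || 0) + (totals.output || 0)
    + (totals.cache_read || 0) + cacheCreation;
  return totals.output > 0 ? gross / totals.output : null;
}

// Invented totals: same output volume, ten times the cached context.
const earlySession = costFactor({ input: 1000, output: 2000, cache_read: 40000, cache_1h: 5000, cache_5m: 0 });
const lateSession = costFactor({ input: 1000, output: 2000, cache_read: 400000, cache_1h: 45000, cache_5m: 0 });

console.log(earlySession, lateSession);
```

With these sample numbers the factor grows from 24 to 224 while output stays constant. That growth is the kind of rising curve the paragraph above describes.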
352
+ ## Used in production
353
+
354
+ - **[Crunchloop DAP](https://dap.crunchloop.ai)** — Agent SDK / DAP development environment. First production team to merge the interceptor to trunk for team-wide deployment (2026-04-10). Identified two distinct cache regression patterns through real-world testing — tool ordering jitter and the fresh-session sort gap — and contributed debug traces that drove the v1.5.1 and v1.6.2 fixes.
286
355
 
287
356
  ## Contributors
288
357
 
289
358
  - **[@VictorSun92](https://github.com/VictorSun92)** — Original monkey-patch fix for v2.1.88, identified partial scatter on v2.1.90, contributed forward-scan detection, correct block ordering, tighter block matchers, and the optional output-efficiency rewrite hook
359
+ - **[@bilby91](https://github.com/bilby91)** ([Crunchloop DAP](https://dap.crunchloop.ai)) — Agent SDK / DAP production environment validation, 1h cache TTL confirmation, tool ordering jitter discovery via debug trace (fixed in v1.5.1), fresh-session sort bug discovery via SKILLS SORT diagnostic (fixed in v1.6.2). First production team to roll the interceptor to trunk.
290
360
  - **[@jmarianski](https://github.com/jmarianski)** — Root cause analysis via MITM proxy capture and Ghidra reverse engineering, multi-mode cache test script
291
361
  - **[@cnighswonger](https://github.com/cnighswonger)** — Fingerprint stabilization, tool ordering fix, image stripping, monitoring features, overage TTL downgrade discovery, package maintainer
292
362
  - **[@ArkNill](https://github.com/ArkNill)** — Microcompact mechanism analysis, GrowthBook flag documentation, false rate limiter identification
293
363
  - **[@Renvect](https://github.com/Renvect)** — Image duplication discovery, cross-project directory contamination analysis
364
+ - **[@fgrosswig](https://github.com/fgrosswig)** — [claude-usage-dashboard](https://github.com/fgrosswig/claude-usage-dashboard) forensic methodology: cost-factor overhead ratio metric, `anthropic-*` header capture pattern, proxy NDJSON schema that informed our dashboard interop layer
294
365
 
295
366
  If you contributed to the community effort on these issues and aren't listed here, please open an issue or PR — we want to credit everyone properly.
296
367
 
package/README.zh.md CHANGED
@@ -277,9 +277,14 @@ CACHE_FIX_PREFIXDIFF=1 claude-fixed
277
277
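  - [#44045](https://github.com/anthropics/claude-code/issues/44045) — SDK-level reproduction and token measurements
278
278
  - [#32508](https://github.com/anthropics/claude-code/issues/32508) — Community discussion of the `Output efficiency` system prompt change and its possible effect on model behavior
279
279
 
280
+ ## Used in production
281
+
282
+ - **[Crunchloop DAP](https://dap.crunchloop.ai)** — Agent SDK / DAP development environment. First production team to merge the interceptor to trunk for team-wide deployment (2026-04-10). Identified two distinct cache regression patterns through real-world testing (tool ordering jitter and the fresh-session sort gap) and contributed debug traces that drove the v1.5.1 and v1.6.2 fixes.
283
+
280
284
  ## Contributors
281
285
 
282
286
  - **[@VictorSun92](https://github.com/VictorSun92)** — Original monkey-patch fix for v2.1.88, identified partial block scatter on v2.1.90, contributed forward-scan detection, correct block ordering, tighter block matchers, and the optional output-efficiency rewrite hook
287
+ - **[@bilby91](https://github.com/bilby91)** ([Crunchloop DAP](https://dap.crunchloop.ai)) — Agent SDK / DAP production environment validation, 1h cache TTL confirmation, tool ordering jitter discovery via debug trace (fixed in v1.5.1), fresh-session sort bug discovery via SKILLS SORT diagnostic (fixed in v1.6.2). First production team to merge the interceptor to trunk.
283
288
  - **[@jmarianski](https://github.com/jmarianski)** — Root cause analysis via MITM proxy capture and Ghidra reverse engineering, multi-mode cache test script
284
289
  - **[@cnighswonger](https://github.com/cnighswonger)** — Fingerprint stabilization, tool ordering fix, image stripping, monitoring features, overage TTL downgrade discovery, package maintainer
285
290
  - **[@ArkNill](https://github.com/ArkNill)** — Microcompact mechanism analysis, GrowthBook flag documentation, false rate limiter identification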
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "claude-code-cache-fix",
3
- "version": "1.6.3",
3
+ "version": "1.7.0",
4
4
  "description": "Fixes prompt cache regression in Claude Code that causes up to 20x cost increase on resumed sessions",
5
5
  "type": "module",
6
6
  "exports": "./preload.mjs",
@@ -13,7 +13,7 @@
13
13
  "node": ">=18"
14
14
  },
15
15
  "scripts": {
16
- "test": "node --test 'test/**/*.test.mjs'"
16
+ "test": "node --test"
17
17
  },
18
18
  "keywords": [
19
19
  "claude-code",
package/preload.mjs CHANGED
@@ -1009,6 +1009,30 @@ globalThis.fetch = async function (url, options) {
1009
1009
  monitorContextDegradation(payload.messages);
1010
1010
  }
1011
1011
 
1012
+ // Diagnostic: dump full tools array (names, descriptions, schemas, sizes) to a file
1013
+ // when CACHE_FIX_DUMP_TOOLS=<path> is set. Useful for per-version tool-schema drift
1014
+ // analysis and for understanding which tools contribute prefix bloat. First used
1015
+ // during the 2026-04-11 cross-version regression investigation.
1016
+ if (process.env.CACHE_FIX_DUMP_TOOLS && payload.tools) {
1017
+ try {
1018
+ const dumpPath = process.env.CACHE_FIX_DUMP_TOOLS;
1019
+ const dump = {
1020
+ timestamp: new Date().toISOString(),
1021
+ tool_count: payload.tools.length,
1022
+ tools: payload.tools.map(t => ({
1023
+ name: t.name,
1024
+ description: t.description || "",
1025
+ desc_chars: (t.description || "").length,
1026
+ schema_chars: JSON.stringify(t.input_schema || {}).length,
1027
+ total_chars: JSON.stringify(t).length,
1028
+ })),
1029
+ system_chars: JSON.stringify(payload.system || "").length,
1030
+ total_tools_chars: JSON.stringify(payload.tools).length,
1031
+ };
1032
+ writeFileSync(dumpPath, JSON.stringify(dump, null, 2));
1033
+ } catch (e) { debugLog("DUMP ERROR:", e?.message); }
1034
+ }
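Reviewer sketch: one way to use a file written by `CACHE_FIX_DUMP_TOOLS` is to rank tools by serialized size. The object shape follows the `dump` literal above; `largestTools` and the sample entries are hypothetical and the sizes are invented.

```javascript
// Return the `limit` largest tools with their share of the tools payload.
// Shares are approximate because total_tools_chars also counts the array
// brackets and separating commas.
function largestTools(dump, limit = 3) {
  return dump.tools
    .slice() // copy so the caller's array order is untouched
    .sort((a, b) => b.total_chars - a.total_chars)
    .slice(0, limit)
    .map(t => ({
      name: t.name,
      total_chars: t.total_chars,
      share: t.total_chars / dump.total_tools_chars,
    }));
}

// Invented sample in the dump's shape.
const sampleDump = {
  tool_count: 3,
  total_tools_chars: 10000,
  tools: [
    { name: 'Read', desc_chars: 1200, schema_chars: 600, total_chars: 2000 },
    { name: 'Bash', desc_chars: 4000, schema_chars: 800, total_chars: 5000 },
    { name: 'Edit', desc_chars: 1500, schema_chars: 1200, total_chars: 3000 },
  ],
};

const topTools = largestTools(sampleDump, 2);
console.log(topTools);
```

Comparing this ranking between dumps taken under two client versions is one plausible way to spot the schema drift the comment mentions.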
1035
+
1012
1036
  // Prompt size measurement — log system prompt, tools, and injected block sizes
1013
1037
  if (DEBUG && payload.system && payload.tools && payload.messages) {
1014
1038
  const sysChars = JSON.stringify(payload.system).length;
@@ -1061,6 +1085,25 @@ globalThis.fetch = async function (url, options) {
1061
1085
  const status = response.headers.get("anthropic-ratelimit-unified-status");
1062
1086
  const overage = response.headers.get("anthropic-ratelimit-unified-overage-status");
1063
1087
 
1088
+ // Capture ALL anthropic-* and request-id/cf-ray response headers.
1089
+ // Pattern borrowed from @fgrosswig's claude-usage-dashboard proxy:
1090
+ // https://github.com/fgrosswig/claude-usage-dashboard
1091
+ // Widening beyond the specific unified-ratelimit headers above future-proofs
1092
+ // us against Anthropic adding new headers (e.g. experimental rollout flags,
1093
+ // region hints, new quota dimensions) without needing code changes.
1094
+ const allAnthropicHeaders = {};
1095
+ for (const [name, value] of response.headers.entries()) {
1096
+ const lower = name.toLowerCase();
1097
+ if (
1098
+ lower.startsWith("anthropic-") ||
1099
+ lower === "request-id" ||
1100
+ lower === "x-request-id" ||
1101
+ lower === "cf-ray"
1102
+ ) {
1103
+ allAnthropicHeaders[lower] = value;
1104
+ }
1105
+ }
1106
+
1064
1107
  if (h5 || h7d) {
1065
1108
  const quotaFile = join(homedir(), ".claude", "quota-status.json");
1066
1109
  let quota = {};
@@ -1070,6 +1113,7 @@ globalThis.fetch = async function (url, options) {
1070
1113
  quota.seven_day = h7d ? { utilization: parseFloat(h7d), pct: Math.round(parseFloat(h7d) * 100), resets_at: reset7d ? parseInt(reset7d) : null } : quota.seven_day;
1071
1114
  quota.status = status || null;
1072
1115
  quota.overage_status = overage || null;
1116
+ quota.all_headers = allAnthropicHeaders;
1073
1117
 
1074
1118
  // Peak hour detection — Anthropic applies higher quota drain rate during
1075
1119
  // weekday peak hours: 13:00–19:00 UTC (Mon–Fri).
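Reviewer sketch: the peak window named in the comment can be expressed as a small predicate. The hunk does not show the package's actual check, so the function below is an assumption, including the choice of a half-open interval (13:00 inclusive, 19:00 exclusive).

```javascript
// Assumption: peak means Monday to Friday, 13:00 <= hour < 19:00, in UTC.
function isPeakHour(date) {
  const day = date.getUTCDay(); // 0 = Sunday, 6 = Saturday
  const hour = date.getUTCHours();
  return day >= 1 && day <= 5 && hour >= 13 && hour < 19;
}

console.log(isPeakHour(new Date('2026-04-09T14:00:00Z'))); // a Thursday afternoon (UTC)
console.log(isPeakHour(new Date('2026-04-11T14:00:00Z'))); // a Saturday
```

Using the `getUTC*` accessors keeps the result independent of the machine's local time zone.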
package/tools/cost-report.mjs CHANGED
@@ -484,6 +484,12 @@ function printJsonReport(results, summary, ratesData, adminSummary) {
484
484
  total_cost: summary.totalCost,
485
485
  avg_cost_per_call: summary.totalCost / summary.calls,
486
486
  tokens: summary.totals,
487
+ cost_factor: (function () {
488
+ // fgrosswig-style overhead ratio: gross tokens / output tokens
489
+ const gross = summary.totals.input + summary.totals.output +
490
+ summary.totals.cache_read + summary.totals.cache_1h + summary.totals.cache_5m;
491
+ return summary.totals.output > 0 ? gross / summary.totals.output : null;
492
+ })(),
487
493
  by_model: summary.byModel,
488
494
  degradation: summary.degradedCalls > 0 ? {
489
495
  degraded_calls: summary.degradedCalls,
@@ -544,6 +550,15 @@ function printMarkdownReport(results, summary, ratesData, adminSummary) {
544
550
  lines.push(`| Total cache write 5m | ${fmt(summary.totals.cache_5m)} |`);
545
551
  lines.push(`| **Total cost** | **${fmtCost(summary.totalCost)}** |`);
546
552
  lines.push(`| Avg cost per call | ${fmtCost(summary.totalCost / summary.calls)} |`);
553
+ {
554
+ // Cost factor: popularized by @fgrosswig's claude-usage-dashboard
555
+ // (https://github.com/fgrosswig/claude-usage-dashboard)
556
+ const grossTokens = summary.totals.input + summary.totals.output +
557
+ summary.totals.cache_read + summary.totals.cache_1h + summary.totals.cache_5m;
558
+ if (summary.totals.output > 0) {
559
+ lines.push(`| Cost factor (tokens/output) | ${(grossTokens / summary.totals.output).toFixed(1)}× |`);
560
+ }
561
+ }
547
562
  lines.push('');
548
563
 
549
564
  // By model
@@ -680,6 +695,22 @@ function printTextReport(results, summary, ratesData, adminSummary) {
680
695
  }
681
696
  }
682
697
  }
698
+
699
+ // ── Cost factor (overhead ratio) ──
700
+ // Credit: this metric was popularized by @fgrosswig's claude-usage-dashboard
701
+ // (https://github.com/fgrosswig/claude-usage-dashboard). It divides total
702
+ // tokens processed (input + output + cache_read + cache_creation) by useful
703
+ // output tokens, giving a single-number "how much context am I carrying
704
+ // per useful word of output" multiplier. Values climb over long sessions
705
+ // due to resume/compaction cycles; a rising curve is a signal that cache
706
+ // efficiency is degrading.
707
+ const totalCacheCreate = summary.totals.cache_1h + summary.totals.cache_5m;
708
+ const grossTokens = summary.totals.input + summary.totals.output +
709
+ summary.totals.cache_read + totalCacheCreate;
710
+ if (summary.totals.output > 0) {
711
+ const costFactor = grossTokens / summary.totals.output;
712
+ console.log(` Cost factor: ${costFactor.toFixed(1)}× (tokens/output)`);
713
+ }
683
714
  console.log('');
684
715
 
685
716
  // ── Degradation ──
package/tools/quota-analysis.mjs ADDED
@@ -0,0 +1,539 @@
1
+ #!/usr/bin/env node
2
+ /**
3
+ * quota-analysis — Test how Anthropic's 5-hour quota is actually computed
4
+ * by analyzing your own per-call telemetry.
5
+ *
6
+ * Reads usage.jsonl (the per-call log written by claude-code-cache-fix v1.6.1+)
7
+ * and answers three questions:
8
+ *
9
+ * 1. Does cache_read count toward your 5-hour quota?
10
+ * Tests three hypotheses (cache_read costs 0x / 0.1x / 1x of input rate)
11
+ * and reports which one best explains the q5h_pct trajectory across
12
+ * reset windows in your data.
13
+ *
14
+ * 2. Do peak hours (weekday 13:00–19:00 UTC) cost more quota per token?
15
+ * Splits windows into peak-dominant vs off-peak-dominant and compares
16
+ * the implied 100% quota under the best-fit counting model.
17
+ *
18
+ * 3. What is your account's effective 5-hour quota in token-equivalents?
19
+ * Reports a concrete number you can compare against your subscription
20
+ * tier or against what other users are seeing.
21
+ *
22
+ * Telemetry requirements:
23
+ * - usage.jsonl entries must include q5h_pct, q7d_pct, peak_hour fields
24
+ * - These were added in claude-code-cache-fix v1.6.1 (2026-04-09)
25
+ * - Older entries are silently filtered out
26
+ * - Need at least 2 q5h reset events in the data for meaningful analysis
27
+ * (typically 10+ hours of active use)
28
+ *
29
+ * Methodology and caveats:
30
+ * - q5h is a 5-hour SLIDING window. We approximate it as discrete reset
31
+ * boundaries by looking for drops in q5h_pct of more than 5 percentage points.
32
+ * - Token-equivalent weights: uncached_input = 1.0, output = 5.0,
33
+ * cache_creation = 2.0 (treats all writes as 1h-tier; the 5m tier is
34
+ * 1.25 but most writes are 1h with the interceptor's TTL injection).
35
+ * - Coefficient of variation (CV) is used to compare hypotheses: lower
36
+ * CV across windows = better fit. CV < 50% suggests a clear winner;
37
+ * CV > 80% suggests the model is wrong or sample is too small.
38
+ * - Single-account analysis. Sample is yours. Findings should be
39
+ * compared across multiple accounts before generalizing.
40
+ *
41
+ * Part of claude-code-cache-fix. Works with the interceptor's usage log.
42
+ * https://github.com/cnighswonger/claude-code-cache-fix
43
+ *
44
+ * Reference: anthropics/claude-code#45756 (cache_read quota counting hypothesis)
45
+ */
46
+
47
+ import { readFileSync, existsSync } from 'node:fs';
48
+ import { homedir } from 'node:os';
49
+ import { join } from 'node:path';
50
+
51
+ const DEFAULT_USAGE_LOG = join(homedir(), '.claude', 'usage.jsonl');
52
+
53
+ // Token-equivalent weights shared by all three counting models.
54
+ // (cache_read weight is the variable being tested.)
55
+ const W_UNCACHED_INPUT = 1.0;
56
+ const W_OUTPUT = 5.0;
57
+ const W_CACHE_CREATION = 2.0; // 1h tier conservative; 5m would be 1.25
58
+
59
+ // Q5h window boundary detection threshold (in percentage points)
60
+ const RESET_THRESHOLD = 5;
61
+
62
+ // Window classification thresholds
63
+ const PEAK_WINDOW_MIN_PCT = 80; // >= 80% peak calls = peak-dominant window
64
+ const OFFPEAK_WINDOW_MAX_PCT = 20; // <= 20% peak calls = offpeak-dominant window
65
+
66
+ // Minimum delta_q5h for a window to be useful for extrapolation
67
+ const MIN_DELTA_Q5H = 5;
68
+
69
+ // ─── CLI parsing ────────────────────────────────────────────────────────────
70
+
71
+ function parseArgs() {
72
+ const args = process.argv.slice(2);
73
+ const opts = { file: null, since: null, format: 'text', help: false };
74
+ for (let i = 0; i < args.length; i++) {
75
+ const a = args[i];
76
+ if (a === '--help' || a === '-h') opts.help = true;
77
+ else if (a === '--file' || a === '-f') opts.file = args[++i];
78
+ else if (a === '--since' || a === '-s') opts.since = args[++i];
79
+ else if (a === '--format') opts.format = args[++i];
80
+ else if (a === '--json') opts.format = 'json';
81
+ else { console.error(`Unknown argument: ${a}`); opts.help = true; }
82
+ }
83
+ return opts;
84
+ }
85
+
86
+ function printUsage() {
87
+ console.log(`quota-analysis — analyze 5-hour quota counting from usage telemetry
88
+
89
+ Usage:
90
+ quota-analysis [options]
91
+
92
+ Options:
93
+ -f, --file <path> JSONL file to read (default: ~/.claude/usage.jsonl)
94
+ -s, --since <duration> Filter to last N hours/days (e.g. 24h, 3d, 7d)
95
+ --format <fmt> Output format: text (default), json, markdown
96
+ --json Shorthand for --format json
97
+ -h, --help Show this help
98
+
99
+ Examples:
100
+ quota-analysis # Analyze your default log
101
+ quota-analysis --since 24h # Last 24 hours only
102
+ quota-analysis --file /tmp/team.jsonl # A different log file
103
+ quota-analysis --json > report.json # Machine-readable output
104
+
105
+ Methodology:
106
+ Tests three counting hypotheses for cache_read in the 5-hour quota:
107
+ H_zero = cache_read costs nothing for quota
108
+ H_billed = cache_read costs 0.1x of input rate (matches the billing rate)
109
+ H_full = cache_read costs 1.0x of input rate (the original concern)
110
+ The hypothesis with the lowest coefficient of variation across reset
111
+ windows is the best fit for your data.
112
+
113
+ Then splits windows into peak (weekday 13:00–19:00 UTC) and off-peak
114
+ groups and compares the effective quota multiplier between them.
115
+
116
+ Reference:
117
+ anthropics/claude-code#45756 — original "cache_read counts at full rate"
118
+ hypothesis from @molu0219.
119
+ `);
120
+ }
121
+
122
+ // ─── Data loading ───────────────────────────────────────────────────────────
123
+
124
+ function loadUsage(filePath) {
125
+ if (!existsSync(filePath)) {
126
+ console.error(`Error: usage file not found: ${filePath}`);
127
+ console.error(`Hint: claude-code-cache-fix writes its log to ${DEFAULT_USAGE_LOG} by default.`);
128
+ process.exit(1);
129
+ }
130
+ const text = readFileSync(filePath, 'utf8');
131
+ const rows = [];
132
+ for (const line of text.split('\n')) {
133
+ const t = line.trim();
134
+ if (!t) continue;
135
+ try { rows.push(JSON.parse(t)); }
136
+ catch { /* skip malformed */ }
137
+ }
138
+ return rows;
139
+ }
140
+
141
+ function filterSince(rows, since) {
142
+ if (!since) return rows;
143
+ const m = since.match(/^(\d+)([hd])$/);
144
+ if (!m) {
145
+ console.error(`Invalid --since format: ${since}. Expected like 24h, 3d.`);
146
+ process.exit(1);
147
+ }
148
+ const n = parseInt(m[1], 10);
149
+ const ms = m[2] === 'h' ? n * 3600 * 1000 : n * 86400 * 1000;
150
+ const cutoff = new Date(Date.now() - ms).toISOString();
151
+ return rows.filter(r => r.timestamp >= cutoff);
152
+ }
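Reviewer sketch: `filterSince` compares timestamps as strings. That works only if every `timestamp` is a UTC ISO-8601 string in the fixed-width form produced by `Date.prototype.toISOString()`, which is an assumption about the log format rather than something this hunk shows. The helper name and sample rows are hypothetical.

```javascript
// Convert "24h" / "3d" into an ISO cutoff string, or null if malformed.
// `now` is injectable so the example is deterministic.
function sinceCutoff(since, now = Date.now()) {
  const m = /^(\d+)([hd])$/.exec(since);
  if (!m) return null;
  const unitMs = m[2] === 'h' ? 3600 * 1000 : 86400 * 1000;
  return new Date(now - parseInt(m[1], 10) * unitMs).toISOString();
}

const fixedNow = Date.parse('2026-04-11T12:00:00.000Z');
const cutoff24h = sinceCutoff('24h', fixedNow);

// Invented rows; fixed-width UTC strings sort the same way as the instants.
const sampleRows = [
  { timestamp: '2026-04-10T11:59:59.000Z' },
  { timestamp: '2026-04-10T12:00:00.000Z' },
  { timestamp: '2026-04-11T08:30:00.000Z' },
];
const recentRows = sampleRows.filter(r => r.timestamp >= cutoff24h);
console.log(cutoff24h, recentRows.length);
```

If a log mixed time-zone offsets or timestamp precisions, parsing each value with `Date.parse` before comparing would be the safer choice.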
153
+
154
+ // ─── Window detection ───────────────────────────────────────────────────────
155
+
156
+ function findResetWindows(rows) {
157
+ // Sort by timestamp (defensive — should already be sorted)
158
+ rows = rows.slice().sort((a, b) => a.timestamp.localeCompare(b.timestamp));
159
+
160
+ // Find indices where q5h_pct drops by more than RESET_THRESHOLD
161
+ // (these are window boundaries)
162
+ const windowStarts = [0]; // first call is always a window start
163
+ for (let i = 1; i < rows.length; i++) {
164
+ const prev = rows[i - 1].q5h_pct;
165
+ const cur = rows[i].q5h_pct;
166
+ if (typeof prev === 'number' && typeof cur === 'number' && cur < prev - RESET_THRESHOLD) {
167
+ windowStarts.push(i);
168
+ }
169
+ }
170
+ windowStarts.push(rows.length); // sentinel for last window
171
+
172
+ const windows = [];
173
+ for (let i = 0; i < windowStarts.length - 1; i++) {
174
+ const slice = rows.slice(windowStarts[i], windowStarts[i + 1]);
175
+ if (slice.length === 0) continue;
176
+ windows.push(slice);
177
+ }
178
+ return windows;
179
+ }
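Reviewer sketch: the boundary rule in `findResetWindows` can be seen on a bare series of `q5h_pct` values. The function below is a simplified stand-in that works on numbers instead of log rows; the series is invented.

```javascript
// Start a new window whenever the value falls by more than `threshold`
// points relative to the previous sample (same strict comparison as above).
function splitAtResets(pcts, threshold = 5) {
  const result = [];
  let current = [];
  for (let i = 0; i < pcts.length; i++) {
    if (i > 0 && pcts[i] < pcts[i - 1] - threshold) {
      result.push(current);
      current = [];
    }
    current.push(pcts[i]);
  }
  if (current.length > 0) result.push(current);
  return result;
}

// The dip from 27 to 24 is within the threshold; 40 to 3 is a reset.
const detected = splitAtResets([10, 18, 27, 24, 40, 3, 9, 15]);
console.log(JSON.stringify(detected));
```

Because the comparison is strict, a drop of exactly 5 points does not open a new window.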
180
+
181
+ // ─── Token-equivalent calculation ───────────────────────────────────────────
182
+
183
+ function callEquivalent(r, cacheReadWeight) {
184
+ return (
185
+ (r.input_tokens || 0) * W_UNCACHED_INPUT
186
+ + (r.output_tokens || 0) * W_OUTPUT
187
+ + (r.cache_creation_input_tokens || 0) * W_CACHE_CREATION
188
+ + (r.cache_read_input_tokens || 0) * cacheReadWeight
189
+ );
190
+ }
191
+
192
+ function windowEquivalent(window, cacheReadWeight) {
193
+ let sum = 0;
194
+ for (const r of window) sum += callEquivalent(r, cacheReadWeight);
195
+ return sum;
196
+ }
197
+
198
+ function windowDeltaQ5h(window) {
199
+ const start = window[0].q5h_pct ?? 0;
200
+ let peak = start;
201
+ for (const r of window) {
202
+ if (typeof r.q5h_pct === 'number' && r.q5h_pct > peak) peak = r.q5h_pct;
203
+ }
204
+ return peak - start;
205
+ }
206
+
207
+ function windowPeakFraction(window) {
208
+ let peakCount = 0;
209
+ for (const r of window) if (r.peak_hour) peakCount++;
210
+ return peakCount / window.length;
211
+ }
212
+
213
+ // ─── Statistics helpers ─────────────────────────────────────────────────────
214
+
215
+ function mean(xs) {
216
+ if (xs.length === 0) return 0;
217
+ return xs.reduce((a, b) => a + b, 0) / xs.length;
218
+ }
219
+
220
+ function stdev(xs) {
221
+ if (xs.length < 2) return 0;
222
+ const m = mean(xs);
223
+ const sq = xs.map(x => (x - m) ** 2);
224
+ return Math.sqrt(sq.reduce((a, b) => a + b, 0) / (xs.length - 1));
225
+ }
226
+
227
+ function cv(xs) {
228
+ const m = mean(xs);
229
+ if (m === 0) return Infinity;
230
+ return stdev(xs) / m;
231
+ }
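Reviewer sketch: the three helpers above combine into the coefficient of variation, that is, the sample standard deviation (n - 1 denominator) divided by the mean. The standalone version below uses hypothetical names and invented extrapolation lists to show what a tight and a loose fit look like.

```javascript
function average(xs) {
  return xs.length === 0 ? 0 : xs.reduce((a, b) => a + b, 0) / xs.length;
}

// Sample standard deviation; 0 for fewer than two values, as in the tool.
function sampleStdev(xs) {
  if (xs.length < 2) return 0;
  const m = average(xs);
  return Math.sqrt(xs.reduce((acc, x) => acc + (x - m) ** 2, 0) / (xs.length - 1));
}

function coefficientOfVariation(xs) {
  const m = average(xs);
  return m === 0 ? Infinity : sampleStdev(xs) / m;
}

// Invented implied-quota values for two competing hypotheses.
const consistentCv = coefficientOfVariation([980000, 1000000, 1020000]);
const scatteredCv = coefficientOfVariation([400000, 1000000, 1600000]);
console.log(consistentCv, scatteredCv);
```

The first list gives a CV of about 2% and the second about 60%. A single value yields a CV of 0, which would look like a perfect fit, so a minimum of two usable windows is needed before the number means anything.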
232
+
233
+ // ─── Counting model fit ─────────────────────────────────────────────────────
234
+
235
+ function fitCountingModels(windows) {
236
+ // For each window, compute equivalent tokens under each hypothesis,
237
+ // then extrapolate to 100% quota using the observed delta_q5h.
238
+ // The model whose extrapolations are most consistent (lowest CV) wins.
239
+ const models = {
240
+ zero: { weight: 0.0, label: 'H_zero (cache_read = 0.0x)', extrapolations: [] },
241
+ billed: { weight: 0.1, label: 'H_billed (cache_read = 0.1x)', extrapolations: [] },
242
+ full: { weight: 1.0, label: 'H_full (cache_read = 1.0x)', extrapolations: [] },
243
+ };
244
+
245
+ for (const w of windows) {
246
+ const delta = windowDeltaQ5h(w);
247
+ if (delta < MIN_DELTA_Q5H) continue;
248
+
249
+ for (const key of Object.keys(models)) {
250
+ const eq = windowEquivalent(w, models[key].weight);
251
+ const implied100 = eq / (delta / 100);
252
+ models[key].extrapolations.push(implied100);
253
+ }
254
+ }
255
+
256
+ // Compute CV for each model
257
+ const usableWindows = models.zero.extrapolations.length;
258
+ const fits = {};
259
+ for (const key of Object.keys(models)) {
260
+ const xs = models[key].extrapolations;
261
+ fits[key] = {
262
+ label: models[key].label,
263
+ weight: models[key].weight,
264
+ mean: mean(xs),
265
+ stdev: stdev(xs),
266
+ cv: cv(xs),
267
+ values: xs,
268
+ };
269
+ }
270
+
271
+ // Determine the best fit
272
+ let bestKey = null;
273
+ let bestCv = Infinity;
274
+ for (const key of Object.keys(fits)) {
275
+ if (fits[key].cv < bestCv) {
276
+ bestCv = fits[key].cv;
277
+ bestKey = key;
278
+ }
279
+ }
280
+
281
+ return { fits, bestKey, usableWindows };
282
+ }
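Reviewer sketch: a toy run of the selection rule in `fitCountingModels`, using pre-aggregated windows instead of log rows. The data is synthetic and constructed so that quota movement tracks only the non-cache tokens; `baseEq`, `cacheRead`, `deltaPct`, and the function names are hypothetical.

```javascript
// Unlike the tool, this sketch treats fewer than two samples as "no fit".
function cvOf(xs) {
  const m = xs.reduce((a, b) => a + b, 0) / xs.length;
  if (xs.length < 2 || m === 0) return Infinity;
  const variance = xs.reduce((acc, x) => acc + (x - m) ** 2, 0) / (xs.length - 1);
  return Math.sqrt(variance) / m;
}

const HYPOTHESES = { zero: 0.0, billed: 0.1, full: 1.0 };

// For each hypothesis, extrapolate every window to 100% and keep the
// hypothesis whose extrapolations vary least.
function bestHypothesis(windowTotals) {
  let bestKey = null;
  let bestCv = Infinity;
  const cvs = {};
  for (const [key, weight] of Object.entries(HYPOTHESES)) {
    const implied = windowTotals.map(
      w => (w.baseEq + w.cacheRead * weight) / (w.deltaPct / 100)
    );
    cvs[key] = cvOf(implied);
    if (cvs[key] < bestCv) {
      bestCv = cvs[key];
      bestKey = key;
    }
  }
  return { bestKey, cvs };
}

// Synthetic windows: deltaPct is proportional to baseEq, cacheRead is noise.
const toyWindows = [
  { baseEq: 100000, cacheRead: 500000, deltaPct: 10 },
  { baseEq: 200000, cacheRead: 300000, deltaPct: 20 },
  { baseEq: 150000, cacheRead: 3000000, deltaPct: 15 },
];
const toyFit = bestHypothesis(toyWindows);
console.log(toyFit);
```

On this constructed data the zero-weight hypothesis wins because its extrapolations are nearly identical across windows. Real logs are noisier, which is why the tool also reports the CV itself and not only the winner.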
283
+
284
+ // ─── Peak vs off-peak analysis ─────────────────────────────────────────────
285
+
286
+ function peakSplit(windows, weight) {
287
+ // Returns { peakWindows: [...], offPeakWindows: [...], skipped: [...] }
288
+ // and computes mean implied 100% quota for each group under the given
289
+ // cache_read weight.
290
+ const peakDom = [];
291
+ const offDom = [];
292
+ const skipped = [];
293
+
294
+ for (const w of windows) {
295
+ const delta = windowDeltaQ5h(w);
296
+ if (delta < MIN_DELTA_Q5H) {
297
+ skipped.push({ reason: 'delta_q5h too small', window: w });
298
+ continue;
299
+ }
300
+ const eq = windowEquivalent(w, weight);
301
+ const implied100 = eq / (delta / 100);
302
+ const pf = windowPeakFraction(w) * 100;
303
+
304
+ const entry = {
305
+ start: w[0].timestamp,
306
+ end: w[w.length - 1].timestamp,
307
+ calls: w.length,
308
+ delta,
309
+ peakFraction: pf,
310
+ eq,
311
+ implied100,
312
+ };
313
+
314
+ if (pf >= PEAK_WINDOW_MIN_PCT) peakDom.push(entry);
315
+ else if (pf <= OFFPEAK_WINDOW_MAX_PCT) offDom.push(entry);
316
+ else skipped.push({ reason: 'mixed peak/off-peak', ...entry });
317
+ }
318
+
319
+ return { peakDom, offDom, skipped };
320
+ }
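Reviewer sketch: the grouping rule in `peakSplit` reduces to a three-way classification on the share of calls flagged `peak_hour`. The thresholds mirror `PEAK_WINDOW_MIN_PCT` and `OFFPEAK_WINDOW_MAX_PCT`; the function name and sample windows are hypothetical.

```javascript
const PEAK_MIN_PCT = 80;
const OFFPEAK_MAX_PCT = 20;

// Classify one window (an array of calls) by its percentage of peak calls.
function classifyWindow(calls) {
  const peakCalls = calls.filter(c => c.peak_hour).length;
  const peakPct = (peakCalls / calls.length) * 100;
  if (peakPct >= PEAK_MIN_PCT) return 'peak';
  if (peakPct <= OFFPEAK_MAX_PCT) return 'off-peak';
  return 'mixed';
}

// Build an invented window with `peak` flagged calls out of `total`.
function makeWindow(peak, total) {
  return Array.from({ length: total }, (_, i) => ({ peak_hour: i < peak }));
}

console.log(classifyWindow(makeWindow(9, 10)));
console.log(classifyWindow(makeWindow(1, 10)));
console.log(classifyWindow(makeWindow(5, 10)));
```

Mixed windows are set aside so that the peak and off-peak means are not diluted by windows that straddle both periods.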
321
+
322
+ // ─── Output rendering ───────────────────────────────────────────────────────
323
+
324
+ function fmt(n, decimals = 2) {
325
+ if (n === null || n === undefined || !isFinite(n)) return 'n/a';
326
+ if (Math.abs(n) >= 1e6) return (n / 1e6).toFixed(decimals) + 'M';
327
+ if (Math.abs(n) >= 1e3) return (n / 1e3).toFixed(decimals) + 'K';
328
+ return n.toFixed(decimals);
329
+ }
330
+
331
+ function pct(n) { return (n * 100).toFixed(1) + '%'; }
332
+
333
+ function printText(report) {
334
+ const { meta, windows, fit, peak } = report;
335
+
336
+ console.log('═══════════════════════════════════════════════════════════════════════');
337
+ console.log(' CLAUDE 5-HOUR QUOTA ANALYSIS');
338
+ console.log('═══════════════════════════════════════════════════════════════════════');
339
+ console.log();
340
+ console.log(`Data source: ${meta.file}`);
341
+ console.log(`Total entries: ${meta.totalRows}`);
342
+ console.log(`With q5h_pct: ${meta.withQuota} (${pct(meta.withQuota / meta.totalRows)})`);
343
+ console.log(`Time range: ${meta.timeStart}`);
344
+ console.log(` → ${meta.timeEnd}`);
345
+ console.log(`Reset windows: ${windows.total} detected, ${windows.usable} usable for fit`);
346
+ console.log();
347
+
348
+ if (windows.usable < 2) {
349
+ console.log('⚠ Not enough usable reset windows to fit counting models.');
350
+ console.log(' Need at least 2 windows with a q5h_pct increase of ≥ 5 percentage points.');
351
+ console.log(' Run the interceptor through more activity and try again.');
352
+ return;
353
+ }
354
+
355
+ console.log('───────────────────────────────────────────────────────────────────────');
356
+ console.log(' Per-window breakdown');
357
+ console.log('───────────────────────────────────────────────────────────────────────');
358
+ console.log();
359
+ console.log(' ' + 'Window'.padEnd(34) + 'Calls'.padStart(6) + 'Δq5h'.padStart(6) + 'Peak%'.padStart(7) + 'EqToks'.padStart(10) + '100%impl'.padStart(11));
360
+ for (const wr of report.windowRows) {
361
+ console.log(' ' + wr.label.padEnd(34) + String(wr.calls).padStart(6) + (wr.delta + '%').padStart(6) + (wr.peakFraction.toFixed(0) + '%').padStart(7) + fmt(wr.eq).padStart(10) + fmt(wr.implied100).padStart(11));
362
+ }
363
+ console.log();
364
+
365
+ console.log('───────────────────────────────────────────────────────────────────────');
366
+ console.log(' Q1: Does cache_read count toward 5h quota?');
367
+ console.log('───────────────────────────────────────────────────────────────────────');
368
+ console.log();
369
+ console.log(' Tests three hypotheses against your data. Lower CV = better fit.');
370
+ console.log();
371
+ console.log(' ' + 'Hypothesis'.padEnd(34) + 'Mean impl 100%'.padStart(18) + 'CV'.padStart(10));
372
+ for (const key of ['zero', 'billed', 'full']) {
373
+ const f = fit.fits[key];
374
+ const marker = key === fit.bestKey ? ' ★' : '';
375
+ console.log(' ' + f.label.padEnd(34) + (fmt(f.mean) + ' tok').padStart(18) + (f.cv === Infinity ? 'inf' : (f.cv * 100).toFixed(1) + '%').padStart(10) + marker);
376
+ }
377
+ console.log();
378
+ console.log(' ★ = best fit (lowest coefficient of variation)');
379
+ console.log();
380
+ const bestFit = fit.fits[fit.bestKey];
381
+ if (bestFit.cv < 0.5) {
382
+ console.log(` Verdict: ${bestFit.label} is the best fit (CV ${(bestFit.cv * 100).toFixed(1)}%).`);
383
+ if (fit.bestKey === 'zero') {
384
+ console.log(' Interpretation: cache_read does NOT meaningfully count toward your 5h quota.');
385
+ console.log(' The cache really is saving you quota, not just billing.');
386
+ } else if (fit.bestKey === 'billed') {
387
+ console.log(' Interpretation: cache_read counts at the BILLING rate (0.1x of input).');
388
+ console.log(' Quota and billing are aligned for cache reads.');
389
+ } else {
390
+ console.log(' Interpretation: cache_read counts at the FULL input rate for quota purposes.');
391
+ console.log(' This means cache hits save you billing but NOT quota — a stealth multiplier.');
392
+ }
393
+ } else {
394
+ console.log(` Verdict: No clear winner. Best fit (${bestFit.label}) has CV ${(bestFit.cv * 100).toFixed(1)}%.`);
395
+ console.log(' Likely cause: small sample, mixed-model traffic, or sliding-window noise.');
396
+ console.log(' Run for longer and try again.');
397
+ }
398
+ console.log();
399
+
400
+ console.log('───────────────────────────────────────────────────────────────────────');
401
+ console.log(' Q2: Do peak hours cost more quota per token?');
402
+ console.log('───────────────────────────────────────────────────────────────────────');
403
+ console.log();
404
+ console.log(` Peak hours: weekday 13:00–19:00 UTC (interceptor default)`);
405
+ console.log();
406
+ if (peak.peakDom.length === 0 && peak.offDom.length === 0) {
407
+ console.log(' Not enough peak-dominant or off-peak-dominant windows to compare.');
408
+ console.log(' Need at least 1 of each (≥80% same-bucket calls per window).');
409
+ } else {
410
+ console.log(' ' + 'Group'.padEnd(20) + 'Windows'.padStart(10) + 'Mean impl 100%'.padStart(20));
411
+ if (peak.peakDom.length > 0) {
412
+ const m = mean(peak.peakDom.map(p => p.implied100));
413
+ console.log(' ' + 'Peak-dominant'.padEnd(20) + String(peak.peakDom.length).padStart(10) + (fmt(m) + ' tok').padStart(20));
414
+ }
415
+ if (peak.offDom.length > 0) {
416
+ const m = mean(peak.offDom.map(p => p.implied100));
417
+ console.log(' ' + 'Off-peak'.padEnd(20) + String(peak.offDom.length).padStart(10) + (fmt(m) + ' tok').padStart(20));
418
+ }
419
+ if (peak.peakDom.length > 0 && peak.offDom.length > 0) {
420
+ const peakMean = mean(peak.peakDom.map(p => p.implied100));
421
+ const offMean = mean(peak.offDom.map(p => p.implied100));
422
+ const ratio = peakMean / offMean;
423
+ console.log();
424
+ if (ratio < 0.85) {
425
+ console.log(` ⚠ Peak windows imply ${pct(ratio)} of off-peak quota.`);
426
+ console.log(` That's a ${pct(1 - ratio)} effective quota REDUCTION during peak hours.`);
427
+ console.log(' Same usage pattern, fewer tokens until you hit 100%.');
428
+ } else if (ratio > 1.15) {
429
+ console.log(` Peak windows imply ${pct(ratio)} of off-peak quota — peak is MORE generous?`);
430
+ console.log(' Unusual. Check your sample size and time range.');
431
+ } else {
432
+ console.log(` Peak / off-peak ratio is ${pct(ratio)} — no significant peak penalty detected.`);
433
+ }
434
+ } else {
435
+ console.log();
436
+ console.log(' Need both peak-dominant AND off-peak-dominant windows for the comparison.');
437
+ }
438
+ }
439
+ console.log();
440
+
441
+ console.log('───────────────────────────────────────────────────────────────────────');
442
+ console.log(' Q3: Implied 5h quota for your account');
443
+ console.log('───────────────────────────────────────────────────────────────────────');
444
+ console.log();
445
+ console.log(` Under best-fit model (${fit.fits[fit.bestKey].label}):`);
446
+ console.log(` Mean implied 100% quota: ${fmt(fit.fits[fit.bestKey].mean)} token-equivalents`);
447
+ console.log();
448
+ console.log(' Token-equivalent weights used:');
449
+ console.log(` uncached input × ${W_UNCACHED_INPUT}`);
450
+ console.log(` output × ${W_OUTPUT} (Opus output is 5x input rate)`);
451
+ console.log(` cache_creation × ${W_CACHE_CREATION} (1h tier; 5m tier would be 1.25)`);
452
+ console.log(` cache_read × ${fit.fits[fit.bestKey].weight} (this hypothesis)`);
453
+ console.log();
454
+ console.log(' Compare against your subscription tier and plan estimate. If your');
455
+ console.log(' number is wildly different from other reports, your sample may be');
456
+ console.log(' too small or your model mix may differ significantly.');
457
+ console.log();
458
+
459
+ console.log('═══════════════════════════════════════════════════════════════════════');
460
+ console.log();
461
+ console.log('Caveats:');
462
+ console.log(' • q5h is a 5-hour SLIDING window; we approximate as discrete resets');
463
+ console.log(' • Single account; aggregate findings need cross-validation');
464
+ console.log(' • cache_creation TTL weight averaged at 2.0; mixed 5m/1h would lower it');
465
+ console.log(' • Only Anthropic knows the exact quota formula');
466
+ console.log();
467
+ console.log('Reference: anthropics/claude-code#45756');
468
+ console.log('Report your findings: open an issue or PR on cnighswonger/claude-code-cache-fix');
469
+ }
470
+
471
+ function printJson(report) {
472
+ console.log(JSON.stringify(report, null, 2));
473
+ }
474
+
475
+ // ─── Main ───────────────────────────────────────────────────────────────────
476
+
477
+ function main() {
478
+ const opts = parseArgs();
479
+ if (opts.help) { printUsage(); return; }
480
+
481
+ const filePath = opts.file || DEFAULT_USAGE_LOG;
482
+ const rawRows = loadUsage(filePath);
483
+ const filtered = filterSince(rawRows, opts.since);
484
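+ // Rows also need a string timestamp: the sort and window labels below depend on it.
+ const withQuota = filtered.filter(r => typeof r.q5h_pct === 'number' && typeof r.timestamp === 'string');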
485
+
486
+ if (withQuota.length === 0) {
487
+ console.error('No entries with q5h_pct field found.');
488
+ console.error('This field was added in claude-code-cache-fix v1.6.1 (2026-04-09).');
489
+ console.error('Older log entries are silently filtered out.');
490
+ process.exit(1);
491
+ }
492
+
493
+ withQuota.sort((a, b) => a.timestamp.localeCompare(b.timestamp));
494
+ const allWindows = findResetWindows(withQuota);
495
+ const fit = fitCountingModels(allWindows);
496
+
497
+ // Use the best-fit weight for the peak/off-peak analysis
498
+ const bestWeight = fit.fits[fit.bestKey].weight;
499
+ const peak = peakSplit(allWindows, bestWeight);
500
+
501
+ // Build per-window rows for the breakdown table
502
+ const windowRows = [];
503
+ for (const w of allWindows) {
504
+ const delta = windowDeltaQ5h(w);
505
+ if (delta < MIN_DELTA_Q5H) continue;
506
+ const eq = windowEquivalent(w, bestWeight);
507
+ const implied100 = eq / (delta / 100);
508
+ const pf = windowPeakFraction(w) * 100;
509
+ windowRows.push({
510
+ label: `${w[0].timestamp.slice(5, 16)} → ${w[w.length - 1].timestamp.slice(5, 16)}`,
511
+ calls: w.length,
512
+ delta,
513
+ peakFraction: pf,
514
+ eq,
515
+ implied100,
516
+ });
517
+ }
518
+
519
+ const report = {
520
+ meta: {
521
+ file: filePath,
522
+ totalRows: rawRows.length,
523
+ filteredRows: filtered.length,
524
+ withQuota: withQuota.length,
525
+ timeStart: withQuota[0].timestamp,
526
+ timeEnd: withQuota[withQuota.length - 1].timestamp,
527
+ since: opts.since,
528
+ },
529
+ windows: { total: allWindows.length, usable: fit.usableWindows },
530
+ windowRows,
531
+ fit,
532
+ peak,
533
+ };
534
+
535
+ if (opts.format === 'json') printJson(report);
536
+ else printText(report);
537
+ }
538
+
539
+ main();
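
The per-window arithmetic in `main()` above (`eq / (delta / 100)`) and the "lower CV = better fit" comparison can be sketched in isolation. This is a minimal illustration, not the tool's implementation: the `WEIGHTS` values, `sketchEquivalent`, `implied100`, and `coefficientOfVariation` are hypothetical names and assumptions, and the population standard deviation is an assumed choice.

```javascript
// Sketch only: these weights are assumptions for illustration, not the
// tool's actual W_* constants.
const WEIGHTS = { input: 1, output: 5, cacheCreation: 2 };

// Weighted token-equivalents for one window of usage rows, with the
// cache_read weight supplied by the hypothesis under test (0, 0.1, or 1).
function sketchEquivalent(rows, cacheReadWeight) {
  let total = 0;
  for (const r of rows) {
    total += (r.input_tokens || 0) * WEIGHTS.input
      + (r.output_tokens || 0) * WEIGHTS.output
      + (r.cache_creation_input_tokens || 0) * WEIGHTS.cacheCreation
      + (r.cache_read_input_tokens || 0) * cacheReadWeight;
  }
  return total;
}

// If `eq` token-equivalents moved q5h by `deltaPct` percentage points,
// extrapolate linearly to the size of a full 100% quota.
function implied100(eq, deltaPct) {
  return eq / (deltaPct / 100);
}

// Coefficient of variation: population standard deviation divided by mean.
// Returns Infinity when it is undefined (no values, or a zero mean).
function coefficientOfVariation(values) {
  if (values.length === 0) return Infinity;
  const m = values.reduce((a, b) => a + b, 0) / values.length;
  if (m === 0) return Infinity;
  const variance = values.reduce((a, b) => a + (b - m) ** 2, 0) / values.length;
  return Math.sqrt(variance) / m;
}

// Example: one window, evaluated under the "cache_read is free" hypothesis.
const exampleWindow = [
  { input_tokens: 1000, output_tokens: 200, cache_read_input_tokens: 50000 },
];
const eqZero = sketchEquivalent(exampleWindow, 0);
console.log(eqZero, implied100(eqZero, 20));
```

A hypothesis whose implied 100% values agree across windows yields a low CV, which is the reason the report stars the lowest-CV row.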
@@ -0,0 +1,60 @@
1
+ #!/usr/bin/env bash
2
+ # sim-cost-reconcile — One-liner wrapper for running cost-report.mjs against
3
+ # a simulation log with admin API cross-reference enabled.
4
+ #
5
+ # Usage:
6
+ # sim-cost-reconcile <sim-dir-or-log> [extra cost-report.mjs args...]
7
+ #
8
+ # Examples:
9
+ # sim-cost-reconcile ~/git_repos/kanfei_test/kanfei-nowcast/.test_cache/simulations/realtime_sim_harnett_county_qlcs_2026_20260411_024836
10
+ # sim-cost-reconcile path/to/simulation.log --format md > report.md
11
+ #
12
+ # Reads admin key from $ANTHROPIC_ADMIN_KEY or ~/.config/anthropic/admin-key.
13
+ # If no admin key is available, runs with telemetry only and warns.
14
+ #
15
+ # NOTE on admin reconciliation: the admin API returns data at 1h-bucket
16
+ # resolution, so if multiple sims (or other API activity) overlap the same
17
+ # hour, the admin total will include all of it. For an accurate multi-sim
18
+ # aggregate, run this on each sim and sum the telemetry totals, then pull
19
+ # the admin total once over the full window.
20
+
21
+ set -euo pipefail
22
+
23
+ SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
24
+ COST_REPORT="$SCRIPT_DIR/cost-report.mjs"
25
+
26
+ if [[ $# -lt 1 ]]; then
27
+ echo "Usage: $(basename "$0") <sim-dir-or-log> [extra cost-report args...]" >&2
28
+ exit 1
29
+ fi
30
+
31
+ TARGET="$1"
32
+ shift
33
+
34
+ # Resolve a dir to its simulation.log
35
+ if [[ -d "$TARGET" ]]; then
36
+ LOG="$TARGET/simulation.log"
37
+ if [[ ! -f "$LOG" ]]; then
38
+ echo "ERROR: no simulation.log in $TARGET" >&2
39
+ exit 1
40
+ fi
41
+ elif [[ -f "$TARGET" ]]; then
42
+ LOG="$TARGET"
43
+ else
44
+ echo "ERROR: $TARGET is neither a file nor a directory" >&2
45
+ exit 1
46
+ fi
47
+
48
+ # Load admin key
49
+ ADMIN_KEY_FILE="${HOME}/.config/anthropic/admin-key"
50
+ if [[ -n "${ANTHROPIC_ADMIN_KEY:-}" ]]; then
51
+ KEY="$ANTHROPIC_ADMIN_KEY"
52
+ elif [[ -r "$ADMIN_KEY_FILE" ]]; then
53
+ KEY="$(cat "$ADMIN_KEY_FILE")"
54
+ else
55
+ echo "WARNING: no admin key found ($ADMIN_KEY_FILE missing, ANTHROPIC_ADMIN_KEY unset)" >&2
56
+ echo " running telemetry-only — pass --admin-key or set env var to enable reconciliation" >&2
57
+ exec node "$COST_REPORT" --sim-log "$LOG" "$@"
58
+ fi
59
+
60
+ exec node "$COST_REPORT" --sim-log "$LOG" --admin-key "$KEY" "$@"
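
The wrapper's key lookup order (environment variable first, then the key file, otherwise telemetry-only) could be expressed in JavaScript roughly as follows. `resolveAdminKey` and `buildArgs` are hypothetical helpers written for this sketch; only the `--sim-log` and `--admin-key` flags come from the script above.

```javascript
// Hypothetical helper mirroring the wrapper's lookup order. `readKeyFile` is
// injected so the sketch has no filesystem dependency; it should return the
// file contents, or null when the file is missing or unreadable.
function resolveAdminKey(env, readKeyFile) {
  if (env.ANTHROPIC_ADMIN_KEY) {
    return { key: env.ANTHROPIC_ADMIN_KEY, source: 'env' };
  }
  const fromFile = readKeyFile();
  if (fromFile != null) {
    // Shell command substitution drops trailing newlines; do the same here.
    return { key: fromFile.replace(/\n+$/, ''), source: 'file' };
  }
  return { key: null, source: 'none' };
}

// Build the cost-report argument list, adding --admin-key only when a key
// was found, then appending any pass-through arguments.
function buildArgs(logPath, resolved, extra) {
  const args = ['--sim-log', logPath];
  if (resolved.key) args.push('--admin-key', resolved.key);
  return args.concat(extra);
}

const resolved = resolveAdminKey({}, () => 'example-key\n');
console.log(buildArgs('simulation.log', resolved, ['--format', 'md']));
```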
@@ -0,0 +1,352 @@
1
+ #!/usr/bin/env node
2
+ /**
3
+ * usage-to-dashboard-ndjson — Translate claude-code-cache-fix's usage.jsonl
4
+ * into the proxy NDJSON format expected by @fgrosswig's claude-usage-dashboard,
5
+ * and write to the directory his dashboard already watches.
6
+ *
7
+ * https://github.com/fgrosswig/claude-usage-dashboard
8
+ *
9
+ * # Why this exists
10
+ *
11
+ * Our interceptor and fgrosswig's dashboard are strongly complementary:
12
+ * the interceptor captures per-call API data from inside the Node.js process
13
+ * (cache metrics, quota state, request rewrites), while his dashboard
14
+ * provides visualization, historical trending, and multi-host aggregation.
15
+ *
16
+ * Rather than build our own visualization layer, we translate our per-call
17
+ * usage records into the NDJSON schema his dashboard ingests. A user running
18
+ * both tools gets the best of both: the interceptor fixes what it can fix
19
+ * and emits rich per-call data, and the dashboard displays that data
20
+ * alongside whatever Claude Code's own session JSONLs already capture.
21
+ *
22
+ * # What this tool does
23
+ *
24
+ * Reads `~/.claude/usage.jsonl` (our interceptor's per-call log) and
25
+ * translates each entry into a minimal-but-compatible record in the shape
26
+ * his dashboard expects under `~/.claude/anthropic-proxy-logs/*.ndjson`.
27
+ * The output file follows the convention `proxy-YYYY-MM-DD.ndjson`, one
28
+ * file per UTC day, matching the filename pattern his `collectProxyNdjsonFiles()`
29
+ * helper discovers.
30
+ *
31
+ * # Fields emitted
32
+ *
33
+ * Mapped from our usage.jsonl to fgrosswig's proxy-core.js shape:
34
+ *
35
+ * {
36
+ * "ts_start": <our timestamp>,
37
+ * "ts_end": <our timestamp>, // single-point, no duration
38
+ * "duration_ms": null, // we don't measure this
39
+ * "method": "POST",
40
+ * "path": "/v1/messages",
41
+ * "upstream_status": 200, // implicit from usage presence
42
+ * "usage": {
43
+ * "input_tokens": <ours>,
44
+ * "output_tokens": <ours>,
45
+ * "cache_read_input_tokens": <ours>,
46
+ * "cache_creation_input_tokens": <ours>
47
+ * },
48
+ * "cache_read_ratio": <computed>,
49
+ * "cache_health": "healthy" | "affected" | "mixed" | "na",
50
+ * "request_hints": { "model": <ours> },
51
+ * "response_anthropic_headers": { // if quota fields available
52
+ * "anthropic-ratelimit-unified-5h-utilization": "<ours>",
53
+ * "anthropic-ratelimit-unified-7d-utilization": "<ours>"
54
+ * },
55
+ * "ttl_tier": <ours, interceptor-specific>,
56
+ * "ephemeral_1h_input_tokens": <ours, interceptor-specific>,
57
+ * "ephemeral_5m_input_tokens": <ours, interceptor-specific>,
58
+ * "source": "claude-code-cache-fix"
59
+ * }
60
+ *
61
+ * Extra fields beyond fgrosswig's native schema (ttl_tier, ephemeral_*,
62
+ * peak_hour, source) are added for forward-compatibility; his dashboard ignores
63
+ * unknown fields per its tolerant-ingest design, and our own tooling
64
+ * downstream may find them useful when consuming the same NDJSON.
65
+ *
66
+ * # Usage
67
+ *
68
+ * # One-shot translation (reads all of usage.jsonl, rewrites one file per UTC day)
69
+ * node tools/usage-to-dashboard-ndjson.mjs
70
+ *
71
+ * # Follow mode (tail usage.jsonl, append new records as they arrive)
72
+ * node tools/usage-to-dashboard-ndjson.mjs --follow
73
+ *
74
+ * # Custom input/output paths
75
+ * node tools/usage-to-dashboard-ndjson.mjs --input /path/to/usage.jsonl --output-dir /path/to/ndjson-dir
76
+ *
77
+ * # Dry-run: print to stdout instead of writing files
78
+ * node tools/usage-to-dashboard-ndjson.mjs --stdout
79
+ *
80
+ * # Environment
81
+ *
82
+ * ANTHROPIC_PROXY_LOG_DIR Override output directory (matches fgrosswig's
83
+ * dashboard env var so both tools stay in sync).
84
+ *
85
+ * Part of claude-code-cache-fix. MIT licensed.
86
+ * https://github.com/cnighswonger/claude-code-cache-fix
87
+ */
88
+
89
+ import { readFileSync, writeFileSync, appendFileSync, existsSync, mkdirSync, statSync, watch } from 'node:fs';
90
+ import { join } from 'node:path';
91
+ import { homedir } from 'node:os';
92
+
93
+ // ─── CLI parsing ────────────────────────────────────────────────────────────
94
+
95
+ function parseArgs() {
96
+ const args = process.argv.slice(2);
97
+ const opts = {
98
+ input: join(homedir(), '.claude', 'usage.jsonl'),
99
+ outputDir: process.env.ANTHROPIC_PROXY_LOG_DIR || join(homedir(), '.claude', 'anthropic-proxy-logs'),
100
+ stdout: false,
101
+ follow: false,
102
+ help: false,
103
+ };
104
+
105
+ for (let i = 0; i < args.length; i++) {
106
+ switch (args[i]) {
107
+ case '--input': opts.input = args[++i]; break;
108
+ case '--output-dir': opts.outputDir = args[++i]; break;
109
+ case '--stdout': opts.stdout = true; break;
110
+ case '--follow': opts.follow = true; break;
111
+ case '-h':
112
+ case '--help': opts.help = true; break;
113
+ default:
114
+ console.error(`unknown flag: ${args[i]}`);
115
+ opts.help = true;
116
+ }
117
+ }
118
+
119
+ return opts;
120
+ }
121
+
122
+ function printUsage() {
123
+ console.log(`usage-to-dashboard-ndjson — Translate cache-fix usage.jsonl to fgrosswig dashboard NDJSON.
124
+
125
+ Usage:
126
+ node usage-to-dashboard-ndjson.mjs One-shot: read all, rewrite one file per UTC day
127
+ node usage-to-dashboard-ndjson.mjs --follow Tail usage.jsonl, append new records live
128
+ node usage-to-dashboard-ndjson.mjs --stdout Print NDJSON to stdout instead of files
129
+ node usage-to-dashboard-ndjson.mjs --input <path> Custom input (default: ~/.claude/usage.jsonl)
130
+ node usage-to-dashboard-ndjson.mjs --output-dir <path> Custom output dir (default: ~/.claude/anthropic-proxy-logs)
131
+
132
+ Output files follow the convention: proxy-YYYY-MM-DD.ndjson (one per UTC day).
133
+
134
+ Environment:
135
+ ANTHROPIC_PROXY_LOG_DIR Override output directory (also used by fgrosswig's dashboard).
136
+
137
+ Credit: this tool writes the NDJSON schema expected by @fgrosswig's
138
+ claude-usage-dashboard (https://github.com/fgrosswig/claude-usage-dashboard).
139
+ Running both tools together gives users per-call data from our interceptor
140
+ plus the visualization layer from his dashboard, with no coordination needed.
141
+ `);
142
+ }
143
+
144
+ // ─── Record translation ─────────────────────────────────────────────────────
145
+
146
+ /**
147
+ * Translate one claude-code-cache-fix usage.jsonl record into a
148
+ * fgrosswig-dashboard-compatible NDJSON record. Returns null if the
149
+ * record doesn't have enough fields to be usable.
150
+ */
151
+ function translateRecord(entry) {
152
+ if (!entry || !entry.timestamp || !entry.model) return null;
153
+
154
+ const inTok = entry.input_tokens || 0;
155
+ const outTok = entry.output_tokens || 0;
156
+ const crTok = entry.cache_read_input_tokens || 0;
157
+ const ccTok = entry.cache_creation_input_tokens || 0;
158
+
159
+ // Cache health (fgrosswig's semantic labels)
160
+ const totalCacheInput = crTok + ccTok;
161
+ const cacheReadRatio = totalCacheInput > 0 ? crTok / totalCacheInput : null;
162
+ let cacheHealth = 'na';
163
+ if (cacheReadRatio != null) {
164
+ if (cacheReadRatio >= 0.8) cacheHealth = 'healthy';
165
+ else if (cacheReadRatio < 0.4 && ccTok > 0) cacheHealth = 'affected';
166
+ else cacheHealth = 'mixed';
167
+ }
168
+
169
+ // Reconstruct a minimal response_anthropic_headers blob from the quota
170
+ // pct fields we captured. Not byte-identical to what the proxy would see
171
+ // on the wire, but structurally compatible for the dashboard's consumers.
172
+ const responseHeaders = {};
173
+ if (entry.q5h_pct != null) {
174
+ responseHeaders['anthropic-ratelimit-unified-5h-utilization'] = String(entry.q5h_pct / 100);
175
+ }
176
+ if (entry.q7d_pct != null) {
177
+ responseHeaders['anthropic-ratelimit-unified-7d-utilization'] = String(entry.q7d_pct / 100);
178
+ }
179
+
180
+ const rec = {
181
+ ts_start: entry.timestamp,
182
+ ts_end: entry.timestamp,
183
+ duration_ms: null,
184
+ method: 'POST',
185
+ path: '/v1/messages',
186
+ upstream_status: 200,
187
+ usage: {
188
+ input_tokens: inTok,
189
+ output_tokens: outTok,
190
+ cache_read_input_tokens: crTok,
191
+ cache_creation_input_tokens: ccTok,
192
+ },
193
+ cache_read_ratio: cacheReadRatio,
194
+ cache_health: cacheHealth,
195
+ request_hints: {
196
+ model: entry.model,
197
+ },
198
+ response_anthropic_headers: responseHeaders,
199
+ // Interceptor-specific extras — fgrosswig's dashboard ignores unknown
200
+ // fields, so these pass through without breaking ingestion.
201
+ ttl_tier: entry.ttl_tier || null,
202
+ ephemeral_1h_input_tokens: entry.ephemeral_1h_input_tokens || 0,
203
+ ephemeral_5m_input_tokens: entry.ephemeral_5m_input_tokens || 0,
204
+ peak_hour: entry.peak_hour || false,
205
+ source: 'claude-code-cache-fix',
206
+ };
207
+
208
+ // Synthesize a stable pseudo-request-id from timestamp + model for dedup
209
+ // at the dashboard layer. Not a real request ID — just a deterministic key.
210
+ rec.req_id = 'ccf_' + entry.timestamp.replace(/[^0-9]/g, '') + '_' + entry.model.slice(-6);
211
+
212
+ return rec;
213
+ }
214
+
215
+ // ─── File output ────────────────────────────────────────────────────────────
216
+
217
+ function dayFileFor(outputDir, isoTimestamp) {
218
+ // proxy-YYYY-MM-DD.ndjson from UTC date
219
+ const date = isoTimestamp.slice(0, 10);
220
+ return join(outputDir, `proxy-${date}.ndjson`);
221
+ }
222
+
223
+ function ensureDir(dir) {
224
+ if (!existsSync(dir)) mkdirSync(dir, { recursive: true });
225
+ }
226
+
227
+ function writeRecords(records, outputDir, useStdout) {
228
+ if (useStdout) {
229
+ for (const r of records) {
230
+ process.stdout.write(JSON.stringify(r) + '\n');
231
+ }
232
+ return records.length;
233
+ }
234
+
235
+ ensureDir(outputDir);
236
+
237
+ // Group by day for efficient appending
238
+ const byDay = new Map();
239
+ for (const r of records) {
240
+ const day = dayFileFor(outputDir, r.ts_start);
241
+ if (!byDay.has(day)) byDay.set(day, []);
242
+ byDay.get(day).push(r);
243
+ }
244
+
245
+ for (const [dayFile, dayRecords] of byDay) {
246
+ const payload = dayRecords.map(r => JSON.stringify(r)).join('\n') + '\n';
247
+ // Overwrite in one-shot mode: a full replay of the input regenerates every
248
+ // day file it touches, so rewriting them is idempotent.
249
+ writeFileSync(dayFile, payload);
250
+ }
251
+
252
+ return records.length;
253
+ }
254
+
255
+ // ─── One-shot batch mode ────────────────────────────────────────────────────
256
+
257
+ function runBatch(opts) {
258
+ if (!existsSync(opts.input)) {
259
+ console.error(`ERROR: input file not found: ${opts.input}`);
260
+ process.exit(1);
261
+ }
262
+
263
+ const raw = readFileSync(opts.input, 'utf8');
264
+ const lines = raw.split('\n').filter(l => l.trim());
265
+ const records = [];
266
+ let skipped = 0;
267
+
268
+ for (const line of lines) {
269
+ try {
270
+ const entry = JSON.parse(line);
271
+ const rec = translateRecord(entry);
272
+ if (rec) records.push(rec);
273
+ else skipped++;
274
+ } catch {
275
+ skipped++;
276
+ }
277
+ }
278
+
279
+ const written = writeRecords(records, opts.outputDir, opts.stdout);
280
+ if (!opts.stdout) {
281
+ console.error(`usage-to-dashboard-ndjson: wrote ${written} records to ${opts.outputDir} (${skipped} skipped)`);
282
+ }
283
+ }
284
+
285
+ // ─── Follow mode ────────────────────────────────────────────────────────────
286
+
287
+ function runFollow(opts) {
288
+ if (!existsSync(opts.input)) {
289
+ console.error(`ERROR: input file not found: ${opts.input}`);
290
+ process.exit(1);
291
+ }
292
+
293
+ // First, catch up on the existing file (idempotent write)
294
+ runBatch(opts);
295
+
296
+ // Then watch for new entries
297
+ console.error(`usage-to-dashboard-ndjson: watching ${opts.input} for new records...`);
298
+ let lastSize = statSync(opts.input).size;
299
+
300
+ watch(opts.input, { persistent: true }, () => {
301
+ let currentSize;
302
+ try { currentSize = statSync(opts.input).size; } catch { return; }
303
+ if (currentSize <= lastSize) {
304
+ // File truncated or unchanged — rewind lastSize
305
+ if (currentSize < lastSize) lastSize = 0;
306
+ return;
307
+ }
308
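+ // Re-read the file as bytes and keep only what was appended since the last
+ // event. lastSize is a byte count, so slice the Buffer, not a decoded string.
309
+ try {
310
+ const buf = readFileSync(opts.input);
311
+ const newContent = buf.subarray(lastSize).toString('utf8');
312
+ lastSize = buf.length;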
313
+ const newLines = newContent.split('\n').filter(l => l.trim());
314
+ const newRecs = [];
315
+ for (const line of newLines) {
316
+ try {
317
+ const entry = JSON.parse(line);
318
+ const rec = translateRecord(entry);
319
+ if (rec) newRecs.push(rec);
320
+ } catch {}
321
+ }
322
+ if (newRecs.length > 0) {
323
+ // Append each record to the day file that matches its own timestamp
324
+ ensureDir(opts.outputDir);
325
+ for (const r of newRecs) {
326
+ const dayFile = dayFileFor(opts.outputDir, r.ts_start);
327
+ appendFileSync(dayFile, JSON.stringify(r) + '\n');
328
+ }
329
+ console.error(`[${new Date().toISOString()}] appended ${newRecs.length} records`);
330
+ }
331
+ } catch (err) {
332
+ console.error(`watch error: ${err.message}`);
333
+ }
334
+ });
335
+
336
+ // Keep the process alive
337
+ process.stdin.resume();
338
+ }
339
+
340
+ // ─── Main ───────────────────────────────────────────────────────────────────
341
+
342
+ const opts = parseArgs();
343
+ if (opts.help) {
344
+ printUsage();
345
+ process.exit(0);
346
+ }
347
+
348
+ if (opts.follow) {
349
+ runFollow(opts);
350
+ } else {
351
+ runBatch(opts);
352
+ }
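
Follow mode tracks its position as a byte count taken from `statSync().size`. One way to consume only complete, newly appended lines from a byte offset is sketched below. `takeNewLines` is a hypothetical helper, not part of the tool; it also holds back a trailing partial line until the newline arrives, which is an added assumption about how the log is written.

```javascript
// Hypothetical helper: given the full file contents as a Buffer and the byte
// offset already processed, return the complete new lines plus the offset to
// resume from. Bytes after the last newline are left for the next call.
function takeNewLines(buf, offset) {
  const chunk = buf.subarray(offset);
  const lastNewline = chunk.lastIndexOf(0x0a); // 0x0a is the byte for '\n'
  if (lastNewline === -1) return { lines: [], nextOffset: offset };
  const text = chunk.subarray(0, lastNewline).toString('utf8');
  const lines = text.split('\n').filter(l => l.trim());
  return { lines, nextOffset: offset + lastNewline + 1 };
}

// The first record contains a two-byte UTF-8 character, so its byte length
// and its string length differ; slicing a decoded string at the byte offset
// would cut into the next record.
const first = Buffer.from('{"m":"é"}\n', 'utf8');
const full = Buffer.concat([first, Buffer.from('{"n":1}\n{"part', 'utf8')]);
const result = takeNewLines(full, first.length);
console.log(result.lines, result.nextOffset);
```

Working on bytes keeps the offset comparable with the size reported by `statSync`, and deferring the incomplete tail avoids discarding a record that is only half written when the watch event fires.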