claude-code-cache-fix 1.7.1 → 1.7.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -105,6 +105,42 @@ The module intercepts `globalThis.fetch` before Claude Code makes API calls to `
105
105
 
106
106
  All fixes are idempotent — if nothing needs fixing, the request passes through unmodified. The interceptor is read-only with respect to your conversation; it only normalizes the request structure before it hits the API.
107
107
 
108
+ ## Status line — quota warnings in real time
109
+
110
+ The interceptor writes quota state to `~/.claude/quota-status.json` on every API call. The included `tools/quota-statusline.sh` script reads this file and displays a live status line in Claude Code showing:
111
+
112
+ - **Q5h %** with burn rate (%/min)
113
+ - **Q7d %** with burn rate (%/hr)
114
+ - **TTL tier** — shows `TTL:1h` when healthy, **`TTL:5m` in red when the server has downgraded you** (typically at Q5h ≥ 100%)
115
+ - **PEAK** in yellow during weekday peak hours (13:00–19:00 UTC)
116
+ - **Cache hit rate %**
117
+ - **OVERAGE** flag when active
118
+
119
+ ### Setup
120
+
121
+ Copy the script and configure Claude Code to use it:
122
+
123
+ ```bash
124
+ # Copy from the npm package to Claude Code's hooks directory
125
+ mkdir -p ~/.claude/hooks
126
+ cp "$(npm root -g)/claude-code-cache-fix/tools/quota-statusline.sh" ~/.claude/hooks/
127
+ chmod +x ~/.claude/hooks/quota-statusline.sh
128
+ ```
129
+
130
+ Add to `~/.claude/settings.json`:
131
+
132
+ ```json
133
+ {
134
+ "statusLine": {
135
+ "command": "~/.claude/hooks/quota-statusline.sh"
136
+ }
137
+ }
138
+ ```
139
+
140
+ ### Why this matters
141
+
142
+ When the server downgrades your TTL to 5m (Layer 2 — quota-aware downgrade at Q5h ≥ 100%), **every idle longer than 5 minutes causes a full context rebuild**. Without the status line, this is invisible — you just notice things getting slower and more expensive. With the status line, the red `TTL:5m` warning tells you immediately: **stop working, wait for the Q5h window to reset, then resume**. Powering through overage compounds the drain; pausing breaks the cycle.
143
+
108
144
  ## Image stripping
109
145
 
110
146
  Images read via the Read tool are encoded as base64 and stored in `tool_result` blocks in conversation history. They ride along on **every subsequent API call** until compaction. A single 500KB image costs ~62,500 tokens per turn in carry-forward.
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "claude-code-cache-fix",
3
- "version": "1.7.1",
3
+ "version": "1.7.2",
4
4
  "description": "Fixes prompt cache regression in Claude Code that causes up to 20x cost increase on resumed sessions",
5
5
  "type": "module",
6
6
  "exports": "./preload.mjs",
@@ -0,0 +1,76 @@
1
+ #!/bin/bash
2
+ # Status line: show quota % and burn rate from claude-meter JSONL
3
+ # Rate is calculated from window start (reset_time - window_size) to now
4
+ # No prev file needed — each reading is self-contained
5
+
6
+ input=$(cat)
7
+
8
+ JSONL="$HOME/.claude/claude-meter.jsonl"
9
+
10
+ if [ -f "$JSONL" ]; then
11
+ last=$(tail -1 "$JSONL" 2>/dev/null)
12
+
13
+ result=$(echo "$last" | python3 -c "
14
+ import sys, json
15
+ from datetime import datetime, timezone
16
+
17
+ r = json.load(sys.stdin)
18
+ q5h = int(r['q5h'] * 100)
19
+ q7d = int(r.get('q7d', 0) * 100)
20
+ overage = r.get('qoverage', '')
21
+ ts = r.get('ts', '')
22
+ q5h_reset = r.get('q5h_reset', 0)
23
+ q7d_reset = r.get('q7d_reset', 0)
24
+
25
+ now = datetime.fromisoformat(ts.replace('Z', '+00:00'))
26
+
27
+ # Q5h: 5-hour window, rate = pct / minutes elapsed since window start
28
+ rate5 = ''
29
+ if q5h_reset > 0:
30
+ window_start = datetime.fromtimestamp(q5h_reset, tz=timezone.utc) - __import__('datetime').timedelta(hours=5)
31
+ elapsed_min = (now - window_start).total_seconds() / 60
32
+ if elapsed_min > 1 and q5h > 0:
33
+ rate5 = '{:+.1f}'.format(q5h / elapsed_min)
34
+
35
+ # Q7d: 7-day window
36
+ rate7 = ''
37
+ if q7d_reset > 0:
38
+ window_start_7d = datetime.fromtimestamp(q7d_reset, tz=timezone.utc) - __import__('datetime').timedelta(days=7)
39
+ elapsed_min_7d = (now - window_start_7d).total_seconds() / 60
40
+ if elapsed_min_7d > 1 and q7d > 0:
41
+ rate7 = '{:+.1f}'.format(q7d / (elapsed_min_7d / 60))
42
+
43
+ label = 'Q5h: {}%'.format(q5h)
44
+ if rate5:
45
+ label += ' ({}%/m)'.format(rate5)
46
+ label += ' | Q7d: {}%'.format(q7d)
47
+ if rate7:
48
+ label += ' ({}%/hr)'.format(rate7)
49
+ if overage == 'active':
50
+ label += ' | OVERAGE'
51
+
52
+ # Add TTL tier from quota-status.json (written by interceptor)
53
+ import os, pathlib
54
+ qs_path = pathlib.Path.home() / '.claude' / 'quota-status.json'
55
+ try:
56
+ qs = json.load(open(qs_path))
57
+ ttl = qs.get('cache', {}).get('ttl_tier', '')
58
+ hit = qs.get('cache', {}).get('hit_rate', '')
59
+ if ttl:
60
+ if ttl == '5m':
61
+ label += ' | \033[31mTTL:5m\033[0m' # red
62
+ else:
63
+ label += ' | TTL:' + ttl
64
+ if hit and hit != 'N/A':
65
+ label += ' ' + hit + '%'
66
+ peak = qs.get('peak_hour', False)
67
+ if peak:
68
+ label += ' | \033[33mPEAK\033[0m' # yellow
69
+ except:
70
+ pass
71
+
72
+ print(label)
73
+ " 2>/dev/null)
74
+
75
+ [ -n "$result" ] && echo "$result"
76
+ fi