claude-code-cache-fix 1.7.1 → 1.7.2
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +36 -0
- package/package.json +1 -1
- package/tools/quota-statusline.sh +76 -0
package/README.md
CHANGED
|
@@ -105,6 +105,42 @@ The module intercepts `globalThis.fetch` before Claude Code makes API calls to `
|
|
|
105
105
|
|
|
106
106
|
All fixes are idempotent — if nothing needs fixing, the request passes through unmodified. The interceptor is read-only with respect to your conversation; it only normalizes the request structure before it hits the API.
|
|
107
107
|
|
|
108
|
+
## Status line — quota warnings in real time
|
|
109
|
+
|
|
110
|
+
The interceptor writes quota state to `~/.claude/quota-status.json` on every API call. The included `tools/quota-statusline.sh` script reads this file and displays a live status line in Claude Code showing:
|
|
111
|
+
|
|
112
|
+
- **Q5h %** with burn rate (%/min)
|
|
113
|
+
- **Q7d %** with burn rate (%/hr)
|
|
114
|
+
- **TTL tier** — shows `TTL:1h` when healthy, **`TTL:5m` in red when the server has downgraded you** (typically at Q5h ≥ 100%)
|
|
115
|
+
- **PEAK** in yellow during weekday peak hours (13:00–19:00 UTC)
|
|
116
|
+
- **Cache hit rate %**
|
|
117
|
+
- **OVERAGE** flag when active
|
|
118
|
+
|
|
119
|
+
### Setup
|
|
120
|
+
|
|
121
|
+
Copy the script and configure Claude Code to use it:
|
|
122
|
+
|
|
123
|
+
```bash
|
|
124
|
+
# Copy from the npm package to Claude Code's hooks directory
|
|
125
|
+
mkdir -p ~/.claude/hooks
|
|
126
|
+
cp "$(npm root -g)/claude-code-cache-fix/tools/quota-statusline.sh" ~/.claude/hooks/
|
|
127
|
+
chmod +x ~/.claude/hooks/quota-statusline.sh
|
|
128
|
+
```
|
|
129
|
+
|
|
130
|
+
Add to `~/.claude/settings.json`:
|
|
131
|
+
|
|
132
|
+
```json
|
|
133
|
+
{
|
|
134
|
+
"statusLine": {
|
|
135
|
+
"command": "~/.claude/hooks/quota-statusline.sh"
|
|
136
|
+
}
|
|
137
|
+
}
|
|
138
|
+
```
|
|
139
|
+
|
|
140
|
+
### Why this matters
|
|
141
|
+
|
|
142
|
+
When the server downgrades your TTL to 5m (Layer 2 — quota-aware downgrade at Q5h ≥ 100%), **every idle longer than 5 minutes causes a full context rebuild**. Without the status line, this is invisible — you just notice things getting slower and more expensive. With the status line, the red `TTL:5m` warning tells you immediately: **stop working, wait for the Q5h window to reset, then resume**. Powering through overage compounds the drain; pausing breaks the cycle.
|
|
143
|
+
|
|
108
144
|
## Image stripping
|
|
109
145
|
|
|
110
146
|
Images read via the Read tool are encoded as base64 and stored in `tool_result` blocks in conversation history. They ride along on **every subsequent API call** until compaction. A single 500KB image costs ~62,500 tokens per turn in carry-forward.
|
package/package.json
CHANGED
|
@@ -0,0 +1,76 @@
|
|
|
1
|
+
#!/bin/bash
|
|
2
|
+
# Status line: show quota % and burn rate from claude-meter JSONL
|
|
3
|
+
# Rate is calculated from window start (reset_time - window_size) to now
|
|
4
|
+
# No prev file needed — each reading is self-contained
|
|
5
|
+
|
|
6
|
+
input=$(cat)
|
|
7
|
+
|
|
8
|
+
JSONL="$HOME/.claude/claude-meter.jsonl"
|
|
9
|
+
|
|
10
|
+
if [ -f "$JSONL" ]; then
|
|
11
|
+
last=$(tail -1 "$JSONL" 2>/dev/null)
|
|
12
|
+
|
|
13
|
+
result=$(echo "$last" | python3 -c "
|
|
14
|
+
import sys, json
|
|
15
|
+
from datetime import datetime, timezone
|
|
16
|
+
|
|
17
|
+
r = json.load(sys.stdin)
|
|
18
|
+
q5h = int(r['q5h'] * 100)
|
|
19
|
+
q7d = int(r.get('q7d', 0) * 100)
|
|
20
|
+
overage = r.get('qoverage', '')
|
|
21
|
+
ts = r.get('ts', '')
|
|
22
|
+
q5h_reset = r.get('q5h_reset', 0)
|
|
23
|
+
q7d_reset = r.get('q7d_reset', 0)
|
|
24
|
+
|
|
25
|
+
now = datetime.fromisoformat(ts.replace('Z', '+00:00'))
|
|
26
|
+
|
|
27
|
+
# Q5h: 5-hour window, rate = pct / minutes elapsed since window start
|
|
28
|
+
rate5 = ''
|
|
29
|
+
if q5h_reset > 0:
|
|
30
|
+
window_start = datetime.fromtimestamp(q5h_reset, tz=timezone.utc) - __import__('datetime').timedelta(hours=5)
|
|
31
|
+
elapsed_min = (now - window_start).total_seconds() / 60
|
|
32
|
+
if elapsed_min > 1 and q5h > 0:
|
|
33
|
+
rate5 = '{:+.1f}'.format(q5h / elapsed_min)
|
|
34
|
+
|
|
35
|
+
# Q7d: 7-day window
|
|
36
|
+
rate7 = ''
|
|
37
|
+
if q7d_reset > 0:
|
|
38
|
+
window_start_7d = datetime.fromtimestamp(q7d_reset, tz=timezone.utc) - __import__('datetime').timedelta(days=7)
|
|
39
|
+
elapsed_min_7d = (now - window_start_7d).total_seconds() / 60
|
|
40
|
+
if elapsed_min_7d > 1 and q7d > 0:
|
|
41
|
+
rate7 = '{:+.1f}'.format(q7d / (elapsed_min_7d / 60))
|
|
42
|
+
|
|
43
|
+
label = 'Q5h: {}%'.format(q5h)
|
|
44
|
+
if rate5:
|
|
45
|
+
label += ' ({}%/m)'.format(rate5)
|
|
46
|
+
label += ' | Q7d: {}%'.format(q7d)
|
|
47
|
+
if rate7:
|
|
48
|
+
label += ' ({}%/hr)'.format(rate7)
|
|
49
|
+
if overage == 'active':
|
|
50
|
+
label += ' | OVERAGE'
|
|
51
|
+
|
|
52
|
+
# Add TTL tier from quota-status.json (written by interceptor)
|
|
53
|
+
import os, pathlib
|
|
54
|
+
qs_path = pathlib.Path.home() / '.claude' / 'quota-status.json'
|
|
55
|
+
try:
|
|
56
|
+
qs = json.load(open(qs_path))
|
|
57
|
+
ttl = qs.get('cache', {}).get('ttl_tier', '')
|
|
58
|
+
hit = qs.get('cache', {}).get('hit_rate', '')
|
|
59
|
+
if ttl:
|
|
60
|
+
if ttl == '5m':
|
|
61
|
+
label += ' | \033[31mTTL:5m\033[0m' # red
|
|
62
|
+
else:
|
|
63
|
+
label += ' | TTL:' + ttl
|
|
64
|
+
if hit and hit != 'N/A':
|
|
65
|
+
label += ' ' + hit + '%'
|
|
66
|
+
peak = qs.get('peak_hour', False)
|
|
67
|
+
if peak:
|
|
68
|
+
label += ' | \033[33mPEAK\033[0m' # yellow
|
|
69
|
+
except:
|
|
70
|
+
pass
|
|
71
|
+
|
|
72
|
+
print(label)
|
|
73
|
+
" 2>/dev/null)
|
|
74
|
+
|
|
75
|
+
[ -n "$result" ] && echo "$result"
|
|
76
|
+
fi
|