bloat-report 1.0.0
- package/LICENSE +21 -0
- package/README.md +199 -0
- package/dist/cli.js +1024 -0
- package/package.json +32 -0
package/LICENSE
ADDED
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2026 Brendan O'Neill
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
package/README.md
ADDED
@@ -0,0 +1,199 @@
+# Bloat Report
+
+Your coding agent gets dumber the longer a session runs. Bloat Report is a local, read-only CLI that scans your agent's session transcripts, finds where each one crossed the line, and pairs every finding with the small change that fixes it.
+
+```bash
+npx bloat-report conversations report bloat
+```
+
+## The dumb zone
+
+Past roughly **100k tokens** of context, models get noticeably worse — reasoning slips, instructions get dropped, the agent forgets what you told it ten turns ago. That's the **dumb zone**, and once a session is in it, every new prompt is answered by a worse version of the model.
+
+> **On the name and the idea.** The "dumb zone" was coined by **Dex Horthy**, who put the degradation point at ~40% of the context window. [**Matt Pocock** furthered it](https://finance.biggo.com/news/e7209c094224b09c), pinning the line at ~100k tokens — the figure Bloat Report uses directly.
+
+Catching this is the tool's main job: it flags the sessions that overstayed — where **2+ genuine prompts** were sent into a context already past 100k tokens. One prompt over the line is noise; staying there means the session should have been cleared, compacted, or split before it got dumb. The fix is a `/clear` or `/compact` at the task boundary, or a fresh conversation.
+
+## Context bloat — the same growth, seen from the cost side
+
+The dumb zone is a **quality** problem. Context bloat is the **cost** problem hiding underneath it — the same growing context, measured in dollars instead of degraded reasoning. The two are related but distinct: a session can be expensive long before it's dumb, and watching the cost is how you catch the drift early.
+
+Here's the mechanism. Every assistant turn re-reads the whole conversation so far, so when a session wanders across unrelated tasks you pay full freight, turn after turn, to carry context you no longer need. Bloat Report prices each token class at its own rate and surfaces the *recoverable* slice — the re-read cost a `/clear` or `/compact` would have saved — so you can see the financial impact of bloat and adopt a few simple habits that keep sessions lean. That's the **"small changes, big savings"** pitch: small, boring practices (clear at task boundaries, quiet flags on noisy commands, ranged reads, drop unused MCP servers) that compound into real savings.
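The re-read mechanism described in the README can be made concrete with a toy model. The sketch below is not the tool's implementation: it assumes every turn adds the same number of tokens and that all earlier context is billed as a cached read. The turn counts, the tokens per turn, and the `rereadCost` helper are illustrative assumptions; the rate is the cached-read price that the pricing table in `dist/cli.js` lists for `claude-sonnet-4-6`.

```javascript
// Toy model (assumption): each turn appends `tokensPerTurn` tokens, and every
// turn re-reads all earlier context at the cached-read rate.
const CACHE_READ_USD_PER_MTOK = 0.3; // claude-sonnet-4-6 cacheRead rate

function rereadCost(turns, tokensPerTurn) {
  let contextTokens = 0;
  let cost = 0;
  for (let i = 0; i < turns; i++) {
    cost += (contextTokens * CACHE_READ_USD_PER_MTOK) / 1e6; // pay to re-read history
    contextTokens += tokensPerTurn; // history grows by this turn's tokens
  }
  return cost;
}

// One 40-turn session versus the same work split into two 20-turn sessions.
const oneLongSession = rereadCost(40, 5000);
const splitInTwo = 2 * rereadCost(20, 5000);

console.log(oneLongSession.toFixed(3)); // 1.170
console.log(splitInTwo.toFixed(3)); // 0.570
```

Under these assumptions the re-read cost grows roughly with the square of the turn count, which is why a single clear at a task boundary removes about half of it here.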
+
+> **Estimates, not a bill.** These numbers come from what your machine recorded locally. They are directional estimates, not official provider billing, and can diverge from it.
+
+This bloat analysis builds on **[Tokenoptics](https://tokenoptics.dev)** — the same vocabulary and thresholds as the companion app, brought to the terminal where it can read the transcripts on your disk directly. For a richer look at the financial side — trends over time, breakdowns, and visualisations — head to **[tokenoptics.dev](https://tokenoptics.dev)**.
+
+## Principles
+
+These are non-negotiable (see [CLAUDE.md](CLAUDE.md) for the full set):
+
+- **Local-only, read-only. Nothing leaves your machine.** No network, no telemetry, no uploads. Transcripts are opened read-only. Privacy is the product.
+- **No AI in the detection path.** Everything is deterministic parsing, counting, and sizing. (You *may* choose to export a flagged conversation to an LLM to learn more — that's your call, outside the tool.)
+- **Never overstate savings.** Each token class is priced at its own rate (cached reads are far cheaper than fresh input). No token is counted under two findings. A session's measured total is the ceiling.
+- **Provider-agnostic by design.** Adapters normalise each provider's transcripts into one shared model; detectors never know which provider produced the data. **Claude Code is the first adapter; Codex is next.**
+
+## Install
+
+Requires **Node.js 18+**.
+
+```bash
+npm install -g bloat-report
+```
+
+Or run without installing:
+
+```bash
+npx bloat-report conversations list
+```
+
+## Example output
+
+```
+$ bloat-report conversations report bloat
+
+Dumb Zone — context past 100k tokens, where the model gets noticeably worse.
+3/14 conversations kept going past 100k tokens for 2+ prompts:
+a1b2c3d4 5 prompts in zone peak 187k Refactor auth middleware to use JWT
+e5f6a7b8 3 prompts in zone peak 142k Add dark mode support to dashboard
+c9d0e1f2 2 prompts in zone peak 108k Debug intermittent test failures in CI
+Fix: /clear or /compact at the task boundary, or start a fresh conversation, before context runs past the line.
+
+Bloat (secondary) — recoverable cost from carrying stale context past an early-session baseline.
+Scanned 14 conversations: 4/14 climbing or worse · 1 heavy drift · $0.43 recoverable
+a1b2c3d4 Heavy cost $0.31 bloat $0.21 ramp 8.4× hit 91% dumb zone yes Refactor auth middleware to use JWT
+e5f6a7b8 Climbing cost $0.18 bloat $0.12 ramp 4.1× hit 87% dumb zone yes Add dark mode support to dashboard
+c9d0e1f2 Climbing cost $0.14 bloat $0.07 ramp 3.8× hit 84% dumb zone yes Debug intermittent test failures in CI
+f3a4b5c6 Climbing cost $0.09 bloat $0.03 ramp 3.2× hit 79% dumb zone no Update README and contributing guide
+
+To dig deeper, export flagged conversations and upload to an LLM:
+conversations export (saves export-<date>.md)
+conversations export <id> (one specific conversation)
+```
+
+## Where it reads from
+
+The Claude Code adapter reads transcripts from `~/.claude/projects/<project>/<session>.jsonl`. Set `CLAUDE_CONFIG_DIR` to point at a different root (still read-only). If no transcripts are found, the adapter simply reports nothing to scan.
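A minimal sketch of how the lookup described in the README might resolve its root directory. This is a guess at the behaviour, not code from the package: `transcriptProjectsRoot` is a hypothetical helper, and it assumes that `CLAUDE_CONFIG_DIR` replaces the default `~/.claude` root while transcripts stay under `projects/` inside it.

```javascript
// Hypothetical helper: picks the directory to scan for Claude Code transcripts.
// Assumption: CLAUDE_CONFIG_DIR stands in for ~/.claude, and session files
// still live under <root>/projects/<project>/<session>.jsonl.
function transcriptProjectsRoot(env, homeDir) {
  const root = env.CLAUDE_CONFIG_DIR || `${homeDir}/.claude`;
  return `${root}/projects`;
}

console.log(transcriptProjectsRoot({}, "/home/dev"));
// /home/dev/.claude/projects
console.log(transcriptProjectsRoot({ CLAUDE_CONFIG_DIR: "/tmp/claude-alt" }, "/home/dev"));
// /tmp/claude-alt/projects
```

In a real script the two arguments would typically be `process.env` and `os.homedir()`; they are passed in here so the function stays pure and easy to test.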
+
+## Commands
+
+All commands hang off the `conversations` group (the singular `conversation` also works).
+
+Global flags (work on any command):
+
+- `--json` — machine-readable JSON instead of the plain-English report
+- `--verbose` — include per-finding detail
+
+### `conversations list`
+
+List discovered conversations, most recent first, with token totals and estimated cost.
+
+```bash
+bloat-report conversations list
+bloat-report conversations list -n 50 # show more
+bloat-report conversations list -p claude # one provider
+```
+
+- `-n, --limit <count>` — max conversations to show (default 30)
+- `-p, --provider <name>` — limit to one provider (e.g. `claude`)
+
+### `conversations detail <id>`
+
+Show one conversation in detail — model, cwd, time span, token breakdown, and the signals that conversation carries. Accepts an id prefix.
+
+```bash
+bloat-report conversations detail a1b2c3d4
+bloat-report conversations detail a1b2c3d4 --verbose # per-message timeline
+```
+
+### `conversations report bloat`
+
+The main event. Scans recent conversations and leads with the **Dumb Zone** roundup — who kept working past the ~100k line — then, as secondary detail, a **bloat table** of recoverable cost, biggest opportunity first.
+
+```bash
+bloat-report conversations report bloat
+bloat-report conversations report bloat -a # show every scanned convo, not just flagged
+bloat-report conversations report bloat --verbose # per-finding detail
+bloat-report conversations report bloat --json # structured output
+```
+
+- `-n, --limit <count>` — max conversations to scan (default 30)
+- `-p, --provider <name>` — limit to one provider
+- `-a, --all` — list every scanned conversation, not just the Climbing/Heavy ones
+
+Conversations with no token data are skipped rather than reported as a hollow zero — the report tells you when and why a detector couldn't run (graceful degradation).
+
+### `conversations export [id...]`
+
+Export conversations as markdown ready to paste into an LLM chatbot (Claude.ai, ChatGPT, …) so you can ask *why* the bloat happened and how to avoid it next time. The export bundles Bloat Report's findings at the top, then the conversation with tool calls and result sizes summarised (thinking blocks and raw tool output are stripped to keep it lean).
+
+```bash
+bloat-report conversations export # exports what the bloat report flags
+bloat-report conversations export a1b2c3d4 e5f6 # specific conversations
+bloat-report conversations export --all # include healthy ones too
+bloat-report conversations export --print # to terminal instead of a file
+bloat-report conversations export -o my-export.md # custom filename
+```
+
+With no ids, it exports exactly the set the bloat report flags (Climbing/Heavy). By default it saves to `bloat-report-export-<date>.md`.
+
+## What it measures
+
+### The Dumb Zone (primary)
+
+The main detector. It flags conversations that kept going past the ~100k line — specifically, sessions where **2+ genuine prompts** were sent into a context of 100k+ tokens — and tells you where a `/clear`, `/compact`, or fresh conversation at the task boundary would have kept the session sharp. (See [The dumb zone](#the-dumb-zone) above for the concept and its origins.)
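The flagging rule in this paragraph reduces to a count against two constants. The sketch below is a simplification rather than the bundled detector: it assumes the input is already one context size per genuine prompt, whereas the package derives that size from the assistant turn that answers each prompt. The `overstayed` helper and the sample numbers are illustrative.

```javascript
const DUMB_ZONE_TOKENS = 100_000; // the ~100k line
const MIN_PROMPTS_IN_ZONE = 2; // one prompt over the line is treated as noise

// Simplified input (assumption): one entry per genuine prompt, holding the
// context size the model saw when it answered that prompt.
function overstayed(promptContexts) {
  const promptsInZone = promptContexts.filter((t) => t >= DUMB_ZONE_TOKENS).length;
  return { promptsInZone, flagged: promptsInZone >= MIN_PROMPTS_IN_ZONE };
}

// One prompt past the line: not flagged.
console.log(overstayed([20_000, 60_000, 104_000]));
// Three prompts past the line: flagged.
console.log(overstayed([30_000, 101_000, 142_000, 187_000]));
```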
+
+### Context bloat (secondary)
+
+Alongside the Dumb Zone, the report surfaces *recoverable cost*. Bloat is **not** "total cost is high." It's the *recoverable* slice — the cache-read cost **above an early-session baseline** — and it's only counted when a drift signal fires (evidence the session wandered across topics and a `/clear` or `/compact` would have saved money). The report classifies each conversation:
+
+| Health | Meaning |
+| ------------- | ------------------------------------------------------------------- |
+| **No bloat** | Cost per turn is steady; context is being used, not wasted. |
+| **Climbing** | Late turns cost ~3×+ the early baseline — drift is starting. |
+| **Heavy** | Late turns cost ~6×+ the baseline, or a long session with a low cache hit ratio. |
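The ramp part of this table can be sketched in a few lines. The thresholds (3 and 6) and the windows (median of the first three turns, mean of the last three) follow the constants and slices visible in the bundled `dist/cli.js`; the `classifyRamp` helper and the sample costs are illustrative, and the second Heavy condition (a long session with a low cache hit ratio) is left out to keep the example short.

```javascript
const RAMP_WARN_RATIO = 3; // "Climbing" threshold
const RAMP_CRITICAL_RATIO = 6; // "Heavy" threshold

const median = (xs) => {
  const s = [...xs].sort((a, b) => a - b);
  const mid = Math.floor(s.length / 2);
  return s.length % 2 === 0 ? (s[mid - 1] + s[mid]) / 2 : s[mid];
};
const mean = (xs) => xs.reduce((a, b) => a + b, 0) / xs.length;

// turnCosts: estimated USD cost of each assistant turn, in order.
// Assumes at least one turn is present.
function classifyRamp(turnCosts) {
  const baseline = median(turnCosts.slice(0, 3)); // early-session baseline
  const tail = mean(turnCosts.slice(-3)); // what late turns cost
  const ramp = baseline > 0 ? tail / baseline : 0;
  const health =
    ramp >= RAMP_CRITICAL_RATIO ? "Heavy" : ramp >= RAMP_WARN_RATIO ? "Climbing" : "No bloat";
  return { ramp, health };
}

console.log(classifyRamp([0.01, 0.01, 0.012, 0.011, 0.012]).health); // No bloat
console.log(classifyRamp([0.01, 0.01, 0.01, 0.03, 0.04, 0.05]).health); // Climbing
console.log(classifyRamp([0.01, 0.01, 0.01, 0.06, 0.08, 0.1]).health); // Heavy
```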
+
+Each finding comes with the small change that fixes it — e.g. `/compact` at a task boundary, splitting a session that's drifted into a new topic, quiet flags on noisy commands, ranged reads on big files, or disconnecting MCP servers a session never calls.
+
+> **Where this comes from.** The bloat analysis — the early-session baseline, the ramp ratios, and the Climbing/Heavy thresholds — is shared with **[Tokenoptics](https://tokenoptics.dev)**, the companion app this CLI builds on. For trends over time and richer breakdowns of the same numbers, see [tokenoptics.dev](https://tokenoptics.dev).
+
+## How pricing works
+
+Each message is priced at **its own model's rate** (a session can mix models) and summed — never recomputed from aggregated token totals. Cached reads, fresh input, output, and the two cache-write TTLs (5-minute and 1-hour) are each priced separately. Unknown model ids fall back to a sane mid rate rather than crashing, and an unrecognised newer minor version is priced at its newest known sibling rather than collapsing onto a pricier legacy entry.
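A short sketch of why the order of operations matters when a session mixes models. The rates are copied from the pricing table in the bundled `dist/cli.js` (USD per million tokens); the two-message session, the `messageCost` helper, and the omission of the cache-write classes are simplifying assumptions made for the example.

```javascript
// Rates in USD per million tokens, copied from the package's pricing table.
const RATES = {
  "claude-opus-4-8": { input: 5, output: 25, cacheRead: 0.5 },
  "claude-haiku-4-5": { input: 1, output: 5, cacheRead: 0.1 },
};

// Price one message at its own model's rate, one token class at a time.
// Cache-write classes are omitted here to keep the sketch short.
function messageCost(msg) {
  const r = RATES[msg.model];
  return (msg.input * r.input + msg.output * r.output + msg.cacheRead * r.cacheRead) / 1e6;
}

// Illustrative session: the same token counts answered by two different models.
const session = [
  { model: "claude-opus-4-8", input: 2000, output: 1000, cacheRead: 50000 },
  { model: "claude-haiku-4-5", input: 2000, output: 1000, cacheRead: 50000 },
];

// Per-message pricing, then sum.
const perMessage = session.reduce((sum, m) => sum + messageCost(m), 0);

// Mistaken alternative: add up the tokens first, then price them all as Opus.
const aggregated = messageCost({
  model: "claude-opus-4-8",
  input: session.reduce((n, m) => n + m.input, 0),
  output: session.reduce((n, m) => n + m.output, 0),
  cacheRead: session.reduce((n, m) => n + m.cacheRead, 0),
});

console.log(perMessage.toFixed(3)); // 0.072
console.log(aggregated.toFixed(3)); // 0.120
```

With these numbers, pricing the aggregated totals at a single rate overstates the session by about two thirds, which is the kind of error per-message pricing avoids.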
+
+## Architecture
+
+```
+adapters/   provider-specific: read transcripts, normalise to the shared model,
+            supply the exact fix wording (Claude today, Codex next)
+core/       the shared model (sessions → messages → blocks, token usage),
+            adapter interface, and the provider registry
+analyze/    provider-agnostic detectors: cache/bloat analysis, dumb zone
+commands/   the Commander.js CLI surface
+```
+
+The design rule: **adapters normalise, detectors stay provider-agnostic.** Each adapter declares which signals it supplies (token usage, cache split, tool-output sizes, …); a detector runs only where its required signals exist. Provider-specific fixes (slash commands, flags) come from the adapter, never hard-coded into shared code. New providers plug in behind the shared interface without touching detectors or commands.
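The signal-gating rule can be illustrated with a small sketch. Every name in it is hypothetical (the adapter ids, the signal strings, the detector names, and `runnableDetectors`); it shows the shape of the rule, not the package's actual interfaces.

```javascript
// All names below are hypothetical; they illustrate the design rule only.
const adapters = [
  { id: "claude", signals: new Set(["tokenUsage", "cacheSplit", "toolOutputSizes"]) },
  { id: "minimal", signals: new Set(["tokenUsage"]) },
];

const detectors = [
  { name: "dumb-zone", requires: ["tokenUsage"] },
  { name: "context-bloat", requires: ["tokenUsage", "cacheSplit"] },
];

// A detector runs for an adapter only if every signal it needs is supplied.
function runnableDetectors(adapter) {
  return detectors
    .filter((d) => d.requires.every((s) => adapter.signals.has(s)))
    .map((d) => d.name);
}

for (const a of adapters) {
  console.log(a.id, runnableDetectors(a));
}
```

Because the detectors only look at declared signals, adding a provider in this sketch means adding one entry to `adapters`; the detector list does not change.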
+
+## Development
+
+```bash
+git clone <this-repo>
+cd bloat-report
+npm install
+
+npm run dev -- conversations list # run from source (tsx)
+npm run build # bundle to dist/cli.js
+npm run typecheck # tsc --noEmit
+npm test # run analysis tests
+```
+
+To publish a new version:
+
+```bash
+npm version patch # or minor / major
+npm publish
+```
+
+## License
+
+MIT
package/dist/cli.js
ADDED
@@ -0,0 +1,1024 @@
+#!/usr/bin/env node
+
+// src/cli.ts
+import { Command } from "commander";
+
+// src/commands/conversations.ts
+import fs from "fs";
+
+// src/core/registry.ts
+var adapters = /* @__PURE__ */ new Map();
+function registerAdapter(adapter) {
+  adapters.set(adapter.id, adapter);
+}
+function resolveAdapters(provider) {
+  if (provider) {
+    const a = adapters.get(provider);
+    return a ? [a] : [];
+  }
+  return [...adapters.values()];
+}
+
+// src/adapters/claude/pricing.ts
+var PRICING = {
+  // Fable 5 — flagship tier above Opus. cacheWrite (5m) is the standard 1.25x
+  // input ($12.50, not in the public table); cacheWrite1h is 2x input ($20),
+  // and cacheRead is the 90%-off-input read ($1.00).
+  "claude-fable-5": { input: 10, output: 50, cacheRead: 1, cacheWrite: 12.5, cacheWrite1h: 20 },
+  // Opus 4.8 — 1M context at standard pricing (no long-context premium); same
+  // rates as 4.7/4.6/4.5. Without this row the dash-walk falls through to the
+  // legacy "claude-opus-4" entry (15/75), overstating cost ~3x.
+  "claude-opus-4-8": { input: 5, output: 25, cacheRead: 0.5, cacheWrite: 6.25, cacheWrite1h: 10 },
+  "claude-opus-4-7": { input: 5, output: 25, cacheRead: 0.5, cacheWrite: 6.25, cacheWrite1h: 10 },
+  "claude-opus-4-6": { input: 5, output: 25, cacheRead: 0.5, cacheWrite: 6.25, cacheWrite1h: 10 },
+  "claude-opus-4-5": { input: 5, output: 25, cacheRead: 0.5, cacheWrite: 6.25, cacheWrite1h: 10 },
+  "claude-opus-4-1": { input: 15, output: 75, cacheRead: 1.5, cacheWrite: 18.75, cacheWrite1h: 30 },
+  "claude-opus-4": { input: 15, output: 75, cacheRead: 1.5, cacheWrite: 18.75, cacheWrite1h: 30 },
+  "claude-sonnet-4-6": { input: 3, output: 15, cacheRead: 0.3, cacheWrite: 3.75, cacheWrite1h: 6 },
+  "claude-sonnet-4-5": { input: 3, output: 15, cacheRead: 0.3, cacheWrite: 3.75, cacheWrite1h: 6 },
+  "claude-sonnet-4": { input: 3, output: 15, cacheRead: 0.3, cacheWrite: 3.75, cacheWrite1h: 6 },
+  "claude-3-7-sonnet": { input: 3, output: 15, cacheRead: 0.3, cacheWrite: 3.75, cacheWrite1h: 6 },
+  "claude-3-5-sonnet": { input: 3, output: 15, cacheRead: 0.3, cacheWrite: 3.75, cacheWrite1h: 6 },
+  "claude-haiku-4-5": { input: 1, output: 5, cacheRead: 0.1, cacheWrite: 1.25, cacheWrite1h: 2 },
+  "claude-3-5-haiku": { input: 0.8, output: 4, cacheRead: 0.08, cacheWrite: 1, cacheWrite1h: 1.6 }
+};
+var FALLBACK_PRICING = PRICING["claude-sonnet-4-6"];
+function newestSiblingPricing(candidate) {
+  const m = candidate.match(/^(.*)-\d+$/);
+  if (!m) return void 0;
+  const prefix = m[1] + "-";
+  let bestMinor = -1;
+  let best;
+  for (const key of Object.keys(PRICING)) {
+    if (!key.startsWith(prefix)) continue;
+    const rest = key.slice(prefix.length);
+    if (!/^\d+$/.test(rest)) continue;
+    const minor = Number(rest);
+    if (minor > bestMinor) bestMinor = minor, best = PRICING[key];
+  }
+  return best;
+}
+function pricingForModel(model) {
+  if (!model) return FALLBACK_PRICING;
+  if (PRICING[model]) return PRICING[model];
+  const stripped = model.replace(/\[.*\]$/, "");
+  const parts = stripped.split("-");
+  while (parts.length > 1) {
+    const candidate = parts.join("-");
+    if (PRICING[candidate]) return PRICING[candidate];
+    const sibling = newestSiblingPricing(candidate);
+    if (sibling) return sibling;
+    parts.pop();
+  }
+  return FALLBACK_PRICING;
+}
+function costForUsage(model, usage) {
+  const p = pricingForModel(model);
+  return (usage.inputTokens * p.input + usage.outputTokens * p.output + usage.cacheReadTokens * p.cacheRead + usage.cacheWrite5mTokens * p.cacheWrite + usage.cacheWrite1hTokens * p.cacheWrite1h) / 1e6;
+}
+function formatUSD(amount) {
+  if (amount === 0) return "$0.00";
+  if (amount < 0.01) return `$${amount.toFixed(4)}`;
+  if (amount < 1) return `$${amount.toFixed(3)}`;
+  return `$${amount.toFixed(2)}`;
+}
+
+// src/analyze/cache.ts
+var HEALTH_MIN_TURNS = 5;
+var LONG_SESSION_TURNS = 20;
+var HEALTHY_CACHE_HIT_RATIO = 0.7;
+var RAMP_WARN_RATIO = 3;
+var RAMP_CRITICAL_RATIO = 6;
+function dominantBucket(inputCost, outputCost, cacheReadCost, cacheWrite5mCost, cacheWrite1hCost) {
+  let best = "output";
+  let bestVal = outputCost;
+  const candidates = [
+    ["input", inputCost],
+    ["cache_read", cacheReadCost],
+    ["cache_write_5m", cacheWrite5mCost],
+    ["cache_write_1h", cacheWrite1hCost]
+  ];
+  for (const [bucket, val] of candidates) {
+    if (val > bestVal) {
+      best = bucket;
+      bestVal = val;
+    }
+  }
+  return best;
+}
+function buildTrajectory(messages, pricing) {
+  const out = [];
+  let cumulative = 0;
+  let turnIndex = 0;
+  for (const m of messages) {
+    if (m.role !== "assistant" || !m.usage) continue;
+    turnIndex += 1;
+    const p = pricing.pricingForModel(m.model);
+    const inputCost = m.usage.inputTokens * p.input / 1e6;
+    const outputCost = m.usage.outputTokens * p.output / 1e6;
+    const cacheReadCost = m.usage.cacheReadTokens * p.cacheRead / 1e6;
+    const cacheWrite5mCost = m.usage.cacheWrite5mTokens * p.cacheWrite / 1e6;
+    const cacheWrite1hCost = m.usage.cacheWrite1hTokens * p.cacheWrite1h / 1e6;
+    const cost = inputCost + outputCost + cacheReadCost + cacheWrite5mCost + cacheWrite1hCost;
+    cumulative += cost;
+    out.push({
+      turnIndex,
+      model: m.model ?? null,
+      cost,
+      cumulativeCost: cumulative,
+      inputTokens: m.usage.inputTokens,
+      outputTokens: m.usage.outputTokens,
+      cacheReadTokens: m.usage.cacheReadTokens,
+      cacheWrite5mTokens: m.usage.cacheWrite5mTokens,
+      cacheWrite1hTokens: m.usage.cacheWrite1hTokens,
+      dominantBucket: dominantBucket(
+        inputCost,
+        outputCost,
+        cacheReadCost,
+        cacheWrite5mCost,
+        cacheWrite1hCost
+      )
+    });
+  }
+  return out;
+}
+function median(values) {
+  if (values.length === 0) return 0;
+  const sorted = [...values].sort((a, b) => a - b);
+  const mid = Math.floor(sorted.length / 2);
+  return sorted.length % 2 === 0 ? ((sorted[mid - 1] ?? 0) + (sorted[mid] ?? 0)) / 2 : sorted[mid] ?? 0;
+}
+function mean(values) {
+  if (values.length === 0) return 0;
+  return values.reduce((a, b) => a + b, 0) / values.length;
+}
+function buildRecommendations(report, pricing) {
+  const recs = [];
+  const bloatSuffix = report.aboveBaselineContextCost > 0 ? ` Likely recoverable: ${pricing.formatUSD(report.aboveBaselineContextCost)} (${(report.aboveBaselineContextShare * 100).toFixed(0)}% of session cost) \u2014 drop it with /clear or /compact at the topic boundary.` : "";
+  if (report.assistantTurnCount > LONG_SESSION_TURNS && report.cacheHitRatio < HEALTHY_CACHE_HIT_RATIO) {
+    recs.push({
+      severity: "critical",
+      title: "Long session with low cache hit ratio",
+      message: `Cache hit ratio is ${(report.cacheHitRatio * 100).toFixed(0)}% across ${report.assistantTurnCount} assistant turns. That means turns are paying full input price for context that should've been cached. Use /clear or /compact between unrelated tasks instead of letting one session drift across topics.${bloatSuffix}`
+    });
+  }
+  if (report.finalRampRatio >= RAMP_CRITICAL_RATIO) {
+    recs.push({
+      severity: "critical",
+      title: "Cost per turn climbed sharply",
+      message: `Late turns in this session cost about ${report.finalRampRatio.toFixed(1)}\xD7 as much as early turns. Most of the extra is cumulative cache_read on a growing context window. Split the session at the topic boundary or run /clear to drop history that isn't needed anymore.${bloatSuffix}`
+    });
+  } else if (report.finalRampRatio >= RAMP_WARN_RATIO) {
+    recs.push({
+      severity: "warn",
+      title: "Cost per turn climbing",
+      message: `Late turns cost about ${report.finalRampRatio.toFixed(1)}\xD7 the early-session baseline. Cache_read on a growing context is doing the work. Consider splitting at task boundaries when one session drifts into a new topic.${bloatSuffix}`
+    });
+  }
+  if (report.cacheReadTokens > 0 && report.cacheWrite5mTokens > report.cacheReadTokens) {
+    recs.push({
+      severity: "info",
+      title: "Cache is churning (5-minute TTL expiring)",
+      message: `5-minute cache writes (${report.cacheWrite5mTokens.toLocaleString()} tokens) exceed cache reads (${report.cacheReadTokens.toLocaleString()}). The session is rebuilding cache more often than reusing it \u2014 usually caused by long pauses between turns or by prompt prefixes that change shape between calls.`
+    });
+  }
+  return recs;
+}
+function computeCacheReport(messages, pricing) {
+  const trajectory = buildTrajectory(messages, pricing);
+  let totalCost = 0;
+  let inputTokens = 0;
+  let outputTokens = 0;
+  let cacheReadTokens = 0;
+  let cacheWrite5mTokens = 0;
+  let cacheWrite1hTokens = 0;
+  let cacheReadCost = 0;
+  for (const point of trajectory) {
+    totalCost += point.cost;
+    inputTokens += point.inputTokens;
+    outputTokens += point.outputTokens;
+    cacheReadTokens += point.cacheReadTokens;
+    cacheWrite5mTokens += point.cacheWrite5mTokens;
+    cacheWrite1hTokens += point.cacheWrite1hTokens;
+  }
+  for (const m of messages) {
+    if (m.role !== "assistant" || !m.usage) continue;
+    const p = pricing.pricingForModel(m.model);
+    cacheReadCost += m.usage.cacheReadTokens * p.cacheRead / 1e6;
+  }
+  const cacheHitDenom = inputTokens + cacheReadTokens;
+  const cacheHitRatio = cacheHitDenom > 0 ? cacheReadTokens / cacheHitDenom : 0;
+  const cacheReadCostShare = totalCost > 0 ? cacheReadCost / totalCost : 0;
+  const baselineSample = trajectory.slice(0, 3).map((p) => p.cost);
+  const baselineTurnCost = median(baselineSample);
+  const tailSample = trajectory.slice(-3).map((p) => p.cost);
+  const tailMeanCost = mean(tailSample);
+  const finalRampRatio = baselineTurnCost > 0 ? tailMeanCost / baselineTurnCost : 0;
+  const perTurnCacheReadCost = trajectory.map((p) => {
+    const pricingForTurn = pricing.pricingForModel(p.model ?? void 0);
+    return p.cacheReadTokens * pricingForTurn.cacheRead / 1e6;
+  });
+  const baselineCacheReadCost = median(perTurnCacheReadCost.slice(0, 3));
+  let aboveBaselineContextCost = 0;
+  for (const cost of perTurnCacheReadCost) {
+    aboveBaselineContextCost += Math.max(0, cost - baselineCacheReadCost);
+  }
+  const aboveBaselineContextShare = totalCost > 0 ? aboveBaselineContextCost / totalCost : 0;
+  let totalCostCheck = 0;
+  for (const m of messages) {
+    if (m.role !== "assistant" || !m.usage) continue;
+    totalCostCheck += pricing.costForUsage(m.model, m.usage);
+  }
+  const baseReport = {
+    assistantTurnCount: trajectory.length,
+    totalCost: totalCostCheck,
+    inputTokens,
+    outputTokens,
+    cacheReadTokens,
+    cacheWrite5mTokens,
+    cacheWrite1hTokens,
+    cacheHitRatio,
+    cacheReadCost,
+    cacheReadCostShare,
+    baselineCacheReadCost,
+    aboveBaselineContextCost,
+    aboveBaselineContextShare,
+    recoverableBloatCost: 0,
+    trajectory,
+    baselineTurnCost,
+    finalRampRatio
+  };
+  const recommendations = buildRecommendations(baseReport, pricing);
+  const driftDetected = recommendations.some(
+    (r) => r.severity === "critical" || r.severity === "warn"
+  );
+  return {
+    ...baseReport,
+    recoverableBloatCost: driftDetected ? aboveBaselineContextCost : 0,
+    recommendations
+  };
+}
+function cacheHealthFromReport(report) {
+  if (report.assistantTurnCount < HEALTH_MIN_TURNS) return null;
+  for (const rec of report.recommendations) {
+    if (rec.severity === "critical") return "poor";
+  }
+  for (const rec of report.recommendations) {
+    if (rec.severity === "warn") return "climbing";
+  }
+  return "good";
+}
+
+// src/analyze/dumbzone.ts
+var DUMB_ZONE_TOKENS = 1e5;
+var DUMB_ZONE_MIN_PROMPTS = 2;
+function inputContextTokens(m) {
+  const u = m.usage;
+  if (!u) return 0;
+  return u.inputTokens + u.cacheReadTokens + u.cacheWrite5mTokens + u.cacheWrite1hTokens;
+}
+function computeDumbZoneReport(messages) {
+  let promptsInZone = 0;
+  let totalPrompts = 0;
+  let peakContextTokens = 0;
+  let enteredAtTokens = null;
+  for (let i = 0; i < messages.length; i++) {
+    const m = messages[i];
+    if (m?.role !== "user" || !m.isGenuinePrompt) continue;
+    let ctx = null;
+    for (let j = i + 1; j < messages.length; j++) {
+      const next = messages[j];
+      if (next?.role === "user" && next.isGenuinePrompt) break;
+      if (next?.role === "assistant" && next.usage) {
+        ctx = inputContextTokens(next);
+        break;
+      }
+    }
+    if (ctx === null) continue;
+    totalPrompts += 1;
+    if (ctx > peakContextTokens) peakContextTokens = ctx;
+    if (ctx >= DUMB_ZONE_TOKENS) {
+      promptsInZone += 1;
+      if (enteredAtTokens === null) enteredAtTokens = ctx;
+    }
+  }
+  return {
+    promptsInZone,
+    totalPrompts,
+    inDumbZone: promptsInZone >= DUMB_ZONE_MIN_PROMPTS,
+    enteredAtTokens,
+    peakContextTokens
+  };
+}
+
+// src/commands/conversations.ts
+var claudePricing = { pricingForModel, costForUsage, formatUSD };
+var DEFAULT_LIMIT = 100;
+function bloatHealthLabel(health) {
+  switch (health) {
+    case null:
+      return "n/a";
+    case "good":
+      return "No bloat";
+    case "climbing":
+      return "Climbing";
+    case "poor":
+      return "Heavy";
+  }
+}
+function globalOpts(cmd) {
+  let root = cmd;
+  while (root.parent) root = root.parent;
+  return root.opts();
+}
+function renderExport(convo) {
+  const report = computeCacheReport(convo.messages, claudePricing);
+  const health = cacheHealthFromReport(report);
+  const dumbZone = computeDumbZoneReport(convo.messages);
+  const lines = [];
+  lines.push(`# Bloat Report Export \u2014 ${convo.title}`);
+  lines.push("");
+  lines.push(`**Session:** ${convo.sessionId}`);
+  lines.push(`**Date:** ${convo.startedAt.slice(0, 10)}`);
+  lines.push(`**Model:** ${convo.primaryModel}`);
+  lines.push(`**Total cost:** ${formatUSD(convo.totalCost)}`);
+  lines.push(`**Messages:** ${convo.messageCount} (${convo.userPromptCount} user prompts)`);
+  lines.push("");
+  lines.push("---");
+  lines.push("");
+  lines.push("## What Bloat Report found");
+  lines.push("");
+  if (dumbZone.inDumbZone) {
+    const tokK = Math.round(DUMB_ZONE_TOKENS / 1e3);
+    lines.push(
+      `**Dumb zone:** ${dumbZone.promptsInZone} prompt${dumbZone.promptsInZone === 1 ? "" : "s"} past ${tokK}k tokens (peak ${Math.round(dumbZone.peakContextTokens / 1e3)}k)`
+    );
+  } else {
+    lines.push("**Dumb zone:** clear \u2014 context stayed under the line");
+  }
+  lines.push(
+    `**Context bloat (secondary):** ${bloatHealthLabel(health)}` + (report.recoverableBloatCost > 0 ? ` \u2014 ~${formatUSD(report.recoverableBloatCost)} recoverable` : "")
+  );
+  lines.push(`**Cache hit ratio:** ${(report.cacheHitRatio * 100).toFixed(0)}%`);
+  lines.push(`**Context ramp:** ${report.finalRampRatio.toFixed(1)}\xD7 baseline`);
+  if (report.recommendations.length > 0) {
+    lines.push("");
+    lines.push("### Findings");
+    for (const rec of report.recommendations) {
+      lines.push(`- **[${rec.severity}] ${rec.title}:** ${rec.message}`);
+    }
+  }
+  lines.push("");
+  lines.push(
+    "> Please review this conversation. Explain why each finding above occurred, which specific exchanges caused it, and what the user could have done differently to avoid the wasted cost."
|
|
374
|
+
);
|
|
375
|
+
lines.push("");
|
|
376
|
+
lines.push("---");
|
|
377
|
+
lines.push("");
|
|
378
|
+
lines.push("## Conversation");
|
|
379
|
+
lines.push("");
|
|
380
|
+
for (const msg of convo.messages) {
|
|
381
|
+
lines.push(`### ${msg.role === "user" ? "User" : "Assistant"}`);
|
|
382
|
+
lines.push(`*${msg.timestamp.slice(0, 19).replace("T", " ")}*`);
|
|
383
|
+
if (msg.usage) {
|
|
384
|
+
lines.push(
|
|
385
|
+
`*tokens: in ${msg.usage.inputTokens.toLocaleString()} \xB7 out ${msg.usage.outputTokens.toLocaleString()} \xB7 cache-read ${msg.usage.cacheReadTokens.toLocaleString()}*`
|
|
386
|
+
);
|
|
387
|
+
}
|
|
388
|
+
lines.push("");
|
|
389
|
+
for (const block of msg.blocks) {
|
|
390
|
+
if (block.kind === "text") {
|
|
391
|
+
lines.push(block.text.trimEnd());
|
|
392
|
+
} else if (block.kind === "thinking") { // thinking blocks are left out of the export
|
|
393
|
+
} else if (block.kind === "tool_use") {
|
|
394
|
+
const inputSummary = JSON.stringify(block.input).slice(0, 120);
|
|
395
|
+
lines.push(`*[Tool call: **${block.name}** \u2014 \`${inputSummary}\`]*`);
|
|
396
|
+
} else if (block.kind === "tool_result") {
|
|
397
|
+
const label = block.toolName ? `${block.toolName} result` : "Tool result";
|
|
398
|
+
const size = block.charCount.toLocaleString();
|
|
399
|
+
const errFlag = block.isError ? " (error)" : "";
|
|
400
|
+
lines.push(`*[${label}${errFlag} \u2014 ${size} chars]*`);
|
|
401
|
+
}
|
|
402
|
+
}
|
|
403
|
+
lines.push("");
|
|
404
|
+
}
|
|
405
|
+
return lines.join("\n");
|
|
406
|
+
}
|
|
407
|
+
function totalTokens(c) {
|
|
408
|
+
return c.totalInputTokens + c.totalOutputTokens + c.totalCacheReadTokens + c.totalCacheWriteTokens;
|
|
409
|
+
}
|
|
410
|
+
function registerConversations(program2) {
|
|
411
|
+
const conversations = program2.command("conversations").alias("conversation").description("Inspect locally-recorded agent conversations");
|
|
412
|
+
conversations.command("list").description("List discovered conversations (most recent first)").option("-p, --provider <name>", "limit to one provider (e.g. claude, codex)").option("-n, --limit <count>", "max conversations to show", String(DEFAULT_LIMIT)).action(async (opts, cmd) => {
|
|
413
|
+
const g = globalOpts(cmd);
|
|
414
|
+
const limit = Number.parseInt(opts.limit, 10) || DEFAULT_LIMIT;
|
|
415
|
+
const adapters2 = resolveAdapters(opts.provider);
|
|
416
|
+
if (opts.provider && adapters2.length === 0) {
|
|
417
|
+
console.error(`Unknown provider: ${opts.provider}`);
|
|
418
|
+
process.exitCode = 1;
|
|
419
|
+
return;
|
|
420
|
+
}
|
|
421
|
+
const rows = [];
|
|
422
|
+
for (const adapter of adapters2) {
|
|
423
|
+
if (!await adapter.isAvailable()) continue;
|
|
424
|
+
rows.push(...await adapter.listConversations());
|
|
425
|
+
}
|
|
426
|
+
rows.sort((a, b) => b.startedAt.localeCompare(a.startedAt));
|
|
427
|
+
const shown = rows.slice(0, limit);
|
|
428
|
+
if (g.json) {
|
|
429
|
+
console.log(JSON.stringify(shown, (k, v) => v instanceof Set ? [...v] : v, 2));
|
|
430
|
+
return;
|
|
431
|
+
}
|
|
432
|
+
if (shown.length === 0) {
|
|
433
|
+
console.log("No conversations found.");
|
|
434
|
+
return;
|
|
435
|
+
}
|
|
436
|
+
for (const c of shown) {
|
|
437
|
+
const date = c.startedAt ? c.startedAt.slice(0, 10) : "??????????";
|
|
438
|
+
const tokens = totalTokens(c).toLocaleString();
|
|
439
|
+
const cost = formatUSD(c.totalCost).padStart(9);
|
|
440
|
+
console.log(
|
|
441
|
+
`${date} ${c.sessionId.slice(0, 8)} ${tokens.padStart(9)} tok ${cost} ${c.title}`
|
|
442
|
+
);
|
|
443
|
+
}
|
|
444
|
+
const grandTotal = shown.reduce((sum, c) => sum + c.totalCost, 0);
|
|
445
|
+
console.log(
|
|
446
|
+
`${" ".repeat(34)}${formatUSD(grandTotal).padStart(9)} ${shown.length} conversation${shown.length === 1 ? "" : "s"} total`
|
|
447
|
+
);
|
|
448
|
+
if (rows.length > shown.length) {
|
|
449
|
+
console.log(`
|
|
450
|
+
\u2026and ${rows.length - shown.length} more (raise --limit to see them).`);
|
|
451
|
+
}
|
|
452
|
+
});
|
|
453
|
+
conversations.command("detail").description("Show one conversation in detail").argument("[conversationId]", "id (or id prefix) of the conversation to inspect").option("-p, --provider <name>", "provider to load from (e.g. claude, codex)").action(async (conversationId, opts, cmd) => {
|
|
454
|
+
const g = globalOpts(cmd);
|
|
455
|
+
if (!conversationId) {
|
|
456
|
+
console.error("Pass a conversation id \u2014 see `conversations list`.");
|
|
457
|
+
process.exitCode = 1;
|
|
458
|
+
return;
|
|
459
|
+
}
|
|
460
|
+
const adapters2 = resolveAdapters(opts.provider);
|
|
461
|
+
let convo = null;
|
|
462
|
+
for (const adapter of adapters2) {
|
|
463
|
+
if (!await adapter.isAvailable()) continue;
|
|
464
|
+
const summaries = await adapter.listConversations();
|
|
465
|
+
const match = summaries.find(
|
|
466
|
+
(s) => s.sessionId === conversationId || s.sessionId.startsWith(conversationId)
|
|
467
|
+
);
|
|
468
|
+
if (match) {
|
|
469
|
+
convo = await adapter.loadConversation(match.sessionId);
|
|
470
|
+
break;
|
|
471
|
+
}
|
|
472
|
+
}
|
|
473
|
+
if (!convo) {
|
|
474
|
+
console.error(`No conversation matching "${conversationId}".`);
|
|
475
|
+
process.exitCode = 1;
|
|
476
|
+
return;
|
|
477
|
+
}
|
|
478
|
+
if (g.json) {
|
|
479
|
+
console.log(JSON.stringify(convo, (k, v) => v instanceof Set ? [...v] : v, 2));
|
|
480
|
+
return;
|
|
481
|
+
}
|
|
482
|
+
console.log(convo.title);
|
|
483
|
+
console.log(` session ${convo.sessionId}`);
|
|
484
|
+
console.log(` model ${convo.primaryModel}`);
|
|
485
|
+
console.log(` cwd ${convo.cwd}`);
|
|
486
|
+
console.log(` span ${convo.startedAt.slice(0, 19)} \u2192 ${convo.endedAt.slice(0, 19)}`);
|
|
487
|
+
console.log(` messages ${convo.messageCount} (${convo.userPromptCount} user prompts)`);
|
|
488
|
+
console.log(
|
|
489
|
+
` tokens in ${convo.totalInputTokens.toLocaleString()} \xB7 out ${convo.totalOutputTokens.toLocaleString()} \xB7 cache-read ${convo.totalCacheReadTokens.toLocaleString()} \xB7 cache-write ${convo.totalCacheWriteTokens.toLocaleString()}`
|
|
490
|
+
);
|
|
491
|
+
console.log(` signals ${[...convo.capabilities].join(", ")}`);
|
|
492
|
+
if (g.verbose) {
|
|
493
|
+
console.log("");
|
|
494
|
+
for (const m of convo.messages) {
|
|
495
|
+
const kinds = m.blocks.map((b) => b.kind).join(",");
|
|
496
|
+
console.log(` ${m.timestamp.slice(11, 19)} ${m.role.padEnd(9)} ${kinds}`);
|
|
497
|
+
}
|
|
498
|
+
}
|
|
499
|
+
});
|
|
500
|
+
conversations.command("report").description("Find context-bloat patterns and the small change that fixes each").option("-p, --provider <name>", "limit to one provider (e.g. claude, codex)").option("-n, --limit <count>", "max conversations to scan", String(DEFAULT_LIMIT)).option("-a, --all", "show every scanned conversation, not just Climbing/Heavy ones").action(async (opts, cmd) => {
|
|
501
|
+
const g = globalOpts(cmd);
|
|
502
|
+
const limit = Number.parseInt(opts.limit, 10) || DEFAULT_LIMIT;
|
|
503
|
+
const adapters2 = resolveAdapters(opts.provider);
|
|
504
|
+
if (opts.provider && adapters2.length === 0) {
|
|
505
|
+
console.error(`Unknown provider: ${opts.provider}`);
|
|
506
|
+
process.exitCode = 1;
|
|
507
|
+
return;
|
|
508
|
+
}
|
|
509
|
+
const summaries = [];
|
|
510
|
+
for (const adapter of adapters2) {
|
|
511
|
+
if (!await adapter.isAvailable()) continue;
|
|
512
|
+
summaries.push(...await adapter.listConversations());
|
|
513
|
+
}
|
|
514
|
+
summaries.sort((a, b) => b.startedAt.localeCompare(a.startedAt));
|
|
515
|
+
const rows = [];
|
|
516
|
+
for (const summary of summaries.slice(0, limit)) {
|
|
517
|
+
if (!summary.capabilities.has("tokenUsage")) continue;
|
|
518
|
+
const owner = adapters2.find((a) => a.id === "claude") ?? adapters2[0];
|
|
519
|
+
if (!owner) continue;
|
|
520
|
+
const convo = await owner.loadConversation(summary.sessionId);
|
|
521
|
+
if (!convo) continue;
|
|
522
|
+
const report = computeCacheReport(convo.messages, claudePricing);
|
|
523
|
+
const dumbZone = computeDumbZoneReport(convo.messages);
|
|
524
|
+
rows.push({ summary, report, health: cacheHealthFromReport(report), dumbZone });
|
|
525
|
+
}
|
|
526
|
+
if (g.json) {
|
|
527
|
+
console.log(
|
|
528
|
+
JSON.stringify(
|
|
529
|
+
rows.map((r) => ({
|
|
530
|
+
sessionId: r.summary.sessionId,
|
|
531
|
+
title: r.summary.title,
|
|
532
|
+
totalCost: r.report.totalCost,
|
|
533
|
+
recoverableBloatCost: r.report.recoverableBloatCost,
|
|
534
|
+
aboveBaselineContextCost: r.report.aboveBaselineContextCost,
|
|
535
|
+
finalRampRatio: r.report.finalRampRatio,
|
|
536
|
+
cacheHitRatio: r.report.cacheHitRatio,
|
|
537
|
+
health: r.health,
|
|
538
|
+
dumbZone: r.dumbZone,
|
|
539
|
+
recommendations: r.report.recommendations
|
|
540
|
+
})),
|
|
541
|
+
null,
|
|
542
|
+
2
|
|
543
|
+
)
|
|
544
|
+
);
|
|
545
|
+
return;
|
|
546
|
+
}
|
|
547
|
+
if (rows.length === 0) {
|
|
548
|
+
console.log("No conversations with token data to analyse.");
|
|
549
|
+
return;
|
|
550
|
+
}
|
|
551
|
+
const climbingOrWorse = rows.filter(
|
|
552
|
+
(r) => r.health === "climbing" || r.health === "poor"
|
|
553
|
+
).length;
|
|
554
|
+
const heavyBloat = rows.filter((r) => r.health === "poor").length;
|
|
555
|
+
const totalRecoverable = rows.reduce((sum, r) => sum + r.report.recoverableBloatCost, 0);
|
|
556
|
+
const scanned = rows.length;
|
|
557
|
+
const climbingFraction = `${climbingOrWorse}/${scanned} climbing or worse`;
|
|
558
|
+
const heavyFraction = heavyBloat > 0 ? ` \xB7 ${heavyBloat} heavy drift` : "";
|
|
559
|
+
const recoverableStr = totalRecoverable > 0 ? ` \xB7 ${formatUSD(totalRecoverable)} recoverable` : "";
|
|
560
|
+
const tokK = `${Math.round(DUMB_ZONE_TOKENS / 1e3)}k`;
|
|
561
|
+
const inZone = rows.filter((r) => r.dumbZone.inDumbZone).sort((a, b) => b.dumbZone.promptsInZone - a.dumbZone.promptsInZone);
|
|
562
|
+
console.log(`
|
|
563
|
+
Dumb Zone \u2014 context past ${tokK} tokens, where the model gets noticeably worse.`);
|
|
564
|
+
if (inZone.length === 0) {
|
|
565
|
+
console.log(`No conversations lingered there. (Flagged at ${DUMB_ZONE_MIN_PROMPTS}+ prompts past ${tokK}.)`);
|
|
566
|
+
} else {
|
|
567
|
+
console.log(
|
|
568
|
+
`${inZone.length}/${rows.length} conversation${inZone.length === 1 ? "" : "s"} kept going past ${tokK} tokens for ${DUMB_ZONE_MIN_PROMPTS}+ prompts:`
|
|
569
|
+
);
|
|
570
|
+
for (const { summary, dumbZone } of inZone) {
|
|
571
|
+
const id = summary.sessionId.slice(0, 8);
|
|
572
|
+
const prompts = `${dumbZone.promptsInZone} prompt${dumbZone.promptsInZone === 1 ? "" : "s"} in zone`.padEnd(20);
|
|
573
|
+
const peak = `peak ${Math.round(dumbZone.peakContextTokens / 1e3)}k`.padEnd(11);
|
|
574
|
+
console.log(`${id} ${prompts}${peak}${summary.title}`);
|
|
575
|
+
}
|
|
576
|
+
console.log("Fix: /clear or /compact at the task boundary, or start a fresh conversation, before context runs past the line.");
|
|
577
|
+
}
|
|
578
|
+
console.log("\nBloat (secondary) \u2014 recoverable cost from carrying stale context past an early-session baseline.");
|
|
579
|
+
console.log(`Scanned ${scanned} conversations: ${climbingFraction}${heavyFraction}${recoverableStr}`);
|
|
580
|
+
rows.sort((a, b) => b.report.recoverableBloatCost - a.report.recoverableBloatCost);
|
|
581
|
+
const listed = opts.all ? rows : rows.filter((r) => r.health === "climbing" || r.health === "poor");
|
|
582
|
+
let totalBloat = 0;
|
|
583
|
+
for (const { summary, report, health, dumbZone } of listed) {
|
|
584
|
+
totalBloat += report.recoverableBloatCost;
|
|
585
|
+
const id = summary.sessionId.slice(0, 8);
|
|
586
|
+
const cost = `cost ${formatUSD(report.totalCost)}`.padEnd(14);
|
|
587
|
+
const bloat = `bloat ${formatUSD(report.recoverableBloatCost)}`.padEnd(15);
|
|
588
|
+
const ramp = `ramp ${report.finalRampRatio.toFixed(1)}\xD7`.padEnd(11);
|
|
589
|
+
const hit = `hit ${(report.cacheHitRatio * 100).toFixed(0)}%`.padEnd(9);
|
|
590
|
+
const dz = `dumb zone ${dumbZone.inDumbZone ? "yes" : "no"}`.padEnd(15);
|
|
591
|
+
const healthLabel = bloatHealthLabel(health);
|
|
592
|
+
console.log(
|
|
593
|
+
`${id} ${healthLabel.padEnd(12)}${cost}${bloat}${ramp}${hit}${dz}${summary.title}`
|
|
594
|
+
);
|
|
595
|
+
if (g.verbose) {
|
|
596
|
+
for (const rec of report.recommendations) {
|
|
597
|
+
console.log(` [${rec.severity}] ${rec.title}`);
|
|
598
|
+
console.log(` ${rec.message}`);
|
|
599
|
+
}
|
|
600
|
+
}
|
|
601
|
+
}
|
|
602
|
+
if (listed.length === 0) {
|
|
603
|
+
console.log("Nothing Climbing or Heavy \u2014 every scanned conversation looks healthy. (--all to list them.)");
|
|
604
|
+
}
|
|
605
|
+
console.log("\nTo dig deeper, export flagged conversations and upload to an LLM:");
|
|
606
|
+
console.log(" conversations export (saves bloat-report-export-<date>.md)");
|
|
607
|
+
console.log(" conversations export <id> (one specific conversation)");
|
|
608
|
+
});
|
|
609
|
+
conversations.command("export").description(
|
|
610
|
+
"Export conversations as markdown ready to paste into an LLM chatbot. Pass id(s) for specific ones, or no ids to export everything the bloat report flags."
|
|
611
|
+
).argument("[id...]", "session id(s) or prefixes (omit to export from the bloat report)").option("-p, --provider <name>", "limit to one provider (e.g. claude, codex)").option("-n, --limit <count>", "max conversations to scan (no-id mode)", String(DEFAULT_LIMIT)).option("-a, --all", "include healthy conversations too (no-id mode)").option("-o, --output <file>", "custom filename (default: bloat-report-export-<date>.md)").option("--print", "print to the terminal instead of saving a file").action(async (ids, opts) => {
|
|
612
|
+
const adapters2 = resolveAdapters(opts.provider);
|
|
613
|
+
if (opts.provider && adapters2.length === 0) {
|
|
614
|
+
console.error(`Unknown provider: ${opts.provider}`);
|
|
615
|
+
process.exitCode = 1;
|
|
616
|
+
return;
|
|
617
|
+
}
|
|
618
|
+
const owner = adapters2.find((a) => a.id === "claude") ?? adapters2[0];
|
|
619
|
+
const exports = [];
|
|
620
|
+
if (ids.length > 0) {
|
|
621
|
+
const summaries = [];
|
|
622
|
+
for (const adapter of adapters2) {
|
|
623
|
+
if (!await adapter.isAvailable()) continue;
|
|
624
|
+
summaries.push(...await adapter.listConversations());
|
|
625
|
+
}
|
|
626
|
+
for (const id of ids) {
|
|
627
|
+
const match = summaries.find(
|
|
628
|
+
(s) => s.sessionId === id || s.sessionId.startsWith(id)
|
|
629
|
+
);
|
|
630
|
+
if (!match) {
|
|
631
|
+
console.error(`No conversation matching "${id}".`);
|
|
632
|
+
process.exitCode = 1;
|
|
633
|
+
continue;
|
|
634
|
+
}
|
|
635
|
+
const convo = await owner?.loadConversation(match.sessionId);
|
|
636
|
+
if (!convo) {
|
|
637
|
+
console.error(`Could not load conversation "${id}".`);
|
|
638
|
+
process.exitCode = 1;
|
|
639
|
+
continue;
|
|
640
|
+
}
|
|
641
|
+
exports.push(renderExport(convo));
|
|
642
|
+
}
|
|
643
|
+
} else {
|
|
644
|
+
const limit = Number.parseInt(opts.limit, 10) || DEFAULT_LIMIT;
|
|
645
|
+
const summaries = [];
|
|
646
|
+
for (const adapter of adapters2) {
|
|
647
|
+
if (!await adapter.isAvailable()) continue;
|
|
648
|
+
summaries.push(...await adapter.listConversations());
|
|
649
|
+
}
|
|
650
|
+
summaries.sort((a, b) => b.startedAt.localeCompare(a.startedAt));
|
|
651
|
+
const rows = [];
|
|
652
|
+
for (const summary of summaries.slice(0, limit)) {
|
|
653
|
+
if (!summary.capabilities.has("tokenUsage")) continue;
|
|
654
|
+
const convo = await owner?.loadConversation(summary.sessionId);
|
|
655
|
+
if (!convo) continue;
|
|
656
|
+
const report = computeCacheReport(convo.messages, claudePricing);
|
|
657
|
+
const health = cacheHealthFromReport(report);
|
|
658
|
+
const inScope = opts.all || health === "climbing" || health === "poor";
|
|
659
|
+
if (inScope) exports.push(renderExport(convo));
|
|
660
|
+
rows.push({ summary, health });
|
|
661
|
+
}
|
|
662
|
+
const flagged = rows.filter((r) => r.health === "climbing" || r.health === "poor").length;
|
|
663
|
+
const exported = exports.length;
|
|
664
|
+
process.stderr.write(
|
|
665
|
+
`Scanned ${rows.length} conversations, exporting ${exported}` + (opts.all ? "" : ` flagged (${flagged} climbing/heavy)`) + ".\n"
|
|
666
|
+
);
|
|
667
|
+
if (exported === 0) {
|
|
668
|
+
process.stderr.write("Nothing to export \u2014 no climbing or heavy conversations found. (--all to export everything.)\n");
|
|
669
|
+
return;
|
|
670
|
+
}
|
|
671
|
+
}
|
|
672
|
+
if (exports.length === 0) return;
|
|
673
|
+
const content = exports.join("\n\n---\n\n");
|
|
674
|
+
if (opts.print) {
|
|
675
|
+
process.stdout.write(content + "\n");
|
|
676
|
+
} else {
|
|
677
|
+
const date = (/* @__PURE__ */ new Date()).toISOString().slice(0, 10);
|
|
678
|
+
const filename = opts.output ?? `bloat-report-export-${date}.md`;
|
|
679
|
+
fs.writeFileSync(filename, content, "utf8");
|
|
680
|
+
process.stderr.write(`Saved to ${filename}
|
|
681
|
+
`);
|
|
682
|
+
}
|
|
683
|
+
});
|
|
684
|
+
}
|
|
685
|
+
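The `detail` and `export` commands resolve an id with `summaries.find(...)`, which returns the first summary whose `sessionId` equals or starts with the given string. A short prefix shared by several sessions therefore picks whichever one comes first in the list. Below is a hedged sketch of a stricter lookup. `resolveSession` is a hypothetical helper, not part of the package, and the session ids are made up.

```javascript
// Hypothetical helper: resolve an id or id prefix and report ambiguity
// instead of returning the first prefix match.
function resolveSession(summaries, idOrPrefix) {
  const exact = summaries.find((s) => s.sessionId === idOrPrefix);
  if (exact) return { match: exact, candidates: [exact] };
  const candidates = summaries.filter((s) => s.sessionId.startsWith(idOrPrefix));
  return { match: candidates.length === 1 ? candidates[0] : null, candidates };
}

// Made-up summaries for illustration.
const summaries = [
  { sessionId: "a1b2c3d4-0001" },
  { sessionId: "a1b2ffff-0002" },
  { sessionId: "9f8e7d6c-0003" }
];

const unique = resolveSession(summaries, "9f8e");
const ambiguous = resolveSession(summaries, "a1b2");
const missing = resolveSession(summaries, "zzzz");

console.log(unique.match.sessionId);
console.log(ambiguous.candidates.map((s) => s.sessionId));
console.log(missing.match);
```

Returning the candidate list lets a caller print the colliding ids and ask for a longer prefix, rather than loading a conversation the user did not intend.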
|
|
686
|
+
// src/adapters/claude/paths.ts
|
|
687
|
+
import { homedir } from "os";
|
|
688
|
+
import { join } from "path";
|
|
689
|
+
import { readdir, access } from "fs/promises";
|
|
690
|
+
function claudeRoot() {
|
|
691
|
+
return process.env.CLAUDE_CONFIG_DIR || join(homedir(), ".claude");
|
|
692
|
+
}
|
|
693
|
+
function projectsDir() {
|
|
694
|
+
return join(claudeRoot(), "projects");
|
|
695
|
+
}
|
|
696
|
+
async function hasProjects() {
|
|
697
|
+
try {
|
|
698
|
+
await access(projectsDir());
|
|
699
|
+
return true;
|
|
700
|
+
} catch {
|
|
701
|
+
return false;
|
|
702
|
+
}
|
|
703
|
+
}
|
|
704
|
+
async function listTranscriptFiles() {
|
|
705
|
+
const root = projectsDir();
|
|
706
|
+
let projects;
|
|
707
|
+
try {
|
|
708
|
+
projects = await readdir(root);
|
|
709
|
+
} catch {
|
|
710
|
+
return [];
|
|
711
|
+
}
|
|
712
|
+
const files = [];
|
|
713
|
+
for (const projectId of projects) {
|
|
714
|
+
let entries;
|
|
715
|
+
try {
|
|
716
|
+
entries = await readdir(join(root, projectId));
|
|
717
|
+
} catch {
|
|
718
|
+
continue;
|
|
719
|
+
}
|
|
720
|
+
for (const entry of entries) {
|
|
721
|
+
if (!entry.endsWith(".jsonl")) continue;
|
|
722
|
+
files.push({
|
|
723
|
+
projectId,
|
|
724
|
+
sessionId: entry.slice(0, -".jsonl".length),
|
|
725
|
+
path: join(root, projectId, entry)
|
|
726
|
+
});
|
|
727
|
+
}
|
|
728
|
+
}
|
|
729
|
+
return files;
|
|
730
|
+
}
|
|
731
|
+
|
|
732
|
+
// src/adapters/claude/parse.ts
|
|
733
|
+
import { readFile } from "fs/promises";
|
|
734
|
+
function mapUsage(raw) {
|
|
735
|
+
if (!raw || typeof raw !== "object") return void 0;
|
|
736
|
+
const cc = raw.cache_creation ?? {};
|
|
737
|
+
const has5m = typeof cc.ephemeral_5m_input_tokens === "number";
|
|
738
|
+
const has1h = typeof cc.ephemeral_1h_input_tokens === "number";
|
|
739
|
+
const lump = !has5m && !has1h ? raw.cache_creation_input_tokens ?? 0 : 0;
|
|
740
|
+
return {
|
|
741
|
+
inputTokens: raw.input_tokens ?? 0,
|
|
742
|
+
outputTokens: raw.output_tokens ?? 0,
|
|
743
|
+
cacheReadTokens: raw.cache_read_input_tokens ?? 0,
|
|
744
|
+
cacheWrite5mTokens: (cc.ephemeral_5m_input_tokens ?? 0) + lump,
|
|
745
|
+
cacheWrite1hTokens: cc.ephemeral_1h_input_tokens ?? 0
|
|
746
|
+
};
|
|
747
|
+
}
|
|
748
|
+
function userPromptText(content) {
|
|
749
|
+
const parts = [];
|
|
750
|
+
if (typeof content === "string") {
|
|
751
|
+
if (content) parts.push(content);
|
|
752
|
+
} else if (Array.isArray(content)) {
|
|
753
|
+
for (const b of content) {
|
|
754
|
+
if (b?.type === "text" && typeof b.text === "string") parts.push(b.text);
|
|
755
|
+
}
|
|
756
|
+
}
|
|
757
|
+
if (parts.length === 0) return null;
|
|
758
|
+
let text = parts.join("\n");
|
|
759
|
+
text = text.replace(/<command-name>([^<]*)<\/command-name>/g, "$1");
|
|
760
|
+
text = text.replace(
|
|
761
|
+
/<command-args>([\s\S]*?)<\/command-args>/g,
|
|
762
|
+
(_, args) => args.trim() ? ` ${args.trim()}` : ""
|
|
763
|
+
);
|
|
764
|
+
text = text.replace(/<command-message>[\s\S]*?<\/command-message>/g, "").replace(/<local-command-stdout>[\s\S]*?<\/local-command-stdout>/g, "").replace(/<local-command-stderr>[\s\S]*?<\/local-command-stderr>/g, "").replace(/<system-reminder>[\s\S]*?<\/system-reminder>/g, "").trim();
|
|
765
|
+
return text || null;
|
|
766
|
+
}
|
|
767
|
+
function charCountOf(content) {
|
|
768
|
+
if (typeof content === "string") return content.length;
|
|
769
|
+
if (Array.isArray(content)) {
|
|
770
|
+
return content.reduce(
|
|
771
|
+
(n, b) => n + (typeof b?.text === "string" ? b.text.length : 0),
|
|
772
|
+
0
|
|
773
|
+
);
|
|
774
|
+
}
|
|
775
|
+
return 0;
|
|
776
|
+
}
|
|
777
|
+
function mapBlocks(content, toolNames) {
|
|
778
|
+
if (typeof content === "string") {
|
|
779
|
+
return content ? [{ kind: "text", text: content }] : [];
|
|
780
|
+
}
|
|
781
|
+
if (!Array.isArray(content)) return [];
|
|
782
|
+
const blocks = [];
|
|
783
|
+
for (const b of content) {
|
|
784
|
+
switch (b?.type) {
|
|
785
|
+
case "text":
|
|
786
|
+
blocks.push({ kind: "text", text: b.text ?? "" });
|
|
787
|
+
break;
|
|
788
|
+
case "thinking":
|
|
789
|
+
blocks.push({ kind: "thinking", text: b.thinking ?? "" });
|
|
790
|
+
break;
|
|
791
|
+
case "tool_use":
|
|
792
|
+
if (b.id && b.name) toolNames.set(b.id, b.name);
|
|
793
|
+
blocks.push({ kind: "tool_use", toolUseId: b.id ?? "", name: b.name ?? "", input: b.input });
|
|
794
|
+
break;
|
|
795
|
+
case "tool_result":
|
|
796
|
+
blocks.push({
|
|
797
|
+
kind: "tool_result",
|
|
798
|
+
toolUseId: b.tool_use_id ?? "",
|
|
799
|
+
isError: !!b.is_error,
|
|
800
|
+
charCount: charCountOf(b.content),
|
|
801
|
+
toolName: b.tool_use_id ? toolNames.get(b.tool_use_id) : void 0
|
|
802
|
+
});
|
|
803
|
+
break;
|
|
804
|
+
}
|
|
805
|
+
}
|
|
806
|
+
return blocks;
|
|
807
|
+
}
|
|
808
|
+
function* readLines(text) {
|
|
809
|
+
for (const line of text.split("\n")) {
|
|
810
|
+
const s = line.trim();
|
|
811
|
+
if (!s) continue;
|
|
812
|
+
try {
|
|
813
|
+
yield JSON.parse(s);
|
|
814
|
+
} catch { // skip lines that are not valid JSON
|
|
815
|
+
}
|
|
816
|
+
}
|
|
817
|
+
}
|
|
818
|
+
function newAccumulator() {
|
|
819
|
+
return {
|
|
820
|
+
modelCounts: /* @__PURE__ */ new Map(),
|
|
821
|
+
totals: {
|
|
822
|
+
inputTokens: 0,
|
|
823
|
+
outputTokens: 0,
|
|
824
|
+
cacheReadTokens: 0,
|
|
825
|
+
cacheWrite5mTokens: 0,
|
|
826
|
+
cacheWrite1hTokens: 0
|
|
827
|
+
},
|
|
828
|
+
totalCost: 0,
|
|
829
|
+
messageCount: 0,
|
|
830
|
+
userPromptCount: 0,
|
|
831
|
+
seenUsageIds: /* @__PURE__ */ new Set(),
|
|
832
|
+
caps: /* @__PURE__ */ new Set()
|
|
833
|
+
};
|
|
834
|
+
}
|
|
835
|
+
function accumulate(acc, raw) {
|
|
836
|
+
if (raw.type === "ai-title" && raw.aiTitle) {
|
|
837
|
+
acc.title = raw.aiTitle;
|
|
838
|
+
return;
|
|
839
|
+
}
|
|
840
|
+
if (raw.type !== "user" && raw.type !== "assistant") return;
|
|
841
|
+
if (raw.timestamp) {
|
|
842
|
+
if (!acc.startedAt) acc.startedAt = raw.timestamp;
|
|
843
|
+
acc.endedAt = raw.timestamp;
|
|
844
|
+
}
|
|
845
|
+
if (raw.cwd && !acc.cwd) acc.cwd = raw.cwd;
|
|
846
|
+
if (raw.gitBranch && !acc.gitBranch) acc.gitBranch = raw.gitBranch;
|
|
847
|
+
const msg = raw.message;
|
|
848
|
+
if (raw.type === "user" && userPromptText(msg?.content) !== null) {
|
|
849
|
+
acc.userPromptCount += 1;
|
|
850
|
+
}
|
|
851
|
+
if (raw.type === "assistant" && msg?.model) {
|
|
852
|
+
acc.modelCounts.set(msg.model, (acc.modelCounts.get(msg.model) ?? 0) + 1);
|
|
853
|
+
acc.caps.add("modelPerTurn");
|
|
854
|
+
}
|
|
855
|
+
const usage = mapUsage(msg?.usage);
|
|
856
|
+
if (usage && raw.type === "assistant") {
|
|
857
|
+
const id = msg?.id ?? raw.uuid;
|
|
858
|
+
if (id && !acc.seenUsageIds.has(id)) {
|
|
859
|
+
acc.seenUsageIds.add(id);
|
|
860
|
+
acc.totals.inputTokens += usage.inputTokens;
|
|
861
|
+
acc.totals.outputTokens += usage.outputTokens;
|
|
862
|
+
acc.totals.cacheReadTokens += usage.cacheReadTokens;
|
|
863
|
+
acc.totals.cacheWrite5mTokens += usage.cacheWrite5mTokens;
|
|
864
|
+
acc.totals.cacheWrite1hTokens += usage.cacheWrite1hTokens;
|
|
865
|
+
acc.totalCost += costForUsage(msg?.model, usage);
|
|
866
|
+
acc.caps.add("tokenUsage");
|
|
867
|
+
if (usage.cacheReadTokens || usage.cacheWrite5mTokens || usage.cacheWrite1hTokens) {
|
|
868
|
+
acc.caps.add("cacheSplit");
|
|
869
|
+
}
|
|
870
|
+
}
|
|
871
|
+
}
|
|
872
|
+
}
|
|
873
|
+
function primaryModel(counts) {
|
|
874
|
+
let best = "";
|
|
875
|
+
let n = -1;
|
|
876
|
+
for (const [model, c] of counts) if (c > n) { best = model; n = c; }
|
|
877
|
+
return best;
|
|
878
|
+
}
|
|
879
|
+
function toSummary(file, acc) {
|
|
880
|
+
acc.caps.add("toolOutputSize");
|
|
881
|
+
return {
|
|
882
|
+
projectId: file.projectId,
|
|
883
|
+
sessionId: file.sessionId,
|
|
884
|
+
title: acc.title || acc.cwd || file.sessionId,
|
|
885
|
+
cwd: acc.cwd ?? "",
|
|
886
|
+
gitBranch: acc.gitBranch,
|
|
887
|
+
startedAt: acc.startedAt ?? "",
|
|
888
|
+
endedAt: acc.endedAt ?? "",
|
|
889
|
+
messageCount: acc.messageCount,
|
|
890
|
+
userPromptCount: acc.userPromptCount,
|
|
891
|
+
primaryModel: primaryModel(acc.modelCounts),
|
|
892
|
+
totalCost: acc.totalCost,
|
|
893
|
+
totalInputTokens: acc.totals.inputTokens,
|
|
894
|
+
totalOutputTokens: acc.totals.outputTokens,
|
|
895
|
+
totalCacheReadTokens: acc.totals.cacheReadTokens,
|
|
896
|
+
totalCacheWriteTokens: acc.totals.cacheWrite5mTokens + acc.totals.cacheWrite1hTokens,
|
|
897
|
+
cacheHealth: null,
|
|
898
|
+
// analysis fills this
|
|
899
|
+
capabilities: acc.caps
|
|
900
|
+
};
|
|
901
|
+
}
|
|
902
|
+
async function summarizeTranscript(file) {
|
|
903
|
+
let text;
|
|
904
|
+
try {
|
|
905
|
+
text = await readFile(file.path, "utf8");
|
|
906
|
+
} catch {
|
|
907
|
+
return null;
|
|
908
|
+
}
|
|
909
|
+
const acc = newAccumulator();
|
|
910
|
+
let messages = 0;
|
|
911
|
+
for (const raw of readLines(text)) {
|
|
912
|
+
accumulate(acc, raw);
|
|
913
|
+
if (raw.type === "user" || raw.type === "assistant") messages++;
|
|
914
|
+
}
|
|
915
|
+
acc.messageCount = messages;
|
|
916
|
+
return toSummary(file, acc);
|
|
917
|
+
}
|
|
918
|
+
async function loadTranscript(file) {
|
|
919
|
+
let text;
|
|
920
|
+
try {
|
|
921
|
+
text = await readFile(file.path, "utf8");
|
|
922
|
+
} catch {
|
|
923
|
+
return null;
|
|
924
|
+
}
|
|
925
|
+
const acc = newAccumulator();
|
|
926
|
+
const toolNames = /* @__PURE__ */ new Map();
|
|
927
|
+
const messages = [];
|
|
928
|
+
const byAssistantId = /* @__PURE__ */ new Map();
|
|
929
|
+
for (const raw of readLines(text)) {
|
|
930
|
+
accumulate(acc, raw);
|
|
931
|
+
if (raw.type !== "user" && raw.type !== "assistant") continue;
|
|
932
|
+
const msg = raw.message;
|
|
933
|
+
const blocks = mapBlocks(msg?.content, toolNames);
|
|
934
|
+
if (raw.type === "assistant") {
|
|
935
|
+
const id = msg?.id ?? raw.uuid ?? "";
|
|
936
|
+
const existing = byAssistantId.get(id);
|
|
937
|
+
if (existing) {
|
|
938
|
+
existing.blocks.push(...blocks);
|
|
939
|
+
continue;
|
|
940
|
+
}
|
|
941
|
+
const message = {
|
|
942
|
+
uuid: raw.uuid ?? id,
|
|
943
|
+
parentUuid: raw.parentUuid ?? null,
|
|
944
|
+
role: "assistant",
|
|
945
|
+
timestamp: raw.timestamp ?? "",
|
|
946
|
+
model: msg?.model,
|
|
947
|
+
blocks,
|
|
948
|
+
usage: mapUsage(msg?.usage)
|
|
949
|
+
};
|
|
950
|
+
byAssistantId.set(id, message);
|
|
951
|
+
messages.push(message);
|
|
952
|
+
} else {
|
|
953
|
+
messages.push({
|
|
954
|
+
uuid: raw.uuid ?? "",
|
|
955
|
+
parentUuid: raw.parentUuid ?? null,
|
|
956
|
+
role: "user",
|
|
957
|
+
timestamp: raw.timestamp ?? "",
|
|
958
|
+
blocks,
|
|
959
|
+
// Same gate that drives userPromptCount — mark genuine prompts so
|
|
960
|
+
// detectors (e.g. the dumb-zone scan) can count them without re-deriving
|
|
961
|
+
// which "user" lines are real human input vs. tool results / wrappers.
|
|
962
|
+
isGenuinePrompt: userPromptText(msg?.content) !== null
|
|
963
|
+
});
|
|
964
|
+
}
|
|
965
|
+
}
|
|
966
|
+
acc.messageCount = messages.length;
|
|
967
|
+
return { ...toSummary(file, acc), messages };
|
|
968
|
+
}
|
|
969
|
+
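`mapUsage` normalises the raw usage record attached to an assistant line. When the per-TTL breakdown under `cache_creation` is absent, the lump `cache_creation_input_tokens` figure is counted as a 5-minute cache write. The sketch below restates that function on its own and runs it on two usage records. The sample numbers are made up for illustration and are not taken from a real transcript.

```javascript
// Standalone restatement of the usage mapping shown in the diff.
function mapUsage(raw) {
  if (!raw || typeof raw !== "object") return undefined;
  const cc = raw.cache_creation ?? {};
  const has5m = typeof cc.ephemeral_5m_input_tokens === "number";
  const has1h = typeof cc.ephemeral_1h_input_tokens === "number";
  // Use the lump figure only when no per-TTL split is present.
  const lump = !has5m && !has1h ? (raw.cache_creation_input_tokens ?? 0) : 0;
  return {
    inputTokens: raw.input_tokens ?? 0,
    outputTokens: raw.output_tokens ?? 0,
    cacheReadTokens: raw.cache_read_input_tokens ?? 0,
    cacheWrite5mTokens: (cc.ephemeral_5m_input_tokens ?? 0) + lump,
    cacheWrite1hTokens: cc.ephemeral_1h_input_tokens ?? 0
  };
}

// Record with a per-TTL split: the lump total is ignored to avoid double counting.
const split = mapUsage({
  input_tokens: 12,
  output_tokens: 340,
  cache_read_input_tokens: 50000,
  cache_creation_input_tokens: 900,
  cache_creation: { ephemeral_5m_input_tokens: 600, ephemeral_1h_input_tokens: 300 }
});

// Record with only the lump total: it is attributed to the 5-minute bucket.
const lumpOnly = mapUsage({
  input_tokens: 8,
  output_tokens: 100,
  cache_creation_input_tokens: 900
});

console.log(split);
console.log(lumpOnly);
console.log(mapUsage(null));
```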
|
|
970
|
+
// src/adapters/claude/index.ts
|
|
971
|
+
var RECOMMENDATIONS = {
|
|
972
|
+
uncompactedSession: {
|
|
973
|
+
fix: "Run /compact at task boundaries to summarise and shrink the running context.",
|
|
974
|
+
command: "/compact"
|
|
975
|
+
},
|
|
976
|
+
noisyToolOutput: {
|
|
977
|
+
fix: "Pipe noisy commands through a quiet flag or `head` so only the useful slice enters context."
|
|
978
|
+
},
|
|
979
|
+
rereadUnchangedFile: {
|
|
980
|
+
fix: "Read a file once; don't re-read it while it's unchanged in context."
|
|
981
|
+
},
|
|
982
|
+
fullFileRead: {
|
|
983
|
+
fix: "Use a ranged read on large files instead of pulling the whole thing."
|
|
984
|
+
},
|
|
985
|
+
idleMcpServer: {
|
|
986
|
+
fix: "Disconnect MCP servers a session never calls so their tool definitions stop costing tokens.",
|
|
987
|
+
command: "/mcp"
|
|
988
|
+
}
|
|
989
|
+
};
|
|
990
|
+
var ClaudeAdapter = class {
|
|
991
|
+
id = "claude";
|
|
992
|
+
displayName = "Claude Code";
|
|
993
|
+
isAvailable() {
|
|
994
|
+
return hasProjects();
|
|
995
|
+
}
|
|
996
|
+
async listConversations() {
|
|
997
|
+
const files = await listTranscriptFiles();
|
|
998
|
+
const summaries = await Promise.all(files.map(summarizeTranscript));
|
|
999
|
+
return summaries.filter((s) => s !== null).sort((a, b) => b.startedAt.localeCompare(a.startedAt));
|
|
1000
|
+
}
|
|
1001
|
+
async loadConversation(sessionId) {
|
|
1002
|
+
const files = await listTranscriptFiles();
|
|
1003
|
+
const file = files.find((f) => f.sessionId === sessionId);
|
|
1004
|
+
return file ? loadTranscript(file) : null;
|
|
1005
|
+
}
|
|
1006
|
+
recommend(pattern) {
|
|
1007
|
+
return RECOMMENDATIONS[pattern];
|
|
1008
|
+
}
|
|
1009
|
+
};
|
|
1010
|
+
|
|
1011
|
+
// src/adapters/index.ts
|
|
1012
|
+
function registerAdapters() {
|
|
1013
|
+
registerAdapter(new ClaudeAdapter());
|
|
1014
|
+
}
|
|
1015
|
+
|
|
1016
|
+
// src/cli.ts
|
|
1017
|
+
registerAdapters();
|
|
1018
|
+
var program = new Command();
|
|
1019
|
+
program.name("bloat-report").description(
|
|
1020
|
+
"Scan local coding-agent transcripts for wasteful token patterns \u2014 small changes, big savings. Local-only, read-only."
|
|
1021
|
+
).version("1.0.0");
|
|
1022
|
+
program.option("--json", "emit machine-readable JSON instead of the plain-English report").option("--verbose", "include per-finding detail");
|
|
1023
|
+
registerConversations(program);
|
|
1024
|
+
program.parseAsync(process.argv);
|
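The `--json` and `--verbose` flags are registered on the root program, while the actions run on nested subcommands such as `conversations report`. `globalOpts` reaches those flags by following `parent` links up to the root and reading its options. Below is a minimal sketch of that walk. The plain objects are stand-ins for commander's `Command` instances and are an assumption for illustration; only the `parent` property and the `opts()` method are relied on.

```javascript
// Same walk as globalOpts in the bundle: climb to the root command, read its options.
function globalOpts(cmd) {
  let root = cmd;
  while (root.parent) root = root.parent;
  return root.opts();
}

// Stand-in command tree: program -> conversations -> report.
const program = { parent: null, opts: () => ({ json: true, verbose: false }) };
const conversations = { parent: program, opts: () => ({}) };
const report = { parent: conversations, opts: () => ({ limit: "100" }) };

const g = globalOpts(report);
console.log(g.json, g.verbose);
```

Because the walk always ends at the root, a subcommand's own options (here `limit`) never shadow the global ones; each action reads them separately from its `opts` argument.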
package/package.json
ADDED
|
@@ -0,0 +1,32 @@
|
|
|
1
|
+
{
|
|
2
|
+
"name": "bloat-report",
|
|
3
|
+
"version": "1.0.0",
|
|
4
|
+
"description": "Inspect agent conversations to see how you can make big savings from small changes",
|
|
5
|
+
"type": "module",
|
|
6
|
+
"bin": {
|
|
7
|
+
"bloat-report": "dist/cli.js"
|
|
8
|
+
},
|
|
9
|
+
"main": "dist/cli.js",
|
|
10
|
+
"files": [
|
|
11
|
+
"dist"
|
|
12
|
+
],
|
|
13
|
+
"scripts": {
|
|
14
|
+
"dev": "tsx src/cli.ts",
|
|
15
|
+
"build": "tsup src/cli.ts --format esm --clean",
|
|
16
|
+
"typecheck": "tsc --noEmit",
|
|
17
|
+
"start": "node dist/cli.js",
|
|
18
|
+
"test": "tsx --test src/analyze/cache.test.ts",
|
|
19
|
+
"prepublishOnly": "npm run build"
|
|
20
|
+
},
|
|
21
|
+
"author": "Brendan O'Neill <brendanoneill94@gmail.com>",
|
|
22
|
+
"license": "MIT",
|
|
23
|
+
"dependencies": {
|
|
24
|
+
"commander": "^14.0.3"
|
|
25
|
+
},
|
|
26
|
+
"devDependencies": {
|
|
27
|
+
"@types/node": "^25.9.2",
|
|
28
|
+
"tsup": "^8.5.1",
|
|
29
|
+
"tsx": "^4.22.4",
|
|
30
|
+
"typescript": "^6.0.3"
|
|
31
|
+
}
|
|
32
|
+
}
|