cchubber 0.2.0 → 0.3.1

package/README.md CHANGED
@@ -1,24 +1,36 @@
1
1
  # CC Hubber
2
2
 
3
- **What you spent. Why you spent it. Is that normal.**
3
+ Your Claude Code usage, diagnosed. One command.
4
4
 
5
- Offline CLI that reads your local Claude Code data and generates a diagnostic HTML report. No API keys. No telemetry. Everything stays on your machine.
5
+ ```bash
6
+ npx cchubber
7
+ ```
8
+
9
+ Reads your local data, generates an HTML report. No API keys, no telemetry, nothing leaves your machine.
10
+
11
+ Built during the March 2026 cache crisis because nobody could tell if they'd been hit. Thousands of users burning through limits 10-20x faster than normal, and Anthropic's only answer was "we're investigating." We wanted receipts.
12
+
13
+ ## What you get
6
14
 
7
- Built because Claude Code users had zero visibility into the [March 2026 cache bug](https://github.com/anthropics/claude-code/issues/41930) that silently inflated costs by 10-20x. Your `$100 plan` shouldn't feel like a `$20 plan`.
15
+ A single HTML report that tells you three things: what you spent, why you spent it, and whether that's normal.
8
16
 
9
- ![CC Hubber Report](https://raw.githubusercontent.com/azkhh/cchubber/master/screenshot.png)
17
+ **The diagnosis:**
18
+ - Cache health grade (trend-weighted, recent 7 days count more)
19
+ - Inflection point detection: "Your efficiency dropped 3.2x starting March 17"
20
+ - Per-project cost breakdown with decoded project names
21
+ - Session intelligence: duration stats, tool usage, activity heatmap
22
+ - Model routing analysis (93% Opus? Your limits would last 3x longer on Sonnet)
23
+ - 8 actionable recommendations, each with estimated usage savings
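The trend-weighted grade can be sketched like this. The weighting is illustrative only (recent days counted double, weighted hit rate mapped to a letter); cchubber's actual formula is an assumption here, not confirmed by the source.

```javascript
// Hypothetical trend-weighted grading: the last 7 days count double.
// `days` is oldest-first: [{ date: 'YYYY-MM-DD', hitRate: 0..100 }]
function gradeCacheHealth(days) {
  const recent = days.slice(-7);
  const older = days.slice(0, -7);
  const sum = (xs) => xs.reduce((s, d) => s + d.hitRate, 0);
  const weighted =
    (sum(older) + 2 * sum(recent)) / (older.length + 2 * recent.length || 1);
  if (weighted >= 90) return 'A';
  if (weighted >= 75) return 'B';
  if (weighted >= 60) return 'C';
  if (weighted >= 40) return 'D';
  return 'F';
}
```

Because recent days dominate the average, a run of bad days after the cache bug pulls the grade down even if the month started healthy.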
10
24
 
11
- ## What it does
25
+ **The data:**
26
+ - Cost calculated from actual token counts (LiteLLM pricing, not the broken `costUSD` field)
27
+ - Message-level deduplication (Claude Code JSONL files contain ~50% duplicates from session resume)
28
+ - Subagent visibility: Haiku and Sonnet background agents show up in model distribution
29
+ - CLAUDE.md section-by-section analysis with per-message cost impact
30
+ - Cache break estimation even when diff files don't exist on your CC version
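The deduplication step above can be sketched in a few lines, assuming each parsed JSONL record exposes a `message.id` (the field name follows Claude Code's session files; treat it as an assumption):

```javascript
// Drop repeated messages: session resume rewrites earlier records into the
// new JSONL file, so the same message ID can appear in multiple files.
function dedupeByMessageId(records) {
  const seen = new Set();
  return records.filter((r) => {
    const id = r.message?.id;
    if (!id) return true;           // keep records without an ID
    if (seen.has(id)) return false; // duplicate from a resume
    seen.add(id);
    return true;
  });
}
```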
12
31
 
13
- - **Cost breakdown** — Per-day, per-model, per-project cost calculated from your actual token counts
14
- **Cache health grade** — Trend-weighted (recent 7 days dominate). If you hit the cache bug, you'll see D/F, not a misleading A
15
- - **Inflection point detection** — "Your efficiency dropped 4.7x starting March 29. Before: 360:1. After: 1,676:1."
16
- - **Anomaly detection** — Flags days where your cost/ratio deviates >2 standard deviations
17
- - **Cache break analysis** — Reads `~/.claude/tmp/cache-break-*.diff` files. Shows why your cache broke and how often
18
- - **CLAUDE.md cost analysis** — How much your rules files cost per message (cached vs uncached)
19
- - **Per-project breakdown** — Which project is eating your budget
20
- - **Live rate limits** — 5-hour and 7-day utilization (if OAuth token available)
21
- - **Shareable card** — Export your report as a PNG
32
+ **The shareable card:**
33
+ An animated card with your grade, spend, cache ratio, and diagnosis line. Export as video. Post it. Let people see the numbers Anthropic won't show them.
22
34
 
23
35
  ## Install
24
36
 
@@ -26,74 +38,65 @@ Built because Claude Code users had zero visibility into the [March 2026 cache b
26
38
  npx cchubber
27
39
  ```
28
40
 
29
- Or install globally:
41
+ Or globally:
30
42
 
31
43
  ```bash
32
44
  npm install -g cchubber
33
45
  cchubber
34
46
  ```
35
47
 
36
- Requires Node.js 18+. Runs on macOS, Windows, and Linux.
48
+ Node.js 18+. Works on macOS, Windows, Linux.
37
49
 
38
- ## Usage
50
+ ## The cache bug (March 2026)
39
51
 
40
- ```bash
41
- cchubber # Scan and open HTML report
42
- cchubber --days 7 # Default view: last 7 days
43
- cchubber -o report.html # Custom output path
44
- cchubber --no-open # Don't auto-open in browser
45
- cchubber --json # Machine-readable JSON output
46
- ```
47
-
48
- ## What it reads
52
+ Between v2.1.69 and v2.1.89, five things broke at once:
49
53
 
50
- All data is local. Nothing leaves your machine.
54
+ 1. A sentinel replacement bug in Anthropic's custom Bun fork dropped cache read rates from 95% to 4-17%
55
+ 2. The `--resume` flag caused full prompt-cache misses on every single resume
56
+ 3. One session generated 652,069 output tokens with zero user input ($342 gone)
57
+ 4. Peak-hour throttling kicked in for 7% of users without announcement
58
+ 5. A 2x off-peak promotion expired, making the baseline feel like a cut
51
59
 
52
- | Source | Path | What |
53
- |--------|------|------|
54
- | JSONL conversations | `~/.claude/projects/*/` | Token counts per message, per model, per session |
55
- | Stats cache | `~/.claude/stats-cache.json` | Pre-aggregated daily totals |
56
- | Session meta | `~/.claude/usage-data/session-meta/` | Duration, tool counts, lines changed |
57
- | Cache breaks | `~/.claude/tmp/cache-break-*.diff` | Why your prompt cache invalidated |
58
- | CLAUDE.md stack | `~/.claude/CLAUDE.md`, project-level | File sizes and per-message cost impact |
59
- | OAuth usage | `~/.claude/.credentials.json` | Live rate limit utilization |
60
-
61
- ## The March 2026 cache bug
60
+ v2.1.90 fixes most of these. Run `claude update`.
62
61
 
63
- Between v2.1.69 and v2.1.89, multiple bugs caused Claude Code's prompt cache to silently fail:
62
+ CC Hubber shows you whether you were affected. If your report has a sharp inflection point around mid-March, that's probably when it hit you.
64
63
 
65
- - A sentinel replacement bug in the Bun fork dropped cache read rates from ~95% to 4-17%
66
- - The `--resume` flag caused full prompt-cache misses on every resume
67
- - One session generated 652,069 output tokens with no user input — $342 on a single session
64
+ ## What the community figured out
68
65
 
69
- **v2.1.90 fixes most of these.** Update immediately: `claude update`
66
+ These tips came from GitHub issues, Reddit threads, and Twitter during the crisis. CC Hubber's recommendations are based on this data.
70
67
 
71
- CC Hubber detects whether you were affected by showing your cache efficiency trend over time. If you see a sharp inflection point, that's probably when it hit you.
68
+ - Start a fresh session for each task. Long sessions bleed tokens.
69
+ - Route subagents to Sonnet (`model: "sonnet"` on Task calls). Same quality, 5x cheaper per token.
70
+ - Keep your CLAUDE.md under 200 lines. It gets re-read on every message. 12K tokens at 200 messages/day costs $1.23/day cached.
71
+ - Run `/compact` every 30-40 tool calls. Context bloat compounds.
72
+ - Create a `.claudeignore` file. Exclude `node_modules/`, `dist/`, `*.lock`. Saves tokens on every context load.
73
+ - Avoid `--resume` on older versions. Fixed in v2.1.90.
74
+ - Shift heavy work (refactors, test generation) outside 5am-11am PT. That's when Anthropic throttles session limits.
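A starting point for that ignore file, using the paths the recommendations above call out (exact `.claudeignore` semantics may vary by Claude Code version):

```
# Keep bulky, low-value paths out of every context load
node_modules/
dist/
__pycache__/
*.lock
```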
72
75
 
73
- ## Best practices (from the community)
76
+ ## How the cost works
74
77
 
75
- These tips surfaced during the March crisis. CC Hubber helps you verify whether they're working:
78
+ Claude Code doesn't show costs for Max and Pro plans (`costUSD` is always 0). CC Hubber calculates equivalent API cost from your token counts using LiteLLM's pricing data.
76
79
 
77
- - **Start fresh sessions per task** — don't try to extend long sessions
78
- - **Avoid `--resume` on older versions** — fixed in v2.1.90
79
- - **Switch to Sonnet 4.6 for routine work** — same quality, fraction of the quota
80
- - **Keep CLAUDE.md under 200 lines** — it's re-read on every message
81
- - **Use `/compact` every 30-40 tool calls** — prevents context bloat
82
- - **Create `.claudeignore`** — exclude `node_modules/`, `dist/`, `*.lock`
83
- - **Shift heavy work to off-peak hours** — outside 5am-11am PT weekdays
80
+ The number you see is what you'd pay on the API tier for the same usage. Useful for comparing consumption across days and projects. Not a billing statement.
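In sketch form, the equivalent-API-cost math looks like this. The input, cache read, and cache write prices below match the per-million figures quoted elsewhere in this package ($5.00, $0.50, $6.25); the output price is a placeholder assumption, since real prices come from LiteLLM at runtime:

```javascript
// Illustrative per-million-token prices; `output` is an assumed placeholder.
const PRICE_PER_M = { input: 5.0, output: 25.0, cacheRead: 0.5, cacheWrite: 6.25 };

// Equivalent API cost for one aggregated bucket of token counts.
function equivalentApiCost({ input = 0, output = 0, cacheRead = 0, cacheWrite = 0 }) {
  return (
    (input / 1e6) * PRICE_PER_M.input +
    (output / 1e6) * PRICE_PER_M.output +
    (cacheRead / 1e6) * PRICE_PER_M.cacheRead +
    (cacheWrite / 1e6) * PRICE_PER_M.cacheWrite
  );
}
```

Summing this per day or per project gives the comparable numbers in the report; it is deliberately not a billing figure.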
84
81
 
85
- ## How cost is calculated
82
+ ## Data sources
86
83
 
87
- Claude Code doesn't report costs for Max/Pro plans (`costUSD` is always 0). CC Hubber calculates costs from token counts using dynamic pricing from [LiteLLM](https://github.com/BerriAI/litellm), with hardcoded fallbacks.
84
+ Everything is local. CC Hubber reads files that already exist on your machine.
88
85
 
89
- This gives you an **equivalent API cost** — what you would pay on the API tier for the same usage. Useful for understanding relative consumption, not for billing disputes.
86
+ | Source | Path | What it contains |
87
+ |--------|------|-----------------|
88
+ | Conversations | `~/.claude/projects/*/` | Token counts per message, per model |
89
+ | Subagents | `~/.claude/projects/*/subagents/` | Haiku/Sonnet background agent usage |
90
+ | Session meta | `~/.claude/usage-data/session-meta/` | Duration, tool counts, lines changed |
91
+ | Cache breaks | `~/.claude/tmp/cache-break-*.diff` | Why your prompt cache broke |
92
+ | CLAUDE.md | `~/.claude/CLAUDE.md` + project-level | File sizes, section breakdown, cost per message |
93
+ | Rate limits | `~/.claude/.credentials.json` | Live 5-hour and 7-day utilization |
90
94
 
91
- ## Prior art
95
+ ## Compared to ccusage
92
96
 
93
- [ccusage](https://github.com/jikyo/ccusage) (12K+ stars) — token tracking and cost visualization
94
- - [Claude-Code-Usage-Monitor](https://github.com/nicobailon/Claude-Code-Usage-Monitor) — basic session tracking
97
+ [ccusage](https://github.com/ryoppippi/ccusage) (12K+ stars) is great for cost accounting. It tells you what you spent.
95
98
 
96
- CC Hubber focuses on **diagnosis** — cache health grading, inflection detection, cache break analysis — not just accounting. If ccusage tells you *what* you spent, CC Hubber tells you *why* and whether it's normal.
99
+ CC Hubber tells you why, and whether it's normal. Inflection detection, cache break estimation, model routing savings, session intelligence, trend-weighted grading. Different tools for different questions.
97
100
 
98
101
  ## License
99
102
 
@@ -101,4 +104,4 @@ MIT
101
104
 
102
105
  ## Credits
103
106
 
104
- Built by [@azkhh](https://x.com/asmirkn). Shipped with [Mover OS](https://moveros.dev).
107
+ Built by [@azkhh](https://x.com/asmirkn). Shipped fast with [Mover OS](https://moveros.dev).
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "cchubber",
3
- "version": "0.2.0",
3
+ "version": "0.3.1",
4
4
  "description": "What you spent. Why you spent it. Is that normal. — Claude Code usage diagnosis with beautiful HTML reports.",
5
5
  "type": "module",
6
6
  "bin": {
@@ -57,13 +57,20 @@ export function analyzeCacheHealth(statsCache, cacheBreaks, days, dailyFromJSONL
57
57
  // With cache: reads are $0.50/M
58
58
  const savingsFromCache = totalCacheRead / 1_000_000 * (5.0 - 0.50);
59
59
 
60
- // Cost wasted from cache breaks (rough estimate)
61
- // Each cache break forces a full re-read at write price ($6.25/M) instead of read price ($0.50/M)
62
- // Estimate ~200K tokens re-cached per break
63
- const wastedFromBreaks = totalBreaks * 200_000 / 1_000_000 * (6.25 - 0.50);
60
+ // Cost wasted from cache rewrites
61
+ // Cache writes happen when the cache is invalidated and cost 12.5x more than reads
62
+ // Use actual cache write tokens as the signal (more reliable than diff file count)
63
+ const wastedFromBreaks = totalBreaks > 0
64
+ ? totalBreaks * 200_000 / 1_000_000 * (6.25 - 0.50)
65
+ : totalCacheWrite / 1_000_000 * (6.25 - 0.50); // estimate from write tokens
66
+
67
+ // If no diff files but cache writes exist, estimate break count
68
+ // Each break re-caches ~200K-500K tokens on average
69
+ const estimatedBreaks = totalBreaks > 0 ? totalBreaks : Math.round(totalCacheWrite / 300_000);
64
70
 
65
71
  return {
66
72
  totalCacheBreaks: totalBreaks,
73
+ estimatedBreaks,
67
74
  reasonsRanked,
68
75
  cacheHitRate: Math.round(cacheHitRate * 10) / 10,
69
76
  efficiencyRatio,
@@ -1,7 +1,7 @@
1
1
  /**
2
2
  * Inflection Point Detection
3
- * Finds the sharpest change in cache efficiency ratio over time.
4
- * Outputs: "Your efficiency dropped 3.6x starting March 29. Before: 482:1. After: 1,726:1."
3
+ * Finds BOTH the worst degradation AND best improvement in cache efficiency.
4
+ * Prioritizes degradation — that's what users care about ("why is my usage draining?").
5
5
  */
6
6
  export function detectInflectionPoints(dailyFromJSONL) {
7
7
  if (!dailyFromJSONL || dailyFromJSONL.length < 5) return null;
@@ -12,10 +12,10 @@ export function detectInflectionPoints(dailyFromJSONL) {
12
12
 
13
13
  if (sorted.length < 5) return null;
14
14
 
15
- // Sliding window: compare the average ratio of days before vs after each point
16
- // Window size: at least 3 days on each side
17
15
  const minWindow = 3;
18
- let bestSplit = null;
16
+ let worstDegradation = null;
17
+ let worstScore = 0;
18
+ let bestImprovement = null;
19
19
  let bestScore = 0;
20
20
 
21
21
  for (let i = minWindow; i <= sorted.length - minWindow; i++) {
@@ -27,32 +27,44 @@ export function detectInflectionPoints(dailyFromJSONL) {
27
27
 
28
28
  if (beforeRatio === 0 || afterRatio === 0) continue;
29
29
 
30
- // Score = magnitude of change (either direction)
31
- const changeMultiplier = afterRatio > beforeRatio
32
- ? afterRatio / beforeRatio
33
- : beforeRatio / afterRatio;
34
-
35
- if (changeMultiplier > bestScore && changeMultiplier >= 1.5) {
36
- bestScore = changeMultiplier;
37
- bestSplit = {
38
- date: sorted[i].date,
39
- beforeRatio,
40
- afterRatio,
41
- multiplier: Math.round(changeMultiplier * 10) / 10,
42
- direction: afterRatio > beforeRatio ? 'worsened' : 'improved',
43
- beforeDays: before.length,
44
- afterDays: after.length,
45
- };
30
+ if (afterRatio > beforeRatio) {
31
+ // Degradation (ratio went UP = worse)
32
+ const mult = afterRatio / beforeRatio;
33
+ if (mult > worstScore && mult >= 1.5) {
34
+ worstScore = mult;
35
+ worstDegradation = buildResult(sorted[i].date, beforeRatio, afterRatio, mult, 'worsened', before.length, after.length);
36
+ }
37
+ } else {
38
+ // Improvement (ratio went DOWN = better)
39
+ const mult = beforeRatio / afterRatio;
40
+ if (mult > bestScore && mult >= 1.5) {
41
+ bestScore = mult;
42
+ bestImprovement = buildResult(sorted[i].date, beforeRatio, afterRatio, mult, 'improved', before.length, after.length);
43
+ }
46
44
  }
47
45
  }
48
46
 
49
- if (!bestSplit) return null;
47
+ // Return degradation as primary (that's the problem), improvement as secondary
48
+ const primary = worstDegradation || bestImprovement;
49
+ if (!primary) return null;
50
50
 
51
- // Build human-readable summary
52
- const dirLabel = bestSplit.direction === 'worsened' ? 'dropped' : 'improved';
53
- bestSplit.summary = `Your cache efficiency ${dirLabel} ${bestSplit.multiplier}x starting ${formatDate(bestSplit.date)}. Before: ${bestSplit.beforeRatio.toLocaleString()}:1. After: ${bestSplit.afterRatio.toLocaleString()}:1.`;
51
+ primary.secondary = worstDegradation ? bestImprovement : null;
52
+ return primary;
53
+ }
54
54
 
55
- return bestSplit;
55
+ function buildResult(date, beforeRatio, afterRatio, multiplier, direction, beforeDays, afterDays) {
56
+ const mult = Math.round(multiplier * 10) / 10;
57
+ const dirLabel = direction === 'worsened' ? 'dropped' : 'improved';
58
+ return {
59
+ date,
60
+ beforeRatio,
61
+ afterRatio,
62
+ multiplier: mult,
63
+ direction,
64
+ beforeDays,
65
+ afterDays,
66
+ summary: `Your cache efficiency ${dirLabel} ${mult}x starting ${formatDate(date)}. Before: ${beforeRatio.toLocaleString()}:1. After: ${afterRatio.toLocaleString()}:1.`,
67
+ };
56
68
  }
57
69
 
58
70
  function computeRatio(days) {
@@ -1,117 +1,111 @@
1
1
  /**
2
2
  * Recommendations Engine
3
- * Generates actionable recommendations informed by community data (March 2026 crisis).
4
- * Every recommendation maps to a real pattern reported by users on GitHub/Twitter/Reddit.
3
+ * Each recommendation includes estimated usage % savings.
4
+ * Informed by community data from the March 2026 Claude Code crisis.
5
5
  */
6
6
  export function generateRecommendations(costAnalysis, cacheHealth, claudeMdStack, anomalies, inflection, sessionIntel, modelRouting) {
7
7
  const recs = [];
8
+ const totalCost = costAnalysis.totalCost || 1;
8
9
 
9
- // 0. Inflection point — most important signal
10
+ // 0. Inflection point
10
11
  if (inflection && inflection.direction === 'worsened' && inflection.multiplier >= 2) {
11
12
  recs.push({
12
13
  severity: 'critical',
13
14
  title: `Cache efficiency dropped ${inflection.multiplier}x on ${inflection.date}`,
14
- detail: inflection.summary,
15
- action: 'Run: claude update. Versions 2.1.69-2.1.89 had a cache sentinel bug that dropped read rates from 95% to 4-17%. Fixed in v2.1.90.',
15
+ savings: '~40-60% usage reduction after fix',
16
+ action: 'Run: claude update. v2.1.69-2.1.89 had cache bugs. Fixed in v2.1.90.',
16
17
  });
17
18
  } else if (inflection && inflection.direction === 'improved' && inflection.multiplier >= 2) {
18
19
  recs.push({
19
20
  severity: 'positive',
20
21
  title: `Efficiency improved ${inflection.multiplier}x on ${inflection.date}`,
21
- detail: inflection.summary,
22
- action: 'Your cache efficiency improved here. Likely a version update or workflow change that stuck.',
22
+ savings: 'Already saving',
23
+ action: 'Your cache efficiency improved. Likely a version update or workflow change.',
23
24
  });
24
25
  }
25
26
 
26
- // 1. CLAUDE.md bloat — community-reported 10-20x cost multiplier
27
- if (claudeMdStack.totalTokensEstimate > 8000) {
28
- const dailyCost = claudeMdStack.costPerMessage?.dailyCached200;
29
- recs.push({
30
- severity: claudeMdStack.totalTokensEstimate > 15000 ? 'critical' : 'warning',
31
- title: `CLAUDE.md is ${Math.round(claudeMdStack.totalTokensEstimate / 1000)}K tokens`,
32
- detail: `Re-read on every turn. Community best practice: keep under 200 lines (~4K tokens). Yours costs ~$${dailyCost ? dailyCost.toFixed(2) : '?'}/day at 200 messages. Each cache break re-reads at 12.5x the cached price.`,
33
- action: 'Move rarely-used rules to project-level files. Use skills/hooks instead of inline instructions. Every 1K tokens removed saves ~$0.50/day.',
34
- });
35
- }
27
+ // 1. Model routing — biggest actionable saving for most users
28
+ const modelCosts = costAnalysis.modelCosts || {};
29
+ const totalModelCost = Object.values(modelCosts).reduce((s, c) => s + c, 0);
30
+ const opusCost = Object.entries(modelCosts).filter(([n]) => n.toLowerCase().includes('opus')).reduce((s, [, c]) => s + c, 0);
31
+ const opusPct = totalModelCost > 0 ? Math.round((opusCost / totalModelCost) * 100) : 0;
36
32
 
37
- // 2. Version check — the #1 fix reported by community
38
- if (cacheHealth.efficiencyRatio > 1500 || (inflection && inflection.direction === 'worsened')) {
33
+ if (opusPct > 80) {
34
+ const savingsPct = Math.round(opusPct * 0.4 * 0.8); // 40% of Opus routable, 80% cheaper
39
35
  recs.push({
40
- severity: 'critical',
41
- title: 'Update Claude Code to v2.1.90+',
42
- detail: 'Versions 2.1.69-2.1.89 had three cache bugs: sentinel replacement error, --resume cache miss, and nested CLAUDE.md re-injection. Community-verified: usage dropped from 80-100% to 5-7% of Max quota after updating.',
43
- action: 'Run: claude update. If already on latest, start a fresh session — the fix only applies to new sessions.',
36
+ severity: 'warning',
37
+ title: `${opusPct}% usage is Opus — route subagents to Sonnet`,
38
+ savings: `~${savingsPct}% usage reduction`,
39
+ action: `Set model: "sonnet" on Task/subagent calls. Sonnet handles search, file reads, docs, and simple edits at same quality. Community-verified: limits lasted 3-5x longer.`,
44
40
  });
45
41
  }
46
42
 
47
- // 3. Cache break analysis
48
- if (cacheHealth.totalCacheBreaks > 10) {
49
- const topReason = cacheHealth.reasonsRanked[0];
43
+ // 2. CLAUDE.md bloat
44
+ if (claudeMdStack.totalTokensEstimate > 8000) {
45
+ const excessK = Math.round((claudeMdStack.totalTokensEstimate - 4000) / 1000);
50
46
  recs.push({
51
- severity: cacheHealth.totalCacheBreaks > 50 ? 'critical' : 'warning',
52
- title: `${cacheHealth.totalCacheBreaks} cache invalidations`,
53
- detail: `Each break forces a full prompt re-read at write prices (12.5x cache read cost). ${topReason ? `Top cause: "${topReason.reason}" (${topReason.count}x, ${topReason.percentage}%).` : ''}`,
54
- action: topReason?.reason === 'Tool schemas changed'
55
- ? 'Reduce MCP server connections. Each tool schema change breaks the cache prefix. Disconnect tools you\'re not actively using.'
56
- : topReason?.reason === 'System prompt changed'
57
- ? 'Stop editing CLAUDE.md mid-session. Batch rule changes between sessions.'
58
- : 'Review ~/.claude/tmp/cache-break-*.diff for exact invalidation causes.',
47
+ severity: claudeMdStack.totalTokensEstimate > 15000 ? 'critical' : 'warning',
48
+ title: `CLAUDE.md is ${Math.round(claudeMdStack.totalTokensEstimate / 1000)}K tokens — trim to <4K`,
49
+ savings: `saves ~${excessK}K tokens/msg`,
50
+ action: 'Re-read on every turn. Move rarely-used rules to project files. Use skills/hooks instead of inline instructions. Community target: under 200 lines.',
59
51
  });
60
52
  }
61
53
 
62
- // 4. High cache:output ratio
63
- if (cacheHealth.efficiencyRatio > 2000) {
64
- recs.push({
65
- severity: 'critical',
66
- title: `Cache ratio ${cacheHealth.efficiencyRatio.toLocaleString()}:1 — abnormally high`,
67
- detail: `Healthy range: 300-800:1. You\'re at ${cacheHealth.efficiencyRatio.toLocaleString()}:1 — every output token costs ${cacheHealth.efficiencyRatio.toLocaleString()} cache read tokens. This pattern matches the March 2026 cache bug reported by thousands of users.`,
68
- action: 'Immediate fix: update to v2.1.90+. If already updated, avoid --resume flag and start fresh sessions per task.',
69
- });
70
- } else if (cacheHealth.efficiencyRatio > 1000) {
54
+ // 3. Compaction frequency — community's #1 session management tip
55
+ if (sessionIntel?.available && sessionIntel.avgToolsPerSession > 25) {
71
56
  recs.push({
72
57
  severity: 'warning',
73
- title: `Cache ratio ${cacheHealth.efficiencyRatio.toLocaleString()}:1 — elevated`,
74
- detail: 'Not critical, but above the 300-800 healthy range. Common causes: large codebase exploration, many file reads without /compact, or stale sessions.',
75
- action: 'Use /compact every 30-40 tool calls. Start fresh sessions for each distinct task.',
58
+ title: `Avg ${sessionIntel.avgToolsPerSession} tool calls/session — compact more often`,
59
+ savings: '~15-25% usage reduction',
60
+ action: 'Use /compact every 30-40 tool calls. Context bloat compounds each message re-reads the full history. Community tip: compacting at 40 calls saves 20%+ on long sessions.',
76
61
  });
77
62
  }
78
63
 
79
- // 5. Opus dominance community tip: Sonnet handles 60%+ of tasks at 1/5 cost
80
- const modelCosts = costAnalysis.modelCosts || {};
81
- const totalModelCost = Object.values(modelCosts).reduce((s, c) => s + c, 0);
82
- const opusCost = Object.entries(modelCosts).filter(([n]) => n.toLowerCase().includes('opus')).reduce((s, [, c]) => s + c, 0);
83
- const opusPct = totalModelCost > 0 ? Math.round((opusCost / totalModelCost) * 100) : 0;
84
-
85
- if (opusPct > 85) {
86
- const savings = modelRouting?.estimatedSavings || Math.round(opusCost * 0.16);
64
+ // 4. Fresh sessions per task
65
+ if (sessionIntel?.available && sessionIntel.longSessionPct > 30) {
87
66
  recs.push({
88
67
  severity: 'warning',
89
- title: `${opusPct}% of spend is Opus`,
90
- detail: `Opus costs 5x more than Sonnet per token. Sonnet 4.6 handles file reads, search, simple edits, and subagent work at the same quality. Community tip: switching routine tasks to Sonnet dropped quota usage by 60-80%.`,
91
- action: `Set model: "sonnet" on subagent/Task calls. Estimated savings: ~$${savings.toLocaleString()}. Reserve Opus for complex reasoning only.`,
68
+ title: `${sessionIntel.longSessionPct}% of sessions over 60 min — start fresh more often`,
69
+ savings: '~10-20% usage reduction',
70
+ action: `One task, one session. Your p90 is ${sessionIntel.p90Duration}min, longest ${sessionIntel.maxDuration}min. Starting fresh resets context and maximizes cache hits. Cheaper than a bloated session.`,
92
71
  });
93
72
  }
94
73
 
95
- // 6. Session length — community-reported: sessions >60 min degrade heavily
96
- if (sessionIntel?.available && sessionIntel.longSessionPct > 30) {
74
+ // 5. Cache ratio warning
75
+ if (cacheHealth.efficiencyRatio > 1500) {
97
76
  recs.push({
98
- severity: 'warning',
99
- title: `${sessionIntel.longSessionPct}% of sessions exceed 60 minutes`,
100
- detail: `Long sessions accumulate context that degrades cache efficiency and response quality. Your median: ${sessionIntel.medianDuration}min, p90: ${sessionIntel.p90Duration}min, longest: ${sessionIntel.maxDuration}min.`,
101
- action: 'One task, one session. Use /compact for exploration, fresh session for each bug fix or feature. The cost of starting fresh is less than the cost of a bloated context.',
77
+ severity: 'critical',
78
+ title: `Cache ratio ${cacheHealth.efficiencyRatio.toLocaleString()}:1 — update Claude Code`,
79
+ savings: '~40-60% usage reduction',
80
+ action: 'Run: claude update. v2.1.89 had cache bugs that inflated ratios 10-20x. Community-verified: v2.1.90 dropped usage from 80-100% to 5-7% of Max quota.',
81
+ });
82
+ } else if (cacheHealth.efficiencyRatio > 800) {
83
+ recs.push({
84
+ severity: 'info',
85
+ title: `Cache ratio ${cacheHealth.efficiencyRatio.toLocaleString()}:1 — slightly elevated`,
86
+ savings: '~5-10% with optimization',
87
+ action: 'Healthy range: 300-800:1. Reduce by compacting more often, starting fresh sessions, and avoiding --resume on older CC versions.',
102
88
  });
103
89
  }
104
90
 
105
- // 7. Peak hour overlap — community-reported: 5am-11am PT has throttled limits
91
+ // 6. Peak hour overlap
106
92
  if (sessionIntel?.available && sessionIntel.peakOverlapPct > 40) {
107
93
  recs.push({
108
94
  severity: 'info',
109
- title: `${sessionIntel.peakOverlapPct}% of your work hits throttled hours`,
110
- detail: 'Anthropic reduces 5-hour session limits during weekday peak hours (5am-11am PT / 12pm-6pm UTC). ~7% of users hit limits they wouldn\'t otherwise.',
111
- action: 'Shift token-heavy work (refactors, test generation, codebase exploration) to off-peak hours. Session limits are unchanged — only the 5-hour window shrinks.',
95
+ title: `${sessionIntel.peakOverlapPct}% of work during throttled hours`,
96
+ savings: '~30% longer session limits',
97
+ action: 'Anthropic throttles 5-hour limits during 5am-11am PT weekdays. Shift heavy work (refactors, test gen) to off-peak for 30%+ longer limits.',
112
98
  });
113
99
  }
114
100
 
101
+ // 7. .claudeignore — prevents reading node_modules and other build artifacts
102
+ recs.push({
103
+ severity: 'info',
104
+ title: 'Create .claudeignore to exclude build artifacts',
105
+ savings: '~5-10% per context load',
106
+ action: 'Prevents CC from reading node_modules/, dist/, *.lock, __pycache__/. Each context load scans your project tree — excluding junk saves tokens every turn.',
107
+ });
108
+
115
109
  // 8. Cost anomalies
116
110
  if (anomalies.hasAnomalies) {
117
111
  const spikes = anomalies.anomalies.filter(a => a.type === 'spike');
@@ -119,23 +113,50 @@ export function generateRecommendations(costAnalysis, cacheHealth, claudeMdStack
119
113
  const worst = spikes[0];
120
114
  recs.push({
121
115
  severity: worst.severity,
122
- title: `${spikes.length} cost spike${spikes.length > 1 ? 's' : ''} — worst: $${worst.cost.toFixed(0)} on ${worst.date}`,
123
- detail: `+$${worst.deviation.toFixed(0)} above your $${worst.avgCost.toFixed(0)} daily average.${worst.cacheRatioAnomaly ? ' Cache ratio was also anomalous — strongly suggests cache bug.' : ''} GitHub #38029 documents a bug where a single session generated 652K phantom output tokens ($342).`,
124
- action: 'Monitor the first 1-2 messages of each session. If a single message burns 3-5% of your quota, restart immediately.',
116
+ title: `${spikes.length} cost spike${spikes.length > 1 ? 's' : ''} — worst $${worst.cost.toFixed(0)} on ${worst.date}`,
117
+ savings: 'Preventable with monitoring',
118
+ action: 'Watch the first 1-2 messages of each session. If a single message burns 3-5% of quota, restart immediately. GitHub #38029 documents phantom 652K output token bugs.',
125
119
  });
126
120
  }
127
121
  }
128
122
 
129
- // 9. Positive: cache savings
123
+ // 9. Avoid --resume on older versions
124
+ if (cacheHealth.efficiencyRatio > 600) {
125
+ recs.push({
126
+ severity: 'info',
127
+ title: 'Avoid --resume and --continue flags',
128
+ savings: '~$0.15 saved per resume',
129
+ action: 'These flags caused full prompt-cache misses in v2.1.69-2.1.89 (~$0.15 per resume on 500K context). Fixed in v2.1.90. Copy your last message and start fresh instead.',
130
+ });
131
+ }
132
+
133
+ // 10. Specific prompt discipline
134
+ recs.push({
135
+ severity: 'info',
136
+ title: 'Be specific in prompts — reduces tokens up to 10x',
137
+ savings: '~20-40% usage reduction',
138
+ action: 'Instead of "fix the auth bug", say "fix JWT validation in src/auth/validate.ts line 42". Specific prompts avoid codebase-wide scans. Community-verified: 10x reduction per prompt.',
139
+ });
140
+
141
+ // 11. Disconnect unused MCP tools
142
+ if (sessionIntel?.available && sessionIntel.topTools.some(t => t.name.includes('mcp__'))) {
143
+ recs.push({
144
+ severity: 'info',
145
+ title: 'Disconnect unused MCP servers',
146
+ savings: '~5-15% per cache break avoided',
147
+ action: 'Each MCP tool schema change invalidates the prompt cache. Only connect servers you actively need. Disconnect the rest between sessions.',
148
+ });
149
+ }
150
+
151
+ // 12. Cache savings (positive)
130
152
  if (cacheHealth.savings?.fromCaching > 100) {
131
153
  recs.push({
132
154
  severity: 'positive',
133
- title: `Cache saved you ~$${cacheHealth.savings.fromCaching.toLocaleString()}`,
134
- detail: 'Without prompt caching, standard input pricing would have applied to all cache reads. The system is working — optimization is about reducing breaks.',
135
- action: 'Keep sessions alive to maximize hits. Avoid mid-session CLAUDE.md edits and MCP tool changes.',
155
+ title: `Cache saved ~$${cacheHealth.savings.fromCaching.toLocaleString()} in equivalent API costs`,
156
+ savings: 'Working as intended',
157
+ action: 'Prompt caching is saving you significantly. Keep sessions alive, avoid mid-session CLAUDE.md edits and MCP tool changes to maximize hits.',
136
158
  });
137
159
  }
138
160
 
139
- // Cap at 5 most impactful recommendations
140
- return recs.slice(0, 5);
161
+ return recs.slice(0, 8);
141
162
  }
package/src/cli/index.js CHANGED
@@ -39,7 +39,7 @@ const flags = {
39
39
  if (flags.help) {
40
40
  console.log(`
41
41
  ╔═══════════════════════════════════════════════╗
42
- ║ CC Hubber v0.1.0
42
+ ║ CC Hubber v0.3.1 ║
43
43
  ║ What you spent. Why you spent it. Is that ║
44
44
  ║ normal. ║
45
45
  ╚═══════════════════════════════════════════════╝
@@ -74,7 +74,7 @@ async function main() {
74
74
  process.exit(1);
75
75
  }
76
76
 
77
- console.log('\n CC Hubber v0.1.0');
77
+ console.log('\n CC Hubber v0.3.1');
78
78
  console.log(' ─────────────────────────────');
79
79
  console.log(' Reading local Claude Code data...\n');
80
80
 
@@ -28,10 +28,9 @@ export function readClaudeMdStack(claudeDir) {
28
28
  }
29
29
  if (currentSection.lines > 0) sections.push(currentSection);
30
30
 
31
- // Add token estimates and sort by size
31
+ // Add token estimates, keep original file order (don't sort)
32
32
  globalSections = sections
33
- .map(s => ({ ...s, tokens: Math.round(s.bytes / 4) }))
34
- .sort((a, b) => b.bytes - a.bytes);
33
+ .map((s, idx) => ({ ...s, tokens: Math.round(s.bytes / 4), order: idx }));
35
34
 
36
35
  stack.push({
37
36
  level: 'global',
@@ -36,20 +36,37 @@ function readProjectsDir(dir, entries) {
36
36
  for (const hash of projectHashes) {
37
37
  const projectDir = join(dir, hash);
38
38
 
39
- // Read top-level JSONL files only (one per session).
40
- // Subagent files in <session>/subagents/ are NOT read for cost —
41
- // parent session JSONL already includes subagent token billing.
42
- // Reading both would double-count (confirmed: $5.7K → $10.8K).
39
+ // Read top-level JSONL files (one per session)
43
40
  const jsonlFiles = readdirSync(projectDir).filter(f => f.endsWith('.jsonl'));
44
41
  for (const file of jsonlFiles) {
45
42
  readJsonlFile(join(projectDir, file), basename(file, '.jsonl'), hash, entries);
46
43
  }
44
+
45
+ // Read subagent JSONL files (for Haiku/Sonnet model attribution)
46
+ // Dedup by message ID prevents double-counting
47
+ const subdirs = readdirSync(projectDir).filter(f => {
48
+ try { return statSync(join(projectDir, f)).isDirectory(); } catch { return false; }
49
+ });
50
+ for (const subdir of subdirs) {
51
+ const subagentDir = join(projectDir, subdir, 'subagents');
52
+ if (existsSync(subagentDir)) {
53
+ try {
54
+ const subFiles = readdirSync(subagentDir).filter(f => f.endsWith('.jsonl'));
55
+ for (const file of subFiles) {
56
+ readJsonlFile(join(subagentDir, file), basename(file, '.jsonl'), hash, entries);
57
+ }
58
+ } catch { /* skip */ }
59
+ }
60
+ }
47
61
  }
48
62
  } catch {
49
63
  // Directory read failed
50
64
  }
51
65
  }
52
66
 
67
+ // Track seen message IDs to deduplicate (JSONL files contain dupes from session resume)
68
+ const seenMessageIds = new Set();
69
+
53
70
  function readJsonlFile(filePath, sessionId, projectHash, entries) {
54
71
  try {
55
72
  const raw = readFileSync(filePath, 'utf-8');
@@ -65,6 +82,13 @@ function readJsonlFile(filePath, sessionId, projectHash, entries) {
65
82
  const usage = record.message?.usage;
66
83
  if (!usage) continue;
67
84
 
85
+ // Deduplicate by message ID — JSONL files contain duplicates from session resume
86
+ const msgId = record.message?.id;
87
+ if (msgId) {
88
+ if (seenMessageIds.has(msgId)) continue;
89
+ seenMessageIds.add(msgId);
90
+ }
91
+
68
92
  entries.push({
69
93
  sessionId,
70
94
  projectHash,