cchubber 0.2.0 → 0.3.1

package/README.md CHANGED
@@ -1,24 +1,36 @@
1
1
  # CC Hubber
2
2
 
3
- **What you spent. Why you spent it. Is that normal.**
3
+ Your Claude Code usage, diagnosed. One command.
4
4
 
5
- Offline CLI that reads your local Claude Code data and generates a diagnostic HTML report. No API keys. No telemetry. Everything stays on your machine.
5
+ ```bash
6
+ npx cchubber
7
+ ```
8
+
9
+ Reads your local data, generates an HTML report. No API keys, no telemetry, nothing leaves your machine.
10
+
11
+ Built during the March 2026 cache crisis because nobody could tell if they'd been hit. Thousands of users burning through limits 10-20x faster than normal, and Anthropic's only answer was "we're investigating." We wanted receipts.
12
+
13
+ ## What you get
6
14
 
7
- Built because Claude Code users had zero visibility into the [March 2026 cache bug](https://github.com/anthropics/claude-code/issues/41930) that silently inflated costs by 10-20x. Your `$100 plan` shouldn't feel like a `$20 plan`.
15
+ A single HTML report that tells you three things: what you spent, why you spent it, and whether that's normal.
8
16
 
9
- ![CC Hubber Report](https://raw.githubusercontent.com/azkhh/cchubber/master/screenshot.png)
17
+ **The diagnosis:**
18
+ - Cache health grade (trend-weighted, recent 7 days count more)
19
+ - Inflection point detection: "Your efficiency dropped 3.2x starting March 17"
20
+ - Per-project cost breakdown with decoded project names
21
+ - Session intelligence: duration stats, tool usage, activity heatmap
22
+ - Model routing analysis (93% Opus? Your limits would last 3x longer on Sonnet)
23
+ - 8 actionable recommendations, each with estimated usage savings
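The trend-weighted grade can be sketched like this. The weighting is illustrative only (recent days counted double, weighted hit rate mapped to a letter); cchubber's actual formula is an assumption here, not confirmed by the source.

```javascript
// Hypothetical trend-weighted grading: the last 7 days count double.
// `days` is oldest-first: [{ date: 'YYYY-MM-DD', hitRate: 0..100 }]
function gradeCacheHealth(days) {
  const recent = days.slice(-7);
  const older = days.slice(0, -7);
  const sum = (xs) => xs.reduce((s, d) => s + d.hitRate, 0);
  const weighted =
    (sum(older) + 2 * sum(recent)) / (older.length + 2 * recent.length || 1);
  if (weighted >= 90) return 'A';
  if (weighted >= 75) return 'B';
  if (weighted >= 60) return 'C';
  if (weighted >= 40) return 'D';
  return 'F';
}
```

Because recent days dominate the average, a run of bad days after the cache bug pulls the grade down even if the month started healthy.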
10
24
 
11
- ## What it does
25
+ **The data:**
26
+ - Cost calculated from actual token counts (LiteLLM pricing, not the broken `costUSD` field)
27
+ - Message-level deduplication (Claude Code JSONL files contain ~50% duplicates from session resume)
28
+ - Subagent visibility: Haiku and Sonnet background agents show up in model distribution
29
+ - CLAUDE.md section-by-section analysis with per-message cost impact
30
+ - Cache break estimation even when diff files don't exist on your CC version
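The deduplication step above can be sketched in a few lines, assuming each parsed JSONL record exposes a `message.id` (the field name follows Claude Code's session files; treat it as an assumption):

```javascript
// Drop repeated messages: session resume rewrites earlier records into the
// new JSONL file, so the same message ID can appear in multiple files.
function dedupeByMessageId(records) {
  const seen = new Set();
  return records.filter((r) => {
    const id = r.message?.id;
    if (!id) return true;           // keep records without an ID
    if (seen.has(id)) return false; // duplicate from a resume
    seen.add(id);
    return true;
  });
}
```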
12
31
 
13
- - **Cost breakdown** — Per-day, per-model, per-project cost calculated from your actual token counts
14
- **Cache health grade** — Trend-weighted (recent 7 days dominate). If you hit the cache bug, you'll see D/F, not a misleading A
15
- - **Inflection point detection** — "Your efficiency dropped 4.7x starting March 29. Before: 360:1. After: 1,676:1."
16
- - **Anomaly detection** — Flags days where your cost/ratio deviates >2 standard deviations
17
- - **Cache break analysis** — Reads `~/.claude/tmp/cache-break-*.diff` files. Shows why your cache broke and how often
18
- - **CLAUDE.md cost analysis** — How much your rules files cost per message (cached vs uncached)
19
- - **Per-project breakdown** — Which project is eating your budget
20
- - **Live rate limits** — 5-hour and 7-day utilization (if OAuth token available)
21
- - **Shareable card** — Export your report as a PNG
32
+ **The shareable card:**
33
+ An animated card with your grade, spend, cache ratio, and diagnosis line. Export as video. Post it. Let people see the numbers Anthropic won't show them.
22
34
 
23
35
  ## Install
24
36
 
@@ -26,74 +38,65 @@ Built because Claude Code users had zero visibility into the [March 2026 cache b
26
38
  npx cchubber
27
39
  ```
28
40
 
29
- Or install globally:
41
+ Or globally:
30
42
 
31
43
  ```bash
32
44
  npm install -g cchubber
33
45
  cchubber
34
46
  ```
35
47
 
36
- Requires Node.js 18+. Runs on macOS, Windows, and Linux.
48
+ Node.js 18+. Works on macOS, Windows, Linux.
37
49
 
38
- ## Usage
50
+ ## The cache bug (March 2026)
39
51
 
40
- ```bash
41
- cchubber # Scan and open HTML report
42
- cchubber --days 7 # Default view: last 7 days
43
- cchubber -o report.html # Custom output path
44
- cchubber --no-open # Don't auto-open in browser
45
- cchubber --json # Machine-readable JSON output
46
- ```
47
-
48
- ## What it reads
52
+ Between v2.1.69 and v2.1.89, five things broke at once:
49
53
 
50
- All data is local. Nothing leaves your machine.
54
+ 1. A sentinel replacement bug in Anthropic's custom Bun fork dropped cache read rates from 95% to 4-17%
55
+ 2. The `--resume` flag caused full prompt-cache misses on every single resume
56
+ 3. One session generated 652,069 output tokens with zero user input ($342 gone)
57
+ 4. Peak-hour throttling kicked in for 7% of users without announcement
58
+ 5. A 2x off-peak promotion expired, making the baseline feel like a cut
51
59
 
52
- | Source | Path | What |
53
- |--------|------|------|
54
- | JSONL conversations | `~/.claude/projects/*/` | Token counts per message, per model, per session |
55
- | Stats cache | `~/.claude/stats-cache.json` | Pre-aggregated daily totals |
56
- | Session meta | `~/.claude/usage-data/session-meta/` | Duration, tool counts, lines changed |
57
- | Cache breaks | `~/.claude/tmp/cache-break-*.diff` | Why your prompt cache invalidated |
58
- | CLAUDE.md stack | `~/.claude/CLAUDE.md`, project-level | File sizes and per-message cost impact |
59
- | OAuth usage | `~/.claude/.credentials.json` | Live rate limit utilization |
60
-
61
- ## The March 2026 cache bug
60
+ v2.1.90 fixes most of these. Run `claude update`.
62
61
 
63
- Between v2.1.69 and v2.1.89, multiple bugs caused Claude Code's prompt cache to silently fail:
62
+ CC Hubber shows you whether you were affected. If your report has a sharp inflection point around mid-March, that's probably when it hit you.
64
63
 
65
- - A sentinel replacement bug in the Bun fork dropped cache read rates from ~95% to 4-17%
66
- - The `--resume` flag caused full prompt-cache misses on every resume
67
- - One session generated 652,069 output tokens with no user input — $342 on a single session
64
+ ## What the community figured out
68
65
 
69
- **v2.1.90 fixes most of these.** Update immediately: `claude update`
66
+ These tips came from GitHub issues, Reddit threads, and Twitter during the crisis. CC Hubber's recommendations are based on this data.
70
67
 
71
- CC Hubber detects whether you were affected by showing your cache efficiency trend over time. If you see a sharp inflection point, that's probably when it hit you.
68
+ - Start a fresh session for each task. Long sessions bleed tokens.
69
+ - Route subagents to Sonnet (`model: "sonnet"` on Task calls). Same quality, 5x cheaper per token.
70
+ - Keep your CLAUDE.md under 200 lines. It gets re-read on every message. 12K tokens at 200 messages/day costs $1.23/day cached.
71
+ - Run `/compact` every 30-40 tool calls. Context bloat compounds.
72
+ - Create a `.claudeignore` file. Exclude `node_modules/`, `dist/`, `*.lock`. Saves tokens on every context load.
73
+ - Avoid `--resume` on older versions. Fixed in v2.1.90.
74
+ - Shift heavy work (refactors, test generation) outside 5am-11am PT. That's when Anthropic throttles session limits.
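A starting point for that ignore file, using the paths the recommendations above call out (exact `.claudeignore` semantics may vary by Claude Code version):

```
# Keep bulky, low-value paths out of every context load
node_modules/
dist/
__pycache__/
*.lock
```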
72
75
 
73
- ## Best practices (from the community)
76
+ ## How the cost works
74
77
 
75
- These tips surfaced during the March crisis. CC Hubber helps you verify whether they're working:
78
+ Claude Code doesn't show costs for Max and Pro plans (`costUSD` is always 0). CC Hubber calculates equivalent API cost from your token counts using LiteLLM's pricing data.
76
79
 
77
- - **Start fresh sessions per task** — don't try to extend long sessions
78
- - **Avoid `--resume` on older versions** — fixed in v2.1.90
79
- - **Switch to Sonnet 4.6 for routine work** — same quality, fraction of the quota
80
- - **Keep CLAUDE.md under 200 lines** — it's re-read on every message
81
- - **Use `/compact` every 30-40 tool calls** — prevents context bloat
82
- - **Create `.claudeignore`** — exclude `node_modules/`, `dist/`, `*.lock`
83
- - **Shift heavy work to off-peak hours** — outside 5am-11am PT weekdays
80
+ The number you see is what you'd pay on the API tier for the same usage. Useful for comparing consumption across days and projects. Not a billing statement.
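In sketch form, the equivalent-API-cost math looks like this. The input, cache read, and cache write prices below match the per-million figures quoted elsewhere in this package ($5.00, $0.50, $6.25); the output price is a placeholder assumption, since real prices come from LiteLLM at runtime:

```javascript
// Illustrative per-million-token prices; `output` is an assumed placeholder.
const PRICE_PER_M = { input: 5.0, output: 25.0, cacheRead: 0.5, cacheWrite: 6.25 };

// Equivalent API cost for one aggregated bucket of token counts.
function equivalentApiCost({ input = 0, output = 0, cacheRead = 0, cacheWrite = 0 }) {
  return (
    (input / 1e6) * PRICE_PER_M.input +
    (output / 1e6) * PRICE_PER_M.output +
    (cacheRead / 1e6) * PRICE_PER_M.cacheRead +
    (cacheWrite / 1e6) * PRICE_PER_M.cacheWrite
  );
}
```

Summing this per day or per project gives the comparable numbers in the report; it is deliberately not a billing figure.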
84
81
 
85
- ## How cost is calculated
82
+ ## Data sources
86
83
 
87
- Claude Code doesn't report costs for Max/Pro plans (`costUSD` is always 0). CC Hubber calculates costs from token counts using dynamic pricing from [LiteLLM](https://github.com/BerriAI/litellm), with hardcoded fallbacks.
84
+ Everything is local. CC Hubber reads files that already exist on your machine.
88
85
 
89
- This gives you an **equivalent API cost** — what you would pay on the API tier for the same usage. Useful for understanding relative consumption, not for billing disputes.
86
+ | Source | Path | What it contains |
87
+ |--------|------|-----------------|
88
+ | Conversations | `~/.claude/projects/*/` | Token counts per message, per model |
89
+ | Subagents | `~/.claude/projects/*/subagents/` | Haiku/Sonnet background agent usage |
90
+ | Session meta | `~/.claude/usage-data/session-meta/` | Duration, tool counts, lines changed |
91
+ | Cache breaks | `~/.claude/tmp/cache-break-*.diff` | Why your prompt cache broke |
92
+ | CLAUDE.md | `~/.claude/CLAUDE.md` + project-level | File sizes, section breakdown, cost per message |
93
+ | Rate limits | `~/.claude/.credentials.json` | Live 5-hour and 7-day utilization |
90
94
 
91
- ## Prior art
95
+ ## Compared to ccusage
92
96
 
93
- [ccusage](https://github.com/jikyo/ccusage) (12K+ stars) — token tracking and cost visualization
94
- - [Claude-Code-Usage-Monitor](https://github.com/nicobailon/Claude-Code-Usage-Monitor) — basic session tracking
97
+ [ccusage](https://github.com/ryoppippi/ccusage) (12K+ stars) is great for cost accounting. It tells you what you spent.
95
98
 
96
- CC Hubber focuses on **diagnosis** — cache health grading, inflection detection, cache break analysis — not just accounting. If ccusage tells you *what* you spent, CC Hubber tells you *why* and whether it's normal.
99
+ CC Hubber tells you why, and whether it's normal. Inflection detection, cache break estimation, model routing savings, session intelligence, trend-weighted grading. Different tools for different questions.
97
100
 
98
101
  ## License
99
102
 
@@ -101,4 +104,4 @@ MIT
101
104
 
102
105
  ## Credits
103
106
 
104
- Built by [@azkhh](https://x.com/asmirkn). Shipped with [Mover OS](https://moveros.dev).
107
+ Built by [@azkhh](https://x.com/asmirkn). Shipped fast with [Mover OS](https://moveros.dev).
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "cchubber",
3
- "version": "0.2.0",
3
+ "version": "0.3.1",
4
4
  "description": "What you spent. Why you spent it. Is that normal. — Claude Code usage diagnosis with beautiful HTML reports.",
5
5
  "type": "module",
6
6
  "bin": {
@@ -57,13 +57,20 @@ export function analyzeCacheHealth(statsCache, cacheBreaks, days, dailyFromJSONL
57
57
  // With cache: reads are $0.50/M
58
58
  const savingsFromCache = totalCacheRead / 1_000_000 * (5.0 - 0.50);
59
59
 
60
- // Cost wasted from cache breaks (rough estimate)
61
- // Each cache break forces a full re-read at write price ($6.25/M) instead of read price ($0.50/M)
62
- // Estimate ~200K tokens re-cached per break
63
- const wastedFromBreaks = totalBreaks * 200_000 / 1_000_000 * (6.25 - 0.50);
60
+ // Cost wasted from cache rewrites
61
+ // Cache writes happen when the cache is invalidated and cost 12.5x more than reads
62
+ // Use actual cache write tokens as the signal (more reliable than diff file count)
63
+ const wastedFromBreaks = totalBreaks > 0
64
+ ? totalBreaks * 200_000 / 1_000_000 * (6.25 - 0.50)
65
+ : totalCacheWrite / 1_000_000 * (6.25 - 0.50); // estimate from write tokens
66
+
67
+ // If no diff files but cache writes exist, estimate break count
68
+ // Each break re-caches ~200K-500K tokens on average
69
+ const estimatedBreaks = totalBreaks > 0 ? totalBreaks : Math.round(totalCacheWrite / 300_000);
64
70
 
65
71
  return {
66
72
  totalCacheBreaks: totalBreaks,
73
+ estimatedBreaks,
67
74
  reasonsRanked,
68
75
  cacheHitRate: Math.round(cacheHitRate * 10) / 10,
69
76
  efficiencyRatio,
@@ -1,7 +1,7 @@
1
1
  /**
2
2
  * Inflection Point Detection
3
- * Finds the sharpest change in cache efficiency ratio over time.
4
- * Outputs: "Your efficiency dropped 3.6x starting March 29. Before: 482:1. After: 1,726:1."
3
+ * Finds BOTH the worst degradation AND best improvement in cache efficiency.
4
+ * Prioritizes degradation — that's what users care about ("why is my usage draining?").
5
5
  */
6
6
  export function detectInflectionPoints(dailyFromJSONL) {
7
7
  if (!dailyFromJSONL || dailyFromJSONL.length < 5) return null;
@@ -12,10 +12,10 @@ export function detectInflectionPoints(dailyFromJSONL) {
12
12
 
13
13
  if (sorted.length < 5) return null;
14
14
 
15
- // Sliding window: compare the average ratio of days before vs after each point
16
- // Window size: at least 3 days on each side
17
15
  const minWindow = 3;
18
- let bestSplit = null;
16
+ let worstDegradation = null;
17
+ let worstScore = 0;
18
+ let bestImprovement = null;
19
19
  let bestScore = 0;
20
20
 
21
21
  for (let i = minWindow; i <= sorted.length - minWindow; i++) {
@@ -27,32 +27,44 @@ export function detectInflectionPoints(dailyFromJSONL) {
27
27
 
28
28
  if (beforeRatio === 0 || afterRatio === 0) continue;
29
29
 
30
- // Score = magnitude of change (either direction)
31
- const changeMultiplier = afterRatio > beforeRatio
32
- ? afterRatio / beforeRatio
33
- : beforeRatio / afterRatio;
34
-
35
- if (changeMultiplier > bestScore && changeMultiplier >= 1.5) {
36
- bestScore = changeMultiplier;
37
- bestSplit = {
38
- date: sorted[i].date,
39
- beforeRatio,
40
- afterRatio,
41
- multiplier: Math.round(changeMultiplier * 10) / 10,
42
- direction: afterRatio > beforeRatio ? 'worsened' : 'improved',
43
- beforeDays: before.length,
44
- afterDays: after.length,
45
- };
30
+ if (afterRatio > beforeRatio) {
31
+ // Degradation (ratio went UP = worse)
32
+ const mult = afterRatio / beforeRatio;
33
+ if (mult > worstScore && mult >= 1.5) {
34
+ worstScore = mult;
35
+ worstDegradation = buildResult(sorted[i].date, beforeRatio, afterRatio, mult, 'worsened', before.length, after.length);
36
+ }
37
+ } else {
38
+ // Improvement (ratio went DOWN = better)
39
+ const mult = beforeRatio / afterRatio;
40
+ if (mult > bestScore && mult >= 1.5) {
41
+ bestScore = mult;
42
+ bestImprovement = buildResult(sorted[i].date, beforeRatio, afterRatio, mult, 'improved', before.length, after.length);
43
+ }
46
44
  }
47
45
  }
48
46
 
49
- if (!bestSplit) return null;
47
+ // Return degradation as primary (that's the problem), improvement as secondary
48
+ const primary = worstDegradation || bestImprovement;
49
+ if (!primary) return null;
50
50
 
51
- // Build human-readable summary
52
- const dirLabel = bestSplit.direction === 'worsened' ? 'dropped' : 'improved';
53
- bestSplit.summary = `Your cache efficiency ${dirLabel} ${bestSplit.multiplier}x starting ${formatDate(bestSplit.date)}. Before: ${bestSplit.beforeRatio.toLocaleString()}:1. After: ${bestSplit.afterRatio.toLocaleString()}:1.`;
51
+ primary.secondary = worstDegradation ? bestImprovement : null;
52
+ return primary;
53
+ }
54
54
 
55
- return bestSplit;
55
+ function buildResult(date, beforeRatio, afterRatio, multiplier, direction, beforeDays, afterDays) {
56
+ const mult = Math.round(multiplier * 10) / 10;
57
+ const dirLabel = direction === 'worsened' ? 'dropped' : 'improved';
58
+ return {
59
+ date,
60
+ beforeRatio,
61
+ afterRatio,
62
+ multiplier: mult,
63
+ direction,
64
+ beforeDays,
65
+ afterDays,
66
+ summary: `Your cache efficiency ${dirLabel} ${mult}x starting ${formatDate(date)}. Before: ${beforeRatio.toLocaleString()}:1. After: ${afterRatio.toLocaleString()}:1.`,
67
+ };
56
68
  }
57
69
 
58
70
  function computeRatio(days) {
@@ -1,117 +1,111 @@
1
1
  /**
2
2
  * Recommendations Engine
3
- * Generates actionable recommendations informed by community data (March 2026 crisis).
4
- * Every recommendation maps to a real pattern reported by users on GitHub/Twitter/Reddit.
3
+ * Each recommendation includes estimated usage % savings.
4
+ * Informed by community data from the March 2026 Claude Code crisis.
5
5
  */
6
6
  export function generateRecommendations(costAnalysis, cacheHealth, claudeMdStack, anomalies, inflection, sessionIntel, modelRouting) {
7
7
  const recs = [];
8
+ const totalCost = costAnalysis.totalCost || 1;
8
9
 
9
- // 0. Inflection point — most important signal
10
+ // 0. Inflection point
10
11
  if (inflection && inflection.direction === 'worsened' && inflection.multiplier >= 2) {
11
12
  recs.push({
12
13
  severity: 'critical',
13
14
  title: `Cache efficiency dropped ${inflection.multiplier}x on ${inflection.date}`,
14
- detail: inflection.summary,
15
- action: 'Run: claude update. Versions 2.1.69-2.1.89 had a cache sentinel bug that dropped read rates from 95% to 4-17%. Fixed in v2.1.90.',
15
+ savings: '~40-60% usage reduction after fix',
16
+ action: 'Run: claude update. v2.1.69-2.1.89 had cache bugs. Fixed in v2.1.90.',
16
17
  });
17
18
  } else if (inflection && inflection.direction === 'improved' && inflection.multiplier >= 2) {
18
19
  recs.push({
19
20
  severity: 'positive',
20
21
  title: `Efficiency improved ${inflection.multiplier}x on ${inflection.date}`,
21
- detail: inflection.summary,
22
- action: 'Your cache efficiency improved here. Likely a version update or workflow change that stuck.',
22
+ savings: 'Already saving',
23
+ action: 'Your cache efficiency improved. Likely a version update or workflow change.',
23
24
  });
24
25
  }
25
26
 
26
- // 1. CLAUDE.md bloat — community-reported 10-20x cost multiplier
27
- if (claudeMdStack.totalTokensEstimate > 8000) {
28
- const dailyCost = claudeMdStack.costPerMessage?.dailyCached200;
29
- recs.push({
30
- severity: claudeMdStack.totalTokensEstimate > 15000 ? 'critical' : 'warning',
31
- title: `CLAUDE.md is ${Math.round(claudeMdStack.totalTokensEstimate / 1000)}K tokens`,
32
- detail: `Re-read on every turn. Community best practice: keep under 200 lines (~4K tokens). Yours costs ~$${dailyCost ? dailyCost.toFixed(2) : '?'}/day at 200 messages. Each cache break re-reads at 12.5x the cached price.`,
33
- action: 'Move rarely-used rules to project-level files. Use skills/hooks instead of inline instructions. Every 1K tokens removed saves ~$0.50/day.',
34
- });
35
- }
27
+ // 1. Model routing — biggest actionable saving for most users
28
+ const modelCosts = costAnalysis.modelCosts || {};
29
+ const totalModelCost = Object.values(modelCosts).reduce((s, c) => s + c, 0);
30
+ const opusCost = Object.entries(modelCosts).filter(([n]) => n.toLowerCase().includes('opus')).reduce((s, [, c]) => s + c, 0);
31
+ const opusPct = totalModelCost > 0 ? Math.round((opusCost / totalModelCost) * 100) : 0;
36
32
 
37
- // 2. Version check — the #1 fix reported by community
38
- if (cacheHealth.efficiencyRatio > 1500 || (inflection && inflection.direction === 'worsened')) {
33
+ if (opusPct > 80) {
34
+ const savingsPct = Math.round(opusPct * 0.4 * 0.8); // 40% of Opus routable, 80% cheaper
39
35
  recs.push({
40
- severity: 'critical',
41
- title: 'Update Claude Code to v2.1.90+',
42
- detail: 'Versions 2.1.69-2.1.89 had three cache bugs: sentinel replacement error, --resume cache miss, and nested CLAUDE.md re-injection. Community-verified: usage dropped from 80-100% to 5-7% of Max quota after updating.',
43
- action: 'Run: claude update. If already on latest, start a fresh session — the fix only applies to new sessions.',
36
+ severity: 'warning',
37
+ title: `${opusPct}% usage is Opus — route subagents to Sonnet`,
38
+ savings: `~${savingsPct}% usage reduction`,
39
+ action: `Set model: "sonnet" on Task/subagent calls. Sonnet handles search, file reads, docs, and simple edits at same quality. Community-verified: limits lasted 3-5x longer.`,
44
40
  });
45
41
  }
46
42
 
47
- // 3. Cache break analysis
48
- if (cacheHealth.totalCacheBreaks > 10) {
49
- const topReason = cacheHealth.reasonsRanked[0];
43
+ // 2. CLAUDE.md bloat
44
+ if (claudeMdStack.totalTokensEstimate > 8000) {
45
+ const excessK = Math.round((claudeMdStack.totalTokensEstimate - 4000) / 1000);
50
46
  recs.push({
51
- severity: cacheHealth.totalCacheBreaks > 50 ? 'critical' : 'warning',
52
- title: `${cacheHealth.totalCacheBreaks} cache invalidations`,
53
- detail: `Each break forces a full prompt re-read at write prices (12.5x cache read cost). ${topReason ? `Top cause: "${topReason.reason}" (${topReason.count}x, ${topReason.percentage}%).` : ''}`,
54
- action: topReason?.reason === 'Tool schemas changed'
55
- ? 'Reduce MCP server connections. Each tool schema change breaks the cache prefix. Disconnect tools you\'re not actively using.'
56
- : topReason?.reason === 'System prompt changed'
57
- ? 'Stop editing CLAUDE.md mid-session. Batch rule changes between sessions.'
58
- : 'Review ~/.claude/tmp/cache-break-*.diff for exact invalidation causes.',
47
+ severity: claudeMdStack.totalTokensEstimate > 15000 ? 'critical' : 'warning',
48
+ title: `CLAUDE.md is ${Math.round(claudeMdStack.totalTokensEstimate / 1000)}K tokens — trim to <4K`,
49
+ savings: `saves ~${excessK}K tokens/msg`,
50
+ action: 'Re-read on every turn. Move rarely-used rules to project files. Use skills/hooks instead of inline instructions. Community target: under 200 lines.',
59
51
  });
60
52
  }
61
53
 
62
- // 4. High cache:output ratio
63
- if (cacheHealth.efficiencyRatio > 2000) {
64
- recs.push({
65
- severity: 'critical',
66
- title: `Cache ratio ${cacheHealth.efficiencyRatio.toLocaleString()}:1 — abnormally high`,
67
- detail: `Healthy range: 300-800:1. You\'re at ${cacheHealth.efficiencyRatio.toLocaleString()}:1 — every output token costs ${cacheHealth.efficiencyRatio.toLocaleString()} cache read tokens. This pattern matches the March 2026 cache bug reported by thousands of users.`,
68
- action: 'Immediate fix: update to v2.1.90+. If already updated, avoid --resume flag and start fresh sessions per task.',
69
- });
70
- } else if (cacheHealth.efficiencyRatio > 1000) {
54
+ // 3. Compaction frequency — community's #1 session management tip
55
+ if (sessionIntel?.available && sessionIntel.avgToolsPerSession > 25) {
71
56
  recs.push({
72
57
  severity: 'warning',
73
- title: `Cache ratio ${cacheHealth.efficiencyRatio.toLocaleString()}:1 — elevated`,
74
- detail: 'Not critical, but above the 300-800 healthy range. Common causes: large codebase exploration, many file reads without /compact, or stale sessions.',
75
- action: 'Use /compact every 30-40 tool calls. Start fresh sessions for each distinct task.',
58
+ title: `Avg ${sessionIntel.avgToolsPerSession} tool calls/session — compact more often`,
59
+ savings: '~15-25% usage reduction',
60
+ action: 'Use /compact every 30-40 tool calls. Context bloat compounds each message re-reads the full history. Community tip: compacting at 40 calls saves 20%+ on long sessions.',
76
61
  });
77
62
  }
78
63
 
79
- // 5. Opus dominance community tip: Sonnet handles 60%+ of tasks at 1/5 cost
80
- const modelCosts = costAnalysis.modelCosts || {};
81
- const totalModelCost = Object.values(modelCosts).reduce((s, c) => s + c, 0);
82
- const opusCost = Object.entries(modelCosts).filter(([n]) => n.toLowerCase().includes('opus')).reduce((s, [, c]) => s + c, 0);
83
- const opusPct = totalModelCost > 0 ? Math.round((opusCost / totalModelCost) * 100) : 0;
84
-
85
- if (opusPct > 85) {
86
- const savings = modelRouting?.estimatedSavings || Math.round(opusCost * 0.16);
64
+ // 4. Fresh sessions per task
65
+ if (sessionIntel?.available && sessionIntel.longSessionPct > 30) {
87
66
  recs.push({
88
67
  severity: 'warning',
89
- title: `${opusPct}% of spend is Opus`,
90
- detail: `Opus costs 5x more than Sonnet per token. Sonnet 4.6 handles file reads, search, simple edits, and subagent work at the same quality. Community tip: switching routine tasks to Sonnet dropped quota usage by 60-80%.`,
91
- action: `Set model: "sonnet" on subagent/Task calls. Estimated savings: ~$${savings.toLocaleString()}. Reserve Opus for complex reasoning only.`,
68
+ title: `${sessionIntel.longSessionPct}% of sessions over 60 min — start fresh more often`,
69
+ savings: '~10-20% usage reduction',
70
+ action: `One task, one session. Your p90 is ${sessionIntel.p90Duration}min, longest ${sessionIntel.maxDuration}min. Starting fresh resets context and maximizes cache hits. Cheaper than a bloated session.`,
92
71
  });
93
72
  }
94
73
 
95
- // 6. Session length — community-reported: sessions >60 min degrade heavily
96
- if (sessionIntel?.available && sessionIntel.longSessionPct > 30) {
74
+ // 5. Cache ratio warning
75
+ if (cacheHealth.efficiencyRatio > 1500) {
97
76
  recs.push({
98
- severity: 'warning',
99
- title: `${sessionIntel.longSessionPct}% of sessions exceed 60 minutes`,
100
- detail: `Long sessions accumulate context that degrades cache efficiency and response quality. Your median: ${sessionIntel.medianDuration}min, p90: ${sessionIntel.p90Duration}min, longest: ${sessionIntel.maxDuration}min.`,
101
- action: 'One task, one session. Use /compact for exploration, fresh session for each bug fix or feature. The cost of starting fresh is less than the cost of a bloated context.',
77
+ severity: 'critical',
78
+ title: `Cache ratio ${cacheHealth.efficiencyRatio.toLocaleString()}:1 — update Claude Code`,
79
+ savings: '~40-60% usage reduction',
80
+ action: 'Run: claude update. v2.1.89 had cache bugs that inflated ratios 10-20x. Community-verified: v2.1.90 dropped usage from 80-100% to 5-7% of Max quota.',
81
+ });
82
+ } else if (cacheHealth.efficiencyRatio > 800) {
83
+ recs.push({
84
+ severity: 'info',
85
+ title: `Cache ratio ${cacheHealth.efficiencyRatio.toLocaleString()}:1 — slightly elevated`,
86
+ savings: '~5-10% with optimization',
87
+ action: 'Healthy range: 300-800:1. Reduce by compacting more often, starting fresh sessions, and avoiding --resume on older CC versions.',
102
88
  });
103
89
  }
104
90
 
105
- // 7. Peak hour overlap — community-reported: 5am-11am PT has throttled limits
91
+ // 6. Peak hour overlap
106
92
  if (sessionIntel?.available && sessionIntel.peakOverlapPct > 40) {
107
93
  recs.push({
108
94
  severity: 'info',
109
- title: `${sessionIntel.peakOverlapPct}% of your work hits throttled hours`,
110
- detail: 'Anthropic reduces 5-hour session limits during weekday peak hours (5am-11am PT / 12pm-6pm UTC). ~7% of users hit limits they wouldn\'t otherwise.',
111
- action: 'Shift token-heavy work (refactors, test generation, codebase exploration) to off-peak hours. Session limits are unchanged — only the 5-hour window shrinks.',
95
+ title: `${sessionIntel.peakOverlapPct}% of work during throttled hours`,
96
+ savings: '~30% longer session limits',
97
+ action: 'Anthropic throttles 5-hour limits during 5am-11am PT weekdays. Shift heavy work (refactors, test gen) to off-peak for 30%+ longer limits.',
112
98
  });
113
99
  }
114
100
 
101
+ // 7. .claudeignore — prevents reading node_modules and other build artifacts
102
+ recs.push({
103
+ severity: 'info',
104
+ title: 'Create .claudeignore to exclude build artifacts',
105
+ savings: '~5-10% per context load',
106
+ action: 'Prevents CC from reading node_modules/, dist/, *.lock, __pycache__/. Each context load scans your project tree — excluding junk saves tokens every turn.',
107
+ });
108
+
115
109
  // 8. Cost anomalies
116
110
  if (anomalies.hasAnomalies) {
117
111
  const spikes = anomalies.anomalies.filter(a => a.type === 'spike');
@@ -119,23 +113,50 @@ export function generateRecommendations(costAnalysis, cacheHealth, claudeMdStack
119
113
  const worst = spikes[0];
120
114
  recs.push({
121
115
  severity: worst.severity,
122
- title: `${spikes.length} cost spike${spikes.length > 1 ? 's' : ''} — worst: $${worst.cost.toFixed(0)} on ${worst.date}`,
123
- detail: `+$${worst.deviation.toFixed(0)} above your $${worst.avgCost.toFixed(0)} daily average.${worst.cacheRatioAnomaly ? ' Cache ratio was also anomalous — strongly suggests cache bug.' : ''} GitHub #38029 documents a bug where a single session generated 652K phantom output tokens ($342).`,
124
- action: 'Monitor the first 1-2 messages of each session. If a single message burns 3-5% of your quota, restart immediately.',
116
+ title: `${spikes.length} cost spike${spikes.length > 1 ? 's' : ''} — worst $${worst.cost.toFixed(0)} on ${worst.date}`,
117
+ savings: 'Preventable with monitoring',
118
+ action: 'Watch the first 1-2 messages of each session. If a single message burns 3-5% of quota, restart immediately. GitHub #38029 documents phantom 652K output token bugs.',
125
119
  });
126
120
  }
127
121
  }
128
122
 
129
- // 9. Positive: cache savings
123
+ // 9. Avoid --resume on older versions
124
+ if (cacheHealth.efficiencyRatio > 600) {
125
+ recs.push({
126
+ severity: 'info',
127
+ title: 'Avoid --resume and --continue flags',
128
+ savings: '~$0.15 saved per resume',
129
+ action: 'These flags caused full prompt-cache misses in v2.1.69-2.1.89 (~$0.15 per resume on 500K context). Fixed in v2.1.90. Copy your last message and start fresh instead.',
130
+ });
131
+ }
132
+
133
+ // 10. Specific prompt discipline
134
+ recs.push({
135
+ severity: 'info',
136
+ title: 'Be specific in prompts — reduces tokens up to 10x',
137
+ savings: '~20-40% usage reduction',
138
+ action: 'Instead of "fix the auth bug", say "fix JWT validation in src/auth/validate.ts line 42". Specific prompts avoid codebase-wide scans. Community-verified: 10x reduction per prompt.',
139
+ });
140
+
141
+ // 11. Disconnect unused MCP tools
142
+ if (sessionIntel?.available && sessionIntel.topTools.some(t => t.name.includes('mcp__'))) {
143
+ recs.push({
144
+ severity: 'info',
145
+ title: 'Disconnect unused MCP servers',
146
+ savings: '~5-15% per cache break avoided',
147
+ action: 'Each MCP tool schema change invalidates the prompt cache. Only connect servers you actively need. Disconnect the rest between sessions.',
148
+ });
149
+ }
150
+
151
+ // 12. Cache savings (positive)
130
152
  if (cacheHealth.savings?.fromCaching > 100) {
131
153
  recs.push({
132
154
  severity: 'positive',
133
- title: `Cache saved you ~$${cacheHealth.savings.fromCaching.toLocaleString()}`,
134
- detail: 'Without prompt caching, standard input pricing would have applied to all cache reads. The system is working — optimization is about reducing breaks.',
135
- action: 'Keep sessions alive to maximize hits. Avoid mid-session CLAUDE.md edits and MCP tool changes.',
155
+ title: `Cache saved ~$${cacheHealth.savings.fromCaching.toLocaleString()} in equivalent API costs`,
156
+ savings: 'Working as intended',
157
+ action: 'Prompt caching is saving you significantly. Keep sessions alive, avoid mid-session CLAUDE.md edits and MCP tool changes to maximize hits.',
136
158
  });
137
159
  }
138
160
 
139
- // Cap at 5 most impactful recommendations
140
- return recs.slice(0, 5);
161
+ return recs.slice(0, 8);
141
162
  }
package/src/cli/index.js CHANGED
@@ -39,7 +39,7 @@ const flags = {
39
39
  if (flags.help) {
40
40
  console.log(`
41
41
  ╔═══════════════════════════════════════════════╗
42
- ║ CC Hubber v0.1.0
42
+ ║ CC Hubber v0.3.1 ║
43
43
  ║ What you spent. Why you spent it. Is that ║
44
44
  ║ normal. ║
45
45
  ╚═══════════════════════════════════════════════╝
@@ -74,7 +74,7 @@ async function main() {
74
74
  process.exit(1);
75
75
  }
76
76
 
77
- console.log('\n CC Hubber v0.1.0');
77
+ console.log('\n CC Hubber v0.3.1');
78
78
  console.log(' ─────────────────────────────');
79
79
  console.log(' Reading local Claude Code data...\n');
80
80
 
@@ -28,10 +28,9 @@ export function readClaudeMdStack(claudeDir) {
28
28
  }
29
29
  if (currentSection.lines > 0) sections.push(currentSection);
30
30
 
31
- // Add token estimates and sort by size
31
+ // Add token estimates, keep original file order (don't sort)
32
32
  globalSections = sections
33
- .map(s => ({ ...s, tokens: Math.round(s.bytes / 4) }))
34
- .sort((a, b) => b.bytes - a.bytes);
33
+ .map((s, idx) => ({ ...s, tokens: Math.round(s.bytes / 4), order: idx }));
35
34
 
36
35
  stack.push({
37
36
  level: 'global',
@@ -36,20 +36,37 @@ function readProjectsDir(dir, entries) {
36
36
  for (const hash of projectHashes) {
37
37
  const projectDir = join(dir, hash);
38
38
 
39
- // Read top-level JSONL files only (one per session).
40
- // Subagent files in <session>/subagents/ are NOT read for cost —
41
- // parent session JSONL already includes subagent token billing.
42
- // Reading both would double-count (confirmed: $5.7K → $10.8K).
39
+ // Read top-level JSONL files (one per session)
43
40
  const jsonlFiles = readdirSync(projectDir).filter(f => f.endsWith('.jsonl'));
44
41
  for (const file of jsonlFiles) {
45
42
  readJsonlFile(join(projectDir, file), basename(file, '.jsonl'), hash, entries);
46
43
  }
44
+
45
+ // Read subagent JSONL files (for Haiku/Sonnet model attribution)
46
+ // Dedup by message ID prevents double-counting
47
+ const subdirs = readdirSync(projectDir).filter(f => {
48
+ try { return statSync(join(projectDir, f)).isDirectory(); } catch { return false; }
49
+ });
50
+ for (const subdir of subdirs) {
51
+ const subagentDir = join(projectDir, subdir, 'subagents');
52
+ if (existsSync(subagentDir)) {
53
+ try {
54
+ const subFiles = readdirSync(subagentDir).filter(f => f.endsWith('.jsonl'));
55
+ for (const file of subFiles) {
56
+ readJsonlFile(join(subagentDir, file), basename(file, '.jsonl'), hash, entries);
57
+ }
58
+ } catch { /* skip */ }
59
+ }
60
+ }
47
61
  }
48
62
  } catch {
49
63
  // Directory read failed
50
64
  }
51
65
  }
52
66
 
67
+ // Track seen message IDs to deduplicate (JSONL files contain dupes from session resume)
68
+ const seenMessageIds = new Set();
69
+
53
70
  function readJsonlFile(filePath, sessionId, projectHash, entries) {
54
71
  try {
55
72
  const raw = readFileSync(filePath, 'utf-8');
@@ -65,6 +82,13 @@ function readJsonlFile(filePath, sessionId, projectHash, entries) {
65
82
  const usage = record.message?.usage;
66
83
  if (!usage) continue;
67
84
 
85
+ // Deduplicate by message ID — JSONL files contain duplicates from session resume
86
+ const msgId = record.message?.id;
87
+ if (msgId) {
88
+ if (seenMessageIds.has(msgId)) continue;
89
+ seenMessageIds.add(msgId);
90
+ }
91
+
68
92
  entries.push({
69
93
  sessionId,
70
94
  projectHash,