npm - tokengolf - Versions diffs - 0.5.4 → 1.0.0 - Mend

tokengolf 0.5.4 → 1.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (45) hide show

package/CHANGELOG.md +1 -0
package/CLAUDE.md +90 -390
package/README.md +147 -126
package/dist/cli.js +618 -1457
package/docs/assets/banner.svg +16 -13
package/docs/design-preview.html +1112 -0
package/docs/index.html +656 -358
package/hooks/post-tool-use.js +35 -4
package/hooks/session-end.js +8 -152
package/hooks/session-start.js +92 -14
package/hooks/statusline.sh +132 -48
package/hooks/user-prompt-submit.js +34 -4
package/install.sh +1 -1
package/package.json +3 -2
package/plugin/.claude-plugin/plugin.json +6 -0
package/plugin/README.md +30 -0
package/plugin/commands/config.md +12 -0
package/plugin/commands/scorecard.md +12 -0
package/plugin/commands/stats.md +12 -0
package/plugin/hooks/hooks.json +94 -0
package/plugin/scripts/post-tool-use-failure.js +27 -0
package/plugin/scripts/post-tool-use.js +78 -0
package/plugin/scripts/pre-compact.js +41 -0
package/plugin/scripts/session-end.js +113 -0
package/plugin/scripts/session-start.js +198 -0
package/plugin/scripts/statusline.sh +181 -0
package/plugin/scripts/stop.js +27 -0
package/plugin/scripts/subagent-start.js +27 -0
package/plugin/scripts/user-prompt-submit.js +61 -0
package/scripts/sync-plugin-version.sh +22 -0
package/src/cli.js +93 -98
package/src/components/ScoreCard.js +28 -37
package/src/components/StatsView.js +26 -27
package/src/lib/__tests__/score.test.js +220 -94
package/src/lib/ansi-scorecard.js +162 -0
package/src/lib/config.js +47 -0
package/src/lib/demo-ansi-scorecard.js +29 -0
package/src/lib/demo-fixtures.js +35 -205
package/src/lib/demo.js +143 -98
package/src/lib/install.js +21 -0
package/src/lib/score.js +95 -69
package/src/lib/store.js +2 -2
package/src/components/ActiveRun.js +0 -141
package/src/components/StartRun.js +0 -278
package/src/lib/demo-active.js +0 -56

package/CHANGELOG.md CHANGED Viewed

@@ -7,6 +7,7 @@ TokenGolf patch notes — what changed, what it measures, and why the mechanic e
 ## [Unreleased]
 ### Changed
+- **Sublinear par scaling (sqrt)** — Par formula changed from `prompts × rate` to `rate × sqrt(prompts)`. Early prompts have headroom for exploration; pressure builds as the session goes on. Long wasteful sessions bust. Rates recalibrated: Haiku $0.55, Sonnet $7.00, Paladin $22.00, Opus $45.00. Floors unchanged. All models bust around 20 prompts at typical per-prompt spend.
 - **Design D HUD** — StatusLine HUD redesigned with `██` accent bar, inline `▓░` progress bars for budget and context, no separator lines. 1 line when context <50%, 2 lines when context visible. Accent bar turns red when budget >75%. Matches Design D across all UI surfaces.
 - **Design D block accent UI** — All bordered boxes replaced with left-only `██` block accent bars. Eliminates persistent right-border misalignment caused by emoji/unicode width differences across terminals. Color-coded: yellow for won, red for died, gray for neutral.
 - ScoreCard, StatsView, ActiveRun, StartRun components all use custom Ink `borderStyle` with `left: '██'`, no right/top/bottom borders

package/CLAUDE.md CHANGED Viewed

@@ -1,454 +1,154 @@
 # TokenGolf — CLAUDE.md
-You are working on **TokenGolf**, a CLI game that gamifies Claude Code sessions by turning token/dollar efficiency into a score. This is the primary project context file. Read this fully before doing anything.
+A CLI game that gamifies Claude Code sessions by turning token/dollar efficiency into a score. Node.js, ESM, Ink v5 TUI, Commander.js, JSON persistence in `~/.tokengolf/`.
 ---
-## What TokenGolf Is
+## One Mode — Every Session Is a Roguelike Run
-A Node.js CLI tool that wraps Claude Code sessions with game mechanics. Users declare a quest ("implement pagination for /users"), set a budget ($0.30), pick a model class, then work in Claude Code normally. At the end, they get a score based on how efficiently they used their budget.
+Every Claude Code session is automatically tracked. No wizard, no upfront budget commitment. The budget (par) scales dynamically with session activity.
-**Core insight**: Claude Code already exposes session cost data. TokenGolf adds the game layer — the meaning, the stakes, the achievement system — on top of data that already exists.
-**Tagline**: *"Flow mode tracks you. Roguelike mode trains you."*
+**Par budget** = `max(rate × sqrt(prompts), model_floor)`. Par grows sublinearly — early prompts have headroom, pressure mounts over time. Efficient prompts beat par; wasteful prompts fall behind. BUST (>100% of par) = `status: 'died'`, red accent, death achievements.
 ---
-## Two Modes
-### Flow Mode
-- Passive. No interruption. Just runs in the background.
-- SessionStart hook auto-creates a flow run if none is active.
-- Post-session: `tokengolf win` shows score + achievements with no pre-configuration.
-- For people in flow state who don't want friction.
-### Roguelike Mode
-- Intentional. Pre-commitment before session starts.
-- Declare quest + budget + model class = a "run" with real stakes.
-- Budget bust = permadeath. Run logged as a death.
-- Floor structure: Write code → Write tests → Fix tests → Code review → PR merged (BOSS)
-- For deliberate practice. Trains prompting skills.
-**Relationship**: Same engine, same data, same achievement system. Roguelike practice makes Flow sessions better over time. That's the meta loop.
----
+## Installation
-## Game Mechanics
-### Model as Character Class
-| Class | Model | Difficulty | Feel |
-|-------|-------|------------|------|
-| 🏹 Rogue | Haiku | Nightmare | Glass cannon. Must prompt precisely. |
-| ⚔️ Fighter | Sonnet | Standard | Balanced. The default run. |
-| 🧙 Warlock | Opus | Casual | Powerful but expensive. |
-| ⚜️ Paladin | Opus (plan mode) | Tactical | Strategic planner. Thinks before acting. |
-### Budget Tiers
-| Tier | Spend | Emoji |
-|------|-------|-------|
-| Diamond | < $0.10 | 💎 |
-| Gold | < $0.30 | 🥇 |
-| Silver | < $1.00 | 🥈 |
-| Bronze | < $3.00 | 🥉 |
-| Reckless | > $3.00 | 💸 |
-### Efficiency Ratings
-| Rating | Budget Used | Color |
-|--------|------------|-------|
-| LEGENDARY | < 25% | magenta |
-| EFFICIENT | < 50% | cyan |
-| SOLID | < 75% | green |
-| CLOSE CALL | < 100% | yellow |
-| BUSTED | > 100% | red |
-### Achievements
-**Class Medals**
-- 🥇 Gold — Completed with Haiku
-- 💎 Diamond — Haiku under $0.10
-- 🥈 Silver — Completed with Sonnet
-- ⚜️ Paladin — Completed as Paladin (Opus plan mode)
-- ♟️ Grand Strategist — LEGENDARY efficiency as Paladin
-- 🥉 Bronze — Completed with Opus
-**Budget Efficiency**
-- 🎯 Sniper — Under 25% of budget used
-- ⚡ Efficient — Under 50% of budget used
-- 🪙 Penny Pincher — Total spend under $0.10
-**Effort-Based**
-- 🏎️ Speedrunner — Low effort, completed under budget
-- 🏋️ Tryhard — High/Max effort, LEGENDARY efficiency
-- 👑 Archmagus — Opus at max effort, completed
-**Fast Mode (Opus-only)**
-- ⛈️ Lightning Run — Opus fast mode, completed under budget
-- 🎰 Daredevil — Opus fast mode, LEGENDARY efficiency
-**Sessions**
-- 🔥 No Rest for the Wicked — Completed in one session
-- 🏕️ Made Camp — Completed across multiple sessions
-- 🧟 Came Back — Fainted and finished anyway
-**Gear (Compaction)**
-- 📦 Overencumbered — Context auto-compacted during run
-- 🥷 Ghost Run — Manual compact at ≤30% context
-- 🪶 Ultralight — Manual compact at 31–40% context
-- 🎒 Traveling Light — Manual compact at 41–50% context
-**Ultrathink**
-- 🔮 Spell Cast — Used extended thinking (won)
-- 🧮 Calculated Risk — Ultrathink + LEGENDARY efficiency
-- 🌀 Deep Thinker — ≥3 ultrathink invocations, completed
-- 🤫 Silent Run — No extended thinking, SOLID or better, completed
-**Paladin Planning Ratio**
-- 🏛️ Architect — Opus handled >60% of cost (heavy planner)
-- 💨 Blitz — Opus handled <25% of cost (light plan, fast execution)
-- ⚖️ Equilibrium — Opus/Sonnet balanced at 40–60%
-**Model Loyalty (non-Paladin)**
-- 🔷 Purist — Single model family throughout
-- 🦎 Chameleon — Multiple model families used, under budget
-- 🔀 Tactical Switch — Exactly 1 model switch, under budget
-- 🔒 Committed — No switches, one model family
-- ⚠️ Class Defection — Declared one class but cost skewed to another
-**Haiku Efficiency**
-- 🏹 Frugal — Haiku handled ≥50% of session cost
-- 🎲 Rogue Run — Haiku handled ≥75% of session cost
-**Prompting Skill**
-- 🥊 One Shot — Completed in a single prompt
-- 💬 Conversationalist — ≥20 prompts
-- 🤐 Terse — ≤3 prompts, ≥10 tool calls
-- 🪑 Backseat Driver — ≥15 prompts, <1 tool call per prompt
-- 🏗️ High Leverage — ≥5 tools per prompt (≥2 prompts)
-**Tool Mastery**
-- 👁️ Read Only — No Edit or Write calls (≥1 Read)
-- ✏️ Editor — ≥10 Edit calls
-- 🐚 Bash Warrior — ≥10 Bash calls, ≥50% of tool usage
-- 🔍 Scout — ≥60% Read calls (≥5 total)
-- 🔪 Surgeon — 1–3 Edit calls, completed under budget
-- 🧰 Toolbox — ≥5 distinct tool types used
-**Cost per Prompt**
-- 💲 Cheap Shots — Under $0.01 per prompt (≥3 prompts)
-- 🍷 Expensive Taste — Over $0.50 per prompt (≥3 prompts; also a death mark)
-**Time**
-- ⏱️ Speedrun — Completed in under 5 minutes
-- 🏃 Marathon — Session 60–180 minutes
-- 🫠 Endurance — Session over 3 hours
-**Tool Reliability**
-- ✅ Clean Run — Zero failed tool calls (≥5 total tool uses)
-- 🐂 Stubborn — ≥10 failed tool calls, still won
-**Subagents**
-- 🐺 Lone Wolf — No subagents spawned
-- 📡 Summoner — ≥5 subagents spawned
-- 🪖 Army of One — ≥10 subagents, under 50% budget used
-**Turn Discipline**
-- 🤖 Agentic — ≥3 Claude turns per user prompt
-- 🐕 Obedient — Exactly 1 turn per prompt (≥3 prompts)
-**Death Marks** *(fire before won-only cutoff; some also fire on won runs)*
-- 🎲 Indecisive — ≥3 model switches *(won or died)*
-- 🤦 Hubris — Used ultrathink, busted anyway
-- 💥 Blowout — Spent ≥2× budget
-- 😭 So Close — Died within 10% of budget
-- 🔨 Tool Happy — Died with ≥30 tool calls
-- 🪦 Silent Death — Died with ≤2 prompts
-- 🤡 Fumble — Died with ≥5 failed tool calls
-- 🍷 Expensive Taste — Over $0.50/prompt *(won or died)*
+**Plugin (recommended)** — one step, auto-updates:
+```
+claude plugin install tokengolf
+```
----
+**npm (alternative)** — requires manual hook setup:
+```
+npm install -g tokengolf && tokengolf install
+```
-## Tech Stack
+npm users get auto-sync: hooks update automatically on version change via session-start.js.
-- **Runtime**: Node.js (ESM, `"type": "module"`)
-- **Build**: esbuild (JSX transform, `npm run build` → `dist/cli.js`)
-- **TUI**: [Ink v5](https://github.com/vadimdemedes/ink) + [@inkjs/ui v2](https://github.com/vadimdemedes/ink-ui)
-- **CLI parsing**: Commander.js
-- **Persistence**: JSON files in `~/.tokengolf/` (no native deps, zero compilation)
-- **Claude Code integration**: Hooks via `~/.claude/settings.json`
-- **Testing**: Vitest (ESM-native, `npm test`)
-- **Language**: JavaScript (no TypeScript — keep it simple)
+## Commands
-### Build pipeline
-Source is JSX (`src/`) → esbuild bundles to `dist/cli.js`. The `bin` in `package.json` points to `dist/cli.js`. Run `npm run build` after any source change. `prepare` runs build automatically on `npm link`/`npm install`.
+`npm run build` after source changes. `npm test` after score.js changes. `npm run lint` / `npm run format` for code quality. Husky pre-commit hook runs automatically.
 **Do not test with `node src/cli.js`** — use `node dist/cli.js` or `tokengolf` after `npm link`.
-### Why JSON not SQLite
-`better-sqlite3` requires native compilation which causes install failures. JSON files in `~/.tokengolf/` are sufficient for the data volume (hundreds of runs max) and have zero friction.
+| Command | Description |
+|---------|-------------|
+| `tokengolf scorecard` | Show last run's score card |
+| `tokengolf stats` | Career stats dashboard |
+| `tokengolf demo [component]` | Show UI demos (all, hud, scorecard, stats) |
+| `tokengolf config` | List all config values |
+| `tokengolf config emotions [mode]` | Get/set emotion mode (`off`, `emoji`, `ascii`) |
+| `tokengolf config par [model] [rate]` | View/set par rates per model, or `reset` |
+| `tokengolf config floor [model] [value]` | View/set par floors per model, or `reset` |
+| `tokengolf install` | Patch `~/.claude/settings.json` with hooks |
 ---
-## Project Structure
+## Par Budget System
+Par = the expected cost for a session, scaled sublinearly by prompts and model.
 ```
-tokengolf/
-├── src/
-│   ├── cli.js                    # Main entrypoint, all commands
-│   ├── components/
-│   │   ├── StartRun.js           # Quest declaration wizard
-│   │   ├── ActiveRun.js          # Live run status display
-│   │   ├── ScoreCard.js          # End-of-run screen (win/death)
-│   │   └── StatsView.js          # Career stats dashboard
-│   └── lib/
-│       ├── state.js              # Read/write ~/.tokengolf/current-run.json
-│       ├── store.js              # Read/write ~/.tokengolf/runs.json
-│       ├── score.js              # Tiers, ratings, model classes, achievements
-│       ├── cost.js               # Auto-detect cost from ~/.claude/ transcripts
-│       ├── install.js            # Patches ~/.claude/settings.json with hooks
-│       └── __tests__/
-│           └── score.test.js     # Vitest: 120 tests covering achievements + pure functions
-├── hooks/
-│   ├── session-start.js          # Injects run context; auto-creates flow run
-│   ├── session-end.js            # Captures cost on /exit; saves run; renders scorecard
-│   ├── post-tool-use.js          # Tracks tool calls, fires budget warnings
-│   ├── post-tool-use-failure.js  # Tracks failedToolCalls
-│   ├── user-prompt-submit.js     # Counts prompts, fires 50% nudge
-│   ├── pre-compact.js            # Tracks compaction events for gear achievements
-│   ├── subagent-start.js         # Tracks subagentSpawns
-│   ├── stop.js                   # Tracks turnCount
-│   └── statusline.sh             # Bash HUD shown in Claude Code statusline
-├── dist/
-│   └── cli.js                    # Built output (gitignored? check .gitignore)
-├── CLAUDE.md                     # This file
-├── package.json
-└── README.md
+par = max(rate × sqrt(prompts), model_floor)
+efficiency = actual_cost / par
 ```
----
+The sqrt scaling creates increasing pressure: early prompts have headroom for exploration, but long sessions must be increasingly efficient to stay under par.
-## State Files (in `~/.tokengolf/`)
-### `current-run.json`
-Active run state. Written by `tokengolf start` or auto-created by SessionStart hook (flow mode). Cleared on `tokengolf win` or `tokengolf bust`.
-```json
-{
-  "id": "run_1741345200000",
-  "quest": "implement pagination for /users",
-  "model": "claude-sonnet-4-6",
-  "budget": 0.30,
-  "spent": 0.11,
-  "status": "active",
-  "mode": "roguelike",
-  "floor": 2,
-  "totalFloors": 5,
-  "promptCount": 8,
-  "totalToolCalls": 14,
-  "toolCalls": { "Read": 6, "Edit": 4, "Bash": 4 },
-  "sessionId": "abc123",
-  "cwd": "/Users/me/projects/my-app",
-  "sessionCount": 1,
-  "fainted": false,
-  "compactionEvents": [],
-  "thinkingInvocations": 0,
-  "thinkingTokens": 0,
-  "failedToolCalls": 0,
-  "subagentSpawns": 2,
-  "turnCount": 12,
-  "startedAt": "2026-03-07T10:00:00Z"
-}
-```
+### Model Par Rates
-Flow mode runs have `"quest": null, "budget": null, "mode": "flow"`.
-### `runs.json`
-Array of all completed runs. Append-only.
-```json
-[
-  {
-    "id": "run_1741345200000",
-    "quest": "...",
-    "status": "won",
-    "spent": 0.18,
-    "budget": 0.30,
-    "model": "claude-sonnet-4-6",
-    "modelBreakdown": { "claude-sonnet-4-6": 0.15, "claude-haiku-4-5-20251001": 0.03 },
-    "achievements": [...],
-    "startedAt": "...",
-    "endedAt": "..."
-  }
-]
-```
+| Model | Par Rate | Floor | Rationale |
+|-------|----------|-------|-----------|
+| Haiku | $0.15 | $0.10 | Calibrated from API pricing ratios |
+| Sonnet | $1.50 | $0.75 | Heavy sessions (~$0.35/prompt) bust around 20 prompts |
+| Paladin | $4.50 | $2.00 | Opus planning + Sonnet execution blend |
+| Opus | $8.00 | $3.00 | Calibrated from actual session data (~$1.72/prompt heavy) |
+Constants: `MODEL_PAR_RATES`, `MODEL_PAR_FLOORS`, `getParBudget()` in `src/lib/score.js`.
+The floor prevents 1-prompt agentic sessions from being instant BUST.
 ---
-## CLI Commands
+## Scoring & Achievements
-| Command | Description |
-|---------|-------------|
-| `tokengolf start` | Declare quest, model, budget — begin a roguelike run |
-| `tokengolf status` | Show live status of current run |
-| `tokengolf win` | Complete current run (auto-detects cost from transcripts) |
-| `tokengolf win --spent 0.18` | Complete with manually specified cost |
-| `tokengolf bust` | Mark run as budget busted (permadeath) |
-| `tokengolf scorecard` | Show last run's score card |
-| `tokengolf stats` | Career stats dashboard |
-| `tokengolf install` | Patch `~/.claude/settings.json` with hooks |
+All tiers, ratings, model classes, par rates, and achievements are defined in `src/lib/score.js`. Read that file for the full catalog — don't duplicate it here.
+Key concepts:
+- **Model classes**: Rogue (Haiku), Fighter (Sonnet), Warlock (Opus), Paladin (Opus plan mode)
+- **Spend tiers**: absolute $ thresholds, model-calibrated (`MODEL_BUDGET_TIERS` / `getModelBudgets()`)
+- **Efficiency ratings**: LEGENDARY (<15%) → EPIC (<30%) → PRO → SOLID → CLOSE CALL → BUST (>100%), computed against dynamic par
+- **Death marks fire before the early return** in `calculateAchievements` — they're checked before `if (!won) return []`. `indecisive` and `expensive_taste` also fire on won runs.
 ---
 ## Claude Code Hooks
-Nine hooks in `hooks/` directory, installed via `tokengolf install`. Most complete in < 5s (synchronous JSON I/O). `session-end.js` uses async dynamic imports with a 30s timeout.
-### `SessionStart` (`session-start.js`)
-- Does NOT read stdin (SessionStart doesn't pipe data)
-- Reads `current-run.json`; if no active run, auto-creates a flow mode run
-- Auto-detects `effort` from env var or `~/.claude/settings.json`; auto-detects `fastMode` from settings.json
-- Increments `sessionCount` on existing runs
-- Outputs `additionalContext` injected into Claude's conversation
-### `PostToolUse` (`post-tool-use.js`)
-- Reads stdin (event JSON with `tool_name`)
-- Updates `toolCalls` count in `current-run.json`
-- At 80%+ budget: outputs `systemMessage` warning to Claude
-### `UserPromptSubmit` (`user-prompt-submit.js`)
-- Increments `promptCount`
-- At 50% budget: injects halfway nudge as `additionalContext`
-### `PreCompact` (`pre-compact.js`)
-- Reads stdin (compact event JSON with `trigger` and `context_window.used_percentage`)
-- Appends to `compactionEvents` array in `current-run.json`
-- Powers gear achievements (Ghost Run, Ultralight, Traveling Light, Overencumbered)
-### `SessionEnd` (`session-end.js`)
-- Reads stdin for `reason` field (detects Fainted if reason is `'other'`)
-- Calls `autoDetectCost(run)` — returns spent, modelBreakdown, thinkingInvocations, thinkingTokens
-- Resting runs: updates state with fainted:true, does NOT clear — run continues next session
-- Won/died runs: calls `saveRun()` (which runs `calculateAchievements()`), clears state, renders ANSI scorecard
-### `PostToolUseFailure` (`post-tool-use-failure.js`)
-- Reads stdin (event JSON with `tool_name` and error info)
-- Increments `failedToolCalls` in `current-run.json`
-- Powers Fumble death mark (≥5 failed tool calls)
-### `SubagentStart` (`subagent-start.js`)
-- Reads stdin (subagent event JSON)
-- Increments `subagentSpawns` in `current-run.json`
-- Powers Lone Wolf / Summoner / Army of One achievements
-### `Stop` (`stop.js`)
-- Reads stdin for turn data
-- Increments `turnCount` in `current-run.json`
-- Powers Agentic / Obedient turn discipline achievements
-### `StatusLine` (`statusline.sh`)
-- Bash script; uses `TG_SESSION_JSON=... python3 - "$STATE_FILE" <<'PYEOF'` pattern to avoid heredoc/stdin conflict
-- Receives live session JSON (cost, context %, model) via stdin
-- **Design D accent bar**: `██` prefix on each line, color-coded (yellow normal, red when budget >75%)
-- Line 1: `██ ⛳ quest  $cost/budget ▓▓▓░░░ pct%  RATING  model  F1/5`
-- Line 2 (always shown when context data available): `██ 🧠 ▓▓▓▓░░░ ctx% 🪶/🎒/📦`
-- Budget progress bar: `▓` filled, `░` empty, 11 chars wide. Red when >75%, yellow otherwise
-- Context progress bar: `▓░` 10 chars wide. Green (50–74%), yellow (75–89%), red (90%+); hidden below 50%
-- Model label: `⚔️ Sonnet`, `⚔️ Sonnet·High`, `🏹 Haiku`, `🧙 Opus·Max`, etc. Effort appended only when explicitly set in settings.json (medium omitted — it's the default)
-- Always 2 lines when context data is available from Claude Code
-- statusLine config must be an object: `{type:"command", command:"...statusline.sh", padding:1}`
-### Hook installation
-`tokengolf install` patches `~/.claude/settings.json`. Uses `fs.realpathSync(process.argv[1])` to resolve npm link symlinks to real hook paths. Hook entries tagged with `_tg: true` for reliable dedup. Non-destructive statusLine install: wraps existing statusline if one is configured.
+Nine hooks in `hooks/`, installed via `tokengolf install`. All are synchronous JSON I/O (< 1s) except `session-end.js` (async imports, 30s timeout).
----
+| Hook | Stdin? | What it does |
+|------|--------|-------------|
+| `session-start.js` | No | Auto-creates run if none active; detects effort/fastMode; injects `additionalContext` with par budget |
+| `session-end.js` | Yes (`reason`) | Authoritative for cost/scorecard. Scans transcripts, saves run, renders ANSI scorecard. Death = spent > par |
+| `post-tool-use.js` | Yes (`tool_name`) | Tracks `toolCalls`; fires par warning at 80%+ |
+| `post-tool-use-failure.js` | Yes (`tool_name`) | Increments `failedToolCalls` |
+| `user-prompt-submit.js` | No | Increments `promptCount`; fires halfway nudge at 50% of par |
+| `pre-compact.js` | Yes (`trigger`, `context_window`) | Tracks compaction events for gear achievements |
+| `subagent-start.js` | Yes | Increments `subagentSpawns` |
+| `stop.js` | Yes | Increments `turnCount` |
+| `statusline.sh` | Yes (session JSON) | 2-line HUD with `██` accent bar, par-based progress bar. Also fixes model detection (writes real model back to current-run.json) |
-## Cost Detection (`src/lib/cost.js`)
+**Plugin distribution**: `plugin/` directory contains the Claude Code plugin scaffold — hooks.json, bundled scripts, slash commands. Build with `npm run build:plugin`. Uses `${CLAUDE_PLUGIN_ROOT}` for paths.
+**Hook installation (npm)**: `tokengolf install` resolves npm link symlinks via `fs.realpathSync(process.argv[1])`. Entries tagged `_tg: true` for dedup. Non-destructive statusLine install wraps existing config. Stamps `~/.tokengolf/installed-version` for auto-sync.
-`autoDetectCost(run)` is called by `session-end.js` and `tokengolf win/bust`. It:
-1. Parses `~/.claude/projects/<cwd>/` transcript files — all `.jsonl` files modified since `run.startedAt`
-2. Scans ALL files (not just the main session) — this captures subagent sidechain files where Haiku usage lives
-3. Also calls `parseThinkingFromTranscripts(paths)` to count thinking blocks and estimate tokens
-4. Returns `{ spent, modelBreakdown, thinkingInvocations, thinkingTokens }`
+**Auto-sync (npm)**: `session-start.js` checks `installed-version` vs `package.json` on every session start. On mismatch, updates all `_tg: true` hook paths and statusLine paths in `~/.claude/settings.json`, then stamps the new version.
-`process.cwd()` is used (not `run.cwd`) because the user always runs `tokengolf win` from their project directory.
+**StatusLine gotcha**: Uses `TG_SESSION_JSON=... python3 - "$STATE_FILE" <<'PYEOF'` pattern to avoid heredoc/stdin conflict. Config must be an object: `{type:"command", command:"...statusline.sh", padding:1}`.
-Thinking tokens are estimated from character count ÷ 4 (approximate — displayed with `~` prefix). Invocations = assistant turns containing at least one `{"type":"thinking"}` content block.
+**Model detection fix**: `session-start.js` defaults model to `claude-sonnet-4-6`. `statusline.sh` gets the real model from session JSON and writes it back to `current-run.json` if different. This ensures par rates use the correct model.
+**Emotion modes** (`tokengolf config emotions <mode>`): `emoji` (default) = mood emoji replaces `⛳` on line 1. `ascii` = adds 3rd line with kaomoji + emotion label. `off` = classic `⛳`/`💤`. Config stored in `~/.tokengolf/config.json`. Emotions are a multi-signal composite: par%, context%, failedToolCalls, promptCount.
 ---
-## Key Design Decisions
+## Cost Detection (`src/lib/cost.js`)
-1. **SessionEnd hook is authoritative for cost/scorecard** — SessionEnd fires on `/exit`, scans transcripts, saves run, and renders ANSI scorecard. `tokengolf win` is a manual override that still works. The Stop hook is also active but only for `turnCount` tracking — it does NOT include `total_cost_usd` so it cannot determine final cost.
+`autoDetectCost(run)` parses `~/.claude/projects/<cwd>/` transcript files modified since `run.startedAt`. Scans ALL `.jsonl` files (not just main session) to capture subagent sidechain files where Haiku usage lives. Same pass detects thinking blocks for ultrathink tracking.
-2. **Scan all transcripts for multi-model + ultrathink** — Claude Code creates separate `.jsonl` files for subagent sidechains (Haiku usage lives there). Same scan also picks up thinking blocks for ultrathink detection. One pass, all data.
+**Gotcha**: Uses `process.cwd()` not `run.cwd` — user always runs `tokengolf` from their project directory.
-3. **Floors are cosmetic** — Floor structure exists in the data model but isn't enforced. It's a UI element. Full roguelike floor mechanics with per-floor budgets are a future feature.
+---
-4. **Flow mode is automatic** — SessionStart hook creates a flow run if none exists. Any Claude Code session is tracked. Just `/exit` and the scorecard appears.
+## Key Design Decisions
-5. **Budget presets are model-calibrated** — `MODEL_BUDGET_TIERS` in score.js defines Diamond/Gold/Silver/Bronze amounts per model class. Wizard calls `getModelBudgets(model)` so Haiku sees $0.15/$0.40/$1.00/$2.50 and Opus sees $2.50/$7.50/$20.00/$50.00. Efficiency ratings (LEGENDARY/EFFICIENT/etc.) still derive as % of whatever budget was committed — no change there.
+1. **SessionEnd is authoritative for cost** — fires on `/exit`, scans transcripts, saves run, renders scorecard. Stop hook only tracks `turnCount`.
-6. **Ultrathink is natural language, not a slash command** — Writing `ultrathink` in a prompt triggers extended thinking mode. It's tracked via thinking blocks in transcripts, not via any hook. `thinkingInvocations === 0` on a won run = Silent Run achievement; on a died run with invocations > 0 = Hubris death mark.
+2. **Every session is a run** — SessionStart creates a run if none exists. Any Claude Code session is tracked automatically. No wizard, no upfront commitment.
-7. **Death marks fire before the early return** — `calculateAchievements` has an `if (!won) return []` early exit, but death marks (blowout, so_close, tool_happy, silent_death, fumble, expensive_taste, hubris) fire before it. `indecisive` (model switches) and `expensive_taste` also fire on won runs — they're behavior patterns, not death verdicts.
+3. **Par budget scales sublinearly with prompts** — `max(rate × sqrt(prompts), floor)`. Par grows slower than spending, creating increasing pressure. Early prompts have headroom; long sessions must be efficient or bust.
-8. **Design D: ██ block accent, no right borders** — All UI cards use a left-only `██` block accent bar instead of full box borders. This eliminates persistent right-border misalignment caused by emoji/unicode width calculation differences across terminals. Color-coded: yellow `██` for won, red `██` for died, gray `██` for neutral (stats, wizard). Ink components use a custom `borderStyle` object with `left: '██'` and `borderRight/Top/Bottom={false}`, `paddingLeft={3}`. session-end.js ANSI scorecard uses `██` prefix + `─` horizontal separators. No screenshots — README uses inline code block demos.
+4. **Death is cosmetic** — BUST (>100% of par) = `status: 'died'`, red accent, death achievements. It's a scoring signal, not a hard stop.
----
+5. **Ultrathink is natural language** — writing "ultrathink" in a prompt triggers extended thinking. Tracked via transcript parsing, not hooks.
-## Current Status: v0.4
-### Done
-- [x] Full project scaffold with esbuild pipeline
-- [x] All CLI commands wired up
-- [x] Ink components: StartRun, ActiveRun, ScoreCard, StatsView
-- [x] JSON persistence (state.js + store.js)
-- [x] Scoring logic (tiers, ratings, achievements, multi-model)
-- [x] 9 Claude Code hooks: SessionStart, PostToolUse, PostToolUseFailure, UserPromptSubmit, PreCompact, SessionEnd, SubagentStart, Stop, StatusLine
-- [x] `tokengolf install` hook installer with symlink resolution + statusLine config
-- [x] Auto cost detection from transcripts (`cost.js`) — multi-file, multi-model
-- [x] SessionEnd hook auto-displays ANSI scorecard on /exit; replaces dead Stop hook
-- [x] Flow mode auto-tracking (SessionStart creates run if none exists)
-- [x] Multi-model breakdown in ScoreCard
-- [x] Haiku efficiency achievements (Frugal, Rogue Run)
-- [x] Effort level wizard step (Low/Medium/High for Sonnet; +Max for Opus; Haiku skips)
-- [x] Fast mode auto-detection from settings.json; tracked in run state
-- [x] Fainted / rest mechanic (usage limit hit = fainted, run continues next session)
-- [x] Context window % in StatusLine HUD: 🪶/🎒/📦 with green/yellow/red
-- [x] PreCompact hook tracks manual vs auto compaction + context % for gear achievements
-- [x] Multi-session tracking (sessionCount increments on each SessionStart)
-- [x] Model-aware budget presets in wizard (MODEL_BUDGET_TIERS, getModelBudgets)
-- [x] Ultrathink detection from transcripts (thinkingInvocations, thinkingTokens)
-- [x] 5 ultrathink achievements including Hubris death mark
-- [x] Paladin (⚜️ opusplan) character class with model-aware budgets and statusline support
-- [x] 28 new achievements: prompting skill, tool mastery, cost/prompt, time, subagents, turn discipline, death marks
-- [x] 3 new hooks: PostToolUseFailure, SubagentStart, Stop
-- [x] Vitest test suite — 120 tests covering all achievements + pure score functions
-- [x] Design D block accent UI — ██ left bar, no right borders, color-coded state
-- [x] Landing page terminal demos updated to ██ style
-- [x] README inline code block demos (replaced PNG screenshots)
-### Next up (v0.4)
-- [ ] `tokengolf floor` command to advance floor manually
-- [ ] Roguelike floor mechanics with per-floor sub-budgets
-- [ ] Leaderboard / shareable run URLs
-- [ ] Team mode (shared `runs.json` via git)
+6. **Design D: `██` block accent, no right borders** — eliminates emoji/unicode width misalignment across terminals. Yellow = won, red = died, gray = neutral. Ink: custom `borderStyle` with `borderRight/Top/Bottom={false}`, `paddingLeft={3}`. ANSI scorecard: `██` prefix + `─` separators.
 ---
 ## Working in This Repo
-When making changes:
 - Keep it ESM (`import/export`, no `require`)
 - Ink components are functional React — hooks only, no classes
-- State mutations always go through `state.js` and `store.js` — never write to `~/.tokengolf/` directly from components
-- Hooks must be fast (< 1s) — no async, no network, JSON file I/O only
-- **Always run `npm run build` after source changes**
-- **Run `npm test` after score.js changes** — 83 tests catch achievement regressions
+- State mutations go through `state.js` and `store.js` — never write to `~/.tokengolf/` directly
+- Hooks must be fast (< 1s), sync, end with `process.exit(0)` (except session-end.js)
+- Hooks run in a separate process with no access to shell env vars
 - Test hooks standalone: `echo '{"tool_name":"Read"}' | node hooks/post-tool-use.js`
-- Remember hooks run in a separate process with no access to shell env vars
-- Always `process.exit(0)` at the end of hooks
+- **Always `npm run build` after source changes**
+- **Always `npm test` after score.js changes**
 When adding a new CLI command:
-1. Add it to `src/cli.js`
-2. Add a component in `src/components/` if it needs a TUI
-3. Document it in this file and README.md
+1. Add to `src/cli.js`
+2. Add component in `src/components/` if it needs a TUI
+3. Update this file and README.md