tokengolf 0.5.3 → 1.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (45) hide show
  1. package/CHANGELOG.md +1 -0
  2. package/CLAUDE.md +90 -390
  3. package/README.md +147 -126
  4. package/dist/cli.js +636 -1469
  5. package/docs/assets/banner.svg +16 -13
  6. package/docs/design-preview.html +1112 -0
  7. package/docs/index.html +656 -358
  8. package/hooks/post-tool-use.js +35 -4
  9. package/hooks/session-end.js +8 -152
  10. package/hooks/session-start.js +92 -14
  11. package/hooks/statusline.sh +132 -48
  12. package/hooks/user-prompt-submit.js +34 -4
  13. package/install.sh +1 -1
  14. package/package.json +3 -2
  15. package/plugin/.claude-plugin/plugin.json +6 -0
  16. package/plugin/README.md +30 -0
  17. package/plugin/commands/config.md +12 -0
  18. package/plugin/commands/scorecard.md +12 -0
  19. package/plugin/commands/stats.md +12 -0
  20. package/plugin/hooks/hooks.json +94 -0
  21. package/plugin/scripts/post-tool-use-failure.js +27 -0
  22. package/plugin/scripts/post-tool-use.js +78 -0
  23. package/plugin/scripts/pre-compact.js +41 -0
  24. package/plugin/scripts/session-end.js +113 -0
  25. package/plugin/scripts/session-start.js +198 -0
  26. package/plugin/scripts/statusline.sh +181 -0
  27. package/plugin/scripts/stop.js +27 -0
  28. package/plugin/scripts/subagent-start.js +27 -0
  29. package/plugin/scripts/user-prompt-submit.js +61 -0
  30. package/scripts/sync-plugin-version.sh +22 -0
  31. package/src/cli.js +93 -98
  32. package/src/components/ScoreCard.js +28 -37
  33. package/src/components/StatsView.js +26 -27
  34. package/src/lib/__tests__/score.test.js +220 -94
  35. package/src/lib/ansi-scorecard.js +162 -0
  36. package/src/lib/config.js +47 -0
  37. package/src/lib/demo-ansi-scorecard.js +29 -0
  38. package/src/lib/demo-fixtures.js +35 -205
  39. package/src/lib/demo.js +143 -98
  40. package/src/lib/install.js +47 -16
  41. package/src/lib/score.js +95 -69
  42. package/src/lib/store.js +2 -2
  43. package/src/components/ActiveRun.js +0 -141
  44. package/src/components/StartRun.js +0 -278
  45. package/src/lib/demo-active.js +0 -56
package/CHANGELOG.md CHANGED
@@ -7,6 +7,7 @@ TokenGolf patch notes — what changed, what it measures, and why the mechanic e
7
7
  ## [Unreleased]
8
8
 
9
9
  ### Changed
10
+ - **Sublinear par scaling (sqrt)** — Par formula changed from `prompts × rate` to `rate × sqrt(prompts)`. Early prompts have headroom for exploration; pressure builds as the session goes on. Long wasteful sessions bust. Rates recalibrated: Haiku $0.55, Sonnet $7.00, Paladin $22.00, Opus $45.00. Floors unchanged. All models bust around 20 prompts at typical per-prompt spend.
10
11
  - **Design D HUD** — StatusLine HUD redesigned with `██` accent bar, inline `▓░` progress bars for budget and context, no separator lines. 1 line when context <50%, 2 lines when context visible. Accent bar turns red when budget >75%. Matches Design D across all UI surfaces.
11
12
  - **Design D block accent UI** — All bordered boxes replaced with left-only `██` block accent bars. Eliminates persistent right-border misalignment caused by emoji/unicode width differences across terminals. Color-coded: yellow for won, red for died, gray for neutral.
12
13
  - ScoreCard, StatsView, ActiveRun, StartRun components all use custom Ink `borderStyle` with `left: '██'`, no right/top/bottom borders
package/CLAUDE.md CHANGED
@@ -1,454 +1,154 @@
1
1
  # TokenGolf — CLAUDE.md
2
2
 
3
- You are working on **TokenGolf**, a CLI game that gamifies Claude Code sessions by turning token/dollar efficiency into a score. This is the primary project context file. Read this fully before doing anything.
3
+ A CLI game that gamifies Claude Code sessions by turning token/dollar efficiency into a score. Node.js, ESM, Ink v5 TUI, Commander.js, JSON persistence in `~/.tokengolf/`.
4
4
 
5
5
  ---
6
6
 
7
- ## What TokenGolf Is
7
+ ## One Mode — Every Session Is a Roguelike Run
8
8
 
9
- A Node.js CLI tool that wraps Claude Code sessions with game mechanics. Users declare a quest ("implement pagination for /users"), set a budget ($0.30), pick a model class, then work in Claude Code normally. At the end, they get a score based on how efficiently they used their budget.
9
+ Every Claude Code session is automatically tracked. No wizard, no upfront budget commitment. The budget (par) scales dynamically with session activity.
10
10
 
11
- **Core insight**: Claude Code already exposes session cost data. TokenGolf adds the game layer the meaning, the stakes, the achievement system on top of data that already exists.
12
-
13
- **Tagline**: *"Flow mode tracks you. Roguelike mode trains you."*
11
+ **Par budget** = `max(rate × sqrt(prompts), model_floor)`. Par grows sublinearly early prompts have headroom, pressure mounts over time. Efficient prompts beat par; wasteful prompts fall behind. BUST (>100% of par) = `status: 'died'`, red accent, death achievements.
14
12
 
15
13
  ---
16
14
 
17
- ## Two Modes
18
-
19
- ### Flow Mode
20
- - Passive. No interruption. Just runs in the background.
21
- - SessionStart hook auto-creates a flow run if none is active.
22
- - Post-session: `tokengolf win` shows score + achievements with no pre-configuration.
23
- - For people in flow state who don't want friction.
24
-
25
- ### Roguelike Mode
26
- - Intentional. Pre-commitment before session starts.
27
- - Declare quest + budget + model class = a "run" with real stakes.
28
- - Budget bust = permadeath. Run logged as a death.
29
- - Floor structure: Write code → Write tests → Fix tests → Code review → PR merged (BOSS)
30
- - For deliberate practice. Trains prompting skills.
31
-
32
- **Relationship**: Same engine, same data, same achievement system. Roguelike practice makes Flow sessions better over time. That's the meta loop.
33
-
34
- ---
15
+ ## Installation
35
16
 
36
- ## Game Mechanics
37
-
38
- ### Model as Character Class
39
- | Class | Model | Difficulty | Feel |
40
- |-------|-------|------------|------|
41
- | 🏹 Rogue | Haiku | Nightmare | Glass cannon. Must prompt precisely. |
42
- | ⚔️ Fighter | Sonnet | Standard | Balanced. The default run. |
43
- | 🧙 Warlock | Opus | Casual | Powerful but expensive. |
44
- | ⚜️ Paladin | Opus (plan mode) | Tactical | Strategic planner. Thinks before acting. |
45
-
46
- ### Budget Tiers
47
- | Tier | Spend | Emoji |
48
- |------|-------|-------|
49
- | Diamond | < $0.10 | 💎 |
50
- | Gold | < $0.30 | 🥇 |
51
- | Silver | < $1.00 | 🥈 |
52
- | Bronze | < $3.00 | 🥉 |
53
- | Reckless | > $3.00 | 💸 |
54
-
55
- ### Efficiency Ratings
56
- | Rating | Budget Used | Color |
57
- |--------|------------|-------|
58
- | LEGENDARY | < 25% | magenta |
59
- | EFFICIENT | < 50% | cyan |
60
- | SOLID | < 75% | green |
61
- | CLOSE CALL | < 100% | yellow |
62
- | BUSTED | > 100% | red |
63
-
64
- ### Achievements
65
-
66
- **Class Medals**
67
- - 🥇 Gold — Completed with Haiku
68
- - 💎 Diamond — Haiku under $0.10
69
- - 🥈 Silver — Completed with Sonnet
70
- - ⚜️ Paladin — Completed as Paladin (Opus plan mode)
71
- - ♟️ Grand Strategist — LEGENDARY efficiency as Paladin
72
- - 🥉 Bronze — Completed with Opus
73
-
74
- **Budget Efficiency**
75
- - 🎯 Sniper — Under 25% of budget used
76
- - ⚡ Efficient — Under 50% of budget used
77
- - 🪙 Penny Pincher — Total spend under $0.10
78
-
79
- **Effort-Based**
80
- - 🏎️ Speedrunner — Low effort, completed under budget
81
- - 🏋️ Tryhard — High/Max effort, LEGENDARY efficiency
82
- - 👑 Archmagus — Opus at max effort, completed
83
-
84
- **Fast Mode (Opus-only)**
85
- - ⛈️ Lightning Run — Opus fast mode, completed under budget
86
- - 🎰 Daredevil — Opus fast mode, LEGENDARY efficiency
87
-
88
- **Sessions**
89
- - 🔥 No Rest for the Wicked — Completed in one session
90
- - 🏕️ Made Camp — Completed across multiple sessions
91
- - 🧟 Came Back — Fainted and finished anyway
92
-
93
- **Gear (Compaction)**
94
- - 📦 Overencumbered — Context auto-compacted during run
95
- - 🥷 Ghost Run — Manual compact at ≤30% context
96
- - 🪶 Ultralight — Manual compact at 31–40% context
97
- - 🎒 Traveling Light — Manual compact at 41–50% context
98
-
99
- **Ultrathink**
100
- - 🔮 Spell Cast — Used extended thinking (won)
101
- - 🧮 Calculated Risk — Ultrathink + LEGENDARY efficiency
102
- - 🌀 Deep Thinker — ≥3 ultrathink invocations, completed
103
- - 🤫 Silent Run — No extended thinking, SOLID or better, completed
104
-
105
- **Paladin Planning Ratio**
106
- - 🏛️ Architect — Opus handled >60% of cost (heavy planner)
107
- - 💨 Blitz — Opus handled <25% of cost (light plan, fast execution)
108
- - ⚖️ Equilibrium — Opus/Sonnet balanced at 40–60%
109
-
110
- **Model Loyalty (non-Paladin)**
111
- - 🔷 Purist — Single model family throughout
112
- - 🦎 Chameleon — Multiple model families used, under budget
113
- - 🔀 Tactical Switch — Exactly 1 model switch, under budget
114
- - 🔒 Committed — No switches, one model family
115
- - ⚠️ Class Defection — Declared one class but cost skewed to another
116
-
117
- **Haiku Efficiency**
118
- - 🏹 Frugal — Haiku handled ≥50% of session cost
119
- - 🎲 Rogue Run — Haiku handled ≥75% of session cost
120
-
121
- **Prompting Skill**
122
- - 🥊 One Shot — Completed in a single prompt
123
- - 💬 Conversationalist — ≥20 prompts
124
- - 🤐 Terse — ≤3 prompts, ≥10 tool calls
125
- - 🪑 Backseat Driver — ≥15 prompts, <1 tool call per prompt
126
- - 🏗️ High Leverage — ≥5 tools per prompt (≥2 prompts)
127
-
128
- **Tool Mastery**
129
- - 👁️ Read Only — No Edit or Write calls (≥1 Read)
130
- - ✏️ Editor — ≥10 Edit calls
131
- - 🐚 Bash Warrior — ≥10 Bash calls, ≥50% of tool usage
132
- - 🔍 Scout — ≥60% Read calls (≥5 total)
133
- - 🔪 Surgeon — 1–3 Edit calls, completed under budget
134
- - 🧰 Toolbox — ≥5 distinct tool types used
135
-
136
- **Cost per Prompt**
137
- - 💲 Cheap Shots — Under $0.01 per prompt (≥3 prompts)
138
- - 🍷 Expensive Taste — Over $0.50 per prompt (≥3 prompts; also a death mark)
139
-
140
- **Time**
141
- - ⏱️ Speedrun — Completed in under 5 minutes
142
- - 🏃 Marathon — Session 60–180 minutes
143
- - 🫠 Endurance — Session over 3 hours
144
-
145
- **Tool Reliability**
146
- - ✅ Clean Run — Zero failed tool calls (≥5 total tool uses)
147
- - 🐂 Stubborn — ≥10 failed tool calls, still won
148
-
149
- **Subagents**
150
- - 🐺 Lone Wolf — No subagents spawned
151
- - 📡 Summoner — ≥5 subagents spawned
152
- - 🪖 Army of One — ≥10 subagents, under 50% budget used
153
-
154
- **Turn Discipline**
155
- - 🤖 Agentic — ≥3 Claude turns per user prompt
156
- - 🐕 Obedient — Exactly 1 turn per prompt (≥3 prompts)
157
-
158
- **Death Marks** *(fire before won-only cutoff; some also fire on won runs)*
159
- - 🎲 Indecisive — ≥3 model switches *(won or died)*
160
- - 🤦 Hubris — Used ultrathink, busted anyway
161
- - 💥 Blowout — Spent ≥2× budget
162
- - 😭 So Close — Died within 10% of budget
163
- - 🔨 Tool Happy — Died with ≥30 tool calls
164
- - 🪦 Silent Death — Died with ≤2 prompts
165
- - 🤡 Fumble — Died with ≥5 failed tool calls
166
- - 🍷 Expensive Taste — Over $0.50/prompt *(won or died)*
17
+ **Plugin (recommended)** — one step, auto-updates:
18
+ ```
19
+ claude plugin install tokengolf
20
+ ```
167
21
 
168
- ---
22
+ **npm (alternative)** — requires manual hook setup:
23
+ ```
24
+ npm install -g tokengolf && tokengolf install
25
+ ```
169
26
 
170
- ## Tech Stack
27
+ npm users get auto-sync: hooks update automatically on version change via session-start.js.
171
28
 
172
- - **Runtime**: Node.js (ESM, `"type": "module"`)
173
- - **Build**: esbuild (JSX transform, `npm run build` → `dist/cli.js`)
174
- - **TUI**: [Ink v5](https://github.com/vadimdemedes/ink) + [@inkjs/ui v2](https://github.com/vadimdemedes/ink-ui)
175
- - **CLI parsing**: Commander.js
176
- - **Persistence**: JSON files in `~/.tokengolf/` (no native deps, zero compilation)
177
- - **Claude Code integration**: Hooks via `~/.claude/settings.json`
178
- - **Testing**: Vitest (ESM-native, `npm test`)
179
- - **Language**: JavaScript (no TypeScript — keep it simple)
29
+ ## Commands
180
30
 
181
- ### Build pipeline
182
- Source is JSX (`src/`) → esbuild bundles to `dist/cli.js`. The `bin` in `package.json` points to `dist/cli.js`. Run `npm run build` after any source change. `prepare` runs build automatically on `npm link`/`npm install`.
31
+ `npm run build` after source changes. `npm test` after score.js changes. `npm run lint` / `npm run format` for code quality. Husky pre-commit hook runs automatically.
183
32
 
184
33
  **Do not test with `node src/cli.js`** — use `node dist/cli.js` or `tokengolf` after `npm link`.
185
34
 
186
- ### Why JSON not SQLite
187
- `better-sqlite3` requires native compilation which causes install failures. JSON files in `~/.tokengolf/` are sufficient for the data volume (hundreds of runs max) and have zero friction.
35
+ | Command | Description |
36
+ |---------|-------------|
37
+ | `tokengolf scorecard` | Show last run's score card |
38
+ | `tokengolf stats` | Career stats dashboard |
39
+ | `tokengolf demo [component]` | Show UI demos (all, hud, scorecard, stats) |
40
+ | `tokengolf config` | List all config values |
41
+ | `tokengolf config emotions [mode]` | Get/set emotion mode (`off`, `emoji`, `ascii`) |
42
+ | `tokengolf config par [model] [rate]` | View/set par rates per model, or `reset` |
43
+ | `tokengolf config floor [model] [value]` | View/set par floors per model, or `reset` |
44
+ | `tokengolf install` | Patch `~/.claude/settings.json` with hooks |
188
45
 
189
46
  ---
190
47
 
191
- ## Project Structure
48
+ ## Par Budget System
49
+
50
+ Par = the expected cost for a session, scaled sublinearly by prompts and model.
192
51
 
193
52
  ```
194
- tokengolf/
195
- ├── src/
196
- │ ├── cli.js # Main entrypoint, all commands
197
- │ ├── components/
198
- │ │ ├── StartRun.js # Quest declaration wizard
199
- │ │ ├── ActiveRun.js # Live run status display
200
- │ │ ├── ScoreCard.js # End-of-run screen (win/death)
201
- │ │ └── StatsView.js # Career stats dashboard
202
- │ └── lib/
203
- │ ├── state.js # Read/write ~/.tokengolf/current-run.json
204
- │ ├── store.js # Read/write ~/.tokengolf/runs.json
205
- │ ├── score.js # Tiers, ratings, model classes, achievements
206
- │ ├── cost.js # Auto-detect cost from ~/.claude/ transcripts
207
- │ ├── install.js # Patches ~/.claude/settings.json with hooks
208
- │ └── __tests__/
209
- │ └── score.test.js # Vitest: 120 tests covering achievements + pure functions
210
- ├── hooks/
211
- │ ├── session-start.js # Injects run context; auto-creates flow run
212
- │ ├── session-end.js # Captures cost on /exit; saves run; renders scorecard
213
- │ ├── post-tool-use.js # Tracks tool calls, fires budget warnings
214
- │ ├── post-tool-use-failure.js # Tracks failedToolCalls
215
- │ ├── user-prompt-submit.js # Counts prompts, fires 50% nudge
216
- │ ├── pre-compact.js # Tracks compaction events for gear achievements
217
- │ ├── subagent-start.js # Tracks subagentSpawns
218
- │ ├── stop.js # Tracks turnCount
219
- │ └── statusline.sh # Bash HUD shown in Claude Code statusline
220
- ├── dist/
221
- │ └── cli.js # Built output (gitignored? check .gitignore)
222
- ├── CLAUDE.md # This file
223
- ├── package.json
224
- └── README.md
53
+ par = max(rate × sqrt(prompts), model_floor)
54
+ efficiency = actual_cost / par
225
55
  ```
226
56
 
227
- ---
57
+ The sqrt scaling creates increasing pressure: early prompts have headroom for exploration, but long sessions must be increasingly efficient to stay under par.
228
58
 
229
- ## State Files (in `~/.tokengolf/`)
230
-
231
- ### `current-run.json`
232
- Active run state. Written by `tokengolf start` or auto-created by SessionStart hook (flow mode). Cleared on `tokengolf win` or `tokengolf bust`.
233
-
234
- ```json
235
- {
236
- "id": "run_1741345200000",
237
- "quest": "implement pagination for /users",
238
- "model": "claude-sonnet-4-6",
239
- "budget": 0.30,
240
- "spent": 0.11,
241
- "status": "active",
242
- "mode": "roguelike",
243
- "floor": 2,
244
- "totalFloors": 5,
245
- "promptCount": 8,
246
- "totalToolCalls": 14,
247
- "toolCalls": { "Read": 6, "Edit": 4, "Bash": 4 },
248
- "sessionId": "abc123",
249
- "cwd": "/Users/me/projects/my-app",
250
- "sessionCount": 1,
251
- "fainted": false,
252
- "compactionEvents": [],
253
- "thinkingInvocations": 0,
254
- "thinkingTokens": 0,
255
- "failedToolCalls": 0,
256
- "subagentSpawns": 2,
257
- "turnCount": 12,
258
- "startedAt": "2026-03-07T10:00:00Z"
259
- }
260
- ```
59
+ ### Model Par Rates
261
60
 
262
- Flow mode runs have `"quest": null, "budget": null, "mode": "flow"`.
263
-
264
- ### `runs.json`
265
- Array of all completed runs. Append-only.
266
-
267
- ```json
268
- [
269
- {
270
- "id": "run_1741345200000",
271
- "quest": "...",
272
- "status": "won",
273
- "spent": 0.18,
274
- "budget": 0.30,
275
- "model": "claude-sonnet-4-6",
276
- "modelBreakdown": { "claude-sonnet-4-6": 0.15, "claude-haiku-4-5-20251001": 0.03 },
277
- "achievements": [...],
278
- "startedAt": "...",
279
- "endedAt": "..."
280
- }
281
- ]
282
- ```
61
+ | Model | Par Rate | Floor | Rationale |
62
+ |-------|----------|-------|-----------|
63
+ | Haiku | $0.15 | $0.10 | Calibrated from API pricing ratios |
64
+ | Sonnet | $1.50 | $0.75 | Heavy sessions (~$0.35/prompt) bust around 20 prompts |
65
+ | Paladin | $4.50 | $2.00 | Opus planning + Sonnet execution blend |
66
+ | Opus | $8.00 | $3.00 | Calibrated from actual session data (~$1.72/prompt heavy) |
67
+
68
+ Constants: `MODEL_PAR_RATES`, `MODEL_PAR_FLOORS`, `getParBudget()` in `src/lib/score.js`.
69
+
70
+ The floor prevents 1-prompt agentic sessions from being instant BUST.
283
71
 
284
72
  ---
285
73
 
286
- ## CLI Commands
74
+ ## Scoring & Achievements
287
75
 
288
- | Command | Description |
289
- |---------|-------------|
290
- | `tokengolf start` | Declare quest, model, budget — begin a roguelike run |
291
- | `tokengolf status` | Show live status of current run |
292
- | `tokengolf win` | Complete current run (auto-detects cost from transcripts) |
293
- | `tokengolf win --spent 0.18` | Complete with manually specified cost |
294
- | `tokengolf bust` | Mark run as budget busted (permadeath) |
295
- | `tokengolf scorecard` | Show last run's score card |
296
- | `tokengolf stats` | Career stats dashboard |
297
- | `tokengolf install` | Patch `~/.claude/settings.json` with hooks |
76
+ All tiers, ratings, model classes, par rates, and achievements are defined in `src/lib/score.js`. Read that file for the full catalog — don't duplicate it here.
77
+
78
+ Key concepts:
79
+ - **Model classes**: Rogue (Haiku), Fighter (Sonnet), Warlock (Opus), Paladin (Opus plan mode)
80
+ - **Spend tiers**: absolute $ thresholds, model-calibrated (`MODEL_BUDGET_TIERS` / `getModelBudgets()`)
81
+ - **Efficiency ratings**: LEGENDARY (<15%) EPIC (<30%) PRO SOLID → CLOSE CALL → BUST (>100%), computed against dynamic par
82
+ - **Death marks fire before the early return** in `calculateAchievements` they're checked before `if (!won) return []`. `indecisive` and `expensive_taste` also fire on won runs.
298
83
 
299
84
  ---
300
85
 
301
86
  ## Claude Code Hooks
302
87
 
303
- Nine hooks in `hooks/` directory, installed via `tokengolf install`. Most complete in < 5s (synchronous JSON I/O). `session-end.js` uses async dynamic imports with a 30s timeout.
304
-
305
- ### `SessionStart` (`session-start.js`)
306
- - Does NOT read stdin (SessionStart doesn't pipe data)
307
- - Reads `current-run.json`; if no active run, auto-creates a flow mode run
308
- - Auto-detects `effort` from env var or `~/.claude/settings.json`; auto-detects `fastMode` from settings.json
309
- - Increments `sessionCount` on existing runs
310
- - Outputs `additionalContext` injected into Claude's conversation
311
-
312
- ### `PostToolUse` (`post-tool-use.js`)
313
- - Reads stdin (event JSON with `tool_name`)
314
- - Updates `toolCalls` count in `current-run.json`
315
- - At 80%+ budget: outputs `systemMessage` warning to Claude
316
-
317
- ### `UserPromptSubmit` (`user-prompt-submit.js`)
318
- - Increments `promptCount`
319
- - At 50% budget: injects halfway nudge as `additionalContext`
320
-
321
- ### `PreCompact` (`pre-compact.js`)
322
- - Reads stdin (compact event JSON with `trigger` and `context_window.used_percentage`)
323
- - Appends to `compactionEvents` array in `current-run.json`
324
- - Powers gear achievements (Ghost Run, Ultralight, Traveling Light, Overencumbered)
325
-
326
- ### `SessionEnd` (`session-end.js`)
327
- - Reads stdin for `reason` field (detects Fainted if reason is `'other'`)
328
- - Calls `autoDetectCost(run)` — returns spent, modelBreakdown, thinkingInvocations, thinkingTokens
329
- - Resting runs: updates state with fainted:true, does NOT clear — run continues next session
330
- - Won/died runs: calls `saveRun()` (which runs `calculateAchievements()`), clears state, renders ANSI scorecard
331
-
332
- ### `PostToolUseFailure` (`post-tool-use-failure.js`)
333
- - Reads stdin (event JSON with `tool_name` and error info)
334
- - Increments `failedToolCalls` in `current-run.json`
335
- - Powers Fumble death mark (≥5 failed tool calls)
336
-
337
- ### `SubagentStart` (`subagent-start.js`)
338
- - Reads stdin (subagent event JSON)
339
- - Increments `subagentSpawns` in `current-run.json`
340
- - Powers Lone Wolf / Summoner / Army of One achievements
341
-
342
- ### `Stop` (`stop.js`)
343
- - Reads stdin for turn data
344
- - Increments `turnCount` in `current-run.json`
345
- - Powers Agentic / Obedient turn discipline achievements
346
-
347
- ### `StatusLine` (`statusline.sh`)
348
- - Bash script; uses `TG_SESSION_JSON=... python3 - "$STATE_FILE" <<'PYEOF'` pattern to avoid heredoc/stdin conflict
349
- - Receives live session JSON (cost, context %, model) via stdin
350
- - **Design D accent bar**: `██` prefix on each line, color-coded (yellow normal, red when budget >75%)
351
- - Line 1: `██ ⛳ quest $cost/budget ▓▓▓░░░ pct% RATING model F1/5`
352
- - Line 2 (always shown when context data available): `██ 🧠 ▓▓▓▓░░░ ctx% 🪶/🎒/📦`
353
- - Budget progress bar: `▓` filled, `░` empty, 11 chars wide. Red when >75%, yellow otherwise
354
- - Context progress bar: `▓░` 10 chars wide. Green (50–74%), yellow (75–89%), red (90%+); hidden below 50%
355
- - Model label: `⚔️ Sonnet`, `⚔️ Sonnet·High`, `🏹 Haiku`, `🧙 Opus·Max`, etc. Effort appended only when explicitly set in settings.json (medium omitted — it's the default)
356
- - Always 2 lines when context data is available from Claude Code
357
- - statusLine config must be an object: `{type:"command", command:"...statusline.sh", padding:1}`
358
-
359
- ### Hook installation
360
- `tokengolf install` patches `~/.claude/settings.json`. Uses `fs.realpathSync(process.argv[1])` to resolve npm link symlinks to real hook paths. Hook entries tagged with `_tg: true` for reliable dedup. Non-destructive statusLine install: wraps existing statusline if one is configured.
88
+ Nine hooks in `hooks/`, installed via `tokengolf install`. All are synchronous JSON I/O (< 1s) except `session-end.js` (async imports, 30s timeout).
361
89
 
362
- ---
90
+ | Hook | Stdin? | What it does |
91
+ |------|--------|-------------|
92
+ | `session-start.js` | No | Auto-creates run if none active; detects effort/fastMode; injects `additionalContext` with par budget |
93
+ | `session-end.js` | Yes (`reason`) | Authoritative for cost/scorecard. Scans transcripts, saves run, renders ANSI scorecard. Death = spent > par |
94
+ | `post-tool-use.js` | Yes (`tool_name`) | Tracks `toolCalls`; fires par warning at 80%+ |
95
+ | `post-tool-use-failure.js` | Yes (`tool_name`) | Increments `failedToolCalls` |
96
+ | `user-prompt-submit.js` | No | Increments `promptCount`; fires halfway nudge at 50% of par |
97
+ | `pre-compact.js` | Yes (`trigger`, `context_window`) | Tracks compaction events for gear achievements |
98
+ | `subagent-start.js` | Yes | Increments `subagentSpawns` |
99
+ | `stop.js` | Yes | Increments `turnCount` |
100
+ | `statusline.sh` | Yes (session JSON) | 2-line HUD with `██` accent bar, par-based progress bar. Also fixes model detection (writes real model back to current-run.json) |
363
101
 
364
- ## Cost Detection (`src/lib/cost.js`)
102
+ **Plugin distribution**: `plugin/` directory contains the Claude Code plugin scaffold — hooks.json, bundled scripts, slash commands. Build with `npm run build:plugin`. Uses `${CLAUDE_PLUGIN_ROOT}` for paths.
103
+
104
+ **Hook installation (npm)**: `tokengolf install` resolves npm link symlinks via `fs.realpathSync(process.argv[1])`. Entries tagged `_tg: true` for dedup. Non-destructive statusLine install wraps existing config. Stamps `~/.tokengolf/installed-version` for auto-sync.
365
105
 
366
- `autoDetectCost(run)` is called by `session-end.js` and `tokengolf win/bust`. It:
367
- 1. Parses `~/.claude/projects/<cwd>/` transcript files — all `.jsonl` files modified since `run.startedAt`
368
- 2. Scans ALL files (not just the main session) — this captures subagent sidechain files where Haiku usage lives
369
- 3. Also calls `parseThinkingFromTranscripts(paths)` to count thinking blocks and estimate tokens
370
- 4. Returns `{ spent, modelBreakdown, thinkingInvocations, thinkingTokens }`
106
+ **Auto-sync (npm)**: `session-start.js` checks `installed-version` vs `package.json` on every session start. On mismatch, updates all `_tg: true` hook paths and statusLine paths in `~/.claude/settings.json`, then stamps the new version.
371
107
 
372
- `process.cwd()` is used (not `run.cwd`) because the user always runs `tokengolf win` from their project directory.
108
+ **StatusLine gotcha**: Uses `TG_SESSION_JSON=... python3 - "$STATE_FILE" <<'PYEOF'` pattern to avoid heredoc/stdin conflict. Config must be an object: `{type:"command", command:"...statusline.sh", padding:1}`.
373
109
 
374
- Thinking tokens are estimated from character count ÷ 4 (approximate displayed with `~` prefix). Invocations = assistant turns containing at least one `{"type":"thinking"}` content block.
110
+ **Model detection fix**: `session-start.js` defaults model to `claude-sonnet-4-6`. `statusline.sh` gets the real model from session JSON and writes it back to `current-run.json` if different. This ensures par rates use the correct model.
111
+
112
+ **Emotion modes** (`tokengolf config emotions <mode>`): `emoji` (default) = mood emoji replaces `⛳` on line 1. `ascii` = adds 3rd line with kaomoji + emotion label. `off` = classic `⛳`/`💤`. Config stored in `~/.tokengolf/config.json`. Emotions are a multi-signal composite: par%, context%, failedToolCalls, promptCount.
375
113
 
376
114
  ---
377
115
 
378
- ## Key Design Decisions
116
+ ## Cost Detection (`src/lib/cost.js`)
379
117
 
380
- 1. **SessionEnd hook is authoritative for cost/scorecard** SessionEnd fires on `/exit`, scans transcripts, saves run, and renders ANSI scorecard. `tokengolf win` is a manual override that still works. The Stop hook is also active but only for `turnCount` tracking — it does NOT include `total_cost_usd` so it cannot determine final cost.
118
+ `autoDetectCost(run)` parses `~/.claude/projects/<cwd>/` transcript files modified since `run.startedAt`. Scans ALL `.jsonl` files (not just main session) to capture subagent sidechain files where Haiku usage lives. Same pass detects thinking blocks for ultrathink tracking.
381
119
 
382
- 2. **Scan all transcripts for multi-model + ultrathink** — Claude Code creates separate `.jsonl` files for subagent sidechains (Haiku usage lives there). Same scan also picks up thinking blocks for ultrathink detection. One pass, all data.
120
+ **Gotcha**: Uses `process.cwd()` not `run.cwd` user always runs `tokengolf` from their project directory.
383
121
 
384
- 3. **Floors are cosmetic** — Floor structure exists in the data model but isn't enforced. It's a UI element. Full roguelike floor mechanics with per-floor budgets are a future feature.
122
+ ---
385
123
 
386
- 4. **Flow mode is automatic** — SessionStart hook creates a flow run if none exists. Any Claude Code session is tracked. Just `/exit` and the scorecard appears.
124
+ ## Key Design Decisions
387
125
 
388
- 5. **Budget presets are model-calibrated** — `MODEL_BUDGET_TIERS` in score.js defines Diamond/Gold/Silver/Bronze amounts per model class. Wizard calls `getModelBudgets(model)` so Haiku sees $0.15/$0.40/$1.00/$2.50 and Opus sees $2.50/$7.50/$20.00/$50.00. Efficiency ratings (LEGENDARY/EFFICIENT/etc.) still derive as % of whatever budget was committed — no change there.
126
+ 1. **SessionEnd is authoritative for cost** — fires on `/exit`, scans transcripts, saves run, renders scorecard. Stop hook only tracks `turnCount`.
389
127
 
390
- 6. **Ultrathink is natural language, not a slash command** — Writing `ultrathink` in a prompt triggers extended thinking mode. It's tracked via thinking blocks in transcripts, not via any hook. `thinkingInvocations === 0` on a won run = Silent Run achievement; on a died run with invocations > 0 = Hubris death mark.
128
+ 2. **Every session is a run** — SessionStart creates a run if none exists. Any Claude Code session is tracked automatically. No wizard, no upfront commitment.
391
129
 
392
- 7. **Death marks fire before the early return** — `calculateAchievements` has an `if (!won) return []` early exit, but death marks (blowout, so_close, tool_happy, silent_death, fumble, expensive_taste, hubris) fire before it. `indecisive` (model switches) and `expensive_taste` also fire on won runs — they're behavior patterns, not death verdicts.
130
+ 3. **Par budget scales sublinearly with prompts** — `max(rate × sqrt(prompts), floor)`. Par grows slower than spending, creating increasing pressure. Early prompts have headroom; long sessions must be efficient or bust.
393
131
 
394
- 8. **Design D: ██ block accent, no right borders** — All UI cards use a left-only `██` block accent bar instead of full box borders. This eliminates persistent right-border misalignment caused by emoji/unicode width calculation differences across terminals. Color-coded: yellow `██` for won, red `██` for died, gray `██` for neutral (stats, wizard). Ink components use a custom `borderStyle` object with `left: '██'` and `borderRight/Top/Bottom={false}`, `paddingLeft={3}`. session-end.js ANSI scorecard uses `██` prefix + `─` horizontal separators. No screenshots README uses inline code block demos.
132
+ 4. **Death is cosmetic** — BUST (>100% of par) = `status: 'died'`, red accent, death achievements. It's a scoring signal, not a hard stop.
395
133
 
396
- ---
134
+ 5. **Ultrathink is natural language** — writing "ultrathink" in a prompt triggers extended thinking. Tracked via transcript parsing, not hooks.
397
135
 
398
- ## Current Status: v0.4
399
-
400
- ### Done
401
- - [x] Full project scaffold with esbuild pipeline
402
- - [x] All CLI commands wired up
403
- - [x] Ink components: StartRun, ActiveRun, ScoreCard, StatsView
404
- - [x] JSON persistence (state.js + store.js)
405
- - [x] Scoring logic (tiers, ratings, achievements, multi-model)
406
- - [x] 9 Claude Code hooks: SessionStart, PostToolUse, PostToolUseFailure, UserPromptSubmit, PreCompact, SessionEnd, SubagentStart, Stop, StatusLine
407
- - [x] `tokengolf install` hook installer with symlink resolution + statusLine config
408
- - [x] Auto cost detection from transcripts (`cost.js`) — multi-file, multi-model
409
- - [x] SessionEnd hook auto-displays ANSI scorecard on /exit; replaces dead Stop hook
410
- - [x] Flow mode auto-tracking (SessionStart creates run if none exists)
411
- - [x] Multi-model breakdown in ScoreCard
412
- - [x] Haiku efficiency achievements (Frugal, Rogue Run)
413
- - [x] Effort level wizard step (Low/Medium/High for Sonnet; +Max for Opus; Haiku skips)
414
- - [x] Fast mode auto-detection from settings.json; tracked in run state
415
- - [x] Fainted / rest mechanic (usage limit hit = fainted, run continues next session)
416
- - [x] Context window % in StatusLine HUD: 🪶/🎒/📦 with green/yellow/red
417
- - [x] PreCompact hook tracks manual vs auto compaction + context % for gear achievements
418
- - [x] Multi-session tracking (sessionCount increments on each SessionStart)
419
- - [x] Model-aware budget presets in wizard (MODEL_BUDGET_TIERS, getModelBudgets)
420
- - [x] Ultrathink detection from transcripts (thinkingInvocations, thinkingTokens)
421
- - [x] 5 ultrathink achievements including Hubris death mark
422
- - [x] Paladin (⚜️ opusplan) character class with model-aware budgets and statusline support
423
- - [x] 28 new achievements: prompting skill, tool mastery, cost/prompt, time, subagents, turn discipline, death marks
424
- - [x] 3 new hooks: PostToolUseFailure, SubagentStart, Stop
425
- - [x] Vitest test suite — 120 tests covering all achievements + pure score functions
426
- - [x] Design D block accent UI — ██ left bar, no right borders, color-coded state
427
- - [x] Landing page terminal demos updated to ██ style
428
- - [x] README inline code block demos (replaced PNG screenshots)
429
-
430
- ### Next up (v0.4)
431
- - [ ] `tokengolf floor` command to advance floor manually
432
- - [ ] Roguelike floor mechanics with per-floor sub-budgets
433
- - [ ] Leaderboard / shareable run URLs
434
- - [ ] Team mode (shared `runs.json` via git)
136
+ 6. **Design D: `██` block accent, no right borders** — eliminates emoji/unicode width misalignment across terminals. Yellow = won, red = died, gray = neutral. Ink: custom `borderStyle` with `borderRight/Top/Bottom={false}`, `paddingLeft={3}`. ANSI scorecard: `██` prefix + `─` separators.
435
137
 
436
138
  ---
437
139
 
438
140
  ## Working in This Repo
439
141
 
440
- When making changes:
441
142
  - Keep it ESM (`import/export`, no `require`)
442
143
  - Ink components are functional React — hooks only, no classes
443
- - State mutations always go through `state.js` and `store.js` — never write to `~/.tokengolf/` directly from components
444
- - Hooks must be fast (< 1s) — no async, no network, JSON file I/O only
445
- - **Always run `npm run build` after source changes**
446
- - **Run `npm test` after score.js changes** — 83 tests catch achievement regressions
144
+ - State mutations go through `state.js` and `store.js` — never write to `~/.tokengolf/` directly
145
+ - Hooks must be fast (< 1s), sync, end with `process.exit(0)` (except session-end.js)
146
+ - Hooks run in a separate process with no access to shell env vars
447
147
  - Test hooks standalone: `echo '{"tool_name":"Read"}' | node hooks/post-tool-use.js`
448
- - Remember hooks run in a separate process with no access to shell env vars
449
- - Always `process.exit(0)` at the end of hooks
148
+ - **Always `npm run build` after source changes**
149
+ - **Always `npm test` after score.js changes**
450
150
 
451
151
  When adding a new CLI command:
452
- 1. Add it to `src/cli.js`
453
- 2. Add a component in `src/components/` if it needs a TUI
454
- 3. Document it in this file and README.md
152
+ 1. Add to `src/cli.js`
153
+ 2. Add component in `src/components/` if it needs a TUI
154
+ 3. Update this file and README.md