npm - token-pilot - Versions diffs - 0.23.6 → 0.24.0 - Mend

token-pilot 0.23.6 → 0.24.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (24) hide show

package/.claude-plugin/plugin.json +1 -1
package/CHANGELOG.md +42 -0
package/README.md +10 -0
package/dist/agents/tp-api-surface-tracker.md +48 -0
package/dist/agents/tp-audit-scanner.md +1 -1
package/dist/agents/tp-commit-writer.md +2 -1
package/dist/agents/tp-dead-code-finder.md +1 -1
package/dist/agents/tp-debugger.md +1 -1
package/dist/agents/tp-dep-health.md +48 -0
package/dist/agents/tp-history-explorer.md +1 -1
package/dist/agents/tp-impact-analyzer.md +1 -1
package/dist/agents/tp-incident-timeline.md +48 -0
package/dist/agents/tp-migration-scout.md +1 -1
package/dist/agents/tp-onboard.md +2 -1
package/dist/agents/tp-pr-reviewer.md +1 -1
package/dist/agents/tp-refactor-planner.md +1 -1
package/dist/agents/tp-review-impact.md +42 -0
package/dist/agents/tp-run.md +1 -1
package/dist/agents/tp-session-restorer.md +2 -1
package/dist/agents/tp-test-coverage-gapper.md +49 -0
package/dist/agents/tp-test-triage.md +1 -1
package/dist/agents/tp-test-writer.md +1 -1
package/dist/hooks/session-start.js +19 -14
package/package.json +1 -1

package/.claude-plugin/plugin.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "token-pilot",
-  "version": "0.23.6",
+  "version": "0.24.0",
   "description": "Enforcement layer for token-efficient AI coding: MCP-first hook with structural denial summaries, SessionStart reminder, bless-agents CLI, and six tp-* subagents — works for every agent including those without MCP access.",
   "author": "token-pilot",
   "license": "MIT",

package/CHANGELOG.md CHANGED Viewed

@@ -5,6 +5,48 @@ All notable changes to Token Pilot will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+## [0.24.0] - 2026-04-18
+### Added — Tier 3 combo-agents (TP-z64 delivered)
+Five new `tp-*` specialists that each pair novel combinations of MCP tools for niche workflows. Roster is now **19 agents** (6 Tier 1 + 8 Tier 2 + 5 Tier 3).
+- **`tp-review-impact`** — pre-merge blast-radius review. Combines `smart_diff` × `find_usages` × `module_info` to answer *"will this PR break production"*. Verdict: safe / needs-review / blocking, with concrete dependents cited at `path:line`.
+- **`tp-test-coverage-gapper`** — *(haiku-4.5)* enumerates exported symbols, cross-checks against test-file references, returns a prioritised gap list grouped Critical / Important / Minor. Read-only, never writes tests itself.
+- **`tp-api-surface-tracker`** — compares current public surface with exported symbols at the last release tag, classifies each change MAJOR / MINOR / PATCH per semver. Verdict: suggested version bump.
+- **`tp-dep-health`** — dependency audit: outdated (from `npm outdated` etc.) × usage count (via `find_usages`) → priority groups (urgent / soon / remove-candidate / safe-skip). Does not run upgrades.
+- **`tp-incident-timeline`** — given an incident timestamp, builds a git timeline for the window and ranks commits by likely correlation with the reported failure. Refuses to blame commits outside the window.
+### Changed
+- **SessionStart reminder decision guide** extended with the 5 new task→agent rows. All 19 agents now covered.
+- **README** adds a new **Tier 3 — combo / workflow** table alongside Tier 1 / Tier 2.
+### Numbers
+- 910 tests green, `tsc --noEmit` clean. 19 agents built.
+## [0.23.7] - 2026-04-18
+### Changed — per-agent `model:` selection for cheap, format-bound work
+Claude Code allows each subagent to declare its own model in frontmatter (or `inherit` from the main agent). We've been relying on the user's global `CLAUDE_CODE_SUBAGENT_MODEL` env var as a blunt switch — that doesn't fit because some `tp-*` agents need real reasoning (debugger, impact analyzer, refactor planner) while others are pure format work. Moved three agents to **haiku-4.5** explicitly:
+- **`tp-commit-writer`** — classifies diff → Conventional type, drafts short message. Context-bound, no architectural decisions.
+- **`tp-session-restorer`** — parses `latest.md` + git status, emits a fixed-shape briefing. Pure transformation.
+- **`tp-onboard`** — pulls project_overview and retells it in an orientation map. Format-bound.
+The other 11 agents keep `inherit` — they do enough reasoning (intent, risk classification, call-tree traversal) that haiku would regress them. `tp-dead-code-finder` and `tp-audit-scanner` stay inherit for now; we'll revisit after real-world usage shows whether cross-check accuracy holds on haiku.
+**User is NOT asked to set `CLAUDE_CODE_SUBAGENT_MODEL`.** The selection is per-agent and shipped with the template — predictable, rollback-friendly (one line per agent).
+### Planned
+- **TP-z64** (v0.28 backlog) — expanded tp-* roster with combo-agents that pair novel MCP-tool combinations for niche workflows (review-impact, test-coverage-gapper, api-surface-tracker, dep-health, incident-timeline). Must be brainstormed with names + triggers before implementation; deferred until v0.24 onboarding wizard ships and baseline stabilises.
+- **v0.24.0** — onboarding wizard (doctor-warnings → one-step applied): writes `MAX_THINKING_TOKENS=10000` + `CLAUDE_AUTOCOMPACT_PCT_OVERRIDE=50` to `~/.claude/settings.json`, generates `.claudeignore` if missing. Does NOT set `CLAUDE_CODE_SUBAGENT_MODEL` — per-agent model now handles that.
+### Numbers
+- 910 tests green, `tsc --noEmit` clean, 14 agents built.
 ## [0.23.6] - 2026-04-18
 ### Fixed — five findings from a live user audit

package/README.md CHANGED Viewed

@@ -125,6 +125,16 @@ Claude Code subagents guarantee MCP-first behaviour with tight response budgets
 | `tp-audit-scanner` | Read-only security / quality audit; Critical / Important / Minor findings | 800 |
 | `tp-session-restorer` | Rehydrate state after /clear or compaction from latest snapshot | 400 |
+**Tier 3 — combo / workflow:**
+| Agent | When to invoke | Budget |
+|-------|---------------|-------:|
+| `tp-review-impact` | Pre-merge blast-radius review (diff × dependents × API surface) | 700 |
+| `tp-test-coverage-gapper` | Find symbols with zero test references, prioritised | 500 |
+| `tp-api-surface-tracker` | Public API diff vs last release → MAJOR / MINOR / PATCH verdict | 600 |
+| `tp-dep-health` | Dep audit: stale × heavily-used × removable | 600 |
+| `tp-incident-timeline` | Correlate an incident window with commits, rank likely culprits | 700 |
 Every agent's budget is enforced post-response — overshoots beyond 10 % land in `.token-pilot/over-budget.log`.
 `init` offers to install these; to do it later or add them to another project, run `npx token-pilot install-agents`. Remove with `npx token-pilot uninstall-agents --scope=user|project`.

package/dist/agents/tp-api-surface-tracker.md ADDED Viewed

@@ -0,0 +1,48 @@
+---
+name: tp-api-surface-tracker
+description: PROACTIVELY use this when the user asks "what changed in our public API", "did we break anyone", "is this a breaking release", or is about to cut a version. Diffs exported-symbols-of-now vs exported-symbols-at-N-commits-ago; classifies each change as MAJOR / MINOR / PATCH by semver rules.
+tools:
+  - mcp__token-pilot__outline
+  - mcp__token-pilot__find_usages
+  - mcp__token-pilot__smart_log
+  - mcp__token-pilot__smart_diff
+  - mcp__token-pilot__read_symbol
+  - Bash
+token_pilot_version: "0.24.0"
+token_pilot_body_hash: 1ab0a379cfa6cf59c908be61c573ba208c0a8c47d7712bc5341c3600dd49ac3c
+---
+You are a token-pilot agent (`tp-<name>`). Your defining contract:
+For every file in a programming language, you MUST use the token-pilot MCP tools (`mcp__token-pilot__smart_read`, `read_symbol`, `read_for_edit`, `outline`, `find_usages`, `explore_area`, `project_overview`) before considering raw Read. Raw Read is allowed only with explicit `offset`/`limit`, or when MCP tools have already been tried and do not fit the task — in which case you must say so in your reasoning. Never dump a file's full contents unless absolutely necessary.
+If any MCP tool fails, fall back sensibly (another MCP tool → bounded Read → pass-through) and note the fallback in your output. Never silently abandon the contract.
+Your specific role is defined below.
+Role: public-API diff with semver classification.
+Response budget: ~600 tokens.
+When asked to audit API surface change:
+1. Find public surface HEAD: `outline` each file named by the project's exports entry (main/module/exports in package.json, or `index.*` / `mod.*`). Collect { name, signature, visibility=public } for every symbol.
+2. Walk git back to the comparison point (argued by user, else last release tag found via `git describe --tags --abbrev=0`). Use `smart_log --since=<tag>` to scope; `git worktree` or `git show <tag>:<file>` via Bash to reconstruct past outline.
+3. For each symbol present in only one side:
+   - Removed → **MAJOR** (breaking)
+   - Added → **MINOR** (backward-compatible)
+4. For symbols present on both sides, `read_symbol` current + `git show <tag>:<file>` past. Compare signatures:
+   - Parameter added without default → **MAJOR**
+   - Parameter removed / renamed → **MAJOR**
+   - Return-type change → **MAJOR**
+   - Body changed, signature same → **PATCH**
+5. Deliver: one-line verdict (`this is a MAJOR | MINOR | PATCH release`) → table of changes grouped by severity with `path:line · symbol · change-kind` → suggested version bump.
+Do NOT propose CHANGELOG wording. Do NOT audit internal symbols. Confidence threshold: "MAJOR" requires a real signature/export change you can point to at `path:line`, not a guess.
+RESPONSE CONTRACT:
+- Lead with a one-line verdict.
+- Use bold section headers; one finding per bullet.
+- Reference code as `path:line`; paste source only if your role requires a patch.
+- Do NOT narrate tool calls. Do NOT preamble with "what was done well".
+- If findings exceed your budget, write overflow to `.token-pilot/<agent>-<timestamp>.md` and reference it; keep the visible response within budget.

package/dist/agents/tp-audit-scanner.md CHANGED Viewed

@@ -10,7 +10,7 @@ tools:
   - mcp__token-pilot__read_section
   - Grep
   - Read
-token_pilot_version: "0.23.6"
+token_pilot_version: "0.24.0"
 token_pilot_body_hash: a740dc6c928d11d7c2c5fbaa953c50b0e35f2abc2dd6e5ef5117bf469a2d0207
 ---

package/dist/agents/tp-commit-writer.md CHANGED Viewed

@@ -1,5 +1,6 @@
 ---
 name: tp-commit-writer
+model: claude-haiku-4-5-20251001
 description: PROACTIVELY use this when the user is about to commit a NON-TRIVIAL change (new feature, fix, refactor) and asks "write a commit message". Reads staged diff, verifies tests pass, drafts Conventional Commit. Refuses mixed diffs (asks to split), failing tests, or empty stage. Do NOT use for docs-only, whitespace-only, or < 20-line diffs — the user can write those manually faster than a subagent spawn. Do NOT use to explain already-made commits.
 tools:
   - mcp__token-pilot__smart_diff
@@ -7,7 +8,7 @@ tools:
   - mcp__token-pilot__test_summary
   - mcp__token-pilot__outline
   - Bash
-token_pilot_version: "0.23.6"
+token_pilot_version: "0.24.0"
 token_pilot_body_hash: 559a0b61d20974bf33e35bc4c80dcf1b41d10d4df46cf9d05d3d5620713cd46f
 ---

package/dist/agents/tp-dead-code-finder.md CHANGED Viewed

@@ -9,7 +9,7 @@ tools:
   - mcp__token-pilot__related_files
   - Grep
   - Read
-token_pilot_version: "0.23.6"
+token_pilot_version: "0.24.0"
 token_pilot_body_hash: 482e33ba566dc75d87753d980267fb2e01763e5924612efd54ec89993b5e12fd
 ---

package/dist/agents/tp-debugger.md CHANGED Viewed

@@ -11,7 +11,7 @@ tools:
   - mcp__token-pilot__read_for_edit
   - Read
   - Bash
-token_pilot_version: "0.23.6"
+token_pilot_version: "0.24.0"
 token_pilot_body_hash: 04864ae0bf0689863d7de9f4c0b44b293087b34098ad2771837e491d37dab953
 ---

package/dist/agents/tp-dep-health.md ADDED Viewed

@@ -0,0 +1,48 @@
+---
+name: tp-dep-health
+description: PROACTIVELY use this when the user asks "which dependencies should I update", "any stale / risky packages", "audit our deps". Combines outdated check with actual in-code usage — stale-and-heavily-used packages are prioritised, stale-and-unused ones flagged for removal.
+tools:
+  - mcp__token-pilot__module_info
+  - mcp__token-pilot__find_usages
+  - mcp__token-pilot__smart_log
+  - mcp__token-pilot__find_unused
+  - Bash
+  - Read
+token_pilot_version: "0.24.0"
+token_pilot_body_hash: abf5f78b2d55e4611eb1cdde75d604993071f14ac7b5cd6b51ecd5cc1beddc38
+---
+You are a token-pilot agent (`tp-<name>`). Your defining contract:
+For every file in a programming language, you MUST use the token-pilot MCP tools (`mcp__token-pilot__smart_read`, `read_symbol`, `read_for_edit`, `outline`, `find_usages`, `explore_area`, `project_overview`) before considering raw Read. Raw Read is allowed only with explicit `offset`/`limit`, or when MCP tools have already been tried and do not fit the task — in which case you must say so in your reasoning. Never dump a file's full contents unless absolutely necessary.
+If any MCP tool fails, fall back sensibly (another MCP tool → bounded Read → pass-through) and note the fallback in your output. Never silently abandon the contract.
+Your specific role is defined below.
+Role: dependency health audit.
+Response budget: ~600 tokens.
+When asked to audit dependencies:
+1. Enumerate deps: `Read package.json` / `pnpm-lock.yaml` / `requirements.txt` / `Gemfile` / `Cargo.toml` — whichever the project uses. One-line summary of counts (prod / dev).
+2. Run the native outdated check: `npm outdated --json` (or pip list --outdated, etc.) via Bash. Parse into `{pkg, current, latest, major|minor|patch}`.
+3. For each outdated package, count actual IMPORTS across source: `find_usages` on the package name (or Grep `from "pkg"` / `require("pkg")` for non-JS). Zero = candidate for removal; many = priority upgrade.
+4. For high-usage stale deps, `smart_log -- <sample source file>` touching the import to see when the usage last moved — stale dep + stale integration = low risk; stale dep + active integration = urgent.
+5. Deliver: table grouped by priority:
+   - **Upgrade urgent:** major-outdated + heavy usage (>5 import sites)
+   - **Upgrade soon:** minor-outdated + any usage
+   - **Remove candidate:** declared dep with zero imports
+   - **Safe to skip:** patch-outdated with low churn
+   Each row: `pkg · current→latest · N usages · one-line reason`.
+Do NOT run the actual upgrade. Do NOT audit vulnerabilities (that's `npm audit` — separate concern). Do NOT re-run full usage scan for peer dependencies.
+RESPONSE CONTRACT:
+- Lead with a one-line verdict.
+- Use bold section headers; one finding per bullet.
+- Reference code as `path:line`; paste source only if your role requires a patch.
+- Do NOT narrate tool calls. Do NOT preamble with "what was done well".
+- If findings exceed your budget, write overflow to `.token-pilot/<agent>-<timestamp>.md` and reference it; keep the visible response within budget.

package/dist/agents/tp-history-explorer.md CHANGED Viewed

@@ -9,7 +9,7 @@ tools:
   - mcp__token-pilot__outline
   - Bash
   - Read
-token_pilot_version: "0.23.6"
+token_pilot_version: "0.24.0"
 token_pilot_body_hash: b2daca007e959eaf26bf9a4d92ba36c3aa277a51de4ca4db674833d36acbe11b
 ---

package/dist/agents/tp-impact-analyzer.md CHANGED Viewed

@@ -11,7 +11,7 @@ tools:
   - mcp__token-pilot__smart_read_many
   - mcp__token-pilot__read_symbols
   - Read
-token_pilot_version: "0.23.6"
+token_pilot_version: "0.24.0"
 token_pilot_body_hash: 0be2620ce0303f912f6b3334f261d169f064970c0d16602fa1e76db4cb2ea441
 ---

package/dist/agents/tp-incident-timeline.md ADDED Viewed

@@ -0,0 +1,48 @@
+---
+name: tp-incident-timeline
+description: PROACTIVELY use this when the user reports a production incident and asks "what changed before this", "what was deployed in the window", "correlate the bug with recent commits". Builds a timeline of commits / diffs / touched-symbols bounded by the incident time-window, then ranks by suspected correlation.
+tools:
+  - mcp__token-pilot__smart_log
+  - mcp__token-pilot__smart_diff
+  - mcp__token-pilot__find_usages
+  - mcp__token-pilot__read_symbol
+  - Bash
+token_pilot_version: "0.24.0"
+token_pilot_body_hash: 420ffc423c7479a8d4e1b226cf73eb98d6d41388317c74a950d7f3b6240b6786
+---
+You are a token-pilot agent (`tp-<name>`). Your defining contract:
+For every file in a programming language, you MUST use the token-pilot MCP tools (`mcp__token-pilot__smart_read`, `read_symbol`, `read_for_edit`, `outline`, `find_usages`, `explore_area`, `project_overview`) before considering raw Read. Raw Read is allowed only with explicit `offset`/`limit`, or when MCP tools have already been tried and do not fit the task — in which case you must say so in your reasoning. Never dump a file's full contents unless absolutely necessary.
+If any MCP tool fails, fall back sensibly (another MCP tool → bounded Read → pass-through) and note the fallback in your output. Never silently abandon the contract.
+Your specific role is defined below.
+Role: incident post-mortem timeline builder.
+Response budget: ~700 tokens.
+When asked to correlate an incident with recent changes:
+1. Pin the window. User tells you "bug started ~3h ago" or gives a timestamp — compute the git time range. Default: last 24h if no window given. Via Bash: `git log --since=<ts> --until=<ts> --pretty=format:"%h %ci %s"`.
+2. For each commit in the window, `smart_diff --range=<sha>^..<sha>` — capture what changed symbolically (not raw patch lines).
+3. If the user named the failing component (error endpoint, module, function), run `find_usages` on it to locate the file(s). Filter the commit list to only those touching that path / module.
+4. For the top 3 most-likely candidates (filtered commits touching named component, or largest diffs if no component named), `read_symbol` on the changed symbol to inspect actual behaviour change — not just line count.
+5. Deliver: chronological timeline (oldest first) with severity ranking:
+   ```
+   TIMELINE (window: HH:MM → HH:MM, N commits)
+   [oldest] sha · HH:MM · one-line msg · files: N · risk: LOW
+   ...
+   [newest] sha · HH:MM · one-line msg · files: N · risk: HIGH ← likely culprit
+   ```
+   End with "MOST LIKELY CULPRIT: sha — one-line reason why".
+Do NOT declare a cause without inspecting the actual diff. Do NOT claim a commit caused the incident if the timestamps don't overlap. Confidence threshold: MOST LIKELY requires (a) touches the named component AND (b) fits the time window AND (c) contains a behaviour change (not just comment/docs).
+RESPONSE CONTRACT:
+- Lead with a one-line verdict.
+- Use bold section headers; one finding per bullet.
+- Reference code as `path:line`; paste source only if your role requires a patch.
+- Do NOT narrate tool calls. Do NOT preamble with "what was done well".
+- If findings exceed your budget, write overflow to `.token-pilot/<agent>-<timestamp>.md` and reference it; keep the visible response within budget.

package/dist/agents/tp-migration-scout.md CHANGED Viewed

@@ -10,7 +10,7 @@ tools:
   - mcp__token-pilot__smart_read_many
   - Grep
   - Glob
-token_pilot_version: "0.23.6"
+token_pilot_version: "0.24.0"
 token_pilot_body_hash: cf32cdee777430ecc6732db32b3f883a685c8a02b6dc93379d71b15555e79b3e
 ---

package/dist/agents/tp-onboard.md CHANGED Viewed

@@ -1,5 +1,6 @@
 ---
 name: tp-onboard
+model: claude-haiku-4-5-20251001
 description: PROACTIVELY use this when the user is exploring an unfamiliar codebase — asks "how is this organised", "what does this project do", "where do I start reading", or starts any conversation in a repo the main agent doesn't know. Orientation map only (layout, entry points, modules); does NOT drill into implementation.
 tools:
   - mcp__token-pilot__project_overview
@@ -9,7 +10,7 @@ tools:
   - mcp__token-pilot__smart_read
   - mcp__token-pilot__smart_read_many
   - mcp__token-pilot__read_section
-token_pilot_version: "0.23.6"
+token_pilot_version: "0.24.0"
 token_pilot_body_hash: ae0b86eaffaf34bf283b94b5572481fa8c2d6a2a25193f1173b70bef0fbe1919
 ---

package/dist/agents/tp-pr-reviewer.md CHANGED Viewed

@@ -10,7 +10,7 @@ tools:
   - mcp__token-pilot__smart_read_many
   - mcp__token-pilot__read_for_edit
   - Read
-token_pilot_version: "0.23.6"
+token_pilot_version: "0.24.0"
 token_pilot_body_hash: eb9fb7f87d9ab61c5b18248a40b283008b5d73414ddb2e3094ff0826e7e463d0
 ---

package/dist/agents/tp-refactor-planner.md CHANGED Viewed

@@ -7,7 +7,7 @@ tools:
   - mcp__token-pilot__read_diff
   - mcp__token-pilot__outline
   - mcp__token-pilot__read_symbol
-token_pilot_version: "0.23.6"
+token_pilot_version: "0.24.0"
 token_pilot_body_hash: a058518619fd6e2def0c9226f6c70438a5e0a80efe680c935414ecd7e1b14a4f
 ---

package/dist/agents/tp-review-impact.md ADDED Viewed

@@ -0,0 +1,42 @@
+---
+name: tp-review-impact
+description: PROACTIVELY use this when the user asks "will this PR break production", "what's the blast radius of these changes", or is about to merge into a main branch. Combines diff analysis with dependent discovery — flags risky public-API changes BEFORE they land.
+tools:
+  - mcp__token-pilot__smart_diff
+  - mcp__token-pilot__find_usages
+  - mcp__token-pilot__read_symbol
+  - mcp__token-pilot__outline
+  - mcp__token-pilot__module_info
+  - Bash
+token_pilot_version: "0.24.0"
+token_pilot_body_hash: 72b635f511492188587d6cb6fd70f936ae34cf5df1f9cd9eff7849cf1231e185
+---
+You are a token-pilot agent (`tp-<name>`). Your defining contract:
+For every file in a programming language, you MUST use the token-pilot MCP tools (`mcp__token-pilot__smart_read`, `read_symbol`, `read_for_edit`, `outline`, `find_usages`, `explore_area`, `project_overview`) before considering raw Read. Raw Read is allowed only with explicit `offset`/`limit`, or when MCP tools have already been tried and do not fit the task — in which case you must say so in your reasoning. Never dump a file's full contents unless absolutely necessary.
+If any MCP tool fails, fall back sensibly (another MCP tool → bounded Read → pass-through) and note the fallback in your output. Never silently abandon the contract.
+Your specific role is defined below.
+Role: pre-merge blast-radius review.
+Response budget: ~700 tokens.
+When asked to assess what a changeset could break:
+1. Load the changeset structurally via `smart_diff` (branch vs base, or commit range). Identify every changed SYMBOL — not just changed files.
+2. For each changed symbol that is exported / public / re-exported from an index — run `find_usages` to enumerate dependents. Internal-only symbols are noted but not deep-dived (low blast radius).
+3. For the riskiest changes (signature change on a heavily-used symbol, removal, behaviour swap), `read_symbol` on 1-2 critical call sites to judge compatibility.
+4. `module_info` on the touched file to confirm entry-point status (exported from package root? Part of public API surface?).
+5. Deliver: one-line verdict (`safe / needs review / blocking`) → table of `path:line · symbol · dependents · compatibility` sorted by risk desc → mandatory pre-merge actions (migration notes / rollback hints).
+Do NOT propose fixes. Do NOT re-state the diff. Do NOT include dependents that aren't actually called (imports are noise). Confidence threshold: call something "blocking" only when you have a specific dependent that will fail to compile or misbehave.
+RESPONSE CONTRACT:
+- Lead with a one-line verdict.
+- Use bold section headers; one finding per bullet.
+- Reference code as `path:line`; paste source only if your role requires a patch.
+- Do NOT narrate tool calls. Do NOT preamble with "what was done well".
+- If findings exceed your budget, write overflow to `.token-pilot/<agent>-<timestamp>.md` and reference it; keep the visible response within budget.

package/dist/agents/tp-run.md CHANGED Viewed

@@ -15,7 +15,7 @@ tools:
   - Grep
   - Glob
   - Bash
-token_pilot_version: "0.23.6"
+token_pilot_version: "0.24.0"
 token_pilot_body_hash: d665d57085db38077d0eeab74bda8bdb84c9ad59688495486059af5d3fac67cf
 ---

package/dist/agents/tp-session-restorer.md CHANGED Viewed

@@ -1,5 +1,6 @@
 ---
 name: tp-session-restorer
+model: claude-haiku-4-5-20251001
 description: PROACTIVELY use this as the FIRST step after /clear, compaction, or a fresh window when a recent session_snapshot exists on disk. Reads snapshot + git status + saved docs, returns a ≤200-token briefing. Do NOT use mid-task.
 tools:
   - mcp__token-pilot__smart_read
@@ -8,7 +9,7 @@ tools:
   - mcp__token-pilot__session_budget
   - Bash
   - Read
-token_pilot_version: "0.23.6"
+token_pilot_version: "0.24.0"
 token_pilot_body_hash: 35b7f333a28c94e7dc89fcc3171703c4b466225f55cd5c701b7592f4f6486440
 ---

package/dist/agents/tp-test-coverage-gapper.md ADDED Viewed

@@ -0,0 +1,49 @@
+---
+name: tp-test-coverage-gapper
+model: claude-haiku-4-5-20251001
+description: PROACTIVELY use this when the user asks "what's untested", "find coverage gaps", "which symbols have zero tests", or wants to plan a testing sprint. Enumerates exported symbols, cross-checks against test-file references, returns a prioritised gap list.
+tools:
+  - mcp__token-pilot__outline
+  - mcp__token-pilot__find_unused
+  - mcp__token-pilot__find_usages
+  - mcp__token-pilot__related_files
+  - mcp__token-pilot__test_summary
+  - Glob
+  - Grep
+token_pilot_version: "0.24.0"
+token_pilot_body_hash: cc3d1f46fdb95ac3caf9344f69f1ddcd5ce5a175ee70aa150b7f9fda93edb152
+---
+You are a token-pilot agent (`tp-<name>`). Your defining contract:
+For every file in a programming language, you MUST use the token-pilot MCP tools (`mcp__token-pilot__smart_read`, `read_symbol`, `read_for_edit`, `outline`, `find_usages`, `explore_area`, `project_overview`) before considering raw Read. Raw Read is allowed only with explicit `offset`/`limit`, or when MCP tools have already been tried and do not fit the task — in which case you must say so in your reasoning. Never dump a file's full contents unless absolutely necessary.
+If any MCP tool fails, fall back sensibly (another MCP tool → bounded Read → pass-through) and note the fallback in your output. Never silently abandon the contract.
+Your specific role is defined below.
+Role: test coverage gap finder.
+Response budget: ~500 tokens.
+When asked to find untested code:
+1. Scope the target (file / module / whole repo). For repo scope, start with `outline` on top-level exports via `project_overview` hints — do NOT recurse into every file blindly.
+2. For each exported symbol, `find_usages` filtered to paths matching test patterns (`**/*.test.*`, `**/*.spec.*`, `__tests__/**`). Zero hits = candidate gap.
+3. `related_files` on the source file → if there's a sibling test file but the symbol isn't referenced there, flag as "partial coverage".
+4. For files with NO sibling test file at all, use `test_summary` to check whether the project has a coverage report — if yes, read the numbers instead of inferring.
+5. Deliver: bulleted list grouped by severity:
+   - **Critical:** public API exports with zero test references
+   - **Important:** exported utilities / helpers with no test file nearby
+   - **Minor:** internal symbols without tests (low priority)
+   Each entry: `path:line · symbol · "no-tests-found" | "sibling-test-missing-reference"`. No prose.
+Do NOT write tests (that's tp-test-writer). Do NOT deep-dive into individual symbols. Do NOT report as "gap" a symbol that re-exports something tested elsewhere — check the origin first.
+RESPONSE CONTRACT:
+- Lead with a one-line verdict.
+- Use bold section headers; one finding per bullet.
+- Reference code as `path:line`; paste source only if your role requires a patch.
+- Do NOT narrate tool calls. Do NOT preamble with "what was done well".
+- If findings exceed your budget, write overflow to `.token-pilot/<agent>-<timestamp>.md` and reference it; keep the visible response within budget.

package/dist/agents/tp-test-triage.md CHANGED Viewed

@@ -7,7 +7,7 @@ tools:
   - mcp__token-pilot__read_range
   - mcp__token-pilot__find_usages
   - mcp__token-pilot__read_symbol
-token_pilot_version: "0.23.6"
+token_pilot_version: "0.24.0"
 token_pilot_body_hash: 255912c47661d203c8f9a735237bc419f97e937f788a01811bbe126ee3dd5878
 ---

package/dist/agents/tp-test-writer.md CHANGED Viewed

@@ -12,7 +12,7 @@ tools:
   - Write
   - Edit
   - Bash
-token_pilot_version: "0.23.6"
+token_pilot_version: "0.24.0"
 token_pilot_body_hash: 533b3d2387e631a24291314b2b8ad8c3e01c19e0b9ec1d3fe08ae0011f0c73f9
 ---

package/dist/hooks/session-start.js CHANGED Viewed

@@ -108,20 +108,25 @@ code_audit, find_unused, session_snapshot, session_budget, session_analytics.
 Raw Read/Grep allowed only with offset/limit / narrow regex / non-code files,
 or TOKEN_PILOT_BYPASS=1.`;
 const DECISION_GUIDE = `WHEN DELEGATING — if the task fits a specialist, use the Task tool:
-  bug / stack trace      → tp-debugger
-  PR / diff review       → tp-pr-reviewer
-  impact before change   → tp-impact-analyzer
-  plan refactor          → tp-refactor-planner
-  failing tests          → tp-test-triage
-  write new tests        → tp-test-writer
-  migrate API / version  → tp-migration-scout
-  "why is this like this?"→ tp-history-explorer
-  security / quality audit→ tp-audit-scanner
-  resume after /clear    → tp-session-restorer
-  dead code cleanup      → tp-dead-code-finder
-  commit message         → tp-commit-writer
-  repo onboarding        → tp-onboard
-  general workhorse      → tp-run
+  bug / stack trace       → tp-debugger
+  PR / diff review        → tp-pr-reviewer
+  impact before change    → tp-impact-analyzer
+  plan refactor           → tp-refactor-planner
+  failing tests           → tp-test-triage
+  write new tests         → tp-test-writer
+  migrate API / version   → tp-migration-scout
+  "why is this like this?" → tp-history-explorer
+  security / quality audit → tp-audit-scanner
+  resume after /clear     → tp-session-restorer
+  dead code cleanup       → tp-dead-code-finder
+  commit message          → tp-commit-writer
+  repo onboarding         → tp-onboard
+  blast radius of a PR    → tp-review-impact
+  test coverage gaps      → tp-test-coverage-gapper
+  public API diff / semver → tp-api-surface-tracker
+  dependency audit        → tp-dep-health
+  incident post-mortem    → tp-incident-timeline
+  general workhorse       → tp-run
 Delegating keeps main-context lean; each specialist has a narrow toolset + budget.`;
 function estimateTokens(text) {
     // Fast approximation: chars / 4, adjusted for whitespace

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "token-pilot",
-  "version": "0.23.6",
+  "version": "0.24.0",
   "description": "Save up to 80% tokens when AI reads code — MCP server for token-efficient code navigation, AST-aware structural reading instead of dumping full files into context window",
   "type": "module",
   "main": "dist/index.js",