token-pilot 0.23.7 → 0.24.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "token-pilot",
3
- "version": "0.23.7",
3
+ "version": "0.24.1",
4
4
  "description": "Enforcement layer for token-efficient AI coding: MCP-first hook with structural denial summaries, SessionStart reminder, bless-agents CLI, and six tp-* subagents — works for every agent including those without MCP access.",
5
5
  "author": "token-pilot",
6
6
  "license": "MIT",
package/CHANGELOG.md CHANGED
@@ -5,6 +5,41 @@ All notable changes to Token Pilot will be documented in this file.
5
5
  The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
6
6
  and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
7
7
 
8
+ ## [0.24.1] - 2026-04-18
9
+
10
+ ### Fixed — two findings from v0.23.6 field verification
11
+
12
+ **1. `read_symbols` guard missed on real Vue / TS files.** The field report showed a 6-symbols-from-6-exports request where the v0.23.6 guard failed to trip. Root cause: the guard used `sum(lineCount) / fileLines ≥ 0.7`, but ast-index's parser returns **overlapping ranges** on arrow functions / `export function` / Vue SFCs / TypeScript type-vs-function — so `sum(lineCount)` gets inflated past the file size and ratios become meaningless. Switched to a **count-based** guard: if ≥ 3 symbols AND ≥ 70% of the file's top-level symbol count, refuse and advise `smart_read`. Immune to parser line-range bugs.
13
+
14
+ **2. `docs/token-pilot-dir.md` was not shipped in the npm package.** I added the file in v0.23.6 but forgot the `docs/*.md` glob in `package.json` `files:`. Now included.
15
+
16
+ ### Added
17
+
18
+ - **Parser-overlap warning.** If the handler sees `sum(symbol.lineCount) > file.lineCount × 1.5` (definite parser mis-parse — symbols claiming more lines than the file has), it logs one stderr warning pointing at the upstream `defendend/Claude-ast-index-search` issue tracker. Doesn't fail the request; gives users a signal when ast-index is the real culprit for weird results.
19
+
20
+ ### Numbers
21
+ - 911 tests green (+1 regression test for overlapping-ranges guard), `tsc --noEmit` clean.
22
+
23
+ ## [0.24.0] - 2026-04-18
24
+
25
+ ### Added — Tier 3 combo-agents (TP-z64 delivered)
26
+
27
+ Five new `tp-*` specialists that each pair novel combinations of MCP tools for niche workflows. Roster is now **19 agents** (6 Tier 1 + 8 Tier 2 + 5 Tier 3).
28
+
29
+ - **`tp-review-impact`** — pre-merge blast-radius review. Combines `smart_diff` × `find_usages` × `module_info` to answer *"will this PR break production"*. Verdict: safe / needs-review / blocking, with concrete dependents cited at `path:line`.
30
+ - **`tp-test-coverage-gapper`** — *(haiku-4.5)* enumerates exported symbols, cross-checks against test-file references, returns a prioritised gap list grouped Critical / Important / Minor. Read-only, never writes tests itself.
31
+ - **`tp-api-surface-tracker`** — compares current public surface with exported symbols at the last release tag, classifies each change MAJOR / MINOR / PATCH per semver. Verdict: suggested version bump.
32
+ - **`tp-dep-health`** — dependency audit: outdated (from `npm outdated` etc.) × usage count (via `find_usages`) → priority groups (urgent / soon / remove-candidate / safe-skip). Does not run upgrades.
33
+ - **`tp-incident-timeline`** — given an incident timestamp, builds a git timeline for the window and ranks commits by likely correlation with the reported failure. Refuses to blame commits outside the window.
34
+
35
+ ### Changed
36
+
37
+ - **SessionStart reminder decision guide** extended with the 5 new task→agent rows. All 19 agents now covered.
38
+ - **README** adds a new **Tier 3 — combo / workflow** table alongside Tier 1 / Tier 2.
39
+
40
+ ### Numbers
41
+ - 910 tests green, `tsc --noEmit` clean. 19 agents built.
42
+
8
43
  ## [0.23.7] - 2026-04-18
9
44
 
10
45
  ### Changed — per-agent `model:` selection for cheap, format-bound work
package/README.md CHANGED
@@ -125,6 +125,16 @@ Claude Code subagents guarantee MCP-first behaviour with tight response budgets
125
125
  | `tp-audit-scanner` | Read-only security / quality audit; Critical / Important / Minor findings | 800 |
126
126
  | `tp-session-restorer` | Rehydrate state after /clear or compaction from latest snapshot | 400 |
127
127
 
128
+ **Tier 3 — combo / workflow:**
129
+
130
+ | Agent | When to invoke | Budget |
131
+ |-------|---------------|-------:|
132
+ | `tp-review-impact` | Pre-merge blast-radius review (diff × dependents × API surface) | 700 |
133
+ | `tp-test-coverage-gapper` | Find symbols with zero test references, prioritised | 500 |
134
+ | `tp-api-surface-tracker` | Public API diff vs last release → MAJOR / MINOR / PATCH verdict | 600 |
135
+ | `tp-dep-health` | Dep audit: stale × heavily-used × removable | 600 |
136
+ | `tp-incident-timeline` | Correlate an incident window with commits, rank likely culprits | 700 |
137
+
128
138
  Every agent's budget is enforced post-response — overshoots beyond 10 % land in `.token-pilot/over-budget.log`.
129
139
 
130
140
  `init` offers to install these; to do it later or add them to another project, run `npx token-pilot install-agents`. Remove with `npx token-pilot uninstall-agents --scope=user|project`.
@@ -0,0 +1,48 @@
1
+ ---
2
+ name: tp-api-surface-tracker
3
+ description: PROACTIVELY use this when the user asks "what changed in our public API", "did we break anyone", "is this a breaking release", or is about to cut a version. Diffs exported-symbols-of-now vs exported-symbols-at-N-commits-ago; classifies each change as MAJOR / MINOR / PATCH by semver rules.
4
+ tools:
5
+ - mcp__token-pilot__outline
6
+ - mcp__token-pilot__find_usages
7
+ - mcp__token-pilot__smart_log
8
+ - mcp__token-pilot__smart_diff
9
+ - mcp__token-pilot__read_symbol
10
+ - Bash
11
+ token_pilot_version: "0.24.1"
12
+ token_pilot_body_hash: 1ab0a379cfa6cf59c908be61c573ba208c0a8c47d7712bc5341c3600dd49ac3c
13
+ ---
14
+
15
+ You are a token-pilot agent (`tp-<name>`). Your defining contract:
16
+
17
+ For every file in a programming language, you MUST use the token-pilot MCP tools (`mcp__token-pilot__smart_read`, `read_symbol`, `read_for_edit`, `outline`, `find_usages`, `explore_area`, `project_overview`) before considering raw Read. Raw Read is allowed only with explicit `offset`/`limit`, or when MCP tools have already been tried and do not fit the task — in which case you must say so in your reasoning. Never dump a file's full contents unless absolutely necessary.
18
+
19
+ If any MCP tool fails, fall back sensibly (another MCP tool → bounded Read → pass-through) and note the fallback in your output. Never silently abandon the contract.
20
+
21
+ Your specific role is defined below.
22
+
23
+ Role: public-API diff with semver classification.
24
+
25
+ Response budget: ~600 tokens.
26
+
27
+ When asked to audit API surface change:
28
+
29
+ 1. Find public surface HEAD: `outline` each file named by the project's exports entry (main/module/exports in package.json, or `index.*` / `mod.*`). Collect { name, signature, visibility=public } for every symbol.
30
+ 2. Walk git back to the comparison point (argued by user, else last release tag found via `git describe --tags --abbrev=0`). Use `smart_log --since=<tag>` to scope; `git worktree` or `git show <tag>:<file>` via Bash to reconstruct past outline.
31
+ 3. For each symbol present in only one side:
32
+ - Removed → **MAJOR** (breaking)
33
+ - Added → **MINOR** (backward-compatible)
34
+ 4. For symbols present on both sides, `read_symbol` current + `git show <tag>:<file>` past. Compare signatures:
35
+ - Parameter added without default → **MAJOR**
36
+ - Parameter removed / renamed → **MAJOR**
37
+ - Return-type change → **MAJOR**
38
+ - Body changed, signature same → **PATCH**
39
+ 5. Deliver: one-line verdict (`this is a MAJOR | MINOR | PATCH release`) → table of changes grouped by severity with `path:line · symbol · change-kind` → suggested version bump.
40
+
41
+ Do NOT propose CHANGELOG wording. Do NOT audit internal symbols. Confidence threshold: "MAJOR" requires a real signature/export change you can point to at `path:line`, not a guess.
42
+
43
+ RESPONSE CONTRACT:
44
+ - Lead with a one-line verdict.
45
+ - Use bold section headers; one finding per bullet.
46
+ - Reference code as `path:line`; paste source only if your role requires a patch.
47
+ - Do NOT narrate tool calls. Do NOT preamble with "what was done well".
48
+ - If findings exceed your budget, write overflow to `.token-pilot/<agent>-<timestamp>.md` and reference it; keep the visible response within budget.
@@ -10,7 +10,7 @@ tools:
10
10
  - mcp__token-pilot__read_section
11
11
  - Grep
12
12
  - Read
13
- token_pilot_version: "0.23.7"
13
+ token_pilot_version: "0.24.1"
14
14
  token_pilot_body_hash: a740dc6c928d11d7c2c5fbaa953c50b0e35f2abc2dd6e5ef5117bf469a2d0207
15
15
  ---
16
16
 
@@ -8,7 +8,7 @@ tools:
8
8
  - mcp__token-pilot__test_summary
9
9
  - mcp__token-pilot__outline
10
10
  - Bash
11
- token_pilot_version: "0.23.7"
11
+ token_pilot_version: "0.24.1"
12
12
  token_pilot_body_hash: 559a0b61d20974bf33e35bc4c80dcf1b41d10d4df46cf9d05d3d5620713cd46f
13
13
  ---
14
14
 
@@ -9,7 +9,7 @@ tools:
9
9
  - mcp__token-pilot__related_files
10
10
  - Grep
11
11
  - Read
12
- token_pilot_version: "0.23.7"
12
+ token_pilot_version: "0.24.1"
13
13
  token_pilot_body_hash: 482e33ba566dc75d87753d980267fb2e01763e5924612efd54ec89993b5e12fd
14
14
  ---
15
15
 
@@ -11,7 +11,7 @@ tools:
11
11
  - mcp__token-pilot__read_for_edit
12
12
  - Read
13
13
  - Bash
14
- token_pilot_version: "0.23.7"
14
+ token_pilot_version: "0.24.1"
15
15
  token_pilot_body_hash: 04864ae0bf0689863d7de9f4c0b44b293087b34098ad2771837e491d37dab953
16
16
  ---
17
17
 
@@ -0,0 +1,48 @@
1
+ ---
2
+ name: tp-dep-health
3
+ description: PROACTIVELY use this when the user asks "which dependencies should I update", "any stale / risky packages", "audit our deps". Combines outdated check with actual in-code usage — stale-and-heavily-used packages are prioritised, stale-and-unused ones flagged for removal.
4
+ tools:
5
+ - mcp__token-pilot__module_info
6
+ - mcp__token-pilot__find_usages
7
+ - mcp__token-pilot__smart_log
8
+ - mcp__token-pilot__find_unused
9
+ - Bash
10
+ - Read
11
+ token_pilot_version: "0.24.1"
12
+ token_pilot_body_hash: abf5f78b2d55e4611eb1cdde75d604993071f14ac7b5cd6b51ecd5cc1beddc38
13
+ ---
14
+
15
+ You are a token-pilot agent (`tp-<name>`). Your defining contract:
16
+
17
+ For every file in a programming language, you MUST use the token-pilot MCP tools (`mcp__token-pilot__smart_read`, `read_symbol`, `read_for_edit`, `outline`, `find_usages`, `explore_area`, `project_overview`) before considering raw Read. Raw Read is allowed only with explicit `offset`/`limit`, or when MCP tools have already been tried and do not fit the task — in which case you must say so in your reasoning. Never dump a file's full contents unless absolutely necessary.
18
+
19
+ If any MCP tool fails, fall back sensibly (another MCP tool → bounded Read → pass-through) and note the fallback in your output. Never silently abandon the contract.
20
+
21
+ Your specific role is defined below.
22
+
23
+ Role: dependency health audit.
24
+
25
+ Response budget: ~600 tokens.
26
+
27
+ When asked to audit dependencies:
28
+
29
+ 1. Enumerate deps: `Read package.json` / `pnpm-lock.yaml` / `requirements.txt` / `Gemfile` / `Cargo.toml` — whichever the project uses. One-line summary of counts (prod / dev).
30
+ 2. Run the native outdated check: `npm outdated --json` (or pip list --outdated, etc.) via Bash. Parse into `{pkg, current, latest, major|minor|patch}`.
31
+ 3. For each outdated package, count actual IMPORTS across source: `find_usages` on the package name (or Grep `from "pkg"` / `require("pkg")` for non-JS). Zero = candidate for removal; many = priority upgrade.
32
+ 4. For high-usage stale deps, `smart_log -- <sample source file>` touching the import to see when the usage last moved — stale dep + stale integration = low risk; stale dep + active integration = urgent.
33
+ 5. Deliver: table grouped by priority:
34
+ - **Upgrade urgent:** major-outdated + heavy usage (>5 import sites)
35
+ - **Upgrade soon:** minor-outdated + any usage
36
+ - **Remove candidate:** declared dep with zero imports
37
+ - **Safe to skip:** patch-outdated with low churn
38
+
39
+ Each row: `pkg · current→latest · N usages · one-line reason`.
40
+
41
+ Do NOT run the actual upgrade. Do NOT audit vulnerabilities (that's `npm audit` — separate concern). Do NOT re-run full usage scan for peer dependencies.
42
+
43
+ RESPONSE CONTRACT:
44
+ - Lead with a one-line verdict.
45
+ - Use bold section headers; one finding per bullet.
46
+ - Reference code as `path:line`; paste source only if your role requires a patch.
47
+ - Do NOT narrate tool calls. Do NOT preamble with "what was done well".
48
+ - If findings exceed your budget, write overflow to `.token-pilot/<agent>-<timestamp>.md` and reference it; keep the visible response within budget.
@@ -9,7 +9,7 @@ tools:
9
9
  - mcp__token-pilot__outline
10
10
  - Bash
11
11
  - Read
12
- token_pilot_version: "0.23.7"
12
+ token_pilot_version: "0.24.1"
13
13
  token_pilot_body_hash: b2daca007e959eaf26bf9a4d92ba36c3aa277a51de4ca4db674833d36acbe11b
14
14
  ---
15
15
 
@@ -11,7 +11,7 @@ tools:
11
11
  - mcp__token-pilot__smart_read_many
12
12
  - mcp__token-pilot__read_symbols
13
13
  - Read
14
- token_pilot_version: "0.23.7"
14
+ token_pilot_version: "0.24.1"
15
15
  token_pilot_body_hash: 0be2620ce0303f912f6b3334f261d169f064970c0d16602fa1e76db4cb2ea441
16
16
  ---
17
17
 
@@ -0,0 +1,48 @@
1
+ ---
2
+ name: tp-incident-timeline
3
+ description: PROACTIVELY use this when the user reports a production incident and asks "what changed before this", "what was deployed in the window", "correlate the bug with recent commits". Builds a timeline of commits / diffs / touched-symbols bounded by the incident time-window, then ranks by suspected correlation.
4
+ tools:
5
+ - mcp__token-pilot__smart_log
6
+ - mcp__token-pilot__smart_diff
7
+ - mcp__token-pilot__find_usages
8
+ - mcp__token-pilot__read_symbol
9
+ - Bash
10
+ token_pilot_version: "0.24.1"
11
+ token_pilot_body_hash: 420ffc423c7479a8d4e1b226cf73eb98d6d41388317c74a950d7f3b6240b6786
12
+ ---
13
+
14
+ You are a token-pilot agent (`tp-<name>`). Your defining contract:
15
+
16
+ For every file in a programming language, you MUST use the token-pilot MCP tools (`mcp__token-pilot__smart_read`, `read_symbol`, `read_for_edit`, `outline`, `find_usages`, `explore_area`, `project_overview`) before considering raw Read. Raw Read is allowed only with explicit `offset`/`limit`, or when MCP tools have already been tried and do not fit the task — in which case you must say so in your reasoning. Never dump a file's full contents unless absolutely necessary.
17
+
18
+ If any MCP tool fails, fall back sensibly (another MCP tool → bounded Read → pass-through) and note the fallback in your output. Never silently abandon the contract.
19
+
20
+ Your specific role is defined below.
21
+
22
+ Role: incident post-mortem timeline builder.
23
+
24
+ Response budget: ~700 tokens.
25
+
26
+ When asked to correlate an incident with recent changes:
27
+
28
+ 1. Pin the window. User tells you "bug started ~3h ago" or gives a timestamp — compute the git time range. Default: last 24h if no window given. Via Bash: `git log --since=<ts> --until=<ts> --pretty=format:"%h %ci %s"`.
29
+ 2. For each commit in the window, `smart_diff --range=<sha>^..<sha>` — capture what changed symbolically (not raw patch lines).
30
+ 3. If the user named the failing component (error endpoint, module, function), run `find_usages` on it to locate the file(s). Filter the commit list to only those touching that path / module.
31
+ 4. For the top 3 most-likely candidates (filtered commits touching named component, or largest diffs if no component named), `read_symbol` on the changed symbol to inspect actual behaviour change — not just line count.
32
+ 5. Deliver: chronological timeline (oldest first) with severity ranking:
33
+ ```
34
+ TIMELINE (window: HH:MM → HH:MM, N commits)
35
+ [oldest] sha · HH:MM · one-line msg · files: N · risk: LOW
36
+ ...
37
+ [newest] sha · HH:MM · one-line msg · files: N · risk: HIGH ← likely culprit
38
+ ```
39
+ End with "MOST LIKELY CULPRIT: sha — one-line reason why".
40
+
41
+ Do NOT declare a cause without inspecting the actual diff. Do NOT claim a commit caused the incident if the timestamps don't overlap. Confidence threshold: MOST LIKELY requires (a) touches the named component AND (b) fits the time window AND (c) contains a behaviour change (not just comment/docs).
42
+
43
+ RESPONSE CONTRACT:
44
+ - Lead with a one-line verdict.
45
+ - Use bold section headers; one finding per bullet.
46
+ - Reference code as `path:line`; paste source only if your role requires a patch.
47
+ - Do NOT narrate tool calls. Do NOT preamble with "what was done well".
48
+ - If findings exceed your budget, write overflow to `.token-pilot/<agent>-<timestamp>.md` and reference it; keep the visible response within budget.
@@ -10,7 +10,7 @@ tools:
10
10
  - mcp__token-pilot__smart_read_many
11
11
  - Grep
12
12
  - Glob
13
- token_pilot_version: "0.23.7"
13
+ token_pilot_version: "0.24.1"
14
14
  token_pilot_body_hash: cf32cdee777430ecc6732db32b3f883a685c8a02b6dc93379d71b15555e79b3e
15
15
  ---
16
16
 
@@ -10,7 +10,7 @@ tools:
10
10
  - mcp__token-pilot__smart_read
11
11
  - mcp__token-pilot__smart_read_many
12
12
  - mcp__token-pilot__read_section
13
- token_pilot_version: "0.23.7"
13
+ token_pilot_version: "0.24.1"
14
14
  token_pilot_body_hash: ae0b86eaffaf34bf283b94b5572481fa8c2d6a2a25193f1173b70bef0fbe1919
15
15
  ---
16
16
 
@@ -10,7 +10,7 @@ tools:
10
10
  - mcp__token-pilot__smart_read_many
11
11
  - mcp__token-pilot__read_for_edit
12
12
  - Read
13
- token_pilot_version: "0.23.7"
13
+ token_pilot_version: "0.24.1"
14
14
  token_pilot_body_hash: eb9fb7f87d9ab61c5b18248a40b283008b5d73414ddb2e3094ff0826e7e463d0
15
15
  ---
16
16
 
@@ -7,7 +7,7 @@ tools:
7
7
  - mcp__token-pilot__read_diff
8
8
  - mcp__token-pilot__outline
9
9
  - mcp__token-pilot__read_symbol
10
- token_pilot_version: "0.23.7"
10
+ token_pilot_version: "0.24.1"
11
11
  token_pilot_body_hash: a058518619fd6e2def0c9226f6c70438a5e0a80efe680c935414ecd7e1b14a4f
12
12
  ---
13
13
 
@@ -0,0 +1,42 @@
1
+ ---
2
+ name: tp-review-impact
3
+ description: PROACTIVELY use this when the user asks "will this PR break production", "what's the blast radius of these changes", or is about to merge into a main branch. Combines diff analysis with dependent discovery — flags risky public-API changes BEFORE they land.
4
+ tools:
5
+ - mcp__token-pilot__smart_diff
6
+ - mcp__token-pilot__find_usages
7
+ - mcp__token-pilot__read_symbol
8
+ - mcp__token-pilot__outline
9
+ - mcp__token-pilot__module_info
10
+ - Bash
11
+ token_pilot_version: "0.24.1"
12
+ token_pilot_body_hash: 72b635f511492188587d6cb6fd70f936ae34cf5df1f9cd9eff7849cf1231e185
13
+ ---
14
+
15
+ You are a token-pilot agent (`tp-<name>`). Your defining contract:
16
+
17
+ For every file in a programming language, you MUST use the token-pilot MCP tools (`mcp__token-pilot__smart_read`, `read_symbol`, `read_for_edit`, `outline`, `find_usages`, `explore_area`, `project_overview`) before considering raw Read. Raw Read is allowed only with explicit `offset`/`limit`, or when MCP tools have already been tried and do not fit the task — in which case you must say so in your reasoning. Never dump a file's full contents unless absolutely necessary.
18
+
19
+ If any MCP tool fails, fall back sensibly (another MCP tool → bounded Read → pass-through) and note the fallback in your output. Never silently abandon the contract.
20
+
21
+ Your specific role is defined below.
22
+
23
+ Role: pre-merge blast-radius review.
24
+
25
+ Response budget: ~700 tokens.
26
+
27
+ When asked to assess what a changeset could break:
28
+
29
+ 1. Load the changeset structurally via `smart_diff` (branch vs base, or commit range). Identify every changed SYMBOL — not just changed files.
30
+ 2. For each changed symbol that is exported / public / re-exported from an index — run `find_usages` to enumerate dependents. Internal-only symbols are noted but not deep-dived (low blast radius).
31
+ 3. For the riskiest changes (signature change on a heavily-used symbol, removal, behaviour swap), `read_symbol` on 1-2 critical call sites to judge compatibility.
32
+ 4. `module_info` on the touched file to confirm entry-point status (exported from package root? Part of public API surface?).
33
+ 5. Deliver: one-line verdict (`safe / needs review / blocking`) → table of `path:line · symbol · dependents · compatibility` sorted by risk desc → mandatory pre-merge actions (migration notes / rollback hints).
34
+
35
+ Do NOT propose fixes. Do NOT re-state the diff. Do NOT include dependents that aren't actually called (imports are noise). Confidence threshold: call something "blocking" only when you have a specific dependent that will fail to compile or misbehave.
36
+
37
+ RESPONSE CONTRACT:
38
+ - Lead with a one-line verdict.
39
+ - Use bold section headers; one finding per bullet.
40
+ - Reference code as `path:line`; paste source only if your role requires a patch.
41
+ - Do NOT narrate tool calls. Do NOT preamble with "what was done well".
42
+ - If findings exceed your budget, write overflow to `.token-pilot/<agent>-<timestamp>.md` and reference it; keep the visible response within budget.
@@ -15,7 +15,7 @@ tools:
15
15
  - Grep
16
16
  - Glob
17
17
  - Bash
18
- token_pilot_version: "0.23.7"
18
+ token_pilot_version: "0.24.1"
19
19
  token_pilot_body_hash: d665d57085db38077d0eeab74bda8bdb84c9ad59688495486059af5d3fac67cf
20
20
  ---
21
21
 
@@ -9,7 +9,7 @@ tools:
9
9
  - mcp__token-pilot__session_budget
10
10
  - Bash
11
11
  - Read
12
- token_pilot_version: "0.23.7"
12
+ token_pilot_version: "0.24.1"
13
13
  token_pilot_body_hash: 35b7f333a28c94e7dc89fcc3171703c4b466225f55cd5c701b7592f4f6486440
14
14
  ---
15
15
 
@@ -0,0 +1,49 @@
1
+ ---
2
+ name: tp-test-coverage-gapper
3
+ model: claude-haiku-4-5-20251001
4
+ description: PROACTIVELY use this when the user asks "what's untested", "find coverage gaps", "which symbols have zero tests", or wants to plan a testing sprint. Enumerates exported symbols, cross-checks against test-file references, returns a prioritised gap list.
5
+ tools:
6
+ - mcp__token-pilot__outline
7
+ - mcp__token-pilot__find_unused
8
+ - mcp__token-pilot__find_usages
9
+ - mcp__token-pilot__related_files
10
+ - mcp__token-pilot__test_summary
11
+ - Glob
12
+ - Grep
13
+ token_pilot_version: "0.24.1"
14
+ token_pilot_body_hash: cc3d1f46fdb95ac3caf9344f69f1ddcd5ce5a175ee70aa150b7f9fda93edb152
15
+ ---
16
+
17
+ You are a token-pilot agent (`tp-<name>`). Your defining contract:
18
+
19
+ For every file in a programming language, you MUST use the token-pilot MCP tools (`mcp__token-pilot__smart_read`, `read_symbol`, `read_for_edit`, `outline`, `find_usages`, `explore_area`, `project_overview`) before considering raw Read. Raw Read is allowed only with explicit `offset`/`limit`, or when MCP tools have already been tried and do not fit the task — in which case you must say so in your reasoning. Never dump a file's full contents unless absolutely necessary.
20
+
21
+ If any MCP tool fails, fall back sensibly (another MCP tool → bounded Read → pass-through) and note the fallback in your output. Never silently abandon the contract.
22
+
23
+ Your specific role is defined below.
24
+
25
+ Role: test coverage gap finder.
26
+
27
+ Response budget: ~500 tokens.
28
+
29
+ When asked to find untested code:
30
+
31
+ 1. Scope the target (file / module / whole repo). For repo scope, start with `outline` on top-level exports via `project_overview` hints — do NOT recurse into every file blindly.
32
+ 2. For each exported symbol, `find_usages` filtered to paths matching test patterns (`**/*.test.*`, `**/*.spec.*`, `__tests__/**`). Zero hits = candidate gap.
33
+ 3. `related_files` on the source file → if there's a sibling test file but the symbol isn't referenced there, flag as "partial coverage".
34
+ 4. For files with NO sibling test file at all, use `test_summary` to check whether the project has a coverage report — if yes, read the numbers instead of inferring.
35
+ 5. Deliver: bulleted list grouped by severity:
36
+ - **Critical:** public API exports with zero test references
37
+ - **Important:** exported utilities / helpers with no test file nearby
38
+ - **Minor:** internal symbols without tests (low priority)
39
+
40
+ Each entry: `path:line · symbol · "no-tests-found" | "sibling-test-missing-reference"`. No prose.
41
+
42
+ Do NOT write tests (that's tp-test-writer). Do NOT deep-dive into individual symbols. Do NOT report as "gap" a symbol that re-exports something tested elsewhere — check the origin first.
43
+
44
+ RESPONSE CONTRACT:
45
+ - Lead with a one-line verdict.
46
+ - Use bold section headers; one finding per bullet.
47
+ - Reference code as `path:line`; paste source only if your role requires a patch.
48
+ - Do NOT narrate tool calls. Do NOT preamble with "what was done well".
49
+ - If findings exceed your budget, write overflow to `.token-pilot/<agent>-<timestamp>.md` and reference it; keep the visible response within budget.
@@ -7,7 +7,7 @@ tools:
7
7
  - mcp__token-pilot__read_range
8
8
  - mcp__token-pilot__find_usages
9
9
  - mcp__token-pilot__read_symbol
10
- token_pilot_version: "0.23.7"
10
+ token_pilot_version: "0.24.1"
11
11
  token_pilot_body_hash: 255912c47661d203c8f9a735237bc419f97e937f788a01811bbe126ee3dd5878
12
12
  ---
13
13
 
@@ -12,7 +12,7 @@ tools:
12
12
  - Write
13
13
  - Edit
14
14
  - Bash
15
- token_pilot_version: "0.23.7"
15
+ token_pilot_version: "0.24.1"
16
16
  token_pilot_body_hash: 533b3d2387e631a24291314b2b8ad8c3e01c19e0b9ec1d3fe08ae0011f0c73f9
17
17
  ---
18
18
 
@@ -21,22 +21,22 @@ export async function handleReadSymbols(args, projectRoot, symbolResolver, fileC
21
21
  }
22
22
  const N = args.symbols.length;
23
23
  const sections = [];
24
- // v0.23.6 — anti-pattern guard. When the caller requests nearly every
25
- // symbol in the file, the sum of bodies + N × per-symbol metadata
26
- // exceeds a single raw Read. That's worse than what smart_read +
27
- // read_for_edit would do. Refuse the request and tell the caller.
28
- if (structure && structure.symbols && structure.symbols.length > 0) {
24
+ // v0.24.1 — anti-pattern guard. When the caller requests most of the
25
+ // symbols in the file, a batch read costs more than one smart_read.
26
+ // Earlier versions used sum(lineCount), but ast-index parser bugs
27
+ // (arrow functions / export function / Vue SFC / TS types) return
28
+ // overlapping ranges that inflate the sum and let the guard miss. The
29
+ // count-based check (requested symbols / total top-level symbols) is
30
+ // immune to those parser issues.
31
+ if (structure && structure.symbols && structure.symbols.length >= 3) {
29
32
  const uniqueRequested = new Set(args.symbols.map((s) => s.split(".")[0]));
30
33
  const matchedTopLevel = structure.symbols.filter((s) => uniqueRequested.has(s.name));
31
- const totalTopLevelLines = structure.symbols.reduce((sum, s) => sum + (s.location.lineCount ?? 0), 0);
32
- const requestedLines = matchedTopLevel.reduce((sum, s) => sum + (s.location.lineCount ?? 0), 0);
33
- if (totalTopLevelLines > 0 &&
34
- requestedLines / totalTopLevelLines >= 0.7 &&
35
- matchedTopLevel.length >= 3) {
34
+ const total = structure.symbols.length;
35
+ if (matchedTopLevel.length >= 3 && matchedTopLevel.length / total >= 0.7) {
36
36
  const text = `FILE: ${args.path} | SYMBOLS: ${N} requested\n\n` +
37
- `ADVISORY: You requested ${matchedTopLevel.length} symbols covering ` +
38
- `≥70% of this file (${requestedLines}/${totalTopLevelLines} lines). ` +
39
- `A batch read here costs more than reading the whole file once.\n\n` +
37
+ `ADVISORY: You requested ${matchedTopLevel.length}/${total} symbols ` +
38
+ `(≥70% of the file's top-level symbols). A batch read here costs ` +
39
+ `more than reading the whole file once.\n\n` +
40
40
  `Cheaper alternatives:\n` +
41
41
  ` - smart_read("${args.path}") for a structural overview\n` +
42
42
  ` - read_for_edit("${args.path}", "<symbol>") when you need exact edit context\n` +
@@ -45,6 +45,19 @@ export async function handleReadSymbols(args, projectRoot, symbolResolver, fileC
45
45
  `or use raw Read (bounded).`;
46
46
  return { content: [{ type: "text", text }] };
47
47
  }
48
+ // v0.24.1 — sanity-check for parser-produced overlapping ranges. When
49
+ // the sum of symbol lineCounts exceeds the actual file line count by
50
+ // >1.5× we're almost certainly seeing ast-index getting confused by
51
+ // arrow functions / Vue SFCs / type-vs-function. Log once so users
52
+ // can report upstream; don't fail the request.
53
+ const totalSymbolLines = structure.symbols.reduce((sum, s) => sum + (s.location.lineCount ?? 0), 0);
54
+ const fileLines = lines.length;
55
+ if (fileLines > 0 && totalSymbolLines > fileLines * 1.5) {
56
+ process.stderr.write(`[token-pilot] ast-index parser produced overlapping symbol ranges for ${args.path} ` +
57
+ `(sum=${totalSymbolLines} lines vs file=${fileLines}). ` +
58
+ `Results may mis-classify symbols; consider smart_read + read_for_edit instead. ` +
59
+ `Upstream issue: https://github.com/defendend/Claude-ast-index-search/issues\n`);
60
+ }
48
61
  }
49
62
  // Show mode constants (same as read_symbol.ts)
50
63
  const MAX_SYMBOL_LINES = 300;
@@ -108,20 +108,25 @@ code_audit, find_unused, session_snapshot, session_budget, session_analytics.
108
108
  Raw Read/Grep allowed only with offset/limit / narrow regex / non-code files,
109
109
  or TOKEN_PILOT_BYPASS=1.`;
110
110
  const DECISION_GUIDE = `WHEN DELEGATING — if the task fits a specialist, use the Task tool:
111
- bug / stack trace → tp-debugger
112
- PR / diff review → tp-pr-reviewer
113
- impact before change → tp-impact-analyzer
114
- plan refactor → tp-refactor-planner
115
- failing tests → tp-test-triage
116
- write new tests → tp-test-writer
117
- migrate API / version → tp-migration-scout
118
- "why is this like this?"→ tp-history-explorer
119
- security / quality audit→ tp-audit-scanner
120
- resume after /clear → tp-session-restorer
121
- dead code cleanup → tp-dead-code-finder
122
- commit message → tp-commit-writer
123
- repo onboarding → tp-onboard
124
- general workhorse → tp-run
111
+ bug / stack trace → tp-debugger
112
+ PR / diff review → tp-pr-reviewer
113
+ impact before change → tp-impact-analyzer
114
+ plan refactor → tp-refactor-planner
115
+ failing tests → tp-test-triage
116
+ write new tests → tp-test-writer
117
+ migrate API / version → tp-migration-scout
118
+ "why is this like this?" → tp-history-explorer
119
+ security / quality audit → tp-audit-scanner
120
+ resume after /clear → tp-session-restorer
121
+ dead code cleanup → tp-dead-code-finder
122
+ commit message → tp-commit-writer
123
+ repo onboarding → tp-onboard
124
+ blast radius of a PR → tp-review-impact
125
+ test coverage gaps → tp-test-coverage-gapper
126
+ public API diff / semver → tp-api-surface-tracker
127
+ dependency audit → tp-dep-health
128
+ incident post-mortem → tp-incident-timeline
129
+ general workhorse → tp-run
125
130
  Delegating keeps main-context lean; each specialist has a narrow toolset + budget.`;
126
131
  function estimateTokens(text) {
127
132
  // Fast approximation: chars / 4, adjusted for whitespace
@@ -0,0 +1,61 @@
1
+ # `.token-pilot/` directory layout
2
+
3
+ Token Pilot writes state and telemetry to a `.token-pilot/` directory in your project root. Nothing is created eagerly — each sub-path appears only when a feature that needs it fires for the first time. Committing the directory is optional; most users add it to `.gitignore`.
4
+
5
+ ## Files & directories
6
+
7
+ | Path | Who writes | When it appears | Purpose | Commit? |
8
+ |---|---|---|---|---|
9
+ | `hook-events.jsonl` | Read hook | First `Read` of a code file ≥ the `denyThreshold` | JSONL telemetry — one entry per hook decision (denied / allowed / bypass). Read by `session_analytics` and the adaptive-threshold logic. Auto-rotates at 10 MB, keeps 30-day history with 100 MB total cap. | **No** |
10
+ | `hook-denied.jsonl` | Read hook (legacy) | Historical, from before v0.20 | Older-format event log kept for backward compatibility with pre-v0.20 `loadDeniedReads()` readers. Safe to delete. | No |
11
+ | `snapshots/` | `session_snapshot` MCP tool | First call to `session_snapshot` | Archived + `latest.md` copy of each snapshot. Retention: last 10 archives. Surfaced by the SessionStart hook when `latest.md` is < 2h old. | Optional — commit `latest.md` if you want a next-session pointer to persist across clones |
12
+ | `context-registries/<session-id>.json` | MCP server | First dedup-aware tool call (`smart_read`, `read_symbol`, `read_range`, `smart_read_many`) with a `session_id` arg | Per-session dedup state — what was loaded, content hashes, load times. LRU cap 8 sessions in memory; older ones live only on disk. | **No** — session-local |
13
+ | `docs/<name>.md` | `token-pilot save-doc` CLI | Explicit user invocation | User-saved research / notes (curl output, WebFetch, long paste) that should survive compaction. Read back cheaply with `smart_read` / `read_range`. | Optional — commit if the doc is team-wide knowledge |
14
+ | `over-budget.log` | PostToolUse:Task hook | First time a `tp-*` subagent exceeds its frontmatter budget by > 10 % | JSONL audit trail for budget overages. Used by future `session_analytics` views. Empty file means no overages so far — good. | **No** |
15
+ | `tp-refactor-planner-<timestamp>.md` | `tp-refactor-planner` agent | Only when the agent's plan exceeds its response budget | Full step list spilled from the agent's response when it would have exceeded ~500 tokens inline. Response in chat points at this file. | Usually **no** — review and delete after the refactor lands |
16
+
17
+ ## Quick rules
18
+
19
+ - **Nothing appears until you use it.** If you only ever ran the Read hook, only `hook-events.jsonl` will exist.
20
+ - **Safe to delete the whole folder** — everything rebuilds from the next tool call. You'll lose: dedup memory, snapshot history, saved research. You'll keep: installed hooks, agents, configs (those live elsewhere).
21
+ - **Add to `.gitignore`** unless your team uses snapshots or saved-docs as shared context.
22
+
23
+ Recommended `.gitignore` stanza:
24
+
25
+ ```gitignore
26
+ # Token Pilot — session-local telemetry & state
27
+ .token-pilot/hook-events*.jsonl
28
+ .token-pilot/hook-denied.jsonl
29
+ .token-pilot/context-registries/
30
+ .token-pilot/over-budget.log
31
+ .token-pilot/tp-refactor-planner-*.md
32
+
33
+ # Keep these if you want shared snapshots / notes:
34
+ # !.token-pilot/snapshots/latest.md
35
+ # !.token-pilot/docs/
36
+ ```
37
+
38
+ ## Inspecting live state
39
+
40
+ Useful one-liners to audit a session in progress:
41
+
42
+ ```bash
43
+ # How many hook events this session
44
+ wc -l .token-pilot/hook-events.jsonl
45
+
46
+ # Token savings so far (sum of savedTokens)
47
+ cat .token-pilot/hook-events.jsonl | python3 -c "
48
+ import sys, json
49
+ print(sum(json.loads(l).get('savedTokens', 0) for l in sys.stdin if l.strip()))"
50
+
51
+ # Any subagent blew its budget?
52
+ cat .token-pilot/over-budget.log 2>/dev/null || echo "no overages"
53
+
54
+ # Latest session snapshot (if any)
55
+ cat .token-pilot/snapshots/latest.md 2>/dev/null | head -20
56
+
57
+ # What dedup state is persisted
58
+ ls .token-pilot/context-registries/ 2>/dev/null
59
+ ```
60
+
61
+ Or ask the agent — `mcp__token-pilot__session_analytics` returns the same signals in a structured form.
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "token-pilot",
3
- "version": "0.23.7",
3
+ "version": "0.24.1",
4
4
  "description": "Save up to 80% tokens when AI reads code — MCP server for token-efficient code navigation, AST-aware structural reading instead of dumping full files into context window",
5
5
  "type": "module",
6
6
  "main": "dist/index.js",
@@ -11,6 +11,7 @@
11
11
  "dist/**/*.js",
12
12
  "dist/**/*.d.ts",
13
13
  "dist/agents/*.md",
14
+ "docs/*.md",
14
15
  "scripts/postinstall.mjs",
15
16
  "start.sh",
16
17
  ".claude-plugin/",