npm - token-pilot - Versions diffs - 0.28.3 → 0.29.0 - Mend

token-pilot 0.28.3 → 0.29.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (32)

package/.claude-plugin/marketplace.json +2 -2
package/.claude-plugin/plugin.json +1 -1
package/CHANGELOG.md +40 -0
package/agents/tp-api-surface-tracker.md +4 -2
package/agents/tp-audit-scanner.md +4 -2
package/agents/tp-commit-writer.md +4 -2
package/agents/tp-context-engineer.md +4 -2
package/agents/tp-dead-code-finder.md +4 -2
package/agents/tp-debugger.md +4 -2
package/agents/tp-dep-health.md +4 -2
package/agents/tp-doc-writer.md +4 -2
package/agents/tp-history-explorer.md +4 -2
package/agents/tp-impact-analyzer.md +4 -2
package/agents/tp-incident-timeline.md +4 -2
package/agents/tp-incremental-builder.md +4 -2
package/agents/tp-migration-scout.md +4 -2
package/agents/tp-onboard.md +4 -2
package/agents/tp-performance-profiler.md +4 -2
package/agents/tp-pr-reviewer.md +4 -2
package/agents/tp-refactor-planner.md +4 -2
package/agents/tp-review-impact.md +4 -2
package/agents/tp-run.md +4 -2
package/agents/tp-session-restorer.md +4 -2
package/agents/tp-ship-coordinator.md +4 -2
package/agents/tp-spec-writer.md +4 -2
package/agents/tp-test-coverage-gapper.md +4 -2
package/agents/tp-test-triage.md +4 -2
package/agents/tp-test-writer.md +4 -2
package/dist/hooks/pre-bash.d.ts +11 -0
package/dist/hooks/pre-bash.js +53 -0
package/dist/server/tool-definitions.js +2 -2
package/package.json +1 -1

package/.claude-plugin/marketplace.json CHANGED

@@ -6,14 +6,14 @@
   },
   "metadata": {
     "description": "Token Pilot \u2014 save 60-90% tokens when AI reads code",
-    "version": "0.28.3"
+    "version": "0.29.0"
   },
   "plugins": [
     {
       "name": "token-pilot",
       "source": "./",
       "description": "Reduces token consumption by 60-90% via AST-aware lazy file reading, structural symbol navigation, and cross-session tool-usage analytics. 22 MCP tools + 19 subagents + budget watchdog hooks.",
-      "version": "0.28.3",
+      "version": "0.29.0",
       "author": {
         "name": "Digital-Threads"
       },

package/.claude-plugin/plugin.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "token-pilot",
-  "version": "0.28.3",
+  "version": "0.29.0",
   "description": "Saves 60-90% tokens when AI reads code. AST-aware lazy reading, symbol navigation, cross-session tool-usage analytics, 22 subagents (haiku/sonnet/opus-tiered) with budget watchdog.",
   "author": {
     "name": "Digital-Threads",

package/CHANGELOG.md CHANGED

@@ -5,6 +5,46 @@ All notable changes to Token Pilot will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+## [0.29.0] - 2026-04-19
+Consolidation release based on Sonnet 4.6 + Opus 4.7 verification findings. Closes the short-tail issues that came out of the two live runs before the weekly-quota window reopens.
+### Added — context-mode partnership in shared preamble
+Both verification runs showed the same asymmetry: `token-pilot` saves on delegated (subagent) code reads; `context-mode` saves on main-thread Bash/command execution. Opus 4.7 literally wrote: "In all the remaining work I used `ctx_batch_execute` instead of raw Bash; that is adoption of context-mode, not token-pilot". That's the right behaviour — we shouldn't fight it, we should formalise it.
+All 25 tp-* agents now carry an instruction in the shared preamble: *for heavy Bash (tests, builds, recursive searches, network calls), prefer `mcp__context-mode__execute` / `ctx_batch_execute` when available — runs in sandbox, only result enters context (95% reduction vs raw stdout)*. This is complementary, not redundant: token-pilot owns code reading, context-mode owns command execution.
+### Fixed — composite Bash escape patterns (from Opus 4.7 v0.28.2 report)
+Opus's verification noted that quoted / wrapped heavy commands slipped past our `PreToolUse:Bash` hook:
+- `bash -c "cat src/foo.ts"` → slipped
+- `sh -c "grep -r foo ."` → slipped
+- `eval "cat src/foo.ts"` → slipped
+- `for f in *.ts; do cat $f; done` → slipped
+- `while read f; do git log; done` → slipped
+Added `extractWrappedCommands()` in `src/hooks/pre-bash.ts` — unwraps `bash/sh/zsh -c "..."`, `eval "..."`, `for/while/until ... do BODY done` — and re-runs the heavy-pattern check on each inner body. First deny wins. Adds 7 regression tests covering both deny (heavy inside wrapper) and allow (benign inside wrapper — `bash -c "ls"`, `eval "echo hello"`).
+### Changed — honest tool descriptions for weak performers
+- `smart_log` description now carries a heads-up: "two verification runs measured this tool at ~39% token reduction (borderline). Cumulative data being gathered — tool may be dropped or redesigned in v0.30.0 if numbers don't improve". The description already advised scoping with `path` or `count`; kept.
+- `session_budget` re-framed as **META / info-only** — doesn't save tokens itself, purely diagnostic. This matches the META_TOOLS grouping in profiles (shipped in v0.28.1) and stops users thinking it's an optimisation tool.
+### Changed — composed-agent line budget 60 → 65
+Shared preamble now carries the context-mode paragraph — 3 extra lines flow into every composed agent file. Three agents (tp-context-engineer, tp-dead-code-finder, tp-doc-writer) ticked over the 60-line cap by 1-3 lines. Raised the hard limit to 65 to accommodate the new content without trimming per-agent instructions. 25 agents currently in the 38-63 range.
+### Deferred to v0.30.0
+- **Stop-hook output watchdog** — cap main-thread response size. Needs an experiment against Claude Code API first; too much new surface for a same-day patch.
+- **Automatic MCP response buffer** — intercept 3rd-party MCP (GitHub / Jira / Slack) responses via `updatedMCPToolOutput`. Biggest potential lever in the ecosystem, but a full feature, not a patch.
+- **`smart_log` final decision** — keep, redesign, or drop based on cumulative `tool-audit` data after a week of use.
+- **`explore_area` self-sizing** — v0.28.3 tightened the caps (20/500 → 10/200); next step is to compare predicted output to the `estimateExploreAreaWorkflowTokens` baseline and trim when exceeded.
+1026 tests passing (+7 new on composite Bash escape).
 ## [0.28.3] - 2026-04-19
 ### Fixed — `explore_area` output size (was −31% savings)
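
The "composite Bash escape patterns" fix in the 0.29.0 entry above can be illustrated with a minimal sketch. This is not the package's `src/hooks/pre-bash.ts`: only the name `extractWrappedCommands` and the list of wrappers (`bash/sh/zsh -c "..."`, `eval "..."`, `for/while/until ... do BODY done`) come from the changelog. The regular expressions, `HEAVY_PATTERNS`, `isHeavy`, and `checkBashCommand` are hypothetical stand-ins, and the sketch unwraps only one level of nesting.

```typescript
// Hypothetical sketch of the unwrap-then-recheck idea; the real hook may differ.

// Assumed stand-in for the hook's heavy-command patterns (anchored at command start).
const HEAVY_PATTERNS: RegExp[] = [
  /^\s*cat\s+\S/,                          // dumping a whole file
  /^\s*grep\s+(?:-\w*r\w*|--recursive)\b/, // recursive grep
  /^\s*git\s+log\b/,                       // git history dump
];

function isHeavy(command: string): boolean {
  return HEAVY_PATTERNS.some((pattern) => pattern.test(command));
}

// Push capture group `group` of every match of `re` in `text` onto `out`.
function collect(re: RegExp, text: string, group: number, out: string[]): void {
  let match: RegExpExecArray | null;
  while ((match = re.exec(text)) !== null) {
    out.push(match[group]);
  }
}

// Return the inner command bodies of the wrappers named in the changelog.
function extractWrappedCommands(command: string): string[] {
  const bodies: string[] = [];
  // bash/sh/zsh -c "..." (or single-quoted)
  collect(/\b(?:bash|sh|zsh)\s+-c\s+(["'])(.*?)\1/g, command, 2, bodies);
  // eval "..." (or single-quoted)
  collect(/\beval\s+(["'])(.*?)\1/g, command, 2, bodies);
  // for/while/until ... do BODY done
  collect(
    /\b(?:for|while|until)\b[\s\S]*?\bdo\b\s*([\s\S]*?)\s*;?\s*\bdone\b/g,
    command,
    1,
    bodies,
  );
  return bodies;
}

type Decision = "allow" | "deny";

// Check the outer command, then each unwrapped body; the first deny wins.
function checkBashCommand(command: string): Decision {
  if (isHeavy(command)) return "deny";
  for (const inner of extractWrappedCommands(command)) {
    if (isHeavy(inner)) return "deny";
  }
  return "allow";
}
```

With these assumed patterns, `bash -c "cat src/foo.ts"` passes the anchored outer check (the command starts with `bash`, not `cat`) and is caught only once the quoted body is unwrapped, while `bash -c "ls"` and `eval "echo hello"` remain allowed.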

package/agents/tp-api-surface-tracker.md CHANGED

@@ -9,8 +9,8 @@ tools:
   - mcp__token-pilot__read_symbol
   - Bash
 model: haiku
-token_pilot_version: "0.28.3"
-token_pilot_body_hash: f30fb3378463d6518041650487f1074b5411c6c3d6d7df315d21267f25f812d6
+token_pilot_version: "0.29.0"
+token_pilot_body_hash: c9d33476fdf70c8a7a493ec8720f54792eda2f81585996246e94c130ff3ec356
 ---
 You are a token-pilot agent (`tp-<name>`). Your defining contract:
@@ -19,6 +19,8 @@ For every file in a programming language, you MUST use the token-pilot MCP tools
 If any MCP tool fails, fall back sensibly (another MCP tool → bounded Read → pass-through) and note the fallback in your output. Never silently abandon the contract.
+For heavy Bash operations (test runs, builds, recursive searches, network calls, any command with potentially large stdout): when `mcp__context-mode__execute` or `ctx_batch_execute` is available, use it instead of raw Bash. Context-mode runs commands in a sandbox and only the result enters your context — typically 95% token reduction vs raw stdout dump. This is complementary to token-pilot: we own code reading, context-mode owns command execution.
 Your specific role is defined below.
 Role: public-API diff with semver classification.

package/agents/tp-audit-scanner.md CHANGED

@@ -11,8 +11,8 @@ tools:
   - Grep
   - Read
 model: sonnet
-token_pilot_version: "0.28.3"
-token_pilot_body_hash: a740dc6c928d11d7c2c5fbaa953c50b0e35f2abc2dd6e5ef5117bf469a2d0207
+token_pilot_version: "0.29.0"
+token_pilot_body_hash: 7095ffab66aca2e424f00875933e3f63bc10651eef2fde6a59f08bbbdbf86f7c
 ---
 You are a token-pilot agent (`tp-<name>`). Your defining contract:
@@ -21,6 +21,8 @@ For every file in a programming language, you MUST use the token-pilot MCP tools
 If any MCP tool fails, fall back sensibly (another MCP tool → bounded Read → pass-through) and note the fallback in your output. Never silently abandon the contract.
+For heavy Bash operations (test runs, builds, recursive searches, network calls, any command with potentially large stdout): when `mcp__context-mode__execute` or `ctx_batch_execute` is available, use it instead of raw Bash. Context-mode runs commands in a sandbox and only the result enters your context — typically 95% token reduction vs raw stdout dump. This is complementary to token-pilot: we own code reading, context-mode owns command execution.
 Your specific role is defined below.
 Role: audit scanner — surfaces risks, never fixes.

package/agents/tp-commit-writer.md CHANGED

@@ -8,8 +8,8 @@ tools:
   - mcp__token-pilot__test_summary
   - mcp__token-pilot__outline
   - Bash
-token_pilot_version: "0.28.3"
-token_pilot_body_hash: 559a0b61d20974bf33e35bc4c80dcf1b41d10d4df46cf9d05d3d5620713cd46f
+token_pilot_version: "0.29.0"
+token_pilot_body_hash: b6831f11c61a9b255c2b6ffa04837130242fd02843463a7d30f109c1a06b3e3f
 ---
 You are a token-pilot agent (`tp-<name>`). Your defining contract:
@@ -18,6 +18,8 @@ For every file in a programming language, you MUST use the token-pilot MCP tools
 If any MCP tool fails, fall back sensibly (another MCP tool → bounded Read → pass-through) and note the fallback in your output. Never silently abandon the contract.
+For heavy Bash operations (test runs, builds, recursive searches, network calls, any command with potentially large stdout): when `mcp__context-mode__execute` or `ctx_batch_execute` is available, use it instead of raw Bash. Context-mode runs commands in a sandbox and only the result enters your context — typically 95% token reduction vs raw stdout dump. This is complementary to token-pilot: we own code reading, context-mode owns command execution.
 Your specific role is defined below.
 Role: commit-message authoring.

package/agents/tp-context-engineer.md CHANGED

@@ -13,8 +13,8 @@ tools:
   - Edit
   - Glob
 model: sonnet
-token_pilot_version: "0.28.3"
-token_pilot_body_hash: 8977f452021085a9ba63338bf94e8903e56b30e199dc32e41acc4ec3173a931d
+token_pilot_version: "0.29.0"
+token_pilot_body_hash: 43f9364ce722ff76daf0f8720ddaf9f77e18d4c4ed8bee3e15f12d207798e778
 ---
 You are a token-pilot agent (`tp-<name>`). Your defining contract:
@@ -23,6 +23,8 @@ For every file in a programming language, you MUST use the token-pilot MCP tools
 If any MCP tool fails, fall back sensibly (another MCP tool → bounded Read → pass-through) and note the fallback in your output. Never silently abandon the contract.
+For heavy Bash operations (test runs, builds, recursive searches, network calls, any command with potentially large stdout): when `mcp__context-mode__execute` or `ctx_batch_execute` is available, use it instead of raw Bash. Context-mode runs commands in a sandbox and only the result enters your context — typically 95% token reduction vs raw stdout dump. This is complementary to token-pilot: we own code reading, context-mode owns command execution.
 Your specific role is defined below.
 Role: curate what AI agents see so output quality stays high.

package/agents/tp-dead-code-finder.md CHANGED

@@ -11,8 +11,8 @@ tools:
   - Grep
   - Read
 model: sonnet
-token_pilot_version: "0.28.3"
-token_pilot_body_hash: 33798b70002a206c4547d08ff46caefe6dbe5a9300f94ab5dad4a57ab5fb4478
+token_pilot_version: "0.29.0"
+token_pilot_body_hash: 386760aed26df6c3595d3267954605565fad08afa8761e016079ae60c19887a8
 ---
 You are a token-pilot agent (`tp-<name>`). Your defining contract:
@@ -21,6 +21,8 @@ For every file in a programming language, you MUST use the token-pilot MCP tools
 If any MCP tool fails, fall back sensibly (another MCP tool → bounded Read → pass-through) and note the fallback in your output. Never silently abandon the contract.
+For heavy Bash operations (test runs, builds, recursive searches, network calls, any command with potentially large stdout): when `mcp__context-mode__execute` or `ctx_batch_execute` is available, use it instead of raw Bash. Context-mode runs commands in a sandbox and only the result enters your context — typically 95% token reduction vs raw stdout dump. This is complementary to token-pilot: we own code reading, context-mode owns command execution.
 Your specific role is defined below.
 Role: safe dead-code detection.

package/agents/tp-debugger.md CHANGED

@@ -12,8 +12,8 @@ tools:
   - Read
   - Bash
 model: sonnet
-token_pilot_version: "0.28.3"
-token_pilot_body_hash: ada78a5a3f029721fa51e7cd203395ff0e87f0ab614cc7cf0d5bcc1bf9a80435
+token_pilot_version: "0.29.0"
+token_pilot_body_hash: 71738830d025e86c70988e046a2f7f30b4590f3d284291a18609ed5fdd732321
 ---
 You are a token-pilot agent (`tp-<name>`). Your defining contract:
@@ -22,6 +22,8 @@ For every file in a programming language, you MUST use the token-pilot MCP tools
 If any MCP tool fails, fall back sensibly (another MCP tool → bounded Read → pass-through) and note the fallback in your output. Never silently abandon the contract.
+For heavy Bash operations (test runs, builds, recursive searches, network calls, any command with potentially large stdout): when `mcp__context-mode__execute` or `ctx_batch_execute` is available, use it instead of raw Bash. Context-mode runs commands in a sandbox and only the result enters your context — typically 95% token reduction vs raw stdout dump. This is complementary to token-pilot: we own code reading, context-mode owns command execution.
 Your specific role is defined below.
 Role: bug diagnosis via systematic triage.

package/agents/tp-dep-health.md CHANGED

@@ -9,8 +9,8 @@ tools:
   - Bash
   - Read
 model: haiku
-token_pilot_version: "0.28.3"
-token_pilot_body_hash: 6224d989835ea284985b474005b8b46052b7007c4610e661b10658286b5c6624
+token_pilot_version: "0.29.0"
+token_pilot_body_hash: 12634cd28889d0a0ef1b4a6b994ba978353e14f3cb349011c393076e7e2b5c96
 ---
 You are a token-pilot agent (`tp-<name>`). Your defining contract:
@@ -19,6 +19,8 @@ For every file in a programming language, you MUST use the token-pilot MCP tools
 If any MCP tool fails, fall back sensibly (another MCP tool → bounded Read → pass-through) and note the fallback in your output. Never silently abandon the contract.
+For heavy Bash operations (test runs, builds, recursive searches, network calls, any command with potentially large stdout): when `mcp__context-mode__execute` or `ctx_batch_execute` is available, use it instead of raw Bash. Context-mode runs commands in a sandbox and only the result enters your context — typically 95% token reduction vs raw stdout dump. This is complementary to token-pilot: we own code reading, context-mode owns command execution.
 Your specific role is defined below.
 Role: dependency health audit.

package/agents/tp-doc-writer.md CHANGED

@@ -13,8 +13,8 @@ tools:
   - Edit
   - Glob
 model: haiku
-token_pilot_version: "0.28.3"
-token_pilot_body_hash: 72347b06aaea75ed960972e96e2523c221b2ea7c892a3931aa0e7c32e4c86555
+token_pilot_version: "0.29.0"
+token_pilot_body_hash: 8e29d07dd8f58adeb9530ec477a59a6e42de6c624f322d2c6cfa8da66456b46a
 ---
 You are a token-pilot agent (`tp-<name>`). Your defining contract:
@@ -23,6 +23,8 @@ For every file in a programming language, you MUST use the token-pilot MCP tools
 If any MCP tool fails, fall back sensibly (another MCP tool → bounded Read → pass-through) and note the fallback in your output. Never silently abandon the contract.
+For heavy Bash operations (test runs, builds, recursive searches, network calls, any command with potentially large stdout): when `mcp__context-mode__execute` or `ctx_batch_execute` is available, use it instead of raw Bash. Context-mode runs commands in a sandbox and only the result enters your context — typically 95% token reduction vs raw stdout dump. This is complementary to token-pilot: we own code reading, context-mode owns command execution.
 Your specific role is defined below.
 Role: documentation author — decisions, ADRs, READMEs, API docs.

package/agents/tp-history-explorer.md CHANGED

@@ -10,8 +10,8 @@ tools:
   - Bash
   - Read
 model: haiku
-token_pilot_version: "0.28.3"
-token_pilot_body_hash: b2daca007e959eaf26bf9a4d92ba36c3aa277a51de4ca4db674833d36acbe11b
+token_pilot_version: "0.29.0"
+token_pilot_body_hash: 260197bc31531352f5eda3b70cf114c7c57bb7e9373f68ca76161dd68a804b0d
 ---
 You are a token-pilot agent (`tp-<name>`). Your defining contract:
@@ -20,6 +20,8 @@ For every file in a programming language, you MUST use the token-pilot MCP tools
 If any MCP tool fails, fall back sensibly (another MCP tool → bounded Read → pass-through) and note the fallback in your output. Never silently abandon the contract.
+For heavy Bash operations (test runs, builds, recursive searches, network calls, any command with potentially large stdout): when `mcp__context-mode__execute` or `ctx_batch_execute` is available, use it instead of raw Bash. Context-mode runs commands in a sandbox and only the result enters your context — typically 95% token reduction vs raw stdout dump. This is complementary to token-pilot: we own code reading, context-mode owns command execution.
 Your specific role is defined below.
 Role: git-history archaeology — why, when, by whom.

package/agents/tp-impact-analyzer.md CHANGED

@@ -12,8 +12,8 @@ tools:
   - mcp__token-pilot__read_symbols
   - Read
 model: sonnet
-token_pilot_version: "0.28.3"
-token_pilot_body_hash: 0be2620ce0303f912f6b3334f261d169f064970c0d16602fa1e76db4cb2ea441
+token_pilot_version: "0.29.0"
+token_pilot_body_hash: 1da6936cc117a7627640fae3cc85bf13a17f0b0b0d0d533423dfb4b7c0b4b1c2
 ---
 You are a token-pilot agent (`tp-<name>`). Your defining contract:
@@ -22,6 +22,8 @@ For every file in a programming language, you MUST use the token-pilot MCP tools
 If any MCP tool fails, fall back sensibly (another MCP tool → bounded Read → pass-through) and note the fallback in your output. Never silently abandon the contract.
+For heavy Bash operations (test runs, builds, recursive searches, network calls, any command with potentially large stdout): when `mcp__context-mode__execute` or `ctx_batch_execute` is available, use it instead of raw Bash. Context-mode runs commands in a sandbox and only the result enters your context — typically 95% token reduction vs raw stdout dump. This is complementary to token-pilot: we own code reading, context-mode owns command execution.
 Your specific role is defined below.
 Role: impact analysis.

package/agents/tp-incident-timeline.md CHANGED

@@ -8,8 +8,8 @@ tools:
   - mcp__token-pilot__read_symbol
   - Bash
 model: inherit
-token_pilot_version: "0.28.3"
-token_pilot_body_hash: 420ffc423c7479a8d4e1b226cf73eb98d6d41388317c74a950d7f3b6240b6786
+token_pilot_version: "0.29.0"
+token_pilot_body_hash: 213746bab7acb6730a6edb16e1ff7b2c56572c3adf4f94990799f1c168cfa2ad
 ---
 You are a token-pilot agent (`tp-<name>`). Your defining contract:
@@ -18,6 +18,8 @@ For every file in a programming language, you MUST use the token-pilot MCP tools
 If any MCP tool fails, fall back sensibly (another MCP tool → bounded Read → pass-through) and note the fallback in your output. Never silently abandon the contract.
+For heavy Bash operations (test runs, builds, recursive searches, network calls, any command with potentially large stdout): when `mcp__context-mode__execute` or `ctx_batch_execute` is available, use it instead of raw Bash. Context-mode runs commands in a sandbox and only the result enters your context — typically 95% token reduction vs raw stdout dump. This is complementary to token-pilot: we own code reading, context-mode owns command execution.
 Your specific role is defined below.
 Role: incident post-mortem timeline builder.

package/agents/tp-incremental-builder.md CHANGED

@@ -13,8 +13,8 @@ tools:
   - Edit
   - Bash
 model: sonnet
-token_pilot_version: "0.28.3"
-token_pilot_body_hash: 9cb0bdf6e209d8ac613487385c01ef269d827dc3eddaf81b8eba581a3150b1e3
+token_pilot_version: "0.29.0"
+token_pilot_body_hash: 14c9adcabfb772c77a467a5fbfa682abbd5adc87e22d7fbe5d1329ffd790dde5
 ---
 You are a token-pilot agent (`tp-<name>`). Your defining contract:
@@ -23,6 +23,8 @@ For every file in a programming language, you MUST use the token-pilot MCP tools
 If any MCP tool fails, fall back sensibly (another MCP tool → bounded Read → pass-through) and note the fallback in your output. Never silently abandon the contract.
+For heavy Bash operations (test runs, builds, recursive searches, network calls, any command with potentially large stdout): when `mcp__context-mode__execute` or `ctx_batch_execute` is available, use it instead of raw Bash. Context-mode runs commands in a sandbox and only the result enters your context — typically 95% token reduction vs raw stdout dump. This is complementary to token-pilot: we own code reading, context-mode owns command execution.
 Your specific role is defined below.
 Role: incremental feature implementation with slice-by-slice discipline.

package/agents/tp-migration-scout.md CHANGED

@@ -11,8 +11,8 @@ tools:
   - Grep
   - Glob
 model: sonnet
-token_pilot_version: "0.28.3"
-token_pilot_body_hash: cf32cdee777430ecc6732db32b3f883a685c8a02b6dc93379d71b15555e79b3e
+token_pilot_version: "0.29.0"
+token_pilot_body_hash: 62893e448e943d0e1b928a670823ec3e152de395e487564862f145bd82161fcb
 ---
 You are a token-pilot agent (`tp-<name>`). Your defining contract:
@@ -21,6 +21,8 @@ For every file in a programming language, you MUST use the token-pilot MCP tools
 If any MCP tool fails, fall back sensibly (another MCP tool → bounded Read → pass-through) and note the fallback in your output. Never silently abandon the contract.
+For heavy Bash operations (test runs, builds, recursive searches, network calls, any command with potentially large stdout): when `mcp__context-mode__execute` or `ctx_batch_execute` is available, use it instead of raw Bash. Context-mode runs commands in a sandbox and only the result enters your context — typically 95% token reduction vs raw stdout dump. This is complementary to token-pilot: we own code reading, context-mode owns command execution.
 Your specific role is defined below.
 Role: migration impact mapping.

package/agents/tp-onboard.md CHANGED

@@ -10,8 +10,8 @@ tools:
   - mcp__token-pilot__smart_read
   - mcp__token-pilot__smart_read_many
   - mcp__token-pilot__read_section
-token_pilot_version: "0.28.3"
-token_pilot_body_hash: ae0b86eaffaf34bf283b94b5572481fa8c2d6a2a25193f1173b70bef0fbe1919
+token_pilot_version: "0.29.0"
+token_pilot_body_hash: 4e82f7b3c6446663e958fb6bf5eb5348bbdf33389269c888ce0dab766e50561f
 ---
 You are a token-pilot agent (`tp-<name>`). Your defining contract:
@@ -20,6 +20,8 @@ For every file in a programming language, you MUST use the token-pilot MCP tools
 If any MCP tool fails, fall back sensibly (another MCP tool → bounded Read → pass-through) and note the fallback in your output. Never silently abandon the contract.
+For heavy Bash operations (test runs, builds, recursive searches, network calls, any command with potentially large stdout): when `mcp__context-mode__execute` or `ctx_batch_execute` is available, use it instead of raw Bash. Context-mode runs commands in a sandbox and only the result enters your context — typically 95% token reduction vs raw stdout dump. This is complementary to token-pilot: we own code reading, context-mode owns command execution.
 Your specific role is defined below.
 Role: repository onboarding.

package/agents/tp-performance-profiler.md CHANGED

@@ -11,8 +11,8 @@ tools:
   - Bash
   - Read
 model: sonnet
-token_pilot_version: "0.28.3"
-token_pilot_body_hash: 14b6fb4423a839c119120c2ea12c9dd6ab6ad1aeb13df1e7c22807b290cf1f9c
+token_pilot_version: "0.29.0"
+token_pilot_body_hash: 8b9f454a47e57e3761668de788850ef97d5d6f127b059cf8e0cef03deaca3f98
 ---
 You are a token-pilot agent (`tp-<name>`). Your defining contract:
@@ -21,6 +21,8 @@ For every file in a programming language, you MUST use the token-pilot MCP tools
 If any MCP tool fails, fall back sensibly (another MCP tool → bounded Read → pass-through) and note the fallback in your output. Never silently abandon the contract.
+For heavy Bash operations (test runs, builds, recursive searches, network calls, any command with potentially large stdout): when `mcp__context-mode__execute` or `ctx_batch_execute` is available, use it instead of raw Bash. Context-mode runs commands in a sandbox and only the result enters your context — typically 95% token reduction vs raw stdout dump. This is complementary to token-pilot: we own code reading, context-mode owns command execution.
 Your specific role is defined below.
 Role: performance diagnosis and targeted optimization.

package/agents/tp-pr-reviewer.md CHANGED

@@ -11,8 +11,8 @@ tools:
   - mcp__token-pilot__read_for_edit
   - Read
 model: sonnet
-token_pilot_version: "0.28.3"
-token_pilot_body_hash: 73ba5844c8354088dcb10c671622daecc0e8589568de15a6001e1cf951eea586
+token_pilot_version: "0.29.0"
+token_pilot_body_hash: 91003b244472c4e65d840b55474a86ce04fba379859d588cc0fa54850b0e1e4f
 ---
 You are a token-pilot agent (`tp-<name>`). Your defining contract:
@@ -21,6 +21,8 @@ For every file in a programming language, you MUST use the token-pilot MCP tools
 If any MCP tool fails, fall back sensibly (another MCP tool → bounded Read → pass-through) and note the fallback in your output. Never silently abandon the contract.
+For heavy Bash operations (test runs, builds, recursive searches, network calls, any command with potentially large stdout): when `mcp__context-mode__execute` or `ctx_batch_execute` is available, use it instead of raw Bash. Context-mode runs commands in a sandbox and only the result enters your context — typically 95% token reduction vs raw stdout dump. This is complementary to token-pilot: we own code reading, context-mode owns command execution.
 Your specific role is defined below.
 Role: PR / diff review across five axes.

package/agents/tp-refactor-planner.md CHANGED

@@ -8,8 +8,8 @@ tools:
   - mcp__token-pilot__outline
   - mcp__token-pilot__read_symbol
 model: sonnet
-token_pilot_version: "0.28.3"
-token_pilot_body_hash: dcc2c2aaeb443cc9688639b4337c6069b9d5bf21e3ed757fc8b3ac8a9d61bc03
+token_pilot_version: "0.29.0"
+token_pilot_body_hash: 45f972c6b36929491a529322bac3c34fd44872f7be4a974d25c7e27cb12e9dc3
 ---
 You are a token-pilot agent (`tp-<name>`). Your defining contract:
@@ -18,6 +18,8 @@ For every file in a programming language, you MUST use the token-pilot MCP tools
 If any MCP tool fails, fall back sensibly (another MCP tool → bounded Read → pass-through) and note the fallback in your output. Never silently abandon the contract.
+For heavy Bash operations (test runs, builds, recursive searches, network calls, any command with potentially large stdout): when `mcp__context-mode__execute` or `ctx_batch_execute` is available, use it instead of raw Bash. Context-mode runs commands in a sandbox and only the result enters your context — typically 95% token reduction vs raw stdout dump. This is complementary to token-pilot: we own code reading, context-mode owns command execution.
 Your specific role is defined below.
 Role: refactor planning with behaviour-preservation discipline.

package/agents/tp-review-impact.md CHANGED

@@ -9,8 +9,8 @@ tools:
   - mcp__token-pilot__module_info
   - Bash
 model: sonnet
-token_pilot_version: "0.28.3"
-token_pilot_body_hash: 72b635f511492188587d6cb6fd70f936ae34cf5df1f9cd9eff7849cf1231e185
+token_pilot_version: "0.29.0"
+token_pilot_body_hash: 3c1c66f952ac63a5936bec86fefda8c842fb9713bca81e48ca5bb568ccb5f367
 ---
 You are a token-pilot agent (`tp-<name>`). Your defining contract:
@@ -19,6 +19,8 @@ For every file in a programming language, you MUST use the token-pilot MCP tools
 If any MCP tool fails, fall back sensibly (another MCP tool → bounded Read → pass-through) and note the fallback in your output. Never silently abandon the contract.
+For heavy Bash operations (test runs, builds, recursive searches, network calls, any command with potentially large stdout): when `mcp__context-mode__execute` or `ctx_batch_execute` is available, use it instead of raw Bash. Context-mode runs commands in a sandbox and only the result enters your context — typically 95% token reduction vs raw stdout dump. This is complementary to token-pilot: we own code reading, context-mode owns command execution.
 Your specific role is defined below.
 Role: pre-merge blast-radius review.

package/agents/tp-run.md CHANGED

@@ -16,8 +16,8 @@ tools:
   - Glob
   - Bash
 model: haiku
-token_pilot_version: "0.28.3"
-token_pilot_body_hash: d665d57085db38077d0eeab74bda8bdb84c9ad59688495486059af5d3fac67cf
+token_pilot_version: "0.29.0"
+token_pilot_body_hash: de342efe1e3ee265df1773ebde1241555750ab17de249190a5c1c200f1f8f51a
 ---
 You are a token-pilot agent (`tp-<name>`). Your defining contract:
@@ -26,6 +26,8 @@ For every file in a programming language, you MUST use the token-pilot MCP tools
 If any MCP tool fails, fall back sensibly (another MCP tool → bounded Read → pass-through) and note the fallback in your output. Never silently abandon the contract.
+For heavy Bash operations (test runs, builds, recursive searches, network calls, any command with potentially large stdout): when `mcp__context-mode__execute` or `ctx_batch_execute` is available, use it instead of raw Bash. Context-mode runs commands in a sandbox and only the result enters your context — typically 95% token reduction vs raw stdout dump. This is complementary to token-pilot: we own code reading, context-mode owns command execution.
 Your specific role is defined below.
 Role: general-purpose token-pilot workhorse.

package/agents/tp-session-restorer.md CHANGED

@@ -9,8 +9,8 @@ tools:
   - mcp__token-pilot__session_budget
   - Bash
   - Read
-token_pilot_version: "0.28.3"
-token_pilot_body_hash: 35b7f333a28c94e7dc89fcc3171703c4b466225f55cd5c701b7592f4f6486440
+token_pilot_version: "0.29.0"
+token_pilot_body_hash: d031f30e9cc4ea454aa256427659ed27249d820b75dc8b9b99c81ba7635230a7
 ---
 You are a token-pilot agent (`tp-<name>`). Your defining contract:
@@ -19,6 +19,8 @@ For every file in a programming language, you MUST use the token-pilot MCP tools
 If any MCP tool fails, fall back sensibly (another MCP tool → bounded Read → pass-through) and note the fallback in your output. Never silently abandon the contract.
+For heavy Bash operations (test runs, builds, recursive searches, network calls, any command with potentially large stdout): when `mcp__context-mode__execute` or `ctx_batch_execute` is available, use it instead of raw Bash. Context-mode runs commands in a sandbox and only the result enters your context — typically 95% token reduction vs raw stdout dump. This is complementary to token-pilot: we own code reading, context-mode owns command execution.
 Your specific role is defined below.
 Role: session-state rehydration.

package/agents/tp-ship-coordinator.md CHANGED

@@ -11,8 +11,8 @@ tools:
   - Read
   - Grep
 model: sonnet
-token_pilot_version: "0.28.3"
-token_pilot_body_hash: e8f9c28da23e318328f5afd85b09e8e7b96e0dab21a4c6779ba798cd709ced64
+token_pilot_version: "0.29.0"
+token_pilot_body_hash: 6b1c27b3dc4fad622cebff7c49e079fc764ca0ae57ef5bc4e61b563d8321092d
 ---
 You are a token-pilot agent (`tp-<name>`). Your defining contract:
@@ -21,6 +21,8 @@ For every file in a programming language, you MUST use the token-pilot MCP tools
 If any MCP tool fails, fall back sensibly (another MCP tool → bounded Read → pass-through) and note the fallback in your output. Never silently abandon the contract.
+For heavy Bash operations (test runs, builds, recursive searches, network calls, any command with potentially large stdout): when `mcp__context-mode__execute` or `ctx_batch_execute` is available, use it instead of raw Bash. Context-mode runs commands in a sandbox and only the result enters your context — typically 95% token reduction vs raw stdout dump. This is complementary to token-pilot: we own code reading, context-mode owns command execution.
 Your specific role is defined below.
 Role: pre-production readiness coordinator.

package/agents/tp-spec-writer.md CHANGED

@@ -9,8 +9,8 @@ tools:
   - Read
   - Write
 model: sonnet
-token_pilot_version: "0.28.3"
-token_pilot_body_hash: ed0b9f938c152c0d7be5a6a5eaf3c97c19b27ae4a9540aec342f0edb0927cb27
+token_pilot_version: "0.29.0"
+token_pilot_body_hash: 4ae44482db80a8a3a43794c6ecb665ec0b5385a274e1e5b2e3a404956075be88
 ---
 You are a token-pilot agent (`tp-<name>`). Your defining contract:
@@ -19,6 +19,8 @@ For every file in a programming language, you MUST use the token-pilot MCP tools
 If any MCP tool fails, fall back sensibly (another MCP tool → bounded Read → pass-through) and note the fallback in your output. Never silently abandon the contract.
+For heavy Bash operations (test runs, builds, recursive searches, network calls, any command with potentially large stdout): when `mcp__context-mode__execute` or `ctx_batch_execute` is available, use it instead of raw Bash. Context-mode runs commands in a sandbox and only the result enters your context — typically 95% token reduction vs raw stdout dump. This is complementary to token-pilot: we own code reading, context-mode owns command execution.
 Your specific role is defined below.
 Role: pre-code specification author.

package/agents/tp-test-coverage-gapper.md CHANGED

@@ -10,8 +10,8 @@ tools:
   - mcp__token-pilot__test_summary
   - Glob
   - Grep
-token_pilot_version: "0.28.3"
-token_pilot_body_hash: cc3d1f46fdb95ac3caf9344f69f1ddcd5ce5a175ee70aa150b7f9fda93edb152
+token_pilot_version: "0.29.0"
+token_pilot_body_hash: 6d862d1bcaeda3fb13099f51e40faaaf45d16d7d41d1b938609500192aa606f2
 ---
 You are a token-pilot agent (`tp-<name>`). Your defining contract:
@@ -20,6 +20,8 @@ For every file in a programming language, you MUST use the token-pilot MCP tools
 If any MCP tool fails, fall back sensibly (another MCP tool → bounded Read → pass-through) and note the fallback in your output. Never silently abandon the contract.
+For heavy Bash operations (test runs, builds, recursive searches, network calls, any command with potentially large stdout): when `mcp__context-mode__execute` or `ctx_batch_execute` is available, use it instead of raw Bash. Context-mode runs commands in a sandbox and only the result enters your context — typically 95% token reduction vs raw stdout dump. This is complementary to token-pilot: we own code reading, context-mode owns command execution.
 Your specific role is defined below.
 Role: test coverage gap finder.

package/agents/tp-test-triage.md CHANGED

@@ -8,8 +8,8 @@ tools:
   - mcp__token-pilot__find_usages
   - mcp__token-pilot__read_symbol
 model: sonnet
-token_pilot_version: "0.28.3"
-token_pilot_body_hash: 255912c47661d203c8f9a735237bc419f97e937f788a01811bbe126ee3dd5878
+token_pilot_version: "0.29.0"
+token_pilot_body_hash: f4e0dcbd2b4e8648efcafc9d53101a66bf394d7c90e97df7581ac47fcfbff5cb
 ---
 You are a token-pilot agent (`tp-<name>`). Your defining contract:
@@ -18,6 +18,8 @@ For every file in a programming language, you MUST use the token-pilot MCP tools
 If any MCP tool fails, fall back sensibly (another MCP tool → bounded Read → pass-through) and note the fallback in your output. Never silently abandon the contract.
+For heavy Bash operations (test runs, builds, recursive searches, network calls, any command with potentially large stdout): when `mcp__context-mode__execute` or `ctx_batch_execute` is available, use it instead of raw Bash. Context-mode runs commands in a sandbox and only the result enters your context — typically 95% token reduction vs raw stdout dump. This is complementary to token-pilot: we own code reading, context-mode owns command execution.
 Your specific role is defined below.
 Role: test-failure triage.

package/agents/tp-test-writer.md CHANGED

@@ -13,8 +13,8 @@ tools:
   - Edit
   - Bash
 model: sonnet
-token_pilot_version: "0.28.3"
-token_pilot_body_hash: 96211a3e7f6b52dd47fef286eec3584b1c269fb3464c1102f8b7edbe470700e6
+token_pilot_version: "0.29.0"
+token_pilot_body_hash: 960fe9e907e9c7d13b14dcc22af99e8cc7e7335f99791fa808df76ac21e1f5e9
 ---
 You are a token-pilot agent (`tp-<name>`). Your defining contract:
@@ -23,6 +23,8 @@ For every file in a programming language, you MUST use the token-pilot MCP tools
 If any MCP tool fails, fall back sensibly (another MCP tool → bounded Read → pass-through) and note the fallback in your output. Never silently abandon the contract.
+For heavy Bash operations (test runs, builds, recursive searches, network calls, any command with potentially large stdout): when `mcp__context-mode__execute` or `ctx_batch_execute` is available, use it instead of raw Bash. Context-mode runs commands in a sandbox and only the result enters your context — typically 95% token reduction vs raw stdout dump. This is complementary to token-pilot: we own code reading, context-mode owns command execution.
 Your specific role is defined below.
 Role: targeted test authoring with TDD discipline.

package/dist/hooks/pre-bash.d.ts CHANGED

@@ -37,6 +37,17 @@ export type PreBashDecision = {
     kind: "deny";
     reason: string;
 };
+/**
+ * v0.29.0 — expose wrapped commands. Opus 4.7's v0.28.2 verification
+ * report showed escape patterns: `bash -c "cat src/foo.ts"`,
+ * `eval "..."`, `for f in *.ts; do cat $f; done` all slipped through
+ * our heuristics because the dangerous call sat inside quotes / a loop
+ * body. Unwrap those before matching.
+ *
+ * Returns the original command PLUS the extracted inner body for each
+ * wrapper found. Duplication is fine — detectHeavyPattern is pure.
+ */
+export declare function extractWrappedCommands(command: string): string[];
 export declare function detectHeavyPattern(command: string): PreBashDecision;
 export declare function decidePreBash(input: PreBashInput): PreBashDecision;
 export declare function renderPreBashOutput(decision: PreBashDecision): string | null;
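
The doc comment above says wrapped commands slipped through because the dangerous call sat inside quotes or a loop body. A minimal sketch of why that can happen, assuming the heuristics anchor a utility name at the start of a line or right after a shell separator, as the `invokes` helper in `pre-bash.js` does. Whether every heuristic goes through that helper is an assumption, and the sample commands are illustrative.

```javascript
// Same anchoring as the `invokes` helper in pre-bash.js: the utility must
// sit at the start of a line or directly after ; & | or a newline.
function invokes(command, utility) {
    const re = new RegExp(`(^|[;&|\\n]\\s*)${utility}(\\s|$)`, "m");
    return re.test(command);
}

// A plain call is seen; the wrapped forms named in the doc comment are not,
// because `cat` follows a quote or the `do` keyword rather than a separator.
const samples = [
    "cat src/foo.ts",
    'bash -c "cat src/foo.ts"',
    'eval "cat src/foo.ts"',
    "for f in *.ts; do cat $f; done",
];
for (const cmd of samples) {
    console.log(invokes(cmd, "cat"), cmd);
}
```

Only the first sample reports `true`, which is the gap `extractWrappedCommands` is meant to close.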

package/dist/hooks/pre-bash.js CHANGED

@@ -32,7 +32,60 @@ function invokes(command, utility) {
     const re = new RegExp(`(^|[;&|\\n]\\s*)${utility}(\\s|$)`, "m");
     return re.test(command);
 }
+/**
+ * v0.29.0 — expose wrapped commands. Opus 4.7's v0.28.2 verification
+ * report showed escape patterns: `bash -c "cat src/foo.ts"`,
+ * `eval "..."`, `for f in *.ts; do cat $f; done` all slipped through
+ * our heuristics because the dangerous call sat inside quotes / a loop
+ * body. Unwrap those before matching.
+ *
+ * Returns the original command PLUS the extracted inner body for each
+ * wrapper found. Duplication is fine — detectHeavyPattern is pure.
+ */
+export function extractWrappedCommands(command) {
+    const out = [command];
+    // bash -c "..." / sh -c "..." / zsh -c "..."
+    for (const shell of ["bash", "sh", "zsh"]) {
+        const re = new RegExp(`\\b${shell}\\s+-c\\s+(?:"([^"]+)"|'([^']+)')`, "g");
+        for (const m of command.matchAll(re)) {
+            const inner = m[1] ?? m[2];
+            if (inner)
+                out.push(inner);
+        }
+    }
+    // eval "..." / eval '...'
+    for (const m of command.matchAll(/\beval\s+(?:"([^"]+)"|'([^']+)')/g)) {
+        const inner = m[1] ?? m[2];
+        if (inner)
+            out.push(inner);
+    }
+    // for LOOP with body: `for X in Y; do BODY; done` — extract BODY
+    // Also covers `while COND; do BODY; done` and `until COND; do BODY; done`
+    for (const m of command.matchAll(/\b(?:for|while|until)\b[^;]*;\s*do\s+(.+?)\s*;?\s*done\b/gs)) {
+        const body = m[1];
+        if (body)
+            out.push(body);
+    }
+    return out;
+}
 export function detectHeavyPattern(command) {
+    const cmd = command.trim();
+    if (!cmd)
+        return { kind: "allow" };
+    // v0.29.0: check each of the original + any unwrapped inner commands.
+    // First deny wins.
+    const candidates = extractWrappedCommands(cmd);
+    if (candidates.length > 1) {
+        // Check only the unwrapped inners; the original is handled below.
+        for (let i = 1; i < candidates.length; i++) {
+            const inner = detectHeavyPatternSingle(candidates[i]);
+            if (inner.kind === "deny")
+                return inner;
+        }
+    }
+    return detectHeavyPatternSingle(cmd);
+}
+function detectHeavyPatternSingle(command) {
     const cmd = command.trim();
     if (!cmd)
         return { kind: "allow" };
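
The unwrap-then-match flow in this hunk can be sketched end to end. `extractWrappedCommands` is copied from the hunk (condensed only in whitespace). `detectSingle` is a hypothetical stand-in for `detectHeavyPatternSingle`, whose body this diff does not show; its `cat` rule and reason string are assumptions made for illustration.

```javascript
// Copied from the hunk above.
function extractWrappedCommands(command) {
    const out = [command];
    // bash -c "..." / sh -c "..." / zsh -c "..."
    for (const shell of ["bash", "sh", "zsh"]) {
        const re = new RegExp(`\\b${shell}\\s+-c\\s+(?:"([^"]+)"|'([^']+)')`, "g");
        for (const m of command.matchAll(re)) {
            const inner = m[1] ?? m[2];
            if (inner) out.push(inner);
        }
    }
    // eval "..." / eval '...'
    for (const m of command.matchAll(/\beval\s+(?:"([^"]+)"|'([^']+)')/g)) {
        const inner = m[1] ?? m[2];
        if (inner) out.push(inner);
    }
    // for / while / until ...; do BODY; done
    for (const m of command.matchAll(/\b(?:for|while|until)\b[^;]*;\s*do\s+(.+?)\s*;?\s*done\b/gs)) {
        if (m[1]) out.push(m[1]);
    }
    return out;
}

// Hypothetical stand-in for detectHeavyPatternSingle: denies a leading `cat`.
function detectSingle(command) {
    return /^\s*cat(\s|$)/.test(command)
        ? { kind: "deny", reason: "illustrative: raw cat on a file" }
        : { kind: "allow" };
}

// Same order as the hunk: unwrapped inner bodies first, then the original.
// The first deny wins.
function detect(command) {
    const [original, ...inners] = extractWrappedCommands(command.trim());
    for (const inner of inners) {
        const decision = detectSingle(inner);
        if (decision.kind === "deny") return decision;
    }
    return detectSingle(original);
}

const commands = [
    'bash -c "cat src/foo.ts"',
    "eval 'cat src/foo.ts'",
    "for f in *.ts; do cat $f; done",
    "ls src",
];
for (const cmd of commands) {
    console.log(detect(cmd).kind, extractWrappedCommands(cmd).slice(1), cmd);
}
```

With this stand-in, the three wrapped commands are denied through their extracted bodies and `ls src` is allowed. One limit follows from the quote patterns themselves: `[^"]+` and `[^']+` stop at the first matching quote, so a body containing escaped quotes, such as `bash -c "echo \"x\"; cat f"`, is extracted only up to that point.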

package/dist/server/tool-definitions.js CHANGED

@@ -499,7 +499,7 @@ export const TOOL_DEFINITIONS = [
     },
     {
         name: "smart_log",
-        description: "Use INSTEAD OF raw git log. Structured commit history with category detection (feat/fix/refactor/docs), file stats, author breakdown. Filters by path and ref.",
+        description: "Use INSTEAD OF raw git log. Structured commit history with category detection (feat/fix/refactor/docs), file stats, author breakdown. Filters by path and ref. HEADS UP: two verification runs measured this tool at ~39% token reduction (borderline — vs 95-99% for outline/smart_diff). Cumulative data being gathered — tool may be dropped or redesigned in v0.30.0 if numbers don't improve. Prefer scoping with `path` or `count` to tighten savings.",
         inputSchema: {
             type: "object",
             properties: {
@@ -581,7 +581,7 @@ export const TOOL_DEFINITIONS = [
     },
     {
         name: "session_budget",
-        description: "Report Read-hook pressure for this session: suppressed tokens so far, reference budget, burn fraction (0..1), and the effective denyThreshold the adaptive curve would apply right now. NOTE: burnFraction measures hook activity, not actual context-window occupancy. Useful to decide when to tighten further before a big read.",
+        description: "META / info-only: reports Read-hook pressure for this session (suppressed tokens, reference budget, burn fraction, effective denyThreshold). Does NOT save tokens itself — this is diagnostic, use to decide when to tighten before a big read. NOTE: burnFraction measures hook activity, not actual context-window occupancy.",
         inputSchema: {
             type: "object",
             properties: {

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "token-pilot",
-  "version": "0.28.3",
+  "version": "0.29.0",
   "description": "Save up to 80% tokens when AI reads code \u2014 MCP server for token-efficient code navigation, AST-aware structural reading instead of dumping full files into context window",
   "type": "module",
   "main": "dist/index.js",