pan-wizard 2.9.1 → 3.4.1
- package/README.md +8 -8
- package/agents/pan-conductor.md +189 -0
- package/agents/pan-counterfactual.md +112 -0
- package/agents/pan-debugger.md +15 -1
- package/agents/pan-document_code.md +21 -0
- package/agents/pan-executor.md +16 -0
- package/agents/pan-hardener.md +113 -0
- package/agents/pan-integration-checker.md +2 -0
- package/agents/pan-knowledge.md +81 -0
- package/agents/pan-meta-reviewer.md +91 -0
- package/agents/pan-plan-checker.md +2 -0
- package/agents/pan-previewer.md +98 -0
- package/agents/pan-project-researcher.md +4 -4
- package/agents/pan-reviewer.md +2 -0
- package/agents/pan-verifier.md +2 -0
- package/bin/install-lib.cjs +197 -0
- package/bin/install.js +1999 -1959
- package/commands/pan/cost.md +132 -0
- package/commands/pan/exec-phase.md +15 -0
- package/commands/pan/focus-auto.md +18 -0
- package/commands/pan/focus-exec.md +10 -1
- package/commands/pan/knowledge.md +129 -0
- package/commands/pan/map-codebase.md +15 -0
- package/commands/pan/mcp-bridge.md +145 -0
- package/commands/pan/plan-phase.md +11 -0
- package/commands/pan/preview.md +114 -0
- package/commands/pan/profile.md +37 -0
- package/commands/pan/review-deep.md +128 -0
- package/commands/pan/verify-phase.md +11 -0
- package/commands/pan/what-if.md +146 -0
- package/hooks/dist/pan-cost-logger.js +102 -0
- package/hooks/dist/pan-statusline.js +154 -108
- package/package.json +1 -1
- package/pan-wizard-core/bin/lib/bridge.cjs +269 -0
- package/pan-wizard-core/bin/lib/bus.cjs +251 -0
- package/pan-wizard-core/bin/lib/codebase.cjs +118 -0
- package/pan-wizard-core/bin/lib/constants.cjs +39 -0
- package/pan-wizard-core/bin/lib/context-budget.cjs +27 -0
- package/pan-wizard-core/bin/lib/core.cjs +91 -6
- package/pan-wizard-core/bin/lib/cost.cjs +359 -0
- package/pan-wizard-core/bin/lib/focus.cjs +100 -2
- package/pan-wizard-core/bin/lib/init.cjs +5 -5
- package/pan-wizard-core/bin/lib/knowledge.cjs +331 -0
- package/pan-wizard-core/bin/lib/memory.cjs +252 -0
- package/pan-wizard-core/bin/lib/phase.cjs +40 -13
- package/pan-wizard-core/bin/lib/preview.cjs +480 -0
- package/pan-wizard-core/bin/lib/review-deep.cjs +280 -0
- package/pan-wizard-core/bin/lib/roadmap.cjs +4 -4
- package/pan-wizard-core/bin/lib/state.cjs +2 -2
- package/pan-wizard-core/bin/lib/verify.cjs +34 -1
- package/pan-wizard-core/bin/lib/whatif.cjs +289 -0
- package/pan-wizard-core/bin/pan-tools.cjs +239 -4
- package/pan-wizard-core/templates/playbook.md +53 -0
- package/pan-wizard-core/templates/preview-report.md +93 -0
- package/pan-wizard-core/templates/roadmap.md +24 -24
- package/pan-wizard-core/templates/state.md +12 -9
- package/pan-wizard-core/workflows/plan-phase.md +1 -1
- package/scripts/build-hooks.js +2 -1
|
@@ -0,0 +1,132 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: pan:cost
|
|
3
|
+
group: Observability
|
|
4
|
+
description: Show token usage and estimated cost across PAN commands and agents
|
|
5
|
+
argument-hint: "[report|append|clear] [--format json|table|chart] [--since YYYY-MM-DD] [--until YYYY-MM-DD]"
|
|
6
|
+
allowed-tools:
|
|
7
|
+
- Read
|
|
8
|
+
- Bash
|
|
9
|
+
---
|
|
10
|
+
|
|
11
|
+
<objective>
|
|
12
|
+
Report token usage and estimated cost across all PAN invocations in this project.
|
|
13
|
+
|
|
14
|
+
Reads `.planning/metrics/tokens.jsonl` — an append-only log where each line is one call (agent or command) with token counts and model. Cost is computed from a built-in rate table (overridable via `.planning/config.json` → `cost.rates`).
|
|
15
|
+
|
|
16
|
+
Default output is JSON for piping. Use `--format table` for human-readable tables or `--format chart` for an ASCII bar chart of daily spend.
|
|
17
|
+
</objective>
|
|
18
|
+
|
|
19
|
+
<execution_context>
|
|
20
|
+
@~/.claude/pan-wizard-core/bin/lib/cost.cjs
|
|
21
|
+
</execution_context>
|
|
22
|
+
|
|
23
|
+
<subcommands>
|
|
24
|
+
|
|
25
|
+
### `report` (default)
|
|
26
|
+
|
|
27
|
+
Aggregate all records into totals + breakdowns by agent, command, tier, and day.
|
|
28
|
+
|
|
29
|
+
```
|
|
30
|
+
pan-tools cost report [--format json|table|chart] [--since YYYY-MM-DD] [--until YYYY-MM-DD]
|
|
31
|
+
```
|
|
32
|
+
|
|
33
|
+
**Flags:**
|
|
34
|
+
- `--format` — `json` (default, for tools) | `table` (aligned text columns) | `chart` (per-day ASCII bars).
|
|
35
|
+
- `--since` — ISO date lower bound (inclusive). Records without `ts` always pass.
|
|
36
|
+
- `--until` — ISO date upper bound (inclusive).
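The window rule in the flags above can be sketched as a small predicate. This is an illustration only, not the code in `cost.cjs`: the helper name `inWindow` is hypothetical, and it assumes each record's `ts` is an ISO-8601 string whose first ten characters are the calendar date used for comparison.

```javascript
// Hypothetical helper; not taken from cost.cjs.
// Assumes ts looks like "2026-04-18T12:34:56Z" and that the date prefix
// is directly comparable with the --since / --until values.
function inWindow(record, since = null, until = null) {
  if (!record.ts) return true; // records without ts always pass
  const day = String(record.ts).slice(0, 10); // "YYYY-MM-DD"
  if (since && day < since) return false; // lower bound is inclusive
  if (until && day > until) return false; // upper bound is inclusive
  return true;
}

const sampleRecords = [
  { agent: 'pan-planner', ts: '2026-03-31T23:59:59Z' },
  { agent: 'pan-executor', ts: '2026-04-18T12:34:56Z' },
  { agent: 'pan-reviewer' }, // no ts, so it is never filtered out
];
const kept = sampleRecords.filter((r) => inWindow(r, '2026-04-01', '2026-04-18'));
console.log(kept.map((r) => r.agent));
```

Comparing date prefixes as strings works because ISO dates sort lexicographically; a real implementation would also need to decide how to treat time zones.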
|
|
37
|
+
|
|
38
|
+
**JSON output shape:**
|
|
39
|
+
```json
|
|
40
|
+
{
|
|
41
|
+
"totals": {
|
|
42
|
+
"calls": 42,
|
|
43
|
+
"input_tokens": 123456,
|
|
44
|
+
"output_tokens": 4567,
|
|
45
|
+
"cache_read_tokens": 50000,
|
|
46
|
+
"cache_write_tokens": 5000,
|
|
47
|
+
"cost_usd": 2.1234,
|
|
48
|
+
"cost_unknown": 0
|
|
49
|
+
},
|
|
50
|
+
"cache_hit_rate_pct": 40.5,
|
|
51
|
+
"by_agent": { "pan-planner": { "calls": 8, "input": 50000, ... } },
|
|
52
|
+
"by_command": { ... },
|
|
53
|
+
"by_tier": { ... },
|
|
54
|
+
"by_day": { "2026-04-18": { ... } },
|
|
55
|
+
"window": { "since": null, "until": null }
|
|
56
|
+
}
|
|
57
|
+
```
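A minimal sketch of how records might roll up into the shape above. The output keys follow the documented JSON; the record field names, the `output` key under `by_agent`, and the `aggregate` function itself are assumptions. The cache-hit formula (cache reads as a share of input tokens) is inferred from the sample values only, since 50000 / 123456 rounds to 40.5.

```javascript
// Sketch only; the real aggregator in cost.cjs may differ.
function aggregate(records) {
  const totals = {
    calls: 0,
    input_tokens: 0,
    output_tokens: 0,
    cache_read_tokens: 0,
    cache_write_tokens: 0,
    cost_usd: 0,
    cost_unknown: 0,
  };
  const by_agent = {};
  for (const r of records) {
    totals.calls += 1;
    totals.input_tokens += r.input_tokens || 0;
    totals.output_tokens += r.output_tokens || 0;
    totals.cache_read_tokens += r.cache_read_tokens || 0;
    totals.cache_write_tokens += r.cache_write_tokens || 0;
    if (r.cost_usd == null) totals.cost_unknown += 1; // no known rate
    else totals.cost_usd += r.cost_usd;

    const key = r.agent || 'unknown';
    if (!by_agent[key]) by_agent[key] = { calls: 0, input: 0, output: 0 };
    by_agent[key].calls += 1;
    by_agent[key].input += r.input_tokens || 0;
    by_agent[key].output += r.output_tokens || 0;
  }
  // Assumed formula, rounded to one decimal place.
  const cache_hit_rate_pct = totals.input_tokens
    ? Math.round((totals.cache_read_tokens / totals.input_tokens) * 1000) / 10
    : 0;
  return { totals, cache_hit_rate_pct, by_agent };
}

const report = aggregate([
  { agent: 'pan-planner', input_tokens: 100000, output_tokens: 4000, cache_read_tokens: 50000, cache_write_tokens: 5000, cost_usd: 2 },
  { agent: 'pan-planner', input_tokens: 23456, output_tokens: 567, cost_usd: null },
]);
console.log(JSON.stringify(report, null, 2));
```

The `by_command`, `by_tier`, and `by_day` breakdowns would follow the same pattern with a different grouping key.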
|
|
58
|
+
|
|
59
|
+
### `append`
|
|
60
|
+
|
|
61
|
+
Append a single cost record. Normally called by instrumented agent spawns; users rarely invoke directly.
|
|
62
|
+
|
|
63
|
+
```
|
|
64
|
+
pan-tools cost append \
|
|
65
|
+
[--agent <name>] [--command <name>] [--model <id>] [--tier reasoning|mid|fast] \
|
|
66
|
+
[--input-tokens N] [--output-tokens N] \
|
|
67
|
+
[--cache-read-tokens N] [--cache-write-tokens N] \
|
|
68
|
+
[--phase <num>] [--session <id>]
|
|
69
|
+
```
|
|
70
|
+
|
|
71
|
+
Missing fields are stored as `null` / `0`. Cost is auto-computed when `model` or `tier` resolves to a known rate.
|
|
72
|
+
|
|
73
|
+
### `clear`
|
|
74
|
+
|
|
75
|
+
Delete the cost log. Useful at the start of a billing cycle.
|
|
76
|
+
|
|
77
|
+
```
|
|
78
|
+
pan-tools cost clear
|
|
79
|
+
```
|
|
80
|
+
|
|
81
|
+
</subcommands>
|
|
82
|
+
|
|
83
|
+
<rate_table>
|
|
84
|
+
Default rates (USD per million tokens) as of 2026-04. Override per-model in `.planning/config.json`:
|
|
85
|
+
|
|
86
|
+
```json
|
|
87
|
+
{
|
|
88
|
+
"cost": {
|
|
89
|
+
"rates": {
|
|
90
|
+
"claude-opus-4-7": { "input": 15.0, "output": 75.0, "cache_read": 1.5, "cache_write": 18.75 },
|
|
91
|
+
"my-custom-model": { "input": 1.0, "output": 2.0, "cache_read": 0.1, "cache_write": 1.25 }
|
|
92
|
+
}
|
|
93
|
+
}
|
|
94
|
+
}
|
|
95
|
+
```
|
|
96
|
+
|
|
97
|
+
When a record has neither a known model nor a known tier, its cost is `null` and it counts toward `totals.cost_unknown`.
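To make the per-million-token arithmetic concrete, here is a hedged sketch of a cost lookup. The rates are copied from the override example above; the `computeCost` name, the record field names, and the assumption that the four token counts are billed independently are all hypothetical. Tier-based fallback is omitted.

```javascript
// Rates are USD per million tokens, as in the config example above.
const rates = {
  'claude-opus-4-7': { input: 15.0, output: 75.0, cache_read: 1.5, cache_write: 18.75 },
};

// Hypothetical helper; cost.cjs may resolve rates differently (for example by tier).
function computeCost(record, rateTable) {
  const rate = rateTable[record.model];
  if (!rate) return null; // unknown model: cost stays null
  const microUsd =
    (record.input_tokens || 0) * rate.input +
    (record.output_tokens || 0) * rate.output +
    (record.cache_read_tokens || 0) * rate.cache_read +
    (record.cache_write_tokens || 0) * rate.cache_write;
  return microUsd / 1e6; // convert from per-million pricing to USD
}

const known = computeCost(
  { model: 'claude-opus-4-7', input_tokens: 2000000, output_tokens: 100000, cache_read_tokens: 1000000 },
  rates,
);
const unknown = computeCost({ model: 'not-in-table', input_tokens: 500 }, rates);
console.log(known, unknown);
```

Here the known record costs 30 + 7.5 + 1.5 = 39 USD, and the unknown one yields `null`, which is the case the paragraph above counts under `totals.cost_unknown`.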
|
|
98
|
+
</rate_table>
|
|
99
|
+
|
|
100
|
+
<workflow>
|
|
101
|
+
|
|
102
|
+
**Daily check:** run `/pan:cost --format chart` at the end of a working day to see the spend shape.
|
|
103
|
+
|
|
104
|
+
**Before shipping:** run `/pan:cost --since 2026-04-01 --format table` to get a total for the billing period.
|
|
105
|
+
|
|
106
|
+
**After an expensive run:** check `by_agent` and `by_command` to see which stage drove the spend.
|
|
107
|
+
|
|
108
|
+
**To reconcile with the provider bill:** providers report total tokens; PAN's log is append-only with ISO-8601 timestamps, so `--since` / `--until` should match the provider's billing window.
|
|
109
|
+
|
|
110
|
+
</workflow>
|
|
111
|
+
|
|
112
|
+
<instrumentation_note>
|
|
113
|
+
|
|
114
|
+
Token records are written by any caller that knows its usage — typically the host runtime or a wrapper. PAN ships the log format + aggregator (this command); the capture hook itself is opt-in (Wave 5 of Spec B v2). Until then, records can be appended manually via `pan-tools cost append` or by external scripts reading the provider API.
|
|
115
|
+
|
|
116
|
+
If `.planning/metrics/tokens.jsonl` is empty, `/pan:cost` returns zero totals — the feature is inert, not broken.
|
|
117
|
+
|
|
118
|
+
</instrumentation_note>
|
|
119
|
+
|
|
120
|
+
<runtime_compatibility>
|
|
121
|
+
|
|
122
|
+
| Runtime | Support |
|
|
123
|
+
|---------|---------|
|
|
124
|
+
| Claude Code | Full — data format + aggregation + all output formats |
|
|
125
|
+
| OpenCode | Full aggregator; token capture depends on OpenCode's own hooks |
|
|
126
|
+
| Gemini | Full aggregator; token capture depends on Gemini CLI instrumentation |
|
|
127
|
+
| Codex | Full aggregator; token capture via external script |
|
|
128
|
+
| Copilot CLI | Full aggregator; Copilot doesn't currently expose per-call usage |
|
|
129
|
+
|
|
130
|
+
The aggregator is runtime-agnostic. What varies across runtimes is how records *get into* `tokens.jsonl` in the first place.
|
|
131
|
+
|
|
132
|
+
</runtime_compatibility>
|
|
@@ -61,6 +61,8 @@ Phase: $ARGUMENTS
|
|
|
61
61
|
- `--skip-tests` — Skip automatic test generation after execution completes.
|
|
62
62
|
- `--skip-review` — Skip automatic code review after execution completes.
|
|
63
63
|
- `--fast` — Skip both test generation and code review (implies `--skip-tests --skip-review`).
|
|
64
|
+
- `--deep-review` (v3.4+) — After the normal reviewer step, also run `/pan:review-deep <phase>` (security audit via pan-hardener + cross-check via pan-meta-reviewer). Produces `.planning/reviews/<N>/deep-review.md`. Recommended for phases touching auth, payment, PII, migrations, or public APIs. Costs roughly 3× a normal review.
|
|
65
|
+
- `--hierarchical` (v3.4+, Claude + Opus 4.7 only) — Spawn `pan-conductor` as a top-level orchestrator that decomposes the phase and spawns executor/reviewer/verifier sub-agents in sequence. Bounded by safety harness: max 2 nesting levels, 12 spawns per phase, budget ceiling, `.planning/orchestration/abort` kill-switch. On non-Claude runtimes or older models, this flag is a no-op with a warning and falls back to flat exec. Use only for large phases (≥4 autonomous plans) where wall-clock reduction justifies the ~20-30% orchestration tax.
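The safety-harness limits named for `--hierarchical` can be pictured as a guard that runs before every spawn. This is a sketch under stated assumptions, not pan-conductor's actual logic: the state shape, the function name, and the reading of "nesting level" as the depth of the agent about to be spawned are all hypothetical, as is expressing the budget ceiling in USD.

```javascript
// Limits quoted from the flag description; everything else is illustrative.
const LIMITS = { maxDepth: 2, maxSpawns: 12 };

function canSpawn(state) {
  // abortRequested stands in for the presence of .planning/orchestration/abort
  if (state.abortRequested) return { ok: false, reason: 'abort kill-switch present' };
  if (state.childDepth > LIMITS.maxDepth) return { ok: false, reason: 'max nesting depth exceeded' };
  if (state.spawnCount >= LIMITS.maxSpawns) return { ok: false, reason: 'spawn limit reached' };
  if (state.spentUsd >= state.budgetUsd) return { ok: false, reason: 'budget ceiling reached' };
  return { ok: true, reason: null };
}

const base = { childDepth: 1, spawnCount: 0, spentUsd: 0, budgetUsd: 5, abortRequested: false };
console.log(canSpawn(base));
console.log(canSpawn({ ...base, spawnCount: 12 }));
```

Checking the kill-switch first means an operator can stop a run even when every other limit still has headroom.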
|
|
64
66
|
|
|
65
67
|
Context files are resolved inside the workflow via `pan-tools init execute-phase` and per-subagent `<files_to_read>` blocks.
|
|
66
68
|
</context>
|
|
@@ -85,6 +87,19 @@ Each execution stage has a restricted set of appropriate actions. Using the wron
|
|
|
85
87
|
- Wave commit: git operations only — all code changes must be done before committing
|
|
86
88
|
</action_gating>
|
|
87
89
|
|
|
90
|
+
<cache_priming>
|
|
91
|
+
**Before Discovery, prime the prompt cache once per invocation.** All subagents spawned within the next 5 minutes will hit the cache instead of re-sending the full context.
|
|
92
|
+
|
|
93
|
+
Run once:
|
|
94
|
+
```
|
|
95
|
+
pan-tools cache prime --summary
|
|
96
|
+
```
|
|
97
|
+
|
|
98
|
+
This returns `{blocks: [{path, bytes, cache}], total_bytes, sha}` for the cacheable set (project.md, requirements.md, roadmap.md, state.md, standards.md). The `sha` is stable across identical inputs, so repeated calls within the phase hit cached reads.
|
|
99
|
+
|
|
100
|
+
When spawning subagents for wave execution, include the cacheable block paths in each agent's system-context so the host runtime (Claude Code with Opus 4.7) can mark them `cache_control: ephemeral`. On non-Claude runtimes or older models, this step is a no-op — nothing breaks, just no savings.
|
|
101
|
+
</cache_priming>
|
|
102
|
+
|
|
88
103
|
<process>
|
|
89
104
|
Execute the execute-phase workflow from @~/.claude/pan-wizard-core/workflows/exec-phase.md end-to-end.
|
|
90
105
|
Preserve all workflow gates (wave execution, checkpoint handling, verification, state updates, routing).
|
|
@@ -293,6 +293,24 @@ Between cycles, manage context to prevent quality degradation over long campaign
|
|
|
293
293
|
|
|
294
294
|
Display one-line cycle summary: `Cycle N/M | X/Y pts | Z items done | Tests: A -> B`
|
|
295
295
|
|
|
296
|
+
#### Step 2.5a: Reflection Gate (Opus 4.7 thinking-capable models only)
|
|
297
|
+
|
|
298
|
+
Before committing to the next cycle, call the reflection helper:
|
|
299
|
+
|
|
300
|
+
```
|
|
301
|
+
echo '{"run": <run-state>, "cycle": <just-completed-cycle>, "batch": <proposed-next-batch>, "tier": "reasoning"}' \
|
|
302
|
+
| pan-tools focus reflection
|
|
303
|
+
```
|
|
304
|
+
|
|
305
|
+
The helper returns `{reflect: true, prompt: "..."}` when the current model tier supports extended thinking. If `reflect: true`, think through the prompt — which asks whether running another cycle is worthwhile given telemetry and remaining items — and respond with JSON: `{"continue": true|false, "rationale": "..."}`.
|
|
306
|
+
|
|
307
|
+
- If `continue: false`: stop the campaign and treat as a user-reason stop (preserve state, skip to Phase 3).
|
|
308
|
+
- If `continue: true`: proceed to the next cycle.
|
|
309
|
+
|
|
310
|
+
If the helper returns `reflect: false` (tier doesn't support thinking, or `reflection_enabled: false` in run state, or no next batch): skip this step silently and continue to the next cycle.
|
|
311
|
+
|
|
312
|
+
The reflection gate catches "zero progress" or "wrong category" drift earlier than the automatic stop rules.
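The branching described in this step can be sketched as a small decision function. Only the two JSON shapes (`{reflect, prompt}` from the helper and `{"continue", "rationale"}` from the model) come from the text; the function name and the returned labels are hypothetical.

```javascript
// Hypothetical glue code for the reflection gate.
function nextAction(helperResult, modelReply) {
  if (!helperResult.reflect) return 'next-cycle'; // gate skipped silently
  const decision = JSON.parse(modelReply); // {"continue": true|false, "rationale": "..."}
  return decision.continue ? 'next-cycle' : 'stop-user-reason';
}

console.log(nextAction({ reflect: false }, ''));
console.log(nextAction({ reflect: true, prompt: '...' }, '{"continue": false, "rationale": "zero progress"}'));
```

A production version would also need a policy for replies that are not valid JSON, which the step description does not specify.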
|
|
313
|
+
|
|
296
314
|
**Attention anchor — emit after every cycle summary:**
|
|
297
315
|
```
|
|
298
316
|
Remaining: {cycles_left} cycles | {budget_remaining}/{total_budget} pts | Safety: {active_harness_warnings}
|
|
@@ -116,6 +116,7 @@ HARD STOP conditions (do not proceed to next stage):
|
|
|
116
116
|
- `--dry-run` — Run Stages 1-2 only (show what WOULD be executed)
|
|
117
117
|
- `--no-commit` — Skip the commit step in Stage 6
|
|
118
118
|
- `--continue` — Resume a previously interrupted execution
|
|
119
|
+
- `--deep-review` (v3.4+) — After each high-stakes item's execution, run `/pan:review-deep` for that item (pan-hardener + pan-meta-reviewer security + cross-check). Slows the campaign by roughly 3× per item that triggers the deep pass; use for batches touching auth/payment/migrations.
|
|
119
120
|
|
|
120
121
|
---
|
|
121
122
|
|
|
@@ -209,7 +210,8 @@ This catches emergent interactions: 5 "add try-catch" fixes might reveal the mod
|
|
|
209
210
|
1. **Check Project Status** — git status, recent commits
|
|
210
211
|
2. **Test Baseline** — run test suite, record current counts
|
|
211
212
|
3. **Create rollback snapshot** — git tag for safety
|
|
212
|
-
4. **
|
|
213
|
+
4. **Prime prompt cache** — `pan-tools cache prime --summary` (once; all sub-agents in the next 5 min hit cached context)
|
|
214
|
+
5. **Report** — Output session start summary
|
|
213
215
|
|
|
214
216
|
**Record baseline:**
|
|
215
217
|
```
|
|
@@ -243,6 +245,13 @@ Display the execution batch to user, then continue automatically.
|
|
|
243
245
|
### 3.0 Pre-Execution Setup
|
|
244
246
|
1. Cache project facts — do NOT re-read later
|
|
245
247
|
2. Create/update progress tracker with the batch table
|
|
248
|
+
3. Classify stages for parallel tool use:
|
|
249
|
+
```
|
|
250
|
+
pan-tools focus classify-stages --raw
|
|
251
|
+
```
|
|
252
|
+
The CLI reads the latest batch and returns `{waves, parallelism_hint}`. When `parallelism_hint` is `emit-micro-in-parallel` or `emit-standard-in-parallel`, all reads and greps for items in the current wave SHOULD be emitted in a single assistant turn (parallel tool calls). Opus 4.7 is markedly better at emitting parallel tool calls than earlier models; use that to collapse Stage 3 latency on MICRO-heavy batches.
|
|
253
|
+
|
|
254
|
+
Serialize on `FULL` tier items — each is its own wave.
|
|
246
255
|
|
|
247
256
|
### 3.1 Process Items by Tier
|
|
248
257
|
|
|
@@ -0,0 +1,129 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: pan:knowledge
|
|
3
|
+
group: Knowledge
|
|
4
|
+
description: Grounded Q&A, multi-turn design discussion, and playbook generation. Three modes in one command.
|
|
5
|
+
argument-hint: "ask <question> | discuss <phase> <topic> | playbook"
|
|
6
|
+
allowed-tools:
|
|
7
|
+
- Read
|
|
8
|
+
- Write
|
|
9
|
+
- Bash
|
|
10
|
+
- Grep
|
|
11
|
+
- Glob
|
|
12
|
+
- Task
|
|
13
|
+
---
|
|
14
|
+
|
|
15
|
+
<objective>
|
|
16
|
+
Retrieve, refine, or consolidate project knowledge. Three modes:
|
|
17
|
+
|
|
18
|
+
- **ask** — answer a natural-language question with inline citations grounded in `.planning/` + `docs/`.
|
|
19
|
+
- **discuss** — multi-turn refinement of a phase's context. Session state persists across invocations; prompt caching keeps turn 3 cheap.
|
|
20
|
+
- **playbook** — aggregate all agents' memory (E-4 layer) into `.planning/playbook.md`, organized by category (Conventions / Gotchas / Decisions / Tool choices / Anti-patterns / Recurring gaps).
|
|
21
|
+
|
|
22
|
+
Consolidates Spec B v1's X-3 converse + X-6 teach + X-10 explain into one command.
|
|
23
|
+
</objective>
|
|
24
|
+
|
|
25
|
+
<execution_context>
|
|
26
|
+
@~/.claude/pan-wizard-core/bin/lib/knowledge.cjs
|
|
27
|
+
@~/.claude/agents/pan-knowledge.md
|
|
28
|
+
@~/.claude/pan-wizard-core/templates/playbook.md
|
|
29
|
+
</execution_context>
|
|
30
|
+
|
|
31
|
+
<modes>
|
|
32
|
+
|
|
33
|
+
### `ask <question>`
|
|
34
|
+
|
|
35
|
+
```
|
|
36
|
+
/pan:knowledge ask "why does phase 4 have a race condition fix?"
|
|
37
|
+
```
|
|
38
|
+
|
|
39
|
+
**Flow:**
|
|
40
|
+
1. `pan-tools knowledge ask "<question>"` returns a ranked list of candidate files.
|
|
41
|
+
2. Spawn `pan-knowledge` with `<mode>ask</mode>`, the question, and the top sources as `<files_to_read>`.
|
|
42
|
+
3. Agent reads sources, answers with citations, returns the answer to stdout. No file is written.
|
|
43
|
+
|
|
44
|
+
**Output:** inline markdown answer with `[file.md:LINE]` and `[ADR-NNNN]` citations.
|
|
45
|
+
|
|
46
|
+
### `discuss <phase> <topic-or-question>`
|
|
47
|
+
|
|
48
|
+
```
|
|
49
|
+
/pan:knowledge discuss 12 "should we use Redis or Memcached?"
|
|
50
|
+
```
|
|
51
|
+
|
|
52
|
+
**Flow:**
|
|
53
|
+
1. `pan-tools knowledge discuss <phase> --subcmd read` loads session state from `.planning/conversations/<phase>/session.json` (empty for new phase).
|
|
54
|
+
2. `pan-tools knowledge discuss <phase> --subcmd append --role user --content "<topic>"` persists the user turn.
|
|
55
|
+
3. Spawn `pan-knowledge` with `<mode>discuss</mode>`, session history, phase context, and the new turn.
|
|
56
|
+
4. Agent responds.
|
|
57
|
+
5. `pan-tools knowledge discuss <phase> --subcmd append --role agent --content "<response>" --cites "a.md,b.md"` persists the response.
|
|
58
|
+
6. If the agent has offered to emit `context.md` after ≥3 substantive turns, the user can follow up with another `/pan:knowledge discuss <phase>` invocation or run the commit subcommand the agent suggested.
|
|
59
|
+
|
|
60
|
+
**Session persistence:** `.planning/conversations/<phase>/session.json` — array of turns with ts/role/content/cites. Multi-turn cost is dominated by cache hits on stable `.planning/` files.
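As an illustration of the session format, the sketch below appends one turn to an in-memory session array. The `ts`/`role`/`content`/`cites` fields come from the description above; the function names are hypothetical, and reading or writing `session.json` on disk is deliberately left out.

```javascript
// Turn a --cites value such as "a.md,b.md" into an array.
function parseCites(value) {
  return value ? value.split(',').map((s) => s.trim()).filter(Boolean) : [];
}

// Hypothetical helper: returns a new array rather than mutating the session.
function appendTurn(session, { role, content, cites = [] }, now = new Date()) {
  const turn = { ts: now.toISOString(), role, content, cites };
  return [...session, turn];
}

let session = [];
session = appendTurn(session, { role: 'user', content: 'should we use Redis or Memcached?' });
session = appendTurn(
  session,
  { role: 'agent', content: 'Redis fits the plan better.', cites: parseCites('a.md,b.md') },
  new Date('2026-04-18T12:34:56Z'),
);
console.log(JSON.stringify(session, null, 2));
```

Keeping earlier turns unchanged and only adding to the end of the array is what lets a stable history prefix be reused between invocations.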
|
|
61
|
+
|
|
62
|
+
### `playbook`
|
|
63
|
+
|
|
64
|
+
```
|
|
65
|
+
/pan:knowledge playbook
|
|
66
|
+
```
|
|
67
|
+
|
|
68
|
+
**Flow:**
|
|
69
|
+
1. `pan-tools knowledge playbook` reads all agents' memory (`.planning/memory/*.md`), clusters entries by category, writes `.planning/playbook.md` directly.
|
|
70
|
+
2. Optionally spawn `pan-knowledge` with `<mode>playbook</mode>` to polish (dedupe contradictions, consolidate similar entries). Skip the polish step if the draft looks clean.
|
|
71
|
+
|
|
72
|
+
**Output:** `.planning/playbook.md` — team-readable summary of accumulated lessons.
|
|
73
|
+
|
|
74
|
+
**Auto-invocation:** `/pan:milestone-done` can optionally run this (flag-gated, not default). It can also be invoked manually at any time.
|
|
75
|
+
|
|
76
|
+
</modes>
|
|
77
|
+
|
|
78
|
+
<workflow>
|
|
79
|
+
|
|
80
|
+
**Onboarding a new team member:** have them run `/pan:knowledge playbook` then `/pan:knowledge ask "what conventions matter in this codebase?"`.
|
|
81
|
+
|
|
82
|
+
**Design debate:** run `/pan:knowledge discuss <phase> "<question>"` iteratively. The agent refines as the debate narrows. After convergence, accept the proposed `context.md` update.
|
|
83
|
+
|
|
84
|
+
**Bug investigation:** `/pan:knowledge ask "why did we add the retry in phase 4?"` — faster than grepping for historical context.
|
|
85
|
+
|
|
86
|
+
**Before milestone-done:** run `/pan:knowledge playbook` to capture what the team learned. Gives contributors something to reference when starting the next milestone.
|
|
87
|
+
|
|
88
|
+
</workflow>
|
|
89
|
+
|
|
90
|
+
<citation_format>
|
|
91
|
+
|
|
92
|
+
Agent output uses bracketed citations that link to files. Supported forms:
|
|
93
|
+
|
|
94
|
+
| Form | Example | Renders as |
|
|
95
|
+
|------|---------|-----------|
|
|
96
|
+
| Plain file | `[README.md]` | markdown link to the file |
|
|
97
|
+
| File + line | `[docs/ARCHITECTURE.md:200]` | link to line 200 |
|
|
98
|
+
| ADR | `[ADR-0015]` | link to ADR file |
|
|
99
|
+
| Phase artifact | `[phase-4/summary.md]` | link to phase summary |
|
|
100
|
+
|
|
101
|
+
The agent should NEVER fabricate citations. The retrieval layer's `sources` list is the allowlist.
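One way to enforce the allowlist rule is a post-check on the agent's answer. The sketch below is an assumption about how such a check could look, not part of `knowledge.cjs`: the regular expression, the function name, and the choice to treat ADR ids as plain allowlist entries are all illustrative.

```javascript
// Matches [path], [path:LINE], and [ADR-NNNN]; capture 1 is the target without :LINE.
const CITATION = /\[([^\[\]\s]+?)(?::(\d+))?\]/g;

// Hypothetical check: returns cited targets that are missing from the sources list.
function findUnlistedCitations(answer, sources) {
  const allowed = new Set(sources);
  const unlisted = [];
  for (const match of answer.matchAll(CITATION)) {
    const target = match[1];
    if (!allowed.has(target) && !unlisted.includes(target)) unlisted.push(target);
  }
  return unlisted;
}

const answer =
  'The retry was added in [phase-4/summary.md] per [ADR-0015]; ' +
  'see [docs/ARCHITECTURE.md:200] and [notes/made-up.md:3].';
const sources = ['phase-4/summary.md', 'ADR-0015', 'docs/ARCHITECTURE.md'];
console.log(findUnlistedCitations(answer, sources));
```

A pattern this loose will also match other bracketed text without spaces, so a real check would likely be stricter about what counts as a citation.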
|
|
102
|
+
|
|
103
|
+
</citation_format>
|
|
104
|
+
|
|
105
|
+
<runtime_compatibility>
|
|
106
|
+
|
|
107
|
+
| Runtime | ask | discuss | playbook |
|
|
108
|
+
|---------|-----|---------|----------|
|
|
109
|
+
| Claude Code | Full, thinking enabled | Full, prompt caching bonus | Full |
|
|
110
|
+
| OpenCode | Full | Full (no cache bonus) | Full |
|
|
111
|
+
| Gemini | Full | Full | Full |
|
|
112
|
+
| Codex | Full | Full | Full |
|
|
113
|
+
| Copilot | Full | Full | Full |
|
|
114
|
+
|
|
115
|
+
The data layer (retrieval, session state, playbook clustering) is pure Node.js and runtime-agnostic. Only answer synthesis quality varies with model capability.
|
|
116
|
+
|
|
117
|
+
</runtime_compatibility>
|
|
118
|
+
|
|
119
|
+
<privacy_note>
|
|
120
|
+
|
|
121
|
+
`session.json` is persisted to disk and committed unless `.planning/conversations/` is gitignored. For sensitive design discussions, consider:
|
|
122
|
+
|
|
123
|
+
```
|
|
124
|
+
echo '.planning/conversations/' >> .gitignore
|
|
125
|
+
```
|
|
126
|
+
|
|
127
|
+
before starting a `discuss` session. Session turns are not auto-encrypted.
|
|
128
|
+
|
|
129
|
+
</privacy_note>
|
|
@@ -49,6 +49,21 @@ Check for .planning/state.md - loads context if project already initialized
|
|
|
49
49
|
- Trivial codebases (<5 files)
|
|
50
50
|
</when_to_use>
|
|
51
51
|
|
|
52
|
+
<stage_0_ingest_mode>
|
|
53
|
+
**Before spawning mapper agents**, determine whether the repo fits in a single 1M-context window.
|
|
54
|
+
|
|
55
|
+
Run: `node ~/.claude/pan-wizard-core/bin/pan-tools.cjs codebase estimate-size --threshold 700000`
|
|
56
|
+
|
|
57
|
+
The CLI returns `{mode, total_tokens, file_count, languages}`:
|
|
58
|
+
|
|
59
|
+
- **`mode: "single-shot"`** — repo is small enough (≤700K tokens) for one Opus 4.7 agent to ingest the whole thing. Spawn a single `pan-document_code` agent with the full repo in context. This avoids the 6-way stitching artifacts of sharded mode (contradictory version claims, duplicated mentions, missed cross-file references).
|
|
60
|
+
- **`mode: "sharded"`** — repo exceeds 700K tokens. Fall back to the default 6-way parallel sharding (tech, arch, quality, concerns, relationships, practices). Each shard gets a 200K budget.
|
|
61
|
+
|
|
62
|
+
Record the chosen mode + telemetry in the final `.planning/codebase/overview.md` so future runs can reason about drift.
|
|
63
|
+
|
|
64
|
+
Opus 4.7 is required for single-shot mode, which relies on its 1M context window. Other models always take the sharded path regardless of size.
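The two conditions in this stage (repo size and model) can be combined in one small decision function. This is a sketch, not the `estimate-size` implementation: the function name and input shape are hypothetical, and the model id `claude-opus-4-7` is borrowed from the rate-table example in `cost.md` as an assumed identifier.

```javascript
// Assumed model id; the real check may use a different identifier or a tier.
const SINGLE_SHOT_MODELS = new Set(['claude-opus-4-7']);

function chooseIngestMode({ totalTokens, model, threshold = 700000 }) {
  const fits = totalTokens <= threshold; // at or below the threshold counts as fitting
  return fits && SINGLE_SHOT_MODELS.has(model) ? 'single-shot' : 'sharded';
}

console.log(chooseIngestMode({ totalTokens: 450000, model: 'claude-opus-4-7' }));
console.log(chooseIngestMode({ totalTokens: 900000, model: 'claude-opus-4-7' }));
console.log(chooseIngestMode({ totalTokens: 450000, model: 'some-other-model' }));
```

Defaulting to `sharded` whenever either condition fails keeps the fallback path identical to the pre-existing behavior.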
|
|
65
|
+
</stage_0_ingest_mode>
|
|
66
|
+
|
|
52
67
|
<tool_priority>
|
|
53
68
|
Each mapper agent should use the simplest sufficient tool:
|
|
54
69
|
1. Glob — discover files by pattern (find all .ts files, config files, test files)
|
|
@@ -0,0 +1,145 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: pan:mcp-bridge
|
|
3
|
+
group: External tools
|
|
4
|
+
description: Discover available MCP tools and recommend which ones apply to a phase. Discovery-only; auto-invocation deferred.
|
|
5
|
+
argument-hint: "list | recommend <phase> | cache [--servers <json>] [--runtime <name>]"
|
|
6
|
+
allowed-tools:
|
|
7
|
+
- Read
|
|
8
|
+
- Bash
|
|
9
|
+
- Write
|
|
10
|
+
---
|
|
11
|
+
|
|
12
|
+
<objective>
|
|
13
|
+
Surface Model Context Protocol (MCP) tools visible to the host runtime and recommend which ones might apply to a specific phase plan.
|
|
14
|
+
|
|
15
|
+
Reduced scope from Spec B v1's X-7: **discovery and recommendation only**. Auto-injection of MCP tools into planner context and auto-invocation from executor agents are deliberately deferred (likely Wave 5+ or v3.5). This keeps v3.3 narrow and avoids coupling PAN to Claude Code's MCP schema stability.
|
|
16
|
+
</objective>
|
|
17
|
+
|
|
18
|
+
<execution_context>
|
|
19
|
+
@~/.claude/pan-wizard-core/bin/lib/bridge.cjs
|
|
20
|
+
</execution_context>
|
|
21
|
+
|
|
22
|
+
<subcommands>
|
|
23
|
+
|
|
24
|
+
### `list`
|
|
25
|
+
|
|
26
|
+
Show cached MCP tools with server grouping and schemas.
|
|
27
|
+
|
|
28
|
+
```
|
|
29
|
+
/pan:mcp-bridge list
|
|
30
|
+
```
|
|
31
|
+
|
|
32
|
+
**Output (JSON):**
|
|
33
|
+
```json
|
|
34
|
+
{
|
|
35
|
+
"cached_at": "2026-04-18T12:34:56Z",
|
|
36
|
+
"runtime": "claude",
|
|
37
|
+
"server_count": 3,
|
|
38
|
+
"tool_count": 12,
|
|
39
|
+
"tools": [
|
|
40
|
+
{ "server": "linear", "name": "linear.updateTicket", "description": "...", "schema": {...} },
|
|
41
|
+
...
|
|
42
|
+
],
|
|
43
|
+
"source": "cache" | "empty"
|
|
44
|
+
}
|
|
45
|
+
```
|
|
46
|
+
|
|
47
|
+
When `source: "empty"`, either no MCP servers are configured or the host runtime hasn't populated the cache yet. See the `cache` subcommand for manual seeding.
|
|
48
|
+
|
|
49
|
+
### `recommend <phase>`
|
|
50
|
+
|
|
51
|
+
Given a phase number, match cached MCP tools against the phase's plan text and return tools ranked by keyword relevance.
|
|
52
|
+
|
|
53
|
+
```
|
|
54
|
+
/pan:mcp-bridge recommend 7
|
|
55
|
+
/pan:mcp-bridge recommend 12 --max 5 --min-score 2
|
|
56
|
+
```
|
|
57
|
+
|
|
58
|
+
**Flags:**
|
|
59
|
+
- `--max N` — cap recommendations (default 10)
|
|
60
|
+
- `--min-score N` — minimum keyword hit count (default 1)
|
|
61
|
+
|
|
62
|
+
**Output (JSON):**
|
|
63
|
+
```json
|
|
64
|
+
{
|
|
65
|
+
"phase": "7",
|
|
66
|
+
"phase_name": "API refactor",
|
|
67
|
+
"runtime": "claude",
|
|
68
|
+
"total_candidates": 12,
|
|
69
|
+
"recommendations": [
|
|
70
|
+
{
|
|
71
|
+
"server": "linear",
|
|
72
|
+
"name": "linear.updateTicket",
|
|
73
|
+
"description": "Update a Linear issue",
|
|
74
|
+
"score": 3,
|
|
75
|
+
"hits": ["linear", "ticket", "update"]
|
|
76
|
+
}
|
|
77
|
+
]
|
|
78
|
+
}
|
|
79
|
+
```
|
|
80
|
+
|
|
81
|
+
Scoring is naive keyword frequency with word boundaries — not semantic embeddings. A tool's name and description are tokenized into keywords (≥3 chars); each match in the phase plan text scores 1 point.
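The scoring rule can be sketched as follows. This is an interpretation, not the code in `bridge.cjs`: it assumes camelCase names are split into separate keywords (suggested by the `update` and `ticket` hits for `linear.updateTicket` in the sample output) and that each distinct keyword found in the plan counts once. All function names are hypothetical.

```javascript
// Tokenize a tool's name and description into distinct lowercase keywords (3+ chars).
function keywordsFor(tool) {
  const text = `${tool.name} ${tool.description || ''}`;
  const words = text
    .replace(/([a-z0-9])([A-Z])/g, '$1 $2') // split camelCase: updateTicket -> update Ticket
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((w) => w.length >= 3);
  return [...new Set(words)];
}

// One point per keyword that appears in the plan text as a whole word.
function scoreTool(tool, planText) {
  const hits = keywordsFor(tool).filter((kw) => new RegExp(`\\b${kw}\\b`, 'i').test(planText));
  return { server: tool.server, name: tool.name, score: hits.length, hits };
}

function recommend(tools, planText, { max = 10, minScore = 1 } = {}) {
  return tools
    .map((t) => scoreTool(t, planText))
    .filter((r) => r.score >= minScore)
    .sort((a, b) => b.score - a.score)
    .slice(0, max);
}

const tools = [
  { server: 'linear', name: 'linear.updateTicket', description: 'Update a Linear issue' },
  { server: 'postgres', name: 'postgres.query', description: 'Run a query' },
];
const plan = 'Update the Linear ticket, then migrate the postgresql schema.';
console.log(recommend(tools, plan));
```

With this plan text only the Linear tool is recommended: `postgresql` does not satisfy the word-boundary match for the keyword `postgres`, so the Postgres tool scores zero and is dropped by the default `minScore` of 1.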
|
|
82
|
+
|
|
83
|
+
### `cache`
|
|
84
|
+
|
|
85
|
+
Write or inspect the MCP tools cache at `.planning/bridge/available-tools.json`.
|
|
86
|
+
|
|
87
|
+
```
|
|
88
|
+
# Inspect current cache (same as `list` but raw)
|
|
89
|
+
/pan:mcp-bridge cache
|
|
90
|
+
|
|
91
|
+
# Seed cache from scripted discovery (for testing or external pipeline)
|
|
92
|
+
/pan:mcp-bridge cache --runtime claude --servers '[{"name":"linear","tools":[{"name":"linear.updateTicket","description":"Update ticket"}]}]'
|
|
93
|
+
```
|
|
94
|
+
|
|
95
|
+
Normally the host runtime writes this file. The CLI path exists for test fixtures and external-script integration.
|
|
96
|
+
|
|
97
|
+
</subcommands>
|
|
98
|
+
|
|
99
|
+
<workflow>
|
|
100
|
+
|
|
101
|
+
**New to a project with MCP tools?** Run `/pan:mcp-bridge list` to see what's available. If empty, check the host runtime's MCP config — `.claude/settings.json` for Claude Code, or the runtime's equivalent.
|
|
102
|
+
|
|
103
|
+
**Planning a phase that might touch external systems?** Run `/pan:mcp-bridge recommend <phase>` to get a ranked shortlist. Copy relevant tool names into the phase plan's "External tools" section so the executor knows to invoke them.
|
|
104
|
+
|
|
105
|
+
**Pre-milestone review:** walk through each remaining phase with `/pan:mcp-bridge recommend` to catch "we should have automated this via Linear/Slack/etc." realizations before shipping.
|
|
106
|
+
|
|
107
|
+
</workflow>
|
|
108
|
+
|
|
109
|
+
<caveats>
|
|
110
|
+
|
|
111
|
+
**Discovery is a cache, not a live probe.** The host runtime owns populating `.planning/bridge/available-tools.json`. PAN does not query MCP servers directly — that would require runtime-specific HTTP or IPC integration this command deliberately avoids.
|
|
112
|
+
|
|
113
|
+
**Keyword scoring is crude.** "Postgres" and "PostgreSQL" are different tokens; `postgresql` in a plan won't match a `postgres.query` tool unless the plan also says "postgres." Tune your plan language or expand tool descriptions to improve matches.
|
|
114
|
+
|
|
115
|
+
**Claude Code is the primary target.** MCP is a Claude-first protocol. Other runtimes may have their own tool-discovery mechanisms; the cache schema is intentionally generic so a future Codex/Gemini equivalent could populate the same file.
|
|
116
|
+
|
|
117
|
+
**No automatic invocation.** This command never calls MCP tools. It tells you what's available and what might apply. The actual invocation happens via the host runtime's normal tool-use flow (Claude Code's tool calls, etc.) when the executor agent decides to use a recommended tool.
|
|
118
|
+
|
|
119
|
+
</caveats>
|
|
120
|
+
|
|
121
|
+
<runtime_compatibility>
|
|
122
|
+
|
|
123
|
+
| Runtime | list | recommend | cache |
|
|
124
|
+
|---------|------|-----------|-------|
|
|
125
|
+
| Claude Code | Full | Full | Full (host-populated) |
|
|
126
|
+
| OpenCode | Stub (empty cache returns gracefully) | Stub | CLI write works |
|
|
127
|
+
| Gemini CLI | Stub | Stub | CLI write works |
|
|
128
|
+
| Codex CLI | Stub | Stub | CLI write works |
|
|
129
|
+
| Copilot CLI | Stub | Stub | CLI write works |
|
|
130
|
+
|
|
131
|
+
On non-Claude runtimes, the `list` and `recommend` logic still works; it just reports zero tools until something populates the cache.
|
|
132
|
+
|
|
133
|
+
</runtime_compatibility>
|
|
134
|
+
|
|
135
|
+
<future_scope>
|
|
136
|
+
|
|
137
|
+
Explicitly deferred from v3.3 (documented in ADR-0023 / Spec B v2 notes):
|
|
138
|
+
|
|
139
|
+
1. **Auto-inject recommended tools into planner context** — requires a stable MCP schema contract and a plan-template extension. Candidate for v3.5.
|
|
140
|
+
2. **Auto-invoke MCP tools from executor agent** — requires permission-gating and per-tool safety review. Candidate for v3.5+.
|
|
141
|
+
3. **Cross-runtime tool discovery** — generic MCP-like protocol for non-Claude runtimes. No timeline; needs ecosystem signal.
|
|
142
|
+
|
|
143
|
+
Until those land, this command is the minimum viable integration: you see what's there, you get suggestions, you decide manually.
|
|
144
|
+
|
|
145
|
+
</future_scope>
|
|
@@ -124,6 +124,17 @@ ELSE:
|
|
|
124
124
|
```
|
|
125
125
|
</routing_decision_tree>
|
|
126
126
|
|
|
127
|
+
<cache_priming>
|
|
128
|
+
**Before spawning research + planner agents, prime the prompt cache.** All sub-agents spawned within the next 5 minutes hit cached context instead of re-reading project.md / requirements.md / roadmap.md / state.md / standards.md.
|
|
129
|
+
|
|
130
|
+
Run once per invocation:
|
|
131
|
+
```
|
|
132
|
+
pan-tools cache prime --summary
|
|
133
|
+
```
|
|
134
|
+
|
|
135
|
+
Returns `{blocks: [{path, bytes, cache}], total_bytes, sha}`. On Claude Code with Opus 4.7, the host runtime translates these block references into `cache_control: ephemeral`. On non-Claude runtimes or older models this is a no-op — nothing breaks.
|
|
136
|
+
</cache_priming>
|
|
137
|
+
|
|
127
138
|
<process>
|
|
128
139
|
Execute the plan-phase workflow from @~/.claude/pan-wizard-core/workflows/plan-phase.md end-to-end.
|
|
129
140
|
Preserve all workflow gates (validation, research, planning, verification loop, routing).
|
|
@@ -0,0 +1,114 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: pan:preview
|
|
3
|
+
group: Foresight
|
|
4
|
+
description: Preview what will happen — phase blast radius, phase dependency graph, or milestone ETA
|
|
5
|
+
argument-hint: "phase <N> | phases | milestone"
|
|
6
|
+
allowed-tools:
|
|
7
|
+
- Read
|
|
8
|
+
- Bash
|
|
9
|
+
- Glob
|
|
10
|
+
- Grep
|
|
11
|
+
- Write
|
|
12
|
+
- Task
|
|
13
|
+
---
|
|
14
|
+
|
|
15
|
+
<objective>
|
|
16
|
+
Read-only foresight. Given a phase, a set of phases, or a milestone, produce a structured forecast: what files get touched, which tests might break, which phases can parallelize, when the milestone will actually finish.
|
|
17
|
+
|
|
18
|
+
Consolidates Spec B v1's architect + simulate + predict-milestone into one entry point with three modes. The data layer (`pan-tools preview …`) extracts structured inputs from `.planning/`; the `pan-previewer` agent analyzes and writes the report. No source code is modified.
|
|
19
|
+
</objective>
|
|
20
|
+
|
|
21
|
+
<execution_context>
|
|
22
|
+
@~/.claude/pan-wizard-core/bin/lib/preview.cjs
|
|
23
|
+
@~/.claude/pan-wizard-core/templates/preview-report.md
|
|
24
|
+
</execution_context>
|
|
25
|
+
|
|
26
|
+
<modes>
|
|
27
|
+
|
|
28
|
+
### `phase <N>` — Blast radius of one phase
|
|
29
|
+
|
|
30
|
+
```
|
|
31
|
+
/pan:preview phase 7
|
|
32
|
+
```
|
|
33
|
+
|
|
34
|
+
**What it does:**
|
|
35
|
+
1. `pan-tools preview phase <N>` returns `{files_mentioned, test_files_mentioned, risk_signals, risk_score, plans[], status}`.
|
|
36
|
+
2. Spawn `pan-previewer` with the payload as `<preview_input>`.
|
|
37
|
+
3. Agent writes `.planning/phases/<N>/preview.md` with files touched / tests at risk / migration steps / risk assessment / bottom line.
|
|
38
|
+
|
|
39
|
+
**Output:** `.planning/phases/<N>/preview.md`
|
|
40
|
+
|
|
41
|
+
### `phases` — Cross-phase dependency graph
|
|
42
|
+
|
|
43
|
+
```
|
|
44
|
+
/pan:preview phases
|
|
45
|
+
```
|
|
46
|
+
|
|
47
|
+
**What it does:**
|
|
48
|
+
1. `pan-tools preview phases` returns `{phases[], parallel_batches, mermaid, hidden_coupling_count}`.
|
|
49
|
+
2. Spawn `pan-previewer` with `mode: phases` in the payload.
|
|
50
|
+
3. Agent writes `.planning/architecture/dependency-graph.md` with mermaid DAG + parallel batches + hidden-coupling flags.
|
|
51
|
+
|
|
52
|
+
**Output:** `.planning/architecture/dependency-graph.md`
|
|
53
|
+
|
|
54
|
+
**Opus 4.7 1M-context bonus:** when the full repo fits in a single agent window, the agent cross-references plan text with actual source imports to catch coupling the frontmatter missed. On smaller-context models, the agent relies on data-layer output alone.
|
|
55
|
+
|
|
56
|
+
### `milestone` — Completion ETA
|
|
57
|
+
|
|
58
|
+
```
|
|
59
|
+
/pan:preview milestone
|
|
60
|
+
```
|
|
61
|
+
|
|
62
|
+
**What it does:**
|
|
63
|
+
1. `pan-tools preview milestone` returns `{phases_total, completed, remaining, avg_phase_duration_days, eta_date, confidence_pct, bottleneck, sample_size}`.
|
|
64
|
+
2. Spawn `pan-previewer` with `mode: milestone`.
|
|
65
|
+
3. Agent writes `.planning/milestones/preview-<today>.md` with ETA + confidence + bottleneck + caveats + bottom line.
|
|
66
|
+
|
|
67
|
+
**Output:** `.planning/milestones/preview-YYYY-MM-DD.md`
|
|
68
|
+
|
|
69
|
+
</modes>
|
|
70
|
+
|
|
71
|
+
<workflow>
|
|
72
|
+
|
|
73
|
+
**Before committing to a phase:** run `/pan:preview phase <N>` to see blast radius. A `risk_score ≥ 7` or a migration signal on auth files should prompt a review before `/pan:exec-phase`.
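The review rule above could be expressed as a small gate. This is a sketch under assumptions: it treats `risk_signals` as an array of string labels and `files_mentioned` as an array of paths, neither of which the document specifies, and `needsReview` is a hypothetical name.

```javascript
// Hypothetical gate over the `pan-tools preview phase <N>` payload.
// Assumes risk_signals is an array of strings and files_mentioned an array of paths.
function needsReview(preview, threshold = 7) {
  const signals = preview.risk_signals || [];
  const files = preview.files_mentioned || [];
  const highScore = preview.risk_score >= threshold;
  const hasMigration = signals.some((s) => /migration/i.test(s));
  const touchesAuth = files.some((f) => /auth/i.test(f));
  // Review if the score is high, or if a migration signal coincides with auth files.
  return highScore || (hasMigration && touchesAuth);
}

console.log(needsReview({ risk_score: 8, risk_signals: [], files_mentioned: [] }));
```

The threshold is a parameter so a team can tighten or loosen it without touching the rule itself.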
|
|
74
|
+
|
|
75
|
+
**Before committing to a milestone date externally:** run `/pan:preview milestone`. Look at `confidence_pct` and `sample_size`. If `sample_size` is below 3, don't promise a date.
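A minimal sketch of that check, assuming the milestone payload has already been parsed from JSON. The `milestoneAdvice` helper and the sample values are hypothetical; only the field names and the minimum sample size of 3 come from the text.

```javascript
// Hypothetical check over the `pan-tools preview milestone` payload.
function milestoneAdvice(m, minSamples = 3) {
  if (m.sample_size < minSamples) {
    return {
      promise: false,
      reason: `only ${m.sample_size} completed phase(s) to extrapolate from`,
    };
  }
  return {
    promise: true,
    reason: `ETA ${m.eta_date} at ${m.confidence_pct}% confidence`,
  };
}

// Made-up payload values for illustration.
console.log(milestoneAdvice({ sample_size: 2, eta_date: '2025-01-15', confidence_pct: 40 }));
```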
|
|
76
|
+
|
|
77
|
+
**Before running phases in parallel:** run `/pan:preview phases`. Parallel batches from the data layer are based on declared `depends_on` only; `hidden_coupling_count > 0` means there are cross-phase references the author should promote to explicit deps before parallelizing.
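To make "parallel batches from declared `depends_on`" concrete, here is one common way to derive them: repeatedly peel off every phase whose dependencies are already satisfied. This is a sketch of the general layering technique, not the pan-tools implementation, and the `{id, depends_on}` entry shape is an assumption.

```javascript
// Sketch: group phases into batches that can run in parallel,
// using only declared depends_on edges (assumed entry shape: {id, depends_on: [ids]}).
function parallelBatches(phases) {
  const remaining = new Map(phases.map((p) => [p.id, new Set(p.depends_on || [])]));
  const batches = [];
  while (remaining.size > 0) {
    // Every phase with no unsatisfied dependency is ready in this round.
    const ready = [...remaining.keys()].filter((id) => remaining.get(id).size === 0);
    if (ready.length === 0) {
      throw new Error('dependency cycle or unknown dependency');
    }
    batches.push(ready);
    for (const id of ready) remaining.delete(id);
    for (const deps of remaining.values()) {
      for (const id of ready) deps.delete(id);
    }
  }
  return batches;
}

const samplePhases = [
  { id: 1, depends_on: [] },
  { id: 2, depends_on: [1] },
  { id: 3, depends_on: [1] },
  { id: 4, depends_on: [2, 3] },
];

console.log(parallelBatches(samplePhases));
```

Because the layering only sees declared edges, two phases can land in the same batch while still sharing undeclared references, which is the situation `hidden_coupling_count` is meant to flag.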
|
|
78
|
+
|
|
79
|
+
</workflow>
|
|
80
|
+
|
|
81
|
+
<process>
|
|
82
|
+
|
|
83
|
+
For all modes:
|
|
84
|
+
|
|
85
|
+
1. Run the corresponding `pan-tools preview <mode>` subcommand.
|
|
86
|
+
2. Parse its JSON output.
|
|
87
|
+
3. Spawn `pan-previewer` with a prompt that includes:
|
|
88
|
+
- `<preview_input>` block carrying the full JSON payload (mode field set explicitly)
|
|
89
|
+
- `<output_path>` block with the target file path
|
|
90
|
+
- `<files_to_read>` block with any phase context files the agent should load
|
|
91
|
+
4. Agent writes the report file and returns a short confirmation.
|
|
92
|
+
5. Echo the output path to the user.
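Step 3 above might be assembled along these lines. The three block names come from the list; the `buildPreviewerPrompt` helper, its parameter names, and the exact layout inside each block are assumptions for illustration.

```javascript
// Hypothetical prompt builder for the pan-previewer spawn.
// Block names follow the process list; inner formatting is an assumption.
function buildPreviewerPrompt({ mode, payload, outputPath, filesToRead = [] }) {
  // Set the mode field explicitly, even if the data layer already included one.
  const input = JSON.stringify({ ...payload, mode }, null, 2);
  return [
    '<preview_input>',
    input,
    '</preview_input>',
    '<output_path>',
    outputPath,
    '</output_path>',
    '<files_to_read>',
    ...filesToRead,
    '</files_to_read>',
  ].join('\n');
}

const samplePrompt = buildPreviewerPrompt({
  mode: 'phase',
  payload: { risk_score: 4, files_mentioned: [] }, // made-up payload
  outputPath: '.planning/phases/7/preview.md',
});

console.log(samplePrompt);
```

Keeping the prompt to these three blocks matches the guidance to leave the agent's context budget for reasoning rather than project loading.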
|
|
93
|
+
|
|
94
|
+
The agent does not need workflow context beyond what the data layer provides. Keep spawned-agent prompts lean — the agent's context budget is for reasoning about the structured input, not for loading the whole project.
|
|
95
|
+
|
|
96
|
+
</process>
|
|
97
|
+
|
|
98
|
+
<output_contract>
|
|
99
|
+
The command returns the path to the generated preview document. Never paste the report back into conversation output — the file is the deliverable; reference it by path.
|
|
100
|
+
</output_contract>
|
|
101
|
+
|
|
102
|
+
<runtime_compatibility>
|
|
103
|
+
|
|
104
|
+
| Runtime | phase | phases | milestone |
|
|
105
|
+
|---------|-------|--------|-----------|
|
|
106
|
+
| Claude Code | Full, thinking enabled | Full, 1M-ctx bonus on Opus 4.7 | Full |
|
|
107
|
+
| OpenCode | Full | Data-layer + simple report | Full |
|
|
108
|
+
| Gemini CLI | Full | Data-layer + simple report | Full |
|
|
109
|
+
| Codex CLI | Full | Data-layer + simple report | Full |
|
|
110
|
+
| Copilot CLI | Full | Data-layer + simple report | Full |
|
|
111
|
+
|
|
112
|
+
The data layer (`pan-tools preview …`) works identically on all runtimes. What varies is the quality of the agent's synthesis — Opus 4.7 with thinking catches subtler risks than smaller models.
|
|
113
|
+
|
|
114
|
+
</runtime_compatibility>
|