pan-wizard 2.9.0 → 3.4.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (69)
  1. package/README.md +8 -8
  2. package/agents/pan-conductor.md +189 -0
  3. package/agents/pan-counterfactual.md +112 -0
  4. package/agents/pan-debugger.md +15 -1
  5. package/agents/pan-document_code.md +21 -0
  6. package/agents/pan-executor.md +16 -0
  7. package/agents/pan-hardener.md +113 -0
  8. package/agents/pan-integration-checker.md +2 -0
  9. package/agents/pan-knowledge.md +81 -0
  10. package/agents/pan-meta-reviewer.md +91 -0
  11. package/agents/pan-plan-checker.md +2 -0
  12. package/agents/pan-previewer.md +98 -0
  13. package/agents/pan-project-researcher.md +4 -4
  14. package/agents/pan-reviewer.md +2 -0
  15. package/agents/pan-verifier.md +2 -0
  16. package/bin/install-lib.cjs +197 -0
  17. package/bin/install.js +1999 -1959
  18. package/commands/pan/assumptions.md +38 -3
  19. package/commands/pan/audit-deployment.md +6 -0
  20. package/commands/pan/cost.md +132 -0
  21. package/commands/pan/debug.md +71 -2
  22. package/commands/pan/exec-phase.md +105 -0
  23. package/commands/pan/focus-auto.md +199 -18
  24. package/commands/pan/focus-design.md +67 -2
  25. package/commands/pan/focus-exec.md +178 -47
  26. package/commands/pan/focus-scan.md +17 -5
  27. package/commands/pan/knowledge.md +129 -0
  28. package/commands/pan/map-codebase.md +47 -6
  29. package/commands/pan/mcp-bridge.md +145 -0
  30. package/commands/pan/milestone-audit.md +23 -0
  31. package/commands/pan/new-project.md +64 -0
  32. package/commands/pan/pause.md +42 -1
  33. package/commands/pan/plan-phase.md +95 -0
  34. package/commands/pan/preview.md +114 -0
  35. package/commands/pan/profile.md +37 -0
  36. package/commands/pan/quick.md +15 -0
  37. package/commands/pan/resume.md +62 -2
  38. package/commands/pan/review-deep.md +128 -0
  39. package/commands/pan/verify-phase.md +53 -0
  40. package/commands/pan/what-if.md +146 -0
  41. package/hooks/dist/pan-cost-logger.js +102 -0
  42. package/hooks/dist/pan-statusline.js +154 -108
  43. package/package.json +1 -1
  44. package/pan-wizard-core/bin/lib/bridge.cjs +269 -0
  45. package/pan-wizard-core/bin/lib/bus.cjs +251 -0
  46. package/pan-wizard-core/bin/lib/codebase.cjs +118 -0
  47. package/pan-wizard-core/bin/lib/constants.cjs +42 -1
  48. package/pan-wizard-core/bin/lib/context-budget.cjs +27 -0
  49. package/pan-wizard-core/bin/lib/core.cjs +91 -6
  50. package/pan-wizard-core/bin/lib/cost.cjs +359 -0
  51. package/pan-wizard-core/bin/lib/focus.cjs +105 -2
  52. package/pan-wizard-core/bin/lib/init.cjs +5 -5
  53. package/pan-wizard-core/bin/lib/knowledge.cjs +331 -0
  54. package/pan-wizard-core/bin/lib/memory.cjs +252 -0
  55. package/pan-wizard-core/bin/lib/phase.cjs +40 -13
  56. package/pan-wizard-core/bin/lib/preview.cjs +480 -0
  57. package/pan-wizard-core/bin/lib/review-deep.cjs +280 -0
  58. package/pan-wizard-core/bin/lib/roadmap.cjs +4 -4
  59. package/pan-wizard-core/bin/lib/state.cjs +2 -2
  60. package/pan-wizard-core/bin/lib/verify.cjs +34 -1
  61. package/pan-wizard-core/bin/lib/whatif.cjs +289 -0
  62. package/pan-wizard-core/bin/pan-tools.cjs +239 -4
  63. package/pan-wizard-core/templates/playbook.md +53 -0
  64. package/pan-wizard-core/templates/preview-report.md +93 -0
  65. package/pan-wizard-core/templates/roadmap.md +24 -24
  66. package/pan-wizard-core/templates/state.md +12 -9
  67. package/pan-wizard-core/workflows/plan-phase.md +1 -1
  68. package/scripts/build-hooks.js +2 -1
  69. package/scripts/generate-skills-docs.py +560 -0
@@ -18,6 +18,18 @@ Execute items from the current focus batch with capacity-based sizing, full sess
18
18
 
19
19
  **Goal:** One-command pipeline that starts a session, loads the planned batch, implements items with tier-based execution protocols, verifies the work, syncs documentation, and closes the session cleanly.
20
20
 
21
+ <completion_contract>
22
+ Execution is complete when ALL conditions are met:
23
+ 1. All batch items processed (each marked DONE or FAILED with reason)
24
+ 2. Full test suite passes with count >= Stage 1 baseline
25
+ 3. Stage 6 pre-commit checklist passes (all 6 checks)
26
+ 4. Commit created listing only VERIFIED items
27
+ 5. Session recorded with before/after test counts and budget usage
28
+ 6. Active scan file updated with item statuses
29
+
30
+ Execution FAILS if: test baseline cannot be established (Stage 1), or test count drops below baseline after all reverts.
31
+ </completion_contract>
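One way to read the contract above is as a single predicate over the end-of-run state. The sketch below is illustrative only: the `session` record and all of its field names (`items`, `testsPassed`, `checksPassed`, and so on) are assumptions made for this example and are not taken from pan-wizard's code.

```javascript
// Hypothetical end-of-run record; every field name is an assumption of this sketch.
function isExecutionComplete(session) {
  const allProcessed = session.items.every(
    (item) =>
      item.status === "DONE" ||
      (item.status === "FAILED" && Boolean(item.reason))
  );
  const testsHold =
    session.testsPassed && session.testCount >= session.baselineCount;

  return Boolean(
    allProcessed &&                // 1. each item DONE, or FAILED with a reason
    testsHold &&                   // 2. suite passes, count >= Stage 1 baseline
    session.checksPassed === 6 &&  // 3. all 6 pre-commit checks pass
    session.commitCreated &&       // 4. commit lists only verified items
    session.sessionRecorded &&     // 5. test counts and budget usage recorded
    session.scanFileUpdated        // 6. active scan file has item statuses
  );
}
```

A FAILED item without a reason makes the predicate false, which matches condition 1: failure is acceptable, silent failure is not.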
32
+
21
33
  ---
22
34
 
23
35
  ## Pipeline Overview
@@ -46,13 +58,33 @@ Execute items from the current focus batch with capacity-based sizing, full sess
46
58
  - Commit, record session, generate summary
47
59
  ```
48
60
 
61
+ <action_gating>
62
+ Each stage has a restricted set of appropriate actions. Using the wrong tool at the wrong stage causes regressions.
63
+
64
+ | Stage | Read | Grep/Glob | Edit/Write | Bash (tests) | Bash (git) |
65
+ |-------|------|-----------|------------|--------------|------------|
66
+ | 1. Session Start | YES | YES | NO | YES | YES |
67
+ | 2. Batch Loading | YES | YES | NO | NO | NO |
68
+ | 3. Execution | YES | YES | YES | YES | NO |
69
+ | 4. Verification | YES | YES | NO | YES | NO |
70
+ | 5. Doc Sync | YES | YES | YES | NO | NO |
71
+ | 6. Session End | YES | NO | YES | NO | YES |
72
+
73
+ **Key constraints:**
74
+ - Stage 1: NO Edit/Write — you are establishing baseline, not changing code
75
+ - Stage 2: Read-only — validating the batch, not modifying anything
76
+ - Stage 4: NO Edit/Write — you are verifying work, not doing more work. If tests fail, go back to Stage 3 to fix.
77
+ - Stage 5: Edit docs only — no code changes during doc sync
78
+ - Stage 6: Git operations + session recording only — all work must be done
79
+ </action_gating>
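The gating table can also be expressed as a lookup, which makes the per-stage restrictions easy to check mechanically. This is a minimal sketch: the row values are copied from the table, but the names `ACTION_GATES`, `isActionAllowed`, and the action keys (`read`, `search`, `edit`, `tests`, `git`) are this example's own and do not come from the package.

```javascript
// Rows mirror the action-gating table; identifiers are assumptions of this sketch.
const ACTION_GATES = {
  1: { read: true, search: true, edit: false, tests: true, git: true },   // Session Start
  2: { read: true, search: true, edit: false, tests: false, git: false }, // Batch Loading
  3: { read: true, search: true, edit: true, tests: true, git: false },   // Execution
  4: { read: true, search: true, edit: false, tests: true, git: false },  // Verification
  5: { read: true, search: true, edit: true, tests: false, git: false },  // Doc Sync
  6: { read: true, search: false, edit: true, tests: false, git: true },  // Session End
};

// Returns whether an action is appropriate in a stage; throws on unknown input.
function isActionAllowed(stage, action) {
  const gate = ACTION_GATES[stage];
  if (!gate || !Object.hasOwn(gate, action)) {
    throw new Error(`Unknown stage or action: ${stage}, ${action}`);
  }
  return gate[action];
}
```

For example, `isActionAllowed(4, "edit")` is `false`, which reflects the Stage 4 constraint that failing tests send the work back to Stage 3 rather than being patched during verification.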
80
+
49
81
  ---
50
82
 
51
- ## CRITICAL: Project Scope Boundary
83
+ ## Project Scope Boundary
52
84
 
53
- This command executes work on the **host project's source code** — NOT on PAN Wizard's own infrastructure.
85
+ This command executes work on the **host project's source code** — not on PAN Wizard's own infrastructure.
54
86
 
55
- **NEVER read, modify, or "fix" files in these PAN directories:**
87
+ **Do not read, modify, or fix files in these PAN directories:**
56
88
  - `.claude/`, `.github/copilot-instructions.md`, `.opencode/`, `.gemini/`, `.codex/` — PAN runtime directories
57
89
  - Any `pan-wizard-core/`, `pan-tools`, agent `.md`, or command `.md` files within PAN runtime directories
58
90
 
@@ -60,9 +92,22 @@ This command executes work on the **host project's source code** — NOT on PAN
60
92
 
61
93
  ---
62
94
 
63
- ## MANDATORY: Execute ALL Stages Sequentially
95
+ ## Execute All Stages Sequentially
96
+
97
+ When `/pan:focus-exec` is invoked, run all 6 stages in order. Do not skip stages or stop between them unless tests regress.
98
+
99
+ <stage_dependencies>
100
+ Stage 1 → Stage 2: Baseline MUST exist before batch loads (regression detection requires it)
101
+ Stage 2 → Stage 3: Batch MUST be validated before execution begins (prevents working on stale/empty batches)
102
+ Stage 3 → Stage 4: All items MUST be processed before verification (partial verification produces false confidence)
103
+ Stage 4 → Stage 5: Tests MUST pass before doc sync (don't document broken code)
104
+ Stage 5 → Stage 6: Docs MUST be updated before commit (commit captures the complete state)
64
105
 
65
- When `/pan:focus-exec` is invoked, run ALL 6 stages in order. Do NOT skip stages. Do NOT stop between stages unless a critical failure occurs (tests regress).
106
+ HARD STOP conditions (do not proceed to next stage):
107
+ - Stage 1: Test suite fails → fix tests before proceeding
108
+ - Stage 2: No batch file found → tell user to run /pan:focus-plan
109
+ - Stage 4: Test count below baseline → revert last changes, re-verify
110
+ </stage_dependencies>
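The dependency chain and hard stops amount to "run in order, halt at the first stage that reports a stop condition". A minimal sketch of that control flow follows, assuming each stage is a caller-supplied function returning `{ ok, reason }`; that shape and the name `runStages` are hypothetical, and the recovery actions named above (fixing tests, reverting changes) are deliberately left out.

```javascript
// Runs stage callbacks in order and halts at the first hard stop.
// The { name, run } and { ok, reason } shapes are assumptions of this sketch.
function runStages(stages) {
  const completed = [];
  for (const { name, run } of stages) {
    const result = run();
    if (!result.ok) {
      // Do not proceed to the next stage; report where and why execution stopped.
      return { completed, stoppedAt: name, reason: result.reason };
    }
    completed.push(name);
  }
  return { completed, stoppedAt: null, reason: null };
}
```

Because later stages never run after a stop, a missing batch file in Stage 2 cannot lead to edits in Stage 3.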
66
111
 
67
112
  **Flags:**
68
113
  - `--budget N` — Override capacity budget in points (default: 50, min: 5, max: 100)
@@ -71,6 +116,7 @@ When `/pan:focus-exec` is invoked, run ALL 6 stages in order. Do NOT skip stages
71
116
  - `--dry-run` — Run Stages 1-2 only (show what WOULD be executed)
72
117
  - `--no-commit` — Skip the commit step in Stage 6
73
118
  - `--continue` — Resume a previously interrupted execution
119
+ - `--deep-review` (v3.4+) — After each high-stakes item's execution, run `/pan:review-deep` for that item (pan-hardener + pan-meta-reviewer security + cross-check). Slows the campaign by roughly 3× per item that triggers the deep pass; use for batches touching auth/payment/migrations.
74
120
 
75
121
  ---
76
122
 
@@ -86,34 +132,54 @@ When `/pan:focus-exec` is invoked, run ALL 6 stages in order. Do NOT skip stages
86
132
 
87
133
  ---
88
134
 
89
- ## AI Behavioral Rules (ALL 9 MANDATORY)
135
+ ## AI Behavioral Rules
90
136
 
91
- ### Rule 1: Read Before You Write (MANDATORY)
92
- Before changing ANY file, read it first. Understand context, callers, and invariants.
137
+ ### Rule 1: Read Before You Write
138
+ Before changing any file, read it first. Understand context, callers, and invariants.
93
139
 
94
- ### Rule 2: Understand the Root Cause (MANDATORY)
95
- Do NOT apply surface-level patches. Trace the code path, identify the actual defect.
140
+ **Violation example:**
141
+ ```
142
+ BAD: Rename parameter `opts` → `options` in utils.cjs without reading callers
143
+ → 3 callers in api.cjs, workers.cjs break silently
144
+ GOOD: Grep for "utils\." → read all 3 callers → confirm param name is safe to change → edit
145
+ ```
96
146
 
97
- ### Rule 3: One Change, One Test (MANDATORY)
147
+ ### Rule 2: Understand the Root Cause
148
+ Do not apply surface-level patches. Trace the code path, identify the actual defect.
149
+
150
+ **Violation example:**
151
+ ```
152
+ BAD: Test fails with "Cannot read property 'name' of undefined"
153
+ → Add `if (!obj) return null` at the crash site
154
+ → Root cause: caller passes wrong argument order — still broken
155
+ GOOD: Trace the call chain → find caller passes (id, name) but function expects (name, id) → fix caller
156
+ ```
157
+
158
+ ### Rule 3: One Change, One Test
98
159
  Every code change must be tested before moving to the next item.
99
160
 
100
161
  Test cadence by tier:
101
162
  - **MICRO (XS/S):** Run specific test after implementing. Batch up to 3 independent items before smoke.
102
- - **STANDARD (M):** Full test suite after EACH item.
103
- - **FULL (L/XL):** Build hooks + full test suite after EACH item.
104
-
105
- ### Rule 4: Don't Invent — Follow the Plan (MANDATORY)
106
- Implement exactly what the batch says. No scope creep.
107
-
108
- ### Rule 5: Cross-Platform Awareness (MANDATORY)
163
+ - **STANDARD (M):** Full test suite after each item.
164
+ - **FULL (L/XL):** Build hooks + full test suite after each item.
165
+
166
+ ### Rule 4: Don't Invent — Follow the Plan
167
+ Implement exactly what the batch says. Do not:
168
+ - Add features not in the batch item
169
+ - Refactor surrounding code that isn't broken
170
+ - Add comments or docstrings to unchanged files
171
+ - Create abstractions for one-time operations
172
+ - Add error handling for scenarios that cannot happen
173
+
174
+ ### Rule 5: Cross-Platform Awareness
109
175
  - Use platform-agnostic path APIs (no hardcoded separators)
110
176
  - Follow the project's module format conventions (discover from existing code)
111
177
  - Use file-based input for shell-sensitive content when needed
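As an illustration of the first bullet, Node's built-in `path` module joins segments with the separator of the platform it runs on. The helper name `planningFile` is hypothetical; `path.join` is the standard Node.js API.

```javascript
const path = require("node:path");

// Builds a path under .planning/ using the platform's own separator
// instead of a hardcoded "/" or "\\".
function planningFile(projectRoot, ...segments) {
  return path.join(projectRoot, ".planning", ...segments);
}
```

Calling `planningFile("proj", "codebase", "overview.md")` produces a path with forward slashes on POSIX systems and backslashes on Windows, with no separator written in the source.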
112
178
 
113
- ### Rule 6: Revert Fast, Don't Dig Deep (MANDATORY)
179
+ ### Rule 6: Revert Fast, Don't Dig Deep
114
180
  If a fix doesn't work within 5 minutes, revert and move on. Failed items carry forward.
115
181
 
116
- ### Rule 7: Verify Understanding Before Committing (MANDATORY)
182
+ ### Rule 7: Verify Understanding Before Coding
117
183
  For M/L/XL items, state your understanding before writing code:
118
184
  ```
119
185
  Item P2-3 — Add tests for billing module
@@ -123,11 +189,19 @@ Files: billing.ts, tests/billing.test.ts
123
189
  Confidence: HIGH
124
190
  ```
125
191
 
126
- ### Rule 8: Preserve Existing Test Expectations (MANDATORY)
192
+ ### Rule 8: Preserve Existing Test Expectations
127
193
  Never change an existing test's expected output to match broken code.
128
194
 
129
- ### Rule 9: Commit Messages Must Be Accurate (MANDATORY)
130
- List ONLY items that are actually VERIFIED (passed tests). Include actual test counts.
195
+ ### Rule 9: Commit Messages Must Be Accurate
196
+ List only items that are verified (passed tests). Include actual test counts.
197
+
198
+ ### Rule 10: Vary Approach for Similar Items
199
+ When a batch contains 3+ items of the same type (e.g., "add null check to X", "add null check to Y"), deliberately vary your approach to avoid tunnel vision:
200
+ - Item 1: Fix as planned
201
+ - Item 2: Before fixing, re-read the module's error handling pattern — does the same fix apply or does this module handle errors differently?
202
+ - Item 3+: Check if the first fixes introduced a pattern that should be extracted (shared helper) or if each case is genuinely independent
203
+
204
+ This catches emergent interactions: 5 "add try-catch" fixes might reveal the module needs a centralized error boundary, not 5 scattered try-catches.
131
205
 
132
206
  ---
133
207
 
@@ -136,7 +210,8 @@ List ONLY items that are actually VERIFIED (passed tests). Include actual test c
136
210
  1. **Check Project Status** — git status, recent commits
137
211
  2. **Test Baseline** — run test suite, record current counts
138
212
  3. **Create rollback snapshot** — git tag for safety
139
- 4. **Report** — Output session start summary
213
+ 4. **Prime prompt cache** — `pan-tools cache prime --summary` (once; all sub-agents in the next 5 min hit cached context)
214
+ 5. **Report** — Output session start summary
140
215
 
141
216
  **Record baseline:**
142
217
  ```
@@ -170,6 +245,13 @@ Display the execution batch to user, then continue automatically.
170
245
  ### 3.0 Pre-Execution Setup
171
246
  1. Cache project facts — do NOT re-read later
172
247
  2. Create/update progress tracker with the batch table
248
+ 3. Classify stages for parallel tool use:
249
+ ```
250
+ pan-tools focus classify-stages --raw
251
+ ```
252
+ The CLI reads the latest batch and returns `{waves, parallelism_hint}`. When `parallelism_hint` is `emit-micro-in-parallel` or `emit-standard-in-parallel`, all reads and greps for items in the current wave SHOULD be emitted in a single assistant turn (parallel tool calls). Opus 4.7 is markedly better at emitting parallel tool calls than earlier models; use that to collapse Stage 3 latency on MICRO-heavy batches.
253
+
254
+ Serialize on `FULL` tier items — each is its own wave.
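To make the wave idea concrete, here is one plausible grouping rule, written as an assumption rather than a description of what `pan-tools focus classify-stages` does: adjacent items of the same non-FULL tier share a wave, and every FULL item gets a wave of its own. The item shape `{ id, tier }` and the function name are hypothetical.

```javascript
// Groups batch items into waves. Adjacent items of the same non-FULL tier share
// a wave; each FULL item is serialized into its own wave.
function groupIntoWaves(items) {
  const waves = [];
  for (const item of items) {
    const last = waves[waves.length - 1];
    if (item.tier !== "FULL" && last && last.tier === item.tier) {
      last.ids.push(item.id);
    } else {
      waves.push({ tier: item.tier, ids: [item.id] });
    }
  }
  return waves;
}
```

Under this rule, reads and greps for every id in a MICRO or STANDARD wave could be issued together, while FULL waves always contain exactly one item.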
173
255
 
174
256
  ### 3.1 Process Items by Tier
175
257
 
@@ -185,9 +267,10 @@ Display the execution batch to user, then continue automatically.
185
267
  ```
186
268
  1. STATE UNDERSTANDING (Rule 7)
187
269
  2. READ target files + test files
188
- 3. IMPLEMENT across necessary files
189
- 4. TEST full test suite
190
- 5. CONFIRM pass -> DONE | regresses -> REVERT -> FAILED
270
+ 3. STATE INTENT "I will modify [files], adding [what], to achieve [goal]"
271
+ 4. IMPLEMENT across necessary files
272
+ 5. TEST full test suite
273
+ 6. CONFIRM — pass -> DONE | regresses -> REVERT -> FAILED
191
274
  ```
192
275
 
193
276
  #### FULL Items (L/XL)
@@ -195,20 +278,55 @@ Display the execution batch to user, then continue automatically.
195
278
  1. STATE UNDERSTANDING (detailed)
196
279
  2. READ WIDELY — target files, callers, tests, related code
197
280
  3. DESIGN — outline approach before coding
198
- 4. IMPLEMENT in logical chunks
199
- 5. BUILD build hooks if hooks changed
200
- 6. TEST full test suite
201
- 7. CONFIRM all pass -> DONE | fail -> investigate (15 min max) -> REVERT -> FAILED
281
+ 4. STATE INTENT "I will modify [files]. Risk: [what could break]"
282
+ 5. IMPLEMENT in logical chunks
283
+ 6. BUILD build hooks if hooks changed
284
+ 7. TEST full test suite
285
+ 8. CONFIRM — all pass -> DONE | fail -> investigate (15 min max) -> REVERT -> FAILED
202
286
  ```
203
287
 
204
288
  ### 3.2 Failure Handling
205
- - Build breaks: fix typo or revert (5 min limit)
206
- - Test regression: identify cause, one fix attempt, else revert
207
- - **Never let a failed item block other items**
208
289
 
209
- ### 3.3 Progress Tracking
290
+ Classify every error before acting. The classification determines the recovery protocol.
291
+
292
+ **RECOVERABLE (retry with analysis, max 3 attempts):**
293
+ - Test failure after code change — read the error output, fix the root cause, re-test
294
+ - File not found — search for moved/renamed paths via Grep/Glob
295
+ - Build failure from syntax error — fix the typo, rebuild
296
+ - Merge conflict in a non-critical file — attempt auto-resolution
297
+
298
+ **UNRECOVERABLE (halt the item, mark FAILED, move to next):**
299
+ - Same test failure persists after 3 fix attempts — revert all changes for this item
300
+ - Permission or auth error on a critical path — cannot proceed without user action
301
+ - State corruption (malformed JSON in planning files) — stop, report to user
302
+ - Persistent build failure unrelated to current item — stop execution, report
303
+ - Test regression in unrelated code — revert, flag for investigation
304
+
305
+ **Never let a failed item block other items.** Mark it FAILED with the error classification and move on.
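The recovery protocol can be sketched as a bounded retry loop. This is an illustration under stated assumptions: `attemptFix` and `revert` are hypothetical callbacks supplied by the caller, and the `{ passed, recoverable }` outcome shape is invented for the example. Only the limit of 3 attempts and the "revert, mark FAILED, move on" behavior come from the text above.

```javascript
const MAX_ATTEMPTS = 3; // "max 3 attempts" for RECOVERABLE errors

// Tries an item up to MAX_ATTEMPTS times. An unrecoverable outcome, or a failure
// that persists through every attempt, reverts the item and marks it FAILED.
function processItem(item, attemptFix, revert) {
  for (let attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
    const outcome = attemptFix(item, attempt);
    if (outcome.passed) {
      return { id: item.id, status: "DONE", attempts: attempt };
    }
    if (!outcome.recoverable) {
      revert(item);
      return { id: item.id, status: "FAILED", attempts: attempt, classification: "UNRECOVERABLE" };
    }
  }
  revert(item);
  return { id: item.id, status: "FAILED", attempts: MAX_ATTEMPTS, classification: "UNRECOVERABLE" };
}
```

The function always returns a result instead of throwing, so a caller iterating over the batch continues with the next item regardless of the outcome.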
306
+
307
+ ### 3.3 Failure Pattern Detection
308
+ When marking an item FAILED, check if its error matches a previous failure in this batch:
309
+ - Same error type or root cause category
310
+ - Same file or module involved
311
+
312
+ If a pattern repeats (2+ items fail the same way), log it in the session record:
313
+ ```
314
+ FAILURE PATTERN: {description} — Items {ID1}, {ID2} — Root cause: {cause}
315
+ Suggested avoidance: {what to check before similar items}
316
+ ```
317
+ Before executing remaining items, check if they match the pattern. If so, skip with reason "matches known failure pattern" rather than burning budget on predictable failures.
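Pattern detection of this kind reduces to grouping failures by a key and reporting keys that occur twice or more. The sketch below assumes each failed item carries a `category` and a `module` field and treats a match on both as "the same way"; the text above does not say whether the two criteria combine with AND or OR, so that choice and all names here are assumptions.

```javascript
// Groups failed items by (category, module) and returns groups with 2+ members.
// Field names and the AND-combination of criteria are assumptions of this sketch.
function findFailurePatterns(failedItems) {
  const groups = new Map();
  for (const item of failedItems) {
    const key = `${item.category}|${item.module}`;
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(item.id);
  }
  return [...groups.entries()]
    .filter(([, ids]) => ids.length >= 2)
    .map(([key, ids]) => ({ key, ids }));
}
```

A remaining item whose `category` and `module` produce a key already in the result would be a candidate for skipping with the reason "matches known failure pattern".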
318
+
319
+ ### 3.4 Progress Tracking
210
320
  Update progress tracker after each item with status and budget tracking.
211
321
 
322
+ **Attention anchor — emit after each item completes:**
323
+ ```
324
+ Item {N}/{total} {DONE|FAILED} | Budget: {used}/{budget} pts | Tests: {baseline} → {current}
325
+ Remaining: {count} items [{IDs with sizes}]
326
+ Next: {next item ID} — {title} ({tier})
327
+ ```
328
+ This prevents lost-in-the-middle drift in large batches where the agent forgets budget limits or remaining items.
329
+
212
330
  ---
213
331
 
214
332
  ## Stage 4: Verification
@@ -254,17 +372,30 @@ Edit the active scan file:
254
372
 
255
373
  ## Stage 6: Session End
256
374
 
257
- ### 6.1 Commit Changes
375
+ ### 6.1 Pre-Commit Verification Checklist
376
+
377
+ Before committing, run through ALL checks. Do not commit until every check passes.
378
+
379
+ 1. Every modified file was read before editing (no blind writes)
380
+ 2. `git diff --stat` contains only files related to batch items (no stray changes)
381
+ 3. Full test suite passes — count matches or exceeds baseline from Stage 1
382
+ 4. No `TODO`, `FIXME`, or `HACK` introduced without a matching batch item tracking it
383
+ 5. Commit message lists only items that are VERIFIED (tests ran, tests passed)
384
+ 6. No secrets, credentials, or `.env` files staged
385
+
386
+ If any check fails: fix the issue and re-run all checks. Only proceed to commit when all 6 pass.
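Check 6 is the easiest of the six to automate. As a rough sketch, the function below filters a list of staged paths against a few filename patterns; the patterns are illustrative assumptions, not an exhaustive secret scanner, and the function name is hypothetical.

```javascript
// Filename patterns that suggest env files or credential material.
// Illustrative only: a real check would need a broader, project-specific list.
const SUSPECT_PATTERNS = [
  /(^|\/)\.env(\..+)?$/,       // .env, .env.local, config/.env.production
  /\.pem$/,                    // PEM-encoded keys and certificates
  /(^|\/)credentials(\.|$)/i,  // credentials, credentials.json
];

// Returns the staged paths that match any suspect pattern.
function findSuspectStagedFiles(stagedPaths) {
  return stagedPaths.filter((p) => SUSPECT_PATTERNS.some((re) => re.test(p)));
}
```

The input list can be obtained from `git diff --cached --name-only`; an empty result means this particular check passes, while any entry should block the commit until it is unstaged.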
387
+
388
+ ### 6.2 Commit Changes
258
389
  Unless `--no-commit`:
259
390
  1. Stage modified files (specific paths, not `git add -A`)
260
391
  2. Create commit with accurate message listing verified items
261
392
  3. Verify commit succeeded
262
393
 
263
- ### 6.2 Record Session
394
+ ### 6.3 Record Session
264
395
  - Record session summary (items completed, tests before/after, budget used)
265
396
  - Append error patterns if any failures occurred
266
397
 
267
- ### 6.3 Final Report
398
+ ### 6.4 Final Report
268
399
 
269
400
  ```markdown
270
401
  ## /pan:focus-exec Complete
@@ -293,15 +424,15 @@ Run `/pan:focus-scan` to regenerate the scan.
293
424
 
294
425
  ## NEVER DO
295
426
 
296
- - Skip reading files before editing them (Rule 1)
297
- - Apply symptom patches instead of root cause fixes (Rule 2)
298
- - Batch implement without testing between items (Rule 3)
299
- - Expand scope beyond the batch item (Rule 4)
300
- - Ignore cross-platform path issues (Rule 5)
301
- - Spend more than 5 minutes debugging a single failure (Rule 6)
302
- - Start coding without stating understanding for M+ items (Rule 7)
303
- - Change test expectations to match broken code (Rule 8)
304
- - Claim items are fixed without running tests (Rule 9)
427
+ - Skip reading files before editing them — blind edits break callers, miss invariants, and create regressions (Rule 1)
428
+ - Apply symptom patches instead of root cause fixes — surface patches recur and erode trust in the codebase (Rule 2)
429
+ - Batch implement without testing between items — a silent failure in item 2 corrupts items 3-5 before you detect it (Rule 3)
430
+ - Expand scope beyond the batch item — unplanned changes bypass the budget system and risk compounding failures (Rule 4)
431
+ - Ignore cross-platform path issues — hardcoded separators break on Windows or vice versa (Rule 5)
432
+ - Spend more than 5 minutes debugging a single failure — diminishing returns; revert preserves budget for remaining items (Rule 6)
433
+ - Start coding without stating understanding for M+ items — misunderstanding the problem wastes the entire implementation (Rule 7)
434
+ - Change test expectations to match broken code — this hides bugs instead of fixing them (Rule 8)
435
+ - Claim items are fixed without running tests — unverified claims erode the entire verification pipeline (Rule 9)
305
436
 
306
437
  ## ALWAYS DO
307
438
 
@@ -17,11 +17,11 @@ Survey the project for prioritized work items with evidence-based scoring. $ARGU
17
17
 
18
18
  ---
19
19
 
20
- ## CRITICAL: Project Scope Boundary
20
+ ## Project Scope Boundary
21
21
 
22
- This command scans the **host project's source code** for work items — NOT PAN Wizard's own infrastructure.
22
+ This command scans the **host project's source code** for work items — not PAN Wizard's own infrastructure.
23
23
 
24
- **ALWAYS EXCLUDE these directories from scanning:**
24
+ **Exclude these directories from scanning:**
25
25
  - `.claude/`, `.github/copilot-instructions.md`, `.opencode/`, `.gemini/`, `.codex/` — PAN runtime directories
26
26
  - `.planning/` — PAN planning state (read for context, but never report PAN planning files as "issues")
27
27
  - Any `pan-wizard-core/`, `pan-tools`, agent `.md`, or command `.md` files within PAN runtime directories
@@ -32,9 +32,21 @@ If a scan finding points to a file inside `.claude/`, `.github/`, `.opencode/`,
32
32
 
33
33
  ---
34
34
 
35
- ## MANDATORY: Execute ALL Phases Automatically
35
+ ## Tool Selection Priority
36
36
 
37
- When `/pan:focus-scan` is invoked, execute ALL phases without stopping. Do NOT ask questions between phases. Do NOT skip phases. The output is a prioritized work list with Reality Score filtering.
37
+ Use the simplest sufficient tool for each scanning operation:
38
+ 1. **Grep** — for finding patterns (TODO, FIXME, error-prone code) across the codebase
39
+ 2. **Glob** — for discovering files by name pattern (test files, config files, modules)
40
+ 3. **Read** — for examining specific files identified by Grep/Glob
41
+ 4. **Bash** — only for commands that dedicated tools cannot do (git log, test runners)
42
+
43
+ Do not read entire files when Grep can find the relevant lines. Do not use Bash for searches that Grep handles.
44
+
45
+ ---
46
+
47
+ ## Execute All Phases Automatically
48
+
49
+ When `/pan:focus-scan` is invoked, execute all phases without stopping. Do not ask questions between phases or skip phases. The output is a prioritized work list with Reality Score filtering.
38
50
 
39
51
  **Flags:**
40
52
  - `--focus <area>` — Weight items toward a specific area (e.g., `--focus commands`, `--focus hooks`, `--focus tests`)
@@ -0,0 +1,129 @@
1
+ ---
2
+ name: pan:knowledge
3
+ group: Knowledge
4
+ description: Grounded Q&A, multi-turn design discussion, and playbook generation. Three modes in one command.
5
+ argument-hint: "ask <question> | discuss <phase> <topic> | playbook"
6
+ allowed-tools:
7
+ - Read
8
+ - Write
9
+ - Bash
10
+ - Grep
11
+ - Glob
12
+ - Task
13
+ ---
14
+
15
+ <objective>
16
+ Retrieve, refine, or consolidate project knowledge. Three modes:
17
+
18
+ - **ask** — answer a natural-language question with inline citations grounded in `.planning/` + `docs/`.
19
+ - **discuss** — multi-turn refinement of a phase's context. Session state persists across invocations; prompt caching keeps turn 3 cheap.
20
+ - **playbook** — aggregate all agents' memory (E-4 layer) into `.planning/playbook.md`, organized by category (Conventions / Gotchas / Decisions / Tool choices / Anti-patterns / Recurring gaps).
21
+
22
+ Consolidates Spec B v1's X-3 converse + X-6 teach + X-10 explain into one command.
23
+ </objective>
24
+
25
+ <execution_context>
26
+ @~/.claude/pan-wizard-core/bin/lib/knowledge.cjs
27
+ @~/.claude/agents/pan-knowledge.md
28
+ @~/.claude/pan-wizard-core/templates/playbook.md
29
+ </execution_context>
30
+
31
+ <modes>
32
+
33
+ ### `ask <question>`
34
+
35
+ ```
36
+ /pan:knowledge ask "why does phase 4 have a race condition fix?"
37
+ ```
38
+
39
+ **Flow:**
40
+ 1. `pan-tools knowledge ask "<question>"` returns a ranked list of candidate files.
41
+ 2. Spawn `pan-knowledge` with `<mode>ask</mode>`, the question, and the top sources as `<files_to_read>`.
42
+ 3. Agent reads sources, answers with citations, returns the answer to stdout. No file is written.
43
+
44
+ **Output:** inline markdown answer with `[file.md:LINE]` and `[ADR-NNNN]` citations.
45
+
46
+ ### `discuss <phase> <topic-or-question>`
47
+
48
+ ```
49
+ /pan:knowledge discuss 12 "should we use Redis or Memcached?"
50
+ ```
51
+
52
+ **Flow:**
53
+ 1. `pan-tools knowledge discuss <phase> --subcmd read` loads session state from `.planning/conversations/<phase>/session.json` (empty for new phase).
54
+ 2. `pan-tools knowledge discuss <phase> --subcmd append --role user --content "<topic>"` persists the user turn.
55
+ 3. Spawn `pan-knowledge` with `<mode>discuss</mode>`, session history, phase context, and the new turn.
56
+ 4. Agent responds.
57
+ 5. `pan-tools knowledge discuss <phase> --subcmd append --role agent --content "<response>" --cites "a.md,b.md"` persists the response.
58
+ 6. If after ≥3 substantive turns the agent offered to emit `context.md`, user can follow up with another `/pan:knowledge discuss <phase>` invocation or run the commit subcommand the agent suggested.
59
+
60
+ **Session persistence:** `.planning/conversations/<phase>/session.json` — array of turns with ts/role/content/cites. Multi-turn cost is dominated by cache hits on stable `.planning/` files.
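For orientation, a turn list with the `ts`/`role`/`content`/`cites` fields described above might be built as follows. This is a sketch, not the package's implementation: the ISO 8601 timestamp format, the default empty `cites` array, and the function name `appendTurn` are all assumptions. The two roles are the ones passed to `--role` in the flow above.

```javascript
// Returns a new turn list with one turn appended; the input array is not mutated.
// The timestamp format and the defaults are assumptions of this sketch.
function appendTurn(turns, role, content, cites = []) {
  if (role !== "user" && role !== "agent") {
    throw new Error(`Unsupported role: ${role}`);
  }
  return [...turns, { ts: new Date().toISOString(), role, content, cites }];
}
```

Serializing the returned array with `JSON.stringify` would give a file in the shape that the persistence note describes.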
61
+
62
+ ### `playbook`
63
+
64
+ ```
65
+ /pan:knowledge playbook
66
+ ```
67
+
68
+ **Flow:**
69
+ 1. `pan-tools knowledge playbook` reads all agents' memory (`.planning/memory/*.md`), clusters entries by category, writes `.planning/playbook.md` directly.
70
+ 2. Optionally spawn `pan-knowledge` with `<mode>playbook</mode>` to polish (dedupe contradictions, consolidate similar entries). Skip the polish step if the draft looks clean.
71
+
72
+ **Output:** `.planning/playbook.md` — team-readable summary of accumulated lessons.
73
+
74
+ **Auto-invocation:** `/pan:milestone-done` can optionally run this (flag-gated, not default). Manual invocation any time.
75
+
76
+ </modes>
77
+
78
+ <workflow>
79
+
80
+ **Onboarding a new team member:** have them run `/pan:knowledge playbook` then `/pan:knowledge ask "what conventions matter in this codebase?"`.
81
+
82
+ **Design debate:** run `/pan:knowledge discuss <phase> "<question>"` iteratively. The agent refines as the debate narrows. After convergence, accept the proposed `context.md` update.
83
+
84
+ **Bug investigation:** `/pan:knowledge ask "why did we add the retry in phase 4?"` — faster than grepping for historical context.
85
+
86
+ **Before milestone-done:** run `/pan:knowledge playbook` to capture what the team learned. Gives contributors something to reference when starting the next milestone.
87
+
88
+ </workflow>
89
+
90
+ <citation_format>
91
+
92
+ Agent output uses bracketed citations that link to files. Supported forms:
93
+
94
+ | Form | Example | Renders as |
95
+ |------|---------|-----------|
96
+ | Plain file | `[README.md]` | markdown link to the file |
97
+ | File + line | `[docs/ARCHITECTURE.md:200]` | link to line 200 |
98
+ | ADR | `[ADR-0015]` | link to ADR file |
99
+ | Phase artifact | `[phase-4/summary.md]` | link to phase summary |
100
+
101
+ The agent should NEVER fabricate citations. The retrieval layer's `sources` list is the allowlist.
102
+
103
+ </citation_format>
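The allowlist rule lends itself to a mechanical check: extract every bracketed citation from an answer and flag any whose target is not in the retrieval layer's `sources` list. The sketch below assumes `sources` is a flat array of strings that includes ADR identifiers alongside file paths; that shape and the function name are hypothetical.

```javascript
// Extracts bracketed citations such as [README.md] or [docs/ARCHITECTURE.md:200]
// and returns those whose target (without a trailing :LINE) is not allowlisted.
function findUnlistedCitations(answer, sources) {
  const allowed = new Set(sources);
  const unlisted = [];
  for (const match of answer.matchAll(/\[([^\]\s]+)\]/g)) {
    const target = match[1].replace(/:\d+$/, ""); // drop a trailing :LINE suffix
    if (!allowed.has(target)) unlisted.push(match[1]);
  }
  return unlisted;
}
```

An empty result means every citation in the answer traces back to a retrieved source; anything else is a citation the agent had no basis for.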
104
+
105
+ <runtime_compatibility>
106
+
107
+ | Runtime | ask | discuss | playbook |
108
+ |---------|-----|---------|----------|
109
+ | Claude Code | Full, thinking enabled | Full, prompt caching bonus | Full |
110
+ | OpenCode | Full | Full (no cache bonus) | Full |
111
+ | Gemini | Full | Full | Full |
112
+ | Codex | Full | Full | Full |
113
+ | Copilot | Full | Full | Full |
114
+
115
+ The data layer (retrieval, session state, playbook clustering) is pure Node.js and runtime-agnostic. Only answer synthesis quality varies with model capability.
116
+
117
+ </runtime_compatibility>
118
+
119
+ <privacy_note>
120
+
121
+ `session.json` is persisted to disk and committed unless `.planning/conversations/` is gitignored. For sensitive design discussions, consider:
122
+
123
+ ```
124
+ echo '.planning/conversations/' >> .gitignore
125
+ ```
126
+
127
+ before starting a `discuss` session. Session turns are not auto-encrypted.
128
+
129
+ </privacy_note>
@@ -49,16 +49,57 @@ Check for .planning/state.md - loads context if project already initialized
49
49
  - Trivial codebases (<5 files)
50
50
  </when_to_use>
51
51
 
52
+ <stage_0_ingest_mode>
53
+ **Before spawning mapper agents**, determine whether the repo fits in a single 1M-context window.
54
+
55
+ Run: `node ~/.claude/pan-wizard-core/bin/pan-tools.cjs codebase estimate-size --threshold 700000`
56
+
57
+ The CLI returns `{mode, total_tokens, file_count, languages}`:
58
+
59
+ - **`mode: "single-shot"`** — repo is small enough (≤700K tokens) for one Opus 4.7 agent to ingest the whole thing. Spawn a single `pan-document_code` agent with the full repo in context. This avoids the 6-way stitching artifacts of sharded mode (contradictory version claims, duplicated mentions, missed cross-file references).
60
+ - **`mode: "sharded"`** — repo exceeds 700K tokens. Fall back to the default 6-way parallel sharding (tech, arch, quality, concerns, relationships, practices). Each shard gets a 200K budget.
61
+
62
+ Record the chosen mode + telemetry in the final `.planning/codebase/overview.md` so future runs can reason about drift.
63
+
64
+ Opus 4.7 is required for single-shot mode (only model with a 1M context window). Other models always take the sharded path regardless of size.
65
+ </stage_0_ingest_mode>
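The mode decision described above comes down to two conditions. A minimal sketch, assuming the caller already knows the token estimate and whether its model has a 1M-token window (the `hasMillionTokenWindow` flag and the function name are hypothetical; the 700,000 threshold is the value passed via `--threshold`):

```javascript
const SINGLE_SHOT_THRESHOLD = 700000; // tokens, matching --threshold 700000

// Picks the ingest mode: single-shot only when the repo fits under the threshold
// AND the model has a 1M-token window; otherwise fall back to sharding.
function chooseIngestMode(totalTokens, hasMillionTokenWindow) {
  if (hasMillionTokenWindow && totalTokens <= SINGLE_SHOT_THRESHOLD) {
    return "single-shot";
  }
  return "sharded";
}
```

Note that a small repo still takes the sharded path when the model lacks the large window, which matches the last paragraph of the block above.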
66
+
67
+ <tool_priority>
68
+ Each mapper agent should use the simplest sufficient tool:
69
+ 1. Glob — discover files by pattern (find all .ts files, config files, test files)
70
+ 2. Grep — search for patterns across the codebase (imports, exports, function names)
71
+ 3. Read — examine specific files found by Glob/Grep
72
+ 4. Bash — only for git history or commands dedicated tools cannot handle
73
+ </tool_priority>
74
+
75
+ <progressive_context>
76
+ The orchestrator loads context in layers — NOT everything upfront. Mapper agents receive only what they need.
77
+
78
+ **Orchestrator layers (before spawning agents):**
79
+ 1. **Manifest** — package.json/Cargo.toml, project identity, entry points
80
+ 2. **Structure** — top-level directory listing, file count by extension, test presence
81
+ 3. **Git summary** — recent commits (10), contributors, branch info
82
+
83
+ **Per-agent context (each agent loads its own):**
84
+ - Each agent starts with: project manifest + directory structure + its focus area description
85
+ - Each agent discovers its own details via Glob/Grep/Read within its focus area
86
+ - Agents do NOT receive other agents' output (parallel, independent)
87
+
88
+ **Why:** Loading the entire codebase into the orchestrator before spawning agents wastes orchestrator context. Each agent has a fresh 200k window — let them explore independently. The orchestrator only needs enough context to spawn correctly and verify outputs exist.
89
+ </progressive_context>
90
+
52
91
  <process>
53
92
  1. Check if .planning/codebase/ already exists (offer to refresh or skip)
54
93
  2. Create .planning/codebase/ directory structure
55
- 3. Spawn 4 parallel pan-document_code agents:
56
- - Agent 1: tech focus → writes STACK.md, INTEGRATIONS.md
57
- - Agent 2: arch focus → writes ARCHITECTURE.md, STRUCTURE.md
58
- - Agent 3: quality focus → writes CONVENTIONS.md, TESTING.md
59
- - Agent 4: concerns focus → writes CONCERNS.md
94
+ 3. Spawn 6 parallel pan-document_code agents:
95
+ - Agent 1: tech focus → writes stack.md, integrations.md
96
+ - Agent 2: arch focus → writes architecture.md, structure.md
97
+ - Agent 3: quality focus → writes conventions.md, testing.md
98
+ - Agent 4: concerns focus → writes concerns.md
99
+ - Agent 5: relationships focus → writes relationships.md
100
+ - Agent 6: practices focus → writes best-practices.md
60
101
  4. Wait for agents to complete, collect confirmations (NOT document contents)
61
- 5. Verify all 7 documents exist with line counts
102
+ 5. Verify all 9 documents exist with line counts
62
103
  6. Commit codebase map
63
104
  7. Offer next steps (typically: /pan:new-project or /pan:plan-phase)
64
105
  </process>