npm - pan-wizard - Versions diffs - 2.9.0 → 3.4.1 - Mend

pan-wizard 2.9.0 → 3.4.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (69)

package/README.md +8 -8
package/agents/pan-conductor.md +189 -0
package/agents/pan-counterfactual.md +112 -0
package/agents/pan-debugger.md +15 -1
package/agents/pan-document_code.md +21 -0
package/agents/pan-executor.md +16 -0
package/agents/pan-hardener.md +113 -0
package/agents/pan-integration-checker.md +2 -0
package/agents/pan-knowledge.md +81 -0
package/agents/pan-meta-reviewer.md +91 -0
package/agents/pan-plan-checker.md +2 -0
package/agents/pan-previewer.md +98 -0
package/agents/pan-project-researcher.md +4 -4
package/agents/pan-reviewer.md +2 -0
package/agents/pan-verifier.md +2 -0
package/bin/install-lib.cjs +197 -0
package/bin/install.js +1999 -1959
package/commands/pan/assumptions.md +38 -3
package/commands/pan/audit-deployment.md +6 -0
package/commands/pan/cost.md +132 -0
package/commands/pan/debug.md +71 -2
package/commands/pan/exec-phase.md +105 -0
package/commands/pan/focus-auto.md +199 -18
package/commands/pan/focus-design.md +67 -2
package/commands/pan/focus-exec.md +178 -47
package/commands/pan/focus-scan.md +17 -5
package/commands/pan/knowledge.md +129 -0
package/commands/pan/map-codebase.md +47 -6
package/commands/pan/mcp-bridge.md +145 -0
package/commands/pan/milestone-audit.md +23 -0
package/commands/pan/new-project.md +64 -0
package/commands/pan/pause.md +42 -1
package/commands/pan/plan-phase.md +95 -0
package/commands/pan/preview.md +114 -0
package/commands/pan/profile.md +37 -0
package/commands/pan/quick.md +15 -0
package/commands/pan/resume.md +62 -2
package/commands/pan/review-deep.md +128 -0
package/commands/pan/verify-phase.md +53 -0
package/commands/pan/what-if.md +146 -0
package/hooks/dist/pan-cost-logger.js +102 -0
package/hooks/dist/pan-statusline.js +154 -108
package/package.json +1 -1
package/pan-wizard-core/bin/lib/bridge.cjs +269 -0
package/pan-wizard-core/bin/lib/bus.cjs +251 -0
package/pan-wizard-core/bin/lib/codebase.cjs +118 -0
package/pan-wizard-core/bin/lib/constants.cjs +42 -1
package/pan-wizard-core/bin/lib/context-budget.cjs +27 -0
package/pan-wizard-core/bin/lib/core.cjs +91 -6
package/pan-wizard-core/bin/lib/cost.cjs +359 -0
package/pan-wizard-core/bin/lib/focus.cjs +105 -2
package/pan-wizard-core/bin/lib/init.cjs +5 -5
package/pan-wizard-core/bin/lib/knowledge.cjs +331 -0
package/pan-wizard-core/bin/lib/memory.cjs +252 -0
package/pan-wizard-core/bin/lib/phase.cjs +40 -13
package/pan-wizard-core/bin/lib/preview.cjs +480 -0
package/pan-wizard-core/bin/lib/review-deep.cjs +280 -0
package/pan-wizard-core/bin/lib/roadmap.cjs +4 -4
package/pan-wizard-core/bin/lib/state.cjs +2 -2
package/pan-wizard-core/bin/lib/verify.cjs +34 -1
package/pan-wizard-core/bin/lib/whatif.cjs +289 -0
package/pan-wizard-core/bin/pan-tools.cjs +239 -4
package/pan-wizard-core/templates/playbook.md +53 -0
package/pan-wizard-core/templates/preview-report.md +93 -0
package/pan-wizard-core/templates/roadmap.md +24 -24
package/pan-wizard-core/templates/state.md +12 -9
package/pan-wizard-core/workflows/plan-phase.md +1 -1
package/scripts/build-hooks.js +2 -1
package/scripts/generate-skills-docs.py +560 -0

package/commands/pan/focus-exec.md CHANGED

@@ -18,6 +18,18 @@ Execute items from the current focus batch with capacity-based sizing, full sess
 **Goal:** One-command pipeline that starts a session, loads the planned batch, implements items with tier-based execution protocols, verifies the work, syncs documentation, and closes the session cleanly.
+<completion_contract>
+Execution is complete when ALL conditions are met:
+1. All batch items processed (each marked DONE or FAILED with reason)
+2. Full test suite passes with count >= Stage 1 baseline
+3. Stage 6 pre-commit checklist passes (all 6 checks)
+4. Commit created listing only VERIFIED items
+5. Session recorded with before/after test counts and budget usage
+6. Active scan file updated with item statuses
+Execution FAILS if: test baseline cannot be established (Stage 1), or test count drops below baseline after all reverts.
+</completion_contract>
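The completion contract above can be read as a checklist over a run record. The following is a minimal sketch under stated assumptions: the `run` object and all of its field names (`items`, `testsBaseline`, `testsAfter`, `checklistPassed`, `commitCreated`, `sessionRecorded`, `scanUpdated`) are hypothetical and are not taken from pan-wizard. It also ignores the `--no-commit` flag.

```javascript
// Hypothetical run record; field names are illustrative, not pan-wizard's schema.
// Returns a list of unmet contract conditions (empty list = contract satisfied).
function completionProblems(run) {
  const problems = [];

  // Condition 1: every item is DONE, or FAILED with a reason.
  for (const item of run.items) {
    const ok =
      item.status === 'DONE' ||
      (item.status === 'FAILED' && Boolean(item.reason));
    if (!ok) problems.push(`item ${item.id}: needs DONE, or FAILED with a reason`);
  }

  // Condition 2: test count must not drop below the Stage 1 baseline.
  if (run.testsAfter < run.testsBaseline) {
    problems.push(`test count ${run.testsAfter} is below baseline ${run.testsBaseline}`);
  }

  // Conditions 3-6, modelled here as simple boolean flags.
  for (const flag of ['checklistPassed', 'commitCreated', 'sessionRecorded', 'scanUpdated']) {
    if (!run[flag]) problems.push(`${flag} is not satisfied`);
  }
  return problems;
}
```

An empty result means every modelled condition holds; anything else names the conditions still open.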
 ---
 ## Pipeline Overview
@@ -46,13 +58,33 @@ Execute items from the current focus batch with capacity-based sizing, full sess
     - Commit, record session, generate summary
 ```
+<action_gating>
+Each stage has a restricted set of appropriate actions. Using the wrong tool at the wrong stage causes regressions.
+| Stage | Read | Grep/Glob | Edit/Write | Bash (tests) | Bash (git) |
+|-------|------|-----------|------------|--------------|------------|
+| 1. Session Start | YES | YES | NO | YES | YES |
+| 2. Batch Loading | YES | YES | NO | NO | NO |
+| 3. Execution | YES | YES | YES | YES | NO |
+| 4. Verification | YES | YES | NO | YES | NO |
+| 5. Doc Sync | YES | YES | YES | NO | NO |
+| 6. Session End | YES | NO | YES | NO | YES |
+**Key constraints:**
+- Stage 1: NO Edit/Write — you are establishing baseline, not changing code
+- Stage 2: Read-only — validating the batch, not modifying anything
+- Stage 4: NO Edit/Write — you are verifying work, not doing more work. If tests fail, go back to Stage 3 to fix.
+- Stage 5: Edit docs only — no code changes during doc sync
+- Stage 6: Git operations + session recording only — all work must be done
+</action_gating>
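One way to make the gating table above machine-checkable is a plain lookup keyed by stage number. This is only a sketch: `STAGE_GATES`, `isAllowed`, and the action keys (`read`, `search`, `edit`, `tests`, `git`) are hypothetical names chosen for illustration, with `search` standing in for the Grep/Glob column.

```javascript
// Hypothetical encoding of the action-gating table; not part of pan-wizard.
const STAGE_GATES = {
  1: { read: true, search: true, edit: false, tests: true, git: true },   // Session Start
  2: { read: true, search: true, edit: false, tests: false, git: false }, // Batch Loading
  3: { read: true, search: true, edit: true, tests: true, git: false },   // Execution
  4: { read: true, search: true, edit: false, tests: true, git: false },  // Verification
  5: { read: true, search: true, edit: true, tests: false, git: false },  // Doc Sync
  6: { read: true, search: false, edit: true, tests: false, git: true },  // Session End
};

// Returns true when the table permits `action` during `stage`.
function isAllowed(stage, action) {
  const gates = STAGE_GATES[stage];
  if (!gates) throw new RangeError(`unknown stage: ${stage}`);
  if (!Object.prototype.hasOwnProperty.call(gates, action)) {
    throw new RangeError(`unknown action: ${action}`);
  }
  return gates[action];
}
```

A caller would check `isAllowed(4, 'edit')` before editing during verification and, on `false`, return to Stage 3 as the key constraints describe.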
 ---
-## CRITICAL: Project Scope Boundary
+## Project Scope Boundary
-This command executes work on the **host project's source code** — NOT on PAN Wizard's own infrastructure.
+This command executes work on the **host project's source code** — not on PAN Wizard's own infrastructure.
-**NEVER read, modify, or "fix" files in these PAN directories:**
+**Do not read, modify, or fix files in these PAN directories:**
 - `.claude/`, `.github/copilot-instructions.md`, `.opencode/`, `.gemini/`, `.codex/` — PAN runtime directories
 - Any `pan-wizard-core/`, `pan-tools`, agent `.md`, or command `.md` files within PAN runtime directories
@@ -60,9 +92,22 @@ This command executes work on the **host project's source code** — NOT on PAN
 ---
-## MANDATORY: Execute ALL Stages Sequentially
+## Execute All Stages Sequentially
+When `/pan:focus-exec` is invoked, run all 6 stages in order. Do not skip stages or stop between them unless tests regress.
+<stage_dependencies>
+Stage 1 → Stage 2: Baseline MUST exist before batch loads (regression detection requires it)
+Stage 2 → Stage 3: Batch MUST be validated before execution begins (prevents working on stale/empty batches)
+Stage 3 → Stage 4: All items MUST be processed before verification (partial verification produces false confidence)
+Stage 4 → Stage 5: Tests MUST pass before doc sync (don't document broken code)
+Stage 5 → Stage 6: Docs MUST be updated before commit (commit captures the complete state)
-When `/pan:focus-exec` is invoked, run ALL 6 stages in order. Do NOT skip stages. Do NOT stop between stages unless a critical failure occurs (tests regress).
+HARD STOP conditions (do not proceed to next stage):
+- Stage 1: Test suite fails → fix tests before proceeding
+- Stage 2: No batch file found → tell user to run /pan:focus-plan
+- Stage 4: Test count below baseline → revert last changes, re-verify
+</stage_dependencies>
 **Flags:**
 - `--budget N` — Override capacity budget in points (default: 50, min: 5, max: 100)
@@ -71,6 +116,7 @@ When `/pan:focus-exec` is invoked, run ALL 6 stages in order. Do NOT skip stages
 - `--dry-run` — Run Stages 1-2 only (show what WOULD be executed)
 - `--no-commit` — Skip the commit step in Stage 6
 - `--continue` — Resume a previously interrupted execution
+- `--deep-review` (v3.4+) — After each high-stakes item's execution, run `/pan:review-deep` for that item (pan-hardener + pan-meta-reviewer security + cross-check). Slows the campaign by roughly 3× per item that triggers the deep pass; use for batches touching auth/payment/migrations.
 ---
@@ -86,34 +132,54 @@ When `/pan:focus-exec` is invoked, run ALL 6 stages in order. Do NOT skip stages
 ---
-## AI Behavioral Rules (ALL 9 MANDATORY)
+## AI Behavioral Rules
-### Rule 1: Read Before You Write (MANDATORY)
-Before changing ANY file, read it first. Understand context, callers, and invariants.
+### Rule 1: Read Before You Write
+Before changing any file, read it first. Understand context, callers, and invariants.
-### Rule 2: Understand the Root Cause (MANDATORY)
-Do NOT apply surface-level patches. Trace the code path, identify the actual defect.
+**Violation example:**
+```
+BAD:  Rename parameter `opts` → `options` in utils.cjs without reading callers
+      → 3 callers in api.cjs, workers.cjs break silently
+GOOD: Grep for "utils\." → read all 3 callers → confirm param name is safe to change → edit
+```
-### Rule 3: One Change, One Test (MANDATORY)
+### Rule 2: Understand the Root Cause
+Do not apply surface-level patches. Trace the code path, identify the actual defect.
+**Violation example:**
+```
+BAD:  Test fails with "Cannot read property 'name' of undefined"
+      → Add `if (!obj) return null` at the crash site
+      → Root cause: caller passes wrong argument order — still broken
+GOOD: Trace the call chain → find caller passes (id, name) but function expects (name, id) → fix caller
+```
+### Rule 3: One Change, One Test
 Every code change must be tested before moving to the next item.
 Test cadence by tier:
 - **MICRO (XS/S):** Run specific test after implementing. Batch up to 3 independent items before smoke.
-- **STANDARD (M):** Full test suite after EACH item.
-- **FULL (L/XL):** Build hooks + full test suite after EACH item.
-### Rule 4: Don't Invent — Follow the Plan (MANDATORY)
-Implement exactly what the batch says. No scope creep.
-### Rule 5: Cross-Platform Awareness (MANDATORY)
+- **STANDARD (M):** Full test suite after each item.
+- **FULL (L/XL):** Build hooks + full test suite after each item.
+### Rule 4: Don't Invent — Follow the Plan
+Implement exactly what the batch says. Do not:
+- Add features not in the batch item
+- Refactor surrounding code that isn't broken
+- Add comments or docstrings to unchanged files
+- Create abstractions for one-time operations
+- Add error handling for scenarios that cannot happen
+### Rule 5: Cross-Platform Awareness
 - Use platform-agnostic path APIs (no hardcoded separators)
 - Follow the project's module format conventions (discover from existing code)
 - Use file-based input for shell-sensitive content when needed
-### Rule 6: Revert Fast, Don't Dig Deep (MANDATORY)
+### Rule 6: Revert Fast, Don't Dig Deep
 If a fix doesn't work within 5 minutes, revert and move on. Failed items carry forward.
-### Rule 7: Verify Understanding Before Committing (MANDATORY)
+### Rule 7: Verify Understanding Before Coding
 For M/L/XL items, state your understanding before writing code:
 ```
 Item P2-3 — Add tests for billing module
@@ -123,11 +189,19 @@ Files: billing.ts, tests/billing.test.ts
 Confidence: HIGH
 ```
-### Rule 8: Preserve Existing Test Expectations (MANDATORY)
+### Rule 8: Preserve Existing Test Expectations
 Never change an existing test's expected output to match broken code.
-### Rule 9: Commit Messages Must Be Accurate (MANDATORY)
-List ONLY items that are actually VERIFIED (passed tests). Include actual test counts.
+### Rule 9: Commit Messages Must Be Accurate
+List only items that are verified (passed tests). Include actual test counts.
+### Rule 10: Vary Approach for Similar Items
+When a batch contains 3+ items of the same type (e.g., "add null check to X", "add null check to Y"), deliberately vary your approach to avoid tunnel vision:
+- Item 1: Fix as planned
+- Item 2: Before fixing, re-read the module's error handling pattern — does the same fix apply or does this module handle errors differently?
+- Item 3+: Check if the first fixes introduced a pattern that should be extracted (shared helper) or if each case is genuinely independent
+This catches emergent interactions: 5 "add try-catch" fixes might reveal the module needs a centralized error boundary, not 5 scattered try-catches.
 ---
@@ -136,7 +210,8 @@ List ONLY items that are actually VERIFIED (passed tests). Include actual test c
 1. **Check Project Status** — git status, recent commits
 2. **Test Baseline** — run test suite, record current counts
 3. **Create rollback snapshot** — git tag for safety
-4. **Report** — Output session start summary
+4. **Prime prompt cache** — `pan-tools cache prime --summary` (once; all sub-agents in the next 5 min hit cached context)
+5. **Report** — Output session start summary
 **Record baseline:**
 ```
@@ -170,6 +245,13 @@ Display the execution batch to user, then continue automatically.
 ### 3.0 Pre-Execution Setup
 1. Cache project facts — do NOT re-read later
 2. Create/update progress tracker with the batch table
+3. Classify stages for parallel tool use:
+   ```
+   pan-tools focus classify-stages --raw
+   ```
+   The CLI reads the latest batch and returns `{waves, parallelism_hint}`. When `parallelism_hint` is `emit-micro-in-parallel` or `emit-standard-in-parallel`, all reads and greps for items in the current wave SHOULD be emitted in a single assistant turn (parallel tool calls). Opus 4.7 is markedly better at emitting parallel tool calls than earlier models; use that to collapse Stage 3 latency on MICRO-heavy batches.
+   Serialize on `FULL` tier items — each is its own wave.
 ### 3.1 Process Items by Tier
@@ -185,9 +267,10 @@ Display the execution batch to user, then continue automatically.
 ```
 1. STATE UNDERSTANDING (Rule 7)
 2. READ target files + test files
-3. IMPLEMENT across necessary files
-4. TEST — full test suite
-5. CONFIRM — pass -> DONE | regresses -> REVERT -> FAILED
+3. STATE INTENT — "I will modify [files], adding [what], to achieve [goal]"
+4. IMPLEMENT across necessary files
+5. TEST — full test suite
+6. CONFIRM — pass -> DONE | regresses -> REVERT -> FAILED
 ```
 #### FULL Items (L/XL)
@@ -195,20 +278,55 @@ Display the execution batch to user, then continue automatically.
 1. STATE UNDERSTANDING (detailed)
 2. READ WIDELY — target files, callers, tests, related code
 3. DESIGN — outline approach before coding
-4. IMPLEMENT in logical chunks
-5. BUILD — build hooks if hooks changed
-6. TEST — full test suite
-7. CONFIRM — all pass -> DONE | fail -> investigate (15 min max) -> REVERT -> FAILED
+4. STATE INTENT — "I will modify [files]. Risk: [what could break]"
+5. IMPLEMENT in logical chunks
+6. BUILD — build hooks if hooks changed
+7. TEST — full test suite
+8. CONFIRM — all pass -> DONE | fail -> investigate (15 min max) -> REVERT -> FAILED
 ```
 ### 3.2 Failure Handling
-- Build breaks: fix typo or revert (5 min limit)
-- Test regression: identify cause, one fix attempt, else revert
-- **Never let a failed item block other items**
-### 3.3 Progress Tracking
+Classify every error before acting. The classification determines the recovery protocol.
+**RECOVERABLE (retry with analysis, max 3 attempts):**
+- Test failure after code change — read the error output, fix the root cause, re-test
+- File not found — search for moved/renamed paths via Grep/Glob
+- Build failure from syntax error — fix the typo, rebuild
+- Merge conflict in a non-critical file — attempt auto-resolution
+**UNRECOVERABLE (halt the item, mark FAILED, move to next):**
+- Same test failure persists after 3 fix attempts — revert all changes for this item
+- Permission or auth error on a critical path — cannot proceed without user action
+- State corruption (malformed JSON in planning files) — stop, report to user
+- Persistent build failure unrelated to current item — stop execution, report
+- Test regression in unrelated code — revert, flag for investigation
+**Never let a failed item block other items.** Mark it FAILED with the error classification and move on.
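The classification rules above amount to a small decision function. The sketch below assumes a hypothetical set of error-kind labels (`test_failure`, `state_corruption`, and so on) that do not appear in the package. It also assumes that unknown kinds are treated as unrecoverable, which the source does not specify.

```javascript
// Hypothetical error-kind labels; pan-wizard does not define these names.
const MAX_ATTEMPTS = 3;

const RECOVERABLE = new Set([
  'test_failure',               // test fails after a code change
  'file_not_found',             // path moved or renamed
  'syntax_error',               // build failure from a typo
  'merge_conflict_noncritical', // conflict in a non-critical file
]);

const UNRECOVERABLE = new Set([
  'permission_error',
  'state_corruption',
  'unrelated_build_failure',
  'unrelated_test_regression',
]);

// `attempts` is the number of fix attempts already made for this item.
function nextAction(kind, attempts) {
  if (UNRECOVERABLE.has(kind)) return 'mark_failed';
  if (RECOVERABLE.has(kind)) {
    // After 3 failed fix attempts the item is reverted and marked FAILED.
    return attempts < MAX_ATTEMPTS ? 'retry' : 'revert_and_mark_failed';
  }
  // Assumption: anything unclassified is handled conservatively.
  return 'mark_failed';
}
```

In either failure outcome the caller moves on to the next item, so one failed item never blocks the rest of the batch.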
+### 3.3 Failure Pattern Detection
+When marking an item FAILED, check if its error matches a previous failure in this batch:
+- Same error type or root cause category
+- Same file or module involved
+If a pattern repeats (2+ items fail the same way), log it in the session record:
+```
+FAILURE PATTERN: {description} — Items {ID1}, {ID2} — Root cause: {cause}
+Suggested avoidance: {what to check before similar items}
+```
+Before executing remaining items, check if they match the pattern. If so, skip with reason "matches known failure pattern" rather than burning budget on predictable failures.
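Pattern detection as described above can be sketched as grouping failures and keeping groups with two or more items. The record shape (`id`, `category`, `file`) is hypothetical. This sketch also assumes a pattern requires both the category and the file to match, which is stricter than the text strictly requires.

```javascript
// Hypothetical failure records: { id, category, file }.
// Groups failures by (category, file) and keeps groups with 2+ items.
function findFailurePatterns(failures) {
  const groups = new Map();
  for (const { id, category, file } of failures) {
    const key = JSON.stringify([category, file]);
    if (!groups.has(key)) groups.set(key, { category, file, items: [] });
    groups.get(key).items.push(id);
  }
  return [...groups.values()].filter((group) => group.items.length >= 2);
}

// True when a not-yet-executed item looks like a known failure pattern.
function matchesKnownPattern(candidate, patterns) {
  return patterns.some(
    (p) => p.category === candidate.category && p.file === candidate.file
  );
}
```

Remaining items for which `matchesKnownPattern` returns true would be skipped with the reason "matches known failure pattern".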
+### 3.4 Progress Tracking
 Update progress tracker after each item with status and budget tracking.
+**Attention anchor — emit after each item completes:**
+```
+Item {N}/{total} {DONE|FAILED} | Budget: {used}/{budget} pts | Tests: {baseline} → {current}
+Remaining: {count} items [{IDs with sizes}]
+Next: {next item ID} — {title} ({tier})
+```
+This prevents lost-in-the-middle drift in large batches where the agent forgets budget limits or remaining items.
 ---
 ## Stage 4: Verification
@@ -254,17 +372,30 @@ Edit the active scan file:
 ## Stage 6: Session End
-### 6.1 Commit Changes
+### 6.1 Pre-Commit Verification Checklist
+Before committing, run through ALL checks. Do not commit until every check passes.
+1. Every modified file was read before editing (no blind writes)
+2. `git diff --stat` contains only files related to batch items (no stray changes)
+3. Full test suite passes — count matches or exceeds baseline from Stage 1
+4. No `TODO`, `FIXME`, or `HACK` introduced without a matching batch item tracking it
+5. Commit message lists only items that are VERIFIED (tests ran, tests passed)
+6. No secrets, credentials, or `.env` files staged
+If any check fails: fix the issue and re-run all checks. Only proceed to commit when all 6 pass.
+### 6.2 Commit Changes
 Unless `--no-commit`:
 1. Stage modified files (specific paths, not `git add -A`)
 2. Create commit with accurate message listing verified items
 3. Verify commit succeeded
-### 6.2 Record Session
+### 6.3 Record Session
 - Record session summary (items completed, tests before/after, budget used)
 - Append error patterns if any failures occurred
-### 6.3 Final Report
+### 6.4 Final Report
 ```markdown
 ## /pan:focus-exec Complete
@@ -293,15 +424,15 @@ Run `/pan:focus-scan` to regenerate the scan.
 ## NEVER DO
-- Skip reading files before editing them (Rule 1)
-- Apply symptom patches instead of root cause fixes (Rule 2)
-- Batch implement without testing between items (Rule 3)
-- Expand scope beyond the batch item (Rule 4)
-- Ignore cross-platform path issues (Rule 5)
-- Spend more than 5 minutes debugging a single failure (Rule 6)
-- Start coding without stating understanding for M+ items (Rule 7)
-- Change test expectations to match broken code (Rule 8)
-- Claim items are fixed without running tests (Rule 9)
+- Skip reading files before editing them — blind edits break callers, miss invariants, and create regressions (Rule 1)
+- Apply symptom patches instead of root cause fixes — surface patches recur and erode trust in the codebase (Rule 2)
+- Batch implement without testing between items — a silent failure in item 2 corrupts items 3-5 before you detect it (Rule 3)
+- Expand scope beyond the batch item — unplanned changes bypass the budget system and risk compounding failures (Rule 4)
+- Ignore cross-platform path issues — hardcoded separators written for one platform can break on another (for example, Windows versus POSIX) (Rule 5)
+- Spend more than 5 minutes debugging a single failure — diminishing returns; revert preserves budget for remaining items (Rule 6)
+- Start coding without stating understanding for M+ items — misunderstanding the problem wastes the entire implementation (Rule 7)
+- Change test expectations to match broken code — this hides bugs instead of fixing them (Rule 8)
+- Claim items are fixed without running tests — unverified claims erode the entire verification pipeline (Rule 9)
 ## ALWAYS DO

package/commands/pan/focus-scan.md CHANGED

@@ -17,11 +17,11 @@ Survey the project for prioritized work items with evidence-based scoring. $ARGU
 ---
-## CRITICAL: Project Scope Boundary
+## Project Scope Boundary
-This command scans the **host project's source code** for work items — NOT PAN Wizard's own infrastructure.
+This command scans the **host project's source code** for work items — not PAN Wizard's own infrastructure.
-**ALWAYS EXCLUDE these directories from scanning:**
+**Exclude these directories from scanning:**
 - `.claude/`, `.github/copilot-instructions.md`, `.opencode/`, `.gemini/`, `.codex/` — PAN runtime directories
 - `.planning/` — PAN planning state (read for context, but never report PAN planning files as "issues")
 - Any `pan-wizard-core/`, `pan-tools`, agent `.md`, or command `.md` files within PAN runtime directories
@@ -32,9 +32,21 @@ If a scan finding points to a file inside `.claude/`, `.github/`, `.opencode/`,
 ---
-## MANDATORY: Execute ALL Phases Automatically
+## Tool Selection Priority
-When `/pan:focus-scan` is invoked, execute ALL phases without stopping. Do NOT ask questions between phases. Do NOT skip phases. The output is a prioritized work list with Reality Score filtering.
+Use the simplest sufficient tool for each scanning operation:
+1. **Grep** — for finding patterns (TODO, FIXME, error-prone code) across the codebase
+2. **Glob** — for discovering files by name pattern (test files, config files, modules)
+3. **Read** — for examining specific files identified by Grep/Glob
+4. **Bash** — only for commands that dedicated tools cannot do (git log, test runners)
+Do not read entire files when Grep can find the relevant lines. Do not use Bash for searches that Grep handles.
+---
+## Execute All Phases Automatically
+When `/pan:focus-scan` is invoked, execute all phases without stopping. Do not ask questions between phases or skip phases. The output is a prioritized work list with Reality Score filtering.
 **Flags:**
 - `--focus <area>` — Weight items toward a specific area (e.g., `--focus commands`, `--focus hooks`, `--focus tests`)

package/commands/pan/knowledge.md ADDED

@@ -0,0 +1,129 @@
+---
+name: pan:knowledge
+group: Knowledge
+description: Grounded Q&A, multi-turn design discussion, and playbook generation. Three modes in one command.
+argument-hint: "ask <question> | discuss <phase> <topic> | playbook"
+allowed-tools:
+  - Read
+  - Write
+  - Bash
+  - Grep
+  - Glob
+  - Task
+---
+<objective>
+Retrieve, refine, or consolidate project knowledge. Three modes:
+- **ask** — answer a natural-language question with inline citations grounded in `.planning/` + `docs/`.
+- **discuss** — multi-turn refinement of a phase's context. Session state persists across invocations; prompt caching keeps turn 3 cheap.
+- **playbook** — aggregate all agents' memory (E-4 layer) into `.planning/playbook.md`, organized by category (Conventions / Gotchas / Decisions / Tool choices / Anti-patterns / Recurring gaps).
+Consolidates Spec B v1's X-3 converse + X-6 teach + X-10 explain into one command.
+</objective>
+<execution_context>
+@~/.claude/pan-wizard-core/bin/lib/knowledge.cjs
+@~/.claude/agents/pan-knowledge.md
+@~/.claude/pan-wizard-core/templates/playbook.md
+</execution_context>
+<modes>
+### `ask <question>`
+```
+/pan:knowledge ask "why does phase 4 have a race condition fix?"
+```
+**Flow:**
+1. `pan-tools knowledge ask "<question>"` returns a ranked list of candidate files.
+2. Spawn `pan-knowledge` with `<mode>ask</mode>`, the question, and the top sources as `<files_to_read>`.
+3. Agent reads sources, answers with citations, returns the answer to stdout. No file is written.
+**Output:** inline markdown answer with `[file.md:LINE]` and `[ADR-NNNN]` citations.
+### `discuss <phase> <topic-or-question>`
+```
+/pan:knowledge discuss 12 "should we use Redis or Memcached?"
+```
+**Flow:**
+1. `pan-tools knowledge discuss <phase> --subcmd read` loads session state from `.planning/conversations/<phase>/session.json` (empty for new phase).
+2. `pan-tools knowledge discuss <phase> --subcmd append --role user --content "<topic>"` persists the user turn.
+3. Spawn `pan-knowledge` with `<mode>discuss</mode>`, session history, phase context, and the new turn.
+4. Agent responds.
+5. `pan-tools knowledge discuss <phase> --subcmd append --role agent --content "<response>" --cites "a.md,b.md"` persists the response.
+6. If, after ≥3 substantive turns, the agent has offered to emit `context.md`, the user can either follow up with another `/pan:knowledge discuss <phase>` invocation or run the commit subcommand the agent suggested.
+**Session persistence:** `.planning/conversations/<phase>/session.json` — array of turns with ts/role/content/cites. Multi-turn cost is dominated by cache hits on stable `.planning/` files.
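The session file described above is an array of turns with `ts`, `role`, `content`, and `cites`. Here is a minimal sketch of reading and appending such turns. It is not the code in `knowledge.cjs`: the helper names are hypothetical, and the ISO-8601 timestamp format is an assumption.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Path layout taken from the text: .planning/conversations/<phase>/session.json
function sessionPath(root, phase) {
  return path.join(root, '.planning', 'conversations', String(phase), 'session.json');
}

// A new phase has no file yet, which is treated as an empty session.
function readTurns(root, phase) {
  const file = sessionPath(root, phase);
  if (!fs.existsSync(file)) return [];
  return JSON.parse(fs.readFileSync(file, 'utf8'));
}

// Appends one turn and rewrites the file. Assumption: `ts` is an ISO-8601 string.
function appendTurn(root, phase, { role, content, cites = [] }) {
  const turns = readTurns(root, phase);
  turns.push({ ts: new Date().toISOString(), role, content, cites });
  const file = sessionPath(root, phase);
  fs.mkdirSync(path.dirname(file), { recursive: true });
  fs.writeFileSync(file, JSON.stringify(turns, null, 2));
  return turns;
}
```

Rewriting the whole file on each append is simple and adequate for short design discussions; a long-running session might prefer an append-only format.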
+### `playbook`
+```
+/pan:knowledge playbook
+```
+**Flow:**
+1. `pan-tools knowledge playbook` reads all agents' memory (`.planning/memory/*.md`), clusters entries by category, writes `.planning/playbook.md` directly.
+2. Optionally spawn `pan-knowledge` with `<mode>playbook</mode>` to polish (dedupe contradictions, consolidate similar entries). Skip the polish step if the draft looks clean.
+**Output:** `.planning/playbook.md` — team-readable summary of accumulated lessons.
+**Auto-invocation:** `/pan:milestone-done` can optionally run this (flag-gated, not default). It can also be invoked manually at any time.
+</modes>
+<workflow>
+**Onboarding a new team member:** have them run `/pan:knowledge playbook` then `/pan:knowledge ask "what conventions matter in this codebase?"`.
+**Design debate:** run `/pan:knowledge discuss <phase> "<question>"` iteratively. The agent refines as the debate narrows. After convergence, accept the proposed `context.md` update.
+**Bug investigation:** `/pan:knowledge ask "why did we add the retry in phase 4?"` — faster than grepping for historical context.
+**Before milestone-done:** run `/pan:knowledge playbook` to capture what the team learned. Gives contributors something to reference when starting the next milestone.
+</workflow>
+<citation_format>
+Agent output uses bracketed citations that link to files. Supported forms:
+| Form | Example | Renders as |
+|------|---------|-----------|
+| Plain file | `[README.md]` | markdown link to the file |
+| File + line | `[docs/ARCHITECTURE.md:200]` | link to line 200 |
+| ADR | `[ADR-0015]` | link to ADR file |
+| Phase artifact | `[phase-4/summary.md]` | link to phase summary |
+The agent should NEVER fabricate citations. The retrieval layer's `sources` list is the allowlist.
+</citation_format>
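The allowlist rule above can be checked mechanically by extracting bracketed citations and comparing them with the retrieval layer's `sources` list. This is a rough sketch with hypothetical helper names. It assumes sources are listed as the same strings that appear inside the brackets (for example `ADR-0015`), and the simple pattern will also pick up other bracketed tokens such as Markdown link text.

```javascript
// Matches [source] or [source:LINE], where source contains no spaces or brackets.
const CITATION_RE = /\[([^\[\]\s]+?)(?::(\d+))?\]/g;

// Returns every citation found in `text` as { source, line }.
function extractCitations(text) {
  const found = [];
  for (const match of text.matchAll(CITATION_RE)) {
    found.push({ source: match[1], line: match[2] ? Number(match[2]) : null });
  }
  return found;
}

// Returns cited sources that are not in the allowlist.
function unlistedCitations(text, sources) {
  const allowed = new Set(sources);
  return extractCitations(text)
    .filter((citation) => !allowed.has(citation.source))
    .map((citation) => citation.source);
}
```

A non-empty result from `unlistedCitations` signals a citation that did not come from the retrieved sources and should be rejected or re-generated.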
+<runtime_compatibility>
+| Runtime | ask | discuss | playbook |
+|---------|-----|---------|----------|
+| Claude Code | Full, thinking enabled | Full, prompt caching bonus | Full |
+| OpenCode | Full | Full (no cache bonus) | Full |
+| Gemini | Full | Full | Full |
+| Codex | Full | Full | Full |
+| Copilot | Full | Full | Full |
+The data layer (retrieval, session state, playbook clustering) is pure Node.js and runtime-agnostic. Only answer synthesis quality varies with model capability.
+</runtime_compatibility>
+<privacy_note>
+`session.json` is persisted to disk and committed unless `.planning/conversations/` is gitignored. For sensitive design discussions, consider:
+```
+echo '.planning/conversations/' >> .gitignore
+```
+before starting a `discuss` session. Session turns are not auto-encrypted.
+</privacy_note>

package/commands/pan/map-codebase.md CHANGED

@@ -49,16 +49,57 @@ Check for .planning/state.md - loads context if project already initialized
 - Trivial codebases (<5 files)
 </when_to_use>
+<stage_0_ingest_mode>
+**Before spawning mapper agents**, determine whether the repo fits in a single 1M-context window.
+Run: `node ~/.claude/pan-wizard-core/bin/pan-tools.cjs codebase estimate-size --threshold 700000`
+The CLI returns `{mode, total_tokens, file_count, languages}`:
+- **`mode: "single-shot"`** — repo is small enough (≤700K tokens) for one Opus 4.7 agent to ingest the whole thing. Spawn a single `pan-document_code` agent with the full repo in context. This avoids the 6-way stitching artifacts of sharded mode (contradictory version claims, duplicated mentions, missed cross-file references).
+- **`mode: "sharded"`** — repo exceeds 700K tokens. Fall back to the default 6-way parallel sharding (tech, arch, quality, concerns, relationships, practices). Each shard gets a 200K budget.
+Record the chosen mode + telemetry in the final `.planning/codebase/overview.md` so future runs can reason about drift.
+Opus 4.7 is required for single-shot mode (only model with a 1M context window). Other models always take the sharded path regardless of size.
+</stage_0_ingest_mode>
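The mode decision above reduces to a threshold comparison plus a model capability check. The sketch below is illustrative only: `chooseIngestMode` and the `supportsLongContext` option are hypothetical names, and the real `codebase estimate-size` command may weigh other inputs.

```javascript
// Threshold taken from the text: repos at or below 700K tokens can be single-shot.
const SINGLE_SHOT_THRESHOLD = 700000;

// `supportsLongContext` is a hypothetical flag standing in for
// "the active model has a 1M-token context window".
function chooseIngestMode(totalTokens, options = {}) {
  const { threshold = SINGLE_SHOT_THRESHOLD, supportsLongContext = false } = options;
  // Models without the long context always take the sharded path.
  if (!supportsLongContext) return 'sharded';
  return totalTokens <= threshold ? 'single-shot' : 'sharded';
}
```

For example, a 450,000-token repo on a long-context model would yield `single-shot`, while the same repo on any other model would yield `sharded`.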
+<tool_priority>
+Each mapper agent should use the simplest sufficient tool:
+1. Glob — discover files by pattern (find all .ts files, config files, test files)
+2. Grep — search for patterns across the codebase (imports, exports, function names)
+3. Read — examine specific files found by Glob/Grep
+4. Bash — only for git history or commands dedicated tools cannot handle
+</tool_priority>
+<progressive_context>
+The orchestrator loads context in layers — NOT everything upfront. Mapper agents receive only what they need.
+**Orchestrator layers (before spawning agents):**
+1. **Manifest** — package.json/Cargo.toml, project identity, entry points
+2. **Structure** — top-level directory listing, file count by extension, test presence
+3. **Git summary** — recent commits (10), contributors, branch info
+**Per-agent context (each agent loads its own):**
+- Each agent starts with: project manifest + directory structure + its focus area description
+- Each agent discovers its own details via Glob/Grep/Read within its focus area
+- Agents do NOT receive other agents' output (parallel, independent)
+**Why:** Loading the entire codebase into the orchestrator before spawning agents wastes orchestrator context. Each agent has a fresh 200k window — let them explore independently. The orchestrator only needs enough context to spawn correctly and verify outputs exist.
+</progressive_context>
 <process>
 1. Check if .planning/codebase/ already exists (offer to refresh or skip)
 2. Create .planning/codebase/ directory structure
-3. Spawn 4 parallel pan-document_code agents:
-   - Agent 1: tech focus → writes STACK.md, INTEGRATIONS.md
-   - Agent 2: arch focus → writes ARCHITECTURE.md, STRUCTURE.md
-   - Agent 3: quality focus → writes CONVENTIONS.md, TESTING.md
-   - Agent 4: concerns focus → writes CONCERNS.md
+3. Spawn 6 parallel pan-document_code agents:
+   - Agent 1: tech focus → writes stack.md, integrations.md
+   - Agent 2: arch focus → writes architecture.md, structure.md
+   - Agent 3: quality focus → writes conventions.md, testing.md
+   - Agent 4: concerns focus → writes concerns.md
+   - Agent 5: relationships focus → writes relationships.md
+   - Agent 6: practices focus → writes best-practices.md
 4. Wait for agents to complete, collect confirmations (NOT document contents)
-5. Verify all 7 documents exist with line counts
+5. Verify all 9 documents exist with line counts
 6. Commit codebase map
 7. Offer next steps (typically: /pan:new-project or /pan:plan-phase)
 </process>