pan-wizard 2.9.0 → 2.9.1 (npm)

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (21)

package/README.md +1 -1
package/commands/pan/assumptions.md +38 -3
package/commands/pan/audit-deployment.md +6 -0
package/commands/pan/debug.md +71 -2
package/commands/pan/exec-phase.md +90 -0
package/commands/pan/focus-auto.md +181 -18
package/commands/pan/focus-design.md +67 -2
package/commands/pan/focus-exec.md +168 -46
package/commands/pan/focus-scan.md +17 -5
package/commands/pan/map-codebase.md +32 -6
package/commands/pan/milestone-audit.md +23 -0
package/commands/pan/new-project.md +64 -0
package/commands/pan/pause.md +42 -1
package/commands/pan/plan-phase.md +84 -0
package/commands/pan/quick.md +15 -0
package/commands/pan/resume.md +62 -2
package/commands/pan/verify-phase.md +42 -0
package/package.json +1 -1
package/pan-wizard-core/bin/lib/constants.cjs +3 -1
package/pan-wizard-core/bin/lib/focus.cjs +5 -0
package/scripts/generate-skills-docs.py +560 -0

package/commands/pan/focus-exec.md CHANGED

@@ -18,6 +18,18 @@ Execute items from the current focus batch with capacity-based sizing, full sess
 **Goal:** One-command pipeline that starts a session, loads the planned batch, implements items with tier-based execution protocols, verifies the work, syncs documentation, and closes the session cleanly.
+<completion_contract>
+Execution is complete when ALL conditions are met:
+1. All batch items processed (each marked DONE or FAILED with reason)
+2. Full test suite passes with count >= Stage 1 baseline
+3. Stage 6 pre-commit checklist passes (all 6 checks)
+4. Commit created listing only VERIFIED items
+5. Session recorded with before/after test counts and budget usage
+6. Active scan file updated with item statuses
+Execution FAILS if: test baseline cannot be established (Stage 1), or test count drops below baseline after all reverts.
+</completion_contract>
 ---
 ## Pipeline Overview
@@ -46,13 +58,33 @@ Execute items from the current focus batch with capacity-based sizing, full sess
     - Commit, record session, generate summary
 ```
+<action_gating>
+Each stage has a restricted set of appropriate actions. Using the wrong tool at the wrong stage causes regressions.
+| Stage | Read | Grep/Glob | Edit/Write | Bash (tests) | Bash (git) |
+|-------|------|-----------|------------|--------------|------------|
+| 1. Session Start | YES | YES | NO | YES | YES |
+| 2. Batch Loading | YES | YES | NO | NO | NO |
+| 3. Execution | YES | YES | YES | YES | NO |
+| 4. Verification | YES | YES | NO | YES | NO |
+| 5. Doc Sync | YES | YES | YES | NO | NO |
+| 6. Session End | YES | NO | YES | NO | YES |
+**Key constraints:**
+- Stage 1: NO Edit/Write — you are establishing baseline, not changing code
+- Stage 2: Read-only — validating the batch, not modifying anything
+- Stage 4: NO Edit/Write — you are verifying work, not doing more work. If tests fail, go back to Stage 3 to fix.
+- Stage 5: Edit docs only — no code changes during doc sync
+- Stage 6: Git operations + session recording only — all work must be done
+</action_gating>
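
Note (illustrative, not part of the diff): the stage/action table above can be read as a plain lookup. The following is a minimal Python sketch under that reading; `ACTIONS`, `ALLOWED_ACTIONS`, and `is_allowed` are hypothetical names and do not come from pan-wizard.

```python
# Hypothetical encoding of the stage/action table; not part of pan-wizard.
ACTIONS = ("read", "grep_glob", "edit_write", "bash_tests", "bash_git")

# One row per stage, in the same column order as the table.
ALLOWED_ACTIONS = {
    1: (True, True, False, True, True),    # Session Start
    2: (True, True, False, False, False),  # Batch Loading
    3: (True, True, True, True, False),    # Execution
    4: (True, True, False, True, False),   # Verification
    5: (True, True, True, False, False),   # Doc Sync
    6: (True, False, True, False, True),   # Session End
}


def is_allowed(stage: int, action: str) -> bool:
    """Return True if the table marks `action` as YES for `stage`."""
    return ALLOWED_ACTIONS[stage][ACTIONS.index(action)]
```

For example, `is_allowed(4, "edit_write")` is false, which matches the Stage 4 constraint that verification does not edit code.
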
 ---
-## CRITICAL: Project Scope Boundary
+## Project Scope Boundary
-This command executes work on the **host project's source code** — NOT on PAN Wizard's own infrastructure.
+This command executes work on the **host project's source code** — not on PAN Wizard's own infrastructure.
-**NEVER read, modify, or "fix" files in these PAN directories:**
+**Do not read, modify, or fix files in these PAN directories:**
 - `.claude/`, `.github/copilot-instructions.md`, `.opencode/`, `.gemini/`, `.codex/` — PAN runtime directories
 - Any `pan-wizard-core/`, `pan-tools`, agent `.md`, or command `.md` files within PAN runtime directories
@@ -60,9 +92,22 @@ This command executes work on the **host project's source code** — NOT on PAN
 ---
-## MANDATORY: Execute ALL Stages Sequentially
+## Execute All Stages Sequentially
+When `/pan:focus-exec` is invoked, run all 6 stages in order. Do not skip stages or stop between them unless tests regress.
+<stage_dependencies>
+Stage 1 → Stage 2: Baseline MUST exist before batch loads (regression detection requires it)
+Stage 2 → Stage 3: Batch MUST be validated before execution begins (prevents working on stale/empty batches)
+Stage 3 → Stage 4: All items MUST be processed before verification (partial verification produces false confidence)
+Stage 4 → Stage 5: Tests MUST pass before doc sync (don't document broken code)
+Stage 5 → Stage 6: Docs MUST be updated before commit (commit captures the complete state)
-When `/pan:focus-exec` is invoked, run ALL 6 stages in order. Do NOT skip stages. Do NOT stop between stages unless a critical failure occurs (tests regress).
+HARD STOP conditions (do not proceed to next stage):
+- Stage 1: Test suite fails → fix tests before proceeding
+- Stage 2: No batch file found → tell user to run /pan:focus-plan
+- Stage 4: Test count below baseline → revert last changes, re-verify
+</stage_dependencies>
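
Note (illustrative, not part of the diff): the dependency chain above amounts to running stages in order and halting at the first failed gate. A minimal Python sketch follows, assuming each stage is a callable that returns `True` on success; `Stage` and `run_pipeline` are hypothetical names. It deliberately omits the Stage 4 recovery path (revert, then re-verify).

```python
from typing import Callable, List, Tuple

# Each stage is a (name, run) pair; run() returns True on success.
Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> List[str]:
    """Run stages in order and stop at the first one that fails.

    Returns the names of the stages that completed.
    """
    completed = []
    for name, run in stages:
        if not run():
            break  # hard stop: later stages depend on this one
        completed.append(name)
    return completed
```
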
 **Flags:**
 - `--budget N` — Override capacity budget in points (default: 50, min: 5, max: 100)
@@ -86,34 +131,54 @@ When `/pan:focus-exec` is invoked, run ALL 6 stages in order. Do NOT skip stages
 ---
-## AI Behavioral Rules (ALL 9 MANDATORY)
+## AI Behavioral Rules
-### Rule 1: Read Before You Write (MANDATORY)
-Before changing ANY file, read it first. Understand context, callers, and invariants.
+### Rule 1: Read Before You Write
+Before changing any file, read it first. Understand context, callers, and invariants.
-### Rule 2: Understand the Root Cause (MANDATORY)
-Do NOT apply surface-level patches. Trace the code path, identify the actual defect.
+**Violation example:**
+```
+BAD:  Rename parameter `opts` → `options` in utils.cjs without reading callers
+      → 3 callers in api.cjs, workers.cjs break silently
+GOOD: Grep for "utils\." → read all 3 callers → confirm param name is safe to change → edit
+```
-### Rule 3: One Change, One Test (MANDATORY)
+### Rule 2: Understand the Root Cause
+Do not apply surface-level patches. Trace the code path, identify the actual defect.
+**Violation example:**
+```
+BAD:  Test fails with "Cannot read property 'name' of undefined"
+      → Add `if (!obj) return null` at the crash site
+      → Root cause: caller passes wrong argument order — still broken
+GOOD: Trace the call chain → find caller passes (id, name) but function expects (name, id) → fix caller
+```
+### Rule 3: One Change, One Test
 Every code change must be tested before moving to the next item.
 Test cadence by tier:
 - **MICRO (XS/S):** Run specific test after implementing. Batch up to 3 independent items before smoke.
-- **STANDARD (M):** Full test suite after EACH item.
-- **FULL (L/XL):** Build hooks + full test suite after EACH item.
-### Rule 4: Don't Invent — Follow the Plan (MANDATORY)
-Implement exactly what the batch says. No scope creep.
-### Rule 5: Cross-Platform Awareness (MANDATORY)
+- **STANDARD (M):** Full test suite after each item.
+- **FULL (L/XL):** Build hooks + full test suite after each item.
+### Rule 4: Don't Invent — Follow the Plan
+Implement exactly what the batch says. Do not:
+- Add features not in the batch item
+- Refactor surrounding code that isn't broken
+- Add comments or docstrings to unchanged files
+- Create abstractions for one-time operations
+- Add error handling for scenarios that cannot happen
+### Rule 5: Cross-Platform Awareness
 - Use platform-agnostic path APIs (no hardcoded separators)
 - Follow the project's module format conventions (discover from existing code)
 - Use file-based input for shell-sensitive content when needed
-### Rule 6: Revert Fast, Don't Dig Deep (MANDATORY)
+### Rule 6: Revert Fast, Don't Dig Deep
 If a fix doesn't work within 5 minutes, revert and move on. Failed items carry forward.
-### Rule 7: Verify Understanding Before Committing (MANDATORY)
+### Rule 7: Verify Understanding Before Coding
 For M/L/XL items, state your understanding before writing code:
 ```
 Item P2-3 — Add tests for billing module
@@ -123,11 +188,19 @@ Files: billing.ts, tests/billing.test.ts
 Confidence: HIGH
 ```
-### Rule 8: Preserve Existing Test Expectations (MANDATORY)
+### Rule 8: Preserve Existing Test Expectations
 Never change an existing test's expected output to match broken code.
-### Rule 9: Commit Messages Must Be Accurate (MANDATORY)
-List ONLY items that are actually VERIFIED (passed tests). Include actual test counts.
+### Rule 9: Commit Messages Must Be Accurate
+List only items that are verified (passed tests). Include actual test counts.
+### Rule 10: Vary Approach for Similar Items
+When a batch contains 3+ items of the same type (e.g., "add null check to X", "add null check to Y"), deliberately vary your approach to avoid tunnel vision:
+- Item 1: Fix as planned
+- Item 2: Before fixing, re-read the module's error handling pattern — does the same fix apply or does this module handle errors differently?
+- Item 3+: Check if the first fixes introduced a pattern that should be extracted (shared helper) or if each case is genuinely independent
+This catches emergent interactions: 5 "add try-catch" fixes might reveal the module needs a centralized error boundary, not 5 scattered try-catches.
 ---
@@ -185,9 +258,10 @@ Display the execution batch to user, then continue automatically.
 ```
 1. STATE UNDERSTANDING (Rule 7)
 2. READ target files + test files
-3. IMPLEMENT across necessary files
-4. TEST — full test suite
-5. CONFIRM — pass -> DONE | regresses -> REVERT -> FAILED
+3. STATE INTENT — "I will modify [files], adding [what], to achieve [goal]"
+4. IMPLEMENT across necessary files
+5. TEST — full test suite
+6. CONFIRM — pass -> DONE | regresses -> REVERT -> FAILED
 ```
 #### FULL Items (L/XL)
@@ -195,20 +269,55 @@ Display the execution batch to user, then continue automatically.
 1. STATE UNDERSTANDING (detailed)
 2. READ WIDELY — target files, callers, tests, related code
 3. DESIGN — outline approach before coding
-4. IMPLEMENT in logical chunks
-5. BUILD — build hooks if hooks changed
-6. TEST — full test suite
-7. CONFIRM — all pass -> DONE | fail -> investigate (15 min max) -> REVERT -> FAILED
+4. STATE INTENT — "I will modify [files]. Risk: [what could break]"
+5. IMPLEMENT in logical chunks
+6. BUILD — build hooks if hooks changed
+7. TEST — full test suite
+8. CONFIRM — all pass -> DONE | fail -> investigate (15 min max) -> REVERT -> FAILED
 ```
 ### 3.2 Failure Handling
-- Build breaks: fix typo or revert (5 min limit)
-- Test regression: identify cause, one fix attempt, else revert
-- **Never let a failed item block other items**
-### 3.3 Progress Tracking
+Classify every error before acting. The classification determines the recovery protocol.
+**RECOVERABLE (retry with analysis, max 3 attempts):**
+- Test failure after code change — read the error output, fix the root cause, re-test
+- File not found — search for moved/renamed paths via Grep/Glob
+- Build failure from syntax error — fix the typo, rebuild
+- Merge conflict in a non-critical file — attempt auto-resolution
+**UNRECOVERABLE (halt the item, mark FAILED, move to next):**
+- Same test failure persists after 3 fix attempts — revert all changes for this item
+- Permission or auth error on a critical path — cannot proceed without user action
+- State corruption (malformed JSON in planning files) — stop, report to user
+- Persistent build failure unrelated to current item — stop execution, report
+- Test regression in unrelated code — revert, flag for investigation
+**Never let a failed item block other items.** Mark it FAILED with the error classification and move on.
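
Note (illustrative, not part of the diff): the recoverable path above is a bounded retry that ends in a revert. A minimal Python sketch follows, assuming the caller supplies the fix and revert steps as callables; `process_item`, `attempt_fix`, and `revert` are hypothetical names.

```python
from enum import Enum
from typing import Callable

MAX_ATTEMPTS = 3  # from "max 3 attempts" in the added text


class Outcome(Enum):
    DONE = "DONE"
    FAILED = "FAILED"


def process_item(attempt_fix: Callable[[int], bool],
                 revert: Callable[[], None]) -> Outcome:
    """Try a fix up to MAX_ATTEMPTS times; revert and mark FAILED if none passes."""
    for attempt in range(1, MAX_ATTEMPTS + 1):
        if attempt_fix(attempt):
            return Outcome.DONE
    revert()  # the failure persisted: undo this item's changes
    return Outcome.FAILED
```

Returning an outcome instead of raising keeps one failed item from blocking the rest of the batch.
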
+### 3.3 Failure Pattern Detection
+When marking an item FAILED, check if its error matches a previous failure in this batch:
+- Same error type or root cause category
+- Same file or module involved
+If a pattern repeats (2+ items fail the same way), log it in the session record:
+```
+FAILURE PATTERN: {description} — Items {ID1}, {ID2} — Root cause: {cause}
+Suggested avoidance: {what to check before similar items}
+```
+Before executing remaining items, check if they match the pattern. If so, skip with reason "matches known failure pattern" rather than burning budget on predictable failures.
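
Note (illustrative, not part of the diff): one way to detect a repeated failure is to group FAILED items by a shared key. The sketch below assumes a pattern means the same error type and the same module together, which is a stricter reading than the two bullets above require; `Failure` and `find_patterns` are hypothetical names.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Tuple


class Failure(NamedTuple):
    item_id: str
    error_type: str  # hypothetical category label, e.g. "import-error"
    module: str      # file or module involved


def find_patterns(failures: List[Failure]) -> Dict[Tuple[str, str], List[str]]:
    """Group failures by (error_type, module); keep groups with 2+ items."""
    groups = defaultdict(list)
    for f in failures:
        groups[(f.error_type, f.module)].append(f.item_id)
    return {key: ids for key, ids in groups.items() if len(ids) >= 2}
```
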
+### 3.4 Progress Tracking
 Update progress tracker after each item with status and budget tracking.
+**Attention anchor — emit after each item completes:**
+```
+Item {N}/{total} {DONE|FAILED} | Budget: {used}/{budget} pts | Tests: {baseline} → {current}
+Remaining: {count} items [{IDs with sizes}]
+Next: {next item ID} — {title} ({tier})
+```
+This prevents lost-in-the-middle drift in large batches where the agent forgets budget limits or remaining items.
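
Note (illustrative, not part of the diff): the status block above is a fixed template, so it can be rendered by a small formatter. A minimal Python sketch follows; `attention_anchor` and its parameters are hypothetical, and `next_item` is passed in already formatted.

```python
from typing import List, Tuple


def attention_anchor(n: int, total: int, status: str, used: int, budget: int,
                     baseline: int, current: int,
                     remaining: List[Tuple[str, str]], next_item: str) -> str:
    """Render the three-line status block; `remaining` holds (item ID, size) pairs."""
    ids = ", ".join(f"{item_id} ({size})" for item_id, size in remaining)
    return (
        f"Item {n}/{total} {status} | Budget: {used}/{budget} pts | "
        f"Tests: {baseline} → {current}\n"
        f"Remaining: {len(remaining)} items [{ids}]\n"
        f"Next: {next_item}"
    )
```
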
 ---
 ## Stage 4: Verification
@@ -254,17 +363,30 @@ Edit the active scan file:
 ## Stage 6: Session End
-### 6.1 Commit Changes
+### 6.1 Pre-Commit Verification Checklist
+Before committing, run through ALL checks. Do not commit until every check passes.
+1. Every modified file was read before editing (no blind writes)
+2. `git diff --stat` contains only files related to batch items (no stray changes)
+3. Full test suite passes — count matches or exceeds baseline from Stage 1
+4. No `TODO`, `FIXME`, or `HACK` introduced without a matching batch item tracking it
+5. Commit message lists only items that are VERIFIED (tests ran, tests passed)
+6. No secrets, credentials, or `.env` files staged
+If any check fails: fix the issue and re-run all checks. Only proceed to commit when all 6 pass.
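
Note (illustrative, not part of the diff): checks 2, 4, and 6 above can be approximated mechanically. The Python sketch below assumes the staged changes are already available as a mapping from path to added lines; `precommit_failures` is a hypothetical name. It flags every new marker, which is stricter than check 4, since check 4 allows markers tracked by a batch item.

```python
import re
from typing import Dict, List, Set

MARKER = re.compile(r"\b(TODO|FIXME|HACK)\b")


def precommit_failures(staged: Dict[str, List[str]],
                       batch_files: Set[str]) -> List[str]:
    """Check staged changes; `staged` maps each path to the lines added in it."""
    failures = []
    for path, added_lines in staged.items():
        name = path.rsplit("/", 1)[-1]
        if name == ".env" or name.startswith(".env."):
            failures.append(f"check 6: {path} looks like an env file")
        if path not in batch_files:
            failures.append(f"check 2: {path} is not tied to a batch item")
        if any(MARKER.search(line) for line in added_lines):
            failures.append(f"check 4: {path} adds a TODO/FIXME/HACK marker")
    return failures
```

An empty result means these three checks pass; the remaining checks (files read before editing, test count, commit message accuracy) still need the session's own records.
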
+### 6.2 Commit Changes
 Unless `--no-commit`:
 1. Stage modified files (specific paths, not `git add -A`)
 2. Create commit with accurate message listing verified items
 3. Verify commit succeeded
-### 6.2 Record Session
+### 6.3 Record Session
 - Record session summary (items completed, tests before/after, budget used)
 - Append error patterns if any failures occurred
-### 6.3 Final Report
+### 6.4 Final Report
 ```markdown
 ## /pan:focus-exec Complete
@@ -293,15 +415,15 @@ Run `/pan:focus-scan` to regenerate the scan.
 ## NEVER DO
-- Skip reading files before editing them (Rule 1)
-- Apply symptom patches instead of root cause fixes (Rule 2)
-- Batch implement without testing between items (Rule 3)
-- Expand scope beyond the batch item (Rule 4)
-- Ignore cross-platform path issues (Rule 5)
-- Spend more than 5 minutes debugging a single failure (Rule 6)
-- Start coding without stating understanding for M+ items (Rule 7)
-- Change test expectations to match broken code (Rule 8)
-- Claim items are fixed without running tests (Rule 9)
+- Skip reading files before editing them — blind edits break callers, miss invariants, and create regressions (Rule 1)
+- Apply symptom patches instead of root cause fixes — surface patches recur and erode trust in the codebase (Rule 2)
+- Batch implement without testing between items — a silent failure in item 2 corrupts items 3-5 before you detect it (Rule 3)
+- Expand scope beyond the batch item — unplanned changes bypass the budget system and risk compounding failures (Rule 4)
+- Ignore cross-platform path issues — hardcoded separators break on Windows or vice versa (Rule 5)
+- Spend more than 5 minutes debugging a single failure — diminishing returns; revert preserves budget for remaining items (Rule 6)
+- Start coding without stating understanding for M+ items — misunderstanding the problem wastes the entire implementation (Rule 7)
+- Change test expectations to match broken code — this hides bugs instead of fixing them (Rule 8)
+- Claim items are fixed without running tests — unverified claims erode the entire verification pipeline (Rule 9)
 ## ALWAYS DO

package/commands/pan/focus-scan.md CHANGED

@@ -17,11 +17,11 @@ Survey the project for prioritized work items with evidence-based scoring. $ARGU
 ---
-## CRITICAL: Project Scope Boundary
+## Project Scope Boundary
-This command scans the **host project's source code** for work items — NOT PAN Wizard's own infrastructure.
+This command scans the **host project's source code** for work items — not PAN Wizard's own infrastructure.
-**ALWAYS EXCLUDE these directories from scanning:**
+**Exclude these directories from scanning:**
 - `.claude/`, `.github/copilot-instructions.md`, `.opencode/`, `.gemini/`, `.codex/` — PAN runtime directories
 - `.planning/` — PAN planning state (read for context, but never report PAN planning files as "issues")
 - Any `pan-wizard-core/`, `pan-tools`, agent `.md`, or command `.md` files within PAN runtime directories
@@ -32,9 +32,21 @@ If a scan finding points to a file inside `.claude/`, `.github/`, `.opencode/`,
 ---
-## MANDATORY: Execute ALL Phases Automatically
+## Tool Selection Priority
-When `/pan:focus-scan` is invoked, execute ALL phases without stopping. Do NOT ask questions between phases. Do NOT skip phases. The output is a prioritized work list with Reality Score filtering.
+Use the simplest sufficient tool for each scanning operation:
+1. **Grep** — for finding patterns (TODO, FIXME, error-prone code) across the codebase
+2. **Glob** — for discovering files by name pattern (test files, config files, modules)
+3. **Read** — for examining specific files identified by Grep/Glob
+4. **Bash** — only for commands that dedicated tools cannot do (git log, test runners)
+Do not read entire files when Grep can find the relevant lines. Do not use Bash for searches that Grep handles.
+---
+## Execute All Phases Automatically
+When `/pan:focus-scan` is invoked, execute all phases without stopping. Do not ask questions between phases or skip phases. The output is a prioritized work list with Reality Score filtering.
 **Flags:**
 - `--focus <area>` — Weight items toward a specific area (e.g., `--focus commands`, `--focus hooks`, `--focus tests`)

package/commands/pan/map-codebase.md CHANGED

@@ -49,16 +49,42 @@ Check for .planning/state.md - loads context if project already initialized
 - Trivial codebases (<5 files)
 </when_to_use>
+<tool_priority>
+Each mapper agent should use the simplest sufficient tool:
+1. Glob — discover files by pattern (find all .ts files, config files, test files)
+2. Grep — search for patterns across the codebase (imports, exports, function names)
+3. Read — examine specific files found by Glob/Grep
+4. Bash — only for git history or commands dedicated tools cannot handle
+</tool_priority>
+<progressive_context>
+The orchestrator loads context in layers — NOT everything upfront. Mapper agents receive only what they need.
+**Orchestrator layers (before spawning agents):**
+1. **Manifest** — package.json/Cargo.toml, project identity, entry points
+2. **Structure** — top-level directory listing, file count by extension, test presence
+3. **Git summary** — recent commits (10), contributors, branch info
+**Per-agent context (each agent loads its own):**
+- Each agent starts with: project manifest + directory structure + its focus area description
+- Each agent discovers its own details via Glob/Grep/Read within its focus area
+- Agents do NOT receive other agents' output (parallel, independent)
+**Why:** Loading the entire codebase into the orchestrator before spawning agents wastes orchestrator context. Each agent has a fresh 200k window — let them explore independently. The orchestrator only needs enough context to spawn correctly and verify outputs exist.
+</progressive_context>
 <process>
 1. Check if .planning/codebase/ already exists (offer to refresh or skip)
 2. Create .planning/codebase/ directory structure
-3. Spawn 4 parallel pan-document_code agents:
-   - Agent 1: tech focus → writes STACK.md, INTEGRATIONS.md
-   - Agent 2: arch focus → writes ARCHITECTURE.md, STRUCTURE.md
-   - Agent 3: quality focus → writes CONVENTIONS.md, TESTING.md
-   - Agent 4: concerns focus → writes CONCERNS.md
+3. Spawn 6 parallel pan-document_code agents:
+   - Agent 1: tech focus → writes stack.md, integrations.md
+   - Agent 2: arch focus → writes architecture.md, structure.md
+   - Agent 3: quality focus → writes conventions.md, testing.md
+   - Agent 4: concerns focus → writes concerns.md
+   - Agent 5: relationships focus → writes relationships.md
+   - Agent 6: practices focus → writes best-practices.md
 4. Wait for agents to complete, collect confirmations (NOT document contents)
-5. Verify all 7 documents exist with line counts
+5. Verify all 9 documents exist with line counts
 6. Commit codebase map
 7. Offer next steps (typically: /pan:new-project or /pan:plan-phase)
 </process>

package/commands/pan/milestone-audit.md CHANGED

@@ -31,6 +31,29 @@ Glob: .planning/phases/*/*-summary.md
 Glob: .planning/phases/*/*-verification.md
 </context>
+<citation_requirement>
+Every coverage judgment in the audit MUST cite evidence from the codebase.
+**Before writing any requirement as "covered" or "not covered", verify by reading the code.**
+**Grounding rules:**
+- "Covered" requires: file:line where the requirement is implemented + verification.md or test evidence
+- "Partially covered" requires: file:line showing what exists + specific gap description with expected location
+- "Not covered" requires: grep showing the expected functionality doesn't exist (show the search and empty result)
+- Cross-phase integration claims require: file:line in phase A's output + file:line in phase B's consumer
+**Anti-pattern:**
+```
+BAD:  "Requirement R3 is covered — the billing module handles this"
+      → Which file? Which function? How do you know?
+GOOD: "Requirement R3 is covered — generateInvoice() at src/billing.ts:42 implements line-item
+       calculation. Verified in phase-2-verification.md (line 18). Integration: called from
+       src/api/orders.ts:156 (phase 3)."
+```
+Do not trust summary files at face value. If a verification.md says "all tests pass" but you haven't confirmed the test count, that claim is ungrounded. Spot-check at least 2 verification files by running the actual tests.
+</citation_requirement>
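
Note (illustrative, not part of the diff): a coverage claim can be screened for a `file:line` reference before it is accepted. The Python sketch below is only a first filter, and the regular expression is an assumption about what a citation looks like; `has_citation` is a hypothetical name. It cannot tell whether the cited line actually implements the requirement.

```python
import re

# Matches path:line citations such as src/billing.ts:42 (pattern is an assumption).
CITATION = re.compile(r"[\w./-]+\.\w+:\d+")


def has_citation(claim: str) -> bool:
    """Return True if the claim contains at least one file:line reference."""
    return CITATION.search(claim) is not None
```
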
 <process>
 Execute the audit-milestone workflow from @~/.claude/pan-wizard-core/workflows/milestone-audit.md end-to-end.
 Preserve all workflow gates (scope determination, verification reading, integration check, requirements coverage, routing).

package/commands/pan/new-project.md CHANGED

@@ -37,6 +37,70 @@ Initialize a new project through unified flow: questioning → research (optiona
 @~/.claude/pan-wizard-core/templates/requirements.md
 </execution_context>
+<progressive_context>
+Load context in layers — do NOT read everything upfront. Each layer builds on the previous.
+**Layer 1: Manifest (always load first)**
+- package.json / Cargo.toml / pyproject.toml — project identity, deps, scripts
+- .planning/ existence check — is this a fresh start or existing project?
+- README.md first 50 lines — what the project claims to be
+**Layer 2: Structure (load during questioning)**
+- Directory tree (Glob top-level patterns) — understand project shape
+- Entry points — main files, index files, server files
+- Test infrastructure — test framework, test directory
+**Layer 3: Hotspots (load during research, if research is enabled)**
+- Most-changed files (git log --name-only) — where active work happens
+- Largest files — complexity centers
+- Import graph roots — most-depended-on modules
+**Layer 4: Baselines (load only when generating requirements/roadmap)**
+- Test count + pass rate
+- Build status
+- Dependency audit (outdated, vulnerable)
+**Why layered:** Loading everything at Layer 1 wastes 40-60% of context on information not needed until later. For greenfield projects, Layers 3-4 are empty and should be skipped entirely.
+</progressive_context>
+<routing_decision_tree>
+Use this decision tree to select the correct path. Evaluate conditions top-to-bottom; take the FIRST match.
+```
+IF .planning/ already exists AND contains project.md:
+  → WARN: "Project already initialized. Use /pan:resume to continue."
+  → STOP (do not overwrite existing project)
+ELSE IF --auto flag AND @ reference document provided:
+  → ASK config questions only (commit_docs, model_profile)
+  → SKIP interactive questioning (use the @ document as project context)
+  → RUN research automatically
+  → GENERATE requirements from research + @ document
+  → GENERATE roadmap from requirements
+  → No further interaction until complete
+ELSE IF --auto flag WITHOUT @ reference:
+  → ERROR: "--auto requires an @ referenced idea document"
+  → STOP
+ELSE (interactive mode — default):
+  → RUN questioning flow (5-area deep questioning)
+  → ASK: "Should I research the domain ecosystem?" (Y/N)
+    → IF Y: spawn researchers → synthesize → continue
+    → IF N: skip research → continue
+  → PRESENT requirements for approval
+  → PRESENT roadmap for approval
+  → COMMIT if commit_docs=true
+```
+**Research routing:**
+```
+IF user says research: spawn pan-project-researcher agents
+IF user declines research: skip directly to requirements generation
+IF codebase already has substantial code: suggest skipping research (existing code IS the context)
+```
+</routing_decision_tree>
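
Note (illustrative, not part of the diff): the first-match rule of the decision tree above maps directly onto an ordered chain of conditions. A minimal Python sketch follows; `route_new_project` and its parameters are hypothetical, and `project_exists` stands for ".planning/ exists and contains project.md".

```python
def route_new_project(project_exists: bool, auto: bool, has_reference: bool) -> str:
    """Return the first matching branch of the decision tree, top to bottom."""
    if project_exists:
        return "stop: already initialized, use /pan:resume"
    if auto and has_reference:
        return "auto: config questions, research, requirements, roadmap"
    if auto:
        return "error: --auto requires an @ referenced idea document"
    return "interactive: questioning, optional research, approvals"
```

Because the existing-project check comes first, `--auto` never overwrites an initialized project.
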
 <process>
 Execute the new-project workflow from @~/.claude/pan-wizard-core/workflows/new-project.md end-to-end.
 Preserve all workflow gates (validation, approvals, commits, routing).

package/commands/pan/pause.md CHANGED

@@ -27,13 +27,54 @@ Routes to the pause-work workflow which handles:
 State and phase progress are gathered in-workflow with targeted reads.
 </context>
+<handoff_schema>
+The `.continue-here.md` file MUST contain ALL of the following sections. Missing sections cause resume failures.
+```yaml
+# Required fields for .continue-here.md
+session_id: "{date}-{slug}"           # Unique session identifier
+paused_at: "{ISO-8601 timestamp}"     # When work was paused
+phase: "{phase number and name}"      # Current phase being worked on
+plan: "{plan file path}"              # Which plan was active
+position:
+  last_completed_task: "{task ID}"    # Last task that was fully done
+  next_task: "{task ID}"              # What to do next
+  wave: "{wave number, if applicable}"
+progress:
+  tasks_done: [{id, title, status}]   # All completed tasks this session
+  tasks_remaining: [{id, title}]      # What's left in the plan
+  test_baseline: "{N passing}"        # Test count when session started
+  test_current: "{N passing}"         # Test count at pause time
+decisions:
+  - "{decision made and why}"         # Choices that affect remaining work
+blockers:
+  - "{blocker description}"           # Anything preventing progress
+context:
+  files_modified: ["{paths}"]         # Files changed this session
+  key_findings: ["{findings}"]        # Non-obvious discoveries
+  next_action: "{specific action}"    # Exact first step on resume
+```
+**Why every field matters:**
+- `position` → resume agent knows WHERE to start (not re-reading the whole plan)
+- `progress` → resume agent knows test baseline (detects regressions vs pre-existing)
+- `decisions` → resume agent won't re-debate settled questions
+- `blockers` → resume agent can flag to user immediately instead of rediscovering
+- `context.next_action` → resume agent's first action is productive, not exploratory
+</handoff_schema>
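
Note (illustrative, not part of the diff): since missing sections are said to cause resume failures, a handoff could be checked for required fields before it is written. The Python sketch below assumes the handoff has already been parsed into a dictionary; `REQUIRED` and `missing_fields` are hypothetical names. `position.wave` is left out because the schema marks it "if applicable".

```python
from typing import Any, Dict, List

# Top-level keys and required nested keys, taken from the schema above.
REQUIRED = {
    "session_id": [], "paused_at": [], "phase": [], "plan": [],
    "position": ["last_completed_task", "next_task"],
    "progress": ["tasks_done", "tasks_remaining", "test_baseline", "test_current"],
    "decisions": [], "blockers": [],
    "context": ["files_modified", "key_findings", "next_action"],
}


def missing_fields(handoff: Dict[str, Any]) -> List[str]:
    """List dotted names of required fields absent from a parsed handoff."""
    missing = []
    for key, children in REQUIRED.items():
        if key not in handoff:
            missing.append(key)
            continue
        for child in children:
            if child not in handoff[key]:
                missing.append(f"{key}.{child}")
    return missing
```
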
 <process>
 **Follow the pause-work workflow** from `@~/.claude/pan-wizard-core/workflows/pause.md`.
 The workflow handles all logic including:
 1. Phase directory detection
 2. State gathering with user clarifications
-3. Handoff file writing with timestamp
+3. Handoff file writing with timestamp — **using the schema from `<handoff_schema>`**
 4. Git commit
 5. Confirmation with resume instructions
 </process>

package/commands/pan/plan-phase.md CHANGED

@@ -40,6 +40,90 @@ Phase number: $ARGUMENTS (optional — auto-detects next unplanned phase if omit
 Normalize phase input in step 2 before any directory lookups.
 </context>
+<reflexion_loop>
+During the plan-checker verification iteration:
+1. Read the plan-checker's critique carefully
+2. For each identified gap: verify it is a genuine gap by re-reading the relevant requirement
+3. Do not blindly accept all critiques — some may be false positives from missing context
+4. Revise the plan to address genuine gaps only
+5. Maximum 2 revision iterations (plan → check → revise → check → final)
+This prevents over-revision while ensuring real gaps are closed.
+</reflexion_loop>
+<completion_contract>
+Planning is complete when ALL conditions are met:
+1. At least one plan.md file created in the phase directory
+2. Plan-checker passed (or max 2 revision iterations exhausted with final approval)
+3. Each plan contains: objective, task breakdown with estimates, dependency ordering, and key file links
+4. Research.md exists (unless --skip-research was used)
+5. User presented with results and next-step options
+Planning FAILS if: phase not found in roadmap, or planner agent returns empty/malformed output after retries.
+</completion_contract>
+<common_mistakes>
+Avoid these planning anti-patterns:
+```
+BAD:  Plan has 25 tasks for a single phase → too granular, executor loses context
+GOOD: 5-8 tasks per plan, each with clear scope and testable outcome
+BAD:  Task says "Implement the feature" with no file links or acceptance criteria
+      → Executor guesses at scope, misses edge cases
+GOOD: Task says "Add retry logic to api/client.ts:fetchData() — 3 retries with exponential backoff, tested by tests/client.test.ts"
+BAD:  Plan-checker flags a gap → blindly add a task without re-reading the requirement
+      → False positive becomes unnecessary work
+GOOD: Re-read the requirement → confirm the gap is real → then add the task
+```
+</common_mistakes>
+<routing_decision_tree>
+Use this decision tree to select the correct path. Evaluate conditions top-to-bottom; take the FIRST match.
+```
+IF --gaps flag is set:
+  → SKIP research (gap closure uses verification.md instead)
+  → READ verification.md for the phase
+  → PLAN with gap context
+  → VERIFY (unless --skip-verify)
+ELSE IF --prd <file> flag is set:
+  → SKIP discuss-phase entirely
+  → PARSE PRD file into context.md
+  → SKIP research (PRD provides requirements)
+  → PLAN from parsed requirements
+  → VERIFY (unless --skip-verify)
+ELSE IF --skip-research flag is set:
+  → SKIP research
+  → PLAN directly (must have roadmap context)
+  → VERIFY (unless --skip-verify)
+ELSE IF research.md already exists AND --research NOT set:
+  → SKIP research (reuse existing)
+  → PLAN using existing research.md
+  → VERIFY (unless --skip-verify)
+ELSE (default path):
+  → RUN research (spawn pan-phase-researcher)
+  → PLAN from research results
+  → VERIFY (unless --skip-verify)
+```
+**Verification loop routing:**
+```
+IF --skip-verify:
+  → Present plan, done
+ELSE:
+  → Spawn pan-plan-checker
+  → IF checker PASSES: done
+  → IF checker finds gaps (iteration 1): revise plan, re-check
+  → IF checker finds gaps (iteration 2): final revision, present with caveats
+  → Max 2 revision iterations
+```
+</routing_decision_tree>
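
Note (illustrative, not part of the diff): across the five branches above, the only thing that varies before planning is whether research runs. A minimal Python sketch of that one decision follows; `needs_research` and its parameters are hypothetical, with `force_research` standing for the `--research` flag.

```python
def needs_research(gaps: bool = False, prd: bool = False,
                   skip_research: bool = False, research_exists: bool = False,
                   force_research: bool = False) -> bool:
    """Return True only when the decision tree reaches a branch that runs research."""
    if gaps or prd or skip_research:
        return False
    if research_exists and not force_research:
        return False
    return True
```

Note that `--gaps` takes precedence: with the conditions evaluated top to bottom, `--research` has no effect once the gap-closure branch matches.
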
 <process>
 Execute the plan-phase workflow from @~/.claude/pan-wizard-core/workflows/plan-phase.md end-to-end.
 Preserve all workflow gates (validation, research, planning, verification loop, routing).

package/commands/pan/quick.md CHANGED

@@ -39,4 +39,19 @@ Context files are resolved inside the workflow (`init quick`) and delegated via
 <process>
 Execute the quick workflow from @~/.claude/pan-wizard-core/workflows/quick.md end-to-end.
 Preserve all workflow gates (validation, task description, planning, execution, state updates, commits).
+**Scope Containment:**
+Implement only what was asked. Do not refactor surrounding code, add unrelated improvements, or create abstractions for one-time fixes.
+**State Intent Before Implementing:**
+Before coding, state: "I will modify [files], adding [what], to achieve [goal]."
+**Pre-Commit Verification Checklist — apply before the final commit:**
+1. Every modified file was read before editing
+2. `git diff --stat` contains only files related to the task
+3. Tests pass (run the project's test suite)
+4. Commit message accurately describes the verified change
+5. No secrets or credentials staged
+If any check fails: fix and re-verify before committing.
 </process>