pan-wizard 2.8.1 → 2.9.1

Files changed (35)
  1. package/README.md +4 -2
  2. package/bin/install.js +23 -0
  3. package/commands/pan/assumptions.md +38 -3
  4. package/commands/pan/audit-deployment.md +6 -0
  5. package/commands/pan/debug.md +71 -2
  6. package/commands/pan/exec-phase.md +90 -0
  7. package/commands/pan/focus-auto.md +181 -18
  8. package/commands/pan/focus-design.md +302 -14
  9. package/commands/pan/focus-doc-audit.md +530 -0
  10. package/commands/pan/focus-drift-walking.md +525 -0
  11. package/commands/pan/focus-exec.md +168 -46
  12. package/commands/pan/focus-plan.md +204 -12
  13. package/commands/pan/focus-scan.md +17 -5
  14. package/commands/pan/map-codebase.md +32 -6
  15. package/commands/pan/milestone-audit.md +23 -0
  16. package/commands/pan/new-project.md +64 -0
  17. package/commands/pan/pause.md +42 -1
  18. package/commands/pan/plan-phase.md +84 -0
  19. package/commands/pan/profile.md +2 -1
  20. package/commands/pan/quick.md +15 -0
  21. package/commands/pan/resume.md +62 -2
  22. package/commands/pan/verify-phase.md +42 -0
  23. package/package.json +1 -1
  24. package/pan-wizard-core/bin/lib/commands.cjs +29 -7
  25. package/pan-wizard-core/bin/lib/config.cjs +10 -0
  26. package/pan-wizard-core/bin/lib/constants.cjs +3 -1
  27. package/pan-wizard-core/bin/lib/core.cjs +168 -21
  28. package/pan-wizard-core/bin/lib/focus.cjs +5 -0
  29. package/pan-wizard-core/bin/lib/verify.cjs +283 -4
  30. package/pan-wizard-core/bin/pan-tools.cjs +11 -2
  31. package/pan-wizard-core/references/model-profiles.md +191 -62
  32. package/pan-wizard-core/workflows/help.md +11 -1
  33. package/pan-wizard-core/workflows/profile.md +8 -1
  34. package/pan-wizard-core/workflows/settings.md +14 -0
  35. package/scripts/generate-skills-docs.py +560 -0
@@ -18,6 +18,18 @@ Execute items from the current focus batch with capacity-based sizing, full sess
18
18
 
19
19
  **Goal:** One-command pipeline that starts a session, loads the planned batch, implements items with tier-based execution protocols, verifies the work, syncs documentation, and closes the session cleanly.
20
20
 
21
+ <completion_contract>
22
+ Execution is complete when ALL conditions are met:
23
+ 1. All batch items processed (each marked DONE or FAILED with reason)
24
+ 2. Full test suite passes with count >= Stage 1 baseline
25
+ 3. Stage 6 pre-commit checklist passes (all 6 checks)
26
+ 4. Commit created listing only VERIFIED items
27
+ 5. Session recorded with before/after test counts and budget usage
28
+ 6. Active scan file updated with item statuses
29
+
30
+ Execution FAILS if: test baseline cannot be established (Stage 1), or test count drops below baseline after all reverts.
31
+ </completion_contract>
32
+
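The completion contract added above can be read as a single predicate over a session record. The following is a minimal sketch under stated assumptions: `SessionResult` and all of its field names are hypothetical and chosen only to mirror the six conditions; they are not PAN Wizard's actual schema.

```python
from dataclasses import dataclass


@dataclass
class SessionResult:
    # Hypothetical record; field names are illustrative, not PAN Wizard's schema.
    item_statuses: dict      # item ID -> status string
    baseline_tests: int      # test count recorded in Stage 1
    final_tests: int         # test count after execution
    suite_passed: bool       # result of the full test suite
    checks_passed: int       # Stage 6 pre-commit checks that passed (0 to 6)
    commit_items: list       # item IDs named in the commit message
    verified_items: set      # item IDs whose tests ran and passed
    session_recorded: bool   # condition 5
    scan_updated: bool       # condition 6


def is_complete(r: SessionResult) -> bool:
    """Return True only when all six contract conditions hold."""
    return (
        all(s in ("DONE", "FAILED") for s in r.item_statuses.values())
        and r.suite_passed
        and r.final_tests >= r.baseline_tests
        and r.checks_passed == 6
        and set(r.commit_items) <= r.verified_items
        and r.session_recorded
        and r.scan_updated
    )
```

The sketch does not model the "with reason" part of condition 1 or the `--no-commit` flag; both would need extra fields.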
21
33
  ---
22
34
 
23
35
  ## Pipeline Overview
@@ -46,13 +58,33 @@ Execute items from the current focus batch with capacity-based sizing, full sess
46
58
  - Commit, record session, generate summary
47
59
  ```
48
60
 
61
+ <action_gating>
62
+ Each stage has a restricted set of appropriate actions. Using the wrong tool at the wrong stage causes regressions.
63
+
64
+ | Stage | Read | Grep/Glob | Edit/Write | Bash (tests) | Bash (git) |
65
+ |-------|------|-----------|------------|--------------|------------|
66
+ | 1. Session Start | YES | YES | NO | YES | YES |
67
+ | 2. Batch Loading | YES | YES | NO | NO | NO |
68
+ | 3. Execution | YES | YES | YES | YES | NO |
69
+ | 4. Verification | YES | YES | NO | YES | NO |
70
+ | 5. Doc Sync | YES | YES | YES | NO | NO |
71
+ | 6. Session End | YES | NO | YES | NO | YES |
72
+
73
+ **Key constraints:**
74
+ - Stage 1: NO Edit/Write — you are establishing baseline, not changing code
75
+ - Stage 2: Read-only — validating the batch, not modifying anything
76
+ - Stage 4: NO Edit/Write — you are verifying work, not doing more work. If tests fail, go back to Stage 3 to fix.
77
+ - Stage 5: Edit docs only — no code changes during doc sync
78
+ - Stage 6: Git operations + session recording only — all work must be done
79
+ </action_gating>
80
+
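The action-gating table above amounts to a lookup from stage number to permitted tool categories. A minimal sketch, assuming hypothetical category keys (`read`, `grep_glob`, `edit_write`, `bash_tests`, `bash_git`) that stand in for the table's column headers:

```python
# Permitted tool categories per stage, transcribed from the table above.
# The key names are illustrative stand-ins for the column headers.
ALLOWED = {
    1: {"read", "grep_glob", "bash_tests", "bash_git"},    # Session Start
    2: {"read", "grep_glob"},                              # Batch Loading
    3: {"read", "grep_glob", "edit_write", "bash_tests"},  # Execution
    4: {"read", "grep_glob", "bash_tests"},                # Verification
    5: {"read", "grep_glob", "edit_write"},                # Doc Sync
    6: {"read", "edit_write", "bash_git"},                 # Session End
}


def is_allowed(stage: int, action: str) -> bool:
    """Return True if the action category is permitted in the given stage."""
    return action in ALLOWED.get(stage, set())
```

For example, `is_allowed(4, "edit_write")` is false, which matches the constraint that a failing test in Stage 4 sends the work back to Stage 3.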
49
81
  ---
50
82
 
51
- ## CRITICAL: Project Scope Boundary
83
+ ## Project Scope Boundary
52
84
 
53
- This command executes work on the **host project's source code** — NOT on PAN Wizard's own infrastructure.
85
+ This command executes work on the **host project's source code** — not on PAN Wizard's own infrastructure.
54
86
 
55
- **NEVER read, modify, or "fix" files in these PAN directories:**
87
+ **Do not read, modify, or fix files in these PAN directories:**
56
88
  - `.claude/`, `.github/copilot-instructions.md`, `.opencode/`, `.gemini/`, `.codex/` — PAN runtime directories
57
89
  - Any `pan-wizard-core/`, `pan-tools`, agent `.md`, or command `.md` files within PAN runtime directories
58
90
 
@@ -60,9 +92,22 @@ This command executes work on the **host project's source code** — NOT on PAN
60
92
 
61
93
  ---
62
94
 
63
- ## MANDATORY: Execute ALL Stages Sequentially
95
+ ## Execute All Stages Sequentially
96
+
97
+ When `/pan:focus-exec` is invoked, run all 6 stages in order. Do not skip stages or stop between them unless tests regress.
98
+
99
+ <stage_dependencies>
100
+ Stage 1 → Stage 2: Baseline MUST exist before batch loads (regression detection requires it)
101
+ Stage 2 → Stage 3: Batch MUST be validated before execution begins (prevents working on stale/empty batches)
102
+ Stage 3 → Stage 4: All items MUST be processed before verification (partial verification produces false confidence)
103
+ Stage 4 → Stage 5: Tests MUST pass before doc sync (don't document broken code)
104
+ Stage 5 → Stage 6: Docs MUST be updated before commit (commit captures the complete state)
64
105
 
65
- When `/pan:focus-exec` is invoked, run ALL 6 stages in order. Do NOT skip stages. Do NOT stop between stages unless a critical failure occurs (tests regress).
106
+ HARD STOP conditions (do not proceed to next stage):
107
+ - Stage 1: Test suite fails → fix tests before proceeding
108
+ - Stage 2: No batch file found → tell user to run /pan:focus-plan
109
+ - Stage 4: Test count below baseline → revert last changes, re-verify
110
+ </stage_dependencies>
66
111
 
67
112
  **Flags:**
68
113
  - `--budget N` — Override capacity budget in points (default: 50, min: 5, max: 100)
@@ -86,34 +131,54 @@ When `/pan:focus-exec` is invoked, run ALL 6 stages in order. Do NOT skip stages
86
131
 
87
132
  ---
88
133
 
89
- ## AI Behavioral Rules (ALL 9 MANDATORY)
134
+ ## AI Behavioral Rules
90
135
 
91
- ### Rule 1: Read Before You Write (MANDATORY)
92
- Before changing ANY file, read it first. Understand context, callers, and invariants.
136
+ ### Rule 1: Read Before You Write
137
+ Before changing any file, read it first. Understand context, callers, and invariants.
93
138
 
94
- ### Rule 2: Understand the Root Cause (MANDATORY)
95
- Do NOT apply surface-level patches. Trace the code path, identify the actual defect.
139
+ **Violation example:**
140
+ ```
141
+ BAD: Rename parameter `opts` → `options` in utils.cjs without reading callers
142
+ → 3 callers in api.cjs, workers.cjs break silently
143
+ GOOD: Grep for "utils\." → read all 3 callers → confirm param name is safe to change → edit
144
+ ```
96
145
 
97
- ### Rule 3: One Change, One Test (MANDATORY)
146
+ ### Rule 2: Understand the Root Cause
147
+ Do not apply surface-level patches. Trace the code path, identify the actual defect.
148
+
149
+ **Violation example:**
150
+ ```
151
+ BAD: Test fails with "Cannot read property 'name' of undefined"
152
+ → Add `if (!obj) return null` at the crash site
153
+ → Root cause: caller passes wrong argument order — still broken
154
+ GOOD: Trace the call chain → find caller passes (id, name) but function expects (name, id) → fix caller
155
+ ```
156
+
157
+ ### Rule 3: One Change, One Test
98
158
  Every code change must be tested before moving to the next item.
99
159
 
100
160
  Test cadence by tier:
101
161
  - **MICRO (XS/S):** Run specific test after implementing. Batch up to 3 independent items before smoke.
102
- - **STANDARD (M):** Full test suite after EACH item.
103
- - **FULL (L/XL):** Build hooks + full test suite after EACH item.
104
-
105
- ### Rule 4: Don't Invent — Follow the Plan (MANDATORY)
106
- Implement exactly what the batch says. No scope creep.
107
-
108
- ### Rule 5: Cross-Platform Awareness (MANDATORY)
162
+ - **STANDARD (M):** Full test suite after each item.
163
+ - **FULL (L/XL):** Build hooks + full test suite after each item.
164
+
165
+ ### Rule 4: Don't Invent — Follow the Plan
166
+ Implement exactly what the batch says. Do not:
167
+ - Add features not in the batch item
168
+ - Refactor surrounding code that isn't broken
169
+ - Add comments or docstrings to unchanged files
170
+ - Create abstractions for one-time operations
171
+ - Add error handling for scenarios that cannot happen
172
+
173
+ ### Rule 5: Cross-Platform Awareness
109
174
  - Use platform-agnostic path APIs (no hardcoded separators)
110
175
  - Follow the project's module format conventions (discover from existing code)
111
176
  - Use file-based input for shell-sensitive content when needed
112
177
 
113
- ### Rule 6: Revert Fast, Don't Dig Deep (MANDATORY)
178
+ ### Rule 6: Revert Fast, Don't Dig Deep
114
179
  If a fix doesn't work within 5 minutes, revert and move on. Failed items carry forward.
115
180
 
116
- ### Rule 7: Verify Understanding Before Committing (MANDATORY)
181
+ ### Rule 7: Verify Understanding Before Coding
117
182
  For M/L/XL items, state your understanding before writing code:
118
183
  ```
119
184
  Item P2-3 — Add tests for billing module
@@ -123,11 +188,19 @@ Files: billing.ts, tests/billing.test.ts
123
188
  Confidence: HIGH
124
189
  ```
125
190
 
126
- ### Rule 8: Preserve Existing Test Expectations (MANDATORY)
191
+ ### Rule 8: Preserve Existing Test Expectations
127
192
  Never change an existing test's expected output to match broken code.
128
193
 
129
- ### Rule 9: Commit Messages Must Be Accurate (MANDATORY)
130
- List ONLY items that are actually VERIFIED (passed tests). Include actual test counts.
194
+ ### Rule 9: Commit Messages Must Be Accurate
195
+ List only items that are verified (passed tests). Include actual test counts.
196
+
197
+ ### Rule 10: Vary Approach for Similar Items
198
+ When a batch contains 3+ items of the same type (e.g., "add null check to X", "add null check to Y"), deliberately vary your approach to avoid tunnel vision:
199
+ - Item 1: Fix as planned
200
+ - Item 2: Before fixing, re-read the module's error handling pattern — does the same fix apply or does this module handle errors differently?
201
+ - Item 3+: Check if the first fixes introduced a pattern that should be extracted (shared helper) or if each case is genuinely independent
202
+
203
+ This catches emergent interactions: 5 "add try-catch" fixes might reveal the module needs a centralized error boundary, not 5 scattered try-catches.
131
204
 
132
205
  ---
133
206
 
@@ -185,9 +258,10 @@ Display the execution batch to user, then continue automatically.
185
258
  ```
186
259
  1. STATE UNDERSTANDING (Rule 7)
187
260
  2. READ target files + test files
188
- 3. IMPLEMENT across necessary files
189
- 4. TEST full test suite
190
- 5. CONFIRM pass -> DONE | regresses -> REVERT -> FAILED
261
+ 3. STATE INTENT "I will modify [files], adding [what], to achieve [goal]"
262
+ 4. IMPLEMENT across necessary files
263
+ 5. TEST full test suite
264
+ 6. CONFIRM — pass -> DONE | regresses -> REVERT -> FAILED
191
265
  ```
192
266
 
193
267
  #### FULL Items (L/XL)
@@ -195,20 +269,55 @@ Display the execution batch to user, then continue automatically.
195
269
  1. STATE UNDERSTANDING (detailed)
196
270
  2. READ WIDELY — target files, callers, tests, related code
197
271
  3. DESIGN — outline approach before coding
198
- 4. IMPLEMENT in logical chunks
199
- 5. BUILD build hooks if hooks changed
200
- 6. TEST full test suite
201
- 7. CONFIRM all pass -> DONE | fail -> investigate (15 min max) -> REVERT -> FAILED
272
+ 4. STATE INTENT "I will modify [files]. Risk: [what could break]"
273
+ 5. IMPLEMENT in logical chunks
274
+ 6. BUILD build hooks if hooks changed
275
+ 7. TEST full test suite
276
+ 8. CONFIRM — all pass -> DONE | fail -> investigate (15 min max) -> REVERT -> FAILED
202
277
  ```
203
278
 
204
279
  ### 3.2 Failure Handling
205
- - Build breaks: fix typo or revert (5 min limit)
206
- - Test regression: identify cause, one fix attempt, else revert
207
- - **Never let a failed item block other items**
208
280
 
209
- ### 3.3 Progress Tracking
281
+ Classify every error before acting. The classification determines the recovery protocol.
282
+
283
+ **RECOVERABLE (retry with analysis, max 3 attempts):**
284
+ - Test failure after code change — read the error output, fix the root cause, re-test
285
+ - File not found — search for moved/renamed paths via Grep/Glob
286
+ - Build failure from syntax error — fix the typo, rebuild
287
+ - Merge conflict in a non-critical file — attempt auto-resolution
288
+
289
+ **UNRECOVERABLE (halt the item, mark FAILED, move to next):**
290
+ - Same test failure persists after 3 fix attempts — revert all changes for this item
291
+ - Permission or auth error on a critical path — cannot proceed without user action
292
+ - State corruption (malformed JSON in planning files) — stop, report to user
293
+ - Persistent build failure unrelated to current item — stop execution, report
294
+ - Test regression in unrelated code — revert, flag for investigation
295
+
296
+ **Never let a failed item block other items.** Mark it FAILED with the error classification and move on.
297
+
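The recovery protocol in 3.2 can be sketched as a bounded retry loop. This is only an illustration: the error-kind strings, the `attempt` callback, and the `revert` callback are hypothetical names, and the real command is carried out by an agent rather than by a function like this.

```python
from typing import Callable, Optional

MAX_ATTEMPTS = 3

# Hypothetical labels for the RECOVERABLE class described above.
RECOVERABLE = {"test_failure", "file_not_found", "syntax_error", "merge_conflict"}


def run_item(attempt: Callable[[int], Optional[str]], revert: Callable[[], None]) -> str:
    """Run one batch item; return "DONE" or "FAILED".

    attempt(n) returns None when tests pass, otherwise an error-kind string.
    """
    for n in range(1, MAX_ATTEMPTS + 1):
        error = attempt(n)
        if error is None:
            return "DONE"
        if error not in RECOVERABLE:
            break  # unrecoverable: stop retrying immediately
    revert()  # undo all changes for this item before moving on
    return "FAILED"
```

Returning a status instead of raising keeps a failed item from blocking the rest of the batch: the caller records the result and continues with the next item.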
298
+ ### 3.3 Failure Pattern Detection
299
+ When marking an item FAILED, check if its error matches a previous failure in this batch:
300
+ - Same error type or root cause category
301
+ - Same file or module involved
302
+
303
+ If a pattern repeats (2+ items fail the same way), log it in the session record:
304
+ ```
305
+ FAILURE PATTERN: {description} — Items {ID1}, {ID2} — Root cause: {cause}
306
+ Suggested avoidance: {what to check before similar items}
307
+ ```
308
+ Before executing remaining items, check if they match the pattern. If so, skip with reason "matches known failure pattern" rather than burning budget on predictable failures.
309
+
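One way to picture the pattern check in 3.3 is to group failures by a shared key and keep the groups with two or more items. The sketch below assumes that a pattern requires both the same error type and the same module; the text above lists these as two signals without saying whether both must match, so treat that as an assumption. The tuple layout is also hypothetical.

```python
from collections import defaultdict


def find_failure_patterns(failures):
    """Group failed items by (error_type, module); keep groups of 2 or more.

    failures: iterable of (item_id, error_type, module) tuples.
    """
    groups = defaultdict(list)
    for item_id, error_type, module in failures:
        groups[(error_type, module)].append(item_id)
    return {key: ids for key, ids in groups.items() if len(ids) >= 2}
```

Remaining items whose expected failure key appears in the returned mapping are the candidates for skipping with the reason "matches known failure pattern".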
310
+ ### 3.4 Progress Tracking
210
311
  Update progress tracker after each item with status and budget tracking.
211
312
 
313
+ **Attention anchor — emit after each item completes:**
314
+ ```
315
+ Item {N}/{total} {DONE|FAILED} | Budget: {used}/{budget} pts | Tests: {baseline} → {current}
316
+ Remaining: {count} items [{IDs with sizes}]
317
+ Next: {next item ID} — {title} ({tier})
318
+ ```
319
+ This prevents lost-in-the-middle drift in large batches where the agent forgets budget limits or remaining items.
320
+
212
321
  ---
213
322
 
214
323
  ## Stage 4: Verification
@@ -254,17 +363,30 @@ Edit the active scan file:
254
363
 
255
364
  ## Stage 6: Session End
256
365
 
257
- ### 6.1 Commit Changes
366
+ ### 6.1 Pre-Commit Verification Checklist
367
+
368
+ Before committing, run through ALL checks. Do not commit until every check passes.
369
+
370
+ 1. Every modified file was read before editing (no blind writes)
371
+ 2. `git diff --stat` contains only files related to batch items (no stray changes)
372
+ 3. Full test suite passes — count matches or exceeds baseline from Stage 1
373
+ 4. No `TODO`, `FIXME`, or `HACK` introduced without a matching batch item tracking it
374
+ 5. Commit message lists only items that are VERIFIED (tests ran, tests passed)
375
+ 6. No secrets, credentials, or `.env` files staged
376
+
377
+ If any check fails: fix the issue and re-run all checks. Only proceed to commit when all 6 pass.
378
+
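Check 4 of the checklist above (no new `TODO`, `FIXME`, or `HACK`) can be approximated by scanning only the added lines of a unified diff. A minimal sketch, assuming the diff text has already been captured (for example from `git diff`); the function name is hypothetical:

```python
import re

MARKER = re.compile(r"\b(TODO|FIXME|HACK)\b")


def new_markers(diff_text: str) -> list:
    """Return added lines in a unified diff that contain TODO, FIXME, or HACK."""
    hits = []
    for line in diff_text.splitlines():
        # Added lines start with "+"; "+++" is the file header, not content.
        if line.startswith("+") and not line.startswith("+++") and MARKER.search(line):
            hits.append(line[1:].strip())
    return hits
```

An empty result satisfies the check. A non-empty result still needs a manual look to see whether a batch item tracks each marker.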
379
+ ### 6.2 Commit Changes
258
380
  Unless `--no-commit`:
259
381
  1. Stage modified files (specific paths, not `git add -A`)
260
382
  2. Create commit with accurate message listing verified items
261
383
  3. Verify commit succeeded
262
384
 
263
- ### 6.2 Record Session
385
+ ### 6.3 Record Session
264
386
  - Record session summary (items completed, tests before/after, budget used)
265
387
  - Append error patterns if any failures occurred
266
388
 
267
- ### 6.3 Final Report
389
+ ### 6.4 Final Report
268
390
 
269
391
  ```markdown
270
392
  ## /pan:focus-exec Complete
@@ -293,15 +415,15 @@ Run `/pan:focus-scan` to regenerate the scan.
293
415
 
294
416
  ## NEVER DO
295
417
 
296
- - Skip reading files before editing them (Rule 1)
297
- - Apply symptom patches instead of root cause fixes (Rule 2)
298
- - Batch implement without testing between items (Rule 3)
299
- - Expand scope beyond the batch item (Rule 4)
300
- - Ignore cross-platform path issues (Rule 5)
301
- - Spend more than 5 minutes debugging a single failure (Rule 6)
302
- - Start coding without stating understanding for M+ items (Rule 7)
303
- - Change test expectations to match broken code (Rule 8)
304
- - Claim items are fixed without running tests (Rule 9)
418
+ - Skip reading files before editing them — blind edits break callers, miss invariants, and create regressions (Rule 1)
419
+ - Apply symptom patches instead of root cause fixes — surface patches recur and erode trust in the codebase (Rule 2)
420
+ - Batch implement without testing between items — a silent failure in item 2 corrupts items 3-5 before you detect it (Rule 3)
421
+ - Expand scope beyond the batch item — unplanned changes bypass the budget system and risk compounding failures (Rule 4)
422
+ - Ignore cross-platform path issues — hardcoded separators break on Windows or vice versa (Rule 5)
423
+ - Spend more than 5 minutes debugging a single failure — diminishing returns; revert preserves budget for remaining items (Rule 6)
424
+ - Start coding without stating understanding for M+ items — misunderstanding the problem wastes the entire implementation (Rule 7)
425
+ - Change test expectations to match broken code — this hides bugs instead of fixing them (Rule 8)
426
+ - Claim items are fixed without running tests — unverified claims erode the entire verification pipeline (Rule 9)
305
427
 
306
428
  ## ALWAYS DO
307
429
 
@@ -1,19 +1,21 @@
1
1
  ---
2
2
  name: focus-plan
3
3
  group: Focus
4
- description: Create capacity-budgeted work batch with 4 execution modes
4
+ description: Create capacity-budgeted work batch with spec coverage verification and 4 execution modes
5
5
  allowed-tools:
6
6
  - Read
7
+ - Write
8
+ - Edit
7
9
  - Bash
8
10
  - Grep
9
11
  - Glob
10
12
  ---
11
13
 
12
- # /pan:focus-plan — Capacity-Budgeted Work Batch Planner
14
+ # /pan:focus-plan — Capacity-Budgeted Work Batch Planner with Spec Coverage Verification
13
15
 
14
- Create a capacity-budgeted work batch from focus-scan results. $ARGUMENTS
16
+ Create a capacity-budgeted work batch from focus-scan results **with mandatory verification that planned work covers all relevant spec and ADR requirements.** $ARGUMENTS
15
17
 
16
- **Goal:** Select a right-sized batch of work items that fits within the session's point budget, ordered for maximum impact with minimum risk.
18
+ **Goal:** Select a right-sized batch of work items that (a) fits within the session's point budget, (b) is ordered for maximum impact with minimum risk, and (c) demonstrably covers the requirements from any associated specs, ADRs, and success criteria — flagging coverage gaps BEFORE execution begins.
17
19
 
18
20
  ---
19
21
 
@@ -42,10 +44,67 @@ If no recent scan exists, run `/pan:focus-scan` automatically before proceeding.
42
44
  - `full` — Full-spectrum: enhanced budget, all priorities equally weighted (60 pts)
43
45
  - `--priority P0-P6` — Only pick items from these priority tiers
44
46
  - `--lean` — Apply RS filtering: exclude items with RS < 1.5
47
+ - `--no-spec-check` — Skip spec coverage verification (NOT recommended — use only for pure bugfix batches)
45
48
 
46
49
  ---
47
50
 
48
- ## Capacity Budget System
51
+ ## Phase 1: Spec & ADR Discovery (MANDATORY)
52
+
53
+ > *Before planning work, understand what has been designed and promised.*
54
+
55
+ ### 1.1 Scan for Specifications
56
+ Search the project for feature specifications and design documents:
57
+ - `docs/specs/*.md` or `docs/specs/**/*.md`
58
+ - `.planning/specs/` or `.planning/designs/`
59
+ - Any `*_featureai.md`, `*_spec.md`, `*_design.md` files
60
+ - README sections describing planned features
61
+
62
+ For each spec found, extract:
63
+
64
+ | Spec File | Feature Name | Status | Requirements Count | Success Criteria Count |
65
+ |-----------|-------------|--------|-------------------|----------------------|
66
+ | [path] | [name] | Proposed/In Progress/Complete | [N] | [N] |
67
+
68
+ ### 1.2 Scan for ADRs
69
+ Search for Architecture Decision Records:
70
+ - `docs/decisions/ADR-*.md`
71
+ - `.planning/decisions/`
72
+
73
+ For each ADR, extract:
74
+
75
+ | ADR | Decision | Status | Success Criteria | Implementation Tasks |
76
+ |-----|----------|--------|-----------------|---------------------|
77
+ | [ADR-NNNN] | [summary] | Proposed/Accepted/Implemented | [count or "none defined"] | [count or "none defined"] |
78
+
79
+ ### 1.3 Extract Requirement Inventory
80
+ From every spec and ADR found, build a **master requirements list**:
81
+
82
+ | Req ID | Source | Requirement | Type | Implemented? |
83
+ |--------|--------|-------------|------|-------------|
84
+ | SC-1 | ADR-0015 | JWT auth with 4-role RBAC | Feature | Yes/No/Partial |
85
+ | SC-2 | spec/extraction.md | Image extraction for JPG/PNG | Feature | Yes/No/Partial |
86
+ | T-3 | ADR-0018 §Task 6 | Unmatched description table | Task | Yes/No/Partial |
87
+ | BRK-1 | ADR-0018 §Breaking | Hierarchy roll-up for backward compat | Migration | Yes/No/Partial |
88
+
89
+ **Verification method for "Implemented?":**
90
+ - Search the codebase for files, classes, functions, routes, or tests matching each requirement
91
+ - Check if tests exist that verify the requirement
92
+ - Mark as `Partial` if code exists but tests don't, or if the feature is stubbed
93
+
94
+ ### 1.4 Identify Unimplemented Requirements
95
+ Filter the master list to requirements where `Implemented? = No` or `Partial`:
96
+
97
+ | Req ID | Source | Requirement | Gap Type | Estimated Effort |
98
+ |--------|--------|-------------|----------|-----------------|
99
+ | SC-2 | ADR-0018 | Keyword count >= 500 | Not started | M |
100
+ | T-6 | ADR-0018 | Unmatched description table | Not started | M |
101
+ | BRK-1 | ADR-0018 | Hierarchy roll-up | Partial (code, no tests) | S |
102
+
103
+ This becomes the **spec gap backlog** — items that specs/ADRs promised but the codebase doesn't deliver yet.
104
+
105
+ ---
106
+
107
+ ## Phase 2: Capacity Budget System
49
108
 
50
109
  | Size | Points | Per Session | Meaning |
51
110
  |------|--------|-------------|---------|
@@ -57,45 +116,178 @@ If no recent scan exists, run `/pan:focus-scan` automatically before proceeding.
57
116
 
58
117
  ---
59
118
 
60
- ## Execution Modes
119
+ ## Phase 3: Execution Modes & Batch Selection
61
120
 
62
121
  ### `bugfix` — Stability-First
63
122
  - **Budget:** 40 pts
64
123
  - **Algorithm:** P0 mandatory -> P1 -> P2-P4 smallest-first
65
124
  - **Feature allocation:** None
125
+ - **Spec coverage:** Verify P0/P1 items close spec gaps where applicable
66
126
 
67
127
  ### `balanced` — Mix of Fixes + Features (DEFAULT)
68
128
  - **Budget:** 50 pts
69
129
  - **Stability pass (60%):** 30 pts for P0-P2
70
130
  - **Feature pass (40%):** 20 pts for P3-P6
131
+ - **Spec coverage:** Cross-reference feature items against spec gap backlog — prefer items that close gaps
71
132
 
72
133
  ### `features` — Feature-Focused Sprint
73
134
  - **Budget:** 50 pts
74
135
  - **Mandatory pass:** All P0 items
75
136
  - **Feature pass (80%):** 40 pts for P3-P5
76
137
  - **Stability pass (20%):** 10 pts for P1-P2 quick wins
138
+ - **Spec coverage:** Feature items MUST map to spec requirements — reject unspecified feature work
77
139
 
78
140
  ### `full` — Full-Spectrum Marathon
79
141
  - **Budget:** 60 pts
80
142
  - **All priorities weighted equally, largest-impact-first**
143
+ - **Spec coverage:** Full traceability — every item maps to a spec/ADR requirement or is flagged as unspecified
144
+
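The stability/feature splits listed for the modes above are simple percentages of the budget. A sketch of the arithmetic, with the assumption that the same percentages apply when the budget is overridden; the table and function names are hypothetical. The `full` mode is left out because it weights all priorities equally rather than splitting the budget.

```python
# mode: (default budget in points, feature share in percent)
MODE_BUDGETS = {
    "bugfix": (40, 0),
    "balanced": (50, 40),
    "features": (50, 80),
}


def split_budget(mode: str, budget: int = None) -> dict:
    """Split a session budget into stability and feature points."""
    default_budget, feature_pct = MODE_BUDGETS[mode]
    total = default_budget if budget is None else budget
    feature = total * feature_pct // 100  # integer math avoids float rounding
    return {"stability": total - feature, "feature": feature}
```

With the defaults, `balanced` yields 30 stability points and 20 feature points, and `features` yields 10 and 40, matching the figures above. The sketch does not model the mandatory P0 pass of `features` mode.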
145
+ ### Batch Selection Algorithm
146
+ 1. Build candidate list from focus-scan results
147
+ 2. **For each candidate, attempt to map it to a spec/ADR requirement** (by keyword match, file overlap, or feature area)
148
+ 3. Score candidates: `impact_score = base_priority_score + spec_coverage_bonus`
149
+ - Items that close spec gaps get +2 priority bonus
150
+ - Items that close success criteria get +3 priority bonus
151
+ - Items with no spec mapping get +0 (no penalty, but no bonus)
152
+ 4. Apply mode-specific budget allocation
153
+ 5. Select items greedily by score until budget exhausted
154
+
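The scoring and greedy selection steps above can be sketched as follows. Several details are assumptions, not taken from the source: the `Candidate` fields, a numeric `base_score` where higher means more urgent, bonuses that do not stack (an item closing both a gap and a success criterion gets +3), and a greedy pass that skips an item that does not fit and keeps trying smaller ones. Mode-specific allocation (step 4) is omitted.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    item_id: str
    base_score: int                 # hypothetical numeric priority score
    points: int                     # size in budget points
    closes_gap: bool = False        # closes a spec gap (+2)
    closes_criterion: bool = False  # closes a success criterion (+3)


def impact_score(c: Candidate) -> int:
    """base_priority_score + spec_coverage_bonus, with non-stacking bonuses."""
    bonus = 3 if c.closes_criterion else 2 if c.closes_gap else 0
    return c.base_score + bonus


def select_batch(candidates, budget: int):
    """Pick items by descending score without exceeding the budget."""
    batch, used = [], 0
    for c in sorted(candidates, key=impact_score, reverse=True):
        if used + c.points <= budget:
            batch.append(c)
            used += c.points
    return batch, used
```

Because the budget is a hard limit, an item is added only when it fits entirely within the remaining points.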
155
+ ---
156
+
157
+ ## Phase 4: Spec Coverage Analysis (MANDATORY unless `--no-spec-check`)
158
+
159
+ > *The most important output of focus-plan: does the batch actually deliver against what was designed?*
160
+
161
+ ### 4.1 Coverage Matrix
162
+ For each spec/ADR requirement, show whether the batch covers it:
163
+
164
+ | Req ID | Source | Requirement | Batch Item | Coverage |
165
+ |--------|--------|-------------|-----------|----------|
166
+ | SC-1 | ADR-0018 | Category count >= 65 | #3: Expand categories | COVERED |
167
+ | SC-2 | ADR-0018 | Keyword count >= 500 | #4: Expand keywords | COVERED |
168
+ | SC-3 | ADR-0018 | Unmatched queue API | — | **GAP** |
169
+ | SC-4 | ADR-0018 | NCA affordability output | — | **GAP (deferred to v1)** |
170
+ | SC-5 | ADR-0018 | No regression | #1: Run existing tests | COVERED |
171
+
172
+ ### 4.2 Coverage Score
173
+ ```
174
+ Spec Coverage: X / Y requirements covered (Z%)
175
+ ├── Fully covered: N items
176
+ ├── Partially covered: N items (code but no tests, or tests but incomplete)
177
+ ├── Gaps: N items (not in batch)
178
+ └── Deferred: N items (explicitly deferred to future version)
179
+ ```
180
+
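The coverage score above is a count and a percentage over per-requirement statuses. A minimal sketch, assuming four hypothetical status labels and assuming that only fully covered requirements count toward X (the source does not say how partial coverage is counted):

```python
from collections import Counter


def coverage_summary(statuses):
    """Summarize requirement statuses: COVERED, PARTIAL, GAP, or DEFERRED."""
    counts = Counter(statuses)
    total = len(statuses)
    covered = counts["COVERED"]
    percent = round(100 * covered / total) if total else 0
    return {
        "covered": covered,
        "partial": counts["PARTIAL"],
        "gaps": counts["GAP"],
        "deferred": counts["DEFERRED"],
        "total": total,
        "percent": percent,
    }


def needs_warning(percent: int, spec_status: str) -> bool:
    """Flag an in-progress spec whose coverage is below 50%."""
    return spec_status == "In Progress" and percent < 50
```

For instance, 8 covered requirements out of 12 gives 67%.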
181
+ ### 4.3 Gap Analysis & Justification
182
+ For every **GAP** in the coverage matrix, provide:
183
+
184
+ | Gap | Requirement | Why Not In This Batch | When Will It Be Addressed |
185
+ |-----|------------|----------------------|--------------------------|
186
+ | SC-3 | Unmatched queue API | Exceeds budget (M=4pts, only 2pts remaining) | Next batch (features mode) |
187
+ | SC-4 | NCA affordability | Depends on SC-1 + SC-2 (must complete first) | After category expansion |
188
+
189
+ **CRITICAL:** If the coverage score is < 50% for a spec that has `Status: In Progress`, flag this prominently:
190
+ ```
191
+ ⚠️ WARNING: Batch covers only X% of [spec name] requirements.
192
+ Y requirements remain unaddressed. Consider:
193
+ - Increasing budget (--budget N)
194
+ - Switching to features mode (--mode features)
195
+ - Breaking spec into smaller milestones
196
+ ```
197
+
198
+ ### 4.4 Dependency Verification
199
+ Check that batch items respect dependency ordering from specs:
200
+
201
+ | Batch Item | Depends On | Dependency In Batch? | Order Correct? |
202
+ |-----------|-----------|---------------------|----------------|
203
+ | #4: Keywords | #3: Categories | Yes | Yes (#3 before #4) |
204
+ | #6: Suggestions | #5: Unmatched API | No — #5 not in batch | **BLOCKED** |
205
+
206
+ **If any item is BLOCKED:** Either add the dependency to the batch (if budget allows) or remove the blocked item and flag it.
207
+
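The dependency check in 4.4 can be sketched as a pass over the ordered batch. The function name, the `MISORDERED` label, and the `implemented` parameter (dependencies already present in the codebase) are assumptions added for illustration; only `BLOCKED` comes from the text above.

```python
def find_dependency_problems(batch_order, depends_on, implemented=frozenset()):
    """Return {item: "BLOCKED" | "MISORDERED"} for items with unmet dependencies.

    batch_order: item IDs in planned execution order.
    depends_on: mapping of item ID -> list of item IDs it depends on.
    """
    position = {item: i for i, item in enumerate(batch_order)}
    problems = {}
    for item, deps in depends_on.items():
        if item not in position:
            continue  # only items in the batch are checked
        for dep in deps:
            if dep in implemented:
                continue
            if dep not in position:
                problems[item] = "BLOCKED"  # BLOCKED overrides MISORDERED
            elif position[dep] > position[item]:
                problems.setdefault(item, "MISORDERED")
    return problems
```

Using the example table, a batch of `#3`, `#4`, `#6` where `#6` depends on `#5` reports `#6` as blocked, while `#4` passes because `#3` precedes it.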
208
+ ### 4.5 Success Criteria Verification Plan
209
+ For each success criterion in the batch, specify HOW it will be verified after execution:
210
+
211
+ | SC ID | Criterion | Verification Command | Expected Result |
212
+ |-------|-----------|---------------------|-----------------|
213
+ | SC-1 | Category count >= 65 | `SELECT COUNT(*) FROM stx_category` | >= 65 |
214
+ | SC-2 | Keywords >= 500 | `SELECT COUNT(*) FROM stx_keyword` | >= 500 |
215
+ | SC-5 | No regression | `dotnet test` | All pass, count >= N |
216
+
217
+ This becomes the post-execution checklist for `/pan:focus-exec`.
81
218
 
82
219
  ---
83
220
 
84
- ## Output
221
+ ## Phase 5: Output
85
222
 
86
223
  Produce a batch file at `.planning/focus/batch-<YYYY-MM-DD>.json` via `pan-tools focus plan`:
87
224
 
88
225
  ```markdown
89
226
  ## Focus Batch — <date>
90
227
  **Mode:** balanced | **Budget:** 50 pts | **Allocated:** N pts
228
+ **Specs referenced:** N specs, M ADRs
229
+ **Spec coverage:** X/Y requirements (Z%)
230
+
231
+ ### Batch Items
232
+
233
+ | # | ID | Title | Priority | Size | Pts | Tier | Track | Spec Req |
234
+ |---|----|-------|----------|------|-----|------|-------|----------|
235
+ | 1 | P0-1 | Fix crash in state cmd | P0 | S | 2 | MICRO | Stability | ADR-0005 SC-3 |
236
+ | 2 | P2-3 | Add tests for milestone | P2 | M | 4 | STANDARD | Stability | — |
237
+ | 3 | P3-1 | Expand category taxonomy | P3 | M | 4 | STANDARD | Feature | ADR-0018 SC-1 |
91
238
 
92
- | # | ID | Title | Priority | Size | Pts | Tier | Track |
93
- |---|----|-------|----------|------|-----|------|-------|
94
- | 1 | P0-1 | Fix crash in state cmd | P0 | S | 2 | MICRO | Stability |
95
- | 2 | P2-3 | Add tests for milestone | P2 | M | 4 | STANDARD | Stability |
96
- | 3 | P3-1 | Add --json flag to phase | P3 | M | 4 | STANDARD | Feature |
239
+ ### Spec Coverage Summary
240
+
241
+ | Source | Total Reqs | Covered | Gaps | Deferred |
242
+ |--------|-----------|---------|------|----------|
243
+ | ADR-0018 | 7 | 3 | 2 | 2 |
244
+ | spec/extraction.md | 5 | 5 | 0 | 0 |
245
+ | **Total** | **12** | **8 (67%)** | **2** | **2** |
246
+
247
+ ### Uncovered Requirements (Gaps)
248
+
249
+ | Req | Source | Reason | Next Batch? |
250
+ |-----|--------|--------|-------------|
251
+ | Unmatched queue API | ADR-0018 SC-3 | Budget exceeded | Yes — features mode |
252
+ | NCA affordability | ADR-0018 SC-4 | Blocked by SC-1, SC-2 | After this batch |
253
+
254
+ ### Dependency Order
255
+ ```
256
+ #1 (P0 crash fix) → independent
257
+ #3 (categories) → #4 (keywords) → #5 (match types)
258
+ #2 (tests) → independent
259
+ ```
260
+
261
+ ### Post-Execution Verification Checklist
262
+ - [ ] SC-1: Category count >= 65 → `SELECT COUNT(*) FROM stx_category`
263
+ - [ ] SC-2: Keywords >= 500 → `SELECT COUNT(*) FROM stx_keyword`
264
+ - [ ] SC-5: All existing tests pass → `dotnet test`
97
265
 
98
266
  Execution Order: MICRO first, then STANDARD, then FULL
99
267
  ```
100
268
 
101
269
  Ready for `/pan:focus-exec`.
270
+
271
+ ---
272
+
273
+ ## NEVER DO
274
+
275
+ - Plan a batch without checking specs and ADRs for coverage gaps
276
+ - Include a feature item that contradicts or conflicts with an accepted ADR
277
+ - Ignore dependency ordering defined in specs (Task A before Task B)
278
+ - Claim 100% spec coverage without actually verifying each requirement against the codebase
279
+ - Include blocked items (items whose dependencies are not in the batch and not yet implemented)
280
+ - Silently drop spec requirements — every gap must be justified and scheduled
281
+ - Plan implementation tasks that aren't traceable to a spec, ADR, scan finding, or user request
282
+ - Exceed the capacity budget (hard limit — not "approximately")
283
+
284
+ ## ALWAYS DO
285
+
286
+ - Discover ALL specs and ADRs before selecting batch items
287
+ - Cross-reference every batch item against spec requirements where applicable
288
+ - Flag coverage gaps prominently with justification and scheduling
289
+ - Verify dependency ordering matches spec-defined task dependencies
290
+ - Include a post-execution verification checklist with concrete commands
291
+ - Prefer items that close spec gaps over items with no spec mapping (when priority is equal)
292
+ - State the coverage score as a percentage in the batch header
293
+ - Report unimplemented success criteria that aren't addressed by this batch
@@ -17,11 +17,11 @@ Survey the project for prioritized work items with evidence-based scoring. $ARGU
17
17
 
18
18
  ---
19
19
 
20
- ## CRITICAL: Project Scope Boundary
20
+ ## Project Scope Boundary
21
21
 
22
- This command scans the **host project's source code** for work items — NOT PAN Wizard's own infrastructure.
22
+ This command scans the **host project's source code** for work items — not PAN Wizard's own infrastructure.
23
23
 
24
- **ALWAYS EXCLUDE these directories from scanning:**
24
+ **Exclude these directories from scanning:**
25
25
  - `.claude/`, `.github/copilot-instructions.md`, `.opencode/`, `.gemini/`, `.codex/` — PAN runtime directories
26
26
  - `.planning/` — PAN planning state (read for context, but never report PAN planning files as "issues")
27
27
  - Any `pan-wizard-core/`, `pan-tools`, agent `.md`, or command `.md` files within PAN runtime directories
@@ -32,9 +32,21 @@ If a scan finding points to a file inside `.claude/`, `.github/`, `.opencode/`,
32
32
 
33
33
  ---
34
34
 
35
- ## MANDATORY: Execute ALL Phases Automatically
35
+ ## Tool Selection Priority
36
36
 
37
- When `/pan:focus-scan` is invoked, execute ALL phases without stopping. Do NOT ask questions between phases. Do NOT skip phases. The output is a prioritized work list with Reality Score filtering.
37
+ Use the simplest sufficient tool for each scanning operation:
38
+ 1. **Grep** — for finding patterns (TODO, FIXME, error-prone code) across the codebase
39
+ 2. **Glob** — for discovering files by name pattern (test files, config files, modules)
40
+ 3. **Read** — for examining specific files identified by Grep/Glob
41
+ 4. **Bash** — only for commands that dedicated tools cannot do (git log, test runners)
42
+
43
+ Do not read entire files when Grep can find the relevant lines. Do not use Bash for searches that Grep handles.
44
+
45
+ ---
46
+
47
+ ## Execute All Phases Automatically
48
+
49
+ When `/pan:focus-scan` is invoked, execute all phases without stopping. Do not ask questions between phases or skip phases. The output is a prioritized work list with Reality Score filtering.
38
50
 
39
51
  **Flags:**
40
52
  - `--focus <area>` — Weight items toward a specific area (e.g., `--focus commands`, `--focus hooks`, `--focus tests`)