@sienklogic/plan-build-run 2.37.0 → 2.38.1

Files changed (66)
  1. package/CHANGELOG.md +27 -0
  2. package/package.json +1 -1
  3. package/plugins/copilot-pbr/agents/audit.agent.md +1 -0
  4. package/plugins/copilot-pbr/agents/codebase-mapper.agent.md +1 -0
  5. package/plugins/copilot-pbr/agents/debugger.agent.md +3 -0
  6. package/plugins/copilot-pbr/agents/dev-sync.agent.md +23 -0
  7. package/plugins/copilot-pbr/agents/executor.agent.md +1 -0
  8. package/plugins/copilot-pbr/agents/integration-checker.agent.md +7 -4
  9. package/plugins/copilot-pbr/agents/planner.agent.md +27 -1
  10. package/plugins/copilot-pbr/agents/researcher.agent.md +4 -1
  11. package/plugins/copilot-pbr/agents/verifier.agent.md +29 -12
  12. package/plugins/copilot-pbr/commands/test.md +5 -0
  13. package/plugins/copilot-pbr/plugin.json +1 -1
  14. package/plugins/copilot-pbr/references/plan-authoring.md +28 -0
  15. package/plugins/copilot-pbr/references/verification-patterns.md +44 -17
  16. package/plugins/copilot-pbr/skills/config/SKILL.md +12 -2
  17. package/plugins/copilot-pbr/skills/health/SKILL.md +13 -5
  18. package/plugins/copilot-pbr/skills/setup/SKILL.md +9 -1
  19. package/plugins/copilot-pbr/skills/shared/context-budget.md +10 -0
  20. package/plugins/copilot-pbr/skills/shared/universal-anti-patterns.md +6 -0
  21. package/plugins/copilot-pbr/skills/test/SKILL.md +210 -0
  22. package/plugins/cursor-pbr/.cursor-plugin/plugin.json +1 -1
  23. package/plugins/cursor-pbr/agents/audit.md +1 -0
  24. package/plugins/cursor-pbr/agents/codebase-mapper.md +1 -0
  25. package/plugins/cursor-pbr/agents/debugger.md +3 -0
  26. package/plugins/cursor-pbr/agents/dev-sync.md +23 -0
  27. package/plugins/cursor-pbr/agents/executor.md +1 -0
  28. package/plugins/cursor-pbr/agents/integration-checker.md +7 -4
  29. package/plugins/cursor-pbr/agents/planner.md +27 -1
  30. package/plugins/cursor-pbr/agents/researcher.md +4 -1
  31. package/plugins/cursor-pbr/agents/verifier.md +29 -12
  32. package/plugins/cursor-pbr/commands/test.md +5 -0
  33. package/plugins/cursor-pbr/references/plan-authoring.md +28 -0
  34. package/plugins/cursor-pbr/references/verification-patterns.md +44 -17
  35. package/plugins/cursor-pbr/skills/config/SKILL.md +12 -2
  36. package/plugins/cursor-pbr/skills/health/SKILL.md +14 -5
  37. package/plugins/cursor-pbr/skills/setup/SKILL.md +9 -1
  38. package/plugins/cursor-pbr/skills/shared/context-budget.md +10 -0
  39. package/plugins/cursor-pbr/skills/shared/universal-anti-patterns.md +6 -0
  40. package/plugins/cursor-pbr/skills/test/SKILL.md +211 -0
  41. package/plugins/pbr/.claude-plugin/plugin.json +1 -1
  42. package/plugins/pbr/agents/audit.md +1 -0
  43. package/plugins/pbr/agents/codebase-mapper.md +1 -0
  44. package/plugins/pbr/agents/debugger.md +3 -0
  45. package/plugins/pbr/agents/dev-sync.md +23 -0
  46. package/plugins/pbr/agents/executor.md +1 -0
  47. package/plugins/pbr/agents/integration-checker.md +7 -4
  48. package/plugins/pbr/agents/planner.md +27 -1
  49. package/plugins/pbr/agents/researcher.md +4 -1
  50. package/plugins/pbr/agents/verifier.md +29 -12
  51. package/plugins/pbr/commands/test.md +5 -0
  52. package/plugins/pbr/references/plan-authoring.md +28 -0
  53. package/plugins/pbr/references/verification-patterns.md +44 -17
  54. package/plugins/pbr/scripts/context-bridge.js +15 -9
  55. package/plugins/pbr/scripts/lib/config.js +96 -3
  56. package/plugins/pbr/scripts/lib/core.js +9 -0
  57. package/plugins/pbr/scripts/lib/migrate.js +169 -0
  58. package/plugins/pbr/scripts/lib/todo.js +300 -0
  59. package/plugins/pbr/scripts/pbr-tools.js +82 -3
  60. package/plugins/pbr/skills/config/SKILL.md +12 -2
  61. package/plugins/pbr/skills/health/SKILL.md +14 -3
  62. package/plugins/pbr/skills/help/SKILL.md +2 -0
  63. package/plugins/pbr/skills/setup/SKILL.md +9 -1
  64. package/plugins/pbr/skills/shared/context-budget.md +10 -0
  65. package/plugins/pbr/skills/shared/universal-anti-patterns.md +6 -0
  66. package/plugins/pbr/skills/test/SKILL.md +212 -0
@@ -1,6 +1,7 @@
1
1
  ---
2
2
  name: health
3
3
  description: "Check planning directory integrity. Find and fix corrupted state."
4
+ argument-hint: "[--repair]"
4
5
  ---
5
6
 
6
7
  ## Step 0 — Immediate Output
@@ -17,10 +18,16 @@ Then proceed to Step 1.
17
18
 
18
19
  # /pbr:health — Planning Directory Diagnostics
19
20
 
20
- You are running the **health** skill. Your job is to validate the integrity of the `.planning/` directory, report problems, and suggest targeted fixes. You never auto-repair anything.
21
+ You are running the **health** skill. Your job is to validate the integrity of the `.planning/` directory, report problems, and suggest targeted fixes.
21
22
 
22
23
  This skill runs **inline**. It is read-only by default, but offers an optional **auto-fix** flow for common corruption patterns (see the Auto-Fix section below).
23
24
 
25
+ ## Argument Parsing
26
+
27
+ Check if the user passed `--repair`:
28
+ - `--repair`: Skip the AskUserQuestion prompt in the Auto-Fix section and automatically apply ALL fixes (equivalent to selecting "Fix all"). Still create backups before any destructive operations.
29
+ - No flag: Use the interactive AskUserQuestion flow as described below (default behavior).
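The flag check described in the two bullets above can be sketched as follows. This is a minimal illustration, assuming the skill's arguments arrive as one string (the `$ARGUMENTS` convention the test skill in this diff uses); `parseHealthArgs` and the field names it returns are hypothetical and not part of `pbr-tools.js`.

```javascript
// Hypothetical helper: choose between auto-repair and the interactive flow.
function parseHealthArgs(rawArgs) {
  // Split on whitespace so "--repair" must match as a whole token.
  const tokens = String(rawArgs || "").trim().split(/\s+/).filter(Boolean);
  const repair = tokens.includes("--repair");
  return {
    repair,
    // With --repair the AskUserQuestion prompt is skipped ("Fix all").
    mode: repair ? "fix-all" : "interactive",
    // Backups are still created before destructive operations in both modes.
    backupFirst: true,
  };
}

console.log(parseHealthArgs("--repair"));
console.log(parseHealthArgs(""));
```

Matching whole tokens rather than substrings keeps a similar-looking argument such as `--repairs` from silently enabling auto-repair.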
30
+
24
31
  ---
25
32
 
26
33
  ## How Checks Work
@@ -184,7 +191,11 @@ cp .planning/STATE.md .planning/backups/STATE-$(date +%Y%m%dT%H%M%S).md
184
191
 
185
192
  This ensures the user can recover the original STATE.md if the fix produces incorrect results.
186
193
 
187
- 1. Count the auto-fixable issues and present:
194
+ 1. Count the auto-fixable issues.
195
+
196
+ **If `--repair` flag was passed**: Skip the question and go directly to "Fix all" (step 2). Display: "Auto-repair mode: applying {N} fixes..."
197
+
198
+ **Otherwise**: Present the choice:
188
199
 
189
200
  Use AskUserQuestion:
190
201
  question: "Found {N} auto-fixable issues. How should we handle them?"
@@ -194,7 +205,7 @@ This ensures the user can recover the original STATE.md if the fix produces inco
194
205
  - label: "Review each" description: "Show each fix and confirm individually"
195
206
  - label: "Skip" description: "Do nothing — just report"
196
207
 
197
- 2. If "Fix all": Apply all fixes in order, then display a summary:
208
+ 2. If "Fix all" (or `--repair`): Apply all fixes in order, then display a summary:
198
209
  ```
199
210
  Auto-fix results:
200
211
  - Fixed: {description of fix 1}
@@ -213,8 +224,6 @@ This ensures the user can recover the original STATE.md if the fix produces inco
213
224
 
214
225
  4. If "Skip": Do nothing, continue to the rest of the output.
215
226
 
216
- **Note:** When auto-fix is active, the health skill is no longer strictly read-only. The `allowed-tools` frontmatter must include `Write` and `AskUserQuestion` for auto-fix to work. Update the frontmatter accordingly.
217
-
218
227
  ---
219
228
 
220
229
  ## Bonus: Recent Decisions
@@ -71,7 +71,15 @@ mkdir -p .planning/phases .planning/todos/pending .planning/todos/done .planning
71
71
 
72
72
  **CRITICAL: Write .planning/config.json NOW. Do NOT skip this step.**
73
73
 
74
- Create `.planning/config.json` with defaults:
74
+ Before writing config.json, check for user-level defaults:
75
+
76
+ ```bash
77
+ node "${PLUGIN_ROOT}/scripts/pbr-tools.js" config load-defaults
78
+ ```
79
+
80
+ If user defaults exist (the output has keys like `mode`, `features`, etc.), inform the user: "Found your saved preferences from ~/.claude/pbr-defaults.json. These will be merged into your project config (project-specific settings take precedence)." Then deep-merge user defaults into the config below before writing.
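A deep merge of this kind might look like the sketch below. It is an assumption-laden illustration, not the package's implementation: `deepMerge` and `isPlainObject` are hypothetical names, the sample values (including the `"interactive"` mode) are invented, and the second argument is assumed to win so that project-specific settings take precedence over user defaults.

```javascript
// Hypothetical deep merge: values from `override` win over `base`.
function isPlainObject(value) {
  return value !== null && typeof value === "object" && !Array.isArray(value);
}

function deepMerge(base, override) {
  const out = { ...base };
  for (const [key, value] of Object.entries(override)) {
    // Recurse only when both sides hold plain objects; otherwise replace.
    out[key] = isPlainObject(value) && isPlainObject(base[key])
      ? deepMerge(base[key], value)
      : value;
  }
  return out;
}

// Illustrative values only; the real key set lives in config.json.
const userDefaults = { mode: "interactive", features: { tdd_mode: true } };
const projectConfig = { version: 2, features: { tdd_mode: false } };

console.log(deepMerge(userDefaults, projectConfig));
```

Arrays are replaced wholesale rather than concatenated here, which is one common choice for config files; a different policy may suit list-valued settings.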
81
+
82
+ Create `.planning/config.json` with defaults (merged with user defaults if they exist):
75
83
  ```json
76
84
  {
77
85
  "version": 2,
@@ -17,6 +17,16 @@ Every skill that spawns agents or reads significant content must follow these ru
17
17
  4. **Delegate** heavy work to agents — the orchestrator routes, it doesn't execute
18
18
  5. **Before spawning agents**: If you've already consumed significant context (large file reads, multiple subagent results), warn the user: "Context budget is getting heavy. Consider running `/pbr:pause` to checkpoint progress." Suggest pause proactively rather than waiting for compaction.
19
19
 
20
+ ## Context Degradation Awareness
21
+
22
+ Quality degrades gradually before panic thresholds fire. Watch for these early warning signs:
23
+
24
+ - **Silent partial completion** — agent claims task is done but implementation is incomplete. Self-check catches file existence but not semantic completeness. Always verify agent output meets the plan's must_haves, not just that files exist.
25
+ - **Increasing vagueness** — agent starts using phrases like "appropriate handling" or "standard patterns" instead of specific code. This indicates context pressure even before budget warnings fire.
26
+ - **Skipped steps** — agent omits protocol steps it would normally follow. If an agent's success criteria has 8 items but it only reports 5, suspect context pressure.
27
+
28
+ When delegating to agents, the orchestrator cannot verify semantic correctness of agent output — only structural completeness. This is a fundamental limitation. Mitigate with must_haves.truths and spot-check verification.
29
+
20
30
  ## Customization
21
31
 
22
32
  Skills should add skill-specific rules below the reference line. Common skill-specific additions:
@@ -36,3 +36,9 @@ These rules prevent context rot -- quality degradation as the context window fil
36
36
  14. **Do not** suggest multiple next actions without clear priority -- one primary suggestion, alternatives listed secondary.
37
37
  15. **Do not** use `git add .` or `git add -A` -- stage specific files only.
38
38
  16. **Do not** include sensitive information (API keys, passwords, tokens) in planning documents or commits.
39
+
40
+ ## Error Recovery Rules (apply to every skill)
41
+
42
+ 17. **Git lock detection**: If a git operation fails with "Unable to create lock file", check for a stale `.git/index.lock` and advise the user to remove it (do not remove it automatically, since another process may legitimately hold the lock).
43
+ 18. **Config fallback awareness**: `configLoad()` returns `null` silently on invalid JSON. If your skill depends on config values, check for null and warn the user: "config.json is invalid or missing — running with defaults. Run `/pbr:health` to diagnose."
44
+ 19. **Partial state recovery**: If STATE.md references a phase directory that doesn't exist, do not proceed silently. Warn the user and suggest `/pbr:health` to diagnose the mismatch.
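Rule 18 above can be illustrated with a caller-side guard. This is only a sketch: `loadConfigOrNull` is a hypothetical stand-in that mimics the described behaviour of `configLoad()` (returning `null` on missing or invalid JSON), and `configWithWarning` is an invented wrapper; the warning wording is adapted from the rule.

```javascript
const fs = require("node:fs");

// Hypothetical stand-in for configLoad(): null on missing or invalid JSON.
function loadConfigOrNull(path) {
  try {
    return JSON.parse(fs.readFileSync(path, "utf8"));
  } catch {
    return null;
  }
}

// Caller-side guard: warn instead of silently running with defaults.
function configWithWarning(path, defaults, warn = console.warn) {
  const config = loadConfigOrNull(path);
  if (config === null) {
    warn("config.json is invalid or missing. Running with defaults; run /pbr:health to diagnose.");
    return { config: defaults, usedDefaults: true };
  }
  return { config, usedDefaults: false };
}

console.log(configWithWarning(".planning/config.json", { version: 2 }));
```

Returning a `usedDefaults` flag alongside the config lets a skill surface the fallback in its own output instead of relying on the warning alone.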
@@ -0,0 +1,211 @@
1
+ ---
2
+ name: test
3
+ description: "Generate tests for completed phase code. Detects test framework and targets key files."
4
+ argument-hint: "<phase-number>"
5
+ ---
6
+
7
+ **STOP — DO NOT READ THIS FILE. You are already reading it. This prompt was injected into your context by the plugin system. Using the Read tool on this SKILL.md file wastes ~7,600 tokens. Begin executing Step 1 immediately.**
8
+
9
+ # /pbr:test — Post-Phase Test Generation
10
+
11
+ You are the orchestrator for `/pbr:test`. This skill generates tests for code that was built WITHOUT TDD mode. It targets key files from completed phases and creates meaningful test coverage.
12
+
13
+ ## Context Budget
14
+
15
+ Reference: `skills/shared/context-budget.md` for the universal orchestrator rules.
16
+
17
+ Additionally for this skill:
18
+ - **Delegate** all test writing to executor agents — never write test code in the main context
19
+ - Read only SUMMARY.md frontmatter for `key_files` lists — do not read full summaries
20
+
21
+ ## Step 0 — Immediate Output
22
+
23
+ **Before ANY tool calls**, display this banner:
24
+
25
+ ```
26
+ ╔══════════════════════════════════════════════════════════════╗
27
+ ║ PLAN-BUILD-RUN ► GENERATING TESTS FOR PHASE {N} ║
28
+ ╚══════════════════════════════════════════════════════════════╝
29
+ ```
30
+
31
+ Where `{N}` is the phase number from `$ARGUMENTS`. Then proceed to Step 1.
32
+
33
+ ## Prerequisites
34
+
35
+ - `.planning/config.json` exists
36
+ - Phase has been built: SUMMARY.md files exist in `.planning/phases/{NN}-{slug}/`
37
+ - Phase should NOT already have TDD coverage (check if `features.tdd_mode` is false in config — if TDD mode is enabled, warn user that tests should already exist and ask to proceed anyway)
38
+
39
+ ---
40
+
41
+ ## Argument Parsing
42
+
43
+ Parse `$ARGUMENTS` according to `skills/shared/phase-argument-parsing.md`.
44
+
45
+ | Argument | Meaning |
46
+ |----------|---------|
47
+ | `3` | Generate tests for phase 3 |
48
+ | (no number) | Use current phase from STATE.md |
49
+
50
+ ---
51
+
52
+ ## Step 1 — Gather Context
53
+
54
+ **CRITICAL: Run init command to load project state efficiently.**
55
+
56
+ ```bash
57
+ node "${PLUGIN_ROOT}/scripts/pbr-tools.js" init execute-phase {phase_number}
58
+ ```
59
+
60
+ This returns a STATE.md snapshot, the phase plans, a ROADMAP excerpt, and the config in a single call.
61
+
62
+ ## Step 2 — Detect Test Framework
63
+
64
+ Scan the project root for test framework indicators:
65
+
66
+ 1. Check `package.json` for `jest`, `vitest`, `mocha`, `ava` in devDependencies
67
+ 2. Check for `pytest.ini`, `pyproject.toml` (with `[tool.pytest]`), `setup.cfg` (with `[tool:pytest]`)
68
+ 3. Check for `jest.config.*`, `vitest.config.*`, `.mocharc.*`
69
+ 4. Check for existing test directories: `tests/`, `test/`, `__tests__/`, `spec/`
70
+ 5. Check for existing test file patterns: `*.test.*`, `*.spec.*`, `test_*.py`
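Checks 1 to 3 in the list above could be sketched roughly as follows. `detectTestFramework` is a hypothetical function, and the inputs are passed in (a parsed `package.json` object and a list of file names in the project root) so the sketch needs no filesystem access. It deliberately skips `pyproject.toml` and `setup.cfg`, which would require inspecting file contents.

```javascript
// Hypothetical detector for the dependency and config-file checks.
const JS_FRAMEWORKS = ["jest", "vitest", "mocha", "ava"];

function detectTestFramework(packageJson, fileNames) {
  // Check 1: a known framework listed in devDependencies.
  const devDeps = (packageJson && packageJson.devDependencies) || {};
  const fromDeps = JS_FRAMEWORKS.find((name) => name in devDeps);
  if (fromDeps) return fromDeps;

  // Checks 2 and 3: framework config files in the project root.
  if (fileNames.some((f) => f.startsWith("jest.config."))) return "jest";
  if (fileNames.some((f) => f.startsWith("vitest.config."))) return "vitest";
  if (fileNames.some((f) => f.startsWith(".mocharc."))) return "mocha";
  if (fileNames.includes("pytest.ini")) return "pytest";

  return null; // nothing found: fall back to asking the user
}

console.log(detectTestFramework({ devDependencies: { vitest: "^1.0.0" } }, []));
console.log(detectTestFramework({}, ["pytest.ini"]));
```

A `null` result corresponds to the "no test framework detected" branch, where the skill asks the user which framework to use.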
71
+
72
+ If no test framework is detected, ask the user:
73
+
74
+ Use AskUserQuestion:
75
+ question: "No test framework detected. Which should I use?"
76
+ header: "Framework"
77
+ options:
78
+ - label: "Jest" description: "JavaScript/TypeScript testing (most common)"
79
+ - label: "Vitest" description: "Vite-native testing (faster, ESM-friendly)"
80
+ - label: "pytest" description: "Python testing framework"
81
+ multiSelect: false
82
+
83
+ ## Step 3 — Collect Target Files
84
+
85
+ Read SUMMARY.md frontmatter from each plan in the phase to extract `key_files`:
86
+
87
+ ```bash
88
+ node "${PLUGIN_ROOT}/scripts/pbr-tools.js" frontmatter .planning/phases/{NN}-{slug}/SUMMARY.md
89
+ ```
90
+
91
+ Collect all `key_files` across all plans in the phase. Filter to only source files (exclude config, docs, assets). Group by:
92
+ - **High priority**: Files with business logic, API endpoints, data models
93
+ - **Medium priority**: Utility functions, helpers, middleware
94
+ - **Low priority**: Config, types-only files, constants
95
+
96
+ Present the file list to the user:
97
+
98
+ Use AskUserQuestion:
99
+ question: "Found {N} source files from phase {P}. Generate tests for which?"
100
+ header: "Scope"
101
+ options:
102
+ - label: "High priority only" description: "{X} files — business logic, APIs, models"
103
+ - label: "High + Medium" description: "{Y} files — adds utilities and helpers"
104
+ - label: "All files" description: "{Z} files — comprehensive coverage"
105
+ multiSelect: false
106
+
107
+ ## Step 4 — Generate Test Plans
108
+
109
+ For each target file, create a lightweight test plan (NOT a full PBR PLAN.md — just a task list):
110
+
111
+ ```
112
+ File: src/auth/login.js
113
+ Tests to generate:
114
+ - Happy path: valid credentials return token
115
+ - Error: invalid password returns 401
116
+ - Error: missing email returns 400
117
+ - Edge: expired session handling
118
+ Framework: jest
119
+ Output: tests/auth/login.test.js
120
+ ```
121
+
122
+ ## Step 5 — Spawn Executor Agents
123
+
124
+ **CRITICAL: Delegate ALL test writing to agents. Do NOT write test code in the main context.**
125
+
126
+ For each target file (or batch of related files), spawn an executor agent:
127
+
128
+ ```
129
+ Spawn agent_type: "pbr:executor"
130
+
131
+ Task: Generate tests for the following file(s):
132
+
133
+ <files_to_test>
134
+ {file_path}: {brief description from SUMMARY}
135
+ </files_to_test>
136
+
137
+ <test_framework>
138
+ {detected framework name and version}
139
+ Existing test directory: {path}
140
+ Test file naming: {pattern, e.g., *.test.js}
141
+ </test_framework>
142
+
143
+ <test_plan>
144
+ {test plan from Step 4}
145
+ </test_plan>
146
+
147
+ Instructions:
148
+ 1. Read each source file to understand the implementation
149
+ 2. Write test files following the project's existing test patterns
150
+ 3. Each test file should cover: happy path, error cases, edge cases
151
+ 4. Use the project's existing mocking patterns if any exist
152
+ 5. Run the tests to verify they pass: {test command}
153
+ 6. Commit with format: test({phase}-tests): add tests for {file}
154
+ ```
155
+
156
+ Spawn up to `parallelization.max_concurrent_agents` agents in parallel for independent files.
157
+
158
+ ## Step 6 — Verify and Report
159
+
160
+ After all agents complete, check results:
161
+
162
+ 1. Glob for new test files created in this session
163
+ 2. Run the test suite to verify all new tests pass:
164
+ ```bash
165
+ {test_command}
166
+ ```
167
+ 3. Count: files tested, tests written, tests passing
168
+
169
+ Display completion:
170
+
171
+ ```
172
+ ╔══════════════════════════════════════════════════════════════╗
173
+ ║ PLAN-BUILD-RUN ► TESTS GENERATED ✓ ║
174
+ ╚══════════════════════════════════════════════════════════════╝
175
+
176
+ Phase {N}: {X} test files created, {Y} tests passing
177
+
178
+ Files tested:
179
+ - src/auth/login.js → tests/auth/login.test.js (8 tests)
180
+ - src/api/users.js → tests/api/users.test.js (12 tests)
181
+
182
+
183
+
184
+ ╔══════════════════════════════════════════════════════════════╗
185
+ ║ ▶ NEXT UP ║
186
+ ╚══════════════════════════════════════════════════════════════╝
187
+
188
+ **Run coverage check** to see how much is covered
189
+
190
+ `npm test -- --coverage`
191
+
192
+ <sub>`/clear` first → fresh context window</sub>
193
+
194
+
195
+
196
+ **Also available:**
197
+ - `/pbr:review {N}` — verify the full phase
198
+ - `/pbr:continue` — execute next logical step
199
+
200
+
201
+ ```
202
+
203
+ ---
204
+
205
+ ## Anti-Patterns
206
+
207
+ 1. **DO NOT** write test code in the main orchestrator context — always delegate to executor agents
208
+ 2. **DO NOT** generate tests for files not listed in SUMMARY.md key_files — stay scoped to the phase
209
+ 3. **DO NOT** skip running the tests — always verify they pass before reporting success
210
+ 4. **DO NOT** generate trivial tests (testing getters/setters, testing constants) — focus on behavior
211
+ 5. **DO NOT** read full source files in the orchestrator — let the executor agents read them
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "pbr",
3
- "version": "2.37.0",
3
+ "version": "2.38.1",
4
4
  "description": "Plan-Build-Run — Structured development workflow for Claude Code. Solves context rot through disciplined subagent delegation, structured planning, atomic execution, and goal-backward verification.",
5
5
  "author": {
6
6
  "name": "SienkLogic",
@@ -228,3 +228,4 @@ CRITICAL: Your final output MUST end with exactly one completion marker.
228
228
  Orchestrators pattern-match on these markers to route results. Omitting causes silent failures.
229
229
 
230
230
  - `## AUDIT COMPLETE` - audit report written to .planning/audits/
231
+ - `## AUDIT FAILED` - could not complete audit (no session logs found, unreadable JSONL)
@@ -131,6 +131,7 @@ CRITICAL: Your final output MUST end with exactly one completion marker.
131
131
  Orchestrators pattern-match on these markers to route results. Omitting causes silent failures.
132
132
 
133
133
  - `## MAPPING COMPLETE` - analysis document written to output path
134
+ - `## MAPPING FAILED` - could not complete analysis (empty project, inaccessible files)
134
135
 
135
136
  ---
136
137
 
@@ -34,6 +34,9 @@ You are **debugger**, the systematic debugging agent. Investigate bugs using the
34
34
  - [ ] Evidence log maintained (append-only)
35
35
  - [ ] Scientific method followed (hypothesis, test, observe)
36
36
  - [ ] Fix committed with root cause in body (if fix mode)
37
+ - [ ] Fix verification: original issue no longer reproduces
38
+ - [ ] Fix verification: regression tests pass (existing tests still green)
39
+ - [ ] Fix verification: no environment-specific assumptions introduced
37
40
  - [ ] Debug file updated with current status
38
41
  - [ ] Completion marker returned
39
42
  </success_criteria>
@@ -118,3 +118,26 @@ Copied verbatim (no transformations needed).
118
118
  6. DO NOT leave `argument-hint` in Copilot skills
119
119
  7. DO NOT consume more than 50% context before producing output
120
120
  8. DO NOT spawn sub-agents — this agent performs only file read/write operations
121
+
122
+ ---
123
+
124
+ <success_criteria>
125
+ - [ ] Source file(s) read from plugins/pbr/
126
+ - [ ] File type determined (skill, agent, reference, shared, template)
127
+ - [ ] Transformations applied per rules table
128
+ - [ ] Cursor derivative written with correct format (no allowed-tools, ${PLUGIN_ROOT})
129
+ - [ ] Copilot derivative written with correct format (.agent.md extension, no model/memory)
130
+ - [ ] Derivative-specific content preserved (not overwritten)
131
+ - [ ] Sync report returned with files modified and transformations applied
132
+ - [ ] Completion marker returned
133
+ </success_criteria>
134
+
135
+ ---
136
+
137
+ ## Completion Protocol
138
+
139
+ CRITICAL: Your final output MUST end with exactly one completion marker.
140
+ Orchestrators pattern-match on these markers to route results. Omitting causes silent failures.
141
+
142
+ - `## SYNC COMPLETE` - all derivatives updated
143
+ - `## SYNC FAILED` - could not complete sync, reason provided
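How an orchestrator might pattern-match on these markers is sketched below. This is a guess at the routing logic, not the package's code: `parseCompletionMarker` and the `MARKER` pattern are hypothetical, and the pattern only covers the `COMPLETE` and `FAILED` suffixes, not markers such as `## PLANNING INCONCLUSIVE` or `## CHECKPOINT REACHED`.

```javascript
// Hypothetical router: find the last "## <NAME> COMPLETE|FAILED" line.
const MARKER = /^## ([A-Z][A-Z ]*?) (COMPLETE|FAILED)\b/;

function parseCompletionMarker(agentOutput) {
  const lines = agentOutput.trimEnd().split("\n");
  // Scan from the end, since the marker is expected to close the output.
  for (let i = lines.length - 1; i >= 0; i--) {
    const match = MARKER.exec(lines[i].trim());
    if (match) return { agent: match[1], status: match[2] };
  }
  return null; // no marker: surface this instead of failing silently
}

console.log(parseCompletionMarker("Derivatives updated.\n## SYNC COMPLETE"));
console.log(parseCompletionMarker("Report body with no marker"));
```

Treating a missing marker as an explicit `null` gives the caller a chance to report the omission, which is the silent-failure case the protocol text warns about.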
@@ -374,6 +374,7 @@ Record timestamps at start and end using `node -e "console.log(new Date().toISOS
374
374
  - [ ] All tasks executed (or checkpoint state returned)
375
375
  - [ ] Each task committed individually with proper format
376
376
  - [ ] All deviations documented in SUMMARY.md
377
+ - [ ] All requirement_ids from PLAN frontmatter copied to SUMMARY requirements-completed
377
378
  - [ ] SUMMARY.md created with substantive content (not placeholder)
378
379
  - [ ] Self-check performed: all key_files exist on disk
379
380
  - [ ] Self-check performed: all commits present in git log
@@ -153,6 +153,7 @@ See `references/integration-patterns.md` for grep/search patterns by framework.
153
153
 
154
154
  ### Agent-Specific
155
155
  - Never attempt to fix issues — you REPORT them
156
+ - ALWAYS include specific file paths and line numbers in every finding — never say "the config module" without a path
156
157
  - Imports are not usage — verify symbols are actually called
157
158
  - "File exists" is not "component is integrated"
158
159
  - Auth middleware existing somewhere does not mean routes are protected
@@ -164,11 +165,12 @@ See `references/integration-patterns.md` for grep/search patterns by framework.
164
165
  ---
165
166
 
166
167
  <success_criteria>
167
- - [ ] All 5 check categories evaluated
168
- - [ ] Cross-phase dependencies verified
169
- - [ ] E2E flows traced end-to-end
168
+ - [ ] All check categories evaluated (export/import, API routes, auth, E2E flows, cross-phase deps, data-flow)
169
+ - [ ] Cross-phase dependencies verified (provides/consumes chains satisfied)
170
+ - [ ] E2E flows traced end-to-end with specific file paths as evidence
170
171
  - [ ] Export/import wiring confirmed
171
- - [ ] Critical issues documented with evidence
172
+ - [ ] Requirements integration map: every requirement traced to implementation with wiring status
173
+ - [ ] Critical issues documented with evidence (file paths, line numbers)
172
174
  - [ ] INTEGRATION-REPORT.md written
173
175
  - [ ] Completion marker returned
174
176
  </success_criteria>
@@ -181,3 +183,4 @@ CRITICAL: Your final output MUST end with exactly one completion marker.
181
183
  Orchestrators pattern-match on these markers to route results. Omitting causes silent failures.
182
184
 
183
185
  - `## INTEGRATION CHECK COMPLETE` - report written with pass/fail status
186
+ - `## INTEGRATION CHECK FAILED` - could not complete checks (missing artifacts, no phases to check)
@@ -47,6 +47,17 @@ Invoked with plan-checker feedback containing issues. Revise flagged plan(s) to
47
47
  ### Mode 4: Roadmap Mode
48
48
  Invoked with a request to create/update the project roadmap. Produce `.planning/ROADMAP.md` using the template at `${CLAUDE_PLUGIN_ROOT}/templates/ROADMAP.md.tmpl`.
49
49
 
50
+ #### Requirement Coverage Validation
51
+
52
+ Before writing ROADMAP.md, cross-reference REQUIREMENTS.md (or the goals from the begin output) against the planned phases. Every requirement MUST appear in at least one phase's goal or provides list. If any requirement is unassigned, either add it to an existing phase or create a new phase. Report coverage: `{covered}/{total} requirements mapped to phases`.
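The coverage check can be sketched as a small set difference. The data shapes here are assumptions made for illustration: requirements are plain ID strings in a hypothetical `REQ-NN` format, and each phase object carries a hypothetical `requirements` array standing in for whatever its goal or provides list references.

```javascript
// Hypothetical coverage check: which requirement IDs appear in no phase?
function requirementCoverage(requirementIds, phases) {
  const assigned = new Set(phases.flatMap((phase) => phase.requirements));
  const unassigned = requirementIds.filter((id) => !assigned.has(id));
  const covered = requirementIds.length - unassigned.length;
  return {
    unassigned,
    report: `${covered}/${requirementIds.length} requirements mapped to phases`,
  };
}

const result = requirementCoverage(
  ["REQ-01", "REQ-02", "REQ-03"],
  [
    { name: "Phase 01", requirements: ["REQ-01"] },
    { name: "Phase 02", requirements: ["REQ-03"] },
  ]
);
console.log(result.report, result.unassigned);
```

Any ID left in `unassigned` is one that must be added to an existing phase or given a new phase before the roadmap is written.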
53
+
54
+ #### Dual Format: Checklist + Detail
55
+
56
+ ROADMAP.md MUST contain TWO representations of the phase structure:
57
+
58
+ 1. **Quick-scan checklist** (at the top, after milestone header) — one line per phase with status
59
+ 2. **Detailed phase descriptions** — full goal, discovery, provides, depends-on per phase
60
+
50
61
  #### Fallback Format: ROADMAP.md (if template unreadable)
51
62
 
52
63
  ```markdown
@@ -55,6 +66,12 @@ Invoked with a request to create/update the project roadmap. Produce `.planning/
55
66
  ## Milestone: {project} v1.0
56
67
  **Goal:** {one-line milestone goal}
57
68
  **Phases:** 1 - {N}
69
+ **Requirement coverage:** {covered}/{total} requirements mapped
70
+
71
+ ### Phase Checklist
72
+ - [ ] Phase 01: {name} — {one-line goal summary}
73
+ - [ ] Phase 02: {name} — {one-line goal summary}
74
+ - [ ] Phase 03: {name} — {one-line goal summary}
58
75
 
59
76
  ### Phase 01: {name}
60
77
  **Goal:** {goal}
@@ -63,7 +80,7 @@ Invoked with a request to create/update the project roadmap. Produce `.planning/
63
80
  **Depends on:** {list}
64
81
  ```
65
82
 
66
- **Milestone grouping:** All phases in the initial roadmap MUST be wrapped in a `## Milestone: {project name} v1.0` section. This section includes `**Goal:**` and `**Phases:** 1 - {N}`, followed by the `### Phase NN:` details. For comprehensive-depth projects (8+ phases), consider splitting into multiple milestones if there are natural delivery boundaries (e.g., "Core Platform" phases 1-5, "Advanced Features" phases 6-10). Each milestone section follows the format defined in the roadmap template.
83
+ **Milestone grouping:** All phases in the initial roadmap MUST be wrapped in a `## Milestone: {project name} v1.0` section. This section includes `**Goal:**`, `**Phases:** 1 - {N}`, and `**Requirement coverage:**`, followed by the Phase Checklist and `### Phase NN:` details. For comprehensive-depth projects (8+ phases), consider splitting into multiple milestones if there are natural delivery boundaries (e.g., "Core Platform" phases 1-5, "Advanced Features" phases 6-10). Each milestone section follows the format defined in the roadmap template.
67
84
 
68
85
  ---
69
86
 
@@ -239,8 +256,14 @@ When receiving checker feedback:
239
256
  - [ ] Tasks grouped into plans by wave
240
257
  - [ ] PLAN files exist with XML task structure
241
258
  - [ ] Each plan: frontmatter complete (depends_on, files_modified, must_haves)
259
+ - [ ] Each plan: requirement_ids field populated (MUST NOT be empty)
242
260
  - [ ] Each task: all 5 elements (name, files, action, verify, done)
243
261
  - [ ] Wave structure maximizes parallelism
262
+ - [ ] Every REQ-ID from ROADMAP/REQUIREMENTS appears in at least one plan
263
+ - [ ] Gap closure mode (if VERIFICATION.md exists): gaps clustered, tasks derived from gap.missing
264
+ - [ ] Revision mode (if re-planning): flagged issues addressed, no new issues introduced, waves still valid
265
+ - [ ] Context fidelity: locked decisions from CONTEXT.md all have corresponding tasks
266
+ - [ ] PLAN files written via Write tool (NEVER Bash heredoc)
244
267
  - [ ] PLAN files committed to git
245
268
  </success_criteria>
246
269
 
@@ -254,6 +277,7 @@ Orchestrators pattern-match on these markers to route results. Omitting causes s
254
277
  - `## PLANNING COMPLETE` - all plan files written and self-checked
255
278
  - `## PLANNING FAILED` - cannot produce valid plans from available context
256
279
  - `## PLANNING INCONCLUSIVE` - need more research or user decisions
280
+ - `## CHECKPOINT REACHED` - blocked on human decision, checkpoint details provided
257
281
 
258
282
  ---
259
283
 
@@ -311,6 +335,8 @@ One-line task descriptions in `<name>`. File paths in `<files>`, not explanation
311
335
  10. DO NOT assume research is done — check discovery level
312
336
  11. DO NOT leave done conditions vague — they must be observable
313
337
  12. DO NOT specify literal `undefined` for parameters that have a known source in the calling context — use data contracts to map sources
338
+ 13. DO NOT use Bash heredoc for file creation — ALWAYS use the Write tool
339
+ 14. DO NOT leave requirement_ids empty in PLAN frontmatter — every plan must trace to requirements
314
340
 
315
341
  </anti_patterns>
316
342
 
@@ -245,7 +245,10 @@ Additionally for this agent:
245
245
  - [ ] Source hierarchy followed (S1-S6 ordering)
246
246
  - [ ] All findings tagged with source level and confidence
247
247
  - [ ] Version-sensitive info sourced from S1-S3 only
248
- - [ ] Gaps documented with reasons
248
+ - [ ] Negative claims verified (absence of feature confirmed, not just unmentioned)
249
+ - [ ] Multiple sources cross-referenced for key decisions
250
+ - [ ] Publication dates checked — no stale guidance presented as current
251
+ - [ ] Gaps documented with reasons and "What might I have missed?" reflection
249
252
  - [ ] Research output file written with required sections
250
253
  - [ ] Completion marker returned
251
254
  </success_criteria>
@@ -104,16 +104,30 @@ Check for stub indicators: TODO/FIXME comments, empty function bodies, trivial r
104
104
  #### Level 3: Wired (Connected to the System)
105
105
  Verify the artifact is imported AND used by other parts of the system (functions called, components rendered, middleware applied, routes registered). Result: `WIRED`, `IMPORTED-UNUSED`, or `ORPHANED`.
106
106
 
107
+ #### Level 4: Functional (Actually Works)
108
+ Run the artifact and verify it produces correct results. This goes beyond structural checks (L1-L3) to behavioral verification. Result: `FUNCTIONAL`, `RUNTIME_ERROR`, or `LOGIC_ERROR`.
109
+
110
+ **When to apply L4:** Only for must-haves that have automated verification commands (test suites, build scripts, API endpoints). Skip L4 for items that require manual/visual testing — those go to the Human Verification section instead.
111
+
112
+ **L4 checks:**
113
+ - Tests pass: `npm test`, `pytest`, or the project's test command
114
+ - Build succeeds: `npm run build`, `tsc --noEmit`, or equivalent
115
+ - API responds correctly: endpoint returns expected shape and status codes
116
+ - CLI produces expected output: command-line tools return correct exit codes and output
117
+
107
118
  #### Artifact Outcome Decision Table
108
119
 
109
- | Exists | Substantive | Wired | Status |
110
- |--------|-------------|-------|--------|
111
- | No | -- | -- | MISSING |
112
- | Yes | No | -- | STUB |
113
- | Yes | Yes | No | UNWIRED |
114
- | Yes | Yes | Yes | PASSED |
120
+ | Exists | Substantive | Wired | Functional | Status |
121
+ |--------|-------------|-------|------------|--------|
+ | No | -- | -- | -- | MISSING |
+ | Yes | No | -- | -- | STUB |
+ | Yes | Yes | No | -- | UNWIRED |
+ | Yes | Yes | Yes | No | BROKEN |
+ | Yes | Yes | Yes | Yes | PASSED |
 
  > **Note:** WIRED status (Level 3) requires correct arguments, not just correct function names. A call that passes `undefined` for a parameter available in scope is `ARGS_WRONG`, not `WIRED`.
+ >
+ > **Note:** FUNCTIONAL status (Level 4) is optional — only applied when automated verification is available. Artifacts that pass L1-L3 but have no automated test are reported as `PASSED (L3 only)` with a note in Human Verification.
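The updated decision table and the L4 note can be read as a short cascade of checks. The following is a minimal sketch of that reading; the interface and function names are hypothetical, and only the status strings are taken from the text above. It assumes `functional` is `null` when no automated verification exists for the artifact.

```typescript
// Hypothetical shape for illustration; only the status strings come from the table.
interface ArtifactLevels {
  exists: boolean;            // L1
  substantive: boolean;       // L2
  wired: boolean;             // L3
  functional: boolean | null; // L4; null means no automated verification is available
}

function artifactStatus(a: ArtifactLevels): string {
  if (!a.exists) return "MISSING";
  if (!a.substantive) return "STUB";
  if (!a.wired) return "UNWIRED";
  // L4 is optional: without an automated check the artifact passes at L3 only.
  if (a.functional === null) return "PASSED (L3 only)";
  return a.functional ? "PASSED" : "BROKEN";
}
```

Each early return corresponds to one row of the table, so later levels are never consulted once an earlier level fails, which matches the `--` cells.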
 
  ### Step 6: Verify Key Links (Always)
 
@@ -141,13 +155,15 @@ Beyond verifying that calls exist, spot-check that **arguments passed to cross-b
  Cross-reference all must-haves against verification results in a table:
 
  ```markdown
- | # | Must-Have | Type | L1 (Exists) | L2 (Substantive) | L3 (Wired) | Status |
- |---|----------|------|-------------|-------------------|------------|--------|
- | 1 | {description} | truth | - | - | - | VERIFIED/FAILED |
- | 2 | {description} | artifact | YES/NO | YES/STUB/PARTIAL | WIRED/ORPHANED/ARGS_WRONG | PASS/FAIL |
- | 3 | {description} | key_link | - | - | YES/NO/ARGS_WRONG | PASS/FAIL |
+ | # | Must-Have | Type | L1 (Exists) | L2 (Substantive) | L3 (Wired) | L4 (Functional) | Status |
+ |---|----------|------|-------------|-------------------|------------|-----------------|--------|
+ | 1 | {description} | truth | - | - | - | - | VERIFIED/FAILED |
+ | 2 | {description} | artifact | YES/NO | YES/STUB/PARTIAL | WIRED/ORPHANED | FUNCTIONAL/BROKEN/- | PASS/FAIL |
+ | 3 | {description} | key_link | - | - | YES/NO/ARGS_WRONG | - | PASS/FAIL |
  ```
 
+ L4 column shows `-` when no automated verification is available. Only artifacts with test commands or build verification get L4 checks.
+
  ### Step 8: Scan for Anti-Patterns (Full Verification Only)
 
  Scan for: dead code/unused imports, console.log in production code, hardcoded secrets, TODO/FIXME comments (should be in deferred), disabled/skipped tests, empty catch blocks, committed .env files. Report blockers only.
@@ -264,7 +280,7 @@ Mark any file containing 2+ stub patterns as "STUB — not substantive".
  - [ ] Previous VERIFICATION.md checked
  - [ ] Must-haves established from plan frontmatter
  - [ ] All truths verified with status and evidence
- - [ ] All artifacts checked at 3 levels (exists, substantive, wired)
+ - [ ] All artifacts checked at 3-4 levels (exists, substantive, wired, functional when testable)
  - [ ] All key links verified including argument values
  - [ ] Anti-patterns scanned and categorized
  - [ ] Overall status determined
@@ -279,6 +295,7 @@ CRITICAL: Your final output MUST end with exactly one completion marker.
  Orchestrators pattern-match on these markers to route results. Omitting causes silent failures.
 
  - `## VERIFICATION COMPLETE` - VERIFICATION.md written (status in frontmatter)
+ - `## VERIFICATION FAILED` - could not complete verification (missing phase dir, no must-haves to check)
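To make the marker contract concrete, here is a minimal sketch of how an orchestrator might extract the final marker from verifier output. The function name and the strictness of the matching are assumptions for illustration; the source only states that output must end with exactly one marker and that orchestrators pattern-match on it.

```typescript
// The two marker strings come from the list above; everything else is hypothetical.
const COMPLETION_MARKERS: string[] = [
  "## VERIFICATION COMPLETE",
  "## VERIFICATION FAILED",
];

// Returns the marker only if exactly one marker line exists and it is the
// last non-empty line of the output; otherwise returns null (unroutable).
function finalMarker(output: string): string | null {
  const lines = output
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
  const markerLines = lines.filter((line) => COMPLETION_MARKERS.indexOf(line) !== -1);
  if (markerLines.length !== 1) {
    return null;
  }
  const last = lines[lines.length - 1];
  return last === markerLines[0] ? last : null;
}
```

Under this sketch, a `null` result models the silent-failure case described above: the output exists but cannot be routed.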
 
  ---
 
@@ -0,0 +1,5 @@
+ ---
+ description: "Generate tests for completed phase code. Detects test framework and targets key files."
+ ---
+
+ This command is provided by the `pbr:test` skill.
@@ -157,6 +157,34 @@ When a plan requires research before execution, set the `discovery` field in pla
 
  ---
 
+ ## TDD Decision Heuristic
+
+ When assigning `tdd="true"` or `tdd="false"` on a task, apply this test:
+
+ > **Can you write `expect(fn(input)).toBe(output)` before writing `fn`?**
+ > Yes → `tdd="true"`. No → `tdd="false"`.
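A small sketch of what answering "yes" to that question looks like in practice: the expected input/output pairs are written down before the function body exists. The `slugify` function and its cases are hypothetical examples, and a plain loop stands in for a test runner so the snippet stays self-contained.

```typescript
// Step 1: expectations written first, as [input, expected output] pairs.
// With Jest these would read: expect(slugify("Hello World")).toBe("hello-world")
const cases: Array<[string, string]> = [
  ["Hello World", "hello-world"],
  ["  Plan  Build Run ", "plan-build-run"],
];

// Step 2: the implementation, written afterwards to satisfy the cases.
function slugify(input: string): string {
  return input.trim().toLowerCase().split(/\s+/).join("-");
}

// Step 3: check every case; throws on the first mismatch.
for (const [input, expected] of cases) {
  const actual = slugify(input);
  if (actual !== expected) {
    throw new Error(`slugify(${JSON.stringify(input)}) returned ${JSON.stringify(actual)}`);
  }
}
```

A task such as UI layout has no comparable input/output pair to write down in step 1, which is why the heuristic sends it to `tdd="false"`.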
+
+ ### When TDD Adds Value
+
+ - Pure functions and data transformations
+ - Business logic with defined inputs/outputs
+ - API response parsing and validation
+ - State machines and workflow transitions
+ - Utility functions and helpers
+
+ ### When to Skip TDD
+
+ - UI rendering and layout (test after)
+ - Configuration and environment setup
+ - Glue code wiring modules together
+ - Simple CRUD with no business logic
+ - File system operations and I/O plumbing
+ - One-off scripts and migrations
+
+ When the global config `features.tdd_mode: true` is set, all tasks default to TDD. The planner should still set `tdd="false"` on tasks matching the skip list above — the global flag is a project preference, not a mandate for every task.
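One possible reading of how the per-task heuristic and the global flag combine is sketched below. The trait names and the precedence order (skip list first, then the global default, then the per-task heuristic) are assumptions for illustration, not something the plugin is confirmed to implement.

```typescript
// Hypothetical task traits a planner might derive while authoring a plan.
interface TaskTraits {
  testableBeforeImpl: boolean; // "can you write expect(fn(input)).toBe(output) first?"
  onSkipList: boolean;         // e.g. UI layout, config setup, glue code, one-off scripts
}

// Returns the value to place in the task's tdd attribute.
function resolveTdd(task: TaskTraits, globalTddMode: boolean): "true" | "false" {
  // Skip-list tasks opt out even when the project prefers TDD globally.
  if (task.onSkipList) return "false";
  // With features.tdd_mode enabled, remaining tasks default to TDD.
  if (globalTddMode) return "true";
  // Otherwise fall back to the per-task heuristic.
  return task.testableBeforeImpl ? "true" : "false";
}
```

For example, a glue-code task resolves to `"false"` whether or not the global flag is set, while a pure data transformation resolves to `"true"` in both cases.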
+
+ ---
+
  ## Dependency Graph Rules
 
  ### File Conflict Detection