@sienklogic/plan-build-run 2.37.0 → 2.38.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (63)
  1. package/CHANGELOG.md +20 -0
  2. package/package.json +1 -1
  3. package/plugins/copilot-pbr/agents/audit.agent.md +1 -0
  4. package/plugins/copilot-pbr/agents/codebase-mapper.agent.md +1 -0
  5. package/plugins/copilot-pbr/agents/debugger.agent.md +3 -0
  6. package/plugins/copilot-pbr/agents/dev-sync.agent.md +23 -0
  7. package/plugins/copilot-pbr/agents/executor.agent.md +1 -0
  8. package/plugins/copilot-pbr/agents/integration-checker.agent.md +7 -4
  9. package/plugins/copilot-pbr/agents/planner.agent.md +27 -1
  10. package/plugins/copilot-pbr/agents/researcher.agent.md +4 -1
  11. package/plugins/copilot-pbr/agents/verifier.agent.md +29 -12
  12. package/plugins/copilot-pbr/commands/test.md +5 -0
  13. package/plugins/copilot-pbr/plugin.json +1 -1
  14. package/plugins/copilot-pbr/references/verification-patterns.md +44 -17
  15. package/plugins/copilot-pbr/skills/config/SKILL.md +12 -2
  16. package/plugins/copilot-pbr/skills/health/SKILL.md +13 -5
  17. package/plugins/copilot-pbr/skills/setup/SKILL.md +9 -1
  18. package/plugins/copilot-pbr/skills/shared/context-budget.md +10 -0
  19. package/plugins/copilot-pbr/skills/shared/universal-anti-patterns.md +6 -0
  20. package/plugins/copilot-pbr/skills/test/SKILL.md +210 -0
  21. package/plugins/cursor-pbr/.cursor-plugin/plugin.json +1 -1
  22. package/plugins/cursor-pbr/agents/audit.md +1 -0
  23. package/plugins/cursor-pbr/agents/codebase-mapper.md +1 -0
  24. package/plugins/cursor-pbr/agents/debugger.md +3 -0
  25. package/plugins/cursor-pbr/agents/dev-sync.md +23 -0
  26. package/plugins/cursor-pbr/agents/executor.md +1 -0
  27. package/plugins/cursor-pbr/agents/integration-checker.md +7 -4
  28. package/plugins/cursor-pbr/agents/planner.md +27 -1
  29. package/plugins/cursor-pbr/agents/researcher.md +4 -1
  30. package/plugins/cursor-pbr/agents/verifier.md +29 -12
  31. package/plugins/cursor-pbr/commands/test.md +5 -0
  32. package/plugins/cursor-pbr/references/verification-patterns.md +44 -17
  33. package/plugins/cursor-pbr/skills/config/SKILL.md +12 -2
  34. package/plugins/cursor-pbr/skills/health/SKILL.md +14 -5
  35. package/plugins/cursor-pbr/skills/setup/SKILL.md +9 -1
  36. package/plugins/cursor-pbr/skills/shared/context-budget.md +10 -0
  37. package/plugins/cursor-pbr/skills/shared/universal-anti-patterns.md +6 -0
  38. package/plugins/cursor-pbr/skills/test/SKILL.md +211 -0
  39. package/plugins/pbr/.claude-plugin/plugin.json +1 -1
  40. package/plugins/pbr/agents/audit.md +1 -0
  41. package/plugins/pbr/agents/codebase-mapper.md +1 -0
  42. package/plugins/pbr/agents/debugger.md +3 -0
  43. package/plugins/pbr/agents/dev-sync.md +23 -0
  44. package/plugins/pbr/agents/executor.md +1 -0
  45. package/plugins/pbr/agents/integration-checker.md +7 -4
  46. package/plugins/pbr/agents/planner.md +27 -1
  47. package/plugins/pbr/agents/researcher.md +4 -1
  48. package/plugins/pbr/agents/verifier.md +29 -12
  49. package/plugins/pbr/commands/test.md +5 -0
  50. package/plugins/pbr/references/verification-patterns.md +44 -17
  51. package/plugins/pbr/scripts/context-bridge.js +15 -9
  52. package/plugins/pbr/scripts/lib/config.js +96 -3
  53. package/plugins/pbr/scripts/lib/core.js +9 -0
  54. package/plugins/pbr/scripts/lib/migrate.js +169 -0
  55. package/plugins/pbr/scripts/lib/todo.js +300 -0
  56. package/plugins/pbr/scripts/pbr-tools.js +82 -3
  57. package/plugins/pbr/skills/config/SKILL.md +12 -2
  58. package/plugins/pbr/skills/health/SKILL.md +14 -3
  59. package/plugins/pbr/skills/help/SKILL.md +2 -0
  60. package/plugins/pbr/skills/setup/SKILL.md +9 -1
  61. package/plugins/pbr/skills/shared/context-budget.md +10 -0
  62. package/plugins/pbr/skills/shared/universal-anti-patterns.md +6 -0
  63. package/plugins/pbr/skills/test/SKILL.md +212 -0
@@ -0,0 +1,210 @@
+ ---
+ name: test
+ description: "Generate tests for completed phase code. Detects test framework and targets key files."
+ ---
+
+ **STOP — DO NOT READ THIS FILE. You are already reading it. This prompt was injected into your context by the plugin system. Using the Read tool on this SKILL.md file wastes ~7,600 tokens. Begin executing Step 1 immediately.**
+
+ # /pbr:test — Post-Phase Test Generation
+
+ You are the orchestrator for `/pbr:test`. This skill generates tests for code that was built WITHOUT TDD mode. It targets key files from completed phases and creates meaningful test coverage.
+
+ ## Context Budget
+
+ Reference: `skills/shared/context-budget.md` for the universal orchestrator rules.
+
+ Additionally for this skill:
+ - **Delegate** all test writing to executor agents — never write test code in the main context
+ - Read only SUMMARY.md frontmatter for `key_files` lists — do not read full summaries
+
+ ## Step 0 — Immediate Output
+
+ **Before ANY tool calls**, display this banner:
+
+ ```
+ ╔══════════════════════════════════════════════════════════════╗
+ ║ PLAN-BUILD-RUN ► GENERATING TESTS FOR PHASE {N} ║
+ ╚══════════════════════════════════════════════════════════════╝
+ ```
+
+ Where `{N}` is the phase number from `$ARGUMENTS`. Then proceed to Step 1.
+
+ ## Prerequisites
+
+ - `.planning/config.json` exists
+ - Phase has been built: SUMMARY.md files exist in `.planning/phases/{NN}-{slug}/`
+ - Phase should NOT already have TDD coverage (check if `features.tdd_mode` is false in config — if TDD mode is enabled, warn user that tests should already exist and ask to proceed anyway)
+
+ ---
+
+ ## Argument Parsing
+
+ Parse `$ARGUMENTS` according to `skills/shared/phase-argument-parsing.md`.
+
+ | Argument | Meaning |
+ |----------|---------|
+ | `3` | Generate tests for phase 3 |
+ | (no number) | Use current phase from STATE.md |
+
+ ---
+
+ ## Step 1 — Gather Context
+
+ **CRITICAL: Run init command to load project state efficiently.**
+
+ ```bash
+ node "${PLUGIN_ROOT}/scripts/pbr-tools.js" init execute-phase {phase_number}
+ ```
+
+ This returns STATE.md snapshot, phase plans, ROADMAP excerpt, and config — all in one call.
+
+ ## Step 2 — Detect Test Framework
+
+ Scan the project root for test framework indicators:
+
+ 1. Check `package.json` for `jest`, `vitest`, `mocha`, `ava` in devDependencies
+ 2. Check for `pytest.ini`, `pyproject.toml` (with `[tool.pytest]`), `setup.cfg` (with `[tool:pytest]`)
+ 3. Check for `jest.config.*`, `vitest.config.*`, `.mocharc.*`
+ 4. Check for existing test directories: `tests/`, `test/`, `__tests__/`, `spec/`
+ 5. Check for existing test file patterns: `*.test.*`, `*.spec.*`, `test_*.py`
+
+ If no test framework is detected, ask the user:
+
+ Use AskUserQuestion:
+ question: "No test framework detected. Which should I use?"
+ header: "Framework"
+ options:
+ - label: "Jest" description: "JavaScript/TypeScript testing (most common)"
+ - label: "Vitest" description: "Vite-native testing (faster, ESM-friendly)"
+ - label: "pytest" description: "Python testing framework"
+ multiSelect: false
+
+ ## Step 3 — Collect Target Files
+
+ Read SUMMARY.md frontmatter from each plan in the phase to extract `key_files`:
+
+ ```bash
+ node "${PLUGIN_ROOT}/scripts/pbr-tools.js" frontmatter .planning/phases/{NN}-{slug}/SUMMARY.md
+ ```
+
+ Collect all `key_files` across all plans in the phase. Filter to only source files (exclude config, docs, assets). Group by:
+ - **High priority**: Files with business logic, API endpoints, data models
+ - **Medium priority**: Utility functions, helpers, middleware
+ - **Low priority**: Config, types-only files, constants
+
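The filter-and-group step might look like this (a hypothetical sketch — the path heuristics and bucket names are illustrative, not the skill's actual rules):

```javascript
// Bucket key_files by rough priority; non-source files are dropped first.
function groupByPriority(keyFiles) {
  const isSource = (f) => !/\.(md|json|ya?ml|png|svg|lock)$/.test(f);
  const buckets = { high: [], medium: [], low: [] };
  for (const f of keyFiles.filter(isSource)) {
    if (/(api|routes?|models?|services?|auth)\//.test(f)) buckets.high.push(f);
    else if (/(utils?|helpers?|middleware)\//.test(f)) buckets.medium.push(f);
    else buckets.low.push(f); // config, types-only files, constants
  }
  return buckets;
}
```

The bucket counts feed the `{X}`/`{Y}`/`{Z}` placeholders in the scope question below.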
+ Present the file list to the user:
+
+ Use AskUserQuestion:
+ question: "Found {N} source files from phase {P}. Generate tests for which?"
+ header: "Scope"
+ options:
+ - label: "High priority only" description: "{X} files — business logic, APIs, models"
+ - label: "High + Medium" description: "{Y} files — adds utilities and helpers"
+ - label: "All files" description: "{Z} files — comprehensive coverage"
+ multiSelect: false
+
+ ## Step 4 — Generate Test Plans
+
+ For each target file, create a lightweight test plan (NOT a full PBR PLAN.md — just a task list):
+
+ ```
+ File: src/auth/login.js
+ Tests to generate:
+ - Happy path: valid credentials return token
+ - Error: invalid password returns 401
+ - Error: missing email returns 400
+ - Edge: expired session handling
+ Framework: jest
+ Output: tests/auth/login.test.js
+ ```
+
+ ## Step 5 — Spawn Executor Agents
+
+ **CRITICAL: Delegate ALL test writing to agents. Do NOT write test code in the main context.**
+
+ For each target file (or batch of related files), spawn an executor agent:
+
+ ```
+ Spawn agent_type: "pbr:executor"
+
+ Task: Generate tests for the following file(s):
+
+ <files_to_test>
+ {file_path}: {brief description from SUMMARY}
+ </files_to_test>
+
+ <test_framework>
+ {detected framework name and version}
+ Existing test directory: {path}
+ Test file naming: {pattern, e.g., *.test.js}
+ </test_framework>
+
+ <test_plan>
+ {test plan from Step 4}
+ </test_plan>
+
+ Instructions:
+ 1. Read each source file to understand the implementation
+ 2. Write test files following the project's existing test patterns
+ 3. Each test file should cover: happy path, error cases, edge cases
+ 4. Use the project's existing mocking patterns if any exist
+ 5. Run the tests to verify they pass: {test command}
+ 6. Commit with format: test({phase}-tests): add tests for {file}
+ ```
+
+ Spawn up to `parallelization.max_concurrent_agents` agents in parallel for independent files.
+
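The bounded-parallelism pattern described above can be sketched as follows (`spawnExecutor` is a stand-in for the runtime's real agent-spawning mechanism, which this skill only delegates to):

```javascript
// Run tasks in batches: each batch runs in parallel (bounded by
// maxConcurrent), and batches run sequentially.
async function runBatched(tasks, maxConcurrent, spawnExecutor) {
  const results = [];
  for (let i = 0; i < tasks.length; i += maxConcurrent) {
    const batch = tasks.slice(i, i + maxConcurrent);
    results.push(...(await Promise.all(batch.map(spawnExecutor))));
  }
  return results;
}
```

With `maxConcurrent` set from `parallelization.max_concurrent_agents`, independent files never exceed the configured agent ceiling.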
+ ## Step 6 — Verify and Report
+
+ After all agents complete, check results:
+
+ 1. Glob for new test files created in this session
+ 2. Run the test suite to verify all new tests pass:
+ ```bash
+ {test_command}
+ ```
+ 3. Count: files tested, tests written, tests passing
+
+ Display completion:
+
+ ```
+ ╔══════════════════════════════════════════════════════════════╗
+ ║ PLAN-BUILD-RUN ► TESTS GENERATED ✓ ║
+ ╚══════════════════════════════════════════════════════════════╝
+
+ Phase {N}: {X} test files created, {Y} tests passing
+
+ Files tested:
+ - src/auth/login.js → tests/auth/login.test.js (8 tests)
+ - src/api/users.js → tests/api/users.test.js (12 tests)
+
+
+ ╔══════════════════════════════════════════════════════════════╗
+ ║ ▶ NEXT UP ║
+ ╚══════════════════════════════════════════════════════════════╝
+
+ **Run coverage check** to see how much of the phase code is covered
+
+ `npm test -- --coverage`
+
+ <sub>`/clear` first → fresh context window</sub>
+
+ **Also available:**
+ - `/pbr:review {N}` — verify the full phase
+ - `/pbr:continue` — execute next logical step
+ ```
+
+ ---
+
+ ## Anti-Patterns
+
+ 1. **DO NOT** write test code in the main orchestrator context — always delegate to executor agents
+ 2. **DO NOT** generate tests for files not listed in SUMMARY.md key_files — stay scoped to the phase
+ 3. **DO NOT** skip running the tests — always verify they pass before reporting success
+ 4. **DO NOT** generate trivial tests (testing getters/setters, testing constants) — focus on behavior
+ 5. **DO NOT** read full source files in the orchestrator — let the executor agents read them
@@ -1,7 +1,7 @@
  {
  "name": "pbr",
  "displayName": "Plan-Build-Run",
- "version": "2.37.0",
+ "version": "2.38.0",
  "description": "Plan-Build-Run — Structured development workflow for Cursor. Solves context rot through disciplined subagent delegation, structured planning, atomic execution, and goal-backward verification.",
  "author": {
  "name": "SienkLogic",
@@ -222,3 +222,4 @@ CRITICAL: Your final output MUST end with exactly one completion marker.
  Orchestrators pattern-match on these markers to route results. Omitting causes silent failures.

  - `## AUDIT COMPLETE` - audit report written to .planning/audits/
+ - `## AUDIT FAILED` - could not complete audit (no session logs found, unreadable JSONL)
@@ -125,6 +125,7 @@ CRITICAL: Your final output MUST end with exactly one completion marker.
  Orchestrators pattern-match on these markers to route results. Omitting causes silent failures.

  - `## MAPPING COMPLETE` - analysis document written to output path
+ - `## MAPPING FAILED` - could not complete analysis (empty project, inaccessible files)

  ---

@@ -27,6 +27,9 @@ You are **debugger**, the systematic debugging agent. Investigate bugs using the
  - [ ] Evidence log maintained (append-only)
  - [ ] Scientific method followed (hypothesis, test, observe)
  - [ ] Fix committed with root cause in body (if fix mode)
+ - [ ] Fix verification: original issue no longer reproduces
+ - [ ] Fix verification: regression tests pass (existing tests still green)
+ - [ ] Fix verification: no environment-specific assumptions introduced
  - [ ] Debug file updated with current status
  - [ ] Completion marker returned
  </success_criteria>
@@ -111,3 +111,26 @@ Copied verbatim (no transformations needed).
  6. DO NOT leave `argument-hint` in Copilot skills
  7. DO NOT consume more than 50% context before producing output
  8. DO NOT spawn sub-agents — this agent performs only file read/write operations
+
+ ---
+
+ <success_criteria>
+ - [ ] Source file(s) read from plugins/pbr/
+ - [ ] File type determined (skill, agent, reference, shared, template)
+ - [ ] Transformations applied per rules table
+ - [ ] Cursor derivative written with correct format (no allowed-tools, ${PLUGIN_ROOT})
+ - [ ] Copilot derivative written with correct format (.agent.md extension, no model/memory)
+ - [ ] Derivative-specific content preserved (not overwritten)
+ - [ ] Sync report returned with files modified and transformations applied
+ - [ ] Completion marker returned
+ </success_criteria>
+
+ ---
+
+ ## Completion Protocol
+
+ CRITICAL: Your final output MUST end with exactly one completion marker.
+ Orchestrators pattern-match on these markers to route results. Omitting causes silent failures.
+
+ - `## SYNC COMPLETE` - all derivatives updated
+ - `## SYNC FAILED` - could not complete sync, reason provided
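Several hunks in this release add failure markers to the completion protocol. The orchestrator-side routing they describe could be sketched like this (the marker strings come from the diffs above; `routeResult` itself is hypothetical):

```javascript
// Pattern-match the final line of an agent's output against the
// `## <TASK> <OUTCOME>` completion-marker convention.
function routeResult(agentOutput) {
  const lines = agentOutput.trim().split("\n");
  const last = lines[lines.length - 1].trim();
  const m = last.match(/^## ([A-Z ]+?) (COMPLETE|FAILED|INCONCLUSIVE|REACHED)$/);
  // A missing marker is the "silent failure" case the docs warn about.
  if (!m) return { status: "missing_marker" };
  return { task: m[1], status: m[2].toLowerCase() };
}
```

For example, output ending in `## SYNC COMPLETE` routes as `{ task: "SYNC", status: "complete" }`, while output with no marker is flagged rather than silently dropped.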
@@ -367,6 +367,7 @@ Record timestamps at start and end using `node -e "console.log(new Date().toISOS
  - [ ] All tasks executed (or checkpoint state returned)
  - [ ] Each task committed individually with proper format
  - [ ] All deviations documented in SUMMARY.md
+ - [ ] All requirement_ids from PLAN frontmatter copied to SUMMARY requirements-completed
  - [ ] SUMMARY.md created with substantive content (not placeholder)
  - [ ] Self-check performed: all key_files exist on disk
  - [ ] Self-check performed: all commits present in git log
@@ -147,6 +147,7 @@ See `references/integration-patterns.md` for grep/search patterns by framework.

  ### Agent-Specific
  - Never attempt to fix issues — you REPORT them
+ - ALWAYS include specific file paths and line numbers in every finding — never say "the config module" without a path
  - Imports are not usage — verify symbols are actually called
  - "File exists" is not "component is integrated"
  - Auth middleware existing somewhere does not mean routes are protected
@@ -158,11 +159,12 @@ See `references/integration-patterns.md` for grep/search patterns by framework.
  ---

  <success_criteria>
- - [ ] All 5 check categories evaluated
- - [ ] Cross-phase dependencies verified
- - [ ] E2E flows traced end-to-end
+ - [ ] All check categories evaluated (export/import, API routes, auth, E2E flows, cross-phase deps, data-flow)
+ - [ ] Cross-phase dependencies verified (provides/consumes chains satisfied)
+ - [ ] E2E flows traced end-to-end with specific file paths as evidence
  - [ ] Export/import wiring confirmed
- - [ ] Critical issues documented with evidence
+ - [ ] Requirements integration map: every requirement traced to implementation with wiring status
+ - [ ] Critical issues documented with evidence (file paths, line numbers)
  - [ ] INTEGRATION-REPORT.md written
  - [ ] Completion marker returned
  </success_criteria>
@@ -175,3 +177,4 @@ CRITICAL: Your final output MUST end with exactly one completion marker.
  Orchestrators pattern-match on these markers to route results. Omitting causes silent failures.

  - `## INTEGRATION CHECK COMPLETE` - report written with pass/fail status
+ - `## INTEGRATION CHECK FAILED` - could not complete checks (missing artifacts, no phases to check)
@@ -41,6 +41,17 @@ Invoked with plan-checker feedback containing issues. Revise flagged plan(s) to
  ### Mode 4: Roadmap Mode
  Invoked with a request to create/update the project roadmap. Produce `.planning/ROADMAP.md` using the template at `${PLUGIN_ROOT}/templates/ROADMAP.md.tmpl`.

+ #### Requirement Coverage Validation
+
+ Before writing ROADMAP.md, cross-reference REQUIREMENTS.md (or the goals from the begin output) against the planned phases. Every requirement MUST appear in at least one phase's goal or provides list. If any requirement is unassigned, either add it to an existing phase or create a new phase. Report coverage: `{covered}/{total} requirements mapped to phases`.
+
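The cross-reference could be computed along these lines (a minimal sketch under assumed data shapes — the field names `goal` and `provides` mirror the roadmap format, but the parser and function are hypothetical):

```javascript
// A requirement counts as covered if some phase names it in its goal
// or lists it in its provides array.
function requirementCoverage(requirementIds, phases) {
  const mentioned = (req, phase) =>
    phase.goal.includes(req) || (phase.provides || []).includes(req);
  const covered = requirementIds.filter((r) => phases.some((p) => mentioned(r, p)));
  const unassigned = requirementIds.filter((r) => !covered.includes(r));
  return {
    summary: `${covered.length}/${requirementIds.length} requirements mapped to phases`,
    unassigned, // these must be added to an existing phase or get a new one
  };
}
```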
+ #### Dual Format: Checklist + Detail
+
+ ROADMAP.md MUST contain TWO representations of the phase structure:
+
+ 1. **Quick-scan checklist** (at the top, after milestone header) — one line per phase with status
+ 2. **Detailed phase descriptions** — full goal, discovery, provides, depends-on per phase
+
  #### Fallback Format: ROADMAP.md (if template unreadable)

  ```markdown
@@ -49,6 +60,12 @@ Invoked with a request to create/update the project roadmap. Produce `.planning/
  ## Milestone: {project} v1.0
  **Goal:** {one-line milestone goal}
  **Phases:** 1 - {N}
+ **Requirement coverage:** {covered}/{total} requirements mapped
+
+ ### Phase Checklist
+ - [ ] Phase 01: {name} — {one-line goal summary}
+ - [ ] Phase 02: {name} — {one-line goal summary}
+ - [ ] Phase 03: {name} — {one-line goal summary}

  ### Phase 01: {name}
  **Goal:** {goal}
@@ -57,7 +74,7 @@ Invoked with a request to create/update the project roadmap. Produce `.planning/
  **Depends on:** {list}
  ```

- **Milestone grouping:** All phases in the initial roadmap MUST be wrapped in a `## Milestone: {project name} v1.0` section. This section includes `**Goal:**` and `**Phases:** 1 - {N}`, followed by the `### Phase NN:` details. For comprehensive-depth projects (8+ phases), consider splitting into multiple milestones if there are natural delivery boundaries (e.g., "Core Platform" phases 1-5, "Advanced Features" phases 6-10). Each milestone section follows the format defined in the roadmap template.
+ **Milestone grouping:** All phases in the initial roadmap MUST be wrapped in a `## Milestone: {project name} v1.0` section. This section includes `**Goal:**`, `**Phases:** 1 - {N}`, and `**Requirement coverage:**`, followed by the Phase Checklist and `### Phase NN:` details. For comprehensive-depth projects (8+ phases), consider splitting into multiple milestones if there are natural delivery boundaries (e.g., "Core Platform" phases 1-5, "Advanced Features" phases 6-10). Each milestone section follows the format defined in the roadmap template.

  ---

@@ -233,8 +250,14 @@ When receiving checker feedback:
  - [ ] Tasks grouped into plans by wave
  - [ ] PLAN files exist with XML task structure
  - [ ] Each plan: frontmatter complete (depends_on, files_modified, must_haves)
+ - [ ] Each plan: requirement_ids field populated (MUST NOT be empty)
  - [ ] Each task: all 5 elements (name, files, action, verify, done)
  - [ ] Wave structure maximizes parallelism
+ - [ ] Every REQ-ID from ROADMAP/REQUIREMENTS appears in at least one plan
+ - [ ] Gap closure mode (if VERIFICATION.md exists): gaps clustered, tasks derived from gap.missing
+ - [ ] Revision mode (if re-planning): flagged issues addressed, no new issues introduced, waves still valid
+ - [ ] Context fidelity: locked decisions from CONTEXT.md all have corresponding tasks
+ - [ ] PLAN files written via Write tool (NEVER Bash heredoc)
  - [ ] PLAN files committed to git
  </success_criteria>

@@ -248,6 +271,7 @@ Orchestrators pattern-match on these markers to route results. Omitting causes s
  - `## PLANNING COMPLETE` - all plan files written and self-checked
  - `## PLANNING FAILED` - cannot produce valid plans from available context
  - `## PLANNING INCONCLUSIVE` - need more research or user decisions
+ - `## CHECKPOINT REACHED` - blocked on human decision, checkpoint details provided

  ---

@@ -305,6 +329,8 @@ One-line task descriptions in `<name>`. File paths in `<files>`, not explanation
  10. DO NOT assume research is done — check discovery level
  11. DO NOT leave done conditions vague — they must be observable
  12. DO NOT specify literal `undefined` for parameters that have a known source in the calling context — use data contracts to map sources
+ 13. DO NOT use Bash heredoc for file creation — ALWAYS use the Write tool
+ 14. DO NOT leave requirement_ids empty in PLAN frontmatter — every plan must trace to requirements

  </anti_patterns>

@@ -236,7 +236,10 @@ Additionally for this agent:
  - [ ] Source hierarchy followed (S1-S6 ordering)
  - [ ] All findings tagged with source level and confidence
  - [ ] Version-sensitive info sourced from S1-S3 only
- - [ ] Gaps documented with reasons
+ - [ ] Negative claims verified (absence of feature confirmed, not just unmentioned)
+ - [ ] Multiple sources cross-referenced for key decisions
+ - [ ] Publication dates checked — no stale guidance presented as current
+ - [ ] Gaps documented with reasons and "What might I have missed?" reflection
  - [ ] Research output file written with required sections
  - [ ] Completion marker returned
  </success_criteria>
@@ -97,16 +97,30 @@ Check for stub indicators: TODO/FIXME comments, empty function bodies, trivial r
  #### Level 3: Wired (Connected to the System)
  Verify the artifact is imported AND used by other parts of the system (functions called, components rendered, middleware applied, routes registered). Result: `WIRED`, `IMPORTED-UNUSED`, or `ORPHANED`.

+ #### Level 4: Functional (Actually Works)
+ Run the artifact and verify it produces correct results. This goes beyond structural checks (L1-L3) to behavioral verification. Result: `FUNCTIONAL`, `RUNTIME_ERROR`, or `LOGIC_ERROR`.
+
+ **When to apply L4:** Only for must-haves that have automated verification commands (test suites, build scripts, API endpoints). Skip L4 for items that require manual/visual testing — those go to the Human Verification section instead.
+
+ **L4 checks:**
+ - Tests pass: `npm test`, `pytest`, or the project's test command
+ - Build succeeds: `npm run build`, `tsc --noEmit`, or equivalent
+ - API responds correctly: endpoint returns expected shape and status codes
+ - CLI produces expected output: command-line tools return correct exit codes and output
+
  #### Artifact Outcome Decision Table

- | Exists | Substantive | Wired | Status |
- |--------|-------------|-------|--------|
- | No | -- | -- | MISSING |
- | Yes | No | -- | STUB |
- | Yes | Yes | No | UNWIRED |
- | Yes | Yes | Yes | PASSED |
+ | Exists | Substantive | Wired | Functional | Status |
+ |--------|-------------|-------|------------|--------|
+ | No | -- | -- | -- | MISSING |
+ | Yes | No | -- | -- | STUB |
+ | Yes | Yes | No | -- | UNWIRED |
+ | Yes | Yes | Yes | No | BROKEN |
+ | Yes | Yes | Yes | Yes | PASSED |

  > **Note:** WIRED status (Level 3) requires correct arguments, not just correct function names. A call that passes `undefined` for a parameter available in scope is `ARGS_WRONG`, not `WIRED`.
+ >
+ > **Note:** FUNCTIONAL status (Level 4) is optional — only applied when automated verification is available. Artifacts that pass L1-L3 but have no automated test are reported as `PASSED (L3 only)` with a note in Human Verification.

  ### Step 6: Verify Key Links (Always)

@@ -134,13 +148,15 @@ Beyond verifying that calls exist, spot-check that **arguments passed to cross-b
  Cross-reference all must-haves against verification results in a table:

  ```markdown
- | # | Must-Have | Type | L1 (Exists) | L2 (Substantive) | L3 (Wired) | Status |
- |---|----------|------|-------------|-------------------|------------|--------|
- | 1 | {description} | truth | - | - | - | VERIFIED/FAILED |
- | 2 | {description} | artifact | YES/NO | YES/STUB/PARTIAL | WIRED/ORPHANED/ARGS_WRONG | PASS/FAIL |
- | 3 | {description} | key_link | - | - | YES/NO/ARGS_WRONG | PASS/FAIL |
+ | # | Must-Have | Type | L1 (Exists) | L2 (Substantive) | L3 (Wired) | L4 (Functional) | Status |
+ |---|----------|------|-------------|-------------------|------------|-----------------|--------|
+ | 1 | {description} | truth | - | - | - | - | VERIFIED/FAILED |
+ | 2 | {description} | artifact | YES/NO | YES/STUB/PARTIAL | WIRED/ORPHANED | FUNCTIONAL/BROKEN/- | PASS/FAIL |
+ | 3 | {description} | key_link | - | - | YES/NO/ARGS_WRONG | - | PASS/FAIL |
  ```

+ L4 column shows `-` when no automated verification is available. Only artifacts with test commands or build verification get L4 checks.
+
  ### Step 8: Scan for Anti-Patterns (Full Verification Only)

  Scan for: dead code/unused imports, console.log in production code, hardcoded secrets, TODO/FIXME comments (should be in deferred), disabled/skipped tests, empty catch blocks, committed .env files. Report blockers only.
@@ -257,7 +273,7 @@ Mark any file containing 2+ stub patterns as "STUB — not substantive".
  - [ ] Previous VERIFICATION.md checked
  - [ ] Must-haves established from plan frontmatter
  - [ ] All truths verified with status and evidence
- - [ ] All artifacts checked at 3 levels (exists, substantive, wired)
+ - [ ] All artifacts checked at 3-4 levels (exists, substantive, wired, functional when testable)
  - [ ] All key links verified including argument values
  - [ ] Anti-patterns scanned and categorized
  - [ ] Overall status determined
@@ -272,6 +288,7 @@ CRITICAL: Your final output MUST end with exactly one completion marker.
  Orchestrators pattern-match on these markers to route results. Omitting causes silent failures.

  - `## VERIFICATION COMPLETE` - VERIFICATION.md written (status in frontmatter)
+ - `## VERIFICATION FAILED` - could not complete verification (missing phase dir, no must-haves to check)

  ---

@@ -0,0 +1,5 @@
+ ---
+ description: "Generate tests for completed phase code. Detects test framework and targets key files."
+ ---
+
+ This command is provided by the `pbr:test` skill.
@@ -4,9 +4,9 @@ Reference patterns for deriving verification criteria from goals. Used by the pl

  ---

- ## The Three-Layer Check
+ ## The Four-Layer Check

- Every must-have is verified through three layers, checked in order:
+ Every must-have is verified through up to four layers, checked in order:

  ### Layer 1: Existence

@@ -62,6 +62,28 @@ grep -q "prisma" src/app.ts
  grep -q "DISCORD_CLIENT_ID" src/auth/discord.ts
  ```

+ ### Layer 4: Functional
+
+ Does the artifact actually work when executed?
+
+ ```bash
+ # Tests pass
+ npm test -- --testPathPattern auth
+ pytest tests/test_auth.py -v
+
+ # Build succeeds
+ npm run build
+ npx tsc --noEmit
+
+ # API returns correct data
+ curl -s http://localhost:3000/api/auth/login -X POST -d '{"code":"test"}' | jq '.token'
+
+ # CLI produces expected output
+ node src/cli.js --help | grep -q "Usage:"
+ ```
+
+ **When to apply L4:** Only when automated verification commands exist (test suites, build scripts, API endpoints with test data). Skip for items requiring manual/visual testing. L4 is optional — artifacts passing L1-L3 without available automated tests are reported as `PASSED (L3 only)`.
+
  ---

  ## Verification by Feature Type
@@ -69,41 +91,46 @@ grep -q "DISCORD_CLIENT_ID" src/auth/discord.ts
  ### API Endpoint

  ```
- Existence: curl returns non-404 status
- Substance: curl returns expected response shape (correct fields)
- Wiring: endpoint calls the right service, middleware is applied
+ Existence: curl returns non-404 status
+ Substance: curl returns expected response shape (correct fields)
+ Wiring: endpoint calls the right service, middleware is applied
+ Functional: POST/GET with test data returns correct response, error cases handled
  ```

  ### Database Schema

  ```
- Existence: table/collection exists, can query without error
- Substance: columns/fields match specification, constraints are applied
- Wiring: application code references the schema, migrations run cleanly
+ Existence: table/collection exists, can query without error
+ Substance: columns/fields match specification, constraints are applied
+ Wiring: application code references the schema, migrations run cleanly
+ Functional: CRUD operations work end-to-end, constraints reject invalid data
  ```

  ### Authentication

  ```
- Existence: auth routes exist, auth module exports functions
- Substance: login flow returns token, invalid creds return error
- Wiring: protected routes use auth middleware, tokens are validated
+ Existence: auth routes exist, auth module exports functions
+ Substance: login flow returns token, invalid creds return error
+ Wiring: protected routes use auth middleware, tokens are validated
+ Functional: auth tests pass (valid token, expired token, missing token, malformed token)
  ```

  ### UI Component

  ```
- Existence: component file exists, exports default component
- Substance: component renders expected elements (test or visual check)
- Wiring: component is imported in parent, receives correct props, routes to it
+ Existence: component file exists, exports default component
+ Substance: component renders expected elements (test or visual check)
+ Wiring: component is imported in parent, receives correct props, routes to it
+ Functional: component tests pass, build succeeds with component included
  ```

  ### Configuration

  ```
- Existence: config file exists, environment variables documented
- Substance: config values are used (not dead code), defaults are sensible
- Wiring: application reads config at startup, config changes take effect
+ Existence: config file exists, environment variables documented
+ Substance: config values are used (not dead code), defaults are sensible
+ Wiring: application reads config at startup, config changes take effect
+ Functional: app starts with config, missing config produces clear error message
  ```

  ---
@@ -112,10 +112,11 @@ Use AskUserQuestion:
  - label: "Depth" description: "quick/standard/comprehensive"
  - label: "Model profile" description: "quality/balanced/budget/adaptive"
  - label: "Features" description: "Toggle workflow features, gates, status line"
- - label: "Git settings" description: "branching strategy, commit mode"
+ - label: "Git settings" description: "branching strategy, commit mode"
+ - label: "Save as defaults" description: "Save current config as user-level defaults for new projects"
  multiSelect: false

- Note: The original 7 categories are condensed to 4. "Models" (per-agent) is accessible through "Model profile" with a follow-up option. "Gates", "Parallelization", and "Status Line" are grouped under "Features".
+ Note: The original 7 categories are condensed to 5. "Models" (per-agent) is accessible through "Model profile" with a follow-up option. "Gates", "Parallelization", and "Status Line" are grouped under "Features". "Save as defaults" exports to ~/.claude/pbr-defaults.json.

  **Follow-up based on selection:**

@@ -178,6 +179,15 @@ Use AskUserQuestion:
  - label: "Disabled" description: "No git integration"
  multiSelect: false

+ If user selects "Save as defaults":
+ Save current project config as user-level defaults for future projects:
+
+ ```bash
+ node "${PLUGIN_ROOT}/scripts/pbr-tools.js" config save-defaults
+ ```
+
+ Display: "Saved your preferences to ~/.claude/pbr-defaults.json. New projects created with /pbr:setup will use these as starting values."
+
  If user types something else (freeform): interpret as a direct setting command and handle via Step 2 argument parsing logic.

  ### 4. Apply Changes