prizmkit 1.1.68 → 1.1.70

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (55)
  1. package/bundled/VERSION.json +3 -3
  2. package/bundled/dev-pipeline/lib/heartbeat.sh +5 -5
  3. package/bundled/dev-pipeline/scripts/generate-bootstrap-prompt.py +11 -12
  4. package/bundled/dev-pipeline/scripts/parse-stream-progress.py +217 -18
  5. package/bundled/dev-pipeline/templates/agent-prompts/dev-implement.md +36 -22
  6. package/bundled/dev-pipeline/templates/agent-prompts/reviewer-review.md +1 -1
  7. package/bundled/dev-pipeline/templates/bootstrap-tier2.md +19 -1
  8. package/bundled/dev-pipeline/templates/bootstrap-tier3.md +19 -1
  9. package/bundled/dev-pipeline/templates/bugfix-bootstrap-prompt.md +24 -21
  10. package/bundled/dev-pipeline/templates/refactor-bootstrap-prompt.md +32 -24
  11. package/bundled/dev-pipeline/templates/sections/ac-verification-checklist.md +4 -10
  12. package/bundled/dev-pipeline/templates/sections/context-budget-rules.md +1 -0
  13. package/bundled/dev-pipeline/templates/sections/feature-context.md +16 -11
  14. package/bundled/dev-pipeline/templates/sections/phase-browser-verification-auto.md +17 -26
  15. package/bundled/dev-pipeline/templates/sections/phase-browser-verification-opencli.md +1 -1
  16. package/bundled/dev-pipeline/templates/sections/phase-browser-verification.md +1 -1
  17. package/bundled/dev-pipeline/templates/sections/phase-context-snapshot-base.md +1 -1
  18. package/bundled/dev-pipeline/templates/sections/phase-critic-plan-full.md +10 -0
  19. package/bundled/dev-pipeline/templates/sections/phase-critic-plan.md +10 -0
  20. package/bundled/dev-pipeline/templates/sections/phase-implement-agent.md +14 -9
  21. package/bundled/dev-pipeline/templates/sections/phase-implement-full.md +14 -9
  22. package/bundled/dev-pipeline/templates/sections/phase-implement-lite.md +8 -17
  23. package/bundled/dev-pipeline/templates/sections/phase-plan-lite.md +1 -1
  24. package/bundled/dev-pipeline/templates/sections/phase-review-agent.md +5 -1
  25. package/bundled/dev-pipeline/templates/sections/phase-review-full.md +6 -2
  26. package/bundled/dev-pipeline/templates/sections/phase-specify-plan-full.md +1 -1
  27. package/bundled/dev-pipeline/templates/sections/task-contract.md +34 -0
  28. package/bundled/dev-pipeline/templates/sections/test-failure-recovery-agent.md +27 -46
  29. package/bundled/dev-pipeline/templates/sections/test-failure-recovery-lite.md +27 -37
  30. package/bundled/dev-pipeline/tests/test_generate_bootstrap_prompt.py +13 -0
  31. package/bundled/dev-pipeline-windows/scripts/generate-bootstrap-prompt.py +11 -12
  32. package/bundled/dev-pipeline-windows/scripts/parse-stream-progress.py +217 -18
  33. package/bundled/dev-pipeline-windows/templates/agent-prompts/dev-implement.md +36 -22
  34. package/bundled/dev-pipeline-windows/templates/agent-prompts/reviewer-review.md +1 -1
  35. package/bundled/dev-pipeline-windows/templates/bugfix-bootstrap-prompt.md +24 -21
  36. package/bundled/dev-pipeline-windows/templates/refactor-bootstrap-prompt.md +32 -24
  37. package/bundled/dev-pipeline-windows/templates/sections/ac-verification-checklist.md +4 -10
  38. package/bundled/dev-pipeline-windows/templates/sections/context-budget-rules.md +1 -0
  39. package/bundled/dev-pipeline-windows/templates/sections/feature-context.md +16 -11
  40. package/bundled/dev-pipeline-windows/templates/sections/phase-browser-verification-auto.md +22 -10
  41. package/bundled/dev-pipeline-windows/templates/sections/phase-context-snapshot-base.md +1 -1
  42. package/bundled/dev-pipeline-windows/templates/sections/phase-critic-plan-full.md +10 -0
  43. package/bundled/dev-pipeline-windows/templates/sections/phase-critic-plan.md +10 -0
  44. package/bundled/dev-pipeline-windows/templates/sections/phase-implement-agent.md +14 -9
  45. package/bundled/dev-pipeline-windows/templates/sections/phase-implement-full.md +14 -9
  46. package/bundled/dev-pipeline-windows/templates/sections/phase-implement-lite.md +8 -19
  47. package/bundled/dev-pipeline-windows/templates/sections/phase-plan-lite.md +1 -1
  48. package/bundled/dev-pipeline-windows/templates/sections/phase-review-agent.md +5 -1
  49. package/bundled/dev-pipeline-windows/templates/sections/phase-review-full.md +6 -2
  50. package/bundled/dev-pipeline-windows/templates/sections/phase-specify-plan-full.md +1 -1
  51. package/bundled/dev-pipeline-windows/templates/sections/task-contract.md +34 -0
  52. package/bundled/dev-pipeline-windows/templates/sections/test-failure-recovery-agent.md +27 -46
  53. package/bundled/dev-pipeline-windows/templates/sections/test-failure-recovery-lite.md +27 -37
  54. package/bundled/skills/_metadata.json +1 -1
  55. package/package.json +1 -1
@@ -84,12 +84,12 @@ mkdir -p .prizmkit/bugfix/{{BUG_ID}}
84
84
 
85
85
  {{IF_BROWSER_INTERACTION}}
86
86
 
87
- #### Browser Verification Setup
87
+ #### Browser Verification Protocol
88
88
 
89
- The bug may be reproducible via the UI using browser tools:
89
+ The bug may be reproducible via the UI. Use this single browser protocol for planning, implementation, review, and final reporting:
90
90
 
91
91
  {{IF_BROWSER_TOOL_AUTO}}
92
- - **Browser Tool**: Will be auto-selected based on error type and dev server configuration
92
+ - **Browser Tool**: Auto-select at runtime based on error type and dev server configuration
93
93
  {{END_IF_BROWSER_TOOL_AUTO}}
94
94
 
95
95
  {{IF_BROWSER_TOOL_PLAYWRIGHT}}
@@ -103,10 +103,7 @@ The bug may be reproducible via the UI using browser tools:
103
103
  **Browser Verification Goals**:
104
104
  {{BROWSER_VERIFY_STEPS}}
105
105
 
106
- If the bug is related to UI/frontend, you may use these tools to:
107
- 1. Reproduce the bug in a running dev server
108
- 2. Verify the fix after implementation
109
- 3. Smoke-test related UI flows for regression
106
+ Use this protocol only when the bug is UI/frontend-reproducible. Evidence must cover: original bug reproduction, post-fix behavior, related UI regression smoke tests, tool used, and any manual verification steps.
110
107
 
111
108
  {{END_IF_BROWSER_INTERACTION}}
112
109
 
@@ -141,7 +138,7 @@ Run `/prizmkit-plan` with `artifact_dir=.prizmkit/bugfix/{{BUG_ID}}/`:
141
138
  - Resolve any `[NEEDS CLARIFICATION]` markers autonomously — do NOT pause
142
139
 
143
140
  {{IF_BROWSER_INTERACTION}}
144
- - **Browser Verification**: If the bug is UI-reproducible, plan.md should include browser-based reproduction as an optional verification step
141
+ - **Browser Verification**: If the bug is UI-reproducible, plan.md should reference the Browser Verification Protocol above instead of redefining browser steps.
145
142
  {{END_IF_BROWSER_INTERACTION}}
146
143
 
147
144
  **DECISION GATE — Fast Path Check**:
@@ -171,11 +168,7 @@ Run `/prizmkit-implement` with `artifact_dir=.prizmkit/bugfix/{{BUG_ID}}/`:
171
168
 
172
169
  {{IF_BROWSER_INTERACTION}}
173
170
 
174
- **Browser Verification During Implementation**:
175
- - After each code fix, you may optionally use browser tools to verify the behavior
176
- - Reproduce the original bug steps and confirm they no longer occur
177
- - Test related UI flows to ensure no regression
178
- - Document any manual verification steps in the implementation notes
171
+ **Browser Verification During Implementation**: If the bug is UI-reproducible, apply the Browser Verification Protocol above and document evidence in the implementation notes.
179
172
 
180
173
  {{END_IF_BROWSER_INTERACTION}}
181
174
 
@@ -196,20 +189,30 @@ After implement completes, verify:
196
189
  If `FAST_PATH=true` (≤ 2 tasks, obvious root cause), skip this phase entirely.
197
190
 
198
191
  Run `/prizmkit-code-review` with `artifact_dir=.prizmkit/bugfix/{{BUG_ID}}/`:
199
- - The skill runs an internal review-fix loop (Reviewer → filter → Dev fix, max 3 rounds) and writes review-report.md
200
- - If PASS: proceed
201
- - If NEEDS_FIXES: the skill exhausted its max rounds; log remaining findings and proceed
192
+ - The skill runs an internal review-fix loop (Reviewer → filter → Dev fix, max 3 rounds) and writes `review-report.md`
193
+ - `review-report.md` must contain `## Verdict`
194
+
195
+ **Gate Check — Review Report**:
196
+ After `/prizmkit-code-review` returns, verify the review report:
197
+ ```bash
198
+ grep -q "## Verdict" .prizmkit/bugfix/{{BUG_ID}}/review-report.md && echo "GATE:PASS" || echo "GATE:MISSING"
199
+ ```
200
+ If GATE:MISSING:
201
+ - Do not re-run `/prizmkit-code-review` in an unbounded report-repair loop.
202
+ - Perform one bounded status check; retry `/prizmkit-code-review` at most once only if the missing report appears caused by a transient team/config/lock error.
203
+ - If the report is still missing after that single check/retry, write `failure-log.md` with the spawn/skill error and last observable state, then either perform a safe inline fallback review (spec/plan/diff/tests → write `review-report.md` with `## Verdict`) or stop with a clear recovery failure.
204
+
205
+ Read `review-report.md` and check the Verdict:
206
+ - `PASS` → proceed
207
+ - `NEEDS_FIXES` → the skill exhausted its max rounds; log remaining findings and proceed
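The gate check and verdict read added in this hunk could be sketched in Python roughly as follows. This is a minimal illustration, not the package's implementation: the function names `check_review_gate` and `read_verdict` are hypothetical, and the assumption that the verdict keyword appears somewhere after the `## Verdict` heading is not confirmed by the diff.

```python
import re
from pathlib import Path

VERDICT_HEADING = "## Verdict"


def check_review_gate(report_path):
    """Mirror the grep gate: PASS only if the file exists and contains the heading."""
    path = Path(report_path)
    if not path.is_file():
        return "GATE:MISSING"
    text = path.read_text(encoding="utf-8")
    return "GATE:PASS" if VERDICT_HEADING in text else "GATE:MISSING"


def read_verdict(report_path):
    """Return 'PASS', 'NEEDS_FIXES', or None.

    Assumption: the verdict keyword follows the heading somewhere in the file.
    """
    text = Path(report_path).read_text(encoding="utf-8")
    _, _, after_heading = text.partition(VERDICT_HEADING)
    match = re.search(r"\b(PASS|NEEDS_FIXES)\b", after_heading)
    return match.group(1) if match else None
```

A caller would run `check_review_gate` first and only call `read_verdict` once the gate reports `GATE:PASS`; the bounded retry and `failure-log.md` handling described above stay outside this sketch.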
202
208
 
203
209
  {{IF_BROWSER_INTERACTION}}
204
210
 
205
- **Code Review — Browser Verification Check**:
206
- - Verify that browser-based reproduction steps (if applicable) are clearly documented
207
- - Confirm that the fix maintains the expected UI behavior for all affected flows
208
- - Validate that any manual verification steps have been completed successfully
211
+ **Code Review — Browser Verification Check**: Confirm that UI-reproducible bug evidence follows the Browser Verification Protocol above.
209
212
 
210
213
  {{END_IF_BROWSER_INTERACTION}}
211
214
 
212
- **CP-3**: Code review complete, all tests green.
215
+ **CP-3**: Code review complete, `review-report.md` written, and all tests green.
213
216
 
214
217
  **Checkpoint update**: Set step `prizmkit-code-review` to `"completed"`.
215
218
 
@@ -16,7 +16,7 @@ You are the **refactor session orchestrator**. Execute Refactor {{REFACTOR_ID}}:
16
16
 
17
17
  **NON-INTERACTIVE MODE**: You are running in headless non-interactive mode. There is NO human on the other end. NEVER ask for user confirmation, NEVER wait for user input, NEVER use interactive prompts (e.g. "Would you like me to…"). If a skill has an interactive step (e.g. offer remediation, ask for approval), skip it and proceed autonomously. Make decisions based on the data available and move forward.
18
18
 
19
- **MANDATORY TEAM REQUIREMENT**: You MUST use the `prizm-dev-team` agents (Dev + Reviewer). This is NON-NEGOTIABLE. All implementation and review work MUST be performed by the appropriate team agents (Dev, Reviewer). You are the orchestrator — handle coordination, planning, and commit phases directly.
19
+ **NORMAL-PATH TEAM REQUIREMENT**: Use the `prizm-dev-team` agents (Dev + Reviewer) for implementation and review. You are the orchestrator — handle coordination, planning, and commit phases directly. If a required agent cannot be spawned after the bounded retry policy below, follow the documented recovery exception instead of looping forever.
20
20
 
21
21
  **REFACTOR DOCUMENTATION POLICY**:
22
22
  - **DEFAULT**: Run `/prizmkit-retrospective` with full sync (Job 1 + Job 2), because refactoring changes code structure and interfaces.
@@ -80,6 +80,13 @@ You are the **refactor session orchestrator**. Execute Refactor {{REFACTOR_ID}}:
80
80
 
81
81
  **YOU are the orchestrator. Execute each phase by spawning the appropriate team agent with run_in_background=false.**
82
82
 
83
+ **Agent spawn failure policy (all Agent tool calls)**:
84
+ - If spawning Dev or Reviewer fails with team/config/lock errors, retry at most once.
85
+ - If the second attempt fails, do not keep spawning variants and do not enter artifact polling for Implementation Log, review-report, or refactor-report markers.
86
+ - Recovery exception: complete remaining refactor work directly in the orchestrator only when the action is safe and bounded; otherwise write `failure-log.md` with the spawn error and last observable state before stopping for recovery.
87
+ - Any recovery exception must be recorded in the required report or session status so downstream runs know the normal team path was unavailable.
88
+ - Apply the same cap to any re-spawn for report repair or resume prompts; do not burn multiple minutes on identical team/config/lock failures.
89
+
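The spawn failure policy above amounts to a retry loop capped at one retry, followed by a recorded fallback. A minimal Python sketch, assuming hypothetical `spawn` and `fallback` callables, a hypothetical `SpawnError` type, and that team/config/lock errors can be recognised by keywords in the error message (none of these come from the package):

```python
from dataclasses import dataclass
from typing import Optional

# Assumed keywords for the "team/config/lock" error class named in the policy.
TRANSIENT_MARKERS = ("team", "config", "lock")


class SpawnError(Exception):
    """Hypothetical error raised when an agent cannot be spawned."""


@dataclass
class SpawnOutcome:
    result: Optional[str]
    attempts: int
    used_fallback: bool
    last_error: Optional[str]


def is_transient(error):
    message = str(error).lower()
    return any(marker in message for marker in TRANSIENT_MARKERS)


def spawn_with_bounded_retry(spawn, fallback, max_retries=1):
    """Try spawn(); retry at most max_retries times, then run the fallback once."""
    attempts = 0
    last_error = None
    while attempts <= max_retries:
        attempts += 1
        try:
            return SpawnOutcome(spawn(), attempts, False, None)
        except SpawnError as exc:
            last_error = str(exc)
            if not is_transient(exc):
                break  # assumption: other error classes are not retried
    # The fallback receives the last error so it can be written to failure-log.md.
    return SpawnOutcome(fallback(last_error), attempts, True, last_error)
```

Returning `used_fallback` and `last_error` makes it straightforward to record the recovery exception in the report or session status, as the policy requires.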
83
90
  ## Workflow Checkpoint System
84
91
 
85
92
  A checkpoint file tracks your progress through this workflow:
@@ -125,32 +132,26 @@ Resolve any `[NEEDS CLARIFICATION]` markers using the refactor description — d
125
132
 
126
133
  {{IF_BROWSER_INTERACTION}}
127
134
 
128
- #### Browser Verification Strategy
135
+ #### Browser Behavior Preservation Protocol
129
136
 
130
- The refactor may affect UI behavior. Browser verification can validate:
131
- - **UI Render**: UI components render identically after refactoring
132
- - **User Interactions**: Navigation, form submissions, and workflows function as before
133
- - **Feature Coverage**: Refactored features work end-to-end in real browser environment
137
+ The refactor may affect UI behavior. Use this single browser protocol for planning, implementation, review, and final reporting:
134
138
 
135
139
  {{IF_BROWSER_TOOL_AUTO}}
136
- Browser tool will be auto-selected at runtime based on dev server and feature complexity.
140
+ - **Browser Tool**: Auto-select at runtime based on dev server and feature complexity.
137
141
  {{END_IF_BROWSER_TOOL_AUTO}}
138
142
 
139
143
  {{IF_BROWSER_TOOL_PLAYWRIGHT}}
140
- **Tool: playwright-cli** — Local isolated browser instance for dev server verification
144
+ - **Browser Tool**: playwright-cli — local isolated browser instance for dev server verification.
141
145
  {{END_IF_BROWSER_TOOL_PLAYWRIGHT}}
142
146
 
143
147
  {{IF_BROWSER_TOOL_OPENCLI}}
144
- **Tool: opencli** — Chrome session with existing login for testing OAuth/third-party integrations
148
+ - **Browser Tool**: opencli — Chrome session with existing login for OAuth/third-party integrations.
145
149
  {{END_IF_BROWSER_TOOL_OPENCLI}}
146
150
 
147
151
  **Verification Goals**:
148
152
  {{BROWSER_VERIFY_STEPS}}
149
153
 
150
- Include browser verification approach in plan.md:
151
- - Which UI flows should be smoke-tested after refactoring?
152
- - Any specific user journeys affected by the structural changes?
153
- - Should verification be part of review phase or implementation phase?
154
+ Evidence must cover affected UI render, primary user interactions, behavior-sensitive workflows, tool used, and whether verification belongs in implementation, review, or both.
154
155
 
155
156
  {{END_IF_BROWSER_INTERACTION}}
156
157
 
@@ -164,6 +165,7 @@ Include browser verification approach in plan.md:
164
165
  **Goal**: Execute all tasks from plan.md while preserving existing behavior.
165
166
 
166
167
  - Spawn Dev agent (Agent tool, subagent_type="prizm-dev-team-dev", run_in_background=false)
168
+ Apply the all-agent spawn failure policy above for this Dev spawn: for team/config/lock errors, retry at most once; if the second attempt fails, do not poll for `## Implementation Log`; use the bounded recovery exception.
167
169
  Prompt: "Read {{DEV_SUBAGENT_PATH}}. For refactor {{REFACTOR_ID}} ('{{REFACTOR_TITLE}}'):
168
170
  1. Read `.prizmkit/refactor/{{REFACTOR_ID}}/spec.md` and `.prizmkit/refactor/{{REFACTOR_ID}}/plan.md`
169
171
  2. Read `.prizmkit/prizm-docs/` for affected modules (TRAPS, RULES, PATTERNS)
@@ -178,11 +180,7 @@ Include browser verification approach in plan.md:
178
180
 
179
181
  {{IF_BROWSER_INTERACTION}}
180
182
 
181
- 6. **Browser Smoke Tests** (after every major refactor task):
182
- - Use browser tools to verify UI still renders correctly
183
- - Test the primary user flows affected by the refactoring
184
- - Confirm no visual or interactive regressions
185
- - Document any manual browser verification steps in implementation notes
183
+ 6. Apply the Browser Behavior Preservation Protocol above after major refactor tasks and document evidence in implementation notes.
186
184
 
187
185
  {{END_IF_BROWSER_INTERACTION}}
188
186
 
@@ -201,6 +199,7 @@ Include browser verification approach in plan.md:
201
199
  **Goal**: Verify refactoring quality and behavior preservation.
202
200
 
203
201
  - Spawn Reviewer agent (Agent tool, subagent_type="prizm-dev-team-reviewer", run_in_background=false)
202
+ Apply the all-agent spawn failure policy above for this Reviewer spawn.
204
203
  Prompt: "Read {{REVIEWER_SUBAGENT_PATH}}. For refactor {{REFACTOR_ID}}:
205
204
  1. Read `.prizmkit/refactor/{{REFACTOR_ID}}/spec.md` for goals and behavior preservation contracts
206
205
  2. Read `.prizmkit/refactor/{{REFACTOR_ID}}/plan.md` for architecture decisions and completed tasks
@@ -209,11 +208,7 @@ Include browser verification approach in plan.md:
209
208
 
210
209
  {{IF_BROWSER_INTERACTION}}
211
210
 
212
- 5. **Browser Verification Review**:
213
- - Verify that critical user workflows still function end-to-end in browser
214
- - Confirm UI renders consistently after structural changes
215
- - Validate any behavior-sensitive components behave identically
216
- - Document browser verification findings in review-report.md
211
+ 5. Verify the Browser Behavior Preservation Protocol evidence and document findings in `review-report.md`.
217
212
 
218
213
  {{END_IF_BROWSER_INTERACTION}}
219
214
 
@@ -221,7 +216,20 @@ Include browser verification approach in plan.md:
221
216
  7. Report: verdict (PASS/NEEDS_FIXES), number of rounds, findings fixed/rejected
222
217
  "
223
218
  - **Wait for Reviewer to return**
224
- - Read `review-report.md` — if PASS proceed, if NEEDS_FIXES log remaining findings and proceed.
219
+ - **Gate Check — Review Report**:
220
+ After Reviewer returns, verify the review report contains a verdict:
221
+ ```bash
222
+ grep -q "## Verdict" .prizmkit/refactor/{{REFACTOR_ID}}/review-report.md && echo "GATE:PASS" || echo "GATE:MISSING"
223
+ ```
224
+ If GATE:MISSING:
225
+ - Do not enter an unbounded report-repair loop and do not repeatedly re-spawn Reviewer.
226
+ - Perform one bounded status check; retry at most once: inspect the Reviewer output, `review-report.md` path, and any internal Reviewer/Dev spawn messages from `/prizmkit-code-review`.
227
+ - If the missing report is caused by team/config/lock errors from the Reviewer or internal Reviewer/Dev agent spawn, retry the Reviewer agent at most once only if it appears transient.
228
+ - If the report is still missing after that single check/retry, write `.prizmkit/refactor/{{REFACTOR_ID}}/failure-log.md` with the spawn/skill error and last observable state, then either perform a safe inline fallback review (spec/plan/diff/tests → write `review-report.md` with `## Verdict`) or stop with a clear recovery failure.
229
+
230
+ Read `review-report.md` and check the Verdict:
231
+ - `PASS` → proceed to next phase
232
+ - `NEEDS_FIXES` → log remaining findings and proceed (do not retry externally)
225
233
  - **CP-RF-3**: Code review complete, tests pass, behavior preserved
226
234
  - **Checkpoint update**: set step `prizmkit-code-review` to `"completed"` in `{{CHECKPOINT_PATH}}`
227
235
 
@@ -1,13 +1,7 @@
1
- ## Acceptance Criteria Verification Checklist
1
+ ## Verification Gates Compatibility Note
2
2
 
3
- This checklist is auto-generated from feature.acceptance_criteria. The Dev agent must verify each criterion and mark as complete:
3
+ This section file is retained for compatibility with older bootstrap templates.
4
4
 
5
- {{AC_CHECKLIST}}
6
-
7
- **Verification Gates**:
8
- 1. All [x] items confirmed by Dev agent during implementation
9
- 2. Reviewer agent verifies checklist is 100% complete before clearing PASS verdict
10
- 3. Any unmet AC ([ ] remaining) marks feature as incomplete
11
-
12
- ---
5
+ Current section assembly renders Verification Gates only inside `task-contract.md` so the feature scope and acceptance requirements have a single source of truth.
13
6
 
7
+ Do not add this section separately unless you intentionally want to duplicate the gates outside the Task Contract.
@@ -31,3 +31,4 @@ You are running in **headless non-interactive mode** with a FINITE context windo
31
31
  (c) Use no version constraint (e.g., `"express": "*"`)
32
32
  - **This is a BLOCKING gate**: do NOT run `npm install` / `pip install` / `cargo build` / `go mod tidy` until ALL versions in the manifest have been verified or use open constraints
33
33
  - Batch version lookups: query multiple packages in parallel to save time (e.g., run multiple `npm view` commands concurrently)
34
+ 10. **Build artifact hygiene** — After any build/compile command (`go build`, `npm run build`, `tsc`, etc.), ensure the output binary, build directory, generated bundle, or cache directory is ignored by git before the commit phase. Never commit compiled binaries, build output, or generated artifacts.
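One way to check rule 10 before the commit phase is to ask git directly whether each build output is ignored. The sketch below is an illustration under assumptions: the helper names are hypothetical, the example paths are placeholders, and it relies only on `git check-ignore -q`, which exits with status 0 when the path is ignored.

```python
import subprocess


def git_is_ignored(path, cwd="."):
    """Return True if git reports `path` as ignored (exit status 0)."""
    completed = subprocess.run(
        ["git", "check-ignore", "-q", path],
        cwd=cwd,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    # Any non-zero status (not ignored, or git error) is treated as "not ignored".
    return completed.returncode == 0


def unignored_artifacts(paths, is_ignored=git_is_ignored):
    """Return the build outputs that still need a .gitignore entry."""
    return [path for path in paths if not is_ignored(path)]
```

For example, `unignored_artifacts(["dist", "bin/app"])` would list whichever of those placeholder paths git does not yet ignore; an empty list means the hygiene rule is satisfied for the paths checked.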
@@ -1,25 +1,30 @@
1
1
  <feature-context>
2
- ### Feature Description
2
+ ## Source Semantics
3
3
 
4
- {{FEATURE_DESCRIPTION}}
4
+ Use this material according to the following priority:
5
5
 
6
- {{USER_CONTEXT}}
6
+ 1. **Task Contract** — defines current scope and Verification Gates.
7
+ 2. **User Raw Context** — authoritative constraints, but not automatic scope expansion.
8
+ 3. **Completed Dependencies** — existing behavior and interfaces; do not re-implement them.
9
+ 4. **Project Brief** — product background and architecture alignment.
10
+ 5. **App Global Context** — stack, runtime, and testing conventions.
11
+ 6. **Project Conventions** — repository-specific rules.
7
12
 
8
- ### Acceptance Criteria
13
+ ### User Raw Context
9
14
 
10
- {{ACCEPTANCE_CRITERIA}}
15
+ {{USER_CONTEXT}}
11
16
 
12
- ### Project Brief
17
+ ### Completed Dependencies
13
18
 
14
- > Product ideas checklist from planning. Lines marked [x] are already implemented. When your feature touches any [ ] item, ensure alignment. After implementation, mark relevant items [x] and append the key file/directory paths.
19
+ > Use this section to understand available interfaces and avoid duplicating completed work.
15
20
 
16
- {{PROJECT_BRIEF}}
21
+ {{COMPLETED_DEPENDENCIES}}
17
22
 
18
- ### Dependencies (Already Completed)
23
+ ### Project Brief
19
24
 
20
- > Below are features that this feature depends on, along with their key implementation notes. Use this context to understand what has already been built and what interfaces/APIs/models are available.
25
+ > Use this section for alignment only. Do not treat unrelated backlog items as current feature scope.
21
26
 
22
- {{COMPLETED_DEPENDENCIES}}
27
+ {{PROJECT_BRIEF}}
23
28
 
24
29
  ### App Global Context
25
30
 
@@ -1,46 +1,37 @@
1
1
  ### Browser Verification — MANDATORY
2
2
 
3
- You MUST execute this phase. Do NOT skip it. Do NOT mark it as completed without actually running browser verification.
3
+ You MUST attempt this phase. Mark it as skipped only when no usable browser tool can be found, installed, or started.
4
4
 
5
5
  **Step 0 — Tool Selection (BLOCKING — decide before any browser action)**:
6
6
 
7
7
  0a. Check which browser tools are available:
8
8
  ```bash
9
9
  echo "=== playwright-cli ==="
10
- which playwright-cli 2>/dev/null && playwright-cli --version 2>/dev/null || echo "NOT_INSTALLED"
10
+ which playwright-cli 2>/dev/null && playwright-cli --version 2>/dev/null || echo "PLAYWRIGHT_CLI:NOT_INSTALLED"
11
11
  echo "=== opencli ==="
12
- which opencli 2>/dev/null && opencli --version 2>/dev/null || echo "NOT_INSTALLED"
12
+ which opencli 2>/dev/null && opencli --version 2>/dev/null || echo "OPENCLI:NOT_INSTALLED"
13
13
  ```
14
14
 
15
- 0b. If opencli is installed, verify Browser Bridge connectivity:
16
- ```bash
17
- opencli doctor 2>/dev/null || echo "OPENCLI_BRIDGE_FAILED"
18
- ```
19
-
20
- 0c. **Choose your tool** based on availability and scenario:
21
-
22
- | Condition | Use this tool |
23
- |-----------|--------------|
24
- | Only playwright-cli installed | `playwright-cli` |
25
- | Only opencli installed (and doctor passes) | `opencli` |
26
- | Both installed — verifying local dev server, forms, components | `playwright-cli` (isolated, deterministic) |
27
- | Both installed — feature needs real login state (OAuth/SSO) | `opencli` (reuses Chrome sessions) |
28
- | Both installed — verifying third-party integration pages | `opencli` (has logged-in cookies) |
29
- | Both installed — headless CI environment | `playwright-cli` (no Chrome dependency) |
30
- | Neither installed | Skip — log `## Browser Verification: SKIPPED — no browser tool available` |
15
+ 0b. Use this single decision tree:
16
+ 1. If `playwright-cli` is available, use it by default for local dev-server verification.
17
+ 2. Else if `opencli` is available, run `opencli doctor`; use `opencli` only if doctor passes.
18
+ 3. Else install `playwright-cli` as the default:
19
+ ```bash
20
+ npm install -g @playwright/cli@latest
21
+ playwright-cli --version
22
+ ```
23
+ 4. If no browser tool is usable after these steps, append `## Browser Verification: SKIPPED — no usable browser tool` to `context-snapshot.md`, then continue to reporting/checkpoint.
31
24
 
32
- If neither tool is available, install playwright-cli as the default:
33
- ```bash
34
- npm install -g @playwright/cli@latest
35
- playwright-cli --version
36
- ```
25
+ Use `opencli` instead of `playwright-cli` only when the scenario requires an existing Chrome login/session or third-party integration cookies.
37
26
 
38
27
  Record your choice:
39
28
  ```bash
40
- BROWSER_TOOL="playwright-cli" # or "opencli"
29
+ BROWSER_TOOL="playwright-cli" # or "opencli" or "skipped"
41
30
  echo "Selected browser tool: $BROWSER_TOOL"
42
31
  ```
43
32
 
33
+ If `BROWSER_TOOL="skipped"`, do NOT start a dev server and do NOT run browser commands. Go directly to Step 4 Reporting and the checkpoint update.
34
+
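The single decision tree in Step 0b can be read as a small selection function. The following Python sketch is one interpretation, not the template's own logic: the function names and the `needs_login_session` flag are hypothetical, and it assumes that a failing `opencli doctor` falls through to the install step. The commands it runs are the ones shown in the template.

```python
import shutil
import subprocess


def command_succeeds(argv):
    """Run a command quietly and report whether it exited with status 0."""
    try:
        completed = subprocess.run(
            argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
    except OSError:
        return False
    return completed.returncode == 0


def select_browser_tool(which=shutil.which, run_ok=command_succeeds,
                        needs_login_session=False):
    """Return 'playwright-cli', 'opencli', or 'skipped'."""
    has_playwright = which("playwright-cli") is not None
    has_opencli = which("opencli") is not None

    # Prefer opencli only when an existing Chrome login/session is required.
    if needs_login_session and has_opencli and run_ok(["opencli", "doctor"]):
        return "opencli"
    if has_playwright:
        return "playwright-cli"
    if has_opencli and run_ok(["opencli", "doctor"]):
        return "opencli"
    # Neither tool is usable yet: try installing the default.
    if run_ok(["npm", "install", "-g", "@playwright/cli@latest"]) and run_ok(
        ["playwright-cli", "--version"]
    ):
        return "playwright-cli"
    return "skipped"
```

Passing `which` and `run_ok` as parameters keeps the selection logic testable without touching the real environment; a `"skipped"` result corresponds to the case where no dev server is started and the flow goes straight to reporting.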
44
35
  ---
45
36
 
46
37
  **If you chose `playwright-cli`**, follow this workflow:
@@ -74,7 +65,7 @@ If `SKILL_MISSING`: run `playwright-cli install --skills`. If current platform i
74
65
  Use `playwright-cli snapshot` to discover element refs, then verify these goals:
75
66
  {{BROWSER_VERIFY_STEPS}}
76
67
 
77
- Construct your verification workflow based on: (1) the playwright-cli skill documentation, (2) the `--help` output, (3) the current task's acceptance criteria. Take a final screenshot: `playwright-cli screenshot`.
68
+ Construct your verification workflow based on: (1) the playwright-cli skill documentation, (2) the `--help` output, (3) the current task's Verification Gates. Take a final screenshot: `playwright-cli screenshot`.
78
69
 
79
70
  **Step 3 — Cleanup (playwright-cli)**:
80
71
  1. `playwright-cli close`
@@ -94,7 +94,7 @@ opencli browser click 12 && opencli browser wait time 1 && opencli browser state
94
94
 
95
95
  **IMPORTANT**: Always run `opencli browser state` after page-changing actions (open, click on links, scroll) to get fresh element indices. Never guess indices.
96
96
 
97
- Construct your verification workflow based on: (1) the `opencli browser --help` output, (2) the current task's acceptance criteria. Decide the concrete actions yourself. Take a final screenshot if needed: `opencli browser screenshot`.
97
+ Construct your verification workflow based on: (1) the `opencli browser --help` output, (2) the current task's Verification Gates. Decide the concrete actions yourself. Take a final screenshot if needed: `opencli browser screenshot`.
98
98
 
99
99
  **Step 3 — Cleanup (REQUIRED — you started it, you stop it)**:
100
100
 
@@ -113,7 +113,7 @@ Use `playwright-cli snapshot` on the running app to discover actual element refs
113
113
  Construct your verification workflow based on:
114
114
  1. The playwright-cli skill documentation (read in Step 0d)
115
115
  2. The `playwright-cli --help` output (captured in Step 0b)
116
- 3. The current task's acceptance criteria and implemented features
116
+ 3. The current task's Verification Gates and implemented features
117
117
 
118
118
  Decide the concrete playwright-cli actions (click, fill, snapshot, screenshot, etc.) yourself based on the snapshot output and your knowledge of the implemented code. The goals above describe WHAT to verify — you determine HOW using playwright-cli commands.
119
119
 
@@ -13,7 +13,7 @@ If MISSING — build it now:
13
13
  Identify the top-level source directories from the results.
14
14
  3. Scan the detected source directories for files related to this feature; read each one
15
15
  4. Write `.prizmkit/specs/{{FEATURE_SLUG}}/context-snapshot.md`:
16
- - **Section 1 — Feature Brief**: feature description + acceptance criteria (copy from above)
16
+ - **Section 1 — Task Contract**: Objective, scope rule, non-scope rule, and Verification Gates from the Task Contract above
17
17
  - **Section 2 — Project Structure**: run the following to get a visual directory tree, then paste output:
18
18
  ```bash
19
19
  find . -maxdepth 2 -type d -not -path '*/node_modules/*' -not -path '*/.git/*' -not -path '*/dist/*' -not -path '*/build/*' -not -path '*/__pycache__/*' -not -path '*/vendor/*' | sed -e 's;[^/]*/;|____;g;s;____|; |;g'
@@ -8,6 +8,16 @@ If CRITIC:MISSING — skip this phase entirely and proceed. Log: "Critic agent n
8
8
 
9
9
  **Choose ONE path based on `{{CRITIC_COUNT}}`:**
10
10
 
11
+ **Agent spawn failure policy**:
12
+ - If spawning Critic fails with team/config/lock errors, retry at most once.
13
+ - If the second attempt fails, do not keep spawning variants. Either create the required team once (when team tooling is available) or perform the plan challenge inline and write the required challenge report yourself.
14
+ - Record the fallback in the report; do not burn multiple minutes on repeated identical spawn failures.
15
+
16
+ **No silent report polling**:
17
+ - Do NOT run a long no-output loop waiting for `challenge-report*.md`.
18
+ - If you need to wait for a report file, use a short bounded check (≤120s) that prints elapsed time and reports present on every iteration.
19
+ - If reports are still missing after the bounded check, request one status update; if still missing, perform the missing challenge lens inline and continue.
20
+
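The "short bounded check" described above could look like the following Python sketch. It is an assumption-laden illustration: the function name, the 5-second interval, and the injectable `clock` and `sleep` parameters are hypothetical; only the 120-second cap and the requirement to print elapsed time and present reports on every iteration come from the template.

```python
import time
from pathlib import Path


def wait_for_reports(paths, timeout_s=120.0, interval_s=5.0,
                     clock=time.monotonic, sleep=time.sleep):
    """Poll for report files, printing progress each iteration.

    Returns True once every path exists, False when the time budget runs out.
    """
    paths = list(paths)
    start = clock()
    while True:
        elapsed = clock() - start
        present = [str(p) for p in paths if Path(p).is_file()]
        # Never wait silently: report elapsed time and which files exist.
        print(f"elapsed={elapsed:.0f}s present={present}")
        if len(present) == len(paths):
            return True
        if elapsed + interval_s > timeout_s:
            return False
        sleep(interval_s)
```

A `False` return is the point at which the template asks for one status update and then, if reports are still missing, an inline challenge rather than further waiting.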
11
21
  **If {{CRITIC_COUNT}} = 1 → Single Critic** (skip to CP-2.5 after this):
12
22
 
13
23
  **Spawn Agent**:
@@ -16,6 +16,16 @@ If CRITIC:MISSING — skip this phase entirely and proceed. Log: "Critic agent n
16
16
  **Prompt**:
17
17
  > {{AGENT_PROMPT_CRITIC_PLAN_CHALLENGE}}
18
18
 
19
+ **Agent spawn failure policy**:
20
+ - If spawning Critic fails with team/config/lock errors, retry at most once.
21
+ - If the second attempt fails, do not keep spawning variants. Either create the required team once (when team tooling is available) or perform the plan challenge inline and write `challenge-report.md` yourself.
22
+ - Record the fallback in the report; do not burn multiple minutes on repeated identical spawn failures.
23
+
24
+ **No silent report polling**:
25
+ - Do NOT run a long no-output loop waiting for `challenge-report.md`.
26
+ - If you need to wait for the report file, use a short bounded check (≤120s) that prints elapsed time and whether the report exists on every iteration.
27
+ - If the report is still missing after the bounded check, request one status update; if still missing, perform the challenge inline and continue.
28
+
19
29
  Wait for Critic to return.
20
30
  - Read challenge-report.md. For items marked CRITICAL/HIGH: decide whether to adjust plan.md or document why the plan stands.
21
31
  - Max 1 plan revision round.
@@ -1,13 +1,6 @@
  ### Implement — Dev Subagent
 
- **Build artifacts rule** (passed to Dev): After any build/compile command (`go build`, `npm run build`, `tsc`, etc.), ensure the output binary or build directory is in `.gitignore`. Never commit compiled binaries, build output, or generated artifacts.
-
- **Dependency version gate (BLOCKING — pass to Dev agent)**: Before running ANY package install command (`npm install`, `pip install`, `cargo build`, `go mod tidy`, `bundle install`, etc.):
- 1. Every version number in the dependency manifest MUST be verified against the real registry (see Context Budget Rules §9)
- 2. If a scaffold tool generated a `package.json` / `requirements.txt` / etc., verify the versions it wrote too — scaffold tools can emit outdated versions
- 3. Do NOT proceed with install until all versions are confirmed real. Violation = wasted timeout cycles that can crash the session
-
- **Scaffold file rule (pass to Dev agent)**: After running any init/scaffold command, record generated files in context-snapshot.md under `### Scaffold Files (do not re-read)`. Never re-read these files — their content is standard boilerplate (see Context Budget Rules §8). When spawning subagents, explicitly list scaffold files so they skip reading them.
+ **Protocol handoff to Dev**: The Dev prompt already carries the required subset of Context Budget Rules, the Test Failure Recovery Protocol, and Task Contract / Verification Gate constraints. Do not duplicate those rules here; verify Dev output against the gates below.
 
  **Spawn Agent**:
  | Parameter | Value |
@@ -15,10 +8,22 @@
  | subagent_type | prizm-dev-team-dev |
  | run_in_background | false |
 
+ **Agent spawn failure policy**:
+ - If spawning Dev fails with team/config/lock errors, retry at most once.
+ - If the second attempt fails, do not enter Implementation Log polling or repeated recovery spawn loops.
+ - Use the documented inline/recovery fallback: write `failure-log.md` with the spawn error and last observable state, then either complete remaining tasks directly in the orchestrator or stop with a clear failure for recovery.
+ - Apply the same cap to Dev re-spawns for Implementation Log repair or resume prompts; do not burn multiple minutes on identical team/config/lock failures.
+
  **Prompt**:
  > {{AGENT_PROMPT_DEV_IMPLEMENT}}
 
- Wait for Dev to return. All tasks must be `[x]`, tests pass.
+ Wait for Dev to return. Implementation may proceed only when tasks are complete and the Test Failure Recovery Protocol's Success Rule is satisfied.
+
+ **No silent artifact polling**:
+ - Do NOT run a long no-output loop that only waits for `## Implementation Log` or any other file marker.
+ - If you must wait for Dev after spawning or sending a status request, use short bounded checks (≤120s) that print a heartbeat line each iteration with: elapsed time, remaining unchecked task count, whether `## Implementation Log` exists, and whether `git diff --stat` changed.
+ - If Dev has no transcript/file/diff progress for one bounded check, send one status request. If there is still no progress on the next bounded check, stop waiting, write `failure-log.md` with the last observable state, and follow Subagent Timeout Recovery.
+ - Prefer the Agent tool's completion notification or Dev's `COMPLETION_SIGNAL`; file presence alone is not a liveness signal.
 
  **Gate Check — Implementation Log**:
  After Dev agent returns, verify the Implementation Log was written:
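
The "Agent spawn failure policy" added in the hunk above caps spawning at one retry and then writes `failure-log.md` instead of trying further variants. A minimal Python sketch under stated assumptions: `spawn` is a hypothetical zero-argument callable standing in for the Agent tool, and it is assumed to raise `RuntimeError` on team/config/lock errors. Neither the function name nor the log layout comes from the package.

```python
from pathlib import Path


def spawn_with_single_retry(spawn, failure_log, max_attempts=2):
    """Call spawn() at most twice (one try plus one retry).

    Returns spawn()'s result on success. If every attempt fails,
    writes the collected errors to failure_log and returns None.
    """
    errors = []
    for attempt in range(1, max_attempts + 1):
        try:
            return spawn()
        except RuntimeError as exc:  # assumed error type for spawn failures
            errors.append(f"attempt {attempt}: {exc}")
    # Both attempts failed: record the errors rather than spawning variants.
    body = "# Spawn failure\n\n" + "\n".join(f"- {e}" for e in errors) + "\n"
    Path(failure_log).write_text(body, encoding="utf-8")
    return None
```

A `None` result is where the caller would either finish the remaining tasks directly or stop with a clear failure, as the policy text describes.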
@@ -1,13 +1,6 @@
  ### Implement — Dev Agent
 
- **Build artifacts rule** (passed to Dev): After any build/compile command (`go build`, `npm run build`, `tsc`, etc.), ensure the output binary or build directory is in `.gitignore`. Never commit compiled binaries, build output, or generated artifacts.
-
- **Dependency version gate (BLOCKING — pass to Dev agent)**: Before running ANY package install command (`npm install`, `pip install`, `cargo build`, `go mod tidy`, `bundle install`, etc.):
- 1. Every version number in the dependency manifest MUST be verified against the real registry (see Context Budget Rules §9)
- 2. If a scaffold tool generated a `package.json` / `requirements.txt` / etc., verify the versions it wrote too — scaffold tools can emit outdated versions
- 3. Do NOT proceed with install until all versions are confirmed real. Violation = wasted timeout cycles that can crash the session
-
- **Scaffold file rule (pass to Dev agent)**: After running any init/scaffold command, record generated files in context-snapshot.md under `### Scaffold Files (do not re-read)`. Never re-read these files — their content is standard boilerplate (see Context Budget Rules §8). When spawning subagents, explicitly list scaffold files so they skip reading them.
+ **Protocol handoff to Dev**: The Dev prompt already carries the required subset of Context Budget Rules, the Test Failure Recovery Protocol, and Task Contract / Verification Gate constraints. Do not duplicate those rules here; verify Dev output against the gates below.
 
  Before spawning Dev, check plan.md Tasks section:
  ```bash
@@ -22,9 +15,21 @@ grep -c '^\- \[ \]' .prizmkit/specs/{{FEATURE_SLUG}}/plan.md 2>/dev/null || true
  | subagent_type | prizm-dev-team-dev |
  | run_in_background | false |
 
+ **Agent spawn failure policy**:
+ - If spawning Dev fails with team/config/lock errors, retry at most once.
+ - If the second attempt fails, do not enter Implementation Log polling or repeated recovery spawn loops.
+ - Use the documented inline/recovery fallback: write `failure-log.md` with the spawn error and last observable state, then either complete remaining tasks directly in the orchestrator or stop with a clear failure for recovery.
+ - Apply the same cap to Dev re-spawns for Implementation Log repair or resume prompts; do not burn multiple minutes on identical team/config/lock failures.
+
  **Prompt**:
  > {{AGENT_PROMPT_DEV_IMPLEMENT}}
 
+ **No silent artifact polling**:
+ - Do NOT run a long no-output loop that only waits for `## Implementation Log` or any other file marker.
+ - If you must wait for Dev after spawning or sending a status request, use short bounded checks (≤120s) that print a heartbeat line each iteration with: elapsed time, remaining unchecked task count, whether `## Implementation Log` exists, and whether `git diff --stat` changed.
+ - If Dev has no transcript/file/diff progress for one bounded check, send one status request. If there is still no progress on the next bounded check, stop waiting, write `failure-log.md` with the last observable state, and follow Subagent Timeout Recovery.
+ - Prefer the Agent tool's completion notification or Dev's `COMPLETION_SIGNAL`; file presence alone is not a liveness signal.
+
  **Gate Check — Implementation Log**:
  After Dev agent returns, verify the Implementation Log was written:
  ```bash
@@ -38,7 +43,7 @@ Wait for Dev to return. **If Dev times out before all tasks are `[x]`**:
  > {{AGENT_PROMPT_DEV_RESUME}}
  3. Max 2 recovery retries. After 2 failures, orchestrator implements remaining tasks directly.
 
- All tasks `[x]`, tests pass.
+ Implementation phase is complete only when all tasks are `[x]` and the Test Failure Recovery Protocol's Success Rule is satisfied.
 
 
  **Checkpoint update**: Run the update script to set step `prizmkit-implement` to `"completed"`:
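
The "No silent artifact polling" bullets in this file's hunks combine a heartbeat line with a two-strike stall rule: one bounded check without progress triggers a status request, and a second one ends the wait. A minimal Python sketch of both pieces follows. The function names, the heartbeat field labels, and the action strings are hypothetical; the four heartbeat fields and the escalation order are taken from the bullets.

```python
def heartbeat_line(elapsed_s, unchecked_tasks, log_exists, diff_changed):
    """Format one heartbeat line carrying the four fields the rule lists."""
    return (
        f"[heartbeat] elapsed={elapsed_s}s "
        f"unchecked_tasks={unchecked_tasks} "
        f"implementation_log={'yes' if log_exists else 'no'} "
        f"diff_changed={'yes' if diff_changed else 'no'}"
    )


def next_wait_action(progress_seen, status_request_sent):
    """Decide what to do after one bounded check has finished."""
    if progress_seen:
        return "keep_waiting"
    if not status_request_sent:
        # First check without progress: ask Dev for status once.
        return "send_status_request"
    # Still no progress after the status request: stop waiting.
    return "stop_and_write_failure_log"
```

Keeping the decision in a pure function makes the escalation order easy to check without spawning anything.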
@@ -1,18 +1,9 @@
  ### Implement + Test
 
- **Build artifacts**: After any build/compile command (`go build`, `npm run build`, `tsc`, etc.), ensure the output binary or build directory is in `.gitignore`:
- ```bash
- # Example for Go
- grep -q '^/binary-name$' .gitignore || echo '/binary-name' >> .gitignore
- ```
- Never commit compiled binaries, build output, or generated artifacts.
-
- **Dependency version gate (BLOCKING)**: Before running ANY package install command (`npm install`, `pip install`, `cargo build`, `go mod tidy`, `bundle install`, etc.):
- 1. Every version number in your dependency manifest MUST be verified against the real registry (see Context Budget Rules §9)
- 2. If you used a scaffold tool that generated a `package.json` / `requirements.txt` / etc., verify the versions it wrote too — scaffold tools can emit outdated versions
- 3. Do NOT proceed with install until all versions are confirmed real. Violation = wasted timeout cycles
-
- **Scaffold file rule**: After running any init/scaffold command, record generated files in context-snapshot.md under `### Scaffold Files (do not re-read)`. Never re-read these files — their content is standard boilerplate (see Context Budget Rules §8).
+ **Protocol references**:
+ - Follow Context Budget Rules §8 for scaffold/generated files.
+ - Follow Context Budget Rules §9 before package install/build commands that resolve dependencies.
+ - Follow Context Budget Rules §10 after build/compile commands.
 
  **3a.** Detect test commands and record baseline:
 
@@ -32,11 +23,11 @@ You know this project's tech stack. Identify ALL test commands that apply (e.g.,
 
  **3c.** After implement completes, verify:
  1. All tasks in plan.md are `[x]`
- 2. Run the full test suite to ensure nothing is broken
- 3. Verify each acceptance criterion from Section 1 of context-snapshot.md is met — check mentally, do NOT re-read files you already wrote
- 4. If any criterion is not met, fix it now using the convergence-based recovery loop (see Test Failure Recovery Protocol)
+ 2. Run the full test suite to ensure no new regressions remain
+ 3. Verify each Verification Gate from the Task Contract — check mentally, do NOT re-read files you already wrote
+ 4. If any gate is unmet or blocked, follow the Test Failure Recovery Protocol
 
- **CP-2**: All acceptance criteria met, all tests pass.
+ **CP-2**: Implementation may proceed only when all tasks are `[x]` and the Test Failure Recovery Protocol's Success Rule is satisfied. Blocked gates must be documented in `failure-log.md` and are not success.
 
 
  **Checkpoint update**: Run the update script to set step `prizmkit-implement` to `"completed"`:
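
The revised CP-2 in this hunk has three parts: no unchecked tasks in plan.md, the Success Rule satisfied, and no blocked gates counted as success. A minimal Python sketch of that combination follows. The regex mirrors the `grep -c '^\- \[ \]'` pattern visible in an earlier hunk header of this diff. The function names are hypothetical, and because the Success Rule's text is not part of this diff, it is modelled only as a boolean the caller supplies.

```python
import re

# Same pattern as the grep in the diff: an unchecked task at line start.
UNCHECKED = re.compile(r"^- \[ \]", re.MULTILINE)


def unchecked_task_count(plan_text):
    """Count top-level '- [ ]' lines, like grep -c on plan.md."""
    return len(UNCHECKED.findall(plan_text))


def cp2_ready(plan_text, success_rule_satisfied, blocked_gates=()):
    """True only when tasks are done, the rule holds, and nothing is blocked."""
    return (
        unchecked_task_count(plan_text) == 0
        and bool(success_rule_satisfied)
        and not blocked_gates
    )
```

Like the grep, the anchored pattern does not count indented checkboxes, so nested subtasks would need a looser pattern if they should block the checkpoint.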
@@ -5,7 +5,7 @@ ls .prizmkit/specs/{{FEATURE_SLUG}}/ 2>/dev/null
  ```
 
  If plan.md missing, run `/prizmkit-plan` with `artifact_dir=.prizmkit/specs/{{FEATURE_SLUG}}/`:
- - Pass the feature description and acceptance criteria from the Feature Context section above as input
+ - Pass the Objective and Verification Gates from the Task Contract as input
  - The plan.md should include: key components, data flow, files to create/modify, and a Tasks section with `[ ]` checkboxes (each task = one implementable unit). Keep under 80 lines.
  - Resolve any `[NEEDS CLARIFICATION]` markers using the feature description — do NOT pause for interactive input.
 
@@ -9,7 +9,11 @@ After `/prizmkit-code-review` returns, verify the review report:
  ```bash
  grep -q "## Verdict" .prizmkit/specs/{{FEATURE_SLUG}}/review-report.md && echo "GATE:PASS" || echo "GATE:MISSING"
  ```
- If GATE:MISSING — re-run `/prizmkit-code-review`.
+ If GATE:MISSING:
+ - Do not re-run `/prizmkit-code-review` in an unbounded report-repair loop.
+ - Perform one bounded status check; retry at most once: inspect the skill output, `review-report.md` path, and any Reviewer/Dev spawn messages.
+ - If the missing report is caused by team/config/lock errors from the internal Reviewer/Dev agent spawn, retry `/prizmkit-code-review` at most once only if it appears transient.
+ - If the report is still missing after that single check/retry, write `failure-log.md` with the spawn/skill error and last observable state, then either perform a safe inline fallback review (spec/plan/diff/tests → write `review-report.md` with `## Verdict`) or stop with a clear recovery failure.
 
  Read `review-report.md` and check the Verdict:
  - `PASS` → proceed to next phase
@@ -9,7 +9,11 @@ After `/prizmkit-code-review` returns, verify the review report:
  ```bash
  grep -q "## Verdict" .prizmkit/specs/{{FEATURE_SLUG}}/review-report.md && echo "GATE:PASS" || echo "GATE:MISSING"
  ```
- If GATE:MISSING — re-run `/prizmkit-code-review`.
+ If GATE:MISSING:
+ - Do not re-run `/prizmkit-code-review` in an unbounded report-repair loop.
+ - Perform one bounded status check; retry at most once: inspect the skill output, `review-report.md` path, and any Reviewer/Dev spawn messages.
+ - If the missing report is caused by team/config/lock errors from the internal Reviewer/Dev agent spawn, retry `/prizmkit-code-review` at most once only if it appears transient.
+ - If the report is still missing after that single check/retry, write `failure-log.md` with the spawn/skill error and last observable state, then either perform a safe inline fallback review (spec/plan/diff/tests → write `review-report.md` with `## Verdict`) or stop with a clear recovery failure.
 
  Read `review-report.md` and check the Verdict:
  - `PASS` → proceed to next phase
@@ -17,7 +21,7 @@ Read `review-report.md` and check the Verdict:
 
  Run the full test suite: `({{TEST_CMD}}) 2>&1 | tee /tmp/review-test-out.txt | tail -20`
 
- **CP-3**: Review complete, tests pass, report written.
+ **CP-3**: Review complete, report written, and the Test Failure Recovery Protocol's Success Rule is satisfied.
 
 
  **Checkpoint update**: Run the update script to set step `prizmkit-code-review` to `"completed"`:
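
The review gate in the hunks above keys off the `grep -q "## Verdict"` one-liner, and the new GATE:MISSING bullets allow a single retry before falling back. A Python equivalent of the check itself can be sketched as below. The function name `verdict_gate` is hypothetical; the two return strings and the `## Verdict` marker are copied from the diff.

```python
from pathlib import Path


def verdict_gate(report_path):
    """Return "GATE:PASS" if the report contains a "## Verdict" heading.

    A missing or unreadable file counts as "GATE:MISSING", matching how
    the grep one-liner falls through to its || branch.
    """
    try:
        text = Path(report_path).read_text(encoding="utf-8")
    except OSError:
        return "GATE:MISSING"
    return "GATE:PASS" if "## Verdict" in text else "GATE:MISSING"
```

Treating a missing file and a file without the heading the same way keeps the caller's branch count at two, which is all the GATE:MISSING bullets distinguish.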
@@ -19,7 +19,7 @@ ls .prizmkit/specs/{{FEATURE_SLUG}}/ 2>/dev/null
  Identify the top-level source directories from the results.
  3. Scan the detected source directories for files related to this feature; read each one
  4. Write `.prizmkit/specs/{{FEATURE_SLUG}}/context-snapshot.md`:
- - **Section 1 — Task Brief**: task description + acceptance criteria (copy from above)
+ - **Section 1 — Task Contract**: Objective, scope rule, non-scope rule, and Verification Gates from the Task Contract above
  - **Section 2 — Project Structure**: run the following to get a visual directory tree, then paste output:
  ```bash
  find . -maxdepth 2 -type d -not -path '*/node_modules/*' -not -path '*/.git/*' -not -path '*/dist/*' -not -path '*/build/*' -not -path '*/__pycache__/*' -not -path '*/vendor/*' | sed -e 's;[^/]*/;|____;g;s;____|; |;g'