@zigrivers/scaffold 2.38.1 → 2.44.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (201)
  1. package/README.md +10 -7
  2. package/dist/cli/commands/build.js +4 -4
  3. package/dist/cli/commands/build.js.map +1 -1
  4. package/dist/cli/commands/check.test.js +11 -8
  5. package/dist/cli/commands/check.test.js.map +1 -1
  6. package/dist/cli/commands/complete.d.ts.map +1 -1
  7. package/dist/cli/commands/complete.js +2 -1
  8. package/dist/cli/commands/complete.js.map +1 -1
  9. package/dist/cli/commands/complete.test.js +4 -1
  10. package/dist/cli/commands/complete.test.js.map +1 -1
  11. package/dist/cli/commands/dashboard.js +4 -4
  12. package/dist/cli/commands/dashboard.js.map +1 -1
  13. package/dist/cli/commands/knowledge.js +2 -2
  14. package/dist/cli/commands/knowledge.js.map +1 -1
  15. package/dist/cli/commands/knowledge.test.js +5 -12
  16. package/dist/cli/commands/knowledge.test.js.map +1 -1
  17. package/dist/cli/commands/list.d.ts +1 -1
  18. package/dist/cli/commands/list.d.ts.map +1 -1
  19. package/dist/cli/commands/list.js +84 -3
  20. package/dist/cli/commands/list.js.map +1 -1
  21. package/dist/cli/commands/list.test.js +82 -0
  22. package/dist/cli/commands/list.test.js.map +1 -1
  23. package/dist/cli/commands/next.test.js +4 -1
  24. package/dist/cli/commands/next.test.js.map +1 -1
  25. package/dist/cli/commands/reset.d.ts.map +1 -1
  26. package/dist/cli/commands/reset.js +5 -2
  27. package/dist/cli/commands/reset.js.map +1 -1
  28. package/dist/cli/commands/reset.test.js +4 -1
  29. package/dist/cli/commands/reset.test.js.map +1 -1
  30. package/dist/cli/commands/rework.d.ts.map +1 -1
  31. package/dist/cli/commands/rework.js +3 -2
  32. package/dist/cli/commands/rework.js.map +1 -1
  33. package/dist/cli/commands/run.d.ts.map +1 -1
  34. package/dist/cli/commands/run.js +28 -13
  35. package/dist/cli/commands/run.js.map +1 -1
  36. package/dist/cli/commands/run.test.js +1 -1
  37. package/dist/cli/commands/run.test.js.map +1 -1
  38. package/dist/cli/commands/skip.d.ts.map +1 -1
  39. package/dist/cli/commands/skip.js +2 -1
  40. package/dist/cli/commands/skip.js.map +1 -1
  41. package/dist/cli/commands/skip.test.js +4 -1
  42. package/dist/cli/commands/skip.test.js.map +1 -1
  43. package/dist/cli/commands/status.d.ts.map +1 -1
  44. package/dist/cli/commands/status.js +88 -4
  45. package/dist/cli/commands/status.js.map +1 -1
  46. package/dist/cli/commands/version.d.ts.map +1 -1
  47. package/dist/cli/commands/version.js +19 -3
  48. package/dist/cli/commands/version.js.map +1 -1
  49. package/dist/cli/commands/version.test.js +41 -0
  50. package/dist/cli/commands/version.test.js.map +1 -1
  51. package/dist/cli/output/context.test.js +14 -13
  52. package/dist/cli/output/context.test.js.map +1 -1
  53. package/dist/cli/output/interactive.js +4 -4
  54. package/dist/cli/output/json.d.ts +1 -0
  55. package/dist/cli/output/json.d.ts.map +1 -1
  56. package/dist/cli/output/json.js +14 -1
  57. package/dist/cli/output/json.js.map +1 -1
  58. package/dist/config/loader.d.ts.map +1 -1
  59. package/dist/config/loader.js +10 -3
  60. package/dist/config/loader.js.map +1 -1
  61. package/dist/config/loader.test.js +28 -0
  62. package/dist/config/loader.test.js.map +1 -1
  63. package/dist/core/assembly/engine.d.ts.map +1 -1
  64. package/dist/core/assembly/engine.js +6 -1
  65. package/dist/core/assembly/engine.js.map +1 -1
  66. package/dist/e2e/init.test.js +3 -0
  67. package/dist/e2e/init.test.js.map +1 -1
  68. package/dist/index.js +2 -1
  69. package/dist/index.js.map +1 -1
  70. package/dist/project/adopt.test.js +3 -0
  71. package/dist/project/adopt.test.js.map +1 -1
  72. package/dist/project/claude-md.d.ts.map +1 -1
  73. package/dist/project/claude-md.js +2 -1
  74. package/dist/project/claude-md.js.map +1 -1
  75. package/dist/project/detector.js +3 -3
  76. package/dist/project/detector.js.map +1 -1
  77. package/dist/project/signals.d.ts +1 -0
  78. package/dist/project/signals.d.ts.map +1 -1
  79. package/dist/state/decision-logger.d.ts.map +1 -1
  80. package/dist/state/decision-logger.js +7 -4
  81. package/dist/state/decision-logger.js.map +1 -1
  82. package/dist/state/lock-manager.js +1 -1
  83. package/dist/state/lock-manager.js.map +1 -1
  84. package/dist/state/lock-manager.test.js +27 -3
  85. package/dist/state/lock-manager.test.js.map +1 -1
  86. package/dist/state/state-manager.d.ts.map +1 -1
  87. package/dist/state/state-manager.js +6 -0
  88. package/dist/state/state-manager.js.map +1 -1
  89. package/dist/state/state-manager.test.js +7 -0
  90. package/dist/state/state-manager.test.js.map +1 -1
  91. package/dist/types/assembly.d.ts +2 -0
  92. package/dist/types/assembly.d.ts.map +1 -1
  93. package/dist/utils/eligible.d.ts +8 -0
  94. package/dist/utils/eligible.d.ts.map +1 -0
  95. package/dist/utils/eligible.js +36 -0
  96. package/dist/utils/eligible.js.map +1 -0
  97. package/dist/validation/config-validator.test.js +15 -13
  98. package/dist/validation/config-validator.test.js.map +1 -1
  99. package/dist/validation/index.test.js +1 -1
  100. package/dist/wizard/wizard.d.ts.map +1 -1
  101. package/dist/wizard/wizard.js +1 -0
  102. package/dist/wizard/wizard.js.map +1 -1
  103. package/dist/wizard/wizard.test.js +2 -0
  104. package/dist/wizard/wizard.test.js.map +1 -1
  105. package/knowledge/core/automated-review-tooling.md +4 -4
  106. package/knowledge/core/eval-craft.md +44 -0
  107. package/knowledge/core/multi-model-review-dispatch.md +8 -0
  108. package/knowledge/core/system-architecture.md +39 -0
  109. package/knowledge/core/task-decomposition.md +53 -0
  110. package/knowledge/core/testing-strategy.md +160 -0
  111. package/knowledge/finalization/implementation-playbook.md +24 -7
  112. package/knowledge/product/prd-craft.md +41 -0
  113. package/knowledge/review/review-adr.md +1 -1
  114. package/knowledge/review/review-api-design.md +1 -1
  115. package/knowledge/review/review-database-design.md +1 -1
  116. package/knowledge/review/review-domain-modeling.md +1 -1
  117. package/knowledge/review/review-implementation-tasks.md +1 -1
  118. package/knowledge/review/review-methodology.md +1 -1
  119. package/knowledge/review/review-operations.md +1 -1
  120. package/knowledge/review/review-prd.md +1 -1
  121. package/knowledge/review/review-security.md +1 -1
  122. package/knowledge/review/review-system-architecture.md +1 -1
  123. package/knowledge/review/review-testing-strategy.md +1 -1
  124. package/knowledge/review/review-user-stories.md +1 -1
  125. package/knowledge/review/review-ux-specification.md +1 -1
  126. package/knowledge/review/review-vision.md +1 -1
  127. package/knowledge/tools/post-implementation-review-methodology.md +107 -0
  128. package/knowledge/validation/critical-path-analysis.md +13 -0
  129. package/knowledge/validation/implementability-review.md +14 -0
  130. package/package.json +2 -1
  131. package/pipeline/architecture/review-architecture.md +8 -5
  132. package/pipeline/architecture/system-architecture.md +9 -3
  133. package/pipeline/build/multi-agent-resume.md +21 -7
  134. package/pipeline/build/multi-agent-start.md +22 -7
  135. package/pipeline/build/new-enhancement.md +20 -12
  136. package/pipeline/build/quick-task.md +18 -11
  137. package/pipeline/build/single-agent-resume.md +20 -6
  138. package/pipeline/build/single-agent-start.md +24 -8
  139. package/pipeline/consolidation/claude-md-optimization.md +8 -4
  140. package/pipeline/consolidation/workflow-audit.md +9 -5
  141. package/pipeline/decisions/adrs.md +7 -3
  142. package/pipeline/decisions/review-adrs.md +8 -5
  143. package/pipeline/environment/ai-memory-setup.md +6 -2
  144. package/pipeline/environment/automated-pr-review.md +79 -12
  145. package/pipeline/environment/design-system.md +9 -6
  146. package/pipeline/environment/dev-env-setup.md +8 -5
  147. package/pipeline/environment/git-workflow.md +16 -13
  148. package/pipeline/finalization/apply-fixes-and-freeze.md +10 -5
  149. package/pipeline/finalization/developer-onboarding-guide.md +10 -3
  150. package/pipeline/finalization/implementation-playbook.md +13 -4
  151. package/pipeline/foundation/beads.md +8 -5
  152. package/pipeline/foundation/coding-standards.md +13 -10
  153. package/pipeline/foundation/project-structure.md +16 -13
  154. package/pipeline/foundation/tdd.md +9 -4
  155. package/pipeline/foundation/tech-stack.md +7 -5
  156. package/pipeline/integration/add-e2e-testing.md +12 -8
  157. package/pipeline/modeling/domain-modeling.md +9 -7
  158. package/pipeline/modeling/review-domain-modeling.md +8 -6
  159. package/pipeline/parity/platform-parity-review.md +9 -6
  160. package/pipeline/planning/implementation-plan-review.md +10 -7
  161. package/pipeline/planning/implementation-plan.md +41 -9
  162. package/pipeline/pre/create-prd.md +7 -4
  163. package/pipeline/pre/innovate-prd.md +12 -8
  164. package/pipeline/pre/innovate-user-stories.md +10 -7
  165. package/pipeline/pre/review-prd.md +12 -10
  166. package/pipeline/pre/review-user-stories.md +12 -9
  167. package/pipeline/pre/user-stories.md +7 -4
  168. package/pipeline/quality/create-evals.md +6 -3
  169. package/pipeline/quality/operations.md +7 -3
  170. package/pipeline/quality/review-operations.md +12 -5
  171. package/pipeline/quality/review-security.md +11 -6
  172. package/pipeline/quality/review-testing.md +11 -6
  173. package/pipeline/quality/security.md +6 -2
  174. package/pipeline/quality/story-tests.md +14 -9
  175. package/pipeline/specification/api-contracts.md +9 -3
  176. package/pipeline/specification/database-schema.md +8 -2
  177. package/pipeline/specification/review-api.md +10 -4
  178. package/pipeline/specification/review-database.md +8 -3
  179. package/pipeline/specification/review-ux.md +9 -3
  180. package/pipeline/specification/ux-spec.md +9 -4
  181. package/pipeline/validation/critical-path-walkthrough.md +10 -5
  182. package/pipeline/validation/cross-phase-consistency.md +9 -4
  183. package/pipeline/validation/decision-completeness.md +8 -3
  184. package/pipeline/validation/dependency-graph-validation.md +8 -3
  185. package/pipeline/validation/implementability-dry-run.md +9 -5
  186. package/pipeline/validation/scope-creep-check.md +11 -6
  187. package/pipeline/validation/traceability-matrix.md +10 -5
  188. package/pipeline/vision/create-vision.md +7 -4
  189. package/pipeline/vision/innovate-vision.md +11 -8
  190. package/pipeline/vision/review-vision.md +15 -12
  191. package/skills/multi-model-dispatch/SKILL.md +6 -5
  192. package/skills/scaffold-runner/SKILL.md +47 -3
  193. package/tools/dashboard.md +53 -0
  194. package/tools/post-implementation-review.md +655 -0
  195. package/tools/prompt-pipeline.md +160 -0
  196. package/tools/release.md +435 -0
  197. package/tools/review-pr.md +229 -0
  198. package/tools/session-analyzer.md +299 -0
  199. package/tools/update.md +113 -0
  200. package/tools/version-bump.md +290 -0
  201. package/tools/version.md +82 -0
@@ -0,0 +1,107 @@
1
+ ---
2
+ name: post-implementation-review-methodology
3
+ description: Two-phase whole-codebase review methodology for post-implementation quality validation
4
+ topics: [review, code-review, multi-model, post-implementation, methodology]
5
+ ---
6
+
7
+ # Post-Implementation Review Methodology
8
+
9
+ A systematic approach for reviewing an entire scaffold-generated codebase after
10
+ an AI agent has completed all implementation tasks. Differs from PR review in
11
+ that it covers the full codebase against requirements, not just a diff.
12
+
13
+ ## Summary
14
+
15
+ Post-implementation review is a whole-codebase quality validation that runs after all implementation tasks are complete. It uses two sequential phases — a cross-cutting systemic sweep followed by a parallel user-story review — because cross-cutting issues (security, error handling, architecture alignment) must be identified and framed before diving into feature-level requirement satisfaction. Running cross-cutting first sets the context for every downstream fix.
16
+
17
+ ## Deep Guidance
18
+
19
+ ## Why Two Phases
20
+
21
+ Cross-cutting issues — security architecture, error handling patterns, test
22
+ coverage gaps — must be identified before diving into feature-level review.
23
+ Fixing a systemic security pattern affects how you write feature-level fixes.
24
+ Running cross-cutting first sets the frame for everything that follows.
25
+
26
+ Phase 1 catches what story-level review misses (systemic problems).
27
+ Phase 2 catches what cross-cutting review misses (requirement satisfaction gaps).
28
+
29
+ ## Phase 1: Cross-Cutting Sweep
30
+
31
+ Review the whole codebase for systemic concerns:
32
+
33
+ | Category | What to Check |
34
+ |----------|---------------|
35
+ | Architecture alignment | Does code match architecture docs and ADRs? Are layers respected? |
36
+ | Security | Auth, input validation, secrets in code, OWASP Top 10 |
37
+ | Error handling | Consistent patterns? Errors swallowed silently? |
38
+ | Test coverage | Critical paths tested? Obvious gaps in high-risk code? |
39
+ | Complexity | Over-engineered areas, dead code, unnecessary abstractions |
40
+ | Dependencies | Unused deps, obviously outdated packages |
41
+
42
+ ### Context Bundle for CLI Channels
43
+
44
+ Codex and Gemini cannot read files directly. Build a context bundle:
45
+
46
+ 1. Full file tree (excluding node_modules, .git, dist, build, coverage)
47
+ 2. Architecture docs (docs/architecture.md, docs/adrs/*.md if present)
48
+ 3. Coding standards (docs/coding-standards.md)
49
+ 4. Up to 15 strategically selected files:
50
+ - Entry points (main.*, index.*, app.*, server.* at root/src level)
51
+ - Core services (src/services/, src/lib/, src/core/)
52
+ - Auth layer (files with auth, login, session, token in name/path)
53
+ - Database layer (files with db, model, schema, migration in name/path)
54
+ - 2-3 test files from different areas
55
+
56
+ Superpowers code-reviewer subagent has full tool access and reads files
57
+ directly — no bundling needed.
58
+
59
+ ## Phase 2: Parallel User Story Review
60
+
61
+ Use docs/user-stories.md as the organizing manifest. For each story:
62
+
63
+ 1. Parse the story title, description, and acceptance criteria
64
+ 2. Map the story to relevant code files:
65
+ - Read acceptance criteria for domain keywords
66
+ - Match keywords to file/directory names in the codebase
67
+ - Include files from the same module as matched files
68
+ - When uncertain, include more files rather than fewer
69
+ 3. Dispatch a parallel subagent per story (or thematic group for small projects)
70
+ 4. Each subagent runs all three channels independently on its story's files
71
+
72
+ ### Grouping Rules
73
+
74
+ - **Small project (fewer than 5 stories):** Group into 2-3 thematic batches
75
+ - **Normal (5-20 stories):** One subagent per story
76
+ - **Large story (maps to more than 20 files):** The subagent splits its review
77
+ by layer (backend files first, frontend second) within a single subagent
78
+
79
+ ## Phase 3: Finding Consolidation & Fix Execution
80
+
81
+ 1. Flatten all findings from all channels across both phases into one list
82
+ 2. Deduplicate: same `file` + matching issue type/description = one finding;
83
+ record all source channels in a `sources` array
84
+ 3. Multi-source (2+ channels): tag as `high_confidence: true`
85
+ 4. Sort: P0 → P1 → P2 → P3
86
+ 5. P3 findings go into the report but NOT into the fix queue
87
+
88
+ ## Update Mode
89
+
90
+ When docs/reviews/post-implementation-review.md already exists and
91
+ --report-only is not set:
92
+
93
+ - Load prior findings directly — skip Phase 1 and Phase 2
94
+ - Surface previously-unresolved findings (those in "Remaining Findings") to
95
+ the user immediately before starting fix execution
96
+ - Only retry a previously-failed finding if the user explicitly says to
97
+
98
+ This shortcut is safe because the user ran --report-only first to validate
99
+ the findings before approving fix execution.
100
+
101
+ ## Fix Execution Rules
102
+
103
+ - Fix high-confidence (multi-source) findings first within each severity tier
104
+ - Verify immediately after each fix (run relevant tests)
105
+ - 3-round limit per finding before surfacing to user for direction
106
+ - After all fixes: run Superpowers code-reviewer on modified files only
107
+ - Full 3-channel re-review only if the Superpowers pass finds new P0/P1 findings
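Reviewer note: the Phase 3 consolidation rules in the hunk above (deduplicate on file plus issue, record `sources`, tag multi-source findings as high confidence, sort P0 to P3, keep P3 out of the fix queue) can be sketched as below. This is a minimal illustration, not code from the package. The `Finding` class, the `consolidate` function, the tuple input shape, and the lowercase-and-strip issue matching are all assumptions. Keeping the most severe rating when channels disagree is also an assumption; the methodology does not specify it.

```python
from __future__ import annotations

from dataclasses import dataclass, field

SEVERITY_ORDER = {"P0": 0, "P1": 1, "P2": 2, "P3": 3}


@dataclass
class Finding:
    # Hypothetical record shape; only `file`, `sources`, and
    # `high_confidence` are named in the methodology text.
    file: str
    issue: str
    severity: str
    sources: list[str] = field(default_factory=list)
    high_confidence: bool = False


def consolidate(raw):
    """Merge (channel, file, issue, severity) tuples into a report and a fix queue."""
    merged: dict[tuple[str, str], Finding] = {}
    for channel, file, issue, severity in raw:
        # Assumption: issues "match" when equal after trimming and lowercasing.
        key = (file, issue.strip().lower())
        finding = merged.get(key)
        if finding is None:
            finding = merged[key] = Finding(file, issue, severity)
        elif SEVERITY_ORDER[severity] < SEVERITY_ORDER[finding.severity]:
            # Assumption: keep the most severe rating across channels.
            finding.severity = severity
        if channel not in finding.sources:
            finding.sources.append(channel)

    for finding in merged.values():
        finding.high_confidence = len(finding.sources) >= 2

    # Sort by severity tier, high-confidence findings first within a tier.
    report = sorted(
        merged.values(),
        key=lambda f: (SEVERITY_ORDER[f.severity], not f.high_confidence),
    )
    # P3 findings stay in the report but are excluded from the fix queue.
    fix_queue = [f for f in report if f.severity != "P3"]
    return report, fix_queue
```

Ordering high-confidence findings first within each tier mirrors the first rule under "Fix Execution Rules" in the same file.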
@@ -8,6 +8,19 @@ topics: [validation, critical-path, user-journeys, end-to-end, gap-analysis]
8
8
 
9
9
  Critical path analysis walks through the most important user journeys end-to-end across every specification artifact. For each journey, it verifies that every component, endpoint, query, screen, and task needed to make the journey work actually exists and is consistent.
10
10
 
11
+ ## Summary
12
+
13
+ - **Critical paths** are user journeys representing core functionality — the features that, if broken, would make the product unusable or fail its primary value proposition.
14
+ - **Sources for identifying journeys**: PRD success criteria, user stories, personas, architecture data flows, and revenue/value paths.
15
+ - **Trace 5-10 journeys** per project; more than 15 suggests scope is too broad or granularity too fine.
16
+ - **Four-step tracing process**: define the journey steps, map each step to specification artifacts (UX, API, architecture, data, tasks), check each mapping for existence/completeness/connectivity/error handling, and identify gaps.
17
+ - **Gap types**: missing components, missing endpoints, missing queries, missing screens, missing tasks, broken connections between steps, and missing error paths.
18
+ - **Common gap patterns**: handoff gaps at bounded-context boundaries, state transition gaps for entity lifecycle, async gaps for background processing, first-time user gaps for empty states, and permission gaps for authorization.
19
+ - **Output**: a summary table of all journeys with gap counts and assessments, plus detailed findings with impact analysis and recommended fixes.
20
+ - **When to run**: after all pipeline steps are complete, before implementation tasks are finalized, when PRD changes significantly, and as a final check before freezing docs.
21
+
22
+ ## Deep Guidance
23
+
11
24
  ## What a Critical Path Is
12
25
 
13
26
  A critical path is a user journey that represents core functionality — the features that, if broken, would make the product unusable or fail its primary value proposition. These are not edge cases. They are the main flows that most users will execute most of the time.
@@ -8,6 +8,20 @@ topics: [validation, implementability, ambiguity, agent-readiness, dry-run]
8
8
 
9
9
  An implementability review reads every specification as if you were an AI agent about to implement it. For each task, the question is: "Do I have everything I need to start coding right now?" Every question you would need to ask is a gap. Every ambiguity you would need to resolve is a defect. This is the most practical validation — it tests whether the specs actually work for their intended consumer.
10
10
 
11
+ ## Summary
12
+
13
+ - **Core question per task**: "Do I have everything I need to start coding right now?" — every unanswered question is a gap, every ambiguity is a defect.
14
+ - **Agent constraints to account for**: no institutional memory, no ability to ask clarifying questions, literal interpretation of specs, context window limits, and no ability to infer patterns from existing code.
15
+ - **Five check dimensions**: task-level completeness (inputs, outputs, scope, dependencies), ambiguity detection, error case coverage, data shape precision, and pattern/convention specification.
16
+ - **Ambiguity patterns**: vague adjectives ("fast", "secure", "appropriate"), missing specifics (pagination, notification channels, log levels), and implicit behavior (auth redirects, i18n fallbacks, cache invalidation).
17
+ - **Error cases to verify**: input validation, business logic violations, infrastructure failures, and concurrency conflicts — each needing defined response format, retry behavior, user feedback, and logging level.
18
+ - **Data shape precision**: types beyond primitives (email vs. free text), optional vs. nullable distinction, exhaustive enum values, and format standards (dates, money, IDs).
19
+ - **Review method**: role-play as the implementing agent, read only what the task references, attempt pseudocode, and record every question or assumption.
20
+ - **Scoring**: 5/5 (fully implementable) to 1/5 (not implementable); target all tasks at 4/5+ before implementation begins.
21
+ - **Most frequently missing**: error response formats, logging conventions, edge-case validation rules, concurrency handling, and empty-state behavior.
22
+
23
+ ## Deep Guidance
24
+
11
25
  ## The Implementing Agent Perspective
12
26
 
13
27
  AI agents implementing tasks have specific constraints that make implementability review different from a human code review:
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@zigrivers/scaffold",
3
- "version": "2.38.1",
3
+ "version": "2.44.2",
4
4
  "description": "AI-powered software project scaffolding pipeline",
5
5
  "type": "module",
6
6
  "keywords": [
@@ -24,6 +24,7 @@
24
24
  "files": [
25
25
  "dist/",
26
26
  "pipeline/",
27
+ "tools/",
27
28
  "knowledge/",
28
29
  "methodology/",
29
30
  "skills/",
@@ -40,9 +40,9 @@ independent review validation.
40
40
  - (deep) Data flow completeness verified (no orphaned components)
41
41
  - (deep) Module structure assessed for merge conflict risk, circular dependency risk, and import depth
42
42
  - (mvp) Downstream readiness confirmed (specification, quality, and planning steps can proceed)
43
- - (mvp) Every finding categorized P0-P3 with specific component, section, and issue
43
+ - (mvp) Every finding categorized P0-P3 with specific component, section, and issue. Severity definitions: P0 = Breaks downstream work. P1 = Prevents quality milestone. P2 = Known tech debt. P3 = Polish.
44
44
  - (mvp) Fix plan documented for all P0/P1 findings; fixes applied to system-architecture.md and re-validated
45
- - (depth 4+) Multi-model findings synthesized with consensus/disagreement analysis
45
+ - (depth 4+) Multi-model findings synthesized: Consensus (all models agree), Majority (2+ models agree), or Divergent (models disagree — present to user for decision)
46
46
 
47
47
  ## Methodology Scaling
48
48
  - **deep**: All 10 review passes (coverage, constraints, data flows, module
@@ -51,9 +51,12 @@ independent review validation.
51
51
  review dispatched to Codex and Gemini if available, with graceful fallback
52
52
  to Claude-only enhanced review.
53
53
  - **mvp**: Domain coverage and ADR compliance checks only.
54
- - **custom:depth(1-5)**: Depth 1-3: scale number of passes with depth.
55
- Depth 4: all passes + one external model (if CLI available). Depth 5:
56
- all passes + multi-model with reconciliation.
54
+ - **custom:depth(1-5)**:
55
+ - Depth 1: two passes (domain coverage and ADR compliance only).
56
+ - Depth 2: four passes (domain coverage, ADR compliance, data flow completeness, and internal consistency).
57
+ - Depth 3: seven passes — add module structure, state consistency, and diagram integrity.
58
+ - Depth 4: all 10 passes + one external model (if CLI available).
59
+ - Depth 5: all 10 passes + multi-model with reconciliation.
57
60
 
58
61
  ## Mode Detection
59
62
  Re-review mode if previous review exists. If multi-model review artifacts exist
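Reviewer note: the per-depth pass counts and the Consensus / Majority / Divergent labels introduced in the two hunks above can be read as a small lookup plus a classification rule. The sketch below is illustrative only. `REVIEW_PLAN`, `review_plan`, and `classify_agreement` are hypothetical names, and the string labels for external-model usage are assumptions rather than values the package defines.

```python
# Pass counts per custom depth, taken from the added depth list.
# The "external" labels are assumed names for the described behaviour.
REVIEW_PLAN = {
    1: {"passes": 2, "external": "none"},
    2: {"passes": 4, "external": "none"},
    3: {"passes": 7, "external": "none"},
    4: {"passes": 10, "external": "one-model-if-cli-available"},
    5: {"passes": 10, "external": "multi-model-with-reconciliation"},
}


def review_plan(depth: int) -> dict:
    """Return the pass count and external-model mode for a custom depth."""
    if depth not in REVIEW_PLAN:
        raise ValueError("depth must be between 1 and 5")
    return REVIEW_PLAN[depth]


def classify_agreement(models_reporting: int, models_total: int) -> str:
    """Label a finding by how many of the participating models reported it."""
    if models_total < 1 or not 1 <= models_reporting <= models_total:
        raise ValueError("models_reporting must be in 1..models_total")
    if models_reporting == models_total:
        return "consensus"   # all models agree
    if models_reporting >= 2:
        return "majority"    # 2+ models agree, but not all
    return "divergent"       # models disagree: present to the user


print(review_plan(3)["passes"], classify_agreement(2, 3))
```

Under this reading, a finding reported by only one of two models is Divergent and goes to the user for a decision.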
@@ -32,7 +32,9 @@ lives and how components communicate.
32
32
  - (mvp) Every ADR constraint is respected in the architecture
33
33
  - (mvp) All components appear in at least one data flow diagram
34
34
  - (deep) Each extension point has interface definition, example usage scenario, and constraints on what can/cannot be extended
35
- - (mvp) Project directory structure is defined with file-level granularity
35
+ - (mvp) System components map to modules defined in docs/project-structure.md
36
+ - (deep) Component diagram shows all system components from domain models plus infrastructure
37
+ - (deep) Data flow diagrams cover all happy-path user journeys from Must-have stories
36
38
 
37
39
  ## Methodology Scaling
38
40
  - **deep**: Full architecture document. Component diagrams, data flow diagrams,
@@ -40,8 +42,12 @@ lives and how components communicate.
40
42
  point inventory, deployment topology.
41
43
  - **mvp**: High-level component overview. Key data flows. Enough structure for
42
44
  an agent to start building without ambiguity.
43
- - **custom:depth(1-5)**: Depth 1-2: MVP-style. Depth 3: add component diagrams
44
- and module boundaries. Depth 4-5: full architecture approach.
45
+ - **custom:depth(1-5)**:
46
+ - Depth 1: high-level component overview with key data flows.
47
+ - Depth 2: component overview with module boundaries and primary data flows.
48
+ - Depth 3: add component diagrams, module boundaries, and state management design.
49
+ - Depth 4: full architecture with extension point inventory, deployment topology, and file-level module detail.
50
+ - Depth 5: full architecture with cross-cutting concern analysis, failure mode documentation, and scalability annotations.
45
51
 
46
52
  ## Mode Detection
47
53
  If outputs already exist, operate in update mode: read existing content, diff
@@ -55,10 +55,12 @@ loop from where the agent left off.
55
55
  eval gates, detailed PR descriptions, between-task cleanup.
56
56
  - **mvp**: Verify worktree, check branch state, finish in-progress work or
57
57
  pick next task, TDD loop, make check, create PR.
58
- - **custom:depth(1-5)**: Depth 1-2: check branch and continue. Depth 3: add
59
- PR reconciliation, lessons.md review, sync with origin. Depth 4: add
60
- rebase, eval gates, between-task cleanup. Depth 5: full state audit with
61
- actor verification and branch cleanup.
58
+ - **custom:depth(1-5)**:
59
+ - Depth 1: verify worktree and check current branch, continue in-progress work.
60
+ - Depth 2: add git status assessment and Beads identity verification.
61
+ - Depth 3: add PR reconciliation, lessons.md review, sync with origin.
62
+ - Depth 4: add rebase, eval gates, between-task cleanup.
63
+ - Depth 5: full state audit with actor verification and branch cleanup.
62
64
 
63
65
  ## Mode Detection
64
66
  This is a stateless execution command. No document is created or updated.
@@ -166,11 +168,22 @@ Once in-progress work is complete (or if there was none):
166
168
  - Create a pull request: `gh pr create`
167
169
  - Include agent name in PR description for traceability
168
170
 
169
- 3. **Between-task cleanup**
171
+ 3. **Run code reviews (MANDATORY)**
172
+ - Run the review-pr tool: `scaffold run review-pr` (CLI) or `/scaffold:review-pr` (plugin)
173
+ - This runs **all three** review channels on the PR diff:
174
+ 1. **Codex CLI**: `codex exec --skip-git-repo-check -s read-only --ephemeral "REVIEW_PROMPT" 2>/dev/null`
175
+ 2. **Gemini CLI**: `NO_BROWSER=true gemini -p "REVIEW_PROMPT" --output-format json --approval-mode yolo 2>/dev/null`
176
+ 3. **Superpowers code-reviewer**: dispatch `superpowers:code-reviewer` subagent with BASE_SHA and HEAD_SHA
177
+ - Verify auth before each CLI (`codex login status`, `NO_BROWSER=true gemini -p "respond with ok" -o json`)
178
+ - All three channels must execute (skip only if a tool is genuinely not installed)
179
+ - Fix any P0/P1/P2 findings before proceeding
180
+ - Do NOT move to the next task until all channels have run
181
+
182
+ 4. **Between-task cleanup**
170
183
  - `git fetch origin --prune && git clean -fd`
171
184
  - Run the install command from CLAUDE.md Key Commands
172
185
 
173
- 4. **Claim next task**
186
+ 5. **Claim next task**
174
187
  - Branch from remote: `git fetch origin && git checkout -b <branch-name> origin/main`
175
188
  - Pick the next task following the same process as `/scaffold:multi-agent-start`
176
189
  - Continue the TDD execution loop
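Reviewer note: the new mandatory review step sets two conditions before an agent moves on. Every installed review channel must have run, and no P0/P1/P2 finding may remain open. A minimal sketch of that gate follows. The function name, the channel identifiers, and the input shapes are hypothetical and not taken from the package.

```python
# Assumed short identifiers for the three channels named in the step.
CHANNELS = ("codex", "gemini", "superpowers")
BLOCKING = {"P0", "P1", "P2"}


def may_claim_next_task(installed, executed, open_severities):
    """Return (ok, missing_channels, blocking_findings).

    installed: channels whose tool is actually installed
    executed: channels that ran on the PR diff
    open_severities: severities of findings still unresolved
    """
    # A channel may be skipped only when its tool is not installed.
    missing = [c for c in CHANNELS if c in installed and c not in executed]
    # P3 findings do not block the next task.
    blocking = [s for s in open_severities if s in BLOCKING]
    return (not missing and not blocking), missing, blocking


ok, missing, blocking = may_claim_next_task(
    installed={"codex", "gemini", "superpowers"},
    executed={"codex"},
    open_severities=["P2", "P3"],
)
print(ok, missing, blocking)
```

Returning the missing channels and blocking severities alongside the boolean lets the caller report why the gate failed.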
@@ -217,7 +230,8 @@ Once in-progress work is complete (or if there was none):
217
230
  4. **Clean between tasks** — Run cleanup after each task to prevent state leakage.
218
231
  5. **TDD is not optional** — Continue the red-green-refactor cycle for any in-progress work.
219
232
  6. **Quality gates before PR** — Never create a PR with failing checks.
220
- 7. **Follow CLAUDE.md** — It is the authority on project conventions and commands.
233
+ 7. **Code review before next task** — After creating a PR, run all three review channels (Codex CLI, Gemini CLI, Superpowers code-reviewer) and fix all P0/P1/P2 findings before moving on.
234
+ 8. **Follow CLAUDE.md** — It is the authority on project conventions and commands.
221
235
 
222
236
  ---
223
237
 
@@ -54,10 +54,12 @@ work on different tasks simultaneously without stepping on each other.
54
54
  PR descriptions, between-task cleanup with dependency reinstall.
55
55
  - **mvp**: Verify worktree, pick next task, TDD loop, make check, create PR.
56
56
  Skip onboarding review and between-task reinstalls if not needed.
57
- - **custom:depth(1-5)**: Depth 1-2: verify worktree, TDD loop, make check.
58
- Depth 3: add lessons.md review and test skeleton usage. Depth 4: add
59
- onboarding guide, eval gates, between-task cleanup. Depth 5: full
60
- pre-flight suite, all quality gates, actor verification.
57
+ - **custom:depth(1-5)**:
58
+ - Depth 1: verify worktree environment, TDD loop, make check.
59
+ - Depth 2: add dependency check and Beads identity verification.
60
+ - Depth 3: add lessons.md review and test skeleton usage.
61
+ - Depth 4: add onboarding guide, eval gates, between-task cleanup.
62
+ - Depth 5: full pre-flight suite, all quality gates, actor verification.
61
63
 
62
64
  ## Mode Detection
63
65
  This is a stateless execution command. No document is created or updated.
@@ -143,6 +145,7 @@ For each task:
143
145
  - If Beads: use `bd-<id>/<desc>` naming
144
146
 
145
147
  2. **Red phase — write failing tests**
148
+ - Check `docs/story-tests-map.md` (if it exists) to find test skeletons that correspond to this task's user stories
146
149
  - Check `tests/acceptance/` for existing test skeletons that correspond to the task
147
150
  - If skeletons exist, use them as your starting point
148
151
  - Otherwise, write test cases from the task's acceptance criteria
@@ -169,7 +172,18 @@ For each task:
169
172
  - Include in the PR description: what was implemented, key decisions, files changed, agent name
170
173
  - Follow the PR workflow from `docs/git-workflow.md` or CLAUDE.md
171
174
 
172
- 7. **Between-task cleanup**
175
+ 7. **Run code reviews (MANDATORY)**
176
+ - Run the review-pr tool: `scaffold run review-pr` (CLI) or `/scaffold:review-pr` (plugin)
177
+ - This runs **all three** review channels on the PR diff:
178
+ 1. **Codex CLI**: `codex exec --skip-git-repo-check -s read-only --ephemeral "REVIEW_PROMPT" 2>/dev/null`
179
+ 2. **Gemini CLI**: `NO_BROWSER=true gemini -p "REVIEW_PROMPT" --output-format json --approval-mode yolo 2>/dev/null`
180
+ 3. **Superpowers code-reviewer**: dispatch `superpowers:code-reviewer` subagent with BASE_SHA and HEAD_SHA
181
+ - Verify auth before each CLI (`codex login status`, `NO_BROWSER=true gemini -p "respond with ok" -o json`)
182
+ - All three channels must execute (skip only if a tool is genuinely not installed)
183
+ - Fix any P0/P1/P2 findings before proceeding
184
+ - Do NOT move to the next task until all channels have run
185
+
186
+ 8. **Between-task cleanup**
173
187
  - `git fetch origin --prune && git clean -fd`
174
188
  - Run the install command from CLAUDE.md Key Commands
175
189
  - This ensures a clean state before the next task
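The rule that a review channel may be skipped only when its tool is genuinely not installed can be checked with the POSIX `command -v` builtin. The sketch below is an illustration under assumptions: `channel_status` is a hypothetical name, and it covers only the two CLI channels, since the Superpowers code-reviewer is dispatched as a subagent rather than found on `PATH`.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of scaffold): report whether a review CLI
# is installed, so a channel is skipped only when the tool is truly absent.
channel_status() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "run"
  else
    echo "skip (not installed)"
  fi
}

# The two CLI-based channels named in the step above.
for tool in codex gemini; do
  printf '%s: %s\n' "$tool" "$(channel_status "$tool")"
done
```

A tool that is installed but not authenticated would still report `run` here; the auth checks listed in the step are a separate concern.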
@@ -208,8 +222,9 @@ For each task:
208
222
  3. **Clean between tasks** — Run cleanup after each task to prevent state leakage.
209
223
  4. **TDD is not optional** — Write failing tests before implementation. No exceptions.
210
224
  5. **Quality gates before PR** — Never create a PR with failing checks.
211
- 6. **Avoid task conflicts** — Check what other agents are working on before claiming.
212
- 7. **Follow CLAUDE.md** — It is the authority on project conventions and commands.
225
+ 6. **Code review before next task** — After creating a PR, run all three review channels (Codex CLI, Gemini CLI, Superpowers code-reviewer) and fix all P0/P1/P2 findings before moving on.
226
+ 7. **Avoid task conflicts** — Check what other agents are working on before claiming.
227
+ 8. **Follow CLAUDE.md** — It is the authority on project conventions and commands.
213
228
 
214
229
  ---
215
230
 
@@ -10,7 +10,7 @@ conditional: null
10
10
  stateless: true
11
11
  category: pipeline
12
12
  knowledge-base: [enhancement-workflow, task-claiming-strategy]
13
- reads: [create-prd, user-stories, coding-standards, tdd, project-structure]
13
+ reads: [create-prd, user-stories, coding-standards, tdd, project-structure, system-architecture, domain-modeling, adrs, api-contracts, database-schema, ux-spec, implementation-plan]
14
14
  argument-hint: "<enhancement description>"
15
15
  ---
16
16
 
@@ -31,6 +31,7 @@ This is the full-weight entry point for work that goes beyond a quick fix.
31
31
  - docs/design-system.md (optional) — design tokens, component patterns (if frontend changes)
32
32
  - CLAUDE.md (required) — project conventions, key commands, workflow
33
33
  - .beads/ (conditional) — Beads task tracking if configured
34
+ - docs/implementation-plan.md (required) — existing tasks and task numbering
34
35
  - Relevant source code if needed to understand current implementation
35
36
 
36
37
  ## Expected Outputs
@@ -41,7 +42,7 @@ This is the full-weight entry point for work that goes beyond a quick fix.
41
42
 
42
43
  ## Quality Criteria
43
44
  - (mvp) Impact analysis completed before documentation changes
44
- - (mvp) PRD feature description is thorough enough for an AI agent to build without follow-up questions
45
+ - (mvp) PRD feature description includes: what the feature does, which persona it serves, at least 2 acceptance criteria, and scope boundary (what it does NOT include)
45
46
  - (mvp) User stories follow INVEST criteria
46
47
  - (mvp) Acceptance criteria are testable Given/When/Then scenarios
47
48
  - (mvp) Task dependencies are identified and documented
@@ -57,15 +58,16 @@ This is the full-weight entry point for work that goes beyond a quick fix.
57
58
  - **mvp**: Streamlined discovery, basic impact analysis, PRD feature addition,
58
59
  minimal user stories with acceptance criteria, task list with dependencies.
59
60
  Skip innovation pass, competitive analysis, and follow-up recommendations.
60
- - **custom:depth(1-5)**: Depth 1-2: basic impact check, PRD update, stories,
61
- task creation. Depth 3: add impact analysis, dependency management, cross-
62
- reference check. Depth 4: add innovation pass, frozen artifact handling,
63
- migration considerations. Depth 5: full workflow with competitive analysis,
64
- AI-native possibilities, and follow-up review recommendations.
61
+ - **custom:depth(1-5)**:
62
+ - Depth 1: basic PRD feature addition, minimal user stories, task creation.
63
+ - Depth 2: add impact check and dependency identification.
64
+ - Depth 3: add detailed impact analysis, dependency management, cross-reference check.
65
+ - Depth 4: add innovation pass, frozen artifact handling, migration considerations.
66
+ - Depth 5: full workflow with competitive analysis, AI-native possibilities, and follow-up review recommendations.
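The "add" wording in the depth list suggests that each level builds on the ones below it. Assuming that cumulative reading is correct, the scope at a given depth can be sketched as follows; `depth_scope` is a hypothetical name used only for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical illustration: print what a given custom depth includes,
# assuming each level adds to the levels below it.
depth_scope() {
  depth="$1"
  level=1
  while [ "$level" -le "$depth" ]; do
    case "$level" in
      1) echo "1: basic PRD feature addition, minimal user stories, task creation" ;;
      2) echo "2: impact check and dependency identification" ;;
      3) echo "3: detailed impact analysis, dependency management, cross-reference check" ;;
      4) echo "4: innovation pass, frozen artifact handling, migration considerations" ;;
      5) echo "5: competitive analysis, AI-native possibilities, follow-up review recommendations" ;;
    esac
    level=$((level + 1))
  done
}

# Depth 3 lists levels 1 through 3.
depth_scope 3
```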
65
67
 
66
68
  ## Mode Detection
67
- This is a stateless execution command. It updates existing documents (plan.md,
68
- user-stories.md) but does not create a new standalone output document.
69
+ This is a document-modifying execution command. It updates existing documents
70
+ (plan.md, user-stories.md) in place but does not create a new standalone output.
69
71
  - Always operates in ENHANCEMENT MODE.
70
72
  - PRD and user stories are updated in place (append, do not replace).
71
73
 
@@ -351,7 +353,7 @@ bd ready # Show what's available to work on now
351
353
  #### 7. Consider Follow-Up Reviews
352
354
 
353
355
  Depending on the enhancement scope, you may want to re-run these prompts:
354
- - **Implementation Plan Review**: If you created 5+ tasks, run it to verify sizing, dependencies, and coverage
356
+ - **Implementation Plan Review**: If you created 3+ tasks, run it to verify sizing, dependencies, and coverage
355
357
  - **Platform Parity Review**: If the enhancement has platform-specific behavior (web vs. mobile differences), re-run to check platform coverage
356
358
  - **Workflow Audit**: Only if the enhancement changed project infrastructure or conventions (rare)
357
359
 
@@ -425,6 +427,8 @@ This is appropriate when:
425
427
 
426
428
  ### Phase 5: Version Release
427
429
 
430
+ **Note**: Version release should happen after implementation is complete, not after this documentation step. If going straight to implementation, skip to the "After This Step" guidance below.
431
+
428
432
  After all changes are applied and verified:
429
433
 
430
434
  1. Determine release type based on change scope:
@@ -444,11 +448,15 @@ When this step is complete, tell the user:
444
448
  **Enhancement documented** — PRD updated, user stories created, tasks ready.
445
449
 
446
450
  **Next (if applicable):**
447
- - If `docs/implementation-playbook.md` exists: Run `/scaffold:implementation-playbook`: update wave assignments and add per-task context blocks for new tasks.
448
- - If you created **5+ tasks**: Run `/scaffold:implementation-plan-review` — Review task quality, coverage, and dependencies.
451
+ - If `docs/implementation-playbook.md` exists: Run `/scaffold:implementation-playbook` to update wave assignments and add per-task context blocks for new tasks. **This is required** to keep the playbook in sync with the implementation plan.
452
+ - If you created **3+ tasks**: Run `/scaffold:implementation-plan-review` — Review task quality, coverage, and dependencies.
449
453
  - If the enhancement has **platform-specific behavior**: Run `/scaffold:platform-parity-review` — Check platform coverage.
450
454
  - If user stories were added or changed: Run `/scaffold:story-tests` — Regenerate test skeletons for new user stories.
451
455
  - If scope changed materially: Run `/scaffold:create-evals` — Update eval checks for new scope.
456
+ - If impact analysis identified **Data Model changes**: Run `/scaffold:database-schema` to update the schema.
457
+ - If impact analysis identified **API changes**: Run `/scaffold:api-contracts` to update contracts.
458
+ - If impact analysis identified **UI changes**: Run `/scaffold:ux-spec` to update the UX specification.
459
+ - If impact analysis identified **Architecture changes**: Run `/scaffold:system-architecture` to update architecture.
452
460
  - Otherwise: Run `/scaffold:single-agent-start` or `/scaffold:single-agent-resume` to begin implementation (or `/scaffold:multi-agent-start <agent-name>` / `/scaffold:multi-agent-resume <agent-name>` for worktree agents).
453
461
 
454
462
  **Pipeline reference:** `/scaffold:prompt-pipeline`
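The four impact-analysis items in the next-steps list amount to a lookup from impact area to follow-up command. A minimal sketch of that lookup is below. The area labels (`data-model`, `api`, `ui`, `architecture`) and the function name `followup_for` are hypothetical; only the `/scaffold:` command names come from the list above.

```shell
#!/usr/bin/env bash
# Hypothetical lookup (not part of scaffold): map an impact area found during
# impact analysis to the follow-up command the next-steps list names for it.
followup_for() {
  case "$1" in
    data-model)   echo "/scaffold:database-schema" ;;
    api)          echo "/scaffold:api-contracts" ;;
    ui)           echo "/scaffold:ux-spec" ;;
    architecture) echo "/scaffold:system-architecture" ;;
    *)            return 1 ;;  # Unknown area: no follow-up is defined here.
  esac
}

# Example: an enhancement that touched the API and the UI.
for area in api ui; do
  printf '%s -> %s\n' "$area" "$(followup_for "$area")"
done
```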
@@ -30,6 +30,8 @@ prompt.
30
30
  - docs/tdd-standards.md (required) — test categories, mocking strategy, test file locations
31
31
  - docs/project-structure.md (required) — where files live, module organization
32
32
  - docs/implementation-playbook.md (optional) — quality gates section for project-specific gates
33
+ - docs/system-architecture.md (optional) — for bug fixes involving component boundaries or layer violations
34
+ - docs/domain-models/ (optional) — for bug fixes involving domain logic or entity relationships
33
35
  - tasks/lessons.md (optional) — previous lessons learned
34
36
  - .beads/ (conditional) — Beads task tracking if configured
35
37
  - Relevant source code — files that will be modified
@@ -58,16 +60,18 @@ prompt.
58
60
  - **mvp**: Complexity gate, basic acceptance criteria (happy path + one edge
59
61
  case), test plan with category and cases, file list. Skip duplicate check
60
62
  and detailed implementation notes.
61
- - **custom:depth(1-5)**: Depth 1-2: complexity gate, basic AC, test cases,
62
- file list. Depth 3: add duplicate check, lessons.md review, regression
63
- guards. Depth 4: add mocking strategy, specific coding standard references.
64
- Depth 5: full analysis with innovation suggestions and cross-module impact.
63
+ - **custom:depth(1-5)**:
64
+ - Depth 1: complexity gate, basic acceptance criteria (happy path only), and file list.
65
+ - Depth 2: add one edge case to AC, test cases mapped to criteria, and test file locations.
66
+ - Depth 3: add duplicate check, lessons.md review, regression guards.
67
+ - Depth 4: add mocking strategy, specific coding standard references.
68
+ - Depth 5: full analysis with innovation suggestions and cross-module impact.
65
69
 
66
70
  ## Mode Detection
67
- This is a stateless execution command. No persistent document is created.
68
- - Always operates in CREATE MODE: produces a task definition.
69
- - If Beads is configured, the task is created via `bd create`.
70
- - If not, the task is documented inline and implementation begins.
71
+ This is a task-creation execution command. Task persistence depends on context:
72
+ - If Beads is configured, the task is persisted via `bd create`.
73
+ - If not, the task is documented inline for immediate execution (not persistent).
74
+ - Always operates in CREATE MODE: produces a task definition each time.
71
75
 
72
76
  ## Update Mode Specifics
73
77
  Not applicable — this creates a new task each time. If a similar task already
@@ -110,7 +114,7 @@ Before asking questions, review:
110
114
  - `docs/coding-standards.md` — Code conventions, naming, patterns
111
115
  - `docs/tdd-standards.md` — Test categories, mocking strategy, test file locations
112
116
  - `docs/project-structure.md` — Where files live, module organization
113
- - `tasks/lessons.md` — Previous lessons learned (extract any relevant to this task)
117
+ - `tasks/lessons.md` (if it exists) — Previous lessons learned (extract any relevant to this task)
114
118
  - If `docs/implementation-playbook.md` exists, check its quality gates section for project-specific gates
115
119
  - Relevant source code — Read the files that will be modified
116
120
 
@@ -121,7 +125,7 @@ Before asking questions, review:
121
125
  - If proceeding, note the relationship in the new task's description
122
126
 
123
127
  #### Extract Relevant Lessons
124
- Review `tasks/lessons.md` for anti-patterns, gotchas, or conventions related to:
128
+ Review `tasks/lessons.md` (if it exists) for anti-patterns, gotchas, or conventions related to:
125
129
  - The area of code being modified
126
130
  - The type of change (fix, refactor, perf, etc.)
127
131
  - Similar past mistakes to avoid
@@ -269,7 +273,7 @@ Present the task summary:
269
273
  1. **Respect the complexity gate** — If it is bigger than a quick task, redirect immediately. Do not try to squeeze a feature into the quick task format.
270
274
  2. **One task only** — Quick Task creates exactly one Beads task. If you need multiple, use the Enhancement prompt.
271
275
  3. **Check for duplicates first** — Run `bd list` before creating. Do not create tasks that already exist.
272
- 4. **Lessons.md is required reading** — Always check `tasks/lessons.md` for relevant anti-patterns before defining the task.
276
+ 4. **Lessons.md is required reading** — Always check `tasks/lessons.md` (if it exists) for relevant anti-patterns before defining the task.
273
277
  5. **Acceptance criteria drive tests** — Every criterion must map to at least one test case. If you cannot test it, rewrite the criterion.
274
278
  6. **Conventional commit titles** — Always use `type(scope): description` format. This feeds directly into commit messages.
275
279
 
@@ -306,6 +310,9 @@ Present the task summary:
306
310
  - Naming follows project patterns
307
311
  - Implementation notes reference specific standards, not generic advice
308
312
 
313
+ #### Quality Gates
314
+ - Quick tasks follow the same quality gates as all other tasks — see `docs/implementation-playbook.md` § Quality Gates
315
+
309
316
  #### Eval Gate
310
317
  - If `tests/evals/` exists, run `make eval` (or equivalent eval command from CLAUDE.md Key Commands) as a required pre-commit check
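The eval gate is conditional on the `tests/evals/` directory existing. A minimal sketch of that check follows. `eval_gate_command` is a hypothetical name, and the sketch only prints the command instead of running it, because the actual eval command may differ per project (the gate defers to CLAUDE.md Key Commands).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of scaffold): print the eval command to run
# as a pre-commit check, or nothing when the project has no tests/evals/.
eval_gate_command() {
  root="${1:-.}"
  if [ -d "$root/tests/evals" ]; then
    # Assumes the default named in the gate; a project may use an equivalent.
    echo "make eval"
  fi
}

cmd="$(eval_gate_command .)"
echo "eval gate: ${cmd:-skipped (no tests/evals/ directory)}"
```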
311
318
 
@@ -51,10 +51,12 @@ continues the TDD execution loop from where you left off.
51
51
  guide, consult lessons.md, reconcile all open PRs, detailed PR descriptions.
52
52
  - **mvp**: Quick git state check, identify in-progress work, finish or pick
53
53
  next task, TDD loop, make check, create PR.
54
- - **custom:depth(1-5)**: Depth 1-2: check branch and continue. Depth 3: add
55
- PR reconciliation and lessons.md review. Depth 4: add rebase, full test
56
- suite validation, onboarding review. Depth 5: full state audit with branch
57
- cleanup and eval gates.
54
+ - **custom:depth(1-5)**:
55
+ - Depth 1: check current branch and continue in-progress work.
56
+ - Depth 2: add git status assessment and uncommitted change detection.
57
+ - Depth 3: add PR reconciliation and lessons.md review.
58
+ - Depth 4: add rebase, full test suite validation, onboarding review.
59
+ - Depth 5: full state audit with branch cleanup and eval gates.
58
60
 
59
61
  ## Mode Detection
60
62
  This is a stateless execution command. No document is created or updated.
@@ -143,7 +145,18 @@ Once in-progress work is complete (or if there was none):
143
145
  - Create a pull request: `gh pr create`
144
146
  - Follow the PR workflow from `docs/git-workflow.md` or CLAUDE.md
145
147
 
146
- 3. **Claim next task**
148
+ 3. **Run code reviews (MANDATORY)**
149
+ - Run the review-pr tool: `scaffold run review-pr` (CLI) or `/scaffold:review-pr` (plugin)
150
+ - This runs **all three** review channels on the PR diff:
151
+ 1. **Codex CLI**: `codex exec --skip-git-repo-check -s read-only --ephemeral "REVIEW_PROMPT" 2>/dev/null`
152
+ 2. **Gemini CLI**: `NO_BROWSER=true gemini -p "REVIEW_PROMPT" --output-format json --approval-mode yolo 2>/dev/null`
153
+ 3. **Superpowers code-reviewer**: dispatch `superpowers:code-reviewer` subagent with BASE_SHA and HEAD_SHA
154
+ - Verify auth before each CLI (`codex login status`, `NO_BROWSER=true gemini -p "respond with ok" -o json`)
155
+ - All three channels must execute (skip only if a tool is genuinely not installed)
156
+ - Fix any P0/P1/P2 findings before proceeding
157
+ - Do NOT move to the next task until all channels have run
158
+
159
+ 4. **Claim next task**
147
160
  - Return to main: `git checkout main && git pull origin main`
148
161
  - Pick the next task following the same process as `/scaffold:single-agent-start`
149
162
  - Continue the TDD execution loop
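The requirement to fix P0/P1/P2 findings before claiming the next task can be expressed as a simple severity gate. The sketch below assumes findings have already been reduced to a list of severity labels; the function name `blocking_findings` and the `P3` label used in the example are assumptions, since the step names only P0 through P2.

```shell
#!/usr/bin/env bash
# Hypothetical gate (not part of scaffold): count review findings whose
# severity blocks progress to the next task (P0, P1, or P2).
blocking_findings() {
  count=0
  for sev in "$@"; do
    case "$sev" in
      P0|P1|P2) count=$((count + 1)) ;;
    esac
  done
  echo "$count"
}

# Example: one P1 and one lower-severity finding (P3 is an assumed label).
blocking="$(blocking_findings P1 P3)"
if [ "$blocking" -gt 0 ]; then
  echo "Fix $blocking blocking finding(s) before claiming the next task."
else
  echo "No blocking findings; safe to claim the next task."
fi
```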
@@ -182,7 +195,8 @@ Once in-progress work is complete (or if there was none):
182
195
  3. **Reconcile task status** — Merged PRs must be reflected in the task tracker.
183
196
  4. **TDD is not optional** — Continue the red-green-refactor cycle for any in-progress work.
184
197
  5. **Quality gates before PR** — Never create a PR with failing checks.
185
- 6. **Follow CLAUDE.md** — It is the authority on project conventions and commands.
198
+ 6. **Code review before next task** — After creating a PR, run all three review channels (Codex CLI, Gemini CLI, Superpowers code-reviewer) and fix all P0/P1/P2 findings before moving on.
199
+ 7. **Follow CLAUDE.md** — It is the authority on project conventions and commands.
186
200
 
187
201
  ---
188
202
 
@@ -45,6 +45,7 @@ complete.
45
45
  - (mvp) Task status is updated after each completion
46
46
  - (deep) Test skeletons from tests/acceptance/ are used as starting points when available
47
47
  - (deep) lessons.md is consulted before each task for relevant anti-patterns
48
+ - (deep) Before starting each task, the agent consults tasks/lessons.md and documents which lessons, if any, were applied
48
49
  - (deep) PR description includes implementation summary, assumptions, and files modified
49
50
 
50
51
  ## Methodology Scaling
@@ -54,10 +55,12 @@ complete.
54
55
  - **mvp**: Quick git/dependency check, read playbook or plan, pick next task,
55
56
  TDD loop, make check, create PR. Skip onboarding guide review and detailed
56
57
  PR annotations.
57
- - **custom:depth(1-5)**: Depth 1-2: minimal pre-flight, TDD loop, make check.
58
- Depth 3: add lessons.md review and test skeleton usage. Depth 4: add
59
- onboarding guide, eval gates, detailed PR descriptions. Depth 5: full
60
- pre-flight suite, all quality gates, cross-reference with upstream docs.
58
+ - **custom:depth(1-5)**:
59
+ - Depth 1: git status check, TDD loop, make check.
60
+ - Depth 2: add dependency check and test suite health verification before starting.
61
+ - Depth 3: add lessons.md review and test skeleton usage.
62
+ - Depth 4: add onboarding guide, eval gates, detailed PR descriptions.
63
+ - Depth 5: full pre-flight suite, all quality gates, cross-reference with upstream docs.
61
64
 
62
65
  ## Mode Detection
63
66
  This is a stateless execution command. No document is created or updated.
@@ -121,6 +124,7 @@ For each task:
121
124
  - If Beads: branch as `bd-<id>/<desc>`
122
125
 
123
126
  2. **Red phase — write failing tests**
127
+ - Check `docs/story-tests-map.md` (if it exists) to find test skeletons that correspond to this task's user stories
124
128
  - Check `tests/acceptance/` for existing test skeletons that correspond to the task
125
129
  - If skeletons exist, use them as your starting point
126
130
  - Otherwise, write test cases from the task's acceptance criteria
@@ -147,7 +151,18 @@ For each task:
147
151
  - Include in the PR description: what was implemented, key decisions, files changed
148
152
  - Follow the PR workflow from `docs/git-workflow.md` or CLAUDE.md
149
153
 
150
- 7. **Update status**
154
+ 7. **Run code reviews (MANDATORY)**
155
+ - Run the review-pr tool: `scaffold run review-pr` (CLI) or `/scaffold:review-pr` (plugin)
156
+ - This runs **all three** review channels on the PR diff:
157
+ 1. **Codex CLI**: `codex exec --skip-git-repo-check -s read-only --ephemeral "REVIEW_PROMPT" 2>/dev/null`
158
+ 2. **Gemini CLI**: `NO_BROWSER=true gemini -p "REVIEW_PROMPT" --output-format json --approval-mode yolo 2>/dev/null`
159
+ 3. **Superpowers code-reviewer**: dispatch `superpowers:code-reviewer` subagent with BASE_SHA and HEAD_SHA
160
+ - Verify auth before each CLI (`codex login status`, `NO_BROWSER=true gemini -p "respond with ok" -o json`)
161
+ - All three channels must execute (skip only if a tool is genuinely not installed)
162
+ - Fix any P0/P1/P2 findings before proceeding
163
+ - Do NOT move to the next task until all channels have run
164
+
165
+ 8. **Update status**
151
166
  - If Beads: task status is managed via `bd` commands
152
167
  - Without Beads: mark the task as complete in the plan/playbook
153
168
 
@@ -178,9 +193,10 @@ For each task:
178
193
  1. **TDD is not optional** — Write failing tests before implementation. No exceptions.
179
194
  2. **One task at a time** — Complete the current task fully before starting the next.
180
195
  3. **Quality gates before PR** — Never create a PR with failing checks.
181
- 4. **Update status immediately** — Mark tasks complete as soon as the PR is created.
182
- 5. **Consult lessons.md** — Check for relevant anti-patterns before each task.
183
- 6. **Follow CLAUDE.md** — It is the authority on project conventions and commands.
196
+ 4. **Code review before next task** — After creating a PR, run all three review channels (Codex CLI, Gemini CLI, Superpowers code-reviewer) and fix all P0/P1/P2 findings before moving on.
197
+ 5. **Update status immediately** — Mark tasks complete as soon as review passes.
198
+ 6. **Consult lessons.md** — Check for relevant anti-patterns before each task.
199
+ 7. **Follow CLAUDE.md** — It is the authority on project conventions and commands.
184
200
 
185
201
  ---
186
202