@zigrivers/scaffold 2.38.1 → 2.44.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (201)
  1. package/README.md +10 -7
  2. package/dist/cli/commands/build.js +4 -4
  3. package/dist/cli/commands/build.js.map +1 -1
  4. package/dist/cli/commands/check.test.js +11 -8
  5. package/dist/cli/commands/check.test.js.map +1 -1
  6. package/dist/cli/commands/complete.d.ts.map +1 -1
  7. package/dist/cli/commands/complete.js +2 -1
  8. package/dist/cli/commands/complete.js.map +1 -1
  9. package/dist/cli/commands/complete.test.js +4 -1
  10. package/dist/cli/commands/complete.test.js.map +1 -1
  11. package/dist/cli/commands/dashboard.js +4 -4
  12. package/dist/cli/commands/dashboard.js.map +1 -1
  13. package/dist/cli/commands/knowledge.js +2 -2
  14. package/dist/cli/commands/knowledge.js.map +1 -1
  15. package/dist/cli/commands/knowledge.test.js +5 -12
  16. package/dist/cli/commands/knowledge.test.js.map +1 -1
  17. package/dist/cli/commands/list.d.ts +1 -1
  18. package/dist/cli/commands/list.d.ts.map +1 -1
  19. package/dist/cli/commands/list.js +84 -3
  20. package/dist/cli/commands/list.js.map +1 -1
  21. package/dist/cli/commands/list.test.js +82 -0
  22. package/dist/cli/commands/list.test.js.map +1 -1
  23. package/dist/cli/commands/next.test.js +4 -1
  24. package/dist/cli/commands/next.test.js.map +1 -1
  25. package/dist/cli/commands/reset.d.ts.map +1 -1
  26. package/dist/cli/commands/reset.js +5 -2
  27. package/dist/cli/commands/reset.js.map +1 -1
  28. package/dist/cli/commands/reset.test.js +4 -1
  29. package/dist/cli/commands/reset.test.js.map +1 -1
  30. package/dist/cli/commands/rework.d.ts.map +1 -1
  31. package/dist/cli/commands/rework.js +3 -2
  32. package/dist/cli/commands/rework.js.map +1 -1
  33. package/dist/cli/commands/run.d.ts.map +1 -1
  34. package/dist/cli/commands/run.js +28 -13
  35. package/dist/cli/commands/run.js.map +1 -1
  36. package/dist/cli/commands/run.test.js +1 -1
  37. package/dist/cli/commands/run.test.js.map +1 -1
  38. package/dist/cli/commands/skip.d.ts.map +1 -1
  39. package/dist/cli/commands/skip.js +2 -1
  40. package/dist/cli/commands/skip.js.map +1 -1
  41. package/dist/cli/commands/skip.test.js +4 -1
  42. package/dist/cli/commands/skip.test.js.map +1 -1
  43. package/dist/cli/commands/status.d.ts.map +1 -1
  44. package/dist/cli/commands/status.js +88 -4
  45. package/dist/cli/commands/status.js.map +1 -1
  46. package/dist/cli/commands/version.d.ts.map +1 -1
  47. package/dist/cli/commands/version.js +22 -3
  48. package/dist/cli/commands/version.js.map +1 -1
  49. package/dist/cli/commands/version.test.js +42 -0
  50. package/dist/cli/commands/version.test.js.map +1 -1
  51. package/dist/cli/output/context.test.js +14 -13
  52. package/dist/cli/output/context.test.js.map +1 -1
  53. package/dist/cli/output/interactive.js +4 -4
  54. package/dist/cli/output/json.d.ts +1 -0
  55. package/dist/cli/output/json.d.ts.map +1 -1
  56. package/dist/cli/output/json.js +14 -1
  57. package/dist/cli/output/json.js.map +1 -1
  58. package/dist/config/loader.d.ts.map +1 -1
  59. package/dist/config/loader.js +10 -3
  60. package/dist/config/loader.js.map +1 -1
  61. package/dist/config/loader.test.js +28 -0
  62. package/dist/config/loader.test.js.map +1 -1
  63. package/dist/core/assembly/engine.d.ts.map +1 -1
  64. package/dist/core/assembly/engine.js +6 -1
  65. package/dist/core/assembly/engine.js.map +1 -1
  66. package/dist/e2e/init.test.js +3 -0
  67. package/dist/e2e/init.test.js.map +1 -1
  68. package/dist/index.js +2 -1
  69. package/dist/index.js.map +1 -1
  70. package/dist/project/adopt.test.js +3 -0
  71. package/dist/project/adopt.test.js.map +1 -1
  72. package/dist/project/claude-md.d.ts.map +1 -1
  73. package/dist/project/claude-md.js +2 -1
  74. package/dist/project/claude-md.js.map +1 -1
  75. package/dist/project/detector.js +3 -3
  76. package/dist/project/detector.js.map +1 -1
  77. package/dist/project/signals.d.ts +1 -0
  78. package/dist/project/signals.d.ts.map +1 -1
  79. package/dist/state/decision-logger.d.ts.map +1 -1
  80. package/dist/state/decision-logger.js +7 -4
  81. package/dist/state/decision-logger.js.map +1 -1
  82. package/dist/state/lock-manager.js +1 -1
  83. package/dist/state/lock-manager.js.map +1 -1
  84. package/dist/state/lock-manager.test.js +27 -3
  85. package/dist/state/lock-manager.test.js.map +1 -1
  86. package/dist/state/state-manager.d.ts.map +1 -1
  87. package/dist/state/state-manager.js +6 -0
  88. package/dist/state/state-manager.js.map +1 -1
  89. package/dist/state/state-manager.test.js +7 -0
  90. package/dist/state/state-manager.test.js.map +1 -1
  91. package/dist/types/assembly.d.ts +2 -0
  92. package/dist/types/assembly.d.ts.map +1 -1
  93. package/dist/utils/eligible.d.ts +8 -0
  94. package/dist/utils/eligible.d.ts.map +1 -0
  95. package/dist/utils/eligible.js +36 -0
  96. package/dist/utils/eligible.js.map +1 -0
  97. package/dist/validation/config-validator.test.js +15 -13
  98. package/dist/validation/config-validator.test.js.map +1 -1
  99. package/dist/validation/index.test.js +1 -1
  100. package/dist/wizard/wizard.d.ts.map +1 -1
  101. package/dist/wizard/wizard.js +1 -0
  102. package/dist/wizard/wizard.js.map +1 -1
  103. package/dist/wizard/wizard.test.js +2 -0
  104. package/dist/wizard/wizard.test.js.map +1 -1
  105. package/knowledge/core/automated-review-tooling.md +4 -4
  106. package/knowledge/core/eval-craft.md +44 -0
  107. package/knowledge/core/multi-model-review-dispatch.md +8 -0
  108. package/knowledge/core/system-architecture.md +39 -0
  109. package/knowledge/core/task-decomposition.md +53 -0
  110. package/knowledge/core/testing-strategy.md +160 -0
  111. package/knowledge/finalization/implementation-playbook.md +24 -7
  112. package/knowledge/product/prd-craft.md +41 -0
  113. package/knowledge/review/review-adr.md +1 -1
  114. package/knowledge/review/review-api-design.md +1 -1
  115. package/knowledge/review/review-database-design.md +1 -1
  116. package/knowledge/review/review-domain-modeling.md +1 -1
  117. package/knowledge/review/review-implementation-tasks.md +1 -1
  118. package/knowledge/review/review-methodology.md +1 -1
  119. package/knowledge/review/review-operations.md +1 -1
  120. package/knowledge/review/review-prd.md +1 -1
  121. package/knowledge/review/review-security.md +1 -1
  122. package/knowledge/review/review-system-architecture.md +1 -1
  123. package/knowledge/review/review-testing-strategy.md +1 -1
  124. package/knowledge/review/review-user-stories.md +1 -1
  125. package/knowledge/review/review-ux-specification.md +1 -1
  126. package/knowledge/review/review-vision.md +1 -1
  127. package/knowledge/tools/post-implementation-review-methodology.md +107 -0
  128. package/knowledge/validation/critical-path-analysis.md +13 -0
  129. package/knowledge/validation/implementability-review.md +14 -0
  130. package/package.json +2 -1
  131. package/pipeline/architecture/review-architecture.md +8 -5
  132. package/pipeline/architecture/system-architecture.md +9 -3
  133. package/pipeline/build/multi-agent-resume.md +21 -7
  134. package/pipeline/build/multi-agent-start.md +22 -7
  135. package/pipeline/build/new-enhancement.md +20 -12
  136. package/pipeline/build/quick-task.md +18 -11
  137. package/pipeline/build/single-agent-resume.md +20 -6
  138. package/pipeline/build/single-agent-start.md +24 -8
  139. package/pipeline/consolidation/claude-md-optimization.md +8 -4
  140. package/pipeline/consolidation/workflow-audit.md +9 -5
  141. package/pipeline/decisions/adrs.md +7 -3
  142. package/pipeline/decisions/review-adrs.md +8 -5
  143. package/pipeline/environment/ai-memory-setup.md +6 -2
  144. package/pipeline/environment/automated-pr-review.md +79 -12
  145. package/pipeline/environment/design-system.md +9 -6
  146. package/pipeline/environment/dev-env-setup.md +8 -5
  147. package/pipeline/environment/git-workflow.md +16 -13
  148. package/pipeline/finalization/apply-fixes-and-freeze.md +10 -5
  149. package/pipeline/finalization/developer-onboarding-guide.md +10 -3
  150. package/pipeline/finalization/implementation-playbook.md +13 -4
  151. package/pipeline/foundation/beads.md +8 -5
  152. package/pipeline/foundation/coding-standards.md +13 -10
  153. package/pipeline/foundation/project-structure.md +16 -13
  154. package/pipeline/foundation/tdd.md +9 -4
  155. package/pipeline/foundation/tech-stack.md +7 -5
  156. package/pipeline/integration/add-e2e-testing.md +12 -8
  157. package/pipeline/modeling/domain-modeling.md +9 -7
  158. package/pipeline/modeling/review-domain-modeling.md +8 -6
  159. package/pipeline/parity/platform-parity-review.md +9 -6
  160. package/pipeline/planning/implementation-plan-review.md +10 -7
  161. package/pipeline/planning/implementation-plan.md +41 -9
  162. package/pipeline/pre/create-prd.md +7 -4
  163. package/pipeline/pre/innovate-prd.md +12 -8
  164. package/pipeline/pre/innovate-user-stories.md +10 -7
  165. package/pipeline/pre/review-prd.md +12 -10
  166. package/pipeline/pre/review-user-stories.md +12 -9
  167. package/pipeline/pre/user-stories.md +7 -4
  168. package/pipeline/quality/create-evals.md +6 -3
  169. package/pipeline/quality/operations.md +7 -3
  170. package/pipeline/quality/review-operations.md +12 -5
  171. package/pipeline/quality/review-security.md +11 -6
  172. package/pipeline/quality/review-testing.md +11 -6
  173. package/pipeline/quality/security.md +6 -2
  174. package/pipeline/quality/story-tests.md +14 -9
  175. package/pipeline/specification/api-contracts.md +9 -3
  176. package/pipeline/specification/database-schema.md +8 -2
  177. package/pipeline/specification/review-api.md +10 -4
  178. package/pipeline/specification/review-database.md +8 -3
  179. package/pipeline/specification/review-ux.md +9 -3
  180. package/pipeline/specification/ux-spec.md +9 -4
  181. package/pipeline/validation/critical-path-walkthrough.md +10 -5
  182. package/pipeline/validation/cross-phase-consistency.md +9 -4
  183. package/pipeline/validation/decision-completeness.md +8 -3
  184. package/pipeline/validation/dependency-graph-validation.md +8 -3
  185. package/pipeline/validation/implementability-dry-run.md +9 -5
  186. package/pipeline/validation/scope-creep-check.md +11 -6
  187. package/pipeline/validation/traceability-matrix.md +10 -5
  188. package/pipeline/vision/create-vision.md +7 -4
  189. package/pipeline/vision/innovate-vision.md +11 -8
  190. package/pipeline/vision/review-vision.md +15 -12
  191. package/skills/multi-model-dispatch/SKILL.md +6 -5
  192. package/skills/scaffold-runner/SKILL.md +47 -3
  193. package/tools/dashboard.md +53 -0
  194. package/tools/post-implementation-review.md +655 -0
  195. package/tools/prompt-pipeline.md +160 -0
  196. package/tools/release.md +440 -0
  197. package/tools/review-pr.md +229 -0
  198. package/tools/session-analyzer.md +299 -0
  199. package/tools/update.md +113 -0
  200. package/tools/version-bump.md +290 -0
  201. package/tools/version.md +82 -0
@@ -34,17 +34,24 @@ independent review validation.
 - (mvp) Monitoring verified against minimum set: latency, error rate, and saturation
 - (deep) Alert thresholds have rationale
 - (deep) Common failure scenarios have runbook entries
+ - (mvp) At least production environment operations documented
 - (deep) Dev/staging/production environment differences documented in operations runbook
- - Every finding categorized P0-P3 with specific runbook section, metric, and issue
- - Fix plan documented for all P0/P1 findings; fixes applied to operations-runbook.md and re-validated
- - Downstream readiness confirmed no unresolved P0 or P1 findings remain before security step proceeds
- - (depth 4+) Multi-model findings synthesized with consensus/disagreement analysis
+ - (deep) Each health check endpoint specifies expected status code, response time SLA, failure thresholds
+ - (mvp) Every finding categorized P0-P3 (P0 = Breaks downstream work. P1 = Prevents quality milestone. P2 = Known tech debt. P3 = Polish.) with specific runbook section, metric, and issue
+ - (mvp) Fix plan documented for all P0/P1 findings; fixes applied to operations-runbook.md and re-validated
+ - (mvp) Downstream readiness confirmed — no unresolved P0 or P1 findings remain before security step proceeds
+ - (depth 4+) Multi-model findings synthesized: Consensus (all models agree), Majority (2+ models agree), or Divergent (models disagree — present to user for decision)
 
 ## Methodology Scaling
 - **deep**: Full multi-pass review. Multi-model review dispatched to Codex and
 Gemini if available, with graceful fallback to Claude-only enhanced review.
 - **mvp**: Deployment coverage only.
- - **custom:depth(1-5)**: Depth 1: monitoring and logging pass only. Depth 2: add deployment and rollback pass. Depth 3: add incident response and scaling passes. Depth 4: add external model review. Depth 5: multi-model review with reconciliation.
+ - **custom:depth(1-5)**:
+ - Depth 1: Monitoring and logging pass only (1 review pass)
+ - Depth 2: Add deployment and rollback pass (2 review passes)
+ - Depth 3: Add incident response and scaling passes (4 review passes)
+ - Depth 4: Add external model review (4 review passes + external dispatch)
+ - Depth 5: Multi-model review with reconciliation (4 review passes + multi-model synthesis)
 
 ## Mode Detection
 Re-review mode if previous review exists. If multi-model review artifacts exist
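The new "(deep)" health-check criterion above can be pictured as a small data shape plus a completeness check. This is a hypothetical sketch, not the package's actual code: the interface and field names (`HealthCheckSpec`, `responseTimeSlaMs`, `failureThreshold`) are illustrative.

```typescript
// Illustrative shape for the criterion: each health check endpoint must
// specify an expected status code, a response-time SLA, and a failure
// threshold before the operations review passes it.
interface HealthCheckSpec {
  path: string;
  expectedStatus: number;    // e.g. 200
  responseTimeSlaMs: number; // SLA the runbook commits to, in milliseconds
  failureThreshold: number;  // consecutive failures before alerting
}

// Returns the paths of specs missing a required value, i.e. the
// findings a reviewer would raise against the runbook.
function incompleteChecks(specs: HealthCheckSpec[]): string[] {
  return specs
    .filter(
      (s) =>
        !Number.isInteger(s.expectedStatus) ||
        !(s.responseTimeSlaMs > 0) ||
        !(s.failureThreshold > 0),
    )
    .map((s) => s.path);
}

const specs: HealthCheckSpec[] = [
  { path: "/healthz", expectedStatus: 200, responseTimeSlaMs: 250, failureThreshold: 3 },
  { path: "/readyz", expectedStatus: 200, responseTimeSlaMs: 0, failureThreshold: 3 },
];
```

Under this sketch, `/readyz` would be flagged because its SLA is unset.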
@@ -38,17 +38,22 @@ independent review validation.
 - (deep) Secrets management covers: all environment variables, API keys, database credentials, and third-party tokens
 - (deep) Dependency audit scope covers all dependencies
 - (deep) Threat model covers all trust boundaries
- - (deep) Data classification covers every entity in the domain model
- - Every finding categorized P0-P3 with specific control, boundary, and issue
- - Fix plan documented for all P0/P1 findings; fixes applied to security-review.md and re-validated
- - Downstream readiness confirmed — no unresolved P0 or P1 findings remain before planning phase proceeds
- - (depth 4+) Multi-model findings synthesized with consensus/disagreement analysis
+ - (deep) If docs/domain-models/ exists, data classification covers every entity in the domain model. Otherwise, data classification derived from user stories and API contracts.
+ - (mvp) Every finding categorized P0-P3 (P0 = Breaks downstream work. P1 = Prevents quality milestone. P2 = Known tech debt. P3 = Polish.) with specific control, boundary, and issue
+ - (mvp) Fix plan documented for all P0/P1 findings; fixes applied to security-review.md and re-validated
+ - (mvp) Downstream readiness confirmed — no unresolved P0 or P1 findings remain before planning phase proceeds
+ - (depth 4+) Multi-model findings synthesized: Consensus (all models agree), Majority (2+ models agree), or Divergent (models disagree — present to user for decision)
 
 ## Methodology Scaling
 - **deep**: Full multi-pass review. Multi-model review dispatched to Codex and
 Gemini if available, with graceful fallback to Claude-only enhanced review.
 - **mvp**: OWASP coverage check only.
- - **custom:depth(1-5)**: Depth 1: OWASP top 10 and secrets management pass only. Depth 2: add auth boundary and input validation passes. Depth 3: add dependency audit and data protection passes. Depth 4: add external model security review. Depth 5: multi-model security review with reconciliation.
+ - **custom:depth(1-5)**:
+ - Depth 1: OWASP top 10 and secrets management pass only (1 review pass)
+ - Depth 2: Add auth boundary and input validation passes (2 review passes)
+ - Depth 3: Add dependency audit and data protection passes (4 review passes)
+ - Depth 4: Add external model security review (4 review passes + external dispatch)
+ - Depth 5: Multi-model security review with reconciliation (4 review passes + multi-model synthesis)
 
 ## Mode Detection
 Re-review mode if previous review exists. If multi-model review artifacts exist
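The P0-P3 readiness gate that this release threads through every review prompt can be sketched as a simple predicate. This is an illustrative model only; the `Finding` type and field names are assumptions, not the package's real data structures.

```typescript
// P0 breaks downstream work; P1 prevents the quality milestone;
// P2 is known tech debt; P3 is polish (per the diff's new wording).
type Severity = "P0" | "P1" | "P2" | "P3";

interface Finding {
  severity: Severity;
  location: string; // specific control/boundary, endpoint, table, etc.
  issue: string;
  resolved: boolean;
}

// The "downstream readiness confirmed" check: the next step may
// proceed only when no P0 or P1 finding remains unresolved.
function downstreamReady(findings: Finding[]): boolean {
  return findings.every(
    (f) => f.resolved || f.severity === "P2" || f.severity === "P3",
  );
}

const findings: Finding[] = [
  { severity: "P0", location: "auth boundary", issue: "tokens logged in plaintext", resolved: true },
  { severity: "P2", location: "dependency audit", issue: "outdated transitive dev dependency", resolved: false },
];
```

Here the resolved P0 and the open P2 leave the gate open; any unresolved P0/P1 would close it.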
@@ -33,21 +33,26 @@ independent review validation.
 
 ## Quality Criteria
 - (mvp) Coverage gaps by layer documented with severity
- - (deep) Domain invariant test cases verified
+ - (deep) If docs/domain-models/ exists, domain invariant test cases verified. Otherwise, test invariants derived from story acceptance criteria.
 - (deep) Each test environment assumption verified against actual environment config or flagged as unverifiable
 - (deep) Performance test coverage assessed against NFRs
 - (deep) Integration boundaries have integration tests defined
- - Every finding categorized P0-P3 with specific test layer, gap, and issue
- - Fix plan documented for all P0/P1 findings; fixes applied to tdd-standards.md and re-validated
- - Downstream readiness confirmed — no unresolved P0 or P1 findings remain before operations step proceeds
- - (depth 4+) Multi-model findings synthesized with consensus/disagreement analysis
+ - (mvp) Every finding categorized P0-P3 (P0 = Breaks downstream work. P1 = Prevents quality milestone. P2 = Known tech debt. P3 = Polish.) with specific test layer, gap, and issue
+ - (mvp) Fix plan documented for all P0/P1 findings; fixes applied to tdd-standards.md and re-validated
+ - (mvp) Downstream readiness confirmed — no unresolved P0 or P1 findings remain before operations step proceeds
+ - (depth 4+) Multi-model findings synthesized: Consensus (all models agree), Majority (2+ models agree), or Divergent (models disagree — present to user for decision)
 
 ## Methodology Scaling
 - **deep**: Full multi-pass review targeting all testing failure modes. Multi-model
 review dispatched to Codex and Gemini if available, with graceful fallback
 to Claude-only enhanced review.
 - **mvp**: Coverage gap check only.
- - **custom:depth(1-5)**: Depth 1: test coverage and pyramid balance pass only. Depth 2: add test quality and naming convention passes. Depth 3: add edge case coverage and CI integration passes. Depth 4: add external model review. Depth 5: multi-model review with reconciliation.
+ - **custom:depth(1-5)**:
+ - Depth 1: Test coverage and pyramid balance pass only (1 review pass)
+ - Depth 2: Add test quality and naming convention passes (2 review passes)
+ - Depth 3: Add edge case coverage and CI integration passes (4 review passes)
+ - Depth 4: Add external model review (4 review passes + external dispatch)
+ - Depth 5: Multi-model review with reconciliation (4 review passes + multi-model synthesis)
 
 ## Mode Detection
 Re-review mode if docs/reviews/review-testing.md or docs/reviews/testing/
@@ -47,8 +47,12 @@ threat modeling across all trust boundaries.
 scope. Compliance checklist (if applicable).
 - **mvp**: Key security controls. Auth approach. No secrets in code.
 Basic input validation strategy.
- - **custom:depth(1-5)**: Depth 1-2: MVP-style. Depth 3: add threat model.
- Depth 4-5: full security review.
+ - **custom:depth(1-5)**:
+ - Depth 1: key security controls and auth approach.
+ - Depth 2: add secrets management strategy and basic input validation.
+ - Depth 3: add threat model (basic STRIDE) and data classification.
+ - Depth 4: full threat model with OWASP analysis per component and compliance checklist.
+ - Depth 5: full security review with penetration testing scope, dependency audit strategy, and advanced controls.
 
 ## Mode Detection
 Check for docs/security-review.md. If it exists, operate in update mode: read
@@ -36,15 +36,17 @@ pending/skipped — developers implement them during TDD execution.
 ACs → test cases, and layer assignments (unit/integration/e2e)
 
 ## Quality Criteria
- - (mvp) Every user story in docs/user-stories.md has a corresponding test file
+ - (mvp) Every Must-have user story has a corresponding test file
 - (mvp) Every acceptance criterion has at least one tagged test case
- - Test cases are tagged with story ID and AC ID for traceability
+ - (mvp) Test cases are tagged with story ID and AC ID for traceability
 - (deep) Test layer assignment: single-function ACs → unit; cross-component ACs → integration; full user journey ACs → e2e
- - Test files use the project's test framework from docs/tech-stack.md
- - All test cases are created as pending/skipped (not implemented)
- - docs/story-tests-map.md shows 100% AC-to-test-case coverage
- - Test file location follows conventions from docs/project-structure.md
+ - (mvp) Test files use the project's test framework from docs/tech-stack.md
+ - (mvp) All test cases are created as pending/skipped (or equivalent framework pause/skip mechanism) (not implemented)
+ - (mvp) docs/story-tests-map.md shows 100% AC-to-test-case coverage
+ - (mvp) Test file location follows conventions from docs/project-structure.md
 - (deep) Test data fixtures and dependencies documented for each test file
+ - (deep) Each pending test case includes story ID and AC ID tags, GWT structure, and at least one assertion hint
+ - (mvp) If api-contracts.md does not exist, API test skeletons derived from user story acceptance criteria instead
 
 ## Methodology Scaling
 - **deep**: All stories get test files. Negative test cases for every happy path
@@ -52,9 +54,12 @@ pending/skipped — developers implement them during TDD execution.
 e2e where applicable). Traceability matrix with confidence analysis.
 - **mvp**: Test files for Must-have stories only. One test case per AC. No
 layer splitting — all tests in acceptance/ directory.
- - **custom:depth(1-5)**: Depth 1: Must-have stories only. Depth 2: add
- Should-have. Depth 3: add negative cases. Depth 4: add boundary conditions
- and layer splitting. Depth 5: full suite with all stories and edge cases.
+ - **custom:depth(1-5)**:
+ - Depth 1: Must-have stories only, one test case per AC
+ - Depth 2: Add Should-have stories
+ - Depth 3: Add negative test cases for every happy-path AC
+ - Depth 4: Add boundary condition tests and layer splitting (unit/integration/e2e)
+ - Depth 5: Full suite — all stories including Could-have, edge cases, and confidence analysis in traceability matrix
 
 ## Mode Detection
 Update mode if tests/acceptance/ directory exists. In update mode: add test
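The "100% AC-to-test-case coverage" criterion in the story-tests hunk amounts to a set-difference check between acceptance criteria and tagged pending tests. A minimal sketch, assuming illustrative ID formats (`US-1`, `AC-1.1`) and shapes that are not the package's actual story-tests-map schema:

```typescript
// Stories declare acceptance criteria; pending tests carry story and
// AC tags for traceability, as the diff's quality criteria require.
interface Story { id: string; acIds: string[] }
interface PendingTest { storyId: string; acId: string; title: string }

// ACs with no tagged pending test — each is a coverage gap that
// docs/story-tests-map.md would have to report.
function uncoveredAcs(stories: Story[], tests: PendingTest[]): string[] {
  const covered = new Set(tests.map((t) => t.acId));
  return stories.flatMap((s) => s.acIds.filter((ac) => !covered.has(ac)));
}

const stories: Story[] = [{ id: "US-1", acIds: ["AC-1.1", "AC-1.2"] }];
const tests: PendingTest[] = [
  { storyId: "US-1", acId: "AC-1.1", title: "rejects an invalid email" },
];
```

With one of two ACs tagged, the map is at 50% coverage and `AC-1.2` is the gap to close.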
@@ -27,7 +27,8 @@ enabling parallel development with confidence.
 shapes, error contracts, auth requirements
 
 ## Quality Criteria
- - (mvp) Every domain operation that crosses a component boundary has an API endpoint
+ - (mvp) Every domain operation that crosses a component boundary maps to >= 1 API endpoint
+ - (mvp) If domain-models/ does not exist, API boundaries derived from user story acceptance criteria
 - (mvp) Every endpoint documents: success response code, error response codes, error response body schema, and at least 2 domain-specific error codes per endpoint with human-readable reason phrases (e.g., 400 `invalid_email`, 409 `user_already_exists`)
 - (mvp) Authentication and authorization requirements per endpoint
 - (deep) Versioning strategy documented (if applicable)
@@ -35,6 +36,7 @@ enabling parallel development with confidence.
 - (deep) Idempotency documented for mutating operations
 - (deep) Pagination schema documented for all list endpoints (cursor or offset, page size limits, total count)
 - (mvp) Example request and response payloads included for each endpoint
+ - (mvp) Every API endpoint from system-architecture.md is specified
 
 ## Methodology Scaling
 - **deep**: OpenAPI-style specification. Full request/response schemas with
@@ -42,8 +44,12 @@ enabling parallel development with confidence.
 SDK generation considerations.
 - **mvp**: Endpoint list with HTTP methods and brief descriptions. Key
 request/response shapes. Auth approach.
- - **custom:depth(1-5)**: Depth 1-2: endpoint list. Depth 3: add schemas and
- error contracts. Depth 4-5: full OpenAPI-style spec.
+ - **custom:depth(1-5)**:
+ - Depth 1: endpoint list with HTTP methods and brief descriptions.
+ - Depth 2: endpoint list with key request/response shapes and auth approach.
+ - Depth 3: add full schemas, error contracts with domain-specific codes, and example payloads.
+ - Depth 4: full OpenAPI-style spec with rate limiting, pagination, and idempotency documentation.
+ - Depth 5: full spec with SDK generation considerations, versioning strategy, and auth flow diagrams.
 
 ## Mode Detection
 Check for docs/api-contracts.md. If it exists, operate in update mode: read
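The error-contract criterion above (≥2 domain-specific codes per endpoint, with human-readable reason phrases) can be sketched as a tiny contract table plus the check a reviewer would run against it. The type and field names are illustrative assumptions; only the example codes (`invalid_email`, `user_already_exists`) come from the diff itself.

```typescript
// One documented error outcome for an endpoint, per the api-contracts
// criterion: status code, machine-readable domain code, readable reason.
interface ErrorContract {
  status: number; // e.g. 400 or 409
  code: string;   // domain-specific error code
  reason: string; // human-readable reason phrase
}

type EndpointErrors = Record<string, ErrorContract[]>;

const errors: EndpointErrors = {
  "POST /users": [
    { status: 400, code: "invalid_email", reason: "Email address is not valid" },
    { status: 409, code: "user_already_exists", reason: "A user with this email already exists" },
  ],
};

// Endpoints falling short of the "at least 2 domain-specific error
// codes" bar — these would surface as review findings.
const underSpecified = Object.entries(errors)
  .filter(([, codes]) => codes.length < 2)
  .map(([endpoint]) => endpoint);
```

`POST /users` documents two codes here, so it clears the mvp bar; an endpoint with one or zero codes would land in `underSpecified`.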
@@ -29,19 +29,25 @@ from the application's query patterns.
 
 ## Quality Criteria
 - (mvp) Every domain entity maps to a table/collection (or justified denormalization)
+ - (mvp) If domain-models/ does not exist, entities derived from user story nouns and PRD feature descriptions
 - (mvp) Relationships match domain model relationships
 - (mvp) Constraints enforce domain invariants at the database level
 - (deep) Migration strategy specifies: migration tool, forward migration approach, rollback approach, and data preservation policy
 - (deep) Every migration is reversible (rollback script or equivalent exists)
 - (mvp) Indexes cover all query patterns referenced in docs/api-contracts.md (if it exists)
+ - (mvp) Schema does not contradict upstream domain models (entity names, relationships, and invariants match docs/domain-models/)
 
 ## Methodology Scaling
 - **deep**: Full schema specification. CREATE TABLE statements or equivalent.
 Index justification with query patterns. Normalization analysis. Migration
 plan with rollback strategy. Seed data strategy.
 - **mvp**: Entity-to-table mapping. Key relationships. Primary indexes only.
- - **custom:depth(1-5)**: Depth 1-2: mapping only. Depth 3: add indexes and
- constraints. Depth 4-5: full specification with migrations.
+ - **custom:depth(1-5)**:
+ - Depth 1: entity-to-table mapping with primary keys only.
+ - Depth 2: entity-to-table mapping with key relationships and primary indexes.
+ - Depth 3: add secondary indexes, constraints enforcing domain invariants, and normalization analysis.
+ - Depth 4: full specification with migration plan, rollback strategy, and index justification with query patterns.
+ - Depth 5: full specification with seed data strategy, performance annotations, and multi-environment migration considerations.
 
 ## Mode Detection
 Check for docs/database-schema.md. If it exists, operate in update mode: read
@@ -33,21 +33,27 @@ independent review validation.
 
 ## Quality Criteria
 - (mvp) Operation coverage against domain model verified
- - (deep) Error contracts complete and consistent
+ - (deep) Error contracts complete: every endpoint documents ≥2 domain-specific error codes, human-readable reason phrases, and a consistent error response schema
 - (deep) Auth requirements specified for every endpoint
 - (deep) Versioning strategy consistent with ADRs
 - (deep) Idempotency documented for all mutating operations
- - (mvp) Every finding categorized P0-P3 with specific endpoint, field, and issue
+ - (mvp) Every finding categorized P0-P3 (P0 = Breaks downstream work. P1 = Prevents quality milestone. P2 = Known tech debt. P3 = Polish.) with specific endpoint, field, and issue
 - (mvp) Fix plan documented for all P0/P1 findings; fixes applied to api-contracts.md and re-validated
+ - (mvp) Review report includes explicit Readiness Status section
 - (mvp) Downstream readiness confirmed — no unresolved P0 or P1 findings remain before UX spec proceeds
- - (depth 4+) Multi-model findings synthesized with consensus/disagreement analysis
+ - (depth 4+) Multi-model findings synthesized: Consensus (all models agree), Majority (2+ models agree), or Divergent (models disagree — present to user for decision)
 
 ## Methodology Scaling
 - **deep**: Full multi-pass review targeting all API failure modes. Multi-model
 review dispatched to Codex and Gemini if available, with graceful fallback
 to Claude-only enhanced review.
 - **mvp**: Operation coverage check only.
- - **custom:depth(1-5)**: Depth 1: endpoint coverage and response format pass only. Depth 2: add error handling and auth requirement passes. Depth 3: add idempotency, pagination, and versioning passes. Depth 4: add external model API review. Depth 5: multi-model review with reconciliation.
+ - **custom:depth(1-5)**:
+ - Depth 1: Endpoint coverage and response format pass only (1 review pass)
+ - Depth 2: Add error handling and auth requirement passes (2 review passes)
+ - Depth 3: Add idempotency, pagination, and versioning passes (4 review passes)
+ - Depth 4: Add external model API review (4 review passes + external dispatch)
+ - Depth 5: Multi-model review with reconciliation (4 review passes + multi-model synthesis)
 
 ## Mode Detection
 Re-review mode if previous review exists. If multi-model review artifacts exist
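The Consensus/Majority/Divergent synthesis rule that replaces the vaguer "consensus/disagreement analysis" wording across these review prompts reduces to a count over the models that reported a given finding. A sketch under assumed names (the model labels and function are illustrative):

```typescript
// Classify a multi-model finding by how many models reported it,
// per the diff's depth-4+ rule: all models = Consensus, 2+ = Majority,
// otherwise Divergent (presented to the user for a decision).
function classifyAgreement(
  reportingModels: string[],
  totalModels: number,
): "consensus" | "majority" | "divergent" {
  if (reportingModels.length === totalModels) return "consensus";
  if (reportingModels.length >= 2) return "majority";
  return "divergent";
}
```

So with three dispatched models, a finding raised by all three is Consensus, by two is Majority, and by one is Divergent and escalated to the user.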
@@ -36,17 +36,22 @@ independent review validation.
 - (deep) Index coverage for known query patterns verified
 - (deep) Migration safety assessed
 - (mvp) Referential integrity matches domain invariants
- - (mvp) Every finding categorized P0-P3 with specific table, column, and issue
+ - (mvp) Every finding categorized P0-P3 (P0 = Breaks downstream work. P1 = Prevents quality milestone. P2 = Known tech debt. P3 = Polish.) with specific table, column, and issue
 - (mvp) Fix plan documented for all P0/P1 findings; fixes applied to database-schema.md and re-validated
 - (mvp) Downstream readiness confirmed — no unresolved P0 or P1 findings remain before API contracts proceed
- - (depth 4+) Multi-model findings synthesized with consensus/disagreement analysis
+ - (depth 4+) Multi-model findings synthesized: Consensus (all models agree), Majority (2+ models agree), or Divergent (models disagree — present to user for decision)
 
 ## Methodology Scaling
 - **deep**: Full multi-pass review targeting all schema failure modes. Multi-model
 review dispatched to Codex and Gemini if available, with graceful fallback
 to Claude-only enhanced review.
 - **mvp**: Entity coverage check only.
- - **custom:depth(1-5)**: Depth 1: entity coverage and normalization pass only. Depth 2: add index strategy and migration safety passes. Depth 3: add query performance and data integrity passes. Depth 4: add external model review. Depth 5: multi-model review with reconciliation.
+ - **custom:depth(1-5)**:
+ - Depth 1: Entity coverage and normalization pass only (1 review pass)
+ - Depth 2: Add index strategy and migration safety passes (2 review passes)
+ - Depth 3: Add query performance and data integrity passes (4 review passes)
+ - Depth 4: Add external model review (4 review passes + external dispatch)
+ - Depth 5: Multi-model review with reconciliation (4 review passes + multi-model synthesis)
 
 ## Mode Detection
 Re-review mode if previous review exists. If multi-model review artifacts exist
@@ -37,16 +37,22 @@ independent review validation.
  - (deep) Every user action has at minimum: loading, success, and error states documented
  - (deep) Design system consistency verified
  - (deep) Error states present for all failure-capable actions
- - (mvp) Every finding categorized P0-P3 with specific flow, screen, and issue
+ - (mvp) Every finding categorized P0-P3 (P0 = Breaks downstream work. P1 = Prevents quality milestone. P2 = Known tech debt. P3 = Polish.) with specific flow, screen, and issue
  - (mvp) Fix plan documented for all P0/P1 findings; fixes applied to ux-spec.md and re-validated
+ - (mvp) Review report includes explicit Readiness Status section
  - (mvp) Downstream readiness confirmed — no unresolved P0 or P1 findings remain before quality phase proceeds
- - (depth 4+) Multi-model findings synthesized with consensus/disagreement analysis
+ - (depth 4+) Multi-model findings synthesized: Consensus (all models agree), Majority (2+ models agree), or Divergent (models disagree — present to user for decision)

  ## Methodology Scaling
  - **deep**: Full multi-pass review. Multi-model review dispatched to Codex and
  Gemini if available, with graceful fallback to Claude-only enhanced review.
  - **mvp**: Journey coverage only.
- - **custom:depth(1-5)**: Depth 1: flow completeness and accessibility pass only. Depth 2: add responsive design and error state passes. Depth 3: add interaction patterns and platform consistency passes. Depth 4: add external model UX review. Depth 5: multi-model review with reconciliation.
+ - **custom:depth(1-5)**:
+ - Depth 1: Flow completeness and accessibility pass only (1 review pass)
+ - Depth 2: Add responsive design and error state passes (2 review passes)
+ - Depth 3: Add interaction patterns and platform consistency passes (4 review passes)
+ - Depth 4: Add external model UX review (4 review passes + external dispatch)
+ - Depth 5: Multi-model review with reconciliation (4 review passes + multi-model synthesis)

  ## Mode Detection
  Re-review mode if previous review exists. If multi-model review artifacts exist
@@ -29,9 +29,10 @@ step consumes those tokens, it does not redefine them.
  - docs/ux-spec.md — UX specification with flows, components, design system

  ## Quality Criteria
- - (mvp) Every PRD user journey has a corresponding flow with all states documented
+ - (mvp) Every user story's acceptance criteria maps to >= 1 documented flow
+ - (mvp) If design-system.md does not exist, use framework defaults for spacing, typography, and color
  - (mvp) Component hierarchy covers all UI states (loading, error, empty, populated)
- - References design tokens from docs/design-system.md (does not redefine them)
+ - (mvp) References design tokens from docs/design-system.md (does not redefine them)
  - (deep) Accessibility requirements documented (WCAG level, keyboard nav, screen readers)
  - (deep) Responsive breakpoints defined with layout behavior per breakpoint
  - (mvp) Error states documented for every user action that can fail
@@ -42,8 +43,12 @@ step consumes those tokens, it does not redefine them.
  Complete design system. Interaction state machines. Accessibility audit
  checklist. Animation and transition specs.
  - **mvp**: Key user flows. Core component list. Basic design tokens.
- - **custom:depth(1-5)**: Depth 1-2: flows and components. Depth 3: add design
- system. Depth 4-5: full specification with accessibility.
+ - **custom:depth(1-5)**:
+ - Depth 1: key user flows with primary states (success and error).
+ - Depth 2: user flows with core component list and basic state documentation.
+ - Depth 3: add design system token references, interaction state machines, and responsive behavior.
+ - Depth 4: full specification with accessibility audit, keyboard navigation, and screen reader considerations.
+ - Depth 5: full specification with animation/transition specs, comprehensive WCAG compliance checklist, and detailed wireframe descriptions.

  ## Mode Detection
  Check for docs/ux-spec.md. If it exists, operate in update mode: read existing
@@ -32,12 +32,12 @@ spec gaps along the critical path.
  - docs/validation/critical-path-walkthrough/gemini-review.json (depth 4+, if available) — raw Gemini findings

  ## Quality Criteria
- - (mvp) Top critical user journeys (all Must-have epics, minimum 3) traced end-to-end
+ - (mvp) User specifies >= 3 Must-have epics as critical user journeys; each traced end-to-end
  - (deep) Every journey verified at each layer: PRD → Story → UX → API → Architecture → DB → Task
  - (deep) Each critical path verified against story acceptance criteria for behavioral correctness
- - Missing layers or broken handoffs documented with specific gap description
- - Findings categorized P0-P3 with specific file, section, and issue for each
- - (depth 4+) Multi-model findings synthesized with consensus/disagreement analysis
+ - (mvp) Missing layers or broken handoffs documented with specific gap description
+ - (mvp) Findings categorized P0-P3 with specific file, section, and issue for each
+ - (depth 4+) Multi-model findings synthesized: Consensus (all models agree), Majority (2+ models agree), or Divergent (models disagree — present to user for decision)

  ## Finding Disposition
  - **P0 (blocking)**: Must be resolved before proceeding to implementation. Create
@@ -57,7 +57,12 @@ proceeding without acknowledgment.
  dispatched to Codex and Gemini if available, with graceful fallback to
  Claude-only enhanced validation.
  - **mvp**: High-level scan for blocking issues only.
- - **custom:depth(1-5)**: Depth 1: identify critical path and verify task ordering. Depth 2: add dependency bottleneck analysis. Depth 3: full walkthrough simulating agent execution of critical path tasks. Depth 4: add external model simulation. Depth 5: multi-model walkthrough with divergence analysis.
+ - **custom:depth(1-5)**:
+ - Depth 1: identify critical path and verify task ordering.
+ - Depth 2: add dependency bottleneck analysis.
+ - Depth 3: full walkthrough simulating agent execution of critical path tasks.
+ - Depth 4: add external model simulation.
+ - Depth 5: multi-model walkthrough with divergence analysis.

  ## Mode Detection
  Not applicable — validation always runs fresh against current artifacts. If
@@ -33,9 +33,9 @@ drift patterns.
  - (mvp) Entity names are consistent across domain models, database schema, and API contracts (zero mismatches)
  - (mvp) Technology references match `docs/tech-stack.md` in all documents
  - (deep) Data flow descriptions in architecture match API endpoint definitions
- - (deep) Terminology is consistent (same concept never uses two different names)
- - Findings categorized P0-P3 with specific file, section, and issue for each
- - (depth 4+) Multi-model findings synthesized with consensus/disagreement analysis
+ - (deep) Every named entity in the domain model has exactly one name used consistently across domain-models/, api-contracts.md, database-schema.md, and ux-spec.md
+ - (mvp) Findings categorized P0-P3 with specific file, section, and issue for each
+ - (depth 4+) Multi-model findings synthesized: Consensus (all models agree), Majority (2+ models agree), or Divergent (models disagree — present to user for decision)

  ## Finding Disposition
  - **P0 (blocking)**: Must be resolved before proceeding to implementation. Create
@@ -55,7 +55,12 @@ proceeding without acknowledgment.
  dispatched to Codex and Gemini if available, with graceful fallback to
  Claude-only enhanced validation.
  - **mvp**: High-level scan for blocking issues only.
- - **custom:depth(1-5)**: Depth 1: entity name check across PRD, user stories, and domain models. Depth 2: add tech stack reference consistency. Depth 3: full terminology audit across all documents with naming collision detection. Depth 4: add external model cross-check. Depth 5: multi-model reconciliation of consistency findings.
+ - **custom:depth(1-5)**:
+ - Depth 1: entity name check across PRD, user stories, and domain models.
+ - Depth 2: add tech stack reference consistency.
+ - Depth 3: full terminology audit across all documents with naming collision detection.
+ - Depth 4: add external model cross-check.
+ - Depth 5: multi-model reconciliation of consistency findings.

  ## Mode Detection
  Not applicable — validation always runs fresh against current artifacts. If
@@ -35,8 +35,8 @@ decisions.
  - (mvp) No two ADRs contradict each other
  - (deep) Every ADR has alternatives-considered section with pros/cons
  - (deep) Every ADR referenced in `docs/system-architecture.md` exists in `docs/adrs/`
- - Findings categorized P0-P3 with specific file, section, and issue for each
- - (depth 4+) Multi-model findings synthesized with consensus/disagreement analysis
+ - (mvp) Findings categorized P0-P3 with specific file, section, and issue for each
+ - (depth 4+) Multi-model findings synthesized: Consensus (all models agree), Majority (2+ models agree), or Divergent (models disagree — present to user for decision)

  ## Finding Disposition
  - **P0 (blocking)**: Must be resolved before proceeding to implementation. Create
@@ -56,7 +56,12 @@ proceeding without acknowledgment.
  dispatched to Codex and Gemini if available, with graceful fallback to
  Claude-only enhanced validation.
  - **mvp**: High-level scan for blocking issues only.
- - **custom:depth(1-5)**: Depth 1: verify each major tech choice has an ADR. Depth 2: add alternatives-considered check. Depth 3: full ADR completeness audit (rationale, consequences, status). Depth 4: add external model review of decision quality. Depth 5: multi-model reconciliation of decision coverage.
+ - **custom:depth(1-5)**:
+ - Depth 1: verify each major tech choice has an ADR.
+ - Depth 2: add alternatives-considered check.
+ - Depth 3: full ADR completeness audit (rationale, consequences, status).
+ - Depth 4: add external model review of decision quality.
+ - Depth 5: multi-model reconciliation of decision coverage.

  ## Mode Detection
  Not applicable — validation always runs fresh against current artifacts. If
@@ -36,8 +36,8 @@ and completeness issues.
  - (deep) Critical path identified and total estimated duration documented
  - (deep) No task is blocked by more than 3 sequential dependencies (flag deep chains)
  - (deep) Wave assignments are consistent with dependency ordering
- - Findings categorized P0-P3 with specific file, section, and issue for each
- - (depth 4+) Multi-model findings synthesized with consensus/disagreement analysis
+ - (mvp) Findings categorized P0-P3 with specific file, section, and issue for each
+ - (depth 4+) Multi-model findings synthesized: Consensus (all models agree), Majority (2+ models agree), or Divergent (models disagree — present to user for decision)

  ## Finding Disposition
  - **P0 (blocking)**: Must be resolved before proceeding to implementation. Create
@@ -57,7 +57,12 @@ proceeding without acknowledgment.
  dispatched to Codex and Gemini if available, with graceful fallback to
  Claude-only enhanced validation.
  - **mvp**: High-level scan for blocking issues only.
- - **custom:depth(1-5)**: Depth 1: cycle detection and basic ordering check. Depth 2: add transitive dependency completeness. Depth 3: full DAG validation with critical path identification and parallelization opportunities. Depth 4: add external model review. Depth 5: multi-model validation with optimization recommendations.
+ - **custom:depth(1-5)**:
+ - Depth 1: cycle detection and basic ordering check.
+ - Depth 2: add transitive dependency completeness.
+ - Depth 3: full DAG validation with critical path identification and parallelization opportunities.
+ - Depth 4: add external model review.
+ - Depth 5: multi-model validation with optimization recommendations.

  ## Mode Detection
  Not applicable — validation always runs fresh against current artifacts. If
@@ -31,13 +31,12 @@ when simulating implementation.
  - docs/validation/implementability-dry-run/gemini-review.json (depth 4+, if available) — raw Gemini findings

  ## Quality Criteria
- - (mvp) Every task has sufficient input specification for an agent to start without guessing
- - (mvp) Every task has testable acceptance criteria
+ - (mvp) Every task specifies: input file paths, expected output artifacts, testable acceptance criteria, and references to upstream documents
  - (deep) No task references undefined concepts, components, or APIs
  - (deep) Every task's dependencies are present in the implementation plan
  - (deep) Shared code patterns identified and documented (no duplication risk across tasks)
- - Findings categorized P0-P3 with specific file, section, and issue for each
- - (depth 4+) Multi-model findings synthesized with consensus/disagreement analysis
+ - (mvp) Findings categorized P0-P3 with specific file, section, and issue for each
+ - (depth 4+) Multi-model findings synthesized: Consensus (all models agree), Majority (2+ models agree), or Divergent (models disagree — present to user for decision)

  ## Finding Disposition
  - **P0 (blocking)**: Must be resolved before proceeding to implementation. Create
@@ -57,7 +56,12 @@ proceeding without acknowledgment.
  dispatched to Codex and Gemini if available, with graceful fallback to
  Claude-only enhanced validation.
  - **mvp**: High-level scan for blocking issues only.
- - **custom:depth(1-5)**: Depth 1: verify each task has enough context to start. Depth 2: add tool/dependency availability check. Depth 3: full dry-run simulation of first 3 tasks with quality gate verification. Depth 4: add external model dry-run. Depth 5: multi-model dry-run with implementation plan revision recommendations.
+ - **custom:depth(1-5)**:
+ - Depth 1: verify each task has enough context to start.
+ - Depth 2: add tool/dependency availability check.
+ - Depth 3: full dry-run simulation of first 3 tasks with quality gate verification.
+ - Depth 4: add external model dry-run.
+ - Depth 5: multi-model dry-run with implementation plan revision recommendations.

  ## Mode Detection
  Not applicable — validation always runs fresh against current artifacts. If
@@ -33,13 +33,13 @@ differently, surfacing subtle creep.
  - docs/validation/scope-creep-check/gemini-review.json (depth 4+, if available) — raw Gemini findings

  ## Quality Criteria
- - (mvp) Every user story traces back to a PRD feature or requirement
- - (mvp) Every architecture component traces to a PRD requirement
- - Items beyond PRD scope are flagged with disposition (remove, defer, or justify)
+ - (mvp) Every user story maps to a PRD feature or requirement
+ - (mvp) Every architecture component maps to a PRD requirement
+ - (mvp) Items beyond PRD scope are flagged with disposition (remove, defer, or justify)
  - (deep) No "gold-plating" — implementation tasks do not exceed story acceptance criteria
  - (deep) Feature count has not grown beyond PRD scope without documented justification
- - Findings categorized P0-P3 with specific file, section, and issue for each
- - (depth 4+) Multi-model findings synthesized with consensus/disagreement analysis
+ - (mvp) Findings categorized P0-P3 with specific file, section, and issue for each
+ - (depth 4+) Multi-model findings synthesized: Consensus (all models agree), Majority (2+ models agree), or Divergent (models disagree — present to user for decision)

  ## Finding Disposition
  - **P0 (blocking)**: Must be resolved before proceeding to implementation. Create
@@ -59,7 +59,12 @@ proceeding without acknowledgment.
  dispatched to Codex and Gemini if available, with graceful fallback to
  Claude-only enhanced validation.
  - **mvp**: High-level scan for blocking issues only.
- - **custom:depth(1-5)**: Depth 1: feature count comparison (PRD vs implementation plan). Depth 2: add component-level tracing. Depth 3: full story-level and task-level audit against original PRD scope. Depth 4: add external model scope assessment. Depth 5: multi-model scope review with risk-weighted creep analysis.
+ - **custom:depth(1-5)**:
+ - Depth 1: feature count comparison (PRD vs implementation plan).
+ - Depth 2: add component-level tracing.
+ - Depth 3: full story-level and task-level audit against original PRD scope.
+ - Depth 4: add external model scope assessment.
+ - Depth 5: multi-model scope review with risk-weighted creep analysis.

  ## Mode Detection
  Not applicable — validation always runs fresh against current artifacts. If
@@ -35,14 +35,14 @@ coverage gaps.
  - docs/validation/traceability-matrix/gemini-review.json (depth 4+, if available) — raw Gemini findings

  ## Quality Criteria
- - (mvp) Every PRD requirement maps to >= 1 user story
+ - (mvp) Every feature and user-facing behavior in the PRD's feature list maps to >= 1 user story
  - (mvp) Every user story maps to >= 1 implementation task
  - (deep) Every acceptance criterion maps to >= 1 test case (verified against `docs/story-tests-map.md`)
  - (deep) Every test case maps to >= 1 implementation task
- - (deep) No orphan items in either direction at any layer
+ - (deep) Every Must-have and Should-have item maps to >= 1 downstream artifact. Nice-to-have items may be orphaned with explicit rationale.
  - (deep) Bidirectional traceability verified: PRD → Stories → Domain → Architecture → Tasks
- - Findings categorized P0-P3 with specific file, section, and issue for each
- - (depth 4+) Multi-model findings synthesized with consensus/disagreement analysis
+ - (mvp) Findings categorized P0-P3 with specific file, section, and issue for each
+ - (depth 4+) Multi-model findings synthesized: Consensus (all models agree), Majority (2+ models agree), or Divergent (models disagree — present to user for decision)

  ## Finding Disposition
  - **P0 (blocking)**: Must be resolved before proceeding to implementation. Create
@@ -62,7 +62,12 @@ proceeding without acknowledgment.
  dispatched to Codex and Gemini if available, with graceful fallback to
  Claude-only enhanced validation.
  - **mvp**: High-level scan for blocking issues only.
- - **custom:depth(1-5)**: Depth 1: PRD requirement to user story mapping only. Depth 2: add story to implementation task mapping. Depth 3: full bidirectional chain (PRD → story → task → test → eval). Depth 4: add external model verification of coverage gaps. Depth 5: multi-model reconciliation with gap resolution recommendations.
+ - **custom:depth(1-5)**:
+ - Depth 1: PRD requirement to user story mapping only.
+ - Depth 2: add story to implementation task mapping.
+ - Depth 3: full bidirectional chain (PRD → story → task → test → eval).
+ - Depth 4: add external model verification of coverage gaps.
+ - Depth 5: multi-model reconciliation with gap resolution recommendations.

  ## Mode Detection
  Not applicable — validation always runs fresh against current artifacts. If
@@ -30,7 +30,7 @@ throughout the entire pipeline.
  - (mvp) Vision statement describes positive change in the world, not a product feature
  - (mvp) Vision statement is a single sentence of 25 words or fewer
  - (mvp) Target audience defined by behaviors and motivations, not demographics
- - (deep) For each competitor, at least one strength is documented alongside weaknesses
+ - (deep) Each named competitor has >= 1 documented strength and >= 1 documented weakness with specific examples
  - (mvp) Each guiding principle is framed as 'We choose X over Y' where Y is a legitimate alternative
  - (deep) Anti-vision contains >= 3 named traps, each referencing a concrete product direction or feature class
  - (deep) Business model addresses sustainability without being a full business plan
@@ -43,9 +43,12 @@ throughout the entire pipeline.
  analysis, multi-year success horizon. 3-5 pages.
  - **mvp**: Vision statement, target audience, core problem, value proposition,
  2-3 guiding principles. 1 page. Enough to anchor the PRD.
- - **custom:depth(1-5)**: Depth 1-2: MVP-style. Depth 3: add competitive
- landscape and anti-vision. Depth 4: add business model and strategic risks.
- Depth 5: full document with all 12 sections.
+ - **custom:depth(1-5)**:
+ - Depth 1: MVP-style — vision statement, target audience, core problem, value proposition. 1 page.
+ - Depth 2: MVP + 2-3 guiding principles with tradeoff framing. 1-2 pages.
+ - Depth 3: Add competitive landscape and anti-vision. 2-3 pages.
+ - Depth 4: Add business model, strategic risks, and success horizon. 3-4 pages.
+ - Depth 5: Full document with all 12 sections, multi-year success criteria. 3-5 pages.

  ## Mode Detection
  If docs/vision.md exists, this is an update. Read and analyze the existing