@zigrivers/scaffold 2.1.1 → 2.28.1

Files changed (100)
  1. package/README.md +272 -59
  2. package/dist/project/frontmatter.d.ts.map +1 -1
  3. package/dist/project/frontmatter.js +4 -0
  4. package/dist/project/frontmatter.js.map +1 -1
  5. package/knowledge/core/adr-craft.md +53 -0
  6. package/knowledge/core/ai-memory-management.md +246 -0
  7. package/knowledge/core/api-design.md +4 -0
  8. package/knowledge/core/claude-md-patterns.md +254 -0
  9. package/knowledge/core/coding-conventions.md +246 -0
  10. package/knowledge/core/database-design.md +4 -0
  11. package/knowledge/core/design-system-tokens.md +465 -0
  12. package/knowledge/core/dev-environment.md +223 -0
  13. package/knowledge/core/domain-modeling.md +4 -0
  14. package/knowledge/core/eval-craft.md +1008 -0
  15. package/knowledge/core/multi-model-review-dispatch.md +250 -0
  16. package/knowledge/core/operations-runbook.md +37 -226
  17. package/knowledge/core/project-structure-patterns.md +231 -0
  18. package/knowledge/core/review-step-template.md +247 -0
  19. package/knowledge/core/{security-review.md → security-best-practices.md} +5 -1
  20. package/knowledge/core/task-decomposition.md +57 -34
  21. package/knowledge/core/task-tracking.md +225 -0
  22. package/knowledge/core/tech-stack-selection.md +214 -0
  23. package/knowledge/core/testing-strategy.md +63 -70
  24. package/knowledge/core/user-stories.md +69 -60
  25. package/knowledge/core/user-story-innovation.md +57 -0
  26. package/knowledge/core/ux-specification.md +5 -148
  27. package/knowledge/finalization/apply-fixes-and-freeze.md +165 -14
  28. package/knowledge/product/prd-craft.md +55 -34
  29. package/knowledge/review/review-adr.md +32 -0
  30. package/knowledge/review/{review-api-contracts.md → review-api-design.md} +34 -1
  31. package/knowledge/review/{review-database-schema.md → review-database-design.md} +27 -1
  32. package/knowledge/review/review-domain-modeling.md +33 -0
  33. package/knowledge/review/review-implementation-tasks.md +50 -0
  34. package/knowledge/review/review-operations.md +55 -0
  35. package/knowledge/review/review-prd.md +33 -0
  36. package/knowledge/review/review-security.md +53 -0
  37. package/knowledge/review/review-system-architecture.md +28 -0
  38. package/knowledge/review/review-testing-strategy.md +51 -0
  39. package/knowledge/review/review-user-stories.md +54 -0
  40. package/knowledge/review/{review-ux-spec.md → review-ux-specification.md} +37 -1
  41. package/methodology/custom-defaults.yml +32 -3
  42. package/methodology/deep.yml +32 -3
  43. package/methodology/mvp.yml +32 -3
  44. package/package.json +2 -1
  45. package/pipeline/architecture/review-architecture.md +18 -6
  46. package/pipeline/architecture/system-architecture.md +14 -2
  47. package/pipeline/consolidation/claude-md-optimization.md +73 -0
  48. package/pipeline/consolidation/workflow-audit.md +73 -0
  49. package/pipeline/decisions/adrs.md +14 -2
  50. package/pipeline/decisions/review-adrs.md +18 -5
  51. package/pipeline/environment/ai-memory-setup.md +70 -0
  52. package/pipeline/environment/automated-pr-review.md +70 -0
  53. package/pipeline/environment/design-system.md +73 -0
  54. package/pipeline/environment/dev-env-setup.md +65 -0
  55. package/pipeline/environment/git-workflow.md +71 -0
  56. package/pipeline/finalization/apply-fixes-and-freeze.md +1 -1
  57. package/pipeline/finalization/developer-onboarding-guide.md +1 -1
  58. package/pipeline/finalization/implementation-playbook.md +3 -3
  59. package/pipeline/foundation/beads.md +68 -0
  60. package/pipeline/foundation/coding-standards.md +68 -0
  61. package/pipeline/foundation/project-structure.md +69 -0
  62. package/pipeline/foundation/tdd.md +60 -0
  63. package/pipeline/foundation/tech-stack.md +74 -0
  64. package/pipeline/integration/add-e2e-testing.md +65 -0
  65. package/pipeline/modeling/domain-modeling.md +14 -2
  66. package/pipeline/modeling/review-domain-modeling.md +18 -5
  67. package/pipeline/parity/platform-parity-review.md +70 -0
  68. package/pipeline/planning/implementation-plan-review.md +56 -0
  69. package/pipeline/planning/{implementation-tasks.md → implementation-plan.md} +29 -9
  70. package/pipeline/pre/create-prd.md +13 -4
  71. package/pipeline/pre/innovate-prd.md +37 -8
  72. package/pipeline/pre/innovate-user-stories.md +38 -7
  73. package/pipeline/pre/review-prd.md +18 -6
  74. package/pipeline/pre/review-user-stories.md +23 -6
  75. package/pipeline/pre/user-stories.md +12 -2
  76. package/pipeline/quality/create-evals.md +102 -0
  77. package/pipeline/quality/operations.md +38 -13
  78. package/pipeline/quality/review-operations.md +17 -5
  79. package/pipeline/quality/review-security.md +17 -5
  80. package/pipeline/quality/review-testing.md +20 -8
  81. package/pipeline/quality/security.md +25 -3
  82. package/pipeline/quality/story-tests.md +73 -0
  83. package/pipeline/specification/api-contracts.md +17 -2
  84. package/pipeline/specification/database-schema.md +17 -2
  85. package/pipeline/specification/review-api.md +18 -6
  86. package/pipeline/specification/review-database.md +18 -6
  87. package/pipeline/specification/review-ux.md +19 -7
  88. package/pipeline/specification/ux-spec.md +29 -10
  89. package/pipeline/validation/critical-path-walkthrough.md +34 -7
  90. package/pipeline/validation/cross-phase-consistency.md +34 -7
  91. package/pipeline/validation/decision-completeness.md +34 -7
  92. package/pipeline/validation/dependency-graph-validation.md +34 -7
  93. package/pipeline/validation/implementability-dry-run.md +34 -7
  94. package/pipeline/validation/scope-creep-check.md +34 -7
  95. package/pipeline/validation/traceability-matrix.md +34 -7
  96. package/skills/multi-model-dispatch/SKILL.md +326 -0
  97. package/skills/scaffold-pipeline/SKILL.md +195 -0
  98. package/skills/scaffold-runner/SKILL.md +465 -0
  99. package/pipeline/planning/review-tasks.md +0 -38
  100. package/pipeline/quality/testing-strategy.md +0 -42
@@ -0,0 +1,102 @@
+ ---
+ name: create-evals
+ description: Generate project-specific eval checks from standards documentation
+ phase: "quality"
+ order: 920
+ dependencies: [tdd, story-tests]
+ outputs: [tests/evals/, docs/eval-standards.md]
+ reads: [story-tests]
+ conditional: null
+ knowledge-base: [eval-craft, testing-strategy]
+ ---
+
+ ## Purpose
+ Generate automated eval checks that verify AI-generated code meets the project's
+ own documented standards. Evals are test files in the project's own test framework
+ — not a separate tool. They check up to 13 categories: 5 core (always generated)
+ and 8 conditional (generated when their source document exists). Core: consistency,
+ structure, adherence, coverage, cross-doc. Conditional: architecture conformance,
+ API contract, security patterns, database schema, accessibility, performance budget,
+ configuration validation, error handling completeness.
+
+ ## Inputs
+ - docs/tech-stack.md (required) — determines test framework and stack-specific patterns
+ - docs/coding-standards.md (required) — adherence and error handling patterns
+ - docs/tdd-standards.md (required) — test co-location rules, mocking strategy
+ - docs/project-structure.md (required) — file placement rules for structure evals
+ - CLAUDE.md (required) — Key Commands table for consistency evals
+ - Makefile or package.json (required) — build targets to match against
+ - tests/acceptance/ (optional) — story test skeletons for coverage validation
+ - docs/user-stories.md (optional) — acceptance criteria for coverage evals
+ - docs/plan.md (optional) — feature list for coverage evals, performance NFRs
+ - docs/system-architecture.md (optional) — architecture conformance evals
+ - docs/api-contracts.md (optional) — API contract validation evals
+ - docs/security-review.md (optional) — security pattern verification evals
+ - docs/database-schema.md (optional) — database schema conformance evals
+ - docs/ux-spec.md (optional) — accessibility compliance evals
+ - docs/dev-setup.md (optional) — configuration validation evals
+
+ ## Expected Outputs
+
+ Core (always generated):
+ - tests/evals/consistency.test.* — command matching, format checking, cross-doc refs
+ - tests/evals/structure.test.* — file placement, shared code rules, test co-location
+ - tests/evals/adherence.test.* — coding convention patterns, mock rules, TODO format
+ - tests/evals/coverage.test.* — feature-to-code mapping, AC-to-test mapping
+ - tests/evals/cross-doc.test.* — tech stack consistency, path consistency, terminology
+
+ Conditional (generated when source doc exists):
+ - tests/evals/architecture.test.* — layer direction, module boundaries, circular deps
+ - tests/evals/api-contract.test.* — endpoint existence, methods, error codes
+ - tests/evals/security.test.* — auth middleware, secrets, input validation, SQL injection
+ - tests/evals/database.test.* — migration coverage, columns, indexes, relationships
+ - tests/evals/accessibility.test.* — ARIA, alt text, focus styles, contrast
+ - tests/evals/performance.test.* — budget files, bundle tracking, perf test existence
+ - tests/evals/config.test.* — env var docs, dead config, startup validation
+ - tests/evals/error-handling.test.* — bare catches, error responses tested, custom errors
+
+ Supporting:
+ - tests/evals/helpers.* — shared utilities
+ - docs/eval-standards.md — documents what is and isn't checked
+ - make eval target added to Makefile/package.json
+
+ ## Quality Criteria
+ - (mvp) Consistency + Structure evals generated
+ - (mvp) Evals use the project's own test framework from docs/tech-stack.md
+ - (mvp) All generated evals pass on the current codebase (no false positives)
+ - (mvp) Eval results are binary PASS/FAIL, not scores
+ - (mvp) make eval is separate from make test and make check (opt-in for CI)
+ - (deep) All applicable eval categories generated including security, API, DB, accessibility (conditional on source doc existence)
+ - (deep) Adherence, security, and error-handling evals include exclusion mechanisms
+ - (deep) docs/eval-standards.md explicitly documents what evals do NOT check
+ - (deep) Full eval suite runs in under 30 seconds
+
+ ## Methodology Scaling
+ - **deep**: All 13 eval categories (conditional on doc existence). Stack-specific
+ patterns. Coverage with keyword extraction. Cross-doc consistency. Architecture
+ conformance. API contract validation. Security patterns. Full suite.
+ - **mvp**: Consistency + Structure only. Skip everything else.
+ - **custom:depth(1-5)**:
+ - Depth 1-2: Consistency + Structure
+ - Depth 3: Add Adherence + Cross-doc
+ - Depth 4: Add Coverage + Architecture + Config + Error handling
+ - Depth 5: All 13 categories (Security, API, Database, Accessibility, Performance)
+
+ ## Mode Detection
+ Update mode if tests/evals/ directory exists. In update mode: regenerate
+ consistency, structure, cross-doc, and conditional category evals. Preserve
+ adherence, security, and error-handling eval exclusions. Regenerate coverage
+ evals only if plan.md or user-stories.md changed. Add/remove conditional
+ categories based on whether their source doc exists.
+
+ ## Update Mode Specifics
+ - **Detect prior artifact**: tests/evals/ directory exists with eval test files
+ - **Preserve**: adherence eval exclusions, security eval exclusions,
+ error-handling eval exclusions, custom helper utilities in tests/evals/helpers,
+ make eval target configuration
+ - **Triggers for update**: source docs changed (coding-standards, project-structure,
+ tech-stack), new conditional source docs appeared (e.g., security-review.md
+ now exists), Makefile targets changed, user-stories.md changed
+ - **Conflict resolution**: if a source doc was removed, archive its conditional
+ eval category rather than deleting; if exclusion patterns conflict with new
+ standards, flag for user review
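
The depth tiers and the "generated when source doc exists" rule in the create-evals step above can be read as a small selection function. The following is a minimal sketch of that reading, not the package's implementation: the names `DEPTH_TIERS`, `SOURCE_DOCS`, and `eval_categories` are hypothetical, and the pairing of each conditional category with a source document is inferred from the Inputs list (the performance and error-handling pairings in particular are assumptions).

```python
# Categories introduced at each depth, per the custom:depth(1-5) list above.
DEPTH_TIERS = {
    1: ["consistency", "structure"],
    3: ["adherence", "cross-doc"],
    4: ["coverage", "architecture", "config", "error-handling"],
    5: ["security", "api-contract", "database", "accessibility", "performance"],
}

# Conditional categories and the document assumed to gate each one.
# This mapping is inferred from the Inputs section, not stated as a table there.
SOURCE_DOCS = {
    "architecture": "docs/system-architecture.md",
    "api-contract": "docs/api-contracts.md",
    "security": "docs/security-review.md",
    "database": "docs/database-schema.md",
    "accessibility": "docs/ux-spec.md",
    "performance": "docs/plan.md",
    "config": "docs/dev-setup.md",
    "error-handling": "docs/coding-standards.md",
}


def eval_categories(depth: int, existing_docs: set[str]) -> list[str]:
    """Return the eval categories to generate for a depth and a set of docs."""
    if not 1 <= depth <= 5:
        raise ValueError("depth must be between 1 and 5")
    selected = []
    for tier, names in DEPTH_TIERS.items():
        if tier > depth:
            continue
        for name in names:
            gate = SOURCE_DOCS.get(name)
            # Ungated categories are always kept; gated ones need their doc.
            if gate is None or gate in existing_docs:
                selected.append(name)
    return selected
```

Passing the set of documents explicitly, rather than probing the filesystem inside the function, keeps the selection logic easy to test against a project that has only some of the optional docs.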
@@ -1,8 +1,8 @@
  ---
  name: operations
- description: Define operations, deployment, and dev environment strategy
+ description: Define deployment pipeline, deployment strategy, monitoring, alerting, and incident response
  phase: "quality"
- order: 21
+ order: 930
  dependencies: [review-testing]
  outputs: [docs/operations-runbook.md]
  conditional: null
@@ -10,33 +10,58 @@ knowledge-base: [operations-runbook]
  ---
 
  ## Purpose
- Define the operational strategy: CI/CD pipeline, deployment approach, monitoring
- and alerting, incident response, rollback procedures, and dev environment setup.
- This is both the production operations guide and the local development workflow.
+ Define the production operational strategy: deployment pipeline (extending the
+ base CI from git-workflow), deployment approach, monitoring and alerting, incident
+ response, and rollback procedures. References docs/dev-setup.md for local
+ development setup rather than redefining it.
 
  ## Inputs
  - docs/system-architecture.md (required) — what to deploy
- - docs/testing-strategy.md (required) — CI pipeline test stages
+ - docs/tdd-standards.md (required) — CI pipeline test stages
  - docs/adrs/ (required) — infrastructure decisions
+ - docs/dev-setup.md (optional) — local dev setup to reference, not redefine
+ - docs/git-workflow.md (optional) — base CI pipeline to extend, not redefine
 
  ## Expected Outputs
- - docs/operations-runbook.md — operations and deployment runbook
+ - docs/operations-runbook.md — production operations and deployment runbook
 
  ## Quality Criteria
- - CI/CD pipeline defined with all stages (build, test, lint, deploy)
+ - Deployment pipeline extends existing CI (build, deploy, post-deploy stages)
+ - Deployment pipeline has explicit stages (build → test → deploy → verify → rollback-ready)
+ - Does not redefine base CI stages (lint, test) from git-workflow
  - Deployment strategy chosen with rollback procedure
+ - Rollback procedure tested with specific trigger conditions (e.g., error rate > X%, health check failure)
+ - Runbook structured by operational scenario (deployment, rollback, incident, scaling)
  - Monitoring covers key metrics (latency, error rate, saturation)
+ - Each monitoring metric has an explicit threshold with rationale
+ - Health check endpoints defined with expected response codes and latency bounds
+ - Log aggregation strategy specifies retention period and searchable fields
  - Alerting thresholds are justified, not arbitrary
- - Dev environment setup is documented and reproducible
+ - References docs/dev-setup.md for local dev — does not redefine it
  - Incident response process defined
 
  ## Methodology Scaling
  - **deep**: Full runbook. Deployment topology diagrams. Monitoring dashboard
- specs. Alert playbooks. DR plan. Capacity planning. Local dev with
- containers matching production.
- - **mvp**: Basic CI/CD pipeline. Deploy command. How to run locally.
+ specs. Alert playbooks. DR plan. Capacity planning.
+ - **mvp**: Deploy command. Basic monitoring. Rollback procedure.
  - **custom:depth(1-5)**: Depth 1-2: MVP-style. Depth 3: add monitoring and
  alerts. Depth 4-5: full runbook with DR.
 
  ## Mode Detection
- Update mode if runbook exists.
+ Check for docs/operations-runbook.md. If it exists, operate in update mode:
+ read existing runbook and diff against current system architecture, ADRs, and
+ deployment configuration. Preserve existing deployment procedures, monitoring
+ thresholds, and incident response processes. Update deployment pipeline stages
+ if architecture changed. Never modify rollback procedures without user approval.
+
+ ## Update Mode Specifics
+ - **Detect prior artifact**: docs/operations-runbook.md exists
+ - **Preserve**: deployment procedures, monitoring thresholds, alerting rules,
+ incident response processes, rollback procedures, environment-specific
+ configurations
+ - **Triggers for update**: architecture changed deployment topology, new ADRs
+ changed infrastructure, security review identified operational requirements,
+ CI pipeline changed (new stages to extend)
+ - **Conflict resolution**: if architecture changed the deployment target,
+ update deployment stages but preserve monitoring and alerting sections;
+ verify runbook does not redefine base CI stages from git-workflow.md
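
The new quality criterion asks for rollback triggers stated as specific conditions (error rate above a threshold, health check failure). One way a runbook could make those conditions concrete is shown below. This is an illustrative sketch only: `RollbackPolicy`, `should_roll_back`, and the threshold values are hypothetical and do not come from the package.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RollbackPolicy:
    """Explicit trigger thresholds; the values used are placeholders."""
    max_error_rate: float            # fraction of failed requests, e.g. 0.05
    max_health_check_failures: int   # consecutive failures tolerated


def should_roll_back(
    policy: RollbackPolicy,
    error_rate: float,
    consecutive_health_failures: int,
) -> tuple[bool, str]:
    """Return (decision, reason) so the reason can be logged with the rollback."""
    if error_rate > policy.max_error_rate:
        return True, (
            f"error rate {error_rate:.1%} exceeds {policy.max_error_rate:.1%}"
        )
    if consecutive_health_failures > policy.max_health_check_failures:
        return True, (
            f"{consecutive_health_failures} consecutive health check failures"
        )
    return False, "within thresholds"
```

Returning the reason alongside the decision matches the criterion that thresholds carry a rationale: the post-deploy log then records which documented condition fired.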
@@ -2,9 +2,9 @@
  name: review-operations
  description: Review operations runbook for completeness and safety
  phase: "quality"
- order: 22
+ order: 940
  dependencies: [operations]
- outputs: [docs/reviews/review-operations.md]
+ outputs: [docs/reviews/review-operations.md, docs/reviews/operations/review-summary.md, docs/reviews/operations/codex-review.json, docs/reviews/operations/gemini-review.json]
  conditional: null
  knowledge-base: [review-methodology, review-operations]
  ---
@@ -14,6 +14,9 @@ Review operations runbook targeting operations-specific failure modes: deploymen
  strategy gaps, missing rollback procedures, monitoring blind spots, unjustified
  alerting thresholds, missing runbook scenarios, and DR coverage gaps.
 
+ At depth 4+, dispatches to external AI models (Codex, Gemini) for
+ independent review validation.
+
  ## Inputs
  - docs/operations-runbook.md (required) — runbook to review
  - docs/system-architecture.md (required) — for deployment coverage
@@ -21,6 +24,9 @@ alerting thresholds, missing runbook scenarios, and DR coverage gaps.
  ## Expected Outputs
  - docs/reviews/review-operations.md — findings and resolution log
  - docs/operations-runbook.md — updated with fixes
+ - docs/reviews/operations/review-summary.md (depth 4+) — multi-model review synthesis
+ - docs/reviews/operations/codex-review.json (depth 4+, if available) — raw Codex findings
+ - docs/reviews/operations/gemini-review.json (depth 4+, if available) — raw Gemini findings
 
  ## Quality Criteria
  - Deployment lifecycle fully documented (deploy, verify, rollback)
@@ -28,10 +34,16 @@ alerting thresholds, missing runbook scenarios, and DR coverage gaps.
  - Alert thresholds have rationale
  - Common failure scenarios have runbook entries
  - Dev environment parity assessed
+ - (depth 4+) Multi-model findings synthesized with consensus/disagreement analysis
 
  ## Methodology Scaling
- - **deep**: Full multi-pass review. **mvp**: Deployment coverage only.
- - **custom:depth(1-5)**: Scale passes with depth.
+ - **deep**: Full multi-pass review. Multi-model review dispatched to Codex and
+ Gemini if available, with graceful fallback to Claude-only enhanced review.
+ **mvp**: Deployment coverage only.
+ - **custom:depth(1-5)**: Depth 1-3: scale passes with depth. Depth 4: full
+ review + one external model (if CLI available). Depth 5: full review +
+ multi-model with reconciliation.
 
  ## Mode Detection
- Re-review mode if previous review exists.
+ Re-review mode if previous review exists. If multi-model review artifacts exist
+ under docs/reviews/operations/, preserve prior findings still valid.
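
The depth-gated dispatch rule introduced in the review steps (no external models below depth 4, one at depth 4, all available at depth 5, with a Claude-only fallback) could be expressed roughly as follows. This is a sketch under stated assumptions: the function names are hypothetical, the executable names `codex` and `gemini` are guesses at what the CLIs are called on `PATH`, and treating "reconciliation" as applying whenever at least one external model ran at depth 5 is an interpretation of the text above.

```python
import shutil

# Assumed executable names; the diff names the models, not their binaries.
EXTERNAL_MODELS = ("codex", "gemini")


def detect_clis() -> list[str]:
    """List external model CLIs found on PATH (names are an assumption)."""
    return [name for name in EXTERNAL_MODELS if shutil.which(name)]


def review_plan(depth: int, available_clis: list[str]) -> dict:
    """Decide which external models to dispatch to at a given depth."""
    candidates = [m for m in EXTERNAL_MODELS if m in available_clis]
    if depth <= 3:
        models = []               # passes scale with depth, no dispatch
    elif depth == 4:
        models = candidates[:1]   # one external model, if any CLI is present
    else:
        models = candidates       # depth 5: every available model
    return {
        "external_models": models,
        "reconcile": depth >= 5 and bool(models),
        "claude_only_fallback": depth >= 4 and not models,
    }
```

A caller would typically run `review_plan(depth, detect_clis())`; keeping detection separate from planning lets the plan be tested without any CLI installed.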
@@ -2,9 +2,9 @@
  name: review-security
  description: Review security review for coverage and correctness
  phase: "quality"
- order: 24
+ order: 960
  dependencies: [security]
- outputs: [docs/reviews/review-security.md]
+ outputs: [docs/reviews/review-security.md, docs/reviews/security/review-summary.md, docs/reviews/security/codex-review.json, docs/reviews/security/gemini-review.json]
  conditional: null
  knowledge-base: [review-methodology, review-security]
  ---
@@ -15,6 +15,9 @@ gaps, auth/authz boundary mismatches with API contracts, secrets management gaps
  insufficient dependency audit coverage, missing threat model scenarios, and data
  classification gaps.
 
+ At depth 4+, dispatches to external AI models (Codex, Gemini) for
+ independent review validation.
+
  ## Inputs
  - docs/security-review.md (required) — security review document
  - docs/api-contracts.md (optional) — for auth boundary alignment
23
26
  ## Expected Outputs
24
27
  - docs/reviews/review-security.md — findings and resolution log
25
28
  - docs/security-review.md — updated with fixes
29
+ - docs/reviews/security/review-summary.md (depth 4+) — multi-model review synthesis
30
+ - docs/reviews/security/codex-review.json (depth 4+, if available) — raw Codex findings
31
+ - docs/reviews/security/gemini-review.json (depth 4+, if available) — raw Gemini findings
26
32
 
27
33
  ## Quality Criteria
28
34
  - OWASP coverage verified for this project
@@ -31,10 +37,16 @@ classification gaps.
31
37
  - Dependency audit scope covers all dependencies
32
38
  - Threat model covers all trust boundaries
33
39
  - Data classification is complete
40
+ - (depth 4+) Multi-model findings synthesized with consensus/disagreement analysis
34
41
 
35
42
  ## Methodology Scaling
36
- - **deep**: Full multi-pass review. **mvp**: OWASP coverage check only.
37
- - **custom:depth(1-5)**: Scale passes with depth.
43
+ - **deep**: Full multi-pass review. Multi-model review dispatched to Codex and
44
+ Gemini if available, with graceful fallback to Claude-only enhanced review.
45
+ **mvp**: OWASP coverage check only.
46
+ - **custom:depth(1-5)**: Depth 1-3: scale passes with depth. Depth 4: full
47
+ review + one external model (if CLI available). Depth 5: full review +
48
+ multi-model with reconciliation.
38
49
 
39
50
  ## Mode Detection
40
- Re-review mode if previous review exists.
51
+ Re-review mode if previous review exists. If multi-model review artifacts exist
52
+ under docs/reviews/security/, preserve prior findings still valid.
@@ -2,9 +2,9 @@
  name: review-testing
  description: Review testing strategy for coverage gaps and feasibility
  phase: "quality"
- order: 20
- dependencies: [testing-strategy]
- outputs: [docs/reviews/review-testing.md]
+ order: 910
+ dependencies: [tdd]
+ outputs: [docs/reviews/review-testing.md, docs/reviews/testing/review-summary.md, docs/reviews/testing/codex-review.json, docs/reviews/testing/gemini-review.json]
  conditional: null
  knowledge-base: [review-methodology, review-testing-strategy]
  ---
@@ -14,14 +14,20 @@ Review testing strategy targeting testing-specific failure modes: coverage gaps
  by layer, missing edge cases from domain invariants, unrealistic test environment
  assumptions, inadequate performance test coverage, and missing integration boundaries.
 
+ At depth 4+, dispatches to external AI models (Codex, Gemini) for
+ independent review validation.
+
  ## Inputs
- - docs/testing-strategy.md (required) — strategy to review
+ - docs/tdd-standards.md (required) — strategy to review
  - docs/domain-models/ (required) — for invariant test case coverage
  - docs/system-architecture.md (required) — for layer coverage
 
  ## Expected Outputs
  - docs/reviews/review-testing.md — findings and resolution log
- - docs/testing-strategy.md — updated with fixes
+ - docs/tdd-standards.md — updated with fixes
+ - docs/reviews/testing/review-summary.md (depth 4+) — multi-model review synthesis
+ - docs/reviews/testing/codex-review.json (depth 4+, if available) — raw Codex findings
+ - docs/reviews/testing/gemini-review.json (depth 4+, if available) — raw Gemini findings
 
  ## Quality Criteria
  - Coverage gaps by layer identified
@@ -29,11 +35,17 @@ assumptions, inadequate performance test coverage, and missing integration bound
  - Test environment assumptions validated
  - Performance test coverage assessed against NFRs
  - Integration boundaries have integration tests defined
+ - (depth 4+) Multi-model findings synthesized with consensus/disagreement analysis
 
  ## Methodology Scaling
- - **deep**: Full multi-pass review targeting all testing failure modes.
+ - **deep**: Full multi-pass review targeting all testing failure modes. Multi-model
+ review dispatched to Codex and Gemini if available, with graceful fallback
+ to Claude-only enhanced review.
  - **mvp**: Coverage gap check only.
- - **custom:depth(1-5)**: Scale passes with depth.
+ - **custom:depth(1-5)**: Depth 1-3: scale passes with depth. Depth 4: full
+ review + one external model (if CLI available). Depth 5: full review +
+ multi-model with reconciliation.
 
  ## Mode Detection
- Re-review mode if previous review exists.
+ Re-review mode if previous review exists. If multi-model review artifacts exist
+ under docs/reviews/testing/, preserve prior findings still valid.
@@ -2,11 +2,11 @@
  name: security
  description: Security review and documentation
  phase: "quality"
- order: 23
+ order: 950
  dependencies: [review-operations]
  outputs: [docs/security-review.md]
  conditional: null
- knowledge-base: [security-review]
+ knowledge-base: [security-best-practices]
  ---
 
  ## Purpose
@@ -25,10 +25,16 @@ management, and dependency audit strategy.
 
  ## Quality Criteria
  - OWASP top 10 addressed for this specific project
+ - Every API endpoint has authentication and authorization requirements specified
  - Auth/authz boundaries defined and consistent with API contracts
+ - Input validation rules defined for each user-facing field (type, length, pattern)
  - Data classified by sensitivity with handling requirements
+ - Secrets management approach documented (no hardcoded credentials in code)
  - Secrets management strategy defined (no secrets in code)
+ - CORS policy explicitly configured per origin (not wildcard in production)
+ - Rate limiting defined for public-facing endpoints with specific thresholds
  - Threat model covers all trust boundaries
+ - Dependency audit strategy documented (automated scanning, update cadence)
  - Dependency audit integrated into CI
 
  ## Methodology Scaling
@@ -41,4 +47,20 @@ management, and dependency audit strategy.
  Depth 4-5: full security review.
 
  ## Mode Detection
- Update mode if review exists.
+ Check for docs/security-review.md. If it exists, operate in update mode: read
+ existing security controls and threat model, diff against current system
+ architecture and API contracts. Preserve existing threat model entries, auth
+ decisions, and data classification. Add new threat boundaries for new
+ components. Update auth requirements if API contracts changed.
+
+ ## Update Mode Specifics
+ - **Detect prior artifact**: docs/security-review.md exists
+ - **Preserve**: threat model entries, data classification matrix, auth/authz
+ decisions, secrets management strategy, dependency audit configuration,
+ compliance checklist items
+ - **Triggers for update**: architecture added new components (new attack surface),
+ API contracts changed auth requirements, database schema changed data
+ sensitivity, operations runbook changed deployment security
+ - **Conflict resolution**: if a new component introduces a trust boundary
+ that conflicts with existing auth approach, document both and flag for
+ user decision; never weaken existing security controls without approval
@@ -0,0 +1,73 @@
1
+ ---
2
+ name: story-tests
3
+ description: Generate test skeletons from user story acceptance criteria
4
+ phase: "quality"
5
+ order: 915
6
+ dependencies: [tdd, review-user-stories, review-architecture]
7
+ outputs: [tests/acceptance/, docs/story-tests-map.md]
8
+ reads: [tech-stack, coding-standards, project-structure]
9
+ conditional: null
10
+ knowledge-base: [testing-strategy, user-stories]
11
+ ---
12
+
13
+ ## Purpose
14
+ Generate test skeleton files from user story acceptance criteria, creating a
15
+ direct, traceable link from every AC to a tagged test case. Each story produces
16
+ a test file with one test case per acceptance criterion, tagged with story and
17
+ AC IDs for downstream coverage verification. Test cases are created as
18
+ pending/skipped — developers implement them during TDD execution.
19
+
20
+ ## Inputs
21
+ - docs/user-stories.md (required) — stories with acceptance criteria in GWT format
22
+ - docs/tdd-standards.md (required) — test framework, patterns, layer conventions
23
+ - docs/tech-stack.md (required) — language, test runner, assertion library
24
+ - docs/coding-standards.md (required) — test naming conventions
25
+ - docs/system-architecture.md (required) — component structure for layer assignment
26
+ - docs/project-structure.md (required) — test file location conventions
27
+ - docs/api-contracts.md (optional) — endpoint details for API test skeletons
28
+ - docs/database-schema.md (optional) — data layer context for integration tests
29
+ - docs/ux-spec.md (optional) — UI component context for component tests
30
+
31
+ ## Expected Outputs
32
+ - tests/acceptance/{story-id}-{slug}.test.* — one test file per story with
+ tagged pending test cases per AC
+ - docs/story-tests-map.md — traceability matrix mapping stories → test files,
+ ACs → test cases, and layer assignments (unit/integration/e2e)
+
+ ## Quality Criteria
+ - Every user story in docs/user-stories.md has a corresponding test file
+ - Every acceptance criterion has at least one tagged test case
+ - Test cases are tagged with story ID and AC ID for traceability
+ - Test layer (unit/integration/e2e) is assigned based on AC type and architecture
+ - Test files use the project's test framework from docs/tech-stack.md
+ - All test cases are created as pending/skipped (not implemented)
+ - docs/story-tests-map.md shows 100% AC-to-test-case coverage
+ - Test file location follows conventions from docs/project-structure.md
+
+ ## Methodology Scaling
+ - **deep**: All stories get test files. Negative test cases for every happy path
+ AC. Boundary condition tests. Layer-specific skeletons (unit + integration +
+ e2e where applicable). Traceability matrix with confidence analysis.
+ - **mvp**: Test files for Must-have stories only. One test case per AC. No
+ layer splitting — all tests in acceptance/ directory.
+ - **custom:depth(1-5)**: Depth 1: Must-have stories only. Depth 2: add
+ Should-have. Depth 3: add negative cases. Depth 4: add boundary conditions
+ and layer splitting. Depth 5: full suite with all stories and edge cases.
+
+ ## Mode Detection
+ Update mode if tests/acceptance/ directory exists. In update mode: add test
+ files for new stories, add test cases for new ACs in existing stories, never
+ delete user-implemented test logic (only add new pending cases). Update
+ docs/story-tests-map.md with new mappings.
+
+ ## Update Mode Specifics
+ - **Detect prior artifact**: tests/acceptance/ directory exists with test files
+ - **Preserve**: all user-implemented test logic, existing test file names and
+ structure, story ID and AC ID tags, traceability mappings in
+ docs/story-tests-map.md
+ - **Triggers for update**: user stories added or changed acceptance criteria,
+ architecture changed component structure (layer assignments may shift),
+ tdd-standards.md changed test patterns or framework
+ - **Conflict resolution**: if a story's AC was reworded, update the test case
+ description but preserve any implemented test body; if layer assignment
+ changed, move the test case to the correct layer file
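Reviewer note: the update-mode rules added in the hunk above (keep implemented test bodies, refresh the description of a reworded AC, add pending cases for new ACs) can be read as a small merge. The sketch below is only an illustration of that reading. The `AcceptanceCriterion` and `TestCase` shapes and the `mergeTestCases` function are hypothetical and do not appear anywhere in the package.

```typescript
// Hypothetical shapes: the diff does not define a data model for test cases.
interface AcceptanceCriterion {
  storyId: string;
  acId: string;
  description: string;
}

interface TestCase {
  storyId: string;
  acId: string;
  description: string;
  // null means the case is still pending/skipped; a string holds user-written logic.
  body: string | null;
}

// Merge existing test cases with the current acceptance criteria:
// - existing cases are never dropped and their bodies are never modified
// - a reworded AC only changes the description of its test case
// - an AC without a test case gets a new pending case
function mergeTestCases(
  existing: TestCase[],
  criteria: AcceptanceCriterion[],
): TestCase[] {
  const key = (storyId: string, acId: string) => `${storyId}/${acId}`;

  const current = new Map<string, AcceptanceCriterion>();
  for (const ac of criteria) current.set(key(ac.storyId, ac.acId), ac);

  const seen = new Set<string>();
  const merged: TestCase[] = existing.map((tc) => {
    const k = key(tc.storyId, tc.acId);
    seen.add(k);
    const ac = current.get(k);
    return ac ? { ...tc, description: ac.description } : tc;
  });

  for (const ac of criteria) {
    if (!seen.has(key(ac.storyId, ac.acId))) {
      merged.push({ ...ac, body: null });
    }
  }
  return merged;
}
```

Matching on the story ID and AC ID pair, rather than on the description text, is what lets a reworded AC keep its implemented body. The layer-move rule is left out of this sketch.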
@@ -2,7 +2,7 @@
  name: api-contracts
  description: Specify API contracts for all system interfaces
  phase: "specification"
- order: 15
+ order: 830
  dependencies: [review-architecture]
  outputs: [docs/api-contracts.md]
  conditional: "if-needed"
@@ -41,4 +41,19 @@ response shapes, error codes, authentication requirements, and rate limits.
  error contracts. Depth 4-5: full OpenAPI-style spec.
 
  ## Mode Detection
- Update mode if contracts exist. Diff against architecture changes.
+ Check for docs/api-contracts.md. If it exists, operate in update mode: read
+ existing endpoint definitions and diff against current system architecture and
+ domain models. Preserve existing endpoint paths, request/response schemas, and
+ error contracts. Add new endpoints for new features or domain operations.
+ Update error contracts if domain model changed validation rules. Never remove
+ or rename existing endpoints without explicit user approval.
+
+ ## Update Mode Specifics
+ - **Detect prior artifact**: docs/api-contracts.md exists
+ - **Preserve**: existing endpoint paths, HTTP methods, request/response schemas,
+ error codes, auth requirements, pagination patterns, versioning strategy
+ - **Triggers for update**: architecture changed component boundaries, domain
+ models added new operations, ADRs changed API style or auth approach
+ - **Conflict resolution**: if architecture moved an operation to a different
+ component, update the endpoint's component ownership but preserve its contract;
+ flag breaking schema changes for user review
@@ -2,7 +2,7 @@
  name: database-schema
  description: Design database schema from domain models
  phase: "specification"
- order: 13
+ order: 810
  dependencies: [review-architecture]
  outputs: [docs/database-schema.md]
  conditional: "if-needed"
@@ -38,4 +38,19 @@ relationships, indexes, constraints, and migration strategy.
  constraints. Depth 4-5: full specification with migrations.
 
  ## Mode Detection
- Update mode if schema exists. Diff against current domain models.
+ Check for docs/database-schema.md. If it exists, operate in update mode: read
+ existing schema and diff against current domain models in docs/domain-models/.
+ Preserve existing table definitions, relationships, constraints, and migration
+ history. Add new entities from updated domain models. Update indexes for new
+ query patterns identified in architecture data flows. Never drop existing
+ tables without explicit user approval.
+
+ ## Update Mode Specifics
+ - **Detect prior artifact**: docs/database-schema.md exists
+ - **Preserve**: existing table/collection definitions, relationships, constraints,
+ migration history, index justifications, seed data strategy
+ - **Triggers for update**: domain models changed (new entities or relationships),
+ ADRs changed database technology, architecture introduced new query patterns
+ - **Conflict resolution**: if domain model renamed an entity, create a migration
+ that renames rather than drops and recreates; flag breaking changes for user
+ review
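Reviewer note: the conflict-resolution rule above (rename instead of drop and recreate, and never drop without approval) could be sketched roughly as follows. This is an illustration under assumptions, not the package's implementation: `SchemaUpdatePlan` and `planSchemaUpdate` are hypothetical names, and the sketch assumes a SQL database that accepts `ALTER TABLE ... RENAME TO` (PostgreSQL, SQLite, and MySQL do).

```typescript
// Hypothetical plan type: the diff states the rules but defines no data structure.
interface SchemaUpdatePlan {
  renameStatements: string[]; // in-place renames that keep existing rows
  additions: string[];        // new entities that still need a table definition
  needsApproval: string[];    // flagged for user review, never applied automatically
}

// oldTables / newTables: table names before and after the domain model change.
// renames: old name -> new name for entities the domain model renamed.
function planSchemaUpdate(
  oldTables: string[],
  newTables: string[],
  renames: Map<string, string>,
): SchemaUpdatePlan {
  const plan: SchemaUpdatePlan = { renameStatements: [], additions: [], needsApproval: [] };
  const target = new Set(newTables);
  const renamedTargets = new Set(renames.values());

  for (const table of oldTables) {
    const renamedTo = renames.get(table);
    if (renamedTo !== undefined && target.has(renamedTo)) {
      // Rename rather than drop and recreate, so data survives the migration.
      plan.renameStatements.push(`ALTER TABLE ${table} RENAME TO ${renamedTo};`);
    } else if (!target.has(table)) {
      // The table disappeared from the model: flag it, do not drop it.
      plan.needsApproval.push(`Table ${table} is no longer in the domain model`);
    }
  }

  const existing = new Set(oldTables);
  for (const table of newTables) {
    if (!existing.has(table) && !renamedTargets.has(table)) {
      plan.additions.push(table);
    }
  }
  return plan;
}
```

For example, `planSchemaUpdate(["users"], ["accounts"], new Map([["users", "accounts"]]))` yields one rename statement and nothing to approve, whereas the same call with an empty rename map flags `users` for approval and lists `accounts` as an addition.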
@@ -2,11 +2,11 @@
  name: review-api
  description: Review API contracts for completeness and consistency
  phase: "specification"
- order: 16
+ order: 840
  dependencies: [api-contracts]
- outputs: [docs/reviews/review-api.md]
+ outputs: [docs/reviews/review-api.md, docs/reviews/api/review-summary.md, docs/reviews/api/codex-review.json, docs/reviews/api/gemini-review.json]
  conditional: "if-needed"
- knowledge-base: [review-methodology, review-api-contracts]
+ knowledge-base: [review-methodology, review-api-design]
  ---
 
  ## Purpose
@@ -14,6 +14,9 @@ Review API contracts targeting API-specific failure modes: operation coverage
  gaps, error contract incompleteness, auth/authz gaps, versioning inconsistencies,
  payload shape mismatches with domain entities, and idempotency gaps.
 
+ At depth 4+, dispatches to external AI models (Codex, Gemini) for
+ independent review validation.
+
  ## Inputs
  - docs/api-contracts.md (required) — contracts to review
  - docs/domain-models/ (required) — for operation coverage
@@ -23,6 +26,9 @@ payload shape mismatches with domain entities, and idempotency gaps.
  ## Expected Outputs
  - docs/reviews/review-api.md — findings and resolution log
  - docs/api-contracts.md — updated with fixes
+ - docs/reviews/api/review-summary.md (depth 4+) — multi-model review synthesis
+ - docs/reviews/api/codex-review.json (depth 4+, if available) — raw Codex findings
+ - docs/reviews/api/gemini-review.json (depth 4+, if available) — raw Gemini findings
 
  ## Quality Criteria
  - Operation coverage against domain model verified
@@ -30,11 +36,17 @@ payload shape mismatches with domain entities, and idempotency gaps.
  - Auth requirements specified for every endpoint
  - Versioning strategy consistent with ADRs
  - Idempotency documented for all mutating operations
+ - (depth 4+) Multi-model findings synthesized with consensus/disagreement analysis
 
  ## Methodology Scaling
- - **deep**: Full multi-pass review targeting all API failure modes.
+ - **deep**: Full multi-pass review targeting all API failure modes. Multi-model
+ review dispatched to Codex and Gemini if available, with graceful fallback
+ to Claude-only enhanced review.
  - **mvp**: Operation coverage check only.
- - **custom:depth(1-5)**: Scale passes with depth.
+ - **custom:depth(1-5)**: Depth 1-3: scale passes with depth. Depth 4: full
+ review + one external model (if CLI available). Depth 5: full review +
+ multi-model with reconciliation.
 
  ## Mode Detection
- Re-review mode if previous review exists.
+ Re-review mode if previous review exists. If multi-model review artifacts exist
+ under docs/reviews/api/, preserve prior findings still valid.
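Reviewer note: the depth scaling added above (depths 1-3 scale passes, depth 4 adds one external model if its CLI is available, depth 5 adds multi-model review with reconciliation, with a Claude-only fallback) might be sketched as a small planning function. Everything named here is hypothetical: `ReviewPlan`, `planReview`, and the choice to take the first available CLI at depth 4 are assumptions made for illustration, not behaviour documented in the diff.

```typescript
type ExternalModel = "codex" | "gemini";

// Hypothetical plan shape; the diff describes the behaviour in prose only.
interface ReviewPlan {
  passes: "scaled" | "full";
  externalModels: ExternalModel[];
  reconcile: boolean;          // multi-model reconciliation (depth 5)
  claudeOnlyFallback: boolean; // depth 4+ with no external CLI available
}

function planReview(depth: number, availableClis: ExternalModel[]): ReviewPlan {
  if (!Number.isInteger(depth) || depth < 1 || depth > 5) {
    throw new RangeError(`depth must be an integer from 1 to 5, got ${depth}`);
  }
  if (depth <= 3) {
    // Depth 1-3: scale passes with depth, no external dispatch.
    return { passes: "scaled", externalModels: [], reconcile: false, claudeOnlyFallback: false };
  }
  // Depth 4: at most one external model (assumed: the first available one).
  // Depth 5: every available external model.
  const externalModels = depth === 4 ? availableClis.slice(0, 1) : [...availableClis];
  return {
    passes: "full",
    externalModels,
    // Assumption: reconciliation only makes sense when an external model ran.
    reconcile: depth === 5 && externalModels.length > 0,
    claudeOnlyFallback: externalModels.length === 0,
  };
}
```

The same mapping would apply to the review-database step, which receives identical scaling text in the next file.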
@@ -2,11 +2,11 @@
  name: review-database
  description: Review database schema for correctness and completeness
  phase: "specification"
- order: 14
+ order: 820
  dependencies: [database-schema]
- outputs: [docs/reviews/review-database.md]
+ outputs: [docs/reviews/review-database.md, docs/reviews/database/review-summary.md, docs/reviews/database/codex-review.json, docs/reviews/database/gemini-review.json]
  conditional: "if-needed"
- knowledge-base: [review-methodology, review-database-schema]
+ knowledge-base: [review-methodology, review-database-design]
  ---
 
  ## Purpose
@@ -14,6 +14,9 @@ Review database schema targeting schema-specific failure modes: entity coverage
  gaps, normalization trade-off issues, missing indexes, migration safety, and
  referential integrity vs. domain invariants.
 
+ At depth 4+, dispatches to external AI models (Codex, Gemini) for
+ independent review validation.
+
  ## Inputs
  - docs/database-schema.md (required) — schema to review
  - docs/domain-models/ (required) — for entity coverage
@@ -22,6 +25,9 @@ referential integrity vs. domain invariants.
  ## Expected Outputs
  - docs/reviews/review-database.md — findings and resolution log
  - docs/database-schema.md — updated with fixes
+ - docs/reviews/database/review-summary.md (depth 4+) — multi-model review synthesis
+ - docs/reviews/database/codex-review.json (depth 4+, if available) — raw Codex findings
+ - docs/reviews/database/gemini-review.json (depth 4+, if available) — raw Gemini findings
 
  ## Quality Criteria
  - Entity coverage verified
@@ -29,11 +35,17 @@ referential integrity vs. domain invariants.
  - Index coverage for known query patterns verified
  - Migration safety assessed
  - Referential integrity matches domain invariants
+ - (depth 4+) Multi-model findings synthesized with consensus/disagreement analysis
 
  ## Methodology Scaling
- - **deep**: Full multi-pass review targeting all schema failure modes.
+ - **deep**: Full multi-pass review targeting all schema failure modes. Multi-model
+ review dispatched to Codex and Gemini if available, with graceful fallback
+ to Claude-only enhanced review.
  - **mvp**: Entity coverage check only.
- - **custom:depth(1-5)**: Scale passes with depth.
+ - **custom:depth(1-5)**: Depth 1-3: scale passes with depth. Depth 4: full
+ review + one external model (if CLI available). Depth 5: full review +
+ multi-model with reconciliation.
 
  ## Mode Detection
- Re-review mode if previous review exists.
+ Re-review mode if previous review exists. If multi-model review artifacts exist
+ under docs/reviews/database/, preserve prior findings still valid.