npm - @zigrivers/scaffold - Versions diffs - 2.1.1 → 2.28.1 - Mend

@zigrivers/scaffold 2.1.1 → 2.28.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (100) hide show

package/README.md +272 -59
package/dist/project/frontmatter.d.ts.map +1 -1
package/dist/project/frontmatter.js +4 -0
package/dist/project/frontmatter.js.map +1 -1
package/knowledge/core/adr-craft.md +53 -0
package/knowledge/core/ai-memory-management.md +246 -0
package/knowledge/core/api-design.md +4 -0
package/knowledge/core/claude-md-patterns.md +254 -0
package/knowledge/core/coding-conventions.md +246 -0
package/knowledge/core/database-design.md +4 -0
package/knowledge/core/design-system-tokens.md +465 -0
package/knowledge/core/dev-environment.md +223 -0
package/knowledge/core/domain-modeling.md +4 -0
package/knowledge/core/eval-craft.md +1008 -0
package/knowledge/core/multi-model-review-dispatch.md +250 -0
package/knowledge/core/operations-runbook.md +37 -226
package/knowledge/core/project-structure-patterns.md +231 -0
package/knowledge/core/review-step-template.md +247 -0
package/knowledge/core/{security-review.md → security-best-practices.md} +5 -1
package/knowledge/core/task-decomposition.md +57 -34
package/knowledge/core/task-tracking.md +225 -0
package/knowledge/core/tech-stack-selection.md +214 -0
package/knowledge/core/testing-strategy.md +63 -70
package/knowledge/core/user-stories.md +69 -60
package/knowledge/core/user-story-innovation.md +57 -0
package/knowledge/core/ux-specification.md +5 -148
package/knowledge/finalization/apply-fixes-and-freeze.md +165 -14
package/knowledge/product/prd-craft.md +55 -34
package/knowledge/review/review-adr.md +32 -0
package/knowledge/review/{review-api-contracts.md → review-api-design.md} +34 -1
package/knowledge/review/{review-database-schema.md → review-database-design.md} +27 -1
package/knowledge/review/review-domain-modeling.md +33 -0
package/knowledge/review/review-implementation-tasks.md +50 -0
package/knowledge/review/review-operations.md +55 -0
package/knowledge/review/review-prd.md +33 -0
package/knowledge/review/review-security.md +53 -0
package/knowledge/review/review-system-architecture.md +28 -0
package/knowledge/review/review-testing-strategy.md +51 -0
package/knowledge/review/review-user-stories.md +54 -0
package/knowledge/review/{review-ux-spec.md → review-ux-specification.md} +37 -1
package/methodology/custom-defaults.yml +32 -3
package/methodology/deep.yml +32 -3
package/methodology/mvp.yml +32 -3
package/package.json +2 -1
package/pipeline/architecture/review-architecture.md +18 -6
package/pipeline/architecture/system-architecture.md +14 -2
package/pipeline/consolidation/claude-md-optimization.md +73 -0
package/pipeline/consolidation/workflow-audit.md +73 -0
package/pipeline/decisions/adrs.md +14 -2
package/pipeline/decisions/review-adrs.md +18 -5
package/pipeline/environment/ai-memory-setup.md +70 -0
package/pipeline/environment/automated-pr-review.md +70 -0
package/pipeline/environment/design-system.md +73 -0
package/pipeline/environment/dev-env-setup.md +65 -0
package/pipeline/environment/git-workflow.md +71 -0
package/pipeline/finalization/apply-fixes-and-freeze.md +1 -1
package/pipeline/finalization/developer-onboarding-guide.md +1 -1
package/pipeline/finalization/implementation-playbook.md +3 -3
package/pipeline/foundation/beads.md +68 -0
package/pipeline/foundation/coding-standards.md +68 -0
package/pipeline/foundation/project-structure.md +69 -0
package/pipeline/foundation/tdd.md +60 -0
package/pipeline/foundation/tech-stack.md +74 -0
package/pipeline/integration/add-e2e-testing.md +65 -0
package/pipeline/modeling/domain-modeling.md +14 -2
package/pipeline/modeling/review-domain-modeling.md +18 -5
package/pipeline/parity/platform-parity-review.md +70 -0
package/pipeline/planning/implementation-plan-review.md +56 -0
package/pipeline/planning/{implementation-tasks.md → implementation-plan.md} +29 -9
package/pipeline/pre/create-prd.md +13 -4
package/pipeline/pre/innovate-prd.md +37 -8
package/pipeline/pre/innovate-user-stories.md +38 -7
package/pipeline/pre/review-prd.md +18 -6
package/pipeline/pre/review-user-stories.md +23 -6
package/pipeline/pre/user-stories.md +12 -2
package/pipeline/quality/create-evals.md +102 -0
package/pipeline/quality/operations.md +38 -13
package/pipeline/quality/review-operations.md +17 -5
package/pipeline/quality/review-security.md +17 -5
package/pipeline/quality/review-testing.md +20 -8
package/pipeline/quality/security.md +25 -3
package/pipeline/quality/story-tests.md +73 -0
package/pipeline/specification/api-contracts.md +17 -2
package/pipeline/specification/database-schema.md +17 -2
package/pipeline/specification/review-api.md +18 -6
package/pipeline/specification/review-database.md +18 -6
package/pipeline/specification/review-ux.md +19 -7
package/pipeline/specification/ux-spec.md +29 -10
package/pipeline/validation/critical-path-walkthrough.md +34 -7
package/pipeline/validation/cross-phase-consistency.md +34 -7
package/pipeline/validation/decision-completeness.md +34 -7
package/pipeline/validation/dependency-graph-validation.md +34 -7
package/pipeline/validation/implementability-dry-run.md +34 -7
package/pipeline/validation/scope-creep-check.md +34 -7
package/pipeline/validation/traceability-matrix.md +34 -7
package/skills/multi-model-dispatch/SKILL.md +326 -0
package/skills/scaffold-pipeline/SKILL.md +195 -0
package/skills/scaffold-runner/SKILL.md +465 -0
package/pipeline/planning/review-tasks.md +0 -38
package/pipeline/quality/testing-strategy.md +0 -42

package/pipeline/quality/create-evals.md ADDED Viewed

@@ -0,0 +1,102 @@
+---
+name: create-evals
+description: Generate project-specific eval checks from standards documentation
+phase: "quality"
+order: 920
+dependencies: [tdd, story-tests]
+outputs: [tests/evals/, docs/eval-standards.md]
+reads: [story-tests]
+conditional: null
+knowledge-base: [eval-craft, testing-strategy]
+---
+## Purpose
+Generate automated eval checks that verify AI-generated code meets the project's
+own documented standards. Evals are test files in the project's own test framework
+— not a separate tool. They check up to 13 categories: 5 core (always generated)
+and 8 conditional (generated when their source document exists). Core: consistency,
+structure, adherence, coverage, cross-doc. Conditional: architecture conformance,
+API contract, security patterns, database schema, accessibility, performance budget,
+configuration validation, error handling completeness.
+## Inputs
+- docs/tech-stack.md (required) — determines test framework and stack-specific patterns
+- docs/coding-standards.md (required) — adherence and error handling patterns
+- docs/tdd-standards.md (required) — test co-location rules, mocking strategy
+- docs/project-structure.md (required) — file placement rules for structure evals
+- CLAUDE.md (required) — Key Commands table for consistency evals
+- Makefile or package.json (required) — build targets to match against
+- tests/acceptance/ (optional) — story test skeletons for coverage validation
+- docs/user-stories.md (optional) — acceptance criteria for coverage evals
+- docs/plan.md (optional) — feature list for coverage evals, performance NFRs
+- docs/system-architecture.md (optional) — architecture conformance evals
+- docs/api-contracts.md (optional) — API contract validation evals
+- docs/security-review.md (optional) — security pattern verification evals
+- docs/database-schema.md (optional) — database schema conformance evals
+- docs/ux-spec.md (optional) — accessibility compliance evals
+- docs/dev-setup.md (optional) — configuration validation evals
+## Expected Outputs
+Core (always generated):
+- tests/evals/consistency.test.* — command matching, format checking, cross-doc refs
+- tests/evals/structure.test.* — file placement, shared code rules, test co-location
+- tests/evals/adherence.test.* — coding convention patterns, mock rules, TODO format
+- tests/evals/coverage.test.* — feature-to-code mapping, AC-to-test mapping
+- tests/evals/cross-doc.test.* — tech stack consistency, path consistency, terminology
+Conditional (generated when source doc exists):
+- tests/evals/architecture.test.* — layer direction, module boundaries, circular deps
+- tests/evals/api-contract.test.* — endpoint existence, methods, error codes
+- tests/evals/security.test.* — auth middleware, secrets, input validation, SQL injection
+- tests/evals/database.test.* — migration coverage, columns, indexes, relationships
+- tests/evals/accessibility.test.* — ARIA, alt text, focus styles, contrast
+- tests/evals/performance.test.* — budget files, bundle tracking, perf test existence
+- tests/evals/config.test.* — env var docs, dead config, startup validation
+- tests/evals/error-handling.test.* — bare catches, error responses tested, custom errors
+Supporting:
+- tests/evals/helpers.* — shared utilities
+- docs/eval-standards.md — documents what is and isn't checked
+- make eval target added to Makefile/package.json
+## Quality Criteria
+- (mvp) Consistency + Structure evals generated
+- (mvp) Evals use the project's own test framework from docs/tech-stack.md
+- (mvp) All generated evals pass on the current codebase (no false positives)
+- (mvp) Eval results are binary PASS/FAIL, not scores
+- (mvp) make eval is separate from make test and make check (opt-in for CI)
+- (deep) All applicable eval categories generated including security, API, DB, accessibility (conditional on source doc existence)
+- (deep) Adherence, security, and error-handling evals include exclusion mechanisms
+- (deep) docs/eval-standards.md explicitly documents what evals do NOT check
+- (deep) Full eval suite runs in under 30 seconds
+## Methodology Scaling
+- **deep**: All 13 eval categories (conditional on doc existence). Stack-specific
+  patterns. Coverage with keyword extraction. Cross-doc consistency. Architecture
+  conformance. API contract validation. Security patterns. Full suite.
+- **mvp**: Consistency + Structure only. Skip everything else.
+- **custom:depth(1-5)**:
+  - Depth 1-2: Consistency + Structure
+  - Depth 3: Add Adherence + Cross-doc
+  - Depth 4: Add Coverage + Architecture + Config + Error handling
+  - Depth 5: All 13 categories (Security, API, Database, Accessibility, Performance)
+## Mode Detection
+Update mode if tests/evals/ directory exists. In update mode: regenerate
+consistency, structure, cross-doc, and conditional category evals. Preserve
+adherence, security, and error-handling eval exclusions. Regenerate coverage
+evals only if plan.md or user-stories.md changed. Add/remove conditional
+categories based on whether their source doc exists.
+## Update Mode Specifics
+- **Detect prior artifact**: tests/evals/ directory exists with eval test files
+- **Preserve**: adherence eval exclusions, security eval exclusions,
+  error-handling eval exclusions, custom helper utilities in tests/evals/helpers,
+  make eval target configuration
+- **Triggers for update**: source docs changed (coding-standards, project-structure,
+  tech-stack), new conditional source docs appeared (e.g., security-review.md
+  now exists), Makefile targets changed, user-stories.md changed
+- **Conflict resolution**: if a source doc was removed, archive its conditional
+  eval category rather than deleting; if exclusion patterns conflict with new
+  standards, flag for user review

package/pipeline/quality/operations.md CHANGED Viewed

@@ -1,8 +1,8 @@
 ---
 name: operations
-description: Define operations, deployment, and dev environment strategy
+description: Define deployment pipeline, deployment strategy, monitoring, alerting, and incident response
 phase: "quality"
-order: 21
+order: 930
 dependencies: [review-testing]
 outputs: [docs/operations-runbook.md]
 conditional: null
@@ -10,33 +10,58 @@ knowledge-base: [operations-runbook]
 ---
 ## Purpose
-Define the operational strategy: CI/CD pipeline, deployment approach, monitoring
-and alerting, incident response, rollback procedures, and dev environment setup.
-This is both the production operations guide and the local development workflow.
+Define the production operational strategy: deployment pipeline (extending the
+base CI from git-workflow), deployment approach, monitoring and alerting, incident
+response, and rollback procedures. References docs/dev-setup.md for local
+development setup rather than redefining it.
 ## Inputs
 - docs/system-architecture.md (required) — what to deploy
-- docs/testing-strategy.md (required) — CI pipeline test stages
+- docs/tdd-standards.md (required) — CI pipeline test stages
 - docs/adrs/ (required) — infrastructure decisions
+- docs/dev-setup.md (optional) — local dev setup to reference, not redefine
+- docs/git-workflow.md (optional) — base CI pipeline to extend, not redefine
 ## Expected Outputs
-- docs/operations-runbook.md — operations and deployment runbook
+- docs/operations-runbook.md — production operations and deployment runbook
 ## Quality Criteria
-- CI/CD pipeline defined with all stages (build, test, lint, deploy)
+- Deployment pipeline extends existing CI (build, deploy, post-deploy stages)
+- Deployment pipeline has explicit stages (build → test → deploy → verify → rollback-ready)
+- Does not redefine base CI stages (lint, test) from git-workflow
 - Deployment strategy chosen with rollback procedure
+- Rollback procedure tested with specific trigger conditions (e.g., error rate > X%, health check failure)
+- Runbook structured by operational scenario (deployment, rollback, incident, scaling)
 - Monitoring covers key metrics (latency, error rate, saturation)
+- Each monitoring metric has an explicit threshold with rationale
+- Health check endpoints defined with expected response codes and latency bounds
+- Log aggregation strategy specifies retention period and searchable fields
 - Alerting thresholds are justified, not arbitrary
-- Dev environment setup is documented and reproducible
+- References docs/dev-setup.md for local dev — does not redefine it
 - Incident response process defined
 ## Methodology Scaling
 - **deep**: Full runbook. Deployment topology diagrams. Monitoring dashboard
-  specs. Alert playbooks. DR plan. Capacity planning. Local dev with
-  containers matching production.
-- **mvp**: Basic CI/CD pipeline. Deploy command. How to run locally.
+  specs. Alert playbooks. DR plan. Capacity planning.
+- **mvp**: Deploy command. Basic monitoring. Rollback procedure.
 - **custom:depth(1-5)**: Depth 1-2: MVP-style. Depth 3: add monitoring and
   alerts. Depth 4-5: full runbook with DR.
 ## Mode Detection
-Update mode if runbook exists.
+Check for docs/operations-runbook.md. If it exists, operate in update mode:
+read existing runbook and diff against current system architecture, ADRs, and
+deployment configuration. Preserve existing deployment procedures, monitoring
+thresholds, and incident response processes. Update deployment pipeline stages
+if architecture changed. Never modify rollback procedures without user approval.
+## Update Mode Specifics
+- **Detect prior artifact**: docs/operations-runbook.md exists
+- **Preserve**: deployment procedures, monitoring thresholds, alerting rules,
+  incident response processes, rollback procedures, environment-specific
+  configurations
+- **Triggers for update**: architecture changed deployment topology, new ADRs
+  changed infrastructure, security review identified operational requirements,
+  CI pipeline changed (new stages to extend)
+- **Conflict resolution**: if architecture changed the deployment target,
+  update deployment stages but preserve monitoring and alerting sections;
+  verify runbook does not redefine base CI stages from git-workflow.md

package/pipeline/quality/review-operations.md CHANGED Viewed

@@ -2,9 +2,9 @@
 name: review-operations
 description: Review operations runbook for completeness and safety
 phase: "quality"
-order: 22
+order: 940
 dependencies: [operations]
-outputs: [docs/reviews/review-operations.md]
+outputs: [docs/reviews/review-operations.md, docs/reviews/operations/review-summary.md, docs/reviews/operations/codex-review.json, docs/reviews/operations/gemini-review.json]
 conditional: null
 knowledge-base: [review-methodology, review-operations]
 ---
@@ -14,6 +14,9 @@ Review operations runbook targeting operations-specific failure modes: deploymen
 strategy gaps, missing rollback procedures, monitoring blind spots, unjustified
 alerting thresholds, missing runbook scenarios, and DR coverage gaps.
+At depth 4+, dispatches to external AI models (Codex, Gemini) for
+independent review validation.
 ## Inputs
 - docs/operations-runbook.md (required) — runbook to review
 - docs/system-architecture.md (required) — for deployment coverage
@@ -21,6 +24,9 @@ alerting thresholds, missing runbook scenarios, and DR coverage gaps.
 ## Expected Outputs
 - docs/reviews/review-operations.md — findings and resolution log
 - docs/operations-runbook.md — updated with fixes
+- docs/reviews/operations/review-summary.md (depth 4+) — multi-model review synthesis
+- docs/reviews/operations/codex-review.json (depth 4+, if available) — raw Codex findings
+- docs/reviews/operations/gemini-review.json (depth 4+, if available) — raw Gemini findings
 ## Quality Criteria
 - Deployment lifecycle fully documented (deploy, verify, rollback)
@@ -28,10 +34,16 @@ alerting thresholds, missing runbook scenarios, and DR coverage gaps.
 - Alert thresholds have rationale
 - Common failure scenarios have runbook entries
 - Dev environment parity assessed
+- (depth 4+) Multi-model findings synthesized with consensus/disagreement analysis
 ## Methodology Scaling
-- **deep**: Full multi-pass review. **mvp**: Deployment coverage only.
-- **custom:depth(1-5)**: Scale passes with depth.
+- **deep**: Full multi-pass review. Multi-model review dispatched to Codex and
+  Gemini if available, with graceful fallback to Claude-only enhanced review.
+  **mvp**: Deployment coverage only.
+- **custom:depth(1-5)**: Depth 1-3: scale passes with depth. Depth 4: full
+  review + one external model (if CLI available). Depth 5: full review +
+  multi-model with reconciliation.
 ## Mode Detection
-Re-review mode if previous review exists.
+Re-review mode if previous review exists. If multi-model review artifacts exist
+under docs/reviews/operations/, preserve prior findings still valid.

package/pipeline/quality/review-security.md CHANGED Viewed

@@ -2,9 +2,9 @@
 name: review-security
 description: Review security review for coverage and correctness
 phase: "quality"
-order: 24
+order: 960
 dependencies: [security]
-outputs: [docs/reviews/review-security.md]
+outputs: [docs/reviews/review-security.md, docs/reviews/security/review-summary.md, docs/reviews/security/codex-review.json, docs/reviews/security/gemini-review.json]
 conditional: null
 knowledge-base: [review-methodology, review-security]
 ---
@@ -15,6 +15,9 @@ gaps, auth/authz boundary mismatches with API contracts, secrets management gaps
 insufficient dependency audit coverage, missing threat model scenarios, and data
 classification gaps.
+At depth 4+, dispatches to external AI models (Codex, Gemini) for
+independent review validation.
 ## Inputs
 - docs/security-review.md (required) — security review document
 - docs/api-contracts.md (optional) — for auth boundary alignment
@@ -23,6 +26,9 @@ classification gaps.
 ## Expected Outputs
 - docs/reviews/review-security.md — findings and resolution log
 - docs/security-review.md — updated with fixes
+- docs/reviews/security/review-summary.md (depth 4+) — multi-model review synthesis
+- docs/reviews/security/codex-review.json (depth 4+, if available) — raw Codex findings
+- docs/reviews/security/gemini-review.json (depth 4+, if available) — raw Gemini findings
 ## Quality Criteria
 - OWASP coverage verified for this project
@@ -31,10 +37,16 @@ classification gaps.
 - Dependency audit scope covers all dependencies
 - Threat model covers all trust boundaries
 - Data classification is complete
+- (depth 4+) Multi-model findings synthesized with consensus/disagreement analysis
 ## Methodology Scaling
-- **deep**: Full multi-pass review. **mvp**: OWASP coverage check only.
-- **custom:depth(1-5)**: Scale passes with depth.
+- **deep**: Full multi-pass review. Multi-model review dispatched to Codex and
+  Gemini if available, with graceful fallback to Claude-only enhanced review.
+  **mvp**: OWASP coverage check only.
+- **custom:depth(1-5)**: Depth 1-3: scale passes with depth. Depth 4: full
+  review + one external model (if CLI available). Depth 5: full review +
+  multi-model with reconciliation.
 ## Mode Detection
-Re-review mode if previous review exists.
+Re-review mode if previous review exists. If multi-model review artifacts exist
+under docs/reviews/security/, preserve prior findings still valid.

package/pipeline/quality/review-testing.md CHANGED Viewed

@@ -2,9 +2,9 @@
 name: review-testing
 description: Review testing strategy for coverage gaps and feasibility
 phase: "quality"
-order: 20
-dependencies: [testing-strategy]
-outputs: [docs/reviews/review-testing.md]
+order: 910
+dependencies: [tdd]
+outputs: [docs/reviews/review-testing.md, docs/reviews/testing/review-summary.md, docs/reviews/testing/codex-review.json, docs/reviews/testing/gemini-review.json]
 conditional: null
 knowledge-base: [review-methodology, review-testing-strategy]
 ---
@@ -14,14 +14,20 @@ Review testing strategy targeting testing-specific failure modes: coverage gaps
 by layer, missing edge cases from domain invariants, unrealistic test environment
 assumptions, inadequate performance test coverage, and missing integration boundaries.
+At depth 4+, dispatches to external AI models (Codex, Gemini) for
+independent review validation.
 ## Inputs
-- docs/testing-strategy.md (required) — strategy to review
+- docs/tdd-standards.md (required) — strategy to review
 - docs/domain-models/ (required) — for invariant test case coverage
 - docs/system-architecture.md (required) — for layer coverage
 ## Expected Outputs
 - docs/reviews/review-testing.md — findings and resolution log
-- docs/testing-strategy.md — updated with fixes
+- docs/tdd-standards.md — updated with fixes
+- docs/reviews/testing/review-summary.md (depth 4+) — multi-model review synthesis
+- docs/reviews/testing/codex-review.json (depth 4+, if available) — raw Codex findings
+- docs/reviews/testing/gemini-review.json (depth 4+, if available) — raw Gemini findings
 ## Quality Criteria
 - Coverage gaps by layer identified
@@ -29,11 +35,17 @@ assumptions, inadequate performance test coverage, and missing integration bound
 - Test environment assumptions validated
 - Performance test coverage assessed against NFRs
 - Integration boundaries have integration tests defined
+- (depth 4+) Multi-model findings synthesized with consensus/disagreement analysis
 ## Methodology Scaling
-- **deep**: Full multi-pass review targeting all testing failure modes.
+- **deep**: Full multi-pass review targeting all testing failure modes. Multi-model
+  review dispatched to Codex and Gemini if available, with graceful fallback
+  to Claude-only enhanced review.
 - **mvp**: Coverage gap check only.
-- **custom:depth(1-5)**: Scale passes with depth.
+- **custom:depth(1-5)**: Depth 1-3: scale passes with depth. Depth 4: full
+  review + one external model (if CLI available). Depth 5: full review +
+  multi-model with reconciliation.
 ## Mode Detection
-Re-review mode if previous review exists.
+Re-review mode if previous review exists. If multi-model review artifacts exist
+under docs/reviews/testing/, preserve prior findings still valid.

package/pipeline/quality/security.md CHANGED Viewed

@@ -2,11 +2,11 @@
 name: security
 description: Security review and documentation
 phase: "quality"
-order: 23
+order: 950
 dependencies: [review-operations]
 outputs: [docs/security-review.md]
 conditional: null
-knowledge-base: [security-review]
+knowledge-base: [security-best-practices]
 ---
 ## Purpose
@@ -25,10 +25,16 @@ management, and dependency audit strategy.
 ## Quality Criteria
 - OWASP top 10 addressed for this specific project
+- Every API endpoint has authentication and authorization requirements specified
 - Auth/authz boundaries defined and consistent with API contracts
+- Input validation rules defined for each user-facing field (type, length, pattern)
 - Data classified by sensitivity with handling requirements
+- Secrets management approach documented (no hardcoded credentials in code)
 - Secrets management strategy defined (no secrets in code)
+- CORS policy explicitly configured per origin (not wildcard in production)
+- Rate limiting defined for public-facing endpoints with specific thresholds
 - Threat model covers all trust boundaries
+- Dependency audit strategy documented (automated scanning, update cadence)
 - Dependency audit integrated into CI
 ## Methodology Scaling
@@ -41,4 +47,20 @@ management, and dependency audit strategy.
   Depth 4-5: full security review.
 ## Mode Detection
-Update mode if review exists.
+Check for docs/security-review.md. If it exists, operate in update mode: read
+existing security controls and threat model, diff against current system
+architecture and API contracts. Preserve existing threat model entries, auth
+decisions, and data classification. Add new threat boundaries for new
+components. Update auth requirements if API contracts changed.
+## Update Mode Specifics
+- **Detect prior artifact**: docs/security-review.md exists
+- **Preserve**: threat model entries, data classification matrix, auth/authz
+  decisions, secrets management strategy, dependency audit configuration,
+  compliance checklist items
+- **Triggers for update**: architecture added new components (new attack surface),
+  API contracts changed auth requirements, database schema changed data
+  sensitivity, operations runbook changed deployment security
+- **Conflict resolution**: if a new component introduces a trust boundary
+  that conflicts with existing auth approach, document both and flag for
+  user decision; never weaken existing security controls without approval

package/pipeline/quality/story-tests.md ADDED Viewed

@@ -0,0 +1,73 @@
+---
+name: story-tests
+description: Generate test skeletons from user story acceptance criteria
+phase: "quality"
+order: 915
+dependencies: [tdd, review-user-stories, review-architecture]
+outputs: [tests/acceptance/, docs/story-tests-map.md]
+reads: [tech-stack, coding-standards, project-structure]
+conditional: null
+knowledge-base: [testing-strategy, user-stories]
+---
+## Purpose
+Generate test skeleton files from user story acceptance criteria, creating a
+direct, traceable link from every AC to a tagged test case. Each story produces
+a test file with one test case per acceptance criterion, tagged with story and
+AC IDs for downstream coverage verification. Test cases are created as
+pending/skipped — developers implement them during TDD execution.
+## Inputs
+- docs/user-stories.md (required) — stories with acceptance criteria in GWT format
+- docs/tdd-standards.md (required) — test framework, patterns, layer conventions
+- docs/tech-stack.md (required) — language, test runner, assertion library
+- docs/coding-standards.md (required) — test naming conventions
+- docs/system-architecture.md (required) — component structure for layer assignment
+- docs/project-structure.md (required) — test file location conventions
+- docs/api-contracts.md (optional) — endpoint details for API test skeletons
+- docs/database-schema.md (optional) — data layer context for integration tests
+- docs/ux-spec.md (optional) — UI component context for component tests
+## Expected Outputs
+- tests/acceptance/{story-id}-{slug}.test.* — one test file per story with
+  tagged pending test cases per AC
+- docs/story-tests-map.md — traceability matrix mapping stories → test files,
+  ACs → test cases, and layer assignments (unit/integration/e2e)
+## Quality Criteria
+- Every user story in docs/user-stories.md has a corresponding test file
+- Every acceptance criterion has at least one tagged test case
+- Test cases are tagged with story ID and AC ID for traceability
+- Test layer (unit/integration/e2e) is assigned based on AC type and architecture
+- Test files use the project's test framework from docs/tech-stack.md
+- All test cases are created as pending/skipped (not implemented)
+- docs/story-tests-map.md shows 100% AC-to-test-case coverage
+- Test file location follows conventions from docs/project-structure.md
+## Methodology Scaling
+- **deep**: All stories get test files. Negative test cases for every happy path
+  AC. Boundary condition tests. Layer-specific skeletons (unit + integration +
+  e2e where applicable). Traceability matrix with confidence analysis.
+- **mvp**: Test files for Must-have stories only. One test case per AC. No
+  layer splitting — all tests in acceptance/ directory.
+- **custom:depth(1-5)**: Depth 1: Must-have stories only. Depth 2: add
+  Should-have. Depth 3: add negative cases. Depth 4: add boundary conditions
+  and layer splitting. Depth 5: full suite with all stories and edge cases.
+## Mode Detection
+Update mode if tests/acceptance/ directory exists. In update mode: add test
+files for new stories, add test cases for new ACs in existing stories, never
+delete user-implemented test logic (only add new pending cases). Update
+docs/story-tests-map.md with new mappings.
+## Update Mode Specifics
+- **Detect prior artifact**: tests/acceptance/ directory exists with test files
+- **Preserve**: all user-implemented test logic, existing test file names and
+  structure, story ID and AC ID tags, traceability mappings in
+  docs/story-tests-map.md
+- **Triggers for update**: user stories added or changed acceptance criteria,
+  architecture changed component structure (layer assignments may shift),
+  tdd-standards.md changed test patterns or framework
+- **Conflict resolution**: if a story's AC was reworded, update the test case
+  description but preserve any implemented test body; if layer assignment
+  changed, move the test case to the correct layer file

package/pipeline/specification/api-contracts.md CHANGED Viewed

@@ -2,7 +2,7 @@
 name: api-contracts
 description: Specify API contracts for all system interfaces
 phase: "specification"
-order: 15
+order: 830
 dependencies: [review-architecture]
 outputs: [docs/api-contracts.md]
 conditional: "if-needed"
@@ -41,4 +41,19 @@ response shapes, error codes, authentication requirements, and rate limits.
   error contracts. Depth 4-5: full OpenAPI-style spec.
 ## Mode Detection
-Update mode if contracts exist. Diff against architecture changes.
+Check for docs/api-contracts.md. If it exists, operate in update mode: read
+existing endpoint definitions and diff against current system architecture and
+domain models. Preserve existing endpoint paths, request/response schemas, and
+error contracts. Add new endpoints for new features or domain operations.
+Update error contracts if domain model changed validation rules. Never remove
+or rename existing endpoints without explicit user approval.
+## Update Mode Specifics
+- **Detect prior artifact**: docs/api-contracts.md exists
+- **Preserve**: existing endpoint paths, HTTP methods, request/response schemas,
+  error codes, auth requirements, pagination patterns, versioning strategy
+- **Triggers for update**: architecture changed component boundaries, domain
+  models added new operations, ADRs changed API style or auth approach
+- **Conflict resolution**: if architecture moved an operation to a different
+  component, update the endpoint's component ownership but preserve its contract;
+  flag breaking schema changes for user review

package/pipeline/specification/database-schema.md CHANGED Viewed

@@ -2,7 +2,7 @@
 name: database-schema
 description: Design database schema from domain models
 phase: "specification"
-order: 13
+order: 810
 dependencies: [review-architecture]
 outputs: [docs/database-schema.md]
 conditional: "if-needed"
@@ -38,4 +38,19 @@ relationships, indexes, constraints, and migration strategy.
   constraints. Depth 4-5: full specification with migrations.
 ## Mode Detection
-Update mode if schema exists. Diff against current domain models.
+Check for docs/database-schema.md. If it exists, operate in update mode: read
+existing schema and diff against current domain models in docs/domain-models/.
+Preserve existing table definitions, relationships, constraints, and migration
+history. Add new entities from updated domain models. Update indexes for new
+query patterns identified in architecture data flows. Never drop existing
+tables without explicit user approval.
+## Update Mode Specifics
+- **Detect prior artifact**: docs/database-schema.md exists
+- **Preserve**: existing table/collection definitions, relationships, constraints,
+  migration history, index justifications, seed data strategy
+- **Triggers for update**: domain models changed (new entities or relationships),
+  ADRs changed database technology, architecture introduced new query patterns
+- **Conflict resolution**: if domain model renamed an entity, create a migration
+  that renames rather than drops and recreates; flag breaking changes for user
+  review

package/pipeline/specification/review-api.md CHANGED Viewed

@@ -2,11 +2,11 @@
 name: review-api
 description: Review API contracts for completeness and consistency
 phase: "specification"
-order: 16
+order: 840
 dependencies: [api-contracts]
-outputs: [docs/reviews/review-api.md]
+outputs: [docs/reviews/review-api.md, docs/reviews/api/review-summary.md, docs/reviews/api/codex-review.json, docs/reviews/api/gemini-review.json]
 conditional: "if-needed"
-knowledge-base: [review-methodology, review-api-contracts]
+knowledge-base: [review-methodology, review-api-design]
 ---
 ## Purpose
@@ -14,6 +14,9 @@ Review API contracts targeting API-specific failure modes: operation coverage
 gaps, error contract incompleteness, auth/authz gaps, versioning inconsistencies,
 payload shape mismatches with domain entities, and idempotency gaps.
+At depth 4+, dispatches to external AI models (Codex, Gemini) for
+independent review validation.
 ## Inputs
 - docs/api-contracts.md (required) — contracts to review
 - docs/domain-models/ (required) — for operation coverage
@@ -23,6 +26,9 @@ payload shape mismatches with domain entities, and idempotency gaps.
 ## Expected Outputs
 - docs/reviews/review-api.md — findings and resolution log
 - docs/api-contracts.md — updated with fixes
+- docs/reviews/api/review-summary.md (depth 4+) — multi-model review synthesis
+- docs/reviews/api/codex-review.json (depth 4+, if available) — raw Codex findings
+- docs/reviews/api/gemini-review.json (depth 4+, if available) — raw Gemini findings
 ## Quality Criteria
 - Operation coverage against domain model verified
@@ -30,11 +36,17 @@ payload shape mismatches with domain entities, and idempotency gaps.
 - Auth requirements specified for every endpoint
 - Versioning strategy consistent with ADRs
 - Idempotency documented for all mutating operations
+- (depth 4+) Multi-model findings synthesized with consensus/disagreement analysis
 ## Methodology Scaling
-- **deep**: Full multi-pass review targeting all API failure modes.
+- **deep**: Full multi-pass review targeting all API failure modes. Multi-model
+  review dispatched to Codex and Gemini if available, with graceful fallback
+  to Claude-only enhanced review.
 - **mvp**: Operation coverage check only.
-- **custom:depth(1-5)**: Scale passes with depth.
+- **custom:depth(1-5)**: Depth 1-3: scale passes with depth. Depth 4: full
+  review + one external model (if CLI available). Depth 5: full review +
+  multi-model with reconciliation.
 ## Mode Detection
-Re-review mode if previous review exists.
+Re-review mode if previous review exists. If multi-model review artifacts exist
+under docs/reviews/api/, preserve prior findings still valid.

package/pipeline/specification/review-database.md CHANGED Viewed

@@ -2,11 +2,11 @@
 name: review-database
 description: Review database schema for correctness and completeness
 phase: "specification"
-order: 14
+order: 820
 dependencies: [database-schema]
-outputs: [docs/reviews/review-database.md]
+outputs: [docs/reviews/review-database.md, docs/reviews/database/review-summary.md, docs/reviews/database/codex-review.json, docs/reviews/database/gemini-review.json]
 conditional: "if-needed"
-knowledge-base: [review-methodology, review-database-schema]
+knowledge-base: [review-methodology, review-database-design]
 ---
 ## Purpose
@@ -14,6 +14,9 @@ Review database schema targeting schema-specific failure modes: entity coverage
 gaps, normalization trade-off issues, missing indexes, migration safety, and
 referential integrity vs. domain invariants.
+At depth 4+, dispatches to external AI models (Codex, Gemini) for
+independent review validation.
 ## Inputs
 - docs/database-schema.md (required) — schema to review
 - docs/domain-models/ (required) — for entity coverage
@@ -22,6 +25,9 @@ referential integrity vs. domain invariants.
 ## Expected Outputs
 - docs/reviews/review-database.md — findings and resolution log
 - docs/database-schema.md — updated with fixes
+- docs/reviews/database/review-summary.md (depth 4+) — multi-model review synthesis
+- docs/reviews/database/codex-review.json (depth 4+, if available) — raw Codex findings
+- docs/reviews/database/gemini-review.json (depth 4+, if available) — raw Gemini findings
 ## Quality Criteria
 - Entity coverage verified
@@ -29,11 +35,17 @@ referential integrity vs. domain invariants.
 - Index coverage for known query patterns verified
 - Migration safety assessed
 - Referential integrity matches domain invariants
+- (depth 4+) Multi-model findings synthesized with consensus/disagreement analysis
 ## Methodology Scaling
-- **deep**: Full multi-pass review targeting all schema failure modes.
+- **deep**: Full multi-pass review targeting all schema failure modes. Multi-model
+  review dispatched to Codex and Gemini if available, with graceful fallback
+  to Claude-only enhanced review.
 - **mvp**: Entity coverage check only.
-- **custom:depth(1-5)**: Scale passes with depth.
+- **custom:depth(1-5)**: Depth 1-3: scale passes with depth. Depth 4: full
+  review + one external model (if CLI available). Depth 5: full review +
+  multi-model with reconciliation.
 ## Mode Detection
-Re-review mode if previous review exists.
+Re-review mode if previous review exists. If multi-model review artifacts exist
+under docs/reviews/database/, preserve prior findings still valid.