@zigrivers/scaffold 2.1.2 → 2.28.1
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +272 -59
- package/knowledge/core/adr-craft.md +53 -0
- package/knowledge/core/ai-memory-management.md +246 -0
- package/knowledge/core/api-design.md +4 -0
- package/knowledge/core/claude-md-patterns.md +254 -0
- package/knowledge/core/coding-conventions.md +246 -0
- package/knowledge/core/database-design.md +4 -0
- package/knowledge/core/design-system-tokens.md +465 -0
- package/knowledge/core/dev-environment.md +223 -0
- package/knowledge/core/domain-modeling.md +4 -0
- package/knowledge/core/eval-craft.md +1008 -0
- package/knowledge/core/multi-model-review-dispatch.md +250 -0
- package/knowledge/core/operations-runbook.md +37 -226
- package/knowledge/core/project-structure-patterns.md +231 -0
- package/knowledge/core/review-step-template.md +247 -0
- package/knowledge/core/{security-review.md → security-best-practices.md} +5 -1
- package/knowledge/core/task-decomposition.md +57 -34
- package/knowledge/core/task-tracking.md +225 -0
- package/knowledge/core/tech-stack-selection.md +214 -0
- package/knowledge/core/testing-strategy.md +63 -70
- package/knowledge/core/user-stories.md +69 -60
- package/knowledge/core/user-story-innovation.md +57 -0
- package/knowledge/core/ux-specification.md +5 -148
- package/knowledge/finalization/apply-fixes-and-freeze.md +165 -14
- package/knowledge/product/prd-craft.md +55 -34
- package/knowledge/review/review-adr.md +32 -0
- package/knowledge/review/{review-api-contracts.md → review-api-design.md} +34 -1
- package/knowledge/review/{review-database-schema.md → review-database-design.md} +27 -1
- package/knowledge/review/review-domain-modeling.md +33 -0
- package/knowledge/review/review-implementation-tasks.md +50 -0
- package/knowledge/review/review-operations.md +55 -0
- package/knowledge/review/review-prd.md +33 -0
- package/knowledge/review/review-security.md +53 -0
- package/knowledge/review/review-system-architecture.md +28 -0
- package/knowledge/review/review-testing-strategy.md +51 -0
- package/knowledge/review/review-user-stories.md +54 -0
- package/knowledge/review/{review-ux-spec.md → review-ux-specification.md} +37 -1
- package/methodology/custom-defaults.yml +32 -3
- package/methodology/deep.yml +32 -3
- package/methodology/mvp.yml +32 -3
- package/package.json +2 -1
- package/pipeline/architecture/review-architecture.md +18 -6
- package/pipeline/architecture/system-architecture.md +14 -2
- package/pipeline/consolidation/claude-md-optimization.md +73 -0
- package/pipeline/consolidation/workflow-audit.md +73 -0
- package/pipeline/decisions/adrs.md +14 -2
- package/pipeline/decisions/review-adrs.md +18 -5
- package/pipeline/environment/ai-memory-setup.md +70 -0
- package/pipeline/environment/automated-pr-review.md +70 -0
- package/pipeline/environment/design-system.md +73 -0
- package/pipeline/environment/dev-env-setup.md +65 -0
- package/pipeline/environment/git-workflow.md +71 -0
- package/pipeline/finalization/apply-fixes-and-freeze.md +1 -1
- package/pipeline/finalization/developer-onboarding-guide.md +1 -1
- package/pipeline/finalization/implementation-playbook.md +3 -3
- package/pipeline/foundation/beads.md +68 -0
- package/pipeline/foundation/coding-standards.md +68 -0
- package/pipeline/foundation/project-structure.md +69 -0
- package/pipeline/foundation/tdd.md +60 -0
- package/pipeline/foundation/tech-stack.md +74 -0
- package/pipeline/integration/add-e2e-testing.md +65 -0
- package/pipeline/modeling/domain-modeling.md +14 -2
- package/pipeline/modeling/review-domain-modeling.md +18 -5
- package/pipeline/parity/platform-parity-review.md +70 -0
- package/pipeline/planning/implementation-plan-review.md +56 -0
- package/pipeline/planning/{implementation-tasks.md → implementation-plan.md} +29 -9
- package/pipeline/pre/create-prd.md +13 -4
- package/pipeline/pre/innovate-prd.md +37 -8
- package/pipeline/pre/innovate-user-stories.md +38 -7
- package/pipeline/pre/review-prd.md +18 -6
- package/pipeline/pre/review-user-stories.md +23 -6
- package/pipeline/pre/user-stories.md +12 -2
- package/pipeline/quality/create-evals.md +102 -0
- package/pipeline/quality/operations.md +38 -13
- package/pipeline/quality/review-operations.md +17 -5
- package/pipeline/quality/review-security.md +17 -5
- package/pipeline/quality/review-testing.md +20 -8
- package/pipeline/quality/security.md +25 -3
- package/pipeline/quality/story-tests.md +73 -0
- package/pipeline/specification/api-contracts.md +17 -2
- package/pipeline/specification/database-schema.md +17 -2
- package/pipeline/specification/review-api.md +18 -6
- package/pipeline/specification/review-database.md +18 -6
- package/pipeline/specification/review-ux.md +19 -7
- package/pipeline/specification/ux-spec.md +29 -10
- package/pipeline/validation/critical-path-walkthrough.md +34 -7
- package/pipeline/validation/cross-phase-consistency.md +34 -7
- package/pipeline/validation/decision-completeness.md +34 -7
- package/pipeline/validation/dependency-graph-validation.md +34 -7
- package/pipeline/validation/implementability-dry-run.md +34 -7
- package/pipeline/validation/scope-creep-check.md +34 -7
- package/pipeline/validation/traceability-matrix.md +34 -7
- package/skills/multi-model-dispatch/SKILL.md +326 -0
- package/skills/scaffold-pipeline/SKILL.md +195 -0
- package/skills/scaffold-runner/SKILL.md +465 -0
- package/pipeline/planning/review-tasks.md +0 -38
- package/pipeline/quality/testing-strategy.md +0 -42
--- /dev/null
+++ b/package/knowledge/core/multi-model-review-dispatch.md
@@ -0,0 +1,250 @@
+---
+name: multi-model-review-dispatch
+description: Patterns for dispatching reviews to external AI models (Codex, Gemini) at depth 4+, including fallback strategies and finding reconciliation
+topics: [multi-model, code-review, depth-scaling, codex, gemini, review-synthesis]
+---
+
+# Multi-Model Review Dispatch
+
+At higher methodology depths (4+), reviews benefit from independent validation by external AI models. Different models have different blind spots — Codex excels at code-centric analysis while Gemini brings strength in design and architectural reasoning. Dispatching to multiple models and reconciling their findings produces higher-quality reviews than any single model alone. This knowledge covers when to dispatch, how to dispatch, how to handle failures, and how to reconcile disagreements.
+
+## Summary
+
+### When to Dispatch
+
+Multi-model review activates at depth 4+ in the methodology scaling system:
+
+| Depth | Review Approach |
+|-------|----------------|
+| 1-2 | Claude-only, reduced pass count |
+| 3 | Claude-only, full pass count |
+| 4 | Full passes + one external model (if available) |
+| 5 | Full passes + multi-model with reconciliation |
+
+Dispatch is always optional. If no external model CLI is available, the review proceeds as a Claude-only enhanced review with additional self-review passes to partially compensate.
+
+### Model Selection
+
+| Model | Strength | Best For |
+|-------|----------|----------|
+| **Codex** (OpenAI) | Code analysis, implementation correctness, API contract validation | Code reviews, security reviews, API reviews, database schema reviews |
+| **Gemini** (Google) | Design reasoning, architectural patterns, broad context understanding | Architecture reviews, PRD reviews, UX reviews, domain model reviews |
+
+When both models are available at depth 5, dispatch to both and reconcile. At depth 4, choose the model best suited to the artifact type.
+
+### Graceful Fallback
+
+External models are never required. The fallback chain:
+1. Attempt dispatch to selected model(s)
+2. If CLI unavailable → skip that model, note in report
+3. If timeout → use partial results if any, note incompleteness
+4. If all external models fail → Claude-only enhanced review (additional self-review passes)
+
+The review never blocks on external model availability.
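The fallback chain above can be sketched in Python. This is a minimal illustration, not part of the package: it assumes each external model dispatch is wrapped as a callable that returns a list of findings or raises on failure, and the `review_with_fallback` name and signature are invented for this sketch.

```python
from typing import Callable

Findings = list[dict]

def review_with_fallback(
    external_models: dict[str, Callable[[], Findings]],
    claude_enhanced_review: Callable[[], Findings],
) -> tuple[Findings, list[str]]:
    """Attempt each external model; never block the review on availability."""
    findings: Findings = []
    notes: list[str] = []
    for name, dispatch in external_models.items():
        try:
            findings.extend(dispatch())
        except Exception as exc:  # CLI unavailable, timeout, unparseable output
            notes.append(f"{name}: skipped ({exc})")
    if not findings:
        # All external models failed → Claude-only enhanced review
        findings = claude_enhanced_review()
        notes.append("fallback: Claude-only enhanced review")
    return findings, notes
```

The key property is that every failure path degrades to a note in the report rather than an error.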
+
+## Deep Guidance
+
+### Dispatch Mechanics
+
+#### CLI Availability Check
+
+Before dispatching, verify the model CLI is installed and authenticated:
+
+```bash
+# Codex check
+which codex && codex --version 2>/dev/null
+
+# Gemini check (via Google Cloud CLI or dedicated tool)
+which gemini 2>/dev/null || (which gcloud && gcloud ai models list 2>/dev/null)
+```
+
+If the CLI is not found, skip dispatch immediately. Do not prompt the user to install it — this is a review enhancement, not a requirement.
+
+#### Prompt Formatting
+
+External model prompts must be self-contained. The external model has no access to the pipeline context, CLAUDE.md, or prior conversation. Every dispatch includes:
+
+1. **Artifact content** — The full text of the document being reviewed
+2. **Review focus** — What specific aspects to evaluate (coverage, consistency, correctness)
+3. **Upstream context** — Relevant upstream artifacts that the document should be consistent with
+4. **Output format** — Structured JSON for machine-parseable findings
+
+**Prompt template:**
+```
+You are reviewing the following [artifact type] for a software project.
+
+## Document Under Review
+[full artifact content]
+
+## Upstream Context
+[relevant upstream artifacts, summarized or in full]
+
+## Review Instructions
+Evaluate this document for:
+1. Coverage — Are all expected topics addressed?
+2. Consistency — Does it agree with the upstream context?
+3. Correctness — Are technical claims accurate?
+4. Completeness — Are there gaps that would block downstream work?
+
+## Output Format
+Respond with a JSON array of findings:
+[
+  {
+    "id": "F-001",
+    "severity": "P0|P1|P2|P3",
+    "category": "coverage|consistency|correctness|completeness",
+    "location": "section or line reference",
+    "finding": "description of the issue",
+    "suggestion": "recommended fix"
+  }
+]
+```
+
+#### Output Parsing
+
+External model output is parsed as JSON. Handle common parsing issues:
+- Strip markdown code fences (```json ... ```) if the model wraps output
+- Handle trailing commas in JSON arrays
+- Validate that each finding has the required fields (severity, category, finding)
+- Discard malformed entries rather than failing the entire parse
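The parsing rules above can be sketched as a lenient parser. This is an illustration of the approach, not the package's implementation; the `parse_findings` name and the regexes are assumptions for the sketch.

```python
import json
import re

REQUIRED_FIELDS = ("severity", "category", "finding")

def parse_findings(raw: str) -> list[dict]:
    """Leniently parse external model output into validated finding dicts."""
    text = raw.strip()
    # Strip a wrapping markdown code fence (```json ... ```), if present
    fence = re.match(r"^```[a-zA-Z]*\n(.*)\n```$", text, re.DOTALL)
    if fence:
        text = fence.group(1)
    # Remove trailing commas before ] or } — invalid JSON that models emit
    text = re.sub(r",\s*([\]}])", r"\1", text)
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return []
    if not isinstance(data, list):
        return []
    # Keep entries that carry the required fields; discard the rest
    return [
        f for f in data
        if isinstance(f, dict) and all(k in f for k in REQUIRED_FIELDS)
    ]
```

A failed parse returns an empty list, which the dispatch layer treats as a model failure rather than an error.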
+
+Store raw output for audit:
+```
+docs/reviews/{artifact}/codex-review.json — raw Codex findings
+docs/reviews/{artifact}/gemini-review.json — raw Gemini findings
+docs/reviews/{artifact}/review-summary.md — reconciled synthesis
+```
+
+### Timeout Handling
+
+External model calls can hang or take unreasonably long. Set reasonable timeouts:
+
+| Operation | Timeout | Rationale |
+|-----------|---------|-----------|
+| CLI availability check | 5 seconds | Should be instant |
+| Small artifact review (<2000 words) | 60 seconds | Quick read and analysis |
+| Medium artifact review (2000-10000 words) | 120 seconds | Needs more processing time |
+| Large artifact review (>10000 words) | 180 seconds | Maximum reasonable wait |
+
+#### Partial Result Handling
+
+If a timeout occurs mid-response:
+1. Check if the partial output contains valid JSON entries
+2. If yes, use the valid entries and note "partial results" in the report
+3. If no, treat as a model failure and fall back
+
+Never wait indefinitely. A review that completes in 3 minutes with Claude-only findings is better than one that blocks for 10 minutes waiting for an external model.
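The timeout-and-salvage behavior above can be sketched with `subprocess`. This is a hedged illustration, assuming the external model is invoked as a CLI reading the prompt on stdin; the `dispatch_with_timeout` name is invented here.

```python
import subprocess

def dispatch_with_timeout(cmd: list[str], prompt: str, timeout_s: int):
    """Run an external model CLI; on timeout, salvage whatever was emitted.

    Returns (output_or_None, timed_out).
    """
    try:
        proc = subprocess.run(cmd, input=prompt, capture_output=True,
                              text=True, timeout=timeout_s)
        return proc.stdout, False          # complete output
    except FileNotFoundError:
        return None, False                 # CLI unavailable → skip this model
    except subprocess.TimeoutExpired as exc:
        partial = exc.stdout or b""
        if isinstance(partial, bytes):     # older Pythons keep bytes here
            partial = partial.decode(errors="replace")
        return (partial or None), True     # partial results → note incompleteness
```

The caller then feeds any salvaged output through the same lenient JSON parsing used for complete responses.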
+
+### Finding Reconciliation
+
+When multiple models produce findings, reconciliation synthesizes them into a unified report.
+
+#### Consensus Analysis
+
+Compare findings across models to identify agreement and disagreement:
+
+**Consensus** — Multiple models flag the same issue (possibly with different wording). High confidence in the finding. Use the most specific description.
+
+**Single-source finding** — Only one model flags an issue. Lower confidence but still valuable. Include in the report with a note about which model found it.
+
+**Disagreement** — One model flags an issue that another model explicitly considers correct. Requires manual analysis.
+
+#### Reconciliation Process
+
+1. **Normalize findings.** Map each model's findings to a common schema (severity, category, location, description).
+
+2. **Match findings across models.** Two findings match if they reference the same location and describe the same underlying issue (even with different wording). Use location + category as the matching key.
+
+3. **Score by consensus.**
+   - Found by all models → confidence: high
+   - Found by majority → confidence: medium
+   - Found by one model → confidence: low (but still reported)
+
+4. **Resolve severity disagreements.** When models disagree on severity:
+   - If one says P0 and another says P1 → use P0 (err on the side of caution)
+   - If one says P1 and another says P3 → investigate the specific finding before deciding
+   - Document the disagreement in the synthesis report
+
+5. **Merge descriptions.** When multiple models describe the same finding differently, combine their perspectives. Model A might identify the symptom while Model B identifies the root cause.
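Steps 2-4 above (matching by location + category, consensus scoring, strictest-severity resolution) can be sketched as follows — an illustrative `reconcile` helper, not the package's code:

```python
from collections import defaultdict

SEVERITY_RANK = {"P0": 0, "P1": 1, "P2": 2, "P3": 3}

def reconcile(per_model: dict[str, list[dict]]) -> list[dict]:
    """Group findings by (location, category); score confidence by consensus."""
    groups = defaultdict(list)
    for model, findings in per_model.items():
        for f in findings:
            groups[(f.get("location"), f.get("category"))].append((model, f))
    merged = []
    total = len(per_model)
    for (location, category), entries in groups.items():
        models = sorted({m for m, _ in entries})
        if len(models) == total:
            confidence = "high"
        elif len(models) > total / 2:
            confidence = "medium"
        else:
            confidence = "low"
        # Severity disagreement → keep the strictest (err on the side of caution)
        severity = min((f["severity"] for _, f in entries),
                       key=lambda s: SEVERITY_RANK.get(s, 99))
        merged.append({"location": location, "category": category,
                       "severity": severity, "models": models,
                       "confidence": confidence})
    return merged
```

Description merging (step 5) and disagreement write-ups remain manual — the code only surfaces which groups need that attention.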
+
+#### Disagreement Resolution
+
+When models actively disagree (one flags an issue, another says the same thing is correct):
+
+1. **Read both arguments.** Each model explains its reasoning. One may have a factual error.
+2. **Check against source material.** Read the actual artifact and upstream docs. The correct answer is in the documents, not in model opinions.
+3. **Default to the stricter interpretation.** If genuinely ambiguous, the finding stands at reduced severity (P1 → P2).
+4. **Document the disagreement.** The reconciliation report should note: "Models disagreed on [topic]. Resolution: [decision and rationale]."
+
+### Output Format
+
+#### Review Summary (review-summary.md)
+
+```markdown
+# Multi-Model Review Summary: [Artifact Name]
+
+## Models Used
+- Claude (primary reviewer)
+- Codex (external, depth 4+) — [available/unavailable/timeout]
+- Gemini (external, depth 5) — [available/unavailable/timeout]
+
+## Consensus Findings
+| # | Severity | Finding | Models | Confidence |
+|---|----------|---------|--------|------------|
+| 1 | P0 | [description] | Claude, Codex | High |
+| 2 | P1 | [description] | Claude, Codex, Gemini | High |
+
+## Single-Source Findings
+| # | Severity | Finding | Source | Confidence |
+|---|----------|---------|--------|------------|
+| 3 | P1 | [description] | Gemini | Low |
+
+## Disagreements
+| # | Topic | Claude | Codex | Resolution |
+|---|-------|--------|-------|------------|
+| 4 | [topic] | P1 issue | No issue | [resolution rationale] |
+
+## Reconciliation Notes
+[Any significant observations about model agreement patterns, recurring themes,
+or areas where external models provided unique value]
+```
+
+#### Raw JSON Preservation
+
+Always preserve the raw JSON output from external models, even after reconciliation. The raw findings serve as an audit trail and enable re-analysis if the reconciliation logic is later improved.
+
+```
+docs/reviews/{artifact}/
+  codex-review.json — raw output from Codex
+  gemini-review.json — raw output from Gemini
+  review-summary.md — reconciled synthesis
+```
+
+### Quality Gates
+
+Minimum standards for a multi-model review to be considered complete:
+
+| Gate | Threshold | Rationale |
+|------|-----------|-----------|
+| Minimum finding count | At least 3 findings across all models | A review with zero findings likely missed something |
+| Coverage threshold | Every review pass has at least one finding or explicit "no issues found" note | Ensures all passes were actually executed |
+| Reconciliation completeness | All cross-model disagreements have documented resolutions | No unresolved conflicts |
+| Raw output preserved | JSON files exist for all models that were dispatched | Audit trail |
+
+If the primary Claude review produces zero findings and external models are unavailable, the review should explicitly note this as unusual and recommend a targeted re-review at a later stage.
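Three of the four gates in the table are mechanically checkable. A minimal sketch (the coverage-threshold gate needs per-pass metadata and is omitted; `check_quality_gates` is an invented name):

```python
def check_quality_gates(findings, disagreements, raw_expected, raw_present):
    """Return the names of failed gates; an empty list means the review passes."""
    failures = []
    if len(findings) < 3:                                  # minimum finding count
        failures.append("minimum finding count")
    if any(d.get("resolution") is None for d in disagreements):
        failures.append("reconciliation completeness")     # unresolved conflicts
    if set(raw_expected) - set(raw_present):               # missing audit files
        failures.append("raw output preserved")
    return failures
```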
+
+### Common Anti-Patterns
+
+**Blind trust of external findings.** An external model flags an issue and the reviewer includes it without verification. External models hallucinate — they may flag a "missing section" that actually exists, or cite a "contradiction" based on a misread. Fix: every external finding must be verified against the actual artifact before inclusion in the final report.
+
+**Ignoring disagreements.** Two models disagree, and the reviewer picks one without analysis. Fix: disagreements are the most valuable signal in multi-model review. They identify areas of genuine ambiguity or complexity. Always investigate and document the resolution.
+
+**Dispatching at low depth.** Running external model reviews at depth 1-2 where the review scope is intentionally minimal. The external model does a full analysis anyway, producing findings that are out of scope. Fix: only dispatch at depth 4+. Lower depths use Claude-only review with reduced pass count.
+
+**No fallback plan.** The review pipeline assumes external models are always available. When Codex is down, the review fails entirely. Fix: external dispatch is always optional. The fallback to Claude-only enhanced review must be implemented and tested.
+
+**Over-weighting consensus.** Two models agree on a finding, so it must be correct. But both models may share the same bias (e.g., both flag a pattern as an anti-pattern that is actually appropriate for this project's constraints). Fix: consensus increases confidence but does not guarantee correctness. All findings still require artifact-level verification.
+
+**Dispatching the full pipeline context.** Sending the entire project context (all docs, all code) to the external model. This exceeds context limits and dilutes focus. Fix: send only the artifact under review and the minimal upstream context needed for that specific review.
+
+**Ignoring partial results.** A model times out after producing 3 of 5 findings. The reviewer discards all results because the review is "incomplete." Fix: partial results are still valuable. Include them with a note about incompleteness. Three real findings are better than zero.

--- a/package/knowledge/core/operations-runbook.md
+++ b/package/knowledge/core/operations-runbook.md
@@ -1,257 +1,70 @@
 ---
 name: operations-runbook
-description:
-topics: [operations, cicd, deployment, monitoring,
+description: Deployment pipeline, deployment strategies, monitoring, alerting, and incident response
+topics: [operations, cicd, deployment, monitoring, incident-response, alerting, rollback]
 ---
 
-## Dev Environment
+## Dev Environment Reference
 
-
+Local development setup (prerequisites, env vars, one-command setup, database, hot reload, common commands, troubleshooting) is defined in `docs/dev-setup.md`, created by the Dev Setup prompt. The operations runbook should reference it rather than redefine it.
 
-
+**Operations-specific additions** (not covered by dev-setup):
 
-
+### Environment-Specific Configuration
 
-
-|------------|---------|-----|
-| Node.js | 20.x LTS | Runtime |
-| Python | 3.12+ | Backend scripts |
-| PostgreSQL | 16+ | Primary database |
-| Redis | 7+ | Caching and sessions |
-| Docker | 24+ | Database containers (optional) |
+Document how environment variables differ across environments:
 
-
-
-
-
+| Variable | Local | Staging | Production |
+|----------|-------|---------|------------|
+| `APP_ENV` | development | staging | production |
+| `DATABASE_URL` | localhost | staging-db-host | prod-db-host |
+| `LOG_LEVEL` | debug | info | warn |
 
-
+### Secrets Management
 
-
+Production secrets should never be in code or `.env` files:
+- Use a secrets manager (AWS Secrets Manager, GCP Secret Manager, Vault, Doppler)
+- Rotate secrets on a schedule (90 days for API keys, 365 days for service accounts)
+- Audit access to secrets
+- Document which secrets exist, where they're stored, and who can access them
 
-
+## Deployment Pipeline
 
-
-
-```bash
-# Application
-APP_PORT=3000                  # Port for the dev server
-APP_ENV=development            # development | staging | production
-APP_URL=http://localhost:3000  # Base URL for the app
-
-# Database
-DATABASE_URL=postgresql://localhost:5432/myapp_dev  # Local PostgreSQL
-DATABASE_POOL_SIZE=5           # Connection pool size
-
-# Authentication
-JWT_SECRET=local-dev-secret-change-in-production  # JWT signing key
-SESSION_SECRET=local-session-secret               # Session cookie secret
-
-# External Services (optional for local dev)
-# STRIPE_SECRET_KEY=sk_test_...  # Uncomment when testing payments
-# SENDGRID_API_KEY=SG....        # Uncomment when testing emails
-```
-
-**`.env`** — Actual local configuration, gitignored. Created by copying `.env.example`.
-
-**Required vs. optional:** Clearly mark which variables are required for the app to start and which are optional (features degrade gracefully without them).
-
-### One-Command Setup
-
-Provide a single command that installs all dependencies, creates the database, runs migrations, seeds data, and starts the dev server:
-
-```bash
-# First time setup
-make setup   # or: npm run setup
-
-# Daily development
-make dev     # or: npm run dev
-```
-
-The setup command should be idempotent — safe to run twice without breaking anything.
-
-### Database Setup for Local Development
-
-**Option A: Local installation**
-- Install database server natively
-- Create dev database: `createdb myapp_dev` (PostgreSQL)
-- Run migrations: `make db-migrate`
-- Seed data: `make db-seed`
-
-**Option B: Docker Compose**
-```yaml
-services:
-  db:
-    image: postgres:16
-    ports: ["5432:5432"]
-    environment:
-      POSTGRES_DB: myapp_dev
-      POSTGRES_USER: postgres
-      POSTGRES_PASSWORD: postgres
-    volumes:
-      - pgdata:/var/lib/postgresql/data
-
-  redis:
-    image: redis:7
-    ports: ["6379:6379"]
-
-volumes:
-  pgdata:
-```
-
-Docker Compose is convenient for managing service dependencies but adds startup time and complexity. For simple stacks (SQLite, single service), skip Docker entirely.
-
-**Option C: SQLite for development**
-- No setup required
-- Create database file on first run
-- Fast, zero-configuration
-- Trade-off: behavior differences from production PostgreSQL (no JSONB, different SQL dialect)
-
-### Hot Reloading Configuration
-
-The dev server must reload automatically when code changes:
-
-- **Frontend:** Vite HMR (React, Vue, Svelte), Next.js Fast Refresh, or webpack HMR
-- **Backend (Node.js):** `tsx watch`, `nodemon`, or `ts-node-dev` for TypeScript; `node --watch` for plain Node
-- **Backend (Python):** `uvicorn --reload` (FastAPI), `flask run --debug` (Flask), `manage.py runserver` (Django)
-- **Full-stack:** Run frontend and backend concurrently with a process manager (`concurrently`, `honcho`, or Makefile with `&`)
-
-### Common Dev Commands
-
-Every project should have a consistent set of commands. Use whatever mechanism fits the stack:
-
-| Command | Purpose | Implementation |
-|---------|---------|---------------|
-| `make dev` | Start dev server with hot reload | Frontend + backend concurrently |
-| `make test` | Run all tests | Test runner with coverage |
-| `make test-watch` | Run tests in watch mode | Test runner in watch mode |
-| `make lint` | Check code style | Linter for each language |
-| `make format` | Auto-fix formatting | Formatter for each language |
-| `make db-migrate` | Run pending migrations | Migration tool |
-| `make db-seed` | Seed database with sample data | Seed script |
-| `make db-reset` | Drop, recreate, migrate, seed | Compose the above |
-| `make check` | Run all quality gates | lint + type-check + test |
-
-Commands should be:
-- Short and memorable (not `npx jest --runInBand --coverage --passWithNoTests`)
-- Documented with help text
-- Idempotent where possible
-- Fast enough to run frequently
-
-### Troubleshooting Guide
-
-Document solutions for common development issues:
-
-**Port already in use:**
-```bash
-lsof -i :3000   # Find the process
-kill -9 <PID>   # Kill it
-```
-
-**Database connection refused:**
-- Is the database running? `pg_isready` or `docker ps`
-- Is the connection string correct? Check `.env`
-- Is the port correct? Check for port conflicts
-
-**Dependencies out of sync:**
-```bash
-rm -rf node_modules && npm install   # Node.js
-rm -rf .venv && python -m venv .venv && pip install -r requirements.txt   # Python
-```
-
-**Migrations out of date:**
-```bash
-make db-migrate   # Run pending migrations
-make db-reset     # Nuclear option: start fresh
-```
-
-## CI/CD Pipeline
+The base CI pipeline (lint + test on PRs) is configured by the Git Workflow prompt in `.github/workflows/ci.yml`. The operations runbook extends this with production deployment stages — it does not redefine the base CI.
 
 ### Pipeline Architecture
 
-A CI/CD pipeline automates the path from code commit to production deployment. Design it in stages:
-
 ```
-
--> Stage 1: Fast checks (30s)
-
-
-
--> Stage 3: Build (1-2 min)
-
--> Stage
-Deploy to staging/production
-
-PR merge to main
--> All stages above
--> Stage 5: Post-deploy verification
-Smoke tests against deployed environment
-```
-
-### Stage Design
-
-**Stage 1: Fast Checks**
-- Run on every push and PR
-- Fail fast: if linting fails, don't bother running tests
-- Cache dependencies between runs
-- Target: <30 seconds
-
-```yaml
-# GitHub Actions example
-lint:
-  runs-on: ubuntu-latest
-  steps:
-    - uses: actions/checkout@v4
-    - uses: actions/setup-node@v4
-      with: { node-version-file: '.nvmrc' }
-    - run: npm ci --ignore-scripts
-    - run: npm run lint
-    - run: npm run type-check
+Existing CI (from git-workflow — already configured):
+-> Stage 1: Fast checks (30s) — lint, format, type check
+-> Stage 2: Tests (2-5 min) — unit + integration in parallel
+
+Operations adds (main branch only):
+-> Stage 3: Build (1-2 min) — compile, bundle, generate artifacts
+-> Stage 4: Deploy (2-5 min) — deploy to staging/production
+-> Stage 5: Post-deploy verification — smoke tests
 ```
 
-
-- Run unit and integration tests in parallel
-- Use a service container for the test database
-- Upload coverage reports as artifacts
-- Target: <5 minutes
-
-```yaml
-test:
-  runs-on: ubuntu-latest
-  services:
-    postgres:
-      image: postgres:16
-      env:
-        POSTGRES_DB: test
-        POSTGRES_PASSWORD: test
-      ports: ['5432:5432']
-  steps:
-    - uses: actions/checkout@v4
-    - uses: actions/setup-node@v4
-    - run: npm ci
-    - run: npm run test:ci
-      env:
-        DATABASE_URL: postgresql://postgres:test@localhost:5432/test
-```
+### Stage 3: Build
 
-**Stage 3: Build**
 - Compile TypeScript, bundle frontend assets, generate Docker image
 - Verify the build artifact is valid (start the server and check health endpoint)
 - Store the build artifact for deployment
+- Only runs after Stages 1-2 pass
+
+### Stage 4: Deploy
 
-**Stage 4: Deploy**
 - Only runs on main branch (after PR merge)
 - Deploy the build artifact from Stage 3
 - Run database migrations before starting new version
 - Verify health check after deployment
 
-###
-
-**Parallel jobs:** Run lint, unit tests, and integration tests as separate parallel jobs. The total pipeline time equals the longest individual job, not the sum.
+### Stage 5: Post-Deploy Verification
 
-
-
-
+- Run smoke tests against the deployed environment
+- Verify critical user flows work end-to-end
+- Check external dependency connectivity
+- If smoke tests fail: trigger automatic rollback
 
 ### Artifact Management
 
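The Stage 5 smoke-test-then-rollback step added in the diff above can be sketched as follows. This is an illustration only — the endpoint paths and function names are assumptions, not part of the package, and the actual rollback mechanism depends on the deployment target:

```python
import urllib.request

# Illustrative smoke checks; the endpoint paths are assumptions for this sketch.
SMOKE_CHECKS = ["/health", "/api/status"]

def check(base_url: str, path: str) -> bool:
    """One smoke check: the endpoint must answer 200 within a short timeout."""
    try:
        with urllib.request.urlopen(base_url + path, timeout=10) as resp:
            return resp.status == 200
    except OSError:
        return False

def should_rollback(results: dict[str, bool]) -> bool:
    """Stage 5 policy: any failed smoke check triggers automatic rollback."""
    return not all(results.values())
```

Usage would be `should_rollback({p: check(base_url, p) for p in SMOKE_CHECKS})`, with a deployment-specific rollback command run when it returns True.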
@@ -502,8 +315,6 @@ Define service level targets for the application:
 
 **Alert fatigue.** Too many alerts firing for non-critical issues. The on-call person starts ignoring alerts because most are noise. A real incident gets missed. Fix: every alert must have a clear response action. Remove alerts that routinely fire without requiring action.
 
-**No local dev story.** "It works on the CI server" but developers can't run the application locally. Fix: document the local setup, make it one command, test it regularly by having new team members follow the instructions.
-
 **Manual deployment steps.** Deployment requires an engineer to SSH into a server and run commands. This is error-prone, unreproducible, and blocks deployment on individual availability. Fix: fully automate deployment. A merge to main should trigger deployment automatically.
 
 **No monitoring before launch.** Monitoring is added after the first incident, when it's most needed and least available. Fix: set up monitoring as part of the infrastructure phase, before any user traffic.