npm - universal-dev-standards - Versions diffs - 5.4.0 → 5.6.0 - Mend

universal-dev-standards 5.4.0 → 5.6.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (138)

package/bundled/ai/standards/data-migration-testing.ai.yaml ADDED

@@ -0,0 +1,96 @@
+# Data Migration Testing Standards - AI Optimized
+# Source: core/data-migration-testing.md
+id: data-migration-testing
+meta:
+  version: "1.0.0"
+  updated: "2026-05-05"
+  source: core/data-migration-testing.md
+  description: Database migration up/down/idempotency testing standards for schema changes
+requirements:
+  REQ-1:
+    id: REQ-DMT-001
+    title: Up Migration Test
+    rule: >
+      Every schema migration MUST have an automated test that applies it to a clean
+      baseline schema and verifies the expected post-state (table names, columns,
+      indexes, constraints).
+    rationale: >
+      Unverified migrations silently corrupt production schema; up tests catch
+      incompatible changes before deployment.
+  REQ-2:
+    id: REQ-DMT-002
+    title: Down Migration (Rollback) Test
+    rule: >
+      Every migration that has a down/rollback function MUST have a test that
+      applies up, then down, and verifies the schema returns to its exact pre-state.
+      Zero data loss is the acceptance criterion.
+    rationale: >
+      Rollback is only reliable if it is tested; untested rollbacks fail at the worst
+      possible moment — during a production incident.
+  REQ-3:
+    id: REQ-DMT-003
+    title: Idempotency Test
+    rule: >
+      Migration runners MUST be tested to ensure applying the same migration twice
+      does not fail or cause side effects (e.g., duplicate columns, duplicate indexes).
+    rationale: >
+      CI retries and operator mistakes can trigger double-apply; idempotent migrations
+      prevent partial state corruption.
+  REQ-4:
+    id: REQ-DMT-004
+    title: Data Preservation Test
+    rule: >
+      Migrations that ALTER or DROP columns MUST include a test that seeds representative
+      data before the migration and verifies data integrity or expected transformation
+      after the migration.
+    rationale: >
+      Schema correctness does not imply data correctness; a column rename can silently
+      null-out existing data if the migration omits a data transform step.
+  REQ-5:
+    id: REQ-DMT-005
+    title: Migration History Integrity
+    rule: >
+      The migration runner's applied-migrations table/ledger MUST be validated, and
+      each test run MUST use an isolated in-memory or temporary database so that
+      migrations do not interfere with each other or with production state.
+    rationale: >
+      Shared migration state between tests causes non-deterministic failures that are
+      extremely difficult to diagnose.
+test_structure:
+  isolation: "Each migration test MUST run against an isolated in-memory or temp DB"
+  baseline: "Start from the schema state immediately before the migration under test"
+  assertions:
+    up: "Assert table/column/index existence and types"
+    down: "Assert schema returns to pre-migration state"
+    idempotency: "Apply migration twice; second apply must succeed without error"
+    data: "Seed rows before migration; assert row count and values after"
+tooling_guidance:
+  sqlite: "Use ':memory:' DSN; apply migrations via ORM migrate() or raw SQL"
+  postgres: "Use pgmock or test containers; run migrate up/down in a transaction"
+  drizzle_orm: >
+    Call db.run(sql`...`) with raw DDL, or use the migrate() function from drizzle-orm's
+    driver-specific migrator module against a fresh :memory: SQLite database for each test file.
+anti_patterns:
+  - description: >
+      Testing migrations against a shared development database — causes cross-test
+      pollution and non-repeatable results.
+  - description: >
+      Skipping down migration tests because "we never roll back": rollbacks happen
+      during incidents, which is the worst time to discover they are broken.
+  - description: >
+      Writing migration tests after the fact without seeding data — misses the
+      data preservation class of bugs entirely.
+related_standards:
+  - testing
+  - database-standards
+  - verification-evidence
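
The up, down, and idempotency assertions described in this file can be sketched without a real database. The following is only a toy model, assuming a schema represented as an in-memory map of table names to column lists; `Schema`, `Migration`, `Runner`, and `addEmailColumn` are hypothetical names introduced for illustration and are not part of the package. A real test would run the same checks against an isolated `:memory:` SQLite or temporary database, and would also seed rows to cover REQ-DMT-004, which this model does not attempt.

```typescript
// Toy in-memory model (assumption): a schema maps table names to column lists.
type Schema = Map<string, string[]>;

interface Migration {
  id: string;
  up(schema: Schema): void;
  down(schema: Schema): void;
}

// Hypothetical migration under test: adds an `email` column to `users`.
const addEmailColumn: Migration = {
  id: "0002_add_users_email",
  up(schema) {
    const cols = schema.get("users");
    if (!cols) throw new Error("users table missing");
    cols.push("email");
  },
  down(schema) {
    const cols = schema.get("users");
    if (!cols) throw new Error("users table missing");
    schema.set("users", cols.filter((c) => c !== "email"));
  },
};

// Minimal runner with an applied-migrations ledger, so a second apply is a no-op.
class Runner {
  readonly applied = new Set<string>();
  constructor(readonly schema: Schema) {}

  apply(m: Migration): void {
    if (this.applied.has(m.id)) return; // idempotency guard
    m.up(this.schema);
    this.applied.add(m.id);
  }

  rollback(m: Migration): void {
    if (!this.applied.has(m.id)) return;
    m.down(this.schema);
    this.applied.delete(m.id);
  }
}

// Isolation: every test starts from its own fresh baseline schema.
function freshBaseline(): Schema {
  return new Map<string, string[]>([["users", ["id", "name"]]]);
}

// Stable textual snapshot used to compare pre- and post-rollback state.
function snapshot(schema: Schema): string {
  const entries = Array.from(schema.entries());
  entries.sort(([a], [b]) => a.localeCompare(b));
  return JSON.stringify(entries);
}

const runner = new Runner(freshBaseline());
const preState = snapshot(runner.schema);
runner.apply(addEmailColumn);
runner.apply(addEmailColumn); // second apply must not add a duplicate column
const columnsAfterUp = runner.schema.get("users") ?? [];
runner.rollback(addEmailColumn);
const restored = snapshot(runner.schema) === preState;
```

Comparing a serialized snapshot before the up step and after the down step is one simple way to assert "exact pre-state"; against a real database the equivalent would be a schema introspection query.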

package/bundled/ai/standards/data-pipeline.ai.yaml ADDED

@@ -0,0 +1,113 @@
+# Data Pipeline Standards - AI Optimized
+# Source: XSPEC-068 Wave 3 Data Engineering Pack
+id: data-pipeline
+title: Data Pipeline Standards
+version: "1.0.0"
+status: Active
+tags: [data-engineering, pipeline, etl, data-quality, orchestration, idempotency]
+summary: |
+  Defines engineering standards for building reliable, observable, and
+  maintainable data pipelines. Covers idempotency and exactly-once semantics,
+  error handling and dead-letter queues, checkpoint and recovery patterns,
+  data lineage tracking, pipeline observability (metrics, alerting), testing
+  requirements, and deployment practices. Applicable to batch ETL, streaming
+  pipelines, and ML feature pipelines.
+requirements:
+  - id: REQ-001
+    title: Idempotency and Exactly-Once Processing
+    description: |
+      Every data pipeline MUST be designed for idempotent execution:
+      re-running the same pipeline for the same time window or batch MUST
+      produce identical output without duplication or data loss. Pipelines
+      MUST use deterministic keys for deduplication. Batch pipelines MUST
+      support re-processing historical partitions cleanly. Streaming pipelines
+      MUST implement exactly-once or at-least-once with deduplication using
+      unique event IDs. Overwrites of output partitions are preferred over
+      appends for batch jobs.
+    level: MUST
+    examples:
+      - "Batch: pipeline writes to date-partitioned output and overwrites the partition on re-run"
+      - "Streaming: dedup using Kafka message key + consumer group offset tracking"
+      - "Test: running pipeline twice for 2026-04-01 produces same row count both times"
+  - id: REQ-002
+    title: Error Handling and Dead-Letter Queues
+    description: |
+      Data pipelines MUST implement structured error handling with
+      categorized failure modes. Transient errors (network timeout, API
+      rate limit) MUST use exponential backoff retry (max 3 attempts).
+      Permanent errors (schema violation, invalid data) MUST route records
+      to a Dead-Letter Queue (DLQ) with the original record, error type,
+      error message, and processing timestamp. DLQ records MUST be
+      monitored and addressed within the pipeline's SLA.
+    level: MUST
+    examples:
+      - "Transient retry: retry_policy: {max_attempts: 3, backoff_base: 2s, max_backoff: 30s}"
+      - "DLQ record: {original_record: {...}, error_type: 'SCHEMA_VIOLATION', error_msg: 'field amount is null', ts: '...'}"
+      - "DLQ alert: >100 DLQ messages in 1 hour → PagerDuty alert to data-oncall"
+  - id: REQ-003
+    title: Checkpoint and Recovery
+    description: |
+      Long-running batch pipelines and stateful streaming pipelines MUST
+      implement checkpointing to enable recovery from mid-run failures
+      without full reprocessing. Checkpoints MUST record: last successfully
+      processed partition/offset/watermark, job run ID, and timestamp.
+      Recovery MUST resume from the last checkpoint, not from the beginning.
+      Checkpoint state MUST be stored in durable external storage (not
+      local disk).
+    level: MUST
+    examples:
+      - "Batch: checkpoint stores {last_processed_date: '2026-04-28', last_id: 12345678} in S3"
+      - "Streaming: Flink checkpoint interval 5 minutes, stored in S3 with 3 checkpoints retained"
+      - "Recovery test: kill job mid-run, restart, verify output matches full run with no duplicates"
+  - id: REQ-004
+    title: Data Lineage Tracking
+    description: |
+      Every data pipeline MUST emit lineage metadata describing its data
+      flow: source datasets (with versions/timestamps), transformation logic
+      applied, and output datasets produced. Lineage MUST be machine-readable
+      and ingested into a central lineage store or data catalog. Lineage
+      enables root-cause analysis of data quality issues and impact assessment
+      of upstream changes.
+    level: MUST
+    examples:
+      - "Lineage emit: {job: 'orders-aggregator', inputs: ['raw_orders@2026-04-30'], outputs: ['daily_order_summary@2026-04-30'], transform_version: 'v1.3.2'}"
+      - "OpenLineage event emitted to Marquez or DataHub on job start and completion"
+      - "Lineage query: 'Which pipelines read from raw_orders?' returns 5 downstream jobs"
+  - id: REQ-005
+    title: Pipeline Observability and SLOs
+    description: |
+      Every production data pipeline MUST expose the following metrics:
+      records processed (counter), processing latency (histogram), error
+      rate (gauge), DLQ depth (gauge), and last successful run timestamp.
+      Pipelines MUST define SLOs for: freshness (data available within N
+      hours of source), completeness (≥ X% records successfully processed),
+      and latency (p95 processing time within threshold). SLO violations
+      MUST trigger alerts.
+    level: MUST
+    examples:
+      - "Metric: pipeline_records_processed_total{pipeline='orders-agg',status='success'}"
+      - "Freshness SLO: daily_order_summary available by 03:00 UTC — alert if missing by 04:00 UTC"
+      - "Completeness alert: processed_records / expected_records < 0.99 → P2 alert"
+  - id: REQ-006
+    title: Pipeline Testing Requirements
+    description: |
+      Data pipelines MUST have automated tests covering: unit tests for
+      transformation logic (test with sample input/output DataFrames),
+      integration tests validating end-to-end flow with synthetic data,
+      and schema conformance tests validating output matches declared
+      data contract. Pipelines SHOULD have regression tests for historically
+      problematic edge cases (nulls in key fields, negative amounts,
+      duplicate records). Test coverage MUST be ≥ 80% for transformation
+      logic.
+    level: MUST
+    examples:
+      - "Unit test: test_calculate_order_total() — asserts discount applied correctly on sample rows"
+      - "Integration test: runs full pipeline on 1000 synthetic orders, validates output row count and schema"
+      - "Edge case test: pipeline handles duplicate order_id gracefully, deduplication logic verified"
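
REQ-002 in this file combines capped exponential backoff with dead-letter routing. A minimal sketch of that control flow follows, using the 2s base, 30s cap, and 3-attempt limit from the file's own example. The names `Outcome`, `DlqRecord`, `backoffMs`, and `processWithRetry` are hypothetical, the injected synchronous `sleep` exists only to keep the sketch testable (a real pipeline would most likely await an async delay), and routing a record to the DLQ after transient retries are exhausted is an assumption, since the standard only specifies DLQ routing for permanent errors.

```typescript
type Outcome<T> =
  | { ok: true; value: T }
  | { ok: false; kind: "transient" | "permanent"; message: string };

interface DlqRecord<R> {
  originalRecord: R;
  errorType: string;
  errorMsg: string;
  ts: string;
}

// Delay before the next try after failed attempt `attempt` (1-based):
// base * 2^(attempt - 1), capped at maxMs.
function backoffMs(attempt: number, baseMs = 2000, maxMs = 30000): number {
  return Math.min(baseMs * Math.pow(2, attempt - 1), maxMs);
}

function processWithRetry<R, T>(
  record: R,
  handler: (record: R) => Outcome<T>,
  dlq: DlqRecord<R>[],
  sleep: (ms: number) => void,
  maxAttempts = 3,
): T | undefined {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = handler(record);
    if (result.ok) return result.value;

    const exhausted = attempt === maxAttempts;
    if (result.kind === "permanent" || exhausted) {
      // Assumption: exhausted transient failures are dead-lettered as well.
      dlq.push({
        originalRecord: record,
        errorType: result.kind === "permanent" ? "PERMANENT" : "RETRIES_EXHAUSTED",
        errorMsg: result.message,
        ts: new Date().toISOString(),
      });
      return undefined;
    }
    sleep(backoffMs(attempt)); // transient failure: wait, then retry
  }
  return undefined;
}
```

Passing the DLQ and the sleep function in as parameters keeps the retry policy itself free of I/O, which makes the unit tests asked for in REQ-006 straightforward to write.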

package/bundled/ai/standards/disaster-recovery-drill.ai.yaml ADDED

@@ -0,0 +1,89 @@
+# SPDX-License-Identifier: MIT
+name: Disaster Recovery Drill Standards
+nameZh: 災難恢復演練標準
+id: disaster-recovery-drill
+version: "1.0.0"
+category: operations
+scope: reliability
+summary: >
+  Structured DR drill standards: quarterly runbook execution, RTO/RPO
+  measurement, backup restore verification, and Game Day protocols.
+  Untested recovery plans fail at the worst moment.
+requirements:
+  - id: REQ-01
+    title: RTO/RPO Targets Defined
+    titleZh: RTO/RPO 目標定義
+    level: MUST
+    description: >
+      Each system MUST have documented RTO (Recovery Time Objective) and RPO
+      (Recovery Point Objective) targets. These must be agreed with stakeholders
+      before any DR drill can be considered meaningful.
+    examples:
+      - "VibeOps commercial: RTO < 1 hour, RPO < 24 hours (daily backup)"
+  - id: REQ-02
+    title: Backup Restore Test
+    titleZh: 備份還原測試
+    level: MUST
+    description: >
+      At least quarterly, a full backup restore MUST be executed in an
+      isolated environment and verified for data integrity. The restore time
+      MUST be measured and compared to the RTO target.
+  - id: REQ-03
+    title: Runbook Completeness
+    titleZh: 運行手冊完整性
+    level: MUST
+    description: >
+      A DR runbook MUST exist covering: (1) detection (how do we know disaster
+      occurred?), (2) decision (who declares DR?), (3) recovery steps
+      (step-by-step, executable commands), (4) verification (how do we confirm
+      recovery?), (5) communication plan.
+  - id: REQ-04
+    title: Game Day Exercise
+    titleZh: Game Day 演練
+    level: SHOULD
+    description: >
+      At least annually, a Game Day exercise SHOULD be conducted where the
+      team simulates a realistic failure scenario and executes the runbook from
+      scratch. Results SHOULD be documented and used to improve the runbook.
+  - id: REQ-05
+    title: Drill Record
+    titleZh: 演練記錄
+    level: MUST
+    description: >
+      Every DR drill MUST produce a written record including: date, participants,
+      scenario tested, RTO achieved, RPO achieved, issues found, remediation
+      actions. Records MUST be retained for 12 months.
+examples:
+  - name: "DR drill record template"
+    code: |
+      date: 2026-05-05
+      participants: [alice, bob]
+      scenario: "Database total loss — restore from daily backup"
+      rto_target: "1 hour"
+      rto_achieved: "42 minutes"
+      rpo_target: "24 hours"
+      rpo_achieved: "23 hours 15 minutes"
+      issues_found:
+        - "backup script path was stale — fixed in XSPEC-170"
+      remediation:
+        - "Update backup path in backup-restore.sh"
+      status: PASS
+anti_patterns:
+  - description: >
+      Only verifying that a backup file exists — always restore it and
+      verify data integrity. An untested backup is not a backup.
+  - description: >
+      Running DR drills in production — always use an isolated environment
+      to avoid turning a drill into an actual disaster.
+related_standards:
+  - deployment-standards
+  - chaos-engineering-standards
+  - verification-evidence
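
REQ-02 requires the measured restore time to be compared against the RTO target, and the drill record template stores both as human-readable durations. The sketch below shows one way such a record could be checked mechanically. `toMinutes`, `DrillRecord`, and `evaluateDrill` are hypothetical helpers, and the rule that a drill passes only when both RTO and RPO are within target is an assumption; the standard does not define how `status` is derived.

```typescript
// Convert durations such as "42 minutes" or "23 hours 15 minutes" to minutes.
function toMinutes(text: string): number {
  const units: Record<string, number> = { minute: 1, hour: 60, day: 1440 };
  const pattern = /(\d+)\s*(minute|hour|day)s?/g;
  let total = 0;
  let found = false;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(text)) !== null) {
    total += Number(match[1]) * units[match[2]];
    found = true;
  }
  if (!found) throw new Error("unrecognized duration: " + text);
  return total;
}

interface DrillRecord {
  rtoTarget: string;
  rtoAchieved: string;
  rpoTarget: string;
  rpoAchieved: string;
}

interface DrillVerdict {
  rtoMet: boolean;
  rpoMet: boolean;
  outcome: "PASS" | "FAIL";
}

// Assumption: PASS requires both achieved values to be within their targets.
function evaluateDrill(record: DrillRecord): DrillVerdict {
  const rtoMet = toMinutes(record.rtoAchieved) <= toMinutes(record.rtoTarget);
  const rpoMet = toMinutes(record.rpoAchieved) <= toMinutes(record.rpoTarget);
  return { rtoMet, rpoMet, outcome: rtoMet && rpoMet ? "PASS" : "FAIL" };
}

// Values copied from the drill record template in this file.
const verdict = evaluateDrill({
  rtoTarget: "1 hour",
  rtoAchieved: "42 minutes",
  rpoTarget: "24 hours",
  rpoAchieved: "23 hours 15 minutes",
});
```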

package/bundled/ai/standards/flaky-test-management.ai.yaml ADDED

@@ -0,0 +1,89 @@
+# SPDX-License-Identifier: MIT
+name: Flaky Test Management Standards
+nameZh: 不穩定測試管理標準
+id: flaky-test-management
+version: "1.0.0"
+category: testing
+scope: test-reliability
+summary: >
+  Policies and tooling for detecting, quarantining, and eliminating flaky
+  tests. Flaky tests erode CI confidence, cause false failures, and mask
+  real bugs.
+requirements:
+  - id: REQ-01
+    title: Flaky Test Definition
+    titleZh: 不穩定測試定義
+    level: MUST
+    description: >
+      A test is considered flaky if it produces different results (pass/fail)
+      on consecutive runs with the same code. Teams MUST define a flakiness
+      threshold: a test that fails ≥ 2% of runs on main branch without code
+      changes is flaky.
+  - id: REQ-02
+    title: Quarantine Protocol
+    titleZh: 隔離協議
+    level: MUST
+    description: >
+      Flaky tests MUST be quarantined within 48 hours of detection by:
+      (1) adding a `.skip` or `.todo` annotation, (2) opening a tracking
+      issue, (3) adding a comment with the issue link and known failure mode.
+      Quarantined tests MUST NOT block CI merges.
+  - id: REQ-03
+    title: Retry Policy
+    titleZh: 重試策略
+    level: SHOULD
+    description: >
+      CI SHOULD allow a maximum of 2 retries for tests in the quarantine list.
+      Retries SHOULD be applied only to known-flaky tests, not the entire suite.
+      A test that passes after retry is still considered flaky and MUST be fixed.
+  - id: REQ-04
+    title: Flaky Test Elimination SLA
+    titleZh: 修復 SLA
+    level: MUST
+    description: >
+      Quarantined tests MUST be either fixed or permanently removed within
+      30 days of quarantine. Tests left quarantined for > 30 days with no
+      activity SHOULD be automatically deleted.
+  - id: REQ-05
+    title: Root Cause Categories
+    titleZh: 根因分類
+    level: SHOULD
+    description: >
+      When eliminating a flaky test, the root cause SHOULD be documented in
+      the fixing PR. Common root causes: timing/race conditions, test isolation
+      failures (shared state), external service dependencies, random seed
+      dependence, file system ordering.
+examples:
+  - name: "Quarantine annotation (Vitest)"
+    code: |
+      // TODO: flaky test quarantined 2026-05-05 — see issue #42
+      // Root cause: race condition in WebSocket reconnection
+      it.skip("reconnects after disconnect", async () => { ... })
+  - name: "Vitest retry config for known flaky tests"
+    code: |
+      // vitest.config.ts
+      export default defineConfig({
+        test: {
+          retry: 2,  // applies to every test; REQ-03 prefers per-test retry, e.g. it(name, { retry: 2 }, fn)
+        }
+      })
+anti_patterns:
+  - description: >
+      Allowing flaky tests to block CI without quarantine — developers learn
+      to ignore CI failures, which hides real bugs.
+  - description: >
+      Using arbitrary sleeps (setTimeout/sleep) to fix race conditions —
+      this makes tests slower and more fragile. Use proper async coordination.
+related_standards:
+  - testing
+  - test-governance
+  - ci-cd-standards
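
The 2% flakiness threshold (REQ-01) and the 30-day elimination SLA (REQ-04) are both simple numeric rules, sketched below. `TestHistory`, `isFlaky`, and `quarantineStatus` are hypothetical names, and treating a test that fails on every run as broken rather than flaky is an assumption drawn from the "different results on consecutive runs" wording.

```typescript
interface TestHistory {
  testName: string;
  runs: number;     // runs on main with no code changes
  failures: number; // failed runs among those
}

// Flaky: fails at least `threshold` of runs, but not every run
// (assumption: a test that always fails is broken, not flaky).
function isFlaky(h: TestHistory, threshold = 0.02): boolean {
  if (h.runs === 0) return false;
  const rate = h.failures / h.runs;
  return rate >= threshold && h.failures < h.runs;
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Whole days in quarantine, checked against the 30-day SLA from REQ-04.
function quarantineStatus(quarantinedOn: Date, now: Date, slaDays = 30): "within-sla" | "overdue" {
  const days = Math.floor((now.getTime() - quarantinedOn.getTime()) / MS_PER_DAY);
  return days > slaDays ? "overdue" : "within-sla";
}

const reconnectTest: TestHistory = { testName: "reconnects after disconnect", runs: 100, failures: 2 };
const flaggedFlaky = isFlaky(reconnectTest);
const slaState = quarantineStatus(new Date("2026-05-05T00:00:00Z"), new Date("2026-06-10T00:00:00Z"));
```

A scheduled CI job could run checks like these over recorded test results to open tracking issues and to list quarantined tests that have passed their deadline.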

package/bundled/ai/standards/flow-based-testing.ai.yaml ADDED

@@ -0,0 +1,240 @@
+# Flow-Based Testing Standards - AI Optimized
+# Source: core/flow-based-testing.md
+id: flow-based-testing
+meta:
+  version: "1.0.0"
+  updated: "2026-05-04"
+  source: core/flow-based-testing.md
+  description: Flow decomposition methodology for testing multi-step processes with branch coverage
+core_concepts:
+  problem: >
+    AC-centric tests verify individual behaviors in isolation but miss integration gaps
+    between steps and leave decision-point branches untested.
+    A flow where AC-1 passes and AC-2 passes independently does NOT guarantee
+    the AC-1 → AC-2 → AC-3 sequential flow works correctly.
+  solution: >
+    Decompose each workflow into flows, identify all decision points,
+    expand into a scenario matrix, then write journey tests with shared state threading.
+# ─────────────────────────────────────────────────────────
+# Step 1: Flow Identification
+# ─────────────────────────────────────────────────────────
+flow_identification:
+  description: Extract testable flows from SPEC/AC before writing any test code
+  activities:
+    - name: Define Preconditions
+      description: Document the system's initial state before the flow begins
+      examples:
+        - User is authenticated (token present)
+        - Database has seed data
+        - External services are available
+    - name: List Ordered Action Sequence
+      description: List all steps in the exact execution order
+      examples:
+        - Step 1: Validate input
+        - Step 2: Check quota
+        - Step 3: Call external service
+        - Step 4: Persist result
+        - Step 5: Return response
+    - name: Extract Decision Points
+      description: Identify every if/else/conditional in the flow
+      examples:
+        - "Is the user authorized? [yes → continue | no → 401]"
+        - "Is quota sufficient? [yes → continue | no → 429]"
+        - "Did external service respond? [200 → success | timeout → retry | error → fail]"
+    - name: Define Terminal States
+      description: List all possible end states (success + each distinct failure)
+      examples:
+        - "SUCCESS: Resource created, 201 returned"
+        - "FAIL_AUTH: 401 with error code AUTH_INVALID"
+        - "FAIL_QUOTA: 429 with error code QUOTA_EXCEEDED"
+        - "FAIL_EXTERNAL: 504 with error code EXTERNAL_TIMEOUT"
+  output: Flow definition (decision point list + terminal state map)
+# ─────────────────────────────────────────────────────────
+# Step 2: Decision Table Expansion
+# ─────────────────────────────────────────────────────────
+decision_table_expansion:
+  description: Expand decision points into a scenario matrix using a coverage strategy
+  coverage_strategies:
+    - name: Each-Choice
+      description: Each decision value appears in at least one scenario
+      formula: "Varying one decision at a time: scenarios = 1 + sum of (values - 1) per decision point; strict minimum = largest value count"
+      use_when: Low-risk flows, fast feedback cycles, initial coverage
+      example: "1 + (3-1) + (2-1) + (3-1) = 6 scenarios (strict minimum: 3)"
+    - name: Pairwise
+      description: All pairs of decision values are covered (the 2-way case of t-way combinatorial testing, as described by NIST)
+      formula: At least the product of the two largest value counts; grows slowly as decision points (N) are added
+      use_when: Medium-risk flows, balance between coverage and test count
+      tools: [allpairs, pairwiseJS]
+    - name: All-Combinations
+      description: Full Cartesian product of all decision values
+      formula: "Scenarios = product of value counts per decision point"
+      use_when: Critical flows (auth, payment, license validation, security controls)
+      warning: Can grow exponentially; only use for flows with < 4 decision points or < 3 values each
+  decision_table_template: |
+    Flow: [Flow Name]
+    Decision Points:
+    | Point          | Values                          |
+    |----------------|---------------------------------|
+    | Authorization  | valid / expired / missing       |
+    | Quota          | sufficient / exceeded           |
+    | ExternalSvc    | available / timeout / error     |
+    Each-Choice Scenarios (minimum coverage):
+    | Scenario | Auth     | Quota      | ExternalSvc | Expected          |
+    |----------|----------|------------|-------------|-------------------|
+    | S1       | valid    | sufficient | available   | success           |
+    | S2       | expired  | sufficient | available   | 401_expired       |
+    | S3       | missing  | sufficient | available   | 401_missing       |
+    | S4       | valid    | exceeded   | available   | 429_quota         |
+    | S5       | valid    | sufficient | timeout     | 504_retry         |
+    | S6       | valid    | sufficient | error       | 502_external_fail |
+# ─────────────────────────────────────────────────────────
+# Step 3: Journey Test Structure
+# ─────────────────────────────────────────────────────────
+journey_test_structure:
+  description: Write flow tests with shared state threading — state accumulates across steps
+  key_principle: >
+    Each test step inherits state from previous steps through a shared context object.
+    Never reset the context between steps in a journey test.
+  happy_path_pattern: |
+    describe("Flow: [Flow Name]", () => {
+      // Shared context — state accumulates, NOT reset between steps
+      const ctx: {
+        token?: string;
+        resourceId?: string;
+        result?: ResponseType;
+      } = {}
+      it("Step 1: [Precondition setup]", async () => {
+        ctx.token = await setupAuth()
+        expect(ctx.token).toBeTruthy()
+      })
+      it("Step 2: [Core action using Step 1 state]", async () => {
+        ctx.resourceId = await createResource(ctx.token!, inputData)
+        expect(ctx.resourceId).toMatch(/^[a-z0-9-]+$/)
+      })
+      it("Step 3: [Verification using accumulated state]", async () => {
+        ctx.result = await getResource(ctx.token!, ctx.resourceId!)
+        expect(ctx.result.status).toBe("active")
+        expect(ctx.result.id).toBe(ctx.resourceId)
+      })
+    })
+  branch_pattern: |
+    // Each branch outcome gets its own describe block
+    describe("Flow Branch: [Decision Point] → [Outcome]", () => {
+      it("should [expected behavior] when [condition]", async () => {
+        // Setup: put the system into the branch condition
+        const expiredToken = buildExpiredJwt()
+        // Act: trigger the flow with branch condition
+        const response = await callApi(expiredToken)
+        // Assert: verify BOTH the response AND any side effects
+        expect(response.status).toBe(401)
+        expect(response.body.code).toBe("AUTH_TOKEN_EXPIRED")
+        // Verify NO side effects occurred (resource not created)
+        const count = await getResourceCount()
+        expect(count).toBe(0)
+      })
+    })
+  anti_patterns_in_structure:
+    - "Using beforeEach to reset ctx — breaks state threading between steps"
+    - "Putting all steps in a single it() block — hides which step failed"
+    - "Asserting only on the final state — intermediate step bugs go undetected"
+    - "Using arbitrary delays between steps — use proper async/await"
+# ─────────────────────────────────────────────────────────
+# Feature Type Mapping
+# ─────────────────────────────────────────────────────────
+feature_type_mapping:
+  - type: Workflow / Multi-step Process
+    requires: flow_decomposition
+    minimum_coverage: Each-Choice
+    test_structure: journey-chained-test
+    dimensions: [1, 3, 4, 5, 9, 10]
+    with_ai: [1, 3, 4, 5, 9, 10, 8]
+    note: Apply flow-based-testing standard; use shared ctx; Each-Choice minimum branch coverage
+# ─────────────────────────────────────────────────────────
+# Rules
+# ─────────────────────────────────────────────────────────
+rules:
+  - id: flow-identification-required
+    trigger: feature has 3 or more sequential steps
+    instruction: Apply 3-step flow decomposition (identify → decision table → journey structure) before writing tests
+    priority: required
+  - id: decision-table-for-branches
+    trigger: flow has any if/else or conditional logic
+    instruction: Create decision table with all decision values; apply Each-Choice minimum coverage
+    priority: required
+  - id: shared-state-in-journey
+    trigger: writing multi-step flow test
+    instruction: Use shared context object (ctx); never reset it between steps within the same flow
+    priority: required
+  - id: separate-branch-describes
+    trigger: decision point has 2 or more outcome values
+    instruction: Each distinct outcome gets its own describe block with a clear name
+    priority: required
+  - id: all-combinations-for-critical
+    trigger: flow involves authentication, payment, license validation, or security controls
+    instruction: Apply All-Combinations coverage strategy; test every value combination
+    priority: required
+  - id: verify-side-effects-in-branches
+    trigger: writing branch test for failure path
+    instruction: Assert BOTH the error response AND that no unintended side effects occurred
+    priority: required
+# ─────────────────────────────────────────────────────────
+# Anti-Patterns
+# ─────────────────────────────────────────────────────────
+anti_patterns:
+  - Testing only the happy path flow (missing failure terminal states)
+  - Resetting shared state between steps in a journey test (breaks state threading)
+  - Testing each step in isolation without verifying accumulated state
+  - Using a single test case for a flow with multiple decision points (hides which branch failed)
+  - Applying Pairwise or All-Combinations to every flow (creates unmaintainable test counts; reserve for critical flows)
+  - Not verifying side effects (or absence of side effects) in branch tests
+  - Starting flow test from midpoint without establishing preconditions
+# ─────────────────────────────────────────────────────────
+# Quick Reference
+# ─────────────────────────────────────────────────────────
+quick_reference:
+  flow_test_checklist: |
+    Flow: ___________________
+    □ Step 1 — Flow Identification
+      □ Preconditions documented
+      □ Ordered step sequence listed (Step 1 → Step N)
+      □ All decision points extracted (every if/else/condition)
+      □ All terminal states defined (success + each distinct failure)
+    □ Step 2 — Decision Table
+      □ Decision table created with all values per decision point
+      □ Coverage strategy chosen (Each-Choice / Pairwise / All-Combinations)
+      □ Critical flows (auth/payment/security) use All-Combinations
+      □ Scenario count = 1 + sum of (values - 1) per decision point (Each-Choice, one value varied at a time)
+    □ Step 3 — Journey Test Structure
+      □ Happy path journey test exists (shared ctx, all steps in sequence)
+      □ Each branch outcome has its own describe block
+      □ Step assertions verify accumulated ctx state, not just final result
+      □ Branch tests verify both error response AND absence of side effects
+      □ No beforeEach resetting ctx between steps
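
The decision table template in this file can be generated rather than written by hand. The sketch below reproduces the six scenarios S1 to S6 by starting from a baseline (the first value of each decision point) and varying one decision point at a time, and contrasts that with the full Cartesian product used for All-Combinations. `DecisionPoints`, `oneAtATime`, and `allCombinations` are hypothetical helper names; expected outcomes per scenario are deliberately left out because they come from the spec, not from the generator.

```typescript
type DecisionPoints = Record<string, string[]>;
type Scenario = Record<string, string>;

// Decision points copied from the decision table template.
const points: DecisionPoints = {
  Authorization: ["valid", "expired", "missing"],
  Quota: ["sufficient", "exceeded"],
  ExternalSvc: ["available", "timeout", "error"],
};

// Baseline scenario plus one scenario per non-baseline value, so every
// value of every decision point appears at least once.
function oneAtATime(dp: DecisionPoints): Scenario[] {
  const names = Object.keys(dp);
  const baseline: Scenario = {};
  for (const n of names) baseline[n] = dp[n][0];

  const scenarios: Scenario[] = [baseline];
  for (const n of names) {
    for (const value of dp[n].slice(1)) {
      scenarios.push({ ...baseline, [n]: value });
    }
  }
  return scenarios;
}

// Full Cartesian product: one scenario per combination of values.
function allCombinations(dp: DecisionPoints): Scenario[] {
  let scenarios: Scenario[] = [{}];
  for (const n of Object.keys(dp)) {
    const next: Scenario[] = [];
    for (const partial of scenarios) {
      for (const value of dp[n]) {
        next.push({ ...partial, [n]: value });
      }
    }
    scenarios = next;
  }
  return scenarios;
}

const eachChoiceScenarios = oneAtATime(points);    // 6 scenarios, matching S1 to S6
const exhaustiveScenarios = allCombinations(points); // 3 * 2 * 3 = 18 scenarios
```

For this flow the gap between 6 and 18 scenarios is small, but the product grows multiplicatively with each added decision point, which is why the standard reserves All-Combinations for critical flows.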