gaia-framework 1.105.0 → 1.127.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (47) hide show
  1. package/.claude/commands/gaia-bridge-disable.md +18 -0
  2. package/.claude/commands/gaia-bridge-enable.md +18 -0
  3. package/.claude/commands/gaia-fill-test-gaps.md +17 -0
  4. package/CLAUDE.md +64 -1
  5. package/_gaia/_config/gaia-help.csv +2 -0
  6. package/_gaia/_config/global.yaml +14 -1
  7. package/_gaia/_config/lifecycle-sequence.yaml +23 -3
  8. package/_gaia/_config/skill-manifest.csv +1 -0
  9. package/_gaia/_config/workflow-manifest.csv +2 -0
  10. package/_gaia/core/agents/orchestrator.md +1 -1
  11. package/_gaia/core/protocols/review-gate-check.xml +45 -5
  12. package/_gaia/core/validators/test-environment-validator.js +191 -0
  13. package/_gaia/core/workflows/bridge-toggle/checklist.md +11 -0
  14. package/_gaia/core/workflows/bridge-toggle/instructions.xml +69 -0
  15. package/_gaia/core/workflows/bridge-toggle/workflow.yaml +27 -0
  16. package/_gaia/dev/skills/_skill-index.yaml +13 -0
  17. package/_gaia/dev/skills/code-review-standards.md +50 -0
  18. package/_gaia/dev/skills/edge-cases.md +201 -0
  19. package/_gaia/lifecycle/knowledge/brownfield/ci-test-detection.md +194 -0
  20. package/_gaia/lifecycle/knowledge/brownfield/test-execution-scan.md +13 -0
  21. package/_gaia/lifecycle/skills/document-rulesets.md +93 -3
  22. package/_gaia/lifecycle/templates/story-template.md +7 -7
  23. package/_gaia/lifecycle/templates/test-gap-analysis-template.md +221 -0
  24. package/_gaia/lifecycle/workflows/4-implementation/check-review-gate/checklist.md +1 -1
  25. package/_gaia/lifecycle/workflows/4-implementation/check-review-gate/instructions.xml +11 -11
  26. package/_gaia/lifecycle/workflows/4-implementation/code-review/instructions.xml +1 -1
  27. package/_gaia/lifecycle/workflows/4-implementation/create-story/instructions.xml +73 -2
  28. package/_gaia/lifecycle/workflows/4-implementation/dev-story/instructions.xml +25 -2
  29. package/_gaia/lifecycle/workflows/4-implementation/retrospective/instructions.xml +1 -1
  30. package/_gaia/lifecycle/workflows/4-implementation/run-all-reviews/instructions.xml +132 -9
  31. package/_gaia/lifecycle/workflows/4-implementation/security-review/checklist.md +2 -0
  32. package/_gaia/lifecycle/workflows/4-implementation/sprint-planning/instructions.xml +13 -0
  33. package/_gaia/lifecycle/workflows/4-implementation/sprint-planning/workflow.yaml +8 -0
  34. package/_gaia/lifecycle/workflows/4-implementation/validate-story/checklist.md +1 -0
  35. package/_gaia/lifecycle/workflows/4-implementation/validate-story/instructions.xml +11 -0
  36. package/_gaia/lifecycle/workflows/5-deployment/deployment-checklist/instructions.xml +11 -0
  37. package/_gaia/lifecycle/workflows/anytime/brownfield-onboarding/instructions.xml +48 -1
  38. package/_gaia/lifecycle/workflows/anytime/brownfield-onboarding/workflow.yaml +10 -0
  39. package/_gaia/testing/agents/test-architect.md +2 -0
  40. package/_gaia/testing/workflows/ci-setup/instructions.xml +6 -6
  41. package/_gaia/testing/workflows/fill-test-gaps/checklist.md +16 -0
  42. package/_gaia/testing/workflows/fill-test-gaps/instructions.xml +128 -0
  43. package/_gaia/testing/workflows/fill-test-gaps/workflow.yaml +30 -0
  44. package/_gaia/testing/workflows/test-gap-analysis/instructions.xml +47 -14
  45. package/_gaia/testing/workflows/test-gap-analysis/workflow.yaml +1 -0
  46. package/gaia-install.sh +46 -0
  47. package/package.json +3 -3
@@ -0,0 +1,27 @@
1
+ # Bridge Toggle Workflow
2
+ # Enable or disable the Test Execution Bridge via /gaia-bridge-enable or /gaia-bridge-disable.
3
+ # Wraps the manual bridge activation into a single command with idempotency,
4
+ # comment-preserving YAML writes, and a post-toggle summary.
5
+ #
6
+ # Traces: FR-316, ADR-028 §10.20.12
7
+
8
+ name: bridge-toggle
9
+ display_name: "Bridge Toggle"
10
+ description: "Enable or disable the Test Execution Bridge in global.yaml"
11
+ module: core
12
+ agent: orchestrator
13
+
14
+ parameters:
15
+ mode:
16
+ type: string
17
+ required: true
18
+ allowed: [enable, disable]
19
+ description: "Target state — 'enable' sets bridge_enabled to true, 'disable' sets it to false"
20
+
21
+ instructions: "{installed_path}/core/workflows/bridge-toggle/instructions.xml"
22
+ validation: "{installed_path}/core/workflows/bridge-toggle/checklist.md"
23
+
24
+ config_source: "{installed_path}/core/config.yaml"
25
+
26
+ output:
27
+ primary: "{project-root}/_gaia/_config/global.yaml"
@@ -39,6 +39,7 @@ skills:
39
39
  - { id: review-checklist, line_range: [14, 59], description: "Universal code review checklist" }
40
40
  - { id: solid-principles, line_range: [60, 132], description: "SOLID violation detection" }
41
41
  - { id: complexity-metrics, line_range: [133, 226], description: "Cyclomatic and cognitive complexity" }
42
+ - { id: review-gate-completion, line_range: [228, 276], description: "Review gate completion artifacts and review-summary.md hard gate enforcement" }
42
43
 
43
44
  - file: documentation-standards.md
44
45
  sections:
@@ -54,6 +55,18 @@ skills:
54
55
  - { id: secrets-management, line_range: [119, 166], description: "Environment-based secrets" }
55
56
  - { id: cors-csrf, line_range: [167, 230], description: "CORS and CSRF configuration and protection" }
56
57
 
58
+ - file: edge-cases.md
59
+ sections:
60
+ - { id: overview, line_range: [26, 36], description: "Skill purpose, JIT loading mandate, 8K NFR-042 budget" }
61
+ - { id: when-to-invoke, line_range: [37, 50], description: "When create-story and other workflows load this skill" }
62
+ - { id: input-contract, line_range: [51, 68], description: "Required input fields and context token limits" }
63
+ - { id: output-schema, line_range: [69, 105], description: "Structured output: id, scenario, input, expected, category with category enum" }
64
+ - { id: analysis-heuristics, line_range: [106, 121], description: "Enumeration prompts for boundary, input, state, failure, security, timing, resources, idempotency" }
65
+ - { id: token-budget, line_range: [122, 135], description: "NFR-042 8K budget enforcement and truncation rules" }
66
+ - { id: failure-handling, line_range: [136, 152], description: "Non-blocking failure and timeout handling, warn-and-continue contract" }
67
+ - { id: usage-example, line_range: [153, 194], description: "Sample input/output pair" }
68
+ - { id: notes, line_range: [195, 201], description: "Notes on persistence, stack-agnostic usage, and related task file" }
69
+
57
70
  - file: figma-integration.md
58
71
  sections:
59
72
  - { id: detection, line_range: [48, 135], description: "Figma MCP detection probe, failure handling, adapter selection, security guardrails, API scopes, error sanitization" }
@@ -224,3 +224,53 @@ function validate(input) {
224
224
  | Parameter count | > 4 | Suggest object parameter |
225
225
  | Nesting depth | > 3 levels | Require flattening |
226
226
  | Class methods | > 10 public | Suggest decomposition |
227
+
228
+ <!-- SECTION: review-gate-completion -->
229
+ ## Review Gate Completion Requirements
230
+
231
+ Before a story transitions from `review` to `done`, all 6 individual review reports AND the consolidated review-summary.md must exist in the filesystem. This is a **hard gate** — it is enforced structurally by the `review-gate-check` protocol and is not advisory.
232
+
233
+ ### Required Review Artifacts
234
+
235
+ | Artifact | Path | Required When |
236
+ |---|---|---|
237
+ | Code review | `docs/implementation-artifacts/{story_key}-review.md` | Always |
238
+ | Security review | `docs/implementation-artifacts/{story_key}-security-review.md` | Always |
239
+ | QA tests | `docs/test-artifacts/{story_key}-qa-tests.md` | Always |
240
+ | Test automation | `docs/test-artifacts/{story_key}-test-automation.md` | Always |
241
+ | Test review | `docs/test-artifacts/{story_key}-test-review.md` | Always |
242
+ | Performance review | `docs/implementation-artifacts/{story_key}-performance-review.md` | Always |
243
+ | **Review summary** | `docs/implementation-artifacts/{story_key}-review-summary.md` | **Always — enforced hard gate** |
244
+
245
+ ### Enforcement Mechanism (Live)
246
+
247
+ The hard gate is enforced by `_gaia/core/protocols/review-gate-check.xml` — step 2 "Evaluate Gate and Transition". Before invoking `status-sync` to move a story from `review` to `done`, the protocol:
248
+
249
+ 1. Builds the summary file path `{implementation_artifacts}/{story_key}-review-summary.md`
250
+ 2. Checks whether the file exists
251
+ 3. If missing AND any of the 6 individual review reports exist → HALT with: `Review summary missing for {story_key}. Run /gaia-run-all-reviews {story_key} to generate the summary, or create it manually via /gaia-create-review-summary {story_key}.`
252
+ 4. If missing AND all 6 individual review reports are also missing → skip the check (story never entered review)
253
+ 5. If present → gate passes, transition proceeds
254
+
255
+ **This is a live hard gate, not a guidance note.** Stories with missing summaries physically cannot transition to `done` — the protocol will halt.
256
+
257
+ ### Auto-Generation via run-all-reviews
258
+
259
+ `/gaia-run-all-reviews` auto-generates the review-summary.md as the final step of its 6-review pipeline (see `_gaia/lifecycle/workflows/4-implementation/run-all-reviews/instructions.xml` step 8). The summary aggregates the 6 review verdicts (read from the Review Gate table in the story file and from each review's report) — it does not re-run the reviews.
260
+
261
+ ### Manual Generation
262
+
263
+ If auto-generation fails or is skipped, create the summary manually by copying the schema in `run-all-reviews/instructions.xml` step 8 and filling in the verdicts from the individual reports.
264
+
265
+ ### Review Summary Schema
266
+
267
+ ```yaml
268
+ ---
269
+ story_key: {story_key}
270
+ date: {YYYY-MM-DD}
271
+ overall_status: PASSED | FAILED | INCOMPLETE
272
+ reviewers: [code-review, qa-tests, security-review, test-automate, test-review, review-perf]
273
+ ---
274
+ ```
275
+
276
+ Followed by 6 sections (one per review) with verdict + report link + one-line synopsis, then a final aggregate Gate Status table.
@@ -0,0 +1,201 @@
1
+ ---
2
+ name: "edge-cases"
3
+ version: '1.0'
4
+ applicable_agents: [typescript-dev, angular-dev, flutter-dev, java-dev, python-dev, mobile-dev, go-dev]
5
+ test_scenarios:
6
+ - scenario: M+ story receives structured edge case list
7
+ expected: edge_case_results populated with id, scenario, input, expected, category fields
8
+ - scenario: Skill fails or times out
9
+ expected: Caller logs warning, sets edge_case_results=[], continues without blocking
10
+ - scenario: S-sized story
11
+ expected: Skill is not invoked — size gate in create-story excludes S
12
+ - scenario: Input context exceeds 5K tokens
13
+ expected: architecture_excerpt is truncated first, warning logged, proceeds
14
+ - scenario: Output exceeds 3K tokens
15
+ expected: Lower-priority entries dropped, warning logged, partial results returned
16
+ ---
17
+
18
+ # Edge Cases Skill
19
+
20
+ > Structured edge case analysis for M+ stories. Invoked as a mandatory sub-step by `/gaia-create-story` after acceptance criteria are drafted, and available as a standalone skill for any agent or workflow that needs to enumerate edge cases, error scenarios, and boundary conditions.
21
+
22
+ **Traces to:** FR-227, NFR-042, ADR-030 §10.22
23
+
24
+ ---
25
+
26
+ <!-- SECTION: overview -->
27
+ ## Overview
28
+
29
+ The edge-cases skill enumerates scenarios that are *not* the happy path — boundary conditions, error paths, timing issues, input extremes, failure modes, and concurrency hazards — and returns them as a structured list so downstream artifacts (stories, tests, reviews) can trace coverage.
30
+
31
+ This skill is JIT-loaded. It MUST NOT be pre-loaded by any workflow. Token budget for a single invocation is capped at 8K tokens (NFR-042) including the input context and the generated output.
32
+
33
+ When invoked from `/gaia-create-story`, the skill is scoped to a single story's acceptance criteria and runs in-context — no separate workflow invocation, no sub-agent spawn.
34
+
35
+ ---
36
+
37
+ <!-- SECTION: when-to-invoke -->
38
+ ## When to Invoke
39
+
40
+ - `/gaia-create-story` — mandatory for stories with size M, L, or XL (the size gate at Step 4 of create-story instructions.xml enforces this). S-sized stories skip this skill to preserve token budget.
41
+ - `/gaia-edge-cases` — standalone command for ad-hoc edge case brainstorming on existing artifacts.
42
+ - Other workflows MAY load this skill when a step references `edge-cases.md` by name.
43
+
44
+ **Do NOT invoke** this skill for:
45
+ - Stories that are already marked `done` (edge cases should be captured during planning)
46
+ - Purely cosmetic / copy changes (no behavior to enumerate)
47
+ - S-sized stories (gate excludes them by design)
48
+
49
+ ---
50
+
51
+ <!-- SECTION: input-contract -->
52
+ ## Input Contract
53
+
54
+ The caller passes the following context to the skill:
55
+
56
+ | Field | Type | Required | Description |
57
+ |---|---|---|---|
58
+ | `story_key` | string | yes | e.g., `E19-S9` — used as prefix for edge case IDs |
59
+ | `story_title` | string | yes | Human-readable title |
60
+ | `story_description` | string | yes | The user story paragraph (As a / I want / so that) |
61
+ | `acceptance_criteria` | string[] | yes | List of AC strings in Given/When/Then format |
62
+ | `size` | enum | yes | One of `S`, `M`, `L`, `XL` (skill halts if `S`) |
63
+ | `architecture_excerpt` | string | no | Optional relevant ADR or architecture section |
64
+
65
+ Total input context MUST stay under ~5K tokens to leave room for the output inside the 8K budget.
66
+
67
+ ---
68
+
69
+ <!-- SECTION: output-schema -->
70
+ ## Output Schema
71
+
72
+ The skill returns a structured list. Each edge case is an object with exactly these fields:
73
+
74
+ ```yaml
75
+ edge_case_results:
76
+ - id: "EC-1" # string — sequential EC-{N} numbering, unique within a single invocation
77
+ scenario: "..." # string — one-line description of the edge case
78
+ input: "..." # string — specific input / precondition that triggers it
79
+ expected: "..." # string — expected system behavior or output
80
+ category: "..." # enum — see category list below
81
+ ```
82
+
83
+ **Required fields (all five MUST be present on every result):**
84
+ - `id` — `EC-{N}` format (EC-1, EC-2, ...)
85
+ - `scenario` — what the edge case is
86
+ - `input` — the triggering input, state, or precondition
87
+ - `expected` — the expected behavior
88
+ - `category` — one of the categories below
89
+
90
+ **Category enum:**
91
+ - `boundary` — min/max values, empty sets, off-by-one, buffer limits
92
+ - `error` — validation failures, exception paths, invalid inputs
93
+ - `timing` — race conditions, timeouts, retries, rate limits
94
+ - `concurrency` — parallel access, locking, idempotency
95
+ - `integration` — upstream/downstream dependency failures, contract mismatches
96
+ - `security` — authz bypass, injection, privilege escalation
97
+ - `data` — malformed data, encoding, unicode, large payloads
98
+ - `environment` — offline, degraded, platform-specific quirks
99
+
100
+ If no edge cases are identified, return an empty list and log a note — do NOT fabricate edge cases to pad the output.
101
+
102
+ Output format when returned to the caller: YAML-serializable list. The caller (e.g., `/gaia-create-story`) stores this list in the `edge_case_results` variable before writing the story file to disk.
103
+
104
+ ---
105
+
106
+ <!-- SECTION: analysis-heuristics -->
107
+ ## Analysis Heuristics
108
+
109
+ Use these prompts to drive enumeration — one or two results per heuristic are typical; not all heuristics will apply:
110
+
111
+ 1. **Boundary sweep** — for every numeric input, what happens at 0, 1, max, max+1, negative, and fractional? For every collection, what happens empty and full?
112
+ 2. **Input extremes** — what about empty strings, very long strings, unicode, null, missing fields, extra fields, wrong types?
113
+ 3. **State transitions** — can the operation be invoked in an unexpected state? What if it is called twice? What if it is called while another operation is in flight?
114
+ 4. **Failure paths** — what upstream dependencies can fail? What happens on timeout, 5xx, partial response, network partition?
115
+ 5. **Security angles** — can a lower-privileged user trigger it? Can input be injected into a downstream query/command?
116
+ 6. **Time and clock** — what happens across DST transitions, leap seconds, at midnight, with negative clock skew?
117
+ 7. **Resource limits** — what happens under memory pressure, slow disk, saturated queues?
118
+ 8. **Idempotency** — is the operation safe to retry? What if the same request arrives twice with the same id?
119
+
120
+ ---
121
+
122
+ <!-- SECTION: token-budget -->
123
+ ## Token Budget (NFR-042)
124
+
125
+ The skill invocation MUST stay under 8K tokens total. Guidance:
126
+
127
+ - Input context: ≤ 5K tokens (story, ACs, optional architecture excerpt)
128
+ - Output: ≤ 3K tokens (typically 5–15 edge cases)
129
+ - If the caller supplies an input that would exceed 5K, the skill truncates the `architecture_excerpt` first, then the `story_description`, and finally caps the number of ACs considered (with a warning)
130
+ - If the generated output would exceed 3K tokens, the skill truncates the list to the highest-priority edge cases (boundary + error + security first) and logs a warning to Dev Notes: "Edge case output truncated at 3K tokens — N results dropped"
131
+
132
+ The caller is responsible for checking the total token count before persisting the output. When running inside `/gaia-create-story`, token usage is logged to Dev Notes whenever it exceeds 80% of the 8K budget.
133
+
134
+ ---
135
+
136
+ <!-- SECTION: failure-handling -->
137
+ ## Failure and Timeout Handling
138
+
139
+ This skill is **non-blocking**. Callers MUST treat skill failure as a warning, never as a hard error.
140
+
141
+ | Failure mode | Caller behavior |
142
+ |---|---|
143
+ | Skill file not found | Log warning "Edge case skill not loaded — continuing without edge cases", set `edge_case_results = []`, proceed |
144
+ | Skill invocation timeout (> 30s wall clock) | Log warning "Edge case analysis timed out — continuing without edge cases", set `edge_case_results = []`, proceed |
145
+ | Malformed output (missing required fields) | Log warning "Edge case output schema invalid — continuing without edge cases", set `edge_case_results = []`, proceed |
146
+ | Token budget exceeded | Truncate output with warning (see Token Budget section), still return partial results |
147
+ | Empty result (no edge cases found) | Log note "No edge cases identified for {story_key}", set `edge_case_results = []`, proceed normally — this is a valid outcome, NOT a failure |
148
+
149
+ Under no circumstances should an edge-case failure block story creation. The story is written to disk with whatever `edge_case_results` is available, plus a Dev Notes entry describing any degradation.
150
+
151
+ ---
152
+
153
+ <!-- SECTION: usage-example -->
154
+ ## Usage Example
155
+
156
+ Input (from `/gaia-create-story` context):
157
+
158
+ ```yaml
159
+ story_key: "E19-S9"
160
+ story_title: "Edge Case Mandatory Sub-Step"
161
+ size: "M"
162
+ acceptance_criteria:
163
+ - "Given a story of size M, when create-story runs, then edge-cases.md is invoked"
164
+ - "Given the skill times out, when it fails, then story creation continues with a warning"
165
+ ```
166
+
167
+ Output:
168
+
169
+ ```yaml
170
+ edge_case_results:
171
+ - id: "EC-1"
172
+ scenario: "Story size missing from frontmatter"
173
+ input: "size field is null or absent"
174
+ expected: "Default to skip (treat as S), log warning"
175
+ category: "error"
176
+ - id: "EC-2"
177
+ scenario: "Skill file deleted between registry load and invocation"
178
+ input: "edge-cases.md missing from dev/skills/"
179
+ expected: "Caller logs warning, sets edge_case_results=[], continues"
180
+ category: "error"
181
+ - id: "EC-3"
182
+ scenario: "Input context exceeds 5K tokens"
183
+ input: "Very large architecture_excerpt"
184
+ expected: "Truncate architecture_excerpt first, warn, proceed"
185
+ category: "boundary"
186
+ - id: "EC-4"
187
+ scenario: "Two simultaneous invocations for the same story"
188
+ input: "Parallel runs of /gaia-create-story E19-S9"
189
+ expected: "Each invocation is independent; no shared state"
190
+ category: "concurrency"
191
+ ```
192
+
193
+ ---
194
+
195
+ <!-- SECTION: notes -->
196
+ ## Notes
197
+
198
+ - The `edge_case_results` output is captured in the caller's runtime state as a named variable before the story file is written. The create-story workflow stores these results in the story's Dev Notes or Test Scenarios section.
199
+ - This skill does NOT modify files on disk. Callers persist the output.
200
+ - The skill is stack-agnostic — it works for typescript, angular, flutter, java, python, mobile, and go stories.
201
+ - See also: `_gaia/core/tasks/review-edge-case-hunter.xml` for the standalone `/gaia-edge-cases` workflow task that wraps this skill.
@@ -0,0 +1,194 @@
1
+ # CI Test Execution Detection — Brownfield Knowledge Fragment
2
+
3
+ > **Version:** 1.0.0
4
+ > **Story:** E19-S21 (documents implementation from E19-S13)
5
+ > **Traces to:** FR-232, NFR-041
6
+ > **Category:** runtime-behavior
7
+ > **Source of truth:** `Gaia-framework/src/brownfield/ci-test-detector.js`
8
+
9
+ ## Purpose
10
+
11
+ Detect whether a brownfield project actually executes tests in CI by scanning
12
+ its CI configuration files for real test execution steps. This fragment
13
+ documents the detection patterns, the output schema, and the zero-false-positive
14
+ rules that the programmatic module enforces.
15
+
16
+ This knowledge fragment is the companion to `test-execution-scan.md`:
17
+
18
+ - `test-execution-scan.md` covers **local test runner** detection and execution
19
+ (Jest, Vitest, pytest, JUnit, Go test, Flutter, BATS, etc.)
20
+ - `ci-test-detection.md` (this file) covers **CI pipeline** test execution
21
+ detection — whether the project's pipelines actually run those runners on
22
+ every commit.
23
+
24
+ ## Scope
25
+
26
+ The CI test execution detector identifies the **first** CI provider found that
27
+ runs tests. It scans supported providers in priority order and stops on the
28
+ first match. Detection is strictly pattern-based against configuration file
29
+ contents — file presence alone is never sufficient.
30
+
31
+ ### Supported CI providers
32
+
33
+ The implementation in `ci-test-detector.js` (E19-S13) supports **6 providers**:
34
+
35
+ | # | Provider | Enum value | Config file(s) | Field scanned |
36
+ |---|----------------|------------------|------------------------------------|--------------------------|
37
+ | 1 | GitHub Actions | `github-actions` | `.github/workflows/*.yml` / `*.yaml` | `run:` |
38
+ | 2 | GitLab CI | `gitlab` | `.gitlab-ci.yml` | `script:` list items |
39
+ | 3 | CircleCI | `circleci` | `.circleci/config.yml` | `run:` |
40
+ | 4 | Azure Pipelines| `azure` | `azure-pipelines.yml` | `script:` / `bash:` |
41
+ | 5 | Jenkins | `jenkins` | `Jenkinsfile` | `sh '...'` / `sh "..."` |
42
+ | 6 | Bitbucket | `bitbucket` | `bitbucket-pipelines.yml` | `script:` list items |
43
+
44
+ > **Note — Travis CI is intentionally NOT supported.** The original E19-S13
45
+ > story text mentioned `.travis.yml`, but Travis CI is deprecated and the
46
+ > implementation replaced it with `.gitlab-ci.yml` and `bitbucket-pipelines.yml`.
47
+ > This fragment reflects the actual shipped implementation, not the original
48
+ > story wording.
49
+
50
+ ## Test Command Patterns
51
+
52
+ A CI step qualifies as "test execution" only when its command value matches
53
+ one of the following canonical patterns (from `TEST_COMMAND_PATTERNS` in the
54
+ module):
55
+
56
+ - `npm test`
57
+ - `npm run test`
58
+ - `pytest`
59
+ - `./gradlew test` / `./gradlew.bat test`
60
+ - `go test`
61
+ - `bats`
62
+ - `mvn test`
63
+ - `vitest`
64
+ - `npx vitest`
65
+
66
+ Patterns use word-boundary regexes, so `pytest-cov` still matches via `pytest`,
67
+ but embedded mentions inside unrelated tokens do not.
68
+
69
+ ## Detection Algorithm by Provider
70
+
71
+ All YAML scanners share the same pipeline: split the file into lines, skip
72
+ comments, extract the command value from the recognized field, apply the
73
+ false-positive guard, then apply the test command patterns.
74
+
75
+ ### GitHub Actions
76
+
77
+ - **Glob:** `.github/workflows/*.yml` and `*.yaml`
78
+ - **Line pattern:** `/^\s*-?\s*run:\s*(.+)$/`
79
+ - **Behavior:** iterate every workflow file in the directory; collect any `run:`
80
+ field whose value matches a test command pattern.
81
+ - **Stops on first provider with ≥ 1 matching command.**
82
+
83
+ ### GitLab CI
84
+
85
+ - **File:** `.gitlab-ci.yml`
86
+ - **Line pattern:** `/^\s+-\s+(.+)$/` (YAML list items under `script:`)
87
+ - **Behavior:** treat `script:` arrays as the authoritative source. Bare list
88
+ items whose value matches a test command pattern count as test execution.
89
+
90
+ ### CircleCI
91
+
92
+ - **File:** `.circleci/config.yml`
93
+ - **Line pattern:** `/^\s*-?\s*run:\s*(.+)$/`
94
+ - **Behavior:** mirrors the GitHub Actions scanner — `steps[].run.command`
95
+ fields are matched via the `run:` prefix.
96
+
97
+ ### Azure Pipelines
98
+
99
+ - **File:** `azure-pipelines.yml`
100
+ - **Line pattern:** `/^\s*-?\s*(?:script|bash):\s*(.+)$/`
101
+ - **Behavior:** matches both `script:` and `bash:` task values.
102
+
103
+ ### Jenkins
104
+
105
+ - **File:** `Jenkinsfile` (Groovy, not YAML)
106
+ - **Comment skip:** lines starting with `//`
107
+ - **Line pattern:** `/\bsh\s+['"](.+?)['"]/`
108
+ - **Behavior:** matches `sh 'npm test'` and `sh "pytest"` style declarative
109
+ and scripted pipeline steps.
110
+
111
+ ### Bitbucket Pipelines
112
+
113
+ - **File:** `bitbucket-pipelines.yml`
114
+ - **Line pattern:** `/^\s+-\s+(.+)$/` (YAML list items under `script:`)
115
+ - **Behavior:** identical to the GitLab scanner — both providers express steps
116
+ as YAML list items beneath `script:`.
117
+
118
+ ## Output Schema
119
+
120
+ The detector resolves to a single object per project:
121
+
122
+ ```json
123
+ {
124
+ "ci_test_execution": "github-actions" | "gitlab" | "circleci" | "azure" | "jenkins" | "bitbucket" | null,
125
+ "test_commands": ["<matched command line>", "..."]
126
+ }
127
+ ```
128
+
129
+ - `ci_test_execution` — the enum value of the first provider whose config file
130
+ contains at least one matching test command. `null` when no supported
131
+ provider contains a matching command.
132
+ - `test_commands` — the raw command strings that matched the test patterns, in
133
+ discovery order. Empty array when `ci_test_execution` is `null`.
134
+
135
+ Consumers should treat a `null` result as "no CI test execution detected" —
136
+ the project may still have local test runners (see `test-execution-scan.md`),
137
+ but no supported CI pipeline runs them on commit.
138
+
139
+ ## Zero-False-Positive Rules (NFR-041)
140
+
141
+ The detector enforces NFR-041 (zero false positives) through three guards:
142
+
143
+ 1. **Config file presence is not sufficient.** A repo may contain
144
+ `.github/workflows/deploy.yml` that only runs `terraform apply` — it is not
145
+ a test execution pipeline and will produce `null`.
146
+ 2. **Comment skip.** YAML lines matching `/^\s*#/` and Groovy lines matching
147
+ `/^\s*\/\//` are skipped before any command extraction. Test commands
148
+ mentioned inside comments never qualify.
149
+ 3. **False-positive guard.** Extracted command values starting with `echo` are
150
+ rejected via `isFalsePositive`. This filters banners such as
151
+ `- echo "running npm test now"` that look like test execution but are just
152
+ log lines.
153
+
154
+ Pattern matches use word boundaries (`\b`) so `npm-test-utils` or
155
+ `pytestplugin` do not trigger a match against `npm test` / `pytest`. Only
156
+ actual command invocations count.
157
+
158
+ ## Integration Notes
159
+
160
+ - **Companion to test-execution-scan.md.** The two fragments are designed to
161
+ be read together during brownfield onboarding. Local test runner detection
162
+ answers "does this project have tests?"; CI test detection answers "does
163
+ the pipeline actually run them?". A gap exists when the runner detector
164
+ finds a suite but the CI detector returns `null`.
165
+ - **Correlation with E19-S12 runner detection.** The test commands matched
166
+ here (`npm test`, `pytest`, `./gradlew test`, etc.) deliberately mirror the
167
+ runner commands produced by `test-runner-detector.js`. Brownfield workflows
168
+ can cross-reference the two outputs to flag pipelines that run tests for
169
+ only a subset of detected runners (e.g., a monorepo that runs `npm test`
170
+ but skips `pytest`).
171
+ - **First-match semantics.** The detector iterates providers in the fixed
172
+ order (GitHub Actions → GitLab → CircleCI → Azure → Jenkins → Bitbucket)
173
+ and returns on the first match. Projects that use more than one CI provider
174
+ will only surface the first match — callers that need multi-provider
175
+ detection should invoke the individual scanners directly.
176
+
177
+ ## Style and Format
178
+
179
+ This fragment follows the same conventions as its sibling knowledge files in
180
+ `_gaia/lifecycle/knowledge/brownfield/`:
181
+
182
+ - `test-execution-scan.md` — local test runner detection
183
+ - `config-contradiction-scan.md` — configuration contradiction scanning
184
+ - `dead-code-scan.md` — unused code detection
185
+
186
+ ## See Also
187
+
188
+ - [`test-execution-scan.md`](./test-execution-scan.md) — local test runner
189
+ detection and execution (the companion scan that runs tests, while this
190
+ fragment identifies whether CI runs them)
191
+ - `Gaia-framework/src/brownfield/ci-test-detector.js` — the source of truth
192
+ for every pattern and enum value in this fragment
193
+ - `Gaia-framework/src/brownfield/test-runner-detector.js` — the runner
194
+ detector correlated with CI test command matching
@@ -254,3 +254,16 @@ Format:
254
254
 
255
255
  {YAML gap entries here}
256
256
  ```
257
+
258
+ ## See Also
259
+
260
+ - [`ci-test-detection.md`](./ci-test-detection.md) — CI pipeline test execution
261
+ detection. This scan covers **local test runner** detection and execution
262
+ (Jest, Vitest, pytest, JUnit, Go test, Flutter, BATS); `ci-test-detection.md`
263
+ is the companion fragment that covers **CI pipeline** test execution
264
+ detection across GitHub Actions, GitLab CI, CircleCI, Azure Pipelines,
265
+ Jenkins, and Bitbucket Pipelines. Read both together during brownfield
266
+ onboarding: the runner scan answers "does this project have tests?" and the
267
+ CI scan answers "does the pipeline actually run them?".
268
+ - `config-contradiction-scan.md` — configuration contradiction scanning
269
+ - `dead-code-scan.md` — unused code detection
@@ -1,9 +1,9 @@
1
1
  ---
2
2
  name: document-rulesets
3
- version: '1.1'
3
+ version: '1.2'
4
4
  applicable_agents: [validator]
5
- description: 'Document-specific validation rulesets for artifact type detection (path and frontmatter), structural quality checks per artifact type (application, infrastructure, platform PRDs), and two-pass validation logic.'
6
- sections: [type-detection, prd-rules, infra-prd-rules, platform-prd-rules, arch-rules, ux-rules, test-plan-rules, epics-rules, two-pass-logic]
5
+ description: 'Document-specific validation rulesets for artifact type detection (path and frontmatter), structural quality checks per artifact type (application, infrastructure, platform PRDs, gap analysis output), and two-pass validation logic.'
6
+ sections: [type-detection, prd-rules, infra-prd-rules, platform-prd-rules, arch-rules, ux-rules, test-plan-rules, epics-rules, gap-analysis-rules, two-pass-logic]
7
7
  ---
8
8
 
9
9
  <!-- SECTION: type-detection -->
@@ -38,6 +38,7 @@ If no frontmatter match is found, detect the artifact type from the file path ba
38
38
  | `ux-design.md` | ux-rules | UX Design Specification |
39
39
  | `test-plan.md` | test-plan-rules | Test Plan |
40
40
  | `epics-and-stories.md` | epics-rules | Epics and Stories |
41
+ | `test-gap-analysis-*.md` | gap-analysis-rules | Test Gap Analysis Output (E19-S3, FR-223) |
41
42
 
42
43
  ### Path-Based Detection Algorithm
43
44
 
@@ -223,6 +224,95 @@ Verify each story has: key, title, user story (As a/I want/So that), acceptance
223
224
  Verify all `depends_on` and `blocks` references point to existing stories. Check for circular dependencies. Verify priority ordering respects dependency chains. Flag broken references as WARNING.
224
225
  <!-- END SECTION -->
225
226
 
227
+ <!-- SECTION: gap-analysis-rules -->
228
+ ## Gap Analysis Output Validation Rules
229
+
230
+ Structural quality checks for the test gap analysis output artifact produced by `/gaia-test-gap-analysis`. These rules validate conformance to the FR-223 output schema defined by `_gaia/lifecycle/templates/test-gap-analysis-template.md` (E19-S3, ADR-030 §10.22).
231
+
232
+ **Scope:** files matching `docs/test-artifacts/test-gap-analysis-*.md`.
233
+
234
+ **Schema version:** 1.0.0
235
+
236
+ ### YAML Frontmatter — Required Fields
237
+
238
+ The frontmatter block must parse cleanly as YAML and contain all five required fields. Missing any field is a WARNING. A frontmatter parse failure (malformed YAML, missing `---` delimiters, unquoted special characters) is a CRITICAL finding.
239
+
240
+ | Field | Type | Constraint |
241
+ |-------|------|------------|
242
+ | `mode` | enum | must be `coverage` or `verification` |
243
+ | `date` | string | non-empty ISO-8601 date (YYYY-MM-DD) |
244
+ | `project` | string | non-empty |
245
+ | `story_count` | integer | >= 0 |
246
+ | `gap_count` | integer | >= 0 |
247
+
248
+ ### Gap Type Enum (Closed)
249
+
250
+ The `gap_type` field on every Gap Table row must match exactly one of these four values. Any other value is a CRITICAL finding. This enum is closed by design — adding a new gap type is a breaking change and requires a schema version bump.
251
+
252
+ - `missing-test`
253
+ - `unexecuted`
254
+ - `uncovered-ac`
255
+ - `missing-edge-case`
256
+
257
+ ### Severity Enum (Closed)
258
+
259
+ The `severity` field on every Gap Table row must match exactly one of these four values. Any other value is a CRITICAL finding. Like `gap_type`, this enum is closed by design — adding a new severity is a breaking change and requires a schema version bump.
260
+
261
+ - `critical`
262
+ - `high`
263
+ - `medium`
264
+ - `low`
265
+
266
+ ### Required Sections
267
+
268
+ The output must contain these four top-level sections in this order. A missing section, or sections appearing out of order, is a WARNING.
269
+
270
+ 1. `## Executive Summary`
271
+ 2. `## Gap Table`
272
+ 3. `## Per-Story Detail`
273
+ 4. `## Recommendations`
274
+
275
+ ### Gap Table Column Order
276
+
277
+ The Gap Table must declare its columns in this exact order. A table with columns in a different order or with missing columns is a WARNING.
278
+
279
+ 1. `story_key`
280
+ 2. `gap_type`
281
+ 3. `severity`
282
+ 4. `description`
283
+
284
+ ### Cross-Field Consistency
285
+
286
+ - If `gap_count == 0`, the Executive Summary should contain the phrase `No coverage gaps detected` — absence is an INFO finding.
287
+ - `gap_count` should equal the number of data rows in the Gap Table (excluding the header and separator rows) — mismatch is a WARNING.
288
+
289
+ ### Generated vs Executed Tracking (E19-S7, FR-226)
290
+
291
+ Verification-mode outputs must report generated and executed test case
292
+ counts per story and in aggregate. These rules apply only when `mode:
293
+ verification` appears in the frontmatter.
294
+
295
+ - The Executive Summary must contain a `Generated vs Executed` row in the
296
+ format `{total_executed}/{total_generated} ({aggregate_exec_ratio}%)`.
297
+ Missing row is a WARNING.
298
+ - Each Per-Story Detail subsection must declare three fields: `generated`
299
+ (integer >= 0), `executed` (integer >= 0), and `exec_ratio` (percentage
300
+ with one decimal place, e.g., `60.0%`). A missing field on a present
301
+ story subsection is a WARNING.
302
+ - `exec_ratio` must equal `round((executed / generated) * 100, 1)` when
303
+ `generated > 0`. When `generated == 0`, `exec_ratio` must be `0.0%` and
304
+ the subsection should include the note `0/0 (no generated tests)` —
305
+ absence of the note is an INFO finding.
306
+ - Stories with `executed == 0` and `generated > 0` should be flagged as
307
+ HIGH gap priority — absence of the HIGH flag for such stories is a
308
+ WARNING.
309
+ - The aggregate row values must be consistent with the sum of per-story
310
+ counts: `total_generated == sum(story.generated)` and
311
+ `total_executed == sum(story.executed)` — mismatch is a WARNING.
312
+
313
+ **References:** FR-223, FR-226, ADR-030 §10.22, stories E19-S3 and E19-S7, test cases TGA-17–20, TGA-30–32.
314
+ <!-- END SECTION -->
315
+
226
316
  <!-- SECTION: two-pass-logic -->
227
317
  ## Two-Pass Validation Logic
228
318
 
@@ -1,6 +1,6 @@
1
1
  ---
2
2
  template: 'story'
3
- version: 1.2.0
3
+ version: 1.4.0
4
4
  used_by: ['create-story']
5
5
  key: "{story_key}"
6
6
  title: "{story_title}"
@@ -109,12 +109,12 @@ As a {role}, I want to {action}, so that {benefit}.
109
109
 
110
110
  | Review | Status | Report |
111
111
  |--------|--------|--------|
112
- | Code Review | PENDING | — |
113
- | QA Tests | PENDING | — |
114
- | Security Review | PENDING | — |
115
- | Test Automation | PENDING | — |
116
- | Test Review | PENDING | — |
117
- | Performance Review | PENDING | — |
112
+ | Code Review | UNVERIFIED | — |
113
+ | QA Tests | UNVERIFIED | — |
114
+ | Security Review | UNVERIFIED | — |
115
+ | Test Automation | UNVERIFIED | — |
116
+ | Test Review | UNVERIFIED | — |
117
+ | Performance Review | UNVERIFIED | — |
118
118
 
119
119
  > Story moves to `done` only when ALL reviews show PASSED.
120
120