gaia-framework 1.65.1 → 1.66.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (31)
  1. package/CLAUDE.md +16 -1
  2. package/README.md +2 -2
  3. package/_gaia/_config/global.yaml +1 -1
  4. package/_gaia/core/engine/workflow.xml +6 -0
  5. package/_gaia/core/protocols/review-gate-check.xml +29 -1
  6. package/_gaia/lifecycle/knowledge/brownfield/config-contradiction-scan.md +137 -0
  7. package/_gaia/lifecycle/knowledge/brownfield/dead-code-scan.md +179 -0
  8. package/_gaia/lifecycle/knowledge/brownfield/test-execution-scan.md +209 -0
  9. package/_gaia/lifecycle/skills/document-rulesets.md +91 -6
  10. package/_gaia/lifecycle/templates/brownfield-scan-doc-code-prompt.md +219 -0
  11. package/_gaia/lifecycle/templates/brownfield-scan-hardcoded-prompt.md +169 -0
  12. package/_gaia/lifecycle/templates/brownfield-scan-integration-seam-prompt.md +127 -0
  13. package/_gaia/lifecycle/templates/brownfield-scan-runtime-behavior-prompt.md +141 -0
  14. package/_gaia/lifecycle/templates/brownfield-scan-security-prompt.md +212 -0
  15. package/_gaia/lifecycle/templates/gap-entry-schema.md +247 -0
  16. package/_gaia/lifecycle/templates/infra-prd-template.md +356 -0
  17. package/_gaia/lifecycle/templates/platform-prd-template.md +431 -0
  18. package/_gaia/lifecycle/templates/prd-template.md +70 -0
  19. package/_gaia/lifecycle/workflows/4-implementation/add-feature/checklist.md +1 -1
  20. package/_gaia/lifecycle/workflows/4-implementation/add-feature/instructions.xml +2 -3
  21. package/_gaia/lifecycle/workflows/4-implementation/add-stories/checklist.md +5 -0
  22. package/_gaia/lifecycle/workflows/4-implementation/add-stories/instructions.xml +73 -1
  23. package/_gaia/lifecycle/workflows/4-implementation/create-story/instructions.xml +1 -1
  24. package/_gaia/lifecycle/workflows/4-implementation/retrospective/instructions.xml +21 -1
  25. package/_gaia/lifecycle/workflows/4-implementation/retrospective/workflow.yaml +1 -1
  26. package/_gaia/lifecycle/workflows/anytime/brownfield-onboarding/checklist.md +12 -0
  27. package/_gaia/lifecycle/workflows/anytime/brownfield-onboarding/instructions.xml +244 -4
  28. package/_gaia/lifecycle/workflows/anytime/brownfield-onboarding/workflow.yaml +1 -0
  29. package/bin/gaia-framework.js +8 -6
  30. package/gaia-install.sh +28 -20
  31. package/package.json +1 -1
package/CLAUDE.md CHANGED
@@ -1,5 +1,5 @@
 
- # GAIA Framework v1.65.1
+ # GAIA Framework v1.66.0
 
  This project uses the **GAIA** (Generative Agile Intelligence Architecture) framework — an AI agent framework for Claude Code that orchestrates software product development through 26 specialized agents, 65 workflows, and 8 shared skills.
 
@@ -149,6 +149,21 @@ Run `/gaia-run-all-reviews` to execute all six reviews sequentially via subagent
 
  If any review fails, the story returns to `in-progress`. The Review Gate table in the story file tracks progress.
 
+ ### Infra Review Gate Substitutions
+
+ For infrastructure stories (those whose `traces_to` field contains `IR-###`, `OR-###`, or `SR-###` requirement IDs), 4 of the 6 review gates use adapted criteria. Code Review and Security Review remain unchanged for all story types.
+
+ | Standard Gate | Infra Equivalent | Change |
+ |---|---|---|
+ | Code Review | IaC Code Review | Unchanged — same workflow, IaC expertise expected |
+ | QA Tests | Policy-as-Code Validation | Checkov/tfsec/OPA pass replaces unit/integration test pass |
+ | Security Review | Security Review | Unchanged — critical for infrastructure |
+ | Test Automation | Plan Validation + Drift Checks | terraform plan assertions replace automated test coverage |
+ | Test Review | Policy Review | OPA/Rego coverage replaces test quality review |
+ | Performance Review | Cost Review + Scaling Validation | Cost analysis and autoscaling validation replace load testing |
+
+ **Detection mechanism:** The `review-gate-check` protocol reads the story's `traces_to` field and checks the requirement ID prefix. Each story is evaluated independently — platform projects with mixed stories get per-story gate selection based on their own requirement prefix.
+
  ## Memory Hygiene
 
  Agent memory sidecars accumulate decisions across sessions. Run `/gaia-memory-hygiene` periodically (recommended before each sprint) to detect stale, contradicted, or orphaned entries by cross-referencing sidecar decisions against current planning and architecture artifacts.
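The per-story gate selection described above can be sketched as follows. This is a minimal illustration of the prefix rule only; the helper name `gate_type` is hypothetical, and the real logic lives in the `review-gate-check` protocol, not in a Python API.

```python
# Requirement ID prefixes that mark a story as infrastructure (per CLAUDE.md).
INFRA_PREFIXES = ("IR-", "OR-", "SR-")

def gate_type(traces_to):
    """Return "infra" if any traced requirement ID carries an infra prefix,
    otherwise "standard" (including an empty or absent traces_to)."""
    if any(rid.startswith(INFRA_PREFIXES) for rid in (traces_to or [])):
        return "infra"
    return "standard"
```

A story tracing to `[IR-001, FR-128]` would select the infra gate set, while one tracing only to `FR-`/`NFR-` IDs keeps the standard gates.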
package/README.md CHANGED
@@ -1,6 +1,6 @@
  # GAIA — Generative Agile Intelligence Architecture
 
- [![Framework](https://img.shields.io/badge/framework-v1.65.1-blue)]()
+ [![Framework](https://img.shields.io/badge/framework-v1.66.0-blue)]()
  [![License](https://img.shields.io/badge/license-AGPL--3.0-green)]()
  [![Agents](https://img.shields.io/badge/agents-26-purple)]()
  [![Workflows](https://img.shields.io/badge/workflows-73-orange)]()
@@ -460,7 +460,7 @@ The single source of truth is `_gaia/_config/global.yaml`:
 
  ```yaml
  framework_name: "GAIA"
- framework_version: "1.65.1"
+ framework_version: "1.66.0"
  user_name: "your-name"
  project_name: "your-project"
  ```
package/_gaia/_config/global.yaml CHANGED
@@ -3,7 +3,7 @@
  # After modifying this file, run /gaia-build-configs to regenerate resolved configs.
 
  framework_name: "GAIA"
- framework_version: "1.65.1"
+ framework_version: "1.66.0"
 
  # User settings
  user_name: "jlouage"
package/_gaia/core/engine/workflow.xml CHANGED
@@ -50,6 +50,12 @@ execution modes (normal/yolo/planning), checkpoints, and quality gates.
  <action>Resolve {installed_path} from workflow.yaml location</action>
  <action>Resolve {date} to current date</action>
  <action>Ask user for any remaining unresolved variables</action>
+
+ <!-- Template Resolution (ADR-020, FR-101) — custom/templates/ overrides _gaia/lifecycle/templates/ -->
+ <!-- Resolution order for template reads: custom/templates/(unknown) > _gaia/lifecycle/templates/(unknown) -->
+ <!-- Resolution order for template writes: custom/templates/ ONLY — NEVER _gaia/lifecycle/templates/ -->
+ <action>Resolve template paths: If workflow.yaml declares a 'template' field, extract the template filename from the fully resolved template path (e.g., extract "story-template.md" from "{project-root}/_gaia/lifecycle/templates/story-template.md"). Check if {project-root}/custom/templates/(unknown) exists and is non-empty (file size > 0 bytes). If yes: the custom template overrides the framework default — replace the resolved template variable with the custom path ({project-root}/custom/templates/(unknown)). The custom template takes full precedence and completely replaces the framework default (no merge). If no (custom/templates/ directory does not exist, the specific file is not found, or the file is empty / 0 bytes): keep the original resolved framework path unchanged. No error, no warning on fallback — this is silent. If workflow.yaml has no 'template' field, skip template resolution entirely.</action>
+ <action>Template write-path mandate: Any workflow that writes or modifies template files MUST write to {project-root}/custom/templates/, NEVER to {project-root}/_gaia/lifecycle/templates/. Framework default templates are read-only.</action>
  </step>
 
  <step n="2" title="Preflight Validation">
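The read-side resolution order in the template-resolution change can be sketched as follows. This is an illustrative helper under the assumptions stated in the diff (custom template wins only if present and larger than 0 bytes; fallback is silent); `resolve_template` is not a framework API.

```python
import os

def resolve_template(project_root, framework_path):
    """Prefer custom/templates/<name> when it exists and is non-empty;
    otherwise fall back silently to the framework default (read-only)."""
    name = os.path.basename(framework_path)
    custom = os.path.join(project_root, "custom", "templates", name)
    if os.path.isfile(custom) and os.path.getsize(custom) > 0:
        return custom  # custom template fully replaces the default (no merge)
    return framework_path  # silent fallback: no error, no warning
```

Writes go the other way: workflows that modify templates write only under `custom/templates/`, never under `_gaia/lifecycle/templates/`.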
package/_gaia/core/protocols/review-gate-check.xml CHANGED
@@ -2,16 +2,44 @@
  <description>
  Shared protocol invoked by each review workflow after updating its Review Gate row.
  Checks if ALL reviews have passed and transitions story to done if so.
+ Supports infrastructure review gate adaptations (ADR-022 §10.16.6):
+ stories tracing to infra requirements (IR-/OR-/SR- prefixes) use adapted gate criteria.
  </description>
  <critical>
  <mandate>The Review Gate table must have EXACTLY 6 rows: Code Review, QA Tests, Security Review, Test Automation, Test Review, Performance Review. No other rows are valid. If extra rows exist, remove them.</mandate>
+ <mandate>For infrastructure stories, 4 of 6 gates use adapted criteria. Code Review and Security Review remain unchanged for all story types.</mandate>
  </critical>
- <step n="1" title="Read Review Gate">
+
+ <!-- Infra Review Gate Substitutions (FR-128, ADR-022 §10.16.6)
+ When a story traces to infrastructure requirements, the following gate
+ criteria are substituted. The gate row NAMES in the Review Gate table
+ stay the same (for compatibility), but the review workflows apply
+ infra-specific criteria instead of standard application criteria.
+
+ | Standard Gate | Infra Equivalent | Changed? |
+ | Code Review | IaC Code Review | Unchanged — same workflow, IaC expertise expected |
+ | QA Tests | Policy-as-Code Validation | Checkov/tfsec/OPA pass replaces unit/integration test pass |
+ | Security Review | Security Review | Unchanged — critical for infrastructure |
+ | Test Automation | Plan Validation + Drift Checks | terraform plan assertions replace automated test coverage |
+ | Test Review | Policy Review | OPA/Rego coverage replaces test quality review |
+ | Performance Review | Cost Review + Scaling Validation | Cost analysis and autoscaling validation replace load testing |
+ -->
+
+ <step n="1" title="Read Review Gate and Determine Gate Type">
  <action>Read the story file's Review Gate table</action>
  <action>If Review Gate section is missing: initialize it with EXACTLY 6 rows — Code Review (PENDING), QA Tests (PENDING), Security Review (PENDING), Test Automation (PENDING), Test Review (PENDING), Performance Review (PENDING). Do NOT add any other rows.</action>
  <action>If Review Gate table has extra rows beyond the 6 valid ones: remove the invalid rows</action>
  <action>Parse each row: Review name, Status (PENDING | PASSED | FAILED), Report link</action>
+
+ <!-- Infra Gate Detection (FR-129): per-story gate type selection based on requirement ID prefix -->
+ <action>Read the story file's YAML frontmatter traces_to field (e.g., traces_to: [IR-001, FR-128])</action>
+ <action>Determine gate_type for this individual story by scanning its traces_to entries:
+ - If ANY entry has an IR-, OR-, or SR- prefix → set gate_type = "infra"
+ - If entries have only FR- or NFR- prefixes (or traces_to is empty/absent) → set gate_type = "standard"
+ Each story is evaluated independently — in platform projects with mixed stories, each story gets the gate set matching its own requirement prefix, not a single gate set for the whole project.</action>
+ <action if="gate_type == infra">Log: "Infra review gates detected for this story (traces to IR-/OR-/SR- requirements). Applying infrastructure gate criteria: QA Tests → Policy-as-Code Validation, Test Automation → Plan Validation + Drift Checks, Performance Review → Cost Review + Scaling Validation, Test Review → Policy Review. Code Review and Security Review remain unchanged."</action>
  </step>
+
  <step n="2" title="Evaluate Gate and Transition">
  <critical>
  <mandate>You MUST execute the transition even if the gate was already fully passed before this run. The purpose is to ensure story status matches gate state.</mandate>
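The evaluation rule the protocol enforces (exactly six valid rows, any FAILED row sends the story back to `in-progress`, all PASSED transitions it to `done`) can be sketched as follows. The helper is hypothetical, and returning `None` for a still-pending gate is an assumption: the protocol only specifies the fail and full-pass transitions.

```python
# The six valid Review Gate rows, per the protocol's mandate.
GATES = ["Code Review", "QA Tests", "Security Review",
         "Test Automation", "Test Review", "Performance Review"]

def next_status(rows):
    """rows maps gate name -> PENDING | PASSED | FAILED.
    Extra rows are ignored (the protocol removes them); missing rows
    are treated as PENDING (the protocol initializes them)."""
    statuses = [rows.get(g, "PENDING") for g in GATES]
    if "FAILED" in statuses:
        return "in-progress"  # any failure returns the story
    if all(s == "PASSED" for s in statuses):
        return "done"         # full gate pass transitions the story
    return None               # gates still pending: no transition (assumed)
```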
package/_gaia/lifecycle/knowledge/brownfield/config-contradiction-scan.md ADDED
@@ -0,0 +1,137 @@
+ # Config Contradiction Scanner — Subagent Prompt Template
+
+ > Brownfield deep analysis scan subagent for detecting contradictory configuration values across files.
+ > Reference: Architecture ADR-021, Section 10.15.2, 10.15.3, 10.15.5, ADR-022 §10.16.5
+ > Infra-awareness: E12-S6 — applies infra-specific patterns when project_type is infrastructure or platform.
+
+ ## Subagent Invocation
+
+ **Input variables:**
+ - `{tech_stack}` — Detected technology stack from Step 1 discovery
+ - `{project-path}` — Absolute path to the project source code directory
+ - `{project_type}` — Project type: `application`, `infrastructure`, or `platform`
+
+ **Output file:** `{planning_artifacts}/brownfield-scan-config-contradiction.md`
+
+ ## Subagent Prompt
+
+ ```
+ You are a Config Contradiction Scanner for brownfield project analysis. Your task is to discover config files in the target project, build key-value maps, cross-reference values across files, and report contradictions using the standardized gap schema.
+
+ ### Inputs
+ - Tech stack: {tech_stack}
+ - Project path: {project-path}
+ - Project type: {project_type}
+ - Gap schema reference: Read _gaia/lifecycle/templates/gap-entry-schema.md for the output format
+
+ ### Step 1: Config File Discovery
+
+ Discover config files using glob patterns. Apply both generic and stack-specific patterns.
+
+ **Generic patterns (always apply):**
+ - `**/*.yaml`, `**/*.yml` — YAML config files
+ - `**/*.json` — JSON config files (exclude package-lock.json, yarn.lock)
+ - `**/*.env` and `**/.env*` — Environment variable files
+ - `**/*.toml` — TOML config files (exclude Pipfile.lock)
+ - `**/*.ini` — INI config files
+ - `**/*.properties` — Java properties files
+ - `**/config*.xml` — XML config files
+
+ **Exclusion patterns (always apply):**
+ - `node_modules/`, `vendor/`, `dist/`, `build/`, `.git/`
+ - Lock files: `package-lock.json`, `yarn.lock`, `Pipfile.lock`, `go.sum`, `pnpm-lock.yaml`
+ - Test fixtures and mock data directories
+
+ **Stack-specific patterns (apply based on {tech_stack}):**
+
+ #### Java/Spring
+ - `application.yml`, `application.properties`, `bootstrap.yml`
+ - `application-{profile}.yml`, `application-{profile}.properties`
+ - `src/main/resources/**/*.properties`, `src/main/resources/**/*.yml`
+
+ #### Node/Express
+ - `.env`, `.env.production`, `.env.development`, `.env.test`, `.env.local`
+ - `config/` directory contents
+ - `package.json` scripts section
+
+ #### Python/Django
+ - `settings.py`, `settings/*.py`
+ - `.env`, `pyproject.toml` tool sections
+ - `config.py`, `config/*.py`
+
+ #### Go/Gin
+ - `config.yaml`, `config.json`, `config.toml`
+ - `.env`
+ - Struct tags with `json:` / `mapstructure:` bindings
+
+ ### Step 1b: Infrastructure Config File Discovery (E12-S6)
+
+ **Apply ONLY when {project_type} is `infrastructure` or `platform`.**
+
+ In addition to the generic and stack-specific patterns above, scan for infrastructure configuration files:
+
+ #### Terraform
+ - `**/*.tf` — Terraform configuration files
+ - `**/*.tfvars` — Terraform variable files (terraform.tfvars, *.auto.tfvars)
+ - `**/*.tfvars.json` — JSON-format Terraform variables
+ - `**/terraform.tfstate` — State files (check for drift, do not parse fully)
+ - `**/backend.tf` — Backend configuration
+
+ #### Helm / Kubernetes
+ - `**/values.yaml`, `**/values-*.yaml` — Helm values files (values.yaml, values-dev.yaml, values-prod.yaml)
+ - `**/Chart.yaml` — Helm chart metadata
+ - `**/templates/**/*.yaml` — Helm templates (scan for hardcoded values vs template refs)
+ - `**/*.yaml` in directories matching `k8s/`, `kubernetes/`, `manifests/`, `deploy/`
+
+ #### Kustomize
+ - `**/kustomization.yaml`, `**/kustomization.yml` — Kustomize configs
+ - `**/overlays/**/*.yaml` — Kustomize overlay patches (detect contradictions between base and overlays)
+ - `**/base/**/*.yaml` — Kustomize base resources
+
+ #### Docker / Compose
+ - `**/Dockerfile*` — Dockerfile variants
+ - `**/docker-compose*.yml`, `**/docker-compose*.yaml` — Compose files
+ - `**/.dockerignore` — Docker ignore files
+
+ #### CI/CD
+ - `.github/workflows/**/*.yml` — GitHub Actions workflows
+ - `**/.gitlab-ci.yml` — GitLab CI config
+ - `**/Jenkinsfile*` — Jenkins pipelines
+ - `**/.circleci/config.yml` — CircleCI config
+
+ **Infra contradiction detection focus areas:**
+ - Same variable defined differently across terraform.tfvars files for different environments
+ - Helm values.yaml contradicting kustomize overlay values for the same resource
+ - Port numbers, resource limits, replica counts, and image tags inconsistent across environments
+ - Backend configuration (S3 bucket, DynamoDB table) mismatched between Terraform state backends
+
+ ### Step 2: Build Key-Value Maps
+
+ For each discovered config file, extract a key-value map:
+ - Parse structured formats (YAML, JSON, TOML, INI, properties) into nested key paths
+ - For .env files: parse KEY=VALUE pairs
+ - For Terraform files: extract variable defaults, locals, and resource attributes
+ - For Helm values: extract the full values tree
+ - For kustomize overlays: extract patch operations and their target values
+
+ ### Step 3: Cross-Reference and Detect Contradictions
+
+ Compare key-value maps across files:
+ - Same key path with different values across files = contradiction
+ - Environment-specific overrides that conflict with defaults
+ - Port/host/URL mismatches between services
+ - For infra projects: resource specification mismatches between environments
+
+ ### Step 4: Output
+
+ Format each contradiction as a gap entry using the standardized schema:
+ - category: `config-contradiction`
+ - For infra-specific contradictions (terraform.tfvars, values.yaml, kustomize): also tag with infra context in the description
+ - id: `GAP-CONFIG-{seq}` — sequential numbering starting at 001
+ - verified_by: `machine-detected`
+ - Budget: max 70 entries, truncate low-severity entries if exceeded
+ ```
+
+ ## Output File
+
+ Write all findings to: `{planning_artifacts}/brownfield-scan-config-contradiction.md`
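Steps 2 and 3 of this scanner (flatten each config into dotted key paths, then compare values for the same path across files) can be sketched as follows. Both helpers are illustrative; the scanner itself does this with LLM analysis rather than a parser.

```python
def flatten(tree, prefix=""):
    """Flatten a parsed config tree into dotted key paths (Step 2)."""
    flat = {}
    for key, value in tree.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, path + "."))
        else:
            flat[path] = value
    return flat

def contradictions(file_maps):
    """file_maps: {filename: nested config dict}. Return a list of
    (key_path, {filename: value}) where the same key path resolves to
    different values across files (Step 3)."""
    seen = {}
    for fname, tree in file_maps.items():
        for key, value in flatten(tree).items():
            seen.setdefault(key, {})[fname] = value
    # repr() keeps unhashable values (lists, dicts) comparable
    return [(k, v) for k, v in seen.items()
            if len(set(map(repr, v.values()))) > 1]
```

For example, `values.yaml` declaring `service.port: 8080` while `values-prod.yaml` declares `service.port: 9090` surfaces as one contradiction entry.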
package/_gaia/lifecycle/knowledge/brownfield/dead-code-scan.md ADDED
@@ -0,0 +1,179 @@
+ # Dead Code & Dead State Scanner — Subagent Prompt Template
+
+ > Brownfield deep analysis scan subagent for detecting dead code, dead state, and abandoned functionality.
+ > Reference: Architecture ADR-021, Section 10.15.2, 10.15.3, 10.15.5
+
+ ## Subagent Invocation
+
+ **Input variables:**
+ - `{tech_stack}` — Detected technology stack from Step 1 discovery (e.g., "Java/Spring", "Node/Express", "Python/Django", "Go/Gin")
+ - `{project-path}` — Absolute path to the project source code directory
+
+ **Output file:** `{planning_artifacts}/brownfield-scan-dead-code.md`
+
+ **Invocation model:** Spawned via Agent tool in a single message alongside 6 other deep analysis scan subagents (parallel execution per architecture 10.15.2).
+
+ ## Subagent Prompt
+
+ ```
+ You are a Dead Code & Dead State Scanner for brownfield project analysis. Your task is to discover dead code, unused state, and abandoned functionality in the target project using LLM-based static analysis (grep/glob/read), then report findings using the standardized gap schema format.
+
+ ### Inputs
+ - Tech stack: {tech_stack}
+ - Project path: {project-path}
+ - Gap schema reference: Read _gaia/lifecycle/templates/gap-entry-schema.md for the output format
+
+ ### Step 1: Universal Dead Code Detection
+
+ Apply these detection patterns regardless of tech stack.
+
+ #### 1.1 Unreachable Code Paths
+ Scan for code that can never execute:
+ - Code after unconditional `return`, `throw`, `exit`, `break`, `continue` statements
+ - Unreachable switch/match branches (default after exhaustive cases)
+ - Dead branches behind constant `false` conditions (`if (false)`, `if (0)`)
+ - Functions defined but never called anywhere in the project
+
+ #### 1.2 Unused Exports, Functions, and Classes
+ Cross-reference declarations against usage across the entire project:
+ - Grep for all exported symbols (functions, classes, constants, types)
+ - Cross-reference each export against import/require/usage statements in other files
+ - A declaration with zero references across the project is definitely unused (confidence: high)
+ - A declaration referenced only in the same file where it is defined may be dead if not exported
+
+ #### 1.3 Commented-Out Code Blocks (>5 Lines)
+ Scan for blocks of more than 5 consecutive commented lines that contain code patterns:
+ - Function definitions, class declarations, control flow (if/else, for, while, switch)
+ - Variable assignments, return statements, import/require statements
+ - Threshold is strictly greater than 5 lines — exactly 5 lines does NOT trigger detection
+ - Distinguish code comments from documentation comments (JSDoc, Javadoc, docstrings)
+
+ #### 1.4 Unused Database Artifacts (Dead State)
+ Cross-reference migration files against ORM models and query patterns:
+ - Tables or columns defined in migration files but not referenced in any ORM model, query builder, or raw SQL
+ - Indexes on columns/tables that are no longer queried
+ - Seed data for tables that are no longer used
+
+ #### 1.5 Feature Flag Staleness
+ Identify feature flags that are permanently on or permanently off:
+ - Flag variables assigned a constant value (true/false) with no conditional reassignment anywhere
+ - Feature gate checks where the flag value is always the same at every call site
+ - Determination is based on static analysis of the codebase only — no commit history analysis required
+
+ ### Step 2: Stack-Aware Pattern Detection
+
+ Apply patterns based on the detected {tech_stack}. For multi-stack projects (monorepos), apply all relevant stack patterns — each stack's patterns apply only to files matching that stack's file extensions, preventing cross-contamination.
+
+ #### Java/Spring
+ - Unused `@Service`, `@Repository`, `@Component` beans — annotated classes with no `@Autowired` or constructor injection anywhere in the project
+ - Unused `@Scheduled` methods — scheduled task methods that are defined but their containing bean is never loaded
+ - Orphaned `@Entity` classes — JPA entities not referenced by any repository or query
+ - Unused Spring `@Configuration` beans — config classes that declare beans never injected
+ - Confidence: set to `medium` for Spring beans (XML config or component scan may inject dynamically)
+
+ #### Node/Express
+ - Unused `module.exports` or `export` declarations — exported symbols never imported elsewhere
+ - Orphaned route handlers — handler functions defined but not registered in any router
+ - Unused middleware — middleware functions defined but not applied to any route or app
+ - Dead `require()` or `import` in index/barrel files — re-exported modules never consumed
+ - Unused npm scripts — scripts in package.json never referenced by other scripts or CI
+
+ #### Python/Django
+ - Unused views — view functions or classes defined in views.py but not mapped in any `urlpatterns`
+ - Unused serializers — serializer classes defined but never used in any view or viewset
+ - Orphaned management commands — commands defined but never invoked in scripts or docs
+ - Dead Celery tasks — task functions decorated with `@shared_task` or `@app.task` but never called via `.delay()` or `.apply_async()`
+ - Unused Django model methods — methods on models never called outside the model file
+
+ #### Go/Gin
+ - Unexported functions with no callers in the same package — lowercase functions never referenced
+ - Unused handler functions — HTTP handler functions not registered in any router group
+ - Dead `init()` blocks — init functions in files that are never imported
+ - Unused struct methods — methods on types never called anywhere in the project
+ - Unused interface implementations — types implementing interfaces but never used polymorphically
+
+ ### Step 3: Confidence Level Assignment
+
+ Assign confidence levels to distinguish between "definitely unused" and "possibly unused":
+
+ - **`high`** — Zero references found anywhere in the project. The code is definitely unused based on static analysis. No dynamic import, reflection, or metaprogramming patterns could reference it.
+ - **`medium`** — No direct references found, but dynamic import patterns exist in the project (e.g., `require(variable)`, `importlib.import_module()`, Spring component scanning). The code is possibly unused but dynamic references cannot be ruled out.
+ - **`low`** — The code appears unused, but reflection, metaprogramming, or runtime code generation patterns are present (e.g., Java reflection, Python `getattr()`, Go `reflect` package). Cannot confidently determine usage status.
+
+ Include a note in the `description` field explaining why certainty is limited for medium and low confidence findings.
+
+ ### Step 4: Format Output
+
+ Format all findings as gap entries using the standardized gap entry schema format:
+
+ - `category`: always `"dead-code"`
+ - `verified_by`: always `"machine-detected"`
+ - `id`: sequential `GAP-DEAD-CODE-001`, `GAP-DEAD-CODE-002`, etc.
+ - `confidence`: per Step 3 classification
+
+ Example gap entry structure:
+ ```yaml
+ gap:
+   id: "GAP-DEAD-CODE-001"
+   category: "dead-code"
+   severity: "medium"
+   title: "Unused exported function processLegacyData()"
+   description: "Function is exported but never imported elsewhere. Zero references — definitely unused."
+   evidence:
+     file: "src/utils/legacy.js"
+     line: 42
+   recommendation: "Remove the unused function or mark as deprecated."
+   verified_by: "machine-detected"
+   confidence: "high"
+ ```
+
+ All required fields must be populated:
+ - `id` — unique identifier in format `GAP-DEAD-CODE-{seq}` (zero-padded 3-digit sequence)
+ - `category` — always `"dead-code"`
+ - `severity` — impact level (critical/high/medium/low)
+ - `title` — one-line summary (max 80 chars)
+ - `description` — detailed explanation including evidence and confidence rationale
+ - `evidence` — composite object with `file` (relative path) and `line` (line number)
+ - `recommendation` — actionable fix suggestion
+ - `verified_by` — always `"machine-detected"`
+ - `confidence` — detection certainty (high/medium/low)
+
+ **Severity classification:**
+ - **critical:** Dead code that masks active security vulnerabilities or causes resource leaks
+ - **high:** Large dead code blocks (>50 lines) or dead database state causing confusion
+ - **medium:** Unused functions, classes, or exports (standard dead code)
+ - **low:** Small commented-out blocks, unused imports, stale feature flags
+
+ ### Step 5: Budget Control
+
+ Use structured schema format (~100 tokens per gap entry) — no prose descriptions.
+
+ - Maximum ~70 gap entries in the output (per NFR-024)
+ - If more than 70 findings are detected, include the 70 highest-severity entries
+ - When approaching the budget limit, prioritize higher-severity findings and summarize remaining as a count
+ - Append a budget summary section:
+ ```
+ ## Budget Summary
+ Total gaps detected: {N}. Showing top 70 by severity. Omitted: {N-70} entries ({breakdown by severity}).
+ ```
+
+ Write the complete output to: `{planning_artifacts}/brownfield-scan-dead-code.md`
+
+ The output file should have this structure:
+ ```markdown
+ # Brownfield Scan: Dead Code & Dead State
+
+ > Scanner: Dead Code & Dead State Scanner
+ > Tech Stack: {tech_stack}
+ > Date: {date}
+ > Files Scanned: {count}
+
+ ## Findings
+
+ {gap entries in standardized schema format}
+
+ ## Budget Summary (if applicable)
+
+ {truncation details if >70 entries}
+ ```
+ ```
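The strict ">5 lines" rule in section 1.3 of this scanner is easy to get wrong off by one, so here is a sketch of it. The helper and its code-pattern heuristic are illustrative assumptions (the scanner uses grep-based analysis, not this function); note that exactly 5 commented lines does not trigger.

```python
import re

# Rough heuristic for "looks like code": control flow, definitions, assignment.
CODE_HINT = re.compile(r"\b(if|for|while|return|function|class|import)\b|=")

def commented_code_blocks(lines, min_lines=5):
    """Return (start, end) 1-based line ranges of runs of MORE than
    `min_lines` consecutive '//' or '#' comment lines containing code
    patterns. Exactly `min_lines` lines does NOT trigger detection."""
    blocks, run = [], []

    def close_run():
        if len(run) > min_lines and any(CODE_HINT.search(t) for _, t in run):
            blocks.append((run[0][0], run[-1][0]))
        run.clear()

    for i, line in enumerate(lines, 1):
        stripped = line.lstrip()
        if stripped.startswith(("//", "#")):
            run.append((i, stripped.lstrip("/#").strip()))
        else:
            close_run()
    close_run()
    return blocks
```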
@@ -0,0 +1,209 @@
1
+ # Test Execution Scan — Brownfield Subagent Prompt
2
+
3
+ > **Version:** 1.0.0
4
+ > **Story:** E11-S9
5
+ > **Traces to:** FR-110, US-37, ADR-021
6
+ > **Category:** runtime-behavior (test failures map to runtime-behavior per gap-entry-schema.md)
7
+ > **Output format:** Standardized gap entry schema (`_gaia/lifecycle/templates/gap-entry-schema.md`)
8
+
9
+ ## Objective
10
+
11
+ Run the existing test suite at `{project-path}` during brownfield discovery. Capture test failures as gap entries conforming to the gap schema. This scan is **non-blocking** — failures do not halt the brownfield workflow.
12
+
13
+ ## Test Runner Auto-Detection
14
+
15
+ Detect the test runner by checking for the following files at `{project-path}`. Use the **priority order** below — select the first matching runner. For monorepo/polyglot projects, detect **all** matching runners and execute them sequentially.
16
+
17
+ ### Detection Priority Order
18
+
19
+ | Priority | File Check | Condition | Runner Command |
20
+ |----------|-----------|-----------|----------------|
21
+ | 1 | `package.json` | Has `scripts.test` defined AND value is not `"echo \"Error: no test specified\""` | `npm test` |
22
+ | 2 | `pytest.ini` / `pyproject.toml` / `setup.cfg` | `pytest.ini` exists, OR `pyproject.toml` contains `[tool.pytest]`, OR `setup.cfg` contains `[tool:pytest]` | `pytest` |
23
+ | 3 | `pom.xml` | File exists | `mvn test` |
24
+ | 4 | `build.gradle` / `build.gradle.kts` | Either file exists | `gradle test` |
25
+ | 5 | `go.mod` | File exists | `go test ./...` |
26
+ | 6 | `pubspec.yaml` | File exists | `flutter test` |
27
+
28
+ ### No Test Suite Detected (AC6)
29
+
30
+ If no test runner is detected at `{project-path}`, produce a single info-level gap entry:
31
+
32
+ ```yaml
33
+ id: "GAP-TEST-INFO-001"
34
+ category: "runtime-behavior"
35
+ severity: "info"
36
+ title: "No test suite detected"
37
+ description: "No recognized test runner configuration found at {project-path}. The project has no automated tests or uses an unsupported test framework."
38
+ evidence:
39
+ file: "{project-path}"
40
+ line: 0
41
+ recommendation: "Add a test framework (Jest, pytest, JUnit, etc.) and write initial unit tests for critical paths."
42
+ verified_by: "machine-detected"
43
+ confidence: "high"
44
+ ```
45
+
46
+ Proceed without error after logging this gap.
47
+
48
+ ## Test Execution with Timeout (AC3)
49
+
50
+ Execute each detected runner with a configurable timeout (default **5 minutes** / 300 seconds).
51
+
52
+ ```bash
53
+ timeout 300 npm test 2>&1
54
+ ```
55
+
56
+ ### Timeout Behavior
57
+
58
+ - If the timeout is exceeded, terminate the process gracefully
59
+ - Capture partial results from stdout/stderr up to the timeout point
60
+ - Log a warning-level gap entry noting the timeout:
61
+
62
+ ```yaml
63
+ id: "GAP-TEST-{seq}"
64
+ category: "runtime-behavior"
65
+ severity: "medium"
66
+ title: "Test suite timed out after 5 minutes"
67
+ description: "Test execution exceeded the 300s timeout. Partial results captured: {N} tests ran before timeout."
68
+ evidence:
69
+ file: "{test-config-file}"
70
+ line: 0
71
+ recommendation: "Investigate slow tests. Consider splitting the test suite or increasing the timeout for CI."
72
+ verified_by: "machine-detected"
73
+ confidence: "medium"
74
+ ```
75
+
76
+ ### Sequential Execution for Multiple Runners (AC9)
77
+
78
+ For monorepo or polyglot projects with multiple test runners detected:
79
+ 1. Execute each detected runner sequentially (not in parallel)
80
+ 2. Aggregate results across all runners
81
+ 3. Include the runner name in the `description` field of each gap entry (e.g., "npm test: ...")
82
+ 4. Use a shared sequence counter across all runners for gap entry IDs

## Output Parsing (AC4)

After each test run completes (or times out), parse the output to extract metrics:
- **Total** test count
- **Passing** count
- **Failing** count
- **Skipped** count
- **Error messages** for each failing test

### Parsing Patterns by Runner

**Jest/Mocha/Vitest:**
- Summary line: `Tests: N passed, N failed, N total` (Jest) or `Tests N passed | N failed` (Vitest)
- Individual: `FAIL src/path/to/test.js` / `PASS src/path/to/test.js`
- Exit code 1 = failures present

**pytest:**
- Summary line: `N passed, N failed, N error`
- Individual: `FAILED test_file.py::test_name`

**Maven Surefire:**
- Summary in `target/surefire-reports/` XML files
- Console: `Tests run: N, Failures: N, Errors: N, Skipped: N`

**Go test:**
- Per-test: `--- FAIL: TestName` / `--- PASS: TestName`
- Summary: `FAIL` or `ok` per package

**Flutter test:**
- Summary: `All tests passed!` or `N tests failed`
- Per-test: `FAILED: test description`
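
As an illustration, summary-line extraction for two of the runners above can be done with regexes (the patterns are loose approximations of typical output, not exhaustive):

```python
import re

def parse_summary(output, runner):
    """Return (passed, failed, total) from a runner's summary line."""
    passed = re.search(r"(\d+) passed", output)
    failed = re.search(r"(\d+) failed", output)
    n_passed = int(passed.group(1)) if passed else 0
    n_failed = int(failed.group(1)) if failed else 0
    if runner == "jest":
        # e.g. "Tests:       2 failed, 5 passed, 7 total"
        total = re.search(r"(\d+) total", output)
        n_total = int(total.group(1)) if total else n_passed + n_failed
    elif runner == "pytest":
        # e.g. "2 failed, 5 passed in 1.23s" (no explicit total)
        n_total = n_passed + n_failed
    else:
        raise ValueError(f"unsupported runner: {runner}")
    return n_passed, n_failed, n_total
```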

## Infrastructure Error Detection (AC8)

Before converting failures to gap entries, check if the error is an infrastructure dependency failure rather than an actual test failure.

**Infrastructure error heuristics:**
- Pattern match stderr/stdout for: `ECONNREFUSED`, `connection refused`, `missing environment variable`, `ENOENT`, `docker`, `database connection`, `redis`, `ETIMEDOUT`, `EHOSTUNREACH`
- Exit codes indicating non-test errors (e.g., process crash, missing binary)
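
The pattern heuristic above reduces to a case-insensitive substring scan; a minimal sketch (the pattern list mirrors this section and is deliberately not exhaustive):

```python
# Substrings suggesting a missing dependency rather than a test-logic failure.
INFRA_PATTERNS = [
    "ECONNREFUSED", "connection refused", "missing environment variable",
    "ENOENT", "docker", "database connection", "redis",
    "ETIMEDOUT", "EHOSTUNREACH",
]

def detect_infra_error(output):
    """Return the first matching infrastructure pattern, or None."""
    lowered = output.lower()
    for pattern in INFRA_PATTERNS:
        if pattern.lower() in lowered:
            return pattern
    return None
```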

If an infrastructure error is detected:
- Do NOT convert to test failure gap entries
- Instead, log a **medium-severity** gap entry:

```yaml
id: "GAP-TEST-{seq}"
category: "runtime-behavior"
severity: "medium"
title: "Test infrastructure dependency unavailable"
description: "Test execution failed due to infrastructure dependency: {detected_pattern}. This is not a test logic failure."
evidence:
  file: "{test-config-file}"
  line: 0
recommendation: "Ensure required infrastructure (databases, caches, external services) is available before running tests. Consider using test doubles for external dependencies."
verified_by: "machine-detected"
confidence: "medium"
```

## Gap Entry Conversion (AC5)

For each failing test, produce a gap entry conforming to the standardized gap schema:

### ID Format

- `GAP-TEST-{seq}` where `{seq}` is a zero-padded 3-digit sequence (001, 002, ...)
- Example: `GAP-TEST-001`, `GAP-TEST-002`

### Severity Mapping by Test Type

Infer test type from file path patterns:

| File Path Pattern | Test Type | Severity |
|-------------------|-----------|----------|
| `test/unit/`, `tests/unit/`, `__tests__/`, `*.unit.test.*` | unit | medium |
| `test/integration/`, `tests/integration/`, `*.integration.test.*` | integration | high |
| `test/e2e/`, `tests/e2e/`, `test/end-to-end/`, `*.e2e.test.*` | e2e | critical |
| Cannot be determined | default | medium |
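
The table above translates into an ordered rule list; a sketch, checked most-severe first so an ambiguous path gets the higher severity (function and rule names are illustrative):

```python
from fnmatch import fnmatch

SEVERITY_RULES = [  # checked in order: e2e, then integration, then unit
    (("test/e2e/", "tests/e2e/", "test/end-to-end/", "*.e2e.test.*"), "critical"),
    (("test/integration/", "tests/integration/", "*.integration.test.*"), "high"),
    (("test/unit/", "tests/unit/", "__tests__/", "*.unit.test.*"), "medium"),
]

def severity_for(test_path):
    """Map a failing test's file path to a gap severity (default: medium)."""
    filename = test_path.rsplit("/", 1)[-1]
    for patterns, severity in SEVERITY_RULES:
        for pat in patterns:
            if "*" in pat:
                if fnmatch(filename, pat):  # glob patterns match the filename
                    return severity
            elif pat in test_path:          # directory patterns match anywhere
                return severity
    return "medium"  # test type cannot be determined
```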

### Gap Entry Template

```yaml
id: "GAP-TEST-{seq}"
category: "runtime-behavior"
severity: "{severity_from_test_type}"
title: "Failing test: {test_name}"
description: "{runner_name}: {error_message}"
evidence:
  file: "{test_file_path}"
  line: "{line_number_if_available}"
recommendation: "Fix the failing test or update the test to match current behavior."
verified_by: "machine-detected"
confidence: "high"
```

## Token Budget Control (AC7)

Per NFR-024, the total output must stay within the 40K token framework budget.

- Each gap entry averages ~100 tokens
- If test output produces more than 70 gap entries, truncate:
  - Keep the 70 highest-severity gap entries (critical > high > medium > low)
  - Add a summary line: `<!-- TRUNCATED: {N} additional test failures omitted to stay within NFR-024 token budget -->`
- If raw test output exceeds budget before parsing, truncate the raw output and parse what is available
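
The truncation rule can be sketched as a severity-ranked cut (the 70-entry cap follows from the ~100-token average against the 40K budget; the function name is illustrative):

```python
SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3, "info": 4}

def truncate_gaps(gaps, cap=70):
    """Keep at most `cap` entries, preferring higher severities (the stable
    sort preserves original order within a severity). Returns the kept
    entries plus a truncation marker, or None if nothing was dropped."""
    if len(gaps) <= cap:
        return gaps, None
    ranked = sorted(gaps, key=lambda g: SEVERITY_RANK.get(g["severity"], 9))
    marker = (f"<!-- TRUNCATED: {len(gaps) - cap} additional test failures "
              "omitted to stay within NFR-024 token budget -->")
    return ranked[:cap], marker
```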

## Output File

Write all gap entries to: `{planning_artifacts}/brownfield-scan-test-execution.md`

Format:
```markdown
# Brownfield Scan: Test Execution

> Scan type: test-execution
> Runner(s): {detected_runners}
> Date: {date}

## Test Metrics

| Runner | Total | Passed | Failed | Skipped |
|--------|-------|--------|--------|---------|
| {runner} | {n} | {n} | {n} | {n} |

## Gap Entries

{YAML gap entries here}
```