gaia-framework 1.65.1 → 1.83.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (57)
  1. package/.claude/commands/gaia-create-stakeholder.md +20 -0
  2. package/.claude/commands/gaia-test-gap-analysis.md +17 -0
  3. package/CLAUDE.md +102 -1
  4. package/README.md +2 -2
  5. package/_gaia/_config/global.yaml +5 -1
  6. package/_gaia/_config/lifecycle-sequence.yaml +20 -0
  7. package/_gaia/_config/skill-manifest.csv +2 -0
  8. package/_gaia/_config/workflow-manifest.csv +3 -1
  9. package/_gaia/core/engine/workflow.xml +11 -1
  10. package/_gaia/core/protocols/review-gate-check.xml +29 -1
  11. package/_gaia/core/workflows/party-mode/steps/step-01-agent-loading.md +60 -9
  12. package/_gaia/creative/workflows/problem-solving/checklist.md +64 -14
  13. package/_gaia/creative/workflows/problem-solving/instructions.xml +367 -22
  14. package/_gaia/creative/workflows/problem-solving/workflow.yaml +31 -1
  15. package/_gaia/dev/agents/_base-dev.md +7 -1
  16. package/_gaia/dev/skills/_skill-index.yaml +9 -0
  17. package/_gaia/dev/skills/figma-integration.md +296 -0
  18. package/_gaia/lifecycle/knowledge/brownfield/config-contradiction-scan.md +137 -0
  19. package/_gaia/lifecycle/knowledge/brownfield/dead-code-scan.md +179 -0
  20. package/_gaia/lifecycle/knowledge/brownfield/test-execution-scan.md +209 -0
  21. package/_gaia/lifecycle/skills/document-rulesets.md +91 -6
  22. package/_gaia/lifecycle/templates/brownfield-scan-doc-code-prompt.md +219 -0
  23. package/_gaia/lifecycle/templates/brownfield-scan-hardcoded-prompt.md +169 -0
  24. package/_gaia/lifecycle/templates/brownfield-scan-integration-seam-prompt.md +127 -0
  25. package/_gaia/lifecycle/templates/brownfield-scan-runtime-behavior-prompt.md +141 -0
  26. package/_gaia/lifecycle/templates/brownfield-scan-security-prompt.md +440 -0
  27. package/_gaia/lifecycle/templates/gap-entry-schema.md +282 -0
  28. package/_gaia/lifecycle/templates/infra-prd-template.md +356 -0
  29. package/_gaia/lifecycle/templates/platform-prd-template.md +431 -0
  30. package/_gaia/lifecycle/templates/prd-template.md +70 -0
  31. package/_gaia/lifecycle/templates/story-template.md +22 -1
  32. package/_gaia/lifecycle/workflows/2-planning/create-ux-design/instructions.xml +52 -3
  33. package/_gaia/lifecycle/workflows/4-implementation/add-feature/checklist.md +1 -1
  34. package/_gaia/lifecycle/workflows/4-implementation/add-feature/instructions.xml +2 -3
  35. package/_gaia/lifecycle/workflows/4-implementation/add-stories/checklist.md +5 -0
  36. package/_gaia/lifecycle/workflows/4-implementation/add-stories/instructions.xml +73 -1
  37. package/_gaia/lifecycle/workflows/4-implementation/create-stakeholder/checklist.md +25 -0
  38. package/_gaia/lifecycle/workflows/4-implementation/create-stakeholder/instructions.xml +79 -0
  39. package/_gaia/lifecycle/workflows/4-implementation/create-stakeholder/workflow.yaml +22 -0
  40. package/_gaia/lifecycle/workflows/4-implementation/create-story/instructions.xml +11 -1
  41. package/_gaia/lifecycle/workflows/4-implementation/retrospective/instructions.xml +21 -1
  42. package/_gaia/lifecycle/workflows/4-implementation/retrospective/workflow.yaml +1 -1
  43. package/_gaia/lifecycle/workflows/4-implementation/validate-story/instructions.xml +11 -0
  44. package/_gaia/lifecycle/workflows/anytime/brownfield-onboarding/checklist.md +12 -0
  45. package/_gaia/lifecycle/workflows/anytime/brownfield-onboarding/instructions.xml +248 -4
  46. package/_gaia/lifecycle/workflows/anytime/brownfield-onboarding/workflow.yaml +1 -0
  47. package/_gaia/testing/workflows/test-gap-analysis/checklist.md +8 -0
  48. package/_gaia/testing/workflows/test-gap-analysis/instructions.xml +53 -0
  49. package/_gaia/testing/workflows/test-gap-analysis/workflow.yaml +38 -0
  50. package/bin/gaia-framework.js +44 -8
  51. package/bin/helpers/derive-bump-label.js +41 -0
  52. package/bin/helpers/validate-bump-labels.js +38 -0
  53. package/gaia-install.sh +96 -21
  54. package/package.json +1 -1
  55. package/_gaia/_memory/tier2-results/.gitkeep +0 -0
  56. package/_gaia/_memory/tier2-results/checkpoint-resume-2026-03-24.yaml +0 -6
  57. package/_gaia/_memory/tier2-results/engine-scenarios-2026-03-22.yaml +0 -14
@@ -0,0 +1,169 @@
1
+ # Hard-Coded Business Logic Scanner — Subagent Prompt
2
+
3
+ > Brownfield deep analysis scan subagent. Detects hard-coded business logic values that should be externalized to configuration.
4
+ > Reference: Architecture ADR-021, Section 10.15.2, Section 10.15.5, ADR-022 §10.16.5
5
+ > Infra-awareness: E12-S6 — applies infra-specific patterns when project_type is infrastructure or platform.
6
+
7
+ ## Objective
8
+
9
+ Scan the codebase at `{project-path}` to identify hard-coded business logic values embedded in source code. These are values that represent business rules, configuration, or environment-specific settings and should be externalized to configuration files, environment variables, or feature flags.
10
+
11
+ **Input variables:**
12
+ - `{tech_stack}` — Detected technology stack from Step 1 discovery
13
+ - `{project-path}` — Absolute path to the project source code directory
14
+ - `{project_type}` — Project type: `application`, `infrastructure`, or `platform`
15
+
16
+ **Output format:** Follow the gap entry schema at `{project-root}/_gaia/lifecycle/templates/gap-entry-schema.md` exactly.
17
+
18
+ ## Detection Categories — Application Patterns
19
+
20
+ Scan for the following 6 categories of hard-coded values:
21
+
22
+ ### 1. Magic Numbers in Business Calculations
23
+
24
+ Values used in business logic that represent thresholds, limits, rates, or quantities.
25
+
26
+ **Flag these:**
27
+ - Numeric thresholds in business conditions: `if (amount > 10000)`
28
+ - Hard-coded retry counts in business logic: `maxRetries = 3`
29
+ - Hard-coded pagination limits: `const PAGE_SIZE = 50`
30
+ - Timeout values embedded in business logic: `setTimeout(callback, 30000)`
31
+
32
+ ### 2. Hard-coded URLs and Endpoints
33
+
34
+ URLs, API endpoints, and service addresses embedded directly in source code.
35
+
36
+ **Flag these:**
37
+ - Production/staging URLs: `fetch("https://api.prod.example.com/v2")`
38
+ - Hard-coded service endpoints: `const API_BASE = "https://internal.service.com"`
39
+ - Database connection strings with hostnames: `mongodb://prod-db:27017`
40
+
41
+ ### 3. Embedded SQL Queries with Business Rules
42
+
43
+ SQL queries containing hard-coded business logic values.
44
+
45
+ **Flag these:**
46
+ - Hard-coded role/status values in WHERE clauses
47
+ - Business tier filtering with literal strings
48
+ - Hard-coded date boundaries
49
+
50
+ ### 4. Date/Time Thresholds
51
+
52
+ Hard-coded dates, times, or durations that represent business policy.
53
+
54
+ ### 5. Pricing and Rate Values
55
+
56
+ Monetary values, percentages, rates, or financial thresholds embedded in code.
57
+
58
+ ### 6. Role and Permission Strings
59
+
60
+ Hard-coded role names, permission identifiers, or authorization strings.
61
+
62
+ ## Detection Categories — Infrastructure Patterns (E12-S6)
63
+
64
+ **Apply ONLY when {project_type} is `infrastructure` or `platform`.**
65
+
66
+ ### 7. Hard-Coded IP Addresses in Infrastructure Files
67
+
68
+ Detect IP addresses embedded directly in IaC, Kubernetes manifests, and network configuration.
69
+
70
+ **Flag these:**
71
+ - Specific IPv4 host addresses in Terraform configs (not CIDR ranges for subnets such as `cidr_block = "10.0.1.0/24"`)
72
+ - Hard-coded IPs in Kubernetes Services or Endpoints: `clusterIP: "10.96.0.10"`
73
+ - Static IPs in Helm values: `loadBalancerIP: "203.0.113.50"`
74
+ - Hard-coded DNS entries: `server = "10.0.0.53"` instead of using service discovery
75
+ - IPs in security group rules: `cidr_blocks = ["203.0.113.0/32"]`
76
+
77
+ **Do NOT flag:**
78
+ - Standard CIDR ranges for VPC/subnet definitions: `10.0.0.0/16`, `172.16.0.0/12`
79
+ - Loopback and wildcard addresses: `127.0.0.1`, `0.0.0.0`
80
+ - Kubernetes internal DNS: `kube-dns`, `coredns`
81
+
82
+ **Gap category:** `hard-coded-logic` with infra context in description
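One way to separate single-host literals from subnet ranges can be sketched with Python's standard `ipaddress` module. The function name `should_flag_ip` and the rule "flag only full-length prefixes" are assumptions made for illustration; they are not part of the scanner prompt itself.

```python
import ipaddress


def should_flag_ip(value: str) -> bool:
    """Return True when a literal looks like a single host address.

    Assumption: a range with a prefix shorter than the maximum
    (for example /16 or /24) is a subnet definition and is not flagged.
    """
    try:
        iface = ipaddress.ip_interface(value)
    except ValueError:
        # Not an IP literal at all, e.g. a DNS name such as "kube-dns".
        return False
    if iface.ip.is_loopback or iface.ip.is_unspecified:
        # 127.0.0.1 and 0.0.0.0 are on the "Do NOT flag" list.
        return False
    # A bare address or a /32 (or /128 for IPv6) pins exactly one host.
    return iface.network.prefixlen == iface.network.max_prefixlen
```

Under these assumptions `"10.96.0.10"` and `"203.0.113.0/32"` are candidates for a gap entry, while `"10.0.0.0/16"`, `"127.0.0.1"`, and `"kube-dns"` are not.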
83
+
84
+ ### 8. Magic Port Numbers in Infrastructure
85
+
86
+ Detect non-standard or undocumented port numbers in infrastructure configuration.
87
+
88
+ **Flag these:**
89
+ - Non-standard port numbers without documentation: `containerPort: 8443`, `hostPort: 9999`
90
+ - Port numbers that differ between service and deployment: `port: 80` in Service but `containerPort: 8080` in Pod
91
+ - Hard-coded port ranges in security groups: `from_port = 30000, to_port = 32767`
92
+ - Port numbers in environment variables with literal values: `PORT=3001`
93
+
94
+ **Do NOT flag:**
95
+ - Well-known ports with standard usage: 80 (HTTP), 443 (HTTPS), 22 (SSH), 5432 (PostgreSQL), 3306 (MySQL), 6379 (Redis), 27017 (MongoDB)
96
+ - Ports defined in variables/config and referenced: `var.app_port`
97
+
98
+ **Gap category:** `hard-coded-logic` with infra context in description
99
+
100
+ ### 9. Embedded Secrets and Credential Patterns
101
+
102
+ Detect secrets, credentials, AMI IDs, and sensitive values embedded in IaC or config files.
103
+
104
+ **Flag these (critical severity):**
105
+ - AWS access keys: patterns matching `AKIA[0-9A-Z]{16}` in any file
106
+ - AWS secret keys: base64-like strings assigned to `secret_key` or `aws_secret_access_key`
107
+ - API tokens in config: `token = "ghp_..."`, `api_key = "sk-..."`
108
+ - Database passwords in plaintext: `password = "mysecretpassword"` in tfvars or values.yaml
109
+ - SSH private keys embedded in configs or user-data scripts
110
+
111
+ **Flag these (high severity):**
112
+ - AMI IDs hard-coded: `ami = "ami-0abcdef1234567890"` — should use data source or variable
113
+ - Docker images pinned to a specific digest: `image: myapp@sha256:abc123` when the pin is not intentional
114
+ - Hard-coded AWS account IDs: `account_id = "123456789012"`
115
+ - Hard-coded region strings: `region = "us-east-1"` without variable reference
116
+
117
+ **Gap category:** `hard-coded-logic` (or escalate to `secret-exposure` for actual credentials)
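The credential patterns above lend themselves to a line-oriented regex pass. The following is a minimal sketch: the access key pattern is the one quoted in the list, while the AMI pattern (8 or 17 hex characters after `ami-`), the `kind` labels, and the function name are assumptions added for illustration.

```python
import re

# (severity, kind, pattern). Only the AKIA pattern comes from the prompt;
# the AMI pattern and the kind labels are illustrative assumptions.
PATTERNS = [
    ("critical", "aws-access-key", re.compile(r"\bAKIA[0-9A-Z]{16}\b")),
    ("high", "hard-coded-ami", re.compile(r"\bami-(?:[0-9a-f]{8}|[0-9a-f]{17})\b")),
]


def find_credential_patterns(text: str) -> list[tuple[str, str, int]]:
    """Return (severity, kind, line_number) for each suspicious line.

    The matched value is deliberately not returned, so that a report
    built from these findings does not copy the secret itself.
    """
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for severity, kind, pattern in PATTERNS:
            if pattern.search(line):
                findings.append((severity, kind, lineno))
    return findings
```

A real scan would still need the entropy-style checks implied by "base64-like strings", which a fixed regex cannot cover.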
118
+
119
+ ### 10. Hard-Coded Resource Limits in Infrastructure
120
+
121
+ Detect hard-coded CPU and memory limits in Kubernetes manifests, Terraform configs, and Docker files.
122
+
123
+ **Flag these:**
124
+ - Kubernetes resource requests/limits with literal values:
125
+ ```yaml
126
+ resources:
127
+   requests:
128
+     cpu: "500m"
129
+     memory: "512Mi"
130
+   limits:
131
+     cpu: "1000m"
132
+     memory: "1Gi"
133
+ ```
134
+ These should reference Helm values or kustomize patches for environment-specific tuning.
135
+ - Terraform instance types hard-coded: `instance_type = "t3.medium"` — should be a variable
136
+ - Docker memory limits without variable reference: `--memory="2g"` on `docker run`, or `mem_limit: 2g` in compose files
137
+ - Auto-scaling thresholds: `min_size = 2, max_size = 10` without variables
138
+ - EBS volume sizes: `size = 100` without variable reference
139
+
140
+ **Do NOT flag:**
141
+ - Resource values defined in Helm values.yaml (already externalized)
142
+ - Resource values in Terraform variables (already parameterized)
143
+ - Default values in variable blocks with clear documentation
144
+
145
+ **Gap category:** `hard-coded-logic` with infra context in description
146
+
147
+ ## Acceptable Constant Allowlist
148
+
149
+ Do NOT flag: HTTP status codes, math constants, array indices, standard library constants, test fixture data.
150
+
151
+ ## Stack-Aware Detection Patterns
152
+
153
+ Apply framework-specific patterns based on {tech_stack} (Java/Spring, Node/Express, Python/Django, Go/Gin) as documented in the original E11-S4 specification.
154
+
155
+ ## False Positive Suppression Rules
156
+
157
+ - Configuration files (.yml, .yaml, .properties, .env) are externalized — do not flag
158
+ - Test files contain legitimate test fixtures — skip
159
+ - Framework-specific externalization patterns (Spring @Value, process.env, Django settings) — do not flag
160
+
161
+ ## Output Format
162
+
163
+ Gap entry structure uses `category: "hard-coded-logic"` with `id: "GAP-HARDCODED-{seq}"`.
164
+ For infra-specific findings, include "[INFRA]" prefix in the title for clarity.
165
+ Budget: at most 70 entries; if exceeded, drop the lowest-severity entries first.
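The entry budget can be applied as a sort followed by a cut. This sketch assumes each finding is a dict with a `severity` key drawn from `critical`, `high`, `medium`, and `low`, and that `{seq}` is rendered as a zero-padded three-digit counter; the authoritative field list and numbering format live in `gap-entry-schema.md`, which is not reproduced here.

```python
# Lower rank means more severe; entries are kept in this order.
SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3}
MAX_ENTRIES = 70


def apply_budget(entries: list[dict], limit: int = MAX_ENTRIES) -> list[dict]:
    """Keep at most `limit` entries, dropping the least severe first.

    sorted() is stable, so entries of equal severity keep their
    original discovery order.
    """
    ranked = sorted(entries, key=lambda e: SEVERITY_RANK[e["severity"]])
    kept = ranked[:limit]
    # Assumption: ids are assigned after truncation, numbered from 1.
    return [
        {**entry, "id": f"GAP-HARDCODED-{seq:03d}"}
        for seq, entry in enumerate(kept, start=1)
    ]
```

Assigning ids after the cut keeps the sequence contiguous in the output file, which is a design choice of this sketch rather than a stated requirement.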
166
+
167
+ ## Output File
168
+
169
+ Write all findings to: `{planning_artifacts}/brownfield-scan-hardcoded.md`
@@ -0,0 +1,127 @@
1
+ # Integration Seam Analyzer — Subagent Prompt
2
+
3
+ > Brownfield deep analysis scan subagent. Traces data flows across service boundaries and detects fragile integration points.
4
+ > Reference: Architecture ADR-021, Section 10.15.2, Section 10.15.5, ADR-022 §10.16.5
5
+ > Infra-awareness: E12-S6 — applies infra-specific patterns when project_type is infrastructure or platform.
6
+
7
+ ## Objective
8
+
9
+ Scan the codebase at `{project-path}` to trace data flows across service boundaries, detect fragile integration points, tight coupling, and missing contracts. For infrastructure projects, additionally map service mesh topology, ingress/egress routes, and cross-namespace dependencies.
10
+
11
+ **Input variables:**
12
+ - `{tech_stack}` — Detected technology stack from Step 1 discovery
13
+ - `{project-path}` — Absolute path to the project source code directory
14
+ - `{project_type}` — Project type: `application`, `infrastructure`, or `platform`
15
+
16
+ **Output format:** Follow the gap entry schema at `{project-root}/_gaia/lifecycle/templates/gap-entry-schema.md` exactly.
17
+
18
+ ## Detection Categories — Application Patterns
19
+
20
+ ### 1. HTTP Client Calls (Service-to-Service)
21
+
22
+ Detect outbound HTTP/REST calls:
23
+ - **Java/Spring:** Feign clients, RestTemplate, WebClient, HttpClient
24
+ - **Node/Express:** axios, fetch, got, node-fetch, superagent
25
+ - **Python/Django:** requests, httpx, urllib3, aiohttp
26
+ - **Go:** net/http.Client, resty, go-retryablehttp
27
+
28
+ ### 2. Message Queue Integration
29
+
30
+ Detect message queue producers/consumers:
31
+ - Bull, BullMQ, RabbitMQ (amqplib), Kafka (kafkajs, confluent-kafka), Celery, SQS, NATS
32
+
33
+ ### 3. Database Shared Access
34
+
35
+ Detect multiple services or modules accessing the same database tables.
36
+
37
+ ### 4. Coupling Classification
38
+
39
+ Classify coupling issues:
40
+ - Tightly coupled: shared DB tables, direct internal API calls
41
+ - Missing circuit breaker or retry logic
42
+ - Undocumented external service dependencies
43
+ - Inconsistent serialization formats
44
+
45
+ ### 5. Dependency Graph
46
+
47
+ Generate adjacency list showing service-to-service relationships with connection type and direction.
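As an illustration of the adjacency list described above, the sketch below builds a directed graph from `(source, target, connection_type)` tuples. The tuple shape, the function name, and the service names in the usage note are hypothetical; the prompt does not prescribe a concrete data structure.

```python
def build_adjacency(edges: list[tuple[str, str, str]]) -> dict[str, list[tuple[str, str]]]:
    """Map each service to its outbound (target, connection_type) pairs.

    Direction is source -> target. Services that only receive calls
    still appear as keys, with an empty list.
    """
    graph: dict[str, list[tuple[str, str]]] = {}
    for source, target, kind in edges:
        graph.setdefault(source, []).append((target, kind))
        graph.setdefault(target, [])
    return graph
```

For example, `build_adjacency([("orders", "payments", "http")])` yields an entry for `orders` pointing at `payments` over `http`, plus an empty entry for `payments`.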
48
+
49
+ ## Detection Categories — Infrastructure Patterns (E12-S6)
50
+
51
+ **Apply ONLY when {project_type} is `infrastructure` or `platform`.**
52
+
53
+ ### 6. Service Mesh Topology Mapping
54
+
55
+ Detect and map service mesh configurations and their routing rules.
56
+
57
+ **Scan for:**
58
+ - **Istio:** VirtualService, DestinationRule, Gateway, ServiceEntry, PeerAuthentication
59
+ - `kind: VirtualService` — extract routing rules, traffic splitting percentages, timeout configs
60
+ - `kind: DestinationRule` — extract load balancing policies, circuit breaker settings, TLS modes
61
+ - `kind: Gateway` — extract ingress listeners, TLS configuration, host matching
62
+ - `kind: ServiceEntry` — extract external service registrations
63
+ - `kind: PeerAuthentication` — extract mTLS modes (STRICT, PERMISSIVE, DISABLE)
64
+ - **Linkerd:** ServiceProfile, TrafficSplit, Server, ServerAuthorization
65
+ - **Consul Connect:** ServiceIntention, ServiceRouter, ServiceSplitter, ServiceResolver
66
+
67
+ **Flag these as gaps:**
68
+ - VirtualService without timeout configuration (unbounded request duration)
69
+ - DestinationRule without circuit breaker settings (no fault isolation)
70
+ - PeerAuthentication in PERMISSIVE mode in production (allows plaintext traffic)
71
+ - ServiceEntry for external services without failover configuration
72
+ - Traffic splitting percentages that do not sum to 100%
73
+ - Missing retryOn policies for transient failure codes (5xx, connect-failure)
74
+
75
+ **Severity:** `high` for missing circuit breakers and timeouts, `medium` for permissive mTLS
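Two of the VirtualService checks above (missing timeout, weights not summing to 100) can be sketched against a manifest that has already been parsed into a dict. YAML loading is left out because it needs a third-party parser; the function name and message wording are assumptions.

```python
def virtual_service_gaps(manifest: dict) -> list[str]:
    """Report missing timeouts and bad traffic splits per HTTP route."""
    gaps = []
    for index, http_route in enumerate(manifest.get("spec", {}).get("http", [])):
        if "timeout" not in http_route:
            gaps.append(f"http[{index}]: no timeout configured")
        destinations = http_route.get("route", [])
        # A single destination needs no explicit weight, so only
        # multi-destination routes are checked.
        if len(destinations) > 1:
            total = sum(d.get("weight", 0) for d in destinations)
            if total != 100:
                gaps.append(
                    f"http[{index}]: traffic weights sum to {total}, expected 100"
                )
    return gaps
```

Circuit breaker and retry checks would follow the same shape, reading `DestinationRule` and `retries.retryOn` fields respectively.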
76
+
77
+ ### 7. Ingress/Egress Route Mapping
78
+
79
+ Map all ingress and egress routes to understand traffic flow in and out of the cluster.
80
+
81
+ **Scan for:**
82
+ - Kubernetes Ingress resources: hosts, paths, backend services, TLS config
83
+ - Istio Gateway + VirtualService pairs: external entry points into the mesh
84
+ - AWS ALB Ingress Controller annotations: `alb.ingress.kubernetes.io/*`
85
+ - Nginx Ingress Controller annotations: `nginx.ingress.kubernetes.io/*`
86
+ - Egress rules: NetworkPolicy egress, Istio ServiceEntry for external services, Calico GlobalNetworkPolicy
87
+ - NAT Gateway / Internet Gateway configurations in Terraform
88
+
89
+ **Flag these as gaps:**
90
+ - Ingress routes without TLS/HTTPS enforcement
91
+ - Ingress to services that are also exposed via NodePort (dual exposure)
92
+ - Missing egress restrictions (all outbound traffic allowed by default)
93
+ - External service dependencies without explicit ServiceEntry or egress policy
94
+ - Ingress paths that bypass the service mesh (direct NodePort access)
95
+
96
+ **Severity:** `high` for missing TLS and unrestricted egress, `medium` for dual exposure
97
+
98
+ ### 8. Cross-Namespace Dependency Detection
99
+
100
+ Detect service dependencies that span Kubernetes namespaces.
101
+
102
+ **Scan for:**
103
+ - Service references using FQDN: `{service}.{namespace}.svc.cluster.local`
104
+ - ExternalName services pointing to other namespaces
105
+ - NetworkPolicy rules referencing `namespaceSelector`
106
+ - Istio VirtualService/DestinationRule targeting services in other namespaces
107
+ - ConfigMap or Secret references from other namespaces (via volume mounts or env)
108
+ - ServiceAccount tokens shared across namespaces
109
+
110
+ **Flag these as gaps:**
111
+ - Cross-namespace service calls without NetworkPolicy allowing the traffic
112
+ - Cross-namespace dependencies without documented ownership or SLA
113
+ - Hard-coded namespace names in service URLs (fragile to namespace renaming)
114
+ - Cross-namespace secret sharing without RBAC scoping
115
+ - Circular cross-namespace dependencies (A -> B -> A)
116
+
117
+ **Severity:** `high` for undocumented cross-namespace dependencies, `medium` for hard-coded namespaces
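A rough sketch of the FQDN scan and the simplest circular case (A -> B -> A) follows. The regex, both function names, and the `deps` mapping shape are assumptions; longer cycles would need a full graph traversal, which is not shown.

```python
import re

# Matches {service}.{namespace}.svc.cluster.local and captures both parts.
FQDN_RE = re.compile(r"\b([a-z0-9-]+)\.([a-z0-9-]+)\.svc\.cluster\.local\b")


def cross_namespace_refs(text: str, own_namespace: str) -> list[tuple[str, str]]:
    """Return sorted (service, namespace) pairs that leave own_namespace."""
    return sorted(
        {(svc, ns) for svc, ns in FQDN_RE.findall(text) if ns != own_namespace}
    )


def mutual_namespace_pairs(deps: dict[str, set[str]]) -> list[tuple[str, str]]:
    """Return namespace pairs that depend on each other directly.

    `deps` maps a namespace to the set of namespaces it calls.
    """
    pairs = set()
    for source, targets in deps.items():
        for target in targets:
            if source != target and source in deps.get(target, set()):
                pairs.add(tuple(sorted((source, target))))
    return sorted(pairs)
```

Feeding the output of `cross_namespace_refs` for every namespace into `deps` gives the input for the mutual-dependency check.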
118
+
119
+ ## Output Format
120
+
121
+ Gap entry structure uses `category: "integration-seam"` with `id: "GAP-INTEGRATION-{seq}"`.
122
+ For infra-specific findings, include "[INFRA]" prefix in the title for clarity.
123
+ Budget: at most 70 entries; if exceeded, drop the lowest-severity entries first.
124
+
125
+ ## Output File
126
+
127
+ Write all findings to: `{planning_artifacts}/brownfield-scan-integration-seam.md`
@@ -0,0 +1,141 @@
1
+ # Runtime Behavior Inventory Scanner — Subagent Prompt
2
+
3
+ > Brownfield deep analysis scan subagent. Catalogs runtime behaviors that only manifest during execution.
4
+ > Reference: Architecture ADR-021, Section 10.15.2, Section 10.15.5, ADR-022 §10.16.5
5
+ > Infra-awareness: E12-S6 — applies infra-specific patterns when project_type is infrastructure or platform.
6
+
7
+ ## Objective
8
+
9
+ Scan the codebase at `{project-path}` to catalog runtime behaviors — scheduled tasks, background processes, startup hooks, shutdown handlers, and behaviors that are not visible from static code structure alone.
10
+
11
+ **Input variables:**
12
+ - `{tech_stack}` — Detected technology stack from Step 1 discovery
13
+ - `{project-path}` — Absolute path to the project source code directory
14
+ - `{project_type}` — Project type: `application`, `infrastructure`, or `platform`
15
+
16
+ **Output format:** Follow the gap entry schema at `{project-root}/_gaia/lifecycle/templates/gap-entry-schema.md` exactly.
17
+
18
+ ## Detection Categories — Application Patterns
19
+
20
+ ### 1. Scheduled Tasks and Cron Jobs
21
+
22
+ Detect application-level scheduled tasks:
23
+ - **Java/Spring:** `@Scheduled`, `@EnableScheduling`, Quartz `@DisallowConcurrentExecution`
24
+ - **Node/Express:** `node-cron`, `agenda`, `bull` queue scheduled jobs, `setInterval` for polling
25
+ - **Python/Django:** Celery `@periodic_task`, `celery.conf.beat_schedule`, `django-crontab`
26
+ - **Go:** `robfig/cron`, `time.Ticker`, goroutine polling loops
27
+
28
+ ### 2. Startup and Shutdown Hooks
29
+
30
+ Detect application lifecycle hooks:
31
+ - **Java/Spring:** `@PostConstruct`, `@PreDestroy`, `ApplicationListener`, `CommandLineRunner`
32
+ - **Node/Express:** `process.on('SIGTERM')`, `process.on('SIGINT')`, `beforeExit`
33
+ - **Python/Django:** `AppConfig.ready()`, `atexit.register`, signal handlers
34
+ - **Go:** `os.Signal` handling, `defer` patterns in main(), `sync.Once`
35
+
36
+ ### 3. Background Workers and Async Processors
37
+
38
+ Detect background processing patterns:
39
+ - Message queue consumers (Bull, SQS, Kafka, RabbitMQ consumers)
40
+ - Worker threads, child processes, goroutines for long-running tasks
41
+ - WebSocket connection handlers
42
+ - File watchers and directory monitors
43
+
44
+ ### 4. Race Conditions and Concurrency Risks
45
+
46
+ Detect patterns prone to race conditions:
47
+ - Shared mutable state without synchronization
48
+ - Non-atomic read-modify-write sequences
49
+ - Missing database transaction boundaries on multi-step operations
50
+
51
+ ## Detection Categories — Infrastructure Patterns (E12-S6)
52
+
53
+ **Apply ONLY when {project_type} is `infrastructure` or `platform`.**
54
+
55
+ ### 5. CronJob Detection
56
+
57
+ Detect Kubernetes CronJob resources and their scheduling patterns.
58
+
59
+ **Scan for:**
60
+ - `kind: CronJob` in Kubernetes manifests
61
+ - `spec.schedule` field — extract the cron expression
62
+ - `spec.concurrencyPolicy` — flag if missing (defaults to `Allow`, may cause overlapping runs)
63
+ - `spec.startingDeadlineSeconds` — flag if missing (no deadline for missed schedules)
64
+ - `spec.successfulJobsHistoryLimit` / `spec.failedJobsHistoryLimit` — flag if set to 0 (no history retained)
65
+ - `spec.suspend` — note if suspended (informational)
66
+
67
+ **Flag these as gaps:**
68
+ - CronJobs without `concurrencyPolicy: Forbid` or `Replace` (risk of overlapping runs)
69
+ - CronJobs without `startingDeadlineSeconds` (missed jobs may accumulate)
70
+ - CronJobs without resource limits on their pod template
71
+ - CronJobs with `restartPolicy: Always` (CronJob pods should use `OnFailure` or `Never`)
72
+
73
+ **Severity:** `medium` for missing policies, `high` for incorrect restart policies
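The CronJob checks above could be expressed as a function over a manifest already parsed into a dict (parsing itself needs a YAML library and is omitted). The function name and message strings are hypothetical, and the `medium` severity for missing resource limits is an assumption, since the severity line does not cover that case.

```python
def cronjob_gaps(manifest: dict) -> list[tuple[str, str]]:
    """Return (severity, message) pairs for a parsed CronJob manifest."""
    spec = manifest.get("spec", {})
    pod_spec = (
        spec.get("jobTemplate", {})
        .get("spec", {})
        .get("template", {})
        .get("spec", {})
    )
    gaps = []
    # An omitted concurrencyPolicy behaves as Allow.
    if spec.get("concurrencyPolicy", "Allow") not in ("Forbid", "Replace"):
        gaps.append(("medium", "concurrencyPolicy is not Forbid or Replace"))
    if "startingDeadlineSeconds" not in spec:
        gaps.append(("medium", "startingDeadlineSeconds is not set"))
    if pod_spec.get("restartPolicy") == "Always":
        gaps.append(("high", "restartPolicy is Always"))
    for container in pod_spec.get("containers", []):
        if not container.get("resources", {}).get("limits"):
            # Severity here is an assumption of this sketch.
            gaps.append(
                ("medium", f"container {container.get('name')} has no resource limits")
            )
    return gaps
```

Each returned pair maps naturally onto one gap entry, with the message as the basis for the title.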
74
+
75
+ ### 6. DaemonSet Detection
76
+
77
+ Detect Kubernetes DaemonSet resources and their node scheduling.
78
+
79
+ **Scan for:**
80
+ - `kind: DaemonSet` in Kubernetes manifests
81
+ - `spec.updateStrategy` — flag if missing or set to `OnDelete` (prefer `RollingUpdate`)
82
+ - `spec.template.spec.tolerations` — catalog which node taints are tolerated
83
+ - `spec.template.spec.nodeSelector` — catalog node selection criteria
84
+ - `spec.template.spec.priorityClassName` — note if using system priority classes
85
+
86
+ **Flag these as gaps:**
87
+ - DaemonSets without an explicit `updateStrategy`, or with `OnDelete` (which requires manual pod deletion; note that `apps/v1` defaults to `RollingUpdate` when the field is omitted)
88
+ - DaemonSets without resource requests/limits (can starve node resources)
89
+ - DaemonSets with `hostNetwork: true` without documented justification
90
+ - DaemonSets without `terminationGracePeriodSeconds` set appropriately
91
+
92
+ **Severity:** `medium` for missing update strategy, `high` for unbounded resource usage
93
+
94
+ ### 7. Init Container and Sidecar Pattern Detection
95
+
96
+ Detect init containers and sidecar container patterns in Kubernetes Pods.
97
+
98
+ **Scan for:**
99
+ - `spec.initContainers` in Pod specs — catalog each init container's purpose
100
+ - Multi-container pods where one container serves as a sidecar (log collector, proxy, metrics agent)
101
+ - Istio/Envoy sidecar injection annotations: `sidecar.istio.io/inject: "true"`
102
+ - Init containers that run database migrations, config loading, or secret fetching
103
+ - Sidecar containers for: logging (fluentd, filebeat), monitoring (prometheus exporter), proxying (envoy, nginx)
104
+
105
+ **Flag these as gaps:**
106
+ - Init containers without resource limits (can block pod startup indefinitely)
107
+ - Init containers without timeout or failure handling
108
+ - Sidecar containers without health checks (liveness/readiness probes)
109
+ - Multi-container pods without clear documentation of container roles
110
+
111
+ **Severity:** `medium` for missing resource limits, `low` for missing documentation
112
+
113
+ ### 8. Health Probe Detection (Liveness, Readiness, Startup)
114
+
115
+ Detect the presence and configuration of Kubernetes health probes.
116
+
117
+ **Scan for:**
118
+ - `livenessProbe` — checks if the container is running; restarts on failure
119
+ - `readinessProbe` — checks if the container can serve traffic; removes from service on failure
120
+ - `startupProbe` — checks if the application has started; disables liveness/readiness until success
121
+
122
+ **Flag these as gaps:**
123
+ - Containers without `livenessProbe` (no automatic restart on hang)
124
+ - Containers without `readinessProbe` (may receive traffic before ready)
125
+ - Long-starting containers without `startupProbe` (liveness probe may kill them during startup)
126
+ - Probes with `initialDelaySeconds: 0` and no `startupProbe` (may restart healthy containers during startup)
127
+ - Probes using `exec` commands that could be expensive (e.g., database queries as health checks)
128
+ - Liveness and readiness probes pointing to the same endpoint (if the endpoint is slow, both fail simultaneously)
129
+ - Missing `periodSeconds`, `timeoutSeconds`, `failureThreshold` customization (relying on defaults may not suit the workload)
130
+
131
+ **Severity:** `high` for missing liveness/readiness probes, `medium` for suboptimal probe configuration
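A subset of the probe checks above, sketched for a single container spec that has already been parsed into a dict. The function name and messages are hypothetical, and only `httpGet` probes are compared for the shared-endpoint case; `exec` and `tcpSocket` probes would need their own comparison.

```python
def probe_gaps(container: dict) -> list[tuple[str, str]]:
    """Return (severity, message) pairs for one container spec."""
    gaps = []
    liveness = container.get("livenessProbe")
    readiness = container.get("readinessProbe")
    if liveness is None:
        gaps.append(("high", "missing livenessProbe"))
    if readiness is None:
        gaps.append(("high", "missing readinessProbe"))
    if (
        liveness
        and readiness
        and liveness.get("httpGet")
        and liveness.get("httpGet") == readiness.get("httpGet")
    ):
        gaps.append(("medium", "liveness and readiness share an endpoint"))
    if (
        liveness
        and liveness.get("initialDelaySeconds", 0) == 0
        and "startupProbe" not in container
    ):
        gaps.append(("medium", "liveness starts immediately with no startupProbe"))
    return gaps
```

Running it over every entry in `spec.containers` of a Pod template gives one list of findings per container.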
132
+
133
+ ## Output Format
134
+
135
+ Gap entry structure uses `category: "runtime-behavior"` with `id: "GAP-RUNTIME-{seq}"`.
136
+ For infra-specific findings, include "[INFRA]" prefix in the title for clarity.
137
+ Budget: at most 70 entries; if exceeded, drop the lowest-severity entries first.
138
+
139
+ ## Output File
140
+
141
+ Write all findings to: `{planning_artifacts}/brownfield-scan-runtime-behavior.md`