@blamejs/exceptd-skills 0.9.1
- package/AGENTS.md +232 -0
- package/ARCHITECTURE.md +267 -0
- package/CHANGELOG.md +616 -0
- package/CONTEXT.md +203 -0
- package/LICENSE +200 -0
- package/NOTICE +82 -0
- package/README.md +307 -0
- package/SECURITY.md +73 -0
- package/agents/README.md +81 -0
- package/agents/report-generator.md +156 -0
- package/agents/skill-updater.md +102 -0
- package/agents/source-validator.md +119 -0
- package/agents/threat-researcher.md +149 -0
- package/bin/exceptd.js +183 -0
- package/data/_indexes/_meta.json +88 -0
- package/data/_indexes/activity-feed.json +362 -0
- package/data/_indexes/catalog-summaries.json +229 -0
- package/data/_indexes/chains.json +7135 -0
- package/data/_indexes/currency.json +359 -0
- package/data/_indexes/did-ladders.json +451 -0
- package/data/_indexes/frequency.json +2072 -0
- package/data/_indexes/handoff-dag.json +476 -0
- package/data/_indexes/jurisdiction-clocks.json +967 -0
- package/data/_indexes/jurisdiction-map.json +536 -0
- package/data/_indexes/recipes.json +319 -0
- package/data/_indexes/section-offsets.json +3656 -0
- package/data/_indexes/stale-content.json +14 -0
- package/data/_indexes/summary-cards.json +1736 -0
- package/data/_indexes/theater-fingerprints.json +381 -0
- package/data/_indexes/token-budget.json +2137 -0
- package/data/_indexes/trigger-table.json +1374 -0
- package/data/_indexes/xref.json +818 -0
- package/data/atlas-ttps.json +282 -0
- package/data/cve-catalog.json +496 -0
- package/data/cwe-catalog.json +1017 -0
- package/data/d3fend-catalog.json +738 -0
- package/data/dlp-controls.json +1039 -0
- package/data/exploit-availability.json +67 -0
- package/data/framework-control-gaps.json +1255 -0
- package/data/global-frameworks.json +2913 -0
- package/data/rfc-references.json +324 -0
- package/data/zeroday-lessons.json +377 -0
- package/keys/public.pem +3 -0
- package/lib/framework-gap.js +328 -0
- package/lib/job-queue.js +195 -0
- package/lib/lint-skills.js +536 -0
- package/lib/prefetch.js +372 -0
- package/lib/refresh-external.js +713 -0
- package/lib/schemas/cve-catalog.schema.json +151 -0
- package/lib/schemas/manifest.schema.json +106 -0
- package/lib/schemas/skill-frontmatter.schema.json +113 -0
- package/lib/scoring.js +149 -0
- package/lib/sign.js +197 -0
- package/lib/ttp-mapper.js +80 -0
- package/lib/validate-catalog-meta.js +198 -0
- package/lib/validate-cve-catalog.js +213 -0
- package/lib/validate-indexes.js +83 -0
- package/lib/validate-package.js +162 -0
- package/lib/validate-vendor.js +85 -0
- package/lib/verify.js +216 -0
- package/lib/worker-pool.js +84 -0
- package/manifest-snapshot.json +1833 -0
- package/manifest.json +2108 -0
- package/orchestrator/README.md +124 -0
- package/orchestrator/dispatcher.js +140 -0
- package/orchestrator/event-bus.js +146 -0
- package/orchestrator/index.js +874 -0
- package/orchestrator/pipeline.js +201 -0
- package/orchestrator/scanner.js +327 -0
- package/orchestrator/scheduler.js +137 -0
- package/package.json +113 -0
- package/sbom.cdx.json +158 -0
- package/scripts/audit-cross-skill.js +261 -0
- package/scripts/audit-perf.js +160 -0
- package/scripts/bootstrap.js +205 -0
- package/scripts/build-indexes.js +721 -0
- package/scripts/builders/activity-feed.js +79 -0
- package/scripts/builders/catalog-summaries.js +67 -0
- package/scripts/builders/currency.js +109 -0
- package/scripts/builders/cwe-chains.js +105 -0
- package/scripts/builders/did-ladders.js +149 -0
- package/scripts/builders/frequency.js +89 -0
- package/scripts/builders/jurisdiction-clocks.js +126 -0
- package/scripts/builders/recipes.js +159 -0
- package/scripts/builders/section-offsets.js +162 -0
- package/scripts/builders/stale-content.js +171 -0
- package/scripts/builders/summary-cards.js +166 -0
- package/scripts/builders/theater-fingerprints.js +198 -0
- package/scripts/builders/token-budget.js +96 -0
- package/scripts/check-manifest-snapshot.js +217 -0
- package/scripts/predeploy.js +267 -0
- package/scripts/refresh-manifest-snapshot.js +57 -0
- package/scripts/refresh-sbom.js +222 -0
- package/skills/age-gates-child-safety/skill.md +456 -0
- package/skills/ai-attack-surface/skill.md +282 -0
- package/skills/ai-c2-detection/skill.md +440 -0
- package/skills/ai-risk-management/skill.md +311 -0
- package/skills/api-security/skill.md +287 -0
- package/skills/attack-surface-pentest/skill.md +381 -0
- package/skills/cloud-security/skill.md +384 -0
- package/skills/compliance-theater/skill.md +365 -0
- package/skills/container-runtime-security/skill.md +379 -0
- package/skills/coordinated-vuln-disclosure/skill.md +473 -0
- package/skills/defensive-countermeasure-mapping/skill.md +300 -0
- package/skills/dlp-gap-analysis/skill.md +337 -0
- package/skills/email-security-anti-phishing/skill.md +206 -0
- package/skills/exploit-scoring/skill.md +331 -0
- package/skills/framework-gap-analysis/skill.md +374 -0
- package/skills/fuzz-testing-strategy/skill.md +313 -0
- package/skills/global-grc/skill.md +564 -0
- package/skills/identity-assurance/skill.md +272 -0
- package/skills/incident-response-playbook/skill.md +546 -0
- package/skills/kernel-lpe-triage/skill.md +303 -0
- package/skills/mcp-agent-trust/skill.md +326 -0
- package/skills/mlops-security/skill.md +325 -0
- package/skills/ot-ics-security/skill.md +340 -0
- package/skills/policy-exception-gen/skill.md +437 -0
- package/skills/pqc-first/skill.md +546 -0
- package/skills/rag-pipeline-security/skill.md +294 -0
- package/skills/researcher/skill.md +310 -0
- package/skills/sector-energy/skill.md +409 -0
- package/skills/sector-federal-government/skill.md +302 -0
- package/skills/sector-financial/skill.md +398 -0
- package/skills/sector-healthcare/skill.md +373 -0
- package/skills/security-maturity-tiers/skill.md +464 -0
- package/skills/skill-update-loop/skill.md +463 -0
- package/skills/supply-chain-integrity/skill.md +318 -0
- package/skills/threat-model-currency/skill.md +404 -0
- package/skills/threat-modeling-methodology/skill.md +312 -0
- package/skills/webapp-security/skill.md +281 -0
- package/skills/zeroday-gap-learn/skill.md +350 -0
- package/vendor/blamejs/LICENSE +201 -0
- package/vendor/blamejs/README.md +54 -0
- package/vendor/blamejs/_PROVENANCE.json +54 -0
- package/vendor/blamejs/retry.js +335 -0
- package/vendor/blamejs/worker-pool.js +418 -0
@@ -0,0 +1,379 @@
---
name: container-runtime-security
version: "1.0.0"
description: Container + Kubernetes runtime security for mid-2026 — CIS K8s Benchmark, NSA/CISA Hardening, Pod Security Standards, Kyverno/Gatekeeper admission, Sigstore policy-controller, eBPF runtime detection (Falco/Tetragon), AI inference workload hardening
triggers:
- container security
- kubernetes security
- k8s security
- cis kubernetes
- nsa hardening
- pod security standards
- kyverno
- gatekeeper
- opa
- falco
- tetragon
- sigstore policy
- admission controller
- networkpolicy
- cilium
- kserve
- vllm
data_deps:
- cve-catalog.json
- atlas-ttps.json
- framework-control-gaps.json
- cwe-catalog.json
- d3fend-catalog.json
- rfc-references.json
atlas_refs:
- AML.T0010
attack_refs:
- T1610
- T1611
- T1068
- T1190
framework_gaps:
- NIST-800-53-CM-7
- ISO-27001-2022-A.8.28
- SLSA-v1.0-Build-L3
rfc_refs:
- RFC-8446
- RFC-8032
cwe_refs:
- CWE-269
- CWE-732
- CWE-1188
- CWE-787
- CWE-1395
d3fend_refs:
- D3-EAL
- D3-EHB
- D3-PSEP
- D3-NI
- D3-NTPM
- D3-IOPR
last_threat_review: "2026-05-11"
---

# Container + Kubernetes Runtime Security (mid-2026)

## Threat Context (mid-2026)

Kubernetes is no longer "the cloud-native orchestrator." It is the AI inference runtime. KServe, vLLM, Triton Inference Server, Ray Serve, Seldon, BentoML, and Hugging Face Text Generation Inference (TGI) all ship as K8s workloads. Anywhere there is a production LLM endpoint in mid-2026 there is a K8s cluster underneath it, and the cluster's hardening posture is the LLM endpoint's hardening posture.

The dominant attack class is container escape (ATT&CK T1611). Two vectors drive it:

- **Kernel LPEs as container-escape primitives.** Copy Fail (CVE-2026-31431) is the canonical mid-2026 example: any unprivileged container on a Linux 4.14+ host that has not been live-patched is a kernel-LPE-driven host takeover. Hand off the host-level kernel triage to `kernel-lpe-triage`; the container-side reality is that namespace + seccomp + capability drops do not stop an in-kernel write primitive once the syscall surface is reachable, and most clusters do not deploy a syscall-restricting profile beyond the broken default. Earlier-era container-runtime CVEs (runc CVE-2024-21626 "Leaky Vessels," containerd CVE-2024-21626 family, CRI-O CVE-2024-3154 class) remain instructive even where patched — they established that "the runtime is the perimeter" is wrong and that the host kernel is the perimeter.
- **Misconfigured admission and RBAC.** Privileged pods, `hostPID`, `hostNetwork`, mounted Docker sockets, wildcard ClusterRoles, and ServiceAccounts with `create`/`patch` on `pods` or `secrets` turn the compromise of a single pod or node into compromise of the whole cluster. Tooling such as Peirates and kube-hunter, along with offensive use of kube-bench results, enumerates these in minutes.

The Pod Security Standards (Privileged / Baseline / Restricted) replaced PodSecurityPolicy in K8s 1.25 (PSP removed) and are enforced by the built-in PSA admission controller. The unhappy reality across the install base in mid-2026: PSS-Restricted is the documented target, PSS-Baseline is the typical actual posture, and a non-trivial fraction of production namespaces run effectively Privileged because operators label the namespace `pod-security.kubernetes.io/enforce: privileged` to unblock a vendor Helm chart and never roll it back.
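As a minimal sketch of the label-based enforcement described above, the manifest below pins a namespace to the Restricted profile. The namespace name `inference-prod` is hypothetical. The label keys are the standard Pod Security Admission labels; setting `audit` and `warn` to the same level is an assumption about sensible practice, not something the text above prescribes.

```yaml
# Hypothetical namespace; only the label keys are standard PSA labels.
apiVersion: v1
kind: Namespace
metadata:
  name: inference-prod            # assumed name, for illustration only
  labels:
    # Reject pods that violate the Restricted profile at admission time.
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: latest
    # Also record violations in the audit log and warn the submitting client.
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted
```

Running `kubectl get ns -L pod-security.kubernetes.io/enforce` prints the enforced level for each namespace, which makes a forgotten `privileged` override easier to spot.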

Admission policy is the next layer. Kyverno (CNCF Incubating, in wide enterprise use by 2026) and OPA Gatekeeper are the dominant policy engines; both gate pod creation against declarative rules. Mature programs use admission policy to verify image signatures via Sigstore policy-controller (part of the OpenSSF-hosted Sigstore project, with the `policy.sigstore.dev/v1beta1` `ClusterImagePolicy` CRD), so that only signed images from approved publisher identities are admitted. Hand off the upstream signing pipeline to `supply-chain-integrity`; the container-runtime concern is admission-time verification, not build-time signing.
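The admission-time verification described above could look roughly like the following `ClusterImagePolicy`. This is a sketch under stated assumptions: the registry glob, the OIDC issuer, and the workflow subject are hypothetical placeholders, and the field layout should be checked against the policy-controller version actually installed.

```yaml
# Sketch only: registry, issuer, and subject values are hypothetical.
apiVersion: policy.sigstore.dev/v1beta1
kind: ClusterImagePolicy
metadata:
  name: require-signed-images     # assumed name
spec:
  images:
    # Which image references this policy applies to.
    - glob: "registry.example.com/**"
  authorities:
    - keyless:
        # Fulcio instance that issued the signing certificate.
        url: https://fulcio.sigstore.dev
        identities:
          # Pin the publisher identity, not just "some valid signature".
          - issuer: https://token.actions.githubusercontent.com
            subject: https://github.com/example-org/example-repo/.github/workflows/release.yml@refs/heads/main
      ctlog:
        # Transparency log in which the signature must be recorded.
        url: https://rekor.sigstore.dev
```

As far as the upstream documentation describes, policy-controller only evaluates namespaces that opt in with the label `policy.sigstore.dev/include: "true"`, so an unlabeled namespace would admit unsigned images even with this policy present.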

Runtime detection catches what admission misses. Falco (CNCF Graduated, libbpf-based by default in 2026 with the kernel-module fallback deprecated) and Tetragon (a Cilium sub-project originated by Isovalent, eBPF-only) are the canonical eBPF detection stacks. They observe syscalls, network flows, and process-exec events at kernel speed and either emit policy-driven alerts (Falco) or apply in-kernel preventive enforcement (Tetragon's TracingPolicy CRD with `Sigkill` action). The NSA/CISA Kubernetes Hardening Guide v1.2 (August 2022) is still the de facto US baseline as of mid-2026 — no v1.3 has shipped — and treats runtime detection as a defense-in-depth pillar.
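To make the alert-versus-enforce distinction concrete, here are two hedged sketches. The first is a Falco rule that alerts when a shell starts inside a pod in a hypothetical `inference-prod` namespace; it assumes the `spawned_process` and `container` macros and the `shell_binaries` list from the default Falco ruleset are loaded.

```yaml
# Hypothetical Falco rule; relies on macros/lists from the default ruleset.
- rule: Shell Spawned In Inference Pod
  desc: Detect a shell starting inside a pod in the (assumed) inference namespace
  condition: >
    spawned_process and container
    and proc.name in (shell_binaries)
    and k8s.ns.name = "inference-prod"
  output: >
    Shell in inference pod (user=%user.name pod=%k8s.pod.name
    image=%container.image.repository cmdline=%proc.cmdline)
  priority: WARNING
  tags: [container, shell]
```

The second is a Tetragon `TracingPolicy` that kills any process outside the host PID namespace that attempts to load a kernel module. The policy name and the choice of `finit_module` as the hooked syscall are illustrative assumptions; verify the schema against the Tetragon release in use before relying on it.

```yaml
# Sketch only: name and hooked syscall are illustrative choices.
apiVersion: cilium.io/v1alpha1
kind: TracingPolicy
metadata:
  name: kill-module-load-from-containers
spec:
  kprobes:
    - call: "sys_finit_module"
      syscall: true
      args:
        - index: 0
          type: "int"             # file descriptor of the module image
      selectors:
        # Match only processes that are not in the host PID namespace.
        - matchNamespaces:
            - namespace: Pid
              operator: NotIn
              values:
                - "host_ns"
          matchActions:
            # Terminate the calling process in-kernel.
            - action: Sigkill
```

Enforcement actions are best reserved for high-confidence signatures like this one, since a false positive kills a workload process rather than raising an alert.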

NetworkPolicy is the default-deny lateral-movement control. Stock NetworkPolicy is L3/L4 only; Cilium NetworkPolicy and CiliumClusterwideNetworkPolicy (eBPF datapath, CNCF Graduated 2023) add L7 (HTTP method, gRPC service, Kafka topic, DNS FQDN) and identity-aware policy via SPIFFE identities. Service mesh (Istio, Linkerd, Cilium Service Mesh) supplies mTLS between workloads by default. Without a default-deny NetworkPolicy posture per namespace, "zero trust between pods" is a slide.
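A per-namespace default-deny posture can be sketched with a single stock NetworkPolicy. The namespace name is a hypothetical placeholder; the rest uses only the upstream `networking.k8s.io/v1` API.

```yaml
# Default-deny for every pod in one namespace (namespace name is assumed).
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: inference-prod
spec:
  # An empty podSelector selects all pods in the namespace.
  podSelector: {}
  # Listing both types with no allow rules denies all ingress and egress.
  policyTypes:
    - Ingress
    - Egress
```

Because egress is denied as well, DNS lookups fail until an explicit allow rule for the cluster DNS service is added, so each required flow then has to be allowed deliberately. The policy only takes effect when the cluster's CNI plugin implements NetworkPolicy.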

The AI-inference-workload wrinkle. KServe `InferenceService` and vLLM deployments typically run as privileged-adjacent pods: GPU device plugins require host device access (`/dev/nvidia*`), model weights are mounted from PVCs or hostPath, and the model server itself loads code-executing serialization formats (PyTorch `.pt`, Python's native serialization) unless the pipeline enforces safetensors. A compromise of the inference pod yields not just an RCE foothold but model-weight exfiltration, training-data leak (via inference-time embedding extraction), and a privileged GPU container that bypasses most generic container hardening guidance. The CIS Kubernetes Benchmark v1.10 (early 2025) does not specifically cover AI inference workload patterns; NIST 800-190 (Application Container Security, September 2017) predates the AI workload class entirely.

State of standards baselines:

- **CIS Kubernetes Benchmark v1.10** (covering K8s 1.30–1.31) — current as of mid-2026. CIS lags upstream K8s by 1–2 minor releases; v1.11 expected to cover K8s 1.32+.
- **NSA/CISA Kubernetes Hardening Guide v1.2** (August 2022) — still the latest revision; voluntary; widely cited in US federal and CI sectors.
- **NIST 800-190** (September 2017) — superseded by reality on every page; NIST has not yet published a 2nd revision. Forward-watch.
- **OWASP Kubernetes Top 10** (2022, refreshed 2024) — the offensive-perspective companion to CIS.
- **Pod Security Standards** — built-in (`kubernetes.io/pod-security`), three levels, enforced via PSA admission.

## Framework Lag Declaration

| Framework | Control | Designed For | Fails Because |
|---|---|---|---|
| CIS Kubernetes Benchmark v1.10 | Master/worker/policy/managed-services controls | K8s 1.30–1.31 hardened baseline | Lags upstream K8s by 1–2 minor releases. Treats every cluster identically — no carve-out for AI inference workloads needing GPU device plugins, hostPath model PVCs, or large-memory huge-page configurations. Compliance-tool output (kube-bench) is interpreted as "the security posture" instead of "one input to the posture." |
| NSA/CISA Kubernetes Hardening Guide v1.2 (Aug 2022) | Pod security, network separation, authentication/authorization, audit logging, threat detection, upgrades | US federal + CI baseline | Voluntary. Predates Pod Security Standards going GA (K8s 1.25, Aug 2022 — only just). Predates the AI-inference-as-K8s-workload pattern entirely. No update through mid-2026; treats Falco/eBPF detection as "consider" rather than baseline. |
| NIST 800-190 Application Container Security | Container image, registry, orchestrator, host OS, container runtime risks and countermeasures | 2017-era container baseline (Docker-era) | Predates Pod Security Standards, Kyverno/Gatekeeper, Sigstore, eBPF runtime detection, and the AI inference workload class. No revision through 2026. Auditors still cite it because it is the most concrete NIST container document — that is the gap. |
| OWASP Kubernetes Top 10 (2022, refresh 2024) | Top-10 misconfigurations and attack patterns | Offensive-perspective awareness | Awareness document, not a controls catalog. K01 Insecure Workload Configurations and K02 Supply Chain Vulnerabilities map cleanly but stop short of admission-policy and runtime-detection prescription. |
| Pod Security Standards (Privileged / Baseline / Restricted) | Built-in PSA admission controller policy levels | K8s 1.25+ pod-level security profile | PSS is namespace-scoped via labels (`pod-security.kubernetes.io/enforce`). Privileged-namespace overrides are silent in audit unless explicit reporting layers (Kyverno PolicyReports, Kubescape) surface them. PSS does not cover network, RBAC, image-signing, or runtime — it is one of six required layers. |
| NIST 800-53 Rev 5 CM-7 (Least Functionality) | Configuration-management baseline | Method-neutral configuration hardening | Method-neutral on K8s entirely. CM-7 is satisfied by "we disabled unused features"; no requirement that Pod Security Standards Restricted be the namespace default. See `data/framework-control-gaps.json` `NIST-800-53-CM-7`. |
| ISO 27001:2022 A.8.28 (Secure coding) | Annex A 2022 refresh | Generic ISMS secure-coding control | Method-neutral. Does not address container image hardening, admission policy, or runtime detection. ISO 27002:2022 implementation guidance is high-level. See `data/framework-control-gaps.json` `ISO-27001-2022-A.8.28`. |
| SLSA v1.0 Build L3 | Hardened-builder provenance | Build-pipeline integrity | SLSA L3 attests how the image was built — not whether it is verified at admission. SLSA L3 evidence with no `ClusterImagePolicy` enforcement on the cluster is build-side theater. Hand off the build side to `supply-chain-integrity`. See `data/framework-control-gaps.json` `SLSA-v1.0-Build-L3`. |
| EU NIS2 Directive Art. 21 | Risk management measures for essential/important entities | EU-wide cybersecurity baseline | "Appropriate technical and organisational measures." Member-State authorities (ENISA NIS Cooperation Group guidance) increasingly cite CIS Benchmarks and NSA/CISA Hardening Guide as the operational floor; the Directive itself does not. |
| EU CRA (Regulation 2024/2847) | Annex I essential cybersecurity requirements for products with digital elements | Products placed on EU market | OT-style application to brownfield K8s distros (Red Hat OpenShift, SUSE Rancher, VMware Tanzu, Mirantis k0s) is in scope; managed services (EKS, GKE, AKS) are typically out of scope as services. Implementing acts through Dec 2027 may tighten container-runtime expectations. |
| UK NCSC CAF v3.2 (2024) | Cyber Assessment Framework outcomes (B2 Identity, B4 System Security, B5 Resilient Networks) | UK CNI outcome-focused assessment | Outcome-focused; sound principles but no K8s-specific operational floor. B2.b "technical configuration" maps to CIS K8s Benchmark in practice without naming it. |
| AU ISM (Sep 2024 update) | Control 1739 (containers) and Application Control (E8 ML2/ML3) | AU government and CI baseline | ISM 1739 references container hardening at high level; does not pin CIS K8s Benchmark version or PSS profile. E8 Application Control closest analogue is Kyverno verify-images + Sigstore policy-controller, not named in ISM. |
| JP NISC Container Security Guidance | NISC critical-infrastructure container guidelines | JP CI sectors | Sector-level policy; defers operational specifics to METI guidance. AI inference workload patterns not yet codified. |
| IL INCD Container Hardening | INCD national directive on container security | IL CI operators | OT-style application; AI inference workloads in K8s not yet codified. |
| SG GovTech Cloud-Native Standards | TRM container baseline + MAS TRM cloud guidance | SG public-sector + financial-sector cloud | Strong on managed-service usage; admission policy and eBPF runtime detection not yet pinned. |
| TW CSMA + FSC Cloud Guidance | National critical-infrastructure cyber + financial-sector cloud | TW CII operators | CSMA Art. 14 supplier-risk obligations apply to K8s distro vendors; technical container floor not pinned. |
| US FedRAMP Rev 5 + DoD SRG IL2/IL4/IL5 | Federal cloud authorization baseline + DoD impact-level baselines | Federal cloud workloads | FedRAMP Rev 5 inherits NIST 800-53 CM-7 / SI-7. CIS Kubernetes Benchmark is increasingly expected as ATO evidence but not mandated in Rev 5 baseline text. DoD SRG IL5 in practice requires CIS K8s Benchmark + STIG. |
| ISO 27001:2022 + ISO/IEC 27017 (cloud) | A.8.28, A.8.9 + cloud sector extension | Cloud-service ISMS | Method-neutral; cloud-sector extension predates K8s-native admission policy. |

**Cross-jurisdiction posture (per AGENTS.md rule #5):** Any container/K8s assessment for a multi-jurisdiction operator must cite at minimum EU NIS2 + CRA, UK NCSC CAF, AU ISM, IL INCD, SG GovTech/MAS TRM, alongside ISO 27001:2022 + ISO/IEC 27017 and NIST 800-53 CM-7. US-only is insufficient.

---

## TTP Mapping

| Surface | TTP | Matrix | Variant in mid-2026 | Gap Flag |
|---|---|---|---|---|
| Adversary deploys a workload (malicious container image, attacker-pushed Helm chart, compromised CI deploy pipeline) | T1610 — Deploy Container | ATT&CK Enterprise | Typosquatted public image (`docker.io/libray/...`), compromised public Helm chart, AI-coding-assistant-emitted manifest with `:latest` tag from untrusted registry | NIST 800-53 CM-7 is method-neutral on image-source allowlisting; PSS does not gate image source; admission policy via Kyverno + Sigstore policy-controller is the actual control and is not mandated by any framework |
| Container escape to host | T1611 — Escape to Host | ATT&CK Enterprise | Kernel LPE (Copy Fail CVE-2026-31431, Dirty Frag CVE-2026-43284 family); historical runc CVE-2024-21626 Leaky Vessels family; cgroup v1 release_agent legacy abuses; abuse of overly permissive capabilities (`CAP_SYS_ADMIN`, `CAP_SYS_MODULE`) | NIST 800-190 predates kernel-LPE-as-container-escape as the dominant vector. Defense requires kernel patching cadence (hand off to `kernel-lpe-triage`) plus seccomp default profile, capability drops, read-only rootfs, and runtime detection. None of these are framework-mandated. |
| Privilege escalation within the container | T1068 — Exploitation for Privilege Escalation | ATT&CK Enterprise | In-container kernel LPE (yields host root via T1611 chain); abuse of writable hostPath; abuse of mounted Docker socket | Method-neutral framework controls; the actual control is seccomp + dropped capabilities + read-only rootfs + non-root runAsUser, all enforced by PSS-Restricted profile |
| Exploit public-facing K8s component | T1190 — Exploit Public-Facing Application | ATT&CK Enterprise | Exposed kube-apiserver (rare but seen on self-managed clusters); exposed kubelet read-only port (10255) or read/write port (10250) without authentication; exposed Kubernetes Dashboard with no auth; exposed Argo CD or Jenkins on the cluster; ingress controller CVEs (ingress-nginx CVE-2025 family) | NSA/CISA Hardening Guide v1.2 addresses control-plane exposure; managed services close this by default; self-managed clusters in CI/government still expose these |
| Compromised container image at a public/private registry | AML.T0010 — ML Supply Chain Compromise (umbrella) | ATLAS v5.1.0 | Poisoned base image; backdoored model-serving image; typosquatted MCP server in a sidecar; AI-pipeline-specific (KServe / vLLM / Triton image with embedded malicious payload) | ATLAS classifies; no framework mandates signature verification at admission. Hand off the build-side provenance to `supply-chain-integrity`; the container-runtime control is `ClusterImagePolicy` enforcement |

The ATT&CK Containers matrix (a sub-matrix, since 2021) and Microsoft's Threat Matrix for Kubernetes (2020, since absorbed conceptually into ATT&CK Containers) are both relevant prior art. The Enterprise IDs above are the canonical ATT&CK IDs, kept in alignment with ATLAS v5.1.0, and pass the linter regex `^T\d{4}(\.\d{3})?$`.

CWE cross-walk (see `data/cwe-catalog.json`):

| CWE | Why It Maps |
|---|---|
| CWE-269 (Improper Privilege Management) | Privileged pods, hostPID/hostNetwork/hostIPC, mounted Docker sockets, wildcard ClusterRoles, ServiceAccount with `create/patch` on `pods` or `secrets`. PSS-Restricted closes most pod-level instances; RBAC review closes the cluster-level instances. |
| CWE-732 (Incorrect Permission Assignment for Critical Resource) | `runAsUser: 0`, world-writable mounts, hostPath mounted read-write, secret mounted with overly broad mode bits, ServiceAccount tokens auto-mounted by default. |
| CWE-1188 (Initialization of a Resource with an Insecure Default) | K8s ships with auto-mount of ServiceAccount tokens on by default; PSS namespace label default is none; NetworkPolicy default is allow-all. Every cluster begins insecure until the operator inverts these defaults. |
| CWE-787 (Out-of-bounds Write) | Kernel LPE class that drives container escape (Copy Fail CVE-2026-31431). The container is not the boundary; the kernel is. |
| CWE-1395 (Dependency on Vulnerable Third-Party Component) | Base-image transitive vulnerabilities; vendor Helm chart pinned to a version with known CVEs; cluster components (CNI plugin, CSI driver, ingress controller, admission webhooks) with known CVEs. SBOM + VEX is the discovery layer; admission policy enforcing image signing + version pinning is the prevention layer. |

---

## Exploit Availability Matrix

| Class / CVE | CVSS | RWEP | CISA KEV | PoC Public | AI-Discovered | Active Exploitation | Patch / Mitigation | Admission-Detectable | Runtime-Detectable (Falco/Tetragon) |
|---|---|---|---|---|---|---|---|---|---|
| Host-kernel LPE as container escape — Copy Fail (CVE-2026-31431) | 7.8 | 90 (see `cve-catalog.json`) | Yes (2026-03-15) | Yes — 732-byte script | Yes | Confirmed | Kernel patch + live-patch (kpatch/livepatch/kGraft) on supported distros; reboot rolling fleet on others | No (admission doesn't see kernel ops) | Yes — Falco/Tetragon catches the post-escape host operations; the in-kernel write itself is invisible to eBPF |
| Container-runtime CVE class — runc CVE-2024-21626 ("Leaky Vessels") family | 8.6 | varies (historical reference) | Yes (at time of disclosure) | Yes | No (manual disclosure) | Patched in modern fleets; brownfield self-managed clusters lag | runc / containerd / CRI-O upgrade | Partial — admission can require minimum runtime versions via node-feature labels | Yes — Tetragon can enforce SIGKILL on the abuse syscall sequence |
| Misconfigured PSS — `pod-security.kubernetes.io/enforce: privileged` on a workload namespace | n/a (class) | n/a | n/a | Trivial — `kubectl run --privileged` | Operator misconfig + AI-coding-assistant template drift | Routinely observed in incident response 2024–2026 | Set namespace label to `restricted`; remediate the workload | Yes — Kyverno PolicyReport + Kubescape surface this; PSA itself enforces on admit if label set |
| Misconfigured RBAC — ServiceAccount with `create/patch` on `pods` or `secrets` cluster-wide | n/a (class) | n/a | n/a | Trivial — `kubectl auth can-i --list --as=system:serviceaccount:...` | Operator misconfig | Routinely observed | Replace wildcard ClusterRoles with scoped Roles; deny `automountServiceAccountToken: true` by default | Yes — Kyverno + OPA policies; kube-bench check | Yes — Falco detects token use from unexpected ServiceAccount |
| Exposed kubelet 10250 / 10255 | n/a (class) | n/a | n/a | Trivial — Shodan / Censys queries; Peirates | n/a | Confirmed in self-managed / CI / lab clusters | Network policy default-deny; kubelet authentication + authorization; firewall | No (network-layer control) | Yes — Cilium L7 policy + Falco network rule |
| Ingress controller CVE class — ingress-nginx (recent CVE-2025 family, "IngressNightmare"-style) | varies | varies | Mixed KEV listings | Yes (multiple) | Mixed (some AI-assisted RE) | Confirmed | Vendor patch; remove ingress-nginx admission webhook if not in use; restrict snippet annotations | Partial — admission policy can forbid the dangerous annotations | Yes — Tetragon process-exec detection |
| Unsigned container image admitted to production | n/a (class) | n/a | n/a | n/a | n/a | Pervasive — default cluster posture | Sigstore policy-controller `ClusterImagePolicy` requiring keyless verification against pinned publisher identity; Kyverno `verifyImages` rule | Yes — that is the entire point | n/a |
| AI-inference image with code-executing model load (PyTorch `.pt` / Python-native serialization) | n/a (class) | n/a | n/a | Trivial — published PoC research | Yes — adversarial-weights research and 2025 incident reports | Suspected in advanced campaigns | Reject code-executing serialization formats; require safetensors; verify model signature pre-load | Partial — admission policy on the model-PVC contents | Yes — Falco rule on unexpected serialization-load syscalls from inference container |

**Honest gap statement (per AGENTS.md rule #10).** `data/cve-catalog.json` does not yet enumerate every ingress-controller, CNI plugin, CSI driver, or admission webhook CVE. The authoritative feeds are upstream advisories (`kubernetes.io/security/`, `kubernetes-announce@googlegroups.com`, vendor PSIRT feeds for managed services, ingress-nginx CHANGELOG, Cilium security advisories). Forward-watch covers ingestion of these feeds.

---

## Analysis Procedure

This procedure threads the three foundational principles required by the AGENTS.md skill-format spec (defense in depth, least privilege, zero trust) through every step. Per AGENTS.md rule #9, containers are the canonical ephemeral runtime; the audit and forensic implication is that ephemerality is the operating reality the program must design for, not work around.

### Defense in depth

Six layers, each independently capable of blocking a different attacker stage. A program with five of six is still better than a program with one strong layer; a program with one strong layer plus heroic effort is fragile.

- **Image signing and provenance.** Sigstore cosign keyless signing of every image at build time (hand off to `supply-chain-integrity`). Rekor inclusion verified.
- **Admission control.** Sigstore policy-controller `ClusterImagePolicy` + Kyverno `verifyImages` rule + Kyverno / OPA Gatekeeper constraints enforcing no-root, no-host-network, no-hostPID, no-hostIPC, no-privileged, no-CAP_SYS_ADMIN, no-mountedDockerSocket, no-`:latest`-tag, allowlisted registries only.
- **Pod Security Standards Restricted profile** enforced by built-in PSA admission controller at the namespace level (`pod-security.kubernetes.io/enforce: restricted`).
- **NetworkPolicy default-deny per namespace.** Cilium L7 policy where flow control needs HTTP-method / gRPC-service / DNS-FQDN granularity. mTLS via service mesh (Istio, Linkerd, Cilium Service Mesh) for service-to-service identity.
- **Runtime detection (eBPF).** Falco (CNCF Graduated) syscall + network + process-exec rules feeding a SIEM; Tetragon (Cilium sub-project) TracingPolicy CRDs with `Sigkill` action for in-kernel preventive enforcement on selected high-confidence signatures.
- **Control-plane hardening.** kube-apiserver audit-log level `RequestResponse` for sensitive resources, log shipped to a SIEM out-of-cluster, audit retention per regulator. etcd encryption-at-rest with KMS-backed key. RBAC review every quarter. kubelet TLS bootstrap rotation. Anonymous auth disabled.
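One of the admission constraints listed above, the no-`:latest`-tag rule, can be sketched as a Kyverno `ClusterPolicy`. The policy and rule names are hypothetical, and the placement of `validationFailureAction` is an assumption that varies by Kyverno release (newer versions move the failure action into the rule's `validate` block), so treat this as a starting point rather than a drop-in policy.

```yaml
# Sketch only: names are hypothetical; field placement depends on Kyverno version.
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: disallow-latest-tag
spec:
  # Block non-compliant pods instead of only reporting them.
  validationFailureAction: Enforce
  background: true
  rules:
    - name: reject-latest-tag
      match:
        any:
          - resources:
              kinds:
                - Pod
      validate:
        message: "Images must not use the mutable :latest tag."
        pattern:
          spec:
            containers:
              # "!" negates the wildcard match on the image reference.
              - image: "!*:latest"
```

This pattern only inspects `spec.containers`; init and ephemeral containers would need their own checks, and an image reference with no tag at all still resolves to `latest` and needs a separate rule.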

### Least privilege

- Every pod: non-root (`runAsNonRoot: true`, `runAsUser: 10000+`), all capabilities dropped (`drop: ["ALL"]`) with explicit `add:` list only where required, read-only root filesystem, `allowPrivilegeEscalation: false`, seccomp `RuntimeDefault` profile minimum, `automountServiceAccountToken: false` unless required.
- ServiceAccounts scoped per workload, never shared across unrelated workloads. ClusterRoles avoided where Roles suffice. No `*` verbs.
- RBAC verbs scoped per resource per namespace. `secrets` access on a separate ServiceAccount from `pods` access.
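- AI inference workloads: scoped GPU access (NVIDIA device plugin with per-pod GPU allocation, no shared GPU containers across security boundaries); model PVCs mounted read-only; model-registry credentials held in an external secret manager (AWS Secrets Manager / GCP Secret Manager / Azure Key Vault / HashiCorp Vault) and delivered through the Secrets Store CSI Driver or External Secrets Operator, not authored as in-cluster Secret resources. External Secrets Operator still materializes a synced `Secret` object, so its RBAC scope needs the same review as any other Secret.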
- Cluster-admin is a break-glass identity, not a daily-driver account; just-in-time elevation via SPIFFE + OIDC + audit log; standing cluster-admin tokens are a finding.
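The per-pod requirements in the first bullet above can be expressed as a small audit over a pod spec. This is a minimal sketch, assuming the input is the `spec` of a Pod manifest already parsed into a dict; the function name `check_pod_least_privilege` and the finding strings are hypothetical, and the `runAsUser: 10000+` convention is deliberately left out.

```python
def check_pod_least_privilege(pod_spec: dict) -> list[str]:
    """Return least-privilege findings for a parsed Pod `spec` dict."""
    findings = []
    # Kubernetes mounts the ServiceAccount token unless this is explicitly false.
    if pod_spec.get("automountServiceAccountToken", True):
        findings.append("pod: automountServiceAccountToken is not false")
    pod_sc = pod_spec.get("securityContext") or {}
    containers = (pod_spec.get("containers") or []) + (pod_spec.get("initContainers") or [])
    for c in containers:
        name = c.get("name", "<unnamed>")
        sc = c.get("securityContext") or {}
        # Container-level settings override pod-level ones where both exist.
        if not sc.get("runAsNonRoot", pod_sc.get("runAsNonRoot", False)):
            findings.append(f"{name}: runAsNonRoot is not true")
        if sc.get("allowPrivilegeEscalation", True):
            findings.append(f"{name}: allowPrivilegeEscalation is not false")
        if not sc.get("readOnlyRootFilesystem", False):
            findings.append(f"{name}: root filesystem is writable")
        caps = sc.get("capabilities") or {}
        if "ALL" not in (caps.get("drop") or []):
            findings.append(f"{name}: capabilities do not drop ALL")
        seccomp = sc.get("seccompProfile") or pod_sc.get("seccompProfile") or {}
        if seccomp.get("type") not in ("RuntimeDefault", "Localhost"):
            findings.append(f"{name}: no RuntimeDefault or Localhost seccomp profile")
    return findings
```

Treating an absent field as a finding is a design choice here: the program asks for explicit declarations, so relying on a runtime default counts as a gap.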

### Zero trust

- Never assume the pod network is trustworthy because it sits inside the cluster CNI. NetworkPolicy default-deny at every namespace; explicit allow rules for every required flow.
- Every service-to-service call mutually authenticated (mTLS via service mesh or SPIFFE/SPIRE identities). Cilium identity-aware policy where the mesh sidecar overhead is unacceptable.
- kubelet TLS bootstrap rotation enforced (`RotateKubeletServerCertificate`, `RotateKubeletClientCertificate`).
- Every image untrusted until signature verified at admission against a pinned publisher identity. Verify the signature, not just its presence — bare `.sig` files without identity binding are decoration.
- Every kernel syscall from a container subject to seccomp default-deny outside the allowed set. Capability drops are a syscall-level zero-trust posture.
- AI inference outputs treated as untrusted content (hand off to `ai-attack-surface` and `rag-pipeline-security` for the downstream content layer).
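The default-deny posture in the first bullet above can be generated per namespace rather than hand-written. The sketch below builds the standard `networking.k8s.io/v1` `NetworkPolicy` object as a dict and renders a JSON `List`; the helper names and the policy name `default-deny-all` are assumptions for illustration.

```python
import json


def default_deny_policy(namespace: str) -> dict:
    """Build a NetworkPolicy that denies all ingress and egress in a namespace."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-all", "namespace": namespace},
        "spec": {
            # An empty selector matches every pod in the namespace.
            "podSelector": {},
            # Both types are listed with no allow rules, so both directions are denied.
            "policyTypes": ["Ingress", "Egress"],
        },
    }


def render_policies(namespaces: list[str]) -> str:
    """Render one default-deny policy per namespace as a JSON List document."""
    items = [default_deny_policy(ns) for ns in namespaces]
    return json.dumps({"apiVersion": "v1", "kind": "List", "items": items}, indent=2)
```

Denying egress also blocks DNS, so the explicit allow rules the bullet calls for normally include a rule for the cluster DNS service before any workload-specific flows.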

### Step-by-step procedure

1. **CIS Kubernetes Benchmark baseline scan.** Run `kube-bench` (Aqua) on every cluster, covering the control plane, worker, policy, and managed-services sections that apply to the cluster's distribution. Record pass/fail per check ID. Schedule weekly at minimum, daily for high-impact clusters. Quarterly is theater (see Theater Test 1 in the Compliance Theater Check section).
2. **Pod Security Standards enforcement audit.** Enumerate every namespace's `pod-security.kubernetes.io/enforce` label and `audit`/`warn` labels. Goal state: `enforce: restricted` on every workload namespace; `baseline` only for documented exceptions (vendor Helm chart requirements, GPU device plugin pods) with a written exception per the `policy-exception-gen` skill. `privileged` is a finding unless it is `kube-system` or a documented infrastructure namespace.
3. **Image-signing rollout.** Deploy Sigstore policy-controller (`policy.sigstore.dev/v1beta1 ClusterImagePolicy`). Define per-namespace policies pinning publisher OIDC identity (Fulcio certificate subject) and OIDC issuer. Migrate workloads from "any image, any tag" to "signed images from approved publishers, version-pinned digest references." Track coverage % per namespace. Hand off the build-side signing pipeline to `supply-chain-integrity`.
4. **Admission policy via Kyverno or OPA Gatekeeper.** Deploy at least the following ClusterPolicy / ConstraintTemplate rules:
   - `disallow-privileged-containers`
   - `disallow-host-namespaces` (hostPID, hostNetwork, hostIPC)
   - `disallow-host-path`
   - `disallow-host-ports`
   - `disallow-capabilities` (no `add` of `SYS_ADMIN`, `NET_ADMIN`, `SYS_MODULE`, `SYS_PTRACE`, `DAC_OVERRIDE`)
   - `require-non-root`
   - `require-read-only-root-fs`
   - `require-image-signature` (Kyverno `verifyImages` with Sigstore key/identity)
   - `disallow-default-namespace`
   - `restrict-image-registries` (allowlist)
   - `disallow-latest-tag`
   - `require-resource-limits`
   - `require-pod-disruption-budget` (production namespaces)

   Run Kyverno PolicyReports against the live cluster; non-compliance becomes the remediation backlog.

5. **NetworkPolicy default-deny per namespace.** Apply a default-deny `NetworkPolicy` (ingress + egress) to every namespace. Explicit allow rules per workload. Cilium L7 policy where HTTP / gRPC / DNS granularity is required. Validate with `cilium connectivity test` or equivalent.
6. **Runtime detection deployment.** Deploy Falco (libbpf driver) cluster-wide as DaemonSet. Ship alerts to SIEM (Splunk HEC, Elastic, Sumo Logic, Panther). Tune the default ruleset; add custom rules for AI inference workloads (Python-native serialization-load syscalls, `torch.load` from network paths, unexpected GPU library loads). Deploy Tetragon for selected high-confidence preventive policies (`Sigkill` on `/proc/self/exe` overwrite, on cgroup release_agent abuse, on container-runtime CVE syscall sequences). Track MTTR per alert class.
7. **Control plane hardening.** Audit kube-apiserver flags: anonymous-auth disabled, audit-log level `RequestResponse` for `pods`/`secrets`/`configmaps`/`serviceaccounts`/`roles`/`rolebindings`, audit-log shipped out of cluster, audit-log retention per regulator (NERC CIP 90 days, NIS2 indefinite, DORA 5 years). etcd encryption-at-rest with KMS provider. RBAC review: enumerate every ClusterRoleBinding to `cluster-admin`, every wildcard verb, every wildcard resource. Justify or remove. Kubelet authn + authz enabled (webhook mode, not always-allow).

8. **AI inference workload hardening.** For each `InferenceService` (KServe), `VLLMDeployment` (vLLM), `Triton` deployment, `RayService`, `BentoDeployment`:
   - Pod runs PSS Restricted with documented GPU-device-plugin exception.
   - Model PVC mounted read-only.
   - Model serialization format: safetensors (preferred) or GGUF; reject `.pt` / Python-native code-executing formats in admission policy (Kyverno rule on init-container model fetcher).
   - Model signature verified against pinned publisher identity before load (cosign or OpenSSF model-signing).
   - Model-registry credentials sourced from an out-of-cluster secret manager via the Secrets Store CSI Driver or External Secrets Operator; never hand-authored as a `Secret` resource (External Secrets Operator still materializes a synced `Secret`, so scope its RBAC accordingly).
   - NetworkPolicy: egress to model registry + observability sink only; no general internet egress; ingress from API gateway only.
   - Inference-time content treated as untrusted (`ai-attack-surface` / `rag-pipeline-security` hand-offs apply).

9. **Kernel patching cadence (hand off to `kernel-lpe-triage`).** For every node OS (Ubuntu, RHEL, Bottlerocket, Talos, Flatcar, AKS Mariner, EKS AL2023, GKE COS), maintain a kernel-patch SLA. For Copy Fail (CVE-2026-31431): live-patch within 4 hours on supported distros; rolling node reboot within 7 days on unsupported ones. Bottlerocket / Talos / Flatcar deliver kernel updates via image swap, not in-place, so the operational pattern is node-pool rolling replace.
10. **Supply chain hand-off (`supply-chain-integrity`).** Every image in the admission allowlist must trace to a SLSA L3 build with cosign signature, Rekor inclusion proof, and SBOM. The container-runtime job is to enforce verification at admission; the build-side job is to produce the evidence.
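The RBAC review in step 7 (every `cluster-admin` binding, every wildcard verb, every wildcard resource) lends itself to a mechanical first pass. The sketch below assumes its inputs are the `items` lists from `kubectl get clusterroles -o json` and `kubectl get clusterrolebindings -o json`; the function name `audit_rbac` and the finding strings are hypothetical.

```python
def audit_rbac(cluster_roles: list[dict], cluster_role_bindings: list[dict]) -> list[str]:
    """List wildcard rules and cluster-admin bindings that need justification."""
    findings = []
    for role in cluster_roles:
        name = role["metadata"]["name"]
        for rule in role.get("rules") or []:
            if "*" in rule.get("verbs", []):
                findings.append(f"ClusterRole {name}: wildcard verb")
            # Rules for nonResourceURLs carry no 'resources' key, hence the default.
            if "*" in rule.get("resources", []):
                findings.append(f"ClusterRole {name}: wildcard resource")
    for crb in cluster_role_bindings:
        if crb.get("roleRef", {}).get("name") != "cluster-admin":
            continue
        for subject in crb.get("subjects") or []:
            findings.append(
                f"ClusterRoleBinding {crb['metadata']['name']}: "
                f"cluster-admin bound to {subject.get('kind')}/{subject.get('name')}"
            )
    return findings
```

The built-in `cluster-admin` role will itself be reported for its wildcards, which is expected: the output is a list to justify or remove, not a list of violations.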

### Ephemeral / audit-forensic posture (AGENTS.md rule #9)

Containers are ephemeral by design: pods die, nodes are replaced, log file paths inside the container are gone the moment the pod is. The audit/forensic implication:
- **All audit data must leave the host before the host can be lost.** kube-apiserver audit log, container stdout/stderr, Falco/Tetragon events, kube-bench scan results, Kyverno PolicyReports — all shipped to an out-of-cluster sink (SIEM, object storage with WORM lock) within seconds, not on a scheduled cron.
- **Forensic disk acquisition is rarely meaningful for a container.** The relevant artifacts are the image, the pod manifest at admission time, the audit log of operations performed by the pod, and the Falco/Tetragon event stream, not a disk image of an ephemeral container layer. IR playbooks must reflect this: snapshot the pod manifest, snapshot the image digest, pull the event window, isolate the node (cordon + drain + freeze instead of shutdown so the kernel + memory remain available for `crash` / live forensics if needed).
- **NERC CIP and NIS2 retention requirements still apply** to the out-of-cluster log sink; in-cluster log volumes are not the system of record.
- **AI inference workloads compound the challenge** — model weights at inference time may be loaded from object storage and only ever exist in pod memory; capturing them post-incident requires either a memory snapshot of the live pod or the registry pull record. Plan for both.
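The "snapshot the pod manifest, snapshot the image digest" move described above can be sketched as a small evidence record built before the pod is terminated. This assumes the input is a Pod object as returned by the API server (for example from `kubectl get pod -o json`); the function name `snapshot_pod_evidence` and the record fields are hypothetical, and shipping the record to the out-of-cluster sink is left out.

```python
import hashlib
import json
from datetime import datetime, timezone


def snapshot_pod_evidence(pod: dict) -> dict:
    """Build an evidence record for a Pod object before it is terminated."""
    # Canonical JSON (sorted keys, fixed separators) keeps the hash stable
    # regardless of the key order in which the object was serialized.
    canonical = json.dumps(pod, sort_keys=True, separators=(",", ":")).encode()
    statuses = pod.get("status", {}).get("containerStatuses") or []
    return {
        "captured_at": datetime.now(timezone.utc).isoformat(),
        "namespace": pod["metadata"].get("namespace"),
        "pod": pod["metadata"]["name"],
        "node": pod.get("spec", {}).get("nodeName"),
        "manifest_sha256": hashlib.sha256(canonical).hexdigest(),
        # imageID carries the resolved digest, which stays valid even if the tag moves.
        "image_ids": {s["name"]: s.get("imageID") for s in statuses},
    }
```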

---

## Output Format

Produce this structure verbatim:
```
## Container + Kubernetes Runtime Security Posture Assessment

**Assessment Date:** YYYY-MM-DD
**Cluster(s) in scope:** [cluster name + K8s version + distribution (EKS / GKE / AKS / OpenShift / Rancher / k0s / Talos / kubeadm) + node OS]
**Workload classes in scope:** [general microservices / AI inference (KServe / vLLM / Triton / Ray Serve) / data / batch]
**Regulatory jurisdictions:** [US / EU NIS2+CRA / UK NCSC CAF / AU ISM / IL INCD / SG GovTech / TW CSMA / sector-specific]

### CIS Kubernetes Benchmark Scorecard
| Section | Total Checks | Pass | Fail | Manual | Last Scan Date |
|---------|--------------|------|------|--------|----------------|
| Master Node Configuration | ... | ... | ... | ... | ... |
| Worker Node Configuration | ... | ... | ... | ... | ... |
| Policies | ... | ... | ... | ... | ... |
| Managed Services | ... | ... | ... | ... | ... |

### Pod Security Standards Adoption Matrix
| Namespace | enforce | audit | warn | Workload Class | Exception Justified? |

### Admission Policy Coverage
| Policy Engine | Kyverno / OPA Gatekeeper | # ClusterPolicies | # Workload Exceptions | Blocking vs. Audit Mode |
| Rule | Status (enforce / audit / absent) | Coverage % | Workload Exceptions |

### Image-Signing Coverage
| Namespace | Images Admitted (count) | Cosign-Verified % | ClusterImagePolicy Identity Pinned? | Rekor Inclusion Verified? |

### NetworkPolicy Coverage per Namespace
| Namespace | Default-Deny Ingress? | Default-Deny Egress? | L7 Policy (Cilium)? | mTLS (Mesh)? |

### Runtime Detection Posture
| Tool (Falco / Tetragon) | Driver (libbpf / kmod / ko) | Ruleset Version | Custom Rules # | SIEM Sink | Alert Volume / day | MTTR (median) |

### Control-Plane Hardening Checklist
| Control | Status | Evidence |
| anonymous-auth disabled | ... | ... |
| audit-log level RequestResponse for sensitive resources | ... | ... |
| audit-log shipped out-of-cluster within seconds | ... | ... |
| etcd encryption-at-rest with KMS | ... | ... |
| kubelet authn/authz webhook mode | ... | ... |
| no standing cluster-admin tokens | ... | ... |
| RBAC reviewed within last 90 days | ... | ... |

### AI Inference Workload Posture
| InferenceService / Deployment | PSS Profile | Model Format (safetensors / pt / native-codec / GGUF / ONNX) | Signature Verified at Load? | GPU Access Scope | Egress Policy | NetworkPolicy Ingress |

### Compliance Theater Findings
[Outcome of the four tests in the Compliance Theater Check section]

### Defensive Countermeasure Plan (D3FEND)
[D3-NI, D3-NTPM, D3-EAL, D3-EHB, D3-PSEP, D3-IOPR: concrete control placements by cluster layer]

### Priority Remediation Actions
1. ...
2. ...
3. ...

### RWEP-Prioritised CVE Exposure
[Host-kernel + container-runtime + ingress-controller + admission-webhook CVEs ranked by RWEP, not CVSS; see `exploit-scoring` skill for recalculation]
```
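One way to fill the CIS Kubernetes Benchmark Scorecard table in the template above is to render its rows from per-section counts. This is a sketch under stated assumptions: the function name `render_cis_scorecard` and the input shape are hypothetical, and "Total Checks" is taken to be pass + fail + manual, which ignores any other statuses a scanner may report.

```python
def render_cis_scorecard(sections: dict[str, dict[str, int]], scan_date: str) -> str:
    """Render the scorecard as Markdown table rows, one per benchmark section."""
    rows = [
        "| Section | Total Checks | Pass | Fail | Manual | Last Scan Date |",
        "|---------|--------------|------|------|--------|----------------|",
    ]
    for name, counts in sections.items():
        # Assumption: the total is the sum of the three tracked outcomes.
        total = counts["pass"] + counts["fail"] + counts["manual"]
        rows.append(
            f"| {name} | {total} | {counts['pass']} | {counts['fail']} "
            f"| {counts['manual']} | {scan_date} |"
        )
    return "\n".join(rows)
```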

---

## Compliance Theater Check

Run all four. Any "fail" is a theater finding documented with the evidence (or absence thereof).

**Theater Test 1: CIS Kubernetes Benchmark scan currency.**
Ask: *"Show me the kube-bench output from this week, signed by a CI workflow run, with diffs against last week's run highlighting any newly failing checks."*
- If the answer is "we ran kube-bench at go-live two years ago": CIS conformance evidence is fabricated.
- If the answer is "we run it quarterly": theater for a workload class where new CVEs and policy drift land weekly.
- If the answer is "we run it weekly but no one looks at the diff": theater for the gap-closure intent.
- Acceptable: weekly automated run, diffs reviewed, newly-failing checks ticketed within 24h.

**Theater Test 2: Pod Security Standards Restricted coverage.**
Ask: *"What percentage of your production-workload namespaces enforce `pod-security.kubernetes.io/enforce: restricted`? For namespaces at `baseline` or `privileged`, paste the documented exception."*
- If the answer is "0%" or "we don't track": PSS adoption is theater regardless of how the documentation describes the target.
- If the answer is a non-zero number but every exception traces to "vendor Helm chart required it": the documented exceptions are theater for the security review intent — the vendor relationship is a finding, not a justification.
- Acceptable: at least 80% of workload namespaces at `restricted`, and every exception traces to a signed `policy-exception-gen` record with compensating controls listed.

**Theater Test 3: Image-signing verification at admission.**
Ask: *"For your last 10 production container deploys, paste the Sigstore policy-controller ClusterImagePolicy that verified them, paste the cosign / Rekor verification record, and identify the OIDC identity that signed."*
- If the answer is "we sign images in CI but don't verify on the cluster": the signature is decoration. CVE-2026-30615-class supply-chain compromise is not blocked by build-side signing alone. See `supply-chain-integrity` Compliance Theater Test 2 for the matched build-side test.
- If the answer is "we verify but accept any signed image": identity-pinning is missing; any compromised CI workflow that produces a Sigstore signature gets in.
- Acceptable: ClusterImagePolicy pinned to specific OIDC identity + issuer; verification logs reviewable; admission denied for unsigned or wrongly-signed images.

**Theater Test 4: Container-escape tabletop / IR readiness.**
Ask: *"Show me the most recent tabletop exercise where the scenario was a kernel-LPE-driven container escape on a production cluster. What was the detection MTTD, what was the kernel-patch / node-replace MTTR, what was the audit-evidence preservation procedure?"*
- If the answer is "we have a generic IR playbook": theater. Container-escape IR has specific moves (cordon-not-shutdown, snapshot the pod manifest and image digest before terminating, preserve eBPF event window) that a generic playbook will not cover.
- If the answer is "we don't run container-specific tabletops": exposure to T1611 is unmanaged regardless of the patch posture.
- If the answer is "we tested but the IR team had no kubectl access to the production cluster during the exercise": the playbook is theater because the responders cannot execute it.
- Acceptable: container-escape-specific tabletop within the last 12 months, MTTD under 1h on the test scenario, post-exercise tickets closed.
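Theater Test 1 asks for a week-over-week diff of newly failing checks, which is straightforward to compute once each run is reduced to a check-ID to status mapping. The sketch below is illustrative only: the function names are hypothetical, and `flatten_kube_bench` assumes a `Controls` / `tests` / `results` layout with `test_number` and `status` fields in the `kube-bench --json` report, which should be confirmed against the version in use.

```python
def flatten_kube_bench(report: dict) -> dict[str, str]:
    """Reduce a parsed JSON report to {check_id: status} (layout is an assumption)."""
    statuses = {}
    for control in report.get("Controls", []):
        for group in control.get("tests", []):
            for result in group.get("results", []):
                statuses[result["test_number"]] = result["status"]
    return statuses


def newly_failing(previous: dict[str, str], current: dict[str, str]) -> list[str]:
    """Check IDs that FAIL in the current run but did not FAIL in the previous one."""
    return sorted(
        check_id
        for check_id, status in current.items()
        # A check that is new this week and already failing also counts.
        if status == "FAIL" and previous.get(check_id) != "FAIL"
    )
```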

---

## Defensive Countermeasure Mapping

This is the optional 8th section defined in AGENTS.md (required for skills shipped on or after 2026-05-11). It maps container/K8s findings to MITRE D3FEND IDs from `data/d3fend-catalog.json`, with explicit defense-in-depth layer position, least-privilege scope, zero-trust posture, and AI-pipeline applicability per Hard Rule #9.

| D3FEND ID | Technique | Cluster Layer Position | Least-Privilege Scope | Zero-Trust Posture | AI-Pipeline Applicability |
|---|---|---|---|---|---|
| D3-EAL | Executable Allowlisting | Admission control (Sigstore policy-controller `ClusterImagePolicy` + Kyverno `verifyImages`) as the cluster-layer analogue to host-OS allowlisting; secondary at the host OS layer for node-image hardening | Per-namespace ClusterImagePolicy pinning OIDC identity + issuer; per-workload allowlist where vendor images carry distinct identities | Default-deny on unsigned images at admission; verify identity + Rekor inclusion, not signature presence | Highly applicable. KServe / vLLM / Triton images go through the same admission gate; AI-inference vendor images (NVIDIA NGC, Hugging Face TGI) must carry the same signature evidence. The ephemeral nature of pods means D3-EAL is the most reliable layer — once a pod is admitted, the host-layer EAL is largely moot for the container's lifetime. |
| D3-EHB | Executable Hash-based Allowlist | Image digest pinning (`image: registry/repo@sha256:...`) at admission; model-weight SHA-256 pinning at pre-load for AI inference | Per-workload digest pin; model-weight digest pin per `InferenceService` | Tag references (`:latest`, `:v1`) are not zero-trust — digest references are. Admission policy can enforce `image: ...@sha256:...` only. | Highly applicable. Model-weight hash verification before load is the AI-inference analogue. Hand off model-weight signing pipeline to `supply-chain-integrity`. |
| D3-PSEP | Process Segment Execution Prevention | Pod-spec `seccompProfile: RuntimeDefault` minimum, `Localhost` with a custom restrictive profile for high-risk workloads; `readOnlyRootFilesystem: true`; `allowPrivilegeEscalation: false`; capability drops | Per-pod seccomp profile; per-pod capability set | Verify the syscall surface at runtime, not just declare the profile at admission. Tetragon TracingPolicy enforces in-kernel; Falco detects post-fact. | Applicable. AI inference workloads need a seccomp profile that accommodates GPU device-plugin syscalls (CUDA, NVIDIA driver ioctls) while still restricting the broader attack surface. `RuntimeDefault` is often too restrictive; `Localhost` with a tuned profile is the production posture. |
| D3-NI | Network Isolation | NetworkPolicy at every namespace; CiliumNetworkPolicy where L7 needed; service mesh mTLS at the workload-identity layer | Per-namespace default-deny; explicit allow rules per workload-to-workload flow; identity-aware policy via SPIFFE | Conduit posture is default-deny; every flow verified per-identity per-protocol | Highly applicable. AI inference workloads need narrow egress (model registry + observability + inference-result sink only); broad internet egress from inference pods is a finding. Identity-aware policy (Cilium + SPIFFE) is the production posture for multi-tenant inference clusters. |
| D3-NTPM | Network Traffic Policy Mapping | Cilium L7 policy expression of allowed HTTP methods, gRPC services, DNS FQDNs, Kafka topics per workload | Per-workload allowlist of L7 endpoints | Continuous verification of conformance; deviation triggers alert | Highly applicable. AI inference egress should be policy-mapped to specific Hugging Face / vendor registry FQDNs; deviation indicates either model-pull misconfiguration or exfiltration. |
| D3-IOPR | Input / Output Process Profiling | Falco / Tetragon eBPF syscall + network + process-exec event stream; behavioral baselines per workload class | Per-workload behavioral profile; deviation alerts scoped to that workload's owner | Detection assumes the container is hostile until profile-conformant per-event | Highly applicable. AI inference workloads have a tight behavioral profile (CUDA syscalls, model-PVC reads, inference-result writes); unexpected serialization-load syscalls, `subprocess.Popen`, or `socket(AF_INET)` to non-registry endpoints are high-confidence alerts. The ephemeral nature of pods means baselines should be per-image-digest rather than per-pod-instance. |

**Ephemeral runtime posture (per Hard Rule #9, applied straight).** Containers are the canonical ephemeral runtime: pods die, nodes are replaced, in-pod logs are gone the moment the pod is. Controls that assume a long-lived host (host-based EDR signature DB updated weekly, on-host log retention, in-place patching) are architecturally mismatched. The container-realistic posture: admission policy as the primary preventive layer (D3-EAL, D3-EHB), eBPF runtime detection with out-of-cluster event sink as the primary detective layer (D3-IOPR), node-image swap (Bottlerocket, Talos, Flatcar) as the primary patching pattern, and all audit/forensic data leaving the host before the host can be lost. Recommendations that read "deploy host EDR with weekly signature updates" without specifying how that survives a node replacement are operationally indefensible.

---

## Hand-Off / Related Skills

After producing the container/K8s posture assessment, chain into the following skills.
- **`kernel-lpe-triage`** — for every Linux node OS in scope, score Copy Fail (CVE-2026-31431) and Dirty Frag (CVE-2026-43284 / -43500) exposure. Container escape is fundamentally a kernel problem; the cluster-layer controls are mitigations, not the closure. Live-patch-vs-node-replace decisions depend on node OS class (in-place patching on Ubuntu/RHEL; image-swap on Bottlerocket/Talos/Flatcar).
- **`supply-chain-integrity`** — for the build-side provenance, SLSA L3 evidence, Sigstore signing keys, in-toto attestations, SBOM, and ML-BOM for model weights. The container-runtime skill enforces verification at admission; supply-chain-integrity produces the evidence to verify.
- **`cloud-security`** *(if shipped — currently surfaced via sector skills and `framework-gap-analysis`)* — K8s typically runs on cloud IaaS or managed K8s (EKS, GKE, AKS). The control-plane managed-services section of CIS K8s Benchmark and the IAM-to-K8s-RBAC bridge (IRSA, Workload Identity, AAD Pod Identity) are cloud-specific. Until a dedicated cloud-security skill ships, use `sector-federal-government` for FedRAMP Rev 5 cloud baseline, `sector-financial` for MAS TRM / DORA cloud expectations, and `framework-gap-analysis` for ISO/IEC 27017 cloud-sector ISMS alignment.
- **`mlops-security`** *(if shipped — currently surfaced via `ai-attack-surface`, `ai-risk-management`, `rag-pipeline-security`)* — AI inference workloads in K8s require additional MLOps-specific controls (model registry governance, training-data lineage, model versioning, A/B inference routing security). Until a dedicated mlops-security skill ships, the AI Inference Workload Posture section of the output references the model-side controls and `ai-attack-surface` covers the inference-input attack surface.
- **`defensive-countermeasure-mapping`** — to deepen the D3FEND mapping above into a layered remediation plan rather than a single-control patch ticket; the container-realistic compensating-control programme typically combines all six D3FEND techniques above plus host-layer kernel patching.
- **`attack-surface-pentest`** — K8s in pen-test scope. Pen-test rules of engagement must spell out: cluster-admin access not granted by default (red team starts from a workload-level foothold), no DoS testing on control plane, presence of a cluster operator with stop-test authority, and explicit identification of AI inference services that are out of scope for adversarial-input testing (route that to `ai-attack-surface` testing instead).
- **`compliance-theater`** — to extend the four theater tests above with general-purpose theater detection on the operator's wider GRC posture.
- **`framework-gap-analysis`** — for any multi-jurisdiction operator, to produce the per-jurisdiction reconciliation called for in Analysis Procedure Step 10.
- **`global-grc`** — alongside framework-gap-analysis when EU NIS2 + CRA, UK NCSC CAF, AU ISM, JP NISC, IL INCD, SG GovTech, TW CSMA all apply.
- **`ai-attack-surface`** and **`mcp-agent-trust`** — when AI inference workloads (KServe / vLLM / Triton / Ray Serve) are in scope. ai-attack-surface for prompt-injection and model-input threats against the inference endpoint; mcp-agent-trust for the developer-side MCP servers that may be used to push manifests or operate the cluster.
- **`identity-assurance`** — for the cluster-admin / kubectl identity layer. Standing cluster-admin tokens are an identity-theater finding; SPIFFE + OIDC + just-in-time elevation is the production posture.
- **`policy-exception-gen`** — to generate defensible exceptions for namespaces where PSS Restricted is architecturally infeasible (GPU device plugin, vendor Helm chart requirements). The exception evidence is the documented compensating-control programme: Kyverno workload-specific policy, scoped NetworkPolicy, scoped RBAC, Falco/Tetragon rule coverage.

**Forward watch (per skill-format spec).** CIS Kubernetes Benchmark v1.11 (covering K8s 1.32+); NIST 800-190 r2 (long-overdue revision covering Pod Security Standards, admission policy, eBPF detection, AI inference workloads); NSA/CISA Kubernetes Hardening Guide v1.3 (no public draft as of 2026-05-11); OWASP Kubernetes Top 10 next revision; ingress-nginx CVE feed; Cilium / Falco / Tetragon security advisory feeds; KServe and vLLM security advisory ingestion into `data/cve-catalog.json`; Sigstore policy-controller GA and CRD stability (`policy.sigstore.dev` graduating from v1beta1 to v1).