@blamejs/exceptd-skills 0.12.11 → 0.12.15

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (91)
  1. package/CHANGELOG.md +243 -0
  2. package/bin/exceptd.js +299 -48
  3. package/data/_indexes/_meta.json +49 -48
  4. package/data/_indexes/activity-feed.json +13 -5
  5. package/data/_indexes/catalog-summaries.json +51 -29
  6. package/data/_indexes/chains.json +3238 -3210
  7. package/data/_indexes/frequency.json +3 -0
  8. package/data/_indexes/jurisdiction-map.json +5 -3
  9. package/data/_indexes/section-offsets.json +712 -685
  10. package/data/_indexes/theater-fingerprints.json +1 -1
  11. package/data/_indexes/token-budget.json +355 -340
  12. package/data/atlas-ttps.json +144 -129
  13. package/data/attack-techniques.json +339 -0
  14. package/data/cve-catalog.json +515 -475
  15. package/data/cwe-catalog.json +1081 -759
  16. package/data/exploit-availability.json +63 -15
  17. package/data/framework-control-gaps.json +867 -843
  18. package/data/rfc-references.json +276 -276
  19. package/keys/EXPECTED_FINGERPRINT +1 -0
  20. package/lib/auto-discovery.js +21 -4
  21. package/lib/cross-ref-api.js +39 -6
  22. package/lib/cve-curation.js +505 -47
  23. package/lib/lint-skills.js +217 -15
  24. package/lib/playbook-runner.js +1224 -183
  25. package/lib/prefetch.js +121 -8
  26. package/lib/refresh-external.js +261 -95
  27. package/lib/refresh-network.js +208 -18
  28. package/lib/schemas/manifest.schema.json +16 -0
  29. package/lib/scoring.js +83 -7
  30. package/lib/sign.js +112 -3
  31. package/lib/source-ghsa.js +219 -37
  32. package/lib/source-osv.js +381 -122
  33. package/lib/validate-catalog-meta.js +64 -9
  34. package/lib/validate-cve-catalog.js +213 -7
  35. package/lib/validate-indexes.js +88 -37
  36. package/lib/validate-playbooks.js +469 -0
  37. package/lib/verify.js +313 -16
  38. package/manifest-snapshot.json +1 -1
  39. package/manifest-snapshot.sha256 +1 -0
  40. package/manifest.json +73 -73
  41. package/orchestrator/dispatcher.js +21 -1
  42. package/orchestrator/event-bus.js +52 -8
  43. package/orchestrator/index.js +279 -20
  44. package/orchestrator/pipeline.js +63 -2
  45. package/orchestrator/scanner.js +32 -10
  46. package/orchestrator/scheduler.js +196 -20
  47. package/package.json +3 -1
  48. package/sbom.cdx.json +9 -9
  49. package/scripts/check-manifest-snapshot.js +32 -0
  50. package/scripts/check-sbom-currency.js +65 -3
  51. package/scripts/check-test-coverage.js +142 -19
  52. package/scripts/predeploy.js +110 -40
  53. package/scripts/refresh-manifest-snapshot.js +55 -4
  54. package/scripts/validate-vendor-online.js +169 -0
  55. package/scripts/verify-shipped-tarball.js +106 -3
  56. package/skills/ai-attack-surface/skill.md +18 -10
  57. package/skills/ai-c2-detection/skill.md +7 -2
  58. package/skills/ai-risk-management/skill.md +5 -4
  59. package/skills/api-security/skill.md +3 -3
  60. package/skills/attack-surface-pentest/skill.md +5 -5
  61. package/skills/cloud-security/skill.md +1 -1
  62. package/skills/compliance-theater/skill.md +8 -8
  63. package/skills/container-runtime-security/skill.md +1 -1
  64. package/skills/dlp-gap-analysis/skill.md +5 -1
  65. package/skills/email-security-anti-phishing/skill.md +1 -1
  66. package/skills/exploit-scoring/skill.md +18 -18
  67. package/skills/framework-gap-analysis/skill.md +6 -6
  68. package/skills/global-grc/skill.md +3 -2
  69. package/skills/identity-assurance/skill.md +2 -2
  70. package/skills/incident-response-playbook/skill.md +4 -4
  71. package/skills/kernel-lpe-triage/skill.md +21 -2
  72. package/skills/mcp-agent-trust/skill.md +17 -10
  73. package/skills/mlops-security/skill.md +2 -1
  74. package/skills/ot-ics-security/skill.md +1 -1
  75. package/skills/policy-exception-gen/skill.md +3 -3
  76. package/skills/pqc-first/skill.md +1 -1
  77. package/skills/rag-pipeline-security/skill.md +7 -3
  78. package/skills/researcher/skill.md +20 -3
  79. package/skills/sector-energy/skill.md +1 -1
  80. package/skills/sector-federal-government/skill.md +1 -1
  81. package/skills/sector-financial/skill.md +3 -3
  82. package/skills/sector-healthcare/skill.md +2 -2
  83. package/skills/security-maturity-tiers/skill.md +7 -7
  84. package/skills/skill-update-loop/skill.md +19 -3
  85. package/skills/supply-chain-integrity/skill.md +1 -1
  86. package/skills/threat-model-currency/skill.md +11 -11
  87. package/skills/threat-modeling-methodology/skill.md +3 -3
  88. package/skills/webapp-security/skill.md +1 -1
  89. package/skills/zeroday-gap-learn/skill.md +51 -7
  90. package/vendor/blamejs/_PROVENANCE.json +4 -1
  91. package/vendor/blamejs/worker-pool.js +38 -0
@@ -0,0 +1,169 @@
+#!/usr/bin/env node
+"use strict";
+/**
+ * scripts/validate-vendor-online.js — Audit G F6.
+ *
+ * Optional, network-touching companion to lib/validate-vendor.js. For every
+ * file recorded in vendor/blamejs/_PROVENANCE.json, fetches the upstream
+ * blob from github.com/<source_repo>/blob/<pinned_commit>/<upstream_path>
+ * (via the raw.githubusercontent.com mirror), hashes it, and compares the
+ * result against the `upstream_sha256_at_pin` recorded in _PROVENANCE.json.
+ *
+ * This catches the class where _PROVENANCE.json was hand-edited to
+ * advertise a `upstream_sha256_at_pin` that does not actually match what
+ * upstream had at that commit. lib/validate-vendor.js only checks that the
+ * local vendored file matches its own recorded hash — that's self-attesting.
+ * This script extends the check to upstream, closing the gap.
+ *
+ * Not part of `npm run predeploy` by default — the predeploy gate sequence
+ * must remain network-independent (offline gates only). Run manually:
+ *
+ *   node scripts/validate-vendor-online.js
+ *   node scripts/validate-vendor-online.js --timeout 30000
+ *   node scripts/validate-vendor-online.js --json
+ *
+ * Exit codes:
+ *   0 every vendored file's upstream_sha256_at_pin matched upstream
+ *   1 at least one mismatch
+ *   2 runtime / network error
+ *
+ * Zero npm deps. Node 24 stdlib only.
+ */
+
+const fs = require("fs");
+const path = require("path");
+const crypto = require("crypto");
+const https = require("https");
+
+const ROOT = path.join(__dirname, "..");
+const PROV_PATH = path.join(ROOT, "vendor", "blamejs", "_PROVENANCE.json");
+
+function parseArgs(argv) {
+  const out = { timeoutMs: 15000, json: false };
+  for (let i = 2; i < argv.length; i++) {
+    const a = argv[i];
+    if (a === "--timeout") out.timeoutMs = Number(argv[++i]) || out.timeoutMs;
+    else if (a === "--json") out.json = true;
+    else if (a === "--help" || a === "-h") {
+      process.stdout.write(
+        "Usage: node scripts/validate-vendor-online.js [--timeout <ms>] [--json]\n"
+      );
+      process.exit(0);
+    } else {
+      process.stderr.write(`Unknown argument: ${a}\n`);
+      process.exit(2);
+    }
+  }
+  return out;
+}
+
+function rawUrlForPin(sourceRepo, commit, upstreamPath) {
+  // Translate https://github.com/owner/repo → raw.githubusercontent.com/owner/repo
+  // sourceRepo may end in .git; strip it. Tolerate trailing slash.
+  const m = (sourceRepo || "").match(
+    /^https?:\/\/github\.com\/([^/]+)\/([^/]+?)(?:\.git)?\/?$/
+  );
+  if (!m) return null;
+  const [, owner, repo] = m;
+  const cleanPath = String(upstreamPath || "").replace(/^\/+/, "");
+  return `https://raw.githubusercontent.com/${owner}/${repo}/${commit}/${cleanPath}`;
+}
+
+const MAX_REDIRECTS = 5;
+
+function fetchBuffer(url, timeoutMs, redirectsLeft = MAX_REDIRECTS) {
+  return new Promise((resolve, reject) => {
+    const req = https.get(url, (res) => {
+      // v0.12.14 (codex P2): cap redirect hops. A redirect loop (or a
+      // hostile / mis-configured upstream that keeps returning 3xx with
+      // Location pointing back to itself) used to recurse until stack
+      // overflow or hang. Now: count hops, fail clean on exhaustion.
+      if (res.statusCode >= 300 && res.statusCode < 400 && res.headers.location) {
+        res.resume();
+        if (redirectsLeft <= 0) {
+          return reject(new Error(`exceeded ${MAX_REDIRECTS} redirects fetching ${url}`));
+        }
+        return resolve(fetchBuffer(res.headers.location, timeoutMs, redirectsLeft - 1));
+      }
+      if (res.statusCode !== 200) {
+        res.resume();
+        return reject(new Error(`HTTP ${res.statusCode} for ${url}`));
+      }
+      const chunks = [];
+      res.on("data", (c) => chunks.push(c));
+      res.on("end", () => resolve(Buffer.concat(chunks)));
+      res.on("error", reject);
+    });
+    req.on("error", reject);
+    req.setTimeout(timeoutMs, () => {
+      req.destroy(new Error(`timeout after ${timeoutMs}ms fetching ${url}`));
+    });
+  });
+}
+
+async function main() {
+  const opts = parseArgs(process.argv);
+  if (!fs.existsSync(PROV_PATH)) {
+    process.stderr.write(`vendor/blamejs/_PROVENANCE.json missing\n`);
+    process.exitCode = 2;
+    return;
+  }
+  const prov = JSON.parse(fs.readFileSync(PROV_PATH, "utf8"));
+  const sourceRepo = prov.source_repo;
+  const pinnedCommit = prov.pinned_commit;
+  if (!sourceRepo || !pinnedCommit) {
+    process.stderr.write(`_PROVENANCE.json missing source_repo or pinned_commit\n`);
+    process.exitCode = 2;
+    return;
+  }
+
+  const findings = [];
+  for (const [name, info] of Object.entries(prov.files || {})) {
+    const url = rawUrlForPin(sourceRepo, pinnedCommit, info.upstream_path);
+    if (!url) {
+      findings.push({ name, ok: false, reason: `cannot compute raw URL for ${sourceRepo}` });
+      continue;
+    }
+    try {
+      const buf = await fetchBuffer(url, opts.timeoutMs);
+      const sha = crypto.createHash("sha256").update(buf).digest("hex");
+      if (info.upstream_sha256_at_pin && sha !== info.upstream_sha256_at_pin) {
+        findings.push({
+          name,
+          ok: false,
+          reason:
+            `upstream sha mismatch: recorded ${String(info.upstream_sha256_at_pin).slice(0, 12)}…, ` +
+            `live ${sha.slice(0, 12)}…`,
+          url,
+        });
+      } else {
+        findings.push({ name, ok: true, sha, url });
+      }
+    } catch (e) {
+      findings.push({ name, ok: false, reason: `fetch failed: ${e.message}`, url });
+    }
+  }
+
+  const failed = findings.filter((f) => !f.ok);
+  if (opts.json) {
+    process.stdout.write(JSON.stringify({ ok: failed.length === 0, findings }, null, 2) + "\n");
+  } else {
+    for (const f of findings) {
+      if (f.ok) process.stdout.write(`PASS ${f.name} ${f.sha.slice(0, 12)}…\n`);
+      else process.stdout.write(`FAIL ${f.name} ${f.reason}\n`);
+    }
+    process.stdout.write(
+      `\n${findings.length - failed.length}/${findings.length} vendored files match upstream pin.\n`
+    );
+  }
+  process.exitCode = failed.length === 0 ? 0 : 1;
+}
+
+if (require.main === module) {
+  main().catch((e) => {
+    process.stderr.write(`runtime error: ${e.message}\n`);
+    process.exitCode = 2;
+  });
+}
+
+module.exports = { rawUrlForPin, fetchBuffer };
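The URL translation and hash comparison in the script above can be exercised offline. The following is a minimal sketch, not the package's own test: `rawUrlForPin` is copied from the diff, while `checkAgainstPin`, the repository URL, the commit string, and the file contents are hypothetical sample values introduced only for illustration.

```javascript
"use strict";
const crypto = require("crypto");

// Same translation rule as rawUrlForPin in the diff above.
function rawUrlForPin(sourceRepo, commit, upstreamPath) {
  const m = (sourceRepo || "").match(
    /^https?:\/\/github\.com\/([^/]+)\/([^/]+?)(?:\.git)?\/?$/
  );
  if (!m) return null;
  const [, owner, repo] = m;
  const cleanPath = String(upstreamPath || "").replace(/^\/+/, "");
  return `https://raw.githubusercontent.com/${owner}/${repo}/${commit}/${cleanPath}`;
}

// Hypothetical helper (not in the package): hash a fetched buffer and
// compare it with the recorded pin. A missing pin passes, as in main().
function checkAgainstPin(buf, recordedSha) {
  const sha = crypto.createHash("sha256").update(buf).digest("hex");
  return { ok: !recordedSha || sha === recordedSha, sha };
}

// Sample values only; no network access is needed for this sketch.
const url = rawUrlForPin(
  "https://github.com/example-owner/example-repo.git",
  "abc123",
  "/lib/worker-pool.js"
);
const body = Buffer.from("module.exports = {};\n");
const pinned = crypto.createHash("sha256").update(body).digest("hex");
const good = checkAgainstPin(body, pinned);
const bad = checkAgainstPin(Buffer.from("tampered"), pinned);

console.log(url, good.ok, bad.ok);
```

Non-GitHub repository URLs yield `null`, which the script reports as a finding rather than attempting a fetch.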
@@ -18,6 +18,20 @@
  * The bug was invisible because CI's verify ran against the SOURCE tree,
  * not the shipped tarball. This gate closes that gap.
  *
+ * Audit G:
+ *   F9 — After the first-pass extraction (using the source-tree parseTar),
+ *        re-parse the tarball using the parseTar shipped INSIDE the
+ *        extracted tree itself. If the two parses disagree, fail with a
+ *        structured error. Catches the class where the shipped parser
+ *        silently rejects entries the source parser accepts (or vice
+ *        versa), which would mean operators run a different extractor
+ *        than CI exercised.
+ *   F15 — Invoke `npm pack --offline` so the gate cannot be blocked by
+ *        registry reachability problems during predeploy.
+ *   F4 — Cross-check the extracted public.pem against
+ *        keys/EXPECTED_FINGERPRINT (warn-and-continue when missing, fail
+ *        when present-but-mismatched and KEYS_ROTATED != 1).
+ *
  * Exit codes:
  * 0 verify passed against the packed tarball
  * 1 verify failed against the packed tarball (the bug class above)
@@ -42,7 +56,10 @@ function fail(msg, code = 1) {
 const tmpRoot = fs.mkdtempSync(path.join(os.tmpdir(), "verify-shipped-"));
 try {
   emit(`packing into ${tmpRoot} ...`);
-  const pack = spawnSync("npm", ["pack", "--pack-destination", tmpRoot], {
+  // F15 pass --offline. Predeploy must run without registry
+  // reachability; `npm pack` does not need the network for a local
+  // package and forcing offline mode hard-locks the assumption.
+  const pack = spawnSync("npm", ["pack", "--offline", "--pack-destination", tmpRoot], {
     cwd: ROOT,
     encoding: "utf8",
     shell: process.platform === "win32",
@@ -60,10 +77,10 @@ try {
   const extractDir = path.join(tmpRoot, "extract");
   fs.mkdirSync(extractDir, { recursive: true });
   const zlib = require("zlib");
-  const { parseTar } = require(path.join(ROOT, "lib", "refresh-network.js"));
+  const { parseTar: parseTarSource } = require(path.join(ROOT, "lib", "refresh-network.js"));
   const tgz = fs.readFileSync(tarballPath);
   const tarBuf = zlib.gunzipSync(tgz);
-  const entries = parseTar(tarBuf);
+  const entries = parseTarSource(tarBuf);
   for (const e of entries) {
     if (!e.name) continue;
     const dst = path.join(extractDir, e.name);
@@ -77,6 +94,65 @@ try {
   }
   emit(`extracted to ${pkgRoot}`);
 
+  // Audit G F9 — load the extracted tree's OWN parseTar and re-parse the
+  // tarball. If the two parsers diverge on entry list or content, the
+  // gate trips: this means CI exercised a different parser than operators
+  // will. Defense against drift between source and shipped tarball when
+  // someone edits lib/refresh-network.js without re-vendoring or vice
+  // versa.
+  const shippedParserPath = path.join(pkgRoot, "lib", "refresh-network.js");
+  if (!fs.existsSync(shippedParserPath)) {
+    fail(`extracted tree missing lib/refresh-network.js (cannot run F9 cross-parse check)`, 2);
+  }
+  let parseTarShipped;
+  try {
+    parseTarShipped = require(shippedParserPath).parseTar;
+  } catch (e) {
+    fail(`failed to load extracted parseTar: ${e.message}`, 2);
+  }
+  if (typeof parseTarShipped !== "function") {
+    fail(`extracted lib/refresh-network.js does not export parseTar`, 2);
+  }
+  const shippedEntries = parseTarShipped(tarBuf);
+  // Compare counts first — fast bailout.
+  const divergences = [];
+  if (shippedEntries.length !== entries.length) {
+    divergences.push(
+      `entry count divergence: source-tree parser produced ${entries.length}, ` +
+        `shipped parser produced ${shippedEntries.length}`
+    );
+  } else {
+    // Walk in parallel; tarball entry order is deterministic so positional
+    // compare is correct. Compare name + byte length + body bytes.
+    for (let i = 0; i < entries.length; i++) {
+      const a = entries[i];
+      const b = shippedEntries[i];
+      if (a.name !== b.name) {
+        divergences.push(`entry[${i}] name mismatch: source=${a.name} shipped=${b.name}`);
+        continue;
+      }
+      const aBuf = Buffer.isBuffer(a.body) ? a.body : Buffer.from(a.body);
+      const bBuf = Buffer.isBuffer(b.body) ? b.body : Buffer.from(b.body);
+      if (aBuf.length !== bBuf.length || !aBuf.equals(bBuf)) {
+        divergences.push(
+          `entry[${i}] (${a.name}) body bytes differ between source-tree and shipped parser ` +
+            `(source ${aBuf.length} bytes vs shipped ${bBuf.length} bytes)`
+        );
+      }
+    }
+  }
+  if (divergences.length > 0) {
+    emit(`*** F9: parseTar divergence between source-tree and shipped tree ***`);
+    for (const d of divergences.slice(0, 5)) emit(` - ${d}`);
+    if (divergences.length > 5) emit(` ... and ${divergences.length - 5} more`);
+    fail(
+      `parseTar implementations diverge between source tree and shipped tarball. ` +
+        `Operators will run a different extractor than CI exercised. Refusing to publish.`,
+      1
+    );
+  }
+  emit(`F9: source-tree and shipped parseTar agree on ${entries.length} entries`);
+
   // Run the verifier inline against the extracted package tree. This avoids
   // having to spawn a separate process whose cwd resolution differs across
   // platforms.
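The F9 cross-parse comparison in the hunk above can be sketched as a standalone function. This is an illustration under assumptions, not code from the package: `diffEntries` is a hypothetical name, and entries are assumed to have the `{ name, body }` shape the hunk relies on, with `body` either a Buffer or a string.

```javascript
"use strict";

// Hypothetical helper: compare two parsers' outputs for the same tarball.
// Returns a list of human-readable divergences (empty when they agree).
function diffEntries(sourceEntries, shippedEntries) {
  const divergences = [];
  if (sourceEntries.length !== shippedEntries.length) {
    // Fast bailout: positional comparison is meaningless if counts differ.
    divergences.push(
      `entry count divergence: source=${sourceEntries.length} shipped=${shippedEntries.length}`
    );
    return divergences;
  }
  for (let i = 0; i < sourceEntries.length; i++) {
    const a = sourceEntries[i];
    const b = shippedEntries[i];
    if (a.name !== b.name) {
      divergences.push(`entry[${i}] name mismatch: source=${a.name} shipped=${b.name}`);
      continue;
    }
    const aBuf = Buffer.isBuffer(a.body) ? a.body : Buffer.from(a.body);
    const bBuf = Buffer.isBuffer(b.body) ? b.body : Buffer.from(b.body);
    if (!aBuf.equals(bBuf)) {
      divergences.push(`entry[${i}] (${a.name}) body bytes differ`);
    }
  }
  return divergences;
}

// Sample entries, invented for the example.
const source = [
  { name: "package/a.txt", body: Buffer.from("hello") },
  { name: "package/b.txt", body: "world" },
];
const same = [
  { name: "package/a.txt", body: "hello" },
  { name: "package/b.txt", body: Buffer.from("world") },
];
const drifted = [
  { name: "package/a.txt", body: "hello" },
  { name: "package/b.txt", body: "w0rld" },
];

const clean = diffEntries(source, same);
const dirty = diffEntries(source, drifted);
console.log(clean.length, dirty);
```

`Buffer.equals` already returns false for buffers of different lengths, so the sketch omits the separate length check that the hunk performs for its error message.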
@@ -108,6 +184,33 @@ try {
     emit(`*** Something between sign and pack is swapping the key. Verify will fail below. ***`);
   }
 
+  // Audit G F4 — key-pin cross-check against the EXTRACTED tree. The pin
+  // is consumed from keys/EXPECTED_FINGERPRINT in the extracted package —
+  // that's the file operators will actually receive on `npm install`.
+  // Warn when absent, fail when present-but-mismatched (unless KEYS_ROTATED).
+  const expectedFpPath = path.join(pkgRoot, "keys", "EXPECTED_FINGERPRINT");
+  if (fs.existsSync(expectedFpPath)) {
+    const raw = fs.readFileSync(expectedFpPath, "utf8").trim();
+    const firstLine = raw.split(/\r?\n/).map((l) => l.trim()).find((l) => l.length > 0) || "";
+    const liveFpLine = `SHA256:${pubFp}`;
+    if (firstLine !== liveFpLine) {
+      if (process.env.KEYS_ROTATED === "1") {
+        emit(`WARN: extracted public.pem fingerprint ${liveFpLine} differs from pin ${firstLine}; KEYS_ROTATED=1 accepted`);
+      } else {
+        fail(
+          `keys/EXPECTED_FINGERPRINT (${firstLine}) does not match the extracted ` +
+            `public.pem fingerprint (${liveFpLine}). If this is an intentional rotation ` +
+            `set KEYS_ROTATED=1 and commit the new pin.`,
+          1
+        );
+      }
+    } else {
+      emit(`F4: key pin verified — ${liveFpLine} matches keys/EXPECTED_FINGERPRINT`);
+    }
+  } else {
+    emit(`WARN: keys/EXPECTED_FINGERPRINT not in extracted tree — key-pin check skipped`);
+  }
+
   let pass = 0, miss = 0, fail_count = 0;
   const failures = [];
   for (const s of (manifest.skills || [])) {
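The F4 key-pin decision in the hunk above reduces to a small decision table. The sketch below is an assumption-laden illustration: `checkKeyPin` and its return labels are hypothetical, the fingerprint is passed in as a plain string because the hunk does not show how `pubFp` is computed, and the sample pin text is invented.

```javascript
"use strict";

// Hypothetical helper mirroring the F4 outcomes. `pinFileText` is the
// content of the pin file, or null when the file is absent.
function checkKeyPin(pinFileText, liveFingerprint, keysRotated) {
  if (pinFileText === null) return "skipped"; // warn-and-continue case
  // Only the first non-empty line of the pin file is compared.
  const firstLine =
    pinFileText.trim().split(/\r?\n/).map((l) => l.trim()).find((l) => l.length > 0) || "";
  const liveLine = `SHA256:${liveFingerprint}`;
  if (firstLine === liveLine) return "verified";
  // A mismatch is tolerated only when rotation is explicitly declared.
  return keysRotated ? "warn-rotated" : "fail";
}

// Sample values, invented for the example.
const pin = "SHA256:ab12cd34\nsecond line is ignored\n";
const results = {
  match: checkKeyPin(pin, "ab12cd34", false),
  mismatch: checkKeyPin(pin, "ffff0000", false),
  rotated: checkKeyPin(pin, "ffff0000", true),
  absent: checkKeyPin(null, "ab12cd34", false),
};
console.log(results);
```

In the gate itself the third argument corresponds to `process.env.KEYS_ROTATED === "1"`, and the `fail` outcome exits with code 1.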
@@ -63,7 +63,7 @@ The AI attack surface is not speculative. It is actively exploited. The followin
 
 ### 1. Prompt Injection as Enterprise RCE
 
-**CVE-2025-53773** — Hidden prompt injection in GitHub Copilot PR descriptions enabling RCE. CVSS 9.6. The attack embeds adversarial instructions in GitHub PR descriptions. When a developer uses GitHub Copilot to review or summarize the PR, the injected instructions execute in the context of the developer's session, enabling remote code execution.
+**CVE-2025-53773** — Hidden prompt injection in GitHub Copilot agent mode coerces the assistant to write `"chat.tools.autoApprove": true` into `.vscode/settings.json`, flipping every subsequent tool call into auto-approval. CVSS 7.8 (AV:L — local-vector through developer-side IDE interaction; RWEP 30). The attack embeds adversarial instructions in any agent-readable content (source comments, README, PR descriptions, retrieved docs, MCP tool responses). Once the YOLO-mode flag lands, the next shell tool call executes attacker-chosen commands in the developer's user context.
 
 This is not a chatbot trick. This is enterprise RCE via a developer tool used by hundreds of millions of developers. The attack surface is any system that:
 - Feeds external content (user input, web content, documents, PR descriptions, emails, calendar events) into an LLM prompt
@@ -71,13 +71,13 @@ This is not a chatbot trick. This is enterprise RCE via a developer tool used by
 
 **Attack success rates against SOTA defenses:** A 2026 meta-analysis of 78 studies found adaptive prompt injection strategies succeed against state-of-the-art defenses at rates exceeding 85%. No current framework has adequate controls for this.
 
-**ATLAS ref:** AML.T0054 (Craft Adversarial Data NLP)
+**ATLAS ref:** AML.T0054 (LLM Jailbreak) and AML.T0051 (LLM Prompt Injection)
 
 ### 2. MCP Supply Chain — Architectural RCE
 
 The Model Context Protocol (MCP) introduced an architectural vulnerability affecting every major AI coding assistant: Cursor, VS Code + GitHub Copilot, Windsurf, Claude Code, Gemini CLI.
 
-**CVE-2026-30615** — Windsurf. Zero user interaction required. The vulnerability allows a malicious MCP server (or a compromised legitimate MCP server) to execute arbitrary code in the context of the AI assistant. 150M+ affected downloads.
+**CVE-2026-30615** — Windsurf MCP. CVSS 8.0 (AV:L — local-vector RCE requiring attacker-controlled HTML the MCP client processes; RWEP 35). The vulnerability allows a malicious or compromised MCP server to drive code execution in the context of the AI assistant once a victim installs it. 150M+ combined downloads across MCP-capable assistants share the same architectural attack surface.
 
 This is a supply chain attack surface. Every MCP server a user installs is a potential RCE vector. Trust boundaries that exist for npm packages do not exist for MCP servers because most MCP clients do not enforce signed manifests or tool allowlists.
 
@@ -89,13 +89,13 @@ This is a supply chain attack surface. Every MCP server a user installs is a pot
 
 The implication: the time between a vulnerability's introduction into a codebase and its reliable exploitation has compressed from months or years to hours or days for AI-capable threat actors. Patch management SLAs designed for human-speed exploit development are structurally inadequate.
 
-**ATLAS ref:** AML.T0017 (Develop Capabilities)
+**ATLAS ref:** AML.T0016 (Obtain Capabilities: Develop Capabilities)
 
 ### 4. AI Credential Phishing Acceleration
 
 Credential theft driven by AI increased 160% in 2025. 82.6% of phishing emails now contain AI-generated content undetectable by grammar/style checks. Traditional phishing detection heuristics (poor grammar, unusual phrasing, template patterns) are no longer reliable detectors.
 
-**ATLAS ref:** AML.T0018 (Acquire Public ML Artifacts — misuse of generation capability)
+**ATLAS ref:** AML.T0016 (Obtain Capabilities: Develop Capabilities — misuse of public AI APIs to generate phishing payloads)
 
 ### 5. AI as Covert C2 — SesameOp
 
@@ -127,6 +127,14 @@ Training pipeline targeting has moved beyond data injection to directly biasing
 
 AI-assisted reconnaissance is observed at 36,000 probes per second per campaign. Traditional rate-based detection (100–1,000 req/s threshold alerts) does not fire at legitimate-looking distributed AI-directed probe rates until significant reconnaissance has already occurred.
 
+### 10. LLM-Gateway Credential Theft as AI Attack Surface
+
+**CVE-2026-42208** — BerriAI LiteLLM Proxy authorization-header SQL injection (CVSS 9.8 / CVSS v4 9.3 / CISA KEV-listed 2026-05-08, due 2026-05-29). LiteLLM is the open-source LLM-API gateway used in front of agent stacks, MCP-server fronts, and multi-model proxy deployments — exactly the trust hinge that this skill's threat-context section treats as the credential boundary for hosted-model use. The proxy concatenated an attacker-controlled `Authorization` header value into a SQL query in the error-logging path, so a single curl-able POST against `/chat/completions` with a SQL-injection payload returns the managed-credentials DB content without prior auth. Patched in 1.83.7+; temporary workaround `general_settings: disable_error_logs: true`. Any organisation whose AI attack-surface inventory treats the LLM gateway as "just a reverse proxy" misses that the gateway holds every downstream model-provider credential.
+
+### 11. AI-Discovered + AI-Weaponized Supply-Chain Worms
+
+**CVE-2026-45321** — Mini Shai-Hulud TanStack npm worm (CVSS 9.6, ~150M weekly downloads across 42 @tanstack/* packages, CISA KEV pending). Disclosed 2026-05-11. The attack chain — Pwn-Request via `pull_request_target` on TanStack's bundle-size workflow, pnpm-store cache poisoning under the `actions/cache` key, and OIDC-token theft on the next main push — is engineering-grade and weaponizes three independently-benign primitives. While attribution (TeamPCP) records no AI-assisted exploit development for this specific instance, the worm pattern is exactly what AML.T0016-class capability-development now produces at AI cadence: chained CI/CD primitives that no individual component owner recognises as exploitable. Treat the @tanstack/* surface as an exemplar of the broader AML.T0010 (ML Supply Chain Compromise) threat applied to JS toolchains that the AI assistant ecosystem depends on.
+
 ---
 
 ## Framework Lag Declaration
@@ -150,14 +158,14 @@ AI-assisted reconnaissance is observed at 36,000 probes per second per campaign.
 
 | ATLAS ID | Technique | Framework Coverage | Gap Description | Exploitation Example |
 |---|---|---|---|---|
-| AML.T0054 | Craft Adversarial Data — NLP | Missing in all major frameworks | No control covers adversarial text injection into LLM prompts | CVE-2025-53773 (GitHub Copilot RCE) |
+| AML.T0054 | LLM Jailbreak | Missing in all major frameworks | No control covers adversarial-instruction injection that bypasses guardrails and coerces the model into attacker-chosen actions | CVE-2025-53773 (GitHub Copilot YOLO-mode RCE) |
 | AML.T0010 | ML Supply Chain Compromise | Partial (ISO A.8.30) | A.8.30 covers outsourced development; does not cover MCP server trust, package signing for AI tools | CVE-2026-30615 (Windsurf MCP) |
 | AML.T0096 | LLM Integration Abuse (C2) | Missing in all major frameworks | No framework has a control for AI API traffic as C2 channel | SesameOp campaign |
 | AML.T0020 | Poison Training Data | Partial (NIST AI RMF) | NIST AI RMF identifies the risk; no specific technical control | Supply chain logistics model poisoning |
 | AML.T0043 | Craft Adversarial Data | Partial (SI-10) | SI-10 covers web input validation; not semantic injection in LLM prompts | RAG vector manipulation |
 | AML.T0051 | LLM Prompt Injection | Missing in all major frameworks | Zero controls in NIST, ISO, SOC 2, PCI for prompt injection | CVE-2025-53773, indirect injection via retrieved docs |
-| AML.T0017 | Develop Capabilities | Partial (awareness only) | No framework requires monitoring for AI-assisted exploit development against the org | Copy Fail AI discovery, 41% of 2025 0-days |
-| AML.T0016 | Acquire Public ML Artifacts | Missing (misuse dimension) | Frameworks don't address adversary use of public AI APIs for reconnaissance/attack | PROMPTFLUX, PROMPTSTEAL, phishing generation |
+| AML.T0017 | Discover ML Model Ontology | Partial (awareness only) | No framework requires monitoring for adversary mapping of deployed model family, guardrail surface, or system-prompt structure via inference-API probing | Reconnaissance step preceding PROMPTSTEAL-class targeting; AML-model registry exposure |
+| AML.T0016 | Obtain Capabilities: Develop Capabilities | Missing (misuse dimension) | Frameworks don't address adversary AI-assisted exploit development or use of public AI APIs to craft malware/phishing payloads | Copy Fail AI discovery (41% of 2025 0-days), PROMPTFLUX, PROMPTSTEAL, phishing generation |
 | AML.T0018 | Backdoor ML Model | Partial (NIST AI RMF) | No technical control requirements for model integrity verification | Training pipeline poisoning |
 
 ---
@@ -166,8 +174,8 @@ AI-assisted reconnaissance is observed at 36,000 probes per second per campaign.
 
 | Vulnerability | CVSS | RWEP | KEV | PoC | AI-Accelerated | Active Exploitation |
 |---|---|---|---|---|---|---|
-| CVE-2025-53773 (Copilot prompt injection RCE) | 9.6 | 91 | No | Yes — demonstrated | Yes (AI tooling enables) | Suspected |
-| CVE-2026-30615 (Windsurf MCP RCE) | 9.8 | 94 | No | Partial | No | Suspected |
+| CVE-2025-53773 (Copilot YOLO-mode RCE) | 7.8 | 30 | No | Yes — demonstrated | Yes (AI tooling enables) | Suspected |
+| CVE-2026-30615 (Windsurf MCP local-vector RCE) | 8.0 | 35 | No | Partial | No | Suspected |
 | SesameOp (AI C2 technique) | N/A | N/A | N/A | Yes (ATLAS documented) | Yes | Confirmed campaign |
 | PROMPTFLUX family | N/A | N/A | N/A | Behavioral signatures | Yes | Active |
 | PROMPTSTEAL family | N/A | N/A | N/A | Behavioral signatures | Yes | Active |
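The CVE-2025-53773 mechanism described in this skill's diff (an agent writing `"chat.tools.autoApprove": true` into `.vscode/settings.json`) suggests a simple workspace check. The sketch below is illustrative only and not part of the package: `autoApproveEnabled` is a hypothetical helper, and it uses a textual match rather than `JSON.parse` on the assumption that settings files may contain comments.

```javascript
"use strict";

// Hypothetical detector: report whether a settings file's text enables the
// auto-approve flag named in the skill text. A textual match tolerates
// comments in the file, at the cost of also matching a commented-out
// setting (a possible false positive).
function autoApproveEnabled(settingsText) {
  return /"chat\.tools\.autoApprove"\s*:\s*true\b/.test(settingsText);
}

// Sample inputs, invented for the example.
const flipped = '{\n  // written by the agent\n  "chat.tools.autoApprove": true\n}';
const safe = '{ "chat.tools.autoApprove": false, "editor.tabSize": 2 }';

console.log(autoApproveEnabled(flipped), autoApproveEnabled(safe));
```

A check like this only detects the flag after it lands; it does not address the injection path itself.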
@@ -324,7 +324,8 @@ level: medium
 | ID | Source | Technique | C2 Relevance | Gap Flag — Which Detection Control Fails |
 |---|---|---|---|---|
 | AML.T0096 | ATLAS v5.1.0 | LLM API as covert C2 / LLM Integration Abuse | Direct: SesameOp encodes commands and exfiltrated data in prompt and completion fields against api.openai.com, api.anthropic.com, generativelanguage.googleapis.com. AI provider domain is the relay, not the attacker C2 endpoint. | NIST-800-53-SC-7 (Boundary Protection) — AI provider domains are allowlisted in most enterprise egress for legitimate developer and product use, so boundary inspection cannot distinguish benign developer prompts from C2-encoded prompts. See SC-7 entry in `data/framework-control-gaps.json` — real requirement is SDK-level prompt logging with identity binding, anomaly detection on prompt-shape and token-volume, and an allowlist that enumerates the sanctioned business reason per identity. Boundary-only SC-7 evidence is incomplete for any org with AI API access in production. |
-| AML.T0017 | ATLAS v5.1.0 | Develop Capabilities, including adversary use of inference APIs to develop/refine attack capability (and model exfiltration via inference API where applicable) | PROMPTFLUX queries public LLMs to generate per-execution evasion code; PROMPTSTEAL uses LLMs to prioritise exfiltration targets. The inference API is doing capability-development work for the adversary in real time. | NIST-800-53-SI-3 fails — there is no static signature for code generated per-event by a public LLM. NIST-800-53-SI-4 fails as commonly deployed — no AI-API behavioural baseline per process/identity. |
+| AML.T0017 | ATLAS v5.1.0 | Discover ML Model Ontology — adversary maps the deployed LLM's family, system-prompt structure, guardrail surface via inference-API probing | PROMPTFLUX queries public LLMs to generate per-execution evasion code; PROMPTSTEAL uses LLMs to prioritise exfiltration targets; both depend on first discovering what the target model will answer. The inference API is the discovery surface. | NIST-800-53-SI-3 fails — there is no static signature for code generated per-event by a public LLM. NIST-800-53-SI-4 fails as commonly deployed — no AI-API behavioural baseline per process/identity. |
+| AML.T0016 | ATLAS v5.1.0 | Obtain Capabilities: Develop Capabilities — adversary use of inference APIs to generate / refine malware, evasion, phishing payloads | PROMPTFLUX and PROMPTSTEAL both consume public LLMs as a real-time capability-development service. The inference API is doing weaponization work for the adversary. | NIST-800-53-SI-3 fails for the same reason. SC-7 boundary control treats the AI provider as allowlisted SaaS. |
 | T1071 | ATT&CK | Application Layer Protocol (C2) | AI C2 traffic is standard HTTPS REST to api.openai.com or equivalent. Application-protocol C2 detection that looks for DGA, unusual TLS, or beaconing does not fire. | SC-7 boundary control sees only the destination domain (allowlisted) — no protocol anomaly to alert on. Detection requires identity-bound prompt content inspection, which SC-7 as written does not require. |
 | T1102 | ATT&CK | Web Service (C2 via legitimate web service) | AI API endpoints are exactly the "legitimate web service used as C2" pattern that T1102 describes — but at scale and pre-allowlisted in nearly every enterprise. | SOC 2 CC7 anomaly-detection control: AI API traffic shares the SaaS blind spot — typically not baselined per process or identity. ISO 27001 A.8.16 monitoring activities: no guidance for AI-API-shaped traffic. |
 | T1568 | ATT&CK | Dynamic Resolution | AI provider responses can carry encoded instructions that dynamically determine the next-hop behaviour for the malware (effectively model-mediated dynamic resolution of the next attacker instruction). | No standard DNS-tunnelling or DGA detection applies — the "resolution" happens inside an HTTPS payload to a trusted endpoint. SC-7 cannot see it without SDK-level prompt + response logging. |
@@ -342,7 +343,11 @@ The threats in this skill are adversary TTPs and malware families rather than ve
342
343
  | PROMPTSTEAL | No — adversary malware family | Yes — public reporting on LLM-assisted exfiltration prioritisation and lateral-movement guidance | No | Yes — LLM is acting as the adversary's live intelligence analyst | Minimal — requires correlation of AI API calls with credential-access and file-access events. Possible to build in a SIEM; not a default rule pack. | None enterprise-subscribable. |
343
344
  | AI C2 — generic (T1071 / T1102 / T1568 over AI APIs) | No | Yes — research and red-team demonstrations across all major AI providers | No | Yes | Minimal — boundary controls treat AI provider domains as allowlisted SaaS; content-layer inspection requires TLS interception plus SDK-level prompt logging, which most orgs do not run. | Partial / inconsistent across providers. |
344
345
 
345
- **Interpretation:** there is no patch to apply because there is no vendor CVE. Mitigation is detection-architectural: SDK-level prompt logging with identity binding, AI-API behavioural baselining per process, correlation with credential/file/scan events, and an explicit allowlist that enumerates the sanctioned business reason per identity (per the SC-7 real_requirement in `data/framework-control-gaps.json`).
346
+ **Interpretation:** there is no patch to apply because there is no vendor CVE for the SesameOp / PROMPTFLUX / PROMPTSTEAL class. Mitigation is detection-architectural: SDK-level prompt logging with identity binding, AI-API behavioural baselining per process, correlation with credential/file/scan events, and an explicit allowlist that enumerates the sanctioned business reason per identity (per the SC-7 real_requirement in `data/framework-control-gaps.json`).
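
The per-identity allowlist described above can be sketched roughly as follows. This is a minimal illustration, not code from the package: the `allowlist` contents, the `classifyAiCall` helper, and the `{ identity, host }` event shape are all assumptions introduced for the example.

```javascript
// Hypothetical allowlist: each identity maps to the AI endpoints it may call
// and the sanctioned business reason. All names and values are illustrative.
const allowlist = new Map([
  ["svc-support-bot", { hosts: ["api.openai.com"], reason: "customer-support summarisation" }],
]);

// Classify one logged AI-API call. The event shape is an assumption.
function classifyAiCall(event) {
  const entry = allowlist.get(event.identity);
  if (!entry) {
    // No sanctioned AI use recorded for this identity at all.
    return { verdict: "alert", why: "identity has no sanctioned AI use" };
  }
  if (!entry.hosts.includes(event.host)) {
    // Identity is sanctioned, but not for this provider endpoint.
    return { verdict: "alert", why: "host not sanctioned for this identity" };
  }
  return { verdict: "allow", why: entry.reason };
}
```

The point of the design is that an allowlisted destination domain is not sufficient on its own: the same call to `api.openai.com` is allowed for one identity and alerts for another.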
347
+
348
+ ### LLM-Gateway Credential Theft as an AI-C2 Adjacent Class
349
+
350
+ **CVE-2026-42208** — BerriAI LiteLLM Proxy Authorization-header SQL injection (CVSS 9.8 / CVSS v4 9.3 / CISA KEV-listed 2026-05-08, federal due 2026-05-29; in-wild exploitation evidence). LiteLLM is the open-source LLM-API gateway used in front of agent stacks, MCP-server fronts, and multi-model proxy deployments — exactly the egress hinge this skill's detection architecture treats as the credential boundary for hosted-model use. The proxy concatenated an attacker-controlled `Authorization` header value into a SQL query in the error-logging path, so a curl-able POST against `/chat/completions` with a SQL-injection payload returns the managed-credentials DB content without prior auth. Patched in 1.83.7+; temporary workaround `general_settings: disable_error_logs: true`. Detection-relevance for this skill: an AI-API egress baseline that records only outbound destinations misses the *inbound* SQL injection on the proxy itself; pair `D3-IOPR` (SDK-level prompt logging) with `D3-CSPP` payload profiling on the gateway's inbound request stream. Any organisation whose AI-API egress visibility treats the LLM gateway as "just a reverse proxy" will discover post-breach that the gateway held every downstream model-provider credential and was the actual covert exfiltration channel.
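
The injection class described above (a header value concatenated into SQL text) can be illustrated with a generic sketch. This is not LiteLLM's code, which is written in Python; the `api_keys` table, the column names, and both helper functions are hypothetical, and the `$1` placeholder follows the PostgreSQL style as an assumption.

```javascript
// Vulnerable pattern: the header value becomes part of the SQL text itself,
// so quote characters in the header change the structure of the query.
function buildUnsafeQuery(authHeader) {
  return {
    text: "SELECT key_id FROM api_keys WHERE token = '" + authHeader + "'",
    values: [],
  };
}

// Safer pattern: the SQL text is constant and the header travels separately
// as a bound parameter, so it is only ever treated as data.
function buildParameterisedQuery(authHeader) {
  return {
    text: "SELECT key_id FROM api_keys WHERE token = $1",
    values: [authHeader],
  };
}
```

With a header such as `x' OR '1'='1`, the first function yields SQL whose `WHERE` clause has been rewritten by the attacker, while the second leaves the query text unchanged. Inbound payload profiling on the gateway is looking for exactly this kind of quote-bearing header value.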
346
351
 
347
352
  ### RFC Transport Reality
348
353
 
@@ -65,7 +65,7 @@ AI governance moved from voluntary to mandatory between 2024 and 2026. The trans
65
65
 
66
66
  The gap on the ground is severe and the same in every jurisdiction the maintainers have spot-checked through Q2 2026: most organisations deploying LLMs, agents, RAG pipelines, and AI-augmented developer tooling have **zero governance artefact specific to AI**. They assume general security policies, the existing risk register, the existing vendor management programme, and the existing incident response playbook cover AI by inheritance. They do not. Concretely:
67
67
 
68
- - The risk register has no entry for prompt injection (AML.T0051), AI-as-C2 (AML.T0096), or AI-assisted exploit development against the organisation (AML.T0017).
68
+ - The risk register has no entry for prompt injection (AML.T0051), AI-as-C2 (AML.T0096), or AI-assisted exploit development against the organisation (AML.T0016 + AML.T0017).
69
69
  - The vendor management programme treats AI providers as ordinary SaaS suppliers and accepts a SOC 2 Type II as evidence of AI-specific control adequacy — even though SOC 2 has no AI-specific criteria.
70
70
  - The incident response playbook does not enumerate AI-specific incident classes (model exfiltration, training-data poisoning, agent compromise via MCP server, RAG corpus contamination, AI vendor breach affecting derived embeddings).
71
71
  - The data inventory does not include vector embedding stores, model weights, or LLM prompt/response logs as classified data assets.
@@ -77,7 +77,7 @@ AI red-team activity has likewise shifted from voluntary research practice to go
77
77
  - NIST AI RMF MEASURE 2.5 expects organisations to assess AI risks during operation (`data/framework-control-gaps.json` → `NIST-AI-RMF-MEASURE-2.5`).
78
78
  - OWASP LLM Top 10 2025 (LLM01: Prompt Injection — `data/framework-control-gaps.json` → `OWASP-LLM-Top-10-2025-LLM01`) is treated by auditors as the working operational checklist where ISO/IEC 42001 is silent on technical specifics.
79
79
 
80
- The 2024–2026 disclosure record is unforgiving: vendor advisories from OpenAI, Anthropic, Google DeepMind, and Microsoft have published AI vulnerability disclosures spanning prompt-injection RCEs (CVE-2025-53773), MCP supply-chain RCEs (CVE-2026-30615), agentic-pipeline compromise patterns, and indirect-injection via retrieved content. An organisation with no governance artefact mapping these classes to internal use cases is not in a position to act on any of them.
80
+ The 2024–2026 disclosure record is unforgiving: vendor advisories from OpenAI, Anthropic, Google DeepMind, and Microsoft have published AI vulnerability disclosures spanning prompt-injection-driven RCE (CVE-2025-53773, CVSS 7.8 / AV:L), local-vector MCP supply-chain RCE (CVE-2026-30615, CVSS 8.0 / AV:L), agentic-pipeline compromise patterns, and indirect-injection via retrieved content. An organisation with no governance artefact mapping these classes to internal use cases is not in a position to act on any of them.
81
81
 
82
82
  ---
83
83
 
@@ -113,7 +113,8 @@ Governance failure surfaces as exploitable threat. The TTPs below are the diagno
113
113
  |---|---|---|---|
114
114
  | AML.T0051 | LLM Prompt Injection | No prompt/response logging, no semantic monitoring, no AI use-case-level risk treatment decision | ISO/IEC 23894 clause 7 risk treatment register has no entry; OWASP LLM01 control unowned. CWE-1426 (improper validation of generative AI output) is the root-cause class. |
115
115
  | AML.T0096 | LLM Integration Abuse (covert C2) | No baseline of normal AI API traffic per principal; AI API egress treated as trusted internal traffic | NIST AI RMF MEASURE 2.5 not operationalised; SesameOp-class detection absent from SOC playbooks. |
116
- | AML.T0017 | Develop Capabilities (adversary AI-assisted exploit development) | No threat-intelligence ingestion path for AI-discovered vulnerabilities; patch SLAs sized for human-speed exploit development | EU AI Act Art. 9 RMS not iterating on the input that 41% of 2025 zero-days are AI-discovered (per `ai-attack-surface` and `zeroday-lessons.json`). |
116
+ | AML.T0017 | Discover ML Model Ontology — adversary reconnaissance of deployed model family / guardrails | No inference-API rate / shape baseline; model-registry RBAC absent; system-prompt extraction queries undetected | NIST AI RMF MEASURE 2.5 not requiring per-identity inference monitoring; AIMS lacks a probing-detection control. |
117
+ | AML.T0016 | Obtain Capabilities: Develop Capabilities (adversary AI-assisted exploit / payload development) | No threat-intelligence ingestion path for AI-discovered vulnerabilities; patch SLAs sized for human-speed exploit development | EU AI Act Art. 9 RMS not iterating on the input that 41% of 2025 zero-days are AI-discovered (per `ai-attack-surface` and `zeroday-lessons.json`). |
117
118
 
118
119
  Supporting weakness classes consumed from `data/cwe-catalog.json`:
119
120
  - **CWE-1426** — improper validation of generative AI output. The governance correlate: every AI use case must declare what output validation is performed and who owns it.
@@ -134,7 +135,7 @@ Adversary capability versus organisational governance maturity is the relevant a
134
135
  |---|---|---|---|
135
136
  | Low (off-the-shelf prompt injection per AML.T0051) | Exploitable today. Bypass rates >85% against SOTA defences (per `ai-attack-surface`). No detection. | Exploitable. Risk register names the threat; no detection or response capability deployed. | Detection latency: minutes-to-hours. Response playbook bound to incident class. |
136
137
  | Medium (AI-as-C2 per AML.T0096, SesameOp pattern) | Exploited last quarter by definition — no AI API logging, no baseline. | Detection-blind: AI traffic logged but no behavioural baseline. | Behavioural baseline + correlation with host activity per `ai-attack-surface` Step 4. |
137
- | High (AI-assisted exploit development per AML.T0017, Copy Fail-class) | Patch SLA structurally inadequate; live-patch capability absent. | Patch SLA sized for human-speed exploit development. | RWEP-driven prioritisation (`lib/scoring.js`), live-patch SLA <4h for KEV+PoC+AI-discovered class. |
138
+ | High (AI-assisted exploit development per AML.T0016, Copy Fail-class) | Patch SLA structurally inadequate; live-patch capability absent. | Patch SLA sized for human-speed exploit development. | RWEP-driven prioritisation (`lib/scoring.js`), live-patch SLA <4h for KEV+PoC+AI-discovered class. |
138
139
  | Frontier (training pipeline poisoning, supply-chain compromise of model weights — AML.T0020 catalogue) | No AI supplier risk register; vendor SOC 2 accepted as adequate. | AI vendor register exists; no 4th-party (AI-of-AI) coverage. | EU AI Act Art. 10 data governance + Art. 72 adversarial testing operationalised; vendor adversarial-test attestations required contractually. |
139
140
 
140
141
  Reference incident inputs to the matrix: vendor advisories from Anthropic, OpenAI, Google DeepMind, Microsoft across 2024–2026; the emergent agentic-attack patterns observed through 2025–2026 disclosed in coordinated-vulnerability programmes per `coordinated-vuln-disclosure`; the AI-as-C2 evidence base referenced in `ai-c2-detection`; the prompt-injection-RCE and MCP-RCE CVE evidence referenced in `ai-attack-surface`.
@@ -78,7 +78,7 @@ APIs are now the integration substrate of every non-trivial system. The mid-2026
78
78
 
79
79
  1. **AI-API rate-limit abuse / denial-of-wallet** — a stolen API key or compromised internal service burns the organisation's spend cap on a model endpoint. GPT-class and Claude-class token costs at production volume run to five-to-six figures per day per workload — a key exfiltrated on Friday and abused over a weekend is a real budget event.
80
80
  2. **Prompt-injection-as-C2** — user-controlled content reaches an LLM-fronted internal API and exfiltrates data through the model's response channel (the model becomes the covert C2 channel). Hand-off to `ai-c2-detection` for SesameOp-class detection patterns.
81
- 3. **Model extraction via inference rate (AML.T0017)** — high-volume queries against a hosted model are used to reconstruct the model's behaviour or training data. Detected at egress only by per-identity rate-and-shape monitoring, not by request count alone.
81
+ 3. **Model extraction via inference rate (AML.T0017 — Discover ML Model Ontology)** — high-volume queries against a hosted model are used to reconstruct the model's behaviour, system prompt, guardrail surface, or training-data signal. Detected at egress only by per-identity rate-and-shape monitoring, not by request count alone.
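
The per-identity rate-and-shape monitoring mentioned in item 3 might look something like the sketch below. It is a simplified illustration under stated assumptions: the `promptShape` normalisation, the `flagProbingIdentities` helper, the `{ identity, prompt }` call record, and the threshold values are all hypothetical and would need tuning against real traffic.

```javascript
// Reduce a prompt to a coarse "shape" so that templated requests collapse
// into a single bucket (digits masked, whitespace squeezed, prefix only).
function promptShape(prompt) {
  return prompt.toLowerCase().replace(/\d+/g, "#").replace(/\s+/g, " ").trim().slice(0, 40);
}

// Summarise one time window of inference calls per identity and flag those
// that are both high-volume and high-diversity. Thresholds are illustrative.
function flagProbingIdentities(calls, { maxCalls = 100, maxShapes = 20 } = {}) {
  const stats = new Map();
  for (const { identity, prompt } of calls) {
    const s = stats.get(identity) ?? { count: 0, shapes: new Set() };
    s.count += 1;
    s.shapes.add(promptShape(prompt));
    stats.set(identity, s);
  }
  const flagged = [];
  for (const [identity, s] of stats) {
    // Request count alone is not enough; shape diversity must also be high.
    if (s.count > maxCalls && s.shapes.size > maxShapes) flagged.push(identity);
  }
  return flagged;
}
```

A batch job that sends hundreds of near-identical templated prompts stays under the shape threshold, whereas an identity that sends a similar volume of structurally varied probes is flagged. That is the distinction request counting alone cannot make.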
82
82
 
83
83
  **MCP transport runs over HTTP/SSE.** Anthropic's **Model Context Protocol** (MCP) — the de-facto agent-to-tool protocol adopted across the industry through 2025 — uses HTTP and Server-Sent Events as its transport. That means MCP traffic is API traffic and inherits every API attack surface: auth, rate limiting, schema validation, BOLA on tool calls, SSRF if a tool fetches URLs. Hand-off to `mcp-agent-trust` for MCP-specific semantics; the API-security posture is foundational.
84
84
 
@@ -125,7 +125,7 @@ APIs are now the integration substrate of every non-trivial system. The mid-2026
125
125
  | T1078 | Valid Accounts | Stolen API token / OAuth refresh token / leaked service-account key reused against the API; key-exfil-then-abuse pattern dominant for AI-API rate-limit abuse | CWE-287, CWE-200 | Partial — NIST-800-53-AC-2 manages account lifecycle but not per-object authz, not key rotation cadence for AI-API keys |
126
126
  | T1567 | Exfiltration Over Web Service | Sensitive data egressed via a legitimate API channel — AI-API response stream as covert C2; OAuth-token-scoped exfil over the org's own API | CWE-200, CWE-918 | Missing — no framework mandates per-identity egress baselining; D3-NTA is the operational control (see Defensive Countermeasure Mapping) |
127
127
  | AML.T0096 | AI Service Exploitation (AI-API as covert C2) | LLM API used as a covert command-and-control / exfil channel — prompt content carries instructions; response carries staged data | CWE-77, CWE-200 | Missing in NIST/ISO; hand-off to `ai-c2-detection` |
128
- | AML.T0017 | Model Extraction via Inference API | High-volume queries against a hosted model used to reconstruct behaviour / training data | CWE-200 | Missing — detected only by per-identity rate-and-shape monitoring at egress |
128
+ | AML.T0017 | Discover ML Model Ontology (inference-API probing for system-prompt, guardrail, model-family signal) | High-volume queries against a hosted model used to reconstruct behaviour, guardrail surface, or training-data signal | CWE-200 | Missing — detected only by per-identity rate-and-shape monitoring at egress |
129
129
 
130
130
  CWE root-causes referenced as a set (per `cwe_refs` in frontmatter): CWE-287 (Improper Authentication), CWE-862 (Missing Authorization — BFLA root cause), CWE-863 (Incorrect Authorization — BOLA root cause), CWE-918 (SSRF — API7), CWE-200 (Information Exposure — BOPLA contributor), CWE-352 (CSRF — cookie-auth APIs + WebSocket CSWSH), CWE-22 (Path Traversal — API parameter sinks), CWE-77 (Command Injection — API parameter to shell), CWE-1188 (Insecure Default Initialization — default-open API state).
131
131
 
@@ -147,7 +147,7 @@ CWE root-causes referenced as a set (per `cwe_refs` in frontmatter): CWE-287 (Im
147
147
  | API10 Unsafe Consumption of Third-Party APIs | Burp, custom integration fuzz | Egress allowlist; per-third-party threat model | Yes — agentic frameworks chain via third-party trust | Variable; transitive RCE chains via consumed AI-API or SaaS API are high | Emergent class through 2025–2026; AI-API consumption dominant subtype |
148
148
  | AI-API rate-limit abuse / denial-of-wallet | Stolen-key abuse scripts; trivial automation | Per-identity + per-cost-unit egress quotas; budget alarms | Yes — fully automated | Direct USD loss — measurable per incident | High when keys leak; common via committed secrets, third-party breach, browser-extension exfil |
149
149
  | AML.T0096 prompt-injection-as-C2 | Custom payload corpora; Promptfoo, Garak | Output guardrails, egress baselining (D3-NTA) | Yes — adaptive injection succeeds >85% against SOTA guardrails per 2026 meta-analysis | Emergent category | Active operational reality; hand-off to `ai-c2-detection` |
150
- | AML.T0017 model extraction | High-volume inference scripts; query-shape diversity tooling | Per-identity rate-and-shape monitoring at egress | Yes — agentic query diversification | Emergent | Active in adversarial-ML research; bleeding into production where hosted models expose probability vectors |
150
+ | AML.T0017 Discover ML Model Ontology (inference-API probing) | High-volume inference scripts; query-shape diversity tooling | Per-identity rate-and-shape monitoring at egress | Yes — agentic query diversification | Emergent | Active in adversarial-ML research; bleeding into production where hosted models expose probability vectors |
151
151
 
152
152
  ---
153
153
 
@@ -78,11 +78,11 @@ Every enterprise now has outbound HTTPS to one or more LLM providers (OpenAI, An
78
78
 
79
79
  ### 3. MCP servers as RCE surface
80
80
 
81
- CVE-2026-30615 (Windsurf MCP, CVSS 9.8) is the canonical example: a malicious MCP server executes arbitrary code in the AI assistant's context with zero user interaction. 150M+ affected downloads. Every developer workstation with an MCP-aware client (Cursor, VS Code + Copilot, Windsurf, Claude Code, Gemini CLI) is potentially a network of unsigned-package RCE vectors that no traditional asset inventory enumerates. ATLAS AML.T0010 (ML Supply Chain Compromise).
81
+ CVE-2026-30615 (Windsurf MCP, CVSS 8.0 / AV:L / RWEP 35) is the canonical example: a malicious MCP server drives code execution in the AI assistant's context via attacker-controlled HTML processed by the MCP client. 150M+ combined downloads across MCP-capable assistants share the architectural surface. Every developer workstation with an MCP-aware client (Cursor, VS Code + Copilot, Windsurf, Claude Code, Gemini CLI) is potentially a network of unsigned-package RCE vectors that no traditional asset inventory enumerates. ATLAS AML.T0010 (ML Supply Chain Compromise).
82
82
 
83
83
  ### 4. Prompt-injection footprint
84
84
 
85
- Any system that feeds external content (PR descriptions, support tickets, web-retrieved docs, calendar events, email bodies, RAG-retrieved chunks) into an LLM prompt is in scope (ATLAS AML.T0051). CVE-2025-53773 (GitHub Copilot, CVSS 9.6) showed prompt injection achieving RCE via a developer tool. 2026 meta-analysis: adaptive prompt injection succeeds against SOTA defenses at >85%. Pen testers must enumerate the prompt-injection footprint as a first-class asset class.
85
+ Any system that feeds external content (PR descriptions, support tickets, web-retrieved docs, calendar events, email bodies, RAG-retrieved chunks) into an LLM prompt is in scope (ATLAS AML.T0051 / AML.T0054). CVE-2025-53773 (GitHub Copilot YOLO-mode RCE, CVSS 7.8 / AV:L / RWEP 30) showed prompt-injection coercion flipping `chat.tools.autoApprove: true` and converting subsequent tool calls into shell execution via developer-side IDE interaction. 2026 meta-analysis: adaptive prompt injection succeeds against SOTA defenses at >85%. Pen testers must enumerate the prompt-injection footprint as a first-class asset class.
86
86
 
87
87
  ### 5. RAG corpora and embedding stores
88
88
 
@@ -126,7 +126,7 @@ Pen testers must emulate both classical and AI-class chains. The table below map
126
126
 
127
127
  | Phase | Classical TTP (ATT&CK v17) | AI-Class TTP (ATLAS v5.1.0) | Framework Gap Flag |
128
128
  |---|---|---|---|
129
- | Reconnaissance | T1595 (Active Scanning) — implied by T1190 setup | AML.T0000 (Search Open Technical Databases) — model card / dataset / API endpoint discovery | NIST 800-115 §3.x recon guidance is network-only |
129
+ | Reconnaissance | T1595 (Active Scanning) — implied by T1190 setup | AML.TA0002 (Reconnaissance tactic) — model card / dataset / API endpoint discovery, system-prompt probing | NIST 800-115 §3.x recon guidance is network-only |
130
130
  | Initial Access | T1190 (Exploit Public-Facing Application) | AML.T0051 (LLM Prompt Injection) — entered via PR description, support ticket, retrieved doc | OWASP WSTG covers webapp; not prompt-injection as entry vector |
131
131
  | Initial Access | T1133 (External Remote Services) | AML.T0010 (ML Supply Chain Compromise) — malicious MCP server installed by developer | PTES scoping templates do not require MCP server enumeration |
132
132
  | Execution | T1059 (Command and Scripting Interpreter) | AML.T0051 → tool-use call invoking shell/code execution in agent context | NIS2 Art.21 patch-mgmt language assumes binary exploit; semantic-input exploit lives outside |
@@ -146,8 +146,8 @@ Pen testers select tooling based on real exploit availability. The matrix below
146
146
  | Vulnerability | CVSS | RWEP | CISA KEV | Public PoC | AI-Discovered/Accelerated | Live-Patchable | Pen Tester Use |
147
147
  |---|---|---|---|---|---|---|---|
148
148
  | CVE-2026-31431 (Copy Fail Linux kernel LPE) | 7.8 | 90 | Yes | Yes — 732-byte deterministic exploit | Yes (AI-discovered in ~1 hour) | Yes (kpatch / livepatch) | Post-exploit privilege escalation in unpatched Linux hosts; reliable, no race condition |
149
- | CVE-2025-53773 (GitHub Copilot prompt-injection RCE) | 9.6 | 91 | No (not enterprise infra in KEV sense) | Yes — demonstrated PoC | Yes (AI tooling generates injection payloads) | No (requires Copilot update + prompt hardening) | Initial access via PR descriptions, support tickets, retrieved docs; emulate AML.T0051 |
150
- | CVE-2026-30615 (Windsurf MCP RCE) | 9.8 | 94 | No | Partial — concept demonstrated | No | No (requires Windsurf patch) | Lateral movement to developer workstations via malicious MCP server; emulate AML.T0010 |
149
+ | CVE-2025-53773 (GitHub Copilot YOLO-mode RCE) | 7.8 | 30 | No (not enterprise infra in KEV sense) | Yes — demonstrated PoC | Yes (AI tooling generates injection payloads) | Yes (SaaS push / IDE update) | Initial access via PR descriptions, support tickets, retrieved docs; emulate AML.T0051 / AML.T0054 |
150
+ | CVE-2026-30615 (Windsurf MCP local-vector RCE) | 8.0 | 35 | No | Partial — concept demonstrated | No | Yes (IDE update) | Lateral movement to developer workstations via malicious MCP server; emulate AML.T0010 |
151
151
  | CVE-2026-43284 (covered in fuzz/memory-safety skill) | see `data/cve-catalog.json` | see `data/cve-catalog.json` | see `data/exploit-availability.json` | see `data/exploit-availability.json` | see `data/cve-catalog.json` | see `data/cve-catalog.json` | Memory-corruption chain context — refer to that skill for exploitation detail |
152
152
  | CVE-2026-43500 (covered in supply-chain skill) | see `data/cve-catalog.json` | see `data/cve-catalog.json` | see `data/exploit-availability.json` | see `data/exploit-availability.json` | see `data/cve-catalog.json` | see `data/cve-catalog.json` | Supply chain compromise context — refer to that skill for exploitation detail |
153
153
  | SesameOp (AI-as-C2 technique, no CVE — adversary tradecraft) | N/A (technique, not vulnerability) | N/A | N/A | Yes — ATLAS-documented adversary pattern (AML.T0096) | Yes | N/A | Emulate covert AI-API C2 during adversary emulation; verifies whether egress monitoring catches AML.T0096 |
@@ -136,7 +136,7 @@ Cloud is where AI runs. Every consequential AI service — OpenAI, Anthropic, Go
136
136
  | Cloud-facing application | T1190 — Exploit Public-Facing Application | ATT&CK Enterprise | API Gateway / Load Balancer / managed-WAF-bypass; managed-database exposure (RDS / SQL DB / Cloud SQL public IP); container-registry public image abuse; Lambda / Cloud Functions / Azure Functions endpoint exploit | NIST 800-53 SC-7 perimeter assumption inadequate; CSA CCM AIS-04 and IVS-08 partial; CWE-1188 (Insecure Default Initialization) |
137
137
  | Cloud-credential exposure | T1552 — Unsecured Credentials (incl. T1552.001 Files, T1552.005 Cloud Instance Metadata API, T1552.007 Container API) | ATT&CK Enterprise | IMDSv1 SSRF on EC2 / GCE; static cloud credentials in git / images / env vars; container API and kubeconfig theft; workload-identity-federation trust-policy abuse | CWE-798 (hardcoded credentials), CWE-200; NIST 800-53 IA-5 method-neutral |
138
138
  | AI model registry / cloud-hosted model | AML.T0010 — ML Supply Chain Compromise | ATLAS v5.1.0 | Bedrock / SageMaker custom model from poisoned upstream; Azure ML model registry tampering; Vertex Model Garden mirror tampering; HF model pulled into Bedrock / SageMaker / Vertex with weights backdoor | CSA CCM CCC-09 (vendor / supply chain) silent on model-supply-chain specifics; SLSA / in-toto / Sigstore for models still maturing |
139
- | Cloud inference API abuse / model extraction | AML.T0017 — Develop Adversarial ML Attack Capabilities (closest existing ATLAS mapping for inference-API abuse against cloud-hosted endpoints) | ATLAS v5.1.0 | Programmatic query of Bedrock / Azure OpenAI / Vertex endpoint to extract model behaviour, training-data inference, system-prompt leakage | No cloud-specific ATLAS control mapping for inference-API rate-limit / anomaly detection; chain to `ai-attack-surface` |
139
+ | Cloud inference API abuse / model extraction | AML.T0017 — Discover ML Model Ontology (inference-API probing for system-prompt, guardrail, model-family signal against cloud-hosted endpoints); AML.T0016 — Obtain Capabilities: Develop Capabilities (downstream weaponization) | ATLAS v5.1.0 | Programmatic query of Bedrock / Azure OpenAI / Vertex endpoint to extract model behaviour, training-data inference, system-prompt leakage | No cloud-specific ATLAS control mapping for inference-API rate-limit / anomaly detection; chain to `ai-attack-surface` |
140
140
 
141
141
  **Note on ATT&CK Enterprise cloud-platform sub-techniques.** ATT&CK Enterprise has cloud-platform-specific matrices (IaaS, SaaS, Office 365, Azure AD / Entra ID, Google Workspace). T1078.004 (Cloud Accounts), T1552.005 (Cloud Instance Metadata API), T1552.007 (Container API), T1190 with cloud-service variants, T1530 with managed-storage variants are the most operationally relevant. The frontmatter pins the parent IDs; analysis should descend to the sub-technique appropriate to the cloud(s) in scope.
142
142