agent-threat-rules 2.0.13 → 2.0.15

package/package.json CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "agent-threat-rules",
-  "version": "2.0.13",
+  "version": "2.0.15",
   "type": "module",
   "description": "Open detection standard -- like Sigma, but for AI agents. 311 rules for prompt injection, tool poisoning, context exfiltration, and MCP attacks. Shipped in Cisco AI Defense. 97.1% recall on NVIDIA garak.",
   "main": "./dist/index.js",
@@ -32,6 +32,57 @@ references:
     - "AML.T0043 - Craft Adversarial Data"
     - "AML.T0052.000 - Spearphishing via Social Engineering LLM"
 
+# Audit-grade compliance mapping; see spec/compliance-metadata.md
+compliance:
+  owasp_agentic:
+    - id: "ASI01:2026"
+      context: "Detects agent goal hijack when an attacker spoofs a peer agent's identity to inject manipulative instructions into inter-agent messages."
+      strength: primary
+    - id: "ASI07:2026"
+      context: "Directly addresses insecure inter-agent communication by flagging forged system-level message tags and manipulated message format conventions."
+      strength: primary
+    - id: "ASI10:2026"
+      context: "Surfaces rogue agents attempting orchestrator bypass or fake status-message injection against trusted peers."
+      strength: secondary
+  owasp_llm:
+    - id: "LLM01:2025"
+      context: "Inter-agent prompt injection is a prompt-injection vector operating at the agent-to-agent boundary rather than the user-to-agent boundary."
+      strength: primary
+    - id: "LLM06:2025"
+      context: "Excessive agency is what an attacker exploits when cross-agent spoofing succeeds: the target agent takes actions it would not otherwise take."
+      strength: secondary
+  eu_ai_act:
+    - article: 12
+      clause: "Automatic logging for high-risk AI systems"
+      context: "Every cross-agent attack detection emits a log record with actor agent, target agent, timestamp, payload hash, and rule match, satisfying Article 12's continuous logging requirement."
+      strength: primary
+    - article: 14
+      clause: "Human oversight"
+      context: "Critical-severity detections route to the respond-agent human-in-the-loop flow before allowing the attacker-controlled inter-agent call to complete."
+      strength: primary
+    - article: 15
+      clause: "Accuracy, robustness, and cybersecurity"
+      context: "Cross-agent attack detection is a cybersecurity control contributing to the Article 15 robustness requirement for high-risk AI systems."
+      strength: secondary
+  nist_ai_rmf:
+    - function: "Manage"
+      subcategory: "MG.2.3"
+      context: "Contributes detection evidence to incident response playbooks; each match is an incident candidate for the respond-agent pipeline."
+      strength: primary
+    - function: "Govern"
+      subcategory: "GV.1.1"
+      context: "Logs support organizational AI governance by recording every inter-agent communication risk event for quarterly governance review."
+      strength: secondary
+  iso_42001:
+    - clause: "6.2"
+      clause_name: "AI risk treatment"
+      context: "Cross-agent impersonation is an identified AI threat vector; this rule is the operational risk-treatment control."
+      strength: primary
+    - clause: "9.1"
+      clause_name: "Monitoring, measurement, analysis, evaluation"
+      context: "Confidence-scored detection events feed the AIMS monitoring and measurement requirements."
+      strength: secondary
+
 tags:
   category: agent-manipulation
   subcategory: cross-agent-attack
@@ -32,6 +32,45 @@ references:
     - "AML.T0056 - LLM Meta Prompt Extraction"
     - "AML.T0051 - LLM Prompt Injection"
 
+compliance:
+  owasp_agentic:
+    - id: ASI01:2026
+      context: "Revealing system prompt content enables adversaries to map agent constraints and craft targeted bypass attacks, directly enabling future Agent Goal Hijack attacks classified under ASI01."
+      strength: primary
+    - id: ASI06:2026
+      context: "System prompt leakage exposes the agent's memory and context configuration to adversaries, falling under ASI06 Memory and Context Poisoning as an information disclosure precursor."
+      strength: secondary
+  owasp_llm:
+    - id: LLM07:2025
+      context: "This rule is the primary detection implementation for LLM07:2025 System Prompt Leakage, covering direct quotation, instruction paraphrasing, guardrail revelation, and config exposure in agent output."
+      strength: primary
+    - id: LLM02:2025
+      context: "System prompt content constitutes sensitive operational information; its disclosure in agent output also satisfies the LLM02:2025 Sensitive Information Disclosure category."
+      strength: secondary
+  eu_ai_act:
+    - article: "13"
+      context: "While transparency is required toward users, internal guardrail and system-prompt disclosure to adversaries violates Article 13 by exposing configurations that undermine safe operation of the AI system."
+      strength: primary
+    - article: "12"
+      context: "Detection events generated by this rule serve as Article 12 automatic log records of system-prompt leakage incidents, enabling post-incident review by deployers."
+      strength: secondary
+  nist_ai_rmf:
+    - function: Govern
+      subcategory: GV.6.1
+      context: "Organizational policies for AI data and privacy (GV.6.1) must cover protection of internal operational configurations; this rule enforces that policy at runtime."
+      strength: primary
+    - function: Measure
+      subcategory: MS.2.5
+      context: "Measuring the effectiveness of data protection controls for AI systems (MS.2.5) requires detection of unauthorized disclosure of system-level configurations."
+      strength: secondary
+  iso_42001:
+    - clause: "8.4"
+      context: "Clause 8.4 AI system impact assessment identifies confidential operational parameters as assets requiring protection; this rule detects when those assets are disclosed in output."
+      strength: primary
+    - clause: "9.1"
+      context: "Clause 9.1 monitoring and evaluation requires measuring AI system behavior against expected norms; system prompt disclosure events are anomalies that must be measured and reported."
+      strength: secondary
+
 tags:
   category: context-exfiltration
   subcategory: system-prompt-leak
@@ -29,6 +29,43 @@ references:
     - "AML.T0057 - LLM Data Leakage"
     - "AML.T0055 - Unsecured Credentials"
 
+compliance:
+  owasp_agentic:
+    - id: ASI03:2026
+      context: "Credential exposure in agent output enables adversaries to abuse the identity and privileges associated with leaked keys, directly enabling unauthorized resource access classified under ASI03."
+      strength: primary
+  owasp_llm:
+    - id: LLM02:2025
+      context: "API keys, tokens, private keys, and database credentials are sensitive information whose disclosure in agent output is the canonical LLM02:2025 Sensitive Information Disclosure scenario."
+      strength: primary
+    - id: LLM07:2025
+      context: "Credentials embedded in or derived from system prompts that appear in agent output constitute an LLM07:2025 System Prompt Leakage incident with direct security impact."
+      strength: secondary
+  eu_ai_act:
+    - article: "12"
+      section: "1"
+      context: "Article 12 requires automatic logging of events; a credential exposure incident is a high-severity event that must be automatically recorded and reported to the deployer for incident investigation."
+      strength: primary
+    - article: "9"
+      context: "Credential leakage is a documented risk class in the AI system risk management plan; Article 9 requires that monitoring controls are implemented to detect and respond to such events."
+      strength: secondary
+  nist_ai_rmf:
+    - function: Govern
+      subcategory: GV.6.1
+      context: "GV.6.1 requires organizational policies governing sensitive data and credentials in AI systems; this rule enforces those policies by detecting credential exposure at runtime."
+      strength: primary
+    - function: Manage
+      subcategory: MG.3.1
+      context: "Credential exposure requires an immediate risk treatment response; this detection rule is the technical implementation of the risk treatment plan for credential leakage events."
+      strength: secondary
+  iso_42001:
+    - clause: "8.4"
+      context: "Clause 8.4 AI system impact assessment must identify credential leakage as a high-severity impact scenario; this rule generates the detection evidence needed for audit and impact reporting."
+      strength: primary
+    - clause: "6.2"
+      context: "Protecting credentials from exposure is an explicit AIMS information security objective under clause 6.2; detection of leakage events measures whether this objective is being achieved."
+      strength: secondary
+
 tags:
   category: context-exfiltration
   subcategory: credential-exposure
@@ -29,6 +29,46 @@ references:
   mitre_atlas:
     - AML.T0051.001 - Indirect Prompt Injection
     - AML.T0020 - Poison Training Data
+
+compliance:
+  owasp_agentic:
+    - id: ASI06:2026
+      context: "Injecting hidden directives into RAG-retrieved documents or knowledge base entries is the primary ASI06 Memory and Context Poisoning attack: the agent's context window is contaminated with attacker-controlled instructions."
+      strength: primary
+  owasp_llm:
+    - id: LLM01:2025
+      context: "Hidden instruction markers and role-override commands embedded in retrieved content deliver indirect prompt injection through the data layer, satisfying the LLM01:2025 Prompt Injection category via the RAG attack surface."
+      strength: primary
+    - id: LLM03:2025
+      context: "Poisoned knowledge base entries injected into the supply chain of training or retrieval data satisfy the LLM03:2025 Supply Chain Vulnerabilities category at the data tier."
+      strength: secondary
+    - id: LLM08:2025
+      context: "Malicious vector store entries that cause the agent to retrieve and act on attacker-controlled instructions fall under LLM08:2025 Vector and Embedding Weaknesses, exploited here through contaminated retrieval context."
+      strength: partial
+  eu_ai_act:
+    - article: "15"
+      context: "Article 15 accuracy and robustness requirements demand that AI systems maintain reliable behavior even when training or retrieval data is adversarially contaminated; this rule detects such contamination in the data pipeline."
+      strength: primary
+    - article: "9"
+      context: "Data poisoning of RAG pipelines is a documented risk requiring monitoring controls under Article 9; detection events from this rule provide the evidence trail for risk management reporting."
+      strength: secondary
+  nist_ai_rmf:
+    - function: Map
+      subcategory: MP.5.1
+      context: "Adversarial manipulation of retrieval data is an AI-specific risk source requiring identification; this rule maps data poisoning attempts as they occur in the production pipeline."
+      strength: primary
+    - function: Manage
+      subcategory: MG.2.3
+      context: "Active detection of data poisoning events implements the risk treatment for the data contamination risk identified in the AI risk register."
+      strength: secondary
+  iso_42001:
+    - clause: "8.3"
+      context: "Clause 8.3 data governance for AI systems requires controls ensuring data integrity; detection of hidden directives in retrieved content is the runtime enforcement of clause 8.3 data quality requirements."
+      strength: primary
+    - clause: "6.2"
+      context: "Clause 6.2 AIMS security planning must include controls for adversarial data injection into AI pipelines; this rule operationalizes the detection measure for that planning objective."
+      strength: secondary
+
 tags:
   category: data-poisoning
   subcategory: rag-and-knowledge-poisoning
@@ -25,6 +25,43 @@ references:
   mitre_atlas:
     - AML.T0053 - LLM Plugin Compromise
     - AML.T0046 - Spamming ML System with Chaff Data
+
+compliance:
+  owasp_agentic:
+    - id: ASI05:2026
+      context: "Runaway agent loops represent uncontrolled autonomous execution: the agent performs repeated identical actions without human intervention, satisfying the ASI05 Unexpected Code Execution category at the behavioral level."
+      strength: primary
+  owasp_llm:
+    - id: LLM06:2025
+      context: "An agent running infinite retry loops without termination conditions exercises excessive agency beyond its task scope, accumulating costs and resource consumption classified under LLM06:2025."
+      strength: primary
+    - id: LLM10:2025
+      context: "Runaway loops that exhaust compute resources, API rate limits, or accumulated context constitute an LLM10:2025 Unbounded Consumption incident with direct financial and availability impact."
+      strength: secondary
+  eu_ai_act:
+    - article: "14"
+      context: "Article 14 human oversight requires that AI systems can be stopped and corrected; runaway loops that resist termination or mask their loop state undermine the human override capability Article 14 mandates."
+      strength: primary
+    - article: "15"
+      context: "Article 15 robustness requires that AI systems handle failure states gracefully; detection of runaway loops is a monitoring control ensuring the system does not enter unrecoverable autonomous states."
+      strength: secondary
+  nist_ai_rmf:
+    - function: Manage
+      subcategory: MG.3.2
+      context: "Runaway agent loops are an AI system failure mode requiring active detection and termination; this rule implements the MG.3.2 response capability for AI failures and disruptions."
+      strength: primary
+    - function: Govern
+      subcategory: GV.1.2
+      context: "GV.1.2 accountability roles must include responsibility for detecting and halting runaway agent behavior; this rule provides the signal required to fulfill that accountability."
+      strength: secondary
+  iso_42001:
+    - clause: "8.6"
+      context: "Clause 8.6 AI system operational control requires monitoring for abnormal execution patterns; runaway loop detection is the primary operational control for this failure class."
+      strength: primary
+    - clause: "9.1"
+      context: "Clause 9.1 monitoring and evaluation requires measuring AI system behavior against expected norms; loop counter patterns are the measurable anomaly indicators for this rule."
+      strength: secondary
+
 tags:
   category: excessive-autonomy
   subcategory: runaway-loop
@@ -30,6 +30,40 @@ references:
     - T1611 - Escape to Host
   cve:
     - CVE-2026-0628
+
+compliance:
+  owasp_agentic:
+    - id: ASI03:2026
+      context: "Privilege escalation via tool permission abuse or admin function invocation is the primary ASI03 Identity and Privilege Abuse scenario: the agent acquires capabilities exceeding its authorized scope."
+      strength: primary
+  owasp_llm:
+    - id: LLM06:2025
+      context: "An agent requesting tools with elevated permissions beyond its assigned role is the canonical LLM06:2025 Excessive Agency scenario, operationalized here via tool-name and argument pattern detection."
+      strength: primary
+  eu_ai_act:
+    - article: "14"
+      context: "Article 14 requires that humans can oversee and intervene in AI system operation; privilege escalation techniques that bypass system-level controls directly undermine the human oversight mechanisms Article 14 mandates."
+      strength: primary
+    - article: "9"
+      context: "Privilege escalation is a documented high-severity risk in the AI system risk register; Article 9 requires monitoring controls to detect and respond to such scope violations."
+      strength: secondary
+  nist_ai_rmf:
+    - function: Govern
+      subcategory: GV.1.2
+      context: "GV.1.2 requires defined accountability roles and controls for AI system permissions; detection of privilege escalation enforces least-privilege boundaries established in the governance framework."
+      strength: primary
+    - function: Manage
+      subcategory: MG.4.1
+      context: "Privilege escalation events require an incident response; this rule generates the alerts needed to initiate the MG.4.1 AI incident response process."
+      strength: secondary
+  iso_42001:
+    - clause: "6.2"
+      context: "Clause 6.2 AIMS security objectives include least-privilege enforcement for AI agent operations; this rule detects violations of those objectives at runtime."
+      strength: primary
+    - clause: "8.6"
+      context: "Clause 8.6 AI system operational control requires that agents do not exceed their authorized operational scope; privilege escalation detection enforces that operational boundary."
+      strength: secondary
+
 tags:
   category: privilege-escalation
   subcategory: tool-permission-escalation
@@ -30,6 +30,39 @@ references:
     - "CVE-2024-3402"
     - "CVE-2025-53773"
 
+compliance:
+  owasp_agentic:
+    - id: ASI01:2026
+      context: "Direct prompt injection is the canonical agent goal hijack vector: adversarial user input overrides the agent's assigned objectives and behavioral constraints via instruction-override verbs, persona switching, and encoding obfuscation."
+      strength: primary
+  owasp_llm:
+    - id: LLM01:2025
+      context: "This rule is the primary runtime implementation of the LLM01:2025 Prompt Injection category, covering instruction-override verbs, fake system delimiters, restriction removal, and encoding-wrapped payloads."
+      strength: primary
+  eu_ai_act:
+    - article: "15"
+      context: "High-risk AI systems must be resilient against adversarial attempts to alter output or behavior. Deployment of this detection rule satisfies the Article 15 requirement to implement technical measures ensuring robustness against manipulation."
+      strength: primary
+    - article: "9"
+      context: "Prompt injection is a documented risk class; this rule implements the monitoring control required by Article 9 risk management obligations for high-risk AI systems."
+      strength: secondary
+  nist_ai_rmf:
+    - function: Manage
+      subcategory: MG.2.3
+      context: "Treating direct prompt injection as an identified AI risk requires active runtime countermeasures; this detection rule is the primary risk treatment implementation."
+      strength: primary
+    - function: Map
+      subcategory: MP.5.1
+      context: "Identifying adversarial input manipulation as an AI risk to be catalogued in the organizational risk register."
+      strength: secondary
+  iso_42001:
+    - clause: "6.2"
+      context: "Addressing adversarial manipulation risk is an objective required under clause 6.2 AIMS information security planning; this rule operationalizes the detection control measure."
+      strength: primary
+    - clause: "8.4"
+      context: "Impact assessment for AI deployments under clause 8.4 must account for adversarial user inputs; detection events from this rule provide the required monitoring evidence."
+      strength: secondary
+
 tags:
   category: prompt-injection
   subcategory: direct
@@ -33,6 +33,42 @@ references:
     - "CVE-2025-32711"
     - "CVE-2026-24307"
 
+compliance:
+  owasp_agentic:
+    - id: ASI01:2026
+      context: "Indirect prompt injection hijacks agent goals via externally-consumed content (documents, web pages, API responses); the agent processes attacker-controlled instructions without user awareness."
+      strength: primary
+    - id: ASI06:2026
+      context: "Injection via external content poisons the agent's context window and memory with attacker-controlled directives, satisfying the ASI06 Memory and Context Poisoning category."
+      strength: secondary
+  owasp_llm:
+    - id: LLM01:2025
+      context: "Indirect prompt injection via HTML comments, zero-width characters, hidden CSS text, and data URIs is a primary LLM01 attack variant delivered through external content rather than direct user input."
+      strength: primary
+  eu_ai_act:
+    - article: "15"
+      context: "High-risk AI systems must resist adversarial content embedded in external inputs. Detection of hidden injection payloads in consumed documents satisfies Article 15 robustness and cybersecurity requirements."
+      strength: primary
+    - article: "9"
+      context: "Indirect injection from third-party content sources is a documented risk category requiring mitigation controls under Article 9 risk management obligations."
+      strength: secondary
+  nist_ai_rmf:
+    - function: Manage
+      subcategory: MG.2.3
+      context: "Runtime detection of injection payloads embedded in third-party content implements the risk treatment for indirect prompt injection identified in the AI risk register."
+      strength: primary
+    - function: Map
+      subcategory: MP.3.3
+      context: "External content providers are third-party components in the AI supply chain; this rule identifies their attack surface as a risk source."
+      strength: secondary
+  iso_42001:
+    - clause: "6.2"
+      context: "Clause 6.2 AIMS planning requires controls for externally-sourced risks; this rule operationalizes the detection measure for indirect injection via consumed content."
+      strength: primary
+    - clause: "8.5"
+      context: "Externally-provided content processed by the agent falls under clause 8.5 control of externally-provided processes; this rule validates that external content does not contain adversarial directives."
+      strength: secondary
+
 tags:
   category: prompt-injection
   subcategory: indirect
@@ -26,6 +26,43 @@ references:
     - AML.T0010 - ML Supply Chain Compromise
   mitre_attack:
     - T1195 - Supply Chain Compromise
+
+compliance:
+  owasp_agentic:
+    - id: ASI04:2026
+      context: "MCP skill impersonation via typosquatting, namespace collision, and version spoofing is the primary ASI04 Agentic Supply Chain Vulnerabilities attack vector: malicious skills masquerade as trusted tools to gain agent execution context."
+      strength: primary
+  owasp_llm:
+    - id: LLM03:2025
+      context: "Typosquatted and impersonated MCP skills are supply chain compromise artifacts targeting the tool ecosystem; this rule implements LLM03:2025 Supply Chain Vulnerabilities detection at the skill-name level."
+      strength: primary
+    - id: LLM05:2025
+      context: "An agent invoking an impersonated skill may receive malicious responses that require LLM05:2025 Improper Output Handling controls; this rule prevents the initial tool invocation before output is processed."
+      strength: secondary
+  eu_ai_act:
+    - article: "13"
+      context: "Article 13 transparency requires that AI systems operate with clearly identified components; skill impersonation violates this requirement by substituting unauthorized tools that appear legitimate."
+      strength: primary
+    - article: "9"
+      context: "Supply chain compromise via malicious skill registries is a documented risk requiring monitoring controls under Article 9; skill-name pattern detection is the runtime enforcement of those controls."
+      strength: secondary
+  nist_ai_rmf:
+    - function: Map
+      subcategory: MP.2.3
+      context: "Identifying typosquatted and impersonated MCP skills as AI supply chain risks implements MP.2.3 AI supply chain risk identification at the tool-registry level."
+      strength: primary
+    - function: Govern
+      subcategory: GV.1.2
+      context: "GV.1.2 accountability roles must include responsibility for validating third-party tool integrity; this rule provides the automated signal needed to fulfill that governance obligation."
+      strength: secondary
+  iso_42001:
+    - clause: "8.5"
+      context: "MCP skills are externally-provided AI-related components under clause 8.5; this rule enforces controls over externally-provided tools by detecting impersonation before invocation."
+      strength: primary
+    - clause: "6.2"
+      context: "Clause 6.2 AIMS security planning requires controls for third-party component integrity; skill impersonation detection operationalizes that planning objective at runtime."
+      strength: secondary
+
 tags:
   category: skill-compromise
   subcategory: skill-impersonation
@@ -40,6 +40,45 @@ references:
     - "CVE-2025-59536"
     - "CVE-2026-21852"
 
+compliance:
+  owasp_agentic:
+    - id: ASI02:2026
+      context: "Malicious content injected via MCP tool responses is the primary ASI02 Tool Misuse and Exploitation vector: a compromised or impersonated MCP server weaponizes the tool call interface to deliver shells, encoded payloads, and privilege escalation commands."
+      strength: primary
+    - id: ASI05:2026
+      context: "Shell commands and code execution payloads in tool responses aim to trigger unexpected code execution by the agent, falling under the ASI05 Unexpected Code Execution category."
+      strength: secondary
+  owasp_llm:
+    - id: LLM01:2025
+      context: "Prompt injection delivered through MCP tool responses is an indirect LLM01:2025 attack variant where the injection payload is embedded in tool output rather than user input."
+      strength: primary
+    - id: LLM05:2025
+      context: "Failure to validate MCP tool response content before agent processing is an LLM05:2025 Improper Output Handling scenario enabling downstream command injection and reverse shell execution."
+      strength: secondary
+  eu_ai_act:
+    - article: "15"
+      context: "MCP tool response injection attacks the cybersecurity integrity of the AI system; Article 15 requires technical measures ensuring the system can resist such third-party content attacks."
+      strength: primary
+    - article: "9"
+      context: "Compromised MCP server responses are a documented attack surface in the AI system risk register; Article 9 requires detection controls to manage this identified risk."
+      strength: secondary
+  nist_ai_rmf:
+    - function: Manage
+      subcategory: MG.2.3
+      context: "Runtime detection of malicious MCP tool responses is the primary risk treatment for tool-poisoning attacks identified in the AI risk register."
+      strength: primary
+    - function: Map
+      subcategory: MP.3.3
+      context: "MCP servers are third-party components in the AI tool ecosystem; identifying malicious tool responses is an MP.3.3 third-party component risk detection action."
+      strength: secondary
+  iso_42001:
+    - clause: "6.2"
+      context: "Clause 6.2 AIMS security planning requires controls for third-party tool interfaces; this rule operationalizes the detection measure for malicious content delivered via MCP."
+      strength: primary
+    - clause: "8.5"
+      context: "MCP server integrations are externally-provided AI-related processes under clause 8.5; this rule validates that external tool responses do not contain adversarial payloads before the agent acts on them."
+      strength: secondary
+
 tags:
   category: tool-poisoning
   subcategory: mcp-response-injection
@@ -0,0 +1,125 @@
+# ATR Rule Compliance Metadata Schema
+
+**Status:** Draft v0.1 · Proposed 2026-04-22
+**Scope:** Every `rules/**/*.yaml` may optionally include a top-level `compliance:` block that maps the rule to controls / articles / clauses in published AI compliance frameworks.
+
+## Why
+
+ATR rules already include `references:` pointing to OWASP LLM / OWASP Agentic Top 10 / MITRE ATLAS. That is an academic-citation block useful for researchers.
+
+`compliance:` is a separate, audit-grade block whose purpose is different: an enterprise customer's GRC team must be able to take a detection event, trace it back to a specific rule ID, and show an auditor that the rule addresses a specific **published article or control** of:
+
+1. EU AI Act (Regulation 2024/1689) - Articles 9-15, 50, 72, Annex III
+2. Colorado AI Act SB24-205 - enforced 2026-06-30
+3. NIST AI RMF 1.0 - Govern / Map / Measure / Manage functions + subcategories
+4. ISO/IEC 42001:2023 - clauses 6-10 (AIMS)
+5. OWASP Agentic Top 10 (2026) - ASI01..ASI10
+6. OWASP LLM Top 10 (2025) - LLM01..LLM10
+
+The `references:` block is not sufficient because:
+- It does not distinguish "we studied this paper" from "this rule enforces this specific regulatory control."
+- It has no structure for "what clause" vs "what context this rule addresses within that clause."
+- It cannot carry the prose an auditor needs to accept the mapping.
+
+## Schema
+
+```yaml
+compliance:
+  # One key per framework the rule maps to. Omit frameworks that do not apply.
+  owasp_agentic:
+    - id: "ASI01:2026"          # Required. Canonical category ID.
+      context: "..."            # Required. One-sentence prose explaining *how*
+                                # this rule addresses the category. Auditor-
+                                # readable; no jargon-only text.
+      strength: primary         # Optional. primary | secondary | partial.
+
+  owasp_llm:
+    - id: "LLM01:2025"
+      context: "..."
+      strength: primary
+
+  eu_ai_act:
+    - article: 12               # Required. Article number (integer).
+      clause: "Automatic logging for high-risk AI systems"  # Required. Short name.
+      context: "..."            # Required. How this rule satisfies the clause.
+      strength: primary
+
+  colorado_ai_act:
+    - section: "SB24-205.5"     # Required. Section identifier.
+      clause: "High-risk disclosure"
+      context: "..."
+      strength: primary
+
+  nist_ai_rmf:
+    - function: "Manage"        # Required. Govern | Map | Measure | Manage.
+      subcategory: "MG.2.3"     # Required. Full subcategory ID.
+      context: "..."
+      strength: primary
+
+  iso_42001:
+    - clause: "6.2"             # Required. AIMS clause (e.g. 6.2, 9.1).
+      clause_name: "Risk treatment"  # Required. Human-readable name.
+      context: "..."
+      strength: primary
+```
+
+### Field reference
+
+| Field | Type | Required | Notes |
+|---|---|---|---|
+| `id` / `article` / `section` / `function`+`subcategory` / `clause` (ISO) | string/int | yes | Framework-specific canonical identifier. Must match the published framework exactly. |
+| `clause` (EU/Colorado) / `clause_name` (ISO) | string | yes | Short human name for the clause. Helps report readers. |
+| `context` | string | yes | One sentence, auditor-readable, explaining *why* the rule addresses this control. Not a copy of the clause text. |
+| `strength` | enum | no | `primary` (rule is a main control for this clause), `secondary` (supports it), `partial` (covers part of it). Defaults to `primary` if omitted. |
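
As a rough illustration of the `context` and `strength` rows in the field reference, the sketch below normalizes one mapping entry that has already been parsed from YAML into a plain object. The helper name `normalizeEntry` is hypothetical and is not part of the ATR package; it only shows how the documented `primary` default and the three allowed `strength` values could be applied.

```javascript
// Hypothetical helper, not part of agent-threat-rules.
// Allowed values come from the field reference: primary | secondary | partial.
const STRENGTHS = ["primary", "secondary", "partial"];

function normalizeEntry(entry) {
  // `context` is required and must carry auditor-readable prose.
  if (typeof entry.context !== "string" || entry.context.trim() === "") {
    throw new Error("context is required and must be a non-empty string");
  }
  // `strength` is optional and defaults to "primary" when omitted.
  const strength = entry.strength === undefined ? "primary" : entry.strength;
  if (!STRENGTHS.includes(strength)) {
    throw new Error(`unknown strength: ${strength}`);
  }
  // Return a copy so the caller's parsed YAML object is left untouched.
  return { ...entry, strength };
}
```

For example, `normalizeEntry({ id: "ASI01:2026", context: "..." })` returns the same entry with `strength: "primary"` filled in.
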
+
+### Multiplicity
+
+A rule MAY map to multiple items within the same framework (a rule that logs events AND enforces policy touches both Article 12 and Article 14 of the EU AI Act). List each separately.
+
+A rule MAY map to zero frameworks (e.g., an experimental research rule). Omit the `compliance:` block entirely in that case; do not include an empty one.
+
+### Deprecation
+
+When a framework publishes a new version, both old and new keys MAY coexist during a transition window (e.g., both `owasp_llm` 2023 and 2025 items), clearly distinguished by the `id` version suffix.
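
One way a consumer might tell coexisting versions apart is to split the `id` on its version suffix. The sketch below assumes IDs follow the `LLM01:2025` pattern seen in the schema; the function names and the `"unversioned"` bucket are hypothetical, not something the spec defines.

```javascript
// Hypothetical helper: split a versioned category ID such as "LLM01:2025".
// Returns null when the ID does not match the assumed PREFIXnn:YYYY pattern.
function parseCategoryId(id) {
  const match = /^([A-Z]+\d+):(\d{4})$/.exec(id);
  if (!match) return null;
  return { category: match[1], version: Number(match[2]) };
}

// Group the entries of one framework by the version suffix of their `id`.
function groupByVersion(entries) {
  const groups = {};
  for (const entry of entries) {
    const parsed = parseCategoryId(entry.id);
    const version = parsed ? parsed.version : "unversioned";
    if (!groups[version]) groups[version] = [];
    groups[version].push(entry);
  }
  return groups;
}
```

During a transition window, a report generator could then pick whichever version bucket the auditor asks for.
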
+
+## Relationship to `references:`
+
+The existing `references:` block is preserved unchanged. `references:` is for academic / research citations (MITRE ATLAS technique IDs, papers, blog posts). `compliance:` is for regulatory audit evidence.
+
+A rule can have entries in both blocks (e.g., `references.mitre_atlas` AND `compliance.nist_ai_rmf`), and often will.
+
+## Validation
+
+- `scripts/validate-compliance.mjs` (to be added) validates every `compliance:` block against a per-framework allowlist of valid IDs / articles / subcategories / clauses. Rules with invalid entries fail CI.
+- The allowlists live in `data/compliance-frameworks/*.json` (one file per framework) and are updated via PR when a framework publishes revisions.
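
Since `scripts/validate-compliance.mjs` is not yet in the package, the following is only a minimal sketch of what an allowlist check could look like. It assumes each allowlist is a flat array of valid identifiers keyed by framework, and it works on an already-parsed `compliance` object; the real JSON layout under `data/compliance-frameworks/` and the names `allowlists`, `entryKey`, and `validateCompliance` are assumptions.

```javascript
// Assumed allowlist shape: framework key -> list of valid identifiers.
// The real files under data/compliance-frameworks/ may be structured differently.
const allowlists = {
  owasp_agentic: ["ASI01:2026", "ASI07:2026", "ASI10:2026"],
  owasp_llm: ["LLM01:2025", "LLM06:2025"],
  eu_ai_act: ["12", "14", "15"],
};

// Pick the identifier field each framework uses in the schema.
function entryKey(framework, entry) {
  switch (framework) {
    case "owasp_agentic":
    case "owasp_llm":
      return entry.id;
    case "eu_ai_act":
      // Rules write the article both as 12 and as "12", so compare as strings.
      return String(entry.article);
    case "colorado_ai_act":
      return entry.section;
    case "nist_ai_rmf":
      return entry.subcategory;
    case "iso_42001":
      return entry.clause;
    default:
      return undefined;
  }
}

// Return a list of human-readable errors; an empty list means the block passes.
function validateCompliance(compliance, lists) {
  const errors = [];
  for (const [framework, entries] of Object.entries(compliance)) {
    const allowed = lists[framework];
    if (!allowed) {
      errors.push(`unknown framework: ${framework}`);
      continue;
    }
    for (const entry of entries) {
      const key = entryKey(framework, entry);
      if (!allowed.includes(key)) {
        errors.push(`${framework}: invalid identifier ${key}`);
      }
    }
  }
  return errors;
}
```

A CI wrapper could call `validateCompliance` once per rule file and exit non-zero when any returned list is non-empty.
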
+
+## Downstream consumers
+
+The primary consumer is **PanGuard Enterprise's AI Compliance Audit Evidence Module**, which generates quarterly reports mapping detection events (via rule IDs) to auditor-grade framework evidence. Other downstream consumers may include:
+
+- ATR-compatible scanners that want to tag each detection with its regulatory context
+- GRC platforms (Vanta, Drata, etc.) that integrate ATR rule packs
+- Independent auditors verifying AI-system compliance claims
+
+All downstream consumers are welcome: the `compliance:` block is MIT-licensed alongside the rules.
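
To make the "detection event to framework evidence" mapping concrete, here is a small sketch of how a consumer might roll events up by control. The event shape (`{ ruleId }`), the `complianceByRule` lookup, and both function names are hypothetical; the spec does not define them, and this is not how any named product works.

```javascript
// Hypothetical shapes: events carry a `ruleId`; `complianceByRule` maps a
// rule ID to its parsed `compliance:` block. Neither is defined by the spec.
function controlId(entry) {
  // Prefer the framework's identifier field. `clause` comes last because EU
  // and Colorado entries use it for the human-readable name, not the ID.
  return entry.id ?? entry.article ?? entry.section ?? entry.subcategory ?? entry.clause;
}

// Count detection events per framework control and remember which rules fired.
function buildEvidence(events, complianceByRule) {
  const evidence = {};
  for (const event of events) {
    // Events from rules without a compliance block contribute nothing.
    const compliance = complianceByRule[event.ruleId] ?? {};
    for (const [framework, entries] of Object.entries(compliance)) {
      for (const entry of entries) {
        const key = `${framework}/${controlId(entry)}`;
        if (!evidence[key]) evidence[key] = { ruleIds: [], eventCount: 0 };
        if (!evidence[key].ruleIds.includes(event.ruleId)) {
          evidence[key].ruleIds.push(event.ruleId);
        }
        evidence[key].eventCount += 1;
      }
    }
  }
  return evidence;
}
```

Keying the result by `framework/control` keeps one row per auditable control, which is roughly the shape a quarterly evidence report would need.
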
+
+## Out of scope for this spec
+
+- How a scanner renders compliance data in its UI
+- How a GRC platform surfaces this in a customer's audit trail
+- The legal interpretation of any framework clause (this spec provides the mapping data; auditors and counsel interpret it)
+
+## Open questions
+
+1. Should `strength` be required (forcing every mapping to declare its strength)? Argument for: signals rigour. Argument against: extra authoring friction for common `primary` case. **Current answer: optional, default `primary`.**
+2. Should framework-specific metadata (e.g., EU AI Act Annex III categories) live alongside article mappings? **Current answer: yes, under a nested `annex:` key within the article object if needed.**
+3. How to handle frameworks that don't exist yet but are expected (e.g., Japan AI Safety Act 2027)? **Current answer: add keys as frameworks publish; no speculative schema for unpublished frameworks.**
+
+## Roll-out plan
+
+1. 2026-04-22: this spec document merged
+2. 2026-04-W4: 10 sample rules carry `compliance:` block for OWASP Agentic + OWASP LLM (bootstrap from existing `references:` data)
+3. 2026-05: 50 rules extended across all 6 frameworks (LLM-assisted authoring + human QA)
+4. 2026-Q2-end: all 311 rules mapped across at least the 3 most-requested frameworks (EU AI Act, NIST AI RMF, OWASP Agentic)
+5. 2026-Q3: remaining frameworks (Colorado, ISO 42001, OWASP LLM) complete
+6. Ongoing: new ATR rules MUST include `compliance:` from day 1 (enforced by contribution checklist)