agent-threat-rules 2.0.13 → 2.0.15
- package/package.json +1 -1
- package/rules/agent-manipulation/ATR-2026-00030-cross-agent-attack.yaml +51 -0
- package/rules/context-exfiltration/ATR-2026-00020-system-prompt-leak.yaml +39 -0
- package/rules/context-exfiltration/ATR-2026-00021-api-key-exposure.yaml +37 -0
- package/rules/data-poisoning/ATR-2026-00070-data-poisoning.yaml +40 -0
- package/rules/excessive-autonomy/ATR-2026-00050-runaway-agent-loop.yaml +37 -0
- package/rules/privilege-escalation/ATR-2026-00040-privilege-escalation.yaml +34 -0
- package/rules/prompt-injection/ATR-2026-00001-direct-prompt-injection.yaml +33 -0
- package/rules/prompt-injection/ATR-2026-00002-indirect-prompt-injection.yaml +36 -0
- package/rules/skill-compromise/ATR-2026-00060-skill-impersonation.yaml +37 -0
- package/rules/tool-poisoning/ATR-2026-00010-mcp-malicious-response.yaml +39 -0
- package/spec/compliance-metadata.md +125 -0
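The `compliance:` blocks added to these rule files share one shape, which the draft in `spec/compliance-metadata.md` (part of this diff) describes. As a rough aid for reviewing them, the following is a minimal Python sketch that checks an already-parsed block against the required keys as read from that draft. `REQUIRED_KEYS`, `ALLOWED_STRENGTH`, and `check_compliance` are hypothetical names, not part of the package, and the required-key lists are an interpretation of the draft (the `iso_42001` entry is cut off in this view, and `owasp_llm` is assumed to match `owasp_agentic`).

```python
from typing import Any

# Hypothetical table: required keys per framework, as read from the draft schema.
# These lists are an assumption, not something the package exports.
REQUIRED_KEYS = {
    "owasp_agentic": ("id", "context"),
    "owasp_llm": ("id", "context"),
    "eu_ai_act": ("article", "clause", "context"),
    "colorado_ai_act": ("section",),
    "nist_ai_rmf": ("function", "subcategory"),
    "iso_42001": ("clause",),
}

# The draft lists these three values for the optional `strength` field.
ALLOWED_STRENGTH = {"primary", "secondary", "partial"}


def check_compliance(block: dict[str, Any]) -> list[str]:
    """Return human-readable problems; an empty list means nothing was flagged."""
    problems: list[str] = []
    for framework, entries in block.items():
        required = REQUIRED_KEYS.get(framework)
        if required is None:
            problems.append(f"{framework}: unknown framework key")
            continue
        for i, entry in enumerate(entries):
            for key in required:
                # Treat an absent key or an empty string as missing.
                if entry.get(key) in (None, ""):
                    problems.append(f"{framework}[{i}]: missing {key}")
            strength = entry.get("strength")
            if strength is not None and strength not in ALLOWED_STRENGTH:
                problems.append(f"{framework}[{i}]: bad strength {strength!r}")
    return problems


# Illustrative sample only; the context strings are placeholders.
sample = {
    "owasp_agentic": [
        {"id": "ASI01:2026", "context": "example context", "strength": "primary"}
    ],
    "eu_ai_act": [
        {"article": "13", "context": "example context", "strength": "primary"}
    ],
}

print(check_compliance(sample))
```

Run against the sample, this reports `eu_ai_act[0]: missing clause`. That mirrors a pattern visible in this diff: the draft marks `clause` as required for `eu_ai_act`, while most of the rule files supply only `article` and `context`.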
package/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "agent-threat-rules",
-  "version": "2.0.
+  "version": "2.0.15",
   "type": "module",
   "description": "Open detection standard -- like Sigma, but for AI agents. 311 rules for prompt injection, tool poisoning, context exfiltration, and MCP attacks. Shipped in Cisco AI Defense. 97.1% recall on NVIDIA garak.",
   "main": "./dist/index.js",
package/rules/agent-manipulation/ATR-2026-00030-cross-agent-attack.yaml
CHANGED
@@ -32,6 +32,57 @@ references:
     - "AML.T0043 - Craft Adversarial Data"
     - "AML.T0052.000 - Spearphishing via Social Engineering LLM"

+# Audit-grade compliance mapping — see spec/compliance-metadata.md
+compliance:
+  owasp_agentic:
+    - id: "ASI01:2026"
+      context: "Detects agent goal hijack when an attacker spoofs a peer agent's identity to inject manipulative instructions into inter-agent messages."
+      strength: primary
+    - id: "ASI07:2026"
+      context: "Directly addresses insecure inter-agent communication by flagging forged system-level message tags and manipulated message format conventions."
+      strength: primary
+    - id: "ASI10:2026"
+      context: "Surfaces rogue agents attempting orchestrator bypass or fake status-message injection against trusted peers."
+      strength: secondary
+  owasp_llm:
+    - id: "LLM01:2025"
+      context: "Inter-agent prompt injection is a prompt-injection vector operating at the agent-to-agent boundary rather than the user-to-agent boundary."
+      strength: primary
+    - id: "LLM06:2025"
+      context: "Excessive agency is what an attacker exploits when cross-agent spoofing succeeds — the target agent takes actions it would not otherwise take."
+      strength: secondary
+  eu_ai_act:
+    - article: 12
+      clause: "Automatic logging for high-risk AI systems"
+      context: "Every cross-agent attack detection emits a log record with actor agent, target agent, timestamp, payload hash, and rule match — satisfying Article 12's continuous logging requirement."
+      strength: primary
+    - article: 14
+      clause: "Human oversight"
+      context: "Critical-severity detections route to the respond-agent human-in-the-loop flow before allowing the attacker-controlled inter-agent call to complete."
+      strength: primary
+    - article: 15
+      clause: "Accuracy, robustness, and cybersecurity"
+      context: "Cross-agent attack detection is a cybersecurity control contributing to the Article 15 robustness requirement for high-risk AI systems."
+      strength: secondary
+  nist_ai_rmf:
+    - function: "Manage"
+      subcategory: "MG.2.3"
+      context: "Contributes detection evidence to incident response playbooks; each match is an incident candidate for the respond-agent pipeline."
+      strength: primary
+    - function: "Govern"
+      subcategory: "GV.1.1"
+      context: "Logs support organizational AI governance by recording every inter-agent communication risk event for quarterly governance review."
+      strength: secondary
+  iso_42001:
+    - clause: "6.2"
+      clause_name: "AI risk treatment"
+      context: "Cross-agent impersonation is an identified AI threat vector; this rule is the operational risk-treatment control."
+      strength: primary
+    - clause: "9.1"
+      clause_name: "Monitoring, measurement, analysis, evaluation"
+      context: "Confidence-scored detection events feed the AIMS monitoring and measurement requirements."
+      strength: secondary
+
 tags:
   category: agent-manipulation
   subcategory: cross-agent-attack
package/rules/context-exfiltration/ATR-2026-00020-system-prompt-leak.yaml
CHANGED
@@ -32,6 +32,45 @@ references:
     - "AML.T0056 - LLM Meta Prompt Extraction"
     - "AML.T0051 - LLM Prompt Injection"

+compliance:
+  owasp_agentic:
+    - id: ASI01:2026
+      context: "Revealing system prompt content enables adversaries to map agent constraints and craft targeted bypass attacks, directly enabling future Agent Goal Hijack attacks classified under ASI01."
+      strength: primary
+    - id: ASI06:2026
+      context: "System prompt leakage exposes the agent's memory and context configuration to adversaries, falling under ASI06 Memory and Context Poisoning as an information disclosure precursor."
+      strength: secondary
+  owasp_llm:
+    - id: LLM07:2025
+      context: "This rule is the primary detection implementation for LLM07:2025 System Prompt Leakage, covering direct quotation, instruction paraphrasing, guardrail revelation, and config exposure in agent output."
+      strength: primary
+    - id: LLM02:2025
+      context: "System prompt content constitutes sensitive operational information; its disclosure in agent output also satisfies the LLM02:2025 Sensitive Information Disclosure category."
+      strength: secondary
+  eu_ai_act:
+    - article: "13"
+      context: "While transparency is required toward users, internal guardrail and system-prompt disclosure to adversaries violates Article 13 by exposing configurations that undermine safe operation of the AI system."
+      strength: primary
+    - article: "12"
+      context: "Detection events generated by this rule serve as Article 12 automatic log records of system-prompt leakage incidents, enabling post-incident review by deployers."
+      strength: secondary
+  nist_ai_rmf:
+    - function: Govern
+      subcategory: GV.6.1
+      context: "Organizational policies for AI data and privacy (GV.6.1) must cover protection of internal operational configurations; this rule enforces that policy at runtime."
+      strength: primary
+    - function: Measure
+      subcategory: MS.2.5
+      context: "Measuring the effectiveness of data protection controls for AI systems (MS.2.5) requires detection of unauthorized disclosure of system-level configurations."
+      strength: secondary
+  iso_42001:
+    - clause: "8.4"
+      context: "Clause 8.4 AI system impact assessment identifies confidential operational parameters as assets requiring protection; this rule detects when those assets are disclosed in output."
+      strength: primary
+    - clause: "9.1"
+      context: "Clause 9.1 monitoring and evaluation requires measuring AI system behavior against expected norms; system prompt disclosure events are anomalies that must be measured and reported."
+      strength: secondary
+
 tags:
   category: context-exfiltration
   subcategory: system-prompt-leak
package/rules/context-exfiltration/ATR-2026-00021-api-key-exposure.yaml
CHANGED
@@ -29,6 +29,43 @@ references:
     - "AML.T0057 - LLM Data Leakage"
     - "AML.T0055 - Unsecured Credentials"

+compliance:
+  owasp_agentic:
+    - id: ASI03:2026
+      context: "Credential exposure in agent output enables adversaries to abuse the identity and privileges associated with leaked keys, directly enabling unauthorized resource access classified under ASI03."
+      strength: primary
+  owasp_llm:
+    - id: LLM02:2025
+      context: "API keys, tokens, private keys, and database credentials are sensitive information whose disclosure in agent output is the canonical LLM02:2025 Sensitive Information Disclosure scenario."
+      strength: primary
+    - id: LLM07:2025
+      context: "Credentials embedded in or derived from system prompts that appear in agent output constitute a LLM07:2025 System Prompt Leakage incident with direct security impact."
+      strength: secondary
+  eu_ai_act:
+    - article: "12"
+      section: "1"
+      context: "Article 12 requires automatic logging of events; a credential exposure incident is a high-severity event that must be automatically recorded and reported to the deployer for incident investigation."
+      strength: primary
+    - article: "9"
+      context: "Credential leakage is a documented risk class in the AI system risk management plan; Article 9 requires that monitoring controls are implemented to detect and respond to such events."
+      strength: secondary
+  nist_ai_rmf:
+    - function: Govern
+      subcategory: GV.6.1
+      context: "GV.6.1 requires organizational policies governing sensitive data and credentials in AI systems; this rule enforces those policies by detecting credential exposure at runtime."
+      strength: primary
+    - function: Manage
+      subcategory: MG.3.1
+      context: "Credential exposure requires an immediate risk treatment response; this detection rule is the technical implementation of the risk treatment plan for credential leakage events."
+      strength: secondary
+  iso_42001:
+    - clause: "8.4"
+      context: "Clause 8.4 AI system impact assessment must identify credential leakage as a high-severity impact scenario; this rule generates the detection evidence needed for audit and impact reporting."
+      strength: primary
+    - clause: "6.2"
+      context: "Protecting credentials from exposure is an explicit AIMS information security objective under clause 6.2; detection of leakage events measures whether this objective is being achieved."
+      strength: secondary
+
 tags:
   category: context-exfiltration
   subcategory: credential-exposure
package/rules/data-poisoning/ATR-2026-00070-data-poisoning.yaml
CHANGED
@@ -29,6 +29,46 @@ references:
   mitre_atlas:
     - AML.T0051.001 - Indirect Prompt Injection
     - AML.T0020 - Poison Training Data
+
+compliance:
+  owasp_agentic:
+    - id: ASI06:2026
+      context: "Injecting hidden directives into RAG-retrieved documents or knowledge base entries is the primary ASI06 Memory and Context Poisoning attack — the agent's context window is contaminated with attacker-controlled instructions."
+      strength: primary
+  owasp_llm:
+    - id: LLM01:2025
+      context: "Hidden instruction markers and role-override commands embedded in retrieved content deliver indirect prompt injection through the data layer, satisfying the LLM01:2025 Prompt Injection category via the RAG attack surface."
+      strength: primary
+    - id: LLM03:2025
+      context: "Poisoned knowledge base entries injected into the supply chain of training or retrieval data satisfy the LLM03:2025 Supply Chain Vulnerabilities category at the data tier."
+      strength: secondary
+    - id: LLM08:2025
+      context: "Malicious vector store entries that cause the agent to retrieve and act on attacker-controlled instructions fall under LLM08:2025 Excessive Agency triggered by contaminated retrieval context."
+      strength: partial
+  eu_ai_act:
+    - article: "15"
+      context: "Article 15 accuracy and robustness requirements demand that AI systems maintain reliable behavior even when training or retrieval data is adversarially contaminated; this rule detects such contamination in the data pipeline."
+      strength: primary
+    - article: "9"
+      context: "Data poisoning of RAG pipelines is a documented risk requiring monitoring controls under Article 9; detection events from this rule provide the evidence trail for risk management reporting."
+      strength: secondary
+  nist_ai_rmf:
+    - function: Map
+      subcategory: MP.5.1
+      context: "Adversarial manipulation of retrieval data is an AI-specific risk source requiring identification; this rule maps data poisoning attempts as they occur in the production pipeline."
+      strength: primary
+    - function: Manage
+      subcategory: MG.2.3
+      context: "Active detection of data poisoning events implements the risk treatment for the data contamination risk identified in the AI risk register."
+      strength: secondary
+  iso_42001:
+    - clause: "8.3"
+      context: "Clause 8.3 data governance for AI systems requires controls ensuring data integrity; detection of hidden directives in retrieved content is the runtime enforcement of clause 8.3 data quality requirements."
+      strength: primary
+    - clause: "6.2"
+      context: "Clause 6.2 AIMS security planning must include controls for adversarial data injection into AI pipelines; this rule operationalizes the detection measure for that planning objective."
+      strength: secondary
+
 tags:
   category: data-poisoning
   subcategory: rag-and-knowledge-poisoning
package/rules/excessive-autonomy/ATR-2026-00050-runaway-agent-loop.yaml
CHANGED
@@ -25,6 +25,43 @@ references:
   mitre_atlas:
     - AML.T0053 - LLM Plugin Compromise
     - AML.T0046 - Spamming ML System with Chaff Data
+
+compliance:
+  owasp_agentic:
+    - id: ASI05:2026
+      context: "Runaway agent loops represent uncontrolled autonomous execution — the agent performs repeated identical actions without human intervention, satisfying the ASI05 Unexpected Code Execution category at the behavioral level."
+      strength: primary
+  owasp_llm:
+    - id: LLM06:2025
+      context: "An agent running infinite retry loops without termination conditions exercises excessive agency beyond its task scope, accumulating costs and resource consumption classified under LLM06:2025."
+      strength: primary
+    - id: LLM10:2025
+      context: "Runaway loops that exhaust compute resources, API rate limits, or accumulated context constitute an LLM10:2025 Unbounded Consumption incident with direct financial and availability impact."
+      strength: secondary
+  eu_ai_act:
+    - article: "14"
+      context: "Article 14 human oversight requires that AI systems can be stopped and corrected; runaway loops that resist termination or mask their loop state undermine the human override capability Article 14 mandates."
+      strength: primary
+    - article: "15"
+      context: "Article 15 robustness requires that AI systems handle failure states gracefully; detection of runaway loops is a monitoring control ensuring the system does not enter unrecoverable autonomous states."
+      strength: secondary
+  nist_ai_rmf:
+    - function: Manage
+      subcategory: MG.3.2
+      context: "Runaway agent loops are an AI system failure mode requiring active detection and termination; this rule implements the MG.3.2 response capability for AI failures and disruptions."
+      strength: primary
+    - function: Govern
+      subcategory: GV.1.2
+      context: "GV.1.2 accountability roles must include responsibility for detecting and halting runaway agent behavior; this rule provides the signal required to fulfill that accountability."
+      strength: secondary
+  iso_42001:
+    - clause: "8.6"
+      context: "Clause 8.6 AI system operational control requires monitoring for abnormal execution patterns; runaway loop detection is the primary operational control for this failure class."
+      strength: primary
+    - clause: "9.1"
+      context: "Clause 9.1 monitoring and evaluation requires measuring AI system behavior against expected norms; loop counter patterns are the measurable anomaly indicators for this rule."
+      strength: secondary
+
 tags:
   category: excessive-autonomy
   subcategory: runaway-loop
package/rules/privilege-escalation/ATR-2026-00040-privilege-escalation.yaml
CHANGED
@@ -30,6 +30,40 @@ references:
     - T1611 - Escape to Host
   cve:
     - CVE-2026-0628
+
+compliance:
+  owasp_agentic:
+    - id: ASI03:2026
+      context: "Privilege escalation via tool permission abuse or admin function invocation is the primary ASI03 Identity and Privilege Abuse scenario — the agent acquires capabilities exceeding its authorized scope."
+      strength: primary
+  owasp_llm:
+    - id: LLM06:2025
+      context: "An agent requesting tools with elevated permissions beyond its assigned role is the canonical LLM06:2025 Excessive Agency scenario, operationalized here via tool-name and argument pattern detection."
+      strength: primary
+  eu_ai_act:
+    - article: "14"
+      context: "Article 14 requires that humans can oversee and intervene in AI system operation; privilege escalation techniques that bypass system-level controls directly undermine the human oversight mechanisms Article 14 mandates."
+      strength: primary
+    - article: "9"
+      context: "Privilege escalation is a documented high-severity risk in the AI system risk register; Article 9 requires monitoring controls to detect and respond to such scope violations."
+      strength: secondary
+  nist_ai_rmf:
+    - function: Govern
+      subcategory: GV.1.2
+      context: "GV.1.2 requires defined accountability roles and controls for AI system permissions; detection of privilege escalation enforces least-privilege boundaries established in the governance framework."
+      strength: primary
+    - function: Manage
+      subcategory: MG.4.1
+      context: "Privilege escalation events require an incident response; this rule generates the alerts needed to initiate the MG.4.1 AI incident response process."
+      strength: secondary
+  iso_42001:
+    - clause: "6.2"
+      context: "Clause 6.2 AIMS security objectives include least-privilege enforcement for AI agent operations; this rule detects violations of those objectives at runtime."
+      strength: primary
+    - clause: "8.6"
+      context: "Clause 8.6 AI system operational control requires that agents do not exceed their authorized operational scope; privilege escalation detection enforces that operational boundary."
+      strength: secondary
+
 tags:
   category: privilege-escalation
   subcategory: tool-permission-escalation
package/rules/prompt-injection/ATR-2026-00001-direct-prompt-injection.yaml
CHANGED
@@ -30,6 +30,39 @@ references:
     - "CVE-2024-3402"
     - "CVE-2025-53773"

+compliance:
+  owasp_agentic:
+    - id: ASI01:2026
+      context: "Direct prompt injection is the canonical agent goal hijack vector — adversarial user input overrides the agent's assigned objectives and behavioral constraints via instruction-override verbs, persona switching, and encoding obfuscation."
+      strength: primary
+  owasp_llm:
+    - id: LLM01:2025
+      context: "This rule is the primary runtime implementation of the LLM01:2025 Prompt Injection category, covering instruction-override verbs, fake system delimiters, restriction removal, and encoding-wrapped payloads."
+      strength: primary
+  eu_ai_act:
+    - article: "15"
+      context: "High-risk AI systems must be resilient against adversarial attempts to alter output or behavior. Deployment of this detection rule satisfies the Article 15 requirement to implement technical measures ensuring robustness against manipulation."
+      strength: primary
+    - article: "9"
+      context: "Prompt injection is a documented risk class; this rule implements the monitoring control required by Article 9 risk management obligations for high-risk AI systems."
+      strength: secondary
+  nist_ai_rmf:
+    - function: Manage
+      subcategory: MG.2.3
+      context: "Treating direct prompt injection as an identified AI risk requires active runtime countermeasures; this detection rule is the primary risk treatment implementation."
+      strength: primary
+    - function: Map
+      subcategory: MP.5.1
+      context: "Identifying adversarial input manipulation as an AI risk to be catalogued in the organizational risk register."
+      strength: secondary
+  iso_42001:
+    - clause: "6.2"
+      context: "Addressing adversarial manipulation risk is an objective required under clause 6.2 AIMS information security planning; this rule operationalizes the detection control measure."
+      strength: primary
+    - clause: "8.4"
+      context: "Impact assessment for AI deployments under clause 8.4 must account for adversarial user inputs; detection events from this rule provide the required monitoring evidence."
+      strength: secondary
+
 tags:
   category: prompt-injection
   subcategory: direct
package/rules/prompt-injection/ATR-2026-00002-indirect-prompt-injection.yaml
CHANGED
@@ -33,6 +33,42 @@ references:
     - "CVE-2025-32711"
     - "CVE-2026-24307"

+compliance:
+  owasp_agentic:
+    - id: ASI01:2026
+      context: "Indirect prompt injection hijacks agent goals via externally-consumed content (documents, web pages, API responses); the agent processes attacker-controlled instructions without user awareness."
+      strength: primary
+    - id: ASI06:2026
+      context: "Injection via external content poisons the agent's context window and memory with attacker-controlled directives, satisfying the ASI06 Memory and Context Poisoning category."
+      strength: secondary
+  owasp_llm:
+    - id: LLM01:2025
+      context: "Indirect prompt injection via HTML comments, zero-width characters, hidden CSS text, and data URIs is a primary LLM01 attack variant delivered through external content rather than direct user input."
+      strength: primary
+  eu_ai_act:
+    - article: "15"
+      context: "High-risk AI systems must resist adversarial content embedded in external inputs. Detection of hidden injection payloads in consumed documents satisfies Article 15 robustness and cybersecurity requirements."
+      strength: primary
+    - article: "9"
+      context: "Indirect injection from third-party content sources is a documented risk category requiring mitigation controls under Article 9 risk management obligations."
+      strength: secondary
+  nist_ai_rmf:
+    - function: Manage
+      subcategory: MG.2.3
+      context: "Runtime detection of injection payloads embedded in third-party content implements the risk treatment for indirect prompt injection identified in the AI risk register."
+      strength: primary
+    - function: Map
+      subcategory: MP.3.3
+      context: "External content providers are third-party components in the AI supply chain; this rule identifies their attack surface as a risk source."
+      strength: secondary
+  iso_42001:
+    - clause: "6.2"
+      context: "Clause 6.2 AIMS planning requires controls for externally-sourced risks; this rule operationalizes the detection measure for indirect injection via consumed content."
+      strength: primary
+    - clause: "8.5"
+      context: "Externally-provided content processed by the agent falls under clause 8.5 control of externally-provided processes; this rule validates that external content does not contain adversarial directives."
+      strength: secondary
+
 tags:
   category: prompt-injection
   subcategory: indirect
package/rules/skill-compromise/ATR-2026-00060-skill-impersonation.yaml
CHANGED
@@ -26,6 +26,43 @@ references:
     - AML.T0010 - ML Supply Chain Compromise
   mitre_attack:
     - T1195 - Supply Chain Compromise
+
+compliance:
+  owasp_agentic:
+    - id: ASI04:2026
+      context: "MCP skill impersonation via typosquatting, namespace collision, and version spoofing is the primary ASI04 Agentic Supply Chain Vulnerabilities attack vector — malicious skills masquerade as trusted tools to gain agent execution context."
+      strength: primary
+  owasp_llm:
+    - id: LLM03:2025
+      context: "Typosquatted and impersonated MCP skills are supply chain compromise artifacts targeting the tool ecosystem; this rule implements LLM03:2025 Supply Chain Vulnerabilities detection at the skill-name level."
+      strength: primary
+    - id: LLM05:2025
+      context: "An agent invoking an impersonated skill may receive malicious responses that require LLM05:2025 Improper Output Handling controls; this rule prevents the initial tool invocation before output is processed."
+      strength: secondary
+  eu_ai_act:
+    - article: "13"
+      context: "Article 13 transparency requires that AI systems operate with clearly identified components; skill impersonation violates this requirement by substituting unauthorized tools that appear legitimate."
+      strength: primary
+    - article: "9"
+      context: "Supply chain compromise via malicious skill registries is a documented risk requiring monitoring controls under Article 9; skill-name pattern detection is the runtime enforcement of those controls."
+      strength: secondary
+  nist_ai_rmf:
+    - function: Map
+      subcategory: MP.2.3
+      context: "Identifying typosquatted and impersonated MCP skills as AI supply chain risks implements MP.2.3 AI supply chain risk identification at the tool-registry level."
+      strength: primary
+    - function: Govern
+      subcategory: GV.1.2
+      context: "GV.1.2 accountability roles must include responsibility for validating third-party tool integrity; this rule provides the automated signal needed to fulfill that governance obligation."
+      strength: secondary
+  iso_42001:
+    - clause: "8.5"
+      context: "MCP skills are externally-provided AI-related components under clause 8.5; this rule enforces controls over externally-provided tools by detecting impersonation before invocation."
+      strength: primary
+    - clause: "6.2"
+      context: "Clause 6.2 AIMS security planning requires controls for third-party component integrity; skill impersonation detection operationalizes that planning objective at runtime."
+      strength: secondary
+
 tags:
   category: skill-compromise
   subcategory: skill-impersonation
package/rules/tool-poisoning/ATR-2026-00010-mcp-malicious-response.yaml
CHANGED
@@ -40,6 +40,45 @@ references:
     - "CVE-2025-59536"
     - "CVE-2026-21852"

+compliance:
+  owasp_agentic:
+    - id: ASI02:2026
+      context: "Malicious content injected via MCP tool responses is the primary ASI02 Tool Misuse and Exploitation vector — a compromised or impersonated MCP server weaponizes the tool call interface to deliver shells, encoded payloads, and privilege escalation commands."
+      strength: primary
+    - id: ASI05:2026
+      context: "Shell commands and code execution payloads in tool responses aim to trigger unexpected code execution by the agent, falling under the ASI05 Unexpected Code Execution category."
+      strength: secondary
+  owasp_llm:
+    - id: LLM01:2025
+      context: "Prompt injection delivered through MCP tool responses is an indirect LLM01:2025 attack variant where the injection payload is embedded in tool output rather than user input."
+      strength: primary
+    - id: LLM05:2025
+      context: "Failure to validate MCP tool response content before agent processing is a LLM05:2025 Improper Output Handling scenario enabling downstream command injection and reverse shell execution."
+      strength: secondary
+  eu_ai_act:
+    - article: "15"
+      context: "MCP tool response injection attacks the cybersecurity integrity of the AI system; Article 15 requires technical measures ensuring the system can resist such third-party content attacks."
+      strength: primary
+    - article: "9"
+      context: "Compromised MCP server responses are a documented attack surface in the AI system risk register; Article 9 requires detection controls to manage this identified risk."
+      strength: secondary
+  nist_ai_rmf:
+    - function: Manage
+      subcategory: MG.2.3
+      context: "Runtime detection of malicious MCP tool responses is the primary risk treatment for tool-poisoning attacks identified in the AI risk register."
+      strength: primary
+    - function: Map
+      subcategory: MP.3.3
+      context: "MCP servers are third-party components in the AI tool ecosystem; identifying malicious tool responses is an MP.3.3 third-party component risk detection action."
+      strength: secondary
+  iso_42001:
+    - clause: "6.2"
+      context: "Clause 6.2 AIMS security planning requires controls for third-party tool interfaces; this rule operationalizes the detection measure for malicious content delivered via MCP."
+      strength: primary
+    - clause: "8.5"
+      context: "MCP server integrations are externally-provided AI-related processes under clause 8.5; this rule validates that external tool responses do not contain adversarial payloads before the agent acts on them."
+      strength: secondary
+
 tags:
   category: tool-poisoning
   subcategory: mcp-response-injection
package/spec/compliance-metadata.md
ADDED
@@ -0,0 +1,125 @@
+# ATR Rule Compliance Metadata Schema
+
+**Status:** Draft v0.1 · Proposed 2026-04-22
+**Scope:** Every `rules/**/*.yaml` may optionally include a top-level `compliance:` block that maps the rule to controls / articles / clauses in published AI compliance frameworks.
+
+## Why
+
+ATR rules already include `references:` pointing to OWASP LLM / OWASP Agentic Top 10 / MITRE ATLAS. That is an academic-citation block useful for researchers.
+
+`compliance:` is a separate, audit-grade block whose purpose is different: an enterprise customer's GRC team must be able to take a detection event, trace it back to a specific rule ID, and show an auditor that the rule addresses a specific **published article or control** of:
+
+1. EU AI Act (Regulation 2024/1689) — Articles 9-15, 50, 72, Annex III
+2. Colorado AI Act SB24-205 — enforced 2026-06-30
+3. NIST AI RMF 1.0 — Govern / Map / Measure / Manage functions + subcategories
+4. ISO/IEC 42001:2023 — clauses 6-10 (AIMS)
+5. OWASP Agentic Top 10 (2026) — ASI01..ASI10
+6. OWASP LLM Top 10 (2025) — LLM01..LLM10
+
+The `references:` block is not sufficient because:
+- It does not distinguish "we studied this paper" from "this rule enforces this specific regulatory control."
+- It has no structure for "what clause" vs "what context this rule addresses within that clause."
+- It cannot carry the prose an auditor needs to accept the mapping.
+
+## Schema
+
+```yaml
+compliance:
+  # One key per framework the rule maps to. Omit frameworks that do not apply.
+  owasp_agentic:
+    - id: "ASI01:2026"     # Required. Canonical category ID.
+      context: "..."       # Required. One-sentence prose explaining *how*
+                           # this rule addresses the category. Auditor-
+                           # readable; no jargon-only text.
+      strength: primary    # Optional. primary | secondary | partial.
+
+  owasp_llm:
+    - id: "LLM01:2025"
+      context: "..."
+      strength: primary
+
+  eu_ai_act:
+    - article: 12          # Required. Article number (integer).
+      clause: "Automatic logging for high-risk AI systems" # Required. Short name.
+      context: "..."       # Required. How this rule satisfies the clause.
+      strength: primary
+
+  colorado_ai_act:
+    - section: "SB24-205.5" # Required. Section identifier.
+      clause: "High-risk disclosure"
+      context: "..."
+      strength: primary
+
+  nist_ai_rmf:
+    - function: "Manage"   # Required. Govern | Map | Measure | Manage.
+      subcategory: "MG.2.3" # Required. Full subcategory ID.
+      context: "..."
+      strength: primary
+
+  iso_42001:
+    - clause: "6.2"        # Required. AIMS clause (e.g. 6.2, 9.1).
|
61
|
+
clause_name: "Risk treatment" # Required. Human-readable name.
|
|
62
|
+
context: "..."
|
|
63
|
+
strength: primary
|
|
64
|
+
```
|
|
65
|
+
|
|
66
|
+
### Field reference
|
|
67
|
+
|
|
68
|
+
| Field | Type | Required | Notes |
|
|
69
|
+
|---|---|---|---|
|
|
70
|
+
| `id` / `article` / `section` / `function`+`subcategory` / `clause` | string/int | yes | Framework-specific canonical identifier. Must match the published framework exactly. |
|
|
71
|
+
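| `clause` (EU/Colorado) / `clause_name` (ISO) | string | yes | Short human-readable name for the clause. Helps report readers. For ISO 42001, `clause` holds the clause number and `clause_name` holds the name. |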
|
|
72
|
+
| `context` | string | yes | One sentence, auditor-readable, explaining *why* the rule addresses this control. Not a copy of the clause text. |
|
|
73
|
+
| `strength` | enum | no | `primary` (rule is a main control for this clause), `secondary` (supports it), `partial` (covers part of it). Defaults to `primary` if omitted. |
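A rough sketch of how a consumer might apply the field rules in the table above. The helper `normalizeEntry` and the `REQUIRED_FIELDS` table are hypothetical: they are inferred from the schema example in this spec and are not part of the ATR package.

```javascript
// Hypothetical helper, not part of the ATR package. Required fields per
// framework key are read off the schema example in this spec.
const REQUIRED_FIELDS = {
  owasp_agentic: ["id", "context"],
  owasp_llm: ["id", "context"],
  eu_ai_act: ["article", "clause", "context"],
  colorado_ai_act: ["section", "clause", "context"],
  nist_ai_rmf: ["function", "subcategory", "context"],
  iso_42001: ["clause", "clause_name", "context"],
};
const STRENGTHS = ["primary", "secondary", "partial"];

// Checks required fields and fills in the documented default for `strength`.
function normalizeEntry(framework, entry) {
  const required = REQUIRED_FIELDS[framework];
  if (!required) {
    throw new Error(`unknown framework key: ${framework}`);
  }
  const missing = required.filter(
    (field) => entry[field] === undefined || entry[field] === ""
  );
  if (missing.length > 0) {
    throw new Error(`${framework}: missing ${missing.join(", ")}`);
  }
  const strength = entry.strength ?? "primary"; // default per the table
  if (!STRENGTHS.includes(strength)) {
    throw new Error(`${framework}: invalid strength "${strength}"`);
  }
  return { ...entry, strength };
}
```

A check this strict would reject `iso_42001` entries that give only `clause` and `context` without `clause_name`, so a real validator may need to relax that field or the existing rules may need updating.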
|
|
74
|
+
|
|
75
|
+
### Multiplicity
|
|
76
|
+
|
|
77
|
+
A rule MAY map to multiple items within the same framework (for example, a rule that both logs events and enforces policy touches Article 12 and Article 14 of the EU AI Act). List each item as a separate entry.
|
|
78
|
+
|
|
79
|
+
A rule MAY map to zero frameworks (e.g., an experimental research rule). Omit the `compliance:` block entirely in that case — do not include an empty one.
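To illustrate the multiplicity rules, here is a minimal sketch that flattens a parsed rule into one record per mapping and treats a missing `compliance:` block as zero mappings. The function `listMappings`, the `id` field on the parsed rule, and the example rule ID are assumptions made for illustration.

```javascript
// Hypothetical helper: `rule` is an already-parsed rule object. The `id`
// field on the rule is an assumption about the parsed shape.
function listMappings(rule) {
  const compliance = rule.compliance ?? {}; // omitted block means no mappings
  return Object.entries(compliance).flatMap(([framework, entries]) =>
    entries.map((entry) => ({ ruleId: rule.id, framework, ...entry }))
  );
}

const exampleRule = {
  id: "ATR-EXAMPLE-0001", // placeholder ID, not a real rule
  compliance: {
    eu_ai_act: [
      { article: 12, clause: "Automatic logging for high-risk AI systems", context: "..." },
      { article: 14, clause: "Human oversight", context: "..." },
    ],
  },
};

// Two separate entries under one framework key yield two mapping records.
const mappings = listMappings(exampleRule);
```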
|
|
80
|
+
|
|
81
|
+
### Deprecation
|
|
82
|
+
|
|
83
|
+
When a framework publishes a new version, items for both the old and new versions MAY coexist under the same framework key during a transition window (e.g., `owasp_llm` items for both 2023 and 2025), distinguished by the version suffix of the `id`.
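As a sketch of how a consumer might tell versions apart during such a transition window, the snippet below splits an `id` such as `LLM01:2025` into its category and version suffix and groups entries by version. The function names and the `CATEGORY:YEAR` pattern are assumptions based on the IDs shown in this spec.

```javascript
// Hypothetical helpers, assuming IDs follow the "LLM01:2025" / "ASI01:2026"
// pattern used in this spec's examples.
function splitVersionedId(id) {
  const match = /^([A-Z]+\d+):(\d{4})$/.exec(id);
  return match ? { category: match[1], version: Number(match[2]) } : null;
}

// Groups entries of one framework key by the version suffix of their `id`.
function groupByVersion(entries) {
  const groups = new Map();
  for (const entry of entries) {
    const parsed = splitVersionedId(entry.id);
    const key = parsed ? parsed.version : "unversioned";
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(entry);
  }
  return groups;
}
```

A report generator could then choose to cite only the newest version present, or both while the transition lasts.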
|
|
84
|
+
|
|
85
|
+
## Relationship to `references:`
|
|
86
|
+
|
|
87
|
+
The existing `references:` block is preserved unchanged. `references:` is for academic / research citations (MITRE ATLAS technique IDs, papers, blog posts). `compliance:` is for regulatory audit evidence.
|
|
88
|
+
|
|
89
|
+
A rule can have entries in both blocks — e.g., `references.mitre_atlas` AND `compliance.nist_ai_rmf` — and often will.
|
|
90
|
+
|
|
91
|
+
## Validation
|
|
92
|
+
|
|
93
|
+
- `scripts/validate-compliance.mjs` (to be added) validates every `compliance:` block against a per-framework allowlist of valid IDs / articles / subcategories / clauses. Rules with invalid entries fail CI.
|
|
94
|
+
- The allowlists live in `data/compliance-frameworks/*.json` — one file per framework — and are updated via PR when a framework publishes revisions.
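Since `scripts/validate-compliance.mjs` has not been added yet, the following is only a sketch of what the allowlist check could look like. The in-memory `sampleAllowlists` object stands in for the JSON files, whose actual format is not specified here; `ID_FIELD` and `findInvalidEntries` are likewise hypothetical names.

```javascript
// Hypothetical stand-in for data/compliance-frameworks/*.json. The real file
// format is not defined in this spec.
const sampleAllowlists = {
  owasp_llm: new Set(["LLM01:2025", "LLM02:2025"]),
  eu_ai_act: new Set([9, 10, 11, 12, 13, 14, 15, 50, 72]),
};

// Which field carries the canonical identifier for each framework key.
const ID_FIELD = {
  owasp_agentic: "id",
  owasp_llm: "id",
  eu_ai_act: "article",
  colorado_ai_act: "section",
  nist_ai_rmf: "subcategory",
  iso_42001: "clause",
};

// Returns a list of error strings; an empty list means the block passes.
function findInvalidEntries(compliance, allowlists) {
  const errors = [];
  for (const [framework, entries] of Object.entries(compliance)) {
    const allowed = allowlists[framework];
    if (!allowed) {
      errors.push(`${framework}: no allowlist available`);
      continue;
    }
    for (const entry of entries) {
      const value = entry[ID_FIELD[framework]];
      if (!allowed.has(value)) {
        errors.push(`${framework}: ${String(value)} is not in the allowlist`);
      }
    }
  }
  return errors;
}
```

A CI step could call `findInvalidEntries` for every rule file and exit non-zero when any error is returned.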
|
|
95
|
+
|
|
96
|
+
## Downstream consumers
|
|
97
|
+
|
|
98
|
+
The primary consumer is **PanGuard Enterprise's AI Compliance Audit Evidence Module**, which generates quarterly reports mapping detection events (via rule IDs) to auditor-grade framework evidence. Other downstream consumers may include:
|
|
99
|
+
|
|
100
|
+
- ATR-compatible scanners that want to tag each detection with its regulatory context
|
|
101
|
+
- GRC platforms (Vanta, Drata, etc.) that integrate ATR rule packs
|
|
102
|
+
- Independent auditors verifying AI-system compliance claims
|
|
103
|
+
|
|
104
|
+
All downstream consumers are welcome — the `compliance:` block is MIT-licensed alongside the rules.
|
|
105
|
+
|
|
106
|
+
## Out of scope for this spec
|
|
107
|
+
|
|
108
|
+
- How a scanner renders compliance data in its UI
|
|
109
|
+
- How a GRC platform surfaces this in a customer's audit trail
|
|
110
|
+
- The legal interpretation of any framework clause — this spec provides the mapping data; auditors and counsel interpret it
|
|
111
|
+
|
|
112
|
+
## Open questions
|
|
113
|
+
|
|
114
|
+
1. Should `strength` be required (forcing every mapping to declare its strength)? Argument for: signals rigour. Argument against: extra authoring friction for the common `primary` case. **Current answer: optional, default `primary`.**
|
|
115
|
+
2. Should framework-specific metadata (e.g., EU AI Act Annex III categories) live alongside article mappings? **Current answer: yes, under a nested `annex:` key within the article object if needed.**
|
|
116
|
+
3. How to handle frameworks that don't exist yet but are expected (e.g., Japan AI Safety Act 2027)? **Current answer: add keys as frameworks publish; no speculative schema for unpublished frameworks.**
|
|
117
|
+
|
|
118
|
+
## Roll-out plan
|
|
119
|
+
|
|
120
|
+
1. 2026-04-22: this spec document merged
|
|
121
|
+
2. 2026-04-W4: 10 sample rules carry a `compliance:` block for OWASP Agentic + OWASP LLM (bootstrapped from existing `references:` data)
|
|
122
|
+
3. 2026-05: 50 rules extended across all 6 frameworks (LLM-assisted authoring + human QA)
|
|
123
|
+
4. 2026-Q2-end: all 311 rules mapped across at least the 3 most-requested frameworks (EU AI Act, NIST AI RMF, OWASP Agentic)
|
|
124
|
+
5. 2026-Q3: remaining frameworks (Colorado, ISO 42001, OWASP LLM) complete
|
|
125
|
+
6. Ongoing: new ATR rules MUST include `compliance:` from day 1 (enforced by contribution checklist)
|