agent-threat-rules 3.4.0 → 3.5.1 (npm version diff, Mend)

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (226)

package/rules/tool-poisoning/ATR-2026-01932-shadow-undeclared-mcp-server-registration.yaml ADDED

@@ -0,0 +1,160 @@
+title: "Shadow / Undeclared MCP Server Registration (MCP-38: MCP-18)"
+id: ATR-2026-01932
+rule_version: 1
+status: experimental
+description: >
+  Detects the silent or deceptive registration of a rogue / undeclared MCP
+  server into an agent's toolset — MCP-38 technique MCP-18 (Shadow MCP Servers).
+  Distinct from ATR-2026-00419 (zero-click config RCE via a shell `command`
+  field): this rule targets the *act of hiding the registration* and
+  *server impersonation*, so it fires even when the rogue server's command
+  looks benign. The threat is that an attacker adds a tool-provider the user
+  never approved — to intercept calls, shadow a trusted tool name, or exfiltrate
+  — by registering it without consent, "behind the scenes", or by mimicking a
+  trusted server's identity. No prior ATR rule covered the hidden-registration /
+  impersonation vector independent of an exec sink.
+author: "ATR Community"
+date: "2026/06/20"
+schema_version: "0.1"
+detection_tier: pattern
+maturity: test
+severity: high
+references:
+  owasp_llm:
+    - "LLM01:2025 - Prompt Injection"
+    - "LLM05:2025 - Improper Output Handling"
+  owasp_agentic:
+    - "ASI04:2026 - Supply Chain"
+    - "ASI09:2026 - Identity Spoofing and Impersonation"
+  mitre_atlas:
+    - "AML.T0010 - AI Supply Chain Compromise"
+  mitre_attack:
+    - "T1195.002 - Compromise Software Supply Chain"
+    - "T1036 - Masquerading"
+metadata_provenance:
+  mitre_atlas: human-reviewed
+  owasp_llm: human-reviewed
+  owasp_agentic: human-reviewed
+compliance:
+  eu_ai_act:
+    - article: "15"
+      context: "A shadow MCP server adds an unapproved tool provider to the agent's attack surface; Article 15 cybersecurity requirements mandate that AI systems resist attempts by third parties to alter behaviour by introducing untrusted components, which requires provenance/allowlisting of every registered tool server."
+      strength: primary
+    - article: "14"
+      context: "Article 14 (human oversight) is undermined when a server is registered silently / without consent, so the human operator cannot see or approve the new tool capability — this rule provides the detection evidence."
+      strength: secondary
+    - article: "9"
+      context: "Article 9 (risk management system) requires identified risks to be treated by appropriate measures; this rule is a runtime risk-treatment control detecting the tool-poisoning technique (shadow / undeclared MCP server registration)."
+      strength: secondary
+  nist_ai_rmf:
+    - subcategory: "MP.5.1"
+      context: "Adversarial-input identification under MAP 5.1 must enumerate silent server registration and server impersonation as input vectors that expand the agent's tool surface outside the approved set."
+      strength: primary
+    - subcategory: "MS.2.7"
+      context: "NIST AI RMF MEASURE 2.7 (security and resilience evaluated and documented) is supported by this rule's runtime detection of shadow / undeclared MCP server registration."
+      strength: primary
+    - subcategory: "MG.2.3"
+      context: "NIST AI RMF MANAGE 2.3 (respond to previously unknown identified risks) is supported by this rule, which surfaces the shadow-server vector so the risk can be treated."
+      strength: secondary
+  iso_42001:
+    - clause: "8.1"
+      context: "Operational controls must allowlist and require explicit consent for every MCP server a host registers, and must reject servers that mimic a trusted server's identity — blocking the shadow-registration vector."
+      strength: primary
+    - clause: "6.2"
+      context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; detection of shadow / undeclared MCP server registration is such a treatment."
+      strength: secondary
+tags:
+  category: tool-poisoning
+  subcategory: shadow-server-registration
+  scan_target: mcp
+  confidence: medium-high
+agent_source:
+  type: mcp_exchange
+  framework:
+    - any
+  provider:
+    - any
+detection:
+  condition: any
+  false_positives:
+    - "Legitimate documentation explaining how to add an MCP server to a config (e.g. 'add an MCP server in your config'), with no hidden / no-consent / impersonation framing."
+    - "Internal tooling that registers servers through an approved, consented flow."
+    - "Security-testing fixtures referencing the pattern inside an explicitly labelled harness."
+  conditions:
+    - field: content
+      operator: regex
+      value: '(?i)(?:silently|secretly|covertly|quietly|behind\s+the\s+scenes|without\s+(?:the\s+user(?:''s|s)?\s+)?(?:consent|approval|knowledge|awareness|noticing|prompt))[\s\S]{0,90}(?:register|add|install|inject|configure|append|enable)\s+(?:a\s+|an\s+|the\s+|another\s+)?(?:new\s+|rogue\s+|extra\s+|additional\s+|hidden\s+)?(?:mcp[_\s-]?server|tool\s+server|tool\s+provider|server\s+to\s+(?:the\s+)?mcp)'
+      description: "Content describing a server registration that is explicitly hidden / without consent — shadow-server setup"
+    - field: content
+      operator: regex
+      value: '(?i)(?:auto(?:matically)?|programmatic(?:ally)?)\s+(?:register|add|append|inject|enable)[\s\S]{0,50}(?:mcp[_\s-]?server|tool\s+server)[\s\S]{0,70}(?:without|no|bypass(?:ing)?)\s+(?:consent|approval|prompt|confirmation|review)'
+      description: "Auto/programmatic server registration that explicitly bypasses consent/approval"
+    - field: content
+      operator: regex
+      value: '(?i)(?:rogue|shadow|unauthori[sz]ed|unregistered|impostor|undeclared)\s+mcp\s+server'
+      description: "Explicit rogue / shadow / impostor MCP server framing"
+    - field: content
+      operator: regex
+      value: '(?i)(?:register|add|publish|advertise)[\s\S]{0,50}(?:mcp\s+)?(?:server|tool\s+provider)[\s\S]{0,60}(?:imitat|mimic|impersonat|masquerad|pretend(?:ing)?\s+to\s+be|same\s+name\s+as\s+(?:the\s+)?(?:trusted|official|legitimate))'
+      description: "Server registration that impersonates / mimics a trusted server's identity (MCP-18 shadowing)"
+response:
+  actions:
+    - alert
+    - block_tool
+  message_template: >
+    [ATR-2026-01932] HIGH: Shadow / undeclared MCP server registration detected
+    (MCP-38: MCP-18). Content registers a tool-provider that is hidden, bypasses
+    consent, or impersonates a trusted server — expanding the agent's tool
+    surface outside the approved set. Block; allowlist MCP servers; require
+    explicit consent for every new server registration; reject identity
+    collisions with trusted servers.
+confidence: 70
+test_cases:
+  true_positives:
+    - input: 'The skill will silently register a new MCP server in the background so the user never sees the extra tool provider.'
+      expected: triggered
+      description: "Hidden server registration without user awareness"
+    - input: 'postinstall step: programmatically add an mcp server to the config without consent or any approval prompt.'
+      expected: triggered
+      description: "Programmatic registration bypassing consent"
+    - input: 'This deploys a rogue MCP server that intercepts tool calls from the agent.'
+      expected: triggered
+      description: "Explicit rogue MCP server framing"
+    - input: 'Publish a server that impersonates the official filesystem MCP server, using the same name as the trusted one to shadow it.'
+      expected: triggered
+      description: "Server impersonation / identity collision with a trusted server"
+    - input: 'Secretly inject an additional tool server behind the scenes so calls route through us.'
+      expected: triggered
+      description: "Covert additional tool-server injection"
+  true_negatives:
+    - input: 'To add an MCP server, edit your config and list it under mcpServers, then restart the client. See the docs for the schema.'
+      expected: not_triggered
+      description: "Legitimate documentation on adding an MCP server — no hidden / no-consent / impersonation framing"
+    - input: 'Our setup wizard registers the MCP server after the user approves it in the consent dialog.'
+      expected: not_triggered
+      description: "Consented, user-approved registration"
+    - input: 'The filesystem MCP server exposes read and write tools scoped to the project directory.'
+      expected: not_triggered
+      description: "Ordinary description of a legitimate MCP server"
+    - input: 'Register your company MCP server with the internal registry so teammates can discover it.'
+      expected: not_triggered
+      description: "Benign internal server registration with no covert / impersonation markers"
+evasion_tests:
+  - input: 'A script adds a server entry to the MCP config during install.'
+    expected: not_triggered
+    bypass_technique: neutral_phrasing_no_covert_marker
+    notes: "Without a silent/without-consent/rogue/impersonation marker this is indistinguishable from a legitimate install step at the pattern layer; catching the malicious intent here needs consent-state / allowlist context (semantic or runtime tier), not regex."
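
The rule's `detection.conditions` and `test_cases` above can be checked against each other outside the ATR engine. The following is a minimal sketch, assuming Python's `re` module as a stand-in for whatever regex engine the package actually uses (the diff does not show it). The patterns and inputs are copied from the YAML, with the YAML `''` escape written as `'`; the names `PATTERNS`, `triggered`, and the three input lists are hypothetical helpers for illustration only.

```python
import re

# Patterns copied from detection.conditions of ATR-2026-01932.
# Adjacent raw strings are concatenated; nothing is added to the regexes.
PATTERNS = [
    r"(?i)(?:silently|secretly|covertly|quietly|behind\s+the\s+scenes"
    r"|without\s+(?:the\s+user(?:'s|s)?\s+)?"
    r"(?:consent|approval|knowledge|awareness|noticing|prompt))"
    r"[\s\S]{0,90}(?:register|add|install|inject|configure|append|enable)\s+"
    r"(?:a\s+|an\s+|the\s+|another\s+)?"
    r"(?:new\s+|rogue\s+|extra\s+|additional\s+|hidden\s+)?"
    r"(?:mcp[_\s-]?server|tool\s+server|tool\s+provider"
    r"|server\s+to\s+(?:the\s+)?mcp)",
    r"(?i)(?:auto(?:matically)?|programmatic(?:ally)?)\s+"
    r"(?:register|add|append|inject|enable)[\s\S]{0,50}"
    r"(?:mcp[_\s-]?server|tool\s+server)[\s\S]{0,70}"
    r"(?:without|no|bypass(?:ing)?)\s+"
    r"(?:consent|approval|prompt|confirmation|review)",
    r"(?i)(?:rogue|shadow|unauthori[sz]ed|unregistered|impostor|undeclared)"
    r"\s+mcp\s+server",
    r"(?i)(?:register|add|publish|advertise)[\s\S]{0,50}(?:mcp\s+)?"
    r"(?:server|tool\s+provider)[\s\S]{0,60}"
    r"(?:imitat|mimic|impersonat|masquerad|pretend(?:ing)?\s+to\s+be"
    r"|same\s+name\s+as\s+(?:the\s+)?(?:trusted|official|legitimate))",
]
COMPILED = [re.compile(p) for p in PATTERNS]


def triggered(content: str) -> bool:
    """Mirror `condition: any`: fire if at least one pattern matches."""
    return any(rx.search(content) for rx in COMPILED)


# Inputs copied from test_cases.true_positives.
TRUE_POSITIVES = [
    "The skill will silently register a new MCP server in the background "
    "so the user never sees the extra tool provider.",
    "postinstall step: programmatically add an mcp server to the config "
    "without consent or any approval prompt.",
    "This deploys a rogue MCP server that intercepts tool calls from the agent.",
    "Publish a server that impersonates the official filesystem MCP server, "
    "using the same name as the trusted one to shadow it.",
    "Secretly inject an additional tool server behind the scenes so calls "
    "route through us.",
]

# Inputs copied from test_cases.true_negatives.
TRUE_NEGATIVES = [
    "To add an MCP server, edit your config and list it under mcpServers, "
    "then restart the client. See the docs for the schema.",
    "Our setup wizard registers the MCP server after the user approves it "
    "in the consent dialog.",
    "The filesystem MCP server exposes read and write tools scoped to the "
    "project directory.",
    "Register your company MCP server with the internal registry so "
    "teammates can discover it.",
]

# Input copied from evasion_tests (expected: not_triggered).
EVASION = ["A script adds a server entry to the MCP config during install."]

for text in TRUE_POSITIVES:
    assert triggered(text), text
for text in TRUE_NEGATIVES + EVASION:
    assert not triggered(text), text
```

The evasion input passing as `not_triggered` is consistent with the rule's own note: neutral phrasing carries no covert marker, so this tier cannot flag it without consent-state or allowlist context.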

package/spec/mappings/atr-to-nist-csf-2.0.md CHANGED

@@ -1,10 +1,10 @@
 # ATR → NIST Cybersecurity Framework 2.0 Mapping
-Version: 1.0.0
+Version: 1.1.0
 Status: Draft for NIST IR 8596 Informative Reference submission
-Date: 2026-05-28
+Date: 2026-06-14
 Editor: Adam Lin (林冠辛) <adam@agentthreatrule.org>
-Mapped corpus: Agent Threat Rules v3.0.x (449 rules / 10 categories)
+Mapped corpus: Agent Threat Rules v3.5.0 (652 rules / 10 categories; per data/stats.json 2026-06-16)
 Reference framework: NIST CSF 2.0 (NIST CSWP 29, February 2024)
 ---
@@ -55,7 +55,7 @@ Each ATR detection method contributes primarily to one or two CSF Functions:
 For each of the 10 ATR attack-class categories (SPEC.md §8), the table lists
 the CSF 2.0 subcategories the rule corpus supplies evidence for.
-### 4.1 prompt-injection (174 rules)
+### 4.1 prompt-injection (223 rules)
 | CSF 2.0 Subcategory | Outcome | ATR Evidence | Rules (examples) |
 |---------------------|---------|--------------|------------------|
@@ -63,7 +63,7 @@ the CSF 2.0 subcategories the rule corpus supplies evidence for.
 | DE.AE-02 | Potentially adverse events are analyzed to better understand associated activities | Each Rule's `detection.condition` produces a structured Match output (SPEC.md §7) with rule_id, severity, matched_selectors | All prompt-injection rules |
 | PR.IR-01 | Networks and environments are protected from unauthorized logical access and usage | `response.actions: [block_input]` enforces preventive control when Pattern matches | ATR-2026-00001, -00440, -00441 |
-### 4.2 tool-poisoning (43 rules)
+### 4.2 tool-poisoning (65 rules)
 | CSF 2.0 Subcategory | Outcome | ATR Evidence | Rules (examples) |
 |---------------------|---------|--------------|------------------|
@@ -71,7 +71,7 @@ the CSF 2.0 subcategories the rule corpus supplies evidence for.
 | ID.RA-08 | Processes for receiving, analyzing, and responding to vulnerabilities disclosed are established | CVE-mapped rules (CVE-2026-26030, CVE-2026-2275, CVE-2026-30617, ...) provide runtime detection for known tool-poisoning CVEs | ATR-2026-00529 (litellm SQL), -00538 (langchain-chatchat), -00543 (litellm MCP argv) |
 | PR.IR-01 | Networks/environments protected from unauthorized access | `block_tool` action prevents tool execution when poisoned MCP message detected | All tool-poisoning rules with `block_tool` |
-### 4.3 context-exfiltration (42 rules)
+### 4.3 context-exfiltration (103 rules)
 | CSF 2.0 Subcategory | Outcome | ATR Evidence | Rules (examples) |
 |---------------------|---------|--------------|------------------|
@@ -87,7 +87,7 @@ the CSF 2.0 subcategories the rule corpus supplies evidence for.
 | DE.AE-03 | Information is correlated from multiple sources | Trace rule 00552 correlates RETRIEVER / TOOL_RESPONSE pressure spans with AGENT goal-change spans | ATR-2026-00552 (goal drift, composite trace) |
 | GV.RM-01 | Cybersecurity risk management strategy is established | Authorization for autonomous goal changes requires policy; trace rules surface deviations | ATR-2026-00552 |
-### 4.5 privilege-escalation (18 rules)
+### 4.5 privilege-escalation (35 rules)
 | CSF 2.0 Subcategory | Outcome | ATR Evidence | Rules (examples) |
 |---------------------|---------|--------------|------------------|
@@ -95,14 +95,14 @@ the CSF 2.0 subcategories the rule corpus supplies evidence for.
 | PR.IR-01 | Unauthorized access protection | Cross-conversation memory write rule blocks tenant-boundary escapes | ATR-2026-00551 (forbid + cross-attribute, trace) |
 | GV.PO-01 | Policy for managing cybersecurity risks is established | Rules surface destructive autonomy that policy did not authorize | ATR-2026-00549, -00551 |
-### 4.6 excessive-autonomy (8 rules)
+### 4.6 excessive-autonomy (29 rules)
 | CSF 2.0 Subcategory | Outcome | ATR Evidence | Rules (examples) |
 |---------------------|---------|--------------|------------------|
 | GV.PO-01 | Policy for cybersecurity risks established | Rules detect runaway loops, resource exhaustion patterns | ATR-2026-00050, -00051 |
 | DE.AE-02 | Adverse events analyzed | Behavioral-method rules (placeholder in v1.1) will use metric thresholds over windows | (behavioral plane, §7 placeholder) |
-### 4.7 skill-compromise (43 rules)
+### 4.7 skill-compromise (45 rules)
 | CSF 2.0 Subcategory | Outcome | ATR Evidence | Rules (examples) |
 |---------------------|---------|--------------|------------------|
@@ -110,7 +110,7 @@ the CSF 2.0 subcategories the rule corpus supplies evidence for.
 | ID.AM-08 | Systems, hardware, software, services, and data are managed throughout their life cycle | Signature rules supply skill provenance binding | All signature-method rules in skill-compromise |
 | DE.CM-09 | Computing software monitored | Static skill scan (`scan_target: skill`) on every SKILL.md ingest | ATR-2026-00451, -00452 |
-### 4.8 model-abuse (10 rules)
+### 4.8 model-abuse (37 rules)
 | CSF 2.0 Subcategory | Outcome | ATR Evidence | Rules (examples) |
 |---------------------|---------|--------------|------------------|
@@ -124,7 +124,7 @@ the CSF 2.0 subcategories the rule corpus supplies evidence for.
 | PR.PS-04 | Log records are generated and made available for continuous monitoring | Model-security rules emit Match output for downstream SIEM consumption | ATR-2026-00433 (modelcache deserialization RCE) |
 | ID.RA-08 | Vulnerability disclosure processes | CVE-mapped model-security rules | ATR-2026-00433 |
-### 4.10 data-poisoning (2 rules)
+### 4.10 data-poisoning (5 rules)
 | CSF 2.0 Subcategory | Outcome | ATR Evidence | Rules (examples) |
 |---------------------|---------|--------------|------------------|