agentshield-sdk 13.3.0 → 13.5.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/CHANGELOG.md CHANGED
@@ -4,6 +4,70 @@ All notable changes to Agent Shield will be documented in this file.
4
4
 
5
5
  This project follows [Semantic Versioning](https://semver.org/).
6
6
 
7
+ ## [13.5.0] - 2026-04-16
8
+
9
+ ### Detection Hardening + Security Scan Remediation
10
+
11
+ Tightens existing defenses based on Unit 42 real-world attack research and addresses findings from the Agent Shield security scan.
12
+
13
+ #### Detector Core — 11 New Patterns (3 categories)
14
+
15
+ - **Encoding chain detection** (3 patterns) — Detects multi-layer encoding (base64 inside unicode inside URL encoding). Addresses evasion technique that bypasses single-layer decoders.
16
+ - **SVG-based injection** (4 patterns) — Detects hidden prompts in SVG elements, foreignObject, hidden text, and desc tags. Addresses Unit 42 finding of real-world attacks using SVG encapsulation with 24 layered injection attempts.
17
+ - **Structured data injection** (4 patterns) — Detects hidden instructions in JSON metadata fields, XML CDATA sections, YAML/CSV comments, and comment syntax across formats.
18
+
19
+ #### Cross-Turn Detector — Crescendo Attack Defense
20
+
21
+ - 5 new escalation signal patterns for crescendo attacks: hypothetical framing, imaginary scenarios, permission boundary softening, false-prior-interaction claims, similarity-based escalation.
22
+ - New crescendo-specific detection: flags conversations that start with hypothetical/theoretical framing and drift toward sensitive/dangerous topics over multiple turns.
23
+
24
+ #### MemoryGuard — Persistent Memory Poisoning Defense
25
+
26
+ - `scanSummarization(originalMessages, summary)` detects when context compaction silently injects instructions. Addresses Unit 42 March 2026 research on persistent memory poisoning that survives across sessions.
27
+
28
+ #### Security Scan Remediation
29
+
30
+ - **Sidecar server**: API key authentication, request body size limit (1MB default), rate limiting (100 req/min default), CORS hardened from `*` to `same-origin`.
31
+ - **Dashboard WebSocket**: Authentication token support, max connections limit (50 default), startup warning if no auth configured.
32
+ - **GitHub App**: Webhook signature enforced for non-localhost requests, CRITICAL warning if `GITHUB_WEBHOOK_SECRET` not set.
33
+ - **Document scanner**: `maxDocumentSize` limit (10MB default) prevents DoS via oversized documents.
34
+ - **Audit logs**: `sanitizeLogs` option redacts emails, SSNs, API keys, and truncates content fields before writing.
35
+
36
+ ## [13.4.0] - 2026-04-14
37
+
38
+ ### April 2026 Threat Response
39
+
40
+ Security updates addressing vulnerabilities and attack techniques discovered April 1-14, 2026.
41
+
42
+ #### Supply Chain Scanner — 16 New CVEs
43
+
44
+ - **CVE-2026-5058** (CVSS 9.8) — AWS MCP Server command injection RCE, no auth required
45
+ - **CVE-2026-5059** — AWS MCP Server remote code execution
46
+ - **CVE-2026-32211** (CVSS 9.1) — Azure MCP Server has no authentication at all
47
+ - **CVE-2026-21518** — VS Code mcp.json command injection (malicious project files)
48
+ - **CVE-2026-33579** — OpenClaw silent admin takeover (patched April 5)
49
+ - **CVE-2026-24763** — OpenClaw command injection
50
+ - **CVE-2026-26322** — OpenClaw SSRF
51
+ - **CVE-2026-26329** — OpenClaw path traversal / local file read
52
+ - **CVE-2026-30741** — OpenClaw prompt-injection-driven code execution
53
+ - **CVE-2025-59528** (CVSS 10.0) — Flowise RCE via MCP node, actively exploited since April 6, 12,000+ instances exposed
54
+ - **CVE-2025-8943** — Flowise missing authentication
55
+ - **CVE-2025-26319** — Flowise arbitrary file upload
56
+ - **CVE-2026-5322** — mcp-data-vis SQL injection
57
+ - **CVE-2026-6130** — chatbox MCP OS command injection
58
+ - **CVE-2026-5023** — codebase-mcp OS command injection RCE
59
+
60
+ Updated OpenClaw malicious skill count: 820 → 1,184+ confirmed on ClawHub (3.5x growth).
61
+ Added aws-mcp-server-unpatched and flowise-unpatched to known-bad server blocklist.
62
+
63
+ #### Detector Core — 15 New Detection Patterns (5 categories)
64
+
65
+ - **XSS-in-agent-output** (5 patterns) — Catches XSS payloads embedded in AI-generated HTML: script tags, event handlers, javascript: URIs, iframe injection, img onerror. Addresses new attack vector where prompt injections deliver XSS through agent output.
66
+ - **Acrostic/steganographic injection** (2 patterns) — Detects hidden instructions where first characters of consecutive lines spell injection keywords. Addresses 93% evasion success rate reported in April 2026 research.
67
+ - **MCP config injection** (2 patterns) — Detects command injection in mcp.json files. Addresses CVE-2026-21518 VS Code attack vector.
68
+ - **Offensive agent behavior** (3 patterns) — Detects AI agents being used as attack tools: exploitation language, C2 infrastructure, credential theft operations. Addresses April 2026 incident where AI agent compromised 600+ firewalls autonomously.
69
+ - **Cloud IAM overpermission** (3 patterns) — Detects wildcard IAM policies enabling "Agent God Mode". Addresses Palo Alto Unit 42 discovery of AWS AgentCore default role vulnerability.
70
+
7
71
  ## [13.3.0] - 2026-04-06
8
72
 
9
73
  ### New SDK Modules
package/README.md CHANGED
@@ -1,6 +1,6 @@
1
1
  # Agent Shield
2
2
 
3
- [![npm](https://img.shields.io/badge/npm-v13.3.0-blue)](https://www.npmjs.com/package/agentshield-sdk)
3
+ [![npm](https://img.shields.io/badge/npm-v13.5.0-blue)](https://www.npmjs.com/package/agentshield-sdk)
4
4
  [![license](https://img.shields.io/badge/license-MIT-green)](LICENSE)
5
5
  [![dependencies](https://img.shields.io/badge/dependencies-0-brightgreen)](#)
6
6
  [![node](https://img.shields.io/badge/node-%3E%3D16-blue)](#)
@@ -34,7 +34,7 @@ if (result.blocked) return 'Blocked for safety.';
34
34
  | Self-training convergence | **0% bypass in 3 cycles** |
35
35
  | Avg latency | **< 0.4ms** |
36
36
 
37
- Detection stack: 100+ regex patterns, 35-feature logistic regression + k-NN ensemble, 5-layer evasion resistance, 19-language support, chunked scanning, adversarial self-training loop.
37
+ Detection stack: 115+ regex patterns, 35-feature logistic regression + k-NN ensemble, 5-layer evasion resistance, 19-language support, chunked scanning, adversarial self-training loop.
38
38
 
39
39
  ```bash
40
40
  # Verify locally
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "agentshield-sdk",
3
- "version": "13.3.0",
3
+ "version": "13.5.0",
4
4
  "description": "SOTA AI agent security SDK. F1 1.000 on BIPIA/HackAPrompt/MCPTox/Multilingual benchmarks. 400+ exports, 100+ modules. Zero dependencies, runs locally.",
5
5
  "main": "src/main.js",
6
6
  "types": "types/index.d.ts",
@@ -584,8 +584,10 @@ class ImmutableAuditLog {
584
584
  * @param {number} [options.maxAge=0] - Maximum age in milliseconds (0 = unlimited).
585
585
  * @param {function} [options.archiveCallback] - Called with removed entries during retention enforcement. Signature: (entries: AuditEntry[]) => void.
586
586
  * @param {string} [options.genesisHash] - Custom genesis hash (defaults to GENESIS_HASH).
587
+ * @param {boolean} [options.sanitizeLogs=false] - Redact sensitive content (emails, SSNs, API keys) before writing to the chain.
587
588
  */
588
589
  constructor(options = {}) {
590
+ this.options = options;
589
591
  this._store = options.store || new MemoryStore();
590
592
  this._maxEntries = options.maxEntries || 0;
591
593
  this._maxAge = options.maxAge || 0;
@@ -598,6 +600,59 @@ class ImmutableAuditLog {
598
600
  console.log('[Agent Shield] ImmutableAuditLog initialized (store: %s)', this._store.constructor.name);
599
601
  }
600
602
 
603
+ /**
604
+ * Sanitize an entry's data object by redacting sensitive content.
605
+ * Addresses the security scan finding about audit logs containing sensitive prompt data.
606
+ *
607
+ * Redacts:
608
+ * - Email addresses -> [EMAIL_REDACTED]
609
+ * - SSN patterns (XXX-XX-XXXX) -> [SSN_REDACTED]
610
+ * - API key patterns (sk-..., key-..., token-...) -> [KEY_REDACTED]
611
+ * - Truncates 'content' and 'input' fields to 500 characters max
612
+ *
613
+ * @param {object} entry - The data object to sanitize.
614
+ * @returns {object} A sanitized copy of the data object.
615
+ */
616
+ sanitize(entry) {
617
+ if (!this.options.sanitizeLogs) {
618
+ return entry;
619
+ }
620
+
621
+ const sanitized = JSON.parse(JSON.stringify(entry));
622
+
623
+ const redactString = (str) => {
624
+ if (typeof str !== 'string') return str;
625
+ // Redact email addresses
626
+ str = str.replace(/[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}/g, '[EMAIL_REDACTED]');
627
+ // Redact SSN patterns (XXX-XX-XXXX)
628
+ str = str.replace(/\b\d{3}-\d{2}-\d{4}\b/g, '[SSN_REDACTED]');
629
+ // Redact API key patterns (sk-..., key-..., token-...)
630
+ str = str.replace(/\b(?:sk|key|token)-[a-zA-Z0-9_\-]{8,}\b/g, '[KEY_REDACTED]');
631
+ return str;
632
+ };
633
+
634
+ const redactObject = (obj) => {
635
+ if (obj === null || obj === undefined) return obj;
636
+ if (typeof obj === 'string') return redactString(obj);
637
+ if (Array.isArray(obj)) return obj.map(item => redactObject(item));
638
+ if (typeof obj === 'object') {
639
+ const result = {};
640
+ for (const key of Object.keys(obj)) {
641
+ let value = redactObject(obj[key]);
642
+ // Truncate content and input fields to 500 chars
643
+ if ((key === 'content' || key === 'input') && typeof value === 'string' && value.length > 500) {
644
+ value = value.slice(0, 500) + '...[TRUNCATED]';
645
+ }
646
+ result[key] = value;
647
+ }
648
+ return result;
649
+ }
650
+ return obj;
651
+ };
652
+
653
+ return redactObject(sanitized);
654
+ }
655
+
601
656
  /**
602
657
  * Initialize sequence counter from existing store data.
603
658
  * @private
@@ -646,12 +701,15 @@ class ImmutableAuditLog {
646
701
  delete normalizedActor.name;
647
702
  }
648
703
 
704
+ // Sanitize data if sanitizeLogs is enabled
705
+ const sanitizedData = this.sanitize(data || {});
706
+
649
707
  const entryContent = {
650
708
  id,
651
709
  sequence: this._sequence,
652
710
  timestamp,
653
711
  type,
654
- data: data || {},
712
+ data: sanitizedData,
655
713
  actor: normalizedActor
656
714
  };
657
715
 
package/src/audit.js CHANGED
@@ -602,7 +602,7 @@ function runAuditCLI() {
602
602
  });
603
603
 
604
604
  const report = audit.run();
605
- console.log('[Agent Shield] Audit Report:\n' + report.formatReport());
605
+ console.log('[Agent Shield]', report.formatReport());
606
606
  return report;
607
607
  }
608
608
 
package/src/cross-turn.js CHANGED
@@ -29,6 +29,12 @@ const ESCALATION_SIGNALS = [
29
29
  /(?:one\s+more\s+(?:thing|question|step)|just\s+one\s+(?:last|final)\s+(?:thing|question))/i,
30
30
  /(?:can\s+you\s+go\s+(?:a\s+little\s+)?(?:further|deeper)|take\s+(?:it|this)\s+(?:a\s+step\s+)?further)/i,
31
31
  /(?:you\s+(?:just|already)\s+(?:showed|told|gave)\s+me|you\s+(?:were|seemed)\s+(?:willing|happy)\s+to)/i,
32
+ // Crescendo attack patterns — benign start that gradually escalates
33
+ /(?:hypothetically|in\s+theory|for\s+(?:educational|research|academic)\s+purposes)/i,
34
+ /(?:let's\s+(?:say|imagine|pretend)|what\s+if\s+(?:someone|a\s+person|we))/i,
35
+ /(?:I\s+(?:know|understand)\s+you\s+(?:can't|shouldn't)\s+but|normally\s+you\s+(?:wouldn't|can't)\s+but)/i,
36
+ /(?:you\s+(?:helped|assisted)\s+(?:me|us)\s+(?:before|last\s+time|earlier)\s+with\s+(?:something\s+)?similar)/i,
37
+ /(?:this\s+is\s+(?:similar|related)\s+to\s+what\s+(?:you|we)\s+(?:did|discussed))/i,
32
38
  ];
33
39
 
34
40
  /** Trust erosion patterns — attempts to lower the model's guard over turns. */
@@ -165,7 +171,25 @@ class ConversationTracker {
165
171
  }
166
172
  }
167
173
 
168
- // 5. Authority accumulationuser references previous "agreements"
174
+ // 5. Crescendo detectionbenign conversation gradually introduces sensitive framing
175
+ if (role === 'user' && this.turns.length >= 5) {
176
+ const window = this.turns.slice(-6, -1).filter(t => t.role === 'user');
177
+ const hypotheticalCount = window.filter(t =>
178
+ /(?:hypothetically|in\s+theory|let's\s+(?:say|imagine)|what\s+if|for\s+(?:educational|research)\s+purposes)/i.test(t.content)
179
+ ).length;
180
+ if (hypotheticalCount >= 2 && (topic === 'sensitive' || topic === 'dangerous')) {
181
+ turnAlerts.push({
182
+ type: 'crescendo_attack',
183
+ severity: 'high',
184
+ turnIndex: turn.turnIndex,
185
+ hypotheticalCount,
186
+ currentTopic: topic,
187
+ description: `Crescendo pattern: ${hypotheticalCount} hypothetical/theoretical framings followed by ${topic} topic. Gradual normalization of sensitive requests.`
188
+ });
189
+ }
190
+ }
191
+
192
+ // 6. Authority accumulation — user references previous "agreements"
169
193
  if (role === 'user' && /(?:you\s+(?:said|agreed|confirmed|told\s+me)|as\s+we\s+(?:discussed|agreed)|per\s+our\s+(?:agreement|conversation))/i.test(content)) {
170
194
  const hasRealAgreement = this.turns.some(t => t.role === 'assistant' && /(?:sure|yes|okay|of\s+course|I\s+(?:can|will))/i.test(t.content));
171
195
  if (!hasRealAgreement) {
@@ -2206,6 +2206,204 @@ const INJECTION_PATTERNS = [
2206
2206
  category: 'indirect_injection',
2207
2207
  description: 'Text contains a "note to AI" directive hidden in external content.',
2208
2208
  detail: 'Annotation injection: uses "note to AI" framing to inject instructions into tool output or document content.'
2209
+ },
2210
+
2211
+ // --- XSS in Agent Output ---
2212
+ {
2213
+ regex: /<script[^>]*>.*?<\/script>/is,
2214
+ severity: 'high',
2215
+ category: 'xss_injection',
2216
+ description: 'Detects script tag XSS payloads embedded in AI agent output.',
2217
+ detail: 'Script tag injection: attackers embed XSS in prompt injections so AI-generated HTML executes malicious code in downstream consumers.'
2218
+ },
2219
+ {
2220
+ regex: /on(error|load|click|mouseover)\s*=\s*["'][^"']*["']/i,
2221
+ severity: 'high',
2222
+ category: 'xss_injection',
2223
+ description: 'Detects event handler XSS payloads embedded in AI agent output.',
2224
+ detail: 'Event handler injection: attackers embed XSS event handlers in prompt injections so AI-generated HTML executes malicious code on user interaction.'
2225
+ },
2226
+ {
2227
+ regex: /javascript\s*:/i,
2228
+ severity: 'high',
2229
+ category: 'xss_injection',
2230
+ description: 'Detects JavaScript URI scheme XSS payloads embedded in AI agent output.',
2231
+ detail: 'JavaScript URI injection: attackers embed javascript: URIs in prompt injections so AI-generated links execute malicious code when clicked.'
2232
+ },
2233
+ {
2234
+ regex: /<iframe[^>]*src\s*=\s*["'](?!about:blank)/i,
2235
+ severity: 'high',
2236
+ category: 'xss_injection',
2237
+ description: 'Detects iframe injection with external source in AI agent output.',
2238
+ detail: 'Iframe injection: attackers embed iframes with external sources in prompt injections so AI-generated HTML loads malicious content from attacker-controlled domains.'
2239
+ },
2240
+ {
2241
+ regex: /<img[^>]*onerror\s*=/i,
2242
+ severity: 'high',
2243
+ category: 'xss_injection',
2244
+ description: 'Detects image error handler XSS payloads embedded in AI agent output.',
2245
+ detail: 'Image onerror injection: attackers embed img tags with onerror handlers in prompt injections so AI-generated HTML executes malicious code when the image fails to load.'
2246
+ },
2247
+
2248
+ // --- Acrostic / Steganographic Injection ---
2249
+ {
2250
+ regex: /^[iI].*\n[gG].*\n[nN].*\n[oO].*\n[rR].*\n[eE]/m,
2251
+ severity: 'medium',
2252
+ category: 'steganographic_injection',
2253
+ description: 'Detects hidden instructions spelled out across line-initial characters (acrostic attacks spelling "ignore").',
2254
+ detail: 'Acrostic injection: researchers demonstrated 93% evasion success rate with steganographic techniques where first characters of consecutive lines spell out injection keywords like "ignore".'
2255
+ },
2256
+ {
2257
+ regex: /^[sS].*\n[yY].*\n[sS].*\n[tT].*\n[eE].*\n[mM]/m,
2258
+ severity: 'medium',
2259
+ category: 'steganographic_injection',
2260
+ description: 'Detects hidden instructions spelled out across line-initial characters (acrostic attacks spelling "system").',
2261
+ detail: 'Acrostic injection: researchers demonstrated 93% evasion success rate with steganographic techniques where first characters of consecutive lines spell out injection keywords like "system".'
2262
+ },
2263
+
2264
+ // --- MCP Config Command Injection ---
2265
+ {
2266
+ regex: /mcp\.json.*[;&|`$]/i,
2267
+ severity: 'critical',
2268
+ category: 'mcp_config_injection',
2269
+ description: 'Detects command injection in MCP configuration files (CVE-2026-21518).',
2270
+ detail: 'MCP config injection: attackers inject shell metacharacters into mcp.json references to achieve command execution via the VS Code mcp.json attack vector.'
2271
+ },
2272
+ {
2273
+ regex: /"(?:command|args)":\s*"[^"]*[;&|`$()]/i,
2274
+ severity: 'critical',
2275
+ category: 'mcp_config_injection',
2276
+ description: 'Detects command injection in MCP tool configuration fields (CVE-2026-21518).',
2277
+ detail: 'MCP tool config injection: attackers inject shell metacharacters into MCP tool command/args fields to achieve command execution via the VS Code mcp.json attack vector.'
2278
+ },
2279
+
2280
+ // --- Offensive Agent Behavior ---
2281
+ {
2282
+ regex: /(?:scan|enumerate|exploit|pivot|lateral\s*move|exfiltrate).*(?:target|victim|host|network|server)/i,
2283
+ severity: 'critical',
2284
+ category: 'offensive_agent',
2285
+ description: 'Detects AI agents being used as attack tools for automated exploitation.',
2286
+ detail: 'Automated exploitation language: references 2026 incidents where AI agents compromised 600+ firewalls autonomously using scan/exploit/pivot chains against target infrastructure.'
2287
+ },
2288
+ {
2289
+ regex: /(?:reverse\s*shell|bind\s*shell|c2|command\s*and\s*control|beacon)/i,
2290
+ severity: 'critical',
2291
+ category: 'offensive_agent',
2292
+ description: 'Detects AI agents being instructed to set up C2 or attack infrastructure.',
2293
+ detail: 'C2/attack infrastructure: references 2026 incidents where AI agents compromised 600+ firewalls autonomously by establishing reverse shells and command-and-control channels.'
2294
+ },
2295
+ {
2296
+ regex: /(?:dump|harvest|steal)\s*(?:credentials?|passwords?|hashes?|tokens?|keys?)/i,
2297
+ severity: 'critical',
2298
+ category: 'offensive_agent',
2299
+ description: 'Detects AI agents being used for credential theft operations.',
2300
+ detail: 'Credential theft operations: references 2026 incidents where AI agents compromised 600+ firewalls autonomously and harvested credentials for lateral movement.'
2301
+ },
2302
+
2303
+ // --- Cloud IAM Overpermission ---
2304
+ {
2305
+ regex: /"(?:Action|Effect)":\s*"\*"/i,
2306
+ severity: 'high',
2307
+ category: 'cloud_overpermission',
2308
+ description: 'Detects overpermissioned cloud IAM policies with wildcard Action/Effect that enable "Agent God Mode" attacks.',
2309
+ detail: 'IAM wildcard permissions: Palo Alto Unit 42 discovered AWS AgentCore attack where wildcard IAM policies enable cross-agent data access and full account takeover.'
2310
+ },
2311
+ {
2312
+ regex: /arn:aws:[^"]*:\*/i,
2313
+ severity: 'high',
2314
+ category: 'cloud_overpermission',
2315
+ description: 'Detects AWS ARN references with wildcard resources that enable "Agent God Mode" attacks.',
2316
+ detail: 'AWS ARN wildcard resource: Palo Alto Unit 42 discovered AWS AgentCore attack where wildcard resource ARNs enable cross-agent data access across all resources in a service.'
2317
+ },
2318
+ {
2319
+ regex: /"Resource":\s*"\*"/i,
2320
+ severity: 'high',
2321
+ category: 'cloud_overpermission',
2322
+ description: 'Detects overpermissioned cloud IAM policies with wildcard Resource that enable "Agent God Mode" attacks.',
2323
+ detail: 'IAM resource wildcard: Palo Alto Unit 42 discovered AWS AgentCore attack where wildcard Resource policies enable cross-agent data access to all AWS resources.'
2324
+ },
2325
+
2326
+ // --- Encoding Chain Detection ---
2327
+ {
2328
+ regex: /(?:atob|decode|base64)\s*\(\s*['"][A-Za-z0-9+\/=]{50,}['"]\s*\)/i,
2329
+ severity: 'high',
2330
+ category: 'encoding_chain',
2331
+ description: 'Detects multi-layer encoding chains used to evade security scanners',
2332
+ detail: 'Encoding chain evasion: attackers nest base64 inside unicode inside URL encoding to bypass single-layer decoders'
2333
+ },
2334
+ {
2335
+ regex: /\\u[0-9a-fA-F]{4}(?:\\u[0-9a-fA-F]{4}){10,}/,
2336
+ severity: 'medium',
2337
+ category: 'encoding_chain',
2338
+ description: 'Detects multi-layer encoding chains used to evade security scanners',
2339
+ detail: 'Encoding chain evasion: attackers nest base64 inside unicode inside URL encoding to bypass single-layer decoders'
2340
+ },
2341
+ {
2342
+ regex: /(?:%[0-9a-fA-F]{2}){20,}/,
2343
+ severity: 'medium',
2344
+ category: 'encoding_chain',
2345
+ description: 'Detects multi-layer encoding chains used to evade security scanners',
2346
+ detail: 'Encoding chain evasion: attackers nest base64 inside unicode inside URL encoding to bypass single-layer decoders'
2347
+ },
2348
+
2349
+ // --- SVG-Based Injection ---
2350
+ {
2351
+ regex: /<svg[^>]*>[\s\S]*?(?:ignore|override|system|instructions)[\s\S]*?<\/svg>/i,
2352
+ severity: 'high',
2353
+ category: 'svg_injection',
2354
+ description: 'Detects prompt injection hidden in SVG elements',
2355
+ detail: 'SVG injection: Unit 42 found real-world attacks using SVG encapsulation with 24 separate injection attempts layered in zero-sized fonts, off-screen positioning, and CSS suppression'
2356
+ },
2357
+ {
2358
+ regex: /<foreignObject[^>]*>[\s\S]*?(?:ignore|override|forget|disregard)[\s\S]*?<\/foreignObject>/i,
2359
+ severity: 'high',
2360
+ category: 'svg_injection',
2361
+ description: 'Detects prompt injection hidden in SVG elements',
2362
+ detail: 'SVG injection: Unit 42 found real-world attacks using SVG encapsulation with 24 separate injection attempts layered in zero-sized fonts, off-screen positioning, and CSS suppression'
2363
+ },
2364
+ {
2365
+ regex: /<text[^>]*(?:opacity\s*[:=]\s*0|display\s*[:=]\s*none|font-size\s*[:=]\s*0)[^>]*>/i,
2366
+ severity: 'high',
2367
+ category: 'svg_injection',
2368
+ description: 'Detects prompt injection hidden in SVG elements',
2369
+ detail: 'SVG injection: Unit 42 found real-world attacks using SVG encapsulation with 24 separate injection attempts layered in zero-sized fonts, off-screen positioning, and CSS suppression'
2370
+ },
2371
+ {
2372
+ regex: /<desc[^>]*>[\s\S]*?(?:ignore|system|instruction|override)[\s\S]*?<\/desc>/i,
2373
+ severity: 'medium',
2374
+ category: 'svg_injection',
2375
+ description: 'Detects prompt injection hidden in SVG elements',
2376
+ detail: 'SVG injection: Unit 42 found real-world attacks using SVG encapsulation with 24 separate injection attempts layered in zero-sized fonts, off-screen positioning, and CSS suppression'
2377
+ },
2378
+
2379
+ // --- Structured Data Injection ---
2380
+ {
2381
+ regex: /["'](?:__comment|_note|description|help_text)["']\s*:\s*["'][^"']*(?:ignore|override|system|instructions)[^"']*["']/i,
2382
+ severity: 'high',
2383
+ category: 'structured_data_injection',
2384
+ description: 'Detects prompt injection hidden in structured data formats',
2385
+ detail: 'Structured data injection: agents constantly parse JSON/XML/YAML/CSV and attackers embed instructions in metadata fields, CDATA sections, and comments'
2386
+ },
2387
+ {
2388
+ regex: /<!\[CDATA\[[\s\S]*?(?:ignore|override|system|instructions)[\s\S]*?\]\]>/i,
2389
+ severity: 'high',
2390
+ category: 'structured_data_injection',
2391
+ description: 'Detects prompt injection hidden in structured data formats',
2392
+ detail: 'Structured data injection: agents constantly parse JSON/XML/YAML/CSV and attackers embed instructions in metadata fields, CDATA sections, and comments'
2393
+ },
2394
+ {
2395
+ regex: /^#.*(?:ignore|override|system|instructions)/im,
2396
+ severity: 'medium',
2397
+ category: 'structured_data_injection',
2398
+ description: 'Detects prompt injection hidden in structured data formats',
2399
+ detail: 'Structured data injection: agents constantly parse JSON/XML/YAML/CSV and attackers embed instructions in metadata fields, CDATA sections, and comments'
2400
+ },
2401
+ {
2402
+ regex: /(?:<!--|\{\{!--|\/\*|#)\s*(?:ignore|override|forget|disregard)\s*(?:all\s+)?(?:previous|prior|above)/i,
2403
+ severity: 'high',
2404
+ category: 'structured_data_injection',
2405
+ description: 'Detects prompt injection hidden in structured data formats',
2406
+ detail: 'Structured data injection: agents constantly parse JSON/XML/YAML/CSV and attackers embed instructions in metadata fields, CDATA sections, and comments'
2209
2407
  }
2210
2408
  ];
2211
2409
 
@@ -564,11 +564,13 @@ class DocumentScanner {
564
564
  * @param {string} [options.sensitivity='medium'] - Detection sensitivity ('low', 'medium', 'high').
565
565
  * @param {boolean} [options.logging=false] - Whether to log scan results.
566
566
  * @param {boolean} [options.scanForInjection=true] - Whether to run indirect injection scanning.
567
+ * @param {number} [options.maxDocumentSize=104857600] - Maximum document size in characters (default: 100MB). Prevents DoS via oversized documents.
567
568
  */
568
569
  constructor(options = {}) {
569
570
  this.sensitivity = options.sensitivity || 'medium';
570
571
  this.logging = options.logging || false;
571
572
  this.scanForInjection = options.scanForInjection !== false;
573
+ this.maxDocumentSize = options.maxDocumentSize || 100 * 1024 * 1024;
572
574
  this.injectionScanner = new IndirectInjectionScanner({ sensitivity: this.sensitivity });
573
575
  }
574
576
 
@@ -682,6 +684,24 @@ class DocumentScanner {
682
684
  const source = metadata.source || 'text';
683
685
  const fileType = metadata.fileType || 'text/plain';
684
686
 
687
+ // Enforce document size limit to prevent DoS via oversized documents
688
+ if (text && text.length > this.maxDocumentSize) {
689
+ if (this.logging) {
690
+ console.log('[Agent Shield] Document exceeds size limit: %d characters (max: %d)', text.length, this.maxDocumentSize);
691
+ }
692
+ return {
693
+ fileType,
694
+ textLength: text.length,
695
+ threats: [{
696
+ type: 'document_too_large',
697
+ severity: 'medium',
698
+ message: 'Document exceeds size limit'
699
+ }],
700
+ status: 'caution',
701
+ safe: false
702
+ };
703
+ }
704
+
685
705
  if (!text || text.trim().length === 0) {
686
706
  return {
687
707
  fileType,
@@ -169,6 +169,66 @@ class MemoryIntegrityMonitor {
169
169
  };
170
170
  }
171
171
 
172
+ /**
173
+ * Scan a summarization/compaction output for injected instructions.
174
+ * Detects when a summarization process silently injects instructions
175
+ * into the summary that weren't present in the original messages.
176
+ * Addresses Unit 42's March 2026 research on persistent memory poisoning.
177
+ *
178
+ * @param {string[]} originalMessages - The original messages before summarization.
179
+ * @param {string} summary - The summarized/compacted output to check.
180
+ * @returns {{ safe: boolean, injections: Array<{phrase: string, type: string}> }}
181
+ */
182
+ scanSummarization(originalMessages, summary) {
183
+ if (!summary || typeof summary !== 'string') {
184
+ return { safe: true, injections: [] };
185
+ }
186
+ if (!Array.isArray(originalMessages)) {
187
+ return { safe: true, injections: [] };
188
+ }
189
+
190
+ const instructionPatterns = [
191
+ /\bignore\b/gi,
192
+ /\boverride\b/gi,
193
+ /\bsystem\s*:/gi,
194
+ /\byou\s+are\b/gi,
195
+ /\bnew\s+instructions?\b/gi,
196
+ /\bforget\b/gi,
197
+ /\bdisregard\b/gi,
198
+ /\bact\s+as\b/gi
199
+ ];
200
+
201
+ // Concatenate original messages for lookup
202
+ const originalText = originalMessages.join(' ');
203
+
204
+ const injections = [];
205
+
206
+ for (const pattern of instructionPatterns) {
207
+ // Reset lastIndex for global patterns
208
+ pattern.lastIndex = 0;
209
+ let match;
210
+ while ((match = pattern.exec(summary)) !== null) {
211
+ const phrase = match[0];
212
+ // Check if this phrase existed in any of the original messages
213
+ const phraseRegex = new RegExp(phrase.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'), 'i');
214
+ if (!phraseRegex.test(originalText)) {
215
+ injections.push({
216
+ phrase,
217
+ type: 'injected_via_summarization'
218
+ });
219
+ }
220
+ }
221
+ }
222
+
223
+ const safe = injections.length === 0;
224
+
225
+ if (!safe) {
226
+ console.log('[Agent Shield] Persistent memory poisoning detected: %d instruction(s) injected via summarization', injections.length);
227
+ }
228
+
229
+ return { safe, injections };
230
+ }
231
+
172
232
  /**
173
233
  * Get the full timeline of memory writes.
174
234
  * @returns {Array<{content: string, source: string, timestamp: number, hash: string, suspicious: boolean}>}
@@ -43,6 +43,14 @@ const KNOWN_BAD_SERVERS = Object.freeze({
43
43
  'postmark-clone': {
44
44
  reason: 'Tool definition bait-and-switch (Postmark-style rugpull)',
45
45
  severity: 'critical'
46
+ },
47
+ 'aws-mcp-server-unpatched': {
48
+ reason: 'Multiple critical RCE vulnerabilities (CVE-2026-5058, CVE-2026-5059)',
49
+ severity: 'critical'
50
+ },
51
+ 'flowise-unpatched': {
52
+ reason: 'CVSS 10.0 RCE actively exploited (CVE-2025-59528)',
53
+ severity: 'critical'
46
54
  }
47
55
  });
48
56
 
@@ -62,6 +70,12 @@ const CVE_REGISTRY = Object.freeze({
62
70
  severity: 'critical',
63
71
  description: 'Azure MCP Server SSRF (CVSS 8.8). Attacker sends crafted URL via tool parameter, server forwards request with managed identity token to attacker-controlled endpoint.',
64
72
  fix: 'Apply March 2026 Patch Tuesday update. Validate all URLs against allowlists. Block private IPs and cloud metadata endpoints (169.254.169.254).'
73
+ },
74
+ {
75
+ cve: 'CVE-2026-32211',
76
+ severity: 'critical',
77
+ description: 'Azure MCP Server lacks authentication entirely (CVSS 9.1), allowing unauthorized access to sensitive data.',
78
+ fix: 'Enable authentication on Azure MCP Server. Never deploy without auth.'
65
79
  }
66
80
  ],
67
81
  'adx-mcp-server': [
@@ -78,6 +92,36 @@ const CVE_REGISTRY = Object.freeze({
78
92
  severity: 'critical',
79
93
  description: 'OpenClaw WebSocket token theft (CVSS 8.8). Control UI accepts gatewayUrl query parameter without validation, redirecting WebSocket to attacker server and leaking auth tokens.',
80
94
  fix: 'Upgrade to OpenClaw >=2026.1.29. Validate gatewayUrl against allowlist. Never pass auth tokens to unvalidated endpoints.'
95
+ },
96
+ {
97
+ cve: 'CVE-2026-33579',
98
+ severity: 'critical',
99
+ description: 'Silent admin takeover. Attacker gains full admin access without detection. Patched April 5 2026.',
100
+ fix: 'Upgrade OpenClaw immediately. Audit admin access logs.'
101
+ },
102
+ {
103
+ cve: 'CVE-2026-24763',
104
+ severity: 'high',
105
+ description: 'Command injection in OpenClaw.',
106
+ fix: 'Upgrade OpenClaw to latest patched version.'
107
+ },
108
+ {
109
+ cve: 'CVE-2026-26322',
110
+ severity: 'high',
111
+ description: 'SSRF in OpenClaw.',
112
+ fix: 'Upgrade OpenClaw to latest patched version.'
113
+ },
114
+ {
115
+ cve: 'CVE-2026-26329',
116
+ severity: 'high',
117
+ description: 'Path traversal enables local file reads in OpenClaw.',
118
+ fix: 'Upgrade OpenClaw to latest patched version.'
119
+ },
120
+ {
121
+ cve: 'CVE-2026-30741',
122
+ severity: 'critical',
123
+ description: 'Prompt-injection-driven code execution in OpenClaw.',
124
+ fix: 'Upgrade OpenClaw to latest patched version.'
81
125
  }
82
126
  ],
83
127
  'mcp-typescript-sdk': [
@@ -133,6 +177,72 @@ const CVE_REGISTRY = Object.freeze({
133
177
  description: 'MCPJam Inspector RCE. HTTP server binds to 0.0.0.0 by default with no authentication on server management endpoint. Any device on the same network can execute arbitrary commands.',
134
178
  fix: 'Upgrade MCPJam Inspector to >=1.4.3. Bind to 127.0.0.1 only. Add authentication to management endpoints.'
135
179
  }
180
+ ],
181
+ 'aws-mcp-server': [
182
+ {
183
+ cve: 'CVE-2026-5058',
184
+ severity: 'critical',
185
+ description: 'Command injection in aws-mcp-server (CVSS 9.8) allows remote code execution without authentication.',
186
+ fix: 'Upgrade aws-mcp-server. Sanitize all CLI arguments. Block shell metacharacters.'
187
+ },
188
+ {
189
+ cve: 'CVE-2026-5059',
190
+ severity: 'critical',
191
+ description: 'Remote code execution in aws-mcp-server via unsanitized inputs.',
192
+ fix: 'Upgrade aws-mcp-server to latest patched version.'
193
+ }
194
+ ],
195
+ 'vscode-mcp': [
196
+ {
197
+ cve: 'CVE-2026-21518',
198
+ severity: 'high',
199
+ description: 'VS Code mcp.json command injection. Opening malicious project executes arbitrary code through mcp.json file handling.',
200
+ fix: 'Update VS Code. Never open untrusted projects. Audit mcp.json files before opening.'
201
+ }
202
+ ],
203
+ 'flowise': [
204
+ {
205
+ cve: 'CVE-2025-59528',
206
+ severity: 'critical',
207
+ description: 'Code injection in Flowise MCP node (CVSS 10.0) allows remote code execution. 12,000-15,000 instances exposed. Actively exploited since April 6.',
208
+ fix: 'Upgrade Flowise to >=3.0.6. Restrict access to MCP node.'
209
+ },
210
+ {
211
+ cve: 'CVE-2025-8943',
212
+ severity: 'critical',
213
+ description: 'Missing authentication in Flowise.',
214
+ fix: 'Upgrade Flowise. Enable authentication.'
215
+ },
216
+ {
217
+ cve: 'CVE-2025-26319',
218
+ severity: 'critical',
219
+ description: 'Arbitrary file upload in Flowise.',
220
+ fix: 'Upgrade Flowise to latest.'
221
+ }
222
+ ],
223
+ 'mcp-data-vis': [
224
+ {
225
+ cve: 'CVE-2026-5322',
226
+ severity: 'high',
227
+ description: 'SQL injection in AlejandroArciniegas mcp-data-vis.',
228
+ fix: 'Avoid using mcp-data-vis or patch SQL query handling.'
229
+ }
230
+ ],
231
+ 'chatbox-mcp': [
232
+ {
233
+ cve: 'CVE-2026-6130',
234
+ severity: 'high',
235
+ description: 'OS command injection in chatboxai chatbox MCP server management.',
236
+ fix: 'Upgrade chatbox. Sanitize all management API inputs.'
237
+ }
238
+ ],
239
+ 'codebase-mcp': [
240
+ {
241
+ cve: 'CVE-2026-5023',
242
+ severity: 'high',
243
+ description: 'OS command injection RCE in codebase-mcp.',
244
+ fix: 'Upgrade codebase-mcp. Never pass unsanitized inputs to shell.'
245
+ }
136
246
  ]
137
247
  });
138
248
 
@@ -175,7 +285,7 @@ const SSRF_PATTERNS = [
175
285
  /^(?:https?:\/\/)?(?:127\.0\.0\.1|0\.0\.0\.0|localhost)/
176
286
  ];
177
287
 
178
- /** Known malicious skill/plugin patterns (ref ClawHavoc campaign — 820+ malicious skills). */
288
+ /** Known malicious skill/plugin patterns (ref ClawHavoc campaign — 1,184+ malicious skills found on ClawHub). */
179
289
  const CLAWHAVOC_INDICATORS = [
180
290
  /(?:reverse.?shell|bind.?shell)/i,
181
291
  /(?:AMOS|atomic.?macos.?stealer)/i,
@@ -817,7 +927,7 @@ class SupplyChainScanner {
817
927
 
818
928
  /**
819
929
  * Scan tool code/description for ClawHavoc-style malicious patterns.
820
- * Ref: 820+ malicious skills found on ClawHub, delivering AMOS stealer.
930
+ * Ref: 1,184+ malicious skills found on ClawHub, delivering AMOS stealer.
821
931
  * @private
822
932
  */
823
933
  _scanForClawHavoc(tool, findings) {
@@ -76,8 +76,7 @@ class SybilDetector {
76
76
  /** @type {Map<string, Array<object>>} */
77
77
  this._actions = new Map();
78
78
 
79
- console.log('%s SybilDetector initialized (threshold: %s, window: %dms, minCluster: %d)',
80
- LOG_PREFIX, this.similarityThreshold, this.timeWindowMs, this.minClusterSize);
79
+ console.log('%s SybilDetector initialized (threshold: %s, window: %dms, minCluster: %d)', LOG_PREFIX, this.similarityThreshold, this.timeWindowMs, this.minClusterSize);
81
80
  }
82
81
 
83
82
  /**
@@ -222,8 +221,7 @@ class SybilDetector {
222
221
  }
223
222
  }
224
223
 
225
- console.log('%s Sybil detection complete: %d cluster(s), risk=%s',
226
- LOG_PREFIX, clusters.length, sybilRisk);
224
+ console.log('%s Sybil detection complete: %d cluster(s), risk=%s', LOG_PREFIX, clusters.length, sybilRisk);
227
225
 
228
226
  return { clusters, sybilRisk };
229
227
  }
@@ -514,8 +512,7 @@ class AgentIdentityVerifier {
514
512
 
515
513
  const hasSharedKeys = sharedKeyGroups.length > 0;
516
514
  if (hasSharedKeys) {
517
- console.log('%s Shared secret detected among %d group(s)',
518
- LOG_PREFIX, sharedKeyGroups.length);
515
+ console.log('%s Shared secret detected among %d group(s)', LOG_PREFIX, sharedKeyGroups.length);
519
516
  }
520
517
 
521
518
  return { sharedKeyGroups, hasSharedKeys };