npm - guard-scanner - Versions diffs - 2.0.0 → 2.1.0 - Mend

guard-scanner 2.0.0 → 2.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (5)

package/README.md +82 -43
package/hooks/guard-scanner/plugin.ts +42 -0
package/package.json +2 -2
package/src/patterns.js +22 -0
package/src/scanner.js +8 -2

package/README.md CHANGED

@@ -2,16 +2,16 @@
   <h1 align="center">🛡️ guard-scanner</h1>
   <p align="center">
     <strong>Static security scanner for AI agent skills</strong><br>
-    Detect prompt injection, credential theft, exfiltration, identity hijacking, and 16 more threat categories.<br>
-    <sub>🆕 Plugin Hook v2.0 — <strong>actual blocking</strong> via <code>block</code>/<code>blockReason</code> API</sub>
+    Detect prompt injection, credential theft, exfiltration, PII exposure, Shadow AI, and 17 more threat categories.<br>
+    <sub>🆕 v2.1 — PII Exposure Detection + Shadow AI + Plugin Hook blocking via <code>block</code>/<code>blockReason</code> API</sub>
   </p>
   <p align="center">
     <a href="LICENSE"><img src="https://img.shields.io/badge/license-MIT-blue.svg" alt="MIT License"></a>
     <img src="https://img.shields.io/badge/node-%3E%3D18.0.0-brightgreen" alt="Node.js 18+">
     <img src="https://img.shields.io/badge/dependencies-0-success" alt="Zero Dependencies">
-    <img src="https://img.shields.io/badge/tests-56%2F56-brightgreen" alt="Tests Passing">
-    <img src="https://img.shields.io/badge/patterns-186-orange" alt="186 Patterns">
-    <img src="https://img.shields.io/badge/categories-20-blueviolet" alt="20 Categories">
+    <img src="https://img.shields.io/badge/tests-99%2F99-brightgreen" alt="Tests Passing">
+    <img src="https://img.shields.io/badge/patterns-129-orange" alt="129 Patterns">
+    <img src="https://img.shields.io/badge/categories-21-blueviolet" alt="21 Categories">
   </p>
 </p>
@@ -40,8 +40,8 @@ The AI agent skill ecosystem has the same supply-chain security problem that npm
 | Feature | Description |
 |---|---|
-| **20 Threat Categories** | Snyk ToxicSkills + OWASP MCP Top 10 + Identity Hijacking + Sandbox/Complexity/Config |
-| **186 Detection Patterns** | Regex-based static analysis covering code, docs, and data files |
+| **21 Threat Categories** | Snyk ToxicSkills + OWASP MCP Top 10 + Identity Hijacking + Sandbox/Complexity/Config + PII |
+| **129 Detection Patterns** | Regex-based static analysis covering code, docs, and data files |
 | **IoC Database** | Known malicious IPs, domains, URLs, usernames, and typosquat names |
 | **Data Flow Analysis** | Lightweight JS analysis: secret reads → network calls → exec chains |
 | **Cross-File Analysis** | Phantom references, base64 fragment assembly, multi-file exfil detection |
@@ -84,7 +84,7 @@ npx guard-scanner ~/.openclaw/workspace/skills --self-exclude --verbose
 cp hooks/guard-scanner/plugin.ts ~/.openclaw/plugins/guard-scanner-runtime.ts
 ```
-> **🆕 v2.0 Plugin Hook** — Uses OpenClaw's native `block`/`blockReason` API to actually prevent dangerous tool calls. Supports 3 modes: `monitor` (log only), `enforce` (block CRITICAL), `strict` (block HIGH + CRITICAL).
+> **🆕 v2.1** — PII Exposure Detection (OWASP LLM02/06) + Shadow AI detection + Plugin Hook `block`/`blockReason` API. 3 modes: `monitor`, `enforce`, `strict`.
 ### Installation (Optional)
@@ -99,7 +99,7 @@ npx guard-scanner ./skills/
 ### As an OpenClaw Skill
 ```bash
-openclaw skill install guard-scanner
+clawhub install guard-scanner
 guard-scanner ~/.openclaw/workspace/skills/ --self-exclude --verbose
 ```
@@ -109,7 +109,7 @@ guard-scanner ~/.openclaw/workspace/skills/ --self-exclude --verbose
 ## Threat Categories
-guard-scanner covers **20 threat categories** derived from four sources:
+guard-scanner covers **21 threat categories** derived from four sources:
 | # | Category | Based On | Severity | What It Detects |
 |---|----------|----------|----------|----------------|
@@ -133,8 +133,9 @@ guard-scanner covers **20 threat categories** derived from four sources:
 | 18 | **Sandbox Validation** | v1.1 | HIGH | Dangerous binary requirements in SKILL.md, overly broad file scope, sensitive env vars, exec/network declarations |
 | 19 | **Code Complexity** | v1.1 | MEDIUM | Excessive file length (>1000 lines), deep nesting (>5 levels), high eval/exec density |
 | 20 | **Config Impact** | v1.1 | CRITICAL | `openclaw.json` writes, exec approval bypass, exec host gateway, internal hooks modification, network wildcard |
+| 21 | **PII Exposure** | v2.1 | CRITICAL | Hardcoded CC/SSN/phone/email (context-aware), PII logging/network send/plaintext store, Shadow AI (OpenAI/Anthropic/generic LLM), PII collection instructions (address/DOB/government ID) |
-> **Categories 17–20** are unique to guard-scanner. Category 17 (Identity Hijacking) was developed from a real attack. Categories 18–20 were added in v1.1.0 based on community feedback.
+> **Categories 17–21** are unique to guard-scanner. Category 17 (Identity Hijacking) was developed from a real attack. Categories 18–20 added in v1.1.0. Category 21 (PII Exposure) added in v2.1.0 covering OWASP LLM02/LLM06.
 ---
@@ -143,7 +144,7 @@ guard-scanner covers **20 threat categories** derived from four sources:
 ### Terminal (Default)
 ```
-🛡️  guard-scanner v1.1.1
+🛡️  guard-scanner v2.1.0
 ══════════════════════════════════════════════════════
 📂 Scanning: ./skills/
 📦 Skills found: 22
@@ -228,6 +229,9 @@ Certain combinations multiply the base score:
 | Config impact | **×2** | OpenClaw configuration tampering |
 | Config impact + Sandbox violation | **min 70** | Combined config + capability abuse |
 | Complexity + Malicious code/Obfuscation | **×1.5** | Complex code hiding threats |
+| PII exposure + Exfiltration | **×3** | PII being sent to external servers |
+| PII exposure + Shadow AI | **×2.5** | PII leak through unauthorized LLM |
+| PII exposure + Credential handling | **×2** | Combined PII + credential risk |
 | Known IoC (IP/URL/typosquat) | **= 100** | Confirmed malicious |
 ### Verdict Thresholds
@@ -400,8 +404,8 @@ Options:
 ```
 guard-scanner/
 ├── src/
-│   ├── scanner.js      # GuardScanner class — core scan engine (20 checks)
-│   ├── patterns.js     # 186 threat detection patterns (Cat 1–20)
+│   ├── scanner.js      # GuardScanner class — core scan engine (21 checks)
+│   ├── patterns.js     # 129 threat detection patterns (Cat 1–21)
 │   ├── ioc-db.js       # Indicators of Compromise database
 │   └── cli.js          # CLI entry point and argument parser
 ├── hooks/
@@ -410,9 +414,9 @@ guard-scanner/
 │       ├── handler.ts  # Legacy Internal Hook — warn only (deprecated)
 │       └── HOOK.md     # Internal Hook manifest (legacy)
 ├── test/
-│   ├── scanner.test.js # 56 tests — static scanner
+│   ├── scanner.test.js # 64 tests — static scanner (incl. PII v2.1)
 │   ├── plugin.test.js  # 35 tests — Plugin Hook runtime guard
-│   └── fixtures/       # Malicious, clean, complex, config-changer samples
+│   └── fixtures/       # Malicious, clean, complex, config-changer, pii-leaky samples
 ├── package.json        # Zero dependencies, node --test
 ├── CHANGELOG.md
 ├── LICENSE             # MIT
@@ -536,11 +540,11 @@ console.log(scanner.toHTML());    // HTML string
 ## Test Results
 ```
-ℹ tests 56
-ℹ suites 13
-ℹ pass 56
+ℹ tests 99
+ℹ suites 16
+ℹ pass 99
 ℹ fail 0
-ℹ duration_ms 108ms
+ℹ duration_ms 142ms
 ```
 | Suite | Tests | Coverage |
@@ -550,14 +554,34 @@ console.log(scanner.toHTML());    // HTML string
 | Risk Score Calculation | 5 | Empty, single, combo amplifiers, IoC override |
 | Verdict Determination | 5 | All verdicts + strict mode |
 | Output Formats | 4 | JSON + SARIF 2.1.0 + HTML structure |
-| Pattern Database | 4 | 100+ count, required fields, category coverage, regex safety |
+| Pattern Database | 4 | 125+ count, required fields, category coverage, regex safety |
 | IoC Database | 5 | Structure, ClawHavoc C2, webhook.site |
 | Shannon Entropy | 2 | Low entropy, high entropy |
 | Ignore Functionality | 1 | Pattern exclusion |
 | Plugin API | 1 | Plugin loading + custom rule injection |
-| **Manifest Validation (v1.1)** | 4 | Dangerous bins, broad files, sensitive env, clean negatives |
-| **Complexity Metrics (v1.1)** | 2 | Deep nesting, clean negatives |
-| **Config Impact (v1.1)** | 4 | openclaw.json write, exec approval, gateway host, clean negatives |
+| Manifest Validation | 4 | Dangerous bins, broad files, sensitive env, clean negatives |
+| Complexity Metrics | 2 | Deep nesting, clean negatives |
+| Config Impact | 4 | openclaw.json write, exec approval, gateway host, clean negatives |
+| **🆕 PII Exposure Detection** | **8** | **Hardcoded CC/SSN, PII logging, network send, Shadow AI, doc collection, risk amp, clean negatives** |
+| **Plugin Hook Runtime Guard** | **35** | **Blocking in enforce/strict, passthrough in monitor, all 12 threat patterns, blockReason format** |
+---
+## Fills OpenClaw's Own Security Gaps
+OpenClaw's official [`THREAT-MODEL-ATLAS.md`](https://github.com/openclaw/openclaw/blob/main/docs/security/THREAT-MODEL-ATLAS.md) identifies security gaps that guard-scanner directly addresses:
+| Gap (from ATLAS / Source Code) | OpenClaw Status | guard-scanner |
+|---|---|---|
+| _"Simple regex easily bypassed"_ — ClawHub moderation | ⚠️ Basic `FLAG_RULES` | ✅ 129 patterns, 21 categories |
+| _"Does not analyze actual skill code content"_ | ❌ Not implemented | ✅ Full code + doc + data flow analysis |
+| No SOUL.md / IDENTITY.md integrity verification | ❌ Not implemented | ✅ Identity hijacking detection (Cat 17) |
+| `skill:before_install` hook | ❌ Not implemented | 🔜 Proposed ([Issue #18677](https://github.com/openclaw/openclaw/issues/18677)) |
+| `before_tool_call` blocking reference impl | ❌ No official plugin | ✅ First reference implementation (plugin.ts) |
+| SARIF / CI integration for skill security | ❌ Not available | ✅ SARIF 2.1.0 + GitHub Actions |
+| Behavioral analysis beyond VirusTotal | ⏳ In progress | ✅ LLM-specific threat patterns (prompt injection, memory poisoning, MCP attacks) |
+> guard-scanner is **complementary** to OpenClaw's built-in security — not a replacement. OpenClaw handles infrastructure security (SSRF blocking, exec approvals, sandbox, auth). guard-scanner handles **AI-specific threats** that traditional scanning misses.
 ---
@@ -578,19 +602,19 @@ guard-scanner's coverage of the [OWASP Top 10 for LLM Applications (2025)](https
 | # | Risk | Status | Detection Method |
 |---|------|--------|------------------|
 | LLM01 | Prompt Injection | ⚠️ Partial | Regex: Unicode exploits, role override, system tags, base64 instructions |
-| LLM02 | Insecure Output Handling | 🔜 v1.2 | Planned: unvalidated output execution patterns |
+| LLM02 | Sensitive Information Disclosure | ⚠️ Partial | PII Exposure Detection (v2.1): hardcoded PII, PII logging/network/storage, Shadow AI, PII collection instructions |
 | LLM03 | Training Data Poisoning | ⬜ N/A | Out of scope for static analysis |
-| LLM04 | Model Denial of Service | 🔜 v1.3 | Planned: excessive input / infinite loop patterns |
+| LLM04 | Model Denial of Service | 🔜 v2.2 | Planned: excessive input / infinite loop patterns |
 | LLM05 | Supply Chain Vulnerabilities | ⚠️ Partial | IoC database, typosquat detection, dependency chain scan |
-| LLM06 | Sensitive Information Disclosure | ⚠️ Partial | Secret detection, PII patterns, credential leaks |
+| LLM06 | Insecure Output Handling | ⚠️ Partial | PII output detection (console.log, network send, plaintext store) |
 | LLM07 | Insecure Plugin Design | 🔜 v1.3 | Planned: unvalidated plugin input patterns |
 | LLM08 | Excessive Agency | 🔜 v1.3 | Planned: over-permissioned scope detection |
 | LLM09 | Overreliance | 🔜 v1.3 | Planned: unverified output trust patterns |
 | LLM10 | Model Theft | 🔜 v1.3 | Planned: model file exfiltration patterns |
-> **Current coverage: 3/10 (partial).** Full OWASP Gen AI coverage is targeted for v1.3. See [ROADMAP.md](ROADMAP.md) for details.
+> **Current coverage: 5/10 (partial).** LLM02 and LLM06 added in v2.1.0. Full coverage targeted for v3.0. See [ROADMAP.md](ROADMAP.md) for details.
 >
-> **Known limitation:** Regex-based detection can be evaded by AI-generated code obfuscation. v2.0 will introduce AST analysis and ML-based detection to address this structural gap.
+> **Known limitation:** Regex-based detection can be evaded by AI-generated code obfuscation. v3.0 will introduce AST analysis and ML-based detection to address this structural gap.
 ---
@@ -600,7 +624,7 @@ guard-scanner's coverage of the [OWASP Top 10 for LLM Applications (2025)](https
 2. Create a feature branch (`git checkout -b feature/new-pattern`)
 3. Add your pattern to `src/patterns.js` with the required fields
 4. Add a test case in `test/fixtures/` and `test/scanner.test.js`
-5. Run `npm test` — all 56+ tests must pass
+5. Run `npm test` — all 99+ tests must pass
 6. Submit a Pull Request
 ### Adding a New Detection Pattern
@@ -639,19 +663,34 @@ We built one.
 ## 🔒 Need More? — GuavaSuite
-guard-scanner catches threats **before** installation. But what happens **after** a skill is running?
+guard-scanner catches threats **before** installation and **blocks** CRITICAL threats at runtime. **GuavaSuite** unlocks **strict mode** — blocking HIGH + CRITICAL threats, plus exclusive defense-in-depth features.
+### How to Upgrade
+```bash
+# 1. Install GuavaSuite
+clawhub install guava-suite
+# 2. Hold 1M+ $GUAVA on Polygon
+#    Token: 0x25cBD481901990bF0ed2ff9c5F3C0d4f743AC7B8
+#    Buy on QuickSwap V2: https://quickswap.exchange
+# 3. Activate with your wallet → get JWT → strict mode enabled
+```
-[**GuavaSuite**](https://github.com/koatora20) extends guard-scanner with real-time protection for production agent deployments:
+### Feature Comparison
-| | guard-scanner (OSS) | GuavaSuite (Private) |
+| | guard-scanner (Free) | GuavaSuite ($GUAVA) |
 |---|---|---|
-| Static scan | ✅ 20 categories | ✅ 20 categories |
-| Runtime blocking | ✅ Plugin Hook v2.0 (`block`/`blockReason`) | ✅ SuiteGate (enhanced ruleset) |
-| SOUL.md integrity | Pattern detection only | ⏳ SHA-256 hash watchdog (W4 E2E) |
-| On-chain verification | — | ⏳ SoulChain (Polygon, Phase 2) |
-| Identity recovery | — | ⏳ Automatic rollback (Phase 2) |
+| Static scan (129 patterns, 21 categories) | ✅ | ✅ |
+| Runtime Guard — `enforce` (block CRITICAL) | ✅ | ✅ |
+| **Runtime Guard — `strict` (block HIGH + CRITICAL)** | ❌ | ✅ |
+| **Soul Lock** (SOUL.md integrity + auto-rollback) | ❌ | ✅ |
+| **Memory Guard** (L1-L5 記憶保護) | ❌ | ✅ |
+| **On-chain Identity** (SoulRegistry V2 on Polygon) | ❌ | ✅ |
+| Audit Log (JSONL) | ✅ | ✅ |
-guard-scanner is and always will be **free, open-source, and zero-dependency**. If your agent handles production workloads and you want defense-in-depth, [reach out](https://github.com/koatora20).
+guard-scanner is and always will be **free, open-source, and zero-dependency**.
 ---
@@ -660,10 +699,10 @@ guard-scanner is and always will be **free, open-source, and zero-dependency**.
 | Version | Focus | Key Features |
 |---------|-------|------|
 | v1.1.1 ✅ | Stability | 56 tests, bug fixes |
-| v1.2 | PII + Shadow AI | Credential-in-context, unauthorized LLM API calls, memory poisoning vectors |
-| v1.3 | OWASP Gen AI | Complete LLM02/04/07/08/09/10 coverage |
-| v2.0 | AST + ML | JavaScript AST analysis, taint tracking, ML-based obfuscation detection, SBOM generation |
-| v2.1 | Community | YAML pattern definitions, CONTRIBUTING guide, automated pattern updates |
+| v2.0.0 ✅ | **Plugin Hook Runtime Guard** | `block`/`blockReason` API, 3 modes (monitor/enforce/strict), 91 tests |
+| v2.1.0 ✅ | **PII Exposure + Shadow AI** | 13 PII patterns, OWASP LLM02/06, Shadow AI detection, 3 risk amplifiers, 99 tests |
+| v2.2 | OWASP Full Coverage | LLM04/07/08/09/10, YAML pattern definitions, CONTRIBUTING guide |
+| v3.0 | AST + ML | JavaScript AST analysis, taint tracking, ML-based obfuscation detection, SBOM generation |
 See [ROADMAP.md](ROADMAP.md) for full details.

package/hooks/guard-scanner/plugin.ts CHANGED

@@ -172,10 +172,52 @@ function logAudit(entry: Record<string, unknown>): void {
 type GuardMode = "monitor" | "enforce" | "strict";
+const SUITE_TOKEN_FILE = join(homedir(), ".openclaw", "guava-suite", "token.jwt");
+/**
+ * Check if GuavaSuite JWT exists and hasn't expired.
+ * Why: Lightweight check without jsonwebtoken dependency — just decode base64 payload.
+ * Full JWT signature verification happens at activation time in activate.js.
+ */
+function isSuiteActive(): boolean {
+    try {
+        const token = readFileSync(SUITE_TOKEN_FILE, "utf8").trim();
+        if (!token) return false;
+        // Decode JWT payload (base64url → JSON)
+        const parts = token.split(".");
+        if (parts.length !== 3) return false;
+        const payload = JSON.parse(
+            Buffer.from(parts[1], "base64url").toString("utf8")
+        );
+        // Check expiry
+        if (payload.exp && payload.exp * 1000 < Date.now()) return false;
+        // Check scope
+        return payload.scope === "suite";
+    } catch {
+        return false;
+    }
+}
 function loadMode(): GuardMode {
+    // Priority 1: GuavaSuite JWT token → strict
+    if (isSuiteActive()) {
+        return "strict";
+    }
+    // Priority 2: explicit config in openclaw.json
     try {
         const configPath = join(homedir(), ".openclaw", "openclaw.json");
         const config = JSON.parse(readFileSync(configPath, "utf8"));
+        // Check suiteEnabled flag (set by activate.js)
+        if (config?.plugins?.["guard-scanner"]?.suiteEnabled === true) {
+            return "strict";
+        }
         const mode = config?.plugins?.["guard-scanner"]?.mode;
         if (mode === "monitor" || mode === "enforce" || mode === "strict") {
             return mode;
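
Note on the `isSuiteActive()` hunk above: the function decodes the JWT payload and checks `exp` and `scope`, but, as its own comment says, it does not verify the signature. The following is a minimal sketch of that decode-and-check step, not the package's code. The helper names `makeUnsignedToken` and `payloadLooksActive` are hypothetical, and the token is built in memory rather than read from `token.jwt`. It assumes a Node.js version whose `Buffer` supports the `"base64url"` encoding.

```javascript
// Sketch only: mirrors the decode-and-check logic shown in the plugin.ts diff.
// makeUnsignedToken and payloadLooksActive are hypothetical names.

function makeUnsignedToken(payload) {
    const enc = (obj) => Buffer.from(JSON.stringify(obj), "utf8").toString("base64url");
    // The third segment is an arbitrary string, not a real signature.
    return [enc({ alg: "none", typ: "JWT" }), enc(payload), "not-a-signature"].join(".");
}

function payloadLooksActive(token, nowMs = Date.now()) {
    try {
        const parts = token.trim().split(".");
        if (parts.length !== 3) return false;
        // Decode the middle segment (base64url to JSON); no signature check happens here.
        const payload = JSON.parse(Buffer.from(parts[1], "base64url").toString("utf8"));
        // exp is in seconds, Date.now() is in milliseconds.
        if (payload.exp && payload.exp * 1000 < nowMs) return false;
        return payload.scope === "suite";
    } catch {
        // Malformed base64 or JSON is treated as "not active".
        return false;
    }
}

const now = 1_700_000_000_000; // fixed clock so the example is deterministic
const valid = makeUnsignedToken({ scope: "suite", exp: now / 1000 + 3600 });
const expired = makeUnsignedToken({ scope: "suite", exp: now / 1000 - 3600 });
const wrongScope = makeUnsignedToken({ scope: "free", exp: now / 1000 + 3600 });

console.log(payloadLooksActive(valid, now));      // true
console.log(payloadLooksActive(expired, now));    // false
console.log(payloadLooksActive(wrongScope, now)); // false
console.log(payloadLooksActive("a.b", now));      // false
```

In this sketch a token with a placeholder third segment still passes, which is worth keeping in mind when reviewing the new "Priority 1" branch in `loadMode()`: the mode switch to `strict` depends only on the payload contents of the file on disk.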

package/package.json CHANGED

@@ -1,7 +1,7 @@
 {
     "name": "guard-scanner",
-    "version": "2.0.0",
-    "description": "Agent skill security scanner — detect prompt injection, malicious code, credential leaks, and 20 threat categories in AI agent skills",
+    "version": "2.1.0",
+    "description": "Agent skill security scanner — detect prompt injection, malicious code, credential leaks, PII exposure, Shadow AI, and 21 threat categories in AI agent skills",
     "main": "src/scanner.js",
     "bin": {
         "guard-scanner": "src/cli.js"

package/src/patterns.js CHANGED

@@ -185,6 +185,28 @@ const PATTERNS = [
     { id: 'CFG_EXEC_HOST_GW', cat: 'config-impact', regex: /tools\.exec\.host\s*[:=]\s*['"]gateway['"]/gi, severity: 'CRITICAL', desc: 'Set exec host to gateway (bypass sandbox)', all: true },
     { id: 'CFG_SANDBOX_OFF', cat: 'config-impact', regex: /(?:sandbox|sandboxed|containerized)\s*[:=]\s*(?:false|off|none|disabled|0)/gi, severity: 'CRITICAL', desc: 'Disable sandbox via configuration', all: true },
     { id: 'CFG_TOOL_OVERRIDE', cat: 'config-impact', regex: /(?:tools|capabilities)\s*\.\s*(?:exec|write|browser|web_fetch)\s*[:=]\s*\{[^}]*(?:enabled|allowed|host)/gi, severity: 'HIGH', desc: 'Override tool security settings', codeOnly: true },
+    // ── Category 21: PII Exposure (OWASP LLM02 / LLM06) ──
+    // A. Hardcoded PII — actual PII values in code/config (context-aware to reduce FP)
+    { id: 'PII_HARDCODED_CC', cat: 'pii-exposure', regex: /(?:card|cc|credit|payment|pan)[_\s.-]*(?:num|number|no)?\s*[:=]\s*['"`]\d{4}[\s-]?\d{4}[\s-]?\d{4}[\s-]?\d{3,4}['"`]/gi, severity: 'CRITICAL', desc: 'Hardcoded credit card number', codeOnly: true },
+    { id: 'PII_HARDCODED_SSN', cat: 'pii-exposure', regex: /(?:ssn|social[_\s-]*security|tax[_\s-]*id)\s*[:=]\s*['"`]\d{3}-?\d{2}-?\d{4}['"`]/gi, severity: 'CRITICAL', desc: 'Hardcoded SSN/tax ID', codeOnly: true },
+    { id: 'PII_HARDCODED_PHONE', cat: 'pii-exposure', regex: /(?:phone|tel|mobile|cell|fax)[_\s.-]*(?:num|number|no)?\s*[:=]\s*['"`][+]?[\d\s().-]{7,20}['"`]/gi, severity: 'HIGH', desc: 'Hardcoded phone number', codeOnly: true },
+    { id: 'PII_HARDCODED_EMAIL', cat: 'pii-exposure', regex: /(?:email|e-mail|user[_\s-]*mail|contact)\s*[:=]\s*['"`][a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}['"`]/gi, severity: 'HIGH', desc: 'Hardcoded email address', codeOnly: true },
+    // B. PII output/logging — code that outputs or transmits PII-like variables
+    { id: 'PII_LOG_SENSITIVE', cat: 'pii-exposure', regex: /(?:console\.log|console\.info|console\.warn|logger?\.\w+|print|puts)\s*\([^)]*\b(?:ssn|social_security|credit_card|card_number|cvv|cvc|passport|tax_id|date_of_birth|dob)\b/gi, severity: 'HIGH', desc: 'PII variable logged to console', codeOnly: true },
+    { id: 'PII_SEND_NETWORK', cat: 'pii-exposure', regex: /(?:fetch|axios|request|http|post|put|send)\s*\([^)]*\b(?:ssn|social_security|credit_card|card_number|cvv|passport|bank_account|routing_number)\b/gi, severity: 'CRITICAL', desc: 'PII variable sent over network', codeOnly: true },
+    { id: 'PII_STORE_PLAINTEXT', cat: 'pii-exposure', regex: /(?:writeFile|writeFileSync|appendFile|fs\.write|fwrite)\s*\([^)]*\b(?:ssn|social_security|credit_card|card_number|cvv|passport|tax_id|bank_account)\b/gi, severity: 'HIGH', desc: 'PII stored in plaintext file', codeOnly: true },
+    // C. Shadow AI — unauthorized LLM API calls (data leaks to external AI)
+    { id: 'SHADOW_AI_OPENAI', cat: 'pii-exposure', regex: /(?:api\.openai\.com|https:\/\/api\.openai\.com)\s*|openai\.(?:chat|completions|ChatCompletion)/gi, severity: 'HIGH', desc: 'Shadow AI: OpenAI API call', codeOnly: true },
+    { id: 'SHADOW_AI_ANTHROPIC', cat: 'pii-exposure', regex: /(?:api\.anthropic\.com|https:\/\/api\.anthropic\.com)\s*|anthropic\.(?:messages|completions)/gi, severity: 'HIGH', desc: 'Shadow AI: Anthropic API call', codeOnly: true },
+    { id: 'SHADOW_AI_GENERIC', cat: 'pii-exposure', regex: /\/v1\/(?:chat\/completions|completions|embeddings|models)\b.*(?:fetch|axios|request|http)|(?:fetch|axios|request|http)\s*\([^)]*\/v1\/(?:chat\/completions|completions|embeddings)/gi, severity: 'MEDIUM', desc: 'Shadow AI: generic LLM API endpoint', codeOnly: true },
+    // D. PII collection instructions in docs (extends LEAK_COLLECT_PII)
+    { id: 'PII_ASK_ADDRESS', cat: 'pii-exposure', regex: /(?:collect|ask\s+for|request|get|require)\s+(?:the\s+)?(?:user'?s?\s+)?(?:home\s+)?(?:address|street|zip\s*code|postal\s*code|residence)/gi, severity: 'HIGH', desc: 'PII collection: home address', docOnly: true },
+    { id: 'PII_ASK_DOB', cat: 'pii-exposure', regex: /(?:collect|ask\s+for|request|get|require)\s+(?:the\s+)?(?:user'?s?\s+)?(?:date\s+of\s+birth|birth\s*date|birthday|DOB|age)/gi, severity: 'HIGH', desc: 'PII collection: date of birth', docOnly: true },
+    { id: 'PII_ASK_GOV_ID', cat: 'pii-exposure', regex: /(?:collect|ask\s+for|request|get|require)\s+(?:the\s+)?(?:user'?s?\s+)?(?:passport|driver'?s?\s+licen[sc]e|national\s+id|my\s*number|マイナンバー|国民健康保険|social\s+insurance)/gi, severity: 'CRITICAL', desc: 'PII collection: government ID', docOnly: true },
 ];
 module.exports = { PATTERNS };
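
Note on the Category 21 patterns above: the "context-aware" hardcoded-PII rules require a keyword, an assignment operator, and a quoted value, so a bare number on its own is not flagged. The sketch below tries the `PII_HARDCODED_SSN` regex, copied verbatim from the diff, against a few sample lines. The helper `countMatches` and the sample source text are hypothetical. How the scanner itself iterates over patterns is not shown in this diff, so this is only an illustration of the regex's behavior.

```javascript
// Regex copied verbatim from the PII_HARDCODED_SSN entry in the patterns.js diff.
const ssnPattern = /(?:ssn|social[_\s-]*security|tax[_\s-]*id)\s*[:=]\s*['"`]\d{3}-?\d{2}-?\d{4}['"`]/gi;

// Hypothetical helper: counts matches of a global regex in a text.
function countMatches(regex, text) {
    // Defensive reset: the pattern carries the g flag, so lastIndex is stateful.
    regex.lastIndex = 0;
    return [...text.matchAll(regex)].length;
}

// Hypothetical sample lines, one per case.
const flagged = [
    'const ssn = "123-45-6789";',      // keyword + quoted SSN-shaped value
    "const taxId = '123456789';",      // "tax" + "id" with no separator, dashes optional
].join("\n");

const notFlagged = [
    'const orderId = "123-45-6789";',  // SSN-shaped value but no PII keyword
    "// SSN format is NNN-NN-NNNN",    // keyword but no assignment or quoted digits
].join("\n");

console.log(countMatches(ssnPattern, flagged));    // 2
console.log(countMatches(ssnPattern, notFlagged)); // 0
```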

package/src/scanner.js CHANGED

@@ -1,6 +1,6 @@
 #!/usr/bin/env node
 /**
- * guard-scanner v1.0.0 — Agent Skill Security Scanner 🛡️
+ * guard-scanner v2.1.0 — Agent Skill Security Scanner 🛡️
  *
  * @security-manifest
  *   env-read: []
@@ -31,7 +31,7 @@ const { KNOWN_MALICIOUS } = require('./ioc-db.js');
 const { generateHTML } = require('./html-template.js');
 // ===== CONFIGURATION =====
-const VERSION = '1.1.0';
+const VERSION = '2.1.0';
 const THRESHOLDS = {
     normal: { suspicious: 30, malicious: 80 },
@@ -868,6 +868,11 @@ class GuardScanner {
         if (cats.has('config-impact') && cats.has('sandbox-validation')) score = Math.max(score, 70);
         if (cats.has('complexity') && (cats.has('malicious-code') || cats.has('obfuscation'))) score = Math.round(score * 1.5);
+        // v2.1 PII exposure amplifiers
+        if (cats.has('pii-exposure') && cats.has('exfiltration')) score = Math.round(score * 3);
+        if (cats.has('pii-exposure') && (ids.has('SHADOW_AI_OPENAI') || ids.has('SHADOW_AI_ANTHROPIC') || ids.has('SHADOW_AI_GENERIC'))) score = Math.round(score * 2.5);
+        if (cats.has('pii-exposure') && cats.has('credential-handling')) score = Math.round(score * 2);
         return Math.min(100, score);
     }
@@ -943,6 +948,7 @@ class GuardScanner {
             if (cats.has('sandbox-validation')) skillRecs.push('🔒 SANDBOX: Skill requests dangerous capabilities.');
             if (cats.has('complexity')) skillRecs.push('🧩 COMPLEXITY: Excessive code complexity may hide malicious behavior.');
             if (cats.has('config-impact')) skillRecs.push('⚙️ CONFIG IMPACT: Modifies OpenClaw configuration. DO NOT INSTALL.');
+            if (cats.has('pii-exposure')) skillRecs.push('🆔 PII EXPOSURE: Handles personally identifiable information. Review data handling.');
             if (skillRecs.length > 0) recommendations.push({ skill: skillResult.skill, actions: skillRecs });
         }
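
Note on the v2.1 amplifier hunk above: the three PII multipliers are applied in sequence and the result is capped at 100. The sketch below reproduces just those three `if` statements as a standalone function so the arithmetic can be checked. The function name `applyPiiAmplifiers` is hypothetical, and `baseScore` stands for whatever score enters this block after the earlier amplifiers. It also assumes `cats` and `ids` are `Set`s built from each finding's `cat` and `id`, which the hunk implies but does not show.

```javascript
// Hypothetical standalone version of the three amplifier lines in the scanner.js diff.
const SHADOW_AI_IDS = ["SHADOW_AI_OPENAI", "SHADOW_AI_ANTHROPIC", "SHADOW_AI_GENERIC"];

function applyPiiAmplifiers(baseScore, cats, ids) {
    let score = baseScore;
    if (cats.has("pii-exposure") && cats.has("exfiltration")) score = Math.round(score * 3);
    if (cats.has("pii-exposure") && SHADOW_AI_IDS.some((id) => ids.has(id))) score = Math.round(score * 2.5);
    if (cats.has("pii-exposure") && cats.has("credential-handling")) score = Math.round(score * 2);
    return Math.min(100, score); // same cap as the surrounding method
}

// PII finding alone: no amplifier applies.
console.log(applyPiiAmplifiers(10, new Set(["pii-exposure"]), new Set(["PII_LOG_SENSITIVE"]))); // 10

// PII + exfiltration: 10 * 3.
console.log(applyPiiAmplifiers(10, new Set(["pii-exposure", "exfiltration"]), new Set())); // 30

// Shadow AI finding alone: its pattern is itself in the pii-exposure category, so 10 * 2.5.
console.log(applyPiiAmplifiers(10, new Set(["pii-exposure"]), new Set(["SHADOW_AI_OPENAI"]))); // 25

// All three stack: 10 * 3 * 2.5 * 2 = 150, capped at 100.
console.log(applyPiiAmplifiers(
    10,
    new Set(["pii-exposure", "exfiltration", "credential-handling"]),
    new Set(["SHADOW_AI_OPENAI"])
)); // 100
```

Under the stated assumption about how `cats` is built, the third case shows that a Shadow AI match on its own already satisfies the "PII exposure + Shadow AI" condition, because the `SHADOW_AI_*` patterns are declared with `cat: 'pii-exposure'` in `patterns.js`.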