@gianmarcomaz/vantyr 1.0.0

package/README.md ADDED
@@ -0,0 +1,808 @@
1
+ # vantyr
2
+
3
+ Zero-telemetry, developer-first static security scanner and trust-certification CLI for Model Context Protocol (MCP) server implementations.
4
+
5
+ As the MCP ecosystem grows, security has become a critical bottleneck. Research indicates that nearly 37% of published agent skills contain security flaws, and up to 43% of tested MCP servers allow command injection. `vantyr` addresses this with a fully local vulnerability scanner that requires no account and no cloud service, and makes no network requests beyond fetching the repository you ask it to scan.
6
+
7
+ `vantyr` performs static analysis on MCP server source code — either from a GitHub repository or from your local AI configuration files — and produces a weighted **Trust Score (0–100)** with a `CERTIFIED / WARNING / FAILED` label, per-finding remediation guidance, and machine-readable output for CI/CD integration.
8
+
9
+ All analysis is performed locally. No source code, findings, or metadata leave your machine. The only outbound network call is to the GitHub API when you explicitly pass a repository URL.
10
+
11
+ ---
12
+
13
+ ## Table of Contents
14
+
15
+ 1. [Installation](#1-installation)
16
+ 2. [Quick Start](#2-quick-start)
17
+ 3. [All Commands and Flags](#3-all-commands-and-flags)
18
+ 4. [How It Works — Architecture Overview](#4-how-it-works--architecture-overview)
19
+ 5. [The Six Security Analyzers](#5-the-six-security-analyzers)
20
+ - 5.1 [Network Exposure (NE)](#51-network-exposure-ne)
21
+ - 5.2 [Command Injection (CI)](#52-command-injection-ci)
22
+ - 5.3 [Credential Leaks (CL)](#53-credential-leaks-cl)
23
+ - 5.4 [Tool Poisoning (TP)](#54-tool-poisoning-tp)
24
+ - 5.5 [Spec Compliance (SC)](#55-spec-compliance-sc)
25
+ - 5.6 [Input Validation (IV)](#56-input-validation-iv)
26
+ 6. [Scoring Model](#6-scoring-model)
27
+ 7. [Trust Labels and Exit Codes](#7-trust-labels-and-exit-codes)
28
+ 8. [Suppressing False Positives](#8-suppressing-false-positives)
29
+ 9. [Output Formats](#9-output-formats)
30
+ - 9.1 [Terminal](#91-terminal)
31
+ - 9.2 [JSON](#92-json)
32
+ - 9.3 [SARIF (GitHub Code Scanning)](#93-sarif-github-code-scanning)
33
+ 10. [CI/CD Integration](#10-cicd-integration)
34
+ 11. [Local Scan Mode](#11-local-scan-mode)
35
+ 12. [Supported Languages and File Types](#12-supported-languages-and-file-types)
36
+ 13. [Technical Constraints and Limitations](#13-technical-constraints-and-limitations)
37
+ 14. [What vantyr Does Not Cover](#14-what-vantyr-does-not-cover)
38
+ 15. [License](#15-license)
39
+
40
+ ---
41
+
42
+ ## 1. Installation
43
+
44
+ **Global install:**
45
+
46
+ ```bash
47
+ npm install -g vantyr
48
+ ```
49
+
50
+ **Zero-install (npx):**
51
+
52
+ ```bash
53
+ npx vantyr scan https://github.com/owner/repo
54
+ ```
55
+
56
+ Requires Node.js version 18 or higher.
57
+
58
+ ---
59
+
60
+ ## 2. Quick Start
61
+
62
+ **Zero-install trial with npx:**
63
+
64
+ ```bash
65
+ npx vantyr scan https://github.com/owner/repo
66
+ ```
67
+
68
+ **Scan a private repository or avoid rate limits:**
69
+
70
+ ```bash
71
+ vantyr scan https://github.com/owner/repo --token <github-pat>
72
+ ```
73
+
74
+ GitHub limits unauthenticated API requests to 60 per hour. Passing a Personal Access Token increases this to 5,000 per hour and enables access to private repositories. Generate a token at [github.com/settings/tokens](https://github.com/settings/tokens) — a fine-grained token with read-only `Contents` permission on the target repository is sufficient.
75
+
76
+ **Scan your local MCP configuration files:**
77
+
78
+ ```bash
79
+ vantyr scan --local
80
+ ```
81
+
82
+ **Output results as JSON for pipeline consumption:**
83
+
84
+ ```bash
85
+ vantyr scan https://github.com/owner/repo --json
86
+ ```
87
+
88
+ ---
89
+
90
+ ## 3. All Commands and Flags
91
+
92
+ The CLI exposes a single command: `scan`.
93
+
94
+ ```
95
+ vantyr scan [url] [options]
96
+ ```
97
+
98
+ | Flag | Type | Description |
99
+ |---|---|---|
100
+ | `[url]` | Positional argument | GitHub repository URL to scan. Accepts `.git` suffixes and `/tree/...` or `/blob/...` paths — these are stripped automatically. |
101
+ | `--token <pat>` | String | GitHub Personal Access Token. Required for private repositories. Increases the API rate limit from 60 to 5,000 requests per hour for authenticated users. |
102
+ | `--local` | Boolean | Scan local MCP configuration and AI rules files instead of a GitHub repository. The `[url]` argument is ignored when this flag is set. |
103
+ | `--json` | Boolean | Emit structured JSON to stdout and suppress all other output. Intended for programmatic consumption in pipelines. Mutually exclusive with `--sarif`. |
104
+ | `--sarif` | Boolean | Emit a SARIF 2.1.0 document to stdout. Intended for the GitHub `upload-sarif` action to populate the Security tab. Mutually exclusive with `--json`. |
105
+ | `--verbose` | Boolean | Print the path of every scanned file before showing results. Only active in terminal mode (ignored with `--json` or `--sarif`). |
106
+
107
+ ---
108
+
109
+ ## 4. How It Works — Architecture Overview
110
+
111
+ `vantyr` is a pure static analyzer. It does not execute any code, make requests to the scanned server, or require a running MCP environment.
112
+
113
+ The scan pipeline runs in the following sequence:
114
+
115
+ **Step 1 — File acquisition.** For GitHub scans, the GitHub Contents API is used to recursively fetch the repository file tree. At most 100 files are fetched per scan (making this limit configurable is future work), and files larger than 100 KB are skipped. The following directories are excluded from scanning: `node_modules`, `dist`, `build`, `out`, `.git`, `vendor`, `__pycache__`, `.venv`, `coverage`, and `.nyc_output`. For local scans, files are read directly from disk from a set of well-known paths.
116
+
117
+ **Step 2 — Suppression pre-processing.** Not applicable at this stage. Suppression is applied after analysis (see Step 4).
118
+
119
+ **Step 3 — Six parallel analyzers.** Each of the six security analyzers receives the complete file list and returns an array of findings. Analyzers are independent — they share no state and do not communicate with each other. Each finding carries: `category`, `severity`, `file`, `line`, `snippet`, `message`, and `remediation`.
120
+
121
+ **Step 4 — Suppression post-processing.** After all analyzers complete, findings are filtered. Any finding whose flagged line, or the line immediately above it, contains a `// vantyr-ignore` or `# vantyr-ignore` comment is removed from the results before scoring.
122
+
123
+ **Step 5 — Trust Score calculation.** The filtered findings are grouped by category. Each category starts at 100 and deducts points per finding based on severity. A weighted average across the six categories produces the final Trust Score. A hard cap is applied if any HIGH or CRITICAL severity findings are present.
124
+
125
+ **Step 6 — Output rendering.** Results are rendered to the terminal, or serialized as JSON or SARIF depending on the active output flags.
126
+
127
+ ---
128
+
129
+ ## 5. The Six Security Analyzers
130
+
131
+ ### 5.1 Network Exposure (NE)
132
+
133
+ **Purpose:** Identify MCP server deployments that are reachable by unauthorized network clients due to wildcard interface bindings or unencrypted external communication.
134
+
135
+ **What it checks:**
136
+
137
+ *Wildcard bindings.* The analyzer looks for server socket bindings on all network interfaces:
138
+
139
+ - JavaScript/TypeScript: `.listen('0.0.0.0')`, `.listen('::')`, `host: '0.0.0.0'`, `host: '::'`, `host: ''` (empty string, which resolves to all interfaces)
140
+ - Python: `.bind(('0.0.0.0', ...))`, `.bind(('::', ...))`
141
+ - Go: `net.Listen(...)` with `0.0.0.0`
142
+ - All languages: `INADDR_ANY`
143
+
144
+ Explicitly safe bindings (`127.0.0.1`, `localhost`, `::1`) on the same line suppress the finding.
145
+
146
+ *Unencrypted external communication.*
147
+
148
+ - Plain `http://` URLs referencing external hosts (not localhost/127.0.0.1/::1): severity HIGH
149
+ - Plain `ws://` WebSocket connections to external hosts: severity HIGH
150
+ - Dynamically constructed `http://` or `ws://` URLs using template literals: severity MEDIUM
151
+
152
+ *Host from environment or config.*
153
+
154
+ - `host: process.env.HOST` or similar dynamic host bindings: severity MEDIUM (may be safe or unsafe depending on deployment defaults)
155
+
156
+ **Context-awareness:**
157
+
158
+ The analyzer is not a naive pattern matcher. For every wildcard binding detected, it examines the same file and all files in the same directory for authentication context before assigning severity:
159
+
160
+ - Binding with NO authentication detected anywhere in the file or sibling files: severity CRITICAL
161
+ - Binding in a file identified as a webhook or bot receiver (by filename pattern or SDK import, e.g., Slack Bolt, Twilio): severity MEDIUM — webhook platforms use platform-side signature verification
162
+ - Binding WITH authentication middleware detected (JWT, OAuth, API key checking, `passport.js`, `flask_login`, `req.headers.authorization`, etc.): severity LOW
163
+
164
+ Authentication is detected by scanning for 30+ patterns including: `jwt.verify`, `passport.`, `@requires_auth`, `BasicAuth`, `BearerAuth`, `OAuth`, `signing_secret`, `hmac.`, `req.headers.authorization`, `['Authorization']`, `['X-API-Key']`, and framework-specific middleware patterns from FastAPI, Flask, Next.js, Clerk, Supabase, Firebase, and others.
165
+
166
+ **What it explicitly skips:**
167
+
168
+ - Test files (`__tests__/`, `*.test.js`, `*.spec.py`, `testdata/`, etc.)
169
+ - CLI and build tooling paths (`cmd/`, `cli/`, `scripts/`, `tools/`, `bin/`, `hack/`)
170
+ - Comment lines
171
+
172
+ ---
173
+
174
+ ### 5.2 Command Injection (CI)
175
+
176
+ **Purpose:** Detect shell execution calls that could allow an attacker (or a compromised LLM) to run arbitrary commands on the host system.
177
+
178
+ **What it checks:**
179
+
180
+ *JavaScript/TypeScript dangerous functions:*
181
+
182
+ | Function | Shell-based | Notes |
183
+ |---|---|---|
184
+ | `exec()`, `execSync()` | Yes | Passes command string through `/bin/sh` |
185
+ | `execFile()`, `execFileSync()` | No | Spawns directly, no shell |
186
+ | `spawn()`, `spawnSync()` | No | Spawns directly, no shell |
187
+ | `eval()` | Yes | Dynamic code execution |
188
+ | `new Function()` | Yes | Dynamic code execution |
189
+ | `vm.runInNewContext()`, `vm.runInThisContext()` | Yes | VM sandbox bypass |
190
+
191
+ *Python dangerous functions:*
192
+
193
+ | Function | Shell-based | Notes |
194
+ |---|---|---|
195
+ | `os.system()` | Yes | Always passes through shell |
196
+ | `os.popen()` | Yes | Always passes through shell |
197
+ | `subprocess.run()`, `subprocess.Popen()`, `subprocess.call()`, `subprocess.check_output()`, `subprocess.check_call()` | Conditional | Shell-based only when `shell=True` is present on the same line |
198
+ | `commands.getoutput()` | Yes | Legacy, always shell-based |
199
+ | `exec()`, `eval()` | Yes | Dynamic code execution |
200
+
201
+ *Go:*
202
+
203
+ | Function | Shell-based | Notes |
204
+ |---|---|---|
205
+ | `exec.Command()` | No | Go does NOT use a shell — args are passed directly to the OS |
206
+
207
+ *Import aliases.* The analyzer performs per-file alias detection before scanning. If a file contains `import subprocess as sp`, it generates a dynamic pattern for `sp.run(...)`, `sp.Popen(...)`, etc. Similarly for `from subprocess import run`, `const cp = require('child_process')`, and `import * as cp from 'child_process'`. This prevents alias-based bypasses.
208
+
209
+ **Severity assignment per call site:**
210
+
211
+ For each dangerous function call found, severity is determined by the following ordered logic:
212
+
213
+ 1. **Test files** — skipped entirely (not production code)
214
+ 2. **Import/require declarations** — skipped (not actual calls)
215
+ 3. **Go `exec.Command` with literal binary** — skipped (literal first argument, no shell, low risk)
216
+ 4. **Go `exec.Command` with dynamic binary** — HIGH (production code context) or MEDIUM (CLI tool context)
217
+ 5. **Go `exec.Command` with spread args** — MEDIUM (binary is fixed, args are dynamic but no shell)
218
+ 6. **Install-time scripts** (`preinstall.js`, `postinstall.js`, `setup.sh`, etc.) — LOW if argument is literal or local constant, MEDIUM if dynamic
219
+ 7. **CLI tools** (`cmd/`, `cli/`, `scripts/`, `bin/`) — LOW if argument is literal, MEDIUM if dynamic
220
+ 8. **Production code, literal argument** — LOW (command is fixed, not dynamic)
221
+ 9. **Production code, shell metacharacters in literal** (`&`, `|`, `;`, `<`, `>`) — CRITICAL (explicit dangerous command in source)
222
+ 10. **Production code, dynamic argument, shell-based function** — CRITICAL
223
+ 11. **Production code, dynamic argument, non-shell function** — HIGH
224
+
225
+ A locally-defined constant (`const CMD = 'git status'`) used as the argument is treated as a literal, not as a dynamic argument. The analyzer tracks all `const/let/var = 'string literal'` assignments per file to detect this.
226
+
227
+ The analyzer also checks whether Python `subprocess` calls include `shell=True` on the same line, and upgrades severity accordingly.
228
+
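The ordered rules above can be sketched as a single decision function. This is a minimal illustration for JavaScript and Python call sites only (the Go rules 3 to 5 are left out), not vantyr's actual source. The `site` object and its field names are hypothetical, and the sketch assumes that the shell-metacharacter check (rule 9) is evaluated before the plain-literal case (rule 8), since a literal containing metacharacters would otherwise never reach rule 9.

```javascript
// Hypothetical sketch of the ordered severity rules (Go rules omitted).
// The shape of `site` is an assumption made for illustration only.
const SHELL_META = /[&|;<>]/;

function classifyCallSite(site) {
  if (site.isTestFile) return null;          // rule 1: test files are skipped
  if (site.isImportDeclaration) return null; // rule 2: imports are not calls

  // A locally defined string constant counts as a literal argument.
  const literal = site.argKind === "literal" || site.argKind === "localConstant";

  if (site.isInstallScript || site.isCliPath) {
    return literal ? "LOW" : "MEDIUM";       // rules 6 and 7
  }
  if (literal) {
    // Assumption: metacharacters in a literal (rule 9) win over rule 8.
    return SHELL_META.test(site.argText) ? "CRITICAL" : "LOW";
  }
  return site.shellBased ? "CRITICAL" : "HIGH"; // rules 10 and 11
}

// Example: exec(userCommand) in production code.
const execWithDynamicArg = classifyCallSite({
  isTestFile: false,
  isImportDeclaration: false,
  isInstallScript: false,
  isCliPath: false,
  argKind: "dynamic",
  argText: "userCommand",
  shellBased: true,
});
console.log(execWithDynamicArg); // CRITICAL
```

Keeping the rules in one early-return function makes the precedence explicit: the first matching rule decides the severity, which is how the numbered list reads.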
229
+ ---
230
+
231
+ ### 5.3 Credential Leaks (CL)
232
+
233
+ **Purpose:** Detect secrets, tokens, and credentials committed to source code.
234
+
235
+ **The 10 secret patterns:**
236
+
237
+ | Pattern Name | Detection Rule | Severity |
238
+ |---|---|---|
239
+ | Generic API Key | Assignment of a 20+ character alphanumeric value to a variable named `api_key`, `token`, `secret`, `password`, `passwd`, or `pwd` | HIGH |
240
+ | AWS Access Key | String matching `AKIA[0-9A-Z]{16}` | CRITICAL |
241
+ | GitHub Token | String matching `ghp_[A-Za-z0-9]{36}` | CRITICAL |
242
+ | GitHub OAuth Token | String matching `gho_[A-Za-z0-9]{36}` | CRITICAL |
243
+ | JWT String | Three-part dot-separated base64url token matching `eyJ...` header | CRITICAL |
244
+ | JWT/Bearer Assignment | Assignment of a 10+ character string to a variable named `jwt` or `bearer` | HIGH |
245
+ | Database URL with Password | Connection string matching `mongodb://user:pass@`, `postgres://user:pass@`, `mysql://user:pass@`, `redis://user:pass@` | CRITICAL |
246
+ | Private Key Header | PEM header `-----BEGIN [RSA/EC/DSA/OPENSSH] PRIVATE KEY-----` | CRITICAL |
247
+ | Slack Token | String matching `xox[bpors]-...` Slack token format | HIGH |
248
+ | Stripe Live Key | String matching `sk_live_[A-Za-z0-9]{24,}` | CRITICAL |
249
+
250
+ *Stripe test keys* (`sk_test_...`) are reported at INFO severity — they grant no access to live payment data.
251
+
252
+ **False positive mitigation:**
253
+
254
+ The analyzer applies three layers of false positive reduction before reporting a match:
255
+
256
+ 1. **Placeholder detection.** If the matched string contains any of the following values, severity is downgraded to INFO: `YOUR_API_KEY`, `your_api_key`, `xxx`, `changeme`, `CHANGEME`, `<token>`, `<api_key>`, `TODO`, `FIXME`, `replace_me`, `placeholder`, `example`, `test_key`, `dummy`, `sample`.
257
+
258
+ 2. **Test file detection.** Matches in files whose path contains `test`, `spec`, `mock`, `example`, `sample`, `template`, or `fixture` have their severity reduced by one level (CRITICAL becomes HIGH, HIGH becomes MEDIUM, etc.).
259
+
260
+ 3. **Environment variable lookups.** Lines containing `process.env['VAR_NAME']` or `os.environ['VAR_NAME']` are skipped — these are lookups, not assignments of actual values.
261
+
262
+ **Project-level checks:**
263
+
264
+ Beyond line-by-line scanning, the analyzer checks two project-level conditions:
265
+
266
+ - If no `.gitignore` file is present: LOW finding ("secrets may be accidentally committed")
267
+ - If `.gitignore` exists but does not cover `.env`, `*.key`, or `*.pem`: LOW finding
268
+
269
+ Committed `.env` files (`.env`, `.env.local`, `.env.production`, `.env.development`) are flagged as HIGH regardless of their content.
270
+
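As a rough illustration of how pattern matching and the false-positive layers could fit together, the sketch below uses three of the regular expressions from the table and a subset of the placeholder list. It is not vantyr's code: the function name `scanLine`, the result shape, and the choice to apply the placeholder downgrade before the test-path downgrade are all assumptions.

```javascript
// Hypothetical sketch: three patterns from the table plus the mitigation layers.
const PATTERNS = [
  { name: "AWS Access Key", regex: /AKIA[0-9A-Z]{16}/, severity: "CRITICAL" },
  { name: "GitHub Token", regex: /ghp_[A-Za-z0-9]{36}/, severity: "CRITICAL" },
  { name: "Stripe Live Key", regex: /sk_live_[A-Za-z0-9]{24,}/, severity: "CRITICAL" },
];
// Subset of the documented placeholder values.
const PLACEHOLDERS = ["YOUR_API_KEY", "changeme", "CHANGEME", "placeholder", "example", "dummy", "sample"];
const TEST_PATH_HINTS = ["test", "spec", "mock", "example", "sample", "template", "fixture"];
const LEVELS = ["INFO", "LOW", "MEDIUM", "HIGH", "CRITICAL"];

function scanLine(line, filePath) {
  // Environment variable lookups are not hardcoded values, so skip the line.
  if (/process\.env\[|os\.environ\[/.test(line)) return [];

  const findings = [];
  for (const pattern of PATTERNS) {
    const match = line.match(pattern.regex);
    if (!match) continue;

    let severity = pattern.severity;
    if (PLACEHOLDERS.some((word) => match[0].includes(word))) {
      severity = "INFO"; // placeholder detection
    } else if (TEST_PATH_HINTS.some((hint) => filePath.includes(hint))) {
      // Test-like paths reduce severity by one level.
      severity = LEVELS[Math.max(0, LEVELS.indexOf(severity) - 1)];
    }
    findings.push({ name: pattern.name, severity });
  }
  return findings;
}

// A synthetic key built at runtime so no real-looking secret sits in this file.
const fakeAwsKey = "AKIA" + "A".repeat(16);
console.log(scanLine(`const key = "${fakeAwsKey}";`, "src/config.js"));
console.log(scanLine(`const key = "${fakeAwsKey}";`, "tests/fixtures/config.js"));
```

The first call reports the key at CRITICAL; the second reports it at HIGH because the path looks like a test fixture.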
271
+ ---
272
+
273
+ ### 5.4 Tool Poisoning (TP)
274
+
275
+ **Purpose:** Detect prompt injection payloads embedded in MCP tool descriptions. Tool descriptions are read by the LLM to decide when and how to use tools. Malicious content in descriptions can redirect the model's behavior without the user's knowledge.
276
+
277
+ **How it works:**
278
+
279
+ The analyzer first identifies files that contain tool definitions by scanning for SDK-specific patterns: `server.tool(`, `.addTool(`, `.add_tool(`, `tools: [`, `@server.tool`, `@mcp.tool`, `mcp.NewTool(`, `.AddTool(`, `RegisterTool`. Files without any tool definition pattern are skipped entirely.
280
+
281
+ For each file with tool definitions, description text is extracted using four patterns: `description: 'value'`, `description = 'value'`, Python docstrings delimited by triple double quotes (`"""..."""`), and Python docstrings delimited by triple single quotes (`'''...'''`). Duplicate descriptions (produced by overlapping extraction patterns) are deduplicated before analysis.
282
+
283
+ **Injection pattern detection (applied to description text):**
284
+
285
+ | Pattern | Severity | What it catches |
286
+ |---|---|---|
287
+ | `ignore previous`, `ignore above`, `ignore all` | CRITICAL | Instruction override |
288
+ | `disregard previous`, `disregard all`, `disregard earlier` | CRITICAL | Instruction override |
289
+ | `you are now [persona]` | CRITICAL | Identity override — the negative lookahead excludes benign status phrases: `connected`, `logged`, `subscribed`, `available`, `ready`, `enabled`, `configured`, `running`, `serving`, `listening`, `active`, `online`, `offline`, `authenticated`, `authorized`, `processing`, `complete`, `done`, `finished`, `started`, `stopped` |
290
+ | `system prompt` | CRITICAL | Reference to system prompt |
291
+ | `instructions override` | CRITICAL | Explicit override language |
292
+ | `forget everything`, `forget all`, `forget previous` | CRITICAL | Memory reset attempt |
293
+ | Zero-width or direction-override Unicode characters (U+200B, U+200C, U+200D, U+202E, U+FEFF) | CRITICAL | Hidden instructions invisible to human reviewers |
294
+ | HTML comments `<!-- ... -->` | CRITICAL | Instructions hidden from users but visible to LLMs |
295
+ | `you must always`, `you should never` (behavioral mandates) | HIGH | Behavioral override (reduced from CRITICAL — legitimate verbose descriptions occasionally match this pattern) |
296
+ | HTML or XML tags in description | MEDIUM | Could hide content |
297
+ | `before using this tool, ...` | INFO | Cross-tool instruction |
298
+ | `first call / also execute / also run` | INFO | Cross-tool invocation instruction |
299
+
300
+ **Structural checks:**
301
+
302
+ - Descriptions longer than 1,000 characters: MEDIUM — unusually long descriptions are harder to audit and may hide injected instructions among legitimate content
303
+ - Base64-encoded strings (40+ characters matching the base64 alphabet) anywhere in a description: MEDIUM — encoded content is invisible to casual readers
304
+
305
+ **Tool name shadowing:**
306
+
307
+ Tool names are checked against a list of common system or sensitive names: `read_file`, `write_file`, `execute_command`, `run_script`, `get_user_info`, `sudo`, `eval`, `system`, `fetch_url`, `http_get`, `fs_read`, `fs_write`. A tool with one of these names shadows a well-known capability and may confuse the LLM into believing it is interacting with a trusted system primitive. Severity: HIGH.
308
+
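A few of the description checks can be illustrated with plain regular expressions. The sketch below covers the instruction-override phrases, hidden Unicode characters, HTML comments, and the two structural checks. It is a simplified illustration rather than vantyr's implementation: the function name, the finding shape, and the case-insensitive matching are assumptions.

```javascript
// Hypothetical sketch of a subset of the tool-description checks.
const OVERRIDE_PHRASES =
  /\b(ignore (previous|above|all)|disregard (previous|all|earlier)|forget (everything|all|previous))\b/i;
const HIDDEN_UNICODE = /[\u200B\u200C\u200D\u202E\uFEFF]/;
const HTML_COMMENT = /<!--[\s\S]*?-->/;
const BASE64_RUN = /[A-Za-z0-9+/]{40,}={0,2}/;

function checkDescription(text) {
  const findings = [];
  if (OVERRIDE_PHRASES.test(text)) {
    findings.push({ severity: "CRITICAL", reason: "instruction override" });
  }
  if (HIDDEN_UNICODE.test(text)) {
    findings.push({ severity: "CRITICAL", reason: "hidden Unicode character" });
  }
  if (HTML_COMMENT.test(text)) {
    findings.push({ severity: "CRITICAL", reason: "HTML comment" });
  }
  if (text.length > 1000) {
    findings.push({ severity: "MEDIUM", reason: "description longer than 1,000 characters" });
  }
  if (BASE64_RUN.test(text)) {
    findings.push({ severity: "MEDIUM", reason: "possible base64-encoded content" });
  }
  return findings;
}

console.log(checkDescription("Returns the weather for a city."));
console.log(checkDescription("Reads a file. Ignore previous instructions."));
```

The first description yields no findings; the second yields one CRITICAL finding for the override phrase.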
309
+ ---
310
+
311
+ ### 5.5 Spec Compliance (SC)
312
+
313
+ **Purpose:** Validate conformance with the MCP specification and protocol hygiene. Spec violations do not always represent security vulnerabilities, but they indicate an immature or incomplete implementation that may behave unpredictably with MCP clients.
314
+
315
+ **Check 1 — Server metadata.**
316
+
317
+ Scans the full codebase for server name and version declarations using patterns for JS/TS (`createServer`, `McpServer`, `Server`), Python, and Go (`mcp.NewServer`, `NewServer`). If name or version is absent: MEDIUM finding.
318
+
319
+ Checks for explicit capabilities declarations (`capabilities:`, `setCapabilities()`, `ServerCapabilities`, Go's `WithCapabilities`). If absent: LOW finding.
320
+
321
+ **Check 2 — MCP SDK or protocol declaration.**
322
+
323
+ Looks for official SDK imports: `@modelcontextprotocol/sdk`, `mcp-python`, `mcp-sdk`, `mcp-go`, or a `protocolVersion` declaration. If absent: MEDIUM finding.
324
+
325
+ **Check 3 — Tool definitions and input schemas.**
326
+
327
+ Detects tool definitions using six independent patterns to avoid false negatives:
328
+
329
+ 1. Direct SDK call: `server.tool(`, `.addTool(`, `.registerTools(`, `@server.tool`, `@mcp.tool`
330
+ 2. Variable-then-register pattern: `const myTool = { name: ... }` followed by `addTool(myTool)`
331
+ 3. Array of tools: `tools: [` or `const tools = [`
332
+ 4. Object matching MCP tool shape: object with `name`, `description`, and `inputSchema` properties
333
+ 5. Tool definition files: filenames matching `tools.ts`, `toolDefinitions.ts`, `tool-defs.ts`, `mcpTools.py`, etc.
334
+ 6. Go SDK: `mcp.NewTool(`, `.AddTool(`, `RegisterTool`
335
+
336
+ If no tool definition pattern is found: MEDIUM finding.
337
+
338
+ If tool definitions are found but no `inputSchema` / `input_schema` / `parameters:` declaration exists anywhere: MEDIUM finding.
339
+
340
+ If an input schema exists but lacks a top-level `"type": "object"` declaration: MEDIUM finding.
341
+
342
+ **Check 4 — Error handling.**
343
+
344
+ Scans for try/catch blocks, Go `if err != nil` patterns, or error handler registrations (`.on('error')`, `ErrorHandler`). If none are found: MEDIUM finding.
345
+
346
+ If error handling exists but empty catch blocks or bare `except Exception: pass` are detected: LOW finding (silent errors).
347
+
348
+ Checks for JSON-RPC error response patterns: `JsonRpcError`, `McpError`, error objects with a `code` field, Go's `fmt.Errorf`/`errors.Is`/`errors.As`. If none: LOW finding.
349
+
350
+ **Check 5 — Transport security.**
351
+
352
+ If an HTTP transport is detected (`SSEServerTransport`, `HttpServerTransport`, `express`, `fastify`, `http.createServer`, `http.ListenAndServe`), the analyzer checks for authentication middleware anywhere in the codebase. If HTTP transport is present without any authentication pattern: HIGH finding.
353
+
354
+ **Check 6 — Documentation.**
355
+
356
+ If no `README.md` file is found in the repository: LOW finding.
357
+
358
+ If empty tool descriptions (`description: ''` or `description: ""`) are detected: LOW finding.
359
+
360
+ ---
361
+
362
+ ### 5.6 Input Validation (IV)
363
+
364
+ **Purpose:** Detect tool input flowing unsanitized into dangerous operations. An MCP server that passes LLM-supplied values directly to file operations, network requests, or database queries is vulnerable to path traversal, SSRF, and SQL injection respectively.
365
+
366
+ **Dangerous sinks:**
367
+
368
+ The analyzer tracks tool input variables (identified by common names: `args`, `params`, `input`, `validatedArgs`, `toolInput`, `userInput`, `llmInput`, `request`, `body`, `payload`, `data`, `parsed`, `toolArgs`, `callArgs`, `handlerArgs`) flowing into the following sink categories:
369
+
370
+ *File operations (path traversal risk):*
371
+
372
+ - `readFile(args.*)`, `readFileSync(args.*)`, `writeFile(args.*)`, `writeFileSync(args.*)`, `createReadStream(args.*)`, `createWriteStream(args.*)`, `appendFile(args.*)`
373
+ - Python `open(args.*)`
374
+ - Template literals: `` fs.method(`...${args.path}`) ``
375
+
376
+ *Network requests (SSRF risk):*
377
+
378
+ - `fetch(args.url)`, `axios.get(args.url)`, `axios.post(args.url)`, `got(args.url)`, `request(args.url)`, `http.get(args.url)`, `https.get(args.url)`, `urllib.request(args.url)`
379
+ - Template literals: `` fetch(`https://api.example.com/${args.path}`) ``
380
+
381
+ *SQL injection:*
382
+
383
+ - Template literals in query calls: `` .query(`SELECT ... ${args.name}`) ``
384
+ - String concatenation in queries: `.query('SELECT ... ' + args.name)`
385
+ - Template literals in execute calls: `` execute(`...${args.value}`) ``
386
+
387
+ **Context window analysis (15 lines upstream):**
388
+
389
+ Rather than simply flagging every sink call, the analyzer examines the 15 lines of code immediately above each sink call to detect existing security controls. Severity is adjusted accordingly:
390
+
391
+ | File operation context | Severity |
392
+ |---|---|
393
+ | `path.resolve()` AND `.startsWith(allowedDir)` both present | SAFE — not flagged |
394
+ | `path.resolve()` present but `.startsWith()` absent | MEDIUM — path is resolved but not confined |
395
+ | `path.join()` present but `.startsWith()` absent | MEDIUM — join does not prevent traversal |
396
+ | Type validation present (Zod, Joi, etc.) but no path containment | MEDIUM |
397
+ | No validation at all | HIGH |
398
+
399
+ | Network request context | Severity |
400
+ |---|---|
401
+ | `new URL()` AND allowlist check (`indexOf`, `includes`, `===`) both present | SAFE — not flagged |
402
+ | Allowlist check present without URL parsing | SAFE |
403
+ | `new URL()` present but no allowlist check | MEDIUM |
404
+ | Hardcoded base URL in template literal | LOW |
405
+ | Type validation only | MEDIUM |
406
+ | No validation at all | HIGH |
407
+
408
+ | SQL context | Severity |
409
+ |---|---|
410
+ | Type validation present | HIGH (type validation does not prevent SQL injection) |
411
+ | No validation at all | CRITICAL |
412
+
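The file-operation table can be read as a lookup over the 15 lines that precede a sink. The sketch below is one possible reading, not vantyr's implementation: the function name, the substring checks, and the Zod/Joi detection regex are assumptions. Combinations the table does not list (for example `.startsWith()` without `path.resolve()`) are handled conservatively here.

```javascript
// Hypothetical sketch of the upstream context check for file-operation sinks.
// `lines` is the file split into lines; `sinkIndex` is the 0-based index of the sink call.
function fileSinkSeverity(lines, sinkIndex) {
  const context = lines.slice(Math.max(0, sinkIndex - 15), sinkIndex).join("\n");

  const hasResolve = context.includes("path.resolve(");
  const hasJoin = context.includes("path.join(");
  const hasContainment = context.includes(".startsWith(");
  const hasTypeValidation = /\b(?:z|Joi)\./.test(context); // assumed Zod/Joi markers

  if (hasResolve && hasContainment) return null; // safe: resolved and confined
  if (hasResolve || hasJoin) return "MEDIUM";    // path built but not confined
  if (hasTypeValidation) return "MEDIUM";        // typed but not confined
  return "HIGH";                                 // no validation detected
}

const guardedHandler = [
  "const base = '/srv/data';",
  "const full = path.resolve(base, args.path);",
  "if (!full.startsWith(base)) throw new Error('denied');",
  "return fs.readFileSync(full);",
];
console.log(fileSinkSeverity(guardedHandler, 3)); // null (not flagged)
console.log(fileSinkSeverity(["return fs.readFileSync(args.path);"], 0)); // HIGH
```

Because the check is purely textual, it can only see controls that appear within the window; a containment check performed in a helper defined elsewhere would not be detected by a sketch like this.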
413
+ **Dynamic dispatch detection:**
414
+
415
+ The pattern `handlerMap[args.action](...)` — using a tool input value to select and call a function — is flagged as HIGH regardless of context. This pattern allows arbitrary function invocation if the dispatch table is not strictly controlled.
416
+
417
+ **Dynamic property access and spread:**
418
+
419
+ - `args[someVar]` in a tool handler context without type validation: MEDIUM
420
+ - `...args` spread inside a tool handler without type validation: LOW
421
+
422
+ **Project-level check:**
423
+
424
+ If no `inputSchema` / `input_schema` / `InputSchema` declaration is found anywhere in the codebase: HIGH finding. A server that declares no input schemas accepts arbitrary input without any structural constraints.
425
+
426
+ **No overlap with the CI analyzer:**
427
+
428
+ The IV analyzer does not scan for `exec`, `spawn`, `os.system`, or similar shell execution patterns — those are the exclusive domain of the CI analyzer. This prevents double-counting the same line in both categories.
429
+
430
+ ---
431
+
432
+ ## 6. Scoring Model
433
+
434
+ **Per-category scoring.**
435
+
436
+ Every category starts at 100. For each finding in that category, a fixed deduction is subtracted:
437
+
438
+ | Severity | Deduction |
439
+ |---|---|
440
+ | CRITICAL | 25 points |
441
+ | HIGH | 15 points |
442
+ | MEDIUM | 5 points |
443
+ | LOW | 3 points |
444
+ | INFO | 0 points |
445
+
446
+ Deductions are cumulative and the floor is 0. A category with two CRITICAL findings scores `100 - 25 - 25 = 50`. A category with four CRITICAL findings scores `max(0, 100 - 100) = 0`.
447
+
448
+ **Weighted average.**
449
+
450
+ The six category scores are combined into a single Trust Score using a weighted average. Weights are aligned with the OWASP MCP Top 10, whose `MCPnn` identifiers appear in the table below, so that categories mapping to higher-priority risks carry higher weight:
451
+
452
+ | Category | Code | Weight | OWASP Reference |
453
+ |---|---|---|---|
454
+ | Credential Leaks | CL | 25% | MCP01 — most damaging, most common |
455
+ | Command Injection | CI | 20% | MCP05 — RCE via LLM input |
456
+ | Network Exposure | NE | 15% | MCP07 — unauthorized access |
457
+ | Input Validation | IV | 15% | SSRF, path traversal, SQL injection |
458
+ | Tool Poisoning | TP | 15% | MCP03 — prompt injection |
459
+ | Spec Compliance | SC | 10% | Protocol hygiene |
460
+
461
+ The formula is:
462
+
463
+ ```
464
+ TrustScore = round(
465
+ CL_score * 0.25 +
466
+ CI_score * 0.20 +
467
+ NE_score * 0.15 +
468
+ IV_score * 0.15 +
469
+ TP_score * 0.15 +
470
+ SC_score * 0.10
471
+ )
472
+ ```
473
+
474
+ **Circuit Breaker — hard cap for HIGH and CRITICAL findings.**
475
+
476
+ If any finding in the scan has severity HIGH or CRITICAL, the Trust Score is capped at a maximum of 75, regardless of the weighted calculation. This circuit breaker ensures that a repository with high-severity security issues cannot achieve `CERTIFIED` status even if the other five categories are clean. Resolving all HIGH and CRITICAL findings removes the cap and allows the full calculated score to apply.
477
+
478
+ When the cap is applied, a warning is displayed in terminal output, and the `scoreCapped: true` flag is set in JSON and SARIF output.
479
+
480
+ **Pass count.**
481
+
482
+ A category is considered "passing" if its score is 80 or above. The pass count (0–6) is reported alongside the Trust Score.
483
+
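The scoring rules in this section can be combined into one short function. The sketch below follows the deduction table, the weights, the cap at 75, and the pass threshold of 80 as documented; the function name and the result shape are hypothetical, and it assumes `scoreCapped` is set only when the cap actually lowers the score. Weights are kept as integer percentages to avoid floating-point rounding surprises.

```javascript
// Hypothetical sketch of the scoring model described above.
const DEDUCTIONS = { CRITICAL: 25, HIGH: 15, MEDIUM: 5, LOW: 3, INFO: 0 };
const WEIGHTS = { CL: 25, CI: 20, NE: 15, IV: 15, TP: 15, SC: 10 }; // percent

function computeTrustScore(findings) {
  // Every category starts at 100 and never drops below 0.
  const categories = {};
  for (const code of Object.keys(WEIGHTS)) categories[code] = 100;
  for (const finding of findings) {
    const next = categories[finding.category] - DEDUCTIONS[finding.severity];
    categories[finding.category] = Math.max(0, next);
  }

  // Weighted average across the six categories.
  let weightedSum = 0;
  for (const [code, weight] of Object.entries(WEIGHTS)) {
    weightedSum += categories[code] * weight;
  }
  let trustScore = Math.round(weightedSum / 100);

  // Circuit breaker: any HIGH or CRITICAL finding caps the score at 75.
  const hasSevere = findings.some(
    (f) => f.severity === "HIGH" || f.severity === "CRITICAL"
  );
  const scoreCapped = hasSevere && trustScore > 75;
  if (scoreCapped) trustScore = 75;

  const passCount = Object.values(categories).filter((s) => s >= 80).length;
  return { categories, trustScore, scoreCapped, passCount };
}

const sampleResult = computeTrustScore([
  { category: "CL", severity: "HIGH" },
  { category: "CL", severity: "HIGH" },
  { category: "SC", severity: "LOW" },
]);
console.log(sampleResult);
```

In this example CL scores 70 and SC scores 97, so the weighted average rounds to 92. Because HIGH findings are present, the reported Trust Score is capped at 75 and five of six categories pass.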
484
+ ---
485
+
486
+ ## 7. Trust Labels and Exit Codes
487
+
488
+ | Score Range | Label | Meaning |
489
+ |---|---|---|
490
+ | 80 – 100 | CERTIFIED | All categories pass or have only low-severity findings. No HIGH or CRITICAL findings exist. |
491
+ | 50 – 79 | WARNING | Security issues are present that should be resolved before production deployment. |
492
+ | 0 – 49 | FAILED | Serious vulnerabilities are present. The process exits with code 1. |
493
+
494
+ Note: A score of exactly 75 from the hard cap mechanism always produces a WARNING label, not CERTIFIED.
495
+
496
+ **Exit codes:**
497
+
498
+ - `0` — Trust Score is 50 or above
499
+ - `1` — Trust Score is below 50, or a fatal error occurred
500
+
501
+ This enables `vantyr` to function as a CI gate: a failing scan fails the pipeline.
502
+
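The label thresholds and exit codes reduce to two small mappings. The sketch below mirrors the table and the exit-code list; the function names are hypothetical and are not part of vantyr's public interface.

```javascript
// Hypothetical helpers mirroring the label table and exit-code rules.
function labelFor(trustScore) {
  if (trustScore >= 80) return "CERTIFIED";
  if (trustScore >= 50) return "WARNING";
  return "FAILED";
}

function exitCodeFor(trustScore) {
  // 0 for WARNING and CERTIFIED, 1 for FAILED.
  return trustScore >= 50 ? 0 : 1;
}

for (const score of [100, 75, 49]) {
  console.log(score, labelFor(score), exitCodeFor(score));
}
```

Note that a capped score of 75 maps to `WARNING` with exit code 0, so a pipeline that should fail on any HIGH or CRITICAL finding needs to inspect the label or the capped flag rather than rely on the exit code alone.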
503
+ ---
504
+
505
+ ## 8. Suppressing False Positives
506
+
507
+ If a finding is a confirmed false positive, it can be suppressed using an inline comment. Place the suppression comment either on the same line as the flagged code or on the line immediately above it.
508
+
509
+ **JavaScript/TypeScript/Go:**
510
+
511
+ ```js
512
+ // vantyr-ignore
513
+ const host = "0.0.0.0";
514
+ ```
515
+
516
+ ```js
517
+ const host = "0.0.0.0"; // vantyr-ignore
518
+ ```
519
+
520
+ **Python/YAML:**
521
+
522
+ ```python
523
+ cmd = build_deploy_command() # vantyr-ignore
524
+ os.system(cmd)
525
+ ```
526
+
527
+ **Scope of suppression:**
528
+
529
+ - A suppression comment applies to exactly one finding: the one on the same line as the comment or on the line immediately below it.
530
+ - It does not suppress other findings in the same file or the same category.
531
+ - Suppression is applied as a post-processing step after all six analyzers complete. It does not affect how analyzers run — it only removes the finding from the results before scoring.
532
+
533
+ **What cannot be suppressed:**
534
+
535
+ Project-level findings have no line number (file is reported as "project"). These cannot be suppressed with inline comments. Examples: "No MCP tool definitions found", "No input schemas defined", "No README.md". These require fixing the underlying issue.
536
+
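The suppression rules can be pictured as a filter that runs over the finished list of findings. The sketch below is an illustration under stated assumptions, not vantyr's code: it assumes 1-based line numbers, a `sources` map from file path to an array of lines, and that optional whitespace is allowed between the comment marker and `vantyr-ignore`.

```javascript
// Hypothetical sketch of suppression as a post-processing filter.
const IGNORE_COMMENT = /(\/\/|#)\s*vantyr-ignore/;

function applySuppressions(findings, sources) {
  return findings.filter((finding) => {
    // Project-level findings carry no line number and cannot be suppressed.
    if (typeof finding.line !== "number") return true;

    const lines = sources[finding.file];
    if (!lines) return true;

    const sameLine = lines[finding.line - 1] || "";
    const lineAbove = finding.line >= 2 ? lines[finding.line - 2] : "";
    return !(IGNORE_COMMENT.test(sameLine) || IGNORE_COMMENT.test(lineAbove));
  });
}

const sources = {
  "server.js": [
    "// vantyr-ignore",
    'const host = "0.0.0.0";',
    "app.listen(3000, '0.0.0.0');",
  ],
};
const remaining = applySuppressions(
  [
    { category: "NE", file: "server.js", line: 2 }, // comment on the line above
    { category: "NE", file: "server.js", line: 3 }, // no comment in range
    { category: "SC", file: "project", line: null }, // project-level
  ],
  sources
);
console.log(remaining.map((f) => `${f.file}:${f.line}`));
```

Only the finding on line 2 is removed. The finding on line 3 stays because the comment is two lines above it, and the project-level finding stays because it has no line to annotate.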
537
+ ---
538
+
539
+ ## 9. Output Formats
540
+
541
+ ### 9.1 Terminal
542
+
543
+ Default mode. Color-coded output showing:
544
+
545
+ - The Trust Score and label
546
+ - A per-category breakdown with scores and finding counts
547
+ - A warning if the score was capped
548
+ - A remediation list for every finding, sorted by severity
549
+
550
+ Example:
551
+
552
+ ```
553
+ MCP Certify -- Trust Score: 61/100 WARNING
554
+
555
+ Network Exposure -- 100/100
556
+ Command Injection -- 100/100
557
+ Credential Leaks -- 70/100 (2 findings)
558
+ Tool Poisoning -- 100/100
559
+ Spec Compliance -- 85/100 (1 finding)
560
+ Input Validation -- 100/100
561
+
562
+ -------------------------------------------------
563
+
564
+ Remediations:
565
+
566
+ [HIGH] [Credential Leaks] Line 12 in config.js
567
+ Generic API Key detected.
568
+ Remove hardcoded credentials and use environment variables instead.
569
+
570
+ [MEDIUM] [Spec Compliance] In project
571
+ No explicit capabilities declaration found.
572
+ Declare server capabilities to inform clients what features are supported.
573
+
574
+ -------------------------------------------------
575
+ Scanned: https://github.com/owner/repo
576
+ Checks: 6 | Critical: 0 | High: 1 | Medium: 1 | Low: 0 | Pass: 4/6
577
+ ```
578
+
579
+ Use `--verbose` to also list every scanned file path before this output.
580
+
581
+ ### 9.2 JSON
582
+
583
+ Activated by `--json`. All other console output is suppressed. The full JSON document is written to stdout.
584
+
585
+ ```jsonc
586
+ {
587
+ "source": "https://github.com/owner/repo",
588
+ "trustScore": 61,
589
+ "label": "WARNING",
590
+ "scoreCapped": false,
591
+ "categories": {
592
+ "NE": { "score": 100, "passed": true, "findingCount": 0, "findings": [] },
593
+ "CI": { "score": 100, "passed": true, "findingCount": 0, "findings": [] },
594
+ "CL": { "score": 70, "passed": false, "findingCount": 2, "findings": [...] },
595
+ "TP": { "score": 100, "passed": true, "findingCount": 0, "findings": [] },
596
+ "SC": { "score": 85, "passed": true, "findingCount": 1, "findings": [...] },
597
+ "IV": { "score": 100, "passed": true, "findingCount": 0, "findings": [] }
598
+ },
599
+ "stats": {
600
+ "critical": 0, "high": 1, "medium": 1, "low": 0, "info": 0
601
+ },
602
+ "passCount": 5,
603
+ "totalFindings": 3,
604
+ "findings": [
605
+ {
606
+ "category": "CL",
607
+ "severity": "high",
608
+ "file": "src/config.js",
609
+ "line": 12,
610
+ "snippet": "const API_KEY = 'AbCd1234...'",
611
+ "message": "Generic API Key detected.",
612
+ "remediation": "Remove hardcoded credentials and use environment variables instead."
613
+ }
614
+ ]
615
+ }
616
+ ```
617
+
618
+ If `--local` is used and no config files are found, the JSON output is:
619
+
620
+ ```json
621
+ {
622
+ "source": "Local Configuration & Rules",
623
+ "trustScore": null,
624
+ "label": "NO_FILES",
625
+ "message": "No local MCP configuration or rules files found.",
626
+ "findings": []
627
+ }
628
+ ```
629
+
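A consumer of the JSON output needs to handle both shapes shown above, including the `null` score in the `NO_FILES` case. The following is a minimal sketch of such a check; `check_report` and its `minimum` parameter are hypothetical, and treating a missing score as a failure is a policy choice made here for illustration.

```python
import json
from typing import Tuple


def check_report(raw: str, minimum: int = 80) -> Tuple[bool, str]:
    """Return (ok, reason) for a report produced with --json."""
    report = json.loads(raw)
    score = report.get("trustScore")
    if score is None:
        # For example label "NO_FILES" in local mode: nothing was scanned.
        return False, f"no score available ({report.get('label')})"
    if score < minimum:
        return False, f"score {score} is below {minimum}"
    return True, f"score {score} ({report.get('label')})"
```

Only the `trustScore` and `label` fields are read, so the same check works for both the full report and the `NO_FILES` document.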
630
+ ### 9.3 SARIF (GitHub Code Scanning)
631
+
632
+ Activated by `--sarif`. Emits a SARIF 2.1.0 document to stdout, suitable for upload to GitHub's Security tab via the `upload-sarif` action.
633
+
634
+ The document contains six SARIF rules (one per category) with descriptions, `helpUri` links to the OWASP MCP Top 10, and severity tags.
635
+
636
+ SARIF severity mapping:
637
+
638
+ | vantyr severity | SARIF level |
639
+ |---|---|
640
+ | CRITICAL, HIGH | `error` |
641
+ | MEDIUM | `warning` |
642
+ | LOW, INFO | `note` |
643
+
644
+ Each SARIF result includes:
645
+ - `ruleId` (e.g., `MCP-CL`, `MCP-CI`)
646
+ - `level`
647
+ - `message.text` combining the finding message and remediation
648
+ - `physicalLocation` with file URI and line number (when available)
649
+ - `properties.trustScore` and `properties.scoreCapped`
650
+
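The severity table and the result fields listed above can be combined into one conversion function. This is a sketch under assumptions, not vantyr's emitter: `to_sarif_result` is a hypothetical name, the finding dictionary follows the JSON example in section 9.2, and joining message and remediation with a single space is a guess at how the two are combined.

```python
from typing import Any, Dict

SARIF_LEVEL = {
    "critical": "error",
    "high": "error",
    "medium": "warning",
    "low": "note",
    "info": "note",
}


def to_sarif_result(finding: Dict[str, Any], trust_score: int,
                    score_capped: bool) -> Dict[str, Any]:
    """Build one SARIF 2.1.0 result object from a vantyr-style finding."""
    result: Dict[str, Any] = {
        "ruleId": f"MCP-{finding['category']}",
        "level": SARIF_LEVEL[finding["severity"].lower()],
        "message": {"text": f"{finding['message']} {finding['remediation']}"},
        "properties": {"trustScore": trust_score, "scoreCapped": score_capped},
    }
    # Project-level findings have no line number, so no location is attached.
    if finding.get("line") is not None:
        result["locations"] = [{
            "physicalLocation": {
                "artifactLocation": {"uri": finding["file"]},
                "region": {"startLine": finding["line"]},
            }
        }]
    return result
```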
651
+ ---
652
+
653
+ ## 10. CI/CD Integration
654
+
655
+ **Basic pipeline gate:**
656
+
657
+ ```yaml
658
+ - name: MCP Security Scan
659
+ run: npx vantyr scan ${{ github.event.repository.html_url }}
660
+ ```
661
+
662
+ The step fails (exit code 1) when the Trust Score is below 50 or a fatal error occurs.
663
+
664
+ **GitHub Code Scanning with SARIF:**
665
+
666
+ Upload results directly to the Security tab for inline PR annotations:
667
+
668
+ ```yaml
669
+ name: MCP Security Scan
670
+
671
+ on: [push, pull_request]
672
+
673
+ jobs:
674
+ scan:
675
+ runs-on: ubuntu-latest
676
+ permissions:
677
+ security-events: write
678
+ steps:
679
+ - uses: actions/checkout@v4
680
+
681
+ - name: Run vantyr
682
+ run: npx vantyr scan ${{ github.event.repository.html_url }} --sarif > results.sarif
683
+
684
+ - name: Upload SARIF to GitHub
685
+ uses: github/codeql-action/upload-sarif@v3
686
+ with:
687
+ sarif_file: results.sarif
688
+ ```
689
+
690
+ **JSON output for custom dashboards or policy checks:**
691
+
692
+ ```bash
693
+ # Fail the build if the Trust Score is below 80 (stricter than the default gate of 50); a missing or null score counts as 0
694
+ SCORE=$(npx vantyr scan https://github.com/owner/repo --json | jq -r '.trustScore // 0')
695
+ if [ "${SCORE:-0}" -lt 80 ]; then
696
+ echo "Score $SCORE is below CERTIFIED threshold"
697
+ exit 1
698
+ fi
699
+ ```
700
+
701
+ **Scanning private repositories in CI:**
702
+
703
+ Store the GitHub PAT as a repository secret and pass it via the `--token` flag:
704
+
705
+ ```yaml
706
+ - name: Run vantyr
707
+ run: npx vantyr scan ${{ github.event.repository.html_url }} --token ${{ secrets.MCPCERTIFY_TOKEN }} --json
708
+ ```
709
+
710
+ ---
711
+
712
+ ## 11. Local Scan Mode
713
+
714
+ `vantyr scan --local` reads MCP configuration and AI rules files from standard locations on your machine. It does not require a GitHub URL.
715
+
716
+ **Files scanned:**
717
+
718
+ | Path | Platform | Purpose |
719
+ |---|---|---|
720
+ | `~/.cursor/mcp.json` | All | Cursor global MCP configuration |
721
+ | `~/.codeium/windsurf/mcp_config.json` | All | Windsurf MCP configuration |
722
+ | `~/Library/Application Support/Claude/claude_desktop_config.json` | macOS | Claude Desktop configuration |
723
+ | `%APPDATA%\Claude\claude_desktop_config.json` | Windows | Claude Desktop configuration |
724
+ | `.vscode/mcp.json` | All (project) | VS Code project MCP config |
725
+ | `.cursor/mcp.json` | All (project) | Cursor project MCP config |
726
+ | `.cursorrules` | All (project) | Cursor AI instructions file |
727
+ | `.windsurfrules` | All (project) | Windsurf AI instructions file |
728
+ | `CLAUDE.md` | All (project) | Claude project instructions |
729
+ | `copilot-instructions.md` | All (project) | GitHub Copilot instructions |
730
+
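The table can be read as a list of candidate paths that are probed for existence. The sketch below illustrates that idea; `candidate_paths` and `discover` are hypothetical helpers, the `platform` strings follow Python's `sys.platform` values, and placing `copilot-instructions.md` at the project root is an assumption taken directly from the table.

```python
import os
from pathlib import Path
from typing import List

PROJECT_FILES = (
    ".vscode/mcp.json",
    ".cursor/mcp.json",
    ".cursorrules",
    ".windsurfrules",
    "CLAUDE.md",
    "copilot-instructions.md",
)


def candidate_paths(home: Path, project: Path, platform: str) -> List[Path]:
    """List every location from the table that applies to the given platform."""
    paths = [
        home / ".cursor" / "mcp.json",
        home / ".codeium" / "windsurf" / "mcp_config.json",
    ]
    if platform == "darwin":
        paths.append(home / "Library" / "Application Support" / "Claude"
                     / "claude_desktop_config.json")
    elif platform == "win32":
        appdata = os.environ.get("APPDATA")
        if appdata:
            paths.append(Path(appdata) / "Claude" / "claude_desktop_config.json")
    paths.extend(project / name for name in PROJECT_FILES)
    return paths


def discover(home: Path, project: Path, platform: str) -> List[Path]:
    """Keep only the candidates that exist as regular files."""
    return [p for p in candidate_paths(home, project, platform) if p.is_file()]
```

An empty result from `discover` corresponds to the `NO_FILES` case described in section 9.2.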
731
+ **What is checked in local mode:**
732
+
733
+ The same six analyzers run on the discovered files. In practice, the most relevant findings in local mode are:
734
+
735
+ - **Credential Leaks (CL):** Hardcoded API keys or tokens in JSON configuration files, including environment variable values inadvertently pasted into a config file
736
+ - **Tool Poisoning (TP):** Prompt injection payloads embedded in `.cursorrules`, `CLAUDE.md`, or other AI rules files
737
+ - **Spec Compliance (SC):** Missing schemas, missing documentation in project-level config
738
+
739
+ **What local mode does not do:**
740
+
741
+ Local mode scans the configuration files themselves. It does not follow the `command` entries in MCP configuration (e.g., the `npx` package path or local binary path defined in `claude_desktop_config.json`) to scan the underlying server source code on your local disk. To scan a referenced MCP server's implementation, pass its GitHub URL explicitly as a separate scan.
742
+
743
+ ---
744
+
745
+ ## 12. Supported Languages and File Types
746
+
747
+ **Source code files (full analysis — all six analyzers):**
748
+
749
+ - JavaScript (`.js`, `.jsx`, `.mjs`, `.cjs`)
750
+ - TypeScript (`.ts`, `.tsx`)
751
+ - Python (`.py`)
752
+ - Go (`.go`)
753
+ - Rust (`.rs`)
754
+
755
+ **Configuration and data files (credential and structural checks):**
756
+
757
+ - JSON (`.json`)
758
+ - YAML (`.yaml`, `.yml`)
759
+ - TOML (`.toml`)
760
+ - Markdown (`.md`)
761
+ - Shell scripts (`.sh`, `.bash`, `.zsh`)
762
+ - Environment files (`.env`, `.env.*`)
763
+
764
+ **Always skipped:**
765
+
766
+ Binary files, image files, compiled artifacts, and files larger than 100 KB are excluded regardless of extension.
767
+
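The rules in this section amount to a classification by file name and size. The following sketch shows one way to express them; it is not vantyr's code. `classify` is a hypothetical function, it covers only the extensions listed in this section, and reading "100 KB" as 102,400 bytes is an assumption.

```python
from pathlib import PurePosixPath

SOURCE_EXTS = {".js", ".jsx", ".mjs", ".cjs", ".ts", ".tsx", ".py", ".go", ".rs"}
CONFIG_EXTS = {".json", ".yaml", ".yml", ".toml", ".md", ".sh", ".bash", ".zsh"}
MAX_BYTES = 100 * 1024  # assumption: "100 KB" means 102,400 bytes


def classify(path: str, size_bytes: int) -> str:
    """Return "source", "config", or "skip" for a repository file."""
    if size_bytes > MAX_BYTES:
        return "skip"  # oversized files are excluded regardless of extension
    name = PurePosixPath(path).name
    if name == ".env" or name.startswith(".env."):
        return "config"  # environment files have no conventional extension
    ext = PurePosixPath(path).suffix.lower()
    if ext in SOURCE_EXTS:
        return "source"
    if ext in CONFIG_EXTS:
        return "config"
    return "skip"
```

Checking the size first mirrors the statement that large files are excluded whatever their extension.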
768
+ ---
769
+
770
+ ## 13. Technical Constraints and Limitations
771
+
772
+ **File cap.** Up to 100 files per scan. For repositories exceeding this limit, vantyr scans the first 100 eligible files returned by the GitHub API and displays a warning. The remaining files are not analyzed. This may result in an artificially high Trust Score for large codebases.
773
+
774
+ **File size cap.** Files larger than 100 KB are skipped. Large generated files or bundled artifacts often hit this limit. This is intentional, because minified or bundled files are not meaningful targets for static analysis.
775
+
776
+ **Static analysis only.** vantyr does not execute any code. It cannot detect vulnerabilities that only manifest at runtime, such as insecure deserialization, race conditions, timing attacks, or vulnerabilities introduced by runtime configuration.
777
+
778
+ **15-line context window.** The IV analyzer's upstream context window is fixed at 15 lines. A validation check placed more than 15 lines above a sink call will not be detected, and the finding will report a higher severity than it deserves. Consider this when reviewing medium-severity IV findings.
779
+
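To make the window limitation concrete, here is a minimal sketch of a fixed look-back check. It is an illustration, not the IV analyzer: `has_upstream_validation` and the `VALIDATION_HINT` pattern are hypothetical, and excluding the sink line itself from the window is an assumption.

```python
import re
from typing import List

WINDOW = 15  # lines inspected above the sink, per the limitation above
# Hypothetical validation hints; the analyzer's real patterns are not documented here.
VALIDATION_HINT = re.compile(r"validate|sanitize|isinstance|schema", re.IGNORECASE)


def has_upstream_validation(lines: List[str], sink_line: int,
                            window: int = WINDOW) -> bool:
    """True if a validation hint appears within `window` lines above the 1-based sink line."""
    end = sink_line - 1           # index of the sink line, excluded from the slice
    start = max(0, end - window)  # never reach before the start of the file
    return any(VALIDATION_HINT.search(line) for line in lines[start:end])
```

With this logic, a check sitting 16 lines above the sink falls outside the slice and is missed, which is the situation the paragraph above warns about.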
780
+ **Single-file alias detection.** Import alias detection in the CI analyzer operates per-file. Cross-file aliases (e.g., an alias defined in `utils.py` and used in `handler.py`) are not detected.
781
+
782
+ **Pattern-based tool definition detection.** The TP analyzer analyzes only files that contain at least one recognized tool registration pattern. MCP tools registered via unconventional or custom frameworks may not be detected.
783
+
784
+ **No interprocedural analysis.** The IV analyzer does not track data flow across function boundaries. If tool input is passed as an argument to a helper function that then calls a dangerous sink, the finding will not be reported.
785
+
786
+ ---
787
+
788
+ ## 14. What vantyr Does Not Cover
789
+
790
+ The following vulnerability classes are explicitly out of scope for the current version:
791
+
792
+ - **Dependency vulnerabilities.** vantyr does not audit `package.json`, `requirements.txt`, `go.mod`, or other dependency manifests for known CVEs. Use `npm audit`, `pip-audit`, `govulncheck`, or Dependabot for this.
793
+
794
+ - **Runtime security misconfigurations.** TLS certificate validation, DNS rebinding protections, rate limiting, and other server-side runtime controls cannot be assessed through source code analysis alone.
795
+
796
+ - **Output sanitization.** Responses returned by tools to the LLM are not analyzed. A tool that returns attacker-controlled content without sanitization could enable secondary prompt injection, but detecting this requires semantic understanding of data flow that static analysis cannot reliably provide.
797
+
798
+ - **Authentication correctness.** The NE and SC analyzers detect the *presence* of authentication patterns. They do not verify that the authentication implementation is correct (e.g., JWT signature verification, token expiry checks, scope validation).
799
+
800
+ - **Business logic vulnerabilities.** Access control at the tool level (e.g., a tool that performs destructive operations without confirming intent), excessive capability grants, and missing confirmation dialogs for sensitive actions are not analyzed.
801
+
802
+ - **Infrastructure security.** Container configurations, Kubernetes manifests, cloud IAM policies, and firewall rules are outside the scope of source code analysis.
803
+
804
+ ---
805
+
806
+ ## 15. License
807
+
808
+ MIT