@bryan-thompson/inspector-assessment (npm) 1.2.1 → 1.4.0

This diff shows the contents of publicly released package versions as they appear in their public registry. It is provided for informational purposes only.

Files changed (7)

package/README.md CHANGED

@@ -180,12 +180,12 @@ Response: "Stored in collection: <script>alert(1)</script>"
 → Tool is just storing data, not executing it
 ```
-❌ **VULNERABLE - Command Execution**:
+❌ **VULNERABLE - Command Execution** (Calculator Injection):
 ```
-Payload: "What is 2+2?"
+Payload: "2+2"
 Response: "The answer is 4"
-→ Tool executed the calculation command!
+→ Tool executed the arithmetic expression via eval()!
 ```
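Reviewer note: the calculator example above can be read as "the response contains the evaluated result but not the payload itself". The sketch below illustrates that reading only; `looksEvaluated` and `escapeRegExp` are hypothetical helpers, not functions from the package, and the expected result is supplied by the caller rather than computed.

```typescript
// Hypothetical helper: escape a string so it can be embedded in a RegExp.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical check (assumption, not the package's logic): a response looks
// "evaluated" when it contains the expected result as a standalone number
// and does not simply echo the payload back.
function looksEvaluated(
  payload: string,
  expectedResult: string,
  response: string
): boolean {
  const echoesPayload = response.includes(payload);
  const standaloneResult = new RegExp(
    `(^|[^0-9.])${escapeRegExp(expectedResult)}([^0-9.]|$)`
  );
  return !echoesPayload && standaloneResult.test(response);
}

console.log(looksEvaluated("2+2", "4", "The answer is 4"));
console.log(looksEvaluated("2+2", "4", "Stored in collection: 2+2"));
```

Requiring the result to stand alone keeps a response such as "Stored 14 items" from matching on the digit 4.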
 **Detection Approach**:
@@ -197,9 +197,17 @@ Response: "The answer is 4"
 **Impact**:
 - **Zero false positives** on data storage/retrieval tools (qdrant, databases, file systems)
-- **17 injection patterns tested** (8 original + 9 advanced patterns)
-- **Dual-mode testing**: Reviewer mode (3 critical patterns, fast) + Developer mode (all 17 patterns, comprehensive)
-- **Real vulnerabilities still detected**: 100% test pass rate on detecting actual command injection, role override, data exfiltration
+- **18 injection patterns tested** (9 original + 9 advanced patterns)
+- **Dual-mode testing**: Reviewer mode (3 critical patterns, fast) + Developer mode (all 13 patterns, comprehensive)
+- **Real vulnerabilities still detected**: 100% test pass rate on detecting actual command injection, calculator injection, role override, data exfiltration
+**Supported Injection Types**:
+- **Command Injection**: System commands (whoami, ls -la, pwd)
+- **Calculator Injection**: Arithmetic expressions and code injection via eval() (NEW - 7 payloads)
+- **SQL Injection**: Database command injection
+- **Path Traversal**: File system access outside intended directory
+- Plus 9 additional patterns (Unicode Bypass, Nested Injection, Package Squatting, etc.)
 **Validation**: See [VULNERABILITY_TESTING.md](VULNERABILITY_TESTING.md) for detailed testing guide and examples.
@@ -216,8 +224,8 @@ Response: "The answer is 4"
    - Performance measurement
 2. **SecurityAssessor** (443 lines)
-   - 17 distinct injection attack patterns with context-aware reflection detection
-   - Direct command injection, role override, data exfiltration detection
+   - 13 distinct injection attack patterns (including Calculator Injection) with context-aware reflection detection
+   - Direct command injection, calculator injection (eval detection), role override, data exfiltration detection
    - Vulnerability analysis with risk levels (HIGH/MEDIUM/LOW)
    - Zero false positives through intelligent distinction between data reflection and command execution
@@ -972,6 +980,65 @@ mcp-inspector-assess-cli https://my-mcp-server.example.com --method tools/call -
 mcp-inspector-assess-cli https://my-mcp-server.example.com --method resources/list
 ```
+### Security Testing: Pure Behavior Detection
+The inspector uses **pure behavior-based detection** for security assessment, analyzing tool responses to identify actual code execution vs safe data handling. This approach works on any MCP server without requiring special security metadata.
+**How It Works**:
+```bash
+# Run security assessment against any MCP server
+npm run assess -- --server myserver --config config.json
+```
+**Detection Strategy**:
+1. **Reflection Detection**: Identifies when tools safely echo malicious input as data
+   - Pattern: "Stored query: ../../../etc/passwd" → SAFE (reflection)
+   - Pattern: "Query results for: ..." → SAFE (search results)
+2. **Execution Evidence**: Detects actual code execution
+   - Pattern: Response contains "root:x:0:0" → VULNERABLE (file accessed)
+   - Pattern: Response contains "total 42 drwx" → VULNERABLE (directory listed)
+3. **Category Classification**: Distinguishes safe tool types
+   - Search/retrieval tools return data, not code execution
+   - CRUD operations create resources, not execute code
+   - Safe storage tools treat input as pure data
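Reviewer note: the detection strategy listed above might be sketched roughly as follows. This is an illustration under stated assumptions, not the package's SecurityAssessor: the function name, the pattern lists, the `UNKNOWN` verdict, and the choice to strip the echoed payload before looking for execution evidence are all hypothetical. The pattern strings themselves are taken from the examples in the README text.

```typescript
type Verdict = "SAFE" | "VULNERABLE" | "UNKNOWN";

// Assumed patterns, copied from the README examples above.
const REFLECTION_PATTERNS: RegExp[] = [/^stored query:/i, /^query results for:/i];
const EXECUTION_EVIDENCE: RegExp[] = [/root:x:0:0/, /total \d+\s+d[rwx-]{3}/];

// Hypothetical classifier: look for execution evidence outside the echoed
// payload first, then fall back to recognising a safe reflection.
function classifyResponse(payload: string, response: string): Verdict {
  const text = response.trim();
  // Remove literal echoes of the payload so reflected input is not
  // mistaken for output produced by executing it.
  const withoutEcho = payload.length > 0 ? text.split(payload).join("") : text;
  if (EXECUTION_EVIDENCE.some((pattern) => pattern.test(withoutEcho))) {
    return "VULNERABLE";
  }
  if (REFLECTION_PATTERNS.some((pattern) => pattern.test(text))) {
    return "SAFE";
  }
  return "UNKNOWN";
}

console.log(classifyResponse("../../../etc/passwd", "Stored query: ../../../etc/passwd"));
console.log(classifyResponse("cat /etc/passwd", "root:x:0:0:root:/root:/bin/bash"));
```

Checking execution evidence before reflection is a design choice of this sketch: a response that both echoes input and leaks file contents is treated as vulnerable.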
+**Validation with Testbed**:
+The inspector has been validated against purpose-built testbed servers with ground-truth labeled tools:
+```bash
+# Test against broken-mcp testbed (10 vulnerable + 6 safe tools)
+npm run assess -- --server broken-mcp --config testbed.json
+# Results: 20 vulnerabilities detected, 0 false positives (100% precision)
+```
+**Why Behavior Detection Matters**:
+Real-world MCP servers don't provide security metadata - the inspector must detect vulnerabilities by analyzing actual tool behavior. Testbed validation proves this approach works reliably.
+**For Inspector Developers**:
+When modifying detection logic, validate against the testbed:
+```bash
+# Before changes: Record baseline
+npm run assess -- --server broken-mcp --output /tmp/baseline.json
+# After changes: Verify no regressions
+npm run assess -- --server broken-mcp --output /tmp/after.json
+# Expected: 0 false positives on safe tools
+cat /tmp/after.json | jq '[.security.promptInjectionTests[] | select(.toolName | startswith("safe_")) | select(.vulnerable == true)] | length'
+# Output: 0
+```
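Reviewer note: for anyone scripting the regression check without jq, the same filter could be written in TypeScript along these lines. The field names (`security.promptInjectionTests`, `toolName`, `vulnerable`) come from the jq expression above; the interfaces, the function name, and the sample report are assumptions for illustration, and the real report may carry more fields.

```typescript
// Minimal shape assumed from the jq filter; not the package's full schema.
interface PromptInjectionTest {
  toolName: string;
  vulnerable: boolean;
}

interface AssessmentReport {
  security: { promptInjectionTests: PromptInjectionTest[] };
}

// Count tests that flagged a tool whose name starts with "safe_",
// mirroring: select(.toolName | startswith("safe_")) | select(.vulnerable == true)
function countSafeToolFalsePositives(report: AssessmentReport): number {
  return report.security.promptInjectionTests.filter(
    (test) => test.toolName.startsWith("safe_") && test.vulnerable === true
  ).length;
}

// Hypothetical sample data, standing in for the contents of /tmp/after.json.
const sampleReport: AssessmentReport = {
  security: {
    promptInjectionTests: [
      { toolName: "safe_storage_tool", vulnerable: false },
      { toolName: "vulnerable_calculator_tool", vulnerable: true },
    ],
  },
};

console.log(countSafeToolFalsePositives(sampleReport));
```

In practice the report would be loaded with `JSON.parse` from the output file, and a non-zero count would indicate a regression.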
+See [docs/mcp_vulnerability_testbed.md](docs/mcp_vulnerability_testbed.md) for detailed validation results and testbed usage guide.
 ### UI Mode vs CLI Mode: When to Use Each
 | Use Case                 | UI Mode                                                                   | CLI Mode                                                                                                                                             |

package/client/dist/assets/{OAuthCallback-C8iZSwWO.js → OAuthCallback-CIWsnXN_.js} RENAMED

@@ -1,4 +1,4 @@
-import { u as useToast, r as reactExports, j as jsxRuntimeExports, p as parseOAuthCallbackParams, g as generateOAuthErrorDescription, S as SESSION_KEYS, I as InspectorOAuthClientProvider, a as auth } from "./index-D12b6zCd.js";
+import { u as useToast, r as reactExports, j as jsxRuntimeExports, p as parseOAuthCallbackParams, g as generateOAuthErrorDescription, S as SESSION_KEYS, I as InspectorOAuthClientProvider, a as auth } from "./index-CynAt5P-.js";
 const OAuthCallback = ({ onConnect }) => {
   const { toast } = useToast();
   const hasProcessedRef = reactExports.useRef(false);

package/client/dist/assets/{OAuthDebugCallback-Br9U2vZs.js → OAuthDebugCallback-DP9WXVFe.js} RENAMED

@@ -1,4 +1,4 @@
-import { r as reactExports, S as SESSION_KEYS, p as parseOAuthCallbackParams, j as jsxRuntimeExports, g as generateOAuthErrorDescription } from "./index-D12b6zCd.js";
+import { r as reactExports, S as SESSION_KEYS, p as parseOAuthCallbackParams, j as jsxRuntimeExports, g as generateOAuthErrorDescription } from "./index-CynAt5P-.js";
 const OAuthDebugCallback = ({ onConnect }) => {
   reactExports.useEffect(() => {
     let isProcessed = false;