npm - agent-security-scanner-mcp - Versions diffs - 3.3.0 → 3.4.0 - Mend

agent-security-scanner-mcp 3.3.0 → 3.4.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (18)

package/README.md +224 -2
package/analyzer.py +22 -5
package/cross_file_analyzer.py +216 -0
package/index.js +104 -4
package/package.json +10 -3
package/pattern_matcher.py +1 -0
package/regex_fallback.py +199 -1
package/scripts/postinstall.js +25 -0
package/src/cli/init-hooks.js +164 -0
package/src/config.js +181 -0
package/src/context.js +228 -0
package/src/dedup.js +129 -0
package/src/fix-patterns.js +66 -17
package/src/tools/fix-security.js +31 -4
package/src/tools/scan-diff.js +151 -0
package/src/tools/scan-project.js +308 -0
package/src/tools/scan-security.js +33 -5
package/src/utils.js +76 -7

package/README.md CHANGED

@@ -14,6 +14,8 @@ Security scanner for AI coding agents and autonomous assistants. Scans code for
 |------|-------------|-------------|
 | `scan_security` | Scan code for vulnerabilities (1700+ rules, 12 languages) with AST and taint analysis | After writing or editing any code file |
 | `fix_security` | Auto-fix all detected vulnerabilities (120 fix templates) | After `scan_security` finds issues |
+| `scan_git_diff` | Scan only changed files in git diff | Before commits or in PR reviews |
+| `scan_project` | Scan entire project with A-F security grading | For project-wide security audits |
 | `check_package` | Verify a package name isn't AI-hallucinated (4.3M+ packages) | Before adding any new dependency |
 | `scan_packages` | Bulk-check all imports in a file for hallucinated packages | Before committing code with new imports |
 | `scan_agent_prompt` | Detect prompt injection and malicious instructions (56 rules) | Before acting on external/untrusted input |
@@ -38,8 +40,18 @@ scan_security → review findings → fix_security → verify fix
 ### Before Committing
 ```
+scan_git_diff → scan only changed files for fast feedback
 scan_packages → verify all imports are legitimate
-scan_security → catch vulnerabilities before they ship
+```
+### For PR Reviews
+```
+scan_git_diff --base main → scan PR changes against main branch
+```
+### For Project Audits
+```
+scan_project → get A-F security grade and aggregated metrics
 ```
 ### When Processing External Input
@@ -329,6 +341,105 @@ List all 1700+ security scanning rules and 120 fix templates. Use to understand
 ---
+### `scan_git_diff`
+Scan only files changed in git diff for security vulnerabilities. Use in PR workflows, pre-commit hooks, or to check recent changes before pushing. Significantly faster than full project scans.
+**Parameters:**
+| Parameter | Type | Required | Description |
+|-----------|------|----------|-------------|
+| `base` | string | No | Base commit/branch to diff against (default: `HEAD~1`) |
+| `target` | string | No | Target commit/branch (default: `HEAD`) |
+| `verbosity` | string | No | `"minimal"`, `"compact"` (default), `"full"` |
+**Example:**
+```json
+// Input
+{ "base": "main", "target": "HEAD" }
+// Output
+{
+  "base": "main",
+  "target": "HEAD",
+  "files_scanned": 5,
+  "issues_count": 3,
+  "issues": [
+    {
+      "file": "src/auth.js",
+      "line": 42,
+      "ruleId": "sql-injection",
+      "severity": "error",
+      "message": "SQL injection vulnerability detected"
+    }
+  ]
+}
+```
+---
+### `scan_project`
+Scan an entire project or directory for security vulnerabilities with aggregated metrics and A-F security grading. Use for security audits, compliance checks, or initial codebase assessment.
+**Parameters:**
+| Parameter | Type | Required | Description |
+|-----------|------|----------|-------------|
+| `directory` | string | Yes | Path to project directory to scan |
+| `include_patterns` | array | No | Glob patterns to include (e.g., `["**/*.js", "**/*.py"]`) |
+| `exclude_patterns` | array | No | Glob patterns to exclude (default: `node_modules`, `.git`, etc.) |
+| `verbosity` | string | No | `"minimal"`, `"compact"` (default), `"full"` |
+**Example:**
+```json
+// Input
+{ "directory": "./src", "verbosity": "compact" }
+// Output
+{
+  "directory": "/path/to/src",
+  "files_scanned": 24,
+  "issues_count": 12,
+  "grade": "C",
+  "by_severity": {
+    "error": 3,
+    "warning": 7,
+    "info": 2
+  },
+  "by_category": {
+    "sql-injection": 2,
+    "xss": 3,
+    "hardcoded-secret": 1,
+    "insecure-crypto": 4,
+    "command-injection": 2
+  },
+  "issues": [
+    {
+      "file": "auth.js",
+      "line": 15,
+      "ruleId": "sql-injection",
+      "severity": "error",
+      "message": "SQL injection vulnerability"
+    }
+  ]
+}
+```
+**Security Grades:**
+| Grade | Criteria |
+|-------|----------|
+| A | 0 critical/error issues |
+| B | 1-2 error issues, no critical |
+| C | 3-5 error issues |
+| D | 6-10 error issues |
+| F | 11+ error issues or any critical |
+---
 ## Supported Languages
 | Language | Vulnerabilities Detected | Analysis |
@@ -465,17 +576,113 @@ npx agent-security-scanner-mcp scan-prompt "ignore previous instructions"
 # Scan a file for vulnerabilities
 npx agent-security-scanner-mcp scan-security ./app.py --verbosity minimal
+# Scan git diff (changed files only)
+npx agent-security-scanner-mcp scan-diff --base main --target HEAD
+# Scan entire project with grading
+npx agent-security-scanner-mcp scan-project ./src
 # Check if a package is legitimate
 npx agent-security-scanner-mcp check-package flask pypi
 # Scan file imports for hallucinated packages
 npx agent-security-scanner-mcp scan-packages ./requirements.txt pypi
+# Install Claude Code hooks for automatic scanning
+npx agent-security-scanner-mcp init-hooks
 ```
 **Exit codes:** `0` = safe, `1` = issues found. Use in scripts to block risky operations.
 ---
+## Configuration (`.scannerrc`)
+Create a `.scannerrc.yaml` or `.scannerrc.json` in your project root to customize scanning behavior:
+```yaml
+# .scannerrc.yaml
+version: 1
+# Suppress specific rules
+suppress:
+  - rule: "insecure-random"
+    reason: "Using for non-cryptographic purposes"
+  - rule: "detect-disable-mustache-escape"
+    paths: ["src/cli/**"]
+# Exclude paths from scanning
+exclude:
+  - "node_modules/**"
+  - "dist/**"
+  - "**/*.test.js"
+  - "**/*.spec.ts"
+# Minimum severity to report
+severity_threshold: "warning"  # "info", "warning", or "error"
+# Context-aware filtering (enabled by default)
+context_filtering: true
+```
+**Configuration options:**
+| Option | Type | Description |
+|--------|------|-------------|
+| `suppress` | array | Rules to suppress, optionally scoped to paths |
+| `exclude` | array | Glob patterns for paths to skip |
+| `severity_threshold` | string | Minimum severity to report (`info`, `warning`, `error`) |
+| `context_filtering` | boolean | Enable/disable safe module filtering (default: `true`) |
+The scanner automatically loads config from the current directory or any parent directory.
+---
+## Claude Code Hooks
+Automatically scan files after every edit with Claude Code hooks integration.
+### Install Hooks
+```bash
+npx agent-security-scanner-mcp init-hooks
+```
+This installs a `post-tool-use` hook that triggers security scanning after `Write`, `Edit`, or `MultiEdit` operations.
+### With Prompt Guard
+```bash
+npx agent-security-scanner-mcp init-hooks --with-prompt-guard
+```
+Adds a `PreToolUse` hook that scans prompts for injection attacks before executing tools.
+### What Gets Installed
+The command adds hooks to `~/.claude/settings.json`:
+```json
+{
+  "hooks": {
+    "post-tool-use": [
+      {
+        "matcher": "Write|Edit|MultiEdit",
+        "command": "npx agent-security-scanner-mcp scan-security \"$TOOL_INPUT_file_path\" --verbosity minimal"
+      }
+    ]
+  }
+}
+```
+### Hook Behavior
+- **Non-blocking:** Hooks report findings but don't prevent file writes
+- **Minimal output:** Uses `--verbosity minimal` to avoid context overflow
+- **Automatic:** Runs on every file modification without manual intervention
+---
 ## OpenClaw Integration
 [OpenClaw](https://openclaw.ai) is an autonomous AI assistant with broad system access. This scanner provides security guardrails for OpenClaw users.
@@ -567,7 +774,7 @@ AI coding agents introduce attack surfaces that traditional security tools weren
 |----------|-------|
 | **Transport** | stdio |
 | **Package** | `agent-security-scanner-mcp` (npm) |
-| **Tools** | 6 |
+| **Tools** | 8 |
 | **Languages** | 12 |
 | **Ecosystems** | 7 |
 | **Auth** | None required |
@@ -649,6 +856,21 @@ All MCP tools support a `verbosity` parameter to minimize context window consump
 ## Changelog
+### v3.4.0
+- **Severity Calibration** - 207-rule severity map with HIGH/MEDIUM/LOW confidence scores for more accurate prioritization
+- **Cross-Engine Deduplication** - ~30-50% noise reduction by deduplicating findings across AST, taint, and regex engines
+- **Context-Aware Filtering** - 80+ known safe modules (logging, testing, sanitizers) reduce false positives
+- **`.scannerrc` Configuration** - YAML/JSON project config for suppressing rules, excluding paths, and setting severity thresholds
+- **`scan_git_diff` Tool** - Scan only changed files in git diff for PR workflows and pre-commit hooks
+- **`scan_project` Tool** - Project-level scanning with A-F security grading and aggregated metrics
+- **`init-hooks` CLI** - `npx agent-security-scanner-mcp init-hooks` installs Claude Code post-tool-use hooks for automatic scanning
+- **Safe Fix Validation** - `validateFix()` ensures auto-fixes don't introduce new vulnerabilities
+- **Cross-File Taint Analysis** - Import graph tracking for dataflow analysis across module boundaries
+### v3.3.0
+- **OpenClaw Integration** - Full support with 30+ rules targeting autonomous AI threats
+- **OpenClaw-Specific Rules** - Data exfiltration, credential theft, messaging abuse, unsafe automation detection
 ### v3.2.0
 - **Token Optimization** - New `verbosity` parameter for all tools reduces context window usage by up to 98%
 - **Three Verbosity Levels** - `minimal` (~50 tokens), `compact` (~200 tokens, default), `full` (~2,500 tokens)
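
The "Security Grades" table added to the README can be read as a simple threshold function over severity counts. The sketch below is one way to express it, not the package's implementation: the real logic lives in `package/src/tools/scan-project.js`, which this excerpt does not show. The function name `security_grade` and the `critical` key are assumptions made for illustration.

```python
def security_grade(by_severity):
    """Map severity counts to a letter grade, following the README table.

    `by_severity` is a dict such as {"error": 3, "warning": 7, "info": 2}.
    The "critical" key is an assumption; the README example output does not
    include one, so it defaults to 0 here.
    """
    critical = by_severity.get("critical", 0)
    errors = by_severity.get("error", 0)

    if critical > 0 or errors >= 11:
        return "F"  # 11+ error issues or any critical
    if errors >= 6:
        return "D"  # 6-10 error issues
    if errors >= 3:
        return "C"  # 3-5 error issues
    if errors >= 1:
        return "B"  # 1-2 error issues, no critical
    return "A"      # 0 critical/error issues


# Counts taken from the README's scan_project example output.
print(security_grade({"error": 3, "warning": 7, "info": 2}))  # prints C
```

Under this reading, warnings and info findings do not affect the grade, which matches the README example (3 errors, grade `C`).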

package/analyzer.py CHANGED

@@ -11,6 +11,7 @@ import sys
 import json
 import os
 import re
+import argparse
 from typing import List, Dict, Any
 # Add the directory containing this script to the path
@@ -91,6 +92,7 @@ def analyze_file_regex(file_path):
                                 'column': match.start() + col_offset,
                                 'length': match.end() - match.start(),
                                 'severity': rule['severity'],
+                                'confidence': rule.get('metadata', {}).get('confidence', 'MEDIUM'),
                                 'metadata': rule.get('metadata', {}),
                                 'engine': 'regex'
                             })
@@ -191,6 +193,7 @@ def analyze_file_ast(file_path):
                 'column': f.column,
                 'length': length,
                 'severity': f.severity,
+                'confidence': f.metadata.get('confidence', getattr(f, 'confidence', 'MEDIUM')),
                 'metadata': f.metadata,
                 'engine': 'taint' if is_taint else 'ast',
             })
@@ -229,16 +232,30 @@ def analyze_file(file_path):
 def main():
-    if len(sys.argv) < 2:
-        print(json.dumps({'error': 'No file path provided'}))
-        sys.exit(1)
+    parser = argparse.ArgumentParser(description='Security Analyzer - AST-based with regex fallback')
+    parser.add_argument('file_path', help='Path to the file to analyze')
+    parser.add_argument('--engine', choices=['auto', 'ast', 'regex'], default='auto',
+                        help='Analysis engine: auto (default), ast (tree-sitter only), regex (regex only)')
+    args = parser.parse_args()
-    file_path = sys.argv[1]
+    file_path = args.file_path
     if not os.path.exists(file_path):
         print(json.dumps({'error': f'File not found: {file_path}'}))
         sys.exit(1)
-    results = analyze_file(file_path)
+    engine = args.engine
+    if engine == 'regex':
+        results = analyze_file_regex(file_path)
+    elif engine == 'ast':
+        if not HAS_AST_ENGINE:
+            print(json.dumps({'error': 'AST engine requested but tree-sitter is not available. Install dependencies: python3 -m pip install -r requirements.txt'}))
+            sys.exit(1)
+        results = analyze_file_ast(file_path)
+    else:
+        # auto: use AST if available, otherwise regex
+        results = analyze_file(file_path)
     print(json.dumps(results))
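
To see how the new `--engine` flag behaves in isolation, the parser from this hunk can be rebuilt on its own. This is a sketch that copies only the two `add_argument` calls shown in the diff; the helper names `build_parser` and `parse` are hypothetical and do not appear in `analyzer.py`.

```python
import argparse


def build_parser():
    """Recreate the argument parser added to analyzer.py in this diff."""
    parser = argparse.ArgumentParser(
        description="Security Analyzer - AST-based with regex fallback"
    )
    parser.add_argument("file_path", help="Path to the file to analyze")
    parser.add_argument(
        "--engine",
        choices=["auto", "ast", "regex"],
        default="auto",
        help="Analysis engine: auto (default), ast (tree-sitter only), regex (regex only)",
    )
    return parser


def parse(argv):
    """Parse an explicit argv list instead of sys.argv (hypothetical helper)."""
    return build_parser().parse_args(argv)


default_args = parse(["app.py"])
regex_args = parse(["app.py", "--engine", "regex"])
print(default_args.engine, regex_args.engine)  # prints: auto regex
```

One side effect worth checking when reviewing this change: `argparse` reports a missing path or an invalid `--engine` value by printing usage text to stderr and exiting with status 2, whereas the removed code printed a JSON error object and exited with status 1. Callers that parse the script's stdout as JSON may need to account for that.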

package/cross_file_analyzer.py ADDED

@@ -0,0 +1,216 @@
+#!/usr/bin/env python3
+"""Cross-file taint analysis for security scanning.
+Builds an import graph across local files, runs per-file analysis,
+and propagates taint warnings when a file imports from another file
+that has ERROR-severity findings.
+"""
+import json
+import os
+import re
+import sys
+# Import the per-file analyzer
+from analyzer import analyze_file
+def extract_js_imports(source):
+    """Extract import/require statements from JavaScript/TypeScript."""
+    imports = []
+    # require('...')
+    for m in re.finditer(r'''require\s*\(\s*['"]([^'"]+)['"]\s*\)''', source):
+        imports.append(m.group(1))
+    # import ... from '...'
+    for m in re.finditer(r'''from\s+['"]([^'"]+)['"]''', source):
+        imports.append(m.group(1))
+    # import '...'
+    for m in re.finditer(r'''import\s+['"]([^'"]+)['"]''', source):
+        imports.append(m.group(1))
+    return imports
+def extract_py_imports(source):
+    """Extract import statements from Python."""
+    imports = []
+    # import module
+    for m in re.finditer(r'^import\s+(\S+)', source, re.MULTILINE):
+        imports.append(m.group(1).split('.')[0])
+    # from module import ...
+    for m in re.finditer(r'^from\s+(\S+)\s+import', source, re.MULTILINE):
+        imports.append(m.group(1).split('.')[0])
+    return imports
+def detect_language(file_path):
+    """Detect language from file extension."""
+    ext = os.path.splitext(file_path)[1].lower()
+    lang_map = {
+        '.py': 'python', '.js': 'javascript', '.ts': 'typescript',
+        '.tsx': 'typescript', '.jsx': 'javascript',
+    }
+    return lang_map.get(ext, 'unknown')
+def resolve_local_import(module, base_dir, lang):
+    """Resolve a relative/local import to an actual file path."""
+    if lang in ('javascript', 'typescript'):
+        # Only resolve relative imports
+        if not module.startswith('.'):
+            return None
+        # Try common extensions
+        candidates = [
+            module,
+            module + '.js', module + '.ts', module + '.tsx', module + '.jsx',
+            os.path.join(module, 'index.js'), os.path.join(module, 'index.ts'),
+        ]
+        for candidate in candidates:
+            full = os.path.normpath(os.path.join(base_dir, candidate))
+            if os.path.isfile(full):
+                return full
+    elif lang == 'python':
+        # Only resolve relative imports (starting with .)
+        if module.startswith('.'):
+            rel = module.lstrip('.')
+            candidates = [
+                os.path.join(base_dir, rel.replace('.', os.sep) + '.py'),
+                os.path.join(base_dir, rel.replace('.', os.sep), '__init__.py'),
+            ]
+            for candidate in candidates:
+                if os.path.isfile(candidate):
+                    return candidate
+        # Also check if the module name matches a sibling file
+        sibling = os.path.join(base_dir, module + '.py')
+        if os.path.isfile(sibling):
+            return sibling
+    return None
+def extract_exports(source, lang):
+    """Extract exported function/class names."""
+    exports = []
+    if lang in ('javascript', 'typescript'):
+        for m in re.finditer(r'export\s+(?:function|class|const|let|var)\s+(\w+)', source):
+            exports.append(m.group(1))
+        for m in re.finditer(r'module\.exports\s*=', source):
+            exports.append('default')
+    elif lang == 'python':
+        for m in re.finditer(r'^(?:def|class)\s+(\w+)', source, re.MULTILINE):
+            exports.append(m.group(1))
+    return exports
+def build_import_graph(file_paths):
+    """Build import graph: {file -> [{module, resolved_path, line}]}."""
+    graph = {}
+    file_set = set(os.path.abspath(f) for f in file_paths)
+    for file_path in file_paths:
+        abs_path = os.path.abspath(file_path)
+        lang = detect_language(file_path)
+        if lang == 'unknown':
+            continue
+        try:
+            source = open(file_path, 'r', encoding='utf-8', errors='ignore').read()
+        except (OSError, IOError):
+            continue
+        if lang in ('javascript', 'typescript'):
+            modules = extract_js_imports(source)
+        elif lang == 'python':
+            modules = extract_py_imports(source)
+        else:
+            continue
+        base_dir = os.path.dirname(abs_path)
+        edges = []
+        for mod in modules:
+            resolved = resolve_local_import(mod, base_dir, lang)
+            if resolved:
+                resolved_abs = os.path.abspath(resolved)
+                if resolved_abs in file_set and resolved_abs != abs_path:
+                    edges.append({
+                        'module': mod,
+                        'resolved_path': resolved_abs,
+                    })
+        graph[abs_path] = edges
+    return graph
+def cross_file_analyze(file_paths):
+    """Run cross-file taint analysis.
+    1. Analyze each file independently
+    2. Build import graph
+    3. For each file importing from another file with ERROR-severity findings,
+       add a cross-file-taint-warning
+    """
+    # Analyze each file
+    file_findings = {}
+    all_findings = []
+    for file_path in file_paths:
+        try:
+            results = analyze_file(file_path)
+            if isinstance(results, list):
+                file_findings[os.path.abspath(file_path)] = results
+                for finding in results:
+                    finding['file'] = file_path
+                all_findings.extend(results)
+        except Exception:
+            continue
+    # Build import graph
+    graph = build_import_graph(file_paths)
+    # Propagate taint warnings
+    cross_file_warnings = []
+    for file_path, edges in graph.items():
+        for edge in edges:
+            imported_path = edge['resolved_path']
+            imported_findings = file_findings.get(imported_path, [])
+            # Check for ERROR-severity findings in imported file
+            error_findings = [f for f in imported_findings if f.get('severity') == 'error']
+            if error_findings:
+                warning = {
+                    'ruleId': 'cross-file-taint-warning',
+                    'severity': 'warning',
+                    'message': f"Imports from '{os.path.basename(imported_path)}' which has {len(error_findings)} critical finding(s): {', '.join(set(f.get('ruleId', 'unknown') for f in error_findings))}",
+                    'file': file_path,
+                    'line': 0,
+                    'metadata': {
+                        'imported_file': imported_path,
+                        'imported_findings_count': len(error_findings),
+                    }
+                }
+                cross_file_warnings.append(warning)
+    # Combine: per-file findings + cross-file warnings
+    combined = all_findings + cross_file_warnings
+    return combined
+def main():
+    """CLI entry point. Accepts file paths as arguments, outputs JSON."""
+    if len(sys.argv) < 2:
+        print(json.dumps({'error': 'Usage: cross_file_analyzer.py file1 file2 ...'}))
+        sys.exit(1)
+    file_paths = sys.argv[1:]
+    # Filter to existing files
+    file_paths = [f for f in file_paths if os.path.isfile(f)]
+    if not file_paths:
+        print(json.dumps({'error': 'No valid files provided'}))
+        sys.exit(1)
+    results = cross_file_analyze(file_paths)
+    print(json.dumps(results))
+if __name__ == '__main__':
+    main()
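
The propagation step in `cross_file_analyze` can be exercised without touching the filesystem by feeding it a hand-built import graph and findings map. The sketch below mirrors the loop from the diff; the function name `propagate_warnings` and the sample paths are hypothetical, and rule IDs are sorted here for deterministic output, whereas the original joins an unordered `set`.

```python
import os


def propagate_warnings(graph, file_findings):
    """Emit one warning per import edge whose target has error-severity findings.

    graph:         {importer_path: [{"module": str, "resolved_path": str}, ...]}
    file_findings: {file_path: [finding_dict, ...]}
    """
    warnings = []
    for file_path, edges in graph.items():
        for edge in edges:
            imported = edge["resolved_path"]
            errors = [
                f for f in file_findings.get(imported, [])
                if f.get("severity") == "error"
            ]
            if not errors:
                continue
            # Sorted for stable output (an assumption; the diff uses a plain set).
            rule_ids = sorted({f.get("ruleId", "unknown") for f in errors})
            warnings.append({
                "ruleId": "cross-file-taint-warning",
                "severity": "warning",
                "message": (
                    f"Imports from '{os.path.basename(imported)}' which has "
                    f"{len(errors)} critical finding(s): {', '.join(rule_ids)}"
                ),
                "file": file_path,
                "line": 0,
                "metadata": {
                    "imported_file": imported,
                    "imported_findings_count": len(errors),
                },
            })
    return warnings


# Hypothetical two-file project: routes.js imports db.js.
graph = {
    "/app/routes.js": [{"module": "./db", "resolved_path": "/app/db.js"}],
    "/app/db.js": [],
}
file_findings = {
    "/app/db.js": [
        {"ruleId": "sql-injection", "severity": "error"},
        {"ruleId": "insecure-random", "severity": "warning"},
    ],
}

for warning in propagate_warnings(graph, file_findings):
    print(warning["file"], "->", warning["message"])
```

As written in the diff, the warning is attached only to direct importers: a file that imports `routes.js` would not receive a warning for findings in `db.js`. The loop also keys off the presence of any error-severity finding in the imported file, not off whether the flagged symbol is actually used by the importer.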