@jacobmolz/mcpguard 0.2.1 → 0.3.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +18 -8
- package/package.json +1 -1
package/README.md
CHANGED
@@ -91,14 +91,7 @@ Config merge uses **floor-based semantics**: personal configs can restrict but n
 
 ## Benchmark Results
 
-The benchmark suite tests 7,095 attack scenarios across 10 categories and 10,168 legitimate requests.
-
-| Metric | Result | Target | Status |
-|--------|--------|--------|--------|
-| Detection rate | 97.0% | >95% | Pass |
-| False positive rate | 0.000% | <0.1% | Pass |
-| Audit integrity | Pass | No raw PII in logs | Pass |
-| p50 latency overhead | 0.17ms | <5ms | Pass |
+The benchmark suite is open-source and fully reproducible (`pnpm benchmark`). It tests MCP-Guard's deterministic interceptor pipeline — policy enforcement, pattern matching, and access control — against 7,095 programmatically generated attack scenarios across 10 categories and 10,168 legitimate requests. See [Benchmark Methodology](docs/benchmark-methodology.md) for threat model, statistical interpretation, and known limitations.
 
 ### Per-Category Detection
 
@@ -115,6 +108,23 @@ The benchmark suite tests 7,095 attack scenarios across 10 categories and 10,168
 | PII request leak | 93.8% | Pass |
 | Rate limit evasion | 92.4% | Pass |
 
+### Summary
+
+| Metric | Result | Target | Status |
+|--------|--------|--------|--------|
+| Detection rate | 97.0% | >95% | Pass |
+| False positive rate | 0 in 10,168 requests (<0.03% at 95% CI) | <0.1% | Pass |
+| Audit integrity | Pass | No raw PII in logs | Pass |
+| p50 latency overhead | 0.17ms (deterministic pipeline, no network hop) | <5ms | Pass |
+
+### Limitations
+
+- Tested against own generated scenarios, not an independent corpus — [methodology explains mitigations](docs/benchmark-methodology.md#self-testing-honesty-about-our-own-test-suite)
+- Regex PII detection misses semantic encoding (spelling out digits, splitting across fields)
+- Does not address LLM-level prompt injection — complementary tools like those evaluated by [MCPSecBench](https://arxiv.org/abs/2508.13220) operate at the agent layer
+- No coverage for network-layer attacks (MITM, DNS rebinding)
+- ML-based detection planned but not yet implemented
+
 > Full-suite results from `pnpm benchmark`. Quick mode (`pnpm benchmark:quick`) uses stratified sampling and typically reports ~89-93% detection. See [latest report](benchmarks/results/REPORT.md) for charts.
 
 ## CLI Reference
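The new false-positive row in the README diff claims "<0.03% at 95% CI" for 0 false positives in 10,168 legitimate requests. That figure is easy to sanity-check: the one-sided 95% upper confidence bound on a rate with zero observed events is 1 - 0.05^(1/n), which the classic "rule of three" approximates as 3/n. A quick sketch (function names are illustrative, not part of MCP-Guard's API):

```typescript
// Sanity-check the "<0.03% at 95% CI" bound for 0 false positives
// observed in 10,168 legitimate requests. Names here are illustrative.

function exactUpperBound(n: number, alpha: number = 0.05): number {
  // Largest rate p still consistent with seeing zero events in n
  // independent trials at confidence 1 - alpha: solve (1 - p)^n = alpha.
  return 1 - Math.pow(alpha, 1 / n);
}

function ruleOfThree(n: number): number {
  // Standard approximation of the same bound when zero events occur.
  return 3 / n;
}

const n = 10_168;
console.log((exactUpperBound(n) * 100).toFixed(4) + "%"); // "0.0295%"
console.log((ruleOfThree(n) * 100).toFixed(4) + "%");     // "0.0295%"
```

Both forms land just under the 0.03% quoted in the table, so the added row is consistent.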
package/package.json
CHANGED
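The README's quick-mode note (~89-93% detection versus 97.0% for the full suite) follows from stratified sampling: drawing a small subset per attack category keeps every category represented but makes the per-run estimate noisier. A minimal sketch of such a sampling step, with hypothetical category names and helper, not MCP-Guard's actual implementation:

```typescript
// Illustrative stratified sampling over attack categories, as a
// quick-mode benchmark run might do. Category names, counts, and the
// helper itself are hypothetical.

function stratifiedSample<T>(strata: Map<string, T[]>, fraction: number): T[] {
  const sample: T[] = [];
  strata.forEach((items) => {
    const pool = items.slice();
    // Fisher-Yates shuffle so each category's subset is drawn uniformly.
    for (let i = pool.length - 1; i > 0; i--) {
      const j = Math.floor(Math.random() * (i + 1));
      [pool[i], pool[j]] = [pool[j], pool[i]];
    }
    // Keep at least one scenario per category so every stratum is covered.
    const k = Math.max(1, Math.round(pool.length * fraction));
    sample.push(...pool.slice(0, k));
  });
  return sample;
}

// Example: sample ~10% of each (hypothetical) category's scenarios.
const scenarios = new Map<string, string[]>([
  ["sql-injection", ["s1", "s2", "s3", "s4", "s5"]],
  ["pii-request-leak", ["p1", "p2"]],
]);
console.log(stratifiedSample(scenarios, 0.1).length); // 2: one per category
```

With only one or two scenarios drawn per category, a single miss moves the measured rate by several points, which is why quick-mode numbers sit below the full-suite 97.0%.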