mcpspec 1.2.1 → 1.2.2
- package/README.md +223 -267
- package/package.json +4 -4
package/README.md
CHANGED
|
@@ -5,7 +5,7 @@
|
|
|
5
5
|
<h1 align="center">MCPSpec</h1>
|
|
6
6
|
|
|
7
7
|
<p align="center">
|
|
8
|
-
<strong>
|
|
8
|
+
<strong>Ship reliable MCP servers with confidence</strong>
|
|
9
9
|
</p>
|
|
10
10
|
|
|
11
11
|
<p align="center">
|
|
@@ -16,43 +16,205 @@
|
|
|
16
16
|
</p>
|
|
17
17
|
|
|
18
18
|
<p align="center">
|
|
19
|
-
|
|
19
|
+
Record sessions, generate mock servers, gate your CI pipeline, and catch Tool Poisoning — all without writing a single line of test code. The testing and reliability platform for <a href="https://modelcontextprotocol.io">Model Context Protocol</a> servers.
|
|
20
20
|
</p>
|
|
21
21
|
|
|
22
22
|
---
|
|
23
23
|
|
|
24
|
-
|
|
25
|
-
|
|
26
|
-
|
|
27
|
-
|
|
28
|
-
|
|
29
|
-
|
|
30
|
-
|
|
31
|
-
|
|
32
|
-
|
|
33
|
-
mcpspec ci-init --platform github # Generate CI pipeline config
|
|
34
|
-
mcpspec ui # Launch web dashboard
|
|
35
|
-
```
|
|
24
|
+
## Why MCPSpec?
|
|
25
|
+
|
|
26
|
+
**Deterministic.** Unlike LLM-based testing, MCPSpec runs are fast and repeatable. No flaky tests, no token costs. Ideal for CI.
|
|
27
|
+
|
|
28
|
+
**Secure.** Catch [Tool Poisoning](#security-audit) (prompt injection in tool descriptions) and [Excessive Agency](#security-audit) (destructive tools without safeguards) before they reach production.
|
|
29
|
+
|
|
30
|
+
**Collaborative.** Record server interactions once, share mock servers with your team. Frontend developers and CI pipelines can test against mocks — no API keys, no live dependencies.
|
|
31
|
+
|
|
32
|
+
---
|
|
36
33
|
|
|
37
34
|
## Quick Start
|
|
38
35
|
|
|
39
36
|
```bash
|
|
40
|
-
#
|
|
37
|
+
# Install
|
|
41
38
|
npm install -g mcpspec
|
|
42
39
|
|
|
43
|
-
#
|
|
44
|
-
mcpspec
|
|
40
|
+
# Explore a server interactively
|
|
41
|
+
mcpspec inspect "npx @modelcontextprotocol/server-filesystem /tmp"
|
|
42
|
+
|
|
43
|
+
# Record a session — no test code needed
|
|
44
|
+
mcpspec record start "npx my-server"
|
|
45
45
|
|
|
46
|
-
#
|
|
47
|
-
mcpspec
|
|
46
|
+
# Generate a mock for your team
|
|
47
|
+
mcpspec mock my-recording --generate ./mocks/server.js
|
|
48
48
|
|
|
49
|
-
#
|
|
49
|
+
# Add CI gating in 30 seconds
|
|
50
50
|
mcpspec ci-init
|
|
51
51
|
```
|
|
52
52
|
|
|
53
|
-
|
|
53
|
+
**Try it in 30 seconds** with a pre-built community collection — no setup required:
|
|
54
|
+
|
|
55
|
+
```bash
|
|
56
|
+
mcpspec test examples/collections/servers/filesystem.yaml
|
|
57
|
+
mcpspec test examples/collections/servers/time.yaml --tag smoke
|
|
58
|
+
```
|
|
59
|
+
|
|
60
|
+
See all [70 community tests](#community-collections) for 7 popular MCP servers.
|
|
61
|
+
|
|
62
|
+
---
|
|
63
|
+
|
|
64
|
+
## Record & Mock
|
|
65
|
+
|
|
66
|
+
Record a session once. Replay it to catch regressions. Mock it for CI. No API keys required.
|
|
67
|
+
|
|
68
|
+
```bash
|
|
69
|
+
# 1. Record a session against your real server
|
|
70
|
+
mcpspec record start "npx my-server"
|
|
71
|
+
mcpspec> .call get_user {"id": "1"}
|
|
72
|
+
mcpspec> .call list_items {}
|
|
73
|
+
mcpspec> .save my-api
|
|
74
|
+
|
|
75
|
+
# 2. Replay against a new version — catch regressions instantly
|
|
76
|
+
mcpspec record replay my-api "npx my-server-v2"
|
|
77
|
+
|
|
78
|
+
# 3. Start a mock server — drop-in replacement, zero dependencies
|
|
79
|
+
mcpspec mock my-api
|
|
80
|
+
|
|
81
|
+
# 4. Generate a standalone .js file — commit to your repo
|
|
82
|
+
mcpspec mock my-api --generate ./mocks/server.js
|
|
83
|
+
node ./mocks/server.js
|
|
84
|
+
```
|
|
85
|
+
|
|
86
|
+
**Replay output** shows exactly what changed:
|
|
87
|
+
|
|
88
|
+
```
|
|
89
|
+
Replaying 3 steps...
|
|
90
|
+
|
|
91
|
+
1/3 get_user (id=1)... [OK] 42ms → {"name": "Alice"}
|
|
92
|
+
2/3 list_items... [CHANGED] 38ms → {"items": [...]}
|
|
93
|
+
3/3 create_item (name=test) [OK] 51ms → {"id": "abc"}
|
|
94
|
+
|
|
95
|
+
Summary: 2 matched, 1 changed, 0 added, 0 removed
|
|
96
|
+
```
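The summary line in the replay output can be read as a step-by-step comparison of the recording against the new run. The following is a minimal sketch of that idea, not MCPSpec's actual implementation: the `summarize_replay` helper, the `output` field name, and the position-by-position comparison are all assumptions made for illustration.

```python
def summarize_replay(recorded, replayed):
    """Classify steps by comparing recorded and replayed outputs by position.

    Hypothetical helper: each step is assumed to be a dict with an "output" key.
    """
    counts = {"matched": 0, "changed": 0, "added": 0, "removed": 0}
    for i in range(max(len(recorded), len(replayed))):
        if i >= len(replayed):
            counts["removed"] += 1   # step exists only in the recording
        elif i >= len(recorded):
            counts["added"] += 1     # step exists only in the replay
        elif recorded[i]["output"] == replayed[i]["output"]:
            counts["matched"] += 1   # identical output
        else:
            counts["changed"] += 1   # same step, different output
    return counts
```

With three recorded steps and one differing output in the replay, this yields the same shape as the summary shown above: 2 matched, 1 changed, 0 added, 0 removed.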
|
|
97
|
+
|
|
98
|
+
**Mock options:**
|
|
99
|
+
|
|
100
|
+
| Option | Effect |
|
|
101
|
+
|--------|--------|
|
|
102
|
+
| `--mode match` (default) | Exact input match first, then next queued response per tool |
|
|
103
|
+
| `--mode sequential` | Tape/cassette style — responses served in recorded order |
|
|
104
|
+
| `--latency original` | Simulate original response times |
|
|
105
|
+
| `--latency 100` | Fixed 100ms delay |
|
|
106
|
+
| `--on-missing empty` | Return empty instead of error for unrecorded tools |
|
|
107
|
+
| `--generate <path>` | Output standalone `.js` file (only needs `@modelcontextprotocol/sdk`) |
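To make the difference between `--mode match` and `--mode sequential` concrete, here is a rough sketch of how the two lookup strategies described in the table could behave. It is an illustration under stated assumptions rather than the code MCPSpec ships: the `MockResponder` class, the step fields (`tool`, `input`, `output`), and the per-tool cursor used for the fallback are all hypothetical.

```python
import json


class MockResponder:
    """Hypothetical mock lookup: 'match' or 'sequential' mode over recorded steps."""

    def __init__(self, steps, mode="match"):
        self.mode = mode
        self.steps = list(steps)
        self.position = 0   # tape position for sequential mode
        self.cursor = {}    # per-tool fallback position for match mode

    def respond(self, tool, args):
        if self.mode == "sequential":
            # Tape style: serve responses strictly in recorded order.
            if self.position >= len(self.steps):
                raise LookupError("recording exhausted")
            step = self.steps[self.position]
            self.position += 1
            return step["output"]

        recorded = [s for s in self.steps if s["tool"] == tool]
        if not recorded:
            raise LookupError(f"no recorded responses for tool {tool!r}")

        # 1. Exact input match, compared on canonical JSON.
        wanted = json.dumps(args, sort_keys=True)
        for step in recorded:
            if json.dumps(step["input"], sort_keys=True) == wanted:
                return step["output"]

        # 2. Otherwise the next queued response for this tool
        #    (assumption: stay on the last response once the queue runs out).
        index = self.cursor.get(tool, 0)
        self.cursor[tool] = min(index + 1, len(recorded) - 1)
        return recorded[index]["output"]
```

The `LookupError` branch for an unrecorded tool corresponds to the default behavior that `--on-missing empty` is described as relaxing.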
|
|
108
|
+
|
|
109
|
+
**Manage recordings:**
|
|
110
|
+
|
|
111
|
+
```bash
|
|
112
|
+
mcpspec record list # List saved recordings
|
|
113
|
+
mcpspec record delete my-session # Delete a recording
|
|
114
|
+
```
|
|
115
|
+
|
|
116
|
+
---
|
|
117
|
+
|
|
118
|
+
## CI/CD Integration
|
|
54
119
|
|
|
55
|
-
|
|
120
|
+
`ci-init` generates ready-to-use pipeline configurations. Deterministic exit codes and JUnit/JSON/TAP reporters make the results straightforward to consume in CI.
|
|
121
|
+
|
|
122
|
+
```bash
|
|
123
|
+
mcpspec ci-init # Interactive wizard
|
|
124
|
+
mcpspec ci-init --platform github # GitHub Actions
|
|
125
|
+
mcpspec ci-init --platform gitlab # GitLab CI
|
|
126
|
+
mcpspec ci-init --platform shell # Shell script
|
|
127
|
+
mcpspec ci-init --checks test,audit,score # Choose checks
|
|
128
|
+
mcpspec ci-init --fail-on medium # Audit severity gate
|
|
129
|
+
mcpspec ci-init --min-score 70 # MCP Score threshold
|
|
130
|
+
mcpspec ci-init --force # Overwrite/replace existing
|
|
131
|
+
```
|
|
132
|
+
|
|
133
|
+
Auto-detects platform from `.github/` or `.gitlab-ci.yml`. GitLab `--force` surgically replaces only the mcpspec job block, preserving other jobs.
|
|
134
|
+
|
|
135
|
+
**GitHub Actions example** (generated by `mcpspec ci-init --platform github`):
|
|
136
|
+
|
|
137
|
+
```yaml
|
|
138
|
+
name: MCP Server Tests
|
|
139
|
+
on: [push, pull_request]
|
|
140
|
+
|
|
141
|
+
jobs:
|
|
142
|
+
mcpspec:
|
|
143
|
+
runs-on: ubuntu-latest
|
|
144
|
+
steps:
|
|
145
|
+
- uses: actions/checkout@v4
|
|
146
|
+
- uses: actions/setup-node@v4
|
|
147
|
+
with:
|
|
148
|
+
node-version: '22'
|
|
149
|
+
- run: npm install -g mcpspec
|
|
150
|
+
- name: Run tests
|
|
151
|
+
run: mcpspec test --ci --reporter junit --output results.xml
|
|
152
|
+
- name: Security audit
|
|
153
|
+
run: mcpspec audit "npx my-server" --fail-on high
|
|
154
|
+
- name: Quality gate
|
|
155
|
+
run: mcpspec score "npx my-server" --min-score 80
|
|
156
|
+
- uses: mikepenz/action-junit-report@v4
|
|
157
|
+
if: always()
|
|
158
|
+
with:
|
|
159
|
+
report_paths: results.xml
|
|
160
|
+
```
|
|
161
|
+
|
|
162
|
+
**Exit codes:**
|
|
163
|
+
|
|
164
|
+
| Code | Meaning |
|
|
165
|
+
|------|---------|
|
|
166
|
+
| `0` | Success |
|
|
167
|
+
| `1` | Test failure |
|
|
168
|
+
| `2` | Runtime error |
|
|
169
|
+
| `3` | Configuration error |
|
|
170
|
+
| `4` | Connection error |
|
|
171
|
+
| `5` | Timeout |
|
|
172
|
+
| `6` | Security findings above threshold |
|
|
173
|
+
| `7` | Validation error |
|
|
174
|
+
| `130` | Interrupted (Ctrl+C) |
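Because the exit codes are fixed, a wrapper script can turn them into readable messages. The sketch below copies the table above into a lookup; the `describe_exit` helper name is an assumption for illustration and is not part of MCPSpec.

```python
# Exit codes as listed in the table above.
EXIT_CODES = {
    0: "Success",
    1: "Test failure",
    2: "Runtime error",
    3: "Configuration error",
    4: "Connection error",
    5: "Timeout",
    6: "Security findings above threshold",
    7: "Validation error",
    130: "Interrupted (Ctrl+C)",
}


def describe_exit(code):
    """Return the documented meaning of an exit code (hypothetical helper)."""
    return EXIT_CODES.get(code, f"Unknown exit code {code}")
```

A CI script could pass the return code of a `mcpspec test --ci` subprocess to `describe_exit` and log the result before failing the job.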
|
|
175
|
+
|
|
176
|
+
**Reporters:** console (default), json, junit, html, tap.
|
|
177
|
+
|
|
178
|
+
---
|
|
179
|
+
|
|
180
|
+
## Security Audit
|
|
181
|
+
|
|
182
|
+
Prevent Tool Poisoning with 8 security rules that cover traditional vulnerabilities and LLM-specific threats. A safety filter auto-skips destructive tools, and `--dry-run` previews targets before scanning.
|
|
183
|
+
|
|
184
|
+
```bash
|
|
185
|
+
mcpspec audit "npx my-server" # Passive (safe)
|
|
186
|
+
mcpspec audit "npx my-server" --mode active # Active probing
|
|
187
|
+
mcpspec audit "npx my-server" --fail-on medium # CI gate
|
|
188
|
+
mcpspec audit "npx my-server" --exclude-tools delete # Skip tools
|
|
189
|
+
mcpspec audit "npx my-server" --dry-run # Preview targets
|
|
190
|
+
```
|
|
191
|
+
|
|
192
|
+
**Security rules:**
|
|
193
|
+
|
|
194
|
+
| Rule | Mode | What it detects |
|
|
195
|
+
|------|------|-----------------|
|
|
196
|
+
| **Tool Poisoning** | Passive | LLM prompt injection in descriptions, hidden Unicode, cross-tool manipulation |
|
|
197
|
+
| **Excessive Agency** | Passive | Destructive tools without confirmation params, arbitrary code execution |
|
|
198
|
+
| Path Traversal | Passive | `../../etc/passwd` style directory escape attacks |
|
|
199
|
+
| Input Validation | Passive | Missing constraints (enum, pattern, min/max) on tool inputs |
|
|
200
|
+
| Info Disclosure | Passive | Leaked paths, stack traces, API keys in tool descriptions |
|
|
201
|
+
| Resource Exhaustion | Active | Unbounded loops, large allocations, recursion |
|
|
202
|
+
| Auth Bypass | Active | Missing auth checks, hardcoded credentials |
|
|
203
|
+
| Injection | Active | SQL and command injection in tool inputs |
|
|
204
|
+
|
|
205
|
+
**Scan modes:**
|
|
206
|
+
|
|
207
|
+
- **Passive** (default) — 5 rules, analyzes metadata only, no tool calls. Safe for production.
|
|
208
|
+
- **Active** — All 8 rules, sends test payloads. Requires confirmation prompt.
|
|
209
|
+
- **Aggressive** — All 8 rules with more exhaustive probing. Requires confirmation prompt.
|
|
210
|
+
|
|
211
|
+
Active/aggressive modes auto-skip tools matching destructive patterns (`delete_*`, `drop_*`, `destroy_*`, etc.) and require explicit confirmation unless `--acknowledge-risk` is passed.
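The auto-skip behavior amounts to glob matching on tool names. As a minimal sketch, assuming shell-style wildcards and using only the three patterns named above (the full list behind "etc." is not reproduced here), a filter might look like this; `split_targets` is a hypothetical helper, not an MCPSpec API.

```python
from fnmatch import fnmatchcase

# Only the patterns named in the text; the real list may be longer.
DESTRUCTIVE_PATTERNS = ("delete_*", "drop_*", "destroy_*")


def split_targets(tool_names, patterns=DESTRUCTIVE_PATTERNS):
    """Split tool names into (scan, skipped) using case-sensitive glob matching."""
    scan, skipped = [], []
    for name in tool_names:
        if any(fnmatchcase(name, pattern) for pattern in patterns):
            skipped.append(name)   # looks destructive: never probed
        else:
            scan.append(name)
    return scan, skipped
```

This is roughly the split that `--dry-run` would let you preview before any payloads are sent.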
|
|
212
|
+
|
|
213
|
+
Each finding includes severity (info/low/medium/high/critical), description, evidence, and remediation advice.
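The severity levels form an ordered scale, which is what makes a `--fail-on` gate possible. The sketch below shows one plausible reading, assuming a finding at or above the chosen level fails the gate with exit code 6; the `audit_exit_code` helper and the `severity` field name are assumptions, and the real tool may treat the threshold differently.

```python
# Severity levels in ascending order, as listed in the text.
SEVERITIES = ["info", "low", "medium", "high", "critical"]


def audit_exit_code(findings, fail_on):
    """Return 6 if any finding is at or above `fail_on`, else 0 (assumed semantics)."""
    threshold = SEVERITIES.index(fail_on)
    blocked = any(SEVERITIES.index(f["severity"]) >= threshold for f in findings)
    return 6 if blocked else 0
```

Under this reading, `--fail-on medium` blocks on medium, high, and critical findings, while `--fail-on high` lets medium findings through.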
|
|
214
|
+
|
|
215
|
+
---
|
|
216
|
+
|
|
217
|
+
## Test Collections
|
|
56
218
|
|
|
57
219
|
Write tests in YAML with 10 assertion types, environments, variable extraction, tags, retries, and parallel execution.
|
|
58
220
|
|
|
@@ -76,7 +238,7 @@ tests:
|
|
|
76
238
|
expectError: true
|
|
77
239
|
```
|
|
78
240
|
|
|
79
|
-
**Advanced features
|
|
241
|
+
**Advanced features** — environments, tags, retries, variable extraction, expressions:
|
|
80
242
|
|
|
81
243
|
```yaml
|
|
82
244
|
schemaVersion: "1.0"
|
|
@@ -171,10 +333,36 @@ mcpspec test --watch # Re-run on file changes
|
|
|
171
333
|
mcpspec test --ci # CI mode (no colors)
|
|
172
334
|
```
|
|
173
335
|
|
|
174
|
-
|
|
336
|
+
---
|
|
337
|
+
|
|
338
|
+
## MCP Score
|
|
339
|
+
|
|
340
|
+
A 0-100 quality rating across 5 weighted categories with opinionated schema linting. Use as a CI gate or generate a badge for your README.
|
|
341
|
+
|
|
342
|
+
```bash
|
|
343
|
+
mcpspec score "npx my-server"
|
|
344
|
+
mcpspec score "npx my-server" --badge badge.svg # Generate SVG badge
|
|
345
|
+
mcpspec score "npx my-server" --min-score 80 # Fail if below threshold
|
|
346
|
+
```
|
|
347
|
+
|
|
348
|
+
**Scoring categories:**
|
|
349
|
+
|
|
350
|
+
| Category (weight) | What it measures |
|
|
351
|
+
|--------------------|-----------------|
|
|
352
|
+
| Documentation (25%) | Percentage of tools and resources with descriptions |
|
|
353
|
+
| Schema Quality (25%) | Property types, descriptions, required fields, constraints (enum/pattern/min/max), naming conventions |
|
|
354
|
+
| Error Handling (20%) | Structured error responses (`isError: true`) vs. crashes on bad input |
|
|
355
|
+
| Responsiveness (15%) | Median latency: <100ms = 100, <500ms = 80, <1s = 60, <5s = 40 |
|
|
356
|
+
| Security (15%) | Findings from passive security scan: 0 = 100, <=2 = 70, <=5 = 40 |
|
|
357
|
+
|
|
358
|
+
Schema quality uses 6 sub-criteria: structure (20%), property types (20%), descriptions (20%), required fields (15%), constraints (15%), naming conventions (10%).
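The weights and thresholds above are enough to work through the arithmetic by hand. The following sketch combines them under a few assumptions that the text does not confirm: scores outside the listed bands fall to 0, the final score is rounded to the nearest integer, and the function and key names are invented for illustration.

```python
# Category weights in percent, from the table above.
CATEGORY_WEIGHTS = {
    "documentation": 25,
    "schema_quality": 25,
    "error_handling": 20,
    "responsiveness": 15,
    "security": 15,
}


def responsiveness_score(median_ms):
    """Map median latency in milliseconds to a 0-100 score using the listed bands."""
    for limit_ms, score in ((100, 100), (500, 80), (1000, 60), (5000, 40)):
        if median_ms < limit_ms:
            return score
    return 0  # assumption: the text gives no score for 5s and above


def security_score(finding_count):
    """Map the number of passive-scan findings to a 0-100 score."""
    if finding_count == 0:
        return 100
    if finding_count <= 2:
        return 70
    if finding_count <= 5:
        return 40
    return 0  # assumption: the text gives no score above 5 findings


def mcp_score(category_scores):
    """Weighted average of per-category scores, rounded (rounding is an assumption)."""
    total = sum(CATEGORY_WEIGHTS[name] * category_scores[name] for name in CATEGORY_WEIGHTS)
    return round(total / 100)
```

For example, a server scoring 90 on documentation, 80 on schema quality, and 100 on error handling, with a 250ms median latency and one passive finding, works out to 22.5 + 20 + 20 + 12 + 10.5 = 85.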
|
|
359
|
+
|
|
360
|
+
The `--badge` flag generates a shields.io-style SVG badge for your README.
|
|
175
361
|
|
|
176
362
|
---
|
|
177
363
|
|
|
364
|
+
## More Features
|
|
365
|
+
|
|
178
366
|
### Interactive Inspector
|
|
179
367
|
|
|
180
368
|
Connect to any MCP server and explore its capabilities in a live REPL.
|
|
@@ -193,131 +381,6 @@ mcpspec inspect "npx @modelcontextprotocol/server-filesystem /tmp"
|
|
|
193
381
|
| `.help` | Show help |
|
|
194
382
|
| `.exit` | Disconnect and exit |
|
|
195
383
|
|
|
196
|
-
```
|
|
197
|
-
mcpspec> .tools
|
|
198
|
-
read_file Read complete contents of a file
|
|
199
|
-
write_file Create or overwrite a file
|
|
200
|
-
list_directory List directory contents
|
|
201
|
-
|
|
202
|
-
mcpspec> .call read_file {"path": "/tmp/test.txt"}
|
|
203
|
-
{
|
|
204
|
-
"content": "Hello, world!"
|
|
205
|
-
}
|
|
206
|
-
```
|
|
207
|
-
|
|
208
|
-
---
|
|
209
|
-
|
|
210
|
-
### Security Audit
|
|
211
|
-
|
|
212
|
-
8 security rules covering traditional vulnerabilities and LLM-specific threats. A safety filter auto-skips destructive tools, and `--dry-run` previews targets before scanning.
|
|
213
|
-
|
|
214
|
-
```bash
|
|
215
|
-
mcpspec audit "npx my-server" # Passive (safe)
|
|
216
|
-
mcpspec audit "npx my-server" --mode active # Active probing
|
|
217
|
-
mcpspec audit "npx my-server" --fail-on medium # CI gate
|
|
218
|
-
mcpspec audit "npx my-server" --exclude-tools delete # Skip tools
|
|
219
|
-
mcpspec audit "npx my-server" --dry-run # Preview targets
|
|
220
|
-
```
|
|
221
|
-
|
|
222
|
-
**Security rules:**
|
|
223
|
-
|
|
224
|
-
| Rule | Mode | What it detects |
|
|
225
|
-
|------|------|-----------------|
|
|
226
|
-
| Path Traversal | Passive | `../../etc/passwd` style directory escape attacks |
|
|
227
|
-
| Input Validation | Passive | Missing constraints (enum, pattern, min/max) on tool inputs |
|
|
228
|
-
| Info Disclosure | Passive | Leaked paths, stack traces, API keys in tool descriptions |
|
|
229
|
-
| Tool Poisoning | Passive | LLM prompt injection in descriptions, hidden Unicode, cross-tool manipulation |
|
|
230
|
-
| Excessive Agency | Passive | Destructive tools without confirmation params, arbitrary code execution |
|
|
231
|
-
| Resource Exhaustion | Active | Unbounded loops, large allocations, recursion |
|
|
232
|
-
| Auth Bypass | Active | Missing auth checks, hardcoded credentials |
|
|
233
|
-
| Injection | Active | SQL and command injection in tool inputs |
|
|
234
|
-
|
|
235
|
-
**Scan modes:**
|
|
236
|
-
|
|
237
|
-
- **Passive** (default) — 5 rules, analyzes metadata only, no tool calls. Safe for production.
|
|
238
|
-
- **Active** — All 8 rules, sends test payloads. Requires confirmation prompt.
|
|
239
|
-
- **Aggressive** — All 8 rules with more exhaustive probing. Requires confirmation prompt.
|
|
240
|
-
|
|
241
|
-
Active/aggressive modes auto-skip tools matching destructive patterns (`delete_*`, `drop_*`, `destroy_*`, etc.) and require explicit confirmation unless `--acknowledge-risk` is passed.
|
|
242
|
-
|
|
243
|
-
Each finding includes severity (info/low/medium/high/critical), description, evidence, and remediation advice.
|
|
244
|
-
|
|
245
|
-
---
|
|
246
|
-
|
|
247
|
-
### Recording & Replay
|
|
248
|
-
|
|
249
|
-
Record inspector sessions, save them, and replay against the same or different server versions. Diff output highlights regressions.
|
|
250
|
-
|
|
251
|
-
```bash
|
|
252
|
-
# Record a session
|
|
253
|
-
mcpspec record start "npx my-server"
|
|
254
|
-
mcpspec> .call get_user {"id": "1"}
|
|
255
|
-
mcpspec> .call list_items {}
|
|
256
|
-
mcpspec> .save my-session
|
|
257
|
-
|
|
258
|
-
# Later: replay against new server version
|
|
259
|
-
mcpspec record replay my-session "npx my-server-v2"
|
|
260
|
-
```
|
|
261
|
-
|
|
262
|
-
**Replay output:**
|
|
263
|
-
|
|
264
|
-
```
|
|
265
|
-
Replaying 3 steps against my-server-v2...
|
|
266
|
-
|
|
267
|
-
1/3 get_user............. [OK] 42ms
|
|
268
|
-
2/3 list_items........... [CHANGED] 38ms
|
|
269
|
-
3/3 create_item.......... [OK] 51ms
|
|
270
|
-
|
|
271
|
-
Summary: 2 matched, 1 changed, 0 added, 0 removed
|
|
272
|
-
```
|
|
273
|
-
|
|
274
|
-
**Manage recordings:**
|
|
275
|
-
|
|
276
|
-
```bash
|
|
277
|
-
mcpspec record list # List saved recordings
|
|
278
|
-
mcpspec record delete my-session # Delete a recording
|
|
279
|
-
```
|
|
280
|
-
|
|
281
|
-
Recordings are stored in `~/.mcpspec/recordings/` and include tool names, inputs, outputs, timing, and error states for each step.
|
|
282
|
-
|
|
283
|
-
---
|
|
284
|
-
|
|
285
|
-
### Mock Server
|
|
286
|
-
|
|
287
|
-
Turn any recording into a mock MCP server — a drop-in replacement for the real server. Useful for CI/CD without real dependencies, offline development, and deterministic tests.
|
|
288
|
-
|
|
289
|
-
```bash
|
|
290
|
-
# Start mock server from a recording (stdio transport)
|
|
291
|
-
mcpspec mock my-api
|
|
292
|
-
|
|
293
|
-
# Use as a server in test collections
|
|
294
|
-
mcpspec test --server "mcpspec mock my-api" ./tests.yaml
|
|
295
|
-
|
|
296
|
-
# Generate standalone .js file (only needs @modelcontextprotocol/sdk)
|
|
297
|
-
mcpspec mock my-api --generate ./mock-server.js
|
|
298
|
-
node mock-server.js
|
|
299
|
-
```
|
|
300
|
-
|
|
301
|
-
**Matching modes:**
|
|
302
|
-
|
|
303
|
-
| Mode | Behavior |
|
|
304
|
-
|------|----------|
|
|
305
|
-
| `match` (default) | Exact input match first, then next queued response per tool |
|
|
306
|
-
| `sequential` | Tape/cassette style — responses served in recorded order |
|
|
307
|
-
|
|
308
|
-
**Options:**
|
|
309
|
-
|
|
310
|
-
```bash
|
|
311
|
-
mcpspec mock my-api --mode sequential # Tape-style matching
|
|
312
|
-
mcpspec mock my-api --latency original # Simulate original response times
|
|
313
|
-
mcpspec mock my-api --latency 100 # Fixed 100ms delay
|
|
314
|
-
mcpspec mock my-api --on-missing empty # Return empty instead of error for unrecorded tools
|
|
315
|
-
```
|
|
316
|
-
|
|
317
|
-
The generated standalone file embeds the recording data and matching logic — commit it to your repo for portable, dependency-light mock servers.
|
|
318
|
-
|
|
319
|
-
---
|
|
320
|
-
|
|
321
384
|
### Performance Benchmarks
|
|
322
385
|
|
|
323
386
|
Measure latency and throughput with statistical analysis across hundreds of iterations.
|
|
@@ -326,58 +389,10 @@ Measure latency and throughput with statistical analysis across hundreds of iter
|
|
|
326
389
|
mcpspec bench "npx my-server" # 100 iterations
|
|
327
390
|
mcpspec bench "npx my-server" --iterations 500
|
|
328
391
|
mcpspec bench "npx my-server" --tool read_file
|
|
329
|
-
mcpspec bench "npx my-server" --args '{"path":"/tmp/f"}'
|
|
330
392
|
mcpspec bench "npx my-server" --warmup 10
|
|
331
393
|
```
|
|
332
394
|
|
|
333
|
-
|
|
334
|
-
|
|
335
|
-
```
|
|
336
|
-
Benchmarking read_file (100 iterations, 5 warmup)...
|
|
337
|
-
|
|
338
|
-
Latency
|
|
339
|
-
────────────────────────────
|
|
340
|
-
Min 12.34ms
|
|
341
|
-
Max 89.21ms
|
|
342
|
-
Mean 34.56ms
|
|
343
|
-
Median 31.22ms
|
|
344
|
-
P95 67.89ms
|
|
345
|
-
P99 82.45ms
|
|
346
|
-
Std Dev 15.23ms
|
|
347
|
-
|
|
348
|
-
Throughput: 28.94 calls/sec
|
|
349
|
-
Errors: 0
|
|
350
|
-
```
|
|
351
|
-
|
|
352
|
-
Warmup iterations (default: 5) are excluded from measurements. The profiler uses `performance.now()` for high-resolution timing.
|
|
353
|
-
|
|
354
|
-
---
|
|
355
|
-
|
|
356
|
-
### MCP Score
|
|
357
|
-
|
|
358
|
-
A 0-100 quality rating across 5 weighted categories with opinionated schema linting.
|
|
359
|
-
|
|
360
|
-
```bash
|
|
361
|
-
mcpspec score "npx my-server"
|
|
362
|
-
mcpspec score "npx my-server" --badge badge.svg # Generate SVG badge
|
|
363
|
-
mcpspec score "npx my-server" --min-score 80 # Fail if below threshold
|
|
364
|
-
```
|
|
365
|
-
|
|
366
|
-
**Scoring categories:**
|
|
367
|
-
|
|
368
|
-
| Category (weight) | What it measures |
|
|
369
|
-
|--------------------|-----------------|
|
|
370
|
-
| Documentation (25%) | Percentage of tools and resources with descriptions |
|
|
371
|
-
| Schema Quality (25%) | Property types, descriptions, required fields, constraints (enum/pattern/min/max), naming conventions |
|
|
372
|
-
| Error Handling (20%) | Structured error responses (`isError: true`) vs. crashes on bad input |
|
|
373
|
-
| Responsiveness (15%) | Median latency: <100ms = 100, <500ms = 80, <1s = 60, <5s = 40 |
|
|
374
|
-
| Security (15%) | Findings from passive security scan: 0 = 100, <=2 = 70, <=5 = 40 |
|
|
375
|
-
|
|
376
|
-
Schema quality uses 6 sub-criteria: structure (20%), property types (20%), descriptions (20%), required fields (15%), constraints (15%), naming conventions (10%).
|
|
377
|
-
|
|
378
|
-
The `--badge` flag generates a shields.io-style SVG badge for your README.
|
|
379
|
-
|
|
380
|
-
---
|
|
395
|
+
Output includes min/max/mean/median/P95/P99 latency, standard deviation, and throughput (calls/sec). Warmup iterations (default: 5) are excluded from measurements.
|
|
381
396
|
|
|
382
397
|
### Doc Generator
|
|
383
398
|
|
|
@@ -389,10 +404,6 @@ mcpspec docs "npx my-server" --format html # HTML output
|
|
|
389
404
|
mcpspec docs "npx my-server" --output ./docs # Write to directory
|
|
390
405
|
```
|
|
391
406
|
|
|
392
|
-
Generated docs include: server name/version/description, all tools with their input schemas, and all resources with URIs and descriptions.
|
|
393
|
-
|
|
394
|
-
---
|
|
395
|
-
|
|
396
407
|
### Web Dashboard
|
|
397
408
|
|
|
398
409
|
A full React UI for managing servers, running tests, viewing audit results, and more. Dark mode included.
|
|
@@ -403,57 +414,7 @@ mcpspec ui --port 8080 # Custom port
|
|
|
403
414
|
mcpspec ui --no-open # Don't auto-open browser
|
|
404
415
|
```
|
|
405
416
|
|
|
406
|
-
|
|
407
|
-
|
|
408
|
-
| Page | What it does |
|
|
409
|
-
|------|-------------|
|
|
410
|
-
| Dashboard | Overview of servers, collections, recent runs |
|
|
411
|
-
| Servers | Connect and manage MCP server connections |
|
|
412
|
-
| Collections | Create and edit YAML test collections |
|
|
413
|
-
| Runs | View test run history and results |
|
|
414
|
-
| Inspector | Interactive tool calling with schema forms and protocol logging |
|
|
415
|
-
| Audit | Run security scans and view findings |
|
|
416
|
-
| Benchmark | Performance profiling with charts |
|
|
417
|
-
| Score | MCP Score visualization |
|
|
418
|
-
| Docs | Generated server documentation |
|
|
419
|
-
| Recordings | View, replay, and manage recorded sessions |
|
|
420
|
-
|
|
421
|
-
Real-time WebSocket updates for running tests, live protocol logging in the inspector, and dark mode with localStorage persistence.
|
|
422
|
-
|
|
423
|
-
---
|
|
424
|
-
|
|
425
|
-
### CI/CD Integration
|
|
426
|
-
|
|
427
|
-
`ci-init` generates ready-to-use pipeline configurations. Deterministic exit codes and JUnit/JSON/TAP reporters for seamless CI integration.
|
|
428
|
-
|
|
429
|
-
```bash
|
|
430
|
-
mcpspec ci-init # Interactive wizard
|
|
431
|
-
mcpspec ci-init --platform github # GitHub Actions
|
|
432
|
-
mcpspec ci-init --platform gitlab # GitLab CI
|
|
433
|
-
mcpspec ci-init --platform shell # Shell script
|
|
434
|
-
mcpspec ci-init --checks test,audit,score # Choose checks
|
|
435
|
-
mcpspec ci-init --fail-on medium # Audit severity gate
|
|
436
|
-
mcpspec ci-init --min-score 70 # MCP Score threshold
|
|
437
|
-
mcpspec ci-init --force # Overwrite/replace existing
|
|
438
|
-
```
|
|
439
|
-
|
|
440
|
-
Auto-detects platform from `.github/` or `.gitlab-ci.yml`. GitLab `--force` surgically replaces only the mcpspec job block, preserving other jobs.
|
|
441
|
-
|
|
442
|
-
**Exit codes:**
|
|
443
|
-
|
|
444
|
-
| Code | Meaning |
|
|
445
|
-
|------|---------|
|
|
446
|
-
| `0` | Success |
|
|
447
|
-
| `1` | Test failure |
|
|
448
|
-
| `2` | Runtime error |
|
|
449
|
-
| `3` | Configuration error |
|
|
450
|
-
| `4` | Connection error |
|
|
451
|
-
| `5` | Timeout |
|
|
452
|
-
| `6` | Security findings above threshold |
|
|
453
|
-
| `7` | Validation error |
|
|
454
|
-
| `130` | Interrupted (Ctrl+C) |
|
|
455
|
-
|
|
456
|
-
---
|
|
417
|
+
10 pages: Dashboard, Servers, Collections, Runs, Inspector, Recordings, Audit, Benchmark, Docs, Score.
|
|
457
418
|
|
|
458
419
|
### Baselines & Comparison
|
|
459
420
|
|
|
@@ -467,10 +428,6 @@ mcpspec compare --baseline main # Explicit comparison
|
|
|
467
428
|
mcpspec compare <run-id-1> <run-id-2> # Compare two runs
|
|
468
429
|
```
|
|
469
430
|
|
|
470
|
-
Comparison output shows regressions (tests that now fail), fixes (tests that now pass), new tests, and removed tests.
|
|
471
|
-
|
|
472
|
-
---
|
|
473
|
-
|
|
474
431
|
### Transports
|
|
475
432
|
|
|
476
433
|
MCPSpec supports 3 transport types for connecting to MCP servers:
|
|
@@ -498,27 +455,27 @@ server:
|
|
|
498
455
|
url: http://localhost:3000/mcp
|
|
499
456
|
```
|
|
500
457
|
|
|
501
|
-
|
|
458
|
+
---
|
|
502
459
|
|
|
503
460
|
## Commands
|
|
504
461
|
|
|
505
462
|
| Command | Description |
|
|
506
463
|
|---------|-------------|
|
|
464
|
+
| `mcpspec record start <server>` | Record an inspector session — `.call`, `.save`, `.steps` |
|
|
465
|
+
| `mcpspec record replay <name> <server>` | Replay a recording and diff against original |
|
|
466
|
+
| `mcpspec mock <recording>` | Mock server from recording — `--mode`, `--latency`, `--on-missing`, `--generate` |
|
|
507
467
|
| `mcpspec test [collection]` | Run test collections with `--env`, `--tag`, `--parallel`, `--reporter`, `--watch`, `--ci` |
|
|
508
|
-
| `mcpspec inspect <server>` | Interactive REPL — `.tools`, `.call`, `.schema`, `.resources`, `.info` |
|
|
509
468
|
| `mcpspec audit <server>` | Security scan — `--mode`, `--fail-on`, `--exclude-tools`, `--dry-run` |
|
|
510
|
-
| `mcpspec bench <server>` | Performance benchmark — `--iterations`, `--tool`, `--args`, `--warmup` |
|
|
511
469
|
| `mcpspec score <server>` | Quality score (0-100) — `--badge badge.svg`, `--min-score` |
|
|
470
|
+
| `mcpspec ci-init` | Generate CI config — `--platform github\|gitlab\|shell`, `--checks`, `--fail-on`, `--force` |
|
|
471
|
+
| `mcpspec inspect <server>` | Interactive REPL — `.tools`, `.call`, `.schema`, `.resources`, `.info` |
|
|
472
|
+
| `mcpspec bench <server>` | Performance benchmark — `--iterations`, `--tool`, `--args`, `--warmup` |
|
|
512
473
|
| `mcpspec docs <server>` | Generate docs — `--format markdown\|html`, `--output <dir>` |
|
|
513
474
|
| `mcpspec compare` | Compare test runs or `--baseline <name>` |
|
|
514
475
|
| `mcpspec baseline save <name>` | Save/list baselines for regression detection |
|
|
515
|
-
| `mcpspec record start <server>` | Record an inspector session — `.call`, `.save`, `.steps` |
|
|
516
|
-
| `mcpspec record replay <name> <server>` | Replay a recording and diff against original |
|
|
517
476
|
| `mcpspec record list` | List saved recordings |
|
|
518
477
|
| `mcpspec record delete <name>` | Delete a saved recording |
|
|
519
|
-
| `mcpspec mock <recording>` | Mock server from recording — `--mode`, `--latency`, `--on-missing`, `--generate` |
|
|
520
478
|
| `mcpspec init [dir]` | Scaffold project — `--template minimal\|standard\|full` |
|
|
521
|
-
| `mcpspec ci-init` | Generate CI config — `--platform github\|gitlab\|shell`, `--checks`, `--fail-on`, `--force` |
|
|
522
479
|
| `mcpspec ui` | Launch web dashboard on `localhost:6274` |
|
|
523
480
|
|
|
524
481
|
## Community Collections
|
|
@@ -538,7 +495,6 @@ Pre-built test suites for popular MCP servers in [`examples/collections/servers/
|
|
|
538
495
|
**70 tests** covering tool discovery, read/write operations, error handling, security edge cases, and latency.
|
|
539
496
|
|
|
540
497
|
```bash
|
|
541
|
-
# Run community collections directly
|
|
542
498
|
mcpspec test examples/collections/servers/filesystem.yaml
|
|
543
499
|
mcpspec test examples/collections/servers/time.yaml --tag smoke
|
|
544
500
|
```
|
|
@@ -548,7 +504,7 @@ mcpspec test examples/collections/servers/time.yaml --tag smoke
|
|
|
548
504
|
| Package | Description |
|
|
549
505
|
|---------|-------------|
|
|
550
506
|
| `@mcpspec/shared` | Types, Zod schemas, constants |
|
|
551
|
-
| `@mcpspec/core` | MCP client, test runner, assertions, security scanner (8 rules), profiler, doc generator, scorer, recording/replay |
|
|
507
|
+
| `@mcpspec/core` | MCP client, test runner, assertions, security scanner (8 rules), profiler, doc generator, scorer, recording/replay, mock server |
|
|
552
508
|
| `@mcpspec/cli` | 13 CLI commands built with Commander.js |
|
|
553
509
|
| `@mcpspec/server` | Hono HTTP server with REST API + WebSocket |
|
|
554
510
|
| `@mcpspec/ui` | React SPA — TanStack Router, TanStack Query, Tailwind, shadcn/ui |
|
|
@@ -559,7 +515,7 @@ mcpspec test examples/collections/servers/time.yaml --tag smoke
|
|
|
559
515
|
git clone https://github.com/light-handle/mcpspec.git
|
|
560
516
|
cd mcpspec
|
|
561
517
|
pnpm install && pnpm build
|
|
562
|
-
pnpm test #
|
|
518
|
+
pnpm test # 334 tests across core + server
|
|
563
519
|
```
|
|
564
520
|
|
|
565
521
|
## License
|
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "mcpspec",
|
|
3
|
-
"version": "1.2.
|
|
3
|
+
"version": "1.2.2",
|
|
4
4
|
"description": "The definitive MCP server testing platform",
|
|
5
5
|
"keywords": [
|
|
6
6
|
"mcp",
|
|
@@ -29,9 +29,9 @@
|
|
|
29
29
|
"@inquirer/prompts": "^7.0.0",
|
|
30
30
|
"commander": "^12.1.0",
|
|
31
31
|
"open": "^10.1.0",
|
|
32
|
-
"@mcpspec/core": "1.2.
|
|
33
|
-
"@mcpspec/shared": "1.2.
|
|
34
|
-
"@mcpspec/server": "1.2.
|
|
32
|
+
"@mcpspec/core": "1.2.2",
|
|
33
|
+
"@mcpspec/shared": "1.2.2",
|
|
34
|
+
"@mcpspec/server": "1.2.2"
|
|
35
35
|
},
|
|
36
36
|
"devDependencies": {
|
|
37
37
|
"tsup": "^8.0.0",
|