mcpspec 1.0.2 → 1.1.0

Files changed (3)
  1. package/README.md +71 -370
  2. package/dist/index.js +309 -6
  3. package/package.json +4 -4
package/README.md CHANGED
@@ -4,414 +4,115 @@
4
4
 
5
5
  <h1 align="center">MCPSpec</h1>
6
6
 
7
- <p align="center"><strong>The complete testing, debugging, and quality platform for MCP servers.</strong></p>
7
+ <p align="center">
8
+ <strong>The complete testing platform for MCP servers</strong>
9
+ </p>
8
10
 
9
- MCPSpec is Postman for [Model Context Protocol](https://modelcontextprotocol.io) — test collections, interactive inspection, security auditing, performance benchmarking, auto-generated docs, and a quality scoring system. Works from the CLI, in CI/CD, or through a full web UI.
11
+ <p align="center">
12
+ <a href="https://www.npmjs.com/package/mcpspec"><img src="https://img.shields.io/npm/v/mcpspec.svg?style=flat&colorA=18181B&colorB=3b82f6" alt="npm version" /></a>
13
+ <a href="https://www.npmjs.com/package/mcpspec"><img src="https://img.shields.io/npm/dm/mcpspec.svg?style=flat&colorA=18181B&colorB=3b82f6" alt="npm downloads" /></a>
14
+ <a href="https://github.com/light-handle/mcpspec/blob/main/LICENSE"><img src="https://img.shields.io/github/license/light-handle/mcpspec?style=flat&colorA=18181B&colorB=3b82f6" alt="license" /></a>
15
+ <img src="https://img.shields.io/badge/node-%3E%3D22-3b82f6?style=flat&colorA=18181B" alt="node 22+" />
16
+ </p>
10
17
 
11
- ```
18
+ <p align="center">
19
+ Test collections, interactive inspection, security auditing, performance benchmarking, auto-generated docs, and quality scoring for <a href="https://modelcontextprotocol.io">Model Context Protocol</a> servers. Works from the CLI, in CI/CD, or through a full web UI.
20
+ </p>
21
+
22
+ ---
23
+
24
+ ```bash
12
25
  mcpspec test ./collection.yaml # Run tests
13
26
  mcpspec inspect "npx my-server" # Interactive REPL
14
- mcpspec audit "npx my-server" # Security scan
27
+ mcpspec audit "npx my-server" # Security scan (8 rules)
15
28
  mcpspec bench "npx my-server" # Performance benchmark
16
29
  mcpspec score "npx my-server" # Quality rating (0-100)
17
30
  mcpspec docs "npx my-server" # Auto-generate documentation
31
+ mcpspec record start "npx my-server" # Record & replay sessions
18
32
  mcpspec ui # Launch web dashboard
19
33
  ```
20
34
 
21
- ---
22
-
23
- ## Why MCPSpec?
24
-
25
- MCP servers expose tools (file access, database queries, API calls) to AI assistants. Before shipping a server, you need to answer:
26
-
27
- - **Does it work?** — Do tools return correct results? Do they handle bad input?
28
- - **Is it safe?** — Can inputs cause path traversal, injection, or information leaks?
29
- - **Is it fast?** — What's the P95 latency? Can it handle load?
30
- - **Is it documented?** — Do tools have descriptions and proper schemas?
31
-
32
- MCPSpec answers all of these with a single tool.
33
-
34
- ---
35
-
36
- ## Installation
35
+ ## Quick Start
37
36
 
38
37
  ```bash
38
+ # 1. Install
39
39
  npm install -g mcpspec
40
- ```
41
-
42
- Requires Node.js 22+.
43
-
44
- ---
45
-
46
- ## Quick Start
47
-
48
- ### 1. Initialize a project
49
40
 
50
- ```bash
41
+ # 2. Scaffold a project
51
42
  mcpspec init --template standard
52
- ```
53
-
54
- ### 2. Write a test collection
55
-
56
- ```yaml
57
- name: Filesystem Server Tests
58
- server: npx @modelcontextprotocol/server-filesystem /tmp
59
43
 
60
- tests:
61
- - name: Read a file
62
- call: read_file
63
- with:
64
- path: /tmp/test.txt
65
- expect:
66
- - exists: $.content
67
-
68
- - name: Handle missing file
69
- call: read_file
70
- with:
71
- path: /tmp/nonexistent.txt
72
- expectError: true
44
+ # 3. Run tests
45
+ mcpspec test
73
46
  ```
74
47
 
75
- ### 3. Run it
48
+ ## Features
76
49
 
77
- ```bash
78
- mcpspec test ./collection.yaml
79
- ```
80
-
81
- ```
82
- MCPSpec running Filesystem Server Tests (2 tests)
83
-
84
- Read a file (124ms)
85
- Handle missing file (89ms)
86
-
87
- Tests: 2 passed (2 total)
88
- Time: 0.45s
89
- ```
90
-
91
- ---
50
+ | Feature | Description |
51
+ |---|---|
52
+ | **Test Collections** | YAML-based test suites with 10 assertion types, environments, variables, tags, retries, and parallel execution |
53
+ | **Interactive Inspector** | Connect to any MCP server and explore tools, resources, and schemas in a live REPL |
54
+ | **Security Audit** | 8 rules: path traversal, injection, auth bypass, resource exhaustion, info disclosure, **tool poisoning** (LLM prompt injection), and **excessive agency** (overly broad tools). Safety filter auto-skips destructive tools; `--dry-run` previews targets |
55
+ | **Recording & Replay** | Record inspector sessions, save them, and replay against the same or different server versions. Diff output highlights regressions — matched, changed, added, removed steps |
56
+ | **Benchmarks** | Measure min/max/mean/median/P95/P99 latency and throughput across hundreds of iterations |
57
+ | **MCP Score** | 0-100 quality rating across documentation, schema quality (opinionated linting: property types, descriptions, constraints, naming conventions), error handling, responsiveness, and security |
58
+ | **Doc Generator** | Auto-generate Markdown or HTML documentation from server introspection |
59
+ | **Web Dashboard** | Full React UI with server management, test runner, audit viewer, and dark mode |
60
+ | **CI/CD Ready** | JUnit/JSON/TAP reporters, deterministic exit codes, `--ci` mode, GitHub Actions compatible |
92
61
 
93
62
  ## Commands
94
63
 
95
- ### `mcpspec test` — Run Test Collections
96
-
97
- ```bash
98
- mcpspec test # Uses ./mcpspec.yaml
99
- mcpspec test ./tests.yaml # Specific file
100
- mcpspec test --env staging # Use staging variables
101
- mcpspec test --tag @smoke # Filter by tag
102
- mcpspec test --parallel 4 # Parallel execution
103
- mcpspec test --reporter junit --output results.xml # JUnit for CI
104
- mcpspec test --baseline main # Compare against saved baseline
105
- mcpspec test --watch # Re-run on file changes
106
- mcpspec test --ci # CI mode (no colors, strict exit codes)
107
- ```
108
-
109
- **Reporters:** `console`, `json`, `junit`, `html`, `tap`
110
-
111
- ### `mcpspec inspect` — Interactive REPL
112
-
113
- ```bash
114
- mcpspec inspect "npx @modelcontextprotocol/server-filesystem /tmp"
115
- ```
116
-
117
64
  | Command | Description |
118
65
  |---------|-------------|
119
- | `.tools` | List all tools |
120
- | `.resources` | List all resources |
121
- | `.call <tool> <json>` | Call a tool |
122
- | `.schema <tool>` | Show input schema |
123
- | `.info` | Server info |
124
- | `.exit` | Disconnect |
125
-
126
- ### `mcpspec audit` — Security Scanner
127
-
128
- Scans for 6 categories of vulnerabilities:
129
-
130
- ```bash
131
- mcpspec audit "npx my-server" # Passive (safe, read-only)
132
- mcpspec audit "npx my-server" --mode active # Active (test payloads)
133
- mcpspec audit "npx my-server" --mode aggressive # Aggressive probing
134
- mcpspec audit "npx my-server" --fail-on medium # Fail CI on medium+ findings
135
- ```
136
-
137
- | Rule | What It Detects |
138
- |------|-----------------|
139
- | Path Traversal | `../../etc/passwd` style attacks |
140
- | Input Validation | Missing/malformed input handling |
141
- | Resource Exhaustion | Crash-inducing large payloads |
142
- | Auth Bypass | Access control circumvention |
143
- | Injection | SQL/command injection in tool inputs |
144
- | Information Disclosure | Leaked paths, stack traces, secrets |
145
-
146
- Active and aggressive modes send potentially harmful payloads and require confirmation (or `--acknowledge-risk` for CI).
147
-
148
- ### `mcpspec bench` — Performance Benchmark
149
-
150
- ```bash
151
- mcpspec bench "npx my-server" # Default: 100 iterations
152
- mcpspec bench "npx my-server" --iterations 500 # More iterations
153
- mcpspec bench "npx my-server" --tool read_file # Specific tool
154
- mcpspec bench "npx my-server" --args '{"path":"/tmp/f"}' # With arguments
155
- ```
156
-
157
- Reports min, max, mean, median, P95, P99, standard deviation, and throughput (calls/sec).
158
-
159
- ### `mcpspec score` — MCP Quality Score
160
-
161
- Calculates a 0–100 quality rating:
162
-
163
- ```bash
164
- mcpspec score "npx my-server"
165
- mcpspec score "npx my-server" --badge badge.svg # Generate SVG badge
166
- ```
167
-
168
- ```
169
- MCP Score
170
- ────────────────────────────────────────
171
- Documentation ████████████████████ 100/100
172
- Schema Quality ████████████████████ 100/100
173
- Error Handling ██████████████░░░░░░ 70/100
174
- Performance ████████████████░░░░ 80/100
175
- Security ████████████████████ 100/100
176
-
177
- Overall: 91/100
178
- ```
179
-
180
- | Category (weight) | What It Measures |
181
- |--------------------|-----------------|
182
- | Documentation (25%) | % of tools/resources with descriptions |
183
- | Schema Quality (25%) | Proper `type`, `properties`, `required` in input schemas |
184
- | Error Handling (20%) | Returns `isError: true` for bad input vs. crashing |
185
- | Performance (15%) | Median response latency |
186
- | Security (15%) | Findings from a passive security scan |
187
-
188
- The `--badge` flag generates a shields.io-style SVG for your README.
189
-
190
- ### `mcpspec docs` — Documentation Generator
191
-
192
- ```bash
193
- mcpspec docs "npx my-server" # Markdown to stdout
194
- mcpspec docs "npx my-server" --format html # HTML output
195
- mcpspec docs "npx my-server" --output ./docs # Write to directory
196
- ```
197
-
198
- Connects to the server, introspects all tools and resources, and generates documentation with tool descriptions, input schemas, and resource tables.
199
-
200
- ### `mcpspec compare` / `mcpspec baseline` — Regression Detection
201
-
202
- ```bash
203
- mcpspec baseline save main # Save current run as "main"
204
- mcpspec baseline list # List saved baselines
205
- mcpspec compare --baseline main # Compare latest run against baseline
206
- mcpspec compare <run-id-1> <run-id-2> # Compare two specific runs
207
- ```
208
-
209
- ### `mcpspec init` — Project Scaffolding
210
-
211
- ```bash
212
- mcpspec init # Current directory
213
- mcpspec init ./my-project # Specific directory
214
- mcpspec init --template minimal # Minimal starter
215
- mcpspec init --template standard # Standard (recommended)
216
- mcpspec init --template full # Full with environments
217
- ```
218
-
219
- ### `mcpspec ui` — Web Dashboard
220
-
221
- ```bash
222
- mcpspec ui # Opens localhost:6274
223
- mcpspec ui --port 8080 # Custom port
224
- ```
225
-
226
- Full web interface with:
227
- - Server management and connection testing
228
- - Collection editor with YAML validation
229
- - Test run history with drill-down
230
- - Interactive tool inspector
231
- - Security audit with live progress
232
- - Performance benchmarking with real-time stats
233
- - Documentation generator with copy/download
234
- - MCP Score calculator with category breakdown
235
- - Dark mode
236
-
237
- ---
238
-
239
- ## Collection Format
240
-
241
- ### Simple Format
242
-
243
- ```yaml
244
- name: My Tests
245
- server: npx my-mcp-server
246
-
247
- tests:
248
- - name: Basic call
249
- call: tool_name
250
- with:
251
- param: value
252
- expect:
253
- - exists: $.result
254
- ```
255
-
256
- ### Advanced Format
257
-
258
- ```yaml
259
- schemaVersion: "1.0"
260
- name: Comprehensive Tests
261
- description: Full test suite
262
-
263
- server:
264
- transport: stdio
265
- command: npx
266
- args: ["my-mcp-server"]
267
- env:
268
- NODE_ENV: test
269
-
270
- environments:
271
- dev:
272
- variables:
273
- BASE_PATH: /tmp/dev
274
- prod:
275
- variables:
276
- BASE_PATH: /data
277
-
278
- defaultEnvironment: dev
279
-
280
- tests:
281
- - id: test-1
282
- name: Get data
283
- tags: [smoke, api]
284
- timeout: 5000
285
- retries: 2
286
- call: get_data
287
- with:
288
- path: "{{BASE_PATH}}/file.txt"
289
- assertions:
290
- - type: schema
291
- - type: exists
292
- path: $.content
293
- - type: matches
294
- path: $.content
295
- pattern: "^Hello"
296
- - type: latency
297
- maxMs: 1000
298
- - type: expression
299
- expr: "response.content.length > 0"
300
- extract:
301
- - name: fileContent
302
- path: $.content
303
- ```
304
-
305
- ### Assertion Types
306
-
307
- | Type | Description | Example |
308
- |------|-------------|---------|
309
- | `schema` | Response is valid | `type: schema` |
310
- | `equals` | Exact match | `path: $.id, value: 123` |
311
- | `contains` | Array/string contains | `path: $.tags, value: "active"` |
312
- | `exists` | Path exists | `path: $.name` |
313
- | `matches` | Regex match | `path: $.email, pattern: ".*@.*"` |
314
- | `type` | Type check | `path: $.count, expected: number` |
315
- | `length` | Length check | `path: $.items, operator: gt, value: 0` |
316
- | `latency` | Response time | `maxMs: 1000` |
317
- | `mimeType` | Content type | `expected: "image/png"` |
318
- | `expression` | Safe expression | `expr: "response.total > 0"` |
319
-
320
- Expressions use [expr-eval](https://github.com/silentmatt/expr-eval) — comparisons, logical operators, property access, and math. No arbitrary code execution.
321
-
322
- ---
323
-
324
- ## CI/CD Integration
325
-
326
- ### GitHub Actions
327
-
328
- ```yaml
329
- name: MCP Server Tests
330
- on: [push, pull_request]
331
-
332
- jobs:
333
- test:
334
- runs-on: ubuntu-latest
335
- steps:
336
- - uses: actions/checkout@v4
337
- - uses: actions/setup-node@v4
338
- with:
339
- node-version: '22'
340
-
341
- - run: npm install -g mcpspec
342
-
343
- - name: Run tests
344
- run: mcpspec test --ci --reporter junit --output results.xml
345
-
346
- - name: Security audit
347
- run: mcpspec audit "npx my-server" --mode passive --fail-on high
348
-
349
- - uses: mikepenz/action-junit-report@v4
350
- if: always()
351
- with:
352
- report_paths: results.xml
353
- ```
354
-
355
- ### Exit Codes
356
-
357
- | Code | Meaning |
358
- |------|---------|
359
- | 0 | Success |
360
- | 1 | Test failure |
361
- | 2 | Runtime error |
362
- | 3 | Configuration error |
363
- | 4 | Connection error |
364
- | 5 | Timeout |
365
- | 6 | Security findings above threshold |
366
- | 7 | Validation error |
367
- | 130 | Interrupted (Ctrl+C) |
368
-
369
- ---
66
+ | `mcpspec test [collection]` | Run test collections with `--env`, `--tag`, `--parallel`, `--reporter`, `--watch`, `--ci` |
67
+ | `mcpspec inspect <server>` | Interactive REPL — `.tools`, `.call`, `.schema`, `.resources`, `.info` |
68
+ | `mcpspec audit <server>` | Security scan — `--mode`, `--fail-on`, `--exclude-tools`, `--dry-run` |
69
+ | `mcpspec bench <server>` | Performance benchmark — `--iterations`, `--tool`, `--args` |
70
+ | `mcpspec score <server>` | Quality score (0-100) — `--badge badge.svg` |
71
+ | `mcpspec docs <server>` | Generate docs — `--format markdown\|html`, `--output <dir>` |
72
+ | `mcpspec compare` | Compare test runs or `--baseline <name>` |
73
+ | `mcpspec baseline save <name>` | Save/list baselines for regression detection |
74
+ | `mcpspec record start <server>` | Record an inspector session — `.call`, `.save`, `.steps` |
75
+ | `mcpspec record replay <name> <server>` | Replay a recording and diff against original |
76
+ | `mcpspec record list` | List saved recordings |
77
+ | `mcpspec record delete <name>` | Delete a saved recording |
78
+ | `mcpspec init [dir]` | Scaffold project — `--template minimal\|standard\|full` |
79
+ | `mcpspec ui` | Launch web dashboard on `localhost:6274` |
80
+
81
+ ## Community Collections
82
+
83
+ Pre-built test suites for popular MCP servers in [`examples/collections/servers/`](https://github.com/light-handle/mcpspec/tree/main/examples/collections/servers):
84
+
85
+ | Collection | Server | Tests |
86
+ |------------|--------|-------|
87
+ | filesystem.yaml | @modelcontextprotocol/server-filesystem | 12 |
88
+ | memory.yaml | @modelcontextprotocol/server-memory | 10 |
89
+ | everything.yaml | @modelcontextprotocol/server-everything | 11 |
90
+ | fetch.yaml | @modelcontextprotocol/server-fetch | 7 |
91
+ | time.yaml | @modelcontextprotocol/server-time | 10 |
92
+ | chrome-devtools.yaml | chrome-devtools-mcp | 11 |
93
+ | github.yaml | @modelcontextprotocol/server-github | 9 |
94
+
95
+ **70 tests** covering tool discovery, read/write operations, error handling, security edge cases, and latency.
370
96
 
371
97
  ## Architecture
372
98
 
373
- MCPSpec is a TypeScript monorepo:
374
-
375
99
  | Package | Description |
376
100
  |---------|-------------|
377
101
  | `@mcpspec/shared` | Types, Zod schemas, constants |
378
- | `@mcpspec/core` | MCP client, test runner, assertions, security scanner, profiler, doc generator, scorer |
379
- | `@mcpspec/cli` | 10 CLI commands built with Commander.js |
380
- | `@mcpspec/server` | Hono HTTP server with REST API + WebSocket for real-time updates |
381
- | `@mcpspec/ui` | React SPA with TanStack Router, TanStack Query, Tailwind CSS, shadcn/ui |
382
-
383
- Key design decisions:
384
- - **Local-first** — works offline, no account needed, server binds to localhost only
385
- - **Safe by default** — FAILSAFE YAML parsing, secret masking, process cleanup on SIGINT/SIGTERM
386
- - **sql.js** for storage — WebAssembly SQLite, no native compilation required
387
- - **Transports** — stdio, SSE, and streamable-http (SSE/HTTP lazy-loaded for code splitting)
388
-
389
- ---
102
+ | `@mcpspec/core` | MCP client, test runner, assertions, security scanner (8 rules), profiler, doc generator, scorer, recording/replay |
103
+ | `@mcpspec/cli` | 11 CLI commands built with Commander.js |
104
+ | `@mcpspec/server` | Hono HTTP server with REST API + WebSocket |
105
+ | `@mcpspec/ui` | React SPA — TanStack Router, TanStack Query, Tailwind, shadcn/ui |
390
106
 
391
107
  ## Development
392
108
 
393
109
  ```bash
394
110
  git clone https://github.com/light-handle/mcpspec.git
395
111
  cd mcpspec
396
- pnpm install
397
- pnpm build
398
- pnpm test # 259 tests across core + server
112
+ pnpm install && pnpm build
113
+ pnpm test # 294 tests across core + server
399
114
  ```
400
115
 
401
- Run the CLI locally:
402
-
403
- ```bash
404
- node packages/cli/dist/index.js test ./examples/collections/simple.yaml
405
- ```
406
-
407
- Launch the UI in dev mode:
408
-
409
- ```bash
410
- node packages/cli/dist/index.js ui
411
- ```
412
-
413
- ---
414
-
415
116
  ## License
416
117
 
417
118
  MIT
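
Note on the Recording & Replay row in the README above: replayed sessions are described as being diffed into matched, changed, added, and removed steps. The sketch below shows one way such a classification could work, assuming steps are paired by position and outputs are compared by their JSON serialization. `diffSteps` and its comparison rule are hypothetical; the actual `RecordingDiffer` in `@mcpspec/core` is not part of this diff and may behave differently.

```javascript
// Hypothetical sketch, NOT the RecordingDiffer shipped in @mcpspec/core.
// Each step is assumed to look like { tool, output, isError }.
function diffSteps(recorded, replayed) {
  const summary = { matched: 0, changed: 0, added: 0, removed: 0 };
  const steps = [];
  const length = Math.max(recorded.length, replayed.length);

  for (let index = 0; index < length; index++) {
    const before = recorded[index];
    const after = replayed[index];
    let type;

    if (before && !after) {
      type = "removed"; // step exists only in the original recording
    } else if (!before && after) {
      type = "added"; // step exists only in the replay
    } else if (
      before.isError === after.isError &&
      // Serialized comparison: simple, but sensitive to object key order.
      JSON.stringify(before.output) === JSON.stringify(after.output)
    ) {
      type = "matched";
    } else {
      type = "changed";
    }

    summary[type] += 1;
    steps.push({ index, tool: (after ?? before).tool, type });
  }

  return { summary, steps };
}
```

Under this rule a replay would count as a regression when `summary.changed` or `summary.removed` is non-zero, which is the same condition the `record replay` command in `package/dist/index.js` uses to choose its exit code.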
package/dist/index.js CHANGED
@@ -1,7 +1,10 @@
1
1
  #!/usr/bin/env node
2
2
 
3
3
  // src/index.ts
4
- import { Command as Command11 } from "commander";
4
+ import { Command as Command12 } from "commander";
5
+ import { readFileSync as readFileSync3 } from "fs";
6
+ import { dirname, join as join2 } from "path";
7
+ import { fileURLToPath } from "url";
5
8
 
6
9
  // src/commands/test.ts
7
10
  import { Command } from "commander";
@@ -758,14 +761,16 @@ var SEVERITY_COLORS = {
758
761
  low: COLORS2.cyan,
759
762
  info: COLORS2.gray
760
763
  };
761
- var auditCommand = new Command7("audit").description("Run security audit on an MCP server").argument("<server>", "Server command or URL").option("--mode <mode>", "Scan mode: passive, active, aggressive", "passive").option("--acknowledge-risk", "Skip confirmation prompt for active/aggressive modes", false).option("--fail-on <severity>", "Fail with exit code 6 if findings at or above severity: info, low, medium, high, critical").option("--rules <rules...>", "Only run specific rules").action(async (serverCommand, options) => {
764
+ var auditCommand = new Command7("audit").description("Run security audit on an MCP server").argument("<server>", "Server command or URL").option("--mode <mode>", "Scan mode: passive, active, aggressive", "passive").option("--acknowledge-risk", "Skip confirmation prompt for active/aggressive modes", false).option("--fail-on <severity>", "Fail with exit code 6 if findings at or above severity: info, low, medium, high, critical").option("--rules <rules...>", "Only run specific rules").option("--exclude-tools <tools...>", "Skip specific tools during scanning").option("--dry-run", "Preview which tools will be scanned without running payloads", false).action(async (serverCommand, options) => {
762
765
  let client = null;
763
766
  try {
764
767
  const mode = options.mode;
765
768
  const config = new ScanConfig({
766
769
  mode,
767
770
  acknowledgeRisk: options.acknowledgeRisk,
768
- rules: options.rules
771
+ rules: options.rules,
772
+ excludeTools: options.excludeTools,
773
+ dryRun: options.dryRun
769
774
  });
770
775
  if (config.requiresConfirmation()) {
771
776
  console.log(`
@@ -793,6 +798,24 @@ ${COLORS2.cyan} Connecting to:${COLORS2.reset} ${serverCommand}`);
793
798
  console.log(`${COLORS2.gray} Scan mode: ${mode} | Rules: ${config.rules.join(", ")}${COLORS2.reset}
794
799
  `);
795
800
  const scanner = new SecurityScanner();
801
+ if (config.dryRun) {
802
+ const preview = await scanner.dryRun(client, config);
803
+ console.log(`${COLORS2.bold} Dry Run \u2014 Tools to scan:${COLORS2.reset}
804
+ `);
805
+ for (const tool of preview.tools) {
806
+ if (tool.included) {
807
+ console.log(` ${COLORS2.green}\u2713${COLORS2.reset} ${tool.name}`);
808
+ } else {
809
+ console.log(` ${COLORS2.yellow}\u2717${COLORS2.reset} ${tool.name} ${COLORS2.gray}(${tool.reason})${COLORS2.reset}`);
810
+ }
811
+ }
812
+ console.log(`
813
+ ${COLORS2.gray}Rules: ${preview.rules.join(", ")}${COLORS2.reset}`);
814
+ console.log(` ${COLORS2.gray}Mode: ${preview.mode}${COLORS2.reset}
815
+ `);
816
+ await client.disconnect();
817
+ process.exit(EXIT_CODES6.SUCCESS);
818
+ }
796
819
  const result = await scanner.scan(client, config, {
797
820
  onRuleStart: (_ruleId, ruleName) => {
798
821
  process.stdout.write(` ${COLORS2.gray}Running ${ruleName}...${COLORS2.reset}`);
@@ -1062,7 +1085,7 @@ ${COLORS5.bold} MCP Score${COLORS5.reset}`);
1062
1085
  { name: "Documentation", score: score.categories.documentation },
1063
1086
  { name: "Schema Quality", score: score.categories.schemaQuality },
1064
1087
  { name: "Error Handling", score: score.categories.errorHandling },
1065
- { name: "Performance", score: score.categories.performance },
1088
+ { name: "Responsiveness", score: score.categories.responsiveness },
1066
1089
  { name: "Security", score: score.categories.security }
1067
1090
  ];
1068
1091
  for (const cat of categories) {
@@ -1093,9 +1116,288 @@ ${COLORS5.bold} MCP Score${COLORS5.reset}`);
1093
1116
  }
1094
1117
  });
1095
1118
 
1119
+ // src/commands/record.ts
1120
+ import { Command as Command11 } from "commander";
1121
+ import { createInterface as createInterface2 } from "readline";
1122
+ import { randomUUID } from "crypto";
1123
+ import { EXIT_CODES as EXIT_CODES10 } from "@mcpspec/shared";
1124
+ import {
1125
+ MCPClient as MCPClient6,
1126
+ RecordingStore,
1127
+ RecordingReplayer,
1128
+ RecordingDiffer,
1129
+ formatError as formatError7
1130
+ } from "@mcpspec/core";
1131
+ var COLORS6 = {
1132
+ reset: "\x1B[0m",
1133
+ green: "\x1B[32m",
1134
+ red: "\x1B[31m",
1135
+ yellow: "\x1B[33m",
1136
+ gray: "\x1B[90m",
1137
+ bold: "\x1B[1m",
1138
+ cyan: "\x1B[36m",
1139
+ blue: "\x1B[34m"
1140
+ };
1141
+ var recordCommand = new Command11("record").description("Record, replay, and manage inspector session recordings");
1142
+ recordCommand.command("start").description("Start a recording session (interactive REPL)").argument("<server>", 'Server command (e.g., "npx @modelcontextprotocol/server-filesystem /tmp")').action(async (serverCommand) => {
1143
+ let client = null;
1144
+ const store = new RecordingStore();
1145
+ const steps = [];
1146
+ let toolList = [];
1147
+ try {
1148
+ client = new MCPClient6({ serverConfig: serverCommand });
1149
+ console.log(`${COLORS6.cyan}Connecting to: ${COLORS6.reset}${serverCommand}`);
1150
+ await client.connect();
1151
+ const info = client.getServerInfo();
1152
+ const serverName = info?.name ?? "unknown";
1153
+ console.log(`${COLORS6.green}Connected to ${serverName}${COLORS6.reset}`);
1154
+ const tools = await client.listTools();
1155
+ toolList = tools.map((t) => ({ name: t.name, description: t.description }));
1156
+ console.log(`${COLORS6.gray}${tools.length} tools available${COLORS6.reset}`);
1157
+ console.log(`
1158
+ ${COLORS6.bold}Recording mode.${COLORS6.reset} Type ${COLORS6.bold}.help${COLORS6.reset} for commands.
1159
+ `);
1160
+ const rl = createInterface2({
1161
+ input: process.stdin,
1162
+ output: process.stdout,
1163
+ prompt: `${COLORS6.red}rec>${COLORS6.reset} `
1164
+ });
1165
+ rl.prompt();
1166
+ rl.on("line", async (line) => {
1167
+ const trimmed = line.trim();
1168
+ if (!trimmed) {
1169
+ rl.prompt();
1170
+ return;
1171
+ }
1172
+ try {
1173
+ if (trimmed === ".exit" || trimmed === ".quit") {
1174
+ if (steps.length > 0) {
1175
+ console.log(`${COLORS6.yellow}Warning: ${steps.length} unsaved step(s). Use .save <name> first, or .exit to discard.${COLORS6.reset}`);
1176
+ if (trimmed === ".exit") {
1177
+ await client?.disconnect();
1178
+ rl.close();
1179
+ process.exit(EXIT_CODES10.SUCCESS);
1180
+ }
1181
+ } else {
1182
+ await client?.disconnect();
1183
+ rl.close();
1184
+ process.exit(EXIT_CODES10.SUCCESS);
1185
+ }
1186
+ return;
1187
+ }
1188
+ if (trimmed === ".help") {
1189
+ console.log(`
1190
+ ${COLORS6.bold}Recording commands:${COLORS6.reset}
1191
+ .tools List available tools
1192
+ .call <tool> <json> Call a tool and record the result
1193
+ .steps List recorded steps
1194
+ .save <name> Save recording with given name
1195
+ .exit Disconnect and exit
1196
+ `);
1197
+ rl.prompt();
1198
+ return;
1199
+ }
1200
+ if (trimmed === ".tools") {
1201
+ if (toolList.length === 0) {
1202
+ console.log(`${COLORS6.gray}No tools available${COLORS6.reset}`);
1203
+ } else {
1204
+ console.log(`
1205
+ ${COLORS6.bold}Tools (${toolList.length}):${COLORS6.reset}`);
1206
+ for (const tool of toolList) {
1207
+ console.log(` ${COLORS6.green}${tool.name}${COLORS6.reset}`);
1208
+ if (tool.description) console.log(` ${COLORS6.gray}${tool.description}${COLORS6.reset}`);
1209
+ }
1210
+ console.log("");
1211
+ }
1212
+ rl.prompt();
1213
+ return;
1214
+ }
1215
+ if (trimmed === ".steps") {
1216
+ if (steps.length === 0) {
1217
+ console.log(`${COLORS6.gray}No steps recorded yet${COLORS6.reset}`);
1218
+ } else {
1219
+ console.log(`
1220
+ ${COLORS6.bold}Recorded steps (${steps.length}):${COLORS6.reset}`);
1221
+ for (let i = 0; i < steps.length; i++) {
1222
+ const s = steps[i];
1223
+ const status = s.isError ? `${COLORS6.red}ERROR${COLORS6.reset}` : `${COLORS6.green}OK${COLORS6.reset}`;
1224
+ console.log(` ${i + 1}. ${s.tool} ${COLORS6.gray}${JSON.stringify(s.input)}${COLORS6.reset} [${status}] ${COLORS6.gray}${s.durationMs}ms${COLORS6.reset}`);
1225
+ }
1226
+ console.log("");
1227
+ }
1228
+ rl.prompt();
1229
+ return;
1230
+ }
1231
+ if (trimmed.startsWith(".save ")) {
1232
+ const name = trimmed.slice(6).trim();
1233
+ if (!name) {
1234
+ console.log(`${COLORS6.red}Usage: .save <name>${COLORS6.reset}`);
1235
+ rl.prompt();
1236
+ return;
1237
+ }
1238
+ if (steps.length === 0) {
1239
+ console.log(`${COLORS6.yellow}No steps to save. Use .call first.${COLORS6.reset}`);
1240
+ rl.prompt();
1241
+ return;
1242
+ }
1243
+ const recording = {
1244
+ id: randomUUID(),
1245
+ name,
1246
+ serverName: info?.name,
1247
+ tools: toolList,
1248
+ steps: [...steps],
1249
+ createdAt: (/* @__PURE__ */ new Date()).toISOString()
1250
+ };
1251
+ const path = store.save(name, recording);
1252
+ console.log(`${COLORS6.green}Saved recording "${name}" (${steps.length} steps) to ${path}${COLORS6.reset}`);
1253
+ rl.prompt();
1254
+ return;
1255
+ }
1256
+ if (trimmed.startsWith(".call ")) {
1257
+ const rest = trimmed.slice(6).trim();
1258
+ const spaceIdx = rest.indexOf(" ");
1259
+ let toolName;
1260
+ let args = {};
1261
+ if (spaceIdx === -1) {
1262
+ toolName = rest;
1263
+ } else {
1264
+ toolName = rest.slice(0, spaceIdx);
1265
+ const jsonStr = rest.slice(spaceIdx + 1).trim();
1266
+ try {
1267
+ args = JSON.parse(jsonStr);
1268
+ } catch {
1269
+ console.log(`${COLORS6.red}Invalid JSON: ${jsonStr}${COLORS6.reset}`);
1270
+ rl.prompt();
1271
+ return;
1272
+ }
1273
+ }
1274
+ console.log(`${COLORS6.gray}Calling ${toolName}...${COLORS6.reset}`);
1275
+ const start = performance.now();
1276
+ let output = [];
1277
+ let isError = false;
1278
+ try {
1279
+ const result = await client.callTool(toolName, args);
1280
+ output = result.content;
1281
+ isError = result.isError === true;
1282
+ } catch (err) {
1283
+ output = [{ type: "text", text: err instanceof Error ? err.message : String(err) }];
1284
+ isError = true;
1285
+ }
1286
+ const durationMs = Math.round(performance.now() - start);
1287
+ steps.push({ tool: toolName, input: args, output, isError, durationMs });
1288
+ const statusLabel = isError ? `${COLORS6.red}ERROR${COLORS6.reset}` : `${COLORS6.green}OK${COLORS6.reset}`;
1289
+ console.log(`[${statusLabel}] ${COLORS6.gray}${durationMs}ms${COLORS6.reset} (step ${steps.length})`);
1290
+ console.log(JSON.stringify(output, null, 2));
1291
+ rl.prompt();
1292
+ return;
1293
+ }
1294
+ console.log(`${COLORS6.yellow}Unknown command. Type .help for available commands.${COLORS6.reset}`);
1295
+ } catch (err) {
1296
+ const formatted = formatError7(err);
1297
+ console.log(`${COLORS6.red}${formatted.title}: ${formatted.description}${COLORS6.reset}`);
1298
+ }
1299
+ rl.prompt();
1300
+ });
1301
+ rl.on("close", async () => {
1302
+ await client?.disconnect();
1303
+ process.exit(EXIT_CODES10.SUCCESS);
1304
+ });
1305
+ } catch (err) {
1306
+ const formatted = formatError7(err);
1307
+ console.error(`
1308
+ ${formatted.title}: ${formatted.description}`);
1309
+ formatted.suggestions.forEach((s) => console.error(` - ${s}`));
1310
+ await client?.disconnect();
1311
+ process.exit(formatted.exitCode);
1312
+ }
1313
+ });
1314
+ recordCommand.command("list").description("List saved recordings").action(() => {
1315
+ const store = new RecordingStore();
1316
+ const recordings = store.list();
1317
+ if (recordings.length === 0) {
1318
+ console.log(`${COLORS6.gray}No recordings found.${COLORS6.reset}`);
1319
+ return;
1320
+ }
1321
+ console.log(`
1322
+ ${COLORS6.bold}Saved recordings (${recordings.length}):${COLORS6.reset}`);
1323
+ for (const name of recordings) {
1324
+ const recording = store.load(name);
1325
+ if (recording) {
1326
+ console.log(` ${COLORS6.green}${name}${COLORS6.reset} ${COLORS6.gray}(${recording.steps.length} steps, ${recording.createdAt})${COLORS6.reset}`);
1327
+ } else {
1328
+ console.log(` ${COLORS6.green}${name}${COLORS6.reset}`);
1329
+ }
1330
+ }
1331
+ console.log("");
1332
+ });
1333
+ recordCommand.command("replay").description("Replay a recording against a server and show diff").argument("<name>", "Recording name").argument("<server>", "Server command").action(async (name, serverCommand) => {
1334
+ const store = new RecordingStore();
1335
+ const recording = store.load(name);
1336
+ if (!recording) {
1337
+ console.error(`${COLORS6.red}Recording "${name}" not found.${COLORS6.reset}`);
1338
+ process.exit(EXIT_CODES10.ERROR);
1339
+ }
1340
+ let client = null;
1341
+ try {
1342
+ client = new MCPClient6({ serverConfig: serverCommand });
1343
+ console.log(`${COLORS6.cyan}Connecting to: ${COLORS6.reset}${serverCommand}`);
1344
+ await client.connect();
1345
+ console.log(`${COLORS6.green}Connected. Replaying ${recording.steps.length} steps...${COLORS6.reset}
1346
+ `);
1347
+ const replayer = new RecordingReplayer();
1348
+ const result = await replayer.replay(recording, client, {
1349
+ onStepStart: (i, step) => {
1350
+ process.stdout.write(` ${i + 1}/${recording.steps.length} ${step.tool}... `);
1351
+ },
1352
+ onStepComplete: (_i, replayed) => {
1353
+ const status = replayed.isError ? `${COLORS6.red}ERROR${COLORS6.reset}` : `${COLORS6.green}OK${COLORS6.reset}`;
1354
+ console.log(`[${status}] ${COLORS6.gray}${replayed.durationMs}ms${COLORS6.reset}`);
1355
+ }
1356
+ });
1357
+ const differ = new RecordingDiffer();
1358
+ const diff = differ.diff(recording, result.replayedSteps, result.replayedAt);
1359
+ console.log(`
1360
+ ${COLORS6.bold}Diff Summary:${COLORS6.reset}`);
1361
+ console.log(` ${COLORS6.green}Matched:${COLORS6.reset} ${diff.summary.matched}`);
1362
+ console.log(` ${COLORS6.yellow}Changed:${COLORS6.reset} ${diff.summary.changed}`);
1363
+ console.log(` ${COLORS6.blue}Added:${COLORS6.reset} ${diff.summary.added}`);
1364
+ console.log(` ${COLORS6.red}Removed:${COLORS6.reset} ${diff.summary.removed}`);
1365
+ if (diff.summary.changed > 0) {
1366
+ console.log(`
1367
+ ${COLORS6.bold}Changed steps:${COLORS6.reset}`);
1368
+ for (const step of diff.steps) {
1369
+ if (step.type === "changed") {
1370
+ console.log(` Step ${step.index + 1} (${step.tool}): ${COLORS6.yellow}${step.outputDiff}${COLORS6.reset}`);
1371
+ }
1372
+ }
1373
+ }
1374
+ await client.disconnect();
1375
+ const exitCode = diff.summary.changed > 0 || diff.summary.removed > 0 ? EXIT_CODES10.TEST_FAILURE : EXIT_CODES10.SUCCESS;
1376
+ process.exit(exitCode);
1377
+ } catch (err) {
1378
+ const formatted = formatError7(err);
1379
+ console.error(`
1380
+ ${formatted.title}: ${formatted.description}`);
1381
+ formatted.suggestions.forEach((s) => console.error(` - ${s}`));
1382
+ await client?.disconnect();
1383
+ process.exit(formatted.exitCode);
1384
+ }
1385
+ });
1386
+ recordCommand.command("delete").description("Delete a saved recording").argument("<name>", "Recording name").action((name) => {
1387
+ const store = new RecordingStore();
1388
+ if (store.delete(name)) {
1389
+ console.log(`${COLORS6.green}Deleted recording "${name}".${COLORS6.reset}`);
1390
+ } else {
1391
+ console.error(`${COLORS6.red}Recording "${name}" not found.${COLORS6.reset}`);
1392
+ process.exit(EXIT_CODES10.ERROR);
1393
+ }
1394
+ });
1395
+
1096
1396
  // src/index.ts
1097
- var program = new Command11();
1098
- program.name("mcpspec").description("The definitive MCP server testing platform").version("1.0.0");
1397
+ var __cliDir = dirname(fileURLToPath(import.meta.url));
1398
+ var pkg = JSON.parse(readFileSync3(join2(__cliDir, "..", "package.json"), "utf-8"));
1399
+ var program = new Command12();
1400
+ program.name("mcpspec").description("The definitive MCP server testing platform").version(pkg.version);
1099
1401
  program.addCommand(testCommand);
1100
1402
  program.addCommand(inspectCommand);
1101
1403
  program.addCommand(initCommand);
@@ -1106,4 +1408,5 @@ program.addCommand(auditCommand);
1106
1408
  program.addCommand(benchCommand);
1107
1409
  program.addCommand(docsCommand);
1108
1410
  program.addCommand(scoreCommand);
1411
+ program.addCommand(recordCommand);
1109
1412
  program.parse(process.argv);
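
Note on the `.call <tool> <json>` handling in the `record start` REPL above: the parsing can be isolated as a pure function, which makes it easier to test without a live server. This is a minimal sketch that mirrors the slicing logic in the diff; the name `parseCallLine` and the returned result shape are assumptions for illustration and do not exist in the package.

```javascript
// Hypothetical helper mirroring the `.call` parsing in src/commands/record.ts.
// Returns a result object instead of printing to the console.
function parseCallLine(line) {
  const trimmed = line.trim();
  if (!trimmed.startsWith(".call ")) {
    return { ok: false, error: "not a .call command" };
  }

  // Everything after ".call " is "<tool>" optionally followed by a JSON object.
  const rest = trimmed.slice(6).trim();
  const spaceIdx = rest.indexOf(" ");
  if (spaceIdx === -1) {
    // No arguments given: call the tool with an empty object.
    return { ok: true, tool: rest, args: {} };
  }

  const tool = rest.slice(0, spaceIdx);
  const jsonStr = rest.slice(spaceIdx + 1).trim();
  try {
    return { ok: true, tool, args: JSON.parse(jsonStr) };
  } catch {
    return { ok: false, error: `Invalid JSON: ${jsonStr}` };
  }
}
```

For example, `parseCallLine('.call read_file {"path":"/tmp/a"}')` yields the tool name `read_file` with `{ path: "/tmp/a" }` as arguments, while malformed JSON produces an error result rather than throwing.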
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "mcpspec",
3
- "version": "1.0.2",
3
+ "version": "1.1.0",
4
4
  "description": "The definitive MCP server testing platform",
5
5
  "keywords": [
6
6
  "mcp",
@@ -29,9 +29,9 @@
29
29
  "@inquirer/prompts": "^7.0.0",
30
30
  "commander": "^12.1.0",
31
31
  "open": "^10.1.0",
32
- "@mcpspec/core": "1.0.2",
33
- "@mcpspec/shared": "1.0.2",
34
- "@mcpspec/server": "1.0.2"
32
+ "@mcpspec/core": "1.1.0",
33
+ "@mcpspec/shared": "1.1.0",
34
+ "@mcpspec/server": "1.1.0"
35
35
  },
36
36
  "devDependencies": {
37
37
  "tsup": "^8.0.0",