@yawlabs/mcp-compliance 0.13.0 → 0.13.2

package/README.md CHANGED
@@ -1,591 +1,602 @@
1
- # @yawlabs/mcp-compliance
2
-
3
- [![npm version](https://img.shields.io/npm/v/@yawlabs/mcp-compliance)](https://www.npmjs.com/package/@yawlabs/mcp-compliance)
4
- [![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT)
5
- [![GitHub stars](https://img.shields.io/github/stars/YawLabs/mcp-compliance)](https://github.com/YawLabs/mcp-compliance/stargazers)
6
- [![CI](https://github.com/YawLabs/mcp-compliance/actions/workflows/ci.yml/badge.svg)](https://github.com/YawLabs/mcp-compliance/actions/workflows/ci.yml)
7
-
8
- **Test any MCP server for spec compliance.** 88-test suite covering transport, lifecycle, tools, resources, prompts, error handling, schema validation, and security against the [MCP specification](https://modelcontextprotocol.io/specification/2025-11-25). Works against **HTTP endpoints** (`https://my-server.com/mcp`) and **stdio servers** (`npx @modelcontextprotocol/server-filesystem /tmp`) alike. CLI, MCP server, and programmatic API.
9
-
10
- Built and maintained by [Yaw Labs](https://yaw.sh).
11
-
12
- ## Why this tool?
13
-
14
- MCP servers are multiplying fast — but most ship without compliance testing. Broken transport handling, missing error codes, malformed schemas, and silent capability violations are common. Hand-rolling test scripts is tedious and incomplete.
15
-
16
- This tool solves that:
17
-
18
- - **88 tests across 8 categories** — transport, lifecycle, tools, resources, prompts, error handling, schema validation, and security. No gaps. (HTTP runs all 85 transport-applicable tests; stdio runs ~75 — HTTP-specific tests like CORS, TLS, session headers, and rate limiting are gated out.)
19
- - **Capability-driven** — tests adapt to what the server declares. If it says it supports tools, tool tests become required. No false failures for features the server doesn't claim.
20
- - **Graded scoring** — A-F letter grade with a weighted score (required tests 70%, optional 30%). One number to communicate compliance.
21
- - **CI-ready** — `--strict` mode exits with code 1 on required test failures. Drop it into any pipeline.
22
- - **Spec-referenced** — every test links to the exact section of the MCP specification it validates. No ambiguity about what's being tested or why.
23
- - **Three interfaces** — CLI for humans, MCP server for AI assistants, programmatic API for integration.
24
- - **Published specification** — the [testing methodology](./MCP_COMPLIANCE_SPEC.md) and [rule catalog](./mcp-compliance-rules.json) are open (CC BY 4.0) so anyone can implement compatible tooling.
25
-
26
- ## Quick start
27
-
28
- **Remote HTTP server:**
29
-
30
- ```bash
31
- npx @yawlabs/mcp-compliance test https://my-server.com/mcp
32
- ```
33
-
34
- **Local stdio server** (the vast majority of MCP servers on npm):
35
-
36
- ```bash
37
- # Pass the command directly, Inspector-style
38
- npx @yawlabs/mcp-compliance test npx @modelcontextprotocol/server-filesystem /tmp
39
-
40
- # Or a local build
41
- npx @yawlabs/mcp-compliance test node ./dist/server.js
42
-
43
- # With env vars
44
- npx @yawlabs/mcp-compliance test -E GITHUB_TOKEN=$GITHUB_TOKEN -- npx @modelcontextprotocol/server-github
45
- ```
46
-
47
- **Install globally:**
48
-
49
- ```bash
50
- npm install -g @yawlabs/mcp-compliance
51
- mcp-compliance test https://my-server.com/mcp
52
- ```
53
-
54
- That's it. You'll get a colored terminal report with a letter grade (A-F), per-test pass/fail, and a compliance score.
55
-
56
- ## CLI usage
57
-
58
- ### HTTP targets
59
-
60
- ```bash
61
- # Terminal output with colors and grade
62
- mcp-compliance test https://my-server.com/mcp
63
-
64
- # JSON / SARIF for scripting + GitHub Code Scanning
65
- mcp-compliance test https://my-server.com/mcp --format json
66
- mcp-compliance test https://my-server.com/mcp --format sarif > compliance.sarif
67
-
68
- # Strict mode for CI — exits 1 on required-test failure
69
- mcp-compliance test https://my-server.com/mcp --strict
70
-
71
- # Auth (shorthand or full header)
72
- mcp-compliance test https://my-server.com/mcp --auth "Bearer tok123"
73
- mcp-compliance test https://my-server.com/mcp -H "X-Api-Key: abc"
74
-
75
- # Focus the run
76
- mcp-compliance test https://my-server.com/mcp --only transport,lifecycle
77
- mcp-compliance test https://my-server.com/mcp --skip prompts,resources
78
- mcp-compliance test https://my-server.com/mcp --verbose
79
- ```
80
-
81
- ### stdio targets
82
-
83
- Pass the command and its args as positional arguments (MCP Inspector-style). Use `--` to disambiguate when the target needs flags that collide with ours.
84
-
85
- ```bash
86
- # npm-distributed stdio servers
87
- mcp-compliance test npx -y @modelcontextprotocol/server-filesystem /tmp
88
- mcp-compliance test uvx mcp-server-git
89
-
90
- # Local build
91
- mcp-compliance test node ./dist/server.js
92
-
93
- # With env vars (repeatable -E, or --env-file)
94
- mcp-compliance test -E API_KEY=secret -E REGION=us-east-1 -- npx my-server
95
- mcp-compliance test --env-file .env -- node ./server.js
96
-
97
- # Set working directory
98
- mcp-compliance test --cwd ./services/mcp -- node ./dist/server.js
99
-
100
- # Target uses a flag that collides with ours — use `--` to separate
101
- mcp-compliance test --verbose -- node ./server.js --verbose
102
- ```
103
-
104
- On Windows, `npx` and other `.cmd` shims are handled automatically by spawning through the shell.
105
-
106
- ### Options
107
-
108
- | Option | Applies to | Description |
109
- |--------|-----------|-------------|
110
- | `--format <format>` | both | Output format: `terminal`, `json`, `sarif`, `github`, or `markdown` (default: `terminal`) |
111
- | `--config <path>` | both | Load defaults from a config file (default: `mcp-compliance.config.json` in cwd) |
112
- | `--output <file>` | both | Write a local SVG badge to the given path after the run |
113
- | `--list` | both | Print test IDs that would run given current filters, then exit (no connection) |
114
- | `--transport <kind>` | both | Filter by `http` or `stdio` (only used with `--list` when no target is provided) |
115
- | `--strict` | both | Exit with code 1 on any required test failure (for CI) |
116
- | `--min-grade <grade>` | both | Exit with code 1 if grade is below this threshold (`A`–`F`) |
117
- | `-H, --header <h>` | HTTP | Add header to all requests, format `"Key: Value"` (repeatable) |
118
- | `--auth <token>` | HTTP | Shorthand for `-H "Authorization: <token>"` |
119
- | `-E, --env <var>` | stdio | Set env var for stdio command, format `"KEY=VALUE"` (repeatable) |
120
- | `--env-file <path>` | stdio | Load env vars from a file (one `KEY=VALUE` per line) |
121
- | `--cwd <dir>` | stdio | Working directory for the stdio command |
122
- | `--timeout <ms>` | both | Request timeout in milliseconds (default: `15000`) |
123
- | `--preflight-timeout <ms>` | HTTP | Preflight connectivity check timeout (HTTP only) |
124
- | `--retries <n>` | both | Number of retries for failed tests (default: `0`) |
125
- | `--only <items>` | both | Only run tests matching these categories or test IDs (comma-separated) |
126
- | `--skip <items>` | both | Skip tests matching these categories or test IDs (comma-separated) |
127
- | `--verbose` | both | Print each test result as it runs (also forwards stdio stderr) |
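The `--env-file` row describes a plain "one `KEY=VALUE` per line" format. A minimal sketch of how such a file could be read is shown here. `parseEnvFile` is a hypothetical helper, not the package's implementation, and skipping blank lines and `#` comments is an assumption rather than documented behavior.

```typescript
// Hypothetical parser illustrating the "one KEY=VALUE per line" format.
// Skipping blank lines and `#` comments is an assumption, not documented behavior.
function parseEnvFile(text: string): Record<string, string> {
  const env: Record<string, string> = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq <= 0) continue; // no "=" at all, or an empty key
    // Split on the first "=" only, so values may contain "=" themselves.
    env[line.slice(0, eq).trim()] = line.slice(eq + 1).trim();
  }
  return env;
}

const env = parseEnvFile("API_KEY=secret\n# comment\nREGION=us-east-1\nTOKEN=a=b\n");
console.log(env);
```

Splitting on the first `=` keeps values such as base64 tokens intact; quoting and variable expansion are deliberately left out of this sketch.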
128
-
129
- ### CI integration
130
-
131
- **GitHub Action** (drop into any `.github/workflows/*.yml`):
132
-
133
- ```yaml
134
- - uses: YawLabs/mcp-compliance@v0
135
- with:
136
- target: 'node ./dist/server.js' # or a URL like https://my-server.com/mcp
137
- format: github # ::error / ::warning annotations on the PR
138
- strict: 'true' # exit non-zero if any required test fails
139
- min-grade: 'A' # also exit if grade slips
140
- ```
141
-
142
- **Manual CLI invocation:**
143
-
144
- ```bash
145
- # GitHub Actions: emits ::error / ::warning annotations inline on the PR
146
- mcp-compliance test https://my-server.com/mcp --format github --strict
147
-
148
- # Slack/Linear/PR comment: drop the body straight into a comment
149
- mcp-compliance test https://my-server.com/mcp --format markdown > report.md
150
-
151
- # HTML report (self-contained, share anywhere — issue comments, S3, GitHub Pages)
152
- mcp-compliance test https://my-server.com/mcp --format html > report.html
153
-
154
- # Block release if grade slips below B
155
- mcp-compliance test https://my-server.com/mcp --min-grade B
156
-
157
- # Preview which tests will run before connecting (handy for --only/--skip authoring)
158
- mcp-compliance test --list --transport stdio --skip security
159
-
160
- # Diff two runs — exit 1 if anything that was passing is now failing
161
- mcp-compliance test https://my-server.com/mcp --format json > current.json
162
- mcp-compliance diff baseline.json current.json
163
-
164
- # Watch mode for stdio dev loop — re-runs on file changes in cwd
165
- mcp-compliance test --watch -- node ./dist/server.js
166
-
167
- # Latency benchmark
168
- mcp-compliance benchmark -- node ./dist/server.js -r 200 -c 4
169
- ```
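The `diff` step above exits 1 when a test that was passing is now failing. A minimal TypeScript sketch of that comparison follows, assuming each JSON report carries a `tests` array whose entries expose `id` and `passed` (field names taken from the `TestResult` description in this README). `findRegressions` is a hypothetical helper, not part of the package, and the real command may compare more than this.

```typescript
// Hypothetical helper, not part of @yawlabs/mcp-compliance.
// Assumes report.tests[] entries expose `id` and `passed`.
interface TestOutcome {
  id: string;
  passed: boolean;
}

interface ReportLike {
  tests: TestOutcome[];
}

// Returns IDs of tests that passed in `baseline` but fail in `current`.
function findRegressions(baseline: ReportLike, current: ReportLike): string[] {
  const passedBefore = new Set(
    baseline.tests.filter((t) => t.passed).map((t) => t.id),
  );
  return current.tests
    .filter((t) => !t.passed && passedBefore.has(t.id))
    .map((t) => t.id);
}

const baseline: ReportLike = {
  tests: [
    { id: "lifecycle-init", passed: true },
    { id: "tools-list", passed: true },
    { id: "security-rate-limiting", passed: false },
  ],
};
const current: ReportLike = {
  tests: [
    { id: "lifecycle-init", passed: true },
    { id: "tools-list", passed: false },
    { id: "security-rate-limiting", passed: false },
  ],
};

// Only "tools-list" flipped from passing to failing.
const regressions = findRegressions(baseline, current);
console.log(regressions);
```

Tests that were already failing in the baseline are ignored, so a wrapper would only need to exit non-zero when `regressions` is non-empty.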
170
-
171
- **Docker:**
172
-
173
- ```bash
174
- docker run --rm ghcr.io/yawlabs/mcp-compliance test https://my-server.com/mcp
175
- ```
176
-
177
- ### Scaffold a config
178
-
179
- ```bash
180
- mcp-compliance init
181
- ```
182
-
183
- Interactive prompts walk you through transport (http/stdio), command/url, env vars, timeout, and strict mode — then write a `mcp-compliance.config.json` you can commit.
184
-
185
- ### Config file
186
-
187
- Check in a `mcp-compliance.config.json` so CI and your dev loop can run `mcp-compliance test` with no arguments. Supported locations (searched in order): `mcp-compliance.config.json`, `.mcp-compliancerc.json`, `.mcp-compliancerc`, and the `"mcp-compliance"` field of `package.json`. Pass `--config <path>` to load an explicit file.
188
-
189
- **HTTP:**
190
-
191
- ```json
192
- {
193
- "target": {
194
- "type": "http",
195
- "url": "https://my-server.com/mcp",
196
- "headers": { "Authorization": "Bearer tok123" }
197
- },
198
- "timeout": 20000,
199
- "strict": true
200
- }
201
- ```
202
-
203
- **stdio:**
204
-
205
- ```json
206
- {
207
- "target": {
208
- "type": "stdio",
209
- "command": "node",
210
- "args": ["./dist/server.js"],
211
- "env": { "LOG_LEVEL": "error" }
212
- },
213
- "skip": ["security"],
214
- "strict": true
215
- }
216
- ```
217
-
218
- Precedence: CLI flags > config file > defaults. Any field can be overridden on the command line.
219
-
220
- ### Publish a shareable badge (HTTP only)
221
-
222
- ```bash
223
- mcp-compliance badge https://my-server.com/mcp
224
- ```
225
-
226
- Runs the compliance suite, publishes the report to [mcp.hosting](https://mcp.hosting), and prints the markdown embed for your README. The badge image reflects the real grade (A–F) and links to the full report.
227
-
228
- | Option | Description |
229
- |--------|-------------|
230
- | `-H, --header <header>` | Add header to all requests, format `"Key: Value"` (repeatable) |
231
- | `--auth <token>` | Shorthand for `-H "Authorization: <token>"` |
232
- | `--timeout <ms>` | Request timeout in milliseconds (default: `15000`) |
233
- | `--no-publish` | Skip publishing; print a local badge markdown only |
234
- | `--output <file>` | Also write a local SVG badge to the given path |
235
-
236
- Reports are kept for 90 days from last submission; resubmitting the same URL overwrites the previous report. Auth headers are stripped client-side before upload. Private/loopback URLs (`localhost`, `127.0.0.1`, `192.168.*`, etc.) trigger an interactive confirmation before publishing, and are rejected by the server in any case.
237
-
238
- A delete token is returned at publish time and stored at `~/.mcp-compliance/tokens.json` (mode `0600`). Use it to take a report down:
239
-
240
- ```bash
241
- mcp-compliance unpublish https://my-server.com/mcp
242
- ```
243
-
244
- ### Local SVG badge (any transport)
245
-
246
- Stdio servers can't be published (no public URL to key on), but you can commit a local SVG reflecting the real grade:
247
-
248
- ```bash
249
- mcp-compliance test node ./dist/server.js --output badge.svg
250
- mcp-compliance badge npx -y @modelcontextprotocol/server-filesystem /tmp --output badge.svg
251
- ```
252
-
253
- Then embed it in your README:
254
-
255
- ```markdown
256
- ![MCP Compliance](./badge.svg)
257
- ```
258
-
259
- The `test` command never publishes — use it for CI, debugging, and local iteration. `badge` is the only command that publishes to mcp.hosting.
260
-
261
- ## What the 88 tests check
262
-
263
- <details>
264
- <summary><strong>Transport (13 tests)</strong></summary>
265
-
266
- - **transport-post** — Server accepts HTTP POST requests (required)
267
- - **transport-content-type** — Responds with application/json or text/event-stream (required)
268
- - **transport-notification-202** — Notifications return exactly 202 Accepted
269
- - **transport-content-type-reject** — Rejects non-JSON request Content-Type
270
- - **transport-session-id** — Enforces MCP-Session-Id after initialization
271
- - **transport-session-invalid** — Returns 404 for unknown session ID
272
- - **transport-get** — GET returns SSE stream or 405
273
- - **transport-delete** — DELETE accepted or returns 405
274
- - **transport-batch-reject** — Rejects JSON-RPC batch requests (required)
275
- - **transport-content-type-init** — Initialize response has valid content type
276
- - **transport-get-stream** — GET with session returns SSE or 405
277
- - **transport-concurrent** — Handles concurrent requests
278
- - **transport-sse-event-field** — SSE responses include required event: message field
279
-
280
- </details>
281
-
282
- <details>
283
- <summary><strong>Lifecycle (17 tests)</strong></summary>
284
-
285
- - **lifecycle-init** — Initialize handshake succeeds (required)
286
- - **lifecycle-proto-version** — Returns valid YYYY-MM-DD protocol version (required)
287
- - **lifecycle-server-info** — Includes serverInfo with name
288
- - **lifecycle-capabilities** — Returns capabilities object (required)
289
- - **lifecycle-jsonrpc** — Response is valid JSON-RPC 2.0 (required)
290
- - **lifecycle-ping** — Responds to ping method (required)
291
- - **lifecycle-instructions** — Instructions field is valid string if present
292
- - **lifecycle-id-match** — Response ID matches request ID (required)
293
- - **lifecycle-string-id** — Supports string request IDs (JSON-RPC 2.0)
294
- - **lifecycle-version-negotiate** — Handles unknown protocol version gracefully
295
- - **lifecycle-reinit-reject** — Rejects second initialize request
296
- - **lifecycle-logging** — logging/setLevel accepted (required if logging capability declared)
297
- - **lifecycle-completions** — completion/complete accepted (required if completions capability declared)
298
- - **lifecycle-cancellation** — Handles cancellation notifications
299
- - **lifecycle-progress** — Handles progress notifications gracefully
300
- - **lifecycle-list-changed** — Accepts listChanged notifications for declared capabilities
301
- - **lifecycle-progress-token** — Supports progress tokens in requests via SSE
302
-
303
- </details>
304
-
305
- <details>
306
- <summary><strong>Tools (4 tests)</strong></summary>
307
-
308
- - **tools-list** — tools/list returns valid array (required if tools capability declared)
309
- - **tools-call** — tools/call responds with correct format
310
- - **tools-pagination** — tools/list supports cursor-based pagination
311
- - **tools-content-types** — Tool content items have valid types
312
-
313
- </details>
314
-
315
- <details>
316
- <summary><strong>Resources (5 tests)</strong></summary>
317
-
318
- - **resources-list** — resources/list returns valid array (required if resources capability declared)
319
- - **resources-read** — resources/read returns content items
320
- - **resources-templates** — resources/templates/list works or returns method-not-found
321
- - **resources-pagination** — resources/list supports cursor-based pagination
322
- - **resources-subscribe** — Resource subscribe/unsubscribe (required if subscribe capability declared)
323
-
324
- </details>
325
-
326
- <details>
327
- <summary><strong>Prompts (3 tests)</strong></summary>
328
-
329
- - **prompts-list** — prompts/list returns valid array (required if prompts capability declared)
330
- - **prompts-get** — prompts/get returns valid messages
331
- - **prompts-pagination** — prompts/list supports cursor-based pagination
332
-
333
- </details>
334
-
335
- <details>
336
- <summary><strong>Error Handling (10 tests)</strong></summary>
337
-
338
- - **error-unknown-method** — Returns JSON-RPC error for unknown method (required)
339
- - **error-method-code** — Uses correct -32601 error code
340
- - **error-invalid-jsonrpc** — Handles malformed JSON-RPC (required)
341
- - **error-invalid-json** — Handles invalid JSON body
342
- - **error-missing-params** — Returns error for tools/call without name
343
- - **error-parse-code** — Returns -32700 for invalid JSON
344
- - **error-invalid-request-code** — Returns -32600 for invalid request
345
- - **tools-call-unknown** — Returns error for nonexistent tool name
346
- - **error-capability-gated** — Rejects methods for undeclared capabilities
347
- - **error-invalid-cursor** — Handles invalid pagination cursor gracefully
348
-
349
- </details>
350
-
351
- <details>
352
- <summary><strong>Schema Validation (6 tests)</strong></summary>
353
-
354
- - **tools-schema** — All tools have valid name and inputSchema (required if tools capability declared)
355
- - **tools-annotations** — Tool annotations are valid if present
356
- - **tools-title-field** — Tools include title field (2025-11-25)
357
- - **tools-output-schema** — Tools with outputSchema are valid (2025-11-25)
358
- - **prompts-schema** — Prompts have valid name field (required if prompts capability declared)
359
- - **resources-schema** — Resources have valid uri and name (required if resources capability declared)
360
-
361
- </details>
362
-
363
- <details>
364
- <summary><strong>Security (23 tests)</strong></summary>
365
-
366
- - **security-auth-required** — Rejects unauthenticated requests
367
- - **security-www-authenticate** — 401 responses include WWW-Authenticate header
368
- - **security-auth-malformed** — Rejects malformed auth credentials
369
- - **security-tls-required** — Enforces HTTPS/TLS
370
- - **security-session-entropy** — Session IDs are high-entropy
371
- - **security-session-not-auth** — Session ID does not bypass auth
372
- - **security-oauth-metadata** — Protected Resource Metadata endpoint exists (RFC 9728)
373
- - **security-token-in-uri** — Rejects auth tokens in query string
374
- - **security-cors-headers** — CORS headers are restrictive
375
- - **security-origin-validation** — Validates Origin header for DNS rebinding protection
376
- - **security-command-injection** — Resists command injection in tool params
377
- - **security-sql-injection** — Resists SQL injection in tool params
378
- - **security-path-traversal** — Resists path traversal in tool params
379
- - **security-ssrf-internal** — Resists SSRF to internal networks
380
- - **security-oversized-input** — Handles oversized inputs gracefully
381
- - **security-extra-params** — Rejects or ignores extra tool params
382
- - **security-tool-schema-defined** — All tools define inputSchema
383
- - **security-tool-rug-pull** — Tool definitions are stable across calls
384
- - **security-tool-description-poisoning** — Tool descriptions free of injection patterns
385
- - **security-tool-cross-reference** — Tools do not reference other tools by name
386
- - **security-error-no-stacktrace** — Error responses do not leak stack traces
387
- - **security-error-no-internal-ip** — Error responses do not leak internal IPs
388
- - **security-rate-limiting** — Rate limiting is enforced
389
-
390
- </details>
391
-
392
- ## Grading
393
-
394
- | Grade | Score |
395
- |-------|--------|
396
- | A | 90-100 |
397
- | B | 75-89 |
398
- | C | 60-74 |
399
- | D | 40-59 |
400
- | F | 0-39 |
401
-
402
- Required tests are worth 70% of the score, optional tests 30%. See the [full scoring algorithm](./MCP_COMPLIANCE_SPEC.md#2-scoring-algorithm) in the specification.
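The grade bands above amount to a threshold lookup. A minimal sketch follows, assuming the lower bound of each band is inclusive and that fractional scores such as 89.5 fall into the lower band. The `weightedScore` helper assumes the score is the required pass rate times 70 plus the optional pass rate times 30, which is only the simplest reading of the stated weights; the linked specification holds the authoritative algorithm. Both function names are hypothetical.

```typescript
type Grade = "A" | "B" | "C" | "D" | "F";

// Lower bounds taken from the grade table; inclusive bounds are an assumption.
function gradeFor(score: number): Grade {
  if (score >= 90) return "A";
  if (score >= 75) return "B";
  if (score >= 60) return "C";
  if (score >= 40) return "D";
  return "F";
}

// Assumed reading of "required 70%, optional 30%": pass rates weighted and summed.
// An empty group contributes its full weight here; the real tool may differ.
function weightedScore(
  requiredPassed: number,
  requiredTotal: number,
  optionalPassed: number,
  optionalTotal: number,
): number {
  const requiredRate = requiredTotal > 0 ? requiredPassed / requiredTotal : 1;
  const optionalRate = optionalTotal > 0 ? optionalPassed / optionalTotal : 1;
  return requiredRate * 70 + optionalRate * 30;
}

// All 10 required tests pass, 15 of 20 optional tests pass: 70 + 22.5 = 92.5.
const score = weightedScore(10, 10, 15, 20);
console.log(`${gradeFor(score)} (${score})`);
```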
403
-
404
- ## CI integration
405
-
406
- ```yaml
407
- # GitHub Actions example
408
- - name: MCP Compliance Check
409
- run: npx @yawlabs/mcp-compliance test ${{ env.MCP_SERVER_URL }} --strict
410
- ```
411
-
412
- ```yaml
413
- # With JSON output for parsing
414
- - name: MCP Compliance Check
415
- run: |
416
- npx @yawlabs/mcp-compliance test ${{ env.MCP_SERVER_URL }} --format json > compliance.json
417
- cat compliance.json | jq '.grade'
418
- ```
419
-
420
- ```yaml
421
- # With retries for flaky network conditions
422
- - name: MCP Compliance Check
423
- run: npx @yawlabs/mcp-compliance test ${{ env.MCP_SERVER_URL }} --strict --retries 2 --timeout 30000
424
- ```
425
-
426
- ```yaml
427
- # SARIF output for GitHub Code Scanning
428
- - name: MCP Compliance Check
429
- run: npx @yawlabs/mcp-compliance test ${{ env.MCP_SERVER_URL }} --format sarif > compliance.sarif
430
- - name: Upload SARIF
431
- uses: github/codeql-action/upload-sarif@v3
432
- with:
433
- sarif_file: compliance.sarif
434
- ```
435
-
436
- ## MCP server (for Claude Code, Cursor, etc.)
437
-
438
- This package also exposes an MCP server with 3 tools that can be used from Claude Code, Cursor, or any MCP client.
439
-
440
- ### Setup
441
-
442
- **Claude Code (one-liner):**
443
-
444
- ```bash
445
- claude mcp add mcp-compliance -- npx -y @yawlabs/mcp-compliance mcp
446
- ```
447
-
448
- **Or create `.mcp.json` in your project root:**
449
-
450
- macOS / Linux / WSL:
451
-
452
- ```json
453
- {
454
- "mcpServers": {
455
- "mcp-compliance": {
456
- "command": "npx",
457
- "args": ["-y", "@yawlabs/mcp-compliance", "mcp"]
458
- }
459
- }
460
- }
461
- ```
462
-
463
- Windows:
464
-
465
- ```json
466
- {
467
- "mcpServers": {
468
- "mcp-compliance": {
469
- "command": "cmd",
470
- "args": ["/c", "npx", "-y", "@yawlabs/mcp-compliance", "mcp"]
471
- }
472
- }
473
- }
474
- ```
475
-
476
- > **Tip:** This file is safe to commit — it contains no secrets.
477
-
478
- Restart your MCP client and approve the server when prompted.
479
-
480
- ### Tools
481
-
482
- - **mcp_compliance_test** — Run the full 88-test suite against a URL or stdio command. Supports auth, custom headers, env vars, timeout, retries, and category/test filtering. Returns grade, score, and detailed results.
483
- - **mcp_compliance_badge** — Get the badge markdown/HTML for a server. Supports auth and custom headers.
484
- - **mcp_compliance_explain** — Explain what a specific test ID checks and why it matters.
485
-
486
- All tools have [MCP tool annotations](https://modelcontextprotocol.io/specification/2025-11-25/server/tools#annotations) (`readOnlyHint`, `destructiveHint`, `idempotentHint`, `openWorldHint`) so MCP clients can skip confirmation dialogs for safe operations.
487
-
488
- ## Programmatic usage
489
-
490
- ```typescript
491
- import { runComplianceSuite } from '@yawlabs/mcp-compliance';
492
-
493
- const report = await runComplianceSuite('https://my-server.com/mcp');
494
- console.log(`Grade: ${report.grade} (${report.score}%)`);
495
-
496
- // With options
497
- const report2 = await runComplianceSuite('https://my-server.com/mcp', {
498
- headers: { 'Authorization': 'Bearer tok123' },
499
- timeout: 30000,
500
- retries: 1,
501
- only: ['transport', 'lifecycle'],
502
- });
503
-
504
- // Live progress for streaming UIs (e.g. server-sent-events to a browser)
505
- await runComplianceSuite('https://my-server.com/mcp', {
506
- onTestComplete: (result) => {
507
- // result has the full TestResult: id, name, category, required,
508
- // passed, details, durationMs, specRef. Push it to your client.
509
- sendToClient(result);
510
- },
511
- });
512
- ```
513
-
514
- ## Report schema
515
-
516
- The JSON output of the test suite is a stable, versioned contract. Every report includes a `schemaVersion` field at the top level. The full JSON Schema lives at [`schemas/report.v1.json`](./schemas/report.v1.json) and is shipped with the npm package.
517
-
518
- ```jsonc
519
- {
520
- "schemaVersion": "1", // bumped on breaking changes to the report shape
521
- "specVersion": "2025-11-25", // MCP spec version tested against
522
- "toolVersion": "0.10.0", // mcp-compliance version that produced the report
523
- "url": "...",
524
- "timestamp": "...",
525
- "grade": "A",
526
- "score": 92.5,
527
- "tests": [ ... ],
528
- // ...
529
- }
530
- ```
531
-
532
- Consumer guidance:
533
-
534
- - Pin against `schemaVersion`. Reject reports with an unknown version rather than guessing at the shape.
535
- - The schema validates with any Draft 2020-12 validator (e.g. `ajv`).
536
- - Within a major version, additions are non-breaking. Renames, removals, or type changes bump the version.
537
- - Two runs against the same server produce equivalent grade, score, and per-test pass/fail (modulo timings/timestamps).
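The first guideline can be sketched as a small guard that refuses any report whose `schemaVersion` the consumer does not recognize. `parseReport` and `SUPPORTED_SCHEMA_VERSIONS` are hypothetical names, and the guard reads only fields that appear in the sample report above. Full validation would still go through the shipped JSON Schema with a Draft 2020-12 validator.

```typescript
// Hypothetical consumer-side guard; not part of @yawlabs/mcp-compliance.
const SUPPORTED_SCHEMA_VERSIONS = new Set(["1"]);

interface ReportSummary {
  schemaVersion: string;
  grade: string;
  score: number;
}

function parseReport(json: string): ReportSummary {
  const data: unknown = JSON.parse(json);
  if (typeof data !== "object" || data === null) {
    throw new Error("report must be a JSON object");
  }
  const record = data as Record<string, unknown>;
  const version = record.schemaVersion;
  // Reject unknown versions instead of guessing at the shape.
  if (typeof version !== "string" || !SUPPORTED_SCHEMA_VERSIONS.has(version)) {
    throw new Error(`unsupported schemaVersion: ${String(version)}`);
  }
  if (typeof record.grade !== "string" || typeof record.score !== "number") {
    throw new Error("report is missing grade or score");
  }
  return { schemaVersion: version, grade: record.grade, score: record.score };
}

const ok = parseReport('{"schemaVersion":"1","grade":"A","score":92.5}');
console.log(ok.grade, ok.score);

// A report with an unrecognized version is rejected rather than read.
let rejected = false;
try {
  parseReport('{"schemaVersion":"2","grade":"A","score":92.5}');
} catch {
  rejected = true;
}
console.log(rejected);
```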
538
-
539
- ## Specification
540
-
541
- The compliance testing methodology is published as an open specification:
542
-
543
- - **[MCP Compliance Testing Specification](./MCP_COMPLIANCE_SPEC.md)** — test execution model, scoring algorithm, all 88 test rules with pass/fail criteria (CC BY 4.0)
544
- - **[Machine-readable rule catalog](./mcp-compliance-rules.json)** — JSON Schema-compliant catalog for programmatic consumption
545
- - **[Why `mcp-compliance`](./docs/WHY.md)** — the problem, existing alternatives, what this tool does differently
546
- - **[Fixing common failures](./docs/FIXES.md)** — recipes for the most frequent test failures with code snippets
547
- - **[Spec version migration policy](./docs/SPEC_VERSION_MIGRATION.md)** — how this tool evolves with MCP spec releases
548
- - **[mcp.hosting external API](./docs/EXT_API.md)** — public submit/retrieve/badge/delete endpoints used by `mcp-compliance badge` and any custom integrations
549
- - **[Enterprise tier (draft)](./docs/ENTERPRISE.md)** — paid tier structure for organizations with scheduled/private/audit-track compliance needs
550
- - **[Performance deep-dive](./docs/PERFORMANCE.md)** — why the suite is sequential and what parallel execution would cost
551
- - **[Spec PR drafts](./docs/spec-prs/)** — our proposed MCP spec clarifications for ambiguous cases we've hit
552
- - **[mcp.hosting integration spec](./docs/mcp-hosting-integration.md)** — the contract between this engine and the mcp.hosting platform: URL surfaces, data flow, storage model, badge API, leaderboard, router integration
553
-
554
- These are complementary to (not competing with) the [official MCP specification](https://modelcontextprotocol.io/specification/2025-11-25). The MCP spec defines what servers must do; this spec defines how to verify compliance.
555
-
556
- ## Requirements
557
-
558
- - Node.js 18+
559
-
560
- ## Contributing
561
-
562
- ```bash
563
- git clone https://github.com/YawLabs/mcp-compliance.git
564
- cd mcp-compliance
565
- npm install
566
- npm run build
567
- npm test
568
- ```
569
-
570
- **Development commands:**
571
-
572
- | Command | Description |
573
- |---------|-------------|
574
- | `npm run build` | Compile with tsup |
575
- | `npm run dev` | Watch mode |
576
- | `npm test` | Run tests (vitest) |
577
- | `npm run lint` | Check with Biome |
578
- | `npm run lint:fix` | Auto-fix with Biome |
579
- | `npm run typecheck` | TypeScript type checking |
580
- | `npm run test:ci` | Build + test (CI-safe) |
581
-
582
- ## Links
583
-
584
- - [mcp.hosting](https://mcp.hosting) — Hosted MCP server infrastructure
585
- - [MCP Specification](https://modelcontextprotocol.io/specification/2025-11-25)
586
- - [MCP Compliance Testing Spec](./MCP_COMPLIANCE_SPEC.md)
587
- - [Yaw Labs](https://yaw.sh)
588
-
589
- ## License
590
-
591
- MIT. See [LICENSE](./LICENSE).
1
+ # @yawlabs/mcp-compliance
2
+
3
+ [![npm version](https://img.shields.io/npm/v/@yawlabs/mcp-compliance)](https://www.npmjs.com/package/@yawlabs/mcp-compliance)
4
+ [![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT)
5
+ [![GitHub stars](https://img.shields.io/github/stars/YawLabs/mcp-compliance)](https://github.com/YawLabs/mcp-compliance/stargazers)
6
+ [![CI](https://github.com/YawLabs/mcp-compliance/actions/workflows/ci.yml/badge.svg)](https://github.com/YawLabs/mcp-compliance/actions/workflows/ci.yml)
7
+
8
+ **Test any MCP server for spec compliance.** 88-test suite covering transport, lifecycle, tools, resources, prompts, error handling, schema validation, and security against the [MCP specification](https://modelcontextprotocol.io/specification/2025-11-25). Works against **HTTP endpoints** (`https://my-server.com/mcp`) and **stdio servers** (`npx @modelcontextprotocol/server-filesystem /tmp`) alike. CLI, MCP server, and programmatic API.
9
+
10
+ Built and maintained by [Yaw Labs](https://yaw.sh).
11
+
12
+ ## Why this tool?
13
+
14
+ MCP servers are multiplying fast — but most ship without compliance testing. Broken transport handling, missing error codes, malformed schemas, and silent capability violations are common. Hand-rolling test scripts is tedious and incomplete.
15
+
16
+ This tool solves that:
17
+
18
+ - **88 tests across 8 categories** — transport, lifecycle, tools, resources, prompts, error handling, schema validation, and security. No gaps. (HTTP runs all 85 transport-applicable tests; stdio runs ~75 — HTTP-specific tests like CORS, TLS, session headers, and rate limiting are gated out.)
19
+ - **Capability-driven** — tests adapt to what the server declares. If it says it supports tools, tool tests become required. No false failures for features the server doesn't claim.
20
+ - **Graded scoring** — A-F letter grade with a weighted score (required tests 70%, optional 30%). One number to communicate compliance.
21
+ - **CI-ready** — `--strict` mode exits with code 1 on required test failures. Drop it into any pipeline.
22
+ - **Spec-referenced** — every test links to the exact section of the MCP specification it validates. No ambiguity about what's being tested or why.
23
+ - **Three interfaces** — CLI for humans, MCP server for AI assistants, programmatic API for integration.
24
+ - **Published methodology** — the [testing methodology](./COMPLIANCE_RUBRIC.md) and [rule catalog](./mcp-compliance-rules.json) are open (CC BY 4.0) so anyone can build compatible tooling or fork the rules.
25
+
26
+ ## Quick start
27
+
28
+ **Remote HTTP server:**
29
+
30
+ ```bash
31
+ npx @yawlabs/mcp-compliance test https://my-server.com/mcp
32
+ ```
33
+
34
+ **Local stdio server** (the vast majority of MCP servers on npm):
35
+
36
+ ```bash
37
+ # Pass the command directly, Inspector-style
38
+ npx @yawlabs/mcp-compliance test npx @modelcontextprotocol/server-filesystem /tmp
39
+
40
+ # Or a local build
41
+ npx @yawlabs/mcp-compliance test node ./dist/server.js
42
+
43
+ # With env vars
44
+ npx @yawlabs/mcp-compliance test -E GITHUB_TOKEN=$GITHUB_TOKEN -- npx @modelcontextprotocol/server-github
45
+ ```
46
+
47
+ **Install globally:**
48
+
49
+ ```bash
50
+ npm install -g @yawlabs/mcp-compliance
51
+ mcp-compliance test https://my-server.com/mcp
52
+ ```
53
+
54
+ That's it. You'll get a colored terminal report with a letter grade (A-F), per-test pass/fail, and a compliance score.
55
+
56
+ ## CLI usage
57
+
58
+ ### HTTP targets
59
+
60
+ ```bash
61
+ # Terminal output with colors and grade
62
+ mcp-compliance test https://my-server.com/mcp
63
+
64
+ # JSON / SARIF for scripting + GitHub Code Scanning
65
+ mcp-compliance test https://my-server.com/mcp --format json
66
+ mcp-compliance test https://my-server.com/mcp --format sarif > compliance.sarif
67
+
68
+ # Strict mode for CI — exits 1 on required-test failure
69
+ mcp-compliance test https://my-server.com/mcp --strict
70
+
71
+ # Auth (shorthand or full header)
72
+ mcp-compliance test https://my-server.com/mcp --auth "Bearer tok123"
73
+ mcp-compliance test https://my-server.com/mcp -H "X-Api-Key: abc"
74
+
75
+ # Focus the run
76
+ mcp-compliance test https://my-server.com/mcp --only transport,lifecycle
77
+ mcp-compliance test https://my-server.com/mcp --skip prompts,resources
78
+ mcp-compliance test https://my-server.com/mcp --verbose
79
+ ```
80
+
81
+ ### stdio targets
82
+
83
+ Pass the command and its args as positional arguments (MCP Inspector-style). Use `--` to disambiguate when the target needs flags that collide with ours.
84
+
85
+ ```bash
86
+ # npm-distributed stdio servers
87
+ mcp-compliance test npx -y @modelcontextprotocol/server-filesystem /tmp
88
+ mcp-compliance test uvx mcp-server-git
89
+
90
+ # Local build
91
+ mcp-compliance test node ./dist/server.js
92
+
93
+ # With env vars (repeatable -E, or --env-file)
94
+ mcp-compliance test -E API_KEY=secret -E REGION=us-east-1 -- npx my-server
95
+ mcp-compliance test --env-file .env -- node ./server.js
96
+
97
+ # Set working directory
98
+ mcp-compliance test --cwd ./services/mcp -- node ./dist/server.js
99
+
100
+ # Target uses a flag that collides with ours — use `--` to separate
101
+ mcp-compliance test --verbose -- node ./server.js --verbose
102
+ ```
103
+
104
+ On Windows, `npx` and other `.cmd` shims are handled automatically by spawning through the shell.
105
+
106
+ ### Options
107
+
108
+ | Option | Applies to | Description |
109
+ |--------|-----------|-------------|
110
+ | `--format <format>` | both | Output format: `terminal`, `json`, `sarif`, `github`, `markdown`, or `html` (default: `terminal`) |
111
+ | `--config <path>` | both | Load defaults from a config file (default: `mcp-compliance.config.json` in cwd) |
112
+ | `--output <file>` | both | Write a local SVG badge to the given path after the run |
113
+ | `--list` | both | Print test IDs that would run given current filters, then exit (no connection) |
114
+ | `--transport <kind>` | both | Filter by `http` or `stdio` (only used with `--list` when no target is provided) |
115
+ | `--strict` | both | Exit with code 1 on any required test failure (for CI) |
116
+ | `--min-grade <grade>` | both | Exit with code 1 if grade is below this threshold (`A`–`F`) |
117
+ | `-H, --header <h>` | HTTP | Add header to all requests, format `"Key: Value"` (repeatable) |
118
+ | `--auth <token>` | HTTP | Shorthand for `-H "Authorization: <token>"` |
119
+ | `-E, --env <var>` | stdio | Set env var for stdio command, format `"KEY=VALUE"` (repeatable) |
120
+ | `--env-file <path>` | stdio | Load env vars from a file (one `KEY=VALUE` per line) |
121
+ | `--cwd <dir>` | stdio | Working directory for the stdio command |
122
+ | `--timeout <ms>` | both | Request timeout in milliseconds (default: `15000`) |
123
+ | `--preflight-timeout <ms>` | HTTP | Preflight connectivity check timeout (HTTP only) |
124
+ | `--retries <n>` | both | Number of retries for failed tests (default: `0`) |
125
+ | `--only <items>` | both | Only run tests matching these categories or test IDs (comma-separated) |
126
+ | `--skip <items>` | both | Skip tests matching these categories or test IDs (comma-separated) |
127
+ | `--concurrency <n>` | both | Max parallel-safe tests in flight (default: `1`; raising reduces wall time but can perturb timing-sensitive servers) |
128
+ | `--verbose` | both | Print each test result as it runs (also forwards stdio stderr) |
129
+
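The `--only` and `--skip` rows above say each item matches either a category or a test ID. A minimal sketch of that selection logic follows. `TestDescriptor` and `selectTests` are hypothetical names, and letting `--skip` win when both filters match the same test is an assumption of this example, not documented behavior.

```typescript
// Hypothetical sketch of --only / --skip matching; not the package's code.
interface TestDescriptor {
  id: string;
  category: string;
}

function selectTests(
  all: TestDescriptor[],
  only: string[] = [],
  skip: string[] = []
): TestDescriptor[] {
  // An item matches when it equals the test ID or the test's category.
  const matches = (t: TestDescriptor, items: string[]): boolean =>
    items.indexOf(t.id) !== -1 || items.indexOf(t.category) !== -1;
  // Assumption: an empty --only means "everything", and --skip takes priority.
  return all.filter(
    (t) => (only.length === 0 || matches(t, only)) && !matches(t, skip)
  );
}

const catalog: TestDescriptor[] = [
  { id: "transport-post", category: "transport" },
  { id: "lifecycle-init", category: "lifecycle" },
  { id: "security-rate-limiting", category: "security" },
];

// Mixes a category and a single test ID, as --only transport,lifecycle-init would.
console.log(selectTests(catalog, ["transport", "lifecycle-init"]).map((t) => t.id));
```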
130
+ ### CI integration
131
+
132
+ **GitHub Action** (drop into any `.github/workflows/*.yml`):
133
+
134
+ ```yaml
135
+ - uses: YawLabs/mcp-compliance@v0
136
+   with:
137
+     target: 'node ./dist/server.js' # or a URL like https://my-server.com/mcp
138
+     format: github # ::error / ::warning annotations on the PR
139
+     strict: 'true' # exit non-zero if any required test fails
140
+     min-grade: 'A' # also exit if grade slips
141
+ ```
142
+
143
+ **Manual CLI invocation:**
144
+
145
+ ```bash
146
+ # GitHub Actions: emits ::error / ::warning annotations inline on the PR
147
+ mcp-compliance test https://my-server.com/mcp --format github --strict
148
+
149
+ # Slack/Linear/PR comment: drop the body straight into a comment
150
+ mcp-compliance test https://my-server.com/mcp --format markdown > report.md
151
+
152
+ # HTML report (self-contained; share anywhere: issue comments, S3, GitHub Pages)
153
+ mcp-compliance test https://my-server.com/mcp --format html > report.html
154
+
155
+ # Block release if grade slips below B
156
+ mcp-compliance test https://my-server.com/mcp --min-grade B
157
+
158
+ # Preview which tests will run before connecting (handy for --only/--skip authoring)
159
+ mcp-compliance test --list --transport stdio --skip security
160
+
161
+ # Diff two runs: exit 1 if anything that was passing is now failing
162
+ mcp-compliance test https://my-server.com/mcp --format json > current.json
163
+ mcp-compliance diff baseline.json current.json
164
+
165
+ # Watch mode for stdio dev loop — re-runs on file changes in cwd
166
+ mcp-compliance test --watch -- node ./dist/server.js
167
+
168
+ # Latency benchmark
169
+ mcp-compliance benchmark -- node ./dist/server.js -r 200 -c 4
170
+ ```
171
+
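The `diff` command above is described as failing when a previously passing test now fails. The same check can be approximated over two JSON reports, as sketched below. It assumes only the `tests` array with `id` and `passed` fields that this README mentions elsewhere; `RunReport` and `findRegressions` are hypothetical names, and treating tests that are absent from the current run as non-regressions is an assumption.

```typescript
// Hypothetical sketch of regression detection between two reports.
interface RunTest {
  id: string;
  passed: boolean;
}

interface RunReport {
  tests: RunTest[];
}

// Returns IDs that passed in the baseline and explicitly fail in the current run.
function findRegressions(baseline: RunReport, current: RunReport): string[] {
  const currentById: Record<string, boolean> = {};
  for (const t of current.tests) currentById[t.id] = t.passed;
  return baseline.tests
    .filter((t) => t.passed && currentById[t.id] === false)
    .map((t) => t.id);
}

const baselineRun: RunReport = {
  tests: [
    { id: "lifecycle-init", passed: true },
    { id: "tools-list", passed: true },
    { id: "tools-pagination", passed: false },
  ],
};
const currentRun: RunReport = {
  tests: [
    { id: "lifecycle-init", passed: true },
    { id: "tools-list", passed: false },
    { id: "tools-pagination", passed: false },
  ],
};

// A wrapper script could exit non-zero whenever this list is non-empty.
console.log(findRegressions(baselineRun, currentRun));
```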
172
+ **Docker:**
173
+
174
+ ```bash
175
+ docker run --rm ghcr.io/yawlabs/mcp-compliance test https://my-server.com/mcp
176
+ ```
177
+
178
+ ### Scaffold a config
179
+
180
+ ```bash
181
+ mcp-compliance init
182
+ ```
183
+
184
+ Interactive prompts walk you through transport (http/stdio), command/url, env vars, timeout, and strict mode — then write a `mcp-compliance.config.json` you can commit.
185
+
186
+ ### Config file
187
+
188
+ Check in a `mcp-compliance.config.json` so CI and your dev loop can run `mcp-compliance test` with no arguments. Supported locations (searched in order): `mcp-compliance.config.json`, `.mcp-compliancerc.json`, `.mcp-compliancerc`, and the `"mcp-compliance"` field of `package.json`. Pass `--config <path>` to load an explicit file.
189
+
190
+ **HTTP:**
191
+
192
+ ```json
193
+ {
194
+ "target": {
195
+ "type": "http",
196
+ "url": "https://my-server.com/mcp",
197
+ "headers": { "Authorization": "Bearer tok123" }
198
+ },
199
+ "timeout": 20000,
200
+ "strict": true
201
+ }
202
+ ```
203
+
204
+ **stdio:**
205
+
206
+ ```json
207
+ {
208
+ "target": {
209
+ "type": "stdio",
210
+ "command": "node",
211
+ "args": ["./dist/server.js"],
212
+ "env": { "LOG_LEVEL": "error" }
213
+ },
214
+ "skip": ["security"],
215
+ "strict": true
216
+ }
217
+ ```
218
+
219
+ Precedence: CLI flags > config file > defaults. Any field can be overridden on the command line.
220
+
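The precedence rule can be pictured as layering three option sources. The sketch below is illustrative only: `RunOptions`, `DEFAULTS`, and `resolveOptions` are hypothetical names. The `timeout` and `retries` defaults come from the options table, while `strict: false` as a default is an assumption.

```typescript
// Hypothetical sketch of "CLI flags > config file > defaults".
interface RunOptions {
  timeout: number;
  retries: number;
  strict: boolean;
}

// timeout and retries defaults are from the options table; strict is assumed.
const DEFAULTS: RunOptions = { timeout: 15000, retries: 0, strict: false };

function resolveOptions(
  defaults: RunOptions,
  fromConfig: Partial<RunOptions>,
  fromCli: Partial<RunOptions>
): RunOptions {
  const resolved: RunOptions = { ...defaults };
  // Later layers win; a field left undefined never overrides an earlier value.
  for (const layer of [fromConfig, fromCli]) {
    if (layer.timeout !== undefined) resolved.timeout = layer.timeout;
    if (layer.retries !== undefined) resolved.retries = layer.retries;
    if (layer.strict !== undefined) resolved.strict = layer.strict;
  }
  return resolved;
}

// Config file sets timeout 20000 and strict; the command line passes --timeout 30000.
const effective = resolveOptions(DEFAULTS, { timeout: 20000, strict: true }, { timeout: 30000 });
console.log(effective);
```

In this example the command-line timeout wins, `strict` comes from the config file, and `retries` falls back to the default.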
221
+ ### Publish a shareable badge (HTTP only)
222
+
223
+ ```bash
224
+ mcp-compliance badge https://my-server.com/mcp
225
+ ```
226
+
227
+ Runs the compliance suite, publishes the report to [mcp.hosting](https://mcp.hosting), and prints the markdown embed for your README. The badge image reflects the real grade (A–F) and links to the full report.
228
+
229
+ | Option | Description |
230
+ |--------|-------------|
231
+ | `-H, --header <header>` | Add header to all requests, format `"Key: Value"` (repeatable) |
232
+ | `--auth <token>` | Shorthand for `-H "Authorization: <token>"` |
233
+ | `--timeout <ms>` | Request timeout in milliseconds (default: `15000`) |
234
+ | `--no-publish` | Skip publishing; print a local badge markdown only |
235
+ | `--output <file>` | Also write a local SVG badge to the given path |
236
+
237
+ Reports are kept for 90 days from last submission; resubmitting the same URL overwrites the previous report. Auth headers are stripped client-side before upload. Private/loopback URLs (`localhost`, `127.0.0.1`, `192.168.*`, etc.) trigger an interactive confirmation before publishing, and are rejected by the server in any case.
238
+
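For a sense of what a private/loopback check involves, here is a small sketch. It is not the client's actual logic: `isPrivateOrLoopbackHost` is a hypothetical helper, and the 10.0.0.0/8 and 172.16.0.0/12 ranges are an assumed reading of the "etc." above (the RFC 1918 private ranges). IPv6 is covered only for the `::1` loopback address.

```typescript
// Hypothetical sketch; the real check may cover more cases (IPv6 ranges, DNS results).
function isPrivateOrLoopbackHost(hostname: string): boolean {
  const host = hostname.toLowerCase();
  if (host === "localhost" || host === "::1" || host === "[::1]") return true;

  // Only dotted-quad IPv4 literals are inspected beyond this point.
  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  if (!m) return false;
  const a = Number(m[1]);
  const b = Number(m[2]);

  return (
    a === 127 || // loopback, 127.0.0.0/8
    a === 10 || // private, 10.0.0.0/8
    (a === 172 && b >= 16 && b <= 31) || // private, 172.16.0.0/12
    (a === 192 && b === 168) // private, 192.168.0.0/16
  );
}

console.log(isPrivateOrLoopbackHost("192.168.1.20"));
console.log(isPrivateOrLoopbackHost("my-server.com"));
```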
239
+ A delete token is returned at publish time and stored at `~/.mcp-compliance/tokens.json` (mode `0600`). Use it to take a report down:
240
+
241
+ ```bash
242
+ mcp-compliance unpublish https://my-server.com/mcp
243
+ ```
244
+
245
+ ### Local SVG badge (any transport)
246
+
247
+ Stdio servers can't be published (no public URL to key on), but you can commit a local SVG reflecting the real grade:
248
+
249
+ ```bash
250
+ mcp-compliance test node ./dist/server.js --output badge.svg
251
+ mcp-compliance badge npx -y @modelcontextprotocol/server-filesystem /tmp --output badge.svg
252
+ ```
253
+
254
+ Then embed it in your README:
255
+
256
+ ```markdown
257
+ ![MCP Compliance](./badge.svg)
258
+ ```
259
+
260
+ The `test` command never publishes — use it for CI, debugging, and local iteration. `badge` is the only command that publishes to mcp.hosting.
261
+
262
+ ## What the 88 tests check
263
+
264
+ <details>
265
+ <summary><strong>Transport (16 tests)</strong></summary>
266
+
267
+ HTTP-only (13):
268
+ - **transport-post** — Server accepts HTTP POST requests (required)
269
+ - **transport-content-type** — Responds with application/json or text/event-stream (required)
270
+ - **transport-notification-202** — Notifications return exactly 202 Accepted
271
+ - **transport-content-type-reject** — Rejects non-JSON request Content-Type
272
+ - **transport-session-id** — Enforces MCP-Session-Id after initialization
273
+ - **transport-session-invalid** — Returns 404 for unknown session ID
274
+ - **transport-get** — GET returns SSE stream or 405
275
+ - **transport-delete** — DELETE accepted or returns 405
276
+ - **transport-batch-reject** — Rejects JSON-RPC batch requests (required)
277
+ - **transport-content-type-init** — Initialize response has valid content type
278
+ - **transport-get-stream** — GET with session returns SSE or 405
279
+ - **transport-concurrent** — Handles concurrent requests
280
+ - **transport-sse-event-field** — SSE responses include required event: message field
281
+
282
+ stdio-only (3):
283
+ - **stdio-framing** — Newline-delimited JSON framing (required)
284
+ - **stdio-unicode** — UTF-8 unicode roundtrip preserves non-ASCII payloads
285
+ - **stdio-unknown-method-recovers** — Returns -32601 for unknown methods and keeps serving
286
+
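To make the `stdio-framing` rule concrete: over stdio, each JSON-RPC message occupies one line and is terminated by a newline. The sketch below shows one way a reader might reassemble messages from arbitrary chunks. `LineFramer` is a hypothetical class written for this example, not something exported by the package.

```typescript
// Hypothetical sketch of newline-delimited JSON framing on the reading side.
class LineFramer {
  private buffer = "";

  // Accepts a decoded chunk and returns every complete message it finishes.
  push(chunk: string): unknown[] {
    this.buffer += chunk;
    const messages: unknown[] = [];
    let newline = this.buffer.indexOf("\n");
    while (newline !== -1) {
      // trim() also drops a trailing "\r" from CRLF line endings.
      const line = this.buffer.slice(0, newline).trim();
      this.buffer = this.buffer.slice(newline + 1);
      if (line.length > 0) messages.push(JSON.parse(line));
      newline = this.buffer.indexOf("\n");
    }
    return messages;
  }
}

const framer = new LineFramer();
// The first chunk ends mid-message, so nothing is emitted yet.
console.log(framer.push('{"jsonrpc":"2.0","id":1,').length);
// The second chunk completes that message and carries a whole notification.
console.log(framer.push('"result":{}}\n{"jsonrpc":"2.0","method":"notifications/initialized"}\n').length);
```

A line that is not valid JSON makes `JSON.parse` throw in this sketch; a real client would more likely report it as a framing failure.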
287
+ </details>
288
+
289
+ <details>
290
+ <summary><strong>Lifecycle (21 tests)</strong></summary>
291
+
292
+ - **lifecycle-init** — Initialize handshake succeeds (required)
293
+ - **lifecycle-proto-version** — Returns valid YYYY-MM-DD protocol version (required)
294
+ - **lifecycle-server-info** — Includes serverInfo with name
295
+ - **lifecycle-capabilities** — Returns capabilities object (required)
296
+ - **lifecycle-jsonrpc** — Response is valid JSON-RPC 2.0 (required)
297
+ - **lifecycle-ping** — Responds to ping method (required)
298
+ - **lifecycle-instructions** — Instructions field is valid string if present
299
+ - **lifecycle-id-match** — Response ID matches request ID (required)
300
+ - **lifecycle-string-id** — Supports string request IDs (JSON-RPC 2.0)
301
+ - **lifecycle-version-negotiate** — Handles unknown protocol version gracefully
302
+ - **lifecycle-reinit-reject** — Rejects second initialize request
303
+ - **lifecycle-logging** — logging/setLevel accepted (required if logging capability declared)
304
+ - **lifecycle-completions** — completion/complete accepted (required if completions capability declared)
305
+ - **lifecycle-cancellation** — Handles cancellation notifications
306
+ - **lifecycle-progress** — Handles progress notifications gracefully
307
+ - **lifecycle-list-changed** — Accepts listChanged notifications for declared capabilities
308
+ - **lifecycle-progress-token** — Supports progress tokens in requests via SSE
309
+ - **lifecycle-sampling-capability** — Advisory check for server-side use of the client sampling capability
310
+ - **lifecycle-roots-capability** — Advisory check for server-side use of the client roots capability
311
+ - **lifecycle-elicitation-capability** — Advisory check for the 2025-11-25 client elicitation capability
312
+ - **lifecycle-meta-tolerance** — Server ignores unknown `_meta` fields on incoming requests
313
+
314
+ </details>
315
+
316
+ <details>
317
+ <summary><strong>Tools (4 tests)</strong></summary>
318
+
319
+ - **tools-list** — tools/list returns valid array (required if tools capability declared)
320
+ - **tools-call** — tools/call responds with correct format
321
+ - **tools-pagination** — tools/list supports cursor-based pagination
322
+ - **tools-content-types** — Tool content items have valid types
323
+
324
+ </details>
325
+
326
+ <details>
327
+ <summary><strong>Resources (5 tests)</strong></summary>
328
+
329
+ - **resources-list** — resources/list returns valid array (required if resources capability declared)
330
+ - **resources-read** — resources/read returns content items
331
+ - **resources-templates** — resources/templates/list works or returns method-not-found
332
+ - **resources-pagination** — resources/list supports cursor-based pagination
333
+ - **resources-subscribe** — Resource subscribe/unsubscribe (required if subscribe capability declared)
334
+
335
+ </details>
336
+
337
+ <details>
338
+ <summary><strong>Prompts (3 tests)</strong></summary>
339
+
340
+ - **prompts-list** — prompts/list returns valid array (required if prompts capability declared)
341
+ - **prompts-get** — prompts/get returns valid messages
342
+ - **prompts-pagination** — prompts/list supports cursor-based pagination
343
+
344
+ </details>
345
+
346
+ <details>
347
+ <summary><strong>Error Handling (10 tests)</strong></summary>
348
+
349
+ - **error-unknown-method** — Returns JSON-RPC error for unknown method (required)
350
+ - **error-method-code** — Uses correct -32601 error code
351
+ - **error-invalid-jsonrpc** — Handles malformed JSON-RPC (required)
352
+ - **error-invalid-json** — Handles invalid JSON body
353
+ - **error-missing-params** — Returns error for tools/call without name
354
+ - **error-parse-code** — Returns -32700 for invalid JSON
355
+ - **error-invalid-request-code** — Returns -32600 for invalid request
356
+ - **tools-call-unknown** — Returns error for nonexistent tool name
357
+ - **error-capability-gated** — Rejects methods for undeclared capabilities
358
+ - **error-invalid-cursor** — Handles invalid pagination cursor gracefully
359
+
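The numeric codes in this list are the standard JSON-RPC 2.0 error codes. As a sketch of the response shape these tests look for, the helper below builds an error response. `JsonRpcErrorCode` and `errorResponse` are hypothetical names for this example.

```typescript
// Standard JSON-RPC 2.0 error codes; the constant and helper names are illustrative.
const JsonRpcErrorCode = {
  ParseError: -32700, // body is not valid JSON
  InvalidRequest: -32600, // JSON is valid but not a valid request object
  MethodNotFound: -32601, // unknown method
  InvalidParams: -32602,
  InternalError: -32603,
} as const;

type RequestId = string | number | null;

function errorResponse(id: RequestId, code: number, message: string) {
  return { jsonrpc: "2.0" as const, id, error: { code, message } };
}

// Unknown method: echo the request ID back with -32601.
console.log(JSON.stringify(errorResponse(7, JsonRpcErrorCode.MethodNotFound, "Method not found")));
// Unparseable body: the ID cannot be read, so JSON-RPC uses null.
console.log(JSON.stringify(errorResponse(null, JsonRpcErrorCode.ParseError, "Parse error")));
```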
360
+ </details>
361
+
362
+ <details>
363
+ <summary><strong>Schema Validation (6 tests)</strong></summary>
364
+
365
+ - **tools-schema** — All tools have valid name and inputSchema (required if tools capability declared)
366
+ - **tools-annotations** — Tool annotations are valid if present
367
+ - **tools-title-field** — Tools include title field (2025-11-25)
368
+ - **tools-output-schema** — Tools with outputSchema are valid (2025-11-25)
369
+ - **prompts-schema** — Prompts have valid name field (required if prompts capability declared)
370
+ - **resources-schema** — Resources have valid uri and name (required if resources capability declared)
371
+
372
+ </details>
373
+
374
+ <details>
375
+ <summary><strong>Security (23 tests)</strong></summary>
376
+
377
+ - **security-auth-required** — Rejects unauthenticated requests
378
+ - **security-www-authenticate** — 401 responses include WWW-Authenticate header
379
+ - **security-auth-malformed** — Rejects malformed auth credentials
380
+ - **security-tls-required** — Enforces HTTPS/TLS
381
+ - **security-session-entropy** — Session IDs are high-entropy
382
+ - **security-session-not-auth** — Session ID does not bypass auth
383
+ - **security-oauth-metadata** — Protected Resource Metadata endpoint exists (RFC 9728)
384
+ - **security-token-in-uri** — Rejects auth tokens in query string
385
+ - **security-cors-headers** — CORS headers are restrictive
386
+ - **security-origin-validation** — Validates Origin header for DNS rebinding protection
387
+ - **security-command-injection** — Resists command injection in tool params
388
+ - **security-sql-injection** — Resists SQL injection in tool params
389
+ - **security-path-traversal** — Resists path traversal in tool params
390
+ - **security-ssrf-internal** — Resists SSRF to internal networks
391
+ - **security-oversized-input** — Handles oversized inputs gracefully
392
+ - **security-extra-params** — Rejects or ignores extra tool params
393
+ - **security-tool-schema-defined** — All tools define inputSchema
394
+ - **security-tool-rug-pull** — Tool definitions are stable across calls
395
+ - **security-tool-description-poisoning** — Tool descriptions free of injection patterns
396
+ - **security-tool-cross-reference** — Tools do not reference other tools by name
397
+ - **security-error-no-stacktrace** — Error responses do not leak stack traces
398
+ - **security-error-no-internal-ip** — Error responses do not leak internal IPs
399
+ - **security-rate-limiting** — Rate limiting is enforced
400
+
401
+ </details>
402
+
403
+ ## Grading
404
+
405
+ | Grade | Score |
406
+ |-------|--------|
407
+ | A | 90-100 |
408
+ | B | 75-89 |
409
+ | C | 60-74 |
410
+ | D | 40-59 |
411
+ | F | 0-39 |
412
+
413
+ Required tests are worth 70% of the score, optional tests 30%. See the [full scoring algorithm](./COMPLIANCE_RUBRIC.md#2-scoring-algorithm) in the methodology doc.
414
+
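The grade bands and the 70/30 split can be sketched as below. This is a simplified illustration, not the published algorithm: `weightedScore` assumes each bucket contributes its plain pass rate and that an empty bucket counts as fully passed, and `toGrade` treats each band's lower bound as a threshold so fractional scores such as 89.5 still map to a grade. All three function names are hypothetical; the methodology doc is the authority on the real computation.

```typescript
type Grade = "A" | "B" | "C" | "D" | "F";

// Assumed weighting: 70 points for the required pass rate, 30 for the optional one.
function weightedScore(
  requiredPassed: number,
  requiredTotal: number,
  optionalPassed: number,
  optionalTotal: number
): number {
  const rate = (passed: number, total: number): number =>
    total === 0 ? 1 : passed / total;
  return 70 * rate(requiredPassed, requiredTotal) + 30 * rate(optionalPassed, optionalTotal);
}

// Lower bounds taken from the grade table above.
function toGrade(score: number): Grade {
  if (score >= 90) return "A";
  if (score >= 75) return "B";
  if (score >= 60) return "C";
  if (score >= 40) return "D";
  return "F";
}

// Mirrors the idea behind --min-grade: true when `grade` is at least `minimum`.
function meetsMinGrade(grade: Grade, minimum: Grade): boolean {
  const order = "ABCDF";
  return order.indexOf(grade) <= order.indexOf(minimum);
}

// All 20 required tests pass, half of 30 optional tests pass.
const exampleScore = weightedScore(20, 20, 15, 30);
console.log(exampleScore, toGrade(exampleScore), meetsMinGrade(toGrade(exampleScore), "B"));
```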
415
+ ## CI integration
416
+
417
+ ```yaml
418
+ # GitHub Actions example
419
+ - name: MCP Compliance Check
420
+   run: npx @yawlabs/mcp-compliance test ${{ env.MCP_SERVER_URL }} --strict
421
+ ```
422
+
423
+ ```yaml
424
+ # With JSON output for parsing
425
+ - name: MCP Compliance Check
426
+   run: |
427
+     npx @yawlabs/mcp-compliance test ${{ env.MCP_SERVER_URL }} --format json > compliance.json
428
+     cat compliance.json | jq '.grade'
429
+ ```
430
+
431
+ ```yaml
432
+ # With retries for flaky network conditions
433
+ - name: MCP Compliance Check
434
+   run: npx @yawlabs/mcp-compliance test ${{ env.MCP_SERVER_URL }} --strict --retries 2 --timeout 30000
435
+ ```
436
+
437
+ ```yaml
438
+ # SARIF output for GitHub Code Scanning
439
+ - name: MCP Compliance Check
440
+   run: npx @yawlabs/mcp-compliance test ${{ env.MCP_SERVER_URL }} --format sarif > compliance.sarif
441
+ - name: Upload SARIF
442
+   uses: github/codeql-action/upload-sarif@v3
443
+   with:
444
+     sarif_file: compliance.sarif
445
+ ```
446
+
447
+ ## MCP server (for Claude Code, Cursor, etc.)
448
+
449
+ This package also exposes an MCP server with 3 tools that can be used from Claude Code, Cursor, or any MCP client.
450
+
451
+ ### Setup
452
+
453
+ **Claude Code (one-liner):**
454
+
455
+ ```bash
456
+ claude mcp add mcp-compliance -- npx -y @yawlabs/mcp-compliance mcp
457
+ ```
458
+
459
+ **Or create `.mcp.json` in your project root:**
460
+
461
+ macOS / Linux / WSL:
462
+
463
+ ```json
464
+ {
465
+ "mcpServers": {
466
+ "mcp-compliance": {
467
+ "command": "npx",
468
+ "args": ["-y", "@yawlabs/mcp-compliance", "mcp"]
469
+ }
470
+ }
471
+ }
472
+ ```
473
+
474
+ Windows:
475
+
476
+ ```json
477
+ {
478
+ "mcpServers": {
479
+ "mcp-compliance": {
480
+ "command": "cmd",
481
+ "args": ["/c", "npx", "-y", "@yawlabs/mcp-compliance", "mcp"]
482
+ }
483
+ }
484
+ }
485
+ ```
486
+
487
+ > **Tip:** This file is safe to commit — it contains no secrets.
488
+
489
+ Restart your MCP client and approve the server when prompted.
490
+
491
+ ### Tools
492
+
493
+ - **mcp_compliance_test** — Run the full 88-test suite against a URL or stdio command. Supports auth, custom headers, env vars, timeout, retries, and category/test filtering. Returns grade, score, and detailed results.
494
+ - **mcp_compliance_badge** — Get the badge markdown/HTML for a server. Supports auth and custom headers.
495
+ - **mcp_compliance_explain** — Explain what a specific test ID checks and why it matters.
496
+
497
+ All tools have [MCP tool annotations](https://modelcontextprotocol.io/specification/2025-11-25/server/tools#annotations) (`readOnlyHint`, `destructiveHint`, `idempotentHint`, `openWorldHint`) so MCP clients can skip confirmation dialogs for safe operations.
498
+
499
+ ## Programmatic usage
500
+
501
+ ```typescript
502
+ import { runComplianceSuite } from '@yawlabs/mcp-compliance';
503
+
504
+ const report = await runComplianceSuite('https://my-server.com/mcp');
505
+ console.log(`Grade: ${report.grade} (${report.score}%)`);
506
+
507
+ // With options
508
+ const report2 = await runComplianceSuite('https://my-server.com/mcp', {
509
+ headers: { 'Authorization': 'Bearer tok123' },
510
+ timeout: 30000,
511
+ retries: 1,
512
+ only: ['transport', 'lifecycle'],
513
+ });
514
+
515
+ // Live progress for streaming UIs (e.g. server-sent-events to a browser)
516
+ await runComplianceSuite('https://my-server.com/mcp', {
517
+ onTestComplete: (result) => {
518
+ // result has the full TestResult: id, name, category, required,
519
+ // passed, details, durationMs, specRef. Push it to your client.
520
+ sendToClient(result);
521
+ },
522
+ });
523
+ ```
524
+
525
+ ## Report schema
526
+
527
+ The JSON output of the test suite is a stable, versioned contract. Every report includes a `schemaVersion` field at the top level. The full JSON Schema lives at [`schemas/report.v1.json`](./schemas/report.v1.json) and is shipped with the npm package.
528
+
529
+ ```jsonc
530
+ {
531
+ "schemaVersion": "1", // bumped on breaking changes to the report shape
532
+ "specVersion": "2025-11-25", // MCP spec version tested against
533
+ "toolVersion": "0.10.0", // mcp-compliance version that produced the report
534
+ "url": "...",
535
+ "timestamp": "...",
536
+ "grade": "A",
537
+ "score": 92.5,
538
+ "tests": [ ... ],
539
+ // ...
540
+ }
541
+ ```
542
+
543
+ Consumer guidance:
544
+
545
+ - Pin against `schemaVersion`. Reject reports with an unknown version rather than guessing at the shape.
546
+ - The schema validates with any Draft 2020-12 validator (e.g. `ajv`).
547
+ - Within a major version, additions are non-breaking. Renames, removals, or type changes bump the version.
548
+ - Two runs against the same server produce equivalent grade, score, and per-test pass/fail (modulo timings/timestamps).
549
+
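The first guidance point, pinning against `schemaVersion`, might look like the sketch below in a consumer. It checks only the fields shown in the example report above and does not replace validation against the shipped JSON Schema. `ReportSummary` and `readReport` are hypothetical names.

```typescript
// Hypothetical consumer-side guard; only fields shown in the example report are read.
interface ReportSummary {
  grade: string;
  score: number;
}

function readReport(json: string): ReportSummary {
  const data: unknown = JSON.parse(json);
  if (typeof data !== "object" || data === null) {
    throw new Error("Report is not a JSON object");
  }
  const report = data as { schemaVersion?: unknown; grade?: unknown; score?: unknown };
  // Reject unknown versions instead of guessing at the shape.
  if (report.schemaVersion !== "1") {
    throw new Error(`Unsupported schemaVersion: ${String(report.schemaVersion)}`);
  }
  if (typeof report.grade !== "string" || typeof report.score !== "number") {
    throw new Error("Report is missing grade or score");
  }
  return { grade: report.grade, score: report.score };
}

const summary = readReport('{"schemaVersion":"1","grade":"A","score":92.5,"tests":[]}');
console.log(`${summary.grade} (${summary.score})`);
```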
550
+ ## Methodology & docs
551
+
552
+ The testing methodology is published openly so the grading is auditable:
553
+
554
+ - **[Testing methodology](./COMPLIANCE_RUBRIC.md)** — test execution model, scoring algorithm, all 88 test rules with pass/fail criteria (CC BY 4.0)
555
+ - **[Machine-readable rule catalog](./mcp-compliance-rules.json)** — JSON Schema-compliant catalog for programmatic consumption
556
+ - **[Why `mcp-compliance`](./docs/WHY.md)** — the problem, existing alternatives, what this tool does differently
557
+ - **[Fixing common failures](./docs/FIXES.md)** — recipes for the most frequent test failures with code snippets
558
+ - **[Spec version migration policy](./docs/SPEC_VERSION_MIGRATION.md)** — how this tool evolves with MCP spec releases
559
+ - **[mcp.hosting external API](./docs/EXT_API.md)** — public submit/retrieve/badge/delete endpoints used by `mcp-compliance badge` and any custom integrations
560
+ - **[Enterprise tier (draft)](./docs/ENTERPRISE.md)** — paid tier structure for organizations with scheduled/private/audit-track compliance needs
561
+ - **[Performance deep-dive](./docs/PERFORMANCE.md)** — why the suite is sequential and what parallel execution would cost
562
+ - **[Spec PR drafts](./docs/spec-prs/)** — our proposed MCP spec clarifications for ambiguous cases we've hit
563
+ - **[mcp.hosting integration spec](./docs/mcp-hosting-integration.md)** — the contract between this engine and the mcp.hosting platform: URL surfaces, data flow, storage model, badge API, leaderboard, router integration
564
+
565
+ The methodology is not an authoritative conformance standard — it's one tool's choices, published so they can be inspected, adopted, or forked. The [official MCP specification](https://modelcontextprotocol.io/specification/2025-11-25) defines what servers must do; this document describes how `@yawlabs/mcp-compliance` verifies it.
566
+
567
+ ## Requirements
568
+
569
+ - Node.js 18+
570
+
571
+ ## Contributing
572
+
573
+ ```bash
574
+ git clone https://github.com/YawLabs/mcp-compliance.git
575
+ cd mcp-compliance
576
+ npm install
577
+ npm run build
578
+ npm test
579
+ ```
580
+
581
+ **Development commands:**
582
+
583
+ | Command | Description |
584
+ |---------|-------------|
585
+ | `npm run build` | Compile with tsup |
586
+ | `npm run dev` | Watch mode |
587
+ | `npm test` | Run tests (vitest) |
588
+ | `npm run lint` | Check with Biome |
589
+ | `npm run lint:fix` | Auto-fix with Biome |
590
+ | `npm run typecheck` | TypeScript type checking |
591
+ | `npm run test:ci` | Build + test (CI-safe) |
592
+
593
+ ## Links
594
+
595
+ - [mcp.hosting](https://mcp.hosting) — Hosted MCP server infrastructure
596
+ - [MCP Specification](https://modelcontextprotocol.io/specification/2025-11-25)
597
+ - [Testing methodology](./COMPLIANCE_RUBRIC.md)
598
+ - [Yaw Labs](https://yaw.sh)
599
+
600
+ ## License
601
+
602
+ MIT — see [LICENSE](./LICENSE).