gitlab-mcp 1.2.1 → 1.3.0
- package/README.md +30 -27
- package/dist/config/dotenv.js +6 -2
- package/dist/config/dotenv.js.map +1 -1
- package/dist/config/env.d.ts +3 -1
- package/dist/config/env.js +15 -1
- package/dist/config/env.js.map +1 -1
- package/dist/http-app.js +10 -3
- package/dist/http-app.js.map +1 -1
- package/dist/http.js +5 -4
- package/dist/http.js.map +1 -1
- package/dist/index.js +5 -4
- package/dist/index.js.map +1 -1
- package/dist/lib/auth-context.d.ts +2 -1
- package/dist/lib/auth-context.js.map +1 -1
- package/dist/lib/gitlab-client.d.ts +42 -8
- package/dist/lib/gitlab-client.js +380 -42
- package/dist/lib/gitlab-client.js.map +1 -1
- package/dist/lib/network.js +12 -6
- package/dist/lib/network.js.map +1 -1
- package/dist/lib/oauth-scopes.d.ts +2 -0
- package/dist/lib/oauth-scopes.js +16 -0
- package/dist/lib/oauth-scopes.js.map +1 -0
- package/dist/lib/regex.d.ts +5 -0
- package/dist/lib/regex.js +111 -0
- package/dist/lib/regex.js.map +1 -0
- package/dist/lib/request-runtime.js +24 -11
- package/dist/lib/request-runtime.js.map +1 -1
- package/dist/tools/gitlab.js +193 -3
- package/dist/tools/gitlab.js.map +1 -1
- package/dist/types/auth.d.ts +1 -0
- package/dist/types/auth.js +2 -0
- package/dist/types/auth.js.map +1 -0
- package/dist/types/context.d.ts +1 -0
- package/docs/architecture.md +1 -1
- package/docs/authentication.md +4 -1
- package/docs/configuration.md +23 -21
- package/docs/deployment.md +2 -0
- package/docs/mcp-integration-testing-best-practices.md +381 -730
- package/docs/tools.md +24 -14
- package/package.json +1 -1
|
@@ -1,124 +1,138 @@
|
|
|
1
|
-
# MCP Integration Testing Best Practices
|
|
1
|
+
# MCP Server Integration Testing Best Practices (JavaScript / TypeScript)
|
|
2
2
|
|
|
3
|
-
> This
|
|
3
|
+
> This guide is for teams building MCP servers in the Node.js / TypeScript ecosystem. It focuses on a **deterministic, automatable, layered** integration testing strategy.
|
|
4
4
|
>
|
|
5
|
-
>
|
|
5
|
+
> Baseline: MCP specification **2025-11-25**. The guide explicitly distinguishes **Streamable HTTP (current)** from **HTTP+SSE (legacy compatibility)**.
|
|
6
|
+
>
|
|
7
|
+
> Version note: as of early 2026, the TypeScript SDK is still in a v1 → v2 transition period. Many production systems still run v1, while v2 is introducing package splits and API changes.
|
|
8
|
+
>
|
|
9
|
+
> Scope note: this document summarizes methodology and implementation patterns. Snippets, env vars, file paths, and commands are examples and should be adapted to your repository.
|
|
10
|
+
|
|
11
|
+
---
|
|
6
12
|
|
|
7
13
|
## Table of Contents
|
|
8
14
|
|
|
9
|
-
- [Why MCP Servers Need Integration
|
|
10
|
-
- [
|
|
11
|
-
- [Layer
|
|
12
|
-
|
|
13
|
-
- [
|
|
14
|
-
- [Pattern
|
|
15
|
+
- [Why MCP Servers Need Integration Tests](#why-mcp-servers-need-integration-tests)
|
|
16
|
+
- [Testing Pyramid for MCP](#testing-pyramid-for-mcp)
|
|
17
|
+
- [Layer 0: Design for Testability (Server Factory + Dependency Injection)](#layer-0-design-for-testability-server-factory--dependency-injection)
|
|
18
|
+
- [Layer 1: InMemoryTransport Protocol-Level Integration Tests (P0 Core)](#layer-1-inmemorytransport-protocol-level-integration-tests-p0-core)
|
|
19
|
+
- [Core Scaffold: buildContext + createLinkedPair](#core-scaffold-buildcontext--createlinkedpair)
|
|
20
|
+
- [Pattern 1: Capabilities and List Contracts (tools/resources/prompts/list)](#pattern-1-capabilities-and-list-contracts-toolsresourcespromptslist)
|
|
21
|
+
- [Pattern 2: Tool Handler End-to-End Validation (stub external dependencies)](#pattern-2-tool-handler-end-to-end-validation-stub-external-dependencies)
|
|
15
22
|
- [Pattern 3: Schema Validation and Boundary Inputs](#pattern-3-schema-validation-and-boundary-inputs)
|
|
16
|
-
- [
|
|
17
|
-
|
|
18
|
-
- [
|
|
19
|
-
- [
|
|
20
|
-
- [
|
|
21
|
-
- [
|
|
22
|
-
|
|
23
|
-
|
|
24
|
-
- [
|
|
25
|
-
- [
|
|
26
|
-
- [
|
|
27
|
-
- [
|
|
28
|
-
- [
|
|
29
|
-
- [
|
|
30
|
-
- [
|
|
23
|
+
- [Pattern 4: Minimal Coverage for Bidirectional Requests (Sampling/Elicitation)](#pattern-4-minimal-coverage-for-bidirectional-requests-samplingelicitation)
|
|
24
|
+
- [Layer 2: Streamable HTTP Transport Integration Tests (P0/P1)](#layer-2-streamable-http-transport-integration-tests-p0p1)
|
|
25
|
+
- [Spec Checklist You Must Align With (2025-11-25)](#spec-checklist-you-must-align-with-2025-11-25)
|
|
26
|
+
- [HTTP Test Harness: Port 0 + Isolated Server Instances](#http-test-harness-port-0--isolated-server-instances)
|
|
27
|
+
- [Must-Test Cases: Session, 404 Reinitialize, DELETE, Protocol Version Header](#must-test-cases-session-404-reinitialize-delete-protocol-version-header)
|
|
28
|
+
- [SSE Stream Tests (GET/POST SSE on Streamable HTTP)](#sse-stream-tests-getpost-sse-on-streamable-http)
|
|
29
|
+
- [Layer 2.5: Legacy HTTP+SSE Compatibility Tests (Only if Needed)](#layer-25-legacy-httpsse-compatibility-tests-only-if-needed)
|
|
30
|
+
- [Layer 3: Security / Auth / Policy Tests (P0/P1)](#layer-3-security--auth--policy-tests-p0p1)
|
|
31
|
+
- [Origin/Host Protection (DNS Rebinding)](#originhost-protection-dns-rebinding)
|
|
32
|
+
- [OAuth/Authorization (Resource Metadata Discovery)](#oauthauthorization-resource-metadata-discovery)
|
|
33
|
+
- [Error Handling and Secret Redaction](#error-handling-and-secret-redaction)
|
|
34
|
+
- [Policy Combinatorics: Read-Only, Allowlists, Feature Flags](#policy-combinatorics-read-only-allowlists-feature-flags)
|
|
35
|
+
- [Layer 4: Conformance Testing (Strongly Recommended)](#layer-4-conformance-testing-strongly-recommended)
|
|
36
|
+
- [Layer 5: Agent Loop Integration Tests (ScriptedLLM + Small Real-LLM Smoke)](#layer-5-agent-loop-integration-tests-scriptedllm--small-real-llm-smoke)
|
|
37
|
+
- [Layer 6: Inspector CLI Black-Box Contract Tests (Pre/Post Deployment)](#layer-6-inspector-cli-black-box-contract-tests-prepost-deployment)
|
|
38
|
+
- [CI/CD Layered Execution Strategy](#cicd-layered-execution-strategy)
|
|
39
|
+
- [Common Pitfalls and Fixes (Updated for 2025-11-25)](#common-pitfalls-and-fixes-updated-for-2025-11-25)
|
|
40
|
+
- [Recommended Test Matrix (Copy/Paste)](#recommended-test-matrix-copypaste)
|
|
41
|
+
- [References and Compatibility Notes](#references-and-compatibility-notes)
|
|
31
42
|
|
|
32
43
|
---
|
|
33
44
|
|
|
34
|
-
## Why MCP Servers Need Integration
|
|
35
|
-
|
|
36
|
-
An MCP Server is not an ordinary HTTP API. It has several characteristics that make testing more complex:
|
|
45
|
+
## Why MCP Servers Need Integration Tests
|
|
37
46
|
|
|
38
|
-
|
|
39
|
-
2. **Multiple transport protocols**: The same server may simultaneously support Streamable HTTP, SSE, and stdio
|
|
40
|
-
3. **Bidirectional communication**: In SSE mode, the server can proactively push events to the client
|
|
41
|
-
4. **Policy layer**: Read-only mode, tool allowlists, and feature flags can alter the available tool set
|
|
42
|
-
5. **Authentication context**: In remote deployments, tokens are passed via HTTP headers and must propagate through AsyncLocalStorage
|
|
47
|
+
An MCP server is not a standard HTTP API. Complexity comes from the combination of protocol mechanics, session behavior, bidirectional messaging, and security boundaries.
|
|
43
48
|
|
|
44
|
-
|
|
49
|
+
1. **Stateful sessions over HTTP**: initialization can return `MCP-Session-Id`; subsequent requests must carry it. When a session expires, the server should return `404`, and the client should re-initialize.
|
|
50
|
+
2. **Multiple transports**: the spec defines stdio and Streamable HTTP. Streamable HTTP uses a single endpoint that can support both `POST` and `GET` (optional SSE stream).
|
|
51
|
+
3. **Bidirectional messaging**: servers can send notifications/requests over SSE. Disconnection is not cancellation; cancellation requires an explicit cancel notification.
|
|
52
|
+
4. **Explicit security requirements**: Streamable HTTP should validate `Origin` to mitigate DNS rebinding. Local deployments should typically bind to `127.0.0.1`, and auth should be implemented where required.
|
|
53
|
+
5. **Authorization is more than Bearer header plumbing**: the spec defines OAuth-based discovery and flow, including Protected Resource Metadata discovery.
|
|
45
54
|
|
|
46
|
-
|
|
55
|
+
Because of this, unit tests alone cannot cover full interaction behavior. Integration tests should use **real MCP clients + real MCP servers + real/simulated transports** so results are reproducible, assertable, and CI-friendly.
|
|
47
56
|
|
|
48
57
|
---
|
|
49
58
|
|
|
50
|
-
##
|
|
51
|
-
|
|
52
|
-
```
|
|
53
|
-
|
|
54
|
-
│ LLM
|
|
55
|
-
|
|
56
|
-
|
|
57
|
-
|
|
58
|
-
|
|
59
|
-
|
|
60
|
-
|
|
61
|
-
|
|
62
|
-
/
|
|
63
|
-
|
|
64
|
-
|
|
65
|
-
|
|
66
|
-
|
|
67
|
-
|
|
68
|
-
|
|
69
|
-
|
|
59
|
+
## Testing Pyramid for MCP
|
|
60
|
+
|
|
61
|
+
```text
|
|
62
|
+
┌───────────────────┐
|
|
63
|
+
│ Real LLM Smoke │ ← small, nightly
|
|
64
|
+
└─────────┬─────────┘
|
|
65
|
+
│
|
|
66
|
+
┌─────────────┴─────────────┐
|
|
67
|
+
│ Agent Loop (ScriptedLLM) │ ← nightly / small PR subset
|
|
68
|
+
└─────────────┬─────────────┘
|
|
69
|
+
│
|
|
70
|
+
┌────────────────┴────────────────┐
|
|
71
|
+
│ Conformance (Spec Compliance) │ ← nightly / pre-release
|
|
72
|
+
└────────────────┬────────────────┘
|
|
73
|
+
│
|
|
74
|
+
┌───────────────────────┴───────────────────────┐
|
|
75
|
+
│ Streamable HTTP + Session + SSE integration │ ← every PR (P0/P1)
|
|
76
|
+
└───────────────────────┬───────────────────────┘
|
|
77
|
+
│
|
|
78
|
+
┌─────────────────────────┴─────────────────────────┐
|
|
79
|
+
│ InMemoryTransport protocol-level integration │ ← every commit (P0)
|
|
80
|
+
└───────────────────────────────────────────────────┘
|
|
70
81
|
```
|
|
71
82
|
|
|
72
|
-
|
|
83
|
+
Principle: lower layers should be broader, faster, and more deterministic. Higher layers should be smaller and smoke-oriented.
|
|
73
84
|
|
|
74
85
|
---
|
|
75
86
|
|
|
76
|
-
## Layer
|
|
87
|
+
## Layer 0: Design for Testability (Server Factory + Dependency Injection)
|
|
77
88
|
|
|
78
|
-
|
|
89
|
+
Whether integration testing is practical is mostly determined by server architecture.
|
|
79
90
|
|
|
80
|
-
|
|
91
|
+
### Core requirements
|
|
81
92
|
|
|
82
|
-
-
|
|
83
|
-
-
|
|
84
|
-
-
|
|
93
|
+
- **Server constructor should be a pure factory**: `createMcpServer(context)` depends only on `context`. Avoid direct top-level reads of `process.env`, DB connections, or network calls.
|
|
94
|
+
- **Context should be complete**: include env, logger, external API clients, policy engine, formatters, and optionally clock/random providers.
|
|
95
|
+
- **Defaults must be usable**: `buildContext()` should provide complete defaults, and tests should override only deltas.
|
|
96
|
+
- **Time/random should be controllable**: session IDs, expiry handling, and retries become stable when time/random sources are injectable.
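
The last requirement can be made concrete with a small sketch. Everything here is hypothetical and not taken from the package: `createSessionStore`, `createFakeClock`, and `createCounterIds` are illustrative names, and lazy expiry on lookup is just one possible policy. The point is only that session expiry becomes deterministic once the clock and ID source are injected.

```ts
// Hypothetical helpers for illustration; adapt names and shapes to your own context object.
type Clock = () => number; // returns epoch milliseconds
type IdSource = () => string; // returns a fresh session ID

interface SessionStore {
  create(): string;
  isAlive(id: string): boolean;
}

function createSessionStore(deps: { now: Clock; newId: IdSource; ttlMs: number }): SessionStore {
  const expiresAt = new Map<string, number>();
  return {
    create() {
      const id = deps.newId();
      expiresAt.set(id, deps.now() + deps.ttlMs);
      return id;
    },
    isAlive(id) {
      const deadline = expiresAt.get(id);
      if (deadline === undefined) return false;
      if (deps.now() >= deadline) {
        expiresAt.delete(id); // lazy expiry: drop the session when it is looked up too late
        return false;
      }
      return true;
    }
  };
}

// Test-side fakes: a manually advanced clock and a counter-based ID source.
function createFakeClock(start = 0) {
  let current = start;
  return {
    now: () => current,
    advance: (ms: number) => {
      current += ms;
    }
  };
}

function createCounterIds(prefix = "session-"): IdSource {
  let n = 0;
  return () => `${prefix}${++n}`;
}
```

In a test, advancing the fake clock past `ttlMs` expires a session without any `setTimeout` or real waiting, and the counter-based IDs keep assertions stable.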
|
|
85
97
|
|
|
86
|
-
|
|
98
|
+
---
|
|
87
99
|
|
|
88
|
-
|
|
100
|
+
## Layer 1: InMemoryTransport Protocol-Level Integration Tests (P0 Core)
|
|
89
101
|
|
|
90
|
-
|
|
91
|
-
|
|
102
|
+
Goal: avoid opening ports or spawning processes. Use real MCP client ↔ server lifecycle (`initialize` / `list` / `call`) in one process.
|
|
103
|
+
|
|
104
|
+
### Core Scaffold: buildContext + createLinkedPair
|
|
92
105
|
|
|
106
|
+
```ts
|
|
107
|
+
// tests/integration/_helpers.ts
|
|
93
108
|
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
|
|
94
109
|
import { InMemoryTransport } from "@modelcontextprotocol/sdk/inMemory.js";
|
|
95
|
-
import { createMcpServer } from "../../src/server/build-server.js";
|
|
96
110
|
|
|
97
|
-
|
|
98
|
-
|
|
111
|
+
import { createMcpServer } from "../../src/server/createMcpServer.js";
|
|
112
|
+
|
|
113
|
+
export function buildContext(overrides?: Partial<AppContext>): AppContext {
|
|
99
114
|
return {
|
|
100
115
|
env: {
|
|
101
|
-
|
|
102
|
-
|
|
103
|
-
|
|
104
|
-
|
|
116
|
+
READ_ONLY_MODE: false,
|
|
117
|
+
...overrides?.env
|
|
118
|
+
},
|
|
119
|
+
logger: overrides?.logger ?? {
|
|
120
|
+
info: vi.fn(),
|
|
121
|
+
warn: vi.fn(),
|
|
122
|
+
error: vi.fn(),
|
|
123
|
+
debug: vi.fn()
|
|
105
124
|
},
|
|
106
|
-
|
|
107
|
-
|
|
108
|
-
debug: vi.fn(), trace: vi.fn(), fatal: vi.fn(),
|
|
109
|
-
child: () => ({}) as never
|
|
125
|
+
services: {
|
|
126
|
+
...overrides?.services
|
|
110
127
|
},
|
|
111
|
-
|
|
112
|
-
|
|
113
|
-
formatter: new OutputFormatter({ ... }) // Real formatter
|
|
128
|
+
policy: overrides?.policy ?? new ToolPolicyEngine(),
|
|
129
|
+
formatter: overrides?.formatter ?? new OutputFormatter()
|
|
114
130
|
};
|
|
115
131
|
}
|
|
116
132
|
|
|
117
|
-
// 2) Create Client ↔ Server linked pair
|
|
118
133
|
export async function createLinkedPair(context: AppContext) {
|
|
119
134
|
const server = createMcpServer(context);
|
|
120
|
-
const [clientTransport, serverTransport] =
|
|
121
|
-
InMemoryTransport.createLinkedPair();
|
|
135
|
+
const [clientTransport, serverTransport] = InMemoryTransport.createLinkedPair();
|
|
122
136
|
|
|
123
137
|
await server.connect(serverTransport);
|
|
124
138
|
|
|
@@ -132,41 +146,39 @@ export async function createLinkedPair(context: AppContext) {
|
|
|
132
146
|
}
|
|
133
147
|
```
|
|
134
148
|
|
|
135
|
-
|
|
149
|
+
Always close both transports in a `try/finally` block, even if closing one side happens to cascade to the other in the current implementation.
|
|
136
150
|
|
|
137
|
-
### Pattern 1:
|
|
151
|
+
### Pattern 1: Capabilities and List Contracts (tools/resources/prompts/list)
|
|
138
152
|
|
|
139
|
-
|
|
153
|
+
Do not test tools only. A mature MCP server often exposes tools, resources, and prompts. The list contract is a first-order external API.
|
|
140
154
|
|
|
141
|
-
```
|
|
142
|
-
describe("
|
|
143
|
-
it("
|
|
155
|
+
```ts
|
|
156
|
+
describe("Contract: listTools()", () => {
|
|
157
|
+
it("exposes expected core tools by default", async () => {
|
|
144
158
|
const { client, clientTransport, serverTransport } = await createLinkedPair(buildContext());
|
|
159
|
+
|
|
145
160
|
try {
|
|
146
161
|
const { tools } = await client.listTools();
|
|
147
162
|
const names = tools.map((t) => t.name);
|
|
148
163
|
|
|
149
|
-
expect(names).toContain("gitlab_get_project");
|
|
150
|
-
expect(names).toContain("gitlab_list_issues");
|
|
151
164
|
expect(names).toContain("health_check");
|
|
165
|
+
expect(names).toContain("my_readonly_tool");
|
|
152
166
|
} finally {
|
|
153
167
|
await clientTransport.close();
|
|
154
168
|
await serverTransport.close();
|
|
155
169
|
}
|
|
156
170
|
});
|
|
157
171
|
|
|
158
|
-
it("
|
|
159
|
-
const {
|
|
160
|
-
|
|
161
|
-
|
|
172
|
+
it("hides write tools in read-only mode", async () => {
|
|
173
|
+
const ctx = buildContext({ env: { READ_ONLY_MODE: true } as any });
|
|
174
|
+
const { client, clientTransport, serverTransport } = await createLinkedPair(ctx);
|
|
175
|
+
|
|
162
176
|
try {
|
|
163
177
|
const { tools } = await client.listTools();
|
|
164
178
|
const names = tools.map((t) => t.name);
|
|
165
179
|
|
|
166
|
-
expect(names).not.toContain("
|
|
167
|
-
expect(names).
|
|
168
|
-
// Read-only tools are still present
|
|
169
|
-
expect(names).toContain("gitlab_get_project");
|
|
180
|
+
expect(names).not.toContain("create_issue");
|
|
181
|
+
expect(names).toContain("health_check");
|
|
170
182
|
} finally {
|
|
171
183
|
await clientTransport.close();
|
|
172
184
|
await serverTransport.close();
|
|
@@ -175,46 +187,42 @@ describe("Tool Registration", () => {
|
|
|
175
187
|
});
|
|
176
188
|
```
|
|
177
189
|
|
|
178
|
-
|
|
190
|
+
Recommended assertion style:
|
|
191
|
+
|
|
192
|
+
- Assert presence/absence of tool names (stable).
|
|
193
|
+
- Avoid asserting full text descriptions (high-churn).
|
|
194
|
+
|
|
195
|
+
### Pattern 2: Tool Handler End-to-End Validation (stub external dependencies)
|
|
179
196
|
|
|
180
|
-
|
|
197
|
+
Validate at least three aspects:
|
|
181
198
|
|
|
182
|
-
|
|
199
|
+
1. Correct dependency call with correct parameters.
|
|
200
|
+
2. Correct MCP tool result shape (`content[]` and optional `structuredContent`).
|
|
201
|
+
3. Correct error-path behavior.
|
|
183
202
|
|
|
184
|
-
|
|
185
|
-
|
|
186
|
-
|
|
203
|
+
```ts
|
|
204
|
+
describe("Tool handler: get_project", () => {
|
|
205
|
+
it("forwards project_id to dependency", async () => {
|
|
206
|
+
const getProject = vi.fn().mockResolvedValue({ id: 42, name: "alpha" });
|
|
187
207
|
|
|
188
|
-
|
|
189
|
-
|
|
190
|
-
it("passes project_id to context.gitlab.getProject()", async () => {
|
|
191
|
-
const getProject = vi.fn().mockResolvedValue({
|
|
192
|
-
id: 42,
|
|
193
|
-
name: "my-project",
|
|
194
|
-
path_with_namespace: "group/my-project"
|
|
208
|
+
const ctx = buildContext({
|
|
209
|
+
services: { git: { getProject } } as any
|
|
195
210
|
});
|
|
196
211
|
|
|
197
|
-
const { client, clientTransport, serverTransport } = await createLinkedPair(
|
|
198
|
-
buildContext({
|
|
199
|
-
gitlabStub: { getProject }
|
|
200
|
-
})
|
|
201
|
-
);
|
|
212
|
+
const { client, clientTransport, serverTransport } = await createLinkedPair(ctx);
|
|
202
213
|
|
|
203
214
|
try {
|
|
204
215
|
const result = await client.callTool({
|
|
205
|
-
name: "
|
|
206
|
-
arguments: { project_id: "group/
|
|
216
|
+
name: "get_project",
|
|
217
|
+
arguments: { project_id: "group/alpha" }
|
|
207
218
|
});
|
|
208
219
|
|
|
209
|
-
|
|
210
|
-
expect(getProject).toHaveBeenCalledWith("group/my-project");
|
|
211
|
-
|
|
212
|
-
// Verification 2: response is not an error
|
|
220
|
+
expect(getProject).toHaveBeenCalledWith("group/alpha");
|
|
213
221
|
expect(result.isError).toBeFalsy();
|
|
214
222
|
|
|
215
|
-
|
|
216
|
-
|
|
217
|
-
expect(
|
|
223
|
+
const text = (result.content as any[]).find((c) => c.type === "text")?.text ?? "";
|
|
224
|
+
expect(text).toContain("alpha");
|
|
225
|
+
expect(result.structuredContent).toMatchObject({ id: 42, name: "alpha" });
|
|
218
226
|
} finally {
|
|
219
227
|
await clientTransport.close();
|
|
220
228
|
await serverTransport.close();
|
|
@@ -223,51 +231,29 @@ describe("Tool handler: gitlab_get_project", () => {
|
|
|
223
231
|
});
|
|
224
232
|
```
|
|
225
233
|
|
|
226
|
-
|
|
234
|
+
Practical rule: stub only what the test should use. Unexpected dependency usage should fail loudly.
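
One way to get the "fail loudly" behavior is a `Proxy`-based stub, sketched here under stated assumptions. `strictStub` and `GitService` are hypothetical names, not part of Vitest or the MCP SDK, and the service is synchronous only to keep the example short.

```ts
// Hypothetical helper: wraps a partial implementation and throws on anything that was not stubbed.
function strictStub<T extends object>(name: string, impl: Partial<T>): T {
  return new Proxy(impl as T, {
    get(target, prop, receiver) {
      // Symbols and "then" pass through so inspection and promise resolution do not trip the guard.
      if (typeof prop === "symbol" || prop === "then" || prop in target) {
        return Reflect.get(target, prop, receiver);
      }
      throw new Error(`Unexpected use of ${name}.${String(prop)}: this test did not stub it`);
    }
  });
}

// Illustrative dependency shape; a real service would usually be async.
interface GitService {
  getProject(id: string): { id: number; name: string };
  deleteProject(id: string): void;
}

// Only getProject is stubbed, so any other access fails immediately.
const git = strictStub<GitService>("git", {
  getProject: (id) => ({ id: 42, name: id })
});
```

With this shape, a handler that unexpectedly reaches for `git.deleteProject` throws at the point of access instead of silently receiving `undefined`.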
|
|
227
235
|
|
|
228
236
|
### Pattern 3: Schema Validation and Boundary Inputs
|
|
229
237
|
|
|
230
|
-
|
|
238
|
+
Most MCP servers use schema validation (often Zod). Invalid input should be treated as a first-class test target:
|
|
231
239
|
|
|
232
|
-
-
|
|
233
|
-
-
|
|
234
|
-
-
|
|
235
|
-
-
|
|
240
|
+
- Missing/extra fields.
|
|
241
|
+
- Type mismatches.
|
|
242
|
+
- Invalid enum values.
|
|
243
|
+
- `null` / `undefined` semantics.
|
|
236
244
|
|
|
237
|
-
```
|
|
238
|
-
describe("Schema
|
|
239
|
-
it("
|
|
240
|
-
const
|
|
241
|
-
const { client, clientTransport, serverTransport } = await createLinkedPair(
|
|
242
|
-
buildContext({
|
|
243
|
-
gitlabStub: { listProjects }
|
|
244
|
-
})
|
|
245
|
-
);
|
|
245
|
+
```ts
|
|
246
|
+
describe("Schema validation", () => {
|
|
247
|
+
it("returns tool-level error for type mismatch", async () => {
|
|
248
|
+
const ctx = buildContext({ services: { git: { listProjects: vi.fn() } } as any });
|
|
249
|
+
const { client, clientTransport, serverTransport } = await createLinkedPair(ctx);
|
|
246
250
|
|
|
247
251
|
try {
|
|
248
252
|
const result = await client.callTool({
|
|
249
|
-
name: "
|
|
250
|
-
arguments: {
|
|
253
|
+
name: "list_projects",
|
|
254
|
+
arguments: { page: "not-a-number" } as any
|
|
251
255
|
});
|
|
252
|
-
expect(result.isError).toBeFalsy();
|
|
253
|
-
} finally {
|
|
254
|
-
await clientTransport.close();
|
|
255
|
-
await serverTransport.close();
|
|
256
|
-
}
|
|
257
|
-
});
|
|
258
256
|
|
|
259
|
-
it("type mismatch triggers Zod error", async () => {
|
|
260
|
-
const { client, clientTransport, serverTransport } = await createLinkedPair(
|
|
261
|
-
buildContext({
|
|
262
|
-
gitlabStub: { listProjects: vi.fn() }
|
|
263
|
-
})
|
|
264
|
-
);
|
|
265
|
-
|
|
266
|
-
try {
|
|
267
|
-
const result = await client.callTool({
|
|
268
|
-
name: "gitlab_list_projects",
|
|
269
|
-
arguments: { page: "not-a-number" } // should be number
|
|
270
|
-
});
|
|
271
257
|
expect(result.isError).toBe(true);
|
|
272
258
|
} finally {
|
|
273
259
|
await clientTransport.close();
|
|
@@ -277,705 +263,370 @@ describe("Schema Validation", () => {
|
|
|
277
263
|
});
|
|
278
264
|
```
|
|
279
265
|
|
|
266
|
+
If your implementation maps validation failures to protocol errors (`-32602 InvalidParams`) instead of tool-level errors, that can also be valid. The key is consistency with client expectations.
|
|
267
|
+
|
|
268
|
+
### Pattern 4: Minimal Coverage for Bidirectional Requests (Sampling/Elicitation)
|
|
269
|
+
|
|
270
|
+
If your server uses server-initiated requests (for example, sampling/elicitation), add a minimal closed-loop test:
|
|
271
|
+
|
|
272
|
+
- Register client-side handlers with deterministic responses.
|
|
273
|
+
- Verify server behavior continues correctly after handler responses.
|
|
274
|
+
|
|
280
275
|
---
|
|
281
276
|
|
|
282
|
-
## Layer 2: HTTP Transport
|
|
277
|
+
## Layer 2: Streamable HTTP Transport Integration Tests (P0/P1)
|
|
278
|
+
|
|
279
|
+
In-memory tests are fast, but they bypass behavior that only exists on the wire: serialization, HTTP headers, session headers, SSE behavior, and origin checks.
|
|
280
|
+
|
|
281
|
+
### Spec Checklist You Must Align With (2025-11-25)
|
|
283
282
|
|
|
284
|
-
|
|
283
|
+
- **Single MCP endpoint** supports both `POST` and `GET`.
|
|
284
|
+
- **POST** requires client `Accept: application/json, text/event-stream`.
|
|
285
|
+
- **POST body** must be a single JSON-RPC message. JSON-RPC batch arrays are not part of this spec revision, so a strict server should reject them.
|
|
286
|
+
- **GET** opens SSE stream, or server may return `405` if SSE is not supported.
|
|
287
|
+
- **Origin security** should reject invalid origins with `403`.
|
|
288
|
+
- **Session behavior**: when session IDs are enabled, post-init requests must include `MCP-Session-Id`.
|
|
289
|
+
- **Protocol version header** should be validated for post-init requests.
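
The header-related items in this checklist can be exercised as a pure function before any HTTP server is involved. The sketch below is an assumption-laden illustration: `gatePostInitRequest`, `GateConfig`, and the order of checks are hypothetical. Returning `406` for an incomplete `Accept` header is this sketch's choice rather than something the checklist states, and a missing `MCP-Protocol-Version` header is passed through here rather than rejected.

```ts
// Hypothetical gate for post-initialize POST requests; names and check order are illustrative.
interface GateConfig {
  allowedOrigins: string[];
  supportedVersions: string[];
  knownSessions: Set<string>;
}

type HeaderMap = Record<string, string | undefined>; // keys are expected to be lower-cased

function gatePostInitRequest(headers: HeaderMap, cfg: GateConfig): number {
  // Origin: only checked when present; an unknown origin is rejected.
  const origin = headers["origin"];
  if (origin !== undefined && !cfg.allowedOrigins.includes(origin)) return 403;

  // Accept: the client must list both JSON and SSE (406 is an assumption of this sketch).
  const accept = headers["accept"] ?? "";
  if (!accept.includes("application/json") || !accept.includes("text/event-stream")) return 406;

  // Session: a missing ID is a bad request, an unknown or expired ID is 404.
  const sessionId = headers["mcp-session-id"];
  if (sessionId === undefined) return 400;
  if (!cfg.knownSessions.has(sessionId)) return 404;

  // Protocol version: reject values the server does not support.
  const version = headers["mcp-protocol-version"];
  if (version !== undefined && !cfg.supportedVersions.includes(version)) return 400;

  return 200;
}
```

Keeping this logic free of I/O lets a table-driven unit test cover every header combination, while the HTTP-level tests only need to confirm that the wiring matches.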
|
|
285
290
|
|
|
286
|
-
###
|
|
291
|
+
### HTTP Test Harness: Port 0 + Isolated Server Instances
|
|
287
292
|
|
|
288
|
-
|
|
293
|
+
Never share a stateful server instance across tests involving session/rate-limit/connection state.
|
|
289
294
|
|
|
290
|
-
```
|
|
295
|
+
```ts
|
|
291
296
|
import { createServer, type Server as HttpServer } from "node:http";
|
|
292
|
-
import { setupMcpHttpApp } from "../../src/http-app.js";
|
|
293
297
|
|
|
294
298
|
let httpServer: HttpServer;
|
|
295
299
|
let baseUrl: string;
|
|
296
|
-
let result: SetupMcpHttpAppResult;
|
|
297
|
-
|
|
298
|
-
beforeAll(async () => {
|
|
299
|
-
const context = buildHttpContext();
|
|
300
|
-
result = setupMcpHttpApp({
|
|
301
|
-
context,
|
|
302
|
-
env: context.env,
|
|
303
|
-
logger: context.logger
|
|
304
|
-
});
|
|
305
300
|
|
|
306
|
-
|
|
301
|
+
beforeEach(async () => {
|
|
302
|
+
const app = buildYourExpressOrHonoApp();
|
|
303
|
+
httpServer = createServer(app);
|
|
304
|
+
|
|
307
305
|
await new Promise<void>((resolve) => {
|
|
308
306
|
httpServer.listen(0, "127.0.0.1", () => resolve());
|
|
309
307
|
});
|
|
310
308
|
|
|
311
309
|
const addr = httpServer.address();
|
|
312
|
-
if (typeof addr === "object" && addr
|
|
310
|
+
if (typeof addr === "object" && addr?.port) {
|
|
313
311
|
baseUrl = `http://127.0.0.1:${addr.port}`;
|
|
312
|
+
} else {
|
|
313
|
+
throw new Error("Failed to bind test port");
|
|
314
314
|
}
|
|
315
315
|
});
|
|
316
316
|
|
|
317
|
-
|
|
318
|
-
// Key: close all sessions first, then shut down the HTTP server
|
|
319
|
-
for (const sessionId of result.sessions.keys()) {
|
|
320
|
-
await result.closeSession(sessionId, "shutdown");
|
|
321
|
-
}
|
|
317
|
+
afterEach(async () => {
|
|
322
318
|
await new Promise<void>((resolve, reject) => {
|
|
323
319
|
httpServer.close((err) => (err ? reject(err) : resolve()));
|
|
324
320
|
});
|
|
325
321
|
});
|
|
326
322
|
```
|
|
327
323
|
|
|
328
|
-
|
|
329
|
-
|
|
330
|
-
|
|
331
|
-
|
|
332
|
-
|
|
333
|
-
|
|
334
|
-
|
|
335
|
-
|
|
336
|
-
|
|
337
|
-
|
|
338
|
-
|
|
339
|
-
|
|
340
|
-
|
|
341
|
-
|
|
342
|
-
|
|
343
|
-
|
|
344
|
-
|
|
345
|
-
|
|
346
|
-
|
|
347
|
-
```typescript
|
|
348
|
-
// SSE event parser
|
|
349
|
-
async function* parseSseEvents(response: Response): AsyncGenerator<SseEvent> {
|
|
350
|
-
const reader = response.body!.getReader();
|
|
351
|
-
const decoder = new TextDecoder();
|
|
352
|
-
let buffer = "";
|
|
353
|
-
|
|
354
|
-
try {
|
|
355
|
-
while (true) {
|
|
356
|
-
const { done, value } = await reader.read();
|
|
357
|
-
if (done) break;
|
|
358
|
-
buffer += decoder.decode(value, { stream: true });
|
|
359
|
-
const parts = buffer.split("\n\n");
|
|
360
|
-
buffer = parts.pop()!;
|
|
361
|
-
|
|
362
|
-
for (const part of parts) {
|
|
363
|
-
const event: SseEvent = {};
|
|
364
|
-
for (const line of part.split("\n")) {
|
|
365
|
-
if (line.startsWith("event: ")) event.event = line.slice(7).trim();
|
|
366
|
-
else if (line.startsWith("data: ")) event.data = line.slice(6).trim();
|
|
367
|
-
}
|
|
368
|
-
yield event;
|
|
369
|
-
}
|
|
324
|
+
### Must-Test Cases: Session, 404 Reinitialize, DELETE, Protocol Version Header
|
|
325
|
+
|
|
326
|
+
#### 1) Initialization returns `MCP-Session-Id` (stateful mode)
|
|
327
|
+
|
|
328
|
+
```ts
|
|
329
|
+
const MCP_HEADERS = {
|
|
330
|
+
"Content-Type": "application/json",
|
|
331
|
+
Accept: "application/json, text/event-stream"
|
|
332
|
+
};
|
|
333
|
+
|
|
334
|
+
function initializeBody() {
|
|
335
|
+
return JSON.stringify({
|
|
336
|
+
jsonrpc: "2.0",
|
|
337
|
+
id: 1,
|
|
338
|
+
method: "initialize",
|
|
339
|
+
params: {
|
|
340
|
+
protocolVersion: "2025-11-25",
|
|
341
|
+
capabilities: {},
|
|
342
|
+
clientInfo: { name: "itest", version: "0.0.1" }
|
|
370
343
|
}
|
|
371
|
-
}
|
|
372
|
-
reader.releaseLock();
|
|
373
|
-
}
|
|
344
|
+
});
|
|
374
345
|
}
|
|
375
|
-
```
|
|
376
|
-
|
|
377
|
-
**SSE Test Flow**:
|
|
378
|
-
|
|
379
|
-
```typescript
|
|
380
|
-
it("GET /sse returns an endpoint event", async () => {
|
|
381
|
-
const controller = new AbortController();
|
|
382
|
-
try {
|
|
383
|
-
const response = await fetch(`${baseUrl}/sse`, {
|
|
384
|
-
headers: { Accept: "text/event-stream" },
|
|
385
|
-
signal: controller.signal
|
|
386
|
-
});
|
|
387
346
|
|
|
388
|
-
|
|
389
|
-
|
|
390
|
-
|
|
391
|
-
|
|
392
|
-
|
|
347
|
+
it("returns MCP-Session-Id during initialize", async () => {
|
|
348
|
+
const res = await fetch(`${baseUrl}/mcp`, {
|
|
349
|
+
method: "POST",
|
|
350
|
+
headers: MCP_HEADERS,
|
|
351
|
+
body: initializeBody()
|
|
352
|
+
});
|
|
393
353
|
|
|
394
|
-
|
|
395
|
-
|
|
396
|
-
} finally {
|
|
397
|
-
controller.abort(); // Required: clean up the long-lived connection
|
|
398
|
-
}
|
|
354
|
+
expect(res.status).toBe(200);
|
|
355
|
+
expect(res.headers.get("MCP-Session-Id")).toBeTruthy();
|
|
399
356
|
});
|
|
400
357
|
```
|
|
401
358
|
|
|
402
|
-
|
|
403
|
-
|
|
404
|
-
> **Important**: `SSE=true` is incompatible with `REMOTE_AUTHORIZATION=true`. The environment validation layer enforces this constraint at startup. If you need remote per-request authentication, use Streamable HTTP transport instead.
|
|
405
|
-
|
|
406
|
-
### Session Lifecycle Testing
|
|
359
|
+
#### 2) Post-init requests require `MCP-Session-Id` and valid protocol version
|
|
407
360
|
|
|
408
|
-
|
|
409
|
-
|
|
410
|
-
|
|
411
|
-
|
|
412
|
-
|
|
413
|
-
|
|
414
|
-
|
|
415
|
-
|
|
416
|
-
|
|
417
|
-
|
|
418
|
-
|
|
419
|
-
|
|
420
|
-
|
|
421
|
-
|
|
422
|
-
|
|
423
|
-
|
|
424
|
-
|
|
425
|
-
body: JSON.stringify({
|
|
426
|
-
jsonrpc: "2.0",
|
|
427
|
-
id: 2,
|
|
428
|
-
method: "tools/list",
|
|
429
|
-
params: {}
|
|
430
|
-
})
|
|
431
|
-
});
|
|
432
|
-
|
|
433
|
-
expect(res.status).toBe(404);
|
|
434
|
-
const body = await res.json();
|
|
435
|
-
expect(body.error?.code).toBe(-32001);
|
|
361
|
+
```ts
|
|
362
|
+
it("requires session and protocol headers post-init", async () => {
|
|
363
|
+
const initRes = await fetch(`${baseUrl}/mcp`, {
|
|
364
|
+
method: "POST",
|
|
365
|
+
headers: MCP_HEADERS,
|
|
366
|
+
body: initializeBody()
|
|
367
|
+
});
|
|
368
|
+
const sessionId = initRes.headers.get("MCP-Session-Id")!;
|
|
369
|
+
|
|
370
|
+
const res = await fetch(`${baseUrl}/mcp`, {
|
|
371
|
+
method: "POST",
|
|
372
|
+
headers: {
|
|
373
|
+
...MCP_HEADERS,
|
|
374
|
+
"MCP-Session-Id": sessionId,
|
|
375
|
+
"MCP-Protocol-Version": "2025-11-25"
|
|
376
|
+
},
|
|
377
|
+
body: JSON.stringify({ jsonrpc: "2.0", id: 2, method: "tools/list", params: {} })
|
|
436
378
|
});
|
|
437
|
-
});
|
|
438
|
-
```
|
|
439
|
-
|
|
440
|
-
**Garbage Collection Testing**: Trigger immediate expiration by setting `SESSION_TIMEOUT_SECONDS` to 0:
|
|
441
|
-
|
|
442
|
-
```typescript
|
|
443
|
-
it("GC cleans up expired sessions", async () => {
|
|
444
|
-
(ctx.env as any).SESSION_TIMEOUT_SECONDS = 0;
|
|
445
|
-
// ... create session ...
|
|
446
|
-
expect(result.sessions.size).toBe(1);
|
|
447
379
|
|
|
448
|
-
|
|
449
|
-
expect(result.sessions.size).toBe(0);
|
|
380
|
+
expect(res.status).toBe(200);
|
|
450
381
|
});
|
|
451
382
|
```
|
|
452
383
|
|
|
453
|
-
|
|
384
|
+
#### 3) Invalid protocol version returns `400`
|
|
454
385
|
|
|
455
|
-
```
|
|
456
|
-
it("
|
|
457
|
-
|
|
386
|
+
```ts
|
|
387
|
+
it("returns 400 for invalid MCP-Protocol-Version", async () => {
|
|
388
|
+
const initRes = await fetch(`${baseUrl}/mcp`, {
|
|
389
|
+
method: "POST",
|
|
390
|
+
headers: MCP_HEADERS,
|
|
391
|
+
body: initializeBody()
|
|
392
|
+
});
|
|
393
|
+
const sessionId = initRes.headers.get("MCP-Session-Id")!;
|
|
394
|
+
|
|
395
|
+
const res = await fetch(`${baseUrl}/mcp`, {
|
|
396
|
+
method: "POST",
|
|
397
|
+
headers: {
|
|
398
|
+
...MCP_HEADERS,
|
|
399
|
+
"MCP-Session-Id": sessionId,
|
|
400
|
+
"MCP-Protocol-Version": "invalid-version"
|
|
401
|
+
},
|
|
402
|
+
body: JSON.stringify({ jsonrpc: "2.0", id: 2, method: "tools/list", params: {} })
|
|
403
|
+
});
|
|
458
404
|
|
|
459
|
-
|
|
460
|
-
while (result.sseSessions.size > 0 && Date.now() < deadline) {
|
|
461
|
-
await new Promise((r) => setTimeout(r, 50));
|
|
462
|
-
}
|
|
463
|
-
expect(result.sseSessions.size).toBe(0);
|
|
405
|
+
expect(res.status).toBe(400);
|
|
464
406
|
});
|
|
465
407
|
```
|
|
466
408
|
|
|
467
|
-
|
|
409
|
+
#### 4) Session termination and re-initialization (`404` behavior)
|
|
468
410
|
|
|
469
|
-
|
|
411
|
+
If your server expires or terminates sessions, requests with stale session IDs should return `404`, and the client should re-initialize without sending the old session ID.
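
The client-side half of this behavior can be modeled without any transport. The following is a minimal, synchronous sketch: `FakeServer` and `ReinitializingClient` are hypothetical names, and a real client would perform the same steps over async HTTP calls.

```ts
// Hypothetical, transport-free model of the 404 → re-initialize loop.
interface Reply {
  status: number;
  sessionId?: string;
}

class FakeServer {
  private live = new Set<string>();
  private counter = 0;

  initialize(): Reply {
    const sessionId = `s${++this.counter}`;
    this.live.add(sessionId);
    return { status: 200, sessionId };
  }

  request(sessionId: string | undefined): Reply {
    if (sessionId === undefined) return { status: 400 };
    return { status: this.live.has(sessionId) ? 200 : 404 };
  }

  expireAll(): void {
    this.live.clear();
  }
}

class ReinitializingClient {
  sessionId: string | undefined;
  initializeCount = 0;
  private server: FakeServer;

  constructor(server: FakeServer) {
    this.server = server;
  }

  private initialize(): void {
    // Forget the stale ID first: a fresh initialize never carries the old session ID.
    this.sessionId = undefined;
    this.sessionId = this.server.initialize().sessionId;
    this.initializeCount++;
  }

  call(): number {
    if (this.sessionId === undefined) this.initialize();
    let reply = this.server.request(this.sessionId);
    if (reply.status === 404) {
      this.initialize(); // stale session: start over once, then retry the request
      reply = this.server.request(this.sessionId);
    }
    return reply.status;
  }
}
```

Retrying only once is a deliberate choice in this sketch: a second `404` right after a fresh initialize usually points at a server bug, and surfacing it is more useful than looping.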
|
|
470
412
|
|
|
471
|
-
|
|
413
|
+
#### 5) DELETE semantics (`200` or `405`)
|
|
472
414
|
|
|
473
|
-
|
|
415
|
+
The server may support explicit session termination via `DELETE`, or may return `405`.
|
|
474
416
|
|
|
475
|
-
|
|
476
|
-
function buildRemoteAuthContext() {
|
|
477
|
-
const ctx = buildContext({ token: null }); // No default token
|
|
478
|
-
(ctx.env as any).REMOTE_AUTHORIZATION = true;
|
|
479
|
-
(ctx.env as any).HTTP_JSON_ONLY = true;
|
|
480
|
-
return ctx;
|
|
481
|
-
}
|
|
417
|
+
### SSE Stream Tests (GET/POST SSE on Streamable HTTP)
|
|
482
418
|
|
|
483
|
-
|
|
484
|
-
it("missing token returns 401 + error code -32010", async () => {
|
|
485
|
-
const res = await fetch(`${baseUrl}/mcp`, {
|
|
486
|
-
method: "POST",
|
|
487
|
-
headers: MCP_HEADERS, // No Authorization
|
|
488
|
-
body: initializeBody()
|
|
489
|
-
});
|
|
490
|
-
expect(res.status).toBe(401);
|
|
491
|
-
const body = await res.json();
|
|
492
|
-
expect(body.error?.code).toBe(-32010);
|
|
493
|
-
});
|
|
419
|
+
In Streamable HTTP, SSE happens on the **same endpoint**:
|
|
494
420
|
|
|
495
|
-
|
|
496
|
-
|
|
497
|
-
method: "POST",
|
|
498
|
-
headers: {
|
|
499
|
-
...MCP_HEADERS,
|
|
500
|
-
Authorization: "Bearer test-remote-token"
|
|
501
|
-
},
|
|
502
|
-
body: initializeBody()
|
|
503
|
-
});
|
|
504
|
-
expect(res.status).toBe(200);
|
|
505
|
-
expect(res.headers.get("mcp-session-id")).toBeTruthy();
|
|
506
|
-
});
|
|
421
|
+
- POST response can be `text/event-stream`.
|
|
422
|
+
- GET may open a standalone SSE channel for notifications.
|
|
507
423
|
|
|
508
|
-
|
|
509
|
-
// Initialize session with token
|
|
510
|
-
const initRes = await fetch(`${baseUrl}/mcp`, {
|
|
511
|
-
method: "POST",
|
|
512
|
-
headers: {
|
|
513
|
-
...MCP_HEADERS,
|
|
514
|
-
Authorization: "Bearer my-secret-token"
|
|
515
|
-
},
|
|
516
|
-
body: initializeBody()
|
|
517
|
-
});
|
|
518
|
-
const sessionId = initRes.headers.get("mcp-session-id")!;
|
|
424
|
+
#### 1) `GET /mcp` should be SSE (`200`) or unsupported (`405`)
|
|
519
425
|
|
|
520
|
-
|
|
521
|
-
|
|
522
|
-
|
|
523
|
-
|
|
426
|
+
```ts
|
|
427
|
+
it("GET /mcp returns SSE or 405", async () => {
|
|
428
|
+
const res = await fetch(`${baseUrl}/mcp`, {
|
|
429
|
+
method: "GET",
|
|
430
|
+
headers: { Accept: "text/event-stream" }
|
|
524
431
|
});
|
|
525
|
-
});
|
|
526
|
-
```
|
|
527
432
|
|
|
528
|
-
|
|
529
|
-
|
|
530
|
-
|
|
531
|
-
|
|
532
|
-
|
|
533
|
-
| ------ | -------------------------- | ------------------------------------------------------------------ |
|
|
534
|
-
| `full` | Implementation-dependent | Returns full error details (better debugging, higher leakage risk) |
|
|
535
|
-
| `safe` | Recommended for production | Hides internal details, returns generic messages |
|
|
536
|
-
|
|
537
|
-
```typescript
|
|
538
|
-
describe("Error Handling", () => {
|
|
539
|
-
it("GitLabApiError 404 → isError + status code", async () => {
|
|
540
|
-
const getProject = vi.fn().mockRejectedValue(new GitLabApiError("Not Found", 404));
|
|
541
|
-
// ...
|
|
542
|
-
expect(result.isError).toBe(true);
|
|
543
|
-
expect(text).toContain("GitLab API error 404");
|
|
544
|
-
});
|
|
545
|
-
|
|
546
|
-
it("safe mode hides error details", async () => {
|
|
547
|
-
(ctx.env as any).GITLAB_ERROR_DETAIL_MODE = "safe";
|
|
548
|
-
|
|
549
|
-
const getProject = vi
|
|
550
|
-
.fn()
|
|
551
|
-
.mockRejectedValue(new Error("DB connection failed: password=hunter2"));
|
|
552
|
-
// ...
|
|
553
|
-
expect(text).toBe("Request failed"); // Generic message
|
|
554
|
-
expect(text).not.toContain("hunter2"); // No leakage
|
|
555
|
-
});
|
|
556
|
-
|
|
557
|
-
it("non-Error thrown values return Unknown error", async () => {
|
|
558
|
-
const getProject = vi.fn().mockRejectedValue("string error");
|
|
559
|
-
// ...
|
|
560
|
-
expect(text).toBe("Unknown error");
|
|
561
|
-
});
|
|
433
|
+
const ct = res.headers.get("content-type") ?? "";
|
|
434
|
+
expect([200, 405]).toContain(res.status);
|
|
435
|
+
if (res.status === 200) {
|
|
436
|
+
expect(ct).toContain("text/event-stream");
|
|
437
|
+
}
|
|
562
438
|
});
|
|
563
439
|
```
|
|
564
440
|
|
|
565
|
-
|
|
566
|
-
|
|
567
|
-
```typescript
|
|
568
|
-
describe("Token Redaction", () => {
|
|
569
|
-
it.each([
|
|
570
|
-
["GitLab PAT", "glpat-abcdef1234567890"],
|
|
571
|
-
["GitHub PAT", "ghp_abcdef1234567890abcde"],
|
|
572
|
-
["JWT", "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.payload"]
|
|
573
|
-
])("redacts %s token", async (label, token) => {
|
|
574
|
-
const getProject = vi.fn().mockRejectedValue(
|
|
575
|
-
new GitLabApiError("Unauthorized", 401, {
|
|
576
|
-
message: `Token ${token} is invalid`
|
|
577
|
-
})
|
|
578
|
-
);
|
|
579
|
-
// ...
|
|
580
|
-
expect(text).toContain("[REDACTED]");
|
|
581
|
-
expect(text).not.toContain(token);
|
|
582
|
-
});
|
|
441
|
+
#### 2) JSON-only mode should reject GET (`405`)
|
|
583
442
|
|
|
584
|
-
|
|
585
|
-
const getProject = vi.fn().mockRejectedValue(
|
|
586
|
-
new GitLabApiError("Error", 400, {
|
|
587
|
-
authorization: "Bearer secret-val",
|
|
588
|
-
password: "hunter2",
|
|
589
|
-
message: "safe value" // Non-sensitive key is preserved
|
|
590
|
-
})
|
|
591
|
-
);
|
|
592
|
-
// ...
|
|
593
|
-
expect(text).not.toContain("secret-val");
|
|
594
|
-
expect(text).not.toContain("hunter2");
|
|
595
|
-
expect(text).toContain("safe value");
|
|
596
|
-
});
|
|
597
|
-
});
|
|
598
|
-
```
|
|
443
|
+
If you explicitly enable JSON-only response mode, test that `GET /mcp` is rejected with `405`.
|
|
599
444
|
|
|
600
|
-
|
|
445
|
+
#### 3) SSE disconnect/reconnect behavior
|
|
601
446
|
|
|
602
|
-
|
|
447
|
+
Avoid fixed sleeps. When testing reconnect behavior, poll for the expected state until a deadline instead.
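A deadline-plus-polling helper might look like the following. This is a minimal sketch: `waitUntil` is a hypothetical name, and the 2000 ms / 50 ms defaults are illustrative values, not requirements.

```typescript
// Hypothetical helper: polls a condition until it holds or the deadline passes.
async function waitUntil(
  condition: () => boolean,
  timeoutMs = 2000,
  intervalMs = 50
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    if (condition()) return true;
    // Yield to the event loop so the server side can make progress.
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
  return condition(); // one last check at the deadline
}
```

In a test, the result is asserted directly, for example `expect(await waitUntil(() => sessions.size === 0)).toBe(true)`, where `sessions` stands for whatever state your server exposes to tests. The test then finishes as soon as the condition holds and fails with a clear signal if it never does.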
|
|
603
448
|
|
|
604
|
-
|
|
605
|
-
describe("GraphQL Tool Policy", () => {
|
|
606
|
-
it("disables GraphQL tools when ALLOWED_PROJECT_IDS is set", async () => {
|
|
607
|
-
const { client, clientTransport, serverTransport } = await createLinkedPair(
|
|
608
|
-
buildContext({ allowedProjectIds: ["123"] })
|
|
609
|
-
);
|
|
610
|
-
try {
|
|
611
|
-
const names = (await client.listTools()).tools.map((t) => t.name);
|
|
612
|
-
expect(names).not.toContain("gitlab_execute_graphql_query");
|
|
613
|
-
expect(names).not.toContain("gitlab_execute_graphql_mutation");
|
|
614
|
-
} finally {
|
|
615
|
-
await clientTransport.close();
|
|
616
|
-
await serverTransport.close();
|
|
617
|
-
}
|
|
618
|
-
});
|
|
449
|
+
---
|
|
619
450
|
|
|
620
|
-
|
|
621
|
-
const { client, clientTransport, serverTransport } = await createLinkedPair(
|
|
622
|
-
buildContext({
|
|
623
|
-
allowedProjectIds: ["123"],
|
|
624
|
-
allowGraphqlWithProjectScope: true
|
|
625
|
-
})
|
|
626
|
-
);
|
|
627
|
-
try {
|
|
628
|
-
const names = (await client.listTools()).tools.map((t) => t.name);
|
|
629
|
-
expect(names).toContain("gitlab_execute_graphql_query");
|
|
630
|
-
} finally {
|
|
631
|
-
await clientTransport.close();
|
|
632
|
-
await serverTransport.close();
|
|
633
|
-
}
|
|
634
|
-
});
|
|
451
|
+
## Layer 2.5: Legacy HTTP+SSE Compatibility Tests (Only if Needed)
|
|
635
452
|
|
|
636
|
-
|
|
637
|
-
const { client, clientTransport, serverTransport } = await createLinkedPair(
|
|
638
|
-
buildContext({ readOnlyMode: true, gitlabStub: { executeGraphql: vi.fn() } })
|
|
639
|
-
);
|
|
640
|
-
try {
|
|
641
|
-
const result = await client.callTool({
|
|
642
|
-
name: "gitlab_execute_graphql",
|
|
643
|
-
arguments: { query: "mutation { createProject { id } }" }
|
|
644
|
-
});
|
|
645
|
-
expect(result.isError).toBe(true);
|
|
646
|
-
} finally {
|
|
647
|
-
await clientTransport.close();
|
|
648
|
-
await serverTransport.close();
|
|
649
|
-
}
|
|
650
|
-
});
|
|
453
|
+
Streamable HTTP replaces legacy HTTP+SSE. If you still need old-client compatibility:
|
|
651
454
|
|
|
652
|
-
|
|
653
|
-
|
|
654
|
-
|
|
655
|
-
buildContext({ gitlabStub: { executeGraphql } })
|
|
656
|
-
);
|
|
657
|
-
try {
|
|
658
|
-
const result = await client.callTool({
|
|
659
|
-
name: "gitlab_execute_graphql_query",
|
|
660
|
-
arguments: { query: '{ project(name: "mutation thing") { id } }' }
|
|
661
|
-
});
|
|
662
|
-
expect(result.isError).toBeFalsy(); // Should not error
|
|
663
|
-
} finally {
|
|
664
|
-
await clientTransport.close();
|
|
665
|
-
await serverTransport.close();
|
|
666
|
-
}
|
|
667
|
-
});
|
|
668
|
-
});
|
|
669
|
-
```
|
|
455
|
+
- Clearly label it as **legacy transport**.
|
|
456
|
+
- Keep minimal black-box tests for old endpoints.
|
|
457
|
+
- Implement new capabilities only on Streamable HTTP, not legacy.
|
|
670
458
|
|
|
671
459
|
---
|
|
672
460
|
|
|
673
|
-
## Layer
|
|
461
|
+
## Layer 3: Security / Auth / Policy Tests (P0/P1)
|
|
674
462
|
|
|
675
|
-
###
|
|
463
|
+
### Origin/Host Protection (DNS Rebinding)
|
|
676
464
|
|
|
677
|
-
|
|
465
|
+
Recommended cases:
|
|
678
466
|
|
|
679
|
-
|
|
467
|
+
- Missing `Origin` (allow/deny based on policy, but keep behavior consistent).
|
|
468
|
+
- `Origin` outside the allowlist → `403`.
|
|
469
|
+
- Invalid `Host` when host validation is enabled.
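The `Origin` cases above can be captured in a small decision function that the tests exercise as a table. The sketch below assumes a simple exact-match allowlist; `originRejection` and `OriginPolicy` are hypothetical names, and a real server may need more careful normalization.

```typescript
// Hypothetical policy shape; adapt to however your server is configured.
interface OriginPolicy {
  allowedOrigins: string[];
  allowMissingOrigin: boolean;
}

// Returns 403 when the request should be rejected, or null to let it through.
function originRejection(origin: string | undefined, policy: OriginPolicy): 403 | null {
  if (origin === undefined) {
    // Either choice is valid, but it must be applied consistently.
    return policy.allowMissingOrigin ? null : 403;
  }
  const normalized = origin.toLowerCase();
  const allowed = policy.allowedOrigins.some(
    (candidate) => candidate.toLowerCase() === normalized
  );
  return allowed ? null : 403;
}
```

Testing the pure function covers the policy logic cheaply; a couple of HTTP-level tests should still confirm that the server actually returns `403` for a disallowed `Origin`.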
|
|
680
470
|
|
|
681
|
-
|
|
682
|
-
class ScriptedLLM {
|
|
683
|
-
private cursor = 0;
|
|
684
|
-
constructor(private responses: LLMResponse[]) {}
|
|
471
|
+
### OAuth/Authorization (Resource Metadata Discovery)
|
|
685
472
|
|
|
686
|
-
|
|
687
|
-
if (this.cursor >= this.responses.length) {
|
|
688
|
-
// Default end response
|
|
689
|
-
return { content: [{ type: "text", text: "Done" }] };
|
|
690
|
-
}
|
|
691
|
-
return this.responses[this.cursor++];
|
|
692
|
-
}
|
|
693
|
-
}
|
|
473
|
+
If your HTTP transport supports OAuth-compliant authorization, minimally test:
|
|
694
474
|
|
|
695
|
-
|
|
696
|
-
|
|
697
|
-
|
|
698
|
-
|
|
699
|
-
const { client } = await createLinkedPair(buildContext({ gitlabStub: { listProjects } }));
|
|
700
|
-
|
|
701
|
-
const llm = new ScriptedLLM([
|
|
702
|
-
// Round 1: LLM decides to call a tool
|
|
703
|
-
{
|
|
704
|
-
content: [
|
|
705
|
-
{
|
|
706
|
-
type: "tool_use",
|
|
707
|
-
id: "call-1",
|
|
708
|
-
name: "gitlab_list_projects",
|
|
709
|
-
input: { search: "alpha" }
|
|
710
|
-
}
|
|
711
|
-
]
|
|
712
|
-
},
|
|
713
|
-
// Round 2: LLM sees tool result and gives final answer
|
|
714
|
-
{
|
|
715
|
-
content: [{ type: "text", text: "Found project alpha" }]
|
|
716
|
-
}
|
|
717
|
-
]);
|
|
475
|
+
- Unauthorized request returns `401` with expected auth challenge information.
|
|
476
|
+
- Optional `scope` parameter in the auth challenge.
|
|
477
|
+
- Resource metadata endpoint exists and includes expected authorization server metadata.
|
|
718
478
|
|
|
719
|
-
|
|
479
|
+
If you use a private bearer token model instead of full OAuth discovery, document and test that explicitly as a custom mode.
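For the OAuth-compliant case, a test usually needs to pull parameters out of the `WWW-Authenticate` header on the `401` response. The parser below is a minimal sketch: `parseBearerChallenge` is a hypothetical helper, it only handles quoted `key="value"` parameters, and the example URL is a placeholder.

```typescript
// Hypothetical helper: extracts quoted parameters from a Bearer challenge.
// Returns null when the header does not use the Bearer scheme.
function parseBearerChallenge(header: string): Record<string, string> | null {
  const match = /^\s*Bearer\b\s*(.*)$/i.exec(header);
  if (match === null) return null;
  const params: Record<string, string> = {};
  const pair = /([A-Za-z_]+)="([^"]*)"/g;
  let found: RegExpExecArray | null;
  while ((found = pair.exec(match[1])) !== null) {
    params[found[1]] = found[2];
  }
  return params;
}

const exampleChallenge = parseBearerChallenge(
  'Bearer resource_metadata="https://mcp.example.com/.well-known/oauth-protected-resource", scope="read_api"'
);
console.log(exampleChallenge);
```

A test can then assert that `resource_metadata` is present, fetch that URL, and check that the returned document lists the expected authorization server.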
|
|
720
480
|
|
|
721
|
-
|
|
722
|
-
expect(result).toContain("alpha");
|
|
723
|
-
});
|
|
724
|
-
```
|
|
481
|
+
### Error Handling and Secret Redaction
|
|
725
482
|
|
|
726
|
-
|
|
483
|
+
Differentiate:
|
|
727
484
|
|
|
728
|
-
-
|
|
729
|
-
-
|
|
730
|
-
- **Do not** assert the full text of natural language output — unstable
|
|
485
|
+
- **Tool-level error**: request reached tool; `result.isError === true`.
|
|
486
|
+
- **Protocol-level error**: request itself failed; client gets protocol/transport exception.
|
|
731
487
|
|
|
732
|
-
|
|
488
|
+
Must-test:
|
|
733
489
|
|
|
734
|
-
|
|
490
|
+
- Stable error shape mapping.
|
|
491
|
+
- No leakage of sensitive values (`Authorization`, `password`, `token`, cookies).
|
|
492
|
+
- Security mode vs debug mode output differences.
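To make the "no leakage" item concrete, the sketch below shows one possible key-based redactor and the kind of value a test would feed it. It is an assumption-laden illustration: `redact` and `SENSITIVE_KEYS` are hypothetical, and a real implementation would likely also scan string values for token patterns.

```typescript
// Hypothetical list of keys whose values must never appear in error output.
const SENSITIVE_KEYS = new Set(["authorization", "password", "token", "cookie", "set-cookie"]);

// Recursively replaces values stored under sensitive keys with a placeholder.
function redact(value: unknown): unknown {
  if (Array.isArray(value)) return value.map((item) => redact(item));
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [key, inner] of Object.entries(value as Record<string, unknown>)) {
      out[key] = SENSITIVE_KEYS.has(key.toLowerCase()) ? "[REDACTED]" : redact(inner);
    }
    return out;
  }
  return value; // primitives pass through unchanged
}

const exampleRedacted = redact({
  Authorization: "Bearer secret-val",
  details: { password: "hunter2", message: "safe value" }
});
console.log(JSON.stringify(exampleRedacted));
```

The matching test asserts on the serialized tool error text: the secret values must be absent, and non-sensitive fields such as `message` must survive.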
|
|
735
493
|
|
|
736
|
-
|
|
737
|
-
// Only run in CI with an API key
|
|
738
|
-
describe.skipIf(!process.env.ANTHROPIC_API_KEY)("LLM E2E Smoke", () => {
|
|
739
|
-
it("can discover and call the health_check tool", async () => {
|
|
740
|
-
const result = await runAgentLoop({
|
|
741
|
-
client,
|
|
742
|
-
llm: new AnthropicLLM(process.env.ANTHROPIC_API_KEY!),
|
|
743
|
-
query: "Check the server health"
|
|
744
|
-
});
|
|
494
|
+
### Policy Combinatorics: Read-Only, Allowlists, Feature Flags
|
|
745
495
|
|
|
746
|
-
|
|
747
|
-
expect(result.length).toBeGreaterThan(0);
|
|
748
|
-
}, 30_000); // Generous timeout
|
|
749
|
-
});
|
|
750
|
-
```
|
|
496
|
+
Policy regressions are common.
|
|
751
497
|
|
|
752
|
-
|
|
498
|
+
Recommended approach:
|
|
753
499
|
|
|
754
|
-
|
|
500
|
+
- Layer 1: contract tests on `listTools()` for key policy combinations.
|
|
501
|
+
- Layer 2: a small number of end-to-end HTTP tests combining session + policy.
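For the Layer 1 contract tests, the policy combinations can be generated rather than written by hand. The helper below is a sketch; `policyCombinations` is a hypothetical name and the flag names in the usage note are illustrative.

```typescript
type PolicyFlags = Record<string, boolean>;

// Builds every true/false combination of the given flag names (2^n entries).
function policyCombinations(flagNames: string[]): PolicyFlags[] {
  let combos: PolicyFlags[] = [{}];
  for (const name of flagNames) {
    const next: PolicyFlags[] = [];
    for (const combo of combos) {
      next.push({ ...combo, [name]: false }, { ...combo, [name]: true });
    }
    combos = next;
  }
  return combos;
}
```

For example, `policyCombinations(["readOnlyMode", "allowGraphqlWithProjectScope"])` yields four objects that can be passed to a parameterized test, each asserting which tool names `listTools()` should and should not contain. Keep the flag list short, since the number of cases doubles with every flag.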
|
|
755
502
|
|
|
756
|
-
|
|
503
|
+
---
|
|
757
504
|
|
|
758
|
-
|
|
505
|
+
## Layer 4: Conformance Testing (Strongly Recommended)
|
|
759
506
|
|
|
760
|
-
|
|
507
|
+
Conformance tests catch protocol drift during spec and SDK upgrades.
|
|
761
508
|
|
|
762
509
|
```bash
|
|
763
|
-
#
|
|
764
|
-
npx @modelcontextprotocol/
|
|
765
|
-
node dist/index.js \
|
|
766
|
-
--method tools/list
|
|
767
|
-
|
|
768
|
-
# Call a tool
|
|
769
|
-
npx @modelcontextprotocol/inspector --cli \
|
|
770
|
-
node dist/index.js \
|
|
771
|
-
--method tools/call \
|
|
772
|
-
--tool-name health_check
|
|
773
|
-
```
|
|
510
|
+
# Run server conformance scenarios against a running server
|
|
511
|
+
npx @modelcontextprotocol/conformance server --url http://localhost:3000/mcp
|
|
774
512
|
|
|
775
|
-
|
|
513
|
+
# Run one scenario only
|
|
514
|
+
npx @modelcontextprotocol/conformance server --url http://localhost:3000/mcp --scenario server-initialize
|
|
776
515
|
|
|
777
|
-
|
|
778
|
-
|
|
779
|
-
npx @modelcontextprotocol/inspector --cli \
|
|
780
|
-
https://my-mcp-server.example.com \
|
|
781
|
-
--transport http \
|
|
782
|
-
--method tools/list \
|
|
783
|
-
--header "Authorization: Bearer $TOKEN"
|
|
516
|
+
# List all scenarios
|
|
517
|
+
npx @modelcontextprotocol/conformance list
|
|
784
518
|
```
|
|
785
519
|
|
|
786
|
-
|
|
787
|
-
|
|
788
|
-
```typescript
|
|
789
|
-
import { execa } from "execa";
|
|
790
|
-
|
|
791
|
-
test("Inspector CLI: tool list contains health_check", async () => {
|
|
792
|
-
const { stdout } = await execa("npx", [
|
|
793
|
-
"-y",
|
|
794
|
-
"@modelcontextprotocol/inspector",
|
|
795
|
-
"--cli",
|
|
796
|
-
"node",
|
|
797
|
-
"dist/index.js",
|
|
798
|
-
"--method",
|
|
799
|
-
"tools/list"
|
|
800
|
-
]);
|
|
801
|
-
|
|
802
|
-
const res = JSON.parse(stdout);
|
|
803
|
-
const names = res.tools.map((t: { name: string }) => t.name);
|
|
804
|
-
expect(names).toContain("health_check");
|
|
805
|
-
});
|
|
806
|
-
```
|
|
520
|
+
Recommended usage:
|
|
807
521
|
|
|
808
|
-
|
|
522
|
+
- Nightly: full conformance suite.
|
|
523
|
+
- Pre-release: blocking gate.
|
|
809
524
|
|
|
810
525
|
---
|
|
811
526
|
|
|
812
|
-
##
|
|
813
|
-
|
|
814
|
-
### Recommended Layered Strategy
|
|
815
|
-
|
|
816
|
-
| Trigger | Test Type | Tools | Duration |
|
|
817
|
-
| ----------------- | -------------------------------- | -------------------- | -------- |
|
|
818
|
-
| Every commit / PR | InMemoryTransport protocol tests | Vitest + SDK | < 5s |
|
|
819
|
-
| Every commit / PR | HTTP/SSE transport layer tests | Vitest + real server | < 10s |
|
|
820
|
-
| Every commit / PR | Security/policy/error handling | Vitest + SDK | < 5s |
|
|
821
|
-
| Every PR | Inspector CLI contract test | Inspector --cli | < 15s |
|
|
822
|
-
| Nightly | Agent Loop (ScriptedLLM) | Vitest + SDK | < 30s |
|
|
823
|
-
| Nightly | LLM E2E smoke test | Vitest + real LLM | < 60s |
|
|
824
|
-
| Pre-release | Containerized full-stack test | Docker + Inspector | < 5min |
|
|
825
|
-
|
|
826
|
-
### package.json Script Organization
|
|
827
|
-
|
|
828
|
-
The following script layout is one practical example. Rename or regroup scripts based on your repository structure and CI strategy.
|
|
829
|
-
|
|
830
|
-
```json
|
|
831
|
-
{
|
|
832
|
-
"scripts": {
|
|
833
|
-
"test": "vitest run",
|
|
834
|
-
"test:unit": "vitest run tests/unit",
|
|
835
|
-
"test:integration": "vitest run tests/integration",
|
|
836
|
-
"test:e2e": "vitest run tests/e2e",
|
|
837
|
-
"test:smoke": "vitest run tests/smoke --timeout=60000",
|
|
838
|
-
"typecheck": "tsc --noEmit",
|
|
839
|
-
"lint": "eslint ."
|
|
840
|
-
}
|
|
841
|
-
}
|
|
842
|
-
```
|
|
527
|
+
## Layer 5: Agent Loop Integration Tests (ScriptedLLM + Small Real-LLM Smoke)
|
|
843
528
|
|
|
844
|
-
|
|
529
|
+
Core principle: assert deterministic signals (tool sequence and arguments), not the exact natural-language text.
|
|
845
530
|
|
|
846
|
-
|
|
847
|
-
# .github/workflows/test.yml or equivalent .gitlab-ci.yml
|
|
848
|
-
test:
|
|
849
|
-
steps:
|
|
850
|
-
- run: pnpm typecheck # Type check first
|
|
851
|
-
- run: pnpm lint # Then lint
|
|
852
|
-
- run: pnpm test # Finally run all tests
|
|
853
|
-
```
|
|
531
|
+
### ScriptedLLM mode (recommended)
|
|
854
532
|
|
|
855
|
-
|
|
533
|
+
- Drive agents with pre-scripted LLM responses.
|
|
534
|
+
- Assert called tools, arguments, and key entities in final output.
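A scripted agent loop could be sketched as follows. The types and `runScriptedLoop` are hypothetical, and `callTool` is a synchronous stand-in for illustration only; with a real MCP client the call would be asynchronous and its result would be fed back to the model.

```typescript
type ToolUse = { type: "tool_use"; id: string; name: string; input: Record<string, unknown> };
type TextBlock = { type: "text"; text: string };
type LLMResponse = { content: Array<ToolUse | TextBlock> };
type ToolCall = { name: string; input: Record<string, unknown> };

// Replays pre-scripted responses in order, then falls back to a final "Done".
class ScriptedLLM {
  private cursor = 0;
  private readonly responses: LLMResponse[];
  constructor(responses: LLMResponse[]) {
    this.responses = responses;
  }
  next(): LLMResponse {
    if (this.cursor >= this.responses.length) {
      return { content: [{ type: "text", text: "Done" }] };
    }
    return this.responses[this.cursor++];
  }
}

// Runs the loop until the scripted LLM stops requesting tools.
function runScriptedLoop(
  llm: ScriptedLLM,
  callTool: (call: ToolCall) => string,
  maxRounds = 5
): { calls: ToolCall[]; finalText: string } {
  const calls: ToolCall[] = [];
  for (let round = 0; round < maxRounds; round++) {
    const response = llm.next();
    const toolUses = response.content.filter((b): b is ToolUse => b.type === "tool_use");
    if (toolUses.length === 0) {
      const finalText = response.content
        .filter((b): b is TextBlock => b.type === "text")
        .map((b) => b.text)
        .join("");
      return { calls, finalText };
    }
    for (const use of toolUses) {
      const call: ToolCall = { name: use.name, input: use.input };
      calls.push(call); // record for later assertions
      callTool(call); // a real loop would pass this result back to the LLM
    }
  }
  return { calls, finalText: "" };
}
```

Tests then assert on `calls` (tool names and arguments, in order) and only check that `finalText` mentions the key entity, rather than matching the whole sentence.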
|
|
856
535
|
|
|
857
|
-
|
|
536
|
+
### Real-LLM smoke (small)
|
|
858
537
|
|
|
859
|
-
|
|
538
|
+
- Keep to 1–3 scenarios.
|
|
539
|
+
- Use weak assertions (non-empty output, successful basic tool call).
|
|
540
|
+
- Run nightly or on-demand only.
|
|
860
541
|
|
|
861
|
-
|
|
542
|
+
---
|
|
862
543
|
|
|
863
|
-
|
|
544
|
+
## Layer 6: Inspector CLI Black-Box Contract Tests (Pre/Post Deployment)
|
|
864
545
|
|
|
865
|
-
|
|
546
|
+
Inspector can serve as black-box validation from the user's perspective.
|
|
866
547
|
|
|
867
|
-
|
|
868
|
-
|
|
869
|
-
await session.transport.handlePostMessage(req, res);
|
|
548
|
+
- Local build artifact: `node dist/index.js` (stdio).
|
|
549
|
+
- Remote deployment: `https://your-domain/mcp` (Streamable HTTP).
|
|
870
550
|
|
|
871
|
-
|
|
872
|
-
|
|
551
|
+
```bash
|
|
552
|
+
npx @modelcontextprotocol/inspector --cli node dist/index.js --method tools/list
|
|
873
553
|
```
|
|
874
554
|
|
|
875
|
-
|
|
555
|
+
---
|
|
876
556
|
|
|
877
|
-
|
|
557
|
+
## CI/CD Layered Execution Strategy
|
|
878
558
|
|
|
879
|
-
|
|
559
|
+
| Trigger | Test Layers | Goal |
|
|
560
|
+
| ----------------- | ------------------------------------------- | --------------------------------------------------------------- |
|
|
561
|
+
| Every commit / PR | Layer 1 (InMemory) | P0 contract: tools/handlers/schema/policy |
|
|
562
|
+
| Every PR | Layer 2 (HTTP) | P0/P1: session, 404 reinitialize, protocol headers, GET SSE/405 |
|
|
563
|
+
| Nightly | Layer 4 (Conformance) | Spec compliance and upgrade early warning |
|
|
564
|
+
| Nightly | Layer 5 (Agent loop + small real LLM smoke) | Real usage path smoke checks |
|
|
565
|
+
| Pre-release | Layer 6 (Inspector + deployment black-box) | Release acceptance gate |
|
|
880
566
|
|
|
881
|
-
|
|
882
|
-
const { client, clientTransport, serverTransport } = await createLinkedPair(context);
|
|
883
|
-
try {
|
|
884
|
-
// Test logic
|
|
885
|
-
} finally {
|
|
886
|
-
await clientTransport.close();
|
|
887
|
-
await serverTransport.close();
|
|
888
|
-
}
|
|
889
|
-
```
|
|
567
|
+
---
|
|
890
568
|
|
|
891
|
-
|
|
569
|
+
## Common Pitfalls and Fixes (Updated for 2025-11-25)
|
|
892
570
|
|
|
893
|
-
|
|
571
|
+
1. Treating legacy HTTP+SSE as the current protocol model.
|
|
894
572
|
|
|
895
|
-
|
|
573
|
+
- Fix: use Streamable HTTP as primary; isolate legacy compatibility.
|
|
896
574
|
|
|
897
|
-
|
|
575
|
+
2. Sending JSON-RPC batch arrays in POST bodies.
|
|
898
576
|
|
|
899
|
-
|
|
577
|
+
- Fix: enforce single-message payloads where required by your server/profile.
|
|
900
578
|
|
|
901
|
-
|
|
579
|
+
3. Forgetting `MCP-Protocol-Version` / `MCP-Session-Id` in post-init requests.
|
|
902
580
|
|
|
903
|
-
|
|
904
|
-
const deadline = Date.now() + 2000;
|
|
905
|
-
while (sessions.size > 0 && Date.now() < deadline) {
|
|
906
|
-
await new Promise((r) => setTimeout(r, 50));
|
|
907
|
-
}
|
|
908
|
-
```
|
|
581
|
+
- Fix: test valid, missing, and invalid header combinations.
|
|
909
582
|
|
|
910
|
-
|
|
583
|
+
4. Missing Origin checks (DNS rebinding risk).
|
|
911
584
|
|
|
912
|
-
|
|
585
|
+
- Fix: add origin/host allowlist tests and default-safe binding strategy.
|
|
913
586
|
|
|
914
|
-
|
|
587
|
+
5. Incorrect body handling in HTTP middleware.
|
|
915
588
|
|
|
916
|
-
|
|
917
|
-
const normalized = query
|
|
918
|
-
.replace(/#[^\n]*/g, " ") // Remove line comments
|
|
919
|
-
.replace(/"""[\s\S]*?"""/g, " ") // Remove block strings
|
|
920
|
-
.replace(/"(?:\\.|[^"\\])*"/g, " "); // Remove double-quoted strings
|
|
921
|
-
```
|
|
589
|
+
- Fix: ensure transport receives request body in the expected form.
|
|
922
590
|
|
|
923
|
-
|
|
591
|
+
6. Misclassifying SSE disconnect as request cancellation.
|
|
924
592
|
|
|
925
|
-
|
|
593
|
+
- Fix: model cancellation explicitly and test cancellation semantics directly.
|
|
926
594
|
|
|
927
|
-
|
|
595
|
+
7. Multi-replica session stickiness issues.
|
|
928
596
|
|
|
929
|
-
|
|
930
|
-
|
|
931
|
-
```typescript
|
|
932
|
-
(ctx.env as { HTTP_JSON_ONLY: boolean }).HTTP_JSON_ONLY = true;
|
|
933
|
-
```
|
|
597
|
+
- Fix: sticky sessions or external session store for stateful transport behavior.
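As an illustration of pitfall 2, a server or test helper can reject batch arrays before they reach the transport. The sketch below is hypothetical: `checkSingleMessageBody` is not an SDK function, and using `400` for every rejection is an assumption to verify against your server's behavior.

```typescript
type BodyCheck = { ok: true } | { ok: false; status: 400; reason: string };

// Accepts exactly one JSON-RPC 2.0 message object and rejects everything else.
function checkSingleMessageBody(rawBody: string): BodyCheck {
  let parsed: unknown;
  try {
    parsed = JSON.parse(rawBody);
  } catch {
    return { ok: false, status: 400, reason: "invalid JSON" };
  }
  if (Array.isArray(parsed)) {
    return { ok: false, status: 400, reason: "batch arrays are not accepted" };
  }
  if (parsed === null || typeof parsed !== "object") {
    return { ok: false, status: 400, reason: "not a JSON-RPC object" };
  }
  if ((parsed as { jsonrpc?: unknown }).jsonrpc !== "2.0") {
    return { ok: false, status: 400, reason: 'missing jsonrpc: "2.0"' };
  }
  return { ok: true };
}
```

The corresponding HTTP test posts a two-element array to `/mcp` and asserts that the server does not process either message.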
|
|
934
598
|
|
|
935
599
|
---
|
|
936
600
|
|
|
937
|
-
##
|
|
938
|
-
|
|
939
|
-
|
|
940
|
-
|
|
941
|
-
|
|
|
942
|
-
|
|
|
943
|
-
|
|
|
944
|
-
|
|
|
945
|
-
|
|
|
946
|
-
|
|
|
947
|
-
|
|
|
948
|
-
|
|
|
949
|
-
|
|
|
950
|
-
|
|
|
951
|
-
|
|
|
952
|
-
|
|
|
953
|
-
|
|
|
954
|
-
|
|
|
955
|
-
|
|
|
956
|
-
| Safe mode vs full mode | InMemoryTransport | P1 |
|
|
957
|
-
| Read-only mode tool filtering | InMemoryTransport | P0 |
|
|
958
|
-
| Feature flags (wiki / pipeline / release) | InMemoryTransport | P1 |
|
|
959
|
-
| Tool allowlist / blocklist | InMemoryTransport | P1 |
|
|
960
|
-
| GraphQL mutation detection and policy | InMemoryTransport | P1 |
|
|
961
|
-
| Agent Loop (ScriptedLLM) | InMemoryTransport | P2 |
|
|
962
|
-
| Response truncation (maxBytes) | InMemoryTransport | P2 |
|
|
963
|
-
| Health check endpoint | Real HTTP server | P2 |
|
|
964
|
-
| Inspector CLI contract test | Inspector --cli | P2 |
|
|
965
|
-
| Real LLM E2E | Real LLM API | P3 |
|
|
601
|
+
## Recommended Test Matrix (Copy/Paste)
|
|
602
|
+
|
|
603
|
+
| Dimension | Method | Priority |
|
|
604
|
+
| ----------------------------------------------------- | --------------- | -------- |
|
|
605
|
+
| initialize / lifecycle | InMemory + HTTP | P0 |
|
|
606
|
+
| tools/resources/prompts list contracts | InMemory | P0 |
|
|
607
|
+
| tool handler correctness with stubs | InMemory | P0 |
|
|
608
|
+
| schema validation (bad input) | InMemory | P0 |
|
|
609
|
+
| Streamable HTTP accept/headers/status | HTTP | P0 |
|
|
610
|
+
| `MCP-Session-Id`: create/reuse/terminate/reinitialize | HTTP | P0 |
|
|
611
|
+
| `MCP-Protocol-Version`: missing/invalid/valid | HTTP | P0 |
|
|
612
|
+
| GET behavior: SSE or 405 | HTTP | P1 |
|
|
613
|
+
| Origin/Host validation | HTTP | P0/P1 |
|
|
614
|
+
| OAuth challenge + resource metadata | HTTP | P1 |
|
|
615
|
+
| SSE reconnect (`retry` / `Last-Event-ID`) | HTTP | P2 |
|
|
616
|
+
| conformance suite | CLI | P1 |
|
|
617
|
+
| inspector black-box on artifacts | CLI | P2 |
|
|
618
|
+
| agent loop (ScriptedLLM) | InMemory | P2 |
|
|
619
|
+
| real LLM smoke | external API | P3 |
|
|
966
620
|
|
|
967
621
|
---
|
|
968
622
|
|
|
969
|
-
## References
|
|
970
|
-
|
|
971
|
-
-
|
|
972
|
-
-
|
|
973
|
-
-
|
|
974
|
-
-
|
|
975
|
-
-
|
|
976
|
-
-
|
|
977
|
-
|
|
978
|
-
|
|
979
|
-
- [MCP Server Testing Tools Overview (Testomat.io)](https://testomat.io/blog/mcp-server-testing-tools/)
|
|
980
|
-
- [MCP Server Best Practices (MarkTechPost)](https://www.marktechpost.com/2025/07/23/7-mcp-server-best-practices-for-scalable-ai-integrations-in-2025/)
|
|
981
|
-
- [MCP Official Node.js Client Tutorial](https://modelcontextprotocol.io/tutorials/building-a-client)
|
|
623
|
+
## References and Compatibility Notes
|
|
624
|
+
|
|
625
|
+
- MCP Specification 2025-11-25: transports, sessions, protocol version behavior.
|
|
626
|
+
- MCP Specification 2025-11-25: authorization and Protected Resource Metadata.
|
|
627
|
+
- MCP Conformance tooling.
|
|
628
|
+
- MCP Inspector documentation.
|
|
629
|
+
- TypeScript SDK migration notes (v1 to v2).
|
|
630
|
+
- TypeScript SDK server/client guides for Streamable HTTP behavior.
|
|
631
|
+
|
|
632
|
+
---
|