retestkit 1.4.1
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/.claude/commands/openspec/apply.md +23 -0
- package/.claude/commands/openspec/archive.md +27 -0
- package/.claude/commands/openspec/proposal.md +28 -0
- package/.gemini/commands/openspec/apply.toml +21 -0
- package/.gemini/commands/openspec/archive.toml +25 -0
- package/.gemini/commands/openspec/proposal.toml +26 -0
- package/.github/prompts/openspec-apply.prompt.md +22 -0
- package/.github/prompts/openspec-archive.prompt.md +26 -0
- package/.github/prompts/openspec-proposal.prompt.md +27 -0
- package/.github/workflows/release.yml +33 -0
- package/.kilocode/workflows/openspec-apply.md +17 -0
- package/.kilocode/workflows/openspec-archive.md +21 -0
- package/.kilocode/workflows/openspec-proposal.md +22 -0
- package/.mcp.json +23 -0
- package/.opencode/command/openspec-apply.md +25 -0
- package/.opencode/command/openspec-archive.md +28 -0
- package/.opencode/command/openspec-proposal.md +30 -0
- package/.roo/commands/openspec-apply.md +20 -0
- package/.roo/commands/openspec-archive.md +24 -0
- package/.roo/commands/openspec-proposal.md +25 -0
- package/.vscode/mcp.json +23 -0
- package/AGENTS.md +18 -0
- package/CLAUDE.md +18 -0
- package/LICENSE +65 -0
- package/README.md +303 -0
- package/dist/config.d.ts +4 -0
- package/dist/config.d.ts.map +1 -0
- package/dist/config.js +27 -0
- package/dist/config.js.map +1 -0
- package/dist/elicitation/index.d.ts +17 -0
- package/dist/elicitation/index.d.ts.map +1 -0
- package/dist/elicitation/index.js +118 -0
- package/dist/elicitation/index.js.map +1 -0
- package/dist/elicitation/types.d.ts +35 -0
- package/dist/elicitation/types.d.ts.map +1 -0
- package/dist/elicitation/types.js +39 -0
- package/dist/elicitation/types.js.map +1 -0
- package/dist/index.d.ts +3 -0
- package/dist/index.d.ts.map +1 -0
- package/dist/index.js +76 -0
- package/dist/index.js.map +1 -0
- package/dist/lifecycle/index.d.ts +31 -0
- package/dist/lifecycle/index.d.ts.map +1 -0
- package/dist/lifecycle/index.js +61 -0
- package/dist/lifecycle/index.js.map +1 -0
- package/dist/logger.d.ts +21 -0
- package/dist/logger.d.ts.map +1 -0
- package/dist/logger.js +182 -0
- package/dist/logger.js.map +1 -0
- package/dist/playwright-client/index.d.ts +29 -0
- package/dist/playwright-client/index.d.ts.map +1 -0
- package/dist/playwright-client/index.js +288 -0
- package/dist/playwright-client/index.js.map +1 -0
- package/dist/playwright-client/types.d.ts +44 -0
- package/dist/playwright-client/types.d.ts.map +1 -0
- package/dist/playwright-client/types.js +49 -0
- package/dist/playwright-client/types.js.map +1 -0
- package/dist/progress/index.d.ts +39 -0
- package/dist/progress/index.d.ts.map +1 -0
- package/dist/progress/index.js +106 -0
- package/dist/progress/index.js.map +1 -0
- package/dist/progress/types.d.ts +24 -0
- package/dist/progress/types.d.ts.map +1 -0
- package/dist/progress/types.js +2 -0
- package/dist/progress/types.js.map +1 -0
- package/dist/prompts/index.d.ts +19 -0
- package/dist/prompts/index.d.ts.map +1 -0
- package/dist/prompts/index.js +207 -0
- package/dist/prompts/index.js.map +1 -0
- package/dist/prompts/loader.d.ts +20 -0
- package/dist/prompts/loader.d.ts.map +1 -0
- package/dist/prompts/loader.js +47 -0
- package/dist/prompts/loader.js.map +1 -0
- package/dist/resources/index.d.ts +27 -0
- package/dist/resources/index.d.ts.map +1 -0
- package/dist/resources/index.js +186 -0
- package/dist/resources/index.js.map +1 -0
- package/dist/resources/subscriptions.d.ts +10 -0
- package/dist/resources/subscriptions.d.ts.map +1 -0
- package/dist/resources/subscriptions.js +23 -0
- package/dist/resources/subscriptions.js.map +1 -0
- package/dist/sampling/index.d.ts +11 -0
- package/dist/sampling/index.d.ts.map +1 -0
- package/dist/sampling/index.js +201 -0
- package/dist/sampling/index.js.map +1 -0
- package/dist/sampling/prompts.d.ts +56 -0
- package/dist/sampling/prompts.d.ts.map +1 -0
- package/dist/sampling/prompts.js +124 -0
- package/dist/sampling/prompts.js.map +1 -0
- package/dist/sampling/types.d.ts +57 -0
- package/dist/sampling/types.d.ts.map +1 -0
- package/dist/sampling/types.js +2 -0
- package/dist/sampling/types.js.map +1 -0
- package/dist/schemas/config.d.ts +40 -0
- package/dist/schemas/config.d.ts.map +1 -0
- package/dist/schemas/config.js +30 -0
- package/dist/schemas/config.js.map +1 -0
- package/dist/security/index.d.ts +38 -0
- package/dist/security/index.d.ts.map +1 -0
- package/dist/security/index.js +281 -0
- package/dist/security/index.js.map +1 -0
- package/dist/server.d.ts +9 -0
- package/dist/server.d.ts.map +1 -0
- package/dist/server.js +142 -0
- package/dist/server.js.map +1 -0
- package/dist/test-utils/index.d.ts +6 -0
- package/dist/test-utils/index.d.ts.map +1 -0
- package/dist/test-utils/index.js +6 -0
- package/dist/test-utils/index.js.map +1 -0
- package/dist/test-utils/mock-context.d.ts +64 -0
- package/dist/test-utils/mock-context.d.ts.map +1 -0
- package/dist/test-utils/mock-context.js +347 -0
- package/dist/test-utils/mock-context.js.map +1 -0
- package/dist/test-utils/mock-playwright-client.d.ts +62 -0
- package/dist/test-utils/mock-playwright-client.d.ts.map +1 -0
- package/dist/test-utils/mock-playwright-client.js +315 -0
- package/dist/test-utils/mock-playwright-client.js.map +1 -0
- package/dist/tools/index.d.ts +4 -0
- package/dist/tools/index.d.ts.map +1 -0
- package/dist/tools/index.js +8 -0
- package/dist/tools/index.js.map +1 -0
- package/dist/tools/webtest/crawl.d.ts +46 -0
- package/dist/tools/webtest/crawl.d.ts.map +1 -0
- package/dist/tools/webtest/crawl.js +678 -0
- package/dist/tools/webtest/crawl.js.map +1 -0
- package/dist/tools/webtest/discover-features.d.ts +30 -0
- package/dist/tools/webtest/discover-features.d.ts.map +1 -0
- package/dist/tools/webtest/discover-features.js +343 -0
- package/dist/tools/webtest/discover-features.js.map +1 -0
- package/dist/tools/webtest/discover-flows.d.ts +29 -0
- package/dist/tools/webtest/discover-flows.d.ts.map +1 -0
- package/dist/tools/webtest/discover-flows.js +341 -0
- package/dist/tools/webtest/discover-flows.js.map +1 -0
- package/dist/tools/webtest/generate-tests.d.ts +54 -0
- package/dist/tools/webtest/generate-tests.d.ts.map +1 -0
- package/dist/tools/webtest/generate-tests.js +364 -0
- package/dist/tools/webtest/generate-tests.js.map +1 -0
- package/dist/tools/webtest/index.d.ts +8 -0
- package/dist/tools/webtest/index.d.ts.map +1 -0
- package/dist/tools/webtest/index.js +8 -0
- package/dist/tools/webtest/index.js.map +1 -0
- package/dist/tools/webtest/run-test-case.d.ts +28 -0
- package/dist/tools/webtest/run-test-case.d.ts.map +1 -0
- package/dist/tools/webtest/run-test-case.js +420 -0
- package/dist/tools/webtest/run-test-case.js.map +1 -0
- package/dist/tools/webtest/schemas.d.ts +175 -0
- package/dist/tools/webtest/schemas.d.ts.map +1 -0
- package/dist/tools/webtest/schemas.js +156 -0
- package/dist/tools/webtest/schemas.js.map +1 -0
- package/dist/tools/webtest/start-analysis.d.ts +16 -0
- package/dist/tools/webtest/start-analysis.d.ts.map +1 -0
- package/dist/tools/webtest/start-analysis.js +137 -0
- package/dist/tools/webtest/start-analysis.js.map +1 -0
- package/dist/transports/http.d.ts +8 -0
- package/dist/transports/http.d.ts.map +1 -0
- package/dist/transports/http.js +9 -0
- package/dist/transports/http.js.map +1 -0
- package/dist/transports/index.d.ts +14 -0
- package/dist/transports/index.d.ts.map +1 -0
- package/dist/transports/index.js +20 -0
- package/dist/transports/index.js.map +1 -0
- package/dist/transports/stdio.d.ts +4 -0
- package/dist/transports/stdio.d.ts.map +1 -0
- package/dist/transports/stdio.js +6 -0
- package/dist/transports/stdio.js.map +1 -0
- package/dist/types/capabilities.d.ts +18 -0
- package/dist/types/capabilities.d.ts.map +1 -0
- package/dist/types/capabilities.js +35 -0
- package/dist/types/capabilities.js.map +1 -0
- package/dist/types/context.d.ts +20 -0
- package/dist/types/context.d.ts.map +1 -0
- package/dist/types/context.js +2 -0
- package/dist/types/context.js.map +1 -0
- package/dist/types/tool.d.ts +10 -0
- package/dist/types/tool.d.ts.map +1 -0
- package/dist/types/tool.js +2 -0
- package/dist/types/tool.js.map +1 -0
- package/dist/workspace/index.d.ts +99 -0
- package/dist/workspace/index.d.ts.map +1 -0
- package/dist/workspace/index.js +648 -0
- package/dist/workspace/index.js.map +1 -0
- package/dist/workspace/markdown.d.ts +50 -0
- package/dist/workspace/markdown.d.ts.map +1 -0
- package/dist/workspace/markdown.js +210 -0
- package/dist/workspace/markdown.js.map +1 -0
- package/dist/workspace/types.d.ts +173 -0
- package/dist/workspace/types.d.ts.map +1 -0
- package/dist/workspace/types.js +2 -0
- package/dist/workspace/types.js.map +1 -0
- package/openspec/AGENTS.md +456 -0
- package/openspec/changes/archive/2025-12-18-add-hybrid-artifact-paths/proposal.md +33 -0
- package/openspec/changes/archive/2025-12-18-add-hybrid-artifact-paths/specs/webtest-resources/spec.md +27 -0
- package/openspec/changes/archive/2025-12-18-add-hybrid-artifact-paths/specs/webtest-tools/spec.md +304 -0
- package/openspec/changes/archive/2025-12-18-add-hybrid-artifact-paths/tasks.md +43 -0
- package/openspec/changes/archive/2025-12-18-add-mcp-server-foundation/design.md +209 -0
- package/openspec/changes/archive/2025-12-18-add-mcp-server-foundation/proposal.md +41 -0
- package/openspec/changes/archive/2025-12-18-add-mcp-server-foundation/specs/mcp-server-core/spec.md +183 -0
- package/openspec/changes/archive/2025-12-18-add-mcp-server-foundation/tasks.md +112 -0
- package/openspec/changes/archive/2025-12-18-add-webtest-orchestrator/design.md +333 -0
- package/openspec/changes/archive/2025-12-18-add-webtest-orchestrator/proposal.md +66 -0
- package/openspec/changes/archive/2025-12-18-add-webtest-orchestrator/specs/mcp-server-core/spec.md +129 -0
- package/openspec/changes/archive/2025-12-18-add-webtest-orchestrator/specs/webtest-lifecycle/spec.md +138 -0
- package/openspec/changes/archive/2025-12-18-add-webtest-orchestrator/specs/webtest-logging/spec.md +211 -0
- package/openspec/changes/archive/2025-12-18-add-webtest-orchestrator/specs/webtest-prompts/spec.md +157 -0
- package/openspec/changes/archive/2025-12-18-add-webtest-orchestrator/specs/webtest-resources/spec.md +213 -0
- package/openspec/changes/archive/2025-12-18-add-webtest-orchestrator/specs/webtest-sampling/spec.md +257 -0
- package/openspec/changes/archive/2025-12-18-add-webtest-orchestrator/specs/webtest-tools/spec.md +501 -0
- package/openspec/changes/archive/2025-12-18-add-webtest-orchestrator/tasks.md +264 -0
- package/openspec/changes/archive/2025-12-18-allow-analysis-of-incomplete-crawls/proposal.md +24 -0
- package/openspec/changes/archive/2025-12-18-allow-analysis-of-incomplete-crawls/specs/webtest-tools/spec.md +80 -0
- package/openspec/changes/archive/2025-12-18-allow-analysis-of-incomplete-crawls/tasks.md +8 -0
- package/openspec/changes/archive/2025-12-18-fix-crawl-loop-stability/design.md +90 -0
- package/openspec/changes/archive/2025-12-18-fix-crawl-loop-stability/proposal.md +28 -0
- package/openspec/changes/archive/2025-12-18-fix-crawl-loop-stability/specs/webtest-sampling/spec.md +90 -0
- package/openspec/changes/archive/2025-12-18-fix-crawl-loop-stability/tasks.md +33 -0
- package/openspec/changes/archive/2025-12-18-use-markdown-artifacts/design.md +558 -0
- package/openspec/changes/archive/2025-12-18-use-markdown-artifacts/proposal.md +119 -0
- package/openspec/changes/archive/2025-12-18-use-markdown-artifacts/specs/webtest-resources/spec.md +109 -0
- package/openspec/changes/archive/2025-12-18-use-markdown-artifacts/specs/webtest-tools/spec.md +121 -0
- package/openspec/changes/archive/2025-12-18-use-markdown-artifacts/tasks.md +133 -0
- package/openspec/changes/extract-prompts-to-markdown/design.md +86 -0
- package/openspec/changes/extract-prompts-to-markdown/proposal.md +50 -0
- package/openspec/changes/extract-prompts-to-markdown/specs/webtest-prompts/spec.md +74 -0
- package/openspec/changes/extract-prompts-to-markdown/tasks.md +40 -0
- package/openspec/changes/refactor-webtest-naming/design.md +95 -0
- package/openspec/changes/refactor-webtest-naming/proposal.md +66 -0
- package/openspec/changes/refactor-webtest-naming/specs/webtest-prompts/spec.md +79 -0
- package/openspec/changes/refactor-webtest-naming/specs/webtest-resources/spec.md +80 -0
- package/openspec/changes/refactor-webtest-naming/specs/webtest-sampling/spec.md +122 -0
- package/openspec/changes/refactor-webtest-naming/specs/webtest-tools/spec.md +113 -0
- package/openspec/changes/refactor-webtest-naming/tasks.md +119 -0
- package/openspec/changes/rename-package-to-retest/proposal.md +52 -0
- package/openspec/changes/rename-package-to-retest/specs/mcp-server-core/spec.md +53 -0
- package/openspec/changes/rename-package-to-retest/specs/retest-lifecycle/spec.md +68 -0
- package/openspec/changes/rename-package-to-retest/specs/retest-logging/spec.md +35 -0
- package/openspec/changes/rename-package-to-retest/specs/retest-prompts/spec.md +159 -0
- package/openspec/changes/rename-package-to-retest/specs/retest-resources/spec.md +251 -0
- package/openspec/changes/rename-package-to-retest/specs/retest-sampling/spec.md +99 -0
- package/openspec/changes/rename-package-to-retest/specs/retest-tools/spec.md +295 -0
- package/openspec/changes/rename-package-to-retest/tasks.md +71 -0
- package/openspec/project.md +31 -0
- package/openspec/specs/mcp-server-core/spec.md +178 -0
- package/openspec/specs/webtest-lifecycle/spec.md +136 -0
- package/openspec/specs/webtest-logging/spec.md +209 -0
- package/openspec/specs/webtest-prompts/spec.md +155 -0
- package/openspec/specs/webtest-resources/spec.md +248 -0
- package/openspec/specs/webtest-sampling/spec.md +344 -0
- package/openspec/specs/webtest-tools/spec.md +282 -0
- package/package.json +54 -0
- package/release.config.js +9 -0
- package/src/config.test.ts +96 -0
- package/src/config.ts +32 -0
- package/src/elicitation/index.test.ts +399 -0
- package/src/elicitation/index.ts +171 -0
- package/src/elicitation/types.ts +68 -0
- package/src/index.ts +83 -0
- package/src/lifecycle/index.test.ts +260 -0
- package/src/lifecycle/index.ts +101 -0
- package/src/logger.redaction.test.ts +322 -0
- package/src/logger.test.ts +123 -0
- package/src/logger.ts +229 -0
- package/src/playwright-client/index.ts +392 -0
- package/src/playwright-client/types.ts +99 -0
- package/src/progress/index.test.ts +327 -0
- package/src/progress/index.ts +170 -0
- package/src/progress/types.ts +25 -0
- package/src/prompts/index.test.ts +451 -0
- package/src/prompts/index.ts +246 -0
- package/src/prompts/loader.test.ts +100 -0
- package/src/prompts/loader.ts +59 -0
- package/src/prompts/templates/mcp/webtest-crawl.md +7 -0
- package/src/prompts/templates/mcp/webtest-discover-flows.md +11 -0
- package/src/prompts/templates/mcp/webtest-discover.md +12 -0
- package/src/prompts/templates/mcp/webtest-full-workflow.md +12 -0
- package/src/prompts/templates/mcp/webtest-generate-tests.md +11 -0
- package/src/prompts/templates/mcp/webtest-run-test.md +11 -0
- package/src/prompts/templates/mcp/webtest-start.md +8 -0
- package/src/prompts/templates/sampling/crawl-action.md +35 -0
- package/src/prompts/templates/sampling/feature-discovery.md +27 -0
- package/src/prompts/templates/sampling/flow-discovery.md +29 -0
- package/src/prompts/templates/sampling/page-content-wrapper.md +5 -0
- package/src/prompts/templates/sampling/system-prefix.md +12 -0
- package/src/prompts/templates/sampling/test-evaluation.md +17 -0
- package/src/prompts/templates/sampling/test-generation.md +31 -0
- package/src/resources/index.ts +250 -0
- package/src/resources/subscriptions.ts +37 -0
- package/src/sampling/index.test.ts +414 -0
- package/src/sampling/index.ts +286 -0
- package/src/sampling/prompts.ts +194 -0
- package/src/sampling/types.ts +60 -0
- package/src/schemas/config.ts +39 -0
- package/src/security/index.test.ts +441 -0
- package/src/security/index.ts +361 -0
- package/src/security/security-scenarios.test.ts +468 -0
- package/src/server.ts +211 -0
- package/src/test-utils/index.ts +6 -0
- package/src/test-utils/mock-context.ts +426 -0
- package/src/test-utils/mock-playwright-client.ts +422 -0
- package/src/tools/index.ts +11 -0
- package/src/tools/webtest/crawl.test.ts +834 -0
- package/src/tools/webtest/crawl.ts +901 -0
- package/src/tools/webtest/discover-features.ts +412 -0
- package/src/tools/webtest/discover-flows.ts +408 -0
- package/src/tools/webtest/generate-tests.test.ts +532 -0
- package/src/tools/webtest/generate-tests.ts +425 -0
- package/src/tools/webtest/index.ts +7 -0
- package/src/tools/webtest/integration.test.ts +536 -0
- package/src/tools/webtest/run-test-case.test.ts +659 -0
- package/src/tools/webtest/run-test-case.ts +508 -0
- package/src/tools/webtest/schemas.ts +201 -0
- package/src/tools/webtest/start-analysis.test.ts +151 -0
- package/src/tools/webtest/start-analysis.ts +158 -0
- package/src/transports/http.ts +19 -0
- package/src/transports/index.ts +30 -0
- package/src/transports/stdio.ts +7 -0
- package/src/types/capabilities.test.ts +193 -0
- package/src/types/capabilities.ts +50 -0
- package/src/types/context.ts +21 -0
- package/src/types/tool.ts +11 -0
- package/src/workspace/index.ts +945 -0
- package/src/workspace/markdown.ts +272 -0
- package/src/workspace/types.ts +186 -0
- package/tests/integration/server.test.ts +89 -0
- package/tests/integration/tools.test.ts +99 -0
- package/tsconfig.json +20 -0
- package/vitest.config.ts +9 -0
- package/vitest.integration.config.ts +10 -0
package/openspec/changes/archive/2025-12-18-add-mcp-server-foundation/specs/mcp-server-core/spec.md
ADDED
|
@@ -0,0 +1,183 @@
|
|
|
1
|
+
## ADDED Requirements
|
|
2
|
+
|
|
3
|
+
### Requirement: MCP Server Initialization
|
|
4
|
+
|
|
5
|
+
The system SHALL provide an MCP server that initializes with proper identification and connects to the configured transport.
|
|
6
|
+
|
|
7
|
+
#### Scenario: Server starts with stdio transport
|
|
8
|
+
|
|
9
|
+
- **GIVEN** the environment variable `TRANSPORT` is set to `stdio` or not set
|
|
10
|
+
- **WHEN** the server entry point is executed
|
|
11
|
+
- **THEN** it SHALL identify itself with name "testing-mcp" and version from package.json
|
|
12
|
+
- **AND** it SHALL connect to stdio transport for communication
|
|
13
|
+
|
|
14
|
+
#### Scenario: Server starts with HTTP transport
|
|
15
|
+
|
|
16
|
+
- **GIVEN** the environment variable `TRANSPORT` is set to `http`
|
|
17
|
+
- **AND** the environment variable `PORT` is set to a valid port number
|
|
18
|
+
- **WHEN** the server entry point is executed
|
|
19
|
+
- **THEN** it SHALL start a Streamable HTTP server on the specified port
|
|
20
|
+
- **AND** it SHALL accept MCP protocol connections over HTTP
|
|
21
|
+
|
|
22
|
+
#### Scenario: Server handles graceful shutdown
|
|
23
|
+
|
|
24
|
+
- **GIVEN** the server is running
|
|
25
|
+
- **WHEN** the process receives SIGINT or SIGTERM
|
|
26
|
+
- **THEN** the server SHALL disconnect gracefully
|
|
27
|
+
- **AND** the process SHALL exit with code 0
|
|
28
|
+
|
|
29
|
+
### Requirement: Configuration Validation
|
|
30
|
+
|
|
31
|
+
The system SHALL validate configuration at startup using Zod schemas and fail fast on invalid configuration.
|
|
32
|
+
|
|
33
|
+
#### Scenario: Valid configuration starts server
|
|
34
|
+
|
|
35
|
+
- **GIVEN** all required environment variables are valid
|
|
36
|
+
- **WHEN** the server starts
|
|
37
|
+
- **THEN** configuration SHALL be parsed and validated
|
|
38
|
+
- **AND** the server SHALL proceed with initialization
|
|
39
|
+
|
|
40
|
+
#### Scenario: Invalid configuration fails fast
|
|
41
|
+
|
|
42
|
+
- **GIVEN** an environment variable has an invalid value (e.g., `PORT=invalid`)
|
|
43
|
+
- **WHEN** the server attempts to start
|
|
44
|
+
- **THEN** it SHALL log a descriptive error message
|
|
45
|
+
- **AND** the process SHALL exit with a non-zero code
|
|
46
|
+
|
|
47
|
+
### Requirement: Pluggable Transport Layer
|
|
48
|
+
|
|
49
|
+
The system SHALL support multiple transport types through a pluggable architecture with transport selection via environment configuration.
|
|
50
|
+
|
|
51
|
+
#### Scenario: Transport factory selects stdio
|
|
52
|
+
|
|
53
|
+
- **GIVEN** the transport configuration specifies `stdio`
|
|
54
|
+
- **WHEN** the transport factory is invoked
|
|
55
|
+
- **THEN** it SHALL return a configured StdioServerTransport instance
|
|
56
|
+
|
|
57
|
+
#### Scenario: Transport factory selects HTTP
|
|
58
|
+
|
|
59
|
+
- **GIVEN** the transport configuration specifies `http` with a port
|
|
60
|
+
- **WHEN** the transport factory is invoked
|
|
61
|
+
- **THEN** it SHALL return a configured StreamableHTTPServerTransport instance
|
|
62
|
+
|
|
63
|
+
### Requirement: Self-Describing Tool Registry
|
|
64
|
+
|
|
65
|
+
The system SHALL maintain a tool registry where each tool exports a standard interface including name, description, Zod input schema, and async handler function.
|
|
66
|
+
|
|
67
|
+
#### Scenario: Tool is registered and discoverable
|
|
68
|
+
|
|
69
|
+
- **GIVEN** a tool is added to the registry
|
|
70
|
+
- **WHEN** an MCP client requests the tool list
|
|
71
|
+
- **THEN** the tool SHALL appear in the list with its name and description
|
|
72
|
+
- **AND** the input JSON Schema SHALL be generated from the Zod schema
|
|
73
|
+
|
|
74
|
+
#### Scenario: New tool follows registry pattern
|
|
75
|
+
|
|
76
|
+
- **GIVEN** a developer creates a new tool
|
|
77
|
+
- **WHEN** the tool exports `{ name, description, inputSchema, handler }`
|
|
78
|
+
- **AND** the tool is added to the registry index
|
|
79
|
+
- **THEN** it SHALL be automatically registered with the MCP server
|
|
80
|
+
|
|
81
|
+
### Requirement: Hello Tool Implementation
|
|
82
|
+
|
|
83
|
+
The system SHALL provide a "hello" demonstration tool that accepts a name parameter and returns a greeting message, serving as a reference implementation of the tool pattern.
|
|
84
|
+
|
|
85
|
+
#### Scenario: Hello tool returns greeting
|
|
86
|
+
|
|
87
|
+
- **GIVEN** the hello tool is registered
|
|
88
|
+
- **WHEN** called with input `{ "name": "World" }`
|
|
89
|
+
- **THEN** it SHALL return content with text "Hello, World!"
|
|
90
|
+
|
|
91
|
+
#### Scenario: Hello tool validates input
|
|
92
|
+
|
|
93
|
+
- **GIVEN** the hello tool is registered
|
|
94
|
+
- **WHEN** called without required name parameter
|
|
95
|
+
- **THEN** it SHALL return a validation error
|
|
96
|
+
|
|
97
|
+
### Requirement: Structured Logging
|
|
98
|
+
|
|
99
|
+
The system SHALL provide structured JSON logging with configurable log levels and automatic redaction of sensitive fields.
|
|
100
|
+
|
|
101
|
+
#### Scenario: Log output is structured JSON
|
|
102
|
+
|
|
103
|
+
- **GIVEN** the server is running
|
|
104
|
+
- **WHEN** a log event occurs
|
|
105
|
+
- **THEN** it SHALL be output as a JSON object with timestamp, level, and message fields
|
|
106
|
+
|
|
107
|
+
#### Scenario: Sensitive fields are redacted
|
|
108
|
+
|
|
109
|
+
- **GIVEN** a log message contains a field matching a sensitive key pattern (password, token, secret, apiKey, authorization)
|
|
110
|
+
- **WHEN** the log is written
|
|
111
|
+
- **THEN** the sensitive field value SHALL be replaced with "[REDACTED]"
|
|
112
|
+
|
|
113
|
+
#### Scenario: Log level is configurable
|
|
114
|
+
|
|
115
|
+
- **GIVEN** the environment variable `LOG_LEVEL` is set to a valid level (debug, info, warn, error)
|
|
116
|
+
- **WHEN** the server starts
|
|
117
|
+
- **THEN** only log messages at or above that level SHALL be output
|
|
118
|
+
|
|
119
|
+
### Requirement: Project Build Configuration
|
|
120
|
+
|
|
121
|
+
The system SHALL be buildable to JavaScript for production deployment using TypeScript compiler.
|
|
122
|
+
|
|
123
|
+
#### Scenario: Project builds successfully
|
|
124
|
+
|
|
125
|
+
- **GIVEN** the source code is valid TypeScript
|
|
126
|
+
- **WHEN** `npm run build` is executed
|
|
127
|
+
- **THEN** compiled JavaScript SHALL be output to `dist/` directory
|
|
128
|
+
- **AND** the build SHALL complete without errors
|
|
129
|
+
|
|
130
|
+
#### Scenario: Development mode runs with hot-reload
|
|
131
|
+
|
|
132
|
+
- **GIVEN** the development dependencies are installed
|
|
133
|
+
- **WHEN** `npm run dev` is executed
|
|
134
|
+
- **THEN** the server SHALL start with file watching enabled
|
|
135
|
+
- **AND** changes to source files SHALL trigger automatic restart
|
|
136
|
+
|
|
137
|
+
#### Scenario: Package is executable as CLI
|
|
138
|
+
|
|
139
|
+
- **GIVEN** the project is built
|
|
140
|
+
- **WHEN** `npx testing-mcp` is executed (or the bin entry is invoked)
|
|
141
|
+
- **THEN** the server SHALL start with default configuration
|
|
142
|
+
|
|
143
|
+
### Requirement: Unit Test Infrastructure
|
|
144
|
+
|
|
145
|
+
The system SHALL include unit test configuration for validating tool handlers in isolation.
|
|
146
|
+
|
|
147
|
+
#### Scenario: Unit tests execute successfully
|
|
148
|
+
|
|
149
|
+
- **GIVEN** unit test files exist in the project
|
|
150
|
+
- **WHEN** `npm test` is executed
|
|
151
|
+
- **THEN** the test runner SHALL discover and execute all test files
|
|
152
|
+
- **AND** results SHALL be reported to stdout
|
|
153
|
+
|
|
154
|
+
#### Scenario: Tool handlers are testable in isolation
|
|
155
|
+
|
|
156
|
+
- **GIVEN** a tool handler function
|
|
157
|
+
- **WHEN** called directly with valid input
|
|
158
|
+
- **THEN** it SHALL return the expected result without requiring server initialization
|
|
159
|
+
|
|
160
|
+
### Requirement: Integration Test Infrastructure
|
|
161
|
+
|
|
162
|
+
The system SHALL include integration tests that spawn the server and communicate using the MCP protocol to verify end-to-end behavior.
|
|
163
|
+
|
|
164
|
+
#### Scenario: Integration test spawns server
|
|
165
|
+
|
|
166
|
+
- **GIVEN** integration test configuration exists
|
|
167
|
+
- **WHEN** an integration test runs
|
|
168
|
+
- **THEN** it SHALL spawn the server as a child process
|
|
169
|
+
- **AND** connect to it using StdioServerTransport
|
|
170
|
+
|
|
171
|
+
#### Scenario: Integration test executes tool end-to-end
|
|
172
|
+
|
|
173
|
+
- **GIVEN** an integration test has connected to the server
|
|
174
|
+
- **WHEN** it calls a tool with valid input
|
|
175
|
+
- **THEN** it SHALL receive the expected response payload
|
|
176
|
+
- **AND** verify the response matches expected format
|
|
177
|
+
|
|
178
|
+
#### Scenario: Integration test verifies error handling
|
|
179
|
+
|
|
180
|
+
- **GIVEN** an integration test has connected to the server
|
|
181
|
+
- **WHEN** it calls a tool with invalid input
|
|
182
|
+
- **THEN** it SHALL receive an appropriate error response
|
|
183
|
+
- **AND** verify the error format matches MCP protocol specification
|
|
@@ -0,0 +1,112 @@
|
|
|
1
|
+
## 1. Project Configuration
|
|
2
|
+
|
|
3
|
+
- [x] 1.1 Initialize `package.json` with:
|
|
4
|
+
- name: "testing-mcp"
|
|
5
|
+
- type: "module"
|
|
6
|
+
- exports map: `{ ".": "./dist/index.js" }`
|
|
7
|
+
- bin entry: `{ "testing-mcp": "./dist/index.js" }`
|
|
8
|
+
- scripts: dev, build, test, test:integration, start
|
|
9
|
+
- engines: `{ "node": ">=22.18.0" }`
|
|
10
|
+
- [x] 1.2 Create `tsconfig.json` targeting ES2022, NodeNext module resolution, strict mode enabled
|
|
11
|
+
- [x] 1.3 Install dependencies: `@modelcontextprotocol/sdk`, `zod`
|
|
12
|
+
- [x] 1.4 Install dev dependencies: `typescript`, `tsx`, `vitest`, `@types/node`
|
|
13
|
+
- [x] 1.5 Add `.gitignore` entries for `node_modules/`, `dist/`, and coverage reports
|
|
14
|
+
- [x] 1.6 Create `vitest.config.ts` with TypeScript and ESM support
|
|
15
|
+
|
|
16
|
+
## 2. Project Structure
|
|
17
|
+
|
|
18
|
+
- [x] 2.1 Create `src/` directory structure:
|
|
19
|
+
```
|
|
20
|
+
src/
|
|
21
|
+
├── index.ts
|
|
22
|
+
├── server.ts
|
|
23
|
+
├── config.ts
|
|
24
|
+
├── logger.ts
|
|
25
|
+
├── transports/
|
|
26
|
+
│ ├── index.ts
|
|
27
|
+
│ ├── stdio.ts
|
|
28
|
+
│ └── http.ts
|
|
29
|
+
├── tools/
|
|
30
|
+
│ ├── index.ts
|
|
31
|
+
│ └── hello.ts
|
|
32
|
+
├── schemas/
|
|
33
|
+
│ └── config.ts
|
|
34
|
+
└── types/
|
|
35
|
+
└── tool.ts
|
|
36
|
+
```
|
|
37
|
+
|
|
38
|
+
## 3. Configuration & Logging
|
|
39
|
+
|
|
40
|
+
- [x] 3.1 Create `src/schemas/config.ts` with Zod schema for environment config:
|
|
41
|
+
- `TRANSPORT`: enum `stdio` | `http` (default: `stdio`)
|
|
42
|
+
- `PORT`: number (default: `3000`, required when TRANSPORT=http)
|
|
43
|
+
- `LOG_LEVEL`: enum `debug` | `info` | `warn` | `error` (default: `info`)
|
|
44
|
+
- [x] 3.2 Create `src/config.ts` that parses and validates env vars at startup
|
|
45
|
+
- [x] 3.3 Create `src/logger.ts` with:
|
|
46
|
+
- Structured JSON output
|
|
47
|
+
- Configurable log level
|
|
48
|
+
- Secret redaction for sensitive keys (password, token, secret, apiKey, authorization)
|
|
49
|
+
|
|
50
|
+
## 4. Transport Layer
|
|
51
|
+
|
|
52
|
+
- [x] 4.1 Create `src/types/tool.ts` with `McpTool` interface
|
|
53
|
+
- [x] 4.2 Create `src/transports/stdio.ts` wrapping StdioServerTransport
|
|
54
|
+
- [x] 4.3 Create `src/transports/http.ts` wrapping StreamableHTTPServerTransport
|
|
55
|
+
- [x] 4.4 Create `src/transports/index.ts` transport factory based on config
|
|
56
|
+
|
|
57
|
+
## 5. Tool Registry
|
|
58
|
+
|
|
59
|
+
- [x] 5.1 Create `src/tools/hello.ts` with:
|
|
60
|
+
- Zod input schema for `name` parameter
|
|
61
|
+
- Handler returning greeting message
|
|
62
|
+
- Export following `McpTool` interface
|
|
63
|
+
- [x] 5.2 Create `src/tools/index.ts` registry exporting all tools
|
|
64
|
+
- [x] 5.3 Verify JSON Schema generation from Zod works correctly
|
|
65
|
+
|
|
66
|
+
## 6. Server Core
|
|
67
|
+
|
|
68
|
+
- [x] 6.1 Create `src/server.ts` with:
|
|
69
|
+
- MCP server factory function
|
|
70
|
+
- Tool registration from registry
|
|
71
|
+
- Server identification (name, version from package.json)
|
|
72
|
+
- [x] 6.2 Create `src/index.ts` entry point:
|
|
73
|
+
- Config validation
|
|
74
|
+
- Logger initialization
|
|
75
|
+
- Transport creation
|
|
76
|
+
- Server bootstrap
|
|
77
|
+
- Graceful shutdown handlers (SIGINT/SIGTERM)
|
|
78
|
+
|
|
79
|
+
## 7. Unit Tests
|
|
80
|
+
|
|
81
|
+
- [x] 7.1 Create `src/tools/hello.test.ts` testing:
|
|
82
|
+
- Handler returns correct greeting
|
|
83
|
+
- Input validation rejects invalid input
|
|
84
|
+
- [x] 7.2 Create `src/config.test.ts` testing:
|
|
85
|
+
- Valid config parses correctly
|
|
86
|
+
- Invalid config throws descriptive error
|
|
87
|
+
- [x] 7.3 Create `src/logger.test.ts` testing:
|
|
88
|
+
- Output is valid JSON
|
|
89
|
+
- Sensitive fields are redacted
|
|
90
|
+
- [x] 7.4 Verify `npm test` runs all unit tests successfully
|
|
91
|
+
|
|
92
|
+
## 8. Integration Tests
|
|
93
|
+
|
|
94
|
+
- [x] 8.1 Create `tests/integration/` directory
|
|
95
|
+
- [x] 8.2 Create `tests/integration/server.test.ts`:
|
|
96
|
+
- Spawn server as child process
|
|
97
|
+
- Connect using MCP client with StdioClientTransport
|
|
98
|
+
- Verify tool list includes "hello"
|
|
99
|
+
- [x] 8.3 Create `tests/integration/tools.test.ts`:
|
|
100
|
+
- Call hello tool with valid input, verify response
|
|
101
|
+
- Call hello tool with invalid input, verify error response
|
|
102
|
+
- [x] 8.4 Add `test:integration` npm script
|
|
103
|
+
- [x] 8.5 Verify integration tests pass
|
|
104
|
+
|
|
105
|
+
## 9. Validation & Documentation
|
|
106
|
+
|
|
107
|
+
- [x] 9.1 Verify `npm run build` produces valid output in `dist/`
|
|
108
|
+
- [x] 9.2 Verify `npm run dev` starts server with watch mode (stdio transport)
|
|
109
|
+
- [x] 9.3 Verify `TRANSPORT=http PORT=3000 npm run dev` starts HTTP server
|
|
110
|
+
- [x] 9.4 Test server with MCP client (e.g., Claude Code) to confirm tool discovery and execution
|
|
111
|
+
- [x] 9.5 Add shebang `#!/usr/bin/env node` to entry point for bin execution
|
|
112
|
+
- [x] 9.6 Verify `npx .` works after build (local bin test)
|
|
@@ -0,0 +1,333 @@
|
|
|
1
|
+
# Design: Dynamic Web Testing Orchestrator
|
|
2
|
+
|
|
3
|
+
## Context
|
|
4
|
+
|
|
5
|
+
This MCP server orchestrates web application testing by:
|
|
6
|
+
1. Managing browser automation via an external Playwright MCP server
|
|
7
|
+
2. Using MCP Sampling for all LLM-powered decisions (client-controlled)
|
|
8
|
+
3. Exposing artifacts as MCP Resources for client consumption
|
|
9
|
+
4. Supporting interactive workflows via MCP Elicitation
|
|
10
|
+
|
|
11
|
+
The architecture spans multiple systems (this server, Playwright MCP, client LLM) and introduces patterns for capability negotiation, fallback modes, and secure multi-model orchestration.
|
|
12
|
+
|
|
13
|
+
## Goals / Non-Goals
|
|
14
|
+
|
|
15
|
+
### Goals
|
|
16
|
+
- Provide end-to-end web testing workflow: explore → analyze → generate tests → execute
|
|
17
|
+
- Use MCP Sampling exclusively for LLM reasoning (no server-side API keys)
|
|
18
|
+
- Support graceful degradation when client lacks sampling/elicitation
|
|
19
|
+
- Ensure all artifacts are browsable as MCP Resources
|
|
20
|
+
- Support cancellation and progress for long-running operations
|
|
21
|
+
- Enforce security: domain allowlists, no credential elicitation, prompt injection resistance
|
|
22
|
+
|
|
23
|
+
### Non-Goals
|
|
24
|
+
- Authentication/login automation (explicitly out of scope; stop and inform user)
|
|
25
|
+
- Server-side LLM API keys or model selection
|
|
26
|
+
- Visual regression testing (future enhancement)
|
|
27
|
+
- Multi-browser support (Playwright MCP handles this; we just orchestrate)
|
|
28
|
+
- Parallel test execution (single-threaded for v1)
|
|
29
|
+
|
|
30
|
+
## Architecture Overview
|
|
31
|
+
|
|
32
|
+
```
|
|
33
|
+
┌─────────────────────────────────────────────────────────────────┐
|
|
34
|
+
│ MCP Client │
|
|
35
|
+
│ (Claude Desktop, VS Code, custom client) │
|
|
36
|
+
│ - Handles sampling/createMessage requests │
|
|
37
|
+
│ - Displays resources, prompts │
|
|
38
|
+
│ - Provides elicitation UI │
|
|
39
|
+
└─────────────────┬───────────────────────────────────────────────┘
|
|
40
|
+
│ MCP Protocol (stdio/HTTP)
|
|
41
|
+
┌─────────────────▼───────────────────────────────────────────────┐
|
|
42
|
+
│ testing-mcp (This Server) │
|
|
43
|
+
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────────────────┐│
|
|
44
|
+
│ │ Lifecycle │ │ Tool │ │ Resource Manager ││
|
|
45
|
+
│ │ Manager │ │ Handlers │ │ (workspace/artifacts) ││
|
|
46
|
+
│ │ (caps nego) │ │ (5 tools) │ │ ││
|
|
47
|
+
│ └──────────────┘ └──────────────┘ └──────────────────────────┘│
|
|
48
|
+
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────────────────┐│
|
|
49
|
+
│ │ Sampling │ │ Elicitation │ │ Progress/Cancellation ││
|
|
50
|
+
│ │ Client │ │ Client │ │ Manager ││
|
|
51
|
+
│ └──────────────┘ └──────────────┘ └──────────────────────────┘│
|
|
52
|
+
│ ┌──────────────────────────────────────────────────────────────┐│
|
|
53
|
+
│ │ Playwright MCP Client (orchestrates external server) ││
|
|
54
|
+
│ └──────────────────────────────────────────────────────────────┘│
|
|
55
|
+
└─────────────────┬───────────────────────────────────────────────┘
|
|
56
|
+
│ MCP Protocol (stdio subprocess)
|
|
57
|
+
┌─────────────────▼───────────────────────────────────────────────┐
|
|
58
|
+
│ Playwright MCP Server (Microsoft) │
|
|
59
|
+
│ - browser_snapshot, browser_take_screenshot │
|
|
60
|
+
│ - browser_click, browser_type, browser_navigate │
|
|
61
|
+
│ - browser_run_code (for DOM extraction) │
|
|
62
|
+
└─────────────────────────────────────────────────────────────────┘
|
|
63
|
+
```
|
|
64
|
+
|
|
65
|
+
## Decisions
|
|
66
|
+
|
|
67
|
+
### D1: Playwright MCP as subprocess, not embedded
|
|
68
|
+
**Decision**: Spawn Playwright MCP server as a subprocess and communicate via stdio MCP protocol.
|
|
69
|
+
|
|
70
|
+
**Rationale**:
|
|
71
|
+
- Microsoft's Playwright MCP is maintained separately with frequent updates
|
|
72
|
+
- Subprocess isolation prevents version conflicts
|
|
73
|
+
- Standard MCP client pattern; reusable code
|
|
74
|
+
- Can be configured via environment (e.g., browser type, headless mode)
|
|
75
|
+
|
|
76
|
+
**Alternatives considered**:
|
|
77
|
+
- Direct Playwright library integration: Higher coupling, maintenance burden
|
|
78
|
+
- HTTP transport to remote Playwright MCP: Adds network complexity for local use
|
|
79
|
+
|
|
80
|
+
### D2: Sampling-first with fallback modes
|
|
81
|
+
**Decision**: Primary reasoning via `sampling/createMessage`. When sampling unavailable, emit "human prompt" resources for manual execution.
|
|
82
|
+
|
|
83
|
+
**Rationale**:
|
|
84
|
+
- MCP Sampling keeps API keys client-side (security, flexibility)
|
|
85
|
+
- Fallback ensures server works with minimal clients
|
|
86
|
+
- "Human prompt" resources let users copy/paste to their LLM
|
|
87
|
+
|
|
88
|
+
**Fallback behavior**:
|
|
89
|
+
- `webtest_crawl_app` without sampling: Returns `needsManualInput: true` and a resource with the prompt
|
|
90
|
+
- Tool accepts `manualNextActions` input to continue crawl with user-provided actions
|
|
91
|
+
|
|
92
|
+
### D3: Structured JSON schemas for all sampling requests
|
|
93
|
+
**Decision**: All sampling requests include strict JSON output schemas. Responses are validated before use.
|
|
94
|
+
|
|
95
|
+
**Rationale**:
|
|
96
|
+
- Predictable parsing of LLM responses
|
|
97
|
+
- Type safety in TypeScript handlers
|
|
98
|
+
- Clear contract between server and client LLM
|
|
99
|
+
|
|
100
|
+
**Schema examples**:
|
|
101
|
+
- Crawl action: `{ actions: [{ tool: string, args: object }], reasoning: string, goalProgress: string }`
|
|
102
|
+
- Test generation: `{ tests: [{ id, name, steps: [...] }] }`
|
|
103
|
+
|
|
104
|
+
### D4: File-based workspace with Resource URIs
|
|
105
|
+
**Decision**: Each analysis creates a workspace directory. All artifacts written to disk and exposed as `webtest://analysisId/...` resources.
|
|
106
|
+
|
|
107
|
+
**Rationale**:
|
|
108
|
+
- Persistence survives server restarts
|
|
109
|
+
- Resources are stable, shareable URIs
|
|
110
|
+
- File system is simple, debuggable
|
|
111
|
+
- Enables future features (workspace resume, export)
|
|
112
|
+
|
|
113
|
+
**Structure**:
|
|
114
|
+
```
|
|
115
|
+
workspaces/
|
|
116
|
+
{analysisId}/
|
|
117
|
+
index.json # Analysis metadata
|
|
118
|
+
crawls/
|
|
119
|
+
{crawlId}/
|
|
120
|
+
index.json # Crawl metadata, page list
|
|
121
|
+
pages/
|
|
122
|
+
{pageId}/
|
|
123
|
+
snapshot.json
|
|
124
|
+
screenshot.png
|
|
125
|
+
dom.html
|
|
126
|
+
summary.md
|
|
127
|
+
analysis/
|
|
128
|
+
app-analysis.md
|
|
129
|
+
flows.json
|
|
130
|
+
tests/
|
|
131
|
+
tests.md
|
|
132
|
+
tests.json
|
|
133
|
+
runs/
|
|
134
|
+
{runId}/
|
|
135
|
+
report.md
|
|
136
|
+
artifacts.json
|
|
137
|
+
steps/
|
|
138
|
+
{stepId}/
|
|
139
|
+
screenshot.png
|
|
140
|
+
snapshot.json
|
|
141
|
+
```
|
|
142
|
+
|
|
143
|
+
### D5: Capability-based runtime behavior
|
|
144
|
+
**Decision**: Query client capabilities at initialization; store in server context; branch behavior at runtime.
|
|
145
|
+
|
|
146
|
+
**Rationale**:
|
|
147
|
+
- MCP clients vary in capability support
|
|
148
|
+
- Graceful degradation over hard failures
|
|
149
|
+
- Single codebase serves all client types
|
|
150
|
+
|
|
151
|
+
**Capabilities checked**:
|
|
152
|
+
- `sampling`: Use sampling or fallback to manual prompts
|
|
153
|
+
- `elicitation`: Ask user or write questions to output
|
|
154
|
+
- `logging`: Emit logs or stay silent
|
|
155
|
+
- `progress`: Report progress or skip
|
|
156
|
+
- `tasks`: Use task-augmented execution for long operations (optional)
|
|
157
|
+
|
|
158
|
+
### D6: Security boundaries
|
|
159
|
+
**Decision**: Implement multiple security layers.
|
|
160
|
+
|
|
161
|
+
**Rationale**: Web testing interacts with untrusted content; must prevent abuse.
|
|
162
|
+
|
|
163
|
+
**Layers**:
|
|
164
|
+
1. **Domain allowlist**: Default to target domain only. Explicitly opt-in for additional domains.
|
|
165
|
+
2. **Prompt injection resistance**: Model instructions are prefixed with "SYSTEM:" and wrapped; page content is clearly demarcated as "USER CONTENT:". Sampling prompts instruct model to ignore instructions in page content.
|
|
166
|
+
3. **No credential elicitation**: If auth required, inform user and stop. Never ask for passwords/tokens via elicitation.
|
|
167
|
+
4. **Action validation**: Before executing Playwright actions, validate they target allowed domains.
|
|
168
|
+
|
|
169
|
+
### D7: Progress and cancellation implementation
|
|
170
|
+
**Decision**: Use `progressToken` from request `_meta`; emit `notifications/progress`. Check cancellation registry on each loop iteration.
|
|
171
|
+
|
|
172
|
+
**Rationale**:
|
|
173
|
+
- Standard MCP progress pattern
|
|
174
|
+
- Cancellation enables user control over long crawls
|
|
175
|
+
- Partial results are still valuable
|
|
176
|
+
|
|
177
|
+
**Implementation**:
|
|
178
|
+
- Maintain `Set<requestId>` of cancelled requests
|
|
179
|
+
- On `notifications/cancelled`, add to set
|
|
180
|
+
- Each crawl/test loop iteration checks set; if cancelled, finalize partial output
|
|
181
|
+
|
|
182
|
+
### D8: Elicitation for specific decision points
|
|
183
|
+
**Decision**: Use elicitation only for enumerated, non-sensitive decisions.
|
|
184
|
+
|
|
185
|
+
**Elicitation triggers** (exhaustive list):
|
|
186
|
+
- Cookie consent: "Accept", "Reject", "Dismiss"
|
|
187
|
+
- Modal blocking: "Close modal", "Interact with modal"
|
|
188
|
+
- Ambiguous navigation: Multiple similar options → list them
|
|
189
|
+
- Auth required: "Stop analysis", "Continue unauthenticated"
|
|
190
|
+
|
|
191
|
+
**Never elicit**: Passwords, tokens, 2FA codes, personal data
|
|
192
|
+
|
|
193
|
+
### D9: Protocol version requirements for elicitation
|
|
194
|
+
**Decision**: Require MCP protocol revision 2025-06-18 or later; negotiate gracefully with older clients.
|
|
195
|
+
|
|
196
|
+
**Rationale**:
|
|
197
|
+
- Elicitation is a newer MCP feature not available in all clients
|
|
198
|
+
- Explicit version requirement prevents runtime surprises
|
|
199
|
+
- Graceful degradation for older clients maintains usability
|
|
200
|
+
|
|
201
|
+
**Implementation**:
|
|
202
|
+
- Declare `protocolVersion: "2025-06-18"` in server capabilities
|
|
203
|
+
- If client negotiates older version, mark elicitation as unavailable
|
|
204
|
+
- Log warning when running in degraded mode
|
|
205
|
+
|
|
206
|
+
### D10: Resource change signaling
|
|
207
|
+
**Decision**: Implement `resources/list_changed` notifications and optional `resources/subscribe` for live artifact updates.
|
|
208
|
+
|
|
209
|
+
**Rationale**:
|
|
210
|
+
- Long-running operations (crawl, test execution) produce artifacts incrementally
|
|
211
|
+
- Clients benefit from knowing when new artifacts are available
|
|
212
|
+
- Standard MCP pattern for resource-heavy servers
|
|
213
|
+
|
|
214
|
+
**Implementation**:
|
|
215
|
+
- Check `capabilities.resources.listChanged` at init
|
|
216
|
+
- Emit `notifications/resources/list_changed` when new resources created
|
|
217
|
+
- Support `resources/subscribe` for per-resource update notifications
|
|
218
|
+
- Fallback: clients poll `resources/list` if notifications unsupported
|
|
219
|
+
|
|
220
|
+
### D11: Playwright MCP capability adapter
|
|
221
|
+
**Decision**: Dynamically discover Playwright MCP tools and build an adapter layer mapping canonical operations to actual tool names.
|
|
222
|
+
|
|
223
|
+
**Rationale**:
|
|
224
|
+
- Different Playwright MCP implementations use different naming (browser_*, playwright_*, unprefixed)
|
|
225
|
+
- Microsoft's implementation may change tool names between versions
|
|
226
|
+
- Adapter pattern isolates our code from external API changes
|
|
227
|
+
|
|
228
|
+
**Implementation**:
|
|
229
|
+
- On first Playwright MCP use, call `tools/list`
|
|
230
|
+
- Build mapping: `{ snapshot: "browser_snapshot", click: "browser_click", ... }`
|
|
231
|
+
- Check for required capabilities; log warnings if missing
|
|
232
|
+
- Cache mapping for session lifetime
|
|
233
|
+
|
|
234
|
+
### D12: Crawl checkpointing and loop prevention
|
|
235
|
+
**Decision**: Implement periodic checkpoints and multi-level loop detection during crawl.
|
|
236
|
+
|
|
237
|
+
**Rationale**:
|
|
238
|
+
- Long crawls may be interrupted (cancellation, errors, timeouts)
|
|
239
|
+
- Infinite loops on complex apps are a real risk
|
|
240
|
+
- Checkpoints enable resumption; loop detection prevents resource waste
|
|
241
|
+
|
|
242
|
+
**Checkpointing**:
|
|
243
|
+
- Write checkpoint every N steps (default 5)
|
|
244
|
+
- Checkpoint includes: step count, visited pages, action history, goal progress
|
|
245
|
+
- Support `resume: true` to continue from checkpoint
|
|
246
|
+
|
|
247
|
+
**Loop detection**:
|
|
248
|
+
- DOM signature (hash of structural elements) detects same-state loops
|
|
249
|
+
- URL tracking detects navigation cycles
|
|
250
|
+
- Action deduplication prevents repeated identical actions
|
|
251
|
+
- Include loop state in sampling prompts to help model avoid patterns
|
|
252
|
+
|
|
253
|
+
### D13: Structured logging with correlation and redaction
|
|
254
|
+
**Decision**: Implement MCP logging notifications with correlation IDs, log level control, and sensitive data redaction.
|
|
255
|
+
|
|
256
|
+
**Rationale**:
|
|
257
|
+
- Progress tells "where we are"; logs tell "why we did that"
|
|
258
|
+
- Correlation IDs enable tracing across analysis → crawl → test
|
|
259
|
+
- Sensitive data (tokens, passwords, cookies) must not leak to logs
|
|
260
|
+
|
|
261
|
+
**Implementation**:
|
|
262
|
+
- Support `logging/setLevel` for dynamic control
|
|
263
|
+
- Include `analysisId`, `crawlId`, `testRunId`, `iteration` in all logs
|
|
264
|
+
- Redact: URL query params matching sensitive patterns, cookie values, password inputs
|
|
265
|
+
- Log Playwright tool calls and sampling requests for debugging
|
|
266
|
+
|
|
267
|
+
### D14: Comprehensive prompt injection hardening
|
|
268
|
+
**Decision**: Implement defense-in-depth against prompt injection attacks via page content.
|
|
269
|
+
|
|
270
|
+
**Rationale**:
|
|
271
|
+
- MCP Sampling forwards untrusted page content to a model
|
|
272
|
+
- Injection attacks could expand scope, exfiltrate data, or request secrets
|
|
273
|
+
- Multiple layers of defense required
|
|
274
|
+
|
|
275
|
+
**Layers**:
|
|
276
|
+
1. **Demarcation**: Page content wrapped with explicit security warnings
|
|
277
|
+
2. **Instruction protection**: System instructions use `[WEBTEST-SYSTEM]:` prefix
|
|
278
|
+
3. **Action validation**: All actions checked against allowed domains
|
|
279
|
+
4. **Scope enforcement**: Reject actions outside stated user goal
|
|
280
|
+
5. **Exfiltration blocking**: Block POST to external domains, external network calls
|
|
281
|
+
6. **Audit logging**: Log all sampling inputs/outputs for security review
|
|
282
|
+
7. **Test suite**: Include injection resistance tests (direct, indirect, goal hijacking)
|
|
283
|
+
|
|
284
|
+
## Risks / Trade-offs
|
|
285
|
+
|
|
286
|
+
### Risk: Sampling latency impacts UX
|
|
287
|
+
**Mitigation**:
|
|
288
|
+
- Emit progress notifications frequently
|
|
289
|
+
- Allow cancellation
|
|
290
|
+
- Batch simple decisions where possible
|
|
291
|
+
|
|
292
|
+
### Risk: Playwright MCP tool names change
|
|
293
|
+
**Mitigation**:
|
|
294
|
+
- Discover tools at connection time via `tools/list`
|
|
295
|
+
- Maintain mapping from canonical names to actual names
|
|
296
|
+
- Log warnings if expected tools missing
|
|
297
|
+
|
|
298
|
+
### Risk: Large workspaces consume disk
|
|
299
|
+
**Mitigation**:
|
|
300
|
+
- Configurable retention policy (env var)
|
|
301
|
+
- Screenshots compressed (JPEG quality setting)
|
|
302
|
+
- Future: workspace cleanup command
|
|
303
|
+
|
|
304
|
+
### Risk: Prompt injection via page content
|
|
305
|
+
**Mitigation**:
|
|
306
|
+
- Clear demarcation in sampling prompts
|
|
307
|
+
- Output schema validation (reject malformed responses)
|
|
308
|
+
- Domain allowlist prevents navigation to attacker-controlled sites
|
|
309
|
+
- Monitoring: log all sampling inputs/outputs for audit
|
|
310
|
+
|
|
311
|
+
## Migration Plan
|
|
312
|
+
|
|
313
|
+
This is greenfield functionality; no migration needed. The `hello` tool removal is the only breaking change.
|
|
314
|
+
|
|
315
|
+
**Rollout**:
|
|
316
|
+
1. Implement core infrastructure (lifecycle, sampling client, Playwright client)
|
|
317
|
+
2. Implement `start_analysis` + resource system
|
|
318
|
+
3. Implement `crawl` with sampling/elicitation
|
|
319
|
+
4. Implement `analyze_app` and `generate_tests`
|
|
320
|
+
5. Implement `run_test_case`
|
|
321
|
+
6. Add prompts
|
|
322
|
+
7. Remove `hello` tool
|
|
323
|
+
8. Documentation and examples
|
|
324
|
+
|
|
325
|
+
## Open Questions
|
|
326
|
+
|
|
327
|
+
1. **Playwright MCP package name**: Is it `@anthropic-ai/mcp-playwright`, `@playwright/mcp`, or community package? Need to verify at implementation time.
|
|
328
|
+
|
|
329
|
+
2. **Tasks support**: The MCP tasks extension is optional. Should we implement it in v1 or defer? **Recommendation**: Defer to v2; progress + cancellation covers most use cases.
|
|
330
|
+
|
|
331
|
+
3. **Workspace location**: Default to `./workspaces` relative to CWD, or use temp directory? **Recommendation**: Configurable via `WEBTEST_WORKSPACE_DIR` env var, default to `./webtest-workspaces`.
|
|
332
|
+
|
|
333
|
+
4. **Screenshot format**: PNG (lossless, larger) vs JPEG (lossy, smaller)? **Recommendation**: PNG for accuracy; make configurable.
|