retestkit 1.4.1 → 1.5.0

Files changed (238)
  1. package/README.md +59 -40
  2. package/dist/config.js +8 -8
  3. package/dist/config.js.map +1 -1
  4. package/dist/logger.js +1 -1
  5. package/dist/logger.js.map +1 -1
  6. package/dist/prompts/index.d.ts +1 -1
  7. package/dist/prompts/index.d.ts.map +1 -1
  8. package/dist/prompts/index.js +21 -21
  9. package/dist/prompts/index.js.map +1 -1
  10. package/dist/prompts/templates/mcp/retest-crawl.md +7 -0
  11. package/{src/prompts/templates/mcp/webtest-discover-flows.md → dist/prompts/templates/mcp/retest-discover-flows.md} +1 -1
  12. package/{src/prompts/templates/mcp/webtest-discover.md → dist/prompts/templates/mcp/retest-discover.md} +2 -2
  13. package/dist/prompts/templates/mcp/retest-full-workflow.md +12 -0
  14. package/{src/prompts/templates/mcp/webtest-generate-tests.md → dist/prompts/templates/mcp/retest-generate-tests.md} +1 -1
  15. package/{src/prompts/templates/mcp/webtest-run-test.md → dist/prompts/templates/mcp/retest-run-test.md} +1 -1
  16. package/{src/prompts/templates/mcp/webtest-start.md → dist/prompts/templates/mcp/retest-start.md} +1 -1
  17. package/{src → dist}/prompts/templates/sampling/system-prefix.md +1 -1
  18. package/dist/resources/index.js +7 -7
  19. package/dist/resources/index.js.map +1 -1
  20. package/dist/schemas/config.js +2 -2
  21. package/dist/schemas/config.js.map +1 -1
  22. package/dist/security/index.js +1 -1
  23. package/dist/security/index.js.map +1 -1
  24. package/dist/server.js +3 -3
  25. package/dist/server.js.map +1 -1
  26. package/dist/test-utils/mock-context.js +22 -22
  27. package/dist/test-utils/mock-context.js.map +1 -1
  28. package/dist/tools/index.d.ts +1 -1
  29. package/dist/tools/index.d.ts.map +1 -1
  30. package/dist/tools/index.js +5 -5
  31. package/dist/tools/index.js.map +1 -1
  32. package/dist/tools/retest/crawl.d.ts.map +1 -0
  33. package/dist/tools/{webtest → retest}/crawl.js +7 -7
  34. package/dist/tools/retest/crawl.js.map +1 -0
  35. package/dist/tools/retest/discover-features.d.ts.map +1 -0
  36. package/dist/tools/{webtest → retest}/discover-features.js +6 -6
  37. package/dist/tools/retest/discover-features.js.map +1 -0
  38. package/dist/tools/retest/discover-flows.d.ts.map +1 -0
  39. package/dist/tools/{webtest → retest}/discover-flows.js +6 -6
  40. package/dist/tools/retest/discover-flows.js.map +1 -0
  41. package/dist/tools/retest/generate-tests.d.ts.map +1 -0
  42. package/dist/tools/{webtest → retest}/generate-tests.js +5 -5
  43. package/dist/tools/retest/generate-tests.js.map +1 -0
  44. package/dist/tools/retest/index.d.ts.map +1 -0
  45. package/dist/tools/retest/index.js.map +1 -0
  46. package/dist/tools/retest/run-test-case.d.ts.map +1 -0
  47. package/dist/tools/{webtest → retest}/run-test-case.js +3 -3
  48. package/dist/tools/retest/run-test-case.js.map +1 -0
  49. package/dist/tools/retest/schemas.d.ts.map +1 -0
  50. package/dist/tools/retest/schemas.js.map +1 -0
  51. package/dist/tools/retest/start-analysis.d.ts.map +1 -0
  52. package/dist/tools/{webtest → retest}/start-analysis.js +5 -5
  53. package/dist/tools/retest/start-analysis.js.map +1 -0
  54. package/dist/workspace/index.js +8 -8
  55. package/dist/workspace/index.js.map +1 -1
  56. package/dist/workspace/types.d.ts +2 -2
  57. package/dist/workspace/types.d.ts.map +1 -1
  58. package/package.json +6 -2
  59. package/.claude/commands/openspec/apply.md +0 -23
  60. package/.claude/commands/openspec/archive.md +0 -27
  61. package/.claude/commands/openspec/proposal.md +0 -28
  62. package/.gemini/commands/openspec/apply.toml +0 -21
  63. package/.gemini/commands/openspec/archive.toml +0 -25
  64. package/.gemini/commands/openspec/proposal.toml +0 -26
  65. package/.github/prompts/openspec-apply.prompt.md +0 -22
  66. package/.github/prompts/openspec-archive.prompt.md +0 -26
  67. package/.github/prompts/openspec-proposal.prompt.md +0 -27
  68. package/.github/workflows/release.yml +0 -33
  69. package/.kilocode/workflows/openspec-apply.md +0 -17
  70. package/.kilocode/workflows/openspec-archive.md +0 -21
  71. package/.kilocode/workflows/openspec-proposal.md +0 -22
  72. package/.mcp.json +0 -23
  73. package/.opencode/command/openspec-apply.md +0 -25
  74. package/.opencode/command/openspec-archive.md +0 -28
  75. package/.opencode/command/openspec-proposal.md +0 -30
  76. package/.roo/commands/openspec-apply.md +0 -20
  77. package/.roo/commands/openspec-archive.md +0 -24
  78. package/.roo/commands/openspec-proposal.md +0 -25
  79. package/.vscode/mcp.json +0 -23
  80. package/AGENTS.md +0 -18
  81. package/CLAUDE.md +0 -18
  82. package/dist/tools/webtest/crawl.d.ts.map +0 -1
  83. package/dist/tools/webtest/crawl.js.map +0 -1
  84. package/dist/tools/webtest/discover-features.d.ts.map +0 -1
  85. package/dist/tools/webtest/discover-features.js.map +0 -1
  86. package/dist/tools/webtest/discover-flows.d.ts.map +0 -1
  87. package/dist/tools/webtest/discover-flows.js.map +0 -1
  88. package/dist/tools/webtest/generate-tests.d.ts.map +0 -1
  89. package/dist/tools/webtest/generate-tests.js.map +0 -1
  90. package/dist/tools/webtest/index.d.ts.map +0 -1
  91. package/dist/tools/webtest/index.js.map +0 -1
  92. package/dist/tools/webtest/run-test-case.d.ts.map +0 -1
  93. package/dist/tools/webtest/run-test-case.js.map +0 -1
  94. package/dist/tools/webtest/schemas.d.ts.map +0 -1
  95. package/dist/tools/webtest/schemas.js.map +0 -1
  96. package/dist/tools/webtest/start-analysis.d.ts.map +0 -1
  97. package/dist/tools/webtest/start-analysis.js.map +0 -1
  98. package/openspec/AGENTS.md +0 -456
  99. package/openspec/changes/archive/2025-12-18-add-hybrid-artifact-paths/proposal.md +0 -33
  100. package/openspec/changes/archive/2025-12-18-add-hybrid-artifact-paths/specs/webtest-resources/spec.md +0 -27
  101. package/openspec/changes/archive/2025-12-18-add-hybrid-artifact-paths/specs/webtest-tools/spec.md +0 -304
  102. package/openspec/changes/archive/2025-12-18-add-hybrid-artifact-paths/tasks.md +0 -43
  103. package/openspec/changes/archive/2025-12-18-add-mcp-server-foundation/design.md +0 -209
  104. package/openspec/changes/archive/2025-12-18-add-mcp-server-foundation/proposal.md +0 -41
  105. package/openspec/changes/archive/2025-12-18-add-mcp-server-foundation/specs/mcp-server-core/spec.md +0 -183
  106. package/openspec/changes/archive/2025-12-18-add-mcp-server-foundation/tasks.md +0 -112
  107. package/openspec/changes/archive/2025-12-18-add-webtest-orchestrator/design.md +0 -333
  108. package/openspec/changes/archive/2025-12-18-add-webtest-orchestrator/proposal.md +0 -66
  109. package/openspec/changes/archive/2025-12-18-add-webtest-orchestrator/specs/mcp-server-core/spec.md +0 -129
  110. package/openspec/changes/archive/2025-12-18-add-webtest-orchestrator/specs/webtest-lifecycle/spec.md +0 -138
  111. package/openspec/changes/archive/2025-12-18-add-webtest-orchestrator/specs/webtest-logging/spec.md +0 -211
  112. package/openspec/changes/archive/2025-12-18-add-webtest-orchestrator/specs/webtest-prompts/spec.md +0 -157
  113. package/openspec/changes/archive/2025-12-18-add-webtest-orchestrator/specs/webtest-resources/spec.md +0 -213
  114. package/openspec/changes/archive/2025-12-18-add-webtest-orchestrator/specs/webtest-sampling/spec.md +0 -257
  115. package/openspec/changes/archive/2025-12-18-add-webtest-orchestrator/specs/webtest-tools/spec.md +0 -501
  116. package/openspec/changes/archive/2025-12-18-add-webtest-orchestrator/tasks.md +0 -264
  117. package/openspec/changes/archive/2025-12-18-allow-analysis-of-incomplete-crawls/proposal.md +0 -24
  118. package/openspec/changes/archive/2025-12-18-allow-analysis-of-incomplete-crawls/specs/webtest-tools/spec.md +0 -80
  119. package/openspec/changes/archive/2025-12-18-allow-analysis-of-incomplete-crawls/tasks.md +0 -8
  120. package/openspec/changes/archive/2025-12-18-fix-crawl-loop-stability/design.md +0 -90
  121. package/openspec/changes/archive/2025-12-18-fix-crawl-loop-stability/proposal.md +0 -28
  122. package/openspec/changes/archive/2025-12-18-fix-crawl-loop-stability/specs/webtest-sampling/spec.md +0 -90
  123. package/openspec/changes/archive/2025-12-18-fix-crawl-loop-stability/tasks.md +0 -33
  124. package/openspec/changes/archive/2025-12-18-use-markdown-artifacts/design.md +0 -558
  125. package/openspec/changes/archive/2025-12-18-use-markdown-artifacts/proposal.md +0 -119
  126. package/openspec/changes/archive/2025-12-18-use-markdown-artifacts/specs/webtest-resources/spec.md +0 -109
  127. package/openspec/changes/archive/2025-12-18-use-markdown-artifacts/specs/webtest-tools/spec.md +0 -121
  128. package/openspec/changes/archive/2025-12-18-use-markdown-artifacts/tasks.md +0 -133
  129. package/openspec/changes/extract-prompts-to-markdown/design.md +0 -86
  130. package/openspec/changes/extract-prompts-to-markdown/proposal.md +0 -50
  131. package/openspec/changes/extract-prompts-to-markdown/specs/webtest-prompts/spec.md +0 -74
  132. package/openspec/changes/extract-prompts-to-markdown/tasks.md +0 -40
  133. package/openspec/changes/refactor-webtest-naming/design.md +0 -95
  134. package/openspec/changes/refactor-webtest-naming/proposal.md +0 -66
  135. package/openspec/changes/refactor-webtest-naming/specs/webtest-prompts/spec.md +0 -79
  136. package/openspec/changes/refactor-webtest-naming/specs/webtest-resources/spec.md +0 -80
  137. package/openspec/changes/refactor-webtest-naming/specs/webtest-sampling/spec.md +0 -122
  138. package/openspec/changes/refactor-webtest-naming/specs/webtest-tools/spec.md +0 -113
  139. package/openspec/changes/refactor-webtest-naming/tasks.md +0 -119
  140. package/openspec/changes/rename-package-to-retest/proposal.md +0 -52
  141. package/openspec/changes/rename-package-to-retest/specs/mcp-server-core/spec.md +0 -53
  142. package/openspec/changes/rename-package-to-retest/specs/retest-lifecycle/spec.md +0 -68
  143. package/openspec/changes/rename-package-to-retest/specs/retest-logging/spec.md +0 -35
  144. package/openspec/changes/rename-package-to-retest/specs/retest-prompts/spec.md +0 -159
  145. package/openspec/changes/rename-package-to-retest/specs/retest-resources/spec.md +0 -251
  146. package/openspec/changes/rename-package-to-retest/specs/retest-sampling/spec.md +0 -99
  147. package/openspec/changes/rename-package-to-retest/specs/retest-tools/spec.md +0 -295
  148. package/openspec/changes/rename-package-to-retest/tasks.md +0 -71
  149. package/openspec/project.md +0 -31
  150. package/openspec/specs/mcp-server-core/spec.md +0 -178
  151. package/openspec/specs/webtest-lifecycle/spec.md +0 -136
  152. package/openspec/specs/webtest-logging/spec.md +0 -209
  153. package/openspec/specs/webtest-prompts/spec.md +0 -155
  154. package/openspec/specs/webtest-resources/spec.md +0 -248
  155. package/openspec/specs/webtest-sampling/spec.md +0 -344
  156. package/openspec/specs/webtest-tools/spec.md +0 -282
  157. package/release.config.js +0 -9
  158. package/src/config.test.ts +0 -96
  159. package/src/config.ts +0 -32
  160. package/src/elicitation/index.test.ts +0 -399
  161. package/src/elicitation/index.ts +0 -171
  162. package/src/elicitation/types.ts +0 -68
  163. package/src/index.ts +0 -83
  164. package/src/lifecycle/index.test.ts +0 -260
  165. package/src/lifecycle/index.ts +0 -101
  166. package/src/logger.redaction.test.ts +0 -322
  167. package/src/logger.test.ts +0 -123
  168. package/src/logger.ts +0 -229
  169. package/src/playwright-client/index.ts +0 -392
  170. package/src/playwright-client/types.ts +0 -99
  171. package/src/progress/index.test.ts +0 -327
  172. package/src/progress/index.ts +0 -170
  173. package/src/progress/types.ts +0 -25
  174. package/src/prompts/index.test.ts +0 -451
  175. package/src/prompts/index.ts +0 -246
  176. package/src/prompts/loader.test.ts +0 -100
  177. package/src/prompts/loader.ts +0 -59
  178. package/src/prompts/templates/mcp/webtest-crawl.md +0 -7
  179. package/src/prompts/templates/mcp/webtest-full-workflow.md +0 -12
  180. package/src/resources/index.ts +0 -250
  181. package/src/resources/subscriptions.ts +0 -37
  182. package/src/sampling/index.test.ts +0 -414
  183. package/src/sampling/index.ts +0 -286
  184. package/src/sampling/prompts.ts +0 -194
  185. package/src/sampling/types.ts +0 -60
  186. package/src/schemas/config.ts +0 -39
  187. package/src/security/index.test.ts +0 -441
  188. package/src/security/index.ts +0 -361
  189. package/src/security/security-scenarios.test.ts +0 -468
  190. package/src/server.ts +0 -211
  191. package/src/test-utils/index.ts +0 -6
  192. package/src/test-utils/mock-context.ts +0 -426
  193. package/src/test-utils/mock-playwright-client.ts +0 -422
  194. package/src/tools/index.ts +0 -11
  195. package/src/tools/webtest/crawl.test.ts +0 -834
  196. package/src/tools/webtest/crawl.ts +0 -901
  197. package/src/tools/webtest/discover-features.ts +0 -412
  198. package/src/tools/webtest/discover-flows.ts +0 -408
  199. package/src/tools/webtest/generate-tests.test.ts +0 -532
  200. package/src/tools/webtest/generate-tests.ts +0 -425
  201. package/src/tools/webtest/index.ts +0 -7
  202. package/src/tools/webtest/integration.test.ts +0 -536
  203. package/src/tools/webtest/run-test-case.test.ts +0 -659
  204. package/src/tools/webtest/run-test-case.ts +0 -508
  205. package/src/tools/webtest/schemas.ts +0 -201
  206. package/src/tools/webtest/start-analysis.test.ts +0 -151
  207. package/src/tools/webtest/start-analysis.ts +0 -158
  208. package/src/transports/http.ts +0 -19
  209. package/src/transports/index.ts +0 -30
  210. package/src/transports/stdio.ts +0 -7
  211. package/src/types/capabilities.test.ts +0 -193
  212. package/src/types/capabilities.ts +0 -50
  213. package/src/types/context.ts +0 -21
  214. package/src/types/tool.ts +0 -11
  215. package/src/workspace/index.ts +0 -945
  216. package/src/workspace/markdown.ts +0 -272
  217. package/src/workspace/types.ts +0 -186
  218. package/tests/integration/server.test.ts +0 -89
  219. package/tests/integration/tools.test.ts +0 -99
  220. package/tsconfig.json +0 -20
  221. package/vitest.config.ts +0 -9
  222. package/vitest.integration.config.ts +0 -10
  223. /package/{src → dist}/prompts/templates/sampling/crawl-action.md +0 -0
  224. /package/{src → dist}/prompts/templates/sampling/feature-discovery.md +0 -0
  225. /package/{src → dist}/prompts/templates/sampling/flow-discovery.md +0 -0
  226. /package/{src → dist}/prompts/templates/sampling/page-content-wrapper.md +0 -0
  227. /package/{src → dist}/prompts/templates/sampling/test-evaluation.md +0 -0
  228. /package/{src → dist}/prompts/templates/sampling/test-generation.md +0 -0
  229. /package/dist/tools/{webtest → retest}/crawl.d.ts +0 -0
  230. /package/dist/tools/{webtest → retest}/discover-features.d.ts +0 -0
  231. /package/dist/tools/{webtest → retest}/discover-flows.d.ts +0 -0
  232. /package/dist/tools/{webtest → retest}/generate-tests.d.ts +0 -0
  233. /package/dist/tools/{webtest → retest}/index.d.ts +0 -0
  234. /package/dist/tools/{webtest → retest}/index.js +0 -0
  235. /package/dist/tools/{webtest → retest}/run-test-case.d.ts +0 -0
  236. /package/dist/tools/{webtest → retest}/schemas.d.ts +0 -0
  237. /package/dist/tools/{webtest → retest}/schemas.js +0 -0
  238. /package/dist/tools/{webtest → retest}/start-analysis.d.ts +0 -0
@@ -1,501 +0,0 @@
1
- # webtest-tools Specification
2
-
3
- ## Purpose
4
-
5
- Defines the five webtest tools that compose the dynamic web testing workflow.
6
-
7
- ## ADDED Requirements
8
-
9
- ### Requirement: webtest_init Tool
10
-
11
- The system SHALL provide a `webtest_init` tool that initializes an analysis workspace for a target URL and focus.
12
-
13
- #### Scenario: Start analysis with valid URL
14
-
15
- - **GIVEN** the tool is called with a valid URL and focus
16
- - **WHEN** execution completes
17
- - **THEN** it SHALL generate a unique `analysisId`
18
- - **AND** create workspace directories
19
- - **AND** write initial `index.json` metadata
20
- - **AND** return `{ analysisId, workspaceRootUri, statusUri }`
21
-
22
- #### Scenario: Start analysis validates URL
23
-
24
- - **GIVEN** the tool is called with an invalid URL
25
- - **WHEN** validation occurs
26
- - **THEN** it SHALL return an error with message "Invalid URL format"
27
-
28
- #### Scenario: Start analysis normalizes domain for allowlist
29
-
30
- - **GIVEN** the tool is called with URL "https://example.com/path"
31
- - **WHEN** workspace is created
32
- - **THEN** the default `allowedDomains` SHALL include "example.com"
33
- - **AND** this SHALL be stored in workspace metadata
34
-
35
- #### Scenario: Start analysis accepts custom limits
36
-
37
- - **GIVEN** the tool is called with `limits: { maxSteps: 50, maxPages: 10, maxMinutes: 5 }`
38
- - **WHEN** workspace is created
39
- - **THEN** limits SHALL be stored in workspace metadata
40
- - **AND** subsequent crawls SHALL respect these limits
41
-
42
- ### Requirement: webtest_crawl_app Tool
43
-
44
- The system SHALL provide a `webtest_crawl_app` tool that dynamically explores a web application to achieve a goal.
45
-
46
- #### Scenario: Crawl navigates to starting URL
47
-
48
- - **GIVEN** the tool is called with a valid analysisId
49
- - **WHEN** crawl begins
50
- - **THEN** it SHALL launch Playwright MCP browser
51
- - **AND** navigate to the URL from analysis metadata
52
-
53
- #### Scenario: Crawl captures artifacts at each checkpoint
54
-
55
- - **GIVEN** a crawl iteration completes an action
56
- - **WHEN** state is captured
57
- - **THEN** it SHALL call Playwright MCP `browser_snapshot` for accessibility tree
58
- - **AND** call `browser_take_screenshot` for visual evidence
59
- - **AND** optionally extract HTML DOM
60
- - **AND** store artifacts in workspace with unique page IDs
61
-
62
- #### Scenario: Crawl uses sampling for next action
63
-
64
- - **GIVEN** crawl has captured current state
65
- - **WHEN** next action is needed
66
- - **THEN** it SHALL construct a sampling prompt with goal, history, current state
67
- - **AND** request next action via `sampling/createMessage`
68
- - **AND** validate and execute returned Playwright actions
69
-
70
- #### Scenario: Crawl terminates when goal satisfied
71
-
72
- - **GIVEN** crawl sampling returns `goalSatisfied: true`
73
- - **WHEN** this is detected
74
- - **THEN** crawl SHALL finalize with success status
75
- - **AND** write crawl summary to workspace
76
-
77
- #### Scenario: Crawl terminates when limits reached
78
-
79
- - **GIVEN** crawl has executed `maxSteps` actions
80
- - **WHEN** limit is checked
81
- - **THEN** crawl SHALL finalize with "limits_reached" status
82
- - **AND** preserve all collected artifacts
83
-
84
- #### Scenario: Crawl handles navigation loops
85
-
86
- - **GIVEN** crawl detects same page state 3 times consecutively
87
- - **WHEN** loop is detected
88
- - **THEN** it SHALL log a warning
89
- - **AND** request alternative action from sampling with loop context
90
-
91
- #### Scenario: Crawl triggers elicitation for cookie consent
92
-
93
- - **GIVEN** crawl detects a cookie consent dialog
94
- - **WHEN** elicitation is supported
95
- - **THEN** it SHALL call elicitation with options: "Accept", "Reject", "Dismiss"
96
- - **AND** execute the chosen action
97
-
98
- #### Scenario: Crawl triggers elicitation for blocking modal
99
-
100
- - **GIVEN** crawl detects a modal blocking navigation
101
- - **WHEN** elicitation is supported
102
- - **THEN** it SHALL call elicitation with options: "Close modal", "Interact with modal content"
103
-
104
- #### Scenario: Crawl triggers elicitation for ambiguous navigation
105
-
106
- - **GIVEN** sampling returns `needsElicitation: { type: "ambiguous", options: [...] }`
107
- - **WHEN** elicitation is supported
108
- - **THEN** it SHALL present the options to the user
109
- - **AND** use the selection to continue crawl
110
-
111
- #### Scenario: Crawl stops on authentication required
112
-
113
- - **GIVEN** crawl detects login form or auth wall
114
- - **WHEN** elicitation is supported
115
- - **THEN** it SHALL call elicitation with options: "Stop analysis", "Continue unauthenticated"
116
- - **AND** never request credentials
117
-
118
- #### Scenario: Crawl emits progress notifications
119
-
120
- - **GIVEN** crawl is running
121
- - **WHEN** each iteration completes
122
- - **THEN** it SHALL emit progress notification with step count, pages discovered, current intent
123
-
124
- #### Scenario: Crawl responds to cancellation
125
-
126
- - **GIVEN** crawl receives `notifications/cancelled`
127
- - **WHEN** cancellation is detected
128
- - **THEN** it SHALL stop the crawl loop promptly
129
- - **AND** finalize with "cancelled" status
130
- - **AND** preserve collected artifacts
131
-
132
- #### Scenario: Crawl returns fallback when sampling unavailable
133
-
134
- - **GIVEN** client does not support sampling
135
- - **WHEN** crawl needs next action
136
- - **THEN** it SHALL return `{ needsManualInput: true, promptUri, currentState }`
137
- - **AND** accept `manualNextActions` input to continue
138
-
139
- #### Scenario: Crawl outputs complete results
140
-
141
- - **GIVEN** crawl has finalized
142
- - **WHEN** output is returned
143
- - **THEN** it SHALL include `crawlId`, `crawlIndexUri`, `pages[]`, `summaryUri`
144
-
145
- ### Requirement: webtest_analyze_app Tool
146
-
147
- The system SHALL provide a `webtest_analyze_app` tool that reverse-engineers application structure from crawl data.
148
-
149
- #### Scenario: Analyze app loads crawl data
150
-
151
- - **GIVEN** the tool is called with valid analysisId and crawlId
152
- - **WHEN** execution begins
153
- - **THEN** it SHALL load crawl index and artifact references
154
- - **AND** load page snapshots for key pages
155
-
156
- #### Scenario: Analyze app uses sampling for analysis
157
-
158
- - **GIVEN** crawl data is loaded
159
- - **WHEN** analysis is performed
160
- - **THEN** it SHALL construct sampling prompt with crawl summary and snapshots
161
- - **AND** request structured analysis via `sampling/createMessage`
162
-
163
- #### Scenario: Analyze app extracts application purpose
164
-
165
- - **GIVEN** analysis sampling completes
166
- - **WHEN** results are processed
167
- - **THEN** output SHALL include identified app purpose
168
- - **AND** key entities (users, products, orders, etc.)
169
-
170
- #### Scenario: Analyze app identifies user flows
171
-
172
- - **GIVEN** analysis sampling completes
173
- - **WHEN** results are processed
174
- - **THEN** output SHALL include discovered user flows
175
- - **AND** each flow SHALL have id, name, description, steps
176
-
177
- #### Scenario: Analyze app suggests assertions
178
-
179
- - **GIVEN** analysis sampling completes
180
- - **WHEN** results are processed
181
- - **THEN** output SHALL include suggested assertions for testing
182
- - **AND** potential risks or edge cases
183
-
184
- #### Scenario: Analyze app writes markdown report
185
-
186
- - **GIVEN** analysis is complete
187
- - **WHEN** output is generated
188
- - **THEN** it SHALL write `app-analysis.md` resource to workspace
189
-
190
- #### Scenario: Analyze app outputs URIs
191
-
192
- - **GIVEN** analysis is complete
193
- - **WHEN** tool returns
194
- - **THEN** it SHALL include `appAnalysisUri` and `flowsIndexUri`
195
-
196
- ### Requirement: webtest_generate_tests Tool
197
-
198
- The system SHALL provide a `webtest_generate_tests` tool that produces test cases from application analysis.
199
-
200
- #### Scenario: Generate tests loads analysis
201
-
202
- - **GIVEN** the tool is called with valid analysisId and appAnalysisUri
203
- - **WHEN** execution begins
204
- - **THEN** it SHALL load app analysis and flows from workspace
205
-
206
- #### Scenario: Generate tests uses sampling
207
-
208
- - **GIVEN** analysis is loaded
209
- - **WHEN** test generation is performed
210
- - **THEN** it SHALL construct sampling prompt with analysis, flows, strategy
211
- - **AND** request test cases via `sampling/createMessage`
212
-
213
- #### Scenario: Generate tests applies strategy
214
-
215
- - **GIVEN** tool is called with `testStrategy: { count: 5, types: ["smoke", "negative"] }`
216
- - **WHEN** sampling prompt is built
217
- - **THEN** it SHALL instruct model to generate 5 tests covering smoke and negative scenarios
218
-
219
- #### Scenario: Generate tests outputs structured format
220
-
221
- - **GIVEN** test generation completes
222
- - **WHEN** results are written
223
- - **THEN** it SHALL produce `tests.md` with human-readable format
224
- - **AND** `tests.json` with structured test definitions
225
-
226
- #### Scenario: Test case structure is complete
227
-
228
- - **GIVEN** tests.json is generated
229
- - **WHEN** a test case is examined
230
- - **THEN** it SHALL include: id, name, purpose, preconditions, steps[], expected results, priority
231
-
232
- #### Scenario: Generate tests outputs URIs
233
-
234
- - **GIVEN** generation is complete
235
- - **WHEN** tool returns
236
- - **THEN** it SHALL include `testsUri` and `testIndexUri`
237
-
238
- ### Requirement: webtest_run_tests Tool
239
-
240
- The system SHALL provide a `webtest_run_tests` tool that executes a test case with evidence capture.
241
-
242
- #### Scenario: Run test case loads test definition
243
-
244
- - **GIVEN** the tool is called with valid analysisId and testCaseId
245
- - **WHEN** execution begins
246
- - **THEN** it SHALL load test case from tests index
247
- - **AND** validate test case exists
248
-
249
- #### Scenario: Run test case executes steps sequentially
250
-
251
- - **GIVEN** test case has multiple steps
252
- - **WHEN** execution runs
253
- - **THEN** it SHALL execute each step in order
254
- - **AND** capture state before and after each step
255
-
256
- #### Scenario: Run test case uses sampling for step translation
257
-
258
- - **GIVEN** a test step needs execution
259
- - **WHEN** translation is needed
260
- - **THEN** it SHALL use sampling to convert step description to Playwright actions
261
- - **AND** validate actions before execution
262
-
263
- #### Scenario: Run test case captures evidence
264
-
265
- - **GIVEN** a step is executed
266
- - **WHEN** evidence is captured
267
- - **THEN** it SHALL take screenshot after action
268
- - **AND** capture accessibility snapshot
269
- - **AND** store with step identifier
270
-
271
- #### Scenario: Run test case evaluates pass/fail
272
-
273
- - **GIVEN** a step has executed
274
- - **WHEN** evaluation occurs
275
- - **THEN** it SHALL use sampling to compare expected vs actual
276
- - **AND** record pass or fail with reason
277
-
278
- #### Scenario: Run test case continues on step failure
279
-
280
- - **GIVEN** a step fails
281
- - **WHEN** failure is recorded
282
- - **THEN** execution SHALL continue to next step (unless critical)
283
- - **AND** overall test status SHALL be "failed"
284
-
285
- #### Scenario: Run test case emits progress
286
-
287
- - **GIVEN** test is running
288
- - **WHEN** each step completes
289
- - **THEN** it SHALL emit progress notification with step number, status
290
-
291
- #### Scenario: Run test case responds to cancellation
292
-
293
- - **GIVEN** test receives `notifications/cancelled`
294
- - **WHEN** cancellation is detected
295
- - **THEN** it SHALL stop after current step
296
- - **AND** finalize with "cancelled" status and partial results
297
-
298
- #### Scenario: Run test case outputs report
299
-
300
- - **GIVEN** test execution completes
301
- - **WHEN** output is generated
302
- - **THEN** it SHALL write `report.md` with pass/fail summary, step details, evidence links
303
- - **AND** `artifacts.json` with structured run data
304
-
305
- #### Scenario: Run test case returns URIs
306
-
307
- - **GIVEN** execution is complete
308
- - **WHEN** tool returns
309
- - **THEN** it SHALL include `testRunId`, `reportUri`, `runArtifactsIndexUri`
310
-
311
- ### Requirement: Playwright MCP Integration
312
-
313
- The system SHALL orchestrate an external Playwright MCP server for browser automation with dynamic tool discovery.
314
-
315
- #### Scenario: Playwright MCP is spawned on first use
316
-
317
- - **GIVEN** a webtest tool needs browser access
318
- - **WHEN** Playwright client is accessed
319
- - **THEN** it SHALL spawn Playwright MCP server as subprocess if not running
320
- - **AND** connect via stdio transport
321
-
322
- #### Scenario: Playwright MCP tools are discovered dynamically
323
-
324
- - **GIVEN** Playwright MCP is connected
325
- - **WHEN** connection is established
326
- - **THEN** it SHALL call `tools/list` to discover available tools
327
- - **AND** build a capability adapter mapping canonical operations to actual tool names
328
- - **AND** cache the mapping for the session lifetime
329
-
330
- #### Scenario: Capability adapter maps canonical operations
331
-
332
- - **GIVEN** Playwright MCP tools have been discovered
333
- - **WHEN** the adapter is queried for operation "snapshot"
334
- - **THEN** it SHALL return the matching tool (e.g., `browser_snapshot` or `playwright_snapshot`)
335
- - **AND** if multiple matches exist, prefer the most specific
336
-
337
- #### Scenario: Missing required capability logs warning
338
-
339
- - **GIVEN** Playwright MCP tools have been discovered
340
- - **WHEN** a required capability (snapshot, screenshot, click, type, navigate) is missing
341
- - **THEN** it SHALL log a warning with the missing capability
342
- - **AND** tools requiring that capability SHALL return an error when invoked
343
-
344
- #### Scenario: Playwright actions are executed via adapter
345
-
346
- - **GIVEN** a crawl action specifies `{ tool: "click", args: { selector: "button" } }`
347
- - **WHEN** action is executed
348
- - **THEN** it SHALL resolve "click" through the capability adapter
349
- - **AND** call the resolved Playwright MCP tool with appropriate arguments
350
- - **AND** return the result
351
-
352
- #### Scenario: Playwright MCP version differences are handled
353
-
354
- - **GIVEN** different Playwright MCP implementations may have different tool names
355
- - **WHEN** the adapter maps tools
356
- - **THEN** it SHALL check for common variants:
357
- - `browser_*` prefix (Microsoft implementation)
358
- - `playwright_*` prefix (alternative implementations)
359
- - unprefixed names
360
- - **AND** log the detected implementation variant
361
-
362
- #### Scenario: Playwright MCP is terminated on shutdown
363
-
364
- - **GIVEN** the server receives shutdown signal
365
- - **WHEN** shutdown begins
366
- - **THEN** it SHALL terminate Playwright MCP subprocess
367
- - **AND** wait for clean exit
368
-
369
- ### Requirement: Crawl Checkpointing
370
-
371
- The system SHALL implement checkpointing during crawl to enable resumption and provide partial results on failure.
372
-
373
- #### Scenario: Checkpoint is written periodically
374
-
375
- - **GIVEN** a crawl is in progress
376
- - **WHEN** N steps have completed (configurable, default 5)
377
- - **THEN** it SHALL write a checkpoint to `webtest://{analysisId}/crawls/{crawlId}/checkpoint.json`
378
- - **AND** the checkpoint SHALL include: current step, visited pages, action history, goal progress
379
-
- #### Scenario: Checkpoint is written on each page capture
-
- - **GIVEN** a crawl captures a new page
- - **WHEN** page artifacts are saved
- - **THEN** it SHALL update the crawl index immediately
- - **AND** emit `notifications/resources/list_changed` if supported
-
- #### Scenario: Crawl can resume from checkpoint
-
- - **GIVEN** a crawl was interrupted (cancelled, error, timeout)
- - **AND** a checkpoint exists
- - **WHEN** `webtest_crawl_app` is called with `resume: true`
- - **THEN** it SHALL load the checkpoint
- - **AND** continue from the last recorded state
-
- #### Scenario: Checkpoint includes DOM signature for loop detection
-
- - **GIVEN** a checkpoint is written
- - **WHEN** the current page state is recorded
- - **THEN** it SHALL include a DOM signature (hash of key structural elements)
- - **AND** this signature SHALL be used for loop detection
-
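The spec does not say which elements or which hash make up the DOM signature. The sketch below is one plausible reading: it hashes a list of tag/role/id triples with an FNV-1a-style function over UTF-16 code units, used here only as a dependency-free stand-in. `StructuralElement`, `fnv1a32`, and `domSignature` are hypothetical names.

```typescript
// Hypothetical description of a "key structural element"; the real selection is unspecified.
interface StructuralElement {
  tag: string;
  role?: string;
  id?: string;
}

// FNV-1a-style 32-bit hash over UTF-16 code units, returned as 8 hex digits.
function fnv1a32(input: string): string {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // multiply by the FNV prime, keep unsigned
  }
  return ("00000000" + hash.toString(16)).slice(-8);
}

// Serializes the elements in document order and hashes the result.
// Text content is deliberately ignored so that copy changes do not alter the signature.
function domSignature(elements: ReadonlyArray<StructuralElement>): string {
  const serialized = elements
    .map((el) => `${el.tag.toLowerCase()}|${el.role ?? ""}|${el.id ?? ""}`)
    .join("\n");
  return fnv1a32(serialized);
}
```

Two captures of the same page structure yield the same signature, which is the property loop detection relies on.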
- ### Requirement: Crawl Loop Detection and Prevention
-
- The system SHALL detect and prevent infinite crawl loops.
-
- #### Scenario: Same page state detected consecutively
-
- - **GIVEN** crawl detects same DOM signature 3 times consecutively
- - **WHEN** loop is detected
- - **THEN** it SHALL log a warning with loop details
- - **AND** inform sampling of the loop condition
- - **AND** request alternative action with loop context
-
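Detecting three identical signatures in a row only needs the previous signature and a run counter. A minimal sketch, assuming signatures arrive as strings; `createSignatureLoopDetector` is a hypothetical helper, and the threshold of 3 is taken from the scenario.

```typescript
// Hypothetical helper: tracks the current run of identical DOM signatures.
function createSignatureLoopDetector(threshold: number = 3) {
  let last: string | undefined;
  let run = 0;
  return {
    // Records one observed signature; returns true once the same
    // signature has been seen `threshold` times consecutively.
    observe(signature: string): boolean {
      if (signature === last) {
        run += 1;
      } else {
        last = signature;
        run = 1;
      }
      return run >= threshold;
    },
  };
}
```

When `observe` returns `true`, the caller would log the warning and pass the loop context to sampling, as the **THEN**/**AND** clauses require.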
- #### Scenario: URL cycle detected
-
- - **GIVEN** crawl visits the same URL more than 3 times
- - **WHEN** cycle is detected
- - **THEN** it SHALL log a warning
- - **AND** exclude that URL from future navigation suggestions
-
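One way to read the URL-cycle rule is a per-URL visit counter plus an exclusion set. The sketch below assumes URLs are compared as exact strings (no normalization); `createUrlCycleTracker` and its method names are hypothetical.

```typescript
// Hypothetical helper: counts visits per URL and excludes URLs visited too often.
function createUrlCycleTracker(maxVisits: number = 3) {
  const visits = new Map<string, number>();
  const excluded = new Set<string>();
  return {
    // Records a visit; returns true when the URL has now been visited
    // more than `maxVisits` times (i.e. a cycle is detected).
    recordVisit(url: string): boolean {
      const count = (visits.get(url) ?? 0) + 1;
      visits.set(url, count);
      if (count > maxVisits) {
        excluded.add(url);
        return true;
      }
      return false;
    },
    // Drops excluded URLs from a list of navigation suggestions.
    filterSuggestions(urls: ReadonlyArray<string>): string[] {
      return urls.filter((url) => !excluded.has(url));
    },
  };
}
```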
- #### Scenario: Action repeat detected
-
- - **GIVEN** the same action (tool + args) is attempted 3 times consecutively
- - **WHEN** repeat is detected
- - **THEN** it SHALL reject the repeated action
- - **AND** request a different action from sampling with repeat context
-
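Comparing actions by "tool + args" needs a stable key, since two argument objects with the same fields may list them in different orders. The sketch below sorts top-level keys only, which is an assumption that arguments are flat; `CrawlAction`, `actionKey`, and `createRepeatGuard` are hypothetical names.

```typescript
// Hypothetical action shape, following the `{ tool, args }` form used in the spec.
interface CrawlAction {
  tool: string;
  args: Record<string, unknown>;
}

// Builds a comparison key that ignores top-level key order in `args`.
// Nested objects are not normalized in this sketch.
function actionKey(action: CrawlAction): string {
  const entries = Object.keys(action.args)
    .sort()
    .map((key) => [key, action.args[key]]);
  return action.tool + ":" + JSON.stringify(entries);
}

// Returns a guard whose `attempt` yields false (reject) once the same
// action has been attempted `threshold` times in a row.
function createRepeatGuard(threshold: number = 3) {
  let lastKey: string | undefined;
  let streak = 0;
  return {
    attempt(action: CrawlAction): boolean {
      const key = actionKey(action);
      if (key === lastKey) {
        streak += 1;
      } else {
        lastKey = key;
        streak = 1;
      }
      return streak < threshold;
    },
  };
}
```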
- #### Scenario: Loop detection state is included in sampling prompts
-
- - **GIVEN** loop detection has flagged potential issues
- - **WHEN** the next sampling prompt is built
- - **THEN** it SHALL include:
-   - URLs visited more than once
-   - Recently repeated actions
-   - DOM signature history
- - **AND** instruct the model to avoid these patterns
-
- ### Requirement: Crawl Budget Enforcement
-
- The system SHALL enforce time, step, and page limits during crawl.
-
- #### Scenario: Step limit is enforced
-
- - **GIVEN** crawl has a `maxSteps` limit
- - **WHEN** the step count reaches the limit
- - **THEN** crawl SHALL finalize with status "limits_reached"
- - **AND** include all artifacts collected up to that point
-
- #### Scenario: Time limit is enforced
-
- - **GIVEN** crawl has a `maxMinutes` limit
- - **WHEN** elapsed time reaches the limit
- - **THEN** crawl SHALL finalize with status "timeout"
- - **AND** complete current action before stopping
- - **AND** preserve all collected artifacts
-
- #### Scenario: Page limit is enforced
-
- - **GIVEN** crawl has a `maxPages` limit
- - **WHEN** the unique page count reaches the limit
- - **THEN** crawl SHALL stop discovering new pages
- - **AND** continue actions on already-visited pages until goal or step limit
-
- #### Scenario: Budget status is reported in progress
-
- - **GIVEN** crawl is running with limits
- - **WHEN** progress notification is emitted
- - **THEN** it SHALL include budget status:
-   - `stepsUsed` / `maxSteps`
-   - `minutesElapsed` / `maxMinutes`
-   - `pagesDiscovered` / `maxPages`
-
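The three limits behave differently: steps and time end the crawl, while the page limit only stops discovery. A minimal sketch of that decision follows. The field names come from the budget-status list above; the `"running"` status, the `allowNewPages` flag, and checking time before steps are assumptions of this example.

```typescript
interface CrawlLimits {
  maxSteps: number;
  maxMinutes: number;
  maxPages: number;
}

interface CrawlUsage {
  stepsUsed: number;
  minutesElapsed: number;
  pagesDiscovered: number;
}

// "limits_reached" and "timeout" are the statuses named in the spec;
// "running" is a placeholder for "no terminal limit hit yet".
interface BudgetDecision {
  status: "running" | "limits_reached" | "timeout";
  allowNewPages: boolean;
}

// Evaluates the budget after an action has completed, so the current
// action is never interrupted mid-flight.
function evaluateBudget(usage: CrawlUsage, limits: CrawlLimits): BudgetDecision {
  if (usage.minutesElapsed >= limits.maxMinutes) {
    return { status: "timeout", allowNewPages: false };
  }
  if (usage.stepsUsed >= limits.maxSteps) {
    return { status: "limits_reached", allowNewPages: false };
  }
  // The page limit does not end the crawl; it only stops new-page discovery.
  return { status: "running", allowNewPages: usage.pagesDiscovered < limits.maxPages };
}
```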
- ### Requirement: Security Domain Enforcement
-
- The system SHALL enforce domain allowlists for all navigation actions.
-
- #### Scenario: Navigation to allowed domain succeeds
-
- - **GIVEN** allowedDomains includes "example.com"
- - **WHEN** Playwright action navigates to "https://example.com/page"
- - **THEN** navigation SHALL be allowed
-
- #### Scenario: Navigation to disallowed domain is blocked
-
- - **GIVEN** allowedDomains includes only "example.com"
- - **WHEN** sampling returns action to navigate to "https://malicious.com"
- - **THEN** the action SHALL be rejected
- - **AND** error logged with attempted URL
-
- #### Scenario: Subdomain matching follows rules
-
- - **GIVEN** allowedDomains includes "example.com"
- - **WHEN** navigation to "sub.example.com" is attempted
- - **THEN** it SHALL be allowed (subdomain of allowed domain)
-
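The three navigation scenarios above reduce to a hostname check: an exact match or a dot-delimited suffix match against each allowed domain. The sketch below assumes full URLs as input and treats unparseable input as disallowed; `isUrlAllowed` is a hypothetical name. The leading dot in the suffix test matters, because a plain `endsWith("example.com")` would also accept `notexample.com`.

```typescript
// Hypothetical helper: true when the URL's host is an allowed domain or a subdomain of one.
function isUrlAllowed(url: string, allowedDomains: ReadonlyArray<string>): boolean {
  let host: string;
  try {
    host = new URL(url).hostname.toLowerCase();
  } catch {
    return false; // not a parseable absolute URL: reject rather than guess
  }
  return allowedDomains.some((entry) => {
    const domain = entry.toLowerCase();
    // Exact host match, or a subdomain separated by a dot.
    return host === domain || host.endsWith("." + domain);
  });
}
```

For the link-click scenario, the same check would be applied to the page URL observed after the click completes.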
- #### Scenario: Link clicks are validated
-
- - **GIVEN** a click action may navigate to external domain
- - **WHEN** click is executed
- - **THEN** resulting URL SHALL be checked post-navigation
- - **AND** if disallowed, navigate back and report error