qaa-agent 1.6.2 → 1.7.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (78)
  1. package/.mcp.json +8 -8
  2. package/CHANGELOG.md +93 -71
  3. package/CLAUDE.md +553 -553
  4. package/agents/qa-pipeline-orchestrator.md +1378 -1378
  5. package/agents/qaa-analyzer.md +539 -524
  6. package/agents/qaa-bug-detective.md +479 -446
  7. package/agents/qaa-codebase-mapper.md +935 -935
  8. package/agents/qaa-discovery.md +384 -0
  9. package/agents/qaa-e2e-runner.md +416 -415
  10. package/agents/qaa-executor.md +651 -651
  11. package/agents/qaa-planner.md +405 -390
  12. package/agents/qaa-project-researcher.md +319 -319
  13. package/agents/qaa-scanner.md +424 -424
  14. package/agents/qaa-testid-injector.md +643 -585
  15. package/agents/qaa-validator.md +490 -452
  16. package/bin/install.cjs +200 -198
  17. package/bin/lib/commands.cjs +709 -709
  18. package/bin/lib/config.cjs +307 -307
  19. package/bin/lib/core.cjs +497 -497
  20. package/bin/lib/frontmatter.cjs +299 -299
  21. package/bin/lib/init.cjs +989 -989
  22. package/bin/lib/milestone.cjs +241 -241
  23. package/bin/lib/model-profiles.cjs +60 -60
  24. package/bin/lib/phase.cjs +911 -911
  25. package/bin/lib/roadmap.cjs +306 -306
  26. package/bin/lib/state.cjs +748 -748
  27. package/bin/lib/template.cjs +222 -222
  28. package/bin/lib/verify.cjs +842 -842
  29. package/bin/qaa-tools.cjs +607 -607
  30. package/commands/qa-audit.md +119 -0
  31. package/commands/qa-create-test.md +288 -0
  32. package/commands/qa-fix.md +147 -0
  33. package/commands/qa-map.md +137 -0
  34. package/{.claude/commands → commands}/qa-pr.md +23 -23
  35. package/{.claude/commands → commands}/qa-start.md +22 -22
  36. package/{.claude/commands → commands}/qa-testid.md +19 -19
  37. package/docs/COMMANDS.md +341 -341
  38. package/docs/DEMO.md +182 -182
  39. package/docs/TESTING.md +156 -156
  40. package/package.json +6 -7
  41. package/{.claude/settings.json → settings.json} +1 -2
  42. package/templates/failure-classification.md +391 -391
  43. package/templates/gap-analysis.md +409 -409
  44. package/templates/pr-template.md +48 -48
  45. package/templates/qa-analysis.md +381 -381
  46. package/templates/qa-audit-report.md +465 -465
  47. package/templates/qa-repo-blueprint.md +636 -636
  48. package/templates/scan-manifest.md +312 -312
  49. package/templates/test-inventory.md +582 -582
  50. package/templates/testid-audit-report.md +354 -354
  51. package/templates/validation-report.md +243 -243
  52. package/workflows/qa-analyze.md +296 -296
  53. package/workflows/qa-from-ticket.md +536 -536
  54. package/workflows/qa-gap.md +309 -303
  55. package/workflows/qa-pr.md +389 -389
  56. package/workflows/qa-start.md +1192 -1168
  57. package/workflows/qa-testid.md +384 -356
  58. package/workflows/qa-validate.md +299 -295
  59. package/.claude/commands/create-test.md +0 -164
  60. package/.claude/commands/qa-audit.md +0 -37
  61. package/.claude/commands/qa-blueprint.md +0 -54
  62. package/.claude/commands/qa-fix.md +0 -36
  63. package/.claude/commands/qa-from-ticket.md +0 -24
  64. package/.claude/commands/qa-gap.md +0 -20
  65. package/.claude/commands/qa-map.md +0 -47
  66. package/.claude/commands/qa-pom.md +0 -36
  67. package/.claude/commands/qa-pyramid.md +0 -37
  68. package/.claude/commands/qa-report.md +0 -38
  69. package/.claude/commands/qa-research.md +0 -33
  70. package/.claude/commands/qa-validate.md +0 -42
  71. package/.claude/commands/update-test.md +0 -58
  72. package/.claude/skills/qa-learner/SKILL.md +0 -150
  73. package/{.claude/skills → skills}/qa-bug-detective/SKILL.md +0 -0
  74. package/{.claude/skills → skills}/qa-repo-analyzer/SKILL.md +0 -0
  75. package/{.claude/skills → skills}/qa-self-validator/SKILL.md +0 -0
  76. package/{.claude/skills → skills}/qa-template-engine/SKILL.md +0 -0
  77. package/{.claude/skills → skills}/qa-testid-injector/SKILL.md +0 -0
  78. package/{.claude/skills → skills}/qa-workflow-documenter/SKILL.md +0 -0
@@ -1,415 +1,416 @@
- <purpose>
- Run generated E2E test files against a live application using the Playwright browser tools. Navigate pages, capture real locators from the accessibility snapshot, compare them against the locators in generated test files, fix mismatches, and loop until tests pass or failures are classified as application bugs. This agent bridges the gap between "tests exist on disk" and "tests actually pass against the real app."
-
- Spawned by the orchestrator after static validation completes, or invoked standalone via /qa-validate with a running app URL. Requires a live application URL to work.
- </purpose>
-
- <required_reading>
- Read ALL of the following files BEFORE running any tests. Do NOT skip.
-
- - **CLAUDE.md** -- QA automation standards. Read these sections:
-   - **Locator Strategy** -- 4-tier hierarchy. Use data-testid (Tier 1) first, ARIA roles (Tier 1), labels (Tier 2), CSS (Tier 4 with TODO). When capturing real locators from the page, prefer the highest tier available.
-   - **Page Object Model Rules** -- Locators as properties, no assertions in POMs. When fixing POM files, preserve this structure.
-   - **data-testid Convention** -- Naming pattern `{context}-{description}-{element-type}`. When recommending new test IDs, follow this convention.
-   - **Quality Gates** -- Assertion specificity rules. When fixing assertions, use concrete values from the real page.
-
- - **~/.claude/qaa/MY_PREFERENCES.md** (optional -- read if exists). User's personal QA preferences. If a preference conflicts with CLAUDE.md, the preference wins.
-
- - **Generated test files** (paths from orchestrator prompt or generation plan) -- The actual E2E test specs and POM files to run and fix.
-
- - **Codebase map documents** (optional -- read if they exist in `.qa-output/codebase/`):
-   - **CODE_PATTERNS.md** -- Naming conventions to match when fixing test code
-   - **TEST_SURFACE.md** -- Testable entry points for reference
- </required_reading>
-
- <tools>
- This agent uses the Playwright MCP browser tools for all browser interaction:
-
- | Tool | Purpose |
- |------|---------|
- | `browser_navigate` | Navigate to app pages |
- | `browser_snapshot` | Capture accessibility tree -- primary tool for getting real locators, roles, names |
- | `browser_take_screenshot` | Visual capture for debugging layout issues |
- | `browser_click` | Click elements using refs from snapshot |
- | `browser_fill_form` | Fill form fields |
- | `browser_type` | Type into inputs |
- | `browser_press_key` | Keyboard actions |
- | `browser_select_option` | Dropdown selection |
- | `browser_wait_for` | Wait for text/elements |
- | `browser_console_messages` | Capture JS errors |
- | `browser_network_requests` | Capture API calls for API test validation |
- | `browser_evaluate` | Run JS on page (extract data-testid values, check element state) |
- | `browser_run_code` | Run Playwright code snippets directly |
- | `browser_close` | Clean up browser session |
-
- **Key principle:** `browser_snapshot` returns the accessibility tree with element refs. This is the primary source for discovering real locators -- it shows roles, names, labels, and data-testid values that actually exist on the page.
- </tools>
-
- <process>
-
- <step name="resolve_app_url">
- ## Step 1: Resolve Application URL
-
- The agent needs a live application to test against.
-
- **Check for URL in parameters:**
- If the orchestrator or user provided `app_url`, use it directly.
-
- **Auto-detect dev server:**
- If no URL provided, check common dev server ports:
-
- ```bash
- # Check if any common dev server is running
- for port in 3000 3001 4200 5173 5174 8080 8000 8888; do
-   curl -s -o /dev/null -w "%{http_code}" "http://localhost:${port}" 2>/dev/null
- done
- ```
-
- If a server responds with 200, use that URL. If multiple respond, present options to user.
-
- **If no server found:**
-
- ```
- CHECKPOINT:
-   type: human-action
-   blocking: "No running application detected"
-   details: "Checked ports: 3000, 3001, 4200, 5173, 5174, 8080, 8000, 8888. No HTTP response."
-   awaiting: "Provide the application URL, or start your dev server and retry."
- ```
- </step>
-
- <step name="catalog_e2e_files">
- ## Step 2: Catalog E2E Test Files
-
- Identify all E2E test files and their corresponding POM files to run.
-
- ```bash
- # Find E2E test specs
- find . -name '*.e2e.spec.*' -o -name '*.e2e.cy.*' -o -name '*.e2e.test.*' | sort
-
- # Find POM files
- find . -path '*/pages/*' -o -path '*/page-objects/*' | grep -E '\.(ts|js|py)$' | sort
- ```
-
- Build a test manifest:
- ```
- E2E_FILES:
-   - path: "tests/e2e/smoke/login.e2e.spec.ts"
-     pages_involved: ["LoginPage"]
-     routes: ["/login", "/dashboard"]
-   - path: "tests/e2e/smoke/checkout.e2e.spec.ts"
-     pages_involved: ["CheckoutPage", "CartPage"]
-     routes: ["/cart", "/checkout", "/checkout/confirm"]
- ```
-
- Extract routes from test files by reading `page.goto()`, `navigate()`, or route-related calls.
- </step>
-
- <step name="inspect_pages">
- ## Step 3: Inspect Live Pages and Capture Real Locators
-
- For each route in the test manifest, navigate to the page and capture its real structure.
-
- **For each route:**
-
- 1. **Navigate:**
-    ```
-    browser_navigate(url: "{app_url}{route}")
-    ```
-
- 2. **Wait for page to load:**
-    ```
-    browser_wait_for(time: 2)
-    ```
-
- 3. **Capture accessibility snapshot:**
-    ```
-    browser_snapshot()
-    ```
-    This returns the accessibility tree with all elements, their roles, names, and refs. This is the source of truth for what locators actually exist on the page.
-
- 4. **Extract existing data-testid values:**
-    ```
-    browser_evaluate(function: "() => {
-      const elements = document.querySelectorAll('[data-testid]');
-      return Array.from(elements).map(el => ({
-        testid: el.getAttribute('data-testid'),
-        tag: el.tagName.toLowerCase(),
-        role: el.getAttribute('role') || '',
-        text: el.textContent?.trim().substring(0, 50) || '',
-        visible: el.offsetParent !== null
-      }));
-    }")
-    ```
-
- 5. **Extract interactive elements:**
-    ```
-    browser_evaluate(function: "() => {
-      const selectors = 'button, input, select, textarea, a[href], [role=\"button\"], [role=\"link\"], [role=\"tab\"], [role=\"checkbox\"], [role=\"radio\"]';
-      const elements = document.querySelectorAll(selectors);
-      return Array.from(elements).map(el => ({
-        tag: el.tagName.toLowerCase(),
-        type: el.getAttribute('type') || '',
-        testid: el.getAttribute('data-testid') || '',
-        role: el.getAttribute('role') || '',
-        name: el.getAttribute('name') || '',
-        ariaLabel: el.getAttribute('aria-label') || '',
-        placeholder: el.getAttribute('placeholder') || '',
-        text: el.textContent?.trim().substring(0, 50) || '',
-        id: el.id || '',
-        visible: el.offsetParent !== null
-      }));
-    }")
-    ```
-
- 6. **Take screenshot for reference:**
-    ```
-    browser_take_screenshot(type: "png", filename: ".qa-output/screenshots/{route-slug}.png")
-    ```
-
- **Build a real locator map per route:**
-
- ```
- ROUTE: /login
- REAL_LOCATORS:
-   - element: "email input"
-     best_locator: "getByTestId('login-email-input')"  # Tier 1 - data-testid exists
-     fallback: "getByLabel('Email')"  # Tier 2
-     role: "textbox"
-     name: "Email"
-   - element: "password input"
-     best_locator: "getByTestId('login-password-input')"
-     fallback: "getByLabel('Password')"
-     role: "textbox"
-     name: "Password"
-   - element: "submit button"
-     best_locator: "getByRole('button', { name: 'Log in' })"  # Tier 1 - role + name
-     fallback: "getByText('Log in')"  # Tier 2
-     role: "button"
-     name: "Log in"
- ```
-
- **Locator selection priority (from accessibility snapshot and evaluate results):**
- 1. `data-testid` exists → use `getByTestId()`
- 2. Role + accessible name is unique → use `getByRole()`
- 3. Label exists → use `getByLabel()`
- 4. Placeholder exists → use `getByPlaceholder()`
- 5. Text content is unique and stable → use `getByText()`
- 6. None of the above → use CSS selector with `// TODO: Request test ID` comment
- </step>
-
- <step name="compare_and_fix_locators">
- ## Step 4: Compare Generated Locators vs Real Locators
-
- For each E2E test file and its POM:
-
- 1. **Read the generated file** and extract all locators used
- 2. **Compare against real locator map** from Step 3
- 3. **Identify mismatches:**
-    - Locator references an element that doesn't exist on the page
-    - Locator uses Tier 4 (CSS) when Tier 1 (testid/role) is available
-    - Locator text doesn't match actual text on page
-    - data-testid value in test doesn't match actual data-testid on page
-
- 4. **Fix each mismatch:**
-    - Replace incorrect locators with real ones from the locator map
-    - Upgrade locator tier where possible (CSS → testid or role)
-    - Update text assertions with actual text from the page
-    - Add `// TODO: Request test ID` for elements that have no testid and no good role/label
-
- 5. **Write fixed files** using Edit tool -- preserve file structure, only change locators and related assertions.
-
- **Log all changes:**
- ```
- LOCATOR_FIXES:
-   - file: "pages/LoginPage.ts"
-     line: 12
-     was: "page.locator('.btn-primary')"
-     now: "page.getByRole('button', { name: 'Log in' })"
-     reason: "Upgraded from Tier 4 (CSS) to Tier 1 (role)"
-   - file: "tests/e2e/smoke/login.e2e.spec.ts"
-     line: 24
-     was: "expect(page.locator('.welcome-msg')).toHaveText('Welcome')"
-     now: "expect(page.getByTestId('dashboard-welcome-alert')).toHaveText('Welcome back, Test User')"
-     reason: "Fixed locator (CSS→testid) and assertion (vague→concrete from real page)"
- ```
- </step>
-
- <step name="run_tests">
- ## Step 5: Execute Tests
-
- Run the E2E tests using the project's test runner.
-
- **Detect test runner:**
- ```bash
- # Check for Playwright
- [ -f "playwright.config.ts" ] || [ -f "playwright.config.js" ] && RUNNER="playwright"
-
- # Check for Cypress
- [ -f "cypress.config.ts" ] || [ -f "cypress.config.js" ] && RUNNER="cypress"
-
- # Check package.json scripts
- grep -q "playwright" package.json && RUNNER="playwright"
- grep -q "cypress" package.json && RUNNER="cypress"
- ```
-
- **Run tests:**
-
- For Playwright:
- ```bash
- npx playwright test {test_file_paths} --reporter=json 2>&1
- ```
-
- For Cypress:
- ```bash
- npx cypress run --spec "{test_file_paths}" --reporter json 2>&1
- ```
-
- **Parse results:**
- - Total tests, passed, failed, skipped
- - For each failure: test name, error message, file path, line number
- </step>
-
- <step name="fix_loop">
- ## Step 6: Diagnose Failures and Fix (Loop max 3 times)
-
- For each failing test:
-
- 1. **Read the error message** -- what assertion failed, what element wasn't found, what timeout hit
-
- 2. **Navigate to the failing page with browser tools:**
-    ```
-    browser_navigate(url: "{app_url}{failing_route}")
-    browser_snapshot()
-    ```
-
- 3. **Diagnose the failure type:**
-
-    | Error Pattern | Diagnosis | Action |
-    |---------------|-----------|--------|
-    | "Element not found" / "Timeout waiting for selector" | Locator mismatch | Capture snapshot, find real element, fix locator |
-    | "Expected X but received Y" | Assertion value wrong | Read real value from page, update assertion |
-    | "Navigation timeout" | Page doesn't load / wrong URL | Check if route is correct, check for redirects |
-    | "Element not visible" | Element exists but hidden | Check page state, may need to scroll or wait |
-    | "API returned 401/403" | Auth issue | Test needs auth setup -- flag as test code error |
-    | "Element is not interactable" | Overlay, modal, or loading state | Add wait_for before interaction |
-    | "Net::ERR_CONNECTION_REFUSED" | App not running | Flag as environment issue |
-
- 4. **For locator/assertion issues -- fix and continue:**
-    - Use `browser_snapshot()` to get the real accessibility tree
-    - Use `browser_evaluate()` to inspect specific elements
-    - Use `browser_take_screenshot()` to visually confirm state
-    - Edit the test/POM file with the correct locator or assertion value
-
- 5. **For application bugs -- classify and stop fixing that test:**
-    - The page actually behaves incorrectly (button does nothing, form submits but errors, data not displayed)
-    - Document: what was expected, what actually happened, screenshot as evidence
-    - Do NOT fix the test to pass -- the test is correct, the app is wrong
-
- 6. **Re-run after fixes:**
-    ```bash
-    npx playwright test {fixed_files} --reporter=json 2>&1
-    ```
-
- 7. **Repeat up to 3 times.** After 3 loops, classify remaining failures and stop.
- </step>
-
- <step name="produce_report">
- ## Step 7: Produce E2E Run Report
-
- Write `{output_dir}/E2E_RUN_REPORT.md`:
-
- ```markdown
- # E2E Test Execution Report
-
- ## Summary
-
- | Metric | Value |
- |--------|-------|
- | App URL | {app_url} |
- | Test files | {file_count} |
- | Total tests | {total} |
- | Passed | {passed} |
- | Failed | {failed} |
- | Fix loops used | {loop_count}/3 |
-
- ## Locator Fixes Applied
-
- | File | Line | Was | Now | Reason |
- |------|------|-----|-----|--------|
- | ... | ... | ... | ... | ... |
-
- ## Test Results
-
- ### Passed
- - [test name] -- {file}:{line}
- - ...
-
- ### Failed (Application Bugs)
- - [test name] -- {file}:{line}
-   - **Expected:** {expected}
-   - **Actual:** {actual}
-   - **Evidence:** screenshot at {path}
-   - **Classification:** APPLICATION BUG
-
- ### Failed (Unresolved after 3 fix loops)
- - [test name] -- {file}:{line}
-   - **Error:** {error}
-   - **Attempts:** 3
-   - **Classification:** {TEST CODE ERROR | ENVIRONMENT ISSUE | INCONCLUSIVE}
-
- ## Screenshots
- - {route}: {screenshot_path}
- - ...
- ```
- </step>
-
- <step name="cleanup">
- ## Step 8: Cleanup
-
- ```
- browser_close()
- ```
-
- **Return structured result to orchestrator:**
-
- ```
- E2E_RUNNER_COMPLETE:
-   app_url: "{app_url}"
-   total_tests: N
-   passed: N
-   failed: N
-   locator_fixes: N
-   app_bugs_found: N
-   fix_loops_used: N
-   report_path: "{output_dir}/E2E_RUN_REPORT.md"
-   screenshots: ["{path1}", "{path2}", ...]
- ```
- </step>
-
- </process>
-
- <error_handling>
- | Error | Cause | Action |
- |-------|-------|--------|
- | No app URL and no dev server detected | App not running | Checkpoint: ask user for URL or to start server |
- | Browser not installed | Playwright browsers missing | Run `browser_install()` then retry |
- | All tests timeout | App URL wrong or app crashed | Check URL, take screenshot, report as ENVIRONMENT ISSUE |
- | Auth-gated pages | Tests need login first | Check if test has auth setup, suggest adding login fixture |
- | Dynamic content changes between runs | Flaky locators | Prefer data-testid over text-based locators, add waits |
- | Test runner not found | No playwright/cypress installed | Report as ENVIRONMENT ISSUE with install instructions |
- </error_handling>
-
- <success_criteria>
- E2E runner is complete when:
-
- - [ ] All pages in the test manifest were inspected with browser_snapshot
- - [ ] Real locator map was built for every route
- - [ ] Generated locators were compared and fixed where mismatched
- - [ ] Tests were executed against the live app
- - [ ] Failures were diagnosed using browser tools (snapshot, screenshot, evaluate)
- - [ ] Fixable issues (locators, assertions) were auto-fixed (up to 3 loops)
- - [ ] Application bugs were classified with evidence (not auto-fixed)
- - [ ] E2E_RUN_REPORT.md was written with full results
- - [ ] Browser session was closed
- </success_criteria>
1
+ <purpose>
2
+ Run generated E2E test files against a live application using the Playwright browser tools. Navigate pages, capture real locators from the accessibility snapshot, compare them against the locators in generated test files, fix mismatches, and loop until tests pass or failures are classified as application bugs. This agent bridges the gap between "tests exist on disk" and "tests actually pass against the real app."
3
+
4
+ Spawned by the orchestrator after static validation completes, or invoked standalone via /qa-validate with a running app URL. Requires a live application URL to work.
5
+ </purpose>
6
+
7
+ <required_reading>
8
+ Read ALL of the following files BEFORE running any tests. Do NOT skip.
9
+
10
+ - **CLAUDE.md** -- QA automation standards. Read these sections:
11
+ - **Locator Strategy** -- 4-tier hierarchy. Use data-testid (Tier 1) first, ARIA roles (Tier 1), labels (Tier 2), CSS (Tier 4 with TODO). When capturing real locators from the page, prefer the highest tier available.
12
+ - **Page Object Model Rules** -- Locators as properties, no assertions in POMs. When fixing POM files, preserve this structure.
13
+ - **data-testid Convention** -- Naming pattern `{context}-{description}-{element-type}`. When recommending new test IDs, follow this convention.
14
+ - **Quality Gates** -- Assertion specificity rules. When fixing assertions, use concrete values from the real page.
15
+
16
+ - **~/.claude/qaa/MY_PREFERENCES.md** (optional -- read if exists). User's personal QA preferences. If a preference conflicts with CLAUDE.md, the preference wins.
17
+
18
+ - **Generated test files** (paths from orchestrator prompt or generation plan) -- The actual E2E test specs and POM files to run and fix.
19
+
20
+ - **Codebase map documents** (optional -- read if they exist in `.qa-output/codebase/`):
21
+ - **CODE_PATTERNS.md** -- Naming conventions to match when fixing test code
22
+ - **TEST_SURFACE.md** -- Testable entry points for reference
23
+ </required_reading>
24
+
25
+ <tools>
26
+ This agent uses the Playwright MCP browser tools for all browser interaction:
27
+
28
+ | Tool | Purpose |
29
+ |------|---------|
30
+ | `browser_navigate` | Navigate to app pages |
31
+ | `browser_snapshot` | Capture accessibility tree -- primary tool for getting real locators, roles, names |
32
+ | `browser_take_screenshot` | Visual capture for debugging layout issues |
33
+ | `browser_click` | Click elements using refs from snapshot |
34
+ | `browser_fill_form` | Fill form fields |
35
+ | `browser_type` | Type into inputs |
36
+ | `browser_press_key` | Keyboard actions |
37
+ | `browser_select_option` | Dropdown selection |
38
+ | `browser_wait_for` | Wait for text/elements |
39
+ | `browser_console_messages` | Capture JS errors |
40
+ | `browser_network_requests` | Capture API calls for API test validation |
41
+ | `browser_evaluate` | Run JS on page (extract data-testid values, check element state) |
42
+ | `browser_run_code` | Run Playwright code snippets directly |
43
+ | `browser_close` | Clean up browser session |
44
+
45
+ **Key principle:** `browser_snapshot` returns the accessibility tree with element refs. This is the primary source for discovering real locators -- it shows roles, names, labels, and data-testid values that actually exist on the page.
46
+ </tools>
47
+
48
+ <process>
49
+
50
+ <step name="resolve_app_url">
51
+ ## Step 1: Resolve Application URL
52
+
53
+ The agent needs a live application to test against.
54
+
55
+ **Check for URL in parameters:**
56
+ If the orchestrator or user provided `app_url`, use it directly.
57
+
58
+ **Auto-detect dev server:**
59
+ If no URL provided, check common dev server ports:
60
+
61
+ ```bash
62
+ # Check if any common dev server is running
63
+ for port in 3000 3001 4200 5173 5174 8080 8000 8888; do
64
+ curl -s -o /dev/null -w "%{http_code}" "http://localhost:${port}" 2>/dev/null
65
+ done
66
+ ```
67
+
68
+ If a server responds with 200, use that URL. If multiple respond, present options to user.
69
+
70
+ **If no server found:**
71
+
72
+ ```
73
+ CHECKPOINT:
74
+ type: human-action
75
+ blocking: "No running application detected"
76
+ details: "Checked ports: 3000, 3001, 4200, 5173, 5174, 8080, 8000, 8888. No HTTP response."
77
+ awaiting: "Provide the application URL, or start your dev server and retry."
78
+ ```
79
+ </step>
80
+
81
+ <step name="catalog_e2e_files">
82
+ ## Step 2: Catalog E2E Test Files
83
+
84
+ Identify all E2E test files and their corresponding POM files to run.
85
+
86
+ ```bash
87
+ # Find E2E test specs
88
+ find . -name '*.e2e.spec.*' -o -name '*.e2e.cy.*' -o -name '*.e2e.test.*' | sort
89
+
90
+ # Find POM files
91
+ find . -path '*/pages/*' -o -path '*/page-objects/*' | grep -E '\.(ts|js|py)$' | sort
92
+ ```
93
+
94
+ Build a test manifest:
95
+ ```
96
+ E2E_FILES:
97
+ - path: "tests/e2e/smoke/login.e2e.spec.ts"
98
+ pages_involved: ["LoginPage"]
99
+ routes: ["/login", "/dashboard"]
100
+ - path: "tests/e2e/smoke/checkout.e2e.spec.ts"
101
+ pages_involved: ["CheckoutPage", "CartPage"]
102
+ routes: ["/cart", "/checkout", "/checkout/confirm"]
103
+ ```
104
+
105
+ Extract routes from test files by reading `page.goto()`, `navigate()`, or route-related calls.
106
+ </step>
107
+
108
+ <step name="inspect_pages">
109
+ ## Step 3: Inspect Live Pages and Capture Real Locators
110
+
111
+ For each route in the test manifest, navigate to the page and capture its real structure.
112
+
113
+ **For each route:**
114
+
115
+ 1. **Navigate:**
116
+ ```
117
+ browser_navigate(url: "{app_url}{route}")
118
+ ```
119
+
120
+ 2. **Wait for page to load:**
121
+ ```
122
+ browser_wait_for(time: 2)
123
+ ```
124
+
125
+ 3. **Capture accessibility snapshot:**
126
+ ```
127
+ browser_snapshot()
128
+ ```
129
+ This returns the accessibility tree with all elements, their roles, names, and refs. This is the source of truth for what locators actually exist on the page.
130
+
131
+ 4. **Extract existing data-testid values:**
132
+ ```
133
+ browser_evaluate(function: "() => {
134
+ const elements = document.querySelectorAll('[data-testid]');
135
+ return Array.from(elements).map(el => ({
136
+ testid: el.getAttribute('data-testid'),
137
+ tag: el.tagName.toLowerCase(),
138
+ role: el.getAttribute('role') || '',
139
+ text: el.textContent?.trim().substring(0, 50) || '',
140
+ visible: el.offsetParent !== null
141
+ }));
142
+ }")
143
+ ```
144
+
145
+ 5. **Extract interactive elements:**
146
+ ```
147
+ browser_evaluate(function: "() => {
148
+ const selectors = 'button, input, select, textarea, a[href], [role=\"button\"], [role=\"link\"], [role=\"tab\"], [role=\"checkbox\"], [role=\"radio\"]';
149
+ const elements = document.querySelectorAll(selectors);
150
+ return Array.from(elements).map(el => ({
151
+ tag: el.tagName.toLowerCase(),
152
+ type: el.getAttribute('type') || '',
153
+ testid: el.getAttribute('data-testid') || '',
154
+ role: el.getAttribute('role') || '',
155
+ name: el.getAttribute('name') || '',
156
+ ariaLabel: el.getAttribute('aria-label') || '',
157
+ placeholder: el.getAttribute('placeholder') || '',
158
+ text: el.textContent?.trim().substring(0, 50) || '',
159
+ id: el.id || '',
160
+ visible: el.offsetParent !== null
161
+ }));
162
+ }")
163
+ ```
164
+
165
+ 6. **Take screenshot for reference:**
166
+ ```
167
+ browser_take_screenshot(type: "png", filename: ".qa-output/screenshots/{route-slug}.png")
168
+ ```
169
+
170
+ **Build a real locator map per route:**
171
+
172
+ ```
173
+ ROUTE: /login
174
+ REAL_LOCATORS:
175
+ - element: "email input"
176
+ best_locator: "getByTestId('login-email-input')" # Tier 1 - data-testid exists
177
+ fallback: "getByLabel('Email')" # Tier 2
178
+ role: "textbox"
179
+ name: "Email"
180
+ - element: "password input"
181
+ best_locator: "getByTestId('login-password-input')"
182
+ fallback: "getByLabel('Password')"
183
+ role: "textbox"
184
+ name: "Password"
185
+ - element: "submit button"
186
+ best_locator: "getByRole('button', { name: 'Log in' })" # Tier 1 - role + name
187
+ fallback: "getByText('Log in')" # Tier 2
188
+ role: "button"
189
+ name: "Log in"
190
+ ```
191
+
192
+ **Locator selection priority (from accessibility snapshot and evaluate results):**
193
+ 1. `data-testid` exists → use `getByTestId()`
194
+ 2. Role + accessible name is unique → use `getByRole()`
195
+ 3. Label exists → use `getByLabel()`
196
+ 4. Placeholder exists → use `getByPlaceholder()`
197
+ 5. Text content is unique and stable → use `getByText()`
198
+ 6. None of the above → use CSS selector with `// TODO: Request test ID` comment
199
+ </step>
200
+
201
+ <step name="compare_and_fix_locators">
202
+ ## Step 4: Compare Generated Locators vs Real Locators
203
+
204
+ For each E2E test file and its POM:
205
+
206
+ 1. **Read the generated file** and extract all locators used
207
+ 2. **Compare against real locator map** from Step 3
208
+ 3. **Identify mismatches:**
209
+ - Locator references an element that doesn't exist on the page
210
+ - Locator uses Tier 4 (CSS) when Tier 1 (testid/role) is available
211
+ - Locator text doesn't match actual text on page
212
+ - data-testid value in test doesn't match actual data-testid on page
213
+
214
+ 4. **Fix each mismatch:**
215
+ - Replace incorrect locators with real ones from the locator map
216
+ - Upgrade locator tier where possible (CSS → testid or role)
217
+ - Update text assertions with actual text from the page
218
+ - Add `// TODO: Request test ID` for elements that have no testid and no good role/label
219
+
220
+ 5. **Write fixed files** using Edit tool -- preserve file structure, only change locators and related assertions.
221
+
222
+ **Log all changes:**
223
+ ```
224
+ LOCATOR_FIXES:
225
+ - file: "pages/LoginPage.ts"
226
+ line: 12
227
+ was: "page.locator('.btn-primary')"
228
+ now: "page.getByRole('button', { name: 'Log in' })"
229
+ reason: "Upgraded from Tier 4 (CSS) to Tier 1 (role)"
230
+ - file: "tests/e2e/smoke/login.e2e.spec.ts"
231
+ line: 24
232
+ was: "expect(page.locator('.welcome-msg')).toHaveText('Welcome')"
233
+ now: "expect(page.getByTestId('dashboard-welcome-alert')).toHaveText('Welcome back, Test User')"
234
+ reason: "Fixed locator (CSS→testid) and assertion (vague→concrete from real page)"
235
+ ```
236
+ </step>
237
+
238
+ <step name="run_tests">
239
+ ## Step 5: Execute Tests
240
+
241
+ Run the E2E tests using the project's test runner.
242
+
243
+ **Detect test runner:**
```bash
# Prefer an explicit config file; only fall back to package.json if neither exists.
if [ -f "playwright.config.ts" ] || [ -f "playwright.config.js" ]; then
  RUNNER="playwright"
elif [ -f "cypress.config.ts" ] || [ -f "cypress.config.js" ]; then
  RUNNER="cypress"
elif grep -q "playwright" package.json 2>/dev/null; then
  RUNNER="playwright"
elif grep -q "cypress" package.json 2>/dev/null; then
  RUNNER="cypress"
fi
```

**Run tests:**

For Playwright:
```bash
npx playwright test {test_file_paths} --reporter=json 2>&1
```

For Cypress:
```bash
npx cypress run --spec "{test_file_paths}" --reporter json 2>&1
```

**Parse results:**
- Total tests, passed, failed, skipped
- For each failure: test name, error message, file path, line number
</step>

<step name="fix_loop">
## Step 6: Diagnose Failures and Fix (max 5 loops)

For each failing test:

1. **Read the error message** -- what assertion failed, what element wasn't found, what timeout hit

2. **Navigate to the failing page with browser tools:**
   ```
   browser_navigate(url: "{app_url}{failing_route}")
   browser_snapshot()
   ```

3. **Diagnose the failure type:**

   | Error Pattern | Diagnosis | Action |
   |---------------|-----------|--------|
   | "Element not found" / "Timeout waiting for selector" | Locator mismatch | Capture snapshot, find real element, fix locator |
   | "Expected X but received Y" | Assertion value wrong | Read real value from page, update assertion |
   | "Navigation timeout" | Page doesn't load / wrong URL | Check if route is correct, check for redirects |
   | "Element not visible" | Element exists but hidden | Check page state, may need to scroll or wait |
   | "API returned 401/403" | Auth issue | Test needs auth setup -- flag as test code error |
   | "Element is not interactable" | Overlay, modal, or loading state | Add wait_for before interaction |
   | "net::ERR_CONNECTION_REFUSED" | App not running | Flag as environment issue |

4. **For locator/assertion issues -- fix and continue:**
   - Use `browser_snapshot()` to get the real accessibility tree
   - Use `browser_evaluate()` to inspect specific elements
   - Use `browser_take_screenshot()` to visually confirm state
   - Edit the test/POM file with the correct locator or assertion value

5. **For application bugs -- classify and stop fixing that test:**
   - The page actually behaves incorrectly (button does nothing, form submits but errors, data not displayed)
   - Document: what was expected, what actually happened, screenshot as evidence
   - Do NOT fix the test to pass -- the test is correct, the app is wrong

6. **Re-run after fixes:**
   ```bash
   npx playwright test {fixed_files} --reporter=json 2>&1
   ```

7. **Repeat up to 5 times.** After 5 loops, classify remaining failures and stop.
</step>

<step name="produce_report">
## Step 7: Produce E2E Run Report

Write `{output_dir}/E2E_RUN_REPORT.md`:

```markdown
# E2E Test Execution Report

## Summary

| Metric | Value |
|--------|-------|
| App URL | {app_url} |
| Test files | {file_count} |
| Total tests | {total} |
| Passed | {passed} |
| Failed | {failed} |
| Fix loops used | {loop_count}/5 |

## Locator Fixes Applied

| File | Line | Was | Now | Reason |
|------|------|-----|-----|--------|
| ... | ... | ... | ... | ... |

## Test Results

### Passed
- [test name] -- {file}:{line}
- ...

### Failed (Application Bugs)
- [test name] -- {file}:{line}
  - **Expected:** {expected}
  - **Actual:** {actual}
  - **Evidence:** screenshot at {path}
  - **Classification:** APPLICATION BUG

### Failed (Unresolved after 5 fix loops)
- [test name] -- {file}:{line}
  - **Error:** {error}
  - **Attempts:** 5
  - **Classification:** {TEST CODE ERROR | ENVIRONMENT ISSUE | INCONCLUSIVE}

## Screenshots
- {route}: {screenshot_path}
- ...
```
</step>

<step name="cleanup">
## Step 8: Cleanup

```
browser_close()
```

**Return structured result to orchestrator:**

```
E2E_RUNNER_COMPLETE:
  app_url: "{app_url}"
  total_tests: N
  passed: N
  failed: N
  locator_fixes: N
  app_bugs_found: N
  fix_loops_used: N
  report_path: "{output_dir}/E2E_RUN_REPORT.md"
  screenshots: ["{path1}", "{path2}", ...]
```
</step>

</process>

<error_handling>
| Error | Cause | Action |
|-------|-------|--------|
| No app URL and no dev server detected | App not running | Checkpoint: ask user for URL or to start server |
| Browser not installed | Playwright browsers missing | Run `browser_install()` then retry |
| All tests timeout | App URL wrong or app crashed | Check URL, take screenshot, report as ENVIRONMENT ISSUE |
| Auth-gated pages | Tests need login first | Check if test has auth setup, suggest adding login fixture |
| Dynamic content changes between runs | Flaky locators | Prefer data-testid over text-based locators, add waits |
| Test runner not found | No playwright/cypress installed | Report as ENVIRONMENT ISSUE with install instructions |
</error_handling>

<success_criteria>
E2E runner is complete when:

- [ ] All pages in the test manifest were inspected with browser_snapshot
- [ ] Real locator map was built for every route
- [ ] Generated locators were compared and fixed where mismatched
- [ ] Tests were executed against the live app
- [ ] Failures were diagnosed using browser tools (snapshot, screenshot, evaluate)
- [ ] Fixable issues (locators, assertions) were auto-fixed (up to 5 loops)
- [ ] Application bugs were classified with evidence (not auto-fixed)
- [ ] E2E_RUN_REPORT.md was written with full results
- [ ] Locator registry updated with all real locators discovered during execution (`.qa-output/locators/`)
- [ ] Browser session was closed
</success_criteria>