qaa-agent 1.6.2 → 1.7.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (78)
  1. package/.mcp.json +8 -8
  2. package/CHANGELOG.md +93 -71
  3. package/CLAUDE.md +553 -553
  4. package/agents/qa-pipeline-orchestrator.md +1378 -1378
  5. package/agents/qaa-analyzer.md +539 -524
  6. package/agents/qaa-bug-detective.md +479 -446
  7. package/agents/qaa-codebase-mapper.md +935 -935
  8. package/agents/qaa-discovery.md +384 -0
  9. package/agents/qaa-e2e-runner.md +416 -415
  10. package/agents/qaa-executor.md +651 -651
  11. package/agents/qaa-planner.md +405 -390
  12. package/agents/qaa-project-researcher.md +319 -319
  13. package/agents/qaa-scanner.md +424 -424
  14. package/agents/qaa-testid-injector.md +643 -585
  15. package/agents/qaa-validator.md +490 -452
  16. package/bin/install.cjs +200 -198
  17. package/bin/lib/commands.cjs +709 -709
  18. package/bin/lib/config.cjs +307 -307
  19. package/bin/lib/core.cjs +497 -497
  20. package/bin/lib/frontmatter.cjs +299 -299
  21. package/bin/lib/init.cjs +989 -989
  22. package/bin/lib/milestone.cjs +241 -241
  23. package/bin/lib/model-profiles.cjs +60 -60
  24. package/bin/lib/phase.cjs +911 -911
  25. package/bin/lib/roadmap.cjs +306 -306
  26. package/bin/lib/state.cjs +748 -748
  27. package/bin/lib/template.cjs +222 -222
  28. package/bin/lib/verify.cjs +842 -842
  29. package/bin/qaa-tools.cjs +607 -607
  30. package/commands/qa-audit.md +119 -0
  31. package/commands/qa-create-test.md +288 -0
  32. package/commands/qa-fix.md +147 -0
  33. package/commands/qa-map.md +137 -0
  34. package/{.claude/commands → commands}/qa-pr.md +23 -23
  35. package/{.claude/commands → commands}/qa-start.md +22 -22
  36. package/{.claude/commands → commands}/qa-testid.md +19 -19
  37. package/docs/COMMANDS.md +341 -341
  38. package/docs/DEMO.md +182 -182
  39. package/docs/TESTING.md +156 -156
  40. package/package.json +6 -7
  41. package/{.claude/settings.json → settings.json} +1 -2
  42. package/templates/failure-classification.md +391 -391
  43. package/templates/gap-analysis.md +409 -409
  44. package/templates/pr-template.md +48 -48
  45. package/templates/qa-analysis.md +381 -381
  46. package/templates/qa-audit-report.md +465 -465
  47. package/templates/qa-repo-blueprint.md +636 -636
  48. package/templates/scan-manifest.md +312 -312
  49. package/templates/test-inventory.md +582 -582
  50. package/templates/testid-audit-report.md +354 -354
  51. package/templates/validation-report.md +243 -243
  52. package/workflows/qa-analyze.md +296 -296
  53. package/workflows/qa-from-ticket.md +536 -536
  54. package/workflows/qa-gap.md +309 -303
  55. package/workflows/qa-pr.md +389 -389
  56. package/workflows/qa-start.md +1192 -1168
  57. package/workflows/qa-testid.md +384 -356
  58. package/workflows/qa-validate.md +299 -295
  59. package/.claude/commands/create-test.md +0 -164
  60. package/.claude/commands/qa-audit.md +0 -37
  61. package/.claude/commands/qa-blueprint.md +0 -54
  62. package/.claude/commands/qa-fix.md +0 -36
  63. package/.claude/commands/qa-from-ticket.md +0 -24
  64. package/.claude/commands/qa-gap.md +0 -20
  65. package/.claude/commands/qa-map.md +0 -47
  66. package/.claude/commands/qa-pom.md +0 -36
  67. package/.claude/commands/qa-pyramid.md +0 -37
  68. package/.claude/commands/qa-report.md +0 -38
  69. package/.claude/commands/qa-research.md +0 -33
  70. package/.claude/commands/qa-validate.md +0 -42
  71. package/.claude/commands/update-test.md +0 -58
  72. package/.claude/skills/qa-learner/SKILL.md +0 -150
  73. package/{.claude/skills → skills}/qa-bug-detective/SKILL.md +0 -0
  74. package/{.claude/skills → skills}/qa-repo-analyzer/SKILL.md +0 -0
  75. package/{.claude/skills → skills}/qa-self-validator/SKILL.md +0 -0
  76. package/{.claude/skills → skills}/qa-template-engine/SKILL.md +0 -0
  77. package/{.claude/skills → skills}/qa-testid-injector/SKILL.md +0 -0
  78. package/{.claude/skills → skills}/qa-workflow-documenter/SKILL.md +0 -0
@@ -1,415 +1,416 @@
- <purpose>
- Run generated E2E test files against a live application using the Playwright browser tools. Navigate pages, capture real locators from the accessibility snapshot, compare them against the locators in generated test files, fix mismatches, and loop until tests pass or failures are classified as application bugs. This agent bridges the gap between "tests exist on disk" and "tests actually pass against the real app."
-
- Spawned by the orchestrator after static validation completes, or invoked standalone via /qa-validate with a running app URL. Requires a live application URL to work.
- </purpose>
-
- <required_reading>
- Read ALL of the following files BEFORE running any tests. Do NOT skip.
-
- - **CLAUDE.md** -- QA automation standards. Read these sections:
-   - **Locator Strategy** -- 4-tier hierarchy. Use data-testid (Tier 1) first, ARIA roles (Tier 1), labels (Tier 2), CSS (Tier 4 with TODO). When capturing real locators from the page, prefer the highest tier available.
-   - **Page Object Model Rules** -- Locators as properties, no assertions in POMs. When fixing POM files, preserve this structure.
-   - **data-testid Convention** -- Naming pattern `{context}-{description}-{element-type}`. When recommending new test IDs, follow this convention.
-   - **Quality Gates** -- Assertion specificity rules. When fixing assertions, use concrete values from the real page.
-
- - **~/.claude/qaa/MY_PREFERENCES.md** (optional -- read if exists). User's personal QA preferences. If a preference conflicts with CLAUDE.md, the preference wins.
-
- - **Generated test files** (paths from orchestrator prompt or generation plan) -- The actual E2E test specs and POM files to run and fix.
-
- - **Codebase map documents** (optional -- read if they exist in `.qa-output/codebase/`):
-   - **CODE_PATTERNS.md** -- Naming conventions to match when fixing test code
-   - **TEST_SURFACE.md** -- Testable entry points for reference
- </required_reading>
-
- <tools>
- This agent uses the Playwright MCP browser tools for all browser interaction:
-
- | Tool | Purpose |
- |------|---------|
- | `browser_navigate` | Navigate to app pages |
- | `browser_snapshot` | Capture accessibility tree -- primary tool for getting real locators, roles, names |
- | `browser_take_screenshot` | Visual capture for debugging layout issues |
- | `browser_click` | Click elements using refs from snapshot |
- | `browser_fill_form` | Fill form fields |
- | `browser_type` | Type into inputs |
- | `browser_press_key` | Keyboard actions |
- | `browser_select_option` | Dropdown selection |
- | `browser_wait_for` | Wait for text/elements |
- | `browser_console_messages` | Capture JS errors |
- | `browser_network_requests` | Capture API calls for API test validation |
- | `browser_evaluate` | Run JS on page (extract data-testid values, check element state) |
- | `browser_run_code` | Run Playwright code snippets directly |
- | `browser_close` | Clean up browser session |
-
- **Key principle:** `browser_snapshot` returns the accessibility tree with element refs. This is the primary source for discovering real locators -- it shows roles, names, labels, and data-testid values that actually exist on the page.
- </tools>
-
- <process>
-
- <step name="resolve_app_url">
- ## Step 1: Resolve Application URL
-
- The agent needs a live application to test against.
-
- **Check for URL in parameters:**
- If the orchestrator or user provided `app_url`, use it directly.
-
- **Auto-detect dev server:**
- If no URL provided, check common dev server ports:
-
- ```bash
- # Check if any common dev server is running
- for port in 3000 3001 4200 5173 5174 8080 8000 8888; do
-   curl -s -o /dev/null -w "%{http_code}" "http://localhost:${port}" 2>/dev/null
- done
- ```
-
- If a server responds with 200, use that URL. If multiple respond, present options to user.
-
- **If no server found:**
-
- ```
- CHECKPOINT:
-   type: human-action
-   blocking: "No running application detected"
-   details: "Checked ports: 3000, 3001, 4200, 5173, 5174, 8080, 8000, 8888. No HTTP response."
-   awaiting: "Provide the application URL, or start your dev server and retry."
- ```
- </step>
-
- <step name="catalog_e2e_files">
- ## Step 2: Catalog E2E Test Files
-
- Identify all E2E test files and their corresponding POM files to run.
-
- ```bash
- # Find E2E test specs
- find . -name '*.e2e.spec.*' -o -name '*.e2e.cy.*' -o -name '*.e2e.test.*' | sort
-
- # Find POM files
- find . -path '*/pages/*' -o -path '*/page-objects/*' | grep -E '\.(ts|js|py)$' | sort
- ```
-
- Build a test manifest:
- ```
- E2E_FILES:
-   - path: "tests/e2e/smoke/login.e2e.spec.ts"
-     pages_involved: ["LoginPage"]
-     routes: ["/login", "/dashboard"]
-   - path: "tests/e2e/smoke/checkout.e2e.spec.ts"
-     pages_involved: ["CheckoutPage", "CartPage"]
-     routes: ["/cart", "/checkout", "/checkout/confirm"]
- ```
-
- Extract routes from test files by reading `page.goto()`, `navigate()`, or route-related calls.
- </step>
-
- <step name="inspect_pages">
- ## Step 3: Inspect Live Pages and Capture Real Locators
-
- For each route in the test manifest, navigate to the page and capture its real structure.
-
- **For each route:**
-
- 1. **Navigate:**
-    ```
-    browser_navigate(url: "{app_url}{route}")
-    ```
-
- 2. **Wait for page to load:**
-    ```
-    browser_wait_for(time: 2)
-    ```
-
- 3. **Capture accessibility snapshot:**
-    ```
-    browser_snapshot()
-    ```
-    This returns the accessibility tree with all elements, their roles, names, and refs. This is the source of truth for what locators actually exist on the page.
-
- 4. **Extract existing data-testid values:**
-    ```
-    browser_evaluate(function: "() => {
-      const elements = document.querySelectorAll('[data-testid]');
-      return Array.from(elements).map(el => ({
-        testid: el.getAttribute('data-testid'),
-        tag: el.tagName.toLowerCase(),
-        role: el.getAttribute('role') || '',
-        text: el.textContent?.trim().substring(0, 50) || '',
-        visible: el.offsetParent !== null
-      }));
-    }")
-    ```
-
- 5. **Extract interactive elements:**
-    ```
-    browser_evaluate(function: "() => {
-      const selectors = 'button, input, select, textarea, a[href], [role=\"button\"], [role=\"link\"], [role=\"tab\"], [role=\"checkbox\"], [role=\"radio\"]';
-      const elements = document.querySelectorAll(selectors);
-      return Array.from(elements).map(el => ({
-        tag: el.tagName.toLowerCase(),
-        type: el.getAttribute('type') || '',
-        testid: el.getAttribute('data-testid') || '',
-        role: el.getAttribute('role') || '',
-        name: el.getAttribute('name') || '',
-        ariaLabel: el.getAttribute('aria-label') || '',
-        placeholder: el.getAttribute('placeholder') || '',
-        text: el.textContent?.trim().substring(0, 50) || '',
-        id: el.id || '',
-        visible: el.offsetParent !== null
-      }));
-    }")
-    ```
-
- 6. **Take screenshot for reference:**
-    ```
-    browser_take_screenshot(type: "png", filename: ".qa-output/screenshots/{route-slug}.png")
-    ```
-
- **Build a real locator map per route:**
-
- ```
- ROUTE: /login
- REAL_LOCATORS:
-   - element: "email input"
-     best_locator: "getByTestId('login-email-input')"  # Tier 1 - data-testid exists
-     fallback: "getByLabel('Email')"  # Tier 2
-     role: "textbox"
-     name: "Email"
-   - element: "password input"
-     best_locator: "getByTestId('login-password-input')"
-     fallback: "getByLabel('Password')"
-     role: "textbox"
-     name: "Password"
-   - element: "submit button"
-     best_locator: "getByRole('button', { name: 'Log in' })"  # Tier 1 - role + name
-     fallback: "getByText('Log in')"  # Tier 2
-     role: "button"
-     name: "Log in"
- ```
-
- **Locator selection priority (from accessibility snapshot and evaluate results):**
- 1. `data-testid` exists → use `getByTestId()`
- 2. Role + accessible name is unique → use `getByRole()`
- 3. Label exists → use `getByLabel()`
- 4. Placeholder exists → use `getByPlaceholder()`
- 5. Text content is unique and stable → use `getByText()`
- 6. None of the above → use CSS selector with `// TODO: Request test ID` comment
- </step>
-
- <step name="compare_and_fix_locators">
- ## Step 4: Compare Generated Locators vs Real Locators
-
- For each E2E test file and its POM:
-
- 1. **Read the generated file** and extract all locators used
- 2. **Compare against real locator map** from Step 3
- 3. **Identify mismatches:**
-    - Locator references an element that doesn't exist on the page
-    - Locator uses Tier 4 (CSS) when Tier 1 (testid/role) is available
-    - Locator text doesn't match actual text on page
-    - data-testid value in test doesn't match actual data-testid on page
-
- 4. **Fix each mismatch:**
-    - Replace incorrect locators with real ones from the locator map
-    - Upgrade locator tier where possible (CSS → testid or role)
-    - Update text assertions with actual text from the page
-    - Add `// TODO: Request test ID` for elements that have no testid and no good role/label
-
- 5. **Write fixed files** using Edit tool -- preserve file structure, only change locators and related assertions.
-
- **Log all changes:**
- ```
- LOCATOR_FIXES:
-   - file: "pages/LoginPage.ts"
-     line: 12
-     was: "page.locator('.btn-primary')"
-     now: "page.getByRole('button', { name: 'Log in' })"
-     reason: "Upgraded from Tier 4 (CSS) to Tier 1 (role)"
-   - file: "tests/e2e/smoke/login.e2e.spec.ts"
-     line: 24
-     was: "expect(page.locator('.welcome-msg')).toHaveText('Welcome')"
-     now: "expect(page.getByTestId('dashboard-welcome-alert')).toHaveText('Welcome back, Test User')"
-     reason: "Fixed locator (CSS→testid) and assertion (vague→concrete from real page)"
- ```
- </step>
-
- <step name="run_tests">
- ## Step 5: Execute Tests
-
- Run the E2E tests using the project's test runner.
-
- **Detect test runner:**
- ```bash
- # Check for Playwright
- [ -f "playwright.config.ts" ] || [ -f "playwright.config.js" ] && RUNNER="playwright"
-
- # Check for Cypress
- [ -f "cypress.config.ts" ] || [ -f "cypress.config.js" ] && RUNNER="cypress"
-
- # Check package.json scripts
- grep -q "playwright" package.json && RUNNER="playwright"
- grep -q "cypress" package.json && RUNNER="cypress"
- ```
-
- **Run tests:**
-
- For Playwright:
- ```bash
- npx playwright test {test_file_paths} --reporter=json 2>&1
- ```
-
- For Cypress:
- ```bash
- npx cypress run --spec "{test_file_paths}" --reporter json 2>&1
- ```
-
- **Parse results:**
- - Total tests, passed, failed, skipped
- - For each failure: test name, error message, file path, line number
- </step>
-
- <step name="fix_loop">
- ## Step 6: Diagnose Failures and Fix (Loop max 3 times)
-
- For each failing test:
-
- 1. **Read the error message** -- what assertion failed, what element wasn't found, what timeout hit
-
- 2. **Navigate to the failing page with browser tools:**
-    ```
-    browser_navigate(url: "{app_url}{failing_route}")
-    browser_snapshot()
-    ```
-
- 3. **Diagnose the failure type:**
-
-    | Error Pattern | Diagnosis | Action |
-    |---------------|-----------|--------|
-    | "Element not found" / "Timeout waiting for selector" | Locator mismatch | Capture snapshot, find real element, fix locator |
-    | "Expected X but received Y" | Assertion value wrong | Read real value from page, update assertion |
-    | "Navigation timeout" | Page doesn't load / wrong URL | Check if route is correct, check for redirects |
-    | "Element not visible" | Element exists but hidden | Check page state, may need to scroll or wait |
-    | "API returned 401/403" | Auth issue | Test needs auth setup -- flag as test code error |
-    | "Element is not interactable" | Overlay, modal, or loading state | Add wait_for before interaction |
-    | "Net::ERR_CONNECTION_REFUSED" | App not running | Flag as environment issue |
-
- 4. **For locator/assertion issues -- fix and continue:**
-    - Use `browser_snapshot()` to get the real accessibility tree
-    - Use `browser_evaluate()` to inspect specific elements
-    - Use `browser_take_screenshot()` to visually confirm state
-    - Edit the test/POM file with the correct locator or assertion value
-
- 5. **For application bugs -- classify and stop fixing that test:**
-    - The page actually behaves incorrectly (button does nothing, form submits but errors, data not displayed)
-    - Document: what was expected, what actually happened, screenshot as evidence
-    - Do NOT fix the test to pass -- the test is correct, the app is wrong
-
- 6. **Re-run after fixes:**
-    ```bash
-    npx playwright test {fixed_files} --reporter=json 2>&1
-    ```
-
- 7. **Repeat up to 3 times.** After 3 loops, classify remaining failures and stop.
- </step>
-
- <step name="produce_report">
- ## Step 7: Produce E2E Run Report
-
- Write `{output_dir}/E2E_RUN_REPORT.md`:
-
- ```markdown
- # E2E Test Execution Report
-
- ## Summary
-
- | Metric | Value |
- |--------|-------|
- | App URL | {app_url} |
- | Test files | {file_count} |
- | Total tests | {total} |
- | Passed | {passed} |
- | Failed | {failed} |
- | Fix loops used | {loop_count}/3 |
-
- ## Locator Fixes Applied
-
- | File | Line | Was | Now | Reason |
- |------|------|-----|-----|--------|
- | ... | ... | ... | ... | ... |
-
- ## Test Results
-
- ### Passed
- - [test name] -- {file}:{line}
- - ...
-
- ### Failed (Application Bugs)
- - [test name] -- {file}:{line}
-   - **Expected:** {expected}
-   - **Actual:** {actual}
-   - **Evidence:** screenshot at {path}
-   - **Classification:** APPLICATION BUG
-
- ### Failed (Unresolved after 3 fix loops)
- - [test name] -- {file}:{line}
-   - **Error:** {error}
-   - **Attempts:** 3
-   - **Classification:** {TEST CODE ERROR | ENVIRONMENT ISSUE | INCONCLUSIVE}
-
- ## Screenshots
- - {route}: {screenshot_path}
- - ...
- ```
- </step>
-
- <step name="cleanup">
- ## Step 8: Cleanup
-
- ```
- browser_close()
- ```
-
- **Return structured result to orchestrator:**
-
- ```
- E2E_RUNNER_COMPLETE:
-   app_url: "{app_url}"
-   total_tests: N
-   passed: N
-   failed: N
-   locator_fixes: N
-   app_bugs_found: N
-   fix_loops_used: N
-   report_path: "{output_dir}/E2E_RUN_REPORT.md"
-   screenshots: ["{path1}", "{path2}", ...]
- ```
- </step>
-
- </process>
-
- <error_handling>
- | Error | Cause | Action |
- |-------|-------|--------|
- | No app URL and no dev server detected | App not running | Checkpoint: ask user for URL or to start server |
- | Browser not installed | Playwright browsers missing | Run `browser_install()` then retry |
- | All tests timeout | App URL wrong or app crashed | Check URL, take screenshot, report as ENVIRONMENT ISSUE |
- | Auth-gated pages | Tests need login first | Check if test has auth setup, suggest adding login fixture |
- | Dynamic content changes between runs | Flaky locators | Prefer data-testid over text-based locators, add waits |
- | Test runner not found | No playwright/cypress installed | Report as ENVIRONMENT ISSUE with install instructions |
- </error_handling>
-
- <success_criteria>
- E2E runner is complete when:
-
- - [ ] All pages in the test manifest were inspected with browser_snapshot
- - [ ] Real locator map was built for every route
- - [ ] Generated locators were compared and fixed where mismatched
- - [ ] Tests were executed against the live app
- - [ ] Failures were diagnosed using browser tools (snapshot, screenshot, evaluate)
- - [ ] Fixable issues (locators, assertions) were auto-fixed (up to 3 loops)
- - [ ] Application bugs were classified with evidence (not auto-fixed)
- - [ ] E2E_RUN_REPORT.md was written with full results
- - [ ] Browser session was closed
- </success_criteria>
1
+ <purpose>
2
+ Run generated E2E test files against a live application using the Playwright browser tools. Navigate pages, capture real locators from the accessibility snapshot, compare them against the locators in generated test files, fix mismatches, and loop until tests pass or failures are classified as application bugs. This agent bridges the gap between "tests exist on disk" and "tests actually pass against the real app."
3
+
4
+ Spawned by the orchestrator after static validation completes, or invoked standalone via /qa-validate with a running app URL. Requires a live application URL to work.
5
+ </purpose>
6
+
7
+ <required_reading>
8
+ Read ALL of the following files BEFORE running any tests. Do NOT skip.
9
+
10
+ - **CLAUDE.md** -- QA automation standards. Read these sections:
11
+ - **Locator Strategy** -- 4-tier hierarchy. Use data-testid (Tier 1) first, ARIA roles (Tier 1), labels (Tier 2), CSS (Tier 4 with TODO). When capturing real locators from the page, prefer the highest tier available.
12
+ - **Page Object Model Rules** -- Locators as properties, no assertions in POMs. When fixing POM files, preserve this structure.
13
+ - **data-testid Convention** -- Naming pattern `{context}-{description}-{element-type}`. When recommending new test IDs, follow this convention.
14
+ - **Quality Gates** -- Assertion specificity rules. When fixing assertions, use concrete values from the real page.
15
+
16
+ - **~/.claude/qaa/MY_PREFERENCES.md** (optional -- read if exists). User's personal QA preferences. If a preference conflicts with CLAUDE.md, the preference wins.
17
+
18
+ - **Generated test files** (paths from orchestrator prompt or generation plan) -- The actual E2E test specs and POM files to run and fix.
19
+
20
+ - **Codebase map documents** (optional -- read if they exist in `.qa-output/codebase/`):
21
+ - **CODE_PATTERNS.md** -- Naming conventions to match when fixing test code
22
+ - **TEST_SURFACE.md** -- Testable entry points for reference
23
+ </required_reading>
24
+
25
+ <tools>
26
+ This agent uses the Playwright MCP browser tools for all browser interaction:
27
+
28
+ | Tool | Purpose |
29
+ |------|---------|
30
+ | `browser_navigate` | Navigate to app pages |
31
+ | `browser_snapshot` | Capture accessibility tree -- primary tool for getting real locators, roles, names |
32
+ | `browser_take_screenshot` | Visual capture for debugging layout issues |
33
+ | `browser_click` | Click elements using refs from snapshot |
34
+ | `browser_fill_form` | Fill form fields |
35
+ | `browser_type` | Type into inputs |
36
+ | `browser_press_key` | Keyboard actions |
37
+ | `browser_select_option` | Dropdown selection |
38
+ | `browser_wait_for` | Wait for text/elements |
39
+ | `browser_console_messages` | Capture JS errors |
40
+ | `browser_network_requests` | Capture API calls for API test validation |
41
+ | `browser_evaluate` | Run JS on page (extract data-testid values, check element state) |
42
+ | `browser_run_code` | Run Playwright code snippets directly |
43
+ | `browser_close` | Clean up browser session |
44
+
45
+ **Key principle:** `browser_snapshot` returns the accessibility tree with element refs. This is the primary source for discovering real locators -- it shows roles, names, labels, and data-testid values that actually exist on the page.
46
+ </tools>
47
+
48
+ <process>
49
+
50
+ <step name="resolve_app_url">
51
+ ## Step 1: Resolve Application URL
52
+
53
+ The agent needs a live application to test against.
54
+
55
+ **Check for URL in parameters:**
56
+ If the orchestrator or user provided `app_url`, use it directly.
57
+
58
+ **Auto-detect dev server:**
59
+ If no URL provided, check common dev server ports:
60
+
61
+ ```bash
62
+ # Check if any common dev server is running
63
+ for port in 3000 3001 4200 5173 5174 8080 8000 8888; do
64
+ curl -s -o /dev/null -w "%{http_code}" "http://localhost:${port}" 2>/dev/null
65
+ done
66
+ ```
67
+
68
+ If a server responds with 200, use that URL. If multiple respond, present options to user.
69
+
70
+ **If no server found:**
71
+
72
+ ```
73
+ CHECKPOINT:
74
+ type: human-action
75
+ blocking: "No running application detected"
76
+ details: "Checked ports: 3000, 3001, 4200, 5173, 5174, 8080, 8000, 8888. No HTTP response."
77
+ awaiting: "Provide the application URL, or start your dev server and retry."
78
+ ```
79
+ </step>
80
+
81
+ <step name="catalog_e2e_files">
82
+ ## Step 2: Catalog E2E Test Files
83
+
84
+ Identify all E2E test files and their corresponding POM files to run.
85
+
86
+ ```bash
87
+ # Find E2E test specs
88
+ find . -name '*.e2e.spec.*' -o -name '*.e2e.cy.*' -o -name '*.e2e.test.*' | sort
89
+
90
+ # Find POM files
91
+ find . -path '*/pages/*' -o -path '*/page-objects/*' | grep -E '\.(ts|js|py)$' | sort
92
+ ```
93
+
94
+ Build a test manifest:
95
+ ```
96
+ E2E_FILES:
97
+ - path: "tests/e2e/smoke/login.e2e.spec.ts"
98
+ pages_involved: ["LoginPage"]
99
+ routes: ["/login", "/dashboard"]
100
+ - path: "tests/e2e/smoke/checkout.e2e.spec.ts"
101
+ pages_involved: ["CheckoutPage", "CartPage"]
102
+ routes: ["/cart", "/checkout", "/checkout/confirm"]
103
+ ```
104
+
105
+ Extract routes from test files by reading `page.goto()`, `navigate()`, or route-related calls.
106
+ </step>
107
+
108
+ <step name="inspect_pages">
109
+ ## Step 3: Inspect Live Pages and Capture Real Locators
110
+
111
+ For each route in the test manifest, navigate to the page and capture its real structure.
112
+
113
+ **For each route:**
114
+
115
+ 1. **Navigate:**
116
+ ```
117
+ browser_navigate(url: "{app_url}{route}")
118
+ ```
119
+
120
+ 2. **Wait for page to load:**
121
+ ```
122
+ browser_wait_for(time: 2)
123
+ ```
124
+
125
+ 3. **Capture accessibility snapshot:**
126
+ ```
127
+ browser_snapshot()
128
+ ```
129
+ This returns the accessibility tree with all elements, their roles, names, and refs. This is the source of truth for what locators actually exist on the page.
130
+
131
+ 4. **Extract existing data-testid values:**
132
+ ```
133
+ browser_evaluate(function: "() => {
134
+ const elements = document.querySelectorAll('[data-testid]');
135
+ return Array.from(elements).map(el => ({
136
+ testid: el.getAttribute('data-testid'),
137
+ tag: el.tagName.toLowerCase(),
138
+ role: el.getAttribute('role') || '',
139
+ text: el.textContent?.trim().substring(0, 50) || '',
140
+ visible: el.offsetParent !== null
141
+ }));
142
+ }")
143
+ ```
144
+
145
+ 5. **Extract interactive elements:**
146
+ ```
147
+ browser_evaluate(function: "() => {
148
+ const selectors = 'button, input, select, textarea, a[href], [role=\"button\"], [role=\"link\"], [role=\"tab\"], [role=\"checkbox\"], [role=\"radio\"]';
149
+ const elements = document.querySelectorAll(selectors);
150
+ return Array.from(elements).map(el => ({
151
+ tag: el.tagName.toLowerCase(),
152
+ type: el.getAttribute('type') || '',
153
+ testid: el.getAttribute('data-testid') || '',
154
+ role: el.getAttribute('role') || '',
155
+ name: el.getAttribute('name') || '',
156
+ ariaLabel: el.getAttribute('aria-label') || '',
157
+ placeholder: el.getAttribute('placeholder') || '',
158
+ text: el.textContent?.trim().substring(0, 50) || '',
159
+ id: el.id || '',
160
+ visible: el.offsetParent !== null
161
+ }));
162
+ }")
163
+ ```
164
+
165
+ 6. **Take screenshot for reference:**
166
+ ```
167
+ browser_take_screenshot(type: "png", filename: ".qa-output/screenshots/{route-slug}.png")
168
+ ```
169
+
170
+ **Build a real locator map per route:**
171
+
172
+ ```
173
+ ROUTE: /login
174
+ REAL_LOCATORS:
175
+ - element: "email input"
176
+ best_locator: "getByTestId('login-email-input')" # Tier 1 - data-testid exists
177
+ fallback: "getByLabel('Email')" # Tier 2
178
+ role: "textbox"
179
+ name: "Email"
180
+ - element: "password input"
181
+ best_locator: "getByTestId('login-password-input')"
182
+ fallback: "getByLabel('Password')"
183
+ role: "textbox"
184
+ name: "Password"
185
+ - element: "submit button"
186
+ best_locator: "getByRole('button', { name: 'Log in' })" # Tier 1 - role + name
187
+ fallback: "getByText('Log in')" # Tier 2
188
+ role: "button"
189
+ name: "Log in"
190
+ ```
191
+
192
+ **Locator selection priority (from accessibility snapshot and evaluate results):**
193
+ 1. `data-testid` exists → use `getByTestId()`
194
+ 2. Role + accessible name is unique → use `getByRole()`
195
+ 3. Label exists → use `getByLabel()`
196
+ 4. Placeholder exists → use `getByPlaceholder()`
197
+ 5. Text content is unique and stable → use `getByText()`
198
+ 6. None of the above → use CSS selector with `// TODO: Request test ID` comment
199
+ </step>
200
+
201
+ <step name="compare_and_fix_locators">
202
+ ## Step 4: Compare Generated Locators vs Real Locators
203
+
204
+ For each E2E test file and its POM:
205
+
206
+ 1. **Read the generated file** and extract all locators used
207
+ 2. **Compare against real locator map** from Step 3
208
+ 3. **Identify mismatches:**
209
+ - Locator references an element that doesn't exist on the page
210
+ - Locator uses Tier 4 (CSS) when Tier 1 (testid/role) is available
211
+ - Locator text doesn't match actual text on page
212
+ - data-testid value in test doesn't match actual data-testid on page
213
+
214
+ 4. **Fix each mismatch:**
215
+ - Replace incorrect locators with real ones from the locator map
216
+ - Upgrade locator tier where possible (CSS → testid or role)
217
+ - Update text assertions with actual text from the page
218
+ - Add `// TODO: Request test ID` for elements that have no testid and no good role/label
219
+
220
+ 5. **Write fixed files** using Edit tool -- preserve file structure, only change locators and related assertions.
221
+
222
+ **Log all changes:**
223
+ ```
224
+ LOCATOR_FIXES:
225
+ - file: "pages/LoginPage.ts"
226
+ line: 12
227
+ was: "page.locator('.btn-primary')"
228
+ now: "page.getByRole('button', { name: 'Log in' })"
229
+ reason: "Upgraded from Tier 4 (CSS) to Tier 1 (role)"
230
+ - file: "tests/e2e/smoke/login.e2e.spec.ts"
231
+ line: 24
232
+ was: "expect(page.locator('.welcome-msg')).toHaveText('Welcome')"
233
+ now: "expect(page.getByTestId('dashboard-welcome-alert')).toHaveText('Welcome back, Test User')"
234
+ reason: "Fixed locator (CSS→testid) and assertion (vague→concrete from real page)"
235
+ ```
236
+ </step>
237
+
238
+ <step name="run_tests">
239
+ ## Step 5: Execute Tests
240
+
241
+ Run the E2E tests using the project's test runner.
242
+
243
+ **Detect test runner:**
```bash
# Prefer an explicit config file; only fall back to package.json if neither exists.
if [ -f "playwright.config.ts" ] || [ -f "playwright.config.js" ]; then
  RUNNER="playwright"
elif [ -f "cypress.config.ts" ] || [ -f "cypress.config.js" ]; then
  RUNNER="cypress"
elif grep -q "playwright" package.json 2>/dev/null; then
  RUNNER="playwright"
elif grep -q "cypress" package.json 2>/dev/null; then
  RUNNER="cypress"
fi
```

**Run tests:**

For Playwright:
```bash
npx playwright test {test_file_paths} --reporter=json 2>&1
```

For Cypress:
```bash
npx cypress run --spec "{test_file_paths}" --reporter json 2>&1
```

**Parse results:**
- Total tests, passed, failed, skipped
- For each failure: test name, error message, file path, line number
</step>

<step name="fix_loop">
## Step 6: Diagnose Failures and Fix (max 5 loops)

For each failing test:

1. **Read the error message** -- what assertion failed, what element wasn't found, what timeout hit

2. **Navigate to the failing page with browser tools:**
   ```
   browser_navigate(url: "{app_url}{failing_route}")
   browser_snapshot()
   ```

3. **Diagnose the failure type:**

   | Error Pattern | Diagnosis | Action |
   |---------------|-----------|--------|
   | "Element not found" / "Timeout waiting for selector" | Locator mismatch | Capture snapshot, find real element, fix locator |
   | "Expected X but received Y" | Assertion value wrong | Read real value from page, update assertion |
   | "Navigation timeout" | Page doesn't load / wrong URL | Check if route is correct, check for redirects |
   | "Element not visible" | Element exists but hidden | Check page state, may need to scroll or wait |
   | "API returned 401/403" | Auth issue | Test needs auth setup -- flag as test code error |
   | "Element is not interactable" | Overlay, modal, or loading state | Add wait_for before interaction |
   | "net::ERR_CONNECTION_REFUSED" | App not running | Flag as environment issue |

4. **For locator/assertion issues -- fix and continue:**
   - Use `browser_snapshot()` to get the real accessibility tree
   - Use `browser_evaluate()` to inspect specific elements
   - Use `browser_take_screenshot()` to visually confirm state
   - Edit the test/POM file with the correct locator or assertion value

5. **For application bugs -- classify and stop fixing that test:**
   - The page actually behaves incorrectly (button does nothing, form submits but errors, data not displayed)
   - Document: what was expected, what actually happened, screenshot as evidence
   - Do NOT fix the test to pass -- the test is correct, the app is wrong

6. **Re-run after fixes:**
   ```bash
   npx playwright test {fixed_files} --reporter=json 2>&1
   ```

7. **Repeat up to 5 times.** After 5 loops, classify remaining failures and stop.
</step>

<step name="produce_report">
## Step 7: Produce E2E Run Report

Write `{output_dir}/E2E_RUN_REPORT.md`:

```markdown
# E2E Test Execution Report

## Summary

| Metric | Value |
|--------|-------|
| App URL | {app_url} |
| Test files | {file_count} |
| Total tests | {total} |
| Passed | {passed} |
| Failed | {failed} |
| Fix loops used | {loop_count}/5 |

## Locator Fixes Applied

| File | Line | Was | Now | Reason |
|------|------|-----|-----|--------|
| ... | ... | ... | ... | ... |

## Test Results

### Passed
- [test name] -- {file}:{line}
- ...

### Failed (Application Bugs)
- [test name] -- {file}:{line}
  - **Expected:** {expected}
  - **Actual:** {actual}
  - **Evidence:** screenshot at {path}
  - **Classification:** APPLICATION BUG

### Failed (Unresolved after 5 fix loops)
- [test name] -- {file}:{line}
  - **Error:** {error}
  - **Attempts:** 5
  - **Classification:** {TEST CODE ERROR | ENVIRONMENT ISSUE | INCONCLUSIVE}

## Screenshots
- {route}: {screenshot_path}
- ...
```
</step>

<step name="cleanup">
## Step 8: Cleanup

```
browser_close()
```

**Return structured result to orchestrator:**

```
E2E_RUNNER_COMPLETE:
  app_url: "{app_url}"
  total_tests: N
  passed: N
  failed: N
  locator_fixes: N
  app_bugs_found: N
  fix_loops_used: N
  report_path: "{output_dir}/E2E_RUN_REPORT.md"
  screenshots: ["{path1}", "{path2}", ...]
```
</step>

</process>

<error_handling>
| Error | Cause | Action |
|-------|-------|--------|
| No app URL and no dev server detected | App not running | Checkpoint: ask user for URL or to start server |
| Browser not installed | Playwright browsers missing | Run `browser_install()` then retry |
| All tests timeout | App URL wrong or app crashed | Check URL, take screenshot, report as ENVIRONMENT ISSUE |
| Auth-gated pages | Tests need login first | Check if test has auth setup, suggest adding login fixture |
| Dynamic content changes between runs | Flaky locators | Prefer data-testid over text-based locators, add waits |
| Test runner not found | No playwright/cypress installed | Report as ENVIRONMENT ISSUE with install instructions |
</error_handling>

<success_criteria>
E2E runner is complete when:

- [ ] All pages in the test manifest were inspected with browser_snapshot
- [ ] Real locator map was built for every route
- [ ] Generated locators were compared and fixed where mismatched
- [ ] Tests were executed against the live app
- [ ] Failures were diagnosed using browser tools (snapshot, screenshot, evaluate)
- [ ] Fixable issues (locators, assertions) were auto-fixed (up to 5 loops)
- [ ] Application bugs were classified with evidence (not auto-fixed)
- [ ] E2E_RUN_REPORT.md was written with full results
- [ ] Locator registry updated with all real locators discovered during execution (`.qa-output/locators/`)
- [ ] Browser session was closed
</success_criteria>