npm - @matware/e2e-runner - Versions diffs - 1.1.1 → 1.2.1 - Mend

@matware/e2e-runner 1.1.1 → 1.2.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (39)

package/.claude-plugin/plugin.json +9 -0
package/.mcp.json +9 -0
package/README.md +475 -307
package/agents/test-analyzer.md +81 -0
package/agents/test-creator.md +102 -0
package/agents/test-improver.md +140 -0
package/bin/cli.js +194 -6
package/commands/create-test.md +50 -0
package/commands/run.md +49 -0
package/commands/verify-issue.md +63 -0
package/package.json +10 -2
package/skills/e2e-testing/SKILL.md +166 -0
package/skills/e2e-testing/references/action-types.md +100 -0
package/skills/e2e-testing/references/test-json-format.md +159 -0
package/skills/e2e-testing/references/troubleshooting.md +182 -0
package/src/actions.js +273 -18
package/src/ai-generate.js +87 -7
package/src/config.js +28 -0
package/src/dashboard.js +156 -6
package/src/db.js +207 -13
package/src/index.js +9 -3
package/src/learner-markdown.js +177 -0
package/src/learner-neo4j.js +255 -0
package/src/learner-sqlite.js +354 -0
package/src/learner.js +413 -0
package/src/mcp-tools.js +448 -18
package/src/module-resolver.js +273 -0
package/src/narrate.js +225 -0
package/src/neo4j-pool.js +124 -0
package/src/reporter.js +35 -2
package/src/runner.js +120 -46
package/src/verify.js +5 -3
package/templates/build-dashboard.js +28 -0
package/templates/dashboard/app.js +1152 -0
package/templates/dashboard/styles.css +413 -0
package/templates/dashboard/template.html +201 -0
package/templates/dashboard.html +964 -378
package/templates/docker-compose-neo4j.yml +19 -0
package/templates/e2e.config.js +3 -0

package/README.md CHANGED

@@ -1,13 +1,58 @@
+<p align="right">
+  <strong>English</strong> · <a href="LEEME.md">Español</a>
+</p>
+<h1 align="center">@matware/e2e-runner</h1>
+<p align="center">
+  <strong>The AI-native E2E test runner that writes, runs, and debugs tests for you.</strong>
+</p>
 <p align="center">
   <img src="https://img.shields.io/npm/v/@matware/e2e-runner?color=blue" alt="npm version" />
   <img src="https://img.shields.io/node/v/@matware/e2e-runner" alt="node version" />
   <img src="https://img.shields.io/npm/l/@matware/e2e-runner" alt="license" />
   <img src="https://img.shields.io/badge/MCP-compatible-green" alt="MCP compatible" />
+  <img src="https://img.shields.io/badge/AI--native-Claude%20Code-blueviolet" alt="AI native" />
 </p>
-# @matware/e2e-runner
+<p align="center">
+  <img src="https://raw.githubusercontent.com/fastslack/mtw-e2e-runner/main/docs/screenshots/blog-dashboard-live-running.png" alt="E2E Runner Dashboard - Live Execution" width="800" />
+</p>
+---
+**E2E Runner** is a zero-code browser testing framework where tests are plain JSON files — no Playwright scripts, no Cypress boilerplate, no test framework to learn. Define what to click, type, and assert, and the runner executes it in parallel against a shared Chrome pool.
+But what makes it truly different is its **deep AI integration**. With a built-in [MCP server](https://modelcontextprotocol.io/), Claude Code can create tests from a conversation, run them, read the results, capture screenshots, and even visually verify that pages look correct — all without leaving the chat. Paste a GitHub issue URL and get a runnable test back. That's the workflow.
+### What you get
+🧪 **Zero-code tests** — JSON files that anyone on your team can read and write. No JavaScript, no compilation, no framework lock-in.
+🤖 **AI-powered testing** — Claude Code creates, executes, and debugs tests natively through 13 MCP tools. Ask it to "test the checkout flow" and it builds the JSON, runs it, and reports back.
+🐛 **Issue-to-Test pipeline** — Paste a GitHub or GitLab issue URL. The runner fetches it, generates E2E tests, runs them, and tells you: *bug confirmed* or *not reproducible*.
+👁️ **Visual verification** — Describe what the page should look like in plain English. The AI captures a screenshot and judges pass/fail against your description. No pixel-diffing setup needed.
+🧠 **Learning system** — Tracks test stability across runs. Detects flaky tests, unstable selectors, slow APIs, and error patterns — then surfaces actionable insights.
+⚡ **Parallel execution** — Run N tests simultaneously against a shared Chrome pool (browserless/chrome). Serial mode available for tests that share state.
+📊 **Real-time dashboard** — Live execution view, run history with pass-rate charts, screenshot gallery with hash-based search, expandable network request logs.
+🔁 **Smart retries** — Test-level and action-level retries with configurable delays. Flaky tests are detected and flagged automatically.
+📦 **Reusable modules** — Extract common flows (login, navigation, setup) into parameterized modules and reference them with `$use`.
-JSON-driven E2E test runner. Define browser tests as simple JSON action arrays, run them in parallel against a Chrome pool. No JavaScript test files, no complex setup.
+🏗️ **CI-ready** — JUnit XML output, exit code 1 on failure, auto-captured error screenshots. Drop-in GitHub Actions example included.
+🌐 **Multi-project** — One dashboard aggregates test results from all your projects. One Chrome pool serves them all.
+🐳 **Portable** — Chrome runs in Docker, tests are JSON files in your repo. Works on any machine with Node.js and Docker.
+### This is a test
 ```json
 [
@@ -25,15 +70,9 @@ JSON-driven E2E test runner. Define browser tests as simple JSON action arrays,
 ]
 ```
----
-## Why
+No imports. No `describe`/`it`. No compilation step. Just a JSON file that describes what a user does — and the runner makes it happen.
-- **No code** -- Tests are JSON files. QA, product, and devs can all write them.
-- **Parallel** -- Run N tests simultaneously against a shared Chrome pool.
-- **Portable** -- Chrome runs in Docker, tests run anywhere.
-- **CI-ready** -- JUnit XML output, exit code 1 on failure, error screenshots.
-- **AI-native** -- Built-in MCP server for Claude Code integration.
+---
 ## Quick Start
@@ -43,8 +82,6 @@ JSON-driven E2E test runner. Define browser tests as simple JSON action arrays,
 curl -fsSL https://raw.githubusercontent.com/fastslack/mtw-e2e-runner/main/scripts/quickstart.sh | bash
 ```
-This checks prerequisites, installs the package, scaffolds the project, starts the Chrome pool, and runs the sample tests.
 **Step by step:**
 ```bash
@@ -67,19 +104,17 @@ npx e2e-runner dashboard
 **Add to Claude Code** (once, available in all projects):
 ```bash
+# Full plugin: MCP tools + skills + commands + agents
+claude plugin install npm:@matware/e2e-runner
+# Or MCP-only (tools without skills/commands/agents):
 claude mcp add --transport stdio --scope user e2e-runner \
   -- npx -y -p @matware/e2e-runner e2e-runner-mcp
 ```
-The `init` command creates:
+The **plugin** is the recommended approach — it installs the 13 MCP tools *plus* a skill that teaches Claude the optimal workflow, 3 slash commands (`/e2e-runner:run`, `/e2e-runner:create-test`, `/e2e-runner:verify-issue`), and 2 specialized agents for test analysis and creation.
-```
-e2e/
-  tests/
-    01-sample.json      # Sample test suite
-  screenshots/          # Reports and error screenshots
-e2e.config.js           # Configuration file
-```
+---
 ## Test Format
@@ -110,422 +145,565 @@ Suite files can have numeric prefixes for ordering (`01-auth.json`, `02-dashboar
 | `click` | `selector` or `text` | Click by CSS selector or visible text content |
 | `type` / `fill` | `selector`, `value` | Clear field and type text |
 | `wait` | `selector`, `text`, or `value` (ms) | Wait for element, text, or fixed delay |
-| `assert_text` | `text` | Assert text exists on the page |
-| `assert_url` | `value` | Assert current URL contains value |
-| `assert_visible` | `selector` | Assert element is visible |
-| `assert_count` | `selector`, `value` | Assert element count matches |
 | `screenshot` | `value` (filename) | Capture a screenshot |
 | `select` | `selector`, `value` | Select a dropdown option |
 | `clear` | `selector` | Clear an input field |
-| `press` | `value` | Press a keyboard key (e.g. `Enter`, `Tab`) |
+| `press` | `value` | Press a keyboard key (`Enter`, `Tab`, etc.) |
 | `scroll` | `selector` or `value` (px) | Scroll to element or by pixel amount |
 | `hover` | `selector` | Hover over an element |
 | `evaluate` | `value` | Execute JavaScript in the browser context |
+| `navigate` | `value` | Browser navigation (`back`, `forward`, `reload`) |
+| `clear_cookies` | — | Clear all cookies for the current page |
+### Assertions
+| Action | Fields | Description |
+|--------|--------|-------------|
+| `assert_text` | `text` | Assert text exists anywhere on the page (substring) |
+| `assert_element_text` | `selector`, `text`, optional `value: "exact"` | Assert element's text contains (or exactly matches) the expected text |
+| `assert_url` | `value` | Assert current URL path or full URL. Paths (`/dashboard`) compare against pathname only |
+| `assert_visible` | `selector` | Assert element exists and is visible |
+| `assert_not_visible` | `selector` | Assert element is hidden or doesn't exist |
+| `assert_attribute` | `selector`, `value` | Check attribute: `"type=email"` for value, `"disabled"` for existence |
+| `assert_class` | `selector`, `value` | Assert element has a CSS class |
+| `assert_input_value` | `selector`, `value` | Assert input/select/textarea `.value` contains text |
+| `assert_matches` | `selector`, `value` (regex) | Assert element text matches a regex pattern |
+| `assert_count` | `selector`, `value` | Assert element count: exact (`"5"`), or operators (`">3"`, `">=1"`, `"<10"`) |
+| `assert_no_network_errors` | — | Fail if any network requests failed (e.g. `ERR_CONNECTION_REFUSED`) |
+| `get_text` | `selector` | Extract element text (non-assertion, never fails). Result: `{ value: "..." }` |
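The count expressions accepted by `assert_count` can be read as a small comparison grammar. The sketch below is illustrative only and is not taken from the package source: `matchesCount` is a hypothetical helper, and support for `<=` is an assumption added by analogy (the table lists only exact values, `>`, `>=`, and `<`).

```javascript
// Hypothetical helper: interprets an assert_count value such as "5", ">3", ">=1", "<10".
// "<=" is included by analogy and is an assumption, not documented behavior.
function matchesCount(actual, expected) {
  const match = /^(>=|<=|>|<)?\s*(\d+)$/.exec(String(expected).trim());
  if (!match) {
    throw new Error(`Unsupported count expression: ${expected}`);
  }
  const operator = match[1] || '==';
  const target = Number(match[2]);
  switch (operator) {
    case '>': return actual > target;
    case '>=': return actual >= target;
    case '<': return actual < target;
    case '<=': return actual <= target;
    default: return actual === target; // bare number means exact match
  }
}

console.log(matchesCount(4, '>3'));
console.log(matchesCount(0, '>=1'));
```

Rejecting unknown expressions with an error, rather than silently treating them as a mismatch, keeps a typo in a test file from looking like a genuine assertion failure.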
 ### Click by Text
-When `click` uses `text` instead of `selector`, it searches across interactive elements:
+When `click` uses `text` instead of `selector`, it searches across common interactive and content elements:
 ```
-button, a, [role="button"], [role="tab"], [role="menuitem"], div[class*="cursor"], span
+button, a, [role="button"], [role="tab"], [role="menuitem"], [role="option"],
+[role="listitem"], div[class*="cursor"], span, li, td, th, label, p, h1-h6
 ```
 ```json
 { "type": "click", "text": "Sign In" }
 ```
-## CLI
+### Framework-Aware Actions
-```bash
-# Run tests
-npx e2e-runner run --all                  # All suites
-npx e2e-runner run --suite auth           # Single suite
-npx e2e-runner run --tests path/to.json   # Specific file
-npx e2e-runner run --inline '<json>'      # Inline JSON
-# Pool management
-npx e2e-runner pool start                 # Start Chrome container
-npx e2e-runner pool stop                  # Stop Chrome container
-npx e2e-runner pool status                # Check pool health
+These actions handle common patterns in React/MUI apps that normally require verbose `evaluate` boilerplate:
-# Issue-to-test
-npx e2e-runner issue <url>                # Fetch issue details
-npx e2e-runner issue <url> --generate     # Generate test file via AI
-npx e2e-runner issue <url> --verify       # Generate + run + report
+| Action | Fields | Description |
+|--------|--------|-------------|
+| `type_react` | `selector`, `value` | Type into React controlled inputs using the native value setter. Dispatches `input` + `change` events so React state updates correctly. |
+| `click_regex` | `text` (regex), optional `selector`, optional `value: "last"` | Click element whose textContent matches a regex (case-insensitive). Default: first match. Use `value: "last"` for last match. |
+| `click_option` | `text` | Click a `[role="option"]` element by text — common in autocomplete/select dropdowns. |
+| `focus_autocomplete` | `text` (label text) | Focus an autocomplete input by its label text. Supports MUI and generic `[role="combobox"]`. |
+| `click_chip` | `text` | Click a chip/tag element by text. Searches `[class*="Chip"]`, `[class*="chip"]`, `[data-chip]`. |
-# Dashboard
-npx e2e-runner dashboard                  # Start web dashboard
+```json
+// Before: 5 lines of evaluate boilerplate
+{ "type": "evaluate", "value": "const input = document.querySelector('#search'); const nativeSet = Object.getOwnPropertyDescriptor(window.HTMLInputElement.prototype, 'value').set; nativeSet.call(input, 'term'); input.dispatchEvent(new Event('input', {bubbles: true})); input.dispatchEvent(new Event('change', {bubbles: true}));" }
-# Other
-npx e2e-runner list                       # List available suites
-npx e2e-runner init                       # Scaffold project
+// After: 1 action
+{ "type": "type_react", "selector": "#search", "value": "term" }
 ```
-### CLI Options
+---
-| Flag | Default | Description |
-|------|---------|-------------|
-| `--base-url <url>` | `http://host.docker.internal:3000` | Application base URL |
-| `--pool-url <ws>` | `ws://localhost:3333` | Chrome pool WebSocket URL |
-| `--tests-dir <dir>` | `e2e/tests` | Tests directory |
-| `--screenshots-dir <dir>` | `e2e/screenshots` | Screenshots/reports directory |
-| `--concurrency <n>` | `3` | Parallel test workers |
-| `--timeout <ms>` | `10000` | Default action timeout |
-| `--retries <n>` | `0` | Retry failed tests N times |
-| `--retry-delay <ms>` | `1000` | Delay between retries |
-| `--test-timeout <ms>` | `60000` | Per-test timeout |
-| `--output <format>` | `json` | Report format: `json`, `junit`, `both` |
-| `--env <name>` | `default` | Environment profile |
-| `--pool-port <port>` | `3333` | Chrome pool port |
-| `--max-sessions <n>` | `10` | Max concurrent Chrome sessions |
-| `--project-name <name>` | dir name | Project display name for dashboard |
+## Retries
-## Configuration
+### Test-Level Retry
-Create `e2e.config.js` (or `e2e.config.json`) in your project root:
+Retry an entire test on failure. Set globally via config or per-test:
-```js
-export default {
-  baseUrl: 'http://host.docker.internal:3000',
-  concurrency: 4,
-  retries: 2,
-  testTimeout: 30000,
-  outputFormat: 'both',
+```json
+{ "name": "flaky-test", "retries": 3, "timeout": 15000, "actions": [...] }
+```
-  hooks: {
-    beforeEach: [{ type: 'goto', value: '/' }],
-    afterEach: [{ type: 'screenshot', value: 'after-test.png' }],
-  },
+Tests that pass after retry are flagged as **flaky** in the report and learning system.
-  environments: {
-    staging: { baseUrl: 'https://staging.example.com' },
-    production: { baseUrl: 'https://example.com', concurrency: 5 },
-  },
-};
+### Action-Level Retry
+Retry a single action without rerunning the entire test. Useful for timing-sensitive clicks and waits:
+```json
+{ "type": "click", "selector": "#dynamic-btn", "retries": 3 }
+{ "type": "wait", "selector": ".lazy-loaded", "retries": 2 }
 ```
-### Config Priority (highest wins)
+Set globally: `actionRetries` in config, `--action-retries <n>` CLI, or `ACTION_RETRIES` env var. Delay between retries: `actionRetryDelay` (default 500ms).
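One way to picture action-level retry is a loop around a single async step. This is a minimal sketch under stated assumptions, not the runner's code: `runWithRetries` is a hypothetical name, `retries` is assumed to count extra attempts after the first failure, and the delay is assumed to be fixed (mirroring the 500ms default of `actionRetryDelay`).

```javascript
// Hypothetical helper, not part of @matware/e2e-runner.
// Assumes `retries` counts extra attempts after the first one,
// and that the pause between attempts is a fixed delay.
async function runWithRetries(action, { retries = 0, delayMs = 500 } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      const result = await action(attempt);
      return { result, attempts: attempt + 1 };
    } catch (error) {
      lastError = error;
      // Only wait if another attempt is still available.
      if (attempt < retries) {
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError; // every attempt failed
}
```

Returning the attempt count alongside the result is what would let a caller flag a step that only succeeded after retrying.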
-1. CLI flags (`--base-url`, `--concurrency`, ...)
-2. Environment variables (`BASE_URL`, `CONCURRENCY`, ...)
-3. Config file (`e2e.config.js` or `e2e.config.json`)
-4. Defaults
+---
-When `--env <name>` is set, the matching profile from `environments` overrides everything.
-### Environment Variables
-| Variable | Maps to |
-|----------|---------|
-| `BASE_URL` | `baseUrl` |
-| `CHROME_POOL_URL` | `poolUrl` |
-| `TESTS_DIR` | `testsDir` |
-| `SCREENSHOTS_DIR` | `screenshotsDir` |
-| `CONCURRENCY` | `concurrency` |
-| `DEFAULT_TIMEOUT` | `defaultTimeout` |
-| `POOL_PORT` | `poolPort` |
-| `MAX_SESSIONS` | `maxSessions` |
-| `RETRIES` | `retries` |
-| `RETRY_DELAY` | `retryDelay` |
-| `TEST_TIMEOUT` | `testTimeout` |
-| `OUTPUT_FORMAT` | `outputFormat` |
-| `E2E_ENV` | `env` |
-| `PROJECT_NAME` | `projectName` |
-| `ANTHROPIC_API_KEY` | `anthropicApiKey` |
-| `ANTHROPIC_MODEL` | `anthropicModel` |
+## Serial Tests
-## Hooks
+Tests that share state (e.g., two tests modifying the same record) can race when running in parallel. Mark them as serial:
+```json
+{ "name": "create-patient", "serial": true, "actions": [...] }
+{ "name": "verify-patient-list", "serial": true, "actions": [...] }
+```
-Hooks run actions at lifecycle points. Define them globally in config or per-suite in the JSON file:
+Serial tests run one at a time **after** all parallel tests finish — preventing interference without slowing down independent tests.
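The two-phase ordering described above can be sketched as a small scheduler. Everything here is an assumption for illustration: `runSuite` and `runOne` are hypothetical names, and the worker-pool shape is not taken from the package source.

```javascript
// Hypothetical scheduler sketch; `runOne` is a caller-supplied async function.
async function runSuite(tests, runOne, concurrency = 3) {
  const parallel = tests.filter((t) => !t.serial);
  const serial = tests.filter((t) => t.serial);
  const results = [];

  // Phase 1: independent tests share a fixed number of workers.
  let next = 0;
  async function worker() {
    while (next < parallel.length) {
      const test = parallel[next++];
      results.push(await runOne(test));
    }
  }
  const workerCount = Math.min(concurrency, parallel.length);
  await Promise.all(Array.from({ length: workerCount }, worker));

  // Phase 2: serial tests run one at a time, in their original order.
  for (const test of serial) {
    results.push(await runOne(test));
  }
  return results;
}
```

Because phase 2 starts only after `Promise.all` resolves, a serial test never overlaps with a parallel one, which is the property the `serial` flag is meant to provide.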
+---
+## Reusable Modules
+Extract common flows into parameterized modules:
 ```json
+// e2e/modules/auth.json
 {
-  "hooks": {
-    "beforeAll": [{ "type": "goto", "value": "/login" }],
-    "beforeEach": [{ "type": "goto", "value": "/" }],
-    "afterEach": [],
-    "afterAll": []
+  "$module": "auth-jwt",
+  "description": "Inject JWT token into localStorage",
+  "params": {
+    "token": { "required": true, "description": "JWT token" },
+    "storageKey": { "default": "accessToken" }
   },
-  "tests": [
-    { "name": "test-1", "actions": [...] }
+  "actions": [
+    { "type": "evaluate", "value": "localStorage.setItem('{{storageKey}}', '{{token}}')" },
+    { "type": "goto", "value": "/dashboard" }
+  ]
+}
+```
+Use in tests:
+```json
+{
+  "name": "dashboard-loads",
+  "actions": [
+    { "$use": "auth-jwt", "params": { "token": "eyJhbG..." } },
+    { "type": "assert_text", "text": "Dashboard" }
   ]
 }
 ```
-Suite-level hooks override global hooks per key (non-empty array wins). The plain array format (`[{ name, actions }]`) is still supported.
+Modules support parameter validation (required params fail fast), conditional blocks (`{{#param}}...{{/param}}`), nested composition, and cycle detection.
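To make the `$use` expansion concrete, here is a rough sketch of parameter validation and `{{param}}` substitution. It is not the package's resolver: `expandModule` is a hypothetical function, and conditional blocks, nested composition, and cycle detection are deliberately left out.

```javascript
// Hypothetical sketch of `$use` expansion; conditional blocks, nesting,
// and cycle detection are omitted.
function expandModule(moduleDef, provided = {}) {
  const values = {};
  for (const [name, spec] of Object.entries(moduleDef.params || {})) {
    if (provided[name] !== undefined) {
      values[name] = provided[name];
    } else if (spec.default !== undefined) {
      values[name] = spec.default;
    } else if (spec.required) {
      // Fail fast before any action runs.
      throw new Error(`Module "${moduleDef.$module}" is missing required param "${name}"`);
    }
  }
  // Replace {{name}} placeholders; unknown names are left untouched.
  const fill = (text) =>
    text.replace(/\{\{(\w+)\}\}/g, (whole, name) => (name in values ? String(values[name]) : whole));
  return moduleDef.actions.map((action) =>
    Object.fromEntries(
      Object.entries(action).map(([key, value]) => [key, typeof value === 'string' ? fill(value) : value])
    )
  );
}

const authModule = {
  $module: 'auth-jwt',
  params: { token: { required: true }, storageKey: { default: 'accessToken' } },
  actions: [
    { type: 'evaluate', value: "localStorage.setItem('{{storageKey}}', '{{token}}')" },
    { type: 'goto', value: '/dashboard' },
  ],
};
console.log(expandModule(authModule, { token: 'abc' }));
```

The returned array is a plain list of actions, so a runner could splice it into the test in place of the `$use` entry.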
+---
+## Exclude Patterns
-## Retries and Timeouts
+Skip exploratory or draft tests from `--all` runs:
+```js
+// e2e.config.js
+export default {
+  exclude: ['explore-*', 'debug-*', 'draft-*'],
+};
+```
-Override globally or per-test:
+Individual suite runs (`--suite`) are not affected by exclude patterns.
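As an illustration of how patterns such as `explore-*` might be applied, the sketch below filters a list of suite names. It assumes the patterns use only the `*` wildcard and are matched against suite names; both assumptions, and the names `globToRegExp` and `filterSuites`, are hypothetical rather than taken from the package.

```javascript
// Hypothetical matcher; assumes patterns use only `*` and are tested against suite names.
function globToRegExp(pattern) {
  // Escape regex metacharacters except `*`, then turn `*` into "any characters".
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, '\\$&');
  return new RegExp(`^${escaped.replace(/\*/g, '.*')}$`);
}

function filterSuites(suiteNames, exclude = []) {
  const matchers = exclude.map(globToRegExp);
  return suiteNames.filter((name) => !matchers.some((re) => re.test(name)));
}

console.log(filterSuites(['auth', 'explore-cart', 'debug-login', 'checkout'], ['explore-*', 'debug-*', 'draft-*']));
```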
+---
+## Visual Verification
+Describe what the page should look like — AI judges pass/fail from screenshots:
 ```json
 {
-  "name": "flaky-test",
-  "retries": 3,
-  "timeout": 15000,
-  "actions": [...]
+  "name": "dashboard-loads",
+  "expect": "Patient list with at least 3 rows, no error messages, sidebar with navigation links",
+  "actions": [
+    { "type": "goto", "value": "/dashboard" },
+    { "type": "wait", "selector": ".patient-list" }
+  ]
 }
 ```
-- **Retries**: Each attempt gets its own fresh timeout. Tests that pass after retry are flagged as "flaky" in the report.
-- **Timeout**: Applied via `Promise.race()`. Defaults to 60s.
+After test actions complete, the runner auto-captures a verification screenshot. The MCP response includes the screenshot hash — Claude Code retrieves it and visually verifies against your `expect` description. No API key required.
-## CI/CD
+---
-### JUnit XML
+## Issue-to-Test
+Turn GitHub and GitLab issues into executable E2E tests. Paste an issue URL and get runnable tests — automatically.
+**How it works:**
+1. **Fetch** — Pulls issue details (title, body, labels) via `gh` or `glab` CLI
+2. **Generate** — AI creates JSON test actions based on the issue description
+3. **Run** — Optionally executes the tests immediately to verify if a bug is reproducible
 ```bash
-npx e2e-runner run --all --output junit
-# or: --output both (JSON + XML)
+# Fetch and display
+e2e-runner issue https://github.com/owner/repo/issues/42
+# Generate a test file via Claude API
+e2e-runner issue https://github.com/owner/repo/issues/42 --generate
+# Generate + run + report
+e2e-runner issue https://github.com/owner/repo/issues/42 --verify
+# -> "BUG CONFIRMED" or "NOT REPRODUCIBLE"
 ```
-Output saved to `e2e/screenshots/junit.xml`.
+In Claude Code, just ask:
+> "Fetch issue #42 and create E2E tests for it"
-### GitHub Actions
+**Bug verification logic:** Generated tests assert the **correct** behavior. Test failure = bug confirmed. All tests pass = not reproducible.
-```yaml
-jobs:
-  e2e:
-    runs-on: ubuntu-latest
-    steps:
-      - uses: actions/checkout@v4
-      - uses: actions/setup-node@v4
-        with:
-          node-version: 20
-      - run: npm ci
-      - run: npx e2e-runner pool start
-      - run: npx e2e-runner run --all --output junit
-      - uses: mikepenz/action-junit-report@v4
-        if: always()
-        with:
-          report_paths: e2e/screenshots/junit.xml
-```
+**Auth:** GitHub requires `gh` CLI, GitLab requires `glab` CLI. Self-hosted GitLab is supported.
-### Exit Codes
+---
-| Code | Meaning |
-|------|---------|
-| `0` | All tests passed |
-| `1` | One or more tests failed |
+## Learning System
-## Programmatic API
+The runner learns from every test run — building knowledge about your test suite over time.
-```js
-import { createRunner } from '@matware/e2e-runner';
+Query insights via the `e2e_learnings` MCP tool:
-const runner = await createRunner({ baseUrl: 'http://localhost:3000' });
+| Query | Returns |
+|-------|---------|
+| `summary` | Full health overview: pass rate, flaky tests, unstable selectors, API issues |
+| `flaky` | Tests that pass only after retries |
+| `selectors` | CSS selectors with high failure rates |
+| `pages` | Pages with console errors, network failures, load time issues |
+| `apis` | API endpoints with error rates and latency (auto-normalized: UUIDs, hashes, IDs) |
+| `errors` | Most frequent error patterns, categorized |
+| `trends` | Pass rate over time (auto-switches to hourly when all data is from one day) |
+| `test:<name>` | Drill-down history for a specific test |
+| `page:<path>` | Drill-down history for a specific page |
+| `selector:<value>` | Drill-down history for a specific selector |
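The `apis` row mentions that endpoints are auto-normalized so that requests differing only by an identifier are grouped together. A minimal sketch of that idea follows; the function name `normalizeApiPath`, the placeholder tokens (`:uuid`, `:id`, `:hash`), and the 16-character threshold for hashes are all assumptions, not the package's actual rules.

```javascript
// Hypothetical normalizer; placeholder names and thresholds are assumptions.
function normalizeApiPath(pathname) {
  return pathname
    .split('/')
    .map((segment) => {
      if (/^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i.test(segment)) return ':uuid';
      if (/^\d+$/.test(segment)) return ':id';           // purely numeric segment
      if (/^[0-9a-f]{16,}$/i.test(segment)) return ':hash'; // long hex string
      return segment;
    })
    .join('/');
}

console.log(normalizeApiPath('/api/users/42/orders/3f2504e0-4f89-11d3-9a0c-0305e82c3301'));
```

Grouping by the normalized path is what makes per-endpoint error rates and latency meaningful when every request carries a different ID.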
-// Run all suites
-const report = await runner.runAll();
+**Storage & export:**
+- SQLite (`~/.e2e-runner/dashboard.db`) — default, zero setup
+- Neo4j knowledge graph — optional, for relationship-based analysis. Manage via `e2e_neo4j` MCP tool or `docker compose`
+- Markdown report (`e2e/learnings.md`) — auto-generated after each run
-// Run a specific suite
-const report = await runner.runSuite('auth');
+**Test narration:** Each test run generates a human-readable narrative of what happened step by step, visible in the CLI output and the dashboard.
-// Run a specific file
-const report = await runner.runFile('e2e/tests/login.json');
+---
-// Run inline test objects
-const report = await runner.runTests([
-  {
-    name: 'quick-check',
-    actions: [
-      { type: 'goto', value: '/' },
-      { type: 'assert_text', text: 'Hello' },
-    ],
-  },
-]);
+## Web Dashboard
+Real-time UI for running tests, viewing results, screenshots, and network logs.
+```bash
+e2e-runner dashboard                  # Start on default port 8484
+e2e-runner dashboard --port 9090      # Custom port
 ```
-### Lower-Level Exports
+### Live Execution
-```js
-import {
-  loadConfig,
-  waitForPool, connectToPool, getPoolStatus, startPool, stopPool,
-  runTest, runTestsParallel, loadTestFile, loadTestSuite, loadAllSuites, listSuites,
-  generateReport, generateJUnitXML, saveReport, printReport,
-  executeAction,
-} from '@matware/e2e-runner';
+Monitor tests in real-time with step-by-step progress, durations, and active worker count.
+<p align="center">
+  <img src="https://raw.githubusercontent.com/fastslack/mtw-e2e-runner/main/docs/screenshots/blog-dashboard-live-running.png" alt="Dashboard - Live test execution" width="800" />
+</p>
+### Test Suites
+Browse all test suites across multiple projects. Run a single suite or all tests with one click.
+<p align="center">
+  <img src="https://raw.githubusercontent.com/fastslack/mtw-e2e-runner/main/docs/screenshots/blog-dashboard-suites.png" alt="Dashboard - Test suites grid" width="800" />
+</p>
+### Run History
+Track pass rate trends with the built-in chart. Click any row to expand full detail with per-test results, screenshot hashes, and errors.
+<p align="center">
+  <img src="https://raw.githubusercontent.com/fastslack/mtw-e2e-runner/main/docs/screenshots/blog-dashboard-runs.png" alt="Dashboard - Run history" width="800" />
+</p>
+### Run Detail
+Expanded view with PASS/FAIL badges, screenshot thumbnails with copyable hashes (`ss:77c28b5a`), formatted console errors, and network request logs.
+<p align="center">
+  <img src="https://raw.githubusercontent.com/fastslack/mtw-e2e-runner/main/docs/screenshots/blog-dashboard-run-detail.png" alt="Dashboard - Run detail" width="800" />
+</p>
+### Screenshot Gallery
+Browse all captured screenshots with hash search. Includes action screenshots, error screenshots, and verification captures.
+<p align="center">
+  <img src="https://raw.githubusercontent.com/fastslack/mtw-e2e-runner/main/docs/screenshots/blog-dashboard-screenshots-gallery.png" alt="Dashboard - Screenshot gallery" width="800" />
+</p>
+### Pool Status
+Monitor Chrome pool health: available slots, running sessions, memory pressure.
+<p align="center">
+  <img src="https://raw.githubusercontent.com/fastslack/mtw-e2e-runner/main/docs/screenshots/blog-dashboard-pool-status.png" alt="Dashboard - Pool status" width="800" />
+</p>
+---
+## Screenshot Capture
+Capture screenshots of any URL on demand — no test suite required:
+```bash
+e2e-runner capture https://example.com
+e2e-runner capture https://example.com --full-page --selector ".loaded" --delay 2000
 ```
-## Claude Code Integration (MCP)
+Via MCP, the `e2e_capture` tool supports `authToken` and `authStorageKey` for authenticated pages — it injects the token into localStorage before navigating.
-The package includes a built-in [MCP server](https://modelcontextprotocol.io/) that gives Claude Code native access to the test runner. Install once and it's available in every project.
+Every screenshot gets a deterministic hash (`ss:a3f2b1c9`). Use `e2e_screenshot` to retrieve any screenshot by hash — it returns the image with metadata (test name, step, type).
-**Via npm** (requires Node.js):
+---
+## Claude Code Integration
+The package ships as a **Claude Code plugin** — a single install that gives Claude native access to the test runner, teaches it the optimal workflow, and adds slash commands and specialized agents.
+### Install as Plugin (recommended)
 ```bash
-claude mcp add --transport stdio --scope user e2e-runner \
-  -- npx -y -p @matware/e2e-runner e2e-runner-mcp
+claude plugin install npm:@matware/e2e-runner
 ```
-**Via Docker** (no Node.js required):
+**What you get:**
+| Component | Description |
+|-----------|-------------|
+| **13 MCP tools** | Run tests, create test files, capture screenshots, query network logs, manage dashboard, verify issues, query learnings |
+| **Skill** | Teaches Claude the full e2e-runner workflow — how to combine tools, interpret results, debug failures, create tests |
+| **3 Commands** | `/e2e-runner:run` — run & analyze tests<br>`/e2e-runner:create-test` — explore UI and create tests<br>`/e2e-runner:verify-issue <url>` — verify GitHub/GitLab bugs |
+| **2 Agents** | **test-analyzer** — diagnoses failures, analyzes flaky tests, drills into network errors<br>**test-creator** — explores UI, discovers selectors, designs and validates tests |
+### Install MCP-only (alternative)
+If you only want the 13 MCP tools without skills, commands, or agents:
 ```bash
 claude mcp add --transport stdio --scope user e2e-runner \
-  -- docker run -i --rm fastslack/e2e-runner-mcp
+  -- npx -y -p @matware/e2e-runner e2e-runner-mcp
 ```
+### Slash Commands
+| Command | Description |
+|---------|-------------|
+| `/e2e-runner:run` | Check pool, list suites, run tests, analyze results with screenshots and network drill-down |
+| `/e2e-runner:create-test` | Explore the UI with screenshots, find selectors in source code, design test actions, create and validate |
+| `/e2e-runner:verify-issue <url>` | Fetch a GitHub/GitLab issue, create tests that verify correct behavior, report bug confirmed or not reproducible |
 ### MCP Tools
 | Tool | Description |
 |------|-------------|
-| `e2e_run` | Run tests (all suites, by suite name, or by file path) |
+| `e2e_run` | Run tests: all suites, by name, or by file. Supports `concurrency`, `baseUrl`, `retries`, `failOnNetworkError` overrides. Returns verification results if tests have `expect`. |
 | `e2e_list` | List available test suites with test names and counts |
-| `e2e_create_test` | Create a new test JSON file |
-| `e2e_pool_status` | Check Chrome pool availability and capacity |
-| `e2e_screenshot` | Retrieve a screenshot by its hash (e.g. `ss:a3f2b1c9`) |
-| `e2e_issue` | Fetch a GitHub/GitLab issue and generate E2E tests |
-> **Note:** Pool start/stop are only available via CLI (`e2e-runner pool start|stop`), not via MCP — restarting the pool kills all active sessions from other clients.
-All tools accept an optional `cwd` parameter (absolute path to the project root). Claude Code passes its current working directory so the MCP server resolves `e2e/tests/`, `e2e.config.js`, and `.e2e-pool/` relative to the correct project — even when switching between multiple projects in the same session.
-Once installed, Claude Code can run tests, analyze failures, and create new test files as part of its normal workflow. Just ask:
+| `e2e_create_test` | Create a new test JSON file with name, tests, and optional hooks |
+| `e2e_create_module` | Create a reusable module with parameterized actions |
+| `e2e_pool_status` | Check Chrome pool availability, running sessions, capacity |
+| `e2e_screenshot` | Retrieve a screenshot by hash (`ss:a3f2b1c9`). Returns image + metadata |
+| `e2e_capture` | Capture screenshot of any URL. Supports `authToken`, `fullPage`, `selector`, `delay` |
+| `e2e_dashboard_start` | Start the web dashboard |
+| `e2e_dashboard_stop` | Stop the web dashboard |
+| `e2e_issue` | Fetch GitHub/GitLab issue and generate tests. `mode: "prompt"` or `mode: "verify"` |
+| `e2e_network_logs` | Query network request/response logs by `runDbId`. Filter by test name, method, status, URL pattern. Supports headers and bodies |
+| `e2e_learnings` | Query the learning system: `summary`, `flaky`, `selectors`, `pages`, `apis`, `errors`, `trends` |
+| `e2e_neo4j` | Manage Neo4j knowledge graph container: `start`, `stop`, `status` |
+> **Note:** Pool start/stop are CLI-only (`e2e-runner pool start|stop`) — not exposed via MCP to prevent killing active sessions.
+### What You Can Ask Claude Code
 > "Run all E2E tests"
 > "Create a test that verifies the checkout flow"
-> "What's the status of the Chrome pool?"
-### Verify Installation
+> "What tests are flaky? Show me the learning summary"
+> "Capture a screenshot of /dashboard with auth"
+> "Fetch issue #42 and create tests for it"
+> "What's the API error rate for the last 7 days?"
-```bash
-claude mcp list
-# e2e-runner: ... - Connected
-```
+---
-## Issue-to-Test
+## Network Error Handling
-Turn GitHub and GitLab issues into executable E2E tests. Paste an issue URL and get runnable tests -- automatically.
+### Explicit Assertion
-### How It Works
+Place `assert_no_network_errors` after critical page loads:
-1. **Fetch** -- Pulls issue details (title, body, labels) via `gh` or `glab` CLI
-2. **Generate** -- AI creates JSON test actions based on the issue description
-3. **Run** -- Optionally executes the tests immediately to verify if a bug is reproducible
+```json
+{ "type": "goto", "value": "/dashboard" },
+{ "type": "wait", "selector": ".loaded" },
+{ "type": "assert_no_network_errors" }
+```
-### Two Modes
+### Global Flag
-**Prompt mode** (default, no API key): Returns issue data + a structured prompt. Claude Code uses its own intelligence to create tests via `e2e_create_test` and run them.
+Set `failOnNetworkError: true` in the config, or pass `--fail-on-network-error` on the CLI, to automatically fail any test with network errors:
-**Verify mode** (requires `ANTHROPIC_API_KEY`): Calls Claude API directly, generates tests, runs them, and reports whether the bug is confirmed or not reproducible.
+```bash
+e2e-runner run --all --fail-on-network-error
+```
-### CLI
+When disabled (default), the runner still collects and reports network errors — the MCP response includes a warning when tests pass but have network errors.
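The behaviour described above can be sketched as a small decision function. This is an illustrative model only, not the runner's actual code: the result shape (`passed`, `networkErrors`, `warning`, `error`) and the helper name `applyNetworkPolicy` are assumptions made for the example.

```javascript
// Hypothetical result shape: { name, passed, networkErrors: [...] }.
// Neither this shape nor the function name comes from the package itself.
function applyNetworkPolicy(result, failOnNetworkError = false) {
  const errorCount = result.networkErrors.length;

  // No network errors: the outcome is left untouched.
  if (errorCount === 0) {
    return { ...result, warning: null };
  }

  // Flag enabled: any network error turns the test into a failure.
  if (failOnNetworkError) {
    return {
      ...result,
      passed: false,
      warning: null,
      error: result.error ?? `${errorCount} network error(s)`,
    };
  }

  // Flag disabled (default): keep the outcome, but surface a warning
  // when a passing test still produced network errors.
  return {
    ...result,
    warning: result.passed ? `passed with ${errorCount} network error(s)` : null,
  };
}

const sample = {
  name: 'dashboard-loads',
  passed: true,
  networkErrors: [{ url: '/api/stats', status: 500 }],
};

const lenient = applyNetworkPolicy(sample);       // default: flag disabled
const strict = applyNetworkPolicy(sample, true);  // flag enabled

console.log(lenient.passed, lenient.warning);
console.log(strict.passed, strict.error);
```

In this model the default mode never changes a test's outcome; it only attaches a warning, which matches the reporting behaviour the README describes.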
-```bash
-# Fetch and display issue details
-e2e-runner issue https://github.com/owner/repo/issues/42
+### Full Network Logging
-# Generate a test file via Claude API
-e2e-runner issue https://github.com/owner/repo/issues/42 --generate
-# -> Creates e2e/tests/issue-42.json
+All XHR/fetch requests are captured with: URL, method, status, duration, request/response headers, and response body (truncated at 50KB). Viewable in the dashboard with expandable request detail rows.
-# Generate + run + report bug status
-e2e-runner issue https://github.com/owner/repo/issues/42 --verify
-# -> "BUG CONFIRMED" or "NOT REPRODUCIBLE"
+**MCP drill-down flow:**
-# Output AI prompt as JSON (for piping)
-e2e-runner issue https://github.com/owner/repo/issues/42 --prompt
+```
+1. e2e_run          → compact networkSummary + runDbId
+2. e2e_network_logs(runDbId)                     → all requests (url, method, status, duration)
+3. e2e_network_logs(runDbId, errorsOnly: true)   → only failed requests
+4. e2e_network_logs(runDbId, includeHeaders: true) → with headers
+5. e2e_network_logs(runDbId, includeBodies: true)  → full request/response bodies
 ```
-### MCP
+The `e2e_run` response stays compact (~5KB) regardless of how many requests were captured. Use `e2e_network_logs` with the returned `runDbId` to drill into details on demand.
-In Claude Code, the `e2e_issue` tool handles everything:
+---
-> "Fetch issue https://github.com/owner/repo/issues/42 and create E2E tests for it"
+## Hooks
-Claude Code receives the issue data, generates appropriate test actions, saves them via `e2e_create_test`, and runs them with `e2e_run`.
+Run actions at lifecycle points. Define globally in config or per-suite:
-### Auth Requirements
+```json
+{
+  "hooks": {
+    "beforeAll": [{ "type": "goto", "value": "/setup" }],
+    "beforeEach": [{ "type": "goto", "value": "/" }],
+    "afterEach": [{ "type": "screenshot", "value": "after.png" }],
+    "afterAll": []
+  },
+  "tests": [...]
+}
+```
-- **GitHub**: `gh` CLI authenticated (`gh auth login`)
-- **GitLab**: `glab` CLI authenticated (`glab auth login`)
+> **Important:** `beforeAll` runs on a separate browser page that is closed before tests start. Use `beforeEach` for state that tests need (cookies, localStorage, auth tokens).
-Provider is auto-detected from the URL. Self-hosted GitLab is supported via `glab` config.
+---
-### Bug Verification Logic
+## CLI
-Generated tests assert the **correct** behavior. If the tests fail, the correct behavior doesn't work -- bug confirmed. If all tests pass, the bug is not reproducible.
+```bash
+# Run tests
+e2e-runner run --all                  # All suites
+e2e-runner run --suite auth           # Single suite
+e2e-runner run --tests path/to.json   # Specific file
+e2e-runner run --inline '<json>'      # Inline JSON
-## Web Dashboard
+# Pool management (CLI only, not MCP)
+e2e-runner pool start                 # Start Chrome container
+e2e-runner pool stop                  # Stop Chrome container
+e2e-runner pool status                # Check pool health
-Real-time UI for running tests, viewing results, screenshots, and run history.
+# Issue-to-test
+e2e-runner issue <url>                # Fetch issue
+e2e-runner issue <url> --generate     # Generate test via AI
+e2e-runner issue <url> --verify       # Generate + run + report
-```bash
-e2e-runner dashboard                  # Start on default port 8484
-e2e-runner dashboard --port 9090      # Custom port
+# Dashboard
+e2e-runner dashboard                  # Start web dashboard
+# Other
+e2e-runner list                       # List available suites
+e2e-runner capture <url>              # On-demand screenshot
+e2e-runner init                       # Scaffold project
 ```
-### Live Execution
+### CLI Options
-Monitor tests in real-time as they run. Each test shows its steps with individual durations, pass/fail status, and active connection count.
+| Flag | Default | Description |
+|------|---------|-------------|
+| `--base-url <url>` | `http://host.docker.internal:3000` | Application base URL |
+| `--pool-url <ws>` | `ws://localhost:3333` | Chrome pool WebSocket URL |
+| `--concurrency <n>` | `3` | Parallel test workers |
+| `--retries <n>` | `0` | Retry failed tests N times |
+| `--action-retries <n>` | `0` | Retry failed actions N times |
+| `--test-timeout <ms>` | `60000` | Per-test timeout |
+| `--timeout <ms>` | `10000` | Default action timeout |
+| `--output <format>` | `json` | Report: `json`, `junit`, `both` |
+| `--env <name>` | `default` | Environment profile |
+| `--fail-on-network-error` | `false` | Fail tests with network errors |
+| `--project-name <name>` | dir name | Project display name |
-<p align="center">
-  <img src="https://raw.githubusercontent.com/fastslack/mtw-e2e-runner/main/docs/screenshots/blog-dashboard-live-running.png" alt="Dashboard - Live test execution" width="900" />
-</p>
+---
-### Test Suites
+## Configuration
-Browse all test suites across multiple projects. Run a single suite or all tests with one click.
+Create `e2e.config.js` in your project root:
-<p align="center">
-  <img src="https://raw.githubusercontent.com/fastslack/mtw-e2e-runner/main/docs/screenshots/blog-dashboard-suites.png" alt="Dashboard - Test suites grid" width="900" />
-</p>
+```js
+export default {
+  baseUrl: 'http://host.docker.internal:3000',
+  concurrency: 4,
+  retries: 2,
+  actionRetries: 1,
+  testTimeout: 30000,
+  outputFormat: 'both',
+  failOnNetworkError: true,
+  exclude: ['explore-*', 'debug-*'],
-### Run History
+  hooks: {
+    beforeEach: [{ type: 'goto', value: '/' }],
+  },
-Track pass rate trends over time with the bar chart. Click any row to expand the full run detail with per-test results, screenshots, and console errors.
+  environments: {
+    staging: { baseUrl: 'https://staging.example.com' },
+    production: { baseUrl: 'https://example.com', concurrency: 5 },
+  },
+};
+```
-<p align="center">
-  <img src="https://raw.githubusercontent.com/fastslack/mtw-e2e-runner/main/docs/screenshots/blog-dashboard-runs.png" alt="Dashboard - Run history with trend chart" width="900" />
-</p>
+### Config Priority (highest wins)
-### Run Detail
+1. CLI flags
+2. Environment variables
+3. Config file (`e2e.config.js` or `e2e.config.json`)
+4. Defaults
-Expanded view shows each test with PASS/FAIL badge, screenshot thumbnails with copyable hashes (`ss:77c28b5a`), and formatted console errors.
+When `--env <name>` is set, the matching `environments` profile is applied last and overrides all of the sources above, including CLI flags.
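As a rough illustration of the priority list above, the cascade can be modelled as a shallow merge from lowest to highest priority, with the selected profile applied last. The helper `resolveConfig` and its parameter names are hypothetical, and the shallow-merge strategy is an assumption; the real logic lives in `src/config.js`, which is not shown here.

```javascript
// Hypothetical helper: merges config layers, lowest priority first.
// A shallow merge is assumed; the package may merge nested keys differently.
function resolveConfig({ defaults = {}, file = {}, env = {}, cli = {}, envName } = {}) {
  // Later spreads win: defaults < config file < environment variables < CLI flags.
  const merged = { ...defaults, ...file, ...env, ...cli };

  // When an environment name is given, its profile is applied on top of everything.
  const profile = envName && file.environments ? file.environments[envName] : undefined;
  return profile ? { ...merged, ...profile } : merged;
}

const resolved = resolveConfig({
  defaults: { baseUrl: 'http://host.docker.internal:3000', concurrency: 3 },
  file: {
    concurrency: 4,
    environments: { staging: { baseUrl: 'https://staging.example.com' } },
  },
  env: { concurrency: 2 },
  cli: { concurrency: 8 },
  envName: 'staging',
});

console.log(resolved.baseUrl, resolved.concurrency);
```

Here `concurrency` comes from the CLI layer because the `staging` profile does not set it, while `baseUrl` comes from the profile.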
-<p align="center">
-  <img src="https://raw.githubusercontent.com/fastslack/mtw-e2e-runner/main/docs/screenshots/blog-dashboard-run-detail.png" alt="Dashboard - Run detail with screenshot hashes" width="900" />
-</p>
+---
-### Screenshot Gallery
+## CI/CD
-Browse all captured screenshots per project. Includes both manual captures and error screenshots.
+### JUnit XML
-<p align="center">
-  <img src="https://raw.githubusercontent.com/fastslack/mtw-e2e-runner/main/docs/screenshots/blog-dashboard-screenshots-gallery.png" alt="Dashboard - Screenshot gallery" width="900" />
-</p>
+```bash
+e2e-runner run --all --output junit
+```
-## Architecture
+### GitHub Actions
-```
-bin/cli.js            CLI entry point (manual argv parsing)
-bin/mcp-server.js     MCP server entry point (stdio transport)
-src/config.js         Config cascade: defaults -> file -> env -> CLI -> profile
-src/pool.js           Chrome pool: Docker Compose lifecycle + WebSocket
-src/runner.js         Parallel test executor with retries and timeouts
-src/actions.js        Action engine: maps JSON actions to Puppeteer calls
-src/reporter.js       JSON reports, JUnit XML, console output
-src/mcp-server.js     MCP server: exposes tools for Claude Code
-src/mcp-tools.js      Shared MCP tool definitions and handlers
-src/dashboard.js      Web dashboard: HTTP server, REST API, WebSocket
-src/db.js             SQLite multi-project database
-src/issues.js         GitHub/GitLab issue fetching (gh/glab CLI)
-src/ai-generate.js    AI test generation (prompt builder + Claude API)
-src/verify.js         Bug verification orchestrator
-src/logger.js         ANSI colored logger
-src/index.js          Programmatic API (createRunner)
-templates/            Scaffolding templates for init command
+```yaml
+jobs:
+  e2e:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - uses: actions/setup-node@v4
+        with:
+          node-version: 20
+      - run: npm ci
+      - run: npx e2e-runner pool start
+      - run: npx e2e-runner run --all --output junit
+      - uses: mikepenz/action-junit-report@v4
+        if: always()
+        with:
+          report_paths: e2e/screenshots/junit.xml
 ```
-### How It Works
+---
-1. **Pool**: A Docker container running [browserless/chrome](https://github.com/browserless/browserless) provides shared Chrome instances via WebSocket.
-2. **Runner**: Spawns N parallel workers. Each worker connects to the pool, opens a new page, and executes actions sequentially.
-3. **Actions**: Each JSON action maps to a Puppeteer call (`page.goto`, `page.click`, `page.type`, etc.).
-4. **Reports**: Results are collected, aggregated into a report, and saved as JSON and/or JUnit XML.
+## Programmatic API
-The `baseUrl` defaults to `http://host.docker.internal:3000` because Chrome runs inside Docker and needs to reach the host machine.
+```js
+import { createRunner } from '@matware/e2e-runner';
+const runner = await createRunner({ baseUrl: 'http://localhost:3000' });
+// Each call returns a report; distinct names avoid redeclaring `const report`.
+const allReport = await runner.runAll();
+const suiteReport = await runner.runSuite('auth');
+const fileReport = await runner.runFile('e2e/tests/login.json');
+const inlineReport = await runner.runTests([
+  { name: 'quick-check', actions: [{ type: 'goto', value: '/' }] },
+]);
+```
+---
 ## Requirements
@@ -536,14 +714,4 @@ The `baseUrl` defaults to `http://host.docker.internal:3000` because Chrome runs
 Copyright 2025 Matias Aguirre (fastslack)
-Licensed under the Apache License, Version 2.0 (the "License");
-you may not use this file except in compliance with the License.
-You may obtain a copy of the License at
-    http://www.apache.org/licenses/LICENSE-2.0
-Unless required by applicable law or agreed to in writing, software
-distributed under the License is distributed on an "AS IS" BASIS,
-WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-See the License for the specific language governing permissions and
-limitations under the License.
+Licensed under the Apache License, Version 2.0. See [LICENSE](LICENSE) for details.