npm - @matware/e2e-runner - Versions diffs - 1.5.0 → 1.5.1 - Mend

@matware/e2e-runner 1.5.0 → 1.5.1


Files changed (23)

package/.claude-plugin/marketplace.json +3 -3
package/.claude-plugin/plugin.json +1 -1
package/LICENSE +1 -1
package/README.md +451 -274
package/agents/test-improver.md +2 -1
package/bin/cli.js +13 -2
package/package.json +2 -2
package/skills/e2e-testing/SKILL.md +2 -1
package/skills/e2e-testing/references/action-types.md +17 -18
package/skills/e2e-testing/references/troubleshooting.md +2 -26
package/src/actions.js +12 -2
package/src/dashboard.js +50 -5
package/src/db.js +15 -0
package/src/mcp-tools.js +238 -75
package/src/narrate.js +19 -0
package/src/runner.js +72 -14
package/src/visual-diff.js +8 -7
package/templates/dashboard/js/utils.js +23 -2
package/templates/dashboard/js/view-runs.js +94 -9
package/templates/dashboard/styles/components.css +17 -0
package/templates/dashboard/styles/view-runs.css +51 -4
package/templates/dashboard/template.html +2 -2
package/templates/dashboard.html +187 -17
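
The per-file counts above can be totalled with a short script. This is a minimal sketch, assuming each entry has the form `path +added -removed` as in the list above; `parseStat` and `totals` are hypothetical helper names for illustration and are not part of `@matware/e2e-runner`.

```javascript
// Hypothetical helper: parse one "path +A -D" entry from the file list.
function parseStat(line) {
  const m = /^(\S+) \+(\d+) -(\d+)$/.exec(line.trim());
  if (!m) throw new Error(`unrecognized stat line: ${line}`);
  return { path: m[1], added: Number(m[2]), removed: Number(m[3]) };
}

// Hypothetical helper: sum file count, added lines, and removed lines.
function totals(lines) {
  return lines.map(parseStat).reduce(
    (acc, s) => ({
      files: acc.files + 1,
      added: acc.added + s.added,
      removed: acc.removed + s.removed,
    }),
    { files: 0, added: 0, removed: 0 }
  );
}

// Three entries copied from the list above; paste the full list to total all 23 files.
const sample = [
  'package/README.md +451 -274',
  'package/src/mcp-tools.js +238 -75',
  'package/src/db.js +15 -0',
];
console.log(totals(sample));
```

The regular expression is deliberately strict, so an entry that does not match the assumed shape throws instead of being silently skipped.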

package/README.md CHANGED

@@ -21,87 +21,133 @@
   <a href="https://skills.sh"><img src="https://img.shields.io/badge/skills.sh-e2e--testing-ff6600" alt="Agent Skills" /></a>
 </p>
+---
+**E2E Runner** lets you test your web app without writing test code. Tests are plain JSON — and you don't even have to write that yourself: **just ask Claude Code.**
+## 🎬 Write a test by asking — then watch it run
 <p align="center">
-  <img src="https://raw.githubusercontent.com/fastslack/mtw-e2e-runner/main/docs/screenshots/blog-dashboard-live-running.png" alt="E2E Runner Dashboard - Live Execution" width="800" />
+  <img src="https://raw.githubusercontent.com/fastslack/mtw-e2e-runner/main/docs/screenshots/demo-live.gif" alt="Live dashboard streaming screenshots as a test suite runs" width="820" />
+  <br/><sub><em>The live dashboard while a suite runs — every step streams a screenshot into the feed, in real time.</em></sub>
 </p>
----
-**E2E Runner** is a zero-code browser testing framework where tests are plain JSON files — no Playwright scripts, no Cypress boilerplate, no test framework to learn. Define what to click, type, and assert, and the runner executes it in parallel against a shared Chrome pool.
+With the built-in [MCP server](https://modelcontextprotocol.io/), creating a test is a conversation — no docs, no syntax to memorize:
-But what makes it truly different is its **deep AI integration**. With a built-in [MCP server](https://modelcontextprotocol.io/), Claude Code can create tests from a conversation, run them, read the results, capture screenshots, and even visually verify that pages look correct — all without leaving the chat. Paste a GitHub issue URL and get a runnable test back. That's the workflow.
+> **You:** *Create an E2E test for the login flow and run it.*
+>
+> **Claude Code:** *writes the test, runs it in a real browser, and reports back —*
+> ✅ `login-flow` passed in 2.3s · screenshot saved · no network errors.
-### A test is just JSON
+Behind the scenes Claude just wrote and ran this. A test is **just JSON** — an ordered list of what a user does:
 ```json
 [
-  {
-    "name": "login-flow",
-    "actions": [
-      { "type": "goto", "value": "/login" },
-      { "type": "type", "selector": "#email", "value": "user@test.com" },
-      { "type": "type", "selector": "#password", "value": "secret" },
-      { "type": "click", "text": "Sign In" },
-      { "type": "assert_text", "text": "Welcome back" },
-      { "type": "screenshot", "value": "logged-in.png" }
-    ]
-  }
+  { "name": "login-flow", "actions": [
+    { "type": "goto", "value": "/login" },
+    { "type": "type", "selector": "#email", "value": "user@test.com" },
+    { "type": "type", "selector": "#password", "value": "secret" },
+    { "type": "click", "text": "Sign In" },
+    { "type": "assert_text", "text": "Welcome back" },
+    { "type": "screenshot", "value": "logged-in.png" }
+  ]}
 ]
 ```
-You describe what a user does — click this, type that, check the page says X — and the runner does it in a real browser. No imports, no `describe`/`it`, no build step. If you can read it, you can write it.
+No imports, no `describe`/`it`, no build step. If you can read it you can write it — or just ask.
----
-## Agent Skills
-Install E2E testing skills for any coding agent (Claude Code, Cursor, Codex, Copilot, and [40+ more](https://github.com/vercel-labs/skills#supported-agents)):
+**Connect it to Claude Code (2 commands):**
 ```bash
-npx skills add fastslack/mtw-e2e-runner
+claude plugin marketplace add fastslack/mtw-e2e-runner
+claude plugin install e2e-runner@matware
 ```
-This gives your agent the knowledge to create, run, and debug JSON-driven E2E tests — no documentation reading required.
+Now say *"create a test for X and run it"* — Claude gets 17 MCP tools, slash commands, and specialized agents.
-> Browse all available skills at [skills.sh](https://skills.sh)
+> Using a different agent (Cursor, Codex, Copilot, [40+ more](https://github.com/vercel-labs/skills#supported-agents))? Install the skill: `npx skills add fastslack/mtw-e2e-runner`
 ---
-## Getting Started
+## 📖 Contents
+|   | Section | What's inside |
+|---|---------|---------------|
+| 🚀 | **[Install &amp; first test](#install)** | npm setup · run with your own Chrome (no Docker), Obscura, or a Docker pool |
+| ✨ | **[What you get](#features)** | feature overview at a glance |
+| ✍️ | **[Writing tests](#writing-tests)** | test format · full action catalog · retries · serial · modules · auth · hooks |
+| 🤖 | **[AI integration](#ai)** | Claude Code · OpenCode · 17 MCP tools · visual verification · issue-to-test |
+| 📊 | **[Dashboard &amp; insights](#dashboard)** | live dashboard · learning system · network logs · screenshot capture |
+| 🌐 | **[Browser drivers](#drivers)** | browserless · cdp · lightpanda · obscura · steel |
+| ⚙️ | **[CLI, config &amp; CI](#reference)** | commands · flags · `e2e.config.js` · GitHub Actions · programmatic API |
+---
-You need just two things: **Node.js 20+** and **Docker running**. You don't install any browser — the runner spins up Chrome in a container for you.
+<a name="install"></a>
-### Try it in 60 seconds
+## 🚀 Install — it's tiny
 ```bash
 npm install --save-dev @matware/e2e-runner
 npx e2e-runner init        # scaffolds e2e/ with a sample test + config
-npx e2e-runner run --all   # runs it — Chrome starts automatically on first run
 ```
-That's the whole setup. No separate `pool start`, no browser download: the first run boots the Chrome pool for you and reuses it afterwards.
+Then pick how to run the browser. **You don't need Docker** unless you want the parallel pool:
+### Option 1 · Use the Chrome you already have — no Docker ⭐
-> Prefer a single command? `curl -fsSL https://raw.githubusercontent.com/fastslack/mtw-e2e-runner/main/scripts/quickstart.sh | bash`
+Launch any Chromium browser with a debugging port, then point the runner at it:
-### Point it at your app
+```bash
+google-chrome --headless=new --remote-debugging-port=9222 &   # or brave / chromium / msedge
+CHROME_POOL_URL=http://localhost:9222 POOL_DRIVER=cdp npx e2e-runner run --all
+```
-`init` created `e2e.config.js`. Set your app's URL there:
+Or bake it into `e2e.config.js` so you never repeat it:
 ```js
 export default {
-  baseUrl: 'http://host.docker.internal:3000', // ← change 3000 to your app's port
+  baseUrl: 'http://localhost:3000',     // your app — plain localhost, no docker hostname
+  poolUrls: ['http://localhost:9222'],
+  poolDriver: 'cdp',
 };
 ```
+Nothing to install beyond npm, and `baseUrl` is just `localhost` (the browser is on your machine).
+### Option 2 · Obscura — one tiny binary, no Docker
+A single ~30 MB binary with built-in anti-detection. Install once, run it, point the runner at it:
+```bash
+obscura serve --port 9222 --stealth &
+CHROME_POOL_URL=http://localhost:9222 POOL_DRIVER=obscura npx e2e-runner run --all
+```
+`npx e2e-runner pool start` (with `poolDriver: 'obscura'` in your config) prints the exact install command for your OS.
+### Option 3 · Docker pool — parallel, for CI &amp; big suites
+A shared, queue-managed Chrome pool that runs many tests at once:
+```bash
+npx e2e-runner run --all     # the first run auto-starts the Docker pool for you
+```
+Requires Docker. Set `baseUrl: 'http://host.docker.internal:3000'` so the containerized Chrome can reach your app.
 <details>
-<summary><strong>Why <code>host.docker.internal</code> instead of <code>localhost</code>?</strong></summary>
+<summary><strong>Why <code>host.docker.internal</code> (Docker option only)?</strong></summary>
+<br/>
+With the Docker pool, Chrome runs inside a container, so `localhost` there means the container — not your machine. `host.docker.internal` bridges to your host. On Linux (Docker Engine, not Docker Desktop) add `--add-host=host.docker.internal:host-gateway`, or use your LAN IP. Options 1 &amp; 2 don't have this — the browser is local, so plain `localhost` just works.
-Chrome runs inside Docker, so `localhost` there points at the container, not your machine. `host.docker.internal` bridges to your host. On Linux (Docker Engine, not Docker Desktop) you may need to add `--add-host=host.docker.internal:host-gateway`, or just use your machine's LAN IP.
 </details>
 ### Write your first test
-Open `e2e/tests/sample.json` and describe a flow as a list of actions:
+Open `e2e/tests/sample.json` — a flow is an ordered list of actions:
 ```json
 [
@@ -113,18 +159,12 @@ Open `e2e/tests/sample.json` and describe a flow as a list of actions:
 ]
 ```
-Then `npx e2e-runner run --all` again. Pass/fail, timing, screenshots, and network errors print to your terminal — and to the [web dashboard](#web-dashboard) if it's open.
-### Add Claude Code (optional)
-```bash
-claude plugin marketplace add fastslack/mtw-e2e-runner
-claude plugin install e2e-runner@matware
-```
+Run it with `npx e2e-runner run --all`. Results — pass/fail, timing, screenshots, network errors — print to your terminal and to the [web dashboard](#dashboard) if it's open.
-This gives Claude 17 MCP tools, slash commands, and specialized agents. Just say *"Run all E2E tests"* or *"Create a test for the login flow"*.
+<details>
+<summary><strong>Add OpenCode</strong> (optional)</summary>
-### Add OpenCode (optional)
+<br/>
 ```bash
 cp node_modules/@matware/e2e-runner/opencode.json ./
@@ -133,17 +173,32 @@ mkdir -p .opencode && cp -r node_modules/@matware/e2e-runner/.opencode/* .openco
 See [OPENCODE.md](OPENCODE.md) for details.
-### What's next?
+</details>
+### Updating
+Each install method updates separately — bump the one(s) you use:
+```bash
+# npm dependency (per project)
+npm install --save-dev @matware/e2e-runner@latest
+# Claude Code plugin
+claude plugin update e2e-runner@matware
+# MCP-only install (npx caches the package — pin @latest to force a refresh)
+claude mcp add --transport stdio --scope user e2e-runner \
+  -- npx -y -p @matware/e2e-runner@latest e2e-runner-mcp
+```
-- [Test Format](#test-format) — learn the full action vocabulary
-- [Claude Code Integration](#claude-code-integration) — set up AI-powered testing
-- [Visual Verification](#visual-verification) — describe expected pages in plain English
-- [Issue-to-Test](#issue-to-test) — turn bug reports into executable tests
-- [Web Dashboard](#web-dashboard) — monitor tests in real time
+> [!NOTE]
+> Two gotchas: **(1)** `npx` prefers a copy found in the project's `node_modules` over its own cache — if a project pins an old version, the MCP server and dashboard run that old version, so update the project dependency too. **(2)** Already-running processes keep the old code in memory: after updating, restart the dashboard and reconnect the MCP server (`/mcp` → `e2e-runner` → Reconnect, or restart your session).
 ---
-## What you get
+<a name="features"></a>
+## ✨ What you get
 🧪 **Zero-code tests** — JSON files that anyone on your team can read and write. No JavaScript, no compilation, no framework lock-in.
@@ -173,7 +228,16 @@ See [OPENCODE.md](OPENCODE.md) for details.
 ---
-## Test Format
+<a name="writing-tests"></a>
+## ✍️ Writing tests
+Everything about authoring tests — the file format, the full action vocabulary, retries, state isolation, and reuse. Expand what you need:
+<details>
+<summary><strong>Test format &amp; file layout</strong></summary>
+<br/>
 Each `.json` file in `e2e/tests/` contains an array of tests. Each test has a `name` and sequential `actions`:
@@ -193,7 +257,12 @@ Each `.json` file in `e2e/tests/` contains an array of tests. Each test has a `n
 Suite files can have numeric prefixes for ordering (`01-auth.json`, `02-dashboard.json`). The `--suite` flag matches with or without the prefix, so `--suite auth` finds `01-auth.json`.
-### Available Actions
+</details>
+<details>
+<summary><strong>Action catalog</strong> — navigation, input &amp; interaction</summary>
+<br/>
 | Action | Fields | Description |
 |--------|--------|-------------|
@@ -210,12 +279,33 @@ Suite files can have numeric prefixes for ordering (`01-auth.json`, `02-dashboar
 | `evaluate` | `value` | Execute JavaScript in the browser context |
 | `navigate` | `value` | Browser navigation (`back`, `forward`, `reload`) |
 | `clear_cookies` | — | Clear all cookies for the current page |
+| `wait_network_idle` | optional `value` (idle ms, default 500), `timeout` | Wait until the network has been idle for `value` ms — useful after actions that trigger background requests |
+| `set_storage` | `value` (`"key=val"`), optional `selector: "session"` | Set a `localStorage` key (or `sessionStorage` with `selector: "session"`) |
+| `gql` | `value` (query), optional `text` (variables JSON), optional `selector` (assertion) | Run a GraphQL query/mutation via in-page `fetch`, with the auth token read from `localStorage`. Fails on GraphQL errors. `selector` is a JS expression asserted against the response `r` (e.g. `"r.data.users.length > 0"`). Installs `window.__e2eGql` for later `evaluate` steps |
+**Click by text** — when `click` uses `text` instead of `selector`, it searches across common interactive and content elements:
+```
+button, a, [role="button"], [role="tab"], [role="menuitem"], [role="option"],
+[role="listitem"], div[class*="cursor"], span, li, td, th, label, p, h1-h6
+```
+```json
+{ "type": "click", "text": "Sign In" }
+```
+</details>
+<details>
+<summary><strong>Assertions</strong> — verify text, elements, URLs, counts &amp; network</summary>
-### Assertions
+<br/>
 | Action | Fields | Description |
 |--------|--------|-------------|
 | `assert_text` | `text` | Assert text exists anywhere on the page (substring) |
+| `assert_no_text` | `text` | Assert text does NOT appear anywhere on the page — opposite of `assert_text` |
+| `assert_text_in` | `selector`, `text`, optional `value: "exact"` | Assert text inside a scoped container. `text` is a case-insensitive regex by default; `value: "exact"` switches to case-sensitive substring |
 | `assert_element_text` | `selector`, `text`, optional `value: "exact"` | Assert element's text contains (or exactly matches) the expected text |
 | `assert_url` | `value` | Assert current URL path or full URL. Paths (`/dashboard`) compare against pathname only |
 | `assert_visible` | `selector` | Assert element exists and is visible |
@@ -226,22 +316,16 @@ Suite files can have numeric prefixes for ordering (`01-auth.json`, `02-dashboar
 | `assert_matches` | `selector`, `value` (regex) | Assert element text matches a regex pattern |
 | `assert_count` | `selector`, `value` | Assert element count: exact (`"5"`), or operators (`">3"`, `">=1"`, `"<10"`) |
 | `assert_no_network_errors` | — | Fail if any network requests failed (e.g. `ERR_CONNECTION_REFUSED`) |
+| `assert_storage` | `value` (`"key"` or `"key=expected"`), optional `selector: "session"` | Assert a `localStorage`/`sessionStorage` key exists or has a specific value |
+| `assert_visual` | `value` (golden image), optional `selector`, `text` (max diff, e.g. `"0.02"`), `fullPage`, `maskRegions`, `threshold` | Visual regression: compare a screenshot against a golden reference. The first run saves the golden; later runs fail if more pixels differ than the threshold (default 2%) and write a diff image |
 | `get_text` | `selector` | Extract element text (non-assertion, never fails). Result: `{ value: "..." }` |
-### Click by Text
-When `click` uses `text` instead of `selector`, it searches across common interactive and content elements:
-```
-button, a, [role="button"], [role="tab"], [role="menuitem"], [role="option"],
-[role="listitem"], div[class*="cursor"], span, li, td, th, label, p, h1-h6
-```
+</details>
-```json
-{ "type": "click", "text": "Sign In" }
-```
+<details>
+<summary><strong>Framework-aware actions</strong> — React/MUI without <code>evaluate</code> boilerplate</summary>
-### Framework-Aware Actions
+<br/>
 These actions handle common patterns in React/MUI apps that normally require verbose `evaluate` boilerplate:
@@ -253,6 +337,9 @@ These actions handle common patterns in React/MUI apps that normally require ver
 | `select_combobox` | `text`, optional `selector`, `filter`, `openWait`/`filterWait`/`waitAfter` | Open a MUI Autocomplete/Select, optionally type `filter`, then click the option matching `text`. Falls back across `[role="option"]`, `.MuiAutocomplete-option`, `li.MuiMenuItem-root`. |
 | `focus_autocomplete` | `text` (label text) | Focus an autocomplete input by its label text. Supports MUI and generic `[role="combobox"]`. |
 | `click_chip` | `text` | Click a chip/tag element by text. Searches `[class*="Chip"]`, `[class*="chip"]`, `[data-chip]`. |
+| `click_icon` | `value` (icon id), optional `selector` (scope) | Click an icon by `data-testid`/`data-icon`/`aria-label`/class fragment or SVG `<title>` — MUI, FontAwesome, Heroicons, etc. Clicks the nearest clickable ancestor (button, link, tab). |
+| `click_menu_item` | `text`, optional `selector` (scope) | Click a menu item by text across `[role="menuitem"]`, `.dropdown-item`, `.menu-item`, MUI `MenuItem`. |
+| `click_in_context` | `text` (container text), `selector` (child) | Click a child element inside the smallest container matching `text` — e.g. the delete button of one specific card/row. |
 ```json
 // Before: 5 lines of evaluate boilerplate
@@ -262,13 +349,38 @@ These actions handle common patterns in React/MUI apps that normally require ver
 { "type": "type_react", "selector": "#search", "value": "term" }
 ```
----
+</details>
+<details>
+<summary><strong>Multi-tab actions</strong> — popups, OAuth windows &amp; cross-tab flows</summary>
+<br/>
+| Action | Fields | Description |
+|--------|--------|-------------|
+| `open_tab` | `value` (URL), optional `text` (label) | Open a new tab and navigate to the URL (relative to `baseUrl` or absolute). Label defaults to `tab-<n>` |
+| `switch_tab` | `value` | Switch the active tab by label, numeric index, or title/URL match (regex or substring). `"default"` returns to the original tab |
+| `wait_for_tab` | optional `text` (label), `timeout` | Wait for a new tab/popup opened by the app (`window.open`, `target="_blank"`) and make it the active tab |
+| `assert_tab_count` | `value` | Assert the number of open tabs: exact (`"2"`) or operators (`">=2"`) |
+| `close_tab` | optional `value` (label) | Close the current (or named) tab and switch back to the last remaining one |
+All subsequent actions run in the active tab:
+```json
+{ "type": "click", "text": "Open report" }
+{ "type": "wait_for_tab", "text": "report" }
+{ "type": "assert_text", "text": "Quarterly results" }
+{ "type": "close_tab" }
+```
+</details>
-## Retries
+<details>
+<summary><strong>Retries &amp; flaky detection</strong></summary>
-### Test-Level Retry
+<br/>
-Retry an entire test on failure. Set globally via config or per-test:
+**Test-level retry** — retry an entire test on failure. Set globally via config or per-test:
 ```json
 { "name": "flaky-test", "retries": 3, "timeout": 15000, "actions": [...] }
@@ -276,9 +388,7 @@ Retry an entire test on failure. Set globally via config or per-test:
 Tests that pass after retry are flagged as **flaky** in the report and learning system.
-### Action-Level Retry
-Retry a single action without rerunning the entire test. Useful for timing-sensitive clicks and waits:
+**Action-level retry** — retry a single action without rerunning the entire test. Useful for timing-sensitive clicks and waits:
 ```json
 { "type": "click", "selector": "#dynamic-btn", "retries": 3 }
@@ -287,9 +397,12 @@ Retry a single action without rerunning the entire test. Useful for timing-sensi
 Set globally: `actionRetries` in config, `--action-retries <n>` CLI, or `ACTION_RETRIES` env var. Delay between retries: `actionRetryDelay` (default 500ms).
----
+</details>
-## Serial Tests
+<details>
+<summary><strong>Serial tests</strong> — for tests that share state</summary>
+<br/>
 Tests that share state (e.g., two tests modifying the same record) can race when running in parallel. Mark them as serial:
@@ -300,9 +413,12 @@ Tests that share state (e.g., two tests modifying the same record) can race when
 Serial tests run one at a time **after** all parallel tests finish — preventing interference without slowing down independent tests.
----
+</details>
+<details>
+<summary><strong>Testing authenticated apps</strong></summary>
-## Testing Authenticated Apps
+<br/>
 The simplest approach — log in via the UI like a real user:
@@ -341,9 +457,12 @@ Each test runs in a **fresh browser context**, so auth state is automatically cl
 > **More strategies:** Cookie-based auth, HTTP header injection, OAuth/SSO bypasses, reusable auth modules, and role-based testing — see [docs/authentication.md](docs/authentication.md)
----
+</details>
+<details>
+<summary><strong>Reusable modules</strong> — extract common flows with <code>$use</code></summary>
-## Reusable Modules
+<br/>
 Extract common flows into parameterized modules:
@@ -380,9 +499,35 @@ Use in tests:
 Modules support parameter validation (required params fail fast), conditional blocks (`{{#param}}...{{/param}}`), nested composition, and cycle detection.
----
+</details>
+<details>
+<summary><strong>Hooks</strong> — beforeAll / beforeEach / afterEach / afterAll</summary>
+<br/>
+Run actions at lifecycle points. Define globally in config or per-suite:
+```json
+{
+  "hooks": {
+    "beforeAll": [{ "type": "goto", "value": "/setup" }],
+    "beforeEach": [{ "type": "goto", "value": "/" }],
+    "afterEach": [{ "type": "screenshot", "value": "after.png" }],
+    "afterAll": []
+  },
+  "tests": [...]
+}
+```
+> **Important:** `beforeAll` runs on a separate browser page that is closed before tests start. Use `beforeEach` for state that tests need (cookies, localStorage, auth tokens).
-## Exclude Patterns
+</details>
+<details>
+<summary><strong>Exclude patterns</strong> — skip drafts from <code>--all</code></summary>
+<br/>
 Skip exploratory or draft tests from `--all` runs:
@@ -395,9 +540,84 @@ export default {
 Individual suite runs (`--suite`) are not affected by exclude patterns.
+</details>
 ---
-## Visual Verification
+<a name="ai"></a>
+## 🤖 AI integration
+The whole point: your agent writes, runs, and verifies tests for you.
+<details>
+<summary><strong>Claude Code</strong> — plugin install &amp; MCP-only install</summary>
+<br/>
+```bash
+claude plugin marketplace add fastslack/mtw-e2e-runner
+claude plugin install e2e-runner@matware
+```
+This gives Claude 17 MCP tools, a workflow skill, 4 slash commands (`/e2e-runner:run`, `/e2e-runner:create-test`, `/e2e-runner:verify-issue`, `/e2e-runner:capture`), and 3 specialized agents (test-analyzer, test-creator, test-improver).
+**MCP-only install** (tools only, no skill/commands/agents):
+```bash
+claude mcp add --transport stdio --scope user e2e-runner \
+  -- npx -y -p @matware/e2e-runner e2e-runner-mcp
+```
+</details>
+<details>
+<summary><strong>OpenCode</strong></summary>
+<br/>
+```bash
+cp node_modules/@matware/e2e-runner/opencode.json ./
+mkdir -p .opencode && cp -r node_modules/@matware/e2e-runner/.opencode/* .opencode/
+```
+See [OPENCODE.md](OPENCODE.md) for details.
+</details>
+<details>
+<summary><strong>The 17 MCP tools</strong></summary>
+<br/>
+| Tool | Description |
+|------|-------------|
+| `e2e_run` | Run tests (all, by suite, or by file) |
+| `e2e_list` | List available test suites |
+| `e2e_create_test` | Create a new test JSON file |
+| `e2e_create_module` | Create a reusable module |
+| `e2e_pool_status` | Check Chrome pool health |
+| `e2e_app_pool_status` | Inspect the app environment pool (forks, ports, drivers) |
+| `e2e_screenshot` | Retrieve a screenshot by hash |
+| `e2e_capture` | Capture screenshot of any URL |
+| `e2e_analyze` | Extract page structure (interactive elements, forms, headings) and emit test scaffolds |
+| `e2e_dashboard_start` | Start web dashboard |
+| `e2e_dashboard_stop` | Stop web dashboard |
+| `e2e_dashboard_restart` | Restart the dashboard (new project dir/port, clear stale sessions) |
+| `e2e_issue` | Fetch issue and generate tests |
+| `e2e_network_logs` | Query network logs for a run |
+| `e2e_learnings` | Query stability insights |
+| `e2e_vars` | Manage SQLite-backed `{{var.KEY}}` project variables |
+| `e2e_neo4j` | Manage Neo4j knowledge graph |
+> Pool start/stop are CLI-only — not exposed via MCP.
+</details>
+<details>
+<summary><strong>Visual verification</strong> — describe the page, AI judges it</summary>
+<br/>
 Describe what the page should look like — AI judges pass/fail from screenshots:
@@ -414,9 +634,12 @@ Describe what the page should look like — AI judges pass/fail from screenshots
 After test actions complete, the runner auto-captures a verification screenshot. The MCP response includes the screenshot hash — Claude Code retrieves it and visually verifies against your `expect` description. No API key required.
----
+</details>
+<details>
+<summary><strong>Issue-to-test</strong> — turn a bug report into a runnable test</summary>
-## Issue-to-Test
+<br/>
 Turn GitHub and GitLab issues into executable E2E tests. Paste an issue URL and get runnable tests — automatically.
@@ -445,13 +668,68 @@ In Claude Code, just ask:
 **Auth:** GitHub requires `gh` CLI, GitLab requires `glab` CLI. Self-hosted GitLab is supported.
+</details>
 ---
-## Learning System
+<a name="dashboard"></a>
+## 📊 Dashboard &amp; insights
+```bash
+e2e-runner dashboard                  # Start on default port 8484
+e2e-runner dashboard --port 9090      # Custom port
+```
-The runner learns from every test run — building knowledge about your test suite over time.
+<details>
+<summary><strong>Web dashboard tour</strong> — live view, history, gallery, pool status</summary>
+<br/>
-Query insights via the `e2e_learnings` MCP tool:
+**Live execution** — monitor tests in real-time with step-by-step progress, durations, and active worker count.
+<p align="center">
+  <img src="https://raw.githubusercontent.com/fastslack/mtw-e2e-runner/main/docs/screenshots/blog-dashboard-live-running.png" alt="Dashboard - Live test execution" width="800" />
+</p>
+**Test suites** — browse all suites across projects. Run a single suite or all tests with one click.
+<p align="center">
+  <img src="https://raw.githubusercontent.com/fastslack/mtw-e2e-runner/main/docs/screenshots/blog-dashboard-suites.png" alt="Dashboard - Test suites grid" width="800" />
+</p>
+**Run history** — track pass-rate trends with the built-in chart. Click any row to expand full detail.
+<p align="center">
+  <img src="https://raw.githubusercontent.com/fastslack/mtw-e2e-runner/main/docs/screenshots/blog-dashboard-runs.png" alt="Dashboard - Run history" width="800" />
+</p>
+**Run detail** — PASS/FAIL badges, screenshot thumbnails with copyable hashes (`ss:77c28b5a`), formatted console errors, and network request logs.
+<p align="center">
+  <img src="https://raw.githubusercontent.com/fastslack/mtw-e2e-runner/main/docs/screenshots/blog-dashboard-run-detail.png" alt="Dashboard - Run detail" width="800" />
+</p>
+**Screenshot gallery** — browse all captured screenshots with hash search (action, error, and verification captures).
+<p align="center">
+  <img src="https://raw.githubusercontent.com/fastslack/mtw-e2e-runner/main/docs/screenshots/blog-dashboard-screenshots-gallery.png" alt="Dashboard - Screenshot gallery" width="800" />
+</p>
+**Pool status** — Chrome pool health: available slots, running sessions, memory pressure.
+<p align="center">
+  <img src="https://raw.githubusercontent.com/fastslack/mtw-e2e-runner/main/docs/screenshots/blog-dashboard-pool-status.png" alt="Dashboard - Pool status" width="800" />
+</p>
+</details>
+<details>
+<summary><strong>Learning system</strong> — flaky tests, unstable selectors, slow APIs</summary>
+<br/>
+The runner learns from every test run — building knowledge about your test suite over time. Query insights via the `e2e_learnings` MCP tool:
 | Query | Returns |
 |-------|---------|
@@ -466,75 +744,75 @@ Query insights via the `e2e_learnings` MCP tool:
 | `page:<path>` | Drill-down history for a specific page |
 | `selector:<value>` | Drill-down history for a specific selector |
-**Storage & export:**
+**Storage &amp; export:**
 - SQLite (`~/.e2e-runner/dashboard.db`) — default, zero setup
 - Neo4j knowledge graph — optional, for relationship-based analysis. Manage via `e2e_neo4j` MCP tool or `docker compose`
 - Markdown report (`e2e/learnings.md`) — auto-generated after each run
 **Test narration:** Each test run generates a human-readable narrative of what happened step by step, visible in the CLI output and the dashboard.
----
-## Web Dashboard
-Real-time UI for running tests, viewing results, screenshots, and network logs.
+</details>
-```bash
-e2e-runner dashboard                  # Start on default port 8484
-e2e-runner dashboard --port 9090      # Custom port
-```
+<details>
+<summary><strong>Network error handling</strong> — assertions, global flag, full logging</summary>
-### Live Execution
+<br/>
-Monitor tests in real-time with step-by-step progress, durations, and active worker count.
+**Explicit assertion** — place `assert_no_network_errors` after critical page loads:
-<p align="center">
-  <img src="https://raw.githubusercontent.com/fastslack/mtw-e2e-runner/main/docs/screenshots/blog-dashboard-live-running.png" alt="Dashboard - Live test execution" width="800" />
-</p>
+```json
+{ "type": "goto", "value": "/dashboard" },
+{ "type": "wait", "selector": ".loaded" },
+{ "type": "assert_no_network_errors" }
+```
-### Test Suites
+**Global flag** — set `failOnNetworkError: true` to automatically fail any test with network errors:
-Browse all test suites across multiple projects. Run a single suite or all tests with one click.
+```bash
+e2e-runner run --all --fail-on-network-error
+```
-<p align="center">
-  <img src="https://raw.githubusercontent.com/fastslack/mtw-e2e-runner/main/docs/screenshots/blog-dashboard-suites.png" alt="Dashboard - Test suites grid" width="800" />
-</p>
+When disabled (default), the runner still collects and reports network errors — the MCP response includes a warning when tests pass but have network errors.
-### Run History
+**Full network logging** — all XHR/fetch requests are captured with URL, method, status, duration, request/response headers, and response body (truncated at 50KB). Viewable in the dashboard with expandable request detail rows.
-Track pass rate trends with the built-in chart. Click any row to expand full detail with per-test results, screenshot hashes, and errors.
+MCP drill-down flow:
-<p align="center">
-  <img src="https://raw.githubusercontent.com/fastslack/mtw-e2e-runner/main/docs/screenshots/blog-dashboard-runs.png" alt="Dashboard - Run history" width="800" />
-</p>
+```
+1. e2e_run          → compact networkSummary + runDbId
+2. e2e_network_logs(runDbId)                     → all requests (url, method, status, duration)
+3. e2e_network_logs(runDbId, errorsOnly: true)   → only failed requests
+4. e2e_network_logs(runDbId, includeHeaders: true) → with headers
+5. e2e_network_logs(runDbId, includeBodies: true)  → full request/response bodies
+```
-### Run Detail
+The `e2e_run` response stays compact (~5KB) regardless of how many requests were captured. Use `e2e_network_logs` with the returned `runDbId` to drill into details on demand.
-Expanded view with PASS/FAIL badges, screenshot thumbnails with copyable hashes (`ss:77c28b5a`), formatted console errors, and network request logs.
+</details>
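The drill-down flow in the added README text can be sketched as a sequence of `e2e_network_logs` argument objects. This is an illustration only, not the package's implementation: `drillDownSteps` is a hypothetical helper, and only the parameter names (`runDbId`, `errorsOnly`, `includeHeaders`, `includeBodies`) come from the README.

```javascript
// Hypothetical helper: builds the argument objects for steps 2-5 of the
// drill-down flow. Only the parameter names are taken from the README.
function drillDownSteps(runDbId) {
  return [
    { runDbId },                       // all requests (url, method, status, duration)
    { runDbId, errorsOnly: true },     // only failed requests
    { runDbId, includeHeaders: true }, // with headers
    { runDbId, includeBodies: true },  // full request/response bodies
  ];
}

// Example: the runDbId value here is made up; in practice it comes from e2e_run.
const steps = drillDownSteps(42);
console.log(steps.map((s) => JSON.stringify(s)).join('\n'));
```

Each step asks for more detail than the one before, which fits the README's point that the `e2e_run` response itself stays compact.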
-<p align="center">
-  <img src="https://raw.githubusercontent.com/fastslack/mtw-e2e-runner/main/docs/screenshots/blog-dashboard-run-detail.png" alt="Dashboard - Run detail" width="800" />
-</p>
+<details>
+<summary><strong>Screenshot capture</strong> — snapshot any URL on demand</summary>
-### Screenshot Gallery
+<br/>
-Browse all captured screenshots with hash search. Includes action screenshots, error screenshots, and verification captures.
+Capture screenshots of any URL on demand — no test suite required:
-<p align="center">
-  <img src="https://raw.githubusercontent.com/fastslack/mtw-e2e-runner/main/docs/screenshots/blog-dashboard-screenshots-gallery.png" alt="Dashboard - Screenshot gallery" width="800" />
-</p>
+```bash
+e2e-runner capture https://example.com
+e2e-runner capture https://example.com --full-page --selector ".loaded" --delay 2000
+```
-### Pool Status
+Via MCP, the `e2e_capture` tool supports `authToken` and `authStorageKey` for authenticated pages — it injects the token into localStorage before navigating.
-Monitor Chrome pool health: available slots, running sessions, memory pressure.
+Every screenshot gets a deterministic hash (`ss:a3f2b1c9`). Use `e2e_screenshot` to retrieve any screenshot by hash — it returns the image with metadata (test name, step, type).
-<p align="center">
-  <img src="https://raw.githubusercontent.com/fastslack/mtw-e2e-runner/main/docs/screenshots/blog-dashboard-pool-status.png" alt="Dashboard - Pool status" width="800" />
-</p>
+</details>
 ---
-## Browser Drivers
+<a name="drivers"></a>
+## 🌐 Browser drivers
 The runner can talk to multiple browser engines through different drivers. The default is **`auto`** — it probes each pool URL and picks the right driver per pool.
@@ -546,7 +824,10 @@ The runner can talk to multiple browser engines through different drivers. The d
 | `obscura` | [Obscura](https://github.com/h4ckf0r0day/obscura) (Rust + V8) | `/json/version` Browser=obscura | ~30 MB RAM footprint, built-in anti-detection (`--stealth`), stays close to real Chrome via Puppeteer |
 | `steel` | [Steel Browser](https://steel.dev) | `/v1/sessions` returns JSON | Managed session lifecycle, REST API for orchestration |
-### Pick a driver per test
+<details>
+<summary><strong>Pick a driver per test / force one per run</strong></summary>
+<br/>
 ```json
 {
@@ -568,16 +849,19 @@ The runner can talk to multiple browser engines through different drivers. The d
 `driver` is optional. If set, only pools whose detected driver matches become candidates. `fallbackDriver` is **explicit opt-in** — without it, a missing target driver fails the test with a clear message. Pool busyness does **not** trigger fallback; the runner waits inside the filtered set.
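The selection rule in the paragraph above can be sketched roughly as follows. This is a sketch under assumptions, not the runner's actual code: `candidatePools` and the `{ url, driver }` pool shape are hypothetical, and only the `driver` / `fallbackDriver` semantics come from the README.

```javascript
// Hypothetical sketch of the driver filter described above.
// `pools` is an assumed shape: [{ url, driver }], where `driver` is the detected driver.
function candidatePools(pools, { driver, fallbackDriver } = {}) {
  // No `driver` set: every pool is a candidate.
  if (!driver) return pools;

  // Only pools whose detected driver matches become candidates.
  // Busy pools stay in this set; busyness never triggers fallback.
  const matching = pools.filter((p) => p.driver === driver);
  if (matching.length > 0) return matching;

  // Target driver missing: fall back only when explicitly opted in.
  if (fallbackDriver) {
    const fallback = pools.filter((p) => p.driver === fallbackDriver);
    if (fallback.length > 0) return fallback;
  }
  throw new Error(`No pool with driver "${driver}" is available`);
}

const pools = [
  { url: 'http://localhost:3333', driver: 'cdp' },
  { url: 'http://localhost:9222', driver: 'obscura' },
];
console.log(candidatePools(pools, { driver: 'obscura' }));
```

The filter runs before any scheduling, so waiting for a free slot happens inside the filtered set rather than across drivers.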
-### Force a driver for a whole run
+Force a driver for a whole run (CLI overrides win over per-test fields — useful for A/B benchmarks):
 ```bash
 e2e-runner run --all --driver obscura
 e2e-runner run --all --driver obscura --fallback-driver cdp
 ```
-CLI overrides win over per-test fields — useful for A/B benchmarks against the same suite.
+</details>
+<details>
+<summary><strong>Running each driver locally</strong></summary>
-### Running each driver locally
+<br/>
 ```bash
 # browserless (default) — managed by `pool start`
@@ -593,137 +877,18 @@ tar xzf obscura-x86_64-linux.tar.gz
 # then point the runner at it: poolUrls: ['http://localhost:9222'], poolDriver: 'obscura'
 ```
----
-## Screenshot Capture
-Capture screenshots of any URL on demand — no test suite required:
-```bash
-e2e-runner capture https://example.com
-e2e-runner capture https://example.com --full-page --selector ".loaded" --delay 2000
-```
-Via MCP, the `e2e_capture` tool supports `authToken` and `authStorageKey` for authenticated pages — it injects the token into localStorage before navigating.
-Every screenshot gets a deterministic hash (`ss:a3f2b1c9`). Use `e2e_screenshot` to retrieve any screenshot by hash — it returns the image with metadata (test name, step, type).
----
-## AI Integration
-### Claude Code
-```bash
-claude plugin marketplace add fastslack/mtw-e2e-runner
-claude plugin install e2e-runner@matware
-```
-This gives Claude 17 MCP tools, a workflow skill, 4 slash commands (`/e2e-runner:run`, `/e2e-runner:create-test`, `/e2e-runner:verify-issue`, `/e2e-runner:capture`), and 3 specialized agents (test-analyzer, test-creator, test-improver).
-**MCP-only install** (tools only, no skill/commands/agents):
-```bash
-claude mcp add --transport stdio --scope user e2e-runner \
-  -- npx -y -p @matware/e2e-runner e2e-runner-mcp
-```
-### OpenCode
-```bash
-cp node_modules/@matware/e2e-runner/opencode.json ./
-mkdir -p .opencode && cp -r node_modules/@matware/e2e-runner/.opencode/* .opencode/
-```
-See [OPENCODE.md](OPENCODE.md) for details.
-### MCP Tools
-| Tool | Description |
-|------|-------------|
-| `e2e_run` | Run tests (all, by suite, or by file) |
-| `e2e_list` | List available test suites |
-| `e2e_create_test` | Create a new test JSON file |
-| `e2e_create_module` | Create a reusable module |
-| `e2e_pool_status` | Check Chrome pool health |
-| `e2e_app_pool_status` | Inspect the app environment pool (forks, ports, drivers) |
-| `e2e_screenshot` | Retrieve a screenshot by hash |
-| `e2e_capture` | Capture screenshot of any URL |
-| `e2e_analyze` | Extract page structure (interactive elements, forms, headings) and emit test scaffolds |
-| `e2e_dashboard_start` | Start web dashboard |
-| `e2e_dashboard_stop` | Stop web dashboard |
-| `e2e_dashboard_restart` | Restart the dashboard (new project dir/port, clear stale sessions) |
-| `e2e_issue` | Fetch issue and generate tests |
-| `e2e_network_logs` | Query network logs for a run |
-| `e2e_learnings` | Query stability insights |
-| `e2e_vars` | Manage SQLite-backed `{{var.KEY}}` project variables |
-| `e2e_neo4j` | Manage Neo4j knowledge graph |
-> Pool start/stop are CLI-only — not exposed via MCP.
----
-## Network Error Handling
-### Explicit Assertion
-Place `assert_no_network_errors` after critical page loads:
-```json
-{ "type": "goto", "value": "/dashboard" },
-{ "type": "wait", "selector": ".loaded" },
-{ "type": "assert_no_network_errors" }
-```
-### Global Flag
-Set `failOnNetworkError: true` to automatically fail any test with network errors:
-```bash
-e2e-runner run --all --fail-on-network-error
-```
-When disabled (default), the runner still collects and reports network errors — the MCP response includes a warning when tests pass but have network errors.
-### Full Network Logging
-All XHR/fetch requests are captured with: URL, method, status, duration, request/response headers, and response body (truncated at 50KB). Viewable in the dashboard with expandable request detail rows.
-**MCP drill-down flow:**
-```
-1. e2e_run          → compact networkSummary + runDbId
-2. e2e_network_logs(runDbId)                     → all requests (url, method, status, duration)
-3. e2e_network_logs(runDbId, errorsOnly: true)   → only failed requests
-4. e2e_network_logs(runDbId, includeHeaders: true) → with headers
-5. e2e_network_logs(runDbId, includeBodies: true)  → full request/response bodies
-```
-The `e2e_run` response stays compact (~5KB) regardless of how many requests were captured. Use `e2e_network_logs` with the returned `runDbId` to drill into details on demand.
+</details>
 ---
-## Hooks
-Run actions at lifecycle points. Define globally in config or per-suite:
-```json
-{
-  "hooks": {
-    "beforeAll": [{ "type": "goto", "value": "/setup" }],
-    "beforeEach": [{ "type": "goto", "value": "/" }],
-    "afterEach": [{ "type": "screenshot", "value": "after.png" }],
-    "afterAll": []
-  },
-  "tests": [...]
-}
-```
+<a name="reference"></a>
-> **Important:** `beforeAll` runs on a separate browser page that is closed before tests start. Use `beforeEach` for state that tests need (cookies, localStorage, auth tokens).
+## ⚙️ CLI, config &amp; CI
----
+<details>
+<summary><strong>CLI commands</strong></summary>
-## CLI
+<br/>
 ```bash
 # Run tests
@@ -751,7 +916,12 @@ e2e-runner capture <url>              # On-demand screenshot
 e2e-runner init                       # Scaffold project
 ```
-### CLI Options
+</details>
+<details>
+<summary><strong>CLI options</strong></summary>
+<br/>
 | Flag | Default | Description |
 |------|---------|-------------|
@@ -769,9 +939,12 @@ e2e-runner init                       # Scaffold project
 | `--driver <name>` | _(per-test)_ | Force pool driver for the run: `browserless`, `cdp`, `lightpanda`, `obscura`, `steel` |
 | `--fallback-driver <name>` | _none_ | Explicit fallback if no pool with `--driver` is reachable |
----
+</details>
+<details>
+<summary><strong>Configuration</strong> — <code>e2e.config.js</code> &amp; priority</summary>
-## Configuration
+<br/>
 Create `e2e.config.js` in your project root:
@@ -797,7 +970,7 @@ export default {
 };
 ```
-### Config Priority (highest wins)
+**Config priority (highest wins):**
 1. CLI flags
 2. Environment variables
@@ -806,18 +979,17 @@ export default {
 When `--env <name>` is set, the matching profile overrides everything.
----
+</details>
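The priority rule can be illustrated with a small merge function. This is a hypothetical sketch, not the package's loader: the visible part of the list names only CLI flags and environment variables, so `lowerLayers` stands in for whatever sources rank below them, and the option names in the example (`concurrency`, `retries`, `baseUrl`) are made up.

```javascript
// Hypothetical sketch: merge config layers so that higher-priority layers win.
// `lowerLayers` is ordered highest priority first and represents the sources
// that rank below environment variables (not spelled out here).
function resolveConfig({ cli = {}, env = {}, lowerLayers = [], profile = null } = {}) {
  // Object.assign applies sources left to right and later sources override
  // earlier ones, so the lowest-priority layer goes first and CLI flags last.
  const merged = Object.assign({}, ...[...lowerLayers].reverse(), env, cli);
  // When --env <name> is set, the matching profile overrides everything.
  return profile ? { ...merged, ...profile } : merged;
}

const resolved = resolveConfig({
  cli: { concurrency: 4 },
  env: { concurrency: 2, baseUrl: 'http://localhost:3000' },
  lowerLayers: [{ concurrency: 1, retries: 0 }],
});
console.log(resolved);
```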
-## CI/CD
+<details>
+<summary><strong>CI/CD</strong> — JUnit XML &amp; GitHub Actions</summary>
-### JUnit XML
+<br/>
 ```bash
 e2e-runner run --all --output junit
 ```
-### GitHub Actions
 ```yaml
 jobs:
   e2e:
@@ -836,9 +1008,12 @@ jobs:
           report_paths: e2e/screenshots/junit.xml
 ```
----
+</details>
-## Programmatic API
+<details>
+<summary><strong>Programmatic API</strong></summary>
+<br/>
 ```js
 import { createRunner } from '@matware/e2e-runner';
@@ -853,15 +1028,17 @@ const report = await runner.runTests([
 ]);
 ```
+</details>
 ---
 ## Requirements
 - **Node.js** >= 20
-- **Docker** (for the Chrome pool)
+- **Docker** — only for [Option 3](#install) (the parallel Chrome pool). Options 1 &amp; 2 don't need it.
 ## License
-Copyright 2025 Matias Aguirre (fastslack)
+Copyright 2026 Matias Aguirre (fastslack) — Matware
 Licensed under the Apache License, Version 2.0. See [LICENSE](LICENSE) for details.