npm - @diegovelasquezweb/a11y-engine - Versions diffs - 0.1.2 → 0.1.4 - Mend

@diegovelasquezweb/a11y-engine 0.1.2 → 0.1.4


Files changed (13)

package/CHANGELOG.md +84 -0
package/README.md +220 -7
package/assets/engine/cdp-checks.json +30 -0
package/assets/engine/pa11y-config.json +53 -0
package/docs/architecture.md +218 -0
package/docs/cli-handbook.md +237 -0
package/docs/outputs.md +303 -0
package/package.json +9 -2
package/scripts/audit.mjs +3 -0
package/scripts/core/asset-loader.mjs +4 -0
package/scripts/engine/analyzer.mjs +8 -1
package/scripts/engine/dom-scanner.mjs +366 -5
package/scripts/index.mjs +262 -0

package/CHANGELOG.md ADDED

@@ -0,0 +1,84 @@
+# Changelog
+All notable changes to this project will be documented in this file.
+The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
+and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+## [Unreleased]
+---
+## [0.1.3] — 2026-03-14
+### Added
+- **Multi-engine scanning**: three independent engines now run against each page:
+  - **axe-core** (via `@axe-core/playwright`) — primary WCAG rule engine injected into the live page
+  - **CDP** (Chrome DevTools Protocol) — queries the browser's accessibility tree for missing accessible names and aria-hidden on focusable elements
+  - **pa11y** (HTML CodeSniffer via Puppeteer) — catches heading hierarchy, link purpose, and form association issues
+- Cross-engine merge and deduplication in `mergeViolations()` — removes duplicate findings across axe, CDP, and pa11y based on rule equivalence and selector matching
+- Real-time `progress.json` with per-engine step tracking and finding counts (`found` for each engine, `merged` total after dedup)
+- `--axe-tags` CLI flag for filtering axe-core WCAG tag sets (also determines pa11y standard)
+- Non-visible element skip list for screenshots (`<meta>`, `<link>`, `<style>`, `<script>`, `<title>`, `<base>`) — prevents timeout warnings on elements that cannot be scrolled into view
+### Changed
+- `a11y-scan-results.json` now contains merged violations from all three engines (previously axe-core only)
+- Each violation includes a `source` field (`"cdp"` or `"pa11y"`) to identify which engine produced it (axe-core violations have no `source` field for backwards compatibility)
+- README rewritten to reflect multi-engine architecture
+- All documentation (`architecture.md`, `cli-handbook.md`, `outputs.md`) updated to describe the three-engine pipeline, merge/dedup logic, progress tracking, and dual browser requirements
+### Fixed
+- Screenshot capture no longer attempts to scroll non-visible `<head>` elements into view
+---
+## [0.1.2] — 2026-03-13
+### Fixed
+- `bin` field in `package.json` — removed leading `./` from the entry path (`scripts/audit.mjs`) to satisfy npm bin resolution
+- `repository.url` normalized to `git+https://` prefix as required by npm registry validation
+- Missing shebang (`#!/usr/bin/env node`) added to `scripts/audit.mjs` so the `a11y-audit` binary executes correctly when installed globally or via `npx`
+---
+## [0.1.1] — 2026-03-13
+### Added
+- Engine scripts published as a standalone npm package:
+  - `scripts/audit.mjs` — orchestrator for the full audit pipeline
+  - `scripts/core/utils.mjs` — shared logging, path utilities, and defaults
+  - `scripts/core/toolchain.mjs` — dependency and Playwright browser verification
+  - `scripts/core/asset-loader.mjs` — JSON asset loading with error boundaries
+  - `scripts/engine/dom-scanner.mjs` — Playwright + axe-core WCAG 2.2 AA scanner
+  - `scripts/engine/analyzer.mjs` — finding enrichment with fix intelligence
+  - `scripts/engine/source-scanner.mjs` — static source code pattern scanner
+  - `scripts/reports/builders/` — orchestrators for each report format
+  - `scripts/reports/renderers/` — rendering logic for HTML, PDF, Markdown, and checklist
+- Asset files bundled under `assets/`:
+  - `assets/reporting/compliance-config.json` — scoring weights, grade thresholds, and legal regulation mapping
+  - `assets/reporting/wcag-reference.json` — WCAG criterion map, persona config, and persona–rule mapping
+  - `assets/reporting/manual-checks.json` — 41 manual WCAG checks for the interactive checklist
+  - `assets/discovery/crawler-config.json` — BFS crawl configuration defaults
+  - `assets/discovery/stack-detection.json` — framework and CMS fingerprint signatures
+  - `assets/remediation/intelligence.json` — per-rule fix intelligence (106 axe-core rules)
+  - `assets/remediation/code-patterns.json` — source code pattern definitions
+  - `assets/remediation/guardrails.json` — agent fix guardrails and scope rules
+  - `assets/remediation/axe-check-maps.json` — axe check-to-rule mapping
+  - `assets/remediation/source-boundaries.json` — framework-specific file location patterns
+- `a11y-audit` binary registered in `bin` field — invocable via `npx a11y-audit` after install
+- `LICENSE` (MIT)
+---
+## [0.1.0] — 2026-03-13
+### Added
+- Initial package scaffold: `package.json` for `@diegovelasquezweb/a11y-engine` with correct `name`, `version`, `type: module`, `engines`, `files`, and `scripts` fields
+- `devDependencies`: `vitest` for test runner
+- `dependencies`: `playwright`, `@axe-core/playwright`, `axe-core`, `pa11y`

package/README.md CHANGED

@@ -1,20 +1,233 @@
 # @diegovelasquezweb/a11y-engine
-WCAG 2.2 AA accessibility audit engine. Scanner, analyzer, and report builders.
+Multi-engine WCAG 2.2 AA accessibility audit engine. Combines three scanning engines (axe-core, Chrome DevTools Protocol, and pa11y), merges and deduplicates their findings, enriches results with fix intelligence, and produces structured artifacts for developers, agents, and stakeholders.
-## Install
+## What it is
+A Node.js CLI and programmatic engine that:
+1. Crawls a target URL and discovers routes automatically
+2. Runs three independent accessibility engines against each page:
+   - **axe-core** — industry-standard WCAG rule engine, injected into the live page via Playwright
+   - **CDP** (Chrome DevTools Protocol) — queries the browser's accessibility tree directly for issues axe may miss (missing accessible names, aria-hidden on focusable elements)
+   - **pa11y** (HTML CodeSniffer) — catches WCAG violations around heading hierarchy, link purpose, and form associations
+3. Merges and deduplicates findings across all three engines
+4. Optionally scans project source code for patterns no runtime engine can detect
+5. Enriches each finding with stack-aware fix guidance, selectors, and verification commands
+6. Produces a full artifact set: JSON data, Markdown remediation guide, HTML dashboard, PDF compliance report, and manual testing checklist
+## Why use this engine
+| Capability | With this engine | Without |
+| :--- | :--- | :--- |
+| **Multi-engine scanning** | axe-core + CDP accessibility tree + pa11y (HTML CodeSniffer) with cross-engine deduplication | Single engine — higher false-negative rate |
+| **Full WCAG 2.2 Coverage** | Three runtime engines + source code pattern scanner | Runtime scan only — misses structural and source-level issues |
+| **Fix Intelligence** | Stack-aware patches with code snippets tailored to detected framework | Raw rule violations with no remediation context |
+| **Structured Artifacts** | JSON + Markdown + HTML + PDF + Checklist — ready to consume or forward | Findings exist only in the terminal session |
+| **CI/Agent Integration** | Deterministic exit codes, stdout-parseable output paths, JSON schema | Requires wrapper scripting |
+## How the scan pipeline works
+```
+URL
+ |
+ v
+[1. Crawl & Discover]  sitemap.xml / BFS link crawl / explicit --routes
+ |
+ v
+[2. Navigate]           Playwright opens each route in Chromium
+ |
+ +---> [axe-core]       Injects axe into the page, runs WCAG tag checks
+ |
+ +---> [CDP]            Opens a CDP session, reads the full accessibility tree
+ |
+ +---> [pa11y]          Launches HTML CodeSniffer via Puppeteer Chrome
+ |
+ v
+[3. Merge & Dedup]      Combines findings, removes cross-engine duplicates
+ |
+ v
+[4. Analyze]            Enriches with WCAG mapping, severity, fix code, framework hints
+ |
+ v
+[5. Reports]            HTML dashboard, PDF, checklist, Markdown remediation
+```
+## Installation
 ```bash
-npm i @diegovelasquezweb/a11y-engine
+npm install @diegovelasquezweb/a11y-engine
 npx playwright install chromium
+npx puppeteer browsers install chrome
+```
+```bash
+pnpm add @diegovelasquezweb/a11y-engine
+pnpm exec playwright install chromium
+npx puppeteer browsers install chrome
 ```
-## Usage
+> **Two browsers are required:**
+> - **Playwright Chromium** — used by axe-core and CDP checks
+> - **Puppeteer Chrome** — used by pa11y (HTML CodeSniffer)
+>
+> These are separate browser installations. If Puppeteer Chrome is missing, pa11y checks fail silently (non-fatal) and the scan continues with axe + CDP only.
+## Quick start
 ```bash
-npx a11y-audit --base-url https://example.com --max-routes 5
+# Minimal scan — produces remediation.md in .audit/
+npx a11y-audit --base-url https://example.com
+# Full audit with all reports
+npx a11y-audit --base-url https://example.com --with-reports --output ./audit/report.html
+# Scan with source code intelligence (for stack-aware fix guidance)
+npx a11y-audit --base-url http://localhost:3000 --project-dir . --with-reports --output ./audit/report.html
 ```
-## Options
+## CLI usage
+```
+a11y-audit --base-url <url> [options]
+```
+### Targeting & scope
+| Flag | Argument | Default | Description |
+| :--- | :--- | :--- | :--- |
+| `--base-url` | `<url>` | (Required) | Starting URL for the audit. |
+| `--max-routes` | `<num>` | `10` | Max routes to discover and scan. |
+| `--crawl-depth` | `<num>` | `2` | BFS link-follow depth during discovery (1-3). |
+| `--routes` | `<csv>` | — | Explicit path list, bypasses auto-discovery. |
+| `--project-dir` | `<path>` | — | Path to project source. Enables source pattern scanner and framework auto-detection. |
+### Audit intelligence
+| Flag | Argument | Default | Description |
+| :--- | :--- | :--- | :--- |
+| `--target` | `<text>` | `WCAG 2.2 AA` | Compliance target label in reports. |
+| `--only-rule` | `<id>` | — | Run a single axe rule (e.g. `color-contrast`). |
+| `--ignore-findings` | `<csv>` | — | Rule IDs to exclude from output. |
+| `--exclude-selectors` | `<csv>` | — | CSS selectors to skip during DOM scan. |
+| `--axe-tags` | `<csv>` | `wcag2a,wcag2aa,wcag21a,wcag21aa,wcag22a,wcag22aa` | axe-core WCAG tag filter. |
+| `--framework` | `<name>` | — | Override auto-detected stack. Supported: `nextjs`, `gatsby`, `react`, `nuxt`, `vue`, `angular`, `astro`, `svelte`, `shopify`, `wordpress`, `drupal`. |
+### Execution & emulation
+| Flag | Argument | Default | Description |
+| :--- | :--- | :--- | :--- |
+| `--color-scheme` | `light\|dark` | `light` | Emulate `prefers-color-scheme`. |
+| `--wait-until` | `domcontentloaded\|load\|networkidle` | `domcontentloaded` | Playwright page load strategy. Use `networkidle` for SPAs. |
+| `--viewport` | `<WxH>` | — | Viewport size (e.g. `375x812`, `1440x900`). |
+| `--wait-ms` | `<num>` | `2000` | Delay after page load before running axe (ms). |
+| `--timeout-ms` | `<num>` | `30000` | Network timeout per page (ms). |
+| `--headed` | — | `false` | Run browser in visible mode. |
+| `--affected-only` | — | `false` | Re-scan only routes with previous violations. Requires a prior scan in `.audit/`. |
+### Output generation
+| Flag | Argument | Default | Description |
+| :--- | :--- | :--- | :--- |
+| `--with-reports` | — | `false` | Generate HTML + PDF + Checklist reports. Requires `--output`. |
+| `--skip-reports` | — | `true` | Skip visual report generation (default). |
+| `--output` | `<path>` | — | Output path for `report.html` (PDF and checklist derive from it). |
+| `--skip-patterns` | — | `false` | Disable source code pattern scanner even when `--project-dir` is set. |
+## Common command patterns
+```bash
+# Focused audit — one rule, one route
+a11y-audit --base-url https://example.com --only-rule color-contrast --routes /checkout --max-routes 1
+# Dark mode audit
+a11y-audit --base-url https://example.com --color-scheme dark
+# SPA with deferred rendering
+a11y-audit --base-url https://example.com --wait-until networkidle --wait-ms 3000
+# Mobile viewport
+a11y-audit --base-url https://example.com --viewport 375x812
+# Fast re-audit after fixes (skips clean pages)
+a11y-audit --base-url https://example.com --affected-only
+# Ignore known false positives
+a11y-audit --base-url https://example.com --ignore-findings color-contrast,frame-title
+```
+## Output artifacts
+All artifacts are written to `.audit/` relative to the package root.
+| File | Always generated | Description |
+| :--- | :--- | :--- |
+| `a11y-scan-results.json` | Yes | Raw merged results from axe-core + CDP + pa11y per route |
+| `a11y-findings.json` | Yes | Enriched findings with fix intelligence, WCAG mapping, and severity |
+| `progress.json` | Yes | Real-time scan progress with per-engine step status and finding counts |
+| `remediation.md` | Yes | AI-agent-optimized remediation roadmap |
+| `report.html` | With `--with-reports` | Interactive HTML dashboard |
+| `report.pdf` | With `--with-reports` | Formal compliance PDF |
+| `checklist.html` | With `--with-reports` | Manual WCAG testing checklist |
+See [Output Artifacts](docs/outputs.md) for full schema reference.
+## Scan engines
+### axe-core (via @axe-core/playwright)
+The primary engine. Runs Deque's axe-core rule set against the live DOM inside Playwright's Chromium. Covers the majority of automatable WCAG 2.2 AA success criteria.
+### CDP (Chrome DevTools Protocol)
+Queries the browser's full accessibility tree via a CDP session. Catches issues axe may miss:
+- Interactive elements (buttons, links, inputs) with no accessible name
+- Focusable elements hidden with `aria-hidden`
+### pa11y (HTML CodeSniffer)
+Runs Squiz's HTML CodeSniffer via Puppeteer Chrome. Catches WCAG violations around:
+- Heading hierarchy
+- Link purpose
+- Form label associations
+Requires a separate Chrome installation (`npx puppeteer browsers install chrome`). If Chrome is missing, pa11y fails silently and the scan continues with axe + CDP.
+### Merge & deduplication
+After all three engines run, findings are merged and deduplicated:
+- axe findings are added first (baseline)
+- CDP findings are checked against axe equivalents (e.g. `cdp-missing-accessible-name` vs `button-name`) to avoid duplicates
+- pa11y findings are checked against existing selectors to avoid triple-reporting the same element
+## Troubleshooting
+**`Error: browserType.launch: Executable doesn't exist`**
+Run `npx playwright install chromium` (or `pnpm exec playwright install chromium`).
+**`pa11y checks failed (non-fatal): Could not find Chrome`**
+pa11y requires Puppeteer's Chrome, which is separate from Playwright's Chromium. Install it with `npx puppeteer browsers install chrome`.
+**`Missing required argument: --base-url`**
+The flag is required. Provide a full URL including protocol: `--base-url https://example.com`.
+**Scan returns 0 findings on an SPA**
+Use `--wait-until networkidle --wait-ms 3000` to let async content render before the engines run.
+**`--with-reports` exits without generating PDF**
+Ensure `--output` is also set and points to an `.html` file path: `--output ./audit/report.html`.
+**Chromium crashes in CI**
+Add `--no-sandbox` via the `PLAYWRIGHT_CHROMIUM_LAUNCH_OPTIONS` env var, or run Playwright with the `--with-deps` flag during browser installation.
+## Documentation
+| Resource | Description |
+| :--- | :--- |
+| [Architecture](https://github.com/diegovelasquezweb/a11y-engine/blob/main/docs/architecture.md) | How the multi-engine scanner pipeline works |
+| [CLI Handbook](https://github.com/diegovelasquezweb/a11y-engine/blob/main/docs/cli-handbook.md) | Full flag reference and usage patterns |
+| [Output Artifacts](https://github.com/diegovelasquezweb/a11y-engine/blob/main/docs/outputs.md) | Schema and structure of every generated file |
+## License
-See `a11y-audit --help` for full CLI reference.
+MIT

package/assets/engine/cdp-checks.json ADDED

@@ -0,0 +1,30 @@
+{
+  "interactiveRoles": [
+    "button", "link", "textbox", "combobox", "listbox",
+    "menuitem", "tab", "checkbox", "radio", "switch", "slider"
+  ],
+  "rules": [
+    {
+      "id": "cdp-missing-accessible-name",
+      "condition": "interactive-no-name",
+      "impact": "serious",
+      "tags": ["wcag2a", "wcag412", "cdp-check"],
+      "help": "Interactive elements must have an accessible name",
+      "helpUrl": "https://dequeuniversity.com/rules/axe/4.11/button-name",
+      "description": "Interactive element with role \"{{role}}\" has no accessible name",
+      "failureMessage": "Element with role \"{{role}}\" has no accessible name in the accessibility tree",
+      "axeEquivalents": ["button-name", "link-name", "input-name", "aria-command-name"]
+    },
+    {
+      "id": "cdp-aria-hidden-focusable",
+      "condition": "hidden-focusable",
+      "impact": "serious",
+      "tags": ["wcag2a", "wcag412", "cdp-check"],
+      "help": "aria-hidden elements must not be focusable",
+      "helpUrl": "https://dequeuniversity.com/rules/axe/4.11/aria-hidden-focus",
+      "description": "Focusable element with role \"{{role}}\" is aria-hidden",
+      "failureMessage": "Focusable element with role \"{{role}}\" is hidden from the accessibility tree",
+      "axeEquivalents": ["aria-hidden-focus"]
+    }
+  ]
+}
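
A minimal sketch of how rule definitions like the ones in `cdp-checks.json` could be evaluated. The diff shown here does not include the code that consumes this file, so the node shape (`role`, `name`, `focusable`, `hidden`), the `conditions` table, and `evaluateCdpRules` are all assumptions made for illustration. Only the rule IDs, `condition` strings, `interactiveRoles`, and the `{{role}}` placeholder come from the JSON above.

```javascript
// Trimmed copy of the config above; fields not needed for the sketch are omitted.
const cdpConfig = {
  interactiveRoles: ["button", "link", "textbox", "combobox", "listbox",
    "menuitem", "tab", "checkbox", "radio", "switch", "slider"],
  rules: [
    { id: "cdp-missing-accessible-name", condition: "interactive-no-name",
      impact: "serious",
      description: 'Interactive element with role "{{role}}" has no accessible name' },
    { id: "cdp-aria-hidden-focusable", condition: "hidden-focusable",
      impact: "serious",
      description: 'Focusable element with role "{{role}}" is aria-hidden' },
  ],
};

// Hypothetical: one predicate per "condition" string used in the JSON.
const conditions = {
  "interactive-no-name": (node, cfg) =>
    cfg.interactiveRoles.includes(node.role) && !(node.name || "").trim(),
  "hidden-focusable": (node) => Boolean(node.focusable && node.hidden),
};

// Checks every node against every rule and fills in the {{role}} placeholder.
function evaluateCdpRules(nodes, cfg) {
  const findings = [];
  for (const node of nodes) {
    for (const rule of cfg.rules) {
      const matches = conditions[rule.condition];
      if (matches && matches(node, cfg)) {
        findings.push({
          id: rule.id,
          impact: rule.impact,
          source: "cdp",
          description: rule.description.replaceAll("{{role}}", node.role),
        });
      }
    }
  }
  return findings;
}

// Simplified nodes (assumed shape, not raw CDP output).
const sampleNodes = [
  { role: "button", name: "", focusable: true, hidden: false },
  { role: "link", name: "Home", focusable: true, hidden: true },
  { role: "heading", name: "", focusable: false, hidden: false },
];
const cdpFindings = evaluateCdpRules(sampleNodes, cdpConfig);
console.log(cdpFindings.map((f) => f.id));
```

Keeping the predicates in a lookup table keyed by `condition` means a new rule in the JSON only needs a matching predicate, which is one plausible reason for the `condition` field.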

package/assets/engine/pa11y-config.json ADDED

@@ -0,0 +1,53 @@
+{
+  "ignoreByPrinciple": [
+    "Principle1.Guideline1_4.1_4_3.G18.Fail",
+    "Principle4.Guideline4_1.4_1_2.H91.A.NoContent"
+  ],
+  "impactMap": {
+    "1": "serious",
+    "2": "moderate",
+    "3": "minor"
+  },
+  "equivalenceMap": {
+    "Principle1.Guideline1_4.1_4_3.G145": "color-contrast",
+    "Principle1.Guideline1_4.1_4_3.G18": "color-contrast",
+    "Principle1.Guideline1_4.1_4_3.G145.Fail": "color-contrast",
+    "Principle1.Guideline1_4.1_4_3.G18.Fail": "color-contrast",
+    "Principle1.Guideline1_3.1_3_1.H42": "heading-order",
+    "Principle1.Guideline1_3.1_3_1.H42.2": "empty-heading",
+    "Principle1.Guideline1_3.1_3_1.H44": "label",
+    "Principle1.Guideline1_3.1_3_1.H65": "label",
+    "Principle1.Guideline1_3.1_3_1.H71": "label",
+    "Principle1.Guideline1_3.1_3_1.H85": "listitem",
+    "Principle1.Guideline1_3.1_3_1.H48": "list",
+    "Principle1.Guideline1_3.1_3_1.H39": "table-fake-caption",
+    "Principle1.Guideline1_3.1_3_1.H73": "table-fake-caption",
+    "Principle1.Guideline1_1.1_1_1.H37": "image-alt",
+    "Principle1.Guideline1_1.1_1_1.H67": "image-alt",
+    "Principle1.Guideline1_1.1_1_1.H36": "input-image-alt",
+    "Principle1.Guideline1_1.1_1_1.H2": "image-redundant-alt",
+    "Principle1.Guideline1_1.1_1_1.H53": "object-alt",
+    "Principle1.Guideline1_1.1_1_1.G94": "image-alt",
+    "Principle1.Guideline1_1.1_1_1.H24": "area-alt",
+    "Principle2.Guideline2_4.2_4_1.H64": "frame-title",
+    "Principle2.Guideline2_4.2_4_1.G1": "bypass",
+    "Principle2.Guideline2_4.2_4_1.G124": "bypass",
+    "Principle2.Guideline2_4.2_4_2.H25": "document-title",
+    "Principle2.Guideline2_4.2_4_4.H77": "link-name",
+    "Principle1.Guideline1_1.1_1_1.H30": "link-name",
+    "Principle2.Guideline2_4.2_4_6.G197": "label",
+    "Principle2.Guideline2_1.2_1_1.G202": "scrollable-region-focusable",
+    "Principle3.Guideline3_1.3_1_1.H57": "html-has-lang",
+    "Principle3.Guideline3_1.3_1_1.H57.2": "html-has-lang",
+    "Principle3.Guideline3_1.3_1_1.H57.3": "html-lang-valid",
+    "Principle3.Guideline3_1.3_1_1.H57.3.Lang": "html-lang-valid",
+    "Principle3.Guideline3_2.3_2_1.G107": "select-name",
+    "Principle3.Guideline3_3.3_3_2.G131": "label",
+    "Principle4.Guideline4_1.4_1_1.F77": "duplicate-id",
+    "Principle4.Guideline4_1.4_1_2.H91": "button-name",
+    "Principle4.Guideline4_1.4_1_2.H91.A": "link-name",
+    "Principle4.Guideline4_1.4_1_2.H91.Button": "button-name",
+    "Principle4.Guideline4_1.4_1_2.H91.InputText": "label",
+    "Principle4.Guideline4_1.4_1_2.H91.Select": "select-name"
+  }
+}
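
A hedged sketch of how `pa11y-config.json` could be applied to a pa11y issue. pa11y issues carry a `code` such as `WCAG2AA.Principle1.Guideline1_1.1_1_1.H37`, a numeric `typeCode` (1 error, 2 warning, 3 notice), and a `selector`. Everything else is an assumption for illustration and is not taken from the package's scanner: stripping the standard prefix, the longest-prefix lookup, and the helper names `stripStandard`, `axeEquivalent`, and `normalizePa11yIssue`.

```javascript
// Trimmed copy of the config above.
const pa11yConfig = {
  ignoreByPrinciple: ["Principle1.Guideline1_4.1_4_3.G18.Fail"],
  impactMap: { 1: "serious", 2: "moderate", 3: "minor" },
  equivalenceMap: {
    "Principle1.Guideline1_1.1_1_1.H37": "image-alt",
    "Principle4.Guideline4_1.4_1_2.H91": "button-name",
    "Principle4.Guideline4_1.4_1_2.H91.A": "link-name",
  },
};

// Drops the leading standard segment ("WCAG2AA.") so codes match the map keys.
function stripStandard(code) {
  const idx = code.indexOf("Principle");
  return idx === -1 ? code : code.slice(idx);
}

// Assumed lookup: try the full code, then progressively shorter prefixes.
function axeEquivalent(code, cfg) {
  const parts = stripStandard(code).split(".");
  while (parts.length > 0) {
    const key = parts.join(".");
    if (Object.hasOwn(cfg.equivalenceMap, key)) return cfg.equivalenceMap[key];
    parts.pop();
  }
  return null;
}

// Returns null for ignored codes, otherwise a normalized finding.
function normalizePa11yIssue(issue, cfg) {
  const code = stripStandard(issue.code);
  if (cfg.ignoreByPrinciple.some((prefix) => code.startsWith(prefix))) return null;
  return {
    id: code,
    impact: cfg.impactMap[String(issue.typeCode)] ?? "minor",
    source: "pa11y",
    axeEquivalent: axeEquivalent(code, cfg),
    selector: issue.selector,
  };
}

const linkIssue = normalizePa11yIssue(
  { code: "WCAG2AA.Principle4.Guideline4_1.4_1_2.H91.A.EmptyNoId", typeCode: 1, selector: "a.logo" },
  pa11yConfig,
);
const ignoredIssue = normalizePa11yIssue(
  { code: "WCAG2AA.Principle1.Guideline1_4.1_4_3.G18.Fail", typeCode: 1, selector: "p.muted" },
  pa11yConfig,
);
console.log(linkIssue, ignoredIssue);
```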

package/docs/architecture.md ADDED

@@ -0,0 +1,218 @@
+# Engine Architecture
+**Navigation**: [Home](../README.md) • [Architecture](architecture.md) • [CLI Handbook](cli-handbook.md) • [Output Artifacts](outputs.md)
+---
+## Table of Contents
+- [Pipeline overview](#pipeline-overview)
+- [Stage 1: DOM scanner](#stage-1-dom-scanner)
+  - [axe-core](#axe-core)
+  - [CDP checks](#cdp-checks)
+  - [pa11y](#pa11y)
+  - [Merge and deduplication](#merge-and-deduplication)
+- [Stage 1b: Source scanner](#optional-source-scanner)
+- [Stage 2: Analyzer](#stage-2-analyzer)
+- [Stage 3: Report builders](#stage-3-report-builders)
+- [Assets and rule intelligence](#assets-and-rule-intelligence)
+- [Execution model and timeouts](#execution-model-and-timeouts)
+---
+The engine operates as a three-stage pipeline. Each stage is an independent Node.js process spawned by `audit.mjs`. Stages communicate through JSON files written to `.audit/`.
+## Pipeline overview
+```
+Target URL
+    │
+    ▼
+┌─────────────────────────────────┐
+│  Stage 1: DOM Scanner           │  Three engines per route:
+│  dom-scanner.mjs                │
+│                                 │
+│  ┌──────────┐  ┌──────┐        │
+│  │ axe-core │  │ CDP  │        │  Playwright Chromium
+│  └────┬─────┘  └──┬───┘        │
+│       │           │             │
+│  ┌────▼───────────▼────┐       │
+│  │      pa11y          │       │  Puppeteer Chrome
+│  └────────┬────────────┘       │
+│           │                    │
+│  ┌────────▼────────────┐       │
+│  │  Merge & Dedup      │       │
+│  └────────┬────────────┘       │
+└───────────┼─────────────────────┘
+            │ a11y-scan-results.json
+            │ progress.json
+            ▼
+┌─────────────────────────────────┐
+│  Stage 1b: Source Scanner       │  Static regex analysis
+│  source-scanner.mjs             │  (optional — requires --project-dir)
+└───────────┬─────────────────────┘
+            │ merges into a11y-findings.json
+            ▼
+┌─────────────────────────────────┐
+│  Stage 2: Analyzer              │  Fix intelligence enrichment
+│  analyzer.mjs                   │  intelligence.json + guardrails
+└───────────┬─────────────────────┘
+            │ a11y-findings.json
+            ▼
+┌─────────────────────────────────┐
+│  Stage 3: Report Builders       │  Parallel rendering
+│  md / html / pdf / checklist    │
+└───────────┬─────────────────────┘
+            │
+    ┌───────┼──────────┬──────────────┐
+    ▼       ▼          ▼              ▼
+remediation report   report        checklist
+   .md      .html     .pdf           .html
+```
+## Stage 1: DOM scanner
+**Script**: `scripts/engine/dom-scanner.mjs`
+Launches a Playwright-controlled Chromium browser, discovers routes, and runs three independent accessibility engines against each page. Results are merged and deduplicated before output.
+### Route discovery
+- If the site exposes a `sitemap.xml`, all listed URLs are scanned (up to `--max-routes`).
+- Otherwise, BFS crawl starting from `--base-url`, following same-origin `<a href>` links up to `--crawl-depth` levels deep.
+- Routes are deduplicated and normalized before scanning.
+- 3 parallel browser tabs scan routes concurrently (~2-3x faster than sequential).
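
The BFS fallback described above can be sketched as follows. This is an illustration under stated assumptions, not the scanner's code: `discoverRoutes` and the `getLinks` callback are hypothetical names, the link graph is an in-memory object rather than a live page, and the normalization (dropping the fragment and trailing slashes) is one plausible reading of "deduplicated and normalized". The real crawl would be asynchronous.

```javascript
// Breadth-first discovery bounded by crawl depth and a route cap.
function discoverRoutes(baseUrl, getLinks, { maxRoutes = 10, crawlDepth = 2 } = {}) {
  const origin = new URL(baseUrl).origin;

  // Resolve relative hrefs, drop the fragment, and trim trailing slashes.
  const normalize = (href, from) => {
    const url = new URL(href, from);
    url.hash = "";
    if (url.pathname.length > 1) url.pathname = url.pathname.replace(/\/+$/, "");
    return url;
  };

  const start = normalize(baseUrl, baseUrl).href;
  const seen = new Set([start]);
  const queue = [{ url: start, depth: 0 }];
  const routes = [];

  while (queue.length > 0 && routes.length < maxRoutes) {
    const { url, depth } = queue.shift();
    routes.push(url);
    if (depth >= crawlDepth) continue; // do not follow links past the depth limit
    for (const href of getLinks(url)) {
      const next = normalize(href, url);
      if (next.origin !== origin || seen.has(next.href)) continue; // same-origin, unseen only
      seen.add(next.href);
      queue.push({ url: next.href, depth: depth + 1 });
    }
  }
  return routes;
}

// Hypothetical site graph standing in for the <a href> links found on each page.
const siteLinks = {
  "https://example.com/": ["/about/", "/pricing#plans", "https://other.example.org/"],
  "https://example.com/about": ["/", "/team"],
  "https://example.com/pricing": [],
  "https://example.com/team": ["/careers"],
};
const discovered = discoverRoutes(
  "https://example.com",
  (url) => siteLinks[url] ?? [],
  { crawlDepth: 2 },
);
console.log(discovered);
```

With `crawlDepth: 2`, `/team` is scanned but its links are not followed, so `/careers` is never discovered.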
+### axe-core
+**Dependency**: `@axe-core/playwright`
+The primary engine. Injects axe-core into the live page via Playwright and runs WCAG 2.2 A/AA tag checks. Covers the majority of automatable WCAG success criteria (~80+ rules).
+- Configurable via `--axe-tags` (default: `wcag2a,wcag2aa,wcag21a,wcag21aa,wcag22a,wcag22aa`)
+- Supports `--only-rule` for focused single-rule audits
+- Supports `--exclude-selectors` to skip specific elements
+### CDP checks
+**Dependency**: Playwright's built-in CDP session (`page.context().newCDPSession()`)
+Queries the browser's full accessibility tree via Chrome DevTools Protocol. Because the CDP checks read the computed accessibility tree rather than the DOM, they can catch issues axe may miss:
+- **Missing accessible names** — interactive elements (`button`, `link`, `textbox`, `combobox`, etc.) with empty names in the accessibility tree
+- **aria-hidden on focusable elements** — elements that are focusable but hidden from assistive technology
+CDP findings use axe-compatible violation format with `source: "cdp"` for downstream processing.
+### pa11y
+**Dependency**: `pa11y` (which uses Puppeteer + Chrome internally)
+Runs Squiz's HTML CodeSniffer against each page URL. Catches WCAG violations that axe and CDP may miss:
+- Heading hierarchy issues
+- Link purpose violations
+- Form label associations
+- Additional WCAG2AA/WCAG2AAA checks from HTML CodeSniffer's rule set
+pa11y requires a separate Chrome installation (`npx puppeteer browsers install chrome`). This is separate from Playwright's Chromium. If Chrome is missing, pa11y fails silently (non-fatal) and the scan continues with axe + CDP only.
+pa11y findings use axe-compatible violation format with `source: "pa11y"` for downstream processing.
+### Merge and deduplication
+After all three engines complete, `mergeViolations()` combines findings and removes cross-engine duplicates:
+1. **axe findings** are added first as the baseline
+2. **CDP findings** are checked against axe equivalents (e.g. `cdp-missing-accessible-name` maps to `button-name`, `link-name`, `input-name`, `aria-command-name`). Only truly new findings are added.
+3. **pa11y findings** are checked against existing selectors. If the same element is already flagged by axe or CDP, the pa11y finding is dropped.
+The merged violations are written to `a11y-scan-results.json` per route.
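
The three merge steps above can be sketched as below. The body of `mergeViolations()` is not reproduced in this section, so `mergeViolationsSketch` is a simplified stand-in. It assumes violations use axe's `nodes[].target` selector arrays, treats a CDP finding as a duplicate whenever any equivalent axe rule fired on the page, and compares pa11y selectors by exact string match. The actual rule-plus-selector matching may be stricter.

```javascript
// Equivalence lists copied from cdp-checks.json.
const cdpEquivalents = {
  "cdp-missing-accessible-name": ["button-name", "link-name", "input-name", "aria-command-name"],
  "cdp-aria-hidden-focusable": ["aria-hidden-focus"],
};

// Flattens a violation's nodes into comparable selector strings.
const selectorsOf = (violation) => violation.nodes.map((n) => n.target.join(" "));

function mergeViolationsSketch(axe, cdp, pa11y) {
  // 1. axe findings form the baseline.
  const merged = [...axe];
  const axeRuleIds = new Set(axe.map((v) => v.id));

  // 2. Keep a CDP finding only if no equivalent axe rule already reported.
  for (const v of cdp) {
    const equivalents = cdpEquivalents[v.id] ?? [];
    if (!equivalents.some((id) => axeRuleIds.has(id))) merged.push(v);
  }

  // 3. Drop pa11y findings whose element is already flagged by axe or CDP.
  const flagged = new Set(merged.flatMap(selectorsOf));
  for (const v of pa11y) {
    if (!selectorsOf(v).some((s) => flagged.has(s))) merged.push(v);
  }
  return merged;
}

const axeResults = [
  { id: "button-name", nodes: [{ target: ["#save"] }] },
  { id: "image-alt", nodes: [{ target: ["img.hero"] }] },
];
const cdpResults = [
  { id: "cdp-missing-accessible-name", source: "cdp", nodes: [{ target: ["#save"] }] },
  { id: "cdp-aria-hidden-focusable", source: "cdp", nodes: [{ target: ["a.skip"] }] },
];
const pa11yResults = [
  { id: "Principle1.Guideline1_1.1_1_1.H37", source: "pa11y", nodes: [{ target: ["img.hero"] }] },
  { id: "Principle1.Guideline1_3.1_3_1.H42", source: "pa11y", nodes: [{ target: ["h4.title"] }] },
];
const mergedViolations = mergeViolationsSketch(axeResults, cdpResults, pa11yResults);
console.log(mergedViolations.map((v) => v.id));
```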
+### Progress tracking
+The scanner writes `progress.json` in real time as each engine runs. Integrations (like `a11y-scanner`) use this file to drive a live progress UI:
+```json
+{
+  "steps": {
+    "page":  { "status": "done", "updatedAt": "..." },
+    "axe":   { "status": "done", "updatedAt": "...", "found": 8 },
+    "cdp":   { "status": "done", "updatedAt": "...", "found": 3 },
+    "pa11y": { "status": "done", "updatedAt": "...", "found": 2 },
+    "merge": { "status": "done", "updatedAt": "...", "axe": 8, "cdp": 3, "pa11y": 2, "merged": 11 }
+  },
+  "currentStep": "merge"
+}
+```
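
An integration could summarize that structure along these lines. The sketch works on an already-parsed object; `summarizeProgress` is a hypothetical helper, and `"done"` is the only status value shown in the example above, so any other status is simply treated as "not finished".

```javascript
// Parsed contents of progress.json, mirroring the example above.
const progress = {
  steps: {
    page: { status: "done", updatedAt: "..." },
    axe: { status: "done", updatedAt: "...", found: 8 },
    cdp: { status: "done", updatedAt: "...", found: 3 },
    pa11y: { status: "done", updatedAt: "...", found: 2 },
    merge: { status: "done", updatedAt: "...", axe: 8, cdp: 3, pa11y: 2, merged: 11 },
  },
  currentStep: "merge",
};

// Totals the per-engine counts and reports how many duplicates the merge removed.
function summarizeProgress(p) {
  const engines = ["axe", "cdp", "pa11y"];
  const found = engines.reduce((sum, e) => sum + (p.steps[e]?.found ?? 0), 0);
  const done = Object.values(p.steps).every((s) => s.status === "done");
  const merge = p.steps.merge;
  return {
    done,
    currentStep: p.currentStep,
    found,
    merged: merge?.merged ?? null,
    duplicatesRemoved: merge ? found - merge.merged : null,
  };
}

const progressSummary = summarizeProgress(progress);
console.log(progressSummary);
```

For the sample above, the engines report 13 findings in total and the merge step keeps 11, so two cross-engine duplicates were removed.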
+### Screenshots
+After merging, element screenshots are captured for each violation. Non-visible elements (`<meta>`, `<link>`, `<script>`, etc.) are automatically skipped. Screenshots are stored in `.audit/screenshots/` and referenced by each violation's `screenshot_path` field.
+### Optional: Source scanner
+**Script**: `scripts/engine/source-scanner.mjs` — runs when `--project-dir` is set and `--skip-patterns` is not.
+Performs static analysis of source files for accessibility issues no runtime engine can detect (e.g. focus outline suppression, missing alt text in templates). Uses regex patterns from `assets/remediation/code-patterns.json` scoped to framework-specific file boundaries from `assets/remediation/source-boundaries.json`.
+Findings are classified as `confirmed` (pattern unambiguously matches) or `potential` (requires human verification).
+## Stage 2: Analyzer
+**Script**: `scripts/engine/analyzer.mjs`
+Reads `a11y-scan-results.json` (which contains merged axe + CDP + pa11y results) and enriches each violation with:
+- **Fix intelligence** from `assets/remediation/intelligence.json` — 106 axe-core rules with code snippets, MDN links, framework-specific notes, and WCAG criterion mapping. CDP and pa11y findings receive generic enrichment based on their rule structure.
+- **Selector scoring** — picks the most stable selector from axe's `nodes` list. Priority: `#id` > `[data-*]` > `[aria-*]` > `[type=]`, with penalty for Tailwind utility classes.
+- **Framework context** — `assets/discovery/stack-detection.json` fingerprints the DOM to detect framework and CMS. Per-finding `framework_notes` and `cms_notes` are filtered to the detected stack.
+- **Guardrails** — `assets/remediation/guardrails.json` defines scope rules that prevent agents from touching backend code, third-party scripts, or minified files.
+- **Compliance scoring** — `assets/reporting/compliance-config.json` weights findings by severity to produce a 0-100 score with grade thresholds.
+- **Persona impact groups** — `assets/reporting/wcag-reference.json` maps findings to disability personas (visual, motor, cognitive, etc.).
+**Output**: `a11y-findings.json` — enriched findings array with all intelligence fields.
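
The selector-scoring priority listed above (`#id` > `[data-*]` > `[aria-*]` > `[type=]`, with a Tailwind penalty) could look roughly like this. The weights, the `TAILWIND_LIKE` heuristic, and the function names are invented for illustration. The analyzer's real scoring values are not shown in this diff.

```javascript
// Heuristic (assumed) pattern for Tailwind-style utility classes such as .px-4 or .bg-blue-500.
const TAILWIND_LIKE = /\.(?:p|m|px|py|mt|mb|w|h|text|bg|flex|gap|rounded|border)-[\w-]+/g;

// Higher score means a more stable selector. Weights are illustrative only.
function scoreSelector(selector) {
  let score = 0;
  if (/#[\w-]+/.test(selector)) score += 100;       // #id
  if (/\[data-[\w-]+/.test(selector)) score += 75;  // [data-*]
  if (/\[aria-[\w-]+/.test(selector)) score += 50;  // [aria-*]
  if (/\[type=/.test(selector)) score += 25;        // [type=]
  const utilityClasses = selector.match(TAILWIND_LIKE) ?? [];
  return score - 10 * utilityClasses.length;        // penalty per utility class
}

// Assumes a non-empty list; ties keep the earlier selector.
function pickStableSelector(selectors) {
  return selectors.reduce((best, s) => (scoreSelector(s) > scoreSelector(best) ? s : best));
}

const bestSelector = pickStableSelector([
  "button.px-4.py-2.bg-blue-500",
  'button[type="submit"]',
  '[data-testid="save"]',
  "#save-btn",
]);
console.log(bestSelector);
```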
+## Stage 3: Report builders
+All builders run in parallel when `--with-reports` is set. Each reads `a11y-findings.json` independently.
+| Builder | Script | Output | Audience |
+| :--- | :--- | :--- | :--- |
+| Markdown | `reports/builders/md.mjs` | `remediation.md` | AI agents |
+| HTML | `reports/builders/html.mjs` | `report.html` | Developers |
+| PDF | `reports/builders/pdf.mjs` | `report.pdf` | Stakeholders |
+| Checklist | `reports/builders/checklist.mjs` | `checklist.html` | QA / Developers |
+The `remediation.md` builder always runs (even without `--with-reports`) since it is the primary output for AI agent consumption.
+Renderers in `scripts/reports/renderers/` contain the actual rendering logic — builders are thin orchestrators that call renderers and write output files.
+## Assets and rule intelligence
+Assets are static JSON files bundled with the package under `assets/`. They are read at runtime by the analyzer and report builders.
+| Asset | Purpose |
+| :--- | :--- |
+| `reporting/compliance-config.json` | Score weights, grade thresholds, legal regulation list |
+| `reporting/wcag-reference.json` | WCAG criterion map, persona config, persona-rule mapping |
+| `reporting/manual-checks.json` | 41 manual checks for the WCAG checklist |
+| `discovery/crawler-config.json` | BFS crawl defaults (timeouts, concurrency) |
+| `discovery/stack-detection.json` | Framework/CMS DOM fingerprints |
+| `remediation/intelligence.json` | Per-rule fix intelligence for 106 axe-core rules |
+| `remediation/code-patterns.json` | Source code pattern definitions |
+| `remediation/guardrails.json` | Agent fix scope guardrails |
+| `remediation/axe-check-maps.json` | axe check-to-rule mapping |
+| `remediation/source-boundaries.json` | Framework-specific source file locations |
+## Execution model and timeouts
+`audit.mjs` spawns each stage as a child process via `node:child_process`. All child processes:
+- Inherit the parent's environment
+- Run with `cwd` set to the package root (`SKILL_ROOT`)
+- Have a hard timeout of **15 minutes** (configurable via the `SCRIPT_TIMEOUT_MS` constant)
+The orchestrator exits with code `1` if any stage fails. Separately from the stage timeout, a per-page network timeout is enforced via `--timeout-ms` (default: 30s).
+If `node_modules/` is absent on first run, the orchestrator automatically installs dependencies via `pnpm install` (falls back to `npm install`).