@diegovelasquezweb/a11y-engine 0.1.2 → 0.1.4

package/CHANGELOG.md ADDED
@@ -0,0 +1,84 @@
1
+ # Changelog
2
+
3
+ All notable changes to this project will be documented in this file.
4
+
5
+ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
6
+ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
7
+
8
+ ## [Unreleased]
9
+
10
+ ---
11
+
12
+ ## [0.1.3] — 2026-03-14
13
+
14
+ ### Added
15
+
16
+ - **Multi-engine scanning**: three independent engines now run against each page:
17
+ - **axe-core** (via `@axe-core/playwright`) — primary WCAG rule engine injected into the live page
18
+ - **CDP** (Chrome DevTools Protocol) — queries the browser's accessibility tree for missing accessible names and aria-hidden on focusable elements
19
+ - **pa11y** (HTML CodeSniffer via Puppeteer) — catches heading hierarchy, link purpose, and form association issues
20
+ - Cross-engine merge and deduplication in `mergeViolations()` — removes duplicate findings across axe, CDP, and pa11y based on rule equivalence and selector matching
21
+ - Real-time `progress.json` with per-engine step tracking and finding counts (`found` for each engine, `merged` total after dedup)
22
+ - `--axe-tags` CLI flag for filtering axe-core WCAG tag sets (also determines pa11y standard)
23
+ - Non-visible element skip list for screenshots (`<meta>`, `<link>`, `<style>`, `<script>`, `<title>`, `<base>`) — prevents timeout warnings on elements that cannot be scrolled into view
24
+
25
+ ### Changed
26
+
27
+ - `a11y-scan-results.json` now contains merged violations from all three engines (previously axe-core only)
28
+ - CDP and pa11y violations now include a `source` field (`"cdp"` or `"pa11y"`) identifying the engine that produced them; axe-core violations omit `source` for backwards compatibility
29
+ - README rewritten to reflect multi-engine architecture
30
+ - All documentation (`architecture.md`, `cli-handbook.md`, `outputs.md`) updated to describe the three-engine pipeline, merge/dedup logic, progress tracking, and dual browser requirements
31
+
32
+ ### Fixed
33
+
34
+ - Screenshot capture no longer attempts to scroll non-visible `<head>` elements into view
35
+
36
+ ---
37
+
38
+ ## [0.1.2] — 2026-03-13
39
+
40
+ ### Fixed
41
+
42
+ - `bin` field in `package.json` — removed leading `./` from the entry path (`scripts/audit.mjs`) to satisfy npm bin resolution
43
+ - `repository.url` normalized to `git+https://` prefix as required by npm registry validation
44
+ - Missing shebang (`#!/usr/bin/env node`) added to `scripts/audit.mjs` so the `a11y-audit` binary executes correctly when installed globally or via `npx`
45
+
46
+ ---
47
+
48
+ ## [0.1.1] — 2026-03-13
49
+
50
+ ### Added
51
+
52
+ - Engine scripts published as a standalone npm package:
53
+ - `scripts/audit.mjs` — orchestrator for the full audit pipeline
54
+ - `scripts/core/utils.mjs` — shared logging, path utilities, and defaults
55
+ - `scripts/core/toolchain.mjs` — dependency and Playwright browser verification
56
+ - `scripts/core/asset-loader.mjs` — JSON asset loading with error boundaries
57
+ - `scripts/engine/dom-scanner.mjs` — Playwright + axe-core WCAG 2.2 AA scanner
58
+ - `scripts/engine/analyzer.mjs` — finding enrichment with fix intelligence
59
+ - `scripts/engine/source-scanner.mjs` — static source code pattern scanner
60
+ - `scripts/reports/builders/` — orchestrators for each report format
61
+ - `scripts/reports/renderers/` — rendering logic for HTML, PDF, Markdown, and checklist
62
+ - Asset files bundled under `assets/`:
63
+ - `assets/reporting/compliance-config.json` — scoring weights, grade thresholds, and legal regulation mapping
64
+ - `assets/reporting/wcag-reference.json` — WCAG criterion map, persona config, and persona–rule mapping
65
+ - `assets/reporting/manual-checks.json` — 41 manual WCAG checks for the interactive checklist
66
+ - `assets/discovery/crawler-config.json` — BFS crawl configuration defaults
67
+ - `assets/discovery/stack-detection.json` — framework and CMS fingerprint signatures
68
+ - `assets/remediation/intelligence.json` — per-rule fix intelligence (106 axe-core rules)
69
+ - `assets/remediation/code-patterns.json` — source code pattern definitions
70
+ - `assets/remediation/guardrails.json` — agent fix guardrails and scope rules
71
+ - `assets/remediation/axe-check-maps.json` — axe check-to-rule mapping
72
+ - `assets/remediation/source-boundaries.json` — framework-specific file location patterns
73
+ - `a11y-audit` binary registered in `bin` field — invocable via `npx a11y-audit` after install
74
+ - `LICENSE` (MIT)
75
+
76
+ ---
77
+
78
+ ## [0.1.0] — 2026-03-13
79
+
80
+ ### Added
81
+
82
+ - Initial package scaffold: `package.json` for `@diegovelasquezweb/a11y-engine` with correct `name`, `version`, `type: module`, `engines`, `files`, and `scripts` fields
83
+ - `devDependencies`: `vitest` for test runner
84
+ - `dependencies`: `playwright`, `@axe-core/playwright`, `axe-core`, `pa11y`
package/README.md CHANGED
@@ -1,20 +1,233 @@
1
1
  # @diegovelasquezweb/a11y-engine
2
2
 
3
- WCAG 2.2 AA accessibility audit engine. Scanner, analyzer, and report builders.
3
+ Multi-engine WCAG 2.2 AA accessibility audit engine. Combines three scanning engines (axe-core, Chrome DevTools Protocol, and pa11y), merges and deduplicates their findings, enriches results with fix intelligence, and produces structured artifacts for developers, agents, and stakeholders.
4
4
 
5
- ## Install
5
+ ## What it is
6
+
7
+ A Node.js CLI and programmatic engine that:
8
+
9
+ 1. Crawls a target URL and discovers routes automatically
10
+ 2. Runs three independent accessibility engines against each page:
11
+ - **axe-core** — industry-standard WCAG rule engine, injected into the live page via Playwright
12
+ - **CDP** (Chrome DevTools Protocol) — queries the browser's accessibility tree directly for issues axe may miss (missing accessible names, aria-hidden on focusable elements)
13
+ - **pa11y** (HTML CodeSniffer) — catches WCAG violations around heading hierarchy, link purpose, and form associations
14
+ 3. Merges and deduplicates findings across all three engines
15
+ 4. Optionally scans project source code for patterns no runtime engine can detect
16
+ 5. Enriches each finding with stack-aware fix guidance, selectors, and verification commands
17
+ 6. Produces a full artifact set: JSON data, Markdown remediation guide, HTML dashboard, PDF compliance report, and manual testing checklist
18
+
19
+ ## Why use this engine
20
+
21
+ | Capability | With this engine | Without |
22
+ | :--- | :--- | :--- |
23
+ | **Multi-engine scanning** | axe-core + CDP accessibility tree + pa11y (HTML CodeSniffer) with cross-engine deduplication | Single engine — higher false-negative rate |
24
+ | **Full WCAG 2.2 Coverage** | Three runtime engines + source code pattern scanner | Runtime scan only — misses structural and source-level issues |
25
+ | **Fix Intelligence** | Stack-aware patches with code snippets tailored to detected framework | Raw rule violations with no remediation context |
26
+ | **Structured Artifacts** | JSON + Markdown + HTML + PDF + Checklist — ready to consume or forward | Findings exist only in the terminal session |
27
+ | **CI/Agent Integration** | Deterministic exit codes, stdout-parseable output paths, JSON schema | Requires wrapper scripting |
28
+
29
+ ## How the scan pipeline works
30
+
31
+ ```
32
+ URL
33
+ |
34
+ v
35
+ [1. Crawl & Discover] sitemap.xml / BFS link crawl / explicit --routes
36
+ |
37
+ v
38
+ [2. Navigate] Playwright opens each route in Chromium
39
+ |
40
+ +---> [axe-core] Injects axe into the page, runs WCAG tag checks
41
+ |
42
+ +---> [CDP] Opens a CDP session, reads the full accessibility tree
43
+ |
44
+ +---> [pa11y] Launches HTML CodeSniffer via Puppeteer Chrome
45
+ |
46
+ v
47
+ [3. Merge & Dedup] Combines findings, removes cross-engine duplicates
48
+ |
49
+ v
50
+ [4. Analyze] Enriches with WCAG mapping, severity, fix code, framework hints
51
+ |
52
+ v
53
+ [5. Reports] HTML dashboard, PDF, checklist, Markdown remediation
54
+ ```
55
+
56
+ ## Installation
6
57
 
7
58
  ```bash
8
- npm i @diegovelasquezweb/a11y-engine
59
+ npm install @diegovelasquezweb/a11y-engine
9
60
  npx playwright install chromium
61
+ npx puppeteer browsers install chrome
62
+ ```
63
+
64
+ ```bash
65
+ pnpm add @diegovelasquezweb/a11y-engine
66
+ pnpm exec playwright install chromium
67
+ npx puppeteer browsers install chrome
10
68
  ```
11
69
 
12
- ## Usage
70
+ > **Two browsers are required:**
71
+ > - **Playwright Chromium** — used by axe-core and CDP checks
72
+ > - **Puppeteer Chrome** — used by pa11y (HTML CodeSniffer)
73
+ >
74
+ > These are separate browser installations. If Puppeteer Chrome is missing, pa11y checks fail without aborting the run (a non-fatal `pa11y checks failed` message is logged) and the scan continues with axe + CDP only.
75
+
76
+ ## Quick start
13
77
 
14
78
  ```bash
15
- npx a11y-audit --base-url https://example.com --max-routes 5
79
+ # Minimal scan — produces remediation.md in .audit/
80
+ npx a11y-audit --base-url https://example.com
81
+
82
+ # Full audit with all reports
83
+ npx a11y-audit --base-url https://example.com --with-reports --output ./audit/report.html
84
+
85
+ # Scan with source code intelligence (for stack-aware fix guidance)
86
+ npx a11y-audit --base-url http://localhost:3000 --project-dir . --with-reports --output ./audit/report.html
16
87
  ```
17
88
 
18
- ## Options
89
+ ## CLI usage
90
+
91
+ ```
92
+ a11y-audit --base-url <url> [options]
93
+ ```
94
+
95
+ ### Targeting & scope
96
+
97
+ | Flag | Argument | Default | Description |
98
+ | :--- | :--- | :--- | :--- |
99
+ | `--base-url` | `<url>` | (Required) | Starting URL for the audit. |
100
+ | `--max-routes` | `<num>` | `10` | Max routes to discover and scan. |
101
+ | `--crawl-depth` | `<num>` | `2` | BFS link-follow depth during discovery (1-3). |
102
+ | `--routes` | `<csv>` | — | Explicit path list, bypasses auto-discovery. |
103
+ | `--project-dir` | `<path>` | — | Path to project source. Enables source pattern scanner and framework auto-detection. |
104
+
105
+ ### Audit intelligence
106
+
107
+ | Flag | Argument | Default | Description |
108
+ | :--- | :--- | :--- | :--- |
109
+ | `--target` | `<text>` | `WCAG 2.2 AA` | Compliance target label in reports. |
110
+ | `--only-rule` | `<id>` | — | Run a single axe rule (e.g. `color-contrast`). |
111
+ | `--ignore-findings` | `<csv>` | — | Rule IDs to exclude from output. |
112
+ | `--exclude-selectors` | `<csv>` | — | CSS selectors to skip during DOM scan. |
113
+ | `--axe-tags` | `<csv>` | `wcag2a,wcag2aa,wcag21a,wcag21aa,wcag22a,wcag22aa` | axe-core WCAG tag filter. |
114
+ | `--framework` | `<name>` | — | Override auto-detected stack. Supported: `nextjs`, `gatsby`, `react`, `nuxt`, `vue`, `angular`, `astro`, `svelte`, `shopify`, `wordpress`, `drupal`. |
115
+
116
+ ### Execution & emulation
117
+
118
+ | Flag | Argument | Default | Description |
119
+ | :--- | :--- | :--- | :--- |
120
+ | `--color-scheme` | `light\|dark` | `light` | Emulate `prefers-color-scheme`. |
121
+ | `--wait-until` | `domcontentloaded\|load\|networkidle` | `domcontentloaded` | Playwright page load strategy. Use `networkidle` for SPAs. |
122
+ | `--viewport` | `<WxH>` | — | Viewport size (e.g. `375x812`, `1440x900`). |
123
+ | `--wait-ms` | `<num>` | `2000` | Delay after page load before running axe (ms). |
124
+ | `--timeout-ms` | `<num>` | `30000` | Network timeout per page (ms). |
125
+ | `--headed` | — | `false` | Run browser in visible mode. |
126
+ | `--affected-only` | — | `false` | Re-scan only routes with previous violations. Requires a prior scan in `.audit/`. |
127
+
128
+ ### Output generation
129
+
130
+ | Flag | Argument | Default | Description |
131
+ | :--- | :--- | :--- | :--- |
132
+ | `--with-reports` | — | `false` | Generate HTML + PDF + Checklist reports. Requires `--output`. |
133
+ | `--skip-reports` | — | `true` | Skip visual report generation (default). |
134
+ | `--output` | `<path>` | — | Output path for `report.html` (PDF and checklist derive from it). |
135
+ | `--skip-patterns` | — | `false` | Disable source code pattern scanner even when `--project-dir` is set. |
136
+
137
+ ## Common command patterns
138
+
139
+ ```bash
140
+ # Focused audit — one rule, one route
141
+ a11y-audit --base-url https://example.com --only-rule color-contrast --routes /checkout --max-routes 1
142
+
143
+ # Dark mode audit
144
+ a11y-audit --base-url https://example.com --color-scheme dark
145
+
146
+ # SPA with deferred rendering
147
+ a11y-audit --base-url https://example.com --wait-until networkidle --wait-ms 3000
148
+
149
+ # Mobile viewport
150
+ a11y-audit --base-url https://example.com --viewport 375x812
151
+
152
+ # Fast re-audit after fixes (skips clean pages)
153
+ a11y-audit --base-url https://example.com --affected-only
154
+
155
+ # Ignore known false positives
156
+ a11y-audit --base-url https://example.com --ignore-findings color-contrast,frame-title
157
+ ```
158
+
159
+ ## Output artifacts
160
+
161
+ All artifacts are written to `.audit/` relative to the package root.
162
+
163
+ | File | Always generated | Description |
164
+ | :--- | :--- | :--- |
165
+ | `a11y-scan-results.json` | Yes | Raw merged results from axe-core + CDP + pa11y per route |
166
+ | `a11y-findings.json` | Yes | Enriched findings with fix intelligence, WCAG mapping, and severity |
167
+ | `progress.json` | Yes | Real-time scan progress with per-engine step status and finding counts |
168
+ | `remediation.md` | Yes | AI-agent-optimized remediation roadmap |
169
+ | `report.html` | With `--with-reports` | Interactive HTML dashboard |
170
+ | `report.pdf` | With `--with-reports` | Formal compliance PDF |
171
+ | `checklist.html` | With `--with-reports` | Manual WCAG testing checklist |
172
+
173
+ See [Output Artifacts](docs/outputs.md) for full schema reference.
174
+
175
+ ## Scan engines
176
+
177
+ ### axe-core (via @axe-core/playwright)
178
+
179
+ The primary engine. Runs Deque's axe-core rule set against the live DOM inside Playwright's Chromium. Covers the majority of automatable WCAG 2.2 AA success criteria.
180
+
181
+ ### CDP (Chrome DevTools Protocol)
182
+
183
+ Queries the browser's full accessibility tree via a CDP session. Catches issues axe may miss:
184
+ - Interactive elements (buttons, links, inputs) with no accessible name
185
+ - Focusable elements hidden with `aria-hidden`
186
+
187
+ ### pa11y (HTML CodeSniffer)
188
+
189
+ Runs Squiz's HTML CodeSniffer via Puppeteer Chrome. Catches WCAG violations around:
190
+ - Heading hierarchy
191
+ - Link purpose
192
+ - Form label associations
193
+
194
+ Requires a separate Chrome installation (`npx puppeteer browsers install chrome`). If Chrome is missing, pa11y fails without aborting the run (a non-fatal `pa11y checks failed` message is logged) and the scan continues with axe + CDP.
195
+
196
+ ### Merge & deduplication
197
+
198
+ After all three engines run, findings are merged and deduplicated:
199
+ - axe findings are added first (baseline)
200
+ - CDP findings are checked against axe equivalents (e.g. `cdp-missing-accessible-name` vs `button-name`) to avoid duplicates
201
+ - pa11y findings are checked against existing selectors to avoid triple-reporting the same element
202
+
203
+ ## Troubleshooting
204
+
205
+ **`Error: browserType.launch: Executable doesn't exist`**
206
+ Run `npx playwright install chromium` (or `pnpm exec playwright install chromium`).
207
+
208
+ **`pa11y checks failed (non-fatal): Could not find Chrome`**
209
+ pa11y requires Puppeteer's Chrome, which is separate from Playwright's Chromium. Install it with `npx puppeteer browsers install chrome`.
210
+
211
+ **`Missing required argument: --base-url`**
212
+ The flag is required. Provide a full URL including protocol: `--base-url https://example.com`.
213
+
214
+ **Scan returns 0 findings on an SPA**
215
+ Use `--wait-until networkidle --wait-ms 3000` to let async content render before the engines run.
216
+
217
+ **`--with-reports` exits without generating PDF**
218
+ Ensure `--output` is also set and points to an `.html` file path: `--output ./audit/report.html`.
219
+
220
+ **Chromium crashes in CI**
221
+ Add `--no-sandbox` via the `PLAYWRIGHT_CHROMIUM_LAUNCH_OPTIONS` env var, or install Chromium together with its system dependencies: `npx playwright install --with-deps chromium`.
222
+
223
+ ## Documentation
224
+
225
+ | Resource | Description |
226
+ | :--- | :--- |
227
+ | [Architecture](https://github.com/diegovelasquezweb/a11y-engine/blob/main/docs/architecture.md) | How the multi-engine scanner pipeline works |
228
+ | [CLI Handbook](https://github.com/diegovelasquezweb/a11y-engine/blob/main/docs/cli-handbook.md) | Full flag reference and usage patterns |
229
+ | [Output Artifacts](https://github.com/diegovelasquezweb/a11y-engine/blob/main/docs/outputs.md) | Schema and structure of every generated file |
230
+
231
+ ## License
19
232
 
20
- See `a11y-audit --help` for full CLI reference.
233
+ MIT
@@ -0,0 +1,30 @@
1
+ {
2
+ "interactiveRoles": [
3
+ "button", "link", "textbox", "combobox", "listbox",
4
+ "menuitem", "tab", "checkbox", "radio", "switch", "slider"
5
+ ],
6
+ "rules": [
7
+ {
8
+ "id": "cdp-missing-accessible-name",
9
+ "condition": "interactive-no-name",
10
+ "impact": "serious",
11
+ "tags": ["wcag2a", "wcag412", "cdp-check"],
12
+ "help": "Interactive elements must have an accessible name",
13
+ "helpUrl": "https://dequeuniversity.com/rules/axe/4.11/button-name",
14
+ "description": "Interactive element with role \"{{role}}\" has no accessible name",
15
+ "failureMessage": "Element with role \"{{role}}\" has no accessible name in the accessibility tree",
16
+ "axeEquivalents": ["button-name", "link-name", "input-name", "aria-command-name"]
17
+ },
18
+ {
19
+ "id": "cdp-aria-hidden-focusable",
20
+ "condition": "hidden-focusable",
21
+ "impact": "serious",
22
+ "tags": ["wcag2a", "wcag412", "cdp-check"],
23
+ "help": "aria-hidden elements must not be focusable",
24
+ "helpUrl": "https://dequeuniversity.com/rules/axe/4.11/aria-hidden-focus",
25
+ "description": "Focusable element with role \"{{role}}\" is aria-hidden",
26
+ "failureMessage": "Focusable element with role \"{{role}}\" is hidden from the accessibility tree",
27
+ "axeEquivalents": ["aria-hidden-focus"]
28
+ }
29
+ ]
30
+ }
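The two rule conditions in this config can be illustrated with a minimal sketch. It assumes each accessibility-tree node has already been reduced to a plain `{ role, name, focusable, ariaHidden }` object; that shape, the `evaluateNode` helper, and the sample nodes are hypothetical and are not taken from the package.

```javascript
// Trimmed copy of the config above; only the fields the sketch needs.
const cdpConfig = {
  interactiveRoles: ["button", "link", "textbox", "combobox", "listbox",
    "menuitem", "tab", "checkbox", "radio", "switch", "slider"],
  rules: [
    { id: "cdp-missing-accessible-name", condition: "interactive-no-name",
      description: 'Interactive element with role "{{role}}" has no accessible name' },
    { id: "cdp-aria-hidden-focusable", condition: "hidden-focusable",
      description: 'Focusable element with role "{{role}}" is aria-hidden' },
  ],
};

// Hypothetical helper: returns the rules a single (pre-normalized) node violates.
function evaluateNode(node, cfg) {
  const findings = [];
  const isInteractive = cfg.interactiveRoles.includes(node.role);
  for (const rule of cfg.rules) {
    const hit =
      // Interactive role, exposed to assistive technology, but empty name.
      (rule.condition === "interactive-no-name" &&
        isInteractive && !node.ariaHidden && !(node.name || "").trim()) ||
      // Hidden from assistive technology yet still reachable by keyboard.
      (rule.condition === "hidden-focusable" && node.ariaHidden && node.focusable);
    if (hit) {
      findings.push({
        id: rule.id,
        source: "cdp",
        description: rule.description.replace("{{role}}", node.role),
      });
    }
  }
  return findings;
}

// Assumed sample nodes, for illustration only.
const sampleNodes = [
  { role: "button", name: "", focusable: true, ariaHidden: false },
  { role: "link", name: "Home", focusable: true, ariaHidden: true },
  { role: "heading", name: "", focusable: false, ariaHidden: false },
];
const cdpFindings = sampleNodes.flatMap((n) => evaluateNode(n, cdpConfig));
console.log(cdpFindings.map((f) => f.id));
```

Keeping the conditions as named strings in JSON means new roles can be added to `interactiveRoles` without touching the checking code.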
@@ -0,0 +1,53 @@
1
+ {
2
+ "ignoreByPrinciple": [
3
+ "Principle1.Guideline1_4.1_4_3.G18.Fail",
4
+ "Principle4.Guideline4_1.4_1_2.H91.A.NoContent"
5
+ ],
6
+ "impactMap": {
7
+ "1": "serious",
8
+ "2": "moderate",
9
+ "3": "minor"
10
+ },
11
+ "equivalenceMap": {
12
+ "Principle1.Guideline1_4.1_4_3.G145": "color-contrast",
13
+ "Principle1.Guideline1_4.1_4_3.G18": "color-contrast",
14
+ "Principle1.Guideline1_4.1_4_3.G145.Fail": "color-contrast",
15
+ "Principle1.Guideline1_4.1_4_3.G18.Fail": "color-contrast",
16
+ "Principle1.Guideline1_3.1_3_1.H42": "heading-order",
17
+ "Principle1.Guideline1_3.1_3_1.H42.2": "empty-heading",
18
+ "Principle1.Guideline1_3.1_3_1.H44": "label",
19
+ "Principle1.Guideline1_3.1_3_1.H65": "label",
20
+ "Principle1.Guideline1_3.1_3_1.H71": "label",
21
+ "Principle1.Guideline1_3.1_3_1.H85": "listitem",
22
+ "Principle1.Guideline1_3.1_3_1.H48": "list",
23
+ "Principle1.Guideline1_3.1_3_1.H39": "table-fake-caption",
24
+ "Principle1.Guideline1_3.1_3_1.H73": "table-fake-caption",
25
+ "Principle1.Guideline1_1.1_1_1.H37": "image-alt",
26
+ "Principle1.Guideline1_1.1_1_1.H67": "image-alt",
27
+ "Principle1.Guideline1_1.1_1_1.H36": "input-image-alt",
28
+ "Principle1.Guideline1_1.1_1_1.H2": "image-redundant-alt",
29
+ "Principle1.Guideline1_1.1_1_1.H53": "object-alt",
30
+ "Principle1.Guideline1_1.1_1_1.G94": "image-alt",
31
+ "Principle1.Guideline1_1.1_1_1.H24": "area-alt",
32
+ "Principle2.Guideline2_4.2_4_1.H64": "frame-title",
33
+ "Principle2.Guideline2_4.2_4_1.G1": "bypass",
34
+ "Principle2.Guideline2_4.2_4_1.G124": "bypass",
35
+ "Principle2.Guideline2_4.2_4_2.H25": "document-title",
36
+ "Principle2.Guideline2_4.2_4_4.H77": "link-name",
37
+ "Principle1.Guideline1_1.1_1_1.H30": "link-name",
38
+ "Principle2.Guideline2_4.2_4_6.G197": "label",
39
+ "Principle2.Guideline2_1.2_1_1.G202": "scrollable-region-focusable",
40
+ "Principle3.Guideline3_1.3_1_1.H57": "html-has-lang",
41
+ "Principle3.Guideline3_1.3_1_1.H57.2": "html-has-lang",
42
+ "Principle3.Guideline3_1.3_1_1.H57.3": "html-lang-valid",
43
+ "Principle3.Guideline3_1.3_1_1.H57.3.Lang": "html-lang-valid",
44
+ "Principle3.Guideline3_2.3_2_1.G107": "select-name",
45
+ "Principle3.Guideline3_3.3_3_2.G131": "label",
46
+ "Principle4.Guideline4_1.4_1_1.F77": "duplicate-id",
47
+ "Principle4.Guideline4_1.4_1_2.H91": "button-name",
48
+ "Principle4.Guideline4_1.4_1_2.H91.A": "link-name",
49
+ "Principle4.Guideline4_1.4_1_2.H91.Button": "button-name",
50
+ "Principle4.Guideline4_1.4_1_2.H91.InputText": "label",
51
+ "Principle4.Guideline4_1.4_1_2.H91.Select": "select-name"
52
+ }
53
+ }
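One plausible way to consume this config is sketched below. The lookup strategy (strip the leading standard segment such as `WCAG2AA.`, then try progressively shorter dot-separated prefixes) is an assumption, as is the `toAxeLike` helper; the package's actual matching logic is not shown in this diff. The sample issues follow pa11y's `code` / `typeCode` / `selector` result fields.

```javascript
// Small excerpt of the config above.
const pa11yConfig = {
  ignoreByPrinciple: ["Principle4.Guideline4_1.4_1_2.H91.A.NoContent"],
  impactMap: { "1": "serious", "2": "moderate", "3": "minor" },
  equivalenceMap: {
    "Principle1.Guideline1_1.1_1_1.H37": "image-alt",
    "Principle4.Guideline4_1.4_1_2.H91.A": "link-name",
  },
};

// Hypothetical helper: converts one pa11y issue into an axe-like record,
// or returns null when the issue code is on the ignore list.
function toAxeLike(issue, cfg) {
  // Drop the leading standard segment, e.g. "WCAG2AA."
  const code = issue.code.replace(/^WCAG2A{1,3}\./, "");
  if (cfg.ignoreByPrinciple.some((p) => code.startsWith(p))) return null;

  // Try the full code first, then shorter prefixes (assumed strategy).
  const parts = code.split(".");
  let axeEquivalent = null;
  for (let n = parts.length; n > 0 && !axeEquivalent; n--) {
    axeEquivalent = cfg.equivalenceMap[parts.slice(0, n).join(".")] || null;
  }
  return {
    id: issue.code,
    source: "pa11y",
    impact: cfg.impactMap[String(issue.typeCode)] || "minor",
    axeEquivalent,
    selector: issue.selector,
  };
}

const imgIssue = toAxeLike(
  { code: "WCAG2AA.Principle1.Guideline1_1.1_1_1.H37", typeCode: 1, selector: "img.hero" },
  pa11yConfig);
const linkIssue = toAxeLike(
  { code: "WCAG2AA.Principle4.Guideline4_1.4_1_2.H91.A.EmptyNoId", typeCode: 2, selector: "a.more" },
  pa11yConfig);
const ignoredIssue = toAxeLike(
  { code: "WCAG2AA.Principle4.Guideline4_1.4_1_2.H91.A.NoContent", typeCode: 1, selector: "a.icon" },
  pa11yConfig);
console.log(imgIssue, linkIssue, ignoredIssue);
```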
@@ -0,0 +1,218 @@
1
+ # Engine Architecture
2
+
3
+ **Navigation**: [Home](../README.md) • [Architecture](architecture.md) • [CLI Handbook](cli-handbook.md) • [Output Artifacts](outputs.md)
4
+
5
+ ---
6
+
7
+ ## Table of Contents
8
+
9
+ - [Pipeline overview](#pipeline-overview)
10
+ - [Stage 1: DOM scanner](#stage-1-dom-scanner)
11
+ - [axe-core](#axe-core)
12
+ - [CDP checks](#cdp-checks)
13
+ - [pa11y](#pa11y)
14
+ - [Merge and deduplication](#merge-and-deduplication)
15
+ - [Stage 1b: Source scanner](#optional-source-scanner)
16
+ - [Stage 2: Analyzer](#stage-2-analyzer)
17
+ - [Stage 3: Report builders](#stage-3-report-builders)
18
+ - [Assets and rule intelligence](#assets-and-rule-intelligence)
19
+ - [Execution model and timeouts](#execution-model-and-timeouts)
20
+
21
+ ---
22
+
23
+ The engine operates as a three-stage pipeline. Each stage is an independent Node.js process spawned by `audit.mjs`. Stages communicate through JSON files written to `.audit/`.
24
+
25
+ ## Pipeline overview
26
+
27
+ ```
28
+ Target URL
29
+
30
+
31
+ ┌─────────────────────────────────┐
32
+ │ Stage 1: DOM Scanner │ Three engines per route:
33
+ │ dom-scanner.mjs │
34
+ │ │
35
+ │ ┌──────────┐ ┌──────┐ │
36
+ │ │ axe-core │ │ CDP │ │ Playwright Chromium
37
+ │ └────┬─────┘ └──┬───┘ │
38
+ │ │ │ │
39
+ │ ┌────▼───────────▼────┐ │
40
+ │ │ pa11y │ │ Puppeteer Chrome
41
+ │ └────────┬────────────┘ │
42
+ │ │ │
43
+ │ ┌────────▼────────────┐ │
44
+ │ │ Merge & Dedup │ │
45
+ │ └────────┬────────────┘ │
46
+ └───────────┼─────────────────────┘
47
+ │ a11y-scan-results.json
48
+ │ progress.json
49
+
50
+ ┌─────────────────────────────────┐
51
+ │ Stage 1b: Source Scanner │ Static regex analysis
52
+ │ source-scanner.mjs │ (optional — requires --project-dir)
53
+ └───────────┬─────────────────────┘
54
+ │ merges into a11y-findings.json
55
+
56
+ ┌─────────────────────────────────┐
57
+ │ Stage 2: Analyzer │ Fix intelligence enrichment
58
+ │ analyzer.mjs │ intelligence.json + guardrails
59
+ └───────────┬─────────────────────┘
60
+ │ a11y-findings.json
61
+
62
+ ┌─────────────────────────────────┐
63
+ │ Stage 3: Report Builders │ Parallel rendering
64
+ │ md / html / pdf / checklist │
65
+ └───────────┬─────────────────────┘
66
+
67
+ ┌───────┼──────────┬──────────────┐
68
+ ▼ ▼ ▼ ▼
69
+ remediation report report checklist
70
+ .md .html .pdf .html
71
+ ```
72
+
73
+ ## Stage 1: DOM scanner
74
+
75
+ **Script**: `scripts/engine/dom-scanner.mjs`
76
+
77
+ Launches a Playwright-controlled Chromium browser, discovers routes, and runs three independent accessibility engines against each page. Results are merged and deduplicated before output.
78
+
79
+ ### Route discovery
80
+
81
+ - If the site exposes a `sitemap.xml`, all listed URLs are scanned (up to `--max-routes`).
82
+ - Otherwise, BFS crawl starting from `--base-url`, following same-origin `<a href>` links up to `--crawl-depth` levels deep.
83
+ - Routes are deduplicated and normalized before scanning.
84
+ - Three parallel browser tabs scan routes concurrently (roughly 2-3x faster than a sequential scan).
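The BFS fallback described above can be sketched as follows. This is an illustration under stated assumptions, not the scanner's code: `discoverRoutes` is a hypothetical name, the in-memory `linkGraph` stands in for visiting a page and collecting its `<a href>` values, and normalization here simply drops the query string, fragment, and trailing slashes.

```javascript
// Breadth-first route discovery over an in-memory link graph (hypothetical stand-in
// for a live crawl). Stops at crawlDepth levels or maxRoutes routes, whichever comes first.
function discoverRoutes(baseUrl, linkGraph, { maxRoutes = 10, crawlDepth = 2 } = {}) {
  const origin = new URL(baseUrl).origin;
  // Resolve relative hrefs; keep same-origin URLs only; drop query, fragment,
  // and trailing slashes. Assumes every href is parseable by URL.
  const normalize = (href, from) => {
    const u = new URL(href, from);
    if (u.origin !== origin) return null;
    return u.origin + (u.pathname.replace(/\/+$/, "") || "/");
  };

  const start = normalize(baseUrl, baseUrl);
  const seen = new Set([start]);
  let frontier = [start];
  for (let depth = 0; depth < crawlDepth && seen.size < maxRoutes; depth++) {
    const next = [];
    for (const page of frontier) {
      for (const href of linkGraph[page] || []) {
        const route = normalize(href, page);
        if (!route || seen.has(route)) continue; // off-origin or duplicate
        if (seen.size >= maxRoutes) break;
        seen.add(route);
        next.push(route);
      }
    }
    frontier = next;
  }
  return [...seen];
}

// Assumed sample site, for illustration only.
const graph = {
  "https://example.com/": ["/about", "/blog/", "https://other.test/x", "/about#team"],
  "https://example.com/about": ["/contact"],
  "https://example.com/blog": ["/blog/post-1"],
  "https://example.com/contact": ["/deep"],
};
const routes = discoverRoutes("https://example.com", graph, { maxRoutes: 10, crawlDepth: 2 });
const limited = discoverRoutes("https://example.com", graph, { maxRoutes: 3 });
console.log(routes, limited);
```

With `crawlDepth: 2`, `/deep` is never reached because it sits three links away from the start page.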
85
+
86
+ ### axe-core
87
+
88
+ **Dependency**: `@axe-core/playwright`
89
+
90
+ The primary engine. Injects axe-core into the live page via Playwright and runs WCAG 2.2 A/AA tag checks. Covers the majority of automatable WCAG success criteria (80+ rules).
91
+
92
+ - Configurable via `--axe-tags` (default: `wcag2a,wcag2aa,wcag21a,wcag21aa,wcag22a,wcag22aa`)
93
+ - Supports `--only-rule` for focused single-rule audits
94
+ - Supports `--exclude-selectors` to skip specific elements
95
+
96
+ ### CDP checks
97
+
98
+ **Dependency**: Playwright's built-in CDP session (`page.context().newCDPSession()`)
99
+
100
+ Queries the browser's full accessibility tree via Chrome DevTools Protocol. Because this check reads the computed accessibility tree rather than the DOM, it can catch issues axe may miss:
101
+
102
+ - **Missing accessible names** — interactive elements (`button`, `link`, `textbox`, `combobox`, etc.) with empty names in the accessibility tree
103
+ - **aria-hidden on focusable elements** — elements that are focusable but hidden from assistive technology
104
+
105
+ CDP findings use axe-compatible violation format with `source: "cdp"` for downstream processing.
106
+
107
+ ### pa11y
108
+
109
+ **Dependency**: `pa11y` (which uses Puppeteer + Chrome internally)
110
+
111
+ Runs Squiz's HTML CodeSniffer against each page URL. Catches WCAG violations that axe and CDP may miss:
112
+
113
+ - Heading hierarchy issues
114
+ - Link purpose violations
115
+ - Form label associations
116
+ - Additional WCAG2AA/WCAG2AAA checks from HTML CodeSniffer's rule set
117
+
118
+ pa11y requires a separate Chrome installation (`npx puppeteer browsers install chrome`), distinct from Playwright's Chromium. If Chrome is missing, pa11y fails without aborting the run (a non-fatal `pa11y checks failed` message is logged) and the scan continues with axe + CDP only.
119
+
120
+ pa11y findings use axe-compatible violation format with `source: "pa11y"` for downstream processing.
121
+
122
+ ### Merge and deduplication
123
+
124
+ After all three engines complete, `mergeViolations()` combines findings and removes cross-engine duplicates:
125
+
126
+ 1. **axe findings** are added first as the baseline
127
+ 2. **CDP findings** are checked against axe equivalents (e.g. `cdp-missing-accessible-name` maps to `button-name`, `link-name`, `input-name`, `aria-command-name`). Only truly new findings are added.
128
+ 3. **pa11y findings** are checked against existing selectors. If the same element is already flagged by axe or CDP, the pa11y finding is dropped.
129
+
130
+ The merged violations are written to `a11y-scan-results.json` per route.
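The three merge steps can be sketched roughly as below. This is not the package's `mergeViolations()`; the name `mergeViolationsSketch` is hypothetical, and two details are assumptions: that each finding carries axe-style `nodes[].target` selector arrays, and that a CDP finding counts as a duplicate only when an equivalent axe rule flags the same selector.

```javascript
// Hypothetical sketch of cross-engine merge and dedup.
function mergeViolationsSketch(axe, cdp, pa11y) {
  const selectorsOf = (v) => (v.nodes || []).flatMap((n) => n.target || []);

  // 1. axe findings form the baseline.
  const merged = [...axe];

  // 2. CDP: skip a finding when an equivalent axe rule already flags the same selector.
  for (const v of cdp) {
    const equivalents = new Set(v.axeEquivalents || []);
    const targets = selectorsOf(v);
    const duplicate = axe.some(
      (a) => equivalents.has(a.id) && selectorsOf(a).some((s) => targets.includes(s)));
    if (!duplicate) merged.push(v);
  }

  // 3. pa11y: skip a finding when its element is already flagged by axe or CDP.
  const seen = new Set(merged.flatMap(selectorsOf));
  for (const v of pa11y) {
    if (!selectorsOf(v).some((s) => seen.has(s))) merged.push(v);
  }
  return merged;
}

// Assumed sample findings, for illustration only.
const axeFindings = [
  { id: "button-name", nodes: [{ target: ["#submit"] }] },
  { id: "image-alt", nodes: [{ target: ["img.hero"] }] },
];
const cdpSample = [
  { id: "cdp-missing-accessible-name", source: "cdp",
    axeEquivalents: ["button-name", "link-name", "input-name", "aria-command-name"],
    nodes: [{ target: ["#submit"] }] },
  { id: "cdp-aria-hidden-focusable", source: "cdp",
    axeEquivalents: ["aria-hidden-focus"], nodes: [{ target: ["a.skip"] }] },
];
const pa11ySample = [
  { id: "WCAG2AA.Principle1.Guideline1_1.1_1_1.H37", source: "pa11y",
    nodes: [{ target: ["img.hero"] }] },
  { id: "WCAG2AA.Principle1.Guideline1_3.1_3_1.H42", source: "pa11y",
    nodes: [{ target: ["h3.title"] }] },
];
const mergedFindings = mergeViolationsSketch(axeFindings, cdpSample, pa11ySample);
console.log(mergedFindings.map((v) => v.id));
```

Processing the engines in a fixed order keeps the axe rule ID as the canonical one whenever two engines report the same problem.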
131
+
132
+ ### Progress tracking
133
+
134
+ The scanner writes `progress.json` in real-time as each engine runs. This file is used by integrations (like `a11y-scanner`) for live progress UI:
135
+
136
+ ```json
137
+ {
138
+ "steps": {
139
+ "page": { "status": "done", "updatedAt": "..." },
140
+ "axe": { "status": "done", "updatedAt": "...", "found": 8 },
141
+ "cdp": { "status": "done", "updatedAt": "...", "found": 3 },
142
+ "pa11y": { "status": "done", "updatedAt": "...", "found": 2 },
143
+ "merge": { "status": "done", "updatedAt": "...", "axe": 8, "cdp": 3, "pa11y": 2, "merged": 11 }
144
+ },
145
+ "currentStep": "merge"
146
+ }
147
+ ```
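A consumer of this file might reduce it to a short status summary. The sketch below works on an already-parsed object that mirrors the sample above; `summarizeProgress` and the `duplicatesRemoved` field are hypothetical names, not part of the package.

```javascript
// Hypothetical helper: summarizes a parsed progress.json object.
function summarizeProgress(progress) {
  const { steps, currentStep } = progress;
  const engines = ["axe", "cdp", "pa11y"];
  // Per-engine counts; an engine that has not reported yet counts as 0.
  const found = Object.fromEntries(engines.map((e) => [e, steps[e]?.found ?? 0]));
  const total = engines.reduce((sum, e) => sum + found[e], 0);
  const merged = steps.merge?.merged ?? null;
  return {
    currentStep,
    finished: steps.merge?.status === "done",
    found,
    // Findings dropped by cross-engine dedup; null until the merge step reports.
    duplicatesRemoved: merged === null ? null : total - merged,
  };
}

// Same values as the sample file above.
const sampleProgress = {
  steps: {
    page: { status: "done", updatedAt: "..." },
    axe: { status: "done", updatedAt: "...", found: 8 },
    cdp: { status: "done", updatedAt: "...", found: 3 },
    pa11y: { status: "done", updatedAt: "...", found: 2 },
    merge: { status: "done", updatedAt: "...", axe: 8, cdp: 3, pa11y: 2, merged: 11 },
  },
  currentStep: "merge",
};
const progressSummary = summarizeProgress(sampleProgress);
console.log(progressSummary);
```

For the sample values, the three engines report 13 findings in total and 11 remain after the merge, so two were removed as duplicates.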
148
+
149
+ ### Screenshots
150
+
151
+ After merging, element screenshots are captured for each violation. Non-visible elements (`<meta>`, `<link>`, `<script>`, etc.) are automatically skipped. Screenshots are stored in `.audit/screenshots/` and referenced by each violation's `screenshot_path` field.
152
+
153
+ ### Optional: Source scanner
154
+
155
+ **Script**: `scripts/engine/source-scanner.mjs` — runs when `--project-dir` is set and `--skip-patterns` is not.
156
+
157
+ Performs static analysis of source files for accessibility issues no runtime engine can detect (e.g. focus outline suppression, missing alt text in templates). Uses regex patterns from `assets/remediation/code-patterns.json` scoped to framework-specific file boundaries from `assets/remediation/source-boundaries.json`.
158
+
159
+ Findings are classified as `confirmed` (pattern unambiguously matches) or `potential` (requires human verification).
160
+
161
+ ## Stage 2: Analyzer
162
+
163
+ **Script**: `scripts/engine/analyzer.mjs`
164
+
165
+ Reads `a11y-scan-results.json` (which contains merged axe + CDP + pa11y results) and enriches each violation with:
166
+
167
+ - **Fix intelligence** from `assets/remediation/intelligence.json` — 106 axe-core rules with code snippets, MDN links, framework-specific notes, and WCAG criterion mapping. CDP and pa11y findings receive generic enrichment based on their rule structure.
168
+ - **Selector scoring** — picks the most stable selector from axe's `nodes` list. Priority: `#id` > `[data-*]` > `[aria-*]` > `[type=]`, with penalty for Tailwind utility classes.
169
+ - **Framework context** — `assets/discovery/stack-detection.json` fingerprints the DOM to detect framework and CMS. Per-finding `framework_notes` and `cms_notes` are filtered to the detected stack.
170
+ - **Guardrails** — `assets/remediation/guardrails.json` defines scope rules that prevent agents from touching backend code, third-party scripts, or minified files.
171
+ - **Compliance scoring** — `assets/reporting/compliance-config.json` weights findings by severity to produce a 0-100 score with grade thresholds.
172
+ - **Persona impact groups** — `assets/reporting/wcag-reference.json` maps findings to disability personas (visual, motor, cognitive, etc.).
173
+
174
+ **Output**: `a11y-findings.json` — enriched findings array with all intelligence fields.
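The selector-scoring priority listed above (`#id` > `[data-*]` > `[aria-*]` > `[type=]`, with a Tailwind penalty) could look something like the following sketch. The numeric weights, the `UTILITY_CLASS` pattern, and both function names are assumptions made for illustration; the analyzer's real scoring is not shown in this diff.

```javascript
// Rough pattern for Tailwind-style utility classes (assumed, far from exhaustive).
const UTILITY_CLASS =
  /^(?:p[xytrbl]?|m[xytrbl]?|w|h|gap|text|bg|rounded|border)-[\w-]+$|^(?:flex|grid|block|hidden)$/;

// Hypothetical scoring: higher means more stable. Weights are illustrative.
function scoreSelector(selector) {
  let score = 10; // baseline for tag- or class-only selectors
  if (/#[A-Za-z][\w-]*/.test(selector)) score = 100;       // #id
  else if (/\[data-[\w-]+/.test(selector)) score = 80;      // [data-*]
  else if (/\[aria-[\w-]+/.test(selector)) score = 60;      // [aria-*]
  else if (/\[type=/.test(selector)) score = 40;            // [type=]

  // Subtract a penalty for each class that looks like a utility class.
  const classes = selector.match(/\.[A-Za-z][\w-]*/g) || [];
  const utilityCount = classes.filter((c) => UTILITY_CLASS.test(c.slice(1))).length;
  return score - 5 * utilityCount;
}

// Picks the highest-scoring selector; assumes a non-empty list.
function pickStableSelector(selectors) {
  return selectors.reduce((best, s) => (scoreSelector(s) > scoreSelector(best) ? s : best));
}

const candidates = [
  "button.px-4.py-2.bg-blue-500",
  'button[type="submit"]',
  '[data-testid="checkout"]',
  "#pay-now",
];
const bestSelector = pickStableSelector(candidates);
const bestWithoutId = pickStableSelector(candidates.slice(0, 3));
console.log(bestSelector, bestWithoutId);
```

Utility classes are penalized because they describe styling rather than identity, so they tend to change whenever the design does.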
175
+
176
+ ## Stage 3: Report builders
177
+
178
+ All builders run in parallel when `--with-reports` is set. Each reads `a11y-findings.json` independently.
179
+
180
+ | Builder | Script | Output | Audience |
181
+ | :--- | :--- | :--- | :--- |
182
+ | Markdown | `reports/builders/md.mjs` | `remediation.md` | AI agents |
183
+ | HTML | `reports/builders/html.mjs` | `report.html` | Developers |
184
+ | PDF | `reports/builders/pdf.mjs` | `report.pdf` | Stakeholders |
185
+ | Checklist | `reports/builders/checklist.mjs` | `checklist.html` | QA / Developers |
186
+
187
+ The `remediation.md` builder always runs (even without `--with-reports`) since it is the primary output for AI agent consumption.
188
+
189
+ Renderers in `scripts/reports/renderers/` contain the actual rendering logic — builders are thin orchestrators that call renderers and write output files.
190
+
191
+ ## Assets and rule intelligence
192
+
193
+ Assets are static JSON files bundled with the package under `assets/`. They are read at runtime by the analyzer and report builders.
194
+
195
+ | Asset | Purpose |
196
+ | :--- | :--- |
197
+ | `reporting/compliance-config.json` | Score weights, grade thresholds, legal regulation list |
198
+ | `reporting/wcag-reference.json` | WCAG criterion map, persona config, persona-rule mapping |
199
+ | `reporting/manual-checks.json` | 41 manual checks for the WCAG checklist |
200
+ | `discovery/crawler-config.json` | BFS crawl defaults (timeouts, concurrency) |
201
+ | `discovery/stack-detection.json` | Framework/CMS DOM fingerprints |
202
+ | `remediation/intelligence.json` | Per-rule fix intelligence for 106 axe-core rules |
203
+ | `remediation/code-patterns.json` | Source code pattern definitions |
204
+ | `remediation/guardrails.json` | Agent fix scope guardrails |
205
+ | `remediation/axe-check-maps.json` | axe check-to-rule mapping |
206
+ | `remediation/source-boundaries.json` | Framework-specific source file locations |
207
+
208
+ ## Execution model and timeouts
209
+
210
+ `audit.mjs` spawns each stage as a child process via `node:child_process`. All child processes:
211
+
212
+ - Inherit the parent's environment
213
+ - Run with `cwd` set to the package root (`SKILL_ROOT`)
214
+ - Have a hard timeout of **15 minutes** (configurable via the `SCRIPT_TIMEOUT_MS` constant)
215
+
216
+ The orchestrator exits with code `1` if any stage fails. Separately, each page load is bounded by `--timeout-ms` (default: 30s).
217
+
218
+ If `node_modules/` is absent on first run, the orchestrator automatically installs dependencies via `pnpm install` (falls back to `npm install`).