@diegovelasquezweb/a11y-engine 0.8.1 → 0.8.3

Files changed (11)

package/CHANGELOG.md +70 -0
package/README.md +41 -4
package/docs/api-reference.md +3 -1
package/docs/architecture.md +18 -6
package/docs/cli-handbook.md +76 -2
package/docs/engine-manifest.md +4 -1
package/docs/intelligence.md +61 -7
package/docs/outputs.md +60 -1
package/package.json +1 -1
package/src/ai/claude.mjs +91 -23
package/src/enrichment/analyzer.mjs +4 -2

package/CHANGELOG.md CHANGED

@@ -5,6 +5,76 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+## [0.8.2] — 2026-03-16
+### Changed
+- **Smarter AI source file selection** — `fetchSourceFilesForFindings` now scores candidate files by how many terms extracted from the finding's selector, class names, IDs, and title match the file path. Files most relevant to the specific failing element are fetched first instead of picking the first 3 files by extension.
+- Extracted `extractSearchTermsFromFinding()` and `scoreFilePath()` helpers for reusable relevance scoring logic.
+---
+## [0.8.1] — 2026-03-16
+### Added
+- **Custom AI system prompt** — `enrichWithAI()` now accepts `options.systemPrompt` to override the default Claude system prompt at runtime.
+- `enrich.mjs` reads `AI_SYSTEM_PROMPT` env var and passes it to `enrichWithAI()` — enabling per-scan prompt customization without code changes.
+- `audit.mjs` forwards `AI_SYSTEM_PROMPT` env var to the `enrich.mjs` child process.
+---
+## [0.8.0] — 2026-03-16
+### Changed
+- **AI enrichment no longer overwrites original fix** — `enrich.mjs` now preserves the original `fix_description`/`fix_code` from the engine and stores Claude's output in separate fields: `ai_fix_description`, `ai_fix_code`, `ai_fix_code_lang`. Findings improved by AI are flagged with `aiEnhanced: true`.
+- **AI system prompt rewritten** — Claude is now explicitly instructed to go beyond the generic fix: explain why the issue matters for real users, what specifically to look for in the codebase, and provide a production-quality code example different from the existing one.
+- Default AI model updated to `claude-haiku-4-5-20251001`.
+---
+## [0.7.9] — 2026-03-16
+### Added
+- **AI enrichment CLI step** — `audit.mjs` now runs `src/ai/enrich.mjs` after the analyzer step when `ANTHROPIC_API_KEY` env var is present. Non-fatal: if AI fails, the pipeline continues with unenriched findings.
+- `src/ai/enrich.mjs` — new CLI script that reads `a11y-findings.json`, calls `enrichWithAI()`, and writes enriched findings back. Reads `A11Y_REPO_URL` and `GH_TOKEN` env vars for repo-aware enrichment.
+- `src/ai/claude.mjs` — Claude AI enrichment module. Enriches Critical and Serious findings with context-aware fix descriptions and code snippets. Uses `claude-haiku-4-5-20251001` by default. Fetches source files from the GitHub repo when `repoUrl` is available.
+---
+## [0.7.8] — 2026-03-16
+### Fixed
+- **pa11y ruleId normalization** — pa11y violation IDs (e.g. `WCAG2AAA.Principle1.Guideline1_4.1_4_6.G17`) are now normalized to a short, readable form (e.g. `pa11y-g17`) by taking only the last segment of the dotted code. Previously the full dotted path was used, producing unreadable badges like `Pa11y Wcag2aaa Principle1 Guideline1 4 1 4 6 G17`.
+---
+## [0.7.7] — 2026-03-15
+### Added
+- **`--repo-url` and `--github-token` CLI flags** — `audit.mjs` now accepts `--repo-url <github-url>` and `--github-token <token>`. When a repo URL is provided, the engine fetches `package.json` via the GitHub API to detect the project framework before running the analyzer, and passes the detected framework to both the analyzer and the source pattern scanner. No `git clone` required.
+- `source-scanner.mjs` CLI now accepts `--repo-url` and `--github-token`. When `--repo-url` is provided (without `--project-dir`), it runs `scanPatternRemote()` against the GitHub API instead of the local filesystem.
+- `detectProjectContext()` is now called in `audit.mjs` when a remote repo is provided, enabling framework-aware fix suggestions without a local clone.
+### Changed
+- `source-scanner.mjs`: `--project-dir` is no longer required when `--repo-url` is provided. `main()` is now async to support remote API calls.
+- `audit.mjs`: pattern scanning is now triggered when either `--project-dir` or `--repo-url` is provided.
+---
+## [0.7.6] — 2026-03-15
+### Changed
+- HTML report renderer: updated Tailwind class syntax (`flex-shrink-0` → `shrink-0`, `bg-gradient-to-br` → `bg-linear-to-br`, `max-h-[360px]` → `max-h-90`).
+---
 ## [0.4.2] — 2026-03-15
 ### Fixed

package/README.md CHANGED

@@ -50,7 +50,7 @@ import {
 #### runAudit
-Runs the full scan pipeline: route discovery, scan, merge, analyze, and optional AI enrichment. Returns a payload ready for `getFindings`.
+Runs the full scan pipeline: route discovery, scan, merge, analyze, AI enrichment (when configured), and optional source pattern scanning. Returns a payload ready for `getFindings`.
 ```ts
 const payload = await runAudit({
@@ -58,6 +58,14 @@ const payload = await runAudit({
   maxRoutes: 5,
   axeTags: ["wcag2a", "wcag2aa", "best-practice"],
   engines: { axe: true, cdp: true, pa11y: true },
+  repoUrl: "https://github.com/owner/repo", // optional — enables source pattern scan and stack detection from package.json
+  githubToken: process.env.GH_TOKEN,        // optional — for private repos and higher GitHub API rate limits
+  ai: {
+    enabled: true,
+    apiKey: process.env.ANTHROPIC_API_KEY,
+    githubToken: process.env.GH_TOKEN,
+    systemPrompt: "Custom prompt...",        // optional — overrides default Claude system prompt
+  },
   onProgress: (step, status, extra) => console.log(`${step}: ${status}`, extra),
 });
 ```
@@ -125,10 +133,39 @@ These functions expose scanner help content, persona explanations, conformance l
 See [API Reference](docs/api-reference.md) for exact options and return types.
-## Optional CLI
+## CLI
-If you need terminal execution, the package also exposes `a11y-audit`.
-See the [CLI Handbook](docs/cli-handbook.md) for command flags and examples.
+The package exposes an `a11y-audit` binary for terminal execution.
+```bash
+# Basic scan
+pnpm exec a11y-audit --base-url https://example.com
+# With source code pattern scanning via GitHub API (no clone)
+pnpm exec a11y-audit --base-url https://example.com \
+  --repo-url https://github.com/owner/repo \
+  --github-token ghp_...
+# With AI enrichment (set ANTHROPIC_API_KEY env var)
+ANTHROPIC_API_KEY=sk-ant-... pnpm exec a11y-audit --base-url https://example.com
+# With custom AI system prompt
+AI_SYSTEM_PROMPT="You are..." ANTHROPIC_API_KEY=sk-ant-... pnpm exec a11y-audit --base-url https://example.com
+```
+See the [CLI Handbook](docs/cli-handbook.md) for all flags and examples.
+## AI enrichment
+When `ANTHROPIC_API_KEY` is set, the engine runs a post-scan enrichment step that sends Critical and Serious findings to Claude. Claude generates:
+- A specific fix description referencing the actual selector, colors, and violation data
+- A production-quality code snippet in the correct framework syntax
+- Context-aware suggestions when repo source files are available
+AI output is stored in separate fields (`ai_fix_description`, `ai_fix_code`) — the original engine fixes are always preserved. Findings improved by AI are flagged with `aiEnhanced: true`.
+The system prompt is fully customizable via `options.ai.systemPrompt` (programmatic API) or the `AI_SYSTEM_PROMPT` env var (CLI).
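
Reviewer sketch (not part of the package): one way a caller might assemble the optional `runAudit` fields shown in the README hunk from environment variables. The helper name `auditOptionsFromEnv` is hypothetical. The option names (`repoUrl`, `githubToken`, `ai.enabled`, `ai.apiKey`, `ai.githubToken`, `ai.systemPrompt`) are taken from the example above. Whether the programmatic API reads these variables by itself is not stated in the diff, so this sketch assumes it does not.

```javascript
// Hypothetical caller-side helper: maps env vars onto the optional
// runAudit fields documented in the README hunk above.
function auditOptionsFromEnv(env, repoUrl) {
  const options = {};
  if (repoUrl) {
    options.repoUrl = repoUrl;
    if (env.GH_TOKEN) options.githubToken = env.GH_TOKEN;
  }
  if (env.ANTHROPIC_API_KEY) {
    // AI enrichment is only configured when an API key is available.
    options.ai = { enabled: true, apiKey: env.ANTHROPIC_API_KEY };
    if (env.GH_TOKEN) options.ai.githubToken = env.GH_TOKEN;
    if (env.AI_SYSTEM_PROMPT) options.ai.systemPrompt = env.AI_SYSTEM_PROMPT;
  }
  return options;
}

console.log(auditOptionsFromEnv({ ANTHROPIC_API_KEY: "sk-ant-example" }, null));
```

The returned object could be spread into the `runAudit` call next to the required options, which keeps secrets out of source files.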
 ## Documentation

package/docs/api-reference.md CHANGED

@@ -35,7 +35,7 @@ Runs route discovery, runtime scan, merge, analyzer enrichment, and optional AI
 | `skipPatterns` | `boolean` |
 | `screenshotsDir` | `string` |
 | `engines` | `{ axe?: boolean; cdp?: boolean; pa11y?: boolean }` |
-| `ai` | `{ enabled?: boolean; apiKey?: string; githubToken?: string; model?: string }` |
+| `ai` | `{ enabled?: boolean; apiKey?: string; githubToken?: string; model?: string; systemPrompt?: string }` — `systemPrompt` overrides the default Claude prompt when set |
 | `onProgress` | `(step: string, status: string, extra?: Record<string, unknown>) => void` |
 Progress steps emitted via `onProgress`:
@@ -54,6 +54,8 @@ Progress steps emitted via `onProgress`:
 Returns: `Promise<ScanPayload>`
+> **`ai_enriched_findings` fast path**: When AI enrichment runs, the engine appends `ai_enriched_findings` to the payload. `getFindings()` checks for this field first — if present, it returns the already-enriched findings directly without re-normalizing the raw `findings` array.
 ### `getFindings(input, options?)`
 Normalizes and enriches findings and returns sorted enriched findings.
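
Reviewer sketch (not part of the package): the `ai_enriched_findings` fast path described in the note above could look roughly like this. `normalizeFindings` is a hypothetical stand-in for the engine's real normalization, which this diff does not show, and `getFindingsSketch` is not the package's `getFindings`.

```javascript
// Hypothetical stand-in for the engine's normalization step.
function normalizeFindings(rawFindings) {
  return rawFindings.map((f) => ({ ...f, normalized: true }));
}

// Prefer already-enriched findings when the payload carries them;
// otherwise fall back to normalizing the raw findings array.
function getFindingsSketch(payload) {
  if (Array.isArray(payload.ai_enriched_findings)) {
    return payload.ai_enriched_findings;
  }
  return normalizeFindings(payload.findings ?? []);
}

const enriched = getFindingsSketch({
  findings: [{ id: "a" }],
  ai_enriched_findings: [{ id: "a", aiEnhanced: true }],
});
const plain = getFindingsSketch({ findings: [{ id: "b" }] });
console.log(enriched[0].aiEnhanced, plain[0].normalized);
```

One consequence worth checking in review: with a fast path like this, anything `getFindings` normally derives during normalization must already be present on the enriched array.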

package/docs/architecture.md CHANGED

@@ -33,12 +33,21 @@ flowchart TD
     M --> R[a11y-scan-results.json]
     R --> AN[Analyzer]
-    AN --> F[a11y-findings.json]
-    F --> MD[remediation.md]
-    F --> HTML[report.html]
-    F --> PDF[report.pdf]
-    F --> CHK[checklist.html]
+    REPO[GitHub Repo] -->|fetchPackageJson| AN
+    REPO -->|scanPatternRemote| PAT[a11y-pattern-findings.json]
+    AN --> F[a11y-findings.json]
+    F --> AI{ANTHROPIC_API_KEY?}
+    AI -->|yes| CL[Claude AI enrichment]
+    AI -->|no| SKIP[skip]
+    CL --> F2[a11y-findings.json enriched]
+    SKIP --> F2
+    F2 --> MD[remediation.md]
+    F2 --> HTML[report.html]
+    F2 --> PDF[report.pdf]
+    F2 --> CHK[checklist.html]
 ```
 ## Execution Modes
@@ -75,7 +84,10 @@ flowchart LR
 | :--- | :--- |
 | `src/pipeline/dom-scanner.mjs` | Route discovery, engine execution (axe/CDP/pa11y), merge/dedup, progress updates, screenshots |
 | `src/enrichment/analyzer.mjs` | Rule enrichment, selector strategy, ownership hints, recommendations, scoring metadata |
-| `src/source-patterns/source-scanner.mjs` | Static source pattern detection for issues runtime engines cannot see |
+| `src/ai/enrich.mjs` | CLI subprocess that runs AI enrichment after the analyzer. Reads `ANTHROPIC_API_KEY` and `AI_SYSTEM_PROMPT` env vars. Non-fatal. |
+| `src/ai/claude.mjs` | Anthropic API client. Sends Critical/Serious findings to Claude and parses improved fix suggestions. Supports custom system prompt and repo source file context. |
+| `src/core/github-api.mjs` | GitHub API client. Provides `fetchPackageJson`, `fetchRepoFile`, `listRepoFiles`, and `parseRepoUrl`. Used for remote repo scanning and AI source file fetching without cloning. |
+| `src/source-patterns/source-scanner.mjs` | Source code pattern scanner. Works against local `--project-dir` or remote `--repo-url` via the GitHub API. |
 | `src/reports/*.mjs` | Report builders for markdown/html/pdf/checklist |
 | `src/reports/renderers/*.mjs` | Shared rendering and normalization helpers |
 | `src/core/asset-loader.mjs` | Centralized access to bundled assets |

package/docs/cli-handbook.md CHANGED

@@ -10,9 +10,12 @@
 - [Prerequisites](#prerequisites)
 - [Flag groups](#flag-groups)
   - [Targeting & scope](#targeting--scope)
+  - [Repository & remote scanning](#repository--remote-scanning)
+  - [AI enrichment](#ai-enrichment)
   - [Audit intelligence](#audit-intelligence)
   - [Execution & emulation](#execution--emulation)
   - [Output generation](#output-generation)
+- [Environment variables](#environment-variables)
 - [Examples](#examples)
 - [Exit codes](#exit-codes)
@@ -62,7 +65,7 @@ Controls what gets scanned.
 | `--max-routes` | `<num>` | `10` | Maximum unique same-origin paths to discover and scan. |
 | `--crawl-depth` | `<num>` | `2` | How deep to follow links during BFS discovery (1-3). Has no effect when `--routes` is set. |
 | `--routes` | `<csv>` | — | Explicit paths to scan (e.g. `/,/about,/contact`). Overrides auto-discovery entirely. |
-| `--project-dir` | `<path>` | — | Path to the audited project source. Enables the source code pattern scanner and framework auto-detection from package.json. |
+| `--project-dir` | `<path>` | — | Path to the audited project source on disk. Enables source code pattern scanning and framework auto-detection from the local `package.json`. |
 **Route discovery logic**:
 1. If the target has a `sitemap.xml`, all listed URLs are used (up to `--max-routes`).
@@ -71,6 +74,41 @@ Controls what gets scanned.
 ---
+### Repository & remote scanning
+Enables source code analysis via the GitHub API — no `git clone` required.
+| Flag | Argument | Default | Description |
+| :--- | :--- | :--- | :--- |
+| `--repo-url` | `<url>` | — | GitHub repository URL (e.g. `https://github.com/owner/repo`). Fetches `package.json` for framework detection and runs source code pattern scanning against the repo via the GitHub API. Mutually exclusive with `--project-dir` for remote usage. |
+| `--github-token` | `<token>` | — | GitHub personal access token. Increases the GitHub API rate limit from 60 to 5,000 req/hour. Required for private repositories. Falls back to `GH_TOKEN` env var if not provided. |
+When `--repo-url` is provided:
+1. The engine fetches `package.json` via `raw.githubusercontent.com` to detect the project framework.
+2. Source code patterns are run against the repo file tree using the GitHub Trees API and Contents API, with no local filesystem access.
+3. The detected framework is passed to the analyzer for framework-specific fix notes.
+---
+### AI enrichment
+Controls Claude-powered fix suggestion enrichment. Requires `ANTHROPIC_API_KEY` to be set.
+| Flag | Argument | Default | Description |
+| :--- | :--- | :--- | :--- |
+| *(no flag)* | — | — | AI enrichment is activated automatically when `ANTHROPIC_API_KEY` env var is present. There is no `--ai-enabled` flag — set or unset the env var to control it. |
+AI enrichment runs after the analyzer step and enriches Critical and Serious findings (up to 20 per scan) with:
+- A specific fix description referencing the actual selector, colors, and violation data
+- A production-quality code snippet in the correct framework syntax
+- Context-aware suggestions when repo source files are available via `--repo-url`
+Original engine fixes are always preserved. AI output is stored in separate fields (`ai_fix_description`, `ai_fix_code`). Enriched findings are flagged with `aiEnhanced: true`.
+The system prompt is customizable via `AI_SYSTEM_PROMPT` env var.
+---
 ### Audit intelligence
 Controls how findings are interpreted and filtered.
@@ -117,6 +155,16 @@ Controls what artifacts are written.
 ---
+## Environment variables
+| Variable | Description |
+| :--- | :--- |
+| `ANTHROPIC_API_KEY` | Enables Claude AI enrichment. Set to a valid Anthropic API key. When absent, AI enrichment is silently skipped. |
+| `AI_SYSTEM_PROMPT` | Custom system prompt for Claude. Overrides the default prompt for the entire scan. Useful for domain-specific fix guidance or custom output formats. |
+| `GH_TOKEN` | GitHub personal access token. Used by the AI enrichment step when fetching source files from the repo. Equivalent to `--github-token` but read from the environment. |
+---
 ## Examples
 ### Minimal scan
@@ -135,7 +183,7 @@ a11y-audit \
   --output ./audit/report.html
 ```
-### Include source code intelligence
+### Include source code intelligence (local)
 ```bash
 a11y-audit \
@@ -145,6 +193,32 @@ a11y-audit \
   --output ./audit/report.html
 ```
+### Scan with remote GitHub repository (no clone)
+```bash
+a11y-audit \
+  --base-url https://example.com \
+  --repo-url https://github.com/owner/repo \
+  --github-token ghp_...
+```
+### Scan with AI enrichment
+```bash
+ANTHROPIC_API_KEY=sk-ant-... a11y-audit \
+  --base-url https://example.com \
+  --repo-url https://github.com/owner/repo \
+  --github-token ghp_...
+```
+### Scan with custom AI system prompt
+```bash
+AI_SYSTEM_PROMPT="You are an expert in Vue.js accessibility. Focus on component-level fixes." \
+ANTHROPIC_API_KEY=sk-ant-... \
+a11y-audit --base-url https://example.com --repo-url https://github.com/owner/repo
+```
 ### Focused re-audit — single rule, single route
 ```bash

package/docs/engine-manifest.md CHANGED

@@ -16,9 +16,12 @@ This document is the current technical inventory of the engine package.
 | `src/core/utils.mjs` | Logging, JSON I/O, shared helpers |
 | `src/core/asset-loader.mjs` | Centralized asset map and loader |
 | `src/core/toolchain.mjs` | Environment/toolchain checks |
+| `src/core/github-api.mjs` | GitHub API client — `fetchPackageJson`, `fetchRepoFile`, `listRepoFiles`, `parseRepoUrl`. Used for remote repo scanning and AI source file fetching. |
 | `src/pipeline/dom-scanner.mjs` | Runtime scan stage (axe/CDP/pa11y + merge) |
 | `src/enrichment/analyzer.mjs` | Finding enrichment and metadata synthesis |
-| `src/source-patterns/source-scanner.mjs` | Static source-pattern scanner |
+| `src/ai/claude.mjs` | Claude AI client — calls the Anthropic API to enrich findings with context-aware fix suggestions. Accepts custom system prompt via `options.systemPrompt`. |
+| `src/ai/enrich.mjs` | CLI AI enrichment subprocess — reads `a11y-findings.json`, calls `enrichWithAI()`, writes enriched findings back. Activated by `ANTHROPIC_API_KEY` env var. |
+| `src/source-patterns/source-scanner.mjs` | Source code pattern scanner — works with local `--project-dir` or remote `--repo-url` via GitHub API |
 | `src/reports/html.mjs` | HTML report builder |
 | `src/reports/pdf.mjs` | PDF report builder |
 | `src/reports/md.mjs` | Markdown remediation builder |

package/docs/intelligence.md CHANGED

@@ -152,8 +152,22 @@ A single finding can match multiple personas. The persona configuration (`person
 The compliance score is computed from severity totals using weights defined in `assets/reporting/compliance-config.mjs`:
 1. **Severity totals** — counts findings by `Critical`, `Serious`, `Moderate`, `Minor` (excluding AAA and Best Practice findings).
-2. **Score** — starts at 100, deducts weighted points per finding.
-3. **Label** — maps score ranges to grades (`Excellent`, `Good Compliance`, `Needs Improvement`, `Poor`, `Critical`).
+2. **Score** — starts at 100, deducts weighted points per finding:
+   - Critical: −15 per finding
+   - Serious: −5 per finding
+   - Moderate: −2 per finding
+   - Minor: −0.5 per finding
+   - Score is clamped to 0–100 and rounded to nearest integer.
+3. **Label** — maps score ranges to grades:
+   | Score | Label |
+   | :--- | :--- |
+   | 90 – 100 | `Excellent` |
+   | 75 – 89 | `Good` |
+   | 55 – 74 | `Fair` |
+   | 35 – 54 | `Poor` |
+   | 0 – 34 | `Critical` |
 4. **WCAG status** — `Pass` (no findings), `Conditional Pass` (only Moderate/Minor), or `Fail` (any Critical/Serious).
 The `overallAssessment` in metadata follows the same logic for the formal compliance verdict.
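
Reviewer sketch (not part of the package): the scoring rules in this hunk can be restated as a few small functions. The weights, label ranges, and WCAG status rules are copied from the documentation above. The function names and the shape of the `totals` object are assumptions, and the real values live in `assets/reporting/compliance-config.mjs`, which this diff does not show.

```javascript
// Weights and label thresholds as documented in the hunk above.
const WEIGHTS = { Critical: 15, Serious: 5, Moderate: 2, Minor: 0.5 };
const LABELS = [
  [90, "Excellent"],
  [75, "Good"],
  [55, "Fair"],
  [35, "Poor"],
  [0, "Critical"],
];

// Start at 100, deduct weighted points, clamp to 0-100, round.
function complianceScore(totals) {
  let deduction = 0;
  for (const [severity, weight] of Object.entries(WEIGHTS)) {
    deduction += (totals[severity] ?? 0) * weight;
  }
  return Math.round(Math.min(100, Math.max(0, 100 - deduction)));
}

// First threshold the score reaches wins (list is sorted high to low).
function complianceLabel(score) {
  return LABELS.find(([min]) => score >= min)[1];
}

// Pass with no findings, Conditional Pass with only Moderate/Minor, else Fail.
function wcagStatus(totals) {
  const count = (severity) => totals[severity] ?? 0;
  if (count("Critical") + count("Serious") > 0) return "Fail";
  if (count("Moderate") + count("Minor") > 0) return "Conditional Pass";
  return "Pass";
}

const totals = { Critical: 1, Serious: 2, Moderate: 3, Minor: 1 };
const score = complianceScore(totals); // 100 - 15 - 10 - 6 - 0.5 = 68.5, rounds to 69
console.log(score, complianceLabel(score), wcagStatus(totals));
```

Note that the score and the WCAG status are independent: a single Serious finding still scores 95 (`Excellent`) under these weights while the status is `Fail`.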
@@ -187,14 +201,54 @@ The source scanner (`src/source-patterns/source-scanner.mjs`) detects accessibil
 5. Output includes a summary with `total`, `confirmed`, and `potential` counts.
+### Remote scanning via GitHub API
+When `--repo-url` (CLI) or `options.repoUrl` (programmatic API) is provided instead of `--project-dir`, the source scanner uses the GitHub API — no `git clone` required:
+1. `listRepoFiles()` fetches the repo file tree using the GitHub Trees API. Falls back to the Contents API for truncated responses (large repos).
+2. Files matching each pattern's `globs` are fetched individually via `raw.githubusercontent.com`.
+3. The same regex and context rejection logic runs against the fetched content.
+4. Results are identical to local scanning.
+A GitHub token (`--github-token` or `GH_TOKEN` env var) increases the API rate limit from 60 to 5,000 req/hour and enables private repo access.
 ### Integration with the audit pipeline
-When `runAudit` is called with `projectDir` and without `skipPatterns`:
+When `runAudit` is called with `projectDir` or `repoUrl` and without `skipPatterns`:
+1. The engine fetches `package.json` from the repo (remote) or reads it from disk (local) to detect the framework before the analyzer runs.
+2. The analyzer runs with the detected framework context.
+3. Source patterns run after enrichment.
+4. Pattern findings are attached to the payload as `patternFindings` with their own `generated_at`, `project_dir`, `findings`, and `summary`.
+5. The remediation guide (`getRemediationGuide`) renders pattern findings in a dedicated section.
+### pa11y ruleId normalization
+pa11y reports violations using dotted WCAG criterion codes (e.g. `WCAG2AA.Principle1.Guideline1_4.1_4_3.G18.Fail`). The engine normalizes these in two places:
+1. **Equivalence mapping** (`assets/scanning/pa11y-config.mjs`, `equivalenceMap`) — known pa11y codes are mapped to their axe-core equivalent rule ID (e.g. `Principle1.Guideline1_4.1_4_3.G145` → `color-contrast`). These findings are merged and deduplicated with axe findings.
+2. **Fallback normalization** (`src/pipeline/dom-scanner.mjs`) — pa11y codes without an axe equivalent are shortened to their last segment (e.g. `WCAG2AAA.Principle1.Guideline1_4.1_4_6.G17` → `pa11y-g17`). This produces a readable rule ID without the full dotted path.
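
Reviewer sketch (not part of the package): the fallback normalization in step 2 might be approximated as below. The actual code in `src/pipeline/dom-scanner.mjs` is not part of this diff, so the function name and the exact rules (keep the last dot-separated segment, lowercase it, prefix `pa11y-`) are assumptions based on the changelog wording.

```javascript
// Assumption: keep only the last dot-separated segment of the pa11y code,
// lowercase it, and prefix it with "pa11y-".
function normalizePa11yRuleId(code) {
  const segments = String(code).split(".").filter(Boolean);
  const last = segments[segments.length - 1] ?? "unknown";
  return `pa11y-${last.toLowerCase()}`;
}

console.log(normalizePa11yRuleId("WCAG2AAA.Principle1.Guideline1_4.1_4_6.G17"));
```

Under this literal "last segment" rule, a code that ends in `.Fail` (such as the `...G18.Fail` example above) would become `pa11y-fail`, so the real implementation may treat such suffixes differently.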
+## AI Enrichment
+After the analyzer step, the engine optionally runs Claude-powered enrichment on Critical and Serious findings (up to 20 per scan).
+### How it works
+1. `src/ai/enrich.mjs` reads `a11y-findings.json`, identifies Critical and Serious findings, and sends them to `enrichWithAI()`.
+2. `src/ai/claude.mjs` calls the Anthropic API with a system prompt instructing Claude to generate specific, production-quality fix suggestions using the actual violation data (selector, colors, ratio, etc.).
+3. When a repo URL is available (`A11Y_REPO_URL` env var), Claude also receives relevant source files fetched via the GitHub API. File selection is scored by how well each file path matches terms extracted from the finding's selector and title.
+4. Claude returns a JSON array of improvements. Each improvement contains a `fixDescription` and `fixCode` specific to the finding's context.
+5. The engine stores Claude's output in separate fields (`ai_fix_description`, `ai_fix_code`, `ai_fix_code_lang`) — the original engine fixes are preserved unchanged. Improved findings are flagged with `aiEnhanced: true`.
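
Reviewer sketch (not part of the package): steps 1, 4, and 5 above, reduced to pure functions. The `severity` field name, matching improvements to findings by `id`, and the `fixCodeLang` key are assumptions made for illustration. Only `fixDescription`, `fixCode`, the `ai_*` fields, `aiEnhanced`, and the limit of 20 come from the text.

```javascript
// Step 1: only Critical and Serious findings are sent, capped per scan.
function selectForAi(findings, limit = 20) {
  return findings
    .filter((f) => f.severity === "Critical" || f.severity === "Serious")
    .slice(0, limit);
}

// Steps 4-5: store AI output in separate fields; never touch the originals.
function applyImprovements(findings, improvements) {
  const byId = new Map(improvements.map((imp) => [imp.id, imp]));
  return findings.map((f) => {
    const imp = byId.get(f.id);
    if (!imp) return f;
    return {
      ...f,
      ai_fix_description: imp.fixDescription ?? null,
      ai_fix_code: imp.fixCode ?? null,
      ai_fix_code_lang: imp.fixCodeLang ?? null, // hypothetical key
      aiEnhanced: true,
    };
  });
}

const findings = [
  { id: "f1", severity: "Critical", fix_description: "Add a label." },
  { id: "f2", severity: "Minor", fix_description: "Tidy markup." },
];
const merged = applyImprovements(findings, [
  { id: "f1", fixDescription: "Label the #email input.", fixCode: "<label for=\"email\">Email</label>" },
]);
console.log(selectForAi(findings).length, merged[0].aiEnhanced, merged[0].fix_description);
```

Keeping the merge non-destructive is what lets reports show the engine fix and the AI fix side by side.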
+### Activation
+AI enrichment runs automatically when `ANTHROPIC_API_KEY` is present in the environment. It is non-fatal — if the API call fails, the pipeline continues with unenriched findings.
+### Custom system prompt
-1. The analyzer runs first to detect the framework.
-2. Source patterns run after enrichment.
-3. Pattern findings are attached to the payload as `patternFindings` with their own `generated_at`, `project_dir`, `findings`, and `summary`.
-4. The remediation guide (`getRemediationGuide`) renders pattern findings in a dedicated section.
+The default system prompt instructs Claude to go beyond the generic fix: explain why the issue matters for users, reference the specific selector and violation data, and provide a more complete code example than the engine's default. The prompt can be overridden per-scan via the `AI_SYSTEM_PROMPT` env var or `options.ai.systemPrompt` in the programmatic API.
 ## Assets Reference

package/docs/outputs.md CHANGED

@@ -10,6 +10,7 @@
 - [progress.json](#progressjson)
 - [a11y-scan-results.json](#a11y-scan-resultsjson)
 - [a11y-findings.json](#a11y-findingsjson)
+- [a11y-pattern-findings.json](#a11y-pattern-findingsjson)
 - [remediation.md](#remediationmd)
 - [report.html](#reporthtml)
 - [report.pdf](#reportpdf)
@@ -90,7 +91,7 @@ Merged results from all three engines (axe-core + CDP + pa11y) per route. Writte
 }
 ```
-Each violation in the `violations` array includes a `source` field indicating which engine produced it (`undefined` for axe-core, `"cdp"` for CDP checks, `"pa11y"` for pa11y).
+Each violation in the `violations` array includes a `source` field: `"cdp"` for CDP checks, `"pa11y"` for pa11y, and absent (field not set) for axe-core violations.
 This file is consumed by `analyzer.mjs` and also used by `--affected-only` to determine which routes to re-scan on subsequent runs.
@@ -176,6 +177,25 @@ The primary enriched data artifact. Written by `src/enrichment/analyzer.mjs`. Th
 | `verification_command_fallback` | `string\|null` | Fallback verify command |
 | `pages_affected` | `number\|null` | Number of pages with this violation |
 | `affected_urls` | `string[]\|null` | All URLs where this violation appears |
+| `aiEnhanced` | `boolean` | `true` when Claude improved the fix for this finding. Only present on AI-enriched findings. |
+| `ai_fix_description` | `string\|null` | Claude-generated fix description. More specific than `fix_description` — references the actual selector, colors, and violation data. Only present when `aiEnhanced` is `true`. |
+| `ai_fix_code` | `string\|null` | Claude-generated code snippet in the correct framework syntax. Separate from the engine's `fix_code`. Only present when `aiEnhanced` is `true`. |
+| `ai_fix_code_lang` | `string\|null` | Language of `ai_fix_code` (e.g. `jsx`, `tsx`, `vue`, `css`). Only present when `aiEnhanced` is `true`. |
+> **Note on `ownership_status`**: Values are `"primary"` (issue is in the project's source), `"outside_primary_source"` (issue is in a third-party component), or `"unknown"`. These are different from the pattern finding `status` field which uses `"confirmed"` and `"potential"`.
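
Reviewer sketch (not part of the package): because the `ai_*` fields in the table above are only present on AI-enriched findings, a report consumer would likely need a fallback when reading a fix. `pickFix` is a hypothetical helper, and the field names are the ones documented in this hunk.

```javascript
// Hypothetical consumer-side helper: use the AI fix when the finding is
// flagged and has AI text, otherwise the engine's original fix.
function pickFix(finding) {
  const useAi = finding.aiEnhanced === true && Boolean(finding.ai_fix_description);
  if (useAi) {
    return {
      source: "ai",
      description: finding.ai_fix_description,
      code: finding.ai_fix_code ?? null,
      lang: finding.ai_fix_code_lang ?? null,
    };
  }
  return {
    source: "engine",
    description: finding.fix_description ?? null,
    code: finding.fix_code ?? null,
    lang: null,
  };
}

console.log(pickFix({ fix_description: "Add alt text." }).source);
```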
+### Top-level payload keys (after AI enrichment)
+When AI enrichment runs, the engine appends `ai_enriched_findings` to the payload root. `getFindings()` uses this as a fast path — if present, it returns `ai_enriched_findings` directly without re-normalizing the raw `findings` array.
+```json
+{
+  "metadata": { ... },
+  "findings": [ ... ],
+  "ai_enriched_findings": [ ... ],
+  "incomplete_findings": [ ... ]
+}
+```
 ### `incomplete_findings`
@@ -183,6 +203,45 @@ Violations that axe-core flagged as "needs review" (not confirmed pass or fail).
 ---
+## a11y-pattern-findings.json
+Source code pattern scan results. Written by `src/source-patterns/source-scanner.mjs` when `--project-dir` or `--repo-url` is provided (and `--skip-patterns` is not set).
+```json
+{
+  "generated_at": "2026-03-16T00:00:00.000Z",
+  "project_dir": "https://github.com/owner/repo",
+  "findings": [ ... ],
+  "summary": {
+    "total": 5,
+    "confirmed": 3,
+    "potential": 2
+  }
+}
+```
+### Per-finding fields
+| Field | Type | Description |
+| :--- | :--- | :--- |
+| `id` | `string` | Deterministic finding ID |
+| `pattern_id` | `string` | Pattern definition ID (e.g. `placeholder-only-label`) |
+| `title` | `string` | Pattern title |
+| `severity` | `string` | `Critical`, `Serious`, `Moderate`, or `Minor` |
+| `wcag` | `string` | WCAG success criterion string |
+| `wcag_criterion` | `string` | WCAG criterion ID |
+| `wcag_level` | `string` | `A`, `AA`, or `AAA` |
+| `type` | `string` | Pattern type (`structural`, `css`, etc.) |
+| `fix_description` | `string\|null` | How to fix this pattern |
+| `status` | `string` | `confirmed` (regex match without reject context) or `potential` (match with uncertainty) |
+| `file` | `string` | File path within the repo (e.g. `src/components/Button.tsx`) |
+| `line` | `number` | Line number of the match |
+| `match` | `string` | The matched line content |
+| `context` | `string` | 7-line code context window around the match |
+| `source` | `string` | Always `"code-pattern"` |
+---
 ## remediation.md
 AI agent-optimized remediation guide. Always generated (even without `--with-reports`). Written to `.audit/remediation.md`.

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@diegovelasquezweb/a11y-engine",
-  "version": "0.8.1",
+  "version": "0.8.3",
   "description": "WCAG 2.2 accessibility audit engine — scanner, analyzer, and report builders",
   "type": "module",
   "license": "MIT",

package/src/ai/claude.mjs CHANGED

@@ -132,6 +132,54 @@ async function callClaude(apiKey, model, systemPrompt, userMessage) {
  * @param {string|undefined} githubToken
  * @returns {Promise<Record<string, string>>}
  */
+/**
+ * Extracts candidate component/class names from a CSS selector or HTML snippet.
+ * e.g. ".trustarc-banner-right > span" → ["trustarc", "banner", "right"]
+ * e.g. "#search-input" → ["search", "input"]
+ */
+function extractSearchTermsFromFinding(finding) {
+  const terms = new Set();
+  const sources = [
+    finding.primarySelector || finding.selector || "",
+    finding.title || "",
+  ];
+  for (const src of sources) {
+    // Extract class names: .foo-bar → ["foo", "bar"]
+    const classes = src.match(/\.[\w-]+/g) || [];
+    for (const cls of classes) {
+      const parts = cls.slice(1).split(/[-_]/);
+      for (const p of parts) {
+        if (p.length > 3) terms.add(p.toLowerCase());
+      }
+    }
+    // Extract IDs: #foo-bar → ["foo", "bar"]
+    const ids = src.match(/#[\w-]+/g) || [];
+    for (const id of ids) {
+      const parts = id.slice(1).split(/[-_]/);
+      for (const p of parts) {
+        if (p.length > 3) terms.add(p.toLowerCase());
+      }
+    }
+    // Extract data attributes: [data-component="Foo"] → ["foo"]
+    const dataAttrs = src.match(/data-[\w-]+=["']?[\w-]+["']?/g) || [];
+    for (const attr of dataAttrs) {
+      const val = attr.split(/=["']?/)[1]?.replace(/["']/, "").toLowerCase();
+      if (val && val.length > 3) terms.add(val);
+    }
+  }
+  return [...terms].slice(0, 5);
+}
+/**
+ * Scores a file path by how many search terms it contains.
+ */
+function scoreFilePath(filePath, terms) {
+  const lower = filePath.toLowerCase();
+  return terms.filter((t) => lower.includes(t)).length;
+}
 async function fetchSourceFilesForFindings(findings, repoUrl, githubToken) {
   const sourceFiles = {};
   if (!repoUrl) return sourceFiles;
@@ -139,30 +187,50 @@ async function fetchSourceFilesForFindings(findings, repoUrl, githubToken) {
   const { fetchRepoFile, listRepoFiles, parseRepoUrl } = await import("../core/github-api.mjs");
   if (!parseRepoUrl(repoUrl)) return sourceFiles;
-  const patterns = new Set(
-    findings
-      .filter((f) => f.fileSearchPattern)
-      .map((f) => f.fileSearchPattern)
-  );
+  // Collect all extensions needed
+  const extensions = new Set();
+  for (const f of findings) {
+    if (!f.fileSearchPattern) continue;
+    const extMatch = f.fileSearchPattern.match(/\*\.(\w+)$/);
+    if (extMatch) extensions.add(`.${extMatch[1]}`);
+  }
+  if (extensions.size === 0) return sourceFiles;
-  for (const pattern of patterns) {
-    try {
-      // Extract extension from pattern (e.g. "src/components/*.tsx" -> ".tsx")
-      const extMatch = pattern.match(/\*\.(\w+)$/);
-      if (!extMatch) continue;
-      const ext = `.${extMatch[1]}`;
-      const files = await listRepoFiles(repoUrl, [ext], githubToken);
-      // Pick up to 3 most relevant files per pattern
-      const relevant = files.slice(0, 3);
-      for (const filePath of relevant) {
-        if (!sourceFiles[filePath]) {
-          const content = await fetchRepoFile(repoUrl, filePath, githubToken);
-          if (content) sourceFiles[filePath] = content;
-        }
-      }
-    } catch {
-      // non-fatal
+  // Fetch full file list once
+  let allFiles = [];
+  try {
+    allFiles = await listRepoFiles(repoUrl, [...extensions], githubToken);
+  } catch {
+    return sourceFiles;
+  }
+  // For each finding, find the most relevant files by selector/title terms
+  const MAX_FILES_PER_FINDING = 2;
+  const MAX_TOTAL_FILES = 6;
+  for (const finding of findings) {
+    if (Object.keys(sourceFiles).length >= MAX_TOTAL_FILES) break;
+    const terms = extractSearchTermsFromFinding(finding);
+    // Score and sort files by relevance to this finding
+    const scored = allFiles
+      .map((fp) => ({ fp, score: scoreFilePath(fp, terms) }))
+      .filter(({ score }) => score > 0)
+      .sort((a, b) => b.score - a.score);
+    // Fall back to first files if no relevant match found
+    const candidates = scored.length > 0
+      ? scored.slice(0, MAX_FILES_PER_FINDING).map(({ fp }) => fp)
+      : allFiles.slice(0, 1);
+    for (const filePath of candidates) {
+      if (sourceFiles[filePath]) continue;
+      if (Object.keys(sourceFiles).length >= MAX_TOTAL_FILES) break;
+      try {
+        const content = await fetchRepoFile(repoUrl, filePath, githubToken);
+        if (content) sourceFiles[filePath] = content;
+      } catch { /* non-fatal */ }
     }
   }
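
Reviewer sketch (not part of the package): the selection step from this hunk, isolated so its behaviour can be checked against a small input. `scoreFilePath` repeats the rule from the hunk. `pickCandidates` is a hypothetical wrapper around the scoring, sorting, and fallback lines, and the file list and search terms are invented for illustration.

```javascript
// Same rule as scoreFilePath in the hunk: number of terms found in the path.
function scoreFilePath(filePath, terms) {
  const lower = filePath.toLowerCase();
  return terms.filter((t) => lower.includes(t)).length;
}

// Hypothetical wrapper around the per-finding selection logic.
function pickCandidates(allFiles, terms, maxPerFinding = 2) {
  const scored = allFiles
    .map((fp) => ({ fp, score: scoreFilePath(fp, terms) }))
    .filter(({ score }) => score > 0)
    .sort((a, b) => b.score - a.score);
  // Fall back to the first listed file when nothing matches.
  return scored.length > 0
    ? scored.slice(0, maxPerFinding).map(({ fp }) => fp)
    : allFiles.slice(0, 1);
}

// Invented repo listing, for illustration only.
const files = [
  "src/components/Header.tsx",
  "src/components/SearchInput.tsx",
  "src/components/search/SearchResults.tsx",
  "src/pages/index.tsx",
];
console.log(pickCandidates(files, ["search", "input"]));
console.log(pickCandidates(files, ["trustarc"]));
```

Two properties stand out: matching is by substring, so short terms can match unrelated paths, and the fallback returns whichever file happens to be listed first, which may be unrelated to the finding.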

package/src/enrichment/analyzer.mjs CHANGED

@@ -856,8 +856,10 @@ function buildFindings(inputPayload, cliArgs) {
           selector: selectors.join(", "),
           impacted_users: getImpactedUsers(v.id, v.tags),
           primary_selector: bestSelector,
-          actual:
-            firstNode?.failureSummary || `Found ${nodes.length} instance(s).`,
+          actual: (() => {
+            const raw = firstNode?.failureSummary || `Found ${nodes.length} instance(s).`;
+            return raw.replace(/^Fix any of the following:\s*/i, "").trim();
+          })(),
           primary_failure_mode: failureInsights.primaryFailureMode,
           relationship_hint: failureInsights.relationshipHint,
           failure_checks: failureInsights.failureChecks,
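
Reviewer sketch (not part of the package): the new `actual` expression from this hunk, extracted into a hypothetical `cleanActual` helper so the effect of the regex is easy to see. The sample `failureSummary` strings are illustrative.

```javascript
// Mirrors the inline expression in the hunk: default text when no summary
// exists, then strip a leading "Fix any of the following:" prefix.
function cleanActual(failureSummary, nodeCount) {
  const raw = failureSummary || `Found ${nodeCount} instance(s).`;
  return raw.replace(/^Fix any of the following:\s*/i, "").trim();
}

console.log(cleanActual("Fix any of the following:\n  Element has insufficient color contrast", 1));
console.log(cleanActual(undefined, 3));
console.log(cleanActual("Fix all of the following:\n  Element is not focusable", 1));
```

The third call shows that summaries starting with "Fix all of the following:" pass through unchanged. axe-core emits that prefix as well, so it may be worth confirming whether leaving it in place is intentional.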