accessibility-champion 0.1.0__tar.gz

@@ -0,0 +1,242 @@
1
+ Metadata-Version: 2.4
2
+ Name: accessibility-champion
3
+ Version: 0.1.0
4
+ Summary: Lightweight static HTML accessibility linter for WCAG 2.2
5
+ Author: Bryan Wong
6
+ License-Expression: MIT
7
+ Keywords: accessibility,a11y,wcag,linter,html
8
+ Requires-Python: >=3.8
9
+ Description-Content-Type: text/markdown
10
+
11
+ # Accessibility Champion
12
+
13
+ Accessibility Champion is a lightweight, static accessibility linter for HTML files. It helps identify common WCAG 2.2 AA violations in your markup and generates an accessibility score to help developers triage and fix issues quickly.
14
+
15
+ The linter uses Python's `HTMLParser` with a **rule-registry architecture**: a thin dispatcher forwards parse events to focused rule classes, keeping each check isolated and easy to extend.
16
+
17
+ ## Quick Start
18
+
19
+ Run the linter against any HTML file to get a human-readable text report:
20
+
21
+ ```bash
22
+ python3 a11y_lint.py path/to/your/file.html
23
+ ```
24
+
25
+ Try the included demo fixtures:
26
+
27
+ ```bash
28
+ python3 a11y_lint.py demo/broken_page.html # exits 1 — many violations
29
+ python3 a11y_lint.py demo/passing_page.html # exits 0 — score 100/100
30
+ ```
31
+
32
+ ### CLI Options
33
+
34
+ | Flag | Description |
35
+ |------|-------------|
36
+ | `--json` | Output results as a JSON array (one entry per file) |
37
+ | `--fragment` | Treat input as an HTML fragment; skip full-page landmark and single-`<h1>` checks |
38
+ | `--full-page` | Force full-page mode even when `<html>` / `<body>` tags are absent |
39
+
40
+ **Auto-detection:** When neither `--fragment` nor `--full-page` is passed, the linter treats markup as a **fragment** unless it contains an `<html>` or `<body>` tag. Full-page landmark checks (`<main>`, `<header>`, `<nav>`, `<footer>`, single `<h1>`) only run in full-page mode.
41
+
42
+ ```bash
43
+ # JSON output for CI
44
+ python3 a11y_lint.py path/to/your/file.html --json
45
+
46
+ # Lint a partial HTML snippet (e.g., a component template)
47
+ python3 a11y_lint.py path/to/fragment.html --fragment
48
+ ```
49
+
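The auto-detection rule described above can be sketched with the standard-library `HTMLParser`. This is an illustration, not the package's code: `sketch_is_full_page` and `_PageTagFinder` are hypothetical names, and letting `fragment` win when both flags are set is an assumption the README does not confirm.

```python
from html.parser import HTMLParser


class _PageTagFinder(HTMLParser):
    """Records whether an <html> or <body> start tag appears."""

    def __init__(self) -> None:
        super().__init__()
        self.found = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names before calling this hook.
        if tag in ("html", "body"):
            self.found = True


def sketch_is_full_page(source, fragment=False, full_page=False):
    """Return True when full-page checks should run (hypothetical helper)."""
    if fragment:       # assumption: an explicit --fragment wins
        return False
    if full_page:      # explicit --full-page forces full-page mode
        return True
    finder = _PageTagFinder()
    finder.feed(source)
    finder.close()
    return finder.found  # auto-detect: fragment unless <html>/<body> is present


print(sketch_is_full_page("<html><body><p>Hi</p></body></html>"))  # True
print(sketch_is_full_page("<div><p>Hi</p></div>"))                 # False
```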
50
+ ### Exit Codes
51
+
52
+ - `0` — no violations found across all linted files
53
+ - `1` — one or more violations found, or a file could not be read
54
+
55
+ Any violation severity (including minor) causes a non-zero exit when `--json` is not used for machine parsing.
56
+
57
+ ## Output Format
58
+
59
+ ### Text Report
60
+
61
+ The default text output provides a score out of 100, followed by violations grouped by severity. Each violation includes:
62
+
63
+ - **id** — stable rule identifier (e.g., `image-alt`, `form-group-fieldset`)
64
+ - **line** — source line number
65
+ - **message** — human-readable description
66
+ - **fix** — suggested remediation
67
+ - **wcag** — relevant WCAG success criterion
68
+
69
+ ### JSON Report
70
+
71
+ ```json
72
+ [
73
+ {
74
+ "file": "demo/broken_page.html",
75
+ "score": 0,
76
+ "violations": [
77
+ {
78
+ "id": "html-has-lang",
79
+ "severity": "serious",
80
+ "line": 2,
81
+ "message": "<html> tag is missing a lang attribute",
82
+ "fix": "Add lang=\"en\" (or appropriate language code) to the <html> tag",
83
+ "wcag": "3.1.1 Language of Page"
84
+ },
85
+ {
86
+ "id": "image-alt",
87
+ "severity": "critical",
88
+ "line": 14,
89
+ "message": "<img> is missing an alt attribute",
90
+ "fix": "Add alt=\"[description]\" for informational images, or alt=\"\" role=\"presentation\" for decorative ones",
91
+ "wcag": "1.1.1 Non-text Content"
92
+ }
93
+ ]
94
+ }
95
+ ]
96
+ ```
97
+
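A CI job could consume the JSON report with nothing more than the standard library. The sketch below assumes only the report shape shown above; `report_text` is a trimmed copy of that example (the `message`, `fix`, and `wcag` fields are omitted for brevity), and `summarize` is a hypothetical helper name.

```python
import json
from collections import Counter

# Trimmed copy of the example report above.
report_text = """
[
  {
    "file": "demo/broken_page.html",
    "score": 0,
    "violations": [
      {"id": "html-has-lang", "severity": "serious", "line": 2},
      {"id": "image-alt", "severity": "critical", "line": 14}
    ]
  }
]
"""


def summarize(report_json):
    """Map each linted file to a Counter of its violation severities."""
    summary = {}
    for entry in json.loads(report_json):
        summary[entry["file"]] = Counter(v["severity"] for v in entry["violations"])
    return summary


summary = summarize(report_text)
print(summary["demo/broken_page.html"]["critical"])  # 1
```

Grouping by severity makes it straightforward to fail a build only on, say, critical findings, if that policy suits the project better than the linter's own exit code.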
98
+ ### Programmatic API
99
+
100
+ ```python
101
+ from a11y_lint import check_html, score
102
+
103
+ violations = check_html(source) # auto-detect fragment vs full page
104
+ violations = check_html(source, fragment=True) # force fragment mode
105
+ total = score(violations)
106
+ ```
107
+
108
+ ## Scoring Model
109
+
110
+ The Accessibility Score starts at 100 and deducts points based on severity:
111
+
112
+ | Severity | Points | Examples |
113
+ |----------|--------|----------|
114
+ | **Critical** | −20 | Missing `alt`, unlabelled form controls, buttons without accessible names |
115
+ | **Serious** | −10 | Missing `lang`, duplicate IDs, generic link text, missing `iframe` title |
116
+ | **Moderate** | −5 | Skipped heading levels, missing `<main>`, ungrouped radio/checkbox sets |
117
+ | **Minor** | −2 | Missing `autocomplete` on personal-data fields, optional landmark regions |
118
+
119
+ The score is clamped to a minimum of 0.
120
+
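The arithmetic of the scoring table can be sketched as follows. This is not the package's `score` function: `sketch_score` and `DEDUCTIONS` are hypothetical names, and ignoring unknown severities is an assumption.

```python
from typing import Iterable, Mapping

# Deductions per severity, taken from the table above.
DEDUCTIONS = {"critical": 20, "serious": 10, "moderate": 5, "minor": 2}


def sketch_score(violations: Iterable[Mapping[str, object]]) -> int:
    """Start at 100, subtract one deduction per violation, clamp at 0."""
    total = 100
    for violation in violations:
        # Assumption: an unrecognized severity deducts nothing.
        total -= DEDUCTIONS.get(str(violation["severity"]).lower(), 0)
    return max(total, 0)


example = [
    {"id": "image-alt", "severity": "critical"},
    {"id": "html-has-lang", "severity": "serious"},
    {"id": "input-autocomplete", "severity": "minor"},
]
print(sketch_score(example))  # 100 - 20 - 10 - 2 = 68
```

Because the score is clamped, any page with five or more critical violations reports 0, so the violation list says more about such pages than the score does.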
121
+ ## Current Checks
122
+
123
+ ### Pillar 1 — Perceivable
124
+
125
+ | Rule ID | What it checks |
126
+ |---------|----------------|
127
+ | `html-has-lang` | `<html>` missing `lang` attribute |
128
+ | `image-alt` | `<img>` missing `alt` attribute |
129
+ | `image-alt-quality` | Generic `alt` text (`image`, `photo`, `logo`, etc.) |
130
+ | `no-autoplay` | `<video>` / `<audio>` with `autoplay` |
131
+ | `video-captions` | `<video>` missing `<track kind="captions">` |
132
+ | `audio-transcript` | `<audio>` without nearby transcript link/text (heuristic) |
133
+ | `document-title` | Full page missing non-empty `<title>` in `<head>` |
134
+
135
+ ### Pillar 2 — Operable
136
+
137
+ | Rule ID | What it checks |
138
+ |---------|----------------|
139
+ | `button-name` | `<button>` with no accessible name (text, `aria-label`, child `<img alt>`, or `<svg><title>`) |
140
+ | `link-name` | `<a>` with generic text (`click here`, `read more`, `here`, `more`, etc.) |
141
+ | `focus-visible` | `outline: none` / `outline: 0` in CSS without a matching `:focus` / `:focus-visible` fallback rule |
142
+ | `skip-link` | Full page with `<nav>` missing a skip-to-main-content link |
143
+ | `tabindex-positive` | Any `tabindex` value greater than 0 |
144
+ | `button-type-missing` | `<button>` inside `<form>` without explicit `type` attribute |
145
+ | `target-blank-no-warning` | `target="_blank"` without accessible new-window warning |
146
+
147
+ Focus-outline analysis parses CSS rule blocks inside `<style>` elements and inline `style=""` attributes. It matches base selectors (e.g., `.btn`) to companion `:focus-visible` rules that restore a visible outline.
148
+
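The selector-matching idea behind `focus-visible` can be sketched with a small regex-based pass over a stylesheet. This is a simplified illustration, not the code in `a11y_focus.py`: it handles `<style>` text only (no inline `style=""` attributes, comments, or shorthand variants such as `outline-style`), and every name in it is hypothetical.

```python
import re

RULE_BLOCK = re.compile(r"([^{}]+)\{([^{}]*)\}")
# Matches the `outline` property itself, not `outline-offset` or `-moz-outline`.
OUTLINE_DECL = re.compile(r"(?<![\w-])outline\s*:\s*([^;]+)", re.IGNORECASE)


def _outline_value(declarations):
    """Return the last `outline` value in a declaration block, or None."""
    matches = OUTLINE_DECL.findall(declarations)
    return matches[-1].strip().lower() if matches else None


def _removes_outline(value):
    parts = value.split()
    return bool(parts) and parts[0] in ("none", "0")


def unrestored_outline_selectors(css):
    """Selectors that remove the outline with no visible focus companion."""
    removed = []      # selectors whose rule strips the outline
    restored = set()  # base selectors with a visible :focus / :focus-visible outline
    for selector_group, declarations in RULE_BLOCK.findall(css):
        value = _outline_value(declarations)
        if value is None:
            continue
        for selector in (s.strip() for s in selector_group.split(",")):
            base, sep, pseudo = selector.partition(":focus")
            is_focus_rule = bool(sep) and pseudo in ("", "-visible")
            if is_focus_rule and not _removes_outline(value):
                restored.add(base)
            elif _removes_outline(value):
                removed.append(selector)
    return [s for s in removed if s not in restored]


css = """
.btn { outline: none; }
.btn:focus-visible { outline: 2px solid #005fcc; }
.link { outline: 0; }
"""
print(unrestored_outline_selectors(css))  # ['.link']
```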
149
+ ### Pillar 3 — Understandable
150
+
151
+ | Rule ID | What it checks |
152
+ |---------|----------------|
153
+ | `input-unlabelled` | Form control has no `id`, is not wrapped in `<label>`, and has no `aria-label` |
154
+ | `input-missing-label` | Control has an `id` but no matching `<label for="...">` |
155
+ | `placeholder-as-label` | `placeholder` used without a real label or `aria-label` |
156
+ | `input-autocomplete` | Personal-data inputs (`email`, `password`, `tel`, or `name`/`address`-like fields) missing `autocomplete` |
157
+ | `aria-invalid-no-desc` | `aria-invalid="true"` without `aria-describedby` pointing to an error element |
158
+
159
+ Applies to `<input>`, `<select>`, and `<textarea>`.
160
+
161
+ ### Pillar 4 — Robust & Semantic Structure
162
+
163
+ | Rule ID | What it checks |
164
+ |---------|----------------|
165
+ | `duplicate-id` | Duplicate `id` attribute values |
166
+ | `form-group-fieldset` | Multiple radio/checkbox inputs sharing a `name` not wrapped in `<fieldset>` + `<legend>` (one violation per group) |
167
+ | `aria-describedby-missing-target` | `aria-describedby` references an `id` that does not exist |
168
+ | `aria-labelledby-target` | `aria-labelledby` references an `id` that does not exist |
169
+ | `heading-order` | Skipped heading levels (e.g., `<h1>` → `<h3>`) |
170
+ | `heading-single-h1` | More than one `<h1>` on a full page |
171
+ | `frame-title` | `<iframe>` missing `title` |
172
+ | `table-th` | Data table missing `<th>` header cells |
173
+ | `table-caption` | Data table missing `<caption>` |
174
+ | `missing-main` | Full page missing `<main>` landmark |
175
+ | `missing-header-landmark` | Full page missing `<header>` / `role="banner"` |
176
+ | `missing-nav-landmark` | Full page missing `<nav>` / `role="navigation"` |
177
+ | `missing-footer-landmark` | Full page missing `<footer>` / `role="contentinfo"` |
178
+
179
+ Presentation tables (`role="presentation"`) are exempt from table header/caption checks.
180
+
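As an illustration of the `heading-order` row in the table above, skipped levels can be detected by collecting headings in document order and comparing each with its predecessor. The names below are hypothetical, and treating the first heading as never being a skip is an assumption about the rule.

```python
import re
from html.parser import HTMLParser

HEADING = re.compile(r"h([1-6])$")


class HeadingCollector(HTMLParser):
    """Collects (level, line) pairs for <h1>..<h6> in document order."""

    def __init__(self):
        super().__init__()
        self.headings = []

    def handle_starttag(self, tag, attrs):
        match = HEADING.match(tag)
        if match:
            # getpos() returns (line, column) of the tag being handled.
            self.headings.append((int(match.group(1)), self.getpos()[0]))


def skipped_headings(source):
    """Return (previous_level, level, line) for each jump of more than one level."""
    collector = HeadingCollector()
    collector.feed(source)
    collector.close()
    skips = []
    previous = 0
    for level, line in collector.headings:
        if previous and level > previous + 1:
            skips.append((previous, level, line))
        previous = level
    return skips


page = "<h1>A</h1>\n<h3>B</h3>\n<h2>C</h2>\n<h3>D</h3>"
print(skipped_headings(page))  # [(1, 3, 2)]
```

Moving back up the hierarchy (`<h3>` to `<h2>`) is not a skip; only a downward jump of more than one level is reported.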
181
+ ## Architecture
182
+
183
+ ```
184
+ a11y_lint.py      CLI entry point; thin HTMLParser dispatcher
185
+ a11y_context.py   ParseContext (shared state) and TagAttrs helpers
186
+ a11y_rules.py     Individual A11yRule classes registered via all_rules()
187
+ a11y_focus.py     CSS rule-block parser for focus-outline checks
188
+ ```
189
+
190
+ To add a new check, create a class extending `A11yRule` in `a11y_rules.py` with `on_starttag`, `on_endtag`, `on_data`, and/or `finalize` hooks, then register it in `all_rules()`.
191
+
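The rule-registry pattern described above can be sketched in a self-contained form. Only the names `A11yRule`, `all_rules()`, and the four hook names come from this README; the hook signatures, the `Dispatcher` class, and the `MarqueeRule` / `no-marquee` example are assumptions made for illustration and will differ from the real `a11y_rules.py`.

```python
from html.parser import HTMLParser


class A11yRule:
    """Hypothetical base class: every hook is a no-op by default."""

    def on_starttag(self, tag, attrs, line, violations):
        pass

    def on_endtag(self, tag, line, violations):
        pass

    def on_data(self, data, line, violations):
        pass

    def finalize(self, violations):
        pass


class MarqueeRule(A11yRule):
    """Illustrative rule (not part of the package): flag <marquee>."""

    def on_starttag(self, tag, attrs, line, violations):
        if tag == "marquee":
            violations.append({"id": "no-marquee", "severity": "moderate", "line": line})


def all_rules():
    """Registry: one fresh instance of every rule per run."""
    return [MarqueeRule()]


class Dispatcher(HTMLParser):
    """Thin dispatcher: forwards each parse event to every registered rule."""

    def __init__(self):
        super().__init__()
        self.rules = all_rules()
        self.violations = []

    def handle_starttag(self, tag, attrs):
        for rule in self.rules:
            rule.on_starttag(tag, attrs, self.getpos()[0], self.violations)

    def handle_endtag(self, tag):
        for rule in self.rules:
            rule.on_endtag(tag, self.getpos()[0], self.violations)

    def handle_data(self, data):
        for rule in self.rules:
            rule.on_data(data, self.getpos()[0], self.violations)

    def run(self, source):
        self.feed(source)
        self.close()
        for rule in self.rules:
            rule.finalize(self.violations)  # end-of-document checks
        return self.violations


found = Dispatcher().run("<p>ok</p>\n<marquee>hi</marquee>")
print(found)  # [{'id': 'no-marquee', 'severity': 'moderate', 'line': 2}]
```

Keeping the dispatcher free of rule logic means a new check touches only its own class and the registry function.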
192
+ ## Running Tests
193
+
194
+ ```bash
195
+ python3 -m unittest test_a11y_lint -v
196
+ ```
197
+
198
+ The suite covers fixture pages, individual rules, CLI behavior, fragment mode, and edge cases (30 tests). **Every new rule must include tests** that verify both failing and passing markup.
199
+
200
+ ## Limitations
201
+
202
+ ⚠️ **This is a static linter and does not replace manual accessibility review.**
203
+
204
+ While it catches a meaningful share of common HTML accessibility issues, automated tooling cannot validate:
205
+
206
+ - **Interaction intent** — Does a custom widget actually behave like its native equivalent?
207
+ - **Meaningful text** — Is the `alt` text actually descriptive in context?
208
+ - **Keyboard navigation** — Are there keyboard traps or focus-management issues?
209
+ - **Visual contrast** — Real contrast ratios require rendering the DOM and CSS.
210
+
211
+ Always supplement this tool with screen reader testing, keyboard navigation audits, and dynamic tools like axe-core.
212
+
213
+ ## Demo Fixtures
214
+
215
+ | File | Purpose |
216
+ |------|---------|
217
+ | `demo/broken_page.html` | Intentionally broken markup; demonstrates linter output |
218
+ | `demo/passing_page.html` | Accessible page that scores 100/100 |
219
+ | `demo/expected_output.txt` | Reference text output for `broken_page.html` |
220
+
221
+ ## Project Layout
222
+
223
+ - `a11y_lint.py` — CLI entry point and thin HTML parser dispatcher
224
+ - `a11y_context.py` — Shared parse context and attribute helpers
225
+ - `a11y_rules.py` — Individual accessibility rule implementations
226
+ - `a11y_focus.py` — CSS focus-outline checks for `<style>` blocks and inline styles
227
+ - `demo/broken_page.html` — Fixture demonstrating failing checks
228
+ - `demo/passing_page.html` — Fixture demonstrating passing checks
229
+ - `test_a11y_lint.py` — Automated test suite
230
+ - `SKILL.md` — AI agent integration guidelines for accessibility audit workflows
231
+
232
+ ## Roadmap
233
+
234
+ See [ROADMAP.md](./ROADMAP.md) for the phased plan to expand coverage toward full WCAG-aligned auditing (static rules → CSS analysis → axe-core integration).
235
+
236
+ ## Contributing
237
+
238
+ 1. Keep changes small and surgical.
239
+ 2. Add new rules as `A11yRule` subclasses in `a11y_rules.py`; register them in `all_rules()`.
240
+ 3. Add tests to `test_a11y_lint.py` for both failing and passing markup.
241
+ 4. Prefer the HTML parser over regular expressions for structural checks.
242
+ 5. Update this README when adding new rule IDs, CLI flags, or architectural changes.
@@ -0,0 +1,232 @@
1
+ # Accessibility Champion
2
+
3
+ Accessibility Champion is a lightweight, static accessibility linter for HTML files. It helps identify common WCAG 2.2 AA violations in your markup and generates an accessibility score to help developers triage and fix issues quickly.
4
+
5
+ The linter uses Python's `HTMLParser` with a **rule-registry architecture**: a thin dispatcher forwards parse events to focused rule classes, keeping each check isolated and easy to extend.
6
+
7
+ ## Quick Start
8
+
9
+ Run the linter against any HTML file to get a human-readable text report:
10
+
11
+ ```bash
12
+ python3 a11y_lint.py path/to/your/file.html
13
+ ```
14
+
15
+ Try the included demo fixtures:
16
+
17
+ ```bash
18
+ python3 a11y_lint.py demo/broken_page.html # exits 1 — many violations
19
+ python3 a11y_lint.py demo/passing_page.html # exits 0 — score 100/100
20
+ ```
21
+
22
+ ### CLI Options
23
+
24
+ | Flag | Description |
25
+ |------|-------------|
26
+ | `--json` | Output results as a JSON array (one entry per file) |
27
+ | `--fragment` | Treat input as an HTML fragment; skip full-page landmark and single-`<h1>` checks |
28
+ | `--full-page` | Force full-page mode even when `<html>` / `<body>` tags are absent |
29
+
30
+ **Auto-detection:** When neither `--fragment` nor `--full-page` is passed, the linter treats markup as a **fragment** unless it contains an `<html>` or `<body>` tag. Full-page landmark checks (`<main>`, `<header>`, `<nav>`, `<footer>`, single `<h1>`) only run in full-page mode.
31
+
32
+ ```bash
33
+ # JSON output for CI
34
+ python3 a11y_lint.py path/to/your/file.html --json
35
+
36
+ # Lint a partial HTML snippet (e.g., a component template)
37
+ python3 a11y_lint.py path/to/fragment.html --fragment
38
+ ```
39
+
40
+ ### Exit Codes
41
+
42
+ - `0` — no violations found across all linted files
43
+ - `1` — one or more violations found, or a file could not be read
44
+
45
+ Any violation severity (including minor) causes a non-zero exit when `--json` is not used for machine parsing.
46
+
47
+ ## Output Format
48
+
49
+ ### Text Report
50
+
51
+ The default text output provides a score out of 100, followed by violations grouped by severity. Each violation includes:
52
+
53
+ - **id** — stable rule identifier (e.g., `image-alt`, `form-group-fieldset`)
54
+ - **line** — source line number
55
+ - **message** — human-readable description
56
+ - **fix** — suggested remediation
57
+ - **wcag** — relevant WCAG success criterion
58
+
59
+ ### JSON Report
60
+
61
+ ```json
62
+ [
63
+ {
64
+ "file": "demo/broken_page.html",
65
+ "score": 0,
66
+ "violations": [
67
+ {
68
+ "id": "html-has-lang",
69
+ "severity": "serious",
70
+ "line": 2,
71
+ "message": "<html> tag is missing a lang attribute",
72
+ "fix": "Add lang=\"en\" (or appropriate language code) to the <html> tag",
73
+ "wcag": "3.1.1 Language of Page"
74
+ },
75
+ {
76
+ "id": "image-alt",
77
+ "severity": "critical",
78
+ "line": 14,
79
+ "message": "<img> is missing an alt attribute",
80
+ "fix": "Add alt=\"[description]\" for informational images, or alt=\"\" role=\"presentation\" for decorative ones",
81
+ "wcag": "1.1.1 Non-text Content"
82
+ }
83
+ ]
84
+ }
85
+ ]
86
+ ```
87
+
88
+ ### Programmatic API
89
+
90
+ ```python
91
+ from a11y_lint import check_html, score
92
+
93
+ violations = check_html(source) # auto-detect fragment vs full page
94
+ violations = check_html(source, fragment=True) # force fragment mode
95
+ total = score(violations)
96
+ ```
97
+
98
+ ## Scoring Model
99
+
100
+ The Accessibility Score starts at 100 and deducts points based on severity:
101
+
102
+ | Severity | Points | Examples |
103
+ |----------|--------|----------|
104
+ | **Critical** | −20 | Missing `alt`, unlabelled form controls, buttons without accessible names |
105
+ | **Serious** | −10 | Missing `lang`, duplicate IDs, generic link text, missing `iframe` title |
106
+ | **Moderate** | −5 | Skipped heading levels, missing `<main>`, ungrouped radio/checkbox sets |
107
+ | **Minor** | −2 | Missing `autocomplete` on personal-data fields, optional landmark regions |
108
+
109
+ The score is clamped to a minimum of 0.
110
+
111
+ ## Current Checks
112
+
113
+ ### Pillar 1 — Perceivable
114
+
115
+ | Rule ID | What it checks |
116
+ |---------|----------------|
117
+ | `html-has-lang` | `<html>` missing `lang` attribute |
118
+ | `image-alt` | `<img>` missing `alt` attribute |
119
+ | `image-alt-quality` | Generic `alt` text (`image`, `photo`, `logo`, etc.) |
120
+ | `no-autoplay` | `<video>` / `<audio>` with `autoplay` |
121
+ | `video-captions` | `<video>` missing `<track kind="captions">` |
122
+ | `audio-transcript` | `<audio>` without nearby transcript link/text (heuristic) |
123
+ | `document-title` | Full page missing non-empty `<title>` in `<head>` |
124
+
125
+ ### Pillar 2 — Operable
126
+
127
+ | Rule ID | What it checks |
128
+ |---------|----------------|
129
+ | `button-name` | `<button>` with no accessible name (text, `aria-label`, child `<img alt>`, or `<svg><title>`) |
130
+ | `link-name` | `<a>` with generic text (`click here`, `read more`, `here`, `more`, etc.) |
131
+ | `focus-visible` | `outline: none` / `outline: 0` in CSS without a matching `:focus` / `:focus-visible` fallback rule |
132
+ | `skip-link` | Full page with `<nav>` missing a skip-to-main-content link |
133
+ | `tabindex-positive` | Any `tabindex` value greater than 0 |
134
+ | `button-type-missing` | `<button>` inside `<form>` without explicit `type` attribute |
135
+ | `target-blank-no-warning` | `target="_blank"` without accessible new-window warning |
136
+
137
+ Focus-outline analysis parses CSS rule blocks inside `<style>` elements and inline `style=""` attributes. It matches base selectors (e.g., `.btn`) to companion `:focus-visible` rules that restore a visible outline.
138
+
139
+ ### Pillar 3 — Understandable
140
+
141
+ | Rule ID | What it checks |
142
+ |---------|----------------|
143
+ | `input-unlabelled` | Form control has no `id`, is not wrapped in `<label>`, and has no `aria-label` |
144
+ | `input-missing-label` | Control has an `id` but no matching `<label for="...">` |
145
+ | `placeholder-as-label` | `placeholder` used without a real label or `aria-label` |
146
+ | `input-autocomplete` | Personal-data inputs (`email`, `password`, `tel`, or `name`/`address`-like fields) missing `autocomplete` |
147
+ | `aria-invalid-no-desc` | `aria-invalid="true"` without `aria-describedby` pointing to an error element |
148
+
149
+ Applies to `<input>`, `<select>`, and `<textarea>`.
150
+
151
+ ### Pillar 4 — Robust & Semantic Structure
152
+
153
+ | Rule ID | What it checks |
154
+ |---------|----------------|
155
+ | `duplicate-id` | Duplicate `id` attribute values |
156
+ | `form-group-fieldset` | Multiple radio/checkbox inputs sharing a `name` not wrapped in `<fieldset>` + `<legend>` (one violation per group) |
157
+ | `aria-describedby-missing-target` | `aria-describedby` references an `id` that does not exist |
158
+ | `aria-labelledby-target` | `aria-labelledby` references an `id` that does not exist |
159
+ | `heading-order` | Skipped heading levels (e.g., `<h1>` → `<h3>`) |
160
+ | `heading-single-h1` | More than one `<h1>` on a full page |
161
+ | `frame-title` | `<iframe>` missing `title` |
162
+ | `table-th` | Data table missing `<th>` header cells |
163
+ | `table-caption` | Data table missing `<caption>` |
164
+ | `missing-main` | Full page missing `<main>` landmark |
165
+ | `missing-header-landmark` | Full page missing `<header>` / `role="banner"` |
166
+ | `missing-nav-landmark` | Full page missing `<nav>` / `role="navigation"` |
167
+ | `missing-footer-landmark` | Full page missing `<footer>` / `role="contentinfo"` |
168
+
169
+ Presentation tables (`role="presentation"`) are exempt from table header/caption checks.
170
+
171
+ ## Architecture
172
+
173
+ ```
174
+ a11y_lint.py      CLI entry point; thin HTMLParser dispatcher
175
+ a11y_context.py   ParseContext (shared state) and TagAttrs helpers
176
+ a11y_rules.py     Individual A11yRule classes registered via all_rules()
177
+ a11y_focus.py     CSS rule-block parser for focus-outline checks
178
+ ```
179
+
180
+ To add a new check, create a class extending `A11yRule` in `a11y_rules.py` with `on_starttag`, `on_endtag`, `on_data`, and/or `finalize` hooks, then register it in `all_rules()`.
181
+
182
+ ## Running Tests
183
+
184
+ ```bash
185
+ python3 -m unittest test_a11y_lint -v
186
+ ```
187
+
188
+ The suite covers fixture pages, individual rules, CLI behavior, fragment mode, and edge cases (30 tests). **Every new rule must include tests** that verify both failing and passing markup.
189
+
190
+ ## Limitations
191
+
192
+ ⚠️ **This is a static linter and does not replace manual accessibility review.**
193
+
194
+ While it catches a meaningful share of common HTML accessibility issues, automated tooling cannot validate:
195
+
196
+ - **Interaction intent** — Does a custom widget actually behave like its native equivalent?
197
+ - **Meaningful text** — Is the `alt` text actually descriptive in context?
198
+ - **Keyboard navigation** — Are there keyboard traps or focus-management issues?
199
+ - **Visual contrast** — Real contrast ratios require rendering the DOM and CSS.
200
+
201
+ Always supplement this tool with screen reader testing, keyboard navigation audits, and dynamic tools like axe-core.
202
+
203
+ ## Demo Fixtures
204
+
205
+ | File | Purpose |
206
+ |------|---------|
207
+ | `demo/broken_page.html` | Intentionally broken markup; demonstrates linter output |
208
+ | `demo/passing_page.html` | Accessible page that scores 100/100 |
209
+ | `demo/expected_output.txt` | Reference text output for `broken_page.html` |
210
+
211
+ ## Project Layout
212
+
213
+ - `a11y_lint.py` — CLI entry point and thin HTML parser dispatcher
214
+ - `a11y_context.py` — Shared parse context and attribute helpers
215
+ - `a11y_rules.py` — Individual accessibility rule implementations
216
+ - `a11y_focus.py` — CSS focus-outline checks for `<style>` blocks and inline styles
217
+ - `demo/broken_page.html` — Fixture demonstrating failing checks
218
+ - `demo/passing_page.html` — Fixture demonstrating passing checks
219
+ - `test_a11y_lint.py` — Automated test suite
220
+ - `SKILL.md` — AI agent integration guidelines for accessibility audit workflows
221
+
222
+ ## Roadmap
223
+
224
+ See [ROADMAP.md](./ROADMAP.md) for the phased plan to expand coverage toward full WCAG-aligned auditing (static rules → CSS analysis → axe-core integration).
225
+
226
+ ## Contributing
227
+
228
+ 1. Keep changes small and surgical.
229
+ 2. Add new rules as `A11yRule` subclasses in `a11y_rules.py`; register them in `all_rules()`.
230
+ 3. Add tests to `test_a11y_lint.py` for both failing and passing markup.
231
+ 4. Prefer the HTML parser over regular expressions for structural checks.
232
+ 5. Update this README when adding new rule IDs, CLI flags, or architectural changes.
@@ -0,0 +1,166 @@
+ """Shared parse context, violation model, and attribute helpers."""
+
+ from __future__ import annotations
+
+ from dataclasses import dataclass, field
+ from typing import TypedDict
+
+
+ class Violation(TypedDict):
+     id: str
+     severity: str
+     line: int
+     message: str
+     fix: str
+     wcag: str
+
+
+ @dataclass
+ class TagAttrs:
+     """HTML attributes with case-preserving values."""
+
+     raw: dict[str, str]
+
+     @classmethod
+     def from_parser(cls, attrs: list[tuple[str, str | None]]) -> TagAttrs:
+         return cls({k: v if v is not None else "" for k, v in attrs})
+
+     def get(self, name: str) -> str | None:
+         target = name.lower()
+         for key, value in self.raw.items():
+             if key.lower() == target:
+                 return value
+         return None
+
+     def has(self, name: str) -> bool:
+         return self.get(name) is not None
+
+     def get_lower(self, name: str) -> str | None:
+         value = self.get(name)
+         return value.lower() if value else None
+
+
+ def make_violation(
+     *,
+     id: str,
+     severity: str,
+     line: int,
+     message: str,
+     fix: str,
+     wcag: str,
+ ) -> Violation:
+     return Violation(
+         id=id,
+         severity=severity,
+         line=line,
+         message=message,
+         fix=fix,
+         wcag=wcag,
+     )
+
+
+ @dataclass
+ class PageState:
+     is_full_page: bool = False
+     has_main: bool = False
+     has_header: bool = False
+     has_nav: bool = False
+     has_footer: bool = False
+     h1_count: int = 0
+     headings_seen: list[int] = field(default_factory=list)
+     head_depth: int = 0
+     document_title_depth: int = 0
+     document_title: str = ""
+     has_skip_link: bool = False
+     html_line: int = 1
+     nav_line: int = 1
+     head_line: int = 1
+
+
+ @dataclass
+ class FormState:
+     label_fors: set[str] = field(default_factory=set)
+     inputs_needing_labels: list[dict] = field(default_factory=list)
+     placeholder_controls: list[dict] = field(default_factory=list)
+     fieldset_stack: list[dict] = field(default_factory=list)
+     radio_checkbox_groups: dict[tuple[str, str], dict] = field(default_factory=dict)
+     form_depth: int = 0
+
+
+ @dataclass
+ class MediaState:
+     video_depth: int = 0
+     current_video: dict | None = None
+     audio_entries: list[dict] = field(default_factory=list)
+
+
+ @dataclass
+ class LinkState:
+     link_depth: int = 0
+     current_link: dict | None = None
+
+
+ @dataclass
+ class ButtonState:
+     button_depth: int = 0
+     current_button: dict | None = None
+     in_svg_depth: int = 0
+     in_title_depth: int = 0
+     current_title_text: str = ""
+
+
+ @dataclass
+ class TableState:
+     table_depth: int = 0
+     current_table_has_th: bool = False
+     current_table_has_caption: bool = False
+     current_table_is_presentation: bool = False
+     current_table_line: int = 0
+
+
+ @dataclass
+ class AriaState:
+     ids_seen: set[str] = field(default_factory=set)
+     duplicate_ids: set[tuple[str, int]] = field(default_factory=set)
+     described_by_checks: list[dict] = field(default_factory=list)
+     labelled_by_checks: list[dict] = field(default_factory=list)
+     aria_invalid_checks: list[dict] = field(default_factory=list)
+
+
+ @dataclass
+ class ParseContext:
+     """Mutable state shared across accessibility rules during HTML parsing."""
+
+     source: str = ""
+     fragment_mode: bool = False
+     violations: list[Violation] = field(default_factory=list)
+     tag_stack: list[str] = field(default_factory=list)
+     page: PageState = field(default_factory=PageState)
+     forms: FormState = field(default_factory=FormState)
+     media: MediaState = field(default_factory=MediaState)
+     links: LinkState = field(default_factory=LinkState)
+     buttons: ButtonState = field(default_factory=ButtonState)
+     tables: TableState = field(default_factory=TableState)
+     aria: AriaState = field(default_factory=AriaState)
+
+     def push_tag(self, tag: str) -> None:
+         self.tag_stack.append(tag)
+
+     def pop_tag(self, tag: str) -> None:
+         if self.tag_stack and self.tag_stack[-1] == tag:
+             self.tag_stack.pop()
+
+     def in_tag(self, tag: str) -> bool:
+         return tag in self.tag_stack
+
+     def add_violation(self, **kwargs: str | int) -> None:
+         self.violations.append(make_violation(**kwargs))  # type: ignore[arg-type]
+
+     def track_id(self, id_val: str, line: int) -> None:
+         if id_val in self.aria.ids_seen:
+             self.aria.duplicate_ids.add((id_val, line))
+         self.aria.ids_seen.add(id_val)
+
+     def page_line(self, fallback: int = 1) -> int:
+         """Best line number for page-level violations."""
+         return self.page.html_line or fallback
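
A short usage sketch for the `TagAttrs` helper in this hunk. The class body below mirrors the one above (with `Optional` in place of `str | None`) so the snippet runs on its own; the sample attribute list is made up for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class TagAttrs:
    """Standalone mirror of the TagAttrs helper from a11y_context.py."""

    raw: Dict[str, str]

    @classmethod
    def from_parser(cls, attrs: List[Tuple[str, Optional[str]]]) -> "TagAttrs":
        # Valueless attributes (e.g. `hidden`) arrive as None and become "".
        return cls({k: v if v is not None else "" for k, v in attrs})

    def get(self, name: str) -> Optional[str]:
        target = name.lower()
        for key, value in self.raw.items():
            if key.lower() == target:
                return value
        return None

    def has(self, name: str) -> bool:
        return self.get(name) is not None

    def get_lower(self, name: str) -> Optional[str]:
        value = self.get(name)
        return value.lower() if value else None


attrs = TagAttrs.from_parser([("ID", "Main"), ("hidden", None), ("TYPE", "Email")])
print(attrs.get("id"))            # Main  (name lookup ignores case, value keeps it)
print(attrs.has("hidden"))        # True  (present, stored as "")
print(attrs.get_lower("hidden"))  # None  (an empty value is treated as absent)
print(attrs.get_lower("type"))    # email
```

The difference between `has` and `get_lower` matters for boolean attributes: a rule that asks whether an attribute is present should call `has`, because `get_lower` returns `None` for an empty value.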