@neikyun/ciel 6.11.0 → 6.11.2

Files changed (39)
  1. package/assets/.claude/hooks/memory-engine.py +29 -4
  2. package/assets/.claude/settings.json +8 -8
  3. package/assets/commands/ciel-create-skill.md +2 -2
  4. package/assets/commands/ciel-status.md +1 -1
  5. package/assets/platforms/opencode/.opencode/agents/ciel-improver.md +2 -2
  6. package/assets/platforms/opencode/.opencode/commands/ciel-create-skill.md +2 -2
  7. package/assets/platforms/opencode/.opencode/commands/ciel-memory-bootstrap.md +195 -0
  8. package/assets/skills/workflow/adr-auto/SKILL.md +88 -0
  9. package/assets/skills/workflow/ai-failure-modes-detector/SKILL.md +180 -0
  10. package/assets/skills/workflow/ask-window/SKILL.md +119 -0
  11. package/assets/skills/workflow/avec-quoi-versioner/SKILL.md +111 -0
  12. package/assets/skills/workflow/ci-watcher/SKILL.md +194 -0
  13. package/assets/skills/workflow/critiquer-auditor/SKILL.md +135 -0
  14. package/assets/skills/workflow/critiquer-auditor/reference.md +134 -0
  15. package/assets/skills/workflow/debug-reasoning-rca/SKILL.md +174 -0
  16. package/assets/skills/workflow/depth-classifier/SKILL.md +118 -0
  17. package/assets/skills/workflow/diverge/SKILL.md +91 -0
  18. package/assets/skills/workflow/doc-validator-official/SKILL.md +196 -0
  19. package/assets/skills/workflow/evaluer-sizer/SKILL.md +112 -0
  20. package/assets/skills/workflow/faire-gatekeeper/SKILL.md +99 -0
  21. package/assets/skills/workflow/flux-narrator/SKILL.md +93 -0
  22. package/assets/skills/workflow/memoire/SKILL.md +198 -0
  23. package/assets/skills/workflow/memoire-consolidator/SKILL.md +91 -0
  24. package/assets/skills/workflow/meta-critiquer/SKILL.md +112 -0
  25. package/assets/skills/workflow/modern-patterns-checker/SKILL.md +166 -0
  26. package/assets/skills/workflow/pattern-fitness-check/SKILL.md +108 -0
  27. package/assets/skills/workflow/playwright-visual-critic/SKILL.md +98 -0
  28. package/assets/skills/workflow/pr-review-responder/SKILL.md +214 -0
  29. package/assets/skills/workflow/prouver-verifier/SKILL.md +184 -0
  30. package/assets/skills/workflow/prouver-verifier/reference.md +152 -0
  31. package/assets/skills/workflow/quoi-framer/SKILL.md +91 -0
  32. package/assets/skills/workflow/relire-critic/SKILL.md +99 -0
  33. package/assets/skills/workflow/security-regression-check/SKILL.md +86 -0
  34. package/assets/skills/workflow/self-consistency-verifier/SKILL.md +85 -0
  35. package/assets/skills/workflow/spike-mode/SKILL.md +101 -0
  36. package/assets/skills/workflow/stride-analyzer/SKILL.md +96 -0
  37. package/assets/skills/workflow/stride-analyzer/reference.md +144 -0
  38. package/assets/skills/workflow/test-strategy-vitest-playwright/SKILL.md +119 -0
  39. package/package.json +1 -1
@@ -0,0 +1,112 @@
1
+ ---
2
+ name: meta-critiquer
3
+ description: How to reflect on completed work — 10-item post-task reflection checklist for Ciel v5. Covers depth match, failure mode detection, user corrections, stale branches, uncovered issues, context health, dead code, map update, parking lot, and boy-scout rule. Closes the feedback loop between execution and improvement.
4
+ ---
5
+
6
+ # Post-Task Reflection — 10-Item Checklist (Ciel v5)
7
+
8
+ ## What this covers
9
+
10
+ How to reflect on completed work and capture learnings. In Ciel v5, this is the last pipeline step (step 16: META). Run it after every task, even trivial ones.
11
+
12
+ ## Core principle
13
+
14
+ **Always reflect, even after trivial tasks.** 30 seconds is cheap; missed reflections compound. In v5, the reflection covers not just the code but the project map, parking lot, and boy-scout rule.
15
+
16
+ ## The 10 checks (v5)
17
+
18
+ ### 1. Depth match?
19
+
20
+ Was the task processed at the right depth?
21
+ - Over-processed trivial = waste
22
+ - Under-processed critical = risk
23
+ - Wrong depth -> flag for future classifier refinement
24
+
25
+ ### 2. New failure mode?
26
+
27
+ Did something go wrong that current gates didn't catch?
28
+ - Yes -> add a new gate immediately
29
+ - Capture the pattern for future reference (in .ciel/learnings.md)
30
+
31
+ ### 3. User correction?
32
+
33
+ Did the user correct you during the task?
34
+ - Yes -> persist to .ciel/learnings.md
35
+ - Don't just note it -- save it so it doesn't happen again
36
+
37
+ ### 4. Stale branches?
38
+
39
+ ```bash
40
+ git branch -r | wc -l
41
+ ```
42
+ Excessive remote branches (> 30) -> consider cleanup of merged branches.
43
+
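The branch check above can be scripted. The following is a minimal sketch, not part of the package: it assumes the output of `git branch -r` has already been captured as text, and the function names are hypothetical. Unlike a plain `wc -l`, it skips the symbolic `origin/HEAD -> ...` pointer line so that line is not counted as a branch.

```python
def count_remote_branches(listing: str) -> int:
    """Count branches in the text printed by `git branch -r`."""
    count = 0
    for line in listing.splitlines():
        name = line.strip()
        # Skip blank lines and the symbolic "origin/HEAD -> origin/main" pointer.
        if not name or "->" in name:
            continue
        count += 1
    return count


def needs_cleanup(listing: str, threshold: int = 30) -> bool:
    """True when the branch count exceeds the checklist threshold (> 30)."""
    return count_remote_branches(listing) > threshold
```

The threshold is a parameter because the skill states 30 as a rule of thumb rather than a hard limit.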
44
+ ### 5. Uncovered issues?
45
+
46
+ Any recently closed issues with 0 comments? -> missing evidence comment -> add it now.
47
+
48
+ ### 6. Context health?
49
+
50
+ After a Critical task or 3+ agent dispatches: consider context compression or a new session. Stacking Critical tasks in one context window degrades output quality.
51
+
52
+ ### 7. Dead code sweep
53
+
54
+ Run language-specific linter:
55
+ - **Python**: `ruff check --select F401,F811,F841 . && vulture . --min-confidence 80`
56
+ - **TypeScript**: `npx knip` or manual grep for unused exports
57
+ - **Kotlin**: Detekt `UnusedPrivateMember` + `UnusedImport`
58
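+ - **Go**: `go vet ./...` (flags suspicious constructs but not unused functions; `staticcheck ./...` reports unused code under check U1000)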
59
+ - **Rust**: `cargo clippy -- -W unused`
60
+
61
+ ### 8. Map update (v5)
62
+
63
+ Has the project map (.ciel/map.json) been updated with new modules, key files, or patterns discovered during this task?
64
+ - If exploration happened -> update map
65
+ - If new ADR was written -> reference it in map
66
+
67
+ ### 9. Parking lot (v5)
68
+
69
+ Were any tangential discoveries made during the task?
70
+ - Yes -> note in .ciel/parking.md
71
+ - Don't act on them now -- just note them
72
+
73
+ ### 10. Boy-scout rule (v5)
74
+
75
+ Did you leave the code better than you found it?
76
+ - Minor improvements count: better naming, removed dead code, added missing test, improved error message
77
+ - If you only modified what was required and nothing else -> acceptable but note it
78
+
79
+ ## Output format
80
+
81
+ ```
82
+ ## REFLECTION
83
+
84
+ 1. Depth match: <match | over/under-processed>
85
+ 2. New failure mode: <none | detected>
86
+ 3. User correction: <none | captured in .ciel/learnings.md>
87
+ 4. Stale branches: <N branches | cleanup recommended>
88
+ 5. Uncovered issues: <none | #N needs evidence comment>
89
+ 6. Context health: <N% | compact recommended>
90
+ 7. Dead code: <0 findings | N fixed>
91
+ 8. Map update: <up-to-date | needs update>
92
+ 9. Parking: <none | N notes added>
93
+ 10. Boy-scout: <improved | status quo>
94
+
95
+ ### ACTION ITEMS
96
+ - <list or "none">
97
+ ```
98
+
99
+ ## How to verify
100
+
101
+ - [ ] All 10 checks completed?
102
+ - [ ] Action items listed, or "none" stated explicitly?
103
+ - [ ] User corrections captured (if any)?
104
+ - [ ] Stale branches flagged (if any)?
105
+ - [ ] Map checked for updates?
106
+ - [ ] Parking lot entries noted?
107
+
108
+ ## Key rules
109
+
110
+ - **Always non-blocking**: reflection never blocks the commit/push
111
+ - **Persist, don't just report**: items 2 and 3 must trigger learnings capture
112
+ - **Map update (item 8) is mandatory after exploration**: without it, the next session starts with a stale map
@@ -0,0 +1,166 @@
1
+ ---
2
+ name: modern-patterns-checker
3
+ description: Scans proposed or existing code for obsolete patterns that LLMs tend to reproduce from stale training data — React class components instead of hooks, sync-when-async-is-standard, callback hell, Python 2 idioms, old Go error handling, jQuery in React codebases, CommonJS in ESM projects. Flags each anti-pattern with the 2026 canonical replacement and a link to the migration note. Referenced by ThoughtWorks Technology Radar April 2026.
4
+ allowed-tools: Read, Grep, Glob, Bash
5
+ ---
6
+
7
+ # modern-patterns-checker — Don't ship 2019-era code in 2026
8
+
9
+ LLMs over-weight patterns that dominated their training set years ago. Without a guardrail, React class components, callback-based async, and sync-APIs-in-async-codebases keep leaking into new PRs. ThoughtWorks 2026 calls this "cognitive debt from AI autocompletion."
10
+
11
+ ---
12
+
13
+ ## Inputs (infer before asking — see orchestrator's Autonomy protocol)
14
+
15
+ ```
16
+ CODE_UNDER_REVIEW: [file paths OR diff hunk]
17
+ TARGET_STACK: [language + framework + version — resolved from package manifests]
18
+ ```
19
+
20
+ ### Auto-inference sources (exhaust BEFORE asking the user)
21
+
22
+ - **CODE_UNDER_REVIEW** → `git diff main...HEAD` for the branch under review; fall back to `git diff HEAD~1` for the latest commit; or the user-named file(s).
23
+ - **TARGET_STACK** → read `package.json` / `pyproject.toml` / `go.mod` / `Cargo.toml`; derive framework from dependencies (`react`, `vue`, `svelte`, `fastapi`, `django`, etc.). Read `tsconfig.json` / `pyproject.toml` for strictness settings. Cross-check with `ciel-overlay.md`.
24
+
25
+ Never ask the user for either. Both are deterministically inferable.
26
+
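As an illustration of the TARGET_STACK inference for a JavaScript project, here is a minimal sketch under stated assumptions: it reads only `package.json`, the framework list is limited to the three JS names mentioned above, and the function name and returned keys are hypothetical rather than anything the skill defines.

```python
import json

# Hypothetical marker list; the skill names these frameworks as examples only.
JS_FRAMEWORKS = ("react", "vue", "svelte")


def infer_js_stack(package_json_text: str) -> dict:
    """Derive framework, ESM flag, and Node constraint from package.json text."""
    manifest = json.loads(package_json_text)
    deps = {}
    deps.update(manifest.get("dependencies", {}))
    deps.update(manifest.get("devDependencies", {}))
    frameworks = [name for name in JS_FRAMEWORKS if name in deps]
    return {
        "frameworks": frameworks,
        "versions": {name: deps[name] for name in frameworks},
        # "type": "module" is what makes require() an anti-pattern in the catalogue.
        "esm": manifest.get("type") == "module",
        # May be None when the manifest does not pin a Node range.
        "node": manifest.get("engines", {}).get("node"),
    }
```

A fuller version would also read `pyproject.toml`, `go.mod`, and `Cargo.toml` and cross-check against `ciel-overlay.md`, as the list above describes.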
27
+ ---
28
+
29
+ ## Anti-pattern catalogue (2026)
30
+
31
+ ### TypeScript / JavaScript
32
+
33
+ | Anti-pattern | Canonical 2026 replacement |
34
+ |---|---|
35
+ | `class Foo extends React.Component` | Functional component + hooks |
36
+ | `componentDidMount / componentDidUpdate` | `useEffect` (or Server Component for data fetching) |
37
+ | `.then().catch()` chains > 2 links | `async/await` with `try/catch` |
38
+ | `require()` in a project with `"type":"module"` | `import` (ESM) |
39
+ | `var` | `const` / `let` |
40
+ | `null`-checks everywhere | Discriminated unions + `?.` / `??` |
41
+ | `any` as escape hatch | `unknown` + narrowing, or proper type |
42
+ | `lodash.get` / `lodash.set` | Optional chaining `?.` + `??` |
43
+ | `fetch().then(r => r.json()).then(...)` | `await fetch()` + `await r.json()` |
44
+ | `moment.js` | `Temporal` API (Node 22+) or `date-fns` |
45
+ | Redux for local UI state | `useState` / `useReducer` / Zustand |
46
+ | PropTypes | TypeScript types |
47
+
48
+ ### Python
49
+
50
+ | Anti-pattern | Canonical 2026 replacement |
51
+ |---|---|
52
+ | `print` as debug | `logging` with structured fields |
53
+ | `%`-format or `.format()` | f-strings |
54
+ | `dict.has_key(k)` | `k in dict` |
55
+ | Nested `if` guards | Early-return pattern |
56
+ | Bare `except:` | `except SpecificError:` |
57
+ | `os.path.join` | `pathlib.Path` |
58
+ | Sync `requests` in async codebase | `httpx.AsyncClient` / `aiohttp` |
59
+ | `dataclass` without `slots=True` | `@dataclass(slots=True)` (3.10+) |
60
+ | `typing.List`, `typing.Dict` | Built-in `list`, `dict` (3.9+ PEP 585) |
61
+ | `from typing import Optional` | `X \| None` (3.10+ PEP 604) |
62
+
63
+ ### Go
64
+
65
+ | Anti-pattern | Canonical 2026 replacement |
66
+ |---|---|
67
+ | `if err != nil { return err }` without wrapping | `fmt.Errorf("context: %w", err)` |
68
+ | Bare `err == sql.ErrNoRows` | `errors.Is(err, sql.ErrNoRows)` |
69
+ | Passing request context implicitly | Explicit `ctx context.Context` first arg |
70
+ | `interface{}` | `any` (Go 1.18+), or typed interface |
71
+ | `sync.Mutex` wrapping a slice | `sync.Map` or channel |
72
+
73
+ ### SQL
74
+
75
+ | Anti-pattern | Canonical 2026 replacement |
76
+ |---|---|
77
+ | String concatenation for queries | Parameterized queries / prepared statements |
78
+ | `SELECT *` in production queries | Explicit column list |
79
+ | `N+1` loop queries | JOIN or batched `IN (...)` |
80
+ | Missing indexes on FK | Index on every foreign key |
81
+
82
+ ### React (post-19)
83
+
84
+ | Anti-pattern | Canonical 2026 replacement |
85
+ |---|---|
86
+ | `useEffect` for data fetching | Server Components, `use()`, or TanStack Query |
87
+ | `useState` for derived values | `useMemo` or compute inline |
88
+ | Prop-drilling > 3 levels | Context, composition, or state library |
89
+ | Manual form state | `react-hook-form` or native `<form>` actions |
90
+
91
+ ---
92
+
93
+ ## Detection method
94
+
95
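+ 1. **Regex pass** (fast) — grep for obvious markers: `extends Component`, `componentDidMount`, `require\(`, `\bvar\b`, `: any\b`, `\.then\(.*\)\.then\(`, etc. The patterns are written as escaped regexes; a bare `any` or `.then(.*).then(` would match far more than intended.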
96
+ 2. **AST pass** (accurate, optional) — if `tsc` / `ruff` / `go vet` configured in the repo, run with strict rules.
97
+ 3. **Context pass** — read `tsconfig.json`, `pyproject.toml`, `go.mod` to confirm the stack is modern enough to allow the replacement. Don't suggest `Temporal` if Node is pinned to 18.
98
+
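The regex pass in step 1 might look like the sketch below. The patterns cover only a handful of catalogue entries, and the labels and function name are hypothetical. Matching is line by line, so a `.then()` chain split across several lines is missed; that is the kind of case the AST pass is meant to catch.

```python
import re

# Illustrative patterns for a few catalogue entries; labels are hypothetical.
MARKERS = [
    ("class component", re.compile(r"\bextends\s+(?:React\.)?Component\b")),
    ("lifecycle method", re.compile(r"\bcomponentDid(?:Mount|Update)\b")),
    ("CommonJS require", re.compile(r"\brequire\(")),
    ("var declaration", re.compile(r"\bvar\s")),
    ("any annotation", re.compile(r":\s*any\b")),
    ("chained .then()", re.compile(r"\.then\(.*\)\s*\.then\(")),
]


def regex_pass(source: str) -> list[tuple[int, str]]:
    """Return (line_number, label) for each marker hit, in file order."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        for label, pattern in MARKERS:
            if pattern.search(line):
                hits.append((number, label))
    return hits
```

Anchoring `any` to a type annotation (`: any`) and `var` to a word boundary keeps identifiers such as `company` or `variant` from producing false positives.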
99
+ ---
100
+
101
+ ## Report format
102
+
103
+ ```
104
+ ## MODERN-PATTERNS VERDICT
105
+
106
+ ### Findings
107
+ [WARN] components/Profile.tsx:24 — class component
108
+ Replacement: functional + hooks
109
+ Migration: react.dev/reference/react/Component#alternatives
110
+
111
+ [WARN] lib/api.ts:55-70 — .then() chain (3 links)
112
+ Replacement: async/await
113
+ Rationale: readability + stack traces
114
+
115
+ [INFO] tests/user.test.ts:8 — `any` as escape hatch
116
+ Replacement: `unknown` + narrowing, or proper User type
117
+ Rationale: loses type safety in test-critical code
118
+
119
+ ### Stack-compatibility confirmed
120
+ - Node: 22.3 ✓ allows Temporal
121
+ - TS: 5.5 ✓ allows `satisfies` operator
122
+ - React: 19.0.2 ✓ allows Server Components
123
+
124
+ ### Summary
125
+ BLOCK: 0 (must fix)
126
+ WARN: 2 (strongly advised)
127
+ INFO: 1 (opportunistic)
128
+ ```
129
+
130
+ ---
131
+
132
+ ## Guardrails
133
+
134
+ - **Verify stack before recommending** — suggesting `Temporal` on Node 18 wastes a review cycle.
135
+ - **Don't aggregate-rewrite legacy** — flag, don't refactor wholesale. A single migration is a PR, not a silent edit.
136
+ - **Repo-level opt-outs respected** — if `.eslintrc` deliberately allows `var` or a deprecated pattern (grandfather clause for a legacy module), note and skip.
137
+ - **Citation required** — every suggestion links to the official migration doc or the MDN/React/Python guide. No link → drop the suggestion.
138
+ - **BLOCK only for compile-breaking or security-sensitive** — class components don't BLOCK a working PR; a missing parameterized query DOES.
139
+ - **Stop at 10 findings per file** — above 10, return "file needs a dedicated modernization task" rather than a linter dump.
140
+
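Two of the guardrails above are mechanical enough to sketch in code: dropping findings that carry no citation, and collapsing a file with more than 10 findings into a single entry. The finding shape (`file`, `severity`, `message`, `link`) and the function name are assumptions made for this example, and using INFO for the collapsed entry is likewise a choice of the sketch, not something the skill specifies.

```python
from collections import defaultdict

MAX_FINDINGS_PER_FILE = 10  # threshold taken from the guardrail above


def apply_guardrails(findings: list[dict]) -> list[dict]:
    """Drop uncited findings, then collapse files with more than 10 findings."""
    by_file = defaultdict(list)
    for finding in findings:
        # Citation required: a finding without a link is dropped.
        if finding.get("link"):
            by_file[finding["file"]].append(finding)

    report = []
    for path, items in by_file.items():
        if len(items) > MAX_FINDINGS_PER_FILE:
            # Above the cap, return one pointer instead of a linter dump.
            report.append({
                "file": path,
                "severity": "INFO",
                "message": "file needs a dedicated modernization task",
            })
        else:
            report.extend(items)
    return report
```

The citation filter runs before the count, so uncited suggestions do not push a file over the cap.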
141
+ ---
142
+
143
+ ## How to verify
144
+
145
+ - [ ] Anti-pattern catalogue checked for each language in stack?
146
+ - [ ] Each finding has 2026 canonical replacement?
147
+ - [ ] Stack compatibility confirmed?
148
+ - [ ] VERDICT issued (CLEAN / FINDINGS)?
149
+ - [ ] Migration notes provided for each finding?
150
+
151
+ ## When triggered
152
+
153
+ - CODEBASE step after `explorer` reads the target files
154
+ - `@ciel-explorer` dispatched for PR review
155
+ - Before accepting LLM-generated code in a legacy codebase (high drift risk)
156
+ - After `@ciel-researcher` validates an API — this skill confirms the call site uses modern idioms
157
+
158
+ ---
159
+
160
+ ## References
161
+
162
+ - ThoughtWorks Technology Radar April 2026 — "curated shared instructions" volume
163
+ - React 19 migration guide — react.dev/blog/2024/04/25/react-19
164
+ - PEP 585 / PEP 604 — Python builtin-generics + union syntax
165
+ - Go 1.18 — `any` alias, generics
166
+ - MDN Async/Await — developer.mozilla.org/en-US/docs/Learn/JavaScript/Asynchronous
@@ -0,0 +1,108 @@
1
+ ---
2
+ name: pattern-fitness-check
3
+ description: For every existing code pattern being considered for reuse, applies a 3-question fitness check (same problem? same constraints? same data shape?) before copying. Flags HUBs (high fan-in files), duplication candidates, and prior AI-generated patterns that contradict official docs. Invoked during the CODEBASE step of every Ciel task.
4
+ allowed-tools: Read, Grep, Glob, Bash
5
+ ---
6
+
7
+ # pattern-fitness-check — Don't copy patterns blindly
8
+
9
+ Part of CRÉER step 5 (CODEBASE). Pattern-matching without fitness checking is the single most common LLM coding failure (per Ciel's Guards table).
10
+
11
+ ---
12
+
13
+ ## 3-question fitness check
14
+
15
+ For EACH pattern considered for reuse, answer all 3:
16
+
17
+ 1. **Same problem?** — What problem did this pattern solve originally? (git blame the commit)
18
+ - If the pattern was written for use case A and you're facing use case B → NOT the same problem.
19
+
20
+ 2. **Same constraints?** — Volume, transport, sync/async, batch/single, cardinality
21
+ - Pagination pattern written for 1k items might fail at 100M items.
22
+ - Sync validation pattern might not fit async flow.
23
+ - REST pagination pattern doesn't fit WebSocket message stream.
24
+
25
+ 3. **Same data shape?** — Is the input/output structure identical?
26
+ - Different field names → adapter needed
27
+ - Different nullable fields → null-safety differs
28
+ - Different ordering guarantees → might break downstream
29
+
30
+ → **All yes** → APPLY. **Any no** → ADAPT or DO NOT USE.
31
+
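The decision rule above can be written down as a small function. This is a sketch, not the skill's implementation: the type and function names are hypothetical, and `can_adapt` is an assumed flag standing in for the reviewer's judgment, since the skill does not say how to choose between ADAPT and DO NOT USE.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FitnessAnswers:
    same_problem: bool
    same_constraints: bool
    same_data_shape: bool


def fitness_verdict(answers: FitnessAnswers, can_adapt: bool) -> str:
    """Map the three answers to APPLY, ADAPT, or DO NOT USE."""
    if answers.same_problem and answers.same_constraints and answers.same_data_shape:
        return "APPLY"
    # Any "no": the choice between ADAPT and DO NOT USE is a judgment call;
    # `can_adapt` is a hypothetical flag standing in for that judgment.
    return "ADAPT" if can_adapt else "DO NOT USE"
```

For example, a REST pagination helper considered for a WebSocket stream fails the constraints question, so it can never come back as APPLY regardless of the other two answers.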
32
+ ---
33
+
34
+ ## Additional checks
35
+
36
+ ### Prior AI-generated patterns
37
+
38
+ Treat existing code written during a prior AI session as a **suggestion, not law**. If it contradicts current official docs → likely an inherited anti-pattern. Flag and do not follow.
39
+
40
+ Signal: code with unusual structure, comments like `// AI-suggested` or `// TODO: verify this approach`.
41
+
42
+ ### Duplication check
43
+
44
+ If 2+ copies of the pattern you're about to write ALREADY EXIST → extract a shared helper FIRST, then use it.
45
+
46
+ ```bash
47
+ # Find similar patterns
48
+ grep -rn "fun <functionName>" --include='*.kt' src/
49
+ ```
50
+
51
+ ### Mini repo-map (3 greps)
52
+
53
+ For impacted files, build a minimal map:
54
+
55
+ 1. **Signatures** — `grep -n "^fun \|^class \|^interface \|^object " <file>`
56
+ 2. **Dependents** — `grep -rln "import .*<filename>" src/`
57
+ 3. **Hub check** — if step 2 returns 5+ files → **HUB WARNING**: changes ripple widely, proceed with caution
58
+
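Steps 2 and 3 can be approximated in a few lines once file contents are in memory. The sketch below assumes a hypothetical mapping of path to source text and mirrors the `import .*<filename>` grep; a real run would read files from `src/`, and the default threshold of 5 is the one stated in step 3.

```python
import re

HUB_THRESHOLD = 5  # default from the hub check; adjust per project size


def find_dependents(sources: dict[str, str], name: str) -> list[str]:
    """Return paths that contain an import line mentioning `name`."""
    pattern = re.compile(rf"^[ \t]*import\s.*{re.escape(name)}", re.MULTILINE)
    return sorted(path for path, text in sources.items() if pattern.search(text))


def is_hub(sources: dict[str, str], name: str, threshold: int = HUB_THRESHOLD) -> bool:
    """True when `name` has at least `threshold` importing files (HUB WARNING)."""
    return len(find_dependents(sources, name)) >= threshold
```

Like the grep it mirrors, this is a text match: it will count an import of `DateUtilTest` as a dependent of `DateUtil`, so treat the result as an upper bound.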
59
+ ---
60
+
61
+ ## Output format
62
+
63
+ ```
64
+ ## PATTERN FITNESS
65
+
66
+ ### Patterns considered
67
+ - APPLY: <pattern at file:line> — same problem ✓ same constraints ✓ same shape ✓
68
+ - ADAPT: <pattern at file:line> — <what differs> → <how to adapt>
69
+ - DO NOT USE: <pattern at file:line> — <reason>
70
+
71
+ ### Mini repo-map
72
+ - Impacted files: <list>
73
+ - Key signatures: <func/class at file:line>
74
+ - Dependents (1 hop): <list>
75
+ - Hub check: <NO — safe | YES — N files, changes ripple>
76
+
77
+ ### Duplication check
78
+ - [None / Found N copies at file:line — extract helper first]
79
+
80
+ ### Prior AI patterns
81
+ - [None / Flagged: <file:line> contradicts <doc URL> — do not follow]
82
+ ```
83
+
84
+ ---
85
+
86
+ ## Guardrails
87
+
88
+ - **Git blame mandatory** for "same problem?" — don't rely on current code reading. Read the commit message where the pattern was introduced.
89
+ - **Numeric constraints**: quantify "volume" — "1k items" vs "1M items" matters. Don't say "big" or "small".
90
+ - **HUB threshold**: 5+ importers is the default; adjust per project size. A core util imported by 50+ files is extremely high-ripple — needs cross-team coordination.
91
+ - **Don't over-adapt**: if the adaptation diverges from the original by more than 50 lines, write new code instead. At that point adapting no longer saves effort.
92
+
93
+ ---
94
+
95
+ ## How to verify
96
+
97
+ - [ ] 3-question fitness check applied (same problem? same constraints? same data shape?)
98
+ - [ ] Prior AI-generated patterns flagged?
99
+ - [ ] Duplication check performed (≥ 2 copies)?
100
+ - [ ] Mini repo-map generated (impacted files, key signatures, dependents)?
101
+ - [ ] Hub check performed (high fan-in files)?
102
+
103
+ ## When triggered
104
+
105
+ - Standard/Critical tasks, during CODEBASE step
106
+ - Trivial tasks, if the fix is "use an existing pattern" (quickly — 1 pattern, 1 fitness check)
107
+ - When user says "we already have code for this" or "reuse X"
108
+ - When `explorer` agent identifies a candidate pattern
@@ -0,0 +1,98 @@
1
+ ---
2
+ name: playwright-visual-critic
3
+ description: How to review UI visually using Playwright MCP — launch dev server, capture accessibility tree (not screenshots), check layout/contrast/focus/responsive at multiple viewports, and produce structured findings. Prefers accessibility-tree analysis over pixel screenshots (deterministic, 2-5KB vs 100KB+). Requires Playwright MCP configured.
4
+ allowed-tools: Read, Grep, Glob, Bash
5
+ ---
6
+
7
+ # Playwright Visual Critique — See Before Shipping UI
8
+
9
+ ## What this covers
10
+
11
+ How to visually review UI using Playwright MCP. It targets UI bugs that code review cannot see: clipped text, contrast failures, broken focus order, mobile overflow. The 2026 pattern is NOT "screenshot → vision model"; it's "accessibility tree → structured critique", which is 20-50x cheaper and more accurate.
12
+
13
+ ## Core principle
14
+
15
+ **Accessibility tree first, screenshots last.** Tree is deterministic, cheap, and doesn't break on font/rendering differences. Screenshots are brittle and expensive to analyze.
16
+
17
+ ## Prerequisites
18
+
19
+ Playwright MCP must be installed:
20
+
21
+ ```bash
22
+ # One-time setup
23
+ claude mcp add playwright --transport stdio -- npx @playwright/mcp@latest
24
+ ```
25
+
26
+ Verify with: `claude mcp list | grep playwright`.
27
+
28
+ If not installed → STOP and instruct the user to run the command above. Do not attempt to critique without it.
29
+
30
+ ## Methodology
31
+
32
+ ### Capture via Playwright MCP
33
+
34
+ 1. **`browser_navigate`** — navigate to target URL
35
+ 2. **`browser_resize`** — for each viewport (375 mobile, 768 tablet, 1440 desktop)
36
+ 3. **`browser_snapshot`** — accessibility tree (structured YAML/JSON)
37
+ 4. **`browser_take_screenshot`** — only if visual regression check needed (cost optimization)
38
+ 5. **`browser_console_messages`** — check for JS errors / a11y violations
39
+
40
+ ### Visual critique checklist
41
+
42
+ **Layout**
43
+ - No horizontal overflow (no element with `scrollable: true` on x-axis for main content)
44
+ - No clipped text (elements with `hidden: true` while `expected: visible`)
45
+ - No zero-size interactive elements (touch targets ≥ 24×24px per WCAG 2.5.8)
46
+
47
+ **Contrast & color**
48
+ - Text contrast ≥ 4.5:1 (normal text) / 3:1 (large text)
49
+ - Color is not the sole signal (error states have icon/text, not just red)
50
+
51
+ **Keyboard & focus**
52
+ - Every interactive element has `focusable: true`
53
+ - Focus order matches visual order (tree `focus_index` is monotonic)
54
+ - No focus trap unless intentional (modal dialogs)
55
+ - `:focus-visible` ring is present (no `outline: none` without alternative)
56
+
57
+ **Responsive**
58
+ - At 375px width, primary content fits without zoom
59
+ - Navigation collapses to mobile pattern (drawer / bottom-nav) — not truncated desktop nav
60
+
61
+ **Semantic structure**
62
+ - One `<h1>` per page
63
+ - `<main>`, `<nav>`, `<header>`, `<footer>` landmarks present
64
+ - Form inputs have associated labels (tree `label_id` populated)
65
+
66
+ **Console**
67
+ - No JS errors
68
+ - No React / Vue / Svelte warnings
69
+ - No axe-core violations (if integrated)
70
+
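The contrast thresholds in the checklist above (4.5:1 for normal text, 3:1 for large text) follow the WCAG 2.x contrast-ratio definition, which can be computed directly from two sRGB colors. The sketch below assumes the foreground and background colors are already known as six-digit hex strings; how they are obtained from the page is outside its scope, and the function names are hypothetical.

```python
def _channel(value: int) -> float:
    """Linearize one 8-bit sRGB channel per the WCAG relative-luminance definition."""
    c = value / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of a '#rrggbb' color, from 0.0 (black) to 1.0 (white)."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _channel(r) + 0.7152 * _channel(g) + 0.0722 * _channel(b)


def contrast_ratio(foreground: str, background: str) -> float:
    """WCAG contrast ratio: (lighter + 0.05) / (darker + 0.05)."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa(foreground: str, background: str, large_text: bool = False) -> bool:
    """Apply the checklist thresholds: 4.5:1 normal text, 3:1 large text."""
    return contrast_ratio(foreground, background) >= (3.0 if large_text else 4.5)
```

Black on white gives the maximum ratio of 21:1. On a white background, `#767676` passes the normal-text threshold, while the slightly lighter `#777777` passes only as large text.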
71
+ ## Key points
72
+
73
+ - **Do not attempt if MCP not installed** — halt cleanly with install instructions
74
+ - **Timebox**: 3 viewports × 5 minutes analysis = 15 min hard cap
75
+ - **Don't critique Lighthouse perf metrics here** — stay on visual + a11y
76
+ - **Auth-gated pages**: instruct the user to seed a test session cookie first. Do not attempt to handle login credentials
77
+ - **Local dev SSL**: use `--ignore-https-errors` in Playwright MCP config for self-signed certs
78
+
79
+ ## Common anti-patterns
80
+
81
+ 1. **Screenshot-first analysis**: pixel screenshots are brittle, break on font differences, and cost 20-50x more to process
82
+ 2. **No viewport variation**: mobile bugs are the most common and the most missed — always test at 375px
83
+ 3. **Ignoring console output**: JS errors and framework warnings often point to the root cause
84
+ 4. **Critiquing without MCP installed**: will fail silently or produce empty results — always verify prerequisites
85
+
86
+ ## How to verify
87
+
88
+ - **All viewports checked**: mobile, tablet, desktop — no skipped
89
+ - **Findings have selectors**: every issue points to a specific element in the accessibility tree
90
+ - **Severity classified**: BLOCK (broken), WARN (bad practice), INFO (minor)
91
+ - **Console clean**: no JS errors or framework warnings
92
+
93
+ ## References
94
+
95
+ - Playwright MCP — playwright.dev/docs/getting-started-mcp
96
+ - github.com/microsoft/playwright-mcp — official implementation
97
+ - alexop.dev — "Building an AI QA Engineer with Claude Code and Playwright MCP" (real-world pattern)
98
+ - WCAG 2.2 — w3.org/WAI/WCAG22/Understanding/ (AA criteria used in the checklist)