npm - @hegemonart/get-design-done - Versions diffs - 1.46.0 → 1.48.0 - Mend

@hegemonart/get-design-done 1.46.0 → 1.48.0


Files changed (35)

package/.claude-plugin/marketplace.json +2 -2
package/.claude-plugin/plugin.json +1 -1
package/CHANGELOG.md +94 -0
package/README.md +4 -0
package/SKILL.md +2 -1
package/agents/brief-auditor.md +147 -0
package/agents/copy-auditor.md +215 -0
package/agents/design-auditor.md +13 -3
package/agents/design-debt-crawler.md +269 -0
package/agents/design-fixer.md +2 -0
package/agents/quality-gate-runner.md +11 -10
package/dist/claude-code/.claude/skills/brief/SKILL.md +17 -0
package/dist/claude-code/.claude/skills/live/SKILL.md +98 -0
package/dist/claude-code/.claude/skills/quality-gate/SKILL.md +2 -2
package/hooks/gdd-a11y-gate.js +119 -0
package/hooks/hooks.json +8 -0
package/package.json +1 -1
package/reference/brief-quality-rubric.md +98 -0
package/reference/copy-quality.md +135 -0
package/reference/debt-categories.md +148 -0
package/reference/live-mode-integration.md +80 -0
package/reference/registry.json +28 -0
package/reference/schemas/events.schema.json +1 -1
package/reference/schemas/live-session.schema.json +64 -0
package/scripts/lib/live/bandit-feed.cjs +64 -0
package/scripts/lib/live/events.cjs +86 -0
package/scripts/lib/live/harness-mode.cjs +93 -0
package/scripts/lib/live/postcheck.cjs +158 -0
package/scripts/lib/live/runtime.cjs +233 -0
package/scripts/lib/live/scope-guard.cjs +145 -0
package/scripts/lib/live/session-store.cjs +364 -0
package/scripts/lib/manifest/skills.json +8 -0
package/skills/brief/SKILL.md +17 -0
package/skills/live/SKILL.md +98 -0
package/skills/quality-gate/SKILL.md +2 -2

package/agents/design-debt-crawler.md ADDED

@@ -0,0 +1,269 @@
+---
+name: design-debt-crawler
+description: Project-wide retroactive design-debt crawler. Walks the ENTIRE source tree (not STATE.md completed tasks), catalogs raw color literals, anti-pattern hits, untokenized components, contrast and density issues, scores each by priority, and writes the project-scoped .design/debt/DEBT-CATALOG.md. Pure catalog; no auto-fix.
+tools: Read, Bash, Grep, Glob, Write
+color: yellow
+model: inherit
+default-tier: sonnet
+tier-rationale: "Deterministic detection plus structured cataloging; Sonnet balances coverage with cost"
+size_budget: M
+size_budget_rationale: "Worker-tier crawler; 7 debt-class scan procedures plus priority scoring and output contract fit under the 300-line M budget"
+parallel-safe: always
+typical-duration-seconds: 90
+reads-only: false
+writes:
+  - ".design/debt/DEBT-CATALOG.md"
+---
+@reference/shared-preamble.md
+# design-debt-crawler
+## Role
+You are a project-wide retroactive design-debt crawler. You walk the entire source
+tree of an existing or legacy codebase, find design debt, group it by category, score
+each finding by priority, and write a single project-scoped report at
+`.design/debt/DEBT-CATALOG.md`.
+You run once against the whole project, not against one cycle of work. This is the
+defining difference from `design-auditor`: that agent is cycle-scoped and reads the
+pipeline's recently completed work, while you ignore cycle state entirely and survey
+everything that exists on disk right now.
+You are a pure catalog. You do NOT modify source code, you do NOT apply fixes, and you
+do NOT spawn other agents. For every finding you suggest a remediation command the user
+can run later; you never run it yourself.
+## CRITICAL: Project-Wide Scope, Not Cycle Scope
+**You do NOT read `.design/STATE.md` `<completed_tasks>`.** You do not scope to the
+current cycle, the current wave, or any recently touched file list. Your scope is the
+whole source tree.
+- You **walk the entire codebase**, every source file under the configured source roots
+  (default `src/`), regardless of when it was last changed or whether any GDD cycle ever
+  touched it.
+- You write to a **project-scoped** path: `.design/debt/DEBT-CATALOG.md`. This is not a
+  cycle artifact and is not placed under any cycle directory.
+- You may read `.design/STATE.md` only to learn the `source_roots` value. You ignore its
+  `<completed_tasks>`, `<position>`, `wave`, and `cycle` fields for scoping. If STATE.md
+  is absent, default the source root to `src/` and proceed.
+If you ever find yourself filtering files by a completed-task list, stop: that is the
+cycle-scoped behavior this agent exists to avoid.
+## Required Reading
+The orchestrating stage supplies a `<required_reading>` block in the prompt. Read every
+listed file before acting. Minimum expected files:
+- @reference/debt-categories.md
+- @reference/anti-patterns.md
+`reference/debt-categories.md` is the taxonomy you classify against and the source of
+the priority-scoring model. `reference/anti-patterns.md` is the BAN-NN and SLOP-NN
+catalog that the anti-pattern class cross-references.
+---
+## Work
+### Step 1: Determine source roots
+Read `source_roots` from `.design/STATE.md` if present; otherwise default to `src/`.
+Build the file list once and reuse it for every scan below.
+```bash
+find src/ -type f \( -name "*.tsx" -o -name "*.jsx" -o -name "*.ts" -o -name "*.js" \
+  -o -name "*.vue" -o -name "*.svelte" -o -name "*.css" -o -name "*.scss" \) 2>/dev/null
+```
+### Step 2: Scan each debt class
+Run one pass per class from `reference/debt-categories.md`. Record `file:line` plus the
+matched text for every hit so each catalog row is traceable.
+**color-literal** (raw color values, not token references):
+```bash
+grep -rEn "#[0-9a-fA-F]{3,8}|rgb\(|rgba\(|hsl\(|hsla\(" src/ \
+  --include="*.tsx" --include="*.jsx" --include="*.css" --include="*.scss" 2>/dev/null
+```
+Exclude the palette or token-definition file (a literal inside a `--x: #hex`
+custom-property definition IS the token). Count distinct literals and total occurrences.
+**anti-pattern** (BAN-NN and SLOP-NN): run the deterministic detector once over the
+tree. It returns every statically matchable rule in one pass with `file`, `line`,
+`ruleId`, and a reference link, offline and with zero model calls.
+```bash
+node "${CLAUDE_PLUGIN_ROOT:-.}/bin/gdd-detect" src/ --json 2>/dev/null || true
+```
+Parse the JSON `findings` array. The detector cannot match the two subjective rules
+(BAN-04 keyboard-action animation, BAN-10 nested equal radius); list those as a
+manual-review note rather than counting them.
+**untokenized-component** (component renders surface without token references):
+```bash
+# arbitrary bracket values + inline hex inside component files
+grep -rEn "\[[0-9]+px\]|\[#[0-9a-fA-F]{3,8}\]" src/ \
+  --include="*.tsx" --include="*.jsx" --include="*.vue" --include="*.svelte" 2>/dev/null
+# token references present in the same file set (for the ratio)
+grep -rEln "var\(--|theme\(" src/ --include="*.tsx" --include="*.jsx" 2>/dev/null
+```
+A component file with literal or bracket hits and no `var(--` reference is untokenized.
+The literal-to-token ratio per file is the strength signal.
+**contrast** (foreground and background pairs below WCAG AA): resolve color pairs that
+share an element or selector, compute the ratio, and flag pairs under 4.5:1 for body
+text or 3:1 for large text and non-text indicators. Pairs built from unresolvable
+runtime values become a manual-review note.
+**density-spacing** (off-scale spacing and inconsistent rhythm):
+```bash
+grep -rEon "(p|px|py|pt|pb|pl|pr|m|mx|my|mt|mb|ml|mr|gap|space-[xy])-[0-9.]+" src/ \
+  --include="*.tsx" --include="*.jsx" 2>/dev/null | sort | uniq -c | sort -rn
+```
+Flag values that are not on the project's modular scale (default 4 / 8 / 12 / 16 / 24 /
+32) and clusters where sibling components use different step counts for one role.
+**typography-drift** (off-scale sizes, too many families, weak weight hierarchy):
+```bash
+grep -rEon "text-[a-z0-9]+|font-(bold|semibold|medium|normal|light)|font-size:[^;]+" \
+  src/ --include="*.tsx" --include="*.jsx" --include="*.css" 2>/dev/null \
+  | sort | uniq -c | sort -rn
+grep -rEn "font-family:|fontFamily" src/ --include="*.css" --include="*.ts" 2>/dev/null
+```
+Flag a long tail of one-off sizes, more than two families, and `font-weight` under 400
+on small text.
+**a11y-text** (text-content accessibility debt):
+```bash
+grep -rPn "<img(?![^>]*\balt=)" src/ --include="*.tsx" --include="*.jsx" 2>/dev/null
+grep -rEn "No data|No results|Nothing here|went wrong|error occurred" src/ \
+  --include="*.tsx" --include="*.jsx" 2>/dev/null
+```
+Flag meaningful images without `alt`, icon-only controls without an accessible name,
+placeholder used as the only label, and generic empty or error copy.
+### Step 3: Group and score
+Group findings by the seven debt classes. For each finding, assign the three priority
+factors from `reference/debt-categories.md`, each on a 1 to 3 scale:
+- **visible-delta** (3 primary surface, 2 secondary, 1 edge or assistive-tech only)
+- **effort** (3 mechanical swap, 2 single-component edit, 1 new token or refactor)
+- **prevalence** (3 ten or more instances, 2 three to nine, 1 one or two)
+Combine by multiplying: `priority = visible-delta × effort × prevalence`, range 1 to 27.
+Sort the catalog by `priority` descending. Break ties by visible-delta, then prevalence.
+### Step 4: Write the catalog
+Create the directory and write the report. Each row suggests a remediation command per
+the ROADMAP open-question default: pure catalog, no auto-fix.
+```bash
+mkdir -p .design/debt
+```
+---
+## Output Format: DEBT-CATALOG.md
+Write to `.design/debt/DEBT-CATALOG.md` using this structure:
+```markdown
+---
+crawled: <ISO 8601 date>
+scope: project-wide
+source_roots: [src/]
+total_findings: N
+note: "Project-scoped retroactive debt catalog. Does NOT read STATE.md completed_tasks. Pure catalog; no auto-fix."
+---
+## Design Debt Catalog
+**Crawled:** <ISO 8601 date>
+**Scope:** Entire source tree (project-wide, not cycle-scoped)
+**Total findings:** N across 7 debt classes
+---
+## Summary by Class
+| Debt class | Findings | Top priority |
+|------------|----------|--------------|
+| color-literal | N | P |
+| untokenized-component | N | P |
+| anti-pattern | N | P |
+| contrast | N | P |
+| density-spacing | N | P |
+| typography-drift | N | P |
+| a11y-text | N | P |
+---
+## Findings (ranked by priority)
+| Priority | Class | Location | Finding | V × E × P | Suggested command |
+|----------|-------|----------|---------|-----------|-------------------|
+| 18 | color-literal | src/Card.tsx:42 | Raw #1a73e8 instead of token | 3×3×2 | `/gdd:fast "replace #1a73e8 with semantic token in Card.tsx"` |
+| 12 | anti-pattern | src/Hero.tsx:8 | BAN-02 gradient text on heading | 3×2×2 | `/gdd:fast "remove BAN-02 gradient text in Hero.tsx"` |
+(One row per finding. The Suggested command column always carries a `/gdd:fast "<finding>"` string.)
+---
+## Manual-Review Notes
+Items the deterministic scans cannot decide on their own:
+- BAN-04 (keyboard-action animation) and BAN-10 (nested equal radius): subjective, not statically matched.
+- Contrast pairs built from unresolvable runtime color values.
+```
+Every finding row MUST carry a `/gdd:fast "<finding>"` suggestion. This agent never
+applies the fix; it only catalogs and suggests.
+---
+## Constraints
+**MUST NOT:**
+- Read `.design/STATE.md` `<completed_tasks>` or scope to any cycle, wave, or task list
+- Modify source code or apply any fix (pure catalog, no auto-fix)
+- Spawn other agents
+- Write to any path other than `.design/debt/DEBT-CATALOG.md`, apart from the `<blocker>` note and the run-end insight line that this file explicitly allows
+- Ask the user questions mid-run (single-shot execution)
+**MAY:**
+- Read any file in the repository
+- Run `grep`, `find`, and `gdd-detect` for static analysis
+- Read `.design/STATE.md` solely to learn `source_roots`
+- Note a `<blocker>` entry in `.design/STATE.md` if the crawl cannot proceed, then still emit the completion marker
+---
+## Record
+At run-end, append one JSONL line to `.design/intel/insights.jsonl`:
+```json
+{"ts":"<ISO-8601>","agent":"<name>","cycle":"<cycle from STATE.md>","stage":"<stage from STATE.md>","one_line_insight":"<what was produced or learned>","artifacts_written":["<files written>"]}
+```
+Schema: `reference/schemas/insight-line.schema.json`.
+## CRAWL COMPLETE
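
Reviewer note on the **contrast** class above: the agent prompt says to "compute the ratio" but does not spell out the arithmetic. A minimal sketch of the standard WCAG 2.x contrast-ratio formula follows, assuming both colors have already been resolved to six-digit `#rrggbb` literals. The function names (`linearize`, `luminance`, `contrastRatio`, `failsAA`) are hypothetical and are not part of the package.

```javascript
'use strict';

// Convert one 8-bit sRGB channel to linear light (WCAG 2.x definition).
function linearize(channel) {
  const c = channel / 255;
  return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

// Relative luminance of a six-digit #rrggbb color.
function luminance(hex) {
  const n = parseInt(hex.replace('#', ''), 16);
  const r = (n >> 16) & 0xff;
  const g = (n >> 8) & 0xff;
  const b = n & 0xff;
  return 0.2126 * linearize(r) + 0.7152 * linearize(g) + 0.0722 * linearize(b);
}

// Contrast ratio: (lighter + 0.05) / (darker + 0.05), from 1 up to 21.
function contrastRatio(fg, bg) {
  const a = luminance(fg);
  const b = luminance(bg);
  const lighter = Math.max(a, b);
  const darker = Math.min(a, b);
  return (lighter + 0.05) / (darker + 0.05);
}

// Hypothetical helper applying the thresholds named in the prompt:
// 4.5:1 for body text, 3:1 for large text and non-text indicators.
function failsAA(fg, bg, { largeOrNonText = false } = {}) {
  return contrastRatio(fg, bg) < (largeOrNonText ? 3 : 4.5);
}

console.log(contrastRatio('#777777', '#ffffff').toFixed(2));
console.log(failsAA('#777777', '#ffffff'));
```

Pairs whose colors only exist at runtime cannot be fed to a function like this, which is why the prompt routes them to the manual-review notes.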

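Reviewer note on Step 3 of `design-debt-crawler.md`: the scoring rule (multiply three 1-to-3 factors, sort descending, break ties by visible-delta and then prevalence) can be restated as a small comparator. This is only a sketch of the rule as written in the prompt. The agent is a model following instructions, not this code, and the field names (`visibleDelta`, `effort`, `prevalence`) and the `src/Nav.tsx` row are assumptions made for illustration.

```javascript
'use strict';

// Each factor is an integer from 1 to 3, so the product ranges from 1 to 27.
function priority(finding) {
  return finding.visibleDelta * finding.effort * finding.prevalence;
}

// Sort by priority descending; break ties by visible-delta, then prevalence.
function rankFindings(findings) {
  return findings
    .map((f) => ({ ...f, priority: priority(f) }))
    .sort((a, b) =>
      b.priority - a.priority ||
      b.visibleDelta - a.visibleDelta ||
      b.prevalence - a.prevalence);
}

const ranked = rankFindings([
  { location: 'src/Hero.tsx:8', visibleDelta: 3, effort: 2, prevalence: 2 },
  { location: 'src/Card.tsx:42', visibleDelta: 3, effort: 3, prevalence: 2 },
  { location: 'src/Nav.tsx:5', visibleDelta: 2, effort: 3, prevalence: 2 }, // hypothetical row
]);

console.log(ranked.map((f) => `${f.priority} ${f.location}`));
```

The first two rows reproduce the 3×3×2 = 18 and 3×2×2 = 12 examples from the catalog template. The third row also scores 12 and lands last because its visible-delta is lower.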
package/agents/design-fixer.md CHANGED

@@ -25,6 +25,8 @@ You have zero session memory. Every invocation starts fresh. The orchestrating s
 **Scope of work:** You apply targeted source-code fixes for gaps listed in `.design/DESIGN-VERIFICATION.md ## Phase 5 — Gaps`. You commit one fix per gap. You do nothing else.
+**Accessibility failures route here too.** When the quality-gate skill classifies a failure into the `a11y` bucket (sourced from axe / pa11y / lighthouse / jsx-a11y runs), it spawns you with that failure exactly like a `lint`, `type`, `test`, or `visual` failure. Treat an `a11y` classified failure as a normal in-scope fix: read the cited rule, apply the minimal source change that clears the violation (a missing label, an aria attribute, a contrast token), confirm the fix, and commit one fix per gap. No special handling beyond the standard fix sequence below.
 **What you MUST NOT touch:**
 - `DESIGN-PLAN.md` - locked during verify
 - `DESIGN-CONTEXT.md` - locked during verify

package/agents/quality-gate-runner.md CHANGED

@@ -1,11 +1,11 @@
 ---
 name: quality-gate-runner
-description: "Cheap Haiku classifier that ingests {command, exit_code, stderr} tuples from the quality-gate skill's parallel run and emits a JSON verdict - pass/fail plus per-bucket failure groupings (lint / type / test / visual). Read-only. Does not run commands itself."
+description: "Cheap Haiku classifier that ingests {command, exit_code, stderr} tuples from the quality-gate skill's parallel run and emits a JSON verdict - pass/fail plus per-bucket failure groupings (lint / type / test / visual / a11y). Read-only. Does not run commands itself."
 tools: Read, Bash, Grep
 color: amber
 model: inherit
 default-tier: haiku
-tier-rationale: "Pattern-match exit codes and bucket stderr into four named categories - no synthesis, no rewrites, no spawning. Belongs on Haiku to keep classification cost trivial relative to the actual command runs."
+tier-rationale: "Pattern-match exit codes and bucket stderr into five named categories - no synthesis, no rewrites, no spawning. Belongs on Haiku to keep classification cost trivial relative to the actual command runs."
 size_budget: S
 parallel-safe: always
 typical-duration-seconds: 5
@@ -48,16 +48,17 @@ You may also receive a `stdout` field per entry (forward-compat - the skill plan
 ## Bucketing rule
-Map each command to exactly one of four buckets based on the verbatim command string. Use case-insensitive substring match against the command line:
+Map each command to exactly one of five buckets based on the verbatim command string. Use case-insensitive substring match against the command line:
 | Substring (case-insensitive) | Bucket |
 |------------------------------|--------|
-| `lint`, `eslint`, `stylelint`, `biome lint` | `lint` |
-| `typecheck`, `tsc`, `tsc --noemit`, `flow check` | `type` |
-| `test` (but NOT one of the visual matches below - visual wins) | `test` |
+| `axe`, `pa11y`, `lighthouse`, `jsx-a11y`, `eslint-plugin-jsx-a11y` | `a11y` |
 | `chromatic`, `test:visual`, `loki test`, `playwright test --grep visual` | `visual` |
+| `typecheck`, `tsc`, `tsc --noemit`, `flow check` | `type` |
+| `lint`, `eslint`, `stylelint`, `biome lint` | `lint` |
+| `test` (only when none of the buckets above match) | `test` |
-When a command matches multiple substrings (e.g., `npm run test:visual` matches both `test` and `test:visual`), `visual` wins. If a command matches none, bucket it under `test` (catch-all - most user-supplied custom commands are test-like). Do not invent a fifth bucket.
+Match precedence runs top-down: check `a11y` first, then `visual`, then `type`, then `lint`, then `test`. A command can match more than one substring (`npm run test:visual` matches both `test` and `test:visual`, and `eslint-plugin-jsx-a11y` matches both `lint` and `jsx-a11y`); the first bucket in precedence order wins, so `a11y` beats `lint` and `visual` beats `test`. If a command matches none, bucket it under `test` (catch-all - most user-supplied custom commands are test-like). These five buckets (`lint`, `type`, `test`, `visual`, `a11y`) are the complete set; do not invent a sixth bucket.
 ## Pass / fail rule
@@ -96,17 +97,17 @@ Pass example:
 Fail example:
 ```json
-{"status": "fail", "classified_failures": {"type": ["typecheck: error TS2304 in src/x.ts"], "visual": ["chromatic: 2 stories changed"]}}
+{"status": "fail", "classified_failures": {"type": ["typecheck: error TS2304 in src/x.ts"], "visual": ["chromatic: 2 stories changed"], "a11y": ["axe: 3 serious violations on /checkout"]}}
 ```
 Schema:
 - `status` - string enum, one of `"pass" | "fail"`. Note: this is NOT the same enum as the skill's STATE-block status (which also has `timeout` and `skipped`); those two cases are decided by the skill, not by you. You only emit `pass | fail`.
-- `classified_failures` - object. Keys are a subset of `lint | type | test | visual`. Values are arrays of short summary strings (≤ 120 chars each). The object is `{}` (empty) when `status === "pass"`.
+- `classified_failures` - object. Keys are a subset of `lint | type | test | visual | a11y`. Values are arrays of short summary strings (≤ 120 chars each). The object is `{}` (empty) when `status === "pass"`.
 ## Constraints
 - **Do not** read `stderr` content beyond the first non-empty line. The skill keeps the verbatim outputs for the design-fixer; your job is routing, not analysis.
-- **Do not** invent buckets outside the four-name set.
+- **Do not** invent buckets outside the five-name set (`lint | type | test | visual | a11y`).
 - **Do not** ever emit `status: "timeout"` or `status: "skipped"` - those are skill-level statuses, not classifier outputs.
 - **Do not** consult external services or MCP tools. Classification is a pure function of the supplied input.
 - **Do not** exceed `size_budget: S`. If `outputs[*].stderr` is unexpectedly large, prefer to summarize from the first 4 KB of each stderr rather than refuse.
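
Reviewer note on the new bucketing rule: the top-down precedence (`a11y`, then `visual`, `type`, `lint`, `test`) is easy to misread in table form. The sketch below restates it as a deterministic function, using the substrings from the table in this diff. It is illustrative only: the actual classifier is a Haiku-tier agent following the prompt, and `BUCKETS` and `bucketFor` are hypothetical names.

```javascript
'use strict';

// Precedence-ordered table: the first bucket with a matching substring wins.
const BUCKETS = [
  ['a11y', ['axe', 'pa11y', 'lighthouse', 'jsx-a11y', 'eslint-plugin-jsx-a11y']],
  ['visual', ['chromatic', 'test:visual', 'loki test', 'playwright test --grep visual']],
  ['type', ['typecheck', 'tsc', 'tsc --noemit', 'flow check']],
  ['lint', ['lint', 'eslint', 'stylelint', 'biome lint']],
  ['test', ['test']],
];

// Case-insensitive substring match against the verbatim command line.
function bucketFor(command) {
  const cmd = command.toLowerCase();
  for (const [bucket, needles] of BUCKETS) {
    if (needles.some((needle) => cmd.includes(needle))) return bucket;
  }
  return 'test'; // catch-all for commands that match nothing
}

console.log(bucketFor('npm run test:visual'));
console.log(bucketFor('npx eslint --plugin eslint-plugin-jsx-a11y src/'));
```

Because `a11y` is checked before `lint`, the second command lands in `a11y` even though it also contains `eslint`, which matches the precedence described in the prompt.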

package/dist/claude-code/.claude/skills/brief/SKILL.md CHANGED

@@ -108,6 +108,23 @@ Run this final spec-quality pass over `.design/BRIEF.md` before the brief→expl
 - Scope check: nothing in the artifact exceeds (or silently drops) the agreed scope.
 - Ambiguity check: every requirement/decision is specific enough to act on without a follow-up question.
+## Optional brief audit (non-blocking)
+Before the gate, you MAY spawn `agents/brief-auditor.md` via `Task` to grade the brief against the five
+brief anti-patterns (vague verbs, missing audience, immeasurable success criteria, scope creep, missing
+anti-goals). The auditor reads `.design/BRIEF.md` plus `reference/brief-quality-rubric.md` and writes
+advisory findings to `.design/BRIEF-AUDIT.md`. This step is advisory and MUST NOT block the
+brief-to-explore transition.
+If the auditor reports one or more fired anti-patterns, surface a single-line pointer to the user:
+```
+Brief audit flagged N issue(s) - run /gdd:discuss brief to refine, or proceed to explore.
+```
+The user decides. Proceeding to explore with a flagged brief is allowed; the pointer is a nudge, not a gate.
+If the auditor reports no fired anti-patterns, or you skip the audit, continue to the gate unchanged.
 <HARD-GATE>
 Do NOT transition to explore (or invoke `/gdd:explore`) until the brief artifact (default `.design/BRIEF.md`) is committed AND the user has approved it. If this project uses a custom `.design` location, read the artifact path from `.design/STATE.md` rather than assuming the default.
 </HARD-GATE>

package/dist/claude-code/.claude/skills/live/SKILL.md ADDED

@@ -0,0 +1,98 @@
+---
+name: gdd-live
+description: "Live in-browser design mode. The user picks a DOM element on a running dev server (via the Claude Preview MCP), the agent generates N design variants in one batch, they hot-swap in place through HMR or preview_eval using a data-gdd-variant marker, the user accepts or discards, and the whole pick-generate-accept loop persists to .design/live-sessions so it survives a crash or resume. Use when the user wants to iterate on the look of a live component against a real running server, asks to try variants on a page, or runs the live command with a url; falls back to a screenshot-only degraded mode on harnesses without MCP support."
+argument-hint: "[--variants N] [--resume <session-id>] [url]"
+tools: Read, Write, Edit, Bash, Glob, Grep, Task
+user-invocable: true
+---
+# gdd-live - Live In-Browser Design Mode
+Pick a DOM element on a running dev server, generate competing design variants, hot-swap them in place, and accept the winner as a real source edit. Every step persists to `.design/live-sessions/<id>.json` so the session survives a crash or a later resume.
+The browser-side runtime, the harness-mode gate, the session store, the events feed, the post-check, the scope guard, and the bandit feed are all separate modules under `scripts/lib/live/`. This skill describes the loop and names the module that owns each step; it does not import them.
+For the full surface (the Preview MCP tools, the six `live_*` events, the session file, the bandit feed, degraded mode, the scope guard), see `../../reference/live-mode-integration.md`. For the SKILL.md structural contract, see `../../reference/skill-authoring-contract.md`.
+---
+## Arguments
+- `[url]` - the page to drive. Optional. When omitted, detect the dev server and use its root.
+- `--variants N` - how many variants to generate per pick. Default 3.
+- `--resume <session-id>` - reattach to an in-progress session in `.design/live-sessions/`.
+---
+## BOOT
+1. Probe the Preview MCP per `../../connections/preview.md`: `ToolSearch({ query: "Claude_Preview" })`, then `mcp__Claude_Preview__preview_list`. Empty ToolSearch means the MCP is not loaded.
+2. Resolve the harness live mode. The capability signal is `capability_matrix.mcp_support` in `scripts/lib/manifest/harnesses.json`, projected by `scripts/lib/live/harness-mode.cjs` (`liveModeFor(harnessId)`). A `puppeteer` result means full live mode; a `degraded` result means screenshot-only.
+3. If `mcp_support` is false for this harness, or Preview is unavailable, enter DEGRADED mode and say so plainly: variants are generated and captured as static screenshots, with no in-page hot-swap. Skip the INJECT and PICK steps; generate against the file the user names instead.
+4. Detect the dev server. Look for Vite, Next, Bun, or a static server (check `package.json` scripts plus a `preview_list` entry). Record the server descriptor on the session.
+5. Open or create the session via `scripts/lib/live/session-store.cjs` (`.design/live-sessions/<id>.json`). On `--resume`, load the named session (see RESUME).
+---
+## INJECT
+Inject the browser runtime once. Read `RUNTIME_JS` from `scripts/lib/live/runtime.cjs` and evaluate it in the page with `mcp__Claude_Preview__preview_eval`. The runtime is an idempotent IIFE bound to `window.__gddLive`, so a re-inject after navigation rebinds the same singleton rather than stacking listeners. It installs the pick handler and the variant-swap helpers, and stamps the live variant on the element via the `data-gdd-variant` attribute.
+---
+## PICK
+1. Arm the picker (`window.__gddLive.pick()`), then guide the user to click the target element. Use `preview_click` and `preview_inspect` to confirm the element and read its computed styles and bounding box.
+2. Read the pick report back. Its fields are documented in `pickReportShape` (selector, tagName, classList, boundingRect, computedStyle subset, current variant). The selector strategy prefers id, then a data-testid, then a tag plus class plus nth-of-type path.
+3. Emit a `live_pick` event through `scripts/lib/live/events.cjs` and append a `pick` entry to the session.
+---
+## GENERATE (one batch)
+1. Load the relevant Phase 45 canonical reference index FIRST, so variants are grounded in real guidance: the domain index that matches the picked element (for example `../../reference/spatial.md` for layout, `../../reference/interaction.md` for components and a11y, `../../reference/color.md` for color, `../../reference/typography.md` for type, `../../reference/motion.md` for animation).
+2. Generate all N variants in ONE batch (default 3), each a distinct, hypothesis-tagged design direction for the picked element. Do not generate them one at a time.
+3. For each variant: write the change atomically to the implicated source file, then make it live. With HMR running, the file write is enough; otherwise apply the variant in place with `window.__gddLive.swapVariant({ n, style, html })`, which sets `data-gdd-variant="n"` and applies the variant's style or markup.
+---
+## POST-CHECK
+Run the post-check on each variant via `scripts/lib/live/postcheck.cjs`, which invokes `gdd-detect`. Show the findings inline next to each variant. A variant that trips a finding is flagged, NOT auto-rejected: the user still decides. Append a `live_postcheck` event per variant.
+---
+## ACCEPT / DISCARD
+- ACCEPT one variant: apply the chosen variant as the canonical source edit, and revert the others in the page (`window.__gddLive.revert()` on each non-chosen element). Emit a `live_accept` event and feed the outcome to the design-variants bandit via `scripts/lib/live/bandit-feed.cjs` (a dev-time signal). Append an `accept` entry.
+- DISCARD: revert every variant in the page back to its captured original and leave the source untouched. Emit a `live_discard` event and append a `discard` entry.
+Either way, persist the result through `scripts/lib/live/session-store.cjs` before continuing.
+---
+## PERSIST
+Every step (boot, pick, generate, post-check, accept, discard) is written to the session file through `scripts/lib/live/session-store.cjs` as it happens. The on-disk event log uses the `pick`, `generate`, `accept`, `discard` kinds; the telemetry stream uses the six `live_*` event types. Writes are atomic, so an interrupted step never leaves a half-written session.
+---
+## RESUME
+With `--resume <session-id>`, load the named session from `.design/live-sessions/`. Only an `in_progress` session is resumable. Offer the user two choices: continue from the last recorded event (report what that was, for example "last pick was the primary button"), or start fresh (open a new session and leave the old one intact). Never silently replay completed events.
+---
+## SCOPE GUARD
+Never write outside the source files implicated by the picked element. Run every proposed write through `scripts/lib/live/scope-guard.cjs`, which maps the picked selector to its owning source files and rejects edits that fall outside them. If a variant would need a change beyond that scope (a shared token, a parent layout, a new dependency), stop and surface it to the user rather than widening the blast radius.
+## Constraints
+- Do NOT edit files outside the picked element's implicated sources (enforced by the scope guard).
+- Do NOT generate variants one at a time; generate the full batch, then swap.
+- Do NOT auto-reject a variant on a post-check finding; flag it and let the user decide.
+- In DEGRADED mode, state up front that hot-swap is unavailable and fall back to screenshots.
+- Persist before every user-facing prompt so a crash never loses accepted work.
+## LIVE COMPLETE
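
Reviewer note on the PERSIST section: the skill states that session writes are atomic, but the body of `scripts/lib/live/session-store.cjs` is not shown in this view. One common way to get that property in Node is to write to a temporary file and rename it over the target. The sketch below assumes that approach. `writeSessionAtomic` and the session field names (`id`, `status`, `events`) are hypothetical and may not match the real module.

```javascript
'use strict';
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: write the full session to a sibling temp file, then
// rename it over the target. On the same filesystem the rename swaps the file
// in one step, so a reader sees the old session or the new one, never a
// half-written file.
function writeSessionAtomic(filePath, session) {
  fs.mkdirSync(path.dirname(filePath), { recursive: true });
  const tmp = `${filePath}.${process.pid}.tmp`;
  fs.writeFileSync(tmp, JSON.stringify(session, null, 2) + '\n', 'utf8');
  fs.renameSync(tmp, filePath);
}

// Demo in a throwaway directory so nothing in a real project is touched.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'gdd-live-'));
const file = path.join(dir, '.design', 'live-sessions', 'demo.json');

writeSessionAtomic(file, { id: 'demo', status: 'in_progress', events: [{ kind: 'pick' }] });
writeSessionAtomic(file, {
  id: 'demo',
  status: 'in_progress',
  events: [{ kind: 'pick' }, { kind: 'generate' }],
});

console.log(JSON.parse(fs.readFileSync(file, 'utf8')).events.length);
```

If the process dies between the write and the rename, the worst case is a stray `.tmp` file next to an intact previous session, which is the behaviour the RESUME step relies on.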

package/dist/claude-code/.claude/skills/quality-gate/SKILL.md CHANGED

@@ -39,7 +39,7 @@ Read once at start from `.design/config.json` (all optional; defaults in parens)
 Stop at the first tier that produces ≥ 1 command:
 1. **Authoritative config.** If `.design/config.json` has `quality_gate.commands` non-empty, use verbatim.
-2. **Auto-detect from `package.json#scripts`** - match against allowlist: `lint`, `typecheck`, `tsc` (only if `typecheck` absent), `test`, `chromatic`, `test:visual`, `lint:design` (Phase 41 - the `gdd-detect` deterministic anti-pattern gate, alongside `axe`/`pa11y`/`lighthouse`). Exclude by name: `test:e2e`, `test:integration` (if separate `test`), anything starting `dev:`, `build:`, `start:`. Run via `npm run <name>` unless `quality_gate.package_manager` overrides.
+2. **Auto-detect from `package.json#scripts`** - match against allowlist: `lint`, `typecheck`, `tsc` (only if `typecheck` absent), `test`, `chromatic`, `test:visual`, `lint:design` (Phase 41 - the `gdd-detect` deterministic anti-pattern gate), and the accessibility scripts `axe`, `pa11y`, `lighthouse`, `eslint-plugin-jsx-a11y` (or a script named `jsx-a11y`) which classify into the `a11y` bucket. Exclude by name: `test:e2e`, `test:integration` (if separate `test`), anything starting `dev:`, `build:`, `start:`. Run via `npm run <name>` unless `quality_gate.package_manager` overrides.
 3. **Skip with notice.** Emit `quality_gate_skipped` (Step 6) and write a `<run/>` with `status="skipped"`. Verify treats skipped as non-blocking.
 ## Step 2 - Parallel run
@@ -48,7 +48,7 @@ Emit `quality_gate_started`. Spawn each command in a separate `Bash`; collect `{
 ## Step 3 - Classification
-Spawn `quality-gate-runner` agent via `Task` with payload `{outputs: [{command, exit_code, stderr}, ...]}`. Agent returns `{status: "pass"|"fail", classified_failures: {lint, type, test, visual}}`. `pass` → Step 5. `fail` → Step 4.
+Spawn `quality-gate-runner` agent via `Task` with payload `{outputs: [{command, exit_code, stderr}, ...]}`. Agent returns `{status: "pass"|"fail", classified_failures: {lint, type, test, visual, a11y}}`. The `a11y` bucket groups accessibility failures from axe / pa11y / lighthouse / jsx-a11y. `pass` → Step 5. `fail` → Step 4.
 ## Step 4 - Fix loop (D-08)

package/hooks/gdd-a11y-gate.js ADDED

@@ -0,0 +1,119 @@
+#!/usr/bin/env node
+'use strict';
+/**
+ * hooks/gdd-a11y-gate.js — advisory PostToolUse hook for accessibility failures.
+ *
+ * Phase 48 (A11Y-GATE). The quality-gate skill classifies failed command runs
+ * into buckets {lint, type, test, visual, a11y}. When a tool response carries
+ * classified_failures with a non-empty `a11y` bucket, this hook surfaces an
+ * advisory note so the accessibility failures are visible without being buried
+ * in the gate's JSON, and appends a `quality_gate_a11y` event to the cycle's
+ * events.jsonl for observability.
+ *
+ * Contract (mirrors gdd-mcp-circuit-breaker.js):
+ *   - Read stdin JSON (the PostToolUse payload).
+ *   - Inspect payload.tool_response for quality-gate classified_failures.a11y.
+ *   - If present and non-empty: emit an advisory note + append one events.jsonl row.
+ *   - ALWAYS write {continue:true} to stdout and exit 0. This hook never blocks.
+ *
+ * Advisory only: accessibility findings route to design-fixer through the gate's
+ * own fix loop, not through this hook. The hook is observability, not a gate.
+ * Dependency-free Node (fs + path only).
+ */
+const fs = require('fs');
+const path = require('path');
+/**
+ * Pull the `a11y` bucket out of a tool response, tolerating both the shape
+ * where classified_failures sits at the top level and the shape where it is
+ * nested under a `quality_gate` / `result` wrapper. Returns an array of
+ * summary strings (possibly empty) or null when no a11y bucket is present.
+ */
+function extractA11yFailures(toolResponse) {
+  if (!toolResponse || typeof toolResponse !== 'object') return null;
+  const candidates = [
+    toolResponse.classified_failures,
+    toolResponse.quality_gate && toolResponse.quality_gate.classified_failures,
+    toolResponse.result && toolResponse.result.classified_failures,
+  ];
+  for (const cf of candidates) {
+    if (cf && typeof cf === 'object' && Object.prototype.hasOwnProperty.call(cf, 'a11y')) {
+      const bucket = cf.a11y;
+      if (Array.isArray(bucket)) return bucket;
+      // Tolerate a non-array truthy value by coercing to a single-element list.
+      if (bucket) return [String(bucket)];
+      return [];
+    }
+  }
+  return null;
+}
+/** Append one JSONL event row; best-effort, never throws on the persist path. */
+function appendEvent(cwd, row) {
+  try {
+    const eventsPath = path.join(cwd, '.design', 'events.jsonl');
+    fs.mkdirSync(path.dirname(eventsPath), { recursive: true });
+    fs.appendFileSync(eventsPath, JSON.stringify(row) + '\n', 'utf8');
+  } catch {
+    /* observability is best-effort — swallow */
+  }
+}
+/**
+ * Core hook logic. Accepts a parsed payload and returns the decision object
+ * to write to stdout. Exported for unit testing without spawning a process.
+ * Always returns an object whose `continue` field is true.
+ */
+function evaluate(payload, opts = {}) {
+  const cwd = (payload && payload.cwd) || opts.cwd || process.cwd();
+  const toolResponse = payload && payload.tool_response;
+  const a11y = extractA11yFailures(toolResponse);
+  if (!a11y || a11y.length === 0) {
+    return { continue: true };
+  }
+  const count = a11y.length;
+  const note =
+    `gdd-a11y-gate: quality gate reported ${count} accessibility ` +
+    `failure${count === 1 ? '' : 's'} in the a11y bucket. These route to ` +
+    `design-fixer like lint/type/test/visual failures. Findings: ` +
+    a11y.slice(0, 5).join('; ');
+  appendEvent(cwd, {
+    ts: new Date().toISOString(),
+    event: 'quality_gate_a11y',
+    a11y_failure_count: count,
+    a11y_failures: a11y.slice(0, 20),
+  });
+  // continue:true keeps this advisory — systemMessage surfaces the note.
+  return { continue: true, systemMessage: note };
+}
+async function main(stdin = process.stdin, stdout = process.stdout) {
+  let buf = '';
+  for await (const chunk of stdin) buf += chunk;
+  let payload;
+  try {
+    payload = JSON.parse(buf || '{}');
+  } catch {
+    stdout.write(JSON.stringify({ continue: true }));
+    return;
+  }
+  const decision = evaluate(payload);
+  stdout.write(JSON.stringify(decision));
+}
+// Run as a CLI only when invoked directly; tests require() this module and
+// call evaluate()/main() against mock payloads without triggering stdin reads.
+if (require.main === module) {
+  main().catch(() => {
+    process.stdout.write(JSON.stringify({ continue: true }));
+  });
+}
+module.exports = { main, evaluate, extractA11yFailures, appendEvent };

package/hooks/hooks.json CHANGED

@@ -116,6 +116,14 @@
             "command": "node --experimental-strip-types \"${CLAUDE_PLUGIN_ROOT}/hooks/context-exhaustion.ts\""
           }
         ]
+      },
+      {
+        "hooks": [
+          {
+            "type": "command",
+            "command": "node \"${CLAUDE_PLUGIN_ROOT}/hooks/gdd-a11y-gate.js\""
+          }
+        ]
       }
     ],
     "Stop": [

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@hegemonart/get-design-done",
-  "version": "1.46.0",
+  "version": "1.48.0",
   "description": "A design-quality pipeline for AI coding agents: brief, plan, implement, and verify UI work against your design system.",
   "author": "Hegemon",
   "homepage": "https://github.com/hegemonart/get-design-done",